Files
Codex-Launcher---Any-AI-Por…/CHANGELOG.md

11 KiB

Changelog

v2.7.0 (2026-05-20)

  • Usage Dashboard redesigned (inspired by OpenUsage design patterns)

    • Deep Space dark theme with Catppuccin-inspired color palette
    • Header with animated status dots (OK/WARN/ERR provider health)
    • KPI summary strip: total providers, requests, token volume, avg latency
    • Provider cards with colored borders matching health status
    • Status pills: OK (green), WARN (yellow), ERR (red)
    • Colored section separators per metric type (Usage=yellow, Models=lavender)
    • Model composition bar: stacked horizontal segments per model share
    • Per-model breakdown with mini progress bars, percentage, request counts
    • Per-model token breakdown (in/out) when available
    • Token formatting: 1.2M, 45.3K instead of raw numbers
    • Duration formatting: 1.5h, 3.2m instead of raw seconds
    • Error section with warning icon
  • TCP_NODELAY streaming optimization

    • Disables Nagle's algorithm on streaming connections
    • Reduces per-packet latency by up to 40ms on small SSE events
    • Applied to all 4 streaming code paths (openai-compat, retry, command-code, generic)
  • Anthropic prompt caching

    • System prompts now sent as cache_control: ephemeral structured format
    • Enables Anthropic's automatic prompt caching (saves tokens + cost on repeated prompts)

v2.6.1 (2026-05-20)

  • Google OAuth rebuilt to emulate Gemini CLI
    • Uses Google's public OAuth client_id (same as gemini-cli)
    • No client_secret.json needed — zero setup required
    • PKCE (S256 code challenge) + CSRF state protection
    • Scopes: cloud-platform, generative-language, userinfo.email, userinfo.profile
    • Redirects to Google's success/failure pages (same as gemini-cli)
    • Just click "OAuth Login" → browser opens → authorize → done
    • Token file permissions set to 0600 for security

v2.6.0 (2026-05-20)

  • Usage Dashboard — per-provider tracking with visual cards
    • Request counts, success/failure rates, token usage, latency stats
    • Color-coded success rate bars (green/yellow/red)
    • Per-model breakdown showing request counts
    • Last error and last used timestamp
    • Sorted by most-used provider
    • Refresh button for live updates
  • Proxy usage tracking — records every request to usage-stats.json
  • Google OAuth: browse for client_secret.json with file picker dialog
    • No longer requires copying to a specific path manually
    • Auto-copies selected file to ~/.cache/codex-proxy/

v2.5.1 (2026-05-20)

  • Adaptive retry for transient errors (429/502/503)
    • Exponential backoff: 2s → 4s → 8s, up to 3 retries
    • Works for both single-provider and BGP mode
    • BGP routes retry before failing over to next route
    • Connection errors (reset/broken pipe) also retried
  • Proxy socket reuse — no more Address already in use crashes on restart
  • BGP startup log shows route count and names

v2.5.0 (2026-05-20)

  • AI BGP — Multi-provider routing with automatic failover
    • New "AI BGP" button in main window → pool manager
    • Create BGP pools with ordered routes from any configured endpoint
    • Each route has its own endpoint URL, API key, model, and priority
    • Failover strategy: tries primary route, automatically falls back to next on error/timeout
    • BGP pools appear in endpoint dropdown with 🔀 icon
    • Pool editor: add/remove/reorder routes, pick endpoint + model per route
    • Up/down buttons for priority reordering
    • Proxy logs [bgp] trying route 'Name' and [bgp] route 'Name' FAILED on fallback
    • If all routes fail: returns 502 with detailed error per route
  • Fixed TOML config breakage from multi-line paste in API key field (_toml_safe())

v2.4.0 (2026-05-20)

  • Added OpenAdapter provider preset
    • Base URL: https://api.openadapter.in/v1 — one API key, 40+ models
    • Pre-loaded models: glm-4.7, DeepSeek-V3, kimi-k2.6, qwen3.6-plus, claude-sonnet-4-6, gpt-5.4, gemini-2.5-flash, and more
    • Works with existing openai-compat proxy backend — no special handling needed
  • Fixed Add/Edit dialog crash (missing _on_reasoning_toggled method)
  • Redesigned Google OAuth flow with live status dialog and clickable auth URL

v2.3.2 (2026-05-20)

  • Added Google Gemini provider with OAuth support
    • Two presets: "Google Gemini (API Key)" and "Google Gemini (OAuth)"
    • OAuth Login button in endpoint editor — full Google OAuth2 flow
    • Starts local HTTP server (port 8085), opens browser for Google consent
    • Captures auth code, exchanges for access + refresh tokens
    • Stores tokens in ~/.cache/codex-proxy/google-oauth-token.json
    • Auto-refreshes access tokens when expired (no manual re-login)
    • Uses Gemini's OpenAI-compatible endpoint: generativelanguage.googleapis.com/v1beta/openai
    • Models: gemini-2.5-flash, gemini-2.5-pro, gemini-2.0-flash, gemini-2.0-flash-lite, and more
    • Setup instructions shown if client_secret.json not found

v2.3.0 (2026-05-20)

  • Adaptive Crof self-healing system
    • Tracks per-model success/failure history with item counts
    • Dynamically learns max item limit per model (starts at 30, adjusts down on failures)
    • Proactively compacts input when above learned limit before sending to upstream
    • Auto-retry on finish_reason=length with aggressive re-compaction and resend
    • Prevents stream disconnected and incomplete errors on long conversations
    • All tracking logged to stderr: [crof-adaptive] model=X items=N OK/FAIL -> limit=N
  • Fixed NameError: _ts crash in debug logging
  • Fixed ConnectionResetError crash on client disconnect during streaming
  • Added 180s upstream timeout to prevent hanging connections
  • Compaction now preserves function_call/function_call_output pairs (no orphaned tool outputs)
  • Fixed reasoning control: reasoning_effort=none always sends both params

v2.2.1 (2026-05-20)

  • Fixed compaction orphaning function_call_output items — root cause of Crof incomplete responses
    • Compaction cut between function_call and its function_call_output, creating dangling tool results
    • Crof model received orphaned tool messages with empty tool_call_id, causing confusion and token exhaustion
    • Compaction now expands tail boundary to include matching function_call/function_call_output pairs
  • Fixed reasoning control: reasoning_effort=none now always sends both enable_thinking=false AND reasoning_effort=none
    • Crof API testing confirmed reasoning_effort=none is what actually suppresses reasoning, not enable_thinking=false
  • Added upstream debug logging to ~/.cache/codex-proxy/crof-upstream.jsonl

v2.2.0 (2026-05-20)

  • Added per-provider Reasoning controls in endpoint editor
    • Reasoning On/Off toggle — disable reasoning for models that exhaust output tokens (e.g., Crof mimo-v2.5-pro)
    • Reasoning Effort selector: None, Minimal, Low, Medium, High, Max
    • When reasoning is OFF: sends enable_thinking=false + reasoning_effort=none to upstream API
    • When reasoning is ON: sends user-selected effort level (default: Medium)
    • Settings stored per-endpoint, passed through proxy config to upstream requests
  • Strip reasoning_content from proxy output — Codex doesn't use it, avoids token waste
  • Force max_tokens=64000 minimum for openai-compat providers — room for both reasoning and content
  • Inspired by unsloth's reasoning control patterns for Qwen/GPT-OSS models
  • Styled reasoning switch: green = ON, orange = OFF, gentle rounded pill shape
  • Added error handling to endpoint manager Add/Edit/Manage dialogs (prevents silent failures)

v2.1.3 (2026-05-19)

  • Fixed Crof mimo-v2.5-pro stopping mid-response (finish_reason=length)
    • Root cause: model emits 600+ reasoning_content SSE chunks that exhaust max_tokens before any actual content is generated
    • Strip reasoning_content from proxy output — Codex doesn't use reasoning, avoids wasting output tokens on invisible text
    • Force max_tokens minimum of 64000 for openai-compat providers — gives models room for both reasoning and content
    • Works for all openai-compat providers (Crof, Z.AI, DeepSeek, OpenRouter, etc.)

v2.1.2 (2026-05-19)

  • Fixed Crof.ai and providers stopping after first tool call (root cause: None tool IDs)
  • Codex sends function_call items with id=None — proxy now matches tool results to calls by call_id + positional fallback
  • Fixed orphan message output item when response is only tool calls (no text content)
  • Auto-trims long conversations (>30 items) to prevent context overflow on providers like Crof
    • Keeps system/developer messages, original user query, and most recent 10 items
    • Auto-compacts old items into a summary instead of just dropping them
    • Summary includes: user requests, assistant responses, tool calls made, files touched
    • Preserves enough context for the model to continue long tasks intelligently
  • Truncates large tool outputs (>8000 chars) to prevent model output token exhaustion
    • Crof's models return incomplete when tool results contain too much text (e.g., full HTML pages)
    • Truncated outputs include [truncated N chars] suffix so the model knows data was cut
  • Added request/response logging to ~/.cache/codex-proxy/requests.log for debugging
  • Proxy stderr no longer discarded by launcher (visible in terminal for debugging)

v2.1.1 (2026-05-19)

  • Added Command Code backend to translation proxy (proprietary /alpha/generate API)
  • Added Command Code provider preset with 20 models (DeepSeek, Claude, GPT, Kimi, GLM, Qwen, etc.)
  • Added cc_version field in endpoint editor for Command Code version (default: 0.26.8)
  • Proxy sends x-command-code-version header to CC API (fixes 403 "upgrade_required")
  • CC message conversion: system role → user, string content → array, tools stripped, real UUID for threadId
  • Fixed proxy: map developer role to system for Chat Completions providers (DeepSeek, Qwen, etc.)
  • Fixed proxy: map developer role to user for Anthropic providers
  • Forward instructions field from Responses API as system message/param

v2.1.0 (2026-05-19)

  • Added Codex auth status detection (reads codex login status)
  • Auth status bar shows logged-in provider or warning if auth missing/expired
  • "Re-login" button opens codex login in a terminal for re-authentication
  • Auto re-checks auth 30s after re-login flow starts
  • Pre-launch auth check warns before launching Codex Default mode if auth is invalid
  • Auth status checked asynchronously at startup (non-blocking)

v2.0.1 (2026-05-19)

  • Added Codex CLI/Desktop installation verifier to main page
  • Green check (✔) when detected, yellow cross (✘) when missing
  • "Install" button next to missing tools opens install guide dialog
  • Desktop/CLI launch buttons disabled with tooltip when tool is missing
  • Dependency status logged on startup
  • Buttons respect missing-state after busy/unbusy cycles

v2.0.0 (2026-05-19)

  • Initial release: multi-provider Codex Launcher
  • Translation proxy: Responses API to Chat Completions + Anthropic Messages
  • GTK endpoint manager with 10+ provider presets
  • Codex Default mode (built-in OAuth, zero config)
  • Browser UA injection for Cloudflare-protected providers (OpenCode)
  • Streaming SSE, tool calls, reasoning content support
  • Profile backup/import, model auto-fetch, bulk import
  • Refresh Models in background thread
  • URL normalization to prevent double-path bugs
  • Config backup/restore around sessions
  • .deb installer package