# Changelog

## v2.5.0 (2026-05-20)

- **AI BGP: Multi-provider routing with automatic failover**
  - New "AI BGP" button in main window → pool manager
  - Create BGP pools with ordered routes from any configured endpoint
  - Each route has its own endpoint URL, API key, model, and priority
  - **Failover strategy**: tries primary route, automatically falls back to next on error/timeout
  - BGP pools appear in endpoint dropdown with 🔀 icon
  - Pool editor: add/remove/reorder routes, pick endpoint + model per route
  - Up/down buttons for priority reordering
  - Proxy logs `[bgp] trying route 'Name'` and `[bgp] route 'Name' FAILED` on fallback
  - If all routes fail: returns 502 with detailed error per route
- Fixed TOML config breakage from multi-line paste in API key field (`_toml_safe()`)

## v2.4.0 (2026-05-20)

- **Added OpenAdapter provider preset**
  - Base URL: `https://api.openadapter.in/v1` (one API key, 40+ models)
  - Pre-loaded models: glm-4.7, DeepSeek-V3, kimi-k2.6, qwen3.6-plus, claude-sonnet-4-6, gpt-5.4, gemini-2.5-flash, and more
  - Works with existing openai-compat proxy backend; no special handling needed
- Fixed Add/Edit dialog crash (missing `_on_reasoning_toggled` method)
- Redesigned Google OAuth flow with live status dialog and clickable auth URL

## v2.3.2 (2026-05-20)

- **Added Google Gemini provider with OAuth support**
  - Two presets: "Google Gemini (API Key)" and "Google Gemini (OAuth)"
  - OAuth Login button in endpoint editor: full Google OAuth2 flow
  - Starts local HTTP server (port 8085), opens browser for Google consent
  - Captures auth code, exchanges for access + refresh tokens
  - Stores tokens in `~/.cache/codex-proxy/google-oauth-token.json`
  - Auto-refreshes access tokens when expired (no manual re-login)
  - Uses Gemini's OpenAI-compatible endpoint: `generativelanguage.googleapis.com/v1beta/openai`
  - Models: gemini-2.5-flash, gemini-2.5-pro, gemini-2.0-flash, gemini-2.0-flash-lite, and more
  - Setup instructions shown if `client_secret.json` not found

## v2.3.0 (2026-05-20)

- **Adaptive Crof self-healing system**
  - Tracks per-model success/failure history with item counts
  - Dynamically learns max item limit per model (starts at 30, adjusts down on failures)
  - Proactively compacts input when above learned limit before sending to upstream
  - Auto-retry on `finish_reason=length` with aggressive re-compaction and resend
  - Prevents `stream disconnected` and `incomplete` errors on long conversations
  - All tracking logged to stderr: `[crof-adaptive] model=X items=N OK/FAIL -> limit=N`
- Fixed `NameError: _ts` crash in debug logging
- Fixed `ConnectionResetError` crash on client disconnect during streaming
- Added 180s upstream timeout to prevent hanging connections
- Compaction now preserves function_call/function_call_output pairs (no orphaned tool outputs)
- Fixed reasoning control: `reasoning_effort=none` always sends both params

## v2.2.1 (2026-05-20)

- **Fixed compaction orphaning function_call_output items** (root cause of Crof `incomplete` responses)
  - Compaction cut between function_call and its function_call_output, creating dangling tool results
  - Crof model received orphaned `tool` messages with empty `tool_call_id`, causing confusion and token exhaustion
  - Compaction now expands tail boundary to include matching function_call/function_call_output pairs
- **Fixed reasoning control**: `reasoning_effort=none` now always sends both `enable_thinking=false` AND `reasoning_effort=none`
  - Crof API testing confirmed `reasoning_effort=none` is what actually suppresses reasoning, not `enable_thinking=false`
- Added upstream debug logging to `~/.cache/codex-proxy/crof-upstream.jsonl`

## v2.2.0 (2026-05-20)

- **Added per-provider Reasoning controls in endpoint editor**
  - Reasoning On/Off toggle: disable reasoning for models that exhaust output tokens (e.g., Crof mimo-v2.5-pro)
  - Reasoning Effort selector: None, Minimal, Low, Medium, High, Max
  - When reasoning is OFF: sends `enable_thinking=false` + `reasoning_effort=none` to upstream API
  - When reasoning is ON: sends user-selected effort level (default: Medium)
  - Settings stored per-endpoint, passed through proxy config to upstream requests
  - Strip `reasoning_content` from proxy output (Codex doesn't use it, avoids token waste)
  - Force `max_tokens=64000` minimum for openai-compat providers (room for both reasoning and content)
  - Inspired by unsloth's reasoning control patterns for Qwen/GPT-OSS models
  - Styled reasoning switch: green = ON, orange = OFF, gentle rounded pill shape
- Added error handling to endpoint manager Add/Edit/Manage dialogs (prevents silent failures)

## v2.1.3 (2026-05-19)

- **Fixed Crof mimo-v2.5-pro stopping mid-response (finish_reason=length)**
  - Root cause: model emits 600+ `reasoning_content` SSE chunks that exhaust `max_tokens` before any actual content is generated
  - Strip `reasoning_content` from proxy output (Codex doesn't use reasoning, avoids wasting output tokens on invisible text)
  - Force `max_tokens` minimum of 64000 for openai-compat providers (gives models room for both reasoning and content)
  - Works for all openai-compat providers (Crof, Z.AI, DeepSeek, OpenRouter, etc.)

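
The two v2.1.3 mitigations above (a `max_tokens` floor and dropping `reasoning_content`) can be sketched roughly as follows. This is a minimal illustration, not the proxy's actual code: the function names `apply_max_tokens_floor` and `strip_reasoning` are hypothetical, and the chunk layout assumes the usual Chat Completions streaming shape (`choices[].delta`).

```python
import copy

MIN_MAX_TOKENS = 64000  # floor value named in the changelog entry


def apply_max_tokens_floor(request: dict, floor: int = MIN_MAX_TOKENS) -> dict:
    """Return a copy of the request with max_tokens raised to at least `floor`."""
    patched = dict(request)
    current = patched.get("max_tokens")
    # Missing, non-integer, or too-small values are all replaced by the floor.
    if not isinstance(current, int) or current < floor:
        patched["max_tokens"] = floor
    return patched


def strip_reasoning(chunk: dict) -> dict:
    """Return a copy of a streaming chunk with reasoning_content removed."""
    cleaned = copy.deepcopy(chunk)  # leave the upstream chunk untouched
    for choice in cleaned.get("choices", []):
        delta = choice.get("delta")
        if isinstance(delta, dict):
            delta.pop("reasoning_content", None)
    return cleaned
```

A request that asks for `max_tokens=4096` would go upstream with 64000, while a request that already asks for more keeps its own value.
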
## v2.1.2 (2026-05-19)

- **Fixed Crof.ai and providers stopping after first tool call (root cause: None tool IDs)**
  - Codex sends `function_call` items with `id=None`; proxy now matches tool results to calls by call_id + positional fallback
  - Fixed orphan message output item when response is only tool calls (no text content)
- **Auto-trims long conversations (>30 items)** to prevent context overflow on providers like Crof
  - Keeps system/developer messages, original user query, and most recent 10 items
- **Auto-compacts old items into a summary** instead of just dropping them
  - Summary includes: user requests, assistant responses, tool calls made, files touched
  - Preserves enough context for the model to continue long tasks intelligently
- **Truncates large tool outputs (>8000 chars)** to prevent model output token exhaustion
  - Crof's models return `incomplete` when tool results contain too much text (e.g., full HTML pages)
  - Truncated outputs include `[truncated N chars]` suffix so the model knows data was cut
- Added request/response logging to `~/.cache/codex-proxy/requests.log` for debugging
- Proxy stderr no longer discarded by launcher (visible in terminal for debugging)

## v2.1.1 (2026-05-19)

- Added Command Code backend to translation proxy (proprietary `/alpha/generate` API)
- Added Command Code provider preset with 20 models (DeepSeek, Claude, GPT, Kimi, GLM, Qwen, etc.)
- Added `cc_version` field in endpoint editor for Command Code version (default: 0.26.8)
- Proxy sends `x-command-code-version` header to CC API (fixes 403 "upgrade_required")
- CC message conversion: `system` role → `user`, string content → array, tools stripped, real UUID for threadId
- Fixed proxy: map `developer` role to `system` for Chat Completions providers (DeepSeek, Qwen, etc.)
- Fixed proxy: map `developer` role to `user` for Anthropic providers
- Forward `instructions` field from Responses API as system message/param

## v2.1.0 (2026-05-19)

- Added Codex auth status detection (reads `codex login status`)
- Auth status bar shows logged-in provider or warning if auth missing/expired
- "Re-login" button opens `codex login` in a terminal for re-authentication
- Auto re-checks auth 30s after re-login flow starts
- Pre-launch auth check warns before launching Codex Default mode if auth is invalid
- Auth status checked asynchronously at startup (non-blocking)

## v2.0.1 (2026-05-19)

- Added Codex CLI/Desktop installation verifier to main page
- Green check (✔) when detected, yellow cross (✘) when missing
- "Install" button next to missing tools opens install guide dialog
- Desktop/CLI launch buttons disabled with tooltip when tool is missing
- Dependency status logged on startup
- Buttons respect missing-state after busy/unbusy cycles

## v2.0.0 (2026-05-19)

- Initial release: multi-provider Codex Launcher
- Translation proxy: Responses API to Chat Completions + Anthropic Messages
- GTK endpoint manager with 10+ provider presets
- Codex Default mode (built-in OAuth, zero config)
- Browser UA injection for Cloudflare-protected providers (OpenCode)
- Streaming SSE, tool calls, reasoning content support
- Profile backup/import, model auto-fetch, bulk import
- Refresh Models in background thread
- URL normalization to prevent double-path bugs
- Config backup/restore around sessions
- .deb installer package
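
The changelog does not say how the "URL normalization to prevent double-path bugs" item in v2.0.0 is implemented. One plausible reading is that a base URL ending in `/v1` should not yield `/v1/v1/...` when a request path is appended. The sketch below illustrates that idea under this assumption; `join_endpoint` is a hypothetical helper, not a function from the launcher.

```python
from urllib.parse import urlsplit, urlunsplit


def join_endpoint(base_url: str, path: str) -> str:
    """Join base_url and path without repeating a shared path segment.

    Query strings and fragments on base_url are dropped in this sketch.
    """
    parts = urlsplit(base_url.strip())
    base_segments = [s for s in parts.path.split("/") if s]
    path_segments = [s for s in path.split("/") if s]

    # Longest suffix of the base path that is also a prefix of the request path.
    overlap = 0
    for n in range(min(len(base_segments), len(path_segments)), 0, -1):
        if base_segments[-n:] == path_segments[:n]:
            overlap = n
            break

    merged = base_segments + path_segments[overlap:]
    return urlunsplit((parts.scheme, parts.netloc, "/" + "/".join(merged), "", ""))
```

With this helper, `https://api.example.com/v1/` joined with `/v1/chat/completions` gives a single `/v1/chat/completions` path, as does a base URL with no path at all.
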