13 KiB
13 KiB
Changelog
v3.0.0 (2026-05-20)
Major architectural overhaul — Phase 0 + Phase 1 of engineering roadmap
Proxy (translate-proxy.py)
- ThreadingHTTPServer — serves concurrent requests (no more blocking)
- Thread-safe shared state — OrderedDict response store with locks, Crof state lock, stats lock
- Batched + atomic stats writes — stats buffered in memory, flushed every 5s via
os.replace() - Graceful shutdown — SIGTERM/SIGINT drain active connections (up to 5s), reject new with 503
- Progressive upstream timeouts — based on input size and tools (60-300s instead of flat 180s)
- Lazy JSON parsing — skip parsing SSE events unless they contain
response.completed - Buffered SSE writes — flush every 30ms, on urgent events, or at 4KB (reduces syscalls)
/healthendpoint — returns backend, target, models, BGP route count- Consolidated imports — all at top, no more missing import crashes
main()entry point — runtime init moved out of module level- TCP_NODELAY — on all streaming paths (from v2.7.0)
- Anthropic prompt caching —
cache_control: ephemeralon system prompts (from v2.7.0)
Launcher (codex-launcher-gui)
- Dynamic port allocation —
_pick_free_port()picks random free port, no more 8080 conflicts - Proxy health gating — Codex will NOT launch if proxy fails health check within 15s
- Error dialogs — clear GTK error dialog when proxy startup fails
- Atomic config backup/restore — temp file +
os.replace(), no more corrupted config.toml - Config transactions — recovery from interrupted sessions on next startup
- Safe cleanup (PID registry) — only kills processes launched by the app (pids.json)
- Proxy stderr piped to log — real-time proxy logs in launcher UI
- Bearer token — Codex config uses
codex-launcher-localinstead of real API key - Usage Dashboard v2 — OpenUsage-inspired dark theme with status pills, KPI strip, model bars (from v2.7.0)
v2.7.0 (2026-05-20)
-
Usage Dashboard redesigned (inspired by OpenUsage design patterns)
- Deep Space dark theme with Catppuccin-inspired color palette
- Header with animated status dots (OK/WARN/ERR provider health)
- KPI summary strip: total providers, requests, token volume, avg latency
- Provider cards with colored borders matching health status
- Status pills: OK (green), WARN (yellow), ERR (red)
- Colored section separators per metric type (Usage=yellow, Models=lavender)
- Model composition bar: stacked horizontal segments per model share
- Per-model breakdown with mini progress bars, percentage, request counts
- Per-model token breakdown (in/out) when available
- Token formatting: 1.2M, 45.3K instead of raw numbers
- Duration formatting: 1.5h, 3.2m instead of raw seconds
- Error section with warning icon
-
TCP_NODELAY streaming optimization
- Disables Nagle's algorithm on streaming connections
- Reduces per-packet latency by up to 40ms on small SSE events
- Applied to all 4 streaming code paths (openai-compat, retry, command-code, generic)
-
Anthropic prompt caching
- System prompts now sent as
cache_control: ephemeralstructured format - Enables Anthropic's automatic prompt caching (saves tokens + cost on repeated prompts)
- System prompts now sent as
v2.6.1 (2026-05-20)
- Google OAuth rebuilt to emulate Gemini CLI
- Uses Google's public OAuth client_id (same as gemini-cli)
- No
client_secret.jsonneeded — zero setup required - PKCE (S256 code challenge) + CSRF state protection
- Scopes: cloud-platform, generative-language, userinfo.email, userinfo.profile
- Redirects to Google's success/failure pages (same as gemini-cli)
- Just click "OAuth Login" → browser opens → authorize → done
- Token file permissions set to 0600 for security
v2.6.0 (2026-05-20)
- Usage Dashboard — per-provider tracking with visual cards
- Request counts, success/failure rates, token usage, latency stats
- Color-coded success rate bars (green/yellow/red)
- Per-model breakdown showing request counts
- Last error and last used timestamp
- Sorted by most-used provider
- Refresh button for live updates
- Proxy usage tracking — records every request to
usage-stats.json - Google OAuth: browse for
client_secret.jsonwith file picker dialog- No longer requires copying to a specific path manually
- Auto-copies selected file to
~/.cache/codex-proxy/
v2.5.1 (2026-05-20)
- Adaptive retry for transient errors (429/502/503)
- Exponential backoff: 2s → 4s → 8s, up to 3 retries
- Works for both single-provider and BGP mode
- BGP routes retry before failing over to next route
- Connection errors (reset/broken pipe) also retried
- Proxy socket reuse — no more
Address already in usecrashes on restart - BGP startup log shows route count and names
v2.5.0 (2026-05-20)
- AI BGP — Multi-provider routing with automatic failover
- New "AI BGP" button in main window → pool manager
- Create BGP pools with ordered routes from any configured endpoint
- Each route has its own endpoint URL, API key, model, and priority
- Failover strategy: tries primary route, automatically falls back to next on error/timeout
- BGP pools appear in endpoint dropdown with 🔀 icon
- Pool editor: add/remove/reorder routes, pick endpoint + model per route
- Up/down buttons for priority reordering
- Proxy logs
[bgp] trying route 'Name'and[bgp] route 'Name' FAILEDon fallback - If all routes fail: returns 502 with detailed error per route
- Fixed TOML config breakage from multi-line paste in API key field (
_toml_safe())
v2.4.0 (2026-05-20)
- Added OpenAdapter provider preset
- Base URL:
https://api.openadapter.in/v1— one API key, 40+ models - Pre-loaded models: glm-4.7, DeepSeek-V3, kimi-k2.6, qwen3.6-plus, claude-sonnet-4-6, gpt-5.4, gemini-2.5-flash, and more
- Works with existing openai-compat proxy backend — no special handling needed
- Base URL:
- Fixed Add/Edit dialog crash (missing
_on_reasoning_toggledmethod) - Redesigned Google OAuth flow with live status dialog and clickable auth URL
v2.3.2 (2026-05-20)
- Added Google Gemini provider with OAuth support
- Two presets: "Google Gemini (API Key)" and "Google Gemini (OAuth)"
- OAuth Login button in endpoint editor — full Google OAuth2 flow
- Starts local HTTP server (port 8085), opens browser for Google consent
- Captures auth code, exchanges for access + refresh tokens
- Stores tokens in
~/.cache/codex-proxy/google-oauth-token.json - Auto-refreshes access tokens when expired (no manual re-login)
- Uses Gemini's OpenAI-compatible endpoint:
generativelanguage.googleapis.com/v1beta/openai - Models: gemini-2.5-flash, gemini-2.5-pro, gemini-2.0-flash, gemini-2.0-flash-lite, and more
- Setup instructions shown if
client_secret.jsonnot found
v2.3.0 (2026-05-20)
- Adaptive Crof self-healing system
- Tracks per-model success/failure history with item counts
- Dynamically learns max item limit per model (starts at 30, adjusts down on failures)
- Proactively compacts input when above learned limit before sending to upstream
- Auto-retry on
finish_reason=lengthwith aggressive re-compaction and resend - Prevents
stream disconnectedandincompleteerrors on long conversations - All tracking logged to stderr:
[crof-adaptive] model=X items=N OK/FAIL -> limit=N
- Fixed
NameError: _tscrash in debug logging - Fixed
ConnectionResetErrorcrash on client disconnect during streaming - Added 180s upstream timeout to prevent hanging connections
- Compaction now preserves function_call/function_call_output pairs (no orphaned tool outputs)
- Fixed reasoning control:
reasoning_effort=nonealways sends both params
v2.2.1 (2026-05-20)
- Fixed compaction orphaning function_call_output items — root cause of Crof
incompleteresponses- Compaction cut between function_call and its function_call_output, creating dangling tool results
- Crof model received orphaned
toolmessages with emptytool_call_id, causing confusion and token exhaustion - Compaction now expands tail boundary to include matching function_call/function_call_output pairs
- Fixed reasoning control:
reasoning_effort=nonenow always sends bothenable_thinking=falseANDreasoning_effort=none- Crof API testing confirmed
reasoning_effort=noneis what actually suppresses reasoning, notenable_thinking=false
- Crof API testing confirmed
- Added upstream debug logging to
~/.cache/codex-proxy/crof-upstream.jsonl
v2.2.0 (2026-05-20)
- Added per-provider Reasoning controls in endpoint editor
- Reasoning On/Off toggle — disable reasoning for models that exhaust output tokens (e.g., Crof mimo-v2.5-pro)
- Reasoning Effort selector: None, Minimal, Low, Medium, High, Max
- When reasoning is OFF: sends
enable_thinking=false+reasoning_effort=noneto upstream API - When reasoning is ON: sends user-selected effort level (default: Medium)
- Settings stored per-endpoint, passed through proxy config to upstream requests
- Strip
reasoning_contentfrom proxy output — Codex doesn't use it, avoids token waste - Force
max_tokens=64000minimum for openai-compat providers — room for both reasoning and content - Inspired by unsloth's reasoning control patterns for Qwen/GPT-OSS models
- Styled reasoning switch: green = ON, orange = OFF, gentle rounded pill shape
- Added error handling to endpoint manager Add/Edit/Manage dialogs (prevents silent failures)
v2.1.3 (2026-05-19)
- Fixed Crof mimo-v2.5-pro stopping mid-response (finish_reason=length)
- Root cause: model emits 600+
reasoning_contentSSE chunks that exhaustmax_tokensbefore any actual content is generated - Strip
reasoning_contentfrom proxy output — Codex doesn't use reasoning, avoids wasting output tokens on invisible text - Force
max_tokensminimum of 64000 for openai-compat providers — gives models room for both reasoning and content - Works for all openai-compat providers (Crof, Z.AI, DeepSeek, OpenRouter, etc.)
- Root cause: model emits 600+
v2.1.2 (2026-05-19)
- Fixed Crof.ai and providers stopping after first tool call (root cause: None tool IDs)
- Codex sends
function_callitems withid=None— proxy now matches tool results to calls by call_id + positional fallback - Fixed orphan message output item when response is only tool calls (no text content)
- Auto-trims long conversations (>30 items) to prevent context overflow on providers like Crof
- Keeps system/developer messages, original user query, and most recent 10 items
- Auto-compacts old items into a summary instead of just dropping them
- Summary includes: user requests, assistant responses, tool calls made, files touched
- Preserves enough context for the model to continue long tasks intelligently
- Truncates large tool outputs (>8000 chars) to prevent model output token exhaustion
- Crof's models return
incompletewhen tool results contain too much text (e.g., full HTML pages) - Truncated outputs include
[truncated N chars]suffix so the model knows data was cut
- Crof's models return
- Added request/response logging to
~/.cache/codex-proxy/requests.logfor debugging - Proxy stderr no longer discarded by launcher (visible in terminal for debugging)
v2.1.1 (2026-05-19)
- Added Command Code backend to translation proxy (proprietary
/alpha/generateAPI) - Added Command Code provider preset with 20 models (DeepSeek, Claude, GPT, Kimi, GLM, Qwen, etc.)
- Added
cc_versionfield in endpoint editor for Command Code version (default: 0.26.8) - Proxy sends
x-command-code-versionheader to CC API (fixes 403 "upgrade_required") - CC message conversion:
systemrole →user, string content → array, tools stripped, real UUID for threadId - Fixed proxy: map
developerrole tosystemfor Chat Completions providers (DeepSeek, Qwen, etc.) - Fixed proxy: map
developerrole touserfor Anthropic providers - Forward
instructionsfield from Responses API as system message/param
v2.1.0 (2026-05-19)
- Added Codex auth status detection (reads
codex login status) - Auth status bar shows logged-in provider or warning if auth missing/expired
- "Re-login" button opens
codex loginin a terminal for re-authentication - Auto re-checks auth 30s after re-login flow starts
- Pre-launch auth check warns before launching Codex Default mode if auth is invalid
- Auth status checked asynchronously at startup (non-blocking)
v2.0.1 (2026-05-19)
- Added Codex CLI/Desktop installation verifier to main page
- Green check (✔) when detected, yellow cross (✘) when missing
- "Install" button next to missing tools opens install guide dialog
- Desktop/CLI launch buttons disabled with tooltip when tool is missing
- Dependency status logged on startup
- Buttons respect missing-state after busy/unbusy cycles
v2.0.0 (2026-05-19)
- Initial release: multi-provider Codex Launcher
- Translation proxy: Responses API to Chat Completions + Anthropic Messages
- GTK endpoint manager with 10+ provider presets
- Codex Default mode (built-in OAuth, zero config)
- Browser UA injection for Cloudflare-protected providers (OpenCode)
- Streaming SSE, tool calls, reasoning content support
- Profile backup/import, model auto-fetch, bulk import
- Refresh Models in background thread
- URL normalization to prevent double-path bugs
- Config backup/restore around sessions
- .deb installer package