- CHANGELOG.md: full v3.8.0 section with 3-tier system, 30 fault types, safety guards - README.md: AI Monitoring badge, features section, Phase 9 dev journey, troubleshooting rows - GUI CHANGELOG: v3.8.0 entry with 9 bullet points
29 KiB
Changelog
v3.8.0 (2026-05-22)
AI Monitoring — Self-Healing Watchdog with 3-Tier Response System
When the proxy crashes, the upstream dies, or the model gets stuck, Codex stops working. The user has to manually restart everything. AI Monitoring fixes this with an autonomous watchdog that detects, diagnoses, and recovers from failures without user intervention.
Three-Tier Response System
| Tier | Speed | What | When |
|---|---|---|---|
| Tier 1 | < 1s | Rule-based auto-recovery | Known failure patterns (14 rules) |
| Tier 2 | < 100ms | Incident store lookup | We've seen this exact failure before |
| Tier 3 | 2-5s | AI diagnostic agent (configurable model) | Novel failure — no rule or pattern matches |
Watchdog Components
- HealthWatcher thread — pings proxy
/healthevery 5 seconds, detects crashes and hangs - LogAnalyzer thread — tails
cc-debug.logfor 18 failure signal patterns in real-time - Tier 1 rule engine — 14 rules covering: proxy crash restart, port conflict resolution, upstream retry with backoff, schema cache clearing, rate limit handling, stream error recovery
- Tier 2 incident store — JSON pattern database (
~/.cache/codex-proxy/incident-store.json) with success rates, learns from every resolved incident - Tier 3 AI diagnostic agent — calls a user-configured provider/model (e.g., Gemini Flash, GPT-4o-mini, local Ollama) to diagnose novel failures. Cost: ~$0.10-1.50/month
Failure Catalog: 30 Fault Types
- Category A (7): Proxy crash, port conflict, memory leak, deadlock, SSL error, DNS failure, unhandled exception
- Category B (10): Rate limit (429), server error (5xx), auth failure (401/403), CC upgrade required, timeout, connection reset, broken pipe, bad request, provider overloaded, Cloudflare block
- Category C (10): Parser empty, stuck recovery, sanitizer flags, double-wrapped cmd, suspicious cmd, empty cmd, bare JSON token, bash without cmd, DSML name mismatch, stuck model loop
- Category D (6): Codex process killed, memory explosion, 300s stall, config corruption, context overflow, WebSocket reconnect loop
- Category E (5): Schema cache corruption, stale PID file, port from old session, OAuth token expired, BGP all routes down
Safety Guards
- Rate-limited AI calls: max 1 per 60s, max 10/day
- Restart cap: max 5 proxy restarts per 10 minutes
- Cooldown per pattern (30s → 60s → 300s → alert user)
- Monthly AI budget cap (configurable, default $2/month)
Enhanced /health Endpoint
The proxy's /health endpoint now returns uptime_s, memory_mb, and requests_total for watchdog monitoring.
GUI Integration
- "AI Monitor" button in header bar
- AIMonitoringWindow: ON/OFF toggle, provider URL/model/API key selector, health check interval, auto-restart toggle, incident log viewer
- Watchdog starts automatically when enabled
- All actions logged to
~/.cache/codex-proxy/monitoring.log
AI Monitoring Design Spec
Full design document at AI-MONITORING-DESIGN.md — architecture diagrams, decision flow, safety guards, implementation plan.
v3.7.0 (2026-05-22)
Intelligence Routing — Self-Healing Parser System
When the Command Code model produces output in unpredictable or unrecognized formats, the multi-format parser chain (DSML, XML, explore_agent, bash blocks, raw JSON, fallback regex) can return empty. This causes the Codex agent loop to stall — zero tool calls means nothing to execute.
Intelligence Routing is a three-layer self-healing system that ensures the agent loop always continues:
Layer 1: Deep URL Extraction (FIX 23)
- Problem:
<explore_agent>body containedmessages: [{"content": "https://..."}]— URLs hidden inside JSON values. Regex couldn't match because it excluded the"character that terminates JSON strings. - Solution:
_build_explore_cmd()extracted to module level (was a closure inside_parse_commandcode_text_tool_calls). After initial regex fails, triesjson.loads(), iterates list items, extractscontentfield to find URLs. Added"to regex exclusion set. - Self-tests: Pattern M, O, O2 verify URL extraction from nested JSON.
Layer 2: Escalation Block Handling (FIX 24)
- Problem: Model produces
<require_escalation>and<request_escalation_permission>blocks when it wants elevated permissions. CC adapter doesn't support escalation — blocks silently dropped →parsed_tool_calls=0→ stall. - Solution: Two handlers:
- FIX 24a: Closed-tag blocks — extracts URL if present and runs explore command; otherwise echoes auto-proceed.
- FIX 24b: Bare/unclosed tags (
<require_escalation />) — auto-proceeds with diagnostic echo.
- Self-tests: Pattern N, N2 verify both closed and bare escalation blocks.
Layer 3: Intent-Based Command Synthesis (FIX 25 — THE CORE)
- Problem: After ALL parsers return empty, the agent loop has zero tool calls. Model may have written plain English ("I need to fetch the README"), partial JSON, or completely unrecognized formats.
- Solution: 5-heuristic synthesis chain in
cc_stream_to_sse(), run whenparsed_tool_calls=0and text has content:- URL in text →
curlto fetch it - File path reference ("read the file /path/to/X") →
catorlsthat file - Shell command in backticks/quotes → extract and run it
- "explore"/"fetch"/"investigate"/"repository" intent + last user URL →
_build_explore_cmd()with_last_user_urlsdeque - "I need to"/"let me"/"please" intent text → echo diagnostic with the intent
- URL in text →
- The system NEVER returns empty tool calls when there's text to analyze.
- Self-tests: Patterns M-O2 cover the full pipeline.
Architecture
_parse_commandcode_text_tool_calls() ← Layer 1 + Layer 2
cc_stream_to_sse() ← Layer 3 (after parser chain + fallback)
_last_user_urls deque (maxlen=20) ← Session-wide URL memory for heuristic 4
Test Coverage
- 54 self-test patterns (up from 41 in v3.6.0)
- 13 new tests covering all three Intelligence Routing layers
- Tests verify: nested JSON URL extraction, closed/bare escalation blocks, module-level explore command builder
v3.6.0 (2026-05-22)
Performance & Stability Hardening — Connection Pooling, Stream Idle Timeouts, Retry-After
Inspired by architectural study of Codex-Proxy-Server (Rust/Axum).
P0: Connection Pooling & Stream Idle Timeout
- Connection pooling (
http.clientreuse) — persistent HTTPS connections per host, eliminates ~100ms TLS handshake per request. Pool keyed by{scheme}://{host}:{port}, reused across requests. - Stream idle timeout (300s default) — all streaming paths now use
_stream_with_idle_timeout()viaselectors. If upstream goes silent for 5 minutes, the stream is killed with aTimeoutErrorinstead of hanging forever. Applied to:- OpenAI-compat streaming (
oa_stream_to_sse) - Command Code streaming (
_iter_cc_events) - Gemini OAuth streaming (
_handle_gemini_oauth) - Auto-continue streaming (
_auto_continue_gemini)
- OpenAI-compat streaming (
P1: Retry-After Header Support & Preemptive Token Refresh
Retry-Afterheader — all retry paths (openai-compat, BGP, auto) now read the upstreamRetry-Afterheader and respect it (capped at 60s). Falls back to exponential backoff if header is absent.- Preemptive OAuth token refresh —
_preemptive_refresh_token()checks token expiry 5 minutes before it expires and logs a warning, preparing for proactive refresh.
P2: Tool Translation Improvements
oa_convert_tools(strict=)— separate tool translation for Responses API (withstrict: true) vs Chat Completions (withoutstrict). Some providers reject thestrictfield in Chat Completions mode.- Filter null/empty tool names — tools with empty or
"null"names are silently dropped instead of causing upstream 400 errors.
P3: Response Store TTL, Bounded Buffers, Dual Logging
- Response store TTL (600s) —
_response_store_evict()removes entries older than 10 minutes. Prevents unbounded memory growth on long sessions. - Bounded stream buffer (8MB max) —
stream_buffered_eventsnow caps at 8MB before forcing a flush, preventing OOM on pathological responses. response.failedand error events added to urgent flush list — errors reach the client immediately instead of being buffered.- Dual logging —
proxy.login~/.cache/codex-proxy/captures all proxy messages alongside stderr. Survives Codex Desktop's stderr piping.
v3.5.0 (2026-05-22)
Major Release — Command Code Adapter Overhaul, AI Assist, Self-Revive Watchdog, Debug Infrastructure
Command Code Provider — Multi-Format Tool-Call Parser (Critical Bug Fix)
The Command Code (CC) provider adapter in translate-proxy.py had a critical bug where the CC model's tool-call output was not being parsed into executable tool calls, causing the Codex agent loop to stop after the first response. The CC model output format changes between sessions and models — the parser must handle all observed formats.
Root Cause: The CC model returns tool calls as inline text in various formats (raw JSON, XML, DSML tags, HTML-like blocks) within text-delta SSE events. The original parser only handled one format. When the model switched output style, tool calls were silently dropped, and Codex received a plain text response instead of executable commands — halting the multi-turn agent loop.
The Fix — Multi-Format Parser Chain (17 patches):
A cascading parser chain was built that tries each format in order, first match wins:
DSML → <bash> blocks → <explore_agent> → <tool_call type=...> → XML patterns → raw JSON → fallback regex
- FIX 1:
cc_input_to_messages()— enforce STRING content only (CC/alpha/generaterejects content blocks). Tool calls sent as inline JSON text in assistant messages. Tool results asrole: "user"plain text (NOTrole: "tool"). - FIX 2:
x-command-code-versionheader always sent (fallback"0.26.8") — prevents 403upgrade_requirederrors. - FIX 3: Cleared stale schema cache (
content_type:"array") that was corrupting message construction. - FIX 4: Streaming
try/exceptwrapper — catches all streaming errors and sendsresponse.completed(status:"failed")event instead of crashing the connection. - FIX 5:
_extract_raw_json_tool_calls()— new parser that finds raw JSON tool calls embedded in model text ({"cmd":"...","type":"tool-call"}). - FIX 6:
_extract_args()three-tier parser — tries direct parse →codecs.escape_decode→unicode_escapeto prevent double-wrapped argument strings. - FIX 7:
_extract_field()skips leading\before value type check — handles malformed escape sequences in CC output. - FIX 8:
sandbox_permissionsnormalization from parsed dict — converts{"docker":"full"}to the flat string format Codex expects. - FIX 9 (REVERTED): Removed adaptive probe system — proved unnecessary, conservative inline-text format is sufficient.
- FIX 10: Comprehensive fix documentation added to proxy file header for maintainability.
- FIX 11:
_unwrap_cmd()recursive unwrapping — handles double/triple-wrappedcmdvalues at all 7 extraction paths._sanitize_tool_calls()post-extraction validation layer ensures every tool call has valid name + args. - FIX 11c: XML regex fix —
</tool_call)had unbalanced parenthesis for ~4000 lines; now uses[)]?>to match both</tool_call)>and</tool_call)>. - FIX 12: Self-revive watchdog loop — auto-restarts proxy on crash (up to 50x, progressive backoff 1→30s). Controlled by
_SHUTDOWN_REQUESTEDflag on SIGTERM/SIGINT. - FIX 13: Fallback extraction when main parser returns empty but text contains tool-call signals (
{"cmd":,"type":"tool-call",<tool,<function=). - FIX 14: Parser for
<tool_call type="bash">\n{"command":"..."}format (actual CC model output) + fixed fallback regex to match BOTH"cmd"AND"command"keys. - FIX 15:
<explore_agent>blocks converted to realexec_commandwith synthesized curl-based repo exploration command. - FIX 16:
<bash>...</bash>blocks parsed — extractsprefix_rule,sandbox_permissions,justificationvia line-oriented parsing. - FIX 17: DSML tool_call blocks — the current CC model output format:
<||DSML||tool_calls>wrapper<||DSML||invoke name="exec">with<||DSML||parameter name="command">tags- Extracts command from
parameter name="command"or fallback toprefix_rule - Maps
exec/bash→exec_command
Debug Infrastructure
- Debug-to-file: All proxy events, text_buf preview, parser results, and fallback attempts logged to
~/.cache/codex-proxy/cc-debug.log— works even when stderr is piped by Codex Desktop. - Inline self-test:
--self-testflag runs 19 tests covering unwrap, double-wrap, unescaped quotes, XML, function=, sanitizer edge cases. - Per-request logging: Event types, text_buf content, parser match results written to debug log for every request.
AI Assist
- AI Assist integration in launcher GUI for intelligent provider configuration and troubleshooting.
Self-Revive Watchdog
- Proxy auto-restarts on crash with progressive backoff (1s → 30s, up to 50 restarts).
- Clean shutdown on SIGTERM/SIGINT via
_SHUTDOWN_REQUESTEDflag. - Eliminates manual proxy restart during long coding sessions.
Other Improvements
text_bufincc_stream_to_sseaccumulates alltext-deltaevents; parsing happens at end-of-stream for complete extraction.- Schema cache with 24h staleness TTL for provider capabilities.
- ErrorAnalyzer learns from 4xx errors on retry (max 2 retries).
cleanup-codex-stale.shupdated with additional stale process patterns.
v3.3.0 (2026-05-20)
Antigravity + Gemini CLI OAuth — full Codex agent loop working
Gemini CLI OAuth + Antigravity OAuth
- Split Google OAuth into separate Gemini CLI OAuth and Google Antigravity OAuth presets/backends.
- Gemini CLI OAuth uses the Gemini CLI public OAuth client and Code Assist endpoints.
- Antigravity OAuth uses Antigravity OAuth credentials, Code Assist daily/autopush/prod fallback, and Antigravity-style request wrapping.
- Added Antigravity version discovery from the updater/changelog with local caching.
- Added Antigravity model alias mapping from UI-facing
antigravity-*IDs to upstream Code Assist model IDs.
Responses API + Tool Flow
- Added Gemini-style history hardening for Google OAuth requests: removes empty turns, coalesces adjacent roles, drops duplicate user repeats, and enforces user-start/user-end history.
- Preserves function-call IDs across turns and adds synthetic
thoughtSignaturefor historical Gemini function calls, matching Gemini CLI hardening behavior. - Fixed Antigravity streaming Responses API compatibility: single assistant message item, text done events, content part done, output item done, final completed event, and connection close.
- Added
response.function_call_arguments.deltaandresponse.function_call_arguments.doneevents so Codex can execute Antigravity tool calls and create files. - Fixed functionResponse name matching — uses the original functionCall name instead of falling back to call_id.
- Strengthened Antigravity prompt policy: use tools immediately for file changes, avoid planning-only responses, and answer directly when no suitable tool exists.
- Auto-continue on MAX_TOKENS — when Gemini/Antigravity truncates a text response, the proxy transparently sends a continuation request and concatenates the output so Codex receives the complete response without manual intervention.
Reliability + Routing
- Added BGP++ route scoring, route cooldowns, token buckets, and persisted route stats.
- Added provider policy layer and adaptive context compaction.
- Added tool-call pairing validation/repair for orphaned tool outputs.
- Added Endpoint Doctor in the endpoint editor.
- Added log redaction helper for common API key/token patterns.
v3.1.0 (2026-05-20)
- Initial Antigravity/Gemini CLI OAuth backend split.
- Gemini-style history hardening, SSE streaming fixes.
v3.0.0 (2026-05-20)
Major architectural overhaul — Phase 0 + Phase 1 of engineering roadmap
Proxy (translate-proxy.py)
- ThreadingHTTPServer — serves concurrent requests (no more blocking)
- Thread-safe shared state — OrderedDict response store with locks, Crof state lock, stats lock
- Batched + atomic stats writes — stats buffered in memory, flushed every 5s via
os.replace() - Graceful shutdown — SIGTERM/SIGINT drain active connections (up to 5s), reject new with 503
- Progressive upstream timeouts — based on input size and tools (60-300s instead of flat 180s)
- Lazy JSON parsing — skip parsing SSE events unless they contain
response.completed - Buffered SSE writes — flush every 30ms, on urgent events, or at 4KB (reduces syscalls)
/healthendpoint — returns backend, target, models, BGP route count- Consolidated imports — all at top, no more missing import crashes
main()entry point — runtime init moved out of module level- TCP_NODELAY — on all streaming paths (from v2.7.0)
- Anthropic prompt caching —
cache_control: ephemeralon system prompts (from v2.7.0)
Launcher (codex-launcher-gui)
- Dynamic port allocation —
_pick_free_port()picks random free port, no more 8080 conflicts - Proxy health gating — Codex will NOT launch if proxy fails health check within 15s
- Error dialogs — clear GTK error dialog when proxy startup fails
- Atomic config backup/restore — temp file +
os.replace(), no more corrupted config.toml - Config transactions — recovery from interrupted sessions on next startup
- Safe cleanup (PID registry) — only kills processes launched by the app (pids.json)
- Proxy stderr piped to log — real-time proxy logs in launcher UI
- Bearer token — Codex config uses
codex-launcher-localinstead of real API key - Usage Dashboard v2 — OpenUsage-inspired dark theme with status pills, KPI strip, model bars (from v2.7.0)
v2.7.0 (2026-05-20)
-
Usage Dashboard redesigned (inspired by OpenUsage design patterns)
- Deep Space dark theme with Catppuccin-inspired color palette
- Header with animated status dots (OK/WARN/ERR provider health)
- KPI summary strip: total providers, requests, token volume, avg latency
- Provider cards with colored borders matching health status
- Status pills: OK (green), WARN (yellow), ERR (red)
- Colored section separators per metric type (Usage=yellow, Models=lavender)
- Model composition bar: stacked horizontal segments per model share
- Per-model breakdown with mini progress bars, percentage, request counts
- Per-model token breakdown (in/out) when available
- Token formatting: 1.2M, 45.3K instead of raw numbers
- Duration formatting: 1.5h, 3.2m instead of raw seconds
- Error section with warning icon
-
TCP_NODELAY streaming optimization
- Disables Nagle's algorithm on streaming connections
- Reduces per-packet latency by up to 40ms on small SSE events
- Applied to all 4 streaming code paths (openai-compat, retry, command-code, generic)
-
Anthropic prompt caching
- System prompts now sent as
cache_control: ephemeralstructured format - Enables Anthropic's automatic prompt caching (saves tokens + cost on repeated prompts)
- System prompts now sent as
v2.6.1 (2026-05-20)
- Google OAuth rebuilt to emulate Gemini CLI
- Uses Google's public OAuth client_id (same as gemini-cli)
- No
client_secret.jsonneeded — zero setup required - PKCE (S256 code challenge) + CSRF state protection
- Scopes: cloud-platform, generative-language, userinfo.email, userinfo.profile
- Redirects to Google's success/failure pages (same as gemini-cli)
- Just click "OAuth Login" → browser opens → authorize → done
- Token file permissions set to 0600 for security
v2.6.0 (2026-05-20)
- Usage Dashboard — per-provider tracking with visual cards
- Request counts, success/failure rates, token usage, latency stats
- Color-coded success rate bars (green/yellow/red)
- Per-model breakdown showing request counts
- Last error and last used timestamp
- Sorted by most-used provider
- Refresh button for live updates
- Proxy usage tracking — records every request to
usage-stats.json - Google OAuth: browse for
client_secret.jsonwith file picker dialog- No longer requires copying to a specific path manually
- Auto-copies selected file to
~/.cache/codex-proxy/
v2.5.1 (2026-05-20)
- Adaptive retry for transient errors (429/502/503)
- Exponential backoff: 2s → 4s → 8s, up to 3 retries
- Works for both single-provider and BGP mode
- BGP routes retry before failing over to next route
- Connection errors (reset/broken pipe) also retried
- Proxy socket reuse — no more
Address already in usecrashes on restart - BGP startup log shows route count and names
v2.5.0 (2026-05-20)
- AI BGP — Multi-provider routing with automatic failover
- New "AI BGP" button in main window → pool manager
- Create BGP pools with ordered routes from any configured endpoint
- Each route has its own endpoint URL, API key, model, and priority
- Failover strategy: tries primary route, automatically falls back to next on error/timeout
- BGP pools appear in endpoint dropdown with 🔀 icon
- Pool editor: add/remove/reorder routes, pick endpoint + model per route
- Up/down buttons for priority reordering
- Proxy logs
[bgp] trying route 'Name'and[bgp] route 'Name' FAILEDon fallback - If all routes fail: returns 502 with detailed error per route
- Fixed TOML config breakage from multi-line paste in API key field (
_toml_safe())
v2.4.0 (2026-05-20)
- Added OpenAdapter provider preset
- Base URL:
https://api.openadapter.in/v1— one API key, 40+ models - Pre-loaded models: glm-4.7, DeepSeek-V3, kimi-k2.6, qwen3.6-plus, claude-sonnet-4-6, gpt-5.4, gemini-2.5-flash, and more
- Works with existing openai-compat proxy backend — no special handling needed
- Base URL:
- Fixed Add/Edit dialog crash (missing
_on_reasoning_toggledmethod) - Redesigned Google OAuth flow with live status dialog and clickable auth URL
v2.3.2 (2026-05-20)
- Added Google Gemini provider with OAuth support
- Two presets: "Google Gemini (API Key)" and "Google Gemini (OAuth)"
- OAuth Login button in endpoint editor — full Google OAuth2 flow
- Starts local HTTP server (port 8085), opens browser for Google consent
- Captures auth code, exchanges for access + refresh tokens
- Stores tokens in
~/.cache/codex-proxy/google-oauth-token.json - Auto-refreshes access tokens when expired (no manual re-login)
- Uses Gemini's OpenAI-compatible endpoint:
generativelanguage.googleapis.com/v1beta/openai - Models: gemini-2.5-flash, gemini-2.5-pro, gemini-2.0-flash, gemini-2.0-flash-lite, and more
- Setup instructions shown if
client_secret.jsonnot found
v2.3.0 (2026-05-20)
- Adaptive Crof self-healing system
- Tracks per-model success/failure history with item counts
- Dynamically learns max item limit per model (starts at 30, adjusts down on failures)
- Proactively compacts input when above learned limit before sending to upstream
- Auto-retry on
finish_reason=lengthwith aggressive re-compaction and resend - Prevents
stream disconnectedandincompleteerrors on long conversations - All tracking logged to stderr:
[crof-adaptive] model=X items=N OK/FAIL -> limit=N
- Fixed
NameError: _tscrash in debug logging - Fixed
ConnectionResetErrorcrash on client disconnect during streaming - Added 180s upstream timeout to prevent hanging connections
- Compaction now preserves function_call/function_call_output pairs (no orphaned tool outputs)
- Fixed reasoning control:
reasoning_effort=nonealways sends both params
v2.2.1 (2026-05-20)
- Fixed compaction orphaning function_call_output items — root cause of Crof
incompleteresponses- Compaction cut between function_call and its function_call_output, creating dangling tool results
- Crof model received orphaned
toolmessages with emptytool_call_id, causing confusion and token exhaustion - Compaction now expands tail boundary to include matching function_call/function_call_output pairs
- Fixed reasoning control:
reasoning_effort=nonenow always sends bothenable_thinking=falseANDreasoning_effort=none- Crof API testing confirmed
reasoning_effort=noneis what actually suppresses reasoning, notenable_thinking=false
- Crof API testing confirmed
- Added upstream debug logging to
~/.cache/codex-proxy/crof-upstream.jsonl
v2.2.0 (2026-05-20)
- Added per-provider Reasoning controls in endpoint editor
- Reasoning On/Off toggle — disable reasoning for models that exhaust output tokens (e.g., Crof mimo-v2.5-pro)
- Reasoning Effort selector: None, Minimal, Low, Medium, High, Max
- When reasoning is OFF: sends
enable_thinking=false+reasoning_effort=noneto upstream API - When reasoning is ON: sends user-selected effort level (default: Medium)
- Settings stored per-endpoint, passed through proxy config to upstream requests
- Strip
reasoning_contentfrom proxy output — Codex doesn't use it, avoids token waste - Force
max_tokens=64000minimum for openai-compat providers — room for both reasoning and content - Inspired by unsloth's reasoning control patterns for Qwen/GPT-OSS models
- Styled reasoning switch: green = ON, orange = OFF, gentle rounded pill shape
- Added error handling to endpoint manager Add/Edit/Manage dialogs (prevents silent failures)
v2.1.3 (2026-05-19)
- Fixed Crof mimo-v2.5-pro stopping mid-response (finish_reason=length)
- Root cause: model emits 600+
reasoning_contentSSE chunks that exhaustmax_tokensbefore any actual content is generated - Strip
reasoning_contentfrom proxy output — Codex doesn't use reasoning, avoids wasting output tokens on invisible text - Force
max_tokensminimum of 64000 for openai-compat providers — gives models room for both reasoning and content - Works for all openai-compat providers (Crof, Z.AI, DeepSeek, OpenRouter, etc.)
- Root cause: model emits 600+
v2.1.2 (2026-05-19)
- Fixed Crof.ai and providers stopping after first tool call (root cause: None tool IDs)
- Codex sends
function_callitems withid=None— proxy now matches tool results to calls by call_id + positional fallback - Fixed orphan message output item when response is only tool calls (no text content)
- Auto-trims long conversations (>30 items) to prevent context overflow on providers like Crof
- Keeps system/developer messages, original user query, and most recent 10 items
- Auto-compacts old items into a summary instead of just dropping them
- Summary includes: user requests, assistant responses, tool calls made, files touched
- Preserves enough context for the model to continue long tasks intelligently
- Truncates large tool outputs (>8000 chars) to prevent model output token exhaustion
- Crof's models return
incompletewhen tool results contain too much text (e.g., full HTML pages) - Truncated outputs include
[truncated N chars]suffix so the model knows data was cut
- Crof's models return
- Added request/response logging to
~/.cache/codex-proxy/requests.logfor debugging - Proxy stderr no longer discarded by launcher (visible in terminal for debugging)
v2.1.1 (2026-05-19)
- Added Command Code backend to translation proxy (proprietary
/alpha/generateAPI) - Added Command Code provider preset with 20 models (DeepSeek, Claude, GPT, Kimi, GLM, Qwen, etc.)
- Added
cc_versionfield in endpoint editor for Command Code version (default: 0.26.8) - Proxy sends
x-command-code-versionheader to CC API (fixes 403 "upgrade_required") - CC message conversion:
systemrole →user, string content → array, tools stripped, real UUID for threadId - Fixed proxy: map
developerrole tosystemfor Chat Completions providers (DeepSeek, Qwen, etc.) - Fixed proxy: map
developerrole touserfor Anthropic providers - Forward
instructionsfield from Responses API as system message/param
v2.1.0 (2026-05-19)
- Added Codex auth status detection (reads
codex login status) - Auth status bar shows logged-in provider or warning if auth missing/expired
- "Re-login" button opens
codex loginin a terminal for re-authentication - Auto re-checks auth 30s after re-login flow starts
- Pre-launch auth check warns before launching Codex Default mode if auth is invalid
- Auth status checked asynchronously at startup (non-blocking)
v2.0.1 (2026-05-19)
- Added Codex CLI/Desktop installation verifier to main page
- Green check (✔) when detected, yellow cross (✘) when missing
- "Install" button next to missing tools opens install guide dialog
- Desktop/CLI launch buttons disabled with tooltip when tool is missing
- Dependency status logged on startup
- Buttons respect missing-state after busy/unbusy cycles
v2.0.0 (2026-05-19)
- Initial release: multi-provider Codex Launcher
- Translation proxy: Responses API to Chat Completions + Anthropic Messages
- GTK endpoint manager with 10+ provider presets
- Codex Default mode (built-in OAuth, zero config)
- Browser UA injection for Cloudflare-protected providers (OpenCode)
- Streaming SSE, tool calls, reasoning content support
- Profile backup/import, model auto-fetch, bulk import
- Refresh Models in background thread
- URL normalization to prevent double-path bugs
- Config backup/restore around sessions
- .deb installer package