Codex-Launcher---Any-AI-Por…/CHANGELOG.md

# Changelog

## v3.5.0 (2026-05-22)

**Major Release — Command Code Adapter Overhaul, AI Assist, Self-Revive Watchdog, Debug Infrastructure**

### Command Code Provider — Multi-Format Tool-Call Parser (Critical Bug Fix)

The Command Code (CC) provider adapter in `translate-proxy.py` had a critical bug where the CC model's tool-call output was not being parsed into executable tool calls, causing the Codex agent loop to stop after the first response. The CC model output format **changes between sessions and models** — the parser must handle all observed formats.

**Root Cause:** The CC model returns tool calls as inline text in various formats (raw JSON, XML, DSML tags, HTML-like blocks) within `text-delta` SSE events. The original parser only handled one format. When the model switched output style, tool calls were silently dropped, and Codex received a plain text response instead of executable commands — halting the multi-turn agent loop.

**The Fix — Multi-Format Parser Chain (17 patches):**

A cascading parser chain was built that tries each format in order, first match wins:
`DSML → <bash> blocks → <explore_agent> → <tool_call type=...> → XML patterns → raw JSON → fallback regex`

- **FIX 1**: `cc_input_to_messages()` — enforce STRING content only (CC `/alpha/generate` rejects content blocks). Tool calls sent as inline JSON text in assistant messages. Tool results as `role: "user"` plain text (NOT `role: "tool"`).
- **FIX 2**: `x-command-code-version` header always sent (fallback `"0.26.8"`) — prevents 403 `upgrade_required` errors.
- **FIX 3**: Cleared stale schema cache (`content_type:"array"`) that was corrupting message construction.
- **FIX 4**: Streaming `try/except` wrapper — catches all streaming errors and sends `response.completed(status:"failed")` event instead of crashing the connection.
- **FIX 5**: `_extract_raw_json_tool_calls()` — new parser that finds raw JSON tool calls embedded in model text (`{"cmd":"...","type":"tool-call"}`).
- **FIX 6**: `_extract_args()` three-tier parser — tries direct parse → `codecs.escape_decode` → `unicode_escape` to prevent double-wrapped argument strings.
- **FIX 7**: `_extract_field()` skips leading `\` before value type check — handles malformed escape sequences in CC output.
- **FIX 8**: `sandbox_permissions` normalization from parsed dict — converts `{"docker":"full"}` to the flat string format Codex expects.
- **FIX 9** (REVERTED): Removed adaptive probe system — proved unnecessary, conservative inline-text format is sufficient.
- **FIX 10**: Comprehensive fix documentation added to proxy file header for maintainability.
- **FIX 11**: `_unwrap_cmd()` recursive unwrapping — handles double/triple-wrapped `cmd` values at all 7 extraction paths. `_sanitize_tool_calls()` post-extraction validation layer ensures every tool call has valid name + args.
- **FIX 11c**: XML regex fix — `</tool_call)` had unbalanced parenthesis for ~4000 lines; now uses `[)]?>` to match both `</tool_call)>` and `</tool_call)>`.
- **FIX 12**: Self-revive watchdog loop — auto-restarts proxy on crash (up to 50x, progressive backoff 1→30s). Controlled by `_SHUTDOWN_REQUESTED` flag on SIGTERM/SIGINT.
- **FIX 13**: Fallback extraction when main parser returns empty but text contains tool-call signals (`{"cmd":`, `"type":"tool-call"`, `<tool`, `<function=`).
- **FIX 14**: Parser for `<tool_call type="bash">\n{"command":"..."}` format (actual CC model output) + fixed fallback regex to match BOTH `"cmd"` AND `"command"` keys.
- **FIX 15**: `<explore_agent>` blocks converted to real `exec_command` with synthesized curl-based repo exploration command.
- **FIX 16**: `<bash>...</bash>` blocks parsed — extracts `prefix_rule`, `sandbox_permissions`, `justification` via line-oriented parsing.
- **FIX 17**: DSML tool_call blocks — the **current CC model output format**:
  - `<｜｜DSML｜｜tool_calls>` wrapper
  - `<｜｜DSML｜｜invoke name="exec">` with `<｜｜DSML｜｜parameter name="command">` tags
  - Extracts command from `parameter name="command"` or fallback to `prefix_rule`
  - Maps `exec`/`bash` → `exec_command`

### Debug Infrastructure
- **Debug-to-file**: All proxy events, text_buf preview, parser results, and fallback attempts logged to `~/.cache/codex-proxy/cc-debug.log` — works even when stderr is piped by Codex Desktop.
- **Inline self-test**: `--self-test` flag runs 19 tests covering unwrap, double-wrap, unescaped quotes, XML, function=, sanitizer edge cases.
- **Per-request logging**: Event types, text_buf content, parser match results written to debug log for every request.

### AI Assist
- AI Assist integration in launcher GUI for intelligent provider configuration and troubleshooting.

### Self-Revive Watchdog
- Proxy auto-restarts on crash with progressive backoff (1s → 30s, up to 50 restarts).
- Clean shutdown on SIGTERM/SIGINT via `_SHUTDOWN_REQUESTED` flag.
- Eliminates manual proxy restart during long coding sessions.

### Other Improvements
- `text_buf` in `cc_stream_to_sse` accumulates all `text-delta` events; parsing happens at end-of-stream for complete extraction.
- Schema cache with 24h staleness TTL for provider capabilities.
- ErrorAnalyzer learns from 4xx errors on retry (max 2 retries).
- `cleanup-codex-stale.sh` updated with additional stale process patterns.

## v3.3.0 (2026-05-20)

**Antigravity + Gemini CLI OAuth — full Codex agent loop working**

### Gemini CLI OAuth + Antigravity OAuth
- Split Google OAuth into separate Gemini CLI OAuth and Google Antigravity OAuth presets/backends.
- Gemini CLI OAuth uses the Gemini CLI public OAuth client and Code Assist endpoints.
- Antigravity OAuth uses Antigravity OAuth credentials, Code Assist daily/autopush/prod fallback, and Antigravity-style request wrapping.
- Added Antigravity version discovery from the updater/changelog with local caching.
- Added Antigravity model alias mapping from UI-facing `antigravity-*` IDs to upstream Code Assist model IDs.

### Responses API + Tool Flow
- Added Gemini-style history hardening for Google OAuth requests: removes empty turns, coalesces adjacent roles, drops duplicate user repeats, and enforces user-start/user-end history.
- Preserves function-call IDs across turns and adds synthetic `thoughtSignature` for historical Gemini function calls, matching Gemini CLI hardening behavior.
- Fixed Antigravity streaming Responses API compatibility: single assistant message item, text done events, content part done, output item done, final completed event, and connection close.
- Added `response.function_call_arguments.delta` and `response.function_call_arguments.done` events so Codex can execute Antigravity tool calls and create files.
- Fixed functionResponse name matching — uses the original functionCall name instead of falling back to call_id.
- Strengthened Antigravity prompt policy: use tools immediately for file changes, avoid planning-only responses, and answer directly when no suitable tool exists.
- **Auto-continue on MAX_TOKENS** — when Gemini/Antigravity truncates a text response, the proxy transparently sends a continuation request and concatenates the output so Codex receives the complete response without manual intervention.

### Reliability + Routing
- Added BGP++ route scoring, route cooldowns, token buckets, and persisted route stats.
- Added provider policy layer and adaptive context compaction.
- Added tool-call pairing validation/repair for orphaned tool outputs.
- Added Endpoint Doctor in the endpoint editor.
- Added log redaction helper for common API key/token patterns.

## v3.1.0 (2026-05-20)

- Initial Antigravity/Gemini CLI OAuth backend split.
- Gemini-style history hardening, SSE streaming fixes.

## v3.0.0 (2026-05-20)

**Major architectural overhaul — Phase 0 + Phase 1 of engineering roadmap**

### Proxy (translate-proxy.py)
- **ThreadingHTTPServer** — serves concurrent requests (no more blocking)
- **Thread-safe shared state** — OrderedDict response store with locks, Crof state lock, stats lock
- **Batched + atomic stats writes** — stats buffered in memory, flushed every 5s via `os.replace()`
- **Graceful shutdown** — SIGTERM/SIGINT drain active connections (up to 5s), reject new with 503
- **Progressive upstream timeouts** — based on input size and tools (60-300s instead of flat 180s)
- **Lazy JSON parsing** — skip parsing SSE events unless they contain `response.completed`
- **Buffered SSE writes** — flush every 30ms, on urgent events, or at 4KB (reduces syscalls)
- **`/health` endpoint** — returns backend, target, models, BGP route count
- **Consolidated imports** — all at top, no more missing import crashes
- **`main()` entry point** — runtime init moved out of module level
- **TCP_NODELAY** — on all streaming paths (from v2.7.0)
- **Anthropic prompt caching** — `cache_control: ephemeral` on system prompts (from v2.7.0)

### Launcher (codex-launcher-gui)
- **Dynamic port allocation** — `_pick_free_port()` picks random free port, no more 8080 conflicts
- **Proxy health gating** — Codex will NOT launch if proxy fails health check within 15s
- **Error dialogs** — clear GTK error dialog when proxy startup fails
- **Atomic config backup/restore** — temp file + `os.replace()`, no more corrupted config.toml
- **Config transactions** — recovery from interrupted sessions on next startup
- **Safe cleanup (PID registry)** — only kills processes launched by the app (pids.json)
- **Proxy stderr piped to log** — real-time proxy logs in launcher UI
- **Bearer token** — Codex config uses `codex-launcher-local` instead of real API key
- **Usage Dashboard v2** — OpenUsage-inspired dark theme with status pills, KPI strip, model bars (from v2.7.0)

## v2.7.0 (2026-05-20)

- **Usage Dashboard redesigned** (inspired by OpenUsage design patterns)
  - Deep Space dark theme with Catppuccin-inspired color palette
  - Header with animated status dots (OK/WARN/ERR provider health)
  - KPI summary strip: total providers, requests, token volume, avg latency
  - Provider cards with colored borders matching health status
  - Status pills: OK (green), WARN (yellow), ERR (red)
  - Colored section separators per metric type (Usage=yellow, Models=lavender)
  - Model composition bar: stacked horizontal segments per model share
  - Per-model breakdown with mini progress bars, percentage, request counts
  - Per-model token breakdown (in/out) when available
  - Token formatting: 1.2M, 45.3K instead of raw numbers
  - Duration formatting: 1.5h, 3.2m instead of raw seconds
  - Error section with warning icon

- **TCP_NODELAY streaming optimization**
  - Disables Nagle's algorithm on streaming connections
  - Reduces per-packet latency by up to 40ms on small SSE events
  - Applied to all 4 streaming code paths (openai-compat, retry, command-code, generic)

- **Anthropic prompt caching**
  - System prompts now sent as `cache_control: ephemeral` structured format
  - Enables Anthropic's automatic prompt caching (saves tokens + cost on repeated prompts)

## v2.6.1 (2026-05-20)

- **Google OAuth rebuilt to emulate Gemini CLI**
  - Uses Google's public OAuth client_id (same as gemini-cli)
  - No `client_secret.json` needed — zero setup required
  - PKCE (S256 code challenge) + CSRF state protection
  - Scopes: cloud-platform, generative-language, userinfo.email, userinfo.profile
  - Redirects to Google's success/failure pages (same as gemini-cli)
  - Just click "OAuth Login" → browser opens → authorize → done
  - Token file permissions set to 0600 for security

## v2.6.0 (2026-05-20)

- **Usage Dashboard** — per-provider tracking with visual cards
  - Request counts, success/failure rates, token usage, latency stats
  - Color-coded success rate bars (green/yellow/red)
  - Per-model breakdown showing request counts
  - Last error and last used timestamp
  - Sorted by most-used provider
  - Refresh button for live updates
- **Proxy usage tracking** — records every request to `usage-stats.json`
- **Google OAuth**: browse for `client_secret.json` with file picker dialog
  - No longer requires copying to a specific path manually
  - Auto-copies selected file to `~/.cache/codex-proxy/`

## v2.5.1 (2026-05-20)

- **Adaptive retry for transient errors** (429/502/503)
  - Exponential backoff: 2s → 4s → 8s, up to 3 retries
  - Works for both single-provider and BGP mode
  - BGP routes retry before failing over to next route
  - Connection errors (reset/broken pipe) also retried
- **Proxy socket reuse** — no more `Address already in use` crashes on restart
- **BGP startup log** shows route count and names

## v2.5.0 (2026-05-20)

- **AI BGP — Multi-provider routing with automatic failover**
  - New "AI BGP" button in main window → pool manager
  - Create BGP pools with ordered routes from any configured endpoint
  - Each route has its own endpoint URL, API key, model, and priority
  - **Failover strategy**: tries primary route, automatically falls back to next on error/timeout
  - BGP pools appear in endpoint dropdown with 🔀 icon
  - Pool editor: add/remove/reorder routes, pick endpoint + model per route
  - Up/down buttons for priority reordering
  - Proxy logs `[bgp] trying route 'Name'` and `[bgp] route 'Name' FAILED` on fallback
  - If all routes fail: returns 502 with detailed error per route
- Fixed TOML config breakage from multi-line paste in API key field (`_toml_safe()`)

## v2.4.0 (2026-05-20)

- **Added OpenAdapter provider preset**
  - Base URL: `https://api.openadapter.in/v1` — one API key, 40+ models
  - Pre-loaded models: glm-4.7, DeepSeek-V3, kimi-k2.6, qwen3.6-plus, claude-sonnet-4-6, gpt-5.4, gemini-2.5-flash, and more
  - Works with existing openai-compat proxy backend — no special handling needed
- Fixed Add/Edit dialog crash (missing `_on_reasoning_toggled` method)
- Redesigned Google OAuth flow with live status dialog and clickable auth URL

## v2.3.2 (2026-05-20)

- **Added Google Gemini provider with OAuth support**
  - Two presets: "Google Gemini (API Key)" and "Google Gemini (OAuth)"
  - OAuth Login button in endpoint editor — full Google OAuth2 flow
  - Starts local HTTP server (port 8085), opens browser for Google consent
  - Captures auth code, exchanges for access + refresh tokens
  - Stores tokens in `~/.cache/codex-proxy/google-oauth-token.json`
  - Auto-refreshes access tokens when expired (no manual re-login)
  - Uses Gemini's OpenAI-compatible endpoint: `generativelanguage.googleapis.com/v1beta/openai`
  - Models: gemini-2.5-flash, gemini-2.5-pro, gemini-2.0-flash, gemini-2.0-flash-lite, and more
  - Setup instructions shown if `client_secret.json` not found

## v2.3.0 (2026-05-20)

- **Adaptive Crof self-healing system**
  - Tracks per-model success/failure history with item counts
  - Dynamically learns max item limit per model (starts at 30, adjusts down on failures)
  - Proactively compacts input when above learned limit before sending to upstream
  - Auto-retry on `finish_reason=length` with aggressive re-compaction and resend
  - Prevents `stream disconnected` and `incomplete` errors on long conversations
  - All tracking logged to stderr: `[crof-adaptive] model=X items=N OK/FAIL -> limit=N`
- Fixed `NameError: _ts` crash in debug logging
- Fixed `ConnectionResetError` crash on client disconnect during streaming
- Added 180s upstream timeout to prevent hanging connections
- Compaction now preserves function_call/function_call_output pairs (no orphaned tool outputs)
- Fixed reasoning control: `reasoning_effort=none` always sends both params

## v2.2.1 (2026-05-20)

- **Fixed compaction orphaning function_call_output items** — root cause of Crof `incomplete` responses
  - Compaction cut between function_call and its function_call_output, creating dangling tool results
  - Crof model received orphaned `tool` messages with empty `tool_call_id`, causing confusion and token exhaustion
  - Compaction now expands tail boundary to include matching function_call/function_call_output pairs
- **Fixed reasoning control**: `reasoning_effort=none` now always sends both `enable_thinking=false` AND `reasoning_effort=none`
  - Crof API testing confirmed `reasoning_effort=none` is what actually suppresses reasoning, not `enable_thinking=false`
- Added upstream debug logging to `~/.cache/codex-proxy/crof-upstream.jsonl`

## v2.2.0 (2026-05-20)

- **Added per-provider Reasoning controls in endpoint editor**
  - Reasoning On/Off toggle — disable reasoning for models that exhaust output tokens (e.g., Crof mimo-v2.5-pro)
  - Reasoning Effort selector: None, Minimal, Low, Medium, High, Max
  - When reasoning is OFF: sends `enable_thinking=false` + `reasoning_effort=none` to upstream API
  - When reasoning is ON: sends user-selected effort level (default: Medium)
  - Settings stored per-endpoint, passed through proxy config to upstream requests
- Strip `reasoning_content` from proxy output — Codex doesn't use it, avoids token waste
- Force `max_tokens=64000` minimum for openai-compat providers — room for both reasoning and content
- Inspired by unsloth's reasoning control patterns for Qwen/GPT-OSS models
- Styled reasoning switch: green = ON, orange = OFF, gentle rounded pill shape
- Added error handling to endpoint manager Add/Edit/Manage dialogs (prevents silent failures)

## v2.1.3 (2026-05-19)

- **Fixed Crof mimo-v2.5-pro stopping mid-response (finish_reason=length)**
  - Root cause: model emits 600+ `reasoning_content` SSE chunks that exhaust `max_tokens` before any actual content is generated
  - Strip `reasoning_content` from proxy output — Codex doesn't use reasoning, avoids wasting output tokens on invisible text
  - Force `max_tokens` minimum of 64000 for openai-compat providers — gives models room for both reasoning and content
  - Works for all openai-compat providers (Crof, Z.AI, DeepSeek, OpenRouter, etc.)

## v2.1.2 (2026-05-19)

- **Fixed Crof.ai and providers stopping after first tool call (root cause: None tool IDs)**
- Codex sends `function_call` items with `id=None` — proxy now matches tool results to calls by call_id + positional fallback
- Fixed orphan message output item when response is only tool calls (no text content)
- **Auto-trims long conversations (>30 items)** to prevent context overflow on providers like Crof
  - Keeps system/developer messages, original user query, and most recent 10 items
  - **Auto-compacts old items into a summary** instead of just dropping them
  - Summary includes: user requests, assistant responses, tool calls made, files touched
  - Preserves enough context for the model to continue long tasks intelligently
- **Truncates large tool outputs (>8000 chars)** to prevent model output token exhaustion
  - Crof's models return `incomplete` when tool results contain too much text (e.g., full HTML pages)
  - Truncated outputs include `[truncated N chars]` suffix so the model knows data was cut
- Added request/response logging to `~/.cache/codex-proxy/requests.log` for debugging
- Proxy stderr no longer discarded by launcher (visible in terminal for debugging)

## v2.1.1 (2026-05-19)

- Added Command Code backend to translation proxy (proprietary `/alpha/generate` API)
- Added Command Code provider preset with 20 models (DeepSeek, Claude, GPT, Kimi, GLM, Qwen, etc.)
- Added `cc_version` field in endpoint editor for Command Code version (default: 0.26.8)
- Proxy sends `x-command-code-version` header to CC API (fixes 403 "upgrade_required")
- CC message conversion: `system` role → `user`, string content → array, tools stripped, real UUID for threadId
- Fixed proxy: map `developer` role to `system` for Chat Completions providers (DeepSeek, Qwen, etc.)
- Fixed proxy: map `developer` role to `user` for Anthropic providers
- Forward `instructions` field from Responses API as system message/param

## v2.1.0 (2026-05-19)

- Added Codex auth status detection (reads `codex login status`)
- Auth status bar shows logged-in provider or warning if auth missing/expired
- "Re-login" button opens `codex login` in a terminal for re-authentication
- Auto re-checks auth 30s after re-login flow starts
- Pre-launch auth check warns before launching Codex Default mode if auth is invalid
- Auth status checked asynchronously at startup (non-blocking)

## v2.0.1 (2026-05-19)

- Added Codex CLI/Desktop installation verifier to main page
- Green check (✔) when detected, yellow cross (✘) when missing
- "Install" button next to missing tools opens install guide dialog
- Desktop/CLI launch buttons disabled with tooltip when tool is missing
- Dependency status logged on startup
- Buttons respect missing-state after busy/unbusy cycles

## v2.0.0 (2026-05-19)

- Initial release: multi-provider Codex Launcher
- Translation proxy: Responses API to Chat Completions + Anthropic Messages
- GTK endpoint manager with 10+ provider presets
- Codex Default mode (built-in OAuth, zero config)
- Browser UA injection for Cloudflare-protected providers (OpenCode)
- Streaming SSE, tool calls, reasoning content support
- Profile backup/import, model auto-fetch, bulk import
- Refresh Models in background thread
- URL normalization to prevent double-path bugs
- Config backup/restore around sessions
- .deb installer package