Files

admin 78f28bd3ac docs: update CHANGELOG, README, GUI changelog for v3.8.0 AI Monitoring

- CHANGELOG.md: full v3.8.0 section with 3-tier system, 30 fault types, safety guards
- README.md: AI Monitoring badge, features section, Phase 9 dev journey, troubleshooting rows
- GUI CHANGELOG: v3.8.0 entry with 9 bullet points

2026-05-22 23:22:26 +04:00

29 KiB

Raw Permalink Blame History

Changelog

v3.8.0 (2026-05-22)

AI Monitoring — Self-Healing Watchdog with 3-Tier Response System

When the proxy crashes, the upstream dies, or the model gets stuck, Codex stops working. The user has to manually restart everything. AI Monitoring fixes this with an autonomous watchdog that detects, diagnoses, and recovers from failures without user intervention.

Three-Tier Response System

Tier	Speed	What	When
Tier 1	< 1s	Rule-based auto-recovery	Known failure patterns (14 rules)
Tier 2	< 100ms	Incident store lookup	We've seen this exact failure before
Tier 3	2-5s	AI diagnostic agent (configurable model)	Novel failure — no rule or pattern matches

Watchdog Components

HealthWatcher thread — pings proxy /health every 5 seconds, detects crashes and hangs
LogAnalyzer thread — tails cc-debug.log for 18 failure signal patterns in real-time
Tier 1 rule engine — 14 rules covering: proxy crash restart, port conflict resolution, upstream retry with backoff, schema cache clearing, rate limit handling, stream error recovery
Tier 2 incident store — JSON pattern database (~/.cache/codex-proxy/incident-store.json) with success rates, learns from every resolved incident
Tier 3 AI diagnostic agent — calls a user-configured provider/model (e.g., Gemini Flash, GPT-4o-mini, local Ollama) to diagnose novel failures. Cost: ~$0.10-1.50/month

Failure Catalog: 30 Fault Types

Category A (7): Proxy crash, port conflict, memory leak, deadlock, SSL error, DNS failure, unhandled exception
Category B (10): Rate limit (429), server error (5xx), auth failure (401/403), CC upgrade required, timeout, connection reset, broken pipe, bad request, provider overloaded, Cloudflare block
Category C (10): Parser empty, stuck recovery, sanitizer flags, double-wrapped cmd, suspicious cmd, empty cmd, bare JSON token, bash without cmd, DSML name mismatch, stuck model loop
Category D (6): Codex process killed, memory explosion, 300s stall, config corruption, context overflow, WebSocket reconnect loop
Category E (5): Schema cache corruption, stale PID file, port from old session, OAuth token expired, BGP all routes down

Safety Guards

Rate-limited AI calls: max 1 per 60s, max 10/day
Restart cap: max 5 proxy restarts per 10 minutes
Cooldown per pattern (30s → 60s → 300s → alert user)
Monthly AI budget cap (configurable, default $2/month)

Enhanced /health Endpoint

The proxy's /health endpoint now returns uptime_s, memory_mb, and requests_total for watchdog monitoring.

GUI Integration

"AI Monitor" button in header bar
AIMonitoringWindow: ON/OFF toggle, provider URL/model/API key selector, health check interval, auto-restart toggle, incident log viewer
Watchdog starts automatically when enabled
All actions logged to ~/.cache/codex-proxy/monitoring.log

AI Monitoring Design Spec

Full design document at AI-MONITORING-DESIGN.md — architecture diagrams, decision flow, safety guards, implementation plan.

v3.7.0 (2026-05-22)

Intelligence Routing — Self-Healing Parser System

When the Command Code model produces output in unpredictable or unrecognized formats, the multi-format parser chain (DSML, XML, explore_agent, bash blocks, raw JSON, fallback regex) can return empty. This causes the Codex agent loop to stall — zero tool calls means nothing to execute.

Intelligence Routing is a three-layer self-healing system that ensures the agent loop always continues:

Layer 1: Deep URL Extraction (FIX 23)

Problem: <explore_agent> body contained messages: [{"content": "https://..."}] — URLs hidden inside JSON values. Regex couldn't match because it excluded the " character that terminates JSON strings.
Solution: _build_explore_cmd() extracted to module level (was a closure inside _parse_commandcode_text_tool_calls). After initial regex fails, tries json.loads(), iterates list items, extracts content field to find URLs. Added " to regex exclusion set.
Self-tests: Pattern M, O, O2 verify URL extraction from nested JSON.

Layer 2: Escalation Block Handling (FIX 24)

Problem: Model produces <require_escalation> and <request_escalation_permission> blocks when it wants elevated permissions. CC adapter doesn't support escalation — blocks silently dropped → parsed_tool_calls=0 → stall.
Solution: Two handlers:
- FIX 24a: Closed-tag blocks — extracts URL if present and runs explore command; otherwise echoes auto-proceed.
- FIX 24b: Bare/unclosed tags (<require_escalation />) — auto-proceeds with diagnostic echo.
Self-tests: Pattern N, N2 verify both closed and bare escalation blocks.

Layer 3: Intent-Based Command Synthesis (FIX 25 — THE CORE)

Problem: After ALL parsers return empty, the agent loop has zero tool calls. Model may have written plain English ("I need to fetch the README"), partial JSON, or completely unrecognized formats.
Solution: 5-heuristic synthesis chain in cc_stream_to_sse(), run when parsed_tool_calls=0 and text has content:
1. URL in text → curl to fetch it
2. File path reference ("read the file /path/to/X") → cat or ls that file
3. Shell command in backticks/quotes → extract and run it
4. "explore"/"fetch"/"investigate"/"repository" intent + last user URL → _build_explore_cmd() with _last_user_urls deque
5. "I need to"/"let me"/"please" intent text → echo diagnostic with the intent
The system NEVER returns empty tool calls when there's text to analyze.
Self-tests: Patterns M-O2 cover the full pipeline.

Architecture

_parse_commandcode_text_tool_calls()  ←  Layer 1 + Layer 2
cc_stream_to_sse()                    ←  Layer 3 (after parser chain + fallback)
_last_user_urls deque (maxlen=20)     ←  Session-wide URL memory for heuristic 4

Test Coverage

54 self-test patterns (up from 41 in v3.6.0)
13 new tests covering all three Intelligence Routing layers
Tests verify: nested JSON URL extraction, closed/bare escalation blocks, module-level explore command builder

v3.6.0 (2026-05-22)

Performance & Stability Hardening — Connection Pooling, Stream Idle Timeouts, Retry-After

Inspired by architectural study of Codex-Proxy-Server (Rust/Axum).

P0: Connection Pooling & Stream Idle Timeout

Connection pooling (http.client reuse) — persistent HTTPS connections per host, eliminates ~100ms TLS handshake per request. Pool keyed by {scheme}://{host}:{port}, reused across requests.
Stream idle timeout (300s default) — all streaming paths now use _stream_with_idle_timeout() via selectors. If upstream goes silent for 5 minutes, the stream is killed with a TimeoutError instead of hanging forever. Applied to:
- OpenAI-compat streaming (oa_stream_to_sse)
- Command Code streaming (_iter_cc_events)
- Gemini OAuth streaming (_handle_gemini_oauth)
- Auto-continue streaming (_auto_continue_gemini)

P1: Retry-After Header Support & Preemptive Token Refresh

Retry-After header — all retry paths (openai-compat, BGP, auto) now read the upstream Retry-After header and respect it (capped at 60s). Falls back to exponential backoff if header is absent.
Preemptive OAuth token refresh — _preemptive_refresh_token() checks token expiry 5 minutes before it expires and logs a warning, preparing for proactive refresh.

P2: Tool Translation Improvements

oa_convert_tools(strict=) — separate tool translation for Responses API (with strict: true) vs Chat Completions (without strict). Some providers reject the strict field in Chat Completions mode.
Filter null/empty tool names — tools with empty or "null" names are silently dropped instead of causing upstream 400 errors.

P3: Response Store TTL, Bounded Buffers, Dual Logging

Response store TTL (600s) — _response_store_evict() removes entries older than 10 minutes. Prevents unbounded memory growth on long sessions.
Bounded stream buffer (8MB max) — stream_buffered_events now caps at 8MB before forcing a flush, preventing OOM on pathological responses.
response.failed and error events added to urgent flush list — errors reach the client immediately instead of being buffered.
Dual logging — proxy.log in ~/.cache/codex-proxy/ captures all proxy messages alongside stderr. Survives Codex Desktop's stderr piping.

v3.5.0 (2026-05-22)

Major Release — Command Code Adapter Overhaul, AI Assist, Self-Revive Watchdog, Debug Infrastructure

Command Code Provider — Multi-Format Tool-Call Parser (Critical Bug Fix)

The Command Code (CC) provider adapter in translate-proxy.py had a critical bug where the CC model's tool-call output was not being parsed into executable tool calls, causing the Codex agent loop to stop after the first response. The CC model output format changes between sessions and models — the parser must handle all observed formats.

Root Cause: The CC model returns tool calls as inline text in various formats (raw JSON, XML, DSML tags, HTML-like blocks) within text-delta SSE events. The original parser only handled one format. When the model switched output style, tool calls were silently dropped, and Codex received a plain text response instead of executable commands — halting the multi-turn agent loop.

The Fix — Multi-Format Parser Chain (17 patches):

A cascading parser chain was built that tries each format in order, first match wins: DSML → <bash> blocks → <explore_agent> → <tool_call type=...> → XML patterns → raw JSON → fallback regex

FIX 1: cc_input_to_messages() — enforce STRING content only (CC /alpha/generate rejects content blocks). Tool calls sent as inline JSON text in assistant messages. Tool results as role: "user" plain text (NOT role: "tool").
FIX 2: x-command-code-version header always sent (fallback "0.26.8") — prevents 403 upgrade_required errors.
FIX 3: Cleared stale schema cache (content_type:"array") that was corrupting message construction.
FIX 4: Streaming try/except wrapper — catches all streaming errors and sends response.completed(status:"failed") event instead of crashing the connection.
FIX 5: _extract_raw_json_tool_calls() — new parser that finds raw JSON tool calls embedded in model text ({"cmd":"...","type":"tool-call"}).
FIX 6: _extract_args() three-tier parser — tries direct parse → codecs.escape_decode → unicode_escape to prevent double-wrapped argument strings.
FIX 7: _extract_field() skips leading \ before value type check — handles malformed escape sequences in CC output.
FIX 8: sandbox_permissions normalization from parsed dict — converts {"docker":"full"} to the flat string format Codex expects.
FIX 9 (REVERTED): Removed adaptive probe system — proved unnecessary, conservative inline-text format is sufficient.
FIX 10: Comprehensive fix documentation added to proxy file header for maintainability.
FIX 11: _unwrap_cmd() recursive unwrapping — handles double/triple-wrapped cmd values at all 7 extraction paths. _sanitize_tool_calls() post-extraction validation layer ensures every tool call has valid name + args.
FIX 11c: XML regex fix — </tool_call) had unbalanced parenthesis for ~4000 lines; now uses [)]?> to match both </tool_call)> and </tool_call)>.
FIX 12: Self-revive watchdog loop — auto-restarts proxy on crash (up to 50x, progressive backoff 1→30s). Controlled by _SHUTDOWN_REQUESTED flag on SIGTERM/SIGINT.
FIX 13: Fallback extraction when main parser returns empty but text contains tool-call signals ({"cmd":, "type":"tool-call", <tool, <function=).
FIX 14: Parser for <tool_call type="bash">\n{"command":"..."} format (actual CC model output) + fixed fallback regex to match BOTH "cmd" AND "command" keys.
FIX 15: <explore_agent> blocks converted to real exec_command with synthesized curl-based repo exploration command.
FIX 16: <bash>...</bash> blocks parsed — extracts prefix_rule, sandbox_permissions, justification via line-oriented parsing.
FIX 17: DSML tool_call blocks — the current CC model output format:
- <｜｜DSML｜｜tool_calls> wrapper
- <｜｜DSML｜｜invoke name="exec"> with <｜｜DSML｜｜parameter name="command"> tags
- Extracts command from parameter name="command" or fallback to prefix_rule
- Maps exec/bash → exec_command

Debug Infrastructure

Debug-to-file: All proxy events, text_buf preview, parser results, and fallback attempts logged to ~/.cache/codex-proxy/cc-debug.log — works even when stderr is piped by Codex Desktop.
Inline self-test: --self-test flag runs 19 tests covering unwrap, double-wrap, unescaped quotes, XML, function=, sanitizer edge cases.
Per-request logging: Event types, text_buf content, parser match results written to debug log for every request.

AI Assist

AI Assist integration in launcher GUI for intelligent provider configuration and troubleshooting.

Self-Revive Watchdog

Proxy auto-restarts on crash with progressive backoff (1s → 30s, up to 50 restarts).
Clean shutdown on SIGTERM/SIGINT via _SHUTDOWN_REQUESTED flag.
Eliminates manual proxy restart during long coding sessions.

Other Improvements

text_buf in cc_stream_to_sse accumulates all text-delta events; parsing happens at end-of-stream for complete extraction.
Schema cache with 24h staleness TTL for provider capabilities.
ErrorAnalyzer learns from 4xx errors on retry (max 2 retries).
cleanup-codex-stale.sh updated with additional stale process patterns.

v3.3.0 (2026-05-20)

Antigravity + Gemini CLI OAuth — full Codex agent loop working

Gemini CLI OAuth + Antigravity OAuth

Split Google OAuth into separate Gemini CLI OAuth and Google Antigravity OAuth presets/backends.
Gemini CLI OAuth uses the Gemini CLI public OAuth client and Code Assist endpoints.
Antigravity OAuth uses Antigravity OAuth credentials, Code Assist daily/autopush/prod fallback, and Antigravity-style request wrapping.
Added Antigravity version discovery from the updater/changelog with local caching.
Added Antigravity model alias mapping from UI-facing antigravity-* IDs to upstream Code Assist model IDs.

Responses API + Tool Flow

Added Gemini-style history hardening for Google OAuth requests: removes empty turns, coalesces adjacent roles, drops duplicate user repeats, and enforces user-start/user-end history.
Preserves function-call IDs across turns and adds synthetic thoughtSignature for historical Gemini function calls, matching Gemini CLI hardening behavior.
Fixed Antigravity streaming Responses API compatibility: single assistant message item, text done events, content part done, output item done, final completed event, and connection close.
Added response.function_call_arguments.delta and response.function_call_arguments.done events so Codex can execute Antigravity tool calls and create files.
Fixed functionResponse name matching — uses the original functionCall name instead of falling back to call_id.
Strengthened Antigravity prompt policy: use tools immediately for file changes, avoid planning-only responses, and answer directly when no suitable tool exists.
Auto-continue on MAX_TOKENS — when Gemini/Antigravity truncates a text response, the proxy transparently sends a continuation request and concatenates the output so Codex receives the complete response without manual intervention.

Reliability + Routing

Added BGP++ route scoring, route cooldowns, token buckets, and persisted route stats.
Added provider policy layer and adaptive context compaction.
Added tool-call pairing validation/repair for orphaned tool outputs.
Added Endpoint Doctor in the endpoint editor.
Added log redaction helper for common API key/token patterns.

v3.1.0 (2026-05-20)

Initial Antigravity/Gemini CLI OAuth backend split.
Gemini-style history hardening, SSE streaming fixes.

v3.0.0 (2026-05-20)

Major architectural overhaul — Phase 0 + Phase 1 of engineering roadmap

Proxy (translate-proxy.py)

ThreadingHTTPServer — serves concurrent requests (no more blocking)
Thread-safe shared state — OrderedDict response store with locks, Crof state lock, stats lock
Batched + atomic stats writes — stats buffered in memory, flushed every 5s via os.replace()
Graceful shutdown — SIGTERM/SIGINT drain active connections (up to 5s), reject new with 503
Progressive upstream timeouts — based on input size and tools (60-300s instead of flat 180s)
Lazy JSON parsing — skip parsing SSE events unless they contain response.completed
Buffered SSE writes — flush every 30ms, on urgent events, or at 4KB (reduces syscalls)
/health endpoint — returns backend, target, models, BGP route count
Consolidated imports — all at top, no more missing import crashes
main() entry point — runtime init moved out of module level
TCP_NODELAY — on all streaming paths (from v2.7.0)
Anthropic prompt caching — cache_control: ephemeral on system prompts (from v2.7.0)

Launcher (codex-launcher-gui)

Dynamic port allocation — _pick_free_port() picks random free port, no more 8080 conflicts
Proxy health gating — Codex will NOT launch if proxy fails health check within 15s
Error dialogs — clear GTK error dialog when proxy startup fails
Atomic config backup/restore — temp file + os.replace(), no more corrupted config.toml
Config transactions — recovery from interrupted sessions on next startup
Safe cleanup (PID registry) — only kills processes launched by the app (pids.json)
Proxy stderr piped to log — real-time proxy logs in launcher UI
Bearer token — Codex config uses codex-launcher-local instead of real API key
Usage Dashboard v2 — OpenUsage-inspired dark theme with status pills, KPI strip, model bars (from v2.7.0)

v2.7.0 (2026-05-20)

Usage Dashboard redesigned (inspired by OpenUsage design patterns)
- Deep Space dark theme with Catppuccin-inspired color palette
- Header with animated status dots (OK/WARN/ERR provider health)
- KPI summary strip: total providers, requests, token volume, avg latency
- Provider cards with colored borders matching health status
- Status pills: OK (green), WARN (yellow), ERR (red)
- Colored section separators per metric type (Usage=yellow, Models=lavender)
- Model composition bar: stacked horizontal segments per model share
- Per-model breakdown with mini progress bars, percentage, request counts
- Per-model token breakdown (in/out) when available
- Token formatting: 1.2M, 45.3K instead of raw numbers
- Duration formatting: 1.5h, 3.2m instead of raw seconds
- Error section with warning icon
TCP_NODELAY streaming optimization
- Disables Nagle's algorithm on streaming connections
- Reduces per-packet latency by up to 40ms on small SSE events
- Applied to all 4 streaming code paths (openai-compat, retry, command-code, generic)
Anthropic prompt caching
- System prompts now sent as cache_control: ephemeral structured format
- Enables Anthropic's automatic prompt caching (saves tokens + cost on repeated prompts)

v2.6.1 (2026-05-20)

Google OAuth rebuilt to emulate Gemini CLI
- Uses Google's public OAuth client_id (same as gemini-cli)
- No client_secret.json needed — zero setup required
- PKCE (S256 code challenge) + CSRF state protection
- Scopes: cloud-platform, generative-language, userinfo.email, userinfo.profile
- Redirects to Google's success/failure pages (same as gemini-cli)
- Just click "OAuth Login" → browser opens → authorize → done
- Token file permissions set to 0600 for security

v2.6.0 (2026-05-20)

Usage Dashboard — per-provider tracking with visual cards
- Request counts, success/failure rates, token usage, latency stats
- Color-coded success rate bars (green/yellow/red)
- Per-model breakdown showing request counts
- Last error and last used timestamp
- Sorted by most-used provider
- Refresh button for live updates
Proxy usage tracking — records every request to usage-stats.json
Google OAuth: browse for client_secret.json with file picker dialog
- No longer requires copying to a specific path manually
- Auto-copies selected file to ~/.cache/codex-proxy/

v2.5.1 (2026-05-20)

Adaptive retry for transient errors (429/502/503)
- Exponential backoff: 2s → 4s → 8s, up to 3 retries
- Works for both single-provider and BGP mode
- BGP routes retry before failing over to next route
- Connection errors (reset/broken pipe) also retried
Proxy socket reuse — no more Address already in use crashes on restart
BGP startup log shows route count and names

v2.5.0 (2026-05-20)

AI BGP — Multi-provider routing with automatic failover
- New "AI BGP" button in main window → pool manager
- Create BGP pools with ordered routes from any configured endpoint
- Each route has its own endpoint URL, API key, model, and priority
- Failover strategy: tries primary route, automatically falls back to next on error/timeout
- BGP pools appear in endpoint dropdown with 🔀 icon
- Pool editor: add/remove/reorder routes, pick endpoint + model per route
- Up/down buttons for priority reordering
- Proxy logs [bgp] trying route 'Name' and [bgp] route 'Name' FAILED on fallback
- If all routes fail: returns 502 with detailed error per route
Fixed TOML config breakage from multi-line paste in API key field (_toml_safe())

v2.4.0 (2026-05-20)

Added OpenAdapter provider preset
- Base URL: https://api.openadapter.in/v1 — one API key, 40+ models
- Pre-loaded models: glm-4.7, DeepSeek-V3, kimi-k2.6, qwen3.6-plus, claude-sonnet-4-6, gpt-5.4, gemini-2.5-flash, and more
- Works with existing openai-compat proxy backend — no special handling needed
Fixed Add/Edit dialog crash (missing _on_reasoning_toggled method)
Redesigned Google OAuth flow with live status dialog and clickable auth URL

v2.3.2 (2026-05-20)

Added Google Gemini provider with OAuth support
- Two presets: "Google Gemini (API Key)" and "Google Gemini (OAuth)"
- OAuth Login button in endpoint editor — full Google OAuth2 flow
- Starts local HTTP server (port 8085), opens browser for Google consent
- Captures auth code, exchanges for access + refresh tokens
- Stores tokens in ~/.cache/codex-proxy/google-oauth-token.json
- Auto-refreshes access tokens when expired (no manual re-login)
- Uses Gemini's OpenAI-compatible endpoint: generativelanguage.googleapis.com/v1beta/openai
- Models: gemini-2.5-flash, gemini-2.5-pro, gemini-2.0-flash, gemini-2.0-flash-lite, and more
- Setup instructions shown if client_secret.json not found

v2.3.0 (2026-05-20)

Adaptive Crof self-healing system
- Tracks per-model success/failure history with item counts
- Dynamically learns max item limit per model (starts at 30, adjusts down on failures)
- Proactively compacts input when above learned limit before sending to upstream
- Auto-retry on finish_reason=length with aggressive re-compaction and resend
- Prevents stream disconnected and incomplete errors on long conversations
- All tracking logged to stderr: [crof-adaptive] model=X items=N OK/FAIL -> limit=N
Fixed NameError: _ts crash in debug logging
Fixed ConnectionResetError crash on client disconnect during streaming
Added 180s upstream timeout to prevent hanging connections
Compaction now preserves function_call/function_call_output pairs (no orphaned tool outputs)
Fixed reasoning control: reasoning_effort=none always sends both params

v2.2.1 (2026-05-20)

Fixed compaction orphaning function_call_output items — root cause of Crof incomplete responses
- Compaction cut between function_call and its function_call_output, creating dangling tool results
- Crof model received orphaned tool messages with empty tool_call_id, causing confusion and token exhaustion
- Compaction now expands tail boundary to include matching function_call/function_call_output pairs
Fixed reasoning control: reasoning_effort=none now always sends both enable_thinking=false AND reasoning_effort=none
- Crof API testing confirmed reasoning_effort=none is what actually suppresses reasoning, not enable_thinking=false
Added upstream debug logging to ~/.cache/codex-proxy/crof-upstream.jsonl

v2.2.0 (2026-05-20)

Added per-provider Reasoning controls in endpoint editor
- Reasoning On/Off toggle — disable reasoning for models that exhaust output tokens (e.g., Crof mimo-v2.5-pro)
- Reasoning Effort selector: None, Minimal, Low, Medium, High, Max
- When reasoning is OFF: sends enable_thinking=false + reasoning_effort=none to upstream API
- When reasoning is ON: sends user-selected effort level (default: Medium)
- Settings stored per-endpoint, passed through proxy config to upstream requests
Strip reasoning_content from proxy output — Codex doesn't use it, avoids token waste
Force max_tokens=64000 minimum for openai-compat providers — room for both reasoning and content
Inspired by unsloth's reasoning control patterns for Qwen/GPT-OSS models
Styled reasoning switch: green = ON, orange = OFF, gentle rounded pill shape
Added error handling to endpoint manager Add/Edit/Manage dialogs (prevents silent failures)

v2.1.3 (2026-05-19)

Fixed Crof mimo-v2.5-pro stopping mid-response (finish_reason=length)
- Root cause: model emits 600+ reasoning_content SSE chunks that exhaust max_tokens before any actual content is generated
- Strip reasoning_content from proxy output — Codex doesn't use reasoning, avoids wasting output tokens on invisible text
- Force max_tokens minimum of 64000 for openai-compat providers — gives models room for both reasoning and content
- Works for all openai-compat providers (Crof, Z.AI, DeepSeek, OpenRouter, etc.)

v2.1.2 (2026-05-19)

Fixed Crof.ai and providers stopping after first tool call (root cause: None tool IDs)
Codex sends function_call items with id=None — proxy now matches tool results to calls by call_id + positional fallback
Fixed orphan message output item when response is only tool calls (no text content)
Auto-trims long conversations (>30 items) to prevent context overflow on providers like Crof
- Keeps system/developer messages, original user query, and most recent 10 items
- Auto-compacts old items into a summary instead of just dropping them
- Summary includes: user requests, assistant responses, tool calls made, files touched
- Preserves enough context for the model to continue long tasks intelligently
Truncates large tool outputs (>8000 chars) to prevent model output token exhaustion
- Crof's models return incomplete when tool results contain too much text (e.g., full HTML pages)
- Truncated outputs include [truncated N chars] suffix so the model knows data was cut
Added request/response logging to ~/.cache/codex-proxy/requests.log for debugging
Proxy stderr no longer discarded by launcher (visible in terminal for debugging)

v2.1.1 (2026-05-19)

Added Command Code backend to translation proxy (proprietary /alpha/generate API)
Added Command Code provider preset with 20 models (DeepSeek, Claude, GPT, Kimi, GLM, Qwen, etc.)
Added cc_version field in endpoint editor for Command Code version (default: 0.26.8)
Proxy sends x-command-code-version header to CC API (fixes 403 "upgrade_required")
CC message conversion: system role → user, string content → array, tools stripped, real UUID for threadId
Fixed proxy: map developer role to system for Chat Completions providers (DeepSeek, Qwen, etc.)
Fixed proxy: map developer role to user for Anthropic providers
Forward instructions field from Responses API as system message/param

v2.1.0 (2026-05-19)

Added Codex auth status detection (reads codex login status)
Auth status bar shows logged-in provider or warning if auth missing/expired
"Re-login" button opens codex login in a terminal for re-authentication
Auto re-checks auth 30s after re-login flow starts
Pre-launch auth check warns before launching Codex Default mode if auth is invalid
Auth status checked asynchronously at startup (non-blocking)

v2.0.1 (2026-05-19)

Added Codex CLI/Desktop installation verifier to main page
Green check (✔) when detected, yellow cross (✘) when missing
"Install" button next to missing tools opens install guide dialog
Desktop/CLI launch buttons disabled with tooltip when tool is missing
Dependency status logged on startup
Buttons respect missing-state after busy/unbusy cycles

v2.0.0 (2026-05-19)

Initial release: multi-provider Codex Launcher
Translation proxy: Responses API to Chat Completions + Anthropic Messages
GTK endpoint manager with 10+ provider presets
Codex Default mode (built-in OAuth, zero config)
Browser UA injection for Cloudflare-protected providers (OpenCode)
Streaming SSE, tool calls, reasoning content support
Profile backup/import, model auto-fetch, bulk import
Refresh Models in background thread
URL normalization to prevent double-path bugs
Config backup/restore around sessions
.deb installer package

29 KiB Raw Permalink Blame History Unescape Escape