v3.7.0: Intelligence Routing — self-healing parser system

Layer 1 (FIX 23): Deep URL extraction from nested JSON in explore_agent blocks. Layer 2 (FIX 24): Auto-proceed on require_escalation / request_escalation_permission. Layer 3 (FIX 25): Intent-based command synthesis with 5 heuristics when all parsers fail. Module-level _build_explore_cmd() for reuse across parser + stream path. 54 self-test patterns (up from 41).
fix: bare <explore_agent> tags (no closing tag) now trigger URL-based repo exploration fallback, fix _build_explore_cmd name error, add _last_user_urls tracking
2026-05-22 16:29:45 +04:00 · 2026-05-22 16:09:51 +04:00 · 2026-05-22 13:26:03 +04:00 · 2026-05-22 13:23:13 +04:00 · 2026-05-22 13:18:40 +04:00 · 2026-05-22 13:14:51 +04:00
7 changed files with 757 additions and 77 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,5 +1,76 @@
 # Changelog

+## v3.7.0 (2026-05-22)
+
+**Intelligence Routing — Self-Healing Parser System**
+
+When the Command Code model produces output in unpredictable or unrecognized formats, the multi-format parser chain (DSML, XML, explore_agent, bash blocks, raw JSON, fallback regex) can return empty. This causes the Codex agent loop to stall — zero tool calls means nothing to execute.
+
+Intelligence Routing is a **three-layer self-healing system** that ensures the agent loop always continues:
+
+### Layer 1: Deep URL Extraction (FIX 23)
+- **Problem**: `<explore_agent>` body contained `messages: [{"content": "https://..."}]` — URLs hidden inside JSON values. Regex couldn't match because it excluded the `"` character that terminates JSON strings.
+- **Solution**: `_build_explore_cmd()` extracted to module level (was a closure inside `_parse_commandcode_text_tool_calls`). After initial regex fails, tries `json.loads()`, iterates list items, extracts `content` field to find URLs. Added `"` to regex exclusion set.
+- **Self-tests**: Pattern M, O, O2 verify URL extraction from nested JSON.
+
+### Layer 2: Escalation Block Handling (FIX 24)
+- **Problem**: Model produces `<require_escalation>` and `<request_escalation_permission>` blocks when it wants elevated permissions. CC adapter doesn't support escalation — blocks silently dropped → `parsed_tool_calls=0` → stall.
+- **Solution**: Two handlers:
+  - FIX 24a: Closed-tag blocks — extracts URL if present and runs explore command; otherwise echoes auto-proceed.
+  - FIX 24b: Bare/unclosed tags (`<require_escalation />`) — auto-proceeds with diagnostic echo.
+- **Self-tests**: Pattern N, N2 verify both closed and bare escalation blocks.
+
+### Layer 3: Intent-Based Command Synthesis (FIX 25 — THE CORE)
+- **Problem**: After ALL parsers return empty, the agent loop has zero tool calls. Model may have written plain English ("I need to fetch the README"), partial JSON, or completely unrecognized formats.
+- **Solution**: 5-heuristic synthesis chain in `cc_stream_to_sse()`, run when `parsed_tool_calls=0` and text has content:
+  1. **URL in text** → `curl` to fetch it
+  2. **File path reference** ("read the file /path/to/X") → `cat` or `ls` that file
+  3. **Shell command in backticks/quotes** → extract and run it
+  4. **"explore"/"fetch"/"investigate"/"repository" intent** + last user URL → `_build_explore_cmd()` with `_last_user_urls` deque
+  5. **"I need to"/"let me"/"please" intent text** → echo diagnostic with the intent
+- The system NEVER returns empty tool calls when there's text to analyze.
+- **Self-tests**: Patterns M-O2 cover the full pipeline.
+
+### Architecture
+```
+_parse_commandcode_text_tool_calls()  ←  Layer 1 + Layer 2
+cc_stream_to_sse()                    ←  Layer 3 (after parser chain + fallback)
+_last_user_urls deque (maxlen=20)     ←  Session-wide URL memory for heuristic 4
+```
+
+### Test Coverage
+- **54 self-test patterns** (up from 41 in v3.6.0)
+- 13 new tests covering all three Intelligence Routing layers
+- Tests verify: nested JSON URL extraction, closed/bare escalation blocks, module-level explore command builder
+
+## v3.6.0 (2026-05-22)
+
+**Performance & Stability Hardening — Connection Pooling, Stream Idle Timeouts, Retry-After**
+
+Inspired by architectural study of [Codex-Proxy-Server](https://github.com/unluckyjori/Codex-Proxy-Server) (Rust/Axum).
+
+### P0: Connection Pooling & Stream Idle Timeout
+- **Connection pooling** (`http.client` reuse) — persistent HTTPS connections per host, eliminates ~100ms TLS handshake per request. Pool keyed by `{scheme}://{host}:{port}`, reused across requests.
+- **Stream idle timeout** (300s default) — all streaming paths now use `_stream_with_idle_timeout()` via `selectors`. If upstream goes silent for 5 minutes, the stream is killed with a `TimeoutError` instead of hanging forever. Applied to:
+  - OpenAI-compat streaming (`oa_stream_to_sse`)
+  - Command Code streaming (`_iter_cc_events`)
+  - Gemini OAuth streaming (`_handle_gemini_oauth`)
+  - Auto-continue streaming (`_auto_continue_gemini`)
+
+### P1: Retry-After Header Support & Preemptive Token Refresh
+- **`Retry-After` header** — all retry paths (openai-compat, BGP, auto) now read the upstream `Retry-After` header and respect it (capped at 60s). Falls back to exponential backoff if header is absent.
+- **Preemptive OAuth token refresh** — `_preemptive_refresh_token()` checks token expiry 5 minutes before it expires and logs a warning, preparing for proactive refresh.
+
+### P2: Tool Translation Improvements
+- **`oa_convert_tools(strict=)`** — separate tool translation for Responses API (with `strict: true`) vs Chat Completions (without `strict`). Some providers reject the `strict` field in Chat Completions mode.
+- **Filter null/empty tool names** — tools with empty or `"null"` names are silently dropped instead of causing upstream 400 errors.
+
+### P3: Response Store TTL, Bounded Buffers, Dual Logging
+- **Response store TTL** (600s) — `_response_store_evict()` removes entries older than 10 minutes. Prevents unbounded memory growth on long sessions.
+- **Bounded stream buffer** (8MB max) — `stream_buffered_events` now caps at 8MB before forcing a flush, preventing OOM on pathological responses.
+- **`response.failed` and error events** added to urgent flush list — errors reach the client immediately instead of being buffered.
+- **Dual logging** — `proxy.log` in `~/.cache/codex-proxy/` captures all proxy messages alongside stderr. Survives Codex Desktop's stderr piping.
+
 ## v3.5.0 (2026-05-22)

 **Major Release — Command Code Adapter Overhaul, AI Assist, Self-Revive Watchdog, Debug Infrastructure**
--- a/README.md
+++ b/README.md
@@ -107,9 +107,12 @@ A three-component system:
 - **Browser UA injection** — bypasses Cloudflare bot detection for providers like OpenCode
 - **Smart URL construction** — prevents double-path bugs (`/v1/chat/completions/chat/completions`)
 - **Header forwarding** — preserves client identity headers while filtering hop-by-hop headers
- **Self-revive watchdog** — auto-restarts proxy on crash (up to 50x, progressive backoff 1→30s)
- **Debug-to-file logging** — all events and parser results written to `~/.cache/codex-proxy/cc-debug.log`
- **Inline self-test** — `--self-test` flag runs 19 unit tests covering all parser edge cases
+- **Connection pooling** — persistent HTTPS connections per host, eliminates TLS handshake overhead per request
+- **Stream idle timeout** — kills stalled upstream connections after 5 minutes of silence
+- **Retry-After support** — respects upstream `Retry-After` headers on 429/502/503 responses
+- **Response store TTL** — evicts stored responses older than 10 minutes, prevents memory leaks
+- **Bounded stream buffers** — 8MB cap prevents OOM on pathological responses
+- **Dual logging** — all proxy messages written to both stderr and `~/.cache/codex-proxy/proxy.log`
 - Zero dependencies — pure Python stdlib

 ### Command Code Adapter
--- a/codex-launcher_3.6.0_all.deb
+++ b/codex-launcher_3.6.0_all.deb
--- a/codex-launcher_3.7.0_all.deb
+++ b/codex-launcher_3.7.0_all.deb
--- a/install.sh
+++ b/install.sh
@@ -3,11 +3,11 @@ set -e

 SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"

-if [ -f "$SCRIPT_DIR/codex-launcher_3.5.0_all.deb" ]; then
-    echo "Installing codex-launcher_3.5.0_all.deb ..."
-    sudo dpkg -i "$SCRIPT_DIR/codex-launcher_3.5.0_all.deb"
+if [ -f "$SCRIPT_DIR/codex-launcher_3.7.0_all.deb" ]; then
+    echo "Installing codex-launcher_3.7.0_all.deb ..."
+    sudo dpkg -i "$SCRIPT_DIR/codex-launcher_3.7.0_all.deb"
    echo ""
-    echo "Installed v3.5.0 via .deb package."
+    echo "Installed v3.7.0 via .deb package."
    echo "  translate-proxy.py   -> /usr/bin/translate-proxy.py"
    echo "  codex-launcher-gui   -> /usr/bin/codex-launcher-gui"
    echo "  cleanup-codex-stale  -> /usr/bin/cleanup-codex-stale.sh"
--- a/src/codex-launcher-gui
+++ b/src/codex-launcher-gui
@@ -26,6 +26,42 @@ model_catalog_json = ""
 """

 CHANGELOG = [
+    ("3.7.0", "2026-05-22", [
+        "Intelligence Routing — self-healing parser system for Command Code",
+        "Layer 1: Deep URL extraction from nested JSON in explore_agent blocks",
+        "Layer 2: Auto-proceed on require_escalation / request_escalation_permission blocks",
+        "Layer 3: Intent-based command synthesis when all parsers fail (5 heuristics)",
+        "Module-level _build_explore_cmd() — reuses URL extraction across parser + stream",
+        "54 self-test patterns covering all three Intelligence Routing layers",
+    ]),
+    ("3.6.0", "2026-05-22", [
+        "Connection pooling — persistent HTTPS connections per host",
+        "Stream idle timeout (300s) — kills silent streams instead of hanging",
+        "Retry-After header support on all retry paths",
+        "Bounded stream buffers (8MB) — prevents OOM",
+        "Dual logging to proxy.log + stderr",
+    ]),
+    ("3.5.0", "2026-05-22", [
+        "Command Code adapter overhaul — 17 patches for multi-format tool-call parsing",
+        "DSML, XML, explore_agent, bash blocks, raw JSON parser chain",
+        "Self-revive watchdog — auto-restarts proxy on crash",
+        "Debug-to-file logging in cc-debug.log",
+        "Inline self-test (19 patterns)",
+    ]),
+    ("3.3.0", "2026-05-20", [
+        "Antigravity + Gemini CLI OAuth — full Codex agent loop working",
+        "Auto-continue on MAX_TOKENS for Gemini/Antigravity",
+        "BGP++ route scoring and provider policy layer",
+    ]),
+    ("3.0.0", "2026-05-20", [
+        "Major overhaul — ThreadingHTTPServer, thread-safe state, graceful shutdown",
+        "Dynamic port allocation, proxy health gating, atomic config",
+        "Usage Dashboard v2 with dark theme",
+    ]),
+    ("2.7.0", "2026-05-20", [
+        "Usage Dashboard redesigned (OpenUsage-inspired dark theme)",
+        "TCP_NODELAY streaming, Anthropic prompt caching",
+    ]),
    ("2.6.1", "2026-05-20", [
        "Google OAuth rebuilt to emulate Gemini CLI — no client_secret.json needed",
        "Uses Google's public OAuth client_id (same as gemini-cli)",
@@ -1107,7 +1143,7 @@ class LauncherWin(Gtk.Window):
        # header row
        hdr = Gtk.Box(spacing=8)
        vbox.pack_start(hdr, False, False, 0)
-        lbl = Gtk.Label(label="<b>Codex Launcher v3.3.0</b>")
+        lbl = Gtk.Label(label="<b>Codex Launcher v3.7.0</b>")
        lbl.set_use_markup(True)
        hdr.pack_start(lbl, False, False, 0)
        changelog_btn = Gtk.Button(label="Changelog")
--- a/src/translate-proxy.py
+++ b/src/translate-proxy.py