Files

admin beea20686b v3.6.0 — Performance & Stability Hardening

P0: Connection pooling (http.client reuse per host), stream idle timeout
    (300s via selectors) on all streaming paths (OA/CC/Gemini/auto-continue)
P1: Retry-After header support on all retry paths, preemptive OAuth token
    refresh (5min before expiry)
P2: oa_convert_tools(strict=) for Responses vs Chat Completions, filter
    null/empty tool names
P3: Response store TTL (600s eviction), bounded stream buffers (8MB cap),
    response.failed/error urgent flush, dual logging (proxy.log)

.deb: v3.6.0 (71KB) — v3.5.0 and v3.3.0 kept as fallback

2026-05-22 13:14:51 +04:00

26 KiB

Raw Permalink Blame History

Get 10% OFF Z.AI coding plans
z.ai/subscribe

Codex Launcher — Any AI Provider

Run OpenAI Codex CLI & Desktop with any AI provider.
Google Antigravity • Gemini CLI • OpenCode • Z.AI • Anthropic • Command Code • OpenRouter • Crof.ai • NVIDIA NIM • Kilo.ai • DeepSeek • and more

The Problem

OpenAI's Codex CLI v2.0+ exclusively uses the Responses API — a protocol that is incompatible with virtually every other AI provider:

Provider	API	Works with Codex?
OpenAI	Responses API	✅
Google Antigravity (OAuth)	Code Assist / Gemini Native	✅
Gemini CLI OAuth	Code Assist	✅
Z.AI	Chat Completions	✅
OpenCode	Chat Completions	✅
Anthropic	Messages API	✅
Command Code	Custom `/alpha/generate`	✅
Ollama	Chat Completions	✅
OpenRouter	Chat Completions	✅
NVIDIA NIM	Chat Completions	✅
Crof.ai	Chat Completions	✅

The protocols differ in endpoint paths, message formats, tool-call structures, streaming events, and completion semantics. You can't just swap a base URL.

The Solution

A three-component system:

Translation Proxy — translates Responses API ↔ Chat Completions / Anthropic Messages in real-time
Config Engine — generates Codex config files on-the-fly per provider, with backup/restore
GTK Launcher — manages endpoints, launches Desktop or CLI, handles the full lifecycle

┌─────────────────────────────────────────────────────────────────────┐
│                         Codex Launcher GUI                          │
│              (endpoint management + AI Assist + lifecycle)          │
└──────────┬─────────────────┬──────────────────┬────────────────────┘
           │                 │                  │
    ┌──────▼──────┐  ┌──────▼──────┐  ┌────────▼─────────┐
    │  Codex      │  │  Native     │  │  Translation     │
    │  Default    │  │  OpenAI     │  │  Proxy           │
    │  (remove    │  │  (direct    │  │  (auto-revive)   │
    │  config)    │  │  URL)       │  │                  │
    └──────┬──────┘  └──────┬──────┘  └────────┬─────────┘
           │                │                   │
           ▼                ▼          ┌────────┴────────┐
    ┌──────────────┐ ┌───────────┐    │                 │
    │ Built-in     │ │ config.   │    ▼                 ▼
    │ Codex OAuth  │ │ toml      │ ┌────────────┐ ┌───────────┐ ┌──────────┐
    └──────────────┘ └───────────┘ │ OpenAI     │ │ Anthropic │ │ Command  │
                                   │ Chat Comp. │ │ Messages  │ │ Code     │
                                   └────────────┘ └───────────┘ └──────────┘

Features

Multi-Provider Support

Native OpenAI — direct connection, no proxy needed
OpenAI-compatible — Z.AI, OpenCode Zen/Go, Crof.ai, NVIDIA NIM, Kilo.ai, OpenRouter, Ollama, Together, Groq, and any provider with a Chat Completions endpoint
Anthropic — Claude models via the Messages API
Command Code — 20 models (DeepSeek, Claude, GPT, Kimi, GLM, Qwen, etc.) via Command Code's /alpha/generate API with configurable client version
Codex Default — built-in Codex OAuth with official models, zero config

Translation Proxy (`translate-proxy.py`)

Full Responses API ↔ Chat Completions / Anthropic Messages / Command Code API bidirectional translation
Streaming SSE support with proper event sequencing (response.created → response.output_text.delta → response.completed)
Tool calls — full function calling support including parallel tool calls
Reasoning content — forwards reasoning_content fields from providers that support it
Browser UA injection — bypasses Cloudflare bot detection for providers like OpenCode
Smart URL construction — prevents double-path bugs (/v1/chat/completions/chat/completions)
Header forwarding — preserves client identity headers while filtering hop-by-hop headers
Connection pooling — persistent HTTPS connections per host, eliminates TLS handshake overhead per request
Stream idle timeout — kills stalled upstream connections after 5 minutes of silence
Retry-After support — respects upstream Retry-After headers on 429/502/503 responses
Response store TTL — evicts stored responses older than 10 minutes, prevents memory leaks
Bounded stream buffers — 8MB cap prevents OOM on pathological responses
Dual logging — all proxy messages written to both stderr and ~/.cache/codex-proxy/proxy.log
Zero dependencies — pure Python stdlib

Command Code Adapter

Multi-format tool-call parser — handles all known CC model output formats in a cascading chain:
- DSML tags (<｜｜DSML｜｜invoke>) — current model format
- <bash>...</bash> blocks with metadata extraction
- <explore_agent> blocks converted to real exec_command
- <tool_call type="bash"> HTML-like blocks
- XML <function= patterns
- Raw JSON {"cmd":"..."} embedded in text
- Fallback regex for unrecognized tool-call signals
Three-tier argument parser — handles double-wrapped, escaped, and unicode-escaped arguments
Recursive unwrapping — handles double/triple-wrapped cmd values
Post-extraction sanitizer — validates every tool call has valid name + args before forwarding to Codex
ErrorAnalyzer — learns from 4xx errors, retries with adjusted parameters (max 2 retries)
Schema cache with 24h staleness TTL for provider capabilities

GTK Launcher (`codex-launcher-gui`)

Endpoint manager — add, edit, delete, set default providers
Provider presets — one-click setup for 15+ providers with pre-filled URLs and model lists
Model auto-fetch — pulls available models directly from provider APIs
Bulk model import — paste a comma/newline-separated list of model IDs
Launch Desktop — starts Codex Desktop with the selected provider and model
Launch CLI — opens Codex CLI in a terminal with the selected provider
Codex Default — launch with built-in OAuth, no proxy or custom config
AI Assist — integrated AI-powered configuration assistance and troubleshooting
Usage Dashboard — per-provider tracking with dark theme, KPI strip, model bars, status pills
Profile backup/import — export and import endpoint configurations as portable JSON bundles
Threaded operations — model refresh runs in background, UI stays responsive
Process lifecycle — stall detection, kill/cleanup, config backup/restore around sessions
Config normalization — automatically strips stale API path suffixes from URLs
Reasoning controls — per-provider reasoning toggle with effort level selection

Process Management

Kills stale electron/webview/app-server processes from previous sessions
Removes stale PID files and sockets
Manages proxy lifecycle (start, health-check, stop)
Config backup before launch, automatic restore after exit

Technology Stack

Component	Technology	Why
Translation Proxy	Python 3 stdlib (`http.server`, `urllib`, `json`)	Zero dependencies, runs anywhere
GUI Launcher	Python 3 + GTK 3.0 (`PyGObject/gi`)	Native Linux desktop integration
Config Engine	Python 3 (`toml` generation, `json` catalogs)	Dynamic, no hardcoded configs
Process Mgmt	bash + `os.setsid`/`os.killpg`	Unix process group lifecycle
Streaming	Server-Sent Events (SSE)	Required by Codex Responses API
API Translation	Responses API ↔ Chat Completions / Anthropic Messages	Protocol bridging

Zero pip dependencies. Everything uses Python stdlib + system GTK bindings.

Quick Start

Prerequisites

Codex CLI ≥ 2.0 (npm install -g @openai/codex or bundled with Codex Desktop)
Codex Desktop installed at /opt/codex-desktop/ (optional, for Desktop mode)
Python 3.8+ (stdlib only)
python3-gi for GTK (sudo apt install python3-gi)
bash, curl, lsof

Install

git clone https://github.rommark.dev/admin/Codex-Launcher---Any-AI-Porovider.git
cd Codex-Launcher---Any-AI-Porovider
./install.sh

Run

Open Codex Launcher from your app grid, or:

codex-launcher-gui

First Launch

Click Manage Endpoints → Add
Select a provider preset (e.g., "OpenCode Zen (OpenAI-compatible)")
Enter your API key
Click Fetch from API to auto-discover models, or add them manually
Click Save
Select the endpoint and model from the dropdowns
Click Launch Desktop or Launch CLI

Development Journey

Phase 1: The Z.AI Proxy — Protocol Reverse Engineering

Problem: Codex CLI v2.0 switched exclusively to the Responses API. Z.AI (and every other provider) uses Chat Completions. They are fundamentally incompatible.

Approach:

Captured Codex's HTTP traffic to understand the exact Responses API request/response shape
Mapped the protocol differences:
- /v1/responses → /chat/completions (endpoint)
- input array with typed items → messages array with role/content (message format)
- function_call items → tool_calls array on assistant messages (tool format)
- SSE response.output_text.delta events → delta.content chunks (streaming)
Built the initial zai-proxy.py — a 200-line HTTP server that translates in both directions
Hit a critical bug: Codex hung in "thinking" state. Discovered that merely emitting response.done is insufficient — the response.completed event must contain the full output item array
Added streaming SSE with proper event sequencing — this was the breakthrough that made it work

Testing: Manual end-to-end testing with curl, Codex CLI --profile zai, and Codex Desktop. Verified streaming, tool calls, and reasoning content.

Phase 2: Multi-Provider Architecture — The Unified Proxy

Problem: Maintaining separate proxies for each provider (Z.AI, Anthropic, OpenRouter, etc.) was unmaintainable.

Approach:

Abstracted the translation into a backend plugin architecture:
- openai-compat backend: translates Responses → Chat Completions (works for any OpenAI-compatible API)
- anthropic backend: translates Responses → Anthropic Messages API
Unified config loading: JSON config file, CLI arguments, and environment variables
Shared HTTP server, model serving, and SSE framework
Per-backend: message conversion, tool conversion, response conversion, stream conversion

Key design decisions:

Pure Python stdlib — no Flask, no aiohttp, no pip dependencies. The proxy must work on any system with Python 3.
http.server.BaseHTTPRequestHandler — simple, synchronous, but sufficient for single-user desktop use
Config via JSON file — the launcher writes a proxy config to ~/.cache/codex-proxy/ for each endpoint

Phase 3: The GTK Launcher — Desktop Integration

Problem: Users had to manage config files, start proxies manually, and remember which wrapper script to use.

Approach:

Built a GTK 3.0 GUI with three layers: main window, endpoint manager dialog, edit endpoint dialog
Implemented dynamic config generation: for each launch, generate config.toml with the right provider definition, profile, and model catalog
Model catalog generation with dual field naming (slug + model, supported_reasoning_levels + supportedReasoningEfforts) — required because Codex CLI and Codex Desktop use different field names
Process lifecycle: backup config → cleanup stale processes → start proxy → write config → launch Codex → wait → restore config → stop proxy

Threading challenges:

GTK requires all UI updates on the main loop. Used GLib.idle_add() for all cross-thread communication
Model refresh was blocking the UI — moved to a background thread with idle_add completion callbacks
Proxy startup waits up to 15 seconds with health checks before proceeding

Phase 4: Endpoint Management & Provider Presets

Approach:

JSON-based endpoint storage (~/.codex/endpoints.json)
Provider presets with pre-filled URLs and model lists for common providers
Auto-fetch models from provider /v1/models endpoints
Bulk model import (paste comma/newline-separated model IDs)
Profile backup/import — portable JSON bundles with endpoints + config

URL normalization: Discovered that saved URLs sometimes had /chat/completions appended from manual entry. Added normalize_base_url() that strips trailing API path suffixes to prevent double-path bugs.

Phase 5: Cloudflare Bot Detection — The OpenCode 403 Saga

Problem: OpenCode Zen/Go returned 403 (Cloudflare error 1010) for all requests.

Investigation:

Tested direct curl requests — all returned 403 regardless of auth header type (Bearer, x-api-key, both, none)
Examined Codex logs — found turn.has_metadata_header=true and error code: 1010
Error 1010 is Cloudflare's "Access denied" — bot detection based on User-Agent and browser headers
Python's urllib sends User-Agent: Python-urllib/3.x which triggers the block

Solution:

Added _BROWSER_HEADERS — a set of Chrome-like headers (User-Agent, Sec-Ch-Ua, Sec-Fetch-*)
forwarded_headers() with browser_ua=True — injects browser headers while preserving incoming client headers and filtering hop-by-hop headers
Applied only to openai-compat backend where Cloudflare is common
This resolved the 403 — upstream now returns 401 (auth) instead of 403 (bot block), confirming the headers work

Phase 6: Codex Default Mode — OAuth Without Config

Problem: Users wanted to quickly switch back to built-in Codex OAuth without maintaining a separate endpoint.

Approach:

Added "Codex Default (Desktop)" and "Codex Default (CLI)" buttons
On launch: backup config → delete config.toml entirely → start Codex → restore config after exit
Key insight: writing empty strings (model = "", model_provider = "") causes Codex to error with "Model provider `` not found". The config must not exist at all for Codex to fall back to built-in defaults.

Phase 7: Command Code Multi-Format Parser — The 17-Fix Odyssey

Problem: Command Code provider's tool calls were silently dropped, causing the Codex agent loop to stop after the first response. The CC model returns tool calls as inline text in wildly varying formats that change between sessions and model versions.

Root Cause Analysis:

CC's /alpha/generate API uses a proprietary protocol — not Chat Completions, not Anthropic Messages
Tool calls appear as inline text within text-delta SSE events, not as structured JSON
The model output format is non-deterministic — observed 6+ distinct formats:
- Raw JSON: {"cmd":"mkdir -p /foo","type":"tool-call"}
- XML: <function name="exec_command"><parameter name="cmd">...</parameter></function>
- HTML-like: <tool_call type="bash">\n{"command":"..."}
- Bash blocks: <bash>\nprefix_rule: ...\n{"command":"..."}</bash>
- Explore blocks: <explore_agent>...</explore_agent>
- DSML tags: <｜｜DSML｜｜invoke name="exec"><｜｜DSML｜｜parameter name="command">...</parameter></invoke>
Additional complications: double-wrapped arguments, unescaped quotes, unicode escapes, missing fields

The Fix — 17 Incremental Patches: Built a cascading parser chain (DSML → bash → explore → tool_call → XML → raw JSON → fallback regex) that tries each format in order. Each patch addressed a specific format observed in production:

FIX 1–4: Foundation — string-only content, version headers, cache clearing, streaming error handling
FIX 5–8: Core parsing — raw JSON extraction, three-tier argument parser, field extraction, permission normalization
FIX 9–10: Cleanup — removed dead code, added documentation
FIX 11–11c: Robustness — recursive unwrapping of nested cmd values, post-extraction sanitizer, XML regex fix
FIX 12: Self-revive watchdog — proxy auto-restarts on crash instead of dying silently
FIX 13–17: New format support — fallback extraction, HTML-like blocks, explore blocks, bash blocks, DSML tags

Key Design Decision: Field-level regex extraction instead of JSON parsing. Standard JSON parsers fail on unescaped quotes in shell commands (e.g., echo "hello world" breaks JSON). The regex approach tolerates malformed JSON by extracting individual fields.

Verification: --self-test flag runs 19 automated tests covering all edge cases. Debug logging to ~/.cache/codex-proxy/cc-debug.log captures every parser decision for troubleshooting.

Architecture Deep Dive

Request Flow (OpenAI-compatible provider)

Codex CLI/Desktop
    │
    │ POST /responses (Responses API)
    │ Body: { model: "glm-5.1", input: [...], stream: true }
    ▼
┌─────────────────────────────────┐
│  translate-proxy.py (port 8080) │
│                                 │
│  1. Parse Responses API body    │
│  2. Convert input → messages    │
│  3. Convert tools format        │
│  4. Inject browser UA headers   │
│                                 │
│  POST /v1/chat/completions      │──→  Upstream Provider
│  Body: { model, messages, ... }│      (opencode.ai, z.ai, etc.)
│                                 │
│  5. Receive Chat Comp. response │  ←── SSE stream or JSON
│  6. Convert response format     │
│  7. Emit Responses API SSE      │
│                                 │
└─────────────────────────────────┘
    │
    │ SSE: response.created
    │ SSE: response.output_text.delta  (streamed tokens)
    │ SSE: response.output_text.done
    │ SSE: response.completed
    ▼
Codex CLI/Desktop (receives Responses API events)

Config Lifecycle

┌─────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐    ┌─────────┐
│ Backup   │───→│ Cleanup  │───→│ Generate │───→│ Launch   │───→│ Restore │
│ config   │    │ stale    │    │ config   │    │ Codex    │    │ config  │
│ .toml    │    │ processes│    │ + models │    │ process  │    │ from    │
│          │    │          │    │          │    │          │    │ backup  │
└─────────┘    └──────────┘    └──────────┘    └──────────┘    └─────────┘

Model Catalog Format

The launcher generates model catalog JSON with dual field naming to satisfy both Codex CLI and Codex Desktop:

{
  "models": [{
    "slug": "glm-5.1",          // CLI reads this
    "model": "glm-5.1",         // Desktop reads this
    "supported_reasoning_levels": [...],  // CLI
    "supportedReasoningEfforts": [...],   // Desktop
    ...
  }]
}

Provider Presets

Preset	Backend	Base URL
OpenAI	Native	`https://api.openai.com/v1`
Anthropic	Anthropic	`https://api.anthropic.com/v1`
OpenCode Zen	OpenAI-compat	`https://opencode.ai/zen/v1`
OpenCode Go	OpenAI-compat	`https://opencode.ai/zen/go/v1`
Command Code	Command Code	`https://api.commandcode.ai`
Crof.ai	OpenAI-compat	`https://crof.ai/v1`
NVIDIA NIM	OpenAI-compat	`https://integrate.api.nvidia.com/v1`
Kilo.ai	OpenAI-compat	`https://api.kilo.ai/api/gateway`
OpenRouter	OpenAI-compat	`https://openrouter.ai/api/v1`
Z.AI	OpenAI-compat	`https://api.z.ai/api/coding/paas/v4`
Custom	Any	User-defined

File Structure

src/
├── translate-proxy.py            # Translation proxy (openai-compat + anthropic + command-code)
├── codex-launcher-gui            # GTK launcher GUI
├── cleanup-codex-stale.sh        # Stale process cleanup
└── codex-launcher.desktop.template  # Desktop entry template

install.sh                        # One-command installer
README.md                         # This file

Installed Locations

/usr/bin/translate-proxy.py               # Proxy (from .deb)
/usr/bin/codex-launcher-gui               # Launcher (from .deb)
/usr/bin/cleanup-codex-stale.sh           # Cleanup (from .deb)
/usr/share/applications/codex-launcher.desktop  # App grid entry
~/.codex/endpoints.json                   # Endpoint storage
~/.codex/config.toml                      # Codex config (auto-generated)
~/.cache/codex-proxy/                     # Proxy configs + model catalogs
~/.cache/codex-proxy/cc-debug.log         # Debug log (per-request)

Troubleshooting

Issue	Cause	Fix
Window opens then disappears	Stale processes from previous session	Click Kill && Cleanup
Window never opens	Startup freeze	Kill && Cleanup, then retry
"Model provider `` not found"	Empty strings in config	V2 deletes config entirely for Default mode
403 Forbidden from OpenCode	Cloudflare bot detection	Proxy injects browser UA headers
401 Unauthorized from OpenCode	Invalid key or no credits	Check API key and billing
Double path in URL	Stale `/chat/completions` in base URL	`normalize_base_url()` strips suffixes
Proxy stops when terminal closes	SIGHUP to subprocess	Launcher uses `os.setsid` process groups
Models not showing in picker	Wrong model catalog format	Must have both `slug` + `model` fields
Codex hangs in "thinking"	Missing `response.completed`	Proxy emits full SSE event sequence
Stops after first tool call (Crof)	`previous_response_id` not resolved	V2.1.2 stores and chains responses for multi-turn
CC agent stops after first response	Tool calls not parsed from model text	V3.5 multi-format parser handles all CC output formats
CC tool calls have wrong args	Double-wrapped arguments	V3.5 three-tier parser + recursive unwrapping
Proxy crashes mid-session	Unhandled streaming error	V3.5 self-revive watchdog auto-restarts
CC 403 upgrade_required	Missing version header	V3.5 always sends `x-command-code-version`

Adding a New Provider

Click Manage Endpoints → Add
Choose a preset or set Custom
Set backend type: OpenAI-compatible, Anthropic, or Native
Enter base URL and API key
Click Fetch from API or add models manually
Save and launch

For providers behind Cloudflare, the proxy automatically injects browser headers. For providers with non-standard APIs, add a new backend module to translate-proxy.py following the oa_* / an_* pattern.

Manual Proxy Usage

# Start proxy for any OpenAI-compatible provider
python3 src/translate-proxy.py \
  --backend openai-compat \
  --target-url https://api.your-provider.com/v1 \
  --api-key YOUR_KEY \
  --port 8080

# Or use a JSON config
python3 src/translate-proxy.py --config my-proxy-config.json

# Then run Codex
codex --profile my-profile -c model=my-model

Requirements

Python ≥ 3.8
python3-gi (sudo apt install python3-gi)
Codex CLI ≥ 2.0
Codex Desktop (optional, for Desktop mode)
bash, curl, lsof

No pip dependencies. Zero. Pure stdlib + system GTK.

License

MIT

Get 10% OFF Z.AI coding plans

26 KiB Raw Permalink Blame History Unescape Escape

Codex Launcher — Any AI Provider

The Problem

The Solution

Features

Multi-Provider Support

Translation Proxy (translate-proxy.py)

Command Code Adapter

GTK Launcher (codex-launcher-gui)

Process Management

Technology Stack

Quick Start

Prerequisites

Install

Run

First Launch

Development Journey

Phase 1: The Z.AI Proxy — Protocol Reverse Engineering

Phase 2: Multi-Provider Architecture — The Unified Proxy

Phase 3: The GTK Launcher — Desktop Integration

Phase 4: Endpoint Management & Provider Presets

Phase 5: Cloudflare Bot Detection — The OpenCode 403 Saga

Phase 6: Codex Default Mode — OAuth Without Config

Phase 7: Command Code Multi-Format Parser — The 17-Fix Odyssey

Architecture Deep Dive

Request Flow (OpenAI-compatible provider)

Config Lifecycle

Model Catalog Format

Provider Presets

File Structure

Installed Locations

Troubleshooting

Adding a New Provider

Manual Proxy Usage

Requirements

License

26 KiB

Raw Permalink Blame History

Translation Proxy (`translate-proxy.py`)

GTK Launcher (`codex-launcher-gui`)