Files
admin beea20686b v3.6.0 — Performance & Stability Hardening
P0: Connection pooling (http.client reuse per host), stream idle timeout
    (300s via selectors) on all streaming paths (OA/CC/Gemini/auto-continue)
P1: Retry-After header support on all retry paths, preemptive OAuth token
    refresh (5min before expiry)
P2: oa_convert_tools(strict=) for Responses vs Chat Completions, filter
    null/empty tool names
P3: Response store TTL (600s eviction), bounded stream buffers (8MB cap),
    response.failed/error urgent flush, dual logging (proxy.log)

.deb: v3.6.0 (71KB) — v3.5.0 and v3.3.0 kept as fallback
2026-05-22 13:14:51 +04:00

26 KiB
Raw Permalink Blame History

Z.AI 10% OFF

Get 10% OFF Z.AI coding plans
z.ai/subscribe


Codex Launcher — Any AI Provider

Run OpenAI Codex CLI & Desktop with any AI provider.
Google Antigravity • Gemini CLI • OpenCode • Z.AI • Anthropic • Command Code • OpenRouter • Crof.ai • NVIDIA NIM • Kilo.ai • DeepSeek • and more

Python 3.8+ GTK 3.0 MIT License Zero pip dependencies


The Problem

OpenAI's Codex CLI v2.0+ exclusively uses the Responses API — a protocol that is incompatible with virtually every other AI provider:

Provider API Works with Codex?
OpenAI Responses API
Google Antigravity (OAuth) Code Assist / Gemini Native
Gemini CLI OAuth Code Assist
Z.AI Chat Completions
OpenCode Chat Completions
Anthropic Messages API
Command Code Custom /alpha/generate
Ollama Chat Completions
OpenRouter Chat Completions
NVIDIA NIM Chat Completions
Crof.ai Chat Completions

The protocols differ in endpoint paths, message formats, tool-call structures, streaming events, and completion semantics. You can't just swap a base URL.

The Solution

A three-component system:

  1. Translation Proxy — translates Responses API ↔ Chat Completions / Anthropic Messages in real-time
  2. Config Engine — generates Codex config files on-the-fly per provider, with backup/restore
  3. GTK Launcher — manages endpoints, launches Desktop or CLI, handles the full lifecycle
┌─────────────────────────────────────────────────────────────────────┐
│                         Codex Launcher GUI                          │
│              (endpoint management + AI Assist + lifecycle)          │
└──────────┬─────────────────┬──────────────────┬────────────────────┘
           │                 │                  │
    ┌──────▼──────┐  ┌──────▼──────┐  ┌────────▼─────────┐
    │  Codex      │  │  Native     │  │  Translation     │
    │  Default    │  │  OpenAI     │  │  Proxy           │
    │  (remove    │  │  (direct    │  │  (auto-revive)   │
    │  config)    │  │  URL)       │  │                  │
    └──────┬──────┘  └──────┬──────┘  └────────┬─────────┘
           │                │                   │
           ▼                ▼          ┌────────┴────────┐
    ┌──────────────┐ ┌───────────┐    │                 │
    │ Built-in     │ │ config.   │    ▼                 ▼
    │ Codex OAuth  │ │ toml      │ ┌────────────┐ ┌───────────┐ ┌──────────┐
    └──────────────┘ └───────────┘ │ OpenAI     │ │ Anthropic │ │ Command  │
                                   │ Chat Comp. │ │ Messages  │ │ Code     │
                                   └────────────┘ └───────────┘ └──────────┘

Features

Multi-Provider Support

  • Native OpenAI — direct connection, no proxy needed
  • OpenAI-compatible — Z.AI, OpenCode Zen/Go, Crof.ai, NVIDIA NIM, Kilo.ai, OpenRouter, Ollama, Together, Groq, and any provider with a Chat Completions endpoint
  • Anthropic — Claude models via the Messages API
  • Command Code — 20 models (DeepSeek, Claude, GPT, Kimi, GLM, Qwen, etc.) via Command Code's /alpha/generate API with configurable client version
  • Codex Default — built-in Codex OAuth with official models, zero config

Translation Proxy (translate-proxy.py)

  • Full Responses API ↔ Chat Completions / Anthropic Messages / Command Code API bidirectional translation
  • Streaming SSE support with proper event sequencing (response.createdresponse.output_text.deltaresponse.completed)
  • Tool calls — full function calling support including parallel tool calls
  • Reasoning content — forwards reasoning_content fields from providers that support it
  • Browser UA injection — bypasses Cloudflare bot detection for providers like OpenCode
  • Smart URL construction — prevents double-path bugs (/v1/chat/completions/chat/completions)
  • Header forwarding — preserves client identity headers while filtering hop-by-hop headers
  • Connection pooling — persistent HTTPS connections per host, eliminates TLS handshake overhead per request
  • Stream idle timeout — kills stalled upstream connections after 5 minutes of silence
  • Retry-After support — respects upstream Retry-After headers on 429/502/503 responses
  • Response store TTL — evicts stored responses older than 10 minutes, prevents memory leaks
  • Bounded stream buffers — 8MB cap prevents OOM on pathological responses
  • Dual logging — all proxy messages written to both stderr and ~/.cache/codex-proxy/proxy.log
  • Zero dependencies — pure Python stdlib

Command Code Adapter

  • Multi-format tool-call parser — handles all known CC model output formats in a cascading chain:
    • DSML tags (<DSMLinvoke>) — current model format
    • <bash>...</bash> blocks with metadata extraction
    • <explore_agent> blocks converted to real exec_command
    • <tool_call type="bash"> HTML-like blocks
    • XML <function= patterns
    • Raw JSON {"cmd":"..."} embedded in text
    • Fallback regex for unrecognized tool-call signals
  • Three-tier argument parser — handles double-wrapped, escaped, and unicode-escaped arguments
  • Recursive unwrapping — handles double/triple-wrapped cmd values
  • Post-extraction sanitizer — validates every tool call has valid name + args before forwarding to Codex
  • ErrorAnalyzer — learns from 4xx errors, retries with adjusted parameters (max 2 retries)
  • Schema cache with 24h staleness TTL for provider capabilities

GTK Launcher (codex-launcher-gui)

  • Endpoint manager — add, edit, delete, set default providers
  • Provider presets — one-click setup for 15+ providers with pre-filled URLs and model lists
  • Model auto-fetch — pulls available models directly from provider APIs
  • Bulk model import — paste a comma/newline-separated list of model IDs
  • Launch Desktop — starts Codex Desktop with the selected provider and model
  • Launch CLI — opens Codex CLI in a terminal with the selected provider
  • Codex Default — launch with built-in OAuth, no proxy or custom config
  • AI Assist — integrated AI-powered configuration assistance and troubleshooting
  • Usage Dashboard — per-provider tracking with dark theme, KPI strip, model bars, status pills
  • Profile backup/import — export and import endpoint configurations as portable JSON bundles
  • Threaded operations — model refresh runs in background, UI stays responsive
  • Process lifecycle — stall detection, kill/cleanup, config backup/restore around sessions
  • Config normalization — automatically strips stale API path suffixes from URLs
  • Reasoning controls — per-provider reasoning toggle with effort level selection

Process Management

  • Kills stale electron/webview/app-server processes from previous sessions
  • Removes stale PID files and sockets
  • Manages proxy lifecycle (start, health-check, stop)
  • Config backup before launch, automatic restore after exit

Technology Stack

Component Technology Why
Translation Proxy Python 3 stdlib (http.server, urllib, json) Zero dependencies, runs anywhere
GUI Launcher Python 3 + GTK 3.0 (PyGObject/gi) Native Linux desktop integration
Config Engine Python 3 (toml generation, json catalogs) Dynamic, no hardcoded configs
Process Mgmt bash + os.setsid/os.killpg Unix process group lifecycle
Streaming Server-Sent Events (SSE) Required by Codex Responses API
API Translation Responses API ↔ Chat Completions / Anthropic Messages Protocol bridging

Zero pip dependencies. Everything uses Python stdlib + system GTK bindings.


Quick Start

Prerequisites

  • Codex CLI ≥ 2.0 (npm install -g @openai/codex or bundled with Codex Desktop)
  • Codex Desktop installed at /opt/codex-desktop/ (optional, for Desktop mode)
  • Python 3.8+ (stdlib only)
  • python3-gi for GTK (sudo apt install python3-gi)
  • bash, curl, lsof

Install

git clone https://github.rommark.dev/admin/Codex-Launcher---Any-AI-Porovider.git
cd Codex-Launcher---Any-AI-Porovider
./install.sh

Run

Open Codex Launcher from your app grid, or:

codex-launcher-gui

First Launch

  1. Click Manage EndpointsAdd
  2. Select a provider preset (e.g., "OpenCode Zen (OpenAI-compatible)")
  3. Enter your API key
  4. Click Fetch from API to auto-discover models, or add them manually
  5. Click Save
  6. Select the endpoint and model from the dropdowns
  7. Click Launch Desktop or Launch CLI

Development Journey

Phase 1: The Z.AI Proxy — Protocol Reverse Engineering

Problem: Codex CLI v2.0 switched exclusively to the Responses API. Z.AI (and every other provider) uses Chat Completions. They are fundamentally incompatible.

Approach:

  1. Captured Codex's HTTP traffic to understand the exact Responses API request/response shape
  2. Mapped the protocol differences:
    • /v1/responses/chat/completions (endpoint)
    • input array with typed items → messages array with role/content (message format)
    • function_call items → tool_calls array on assistant messages (tool format)
    • SSE response.output_text.delta events → delta.content chunks (streaming)
  3. Built the initial zai-proxy.py — a 200-line HTTP server that translates in both directions
  4. Hit a critical bug: Codex hung in "thinking" state. Discovered that merely emitting response.done is insufficient — the response.completed event must contain the full output item array
  5. Added streaming SSE with proper event sequencing — this was the breakthrough that made it work

Testing: Manual end-to-end testing with curl, Codex CLI --profile zai, and Codex Desktop. Verified streaming, tool calls, and reasoning content.

Phase 2: Multi-Provider Architecture — The Unified Proxy

Problem: Maintaining separate proxies for each provider (Z.AI, Anthropic, OpenRouter, etc.) was unmaintainable.

Approach:

  1. Abstracted the translation into a backend plugin architecture:
    • openai-compat backend: translates Responses → Chat Completions (works for any OpenAI-compatible API)
    • anthropic backend: translates Responses → Anthropic Messages API
  2. Unified config loading: JSON config file, CLI arguments, and environment variables
  3. Shared HTTP server, model serving, and SSE framework
  4. Per-backend: message conversion, tool conversion, response conversion, stream conversion

Key design decisions:

  • Pure Python stdlib — no Flask, no aiohttp, no pip dependencies. The proxy must work on any system with Python 3.
  • http.server.BaseHTTPRequestHandler — simple, synchronous, but sufficient for single-user desktop use
  • Config via JSON file — the launcher writes a proxy config to ~/.cache/codex-proxy/ for each endpoint

Phase 3: The GTK Launcher — Desktop Integration

Problem: Users had to manage config files, start proxies manually, and remember which wrapper script to use.

Approach:

  1. Built a GTK 3.0 GUI with three layers: main window, endpoint manager dialog, edit endpoint dialog
  2. Implemented dynamic config generation: for each launch, generate config.toml with the right provider definition, profile, and model catalog
  3. Model catalog generation with dual field naming (slug + model, supported_reasoning_levels + supportedReasoningEfforts) — required because Codex CLI and Codex Desktop use different field names
  4. Process lifecycle: backup config → cleanup stale processes → start proxy → write config → launch Codex → wait → restore config → stop proxy

Threading challenges:

  • GTK requires all UI updates on the main loop. Used GLib.idle_add() for all cross-thread communication
  • Model refresh was blocking the UI — moved to a background thread with idle_add completion callbacks
  • Proxy startup waits up to 15 seconds with health checks before proceeding

Phase 4: Endpoint Management & Provider Presets

Approach:

  1. JSON-based endpoint storage (~/.codex/endpoints.json)
  2. Provider presets with pre-filled URLs and model lists for common providers
  3. Auto-fetch models from provider /v1/models endpoints
  4. Bulk model import (paste comma/newline-separated model IDs)
  5. Profile backup/import — portable JSON bundles with endpoints + config

URL normalization: Discovered that saved URLs sometimes had /chat/completions appended from manual entry. Added normalize_base_url() that strips trailing API path suffixes to prevent double-path bugs.

Phase 5: Cloudflare Bot Detection — The OpenCode 403 Saga

Problem: OpenCode Zen/Go returned 403 (Cloudflare error 1010) for all requests.

Investigation:

  1. Tested direct curl requests — all returned 403 regardless of auth header type (Bearer, x-api-key, both, none)
  2. Examined Codex logs — found turn.has_metadata_header=true and error code: 1010
  3. Error 1010 is Cloudflare's "Access denied" — bot detection based on User-Agent and browser headers
  4. Python's urllib sends User-Agent: Python-urllib/3.x which triggers the block

Solution:

  1. Added _BROWSER_HEADERS — a set of Chrome-like headers (User-Agent, Sec-Ch-Ua, Sec-Fetch-*)
  2. forwarded_headers() with browser_ua=True — injects browser headers while preserving incoming client headers and filtering hop-by-hop headers
  3. Applied only to openai-compat backend where Cloudflare is common
  4. This resolved the 403 — upstream now returns 401 (auth) instead of 403 (bot block), confirming the headers work

Phase 6: Codex Default Mode — OAuth Without Config

Problem: Users wanted to quickly switch back to built-in Codex OAuth without maintaining a separate endpoint.

Approach:

  1. Added "Codex Default (Desktop)" and "Codex Default (CLI)" buttons
  2. On launch: backup config → delete config.toml entirely → start Codex → restore config after exit
  3. Key insight: writing empty strings (model = "", model_provider = "") causes Codex to error with "Model provider `` not found". The config must not exist at all for Codex to fall back to built-in defaults.

Phase 7: Command Code Multi-Format Parser — The 17-Fix Odyssey

Problem: Command Code provider's tool calls were silently dropped, causing the Codex agent loop to stop after the first response. The CC model returns tool calls as inline text in wildly varying formats that change between sessions and model versions.

Root Cause Analysis:

  1. CC's /alpha/generate API uses a proprietary protocol — not Chat Completions, not Anthropic Messages
  2. Tool calls appear as inline text within text-delta SSE events, not as structured JSON
  3. The model output format is non-deterministic — observed 6+ distinct formats:
    • Raw JSON: {"cmd":"mkdir -p /foo","type":"tool-call"}
    • XML: <function name="exec_command"><parameter name="cmd">...</parameter></function>
    • HTML-like: <tool_call type="bash">\n{"command":"..."}
    • Bash blocks: <bash>\nprefix_rule: ...\n{"command":"..."}</bash>
    • Explore blocks: <explore_agent>...</explore_agent>
    • DSML tags: <DSMLinvoke name="exec"><DSMLparameter name="command">...</parameter></invoke>
  4. Additional complications: double-wrapped arguments, unescaped quotes, unicode escapes, missing fields

The Fix — 17 Incremental Patches: Built a cascading parser chain (DSML → bash → explore → tool_call → XML → raw JSON → fallback regex) that tries each format in order. Each patch addressed a specific format observed in production:

  • FIX 14: Foundation — string-only content, version headers, cache clearing, streaming error handling
  • FIX 58: Core parsing — raw JSON extraction, three-tier argument parser, field extraction, permission normalization
  • FIX 910: Cleanup — removed dead code, added documentation
  • FIX 1111c: Robustness — recursive unwrapping of nested cmd values, post-extraction sanitizer, XML regex fix
  • FIX 12: Self-revive watchdog — proxy auto-restarts on crash instead of dying silently
  • FIX 1317: New format support — fallback extraction, HTML-like blocks, explore blocks, bash blocks, DSML tags

Key Design Decision: Field-level regex extraction instead of JSON parsing. Standard JSON parsers fail on unescaped quotes in shell commands (e.g., echo "hello world" breaks JSON). The regex approach tolerates malformed JSON by extracting individual fields.

Verification: --self-test flag runs 19 automated tests covering all edge cases. Debug logging to ~/.cache/codex-proxy/cc-debug.log captures every parser decision for troubleshooting.


Architecture Deep Dive

Request Flow (OpenAI-compatible provider)

Codex CLI/Desktop
    │
    │ POST /responses (Responses API)
    │ Body: { model: "glm-5.1", input: [...], stream: true }
    ▼
┌─────────────────────────────────┐
│  translate-proxy.py (port 8080) │
│                                 │
│  1. Parse Responses API body    │
│  2. Convert input → messages    │
│  3. Convert tools format        │
│  4. Inject browser UA headers   │
│                                 │
│  POST /v1/chat/completions      │──→  Upstream Provider
│  Body: { model, messages, ... }│      (opencode.ai, z.ai, etc.)
│                                 │
│  5. Receive Chat Comp. response │  ←── SSE stream or JSON
│  6. Convert response format     │
│  7. Emit Responses API SSE      │
│                                 │
└─────────────────────────────────┘
    │
    │ SSE: response.created
    │ SSE: response.output_text.delta  (streamed tokens)
    │ SSE: response.output_text.done
    │ SSE: response.completed
    ▼
Codex CLI/Desktop (receives Responses API events)

Config Lifecycle

┌─────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐    ┌─────────┐
│ Backup   │───→│ Cleanup  │───→│ Generate │───→│ Launch   │───→│ Restore │
│ config   │    │ stale    │    │ config   │    │ Codex    │    │ config  │
│ .toml    │    │ processes│    │ + models │    │ process  │    │ from    │
│          │    │          │    │          │    │          │    │ backup  │
└─────────┘    └──────────┘    └──────────┘    └──────────┘    └─────────┘

Model Catalog Format

The launcher generates model catalog JSON with dual field naming to satisfy both Codex CLI and Codex Desktop:

{
  "models": [{
    "slug": "glm-5.1",          // CLI reads this
    "model": "glm-5.1",         // Desktop reads this
    "supported_reasoning_levels": [...],  // CLI
    "supportedReasoningEfforts": [...],   // Desktop
    ...
  }]
}

Provider Presets

Preset Backend Base URL
OpenAI Native https://api.openai.com/v1
Anthropic Anthropic https://api.anthropic.com/v1
OpenCode Zen OpenAI-compat https://opencode.ai/zen/v1
OpenCode Go OpenAI-compat https://opencode.ai/zen/go/v1
Command Code Command Code https://api.commandcode.ai
Crof.ai OpenAI-compat https://crof.ai/v1
NVIDIA NIM OpenAI-compat https://integrate.api.nvidia.com/v1
Kilo.ai OpenAI-compat https://api.kilo.ai/api/gateway
OpenRouter OpenAI-compat https://openrouter.ai/api/v1
Z.AI OpenAI-compat https://api.z.ai/api/coding/paas/v4
Custom Any User-defined

File Structure

src/
├── translate-proxy.py            # Translation proxy (openai-compat + anthropic + command-code)
├── codex-launcher-gui            # GTK launcher GUI
├── cleanup-codex-stale.sh        # Stale process cleanup
└── codex-launcher.desktop.template  # Desktop entry template

install.sh                        # One-command installer
README.md                         # This file

Installed Locations

/usr/bin/translate-proxy.py               # Proxy (from .deb)
/usr/bin/codex-launcher-gui               # Launcher (from .deb)
/usr/bin/cleanup-codex-stale.sh           # Cleanup (from .deb)
/usr/share/applications/codex-launcher.desktop  # App grid entry
~/.codex/endpoints.json                   # Endpoint storage
~/.codex/config.toml                      # Codex config (auto-generated)
~/.cache/codex-proxy/                     # Proxy configs + model catalogs
~/.cache/codex-proxy/cc-debug.log         # Debug log (per-request)

Troubleshooting

Issue Cause Fix
Window opens then disappears Stale processes from previous session Click Kill && Cleanup
Window never opens Startup freeze Kill && Cleanup, then retry
"Model provider `` not found" Empty strings in config V2 deletes config entirely for Default mode
403 Forbidden from OpenCode Cloudflare bot detection Proxy injects browser UA headers
401 Unauthorized from OpenCode Invalid key or no credits Check API key and billing
Double path in URL Stale /chat/completions in base URL normalize_base_url() strips suffixes
Proxy stops when terminal closes SIGHUP to subprocess Launcher uses os.setsid process groups
Models not showing in picker Wrong model catalog format Must have both slug + model fields
Codex hangs in "thinking" Missing response.completed Proxy emits full SSE event sequence
Stops after first tool call (Crof) previous_response_id not resolved V2.1.2 stores and chains responses for multi-turn
CC agent stops after first response Tool calls not parsed from model text V3.5 multi-format parser handles all CC output formats
CC tool calls have wrong args Double-wrapped arguments V3.5 three-tier parser + recursive unwrapping
Proxy crashes mid-session Unhandled streaming error V3.5 self-revive watchdog auto-restarts
CC 403 upgrade_required Missing version header V3.5 always sends x-command-code-version

Adding a New Provider

  1. Click Manage EndpointsAdd
  2. Choose a preset or set Custom
  3. Set backend type: OpenAI-compatible, Anthropic, or Native
  4. Enter base URL and API key
  5. Click Fetch from API or add models manually
  6. Save and launch

For providers behind Cloudflare, the proxy automatically injects browser headers. For providers with non-standard APIs, add a new backend module to translate-proxy.py following the oa_* / an_* pattern.


Manual Proxy Usage

# Start proxy for any OpenAI-compatible provider
python3 src/translate-proxy.py \
  --backend openai-compat \
  --target-url https://api.your-provider.com/v1 \
  --api-key YOUR_KEY \
  --port 8080

# Or use a JSON config
python3 src/translate-proxy.py --config my-proxy-config.json

# Then run Codex
codex --profile my-profile -c model=my-model

Requirements

  • Python ≥ 3.8
  • python3-gi (sudo apt install python3-gi)
  • Codex CLI ≥ 2.0
  • Codex Desktop (optional, for Desktop mode)
  • bash, curl, lsof

No pip dependencies. Zero. Pure stdlib + system GTK.


License

MIT


Get 10% OFF Z.AI coding plans
Z.AI 10% OFF