Get 10% OFF Z.AI coding plans
z.ai/subscribe
---
Codex Launcher — Any AI Provider
Run OpenAI Codex CLI & Desktop with any AI provider.
Google Antigravity • Gemini CLI • OpenCode • Z.AI • Anthropic • Command Code • OpenRouter • Crof.ai • NVIDIA NIM • Kilo.ai • DeepSeek • and more
---
## The Problem
OpenAI's Codex CLI v2.0+ exclusively uses the **Responses API** — a protocol that is incompatible with virtually every other AI provider:
| Provider | API | Works with Codex? |
|----------|-----|:-:|
| OpenAI | Responses API | ✅ |
| Google Antigravity (OAuth) | Code Assist / Gemini Native | ✅ |
| Gemini CLI OAuth | Code Assist | ✅ |
| Z.AI | Chat Completions | ✅ |
| OpenCode | Chat Completions | ✅ |
| Anthropic | Messages API | ✅ |
| Command Code | Custom `/alpha/generate` | ✅ |
| Ollama | Chat Completions | ✅ |
| OpenRouter | Chat Completions | ✅ |
| NVIDIA NIM | Chat Completions | ✅ |
| Crof.ai | Chat Completions | ✅ |
The protocols differ in **endpoint paths**, **message formats**, **tool-call structures**, **streaming events**, and **completion semantics**. You can't just swap a base URL.
## The Solution
A three-component system:
1. **Translation Proxy** — translates Responses API ↔ Chat Completions / Anthropic Messages in real-time
2. **Config Engine** — generates Codex config files on-the-fly per provider, with backup/restore
3. **GTK Launcher** — manages endpoints, launches Desktop or CLI, handles the full lifecycle
```
┌─────────────────────────────────────────────────────────────────────┐
│ Codex Launcher GUI │
│ (endpoint management + AI Assist + lifecycle) │
└──────────┬─────────────────┬──────────────────┬────────────────────┘
│ │ │
┌──────▼──────┐ ┌──────▼──────┐ ┌────────▼─────────┐
│ Codex │ │ Native │ │ Translation │
│ Default │ │ OpenAI │ │ Proxy │
│ (remove │ │ (direct │ │ (auto-revive) │
│ config) │ │ URL) │ │ │
└──────┬──────┘ └──────┬──────┘ └────────┬─────────┘
│ │ │
▼ ▼ ┌────────┴────────┐
┌──────────────┐ ┌───────────┐ │ │
│ Built-in │ │ config. │ ▼ ▼
│ Codex OAuth │ │ toml │ ┌────────────┐ ┌───────────┐ ┌──────────┐
└──────────────┘ └───────────┘ │ OpenAI │ │ Anthropic │ │ Command │
│ Chat Comp. │ │ Messages │ │ Code │
└────────────┘ └───────────┘ └──────────┘
```
---
## Features
### Multi-Provider Support
- **Native OpenAI** — direct connection, no proxy needed
- **OpenAI-compatible** — Z.AI, OpenCode Zen/Go, Crof.ai, NVIDIA NIM, Kilo.ai, OpenRouter, Ollama, Together, Groq, and any provider with a Chat Completions endpoint
- **Anthropic** — Claude models via the Messages API
- **Command Code** — 20 models (DeepSeek, Claude, GPT, Kimi, GLM, Qwen, etc.) via Command Code's `/alpha/generate` API with configurable client version
- **Codex Default** — built-in Codex OAuth with official models, zero config
### Translation Proxy (`translate-proxy.py`)
- Full Responses API ↔ Chat Completions / Anthropic Messages / Command Code API bidirectional translation
- **Streaming SSE** support with proper event sequencing (`response.created` → `response.output_text.delta` → `response.completed`)
- **Tool calls** — full function calling support including parallel tool calls
- **Reasoning content** — forwards `reasoning_content` fields from providers that support it
- **Browser UA injection** — bypasses Cloudflare bot detection for providers like OpenCode
- **Smart URL construction** — prevents double-path bugs (`/v1/chat/completions/chat/completions`)
- **Header forwarding** — preserves client identity headers while filtering hop-by-hop headers
- **Connection pooling** — persistent HTTPS connections per host, eliminates TLS handshake overhead per request
- **Stream idle timeout** — kills stalled upstream connections after 5 minutes of silence
- **Retry-After support** — respects upstream `Retry-After` headers on 429/502/503 responses
- **Response store TTL** — evicts stored responses older than 10 minutes, prevents memory leaks
- **Bounded stream buffers** — 8MB cap prevents OOM on pathological responses
- **Dual logging** — all proxy messages written to both stderr and `~/.cache/codex-proxy/proxy.log`
- Zero dependencies — pure Python stdlib
### Command Code Adapter
- **Multi-format tool-call parser** — handles all known CC model output formats in a cascading chain:
- DSML tags (`<||DSML||invoke>`) — current model format
- `...` blocks with metadata extraction
- `` blocks converted to real `exec_command`
- `` HTML-like blocks
- XML `...`
- HTML-like: `\n{"command":"..."}`
- Bash blocks: `\nprefix_rule: ...\n{"command":"..."}`
- Explore blocks: `...`
- DSML tags: `<||DSML||invoke name="exec"><||DSML||parameter name="command">...`
4. Additional complications: double-wrapped arguments, unescaped quotes, unicode escapes, missing fields
**The Fix — 17 Incremental Patches:**
Built a cascading parser chain (`DSML → bash → explore → tool_call → XML → raw JSON → fallback regex`) that tries each format in order. Each patch addressed a specific format observed in production:
- **FIX 1–4**: Foundation — string-only content, version headers, cache clearing, streaming error handling
- **FIX 5–8**: Core parsing — raw JSON extraction, three-tier argument parser, field extraction, permission normalization
- **FIX 9–10**: Cleanup — removed dead code, added documentation
- **FIX 11–11c**: Robustness — recursive unwrapping of nested cmd values, post-extraction sanitizer, XML regex fix
- **FIX 12**: Self-revive watchdog — proxy auto-restarts on crash instead of dying silently
- **FIX 13–17**: New format support — fallback extraction, HTML-like blocks, explore blocks, bash blocks, DSML tags
**Key Design Decision:** Field-level regex extraction instead of JSON parsing. Standard JSON parsers fail on unescaped quotes in shell commands (e.g., `echo "hello world"` breaks JSON). The regex approach tolerates malformed JSON by extracting individual fields.
**Verification:** `--self-test` flag runs 19 automated tests covering all edge cases. Debug logging to `~/.cache/codex-proxy/cc-debug.log` captures every parser decision for troubleshooting.
---
## Architecture Deep Dive
### Request Flow (OpenAI-compatible provider)
```
Codex CLI/Desktop
│
│ POST /responses (Responses API)
│ Body: { model: "glm-5.1", input: [...], stream: true }
▼
┌─────────────────────────────────┐
│ translate-proxy.py (port 8080) │
│ │
│ 1. Parse Responses API body │
│ 2. Convert input → messages │
│ 3. Convert tools format │
│ 4. Inject browser UA headers │
│ │
│ POST /v1/chat/completions │──→ Upstream Provider
│ Body: { model, messages, ... }│ (opencode.ai, z.ai, etc.)
│ │
│ 5. Receive Chat Comp. response │ ←── SSE stream or JSON
│ 6. Convert response format │
│ 7. Emit Responses API SSE │
│ │
└─────────────────────────────────┘
│
│ SSE: response.created
│ SSE: response.output_text.delta (streamed tokens)
│ SSE: response.output_text.done
│ SSE: response.completed
▼
Codex CLI/Desktop (receives Responses API events)
```
### Config Lifecycle
```
┌─────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌─────────┐
│ Backup │───→│ Cleanup │───→│ Generate │───→│ Launch │───→│ Restore │
│ config │ │ stale │ │ config │ │ Codex │ │ config │
│ .toml │ │ processes│ │ + models │ │ process │ │ from │
│ │ │ │ │ │ │ │ │ backup │
└─────────┘ └──────────┘ └──────────┘ └──────────┘ └─────────┘
```
### Model Catalog Format
The launcher generates model catalog JSON with dual field naming to satisfy both Codex CLI and Codex Desktop:
```json
{
"models": [{
"slug": "glm-5.1", // CLI reads this
"model": "glm-5.1", // Desktop reads this
"supported_reasoning_levels": [...], // CLI
"supportedReasoningEfforts": [...], // Desktop
...
}]
}
```
---
## Provider Presets
| Preset | Backend | Base URL |
|--------|---------|----------|
| OpenAI | Native | `https://api.openai.com/v1` |
| Anthropic | Anthropic | `https://api.anthropic.com/v1` |
| OpenCode Zen | OpenAI-compat | `https://opencode.ai/zen/v1` |
| OpenCode Go | OpenAI-compat | `https://opencode.ai/zen/go/v1` |
| Command Code | Command Code | `https://api.commandcode.ai` |
| Crof.ai | OpenAI-compat | `https://crof.ai/v1` |
| NVIDIA NIM | OpenAI-compat | `https://integrate.api.nvidia.com/v1` |
| Kilo.ai | OpenAI-compat | `https://api.kilo.ai/api/gateway` |
| OpenRouter | OpenAI-compat | `https://openrouter.ai/api/v1` |
| Z.AI | OpenAI-compat | `https://api.z.ai/api/coding/paas/v4` |
| Custom | Any | User-defined |
---
## File Structure
```
src/
├── translate-proxy.py # Translation proxy (openai-compat + anthropic + command-code)
├── codex-launcher-gui # GTK launcher GUI
├── cleanup-codex-stale.sh # Stale process cleanup
└── codex-launcher.desktop.template # Desktop entry template
install.sh # One-command installer
README.md # This file
```
### Installed Locations
```
/usr/bin/translate-proxy.py # Proxy (from .deb)
/usr/bin/codex-launcher-gui # Launcher (from .deb)
/usr/bin/cleanup-codex-stale.sh # Cleanup (from .deb)
/usr/share/applications/codex-launcher.desktop # App grid entry
~/.codex/endpoints.json # Endpoint storage
~/.codex/config.toml # Codex config (auto-generated)
~/.cache/codex-proxy/ # Proxy configs + model catalogs
~/.cache/codex-proxy/cc-debug.log # Debug log (per-request)
```
---
## Troubleshooting
| Issue | Cause | Fix |
|-------|-------|-----|
| Window opens then disappears | Stale processes from previous session | Click **Kill && Cleanup** |
| Window never opens | Startup freeze | Kill && Cleanup, then retry |
| "Model provider `` not found" | Empty strings in config | V2 deletes config entirely for Default mode |
| 403 Forbidden from OpenCode | Cloudflare bot detection | Proxy injects browser UA headers |
| 401 Unauthorized from OpenCode | Invalid key or no credits | Check API key and billing |
| Double path in URL | Stale `/chat/completions` in base URL | `normalize_base_url()` strips suffixes |
| Proxy stops when terminal closes | SIGHUP to subprocess | Launcher uses `os.setsid` process groups |
| Models not showing in picker | Wrong model catalog format | Must have both `slug` + `model` fields |
| Codex hangs in "thinking" | Missing `response.completed` | Proxy emits full SSE event sequence |
| Stops after first tool call (Crof) | `previous_response_id` not resolved | V2.1.2 stores and chains responses for multi-turn |
| CC agent stops after first response | Tool calls not parsed from model text | V3.5 multi-format parser handles all CC output formats |
| CC tool calls have wrong args | Double-wrapped arguments | V3.5 three-tier parser + recursive unwrapping |
| Proxy crashes mid-session | Unhandled streaming error | V3.5 self-revive watchdog auto-restarts |
| CC 403 upgrade_required | Missing version header | V3.5 always sends `x-command-code-version` |
---
## Adding a New Provider
1. Click **Manage Endpoints** → **Add**
2. Choose a preset or set **Custom**
3. Set backend type: `OpenAI-compatible`, `Anthropic`, or `Native`
4. Enter base URL and API key
5. Click **Fetch from API** or add models manually
6. Save and launch
For providers behind Cloudflare, the proxy automatically injects browser headers. For providers with non-standard APIs, add a new backend module to `translate-proxy.py` following the `oa_*` / `an_*` pattern.
---
## Manual Proxy Usage
```bash
# Start proxy for any OpenAI-compatible provider
python3 src/translate-proxy.py \
--backend openai-compat \
--target-url https://api.your-provider.com/v1 \
--api-key YOUR_KEY \
--port 8080
# Or use a JSON config
python3 src/translate-proxy.py --config my-proxy-config.json
# Then run Codex
codex --profile my-profile -c model=my-model
```
---
## Requirements
- Python ≥ 3.8
- python3-gi (`sudo apt install python3-gi`)
- Codex CLI ≥ 2.0
- Codex Desktop (optional, for Desktop mode)
- bash, curl, lsof
**No pip dependencies.** Zero. Pure stdlib + system GTK.
---
## License
MIT
---
Get 10% OFF Z.AI coding plans