admin cb6381afe4 fix: add previous_response_id support for multi-turn tool calls (Crof fix)
Codex Desktop uses previous_response_id to chain conversation turns.
Without storing and resolving these, the proxy sent only the new
function_call_output to upstream providers, missing the original user
message and assistant tool call. This caused Crof.ai (and any provider
using tool calls) to stop after the first response.

- Add in-memory response store (50 entry LRU) keyed by response ID
- resolve_previous_response() reconstructs full input chain on multi-turn
- Fix orphan message output item when response has only tool calls
- Applies to all backends: openai-compat, anthropic, command-code
- v2.1.2
cb6381afe4 · 2026-05-19 20:38:39 +04:00
15 Commits

Z.AI 10% OFF

Get 10% OFF Z.AI coding plans
z.ai/subscribe


Codex Launcher — Any AI Provider

Run OpenAI Codex CLI & Desktop with any AI provider.
OpenCode • Z.AI • Anthropic • Command Code • OpenRouter • Crof.ai • NVIDIA NIM • Kilo.ai • and more

Python 3.8+ GTK 3.0 MIT License Zero pip dependencies


The Problem

OpenAI's Codex CLI v2.0+ exclusively uses the Responses API — a protocol that is incompatible with virtually every other AI provider:

Provider API Works with Codex?
OpenAI Responses API
Z.AI Chat Completions
OpenCode Chat Completions
Anthropic Messages API
Command Code Custom /alpha/generate
Ollama Chat Completions
OpenRouter Chat Completions
NVIDIA NIM Chat Completions
Crof.ai Chat Completions

The protocols differ in endpoint paths, message formats, tool-call structures, streaming events, and completion semantics. You can't just swap a base URL.

The Solution

A three-component system:

  1. Translation Proxy — translates Responses API ↔ Chat Completions / Anthropic Messages in real-time
  2. Config Engine — generates Codex config files on-the-fly per provider, with backup/restore
  3. GTK Launcher — manages endpoints, launches Desktop or CLI, handles the full lifecycle
┌─────────────────────────────────────────────────────────────────────┐
│                         Codex Launcher GUI                          │
│                    (endpoint management + lifecycle)                │
└──────────┬─────────────────┬──────────────────┬────────────────────┘
           │                 │                  │
    ┌──────▼──────┐  ┌──────▼──────┐  ┌────────▼─────────┐
    │  Codex      │  │  Native     │  │  Translation     │
    │  Default    │  │  OpenAI     │  │  Proxy           │
    │  (remove    │  │  (direct    │  │  (port 8080)     │
    │  config)    │  │  URL)       │  │                  │
    └──────┬──────┘  └──────┬──────┘  └────────┬─────────┘
           │                │                   │
           ▼                ▼          ┌────────┴────────┐
    ┌──────────────┐ ┌───────────┐    │                 │
    │ Built-in     │ │ config.   │    ▼                 ▼
    │ Codex OAuth  │ │ toml      │ ┌────────────┐ ┌───────────┐
    └──────────────┘ └───────────┘ │ OpenAI     │ │ Anthropic │
                                   │ Chat Comp. │ │ Messages  │
                                   └────────────┘ └───────────┘

Features

Multi-Provider Support

  • Native OpenAI — direct connection, no proxy needed
  • OpenAI-compatible — Z.AI, OpenCode Zen/Go, Crof.ai, NVIDIA NIM, Kilo.ai, OpenRouter, Ollama, Together, Groq, and any provider with a Chat Completions endpoint
  • Anthropic — Claude models via the Messages API
  • Command Code — 20 models (DeepSeek, Claude, GPT, Kimi, GLM, Qwen, etc.) via Command Code's /alpha/generate API with configurable client version
  • Codex Default — built-in Codex OAuth with official models, zero config

Translation Proxy (translate-proxy.py)

  • Full Responses API ↔ Chat Completions / Anthropic Messages / Command Code API bidirectional translation
  • Streaming SSE support with proper event sequencing (response.createdresponse.output_text.deltaresponse.completed)
  • Tool calls — full function calling support including parallel tool calls
  • Reasoning content — forwards reasoning_content fields from providers that support it
  • Browser UA injection — bypasses Cloudflare bot detection for providers like OpenCode
  • Smart URL construction — prevents double-path bugs (/v1/chat/completions/chat/completions)
  • Header forwarding — preserves client identity headers while filtering hop-by-hop headers
  • Zero dependencies — pure Python stdlib

GTK Launcher (codex-launcher-gui)

  • Endpoint manager — add, edit, delete, set default providers
  • Provider presets — one-click setup for 10+ providers with pre-filled URLs and model lists
  • Model auto-fetch — pulls available models directly from provider APIs
  • Bulk model import — paste a comma/newline-separated list of model IDs
  • Launch Desktop — starts Codex Desktop with the selected provider and model
  • Launch CLI — opens Codex CLI in a terminal with the selected provider
  • Codex Default — launch with built-in OAuth, no proxy or custom config
  • Profile backup/import — export and import endpoint configurations as portable JSON bundles
  • Threaded operations — model refresh runs in background, UI stays responsive
  • Process lifecycle — stall detection, kill/cleanup, config backup/restore around sessions
  • Config normalization — automatically strips stale API path suffixes from URLs

Process Management

  • Kills stale electron/webview/app-server processes from previous sessions
  • Removes stale PID files and sockets
  • Manages proxy lifecycle (start, health-check, stop)
  • Config backup before launch, automatic restore after exit

Technology Stack

Component Technology Why
Translation Proxy Python 3 stdlib (http.server, urllib, json) Zero dependencies, runs anywhere
GUI Launcher Python 3 + GTK 3.0 (PyGObject/gi) Native Linux desktop integration
Config Engine Python 3 (toml generation, json catalogs) Dynamic, no hardcoded configs
Process Mgmt bash + os.setsid/os.killpg Unix process group lifecycle
Streaming Server-Sent Events (SSE) Required by Codex Responses API
API Translation Responses API ↔ Chat Completions / Anthropic Messages Protocol bridging

Zero pip dependencies. Everything uses Python stdlib + system GTK bindings.


Quick Start

Prerequisites

  • Codex CLI ≥ 2.0 (npm install -g @openai/codex or bundled with Codex Desktop)
  • Codex Desktop installed at /opt/codex-desktop/ (optional, for Desktop mode)
  • Python 3.8+ (stdlib only)
  • python3-gi for GTK (sudo apt install python3-gi)
  • bash, curl, lsof

Install

git clone https://github.rommark.dev/admin/Codex-Launcher---Any-AI-Porovider.git
cd Codex-Launcher---Any-AI-Porovider
./install.sh

Run

Open Codex Launcher from your app grid, or:

codex-launcher-gui

First Launch

  1. Click Manage EndpointsAdd
  2. Select a provider preset (e.g., "OpenCode Zen (OpenAI-compatible)")
  3. Enter your API key
  4. Click Fetch from API to auto-discover models, or add them manually
  5. Click Save
  6. Select the endpoint and model from the dropdowns
  7. Click Launch Desktop or Launch CLI

Development Journey

Phase 1: The Z.AI Proxy — Protocol Reverse Engineering

Problem: Codex CLI v2.0 switched exclusively to the Responses API. Z.AI (and every other provider) uses Chat Completions. They are fundamentally incompatible.

Approach:

  1. Captured Codex's HTTP traffic to understand the exact Responses API request/response shape
  2. Mapped the protocol differences:
    • /v1/responses/chat/completions (endpoint)
    • input array with typed items → messages array with role/content (message format)
    • function_call items → tool_calls array on assistant messages (tool format)
    • SSE response.output_text.delta events → delta.content chunks (streaming)
  3. Built the initial zai-proxy.py — a 200-line HTTP server that translates in both directions
  4. Hit a critical bug: Codex hung in "thinking" state. Discovered that merely emitting response.done is insufficient — the response.completed event must contain the full output item array
  5. Added streaming SSE with proper event sequencing — this was the breakthrough that made it work

Testing: Manual end-to-end testing with curl, Codex CLI --profile zai, and Codex Desktop. Verified streaming, tool calls, and reasoning content.

Phase 2: Multi-Provider Architecture — The Unified Proxy

Problem: Maintaining separate proxies for each provider (Z.AI, Anthropic, OpenRouter, etc.) was unmaintainable.

Approach:

  1. Abstracted the translation into a backend plugin architecture:
    • openai-compat backend: translates Responses → Chat Completions (works for any OpenAI-compatible API)
    • anthropic backend: translates Responses → Anthropic Messages API
  2. Unified config loading: JSON config file, CLI arguments, and environment variables
  3. Shared HTTP server, model serving, and SSE framework
  4. Per-backend: message conversion, tool conversion, response conversion, stream conversion

Key design decisions:

  • Pure Python stdlib — no Flask, no aiohttp, no pip dependencies. The proxy must work on any system with Python 3.
  • http.server.BaseHTTPRequestHandler — simple, synchronous, but sufficient for single-user desktop use
  • Config via JSON file — the launcher writes a proxy config to ~/.cache/codex-proxy/ for each endpoint

Phase 3: The GTK Launcher — Desktop Integration

Problem: Users had to manage config files, start proxies manually, and remember which wrapper script to use.

Approach:

  1. Built a GTK 3.0 GUI with three layers: main window, endpoint manager dialog, edit endpoint dialog
  2. Implemented dynamic config generation: for each launch, generate config.toml with the right provider definition, profile, and model catalog
  3. Model catalog generation with dual field naming (slug + model, supported_reasoning_levels + supportedReasoningEfforts) — required because Codex CLI and Codex Desktop use different field names
  4. Process lifecycle: backup config → cleanup stale processes → start proxy → write config → launch Codex → wait → restore config → stop proxy

Threading challenges:

  • GTK requires all UI updates on the main loop. Used GLib.idle_add() for all cross-thread communication
  • Model refresh was blocking the UI — moved to a background thread with idle_add completion callbacks
  • Proxy startup waits up to 15 seconds with health checks before proceeding

Phase 4: Endpoint Management & Provider Presets

Approach:

  1. JSON-based endpoint storage (~/.codex/endpoints.json)
  2. Provider presets with pre-filled URLs and model lists for common providers
  3. Auto-fetch models from provider /v1/models endpoints
  4. Bulk model import (paste comma/newline-separated model IDs)
  5. Profile backup/import — portable JSON bundles with endpoints + config

URL normalization: Discovered that saved URLs sometimes had /chat/completions appended from manual entry. Added normalize_base_url() that strips trailing API path suffixes to prevent double-path bugs.

Phase 5: Cloudflare Bot Detection — The OpenCode 403 Saga

Problem: OpenCode Zen/Go returned 403 (Cloudflare error 1010) for all requests.

Investigation:

  1. Tested direct curl requests — all returned 403 regardless of auth header type (Bearer, x-api-key, both, none)
  2. Examined Codex logs — found turn.has_metadata_header=true and error code: 1010
  3. Error 1010 is Cloudflare's "Access denied" — bot detection based on User-Agent and browser headers
  4. Python's urllib sends User-Agent: Python-urllib/3.x which triggers the block

Solution:

  1. Added _BROWSER_HEADERS — a set of Chrome-like headers (User-Agent, Sec-Ch-Ua, Sec-Fetch-*)
  2. forwarded_headers() with browser_ua=True — injects browser headers while preserving incoming client headers and filtering hop-by-hop headers
  3. Applied only to openai-compat backend where Cloudflare is common
  4. This resolved the 403 — upstream now returns 401 (auth) instead of 403 (bot block), confirming the headers work

Phase 6: Codex Default Mode — OAuth Without Config

Problem: Users wanted to quickly switch back to built-in Codex OAuth without maintaining a separate endpoint.

Approach:

  1. Added "Codex Default (Desktop)" and "Codex Default (CLI)" buttons
  2. On launch: backup config → delete config.toml entirely → start Codex → restore config after exit
  3. Key insight: writing empty strings (model = "", model_provider = "") causes Codex to error with "Model provider `` not found". The config must not exist at all for Codex to fall back to built-in defaults.

Architecture Deep Dive

Request Flow (OpenAI-compatible provider)

Codex CLI/Desktop
    │
    │ POST /responses (Responses API)
    │ Body: { model: "glm-5.1", input: [...], stream: true }
    ▼
┌─────────────────────────────────┐
│  translate-proxy.py (port 8080) │
│                                 │
│  1. Parse Responses API body    │
│  2. Convert input → messages    │
│  3. Convert tools format        │
│  4. Inject browser UA headers   │
│                                 │
│  POST /v1/chat/completions      │──→  Upstream Provider
│  Body: { model, messages, ... }│      (opencode.ai, z.ai, etc.)
│                                 │
│  5. Receive Chat Comp. response │  ←── SSE stream or JSON
│  6. Convert response format     │
│  7. Emit Responses API SSE      │
│                                 │
└─────────────────────────────────┘
    │
    │ SSE: response.created
    │ SSE: response.output_text.delta  (streamed tokens)
    │ SSE: response.output_text.done
    │ SSE: response.completed
    ▼
Codex CLI/Desktop (receives Responses API events)

Config Lifecycle

┌─────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐    ┌─────────┐
│ Backup   │───→│ Cleanup  │───→│ Generate │───→│ Launch   │───→│ Restore │
│ config   │    │ stale    │    │ config   │    │ Codex    │    │ config  │
│ .toml    │    │ processes│    │ + models │    │ process  │    │ from    │
│          │    │          │    │          │    │          │    │ backup  │
└─────────┘    └──────────┘    └──────────┘    └──────────┘    └─────────┘

Model Catalog Format

The launcher generates model catalog JSON with dual field naming to satisfy both Codex CLI and Codex Desktop:

{
  "models": [{
    "slug": "glm-5.1",          // CLI reads this
    "model": "glm-5.1",         // Desktop reads this
    "supported_reasoning_levels": [...],  // CLI
    "supportedReasoningEfforts": [...],   // Desktop
    ...
  }]
}

Provider Presets

Preset Backend Base URL
OpenAI Native https://api.openai.com/v1
Anthropic Anthropic https://api.anthropic.com/v1
OpenCode Zen OpenAI-compat https://opencode.ai/zen/v1
OpenCode Go OpenAI-compat https://opencode.ai/zen/go/v1
Command Code Command Code https://api.commandcode.ai
Crof.ai OpenAI-compat https://crof.ai/v1
NVIDIA NIM OpenAI-compat https://integrate.api.nvidia.com/v1
Kilo.ai OpenAI-compat https://api.kilo.ai/api/gateway
OpenRouter OpenAI-compat https://openrouter.ai/api/v1
Z.AI OpenAI-compat https://api.z.ai/api/coding/paas/v4
Custom Any User-defined

File Structure

src/
├── translate-proxy.py            # Translation proxy (openai-compat + anthropic + command-code)
├── codex-launcher-gui            # GTK launcher GUI
├── cleanup-codex-stale.sh        # Stale process cleanup
└── codex-launcher.desktop.template  # Desktop entry template

install.sh                        # One-command installer
README.md                         # This file

Installed Locations

~/.local/bin/translate-proxy.py       # Proxy
~/.local/bin/codex-launcher-gui       # Launcher
~/.local/bin/cleanup-codex-stale.sh   # Cleanup
~/.local/share/applications/codex-launcher.desktop  # App grid entry
~/.codex/endpoints.json               # Endpoint storage
~/.codex/config.toml                  # Codex config (auto-generated)
~/.cache/codex-proxy/                 # Proxy configs + model catalogs

Troubleshooting

Issue Cause Fix
Window opens then disappears Stale processes from previous session Click Kill && Cleanup
Window never opens Startup freeze Kill && Cleanup, then retry
"Model provider `` not found" Empty strings in config V2 deletes config entirely for Default mode
403 Forbidden from OpenCode Cloudflare bot detection Proxy injects browser UA headers
401 Unauthorized from OpenCode Invalid key or no credits Check API key and billing
Double path in URL Stale /chat/completions in base URL normalize_base_url() strips suffixes
Proxy stops when terminal closes SIGHUP to subprocess Launcher uses os.setsid process groups
Models not showing in picker Wrong model catalog format Must have both slug + model fields
Codex hangs in "thinking" Missing response.completed Proxy emits full SSE event sequence
Stops after first tool call (Crof) previous_response_id not resolved V2.1.2 stores and chains responses for multi-turn

Adding a New Provider

  1. Click Manage EndpointsAdd
  2. Choose a preset or set Custom
  3. Set backend type: OpenAI-compatible, Anthropic, or Native
  4. Enter base URL and API key
  5. Click Fetch from API or add models manually
  6. Save and launch

For providers behind Cloudflare, the proxy automatically injects browser headers. For providers with non-standard APIs, add a new backend module to translate-proxy.py following the oa_* / an_* pattern.


Manual Proxy Usage

# Start proxy for any OpenAI-compatible provider
python3 src/translate-proxy.py \
  --backend openai-compat \
  --target-url https://api.your-provider.com/v1 \
  --api-key YOUR_KEY \
  --port 8080

# Or use a JSON config
python3 src/translate-proxy.py --config my-proxy-config.json

# Then run Codex
codex --profile my-profile -c model=my-model

Requirements

  • Python ≥ 3.8
  • python3-gi (sudo apt install python3-gi)
  • Codex CLI ≥ 2.0
  • Codex Desktop (optional, for Desktop mode)
  • bash, curl, lsof

No pip dependencies. Zero. Pure stdlib + system GTK.


License

MIT


Get 10% OFF Z.AI coding plans
Z.AI 10% OFF

Description
Launch Codex CLI and Desktop with any AI provider
Readme 3.4 MiB
2026-05-22 18:36:32 +00:00
Languages
Python 99.3%
Shell 0.7%