Go to file

admin cb6381afe4 fix: add previous_response_id support for multi-turn tool calls (Crof fix)

Codex Desktop uses previous_response_id to chain conversation turns.
Without storing and resolving these, the proxy sent only the new
function_call_output to upstream providers, missing the original user
message and assistant tool call. This caused Crof.ai (and any provider
using tool calls) to stop after the first response.

- Add in-memory response store (50 entry LRU) keyed by response ID
- resolve_previous_response() reconstructs full input chain on multi-turn
- Fix orphan message output item when response has only tool calls
- Applies to all backends: openai-compat, anthropic, command-code
- v2.1.2

cb6381afe4 · 2026-05-19 20:38:39 +04:00

15 Commits

src

fix: add previous_response_id support for multi-turn tool calls (Crof fix)

2026-05-19 20:38:39 +04:00

.gitignore

Initial commit: Codex Launcher — Any AI Provider

2026-05-19 14:57:31 +04:00

CHANGELOG.md

fix: add previous_response_id support for multi-turn tool calls (Crof fix)

2026-05-19 20:38:39 +04:00

codex-launcher_2.1.2_all.deb

fix: add previous_response_id support for multi-turn tool calls (Crof fix)

2026-05-19 20:38:39 +04:00

install.sh

Initial commit: Codex Launcher — Any AI Provider

2026-05-19 14:57:31 +04:00

README.md

fix: add previous_response_id support for multi-turn tool calls (Crof fix)

2026-05-19 20:38:39 +04:00

README.md

Get 10% OFF Z.AI coding plans
z.ai/subscribe

Codex Launcher — Any AI Provider

Run OpenAI Codex CLI & Desktop with any AI provider.
OpenCode • Z.AI • Anthropic • Command Code • OpenRouter • Crof.ai • NVIDIA NIM • Kilo.ai • and more

The Problem

OpenAI's Codex CLI v2.0+ exclusively uses the Responses API — a protocol that is incompatible with virtually every other AI provider:

Provider	API	Works with Codex?
OpenAI	Responses API	✅
Z.AI	Chat Completions	❌
OpenCode	Chat Completions	❌
Anthropic	Messages API	❌
Command Code	Custom `/alpha/generate`	❌
Ollama	Chat Completions	❌
OpenRouter	Chat Completions	❌
NVIDIA NIM	Chat Completions	❌
Crof.ai	Chat Completions	❌

The protocols differ in endpoint paths, message formats, tool-call structures, streaming events, and completion semantics. You can't just swap a base URL.

The Solution

A three-component system:

Translation Proxy — translates Responses API ↔ Chat Completions / Anthropic Messages in real-time
Config Engine — generates Codex config files on-the-fly per provider, with backup/restore
GTK Launcher — manages endpoints, launches Desktop or CLI, handles the full lifecycle

┌─────────────────────────────────────────────────────────────────────┐
│                         Codex Launcher GUI                          │
│                    (endpoint management + lifecycle)                │
└──────────┬─────────────────┬──────────────────┬────────────────────┘
           │                 │                  │
    ┌──────▼──────┐  ┌──────▼──────┐  ┌────────▼─────────┐
    │  Codex      │  │  Native     │  │  Translation     │
    │  Default    │  │  OpenAI     │  │  Proxy           │
    │  (remove    │  │  (direct    │  │  (port 8080)     │
    │  config)    │  │  URL)       │  │                  │
    └──────┬──────┘  └──────┬──────┘  └────────┬─────────┘
           │                │                   │
           ▼                ▼          ┌────────┴────────┐
    ┌──────────────┐ ┌───────────┐    │                 │
    │ Built-in     │ │ config.   │    ▼                 ▼
    │ Codex OAuth  │ │ toml      │ ┌────────────┐ ┌───────────┐
    └──────────────┘ └───────────┘ │ OpenAI     │ │ Anthropic │
                                   │ Chat Comp. │ │ Messages  │
                                   └────────────┘ └───────────┘

Features

Multi-Provider Support

Native OpenAI — direct connection, no proxy needed
OpenAI-compatible — Z.AI, OpenCode Zen/Go, Crof.ai, NVIDIA NIM, Kilo.ai, OpenRouter, Ollama, Together, Groq, and any provider with a Chat Completions endpoint
Anthropic — Claude models via the Messages API
Command Code — 20 models (DeepSeek, Claude, GPT, Kimi, GLM, Qwen, etc.) via Command Code's /alpha/generate API with configurable client version
Codex Default — built-in Codex OAuth with official models, zero config

Translation Proxy (`translate-proxy.py`)

Full Responses API ↔ Chat Completions / Anthropic Messages / Command Code API bidirectional translation
Streaming SSE support with proper event sequencing (response.created → response.output_text.delta → response.completed)
Tool calls — full function calling support including parallel tool calls
Reasoning content — forwards reasoning_content fields from providers that support it
Browser UA injection — bypasses Cloudflare bot detection for providers like OpenCode
Smart URL construction — prevents double-path bugs (/v1/chat/completions/chat/completions)
Header forwarding — preserves client identity headers while filtering hop-by-hop headers
Zero dependencies — pure Python stdlib

GTK Launcher (`codex-launcher-gui`)

Endpoint manager — add, edit, delete, set default providers
Provider presets — one-click setup for 10+ providers with pre-filled URLs and model lists
Model auto-fetch — pulls available models directly from provider APIs
Bulk model import — paste a comma/newline-separated list of model IDs
Launch Desktop — starts Codex Desktop with the selected provider and model
Launch CLI — opens Codex CLI in a terminal with the selected provider
Codex Default — launch with built-in OAuth, no proxy or custom config
Profile backup/import — export and import endpoint configurations as portable JSON bundles
Threaded operations — model refresh runs in background, UI stays responsive
Process lifecycle — stall detection, kill/cleanup, config backup/restore around sessions
Config normalization — automatically strips stale API path suffixes from URLs

Process Management

Kills stale electron/webview/app-server processes from previous sessions
Removes stale PID files and sockets
Manages proxy lifecycle (start, health-check, stop)
Config backup before launch, automatic restore after exit

Technology Stack

Component	Technology	Why
Translation Proxy	Python 3 stdlib (`http.server`, `urllib`, `json`)	Zero dependencies, runs anywhere
GUI Launcher	Python 3 + GTK 3.0 (`PyGObject/gi`)	Native Linux desktop integration
Config Engine	Python 3 (`toml` generation, `json` catalogs)	Dynamic, no hardcoded configs
Process Mgmt	bash + `os.setsid`/`os.killpg`	Unix process group lifecycle
Streaming	Server-Sent Events (SSE)	Required by Codex Responses API
API Translation	Responses API ↔ Chat Completions / Anthropic Messages	Protocol bridging

Zero pip dependencies. Everything uses Python stdlib + system GTK bindings.

Quick Start

Prerequisites

Codex CLI ≥ 2.0 (npm install -g @openai/codex or bundled with Codex Desktop)
Codex Desktop installed at /opt/codex-desktop/ (optional, for Desktop mode)
Python 3.8+ (stdlib only)
python3-gi for GTK (sudo apt install python3-gi)
bash, curl, lsof

Install

git clone https://github.rommark.dev/admin/Codex-Launcher---Any-AI-Porovider.git
cd Codex-Launcher---Any-AI-Porovider
./install.sh

Run

Open Codex Launcher from your app grid, or:

codex-launcher-gui

First Launch

Click Manage Endpoints → Add
Select a provider preset (e.g., "OpenCode Zen (OpenAI-compatible)")
Enter your API key
Click Fetch from API to auto-discover models, or add them manually
Click Save
Select the endpoint and model from the dropdowns
Click Launch Desktop or Launch CLI

Development Journey

Phase 1: The Z.AI Proxy — Protocol Reverse Engineering

Problem: Codex CLI v2.0 switched exclusively to the Responses API. Z.AI (and every other provider) uses Chat Completions. They are fundamentally incompatible.

Approach:

Captured Codex's HTTP traffic to understand the exact Responses API request/response shape
Mapped the protocol differences:
- /v1/responses → /chat/completions (endpoint)
- input array with typed items → messages array with role/content (message format)
- function_call items → tool_calls array on assistant messages (tool format)
- SSE response.output_text.delta events → delta.content chunks (streaming)
Built the initial zai-proxy.py — a 200-line HTTP server that translates in both directions
Hit a critical bug: Codex hung in "thinking" state. Discovered that merely emitting response.done is insufficient — the response.completed event must contain the full output item array
Added streaming SSE with proper event sequencing — this was the breakthrough that made it work

Testing: Manual end-to-end testing with curl, Codex CLI --profile zai, and Codex Desktop. Verified streaming, tool calls, and reasoning content.

Phase 2: Multi-Provider Architecture — The Unified Proxy

Problem: Maintaining separate proxies for each provider (Z.AI, Anthropic, OpenRouter, etc.) was unmaintainable.

Approach:

Abstracted the translation into a backend plugin architecture:
- openai-compat backend: translates Responses → Chat Completions (works for any OpenAI-compatible API)
- anthropic backend: translates Responses → Anthropic Messages API
Unified config loading: JSON config file, CLI arguments, and environment variables
Shared HTTP server, model serving, and SSE framework
Per-backend: message conversion, tool conversion, response conversion, stream conversion

Key design decisions:

Pure Python stdlib — no Flask, no aiohttp, no pip dependencies. The proxy must work on any system with Python 3.
http.server.BaseHTTPRequestHandler — simple, synchronous, but sufficient for single-user desktop use
Config via JSON file — the launcher writes a proxy config to ~/.cache/codex-proxy/ for each endpoint

Phase 3: The GTK Launcher — Desktop Integration

Problem: Users had to manage config files, start proxies manually, and remember which wrapper script to use.

Approach:

Built a GTK 3.0 GUI with three layers: main window, endpoint manager dialog, edit endpoint dialog
Implemented dynamic config generation: for each launch, generate config.toml with the right provider definition, profile, and model catalog
Model catalog generation with dual field naming (slug + model, supported_reasoning_levels + supportedReasoningEfforts) — required because Codex CLI and Codex Desktop use different field names
Process lifecycle: backup config → cleanup stale processes → start proxy → write config → launch Codex → wait → restore config → stop proxy

Threading challenges:

GTK requires all UI updates on the main loop. Used GLib.idle_add() for all cross-thread communication
Model refresh was blocking the UI — moved to a background thread with idle_add completion callbacks
Proxy startup waits up to 15 seconds with health checks before proceeding

Phase 4: Endpoint Management & Provider Presets

Approach:

JSON-based endpoint storage (~/.codex/endpoints.json)
Provider presets with pre-filled URLs and model lists for common providers
Auto-fetch models from provider /v1/models endpoints
Bulk model import (paste comma/newline-separated model IDs)
Profile backup/import — portable JSON bundles with endpoints + config

URL normalization: Discovered that saved URLs sometimes had /chat/completions appended from manual entry. Added normalize_base_url() that strips trailing API path suffixes to prevent double-path bugs.

Phase 5: Cloudflare Bot Detection — The OpenCode 403 Saga

Problem: OpenCode Zen/Go returned 403 (Cloudflare error 1010) for all requests.

Investigation:

Tested direct curl requests — all returned 403 regardless of auth header type (Bearer, x-api-key, both, none)
Examined Codex logs — found turn.has_metadata_header=true and error code: 1010
Error 1010 is Cloudflare's "Access denied" — bot detection based on User-Agent and browser headers
Python's urllib sends User-Agent: Python-urllib/3.x which triggers the block

Solution:

Added _BROWSER_HEADERS — a set of Chrome-like headers (User-Agent, Sec-Ch-Ua, Sec-Fetch-*)
forwarded_headers() with browser_ua=True — injects browser headers while preserving incoming client headers and filtering hop-by-hop headers
Applied only to openai-compat backend where Cloudflare is common
This resolved the 403 — upstream now returns 401 (auth) instead of 403 (bot block), confirming the headers work

Phase 6: Codex Default Mode — OAuth Without Config

Problem: Users wanted to quickly switch back to built-in Codex OAuth without maintaining a separate endpoint.

Approach:

Added "Codex Default (Desktop)" and "Codex Default (CLI)" buttons
On launch: backup config → delete config.toml entirely → start Codex → restore config after exit
Key insight: writing empty strings (model = "", model_provider = "") causes Codex to error with "Model provider `` not found". The config must not exist at all for Codex to fall back to built-in defaults.

Architecture Deep Dive

Request Flow (OpenAI-compatible provider)

Codex CLI/Desktop
    │
    │ POST /responses (Responses API)
    │ Body: { model: "glm-5.1", input: [...], stream: true }
    ▼
┌─────────────────────────────────┐
│  translate-proxy.py (port 8080) │
│                                 │
│  1. Parse Responses API body    │
│  2. Convert input → messages    │
│  3. Convert tools format        │
│  4. Inject browser UA headers   │
│                                 │
│  POST /v1/chat/completions      │──→  Upstream Provider
│  Body: { model, messages, ... }│      (opencode.ai, z.ai, etc.)
│                                 │
│  5. Receive Chat Comp. response │  ←── SSE stream or JSON
│  6. Convert response format     │
│  7. Emit Responses API SSE      │
│                                 │
└─────────────────────────────────┘
    │
    │ SSE: response.created
    │ SSE: response.output_text.delta  (streamed tokens)
    │ SSE: response.output_text.done
    │ SSE: response.completed
    ▼
Codex CLI/Desktop (receives Responses API events)

Config Lifecycle

┌─────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐    ┌─────────┐
│ Backup   │───→│ Cleanup  │───→│ Generate │───→│ Launch   │───→│ Restore │
│ config   │    │ stale    │    │ config   │    │ Codex    │    │ config  │
│ .toml    │    │ processes│    │ + models │    │ process  │    │ from    │
│          │    │          │    │          │    │          │    │ backup  │
└─────────┘    └──────────┘    └──────────┘    └──────────┘    └─────────┘

Model Catalog Format

The launcher generates model catalog JSON with dual field naming to satisfy both Codex CLI and Codex Desktop:

{
  "models": [{
    "slug": "glm-5.1",          // CLI reads this
    "model": "glm-5.1",         // Desktop reads this
    "supported_reasoning_levels": [...],  // CLI
    "supportedReasoningEfforts": [...],   // Desktop
    ...
  }]
}

Provider Presets

Preset	Backend	Base URL
OpenAI	Native	`https://api.openai.com/v1`
Anthropic	Anthropic	`https://api.anthropic.com/v1`
OpenCode Zen	OpenAI-compat	`https://opencode.ai/zen/v1`
OpenCode Go	OpenAI-compat	`https://opencode.ai/zen/go/v1`
Command Code	Command Code	`https://api.commandcode.ai`
Crof.ai	OpenAI-compat	`https://crof.ai/v1`
NVIDIA NIM	OpenAI-compat	`https://integrate.api.nvidia.com/v1`
Kilo.ai	OpenAI-compat	`https://api.kilo.ai/api/gateway`
OpenRouter	OpenAI-compat	`https://openrouter.ai/api/v1`
Z.AI	OpenAI-compat	`https://api.z.ai/api/coding/paas/v4`
Custom	Any	User-defined

File Structure

src/
├── translate-proxy.py            # Translation proxy (openai-compat + anthropic + command-code)
├── codex-launcher-gui            # GTK launcher GUI
├── cleanup-codex-stale.sh        # Stale process cleanup
└── codex-launcher.desktop.template  # Desktop entry template

install.sh                        # One-command installer
README.md                         # This file

Installed Locations

~/.local/bin/translate-proxy.py       # Proxy
~/.local/bin/codex-launcher-gui       # Launcher
~/.local/bin/cleanup-codex-stale.sh   # Cleanup
~/.local/share/applications/codex-launcher.desktop  # App grid entry
~/.codex/endpoints.json               # Endpoint storage
~/.codex/config.toml                  # Codex config (auto-generated)
~/.cache/codex-proxy/                 # Proxy configs + model catalogs

Troubleshooting

Issue	Cause	Fix
Window opens then disappears	Stale processes from previous session	Click Kill && Cleanup
Window never opens	Startup freeze	Kill && Cleanup, then retry
"Model provider `` not found"	Empty strings in config	V2 deletes config entirely for Default mode
403 Forbidden from OpenCode	Cloudflare bot detection	Proxy injects browser UA headers
401 Unauthorized from OpenCode	Invalid key or no credits	Check API key and billing
Double path in URL	Stale `/chat/completions` in base URL	`normalize_base_url()` strips suffixes
Proxy stops when terminal closes	SIGHUP to subprocess	Launcher uses `os.setsid` process groups
Models not showing in picker	Wrong model catalog format	Must have both `slug` + `model` fields
Codex hangs in "thinking"	Missing `response.completed`	Proxy emits full SSE event sequence
Stops after first tool call (Crof)	`previous_response_id` not resolved	V2.1.2 stores and chains responses for multi-turn

Adding a New Provider

Click Manage Endpoints → Add
Choose a preset or set Custom
Set backend type: OpenAI-compatible, Anthropic, or Native
Enter base URL and API key
Click Fetch from API or add models manually
Save and launch

For providers behind Cloudflare, the proxy automatically injects browser headers. For providers with non-standard APIs, add a new backend module to translate-proxy.py following the oa_* / an_* pattern.

Manual Proxy Usage

# Start proxy for any OpenAI-compatible provider
python3 src/translate-proxy.py \
  --backend openai-compat \
  --target-url https://api.your-provider.com/v1 \
  --api-key YOUR_KEY \
  --port 8080

# Or use a JSON config
python3 src/translate-proxy.py --config my-proxy-config.json

# Then run Codex
codex --profile my-profile -c model=my-model

Requirements

Python ≥ 3.8
python3-gi (sudo apt install python3-gi)
Codex CLI ≥ 2.0
Codex Desktop (optional, for Desktop mode)
bash, curl, lsof

No pip dependencies. Zero. Pure stdlib + system GTK.

License

MIT

Get 10% OFF Z.AI coding plans

Releases 4

v3.8.0 — AI Monitoring Latest

2026-05-22 18:36:32 +00:00

Languages

Python 99.3%

Shell 0.7%

README.md

Codex Launcher — Any AI Provider

The Problem

The Solution

Features

Multi-Provider Support

Translation Proxy (translate-proxy.py)

GTK Launcher (codex-launcher-gui)

Process Management

Technology Stack

Quick Start

Prerequisites

Install

Run

First Launch

Development Journey

Phase 1: The Z.AI Proxy — Protocol Reverse Engineering

Phase 2: Multi-Provider Architecture — The Unified Proxy

Phase 3: The GTK Launcher — Desktop Integration

Phase 4: Endpoint Management & Provider Presets

Phase 5: Cloudflare Bot Detection — The OpenCode 403 Saga

Phase 6: Codex Default Mode — OAuth Without Config

Architecture Deep Dive

Request Flow (OpenAI-compatible provider)

Config Lifecycle

Model Catalog Format

Provider Presets

File Structure

Installed Locations

Troubleshooting

Adding a New Provider

Manual Proxy Usage

Requirements

License

Translation Proxy (`translate-proxy.py`)

GTK Launcher (`codex-launcher-gui`)