Files
Codex-Launcher---Any-AI-Por…/README.md
admin cb6381afe4 fix: add previous_response_id support for multi-turn tool calls (Crof fix)
Codex Desktop uses previous_response_id to chain conversation turns.
Without storing and resolving these, the proxy sent only the new
function_call_output to upstream providers, missing the original user
message and assistant tool call. This caused Crof.ai (and any provider
using tool calls) to stop after the first response.

- Add in-memory response store (50 entry LRU) keyed by response ID
- resolve_previous_response() reconstructs full input chain on multi-turn
- Fix orphan message output item when response has only tool calls
- Applies to all backends: openai-compat, anthropic, command-code
- v2.1.2
2026-05-19 20:38:39 +04:00

453 lines
21 KiB
Markdown

<p align="center">
<a href="https://z.ai/subscribe?ic=ROK78RJKNW">
<img src="https://img.shields.io/badge/Z.AI-10%25_OFF_Coding_Plans-6366f1?style=for-the-badge&logo=data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHdpZHRoPSIyNCIgaGVpZ2h0PSIyNCIgdmlld0JveD0iMCAwIDI0IDI0IiBmaWxsPSJub25lIiBzdHJva2U9IndoaXRlIiBzdHJva2Utd2lkdGg9IjIiPgogIDxwYXRoIGQ9Ik0xMiAyTDIgN2wxMCA1IDEwLTV6Ii8+CiAgPHBhdGggZD0iTTIgMTdsMTAgNSAxMC01Ii8+CiAgPHBhdGggZD0iTTIgMTJsMTAgNSAxMC01Ii8+Cjwvc3ZnPg==&labelColor=4f46e5" alt="Z.AI 10% OFF" />
</a>
</p>
<p align="center">
<strong>Get 10% OFF Z.AI coding plans</strong><br/>
<a href="https://z.ai/subscribe?ic=ROK78RJKNW">z.ai/subscribe</a>
</p>
---
<h1 align="center">Codex Launcher — Any AI Provider</h1>
<p align="center">
<strong>Run OpenAI Codex CLI &amp; Desktop with <em>any</em> AI provider.</strong><br/>
OpenCode &bull; Z.AI &bull; Anthropic &bull; Command Code &bull; OpenRouter &bull; Crof.ai &bull; NVIDIA NIM &bull; Kilo.ai &bull; and more
</p>
<p align="center">
<img src="https://img.shields.io/badge/Python-3.8+-blue?logo=python&logoColor=white" alt="Python 3.8+" />
<img src="https://img.shields.io/badge/GTK-3.0-green?logo=gtk&logoColor=white" alt="GTK 3.0" />
<img src="https://img.shields.io/badge/License-MIT-yellow" alt="MIT License" />
<img src="https://img.shields.io/badge/Zero_Pip_Dependencies-✓-brightgreen" alt="Zero pip dependencies" />
</p>
<p align="center">
<img src="https://img.shields.io/badge/Responses_API-✓-success" />
<img src="https://img.shields.io/badge/Chat_Completions-✓-success" />
<img src="https://img.shields.io/badge/Anthropic_Messages-✓-success" />
<img src="https://img.shields.io/badge/Command_Code-✓-success" />
<img src="https://img.shields.io/badge/Streaming_SSE-✓-success" />
<img src="https://img.shields.io/badge/Tool_Calls-✓-success" />
</p>
---
## The Problem
OpenAI's Codex CLI v2.0+ exclusively uses the **Responses API** — a protocol that is incompatible with virtually every other AI provider:
| Provider | API | Works with Codex? |
|----------|-----|:-:|
| OpenAI | Responses API | ✅ |
| Z.AI | Chat Completions | ❌ |
| OpenCode | Chat Completions | ❌ |
| Anthropic | Messages API | ❌ |
| Command Code | Custom `/alpha/generate` | ❌ |
| Ollama | Chat Completions | ❌ |
| OpenRouter | Chat Completions | ❌ |
| NVIDIA NIM | Chat Completions | ❌ |
| Crof.ai | Chat Completions | ❌ |
The protocols differ in **endpoint paths**, **message formats**, **tool-call structures**, **streaming events**, and **completion semantics**. You can't just swap a base URL.
## The Solution
A three-component system:
1. **Translation Proxy** — translates Responses API ↔ Chat Completions / Anthropic Messages in real-time
2. **Config Engine** — generates Codex config files on-the-fly per provider, with backup/restore
3. **GTK Launcher** — manages endpoints, launches Desktop or CLI, handles the full lifecycle
```
┌─────────────────────────────────────────────────────────────────────┐
│ Codex Launcher GUI │
│ (endpoint management + lifecycle) │
└──────────┬─────────────────┬──────────────────┬────────────────────┘
│ │ │
┌──────▼──────┐ ┌──────▼──────┐ ┌────────▼─────────┐
│ Codex │ │ Native │ │ Translation │
│ Default │ │ OpenAI │ │ Proxy │
│ (remove │ │ (direct │ │ (port 8080) │
│ config) │ │ URL) │ │ │
└──────┬──────┘ └──────┬──────┘ └────────┬─────────┘
│ │ │
▼ ▼ ┌────────┴────────┐
┌──────────────┐ ┌───────────┐ │ │
│ Built-in │ │ config. │ ▼ ▼
│ Codex OAuth │ │ toml │ ┌────────────┐ ┌───────────┐
└──────────────┘ └───────────┘ │ OpenAI │ │ Anthropic │
│ Chat Comp. │ │ Messages │
└────────────┘ └───────────┘
```
---
## Features
### Multi-Provider Support
- **Native OpenAI** — direct connection, no proxy needed
- **OpenAI-compatible** — Z.AI, OpenCode Zen/Go, Crof.ai, NVIDIA NIM, Kilo.ai, OpenRouter, Ollama, Together, Groq, and any provider with a Chat Completions endpoint
- **Anthropic** — Claude models via the Messages API
- **Command Code** — 20 models (DeepSeek, Claude, GPT, Kimi, GLM, Qwen, etc.) via Command Code's `/alpha/generate` API with configurable client version
- **Codex Default** — built-in Codex OAuth with official models, zero config
### Translation Proxy (`translate-proxy.py`)
- Full Responses API ↔ Chat Completions / Anthropic Messages / Command Code API bidirectional translation
- **Streaming SSE** support with proper event sequencing (`response.created``response.output_text.delta``response.completed`)
- **Tool calls** — full function calling support including parallel tool calls
- **Reasoning content** — forwards `reasoning_content` fields from providers that support it
- **Browser UA injection** — bypasses Cloudflare bot detection for providers like OpenCode
- **Smart URL construction** — prevents double-path bugs (`/v1/chat/completions/chat/completions`)
- **Header forwarding** — preserves client identity headers while filtering hop-by-hop headers
- Zero dependencies — pure Python stdlib
### GTK Launcher (`codex-launcher-gui`)
- **Endpoint manager** — add, edit, delete, set default providers
- **Provider presets** — one-click setup for 10+ providers with pre-filled URLs and model lists
- **Model auto-fetch** — pulls available models directly from provider APIs
- **Bulk model import** — paste a comma/newline-separated list of model IDs
- **Launch Desktop** — starts Codex Desktop with the selected provider and model
- **Launch CLI** — opens Codex CLI in a terminal with the selected provider
- **Codex Default** — launch with built-in OAuth, no proxy or custom config
- **Profile backup/import** — export and import endpoint configurations as portable JSON bundles
- **Threaded operations** — model refresh runs in background, UI stays responsive
- **Process lifecycle** — stall detection, kill/cleanup, config backup/restore around sessions
- **Config normalization** — automatically strips stale API path suffixes from URLs
### Process Management
- Kills stale electron/webview/app-server processes from previous sessions
- Removes stale PID files and sockets
- Manages proxy lifecycle (start, health-check, stop)
- Config backup before launch, automatic restore after exit
---
## Technology Stack
| Component | Technology | Why |
|-----------|-----------|-----|
| Translation Proxy | Python 3 stdlib (`http.server`, `urllib`, `json`) | Zero dependencies, runs anywhere |
| GUI Launcher | Python 3 + GTK 3.0 (`PyGObject/gi`) | Native Linux desktop integration |
| Config Engine | Python 3 (`toml` generation, `json` catalogs) | Dynamic, no hardcoded configs |
| Process Mgmt | bash + `os.setsid`/`os.killpg` | Unix process group lifecycle |
| Streaming | Server-Sent Events (SSE) | Required by Codex Responses API |
| API Translation | Responses API ↔ Chat Completions / Anthropic Messages | Protocol bridging |
**Zero pip dependencies.** Everything uses Python stdlib + system GTK bindings.
---
## Quick Start
### Prerequisites
- **Codex CLI** ≥ 2.0 (`npm install -g @openai/codex` or bundled with Codex Desktop)
- **Codex Desktop** installed at `/opt/codex-desktop/` (optional, for Desktop mode)
- **Python 3.8+** (stdlib only)
- **python3-gi** for GTK (`sudo apt install python3-gi`)
- bash, curl, lsof
### Install
```bash
git clone https://github.rommark.dev/admin/Codex-Launcher---Any-AI-Porovider.git
cd Codex-Launcher---Any-AI-Porovider
./install.sh
```
### Run
Open **Codex Launcher** from your app grid, or:
```bash
codex-launcher-gui
```
### First Launch
1. Click **Manage Endpoints****Add**
2. Select a provider preset (e.g., "OpenCode Zen (OpenAI-compatible)")
3. Enter your API key
4. Click **Fetch from API** to auto-discover models, or add them manually
5. Click **Save**
6. Select the endpoint and model from the dropdowns
7. Click **Launch Desktop** or **Launch CLI**
---
## Development Journey
### Phase 1: The Z.AI Proxy — Protocol Reverse Engineering
**Problem:** Codex CLI v2.0 switched exclusively to the Responses API. Z.AI (and every other provider) uses Chat Completions. They are fundamentally incompatible.
**Approach:**
1. Captured Codex's HTTP traffic to understand the exact Responses API request/response shape
2. Mapped the protocol differences:
- `/v1/responses``/chat/completions` (endpoint)
- `input` array with typed items → `messages` array with role/content (message format)
- `function_call` items → `tool_calls` array on assistant messages (tool format)
- SSE `response.output_text.delta` events → `delta.content` chunks (streaming)
3. Built the initial `zai-proxy.py` — a 200-line HTTP server that translates in both directions
4. Hit a critical bug: Codex hung in "thinking" state. Discovered that merely emitting `response.done` is insufficient — the `response.completed` event **must** contain the full output item array
5. Added streaming SSE with proper event sequencing — this was the breakthrough that made it work
**Testing:** Manual end-to-end testing with `curl`, Codex CLI `--profile zai`, and Codex Desktop. Verified streaming, tool calls, and reasoning content.
### Phase 2: Multi-Provider Architecture — The Unified Proxy
**Problem:** Maintaining separate proxies for each provider (Z.AI, Anthropic, OpenRouter, etc.) was unmaintainable.
**Approach:**
1. Abstracted the translation into a backend plugin architecture:
- `openai-compat` backend: translates Responses → Chat Completions (works for any OpenAI-compatible API)
- `anthropic` backend: translates Responses → Anthropic Messages API
2. Unified config loading: JSON config file, CLI arguments, and environment variables
3. Shared HTTP server, model serving, and SSE framework
4. Per-backend: message conversion, tool conversion, response conversion, stream conversion
**Key design decisions:**
- Pure Python stdlib — no Flask, no aiohttp, no pip dependencies. The proxy must work on any system with Python 3.
- `http.server.BaseHTTPRequestHandler` — simple, synchronous, but sufficient for single-user desktop use
- Config via JSON file — the launcher writes a proxy config to `~/.cache/codex-proxy/` for each endpoint
### Phase 3: The GTK Launcher — Desktop Integration
**Problem:** Users had to manage config files, start proxies manually, and remember which wrapper script to use.
**Approach:**
1. Built a GTK 3.0 GUI with three layers: main window, endpoint manager dialog, edit endpoint dialog
2. Implemented dynamic config generation: for each launch, generate `config.toml` with the right provider definition, profile, and model catalog
3. Model catalog generation with dual field naming (`slug` + `model`, `supported_reasoning_levels` + `supportedReasoningEfforts`) — required because Codex CLI and Codex Desktop use different field names
4. Process lifecycle: backup config → cleanup stale processes → start proxy → write config → launch Codex → wait → restore config → stop proxy
**Threading challenges:**
- GTK requires all UI updates on the main loop. Used `GLib.idle_add()` for all cross-thread communication
- Model refresh was blocking the UI — moved to a background thread with idle_add completion callbacks
- Proxy startup waits up to 15 seconds with health checks before proceeding
### Phase 4: Endpoint Management & Provider Presets
**Approach:**
1. JSON-based endpoint storage (`~/.codex/endpoints.json`)
2. Provider presets with pre-filled URLs and model lists for common providers
3. Auto-fetch models from provider `/v1/models` endpoints
4. Bulk model import (paste comma/newline-separated model IDs)
5. Profile backup/import — portable JSON bundles with endpoints + config
**URL normalization:** Discovered that saved URLs sometimes had `/chat/completions` appended from manual entry. Added `normalize_base_url()` that strips trailing API path suffixes to prevent double-path bugs.
### Phase 5: Cloudflare Bot Detection — The OpenCode 403 Saga
**Problem:** OpenCode Zen/Go returned 403 (Cloudflare error 1010) for all requests.
**Investigation:**
1. Tested direct `curl` requests — all returned 403 regardless of auth header type (Bearer, x-api-key, both, none)
2. Examined Codex logs — found `turn.has_metadata_header=true` and `error code: 1010`
3. Error 1010 is Cloudflare's "Access denied" — bot detection based on User-Agent and browser headers
4. Python's `urllib` sends `User-Agent: Python-urllib/3.x` which triggers the block
**Solution:**
1. Added `_BROWSER_HEADERS` — a set of Chrome-like headers (User-Agent, Sec-Ch-Ua, Sec-Fetch-*)
2. `forwarded_headers()` with `browser_ua=True` — injects browser headers while preserving incoming client headers and filtering hop-by-hop headers
3. Applied only to `openai-compat` backend where Cloudflare is common
4. This resolved the 403 — upstream now returns 401 (auth) instead of 403 (bot block), confirming the headers work
### Phase 6: Codex Default Mode — OAuth Without Config
**Problem:** Users wanted to quickly switch back to built-in Codex OAuth without maintaining a separate endpoint.
**Approach:**
1. Added "Codex Default (Desktop)" and "Codex Default (CLI)" buttons
2. On launch: backup config → **delete** `config.toml` entirely → start Codex → restore config after exit
3. Key insight: writing empty strings (`model = ""`, `model_provider = ""`) causes Codex to error with "Model provider `` not found". The config must not exist at all for Codex to fall back to built-in defaults.
---
## Architecture Deep Dive
### Request Flow (OpenAI-compatible provider)
```
Codex CLI/Desktop
│ POST /responses (Responses API)
│ Body: { model: "glm-5.1", input: [...], stream: true }
┌─────────────────────────────────┐
│ translate-proxy.py (port 8080) │
│ │
│ 1. Parse Responses API body │
│ 2. Convert input → messages │
│ 3. Convert tools format │
│ 4. Inject browser UA headers │
│ │
│ POST /v1/chat/completions │──→ Upstream Provider
│ Body: { model, messages, ... }│ (opencode.ai, z.ai, etc.)
│ │
│ 5. Receive Chat Comp. response │ ←── SSE stream or JSON
│ 6. Convert response format │
│ 7. Emit Responses API SSE │
│ │
└─────────────────────────────────┘
│ SSE: response.created
│ SSE: response.output_text.delta (streamed tokens)
│ SSE: response.output_text.done
│ SSE: response.completed
Codex CLI/Desktop (receives Responses API events)
```
### Config Lifecycle
```
┌─────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌─────────┐
│ Backup │───→│ Cleanup │───→│ Generate │───→│ Launch │───→│ Restore │
│ config │ │ stale │ │ config │ │ Codex │ │ config │
│ .toml │ │ processes│ │ + models │ │ process │ │ from │
│ │ │ │ │ │ │ │ │ backup │
└─────────┘ └──────────┘ └──────────┘ └──────────┘ └─────────┘
```
### Model Catalog Format
The launcher generates model catalog JSON with dual field naming to satisfy both Codex CLI and Codex Desktop:
```json
{
"models": [{
"slug": "glm-5.1", // CLI reads this
"model": "glm-5.1", // Desktop reads this
"supported_reasoning_levels": [...], // CLI
"supportedReasoningEfforts": [...], // Desktop
...
}]
}
```
---
## Provider Presets
| Preset | Backend | Base URL |
|--------|---------|----------|
| OpenAI | Native | `https://api.openai.com/v1` |
| Anthropic | Anthropic | `https://api.anthropic.com/v1` |
| OpenCode Zen | OpenAI-compat | `https://opencode.ai/zen/v1` |
| OpenCode Go | OpenAI-compat | `https://opencode.ai/zen/go/v1` |
| Command Code | Command Code | `https://api.commandcode.ai` |
| Crof.ai | OpenAI-compat | `https://crof.ai/v1` |
| NVIDIA NIM | OpenAI-compat | `https://integrate.api.nvidia.com/v1` |
| Kilo.ai | OpenAI-compat | `https://api.kilo.ai/api/gateway` |
| OpenRouter | OpenAI-compat | `https://openrouter.ai/api/v1` |
| Z.AI | OpenAI-compat | `https://api.z.ai/api/coding/paas/v4` |
| Custom | Any | User-defined |
---
## File Structure
```
src/
├── translate-proxy.py # Translation proxy (openai-compat + anthropic + command-code)
├── codex-launcher-gui # GTK launcher GUI
├── cleanup-codex-stale.sh # Stale process cleanup
└── codex-launcher.desktop.template # Desktop entry template
install.sh # One-command installer
README.md # This file
```
### Installed Locations
```
~/.local/bin/translate-proxy.py # Proxy
~/.local/bin/codex-launcher-gui # Launcher
~/.local/bin/cleanup-codex-stale.sh # Cleanup
~/.local/share/applications/codex-launcher.desktop # App grid entry
~/.codex/endpoints.json # Endpoint storage
~/.codex/config.toml # Codex config (auto-generated)
~/.cache/codex-proxy/ # Proxy configs + model catalogs
```
---
## Troubleshooting
| Issue | Cause | Fix |
|-------|-------|-----|
| Window opens then disappears | Stale processes from previous session | Click **Kill && Cleanup** |
| Window never opens | Startup freeze | Kill && Cleanup, then retry |
| "Model provider `` not found" | Empty strings in config | V2 deletes config entirely for Default mode |
| 403 Forbidden from OpenCode | Cloudflare bot detection | Proxy injects browser UA headers |
| 401 Unauthorized from OpenCode | Invalid key or no credits | Check API key and billing |
| Double path in URL | Stale `/chat/completions` in base URL | `normalize_base_url()` strips suffixes |
| Proxy stops when terminal closes | SIGHUP to subprocess | Launcher uses `os.setsid` process groups |
| Models not showing in picker | Wrong model catalog format | Must have both `slug` + `model` fields |
| Codex hangs in "thinking" | Missing `response.completed` | Proxy emits full SSE event sequence |
| Stops after first tool call (Crof) | `previous_response_id` not resolved | V2.1.2 stores and chains responses for multi-turn |
---
## Adding a New Provider
1. Click **Manage Endpoints** → **Add**
2. Choose a preset or set **Custom**
3. Set backend type: `OpenAI-compatible`, `Anthropic`, or `Native`
4. Enter base URL and API key
5. Click **Fetch from API** or add models manually
6. Save and launch
For providers behind Cloudflare, the proxy automatically injects browser headers. For providers with non-standard APIs, add a new backend module to `translate-proxy.py` following the `oa_*` / `an_*` pattern.
---
## Manual Proxy Usage
```bash
# Start proxy for any OpenAI-compatible provider
python3 src/translate-proxy.py \
--backend openai-compat \
--target-url https://api.your-provider.com/v1 \
--api-key YOUR_KEY \
--port 8080
# Or use a JSON config
python3 src/translate-proxy.py --config my-proxy-config.json
# Then run Codex
codex --profile my-profile -c model=my-model
```
---
## Requirements
- Python ≥ 3.8
- python3-gi (`sudo apt install python3-gi`)
- Codex CLI ≥ 2.0
- Codex Desktop (optional, for Desktop mode)
- bash, curl, lsof
**No pip dependencies.** Zero. Pure stdlib + system GTK.
---
## License
MIT
---
<p align="center">
<strong>Get 10% OFF Z.AI coding plans</strong><br/>
<a href="https://z.ai/subscribe?ic=ROK78RJKNW">
<img src="https://img.shields.io/badge/Claim_Your_Discount-10%25_OFF-6366f1?style=for-the-badge&labelColor=4f46e5" alt="Z.AI 10% OFF" />
</a>
</p>