Roman 27b22f4fd8 v2.3.0: adaptive Crof self-healing system

- Per-model success/failure tracking with dynamic item limits
- Proactive compaction when above learned limit
- Auto-retry on finish_reason=length with aggressive re-compaction
- Tested: kimi-k2.6 (27 items) and mimo-v2.5-pro both completed
- Includes all previous fixes: _ts crash, connection reset, upstream timeout, orphaned function_call_output

2026-05-20 14:32:36 +04:00


Changelog

v2.3.0 (2026-05-20)

Adaptive Crof self-healing system
- Tracks per-model success/failure history with item counts
- Dynamically learns max item limit per model (starts at 30, adjusts down on failures)
- Proactively compacts input when above learned limit before sending to upstream
- Auto-retry on finish_reason=length with aggressive re-compaction and resend
- Prevents "stream disconnected" and "incomplete" errors on long conversations
- All tracking logged to stderr: [crof-adaptive] model=X items=N OK/FAIL -> limit=N
Fixed NameError: _ts crash in debug logging
Fixed ConnectionResetError crash on client disconnect during streaming
Added 180s upstream timeout to prevent hanging connections
Compaction now preserves function_call/function_call_output pairs (no orphaned tool outputs)
Fixed reasoning control: reasoning_effort=none now always sends both enable_thinking=false and reasoning_effort=none
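
The per-model limit learning in this release can be sketched roughly as follows. This is a minimal illustration, not the proxy's actual code: the class name `AdaptiveLimits`, the floor `MIN_LIMIT`, and the rule "on failure, drop the limit to just below the failing item count" are assumptions. Only the starting limit of 30 and the stderr log format come from the notes above.

```python
import sys
from dataclasses import dataclass, field

DEFAULT_LIMIT = 30  # starting limit, per the changelog
MIN_LIMIT = 5       # hypothetical floor, not from the changelog


@dataclass
class AdaptiveLimits:
    """Tracks a learned max item count per model (illustrative only)."""
    limits: dict = field(default_factory=dict)

    def limit_for(self, model: str) -> int:
        # Unknown models start at the default limit.
        return self.limits.get(model, DEFAULT_LIMIT)

    def needs_compaction(self, model: str, items: int) -> bool:
        # Proactive compaction applies when the request exceeds the learned limit.
        return items > self.limit_for(model)

    def record(self, model: str, items: int, ok: bool) -> int:
        limit = self.limit_for(model)
        if not ok:
            # Assumed rule: lower the limit to just below the failing size.
            limit = max(MIN_LIMIT, min(limit, items - 1))
        self.limits[model] = limit
        status = "OK" if ok else "FAIL"
        print(f"[crof-adaptive] model={model} items={items} {status} -> limit={limit}",
              file=sys.stderr)
        return limit
```

With this rule, a failure at 28 items lowers a limit of 30 to 27, so a later request with 28 items would be compacted before it is sent.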

v2.2.1 (2026-05-20)

Fixed compaction orphaning function_call_output items — root cause of Crof incomplete responses
- Compaction cut between function_call and its function_call_output, creating dangling tool results
- Crof model received orphaned tool messages with empty tool_call_id, causing confusion and token exhaustion
- Compaction now expands tail boundary to include matching function_call/function_call_output pairs
Fixed reasoning control: reasoning_effort=none now always sends both enable_thinking=false AND reasoning_effort=none
- Crof API testing confirmed reasoning_effort=none is what actually suppresses reasoning, not enable_thinking=false
Added upstream debug logging to ~/.cache/codex-proxy/crof-upstream.jsonl
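
One way to picture the tail-boundary expansion is the sketch below. It assumes input items are dicts with a `type` field and, for tool traffic, a `call_id` field, as in Responses API input items. The function name `expand_tail_start` is hypothetical, and the proxy's real implementation may differ.

```python
def expand_tail_start(items: list[dict], start: int) -> int:
    """Move `start` left until every function_call_output in items[start:]
    has its matching function_call inside the tail as well."""
    start = max(0, min(start, len(items)))
    while start > 0:
        tail = items[start:]
        # call_ids of the function_call items currently kept in the tail
        calls = {i.get("call_id") for i in tail if i.get("type") == "function_call"}
        orphaned = any(
            i.get("type") == "function_call_output" and i.get("call_id") not in calls
            for i in tail
        )
        if not orphaned:
            break
        start -= 1  # widen the tail by one item and check again
    return start
```

If an output has no matching call anywhere in the conversation, this sketch widens the tail all the way to index 0. A real implementation would probably cap the expansion or drop the unmatched output instead.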

v2.2.0 (2026-05-20)

Added per-provider Reasoning controls in endpoint editor
- Reasoning On/Off toggle — disable reasoning for models that exhaust output tokens (e.g., Crof mimo-v2.5-pro)
- Reasoning Effort selector: None, Minimal, Low, Medium, High, Max
- When reasoning is OFF: sends enable_thinking=false + reasoning_effort=none to upstream API
- When reasoning is ON: sends user-selected effort level (default: Medium)
- Settings stored per-endpoint, passed through proxy config to upstream requests
Strip reasoning_content from proxy output — Codex doesn't use it, avoids token waste
Force a minimum max_tokens of 64000 for openai-compat providers, leaving room for both reasoning and content
Inspired by unsloth's reasoning control patterns for Qwen/GPT-OSS models
Styled reasoning switch: green = ON, orange = OFF, gentle rounded pill shape
Added error handling to endpoint manager Add/Edit/Manage dialogs (prevents silent failures)
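
The mapping from the endpoint's reasoning settings to upstream parameters could look something like the following sketch. The function name and the lowercase spelling of the effort values other than `none` are assumptions. The behavior for OFF and for `none` follows the notes for v2.2.0 and v2.2.1.

```python
EFFORT_LEVELS = ("none", "minimal", "low", "medium", "high", "max")


def reasoning_params(enabled: bool, effort: str = "medium") -> dict:
    """Build the reasoning-related fields for an upstream request (illustrative)."""
    if effort not in EFFORT_LEVELS:
        raise ValueError(f"unknown reasoning effort: {effort!r}")
    if not enabled or effort == "none":
        # Reasoning off: send both parameters, as described for v2.2.1.
        return {"enable_thinking": False, "reasoning_effort": "none"}
    # Reasoning on: pass through the user-selected effort level.
    return {"reasoning_effort": effort}
```

The returned dict would be merged into the request body before it is forwarded upstream.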

v2.1.3 (2026-05-19)

Fixed Crof mimo-v2.5-pro stopping mid-response (finish_reason=length)
- Root cause: model emits 600+ reasoning_content SSE chunks that exhaust max_tokens before any actual content is generated
- Strip reasoning_content from proxy output — Codex doesn't use reasoning, avoids wasting output tokens on invisible text
- Force max_tokens minimum of 64000 for openai-compat providers — gives models room for both reasoning and content
- Works for all openai-compat providers (Crof, Z.AI, DeepSeek, OpenRouter, etc.)
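
The two measures above can be illustrated with a short sketch. It assumes Chat Completions style streaming chunks, where each choice carries a `delta` dict. The helper names are hypothetical and this is not the proxy's actual code.

```python
MIN_MAX_TOKENS = 64000  # minimum named in the changelog


def enforce_min_max_tokens(payload: dict) -> dict:
    """Return a copy of the request payload with max_tokens raised to the minimum."""
    out = dict(payload)
    current = out.get("max_tokens") or 0
    out["max_tokens"] = max(current, MIN_MAX_TOKENS)
    return out


def strip_reasoning(chunk: dict) -> dict:
    """Return a copy of a stream chunk with reasoning_content removed from deltas."""
    out = dict(chunk)
    choices = []
    for choice in chunk.get("choices", []):
        choice = dict(choice)
        if isinstance(choice.get("delta"), dict):
            choice["delta"] = {
                k: v for k, v in choice["delta"].items() if k != "reasoning_content"
            }
        choices.append(choice)
    out["choices"] = choices
    return out
```

Both helpers return copies rather than mutating their input, which keeps the original upstream data available for debug logging.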

v2.1.2 (2026-05-19)

Fixed Crof.ai and other providers stopping after the first tool call (root cause: tool IDs of None)
Codex sends function_call items with id=None; the proxy now matches tool results to calls by call_id, with a positional fallback
Fixed orphan message output item when response is only tool calls (no text content)
Auto-trims long conversations (>30 items) to prevent context overflow on providers like Crof
- Keeps system/developer messages, original user query, and most recent 10 items
- Auto-compacts old items into a summary instead of just dropping them
- Summary includes: user requests, assistant responses, tool calls made, files touched
- Preserves enough context for the model to continue long tasks intelligently
Truncates large tool outputs (>8000 chars) to prevent model output token exhaustion
- Crof's models return incomplete responses when tool results contain too much text (e.g., full HTML pages)
- Truncated outputs include [truncated N chars] suffix so the model knows data was cut
Added request/response logging to ~/.cache/codex-proxy/requests.log for debugging
Proxy stderr no longer discarded by launcher (visible in terminal for debugging)
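
The tool-output truncation can be sketched as below. The 8000-character threshold and the `[truncated N chars]` suffix come from the notes above. The function name, and the reading of N as the number of characters dropped, are assumptions.

```python
MAX_TOOL_OUTPUT_CHARS = 8000  # threshold named in the changelog


def truncate_tool_output(text: str, limit: int = MAX_TOOL_OUTPUT_CHARS) -> str:
    """Cut overly long tool output and append a marker saying how much was removed."""
    if len(text) <= limit:
        return text
    dropped = len(text) - limit  # assumed meaning of N in the suffix
    return f"{text[:limit]} [truncated {dropped} chars]"
```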

v2.1.1 (2026-05-19)

Added Command Code backend to translation proxy (proprietary /alpha/generate API)
Added Command Code provider preset with 20 models (DeepSeek, Claude, GPT, Kimi, GLM, Qwen, etc.)
Added cc_version field in endpoint editor for Command Code version (default: 0.26.8)
Proxy sends x-command-code-version header to CC API (fixes 403 "upgrade_required")
CC message conversion: system role → user, string content → array, tools stripped, real UUID for threadId
Fixed proxy: map developer role to system for Chat Completions providers (DeepSeek, Qwen, etc.)
Fixed proxy: map developer role to user for Anthropic providers
Forward instructions field from Responses API as system message/param

v2.1.0 (2026-05-19)

Added Codex auth status detection (reads codex login status)
Auth status bar shows logged-in provider or warning if auth missing/expired
"Re-login" button opens codex login in a terminal for re-authentication
Auto re-checks auth 30s after re-login flow starts
Pre-launch auth check warns before launching Codex Default mode if auth is invalid
Auth status checked asynchronously at startup (non-blocking)

v2.0.1 (2026-05-19)

Added Codex CLI/Desktop installation verifier to main page
Green check (✔) when detected, yellow cross (✘) when missing
"Install" button next to missing tools opens install guide dialog
Desktop/CLI launch buttons disabled with tooltip when tool is missing
Dependency status logged on startup
Buttons respect missing-state after busy/unbusy cycles

v2.0.0 (2026-05-19)

Initial release: multi-provider Codex Launcher
Translation proxy: Responses API to Chat Completions + Anthropic Messages
GTK endpoint manager with 10+ provider presets
Codex Default mode (built-in OAuth, zero config)
Browser UA injection for Cloudflare-protected providers (OpenCode)
Streaming SSE, tool calls, reasoning content support
Profile backup/import, model auto-fetch, bulk import
Refresh Models in background thread
URL normalization to prevent double-path bugs
Config backup/restore around sessions
.deb installer package
