- Full reasoning round-trip: capture reasoning_content + tool_calls from
stream, store by tool_call_id, reinsert before next codebuff POST
- Primary path no longer disables thinking (codebuff doesn't forward the flag)
- Fallback retry uses DeepSeek native {thinking:{type:'disabled'}} format
- Replaced broken _fb_retry_no_reasoning + _fb_retry_stripped with
single _fb_retry_thinking_disabled
- New _ds_store_assistant(), _ds_rebuild_tool_history() functions
- oa_stream_to_sse() now captures tool_calls in reasoning_out dict
- Multi-turn Codex CLI sessions with function calls now complete successfully
- Root cause: _handle_codebuff streaming loop collected events but never
wrote them to self.wfile (stream_buffered_events was not called)
- Fix: Replaced manual loop with stream_buffered_events() + on_event callback
- Confirmed working: raw API streaming, non-stream, and stream through proxy
- Updated CHANGELOG.md, README.md, version labels to 3.8.3
- Usage Dashboard: visual cards with success rate bars, token stats, latency
- Per-model breakdown and error tracking per provider
- Proxy records usage-stats.json after every request
- Google OAuth: browse for client_secret.json instead of fixed path
- Auto-copies selected file to ~/.cache/codex-proxy/
- Exponential backoff retry (2s/4s/8s) for rate limits and transient errors
- BGP routes retry before failing over to next route
- Socket SO_REUSEADDR prevents 'Address already in use' crashes
- Connection reset/broken pipe also retried
- BGP route count shown at proxy startup
- New AI BGP pool manager (create/edit/delete pools)
- Each pool has ordered routes from any configured endpoint
- Failover: tries primary, falls back to next route on error
- Pools appear in endpoint dropdown with shuffle icon
- Pool editor with route add/remove/reorder
- Fixed TOML breakage from multi-line paste
- Added OpenAdapter preset with 0G models
- Fix missing _on_reasoning_toggled method (caused Add button crash)
- Redesigned Google OAuth flow with proper dialog:
- Shows clickable auth URL link in dialog
- Auto-opens browser for Google authorization
- Live status updates while waiting for callback
- Success/error shown in dialog (no popup chain)
- Spinner animation during auth wait
- Better setup instructions if client_secret.json missing
- Two presets: API Key and OAuth modes
- OAuth Login button: full Google OAuth2 flow with auto-refresh
- Auto-refreshes expired access tokens using refresh_token
- Gemini OpenAI-compatible endpoint works with existing proxy
- Models: gemini-2.5-flash, gemini-2.5-pro, gemini-2.0-flash, etc.
- All urlopen() calls now have timeout=180 to prevent infinite hangs
- Crof upstream can idle between SSE chunks, causing proxy to block forever
- Tested: kimi-k2.6 + mimo-v2.5-pro both stream completed successfully
- Add Reasoning On/Off toggle and Effort selector in endpoint editor
- Proxy sends enable_thinking=false when reasoning is OFF
- Proxy sends reasoning_effort level when reasoning is ON
- Strip reasoning_content from output, force max_tokens=64000 minimum
- Fixes Crof mimo-v2.5-pro and similar reasoning model token exhaustion
- Strip reasoning_content from proxy output (Codex doesn't use it)
- Force max_tokens=64000 minimum for openai-compat providers
- Prevents models that emit large reasoning from running out of tokens
Instead of just truncating old items, the proxy now auto-compacts
them into a structured summary preserving key context:
- User requests, assistant responses, tool calls made, files touched
- Keeps original query + system messages + last 10 recent items
- 38 items -> 14 items in testing, with summary of dropped turns
- Similar to Claude Code's auto-compact and Codex CLI's /compact
- No extra API calls needed, instant, zero cost