v3.8.4: Fix codebuff DeepSeek V4 tool-call reasoning_content round-trip

- Full reasoning round-trip: capture reasoning_content + tool_calls from stream, store by tool_call_id, reinsert before next codebuff POST
- Primary path no longer disables thinking (codebuff doesn't forward the flag)
- Fallback retry uses DeepSeek native {thinking:{type:'disabled'}} format
- Replaced broken _fb_retry_no_reasoning + _fb_retry_stripped with single _fb_retry_thinking_disabled
- New _ds_store_assistant(), _ds_rebuild_tool_history() functions
- oa_stream_to_sse() now captures tool_calls in reasoning_out dict
- Multi-turn Codex CLI sessions with function calls now complete successfully
2026-05-24 21:48:00 +04:00
parent 2d4c1a9c2d
commit 8fd6f280f2
5 changed files with 10679 additions and 0 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,5 +1,36 @@
 # Changelog

+## v3.8.4 (2026-05-24)
+
+**Critical Fix — Codebuff DeepSeek V4 Tool-Call Sessions Now Work**
+
+### Root Cause
+Codebuff proxies requests to DeepSeek V4, which defaults to **thinking mode enabled**. When DeepSeek returns `reasoning_content` in a streaming response that includes tool calls, subsequent requests must include that same `reasoning_content` in the assistant message history. Otherwise DeepSeek's API rejects the request with HTTP 400: `"The reasoning_content in the thinking mode must be passed back to the API."`
+
+The previous approach tried to **disable thinking** (`enable_thinking: false`, `reasoning_effort: "none"`), but Codebuff doesn't reliably forward those fields to DeepSeek. The retry system then tried stripping assistant messages from the history, which guarantees failure because DeepSeek needs the full context.
+
+### Fix — Full Reasoning Round-Trip System
+1. **Capture**: After each codebuff streaming response completes, extract `reasoning_content` + `tool_calls` from the stream deltas
+2. **Store**: Index by `tool_call_id` in `_deepseek_reasoning_store` (thread-safe dict with TTL)
+3. **Rebuild**: Before every codebuff POST, `_ds_rebuild_tool_history()` re-inserts stored assistant messages (with `reasoning_content`) before their matching `tool` messages
+4. **Fallback retry**: If the reasoning error still occurs, the proxy retries with DeepSeek's native `{"thinking": {"type": "disabled"}}` format
+5. **Primary path no longer disables thinking** — lets Codebuff/DeepSeek use default thinking mode with proper round-trip
+
+### Changes
+- **translate-proxy.py**: New `_ds_store_assistant()`, `_ds_rebuild_tool_history()` functions; `_deepseek_reasoning_store` / `_deepseek_reasoning_lock` globals
+- **translate-proxy.py**: `oa_stream_to_sse()` now captures tool_calls in `_reasoning_out` dict alongside reasoning text
+- **translate-proxy.py**: `_handle_codebuff()` stores assistant messages after stream completes; calls `_ds_rebuild_tool_history()` before POST
+- **translate-proxy.py**: Replaced broken `_fb_retry_no_reasoning()` + `_fb_retry_stripped()` with single `_fb_retry_thinking_disabled()` using native DeepSeek format
+- **translate-proxy.py**: Removed `enable_thinking`/`reasoning_effort` from primary codebuff chat_body
+- **codex-launcher-gui**: Version bumped to 3.8.4
+
+### Confirmed Working
+- Codebuff first request: 200 OK (always worked)
+- Codebuff second request after tool call: **now 200 OK** (was 400 reasoning_content error)
+- Multi-turn Codex CLI sessions with function calls complete successfully
+
+---
+
 ## v3.8.3 (2026-05-24)

 **Critical Fix — Codebuff Streaming Now Works End-to-End**