feat: auto-compaction for long conversations (like Claude Code/Codex /compact)
Instead of just truncating old items, the proxy now auto-compacts them into a structured summary preserving key context: - User requests, assistant responses, tool calls made, files touched - Keeps original query + system messages + last 10 recent items - 38 items -> 14 items in testing, with summary of dropped turns - Similar to Claude Code's auto-compact and Codex CLI's /compact - No extra API calls needed, instant, zero cost
This commit is contained in:
@@ -6,9 +6,10 @@
|
||||
- Codex sends `function_call` items with `id=None` — proxy now matches tool results to calls by call_id + positional fallback
|
||||
- Fixed orphan message output item when response is only tool calls (no text content)
|
||||
- **Auto-trims long conversations (>30 items)** to prevent context overflow on providers like Crof
|
||||
- Keeps system/developer messages, original user query, and most recent items
|
||||
- Drops oldest tool call/outputs from the middle when conversation grows too long
|
||||
- Prevents `status=incomplete` errors on providers with smaller context windows
|
||||
- Keeps system/developer messages, original user query, and most recent 10 items
|
||||
- **Auto-compacts old items into a summary** instead of just dropping them
|
||||
- Summary includes: user requests, assistant responses, tool calls made, files touched
|
||||
- Preserves enough context for the model to continue long tasks intelligently
|
||||
- **Truncates large tool outputs (>8000 chars)** to prevent model output token exhaustion
|
||||
- Crof's models return `incomplete` when tool results contain too much text (e.g., full HTML pages)
|
||||
- Truncated outputs include `[truncated N chars]` suffix so the model knows data was cut
|
||||
|
||||
Reference in New Issue
Block a user