v3.11.5: token-aware compaction, vision filter, universal adaptive compaction, smart-continue text detection

2026-05-26 16:14:05 +04:00
parent 028185652d
commit b029e7cb5e
9 changed files with 684 additions and 127 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,5 +1,29 @@
 # Changelog

+## v3.11.5 (2026-05-26)
+
+**Vision Filter, Token-Aware Compaction, Universal Adaptive Compaction, Smart-Continue Text Detection**
+
+### Critical Fixes
+
+- **Token-aware compaction for small-context models (FIX)**: `_crof_compact_for_retry()` had an early return at `len(input_data) <= limit` (item count) — if you had 25 items × 1600 tokens = 40K tokens, it skipped compaction entirely because 25 < 30 (the default item limit). Now also checks estimated token count vs learned model max, and compacts when either item count OR token count exceeds limits. Fixes repeated `context_length_exceeded` errors on models like 0G-GLM-5.1 (~35K token context).
+- **Proactive compaction now token-aware**: Previously only triggered when item count > 30. Now also triggers when estimated tokens exceed 80% of the model's learned token limit, even if item count is below the threshold. Prevents the first-request failure pattern on small-context models.
+- **Compaction aggression threshold**: Changed `est > max_tok` to `est >= max_tok * 0.9` to avoid edge case where estimated tokens exactly equal the limit and compaction is skipped.
+- **Removed all `crof.ai` gates from adaptive compaction**: Proactive compaction, `finish_reason=length` retry, `_crof_record`, and compaction logging were gated behind `"crof.ai" in TARGET_URL`. These gates prevented OpenAdapter and other providers from getting proactive/retry compaction, causing repeated `context_length_exceeded` failures. Now applies universally to ALL providers.
+
+### New Features
+
+- **Vision model detection + image stripping**: `_strip_images_from_input()` and `_model_supports_vision()` detect vision capability by model name pattern. Non-vision models (deepseek, glm, mixtral, llama, command, dbrx, qwen, phi-3) have `input_image`/`image_url` parts stripped and replaced with `[User attached image: filename — this model does not support vision]` text notice. Vision models (gpt-4o, gemini, claude, qwen-vl, glm-5v) keep images intact. Applied in 3 paths: main request, context_length_exceeded retry, smart-continue nudge.
+- **Token estimation and per-model limit learning**: `_estimate_tokens()`, `_estimate_input_tokens()`, `_get_model_max_tokens()`, `_set_model_max_tokens()`. Extracts `~N tokens` from `context_length_exceeded` error messages and stores per-model token limits. Used by proactive compaction and retry compaction to adjust `keep` count dynamically.
+- **Compaction aggression levels**: `_crof_compact_for_retry()` accepts `aggression` parameter (0=normal, 1=extreme). Extreme mode kicks in when estimated tokens > 1.5× the learned limit or on 2nd+ retry attempt. Reduces `keep` count to minimum, ensuring the compacted request fits within model limits.
+- **Smart-continue text-tool detection**: Removed hard requirement for `has_function_call_output(input_data)`. Added `_TOOL_CALL_TEXT_PATTERNS` and `_text_looks_like_tool_calls()` to trigger nudging when model outputs text matching tool-call patterns (e.g., `• (exec_command cmd ...)`, `write_to_file`, `exec_command`) even without prior `function_call_output` in context. Essential for models like 0G-GLM-5.1 that never emit real `function_call_output` items.
+- **Parenthesized tool call regex**: `_PAREN_TC_RE` pattern to match `• (name args...)` format from non-vision models that output tool calls as parenthesized text.
+
+### GUI Fixes
+
+- **Active endpoint sync**: Added `set_active_endpoint()` and `validate_active_endpoint()` to Linux GTK GUI. Syncs `.active-endpoint.json` with `config.toml` on every launch; auto-removes stale references to deleted providers. Fixed `"Error loading configuration: No such file or directory (os error 2)"` crash when active endpoint referenced a deleted provider.
+- **Config state**: `~/.codex/.active-endpoint.json` and `config.toml` model catalog path validated and auto-corrected on GUI startup.
+
 ## v3.11.0 (2026-05-26)

 **Cobra PR Merge + Smart Continuation + API Key Hot-Reload**