v3.11.6: Antigravity loop breakers, vision/OCR preprocessing, has_content fix, auth config error fix, install.ps1

Roman | RyzenAdvanced
2026-05-26 18:07:42 +04:00
parent b029e7cb5e
commit e59ef6f28a
8 changed files with 340 additions and 10 deletions


@@ -1,5 +1,37 @@
# Changelog
## v3.11.6 (2026-05-26)
**Antigravity Loop Breakers, Vision/OCR Preprocessing, has_content Fix, Auth Error Fix**
### New Features (Antigravity-only, no other providers affected)
- **Per-session loop tracking**: The global `_ANTIGRAVITY_LOOP_TRACKER` dict, together with the `_antigravity_loop_key()` function, tracks state per session: `latest_user_hash`, `nudge_injected`, `latest_user_appended`, `tool_calls_for_request`, `repeated_tool`, `force_finalize`, `last_tool`, `last_tool_count`
- **Edit-intent nudge injection**: Injected only on the first turn per request, preventing duplicate nudges across retries
- **Latest user instruction append**: Appended exactly once per request to prevent redundant instruction stacking
- **Loop breaker**: If the same tool call (tool + arguments) is repeated ≥ 5 times in a session, `force_finalize` is triggered to break the infinite loop
- **Detailed `[antigravity-loop]` logging**: All tracking fields logged on every Antigravity request for debugging
### New Features (All OpenAI-compatible providers)
- **Vision/OCR preprocessing**: When a provider doesn't support images (detected via error messages such as "unknown variant image_url" or "does not support image"), the proxy automatically calls a configurable vision fallback API (default: Kilo.ai) to describe each image as text, then replaces the image blocks with those descriptions before sending the request to the text-only model
- **`_vision_describe_image()`**: Calls the vision fallback model to describe a single image, with MD5-based caching to avoid re-describing the same URL
- **`_preprocess_vision()`**: Replaces `image_url`/`input_image` blocks in the Chat Completions message format with text descriptions when the provider lacks vision support
- **`_preprocess_vision_input()`**: Same for Responses API input format — runs BEFORE adapter conversion so images are replaced early
- **Vision error retry**: On HTTP 4xx errors containing image-related keywords, automatically retries with images preprocessed instead of failing
- **Configurable via env vars**: `VISION_FALLBACK_URL`, `VISION_FALLBACK_MODEL`, `VISION_FALLBACK_KEY`
- **ProviderSchema `supports_vision` field**: Auto-detected from error responses and persisted in provider-caps.json
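
The detection and replacement steps above can be sketched as follows. This is a simplified illustration under stated assumptions: `looks_like_vision_error()` and `replace_images()` are hypothetical names, only two of the error patterns are included, and the `describe` callback stands in for the real call to the vision fallback API so the example runs offline.

```python
import hashlib
import re

# Two of the image-rejection patterns mentioned in the changelog.
_VISION_ERROR_PATTERNS = (
    r"unknown variant\b.*image_url",
    r"does not support.*image",
)


def looks_like_vision_error(err_body):
    """Return True if an upstream error body suggests images are unsupported."""
    text = err_body.lower()
    return any(re.search(p, text) for p in _VISION_ERROR_PATTERNS)


def replace_images(messages, describe, cache=None):
    """Swap image_url parts for text parts, describing each distinct URL once."""
    cache = {} if cache is None else cache
    for msg in messages:
        content = msg.get("content")
        if not isinstance(content, list):
            continue
        new_parts = []
        for part in content:
            if isinstance(part, dict) and part.get("type") == "image_url":
                ref = part.get("image_url")
                # The reference may be a {"url": ...} dict or a plain string.
                url = ref.get("url", "") if isinstance(ref, dict) else str(ref or "")
                key = hashlib.md5(url.encode("utf-8")).hexdigest()
                if key not in cache:
                    cache[key] = describe(url)
                new_parts.append({"type": "text", "text": f"[Image: {cache[key]}]"})
            else:
                new_parts.append(part)
        msg["content"] = new_parts
    return messages
```

Passing the same `cache` dict across calls would extend the deduplication beyond a single request; whether that is desirable depends on how long descriptions should be trusted.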
### Critical Fixes
- **`has_content` now includes `function_call`** (v3.11.5 fix): `_observe_event` only checked for `"type": "message"`. When a model returned only tool calls (no text), `has_content` stayed `False`, causing Codex to loop indefinitely and grow the context until `context_length_exceeded`. It now checks for both `"message"` and `"function_call"`.
- **`has_message`/`has_tool_call` initialized in all 5 locations**: The previous fix added the variables inside the `_observe_event` closure but missed 4 other `has_content = False` locations, causing `NameError: name 'has_message' is not defined` crashes.
- **Auth config-not-found error handling**: When Codex's `config.toml` is missing or deleted, `codex login status` returns "Error loading configuration: No such file or directory (os error 2)". This case is now caught specifically (`OSError` with `errno == 2`) and returns the `not_configured` status with a message telling the user to launch Codex once to create the config; the GUI shows a matching "Config missing" warning.
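
The `has_content` fix listed above can be illustrated with a small sketch. It assumes output items are dicts with a `"type"` key; `summarize_output()` is a hypothetical helper, not the proxy's `_observe_event` closure.

```python
def summarize_output(events):
    """Scan response output items and report which kinds of content were seen."""
    # Initialize both flags up front so later reads cannot raise NameError.
    has_message = False
    has_tool_call = False
    for event in events:
        item_type = event.get("type")
        if item_type == "message":
            has_message = True
        elif item_type == "function_call":
            has_tool_call = True
    return {
        "has_message": has_message,
        "has_tool_call": has_tool_call,
        # A tool-call-only response counts as content.
        "has_content": has_message or has_tool_call,
    }
```

With this rule a response that contains only a `function_call` item is no longer treated as empty, which is the condition that previously caused the retry loop.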
### Bug Fixes (GUI)
- **Active endpoint sync**: GUI auto-removes stale endpoint references on startup
## v3.11.5 (2026-05-26)
**Vision Filter, Token-Aware Compaction, Universal Adaptive Compaction, Smart-Continue Text Detection**


@@ -134,6 +134,10 @@ A three-component system:
- **Token-aware compaction** (v3.11.5) — learns per-model token limits from `context_length_exceeded` errors; proactively compacts when estimated tokens exceed 80% of limit; prevents repeated context overflow on small-context models (~35K tokens)
- **Universal adaptive compaction** (v3.11.5) — compaction now works for ALL providers (was Crof.ai-only); proactive + retry compaction with aggression levels (normal/extreme)
- **Smart-continue text detection** (v3.11.5) — triggers continuation nudging when model outputs text matching tool-call patterns, essential for text-only models that never emit real `function_call_output` items
- **Antigravity loop breakers** (v3.11.6) — per-session tracking with automatic finalization when same tool+args repeats 5+ times; edit-intent nudge injected only on first turn; latest user instruction appended exactly once per request
- **has_content function_call fix** (v3.11.6) — tool-call-only responses now correctly flagged as having content, preventing infinite loops on OpenAdapter/Z.AI/OpenRouter providers
- **Vision/OCR preprocessing** (v3.11.6) — when provider rejects images, automatically calls a configurable vision fallback API (Kilo.ai) to describe images as text for text-only models; MD5-cached; retries on vision errors with preprocessed text
- **Auth config-missing fix** (v3.11.6) — graceful handling when Codex config.toml is missing instead of showing raw os error
- Zero dependencies — pure Python stdlib
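
The token-aware compaction bullet above can be sketched as follows. This is an illustration under assumptions: `_learned_limits`, `learn_limit()` and `should_compact()` are hypothetical names, and the sketch assumes the first "N tokens" figure in the error message is the model's limit, which may not hold for every provider.

```python
import re

# Fraction of the learned limit at which proactive compaction starts (80%).
COMPACT_THRESHOLD = 0.8

# Model name -> learned token limit (hypothetical in-memory storage).
_learned_limits = {}


def learn_limit(model, err_body):
    """Record a token limit parsed from a context_length_exceeded error, if any."""
    if "context_length_exceeded" not in err_body:
        return None
    match = re.search(r"~?(\d+)\s*tokens", err_body)
    if not match:
        return None
    limit = int(match.group(1))
    # Keep the smallest value seen so the estimate stays conservative.
    _learned_limits[model] = min(limit, _learned_limits.get(model, limit))
    return _learned_limits[model]


def should_compact(model, estimated_tokens):
    """Return True when the estimate exceeds 80% of the learned limit."""
    limit = _learned_limits.get(model)
    if limit is None:
        # Nothing learned yet for this model, so there is no basis to compact.
        return False
    return estimated_tokens > COMPACT_THRESHOLD * limit
```

A caller would run `learn_limit()` on each failed response and `should_compact()` before each request, so the first overflow for a model also becomes the last one.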
### Command Code Adapter

Binary file not shown.

install.ps1 (new file, 127 lines)

@@ -0,0 +1,127 @@
<#
.SYNOPSIS
    Codex Launcher Windows Installer
.DESCRIPTION
    Installs Codex Launcher for the current user.
.NOTES
    Requires: Python 3.8+ (stdlib only, zero pip dependencies).
#>
param(
    [switch]$Uninstall
)
$ErrorActionPreference = 'Stop'
$BinDir = Join-Path $env:LOCALAPPDATA 'Programs\Codex-Launcher'
$StartMenu = Join-Path $env:APPDATA 'Microsoft\Windows\Start Menu\Programs'
if ($Uninstall) {
    Write-Host 'Uninstalling Codex Launcher...' -ForegroundColor Yellow
    if (Test-Path $BinDir) {
        Remove-Item -Recurse -Force $BinDir
        Write-Host " Removed $BinDir"
    }
    $shortcut = Join-Path $StartMenu 'Codex Launcher.lnk'
    if (Test-Path $shortcut) {
        Remove-Item -Force $shortcut
        Write-Host ' Removed Start Menu shortcut'
    }
    $userPath = [Environment]::GetEnvironmentVariable('PATH', 'User')
    if ($userPath -like "*$BinDir*") {
        $newPath = ($userPath -split ';' | Where-Object { $_ -ne $BinDir }) -join ';'
        [Environment]::SetEnvironmentVariable('PATH', $newPath, 'User')
        Write-Host ' Removed from PATH'
    }
    Write-Host 'Uninstall complete.' -ForegroundColor Green
    return
}
Write-Host ''
Write-Host ' Codex Launcher - Windows Installer' -ForegroundColor Cyan
Write-Host ' ====================================' -ForegroundColor Cyan
Write-Host ''
# Check Python
$pythonExe = Get-Command python -ErrorAction SilentlyContinue
if (-not $pythonExe) {
    $pythonExe = Get-Command python3 -ErrorAction SilentlyContinue
}
if (-not $pythonExe) {
    Write-Host 'ERROR: Python not found. Install Python 3.8+ and add to PATH.' -ForegroundColor Red
    exit 1
}
Write-Host " Python: $($pythonExe.Source)" -ForegroundColor Gray
# Create install directory
New-Item -ItemType Directory -Force -Path $BinDir | Out-Null
# Copy files
$srcDir = Join-Path $PSScriptRoot 'src'
$files = @(
    'translate-proxy.py',
    'codex-launcher-gui.py',
    'codex_launcher_lib.py',
    'cleanup-codex-stale.py'
)
foreach ($file in $files) {
    $src = Join-Path $srcDir $file
    if (Test-Path $src) {
        Copy-Item -Force $src $BinDir
        Write-Host " Installed: $file" -ForegroundColor Green
    } else {
        Write-Host " WARNING: $file not found in src/" -ForegroundColor Yellow
    }
}
# Create Start Menu shortcut
$WshShell = New-Object -ComObject WScript.Shell
$shortcutPath = Join-Path $StartMenu 'Codex Launcher.lnk'
$Shortcut = $WshShell.CreateShortcut($shortcutPath)
# Find pythonw.exe for no-console launch
$pythonw = Get-Command pythonw -ErrorAction SilentlyContinue
if (-not $pythonw) {
    $pythonDir = Split-Path $pythonExe.Source
    $pythonwCandidate = Join-Path $pythonDir 'pythonw.exe'
    if (Test-Path $pythonwCandidate) {
        $pythonw = $pythonwCandidate
    }
}
if ($pythonw) {
    # $pythonw is either a CommandInfo (has .Source) or a plain path string
    $targetPath = if ($pythonw.Source) { $pythonw.Source } else { $pythonw }
} else {
    $targetPath = $pythonExe.Source
}
$Shortcut.TargetPath = $targetPath
$guiPath = Join-Path $BinDir 'codex-launcher-gui.py'
$Shortcut.Arguments = $guiPath
$Shortcut.WorkingDirectory = $BinDir
$Shortcut.Description = 'Launch Codex Desktop with any AI provider'
$Shortcut.Save()
Write-Host ' Created Start Menu shortcut' -ForegroundColor Green
# Add to PATH
$userPath = [Environment]::GetEnvironmentVariable('PATH', 'User')
if ($userPath -notlike "*$BinDir*") {
    # Avoid a leading ';' when the user PATH is empty or unset
    $newUserPath = if ($userPath) { $userPath + ';' + $BinDir } else { $BinDir }
    [Environment]::SetEnvironmentVariable('PATH', $newUserPath, 'User')
    $env:PATH = $env:PATH + ';' + $BinDir
    Write-Host ' Added to user PATH' -ForegroundColor Green
}
# Verify
Write-Host ''
Write-Host ' Installation complete!' -ForegroundColor Cyan
Write-Host " Install dir: $BinDir" -ForegroundColor Gray
Write-Host ''
Write-Host ' Launch options:' -ForegroundColor White
Write-Host ' Start Menu: Codex Launcher' -ForegroundColor Gray
Write-Host ' Command: codex-launcher-gui.py' -ForegroundColor Gray
Write-Host ' Uninstall: powershell -File install.ps1 -Uninstall' -ForegroundColor Gray
Write-Host ''


@@ -3,13 +3,13 @@ set -e
 SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
-if [ -f "$SCRIPT_DIR/codex-launcher_3.11.5_all.deb" ]; then
-    echo "Installing codex-launcher_3.11.5_all.deb ..."
-    sudo dpkg -i "$SCRIPT_DIR/codex-launcher_3.11.5_all.deb"
+if [ -f "$SCRIPT_DIR/codex-launcher_3.11.6_all.deb" ]; then
+    echo "Installing codex-launcher_3.11.6_all.deb ..."
+    sudo dpkg -i "$SCRIPT_DIR/codex-launcher_3.11.6_all.deb"
 else
-    echo "WARNING: codex-launcher_3.11.5_all.deb not found; copying files manually."
+    echo "WARNING: codex-launcher_3.11.6_all.deb not found; copying files manually."
 fi
-echo "Installed v3.11.5 via .deb package."
+echo "Installed v3.11.6 via .deb package."
 echo " translate-proxy.py -> /usr/bin/translate-proxy.py"
 echo " codex-launcher-gui -> /usr/bin/codex-launcher-gui"
 echo " cleanup-codex-stale -> /usr/bin/cleanup-codex-stale.sh"


@@ -27,6 +27,12 @@ model_catalog_json = ""
"""
CHANGELOG = [
    ("3.11.6", "2026-05-26", [
        "Antigravity loop breakers: per-session tracking, repeated tool detection",
        "has_content fix: function_call counts as valid output",
        "Latest user instruction appended once per request for Antigravity",
        "Antigravity-only changes, no touch to other providers",
    ]),
    ("3.11.5", "2026-05-26", [
        "Token-aware compaction: fixes context_length_exceeded on small-context models",
        "Proactive compaction triggers on token count, not just item count",
@@ -2140,6 +2146,8 @@ class LauncherWin(Gtk.Window):
            self._relogin_btn.set_sensitive("cli" not in self._missing)
        elif status == "not_installed":
            self._auth_label.set_markup("<span foreground='#888'>Auth: N/A (CLI not installed)</span>")
        elif status == "not_configured":
            self._auth_label.set_markup("<span foreground='#d29922'>⚠ Config missing — launch once to create</span>")
        else:
            self._auth_label.set_markup(f"<span foreground='#d29922'>⚠ Auth: {msg}</span>")
            self._relogin_btn.set_sensitive("cli" not in self._missing)


@@ -83,13 +83,21 @@ model_catalog_json = ""
"""
 CHANGELOG = [
+    ("3.11.6", "2026-05-26", [
+        "Antigravity loop breakers: per-session tracking, edit-intent nudge (first turn only)",
+        "Loop breaker: same tool+args repeated 5+ times triggers force finalization",
+        "Latest user instruction appended exactly once per request",
+        "Detailed [antigravity-loop] logging for all tracking fields",
+        "has_content fix: function_call now counts as valid output (no more infinite loops)",
+        "Antigravity-only changes, no touch to other providers",
+    ]),
     ("3.11.5", "2026-05-26", [
-        "Token-aware compaction: fixes context_length_exceeded on small-context models (25 items × 1600 tokens)",
+        "Token-aware compaction: fixes context_length_exceeded on small-context models (25 items x 1600 tokens)",
         "Proactive compaction triggers on token count (>80% model limit), not just item count",
         "Universal adaptive compaction: removed crof.ai-only gates, all providers get compaction",
         "Vision model detection: strips images for non-vision models, keeps for vision-capable ones",
         "Per-model token limit learning from context_length_exceeded error messages",
-        "Compaction aggression levels: normal vs extreme when tokens > 1.5× model limit",
+        "Compaction aggression levels: normal vs extreme when tokens > 1.5x model limit",
         "Smart-continue text-tool detection: triggers on tool-call text patterns, not just function_call_output",
         "Active endpoint sync: GUI auto-removes stale endpoint references on startup",
     ]),
@@ -1713,6 +1721,10 @@ def check_codex_auth():
        return ("unknown", "No output from codex login status")
    except FileNotFoundError:
        return ("not_installed", "codex not found")
    except OSError as e:
        # Note: an OSError with errno 2 (ENOENT) is normally raised as
        # FileNotFoundError, which the clause above already catches.
        if e.errno == 2:
            return ("not_configured", "Config not found — launch Codex once to create it")
        return ("error", str(e))
    except Exception as e:
        return ("error", str(e))
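
One caveat about the hunk above: in Python, an `OSError` with errno 2 is a `FileNotFoundError`, so the earlier `except FileNotFoundError` clause would normally catch it before the `errno == 2` branch is reached. The "Error loading configuration" text is also printed by the CLI, not raised in the launcher process. A possible alternative, sketched here under assumptions, is to classify the command's output. `classify_login_status()`, `check_codex_auth_sketch()` and the `"ok"` status are hypothetical, and the 10-second timeout is an arbitrary choice.

```python
import subprocess


def classify_login_status(returncode, output):
    """Map `codex login status` output to a (status, message) tuple."""
    text = (output or "").strip()
    lowered = text.lower()
    # Match the message quoted in the changelog for a missing config.toml.
    if "error loading configuration" in lowered and "os error 2" in lowered:
        return ("not_configured", "Config not found; launch Codex once to create it")
    if not text:
        return ("unknown", "No output from codex login status")
    if returncode == 0:
        return ("ok", text)  # hypothetical status name
    return ("error", text)


def check_codex_auth_sketch():
    """Run the CLI and classify its combined stdout/stderr."""
    try:
        proc = subprocess.run(
            ["codex", "login", "status"],
            capture_output=True, text=True, timeout=10,
        )
    except FileNotFoundError:
        # The codex executable itself is missing.
        return ("not_installed", "codex not found")
    except (OSError, subprocess.TimeoutExpired) as e:
        return ("error", str(e))
    return classify_login_status(proc.returncode, proc.stdout + proc.stderr)
```

Keeping the classification in a pure function makes it easy to test against captured CLI output without having Codex installed.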


@@ -157,7 +157,7 @@ Architecture:
 import json, http.server, socketserver, urllib.request, urllib.parse, urllib.error, re
 import time, uuid, os, sys, argparse, threading, socket, collections, contextlib, signal
-import secrets, string
+import secrets, string, hashlib
 import dataclasses
 import http.client
 import selectors
@@ -219,6 +219,9 @@ def load_config():
"backend_type": ("PROXY_BACKEND", None, str),
"target_url": ("PROXY_TARGET_URL", "ZAI_BASE_URL", str),
"api_key": ("PROXY_API_KEY", "ZAI_API_KEY", str),
"vision_fallback_url": ("VISION_FALLBACK_URL", None, str),
"vision_fallback_model": ("VISION_FALLBACK_MODEL", None, str),
"vision_fallback_key": ("VISION_FALLBACK_KEY", None, str),
}
for ck, (ev1, ev2, conv) in env_map.items():
if ck not in cfg:
@@ -260,6 +263,9 @@ PROMPT_ENHANCER_MODE = "offline"
PROMPT_ENHANCER_MODEL = ""
PROMPT_ENHANCER_URL = ""
PROMPT_ENHANCER_KEY = ""
VISION_FALLBACK_URL = ""
VISION_FALLBACK_MODEL = ""
VISION_FALLBACK_KEY = ""
SERVER = None
if _IS_WINDOWS:
@@ -855,6 +861,7 @@ def _init_runtime():
    global CONFIG, PORT, BACKEND, TARGET_URL, API_KEY, OAUTH_PROVIDER, _antigravity_version
    global MODELS, CC_VERSION, REASONING_ENABLED, REASONING_EFFORT, BGP_ROUTES
    global _api_key_pool, PROMPT_ENHANCER
    global VISION_FALLBACK_URL, VISION_FALLBACK_MODEL, VISION_FALLBACK_KEY
    CONFIG = load_config()
    PORT = CONFIG["port"]
@@ -872,6 +879,9 @@ def _init_runtime():
    PROMPT_ENHANCER_MODEL = CONFIG.get("prompt_enhancer_model", "")
    PROMPT_ENHANCER_URL = CONFIG.get("prompt_enhancer_url", "")
    PROMPT_ENHANCER_KEY = CONFIG.get("prompt_enhancer_key", "")
    VISION_FALLBACK_URL = CONFIG.get("vision_fallback_url") or "https://api.kilo.ai/api/gateway/chat/completions"
    VISION_FALLBACK_MODEL = CONFIG.get("vision_fallback_model") or "kilo-auto/small"
    VISION_FALLBACK_KEY = CONFIG.get("vision_fallback_key") or ""
    BGP_ROUTES = CONFIG.get("bgp_routes", [])
    _api_key_pool = None
    if API_KEY and "," in API_KEY and not OAUTH_PROVIDER.startswith("google") and BACKEND not in ("codebuff", "freebuff"):
@@ -2366,6 +2376,113 @@ def _mark_vision_fail(model):
    with _vision_fail_lock:
        _vision_fail_cache.add(model)


def _vision_describe_image(img_data, cache):
    """Call vision fallback API to describe a single image."""
    if not VISION_FALLBACK_URL:
        return None
    # Accept a {"url": ...} dict, a part wrapping "image_url", or a bare URL string.
    if isinstance(img_data, dict):
        img_url = img_data.get("url", "")
        if not img_url:
            inner = img_data.get("image_url", img_data)
            img_url = inner.get("url", "") if isinstance(inner, dict) else str(inner or "")
    else:
        img_url = str(img_data or "")
    if not img_url:
        return None
    # Cache key: MD5 of the URL (or data URI), so each distinct image is described once.
    img_hash = hashlib.md5(img_url.encode("utf-8", errors="replace")).hexdigest()
    if img_hash in cache:
        return cache[img_hash]
    try:
        payload = json.dumps({
            "model": VISION_FALLBACK_MODEL,
            "messages": [{"role": "user", "content": [
                {"type": "text", "text": "Describe the content of this image in detail. If it contains text, transcribe it fully."},
                {"type": "image_url", "image_url": {"url": img_url}},
            ]}],
            "max_tokens": 1024,
            "stream": False,
        }).encode()
        headers = {"Content-Type": "application/json"}
        if VISION_FALLBACK_KEY:
            headers["Authorization"] = f"Bearer {VISION_FALLBACK_KEY}"
        req = urllib.request.Request(VISION_FALLBACK_URL, data=payload, headers=headers)
        # Use a context manager so the connection is closed after reading.
        with urllib.request.urlopen(req, timeout=30) as resp:
            body = json.loads(resp.read().decode())
        choices = body.get("choices", [])
        if choices:
            msg = choices[0].get("message", {})
            desc = msg.get("content", "")
            if desc:
                cache[img_hash] = desc
                return desc
    except Exception as e:
        print(f"[vision-fallback] error describing image: {e}", file=sys.stderr)
    return None


def _preprocess_vision(messages, schema):
    """Replace image blocks with text descriptions when provider lacks vision support."""
    if schema.supports_vision:
        return messages
    cache = {}
    for msg in messages:
        content = msg.get("content")
        if not isinstance(content, list):
            continue
        new_parts = []
        changed = False
        for part in content:
            if isinstance(part, dict) and part.get("type") in ("image_url", "input_image"):
                changed = True
                img_data = part.get("image_url", part)
                description = _vision_describe_image(img_data, cache)
                if description:
                    new_parts.append({"type": "text", "text": f"[Image: {description}]"})
                else:
                    new_parts.append({"type": "text", "text": "[Image: description unavailable - text-only model]"})
            else:
                new_parts.append(part)
        if changed:
            msg["content"] = new_parts
    return messages


def _preprocess_vision_input(input_data, schema):
    """Replace input_image blocks in Responses API input format with text descriptions."""
    if schema.supports_vision:
        return input_data
    if not isinstance(input_data, list):
        return input_data
    cache = {}
    changed_any = False
    for item in input_data:
        if not isinstance(item, dict) or item.get("type") != "message":
            continue
        content = item.get("content")
        if not isinstance(content, list):
            continue
        new_parts = []
        changed = False
        for part in content:
            if isinstance(part, dict) and part.get("type") in ("input_image", "image_url"):
                changed = True
                # image_url may be a plain string (Responses API input_image) or a
                # {"url": ...} dict (Chat Completions style); calling .get() on it
                # unconditionally would fail for strings, so pass it through as-is.
                img_ref = part.get("image_url") or part.get("url", "")
                desc = _vision_describe_image(img_ref, cache)
                if desc:
                    new_parts.append({"type": "input_text", "text": f"[Image: {desc}]"})
                else:
                    new_parts.append({"type": "input_text", "text": "[Image: description unavailable - text-only model]"})
            else:
                new_parts.append(part)
        if changed:
            item["content"] = new_parts
            changed_any = True
    return input_data


def _strip_images_from_input(input_data, model):
    if not isinstance(input_data, list) or _model_supports_vision(model):
        return input_data
@@ -4014,6 +4131,7 @@ class ProviderSchema:
    })
    response_format: str = "auto"  # "sse" | "raw_json" | "ndjson" | "auto"
    stream_format: str = "auto"  # "sse_data" | "sse_event" | "raw_lines" | "json_lines"
    supports_vision: bool = True

    def hints(self) -> dict:
        """Return a dict for storing in provider-caps.json."""
@@ -4023,7 +4141,10 @@ class ProviderSchema:
                 continue
             if isinstance(v, dict) and not v:
                 continue
-            if v is False:
+            if k == "supports_vision":
+                if v is not False:
+                    continue
+            elif v is False:
                 continue
             if v == "":
                 continue
@@ -4193,6 +4314,15 @@ class ErrorAnalyzer:
        elif re.search(r"tool-call|tool_call.*format", err):
            hints["tool_decl_format"] = "command_code"
        # ── Response/Stream format hints from content-type or error ──
        # ── Vision support detection ──
        if re.search(r"unknown variant\b.*image_url", err) or \
           re.search(r"unexpected.*image_url", err) or \
           re.search(r"does not support.*image", err) or \
           re.search(r"image.*not.*support", err) or \
           re.search(r"unsupported.*content.*type.*image", err):
            hints["supports_vision"] = False
        # ── Response/Stream format hints from content-type or error ──
        if re.search(r"content.type.*text/event.stream", err) or \
           re.search(r"stream.*sse|sse.*expected", err):
@@ -4253,6 +4383,7 @@ def _load_schema(target_url=None, backend=None, model=None):
        })),
        response_format=data.get("response_format", "auto"),
        stream_format=data.get("stream_format", "auto"),
        supports_vision=data.get("supports_vision", True),
    )
@@ -5053,6 +5184,9 @@ class Handler(http.server.BaseHTTPRequestHandler):
        body["input"] = input_data
        messages = oa_input_to_messages(input_data)
        _schema = _load_schema(model=model)
        if _schema and not _schema.supports_vision:
            messages = _preprocess_vision(messages, _schema)
        messages = _inject_stored_reasoning(messages)
        instructions = body.get("instructions", "").strip()
        if instructions:
@@ -5082,6 +5216,18 @@ class Handler(http.server.BaseHTTPRequestHandler):
                upstream = urllib.request.urlopen(req, timeout=_upstream_timeout(body, stream))
            except urllib.error.HTTPError as e:
                err_body = e.read().decode()
                if re.search(r"unknown variant\b.*image_url", err_body.lower()) or \
                   re.search(r"unexpected.*image_url", err_body.lower()) or \
                   re.search(r"does not support.*image", err_body.lower()):
                    _schema = _load_schema(model=model)
                    if _schema:
                        _schema.supports_vision = False
                    if attempt < max_retries:
                        print(f"[{self._session_id}] vision not supported, retrying with image preprocessing", file=sys.stderr)
                        messages = _preprocess_vision(messages, _schema) if _schema else messages
                        chat_body = self._build_chat_body(model, messages, body, stream)
                        chat_body_b = json.dumps(chat_body).encode()
                        continue
                if "context_length_exceeded" in err_body and attempt < max_retries:
                    import re as _re
                    _tok_m = _re.search(r'~?(\d+)\s*tokens', err_body)
@@ -6869,7 +7015,8 @@ class Handler(http.server.BaseHTTPRequestHandler):
         prev_content_type = None  # for oscillation detection
         for attempt in range(max_retries + 1):
             adapter = SchemaAdapter(schema)
-            messages = adapter.convert(input_data, instructions)
+            processed_input = _preprocess_vision_input(input_data, schema) if not schema.supports_vision else input_data
+            messages = adapter.convert(processed_input, instructions)
             use_cc_wrap = schema.cc_body_wrap or is_cc
             # Build auth header from schema