14 Commits

14 changed files with 3492 additions and 5142 deletions

@@ -1,5 +1,79 @@
# Changelog
## v3.10.12 (2026-05-26)
**Sticky Endpoint, Claude Fixes, Guardrail Skip, Anti-Stall**
### New Features
- **Sticky endpoint caching**: remembers which endpoint last succeeded, reuses it on every subsequent request (zero overhead)
- **Sequential fallback**: if sticky endpoint fails (429/502/503), tries next endpoint in order — no parallel probing, no wasted requests
- **Endpoint order**: `cloudcode-pa.googleapis.com` first (matches agy CLI), `daily-cloudcode-pa.googleapis.com` as fallback
- **Anti-stall engine**: kills stale proxy processes and clears `__pycache__` on every new session start
- **Smart error classification**: distinguishes `quota_exhausted` vs `capacity_exhausted` vs `account_banned` vs `validation_required` vs `service_disabled` vs `auth_permanent`
- **Rate limit reset time parsing**: extracts cooldown from error body (`quotaResetDelay`, `Resets in ~1h27m`, etc.) for accurate cooldown
- **Added missing Antigravity headers**: `X-Client-Name`, `X-Client-Version`, `x-goog-api-client`, platform-aware `User-Agent`
- **Session ID**: added `sessionId` to request wrapper for proper session tracking
### Bug Fixes (TRAE Agent)
- **Guardrail skip for simple messages**: when user sends simple messages (e.g. "hi"), skip injecting `_GEMINI_AGENT_GUARDRAIL` — prevents model from aggressively calling tools and looping `ls -la` 50+ times
- **Claude tool preservation**: Claude models through Antigravity now keep ALL tool outputs in normalizer (no summarization/truncation) — prevents context loss that broke Claude sessions
- **Claude compaction guard**: `_adaptive_compact` skipped for Claude models — Claude handles its own context, no forced compaction
- **Claude normalizer guard**: `_antigravity_normalize_context` skipped for Claude models — avoids stripping Claude-specific message structure
- **Claude sanitization guard**: Google content sanitization loop skipped for Claude models — prevents mangling Claude's response format
- **Normalizer model parameter**: `_antigravity_normalize_context` now receives `model` param to distinguish Claude vs Gemini behavior
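
A minimal sketch of the sticky endpoint with sequential fallback described in this release. The `send` callable, the `RETRYABLE` set, and the helper names are assumptions made for illustration; the proxy itself performs the request inline with `urllib`.

```python
import threading

# Endpoint order as listed in the changelog entry above.
ENDPOINTS = [
    "https://cloudcode-pa.googleapis.com",
    "https://daily-cloudcode-pa.googleapis.com",
]
# Status codes the changelog names as triggering fallback.
RETRYABLE = {429, 502, 503}

_preferred = None          # last endpoint that succeeded
_lock = threading.Lock()   # guards _preferred across handler threads


def order_endpoints(endpoints, preferred):
    """Put the cached endpoint first, keep the rest in their original order."""
    if preferred and preferred in endpoints:
        return [preferred] + [e for e in endpoints if e != preferred]
    return list(endpoints)


def call_with_fallback(send, endpoints=ENDPOINTS):
    """send(endpoint) is a hypothetical callable returning an HTTP status."""
    global _preferred
    with _lock:
        preferred = _preferred
    status = None
    for ep in order_endpoints(endpoints, preferred):
        status = send(ep)
        if status == 200:
            with _lock:
                _preferred = ep  # remember the winner for later requests
            return ep, status
        if status not in RETRYABLE:
            break  # non-retryable error: stop instead of trying the next one
    return None, status
```

Once a request succeeds, later calls try that endpoint first, so the healthy path costs a single request.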
## v3.10.11 (2026-05-26)
**Hybrid Endpoint Fallback — Redundant Antigravity Endpoints**
### New Features
- Hybrid endpoint fallback: tries `cloudcode-pa.googleapis.com` then `daily-cloudcode-pa.googleapis.com` on 429
- `daily-cloudcode-pa.googleapis.com` is the same production endpoint agy-core uses (separate rate limit bucket)
- 429 errors now log full response body for debugging
- SERVICE_DISABLED (403) still falls through to next endpoint
- Rate-limit marking only happens after ALL endpoints fail
### Bug Fixes
- Fixed 429 on one endpoint immediately failing — now tries fallback before giving up
- Restored SERVICE_DISABLED fallthrough (was accidentally removed)
## v3.10.10 (2026-05-25)
**Context Normalizer Fix — Compaction Summary Preservation**
### Bug Fixes
- Fixed normalizer stripping ALL context on resumed sessions after compaction
- Normalizer no longer auto-resets when compaction summary is present
- Compaction summaries ("Auto-compacted: N earlier turns") are always preserved
- Deduplicates consecutive identical `<goal_context>` messages (10→1)
- Emergency reset now preserves compaction summaries
- Previous behavior: after compaction reduced 1925→185 items, normalizer saw `n_tool_outputs == 0` and stripped to just `system + latest_user`, losing all context — model responded with "I don't have context"
### hashlib Fix (v3.10.9 hotfix)
- `_antigravity_normalize_context` crashed with `NameError: hashlib` on resumed sessions
- Replaced SHA256 duplicate detection with string comparison
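
The `<goal_context>` deduplication listed above can be sketched with plain string comparison, which is what replaced the SHA256 check. This is an illustration over a list of strings, not the proxy's item format; `dedupe_goal_context` is a hypothetical name.

```python
def dedupe_goal_context(messages):
    """Drop a <goal_context> message when it repeats the one just before it."""
    kept = []
    previous = None
    for text in messages:
        if "<goal_context>" in text and text == previous:
            continue  # consecutive identical copy, skip it
        kept.append(text)
        previous = text
    return kept
```

Ten identical consecutive `<goal_context>` messages collapse to one, while ordinary messages pass through untouched.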
## v3.10.9 (2026-05-25)
**Antigravity Overhaul — Context Normalizer, Claude Thinking Fix, Endpoint Lockdown**
### Antigravity Endpoint Lockdown
- Production-only: `cloudcode-pa.googleapis.com` by default
- Sandbox/staging blocked unless `ALLOW_ANTIGRAVITY_STAGING=1`
- 403 SERVICE_DISABLED falls through, 429 returns to client
### AntigravityContextNormalizer
- Bounded context — no more 136-item polluted requests for "hi"
- Simple message detector, auto-reset polluted context
- Duplicate removal, tool output budget, hard char limits
### Claude Thinking Fix (Antigravity-only)
- Fixed 400 error: `maxOutputTokens` is now set to 64000 when thinking is enabled and the current value does not exceed the thinking budget
- Snake_case config, VALIDATED toolConfig, proper budgets
### z.ai / OpenRouter (cobra91 PR #4)
- Full OpenClaw attribution headers, OpenRouter caching
## v3.10.8 (2026-05-25)
**OAuth & Antigravity Endpoint Fixes**

@@ -554,6 +554,7 @@ The launcher generates model catalog JSON with dual field naming to satisfy both
Codex Launcher includes special handling for Gemini 3 / Antigravity OAuth:
- **Sticky endpoint with sequential fallback**: The first request tries `cloudcode-pa.googleapis.com`, then `daily-cloudcode-pa.googleapis.com`; the first endpoint to succeed is cached. All subsequent requests go straight to the cached endpoint. If it fails (429/502/503), the remaining endpoints are tried in order, with no parallel probing.
- **Thought signature preservation**: Captures `thoughtSignature` from Gemini responses
and reattaches them on follow-up requests to maintain tool-call continuity.
- **Edit-intent detection**: When follow-up requests contain edit keywords, a tool-use
@@ -561,7 +562,7 @@ Codex Launcher includes special handling for Gemini 3 / Antigravity OAuth:
- **User instruction enforcement**: The latest user message is guaranteed to be the
final content turn sent to Gemini, even after compaction.
- **Smart compaction**: Old tool outputs capped at 3000 chars, recent 6 at 20000 chars.
- **Context compaction**: Aggressive auto-trimming when approaching 60% of model context
- **Context compaction**: Aggressive auto-trimming when approaching 80% of model context
limit (1M tokens Gemini, 200K Claude, 128K GPT-OSS). Prevents token limit errors.
- **Model ID mapping**: Display names (e.g. `Gemini 3.5 Flash (High)`) mapped to REST API
slugs (e.g. `gemini-3-flash`). See `docs/ANTIGRAVITY.md` for details.
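
As a rough illustration of the 80% compaction threshold above, the following sketch computes the trigger point for each model family. The decimal reading of the limits (128K as 128,000), the `>=` comparison, and the four-characters-per-token estimate are assumptions; the function names are hypothetical.

```python
# Context limits from the README bullet above, read as decimal token counts.
CONTEXT_LIMITS = {"gemini": 1_000_000, "claude": 200_000, "gpt-oss": 128_000}
COMPACTION_PERCENT = 80


def compaction_threshold(family):
    """Token count at which auto-trimming would start for this family."""
    return CONTEXT_LIMITS[family] * COMPACTION_PERCENT // 100


def should_compact(total_chars, family):
    """Estimate tokens as chars // 4 and compare against the threshold."""
    return total_chars // 4 >= compaction_threshold(family)
```

Under these assumptions the thresholds come out at 800,000 tokens for Gemini, 160,000 for Claude, and 102,400 for GPT-OSS.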

@@ -1856,7 +1856,7 @@ class LauncherWin(Gtk.Window):
# header row
hdr = Gtk.Box(spacing=8)
vbox.pack_start(hdr, False, False, 0)
lbl = Gtk.Label(label="<b>Codex Launcher v3.10.7</b>")
lbl = Gtk.Label(label="<b>Codex Launcher v3.10.9</b>")
lbl.set_use_markup(True)
hdr.pack_start(lbl, False, False, 0)
changelog_btn = Gtk.Button(label="Changelog")

Binary file not shown.

@@ -83,12 +83,22 @@ model_catalog_json = ""
"""
CHANGELOG = [
("3.10.8", "2026-05-25", [
"Re-OAuth: replaced deprecated OOB flow with PKCE + localhost callback",
"Project auto-discovery: validates project API enabled, searches alternatives if disabled",
"Windows GUI: _google_reoauth now uses PKCE + callback (was broken OOB paste)",
"Windows GUI: endpoint OAuth flow uses shared project discovery helper",
"Linux GUI: endpoint OAuth flow uses shared _oauth_discover_project helper",
("3.10.9", "2026-05-25", [
"Antigravity: production-only endpoints (cloudcode-pa.googleapis.com), sandbox blocked unless ALLOW_ANTIGRAVITY_STAGING=1",
"Antigravity: 403 SERVICE_DISABLED falls through, 429 returns to client (no sandbox fallback)",
"AntigravityContextNormalizer: bounded context — simple messages send minimal payload",
"Simple message detector: 'hi' etc sends only user message, no tool history",
"Auto-reset polluted context: 200+ items with simple message resets to minimal",
"Duplicate user message removal, tool output budget (max 2 verbatim, rest summarized)",
"Hard limits: 20 contents, 120K/250K/500K char budgets",
"Claude thinking fix: maxOutputTokens=64000, snake_case thinking config, VALIDATED toolConfig",
"Claude budgets: low=8192, medium=16384, high=32768",
"All fixes scoped to OAUTH_PROVIDER==google-antigravity only",
"Project discovery uses production endpoint (not staging)",
"z.ai: full OpenClaw attribution headers (cobra91 PR #4)",
"OpenRouter: X-OpenRouter-Cache header (cobra91 PR #4)",
"Fix Linux Re-OAuth: load_oauth_secrets() was undefined",
"Fix GLib.idle_add lambda returning truthy tuple",
]),
("3.10.7", "2026-05-25", [
"Prompt Enhancer: per-provider toggle to improve prompt clarity after compaction",

@@ -3,11 +3,11 @@ set -e
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
if [ -f "$SCRIPT_DIR/codex-launcher_3.10.8_all.deb" ]; then
echo "Installing codex-launcher_3.10.8_all.deb ..."
sudo dpkg -i "$SCRIPT_DIR/codex-launcher_3.10.8_all.deb"
if [ -f "$SCRIPT_DIR/codex-launcher_3.10.12_all.deb" ]; then
echo "Installing codex-launcher_3.10.12_all.deb ..."
sudo dpkg -i "$SCRIPT_DIR/codex-launcher_3.10.12_all.deb"
echo ""
echo "Installed v3.10.8 via .deb package."
echo "Installed v3.10.12 via .deb package."
echo " translate-proxy.py -> /usr/bin/translate-proxy.py"
echo " codex-launcher-gui -> /usr/bin/codex-launcher-gui"
echo " cleanup-codex-stale -> /usr/bin/cleanup-codex-stale.sh"

File diff suppressed because it is too large

@@ -83,15 +83,48 @@ model_catalog_json = ""
"""
CHANGELOG = [
("3.10.8", "2026-05-25", [
"Fix Re-OAuth buttons: load_oauth_secrets() was undefined in Linux GUI",
"Re-OAuth: replaced deprecated OOB flow with PKCE + localhost callback",
"Proxy: prefer production cloudcode-pa over staging/sandbox endpoints",
"Proxy: fallthrough 403 SERVICE_DISABLED to next endpoint",
"Project discovery: validate against production endpoint, not staging",
"Antigravity preset base_url changed to production (was daily-cloudcode-pa.sandbox)",
"Fix GLib.idle_add lambda returning truthy tuple (caused repeated calls)",
"Windows GUI: project discovery also uses production endpoint",
("3.10.12", "2026-05-26", [
"Sticky endpoint: caches last working endpoint, sequential fallback on failure",
"Endpoint order: cloudcode-pa first (matches agy CLI), daily-cloudcode-pa fallback",
"Anti-stall engine: kills stale proxy processes + clears pycache on startup",
"Smart error classification: quota vs capacity vs banned vs validation vs auth",
"Rate limit reset parsing: extracts cooldown from error body for accuracy",
"Missing headers: X-Client-Name, X-Client-Version, x-goog-api-client, sessionId",
"Guardrail skip: simple messages (hi) skip agent guardrail, no more tool-call loops",
"Claude fixes: preserve all tools, skip compaction/normalizer/sanitization for Claude",
"Normalizer model param: distinguishes Claude vs Gemini for correct behavior",
]),
("3.10.11", "2026-05-26", [
"Hybrid endpoint fallback: cloudcode-pa then daily-cloudcode-pa on 429",
"daily-cloudcode-pa.googleapis.com (same endpoint agy-core uses)",
"429 errors log full response body for debugging",
"Rate-limit marking only after ALL endpoints fail",
"Restored SERVICE_DISABLED (403) fallthrough",
]),
("3.10.10", "2026-05-25", [
"Fix normalizer stripping ALL context after compaction on resumed sessions",
"No auto-reset when compaction summary present (preserves 1925+ turn history)",
"Always preserve compaction summaries in normalizer output",
"Deduplicate consecutive identical goal_context messages",
"Emergency reset preserves compaction summaries",
"Fix hashlib NameError in _antigravity_normalize_context (string comparison instead)",
]),
("3.10.9", "2026-05-25", [
"Antigravity: production-only endpoints (cloudcode-pa.googleapis.com), sandbox blocked unless ALLOW_ANTIGRAVITY_STAGING=1",
"Antigravity: 403 SERVICE_DISABLED falls through, 429 returns to client (no sandbox fallback)",
"AntigravityContextNormalizer: bounded context — simple messages send minimal payload",
"Simple message detector: 'hi' etc sends only user message, no tool history",
"Auto-reset polluted context: 200+ items with simple message resets to minimal",
"Duplicate user message removal, tool output budget (max 2 verbatim, rest summarized)",
"Hard limits: 20 contents, 120K/250K/500K char budgets",
"Claude thinking fix: maxOutputTokens=64000, snake_case thinking config, VALIDATED toolConfig",
"Claude budgets: low=8192, medium=16384, high=32768",
"All fixes scoped to OAUTH_PROVIDER==google-antigravity only",
"Project discovery uses production endpoint (not staging)",
"z.ai: full OpenClaw attribution headers (cobra91 PR #4)",
"OpenRouter: X-OpenRouter-Cache header (cobra91 PR #4)",
"Fix Linux Re-OAuth: load_oauth_secrets() was undefined",
"Fix GLib.idle_add lambda returning truthy tuple",
]),
("3.10.7", "2026-05-25", [
"Prompt Enhancer: per-provider toggle to improve prompt clarity after compaction",

@@ -616,6 +616,51 @@ class APIKeyPool(AccountPool):
_cb_pool = CodebuffAccountPool("codebuff")
_google_antigravity_pool = GoogleAccountPool("antigravity")
_google_cli_pool = GoogleAccountPool("cli")
_antigravity_preferred_endpoint = None
_antigravity_endpoint_lock = threading.Lock()
def _classify_antigravity_error(status_code, body):
    lower = body.lower()
    if status_code == 400:
        return "bad_request"
    if status_code == 401:
        if any(x in lower for x in ["invalid_grant", "token revoked", "token_revoked", "invalid_client"]):
            return "auth_permanent"
        return "auth_transient"
    if status_code == 403:
        if "validation_required" in lower or "account_disabled" in lower:
            return "validation_required"
        if "has been disabled" in lower and "violation of terms of service" in lower:
            return "account_banned"
        if "service_disabled" in lower:
            return "service_disabled"
        return "forbidden"
    if status_code in (429, 503, 529):
        if any(x in lower for x in ["model_capacity_exhausted", "capacity_exhausted", "model is currently overloaded", "service temporarily unavailable"]):
            return "capacity_exhausted"
        if any(x in lower for x in ["quota_exhausted", "resource_exhausted", "daily limit", "quota exceeded", "quotaresetdelay"]):
            return "quota_exhausted"
        return "rate_limited"
    if status_code >= 500:
        return "server_error"
    return "unknown"

def _parse_rate_limit_reset(body):
    import re as _re
    m = _re.search(r'quotaResetDelay[:"\s]+(\d+(?:\.\d+)?)(ms|s)', body, _re.IGNORECASE)
    if m:
        val = float(m.group(1))
        return val / 1000 if m.group(2) == 'ms' else val
    m = _re.search(r'(\d+)h(\d+)m(\d+)s', body, _re.IGNORECASE)
    if m:
        return int(m.group(1)) * 3600 + int(m.group(2)) * 60 + int(m.group(3))
    m = _re.search(r'Resets in ~(\d+)h(\d+)m', body, _re.IGNORECASE)
    if m:
        return int(m.group(1)) * 3600 + int(m.group(2)) * 60
    m = _re.search(r'retry[-_]?after[:\s]+(\d+)\s*(?:sec|s\b)', body, _re.IGNORECASE)
    if m:
        return int(m.group(1))
    return None
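
To show which error-body shapes the reset parser recognizes, here is a standalone copy of its logic under the hypothetical name `parse_rate_limit_reset`, followed by a few sample inputs. The sample bodies are made up for illustration and are not captured upstream responses.

```python
import re


def parse_rate_limit_reset(body):
    """Return the cooldown in seconds found in an error body, or None."""
    # e.g. "quotaResetDelay": "1500ms"  or  quotaResetDelay: 42s
    m = re.search(r'quotaResetDelay[:"\s]+(\d+(?:\.\d+)?)(ms|s)', body, re.IGNORECASE)
    if m:
        val = float(m.group(1))
        return val / 1000 if m.group(2) == "ms" else val
    # e.g. 2h5m10s
    m = re.search(r"(\d+)h(\d+)m(\d+)s", body, re.IGNORECASE)
    if m:
        return int(m.group(1)) * 3600 + int(m.group(2)) * 60 + int(m.group(3))
    # e.g. Resets in ~1h27m
    m = re.search(r"Resets in ~(\d+)h(\d+)m", body, re.IGNORECASE)
    if m:
        return int(m.group(1)) * 3600 + int(m.group(2)) * 60
    # e.g. retry-after: 30 sec
    m = re.search(r"retry[-_]?after[:\s]+(\d+)\s*(?:sec|s\b)", body, re.IGNORECASE)
    if m:
        return int(m.group(1))
    return None


if __name__ == "__main__":
    for sample in ('{"quotaResetDelay": "1500ms"}', "Resets in ~1h27m", "retry-after: 30 sec"):
        print(sample, "->", parse_rate_limit_reset(sample))
```

The patterns are tried in order, so a body carrying both a `quotaResetDelay` field and a human-readable hint resolves to the `quotaResetDelay` value.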
def _get_codebuff_account():
"""Return (token, account_dict) for best available codebuff account."""
@@ -771,6 +816,20 @@ def _ensure_antigravity_version():
_antigravity_version_checked = time.time()
return _antigravity_version
_antigravity_client_version = "1.110.0"
_antigravity_client_version_checked = 0
def _ensure_antigravity_client_version():
    global _antigravity_client_version, _antigravity_client_version_checked
    env_ver = os.environ.get("ANTIGRAVITY_CLIENT_VERSION", "").strip()
    if env_ver:
        return env_ver
    if time.time() - _antigravity_client_version_checked < 6 * 3600:
        return _antigravity_client_version
    _antigravity_client_version = os.environ.get("ANTIGRAVITY_CLIENT_VERSION_FALLBACK", "1.110.0")
    _antigravity_client_version_checked = time.time()
    return _antigravity_client_version
def _init_runtime():
global CONFIG, PORT, BACKEND, TARGET_URL, API_KEY, OAUTH_PROVIDER, _antigravity_version
global MODELS, CC_VERSION, REASONING_ENABLED, REASONING_EFFORT, BGP_ROUTES
@@ -1300,6 +1359,26 @@ def forwarded_headers(request_headers, extra=None, browser_ua=False):
headers.update(extra)
return headers
def _openrouter_extra():
    if not TARGET_URL:
        return {}
    if "z.ai" in TARGET_URL:
        return {
            "HTTP-Referer": "https://openclaw.ai",
            "X-OpenRouter-Title": "OpenClaw",
            "X-OpenRouter-Categories":
                "cli-agent,cloud-agent,programming-app,creative-writing,"
                "writing-assistant,general-chat,personal-agent",
        }
    if "openrouter.ai" in TARGET_URL:
        return {
            "HTTP-Referer": "https://chats-llm.com",
            "X-OpenRouter-Title": "Chats-LLM",
            "X-OpenRouter-Categories": "general-chat, ide-extension",
            "X-OpenRouter-Cache": "true",
        }
    return {}
_MAX_INPUT_ITEMS = 30
_MAX_TOOL_OUTPUT_CHARS = 8000
_COMPACT_KEEP_RECENT = 10
@@ -4237,6 +4316,221 @@ def _auto_continue_gemini(handler, flush_event, message_id, model, gen_config, g
break
return accumulated_text
_ANTIGRAVITY_MAX_CONTENTS = 20
_ANTIGRAVITY_MAX_TOOL_VERBATIM = 2
_ANTIGRAVITY_MAX_TOOL_CHARS = 2000
_ANTIGRAVITY_MAX_OLD_SUMMARY_CHARS = 1200
_ANTIGRAVITY_SOFT_CHARS = 120000
_ANTIGRAVITY_HARD_CHARS = 250000
_ANTIGRAVITY_EMERGENCY_CHARS = 500000
_ANTIGRAVITY_SIMPLE_WORDS = frozenset({"hi", "hello", "hey", "test", "ping", "thanks", "thank you", "ok", "okay", "yes", "no", "cool", "nice", "good", "great", "done", "go", "stop", "yep", "nope", "sure", "right", "correct", "continue", "cont", "k", "thx", "ty", "np", "lol", "brb", "bye"})
_ANTIGRAVITY_EDIT_WORDS = frozenset(("change", "fix", "update", "redesign", "rewrite", "modify", "improve", "replace", "edit", "make it", "add", "remove", "delete", "rename", "move", "convert", "create", "build", "implement"))
_ANTIGRAVITY_REFERENCE_WORDS = frozenset(("previous", "file", "error", "again", "that", "this", "it", "same", "last", "above", "earlier", "before", "earlier output", "last error", "previous result", "what was", "show me", "give me"))
def _antigravity_is_simple_user(text):
    if not text:
        return True
    stripped = text.strip().lower()
    if stripped in _ANTIGRAVITY_SIMPLE_WORDS:
        return True
    if len(stripped) < 30:
        words = set(stripped.split())
        if not words.intersection(_ANTIGRAVITY_REFERENCE_WORDS) and not words.intersection(_ANTIGRAVITY_EDIT_WORDS):
            return True
    return False
def _antigravity_normalize_context(input_data, model=""):
if not isinstance(input_data, list) or len(input_data) < 2:
return input_data
is_claude_model = "claude" in model.lower()
latest_user = ""
latest_user_idx = -1
for i in range(len(input_data) - 1, -1, -1):
item = input_data[i]
if isinstance(item, dict) and item.get("type") == "message" and item.get("role") == "user":
c = item.get("content", "")
if isinstance(c, str):
latest_user = c
elif isinstance(c, list):
latest_user = "\n".join(p.get("text", p.get("input_text", "")) for p in c if isinstance(p, dict))
latest_user_idx = i
break
if not latest_user:
return input_data
is_simple = _antigravity_is_simple_user(latest_user)
n_raw = len(input_data)
n_tool_outputs = sum(1 for it in input_data if isinstance(it, dict) and it.get("type") == "function_call_output")
n_tool_calls = sum(1 for it in input_data if isinstance(it, dict) and it.get("type") == "function_call")
auto_reset = (n_raw > 200 or n_tool_outputs > 20) and is_simple
if os.environ.get("ANTIGRAVITY_AUTO_RESET_POLLUTED_CONTEXT", "1") != "1":
auto_reset = False
has_compaction_summary = any(
isinstance(it, dict) and it.get("type") == "message" and it.get("role") == "user"
and ("Auto-compacted" in str(it.get("content", "")) or "auto-compacted" in str(it.get("content", "")).lower())
for it in input_data
)
if is_simple and auto_reset and not has_compaction_summary:
system_items = [it for it in input_data if isinstance(it, dict) and it.get("type") == "message" and it.get("role") in ("developer", "system")]
user_item = input_data[latest_user_idx]
result = system_items + [user_item] if system_items else [user_item]
print(f"[antigravity-context] raw_items={n_raw} compacted_items={n_raw} final_items={len(result)}", file=sys.stderr)
print(f"[antigravity-context] raw_tool_outputs={n_tool_outputs} kept_tool_outputs=0", file=sys.stderr)
print(f"[antigravity-context] simple_latest_user=true auto_reset={auto_reset} has_compaction={has_compaction_summary}", file=sys.stderr)
return result
dev_messages = []
recent_items = []
tool_outputs = []
other_items = []
for i, item in enumerate(input_data):
if not isinstance(item, dict):
continue
t = item.get("type")
if t == "message" and item.get("role") in ("developer", "system"):
dev_messages.append(item)
elif t == "function_call_output":
tool_outputs.append((i, item))
elif t in ("function_call",):
other_items.append((i, item))
elif t == "message":
recent_items.append((i, item))
latest_words = set(latest_user.strip().lower().split())
has_edit_intent = bool(latest_words.intersection(_ANTIGRAVITY_EDIT_WORDS))
has_ref_intent = bool(latest_words.intersection(_ANTIGRAVITY_REFERENCE_WORDS))
if is_claude_model:
keep_tools = len(tool_outputs)
else:
keep_tools = 2 if (has_edit_intent or has_ref_intent) else 1
if is_claude_model:
kept_tools = tool_outputs
else:
kept_tools = tool_outputs[-keep_tools:] if tool_outputs and (has_edit_intent or has_ref_intent) else []
for idx_t, t_item in enumerate(kept_tools):
orig = t_item[1]
out = orig.get("output", "")
if isinstance(out, str) and len(out) > _ANTIGRAVITY_MAX_TOOL_CHARS:
new_item = dict(orig)
new_item["output"] = out[:_ANTIGRAVITY_MAX_TOOL_CHARS] + f"\n... [truncated: kept {_ANTIGRAVITY_MAX_TOOL_CHARS} of {len(out)} chars]"
kept_tools[idx_t] = (t_item[0], new_item)
n_summarized = len(tool_outputs) - len(kept_tools)
tail_start = max(0, len(recent_items) - 6)
recent_tail = recent_items[tail_start:]
deduped_tail = []
seen_goal_context = False
for idx, msg_item in recent_tail:
content_str = ""
c = msg_item.get("content", "")
if isinstance(c, str):
content_str = c
elif isinstance(c, list):
content_str = " ".join(p.get("text", p.get("input_text", "")) for p in c if isinstance(p, dict))
if "<goal_context>" in content_str:
if seen_goal_context:
continue
seen_goal_context = True
deduped_tail.append((idx, msg_item))
recent_tail = deduped_tail if deduped_tail else recent_tail
tool_call_ids = set()
for _, t_item in kept_tools:
cid = t_item.get("call_id", t_item.get("id", ""))
if cid:
tool_call_ids.add(cid)
paired_calls = []
for idx, item in other_items:
cid = item.get("call_id", item.get("id", ""))
if cid in tool_call_ids:
paired_calls.append((idx, item))
result = list(dev_messages)
compaction_summaries = []
for idx, msg_item in recent_items:
if msg_item is input_data[latest_user_idx]:
continue
c = msg_item.get("content", "")
content_str = c if isinstance(c, str) else " ".join(p.get("text", p.get("input_text", "")) for p in c if isinstance(p, dict)) if isinstance(c, list) else ""
if "Auto-compacted" in content_str or "auto-compacted" in content_str.lower():
compaction_summaries.append(msg_item)
if n_summarized > 0:
summary_text = f"[Tool history summary: {n_summarized} older tool outputs omitted. {n_tool_calls} prior function calls were made for file inspection/editing.]"
result.append({"type": "message", "role": "user", "content": [{"type": "input_text", "text": summary_text}]})
for _, call_item in paired_calls:
result.append(call_item)
for _, tool_item in kept_tools:
result.append(tool_item)
for cs_item in compaction_summaries:
result.append(cs_item)
for _, msg_item in recent_tail:
if msg_item is not input_data[latest_user_idx]:
result.append(msg_item)
latest_norm = " ".join(latest_user.strip().split())[:200].lower()
already_present = False
for r in result:
if isinstance(r, dict) and r.get("type") == "message" and r.get("role") == "user":
c = r.get("content", "")
if isinstance(c, str):
rn = " ".join(c.strip().split())[:200].lower()
elif isinstance(c, list):
combined = " ".join(p.get("text", p.get("input_text", "")) for p in c if isinstance(p, dict))
rn = " ".join(combined.strip().split())[:200].lower()
else:
rn = ""
if rn == latest_norm:
already_present = True
break
if not already_present:
result.append(input_data[latest_user_idx])
total_chars = sum(len(json.dumps(it, ensure_ascii=False)) for it in result)
if total_chars > _ANTIGRAVITY_EMERGENCY_CHARS:
print(f"[antigravity-context] EMERGENCY: {total_chars} chars exceeds limit, resetting to minimal", file=sys.stderr)
result = list(dev_messages)
if compaction_summaries:
result.extend(compaction_summaries)
result.append(input_data[latest_user_idx])
total_chars = sum(len(json.dumps(it, ensure_ascii=False)) for it in result)
while len(result) > _ANTIGRAVITY_MAX_CONTENTS and total_chars > _ANTIGRAVITY_SOFT_CHARS:
for i in range(1, len(result) - 1):
if isinstance(result[i], dict) and result[i].get("type") in ("message", "function_call_output"):
removed = result.pop(i)
total_chars -= len(json.dumps(removed, ensure_ascii=False))
break
else:
break
est_tokens = total_chars // 4
print(f"[antigravity-context] raw_items={n_raw} final_items={len(result)}", file=sys.stderr)
print(f"[antigravity-context] raw_tool_outputs={n_tool_outputs} kept_tool_outputs={len(kept_tools)} summarized_tool_outputs={n_summarized}", file=sys.stderr)
print(f"[antigravity-context] simple_latest_user={is_simple} auto_reset={auto_reset}", file=sys.stderr)
print(f"[antigravity-context] final_chars={total_chars} estimated_tokens={est_tokens}", file=sys.stderr)
return result
class Handler(http.server.BaseHTTPRequestHandler):
protocol_version = "HTTP/1.1"
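
The simple-message detector used by the normalizer can be tried in isolation. The sketch below mirrors its logic with deliberately shortened word sets (an assumption for brevity; the real sets in the hunk above are longer), under the hypothetical name `is_simple_user`.

```python
# Abbreviated word sets; the proxy's own sets contain more entries.
SIMPLE_WORDS = frozenset({"hi", "hello", "thanks", "ok"})
EDIT_WORDS = frozenset({"change", "fix", "update", "add", "remove"})
REFERENCE_WORDS = frozenset({"previous", "file", "error", "again", "that", "this", "it", "same", "last"})


def is_simple_user(text):
    """True when the message is a greeting or a short text with no edit/reference words."""
    if not text:
        return True
    stripped = text.strip().lower()
    if stripped in SIMPLE_WORDS:
        return True
    if len(stripped) < 30:
        words = set(stripped.split())
        if not words & REFERENCE_WORDS and not words & EDIT_WORDS:
            return True
    return False


if __name__ == "__main__":
    for msg in ("hi", "how are you", "fix the login bug", "what time is it"):
        print(repr(msg), is_simple_user(msg))
```

Note that a short message containing a pronoun such as "it" is treated as a reference to earlier context, so "what time is it" is not classified as simple.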
@@ -4412,7 +4706,7 @@ class Handler(http.server.BaseHTTPRequestHandler):
body["input"] = input_data
compacted = False
if policy.get("compaction") and isinstance(input_data, list):
if policy.get("compaction") and isinstance(input_data, list) and "claude" not in model.lower():
input_data, compacted = _adaptive_compact(input_data, model, policy)
if compacted:
body = dict(body)
@@ -4450,6 +4744,7 @@ class Handler(http.server.BaseHTTPRequestHandler):
fwd = forwarded_headers(self.headers, {
"Content-Type": "application/json",
"Authorization": f"Bearer {effective_key}",
**_openrouter_extra(),
}, browser_ua=True)
print(f"[{self._session_id}] POST {target} model={model} stream={stream} items={len(input_data) if isinstance(input_data,list) else 1}", file=sys.stderr)
chat_body_b = json.dumps(chat_body).encode()
@@ -4590,7 +4885,7 @@ class Handler(http.server.BaseHTTPRequestHandler):
body["input"] = input_data
compacted = False
if policy.get("compaction") and isinstance(input_data, list):
if policy.get("compaction") and isinstance(input_data, list) and "claude" not in model.lower():
input_data, compacted = _adaptive_compact(input_data, model, policy)
if compacted:
body = dict(body)
@@ -4601,6 +4896,11 @@ class Handler(http.server.BaseHTTPRequestHandler):
body = dict(body)
body["input"] = input_data
if OAUTH_PROVIDER == "google-antigravity" and isinstance(input_data, list) and "claude" not in model.lower():
input_data = _antigravity_normalize_context(input_data, model)
body = dict(body)
body["input"] = input_data
access_token = _refresh_oauth_token()
token_name = "google-antigravity-oauth-token.json" if OAUTH_PROVIDER == "google-antigravity" else "google-cli-oauth-token.json"
token_path = os.path.join(os.path.expanduser("~"), ".cache", "codex-proxy", token_name)
@@ -4675,7 +4975,7 @@ class Handler(http.server.BaseHTTPRequestHandler):
resp_part["functionResponse"]["id"] = call_id
contents.append({"role": "user", "parts": [resp_part]})
if OAUTH_PROVIDER.startswith("google"):
if OAUTH_PROVIDER.startswith("google") and "claude" not in model.lower():
sanitized = []
last_user_text = None
last_role = None
@@ -4721,7 +5021,26 @@ class Handler(http.server.BaseHTTPRequestHandler):
if body.get("top_p") is not None:
gen_config["topP"] = body["top_p"]
if REASONING_ENABLED and REASONING_EFFORT != "none":
_is_claude_model = "claude" in model.lower()
_is_claude_thinking = _is_claude_model and "thinking" in model.lower()
if OAUTH_PROVIDER == "google-antigravity" and _is_claude_thinking:
if REASONING_ENABLED and REASONING_EFFORT != "none":
budget = {"low": 8192, "medium": 16384, "high": 32768}.get(REASONING_EFFORT, 16384)
else:
budget = 16384
gen_config["thinkingConfig"] = {
"include_thoughts": True,
"thinking_budget": budget,
}
current_max = gen_config.get("maxOutputTokens", 0)
if not current_max or current_max <= budget:
gen_config["maxOutputTokens"] = 64000
print(f"[antigravity-claude] thinking model={model} budget={budget} maxOutputTokens={gen_config.get('maxOutputTokens')}", file=sys.stderr)
elif OAUTH_PROVIDER == "google-antigravity" and _is_claude_model:
if "thinkingConfig" in gen_config:
del gen_config["thinkingConfig"]
elif REASONING_ENABLED and REASONING_EFFORT != "none":
budget = {"low": 2048, "medium": 8192, "high": 24576}.get(REASONING_EFFORT, 8192)
gen_config["thinkingConfig"] = {"includeThoughts": True, "thinkingBudget": budget}
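
The branching in this hunk can be summarized as a pure function. This is a sketch that assumes `OAUTH_PROVIDER == "google-antigravity"` throughout; `build_thinking_config` and the model names in the usage note are hypothetical, while the budgets and key names are taken from the diff.

```python
CLAUDE_THINKING_BUDGETS = {"low": 8192, "medium": 16384, "high": 32768}
GEMINI_THINKING_BUDGETS = {"low": 2048, "medium": 8192, "high": 24576}


def build_thinking_config(model, effort, reasoning_enabled=True, max_output_tokens=0):
    """Return the generation-config fragment for a given model and effort."""
    name = model.lower()
    is_claude = "claude" in name
    is_claude_thinking = is_claude and "thinking" in name
    reasoning_on = reasoning_enabled and effort != "none"

    gen_config = {}
    if max_output_tokens:
        gen_config["maxOutputTokens"] = max_output_tokens

    if is_claude_thinking:
        # Claude thinking models use snake_case keys and larger budgets.
        budget = CLAUDE_THINKING_BUDGETS.get(effort, 16384) if reasoning_on else 16384
        gen_config["thinkingConfig"] = {"include_thoughts": True, "thinking_budget": budget}
        # The output cap must exceed the thinking budget.
        if not max_output_tokens or max_output_tokens <= budget:
            gen_config["maxOutputTokens"] = 64000
    elif is_claude:
        pass  # non-thinking Claude: no thinkingConfig at all
    elif reasoning_on:
        budget = GEMINI_THINKING_BUDGETS.get(effort, 8192)
        gen_config["thinkingConfig"] = {"includeThoughts": True, "thinkingBudget": budget}
    return gen_config
```

For example, `build_thinking_config("claude-example-thinking", "high")` yields a 32768-token budget with `maxOutputTokens` raised to 64000, whereas a Gemini model at `"low"` gets a camelCase config with a 2048-token budget.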
@@ -4747,8 +5066,19 @@ class Handler(http.server.BaseHTTPRequestHandler):
contents = _gemini_reattach_sigs(contents)
if OAUTH_PROVIDER == "google-antigravity":
latest_user = ""
if isinstance(input_data, list):
for item in reversed(input_data):
if item.get("type") == "message" and item.get("role") == "user":
c = item.get("content", "")
if isinstance(c, str):
latest_user = c
elif isinstance(c, list):
latest_user = "\n".join(p.get("text", p.get("input_text", "")) for p in c if isinstance(p, dict))
break
is_latest_simple = _antigravity_is_simple_user(latest_user)
guardrail_found = any("autonomous coding agent" in json.dumps(c.get("parts", []), ensure_ascii=False) for c in contents[:2])
if not guardrail_found:
if not guardrail_found and not is_latest_simple:
contents.insert(0, {"role": "user", "parts": [{"text": _GEMINI_AGENT_GUARDRAIL}]})
if OAUTH_PROVIDER == "google-antigravity" and isinstance(input_data, list):
@@ -4801,6 +5131,11 @@ class Handler(http.server.BaseHTTPRequestHandler):
if gemini_tools:
request_body["tools"] = gemini_tools
if OAUTH_PROVIDER == "google-antigravity" and _is_claude_model and gemini_tools:
request_body["toolConfig"] = {"functionCallingConfig": {"mode": "VALIDATED"}}
if _is_claude_thinking:
print(f"[antigravity-claude] applied VALIDATED toolConfig for thinking model", file=sys.stderr)
wrapped = {
"project": project_id,
"model": model,
@@ -4810,14 +5145,22 @@ class Handler(http.server.BaseHTTPRequestHandler):
wrapped["requestType"] = "agent"
wrapped["userAgent"] = "antigravity"
wrapped["requestId"] = f"agent-{uuid.uuid4().hex[:12]}"
wrapped["request"]["sessionId"] = f"{uuid.uuid4().hex}{int(time.time()*1000)}"
endpoints = ([
"https://cloudcode-pa.googleapis.com",
"https://daily-cloudcode-pa.sandbox.googleapis.com",
"https://autopush-cloudcode-pa.sandbox.googleapis.com",
] if OAUTH_PROVIDER == "google-antigravity" else [
"https://cloudcode-pa.googleapis.com",
])
_allow_staging = os.environ.get("ALLOW_ANTIGRAVITY_STAGING", "0") == "1"
if OAUTH_PROVIDER == "google-antigravity":
_antigravity_endpoints = [
"https://cloudcode-pa.googleapis.com",
"https://daily-cloudcode-pa.googleapis.com",
]
if _allow_staging:
_antigravity_endpoints.extend([
"https://daily-cloudcode-pa.sandbox.googleapis.com",
"https://autopush-cloudcode-pa.sandbox.googleapis.com",
])
endpoints = _antigravity_endpoints
else:
endpoints = ["https://cloudcode-pa.googleapis.com"]
action = "streamGenerateContent" if stream else "generateContent"
url_suffix = f"v1internal:{action}?alt=sse" if stream else f"v1internal:{action}"
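
The endpoint selection in this hunk reduces to a small function of the provider and one environment flag. The sketch below takes the environment as a parameter so it can be exercised without touching the real process environment; `build_endpoints` is a hypothetical name.

```python
import os

PRODUCTION = [
    "https://cloudcode-pa.googleapis.com",
    "https://daily-cloudcode-pa.googleapis.com",
]
STAGING = [
    "https://daily-cloudcode-pa.sandbox.googleapis.com",
    "https://autopush-cloudcode-pa.sandbox.googleapis.com",
]


def build_endpoints(provider, environ=None):
    """Return the ordered endpoint list for a provider."""
    env = os.environ if environ is None else environ
    if provider != "google-antigravity":
        return ["https://cloudcode-pa.googleapis.com"]
    endpoints = list(PRODUCTION)
    # Sandbox hosts are opt-in only.
    if env.get("ALLOW_ANTIGRAVITY_STAGING", "0") == "1":
        endpoints.extend(STAGING)
    return endpoints
```

Staging hosts are appended after the production ones, so even with the flag set they are only reached after both production endpoints have been tried.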
@@ -4827,7 +5170,13 @@ class Handler(http.server.BaseHTTPRequestHandler):
}
if OAUTH_PROVIDER == "google-antigravity":
version = _ensure_antigravity_version()
headers["User-Agent"] = f"antigravity/{version} darwin/arm64"
import platform as _plat
_os_name = _plat.system().lower()
_os_arch = _plat.machine().lower().replace("x86_64", "x64").replace("aarch64", "arm64")
headers["User-Agent"] = f"antigravity/{version} {_os_name}/{_os_arch}"
headers["X-Client-Name"] = "antigravity"
headers["X-Client-Version"] = _ensure_antigravity_client_version()
headers["x-goog-api-client"] = "gl-node/18.18.2 fire/0.8.6 grpc/1.10.x"
else:
headers["User-Agent"] = "google-api-nodejs-client/9.15.1"
headers["X-Goog-Api-Client"] = "gl-node/22.17.0"
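
The platform-aware `User-Agent` built in this hunk can be checked on any machine with the sketch below. The optional `system` and `machine` parameters are an addition for illustration so the result can be previewed for other platforms; the version string in the usage note is a placeholder.

```python
import platform


def antigravity_user_agent(version, system=None, machine=None):
    """Build 'antigravity/<version> <os>/<arch>' from the local platform."""
    os_name = (system or platform.system()).lower()
    arch = (machine or platform.machine()).lower()
    # Normalize the two architecture spellings handled in the diff.
    arch = arch.replace("x86_64", "x64").replace("aarch64", "arm64")
    return f"antigravity/{version} {os_name}/{arch}"


if __name__ == "__main__":
    print(antigravity_user_agent("0.0.0"))
```

Only `x86_64` and `aarch64` are rewritten; other values reported by `platform.machine()` (Windows typically reports `AMD64`) pass through lowercased and unchanged.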
@@ -4847,14 +5196,33 @@ class Handler(http.server.BaseHTTPRequestHandler):
if OAUTH_PROVIDER == "google-antigravity":
print(f"[antigravity-endpoint] endpoints={[e.replace('https://','') for e in endpoints]} project={project_id}", file=sys.stderr)
for ep in endpoints:
upstream = None
chosen_ep = None
global _antigravity_preferred_endpoint
with _antigravity_endpoint_lock:
_pref = _antigravity_preferred_endpoint
if _pref and _pref in endpoints:
ordered = [_pref] + [e for e in endpoints if e != _pref]
else:
ordered = list(endpoints)
for ep in ordered:
target = f"{ep}/{url_suffix}"
req = urllib.request.Request(target, data=body_b, headers=headers)
try:
upstream = urllib.request.urlopen(req, timeout=_upstream_timeout(body, stream))
chosen_ep = ep
with _antigravity_endpoint_lock:
_antigravity_preferred_endpoint = ep
if ep != _pref:
print(f"[{self._session_id}] fallback OK: {ep.replace('https://','')}", file=sys.stderr)
break
except urllib.error.HTTPError as e:
err_body = e.read().decode("utf-8", errors="replace")
err_class = _classify_antigravity_error(e.code, err_body)
print(f"[{self._session_id}] {ep.replace('https://','')} {e.code} class={err_class}", file=sys.stderr)
if e.code == 400 and OAUTH_PROVIDER.startswith("google"):
try:
debug_path = os.path.join(_LOG_DIR, "gemini-last-400-request.json")
@@ -4863,23 +5231,38 @@ class Handler(http.server.BaseHTTPRequestHandler):
print(f"[{self._session_id}] saved 400 debug request to {debug_path}", file=sys.stderr)
except Exception:
pass
if e.code == 403 and "SERVICE_DISABLED" in err_body[:500] and ep != endpoints[-1]:
print(f"[{self._session_id}] {ep} SERVICE_DISABLED, trying next endpoint", file=sys.stderr)
return self.send_json(e.code, {"error": {"type": "upstream_error", "message": _sanitize_err_body(err_body)}})
if err_class == "auth_permanent":
return self.send_json(e.code, {"error": {"type": "upstream_error", "message": _sanitize_err_body(err_body)}})
if err_class in ("quota_exhausted", "rate_limited"):
reset_s = _parse_rate_limit_reset(err_body)
if ep == ordered[-1]:
pool = _google_antigravity_pool if OAUTH_PROVIDER == "google-antigravity" else _google_cli_pool
_, acct = _get_google_account(OAUTH_PROVIDER)
if acct:
cooldown = reset_s if reset_s and reset_s > 10 else 60
pool.mark_rate_limited(acct, cooldown)
print(f"[{self._session_id}] quota reset in ~{reset_s}s, cooldown={cooldown}s", file=sys.stderr)
return self.send_json(e.code, {"error": {"type": "upstream_error", "message": _sanitize_err_body(err_body)}})
print(f"[{self._session_id}] {ep.replace('https://','')} {e.code} class={err_class}, trying next", file=sys.stderr)
with _antigravity_endpoint_lock:
_antigravity_preferred_endpoint = None
continue
if e.code == 429 and ep != endpoints[-1]:
print(f"[{self._session_id}] {ep} HTTP 429, trying next endpoint", file=sys.stderr)
if err_class in ("service_disabled", "forbidden", "account_banned", "validation_required"):
if ep == ordered[-1]:
return self.send_json(e.code, {"error": {"type": "upstream_error", "message": _sanitize_err_body(err_body)}})
continue
if e.code == 429:
pool = _google_antigravity_pool if OAUTH_PROVIDER == "google-antigravity" else _google_cli_pool
_, acct = _get_google_account(OAUTH_PROVIDER)
if acct:
pool.mark_rate_limited(acct, 60)
return self.send_json(e.code, {"error": {"type": "upstream_error", "message": _sanitize_err_body(err_body)}})
except Exception as e:
if ep == endpoints[-1]:
return self.send_json(502, {"error": {"type": "proxy_error", "message": str(e)}})
print(f"[{self._session_id}] {ep} failed: {e}, trying next", file=sys.stderr)
if ep == ordered[-1]:
return self.send_json(e.code, {"error": {"type": "upstream_error", "message": _sanitize_err_body(err_body)}})
continue
except Exception as e:
print(f"[{self._session_id}] {ep.replace('https://','')} conn failed: {e}", file=sys.stderr)
if ep == ordered[-1]:
return self.send_json(502, {"error": {"type": "proxy_error", "message": str(e)}})
continue
if upstream is None:
return self.send_json(502, {"error": {"type": "proxy_error", "message": "All endpoints failed"}})
if stream:
self._forward_gemini_sse(upstream, model, body, input_data, tracker)
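The sticky-endpoint ordering and sequential fallback in the two hunks above can be sketched in isolation as follows. This is a simplified illustration under stated assumptions, not the handler's code: `order_endpoints`, `call_with_fallback`, and the `send` callback are hypothetical names. Unlike the diff, which clears the preferred endpoint only on quota or rate-limit errors, this sketch clears it on any failure of the preferred endpoint.

```python
import threading

_preferred_lock = threading.Lock()
_preferred_endpoint = None  # hypothetical module-level cache of the last good endpoint


def order_endpoints(endpoints, preferred):
    """Put the preferred endpoint first, keeping the others in their given order."""
    if preferred and preferred in endpoints:
        return [preferred] + [e for e in endpoints if e != preferred]
    return list(endpoints)


def call_with_fallback(endpoints, send):
    """Try endpoints one at a time; `send(ep)` returns a response or raises."""
    global _preferred_endpoint
    with _preferred_lock:
        preferred = _preferred_endpoint
    last_error = None
    for ep in order_endpoints(endpoints, preferred):
        try:
            response = send(ep)
        except Exception as exc:
            last_error = exc
            # Forget a sticky endpoint that just failed (simplification, see lead-in).
            with _preferred_lock:
                if _preferred_endpoint == ep:
                    _preferred_endpoint = None
            continue
        # Remember the endpoint that worked so the next call starts there.
        with _preferred_lock:
            _preferred_endpoint = ep
        return ep, response
    raise RuntimeError(f"all endpoints failed: {last_error}")
```

Reading and writing the cached value under a lock, but performing the network call outside it, keeps concurrent requests from serializing on the upstream round trip.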
@@ -5087,6 +5470,7 @@ class Handler(http.server.BaseHTTPRequestHandler):
fwd = forwarded_headers(self.headers, {
"Content-Type": "application/json",
"Authorization": f"Bearer {r_key}",
**_openrouter_extra(),
}, browser_ua=True)
print(f"[{self._session_id}] trying route '{route.get('name', r_url)}' model={r_model}", file=sys.stderr)
req = urllib.request.Request(target, data=json.dumps(chat_body).encode(), headers=fwd)
@@ -5349,6 +5733,7 @@ class Handler(http.server.BaseHTTPRequestHandler):
"Content-Type": "application/json",
"x-api-key": API_KEY,
"anthropic-version": "2023-06-01",
**_openrouter_extra(),
}),
)
self._forward(req, stream, model,
@@ -5416,7 +5801,7 @@ class Handler(http.server.BaseHTTPRequestHandler):
"threadId": thread_id,
}
fwd = forwarded_headers(self.headers, headers_extra, browser_ua=True)
fwd = forwarded_headers(self.headers, {**headers_extra, **_openrouter_extra()}, browser_ua=True)
print(f"[{self._session_id}] POST {target} model={model} stream={stream} attempt={attempt} [command-code]", file=sys.stderr)
req = urllib.request.Request(
target,
@@ -5950,7 +6335,7 @@ class Handler(http.server.BaseHTTPRequestHandler):
req_body["reasoning_effort"] = REASONING_EFFORT
req_body_b = json.dumps(req_body).encode()
fwd = forwarded_headers(self.headers, headers_extra, browser_ua=True)
fwd = forwarded_headers(self.headers, {**headers_extra, **_openrouter_extra()}, browser_ua=True)
print(f"[auto-sense] POST {target} model={model} attempt={attempt} schema={schema.hints()}", file=sys.stderr)
req = urllib.request.Request(target, data=req_body_b, headers=fwd)
@@ -6165,9 +6550,42 @@ def _handle_shutdown_signal(sig, frame):
if 'SERVER' in globals() and SERVER:
SERVER.shutdown()
def _anti_stall_cleanup():
    my_pid = os.getpid()
    my_port = PORT
    killed = []
    try:
        import subprocess as _sp
        out = _sp.run(["pgrep", "-f", "translate-proxy"], capture_output=True, text=True, timeout=5).stdout.strip()
        for pid_str in out.splitlines():
            pid_str = pid_str.strip()
            if not pid_str or not pid_str.isdigit():
                continue
            pid = int(pid_str)
            if pid == my_pid:
                continue
            try:
                os.kill(pid, signal.SIGTERM)
                killed.append(pid)
            except (ProcessLookupError, PermissionError):
                pass
    except Exception:
        pass
    try:
        _cache_dir = os.path.join(os.path.dirname(os.path.abspath(__file__)), "__pycache__")
        if os.path.isdir(_cache_dir):
            import shutil
            shutil.rmtree(_cache_dir, ignore_errors=True)
    except Exception:
        pass
    if killed:
        print(f"[anti-stall] killed {len(killed)} stale proxy process(es): {killed}", flush=True)
        time.sleep(1)
def main():
global SERVER, _START_TIME
_START_TIME = time.time()
_anti_stall_cleanup()
_init_runtime()
try:
_current_cfg = os.path.basename(args.config) if args.config else ""
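The PID filtering inside `_anti_stall_cleanup` can be separated from the process-killing side effects, which makes it easy to check without spawning anything. A minimal sketch, assuming the same `pgrep` output format of one PID per line; `stale_pids` is a hypothetical helper name, not part of the diff.

```python
def stale_pids(pgrep_output, my_pid):
    """Return PIDs from pgrep-style output, skipping junk lines and our own PID."""
    pids = []
    for line in pgrep_output.splitlines():
        token = line.strip()
        # isdigit() is False for empty strings, so blank lines are skipped too.
        if not token.isdigit():
            continue
        pid = int(token)
        if pid != my_pid:
            pids.append(pid)
    return pids
```

Because `pgrep -f` matches against the full command line, any process whose arguments contain `translate-proxy` (an editor with the file open, for instance) would also appear in the list, so callers may want a stricter pattern before sending `SIGTERM`.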


@@ -1300,6 +1300,26 @@ def forwarded_headers(request_headers, extra=None, browser_ua=False):
headers.update(extra)
return headers
def _openrouter_extra():
    if not TARGET_URL:
        return {}
    if "z.ai" in TARGET_URL:
        return {
            "HTTP-Referer": "https://openclaw.ai",
            "X-OpenRouter-Title": "OpenClaw",
            "X-OpenRouter-Categories":
                "cli-agent,cloud-agent,programming-app,creative-writing,"
                "writing-assistant,general-chat,personal-agent",
        }
    if "openrouter.ai" in TARGET_URL:
        return {
            "HTTP-Referer": "https://chats-llm.com",
            "X-OpenRouter-Title": "Chats-LLM",
            "X-OpenRouter-Categories": "general-chat, ide-extension",
            "X-OpenRouter-Cache": "true",
        }
    return {}
_MAX_INPUT_ITEMS = 30
_MAX_TOOL_OUTPUT_CHARS = 8000
_COMPACT_KEEP_RECENT = 10
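`_openrouter_extra` selects headers with substring checks such as `"z.ai" in TARGET_URL`, which would also match an unrelated host like `jazz.ai` or a URL that merely mentions the domain in its query string. One possible alternative, shown here only as a sketch, compares the parsed hostname instead; `host_matches` is a hypothetical helper and the URLs in the usage note are made up for illustration.

```python
from urllib.parse import urlparse


def host_matches(target_url, domain):
    """True if target_url's hostname is `domain` or a subdomain of it."""
    # urlparse(...).hostname is None for empty or scheme-less input.
    host = (urlparse(target_url).hostname or "").lower()
    return host == domain or host.endswith("." + domain)
```

With this helper, `host_matches("https://api.z.ai/v1", "z.ai")` is true while `host_matches("https://jazz.ai/v1", "z.ai")` is false.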
@@ -4237,6 +4257,177 @@ def _auto_continue_gemini(handler, flush_event, message_id, model, gen_config, g
break
return accumulated_text
_ANTIGRAVITY_MAX_CONTENTS = 20
_ANTIGRAVITY_MAX_TOOL_VERBATIM = 2
_ANTIGRAVITY_MAX_TOOL_CHARS = 2000
_ANTIGRAVITY_MAX_OLD_SUMMARY_CHARS = 1200
_ANTIGRAVITY_SOFT_CHARS = 120000
_ANTIGRAVITY_HARD_CHARS = 250000
_ANTIGRAVITY_EMERGENCY_CHARS = 500000
_ANTIGRAVITY_SIMPLE_WORDS = frozenset({"hi", "hello", "hey", "test", "ping", "thanks", "thank you", "ok", "okay", "yes", "no", "cool", "nice", "good", "great", "done", "go", "stop", "yep", "nope", "sure", "right", "correct", "continue", "cont", "k", "thx", "ty", "np", "lol", "brb", "bye"})
_ANTIGRAVITY_EDIT_WORDS = frozenset(("change", "fix", "update", "redesign", "rewrite", "modify", "improve", "replace", "edit", "make it", "add", "remove", "delete", "rename", "move", "convert", "create", "build", "implement"))
_ANTIGRAVITY_REFERENCE_WORDS = frozenset(("previous", "file", "error", "again", "that", "this", "it", "same", "last", "above", "earlier", "before", "earlier output", "last error", "previous result", "what was", "show me", "give me"))
def _antigravity_is_simple_user(text):
    if not text:
        return True
    stripped = text.strip().lower()
    if stripped in _ANTIGRAVITY_SIMPLE_WORDS:
        return True
    if len(stripped) < 30:
        words = set(stripped.split())
        if not words.intersection(_ANTIGRAVITY_REFERENCE_WORDS) and not words.intersection(_ANTIGRAVITY_EDIT_WORDS):
            return True
    return False
def _antigravity_normalize_context(input_data):
if not isinstance(input_data, list) or len(input_data) < 2:
return input_data
latest_user = ""
latest_user_idx = -1
for i in range(len(input_data) - 1, -1, -1):
item = input_data[i]
if isinstance(item, dict) and item.get("type") == "message" and item.get("role") == "user":
c = item.get("content", "")
if isinstance(c, str):
latest_user = c
elif isinstance(c, list):
latest_user = "\n".join(p.get("text", p.get("input_text", "")) for p in c if isinstance(p, dict))
latest_user_idx = i
break
if not latest_user:
return input_data
is_simple = _antigravity_is_simple_user(latest_user)
n_raw = len(input_data)
n_tool_outputs = sum(1 for it in input_data if isinstance(it, dict) and it.get("type") == "function_call_output")
n_tool_calls = sum(1 for it in input_data if isinstance(it, dict) and it.get("type") == "function_call")
auto_reset = (n_raw > 200 or n_tool_outputs > 20) and is_simple
if os.environ.get("ANTIGRAVITY_AUTO_RESET_POLLUTED_CONTEXT", "1") != "1":
auto_reset = False
if is_simple and (auto_reset or n_tool_outputs == 0):
system_items = [it for it in input_data if isinstance(it, dict) and it.get("type") == "message" and it.get("role") in ("developer", "system")]
user_item = input_data[latest_user_idx]
result = system_items + [user_item] if system_items else [user_item]
print(f"[antigravity-context] raw_items={n_raw} compacted_items={n_raw} final_items={len(result)}", file=sys.stderr)
print(f"[antigravity-context] raw_tool_outputs={n_tool_outputs} kept_tool_outputs=0", file=sys.stderr)
print(f"[antigravity-context] simple_latest_user=true auto_reset={auto_reset}", file=sys.stderr)
return result
dev_messages = []
recent_items = []
tool_outputs = []
other_items = []
for i, item in enumerate(input_data):
if not isinstance(item, dict):
continue
t = item.get("type")
if t == "message" and item.get("role") in ("developer", "system"):
dev_messages.append(item)
elif t == "function_call_output":
tool_outputs.append((i, item))
elif t in ("function_call",):
other_items.append((i, item))
elif t == "message":
recent_items.append((i, item))
latest_words = set(latest_user.strip().lower().split())
has_edit_intent = bool(latest_words.intersection(_ANTIGRAVITY_EDIT_WORDS))
has_ref_intent = bool(latest_words.intersection(_ANTIGRAVITY_REFERENCE_WORDS))
keep_tools = 2 if (has_edit_intent or has_ref_intent) else 1
kept_tools = tool_outputs[-keep_tools:] if tool_outputs and (has_edit_intent or has_ref_intent) else []
for idx_t, t_item in enumerate(kept_tools):
orig = t_item[1]
out = orig.get("output", "")
if isinstance(out, str) and len(out) > _ANTIGRAVITY_MAX_TOOL_CHARS:
new_item = dict(orig)
new_item["output"] = out[:_ANTIGRAVITY_MAX_TOOL_CHARS] + f"\n... [truncated: kept {_ANTIGRAVITY_MAX_TOOL_CHARS} of {len(out)} chars]"
kept_tools[idx_t] = (t_item[0], new_item)
n_summarized = len(tool_outputs) - len(kept_tools)
tail_start = max(0, len(recent_items) - 6)
recent_tail = recent_items[tail_start:]
tool_call_ids = set()
for _, t_item in kept_tools:
cid = t_item.get("call_id", t_item.get("id", ""))
if cid:
tool_call_ids.add(cid)
paired_calls = []
for idx, item in other_items:
cid = item.get("call_id", item.get("id", ""))
if cid in tool_call_ids:
paired_calls.append((idx, item))
result = list(dev_messages)
if n_summarized > 0:
summary_text = f"[Tool history summary: {n_summarized} older tool outputs omitted. {n_tool_calls} prior function calls were made for file inspection/editing.]"
result.append({"type": "message", "role": "user", "content": [{"type": "input_text", "text": summary_text}]})
for _, call_item in paired_calls:
result.append(call_item)
for _, tool_item in kept_tools:
result.append(tool_item)
for _, msg_item in recent_tail:
if msg_item is not input_data[latest_user_idx]:
result.append(msg_item)
latest_norm = " ".join(latest_user.strip().split())[:200].lower()
already_present = False
for r in result:
if isinstance(r, dict) and r.get("type") == "message" and r.get("role") == "user":
c = r.get("content", "")
if isinstance(c, str):
rn = " ".join(c.strip().split())[:200].lower()
elif isinstance(c, list):
combined = " ".join(p.get("text", p.get("input_text", "")) for p in c if isinstance(p, dict))
rn = " ".join(combined.strip().split())[:200].lower()
else:
rn = ""
if rn == latest_norm:
already_present = True
break
if not already_present:
result.append(input_data[latest_user_idx])
total_chars = sum(len(json.dumps(it, ensure_ascii=False)) for it in result)
if total_chars > _ANTIGRAVITY_EMERGENCY_CHARS:
print(f"[antigravity-context] EMERGENCY: {total_chars} chars exceeds limit, resetting to minimal", file=sys.stderr)
result = list(dev_messages) + [input_data[latest_user_idx]]
total_chars = sum(len(json.dumps(it, ensure_ascii=False)) for it in result)
while len(result) > _ANTIGRAVITY_MAX_CONTENTS and total_chars > _ANTIGRAVITY_SOFT_CHARS:
for i in range(1, len(result) - 1):
if isinstance(result[i], dict) and result[i].get("type") in ("message", "function_call_output"):
removed = result.pop(i)
total_chars -= len(json.dumps(removed, ensure_ascii=False))
break
else:
break
est_tokens = total_chars // 4
print(f"[antigravity-context] raw_items={n_raw} final_items={len(result)}", file=sys.stderr)
print(f"[antigravity-context] raw_tool_outputs={n_tool_outputs} kept_tool_outputs={len(kept_tools)} summarized_tool_outputs={n_summarized}", file=sys.stderr)
print(f"[antigravity-context] simple_latest_user={is_simple} auto_reset={auto_reset}", file=sys.stderr)
print(f"[antigravity-context] final_chars={total_chars} estimated_tokens={est_tokens}", file=sys.stderr)
return result
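The intent checks above intersect `set(text.split())` with `_ANTIGRAVITY_EDIT_WORDS` and `_ANTIGRAVITY_REFERENCE_WORDS`. Multi-word entries such as `"make it"` or `"show me"` can never equal a single whitespace-split token, and a token with trailing punctuation (`"fix,"`) does not equal `"fix"` either. A phrase-aware check could look like the sketch below; `contains_any` is a hypothetical helper, and the phrase tuples in the usage note are illustrative subsets rather than the real sets.

```python
import re


def contains_any(text, phrases):
    """True if any phrase occurs in text as whole words, ignoring case and extra spaces."""
    normalized = " ".join(text.lower().split())
    for phrase in phrases:
        # \b anchors keep "fix" from matching inside "prefix".
        pattern = r"\b" + re.escape(phrase) + r"\b"
        if re.search(pattern, normalized):
            return True
    return False
```

For example, `contains_any("Please make it   blue", ("make it",))` is true, whereas `set("please make it blue".split()) & {"make it"}` is empty.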
class Handler(http.server.BaseHTTPRequestHandler):
protocol_version = "HTTP/1.1"
@@ -4450,6 +4641,7 @@ class Handler(http.server.BaseHTTPRequestHandler):
fwd = forwarded_headers(self.headers, {
"Content-Type": "application/json",
"Authorization": f"Bearer {effective_key}",
**_openrouter_extra(),
}, browser_ua=True)
print(f"[{self._session_id}] POST {target} model={model} stream={stream} items={len(input_data) if isinstance(input_data,list) else 1}", file=sys.stderr)
chat_body_b = json.dumps(chat_body).encode()
@@ -4601,6 +4793,11 @@ class Handler(http.server.BaseHTTPRequestHandler):
body = dict(body)
body["input"] = input_data
if OAUTH_PROVIDER == "google-antigravity" and isinstance(input_data, list):
input_data = _antigravity_normalize_context(input_data)
body = dict(body)
body["input"] = input_data
access_token = _refresh_oauth_token()
token_name = "google-antigravity-oauth-token.json" if OAUTH_PROVIDER == "google-antigravity" else "google-cli-oauth-token.json"
token_path = os.path.join(os.path.expanduser("~"), ".cache", "codex-proxy", token_name)
@@ -4721,7 +4918,26 @@ class Handler(http.server.BaseHTTPRequestHandler):
if body.get("top_p") is not None:
gen_config["topP"] = body["top_p"]
if REASONING_ENABLED and REASONING_EFFORT != "none":
_is_claude_model = "claude" in model.lower()
_is_claude_thinking = _is_claude_model and "thinking" in model.lower()
if OAUTH_PROVIDER == "google-antigravity" and _is_claude_thinking:
if REASONING_ENABLED and REASONING_EFFORT != "none":
budget = {"low": 8192, "medium": 16384, "high": 32768}.get(REASONING_EFFORT, 16384)
else:
budget = 16384
gen_config["thinkingConfig"] = {
"include_thoughts": True,
"thinking_budget": budget,
}
current_max = gen_config.get("maxOutputTokens", 0)
if not current_max or current_max <= budget:
gen_config["maxOutputTokens"] = 64000
print(f"[antigravity-claude] thinking model={model} budget={budget} maxOutputTokens={gen_config.get('maxOutputTokens')}", file=sys.stderr)
elif OAUTH_PROVIDER == "google-antigravity" and _is_claude_model:
if "thinkingConfig" in gen_config:
del gen_config["thinkingConfig"]
elif REASONING_ENABLED and REASONING_EFFORT != "none":
budget = {"low": 2048, "medium": 8192, "high": 24576}.get(REASONING_EFFORT, 8192)
gen_config["thinkingConfig"] = {"includeThoughts": True, "thinkingBudget": budget}
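The budget selection for Claude thinking models in this hunk reduces to a pure function of the reasoning settings and the current output cap. The sketch below restates that logic with the same numbers as the diff; `claude_thinking_limits` and `CLAUDE_BUDGETS` are hypothetical names introduced here so the rule can be checked in isolation.

```python
CLAUDE_BUDGETS = {"low": 8192, "medium": 16384, "high": 32768}


def claude_thinking_limits(effort, reasoning_enabled, current_max=0):
    """Return (thinking_budget, max_output_tokens) for a Claude thinking model."""
    if reasoning_enabled and effort != "none":
        budget = CLAUDE_BUDGETS.get(effort, 16384)
    else:
        budget = 16384
    # The output cap must exceed the thinking budget; otherwise raise it to 64000.
    max_out = current_max
    if not current_max or current_max <= budget:
        max_out = 64000
    return budget, max_out
```

A caller-supplied cap that already exceeds the budget is left untouched, so `claude_thinking_limits("low", True, 100000)` keeps `100000` as the output limit.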
@@ -4801,6 +5017,11 @@ class Handler(http.server.BaseHTTPRequestHandler):
if gemini_tools:
request_body["tools"] = gemini_tools
if OAUTH_PROVIDER == "google-antigravity" and _is_claude_model and gemini_tools:
request_body["toolConfig"] = {"functionCallingConfig": {"mode": "VALIDATED"}}
if _is_claude_thinking:
print("[antigravity-claude] applied VALIDATED toolConfig for thinking model", file=sys.stderr)
wrapped = {
"project": project_id,
"model": model,
@@ -4811,13 +5032,17 @@ class Handler(http.server.BaseHTTPRequestHandler):
wrapped["userAgent"] = "antigravity"
wrapped["requestId"] = f"agent-{uuid.uuid4().hex[:12]}"
endpoints = ([
"https://cloudcode-pa.googleapis.com",
"https://daily-cloudcode-pa.sandbox.googleapis.com",
"https://autopush-cloudcode-pa.sandbox.googleapis.com",
] if OAUTH_PROVIDER == "google-antigravity" else [
"https://cloudcode-pa.googleapis.com",
])
_allow_staging = os.environ.get("ALLOW_ANTIGRAVITY_STAGING", "0") == "1"
if OAUTH_PROVIDER == "google-antigravity":
_antigravity_endpoints = ["https://cloudcode-pa.googleapis.com"]
if _allow_staging:
_antigravity_endpoints.extend([
"https://daily-cloudcode-pa.sandbox.googleapis.com",
"https://autopush-cloudcode-pa.sandbox.googleapis.com",
])
endpoints = _antigravity_endpoints
else:
endpoints = ["https://cloudcode-pa.googleapis.com"]
action = "streamGenerateContent" if stream else "generateContent"
url_suffix = f"v1internal:{action}?alt=sse" if stream else f"v1internal:{action}"
@@ -4866,7 +5091,7 @@ class Handler(http.server.BaseHTTPRequestHandler):
if e.code == 403 and "SERVICE_DISABLED" in err_body[:500] and ep != endpoints[-1]:
print(f"[{self._session_id}] {ep} SERVICE_DISABLED, trying next endpoint", file=sys.stderr)
continue
if e.code == 429 and ep != endpoints[-1]:
if e.code == 429 and ep != endpoints[-1] and _allow_staging:
print(f"[{self._session_id}] {ep} HTTP 429, trying next endpoint", file=sys.stderr)
continue
if e.code == 429:
@@ -5087,6 +5312,7 @@ class Handler(http.server.BaseHTTPRequestHandler):
fwd = forwarded_headers(self.headers, {
"Content-Type": "application/json",
"Authorization": f"Bearer {r_key}",
**_openrouter_extra(),
}, browser_ua=True)
print(f"[{self._session_id}] trying route '{route.get('name', r_url)}' model={r_model}", file=sys.stderr)
req = urllib.request.Request(target, data=json.dumps(chat_body).encode(), headers=fwd)
@@ -5349,6 +5575,7 @@ class Handler(http.server.BaseHTTPRequestHandler):
"Content-Type": "application/json",
"x-api-key": API_KEY,
"anthropic-version": "2023-06-01",
**_openrouter_extra(),
}),
)
self._forward(req, stream, model,
@@ -5416,7 +5643,7 @@ class Handler(http.server.BaseHTTPRequestHandler):
"threadId": thread_id,
}
fwd = forwarded_headers(self.headers, headers_extra, browser_ua=True)
fwd = forwarded_headers(self.headers, {**headers_extra, **_openrouter_extra()}, browser_ua=True)
print(f"[{self._session_id}] POST {target} model={model} stream={stream} attempt={attempt} [command-code]", file=sys.stderr)
req = urllib.request.Request(
target,
@@ -5950,7 +6177,7 @@ class Handler(http.server.BaseHTTPRequestHandler):
req_body["reasoning_effort"] = REASONING_EFFORT
req_body_b = json.dumps(req_body).encode()
fwd = forwarded_headers(self.headers, headers_extra, browser_ua=True)
fwd = forwarded_headers(self.headers, {**headers_extra, **_openrouter_extra()}, browser_ua=True)
print(f"[auto-sense] POST {target} model={model} attempt={attempt} schema={schema.hints()}", file=sys.stderr)
req = urllib.request.Request(target, data=req_body_b, headers=fwd)