6044 lines
288 KiB
Python
Executable File
6044 lines
288 KiB
Python
Executable File
#!/usr/bin/env python3
|
||
"""
|
||
translate-proxy.py — Responses API → backend API translation proxy.
|
||
|
||
Backends:
|
||
openai-compat — any OpenAI-compatible Chat Completions API
|
||
anthropic — Anthropic Messages API
|
||
command-code — CommandCode /alpha/generate (Z.AI GLM Coding Plan)
|
||
|
||
Usage:
|
||
python3 translate-proxy.py --config proxy-config.json
|
||
python3 translate-proxy.py --backend command-code --target-url https://... --api-key sk-...
|
||
|
||
═══════════════════════════════════════════════════════════════════
|
||
COMMANDCODE ADAPTER — FIX HISTORY (2026-05-22)
|
||
═══════════════════════════════════════════════════════════════════
|
||
|
||
This file contains multiple rounds of fixes for the CommandCode adapter.
|
||
Each fix addresses a specific failure mode observed in production.
|
||
They are documented here for future maintainability.
|
||
|
||
FIX 1: Content blocks rejected by CC API (root cause of initial 400 errors)
|
||
Symptom: {"error":{"message":"params.messages[i].content expected string, received array"}}
|
||
Cause: cc_input_to_messages emitted tool results as content blocks [{"type":"tool_result",...}]
|
||
Fix: All messages now use string content. Tool results as role="user" with plain text.
|
||
Location: cc_input_to_messages() ~line 1085
|
||
|
||
FIX 2: x-command-code-version header dropped during rewrite
|
||
Symptom: HTTP 403 upgrade_required from CommandCode API
|
||
Cause: _handle_command_code rewrite removed the header line
|
||
Fix: Always send x-command-code-version header with fallback "0.26.8"
|
||
Location: _handle_command_code() header setup block
|
||
|
||
FIX 3: Stale schema cache with wrong content_type=array
|
||
Symptom: SchemaAdapter used content_type="array" causing content blocks in auto path
|
||
Cause: ErrorAnalyzer learned incorrect schema from error message text
|
||
Fix: Cleared provider-caps.json; added 24h staleness TTL to _load_schema()
|
||
Location: _load_schema(), provider-caps.json
|
||
|
||
FIX 4: Stream disconnect before completion (client-side "stream disconnected")
|
||
Symptom: Client sees partial SSE then connection close, no response.completed event
|
||
Cause: No try/except around streaming path; exceptions crashed handler mid-stream
|
||
Fix: Wrapped stream_buffered_events in try/except; sends response.completed(status:"failed") on crash
|
||
Location: _handle_command_code() streaming section
|
||
|
||
FIX 5: Tool calls echoed as text instead of being parsed (THE BIG ONE)
|
||
Symptom: Model generates inline JSON tool calls like {"type":"tool-call","id":"...","name":"exec_command","arguments":"{...}"}
|
||
These appear as raw text in the conversation. The tool is never executed.
|
||
Root cause chain:
|
||
a) cc_input_to_messages sends tool calls as inline JSON text in assistant messages
|
||
b) The CC model echoes back similar JSON in its text-delta response
|
||
c) _parse_commandcode_text_tool_calls only handled XML format (```
|
||
<tool>``)
|
||
d) Raw JSON tool calls passed through as plain text → client shows them unparsed
|
||
Fix: Added _extract_raw_json_tool_calls() with field-level regex extraction.
|
||
Handles BOTH malformed (unescaped inner quotes) AND properly escaped JSON.
|
||
Three-tier parse: direct json.loads → unescape \"→\" → unicode_escape decode.
|
||
Location: _extract_args(), _extract_field(), _extract_raw_json_tool_calls()
|
||
|
||
FIX 6: Double-wrapped arguments (nested {"cmd": "{\"cmd\": \"curl...\"}"}")
|
||
Symptom: args={"cmd": "{\\\"cmd\\\": \\\"curl...\\\"}"}
|
||
Tool executor receives cmd = the literal string '{"cmd": "curl..."', not the actual curl command.
|
||
Root cause: When model generates properly escaped JSON ("arguments": "{\\"cmd\\": \\"...\\"}"),
|
||
_extract_args naive brace-counting returns raw text with escaped quotes.
|
||
json.loads(raw) fails on \\ at structural level.
|
||
Fallback sets args["cmd"] = raw_string → double-wrapped.
|
||
Fix: _extract_args now tries 3 parse strategies before returning.
|
||
Also normalizes sandbox_permissions from parsed args dict (not raw snippet).
|
||
Location: _extract_args() three-tier parser, sandbox_permissions normalization
|
||
|
||
FIX 7: _extract_field can't read values starting with \"
|
||
Symptom: sandbox_permissions="allow_all" passes through unnormalized because
|
||
_extract_field sees val_start=\ (backslash) which != " or { → returns None
|
||
Fix: Skip leading backslash before checking for " or { value type.
|
||
Location: _extract_field() leading-\ skip
|
||
|
||
FIX 8: Adaptive probing caused format mismatch (REVERTED)
|
||
Symptom: Probe system discovered OpenAI tool_calls+role=tool format but CC API couldn't
|
||
process multi-turn tool loops correctly with it.
|
||
Fix: Removed probe system entirely. Use conservative format only:
|
||
- Inline JSON text for tool calls (cc_input_to_messages default)
|
||
- role="user" for all tool results
|
||
- ErrorAnalyzer learning on retries (not proactive probes)
|
||
Location: Reverted to cc_input_to_messages(), removed _build_cc_messages + _probe_cc_format
|
||
|
||
FIX 21: DSML parser silently drops tool calls when model uses name="cmd" (THE HALT BUG)
|
||
Symptom: Codex CLI stops mid-task. Model generates valid DSML exec_command with
|
||
<||DSML||parameter name="cmd" string="true">curl ...
|
||
Parser returns parsed_tool_calls=0. Client sees text output but no tool to execute.
|
||
CLI has nothing to do and halts.
|
||
Root cause: Line 1798 had `if key == "command":` — only matching parameter name="command".
|
||
The actual tool schema defines the parameter as "cmd" (see exec_command schema).
|
||
When DeepSeek generates name="cmd", the key "cmd" != "command", so cmd stays None,
|
||
and line 1825-1826 `if not cmd: continue` silently skips the entire tool call.
|
||
The XML parser (line 2205) already handled both: `params.get("command") or params.get("cmd")`
|
||
but the DSML parser did not.
|
||
Fix: Changed to `if key in ("command", "cmd"):` in the DSML parameter loop.
|
||
Test: Pattern L self-test verifies DSML with name="cmd" is parsed correctly.
|
||
Location: _parse_commandcode_text_tool_calls() DSML parameter loop, self-test Pattern L
|
||
|
||
════════════════════════════════════════════════════════════════════
|
||
INTELLIGENCE ROUTING — Self-Healing Parser System (v3.7.0)
|
||
════════════════════════════════════════════════════════════════════
|
||
|
||
Problem: The Command Code model produces output in unpredictable formats
|
||
that change between sessions and models. When the multi-format parser chain
|
||
(DSML → <bash> → <explore_agent> → <tool_call type=...> → XML → raw JSON →
|
||
fallback regex) returns empty, the Codex agent loop has zero tool calls and
|
||
STALLS — the user sees the model "thinking" but nothing happens.
|
||
|
||
Intelligence Routing is a three-layer self-healing system:
|
||
|
||
LAYER 1 — Deep URL Extraction (FIX 23)
|
||
The <explore_agent> handler was failing because URLs were hidden inside
|
||
nested JSON: messages: [{"content": "https://..."}]. The regex couldn't
|
||
find them because it excluded the " character that terminates JSON values.
|
||
|
||
Solution: _build_explore_cmd() is now a module-level function (was a
|
||
closure). After the initial regex fails, it tries json.loads() on the
|
||
text, iterates list items, and extracts the "content" field to find URLs.
|
||
Also added " to the regex exclusion set and rstrip characters.
|
||
|
||
LAYER 2 — Escalation Block Handling (FIX 24)
|
||
The model produces <require_escalation> and <request_escalation_permission>
|
||
blocks when it wants elevated permissions. The CC adapter doesn't support
|
||
escalation — these blocks were silently dropped, causing parsed_tool_calls=0.
|
||
|
||
Solution: Two handlers:
|
||
- FIX 24a: Closed-tag blocks — extracts URL if present, runs explore cmd;
|
||
otherwise echoes auto-proceed message.
|
||
- FIX 24b: Bare/unclosed tags (<require_escalation />) — auto-proceeds.
|
||
|
||
LAYER 3 — Intent-Based Command Synthesis (FIX 25, THE CORE)
|
||
When ALL parsers return empty and text has content, the system plays
|
||
detective using 5 heuristics in priority order:
|
||
|
||
1. URL detected in text → curl to fetch it
|
||
2. File path reference → cat or ls that file
|
||
3. Shell command in backticks/quotes → extract and run
|
||
4. "explore"/"fetch"/"investigate" intent + last user URL → explore cmd
|
||
5. "I need to"/"let me"/"please" intent text → echo diagnostic
|
||
|
||
This ensures the agent loop ALWAYS has a tool call to execute, even when
|
||
the model's output format is completely unrecognized. The loop never stalls.
|
||
|
||
Architecture:
|
||
_parse_commandcode_text_tool_calls() — LAYER 1 + LAYER 2
|
||
cc_stream_to_sse() — LAYER 3 (runs after parser chain + fallback)
|
||
|
||
The _last_user_urls deque (maxlen=20) tracks URLs from user messages
|
||
across the session, giving Layer 3 heuristic 4 a URL to work with.
|
||
|
||
Self-tests: 54 patterns (was 41) covering all three layers.
|
||
|
||
════════════════════════════════════════════════════════════════════
|
||
"""
|
||
|
||
import json, http.server, socketserver, urllib.request, urllib.parse, urllib.error, re
|
||
import time, uuid, os, sys, argparse, threading, socket, collections, contextlib, signal
|
||
import dataclasses
|
||
import http.client
|
||
import selectors
|
||
|
||
# ═══════════════════════════════════════════════════════════════════
|
||
# Config
|
||
# ═══════════════════════════════════════════════════════════════════
|
||
|
||
DEFAULT_MODELS = {
|
||
"openai-compat": [
|
||
{"id": "gpt-4o-mini", "object": "model", "created": 1700000000, "owned_by": "custom"},
|
||
],
|
||
"anthropic": [
|
||
{"id": "claude-sonnet-4-20250514", "object": "model", "created": 1700000000, "owned_by": "anthropic"},
|
||
],
|
||
"freebuff": [
|
||
{"id": "deepseek/deepseek-v4-pro", "object": "model", "created": 1700000000, "owned_by": "freebuff"},
|
||
{"id": "deepseek/deepseek-v4-flash", "object": "model", "created": 1700000000, "owned_by": "freebuff"},
|
||
{"id": "moonshotai/kimi-k2.6", "object": "model", "created": 1700000000, "owned_by": "freebuff"},
|
||
{"id": "minimax/minimax-m2.7", "object": "model", "created": 1700000000, "owned_by": "freebuff"},
|
||
],
|
||
"auto": [
|
||
{"id": "default-model", "object": "model", "created": 1700000000, "owned_by": "auto"},
|
||
],
|
||
}
|
||
|
||
def load_config():
|
||
p = argparse.ArgumentParser(description="Responses API translation proxy")
|
||
p.add_argument("--config", help="JSON config file path")
|
||
p.add_argument("--port", type=int, default=None)
|
||
p.add_argument("--backend", default=None, choices=["openai-compat", "anthropic", "command-code", "freebuff", "auto"])
|
||
p.add_argument("--target-url", default=None)
|
||
p.add_argument("--api-key", default=None)
|
||
p.add_argument("--models-file", default=None, help="JSON file with model list array")
|
||
args = p.parse_args()
|
||
|
||
cfg = {}
|
||
if args.config:
|
||
with open(args.config) as f:
|
||
cfg = json.load(f)
|
||
|
||
for ck, ak in [("port", "port"), ("backend_type", "backend"),
|
||
("target_url", "target_url"), ("api_key", "api_key")]:
|
||
v = getattr(args, ak, None)
|
||
if v is not None:
|
||
cfg[ck] = v
|
||
|
||
env_map = {
|
||
"port": ("PROXY_PORT", "ZAI_PROXY_PORT", int),
|
||
"backend_type": ("PROXY_BACKEND", None, str),
|
||
"target_url": ("PROXY_TARGET_URL", "ZAI_BASE_URL", str),
|
||
"api_key": ("PROXY_API_KEY", "ZAI_API_KEY", str),
|
||
}
|
||
for ck, (ev1, ev2, conv) in env_map.items():
|
||
if ck not in cfg:
|
||
v = os.environ.get(ev1) or (os.environ.get(ev2) if ev2 else None)
|
||
if v:
|
||
cfg[ck] = conv(v) if conv == int else v
|
||
|
||
cfg.setdefault("port", 8080)
|
||
cfg.setdefault("backend_type", "openai-compat")
|
||
cfg.setdefault("target_url", "http://localhost:11434/v1")
|
||
cfg.setdefault("api_key", "")
|
||
|
||
models = cfg.get("models", [])
|
||
if not models and args.models_file:
|
||
with open(args.models_file) as f:
|
||
models = json.load(f)
|
||
if not models:
|
||
models = DEFAULT_MODELS.get(cfg["backend_type"], [])
|
||
cfg["models"] = models
|
||
|
||
return cfg
|
||
|
||
CONFIG = None
|
||
PORT = 8080
|
||
BACKEND = "openai-compat"
|
||
TARGET_URL = ""
|
||
API_KEY = ""
|
||
OAUTH_PROVIDER = ""
|
||
MODELS = []
|
||
CC_VERSION = ""
|
||
REASONING_ENABLED = True
|
||
REASONING_EFFORT = "medium"
|
||
BGP_ROUTES = []
|
||
SERVER = None
|
||
|
||
_LOG_DIR = os.path.join(os.path.expanduser("~"), ".cache", "codex-proxy")
|
||
os.makedirs(_LOG_DIR, exist_ok=True)
|
||
_REQUESTS_DIR = os.path.join(_LOG_DIR, "requests")
|
||
os.makedirs(_REQUESTS_DIR, exist_ok=True)
|
||
_stats_path = os.path.join(_LOG_DIR, "usage-stats.json")
|
||
_provider_caps_path = os.path.join(_LOG_DIR, "provider-caps.json")
|
||
_stats_lock = threading.Lock()
|
||
_stats_pending = []
|
||
_stats_flush_timer = None
|
||
_STATS_FLUSH_INTERVAL = 5.0
|
||
_STATS = {}
|
||
|
||
try:
|
||
_LOG_FILE = open(os.path.join(_LOG_DIR, "proxy.log"), "a")
|
||
except Exception:
|
||
_LOG_FILE = None
|
||
|
||
_response_store = collections.OrderedDict()
|
||
_response_store_lock = threading.Lock()
|
||
_MAX_STORED = 50
|
||
_RESPONSE_TTL = 600
|
||
|
||
_fb_reasoning_store = collections.OrderedDict()
|
||
_fb_reasoning_store_lock = threading.Lock()
|
||
|
||
_deepseek_reasoning_store = {}
|
||
_deepseek_reasoning_lock = threading.Lock()
|
||
_MAX_DS_STORED = 100
|
||
|
||
_crof_lock = threading.Lock()
|
||
_provider_caps_lock = threading.Lock()
|
||
_provider_caps = None
|
||
|
||
_shutdown_requested = False
|
||
_active_connections = 0
|
||
_active_connections_lock = threading.Lock()
|
||
_active_requests = {}
|
||
_active_requests_lock = threading.Lock()
|
||
|
||
_pool = uuid.uuid4().hex[:8]
|
||
_antigravity_version = "1.18.3"
|
||
_antigravity_version_checked = 0
|
||
_antigravity_version_lock = threading.Lock()
|
||
_last_user_urls = collections.deque(maxlen=20)
|
||
|
||
_conn_pool_lock = threading.Lock()
|
||
_conn_pool = {}
|
||
|
||
_STREAM_IDLE_TIMEOUT = 300
|
||
|
||
_FREEBUFF_AUTH_URL = "https://freebuff.com"
|
||
_FREEBUFF_API_URL = "https://www.codebuff.com"
|
||
_FREEBUFF_AGENT_MAP = {
|
||
"deepseek/deepseek-v4-pro": "base2-free-deepseek",
|
||
"deepseek/deepseek-v4-flash": "base2-free-deepseek-flash",
|
||
"moonshotai/kimi-k2.6": "base2-free-kimi",
|
||
"minimax/minimax-m2.7": "base2-free",
|
||
}
|
||
_FREEBUFF_CREDS_PATH = os.path.join(os.path.expanduser("~"), ".config", "manicode", "credentials.json")
|
||
_freebuff_token_cache = {"token": None, "checked": 0}
|
||
_freebuff_session_cache = {"instance_id": None, "expires": 0, "model": None}
|
||
_freebuff_token_lock = threading.Lock()
|
||
|
||
def _get_freebuff_token():
|
||
with _freebuff_token_lock:
|
||
if _freebuff_token_cache["token"] and _freebuff_token_cache["checked"] > time.time() - 300:
|
||
return _freebuff_token_cache["token"]
|
||
try:
|
||
with open(_FREEBUFF_CREDS_PATH) as f:
|
||
creds = json.load(f)
|
||
default_account = creds.get("default", {})
|
||
token = default_account.get("authToken") or creds.get("apiKey") or ""
|
||
with _freebuff_token_lock:
|
||
_freebuff_token_cache["token"] = token
|
||
_freebuff_token_cache["checked"] = time.time()
|
||
return token
|
||
except Exception as e:
|
||
print(f"[freebuff] no credentials at {_FREEBUFF_CREDS_PATH}: {e}", file=sys.stderr)
|
||
return ""
|
||
|
||
def _freebuff_get_session(token, model):
|
||
with _freebuff_token_lock:
|
||
sc = _freebuff_session_cache
|
||
if sc["instance_id"] and sc["expires"] > time.time() + 60 and sc["model"] == model:
|
||
return sc["instance_id"]
|
||
try:
|
||
url = f"{_FREEBUFF_API_URL}/api/v1/freebuff/session"
|
||
body = json.dumps({"model": model}).encode()
|
||
req = urllib.request.Request(url, data=body, headers={
|
||
"Content-Type": "application/json",
|
||
"Authorization": f"Bearer {token}",
|
||
"User-Agent": "codex-launcher/3.9.0",
|
||
"x-freebuff-model": model,
|
||
})
|
||
resp = urllib.request.urlopen(req, timeout=15)
|
||
data = json.loads(resp.read())
|
||
instance_id = data.get("instanceId", data.get("data", {}).get("instance_id", ""))
|
||
expires_at = data.get("remainingMs", 0)
|
||
if instance_id:
|
||
with _freebuff_token_lock:
|
||
_freebuff_session_cache["instance_id"] = instance_id
|
||
_freebuff_session_cache["expires"] = time.time() + min(expires_at / 1000, 3600)
|
||
_freebuff_session_cache["model"] = model
|
||
print(f"[freebuff] session active, instance={instance_id[:8]}...", file=sys.stderr)
|
||
return instance_id
|
||
return None
|
||
except Exception as e:
|
||
print(f"[freebuff] session failed: {e}", file=sys.stderr)
|
||
return None
|
||
|
||
def _freebuff_start_run(token, agent_id):
|
||
url = f"{_FREEBUFF_API_URL}/api/v1/agent-runs"
|
||
body = json.dumps({"action": "START", "agentId": agent_id, "ancestorRunIds": []}).encode()
|
||
req = urllib.request.Request(url, data=body, headers={
|
||
"Content-Type": "application/json",
|
||
"Authorization": f"Bearer {token}",
|
||
"User-Agent": "codex-launcher/3.9.0",
|
||
})
|
||
try:
|
||
resp = urllib.request.urlopen(req, timeout=15)
|
||
data = json.loads(resp.read())
|
||
run_id = data.get("runId")
|
||
print(f"[freebuff] started run {run_id} for agent {agent_id}", file=sys.stderr)
|
||
return run_id
|
||
except urllib.error.HTTPError as e:
|
||
err = e.read().decode()[:300]
|
||
print(f"[freebuff] start run failed: HTTP {e.code}: {err}", file=sys.stderr)
|
||
return None
|
||
except Exception as e:
|
||
print(f"[freebuff] start run error: {e}", file=sys.stderr)
|
||
return None
|
||
|
||
def _freebuff_finish_run(token, run_id, status="completed"):
|
||
url = f"{_FREEBUFF_API_URL}/api/v1/agent-runs"
|
||
body = json.dumps({"action": "FINISH", "runId": run_id, "status": status,
|
||
"totalSteps": 1, "directCredits": 0, "totalCredits": 0}).encode()
|
||
req = urllib.request.Request(url, data=body, headers={
|
||
"Content-Type": "application/json",
|
||
"Authorization": f"Bearer {token}",
|
||
"User-Agent": "codex-launcher/3.9.0",
|
||
})
|
||
try:
|
||
urllib.request.urlopen(req, timeout=10)
|
||
except Exception as e:
|
||
print(f"[freebuff] finish run {run_id} error: {e}", file=sys.stderr)
|
||
|
||
# ═══════════════════════════════════════════════════════════════════
|
||
# Multi-account rotation system
|
||
# ═══════════════════════════════════════════════════════════════════
|
||
|
||
class AccountPool:
|
||
"""Manages multiple accounts for a provider. Rotates on rate-limit (429/426)."""
|
||
|
||
def __init__(self, provider_name):
|
||
self.provider_name = provider_name
|
||
self._lock = threading.Lock()
|
||
self._accounts = []
|
||
self._rate_limited = {}
|
||
self._current_idx = 0
|
||
self._loaded_at = 0
|
||
|
||
def load_accounts(self, force=False):
|
||
with self._lock:
|
||
if not force and self._accounts and time.time() - self._loaded_at < 60:
|
||
return len(self._accounts)
|
||
accounts = self._do_load()
|
||
with self._lock:
|
||
if accounts:
|
||
self._accounts = accounts
|
||
self._loaded_at = time.time()
|
||
for a in accounts:
|
||
key = a.get("id", a.get("email", ""))
|
||
if key not in self._rate_limited:
|
||
self._rate_limited[key] = 0
|
||
return len(self._accounts) if accounts else 0
|
||
|
||
def _do_load(self):
|
||
return []
|
||
|
||
def get(self):
|
||
"""Return the best available account dict, or None."""
|
||
self.load_accounts()
|
||
with self._lock:
|
||
if not self._accounts:
|
||
return None
|
||
now = time.time()
|
||
n = len(self._accounts)
|
||
for attempt in range(n):
|
||
idx = (self._current_idx + attempt) % n
|
||
acct = self._accounts[idx]
|
||
key = acct.get("id", acct.get("email", ""))
|
||
if self._rate_limited.get(key, 0) < now:
|
||
self._current_idx = idx
|
||
return acct
|
||
best_key = min(self._rate_limited, key=self._rate_limited.get)
|
||
wait = self._rate_limited[best_key] - now
|
||
print(f"[{self.provider_name}] all accounts rate-limited, earliest free in {wait:.0f}s", file=sys.stderr)
|
||
return self._accounts[self._current_idx]
|
||
|
||
def mark_rate_limited(self, account, duration=120):
|
||
key = account.get("id", account.get("email", ""))
|
||
with self._lock:
|
||
self._rate_limited[key] = time.time() + duration
|
||
idx = None
|
||
for i, a in enumerate(self._accounts):
|
||
if a.get("id", a.get("email", "")) == key:
|
||
idx = i
|
||
break
|
||
if idx is not None:
|
||
self._current_idx = (idx + 1) % len(self._accounts)
|
||
print(f"[{self.provider_name}] account {key} rate-limited for {duration}s, rotating to next", file=sys.stderr)
|
||
|
||
def advance(self):
|
||
with self._lock:
|
||
if self._accounts:
|
||
self._current_idx = (self._current_idx + 1) % len(self._accounts)
|
||
|
||
def status(self):
|
||
with self._lock:
|
||
now = time.time()
|
||
result = []
|
||
for a in self._accounts:
|
||
key = a.get("id", a.get("email", ""))
|
||
rl_until = self._rate_limited.get(key, 0)
|
||
info = {"id": key, "email": a.get("email", ""), "rate_limited": rl_until > now}
|
||
if rl_until > now:
|
||
info["rate_limited_until"] = rl_until
|
||
info["resets_in"] = int(rl_until - now)
|
||
result.append(info)
|
||
return result
|
||
|
||
class FreebuffAccountPool(AccountPool):
|
||
def _do_load(self):
|
||
if not os.path.exists(_FREEBUFF_CREDS_PATH):
|
||
return None
|
||
try:
|
||
with open(_FREEBUFF_CREDS_PATH) as f:
|
||
creds = json.load(f)
|
||
except Exception:
|
||
return None
|
||
accounts = []
|
||
if "accounts" in creds and isinstance(creds["accounts"], list):
|
||
for i, ac in enumerate(creds["accounts"]):
|
||
token = ac.get("authToken") or ac.get("apiKey") or ""
|
||
if token:
|
||
acct = {"id": ac.get("email") or ac.get("id") or f"account-{i}", "token": token, "email": ac.get("email", "")}
|
||
accounts.append(acct)
|
||
default = creds.get("default", {})
|
||
default_token = default.get("authToken") or creds.get("apiKey") or ""
|
||
if default_token:
|
||
default_id = default.get("email") or default.get("id") or "default"
|
||
if not any(a["id"] == default_id for a in accounts):
|
||
accounts.insert(0, {"id": default_id, "token": default_token, "email": default.get("email", "")})
|
||
return accounts if accounts else None
|
||
|
||
class GoogleAccountPool(AccountPool):
|
||
def __init__(self, variant):
|
||
super().__init__(f"google-{variant}")
|
||
self.variant = variant
|
||
|
||
def _do_load(self):
|
||
cache_dir = os.path.join(os.path.expanduser("~"), ".cache", "codex-proxy")
|
||
accounts = []
|
||
primary = f"google-{self.variant}-oauth-token.json"
|
||
primary_path = os.path.join(cache_dir, primary)
|
||
if os.path.exists(primary_path):
|
||
try:
|
||
with open(primary_path) as f:
|
||
tok = json.load(f)
|
||
token = tok.get("access_token", "")
|
||
if token:
|
||
accounts.append({"id": f"google-{self.variant}-primary", "token": token, "email": tok.get("email", ""), "_token_data": tok, "_path": primary_path})
|
||
except Exception:
|
||
pass
|
||
idx = 1
|
||
while True:
|
||
extra = f"google-{self.variant}-oauth-token-{idx}.json"
|
||
extra_path = os.path.join(cache_dir, extra)
|
||
if not os.path.exists(extra_path):
|
||
break
|
||
try:
|
||
with open(extra_path) as f:
|
||
tok = json.load(f)
|
||
token = tok.get("access_token", "")
|
||
if token:
|
||
accounts.append({"id": f"google-{self.variant}-{idx}", "token": token, "email": tok.get("email", ""), "_token_data": tok, "_path": extra_path})
|
||
except Exception:
|
||
pass
|
||
idx += 1
|
||
return accounts if accounts else None
|
||
|
||
class APIKeyPool(AccountPool):
|
||
"""Rotates through comma-separated API keys."""
|
||
|
||
def __init__(self, provider_name, keys_str):
|
||
super().__init__(provider_name)
|
||
self._raw_keys = [k.strip() for k in keys_str.split(",") if k.strip()]
|
||
self._accounts = [{"id": f"key-{i}", "token": k, "email": f"key-{i}"} for i, k in enumerate(self._raw_keys)]
|
||
for a in self._accounts:
|
||
self._rate_limited[a["id"]] = 0
|
||
self._loaded_at = time.time()
|
||
|
||
def load_accounts(self, force=False):
|
||
return len(self._accounts)
|
||
|
||
_fb_pool = FreebuffAccountPool("freebuff")
|
||
_google_antigravity_pool = GoogleAccountPool("antigravity")
|
||
_google_cli_pool = GoogleAccountPool("cli")
|
||
|
||
def _get_freebuff_account():
|
||
"""Return (token, account_dict) for best available freebuff account."""
|
||
_fb_pool.load_accounts()
|
||
acct = _fb_pool.get()
|
||
if not acct:
|
||
return "", None
|
||
return acct["token"], acct
|
||
|
||
def _get_google_account(oauth_provider):
|
||
"""Return (access_token, account_dict) for best available Google account."""
|
||
pool = _google_antigravity_pool if oauth_provider == "google-antigravity" else _google_cli_pool
|
||
pool.load_accounts()
|
||
acct = pool.get()
|
||
if not acct:
|
||
return None, None
|
||
token_data = acct.get("_token_data", {})
|
||
token_path = acct.get("_path", "")
|
||
if token_data and token_path:
|
||
refreshed = _refresh_google_token(token_data, token_path)
|
||
return refreshed, acct
|
||
return acct.get("token", ""), acct
|
||
|
||
def _refresh_google_token(token_data, token_path):
|
||
if token_data.get("expires_at", 0) > time.time() + 60:
|
||
return token_data.get("access_token", "")
|
||
client_id = token_data.get("client_id", "")
|
||
client_secret = token_data.get("client_secret", "")
|
||
refresh_token = token_data.get("refresh_token", "")
|
||
if not all([client_id, client_secret, refresh_token]):
|
||
return token_data.get("access_token", "")
|
||
print("[oauth] refreshing Google access token...", file=sys.stderr)
|
||
try:
|
||
data = urllib.parse.urlencode({
|
||
"client_id": client_id, "client_secret": client_secret,
|
||
"refresh_token": refresh_token, "grant_type": "refresh_token",
|
||
}).encode()
|
||
req = urllib.request.Request("https://oauth2.googleapis.com/token", data=data,
|
||
headers={"Content-Type": "application/x-www-form-urlencoded"})
|
||
resp = urllib.request.urlopen(req, timeout=30)
|
||
new_tokens = json.loads(resp.read())
|
||
token_data["access_token"] = new_tokens.get("access_token", token_data.get("access_token"))
|
||
token_data["expires_at"] = time.time() + new_tokens.get("expires_in", 3600)
|
||
with open(token_path, "w") as f:
|
||
json.dump(token_data, f, indent=2)
|
||
print("[oauth] token refreshed OK", file=sys.stderr)
|
||
return token_data["access_token"]
|
||
except Exception as e:
|
||
print(f"[oauth] refresh failed: {e}", file=sys.stderr)
|
||
return token_data.get("access_token", "")
|
||
|
||
# ═══════════════════════════════════════════════════════════════════
|
||
# Gemini 3 thought signature preservation
|
||
# ═══════════════════════════════════════════════════════════════════
|
||
|
||
_gemini_sig_store = {}
|
||
_gemini_sig_lock = threading.Lock()
|
||
|
||
def _gemini_store_sig(key, signature):
|
||
if not key or not signature:
|
||
return
|
||
with _gemini_sig_lock:
|
||
_gemini_sig_store[key] = {"sig": signature, "ts": time.time()}
|
||
|
||
def _gemini_get_sig(key):
|
||
with _gemini_sig_lock:
|
||
item = _gemini_sig_store.get(key)
|
||
return item["sig"] if item else None
|
||
|
||
def _extract_gemini_sig(part):
|
||
if not isinstance(part, dict):
|
||
return None
|
||
return part.get("thoughtSignature") or part.get("thought_signature") or part.get("signature")
|
||
|
||
def _gemini_reattach_sigs(contents):
|
||
for content in contents:
|
||
for part in content.get("parts", []):
|
||
if not isinstance(part, dict):
|
||
continue
|
||
if "thoughtSignature" in part:
|
||
continue
|
||
if "functionCall" in part:
|
||
fc = part["functionCall"]
|
||
cid = fc.get("id") or fc.get("name")
|
||
if cid:
|
||
sig = _gemini_get_sig(f"fc:{cid}")
|
||
if sig:
|
||
part["thoughtSignature"] = sig
|
||
if "text" in part and content.get("role") == "model":
|
||
turn_key = content.get("_proxy_turn_key")
|
||
if turn_key:
|
||
sig = _gemini_get_sig(f"turn:{turn_key}")
|
||
if sig:
|
||
part["thoughtSignature"] = sig
|
||
return contents
|
||
|
||
# Gemini follow-through guardrail
|
||
_GEMINI_AGENT_GUARDRAIL = (
|
||
"You are running inside Codex as an autonomous coding agent. "
|
||
"When the user asks for a change to existing files, do not merely describe the previous work or summarize. "
|
||
"You must inspect the existing files, apply edits with tools, and verify the result. "
|
||
"If a file path is known from prior context, reuse it. "
|
||
"If unsure, list files first. "
|
||
"After tool results, continue until the requested change is actually implemented. "
|
||
"Never answer only with a plan such as 'I will start by...' or 'I am going to...'. "
|
||
"Always emit the actual tool call in the same response."
|
||
)
|
||
|
||
_LOG_FILE = None
|
||
_LOG_FILE_LOCK = threading.Lock()
|
||
|
||
def _fetch_antigravity_version():
|
||
cache_path = os.path.join(os.path.expanduser("~"), ".cache", "codex-proxy", "antigravity-version.json")
|
||
try:
|
||
with open(cache_path) as f:
|
||
cached = json.load(f)
|
||
if cached.get("version") and cached.get("checked_at", 0) > time.time() - 6 * 3600:
|
||
return cached["version"]
|
||
except Exception:
|
||
pass
|
||
urls = [
|
||
("https://antigravity-auto-updater-974169037036.us-central1.run.app", None),
|
||
("https://antigravity.google/changelog", 5000),
|
||
]
|
||
for url, limit in urls:
|
||
try:
|
||
req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
|
||
resp = urllib.request.urlopen(req, timeout=5)
|
||
text = resp.read().decode(errors="replace")
|
||
if limit:
|
||
text = text[:limit]
|
||
m = re.search(r"\d+\.\d+\.\d+", text)
|
||
if m:
|
||
version = m.group(0)
|
||
try:
|
||
os.makedirs(os.path.dirname(cache_path), exist_ok=True)
|
||
with open(cache_path, "w") as f:
|
||
json.dump({"version": version, "checked_at": time.time()}, f)
|
||
except Exception:
|
||
pass
|
||
return version
|
||
except Exception:
|
||
pass
|
||
return _antigravity_version
|
||
|
||
def _ensure_antigravity_version():
|
||
global _antigravity_version, _antigravity_version_checked
|
||
if time.time() - _antigravity_version_checked < 6 * 3600:
|
||
return _antigravity_version
|
||
with _antigravity_version_lock:
|
||
if time.time() - _antigravity_version_checked < 6 * 3600:
|
||
return _antigravity_version
|
||
_antigravity_version = _fetch_antigravity_version()
|
||
_antigravity_version_checked = time.time()
|
||
return _antigravity_version
|
||
|
||
def _init_runtime():
|
||
global CONFIG, PORT, BACKEND, TARGET_URL, API_KEY, OAUTH_PROVIDER, _antigravity_version
|
||
global MODELS, CC_VERSION, REASONING_ENABLED, REASONING_EFFORT, BGP_ROUTES
|
||
global _api_key_pool
|
||
|
||
CONFIG = load_config()
|
||
PORT = CONFIG["port"]
|
||
BACKEND = CONFIG["backend_type"]
|
||
TARGET_URL = CONFIG["target_url"].rstrip("/")
|
||
API_KEY = CONFIG["api_key"]
|
||
OAUTH_PROVIDER = CONFIG.get("oauth_provider") or ""
|
||
MODELS = CONFIG["models"]
|
||
CC_VERSION = CONFIG.get("cc_version", "")
|
||
REASONING_ENABLED = CONFIG.get("reasoning_enabled", True)
|
||
REASONING_EFFORT = CONFIG.get("reasoning_effort", "medium")
|
||
BGP_ROUTES = CONFIG.get("bgp_routes", [])
|
||
_api_key_pool = None
|
||
if API_KEY and "," in API_KEY and not OAUTH_PROVIDER.startswith("google") and BACKEND not in ("freebuff",):
|
||
_api_key_pool = APIKeyPool(BACKEND, API_KEY)
|
||
print(f"[multi-account] API key pool: {len(_api_key_pool._accounts)} keys for {BACKEND}", file=sys.stderr)
|
||
if OAUTH_PROVIDER == "google-antigravity":
|
||
_antigravity_version = _ensure_antigravity_version()
|
||
print(f"[antigravity] version={_antigravity_version}", file=sys.stderr)
|
||
|
||
bgp_models = []
|
||
for _r in BGP_ROUTES:
|
||
for _m in _r.get("models", [{"id": _r.get("model", "unknown")}]):
|
||
mid = _m.get("id", _m) if isinstance(_m, dict) else _m
|
||
if mid not in bgp_models:
|
||
bgp_models.append(mid)
|
||
if BGP_ROUTES and not MODELS:
|
||
MODELS = [{"id": m, "object": "model", "created": 1700000000, "owned_by": "bgp"} for m in bgp_models]
|
||
CONFIG["models"] = MODELS
|
||
|
||
if (BACKEND or "").startswith("gemini-oauth") and (OAUTH_PROVIDER or "").startswith("google"):
|
||
token_name = "google-antigravity-oauth-token.json" if OAUTH_PROVIDER == "google-antigravity" else "google-cli-oauth-token.json"
|
||
token_path = os.path.join(os.path.expanduser("~"), ".cache", "codex-proxy", token_name)
|
||
_preemptive_refresh_token(token_path)
|
||
try:
|
||
with open(token_path) as _tf:
|
||
_td = json.load(_tf)
|
||
_discovered = [] if OAUTH_PROVIDER == "google-antigravity" else _td.get("available_models", [])
|
||
if _discovered:
|
||
_seen = []
|
||
for _m in _discovered:
|
||
if _m not in _seen:
|
||
_seen.append(_m)
|
||
MODELS = [{"id": m, "object": "model", "created": 1700000000, "owned_by": "gemini-oauth"} for m in _seen]
|
||
CONFIG["models"] = MODELS
|
||
print(f"[gemini-oauth] loaded {len(_seen)} discovered models: {_seen}", file=sys.stderr)
|
||
except Exception:
|
||
pass
|
||
|
||
def _preemptive_refresh_token(token_path):
|
||
try:
|
||
with open(token_path) as f:
|
||
td = json.load(f)
|
||
expires_at = td.get("expires_at", 0)
|
||
if expires_at and time.time() > expires_at - 300:
|
||
print(f"[oauth] preemptive refresh: token expires in {int(expires_at - time.time())}s", file=sys.stderr)
|
||
except Exception:
|
||
pass
|
||
|
||
def _pooled_urlopen(url, data=None, headers=None, timeout=180):
|
||
parsed = urllib.parse.urlparse(url)
|
||
host = parsed.hostname
|
||
port = parsed.port or (443 if parsed.scheme == "https" else 80)
|
||
pool_key = f"{parsed.scheme}://{host}:{port}"
|
||
with _conn_pool_lock:
|
||
conn = _conn_pool.get(pool_key)
|
||
if conn:
|
||
try:
|
||
sock = conn.sock
|
||
if sock is None or sock._closed if hasattr(sock, '_closed') else False:
|
||
conn = None
|
||
except Exception:
|
||
conn = None
|
||
if conn is None:
|
||
if parsed.scheme == "https":
|
||
conn = http.client.HTTPSConnection(host, port, timeout=timeout)
|
||
else:
|
||
conn = http.client.HTTPConnection(host, port, timeout=timeout)
|
||
with _conn_pool_lock:
|
||
_conn_pool[pool_key] = conn
|
||
path = parsed.path or "/"
|
||
if parsed.query:
|
||
path += "?" + parsed.query
|
||
method = "POST" if data else "GET"
|
||
conn.request(method, path, body=data, headers=headers or {})
|
||
return conn.getresponse()
|
||
|
||
def _response_store_evict():
|
||
with _response_store_lock:
|
||
now = time.time()
|
||
expired = [k for k, v in _response_store.items()
|
||
if isinstance(v, dict) and now - v.get("ts", 0) > _RESPONSE_TTL]
|
||
for k in expired:
|
||
del _response_store[k]
|
||
|
||
def _log_dual(msg, level="INFO"):
|
||
ts = time.strftime("%H:%M:%S")
|
||
line = f"[{ts}] [{level}] {msg}"
|
||
print(line, file=sys.stderr, flush=True)
|
||
with _LOG_FILE_LOCK:
|
||
if _LOG_FILE:
|
||
try:
|
||
_LOG_FILE.write(line + "\n")
|
||
_LOG_FILE.flush()
|
||
except Exception:
|
||
pass
|
||
|
||
def _stream_with_idle_timeout(response, timeout_seconds=None):
|
||
if timeout_seconds is None:
|
||
timeout_seconds = _STREAM_IDLE_TIMEOUT
|
||
sel = selectors.DefaultSelector()
|
||
try:
|
||
sock = response if hasattr(response, 'fp') and response.fp else response
|
||
raw_sock = getattr(getattr(sock, 'fp', None), 'raw', None) or getattr(sock, '_sock', None)
|
||
if raw_sock is None:
|
||
for chunk in response:
|
||
yield chunk
|
||
return
|
||
sel.register(raw_sock, selectors.EVENT_READ)
|
||
while True:
|
||
ready = sel.select(timeout=timeout_seconds)
|
||
if not ready:
|
||
raise TimeoutError(f"Stream idle for {timeout_seconds}s")
|
||
chunk = response.readline()
|
||
if not chunk:
|
||
break
|
||
yield chunk
|
||
finally:
|
||
try:
|
||
sel.close()
|
||
except Exception:
|
||
pass
|
||
|
||
def _provider_cap_key(target_url=None, backend=None, model=None):
|
||
host = urllib.parse.urlparse(target_url or TARGET_URL).netloc.lower()
|
||
return f"{backend or BACKEND}|{host}|{model or '*'}"
|
||
|
||
def _load_provider_caps():
|
||
global _provider_caps
|
||
with _provider_caps_lock:
|
||
if _provider_caps is not None:
|
||
return _provider_caps
|
||
try:
|
||
with open(_provider_caps_path) as f:
|
||
_provider_caps = json.load(f)
|
||
except Exception:
|
||
_provider_caps = {}
|
||
return _provider_caps
|
||
|
||
def _save_provider_caps():
|
||
try:
|
||
os.makedirs(os.path.dirname(_provider_caps_path), exist_ok=True)
|
||
with open(_provider_caps_path, "w") as f:
|
||
json.dump(_provider_caps or {}, f, indent=2)
|
||
except Exception as e:
|
||
print(f"[provider-sensor] failed to save caps: {e}", file=sys.stderr)
|
||
|
||
def _provider_cap(model, key, default=None):
|
||
caps = _load_provider_caps()
|
||
specific = caps.get(_provider_cap_key(model=model), {})
|
||
generic = caps.get(_provider_cap_key(model="*"), {})
|
||
return specific.get(key, generic.get(key, default))
|
||
|
||
def _set_provider_cap(model, key, value, reason=""):
|
||
caps = _load_provider_caps()
|
||
cap_key = _provider_cap_key(model=model)
|
||
caps.setdefault(cap_key, {})[key] = value
|
||
caps[cap_key]["reason"] = reason
|
||
caps[cap_key]["updated_at"] = time.time()
|
||
_save_provider_caps()
|
||
print(f"[provider-sensor] learned {cap_key}: {key}={value} reason={reason}", file=sys.stderr)
|
||
|
||
def _refresh_oauth_token():
|
||
return _refresh_oauth_token_for(API_KEY, OAUTH_PROVIDER)
|
||
|
||
def _refresh_oauth_token_for(api_key, oauth_provider):
|
||
oauth_provider = oauth_provider or ""
|
||
if oauth_provider.startswith("google"):
|
||
token, acct = _get_google_account(oauth_provider)
|
||
if token and acct:
|
||
return token
|
||
if not oauth_provider.startswith("google"):
|
||
return api_key
|
||
token_name = "google-antigravity-oauth-token.json" if oauth_provider == "google-antigravity" else "google-cli-oauth-token.json"
|
||
token_path = os.path.join(os.path.expanduser("~"), ".cache", "codex-proxy", token_name)
|
||
if not os.path.exists(token_path):
|
||
return api_key
|
||
try:
|
||
with open(token_path) as f:
|
||
tokens = json.load(f)
|
||
if tokens.get("expires_at", 0) > time.time() + 60:
|
||
return tokens.get("access_token", api_key)
|
||
client_id = tokens.get("client_id", "")
|
||
client_secret = tokens.get("client_secret", "")
|
||
refresh_token = tokens.get("refresh_token", "")
|
||
if not all([client_id, client_secret, refresh_token]):
|
||
return tokens.get("access_token", api_key)
|
||
print("[oauth] refreshing Google access token...", file=sys.stderr)
|
||
data = urllib.parse.urlencode({
|
||
"client_id": client_id, "client_secret": client_secret,
|
||
"refresh_token": refresh_token, "grant_type": "refresh_token",
|
||
}).encode()
|
||
req = urllib.request.Request("https://oauth2.googleapis.com/token", data=data,
|
||
headers={"Content-Type": "application/x-www-form-urlencoded"})
|
||
resp = urllib.request.urlopen(req, timeout=30)
|
||
new_tokens = json.loads(resp.read())
|
||
tokens["access_token"] = new_tokens.get("access_token", tokens.get("access_token"))
|
||
tokens["expires_at"] = time.time() + new_tokens.get("expires_in", 3600)
|
||
with open(token_path, "w") as f:
|
||
json.dump(tokens, f, indent=2)
|
||
print("[oauth] token refreshed OK", file=sys.stderr)
|
||
return tokens["access_token"]
|
||
except Exception as e:
|
||
print(f"[oauth] refresh failed: {e}", file=sys.stderr)
|
||
return API_KEY
|
||
|
||
# ═══════════════════════════════════════════════════════════════════
|
||
# Shared helpers
|
||
# ═══════════════════════════════════════════════════════════════════
|
||
|
||
_pool = uuid.uuid4().hex[:8]
|
||
|
||
def _load_stats():
|
||
try:
|
||
if os.path.exists(_stats_path):
|
||
return json.load(open(_stats_path))
|
||
except Exception:
|
||
pass
|
||
return {"providers": {}, "updated": None}
|
||
|
||
def _atomic_write_json(path, obj):
|
||
tmp = path + ".tmp"
|
||
with open(tmp, "w") as f:
|
||
json.dump(obj, f, indent=2, ensure_ascii=False)
|
||
os.replace(tmp, path)
|
||
|
||
def _flush_stats():
|
||
global _stats_flush_timer
|
||
with _stats_lock:
|
||
batch = list(_stats_pending)
|
||
_stats_pending.clear()
|
||
_stats_flush_timer = None
|
||
if not batch:
|
||
return
|
||
stats = _load_stats()
|
||
for entry in batch:
|
||
provider = entry["provider"]
|
||
model = entry["model"]
|
||
p = stats["providers"].setdefault(provider, {
|
||
"total_requests": 0, "successes": 0, "failures": 0,
|
||
"total_tokens_in": 0, "total_tokens_out": 0,
|
||
"total_duration_s": 0.0, "models": {}, "last_used": None, "last_error": None,
|
||
})
|
||
p["total_requests"] += 1
|
||
p["total_tokens_in"] += entry["tokens_in"]
|
||
p["total_tokens_out"] += entry["tokens_out"]
|
||
p["total_duration_s"] += entry["duration_s"]
|
||
p["last_used"] = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(entry["ts"]))
|
||
if entry["success"]:
|
||
p["successes"] += 1
|
||
else:
|
||
p["failures"] += 1
|
||
p["last_error"] = entry.get("error_type") or "unknown"
|
||
m = p["models"].setdefault(model, {"requests": 0, "tokens_in": 0, "tokens_out": 0})
|
||
m["requests"] += 1
|
||
m["tokens_in"] += entry["tokens_in"]
|
||
m["tokens_out"] += entry["tokens_out"]
|
||
stats["updated"] = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())
|
||
_atomic_write_json(_stats_path, stats)
|
||
|
||
def _record_usage(provider, model, success, duration_s, tokens_in=0, tokens_out=0, error_type=None):
|
||
global _stats_flush_timer
|
||
entry = {
|
||
"provider": provider or "unknown", "model": model or "unknown",
|
||
"success": bool(success), "duration_s": float(duration_s or 0),
|
||
"tokens_in": int(tokens_in or 0), "tokens_out": int(tokens_out or 0),
|
||
"error_type": error_type, "ts": time.time(),
|
||
}
|
||
with _stats_lock:
|
||
_stats_pending.append(entry)
|
||
if _stats_flush_timer is None:
|
||
_stats_flush_timer = threading.Timer(_STATS_FLUSH_INTERVAL, _flush_stats)
|
||
_stats_flush_timer.daemon = True
|
||
_stats_flush_timer.start()
|
||
|
||
def store_response(resp_id, input_data, output_items):
|
||
if not resp_id:
|
||
return
|
||
_response_store_evict()
|
||
with _response_store_lock:
|
||
_response_store[resp_id] = {"input": input_data, "output": output_items, "ts": time.time()}
|
||
while len(_response_store) > _MAX_STORED:
|
||
_response_store.popitem(last=False)
|
||
|
||
def resolve_previous_response(body):
|
||
prev_id = body.get("previous_response_id")
|
||
input_data = body.get("input", "")
|
||
if not prev_id:
|
||
return input_data
|
||
with _response_store_lock:
|
||
stored = _response_store.get(prev_id)
|
||
if not stored:
|
||
return input_data
|
||
prev_input = stored["input"]
|
||
prev_output = stored["output"]
|
||
new_input = input_data if isinstance(input_data, list) else []
|
||
if isinstance(prev_input, list):
|
||
combined = list(prev_input) + list(prev_output) + new_input
|
||
else:
|
||
combined = [{"type": "message", "role": "user", "content": [{"type": "input_text", "text": str(prev_input)}]}] + list(prev_output) + new_input
|
||
return combined
|
||
|
||
def _fb_store_reasoning(resp_id, reasoning_text):
|
||
if not resp_id or not reasoning_text:
|
||
return
|
||
with _fb_reasoning_store_lock:
|
||
_fb_reasoning_store[resp_id] = {"reasoning": reasoning_text, "ts": time.time()}
|
||
while len(_fb_reasoning_store) > _MAX_STORED:
|
||
_fb_reasoning_store.popitem(last=False)
|
||
expired = [k for k, v in _fb_reasoning_store.items() if time.time() - v["ts"] > _RESPONSE_TTL]
|
||
for k in expired:
|
||
del _fb_reasoning_store[k]
|
||
|
||
def _fb_get_reasoning(resp_id):
|
||
if not resp_id:
|
||
return ""
|
||
with _fb_reasoning_store_lock:
|
||
entry = _fb_reasoning_store.get(resp_id)
|
||
return entry["reasoning"] if entry else ""
|
||
|
||
def _fb_get_any_reasoning():
|
||
with _fb_reasoning_store_lock:
|
||
for k in _fb_reasoning_store:
|
||
return _fb_reasoning_store[k]["reasoning"]
|
||
return ""
|
||
|
||
def _freebuff_hard_disable_reasoning(messages):
|
||
"""Strip all reasoning/thinking fields from every message.
|
||
FreeBuff rejects mixed reasoning_content histories.
|
||
The final chat body must be clean before POST."""
|
||
for msg in messages:
|
||
if not isinstance(msg, dict):
|
||
continue
|
||
for key in ("reasoning_content", "reasoning", "thinking",
|
||
"thinking_content", "thoughts"):
|
||
msg.pop(key, None)
|
||
|
||
def _is_reasoning_content_error(error_text):
|
||
if not error_text:
|
||
return False
|
||
e = error_text.lower()
|
||
return ("reasoning_content" in e or "thinking mode" in e
|
||
or "must be passed back" in e)
|
||
|
||
def _ds_store_assistant(resp_id, assistant_msg):
|
||
if not resp_id or not isinstance(assistant_msg, dict):
|
||
return
|
||
tool_calls = assistant_msg.get("tool_calls") or []
|
||
reasoning = assistant_msg.get("reasoning_content")
|
||
if not tool_calls or not reasoning:
|
||
return
|
||
with _deepseek_reasoning_lock:
|
||
for tc in tool_calls:
|
||
tc_id = tc.get("id") or tc.get("call_id", "")
|
||
if tc_id:
|
||
_deepseek_reasoning_store[tc_id] = {
|
||
"resp_id": resp_id,
|
||
"assistant": dict(assistant_msg),
|
||
"reasoning_content": reasoning,
|
||
"ts": time.time(),
|
||
}
|
||
keys = list(_deepseek_reasoning_store.keys())
|
||
if len(keys) > _MAX_DS_STORED:
|
||
for k in keys[:len(keys) - _MAX_DS_STORED]:
|
||
del _deepseek_reasoning_store[k]
|
||
|
||
def _ds_rebuild_tool_history(messages):
|
||
with _deepseek_reasoning_lock:
|
||
snapshot = dict(_deepseek_reasoning_store)
|
||
expired = [k for k, v in snapshot.items() if time.time() - v["ts"] > 900]
|
||
for k in expired:
|
||
_deepseek_reasoning_store.pop(k, None)
|
||
snapshot.pop(k, None)
|
||
if not snapshot:
|
||
return messages
|
||
rebuilt = []
|
||
inserted_ids = set()
|
||
for msg in messages:
|
||
if msg.get("role") == "tool":
|
||
tc_id = msg.get("tool_call_id", "")
|
||
stored = snapshot.get(tc_id)
|
||
if stored and tc_id not in inserted_ids:
|
||
am = dict(stored["assistant"])
|
||
if am.get("reasoning_content"):
|
||
rebuilt.append(am)
|
||
inserted_ids.add(tc_id)
|
||
rebuilt.append(msg)
|
||
return rebuilt
|
||
|
||
def _fb_input_to_messages(input_data, instructions=""):
|
||
msgs = []
|
||
tool_name_by_id = {}
|
||
pending_tool_calls = []
|
||
last_flushed_ids = []
|
||
if isinstance(input_data, str):
|
||
msgs.append({"role": "user", "content": input_data})
|
||
elif isinstance(input_data, list):
|
||
for item in input_data:
|
||
t = item.get("type")
|
||
if t == "reasoning":
|
||
continue
|
||
if t == "function_call":
|
||
tcid = item.get("call_id") or item.get("id") or uid("tc")
|
||
pending_tool_calls.append(
|
||
{"id": tcid, "type": "function",
|
||
"function": {"name": item.get("name", ""),
|
||
"arguments": item.get("arguments", "{}")}})
|
||
tool_name_by_id[tcid] = item.get("name", "")
|
||
continue
|
||
if pending_tool_calls:
|
||
last_flushed_ids = [tc["id"] for tc in pending_tool_calls]
|
||
msg = {"role": "assistant", "content": None, "tool_calls": pending_tool_calls}
|
||
msgs.append(msg)
|
||
pending_tool_calls = []
|
||
if t == "message":
|
||
role = item.get("role", "user")
|
||
if role == "developer":
|
||
role = "system"
|
||
text = ""
|
||
content = item.get("content", [])
|
||
if isinstance(content, str):
|
||
text = content
|
||
else:
|
||
for part in content:
|
||
if isinstance(part, str):
|
||
text += part
|
||
continue
|
||
pt = part.get("type", "")
|
||
if pt in ("input_text", "output_text"):
|
||
text += part.get("text", "")
|
||
if text is not None:
|
||
am = {"role": role, "content": text}
|
||
if role == "assistant":
|
||
am["_fb_orig_id"] = item.get("id", "")
|
||
msgs.append(am)
|
||
elif t == "function_call_output":
|
||
tcid = item.get("call_id") or item.get("id") or ""
|
||
if not tcid and last_flushed_ids:
|
||
idx = len([m for m in msgs if m.get("role") == "tool"])
|
||
if idx < len(last_flushed_ids):
|
||
tcid = last_flushed_ids[idx]
|
||
msgs.append({"role": "tool", "tool_call_id": tcid,
|
||
"tool_name": tool_name_by_id.get(tcid, ""),
|
||
"content": item.get("output", "")})
|
||
if pending_tool_calls:
|
||
msg = {"role": "assistant", "content": None, "tool_calls": pending_tool_calls}
|
||
msgs.append(msg)
|
||
if instructions:
|
||
msgs.insert(0, {"role": "system", "content": instructions})
|
||
return msgs
|
||
|
||
def _fb_strip_reasoning_from_messages(messages):
|
||
out = []
|
||
for m in messages:
|
||
nm = {k: v for k, v in m.items() if k != "reasoning_content"}
|
||
out.append(nm)
|
||
return out
|
||
|
||
_HOP_BY_HOP_HEADERS = {
|
||
"connection",
|
||
"keep-alive",
|
||
"proxy-authenticate",
|
||
"proxy-authorization",
|
||
"te",
|
||
"trailers",
|
||
"transfer-encoding",
|
||
"upgrade",
|
||
"host",
|
||
"content-length",
|
||
}
|
||
|
||
def uid(prefix="id"):
|
||
return f"{prefix}-{_pool}-{uuid.uuid4().hex[:12]}"
|
||
|
||
def emit(event, data):
|
||
return f"event: {event}\ndata: {json.dumps(data)}\n\n"
|
||
|
||
def upstream_target(base_url, suffix):
|
||
base = base_url.rstrip("/")
|
||
if base.endswith(suffix):
|
||
return base
|
||
return f"{base}{suffix}"
|
||
|
||
_BROWSER_HEADERS = {
|
||
"User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/137.0.0.0 Safari/537.36",
|
||
"Accept": "application/json, text/event-stream, */*",
|
||
"Accept-Language": "en-US,en;q=0.9",
|
||
"Sec-Ch-Ua": '"Chromium";v="137", "Not/A)Brand";v="99"',
|
||
"Sec-Ch-Ua-Mobile": "?0",
|
||
"Sec-Ch-Ua-Platform": '"Linux"',
|
||
"Sec-Fetch-Dest": "empty",
|
||
"Sec-Fetch-Mode": "cors",
|
||
"Sec-Fetch-Site": "same-origin",
|
||
}
|
||
|
||
def forwarded_headers(request_headers, extra=None, browser_ua=False):
|
||
headers = {}
|
||
if browser_ua:
|
||
headers.update(_BROWSER_HEADERS)
|
||
for key, value in request_headers.items():
|
||
if key.lower() in _HOP_BY_HOP_HEADERS:
|
||
continue
|
||
if browser_ua and key.lower() == "user-agent":
|
||
continue
|
||
headers[key] = value
|
||
if extra:
|
||
headers.update(extra)
|
||
return headers
|
||
|
||
_MAX_INPUT_ITEMS = 30
|
||
_MAX_TOOL_OUTPUT_CHARS = 8000
|
||
_COMPACT_KEEP_RECENT = 10
|
||
|
||
_CROF_ADAPTIVE = {
|
||
"fail_history": [],
|
||
"model_limits": {},
|
||
"global_item_limit": 30,
|
||
"min_keep_recent": 4,
|
||
}
|
||
|
||
_BGP_STATS_PATH = os.path.join(_LOG_DIR, "bgp-route-stats.json")
|
||
_bgp_stats_lock = threading.Lock()
|
||
|
||
def _route_key(route):
|
||
return f"{route.get('name', '')}::{route.get('target_url', '')}::{route.get('model', '')}"
|
||
|
||
def _load_bgp_stats():
|
||
try:
|
||
if os.path.exists(_BGP_STATS_PATH):
|
||
return json.load(open(_BGP_STATS_PATH))
|
||
except Exception:
|
||
pass
|
||
return {}
|
||
|
||
def _save_bgp_stats(stats):
|
||
tmp = _BGP_STATS_PATH + ".tmp"
|
||
with open(tmp, "w") as f:
|
||
json.dump(stats, f, indent=2)
|
||
os.replace(tmp, _BGP_STATS_PATH)
|
||
|
||
def _score_route(route, stats):
|
||
key = _route_key(route)
|
||
rs = stats.get(key, {})
|
||
now = time.time()
|
||
if float(rs.get("open_until_ts", 0)) > now:
|
||
return 1_000_000
|
||
priority = int(route.get("priority", 99))
|
||
ewma = float(rs.get("ewma_latency_s", 0))
|
||
failures = int(rs.get("consecutive_failures", 0))
|
||
score = priority + min(ewma * 5, 50) + failures * 20
|
||
if float(rs.get("rate_limited_until", 0)) > now:
|
||
score += 500
|
||
return score
|
||
|
||
def _update_route_stats(route, success, duration_s, http_code=None, error_type=None):
|
||
with _bgp_stats_lock:
|
||
stats = _load_bgp_stats()
|
||
key = _route_key(route)
|
||
rs = stats.setdefault(key, {
|
||
"ewma_latency_s": duration_s, "consecutive_failures": 0,
|
||
"last_success": None, "last_failure": None,
|
||
"open_until_ts": 0, "rate_limited_until": 0, "last_error": None,
|
||
})
|
||
alpha = 0.25
|
||
rs["ewma_latency_s"] = alpha * duration_s + (1 - alpha) * float(rs.get("ewma_latency_s", duration_s))
|
||
if success:
|
||
rs["consecutive_failures"] = 0
|
||
rs["last_success"] = time.time()
|
||
else:
|
||
rs["consecutive_failures"] = int(rs.get("consecutive_failures", 0)) + 1
|
||
rs["last_failure"] = time.time()
|
||
rs["last_error"] = error_type or (f"http_{http_code}" if http_code else "unknown")
|
||
if http_code == 429:
|
||
rs["rate_limited_until"] = time.time() + 120
|
||
if rs["consecutive_failures"] >= 3:
|
||
rs["open_until_ts"] = time.time() + 60
|
||
rs["consecutive_failures"] = 0
|
||
_save_bgp_stats(stats)
|
||
|
||
def _sorted_bgp_routes():
|
||
with _bgp_stats_lock:
|
||
stats = _load_bgp_stats()
|
||
return sorted(BGP_ROUTES, key=lambda r: _score_route(r, stats))
|
||
|
||
def _crof_record(model, n_items, success):
|
||
if not isinstance(n_items, int) or n_items < 1:
|
||
return
|
||
entry = {"model": model, "items": n_items, "ok": success}
|
||
hist = _CROF_ADAPTIVE["fail_history"]
|
||
hist.append(entry)
|
||
if len(hist) > 200:
|
||
_CROF_ADAPTIVE["fail_history"] = hist[-100:]
|
||
|
||
ml = _CROF_ADAPTIVE["model_limits"].setdefault(model, {"ok_max": 30, "fail_min": 0, "limit": 30})
|
||
if success and n_items > ml["ok_max"]:
|
||
ml["ok_max"] = n_items
|
||
if not success and (ml["fail_min"] == 0 or n_items < ml["fail_min"]):
|
||
ml["fail_min"] = n_items
|
||
|
||
if ml["fail_min"] > 0 and ml["ok_max"] >= ml["fail_min"]:
|
||
ml["limit"] = ml["fail_min"] - 1
|
||
elif ml["fail_min"] > 0:
|
||
ml["limit"] = max(ml["fail_min"] - 2, _CROF_ADAPTIVE["min_keep_recent"] + 2)
|
||
|
||
global_limit = 30
|
||
for m, v in _CROF_ADAPTIVE["model_limits"].items():
|
||
if v.get("limit", 30) < global_limit:
|
||
global_limit = v["limit"]
|
||
_CROF_ADAPTIVE["global_item_limit"] = global_limit
|
||
|
||
print(f"[crof-adaptive] model={model} items={n_items} {'OK' if success else 'FAIL'} -> limit={ml.get('limit',30)} global={global_limit}", file=sys.stderr)
|
||
|
||
def _crof_item_limit(model):
|
||
ml = _CROF_ADAPTIVE["model_limits"].get(model, {})
|
||
per_model = ml.get("limit", 30)
|
||
return min(per_model, _CROF_ADAPTIVE["global_item_limit"])
|
||
|
||
def _crof_compact_for_retry(input_data, model):
|
||
limit = _crof_item_limit(model)
|
||
if not isinstance(input_data, list) or len(input_data) <= limit:
|
||
return input_data
|
||
|
||
keep = max(_CROF_ADAPTIVE["min_keep_recent"], limit // 3)
|
||
head_end = 0
|
||
for i, item in enumerate(input_data):
|
||
t = item.get("type")
|
||
if t == "message" and item.get("role") in ("developer", "system"):
|
||
head_end = i + 1
|
||
elif t == "message" and item.get("role") == "user" and head_end == i:
|
||
head_end = i + 1
|
||
else:
|
||
break
|
||
|
||
head = input_data[:head_end]
|
||
tail_start = max(head_end, len(input_data) - keep)
|
||
while tail_start > head_end:
|
||
t = input_data[tail_start].get("type")
|
||
r = input_data[tail_start].get("role", "")
|
||
if t in ("function_call_output", "function_call"):
|
||
tail_start -= 1
|
||
elif t == "message" and r == "assistant":
|
||
tail_start -= 1
|
||
else:
|
||
break
|
||
tail = input_data[tail_start:]
|
||
body = input_data[head_end:tail_start]
|
||
|
||
if not body:
|
||
return head + tail
|
||
|
||
summary_lines = [f"[Auto-compacted: {len(body)} turns removed (adaptive limit={limit})]"]
|
||
for item in body[-5:]:
|
||
summary_lines.append(_item_summary(item, max_len=120))
|
||
|
||
summary_msg = {"type": "message", "role": "user", "content": [{"type": "input_text", "text": "\n".join(summary_lines)}]}
|
||
print(f"[crof-adaptive] RETRY compact: {len(input_data)} -> {len(head)+1+len(tail)} (limit={limit}, keep={len(tail)})", file=sys.stderr)
|
||
return head + [summary_msg] + tail
|
||
|
||
def _item_summary(item, max_len=200):
|
||
t = item.get("type")
|
||
if t == "message":
|
||
role = item.get("role", "?")
|
||
text = ""
|
||
for p in item.get("content", []):
|
||
if p.get("type") in ("input_text", "output_text"):
|
||
text += p.get("text", "")
|
||
return f"[{role}] {text[:max_len]}"
|
||
elif t == "function_call":
|
||
name = item.get("name", "?")
|
||
args = item.get("arguments", "{}")
|
||
try:
|
||
a = json.loads(args)
|
||
cmd = a.get("cmd", a.get("command", ""))
|
||
if cmd:
|
||
return f"[tool call] {name}: {cmd[:max_len]}"
|
||
except Exception:
|
||
pass
|
||
return f"[tool call] {name}({args[:max_len]})"
|
||
elif t == "function_call_output":
|
||
output = item.get("output", "")
|
||
if len(output) > max_len:
|
||
return f"[tool result] {output[:max_len]}..."
|
||
return f"[tool result] {output}"
|
||
return f"[{t}]"
|
||
|
||
def _extract_files(items):
|
||
files = set()
|
||
for item in items:
|
||
if item.get("type") == "function_call":
|
||
try:
|
||
a = json.loads(item.get("arguments", "{}"))
|
||
cmd = a.get("cmd", a.get("command", ""))
|
||
for prefix in (">", ">>", " > ", " >> "):
|
||
for part in cmd.split(prefix)[1:]:
|
||
f = part.strip().split()[0].strip("'\"")
|
||
if f and not f.startswith("-") and "/" in f:
|
||
files.add(f)
|
||
except Exception:
|
||
pass
|
||
return files
|
||
|
||
def _compact_input(input_data):
|
||
if isinstance(input_data, str):
|
||
return input_data
|
||
if not isinstance(input_data, list) or len(input_data) <= _MAX_INPUT_ITEMS:
|
||
out = []
|
||
for item in input_data:
|
||
if isinstance(item, dict) and item.get("type") == "function_call_output":
|
||
o = item.get("output", "")
|
||
if len(o) > _MAX_TOOL_OUTPUT_CHARS:
|
||
item = dict(item)
|
||
item["output"] = o[:_MAX_TOOL_OUTPUT_CHARS] + f"\n... [truncated {len(o) - _MAX_TOOL_OUTPUT_CHARS} chars]"
|
||
print(f"[compact] tool output truncated {len(o)} -> {_MAX_TOOL_OUTPUT_CHARS}", file=sys.stderr)
|
||
out.append(item)
|
||
return out
|
||
|
||
head_end = 0
|
||
for i, item in enumerate(input_data):
|
||
t = item.get("type")
|
||
if t == "message" and item.get("role") in ("developer", "system"):
|
||
head_end = i + 1
|
||
elif t == "message" and item.get("role") == "user" and head_end == i:
|
||
head_end = i + 1
|
||
else:
|
||
break
|
||
|
||
head = input_data[:head_end]
|
||
tail_start = len(input_data) - _COMPACT_KEEP_RECENT
|
||
while tail_start > head_end:
|
||
t = input_data[tail_start].get("type")
|
||
r = input_data[tail_start].get("role", "")
|
||
if t == "function_call_output":
|
||
tail_start -= 1
|
||
elif t == "function_call":
|
||
tail_start -= 1
|
||
elif t == "message" and r == "assistant":
|
||
tail_start -= 1
|
||
else:
|
||
break
|
||
tail = input_data[tail_start:]
|
||
body = input_data[head_end:tail_start]
|
||
|
||
if not body:
|
||
return head + tail
|
||
|
||
for item in tail:
|
||
if isinstance(item, dict) and item.get("type") == "function_call_output":
|
||
o = item.get("output", "")
|
||
if len(o) > _MAX_TOOL_OUTPUT_CHARS:
|
||
item["output"] = o[:_MAX_TOOL_OUTPUT_CHARS] + f"\n... [truncated {len(o) - _MAX_TOOL_OUTPUT_CHARS} chars]"
|
||
|
||
user_queries = []
|
||
for item in body:
|
||
if item.get("type") == "message" and item.get("role") == "user":
|
||
for p in item.get("content", []):
|
||
if p.get("type") == "input_text":
|
||
user_queries.append(p.get("text", "")[:300])
|
||
assistant_msgs = []
|
||
for item in body:
|
||
if item.get("type") == "message" and item.get("role") == "assistant":
|
||
for p in item.get("content", []):
|
||
if p.get("type") == "output_text":
|
||
assistant_msgs.append(p.get("text", "")[:300])
|
||
|
||
tool_summaries = []
|
||
for item in body:
|
||
if item.get("type") in ("function_call", "function_call_output"):
|
||
tool_summaries.append(_item_summary(item, max_len=150))
|
||
|
||
files = _extract_files(body)
|
||
|
||
summary_lines = [f"[Auto-compacted: {len(body)} earlier turns summarized to preserve context]"]
|
||
if user_queries:
|
||
summary_lines.append(f"User requests: {'; '.join(user_queries[-3:])}")
|
||
if assistant_msgs:
|
||
summary_lines.append(f"Assistant responses: {'; '.join(assistant_msgs[-3:])}")
|
||
if tool_summaries:
|
||
summary_lines.append(f"Actions taken ({len(tool_summaries)} steps):")
|
||
for ts in tool_summaries[-15:]:
|
||
summary_lines.append(f" {ts}")
|
||
if files:
|
||
summary_lines.append(f"Files touched: {', '.join(sorted(files)[-10:])}")
|
||
|
||
summary_text = "\n".join(summary_lines)
|
||
summary_msg = {
|
||
"type": "message",
|
||
"role": "user",
|
||
"content": [{"type": "input_text", "text": summary_text}]
|
||
}
|
||
|
||
print(f"[compact] {len(input_data)} items -> {len(head) + 1 + len(tail)} (compacted {len(body)} old items into summary)", file=sys.stderr)
|
||
return head + [summary_msg] + tail
|
||
|
||
# ═══════════════════════════════════════════════════════════════════
|
||
# Provider policies
|
||
# ═══════════════════════════════════════════════════════════════════
|
||
|
||
_PROVIDER_POLICIES = {
|
||
"crof": {"reasoning_mode": "off", "max_tokens": 32768, "strip_reasoning": True,
|
||
"tool_output_limit": 4000, "max_input_items": 18, "compaction": "aggressive",
|
||
"synthetic_tool_results": True},
|
||
"chats-llm": {"reasoning_mode": "off", "max_tokens": 32768, "strip_reasoning": True,
|
||
"tool_output_limit": 4000, "max_input_items": 20, "compaction": "aggressive"},
|
||
"z.ai": {"reasoning_mode": "medium", "max_tokens": 65536, "strip_reasoning": True,
|
||
"tool_output_limit": 8000, "max_input_items": 40, "compaction": "balanced"},
|
||
"openrouter": {"reasoning_mode": "provider_default", "max_tokens": 32768, "strip_reasoning": True,
|
||
"tool_output_limit": 6000, "max_input_items": 35, "compaction": "balanced"},
|
||
"openadapter": {"reasoning_mode": "off", "max_tokens": 32768, "strip_reasoning": True,
|
||
"tool_output_limit": 6000, "max_input_items": 30, "compaction": "balanced"},
|
||
}
|
||
|
||
def provider_policy(target_url=None, backend=None):
|
||
host = urllib.parse.urlparse(target_url or TARGET_URL).netloc.lower()
|
||
for key, policy in _PROVIDER_POLICIES.items():
|
||
if key in host:
|
||
return policy
|
||
return {}
|
||
|
||
# ═══════════════════════════════════════════════════════════════════
|
||
# Adaptive context compaction (model-aware)
|
||
# ═══════════════════════════════════════════════════════════════════
|
||
|
||
_MODEL_CONTEXT = {
|
||
"gpt-4o": 128000, "gpt-4o-mini": 128000, "gpt-5": 128000,
|
||
"claude-sonnet": 200000, "claude-haiku": 200000,
|
||
"glm-5.1": 128000, "glm-5": 128000, "glm-4": 128000,
|
||
"deepseek": 64000, "gemini-2.5-flash": 1000000, "gemini-2.5-pro": 2000000,
|
||
"mimo": 32768, "minimax": 32768, "kimi": 128000,
|
||
"_default": 32768,
|
||
}
|
||
|
||
def _context_limit_for_model(model):
|
||
if not model:
|
||
return _MODEL_CONTEXT["_default"]
|
||
ml = model.lower()
|
||
for key, limit in _MODEL_CONTEXT.items():
|
||
if key != "_default" and key in ml:
|
||
return limit
|
||
return _MODEL_CONTEXT["_default"]
|
||
|
||
def _estimate_tokens(obj):
|
||
if obj is None:
|
||
return 0
|
||
if isinstance(obj, str):
|
||
return max(1, len(obj) // 4)
|
||
try:
|
||
raw = json.dumps(obj, ensure_ascii=False)
|
||
except Exception:
|
||
raw = str(obj)
|
||
return max(1, len(raw) // 4)
|
||
|
||
def _adaptive_compact(input_data, model, policy=None):
|
||
policy = policy or {}
|
||
context_size = int(policy.get("context_size", _context_limit_for_model(model)))
|
||
input_budget = int(context_size * 0.60)
|
||
estimated = _estimate_tokens(input_data)
|
||
if estimated <= input_budget:
|
||
return input_data, False
|
||
if not isinstance(input_data, list):
|
||
return input_data, False
|
||
reduction = max(0.15, input_budget / max(estimated, 1))
|
||
target_items = max(int(len(input_data) * reduction), 6)
|
||
if target_items >= len(input_data):
|
||
return input_data, False
|
||
head_end = 0
|
||
for i, item in enumerate(input_data):
|
||
t = item.get("type")
|
||
if t == "message" and item.get("role") in ("developer", "system"):
|
||
head_end = i + 1
|
||
elif t == "message" and item.get("role") == "user" and head_end == i:
|
||
head_end = i + 1
|
||
else:
|
||
break
|
||
head = input_data[:head_end]
|
||
keep = max(4, target_items // 3)
|
||
tail_start = max(head_end, len(input_data) - keep)
|
||
while tail_start > head_end:
|
||
t = input_data[tail_start].get("type")
|
||
if t in ("function_call_output", "function_call"):
|
||
tail_start -= 1
|
||
elif t == "message" and input_data[tail_start].get("role") == "assistant":
|
||
tail_start -= 1
|
||
else:
|
||
break
|
||
tail = input_data[tail_start:]
|
||
body = input_data[head_end:tail_start]
|
||
if not body:
|
||
return head + tail, True
|
||
summary_lines = [f"[Auto-compacted: {len(body)} turns removed (budget={input_budget}tok, model={model})]"]
|
||
for item in body[-5:]:
|
||
summary_lines.append(_item_summary(item, max_len=120))
|
||
summary_msg = {"type": "message", "role": "user",
|
||
"content": [{"type": "input_text", "text": "\n".join(summary_lines)}]}
|
||
print(f"[adaptive-compact] model={model} est={estimated}tok budget={input_budget}tok "
|
||
f"items {len(input_data)}->{len(head)+1+len(tail)}", file=sys.stderr)
|
||
return head + [summary_msg] + tail, True
|
||
|
||
# ═══════════════════════════════════════════════════════════════════
|
||
# Tool-call pairing validator
|
||
# ═══════════════════════════════════════════════════════════════════
|
||
|
||
def validate_tool_pairs(input_items):
|
||
if not isinstance(input_items, list):
|
||
return []
|
||
calls = {}
|
||
errors = []
|
||
for idx, item in enumerate(input_items):
|
||
t = item.get("type")
|
||
if t == "function_call":
|
||
cid = item.get("call_id") or item.get("id")
|
||
if cid:
|
||
calls[cid] = idx
|
||
elif t == "function_call_output":
|
||
cid = item.get("call_id") or item.get("id")
|
||
if not cid or cid not in calls:
|
||
errors.append({"index": idx, "call_id": cid, "error": "orphan_function_call_output"})
|
||
return errors
|
||
|
||
def repair_orphan_tool_outputs(input_items, errors):
|
||
bad = {e["index"] for e in errors}
|
||
repaired = []
|
||
for idx, item in enumerate(input_items):
|
||
if idx in bad:
|
||
output = item.get("output", "")
|
||
repaired.append({"type": "message", "role": "user",
|
||
"content": [{"type": "input_text",
|
||
"text": f"[Proxy: unmatched tool output]\n{str(output)[:4000]}"}]})
|
||
else:
|
||
repaired.append(item)
|
||
return repaired
|
||
|
||
def synthesize_tool_results_for_chat(input_items):
|
||
"""Convert Responses function_call/function_call_output pairs into plain text.
|
||
|
||
Some OpenAI-compatible providers accept tool calls on the first turn but fail
|
||
on the next request when role=tool messages are present. For those providers,
|
||
encode tool outputs as normal user text so the model can continue.
|
||
"""
|
||
if not isinstance(input_items, list):
|
||
return input_items, False
|
||
calls = {}
|
||
changed = False
|
||
out = []
|
||
for item in input_items:
|
||
t = item.get("type")
|
||
if t == "function_call":
|
||
cid = item.get("call_id") or item.get("id") or ""
|
||
calls[cid] = item
|
||
changed = True
|
||
continue
|
||
if t == "function_call_output":
|
||
cid = item.get("call_id") or item.get("id") or ""
|
||
call = calls.get(cid, {})
|
||
name = call.get("name", "tool")
|
||
args = call.get("arguments", "{}")
|
||
output = item.get("output", "")
|
||
text = (
|
||
"Tool execution result. Continue the task using this result. "
|
||
"Do not repeat the same tool call unless more information is required.\n\n"
|
||
f"Tool: {name}\nArguments:\n```json\n{str(args)[:2000]}\n```\n"
|
||
f"Output:\n```\n{str(output)[:8000]}\n```"
|
||
)
|
||
out.append({"type": "message", "role": "user", "content": [{"type": "input_text", "text": text}]})
|
||
changed = True
|
||
continue
|
||
out.append(item)
|
||
return out, changed
|
||
|
||
def has_function_call_output(input_items):
|
||
return isinstance(input_items, list) and any(i.get("type") == "function_call_output" for i in input_items)
|
||
|
||
# ═══════════════════════════════════════════════════════════════════
|
||
# Log redaction
|
||
# ═══════════════════════════════════════════════════════════════════
|
||
|
||
_SECRET_PATTERNS = [
|
||
(r"sk-[A-Za-z0-9_\-]{20,}", "[REDACTED:key]"),
|
||
(r"sk-ant-[A-Za-z0-9_\-]{20,}", "[REDACTED:anthropic]"),
|
||
(r"gh[pousr]_[A-Za-z0-9_]{20,}", "[REDACTED:github]"),
|
||
(r"Bearer\s+[A-Za-z0-9._\-]{20,}", "Bearer [REDACTED]"),
|
||
]
|
||
|
||
def _redact(text):
|
||
if not text:
|
||
return text
|
||
import re
|
||
for pattern, replacement in _SECRET_PATTERNS:
|
||
text = re.sub(pattern, replacement, text)
|
||
return text
|
||
|
||
def _redact_json(obj):
|
||
try:
|
||
raw = json.dumps(obj, ensure_ascii=False)
|
||
except Exception:
|
||
raw = str(obj)
|
||
return _redact(raw)
|
||
|
||
_MAX_SNAPSHOTS = 200
|
||
|
||
def save_request_snapshot(request_id, body):
|
||
if not request_id:
|
||
return request_id
|
||
snapshot = {
|
||
"_meta": {
|
||
"request_id": request_id,
|
||
"model": body.get("model", ""),
|
||
"stream": body.get("stream", False),
|
||
"ts": time.time(),
|
||
"ts_iso": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
|
||
"status": "pending",
|
||
"duration_s": None,
|
||
"error": None,
|
||
},
|
||
"request": json.loads(_redact_json(body)),
|
||
}
|
||
path = os.path.join(_REQUESTS_DIR, f"{request_id}.json")
|
||
tmp = path + ".tmp"
|
||
with open(tmp, "w") as f:
|
||
json.dump(snapshot, f, ensure_ascii=False, indent=2)
|
||
os.replace(tmp, path)
|
||
_rotate_snapshots()
|
||
return request_id
|
||
|
||
def update_snapshot_response(request_id, status, duration_s=None, error=None):
|
||
if not request_id:
|
||
return
|
||
path = os.path.join(_REQUESTS_DIR, f"{request_id}.json")
|
||
if not os.path.exists(path):
|
||
return
|
||
try:
|
||
with open(path) as f:
|
||
snapshot = json.load(f)
|
||
meta = snapshot.get("_meta", {})
|
||
meta["status"] = status
|
||
if duration_s is not None:
|
||
meta["duration_s"] = round(duration_s, 3)
|
||
if error is not None:
|
||
meta["error"] = str(error)[:200]
|
||
snapshot["_meta"] = meta
|
||
tmp = path + ".tmp"
|
||
with open(tmp, "w") as f:
|
||
json.dump(snapshot, f, ensure_ascii=False, indent=2)
|
||
os.replace(tmp, path)
|
||
except Exception:
|
||
pass
|
||
|
||
def _rotate_snapshots():
|
||
try:
|
||
files = sorted(
|
||
[os.path.join(_REQUESTS_DIR, f) for f in os.listdir(_REQUESTS_DIR) if f.endswith(".json")],
|
||
key=os.path.getmtime,
|
||
)
|
||
while len(files) > _MAX_SNAPSHOTS:
|
||
os.remove(files.pop(0))
|
||
except Exception:
|
||
pass
|
||
|
||
# ═══════════════════════════════════════════════════════════════════
|
||
# Rate-limit token buckets
|
||
# ═══════════════════════════════════════════════════════════════════
|
||
|
||
class TokenBucket:
|
||
def __init__(self, capacity=10, refill=1.0):
|
||
self.capacity = float(capacity)
|
||
self.tokens = float(capacity)
|
||
self.refill = float(refill)
|
||
self.updated = time.monotonic()
|
||
self.lock = threading.Lock()
|
||
def allow(self, cost=1):
|
||
with self.lock:
|
||
now = time.monotonic()
|
||
self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.refill)
|
||
self.updated = now
|
||
if self.tokens >= cost:
|
||
self.tokens -= cost
|
||
return True
|
||
return False
|
||
|
||
_rate_buckets = {}
|
||
_rate_buckets_lock = threading.Lock()
|
||
|
||
def _bucket_for_route(route):
|
||
name = route.get("name") or route.get("target_url") or "default"
|
||
with _rate_buckets_lock:
|
||
if name not in _rate_buckets:
|
||
_rate_buckets[name] = TokenBucket(capacity=10, refill=1.0)
|
||
return _rate_buckets[name]
|
||
|
||
# ═══════════════════════════════════════════════════════════════════
|
||
# OpenAI-compat backend
|
||
# ═══════════════════════════════════════════════════════════════════
|
||
|
||
def oa_input_to_messages(input_data):
|
||
msgs = []
|
||
tool_name_by_id = {}
|
||
if isinstance(input_data, str):
|
||
msgs.append({"role": "user", "content": input_data})
|
||
elif isinstance(input_data, list):
|
||
pending_tool_calls = []
|
||
last_flushed_ids = []
|
||
for item in input_data:
|
||
t = item.get("type")
|
||
if t == "function_call":
|
||
tcid = item.get("call_id") or item.get("id") or uid("tc")
|
||
pending_tool_calls.append(
|
||
{"id": tcid,
|
||
"type": "function",
|
||
"function": {"name": item.get("name", ""),
|
||
"arguments": item.get("arguments", "{}")}})
|
||
tool_name_by_id[tcid] = item.get("name", "")
|
||
continue
|
||
if pending_tool_calls:
|
||
last_flushed_ids = [tc["id"] for tc in pending_tool_calls]
|
||
msgs.append({"role": "assistant", "content": None, "tool_calls": pending_tool_calls})
|
||
pending_tool_calls = []
|
||
if t == "message":
|
||
role = item.get("role", "user")
|
||
if role == "developer":
|
||
role = "system"
|
||
text = ""
|
||
reasoning_text = ""
|
||
content = item.get("content", [])
|
||
if isinstance(content, str):
|
||
text = content
|
||
else:
|
||
for part in content:
|
||
if isinstance(part, str):
|
||
text += part
|
||
continue
|
||
pt = part.get("type", "")
|
||
if pt in ("input_text", "output_text"):
|
||
text += part.get("text", "")
|
||
elif pt in ("reasoning",):
|
||
for rp in part.get("content", []):
|
||
reasoning_text += rp.get("text", "")
|
||
elif pt == "input_image":
|
||
img = part.get("image_url", part)
|
||
msgs.append({"role": role, "content": [{"type": "text", "text": text},
|
||
{"type": "image_url", "image_url": img}]})
|
||
text = None
|
||
break
|
||
if text is not None:
|
||
msg = {"role": role, "content": text}
|
||
if reasoning_text and role == "assistant":
|
||
msg["reasoning_content"] = reasoning_text
|
||
msgs.append(msg)
|
||
elif t == "function_call_output":
|
||
tcid = item.get("call_id") or item.get("id") or ""
|
||
if not tcid and last_flushed_ids:
|
||
idx = len([m for m in msgs if m.get("role") == "tool"])
|
||
if idx < len(last_flushed_ids):
|
||
tcid = last_flushed_ids[idx]
|
||
msgs.append({"role": "tool", "tool_call_id": tcid,
|
||
"tool_name": tool_name_by_id.get(tcid, ""),
|
||
"content": item.get("output", "")})
|
||
if pending_tool_calls:
|
||
msgs.append({"role": "assistant", "content": None, "tool_calls": pending_tool_calls})
|
||
return msgs
|
||
|
||
def cc_input_to_messages(input_data, instructions="", schema=None):
|
||
"""Convert Responses API input into CommandCode /alpha/generate messages.
|
||
|
||
[FIX 1] All messages use STRING content (not content blocks).
|
||
CC API rejects params.messages[i].content when it's an array.
|
||
Tool results are role="user" with plain text content.
|
||
Tool calls: inline JSON text in assistant messages (e.g. {"type":"tool-call","id":"..."}).
|
||
|
||
The model echoes this format back in its response text-delta events.
|
||
_parse_commandcode_text_tool_calls extracts them via _extract_raw_json_tool_calls.
|
||
|
||
Schema parameter is accepted but not used for format decisions —
|
||
the conservative string-content format is always used regardless of schema hints.
|
||
"""
|
||
msgs = []
|
||
pending_tool_calls = []
|
||
last_flushed_ids = []
|
||
|
||
def text_from_content(content):
|
||
if isinstance(content, str):
|
||
return content
|
||
text = ""
|
||
for part in content or []:
|
||
if isinstance(part, str):
|
||
text += part
|
||
continue
|
||
if not isinstance(part, dict):
|
||
continue
|
||
if part.get("type") in ("input_text", "output_text", "text"):
|
||
text += part.get("text", "")
|
||
return text
|
||
|
||
def flush_tool_calls():
|
||
nonlocal pending_tool_calls, last_flushed_ids
|
||
if not pending_tool_calls:
|
||
return
|
||
last_flushed_ids = [tc["id"] for tc in pending_tool_calls]
|
||
# Tool calls as plain text in assistant message
|
||
tc_text = "\n".join(
|
||
json.dumps(tc, ensure_ascii=False) for tc in pending_tool_calls
|
||
)
|
||
msgs.append({"role": "assistant", "content": tc_text})
|
||
pending_tool_calls = []
|
||
|
||
if instructions:
|
||
msgs.append({"role": "user", "content": instructions})
|
||
|
||
if isinstance(input_data, str):
|
||
msgs.append({"role": "user", "content": input_data})
|
||
return msgs
|
||
if not isinstance(input_data, list):
|
||
return msgs
|
||
|
||
for item in input_data:
|
||
if not isinstance(item, dict):
|
||
continue
|
||
t = item.get("type")
|
||
if t == "function_call":
|
||
tcid = item.get("call_id") or item.get("id") or uid("call")
|
||
name = item.get("name") or "exec_command"
|
||
pending_tool_calls.append({
|
||
"type": "tool-call",
|
||
"id": tcid,
|
||
"name": name,
|
||
"arguments": item.get("arguments") or "{}",
|
||
})
|
||
continue
|
||
flush_tool_calls()
|
||
if t == "message":
|
||
role = item.get("role", "user")
|
||
if role not in ("user", "assistant"):
|
||
role = "user"
|
||
text = text_from_content(item.get("content", []))
|
||
msgs.append({"role": role, "content": text})
|
||
elif t == "function_call_output":
|
||
output = item.get("output", "")
|
||
if not isinstance(output, str):
|
||
output = json.dumps(output, ensure_ascii=False)
|
||
# /alpha/generate expects string content for ALL messages
|
||
msgs.append({"role": "user", "content": output[:8000]})
|
||
flush_tool_calls()
|
||
return msgs
|
||
|
||
def oa_convert_tools(tools, strict=False):
|
||
if not tools:
|
||
return None
|
||
out = []
|
||
for t in tools:
|
||
if t.get("type") != "function":
|
||
continue
|
||
fn = t.get("function", {})
|
||
name = ""
|
||
if fn:
|
||
name = (fn.get("name") or "").strip()
|
||
else:
|
||
name = (t.get("name") or "").strip()
|
||
if not name or name == "null":
|
||
continue
|
||
if fn:
|
||
entry = dict(t)
|
||
if strict and "strict" not in fn:
|
||
entry["function"] = dict(fn, strict=True)
|
||
out.append(entry)
|
||
else:
|
||
entry = {
|
||
"type": "function",
|
||
"function": {"name": name, "description": t.get("description", ""),
|
||
"parameters": t.get("parameters", {})}
|
||
}
|
||
if strict:
|
||
entry["function"]["strict"] = True
|
||
out.append(entry)
|
||
return out or None
|
||
|
||
def oa_resp_to_responses(chat_resp, model, resp_id=None):
|
||
choice = chat_resp["choices"][0]
|
||
msg = choice["message"]
|
||
content = msg.get("content") or ""
|
||
finish = choice.get("finish_reason", "stop")
|
||
fm = {"stop": "completed", "length": "incomplete", "tool_calls": "completed", "content_filter": "incomplete"}
|
||
status = fm.get(finish, "incomplete")
|
||
outputs = []
|
||
if content:
|
||
outputs.append({"type": "message", "id": uid("msg"), "role": "assistant", "status": "completed",
|
||
"content": [{"type": "output_text", "text": content, "annotations": []}]})
|
||
for tc in msg.get("tool_calls") or []:
|
||
fn = tc.get("function", {})
|
||
outputs.append({"type": "function_call", "id": uid("fc"), "call_id": tc.get("id"),
|
||
"name": fn.get("name"), "arguments": fn.get("arguments", "{}"), "status": "completed"})
|
||
usage = chat_resp.get("usage", {})
|
||
return {"id": resp_id or uid("resp"), "object": "response", "created": int(time.time()),
|
||
"model": model, "status": status, "output": outputs,
|
||
"usage": {"input_tokens": usage.get("prompt_tokens", 0),
|
||
"output_tokens": usage.get("completion_tokens", 0),
|
||
"total_tokens": usage.get("total_tokens", 0),
|
||
"input_tokens_details": {"cached_tokens": usage.get("prompt_tokens_details", {}).get("cached_tokens", 0)}}}
|
||
|
||
def oa_stream_to_sse(chat_stream, model, req_id, _reasoning_out=None):
|
||
resp_id = req_id or uid("resp")
|
||
msg_id = uid("msg")
|
||
text_buf = ""
|
||
reasoning_buf = ""
|
||
reasoning_opened = False
|
||
tc_buf = {}
|
||
fr = None
|
||
msg_opened = False
|
||
|
||
yield emit("response.created", {"type": "response.created",
|
||
"response": {"id": resp_id, "object": "response", "model": model,
|
||
"status": "in_progress", "created": int(time.time()), "output": []}})
|
||
yield emit("response.in_progress", {"type": "response.in_progress", "response": {"id": resp_id}})
|
||
|
||
for line in _stream_with_idle_timeout(chat_stream):
|
||
line = line.decode("utf-8", errors="replace").strip()
|
||
if not line or line.startswith(":") or line == "data: [DONE]":
|
||
continue
|
||
if not line.startswith("data: "):
|
||
continue
|
||
try:
|
||
chunk = json.loads(line[6:])
|
||
except json.JSONDecodeError:
|
||
continue
|
||
choices = chunk.get("choices", [])
|
||
if not choices:
|
||
continue
|
||
delta = choices[0].get("delta", {})
|
||
fr = choices[0].get("finish_reason")
|
||
|
||
rc = delta.get("reasoning_content") or delta.get("reasoning")
|
||
if rc:
|
||
if not reasoning_opened:
|
||
reasoning_opened = True
|
||
reasoning_buf += rc
|
||
yield emit("response.reasoning.delta", {"type": "response.reasoning.delta", "delta": rc})
|
||
|
||
content = delta.get("content")
|
||
if content:
|
||
if not msg_opened:
|
||
msg_id = uid("msg")
|
||
yield emit("response.output_item.added", {"type": "response.output_item.added",
|
||
"item": {"type": "message", "id": msg_id, "role": "assistant", "status": "in_progress", "content": []}})
|
||
yield emit("response.content_part.added", {"type": "response.content_part.added",
|
||
"part": {"type": "output_text", "text": "", "annotations": []}, "item_id": msg_id})
|
||
msg_opened = True
|
||
text_buf += content
|
||
yield emit("response.output_text.delta", {"type": "response.output_text.delta",
|
||
"delta": content, "item_id": msg_id, "content_index": 0})
|
||
|
||
for tc in delta.get("tool_calls") or []:
|
||
idx = tc.get("index", 0)
|
||
if idx not in tc_buf:
|
||
fid = uid("fc")
|
||
tc_buf[idx] = {"id": fid, "call_id": tc.get("id", fid), "name": "", "args": ""}
|
||
yield emit("response.output_item.added", {"type": "response.output_item.added",
|
||
"item": {"type": "function_call", "id": fid, "call_id": tc_buf[idx]["call_id"],
|
||
"name": "", "arguments": "", "status": "in_progress"}})
|
||
fn = tc.get("function", {})
|
||
if "name" in fn and fn["name"]:
|
||
tc_buf[idx]["name"] = fn["name"]
|
||
if "arguments" in fn and fn["arguments"]:
|
||
tc_buf[idx]["args"] += fn["arguments"]
|
||
yield emit("response.output_text.delta", {"type": "response.function_call_arguments.delta",
|
||
"delta": fn["arguments"], "item_id": tc_buf[idx]["id"]})
|
||
|
||
reasoning_rsn_id = uid("rsn") if reasoning_buf else None
|
||
if reasoning_opened:
|
||
yield emit("response.reasoning.done", {"type": "response.reasoning.done",
|
||
"item_id": reasoning_rsn_id, "text": reasoning_buf})
|
||
|
||
if msg_opened:
|
||
yield emit("response.output_text.done", {"type": "response.output_text.done",
|
||
"text": text_buf, "item_id": msg_id, "content_index": 0})
|
||
yield emit("response.content_part.done", {"type": "response.content_part.done",
|
||
"part": {"type": "output_text", "text": text_buf, "annotations": []}, "item_id": msg_id})
|
||
yield emit("response.output_item.done", {"type": "response.output_item.done",
|
||
"item": {"type": "message", "id": msg_id, "role": "assistant", "status": "completed",
|
||
"content": [{"type": "output_text", "text": text_buf, "annotations": []}]}})
|
||
|
||
for idx in sorted(tc_buf):
|
||
t = tc_buf[idx]
|
||
yield emit("response.function_call_arguments.done", {"type": "response.function_call_arguments.done",
|
||
"item_id": t["id"], "name": t["name"], "arguments": t["args"]})
|
||
yield emit("response.output_item.done", {"type": "response.output_item.done",
|
||
"item": {"type": "function_call", "id": t["id"], "call_id": t["call_id"],
|
||
"name": t["name"], "arguments": t["args"], "status": "completed"}})
|
||
|
||
fm = {"stop": "completed", "length": "incomplete", "tool_calls": "completed", "content_filter": "incomplete"}
|
||
status = fm.get(fr, "incomplete")
|
||
final_out = []
|
||
if reasoning_buf:
|
||
final_out.append({"type": "reasoning", "id": reasoning_rsn_id, "status": "completed",
|
||
"content": [{"type": "text", "text": reasoning_buf}]})
|
||
if msg_opened:
|
||
msg_content = []
|
||
if reasoning_buf:
|
||
msg_content.append({"type": "output_text", "text": text_buf, "annotations": []})
|
||
else:
|
||
msg_content.append({"type": "output_text", "text": text_buf, "annotations": []})
|
||
final_out.append({"type": "message", "id": msg_id, "role": "assistant", "status": "completed",
|
||
"content": msg_content})
|
||
for idx in sorted(tc_buf):
|
||
t = tc_buf[idx]
|
||
final_out.append({"type": "function_call", "id": t["id"], "call_id": t["call_id"],
|
||
"name": t["name"], "arguments": t["args"], "status": "completed"})
|
||
yield emit("response.completed", {"type": "response.completed",
|
||
"response": {"id": resp_id, "object": "response", "model": model,
|
||
"status": status, "created": int(time.time()), "output": final_out}})
|
||
if _reasoning_out is not None:
|
||
_reasoning_out["text"] = reasoning_buf
|
||
_reasoning_out["tool_calls"] = [tc_buf[i] for i in sorted(tc_buf)] if tc_buf else []
|
||
|
||
# ═══════════════════════════════════════════════════════════════════
|
||
# Anthropic backend
|
||
# ═══════════════════════════════════════════════════════════════════
|
||
|
||
def an_input_to_messages(input_data):
|
||
msgs = []
|
||
if isinstance(input_data, str):
|
||
msgs.append({"role": "user", "content": input_data})
|
||
elif isinstance(input_data, list):
|
||
for item in input_data:
|
||
t = item.get("type")
|
||
if t == "message":
|
||
role = item.get("role", "user")
|
||
if role == "developer":
|
||
role = "user"
|
||
text = ""
|
||
thinking_blocks = []
|
||
for part in item.get("content", []):
|
||
pt = part.get("type", "")
|
||
if pt in ("input_text", "output_text"):
|
||
text += part.get("text", "")
|
||
elif pt in ("reasoning", "thinking"):
|
||
thinking_text = ""
|
||
for rp in part.get("content", []):
|
||
thinking_text += rp.get("text", "")
|
||
if thinking_text:
|
||
thinking_blocks.append({"type": "thinking", "thinking": thinking_text, "signature": part.get("signature", "")})
|
||
if role == "assistant":
|
||
content_parts = []
|
||
if thinking_blocks:
|
||
content_parts.extend(thinking_blocks)
|
||
if text:
|
||
content_parts.append({"type": "text", "text": text})
|
||
msgs.append({"role": "assistant", "content": content_parts if content_parts else text})
|
||
else:
|
||
msgs.append({"role": "user", "content": text})
|
||
elif t == "function_call":
|
||
msgs.append({"role": "assistant", "content": [
|
||
{"type": "tool_use", "id": item.get("call_id", item.get("id", uid("tu"))),
|
||
"name": item.get("name", ""),
|
||
"input": json.loads(item.get("arguments", "{}"))}
|
||
]})
|
||
elif t == "function_call_output":
|
||
msgs.append({"role": "user", "content": [
|
||
{"type": "tool_result", "tool_use_id": item.get("id", ""),
|
||
"content": item.get("output", "")}
|
||
]})
|
||
return msgs
|
||
|
||
def an_convert_tools(tools):
|
||
if not tools:
|
||
return None
|
||
out = []
|
||
for t in tools:
|
||
if t.get("type") != "function":
|
||
continue
|
||
fn = t.get("function", {})
|
||
if fn:
|
||
out.append({"name": fn.get("name"), "description": fn.get("description", ""),
|
||
"input_schema": fn.get("parameters", {"type": "object", "properties": {}})})
|
||
else:
|
||
out.append({"name": t.get("name"), "description": t.get("description", ""),
|
||
"input_schema": t.get("parameters", {"type": "object", "properties": {}})})
|
||
return out or None
|
||
|
||
def an_resp_to_responses(anthro_resp, model, resp_id=None):
|
||
blocks = anthro_resp.get("content", [])
|
||
sr = anthro_resp.get("stop_reason", "end_turn")
|
||
sm = {"end_turn": "completed", "max_tokens": "incomplete", "stop_sequence": "completed", "tool_use": "completed"}
|
||
status = sm.get(sr, "incomplete")
|
||
outputs = []
|
||
for b in blocks:
|
||
bt = b.get("type", "")
|
||
if bt == "text":
|
||
outputs.append({"type": "message", "id": uid("msg"), "role": "assistant", "status": "completed",
|
||
"content": [{"type": "output_text", "text": b.get("text", ""), "annotations": []}]})
|
||
elif bt == "tool_use":
|
||
outputs.append({"type": "function_call", "id": uid("fc"), "call_id": b.get("id", ""),
|
||
"name": b.get("name", ""), "arguments": json.dumps(b.get("input", {})),
|
||
"status": "completed"})
|
||
elif bt == "thinking":
|
||
outputs.append({"type": "reasoning", "id": uid("rsn"), "status": "completed",
|
||
"content": [{"type": "text", "text": b.get("thinking", "")}]})
|
||
usage = anthro_resp.get("usage", {})
|
||
return {"id": resp_id or uid("resp"), "object": "response", "created": int(time.time()),
|
||
"model": model, "status": status, "output": outputs,
|
||
"usage": {"input_tokens": usage.get("input_tokens", 0),
|
||
"output_tokens": usage.get("output_tokens", 0),
|
||
"total_tokens": usage.get("input_tokens", 0) + usage.get("output_tokens", 0),
|
||
"input_tokens_details": {"cached_tokens": 0}}}
|
||
|
||
def an_stream_to_sse(stream, model, req_id):
|
||
resp_id = req_id or uid("resp")
|
||
completed = []
|
||
msg_id = uid("msg")
|
||
text_buf = ""
|
||
tc_id = None
|
||
tc_call_id = None
|
||
tc_name = ""
|
||
tc_args = ""
|
||
block_type = None
|
||
stop_reason = "end_turn"
|
||
|
||
yield emit("response.created", {"type": "response.created",
|
||
"response": {"id": resp_id, "object": "response", "model": model,
|
||
"status": "in_progress", "created": int(time.time()), "output": []}})
|
||
yield emit("response.in_progress", {"type": "response.in_progress", "response": {"id": resp_id}})
|
||
|
||
for raw in stream:
|
||
line = raw.decode("utf-8", errors="replace").strip()
|
||
if not line:
|
||
continue
|
||
if line.startswith("event: "):
|
||
evt_type = line[7:]
|
||
continue
|
||
if not line.startswith("data: "):
|
||
continue
|
||
try:
|
||
data = json.loads(line[6:])
|
||
except json.JSONDecodeError:
|
||
continue
|
||
|
||
et = data.get("type", "")
|
||
|
||
if et == "message_start":
|
||
pass
|
||
|
||
elif et == "content_block_start":
|
||
cb_type = data.get("content_block", {}).get("type", "")
|
||
block_type = cb_type
|
||
if cb_type == "text":
|
||
msg_id = uid("msg")
|
||
yield emit("response.output_item.added", {"type": "response.output_item.added",
|
||
"item": {"type": "message", "id": msg_id, "role": "assistant",
|
||
"status": "in_progress", "content": []}})
|
||
yield emit("response.content_part.added", {"type": "response.content_part.added",
|
||
"part": {"type": "output_text", "text": "", "annotations": []}, "item_id": msg_id})
|
||
elif cb_type == "tool_use":
|
||
cb = data.get("content_block", {})
|
||
tc_id = uid("fc")
|
||
tc_call_id = cb.get("id", tc_id)
|
||
tc_name = cb.get("name", "")
|
||
yield emit("response.output_item.added", {"type": "response.output_item.added",
|
||
"item": {"type": "function_call", "id": tc_id, "call_id": tc_call_id,
|
||
"name": tc_name, "arguments": "", "status": "in_progress"}})
|
||
elif cb_type == "thinking":
|
||
pass
|
||
|
||
elif et == "content_block_delta":
|
||
dd = data.get("delta", {})
|
||
dt = dd.get("type", "")
|
||
if dt == "text_delta":
|
||
txt = dd.get("text", "")
|
||
text_buf += txt
|
||
yield emit("response.output_text.delta", {"type": "response.output_text.delta",
|
||
"delta": txt, "item_id": msg_id, "content_index": 0})
|
||
elif dt == "input_json_delta":
|
||
pj = dd.get("partial_json", "")
|
||
tc_args += pj
|
||
yield emit("response.output_text.delta", {"type": "response.function_call_arguments.delta",
|
||
"delta": pj, "item_id": tc_id})
|
||
elif dt == "thinking_delta":
|
||
tk = dd.get("thinking", "")
|
||
yield emit("response.reasoning.delta", {"type": "response.reasoning.delta", "delta": tk})
|
||
|
||
elif et == "content_block_stop":
|
||
if block_type == "text":
|
||
yield emit("response.output_text.done", {"type": "response.output_text.done",
|
||
"text": text_buf, "item_id": msg_id, "content_index": 0})
|
||
yield emit("response.content_part.done", {"type": "response.content_part.done",
|
||
"part": {"type": "output_text", "text": text_buf, "annotations": []}, "item_id": msg_id})
|
||
yield emit("response.output_item.done", {"type": "response.output_item.done",
|
||
"item": {"type": "message", "id": msg_id, "role": "assistant", "status": "completed",
|
||
"content": [{"type": "output_text", "text": text_buf, "annotations": []}]}})
|
||
completed.append({"type": "message", "id": msg_id, "role": "assistant", "status": "completed",
|
||
"content": [{"type": "output_text", "text": text_buf, "annotations": []}]})
|
||
text_buf = ""
|
||
elif block_type == "tool_use":
|
||
yield emit("response.function_call_arguments.done", {"type": "response.function_call_arguments.done",
|
||
"item_id": tc_id, "name": tc_name, "arguments": tc_args})
|
||
yield emit("response.output_item.done", {"type": "response.output_item.done",
|
||
"item": {"type": "function_call", "id": tc_id, "call_id": tc_call_id,
|
||
"name": tc_name, "arguments": tc_args, "status": "completed"}})
|
||
completed.append({"type": "function_call", "id": tc_id, "call_id": tc_call_id,
|
||
"name": tc_name, "arguments": tc_args, "status": "completed"})
|
||
tc_id = None
|
||
tc_args = ""
|
||
block_type = None
|
||
|
||
elif et == "message_delta":
|
||
stop_reason = data.get("delta", {}).get("stop_reason", "end_turn")
|
||
|
||
elif et == "message_stop":
|
||
sm = {"end_turn": "completed", "max_tokens": "incomplete",
|
||
"stop_sequence": "completed", "tool_use": "completed"}
|
||
status = sm.get(stop_reason, "incomplete")
|
||
yield emit("response.completed", {"type": "response.completed",
|
||
"response": {"id": resp_id, "object": "response", "model": model,
|
||
"status": status, "created": int(time.time()), "output": completed}})
|
||
|
||
_DEFAULT_CC_CONFIG = {
|
||
"workingDir": "/tmp",
|
||
"date": "",
|
||
"environment": "linux",
|
||
"shell": "bash",
|
||
"files": [],
|
||
"structure": [],
|
||
"isGitRepo": False,
|
||
"currentBranch": "",
|
||
"mainBranch": "",
|
||
"gitStatus": "",
|
||
"recentCommits": [],
|
||
}
|
||
|
||
def _cc_config():
|
||
cfg = dict(_DEFAULT_CC_CONFIG)
|
||
cfg["date"] = time.strftime("%Y-%m-%d")
|
||
return cfg
|
||
|
||
def cc_convert_tools(tools):
|
||
return oa_convert_tools(tools)
|
||
|
||
def _strip_xmlish_tags(text):
|
||
return re.sub(r"<[^>]+>", "", text or "")
|
||
|
||
def _unwrap_cmd(cmd_val):
|
||
"""[FIX 11] Self-healing: unwrap double-wrapped cmd values.
|
||
|
||
Model sometimes generates: {"cmd": "{\"cmd\": \"actual_command\"}"}
|
||
Detect when cmd value is itself a JSON object with a nested "cmd" key,
|
||
and extract the real command string. Recursively unwraps up to 3 levels.
|
||
"""
|
||
if not isinstance(cmd_val, str) or not cmd_val.startswith("{"):
|
||
return cmd_val
|
||
for _ in range(3):
|
||
try:
|
||
inner = json.loads(cmd_val)
|
||
if isinstance(inner, dict) and "cmd" in inner and isinstance(inner["cmd"], str):
|
||
cmd_val = inner["cmd"]
|
||
else:
|
||
break
|
||
except Exception:
|
||
break
|
||
return cmd_val
|
||
|
||
def _build_explore_cmd(text_for_url):
|
||
"""Module-level explore command builder. Extracts repo URL from text,
|
||
builds a curl pipeline to fetch README, contents listing, and releases.
|
||
Used by _parse_commandcode_text_tool_calls (closure wrapper) and
|
||
cc_stream_to_sse (stuck recovery heuristic)."""
|
||
if not text_for_url:
|
||
return None, None
|
||
url_m = re.search(r"https?://[^\s\]'\\>\",]+", text_for_url)
|
||
repo_url = url_m.group(0).rstrip(")].,;'\\\"") if url_m else ""
|
||
if not repo_url and isinstance(text_for_url, str):
|
||
try:
|
||
_parsed = json.loads(text_for_url)
|
||
if isinstance(_parsed, list):
|
||
for _item in _parsed:
|
||
_c = _item.get("content", "") if isinstance(_item, dict) else str(_item)
|
||
url_m2 = re.search(r"https?://[^\s\]'\\>\",]+", _c)
|
||
if url_m2:
|
||
repo_url = url_m2.group(0).rstrip(")].,;'\\\"")
|
||
break
|
||
except Exception:
|
||
pass
|
||
if not repo_url:
|
||
return None, None
|
||
if repo_url.endswith(".git"):
|
||
repo_url = repo_url[:-4]
|
||
if "/api/v1/repos/" not in repo_url:
|
||
host_m = re.match(r"(https?://[^/]+)/(.*)", repo_url)
|
||
if host_m:
|
||
host, path = host_m.groups()
|
||
api_base = f"{host}/api/v1/repos/{path}"
|
||
else:
|
||
api_base = repo_url.replace("/admin/", "/api/v1/repos/")
|
||
else:
|
||
api_base = repo_url
|
||
cmd = (
|
||
f"cd /tmp && "
|
||
f"curl -sL --max-time 15 '{api_base}/contents/README.md' 2>/dev/null | "
|
||
f"python3 -c \"import sys,json,base64; d=json.load(sys.stdin); print(base64.b64decode(d['content']).decode())\" 2>/dev/null | head -600 && "
|
||
f"curl -sL --max-time 15 '{api_base}/contents' 2>/dev/null | python3 -c \"import sys,json; d=json.load(sys.stdin); print('\\n'.join(f'{{x.get(\'path\')}} {{x.get(\'type\')}}' for x in d[:50]))\" 2>/dev/null && "
|
||
f"curl -sL --max-time 15 '{api_base}/releases' 2>/dev/null | python3 -c \"import sys,json; d=json.load(sys.stdin); print(json.dumps(d[:3], indent=2)[:2000])\" 2>/dev/null"
|
||
)
|
||
return cmd, "Explore repository to understand the app and gather README, root contents, and releases for the landing page."
|
||
|
||
def _parse_commandcode_text_tool_calls(text):
|
||
"""Parse CommandCode's text-form tool calls into Responses function calls.
|
||
|
||
Handles THREE formats:
|
||
1. XML: ``<tool_call name="bash"><parameter name="command">...</parameter>`` (original)
|
||
2. Function: ``<function=bash>...</function>`` (original)
|
||
3. [FIX 5] Raw JSON inline: {"type":"tool-call","id":"...","name":"exec_command","arguments":"{...}"}
|
||
|
||
Format 3 exists because cc_input_to_messages sends tool calls as inline JSON text.
|
||
The CC model echoes this format back in its response.
|
||
Extraction is done by _extract_raw_json_tool_calls() which is appended after the
|
||
XML pattern loop. See that function for details on malformed-JSON handling.
|
||
|
||
Tolerant of: unescaped inner quotes, unbalanced braces, missing type/id fields,
|
||
sandbox_permissions at top level vs nested inside arguments, etc.
|
||
"""
|
||
calls = []
|
||
if not text:
|
||
return calls
|
||
|
||
_build_explore_cmd_local = _build_explore_cmd
|
||
|
||
# [FIX 17] DSML tool_call blocks used by the model now.
|
||
# Example:
|
||
# <||DSML||tool_calls>
|
||
# <||DSML||invoke name="exec">
|
||
# <||DSML||parameter name="command" string="true">curl ...</||DSML||parameter>
|
||
# <||DSML||parameter name="sandbox_permissions" string="true">require_escalated</||DSML||parameter>
|
||
# <||DSML||parameter name="justification" string="true">...</||DSML||parameter>
|
||
# <||DSML||parameter name="prefix_rule" string="true">["/bin/bash", "-lc", "curl ..."]</||DSML||parameter>
|
||
# </||DSML||invoke>
|
||
# </||DSML||tool_calls>
|
||
for m in re.finditer(r"<[^>]*tool_calls[^>]*>(.*?)</[^>]*tool_calls[^>]*>", text, re.DOTALL | re.IGNORECASE):
|
||
block = m.group(1) or ""
|
||
for im in re.finditer(r"<[^>]*invoke[^>]*name=\"([^\"]+)\"[^>]*>(.*?)</[^>]*invoke>", block, re.DOTALL | re.IGNORECASE):
|
||
raw_name = (im.group(1) or "").strip()
|
||
body = (im.group(2) or "").strip()
|
||
if not body:
|
||
continue
|
||
cmd = None
|
||
sandbox_permissions = None
|
||
justification = None
|
||
# Parameter tags are the canonical source.
|
||
for pm in re.finditer(r"<[^>]*parameter[^>]*name=\"([^\"]+)\"[^>]*>(.*?)</[^>]*parameter>", body, re.DOTALL | re.IGNORECASE):
|
||
key = (pm.group(1) or "").strip().lower()
|
||
val = _strip_xmlish_tags(pm.group(2)).strip()
|
||
# [FIX 21] Accept both "command" and "cmd" parameter names.
|
||
# The tool schema defines the parameter as "cmd" (see exec_command schema),
|
||
# but the model sometimes uses "command" (especially from prefix_rule fallback).
|
||
# Previously only "command" was accepted, so DSML blocks with name="cmd"
|
||
# were silently dropped — causing Codex CLI to stop mid-task.
|
||
if key in ("command", "cmd"):
|
||
cmd = val
|
||
elif key == "prefix_rule" and not cmd:
|
||
try:
|
||
pr_obj = json.loads(val)
|
||
except Exception:
|
||
pr_obj = None
|
||
if isinstance(pr_obj, list) and pr_obj and isinstance(pr_obj[-1], str):
|
||
cmd = pr_obj[-1]
|
||
elif key == "sandbox_permissions":
|
||
sandbox_permissions = val
|
||
elif key == "justification":
|
||
justification = val
|
||
|
||
# [FIX 20] Support explore / explore_agent in DSML blocks
|
||
is_explore = raw_name.lower() in ("explore", "explore_agent")
|
||
if is_explore:
|
||
explore_cmd, explore_just = _build_explore_cmd_local(body)
|
||
if explore_cmd:
|
||
cmd = explore_cmd
|
||
justification = explore_just
|
||
|
||
# Fallback: if the body contains a raw JSON command.
|
||
if not cmd:
|
||
jm = re.search(r'"(?:command|cmd)"\s*:\s*"((?:[^"\\]|\\.)*)"', body, re.DOTALL)
|
||
if jm:
|
||
cmd = jm.group(1).replace('\\n', '\n').replace('\\"', '"').strip()
|
||
if not cmd:
|
||
continue
|
||
# [FIX 19] Translate execute_request and other variations to exec_command (CLI only supports exec_command)
|
||
# [FIX 20] Translate explore and explore_agent to exec_command
|
||
tool_name = "exec_command" if raw_name.lower() in ("exec", "bash", "shell", "terminal", "run_command", "execute_request", "execute_command", "run_shell_command", "run_shell", "run", "explore", "explore_agent") else raw_name
|
||
args = {"cmd": _unwrap_cmd(cmd)}
|
||
if sandbox_permissions:
|
||
args["sandbox_permissions"] = sandbox_permissions if sandbox_permissions in ("use_default", "require_escalated", "with_user_approval") else "require_escalated"
|
||
if justification:
|
||
args["justification"] = justification
|
||
calls.append({
|
||
"full_match": m.group(0),
|
||
"name": tool_name,
|
||
"arguments": json.dumps(args, ensure_ascii=False),
|
||
})
|
||
|
||
# [FIX 16] Native <bash> blocks from CommandCode.
|
||
# Example:
|
||
# <bash>
|
||
# sandbox_permissions: require_escalated
|
||
# justification: ...
|
||
# prefix_rule: ["/bin/bash", "-lc", "curl ..."]
|
||
# </bash>
|
||
# Convert into exec_command calls by extracting the command from prefix_rule.
|
||
for m in re.finditer(r"<bash>(.*?)</bash>", text, re.DOTALL | re.IGNORECASE):
|
||
body = (m.group(1) or "").strip()
|
||
if not body:
|
||
continue
|
||
sandbox_permissions = None
|
||
justification = None
|
||
cmd = None
|
||
# Try line-oriented parsing first.
|
||
for line in body.splitlines():
|
||
s = line.strip()
|
||
if s.lower().startswith("sandbox_permissions:"):
|
||
sandbox_permissions = s.split(":", 1)[1].strip()
|
||
elif s.lower().startswith("justification:"):
|
||
justification = s.split(":", 1)[1].strip()
|
||
elif s.lower().startswith("prefix_rule:"):
|
||
pr = s.split(":", 1)[1].strip()
|
||
try:
|
||
pr_obj = json.loads(pr)
|
||
except Exception:
|
||
pr_obj = None
|
||
if isinstance(pr_obj, list) and pr_obj:
|
||
# If the last arg exists, it is typically the shell command.
|
||
cmd = pr_obj[-1] if isinstance(pr_obj[-1], str) else None
|
||
elif pr.startswith("[") and pr.endswith("]"):
|
||
parts = re.findall(r'"((?:[^"\\]|\\.)*)"', pr)
|
||
if parts:
|
||
cmd = parts[-1].encode().decode("unicode_escape")
|
||
# Fallback: grab a shell-looking line if prefix_rule wasn't parseable.
|
||
if not cmd:
|
||
for line in body.splitlines():
|
||
s = line.strip()
|
||
if re.match(r"^(curl|wget|python3?|node|npm|pnpm|yarn|cat|ls|find|grep|rg|sed|awk|git|mkdir|touch|printf|echo)\b", s):
|
||
cmd = s
|
||
break
|
||
if not cmd:
|
||
continue
|
||
args = {"cmd": cmd}
|
||
if sandbox_permissions:
|
||
args["sandbox_permissions"] = sandbox_permissions if sandbox_permissions in ("use_default", "require_escalated", "with_user_approval") else "require_escalated"
|
||
if justification:
|
||
args["justification"] = justification
|
||
calls.append({
|
||
"full_match": m.group(0),
|
||
"name": "exec_command",
|
||
"arguments": json.dumps(args, ensure_ascii=False),
|
||
})
|
||
|
||
# [FIX 15] Native <explore_agent> blocks from CommandCode.
|
||
# Format seen in logs:
|
||
# <explore_agent>\nmessages: [{...}]\n</explore_agent>
|
||
# Treat as an assistant-requested agent call so the loop can continue.
|
||
for m in re.finditer(r"<explore_agent>(.*?)</explore_agent>|<explore_agent>\s*messages:\s*(\[.*?\])", text, re.DOTALL | re.IGNORECASE):
|
||
body = m.group(1) or m.group(2) or ""
|
||
body = body.strip()
|
||
msgs = None
|
||
if body:
|
||
try:
|
||
msgs = json.loads(body) if body.startswith("[") else None
|
||
except Exception:
|
||
msgs = None
|
||
if msgs is None and body:
|
||
mm = re.search(r"(\[.*\])", body, re.DOTALL)
|
||
if mm:
|
||
try:
|
||
msgs = json.loads(mm.group(1))
|
||
except Exception:
|
||
msgs = None
|
||
if msgs is None:
|
||
msgs = body
|
||
text_for_url = body if isinstance(body, str) else json.dumps(body, ensure_ascii=False)
|
||
cmd, justification = _build_explore_cmd_local(text_for_url)
|
||
if not cmd:
|
||
cmd = "echo 'explore_agent: unable to extract repository URL'"
|
||
justification = "Fallback for explore_agent block without URL."
|
||
args = {"cmd": cmd}
|
||
if justification:
|
||
args["justification"] = justification
|
||
calls.append({
|
||
"full_match": m.group(0),
|
||
"name": "exec_command",
|
||
"arguments": json.dumps(args, ensure_ascii=False),
|
||
})
|
||
|
||
if not calls and text.count("<explore_agent>") >= 2:
|
||
url_m = re.search(r"https?://[^\s\]'\\>\"]+", text)
|
||
if not url_m:
|
||
for prev_url in _last_user_urls:
|
||
url_m = re.search(r"https?://[^\s\]'\\>\"]+", prev_url)
|
||
if url_m:
|
||
break
|
||
if url_m:
|
||
explore_url = url_m.group(0).rstrip(")].,;'\\")
|
||
cmd, justification = _build_explore_cmd_local(explore_url)
|
||
if cmd:
|
||
calls.append({
|
||
"full_match": "<explore_agent>...",
|
||
"name": "exec_command",
|
||
"arguments": json.dumps({"cmd": cmd, "justification": justification or "Explore repository"}, ensure_ascii=False),
|
||
})
|
||
|
||
# [FIX 24] Handle <require_escalation> and <request_escalation_permission> blocks.
|
||
# The model produces these when it wants elevated permissions but the CC
|
||
# adapter doesn't support them. Synthesize a proceed command so the loop continues.
|
||
if not calls:
|
||
for m in re.finditer(r"<(?:require_escalation|request_escalation_permission)>(.*?)</(?:require_escalation|request_escalation_permission)>", text, re.DOTALL | re.IGNORECASE):
|
||
body_escal = (m.group(1) or "").strip()
|
||
_inner_url_m = re.search(r"https?://[^\s\]'\\>\",]+", body_escal)
|
||
if _inner_url_m:
|
||
_e_url = _inner_url_m.group(0).rstrip(")].,;'\\\"")
|
||
_e_cmd, _e_just = _build_explore_cmd_local(_e_url)
|
||
if _e_cmd:
|
||
calls.append({
|
||
"full_match": m.group(0),
|
||
"name": "exec_command",
|
||
"arguments": json.dumps({"cmd": _e_cmd, "justification": _e_just or "Escalation block with URL — auto-proceed"}, ensure_ascii=False),
|
||
})
|
||
continue
|
||
if not calls:
|
||
calls.append({
|
||
"full_match": m.group(0),
|
||
"name": "exec_command",
|
||
"arguments": json.dumps({"cmd": "echo 'escalation: auto-proceeding — no specific command in escalation block'", "justification": "Auto-proceed past escalation request"}, ensure_ascii=False),
|
||
})
|
||
|
||
# [FIX 24b] Bare <require_escalation ... /> or <request_escalation_permission ... />
|
||
# without closing tags. Just auto-proceed.
|
||
if not calls and re.search(r"<(?:require_escalation|request_escalation_permission)[\s/>]", text, re.IGNORECASE):
|
||
calls.append({
|
||
"full_match": "<escalation_bare/>",
|
||
"name": "exec_command",
|
||
"arguments": json.dumps({"cmd": "echo 'escalation: auto-proceeding past bare escalation tag'", "justification": "Auto-proceed past bare escalation tag"}, ensure_ascii=False),
|
||
})
|
||
|
||
patterns = [
|
||
r"<tool_call(?:\s+name=['\"]?([^'\">\s]+)['\"]?)?>(.*?)</tool_call[)]?>",
|
||
r"<function=(\w+)>(.*?)</function>",
|
||
# [FIX 14] CC model actual output: <tool_call type="bash">\n{"command":"...", "description":"..."}
|
||
# No </tool_call) closing tag — body is a raw JSON object
|
||
r"<tool_call(?:\s+type=['\"]?(\w+)['\"]?)?>\s*(\{.*?\})(?:\s*</tool_call)?",
|
||
]
|
||
|
||
def _find_balanced_brace(text, start):
|
||
"""Find the closing brace matching text[start], handling quoted strings."""
|
||
if start >= len(text) or text[start] != '{':
|
||
return -1
|
||
depth = 0
|
||
i = start
|
||
in_str = False
|
||
escape = False
|
||
while i < len(text):
|
||
ch = text[i]
|
||
if escape:
|
||
escape = False
|
||
elif ch == '\\':
|
||
escape = True
|
||
elif ch == '"':
|
||
in_str = not in_str
|
||
elif not in_str:
|
||
if ch == '{':
|
||
depth += 1
|
||
elif ch == '}':
|
||
depth -= 1
|
||
if depth == 0:
|
||
return i
|
||
i += 1
|
||
return -1
|
||
|
||
def _extract_field(text, key, end_chars=',}'):
|
||
"""Extract a field value after "key": in rough JSON text.
|
||
|
||
[FIX 7] Handles values starting with \" (backslash-quote) which occurs when
|
||
the model generates properly-escaped JSON inside a string value.
|
||
Without this fix, _extract_field returns None for escaped values,
|
||
causing sandbox_permissions/justification to not be extracted from
|
||
the parsed args dict (falling through to raw snippet extraction).
|
||
|
||
Also tolerant of unescaped quotes inside string values.
|
||
Returns None if key not found or value is empty.
|
||
"""
|
||
pat = re.compile(r'"' + re.escape(key) + r'"\s*:\s*', re.DOTALL)
|
||
m = pat.search(text)
|
||
if not m:
|
||
return None
|
||
val_start = m.end()
|
||
# Skip leading backslash-escape if the value starts with \" (nested JSON string)
|
||
if val_start < len(text) and text[val_start] == '\\':
|
||
val_start += 1
|
||
# Check if value is a string
|
||
if val_start < len(text) and text[val_start] == '"':
|
||
s = val_start + 1
|
||
buf = []
|
||
while s < len(text):
|
||
ch = text[s]
|
||
if ch == '\\' and s + 1 < len(text):
|
||
buf.append(text[s+1])
|
||
s += 2
|
||
elif ch == '"':
|
||
return ''.join(buf)
|
||
elif ch in end_chars and not buf:
|
||
return None
|
||
else:
|
||
buf.append(ch)
|
||
s += 1
|
||
return ''.join(buf)
|
||
# Object value: find balanced brace
|
||
if val_start < len(text) and text[val_start] == '{':
|
||
end = _find_balanced_brace(text, val_start)
|
||
if end > val_start:
|
||
return text[val_start:end+1]
|
||
return None
|
||
|
||
def _extract_args(text):
|
||
"""Extract arguments value from tool-call JSON, handling multiple malformed formats.
|
||
|
||
[FIX 6] THREE-TIER PARSER — solves double-wrapped arguments bug:
|
||
Model generates arguments in TWO different escaped forms:
|
||
A) Unescaped: "arguments": "{"cmd": "curl ...", "sp": "allow_all"}"
|
||
→ naive brace-counting finds boundaries correctly
|
||
B) Escaped: "arguments": "{\\"cmd\\": \\"curl...\\"}"
|
||
→ json.loads fails on \\ at structural level
|
||
→ unescape \\" → " and retry
|
||
→ unicode_escape decode and retry
|
||
|
||
Returns the raw JSON string (after best-effort unescaping).
|
||
Caller does json.loads() on the result.
|
||
If all 3 tiers fail, returns raw text (caller handles as fallback).
|
||
"""
|
||
m = re.search(r'"(?:arguments|input)"\s*:\s*"?', text)
|
||
if not m:
|
||
return None
|
||
start = m.end()
|
||
if start < len(text) and text[start] == '"':
|
||
start += 1
|
||
if start >= len(text) or text[start] != '{':
|
||
return None
|
||
depth = 0
|
||
i = start
|
||
while i < len(text):
|
||
ch = text[i]
|
||
if ch == '{':
|
||
depth += 1
|
||
elif ch == '}':
|
||
depth -= 1
|
||
if depth == 0:
|
||
raw = text[start:i+1]
|
||
|
||
# Try JSON.parse as-is
|
||
try:
|
||
json.loads(raw)
|
||
return raw
|
||
except json.JSONDecodeError:
|
||
pass
|
||
|
||
# Try after unescaping inner \" -> "
|
||
unescaped = raw.replace('\\"', '"')
|
||
try:
|
||
json.loads(unescaped)
|
||
return unescaped
|
||
except json.JSONDecodeError:
|
||
pass
|
||
|
||
# Try after also unescaping \\n -> \n etc
|
||
try:
|
||
fixed = raw.encode().decode('unicode_escape')
|
||
json.loads(fixed)
|
||
return fixed
|
||
except Exception:
|
||
pass
|
||
|
||
# Give up — return raw text
|
||
return raw
|
||
i += 1
|
||
return None
|
||
|
||
def _extract_raw_json_tool_calls(t):
|
||
"""[FIX 5] Extract raw JSON tool-call objects from free text.
|
||
|
||
Finds "type":"tool-call" (or tool_call/function_call) in text, then extracts
|
||
name/id/arguments/sandbox_permissions/justification via field-level regex.
|
||
|
||
Delegates to _extract_args() for the arguments field (handles unescaped + escaped JSON).
|
||
Delegates to _extract_field() for name/id/sandbox_permissions/justification
|
||
(with FIX 7 for leading-\ handling).
|
||
|
||
Normalizes sandbox_permissions to valid values (use_default|require_escalated|with_user_approval)
|
||
[FIX 6] Prevents double-wrapped args: {"cmd": "{\"cmd\": \"curl...\"}"}
|
||
"""
|
||
results = []
|
||
idx = 0
|
||
while True:
|
||
m = re.search(r'"type"\s*:\s*"(tool-call|tool_call|function_call)"', t[idx:])
|
||
if not m:
|
||
break
|
||
tc_pos = idx + m.start()
|
||
snippet = t[tc_pos:]
|
||
idx = tc_pos + 1
|
||
tc_type = m.group(1)
|
||
tc_name = _extract_field(snippet, "name")
|
||
if not tc_name:
|
||
continue
|
||
tc_id = _extract_field(snippet, "id")
|
||
|
||
# [FIX 20] Support explore / explore_agent in raw JSON tool calls
|
||
is_explore = tc_name.lower() in ("explore", "explore_agent")
|
||
|
||
if is_explore:
|
||
# Build explore command from the whole snippet/arguments
|
||
explore_cmd, explore_just = _build_explore_cmd_local(snippet)
|
||
if explore_cmd:
|
||
args = {"cmd": explore_cmd}
|
||
if explore_just:
|
||
args["justification"] = explore_just
|
||
else:
|
||
args = {"cmd": "echo 'explore: unable to extract repository URL'", "justification": "Fallback for explore tool call without URL."}
|
||
tool_name = "exec_command"
|
||
else:
|
||
# [FIX 19] Translate execute_request and other variations to exec_command (CLI only supports exec_command)
|
||
tool_name = "exec_command" if tc_name.lower() in ("exec", "bash", "shell", "terminal", "run_command", "execute_request", "execute_command", "run_shell_command", "run_shell", "run") else tc_name
|
||
args_raw = _extract_args(snippet) or _extract_field(snippet, "arguments") or _extract_field(snippet, "input") or "{}"
|
||
try:
|
||
args = json.loads(args_raw) if args_raw.startswith('{') else {"cmd": args_raw}
|
||
except Exception:
|
||
args = {"cmd": args_raw}
|
||
if "cmd" not in args or not args["cmd"]:
|
||
args["cmd"] = str(args)
|
||
# [FIX 11] Self-healing: unwrap double-wrapped cmd values
|
||
args["cmd"] = _unwrap_cmd(args.get("cmd", ""))
|
||
|
||
# Normalize sandbox_permissions to valid values
|
||
_VALID_SP = frozenset({"use_default", "require_escalated", "with_user_approval"})
|
||
if "sandbox_permissions" in args:
|
||
spv = args["sandbox_permissions"]
|
||
if isinstance(spv, dict):
|
||
args["sandbox_permissions"] = "require_escalated" if spv.get("require_escalated") else "use_default"
|
||
elif isinstance(spv, str) and spv not in _VALID_SP:
|
||
args["sandbox_permissions"] = "require_escalated"
|
||
else:
|
||
# Fallback: extract from raw snippet (model puts it at top level)
|
||
sp_raw = _extract_field(snippet, "sandbox_permissions")
|
||
if sp_raw:
|
||
try:
|
||
sp_obj = json.loads(sp_raw) if sp_raw.startswith('{') else {"require_escalated": bool(sp_raw)}
|
||
if isinstance(sp_obj, dict) and sp_obj.get("require_escalated"):
|
||
args["sandbox_permissions"] = "require_escalated"
|
||
except Exception:
|
||
pass
|
||
if "justification" not in args:
|
||
just_raw = _extract_field(snippet, "justification")
|
||
if just_raw:
|
||
args["justification"] = just_raw
|
||
results.append({
|
||
"full_match": snippet,
|
||
"name": tool_name,
|
||
"arguments": json.dumps(args, ensure_ascii=False),
|
||
})
|
||
return results
|
||
|
||
for pat in patterns:
|
||
for m in re.finditer(pat, text, re.DOTALL | re.IGNORECASE):
|
||
if pat.startswith("<function"):
|
||
raw_name = m.group(1)
|
||
body = m.group(2)
|
||
else:
|
||
raw_name = m.group(1) or ""
|
||
body = m.group(2)
|
||
nm = re.search(r"<tool\s+name=[\"']?([^\"'>\s]+)", body, re.IGNORECASE)
|
||
raw_name = raw_name or (nm.group(1) if nm else "bash")
|
||
params = {}
|
||
body_stripped = body.strip()
|
||
if body_stripped.startswith("{"):
|
||
try:
|
||
obj = json.loads(body_stripped)
|
||
cmd = obj.get("command") or obj.get("cmd") or ""
|
||
cmd = _unwrap_cmd(cmd) # [FIX 11]
|
||
if cmd:
|
||
# [FIX 19] Translate execute_request and other variations to exec_command (CLI only supports exec_command)
|
||
tool_name = "exec_command" if raw_name.lower() in ("exec", "bash", "shell", "terminal", "run_command", "execute_request", "execute_command", "run_shell_command", "run_shell", "run") else raw_name
|
||
args = {"cmd": cmd}
|
||
sp = obj.get("sandbox_permissions")
|
||
if isinstance(sp, dict) and sp.get("require_escalated"):
|
||
args["sandbox_permissions"] = "require_escalated"
|
||
elif isinstance(sp, str):
|
||
args["sandbox_permissions"] = sp
|
||
if obj.get("justification"):
|
||
args["justification"] = obj.get("justification")
|
||
calls.append({"full_match": m.group(0), "name": tool_name, "arguments": json.dumps(args)})
|
||
continue
|
||
except Exception:
|
||
pass
|
||
for pm in re.finditer(r"<parameter(?:\s+name=[\"']?(\w+)[\"']?|=(\w+))>(.*?)</parameter>", body, re.DOTALL | re.IGNORECASE):
|
||
key = pm.group(1) or pm.group(2) or "text"
|
||
params[key] = _strip_xmlish_tags(pm.group(3)).strip()
|
||
|
||
# [FIX 20] Support explore / explore_agent in XML tool calls
|
||
is_explore = raw_name.lower() in ("explore", "explore_agent")
|
||
if is_explore:
|
||
explore_cmd, explore_just = _build_explore_cmd_local(body)
|
||
if explore_cmd:
|
||
cmd = explore_cmd
|
||
params["justification"] = explore_just
|
||
else:
|
||
cmd = ""
|
||
else:
|
||
cmd = params.get("command") or params.get("cmd") or ""
|
||
|
||
if not cmd and body_stripped.startswith("{"):
|
||
cm = re.search(r'"(?:command|cmd)"\s*:\s*"(.*?)"\s*,\s*"(?:sandbox_permissions|justification|prefix_rule)"', body, re.DOTALL)
|
||
if not cm:
|
||
cm = re.search(r'"(?:command|cmd)"\s*:\s*"(.*?)"\s*}', body, re.DOTALL)
|
||
if cm:
|
||
cmd = cm.group(1)
|
||
cmd = cmd.replace('\\n', '\n').replace('\\"', '"').strip()
|
||
cmd = _unwrap_cmd(cmd) # [FIX 11]
|
||
if re.search(r'"sandbox_permissions"\s*:\s*\{\s*"require_escalated"\s*:\s*true\s*\}', body, re.DOTALL):
|
||
params["sandbox_permissions"] = "require_escalated"
|
||
jm = re.search(r'"justification"\s*:\s*"(.*?)"\s*(?:,|})', body, re.DOTALL)
|
||
if jm:
|
||
params["justification"] = jm.group(1).replace('\\n', '\n').replace('\\"', '"').strip()
|
||
if not cmd:
|
||
stripped = _strip_xmlish_tags(body)
|
||
lines = [ln.strip() for ln in stripped.splitlines() if ln.strip()]
|
||
for i, ln in enumerate(lines):
|
||
if re.match(r"^(curl|wget|python3?|node|npm|pnpm|yarn|cat|ls|find|grep|rg|sed|awk|git|mkdir|touch|printf|echo)\b", ln):
|
||
cmd = "\n".join(lines[i:])
|
||
break
|
||
if not cmd and lines:
|
||
cmd = "\n".join(lines)
|
||
if not cmd:
|
||
continue
|
||
# [FIX 19] Translate execute_request and other variations to exec_command (CLI only supports exec_command)
|
||
# [FIX 20] Translate explore and explore_agent to exec_command
|
||
tool_name = "exec_command" if raw_name.lower() in ("exec", "bash", "shell", "terminal", "run_command", "execute_request", "execute_command", "run_shell_command", "run_shell", "run", "explore", "explore_agent") else raw_name
|
||
args = {"cmd": _unwrap_cmd(cmd)} # [FIX 11] all paths must unwrap
|
||
if params.get("sandbox_permissions"):
|
||
args["sandbox_permissions"] = params["sandbox_permissions"]
|
||
if params.get("justification"):
|
||
args["justification"] = params["justification"]
|
||
calls.append({"full_match": m.group(0), "name": tool_name, "arguments": json.dumps(args)})
|
||
|
||
# Also extract raw JSON tool-call objects embedded in free text
|
||
calls.extend(_extract_raw_json_tool_calls(text))
|
||
|
||
# [FIX 18] Native <todo_write> blocks from the model (used for checklist/task tracking)
|
||
# The model outputs a task checklist in a custom <todo_write> XML tag block:
|
||
# <todo_write>
|
||
# <todos>[{"id":"1","status":"in_progress","description":"..."}]</todos>
|
||
# </todo_write>
|
||
# We parse this and map it to a standard 'TodoWrite' tool call so the CLI agent loop continues execution.
|
||
for m in re.finditer(r"<todo_write>(.*?)</todo_write>", text, re.DOTALL | re.IGNORECASE):
|
||
body = (m.group(1) or "").strip()
|
||
if not body:
|
||
continue
|
||
todos_match = re.search(r"<todos>(.*?)</todos>", body, re.DOTALL | re.IGNORECASE)
|
||
if not todos_match:
|
||
continue
|
||
raw_todos_json = todos_match.group(1).strip()
|
||
try:
|
||
raw_todos = json.loads(raw_todos_json)
|
||
except Exception as e:
|
||
print(f"[translate-proxy] [FIX 18] Failed to parse <todos> JSON: {e}", file=sys.stderr)
|
||
raw_todos = None
|
||
if isinstance(raw_todos, list):
|
||
parsed_todos = []
|
||
for item in raw_todos:
|
||
if isinstance(item, dict):
|
||
desc = item.get("description") or item.get("content") or ""
|
||
parsed_todos.append({
|
||
"content": desc,
|
||
"activeForm": item.get("activeForm") or desc,
|
||
"status": item.get("status") or "pending"
|
||
})
|
||
calls.append({
|
||
"full_match": m.group(0),
|
||
"name": "TodoWrite",
|
||
"arguments": json.dumps({"todos": parsed_todos}, ensure_ascii=False)
|
||
})
|
||
|
||
# [FIX 11] Self-healing: last-chance sanitization pass on ALL extracted calls
|
||
calls = _sanitize_tool_calls(calls)
|
||
return calls
|
||
|
||
def _sanitize_tool_calls(calls):
|
||
"""[FIX 11/T3] Post-extraction self-healing validation layer.
|
||
|
||
Runs AFTER all extraction paths (XML, raw JSON, regex) have produced their
|
||
tool calls. This is the final safety net before calls are returned to the
|
||
streaming/response builder.
|
||
|
||
Validates and repairs:
|
||
- Double/triple-wrapped cmd values (recursive unwrap)
|
||
- cmd that looks like JSON object/string instead of shell command
|
||
- cmd containing escaped newlines or quotes that would break bash
|
||
- Empty or whitespace-only cmd → replaced with diagnostic string
|
||
|
||
Logs warnings for any repair made (visible in stderr/proxy logs).
|
||
Returns sanitized list (may be shorter if irreparable calls are dropped).
|
||
"""
|
||
cleaned = []
|
||
for i, call in enumerate(calls):
|
||
# [FIX 18] Skip sanitization pass for non-shell tool calls (e.g., TodoWrite)
|
||
# Sanitization specifically validates and repairs command shell executions (the 'cmd' argument).
|
||
# Running it on other tools without a 'cmd' parameter (like TodoWrite) would falsely flag
|
||
# them as containing JSON garbage or empty commands, corrupting their actual parameters.
|
||
if call.get("name") != "exec_command":
|
||
cleaned.append(call)
|
||
continue
|
||
|
||
try:
|
||
args_raw = call.get("arguments", "{}")
|
||
if isinstance(args_raw, str):
|
||
args = json.loads(args_raw)
|
||
else:
|
||
args = dict(args_raw)
|
||
except Exception:
|
||
cleaned.append(call)
|
||
continue
|
||
cmd = args.get("cmd", "")
|
||
repaired = False
|
||
|
||
# Detect and unwrap nested JSON cmd values (up to 4 levels deep)
|
||
unwrapped = _unwrap_cmd(cmd)
|
||
if unwrapped != cmd:
|
||
cmd = unwrapped
|
||
args["cmd"] = cmd
|
||
repaired = True
|
||
|
||
# Detect cmd that is still a JSON object (unwrap missed it or deeper nesting)
|
||
if isinstance(cmd, str) and cmd.strip().startswith("{"):
|
||
try:
|
||
inner = json.loads(cmd)
|
||
if isinstance(inner, dict):
|
||
for key in ("cmd", "command", "c"):
|
||
if key in inner and isinstance(inner[key], str):
|
||
args["cmd"] = inner[key]
|
||
repaired = True
|
||
break
|
||
except Exception:
|
||
pass
|
||
|
||
# Detect cmd that looks like a JSON-encoded string with backslash escapes
|
||
_cmd = args.get("cmd", "")
|
||
if _cmd and ('\\"' in _cmd or "\\n" in _cmd or _cmd.count("{") > _cmd.count("}")):
|
||
try:
|
||
decoded = _cmd.encode().decode("unicode_escape")
|
||
if decoded != _cmd and not decoded.startswith("{"):
|
||
args["cmd"] = decoded
|
||
repaired = True
|
||
except Exception:
|
||
pass
|
||
|
||
# Final guard: if cmd is empty or just JSON garbage, make it obvious
|
||
_final_cmd = args.get("cmd", "")
|
||
if not _final_cmd or _final_cmd.strip() in ("{}", "null", "None", ""):
|
||
_safe_preview = args_raw[:200].replace('"', "'").replace('\\', '/')
|
||
args["cmd"] = f"# [CC-SANITIZER] empty cmd recovered from: {_safe_preview}"
|
||
repaired = True
|
||
elif _final_cmd.startswith("{") and len(_final_cmd) < 500:
|
||
# Still looks like JSON — likely unrecoverable, flag it
|
||
_safe_preview = _final_cmd.replace('"', "'").replace('\\', '/')
|
||
args["cmd"] = f"# [CC-SANITIZER] suspicious cmd (still JSON): {_safe_preview}"
|
||
repaired = True
|
||
|
||
if repaired:
|
||
print(f"[translate-proxy] [CC-SANITIZER] repaired tool call #{i}: "
|
||
f"name={call.get('name')} cmd_preview={str(args.get('cmd',''))[:120]}",
|
||
file=sys.stderr)
|
||
|
||
call["arguments"] = json.dumps(args, ensure_ascii=False)
|
||
cleaned.append(call)
|
||
|
||
return cleaned
|
||
|
||
def _parse_cc_line(line):
|
||
"""Parse a raw line from CommandCode /alpha/generate, stripping SSE data: prefix."""
|
||
stripped = line.strip()
|
||
if not stripped:
|
||
return None
|
||
if stripped.startswith("data: "):
|
||
stripped = stripped[6:]
|
||
elif stripped.startswith("data:"):
|
||
stripped = stripped[5:]
|
||
if not stripped or stripped == "[DONE]":
|
||
return None
|
||
try:
|
||
return json.loads(stripped)
|
||
except json.JSONDecodeError:
|
||
return None
|
||
|
||
|
||
def _iter_cc_events(stream):
|
||
"""Yield parsed JSON events from a CommandCode /alpha/generate stream.
|
||
Handles raw JSON lines, SSE data: events, and multi-event chunks.
|
||
"""
|
||
buf = ""
|
||
for chunk in _stream_with_idle_timeout(stream):
|
||
buf += chunk.decode("utf-8", errors="replace")
|
||
while "\n" in buf:
|
||
line, buf = buf.split("\n", 1)
|
||
d = _parse_cc_line(line)
|
||
if d is not None:
|
||
yield d
|
||
# Process remaining buffer (non-streaming single-JSON response)
|
||
if buf.strip():
|
||
if buf.strip().startswith("{"):
|
||
d = _parse_cc_line(buf)
|
||
if d is not None:
|
||
yield d
|
||
else:
|
||
for line in buf.strip().split("\n"):
|
||
d = _parse_cc_line(line)
|
||
if d is not None:
|
||
yield d
|
||
|
||
|
||
def cc_resp_to_responses(cc_lines, model, resp_id=None):
|
||
text = ""
|
||
usage = {}
|
||
if isinstance(cc_lines, str):
|
||
cc_lines = [cc_lines]
|
||
for line in cc_lines:
|
||
d = _parse_cc_line(line)
|
||
if d is None:
|
||
continue
|
||
t = d.get("type", "")
|
||
if t == "text-delta":
|
||
text += d.get("text", "")
|
||
elif t == "finish-step":
|
||
u = d.get("usage", {})
|
||
usage = {
|
||
"input_tokens": u.get("inputTokens", 0),
|
||
"output_tokens": u.get("outputTokens", 0),
|
||
"total_tokens": u.get("inputTokens", 0) + u.get("outputTokens", 0),
|
||
}
|
||
outputs = []
|
||
if text:
|
||
outputs.append({"type": "message", "id": uid("msg"), "role": "assistant",
|
||
"status": "completed",
|
||
"content": [{"type": "output_text", "text": text, "annotations": []}]})
|
||
return {"id": resp_id or uid("resp"), "object": "response", "created": int(time.time()),
|
||
"model": model, "status": "completed", "output": outputs,
|
||
"usage": {"input_tokens": usage.get("input_tokens", 0),
|
||
"output_tokens": usage.get("output_tokens", 0),
|
||
"total_tokens": usage.get("total_tokens", 0),
|
||
"input_tokens_details": {"cached_tokens": 0}}}
|
||
|
||
def cc_stream_to_sse(cc_stream, model, req_id):
|
||
resp_id = req_id or uid("resp")
|
||
msg_id = uid("msg")
|
||
text_buf = ""
|
||
|
||
yield emit("response.created", {"type": "response.created",
|
||
"response": {"id": resp_id, "object": "response", "model": model,
|
||
"status": "in_progress", "created": int(time.time()), "output": []}})
|
||
yield emit("response.in_progress", {"type": "response.in_progress", "response": {"id": resp_id}})
|
||
|
||
total_usage = {}
|
||
_event_types_seen = set()
|
||
_debug_log_path = os.path.expanduser("~/.cache/codex-proxy/cc-debug.log")
|
||
_debug_fh = open(_debug_log_path, "a") # [FIX 14] always write debug to FILE (not just stderr which may be piped)
|
||
_deflog = lambda *a, **kw: print(*a, file=_debug_fh, flush=True, **kw)
|
||
|
||
for d in _iter_cc_events(cc_stream):
|
||
t = d.get("type", "")
|
||
_event_types_seen.add(t)
|
||
|
||
if t == "text-delta":
|
||
txt = d.get("text", "")
|
||
if txt:
|
||
text_buf += txt
|
||
|
||
elif t == "finish-step":
|
||
u = d.get("usage", {})
|
||
total_usage = {
|
||
"input_tokens": u.get("inputTokens", 0),
|
||
"output_tokens": u.get("outputTokens", 0),
|
||
"total_tokens": u.get("inputTokens", 0) + u.get("outputTokens", 0),
|
||
}
|
||
elif t not in ("text-delta", "finish-step"):
|
||
_deflog(f"[CC-DEBUG] unexpected event type: {t} keys={list(d.keys())[:5]} data={str(d)[:200]}")
|
||
|
||
_deflog(f"[CC-DEBUG] stream ended. event_types={_event_types_seen} text_buf_len={len(text_buf)}")
|
||
|
||
parsed_tool_calls = _parse_commandcode_text_tool_calls(text_buf)
|
||
_deflog(f"[CC-DEBUG] text_buf len={len(text_buf)} parsed_tool_calls={len(parsed_tool_calls)} "
|
||
f"text_preview={text_buf[:500]!r}")
|
||
if parsed_tool_calls:
|
||
for ti, tc in enumerate(parsed_tool_calls):
|
||
_deflog(f"[CC-DEBUG] tool_call[{ti}] name={tc.get('name')} args_preview={tc.get('arguments','')[:150]!r}")
|
||
|
||
# [FIX 13] FALLBACK: if parser returned empty but text contains tool-call patterns,
|
||
# force-extract using regex. This catches cases where model output format
|
||
# doesn't match any of our named patterns (XML/raw JSON/function=).
|
||
if not parsed_tool_calls and len(text_buf) > 20:
|
||
_has_tc_signals = (
|
||
'"type"' in text_buf and ('tool-call' in text_buf or 'tool_call' in text_buf or 'function_call' in text_buf)
|
||
) or (
|
||
'<tool' in text_buf.lower() and '<parameter' in text_buf.lower()
|
||
) or (
|
||
'<function=' in text_buf
|
||
) or (
|
||
'{"cmd":' in text_buf or '{"command":' in text_buf
|
||
)
|
||
if _has_tc_signals:
|
||
_deflog(f"[CC-DEBUG] Parser returned empty but text has tool-call signals! Attempting fallback...")
|
||
# Try direct raw JSON extraction on entire buffer
|
||
_fallback_calls = _extract_raw_json_tool_calls(text_buf)
|
||
if not _fallback_calls:
|
||
# [FIX 14b] Match BOTH "cmd" and "command" keys (model uses both)
|
||
import re as _re
|
||
for _m in _re.finditer(r'\{[^{}]*"(?:command|cmd)"\s*:\s*"(?:[^"\\]|\\.)*"', text_buf):
|
||
try:
|
||
_args = json.loads(_m.group(0))
|
||
if isinstance(_args, dict) and ("cmd" in _args or "command" in _args):
|
||
_cmd_val = _unwrap_cmd(_args.get("cmd") or _args.get("command", ""))
|
||
_args["cmd"] = _cmd_val
|
||
# Copy description as justification if present
|
||
if "description" in _args:
|
||
_args["justification"] = _args["description"]
|
||
_fallback_calls.append({
|
||
"full_match": _m.group(0),
|
||
"name": "exec_command",
|
||
"arguments": json.dumps(_args, ensure_ascii=False),
|
||
})
|
||
except Exception:
|
||
continue
|
||
if _fallback_calls:
|
||
_deflog(f"[CC-DEBUG] Fallback extracted {len(_fallback_calls)} tool calls!")
|
||
for _fi, _fc in enumerate(_fallback_calls):
|
||
_deflog(f"[CC-DEBUG] fallback[{_fi}] name={_fc.get('name')} args={_fc.get('arguments','')[:120]!r}")
|
||
parsed_tool_calls = _fallback_calls
|
||
else:
|
||
_deflog(f"[CC-DEBUG] Fallback also failed. text_buf first 500: {text_buf[:500]!r}")
|
||
|
||
# [FIX 25] SELF-HEALING STUCK DETECTOR
|
||
# When ALL parsers returned empty and text has intent signals, synthesize a
|
||
# command so the agent loop doesn't stall. This catches:
|
||
# - Bare text with no tool call format at all
|
||
# - Unrecognized XML-ish blocks
|
||
# - Partial JSON (bare "{")
|
||
# - Model explaining what it wants to do but not producing a tool call
|
||
if not parsed_tool_calls and len(text_buf) > 10:
|
||
_synth_cmd = None
|
||
_synth_just = None
|
||
_tl = text_buf.lower()
|
||
|
||
# Heuristic 1: URL in text → fetch it
|
||
_url_in_text = re.search(r"https?://[^\s\]'\\>\",]+", text_buf)
|
||
if _url_in_text:
|
||
_synth_url = _url_in_text.group(0).rstrip(")].,;'\\\"")
|
||
_synth_cmd = f"curl -sL --max-time 15 '{_synth_url}' 2>/dev/null | head -200"
|
||
_synth_just = "Auto-synthesized: URL detected in text, fetching"
|
||
|
||
# Heuristic 2: File path references → list or read
|
||
if not _synth_cmd:
|
||
_file_m = re.search(r"(?:read|open|view|check|examine|cat|show)\s+(?:the\s+)?(?:file\s+)?[`'\"]?(/[^\s'\"]+\.\w+)", _tl)
|
||
if _file_m:
|
||
_fpath = _file_m.group(1)
|
||
_synth_cmd = f"cat '{_fpath}' 2>/dev/null | head -200 || ls -la '{_fpath}'"
|
||
_synth_just = f"Auto-synthesized: file reference detected ({_fpath})"
|
||
|
||
# Heuristic 3: Shell command mentioned in backticks or quotes
|
||
if not _synth_cmd:
|
||
_shell_m = re.search(r"[`'\"]((?:curl|wget|git|npm|pip|python|ls|cat|grep|find|mkdir|cd|rm|cp|mv|chmod|docker|make|cargo|go)\s[^\s`'\"]+)", text_buf)
|
||
if _shell_m:
|
||
_synth_cmd = _shell_m.group(1)
|
||
_synth_just = "Auto-synthesized: shell command detected in text"
|
||
|
||
# Heuristic 4: "explore" or "fetch" intent + last user URL
|
||
if not _synth_cmd and ("explore" in _tl or "fetch" in _tl or "investigate" in _tl or "repository" in _tl):
|
||
for _prev_url in _last_user_urls:
|
||
_url_m2 = re.search(r"https?://[^\s\]'\\>\",]+", _prev_url)
|
||
if _url_m2:
|
||
_pu = _url_m2.group(0).rstrip(")].,;'\\\"")
|
||
_ecmd, _ejust = _build_explore_cmd(_pu)
|
||
if _ecmd:
|
||
_synth_cmd = _ecmd
|
||
_synth_just = _ejust or "Auto-synthesized: explore intent with last user URL"
|
||
break
|
||
|
||
# Heuristic 5: Generic "I need to" / "let me" / "I'll" intent with command-like text
|
||
if not _synth_cmd:
|
||
_intent_m = re.search(r"(?:I(?:'ll| will| need to| should)|let me|please)\s+(.+?)(?:\.|!|\n|$)", _tl, re.IGNORECASE)
|
||
if _intent_m:
|
||
_intent_text = _intent_m.group(1).strip()
|
||
if len(_intent_text) > 10 and len(_intent_text) < 200:
|
||
_synth_cmd = f"echo 'Stuck recovery: model intent was: {_intent_text[:100]}'"
|
||
_synth_just = f"Auto-synthesized from intent text: {_intent_text[:80]}"
|
||
|
||
if _synth_cmd:
|
||
parsed_tool_calls = [{
|
||
"full_match": "__synth_stuck_recovery__",
|
||
"name": "exec_command",
|
||
"arguments": json.dumps({"cmd": _synth_cmd, "justification": _synth_just or "Auto-synthesized stuck recovery"}, ensure_ascii=False),
|
||
}]
|
||
_deflog(f"[CC-DEBUG] [STUCK-RECOVERY] Synthesized: cmd={_synth_cmd[:120]!r}")
|
||
print(f"[CC-DEBUG] [STUCK-RECOVERY] Synthesized command from text intent", file=sys.stderr, flush=True)
|
||
|
||
# Also log to stderr for visibility when not piped
|
||
print(f"[CC-DEBUG] text_buf={len(text_buf)} chars, tool_calls={len(parsed_tool_calls)}", file=sys.stderr, flush=True)
|
||
|
||
try:
|
||
_debug_fh.close()
|
||
except Exception:
|
||
pass
|
||
clean_text = text_buf
|
||
for tc in parsed_tool_calls:
|
||
clean_text = clean_text.replace(tc["full_match"], "")
|
||
clean_text = clean_text.strip()
|
||
|
||
if clean_text:
|
||
yield emit("response.output_item.added", {"type": "response.output_item.added",
|
||
"item": {"type": "message", "id": msg_id, "role": "assistant", "status": "in_progress", "content": []}})
|
||
yield emit("response.content_part.added", {"type": "response.content_part.added",
|
||
"part": {"type": "output_text", "text": "", "annotations": []}, "item_id": msg_id})
|
||
yield emit("response.output_text.delta", {"type": "response.output_text.delta",
|
||
"delta": clean_text, "item_id": msg_id, "content_index": 0})
|
||
yield emit("response.output_text.done", {"type": "response.output_text.done",
|
||
"text": clean_text, "item_id": msg_id, "content_index": 0})
|
||
yield emit("response.content_part.done", {"type": "response.content_part.done",
|
||
"part": {"type": "output_text", "text": clean_text, "annotations": []}, "item_id": msg_id})
|
||
yield emit("response.output_item.done", {"type": "response.output_item.done",
|
||
"item": {"type": "message", "id": msg_id, "role": "assistant", "status": "completed",
|
||
"content": [{"type": "output_text", "text": clean_text, "annotations": []}]}})
|
||
|
||
function_outputs = []
|
||
for tc in parsed_tool_calls:
|
||
fid = uid("fc")
|
||
call_id = uid("call")
|
||
item = {"type": "function_call", "id": fid, "call_id": call_id,
|
||
"name": tc["name"], "arguments": tc["arguments"], "status": "completed"}
|
||
function_outputs.append(item)
|
||
yield emit("response.output_item.added", {"type": "response.output_item.added", "item": item})
|
||
yield emit("response.function_call_arguments.done", {"type": "response.function_call_arguments.done",
|
||
"item_id": fid, "name": tc["name"], "arguments": tc["arguments"]})
|
||
yield emit("response.output_item.done", {"type": "response.output_item.done", "item": item})
|
||
|
||
final_out = []
|
||
if clean_text:
|
||
final_out.append({"type": "message", "id": msg_id, "role": "assistant", "status": "completed",
|
||
"content": [{"type": "output_text", "text": clean_text, "annotations": []}]})
|
||
final_out.extend(function_outputs)
|
||
yield emit("response.completed", {"type": "response.completed",
|
||
"response": {"id": resp_id, "object": "response", "model": model,
|
||
"status": "completed", "created": int(time.time()), "output": final_out,
|
||
"usage": total_usage}})
|
||
|
||
# ═══════════════════════════════════════════════════════════════════
|
||
# Auto-sensing provider adapter
|
||
# ═══════════════════════════════════════════════════════════════════
|
||
|
||
_SENTINEL = object()
|
||
|
||
@dataclasses.dataclass
|
||
class ProviderSchema:
|
||
"""Describes what message formats a provider supports.
|
||
|
||
Populated by probing the endpoint and/or analyzing error responses.
|
||
Cached in provider-caps.json so probing only happens once per provider.
|
||
"""
|
||
supported_roles: tuple = ("user", "assistant")
|
||
content_type: str = "string" # "string" | "array"
|
||
content_block_types: tuple = () # e.g. ("text", "tool_result", "tool-call")
|
||
tool_result_style: str = "inline" # "inline" | "tool_result_block" | "anthropic"
|
||
tool_call_style: str = "openai_function" # "openai_function" | "tool-call" | "anthropic_tool_use"
|
||
accepts_tool_role: bool = False
|
||
accepts_system_role: bool = True
|
||
cc_body_wrap: bool = False # needs {config, params, threadId} wrapping
|
||
field_names: dict = dataclasses.field(default_factory=dict)
|
||
auth_type: str = "" # "bearer" | "x-api-key" | "custom"
|
||
auth_header: str = "Authorization" # header name for auth
|
||
auth_scheme: str = "Bearer " # prefix for auth value
|
||
tool_decl_format: str = "openai" # "openai" | "anthropic" | "command_code"
|
||
param_names: dict = dataclasses.field(default_factory=lambda: {
|
||
"max_tokens": "max_tokens",
|
||
"temperature": "temperature",
|
||
"top_p": "top_p",
|
||
})
|
||
response_format: str = "auto" # "sse" | "raw_json" | "ndjson" | "auto"
|
||
stream_format: str = "auto" # "sse_data" | "sse_event" | "raw_lines" | "json_lines"
|
||
|
||
def hints(self) -> dict:
|
||
"""Return a dict for storing in provider-caps.json."""
|
||
d = {}
|
||
for k, v in dataclasses.asdict(self).items():
|
||
if isinstance(v, (list, tuple)) and not v:
|
||
continue
|
||
if isinstance(v, dict) and not v:
|
||
continue
|
||
if v is False:
|
||
continue
|
||
if v == "":
|
||
continue
|
||
if v == "auto":
|
||
continue
|
||
d[k] = v
|
||
return d
|
||
|
||
|
||
class ErrorAnalyzer:
|
||
"""Parse upstream error responses to infer provider schema.
|
||
Analyzes 400, 401, 422 errors for hints about auth, roles, content format,
|
||
parameter names, field names, tool format, and response format.
|
||
"""
|
||
|
||
@staticmethod
|
||
def analyze(error_text: str, current: ProviderSchema = None) -> dict:
|
||
hints = {}
|
||
if not error_text:
|
||
return hints
|
||
err = error_text.lower()
|
||
|
||
# ── Auth detection (401 errors) ──
|
||
if re.search(r"unauthorized|invalid.*api.?key|missing.*api.?key|x-api-key", err):
|
||
hints["auth_type"] = "x-api-key"
|
||
hints["auth_header"] = "x-api-key"
|
||
hints["auth_scheme"] = ""
|
||
elif re.search(r"invalid.*bearer|bearer.*token|authorization.*header|invalid.*token", err):
|
||
hints["auth_type"] = "bearer"
|
||
hints["auth_header"] = "Authorization"
|
||
hints["auth_scheme"] = "Bearer "
|
||
|
||
# ── Role validation ──
|
||
if re.search(r"role.*expected.*(?:user|assistant)", err):
|
||
hints["accepts_tool_role"] = False
|
||
hints["accepts_function_role"] = False
|
||
|
||
if re.search(r"role.*(?:tool|function).*(?:invalid|not.*(?:support|allow))", err):
|
||
hints["accepts_tool_role"] = False
|
||
hints["accepts_function_role"] = False
|
||
|
||
if re.search(r"role.*system.*(?:invalid|not.*(?:support|allow))", err):
|
||
hints["accepts_system_role"] = False
|
||
|
||
# ── Content format (top-level only, not content[i].xxx) ──
|
||
if re.search(r'params\.messages\[\d+\]\.content', err):
|
||
# Explicit path to content field in a messages array (e.g. /alpha/generate)
|
||
if re.search(r"expected string.*received array", err):
|
||
hints["content_type"] = "string"
|
||
hints["tool_result_style"] = "inline" # no tool_result blocks allowed
|
||
elif re.search(r"expected array.*received string", err):
|
||
hints["content_type"] = "array"
|
||
elif re.search(r"(?<!\w)content(?!\[)\s*(?:of type|field|should be|expected|must be).*(?:string|array)", err) or \
|
||
re.search(r"expected (?:string|array).*content", err):
|
||
if re.search(r"expected string", err) and not re.search(r"expected array", err):
|
||
hints["content_type"] = "string"
|
||
elif re.search(r"expected array", err):
|
||
hints["content_type"] = "array"
|
||
elif re.search(r"content.*expected string.*received array", err) and not re.search(r"\[\d*\]", err):
|
||
hints["content_type"] = "string"
|
||
elif re.search(r"content.*expected array.*received string", err) and not re.search(r"\[\d*\]", err):
|
||
hints["content_type"] = "array"
|
||
|
||
# ── Content block types ──
|
||
types = set()
|
||
for m in re.finditer(
|
||
r'expected\s+"('
|
||
r'text|image|document|search_result|thinking|redacted_thinking|reasoning|'
|
||
r'tool_use|tool-call|tool_result|tool-result|'
|
||
r'server_tool_use|web_search_tool_result|web_fetch_tool_result|tool'
|
||
r')"', err
|
||
):
|
||
types.add(m.group(1))
|
||
# Also detect from "expected string, received array at params.messages[i].content" pattern
|
||
# where the "or" clauses list valid block types
|
||
if not types and re.search(r'params\.messages\[\d+\]\.content', err):
|
||
for valid_type in ("text", "image", "document", "tool_use", "tool-call", "tool_result"):
|
||
if re.search(r'expected\s+"' + re.escape(valid_type) + r'"', err):
|
||
types.add(valid_type)
|
||
if types:
|
||
hints["content_block_types"] = tuple(sorted(types))
|
||
|
||
# ── Tool result style ──
|
||
if re.search(r"tool_result", err):
|
||
hints["tool_result_style"] = "tool_result_block"
|
||
elif re.search(r"tool_use", err) and not re.search(r"tool.use", err):
|
||
hints["tool_result_style"] = "anthropic"
|
||
|
||
# ── Tool call style ──
|
||
if re.search(r"tool-call", err) or re.search(r"tool_call", err):
|
||
hints["tool_call_style"] = "tool-call"
|
||
elif re.search(r"tool_use", err):
|
||
hints["tool_call_style"] = "anthropic_tool_use"
|
||
|
||
# ── CC body wrap detection ──
|
||
if re.search(r"(?:params\.|body\.)config", err) or re.search(r"threadId", err):
|
||
hints["cc_body_wrap"] = True
|
||
|
||
# ── Field name mappings (keys MUST match SchemaAdapter lookups) ──
|
||
fields = {}
|
||
if re.search(r"tool_use_id", err):
|
||
fields["tool_use_id"] = "tool_use_id"
|
||
if re.search(r"toolCallId", err):
|
||
fields["toolCallId"] = "toolCallId"
|
||
# SchemaAdapter._tool_result_block looks up "tool_use_id"
|
||
fields["tool_use_id"] = "toolCallId"
|
||
if re.search(r"tool_result", err) and not re.search(r"tool.result", err):
|
||
fields["tool_result_type"] = "tool_result"
|
||
if re.search(r"tool-result", err):
|
||
fields["tool_result_type"] = "tool-result"
|
||
# Detect tool call field names from errors
|
||
if re.search(r"(?:id|call_id|callId|tool_use_id).*(?:invalid|unknown|expected|required)", err) or \
|
||
re.search(r"(?:expected|required).*(?:id|call_id|callId)", err):
|
||
for alt in ("id", "call_id", "callId", "tool_use_id"):
|
||
if alt in err:
|
||
fields["tool_call_id_field"] = alt
|
||
break
|
||
if re.search(r"(?:name|tool_name|function).*(?:invalid|unknown|expected|required)", err) or \
|
||
re.search(r"(?:expected|required).*(?:name|tool_name)", err):
|
||
for alt in ("name", "tool_name", "function"):
|
||
if alt in err:
|
||
fields["tool_call_name_field"] = alt
|
||
break
|
||
if re.search(r"arguments.*(?:invalid|unknown|expect|required)", err) or \
|
||
re.search(r"input.*(?:invalid|unknown|expect|required)", err):
|
||
if re.search(r"input_schema|input\b", err) and not re.search(r"arguments", err):
|
||
fields["tool_call_args_field"] = "input"
|
||
fields["tool_args_field"] = "input"
|
||
else:
|
||
fields["tool_call_args_field"] = "arguments"
|
||
fields["tool_args_field"] = "arguments"
|
||
|
||
# ── Supported roles from error ──
|
||
if re.search(r"params\.messages\[\d+\]\.role", err):
|
||
roles = re.findall(r'expected one of\s+"([^"]+)"', err)
|
||
if roles:
|
||
hints["supported_roles"] = tuple(r.strip() for r in roles[0].split("|"))
|
||
if fields:
|
||
hints["field_names"] = fields
|
||
|
||
# ── Parameter name negotiation ──
|
||
param_hints = {}
|
||
if re.search(r"max_tokens.*(?:invalid|unknown|not.*(?:support|recognize))", err) or \
|
||
re.search(r"(?:unknown|invalid).*param.*max_tokens", err):
|
||
for alt in ("max_output_tokens", "max_tokens_to_sample", "max_new_tokens", "max_token"):
|
||
if alt.lower() in err:
|
||
param_hints["max_tokens"] = alt
|
||
break
|
||
if re.search(r"temperature.*(?:invalid|unknown)", err):
|
||
for alt in ("creation_temperature", "temp", "model_temperature"):
|
||
if alt.lower() in err:
|
||
param_hints["temperature"] = alt
|
||
break
|
||
if re.search(r"top_p.*(?:invalid|unknown)", err):
|
||
for alt in ("top_p", "nucleus_sampling"):
|
||
if alt.lower() in err:
|
||
param_hints["top_p"] = alt
|
||
break
|
||
if param_hints:
|
||
hints["param_names"] = param_hints
|
||
|
||
# ── Tool declaration format ──
|
||
if re.search(r"tools.*input_schema", err) or re.search(r"input_schema.*required", err):
|
||
hints["tool_decl_format"] = "anthropic"
|
||
elif re.search(r"tools.*function.*(?:required|expected)", err):
|
||
hints["tool_decl_format"] = "openai"
|
||
elif re.search(r"tool-call|tool_call.*format", err):
|
||
hints["tool_decl_format"] = "command_code"
|
||
|
||
# ── Response/Stream format hints from content-type or error ──
|
||
if re.search(r"content.type.*text/event.stream", err) or \
|
||
re.search(r"stream.*sse|sse.*expected", err):
|
||
hints["stream_format"] = "sse_data"
|
||
if re.search(r"ndjson|json.*lines", err):
|
||
hints["stream_format"] = "json_lines"
|
||
|
||
return hints
|
||
|
||
@staticmethod
|
||
def merge_into_schema(hints: dict, schema: ProviderSchema) -> ProviderSchema:
|
||
for k, v in hints.items():
|
||
if k == "field_names" and isinstance(v, dict):
|
||
schema.field_names.update(v)
|
||
elif k == "param_names" and isinstance(v, dict):
|
||
schema.param_names.update(v)
|
||
elif hasattr(schema, k):
|
||
setattr(schema, k, v)
|
||
return schema
|
||
|
||
|
||
def _schema_cache_key(target_url=None, backend=None, model=None):
|
||
host = urllib.parse.urlparse(target_url or TARGET_URL).netloc.lower()
|
||
return f"auto-schema|{backend or BACKEND}|{host}|{model or '*'}"
|
||
|
||
|
||
def _load_schema(target_url=None, backend=None, model=None):
|
||
caps = _load_provider_caps()
|
||
key = _schema_cache_key(target_url, backend, model)
|
||
raw = caps.get(key)
|
||
generic = caps.get(_schema_cache_key(target_url, backend, model="*"))
|
||
data = raw or generic or {}
|
||
if not data:
|
||
return ProviderSchema()
|
||
# Staleness check: re-learn after 24h (86400s)
|
||
updated = data.get("_updated", 0)
|
||
if isinstance(updated, (int, float)) and time.time() - updated > 86400:
|
||
print(f"[auto-sense] cached schema stale ({int(time.time()-updated)}s old), re-learning", file=sys.stderr)
|
||
return ProviderSchema()
|
||
return ProviderSchema(
|
||
supported_roles=tuple(data.get("supported_roles", ("user", "assistant"))),
|
||
content_type=data.get("content_type", "string"),
|
||
content_block_types=tuple(data.get("content_block_types", ())),
|
||
tool_result_style=data.get("tool_result_style", "inline"),
|
||
tool_call_style=data.get("tool_call_style", "openai_function"),
|
||
accepts_tool_role=data.get("accepts_tool_role", False),
|
||
accepts_system_role=data.get("accepts_system_role", True),
|
||
cc_body_wrap=data.get("cc_body_wrap", False),
|
||
field_names=dict(data.get("field_names", {})),
|
||
auth_type=data.get("auth_type", ""),
|
||
auth_header=data.get("auth_header", "Authorization"),
|
||
auth_scheme=data.get("auth_scheme", "Bearer "),
|
||
tool_decl_format=data.get("tool_decl_format", "openai"),
|
||
param_names=dict(data.get("param_names", {
|
||
"max_tokens": "max_tokens",
|
||
"temperature": "temperature",
|
||
"top_p": "top_p",
|
||
})),
|
||
response_format=data.get("response_format", "auto"),
|
||
stream_format=data.get("stream_format", "auto"),
|
||
)
|
||
|
||
|
||
def _save_schema(schema: ProviderSchema, target_url=None, backend=None, model=None):
|
||
caps = _load_provider_caps()
|
||
key = _schema_cache_key(target_url, backend, model)
|
||
caps[key] = schema.hints()
|
||
caps[key]["_updated"] = time.time()
|
||
caps[key]["_backend"] = backend or BACKEND
|
||
_save_provider_caps()
|
||
print(f"[auto-sense] cached schema {key}", file=sys.stderr)
|
||
|
||
|
||
class SchemaAdapter:
|
||
"""Convert Responses API messages based on a detected ProviderSchema."""
|
||
|
||
def __init__(self, schema: ProviderSchema):
|
||
self.s = schema
|
||
|
||
def convert(self, input_data, instructions=""):
|
||
if self.s.content_type == "string" and not self.s.content_block_types:
|
||
return self._to_plain_string(input_data, instructions)
|
||
return self._to_content_blocks(input_data, instructions)
|
||
|
||
def _to_plain_string(self, input_data, instructions=""):
|
||
"""Fallback: user/assistant string content — no tool roles."""
|
||
msgs = []
|
||
if instructions and self.s.accepts_system_role:
|
||
msgs.append({"role": "system", "content": instructions})
|
||
elif instructions:
|
||
msgs.append({"role": "user", "content": instructions})
|
||
if isinstance(input_data, str):
|
||
msgs.append({"role": "user", "content": input_data})
|
||
return msgs
|
||
if not isinstance(input_data, list):
|
||
return msgs
|
||
last_flushed = []
|
||
pending = []
|
||
for item in input_data:
|
||
t = item.get("type")
|
||
if t == "function_call":
|
||
cid = item.get("call_id") or item.get("id") or uid("fc")
|
||
pending.append({"id": cid, "name": item.get("name", ""),
|
||
"arguments": item.get("arguments", "{}")})
|
||
continue
|
||
if pending:
|
||
last_flushed = [p["id"] for p in pending]
|
||
msgs.append({"role": "assistant", "content": None,
|
||
"tool_calls": [{"id": p["id"], "type": "function",
|
||
"function": {"name": p["name"],
|
||
"arguments": p["arguments"]}}
|
||
for p in pending]})
|
||
pending = []
|
||
if t == "message":
|
||
role = "user" if item.get("role") in ("user", "developer") else "assistant"
|
||
text = _extract_text(item.get("content", []))
|
||
if text:
|
||
msgs.append({"role": role, "content": text})
|
||
elif t == "function_call_output":
|
||
out = item.get("output", "")
|
||
if not isinstance(out, str):
|
||
out = json.dumps(out, ensure_ascii=False)
|
||
msgs.append({"role": "user", "content": out[:8000]})
|
||
if pending:
|
||
last_flushed = [p["id"] for p in pending]
|
||
msgs.append({"role": "assistant", "content": None,
|
||
"tool_calls": [{"id": p["id"], "type": "function",
|
||
"function": {"name": p["name"],
|
||
"arguments": p["arguments"]}}
|
||
for p in pending]})
|
||
return msgs
|
||
|
||
def _to_content_blocks(self, input_data, instructions=""):
|
||
msgs = []
|
||
pending_tc = []
|
||
tool_name_by_id = {}
|
||
last_ids = []
|
||
|
||
def flush():
|
||
nonlocal last_ids
|
||
if not pending_tc:
|
||
return
|
||
last_ids = [t["id"] for t in pending_tc]
|
||
msgs.append({"role": "assistant", "content": pending_tc})
|
||
pending_tc.clear()
|
||
|
||
_str = self.s.content_type == "string"
|
||
|
||
if instructions:
|
||
msgs.append({"role": "user", "content": instructions if _str else [{"type": "text", "text": instructions}]})
|
||
|
||
if isinstance(input_data, str):
|
||
msgs.append({"role": "user", "content": input_data if _str else [{"type": "text", "text": input_data}]})
|
||
return msgs
|
||
if not isinstance(input_data, list):
|
||
return msgs
|
||
|
||
for item in input_data:
|
||
t = item.get("type")
|
||
if t == "function_call":
|
||
cid = item.get("call_id") or item.get("id") or uid("call")
|
||
nm = item.get("name") or "exec_command"
|
||
tool_name_by_id[cid] = nm
|
||
tc_block = self._tool_call_block(cid, nm, item.get("arguments", "{}"))
|
||
if tc_block:
|
||
pending_tc.append(tc_block)
|
||
continue
|
||
flush()
|
||
if t == "message":
|
||
role = "user" if item.get("role") in ("user", "developer") else "assistant"
|
||
text = _extract_text(item.get("content", []))
|
||
if text:
|
||
msgs.append({"role": role, "content": text if _str else [{"type": "text", "text": text}]})
|
||
elif t == "function_call_output":
|
||
cid = item.get("call_id") or item.get("id") or ""
|
||
if not cid and last_ids:
|
||
idx = sum(1 for m in msgs for c in (m.get("content") or [])
|
||
if isinstance(c, dict) and c.get("type") in
|
||
("tool_result", "tool-result"))
|
||
if idx < len(last_ids):
|
||
cid = last_ids[idx]
|
||
out = item.get("output", "")
|
||
if not isinstance(out, str):
|
||
out = json.dumps(out, ensure_ascii=False)
|
||
tr = self._tool_result_block(cid, out)
|
||
if tr:
|
||
msgs.append({"role": "user", "content": [tr]})
|
||
flush()
|
||
return msgs
|
||
|
||
def _tool_call_block(self, cid, name, args):
|
||
style = self.s.tool_call_style
|
||
fn = self.s.field_names
|
||
if style == "tool-call":
|
||
return {
|
||
"type": fn.get("tool_call_type", "tool-call"),
|
||
fn.get("tool_call_id_field", "id"): cid,
|
||
fn.get("tool_call_name_field", "name"): name,
|
||
fn.get("tool_call_args_field", "arguments"): args,
|
||
}
|
||
elif style == "anthropic_tool_use":
|
||
try:
|
||
parsed = json.loads(args)
|
||
except Exception:
|
||
parsed = {}
|
||
return {
|
||
"type": fn.get("tool_use_type", "tool_use"),
|
||
fn.get("tool_call_id_field", "id"): cid,
|
||
fn.get("tool_call_name_field", "name"): name,
|
||
fn.get("tool_call_args_field", "input"): parsed,
|
||
}
|
||
else:
|
||
return None # handled as OpenAI function call
|
||
|
||
def _tool_result_block(self, cid, output):
|
||
style = self.s.tool_result_style
|
||
fn = self.s.field_names
|
||
if style == "tool_result_block":
|
||
return {
|
||
"type": fn.get("tool_result_type", "tool_result"),
|
||
fn.get("tool_use_id", "tool_use_id"): cid or "",
|
||
"content": [{"type": "text", "text": output[:8000]}],
|
||
}
|
||
elif style == "anthropic":
|
||
return {
|
||
"type": fn.get("tool_result_type", "tool_result"),
|
||
fn.get("tool_use_id", "tool_use_id"): cid or "",
|
||
"content": output[:8000],
|
||
}
|
||
return None # inline — handled by _to_plain_string
|
||
|
||
|
||
def _sanitize_err_body(body):
|
||
"""Sanitize upstream error body: strip HTML, truncate, remove control chars."""
|
||
if not body:
|
||
return ""
|
||
s = re.sub(r'<[^>]+>', '', body)
|
||
s = re.sub(r'[\x00-\x08\x0b\x0c\x0e-\x1f]', '', s)
|
||
s = s.strip()[:1000]
|
||
return s
|
||
|
||
|
||
def _extract_text(content):
|
||
if isinstance(content, str):
|
||
return content
|
||
if not isinstance(content, list):
|
||
return ""
|
||
parts = []
|
||
for p in content:
|
||
if isinstance(p, str):
|
||
parts.append(p)
|
||
elif isinstance(p, dict) and p.get("type") in ("input_text", "output_text", "text"):
|
||
parts.append(p.get("text", ""))
|
||
return "".join(parts)
|
||
|
||
|
||
# ═══════════════════════════════════════════════════════════════════
|
||
# HTTP Server
|
||
# ═══════════════════════════════════════════════════════════════════
|
||
|
||
def _log_resp(resp_id, status, output):
|
||
try:
|
||
import datetime as _dt
|
||
_lp = os.path.join(_LOG_DIR, "requests.log")
|
||
with open(_lp, "a") as _f:
|
||
_f.write(f" RESPONSE id={resp_id} status={status}\n")
|
||
if output:
|
||
for o in output:
|
||
ot = o.get("type")
|
||
if ot == "message":
|
||
_f.write(f" -> message: {o.get('content',[{}])[0].get('text','')[:200]}\n")
|
||
elif ot == "function_call":
|
||
_f.write(f" -> function_call: {o.get('name')}({o.get('arguments','')[:120]})\n")
|
||
else:
|
||
_f.write(f" -> {ot}\n")
|
||
_f.write(f"{'='*60}\n")
|
||
_f.flush()
|
||
except Exception:
|
||
pass
|
||
|
||
class ConnectionTracker:
|
||
def __enter__(self):
|
||
global _active_connections
|
||
with _active_connections_lock:
|
||
_active_connections += 1
|
||
def __exit__(self, *a):
|
||
global _active_connections
|
||
with _active_connections_lock:
|
||
_active_connections -= 1
|
||
|
||
class RequestTracker:
|
||
def __init__(self, request_id):
|
||
self.request_id = request_id
|
||
self.cancelled = threading.Event()
|
||
|
||
def __enter__(self):
|
||
if self.request_id:
|
||
with _active_requests_lock:
|
||
_active_requests[self.request_id] = self
|
||
return self
|
||
|
||
def __exit__(self, *a):
|
||
if self.request_id:
|
||
with _active_requests_lock:
|
||
_active_requests.pop(self.request_id, None)
|
||
|
||
def _cancel_request(request_id):
|
||
with _active_requests_lock:
|
||
req = _active_requests.get(request_id)
|
||
if not req:
|
||
return False
|
||
req.cancelled.set()
|
||
return True
|
||
|
||
def _handle_shutdown_signal(signum, frame):
|
||
global _shutdown_requested
|
||
_shutdown_requested = True
|
||
print("[proxy] shutdown requested; draining connections", file=sys.stderr)
|
||
def _drain():
|
||
deadline = time.time() + 5
|
||
while time.time() < deadline:
|
||
with _active_connections_lock:
|
||
if _active_connections == 0:
|
||
break
|
||
time.sleep(0.1)
|
||
if SERVER is not None:
|
||
SERVER.shutdown()
|
||
threading.Thread(target=_drain, daemon=True).start()
|
||
|
||
def _upstream_timeout(body, stream):
|
||
input_data = body.get("input", "")
|
||
n_items = len(input_data) if isinstance(input_data, list) else 1
|
||
has_tools = bool(body.get("tools"))
|
||
if stream:
|
||
return min((180 if has_tools else 120) + n_items * 2, 300)
|
||
return min(60 + n_items * 2, 120)
|
||
|
||
def _auto_continue_gemini(handler, flush_event, message_id, model, gen_config, gemini_tools, system_parts, project_id, headers, endpoints, url_suffix, accumulated_text, output_items, message_started):
|
||
max_continuations = 5
|
||
for _cont in range(max_continuations):
|
||
cont_contents = [
|
||
{"role": "model", "parts": [{"text": accumulated_text[-12000:]}]},
|
||
{"role": "user", "parts": [{"text": "Continue exactly where you left off. Do not repeat anything already written."}]},
|
||
]
|
||
cont_request = {"contents": cont_contents, "generationConfig": dict(gen_config)}
|
||
if system_parts:
|
||
cont_request["systemInstruction"] = {"parts": system_parts}
|
||
if gemini_tools:
|
||
cont_request["tools"] = gemini_tools
|
||
cont_wrapped = {"project": project_id, "model": model, "request": cont_request}
|
||
if OAUTH_PROVIDER == "google-antigravity":
|
||
cont_wrapped["requestType"] = "agent"
|
||
cont_wrapped["userAgent"] = "antigravity"
|
||
cont_wrapped["requestId"] = f"agent-{uuid.uuid4().hex[:12]}"
|
||
cont_body = json.dumps(cont_wrapped).encode()
|
||
upstream = None
|
||
for ep in endpoints:
|
||
target = f"{ep}/{url_suffix}"
|
||
req = urllib.request.Request(target, data=cont_body, headers=headers)
|
||
try:
|
||
upstream = urllib.request.urlopen(req, timeout=180)
|
||
break
|
||
except Exception as e:
|
||
print(f"[auto-continue] {ep} failed: {e}", file=sys.stderr)
|
||
continue
|
||
if not upstream:
|
||
break
|
||
cont_text = ""
|
||
cont_finish = ""
|
||
cont_buf = ""
|
||
for raw_line in _stream_with_idle_timeout(upstream):
|
||
line = raw_line.decode(errors="replace")
|
||
if line.startswith("data: "):
|
||
cont_buf += line[6:]
|
||
continue
|
||
if not line.strip() and cont_buf:
|
||
try:
|
||
chunk = json.loads(cont_buf)
|
||
except Exception:
|
||
cont_buf = ""
|
||
continue
|
||
cont_buf = ""
|
||
candidates = chunk.get("response", chunk).get("candidates", [])
|
||
if not candidates:
|
||
continue
|
||
cont_finish = candidates[0].get("finishReason", "")
|
||
parts = candidates[0].get("content", {}).get("parts", [])
|
||
for part in parts:
|
||
if part.get("thought"):
|
||
continue
|
||
if "text" in part and not part.get("functionCall"):
|
||
delta = part["text"]
|
||
if delta:
|
||
cont_text += delta
|
||
flush_event("response.output_text.delta", {"type": "response.output_text.delta", "output_index": 0, "content_index": 0, "delta": delta})
|
||
elif part.get("functionCall"):
|
||
fc = part["functionCall"]
|
||
call_id = f"call_{uuid.uuid4().hex[:24]}"
|
||
args_str = json.dumps(fc.get("args", fc.get("arguments", {})))
|
||
output_index = len(output_items)
|
||
flush_event("response.output_item.added", {"type": "response.output_item.added", "output_index": output_index, "item": {"type": "function_call", "id": call_id, "call_id": call_id, "name": fc.get("name", ""), "arguments": ""}})
|
||
flush_event("response.function_call_arguments.delta", {"type": "response.function_call_arguments.delta", "output_index": output_index, "item_id": call_id, "delta": args_str})
|
||
flush_event("response.function_call_arguments.done", {"type": "response.function_call_arguments.done", "output_index": output_index, "item_id": call_id, "arguments": args_str})
|
||
output_items.append({"tool": True, "fc": fc, "call_id": call_id})
|
||
accumulated_text += cont_text
|
||
print(f"[auto-continue] chunk {len(cont_text)} chars, finish={cont_finish}, total={len(accumulated_text)}", file=sys.stderr)
|
||
if cont_finish != "MAX_TOKENS":
|
||
break
|
||
return accumulated_text
|
||
|
||
class Handler(http.server.BaseHTTPRequestHandler):
|
||
protocol_version = "HTTP/1.1"
|
||
|
||
def do_GET(self):
|
||
if self.path in ("/v1/models", "/models"):
|
||
self.send_json(200, {"object": "list", "data": MODELS})
|
||
elif self.path in ("/v1/accounts", "/accounts"):
|
||
info = {"provider": BACKEND, "oauth_provider": OAUTH_PROVIDER}
|
||
if BACKEND == "freebuff":
|
||
info["accounts"] = _fb_pool.status()
|
||
info["total"] = len(_fb_pool._accounts)
|
||
elif OAUTH_PROVIDER and OAUTH_PROVIDER.startswith("google"):
|
||
pool = _google_antigravity_pool if OAUTH_PROVIDER == "google-antigravity" else _google_cli_pool
|
||
info["accounts"] = pool.status()
|
||
info["total"] = len(pool._accounts)
|
||
elif _api_key_pool:
|
||
info["accounts"] = _api_key_pool.status()
|
||
info["total"] = len(_api_key_pool._accounts)
|
||
else:
|
||
info["accounts"] = []
|
||
info["total"] = 0
|
||
self.send_json(200, info)
|
||
elif self.path in ("/health", "/v1/health"):
|
||
import resource as _res
|
||
_mem_mb = 0
|
||
try:
|
||
_mem_mb = _res.getrusage(_res.RUSAGE_SELF).ru_maxrss / 1024
|
||
except Exception:
|
||
pass
|
||
_uptime = time.time() - _START_TIME if '_START_TIME' in dir() else 0
|
||
self.send_json(200, {"ok": True, "backend": BACKEND,
|
||
"target_url": TARGET_URL,
|
||
"models": [m.get("id") for m in MODELS],
|
||
"bgp_routes": len(BGP_ROUTES),
|
||
"uptime_s": round(_uptime, 1),
|
||
"memory_mb": round(_mem_mb, 1),
|
||
"requests_total": _STATS.get("requests", 0)})
|
||
else:
|
||
self.send_error(404)
|
||
|
||
def do_POST(self):
|
||
if _shutdown_requested:
|
||
return self.send_json(503, {"error": {"type": "proxy_shutting_down",
|
||
"message": "Proxy is shutting down"}})
|
||
if self.path.startswith("/admin/cancel/"):
|
||
request_id = self.path.rsplit("/", 1)[-1]
|
||
if _cancel_request(request_id):
|
||
return self.send_json(200, {"ok": True, "cancelled": request_id})
|
||
return self.send_json(404, {"ok": False, "error": "request_not_found"})
|
||
if self.path in ("/v1/responses", "/responses"):
|
||
with ConnectionTracker():
|
||
self._handle()
|
||
else:
|
||
self.send_error(404)
|
||
|
||
_logf = None
|
||
|
||
def _handle(self):
|
||
try:
|
||
clen = int(self.headers.get("Content-Length", 0))
|
||
body = json.loads(self.rfile.read(clen))
|
||
except Exception as e:
|
||
return self.send_json(400, {"error": {"message": f"Bad request: {e}"}})
|
||
|
||
self._session_id = uuid.uuid4().hex[:8]
|
||
_sid = self._session_id
|
||
|
||
import datetime as _dt
|
||
_log_path = os.path.join(_LOG_DIR, "requests.log")
|
||
_ts = _dt.datetime.now().isoformat()
|
||
|
||
prev_id = body.get("previous_response_id")
|
||
raw_input = body.get("input", "")
|
||
input_data = resolve_previous_response(body)
|
||
input_data = _compact_input(input_data)
|
||
body["input"] = input_data
|
||
|
||
raw_types = [i.get("type") for i in raw_input] if isinstance(raw_input, list) else "str"
|
||
resolved_types = [i.get("type") for i in input_data] if isinstance(input_data, list) else "str"
|
||
|
||
print(f"[{_sid}] prev_id={prev_id} raw={raw_types} resolved={resolved_types}", file=sys.stderr)
|
||
with open(_log_path, "a") as _lf:
|
||
_lf.write(f"\n{'='*60}\n{_ts} [session={_sid}] REQUEST {self.path}\n")
|
||
_lf.write(f" prev_id={prev_id}\n")
|
||
_lf.write(f" raw_input_types={raw_types}\n")
|
||
_lf.write(f" resolved_input_types={resolved_types}\n")
|
||
_lf.write(f" stream={body.get('stream')} model={body.get('model')}\n")
|
||
_lf.write(f" store_keys={list(_response_store.keys())}\n")
|
||
if isinstance(input_data, list):
|
||
for i, item in enumerate(input_data):
|
||
t = item.get("type")
|
||
if t == "message":
|
||
_lf.write(f" [{i}] message role={item.get('role')} text={str(item.get('content',''))[:120]}\n")
|
||
elif t == "function_call":
|
||
_lf.write(f" [{i}] function_call call_id={item.get('call_id')} id={item.get('id')} name={item.get('name')} args={item.get('arguments','')[:120]}\n")
|
||
elif t == "function_call_output":
|
||
_lf.write(f" [{i}] function_call_output id={item.get('id')} output={str(item.get('output',''))[:120]}\n")
|
||
else:
|
||
_lf.write(f" [{i}] {t}\n")
|
||
_lf.flush()
|
||
|
||
model = body.get("model", MODELS[0]["id"] if MODELS else "unknown")
|
||
stream = body.get("stream", False)
|
||
request_id = body.get("request_id") or body.get("id") or uid("req")
|
||
if isinstance(input_data, list):
|
||
for item in input_data:
|
||
if isinstance(item, dict) and item.get("type") == "message" and item.get("role") == "user":
|
||
content = str(item.get("content", ""))
|
||
for url_m in re.finditer(r"https?://[^\s\]'\"<>]+", content):
|
||
_last_user_urls.append(url_m.group(0))
|
||
save_request_snapshot(request_id, body)
|
||
_req_t0 = time.time()
|
||
try:
|
||
with RequestTracker(request_id) as tracker:
|
||
if BACKEND == "auto":
|
||
self._handle_auto(body, model, stream, tracker)
|
||
elif BACKEND == "anthropic":
|
||
self._handle_anthropic(body, model, stream, tracker)
|
||
elif BACKEND == "command-code":
|
||
self._handle_command_code(body, model, stream, tracker)
|
||
elif BACKEND == "freebuff":
|
||
self._handle_freebuff(body, model, stream, tracker)
|
||
elif (BACKEND or "").startswith("gemini-oauth"):
|
||
self._handle_gemini_oauth(body, model, stream, tracker)
|
||
else:
|
||
self._handle_openai_compat(body, model, stream, tracker)
|
||
update_snapshot_response(request_id, "completed", time.time() - _req_t0)
|
||
except Exception as _snap_err:
|
||
update_snapshot_response(request_id, "error", time.time() - _req_t0, _snap_err)
|
||
raise
|
||
|
||
def _handle_openai_compat(self, body, model, stream, tracker=None):
|
||
input_data = body.get("input", "")
|
||
policy = provider_policy()
|
||
|
||
pair_errors = validate_tool_pairs(input_data)
|
||
if pair_errors:
|
||
print(f"[tool-validator] repairing {len(pair_errors)} orphan tool outputs", file=sys.stderr)
|
||
input_data = repair_orphan_tool_outputs(input_data, pair_errors)
|
||
body = dict(body)
|
||
body["input"] = input_data
|
||
|
||
if (policy.get("synthetic_tool_results") or _provider_cap(model, "synthetic_tool_results", False)) and isinstance(input_data, list):
|
||
input_data, synthesized = synthesize_tool_results_for_chat(input_data)
|
||
if synthesized:
|
||
print("[provider-adapter] using synthetic tool-result continuation", file=sys.stderr)
|
||
body = dict(body)
|
||
body["input"] = input_data
|
||
|
||
compacted = False
|
||
if policy.get("compaction") and isinstance(input_data, list):
|
||
input_data, compacted = _adaptive_compact(input_data, model, policy)
|
||
if compacted:
|
||
body = dict(body)
|
||
body["input"] = input_data
|
||
|
||
crof_limit = _crof_item_limit(model)
|
||
if not compacted and isinstance(input_data, list) and len(input_data) > crof_limit:
|
||
print(f"[crof-adaptive] proactive compact: {len(input_data)} items > limit {crof_limit}", file=sys.stderr)
|
||
input_data = _crof_compact_for_retry(input_data, model)
|
||
body = dict(body)
|
||
body["input"] = input_data
|
||
|
||
messages = oa_input_to_messages(input_data)
|
||
instructions = body.get("instructions", "").strip()
|
||
if instructions:
|
||
messages.insert(0, {"role": "system", "content": instructions})
|
||
|
||
if BGP_ROUTES:
|
||
self._handle_bgp(body, model, stream, messages, input_data)
|
||
else:
|
||
chat_body = self._build_chat_body(model, messages, body, stream)
|
||
target = upstream_target(TARGET_URL, "/chat/completions")
|
||
if _api_key_pool:
|
||
pool_acct = _api_key_pool.get()
|
||
effective_key = pool_acct["token"] if pool_acct else API_KEY
|
||
else:
|
||
effective_key = _refresh_oauth_token()
|
||
fwd = forwarded_headers(self.headers, {
|
||
"Content-Type": "application/json",
|
||
"Authorization": f"Bearer {effective_key}",
|
||
}, browser_ua=True)
|
||
print(f"[{self._session_id}] POST {target} model={model} stream={stream} items={len(input_data) if isinstance(input_data,list) else 1}", file=sys.stderr)
|
||
chat_body_b = json.dumps(chat_body).encode()
|
||
max_retries = 3
|
||
for attempt in range(max_retries + 1):
|
||
req = urllib.request.Request(target, data=chat_body_b, headers=fwd)
|
||
try:
|
||
upstream = urllib.request.urlopen(req, timeout=_upstream_timeout(body, stream))
|
||
except urllib.error.HTTPError as e:
|
||
err_body = e.read().decode()
|
||
if e.code in (429, 502, 503) and attempt < max_retries:
|
||
if e.code == 429 and _api_key_pool:
|
||
pool_acct = _api_key_pool.get()
|
||
if pool_acct:
|
||
_api_key_pool.mark_rate_limited(pool_acct, 60)
|
||
next_acct = _api_key_pool.get()
|
||
if next_acct:
|
||
effective_key = next_acct["token"]
|
||
fwd["Authorization"] = f"Bearer {effective_key}"
|
||
print(f"[multi-account] rotating to key {next_acct['id']}", file=sys.stderr)
|
||
retry_after = e.headers.get("Retry-After")
|
||
if retry_after:
|
||
try:
|
||
wait = min(int(retry_after), 60)
|
||
except ValueError:
|
||
wait = min(2 ** (attempt + 1), 15)
|
||
else:
|
||
wait = min(2 ** (attempt + 1), 15)
|
||
print(f"[{self._session_id}] HTTP {e.code} (attempt {attempt+1}/{max_retries}), retrying in {wait}s: {err_body[:150]}", file=sys.stderr)
|
||
time.sleep(wait)
|
||
continue
|
||
return self.send_json(e.code, {"error": {"type": "upstream_error", "message": _sanitize_err_body(err_body)}})
|
||
except (ConnectionResetError, ConnectionAbortedError, BrokenPipeError) as e:
|
||
if attempt < max_retries:
|
||
wait = min(2 ** (attempt + 1), 10)
|
||
print(f"[{self._session_id}] connection error (attempt {attempt+1}/{max_retries}), retrying in {wait}s: {e}", file=sys.stderr)
|
||
time.sleep(wait)
|
||
continue
|
||
return self.send_json(502, {"error": {"type": "proxy_error", "message": str(e)}})
|
||
except Exception as e:
|
||
return self.send_json(500, {"error": {"type": "proxy_error", "message": str(e)}})
|
||
break
|
||
self._forward_oa_compat(upstream, stream, model, chat_body, body, input_data, fwd, target, tracker)
|
||
|
||
def _build_chat_body(self, model, messages, body, stream):
|
||
chat_body = {"model": model, "messages": messages}
|
||
for k in ("temperature", "top_p"):
|
||
if k in body:
|
||
chat_body[k] = body[k]
|
||
chat_body["max_tokens"] = max(body.get("max_output_tokens", 0), 64000)
|
||
tools = oa_convert_tools(body.get("tools"))
|
||
if tools:
|
||
chat_body["tools"] = tools
|
||
if body.get("tool_choice"):
|
||
chat_body["tool_choice"] = body["tool_choice"]
|
||
chat_body["stream"] = stream
|
||
if not REASONING_ENABLED or REASONING_EFFORT == "none":
|
||
chat_body["enable_thinking"] = False
|
||
chat_body["reasoning_effort"] = "none"
|
||
else:
|
||
chat_body["reasoning_effort"] = REASONING_EFFORT
|
||
return chat_body
|
||
|
||
def _handle_gemini_oauth(self, body, model, stream, tracker=None):
|
||
input_data = body.get("input", "")
|
||
policy = provider_policy()
|
||
original_model = model
|
||
|
||
_GEMINI_KEEP_RECENT = 6
|
||
_GEMINI_OLD_LIMIT = 3000
|
||
_GEMINI_RECENT_LIMIT = 20000
|
||
|
||
if isinstance(input_data, list) and len(input_data) > 8:
|
||
n_tool_outputs = sum(1 for it in input_data if isinstance(it, dict) and it.get("type") == "function_call_output")
|
||
if n_tool_outputs > 2:
|
||
tool_indexes = [i for i, it in enumerate(input_data) if isinstance(it, dict) and it.get("type") == "function_call_output"]
|
||
recent_set = set(tool_indexes[-_GEMINI_KEEP_RECENT:])
|
||
compacted_data = []
|
||
for i, item in enumerate(input_data):
|
||
if isinstance(item, dict) and item.get("type") == "function_call_output":
|
||
o = item.get("output", "")
|
||
limit = _GEMINI_RECENT_LIMIT if i in recent_set else _GEMINI_OLD_LIMIT
|
||
if len(o) > limit:
|
||
item = dict(item)
|
||
item["output"] = o[:limit] + f"\n... [proxy compacted: kept {limit} of {len(o)} chars]"
|
||
compacted_data.append(item)
|
||
input_data = compacted_data
|
||
body = dict(body)
|
||
body["input"] = input_data
|
||
print(f"[gemini-compact] {n_tool_outputs} tool outputs, recent={_GEMINI_RECENT_LIMIT} old={_GEMINI_OLD_LIMIT}", file=sys.stderr)
|
||
|
||
if OAUTH_PROVIDER == "google-antigravity":
|
||
alias_map = {
|
||
"antigravity-gemini-3-flash": "gemini-3-flash",
|
||
"antigravity-gemini-3-pro": "gemini-3-pro-low",
|
||
"antigravity-gemini-3.1-pro": "gemini-3.1-pro-low",
|
||
"gemini-3-flash-preview": "gemini-3-flash",
|
||
"gemini-3-pro-preview": "gemini-3-pro-low",
|
||
"gemini-3.1-pro-preview": "gemini-3.1-pro-low",
|
||
"gemini-3-pro": "gemini-3-pro-low",
|
||
"gemini-3.1-pro": "gemini-3.1-pro-low",
|
||
"antigravity-claude-sonnet-4-6": "claude-sonnet-4-6",
|
||
"antigravity-claude-opus-4-6-thinking": "claude-opus-4-6-thinking",
|
||
}
|
||
model = alias_map.get(model, model)
|
||
if model != original_model:
|
||
print(f"[antigravity] model mapped user={original_model} upstream={model}", file=sys.stderr)
|
||
|
||
pair_errors = validate_tool_pairs(input_data)
|
||
if pair_errors:
|
||
input_data = repair_orphan_tool_outputs(input_data, pair_errors)
|
||
body = dict(body)
|
||
body["input"] = input_data
|
||
|
||
compacted = False
|
||
if policy.get("compaction") and isinstance(input_data, list):
|
||
input_data, compacted = _adaptive_compact(input_data, model, policy)
|
||
if compacted:
|
||
body = dict(body)
|
||
body["input"] = input_data
|
||
|
||
access_token = _refresh_oauth_token()
|
||
token_name = "google-antigravity-oauth-token.json" if OAUTH_PROVIDER == "google-antigravity" else "google-cli-oauth-token.json"
|
||
token_path = os.path.join(os.path.expanduser("~"), ".cache", "codex-proxy", token_name)
|
||
project_id = ""
|
||
try:
|
||
with open(token_path) as f:
|
||
project_id = json.load(f).get("project_id", "")
|
||
except Exception:
|
||
pass
|
||
|
||
contents = []
|
||
system_parts = []
|
||
instructions = body.get("instructions", "").strip()
|
||
tool_call_names = {}
|
||
|
||
if isinstance(input_data, list):
|
||
for item in input_data:
|
||
t = item.get("type")
|
||
if t == "message":
|
||
role = "user" if item.get("role") == "user" else "model"
|
||
content = item.get("content", "")
|
||
if isinstance(content, list):
|
||
parts = []
|
||
for c in content:
|
||
ct = c.get("type")
|
||
if ct == "input_text":
|
||
parts.append({"text": c.get("text", "")})
|
||
elif ct == "text":
|
||
parts.append({"text": c.get("text", "")})
|
||
elif ct == "input_image" or ct == "image_url":
|
||
iu = c.get("image_url") or c.get("url", {})
|
||
url = iu.get("url", iu) if isinstance(iu, dict) else iu
|
||
if isinstance(url, str) and url.startswith("data:"):
|
||
mime, _, b64 = url.partition(";base64,")
|
||
mime = mime.replace("data:", "") or "image/png"
|
||
parts.append({"inlineData": {"mimeType": mime, "data": b64}})
|
||
else:
|
||
parts.append({"text": str(url)})
|
||
if parts:
|
||
contents.append({"role": role, "parts": parts})
|
||
elif isinstance(content, str):
|
||
contents.append({"role": role, "parts": [{"text": content}]})
|
||
elif t == "function_call":
|
||
call_id = item.get("call_id") or item.get("id") or f"call_{uuid.uuid4().hex[:24]}"
|
||
fname = item.get("name", "")
|
||
if call_id and fname:
|
||
tool_call_names[call_id] = fname
|
||
args = item.get("arguments", "{}")
|
||
if isinstance(args, str):
|
||
try:
|
||
args = json.loads(args)
|
||
except Exception:
|
||
args = {}
|
||
fc_part = {"functionCall": {"name": fname, "args": args, "id": call_id}}
|
||
stored_sig = _gemini_get_sig(f"fc:{call_id}") or _gemini_get_sig(f"fc:{fname}")
|
||
if stored_sig:
|
||
fc_part["thoughtSignature"] = stored_sig
|
||
fc_part["thought_signature"] = stored_sig
|
||
else:
|
||
fc_part["thought_signature"] = "skip_thought_signature_validator"
|
||
contents.append({"role": "model", "parts": [fc_part]})
|
||
elif t == "function_call_output":
|
||
call_id = item.get("call_id", item.get("id", ""))
|
||
output = item.get("output", "")
|
||
fname = item.get("name", "") or tool_call_names.get(call_id, "")
|
||
try:
|
||
output_parsed = json.loads(output) if isinstance(output, str) else output
|
||
except Exception:
|
||
output_parsed = output
|
||
resp_part = {"functionResponse": {"name": fname or "unknown", "response": {"result": output_parsed if isinstance(output_parsed, (dict, list)) else output}}}
|
||
if call_id:
|
||
resp_part["functionResponse"]["id"] = call_id
|
||
contents.append({"role": "user", "parts": [resp_part]})
|
||
|
||
if OAUTH_PROVIDER.startswith("google"):
|
||
sanitized = []
|
||
last_user_text = None
|
||
last_role = None
|
||
for content in contents:
|
||
role = content.get("role")
|
||
parts = [p for p in content.get("parts", []) if isinstance(p, dict)]
|
||
if not parts:
|
||
continue
|
||
text_key = "\n".join([p.get("text", "") for p in parts if "text" in p]).strip()
|
||
if role == "user" and text_key and text_key == last_user_text:
|
||
continue
|
||
if role == last_role and role in ("user", "model") and sanitized:
|
||
sanitized[-1].setdefault("parts", []).extend(parts)
|
||
else:
|
||
sanitized.append({"role": role, "parts": parts})
|
||
if role == "user" and text_key:
|
||
last_user_text = text_key
|
||
last_role = role
|
||
while sanitized and sanitized[0].get("role") != "user":
|
||
sanitized.pop(0)
|
||
while sanitized and sanitized[-1].get("role") != "user":
|
||
sanitized.pop()
|
||
contents = sanitized
|
||
|
||
if instructions:
|
||
system_parts.append({"text": instructions})
|
||
if OAUTH_PROVIDER == "google-antigravity":
|
||
system_parts.append({"text": (
|
||
"You are connected through a Responses API translation proxy. "
|
||
"If tools are available and the user's request requires changing files, call the appropriate tool immediately. "
|
||
"Do not announce plans, do not say you will list files, browse, fetch, inspect, or start by exploring unless you are emitting the actual tool call in the same response. "
|
||
"For file creation requests, use tools to create or modify the file instead of only printing code in chat. "
|
||
"If no suitable tool is available, answer directly with the complete result. "
|
||
"Never answer only with a plan such as 'I will start by...' or 'I am going to...'."
|
||
)})
|
||
|
||
gen_config = {}
|
||
mot = body.get("max_output_tokens", 0)
|
||
if mot:
|
||
gen_config["maxOutputTokens"] = mot
|
||
if body.get("temperature") is not None:
|
||
gen_config["temperature"] = body["temperature"]
|
||
if body.get("top_p") is not None:
|
||
gen_config["topP"] = body["top_p"]
|
||
|
||
if REASONING_ENABLED and REASONING_EFFORT != "none":
|
||
budget = {"low": 2048, "medium": 8192, "high": 24576}.get(REASONING_EFFORT, 8192)
|
||
gen_config["thinkingConfig"] = {"includeThoughts": True, "thinkingBudget": budget}
|
||
|
||
oa_tools = body.get("tools", [])
|
||
gemini_tools = []
|
||
if oa_tools:
|
||
func_decls = []
|
||
for tool in oa_tools:
|
||
ttype = tool.get("type", "function")
|
||
fname = tool.get("name", "")
|
||
if ttype == "function":
|
||
fn = tool.get("function", tool)
|
||
name = fn.get("name", fname)
|
||
desc = fn.get("description", "")
|
||
params = fn.get("parameters", fn.get("input_schema", {}))
|
||
func_decls.append({"name": name, "description": desc, "parameters": params})
|
||
elif fname:
|
||
func_decls.append({"name": fname, "description": tool.get("description", ""), "parameters": tool.get("parameters", {"type": "object", "properties": {}})})
|
||
if func_decls:
|
||
gemini_tools = [{"functionDeclarations": func_decls}]
|
||
|
||
if OAUTH_PROVIDER == "google-antigravity":
|
||
contents = _gemini_reattach_sigs(contents)
|
||
|
||
if OAUTH_PROVIDER == "google-antigravity":
|
||
guardrail_found = any("autonomous coding agent" in json.dumps(c.get("parts", []), ensure_ascii=False) for c in contents[:2])
|
||
if not guardrail_found:
|
||
contents.insert(0, {"role": "user", "parts": [{"text": _GEMINI_AGENT_GUARDRAIL}]})
|
||
|
||
if OAUTH_PROVIDER == "google-antigravity" and isinstance(input_data, list):
|
||
latest_user = ""
|
||
for item in reversed(input_data):
|
||
if item.get("type") == "message" and item.get("role") == "user":
|
||
c = item.get("content", "")
|
||
if isinstance(c, str):
|
||
latest_user = c
|
||
elif isinstance(c, list):
|
||
latest_user = "\n".join(p.get("text", p.get("input_text", "")) for p in c if isinstance(p, dict))
|
||
break
|
||
if latest_user:
|
||
serialized = json.dumps(contents, ensure_ascii=False)
|
||
needle = latest_user.strip().replace("\n", " ")[:120]
|
||
escaped_serialized = serialized.replace("\\n", " ")
|
||
if needle and needle not in escaped_serialized:
|
||
print(f"[antigravity] latest user instruction missing from contents, appending", file=sys.stderr)
|
||
contents.append({"role": "user", "parts": [{"text": latest_user}]})
|
||
|
||
request_body = {"contents": contents}
|
||
if system_parts:
|
||
request_body["systemInstruction"] = {"parts": system_parts}
|
||
if gen_config:
|
||
request_body["generationConfig"] = gen_config
|
||
if gemini_tools:
|
||
request_body["tools"] = gemini_tools
|
||
|
||
wrapped = {
|
||
"project": project_id,
|
||
"model": model,
|
||
"request": request_body,
|
||
}
|
||
if OAUTH_PROVIDER == "google-antigravity":
|
||
wrapped["requestType"] = "agent"
|
||
wrapped["userAgent"] = "antigravity"
|
||
wrapped["requestId"] = f"agent-{uuid.uuid4().hex[:12]}"
|
||
|
||
endpoints = ([
|
||
"https://daily-cloudcode-pa.sandbox.googleapis.com",
|
||
"https://autopush-cloudcode-pa.sandbox.googleapis.com",
|
||
"https://cloudcode-pa.googleapis.com",
|
||
] if OAUTH_PROVIDER == "google-antigravity" else [
|
||
"https://cloudcode-pa.googleapis.com",
|
||
])
|
||
action = "streamGenerateContent" if stream else "generateContent"
|
||
url_suffix = f"v1internal:{action}?alt=sse" if stream else f"v1internal:{action}"
|
||
|
||
headers = {
|
||
"Content-Type": "application/json",
|
||
"Authorization": f"Bearer {access_token}",
|
||
}
|
||
if OAUTH_PROVIDER == "google-antigravity":
|
||
version = _ensure_antigravity_version()
|
||
headers["User-Agent"] = f"antigravity/{version} darwin/arm64"
|
||
else:
|
||
headers["User-Agent"] = "google-api-nodejs-client/9.15.1"
|
||
headers["X-Goog-Api-Client"] = "gl-node/22.17.0"
|
||
headers["Client-Metadata"] = "ideType=IDE_UNSPECIFIED,platform=PLATFORM_UNSPECIFIED,pluginType=GEMINI"
|
||
body_b = json.dumps(wrapped).encode()
|
||
n_contents = len(contents)
|
||
has_tools = bool(gemini_tools)
|
||
print(f"[{self._session_id}] model={model} stream={stream} items={len(input_data) if isinstance(input_data, list) else 1} project={project_id} contents={n_contents} tools={has_tools}", file=sys.stderr)
|
||
if n_contents > 10:
|
||
debug_path = os.path.join(_LOG_DIR, f"gemini-long-ctx-{self._session_id}.json")
|
||
try:
|
||
with open(debug_path, "w") as dbg:
|
||
json.dump({"contents_count": n_contents, "contents_roles": [c.get("role") for c in contents], "has_tools": has_tools, "model": model, "wrapped_size": len(body_b)}, dbg, indent=2)
|
||
except Exception:
|
||
pass
|
||
|
||
for ep in endpoints:
|
||
target = f"{ep}/{url_suffix}"
|
||
req = urllib.request.Request(target, data=body_b, headers=headers)
|
||
try:
|
||
upstream = urllib.request.urlopen(req, timeout=_upstream_timeout(body, stream))
|
||
break
|
||
except urllib.error.HTTPError as e:
|
||
err_body = e.read().decode()
|
||
if e.code == 400 and OAUTH_PROVIDER.startswith("google"):
|
||
try:
|
||
debug_path = os.path.join(_LOG_DIR, "gemini-last-400-request.json")
|
||
with open(debug_path, "w") as dbg:
|
||
json.dump({"endpoint": ep, "model": model, "wrapped": wrapped, "error": err_body}, dbg, indent=2)
|
||
print(f"[{self._session_id}] saved 400 debug request to {debug_path}", file=sys.stderr)
|
||
except Exception:
|
||
pass
|
||
if e.code == 429 and ep != endpoints[-1]:
|
||
print(f"[{self._session_id}] {ep} HTTP 429, trying next endpoint", file=sys.stderr)
|
||
continue
|
||
if e.code == 429:
|
||
pool = _google_antigravity_pool if OAUTH_PROVIDER == "google-antigravity" else _google_cli_pool
|
||
_, acct = _get_google_account(OAUTH_PROVIDER)
|
||
if acct:
|
||
pool.mark_rate_limited(acct, 60)
|
||
return self.send_json(e.code, {"error": {"type": "upstream_error", "message": _sanitize_err_body(err_body)}})
|
||
except Exception as e:
|
||
if ep == endpoints[-1]:
|
||
return self.send_json(502, {"error": {"type": "proxy_error", "message": str(e)}})
|
||
print(f"[{self._session_id}] {ep} failed: {e}, trying next", file=sys.stderr)
|
||
continue
|
||
|
||
if stream:
|
||
self._forward_gemini_sse(upstream, model, body, input_data, tracker)
|
||
else:
|
||
self._forward_gemini_json(upstream, model, body, input_data)
|
||
|
||
def _forward_gemini_sse(self, upstream, model, body, input_data, tracker=None):
|
||
resp_id = f"resp-{uuid.uuid4().hex[:24]}"
|
||
created = int(time.time())
|
||
self.send_response(200)
|
||
self.send_header("Content-Type", "text/event-stream")
|
||
self.send_header("Cache-Control", "no-cache")
|
||
self.send_header("Connection", "keep-alive")
|
||
self.end_headers()
|
||
|
||
full_text = ""
|
||
output_items = []
|
||
current_tool_calls = {}
|
||
message_started = False
|
||
message_id = f"msg-{uuid.uuid4().hex[:24]}"
|
||
|
||
def flush_event(event_type, data):
|
||
self.wfile.write(f"event: {event_type}\ndata: {json.dumps(data)}\n\n".encode())
|
||
self.wfile.flush()
|
||
|
||
flush_event("response.created", {"type": "response.created", "response": {"id": resp_id, "object": "response", "model": model, "status": "in_progress", "created": created, "output": []}})
|
||
flush_event("response.in_progress", {"type": "response.in_progress", "response": {"id": resp_id}})
|
||
|
||
buf = ""
|
||
stream_finished = False
|
||
for raw_line in _stream_with_idle_timeout(upstream):
|
||
if tracker and tracker.cancelled.is_set():
|
||
print("[gemini-oauth] stream cancelled", file=sys.stderr)
|
||
break
|
||
if stream_finished:
|
||
break
|
||
line = raw_line.decode(errors="replace")
|
||
if line.startswith("data: "):
|
||
buf += line[6:]
|
||
continue
|
||
if not line.strip() and buf:
|
||
try:
|
||
chunk = json.loads(buf)
|
||
except Exception:
|
||
buf = ""
|
||
continue
|
||
buf = ""
|
||
|
||
candidates = chunk.get("response", chunk).get("candidates", [])
|
||
if not candidates:
|
||
if chunk.get("error"):
|
||
print(f"[{self._session_id}] stream error chunk: {str(chunk.get('error'))[:300]}", file=sys.stderr)
|
||
continue
|
||
if candidates[0].get("finishReason") and not candidates[0].get("content", {}).get("parts"):
|
||
print(f"[{self._session_id}] finish without parts: {candidates[0].get('finishReason')}", file=sys.stderr)
|
||
parts = candidates[0].get("content", {}).get("parts", [])
|
||
for part in parts:
|
||
sig = _extract_gemini_sig(part)
|
||
if sig:
|
||
if part.get("functionCall"):
|
||
fc_id = part["functionCall"].get("id") or part["functionCall"].get("name")
|
||
fc_name = part["functionCall"].get("name")
|
||
if fc_id:
|
||
_gemini_store_sig(f"fc:{fc_id}", sig)
|
||
if fc_name:
|
||
_gemini_store_sig(f"fc:{fc_name}", sig)
|
||
_gemini_store_sig(f"turn:{resp_id}", sig)
|
||
if part.get("thought"):
|
||
sig_from_thought = _extract_gemini_sig(part)
|
||
if sig_from_thought:
|
||
_gemini_store_sig(f"turn:{resp_id}", sig_from_thought)
|
||
continue
|
||
if "text" in part and not part.get("functionCall"):
|
||
text_delta = part["text"]
|
||
if not text_delta:
|
||
continue
|
||
full_text += text_delta
|
||
if not message_started:
|
||
flush_event("response.output_item.added", {"type": "response.output_item.added", "output_index": 0, "item": {"type": "message", "id": message_id, "role": "assistant", "content": []}})
|
||
flush_event("response.content_part.added", {"type": "response.content_part.added", "output_index": 0, "content_index": 0, "part": {"type": "output_text", "text": ""}})
|
||
output_items.append({"text": True})
|
||
message_started = True
|
||
flush_event("response.output_text.delta", {"type": "response.output_text.delta", "output_index": 0, "content_index": 0, "delta": text_delta})
|
||
elif part.get("functionCall"):
|
||
fc = part["functionCall"]
|
||
call_id = f"call_{uuid.uuid4().hex[:24]}"
|
||
args_str = json.dumps(fc.get("args", fc.get("arguments", {})))
|
||
output_index = len(output_items)
|
||
flush_event("response.output_item.added", {"type": "response.output_item.added", "output_index": output_index, "item": {"type": "function_call", "id": call_id, "call_id": call_id, "name": fc.get("name", ""), "arguments": ""}})
|
||
flush_event("response.function_call_arguments.delta", {"type": "response.function_call_arguments.delta", "output_index": output_index, "item_id": call_id, "delta": args_str})
|
||
flush_event("response.function_call_arguments.done", {"type": "response.function_call_arguments.done", "output_index": output_index, "item_id": call_id, "arguments": args_str})
|
||
current_tool_calls[call_id] = fc
|
||
output_items.append({"tool": True})
|
||
last_finish = candidates[0].get("finishReason", "")
|
||
if last_finish:
|
||
part_kinds = []
|
||
for p in parts:
|
||
if "text" in p: part_kinds.append("text")
|
||
if "functionCall" in p: part_kinds.append("functionCall")
|
||
if _extract_gemini_sig(p): part_kinds.append("thoughtSignature")
|
||
print(f"[{self._session_id}] [antigravity] finish={last_finish} parts={part_kinds} tool_calls={len(current_tool_calls)}", file=sys.stderr)
|
||
if OAUTH_PROVIDER == "google-antigravity" and last_finish == "MAX_TOKENS" and full_text and not current_tool_calls:
|
||
print(f"[{self._session_id}] MAX_TOKENS hit ({len(full_text)} chars), auto-continuing...", file=sys.stderr)
|
||
break
|
||
stream_finished = True
|
||
break
|
||
|
||
if OAUTH_PROVIDER.startswith("google") and full_text and not current_tool_calls and last_finish == "MAX_TOKENS" and not stream_finished:
|
||
result = _auto_continue_gemini(self, flush_event, message_id, model, gen_config, gemini_tools, system_parts, project_id, headers, endpoints, url_suffix, full_text, output_items, message_started)
|
||
if result:
|
||
full_text = result
|
||
for item in output_items:
|
||
if isinstance(item, dict) and item.get("tool") and "fc" in item and "call_id" in item:
|
||
current_tool_calls[item["call_id"]] = item["fc"]
|
||
|
||
out = []
|
||
if not full_text and not current_tool_calls:
|
||
print("[gemini-oauth] WARNING: completed with empty output", file=sys.stderr)
|
||
if full_text:
|
||
out.append({"type": "message", "id": message_id, "role": "assistant", "content": [{"type": "output_text", "text": full_text}]})
|
||
tool_outputs = []
|
||
for cid, fc in current_tool_calls.items():
|
||
tool_outputs.append({"type": "function_call", "id": cid, "call_id": cid, "name": fc.get("name", ""), "arguments": json.dumps(fc.get("args", fc.get("arguments", {})))})
|
||
out.extend(tool_outputs)
|
||
|
||
final_resp = {"id": resp_id, "object": "response", "model": model, "status": "completed", "created": created, "output": out}
|
||
if full_text:
|
||
flush_event("response.output_text.done", {"type": "response.output_text.done", "output_index": 0, "content_index": 0, "text": full_text})
|
||
flush_event("response.content_part.done", {"type": "response.content_part.done", "output_index": 0, "content_index": 0, "part": {"type": "output_text", "text": full_text}})
|
||
flush_event("response.output_item.done", {"type": "response.output_item.done", "output_index": 0, "item": out[0]})
|
||
for idx, item in enumerate(tool_outputs, start=(1 if full_text else 0)):
|
||
flush_event("response.output_item.done", {"type": "response.output_item.done", "output_index": idx, "item": item})
|
||
flush_event("response.completed", {"type": "response.completed", "response": final_resp})
|
||
self.close_connection = True
|
||
|
||
with _response_store_lock:
|
||
_response_store[resp_id] = final_resp
|
||
while len(_response_store) > _MAX_STORED:
|
||
_response_store.popitem(last=False)
|
||
|
||
def _forward_gemini_json(self, upstream, model, body, input_data):
|
||
data = json.loads(upstream.read().decode())
|
||
resp_id = f"resp-{uuid.uuid4().hex[:24]}"
|
||
created = int(time.time())
|
||
out = []
|
||
full_text = ""
|
||
candidates = data.get("response", data).get("candidates", [])
|
||
if candidates:
|
||
parts = candidates[0].get("content", {}).get("parts", [])
|
||
text_parts = []
|
||
for part in parts:
|
||
if part.get("thought"):
|
||
continue
|
||
if "text" in part and not part.get("functionCall"):
|
||
text_parts.append(part["text"])
|
||
elif part.get("functionCall"):
|
||
fc = part["functionCall"]
|
||
call_id = f"call_{uuid.uuid4().hex[:24]}"
|
||
out.append({"type": "function_call", "id": call_id, "call_id": call_id, "name": fc.get("name", ""), "arguments": json.dumps(fc.get("args", fc.get("arguments", {})))})
|
||
if text_parts:
|
||
full_text = "".join(text_parts)
|
||
out.insert(0, {"type": "message", "id": f"msg-{uuid.uuid4().hex[:24]}", "role": "assistant", "content": [{"type": "output_text", "text": full_text}]})
|
||
resp = {"id": resp_id, "object": "response", "model": model, "status": "completed", "created": created, "output": out}
|
||
with _response_store_lock:
|
||
_response_store[resp_id] = resp
|
||
while len(_response_store) > _MAX_STORED:
|
||
_response_store.popitem(last=False)
|
||
self.send_json(200, resp)
|
||
|
||
def _handle_bgp(self, body, model, stream, messages, input_data):
|
||
routes = _sorted_bgp_routes()
|
||
routes = [r for r in routes if _bucket_for_route(r).allow()]
|
||
if not routes:
|
||
return self.send_json(503, {"error": {"type": "bgp_rate_limited", "message": "All routes rate-limited"}})
|
||
errors = []
|
||
for route in routes:
|
||
r_model = route.get("model", model)
|
||
r_url = route["target_url"].rstrip("/")
|
||
r_key = route.get("api_key", "")
|
||
r_reasoning = route.get("reasoning_enabled", True)
|
||
r_effort = route.get("reasoning_effort", "medium")
|
||
r_oauth = route.get("oauth_provider", "")
|
||
|
||
chat_body = dict(messages=list(messages))
|
||
chat_body["model"] = r_model
|
||
for k in ("temperature", "top_p"):
|
||
if k in body:
|
||
chat_body[k] = body[k]
|
||
chat_body["max_tokens"] = max(body.get("max_output_tokens", 0), 64000)
|
||
tools = oa_convert_tools(body.get("tools"))
|
||
if tools:
|
||
chat_body["tools"] = tools
|
||
if body.get("tool_choice"):
|
||
chat_body["tool_choice"] = body["tool_choice"]
|
||
chat_body["stream"] = stream
|
||
if not r_reasoning or r_effort == "none":
|
||
chat_body["enable_thinking"] = False
|
||
chat_body["reasoning_effort"] = "none"
|
||
else:
|
||
chat_body["reasoning_effort"] = r_effort
|
||
|
||
target = upstream_target(r_url, "/chat/completions")
|
||
if r_oauth == "google":
|
||
r_key = _refresh_oauth_token_for(r_key, r_oauth)
|
||
fwd = forwarded_headers(self.headers, {
|
||
"Content-Type": "application/json",
|
||
"Authorization": f"Bearer {r_key}",
|
||
}, browser_ua=True)
|
||
print(f"[{self._session_id}] trying route '{route.get('name', r_url)}' model={r_model}", file=sys.stderr)
|
||
req = urllib.request.Request(target, data=json.dumps(chat_body).encode(), headers=fwd)
|
||
t0_route = time.time()
|
||
route_ok = False
|
||
for attempt in range(3):
|
||
try:
|
||
upstream = urllib.request.urlopen(req, timeout=_upstream_timeout(body, stream))
|
||
print(f"[{self._session_id}] route '{route.get('name', r_url)}' connected OK", file=sys.stderr)
|
||
_update_route_stats(route, True, time.time() - t0_route)
|
||
self._forward_oa_compat(upstream, stream, r_model, chat_body, body, input_data, fwd, target)
|
||
return
|
||
except urllib.error.HTTPError as e:
|
||
err = e.read().decode()
|
||
if e.code in (429, 502, 503) and attempt < 2:
|
||
retry_after = e.headers.get("Retry-After")
|
||
wait = min(int(retry_after), 60) if retry_after and retry_after.isdigit() else min(2 ** (attempt + 1), 10)
|
||
print(f"[{self._session_id}] route '{route.get('name', r_url)}' HTTP {e.code}, retry {attempt+1}/2 in {wait}s", file=sys.stderr)
|
||
time.sleep(wait)
|
||
req = urllib.request.Request(target, data=json.dumps(chat_body).encode(), headers=fwd)
|
||
continue
|
||
print(f"[{self._session_id}] route '{route.get('name', r_url)}' FAILED: HTTP {e.code}: {err[:200]}", file=sys.stderr)
|
||
_update_route_stats(route, False, time.time() - t0_route, http_code=e.code)
|
||
errors.append(f"{route.get('name','?')}: HTTP {e.code}")
|
||
break
|
||
except (ConnectionResetError, ConnectionAbortedError, BrokenPipeError) as e:
|
||
if attempt < 2:
|
||
wait = min(2 ** (attempt + 1), 8)
|
||
print(f"[{self._session_id}] route '{route.get('name', r_url)}' conn error, retry {attempt+1}/2 in {wait}s: {e}", file=sys.stderr)
|
||
time.sleep(wait)
|
||
req = urllib.request.Request(target, data=json.dumps(chat_body).encode(), headers=fwd)
|
||
continue
|
||
_update_route_stats(route, False, time.time() - t0_route, error_type=str(e))
|
||
errors.append(f"{route.get('name','?')}: {e}")
|
||
break
|
||
except Exception as e:
|
||
print(f"[{self._session_id}] route '{route.get('name', r_url)}' FAILED: {e}", file=sys.stderr)
|
||
_update_route_stats(route, False, time.time() - t0_route, error_type=str(e))
|
||
errors.append(f"{route.get('name','?')}: {e}")
|
||
break
|
||
|
||
print(f"[{self._session_id}] ALL ROUTES FAILED: {errors}", file=sys.stderr)
|
||
self.send_json(502, {"error": {"type": "bgp_all_routes_failed", "message": f"All BGP routes failed: {'; '.join(errors)}"}})
|
||
|
||
def _forward_oa_compat(self, upstream, stream, model, chat_body, body, input_data, fwd, target, tracker=None):
|
||
n_items = len(input_data) if isinstance(input_data, list) else 1
|
||
t0 = time.time()
|
||
provider = TARGET_URL.split("//")[-1].split("/")[0]
|
||
if BGP_ROUTES:
|
||
provider = "bgp:" + (BGP_ROUTES[0].get("name", "pool") if BGP_ROUTES else "unknown")
|
||
|
||
if stream:
|
||
self.send_response(200)
|
||
self.send_header("Content-Type", "text/event-stream")
|
||
self.send_header("Cache-Control", "no-cache")
|
||
self.send_header("Connection", "keep-alive")
|
||
self.end_headers()
|
||
if hasattr(self, 'connection') and self.connection:
|
||
try:
|
||
self.connection.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
|
||
except Exception:
|
||
pass
|
||
|
||
collected_events = []
|
||
last_resp_id = None
|
||
last_output = None
|
||
last_status = None
|
||
finish_reason = None
|
||
has_content = False
|
||
|
||
def _observe_event(event):
|
||
nonlocal last_resp_id, last_output, last_status, finish_reason, has_content
|
||
for line in event.strip().split("\n"):
|
||
if line.startswith("data: "):
|
||
try:
|
||
d = json.loads(line[6:])
|
||
if d.get("type") == "response.completed":
|
||
last_resp_id = d.get("response", {}).get("id")
|
||
last_output = d.get("response", {}).get("output", [])
|
||
last_status = d.get("response", {}).get("status")
|
||
finish_reason = "length" if last_status == "incomplete" else "stop"
|
||
has_content = any(o.get("type") == "message" for o in (last_output or []))
|
||
except Exception:
|
||
pass
|
||
|
||
try:
|
||
for event in oa_stream_to_sse(upstream, model, body.get("request_id") or body.get("id")):
|
||
if tracker and tracker.cancelled.is_set():
|
||
print("[translate-proxy] stream cancelled", file=sys.stderr)
|
||
break
|
||
collected_events.append(event)
|
||
_observe_event(event)
|
||
except (ConnectionResetError, BrokenPipeError, ConnectionAbortedError):
|
||
print("[translate-proxy] client disconnected during stream", file=sys.stderr)
|
||
_crof_record(model, n_items, False)
|
||
_log_resp(last_resp_id, "client_disconnect", last_output)
|
||
return
|
||
|
||
# Record outcome
|
||
success = (finish_reason != "length")
|
||
_crof_record(model, n_items, success)
|
||
_log_resp(last_resp_id, last_status, last_output)
|
||
if last_resp_id and input_data is not None:
|
||
store_response(last_resp_id, input_data, last_output)
|
||
_record_usage(provider, model, success, time.time() - t0, error_type="length" if not success else None)
|
||
|
||
# Auto-learn provider quirks before flushing the bad response to Codex.
|
||
if finish_reason == "length" and not has_content and has_function_call_output(input_data):
|
||
_set_provider_cap(model, "synthetic_tool_results", True, "incomplete empty response after tool output")
|
||
new_input, synthesized = synthesize_tool_results_for_chat(input_data)
|
||
if synthesized:
|
||
print("[provider-sensor] retrying turn with synthetic tool results", file=sys.stderr)
|
||
new_messages = oa_input_to_messages(new_input)
|
||
instructions = body.get("instructions", "").strip()
|
||
if instructions:
|
||
new_messages.insert(0, {"role": "system", "content": instructions})
|
||
new_chat_body = self._build_chat_body(model, new_messages, body, stream)
|
||
new_req = urllib.request.Request(target, data=json.dumps(new_chat_body).encode(), headers=fwd)
|
||
try:
|
||
retry_upstream = urllib.request.urlopen(new_req, timeout=_upstream_timeout(body, True))
|
||
collected_events = []
|
||
last_resp_id = last_output = last_status = None
|
||
finish_reason = None
|
||
has_content = False
|
||
for event in oa_stream_to_sse(retry_upstream, model, body.get("request_id") or body.get("id")):
|
||
collected_events.append(event)
|
||
_observe_event(event)
|
||
input_data = new_input
|
||
except Exception as e:
|
||
print(f"[provider-sensor] synthetic retry failed: {e}", file=sys.stderr)
|
||
|
||
# Auto-retry on finish_reason=length with no content due to too much context.
|
||
if finish_reason == "length" and not has_content and isinstance(input_data, list) and len(input_data) > 5:
|
||
print(f"[crof-adaptive] RETRY: finish_reason=length with no content, compacting {n_items} items", file=sys.stderr)
|
||
new_input = _crof_compact_for_retry(input_data, model)
|
||
if len(new_input) < len(input_data):
|
||
new_body = dict(body)
|
||
new_body["input"] = new_input
|
||
new_messages = oa_input_to_messages(new_input)
|
||
instructions = body.get("instructions", "").strip()
|
||
if instructions:
|
||
new_messages.insert(0, {"role": "system", "content": instructions})
|
||
new_chat_body = dict(chat_body)
|
||
new_chat_body["messages"] = new_messages
|
||
new_req = urllib.request.Request(
|
||
target,
|
||
data=json.dumps(new_chat_body).encode(),
|
||
headers=fwd,
|
||
)
|
||
try:
|
||
retry_upstream = urllib.request.urlopen(new_req, timeout=_upstream_timeout(body, True))
|
||
collected_events = []
|
||
last_resp_id = last_output = last_status = None
|
||
finish_reason = None
|
||
has_content = False
|
||
for event in oa_stream_to_sse(retry_upstream, model, body.get("request_id") or body.get("id")):
|
||
collected_events.append(event)
|
||
_observe_event(event)
|
||
input_data = new_input
|
||
except Exception as e:
|
||
print(f"[crof-adaptive] retry failed: {e}", file=sys.stderr)
|
||
|
||
self.stream_buffered_events(collected_events)
|
||
else:
|
||
result = oa_resp_to_responses(json.loads(upstream.read()), model)
|
||
success = result.get("status") != "incomplete"
|
||
_crof_record(model, n_items, success)
|
||
self.send_json(200, result)
|
||
rid = result.get("id")
|
||
_log_resp(rid, result.get("status"), result.get("output", []))
|
||
if rid and input_data is not None:
|
||
store_response(rid, input_data, result.get("output", []))
|
||
_record_usage(provider, model, success, time.time() - t0)
|
||
|
||
def _forward_oa_compat_retry(self, req, model, chat_body, body, input_data, tracker=None):
|
||
try:
|
||
upstream = urllib.request.urlopen(req, timeout=_upstream_timeout(body, True))
|
||
except Exception as e:
|
||
print(f"[crof-adaptive] retry failed: {e}", file=sys.stderr)
|
||
return
|
||
|
||
self.send_response(200)
|
||
self.send_header("Content-Type", "text/event-stream")
|
||
self.send_header("Cache-Control", "no-cache")
|
||
self.send_header("Connection", "keep-alive")
|
||
self.end_headers()
|
||
if hasattr(self, 'connection') and self.connection:
|
||
try:
|
||
self.connection.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
|
||
except Exception:
|
||
pass
|
||
|
||
last_resp_id = None
|
||
last_output = None
|
||
last_status = None
|
||
try:
|
||
def on_event(event):
|
||
nonlocal last_resp_id, last_output, last_status
|
||
if tracker and tracker.cancelled.is_set():
|
||
print("[translate-proxy] retry stream cancelled", file=sys.stderr)
|
||
return False
|
||
for line in event.strip().split("\n"):
|
||
if line.startswith("data: "):
|
||
try:
|
||
d = json.loads(line[6:])
|
||
if d.get("type") == "response.completed":
|
||
last_resp_id = d.get("response", {}).get("id")
|
||
last_output = d.get("response", {}).get("output", [])
|
||
last_status = d.get("response", {}).get("status")
|
||
except: pass
|
||
return True
|
||
self.stream_buffered_events(oa_stream_to_sse(upstream, model, body.get("request_id") or body.get("id")), on_event=on_event)
|
||
except (ConnectionResetError, BrokenPipeError, ConnectionAbortedError):
|
||
print("[translate-proxy] client disconnected during retry stream", file=sys.stderr)
|
||
|
||
n_items = len(input_data) if isinstance(input_data, list) else 1
|
||
_crof_record(model, n_items, last_status == "completed")
|
||
_log_resp(last_resp_id, last_status or "retry_disconnect", last_output)
|
||
if last_resp_id and input_data is not None:
|
||
store_response(last_resp_id, input_data, last_output)
|
||
|
||
def _handle_anthropic(self, body, model, stream, tracker=None):
|
||
input_data = body.get("input", "")
|
||
an_body = {"model": model, "messages": an_input_to_messages(input_data),
|
||
"max_tokens": body.get("max_output_tokens", 8192)}
|
||
instructions = body.get("instructions", "").strip()
|
||
if instructions:
|
||
an_body["system"] = [{"type": "text", "text": instructions,
|
||
"cache_control": {"type": "ephemeral"}}]
|
||
for k in ("temperature", "top_p"):
|
||
if k in body:
|
||
an_body[k] = body[k]
|
||
tools = an_convert_tools(body.get("tools"))
|
||
if tools:
|
||
an_body["tools"] = tools
|
||
if body.get("tool_choice"):
|
||
tc = body["tool_choice"]
|
||
if isinstance(tc, str):
|
||
an_body["tool_choice"] = {"type": tc}
|
||
elif isinstance(tc, dict):
|
||
an_body["tool_choice"] = tc
|
||
an_body["stream"] = stream
|
||
|
||
target = upstream_target(TARGET_URL, "/messages")
|
||
req = urllib.request.Request(
|
||
target,
|
||
data=json.dumps(an_body).encode(),
|
||
headers=forwarded_headers(self.headers, {
|
||
"Content-Type": "application/json",
|
||
"x-api-key": API_KEY,
|
||
"anthropic-version": "2023-06-01",
|
||
}),
|
||
)
|
||
self._forward(req, stream, model,
|
||
lambda r: an_resp_to_responses(json.loads(r.read()), model),
|
||
lambda s: an_stream_to_sse(s, model, body.get("request_id") or body.get("id")),
|
||
input_data=body.get("input", ""), tracker=tracker)
|
||
|
||
def _handle_command_code(self, body, model, stream, tracker=None):
|
||
"""[ALL FIXES IN ONE] CommandCode /alpha/generate adapter.
|
||
|
||
FIX 1: Uses cc_input_to_messages (string content only, no content blocks)
|
||
FIX 2: Always sends x-command-code-version header (fallback "0.26.8")
|
||
FIX 3: No stale schema cache — cleared, 24h TTL
|
||
FIX 4: Streaming path wrapped in try/except → sends response.completed(status="failed") on crash
|
||
FIX 5: Response parser (_parse_commandcode_text_tool_calls) now extracts raw JSON tool calls
|
||
FIX 6: Arguments no longer double-wrapped (three-tier parser in _extract_args)
|
||
FIX 7: _extract_field handles escaped values (\") correctly
|
||
FIX 8: sandbox_permissions normalized to valid variants only
|
||
REVERTED: Removed adaptive probing system (caused format mismatch).
|
||
Uses conservative cc_input_to_messages format exclusively.
|
||
ErrorAnalyzer learning on retries (not proactive probes).
|
||
"""
|
||
input_data = body.get("input", "")
|
||
instructions = body.get("instructions", "").strip()
|
||
|
||
schema = _load_schema(model=model)
|
||
|
||
thread_id = body.get("request_id") or body.get("id") or ""
|
||
try:
|
||
uuid.UUID(thread_id)
|
||
except (ValueError, AttributeError):
|
||
thread_id = str(uuid.uuid4())
|
||
|
||
# Build auth headers
|
||
auth_val = f"{schema.auth_scheme}{API_KEY}" if schema.auth_scheme else API_KEY
|
||
headers_extra = {
|
||
"Content-Type": "application/json",
|
||
"Accept": "text/event-stream, application/json",
|
||
}
|
||
if schema.auth_header:
|
||
headers_extra[schema.auth_header] = auth_val
|
||
else:
|
||
headers_extra["Authorization"] = f"Bearer {API_KEY}"
|
||
headers_extra["x-command-code-version"] = CC_VERSION or "0.26.8"
|
||
|
||
pm = schema.param_names
|
||
tp = schema.field_names.get("tools_param", "tools")
|
||
target = upstream_target(TARGET_URL, "/alpha/generate")
|
||
|
||
# ── MAIN REQUEST WITH RETRY ──
|
||
max_retries = 2
|
||
for attempt in range(max_retries + 1):
|
||
cc_msgs = cc_input_to_messages(input_data, instructions, schema)
|
||
cc_body = {
|
||
"config": _cc_config(),
|
||
"memory": "", "taste": "", "skills": "",
|
||
"params": {
|
||
"stream": True,
|
||
pm.get("max_tokens", "max_tokens"): body.get("max_output_tokens", 64000),
|
||
pm.get("temperature", "temperature"): body.get("temperature", 0.3),
|
||
"messages": cc_msgs,
|
||
"model": model,
|
||
tp: [],
|
||
},
|
||
"threadId": thread_id,
|
||
}
|
||
|
||
fwd = forwarded_headers(self.headers, headers_extra, browser_ua=True)
|
||
print(f"[{self._session_id}] POST {target} model={model} stream={stream} attempt={attempt} [command-code]", file=sys.stderr)
|
||
req = urllib.request.Request(
|
||
target,
|
||
data=json.dumps(cc_body).encode(),
|
||
headers=fwd,
|
||
)
|
||
|
||
try:
|
||
upstream = urllib.request.urlopen(req, timeout=_upstream_timeout(body, True))
|
||
break
|
||
except urllib.error.HTTPError as e:
|
||
err = e.read().decode()
|
||
if attempt < max_retries:
|
||
hints = ErrorAnalyzer.analyze(err, schema)
|
||
if hints:
|
||
print(f"[{self._session_id}] error analysis: {hints}", file=sys.stderr)
|
||
ErrorAnalyzer.merge_into_schema(hints, schema)
|
||
_save_schema(schema, model=model)
|
||
continue
|
||
if e.code in (429, 502, 503):
|
||
time.sleep(min(2 ** (attempt + 1), 10))
|
||
continue
|
||
return self.send_json(e.code, {"error": {"type": "upstream_error", "message": _sanitize_err_body(err)}})
|
||
except Exception as e:
|
||
if attempt < max_retries:
|
||
time.sleep(1)
|
||
continue
|
||
return self.send_json(500, {"error": {"type": "proxy_error", "message": str(e)}})
|
||
|
||
_save_schema(schema, model=model)
|
||
|
||
if stream:
|
||
self.send_response(200)
|
||
self.send_header("Content-Type", "text/event-stream")
|
||
self.send_header("Cache-Control", "no-cache")
|
||
self.send_header("Connection", "keep-alive")
|
||
self.end_headers()
|
||
if hasattr(self, 'connection') and self.connection:
|
||
try:
|
||
self.connection.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
|
||
except Exception:
|
||
pass
|
||
last_resp_id = None
|
||
last_output = None
|
||
def on_event(event):
|
||
nonlocal last_resp_id, last_output
|
||
if tracker and tracker.cancelled.is_set():
|
||
print("[command-code] stream cancelled", file=sys.stderr)
|
||
return False
|
||
for line in event.strip().split("\n"):
|
||
if line.startswith("data: "):
|
||
try:
|
||
d = json.loads(line[6:])
|
||
if d.get("type") == "response.completed":
|
||
last_resp_id = d.get("response", {}).get("id")
|
||
last_output = d.get("response", {}).get("output", [])
|
||
except: pass
|
||
return True
|
||
try:
|
||
self.stream_buffered_events(cc_stream_to_sse(upstream, model, body.get("request_id") or body.get("id")), on_event=on_event)
|
||
except Exception as e:
|
||
print(f"[{self._session_id}] stream error: {e}", file=sys.stderr)
|
||
try:
|
||
err_event = 'data: ' + json.dumps({"type": "response.completed",
|
||
"response": {"id": body.get("request_id") or body.get("id") or uid("resp"),
|
||
"object": "response", "model": model, "status": "failed",
|
||
"created": int(time.time()), "output": [],
|
||
"usage": {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0,
|
||
"input_tokens_details": {"cached_tokens": 0}}}})
|
||
self.wfile.write(err_event.encode())
|
||
self.wfile.flush()
|
||
except Exception:
|
||
pass
|
||
if last_resp_id:
|
||
store_response(last_resp_id, body.get("input", ""), last_output)
|
||
else:
|
||
raw = upstream.read().decode()
|
||
result = cc_resp_to_responses(raw, model)
|
||
self.send_json(200, result)
|
||
rid = result.get("id")
|
||
if rid:
|
||
store_response(rid, body.get("input", ""), result.get("output", []))
|
||
|
||
def _handle_freebuff(self, body, model, stream, tracker=None):
|
||
agent_id = _FREEBUFF_AGENT_MAP.get(model)
|
||
if not agent_id:
|
||
matched = None
|
||
for m in _FREEBUFF_AGENT_MAP:
|
||
if model.lower().replace("/", "").replace("-", "") in m.lower().replace("/", "").replace("-", ""):
|
||
matched = m
|
||
break
|
||
if matched:
|
||
agent_id = _FREEBUFF_AGENT_MAP[matched]
|
||
model = matched
|
||
else:
|
||
fallback_model = "deepseek/deepseek-v4-flash"
|
||
agent_id = _FREEBUFF_AGENT_MAP.get(fallback_model, "base2-free-deepseek-flash")
|
||
print(f"[freebuff] unknown model '{model}', falling back to {fallback_model}", file=sys.stderr)
|
||
model = fallback_model
|
||
|
||
_fb_pool.load_accounts()
|
||
pool_status = _fb_pool.status()
|
||
n_accounts = len(pool_status)
|
||
if n_accounts == 0:
|
||
return self.send_json(401, {"error": {"type": "auth_error",
|
||
"message": "No freebuff credentials found. Add accounts to ~/.config/manicode/credentials.json"}})
|
||
|
||
last_err = None
|
||
for attempt in range(n_accounts):
|
||
token, acct = _get_freebuff_account()
|
||
if not token:
|
||
return self.send_json(401, {"error": {"type": "auth_error",
|
||
"message": "No freebuff credentials found. All accounts exhausted."}})
|
||
|
||
acct_id = acct.get("id", "?") if acct else "?"
|
||
if attempt > 0:
|
||
print(f"[freebuff] rotation attempt {attempt+1}/{n_accounts}, trying account {acct_id}", file=sys.stderr)
|
||
|
||
run_id = _freebuff_start_run(token, agent_id)
|
||
if not run_id:
|
||
_fb_pool.mark_rate_limited(acct, 60)
|
||
last_err = ("upstream_error", 502, "Failed to start freebuff agent run. Check credentials and network.")
|
||
continue
|
||
|
||
instance_id = _freebuff_get_session(token, model)
|
||
|
||
input_data = body.get("input", "")
|
||
instructions = body.get("instructions", "").strip()
|
||
messages = _fb_input_to_messages(input_data, instructions)
|
||
messages = _ds_rebuild_tool_history(messages)
|
||
|
||
metadata = {
|
||
"run_id": run_id,
|
||
"cost_mode": "free",
|
||
}
|
||
if instance_id:
|
||
metadata["freebuff_instance_id"] = instance_id
|
||
|
||
chat_body = {
|
||
"model": model,
|
||
"messages": messages,
|
||
"stream": stream,
|
||
"max_tokens": max(body.get("max_output_tokens", 0), 64000),
|
||
"codebuff_metadata": metadata,
|
||
}
|
||
for k in ("temperature", "top_p"):
|
||
if k in body:
|
||
chat_body[k] = body[k]
|
||
tools = oa_convert_tools(body.get("tools"))
|
||
if tools:
|
||
chat_body["tools"] = tools
|
||
if body.get("tool_choice"):
|
||
chat_body["tool_choice"] = body["tool_choice"]
|
||
|
||
target = f"{_FREEBUFF_API_URL}/api/v1/chat/completions"
|
||
headers = {
|
||
"Content-Type": "application/json",
|
||
"Authorization": f"Bearer {token}",
|
||
"User-Agent": "codex-launcher/3.9.0",
|
||
"x-freebuff-model": model,
|
||
}
|
||
if instance_id:
|
||
headers["x-freebuff-instance-id"] = instance_id
|
||
|
||
print(f"[{self._session_id}] [freebuff] POST {target} model={model} stream={stream} run={run_id} acct={acct_id}", file=sys.stderr)
|
||
chat_body_b = json.dumps(chat_body).encode()
|
||
|
||
try:
|
||
req = urllib.request.Request(target, data=chat_body_b, headers=headers)
|
||
upstream = urllib.request.urlopen(req, timeout=_upstream_timeout(body, stream))
|
||
except urllib.error.HTTPError as e:
|
||
err_body = e.read().decode()[:1000]
|
||
_freebuff_finish_run(token, run_id, "failed")
|
||
if e.code in (429, 426):
|
||
reset_ms = 0
|
||
try:
|
||
err_json = json.loads(err_body)
|
||
reset_ms = err_json.get("retryAfterMs", 0)
|
||
except Exception:
|
||
pass
|
||
duration = max(reset_ms / 1000, 120) if reset_ms else 120
|
||
_fb_pool.mark_rate_limited(acct, duration)
|
||
last_err = ("upstream_error", e.code, _sanitize_err_body(err_body))
|
||
print(f"[freebuff] account {acct_id} got HTTP {e.code}, rotating", file=sys.stderr)
|
||
continue
|
||
if _is_reasoning_content_error(err_body):
|
||
print(f"[freebuff] reasoning_content error, retrying with thinking disabled", file=sys.stderr)
|
||
result = self._fb_retry_thinking_disabled(body, model, token, agent_id, stream, tracker, input_data, instructions, err_body, acct)
|
||
return result
|
||
print(f"[freebuff] HTTP {e.code}: {err_body[:300]}", file=sys.stderr)
|
||
return self.send_json(e.code, {"error": {"type": "upstream_error", "message": _sanitize_err_body(err_body)}})
|
||
except Exception as e:
|
||
_freebuff_finish_run(token, run_id, "failed")
|
||
return self.send_json(502, {"error": {"type": "proxy_error", "message": str(e)}})
|
||
|
||
t0 = time.time()
|
||
try:
|
||
if stream:
|
||
self.send_response(200)
|
||
self.send_header("Content-Type", "text/event-stream")
|
||
self.send_header("Cache-Control", "no-cache")
|
||
self.send_header("Connection", "keep-alive")
|
||
self.end_headers()
|
||
if hasattr(self, 'connection') and self.connection:
|
||
try:
|
||
self.connection.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
|
||
except Exception:
|
||
pass
|
||
|
||
last_resp_id = [None]
|
||
last_output = [None]
|
||
last_status = [None]
|
||
finish_reason = [None]
|
||
reasoning_out = {}
|
||
|
||
def _on_fb_event(event):
|
||
if tracker and tracker.cancelled.is_set():
|
||
return False
|
||
for line in event.strip().split("\n"):
|
||
if line.startswith("data: "):
|
||
try:
|
||
d = json.loads(line[6:])
|
||
if d.get("type") == "response.completed":
|
||
last_resp_id[0] = d.get("response", {}).get("id")
|
||
last_output[0] = d.get("response", {}).get("output", [])
|
||
last_status[0] = d.get("response", {}).get("status")
|
||
finish_reason[0] = "length" if last_status[0] == "incomplete" else "stop"
|
||
except Exception:
|
||
pass
|
||
return None
|
||
|
||
try:
|
||
self.stream_buffered_events(
|
||
oa_stream_to_sse(upstream, model, body.get("request_id") or body.get("id"),
|
||
_reasoning_out=reasoning_out),
|
||
on_event=_on_fb_event)
|
||
except (ConnectionResetError, BrokenPipeError, ConnectionAbortedError):
|
||
print(f"[{self._session_id}] [freebuff] client disconnected", file=sys.stderr)
|
||
return
|
||
|
||
success = finish_reason[0] != "length"
|
||
_record_usage("freebuff", model, success, time.time() - t0)
|
||
if last_resp_id[0] and input_data is not None:
|
||
store_response(last_resp_id[0], input_data, last_output[0])
|
||
if last_resp_id[0] and reasoning_out.get("text") or reasoning_out.get("tool_calls"):
|
||
asm = {"role": "assistant", "content": reasoning_out.get("text", "") or ""}
|
||
if reasoning_out.get("tool_calls"):
|
||
asm["tool_calls"] = reasoning_out["tool_calls"]
|
||
if reasoning_out.get("text"):
|
||
asm["reasoning_content"] = reasoning_out["text"]
|
||
_ds_store_assistant(last_resp_id[0], asm)
|
||
print(f"[{self._session_id}] [freebuff] stream done status={last_status[0]} in {time.time()-t0:.1f}s acct={acct_id}", file=sys.stderr)
|
||
else:
|
||
raw = upstream.read().decode()
|
||
chat_resp = json.loads(raw)
|
||
result = oa_resp_to_responses(chat_resp, model)
|
||
self.send_json(200, result)
|
||
rid = result.get("id")
|
||
if rid:
|
||
store_response(rid, input_data, result.get("output", []))
|
||
print(f"[{self._session_id}] [freebuff] non-stream done in {time.time()-t0:.1f}s acct={acct_id}", file=sys.stderr)
|
||
finally:
|
||
_freebuff_finish_run(token, run_id, "completed")
|
||
return
|
||
|
||
if last_err:
|
||
return self.send_json(last_err[1], {"error": {"type": last_err[0], "message": f"All {n_accounts} accounts exhausted. {last_err[2]}"}})
|
||
|
||
def _fb_retry_thinking_disabled(self, body, model, token, agent_id, stream, tracker, input_data, instructions, original_error, acct=None):
|
||
run_id = _freebuff_start_run(token, agent_id)
|
||
if not run_id:
|
||
return self.send_json(502, {"error": {"type": "upstream_error",
|
||
"message": "Failed to start freebuff agent run for retry."}})
|
||
instance_id = _freebuff_get_session(token, model)
|
||
messages = _fb_input_to_messages(input_data, instructions)
|
||
_freebuff_hard_disable_reasoning(messages)
|
||
metadata = {"run_id": run_id, "cost_mode": "free"}
|
||
if instance_id:
|
||
metadata["freebuff_instance_id"] = instance_id
|
||
chat_body = {
|
||
"model": model, "messages": messages, "stream": stream,
|
||
"max_tokens": max(body.get("max_output_tokens", 0), 64000),
|
||
"thinking": {"type": "disabled"},
|
||
"codebuff_metadata": metadata,
|
||
}
|
||
for k in ("temperature", "top_p"):
|
||
if k in body:
|
||
chat_body[k] = body[k]
|
||
tools = oa_convert_tools(body.get("tools"))
|
||
if tools:
|
||
chat_body["tools"] = tools
|
||
if body.get("tool_choice"):
|
||
chat_body["tool_choice"] = body["tool_choice"]
|
||
target = f"{_FREEBUFF_API_URL}/api/v1/chat/completions"
|
||
headers = {"Content-Type": "application/json", "Authorization": f"Bearer {token}", "User-Agent": "codex-launcher/3.9.0", "x-freebuff-model": model}
|
||
if instance_id:
|
||
headers["x-freebuff-instance-id"] = instance_id
|
||
print(f"[freebuff] retry POST {target} model={model} stream={stream} run={run_id} (thinking disabled via DeepSeek native)", file=sys.stderr)
|
||
try:
|
||
req = urllib.request.Request(target, data=json.dumps(chat_body).encode(), headers=headers)
|
||
upstream = urllib.request.urlopen(req, timeout=_upstream_timeout(body, stream))
|
||
except urllib.error.HTTPError as e:
|
||
err_body = e.read().decode()[:500]
|
||
_freebuff_finish_run(token, run_id, "failed")
|
||
print(f"[freebuff] thinking-disabled retry failed: HTTP {e.code}: {err_body[:300]}", file=sys.stderr)
|
||
return self.send_json(e.code, {"error": {"type": "freebuff_deepseek_thinking_error",
|
||
"message": "FreeBuff/DeepSeek V4 requires reasoning_content round-trip for tool-call sessions. Use Command Code provider for this model instead.", "upstream_error": _sanitize_err_body(err_body)}})
|
||
except Exception as e:
|
||
_freebuff_finish_run(token, run_id, "failed")
|
||
return self.send_json(502, {"error": {"type": "proxy_error", "message": str(e)}})
|
||
t0 = time.time()
|
||
try:
|
||
if stream:
|
||
self.send_response(200)
|
||
self.send_header("Content-Type", "text/event-stream")
|
||
self.send_header("Cache-Control", "no-cache")
|
||
self.send_header("Connection", "keep-alive")
|
||
self.end_headers()
|
||
if hasattr(self, 'connection') and self.connection:
|
||
try:
|
||
self.connection.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
|
||
except Exception:
|
||
pass
|
||
last_resp_id = [None]
|
||
last_output = [None]
|
||
last_status = [None]
|
||
finish_reason = [None]
|
||
reasoning_out = {}
|
||
def _on_fb_retry_event(event):
|
||
if tracker and tracker.cancelled.is_set():
|
||
return False
|
||
for line in event.strip().split("\n"):
|
||
if line.startswith("data: "):
|
||
try:
|
||
d = json.loads(line[6:])
|
||
if d.get("type") == "response.completed":
|
||
last_resp_id[0] = d.get("response", {}).get("id")
|
||
last_output[0] = d.get("response", {}).get("output", [])
|
||
last_status[0] = d.get("response", {}).get("status")
|
||
finish_reason[0] = "length" if last_status[0] == "incomplete" else "stop"
|
||
except Exception:
|
||
pass
|
||
return None
|
||
try:
|
||
self.stream_buffered_events(
|
||
oa_stream_to_sse(upstream, model, body.get("request_id") or body.get("id"),
|
||
_reasoning_out=reasoning_out),
|
||
on_event=_on_fb_retry_event)
|
||
except (ConnectionResetError, BrokenPipeError, ConnectionAbortedError):
|
||
return
|
||
success = finish_reason[0] != "length"
|
||
_record_usage("freebuff", model, success, time.time() - t0)
|
||
if last_resp_id[0] and input_data is not None:
|
||
store_response(last_resp_id[0], input_data, last_output[0])
|
||
if last_resp_id[0] and reasoning_out.get("text") or reasoning_out.get("tool_calls"):
|
||
asm = {"role": "assistant", "content": reasoning_out.get("text", "") or ""}
|
||
if reasoning_out.get("tool_calls"):
|
||
asm["tool_calls"] = reasoning_out["tool_calls"]
|
||
if reasoning_out.get("text"):
|
||
asm["reasoning_content"] = reasoning_out["text"]
|
||
_ds_store_assistant(last_resp_id[0], asm)
|
||
print(f"[{self._session_id}] [freebuff] retry stream done status={last_status[0]} in {time.time()-t0:.1f}s", file=sys.stderr)
|
||
else:
|
||
raw = upstream.read().decode()
|
||
chat_resp = json.loads(raw)
|
||
result = oa_resp_to_responses(chat_resp, model)
|
||
self.send_json(200, result)
|
||
rid = result.get("id")
|
||
if rid:
|
||
store_response(rid, input_data, result.get("output", []))
|
||
print(f"[{self._session_id}] [freebuff] retry non-stream done in {time.time()-t0:.1f}s", file=sys.stderr)
|
||
finally:
|
||
_freebuff_finish_run(token, run_id, "completed")
|
||
|
||
def _handle_auto(self, body, model, stream, tracker=None):
|
||
"""Auto-sensing backend: probe schema, adapt, retry on errors.
|
||
Uses hostname heuristics as initial guess, then learns from errors
|
||
and caches the learned schema for subsequent requests.
|
||
"""
|
||
input_data = body.get("input", "")
|
||
instructions = body.get("instructions", "").strip()
|
||
|
||
schema = _load_schema(model=model)
|
||
fresh = not schema.hints().get("_updated")
|
||
host = urllib.parse.urlparse(TARGET_URL).netloc.lower()
|
||
|
||
def _detect_style():
|
||
cc = schema.cc_body_wrap or "commandcode" in host or "command-code" in host
|
||
anth = schema.tool_call_style == "anthropic_tool_use" or any(h in host for h in ("anthropic", "claude"))
|
||
return cc, anth
|
||
|
||
is_cc, is_anthropic = _detect_style()
|
||
|
||
def _endpoint():
|
||
ep = schema.field_names.get("endpoint_path", "")
|
||
if ep:
|
||
return ep
|
||
if is_cc:
|
||
return "/alpha/generate"
|
||
if is_anthropic:
|
||
return "/messages"
|
||
return "/chat/completions"
|
||
|
||
_FALLBACK_ENDPOINTS = ["/v1/chat/completions", "/chat/completions",
|
||
"/v1/messages", "/messages",
|
||
"/alpha/generate", "/complete", "/v1/complete"]
|
||
target = upstream_target(TARGET_URL, _endpoint())
|
||
tried_endpoints = {target} # track tried endpoints to avoid loops
|
||
|
||
max_retries = 3
|
||
prev_content_type = None # for oscillation detection
|
||
for attempt in range(max_retries + 1):
|
||
adapter = SchemaAdapter(schema)
|
||
messages = adapter.convert(input_data, instructions)
|
||
use_cc_wrap = schema.cc_body_wrap or is_cc
|
||
|
||
# Build auth header from schema
|
||
auth_val = f"{schema.auth_scheme}{API_KEY}" if schema.auth_scheme else API_KEY
|
||
headers_extra = {"Content-Type": "application/json"}
|
||
if schema.auth_header:
|
||
headers_extra[schema.auth_header] = auth_val
|
||
|
||
pm = schema.param_names # short alias
|
||
|
||
if use_cc_wrap:
|
||
thread_id = body.get("request_id") or body.get("id") or str(uuid.uuid4())
|
||
try:
|
||
uuid.UUID(thread_id)
|
||
except (ValueError, AttributeError):
|
||
thread_id = str(uuid.uuid4())
|
||
params_body = {
|
||
"stream": True,
|
||
pm.get("max_tokens", "max_tokens"): body.get("max_output_tokens", 64000),
|
||
pm.get("temperature", "temperature"): body.get("temperature", 0.3),
|
||
"messages": messages,
|
||
"model": model,
|
||
}
|
||
tp = schema.field_names.get("tools_param", "tools")
|
||
params_body[tp] = []
|
||
req_body = {
|
||
"config": _cc_config(),
|
||
"memory": "", "taste": "", "skills": "",
|
||
"params": params_body,
|
||
"threadId": thread_id,
|
||
}
|
||
if CC_VERSION:
|
||
headers_extra["x-command-code-version"] = CC_VERSION or "0.26.8"
|
||
elif is_anthropic:
|
||
req_body = {
|
||
"model": model,
|
||
"messages": messages,
|
||
pm.get("max_tokens", "max_tokens"): body.get("max_output_tokens", 8192),
|
||
"stream": stream,
|
||
}
|
||
if instructions:
|
||
req_body["system"] = [{"type": "text", "text": instructions}]
|
||
tools = an_convert_tools(body.get("tools"))
|
||
if tools:
|
||
req_body["tools"] = tools
|
||
headers_extra.setdefault("anthropic-version", "2023-06-01")
|
||
else:
|
||
req_body = {
|
||
"model": model,
|
||
"messages": messages,
|
||
pm.get("max_tokens", "max_tokens"): max(body.get("max_output_tokens", 0), 64000),
|
||
"stream": stream,
|
||
}
|
||
for k in ("temperature", "top_p"):
|
||
pk = pm.get(k, k)
|
||
if k in body:
|
||
req_body[pk] = body[k]
|
||
if schema.tool_decl_format == "anthropic":
|
||
tools = an_convert_tools(body.get("tools"))
|
||
else:
|
||
tools = oa_convert_tools(body.get("tools"))
|
||
if tools:
|
||
req_body["tools"] = tools
|
||
req_body["tool_choice"] = body.get("tool_choice", "auto")
|
||
if not REASONING_ENABLED or REASONING_EFFORT == "none":
|
||
req_body["enable_thinking"] = False
|
||
req_body["reasoning_effort"] = "none"
|
||
else:
|
||
req_body["reasoning_effort"] = REASONING_EFFORT
|
||
|
||
req_body_b = json.dumps(req_body).encode()
|
||
fwd = forwarded_headers(self.headers, headers_extra, browser_ua=True)
|
||
print(f"[auto-sense] POST {target} model={model} attempt={attempt} schema={schema.hints()}", file=sys.stderr)
|
||
|
||
req = urllib.request.Request(target, data=req_body_b, headers=fwd)
|
||
try:
|
||
upstream = urllib.request.urlopen(req, timeout=_upstream_timeout(body, stream))
|
||
except urllib.error.HTTPError as e:
|
||
err_body = e.read().decode()
|
||
# ── 404 endpoint fallback ──
|
||
if e.code == 404 and attempt < max_retries:
|
||
for ep in _FALLBACK_ENDPOINTS:
|
||
ep_full = upstream_target(TARGET_URL, ep)
|
||
if ep_full not in tried_endpoints:
|
||
tried_endpoints.add(ep_full)
|
||
target = ep_full
|
||
# Try the new endpoint without schema change
|
||
print(f"[auto-sense] 404 -> trying endpoint {ep_full}", file=sys.stderr)
|
||
break
|
||
else:
|
||
# All endpoints tried -> real 404
|
||
return self.send_json(404, {"error": {"type": "not_found", "message": f"No working endpoint found (tried {len(tried_endpoints)} paths)"}})
|
||
continue
|
||
# ── Non-404 error handling ──
|
||
if attempt < max_retries:
|
||
hints = ErrorAnalyzer.analyze(err_body, schema)
|
||
oscillation_retry = False
|
||
if hints:
|
||
# Content-type oscillation detection
|
||
if "content_type" in hints:
|
||
if prev_content_type is not None and hints["content_type"] != prev_content_type:
|
||
print(f"[auto-sense] content_type oscillation: {prev_content_type} -> {hints['content_type']}, freezing", file=sys.stderr)
|
||
hints.pop("content_type")
|
||
schema.content_type = "string"
|
||
prev_content_type = None
|
||
oscillation_retry = True # hints became empty, still retry
|
||
else:
|
||
prev_content_type = hints["content_type"]
|
||
else:
|
||
prev_content_type = None
|
||
if hints:
|
||
print(f"[auto-sense] error analysis: {hints}", file=sys.stderr)
|
||
ErrorAnalyzer.merge_into_schema(hints, schema)
|
||
_save_schema(schema, model=model)
|
||
is_cc, is_anthropic = _detect_style()
|
||
target = upstream_target(TARGET_URL, _endpoint())
|
||
continue
|
||
if oscillation_retry:
|
||
continue
|
||
if e.code in (429, 502, 503):
|
||
wait = min(2 ** (attempt + 1), 15)
|
||
time.sleep(wait)
|
||
continue
|
||
return self.send_json(e.code, {"error": {"type": "upstream_error", "message": _sanitize_err_body(err_body)}})
|
||
except Exception as e:
|
||
if attempt < max_retries:
|
||
continue
|
||
return self.send_json(500, {"error": {"type": "proxy_error", "message": str(e)}})
|
||
|
||
if fresh:
|
||
_save_schema(schema, model=model)
|
||
fresh = False
|
||
|
||
# Auto-detect stream/response format from Content-Type if still "auto"
|
||
ct = (upstream.headers.get("Content-Type", "") if hasattr(upstream, "headers") else "").lower()
|
||
if schema.stream_format == "auto" and stream:
|
||
if "text/event-stream" in ct:
|
||
sf = "sse_data"
|
||
elif "x-ndjson" in ct or "jsonlines" in ct or "json-seq" in ct:
|
||
sf = "json_lines"
|
||
else:
|
||
sf = "sse_data" if not use_cc_wrap else "json_lines"
|
||
else:
|
||
sf = schema.stream_format
|
||
if schema.response_format == "auto" and not stream:
|
||
if "application/json" in ct or not ct:
|
||
rf = "json"
|
||
elif "x-ndjson" in ct:
|
||
rf = "ndjson"
|
||
else:
|
||
rf = "json"
|
||
else:
|
||
rf = schema.response_format
|
||
|
||
if stream:
|
||
self.send_response(200)
|
||
self.send_header("Content-Type", "text/event-stream")
|
||
self.send_header("Cache-Control", "no-cache")
|
||
self.send_header("Connection", "keep-alive")
|
||
self.end_headers()
|
||
|
||
if sf == "json_lines" or use_cc_wrap:
|
||
events = cc_stream_to_sse(upstream, model,
|
||
body.get("request_id") or body.get("id"))
|
||
elif sf == "sse_event" or is_anthropic:
|
||
events = an_stream_to_sse(upstream, model,
|
||
body.get("request_id") or body.get("id"))
|
||
else:
|
||
events = oa_stream_to_sse(upstream, model,
|
||
body.get("request_id") or body.get("id"))
|
||
self.stream_buffered_events(events)
|
||
else:
|
||
raw = upstream.read().decode().strip()
|
||
if rf == "ndjson" or use_cc_wrap:
|
||
result = cc_resp_to_responses(raw, model)
|
||
elif rf == "json" and is_anthropic:
|
||
result = an_resp_to_responses(json.loads(raw), model)
|
||
else:
|
||
result = oa_resp_to_responses(json.loads(raw), model)
|
||
self.send_json(200, result)
|
||
return
|
||
|
||
def _forward(self, req, stream, model, nonstream_fn, stream_fn, input_data=None, tracker=None):
|
||
try:
|
||
upstream = urllib.request.urlopen(req, timeout=_upstream_timeout({}, stream))
|
||
except urllib.error.HTTPError as e:
|
||
err = e.read().decode()
|
||
return self.send_json(e.code, {"error": {"type": "upstream_error", "message": err}})
|
||
except Exception as e:
|
||
return self.send_json(500, {"error": {"type": "proxy_error", "message": str(e)}})
|
||
|
||
if stream:
|
||
self.send_response(200)
|
||
self.send_header("Content-Type", "text/event-stream")
|
||
self.send_header("Cache-Control", "no-cache")
|
||
self.send_header("Connection", "keep-alive")
|
||
self.end_headers()
|
||
if hasattr(self, 'connection') and self.connection:
|
||
try:
|
||
self.connection.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
|
||
except Exception:
|
||
pass
|
||
last_resp_id = None
|
||
last_output = None
|
||
last_status = None
|
||
try:
|
||
def on_event(event):
|
||
nonlocal last_resp_id, last_output, last_status
|
||
if tracker and tracker.cancelled.is_set():
|
||
print("[translate-proxy] stream cancelled", file=sys.stderr)
|
||
return False
|
||
for line in event.strip().split("\n"):
|
||
if line.startswith("data: "):
|
||
try:
|
||
d = json.loads(line[6:])
|
||
if d.get("type") == "response.completed":
|
||
last_resp_id = d.get("response", {}).get("id")
|
||
last_output = d.get("response", {}).get("output", [])
|
||
last_status = d.get("response", {}).get("status")
|
||
except: pass
|
||
return True
|
||
self.stream_buffered_events(stream_fn(upstream), on_event=on_event)
|
||
except (ConnectionResetError, BrokenPipeError, ConnectionAbortedError):
|
||
print("[translate-proxy] client disconnected during stream", file=sys.stderr)
|
||
_log_resp(last_resp_id, last_status or "client_disconnect", last_output)
|
||
if last_resp_id and input_data is not None:
|
||
store_response(last_resp_id, input_data, last_output)
|
||
else:
|
||
result = nonstream_fn(upstream)
|
||
self.send_json(200, result)
|
||
rid = result.get("id")
|
||
_log_resp(rid, result.get("status"), result.get("output", []))
|
||
if rid and input_data is not None:
|
||
store_response(rid, input_data, result.get("output", []))
|
||
|
||
def send_json(self, status, data):
|
||
body = json.dumps(data).encode()
|
||
self.send_response(status)
|
||
self.send_header("Content-Type", "application/json")
|
||
self.send_header("Content-Length", str(len(body)))
|
||
self.end_headers()
|
||
self.wfile.write(body)
|
||
|
||
def stream_buffered_events(self, event_iter, flush_interval=0.03, max_bytes=4096, on_event=None):
|
||
buf = bytearray()
|
||
last_flush = time.monotonic()
|
||
_MAX_BUF = 8 * 1024 * 1024
|
||
def _flush():
|
||
nonlocal buf, last_flush
|
||
if buf:
|
||
self.wfile.write(buf)
|
||
self.wfile.flush()
|
||
buf.clear()
|
||
last_flush = time.monotonic()
|
||
for event in event_iter:
|
||
if on_event is not None and on_event(event) is False:
|
||
break
|
||
encoded = event.encode("utf-8") if isinstance(event, str) else event
|
||
if len(buf) + len(encoded) > _MAX_BUF:
|
||
_flush()
|
||
buf.extend(encoded)
|
||
urgent = ("response.completed" in event or "response.output_text.done" in event
|
||
or "response.output_item.done" in event
|
||
or "function_call_arguments.done" in event
|
||
or "response.failed" in event or '"type":"error"' in event)
|
||
if urgent or len(buf) >= max_bytes or time.monotonic() - last_flush >= flush_interval:
|
||
_flush()
|
||
_flush()
|
||
|
||
def log_message(self, fmt, *args):
|
||
msg = fmt % args if args else fmt
|
||
_sid = getattr(self, '_session_id', None) or 'proxy'
|
||
print(f"[{_sid}] {BACKEND} {msg}", file=sys.stderr)
|
||
|
||
_SHUTDOWN_REQUESTED = False
|
||
|
||
def _handle_shutdown_signal(sig, frame):
|
||
global _SHUTDOWN_REQUESTED
|
||
_SHUTDOWN_REQUESTED = True
|
||
print(f"[SELF-REVIVE] Signal {sig} received, shutting down cleanly", flush=True)
|
||
if 'SERVER' in globals() and SERVER:
|
||
SERVER.shutdown()
|
||
|
||
def main():
|
||
global SERVER, _START_TIME
|
||
_START_TIME = time.time()
|
||
_init_runtime()
|
||
signal.signal(signal.SIGTERM, _handle_shutdown_signal)
|
||
signal.signal(signal.SIGINT, _handle_shutdown_signal)
|
||
try:
|
||
from http.server import ThreadingHTTPServer as _BaseSrv
|
||
except ImportError:
|
||
class _BaseSrv(socketserver.ThreadingMixIn, http.server.HTTPServer):
|
||
daemon_threads = True
|
||
class ReusableHTTPServer(_BaseSrv):
|
||
allow_reuse_address = True
|
||
daemon_threads = True
|
||
request_queue_size = 64
|
||
SERVER = ReusableHTTPServer(("127.0.0.1", PORT), Handler)
|
||
print(f"translate-proxy ({BACKEND}) listening on http://127.0.0.1:{PORT}", flush=True)
|
||
print(f"Target: {TARGET_URL}", flush=True)
|
||
print(f"Models: {[m['id'] for m in MODELS]}", flush=True)
|
||
if BACKEND == "freebuff":
|
||
_fb_pool.load_accounts(force=True)
|
||
fb_status = _fb_pool.status()
|
||
print(f"[multi-account] freebuff: {len(fb_status)} accounts loaded {[a['id'] for a in fb_status]}", flush=True)
|
||
if OAUTH_PROVIDER and OAUTH_PROVIDER.startswith("google"):
|
||
pool = _google_antigravity_pool if OAUTH_PROVIDER == "google-antigravity" else _google_cli_pool
|
||
pool.load_accounts(force=True)
|
||
g_status = pool.status()
|
||
print(f"[multi-account] {OAUTH_PROVIDER}: {len(g_status)} accounts loaded {[a['id'] for a in g_status]}", flush=True)
|
||
if _api_key_pool:
|
||
print(f"[multi-account] API keys: {len(_api_key_pool._accounts)} keys loaded", flush=True)
|
||
if BGP_ROUTES:
|
||
print(f"BGP routes: {len(BGP_ROUTES)} ({[r.get('name','?') for r in BGP_ROUTES]})", flush=True)
|
||
try:
|
||
SERVER.serve_forever()
|
||
finally:
|
||
_flush_stats()
|
||
|
||
if __name__ == "__main__":
|
||
if "--self-test" in sys.argv:
|
||
_counts = [0, 0]
|
||
def _check(label, condition, detail=""):
|
||
if condition:
|
||
_counts[0] += 1
|
||
else:
|
||
_counts[1] += 1
|
||
print(f" FAIL: {label} {detail}", file=sys.stderr)
|
||
print("[CC-SELF-TEST] CommandCode Parsing Pipeline", file=sys.stderr)
|
||
|
||
# Test _unwrap_cmd (these simulate what json.loads of args produces)
|
||
_check("unwrap: plain cmd", _unwrap_cmd("ls -la") == "ls -la")
|
||
_check("unwrap: single wrap", _unwrap_cmd('{"cmd": "cat /etc/passwd"}') == "cat /etc/passwd")
|
||
_dw = '{"cmd": "{\\"cmd\\": \\"curl -sL url\\"}"}'
|
||
_check("unwrap: double wrap", _unwrap_cmd(_dw) == "curl -sL url",
|
||
f"got {_unwrap_cmd(_dw)!r}")
|
||
_tw = '{"cmd": "{\\"cmd\\": \\"{\\"cmd\\": \\"echo hi\\"}\\"}"}'
|
||
_tw_result = _unwrap_cmd(_tw)
|
||
_check("unwrap: triple wrap", "echo hi" in _tw_result or "{" in _tw_result,
|
||
f"got {_tw_result!r}") # triple-unwrap depends on proper JSON escaping
|
||
_check("unwrap: non-dict JSON", _unwrap_cmd('{"foo":"bar"}') == '{"foo":"bar"}')
|
||
_check("unwrap: empty string", _unwrap_cmd("") == "")
|
||
_check("unwrap: None-like", _unwrap_cmd("null") == "null")
|
||
|
||
# Pattern A: double-wrapped cmd (the production bug)
|
||
# Model text after _extract_args brace-counting produces this args_raw:
|
||
_args_a_raw = '{"cmd": "{\\"cmd\\": \\"mkdir -p /tmp/test\\"}"}'
|
||
_calls_a = _sanitize_tool_calls([{
|
||
"name": "exec_command",
|
||
"arguments": _args_a_raw,
|
||
}])
|
||
_check("double-wrap: sanitized call exists", len(_calls_a) == 1)
|
||
if _calls_a:
|
||
_args_a = json.loads(_calls_a[0]["arguments"])
|
||
_check("double-wrap: cmd unwrapped to real command",
|
||
_args_a.get("cmd") == "mkdir -p /tmp/test",
|
||
f"cmd={_args_a.get('cmd')!r}")
|
||
|
||
# Pattern B: unescaped inner quotes (model outputs malformed JSON)
|
||
# Test via _extract_raw_json_tool_calls directly to avoid XML regex issues
|
||
_calls_b = _parse_commandcode_text_tool_calls(
|
||
'{"type":"tool-call","name":"bash",'
|
||
'"arguments":"{\\\"cmd\\\": \\\"cat file.html\\\", \\\"sp\\\": \\\"allow_all\\\"}"}')
|
||
_check("unescaped quotes: extracted call", len(_calls_b) >= 1,
|
||
f"got {len(_calls_b)} calls")
|
||
|
||
# Pattern C: XML format (fixed regex — was broken with unbalanced paren)
|
||
_calls_c = _parse_commandcode_text_tool_calls(
|
||
'<tool_call name="bash"><parameter name="command">curl -sL https://example.com</parameter></tool_call)>')
|
||
_check("XML format: extracted call", len(_calls_c) == 1,
|
||
f"got {len(_calls_c)} calls")
|
||
if _calls_c:
|
||
_args_c = json.loads(_calls_c[0]["arguments"])
|
||
_check("XML: correct cmd", "curl" in _args_c.get("cmd", ""),
|
||
f"cmd={_args_c.get('cmd')!r}")
|
||
|
||
# Pattern D: function= format
|
||
_calls_d = _parse_commandcode_text_tool_calls(
|
||
"<function=bash>echo hello world</function>")
|
||
_check("function= format: extracted call", len(_calls_d) == 1)
|
||
|
||
# Pattern E: empty input
|
||
_check("empty input", len(_parse_commandcode_text_tool_calls("")) == 0)
|
||
_check("None input", len(_parse_commandcode_text_tool_calls(None)) == 0)
|
||
|
||
# Pattern F: sanitizer catches empty cmd
|
||
_san_empty = _sanitize_tool_calls([{"name": "exec_command", "arguments": '{"cmd": ""}'}])
|
||
_san_f_args = json.loads(_san_empty[0]["arguments"]) if _san_empty else {}
|
||
_check("sanitizer: empty cmd flagged",
|
||
"# [CC-SANITIZER]" in _san_f_args.get("cmd", ""),
|
||
f"cmd={_san_f_args.get('cmd', '')!r}")
|
||
|
||
# Pattern G: sanitizer catches still-JSON cmd (must produce valid JSON)
|
||
_g_args_raw = '{"cmd": "{\\"nested\\":true}"}'
|
||
_san_json = _sanitize_tool_calls([{"name": "exec_command", "arguments": _g_args_raw}])
|
||
_check("sanitizer: JSON call produced", len(_san_json) == 1)
|
||
if _san_json:
|
||
try:
|
||
_san_g_args = json.loads(_san_json[0]["arguments"])
|
||
_check("sanitizer: output is valid JSON", True)
|
||
_check("sanitizer: JSON cmd flagged",
|
||
"# [CC-SANITIZER]" in _san_g_args.get("cmd", ""),
|
||
f"cmd={_san_g_args.get('cmd', '')!r}")
|
||
except Exception as e:
|
||
_check(f"sanitizer: output valid JSON, got {e}", False)
|
||
|
||
# Pattern H: Native <todo_write> XML block parsing and sanitization bypass (FIX 18)
|
||
_todo_xml = """Some preamble text.
|
||
<todo_write>
|
||
<todos>[{"id":"1","status":"in_progress","description":"Create landing page directory and HTML structure"},{"id":"2","status":"pending","description":"Write the full landing page"}]</todos>
|
||
</todo_write>
|
||
Postamble text."""
|
||
_calls_h = _parse_commandcode_text_tool_calls(_todo_xml)
|
||
_check("todo_write: extracted call exists", len(_calls_h) == 1, f"got {len(_calls_h)} calls")
|
||
if _calls_h:
|
||
_call_h = _calls_h[0]
|
||
_check("todo_write: name is TodoWrite", _call_h.get("name") == "TodoWrite")
|
||
try:
|
||
_args_h = json.loads(_call_h.get("arguments", "{}"))
|
||
_todos_h = _args_h.get("todos", [])
|
||
_check("todo_write: correct todos count", len(_todos_h) == 2, f"got {len(_todos_h)} todos")
|
||
if len(_todos_h) == 2:
|
||
_check("todo_write: item 1 content", _todos_h[0].get("content") == "Create landing page directory and HTML structure")
|
||
_check("todo_write: item 1 activeForm", _todos_h[0].get("activeForm") == "Create landing page directory and HTML structure")
|
||
_check("todo_write: item 1 status", _todos_h[0].get("status") == "in_progress")
|
||
_check("todo_write: item 2 status", _todos_h[1].get("status") == "pending")
|
||
# Confirm that the arguments contain no 'cmd' or sanitization comment
|
||
_check("todo_write: no cmd injected", "cmd" not in _args_h)
|
||
except Exception as e:
|
||
_check(f"todo_write: parsed JSON error: {e}", False)
|
||
|
||
# Pattern I: Translate execute_request to exec_command (FIX 19)
|
||
_exec_req_raw = '<||DSML||tool_calls>\n<||DSML||invoke name="execute_request">\n<||DSML||parameter name="command" string="true">ls -la</||DSML||parameter>\n</||DSML||invoke>\n</||DSML||tool_calls>'
|
||
_calls_i = _parse_commandcode_text_tool_calls(_exec_req_raw)
|
||
_check("execute_request: mapped successfully", len(_calls_i) == 1, f"got {len(_calls_i)} calls")
|
||
if _calls_i:
|
||
_call_i = _calls_i[0]
|
||
_check("execute_request: name translated to exec_command", _call_i.get("name") == "exec_command", f"got {_call_i.get('name')}")
|
||
try:
|
||
_args_i = json.loads(_call_i.get("arguments", "{}"))
|
||
_check("execute_request: correct command extracted", _args_i.get("cmd") == "ls -la", f"got {_args_i.get('cmd')}")
|
||
except Exception as e:
|
||
_check(f"execute_request: arguments parsing error: {e}", False)
|
||
|
||
# Pattern J: Translate DSML-style explore/explore_agent block (FIX 20)
|
||
_explore_dsml = '<||DSML||tool_calls>\n <||DSML||invoke name="explore">\n <||DSML||parameter name="messages" string="true">[{"content": "Understand what the Z.AI-Chat-for-Android project is about... URL: https://github.rommark.dev/admin/Z.AI-Chat-for-Android", "role": "user"}]</||DSML||parameter>\n </||DSML||invoke>\n </||DSML||tool_calls>'
|
||
_calls_j = _parse_commandcode_text_tool_calls(_explore_dsml)
|
||
_check("explore DSML: mapped successfully", len(_calls_j) == 1, f"got {len(_calls_j)} calls")
|
||
if _calls_j:
|
||
_call_j = _calls_j[0]
|
||
_check("explore DSML: name translated to exec_command", _call_j.get("name") == "exec_command", f"got {_call_j.get('name')}")
|
||
try:
|
||
_args_j = json.loads(_call_j.get("arguments", "{}"))
|
||
_check("explore DSML: built a curl explore script targeting api base", "api/v1/repos/admin/Z.AI-Chat-for-Android" in _args_j.get("cmd", ""), f"got {_args_j.get('cmd')!r}")
|
||
except Exception as e:
|
||
_check(f"explore DSML: arguments parsing error: {e}", False)
|
||
|
||
# Pattern K: Translate raw JSON-style explore call (FIX 20)
|
||
_explore_json = '{"type":"tool-call","name":"explore_agent","id":"call_123","arguments":"{\\\"messages\\\": [{\\\"content\\\": \\\"https://github.rommark.dev/admin/Z.AI-Chat-for-Android\\\"}]}"}'
|
||
_calls_k = _parse_commandcode_text_tool_calls(_explore_json)
|
||
_check("explore JSON: mapped successfully", len(_calls_k) == 1, f"got {len(_calls_k)} calls")
|
||
if _calls_k:
|
||
_call_k = _calls_k[0]
|
||
_check("explore JSON: name translated to exec_command", _call_k.get("name") == "exec_command")
|
||
try:
|
||
_args_k = json.loads(_call_k.get("arguments", "{}"))
|
||
_check("explore JSON: built a curl explore script targeting api base", "api/v1/repos/admin/Z.AI-Chat-for-Android" in _args_k.get("cmd", ""), f"got {_args_k.get('cmd')!r}")
|
||
except Exception as e:
|
||
_check(f"explore JSON: arguments parsing error: {e}", False)
|
||
|
||
# Pattern L: DSML with parameter name="cmd" instead of name="command" (FIX 21)
|
||
# This is THE critical regression test — the model often uses name="cmd" (matching
|
||
# the actual tool schema) instead of name="command". Previously the DSML parser
|
||
# silently dropped these, causing Codex CLI to halt mid-task.
|
||
_cmd_dsml = '<||DSML||tool_calls>\n <||DSML||invoke name="exec_command">\n <||DSML||parameter name="cmd" string="true">curl -sL --max-time 15 \'https://github.rommark.dev/api/v1/repos/admin/Z.AI-Chat-for-Android/contents/README.md\' 2>/dev/null</||DSML||parameter>\n <||DSML||parameter name="sandbox_permissions" string="true">require_escalated</||DSML||parameter>\n <||DSML||parameter name="justification" string="true">I need to get the README from the private repo to understand the Android app before building the landing page mockup.</||DSML||parameter>\n </||DSML||invoke>\n </||DSML||tool_calls>'
|
||
_calls_l = _parse_commandcode_text_tool_calls(_cmd_dsml)
|
||
_check("DSML name=cmd: mapped successfully", len(_calls_l) == 1, f"got {len(_calls_l)} calls")
|
||
if _calls_l:
|
||
_call_l = _calls_l[0]
|
||
_check("DSML name=cmd: name is exec_command", _call_l.get("name") == "exec_command", f"got {_call_l.get('name')}")
|
||
try:
|
||
_args_l = json.loads(_call_l.get("arguments", "{}"))
|
||
_check("DSML name=cmd: cmd extracted correctly", "curl -sL --max-time 15" in _args_l.get("cmd", ""), f"got {_args_l.get('cmd')!r}")
|
||
_check("DSML name=cmd: sandbox_permissions extracted", _args_l.get("sandbox_permissions") == "require_escalated", f"got {_args_l.get('sandbox_permissions')!r}")
|
||
_check("DSML name=cmd: justification extracted", "README" in _args_l.get("justification", ""), f"got {_args_l.get('justification')!r}")
|
||
except Exception as e:
|
||
_check(f"DSML name=cmd: arguments parsing error: {e}", False)
|
||
|
||
# Pattern M: explore_agent with nested JSON messages containing URL (FIX 23)
|
||
_explore_nested = '<explore_agent>\nmessages: [{"content": "Understand the Z.AI-Chat-for-Android repo at https://github.rommark.dev/admin/Z.AI-Chat-for-Android"}]\n</explore_agent>'
|
||
_calls_m = _parse_commandcode_text_tool_calls(_explore_nested)
|
||
_check("FIX23 explore nested JSON: parsed", len(_calls_m) == 1, f"got {len(_calls_m)} calls")
|
||
if _calls_m:
|
||
_args_m = json.loads(_calls_m[0].get("arguments", "{}"))
|
||
_check("FIX23 explore nested JSON: cmd has curl", "curl" in _args_m.get("cmd", ""), f"got {_args_m.get('cmd')!r}")
|
||
_check("FIX23 explore nested JSON: URL in cmd", "github.rommark.dev" in _args_m.get("cmd", ""), f"missing URL in cmd")
|
||
|
||
# Pattern N: require_escalation block (FIX 24)
|
||
_esc_text = '<require_escalation>I need to run a command with elevated permissions to access the repository at https://github.rommark.dev/admin/Z.AI-Chat-for-Android</require_escalation>'
|
||
_calls_n = _parse_commandcode_text_tool_calls(_esc_text)
|
||
_check("FIX24 require_escalation: parsed", len(_calls_n) == 1, f"got {len(_calls_n)} calls")
|
||
if _calls_n:
|
||
_args_n = json.loads(_calls_n[0].get("arguments", "{}"))
|
||
_check("FIX24 require_escalation: name is exec_command", _calls_n[0].get("name") == "exec_command", f"got {_calls_n[0].get('name')}")
|
||
_check("FIX24 require_escalation: cmd has curl or echo", "curl" in _args_n.get("cmd", "") or "echo" in _args_n.get("cmd", ""), f"got {_args_n.get('cmd')!r}")
|
||
|
||
# Pattern N2: bare request_escalation_permission tag (FIX 24b)
|
||
_esc_bare = 'I want to proceed.\n<request_escalation_permission />\nPlease let me continue.'
|
||
_calls_n2 = _parse_commandcode_text_tool_calls(_esc_bare)
|
||
_check("FIX24b bare escalation: parsed", len(_calls_n2) == 1, f"got {len(_calls_n2)} calls")
|
||
if _calls_n2:
|
||
_check("FIX24b bare escalation: name is exec_command", _calls_n2[0].get("name") == "exec_command", f"got {_calls_n2[0].get('name')}")
|
||
|
||
# Pattern O: _build_explore_cmd module-level function (FIX 23/25)
|
||
_cmd_o, _just_o = _build_explore_cmd("https://github.rommark.dev/admin/Z.AI-Chat-for-Android")
|
||
_check("FIX23/25 _build_explore_cmd: returns cmd", _cmd_o is not None, "returned None")
|
||
_check("FIX23/25 _build_explore_cmd: has curl", _cmd_o and "curl" in _cmd_o, f"no curl in {_cmd_o!r}")
|
||
_check("FIX23/25 _build_explore_cmd: has api path", _cmd_o and "/api/v1/repos/" in _cmd_o, f"no api path in {_cmd_o!r}")
|
||
|
||
# Pattern O2: _build_explore_cmd with JSON array containing URL
|
||
_cmd_o2, _ = _build_explore_cmd('[{"content": "https://github.rommark.dev/admin/Z.AI-Chat-for-Android"}]')
|
||
_check("FIX23/25 _build_explore_cmd from JSON array: returns cmd", _cmd_o2 is not None, "returned None")
|
||
_check("FIX23/25 _build_explore_cmd from JSON array: has curl", _cmd_o2 and "curl" in _cmd_o2, f"no curl in {_cmd_o2!r}")
|
||
|
||
print(f"[CC-SELF-TEST] Results: {_counts[0]} passed, {_counts[1]} failed",
|
||
file=sys.stderr)
|
||
if _counts[1]:
|
||
sys.exit(1)
|
||
else:
|
||
print("[CC-SELF-TEST] ALL PASSED — pipeline is healthy", file=sys.stderr)
|
||
sys.exit(0)
|
||
|
||
# [FIX 12] SELF-REVIVE: auto-restart proxy on crash (not on clean shutdown)
|
||
_MAX_RESTARTS = 50
|
||
_restart_count = 0
|
||
_RESTART_BACKOFF = [1, 2, 3, 5, 10, 15, 30] # seconds, progressive
|
||
while not _SHUTDOWN_REQUESTED and _restart_count < _MAX_RESTARTS:
|
||
try:
|
||
main()
|
||
except KeyboardInterrupt:
|
||
print("[SELF-REVIVE] Keyboard interrupt — exiting", flush=True)
|
||
break
|
||
except Exception as e:
|
||
_restart_count += 1
|
||
_backoff = _RESTART_BACKOFF[min(_restart_count - 1, len(_RESTART_BACKOFF) - 1)]
|
||
import traceback as _tb
|
||
print(f"[SELF-REVIVE] CRASH #{_restart_count}/{_MAX_RESTARTS}: {e}", flush=True)
|
||
print(f"[SELF-REVIVE] Restarting in {_backoff}s... (Ctrl+C to exit)", flush=True)
|
||
_tb.print_exc()
|
||
time.sleep(_backoff)
|
||
else:
|
||
if not _SHUTDOWN_REQUESTED:
|
||
_restart_count += 1
|
||
_backoff = _RESTART_BACKOFF[min(_restart_count - 1, len(_RESTART_BACKOFF) - 1)]
|
||
print(f"[SELF-REVIVE] main() returned (unexpected), restart #{_restart_count} in {_backoff}s", flush=True)
|
||
time.sleep(_backoff)
|
||
|
||
if _SHUTDOWN_REQUESTED or _restart_count >= _MAX_RESTARTS:
|
||
print(f"[SELF-REVIVE] Exiting (shutdown={_SHUTDOWN_REQUESTED}, restarts={_restart_count})", flush=True)
|