v3.3.0: Antigravity OAuth + Gemini CLI OAuth, full Codex agent loop with tool calls, history hardening, SSE fixes

This commit is contained in:
Roman
2026-05-20 21:44:33 +04:00
Unverified
parent b060706e18
commit e2f20810f0
6 changed files with 1085 additions and 87 deletions

View File

@@ -1,5 +1,36 @@
# Changelog
## v3.3.0 (2026-05-20)
**Antigravity + Gemini CLI OAuth — full Codex agent loop working**
### Gemini CLI OAuth + Antigravity OAuth
- Split Google OAuth into separate Gemini CLI OAuth and Google Antigravity OAuth presets/backends.
- Gemini CLI OAuth uses the Gemini CLI public OAuth client and Code Assist endpoints.
- Antigravity OAuth uses Antigravity OAuth credentials, Code Assist daily/autopush/prod fallback, and Antigravity-style request wrapping.
- Added Antigravity version discovery from the updater/changelog with local caching.
- Added Antigravity model alias mapping from UI-facing `antigravity-*` IDs to upstream Code Assist model IDs.
### Responses API + Tool Flow
- Added Gemini-style history hardening for Google OAuth requests: removes empty turns, coalesces adjacent roles, drops duplicate user repeats, and enforces user-start/user-end history.
- Preserves function-call IDs across turns and adds synthetic `thoughtSignature` for historical Gemini function calls, matching Gemini CLI hardening behavior.
- Fixed Antigravity streaming Responses API compatibility: single assistant message item, text done events, content part done, output item done, final completed event, and connection close.
- Added `response.function_call_arguments.delta` and `response.function_call_arguments.done` events so Codex can execute Antigravity tool calls and create files.
- Fixed functionResponse name matching — uses the original functionCall name instead of falling back to call_id.
- Strengthened Antigravity prompt policy: use tools immediately for file changes, avoid planning-only responses, and answer directly when no suitable tool exists.
### Reliability + Routing
- Added BGP++ route scoring, route cooldowns, token buckets, and persisted route stats.
- Added provider policy layer and adaptive context compaction.
- Added tool-call pairing validation/repair for orphaned tool outputs.
- Added Endpoint Doctor in the endpoint editor.
- Added log redaction helper for common API key/token patterns.
## v3.1.0 (2026-05-20)
- Initial Antigravity/Gemini CLI OAuth backend split.
- Gemini-style history hardening, SSE streaming fixes.
## v3.0.0 (2026-05-20)
**Major architectural overhaul — Phase 0 + Phase 1 of engineering roadmap**

View File

@@ -15,7 +15,7 @@
<p align="center">
<strong>Run OpenAI Codex CLI &amp; Desktop with <em>any</em> AI provider.</strong><br/>
OpenCode &bull; Z.AI &bull; Anthropic &bull; Command Code &bull; OpenRouter &bull; Crof.ai &bull; NVIDIA NIM &bull; Kilo.ai &bull; and more
Google Antigravity &bull; Gemini CLI &bull; OpenCode &bull; Z.AI &bull; Anthropic &bull; Command Code &bull; OpenRouter &bull; Crof.ai &bull; NVIDIA NIM &bull; Kilo.ai &bull; and more
</p>
<p align="center">
@@ -43,14 +43,16 @@ OpenAI's Codex CLI v2.0+ exclusively uses the **Responses API** — a protocol t
| Provider | API | Works with Codex? |
|----------|-----|:-:|
| OpenAI | Responses API | ✅ |
| Z.AI | Chat Completions | |
| OpenCode | Chat Completions | |
| Anthropic | Messages API | |
| Command Code | Custom `/alpha/generate` | |
| Ollama | Chat Completions | |
| OpenRouter | Chat Completions | |
| NVIDIA NIM | Chat Completions | |
| Crof.ai | Chat Completions | |
| Google Antigravity (OAuth) | Code Assist / Gemini Native | |
| Gemini CLI OAuth | Code Assist | |
| Z.AI | Chat Completions | |
| OpenCode | Chat Completions | |
| Anthropic | Messages API | |
| Command Code | Custom `/alpha/generate` | |
| Ollama | Chat Completions | |
| OpenRouter | Chat Completions | |
| NVIDIA NIM | Chat Completions | ✅ |
| Crof.ai | Chat Completions | ✅ |
The protocols differ in **endpoint paths**, **message formats**, **tool-call structures**, **streaming events**, and **completion semantics**. You can't just swap a base URL.

Binary file not shown.

Binary file not shown.

View File

@@ -4,8 +4,9 @@
import gi
gi.require_version("Gtk", "3.0")
from gi.repository import Gtk, GLib
import subprocess, os, signal, sys, threading, time, json, urllib.request, tempfile, shutil
import subprocess, os, signal, sys, threading, time, json, urllib.request, urllib.parse, tempfile, shutil
import hashlib, socket, contextlib
import base64, secrets
from pathlib import Path
HOME = Path.home()
@@ -25,6 +26,21 @@ model_catalog_json = ""
"""
CHANGELOG = [
("3.3.0", "2026-05-20", [
"Added Google Antigravity OAuth backend with Code Assist endpoints and model alias mapping",
"Added Gemini CLI OAuth backend using public Gemini CLI OAuth client",
"Antigravity now creates files via tool calls — full Codex agent loop with Gemini-style history hardening",
"Fixed tool-call streaming: function_call_arguments delta/done events, thought signatures, functionResponse name matching",
"Added Endpoint Doctor, adaptive BGP scoring, provider policies, adaptive compaction, log redaction",
]),
("3.1.0", "2026-05-20", [
"Initial Antigravity/Gemini CLI OAuth split, history hardening, SSE fixes",
]),
("3.0.0", "2026-05-20", [
"ThreadingHTTPServer with dynamic proxy ports and health-gated Codex launch",
"Atomic config writes, safe cleanup registry, graceful shutdown, and buffered SSE streaming",
"Usage Dashboard v2, TCP_NODELAY streaming, Anthropic prompt caching, and batched usage stats",
]),
("2.6.1", "2026-05-20", [
"Google OAuth rebuilt to emulate Gemini CLI — no client_secret.json needed",
"Uses Google's public OAuth client_id (same as gemini-cli)",
@@ -226,13 +242,25 @@ PROVIDER_PRESETS = {
],
},
"Google Gemini (OAuth)": {
"backend_type": "openai-compat",
"base_url": "https://generativelanguage.googleapis.com/v1beta/openai",
"oauth_provider": "google",
"backend_type": "gemini-oauth-cli",
"base_url": "https://cloudcode-pa.googleapis.com",
"oauth_provider": "google-cli",
"models": [
"gemini-2.5-flash", "gemini-2.5-pro",
"gemini-2.0-flash", "gemini-2.0-flash-lite",
"gemini-2.5-flash-preview-native-audio-dialog",
],
},
"Google Antigravity (OAuth)": {
"backend_type": "gemini-oauth-antigravity",
"base_url": "https://daily-cloudcode-pa.sandbox.googleapis.com",
"oauth_provider": "google-antigravity",
"models": [
"antigravity-gemini-3-flash",
"antigravity-gemini-3-pro",
"antigravity-gemini-3.1-pro",
"antigravity-claude-sonnet-4-6",
"antigravity-claude-opus-4-6-thinking",
"gemini-2.5-flash", "gemini-2.5-pro",
"gemini-3-flash-preview", "gemini-3-pro-preview", "gemini-3.1-pro-preview",
],
},
"OpenAdapter": {
@@ -301,8 +329,10 @@ def apply_provider_preset(endpoint, preset_name):
updated["base_url"] = normalize_base_url(preset["base_url"])
if preset.get("cc_version") and not updated.get("cc_version"):
updated["cc_version"] = preset["cc_version"]
if not updated.get("models"):
if not updated.get("models") or (preset.get("backend_type") or "").startswith("gemini-oauth"):
updated["models"] = list(preset.get("models", []))
if preset.get("oauth_provider"):
updated["oauth_provider"] = preset["oauth_provider"]
if not updated.get("default_model") and updated.get("models"):
updated["default_model"] = updated["models"][0]
return updated
@@ -630,6 +660,18 @@ def _start_proxy_for(endpoint, logfn):
port = _pick_free_port()
_proxy_port = port
model_list = endpoint.get("models", [])
if (endpoint.get("backend_type") or "").startswith("gemini-oauth") and (endpoint.get("oauth_provider") or "").startswith("google"):
token_name = "google-antigravity-oauth-token.json" if endpoint.get("oauth_provider") == "google-antigravity" else "google-cli-oauth-token.json"
token_path = os.path.expanduser(f"~/.cache/codex-proxy/{token_name}")
try:
with open(token_path) as tf:
td = json.load(tf)
discovered = [] if endpoint.get("oauth_provider") == "google-antigravity" else td.get("available_models", [])
if discovered:
model_list = discovered
except Exception:
pass
pcfg = {
"port": port,
"backend_type": endpoint["backend_type"],
@@ -640,7 +682,7 @@ def _start_proxy_for(endpoint, logfn):
"reasoning_enabled": endpoint.get("reasoning_enabled", True),
"reasoning_effort": endpoint.get("reasoning_effort", "medium"),
"models": [{"id": m, "object": "model", "created": 1700000000, "owned_by": endpoint["name"]}
for m in endpoint.get("models", [])],
for m in model_list],
}
pcfg_path = PROXY_CONFIG_DIR / f"proxy-{safe_name(endpoint['name'])}-{port}.json"
pcfg_path.parent.mkdir(parents=True, exist_ok=True)
@@ -763,7 +805,7 @@ class LauncherWin(Gtk.Window):
# header row
hdr = Gtk.Box(spacing=8)
vbox.pack_start(hdr, False, False, 0)
lbl = Gtk.Label(label="<b>Codex Launcher v3.0.0</b>")
lbl = Gtk.Label(label="<b>Codex Launcher v3.3.0</b>")
lbl.set_use_markup(True)
hdr.pack_start(lbl, False, False, 0)
changelog_btn = Gtk.Button(label="Changelog")
@@ -1277,7 +1319,7 @@ class LauncherWin(Gtk.Window):
self.log("ERROR: no model selected")
return
is_bgp = name.startswith("🔀 ")
is_bgp = bool(name and name.startswith("🔀 "))
if is_bgp:
pool_name = name[2:]
pool = None
@@ -1781,6 +1823,8 @@ class EditEndpointDialog(Gtk.Dialog):
for val, lab in [("openai-compat", "OpenAI-compatible (needs proxy)"),
("anthropic", "Anthropic (needs proxy)"),
("command-code", "Command Code (needs proxy)"),
("gemini-oauth-cli", "Gemini CLI OAuth (needs proxy)"),
("gemini-oauth-antigravity", "Antigravity OAuth (needs proxy)"),
("native", "Native OpenAI (no proxy)")]:
self._combo_type.append(val, lab)
bt = self._data.get("backend_type", "openai-compat")
@@ -1866,6 +1910,9 @@ class EditEndpointDialog(Gtk.Dialog):
self._fetch_models_btn = Gtk.Button(label="Fetch from API")
self._fetch_models_btn.connect("clicked", lambda b: self._fetch_models())
mbox.pack_start(self._fetch_models_btn, False, False, 0)
self._test_btn = Gtk.Button(label="Test Endpoint")
self._test_btn.connect("clicked", lambda b: self._diagnose_endpoint())
mbox.pack_start(self._test_btn, False, False, 0)
bulk_lbl = Gtk.Label(label="Bulk add models (one per line or comma-separated):", xalign=0)
area.pack_start(bulk_lbl, False, False, 2)
@@ -1970,29 +2017,52 @@ class EditEndpointDialog(Gtk.Dialog):
preset_name = self._combo_preset.get_active_text() or "Custom"
preset = PROVIDER_PRESETS.get(preset_name, {})
provider = preset.get("oauth_provider", "")
if provider == "google":
self._google_oauth_flow()
if (provider or "").startswith("google"):
self._google_oauth_flow(provider)
def _google_oauth_flow(self):
token_path = os.path.expanduser("~/.cache/codex-proxy/google-oauth-token.json")
def _google_oauth_flow(self, oauth_provider="google-cli"):
is_antigravity = oauth_provider == "google-antigravity"
token_path = os.path.expanduser("~/.cache/codex-proxy/google-antigravity-oauth-token.json" if is_antigravity else "~/.cache/codex-proxy/google-cli-oauth-token.json")
CLIENT_ID = "681255809395-oo8ft2oprdrnp9e3aqf6av3hmdib135j.apps.googleusercontent.com"
CLIENT_SECRET = "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w"
if is_antigravity:
CLIENT_ID = "1071006060591-tmhssin2h21lcre235vtolojh4g403ep.apps.googleusercontent.com"
CLIENT_SECRET = "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"
SCOPES = [
"https://www.googleapis.com/auth/cloud-platform",
"https://www.googleapis.com/auth/userinfo.email",
"https://www.googleapis.com/auth/userinfo.profile",
"https://www.googleapis.com/auth/cclog",
"https://www.googleapis.com/auth/experimentsandconfigs",
]
port = 51121
redirect_uri = f"http://localhost:{port}/oauth-callback"
callback_path = "/oauth-callback"
provider_kind = "antigravity"
else:
CLIENT_ID = "681255809395-oo8ft2oprdrnp9e3aqf6av3hmdib135j.apps.googleusercontent.com"
CLIENT_SECRET = "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"
SCOPES = [
"https://www.googleapis.com/auth/cloud-platform",
"https://www.googleapis.com/auth/generative-language.retriever",
"https://www.googleapis.com/auth/userinfo.email",
"https://www.googleapis.com/auth/userinfo.profile",
]
import http.server, hashlib, secrets, socket
port = 0
redirect_uri = None
callback_path = "/oauth2callback"
provider_kind = "cli"
import http.server
port = 8085
state = secrets.token_hex(32)
verifier = secrets.token_urlsafe(32)
challenge = hashlib.sha256(verifier.encode()).digest()
challenge_b64 = urllib.parse.quote_plus(__import__('base64').urlsafe_b64encode(challenge).rstrip(b'=').decode())
verifier = secrets.token_urlsafe(64)
challenge = base64.urlsafe_b64encode(hashlib.sha256(verifier.encode()).digest()).rstrip(b"=").decode()
if port == 0:
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
s.bind(("127.0.0.1", 0))
port = s.getsockname()[1]
redirect_uri = f"http://127.0.0.1:{port}/oauth2callback"
scope_str = " ".join(SCOPES)
auth_url = (
f"https://accounts.google.com/o/oauth2/v2/auth?"
@@ -2001,9 +2071,9 @@ class EditEndpointDialog(Gtk.Dialog):
f"&response_type=code"
f"&scope={urllib.parse.quote(scope_str)}"
f"&access_type=offline"
f"&prompt=consent"
f"&prompt=select_account%20consent"
f"&state={state}"
f"&code_challenge={challenge_b64}"
f"&code_challenge={challenge}"
f"&code_challenge_method=S256"
)
@@ -2043,6 +2113,14 @@ class EditEndpointDialog(Gtk.Dialog):
qs = urllib.parse.urlparse(self2.path).query
params = urllib.parse.parse_qs(qs)
received_state[0] = params.get("state", [None])[0]
with open("/tmp/codex-oauth-debug.log", "a") as _dbg:
_dbg.write(f"[{time.strftime('%H:%M:%S')}] GET {self2.path} state={received_state[0]} code={'code' in params}\n")
if self2.path.find(callback_path) == -1:
self2.send_response(302)
self2.send_header("Location", "https://developers.google.com/gemini-code-assist/auth_failure_gemini")
self2.end_headers()
error_holder[0] = "unexpected request"
return
if "code" in params:
if received_state[0] != state:
self2.send_response(400)
@@ -2061,43 +2139,39 @@ class EditEndpointDialog(Gtk.Dialog):
self2.send_response(302)
self2.send_header("Location", "https://developers.google.com/gemini-code-assist/auth_failure_gemini")
self2.end_headers()
def log_message(self2, *a): pass
def log_message(self2, fmt, *args):
with open("/tmp/codex-oauth-debug.log", "a") as _dbg:
_dbg.write(f"[{time.strftime('%H:%M:%S')}] {fmt % args}\n")
try:
server = http.server.HTTPServer(("127.0.0.1", port), OAuthHandler)
bind_host = "localhost" if is_antigravity else "127.0.0.1"
server = http.server.HTTPServer((bind_host, port), OAuthHandler)
except OSError:
self._oauth_status.set_text(f"Port {port} already in use — close other apps and retry.")
spinner.stop()
dlg.run(); dlg.destroy()
return
def _oauth_log(msg):
with open("/tmp/codex-oauth-debug.log", "a") as _f:
_f.write(f"[{time.strftime('%H:%M:%S')}] {msg}\n")
_oauth_log(f"Starting OAuth: port={port} redirect_uri={redirect_uri}")
def wait_for_code():
_oauth_log("wait_for_code thread started")
deadline = time.time() + 120
while code_holder[0] is None and error_holder[0] is None and time.time() < deadline:
server.handle_request()
server.server_close()
GLib.idle_add(self._google_oauth_complete_gemini, dlg, code_holder, error_holder,
CLIENT_ID, CLIENT_SECRET, redirect_uri, token_path, spinner, verifier)
threading.Thread(target=wait_for_code, daemon=True).start()
subprocess.Popen(["xdg-open", auth_url], stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
dlg.run()
dlg.destroy()
def _google_oauth_complete_gemini(self, dlg, code_holder, error_holder,
client_id, client_secret, redirect_uri, token_path, spinner, verifier):
spinner.stop()
if error_holder[0]:
self._oauth_status.set_markup(f'<span foreground="#e74c3c">Error: {error_holder[0]}</span>')
return
if not code_holder[0]:
self._oauth_status.set_text("No authorization code received.")
return
self._oauth_status.set_text("Exchanging code for token…")
_oauth_log(f"Server closed. code={'yes' if code_holder[0] else 'no'} error={'yes' if error_holder[0] else 'no'}")
if code_holder[0]:
try:
_oauth_log("Exchanging code for token...")
token_data = urllib.parse.urlencode({
"code": code_holder[0],
"client_id": client_id,
"client_secret": client_secret,
"client_id": CLIENT_ID,
"client_secret": CLIENT_SECRET,
"redirect_uri": redirect_uri,
"grant_type": "authorization_code",
"code_verifier": verifier,
@@ -2106,18 +2180,130 @@ class EditEndpointDialog(Gtk.Dialog):
headers={"Content-Type": "application/x-www-form-urlencoded"})
resp = urllib.request.urlopen(req, timeout=30)
tokens = json.loads(resp.read())
tokens["client_id"] = client_id
tokens["client_secret"] = client_secret
tokens["client_id"] = CLIENT_ID
tokens["client_secret"] = CLIENT_SECRET
tokens["provider_kind"] = provider_kind
tokens["expires_at"] = time.time() + tokens.get("expires_in", 3600)
os.makedirs(os.path.dirname(token_path), exist_ok=True)
with open(token_path, "w") as f:
json.dump(tokens, f, indent=2)
os.chmod(token_path, 0o600)
self._entry_key.set_text(tokens.get("access_token", ""))
_oauth_log(f"Token saved to {token_path}")
project_id = ""
try:
_oauth_log("Discovering project ID via loadCodeAssist...")
lr = urllib.request.Request(
"https://cloudcode-pa.googleapis.com/v1internal:loadCodeAssist",
data=json.dumps({}).encode(),
headers={
"Content-Type": "application/json",
"Authorization": f"Bearer {tokens['access_token']}",
"User-Agent": "google-api-nodejs-client/9.15.1",
})
lresp = urllib.request.urlopen(lr, timeout=15)
ldata = json.loads(lresp.read())
p = ldata.get("cloudaicompanionProject", "")
if isinstance(p, dict):
project_id = p.get("id", "")
elif isinstance(p, str):
project_id = p
_oauth_log(f"Project ID: {project_id or '(none)'}")
if project_id:
tokens["project_id"] = project_id
with open(token_path, "w") as f2:
json.dump(tokens, f2, indent=2)
os.chmod(token_path, 0o600)
except Exception as pe:
_oauth_log(f"loadCodeAssist failed (non-fatal): {pe}")
if is_antigravity:
found_models = [
"gemini-2.5-flash", "gemini-2.5-pro",
"gemini-3-flash-preview", "gemini-3-pro-preview", "gemini-3.1-pro-preview",
"gemini-3-pro-low", "gemini-3-pro-high",
"gemini-3.1-pro-low", "gemini-3.1-pro-high",
"gemini-3-flash-low", "gemini-3-flash-medium", "gemini-3-flash-high",
"claude-sonnet-4-6", "claude-opus-4-6-thinking",
"claude-opus-4-6-thinking-low", "claude-opus-4-6-thinking-medium", "claude-opus-4-6-thinking-high",
"gemini-claude-sonnet-4-6",
"gemini-claude-opus-4-6-thinking-low", "gemini-claude-opus-4-6-thinking-medium", "gemini-claude-opus-4-6-thinking-high",
"gemini-3-pro-image",
]
probe_candidates = [
"gemini-2.5-flash", "gemini-2.5-pro",
"gemini-3-flash-preview", "gemini-3-pro-preview", "gemini-3.1-pro-preview",
]
_oauth_log(f"Probing {len(probe_candidates)} model candidates...")
for mc in probe_candidates:
try:
pr = urllib.request.Request(
"https://daily-cloudcode-pa.sandbox.googleapis.com/v1internal:generateContent",
data=json.dumps({
"project": project_id,
"model": mc,
"request": {"contents": [{"role": "user", "parts": [{"text": "x"}]}],
"generationConfig": {"maxOutputTokens": 1}},
}).encode(),
headers={
"Content-Type": "application/json",
"Authorization": f"Bearer {tokens['access_token']}",
"User-Agent": "google-api-nodejs-client/9.15.1",
"Client-Metadata": "ideType=IDE_UNSPECIFIED,platform=PLATFORM_UNSPECIFIED,pluginType=GEMINI",
})
pr.get_method = lambda: "POST"
resp = urllib.request.urlopen(pr, timeout=10)
resp.read()
found_models.append(mc)
_oauth_log(f" {mc} → available")
except urllib.error.HTTPError as e:
if e.code == 429:
found_models.append(mc)
_oauth_log(f" {mc} → available (rate limited)")
else:
e.read()
_oauth_log(f" {mc} → HTTP {e.code}")
except Exception as e:
_oauth_log(f" {mc} → error: {e}")
else:
found_models = ["gemini-2.5-flash", "gemini-2.5-pro"]
if found_models:
tokens["available_models"] = found_models
with open(token_path, "w") as f3:
json.dump(tokens, f3, indent=2)
os.chmod(token_path, 0o600)
_oauth_log(f"Discovered {len(found_models)} models: {found_models}")
else:
_oauth_log("No models discovered (will use defaults)")
GLib.idle_add(self._oauth_success, dlg, tokens.get("access_token", ""), spinner)
return
except urllib.error.HTTPError as e:
body = e.read().decode(errors='replace')
_oauth_log(f"Token exchange HTTP {e.code}: {body}")
GLib.idle_add(self._oauth_failed, dlg, f"Token exchange failed ({e.code}): {body[:200]}", spinner)
return
except Exception as e:
_oauth_log(f"Token exchange FAILED: {e}")
GLib.idle_add(self._oauth_failed, dlg, f"Token exchange failed: {e}", spinner)
return
_oauth_log(f"OAuth failed: {error_holder[0] or 'timeout'}")
GLib.idle_add(self._oauth_failed, dlg,
error_holder[0] or "No authorization code received.", spinner)
threading.Thread(target=wait_for_code, daemon=True).start()
subprocess.Popen(["xdg-open", auth_url], stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
dlg.connect("response", lambda d, r: d.destroy())
dlg.run()
def _oauth_success(self, dlg, access_token, spinner):
spinner.stop()
self._entry_key.set_text(access_token)
self._oauth_status.set_markup('<span foreground="#27ae60" weight="bold">Authorization successful! Token saved.</span>')
dlg.set_title("Google OAuth — Success")
except Exception as e:
self._oauth_status.set_markup(f'<span foreground="#e74c3c">Token exchange failed: {e}</span>')
GLib.timeout_add(1500, lambda: dlg.response(Gtk.ResponseType.OK))
def _oauth_failed(self, dlg, msg, spinner):
spinner.stop()
self._oauth_status.set_markup(f'<span foreground="#e74c3c">{msg}</span>')
GLib.timeout_add(3000, lambda: dlg.response(Gtk.ResponseType.CANCEL))
def _remove_model(self, path):
current = self._combo_default.get_active_text()
@@ -2163,6 +2349,70 @@ class EditEndpointDialog(Gtk.Dialog):
return True, None
return False, err or "No models returned by endpoint"
def _diagnose_endpoint(self):
url = self._entry_url.get_text().strip()
key = self._entry_key.get_text().strip()
bt = self._combo_type.get_active_id() or "openai-compat"
model = self._combo_default.get_active_text() or ""
checks = []
def add(name, ok, detail=""):
checks.append((name, ok, detail))
parsed = urllib.parse.urlparse(url)
add("URL format", bool(parsed.scheme and parsed.netloc),
url if parsed.scheme else "Missing scheme (https://)")
try:
t0 = time.time()
ep = {"base_url": url, "api_key": key, "backend_type": bt}
ids, err = fetch_models_for_endpoint(ep)
lat = (time.time() - t0) * 1000
if ids:
add("Network reachable", True, f"{lat:.0f}ms")
add("Auth valid", True)
add("/models endpoint", True, f"{len(ids)} models in {lat:.0f}ms")
if model:
add("Selected model exists", model in ids,
model if model in ids else f"'{model}' not in {ids[:5]}...")
else:
add("Selected model", False, "No model selected")
elif err and ("401" in str(err) or "403" in str(err)):
add("Network reachable", True, f"{lat:.0f}ms")
add("Auth valid", False, str(err)[:100])
add("/models endpoint", False, "Auth failed")
else:
add("Network reachable", False, str(err or "no response")[:100])
except Exception as e:
add("Network", False, str(e)[:100])
dlg = Gtk.Dialog(title="Endpoint Doctor", parent=self, modal=True)
dlg.add_button("Close", Gtk.ResponseType.CLOSE)
dlg.set_default_size(420, 300)
area = dlg.get_content_area()
area.set_margin_start(12)
area.set_margin_end(12)
area.set_margin_top(12)
area.set_margin_bottom(12)
area.set_spacing(4)
for name, ok, detail in checks:
row = Gtk.Box(spacing=6)
icon = Gtk.Label()
icon.set_markup(f'<span foreground="{"#27ae60" if ok else "#e74c3c"}"'
f' weight="bold">{"\u2713" if ok else "\u2717"}</span>')
row.pack_start(icon, False, False, 0)
lbl = Gtk.Label()
lbl.set_markup(f'<b>{name}</b>')
row.pack_start(lbl, False, False, 0)
if detail:
det = Gtk.Label()
det.set_markup(f'<span foreground="#7f8c8d" size="small">{detail}</span>')
row.pack_end(det, False, False, 0)
area.pack_start(row, False, False, 0)
dlg.show_all()
dlg.run()
dlg.destroy()
def _on_response(self, dialog, response):
if response != Gtk.ResponseType.OK:
self.destroy()
@@ -2172,7 +2422,7 @@ class EditEndpointDialog(Gtk.Dialog):
if not name:
self._show_error("Name is required")
return
bt = self._combo_type.get_active_id()
bt = self._combo_type.get_active_id() or PROVIDER_PRESETS.get(self._combo_preset.get_active_text() or "", {}).get("backend_type") or "openai-compat"
url = self._entry_url.get_text().strip()
key = self._entry_key.get_text().strip()
models = [self._model_store[i][0] for i in range(len(self._model_store))]

View File

@@ -11,7 +11,7 @@ Usage:
python3 translate-proxy.py --backend openai-compat --target-url https://... --api-key sk-...
"""
import json, http.server, socketserver, urllib.request, urllib.parse, urllib.error
import json, http.server, socketserver, urllib.request, urllib.parse, urllib.error, re
import time, uuid, os, sys, argparse, threading, socket, collections, contextlib, signal
# ═══════════════════════════════════════════════════════════════════
@@ -107,9 +107,57 @@ _active_connections = 0
_active_connections_lock = threading.Lock()
_pool = uuid.uuid4().hex[:8]
_antigravity_version = "1.18.3"
_antigravity_version_checked = 0
_antigravity_version_lock = threading.Lock()
def _fetch_antigravity_version():
cache_path = os.path.join(os.path.expanduser("~"), ".cache", "codex-proxy", "antigravity-version.json")
try:
with open(cache_path) as f:
cached = json.load(f)
if cached.get("version") and cached.get("checked_at", 0) > time.time() - 6 * 3600:
return cached["version"]
except Exception:
pass
urls = [
("https://antigravity-auto-updater-974169037036.us-central1.run.app", None),
("https://antigravity.google/changelog", 5000),
]
for url, limit in urls:
try:
req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
resp = urllib.request.urlopen(req, timeout=5)
text = resp.read().decode(errors="replace")
if limit:
text = text[:limit]
m = re.search(r"\d+\.\d+\.\d+", text)
if m:
version = m.group(0)
try:
os.makedirs(os.path.dirname(cache_path), exist_ok=True)
with open(cache_path, "w") as f:
json.dump({"version": version, "checked_at": time.time()}, f)
except Exception:
pass
return version
except Exception:
pass
return _antigravity_version
def _ensure_antigravity_version():
global _antigravity_version, _antigravity_version_checked
if time.time() - _antigravity_version_checked < 6 * 3600:
return _antigravity_version
with _antigravity_version_lock:
if time.time() - _antigravity_version_checked < 6 * 3600:
return _antigravity_version
_antigravity_version = _fetch_antigravity_version()
_antigravity_version_checked = time.time()
return _antigravity_version
def _init_runtime():
global CONFIG, PORT, BACKEND, TARGET_URL, API_KEY, OAUTH_PROVIDER
global CONFIG, PORT, BACKEND, TARGET_URL, API_KEY, OAUTH_PROVIDER, _antigravity_version
global MODELS, CC_VERSION, REASONING_ENABLED, REASONING_EFFORT, BGP_ROUTES
CONFIG = load_config()
@@ -117,12 +165,15 @@ def _init_runtime():
BACKEND = CONFIG["backend_type"]
TARGET_URL = CONFIG["target_url"].rstrip("/")
API_KEY = CONFIG["api_key"]
OAUTH_PROVIDER = CONFIG.get("oauth_provider", "")
OAUTH_PROVIDER = CONFIG.get("oauth_provider") or ""
MODELS = CONFIG["models"]
CC_VERSION = CONFIG.get("cc_version", "")
REASONING_ENABLED = CONFIG.get("reasoning_enabled", True)
REASONING_EFFORT = CONFIG.get("reasoning_effort", "medium")
BGP_ROUTES = CONFIG.get("bgp_routes", [])
if OAUTH_PROVIDER == "google-antigravity":
_antigravity_version = _ensure_antigravity_version()
print(f"[antigravity] version={_antigravity_version}", file=sys.stderr)
bgp_models = []
for _r in BGP_ROUTES:
@@ -134,13 +185,33 @@ def _init_runtime():
MODELS = [{"id": m, "object": "model", "created": 1700000000, "owned_by": "bgp"} for m in bgp_models]
CONFIG["models"] = MODELS
if (BACKEND or "").startswith("gemini-oauth") and (OAUTH_PROVIDER or "").startswith("google"):
token_name = "google-antigravity-oauth-token.json" if OAUTH_PROVIDER == "google-antigravity" else "google-cli-oauth-token.json"
token_path = os.path.join(os.path.expanduser("~"), ".cache", "codex-proxy", token_name)
try:
with open(token_path) as _tf:
_td = json.load(_tf)
_discovered = [] if OAUTH_PROVIDER == "google-antigravity" else _td.get("available_models", [])
if _discovered:
_seen = []
for _m in _discovered:
if _m not in _seen:
_seen.append(_m)
MODELS = [{"id": m, "object": "model", "created": 1700000000, "owned_by": "gemini-oauth"} for m in _seen]
CONFIG["models"] = MODELS
print(f"[gemini-oauth] loaded {len(_seen)} discovered models: {_seen}", file=sys.stderr)
except Exception:
pass
def _refresh_oauth_token():
return _refresh_oauth_token_for(API_KEY, OAUTH_PROVIDER)
def _refresh_oauth_token_for(api_key, oauth_provider):
if oauth_provider != "google":
oauth_provider = oauth_provider or ""
if not oauth_provider.startswith("google"):
return api_key
token_path = os.path.join(os.path.expanduser("~"), ".cache", "codex-proxy", "google-oauth-token.json")
token_name = "google-antigravity-oauth-token.json" if oauth_provider == "google-antigravity" else "google-cli-oauth-token.json"
token_path = os.path.join(os.path.expanduser("~"), ".cache", "codex-proxy", token_name)
if not os.path.exists(token_path):
return api_key
try:
@@ -329,6 +400,70 @@ _CROF_ADAPTIVE = {
"min_keep_recent": 4,
}
_BGP_STATS_PATH = os.path.join(_LOG_DIR, "bgp-route-stats.json")
_bgp_stats_lock = threading.Lock()
def _route_key(route):
return f"{route.get('name', '')}::{route.get('target_url', '')}::{route.get('model', '')}"
def _load_bgp_stats():
try:
if os.path.exists(_BGP_STATS_PATH):
return json.load(open(_BGP_STATS_PATH))
except Exception:
pass
return {}
def _save_bgp_stats(stats):
tmp = _BGP_STATS_PATH + ".tmp"
with open(tmp, "w") as f:
json.dump(stats, f, indent=2)
os.replace(tmp, _BGP_STATS_PATH)
def _score_route(route, stats):
key = _route_key(route)
rs = stats.get(key, {})
now = time.time()
if float(rs.get("open_until_ts", 0)) > now:
return 1_000_000
priority = int(route.get("priority", 99))
ewma = float(rs.get("ewma_latency_s", 0))
failures = int(rs.get("consecutive_failures", 0))
score = priority + min(ewma * 5, 50) + failures * 20
if float(rs.get("rate_limited_until", 0)) > now:
score += 500
return score
def _update_route_stats(route, success, duration_s, http_code=None, error_type=None):
with _bgp_stats_lock:
stats = _load_bgp_stats()
key = _route_key(route)
rs = stats.setdefault(key, {
"ewma_latency_s": duration_s, "consecutive_failures": 0,
"last_success": None, "last_failure": None,
"open_until_ts": 0, "rate_limited_until": 0, "last_error": None,
})
alpha = 0.25
rs["ewma_latency_s"] = alpha * duration_s + (1 - alpha) * float(rs.get("ewma_latency_s", duration_s))
if success:
rs["consecutive_failures"] = 0
rs["last_success"] = time.time()
else:
rs["consecutive_failures"] = int(rs.get("consecutive_failures", 0)) + 1
rs["last_failure"] = time.time()
rs["last_error"] = error_type or (f"http_{http_code}" if http_code else "unknown")
if http_code == 429:
rs["rate_limited_until"] = time.time() + 120
if rs["consecutive_failures"] >= 3:
rs["open_until_ts"] = time.time() + 60
rs["consecutive_failures"] = 0
_save_bgp_stats(stats)
def _sorted_bgp_routes():
with _bgp_stats_lock:
stats = _load_bgp_stats()
return sorted(BGP_ROUTES, key=lambda r: _score_route(r, stats))
def _crof_record(model, n_items, success):
if not isinstance(n_items, int) or n_items < 1:
return
@@ -536,6 +671,193 @@ def _compact_input(input_data):
print(f"[compact] {len(input_data)} items -> {len(head) + 1 + len(tail)} (compacted {len(body)} old items into summary)", file=sys.stderr)
return head + [summary_msg] + tail
# ═══════════════════════════════════════════════════════════════════
# Provider policies
# ═══════════════════════════════════════════════════════════════════
_PROVIDER_POLICIES = {
"crof": {"reasoning_mode": "off", "max_tokens": 32768, "strip_reasoning": True,
"tool_output_limit": 4000, "max_input_items": 18, "compaction": "aggressive"},
"chats-llm": {"reasoning_mode": "off", "max_tokens": 32768, "strip_reasoning": True,
"tool_output_limit": 4000, "max_input_items": 20, "compaction": "aggressive"},
"z.ai": {"reasoning_mode": "medium", "max_tokens": 65536, "strip_reasoning": True,
"tool_output_limit": 8000, "max_input_items": 40, "compaction": "balanced"},
"openrouter": {"reasoning_mode": "provider_default", "max_tokens": 32768, "strip_reasoning": True,
"tool_output_limit": 6000, "max_input_items": 35, "compaction": "balanced"},
"openadapter": {"reasoning_mode": "off", "max_tokens": 32768, "strip_reasoning": True,
"tool_output_limit": 6000, "max_input_items": 30, "compaction": "balanced"},
}
def provider_policy(target_url=None, backend=None):
host = urllib.parse.urlparse(target_url or TARGET_URL).netloc.lower()
for key, policy in _PROVIDER_POLICIES.items():
if key in host:
return policy
return {}
# ═══════════════════════════════════════════════════════════════════
# Adaptive context compaction (model-aware)
# ═══════════════════════════════════════════════════════════════════
_MODEL_CONTEXT = {
"gpt-4o": 128000, "gpt-4o-mini": 128000, "gpt-5": 128000,
"claude-sonnet": 200000, "claude-haiku": 200000,
"glm-5.1": 128000, "glm-5": 128000, "glm-4": 128000,
"deepseek": 64000, "gemini-2.5-flash": 1000000, "gemini-2.5-pro": 2000000,
"mimo": 32768, "minimax": 32768, "kimi": 128000,
"_default": 32768,
}
def _context_limit_for_model(model):
if not model:
return _MODEL_CONTEXT["_default"]
ml = model.lower()
for key, limit in _MODEL_CONTEXT.items():
if key != "_default" and key in ml:
return limit
return _MODEL_CONTEXT["_default"]
def _estimate_tokens(obj):
if obj is None:
return 0
if isinstance(obj, str):
return max(1, len(obj) // 4)
try:
raw = json.dumps(obj, ensure_ascii=False)
except Exception:
raw = str(obj)
return max(1, len(raw) // 4)
def _adaptive_compact(input_data, model, policy=None):
policy = policy or {}
context_size = int(policy.get("context_size", _context_limit_for_model(model)))
input_budget = int(context_size * 0.60)
estimated = _estimate_tokens(input_data)
if estimated <= input_budget:
return input_data, False
if not isinstance(input_data, list):
return input_data, False
reduction = max(0.15, input_budget / max(estimated, 1))
target_items = max(int(len(input_data) * reduction), 6)
if target_items >= len(input_data):
return input_data, False
head_end = 0
for i, item in enumerate(input_data):
t = item.get("type")
if t == "message" and item.get("role") in ("developer", "system"):
head_end = i + 1
elif t == "message" and item.get("role") == "user" and head_end == i:
head_end = i + 1
else:
break
head = input_data[:head_end]
keep = max(4, target_items // 3)
tail_start = max(head_end, len(input_data) - keep)
while tail_start > head_end:
t = input_data[tail_start].get("type")
if t in ("function_call_output", "function_call"):
tail_start -= 1
elif t == "message" and input_data[tail_start].get("role") == "assistant":
tail_start -= 1
else:
break
tail = input_data[tail_start:]
body = input_data[head_end:tail_start]
if not body:
return head + tail, True
summary_lines = [f"[Auto-compacted: {len(body)} turns removed (budget={input_budget}tok, model={model})]"]
for item in body[-5:]:
summary_lines.append(_item_summary(item, max_len=120))
summary_msg = {"type": "message", "role": "user",
"content": [{"type": "input_text", "text": "\n".join(summary_lines)}]}
print(f"[adaptive-compact] model={model} est={estimated}tok budget={input_budget}tok "
f"items {len(input_data)}->{len(head)+1+len(tail)}", file=sys.stderr)
return head + [summary_msg] + tail, True
# ═══════════════════════════════════════════════════════════════════
# Tool-call pairing validator
# ═══════════════════════════════════════════════════════════════════
def validate_tool_pairs(input_items):
if not isinstance(input_items, list):
return []
calls = {}
errors = []
for idx, item in enumerate(input_items):
t = item.get("type")
if t == "function_call":
cid = item.get("call_id") or item.get("id")
if cid:
calls[cid] = idx
elif t == "function_call_output":
cid = item.get("call_id") or item.get("id")
if not cid or cid not in calls:
errors.append({"index": idx, "call_id": cid, "error": "orphan_function_call_output"})
return errors
def repair_orphan_tool_outputs(input_items, errors):
bad = {e["index"] for e in errors}
repaired = []
for idx, item in enumerate(input_items):
if idx in bad:
output = item.get("output", "")
repaired.append({"type": "message", "role": "user",
"content": [{"type": "input_text",
"text": f"[Proxy: unmatched tool output]\n{str(output)[:4000]}"}]})
else:
repaired.append(item)
return repaired
# ═══════════════════════════════════════════════════════════════════
# Log redaction
# ═══════════════════════════════════════════════════════════════════
_SECRET_PATTERNS = [
(r"sk-[A-Za-z0-9_\-]{20,}", "[REDACTED:key]"),
(r"sk-ant-[A-Za-z0-9_\-]{20,}", "[REDACTED:anthropic]"),
(r"gh[pousr]_[A-Za-z0-9_]{20,}", "[REDACTED:github]"),
(r"Bearer\s+[A-Za-z0-9._\-]{20,}", "Bearer [REDACTED]"),
]
def _redact(text):
if not text:
return text
import re
for pattern, replacement in _SECRET_PATTERNS:
text = re.sub(pattern, replacement, text)
return text
# ═══════════════════════════════════════════════════════════════════
# Rate-limit token buckets
# ═══════════════════════════════════════════════════════════════════
class TokenBucket:
def __init__(self, capacity=10, refill=1.0):
self.capacity = float(capacity)
self.tokens = float(capacity)
self.refill = float(refill)
self.updated = time.monotonic()
self.lock = threading.Lock()
def allow(self, cost=1):
with self.lock:
now = time.monotonic()
self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.refill)
self.updated = now
if self.tokens >= cost:
self.tokens -= cost
return True
return False
_rate_buckets = {}
_rate_buckets_lock = threading.Lock()
def _bucket_for_route(route):
name = route.get("name") or route.get("target_url") or "default"
with _rate_buckets_lock:
if name not in _rate_buckets:
_rate_buckets[name] = TokenBucket(capacity=10, refill=1.0)
return _rate_buckets[name]
# ═══════════════════════════════════════════════════════════════════
# OpenAI-compat backend
# ═══════════════════════════════════════════════════════════════════
@@ -1154,14 +1476,31 @@ class Handler(http.server.BaseHTTPRequestHandler):
self._handle_anthropic(body, model, stream)
elif BACKEND == "command-code":
self._handle_command_code(body, model, stream)
elif (BACKEND or "").startswith("gemini-oauth"):
self._handle_gemini_oauth(body, model, stream)
else:
self._handle_openai_compat(body, model, stream)
def _handle_openai_compat(self, body, model, stream):
input_data = body.get("input", "")
policy = provider_policy()
pair_errors = validate_tool_pairs(input_data)
if pair_errors:
print(f"[tool-validator] repairing {len(pair_errors)} orphan tool outputs", file=sys.stderr)
input_data = repair_orphan_tool_outputs(input_data, pair_errors)
body = dict(body)
body["input"] = input_data
compacted = False
if policy.get("compaction") and isinstance(input_data, list):
input_data, compacted = _adaptive_compact(input_data, model, policy)
if compacted:
body = dict(body)
body["input"] = input_data
crof_limit = _crof_item_limit(model)
if isinstance(input_data, list) and len(input_data) > crof_limit:
if not compacted and isinstance(input_data, list) and len(input_data) > crof_limit:
print(f"[crof-adaptive] proactive compact: {len(input_data)} items > limit {crof_limit}", file=sys.stderr)
input_data = _crof_compact_for_retry(input_data, model)
body = dict(body)
@@ -1228,8 +1567,379 @@ class Handler(http.server.BaseHTTPRequestHandler):
chat_body["reasoning_effort"] = REASONING_EFFORT
return chat_body
def _handle_gemini_oauth(self, body, model, stream):
input_data = body.get("input", "")
policy = provider_policy()
if OAUTH_PROVIDER == "google-antigravity":
alias_map = {
"antigravity-gemini-3-flash": "gemini-3-flash",
"antigravity-gemini-3-pro": "gemini-3-pro-low",
"antigravity-gemini-3.1-pro": "gemini-3.1-pro-low",
"gemini-3-flash-preview": "gemini-3-flash",
"gemini-3-pro-preview": "gemini-3-pro-low",
"gemini-3.1-pro-preview": "gemini-3.1-pro-low",
"gemini-3-pro": "gemini-3-pro-low",
"gemini-3.1-pro": "gemini-3.1-pro-low",
"antigravity-claude-sonnet-4-6": "claude-sonnet-4-6",
"antigravity-claude-opus-4-6-thinking": "claude-opus-4-6-thinking",
}
model = alias_map.get(model, model)
pair_errors = validate_tool_pairs(input_data)
if pair_errors:
input_data = repair_orphan_tool_outputs(input_data, pair_errors)
body = dict(body)
body["input"] = input_data
compacted = False
if policy.get("compaction") and isinstance(input_data, list):
input_data, compacted = _adaptive_compact(input_data, model, policy)
if compacted:
body = dict(body)
body["input"] = input_data
access_token = _refresh_oauth_token()
token_name = "google-antigravity-oauth-token.json" if OAUTH_PROVIDER == "google-antigravity" else "google-cli-oauth-token.json"
token_path = os.path.join(os.path.expanduser("~"), ".cache", "codex-proxy", token_name)
project_id = ""
try:
with open(token_path) as f:
project_id = json.load(f).get("project_id", "")
except Exception:
pass
contents = []
system_parts = []
instructions = body.get("instructions", "").strip()
tool_call_names = {}
if isinstance(input_data, list):
for item in input_data:
t = item.get("type")
if t == "message":
role = "user" if item.get("role") == "user" else "model"
content = item.get("content", "")
if isinstance(content, list):
parts = []
for c in content:
ct = c.get("type")
if ct == "input_text":
parts.append({"text": c.get("text", "")})
elif ct == "text":
parts.append({"text": c.get("text", "")})
elif ct == "input_image" or ct == "image_url":
iu = c.get("image_url") or c.get("url", {})
url = iu.get("url", iu) if isinstance(iu, dict) else iu
if isinstance(url, str) and url.startswith("data:"):
mime, _, b64 = url.partition(";base64,")
mime = mime.replace("data:", "") or "image/png"
parts.append({"inlineData": {"mimeType": mime, "data": b64}})
else:
parts.append({"text": str(url)})
if parts:
contents.append({"role": role, "parts": parts})
elif isinstance(content, str):
contents.append({"role": role, "parts": [{"text": content}]})
elif t == "function_call":
call_id = item.get("call_id") or item.get("id") or f"call_{uuid.uuid4().hex[:24]}"
fname = item.get("name", "")
if call_id and fname:
tool_call_names[call_id] = fname
args = item.get("arguments", "{}")
if isinstance(args, str):
try:
args = json.loads(args)
except Exception:
args = {}
contents.append({"role": "model", "parts": [{"functionCall": {"name": fname, "args": args, "id": call_id}, "thoughtSignature": "skip_thought_signature_validator"}]})
elif t == "function_call_output":
call_id = item.get("call_id", item.get("id", ""))
output = item.get("output", "")
fname = item.get("name", "") or tool_call_names.get(call_id, "")
try:
output_parsed = json.loads(output) if isinstance(output, str) else output
except Exception:
output_parsed = output
resp_part = {"functionResponse": {"name": fname or "unknown", "response": {"result": output_parsed if isinstance(output_parsed, (dict, list)) else output}}}
if call_id:
resp_part["functionResponse"]["id"] = call_id
contents.append({"role": "user", "parts": [resp_part]})
if OAUTH_PROVIDER.startswith("google"):
sanitized = []
last_user_text = None
last_role = None
for content in contents:
role = content.get("role")
parts = [p for p in content.get("parts", []) if isinstance(p, dict)]
if not parts:
continue
text_key = "\n".join([p.get("text", "") for p in parts if "text" in p]).strip()
if role == "user" and text_key and text_key == last_user_text:
continue
if role == last_role and role in ("user", "model") and sanitized:
sanitized[-1].setdefault("parts", []).extend(parts)
else:
sanitized.append({"role": role, "parts": parts})
if role == "user" and text_key:
last_user_text = text_key
last_role = role
while sanitized and sanitized[0].get("role") != "user":
sanitized.pop(0)
while sanitized and sanitized[-1].get("role") != "user":
sanitized.pop()
contents = sanitized
if instructions:
system_parts.append({"text": instructions})
if OAUTH_PROVIDER == "google-antigravity":
system_parts.append({"text": (
"You are connected through a Responses API translation proxy. "
"If tools are available and the user's request requires changing files, call the appropriate tool immediately. "
"Do not announce plans, do not say you will list files, browse, fetch, inspect, or start by exploring unless you are emitting the actual tool call in the same response. "
"For file creation requests, use tools to create or modify the file instead of only printing code in chat. "
"If no suitable tool is available, answer directly with the complete result. "
"Never answer only with a plan such as 'I will start by...' or 'I am going to...'."
)})
gen_config = {}
mot = body.get("max_output_tokens", 0)
if mot:
gen_config["maxOutputTokens"] = mot
if body.get("temperature") is not None:
gen_config["temperature"] = body["temperature"]
if body.get("top_p") is not None:
gen_config["topP"] = body["top_p"]
if REASONING_ENABLED and REASONING_EFFORT != "none":
budget = {"low": 2048, "medium": 8192, "high": 24576}.get(REASONING_EFFORT, 8192)
gen_config["thinkingConfig"] = {"includeThoughts": True, "thinkingBudget": budget}
oa_tools = body.get("tools", [])
gemini_tools = []
if oa_tools:
func_decls = []
for tool in oa_tools:
ttype = tool.get("type", "function")
fname = tool.get("name", "")
if ttype == "function":
fn = tool.get("function", tool)
name = fn.get("name", fname)
desc = fn.get("description", "")
params = fn.get("parameters", fn.get("input_schema", {}))
func_decls.append({"name": name, "description": desc, "parameters": params})
elif fname:
func_decls.append({"name": fname, "description": tool.get("description", ""), "parameters": tool.get("parameters", {"type": "object", "properties": {}})})
if func_decls:
gemini_tools = [{"functionDeclarations": func_decls}]
request_body = {"contents": contents}
if system_parts:
request_body["systemInstruction"] = {"parts": system_parts}
if gen_config:
request_body["generationConfig"] = gen_config
if gemini_tools:
request_body["tools"] = gemini_tools
wrapped = {
"project": project_id,
"model": model,
"request": request_body,
}
if OAUTH_PROVIDER == "google-antigravity":
wrapped["requestType"] = "agent"
wrapped["userAgent"] = "antigravity"
wrapped["requestId"] = f"agent-{uuid.uuid4().hex[:12]}"
endpoints = ([
"https://daily-cloudcode-pa.sandbox.googleapis.com",
"https://autopush-cloudcode-pa.sandbox.googleapis.com",
"https://cloudcode-pa.googleapis.com",
] if OAUTH_PROVIDER == "google-antigravity" else [
"https://cloudcode-pa.googleapis.com",
])
action = "streamGenerateContent" if stream else "generateContent"
url_suffix = f"v1internal:{action}?alt=sse" if stream else f"v1internal:{action}"
headers = {
"Content-Type": "application/json",
"Authorization": f"Bearer {access_token}",
}
if OAUTH_PROVIDER == "google-antigravity":
version = _ensure_antigravity_version()
headers["User-Agent"] = f"antigravity/{version} darwin/arm64"
else:
headers["User-Agent"] = "google-api-nodejs-client/9.15.1"
headers["X-Goog-Api-Client"] = "gl-node/22.17.0"
headers["Client-Metadata"] = "ideType=IDE_UNSPECIFIED,platform=PLATFORM_UNSPECIFIED,pluginType=GEMINI"
body_b = json.dumps(wrapped).encode()
print(f"[gemini-oauth] model={model} stream={stream} items={len(input_data) if isinstance(input_data, list) else 1} project={project_id}", file=sys.stderr)
for ep in endpoints:
target = f"{ep}/{url_suffix}"
req = urllib.request.Request(target, data=body_b, headers=headers)
try:
upstream = urllib.request.urlopen(req, timeout=_upstream_timeout(body, stream))
break
except urllib.error.HTTPError as e:
err_body = e.read().decode()
if e.code == 400 and OAUTH_PROVIDER.startswith("google"):
try:
debug_path = os.path.join(_LOG_DIR, "gemini-last-400-request.json")
with open(debug_path, "w") as dbg:
json.dump({"endpoint": ep, "model": model, "wrapped": wrapped, "error": err_body}, dbg, indent=2)
print(f"[gemini-oauth] saved 400 debug request to {debug_path}", file=sys.stderr)
except Exception:
pass
if e.code == 429 and ep != endpoints[-1]:
print(f"[gemini-oauth] {ep} HTTP 429, trying next endpoint", file=sys.stderr)
continue
return self.send_json(e.code, {"error": {"type": "upstream_error", "message": err_body}})
except Exception as e:
if ep == endpoints[-1]:
return self.send_json(502, {"error": {"type": "proxy_error", "message": str(e)}})
print(f"[gemini-oauth] {ep} failed: {e}, trying next", file=sys.stderr)
continue
if stream:
self._forward_gemini_sse(upstream, model, body, input_data)
else:
self._forward_gemini_json(upstream, model, body, input_data)
def _forward_gemini_sse(self, upstream, model, body, input_data):
resp_id = f"resp-{uuid.uuid4().hex[:24]}"
created = int(time.time())
self.send_response(200)
self.send_header("Content-Type", "text/event-stream")
self.send_header("Cache-Control", "no-cache")
self.send_header("Connection", "keep-alive")
self.end_headers()
full_text = ""
output_items = []
current_tool_calls = {}
message_started = False
message_id = f"msg-{uuid.uuid4().hex[:24]}"
def flush_event(event_type, data):
self.wfile.write(f"event: {event_type}\ndata: {json.dumps(data)}\n\n".encode())
self.wfile.flush()
flush_event("response.created", {"type": "response.created", "response": {"id": resp_id, "object": "response", "model": model, "status": "in_progress", "created": created, "output": []}})
flush_event("response.in_progress", {"type": "response.in_progress", "response": {"id": resp_id}})
buf = ""
stream_finished = False
for raw_line in upstream:
if stream_finished:
break
line = raw_line.decode(errors="replace")
if line.startswith("data: "):
buf += line[6:]
continue
if not line.strip() and buf:
try:
chunk = json.loads(buf)
except Exception:
buf = ""
continue
buf = ""
candidates = chunk.get("response", chunk).get("candidates", [])
if not candidates:
if chunk.get("error"):
print(f"[gemini-oauth] stream error chunk: {str(chunk.get('error'))[:300]}", file=sys.stderr)
continue
if candidates[0].get("finishReason") and not candidates[0].get("content", {}).get("parts"):
print(f"[gemini-oauth] finish without parts: {candidates[0].get('finishReason')}", file=sys.stderr)
parts = candidates[0].get("content", {}).get("parts", [])
for part in parts:
if part.get("thought"):
continue
if "text" in part and not part.get("functionCall"):
text_delta = part["text"]
if not text_delta:
continue
full_text += text_delta
if not message_started:
flush_event("response.output_item.added", {"type": "response.output_item.added", "output_index": 0, "item": {"type": "message", "id": message_id, "role": "assistant", "content": []}})
flush_event("response.content_part.added", {"type": "response.content_part.added", "output_index": 0, "content_index": 0, "part": {"type": "output_text", "text": ""}})
output_items.append({"text": True})
message_started = True
flush_event("response.output_text.delta", {"type": "response.output_text.delta", "output_index": 0, "content_index": 0, "delta": text_delta})
elif part.get("functionCall"):
fc = part["functionCall"]
call_id = f"call_{uuid.uuid4().hex[:24]}"
args_str = json.dumps(fc.get("args", fc.get("arguments", {})))
output_index = len(output_items)
flush_event("response.output_item.added", {"type": "response.output_item.added", "output_index": output_index, "item": {"type": "function_call", "id": call_id, "call_id": call_id, "name": fc.get("name", ""), "arguments": ""}})
flush_event("response.function_call_arguments.delta", {"type": "response.function_call_arguments.delta", "output_index": output_index, "item_id": call_id, "delta": args_str})
flush_event("response.function_call_arguments.done", {"type": "response.function_call_arguments.done", "output_index": output_index, "item_id": call_id, "arguments": args_str})
current_tool_calls[call_id] = fc
output_items.append({"tool": True})
if OAUTH_PROVIDER == "google-antigravity" and full_text and candidates[0].get("finishReason"):
stream_finished = True
break
out = []
if not full_text and not current_tool_calls:
print("[gemini-oauth] WARNING: completed with empty output", file=sys.stderr)
if full_text:
out.append({"type": "message", "id": message_id, "role": "assistant", "content": [{"type": "output_text", "text": full_text}]})
tool_outputs = []
for cid, fc in current_tool_calls.items():
tool_outputs.append({"type": "function_call", "id": cid, "call_id": cid, "name": fc.get("name", ""), "arguments": json.dumps(fc.get("args", fc.get("arguments", {})))})
out.extend(tool_outputs)
final_resp = {"id": resp_id, "object": "response", "model": model, "status": "completed", "created": created, "output": out}
if full_text:
flush_event("response.output_text.done", {"type": "response.output_text.done", "output_index": 0, "content_index": 0, "text": full_text})
flush_event("response.content_part.done", {"type": "response.content_part.done", "output_index": 0, "content_index": 0, "part": {"type": "output_text", "text": full_text}})
flush_event("response.output_item.done", {"type": "response.output_item.done", "output_index": 0, "item": out[0]})
for idx, item in enumerate(tool_outputs, start=(1 if full_text else 0)):
flush_event("response.output_item.done", {"type": "response.output_item.done", "output_index": idx, "item": item})
flush_event("response.completed", {"type": "response.completed", "response": final_resp})
self.close_connection = True
with _response_store_lock:
_response_store[resp_id] = final_resp
while len(_response_store) > _MAX_STORED:
_response_store.popitem(last=False)
def _forward_gemini_json(self, upstream, model, body, input_data):
data = json.loads(upstream.read().decode())
resp_id = f"resp-{uuid.uuid4().hex[:24]}"
created = int(time.time())
out = []
full_text = ""
candidates = data.get("response", data).get("candidates", [])
if candidates:
parts = candidates[0].get("content", {}).get("parts", [])
text_parts = []
for part in parts:
if part.get("thought"):
continue
if "text" in part and not part.get("functionCall"):
text_parts.append(part["text"])
elif part.get("functionCall"):
fc = part["functionCall"]
call_id = f"call_{uuid.uuid4().hex[:24]}"
out.append({"type": "function_call", "id": call_id, "call_id": call_id, "name": fc.get("name", ""), "arguments": json.dumps(fc.get("args", fc.get("arguments", {})))})
if text_parts:
full_text = "".join(text_parts)
out.insert(0, {"type": "message", "id": f"msg-{uuid.uuid4().hex[:24]}", "role": "assistant", "content": [{"type": "output_text", "text": full_text}]})
resp = {"id": resp_id, "object": "response", "model": model, "status": "completed", "created": created, "output": out}
with _response_store_lock:
_response_store[resp_id] = resp
while len(_response_store) > _MAX_STORED:
_response_store.popitem(last=False)
self.send_json(200, resp)
def _handle_bgp(self, body, model, stream, messages, input_data):
routes = sorted(BGP_ROUTES, key=lambda r: r.get("priority", 99))
routes = _sorted_bgp_routes()
routes = [r for r in routes if _bucket_for_route(r).allow()]
if not routes:
return self.send_json(503, {"error": {"type": "bgp_rate_limited", "message": "All routes rate-limited"}})
errors = []
for route in routes:
r_model = route.get("model", model)
@@ -1266,11 +1976,13 @@ class Handler(http.server.BaseHTTPRequestHandler):
}, browser_ua=True)
print(f"[bgp] trying route '{route.get('name', r_url)}' model={r_model}", file=sys.stderr)
req = urllib.request.Request(target, data=json.dumps(chat_body).encode(), headers=fwd)
t0_route = time.time()
route_ok = False
for attempt in range(3):
try:
upstream = urllib.request.urlopen(req, timeout=_upstream_timeout(body, stream))
print(f"[bgp] route '{route.get('name', r_url)}' connected OK", file=sys.stderr)
_update_route_stats(route, True, time.time() - t0_route)
self._forward_oa_compat(upstream, stream, r_model, chat_body, body, input_data, fwd, target)
return
except urllib.error.HTTPError as e:
@@ -1282,6 +1994,7 @@ class Handler(http.server.BaseHTTPRequestHandler):
req = urllib.request.Request(target, data=json.dumps(chat_body).encode(), headers=fwd)
continue
print(f"[bgp] route '{route.get('name', r_url)}' FAILED: HTTP {e.code}: {err[:200]}", file=sys.stderr)
_update_route_stats(route, False, time.time() - t0_route, http_code=e.code)
errors.append(f"{route.get('name','?')}: HTTP {e.code}")
break
except (ConnectionResetError, ConnectionAbortedError, BrokenPipeError) as e:
@@ -1291,10 +2004,12 @@ class Handler(http.server.BaseHTTPRequestHandler):
time.sleep(wait)
req = urllib.request.Request(target, data=json.dumps(chat_body).encode(), headers=fwd)
continue
_update_route_stats(route, False, time.time() - t0_route, error_type=str(e))
errors.append(f"{route.get('name','?')}: {e}")
break
except Exception as e:
print(f"[bgp] route '{route.get('name', r_url)}' FAILED: {e}", file=sys.stderr)
_update_route_stats(route, False, time.time() - t0_route, error_type=str(e))
errors.append(f"{route.get('name','?')}: {e}")
break