v3.3.0: Antigravity OAuth + Gemini CLI OAuth, full Codex agent loop with tool calls, history hardening, SSE fixes

2026-05-20 21:44:33 +04:00
parent b060706e18
commit e2f20810f0
6 changed files with 1085 additions and 87 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,5 +1,36 @@
 # Changelog

+## v3.3.0 (2026-05-20)
+
+**Antigravity + Gemini CLI OAuth — full Codex agent loop working**
+
+### Gemini CLI OAuth + Antigravity OAuth
+- Split Google OAuth into separate Gemini CLI OAuth and Google Antigravity OAuth presets/backends.
+- Gemini CLI OAuth uses the Gemini CLI public OAuth client and Code Assist endpoints.
+- Antigravity OAuth uses Antigravity OAuth credentials, Code Assist daily/autopush/prod fallback, and Antigravity-style request wrapping.
+- Added Antigravity version discovery from the updater/changelog with local caching.
+- Added Antigravity model alias mapping from UI-facing `antigravity-*` IDs to upstream Code Assist model IDs.
+
+### Responses API + Tool Flow
+- Added Gemini-style history hardening for Google OAuth requests: removes empty turns, coalesces adjacent roles, drops duplicate user repeats, and enforces user-start/user-end history.
+- Preserves function-call IDs across turns and adds synthetic `thoughtSignature` for historical Gemini function calls, matching Gemini CLI hardening behavior.
+- Fixed Antigravity streaming Responses API compatibility: single assistant message item, text done events, content part done, output item done, final completed event, and connection close.
+- Added `response.function_call_arguments.delta` and `response.function_call_arguments.done` events so Codex can execute Antigravity tool calls and create files.
+- Fixed functionResponse name matching — uses the original functionCall name instead of falling back to call_id.
+- Strengthened Antigravity prompt policy: use tools immediately for file changes, avoid planning-only responses, and answer directly when no suitable tool exists.
+
+### Reliability + Routing
+- Added BGP++ route scoring, route cooldowns, token buckets, and persisted route stats.
+- Added provider policy layer and adaptive context compaction.
+- Added tool-call pairing validation/repair for orphaned tool outputs.
+- Added Endpoint Doctor in the endpoint editor.
+- Added log redaction helper for common API key/token patterns.
+
+## v3.1.0 (2026-05-20)
+
+- Initial Antigravity/Gemini CLI OAuth backend split.
+- Gemini-style history hardening, SSE streaming fixes.
+
 ## v3.0.0 (2026-05-20)

 **Major architectural overhaul — Phase 0 + Phase 1 of engineering roadmap**
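The history-hardening entry above (remove empty turns, coalesce adjacent roles, drop duplicate user repeats, enforce user-start/user-end) can be sketched roughly as follows. This is an illustrative sketch only, not the proxy's actual code: the function name `harden_history` is hypothetical, the Gemini-style `{"role", "parts"}` turn shape is assumed, and trimming non-user turns from both ends is just one possible reading of "enforces user-start/user-end".

```python
def harden_history(contents):
    """Return a cleaned copy of a Gemini-style `contents` list (hypothetical helper)."""
    cleaned = []
    for turn in contents:
        # Drop parts that are missing or carry only empty text.
        parts = [p for p in turn.get("parts", []) if p and p != {"text": ""}]
        if not parts:
            continue  # remove empty turns
        role = turn.get("role", "user")
        if cleaned and cleaned[-1]["role"] == role:
            if role == "user" and cleaned[-1]["parts"][-len(parts):] == parts:
                continue  # drop a verbatim repeat of the previous user turn
            cleaned[-1]["parts"].extend(parts)  # coalesce adjacent same-role turns
        else:
            cleaned.append({"role": role, "parts": list(parts)})
    # Assumption: enforce user-start/user-end by trimming non-user turns at both ends.
    while cleaned and cleaned[0]["role"] != "user":
        cleaned.pop(0)
    while cleaned and cleaned[-1]["role"] != "user":
        cleaned.pop()
    return cleaned
```

The input list is not mutated; callers get a new list that alternates roles and begins and ends with a user turn, or an empty list if no user turn survives.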
--- a/README.md
+++ b/README.md
@@ -15,7 +15,7 @@

 <p align="center">
  <strong>Run OpenAI Codex CLI &amp; Desktop with <em>any</em> AI provider.</strong><br/>
-  OpenCode &bull; Z.AI &bull; Anthropic &bull; Command Code &bull; OpenRouter &bull; Crof.ai &bull; NVIDIA NIM &bull; Kilo.ai &bull; and more
+  Google Antigravity &bull; Gemini CLI &bull; OpenCode &bull; Z.AI &bull; Anthropic &bull; Command Code &bull; OpenRouter &bull; Crof.ai &bull; NVIDIA NIM &bull; Kilo.ai &bull; and more
 </p>

 <p align="center">
@@ -43,14 +43,16 @@ OpenAI's Codex CLI v2.0+ exclusively uses the **Responses API** — a protocol t
 | Provider | API | Works with Codex? |
 |----------|-----|:-:|
 | OpenAI | Responses API | ✅ |
-| Z.AI | Chat Completions | ❌ |
-| OpenCode | Chat Completions | ❌ |
-| Anthropic | Messages API | ❌ |
-| Command Code | Custom `/alpha/generate` | ❌ |
-| Ollama | Chat Completions | ❌ |
-| OpenRouter | Chat Completions | ❌ |
-| NVIDIA NIM | Chat Completions | ❌ |
-| Crof.ai | Chat Completions | ❌ |
+| Google Antigravity (OAuth) | Code Assist / Gemini Native | ✅ |
+| Gemini CLI OAuth | Code Assist | ✅ |
+| Z.AI | Chat Completions | ✅ |
+| OpenCode | Chat Completions | ✅ |
+| Anthropic | Messages API | ✅ |
+| Command Code | Custom `/alpha/generate` | ✅ |
+| Ollama | Chat Completions | ✅ |
+| OpenRouter | Chat Completions | ✅ |
+| NVIDIA NIM | Chat Completions | ✅ |
+| Crof.ai | Chat Completions | ✅ |

 The protocols differ in **endpoint paths**, **message formats**, **tool-call structures**, **streaming events**, and **completion semantics**. You can't just swap a base URL.

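As one concrete instance of the streaming-event differences mentioned above, the changelog names the `response.function_call_arguments.delta` and `response.function_call_arguments.done` events. A minimal sketch of how a proxy might frame them as server-sent events follows. The helper names `sse` and `function_call_events` are hypothetical, and the payload fields (`item_id`, `output_index`, `delta`, `arguments`) are assumed from the public Responses API event shapes rather than taken from this commit.

```python
import json


def sse(event_type, payload):
    """Frame one server-sent event; the event type is repeated inside the JSON body."""
    body = dict(payload, type=event_type)
    return f"event: {event_type}\ndata: {json.dumps(body)}\n\n"


def function_call_events(item_id, arguments_json, chunk_size=16, output_index=0):
    """Yield delta events for chunks of the arguments string, then one done event."""
    for i in range(0, len(arguments_json), chunk_size):
        yield sse("response.function_call_arguments.delta", {
            "item_id": item_id,
            "output_index": output_index,
            "delta": arguments_json[i:i + chunk_size],
        })
    # The done event carries the complete arguments string.
    yield sse("response.function_call_arguments.done", {
        "item_id": item_id,
        "output_index": output_index,
        "arguments": arguments_json,
    })
```

Concatenating the `delta` fields in order reproduces the `arguments` value of the final event, which is the property a client relies on when it assembles a tool call from the stream.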
--- a/codex-launcher_2.7.0_all.deb
+++ b/codex-launcher_2.7.0_all.deb
--- a/codex-launcher_3.3.0_all.deb
+++ b/codex-launcher_3.3.0_all.deb
--- a/src/codex-launcher-gui
+++ b/src/codex-launcher-gui
@@ -4,8 +4,9 @@
 import gi
 gi.require_version("Gtk", "3.0")
 from gi.repository import Gtk, GLib
-import subprocess, os, signal, sys, threading, time, json, urllib.request, tempfile, shutil
+import subprocess, os, signal, sys, threading, time, json, urllib.request, urllib.parse, tempfile, shutil
 import hashlib, socket, contextlib
+import base64, secrets
 from pathlib import Path

 HOME = Path.home()
@@ -25,6 +26,21 @@ model_catalog_json = ""
 """

 CHANGELOG = [
+    ("3.3.0", "2026-05-20", [
+        "Added Google Antigravity OAuth backend with Code Assist endpoints and model alias mapping",
+        "Added Gemini CLI OAuth backend using public Gemini CLI OAuth client",
+        "Antigravity now creates files via tool calls — full Codex agent loop with Gemini-style history hardening",
+        "Fixed tool-call streaming: function_call_arguments delta/done events, thought signatures, functionResponse name matching",
+        "Added Endpoint Doctor, adaptive BGP scoring, provider policies, adaptive compaction, log redaction",
+    ]),
+    ("3.1.0", "2026-05-20", [
+        "Initial Antigravity/Gemini CLI OAuth split, history hardening, SSE fixes",
+    ]),
+    ("3.0.0", "2026-05-20", [
+        "ThreadingHTTPServer with dynamic proxy ports and health-gated Codex launch",
+        "Atomic config writes, safe cleanup registry, graceful shutdown, and buffered SSE streaming",
+        "Usage Dashboard v2, TCP_NODELAY streaming, Anthropic prompt caching, and batched usage stats",
+    ]),
    ("2.6.1", "2026-05-20", [
        "Google OAuth rebuilt to emulate Gemini CLI — no client_secret.json needed",
        "Uses Google's public OAuth client_id (same as gemini-cli)",
@@ -226,13 +242,25 @@ PROVIDER_PRESETS = {
        ],
    },
    "Google Gemini (OAuth)": {
-        "backend_type": "openai-compat",
-        "base_url": "https://generativelanguage.googleapis.com/v1beta/openai",
-        "oauth_provider": "google",
+        "backend_type": "gemini-oauth-cli",
+        "base_url": "https://cloudcode-pa.googleapis.com",
+        "oauth_provider": "google-cli",
        "models": [
            "gemini-2.5-flash", "gemini-2.5-pro",
-            "gemini-2.0-flash", "gemini-2.0-flash-lite",
-            "gemini-2.5-flash-preview-native-audio-dialog",
+        ],
+    },
+    "Google Antigravity (OAuth)": {
+        "backend_type": "gemini-oauth-antigravity",
+        "base_url": "https://daily-cloudcode-pa.sandbox.googleapis.com",
+        "oauth_provider": "google-antigravity",
+        "models": [
+            "antigravity-gemini-3-flash",
+            "antigravity-gemini-3-pro",
+            "antigravity-gemini-3.1-pro",
+            "antigravity-claude-sonnet-4-6",
+            "antigravity-claude-opus-4-6-thinking",
+            "gemini-2.5-flash", "gemini-2.5-pro",
+            "gemini-3-flash-preview", "gemini-3-pro-preview", "gemini-3.1-pro-preview",
        ],
    },
    "OpenAdapter": {
@@ -301,8 +329,10 @@ def apply_provider_preset(endpoint, preset_name):
    updated["base_url"] = normalize_base_url(preset["base_url"])
    if preset.get("cc_version") and not updated.get("cc_version"):
        updated["cc_version"] = preset["cc_version"]
-    if not updated.get("models"):
+    if not updated.get("models") or (preset.get("backend_type") or "").startswith("gemini-oauth"):
        updated["models"] = list(preset.get("models", []))
+    if preset.get("oauth_provider"):
+        updated["oauth_provider"] = preset["oauth_provider"]
    if not updated.get("default_model") and updated.get("models"):
        updated["default_model"] = updated["models"][0]
    return updated
@@ -630,6 +660,18 @@ def _start_proxy_for(endpoint, logfn):
    port = _pick_free_port()
    _proxy_port = port

+    model_list = endpoint.get("models", [])
+    if (endpoint.get("backend_type") or "").startswith("gemini-oauth") and (endpoint.get("oauth_provider") or "").startswith("google"):
+        token_name = "google-antigravity-oauth-token.json" if endpoint.get("oauth_provider") == "google-antigravity" else "google-cli-oauth-token.json"
+        token_path = os.path.expanduser(f"~/.cache/codex-proxy/{token_name}")
+        try:
+            with open(token_path) as tf:
+                td = json.load(tf)
+            discovered = [] if endpoint.get("oauth_provider") == "google-antigravity" else td.get("available_models", [])
+            if discovered:
+                model_list = discovered
+        except Exception:
+            pass
    pcfg = {
        "port": port,
        "backend_type": endpoint["backend_type"],
@@ -640,7 +682,7 @@ def _start_proxy_for(endpoint, logfn):
        "reasoning_enabled": endpoint.get("reasoning_enabled", True),
        "reasoning_effort": endpoint.get("reasoning_effort", "medium"),
        "models": [{"id": m, "object": "model", "created": 1700000000, "owned_by": endpoint["name"]}
-                   for m in endpoint.get("models", [])],
+                   for m in model_list],
    }
    pcfg_path = PROXY_CONFIG_DIR / f"proxy-{safe_name(endpoint['name'])}-{port}.json"
    pcfg_path.parent.mkdir(parents=True, exist_ok=True)
@@ -763,7 +805,7 @@ class LauncherWin(Gtk.Window):
        # header row
        hdr = Gtk.Box(spacing=8)
        vbox.pack_start(hdr, False, False, 0)
-        lbl = Gtk.Label(label="<b>Codex Launcher v3.0.0</b>")
+        lbl = Gtk.Label(label="<b>Codex Launcher v3.3.0</b>")
        lbl.set_use_markup(True)
        hdr.pack_start(lbl, False, False, 0)
        changelog_btn = Gtk.Button(label="Changelog")
@@ -1277,7 +1319,7 @@ class LauncherWin(Gtk.Window):
            self.log("ERROR: no model selected")
            return

-        is_bgp = name.startswith("🔀 ")
+        is_bgp = bool(name and name.startswith("🔀 "))
        if is_bgp:
            pool_name = name[2:]
            pool = None
@@ -1781,6 +1823,8 @@ class EditEndpointDialog(Gtk.Dialog):
        for val, lab in [("openai-compat", "OpenAI-compatible (needs proxy)"),
                          ("anthropic", "Anthropic (needs proxy)"),
                          ("command-code", "Command Code (needs proxy)"),
+                          ("gemini-oauth-cli", "Gemini CLI OAuth (needs proxy)"),
+                          ("gemini-oauth-antigravity", "Antigravity OAuth (needs proxy)"),
                          ("native", "Native OpenAI (no proxy)")]:
            self._combo_type.append(val, lab)
        bt = self._data.get("backend_type", "openai-compat")
@@ -1866,6 +1910,9 @@ class EditEndpointDialog(Gtk.Dialog):
        self._fetch_models_btn = Gtk.Button(label="Fetch from API")
        self._fetch_models_btn.connect("clicked", lambda b: self._fetch_models())
        mbox.pack_start(self._fetch_models_btn, False, False, 0)
+        self._test_btn = Gtk.Button(label="Test Endpoint")
+        self._test_btn.connect("clicked", lambda b: self._diagnose_endpoint())
+        mbox.pack_start(self._test_btn, False, False, 0)

        bulk_lbl = Gtk.Label(label="Bulk add models (one per line or comma-separated):", xalign=0)
        area.pack_start(bulk_lbl, False, False, 2)
@@ -1970,29 +2017,52 @@ class EditEndpointDialog(Gtk.Dialog):
        preset_name = self._combo_preset.get_active_text() or "Custom"
        preset = PROVIDER_PRESETS.get(preset_name, {})
        provider = preset.get("oauth_provider", "")
-        if provider == "google":
-            self._google_oauth_flow()
+        if (provider or "").startswith("google"):
+            self._google_oauth_flow(provider)

-    def _google_oauth_flow(self):
-        token_path = os.path.expanduser("~/.cache/codex-proxy/google-oauth-token.json")
+    def _google_oauth_flow(self, oauth_provider="google-cli"):
+        is_antigravity = oauth_provider == "google-antigravity"
+        token_path = os.path.expanduser("~/.cache/codex-proxy/google-antigravity-oauth-token.json" if is_antigravity else "~/.cache/codex-proxy/google-cli-oauth-token.json")

-        CLIENT_ID = "681255809395-oo8ft2oprdrnp9e3aqf6av3hmdib135j.apps.googleusercontent.com"
-        CLIENT_SECRET = "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w"
+        if is_antigravity:
+            CLIENT_ID = "1071006060591-tmhssin2h21lcre235vtolojh4g403ep.apps.googleusercontent.com"
+            CLIENT_SECRET = "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"
+            SCOPES = [
+                "https://www.googleapis.com/auth/cloud-platform",
+                "https://www.googleapis.com/auth/userinfo.email",
+                "https://www.googleapis.com/auth/userinfo.profile",
+                "https://www.googleapis.com/auth/cclog",
+                "https://www.googleapis.com/auth/experimentsandconfigs",
+            ]
+            port = 51121
+            redirect_uri = f"http://localhost:{port}/oauth-callback"
+            callback_path = "/oauth-callback"
+            provider_kind = "antigravity"
+        else:
+            CLIENT_ID = "681255809395-oo8ft2oprdrnp9e3aqf6av3hmdib135j.apps.googleusercontent.com"
+            CLIENT_SECRET = "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"
            SCOPES = [
                "https://www.googleapis.com/auth/cloud-platform",
-            "https://www.googleapis.com/auth/generative-language.retriever",
                "https://www.googleapis.com/auth/userinfo.email",
                "https://www.googleapis.com/auth/userinfo.profile",
            ]
-        import http.server, hashlib, secrets, socket
+            port = 0
+            redirect_uri = None
+            callback_path = "/oauth2callback"
+            provider_kind = "cli"
+
+        import http.server

-        port = 8085
        state = secrets.token_hex(32)
-        verifier = secrets.token_urlsafe(32)
-        challenge = hashlib.sha256(verifier.encode()).digest()
-        challenge_b64 = urllib.parse.quote_plus(__import__('base64').urlsafe_b64encode(challenge).rstrip(b'=').decode())
+        verifier = secrets.token_urlsafe(64)
+        challenge = base64.urlsafe_b64encode(hashlib.sha256(verifier.encode()).digest()).rstrip(b"=").decode()

+        if port == 0:
+            with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
+                s.bind(("127.0.0.1", 0))
+                port = s.getsockname()[1]
            redirect_uri = f"http://127.0.0.1:{port}/oauth2callback"
+
        scope_str = " ".join(SCOPES)
        auth_url = (
            f"https://accounts.google.com/o/oauth2/v2/auth?"
@@ -2001,9 +2071,9 @@ class EditEndpointDialog(Gtk.Dialog):
            f"&response_type=code"
            f"&scope={urllib.parse.quote(scope_str)}"
            f"&access_type=offline"
-            f"&prompt=consent"
+            f"&prompt=select_account%20consent"
            f"&state={state}"
-            f"&code_challenge={challenge_b64}"
+            f"&code_challenge={challenge}"
            f"&code_challenge_method=S256"
        )

@@ -2043,6 +2113,14 @@ class EditEndpointDialog(Gtk.Dialog):
                qs = urllib.parse.urlparse(self2.path).query
                params = urllib.parse.parse_qs(qs)
                received_state[0] = params.get("state", [None])[0]
+                with open("/tmp/codex-oauth-debug.log", "a") as _dbg:
+                    _dbg.write(f"[{time.strftime('%H:%M:%S')}] GET {self2.path} state={received_state[0]} code={'code' in params}\n")
+                if callback_path not in self2.path:
+                    self2.send_response(302)
+                    self2.send_header("Location", "https://developers.google.com/gemini-code-assist/auth_failure_gemini")
+                    self2.end_headers()
+                    error_holder[0] = "unexpected request"
+                    return
                if "code" in params:
                    if received_state[0] != state:
                        self2.send_response(400)
@@ -2061,43 +2139,39 @@ class EditEndpointDialog(Gtk.Dialog):
                    self2.send_response(302)
                    self2.send_header("Location", "https://developers.google.com/gemini-code-assist/auth_failure_gemini")
                    self2.end_headers()
-            def log_message(self2, *a): pass
+            def log_message(self2, fmt, *args):
+                with open("/tmp/codex-oauth-debug.log", "a") as _dbg:
+                    _dbg.write(f"[{time.strftime('%H:%M:%S')}] {fmt % args}\n")

        try:
-            server = http.server.HTTPServer(("127.0.0.1", port), OAuthHandler)
+            bind_host = "localhost" if is_antigravity else "127.0.0.1"
+            server = http.server.HTTPServer((bind_host, port), OAuthHandler)
        except OSError:
            self._oauth_status.set_text(f"Port {port} already in use — close other apps and retry.")
            spinner.stop()
            dlg.run(); dlg.destroy()
            return

+        def _oauth_log(msg):
+            with open("/tmp/codex-oauth-debug.log", "a") as _f:
+                _f.write(f"[{time.strftime('%H:%M:%S')}] {msg}\n")
+
+        _oauth_log(f"Starting OAuth: port={port} redirect_uri={redirect_uri}")
+
        def wait_for_code():
+            _oauth_log("wait_for_code thread started")
+            deadline = time.time() + 120
+            while code_holder[0] is None and error_holder[0] is None and time.time() < deadline:
                server.handle_request()
            server.server_close()
-            GLib.idle_add(self._google_oauth_complete_gemini, dlg, code_holder, error_holder,
-                          CLIENT_ID, CLIENT_SECRET, redirect_uri, token_path, spinner, verifier)
-
-        threading.Thread(target=wait_for_code, daemon=True).start()
-        subprocess.Popen(["xdg-open", auth_url], stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
-        dlg.run()
-        dlg.destroy()
-
-    def _google_oauth_complete_gemini(self, dlg, code_holder, error_holder,
-                                       client_id, client_secret, redirect_uri, token_path, spinner, verifier):
-        spinner.stop()
-        if error_holder[0]:
-            self._oauth_status.set_markup(f'<span foreground="#e74c3c">Error: {error_holder[0]}</span>')
-            return
-        if not code_holder[0]:
-            self._oauth_status.set_text("No authorization code received.")
-            return
-
-        self._oauth_status.set_text("Exchanging code for token…")
+            _oauth_log(f"Server closed. code={'yes' if code_holder[0] else 'no'} error={'yes' if error_holder[0] else 'no'}")
+            if code_holder[0]:
                try:
+                    _oauth_log("Exchanging code for token...")
                    token_data = urllib.parse.urlencode({
                        "code": code_holder[0],
-                "client_id": client_id,
-                "client_secret": client_secret,
+                        "client_id": CLIENT_ID,
+                        "client_secret": CLIENT_SECRET,
                        "redirect_uri": redirect_uri,
                        "grant_type": "authorization_code",
                        "code_verifier": verifier,
@@ -2106,18 +2180,130 @@ class EditEndpointDialog(Gtk.Dialog):
                                                 headers={"Content-Type": "application/x-www-form-urlencoded"})
                    resp = urllib.request.urlopen(req, timeout=30)
                    tokens = json.loads(resp.read())
-            tokens["client_id"] = client_id
-            tokens["client_secret"] = client_secret
+                    tokens["client_id"] = CLIENT_ID
+                    tokens["client_secret"] = CLIENT_SECRET
+                    tokens["provider_kind"] = provider_kind
                    tokens["expires_at"] = time.time() + tokens.get("expires_in", 3600)
                    os.makedirs(os.path.dirname(token_path), exist_ok=True)
                    with open(token_path, "w") as f:
                        json.dump(tokens, f, indent=2)
                    os.chmod(token_path, 0o600)
-            self._entry_key.set_text(tokens.get("access_token", ""))
+                    _oauth_log(f"Token saved to {token_path}")
+                    project_id = ""
+                    try:
+                        _oauth_log("Discovering project ID via loadCodeAssist...")
+                        lr = urllib.request.Request(
+                            "https://cloudcode-pa.googleapis.com/v1internal:loadCodeAssist",
+                            data=json.dumps({}).encode(),
+                            headers={
+                                "Content-Type": "application/json",
+                                "Authorization": f"Bearer {tokens['access_token']}",
+                                "User-Agent": "google-api-nodejs-client/9.15.1",
+                            })
+                        lresp = urllib.request.urlopen(lr, timeout=15)
+                        ldata = json.loads(lresp.read())
+                        p = ldata.get("cloudaicompanionProject", "")
+                        if isinstance(p, dict):
+                            project_id = p.get("id", "")
+                        elif isinstance(p, str):
+                            project_id = p
+                        _oauth_log(f"Project ID: {project_id or '(none)'}")
+                        if project_id:
+                            tokens["project_id"] = project_id
+                            with open(token_path, "w") as f2:
+                                json.dump(tokens, f2, indent=2)
+                            os.chmod(token_path, 0o600)
+                    except Exception as pe:
+                        _oauth_log(f"loadCodeAssist failed (non-fatal): {pe}")
+                    if is_antigravity:
+                        found_models = [
+                            "gemini-2.5-flash", "gemini-2.5-pro",
+                            "gemini-3-flash-preview", "gemini-3-pro-preview", "gemini-3.1-pro-preview",
+                            "gemini-3-pro-low", "gemini-3-pro-high",
+                            "gemini-3.1-pro-low", "gemini-3.1-pro-high",
+                            "gemini-3-flash-low", "gemini-3-flash-medium", "gemini-3-flash-high",
+                            "claude-sonnet-4-6", "claude-opus-4-6-thinking",
+                            "claude-opus-4-6-thinking-low", "claude-opus-4-6-thinking-medium", "claude-opus-4-6-thinking-high",
+                            "gemini-claude-sonnet-4-6",
+                            "gemini-claude-opus-4-6-thinking-low", "gemini-claude-opus-4-6-thinking-medium", "gemini-claude-opus-4-6-thinking-high",
+                            "gemini-3-pro-image",
+                        ]
+                        probe_candidates = [
+                            "gemini-2.5-flash", "gemini-2.5-pro",
+                            "gemini-3-flash-preview", "gemini-3-pro-preview", "gemini-3.1-pro-preview",
+                        ]
+                        _oauth_log(f"Probing {len(probe_candidates)} model candidates...")
+                        for mc in probe_candidates:
+                            try:
+                                pr = urllib.request.Request(
+                                    "https://daily-cloudcode-pa.sandbox.googleapis.com/v1internal:generateContent",
+                                    data=json.dumps({
+                                        "project": project_id,
+                                        "model": mc,
+                                        "request": {"contents": [{"role": "user", "parts": [{"text": "x"}]}],
+                                                    "generationConfig": {"maxOutputTokens": 1}},
+                                    }).encode(),
+                                    headers={
+                                        "Content-Type": "application/json",
+                                        "Authorization": f"Bearer {tokens['access_token']}",
+                                        "User-Agent": "google-api-nodejs-client/9.15.1",
+                                        "Client-Metadata": "ideType=IDE_UNSPECIFIED,platform=PLATFORM_UNSPECIFIED,pluginType=GEMINI",
+                                    })
+                                pr.get_method = lambda: "POST"
+                                resp = urllib.request.urlopen(pr, timeout=10)
+                                resp.read()
+                                found_models.append(mc)
+                                _oauth_log(f"  {mc} → available")
+                            except urllib.error.HTTPError as e:
+                                if e.code == 429:
+                                    found_models.append(mc)
+                                    _oauth_log(f"  {mc} → available (rate limited)")
+                                else:
+                                    e.read()
+                                    _oauth_log(f"  {mc} → HTTP {e.code}")
+                            except Exception as e:
+                                _oauth_log(f"  {mc} → error: {e}")
+                    else:
+                        found_models = ["gemini-2.5-flash", "gemini-2.5-pro"]
+                    if found_models:
+                        tokens["available_models"] = found_models = list(dict.fromkeys(found_models))  # probing re-appends seeded IDs; dedupe, keep order
+                        with open(token_path, "w") as f3:
+                            json.dump(tokens, f3, indent=2)
+                        os.chmod(token_path, 0o600)
+                        _oauth_log(f"Discovered {len(found_models)} models: {found_models}")
+                    else:
+                        _oauth_log("No models discovered (will use defaults)")
+                    GLib.idle_add(self._oauth_success, dlg, tokens.get("access_token", ""), spinner)
+                    return
+                except urllib.error.HTTPError as e:
+                    body = e.read().decode(errors='replace')
+                    _oauth_log(f"Token exchange HTTP {e.code}: {body}")
+                    GLib.idle_add(self._oauth_failed, dlg, f"Token exchange failed ({e.code}): {body[:200]}", spinner)
+                    return
+                except Exception as e:
+                    _oauth_log(f"Token exchange FAILED: {e}")
+                    GLib.idle_add(self._oauth_failed, dlg, f"Token exchange failed: {e}", spinner)
+                    return
+            _oauth_log(f"OAuth failed: {error_holder[0] or 'timeout'}")
+            GLib.idle_add(self._oauth_failed, dlg,
+                          error_holder[0] or "No authorization code received.", spinner)
+
+        threading.Thread(target=wait_for_code, daemon=True).start()
+        subprocess.Popen(["xdg-open", auth_url], stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
+        dlg.connect("response", lambda d, r: d.destroy())
+        dlg.run()
+
+    def _oauth_success(self, dlg, access_token, spinner):
+        spinner.stop()
+        self._entry_key.set_text(access_token)
        self._oauth_status.set_markup('<span foreground="#27ae60" weight="bold">Authorization successful! Token saved.</span>')
        dlg.set_title("Google OAuth — Success")
-        except Exception as e:
-            self._oauth_status.set_markup(f'<span foreground="#e74c3c">Token exchange failed: {e}</span>')
+        GLib.timeout_add(1500, lambda: dlg.response(Gtk.ResponseType.OK))
+
+    def _oauth_failed(self, dlg, msg, spinner):
+        spinner.stop()
+        self._oauth_status.set_markup(f'<span foreground="#e74c3c">{GLib.markup_escape_text(msg)}</span>')
+        GLib.timeout_add(3000, lambda: dlg.response(Gtk.ResponseType.CANCEL))

    def _remove_model(self, path):
        current = self._combo_default.get_active_text()
@@ -2163,6 +2349,70 @@ class EditEndpointDialog(Gtk.Dialog):
            return True, None
        return False, err or "No models returned by endpoint"

+    def _diagnose_endpoint(self):
+        url = self._entry_url.get_text().strip()
+        key = self._entry_key.get_text().strip()
+        bt = self._combo_type.get_active_id() or "openai-compat"
+        model = self._combo_default.get_active_text() or ""
+
+        checks = []
+        def add(name, ok, detail=""):
+            checks.append((name, ok, detail))
+
+        parsed = urllib.parse.urlparse(url)
+        add("URL format", bool(parsed.scheme and parsed.netloc),
+            url if parsed.scheme else "Missing scheme (https://)")
+
+        try:
+            t0 = time.time()
+            ep = {"base_url": url, "api_key": key, "backend_type": bt}
+            ids, err = fetch_models_for_endpoint(ep)
+            lat = (time.time() - t0) * 1000
+            if ids:
+                add("Network reachable", True, f"{lat:.0f}ms")
+                add("Auth valid", True)
+                add("/models endpoint", True, f"{len(ids)} models in {lat:.0f}ms")
+                if model:
+                    add("Selected model exists", model in ids,
+                        model if model in ids else f"'{model}' not in {ids[:5]}...")
+                else:
+                    add("Selected model", False, "No model selected")
+            elif err and ("401" in str(err) or "403" in str(err)):
+                add("Network reachable", True, f"{lat:.0f}ms")
+                add("Auth valid", False, str(err)[:100])
+                add("/models endpoint", False, "Auth failed")
+            else:
+                add("Network reachable", False, str(err or "no response")[:100])
+        except Exception as e:
+            add("Network", False, str(e)[:100])
+
+        dlg = Gtk.Dialog(title="Endpoint Doctor", parent=self, modal=True)
+        dlg.add_button("Close", Gtk.ResponseType.CLOSE)
+        dlg.set_default_size(420, 300)
+        area = dlg.get_content_area()
+        area.set_margin_start(12)
+        area.set_margin_end(12)
+        area.set_margin_top(12)
+        area.set_margin_bottom(12)
+        area.set_spacing(4)
+        for name, ok, detail in checks:
+            row = Gtk.Box(spacing=6)
+            icon = Gtk.Label()
+            mark, color = ("\u2713", "#27ae60") if ok else ("\u2717", "#e74c3c")  # no backslashes inside f-string braces (SyntaxError before Python 3.12)
+            icon.set_markup(f'<span foreground="{color}" weight="bold">{mark}</span>')
+            row.pack_start(icon, False, False, 0)
+            lbl = Gtk.Label()
+            lbl.set_markup(f'<b>{name}</b>')
+            row.pack_start(lbl, False, False, 0)
+            if detail:
+                det = Gtk.Label()
+                det.set_markup(f'<span foreground="#7f8c8d" size="small">{GLib.markup_escape_text(detail)}</span>')
+                row.pack_end(det, False, False, 0)
+            area.pack_start(row, False, False, 0)
+        dlg.show_all()
+        dlg.run()
+        dlg.destroy()
+
    def _on_response(self, dialog, response):
        if response != Gtk.ResponseType.OK:
            self.destroy()
@@ -2172,7 +2422,7 @@ class EditEndpointDialog(Gtk.Dialog):
        if not name:
            self._show_error("Name is required")
            return
-        bt = self._combo_type.get_active_id()
+        bt = self._combo_type.get_active_id() or PROVIDER_PRESETS.get(self._combo_preset.get_active_text() or "", {}).get("backend_type") or "openai-compat"
        url = self._entry_url.get_text().strip()
        key = self._entry_key.get_text().strip()
        models = [self._model_store[i][0] for i in range(len(self._model_store))]
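The OAuth flow in this file derives an S256 PKCE challenge from a random verifier. A minimal standalone sketch of that derivation, mirroring the expression used in the diff, is shown here. The helper name `make_pkce_pair` is hypothetical and does not exist in the launcher.

```python
import base64
import hashlib
import secrets


def make_pkce_pair(nbytes=64):
    """Return (verifier, challenge) for the S256 PKCE method (hypothetical helper)."""
    # URL-safe random verifier; 64 random bytes encode to 86 characters,
    # inside the 43-128 character range that RFC 7636 allows.
    verifier = secrets.token_urlsafe(nbytes)
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    # Base64url-encode the digest and strip the "=" padding.
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge
```

Because the challenge contains only base64url characters, it can be placed in the authorization URL without further percent-encoding, which is consistent with the diff dropping the earlier `quote_plus` call.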
--- a/src/translate-proxy.py
+++ b/src/translate-proxy.py
@@ -11,7 +11,7 @@ Usage:
  python3 translate-proxy.py --backend openai-compat --target-url https://... --api-key sk-...
 """

-import json, http.server, socketserver, urllib.request, urllib.parse, urllib.error
+import json, http.server, socketserver, urllib.request, urllib.parse, urllib.error, re
 import time, uuid, os, sys, argparse, threading, socket, collections, contextlib, signal

 # ═══════════════════════════════════════════════════════════════════
@@ -107,9 +107,57 @@ _active_connections = 0
 _active_connections_lock = threading.Lock()

 _pool = uuid.uuid4().hex[:8]
+_antigravity_version = "1.18.3"
+_antigravity_version_checked = 0
+_antigravity_version_lock = threading.Lock()
+
+def _fetch_antigravity_version():
+    cache_path = os.path.join(os.path.expanduser("~"), ".cache", "codex-proxy", "antigravity-version.json")
+    try:
+        with open(cache_path) as f:
+            cached = json.load(f)
+        if cached.get("version") and cached.get("checked_at", 0) > time.time() - 6 * 3600:
+            return cached["version"]
+    except Exception:
+        pass
+    urls = [
+        ("https://antigravity-auto-updater-974169037036.us-central1.run.app", None),
+        ("https://antigravity.google/changelog", 5000),
+    ]
+    for url, limit in urls:
+        try:
+            req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
+            resp = urllib.request.urlopen(req, timeout=5)
+            text = resp.read().decode(errors="replace")
+            if limit:
+                text = text[:limit]
+            m = re.search(r"\d+\.\d+\.\d+", text)
+            if m:
+                version = m.group(0)
+                try:
+                    os.makedirs(os.path.dirname(cache_path), exist_ok=True)
+                    with open(cache_path, "w") as f:
+                        json.dump({"version": version, "checked_at": time.time()}, f)
+                except Exception:
+                    pass
+                return version
+        except Exception:
+            pass
+    return _antigravity_version
+
+def _ensure_antigravity_version():
+    global _antigravity_version, _antigravity_version_checked
+    if time.time() - _antigravity_version_checked < 6 * 3600:
+        return _antigravity_version
+    with _antigravity_version_lock:
+        if time.time() - _antigravity_version_checked < 6 * 3600:
+            return _antigravity_version
+        _antigravity_version = _fetch_antigravity_version()
+        _antigravity_version_checked = time.time()
+        return _antigravity_version

 def _init_runtime():
-    global CONFIG, PORT, BACKEND, TARGET_URL, API_KEY, OAUTH_PROVIDER
+    global CONFIG, PORT, BACKEND, TARGET_URL, API_KEY, OAUTH_PROVIDER, _antigravity_version
    global MODELS, CC_VERSION, REASONING_ENABLED, REASONING_EFFORT, BGP_ROUTES

    CONFIG = load_config()
@@ -117,12 +165,15 @@ def _init_runtime():
    BACKEND = CONFIG["backend_type"]
    TARGET_URL = CONFIG["target_url"].rstrip("/")
    API_KEY = CONFIG["api_key"]
-    OAUTH_PROVIDER = CONFIG.get("oauth_provider", "")
+    OAUTH_PROVIDER = CONFIG.get("oauth_provider") or ""
    MODELS = CONFIG["models"]
    CC_VERSION = CONFIG.get("cc_version", "")
    REASONING_ENABLED = CONFIG.get("reasoning_enabled", True)
    REASONING_EFFORT = CONFIG.get("reasoning_effort", "medium")
    BGP_ROUTES = CONFIG.get("bgp_routes", [])
+    if OAUTH_PROVIDER == "google-antigravity":
+        _antigravity_version = _ensure_antigravity_version()
+        print(f"[antigravity] version={_antigravity_version}", file=sys.stderr)

    bgp_models = []
    for _r in BGP_ROUTES:
@@ -134,13 +185,33 @@ def _init_runtime():
        MODELS = [{"id": m, "object": "model", "created": 1700000000, "owned_by": "bgp"} for m in bgp_models]
        CONFIG["models"] = MODELS

+    if (BACKEND or "").startswith("gemini-oauth") and (OAUTH_PROVIDER or "").startswith("google"):
+        token_name = "google-antigravity-oauth-token.json" if OAUTH_PROVIDER == "google-antigravity" else "google-cli-oauth-token.json"
+        token_path = os.path.join(os.path.expanduser("~"), ".cache", "codex-proxy", token_name)
+        try:
+            with open(token_path) as _tf:
+                _td = json.load(_tf)
+            _discovered = [] if OAUTH_PROVIDER == "google-antigravity" else _td.get("available_models", [])
+            if _discovered:
+                _seen = []
+                for _m in _discovered:
+                    if _m not in _seen:
+                        _seen.append(_m)
+                MODELS = [{"id": m, "object": "model", "created": 1700000000, "owned_by": "gemini-oauth"} for m in _seen]
+                CONFIG["models"] = MODELS
+                print(f"[gemini-oauth] loaded {len(_seen)} discovered models: {_seen}", file=sys.stderr)
+        except Exception:
+            pass
+
 def _refresh_oauth_token():
    return _refresh_oauth_token_for(API_KEY, OAUTH_PROVIDER)

 def _refresh_oauth_token_for(api_key, oauth_provider):
-    if oauth_provider != "google":
+    oauth_provider = oauth_provider or ""
+    if not oauth_provider.startswith("google"):
        return api_key
-    token_path = os.path.join(os.path.expanduser("~"), ".cache", "codex-proxy", "google-oauth-token.json")
+    token_name = "google-antigravity-oauth-token.json" if oauth_provider == "google-antigravity" else "google-cli-oauth-token.json"
+    token_path = os.path.join(os.path.expanduser("~"), ".cache", "codex-proxy", token_name)
    if not os.path.exists(token_path):
        return api_key
    try:
@@ -329,6 +400,70 @@ _CROF_ADAPTIVE = {
    "min_keep_recent": 4,
 }

+_BGP_STATS_PATH = os.path.join(_LOG_DIR, "bgp-route-stats.json")
+_bgp_stats_lock = threading.Lock()
+
+def _route_key(route):
+    return f"{route.get('name', '')}::{route.get('target_url', '')}::{route.get('model', '')}"
+
+def _load_bgp_stats():
+    try:
+        with open(_BGP_STATS_PATH) as f:
+            return json.load(f)
+    except Exception:
+        pass
+    return {}
+
+def _save_bgp_stats(stats):
+    tmp = _BGP_STATS_PATH + ".tmp"
+    with open(tmp, "w") as f:
+        json.dump(stats, f, indent=2)
+    os.replace(tmp, _BGP_STATS_PATH)
+
+def _score_route(route, stats):
+    key = _route_key(route)
+    rs = stats.get(key, {})
+    now = time.time()
+    if float(rs.get("open_until_ts", 0)) > now:
+        return 1_000_000
+    priority = int(route.get("priority", 99))
+    ewma = float(rs.get("ewma_latency_s", 0))
+    failures = int(rs.get("consecutive_failures", 0))
+    score = priority + min(ewma * 5, 50) + failures * 20
+    if float(rs.get("rate_limited_until", 0)) > now:
+        score += 500
+    return score
+
+def _update_route_stats(route, success, duration_s, http_code=None, error_type=None):
+    with _bgp_stats_lock:
+        stats = _load_bgp_stats()
+        key = _route_key(route)
+        rs = stats.setdefault(key, {
+            "ewma_latency_s": duration_s, "consecutive_failures": 0,
+            "last_success": None, "last_failure": None,
+            "open_until_ts": 0, "rate_limited_until": 0, "last_error": None,
+        })
+        alpha = 0.25
+        rs["ewma_latency_s"] = alpha * duration_s + (1 - alpha) * float(rs.get("ewma_latency_s", duration_s))
+        if success:
+            rs["consecutive_failures"] = 0
+            rs["last_success"] = time.time()
+        else:
+            rs["consecutive_failures"] = int(rs.get("consecutive_failures", 0)) + 1
+            rs["last_failure"] = time.time()
+            rs["last_error"] = error_type or (f"http_{http_code}" if http_code else "unknown")
+            if http_code == 429:
+                rs["rate_limited_until"] = time.time() + 120
+            if rs["consecutive_failures"] >= 3:
+                rs["open_until_ts"] = time.time() + 60
+                rs["consecutive_failures"] = 0
+        _save_bgp_stats(stats)
+
+def _sorted_bgp_routes():
+    with _bgp_stats_lock:
+        stats = _load_bgp_stats()
+    return sorted(BGP_ROUTES, key=lambda r: _score_route(r, stats))
+
 def _crof_record(model, n_items, success):
    if not isinstance(n_items, int) or n_items < 1:
        return
@@ -536,6 +671,193 @@ def _compact_input(input_data):
    print(f"[compact] {len(input_data)} items -> {len(head) + 1 + len(tail)} (compacted {len(body)} old items into summary)", file=sys.stderr)
    return head + [summary_msg] + tail

+# ═══════════════════════════════════════════════════════════════════
+# Provider policies
+# ═══════════════════════════════════════════════════════════════════
+
+_PROVIDER_POLICIES = {
+    "crof": {"reasoning_mode": "off", "max_tokens": 32768, "strip_reasoning": True,
+             "tool_output_limit": 4000, "max_input_items": 18, "compaction": "aggressive"},
+    "chats-llm": {"reasoning_mode": "off", "max_tokens": 32768, "strip_reasoning": True,
+                  "tool_output_limit": 4000, "max_input_items": 20, "compaction": "aggressive"},
+    "z.ai": {"reasoning_mode": "medium", "max_tokens": 65536, "strip_reasoning": True,
+             "tool_output_limit": 8000, "max_input_items": 40, "compaction": "balanced"},
+    "openrouter": {"reasoning_mode": "provider_default", "max_tokens": 32768, "strip_reasoning": True,
+                   "tool_output_limit": 6000, "max_input_items": 35, "compaction": "balanced"},
+    "openadapter": {"reasoning_mode": "off", "max_tokens": 32768, "strip_reasoning": True,
+                    "tool_output_limit": 6000, "max_input_items": 30, "compaction": "balanced"},
+}
+
+def provider_policy(target_url=None, backend=None):
+    host = urllib.parse.urlparse(target_url or TARGET_URL).netloc.lower()
+    for key, policy in _PROVIDER_POLICIES.items():
+        if key in host:
+            return policy
+    return {}
+
+# ═══════════════════════════════════════════════════════════════════
+# Adaptive context compaction (model-aware)
+# ═══════════════════════════════════════════════════════════════════
+
+_MODEL_CONTEXT = {
+    "gpt-4o": 128000, "gpt-4o-mini": 128000, "gpt-5": 128000,
+    "claude-sonnet": 200000, "claude-haiku": 200000,
+    "glm-5.1": 128000, "glm-5": 128000, "glm-4": 128000,
+    "deepseek": 64000, "gemini-2.5-flash": 1000000, "gemini-2.5-pro": 2000000,
+    "mimo": 32768, "minimax": 32768, "kimi": 128000,
+    "_default": 32768,
+}
+
+def _context_limit_for_model(model):
+    if not model:
+        return _MODEL_CONTEXT["_default"]
+    ml = model.lower()
+    for key, limit in _MODEL_CONTEXT.items():
+        if key != "_default" and key in ml:
+            return limit
+    return _MODEL_CONTEXT["_default"]
+
+def _estimate_tokens(obj):
+    if obj is None:
+        return 0
+    if isinstance(obj, str):
+        return max(1, len(obj) // 4)
+    try:
+        raw = json.dumps(obj, ensure_ascii=False)
+    except Exception:
+        raw = str(obj)
+    return max(1, len(raw) // 4)
+
+def _adaptive_compact(input_data, model, policy=None):
+    policy = policy or {}
+    context_size = int(policy.get("context_size", _context_limit_for_model(model)))
+    input_budget = int(context_size * 0.60)
+    estimated = _estimate_tokens(input_data)
+    if estimated <= input_budget:
+        return input_data, False
+    if not isinstance(input_data, list):
+        return input_data, False
+    reduction = max(0.15, input_budget / max(estimated, 1))
+    target_items = max(int(len(input_data) * reduction), 6)
+    if target_items >= len(input_data):
+        return input_data, False
+    head_end = 0
+    for i, item in enumerate(input_data):
+        t = item.get("type")
+        if t == "message" and item.get("role") in ("developer", "system"):
+            head_end = i + 1
+        elif t == "message" and item.get("role") == "user" and head_end == i:
+            head_end = i + 1
+        else:
+            break
+    head = input_data[:head_end]
+    keep = max(4, target_items // 3)
+    tail_start = max(head_end, len(input_data) - keep)
+    while tail_start > head_end:
+        t = input_data[tail_start].get("type")
+        if t in ("function_call_output", "function_call"):
+            tail_start -= 1
+        elif t == "message" and input_data[tail_start].get("role") == "assistant":
+            tail_start -= 1
+        else:
+            break
+    tail = input_data[tail_start:]
+    body = input_data[head_end:tail_start]
+    if not body:
+        return head + tail, True
+    summary_lines = [f"[Auto-compacted: {len(body)} turns removed (budget={input_budget}tok, model={model})]"]
+    for item in body[-5:]:
+        summary_lines.append(_item_summary(item, max_len=120))
+    summary_msg = {"type": "message", "role": "user",
+                   "content": [{"type": "input_text", "text": "\n".join(summary_lines)}]}
+    print(f"[adaptive-compact] model={model} est={estimated}tok budget={input_budget}tok "
+          f"items {len(input_data)}->{len(head)+1+len(tail)}", file=sys.stderr)
+    return head + [summary_msg] + tail, True
+
+# ═══════════════════════════════════════════════════════════════════
+# Tool-call pairing validator
+# ═══════════════════════════════════════════════════════════════════
+
+def validate_tool_pairs(input_items):
+    if not isinstance(input_items, list):
+        return []
+    calls = {}
+    errors = []
+    for idx, item in enumerate(input_items):
+        t = item.get("type")
+        if t == "function_call":
+            cid = item.get("call_id") or item.get("id")
+            if cid:
+                calls[cid] = idx
+        elif t == "function_call_output":
+            cid = item.get("call_id") or item.get("id")
+            if not cid or cid not in calls:
+                errors.append({"index": idx, "call_id": cid, "error": "orphan_function_call_output"})
+    return errors
+
+def repair_orphan_tool_outputs(input_items, errors):
+    bad = {e["index"] for e in errors}
+    repaired = []
+    for idx, item in enumerate(input_items):
+        if idx in bad:
+            output = item.get("output", "")
+            repaired.append({"type": "message", "role": "user",
+                             "content": [{"type": "input_text",
+                                          "text": f"[Proxy: unmatched tool output]\n{str(output)[:4000]}"}]})
+        else:
+            repaired.append(item)
+    return repaired
+
+# ═══════════════════════════════════════════════════════════════════
+# Log redaction
+# ═══════════════════════════════════════════════════════════════════
+
+_SECRET_PATTERNS = [
+    (r"sk-ant-[A-Za-z0-9_\-]{20,}", "[REDACTED:anthropic]"),
+    (r"sk-[A-Za-z0-9_\-]{20,}", "[REDACTED:key]"),
+    (r"gh[pousr]_[A-Za-z0-9_]{20,}", "[REDACTED:github]"),
+    (r"Bearer\s+[A-Za-z0-9._\-]{20,}", "Bearer [REDACTED]"),
+]
+
+def _redact(text):
+    if not text:
+        return text
+    import re
+    for pattern, replacement in _SECRET_PATTERNS:
+        text = re.sub(pattern, replacement, text)
+    return text
+
+# ═══════════════════════════════════════════════════════════════════
+# Rate-limit token buckets
+# ═══════════════════════════════════════════════════════════════════
+
+class TokenBucket:
+    def __init__(self, capacity=10, refill=1.0):
+        self.capacity = float(capacity)
+        self.tokens = float(capacity)
+        self.refill = float(refill)
+        self.updated = time.monotonic()
+        self.lock = threading.Lock()
+    def allow(self, cost=1):
+        with self.lock:
+            now = time.monotonic()
+            self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.refill)
+            self.updated = now
+            if self.tokens >= cost:
+                self.tokens -= cost
+                return True
+            return False
+
+_rate_buckets = {}
+_rate_buckets_lock = threading.Lock()
+
+def _bucket_for_route(route):
+    name = route.get("name") or route.get("target_url") or "default"
+    with _rate_buckets_lock:
+        if name not in _rate_buckets:
+            _rate_buckets[name] = TokenBucket(capacity=10, refill=1.0)
+        return _rate_buckets[name]
+
 # ═══════════════════════════════════════════════════════════════════
 # OpenAI-compat backend
 # ═══════════════════════════════════════════════════════════════════
@@ -1154,14 +1476,31 @@ class Handler(http.server.BaseHTTPRequestHandler):
            self._handle_anthropic(body, model, stream)
        elif BACKEND == "command-code":
            self._handle_command_code(body, model, stream)
+        elif (BACKEND or "").startswith("gemini-oauth"):
+            self._handle_gemini_oauth(body, model, stream)
        else:
            self._handle_openai_compat(body, model, stream)

    def _handle_openai_compat(self, body, model, stream):
        input_data = body.get("input", "")
+        policy = provider_policy()
+
+        pair_errors = validate_tool_pairs(input_data)
+        if pair_errors:
+            print(f"[tool-validator] repairing {len(pair_errors)} orphan tool outputs", file=sys.stderr)
+            input_data = repair_orphan_tool_outputs(input_data, pair_errors)
+            body = dict(body)
+            body["input"] = input_data
+
+        compacted = False
+        if policy.get("compaction") and isinstance(input_data, list):
+            input_data, compacted = _adaptive_compact(input_data, model, policy)
+            if compacted:
+                body = dict(body)
+                body["input"] = input_data

        crof_limit = _crof_item_limit(model)
-        if isinstance(input_data, list) and len(input_data) > crof_limit:
+        if not compacted and isinstance(input_data, list) and len(input_data) > crof_limit:
            print(f"[crof-adaptive] proactive compact: {len(input_data)} items > limit {crof_limit}", file=sys.stderr)
            input_data = _crof_compact_for_retry(input_data, model)
            body = dict(body)
@@ -1228,8 +1567,379 @@ class Handler(http.server.BaseHTTPRequestHandler):
            chat_body["reasoning_effort"] = REASONING_EFFORT
        return chat_body

+    def _handle_gemini_oauth(self, body, model, stream):
+        input_data = body.get("input", "")
+        policy = provider_policy()
+        if OAUTH_PROVIDER == "google-antigravity":
+            alias_map = {
+                "antigravity-gemini-3-flash": "gemini-3-flash",
+                "antigravity-gemini-3-pro": "gemini-3-pro-low",
+                "antigravity-gemini-3.1-pro": "gemini-3.1-pro-low",
+                "gemini-3-flash-preview": "gemini-3-flash",
+                "gemini-3-pro-preview": "gemini-3-pro-low",
+                "gemini-3.1-pro-preview": "gemini-3.1-pro-low",
+                "gemini-3-pro": "gemini-3-pro-low",
+                "gemini-3.1-pro": "gemini-3.1-pro-low",
+                "antigravity-claude-sonnet-4-6": "claude-sonnet-4-6",
+                "antigravity-claude-opus-4-6-thinking": "claude-opus-4-6-thinking",
+            }
+            model = alias_map.get(model, model)
+
+        pair_errors = validate_tool_pairs(input_data)
+        if pair_errors:
+            input_data = repair_orphan_tool_outputs(input_data, pair_errors)
+            body = dict(body)
+            body["input"] = input_data
+
+        compacted = False
+        if policy.get("compaction") and isinstance(input_data, list):
+            input_data, compacted = _adaptive_compact(input_data, model, policy)
+            if compacted:
+                body = dict(body)
+                body["input"] = input_data
+
+        access_token = _refresh_oauth_token()
+        token_name = "google-antigravity-oauth-token.json" if OAUTH_PROVIDER == "google-antigravity" else "google-cli-oauth-token.json"
+        token_path = os.path.join(os.path.expanduser("~"), ".cache", "codex-proxy", token_name)
+        project_id = ""
+        try:
+            with open(token_path) as f:
+                project_id = json.load(f).get("project_id", "")
+        except Exception:
+            pass
+
+        contents = []
+        system_parts = []
+        instructions = (body.get("instructions") or "").strip()
+        tool_call_names = {}
+
+        if isinstance(input_data, list):
+            for item in input_data:
+                t = item.get("type")
+                if t == "message":
+                    role = "user" if item.get("role") == "user" else "model"
+                    content = item.get("content", "")
+                    if isinstance(content, list):
+                        parts = []
+                        for c in content:
+                            ct = c.get("type")
+                            if ct == "input_text":
+                                parts.append({"text": c.get("text", "")})
+                            elif ct == "text":
+                                parts.append({"text": c.get("text", "")})
+                            elif ct == "input_image" or ct == "image_url":
+                                iu = c.get("image_url") or c.get("url", {})
+                                url = iu.get("url", iu) if isinstance(iu, dict) else iu
+                                if isinstance(url, str) and url.startswith("data:"):
+                                    mime, _, b64 = url.partition(";base64,")
+                                    mime = mime.replace("data:", "") or "image/png"
+                                    parts.append({"inlineData": {"mimeType": mime, "data": b64}})
+                                else:
+                                    parts.append({"text": str(url)})
+                        if parts:
+                            contents.append({"role": role, "parts": parts})
+                    elif isinstance(content, str):
+                        contents.append({"role": role, "parts": [{"text": content}]})
+                elif t == "function_call":
+                    call_id = item.get("call_id") or item.get("id") or f"call_{uuid.uuid4().hex[:24]}"
+                    fname = item.get("name", "")
+                    if call_id and fname:
+                        tool_call_names[call_id] = fname
+                    args = item.get("arguments", "{}")
+                    if isinstance(args, str):
+                        try:
+                            args = json.loads(args)
+                        except Exception:
+                            args = {}
+                    contents.append({"role": "model", "parts": [{"functionCall": {"name": fname, "args": args, "id": call_id}, "thoughtSignature": "skip_thought_signature_validator"}]})
+                elif t == "function_call_output":
+                    call_id = item.get("call_id") or item.get("id") or ""
+                    output = item.get("output", "")
+                    fname = item.get("name", "") or tool_call_names.get(call_id, "")
+                    try:
+                        output_parsed = json.loads(output) if isinstance(output, str) else output
+                    except Exception:
+                        output_parsed = output
+                    resp_part = {"functionResponse": {"name": fname or "unknown", "response": {"result": output_parsed if isinstance(output_parsed, (dict, list)) else output}}}
+                    if call_id:
+                        resp_part["functionResponse"]["id"] = call_id
+                    contents.append({"role": "user", "parts": [resp_part]})
+
+        if OAUTH_PROVIDER.startswith("google"):
+            sanitized = []
+            last_user_text = None
+            last_role = None
+            for content in contents:
+                role = content.get("role")
+                parts = [p for p in content.get("parts", []) if isinstance(p, dict)]
+                if not parts:
+                    continue
+                text_key = "\n".join([p.get("text", "") for p in parts if "text" in p]).strip()
+                if role == "user" and text_key and text_key == last_user_text:
+                    continue
+                if role == last_role and role in ("user", "model") and sanitized:
+                    sanitized[-1].setdefault("parts", []).extend(parts)
+                else:
+                    sanitized.append({"role": role, "parts": parts})
+                if role == "user" and text_key:
+                    last_user_text = text_key
+                last_role = role
+            while sanitized and sanitized[0].get("role") != "user":
+                sanitized.pop(0)
+            while sanitized and sanitized[-1].get("role") != "user":
+                sanitized.pop()
+            contents = sanitized
+
+        if instructions:
+            system_parts.append({"text": instructions})
+        if OAUTH_PROVIDER == "google-antigravity":
+            system_parts.append({"text": (
+                "You are connected through a Responses API translation proxy. "
+                "If tools are available and the user's request requires changing files, call the appropriate tool immediately. "
+                "Do not announce plans, do not say you will list files, browse, fetch, inspect, or start by exploring unless you are emitting the actual tool call in the same response. "
+                "For file creation requests, use tools to create or modify the file instead of only printing code in chat. "
+                "If no suitable tool is available, answer directly with the complete result. "
+                "Never answer only with a plan such as 'I will start by...' or 'I am going to...'."
+            )})
+
+        gen_config = {}
+        mot = body.get("max_output_tokens", 0)
+        if mot:
+            gen_config["maxOutputTokens"] = mot
+        if body.get("temperature") is not None:
+            gen_config["temperature"] = body["temperature"]
+        if body.get("top_p") is not None:
+            gen_config["topP"] = body["top_p"]
+
+        if REASONING_ENABLED and REASONING_EFFORT != "none":
+            budget = {"low": 2048, "medium": 8192, "high": 24576}.get(REASONING_EFFORT, 8192)
+            gen_config["thinkingConfig"] = {"includeThoughts": True, "thinkingBudget": budget}
+
+        oa_tools = body.get("tools", [])
+        gemini_tools = []
+        if oa_tools:
+            func_decls = []
+            for tool in oa_tools:
+                ttype = tool.get("type", "function")
+                fname = tool.get("name", "")
+                if ttype == "function":
+                    fn = tool.get("function", tool)
+                    name = fn.get("name", fname)
+                    desc = fn.get("description", "")
+                    params = fn.get("parameters", fn.get("input_schema", {}))
+                    func_decls.append({"name": name, "description": desc, "parameters": params})
+                elif fname:
+                    func_decls.append({"name": fname, "description": tool.get("description", ""), "parameters": tool.get("parameters", {"type": "object", "properties": {}})})
+            if func_decls:
+                gemini_tools = [{"functionDeclarations": func_decls}]
+
+        request_body = {"contents": contents}
+        if system_parts:
+            request_body["systemInstruction"] = {"parts": system_parts}
+        if gen_config:
+            request_body["generationConfig"] = gen_config
+        if gemini_tools:
+            request_body["tools"] = gemini_tools
+
+        wrapped = {
+            "project": project_id,
+            "model": model,
+            "request": request_body,
+        }
+        if OAUTH_PROVIDER == "google-antigravity":
+            wrapped["requestType"] = "agent"
+            wrapped["userAgent"] = "antigravity"
+            wrapped["requestId"] = f"agent-{uuid.uuid4().hex[:12]}"
+
+        endpoints = ([
+            "https://daily-cloudcode-pa.sandbox.googleapis.com",
+            "https://autopush-cloudcode-pa.sandbox.googleapis.com",
+            "https://cloudcode-pa.googleapis.com",
+        ] if OAUTH_PROVIDER == "google-antigravity" else [
+            "https://cloudcode-pa.googleapis.com",
+        ])
+        action = "streamGenerateContent" if stream else "generateContent"
+        url_suffix = f"v1internal:{action}?alt=sse" if stream else f"v1internal:{action}"
+
+        headers = {
+            "Content-Type": "application/json",
+            "Authorization": f"Bearer {access_token}",
+        }
+        if OAUTH_PROVIDER == "google-antigravity":
+            version = _ensure_antigravity_version()
+            headers["User-Agent"] = f"antigravity/{version} darwin/arm64"
+        else:
+            headers["User-Agent"] = "google-api-nodejs-client/9.15.1"
+            headers["X-Goog-Api-Client"] = "gl-node/22.17.0"
+            headers["Client-Metadata"] = "ideType=IDE_UNSPECIFIED,platform=PLATFORM_UNSPECIFIED,pluginType=GEMINI"
+        body_b = json.dumps(wrapped).encode()
+        print(f"[gemini-oauth] model={model} stream={stream} items={len(input_data) if isinstance(input_data, list) else 1} project={project_id}", file=sys.stderr)
+
+        for ep in endpoints:
+            target = f"{ep}/{url_suffix}"
+            req = urllib.request.Request(target, data=body_b, headers=headers)
+            try:
+                upstream = urllib.request.urlopen(req, timeout=_upstream_timeout(body, stream))
+                break
+            except urllib.error.HTTPError as e:
+                err_body = e.read().decode(errors="replace")
+                if e.code == 400 and OAUTH_PROVIDER.startswith("google"):
+                    try:
+                        debug_path = os.path.join(_LOG_DIR, "gemini-last-400-request.json")
+                        with open(debug_path, "w") as dbg:
+                            json.dump({"endpoint": ep, "model": model, "wrapped": wrapped, "error": err_body}, dbg, indent=2)
+                        print(f"[gemini-oauth] saved 400 debug request to {debug_path}", file=sys.stderr)
+                    except Exception:
+                        pass
+                if e.code == 429 and ep != endpoints[-1]:
+                    print(f"[gemini-oauth] {ep} HTTP 429, trying next endpoint", file=sys.stderr)
+                    continue
+                return self.send_json(e.code, {"error": {"type": "upstream_error", "message": err_body}})
+            except Exception as e:
+                if ep == endpoints[-1]:
+                    return self.send_json(502, {"error": {"type": "proxy_error", "message": str(e)}})
+                print(f"[gemini-oauth] {ep} failed: {e}, trying next", file=sys.stderr)
+                continue
+
+        if stream:
+            self._forward_gemini_sse(upstream, model, body, input_data)
+        else:
+            self._forward_gemini_json(upstream, model, body, input_data)
+
+    def _forward_gemini_sse(self, upstream, model, body, input_data):
+        resp_id = f"resp-{uuid.uuid4().hex[:24]}"
+        created = int(time.time())
+        self.send_response(200)
+        self.send_header("Content-Type", "text/event-stream")
+        self.send_header("Cache-Control", "no-cache")
+        self.send_header("Connection", "keep-alive")
+        self.end_headers()
+
+        full_text = ""
+        output_items = []
+        current_tool_calls = {}
+        message_started = False
+        message_id = f"msg-{uuid.uuid4().hex[:24]}"
+
+        def flush_event(event_type, data):
+            self.wfile.write(f"event: {event_type}\ndata: {json.dumps(data)}\n\n".encode())
+            self.wfile.flush()
+
+        flush_event("response.created", {"type": "response.created", "response": {"id": resp_id, "object": "response", "model": model, "status": "in_progress", "created": created, "output": []}})
+        flush_event("response.in_progress", {"type": "response.in_progress", "response": {"id": resp_id}})
+
+        buf = ""
+        stream_finished = False
+        for raw_line in upstream:
+            if stream_finished:
+                break
+            line = raw_line.decode(errors="replace")
+            if line.startswith("data: "):
+                buf += line[6:]
+                continue
+            if not line.strip() and buf:
+                try:
+                    chunk = json.loads(buf)
+                except Exception:
+                    buf = ""
+                    continue
+                buf = ""
+
+                candidates = chunk.get("response", chunk).get("candidates", [])
+                if not candidates:
+                    if chunk.get("error"):
+                        print(f"[gemini-oauth] stream error chunk: {str(chunk.get('error'))[:300]}", file=sys.stderr)
+                    continue
+                if candidates[0].get("finishReason") and not candidates[0].get("content", {}).get("parts"):
+                    print(f"[gemini-oauth] finish without parts: {candidates[0].get('finishReason')}", file=sys.stderr)
+                parts = candidates[0].get("content", {}).get("parts", [])
+                for part in parts:
+                    if part.get("thought"):
+                        continue
+                    if "text" in part and not part.get("functionCall"):
+                        text_delta = part["text"]
+                        if not text_delta:
+                            continue
+                        full_text += text_delta
+                        if not message_started:
+                            flush_event("response.output_item.added", {"type": "response.output_item.added", "output_index": 0, "item": {"type": "message", "id": message_id, "role": "assistant", "content": []}})
+                            flush_event("response.content_part.added", {"type": "response.content_part.added", "output_index": 0, "content_index": 0, "part": {"type": "output_text", "text": ""}})
+                            output_items.append({"text": True})
+                            message_started = True
+                        flush_event("response.output_text.delta", {"type": "response.output_text.delta", "output_index": 0, "content_index": 0, "delta": text_delta})
+                    elif part.get("functionCall"):
+                        fc = part["functionCall"]
+                        call_id = f"call_{uuid.uuid4().hex[:24]}"
+                        args_str = json.dumps(fc.get("args", fc.get("arguments", {})))
+                        output_index = len(output_items)
+                        flush_event("response.output_item.added", {"type": "response.output_item.added", "output_index": output_index, "item": {"type": "function_call", "id": call_id, "call_id": call_id, "name": fc.get("name", ""), "arguments": ""}})
+                        flush_event("response.function_call_arguments.delta", {"type": "response.function_call_arguments.delta", "output_index": output_index, "item_id": call_id, "delta": args_str})
+                        flush_event("response.function_call_arguments.done", {"type": "response.function_call_arguments.done", "output_index": output_index, "item_id": call_id, "arguments": args_str})
+                        current_tool_calls[call_id] = fc
+                        output_items.append({"tool": True})
+                if OAUTH_PROVIDER == "google-antigravity" and full_text and candidates[0].get("finishReason"):
+                    stream_finished = True
+                    break
+
+        out = []
+        if not full_text and not current_tool_calls:
+            print("[gemini-oauth] WARNING: completed with empty output", file=sys.stderr)
+        if full_text:
+            out.append({"type": "message", "id": message_id, "role": "assistant", "content": [{"type": "output_text", "text": full_text}]})
+        tool_outputs = []
+        for cid, fc in current_tool_calls.items():
+            tool_outputs.append({"type": "function_call", "id": cid, "call_id": cid, "name": fc.get("name", ""), "arguments": json.dumps(fc.get("args", fc.get("arguments", {})))})
+        out.extend(tool_outputs)
+
+        final_resp = {"id": resp_id, "object": "response", "model": model, "status": "completed", "created": created, "output": out}
+        if full_text:
+            flush_event("response.output_text.done", {"type": "response.output_text.done", "output_index": 0, "content_index": 0, "text": full_text})
+            flush_event("response.content_part.done", {"type": "response.content_part.done", "output_index": 0, "content_index": 0, "part": {"type": "output_text", "text": full_text}})
+            flush_event("response.output_item.done", {"type": "response.output_item.done", "output_index": 0, "item": out[0]})
+        for idx, item in enumerate(tool_outputs, start=(1 if full_text else 0)):
+            flush_event("response.output_item.done", {"type": "response.output_item.done", "output_index": idx, "item": item})
+        flush_event("response.completed", {"type": "response.completed", "response": final_resp})
+        self.close_connection = True
+
+        with _response_store_lock:
+            _response_store[resp_id] = final_resp
+            while len(_response_store) > _MAX_STORED:
+                _response_store.popitem(last=False)
+
+    def _forward_gemini_json(self, upstream, model, body, input_data):
+        data = json.loads(upstream.read().decode())
+        resp_id = f"resp-{uuid.uuid4().hex[:24]}"
+        created = int(time.time())
+        out = []
+        full_text = ""
+        candidates = data.get("response", data).get("candidates", [])
+        if candidates:
+            parts = candidates[0].get("content", {}).get("parts", [])
+            text_parts = []
+            for part in parts:
+                if part.get("thought"):
+                    continue
+                if "text" in part and not part.get("functionCall"):
+                    text_parts.append(part["text"])
+                elif part.get("functionCall"):
+                    fc = part["functionCall"]
+                    call_id = f"call_{uuid.uuid4().hex[:24]}"
+                    out.append({"type": "function_call", "id": call_id, "call_id": call_id, "name": fc.get("name", ""), "arguments": json.dumps(fc.get("args", fc.get("arguments", {})))})
+            if text_parts:
+                full_text = "".join(text_parts)
+                out.insert(0, {"type": "message", "id": f"msg-{uuid.uuid4().hex[:24]}", "role": "assistant", "content": [{"type": "output_text", "text": full_text}]})
+        resp = {"id": resp_id, "object": "response", "model": model, "status": "completed", "created": created, "output": out}
+        with _response_store_lock:
+            _response_store[resp_id] = resp
+            while len(_response_store) > _MAX_STORED:
+                _response_store.popitem(last=False)
+        self.send_json(200, resp)
+
    def _handle_bgp(self, body, model, stream, messages, input_data):
-        routes = sorted(BGP_ROUTES, key=lambda r: r.get("priority", 99))
+        routes = _sorted_bgp_routes()
+        routes = [r for r in routes if _bucket_for_route(r).allow()]
+        if not routes:
+            return self.send_json(503, {"error": {"type": "bgp_rate_limited", "message": "All routes rate-limited"}})
        errors = []
        for route in routes:
            r_model = route.get("model", model)
@@ -1266,11 +1976,13 @@ class Handler(http.server.BaseHTTPRequestHandler):
            }, browser_ua=True)
            print(f"[bgp] trying route '{route.get('name', r_url)}' model={r_model}", file=sys.stderr)
            req = urllib.request.Request(target, data=json.dumps(chat_body).encode(), headers=fwd)
+            t0_route = time.time()
            route_ok = False
            for attempt in range(3):
                try:
                    upstream = urllib.request.urlopen(req, timeout=_upstream_timeout(body, stream))
                    print(f"[bgp] route '{route.get('name', r_url)}' connected OK", file=sys.stderr)
+                    _update_route_stats(route, True, time.time() - t0_route)
                    self._forward_oa_compat(upstream, stream, r_model, chat_body, body, input_data, fwd, target)
                    return
                except urllib.error.HTTPError as e:
@@ -1282,6 +1994,7 @@ class Handler(http.server.BaseHTTPRequestHandler):
                        req = urllib.request.Request(target, data=json.dumps(chat_body).encode(), headers=fwd)
                        continue
                    print(f"[bgp] route '{route.get('name', r_url)}' FAILED: HTTP {e.code}: {err[:200]}", file=sys.stderr)
+                    _update_route_stats(route, False, time.time() - t0_route, http_code=e.code)
                    errors.append(f"{route.get('name','?')}: HTTP {e.code}")
                    break
                except (ConnectionResetError, ConnectionAbortedError, BrokenPipeError) as e:
@@ -1291,10 +2004,12 @@ class Handler(http.server.BaseHTTPRequestHandler):
                        time.sleep(wait)
                        req = urllib.request.Request(target, data=json.dumps(chat_body).encode(), headers=fwd)
                        continue
+                    _update_route_stats(route, False, time.time() - t0_route, error_type=str(e))
                    errors.append(f"{route.get('name','?')}: {e}")
                    break
                except Exception as e:
                    print(f"[bgp] route '{route.get('name', r_url)}' FAILED: {e}", file=sys.stderr)
+                    _update_route_stats(route, False, time.time() - t0_route, error_type=str(e))
                    errors.append(f"{route.get('name','?')}: {e}")
                    break