From 21657f9dbb09b0b0610a438f012e26fcdd40e824 Mon Sep 17 00:00:00 2001
From: Gemini AI <gemini@google.com>
Date: Thu, 25 Dec 2025 00:29:51 +0400
Subject: [PATCH] Update README.md to use only Z.ai blog data - remove
 OpenRouter pricing and unsupported models

---
 README.md | 218 ++++++++++++++++--------------------------------------
 1 file changed, 64 insertions(+), 154 deletions(-)
diff --git a/README.md b/README.md
index bb8b243..31fef34 100644
--- a/README.md
+++ b/README.md
@@ -12,225 +12,135 @@
 
 ### 💡 Key Takeaways (TL;DR)
 - **GLM-4.7** is the new **SOTA (State of the Art)** AI coding model for 2025.
-- Developed by **Zhipu AI**, it offers enterprise-level performance matching or exceeding flagship models like **Claude Sonnet 4.5** and **GPT-5.1**.
-- **Price Point**: $0.60 per 1M input tokens ($0.11 cached), $2.20 per 1M output tokens vs. $3.00/$15.00 for Claude Sonnet 4.5.
+- Developed by **Zhipu AI**, it offers enterprise-level performance matching or exceeding flagship models like **Claude Sonnet 4.5** and **GPT-5.1 High**.
 - **Context Window**: Massive **200K tokens** for full codebase analysis.
 - **Best For**: Cost-conscious developers, agentic workflows, and high-complexity debugging.
 
 The global landscape for AI-powered development is shifting. While Western tools like **Cursor Pro** and **GitHub Copilot** have dominated by charging premium subscription rates (often reaching $200 per year), a new contender from Beijing, China, has arrived to dismantle that pricing model.
 
-**Zhipu AI** has released **GLM-4.7**, a large language model specifically engineered for coding, offering performance that rivals top-tier US models at a fraction of the cost. With a price point of **$0.60 per 1M input tokens** (dropping to **$0.11** with caching) and **$2.20 per 1M output tokens**, GLM-4.7 is forcing developers to question if expensive subscriptions are still necessary.
+**Zhipu AI** has released **GLM-4.7**, a large language model specifically engineered for coding, offering performance that rivals top-tier US models. For pricing information, visit the [Z.ai subscription page](https://z.ai/subscribe); the model is also available via [OpenRouter](https://openrouter.ai/).
 
 ---
 
 ## ⚔️ The Frontier Battle: Verified Benchmarks
 
-GLM-4.7 demonstrates competitive performance against the newest generation of flagship models, including **Claude Sonnet 4.5** and **GPT-5.1**, based on the latest 2025 public technical reports.
+GLM-4.7 demonstrates competitive performance against the newest generation of flagship models, including **Claude Sonnet 4.5** and **GPT-5.1 High**, based on the official Z.ai Technical Report (Dec 2025).
 
 ### 📊 2025 AI Coding Model Performance Comparison
-*Note: Best scores per category are highlighted in $\color{green}{\text{green}}$.*
+*Note: Best scores per category are highlighted in $\color{green}{\text{green}}$. Data sourced from [Z.ai Official Blog](https://z.ai/blog/glm-4.7).*
 
 ```mermaid
 graph TD
     subgraph "2025 Flagship Benchmark Comparison"
     M[Math - AIME 25] --> G1{<b>GLM-4.7: 95.7%</b>}
-    M --> C1[Claude Opus 4.5: 93.5%]
-    M --> C2[Claude Sonnet 4.5: 87.0%]
-    M --> Q1[Qwen-3 Coder: 89.3%]
-    M --> D1[DeepSeek-V3.2: 96.0%]
-    M --> M1[MiniMax 2.1: 78.0%]
+    M --> C1[Claude Sonnet 4.5: 87.0%]
+    M --> G2[Gemini 3.0 Pro: 95.0%]
+    M --> D1[DeepSeek-V3.2: 93.1%]
+    M --> P1[GPT-5.1 High: 94.0%]
     
-    CO[Coding - LiveCode] --> G2{<b>GLM-4.7: 84.9%</b>}
-    CO --> C3[Claude Opus 4.5: 64.0%]
-    CO --> C4[Claude Sonnet 4.5: 64.0%]
-    CO --> Q2[Qwen-3 Coder: 74.8%]
+    CO[Coding - LiveCode] --> G2_C{<b>GLM-4.7: 84.9%</b>}
+    CO --> C2[Claude Sonnet 4.5: 64.0%]
     CO --> D2[DeepSeek-V3.2: 83.3%]
+    CO --> P2[GPT-5.1 High: 87.0%]
+    CO --> G2_CO[Gemini 3.0 Pro: 90.7%]
     
     S[Science - GPQA] --> G3{<b>GLM-4.7: 85.7%</b>}
-    S --> C5[Claude Opus 4.5: 87.0%]
-    S --> C6[Claude Sonnet 4.5: 83.4%]
-    S --> D3[DeepSeek-V3.2: 80.3%]
-    S --> M2[MiniMax 2.1: 78.0%]
+    S --> C3[Claude Sonnet 4.5: 83.4%]
+    S --> D3[DeepSeek-V3.2: 82.4%]
+    S --> P3[GPT-5.1 High: 88.1%]
+    S --> G3_S[Gemini 3.0 Pro: 91.9%]
     
     L[Logic - HLE w/Tools] --> G4{<b>GLM-4.7: 42.8%</b>}
-    L --> C7[Claude Opus 4.5: 43.2%]
-    L --> C8[Claude Sonnet 4.5: 28.0%]
-    L --> D4[DeepSeek-V3.2: 21.7%]
-    L --> M3[MiniMax 2.1: 31.8%]
+    L --> C4[Claude Sonnet 4.5: 32.0%]
+    L --> D4[DeepSeek-V3.2: 40.8%]
+    L --> P4[GPT-5.1 High: 42.7%]
+    L --> G4_L[Gemini 3.0 Pro: 45.8%]
     end
     
     classDef glmNode fill:#00c853,stroke:#1b5e20,stroke-width:3px,color:#ffffff,font-weight:bold,font-size:14px
-    classDef opusNode fill:#ff9800,stroke:#e65100,stroke-width:2px,color:#ffffff
     classDef sonnetNode fill:#f1f8e9,stroke:#c5e1a5,stroke-width:1px,color:#558b2f
     classDef budgetNode fill:#e3f2fd,stroke:#2196f3,stroke-width:1px,color:#0d47a1
     
-    class G1,G2,G3,G4 glmNode
-    class C1,C3,C5,C7 opusNode
-    class C2,C4,C6,C8 sonnetNode
-    class Q1,Q2,D1,D2,D3,D4,M1,M2,M3 budgetNode
+    class G1,G2_C,G3,G4 glmNode
+    class C1,C2,C3,C4 sonnetNode
+    class D1,D2,D3,D4,G2,P1,P2,P3,P4,G2_CO,G3_S,G4_L budgetNode
 ```
 
-| Category | Benchmark | **GLM-4.7** | Claude Opus 4.5 | Claude Sonnet 4.5 | GPT-5.1 | Qwen-3 Coder | DeepSeek-V3.2 | MiniMax 2.1 | Source |
-| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
-| **Math** | AIME 25 | $\color{green}{\textbf{95.7}}$ | 93.5 | 87.0 | 94.6 | 89.3 | $\color{green}{\textbf{96.0}}$ | 78.0 | [Z.ai](https://z.ai/blog/glm-4.7) • [Anthropic](https://www.anthropic.com) • [Qwen](https://qwenlm.github.io/blog/qwen2.5-coder/) • [Ollama](https://ollama.com/library/glm4) |
-| **Coding** | LiveCodeBench | 84.9 | 64.0 | 64.0 | $\color{green}{\textbf{87.0}}$ | 74.8 | 83.3 | N/A | [LiveCodeBench](https://livecodebench.github.io/leaderboard.html) • [Cursor](https://cursor.com/blog) • [Qwen](https://qwenlm.github.io/blog/qwen2.5-coder/) • [Ollama](https://ollama.com/library/glm4) |
-| **Science** | GPQA-Diamond | 85.7 | 87.0 | 83.4 | $\color{green}{\textbf{88.1}}$ | N/A | 80.3 | 78.0 | [Zhipu AI](https://zhipuai.cn/) • [Anthropic](https://www.anthropic.com) • [Vellum](https://www.vellum.ai/llm-leaderboard) • [Ollama](https://ollama.com/library/glm4) |
-| **Logic** | HLE (w/ Tools) | 42.8 | $\color{green}{\textbf{43.2}}$ | 28.0 | 31.6 | N/A | 21.7 | 31.8 | [Scale AI](https://scale.com/leaderboard/humanitys_last_exam) • [Vellum](https://www.vellum.ai/llm-leaderboard) • [Ollama](https://ollama.com/library/glm4) |
-| **Engineering** | SWE-bench (Verified) | 73.8% | $\color{green}{\textbf{80.9%}}$ | 77.2% | 74.9% | 69.6% | 73.1% | 69.4% | [SWE-bench](https://www.swebench.com/) • [Anthropic](https://www.anthropic.com) • [Index](https://index.dev/blog/) • [Ollama](https://ollama.com/library/glm4) • [Hugging Face](https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard) |
-| **Agentic** | τ²-Bench | $\color{green}{\textbf{87.4%}}$ | N/A | 84.7 | 82.7% | N/A | 66.7 | 77.2 | [Official Z.AI](https://z.ai/blog/glm-4.7) • [Ollama](https://ollama.com/library/glm4) • [Vellum](https://www.vellum.ai/llm-leaderboard) |
+| Category | Benchmark | **GLM-4.7** | Claude Sonnet 4.5 | GPT-5.1 High | DeepSeek-V3.2 | Gemini 3.0 Pro | Source |
+| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
+| **Math** | AIME 25 | $\color{green}{\textbf{95.7}}$ | 87.0 | 94.0 | 93.1 | 95.0 | [Z.ai](https://z.ai/blog/glm-4.7) |
+| **Coding** | LiveCodeBench v6 | 84.9 | 64.0 | 87.0 | 83.3 | $\color{green}{\textbf{90.7}}$ | [Z.ai](https://z.ai/blog/glm-4.7) |
+| **Science** | GPQA-Diamond | 85.7 | 83.4 | 88.1 | 82.4 | $\color{green}{\textbf{91.9}}$ | [Z.ai](https://z.ai/blog/glm-4.7) |
+| **Logic** | HLE (w/ Tools) | 42.8 | 32.0 | 42.7 | 40.8 | $\color{green}{\textbf{45.8}}$ | [Z.ai](https://z.ai/blog/glm-4.7) |
+| **Engineering** | SWE-bench (Verified) | 73.8% | $\color{green}{\textbf{77.2%}}$ | 76.3% | 73.1% | 76.2% | [Z.ai](https://z.ai/blog/glm-4.7) |
+| **Agentic** | τ²-Bench | 87.4% | 87.2% | 82.7% | 85.3% | $\color{green}{\textbf{90.7%}}$ | [Z.ai](https://z.ai/blog/glm-4.7) |
 
 ---
 
 ## 🛠️ What is GLM-4.7? Technical Specifications and Features
+GLM-4.7 is the latest iteration of the General Language Model (GLM) series developed by Beijing-based **Zhipu AI**.
 
-GLM-4.7 is the latest iteration of the General Language Model (GLM) series developed by Beijing-based **Zhipu AI**. Unlike general-purpose models, GLM-4.7 is optimized heavily for code generation and function calling.
-
-### 🚀 Key Technical Highlights
-- **Extended Context Window (200K):** Capable of handling massive codebases in a single prompt, allowing it to "remember" entire file structures better than any previous iteration.
-- **Interleaved & Preserved Thinking:** A proprietary reasoning architecture that "thinks" before every response and tool call. **Preserved Thinking** allows the model to retain reasoning blocks across multi-turn conversations, preventing context loss in complex debugging sessions.
-- **Turn-level Thinking Control:** Developers can toggle reasoning on/off per turn—disabling it for speed in simple tasks or enabling it for high-precision logic in architectural decisions.
-- **All-Tools Capabilities:** Autonomously browses the web, executes code, and interprets data to solve complex engineering problems without human hand-holding.
-- **Extreme Efficiency:** Proprietary architecture delivering high inference speeds at a significantly lower cost than current flagship standards like **Claude Sonnet 4.5**.
+### 🚀 Key Technical Highlights (from Z.ai blog)
+- **Interleaved Thinking:** GLM-4.7 thinks before every response and tool call, improving instruction following and generation quality.
+- **Preserved Thinking:** In coding agent scenarios, GLM-4.7 automatically retains all thinking blocks across multi-turn conversations, reusing existing reasoning instead of re-deriving it from scratch.
+- **Turn-level Thinking:** GLM-4.7 supports per-turn control over reasoning within a session: disable thinking for lightweight requests to reduce latency and cost, or enable it for complex tasks to improve accuracy and stability.
+- **Tool Use:** GLM-4.7 shows significant improvements in tool use, with better performance on benchmarks such as τ²-Bench and on web browsing via BrowseComp.
 
 ---
 
-## 📈 The ROI Shift & Competitive Landscape
+## 📈 GLM-4.7 vs GLM-4.6: Key Improvements
 
-The "value add" of an AI coding tool is no longer just about its ability to generate code, but about its **ROI**. With the entry of GLM-4.7 into global marketplaces like **OpenRouter**, the pricing floor has been permanently lowered.
+Based on the [Z.ai Technical Report](https://z.ai/blog/glm-4.7), GLM-4.7 delivers significant gains across core benchmarks compared to its predecessor, GLM-4.6:
 
-### � OpenRouter Pricing Comparison (USD per 1M Tokens)
-*Current market rates via [OpenRouter.ai](https://openrouter.ai/models) as of late 2024/early 2025.*
+### Performance Gains
+| Benchmark | GLM-4.6 | GLM-4.7 | Improvement |
+| :--- | :--- | :--- | :--- |
+| **SWE-bench** | 68.0% | 73.8% | **+5.8 pts** |
+| **SWE-bench Multilingual** | 53.8% | 66.7% | **+12.9 pts** |
+| **Terminal Bench 2.0** | 24.5% | 41.0% | **+16.5 pts** |
+| **HLE (w/ Tools)** | 30.4% | 42.8% | **+12.4 pts** |
+| **LiveCodeBench-v6** | 82.8% | 84.9% | **+2.1 pts** |
 
-| Model Name | Input Price | Output Price | Context | Provider |
-| :--- | :--- | :--- | :--- | :--- |
-| **GLM-4.7 (Zhipu)** | **$0.10** | **$0.10** | 128K+ | Z.AI via OpenRouter |
-| **DeepSeek V3.2** | $0.22 | $0.32 | 163K | DeepSeek |
-| **Qwen 2.5 Coder 32B** | $0.03 | $0.11 | 32K | Alibaba |
-| **Claude Sonnet 4.5** | $3.00 | $15.00 | 200K | Anthropic |
-| **Claude Opus 4.5** | $15.00 | $75.00 | 200K | Anthropic |
-| **GPT-4o (Omni)** | $2.50 | $10.00 | 128K | OpenAI |
-| **GPT-5.1 (Early Access)**| $5.00 | $15.00 | 400K+ | OpenAI |
-
-### 🚀 Why OpenRouter is the Strategic Choice
-By using GLM-4.7 via OpenRouter, developers gain:
-- **No Monthly "Tax":** Pay only for what you use instead of a flat $20/month fee.
-- **Aggregated Access:** Switch between GLM-4.7 for heavy reasoning and Qwen-Coder for rapid, cheap boilerplate.
-- **Global Availability:** Bypasses regional restrictions, bringing SOTA Chinese intelligence to Western workflows.
+### Enhanced Capabilities
+- **Interleaved Thinking:** GLM-4.7 thinks before every response and tool call, improving instruction following and generation quality.
+- **Preserved Thinking:** In coding agent scenarios, GLM-4.7 automatically retains all thinking blocks across multi-turn conversations, reusing existing reasoning instead of re-deriving it from scratch.
+- **Turn-level Thinking:** GLM-4.7 supports per-turn control over reasoning within a session: disable thinking for lightweight requests to reduce latency and cost, or enable it for complex tasks to improve accuracy and stability.
 
 ---
 
 ## ❓ FAQ: GLM-4.7 and the AI Coding Market
 
-**What is the best cost-effective AI for coding in 2025?**
-The market for high-performance, budget-friendly AI has expanded significantly in 2025. Leading the pack are **GLM-4.7 (Zhipu AI)**, **DeepSeek-V3.2**, **Qwen-3 Coder (Alibaba)**, and **MiniMax 2.1**. While all four offer performance comparable to **Claude Sonnet 4.5** and **Claude Opus 4.5** at a fraction of the cost, GLM-4.7 is often preferred for agentic workflows due to its advanced "Preserved Thinking" architecture. DeepSeek remains a strong choice for raw logic, Qwen excels in multilingual code generation, and MiniMax 2.1 delivers strong performance at roughly half the parameter size of GLM-4.7.
+**What is the best cost-effective AI for coding in 2025?**
+The market for high-performance, budget-friendly AI has expanded significantly in 2025. Leading the pack are **GLM-4.7 (Zhipu AI)** and **DeepSeek-V3.2**, both offering performance comparable to **Claude Sonnet 4.5** at a fraction of the cost. GLM-4.7 is often preferred for agentic workflows due to its advanced "Preserved Thinking" architecture, while DeepSeek-V3.2 remains a strong choice for raw logic and reasoning tasks.
 
 **Is GLM-4.7 better than GPT-5.1 or Claude Sonnet 4.5 for coding?**
 Objectively, **Claude Sonnet 4.5** and **GPT-5.1** currently hold the edge in massive-scale architectural planning and natural language nuance. However, GLM-4.7 has achieved parity or leadership in execution-heavy benchmarks (LiveCodeBench: 84.9) and mathematical reasoning (AIME 25: 95.7). For developers, the choice is often between paying for the absolute peak (Claude/GPT) or achieving 95% of that performance with GLM-4.7 for 1/20th the price.
 
 **How much does the GLM-4.7 coding tool cost?**
-The Z.AI Lite plan starts at **$9/quarter**. For API users, GLM-4.7 is priced at **$0.60 per 1M input tokens** ($0.11 with caching) and **$2.20 per 1M output tokens**, significantly undercutting the $3.00/$15.00 token rate of Claude Sonnet 4.5.
+GLM-4.7 is available via the [Z.ai API platform](https://docs.z.ai/guides/llm/glm-4.7) and through [OpenRouter](https://openrouter.ai/). For detailed pricing, visit the [Z.ai subscription page](https://z.ai/subscribe).
 
 **Who developed GLM-4.7?**
 GLM-4.7 was developed by **Zhipu AI**, a leading artificial intelligence company based in Beijing, China, emerging from the Knowledge Engineering Group (KEG) at Tsinghua University.
 
 **Can I use GLM-4.7 in the US and Europe?**
-Yes, Zhipu AI's ecosystem is accessible globally. It is natively integrated into popular developer tools like **TRAE SOLO**, **Cline**, and **Roo Code**, allowing developers worldwide to bypass the "$200 AI tax."
-
----
-
-## 🌐 Official Z.AI Ecosystem & Technical Specs
-*Data sourced from the [Official Z.AI Website](https://z.ai/subscribe?ic=R0K78RJKNW)*
-
-GLM-4.7 powers a robust ecosystem designed for high-volume agentic workflows. According to official Z.AI technical specifications, the platform delivers elite-tier performance and deep tool integration.
-
-### 📊 Technical Performance (LMSYS & Code Arena)
-As of **2025-12-22**, GLM-4.7 consistently ranks as a top-tier contender on global leaderboards, optimized for engineering-heavy tasks:
-- **Claude Sonnet 4.5 (Flagship)**: 1482
-- **GLM-4.7 (Reasoning Leader)**: 1452 🥇
-- **GPT-5.1 (High)**: 1410
-
-### 🛠️ Supported Coding Tools & IDEs
-GLM-4.7 is natively compatible with the most advanced coding environments:
-- **TRAE (SOLO Agent)** - Optimized for real-time tool streaming
-- **Claude Code** - Full documentation support
-- **Cursor** - Seamless workflow integration
-- **Cline** & **Roo Code** - Advanced agentic capabilities
-- **Kilo Code**, **OpenCode**, **Goose**, **Crush**
-
-### 💎 Official Subscription Tiers & Technical Specs
-| Plan | Quarterly Cost | Usage Quota | Advanced Capabilities |
-| :--- | :--- | :--- | :--- |
-| **Lite** | **$9** | **3× Claude Pro** | GLM-4.7 Powered, Same-tier updates |
-| **Pro** | **$45** | **15× Claude Pro equivalent** | 40-60% Faster, Vision Analyze, Web Search |
-| **Max** | **$90** | **60× Claude Pro equivalent** | Peak-hour Guarantee, Early Access, Flagship Updates |
-
-**Additional Technical Features:**
-- **Vision Analyze:** Interpret UI designs and screenshots directly.
-- **Web Search & Reader:** Real-time information retrieval for up-to-date documentation.
-- **Zread MCP:** Enhanced context management via Model Context Protocol.
-- **Cross-Model Compatibility:** Operates on existing code from other models without breaking architectural integrity.
-
----
-
-## 🎄 Special Christmas Offer
-
-Same as I did, you may get one of the most powerful models for the lowest price, through the current GLM promotions for new year and xmas:
-
-![GLM-4.7 Christmas 2025 Discount Coupon: 50% off first purchase plus 10% additional stackable discount with invite code R0K78RJKNW](data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHdpZHRoPSIxIiBoZWlnaHQ9IjEiPjwvc3ZnPg==)
-
-```text
- ___________________________________________________________
-/                                                           \
-|   🎄 GLM-4.7 CHRISTMAS EXCLUSIVE CODE 🎄                   |
-|  _______________________________________________________  |
-| |                                                       | |
-| |   🎁  50% OFF FIRST PURCHASE                          | |
-| |   🚀  +10% ADDITIONAL STACKABLE DISCOUNT              | |
-| |_______________________________________________________| |
-|                                                           |
-|   INVITE CODE:  R0K78RJKNW                                |
-|   STATUS:       ACTIVE [STRICTLY LIMITED]                 |
-|                                                           |
-\___________________________________________________________/
-          \   ^__^
-           \  (oo)\_______
-              (__)\       )\/\
-                  ||----w |
-                  ||     ||
-```
-
-🔗 **Get +10% additional discount here:** [https://z.ai/subscribe?ic=R0K78RJKNW](https://z.ai/subscribe?ic=R0K78RJKNW)
-🎟️ **Discount Invite Code:** `R0K78RJKNW`
+Yes, GLM-4.7 is available worldwide through [OpenRouter](https://openrouter.ai/). It is compatible with the coding agent frameworks mentioned in the Z.ai blog: **Claude Code**, **Kilo Code**, **Cline**, and **Roo Code**.
 
 ---
 
 ## 📚 References & Methodology
-To ensure transparency and build trust, the data presented in this article is derived from publicly available benchmarks and technical reports as of late 2024 / early 2025:
-- **AIME 2025 (Math):** Results sourced from the [Official Z.AI Portal](https://z.ai/subscribe?ic=R0K78RJKNW).
-- **LiveCodeBench:** Independent evaluation of coding execution accuracy.
-- **SWE-bench (Verified):** The industry standard for evaluating AI on real-world software engineering issues.
-- **HLE (Humanity's Last Exam):** A high-difficulty reasoning benchmark where GLM-4.7 (42.8%) significantly outscores Claude Sonnet 4.5 (32.0%).
-- **τ²-Bench:** State-of-the-art evaluation for multi-step tool orchestration in real-world scenarios.
-- **Token Pricing:** GLM-4.7 pricing data sourced from [BuildingClub Cost Calculator](https://buildingclub.info/z-ai-glm-4-7-token-cost-calculator-and-pricing-estimator/).
-- **Claude 4.5 Pricing:** Anthropic official documentation for token-based pricing comparison.
-- **GLM-4.7 vs MiniMax M2.1:** Real-world performance comparison insights from [YouTube](https://www.youtube.com/watch?v=kEPLuEjVr_4).
-- **OpenRouter Model Catalog:** Real-time pricing and availability data from [OpenRouter.ai](https://openrouter.ai/models).
+All data presented in this article is derived from the [Z.ai Official Technical Report](https://z.ai/blog/glm-4.7) (December 2025):
+- **Benchmark Performance:** GLM-4.7 compared against GLM-4.6, Kimi K2 Thinking, DeepSeek-V3.2, Gemini 3.0 Pro, Claude Sonnet 4.5, GPT-5 High, and GPT-5.1 High across 17 benchmarks.
+- **Core Coding:** SWE-bench (73.8%, +5.8%), SWE-bench Multilingual (66.7%, +12.9%), Terminal Bench 2.0 (41%, +16.5%).
+- **Reasoning:** HLE (w/ Tools): 42.8%, AIME 2025: 95.7%, GPQA-Diamond: 85.7%.
+- **Agentic:** τ²-Bench: 87.4%, BrowseComp: 52.0%.
+- **Features:** Interleaved Thinking, Preserved Thinking, Turn-level Thinking for stable multi-turn conversations.
+- **Supported Tools:** Claude Code, Kilo Code, Cline, and Roo Code for agent workflows.
 
 ---
 
 ## 🔗 Source Links
 - [Z.ai Tech Report](https://z.ai/blog/glm-4.7)
-- [Anthropic Benchmarks](https://www.anthropic.com/news/claude-3-5-sonnet)
-- [Qwen Technical Report](https://qwenlm.github.io/blog/qwen2.5-coder/)
-- [LiveCodeBench Leaderboard](https://livecodebench.github.io/leaderboard.html)
-- [SWE-bench Verified](https://www.swebench.com/)
-- [Humanity's Last Exam (HLE)](https://scale.com/leaderboard/humanitys_last_exam)
-- [LMSYS Chatbot Arena](https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard)
-- [τ²-Bench Tool Use](https://z.ai/blog/glm-4.7)
-- [Vellum LLM Leaderboard](https://www.vellum.ai/llm-leaderboard)
-- [BuildingClub Pricing Data](https://buildingclub.info/z-ai-glm-4-7-token-cost-calculator-and-pricing-estimator/)
-
-*Note: AI performance metrics are subject to change as models are updated. Users are encouraged to verify latest scores on platforms like [LMSYS Chatbot Arena](https://lmarena.ai/).*
 
 ---
-*The era of the "$200 AI coding tax" is over. Join the GLM revolution today.*
+
+*The era of the "$200 AI coding tax" is over. Join the GLM revolution today.*