docs: add Intelligence Routing section to README

2026-05-05 17:46:24 +00:00
parent fe6d3f4db8
commit 6685f60855
1 changed files with 62 additions and 1 deletions
--- a/README.md
+++ b/README.md
@@ -58,6 +58,65 @@ User message + AI response
 | `/recall <query>` | Search memories by keyword |
 | `/forget <id>` | Delete a specific memory |
 ### 🧠 Intelligence Routing
 Intelligence Routing is the core of zCode CLI X's reliability: a unified agentic loop that handles streaming and non-streaming requests through the same execution path, so there are no split paths that lose context or hang silently.
 ```
 User Message
     │
     ▼
 ┌─────────────────────────────────────────┐
 │         chatWithAI() — Main Loop        │
 │  ┌───────────────────────────────────┐  │
 │  │  Call API (stream or non-stream)  │  │
 │  └─────────────┬─────────────────────┘  │
 │                │                        │
 │         ┌──────┴──────┐                 │
 │         ▼             ▼                 │
 │   tool_calls?     text content          │
 │         │             │                 │
 │    ┌────┴────┐    return answer         │
 │    ▼         ▼                          │
 │  Execute   Feed results                 │
 │  tools     back to AI                   │
 │    │         │                          │
 │    └────┬────┘                          │
 │         ▼                               │
 │    Append to messages → loop back       │
 │    (max 10 turns, forced final answer)  │
 └─────────────────────────────────────────┘
 ```
 **How it works:**
 | Component | Role |
 |---|---|
 | `chatWithAI()` | Single entry point. A `while` loop (max 10 turns) that calls the API, executes any requested tools, and loops back. |
 | `streamChat()` | SSE transport. Streams tokens to user via `onDelta()`. **Accumulates** tool_call deltas instead of aborting. Returns `{ content, tool_calls, error }`. |
 | `nonStreamChat()` | REST transport. Single POST, returns `{ content, tool_calls, error }`. |
 | Tool execution | Runs all tool_calls in parallel, appends results as `tool` messages, loops back. |
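
The roles in the table can be sketched as a single loop. This is a minimal TypeScript illustration, not the project's actual code: `chatLoop`, `Transport`, `ToolRunner`, and the message shapes are hypothetical names, and only the `{ content, tool_calls, error }` result, the 10-turn cap, parallel tool execution, and the forced final answer come from the description above.

```typescript
// Hypothetical shapes; the README only confirms `{ content, tool_calls, error }`.
type ToolCall = { id: string; name: string; arguments: string };
type ChatResult = { content: string; tool_calls: ToolCall[]; error?: string };
type Message =
  | { role: "system" | "user"; content: string }
  | { role: "assistant"; content: string; tool_calls?: ToolCall[] }
  | { role: "tool"; tool_call_id: string; content: string };

// A transport turns the message history into a ChatResult. A streaming and a
// non-streaming implementation can both satisfy this type.
type Transport = (messages: Message[], allowTools: boolean) => Promise<ChatResult>;
type ToolRunner = (call: ToolCall) => Promise<string>;

const MAX_TURNS = 10;

async function chatLoop(
  messages: Message[],
  transport: Transport,
  runTool: ToolRunner,
): Promise<string> {
  for (let turn = 0; turn < MAX_TURNS; turn++) {
    const result = await transport(messages, true);
    // Assumption: a transport error ends the loop instead of being retried.
    if (result.error) throw new Error(result.error);

    // No tool calls means the model produced its answer.
    if (result.tool_calls.length === 0) return result.content;

    // Record the assistant turn, then run every requested tool in parallel.
    messages.push({ role: "assistant", content: result.content, tool_calls: result.tool_calls });
    const toolMessages = await Promise.all(
      result.tool_calls.map(async (call): Promise<Message> => {
        let output: string;
        try {
          output = await runTool(call);
        } catch (err) {
          // Assumption: tool failures are reported back to the model as text.
          output = `tool error: ${String(err)}`;
        }
        return { role: "tool", tool_call_id: call.id, content: output };
      }),
    );
    messages.push(...toolMessages);
  }

  // Turn budget exhausted: one last call with tools disabled forces a text answer.
  const final = await transport(messages, false);
  if (final.error) throw new Error(final.error);
  return final.content;
}
```

Because the loop only depends on the `Transport` type, swapping a streaming implementation for a non-streaming one does not change how tools are executed or how results are fed back.
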
 **Key design decisions:**
 - **No recursive fallbacks.** The old code started a stream, aborted it on detecting a tool call, and then recursed into the non-streaming path, which lost the stream context. Now both transports return the same struct and feed into the same loop.
 - **Tool call accumulation.** SSE sends tool_calls in delta chunks (name in one chunk, arguments in the next). `streamChat()` accumulates these across chunks and builds a complete `tool_calls` array — same as the non-streaming response.
 - **Max 10 turns safety net.** If the AI keeps calling tools (infinite loop), after 10 turns a final non-streaming call without tools forces a text answer.
 - **Streaming is transport, not logic.** Whether tokens stream to the user or not, the tool execution loop is identical.
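
The tool call accumulation decision can be illustrated with a small sketch. It assumes OpenAI-style delta fragments (`index`, an optional `id`, and partial `function.name` / `function.arguments` strings); the real Z.AI chunk layout is not shown in this README and may differ. `ToolCallAccumulator` is a hypothetical name.

```typescript
// Assumed OpenAI-style delta fragment; the actual chunk layout may differ.
type ToolCallDelta = {
  index: number;
  id?: string;
  function?: { name?: string; arguments?: string };
};
type ToolCall = { id: string; name: string; arguments: string };

class ToolCallAccumulator {
  // Slot i holds the partially assembled call with stream index i.
  private readonly partial: (ToolCall | undefined)[] = [];

  // Merge the fragments from one SSE chunk into the calls collected so far.
  add(deltas: ToolCallDelta[]): void {
    for (const d of deltas) {
      let call = this.partial[d.index];
      if (call === undefined) {
        call = { id: "", name: "", arguments: "" };
        this.partial[d.index] = call;
      }
      if (d.id) call.id = d.id;
      if (d.function?.name) call.name += d.function.name;
      // Arguments arrive as string pieces of one JSON document; concatenate them.
      if (d.function?.arguments) call.arguments += d.function.arguments;
    }
  }

  // Return the assembled calls in stream-index order, skipping unused slots.
  finish(): ToolCall[] {
    return this.partial.filter((c): c is ToolCall => c !== undefined);
  }
}
```

A streaming transport would call `add()` for each chunk that carries tool-call fragments and `finish()` once the stream ends, yielding the same `tool_calls` array a non-streaming response would contain. The `arguments` string should only be parsed as JSON after the stream has finished.
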
 **Before Intelligence Routing:**
 ```
 Stream starts → detects tool_call → ABORT stream → recursive non-stream call
 → non-stream returns raw tool output as response → NO final answer from AI
 → user sees raw bash output or silence
 ```
 **After Intelligence Routing:**
 ```
 Stream starts → detects tool_call → accumulates full tool_call → executes tool
 → feeds result back to AI in same conversation → AI synthesizes final answer
 → streams final answer to user → done
 ```
 ### Streaming & Formatting
 - **⚡ Real-time SSE Streaming**: Token-by-token delivery via `StreamConsumer` — adapted from [Hermes Agent's GatewayStreamConsumer](https://github.com/nousresearch/hermes-agent)
  - Queued token buffer → rate-limited `editMessageText` loop (1s base interval)
@@ -388,7 +447,9 @@ Z.AI API (SSE)
 | Self-learning / curiosity | ✅ Pattern detector + auto-extraction | ✅ Knowledge + memory tools | ❌ None |
 | Memory-injected prompts | ✅ Every conversation uses past lessons | ✅ Memory injected | ❌ None |
 | **Streaming** | | | |
 | Intelligence Routing | ✅ Unified agentic loop (stream+non-stream) | ⚠️ Separate stream/non-stream paths | ❌ None |
 | Real-time SSE streaming | ✅ StreamConsumer (edit-in-place) | ✅ GatewayStreamConsumer | ❌ None |
 | Tool call accumulation | ✅ Delta accumulation from SSE chunks | ⚠️ Abort on tool detection | ❌ None |
 | Telegram HTML formatting | ✅ markdownToHtml + fallback | ✅ Native HTML support | ❌ None |
 | Adaptive flood control | ✅ Exponential backoff | ✅ Flood backoff | ❌ N/A |
 | **Tooling** | | | |
@@ -411,7 +472,7 @@ Z.AI API (SSE)
 ### Summary
- **zCode CLI X** — Lightweight agentic coder focused on Telegram + Z.AI. Real-time SSE streaming, self-correction loops, persistent self-learning memory with curiosity engine, RTK optimization, and beautiful HTML formatting. Gets smarter with every conversation. Ideal for quick coding tasks via Telegram.
+- **zCode CLI X** — Lightweight agentic coder focused on Telegram + Z.AI. **Intelligence Routing** — a unified agentic loop that handles streaming and non-streaming through one execution path with tool call accumulation from SSE deltas. Real-time SSE streaming, self-correction loops, persistent self-learning memory with curiosity engine, RTK optimization, and beautiful HTML formatting. Gets smarter with every conversation. Ideal for quick coding tasks via Telegram.
 - **Hermes Agent** — Full-stack AI assistant platform. Best for complex multi-agent workflows, scheduled automation, and cross-platform deployment. 500+ skills, MCP ecosystem, deepest toolset.
 - **better-clawd** — Minimal Claude Code clone. Useful as a lightweight reference but lacks agentic depth.