docs: add Intelligence Routing section to README

2026-05-05 17:46:24 +00:00
parent fe6d3f4db8
commit 6685f60855
1 changed files with 62 additions and 1 deletions
--- a/README.md
+++ b/README.md
@@ -58,6 +58,65 @@ User message + AI response
 | `/recall <query>` | Search memories by keyword |
 | `/forget <id>` | Delete a specific memory |

+### 🧠 Intelligence Routing
+
+The core of zCode CLI X's reliability. A unified agentic loop that handles both streaming and non-streaming through the same execution path — no more split paths that lose context or hang silently.
+
+```
+User Message
+     │
+     ▼
+┌─────────────────────────────────────────┐
+│         chatWithAI() — Main Loop        │
+│  ┌───────────────────────────────────┐  │
+│  │  Call API (stream or non-stream)  │  │
+│  └─────────────┬─────────────────────┘  │
+│                │                        │
+│         ┌──────┴──────┐                 │
+│         ▼             ▼                 │
+│   tool_calls?     text content          │
+│         │             │                 │
+│    ┌────┴────┐    return answer         │
+│    ▼         ▼                           │
+│  Execute   Feed results                  │
+│  tools     back to AI                    │
+│    │         │                           │
+│    └────┬────┘                           │
+│         ▼                                │
+│    Append to messages → loop back        │
+│    (max 10 turns, forced final answer)   │
+└─────────────────────────────────────────┘
+```
+
+**How it works:**
+
+| Component | Role |
+|---|---|
+| `chatWithAI()` | Single entry point. While loop (max 10 turns) that calls API, handles tools, loops back. |
+| `streamChat()` | SSE transport. Streams tokens to user via `onDelta()`. **Accumulates** tool_call deltas instead of aborting. Returns `{ content, tool_calls, error }`. |
+| `nonStreamChat()` | REST transport. Single POST, returns `{ content, tool_calls, error }`. |
+| Tool execution | Runs all tool_calls in parallel, appends results as `tool` messages, loops back. |
+
+**Key design decisions:**
+- **No recursive fallbacks.** Old code had stream → detect tool → abort → call non-stream (which lost the stream context). Now both paths return the same struct and feed into the same loop.
+- **Tool call accumulation.** SSE sends tool_calls in delta chunks (name in one chunk, arguments in the next). `streamChat()` accumulates these across chunks and builds a complete `tool_calls` array — same as the non-streaming response.
+- **Max 10 turns safety net.** If the AI keeps calling tools (infinite loop), after 10 turns a final non-streaming call without tools forces a text answer.
+- **Streaming is transport, not logic.** Whether tokens stream to the user or not, the tool execution loop is identical.
+
+**Before Intelligence Routing:**
+```
+Stream starts → detects tool_call → ABORT stream → recursive non-stream call
+→ non-stream returns raw tool output as response → NO final answer from AI
+→ user sees raw bash output or silence
+```
+
+**After Intelligence Routing:**
+```
+Stream starts → detects tool_call → accumulates full tool_call → executes tool
+→ feeds result back to AI in same conversation → AI synthesizes final answer
+→ streams final answer to user → done
+```
+
 ### Streaming & Formatting
 - **⚡ Real-time SSE Streaming**: Token-by-token delivery via `StreamConsumer` — adapted from [Hermes Agent's GatewayStreamConsumer](https://github.com/nousresearch/hermes-agent)
  - Queued token buffer → rate-limited `editMessageText` loop (1s base interval)
@@ -388,7 +447,9 @@ Z.AI API (SSE)
 | Self-learning / curiosity | ✅ Pattern detector + auto-extraction | ✅ Knowledge + memory tools | ❌ None |
 | Memory-injected prompts | ✅ Every conversation uses past lessons | ✅ Memory injected | ❌ None |
 | **Streaming** | | | |
+| Intelligence Routing | ✅ Unified agentic loop (stream+non-stream) | ⚠️ Separate stream/non-stream paths | ❌ None |
 | Real-time SSE streaming | ✅ StreamConsumer (edit-in-place) | ✅ GatewayStreamConsumer | ❌ None |
+| Tool call accumulation | ✅ Delta accumulation from SSE chunks | ⚠️ Abort on tool detection | ❌ None |
 | Telegram HTML formatting | ✅ markdownToHtml + fallback | ✅ Native HTML support | ❌ None |
 | Adaptive flood control | ✅ Exponential backoff | ✅ Flood backoff | ❌ N/A |
 | **Tooling** | | | |
@@ -411,7 +472,7 @@ Z.AI API (SSE)

 ### Summary

- **zCode CLI X** — Lightweight agentic coder focused on Telegram + Z.AI. Real-time SSE streaming, self-correction loops, persistent self-learning memory with curiosity engine, RTK optimization, and beautiful HTML formatting. Gets smarter with every conversation. Ideal for quick coding tasks via Telegram.
+- **zCode CLI X** — Lightweight agentic coder focused on Telegram + Z.AI. **Intelligence Routing** — a unified agentic loop that handles streaming and non-streaming through one execution path with tool call accumulation from SSE deltas. Real-time SSE streaming, self-correction loops, persistent self-learning memory with curiosity engine, RTK optimization, and beautiful HTML formatting. Gets smarter with every conversation. Ideal for quick coding tasks via Telegram.
 - **Hermes Agent** — Full-stack AI assistant platform. Best for complex multi-agent workflows, scheduled automation, and cross-platform deployment. 500+ skills, MCP ecosystem, deepest toolset.
 - **better-clawd** — Minimal Claude Code clone. Useful as a lightweight reference but lacks agentic depth.