docs: add Intelligence Routing section to README
63
README.md
@@ -58,6 +58,65 @@ User message + AI response
| `/recall <query>` | Search memories by keyword |
| `/forget <id>` | Delete a specific memory |

### 🧠 Intelligence Routing

Intelligence Routing is the core of zCode CLI X's reliability: a unified agentic loop that handles streaming and non-streaming responses through the same execution path. It replaces the earlier split paths, which could lose context or hang silently.

```
User Message
     │
     ▼
┌─────────────────────────────────────────┐
│  chatWithAI(): Main Loop                │
│  ┌───────────────────────────────────┐  │
│  │ Call API (stream or non-stream)   │  │
│  └─────────────┬─────────────────────┘  │
│                │                        │
│         ┌──────┴──────┐                 │
│         ▼             ▼                 │
│    tool_calls?  text content            │
│         │             │                 │
│    ┌────┴────┐  return answer           │
│    ▼         ▼                          │
│ Execute  Feed results                   │
│  tools   back to AI                     │
│    │         │                          │
│    └────┬────┘                          │
│         ▼                               │
│  Append to messages → loop back         │
│  (max 10 turns, forced final answer)    │
└─────────────────────────────────────────┘
```

**How it works:**

| Component | Role |
|---|---|
| `chatWithAI()` | Single entry point. A `while` loop (max 10 turns) that calls the API, executes any requested tools, and loops back. |
| `streamChat()` | SSE transport. Streams tokens to the user via `onDelta()`. **Accumulates** `tool_calls` deltas instead of aborting. Returns `{ content, tool_calls, error }`. |
| `nonStreamChat()` | REST transport. Single POST; returns `{ content, tool_calls, error }`. |
| Tool execution | Runs all `tool_calls` in parallel, appends the results as `tool` messages, and loops back. |
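
The component roles above can be sketched as a single loop. This is a minimal TypeScript sketch under stated assumptions, not the project's actual code: the `Message`, `ToolCall`, and `ChatResult` types, the `Transport` and `ToolRunner` signatures, and the injected `finalTransport` parameter are hypothetical. Only the `{ content, tool_calls, error }` shape, the parallel tool execution, and the 10-turn limit come from the table.

```typescript
// Hypothetical types: the README only names the { content, tool_calls, error } shape.
type ToolCall = { id: string; name: string; arguments: string };
type ChatResult = { content: string; tool_calls: ToolCall[]; error?: string };
type Message = {
  role: "user" | "assistant" | "tool";
  content: string;
  tool_calls?: ToolCall[];
  tool_call_id?: string;
};

// A transport stands in for streamChat() or nonStreamChat(); both return the same struct.
type Transport = (messages: Message[], allowTools: boolean) => Promise<ChatResult>;
type ToolRunner = (call: ToolCall) => Promise<string>;

const MAX_TURNS = 10;

async function chatWithAI(
  userText: string,
  transport: Transport,
  finalTransport: Transport, // assumed: non-streaming call used for the forced final answer
  runTool: ToolRunner,
): Promise<string> {
  const messages: Message[] = [{ role: "user", content: userText }];

  for (let turn = 0; turn < MAX_TURNS; turn++) {
    const result = await transport(messages, true);
    if (result.error) throw new Error(result.error);

    // No tool calls: the text content is the final answer.
    if (result.tool_calls.length === 0) return result.content;

    // Record the assistant turn, then run every requested tool in parallel.
    messages.push({ role: "assistant", content: result.content, tool_calls: result.tool_calls });
    const toolMessages = await Promise.all(
      result.tool_calls.map(async (call): Promise<Message> => ({
        role: "tool",
        content: await runTool(call),
        tool_call_id: call.id,
      })),
    );
    messages.push(...toolMessages);
  }

  // Turn limit reached: one last call without tools forces a text answer.
  const forced = await finalTransport(messages, false);
  if (forced.error) throw new Error(forced.error);
  return forced.content;
}
```

Because the transport is passed in as a parameter, the loop body does not change when streaming is switched on or off; only the function supplied as `transport` differs.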

**Key design decisions:**

- **No recursive fallbacks.** The old code started a stream, detected a tool call, aborted, and then called the non-streaming path, which lost the stream context. Now both transports return the same struct and feed into the same loop.
- **Tool call accumulation.** SSE delivers `tool_calls` as delta chunks (for example, the name in one chunk and the arguments in the next). `streamChat()` accumulates these across chunks and builds a complete `tool_calls` array, the same shape the non-streaming response returns.
- **Max 10 turns safety net.** If the AI keeps calling tools (a potential infinite loop), a final non-streaming call without tools forces a text answer after 10 turns.
- **Streaming is transport, not logic.** Whether or not tokens stream to the user, the tool execution loop is identical.
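
The tool call accumulation described in the list above could look roughly like the following. It is a sketch, not the actual `streamChat()` code: the `ToolCallDelta` shape is an assumption modeled on OpenAI-style streaming deltas (the real Z.AI payload may differ), and `accumulateToolCalls` is a hypothetical helper name.

```typescript
// Assumed chunk shape (OpenAI-style); the real SSE payload may differ.
type ToolCallDelta = {
  index: number;
  id?: string;
  function?: { name?: string; arguments?: string };
};
type ToolCall = { id: string; name: string; arguments: string };

// Hypothetical helper: merges partial deltas into complete tool calls.
function accumulateToolCalls(deltas: ToolCallDelta[]): ToolCall[] {
  const byIndex = new Map<number, ToolCall>();
  for (const delta of deltas) {
    let call = byIndex.get(delta.index);
    if (call === undefined) {
      call = { id: "", name: "", arguments: "" };
      byIndex.set(delta.index, call);
    }
    if (delta.id !== undefined) call.id = delta.id;
    // Name and argument fragments arrive in pieces; concatenate in arrival order.
    if (delta.function?.name !== undefined) call.name += delta.function.name;
    if (delta.function?.arguments !== undefined) call.arguments += delta.function.arguments;
  }
  // Return the completed calls ordered by their stream index.
  return Array.from(byIndex.entries())
    .sort((a, b) => a[0] - b[0])
    .map((entry) => entry[1]);
}

// Usage: the name arrives first, then the JSON arguments split across two chunks.
const chunks: ToolCallDelta[] = [
  { index: 0, id: "call_1", function: { name: "bash" } },
  { index: 0, function: { arguments: '{"cmd":' } },
  { index: 0, function: { arguments: '"ls"}' } },
];
const toolCalls = accumulateToolCalls(chunks);
console.log(toolCalls);
```

Argument fragments are only parsed as JSON once the stream has finished, since an individual chunk is usually not valid JSON on its own.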

**Before Intelligence Routing:**

```
Stream starts → detects tool_call → ABORT stream → recursive non-stream call
  → non-stream returns raw tool output as response → NO final answer from AI
  → user sees raw bash output or silence
```

**After Intelligence Routing:**

```
Stream starts → detects tool_call → accumulates full tool_call → executes tool
  → feeds result back to AI in same conversation → AI synthesizes final answer
  → streams final answer to user → done
```

### Streaming & Formatting

- **⚡ Real-time SSE Streaming**: Token-by-token delivery via `StreamConsumer`, adapted from [Hermes Agent's GatewayStreamConsumer](https://github.com/nousresearch/hermes-agent)
  - Queued token buffer → rate-limited `editMessageText` loop (1s base interval)
@@ -388,7 +447,9 @@ Z.AI API (SSE)
| Self-learning / curiosity | ✅ Pattern detector + auto-extraction | ✅ Knowledge + memory tools | ❌ None |
| Memory-injected prompts | ✅ Every conversation uses past lessons | ✅ Memory injected | ❌ None |
| **Streaming** | | | |
| Intelligence Routing | ✅ Unified agentic loop (stream+non-stream) | ⚠️ Separate stream/non-stream paths | ❌ None |
| Real-time SSE streaming | ✅ StreamConsumer (edit-in-place) | ✅ GatewayStreamConsumer | ❌ None |
| Tool call accumulation | ✅ Delta accumulation from SSE chunks | ⚠️ Abort on tool detection | ❌ None |
| Telegram HTML formatting | ✅ markdownToHtml + fallback | ✅ Native HTML support | ❌ None |
| Adaptive flood control | ✅ Exponential backoff | ✅ Flood backoff | ❌ N/A |
| **Tooling** | | | |
@@ -411,7 +472,7 @@ Z.AI API (SSE)

### Summary

- **zCode CLI X** — Lightweight agentic coder focused on Telegram + Z.AI. Real-time SSE streaming, self-correction loops, persistent self-learning memory with curiosity engine, RTK optimization, and beautiful HTML formatting. Gets smarter with every conversation. Ideal for quick coding tasks via Telegram.
- **zCode CLI X** — Lightweight agentic coder focused on Telegram + Z.AI. **Intelligence Routing** — a unified agentic loop that handles streaming and non-streaming through one execution path with tool call accumulation from SSE deltas. Real-time SSE streaming, self-correction loops, persistent self-learning memory with curiosity engine, RTK optimization, and beautiful HTML formatting. Gets smarter with every conversation. Ideal for quick coding tasks via Telegram.
- **Hermes Agent** — Full-stack AI assistant platform. Best for complex multi-agent workflows, scheduled automation, and cross-platform deployment. 500+ skills, MCP ecosystem, deepest toolset.
- **better-clawd** — Minimal Claude Code clone. Useful as a lightweight reference but lacks agentic depth.