feat: Add complete Agentic Compaction & Pipeline System

- Context Compaction System with token counting and summarization
- Deterministic State Machine for flow control (no LLM decisions)
- Parallel Execution Engine (up to 12 concurrent sessions)
- Event-Driven Coordination via Event Bus
- Agent Workspace Isolation (tools, memory, identity, files)
- YAML Workflow Integration (OpenClaw/Lobster compatible)
- Claude Code integration layer
- Complete demo UI with real-time visualization
- Comprehensive documentation and README

Components:
- agent-system/: Context management, token counting, subagent spawning
- pipeline-system/: State machine, parallel executor, event bus, workflows
- skills/: AI capabilities (LLM, ASR, TTS, VLM, image generation, etc.)
- src/app/: Next.js demo application

Total: ~100KB of production-ready TypeScript code
Z User
2026-03-03 12:40:47 +00:00
parent 63a8b123c9
commit 2380d33861
152 changed files with 51569 additions and 817 deletions


@@ -0,0 +1,21 @@
MIT License
Copyright (c) 2025 z-ai-web-dev-sdk Skills
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

skills/podcast-generate/SKILL.md Executable file

@@ -0,0 +1,198 @@
---
name: Podcast Generate
description: Generate podcast episodes from user-provided content or by searching the web for specified topics. If user uploads a text file/article, creates a dual-host dialogue podcast (or single-host upon request). If no content is provided, searches the web for information about the user-specified topic and generates a podcast. Duration scales with content size (3-20 minutes, ~240 chars/min). Uses z-ai-web-dev-sdk for LLM script generation and TTS audio synthesis. Outputs both a podcast script (Markdown) and a complete audio file (WAV).
license: MIT
---
# Podcast Generate Skill (TypeScript Version)
Generates a podcast script and audio from user-provided material or web-search results.
This skill is suited for:
- Quickly digesting long-form content and turning it into a podcast
- Presenting knowledge-oriented content as audio
- In-depth discussion and analysis of trending topics
- Searching for up-to-date information and producing a podcast from it
---
## Capabilities
### What this skill does
- **Generate from a file**: takes one source document (txt/md/docx/pdf or other text formats) and produces a dialogue podcast script and audio
- **Generate from web search**: searches the web for the latest information on a user-specified topic and produces a podcast script and audio
- Automatically controls duration based on content length (3–20 minutes)
- Produces a Markdown podcast script (editable by hand)
- Synthesizes high-quality audio with z-ai TTS and joins the segments into the final podcast
### What this skill does not do (yet)
- No mp3 output, subtitles, or timestamps
- No support for three or more podcast roles
- No background music or sound effects
---
## Files and responsibilities
This skill consists of the following files:
- `generate.ts`
  Unified entry point (supports file mode and search mode)
  - **File mode**: reads the user-uploaded text file → generates the podcast
  - **Search mode**: calls the web-search skill to gather material → generates the podcast
  - Uses z-ai-web-dev-sdk for LLM script generation
  - Uses z-ai-web-dev-sdk for TTS audio generation
  - Joins audio segments automatically
  - Emits only the final files
- `readme.md`
  Usage documentation
- `SKILL.md`
  This file; describes the skill's capabilities, boundaries, and conventions
- `package.json`
  Node.js project configuration and dependencies
- `tsconfig.json`
  TypeScript compiler configuration
---
## Input and output conventions
### Input (one of two)
**Option 1: file upload**
- One source file (txt / md / docx / pdf or other text formats)
- Any length; the skill compresses the material to a suitable size
**Option 2: web search**
- The user specifies a search topic
- The web-search skill is invoked automatically to gather relevant content
- Multiple search results are merged into the source material
### Output (exactly 2 files)
- `podcast_script.md`
  The podcast script (Markdown, editable by hand)
- `podcast.wav`
  The final joined podcast audio
**No intermediate files** (such as segments.jsonl or meta.json) are emitted
---
## Running
### Requirements
- Node.js 18+
- z-ai-web-dev-sdk (installed)
- web-search skill (for the web-search mode)
The z-ai CLI is **not** required.
### Install dependencies
```bash
npm install
```
---
## Usage examples
### Generate a podcast from a file
```bash
npm run generate -- --input=test_data/material.txt --out_dir=out
```
### Generate a podcast from web search
```bash
# Search on a topic and generate a podcast
npm run generate -- --topic="最新AI技术突破" --out_dir=out
# Specify the search topic and duration
npm run generate -- --topic="量子计算应用场景" --out_dir=out --duration=8
# Search and generate a single-host podcast
npm run generate -- --topic="气候变化影响" --out_dir=out --mode=single-male
```
---
## Parameters
| Parameter | Description | Default |
|------|------|--------|
| `--input` | Path to the source file (mutually exclusive with --topic) | - |
| `--topic` | Search topic keywords (mutually exclusive with --input) | - |
| `--out_dir` | Output directory (required) | - |
| `--mode` | Podcast mode: dual / single-male / single-female | dual |
| `--duration` | Duration in minutes (3–20; 0 = auto) | 0 |
| `--host_name` | Host name | 小谱 |
| `--guest_name` | Guest name | 锤锤 |
| `--voice_host` | Host voice | xiaochen |
| `--voice_guest` | Guest voice | chuichui |
| `--speed` | Speech speed (0.5–2.0) | 1.0 |
| `--pause_ms` | Pause between segments (ms) | 200 |
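The mapping from `--duration` to the script's character budget can be sketched in TypeScript, mirroring the constants `generate.ts` uses (~240 characters per minute with a ±15% tolerance band):

```typescript
// Character budget for a target duration: target = minutes * charsPerMin,
// accepted when the script lands within a ±15% band around the target.
function charBudget(
  durationMin: number,
  charsPerMin = 240,
  tolerance = 0.15
): [target: number, low: number, high: number] {
  const target = durationMin * charsPerMin;
  return [
    target,
    Math.floor(target * (1 - tolerance)), // minimum acceptable length
    Math.ceil(target * (1 + tolerance)),  // maximum acceptable length
  ];
}

// An 8-minute episode targets ~1920 characters (1632-2208 accepted).
const [target, low, high] = charBudget(8);
```

Note that the actual validator counts non-whitespace characters only.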
---
## Available voices
| Voice | Character |
|------|------|
| xiaochen | Calm and professional |
| chuichui | Lively and cute |
| tongtong | Warm and friendly |
| jam | British-accented gentleman |
| kazi | Clear and standard |
| douji | Natural and fluent |
| luodo | Expressive |
---
## Technical architecture
### generate.ts (unified entry point)
- **File mode**: reads the user-uploaded file → generates the podcast
- **Search mode**: calls the web-search skill → gathers material → generates the podcast
- **LLM**: uses `z-ai-web-dev-sdk` (`chat.completions.create`)
- **TTS**: uses `z-ai-web-dev-sdk` (`audio.tts.create`)
- The z-ai CLI is **not** required
- Joins audio segments automatically
- Emits only the final files; intermediate files are cleaned up automatically
### LLM call
- System prompt: podcast scriptwriter persona
- User prompt: source material + hard constraints + pacing ("breathing room") requirements
- Output validation: character count, structure, speaker tags
- Automatic retry: up to 3 attempts
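The validate-and-retry behaviour above can be sketched as follows, with a synchronous stand-in for the model call (the real implementation awaits the SDK and feeds the failure reasons back into the next prompt as a correction hint):

```typescript
// Generic validate-and-retry loop: call the generator, validate the
// result, and retry with a correction hint built from the failures.
type Validate = (script: string) => string[]; // empty array = valid

function generateWithRetry(
  call: (hint: string) => string, // stand-in for the async LLM call
  validate: Validate,
  maxAttempts = 3
): string {
  let hint = '';
  let last = '';
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    last = call(hint);
    const reasons = validate(last);
    if (reasons.length === 0) return last; // passed validation
    hint = 'Please fix: ' + reasons.join('; '); // fed into the next prompt
  }
  return last; // best effort after the final attempt
}
```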
### TTS call
- Uses `zai.audio.tts.create()`
- Configurable voice and speed
- Joins multiple wav segments automatically
- Temporary files are cleaned up automatically
---
## Output example
### podcast_script.md (excerpt)
```markdown
**小谱**:大家好,欢迎收听今天的播客。今天我们来聊一个有趣的话题……
**锤锤**:是啊,这个话题真的很有意思。我最近也在关注……
**小谱**:说到这里,我想给大家举个例子……
```
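The `**Name**:` turn format shown above is what `generate.ts` splits into per-speaker segments before TTS; a minimal sketch of that parsing (continuation lines attach to the previous turn):

```typescript
// Parse a "**Name**: text" script into per-speaker segments.
interface Segment {
  speaker: string;
  text: string;
}

function parseScript(script: string, names: string[]): Segment[] {
  const segments: Segment[] = [];
  for (const raw of script.split('\n')) {
    const line = raw.trim();
    if (!line) continue;
    const name = names.find((n) => line.startsWith(`**${n}**`));
    if (name) {
      // New turn: strip the speaker tag and the colon that follows it.
      const text = line.slice(`**${name}**`.length).replace(/^[::]\s*/, '').trim();
      segments.push({ speaker: name, text });
    } else if (segments.length > 0) {
      // Continuation line: append to the previous turn.
      segments[segments.length - 1].text += ' ' + line;
    }
  }
  return segments;
}

const demo = parseScript('**小谱**:大家好\n**锤锤**:你好,很高兴来聊', ['小谱', '锤锤']);
// demo[0] is the host turn, demo[1] the guest turn
```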
---
## License
MIT


@@ -0,0 +1,661 @@
#!/usr/bin/env tsx
/**
 * generate.ts - unified entry point (pure SDK version)
 * source material -> podcast_script.md + podcast.wav
 *
 * Uses only z-ai-web-dev-sdk; does not depend on the z-ai CLI
 *
 * Usage:
 *   tsx generate.ts --input=material.txt --out_dir=out
 *   tsx generate.ts --input=material.md --out_dir=out --duration=5
 */
import ZAI from 'z-ai-web-dev-sdk';
import fs from 'fs';
import path from 'path';
import { fileURLToPath } from 'url';
import os from 'os';
const __filename = fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);
// -----------------------------
// Types
// -----------------------------
interface GenConfig {
mode: 'dual' | 'single-male' | 'single-female';
temperature: number;
durationManual: number;
charsPerMin: number;
hostName: string;
guestName: string;
audience: string;
tone: string;
maxAttempts: number;
timeoutSec: number;
voiceHost: string;
voiceGuest: string;
speed: number;
pauseMs: number;
}
interface Segment {
idx: number;
speaker: 'host' | 'guest';
name: string;
text: string;
}
// -----------------------------
// Config
// -----------------------------
const DEFAULT_CONFIG: GenConfig = {
mode: 'dual',
temperature: 0.9,
durationManual: 0,
charsPerMin: 240,
hostName: '小谱',
guestName: '锤锤',
audience: '白领小白',
tone: '轻松但有信息密度',
maxAttempts: 3,
timeoutSec: 300,
voiceHost: 'xiaochen',
voiceGuest: 'chuichui',
speed: 1.0,
pauseMs: 200,
};
const DURATION_RANGE_LOW = 3;
const DURATION_RANGE_HIGH = 20;
const BUDGET_TOLERANCE = 0.15;
// -----------------------------
// Functions
// -----------------------------
function parseArgs(): { [key: string]: any } {
  const args = process.argv.slice(2);
  const result: { [key: string]: any } = {};
  for (let i = 0; i < args.length; i++) {
    const arg = args[i];
    if (arg.startsWith('--')) {
      const key = arg.slice(2);
      const eq = key.indexOf('=');
      if (eq !== -1) {
        // Split on the first '=' only, so values containing '=' survive intact
        result[key.slice(0, eq)] = key.slice(eq + 1);
      } else if (i + 1 < args.length && !args[i + 1].startsWith('--')) {
        result[key] = args[i + 1];
        i++;
      } else {
        result[key] = true;
      }
    }
  }
  return result;
}
function readText(filePath: string): string {
let content = fs.readFileSync(filePath, 'utf-8');
content = content.replace(/\r\n/g, '\n');
content = content.replace(/\n{3,}/g, '\n\n');
content = content.replace(/[ \t]{2,}/g, ' ');
content = content.replace(/-\n/g, '');
return content.trim();
}
function countNonWsChars(text: string): number {
return text.replace(/\s+/g, '').length;
}
function chooseDurationMinutes(inputChars: number, low: number = DURATION_RANGE_LOW, high: number = DURATION_RANGE_HIGH): number {
const estimated = Math.max(low, Math.min(high, Math.floor(inputChars / 1000)));
return estimated;
}
function charBudget(durationMin: number, charsPerMin: number, tolerance: number): [number, number, number] {
const target = durationMin * charsPerMin;
const low = Math.floor(target * (1 - tolerance));
const high = Math.ceil(target * (1 + tolerance));
return [target, low, high];
}
function buildPrompts(
material: string,
cfg: GenConfig,
durationMin: number,
budgetTarget: number,
budgetLow: number,
budgetHigh: number,
attemptHint: string = ''
): [string, string] {
let system: string;
let user: string;
if (cfg.mode === 'dual') {
system = (
`你是一个播客脚本编剧,擅长把资料提炼成双人对谈播客。` +
`角色固定为男主持「${cfg.hostName}」与女嘉宾「${cfg.guestName}」。` +
`你写作口播化、信息密度适中、有呼吸感、节奏自然。` +
`你必须严格遵守输出格式与字数预算。`
);
const hintBlock = attemptHint ? `\n【上一次生成纠偏提示】\n${attemptHint}\n` : '';
user = `请把下面【资料】改写为中文播客脚本,形式为双人对谈(男主持 ${cfg.hostName} + 女嘉宾 ${cfg.guestName})。
时长目标:${durationMin} 分钟。
【硬性约束】
1) 总字数必须在 ${budgetLow}${budgetHigh} 字之间(目标约 ${budgetTarget} 字)。
2) 严格使用轮次交替输出:每段必须以"**${cfg.hostName}**"或"**${cfg.guestName}**"开头。
3) 必须包含完整的叙事结构(但不要在对话中写出结构标签):
- 开场Hook 引入 + 本期主题介绍
- 主体3个不同维度的内容用自然过渡语连接
- 总结:回顾要点 + 行动建议1句话明确可执行
4) 不要在对话中写"核心点1"、"第一点"等结构标签,用自然的过渡语如"说到这个"、"还有个有趣的事"、"另外"等
5) 不要照念原文,不要大段引用;要用口播化表达。
6) 受众:${cfg.audience}
7) 风格:${cfg.tone}
【呼吸感与自然对话 - 重要!】
为了营造真实播客的呼吸感,请:
1) 适度加入语气词和感叹词:嗯、哦、啊、对、没错、哈哈、哇、天呐、啧啧等
2) 多用互动式表达:"你说得对"、"这就很有意思了"、"等等,让我想想"、"我懂你的意思"
3) 适当加入思考和停顿的暗示:"这个问题嘛..."、"怎么说呢..."、"其实..."
4) 避免过于密集的信息输出每段控制在3-5句话给听众消化时间
5) 用类比和生活化的例子来解释复杂概念
6) 两人之间要有自然的呼应和追问,而不是各说各话
7) 不同主题之间用自然过渡语连接,不要出现"核心点1/2/3"等标签
【输出格式示例】
**${cfg.hostName}**:开场……
**${cfg.guestName}**:回应……
(一直交替到结束)
${hintBlock}
【资料】
${material}
`;
} else {
const speakerName = cfg.mode === 'single-male' ? cfg.hostName : cfg.guestName;
const gender = cfg.mode === 'single-male' ? '男性' : '女性';
system = (
`你是一个${gender}单人播客主播,名字叫「${speakerName}」。` +
`你擅长把资料提炼成单人独白式播客,像讲课、读书分享、知识科普一样。` +
`你写作口播化、信息密度适中、有呼吸感、节奏自然。` +
`你必须严格遵守输出格式与字数预算。`
);
const hintBlock = attemptHint ? `\n【上一次生成纠偏提示】\n${attemptHint}\n` : '';
user = `请把下面【资料】改写为中文单人播客脚本,形式为独白式讲述(主播:${speakerName})。
时长目标:${durationMin} 分钟。
【硬性约束】
1) 总字数必须在 ${budgetLow}${budgetHigh} 字之间(目标约 ${budgetTarget} 字)。
2) 所有内容均由「${speakerName}」一人讲述,每段都以"**${speakerName}**"开头。
3) 必须包含完整的叙事结构(但不要在对话中写出结构标签):
- 开场Hook 引入 + 本期主题介绍
- 主体3个不同维度的内容用自然过渡语连接
- 总结:回顾要点 + 行动建议1句话明确可执行
4) 不要在对话中写"核心点1"、"第一点"等结构标签,用自然的过渡语如"说到这个"、"还有个有趣的事"、"另外"等
5) 不要照念原文,不要大段引用;要用口播化表达。
6) 受众:${cfg.audience}
7) 风格:${cfg.tone}
【单人播客的呼吸感 - 重要!】
为了营造自然的单人播客呼吸感,请:
1) 适度加入语气词和感叹词:嗯、哦、啊、对、没错、哈哈、哇、天呐、啧啧等
2) 多用自问自答式表达:"你可能会问...答案是..."、"这是为什么呢?让我来解释..."
3) 适当加入思考和停顿的暗示:"这个问题嘛..."、"怎么说呢..."、"其实..."
4) 避免过于密集的信息输出每段控制在3-5句话给听众消化时间
5) 用类比和生活化的例子来解释复杂概念
6) 像在和朋友聊天一样,而不是在念课文
【输出格式示例】
**${speakerName}**:开场,大家好,我是${speakerName},今天我们来聊……
**${speakerName}**:说到这个,最近有个特别有意思的事……
(所有内容都由${speakerName}讲述,分段输出)
${hintBlock}
【资料】
${material}
`;
}
return [system, user];
}
async function callZAI(
  systemPrompt: string,
  userPrompt: string,
  temperature: number
): Promise<string> {
  const zai = await ZAI.create();
  const completion = await zai.chat.completions.create({
    messages: [
      // The system prompt must be sent with role 'system', not 'assistant'
      { role: 'system', content: systemPrompt },
      { role: 'user', content: userPrompt },
    ],
    temperature, // was accepted but never forwarded to the API
    thinking: { type: 'disabled' },
  });
  const content = completion.choices[0]?.message?.content || '';
  return content;
}
function scriptToSegments(script: string, hostName: string, guestName: string): Segment[] {
const segments: Segment[] = [];
const lines = script.split('\n');
let current: Segment | null = null;
let idx = 0;
const hostPrefix = `**${hostName}**`;
const guestPrefix = `**${guestName}**`;
for (const rawLine of lines) {
const line = rawLine.trim();
if (!line) continue;
if (line.startsWith(hostPrefix)) {
idx++;
current = {
idx,
speaker: 'host',
name: hostName,
text: line.slice(hostPrefix.length).trim(),
};
segments.push(current);
} else if (line.startsWith(guestPrefix)) {
idx++;
current = {
idx,
speaker: 'guest',
name: guestName,
text: line.slice(guestPrefix.length).trim(),
};
segments.push(current);
} else {
if (current) {
current.text = (current.text + ' ' + line).trim();
}
}
}
return segments;
}
function validateScript(
script: string,
cfg: GenConfig,
budgetLow: number,
budgetHigh: number
): [boolean, string[]] {
const reasons: string[] = [];
if (cfg.mode === 'dual') {
const hostTag = `**${cfg.hostName}**`;
const guestTag = `**${cfg.guestName}**`;
if (!script.includes(hostTag)) reasons.push(`缺少主持人标识:${hostTag}`);
if (!script.includes(guestTag)) reasons.push(`缺少嘉宾标识:${guestTag}`);
const turns = script.split('\n').filter(line =>
line.startsWith(hostTag) || line.startsWith(guestTag)
);
if (turns.length < 8) reasons.push('对谈轮次过少:建议至少 8 轮');
} else {
const speakerName = cfg.mode === 'single-male' ? cfg.hostName : cfg.guestName;
const speakerTag = `**${speakerName}**`;
if (!script.includes(speakerTag)) reasons.push(`缺少主播标识:${speakerTag}`);
const turns = script.split('\n').filter(line => line.startsWith(speakerTag));
if (turns.length < 5) reasons.push('播客段数过少:建议至少 5 段');
}
const n = countNonWsChars(script);
if (n < budgetLow || n > budgetHigh) {
reasons.push(`字数不在预算:当前约 ${n} 字,预算 ${budgetLow}-${budgetHigh}`);
}
// Only check for the opening and the summary; "key point 1/2/3" labels are not checked (they must not appear in the dialogue)
const mustHave = ['开场', '总结'];
for (const kw of mustHave) {
if (!script.includes(kw)) {
reasons.push(`缺少结构要素:${kw}(请在对话中自然引入)`);
}
}
// Check that there are enough dialogue turns (the content should cover multiple themes)
const lineCount = script.split('\n').filter(l => l.trim()).length;
if (lineCount < 10) {
reasons.push('对话轮次过少建议至少10段对话');
}
return [reasons.length === 0, reasons];
}
function makeRetryHint(reasons: string[], cfg: GenConfig, budgetLow: number, budgetHigh: number): string {
const lines = ['请严格修复以下问题后重新生成:'];
for (const r of reasons) lines.push(`- ${r}`);
lines.push(`- 总字数必须在 ${budgetLow}-${budgetHigh} 之间。`);
if (cfg.mode === 'dual') {
lines.push(`- 每段必须以"**${cfg.hostName}**"或"**${cfg.guestName}**"开头。`);
} else {
const speakerName = cfg.mode === 'single-male' ? cfg.hostName : cfg.guestName;
lines.push(`- 所有内容都由一人讲述,每段必须以"**${speakerName}**"开头。`);
}
lines.push('- 必须包含开场和总结,中间用自然过渡语连接不同主题,不要出现"核心点1/2/3"等标签。');
return lines.join('\n');
}
async function ttsRequest(
zai: any,
text: string,
voice: string,
speed: number
): Promise<Buffer> {
const response = await zai.audio.tts.create({
input: text,
voice: voice,
speed: speed,
response_format: 'wav',
stream: false,
});
const arrayBuffer = await response.arrayBuffer();
const buffer = Buffer.from(new Uint8Array(arrayBuffer));
return buffer;
}
function ensureSilenceWav(filePath: string, params: { nchannels: number; sampwidth: number; framerate: number }, ms: number): void {
const { nchannels, sampwidth, framerate } = params;
const nframes = Math.floor((framerate * ms) / 1000);
const silenceFrame = Buffer.alloc(sampwidth * nchannels, 0);
const frames = Buffer.alloc(silenceFrame.length * nframes, 0);
const header = Buffer.alloc(44);
header.write('RIFF', 0);
header.writeUInt32LE(36 + frames.length, 4);
header.write('WAVE', 8);
header.write('fmt ', 12);
header.writeUInt32LE(16, 16);
header.writeUInt16LE(1, 20);
header.writeUInt16LE(nchannels, 22);
header.writeUInt32LE(framerate, 24);
header.writeUInt32LE(framerate * nchannels * sampwidth, 28);
header.writeUInt16LE(nchannels * sampwidth, 32);
header.writeUInt16LE(sampwidth * 8, 34);
header.write('data', 36);
header.writeUInt32LE(frames.length, 40);
fs.writeFileSync(filePath, Buffer.concat([header, frames]));
}
function wavParams(filePath: string): { nchannels: number; sampwidth: number; framerate: number } {
const buffer = fs.readFileSync(filePath);
const nchannels = buffer.readUInt16LE(22);
const sampwidth = buffer.readUInt16LE(34) / 8;
const framerate = buffer.readUInt32LE(24);
return { nchannels, sampwidth, framerate };
}
function joinWavsWave(outPath: string, wavPaths: string[], pauseMs: number): void {
if (wavPaths.length === 0) throw new Error('No wav files to join.');
const ref = wavPaths[0];
const refParams = wavParams(ref);
const silencePath = path.join(os.tmpdir(), `_silence_${Date.now()}.wav`);
if (pauseMs > 0) ensureSilenceWav(silencePath, refParams, pauseMs);
const chunks: Buffer[] = [];
for (let i = 0; i < wavPaths.length; i++) {
const wavPath = wavPaths[i];
const buffer = fs.readFileSync(wavPath);
const dataStart = buffer.indexOf('data') + 8;
const data = buffer.subarray(dataStart);
const params = wavParams(wavPath);
if (params.nchannels !== refParams.nchannels ||
params.sampwidth !== refParams.sampwidth ||
params.framerate !== refParams.framerate) {
throw new Error(`WAV params mismatch: ${wavPath}`);
}
chunks.push(data);
if (pauseMs > 0 && i < wavPaths.length - 1) {
const silenceBuffer = fs.readFileSync(silencePath);
const silenceData = silenceBuffer.subarray(silenceBuffer.indexOf('data') + 8);
chunks.push(silenceData);
}
}
const totalDataSize = chunks.reduce((sum, buf) => sum + buf.length, 0);
const header = Buffer.alloc(44);
header.write('RIFF', 0);
header.writeUInt32LE(36 + totalDataSize, 4);
header.write('WAVE', 8);
header.write('fmt ', 12);
header.writeUInt32LE(16, 16);
header.writeUInt16LE(1, 20);
header.writeUInt16LE(refParams.nchannels, 22);
header.writeUInt32LE(refParams.framerate, 24);
header.writeUInt32LE(refParams.framerate * refParams.nchannels * refParams.sampwidth, 28);
header.writeUInt16LE(refParams.nchannels * refParams.sampwidth, 32);
header.writeUInt16LE(refParams.sampwidth * 8, 34);
header.write('data', 36);
header.writeUInt32LE(totalDataSize, 40);
const output = Buffer.concat([header, ...chunks]);
fs.writeFileSync(outPath, output);
if (fs.existsSync(silencePath)) fs.unlinkSync(silencePath);
}
// -----------------------------
// Main
// -----------------------------
async function main() {
const args = parseArgs();
const inputPath = args.input;
const outDir = args.out_dir;
const topic = args.topic;
// Validate arguments: either input or topic must be provided
if ((!inputPath && !topic) || !outDir) {
console.error('Usage: tsx generate.ts --input=<file> --out_dir=<dir>');
console.error(' OR: tsx generate.ts --topic=<search-term> --out_dir=<dir>');
console.error('');
console.error('Examples:');
console.error(' # From file');
console.error(' npm run generate -- --input=article.txt --out_dir=out');
console.error(' # From web search');
console.error(' npm run generate -- --topic="最新AI新闻" --out_dir=out');
process.exit(1);
}
// Merge config
const cfg: GenConfig = {
...DEFAULT_CONFIG,
mode: (args.mode || 'dual') as GenConfig['mode'],
durationManual: parseInt(args.duration || '0'),
hostName: args.host_name || DEFAULT_CONFIG.hostName,
guestName: args.guest_name || DEFAULT_CONFIG.guestName,
voiceHost: args.voice_host || DEFAULT_CONFIG.voiceHost,
voiceGuest: args.voice_guest || DEFAULT_CONFIG.voiceGuest,
speed: parseFloat(args.speed || String(DEFAULT_CONFIG.speed)),
pauseMs: parseInt(args.pause_ms || String(DEFAULT_CONFIG.pauseMs)),
};
// Create output directory
if (!fs.existsSync(outDir)) {
fs.mkdirSync(outDir, { recursive: true });
}
// Gather the source material according to the mode
let material: string;
let inputSource: string;
if (inputPath) {
// Mode 1: read from a file
console.log(`[MODE] Reading from file: ${inputPath}`);
material = readText(inputPath);
inputSource = `file:${inputPath}`;
} else if (topic) {
// Mode 2: web search
console.log(`[MODE] Searching web for topic: ${topic}`);
const zai = await ZAI.create();
const searchResults = await zai.functions.invoke('web_search', {
query: topic,
num: 10
});
if (!Array.isArray(searchResults) || searchResults.length === 0) {
console.error(`未找到关于"${topic}"的搜索结果`);
process.exit(2);
}
console.log(`[SEARCH] Found ${searchResults.length} results`);
// Convert the search results into text material
material = searchResults
.map((r: any, i: number) => `【来源 ${i + 1}${r.name}\n${r.snippet}\n链接${r.url}`)
.join('\n\n');
inputSource = `web_search:${topic}`;
console.log(`[SEARCH] Compiled material (${material.length} chars)`);
} else {
console.error('[ERROR] Neither --input nor --topic provided');
process.exit(1);
}
const inputChars = material.length;
// Calculate duration
let durationMin: number;
if (cfg.durationManual >= 3 && cfg.durationManual <= 20) {
durationMin = cfg.durationManual;
} else {
durationMin = chooseDurationMinutes(inputChars, DURATION_RANGE_LOW, DURATION_RANGE_HIGH);
}
const [target, low, high] = charBudget(durationMin, cfg.charsPerMin, BUDGET_TOLERANCE);
console.log(`[INFO] input_chars=${inputChars} duration=${durationMin}min budget=${low}-${high}`);
let attemptHint = '';
let lastScript: string | null = null;
// Initialize ZAI SDK (reuse for TTS)
const zai = await ZAI.create();
// Generate script
for (let attempt = 1; attempt <= cfg.maxAttempts; attempt++) {
const [systemPrompt, userPrompt] = buildPrompts(
material,
cfg,
durationMin,
target,
low,
high,
attemptHint
);
try {
console.log(`[LLM] Attempt ${attempt}/${cfg.maxAttempts}...`);
const content = await callZAI(systemPrompt, userPrompt, cfg.temperature);
lastScript = content;
const [ok, reasons] = validateScript(content, cfg, low, high);
if (ok) {
break;
}
attemptHint = makeRetryHint(reasons, cfg, low, high);
console.error(`[WARN] Validation failed:`, reasons.join(', '));
} catch (error: any) {
console.error(`[ERROR] LLM call failed: ${error.message}`);
throw error;
}
}
if (!lastScript) {
console.error('[ERROR] 未生成任何脚本输出。');
process.exit(1);
}
// Write script
const scriptPath = path.join(outDir, 'podcast_script.md');
fs.writeFileSync(scriptPath, lastScript, 'utf-8');
console.log(`[DONE] podcast_script.md -> ${scriptPath}`);
// Parse segments
const segments = scriptToSegments(lastScript, cfg.hostName, cfg.guestName);
console.log(`[INFO] Parsed ${segments.length} segments`);
// Generate TTS using SDK
const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'podcast_segments_'));
const produced: string[] = [];
try {
for (let i = 0; i < segments.length; i++) {
const seg = segments[i];
const text = seg.text.trim();
if (!text) continue;
let voice: string;
if (cfg.mode === 'dual') {
voice = seg.speaker === 'host' ? cfg.voiceHost : cfg.voiceGuest;
} else if (cfg.mode === 'single-male') {
voice = cfg.voiceHost;
} else {
voice = cfg.voiceGuest;
}
const wavPath = path.join(tmpDir, `seg_${seg.idx.toString().padStart(4, '0')}.wav`);
console.log(`[TTS] [${i + 1}/${segments.length}] idx=${seg.idx} speaker=${seg.speaker} voice=${voice}`);
const buffer = await ttsRequest(zai, text, voice, cfg.speed);
fs.writeFileSync(wavPath, buffer);
produced.push(wavPath);
}
// Join segments
const podcastPath = path.join(outDir, 'podcast.wav');
console.log(`[JOIN] Joining ${produced.length} wav files -> ${podcastPath}`);
joinWavsWave(podcastPath, produced, cfg.pauseMs);
console.log(`[DONE] podcast.wav -> ${podcastPath}`);
} finally {
// Cleanup temp directory
try {
fs.rmSync(tmpDir, { recursive: true, force: true });
} catch (error: any) {
console.error(`[WARN] Failed to cleanup temp dir: ${error.message}`);
}
}
console.log('\n[FINAL OUTPUT]');
console.log(` 📄 podcast_script.md -> ${scriptPath}`);
console.log(` 🎙️ podcast.wav -> ${path.join(outDir, 'podcast.wav')}`);
}
main().catch(error => {
console.error('[FATAL ERROR]', error);
process.exit(1);
});


@@ -0,0 +1,30 @@
{
"name": "podcast-generate-online",
"version": "1.0.0",
"description": "Generate podcast audio from text using z-ai LLM and TTS",
"type": "module",
"main": "dist/index.js",
"scripts": {
"generate": "tsx generate.ts",
"build": "tsc",
"prepublishOnly": "npm run build"
},
"keywords": [
"podcast",
"tts",
"llm",
"z-ai"
],
"license": "MIT",
"dependencies": {
"z-ai-web-dev-sdk": "*"
},
"devDependencies": {
"@types/node": "^20",
"tsx": "^4.7.0",
"typescript": "^5.3.0"
},
"engines": {
"node": ">=18.0.0"
}
}

skills/podcast-generate/readme.md Executable file

@@ -0,0 +1,177 @@
# Podcast Generate Skill (TypeScript, online version)
Turns one source document into a dialogue podcast, with the duration adjusted automatically to the content length (3–20 minutes at ~240 characters/minute):
- Automatically distills the core content
- Generates an editable podcast script
- Synthesizes the audio with z-ai TTS
This is a TypeScript version built on **z-ai-web-dev-sdk**, intended for online environments.
---
## Quick start
### One-shot generation (script + audio)
```bash
npm run generate -- --input=test_data/material.txt --out_dir=out
```
**Final output:**
- `out/podcast_script.md` - the podcast script (Markdown)
- `out/podcast.wav` - the final podcast audio
---
## Directory layout
```text
podcast-generate/
├── readme.md        # Usage documentation (this file)
├── SKILL.md         # Skill capabilities and interface conventions
├── package.json     # Node.js dependency configuration
├── tsconfig.json    # TypeScript compiler configuration
├── generate.ts      # ⭐ Unified entry point (the only file you need)
└── test_data/
    └── material.txt # Sample input material
```
---
## Requirements
- **Node.js 18+**
- **z-ai-web-dev-sdk** (preinstalled in the environment)
The z-ai CLI is **not** required; this code uses the SDK exclusively.
---
## Install
```bash
npm install
```
---
## Usage
### Option 1: generate from a file
```bash
npm run generate -- --input=material.txt --out_dir=out
```
### Option 2: generate from web search
```bash
npm run generate -- --topic="最新AI新闻" --out_dir=out
npm run generate -- --topic="量子计算应用" --out_dir=out --duration=8
```
### Parameters
| Parameter | Description | Default |
|------|------|--------|
| `--input` | Path to the source file; txt/md/docx/pdf and other text formats (mutually exclusive with --topic) | - |
| `--topic` | Search topic keywords (mutually exclusive with --input) | - |
| `--out_dir` | Output directory (required) | - |
| `--mode` | Podcast mode: dual / single-male / single-female | dual |
| `--duration` | Duration in minutes (3–20; 0 = auto) | 0 |
| `--host_name` | Host name | 小谱 |
| `--guest_name` | Guest name | 锤锤 |
| `--voice_host` | Host voice | xiaochen |
| `--voice_guest` | Guest voice | chuichui |
| `--speed` | Speech speed (0.5–2.0) | 1.0 |
| `--pause_ms` | Pause between segments (ms) | 200 |
---
## Usage examples
### Dual-host dialogue podcast (default)
```bash
npm run generate -- --input=material.txt --out_dir=out
```
### Single-host male-voice podcast
```bash
npm run generate -- --input=material.txt --out_dir=out --mode=single-male
```
### Fixed 5-minute duration
```bash
npm run generate -- --input=material.txt --out_dir=out --duration=5
```
### Custom speaker names
```bash
npm run generate -- --input=material.txt --out_dir=out --host_name=张三 --guest_name=李四
```
### Different voices
```bash
npm run generate -- --input=material.txt --out_dir=out --voice_host=tongtong --voice_guest=douji
```
### Generate from web search
```bash
# Search on a topic and generate a podcast
npm run generate -- --topic="最新AI技术突破" --out_dir=out
# Specify the search topic and duration
npm run generate -- --topic="量子计算应用场景" --out_dir=out --duration=8
# Search and generate a single-host podcast
npm run generate -- --topic="气候变化影响" --out_dir=out --mode=single-male
```
---
## Available voices
| Voice | Character |
|------|------|
| xiaochen | Calm and professional |
| chuichui | Lively and cute |
| tongtong | Warm and friendly |
| jam | British-accented gentleman |
| kazi | Clear and standard |
| douji | Natural and fluent |
| luodo | Expressive |
---
## Technical architecture
### generate.ts (unified entry point)
- **LLM**: uses `z-ai-web-dev-sdk` (`chat.completions.create`)
- **TTS**: uses `z-ai-web-dev-sdk` (`audio.tts.create`)
- The z-ai CLI is **not** required
- Joins audio segments automatically
- Emits only the final files; intermediate files are cleaned up automatically
### LLM call
- System prompt: podcast scriptwriter persona
- User prompt: source material + hard constraints + pacing ("breathing room") requirements
- Output validation: character count, structure, speaker tags
- Automatic retry: up to 3 attempts
### TTS call
- Uses `zai.audio.tts.create()`
- Configurable voice and speed
- Joins multiple wav segments automatically
- Temporary files are cleaned up automatically
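The wav joining mentioned above is plain PCM concatenation behind a rebuilt 44-byte RIFF header; a simplified sketch (the real code additionally checks that every segment shares the same format and inserts a silence gap between segments):

```typescript
// Build a 44-byte PCM WAV header for a given payload size.
function wavHeader(
  dataBytes: number,
  channels = 1,
  sampleRate = 44100,
  bytesPerSample = 2
): Buffer {
  const h = Buffer.alloc(44);
  h.write('RIFF', 0);
  h.writeUInt32LE(36 + dataBytes, 4); // RIFF chunk size
  h.write('WAVE', 8);
  h.write('fmt ', 12);
  h.writeUInt32LE(16, 16);  // fmt chunk size
  h.writeUInt16LE(1, 20);   // audio format: PCM
  h.writeUInt16LE(channels, 22);
  h.writeUInt32LE(sampleRate, 24);
  h.writeUInt32LE(sampleRate * channels * bytesPerSample, 28); // byte rate
  h.writeUInt16LE(channels * bytesPerSample, 32);              // block align
  h.writeUInt16LE(bytesPerSample * 8, 34);                     // bits per sample
  h.write('data', 36);
  h.writeUInt32LE(dataBytes, 40); // data chunk size
  return h;
}

// Concatenate raw PCM chunks under a single fresh header.
function joinPcm(chunks: Buffer[]): Buffer {
  const data = Buffer.concat(chunks);
  return Buffer.concat([wavHeader(data.length), data]);
}
```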
---
## License
MIT


@@ -0,0 +1,3 @@
{"idx": 1, "speaker": "host", "name": "主持人", "text": "大家好,欢迎来到今天的播客节目。"}
{"idx": 2, "speaker": "guest", "name": "嘉宾", "text": "很高兴能参加这次节目。"}
{"idx": 3, "speaker": "host", "name": "主持人", "text": "今天我们要讨论一个非常有意思的话题。"}


@@ -0,0 +1,26 @@
{
"compilerOptions": {
"target": "ES2022",
"module": "ES2022",
"lib": ["ES2022"],
"moduleResolution": "node",
"outDir": "./dist",
"rootDir": "./",
"strict": true,
"esModuleInterop": true,
"skipLibCheck": true,
"forceConsistentCasingInFileNames": true,
"resolveJsonModule": true,
"allowSyntheticDefaultImports": true,
"declaration": true,
"declarationMap": true,
"sourceMap": true
},
"include": [
"*.ts"
],
"exclude": [
"node_modules",
"dist"
]
}