feat: Add GLM Tools, Skills & Agents collection
527
skills/glm-skills/SKILL.md
Normal file
@@ -0,0 +1,527 @@
---
name: glm-skills
description: "Reference for Super Z/GLM platform skills and SDK. AUTO-TRIGGERS when: speech-to-text, ASR, TTS, text-to-speech, image generation, video generation, VLM, vision model, PDF processing, DOCX, XLSX, PPTX, web search, web scraping, podcast generation, multimodal AI, z-ai-web-dev-sdk."
priority: 100
autoTrigger: true
triggers:
  - "speech to text"
  - "ASR"
  - "transcribe"
  - "text to speech"
  - "TTS"
  - "voice"
  - "image generation"
  - "generate image"
  - "video generation"
  - "generate video"
  - "VLM"
  - "vision language"
  - "analyze image"
  - "PDF"
  - "DOCX"
  - "XLSX"
  - "PPTX"
  - "spreadsheet"
  - "presentation"
  - "web search"
  - "web scraping"
  - "podcast"
  - "multimodal"
  - "z-ai-web-dev-sdk"
  - "Super Z"
  - "GLM"
---

# Super Z / GLM Skills & Agents Reference

Complete reference for the Super Z (z.ai) platform's skills system, agents, and development patterns.

---

## SDK: z-ai-web-dev-sdk

All skills use the `z-ai-web-dev-sdk` JavaScript/TypeScript SDK.

### Installation
```bash
npm install z-ai-web-dev-sdk
# or
bun add z-ai-web-dev-sdk
```

### Initialization
```javascript
import ZAI from 'z-ai-web-dev-sdk';

const zai = await ZAI.create();
```

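Because `ZAI.create()` is asynchronous, server code may prefer to create the client once and share it between requests. The following is a minimal sketch of that idea, not an SDK feature: the `once` helper and the stand-in factory are hypothetical, and in a real project the factory would be `() => ZAI.create()`.

```javascript
// Hypothetical helper: caches the promise returned by an async factory so
// that concurrent callers share a single initialization.
function once(factory) {
  let promise = null;
  return () => {
    if (promise === null) {
      promise = Promise.resolve()
        .then(factory)
        .catch((err) => {
          // Reset on failure so a later call can retry.
          promise = null;
          throw err;
        });
    }
    return promise;
  };
}

// Stand-in factory for illustration; replace with `() => ZAI.create()`.
let created = 0;
const getClient = once(async () => ({ id: ++created }));
```
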
---

## Multimodal AI Skills

### ASR (Automatic Speech Recognition)
**Command**: `ASR`

Speech-to-text using z-ai-web-dev-sdk.

```javascript
// Supports base64 encoded audio
const transcription = await zai.asr.transcribe({
  audio: audioBase64
});
```

**Use Cases**: Transcription, voice input, audio processing

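The `audioBase64` value in the snippet is a base64 string. A minimal sketch of producing one in Node.js from raw audio bytes follows; the `toBase64` helper name and the sample bytes are illustrative and not part of the SDK.

```javascript
// Hypothetical helper: encode raw audio bytes (Buffer or Uint8Array) as base64.
function toBase64(bytes) {
  return Buffer.from(bytes).toString('base64');
}

// Illustrative bytes only: the ASCII text "RIFF" that opens a WAV header.
// In practice the bytes would come from reading an audio file.
const audioBase64 = toBase64(Uint8Array.from([0x52, 0x49, 0x46, 0x46]));
```
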
---

### TTS (Text-to-Speech)
**Command**: `TTS`

Convert text to natural-sounding speech.

```javascript
const audio = await zai.tts.synthesize({
  text: "Hello world",
  voice: "default",
  speed: 1.0
});
```

**Features**: Multiple voices, adjustable speed, various audio formats

---

### LLM (Large Language Model)
**Command**: `LLM`

Chat completions with context management.

```javascript
const completion = await zai.chat.completions.create({
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Hello!' }
  ],
  temperature: 0.7
});
```

**Features**: Multi-turn conversations, system prompts, context management

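In a multi-turn conversation, context is typically carried by resending earlier messages in the `messages` array. A minimal sketch of one way to keep that history bounded is shown here; the `appendMessage` helper and its `maxMessages` limit are assumptions for illustration, not SDK features.

```javascript
// Hypothetical helper: append a message, keeping every system message plus
// only the most recent `maxMessages` non-system messages.
function appendMessage(history, message, maxMessages = 6) {
  const system = history.filter((m) => m.role === 'system');
  const rest = history.filter((m) => m.role !== 'system').concat(message);
  return system.concat(rest.slice(-maxMessages));
}

let messages = [{ role: 'system', content: 'You are a helpful assistant.' }];
messages = appendMessage(messages, { role: 'user', content: 'Hello!' });
messages = appendMessage(messages, { role: 'assistant', content: 'Hi! How can I help?' });
// `messages` can now be sent as the `messages` field of the next request.
```
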
---

### VLM (Vision Language Model)
**Command**: `VLM`

Image understanding with conversational AI.

```javascript
const response = await zai.vlm.analyze({
  image: imageUrlOrBase64,
  prompt: "Describe this image"
});
```

**Supports**: Image URLs, base64 encoded images, multimodal interactions

---

### Image Generation
**Command**: `image-generation`

AI image creation from text.

```javascript
const response = await zai.images.generations.create({
  prompt: 'A cute cat playing in the garden',
  size: '1024x1024'
});

const imageBase64 = response.data[0].base64;
```

**CLI Tool**:
```bash
z-ai-generate --prompt "A beautiful landscape" --output "./image.png"
z-ai-generate -p "A cute cat" -o "./cat.png" -s 1024x1024
```

**Supported Sizes**: 1024x1024, 768x1344, 864x1152, 1344x768, 1152x864, 1440x720, 720x1440

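Since only a fixed set of sizes is listed, it can help to map an arbitrary target resolution onto the nearest supported one before calling the API. The sketch below assumes a hypothetical `pickSize` helper; it is not part of the SDK or the CLI.

```javascript
// Sizes copied from the "Supported Sizes" list.
const SUPPORTED_SIZES = [
  '1024x1024', '768x1344', '864x1152', '1344x768',
  '1152x864', '1440x720', '720x1440',
];

// Hypothetical helper: return the supported size whose width/height ratio is
// closest to the requested one. Ratios are compared on a log scale so that
// 2:1 and 1:2 are treated symmetrically.
function pickSize(width, height) {
  const target = Math.log(width / height);
  let best = SUPPORTED_SIZES[0];
  let bestDiff = Infinity;
  for (const size of SUPPORTED_SIZES) {
    const [w, h] = size.split('x').map(Number);
    const diff = Math.abs(Math.log(w / h) - target);
    if (diff < bestDiff) {
      best = size;
      bestDiff = diff;
    }
  }
  return best;
}
```

For example, `pickSize(1920, 1080)` selects `'1344x768'`, the closest listed match to a 16:9 frame.
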
---

### Image Edit
**Command**: `image-edit`

Modify existing images with AI.

```javascript
const edited = await zai.images.edits.create({
  image: originalImageBase64,
  prompt: "Add a sunset background"
});
```

**Use Cases**: Variations, redesign, text-based transformation

---

### Image Understand
**Command**: `image-understand`

Analyze and understand images.

```javascript
const analysis = await zai.image.understand({
  image: imagePath,
  task: "extract_text" // or "detect_objects", "classify"
});
```

**Supports**: PNG, JPEG, GIF, WebP, BMP

---

### Video Generation
**Command**: `video-generation`

AI-powered video creation.

```javascript
const task = await zai.videos.generations.create({
  prompt: "A dog running in a park",
  // or from image
  image: imageBase64
});

// Async status polling
const status = await zai.videos.generations.status(task.id);
const result = await zai.videos.generations.retrieve(task.id);
```

**Features**: Async task management, status polling, result retrieval

---

### Video Understand
**Command**: `video-understand`

Analyze video content.

```javascript
const analysis = await zai.video.analyze({
  video: videoPath,
  prompt: "Describe what happens in this video"
});
```

**Supports**: MP4, AVI, MOV

---

## Document Processing Skills

### PDF
**Command**: `pdf`

Comprehensive PDF toolkit.

**Capabilities**:
- Extract text and tables
- Create new PDFs
- Merge/split documents
- Handle forms

---

### DOCX
**Command**: `docx`

Word document processing.

**Capabilities**:
- Create and edit documents
- Tracked changes
- Comments
- Formatting preservation
- Text extraction

---

### XLSX
**Command**: `xlsx`

Spreadsheet processing.

**Capabilities**:
- Create with formulas and formatting
- Read and analyze data
- Modify while preserving formulas
- Data visualization
- Formula recalculation

**Supports**: .xlsx, .xlsm, .csv, .tsv

---

### PPTX
**Command**: `pptx`

Presentation processing.

**Capabilities**:
- Edit existing presentations
- Add slides
- Create new presentations
- Work with layouts
- Add comments/speaker notes

---

## Web & Data Skills

### Web Search
**Command**: `web-search`

Search for real-time information.

```typescript
const results = await zai.functions.invoke("web_search", {
  query: "What is the capital of France?",
  num: 10
});

// Shape of each result item
interface SearchFunctionResultItem {
  url: string;
  name: string;
  snippet: string;
  host_name: string;
  rank: number;
  date: string;
  favicon: string;
}
```

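Results in this shape are straightforward to post-process. The sketch below assumes the call resolves to an array of `SearchFunctionResultItem` objects (the source shows only the item type) and that a lower `rank` means a better match; the helper name and the sample data are made up for illustration.

```javascript
// Hypothetical helper: keep the best-ranked result per host, ordered by rank.
// Assumes a lower `rank` value means a better result.
function topResultPerHost(results) {
  const byHost = new Map();
  for (const item of results) {
    const seen = byHost.get(item.host_name);
    if (!seen || item.rank < seen.rank) byHost.set(item.host_name, item);
  }
  return [...byHost.values()].sort((a, b) => a.rank - b.rank);
}

// Made-up sample data in the SearchFunctionResultItem shape (fields trimmed).
const sample = [
  { url: 'https://a.example/1', name: 'A1', host_name: 'a.example', rank: 2 },
  { url: 'https://b.example/1', name: 'B1', host_name: 'b.example', rank: 1 },
  { url: 'https://a.example/2', name: 'A2', host_name: 'a.example', rank: 0 },
];
const top = topResultPerHost(sample);
```
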
---

### Web Reader
**Command**: `web-reader`

Extract web page content.

```javascript
const content = await zai.web.read({
  url: "https://example.com"
});
```

**Features**: Automatic content extraction, title, HTML, publication time

---

### CSV Data Summarizer
**Command**: `csv-data-summarizer`

Automatic CSV analysis.

**Features**:
- Detects data types (sales, customer, financial, operational, survey)
- Generates correlation heatmaps
- Time-series plots
- Distribution charts
- Missing data analysis
- Automatic date detection

**Built with**: pandas, matplotlib, seaborn

---

### Deep Research
**Command**: `deep-research`

Enterprise-grade research.

**Triggers**: "deep research", "comprehensive analysis", "research report", "compare X vs Y"

**Features**:
- Multi-source synthesis
- Citation tracking
- Verification
- 10+ sources

---

## Specialized Skills

### Podcast Generate
**Command**: `podcast-generate`

Create podcast episodes.

**Modes**:
1. From uploaded text/article → dual-host dialogue
2. From topic → web search + generation

**Features**:
- Duration scales with content (3-20 min, ~240 chars/min)
- Outputs: Markdown script + WAV audio

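The duration rule can be read as a simple calculation. The sketch below assumes the script length in characters is divided by roughly 240 per minute and the result is clamped to the 3-20 minute range; how the skill actually rounds or clamps is not stated here, so treat the helper as illustrative.

```javascript
const CHARS_PER_MINUTE = 240; // approximate rate quoted in the feature list
const MIN_MINUTES = 3;
const MAX_MINUTES = 20;

// Hypothetical helper: estimate episode length in minutes from script length.
function estimateMinutes(charCount) {
  const raw = charCount / CHARS_PER_MINUTE;
  return Math.min(MAX_MINUTES, Math.max(MIN_MINUTES, raw));
}
```

Under these assumptions a 2,400-character script comes out at 10 minutes, while very short or very long scripts hit the 3 and 20 minute bounds.
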
---

### Story Video Generation
**Command**: `story-video-generation`

Convert a single sentence into a story video.

**Triggers** (Chinese): "生成故事" (generate a story), "故事视频" (story video), "把一句话变成视频" (turn a sentence into a video)

**Process**: Sentence → Story → Scene Images → Video

---

### Frontend Design
**Command**: `frontend-design`

Transform UI requirements to production code.

**Features**:
- Design tokens
- Accessibility compliance
- Creative execution

**Use Cases**: Websites, web apps, React/Vue components, dashboards, landing pages

---

### Finance
**Command**: `finance`

Finance API integration.

**Capabilities**:
- Stock price queries
- Market data analysis
- Company financials
- Portfolio tracking
- Market news
- Stock screening
- Technical analysis

---

### Gift Evaluator
**Command**: `gift-evaluator`

Spring Festival gift analysis.

**Features**:
- Visual perception
- Market valuation
- HTML card generation

**Use Cases**: Gift photos, value inquiry, authenticity, social responses

---

## Subagents (Task Tool)

### Available Agent Types

| Agent | Description | Tools |
|-------|-------------|-------|
| `general-purpose` | Complex research, multi-step tasks | All |
| `statusline-setup` | Status line configuration | Read, Edit |
| `Explore` | Codebase exploration | All |
| `Plan` | Implementation planning | All |
| `frontend-styling-expert` | CSS, styling, responsive design | All |
| `full-stack-developer` | Next.js 15 + React + Prisma | All |

### Explore Agent Thoroughness
- `quick`: Basic searches
- `medium`: Moderate exploration
- `very thorough`: Comprehensive analysis

---

## Project Environment

### Standard Stack
- Next.js 15 with App Router
- Port: 3000
- Package manager: Bun
- Database access: Prisma (ORM)
- UI: shadcn/ui components

### Commands
```bash
bun run dev   # Start dev server (auto-runs)
bun run lint  # Check code quality
```

### File Output
All generated files go to:
```
/home/z/my-project/download/
```

### Backend-Only Rule
`z-ai-web-dev-sdk` MUST be used in backend code only (API routes, server components), never in client-side code.

---

## Design Patterns

### Async Task Pattern (Video Generation)
```javascript
// 1. Create task
const task = await zai.videos.generations.create({ prompt });

// 2. Poll status
const status = await zai.videos.generations.status(task.id);

// 3. Retrieve result when complete
const result = await zai.videos.generations.retrieve(task.id);
```

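The three steps show a single status check; in practice step 2 usually repeats until the task finishes. A minimal polling sketch follows. The `pollUntilDone` helper, the status strings (`'SUCCESS'`, `'FAIL'`), and the interval and attempt limits are all assumptions, since the values the platform actually returns are not listed here.

```javascript
// Hypothetical helper: call `getStatus` repeatedly until it reports a
// terminal state. The state strings are assumed, not taken from the SDK.
async function pollUntilDone(getStatus, { intervalMs = 5000, maxAttempts = 60 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const state = await getStatus();
    if (state === 'SUCCESS') return state;
    if (state === 'FAIL') throw new Error('Task failed');
    // Wait before the next check so the API is not hammered.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Task still pending after ${maxAttempts} attempts`);
}

// Stand-in status source for illustration: pending twice, then done.
const fakeStates = ['PROCESSING', 'PROCESSING', 'SUCCESS'];
const fakeGetStatus = async () => fakeStates.shift();
```

In real use, `getStatus` would wrap the status call for a given `task.id` and map its response onto a state string; the result would be retrieved only after the helper resolves.
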
### Multi-Modal Input Pattern
```javascript
// Flexible input handling: pass text, an image, or both

// Text only
const textInput = { text: "description" };

// Image only
const imageInput = { image: base64OrUrl };

// Mixed
const mixedInput = { text: "modify this", image: base64OrUrl };
```

### Function Invocation Pattern
```javascript
const result = await zai.functions.invoke("function_name", {
  param1: "value",
  param2: 123
});
```

---

## When to Use This Reference

1. **Building AI-powered applications**: SDK patterns and examples
2. **Document processing**: PDF/DOCX/XLSX/PPTX capabilities
3. **Multimodal features**: Image, video, audio processing
4. **Web integration**: Search and scraping patterns
5. **Agent design**: Subagent patterns and capabilities

## Source
- Platform: Super Z (z.ai)
- SDK: z-ai-web-dev-sdk
- Framework: Next.js 15 + React + Prisma
- UI: shadcn/ui