Add comprehensive agents and skills documentation

Z User
2026-02-11 20:44:49 +00:00
parent 2473b25520
commit 91e646632d

AGENTS_AND_SKILLS.md

@@ -0,0 +1,820 @@
# GLM Tools, Skills, and Agents Documentation
This document provides a comprehensive overview of all agents, skills, and tools available in the Super Z system.
---
## Table of Contents
1. [Subagents (Task Tool)](#subagents-task-tool)
2. [Skills System](#skills-system)
3. [Tools Reference](#tools-reference)
---
## Subagents (Task Tool)
The Task tool launches specialized agents (subprocesses) that autonomously handle complex tasks. Each agent type has specific capabilities and tools available to it.
### Agent Types
#### 1. general-purpose
**Description:** General-purpose agent for researching complex questions, searching for code, and executing multi-step tasks. Use this when you need to search for a keyword or file and are not confident that you will find the right match in the first few tries.
**Tools Available:** All tools (*)
**Use Cases:**
- Complex research questions
- Multi-step tasks requiring various tools
- Code searching across multiple files
- Open-ended exploration tasks
---
#### 2. statusline-setup
**Description:** Use this agent to configure the user's Claude Code status line setting.
**Tools Available:** Read, Edit
**Use Cases:**
- Status line configuration
- Settings management
---
#### 3. Explore
**Description:** Fast agent specialized for exploring codebases. Use this when you need to quickly find files by patterns (e.g., "src/components/**/*.tsx"), search code for keywords (e.g., "API endpoints"), or answer questions about the codebase (e.g., "how do API endpoints work?").
**Tools Available:** All tools
**Thoroughness Levels:**
- **quick**: Basic searches
- **medium**: Moderate exploration
- **very thorough**: Comprehensive analysis across multiple locations and naming conventions
**Use Cases:**
- Finding files by patterns
- Searching code for keywords
- Understanding codebase structure
- Answering questions about code architecture
---
#### 4. Plan
**Description:** Software architect agent for designing implementation plans. Use this when you need to plan the implementation strategy for a task. Returns step-by-step plans, identifies critical files, and considers architectural trade-offs.
**Tools Available:** All tools
**Use Cases:**
- Implementation planning
- Architecture design
- Identifying critical files
- Evaluating trade-offs
---
#### 5. frontend-styling-expert
**Description:** Use this agent when you need help with CSS, styling frameworks, responsive design, UI/UX implementation, animations, layout systems, or any visual/presentational aspects of web development.
**Tools Available:** All tools
**Example Use Cases:**
1. User: "I need to create a responsive navigation menu that collapses on mobile" → Use this agent to help design a responsive navigation component
2. User: "The button hover effects aren't smooth enough" → Use this agent to optimize animations
3. User: "Can you help me center this div?" → Use this agent to provide the best centering solution
4. After writing HTML structure: Use this agent to add professional styling to components
**Specialties:**
- CSS frameworks
- Responsive design
- UI/UX implementation
- Animations
- Layout systems
- Visual/presentational aspects
---
#### 6. full-stack-developer
**Description:** Full-stack Next.js 15 development agent that builds production-ready web applications with React, API routes, and Prisma databases.
**Strongly Recommended For:**
- Building complete websites
- Creating data visualization dashboards
- Developing blog/CMS systems
- Implementing responsive web pages
- Integrating AI features (chat/image generation/web search)
- Real-time collaborative applications with WebSocket
**Tools Available:** All tools
**Key Strengths:**
- Frontend-first development approach
- Feature-complete modules combining UI components (shadcn/ui), backend APIs, and database operations
- Work documentation in /agent-ctx directory
---
## Skills System
Skills provide specialized capabilities and domain knowledge. Before starting any task, the system must identify the skills that apply to it and invoke them.
### Available Skills
#### 1. ASR (Automatic Speech Recognition)
**Command:** `ASR`
**Description:** Implement speech-to-text (ASR/automatic speech recognition) capabilities using the z-ai-web-dev-sdk.
**Use Cases:**
- Transcribe audio files
- Convert speech to text
- Build voice input features
- Process audio recordings
**Technical Details:**
- Supports base64 encoded audio files
- Returns accurate text transcriptions
- Uses z-ai-web-dev-sdk
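The base64 requirement above can be met with Node's built-in `Buffer`. The following is a minimal sketch of the encoding step only. The helper name `toBase64Audio` is hypothetical, and the SDK's ASR call is deliberately left out because this document does not specify it.

```javascript
// Hypothetical helper: encode raw audio bytes as a base64 string,
// the input form the ASR skill is described as accepting.
function toBase64Audio(audioBytes) {
  if (!(audioBytes instanceof Uint8Array) || audioBytes.length === 0) {
    throw new TypeError('audioBytes must be a non-empty Uint8Array or Buffer');
  }
  return Buffer.from(audioBytes).toString('base64');
}

// Usage: in a real app the bytes would come from a file read or an upload.
const sampleBytes = Uint8Array.from([82, 73, 70, 70]); // ASCII "RIFF"
console.log(toBase64Audio(sampleBytes));
```

Validating the input before encoding keeps empty or mistyped payloads from reaching the transcription step.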
---
#### 2. LLM (Large Language Model)
**Command:** `LLM`
**Description:** Implement large language model (LLM) chat completions using the z-ai-web-dev-sdk.
**Use Cases:**
- Build conversational AI applications
- Create chatbots
- Develop AI assistants
- Text generation features
**Technical Details:**
- Supports multi-turn conversations
- System prompts
- Context management
- Uses z-ai-web-dev-sdk
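Multi-turn conversations and context management usually come down to maintaining the `messages` array between calls. The sketch below assumes the `role`/`content` message shape used in the chat completions example in this document. The `createConversation` helper and its `maxMessages` window are hypothetical, not part of the SDK.

```javascript
// Hypothetical conversation buffer: keeps one system prompt plus the most
// recent messages, and returns them in the `messages` array shape.
function createConversation(systemPrompt, maxMessages = 20) {
  const history = []; // user and assistant messages, oldest first

  return {
    add(role, content) {
      if (role !== 'user' && role !== 'assistant') {
        throw new Error(`unsupported role: ${role}`);
      }
      history.push({ role, content });
      // Drop the oldest messages once the window is exceeded.
      while (history.length > maxMessages) history.shift();
    },
    messages() {
      // The system prompt is always sent first and is never trimmed.
      return [{ role: 'system', content: systemPrompt }, ...history];
    },
  };
}

const convo = createConversation('You are a helpful assistant.', 2);
convo.add('user', 'Hello, who are you?');
convo.add('assistant', 'I am an assistant.');
convo.add('user', 'What can you do?');
console.log(convo.messages().map((m) => m.role));
```

The result of `convo.messages()` would be passed as the `messages` field of a chat completion request. Trimming by message count is the simplest policy; a token-based limit may suit longer messages better.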
---
#### 3. TTS (Text-to-Speech)
**Command:** `TTS`
**Description:** Implement text-to-speech (TTS) capabilities using the z-ai-web-dev-sdk.
**Use Cases:**
- Convert text into natural-sounding speech
- Create audio content
- Build voice-enabled applications
- Generate spoken audio files
**Technical Details:**
- Supports multiple voices
- Adjustable speed
- Various audio formats
- Uses z-ai-web-dev-sdk
---
#### 4. VLM (Vision Language Model)
**Command:** `VLM`
**Description:** Implement vision-based AI chat capabilities using the z-ai-web-dev-sdk.
**Use Cases:**
- Analyze images
- Describe visual content
- Create applications combining image understanding with conversational AI
**Technical Details:**
- Supports image URLs
- Supports base64 encoded images
- Multimodal interactions
- Uses z-ai-web-dev-sdk
---
#### 5. csv-data-summarizer
**Command:** `csv-data-summarizer`
**Description:** Automatically analyzes CSV files with comprehensive statistical summaries, data quality checks, and intelligent visualizations.
**Features:**
- Detects the kind of dataset (sales, customer, financial, operational, survey) and adapts the analysis accordingly
- Generates correlation heatmaps, time-series plots, distribution charts
- Categorical breakdowns
- Missing data analysis
- Automatic date column detection
- Industry-specific insights
**Technical Details:**
- Built with pandas, matplotlib, and seaborn
- No user prompting required: the full analysis runs immediately
- Supports all relevant visualizations based on detected data patterns
**Use Cases:**
- User uploads CSV
- Data analysis requests
- Dataset structure understanding
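The skill itself is described as built on pandas, matplotlib, and seaborn. Purely to illustrate the idea behind the missing-data analysis listed above, here is a small JavaScript sketch. The function `missingByColumn` is hypothetical, and the parsing is deliberately naive (no quoted-field support).

```javascript
// Count missing (empty) cells per column in a small CSV string.
// Simplification: splits on commas and does not handle quoted fields.
function missingByColumn(csvText) {
  const [headerLine, ...rows] = csvText.trim().split(/\r?\n/);
  const headers = headerLine.split(',').map((h) => h.trim());
  const counts = Object.fromEntries(headers.map((h) => [h, 0]));

  for (const row of rows) {
    const cells = row.split(',');
    headers.forEach((h, i) => {
      // A cell that is absent or blank counts as missing.
      const cell = (cells[i] ?? '').trim();
      if (cell === '') counts[h] += 1;
    });
  }
  return counts;
}

const csv = 'date,region,revenue\n2024-01-01,EU,100\n2024-01-02,,\n2024-01-03,US,';
console.log(missingByColumn(csv));
```

A real analysis would use a proper CSV parser and also report missing values as a share of the row count.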
---
#### 6. deep-research
**Command:** `deep-research`
**Description:** Conduct enterprise-grade research with multi-source synthesis, citation tracking, and verification.
**Use Cases:**
- Comprehensive analysis requiring 10+ sources
- Verified claims
- Comparison of approaches
- Research reports
- Trend analysis
**Triggers:**
- "deep research"
- "comprehensive analysis"
- "research report"
- "compare X vs Y"
- "analyze trends"
**NOT For:**
- Simple lookups
- Debugging
- Questions answerable with 1-2 searches
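The trigger list above can be read as a simple routing check. The sketch below uses the phrases from this section; the matching strategy (case-insensitive substring matching plus a pattern for "compare X vs Y") and the name `matchesDeepResearch` are assumptions, not the system's actual routing logic.

```javascript
// Trigger phrases copied from the list above.
const DEEP_RESEARCH_PHRASES = [
  'deep research',
  'comprehensive analysis',
  'research report',
  'analyze trends',
];
// Loose pattern for requests shaped like "compare X vs Y".
const COMPARE_PATTERN = /\bcompare\b.+\bvs\.?\s/i;

function matchesDeepResearch(request) {
  const text = request.toLowerCase();
  return (
    DEEP_RESEARCH_PHRASES.some((phrase) => text.includes(phrase)) ||
    COMPARE_PATTERN.test(request)
  );
}

console.log(matchesDeepResearch('Compare Prisma vs Drizzle for a new project'));
console.log(matchesDeepResearch('Why does my build fail?'));
```

Phrase matching alone cannot apply the "NOT For" rules, so a simple lookup that happens to contain a trigger phrase would still need a second check.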
---
#### 7. docx
**Command:** `docx`
**Description:** Comprehensive document creation, editing, and analysis with support for tracked changes, comments, formatting preservation, and text extraction.
**Use Cases:**
1. Creating new documents
2. Modifying or editing content
3. Working with tracked changes
4. Adding comments
5. Any other document tasks
**File Types:** .docx files
---
#### 8. finance
**Command:** `finance`
**Description:** Comprehensive Finance API integration skill for real-time and historical financial data analysis, market research, and investment decision-making.
**Priority Use Cases:**
- Stock price queries
- Market data analysis
- Company financial information
- Portfolio tracking
- Market news retrieval
- Stock screening
- Technical analysis
- Any financial market-related requests
**Note:** This skill should be the primary choice for all Finance API interactions and financial data needs.
---
#### 9. frontend-design
**Command:** `frontend-design`
**Description:** Transform UI style requirements into production-ready frontend code with systematic design tokens, accessibility compliance, and creative execution.
**Use Cases:**
- Building websites
- Web applications
- React/Vue components
- Dashboards
- Landing pages
- Any web UI requiring design consistency and aesthetic quality
---
#### 10. get-fortune-analysis
**Command:** `get-fortune-analysis`
**Description:** Fortune analysis skill for specialized use cases.
---
#### 11. gift-evaluator
**Command:** `gift-evaluator`
**Description:** The PRIMARY tool for Spring Festival gift analysis and social interaction generation.
**Use Cases:**
- Users upload photos of gifts (alcohol, tea, supplements, etc.)
- Inquire about gift value
- Authenticity verification
- Social response generation
**Features:**
- Integrates visual perception
- Market valuation
- HTML card generation
---
#### 12. image-edit
**Command:** `image-edit`
**Description:** Implement AI image editing and modification capabilities using the z-ai-web-dev-sdk.
**Use Cases:**
- Edit existing images
- Create variations
- Modify visual content
- Redesign assets
- Transform images based on text descriptions
**Technical Details:**
- Supports multiple image sizes
- Returns base64 encoded results
- Includes CLI tool for quick image editing
- Uses z-ai-web-dev-sdk
---
#### 13. image-generation
**Command:** `image-generation`
**Description:** Implement AI image generation capabilities using the z-ai-web-dev-sdk.
**Use Cases:**
- Create images from text descriptions
- Generate visual content
- Create artwork
- Design assets
- Build applications with AI-powered image creation
**Technical Details:**
- Supports multiple image sizes
- Returns base64 encoded images
- Includes CLI tool for quick image generation
- Uses z-ai-web-dev-sdk
**CLI Usage:**
```bash
# Generate image
z-ai-generate --prompt "A beautiful landscape" --output "./image.png"
# Short form
z-ai-generate -p "A cute cat" -o "./cat.png" -s 1024x1024
```
**Supported Sizes:**
- 1024x1024
- 768x1344
- 864x1152
- 1344x768
- 1152x864
- 1440x720
- 720x1440
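Because only the sizes listed above are supported, it can help to map an arbitrary requested size to the nearest supported one before calling the generator. This is a sketch under that assumption; `closestSupportedSize` is a hypothetical helper, and choosing by aspect ratio is one possible policy among several.

```javascript
// Sizes copied from the "Supported Sizes" list above.
const SUPPORTED_SIZES = [
  '1024x1024', '768x1344', '864x1152', '1344x768',
  '1152x864', '1440x720', '720x1440',
];

// Pick the supported size whose aspect ratio is closest to width/height.
// Ratios are compared on a log scale so that 2:1 and 1:2 are treated symmetrically.
function closestSupportedSize(width, height) {
  const target = Math.log(width / height);
  let best = null;
  let bestDiff = Infinity;
  for (const size of SUPPORTED_SIZES) {
    const [w, h] = size.split('x').map(Number);
    const diff = Math.abs(Math.log(w / h) - target);
    if (diff < bestDiff) {
      best = size;
      bestDiff = diff;
    }
  }
  return best;
}

console.log(closestSupportedSize(1920, 1080));
```

The returned string can be used directly as the `size` value, in the same `WIDTHxHEIGHT` form the CLI's `-s` flag takes.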
---
#### 14. image-understand
**Command:** `image-understand`
**Description:** Implement specialized image understanding capabilities using the z-ai-web-dev-sdk.
**Use Cases:**
- Analyze static images
- Extract visual information
- Perform OCR
- Detect objects
- Classify images
- Understand visual content
**Supported Formats:** PNG, JPEG, GIF, WebP, BMP
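A file can be checked against the supported formats before it is sent for analysis. The sketch below assumes the format can be inferred from the file extension. The helper `imageMimeType` is hypothetical; the MIME types are the standard ones for these formats.

```javascript
// Formats copied from the "Supported Formats" line above, keyed by extension.
// `.jpg` is included as the common alias for JPEG.
const IMAGE_MIME_TYPES = new Map([
  ['png', 'image/png'],
  ['jpg', 'image/jpeg'],
  ['jpeg', 'image/jpeg'],
  ['gif', 'image/gif'],
  ['webp', 'image/webp'],
  ['bmp', 'image/bmp'],
]);

// Returns the MIME type for a supported image file name, or null otherwise.
function imageMimeType(filename) {
  if (!filename.includes('.')) return null;
  const ext = filename.split('.').pop().toLowerCase();
  return IMAGE_MIME_TYPES.get(ext) ?? null;
}

console.log(imageMimeType('photo.JPG'));
console.log(imageMimeType('scan.tiff'));
```

An extension check is only a first filter; inspecting the file's leading bytes would be more reliable for untrusted uploads.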
---
#### 15. pdf
**Command:** `pdf`
**Description:** Comprehensive PDF manipulation toolkit for extracting text and tables, creating new PDFs, merging/splitting documents, and handling forms.
**Use Cases:**
- Fill in PDF forms
- Programmatically process PDFs
- Generate PDF documents
- Analyze PDF documents at scale
---
#### 16. podcast-generate
**Command:** `podcast-generate`
**Description:** Generate podcast episodes from user-provided content or by searching the web for specified topics.
**Modes:**
1. **User uploads text file/article:** Creates dual-host dialogue podcast (or single-host upon request)
2. **No content provided:** Searches web for information about user-specified topic and generates podcast
**Features:**
- Duration scales with content size (3-20 minutes, ~240 chars/min)
- Uses z-ai-web-dev-sdk for LLM script generation and TTS audio synthesis
- Outputs podcast script (Markdown) and complete audio file (WAV)
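The duration rule above is simple arithmetic: content length divided by roughly 240 characters per minute, kept within the 3 to 20 minute range. The sketch below works under that reading. The function name is hypothetical, and clamping out-of-range values (rather than rejecting the input) is an assumption.

```javascript
const CHARS_PER_MINUTE = 240; // approximate rate stated above
const MIN_MINUTES = 3;
const MAX_MINUTES = 20;

// Estimate the episode length in whole minutes for a given content size.
function estimatePodcastMinutes(charCount) {
  const raw = Math.round(charCount / CHARS_PER_MINUTE);
  return Math.min(MAX_MINUTES, Math.max(MIN_MINUTES, raw));
}

console.log(estimatePodcastMinutes(2400)); // 2400 / 240 = 10
```

Under these numbers, anything shorter than about 720 characters lands on the 3-minute floor and anything longer than about 4,800 characters on the 20-minute ceiling.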
---
#### 17. pptx
**Command:** `pptx`
**Description:** Presentation editing, creation, and analysis.
**Use Cases:**
1. Editing existing presentations
2. Adding slides to existing files
3. Creating new presentations
4. Working with layouts
5. Adding comments or speaker notes
**File Types:** .pptx files
---
#### 18. story-video-generation
**Command:** `story-video-generation`
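**Description:** Automatically expands a one-sentence user input into a short story, generates an image for each scene, and combines the images into a video.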
**Triggers:**
- "生成故事" ("generate a story")
- "故事视频" ("story video")
- "把一句话变成视频" ("turn a sentence into a video")
- "制作故事视频" ("make a story video")
---
#### 19. video-generation
**Command:** `video-generation`
**Description:** Implement AI-powered video generation capabilities using the z-ai-web-dev-sdk.
**Use Cases:**
- Generate videos from text prompts or images
- Create video content programmatically
- Build applications that produce video outputs
**Technical Details:**
- Supports asynchronous task management
- Status polling
- Result retrieval
- Uses z-ai-web-dev-sdk
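Asynchronous task management with status polling generally follows a create, poll, retrieve pattern. The sketch below shows only a generic polling loop. `fetchStatus` is a caller-supplied function, and the status strings `'SUCCESS'` and `'FAIL'` are placeholders rather than values confirmed by the SDK.

```javascript
// Generic polling loop. `fetchStatus` is a caller-supplied async function
// that returns an object with a `status` field.
async function pollUntilDone(fetchStatus, { intervalMs = 5000, maxAttempts = 60 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt += 1) {
    const task = await fetchStatus();
    if (task.status === 'SUCCESS') return task; // placeholder status name
    if (task.status === 'FAIL') throw new Error('video task failed');
    // Still processing: wait before asking again.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`task did not finish after ${maxAttempts} attempts`);
}
```

Capping the number of attempts keeps a stuck task from blocking the caller indefinitely; the interval and cap would be tuned to typical generation times.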
---
#### 20. video-understand
**Command:** `video-understand`
**Description:** Implement specialized video understanding capabilities using the z-ai-web-dev-sdk.
**Use Cases:**
- Analyze video content
- Understand motion and temporal sequences
- Extract information from video frames
- Describe video scenes
- Perform video-based AI analysis
**Supported Formats:** MP4, AVI, MOV, and other common video formats
---
#### 21. web-reader
**Command:** `web-reader`
**Description:** Implement web page content extraction capabilities using the z-ai-web-dev-sdk.
**Use Cases:**
- Scrape web pages
- Extract article content
- Retrieve page metadata
- Build applications that process web content
**Features:**
- Automatic content extraction
- Title retrieval
- HTML retrieval
- Publication time retrieval
---
#### 22. web-search
**Command:** `web-search`
**Description:** Implement web search capabilities using the z-ai-web-dev-sdk.
**Use Cases:**
- Search for real-time information from the web
- Retrieve up-to-date content beyond knowledge cutoff
- Find latest news and data
**Returns:** Structured search results with URLs, snippets, and metadata
---
#### 23. xlsx
**Command:** `xlsx`
**Description:** Comprehensive spreadsheet creation, editing, and analysis with support for formulas, formatting, data analysis, and visualization.
**Use Cases:**
1. Creating new spreadsheets with formulas and formatting
2. Reading or analyzing data
3. Modifying existing spreadsheets while preserving formulas
4. Data analysis and visualization in spreadsheets
5. Recalculating formulas
**Supported Formats:** .xlsx, .xlsm, .csv, .tsv, etc.
---
## Tools Reference
### 1. Task Tool
**Purpose:** Launch a new agent to handle complex, multi-step tasks autonomously.
**Parameters:**
- `subagent_type` (Required): Type of specialized agent
- `prompt` (Required): Task for the agent to perform
- `description` (Optional): Short 3-5 word description
- `model` (Optional): Model to use (sonnet, opus, haiku)
- `resume` (Optional): Agent ID to resume from
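The parameters above can be assembled and checked before a Task call is issued. In this sketch the agent types and model names come from this document, while the plain-object call shape and the `buildTaskCall` helper are assumptions made for illustration.

```javascript
// Agent types and model names as listed in this document.
const AGENT_TYPES = [
  'general-purpose', 'statusline-setup', 'Explore',
  'Plan', 'frontend-styling-expert', 'full-stack-developer',
];
const MODELS = ['sonnet', 'opus', 'haiku'];

// Hypothetical helper: validate the parameters and return a call object.
function buildTaskCall({ subagent_type, prompt, description, model, resume }) {
  if (!AGENT_TYPES.includes(subagent_type)) {
    throw new Error(`unknown subagent_type: ${subagent_type}`);
  }
  if (typeof prompt !== 'string' || prompt.trim() === '') {
    throw new Error('prompt is required');
  }
  if (model !== undefined && !MODELS.includes(model)) {
    throw new Error(`unknown model: ${model}`);
  }
  // Leave optional fields out entirely when they are not provided.
  const call = { subagent_type, prompt };
  if (description !== undefined) call.description = description;
  if (model !== undefined) call.model = model;
  if (resume !== undefined) call.resume = resume;
  return call;
}

console.log(buildTaskCall({
  subagent_type: 'Explore',
  prompt: 'Find all API route handlers under src/app/api (medium thoroughness).',
  description: 'Locate API routes',
}));
```

For the Explore agent, the thoroughness level is stated in the prompt text, as in the usage above.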
---
### 2. Bash Tool
**Purpose:** Execute bash commands in a persistent shell session.
**Parameters:**
- `command` (Required): Command to execute
- `description` (Optional): Description of the command
- `timeout` (Optional): Timeout in milliseconds (max 600000ms)
---
### 3. Glob Tool
**Purpose:** Fast file pattern matching tool.
**Parameters:**
- `pattern` (Required): Glob pattern (e.g., "**/*.js")
- `path` (Optional): Directory to search (defaults to current)
---
### 4. Grep Tool
**Purpose:** Powerful search tool built on ripgrep.
**Parameters:**
- `pattern` (Required): Regular expression pattern
- `path` (Optional): Directory to search
- `output_mode`: content, files_with_matches, count
- `glob`: File pattern filter
- `file_type`: File type filter (js, py, rust, go, java, etc.)
- `-A`, `-B`, `-C`: Lines of context to show after, before, or around each match
- `-i`: Case insensitive
- `-n`: Show line numbers
- `multiline`: Enable multiline mode
---
### 5. LS Tool
**Purpose:** List files and directories.
**Parameters:**
- `path` (Required): Absolute path to list
- `ignore`: Glob patterns to ignore
---
### 6. Read Tool
**Purpose:** Read files from the filesystem.
**Parameters:**
- `filepath` (Required): Absolute path to the file
- `offset` (Optional): Starting line number
- `limit` (Optional): Number of lines to read
---
### 7. Edit Tool
**Purpose:** Perform exact string replacements in files.
**Parameters:**
- `filepath` (Required): Absolute path
- `old_str` (Required): Text to replace
- `new_str` (Required): Replacement text
- `replace_all`: Replace all occurrences
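The following sketch illustrates what "exact string replacement" means for these parameters: the text is matched literally, not as a pattern. The `applyEdit` function is hypothetical, and throwing when `old_str` is absent is an assumption about how such a tool would report failure.

```javascript
// Literal (non-regex) replacement driven by old_str / new_str / replace_all.
function applyEdit(content, { old_str, new_str, replace_all = false }) {
  if (old_str === '') throw new Error('old_str must not be empty');
  const first = content.indexOf(old_str);
  if (first === -1) throw new Error('old_str not found');
  // split/join replaces every literal occurrence without regex semantics.
  if (replace_all) return content.split(old_str).join(new_str);
  // Otherwise replace only the first occurrence.
  return content.slice(0, first) + new_str + content.slice(first + old_str.length);
}

console.log(applyEdit('const a = 1; const b = 1;', { old_str: '= 1', new_str: '= 2' }));
```

Slicing and `split`/`join` are used instead of `String.prototype.replace` so that characters such as `$` in `new_str` are inserted verbatim rather than interpreted as replacement patterns.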
---
### 8. MultiEdit Tool
**Purpose:** Multiple edits to a single file in one operation.
**Parameters:**
- `filepath` (Required): Absolute path
- `edits` (Required): Array of edit operations
---
### 9. Write Tool
**Purpose:** Write a file to the filesystem.
**Parameters:**
- `filepath` (Required): Absolute path
- `content` (Required): File content
---
### 10. TodoWrite / TodoRead Tools
**Purpose:** Manage task lists for coding sessions.
**Parameters (TodoWrite):**
- `todos` (Required): Array of todo items with id, content, status, priority
---
### 11. Skill Tool
**Purpose:** Execute a skill within the main conversation.
**Parameters:**
- `command` (Required): Skill name (no arguments)
---
### 12. Complete Tool
**Purpose:** Mark completion of code development or PPT creation tasks.
**Parameters:**
- `project_type` (Required): Type of project ("web_dev")
- `summary` (Optional): Summary of completed project
---
## SDK Integration (z-ai-web-dev-sdk)
### Chat Completions Example
```javascript
import ZAI from 'z-ai-web-dev-sdk';

async function main() {
  try {
    // Create the SDK client.
    const zai = await ZAI.create();
    const completion = await zai.chat.completions.create({
      messages: [
        {
          role: 'system',
          content: 'You are a helpful assistant.'
        },
        {
          role: 'user',
          content: 'Hello, who are you?'
        }
      ],
      // Other parameters like temperature, max_tokens, etc.
    });
    // Optional chaining guards against an empty choices array.
    const messageContent = completion.choices[0]?.message?.content;
    console.log('Assistant says:', messageContent);
  } catch (error) {
    console.error('An error occurred:', error.message);
  }
}

main();
```
### Image Generation Example
```javascript
import ZAI from 'z-ai-web-dev-sdk';

async function generateImage() {
  try {
    const zai = await ZAI.create();
    const response = await zai.images.generations.create({
      prompt: 'A cute cat playing in the garden',
      size: '1024x1024'
    });
    // The image is returned as a base64-encoded string.
    const imageBase64 = response.data[0].base64;
    console.log('Generated image base64:', imageBase64);
  } catch (error) {
    console.error('Image generation failed:', error.message);
  }
}

generateImage();
```
### Web Search Example
```javascript
import ZAI from 'z-ai-web-dev-sdk';

async function testSearch() {
  try {
    const zai = await ZAI.create();
    // Invoke the web_search function with a query and a result count.
    const searchResult = await zai.functions.invoke('web_search', {
      query: 'What is the capital of France?',
      num: 10
    });
    console.log('Full API Response:', searchResult);
  } catch (error) {
    console.error('An error occurred:', error.message);
  }
}

testSearch();
```
### Search Result Type
```typescript
interface SearchFunctionResultItem {
  url: string;
  name: string;
  snippet: string;
  host_name: string;
  rank: number;
  date: string;
  favicon: string;
}
```
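Once results are available in the shape above, a common post-processing step is to keep one result per site. The sketch below assumes the search call yields an array of such items and that a lower `rank` means a better result; both are assumptions. `topResultPerHost` is a hypothetical helper.

```javascript
// Keep the best-ranked result per host and return them ordered by rank.
// Assumes a lower `rank` value means a better result.
function topResultPerHost(items) {
  const bestByHost = new Map();
  for (const item of items) {
    const current = bestByHost.get(item.host_name);
    if (!current || item.rank < current.rank) {
      bestByHost.set(item.host_name, item);
    }
  }
  return [...bestByHost.values()].sort((a, b) => a.rank - b.rank);
}

// Illustrative data only; real items also carry snippet, date, and favicon.
const sampleResults = [
  { url: 'https://example.org/b', name: 'B', host_name: 'example.org', rank: 2 },
  { url: 'https://example.com/a', name: 'A', host_name: 'example.com', rank: 1 },
  { url: 'https://example.org/c', name: 'C', host_name: 'example.org', rank: 0 },
];
console.log(topResultPerHost(sampleResults).map((r) => r.url));
```

Deduplicating by `host_name` keeps a single domain from dominating the list when results are summarized or cited.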
---
## Project Environment Notes
- Next.js 15 with App Router
- Port: 3000 (auto dev server)
- `bun run dev` runs automatically; do not run it manually
- Use `bun run lint` to check code quality
- z-ai-web-dev-sdk MUST be used in the backend only
- The user can only see the `/` route, defined in `src/app/page.tsx`
---
## File Output Directory
All generated files must be saved to:
```
/home/z/my-project/download/
```
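One way to honor this rule in code is to build every output path from the fixed directory and a bare file name. This is a minimal sketch; `outputPath` is a hypothetical helper, and rejecting names that contain path separators is an added safety assumption rather than a documented requirement.

```javascript
const OUTPUT_DIR = '/home/z/my-project/download/';

// Build a path inside OUTPUT_DIR from a bare file name, rejecting
// anything that could point outside the directory.
function outputPath(filename) {
  const invalid =
    typeof filename !== 'string' ||
    filename === '' ||
    filename === '.' ||
    filename === '..' ||
    filename.includes('/') ||
    filename.includes('\\');
  if (invalid) throw new Error(`invalid output file name: ${filename}`);
  return OUTPUT_DIR + filename;
}

console.log(outputPath('report.pdf'));
```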
---
*Document generated automatically from Super Z system configuration*