# GLM Tools, Skills, and Agents Documentation

This document provides a comprehensive overview of all agents, skills, and tools available in the Super Z system.

---

## Table of Contents

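1. [Subagents (Task Tool)](#subagents-task-tool)
2. [Skills System](#skills-system)
3. [Tools Reference](#tools-reference)
4. [SDK Integration (z-ai-web-dev-sdk)](#sdk-integration-z-ai-web-dev-sdk)
5. [Project Environment Notes](#project-environment-notes)
6. [File Output Directory](#file-output-directory)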

---

## Subagents (Task Tool)

The Task tool launches specialized agents (subprocesses) that autonomously handle complex tasks. Each agent type has specific capabilities and tools available to it.

### Agent Types

#### 1. general-purpose

**Description:** General-purpose agent for researching complex questions, searching for code, and executing multi-step tasks. Use this when you need to search for a keyword or file and are not confident that you will find the right match in the first few tries.

**Tools Available:** All tools (*)

**Use Cases:**
- Complex research questions
- Multi-step tasks requiring various tools
- Code searching across multiple files
- Open-ended exploration tasks

---

#### 2. statusline-setup

**Description:** Use this agent to configure the user's Claude Code status line setting.

**Tools Available:** Read, Edit

**Use Cases:**
- Status line configuration
- Settings management

---

#### 3. Explore

**Description:** Fast agent specialized for exploring codebases. Use this when you need to quickly find files by patterns (e.g., "src/components/**/*.tsx"), search code for keywords (e.g., "API endpoints"), or answer questions about the codebase (e.g., "how do API endpoints work?").

**Tools Available:** All tools

**Thoroughness Levels:**
- **quick**: Basic searches
- **medium**: Moderate exploration
- **very thorough**: Comprehensive analysis across multiple locations and naming conventions

**Use Cases:**
- Finding files by patterns
- Searching code for keywords
- Understanding codebase structure
- Answering questions about code architecture

---

#### 4. Plan

**Description:** Software architect agent for designing implementation plans. Use this when you need to plan the implementation strategy for a task. Returns step-by-step plans, identifies critical files, and considers architectural trade-offs.

**Tools Available:** All tools

**Use Cases:**
- Implementation planning
- Architecture design
- Identifying critical files
- Evaluating trade-offs

---

#### 5. frontend-styling-expert

**Description:** Use this agent when you need help with CSS, styling frameworks, responsive design, UI/UX implementation, animations, layout systems, or any visual/presentational aspects of web development.

**Tools Available:** All tools

**Example Use Cases:**
1. User: "I need to create a responsive navigation menu that collapses on mobile" → Use this agent to help design a responsive navigation component
2. User: "The button hover effects aren't smooth enough" → Use this agent to optimize animations
3. User: "Can you help me center this div?" → Use this agent to provide the best centering solution
4. After writing HTML structure: Use this agent to add professional styling to components

**Specialties:**
- CSS frameworks
- Responsive design
- UI/UX implementation
- Animations
- Layout systems
- Visual/presentational aspects

---

#### 6. full-stack-developer

**Description:** Full-stack Next.js 15 development agent that builds production-ready web applications with React, API routes, and Prisma databases.

**Strongly Recommended For:**
- Building complete websites
- Creating data visualization dashboards
- Developing blog/CMS systems
- Implementing responsive web pages
- Integrating AI features (chat/image generation/web search)
- Real-time collaborative applications with WebSocket

**Tools Available:** All tools

**Key Strengths:**
- Frontend-first development approach
- Feature-complete modules combining UI components (shadcn/ui), backend APIs, and database operations
- Work documentation in the `/agent-ctx` directory

---

## Skills System

Skills provide specialized capabilities and domain knowledge. Before starting any task, identify the skills that apply and invoke them with the Skill tool (see [Tools Reference](#tools-reference)).

### Available Skills

#### 1. ASR (Automatic Speech Recognition)

**Command:** `ASR`

**Description:** Implement speech-to-text (ASR/automatic speech recognition) capabilities using the z-ai-web-dev-sdk.

**Use Cases:**
- Transcribe audio files
- Convert speech to text
- Build voice input features
- Process audio recordings

**Technical Details:**
- Supports base64 encoded audio files
- Returns accurate text transcriptions
- Uses z-ai-web-dev-sdk

---

#### 2. LLM (Large Language Model)

**Command:** `LLM`

**Description:** Implement large language model (LLM) chat completions using the z-ai-web-dev-sdk.

**Use Cases:**
- Build conversational AI applications
- Create chatbots
- Develop AI assistants
- Text generation features

**Technical Details:**
- Supports multi-turn conversations, system prompts, and context management
- Uses z-ai-web-dev-sdk

---

#### 3. TTS (Text-to-Speech)

**Command:** `TTS`

**Description:** Implement text-to-speech (TTS) capabilities using the z-ai-web-dev-sdk.

**Use Cases:**
- Convert text into natural-sounding speech
- Create audio content
- Build voice-enabled applications
- Generate spoken audio files

**Technical Details:**
- Supports multiple voices, adjustable speed, and various audio formats
- Uses z-ai-web-dev-sdk

---

#### 4. VLM (Vision Language Model)

**Command:** `VLM`

**Description:** Implement vision-based AI chat capabilities using the z-ai-web-dev-sdk.

**Use Cases:**
- Analyze images
- Describe visual content
- Create applications combining image understanding with conversational AI

**Technical Details:**
- Supports image URLs and base64 encoded images
- Supports multimodal interactions
- Uses z-ai-web-dev-sdk

---

#### 5. csv-data-summarizer

**Command:** `csv-data-summarizer`

**Description:** Automatically analyzes CSV files with comprehensive statistical summaries, data quality checks, and intelligent visualizations.

**Features:**
- Detects the data type (sales, customer, financial, operational, survey) and adapts the analysis accordingly
- Generates correlation heatmaps, time-series plots, distribution charts, and categorical breakdowns
- Missing data analysis
- Automatic date column detection
- Industry-specific insights

**Technical Details:**
- Built with pandas, matplotlib, and seaborn
- Requires NO user prompting; runs the full analysis immediately
- Generates every visualization relevant to the detected data patterns

**Use Cases:**
- User uploads CSV
- Data analysis requests
- Dataset structure understanding

---

#### 6. deep-research

**Command:** `deep-research`

**Description:** Conduct enterprise-grade research with multi-source synthesis, citation tracking, and verification.

**Use Cases:**
- Comprehensive analysis requiring 10+ sources
- Verified claims
- Comparison of approaches
- Research reports
- Trend analysis

**Triggers:**
- "deep research"
- "comprehensive analysis"
- "research report"
- "compare X vs Y"
- "analyze trends"

**NOT For:**
- Simple lookups
- Debugging
- Questions answerable with 1-2 searches

---

#### 7. docx

**Command:** `docx`

**Description:** Comprehensive document creation, editing, and analysis with support for tracked changes, comments, formatting preservation, and text extraction.

**Use Cases:**
1. Creating new documents
2. Modifying or editing content
3. Working with tracked changes
4. Adding comments
5. Any other document tasks

**File Types:** .docx files

---

#### 8. finance

**Command:** `finance`

**Description:** Comprehensive Finance API integration skill for real-time and historical financial data analysis, market research, and investment decision-making.

**Priority Use Cases:**
- Stock price queries
- Market data analysis
- Company financial information
- Portfolio tracking
- Market news retrieval
- Stock screening
- Technical analysis
- Any financial market-related requests

**Note:** This skill should be the primary choice for all Finance API interactions and financial data needs.

---

#### 9. frontend-design

**Command:** `frontend-design`

**Description:** Transform UI style requirements into production-ready frontend code with systematic design tokens, accessibility compliance, and creative execution.

**Use Cases:**
- Building websites
- Web applications
- React/Vue components
- Dashboards
- Landing pages
- Any web UI requiring design consistency and aesthetic quality

---

#### 10. get-fortune-analysis

**Command:** `get-fortune-analysis`

**Description:** Fortune analysis skill for specialized use cases.

---

#### 11. gift-evaluator

**Command:** `gift-evaluator`

**Description:** The PRIMARY tool for Spring Festival gift analysis and social interaction generation.

**Use Cases:**
- Analyzing user-uploaded photos of gifts (alcohol, tea, supplements, etc.)
- Answering questions about gift value
- Verifying authenticity
- Generating social responses

**Features:**
- Integrates visual perception
- Market valuation
- HTML card generation

---

#### 12. image-edit

**Command:** `image-edit`

**Description:** Implement AI image editing and modification capabilities using the z-ai-web-dev-sdk.

**Use Cases:**
- Edit existing images
- Create variations
- Modify visual content
- Redesign assets
- Transform images based on text descriptions

**Technical Details:**
- Supports multiple image sizes
- Returns base64 encoded results
- Includes CLI tool for quick image editing
- Uses z-ai-web-dev-sdk

---

#### 13. image-generation

**Command:** `image-generation`

**Description:** Implement AI image generation capabilities using the z-ai-web-dev-sdk.

**Use Cases:**
- Create images from text descriptions
- Generate visual content
- Create artwork
- Design assets
- Build applications with AI-powered image creation

**Technical Details:**
- Supports multiple image sizes
- Returns base64 encoded images
- Includes CLI tool for quick image generation
- Uses z-ai-web-dev-sdk

**CLI Usage:**
```bash
# Generate image
z-ai-generate --prompt "A beautiful landscape" --output "./image.png"

# Short form
z-ai-generate -p "A cute cat" -o "./cat.png" -s 1024x1024
```

**Supported Sizes:**
- 1024x1024
- 768x1344
- 864x1152
- 1344x768
- 1152x864
- 1440x720
- 720x1440
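
The size list above can be checked before a request is sent. The following is a minimal sketch; `SUPPORTED_SIZES` and `parseImageSize` are hypothetical helper names for illustration and are not part of z-ai-web-dev-sdk.

```javascript
// Sizes copied from the "Supported Sizes" list above.
const SUPPORTED_SIZES = [
  '1024x1024', '768x1344', '864x1152', '1344x768',
  '1152x864', '1440x720', '720x1440',
];

// Hypothetical helper: returns { width, height } for a supported size,
// or throws so the caller fails before making a request.
function parseImageSize(size) {
  if (!SUPPORTED_SIZES.includes(size)) {
    throw new Error(
      `Unsupported size "${size}". Use one of: ${SUPPORTED_SIZES.join(', ')}`
    );
  }
  const [width, height] = size.split('x').map(Number);
  return { width, height };
}
```

Validating locally keeps an unsupported value such as `512x512` from reaching the API at all.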

---

#### 14. image-understand

**Command:** `image-understand`

**Description:** Implement specialized image understanding capabilities using the z-ai-web-dev-sdk.

**Use Cases:**
- Analyze static images
- Extract visual information
- Perform OCR
- Detect objects
- Classify images
- Understand visual content

**Supported Formats:** PNG, JPEG, GIF, WebP, BMP

---

#### 15. pdf

**Command:** `pdf`

**Description:** Comprehensive PDF manipulation toolkit for extracting text and tables, creating new PDFs, merging/splitting documents, and handling forms.

**Use Cases:**
- Fill in PDF forms
- Programmatically process PDFs
- Generate PDF documents
- Analyze PDF documents at scale

---

#### 16. podcast-generate

**Command:** `podcast-generate`

**Description:** Generate podcast episodes from user-provided content or by searching the web for specified topics.

**Modes:**
1. **User uploads text file/article:** Creates dual-host dialogue podcast (or single-host upon request)
2. **No content provided:** Searches web for information about user-specified topic and generates podcast

**Features:**
- Duration scales with content size (3-20 minutes, ~240 chars/min)
- Uses z-ai-web-dev-sdk for LLM script generation and TTS audio synthesis
- Outputs podcast script (Markdown) and complete audio file (WAV)
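
One way to read the duration rule above is: divide the character count by roughly 240 characters per minute, then clamp the result to the 3-20 minute range. The sketch below assumes exactly that; the skill's real rounding and clamping behavior is not documented here, and `estimatePodcastMinutes` is a hypothetical name.

```javascript
const CHARS_PER_MINUTE = 240; // "~240 chars/min" from the feature list
const MIN_MINUTES = 3;        // lower bound of the documented range
const MAX_MINUTES = 20;       // upper bound of the documented range

// Hypothetical helper: estimates episode length from a character count,
// assuming linear scaling clamped to the documented range.
function estimatePodcastMinutes(charCount) {
  const raw = charCount / CHARS_PER_MINUTE;
  return Math.min(MAX_MINUTES, Math.max(MIN_MINUTES, raw));
}
```

Under these assumptions, 2,400 characters map to about 10 minutes, while very short or very long inputs are pinned to 3 and 20 minutes respectively.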

---

#### 17. pptx

**Command:** `pptx`

**Description:** Presentation editing, creation, and analysis.

**Use Cases:**
1. Editing existing presentations
2. Adding slides to existing files
3. Creating new presentations
4. Working with layouts
5. Adding comments or speaker notes

**File Types:** .pptx files

---

#### 18. story-video-generation

**Command:** `story-video-generation`

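**Description:** Automatically expands a single sentence from the user into a short story, generates an image for each scene, and combines the images into a video.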

**Triggers:**
- "生成故事" ("generate a story")
- "故事视频" ("story video")
- "把一句话变成视频" ("turn a sentence into a video")
- "制作故事视频" ("make a story video")

---

#### 19. video-generation

**Command:** `video-generation`

**Description:** Implement AI-powered video generation capabilities using the z-ai-web-dev-sdk.

**Use Cases:**
- Generate videos from text prompts or images
- Create video content programmatically
- Build applications that produce video outputs

**Technical Details:**
- Supports asynchronous task management with status polling and result retrieval
- Uses z-ai-web-dev-sdk
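
Asynchronous task management generally means: create a task, poll its status, then fetch the result. The sketch below shows only the polling step in generic form. `fetchStatus` is a caller-supplied function, and the status strings `'SUCCESS'` and `'FAIL'` are assumptions for illustration; check the SDK for the actual method names and status values.

```javascript
// Resolves after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Polls `fetchStatus()` until it reports a terminal state or attempts run out.
// `fetchStatus` is supplied by the caller and must resolve to { status, result? }.
async function pollUntilDone(fetchStatus, { intervalMs = 5000, maxAttempts = 60 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const task = await fetchStatus();
    if (task.status === 'SUCCESS') return task.result;          // assumed status value
    if (task.status === 'FAIL') throw new Error('Task failed'); // assumed status value
    await sleep(intervalMs);
  }
  throw new Error(`Task still pending after ${maxAttempts} attempts`);
}
```

Bounding the number of attempts keeps a stuck task from blocking the caller indefinitely.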

---

#### 20. video-understand

**Command:** `video-understand`

**Description:** Implement specialized video understanding capabilities using the z-ai-web-dev-sdk.

**Use Cases:**
- Analyze video content
- Understand motion and temporal sequences
- Extract information from video frames
- Describe video scenes
- Perform video-based AI analysis

**Supported Formats:** MP4, AVI, MOV, and other common video formats

---

#### 21. web-reader

**Command:** `web-reader`

**Description:** Implement web page content extraction capabilities using the z-ai-web-dev-sdk.

**Use Cases:**
- Scrape web pages
- Extract article content
- Retrieve page metadata
- Build applications that process web content

**Features:**
- Automatic content extraction
- Title retrieval
- HTML retrieval
- Publication time retrieval

---

#### 22. web-search

**Command:** `web-search`

**Description:** Implement web search capabilities using the z-ai-web-dev-sdk.

**Use Cases:**
- Search for real-time information from the web
- Retrieve up-to-date content beyond knowledge cutoff
- Find latest news and data

**Returns:** Structured search results with URLs, snippets, and metadata

---

#### 23. xlsx

**Command:** `xlsx`

**Description:** Comprehensive spreadsheet creation, editing, and analysis with support for formulas, formatting, data analysis, and visualization.

**Use Cases:**
1. Creating new spreadsheets with formulas and formatting
2. Reading or analyzing data
3. Modifying existing spreadsheets while preserving formulas
4. Data analysis and visualization in spreadsheets
5. Recalculating formulas

**Supported Formats:** .xlsx, .xlsm, .csv, .tsv, etc.

---

## Tools Reference

### 1. Task Tool

**Purpose:** Launch a new agent to handle complex, multi-step tasks autonomously.

**Parameters:**
- `subagent_type` (Required): Type of specialized agent
- `prompt` (Required): Task for the agent to perform
- `description` (Optional): Short 3-5 word description
- `model` (Optional): Model to use (sonnet, opus, haiku)
- `resume` (Optional): Agent ID to resume from
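
As an illustration of how these parameters fit together, the sketch below assembles and validates a Task call payload. `buildTaskCall` is a hypothetical helper, not part of the system; the agent names come from the Agent Types section of this document.

```javascript
// Agent names as listed under "Agent Types".
const AGENT_TYPES = [
  'general-purpose', 'statusline-setup', 'Explore', 'Plan',
  'frontend-styling-expert', 'full-stack-developer',
];

// Hypothetical helper: checks required fields and drops unset optional ones.
function buildTaskCall({ subagent_type, prompt, description, model, resume }) {
  if (!AGENT_TYPES.includes(subagent_type)) {
    throw new Error(`Unknown subagent_type: ${subagent_type}`);
  }
  if (typeof prompt !== 'string' || prompt.trim() === '') {
    throw new Error('prompt is required');
  }
  const call = { subagent_type, prompt };
  if (description !== undefined) call.description = description;
  if (model !== undefined) call.model = model;
  if (resume !== undefined) call.resume = resume;
  return call;
}

const taskCall = buildTaskCall({
  subagent_type: 'Explore',
  prompt: 'Find every file that defines an API route. Thoroughness: medium.',
  description: 'Locate API route files',
});
console.log(JSON.stringify(taskCall, null, 2));
```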

---

### 2. Bash Tool

**Purpose:** Execute bash commands in a persistent shell session.

**Parameters:**
- `command` (Required): Command to execute
- `description` (Optional): Description of the command
- `timeout` (Optional): Timeout in milliseconds (max 600000ms)

---

### 3. Glob Tool

**Purpose:** Fast file pattern matching tool.

**Parameters:**
- `pattern` (Required): Glob pattern (e.g., "**/*.js")
- `path` (Optional): Directory to search (defaults to current)

---

### 4. Grep Tool

**Purpose:** Powerful search tool built on ripgrep.

**Parameters:**
- `pattern` (Required): Regular expression pattern
- `path` (Optional): Directory to search
- `output_mode` (Optional): content, files_with_matches, count
- `glob` (Optional): File pattern filter
- `file_type` (Optional): File type filter (js, py, rust, go, java, etc.)
- `-A`, `-B`, `-C` (Optional): Context lines
- `-i` (Optional): Case insensitive
- `-n` (Optional): Show line numbers
- `multiline` (Optional): Enable multiline mode
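
Because the tool is built on ripgrep, each parameter above has a natural ripgrep counterpart. The sketch below shows one plausible mapping to `rg` command-line flags. `toRipgrepArgs` is a hypothetical helper, and the tool's actual internal mapping is not documented here.

```javascript
// Hypothetical helper: translates Grep tool parameters into ripgrep CLI arguments.
function toRipgrepArgs(params) {
  const args = [];
  if (params.output_mode === 'files_with_matches') args.push('--files-with-matches');
  if (params.output_mode === 'count') args.push('--count');
  if (params['-i']) args.push('--ignore-case');
  if (params['-n']) args.push('--line-number');
  if (params.multiline) args.push('--multiline');
  // Context flags each take a line count.
  const contextFlags = { '-A': '--after-context', '-B': '--before-context', '-C': '--context' };
  for (const [key, flag] of Object.entries(contextFlags)) {
    if (params[key] !== undefined) args.push(flag, String(params[key]));
  }
  if (params.glob) args.push('--glob', params.glob);
  if (params.file_type) args.push('--type', params.file_type);
  args.push('--regexp', params.pattern);
  if (params.path) args.push(params.path);
  return args;
}

const rgArgs = toRipgrepArgs({
  pattern: 'export\\s+async\\s+function',
  path: 'src',
  output_mode: 'content',
  glob: '*.ts',
  '-n': true,
  '-C': 2,
});
console.log(['rg', ...rgArgs].join(' '));
```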

---

### 5. LS Tool

**Purpose:** List files and directories.

**Parameters:**
- `path` (Required): Absolute path to list
- `ignore` (Optional): Glob patterns to ignore

---

### 6. Read Tool

**Purpose:** Read files from the filesystem.

**Parameters:**
- `filepath` (Required): Absolute path to the file
- `offset` (Optional): Starting line number
- `limit` (Optional): Number of lines to read

---

### 7. Edit Tool

**Purpose:** Perform exact string replacements in files.

**Parameters:**
- `filepath` (Required): Absolute path
- `old_str` (Required): Text to replace
- `new_str` (Required): Replacement text
- `replace_all` (Optional): Replace all occurrences
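
The sketch below models exact string replacement on an in-memory string. It assumes the edit is rejected when `old_str` is missing, or when it matches more than once and `replace_all` is not set; that behavior is an assumption and is not stated in this document. `applyEdit` is a hypothetical name.

```javascript
// Hypothetical model of an exact (non-regex) string replacement.
function applyEdit(text, { old_str, new_str, replace_all = false }) {
  if (old_str === '') throw new Error('old_str must not be empty');
  // Count non-overlapping literal occurrences.
  const count = text.split(old_str).length - 1;
  if (count === 0) throw new Error('old_str not found');
  if (count > 1 && !replace_all) {
    throw new Error(`old_str matches ${count} places; add context or set replace_all`);
  }
  if (replace_all) return text.split(old_str).join(new_str);
  // A function replacer keeps "$&"-style sequences in new_str literal.
  return text.replace(old_str, () => new_str);
}
```

Matching literally rather than by regular expression means characters such as `.` or `(` in `old_str` need no escaping.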

---

### 8. MultiEdit Tool

**Purpose:** Multiple edits to a single file in one operation.

**Parameters:**
- `filepath` (Required): Absolute path
- `edits` (Required): Array of edit operations

---

### 9. Write Tool

**Purpose:** Write a file to the filesystem.

**Parameters:**
- `filepath` (Required): Absolute path
- `content` (Required): File content

---

### 10. TodoWrite / TodoRead Tools

**Purpose:** Manage task lists for coding sessions.

**Parameters (TodoWrite):**
- `todos` (Required): Array of todo items with id, content, status, priority

---

### 11. Skill Tool

**Purpose:** Execute a skill within the main conversation.

**Parameters:**
- `command` (Required): Skill name (no arguments)

---

### 12. Complete Tool

**Purpose:** Mark completion of code development or PPT creation tasks.

**Parameters:**
- `project_type` (Required): Type of project ("web_dev")
- `summary` (Optional): Summary of completed project

---

## SDK Integration (z-ai-web-dev-sdk)

### Chat Completions Example

```javascript
import ZAI from 'z-ai-web-dev-sdk';

async function main() {
  try {
    const zai = await ZAI.create()

    const completion = await zai.chat.completions.create({
      messages: [
        {
          role: 'system',
          content: 'You are a helpful assistant.'
        },
        {
          role: 'user',
          content: 'Hello, who are you?'
        }
      ],
      // Other parameters like temperature, max_tokens, etc.
    });

    const messageContent = completion.choices[0]?.message?.content;
    console.log('Assistant says:', messageContent);

  } catch (error) {
    console.error('An error occurred:', error.message);
  }
}

// Run the example.
main();
```

### Image Generation Example

```javascript
import ZAI from 'z-ai-web-dev-sdk';

async function generateImage() {
  try {
    const zai = await ZAI.create();

    const response = await zai.images.generations.create({
      prompt: 'A cute cat playing in the garden',
      size: '1024x1024'
    });

    const imageBase64 = response.data[0].base64;
    console.log('Generated image base64:', imageBase64);

  } catch (error) {
    console.error('Image generation failed:', error.message);
  }
}

// Run the example.
generateImage();
```

### Web Search Example

```javascript
import ZAI from 'z-ai-web-dev-sdk';

async function testSearch() {
  try {
    const zai = await ZAI.create()

    const searchResult = await zai.functions.invoke("web_search", {
      query: "What is the capital of France?",
      num: 10
    })

    console.log('Full API Response:', searchResult)

  } catch (error) {
    console.error('An error occurred:', error.message);
  }
}

// Run the example.
testSearch();
```

### Search Result Type

```typescript
interface SearchFunctionResultItem {
    url: string;
    name: string;
    snippet: string;
    host_name: string;
    rank: number;
    date: string;
    favicon: string;
}
```
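
Given items of the shape above, a small helper can turn results into a readable list. This is a sketch only: `formatSearchResults` is a hypothetical name, and it assumes a lower `rank` means a more relevant result, which this document does not confirm.

```javascript
// Hypothetical helper: renders the top `limit` results as a Markdown list.
// Assumes lower `rank` values are more relevant.
function formatSearchResults(items, limit = 5) {
  return [...items]                       // copy so the input array is not mutated
    .sort((a, b) => a.rank - b.rank)
    .slice(0, limit)
    .map((item, i) =>
      `${i + 1}. [${item.name}](${item.url}) (${item.host_name})\n   ${item.snippet}`)
    .join('\n');
}
```

Sorting a copy rather than the original array leaves the raw API response untouched for any later processing.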

---

## Project Environment Notes

- Next.js 15 with App Router
- Port: 3000 (the dev server starts automatically)
- `bun run dev` runs automatically; do not run it manually
- Use `bun run lint` to check code quality
- `z-ai-web-dev-sdk` MUST be used in the backend only
- Users can only see the `/` route defined in `src/app/page.tsx`
---

## File Output Directory

All generated files must be saved to:
```
/home/z/my-project/download/
```

---

*Document generated automatically from Super Z system configuration*