# GLM Tools, Skills, and Agents Documentation

This document provides a comprehensive overview of all agents, skills, and tools available in the Super Z system.

---

## Table of Contents

1. [Subagents (Task Tool)](#subagents-task-tool)
2. [Skills System](#skills-system)
3. [Tools Reference](#tools-reference)
4. [SDK Integration (z-ai-web-dev-sdk)](#sdk-integration-z-ai-web-dev-sdk)
5. [Project Environment Notes](#project-environment-notes)
6. [File Output Directory](#file-output-directory)

---

## Subagents (Task Tool)

The Task tool launches specialized agents (subprocesses) that autonomously handle complex tasks. Each agent type has its own capabilities and set of available tools.

### Agent Types

#### 1. general-purpose

**Description:** General-purpose agent for researching complex questions, searching for code, and executing multi-step tasks. Use this when you need to search for a keyword or file and are not confident that you will find the right match in the first few tries.

**Tools Available:** All tools (`*`)

**Use Cases:**
- Complex research questions
- Multi-step tasks requiring various tools
- Code searching across multiple files
- Open-ended exploration tasks

---

#### 2. statusline-setup

**Description:** Use this agent to configure the user's Claude Code status line setting.

**Tools Available:** Read, Edit

**Use Cases:**
- Status line configuration
- Settings management

---

#### 3. Explore

**Description:** Fast agent specialized for exploring codebases. Use this when you need to quickly find files by patterns (e.g., `src/components/**/*.tsx`), search code for keywords (e.g., "API endpoints"), or answer questions about the codebase (e.g., "how do API endpoints work?").

**Tools Available:** All tools

**Thoroughness Levels:**
- **quick**: Basic searches
- **medium**: Moderate exploration
- **very thorough**: Comprehensive analysis across multiple locations and naming conventions

**Use Cases:**
- Finding files by patterns
- Searching code for keywords
- Understanding codebase structure
- Answering questions about code architecture

---

#### 4. Plan

**Description:** Software architect agent for designing implementation plans.
Use this when you need to plan the implementation strategy for a task. It returns step-by-step plans, identifies critical files, and considers architectural trade-offs.

**Tools Available:** All tools

**Use Cases:**
- Implementation planning
- Architecture design
- Identifying critical files
- Evaluating trade-offs

---

#### 5. frontend-styling-expert

**Description:** Use this agent when you need help with CSS, styling frameworks, responsive design, UI/UX implementation, animations, layout systems, or any visual/presentational aspects of web development.

**Tools Available:** All tools

**Example Use Cases:**
1. User: "I need to create a responsive navigation menu that collapses on mobile" → Use this agent to help design a responsive navigation component
2. User: "The button hover effects aren't smooth enough" → Use this agent to optimize animations
3. User: "Can you help me center this div?" → Use this agent to provide the best centering solution
4. After writing HTML structure: Use this agent to add professional styling to components

**Specialties:**
- CSS frameworks
- Responsive design
- UI/UX implementation
- Animations
- Layout systems
- Visual/presentational aspects

---

#### 6. full-stack-developer

**Description:** Full-stack Next.js 15 development agent that builds production-ready web applications with React, API routes, and Prisma databases.

**Strongly Recommended For:**
- Building complete websites
- Creating data visualization dashboards
- Developing blog/CMS systems
- Implementing responsive web pages
- Integrating AI features (chat/image generation/web search)
- Real-time collaborative applications with WebSocket

**Tools Available:** All tools

**Key Strengths:**
- Frontend-first development approach
- Feature-complete modules combining UI components (shadcn/ui)
- Backend APIs
- Database operations
- Work documentation in `/agent-ctx` directory

---

## Skills System

Skills provide specialized capabilities and domain knowledge.
Before starting any task, the system must follow a workflow to identify and invoke appropriate skills.

### Available Skills

#### 1. ASR (Automatic Speech Recognition)

**Command:** `ASR`

**Description:** Implement speech-to-text (ASR/automatic speech recognition) capabilities using the z-ai-web-dev-sdk.

**Use Cases:**
- Transcribe audio files
- Convert speech to text
- Build voice input features
- Process audio recordings

**Technical Details:**
- Supports base64 encoded audio files
- Returns accurate text transcriptions
- Uses z-ai-web-dev-sdk

---

#### 2. LLM (Large Language Model)

**Command:** `LLM`

**Description:** Implement large language model (LLM) chat completions using the z-ai-web-dev-sdk.

**Use Cases:**
- Build conversational AI applications
- Create chatbots
- Develop AI assistants
- Text generation features

**Technical Details:**
- Supports multi-turn conversations
- System prompts
- Context management
- Uses z-ai-web-dev-sdk

---

#### 3. TTS (Text-to-Speech)

**Command:** `TTS`

**Description:** Implement text-to-speech (TTS) capabilities using the z-ai-web-dev-sdk.

**Use Cases:**
- Convert text into natural-sounding speech
- Create audio content
- Build voice-enabled applications
- Generate spoken audio files

**Technical Details:**
- Supports multiple voices
- Adjustable speed
- Various audio formats
- Uses z-ai-web-dev-sdk

---

#### 4. VLM (Vision Language Model)

**Command:** `VLM`

**Description:** Implement vision-based AI chat capabilities using the z-ai-web-dev-sdk.

**Use Cases:**
- Analyze images
- Describe visual content
- Create applications combining image understanding with conversational AI

**Technical Details:**
- Supports image URLs
- Supports base64 encoded images
- Multimodal interactions
- Uses z-ai-web-dev-sdk

---

#### 5. csv-data-summarizer

**Command:** `csv-data-summarizer`

**Description:** Automatically analyzes CSV files with comprehensive statistical summaries, data quality checks, and intelligent visualizations.

**Features:**
- Detects data types (sales, customer, financial, operational, survey)
- Adapts analysis accordingly
- Generates correlation heatmaps, time-series plots, distribution charts
- Categorical breakdowns
- Missing data analysis
- Automatic date column detection
- Industry-specific insights

**Technical Details:**
- Built with pandas, matplotlib, and seaborn
- NO user prompting required - immediately runs full analysis
- Supports all relevant visualizations based on detected data patterns

**Use Cases:**
- User uploads CSV
- Data analysis requests
- Dataset structure understanding

---

#### 6. deep-research

**Command:** `deep-research`

**Description:** Conduct enterprise-grade research with multi-source synthesis, citation tracking, and verification.

**Use Cases:**
- Comprehensive analysis requiring 10+ sources
- Verified claims
- Comparison of approaches
- Research reports
- Trend analysis

**Triggers:**
- "deep research"
- "comprehensive analysis"
- "research report"
- "compare X vs Y"
- "analyze trends"

**NOT For:**
- Simple lookups
- Debugging
- Questions answerable with 1-2 searches

---

#### 7. docx

**Command:** `docx`

**Description:** Comprehensive document creation, editing, and analysis with support for tracked changes, comments, formatting preservation, and text extraction.

**Use Cases:**
1. Creating new documents
2. Modifying or editing content
3. Working with tracked changes
4. Adding comments
5. Any other document tasks

**File Types:** .docx files

---

#### 8. finance

**Command:** `finance`

**Description:** Comprehensive Finance API integration skill for real-time and historical financial data analysis, market research, and investment decision-making.

**Priority Use Cases:**
- Stock price queries
- Market data analysis
- Company financial information
- Portfolio tracking
- Market news retrieval
- Stock screening
- Technical analysis
- Any financial market-related requests

**Note:** This skill should be the primary choice for all Finance API interactions and financial data needs.

---

#### 9. frontend-design

**Command:** `frontend-design`

**Description:** Transform UI style requirements into production-ready frontend code with systematic design tokens, accessibility compliance, and creative execution.

**Use Cases:**
- Building websites
- Web applications
- React/Vue components
- Dashboards
- Landing pages
- Any web UI requiring design consistency and aesthetic quality

---

#### 10. get-fortune-analysis

**Command:** `get-fortune-analysis`

**Description:** Fortune analysis skill for specialized use cases.

---

#### 11. gift-evaluator

**Command:** `gift-evaluator`

**Description:** The PRIMARY tool for Spring Festival gift analysis and social interaction generation.

**Use Cases:**
- Users upload photos of gifts (alcohol, tea, supplements, etc.)
- Inquire about gift value
- Authenticity verification
- Social response generation

**Features:**
- Integrates visual perception
- Market valuation
- HTML card generation

---

#### 12. image-edit

**Command:** `image-edit`

**Description:** Implement AI image editing and modification capabilities using the z-ai-web-dev-sdk.

**Use Cases:**
- Edit existing images
- Create variations
- Modify visual content
- Redesign assets
- Transform images based on text descriptions

**Technical Details:**
- Supports multiple image sizes
- Returns base64 encoded results
- Includes CLI tool for quick image editing
- Uses z-ai-web-dev-sdk

---

#### 13. image-generation

**Command:** `image-generation`

**Description:** Implement AI image generation capabilities using the z-ai-web-dev-sdk.

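
As a minimal sketch of how a caller might guard against unsupported sizes before invoking this skill, the helper below checks a requested size against the **Supported Sizes** list given for this skill and builds the `{ prompt, size }` options object used in the SDK image generation example. `buildImageRequest`, `SUPPORTED_SIZES`, and the `1024x1024` default are illustrative assumptions, not part of z-ai-web-dev-sdk.

```javascript
// Sizes copied from the "Supported Sizes" list for this skill.
const SUPPORTED_SIZES = [
  '1024x1024', '768x1344', '864x1152', '1344x768',
  '1152x864', '1440x720', '720x1440',
];

// Hypothetical helper (not part of z-ai-web-dev-sdk): builds the
// { prompt, size } options object and rejects unsupported sizes early.
// The default size is an assumption made for this sketch.
function buildImageRequest(prompt, size = '1024x1024') {
  if (typeof prompt !== 'string' || prompt.trim() === '') {
    throw new Error('prompt must be a non-empty string');
  }
  if (!SUPPORTED_SIZES.includes(size)) {
    throw new Error(
      `Unsupported size "${size}". Use one of: ${SUPPORTED_SIZES.join(', ')}`
    );
  }
  return { prompt: prompt.trim(), size };
}
```

Validating locally keeps a bad size from costing a round trip; the returned object could then be passed to whatever generation call the backend uses.
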
**Use Cases:**
- Create images from text descriptions
- Generate visual content
- Create artwork
- Design assets
- Build applications with AI-powered image creation

**Technical Details:**
- Supports multiple image sizes
- Returns base64 encoded images
- Includes CLI tool for quick image generation
- Uses z-ai-web-dev-sdk

**CLI Usage:**
```bash
# Generate image
z-ai-generate --prompt "A beautiful landscape" --output "./image.png"

# Short form
z-ai-generate -p "A cute cat" -o "./cat.png" -s 1024x1024
```

**Supported Sizes:**
- 1024x1024
- 768x1344
- 864x1152
- 1344x768
- 1152x864
- 1440x720
- 720x1440

---

#### 14. image-understand

**Command:** `image-understand`

**Description:** Implement specialized image understanding capabilities using the z-ai-web-dev-sdk.

**Use Cases:**
- Analyze static images
- Extract visual information
- Perform OCR
- Detect objects
- Classify images
- Understand visual content

**Supported Formats:** PNG, JPEG, GIF, WebP, BMP

---

#### 15. pdf

**Command:** `pdf`

**Description:** Comprehensive PDF manipulation toolkit for extracting text and tables, creating new PDFs, merging/splitting documents, and handling forms.

**Use Cases:**
- Fill in PDF forms
- Programmatically process PDFs
- Generate PDF documents
- Analyze PDF documents at scale

---

#### 16. podcast-generate

**Command:** `podcast-generate`

**Description:** Generate podcast episodes from user-provided content or by searching the web for specified topics.

**Modes:**
1. **User uploads text file/article:** Creates dual-host dialogue podcast (or single-host upon request)
2. **No content provided:** Searches web for information about user-specified topic and generates podcast

**Features:**
- Duration scales with content size (3-20 minutes, ~240 chars/min)
- Uses z-ai-web-dev-sdk for LLM script generation and TTS audio synthesis
- Outputs podcast script (Markdown) and complete audio file (WAV)

---

#### 17. pptx

**Command:** `pptx`

**Description:** Presentation editing, creation, and analysis.

**Use Cases:**
1. Editing existing presentations
2. Adding slides to existing files
3. Creating new presentations
4. Working with layouts
5. Adding comments or speaker notes

**File Types:** .pptx files

---

#### 18. story-video-generation

**Command:** `story-video-generation`

**Description:** Expands a single sentence from the user into a short story, generates an image for each scene, and combines the images into a video.

**Triggers:**
- "生成故事" ("generate a story")
- "故事视频" ("story video")
- "把一句话变成视频" ("turn a sentence into a video")
- "制作故事视频" ("make a story video")

---

#### 19. video-generation

**Command:** `video-generation`

**Description:** Implement AI-powered video generation capabilities using the z-ai-web-dev-sdk.

**Use Cases:**
- Generate videos from text prompts or images
- Create video content programmatically
- Build applications that produce video outputs

**Technical Details:**
- Supports asynchronous task management
- Status polling
- Result retrieval
- Uses z-ai-web-dev-sdk

---

#### 20. video-understand

**Command:** `video-understand`

**Description:** Implement specialized video understanding capabilities using the z-ai-web-dev-sdk.

**Use Cases:**
- Analyze video content
- Understand motion and temporal sequences
- Extract information from video frames
- Describe video scenes
- Perform video-based AI analysis

**Supported Formats:** MP4, AVI, MOV, and other common video formats

---

#### 21. web-reader

**Command:** `web-reader`

**Description:** Implement web page content extraction capabilities using the z-ai-web-dev-sdk.

**Use Cases:**
- Scrape web pages
- Extract article content
- Retrieve page metadata
- Build applications that process web content

**Features:**
- Automatic content extraction
- Title retrieval
- HTML retrieval
- Publication time retrieval

---

#### 22. web-search

**Command:** `web-search`

**Description:** Implement web search capabilities using the z-ai-web-dev-sdk.

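
A rough sketch of post-processing search results, assuming the skill returns an array of items shaped like the `SearchFunctionResultItem` type shown in the SDK Integration section, and that a lower `rank` means a more relevant result. Neither assumption is confirmed by this document. `formatSearchResults` and the sample data are hypothetical.

```javascript
// Hypothetical helper: orders results by `rank` (assumed: lower is better)
// and renders each one as a short text entry. Field names follow the
// SearchFunctionResultItem type from the SDK Integration section.
function formatSearchResults(items, limit = 5) {
  return [...items] // copy so the caller's array is not reordered
    .sort((a, b) => a.rank - b.rank)
    .slice(0, limit)
    .map(
      (item, i) =>
        `${i + 1}. ${item.name} (${item.host_name})\n   ${item.url}\n   ${item.snippet}`
    );
}

// Made-up sample data for illustration only.
const sample = [
  { url: 'https://example.com/b', name: 'Second', snippet: 'B',
    host_name: 'example.com', rank: 2, date: '', favicon: '' },
  { url: 'https://example.com/a', name: 'First', snippet: 'A',
    host_name: 'example.com', rank: 1, date: '', favicon: '' },
];

console.log(formatSearchResults(sample).join('\n'));
```
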
**Use Cases:**
- Search for real-time information from the web
- Retrieve up-to-date content beyond knowledge cutoff
- Find latest news and data

**Returns:** Structured search results with URLs, snippets, and metadata

---

#### 23. xlsx

**Command:** `xlsx`

**Description:** Comprehensive spreadsheet creation, editing, and analysis with support for formulas, formatting, data analysis, and visualization.

**Use Cases:**
1. Creating new spreadsheets with formulas and formatting
2. Reading or analyzing data
3. Modifying existing spreadsheets while preserving formulas
4. Data analysis and visualization in spreadsheets
5. Recalculating formulas

**Supported Formats:** .xlsx, .xlsm, .csv, .tsv, etc.

---

## Tools Reference

### 1. Task Tool

**Purpose:** Launch a new agent to handle complex, multi-step tasks autonomously.

**Parameters:**
- `subagent_type` (Required): Type of specialized agent
- `prompt` (Required): Task for the agent to perform
- `description` (Optional): Short 3-5 word description
- `model` (Optional): Model to use (sonnet, opus, haiku)
- `resume` (Optional): Agent ID to resume from

---

### 2. Bash Tool

**Purpose:** Execute bash commands in a persistent shell session.

**Parameters:**
- `command` (Required): Command to execute
- `description` (Optional): Description of the command
- `timeout` (Optional): Timeout in milliseconds (max 600000ms)

---

### 3. Glob Tool

**Purpose:** Fast file pattern matching tool.

**Parameters:**
- `pattern` (Required): Glob pattern (e.g., `**/*.js`)
- `path` (Optional): Directory to search (defaults to current)

---

### 4. Grep Tool

**Purpose:** Powerful search tool built on ripgrep.

**Parameters:**
- `pattern` (Required): Regular expression pattern
- `path` (Optional): Directory to search
- `output_mode`: content, files_with_matches, count
- `glob`: File pattern filter
- `file_type`: File type filter (js, py, rust, go, java, etc.)
- `-A`, `-B`, `-C`: Context lines
- `-i`: Case insensitive
- `-n`: Show line numbers
- `multiline`: Enable multiline mode

---

### 5. LS Tool

**Purpose:** List files and directories.

**Parameters:**
- `path` (Required): Absolute path to list
- `ignore`: Glob patterns to ignore

---

### 6. Read Tool

**Purpose:** Read files from the filesystem.

**Parameters:**
- `filepath` (Required): Absolute path to the file
- `offset` (Optional): Starting line number
- `limit` (Optional): Number of lines to read

---

### 7. Edit Tool

**Purpose:** Perform exact string replacements in files.

**Parameters:**
- `filepath` (Required): Absolute path
- `old_str` (Required): Text to replace
- `new_str` (Required): Replacement text
- `replace_all`: Replace all occurrences

---

### 8. MultiEdit Tool

**Purpose:** Multiple edits to a single file in one operation.

**Parameters:**
- `filepath` (Required): Absolute path
- `edits` (Required): Array of edit operations

---

### 9. Write Tool

**Purpose:** Write a file to the filesystem.

**Parameters:**
- `filepath` (Required): Absolute path
- `content` (Required): File content

---

### 10. TodoWrite / TodoRead Tools

**Purpose:** Manage task lists for coding sessions.

**Parameters (TodoWrite):**
- `todos` (Required): Array of todo items with id, content, status, priority

---

### 11. Skill Tool

**Purpose:** Execute a skill within the main conversation.

**Parameters:**
- `command` (Required): Skill name (no arguments)

---

### 12. Complete Tool

**Purpose:** Mark completion of code development or PPT creation tasks.

**Parameters:**
- `project_type` (Required): Type of project ("web_dev")
- `summary` (Optional): Summary of completed project

---

## SDK Integration (z-ai-web-dev-sdk)

### Chat Completions Example

```javascript
import ZAI from 'z-ai-web-dev-sdk';

async function main() {
  try {
    // Create the SDK client.
    const zai = await ZAI.create();

    const completion = await zai.chat.completions.create({
      messages: [
        { role: 'system', content: 'You are a helpful assistant.' },
        { role: 'user', content: 'Hello, who are you?' }
      ],
      // Other parameters like temperature, max_tokens, etc.
    });

    // Optional chaining guards against an empty choices array.
    const messageContent = completion.choices[0]?.message?.content;
    console.log('Assistant says:', messageContent);
  } catch (error) {
    console.error('An error occurred:', error.message);
  }
}
```

### Image Generation Example

```javascript
import ZAI from 'z-ai-web-dev-sdk';

async function generateImage() {
  try {
    const zai = await ZAI.create();

    const response = await zai.images.generations.create({
      prompt: 'A cute cat playing in the garden',
      size: '1024x1024'
    });

    // The image is returned as a base64 encoded string.
    const imageBase64 = response.data[0].base64;
    console.log('Generated image base64:', imageBase64);
  } catch (error) {
    console.error('Image generation failed:', error.message);
  }
}
```

### Web Search Example

```javascript
import ZAI from 'z-ai-web-dev-sdk';

async function testSearch() {
  try {
    const zai = await ZAI.create();

    const searchResult = await zai.functions.invoke("web_search", {
      query: "What is the capital of France?",
      num: 10
    });

    console.log('Full API Response:', searchResult);
  } catch (error) {
    // Plain `catch (error)`: a type annotation here is TypeScript-only syntax.
    console.error('An error occurred:', error.message);
  }
}
```

### Search Result Type

```typescript
interface SearchFunctionResultItem {
  url: string;
  name: string;
  snippet: string;
  host_name: string;
  rank: number;
  date: string;
  favicon: string;
}
```

---

## Project Environment Notes

- Next.js 15 with App Router
- Port: 3000 (auto dev server)
- `bun run dev` runs automatically; do not run it manually
- Use `bun run lint` to check code quality
- z-ai-web-dev-sdk MUST be used in the backend only
- User can only see the `/` route defined in `src/app/page.tsx`

---

## File Output Directory

All generated files must be saved to:

```
/home/z/my-project/download/
```

---

*Document generated automatically from Super Z system configuration*