Initial commit

Z User
2026-06-06 05:21:10 +00:00
commit 6664758a6d
493 changed files with 135653 additions and 0 deletions

21
skills/ASR/LICENSE.txt Executable file

@@ -0,0 +1,21 @@
MIT License
Copyright (c) 2025 z-ai-web-dev-sdk Skills
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

580
skills/ASR/SKILL.md Executable file

@@ -0,0 +1,580 @@
---
name: ASR
description: Implement speech-to-text (ASR/automatic speech recognition) capabilities using the z-ai-web-dev-sdk. Use this skill when the user needs to transcribe audio files, convert speech to text, build voice input features, or process audio recordings. Supports base64 encoded audio files and returns accurate text transcriptions.
license: MIT
---
# ASR (Speech to Text) Skill
This skill guides the implementation of speech-to-text (ASR) functionality using the z-ai-web-dev-sdk package, enabling accurate transcription of spoken audio into text.
## Skills Path
**Skill Location**: `{project_path}/skills/ASR`
This skill is located at the path above in your project.
**Reference Scripts**: Example test scripts are available in the `{Skill Location}/scripts/` directory for quick testing and reference. See `{Skill Location}/scripts/asr.ts` for a working example.
## Overview
Speech-to-Text (ASR - Automatic Speech Recognition) allows you to build applications that convert spoken language in audio files into written text, enabling voice-controlled interfaces, transcription services, and audio content analysis.
**IMPORTANT**: z-ai-web-dev-sdk MUST be used in backend code only. Never use it in client-side code.
## Prerequisites
The z-ai-web-dev-sdk package is already installed. Import it as shown in the examples below.
## CLI Usage (For Simple Tasks)
For simple audio transcription tasks, you can use the z-ai CLI instead of writing code. This is ideal for quick transcriptions, testing audio files, or batch processing.
### Basic Transcription from File
```bash
# Transcribe an audio file
z-ai asr --file ./audio.wav
# Save transcription to JSON file
z-ai asr -f ./recording.mp3 -o transcript.json
# Save transcription using the long option names
z-ai asr --file ./interview.wav --output result.json
```
### Transcription from Base64
```bash
# Transcribe from base64 encoded audio
z-ai asr --base64 "UklGRiQAAABXQVZFZm10..." -o result.json
# Using short option
z-ai asr -b "base64_encoded_audio_data" -o transcript.json
```
### Streaming Output
```bash
# Stream transcription results
z-ai asr -f ./audio.wav --stream
```
### CLI Parameters
- `--file, -f <path>`: **Required** (if not using --base64) - Audio file path
- `--base64, -b <base64>`: **Required** (if not using --file) - Base64 encoded audio
- `--output, -o <path>`: Optional - Output file path (JSON format)
- `--stream`: Optional - Stream the transcription output
### Supported Audio Formats
The ASR service supports various audio formats including:
- WAV (.wav)
- MP3 (.mp3)
- Other common audio formats
### When to Use CLI vs SDK
**Use CLI for:**
- Quick audio file transcriptions
- Testing audio recognition accuracy
- Simple batch processing scripts
- One-off transcription tasks
**Use SDK for:**
- Real-time audio transcription in applications
- Integration with recording systems
- Custom audio processing workflows
- Production applications with streaming audio
## Basic ASR Implementation
### Simple Audio Transcription
```javascript
import ZAI from 'z-ai-web-dev-sdk';
import fs from 'fs';
async function transcribeAudio(audioFilePath) {
  const zai = await ZAI.create();
  // Read audio file and convert to base64
  const audioFile = fs.readFileSync(audioFilePath);
  const base64Audio = audioFile.toString('base64');
  const response = await zai.audio.asr.create({
    file_base64: base64Audio
  });
  return response.text;
}
// Usage
const transcription = await transcribeAudio('./audio.wav');
console.log('Transcription:', transcription);
```
### Transcribe Multiple Audio Files
```javascript
import ZAI from 'z-ai-web-dev-sdk';
import fs from 'fs';
async function transcribeBatch(audioFilePaths) {
const zai = await ZAI.create();
const results = [];
for (const filePath of audioFilePaths) {
try {
const audioFile = fs.readFileSync(filePath);
const base64Audio = audioFile.toString('base64');
const response = await zai.audio.asr.create({
file_base64: base64Audio
});
results.push({
file: filePath,
success: true,
transcription: response.text
});
} catch (error) {
results.push({
file: filePath,
success: false,
error: error.message
});
}
}
return results;
}
// Usage
const files = ['./interview1.wav', './interview2.wav', './interview3.wav'];
const transcriptions = await transcribeBatch(files);
transcriptions.forEach(result => {
if (result.success) {
console.log(`${result.file}: ${result.transcription}`);
} else {
console.error(`${result.file}: Error - ${result.error}`);
}
});
```
## Advanced Use Cases
### Audio File Processing with Metadata
```javascript
import ZAI from 'z-ai-web-dev-sdk';
import fs from 'fs';
import path from 'path';
async function transcribeWithMetadata(audioFilePath) {
const zai = await ZAI.create();
// Get file metadata
const stats = fs.statSync(audioFilePath);
const audioFile = fs.readFileSync(audioFilePath);
const base64Audio = audioFile.toString('base64');
const startTime = Date.now();
const response = await zai.audio.asr.create({
file_base64: base64Audio
});
const endTime = Date.now();
return {
filename: path.basename(audioFilePath),
filepath: audioFilePath,
fileSize: stats.size,
transcription: response.text,
wordCount: response.text.split(/\s+/).length,
processingTime: endTime - startTime,
timestamp: new Date().toISOString()
};
}
// Usage
const result = await transcribeWithMetadata('./meeting_recording.wav');
console.log('Transcription Details:', JSON.stringify(result, null, 2));
```
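One detail worth noting about word counts: splitting on whitespace alone reports 1 for an empty string and over-counts when the text has leading or trailing spaces. A minimal sketch of a safer counter follows; `countWords` is a hypothetical helper name, not part of the SDK.

```javascript
// Count words in a transcription without the off-by-one cases of a bare split.
// `countWords` is an illustrative helper, not an SDK function.
function countWords(text) {
  if (typeof text !== 'string') return 0;
  const trimmed = text.trim();
  // An empty or whitespace-only transcription contains no words.
  if (trimmed === '') return 0;
  return trimmed.split(/\s+/).length;
}
```

It could stand in for `response.text.split(/\s+/).length` wherever a `wordCount` field is produced.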
### Transcription Service with Caching
```javascript
import ZAI from 'z-ai-web-dev-sdk';
import fs from 'fs';
import crypto from 'crypto';
class ASRService {
  constructor() {
    this.zai = null;
    this.transcriptionCache = new Map();
  }
  async initialize() {
    this.zai = await ZAI.create();
  }
  // Hash the audio bytes so identical files share one cache entry
  generateCacheKey(audioBuffer) {
    return crypto.createHash('md5').update(audioBuffer).digest('hex');
  }
  async transcribe(audioFilePath, useCache = true) {
    const audioBuffer = fs.readFileSync(audioFilePath);
    const cacheKey = this.generateCacheKey(audioBuffer);
    // Check cache
    if (useCache && this.transcriptionCache.has(cacheKey)) {
      return {
        transcription: this.transcriptionCache.get(cacheKey),
        cached: true
      };
    }
    // Transcribe audio
    const base64Audio = audioBuffer.toString('base64');
    const response = await this.zai.audio.asr.create({
      file_base64: base64Audio
    });
    // Cache result
    if (useCache) {
      this.transcriptionCache.set(cacheKey, response.text);
    }
    return {
      transcription: response.text,
      cached: false
    };
  }
  clearCache() {
    this.transcriptionCache.clear();
  }
  getCacheSize() {
    return this.transcriptionCache.size;
  }
}
// Usage
const asrService = new ASRService();
await asrService.initialize();
const result1 = await asrService.transcribe('./audio.wav');
console.log('First call (not cached):', result1);
const result2 = await asrService.transcribe('./audio.wav');
console.log('Second call (cached):', result2);
```
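The `transcriptionCache` Map in `ASRService` never evicts entries, so a long-running process could accumulate transcriptions indefinitely. One possible remedy is a size-bounded cache with least-recently-used eviction. The sketch below is an assumption about how that might look; `BoundedCache` and `maxEntries` are hypothetical names.

```javascript
// A small least-recently-used (LRU) cache built on Map's insertion order.
// `BoundedCache` and `maxEntries` are illustrative names, not SDK features.
class BoundedCache {
  constructor(maxEntries = 100) {
    if (!Number.isInteger(maxEntries) || maxEntries < 1) {
      throw new RangeError('maxEntries must be a positive integer');
    }
    this.maxEntries = maxEntries;
    this.entries = new Map();
  }

  has(key) {
    return this.entries.has(key);
  }

  get(key) {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key);
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key, value) {
    if (this.entries.has(key)) {
      this.entries.delete(key);
    } else if (this.entries.size >= this.maxEntries) {
      // Map iterates in insertion order, so the first key is the oldest.
      const oldestKey = this.entries.keys().next().value;
      this.entries.delete(oldestKey);
    }
    this.entries.set(key, value);
    return this;
  }

  clear() {
    this.entries.clear();
  }

  get size() {
    return this.entries.size;
  }
}
```

Because it exposes `has`, `get`, `set`, `clear`, and `size`, it could replace `new Map()` in the service constructor without changing the other methods.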
### Directory Transcription
```javascript
import ZAI from 'z-ai-web-dev-sdk';
import fs from 'fs';
import path from 'path';
async function transcribeDirectory(directoryPath, outputJsonPath) {
const zai = await ZAI.create();
// Get all audio files
const files = fs.readdirSync(directoryPath);
const audioFiles = files.filter(file =>
/\.(wav|mp3|m4a|flac|ogg)$/i.test(file)
);
const results = {
directory: directoryPath,
totalFiles: audioFiles.length,
processedAt: new Date().toISOString(),
transcriptions: []
};
for (const filename of audioFiles) {
const filePath = path.join(directoryPath, filename);
try {
const audioFile = fs.readFileSync(filePath);
const base64Audio = audioFile.toString('base64');
const response = await zai.audio.asr.create({
file_base64: base64Audio
});
results.transcriptions.push({
filename: filename,
success: true,
text: response.text,
wordCount: response.text.split(/\s+/).length
});
console.log(`✓ Transcribed: ${filename}`);
} catch (error) {
results.transcriptions.push({
filename: filename,
success: false,
error: error.message
});
console.error(`✗ Failed: ${filename} - ${error.message}`);
}
}
// Save results to JSON
fs.writeFileSync(
outputJsonPath,
JSON.stringify(results, null, 2)
);
return results;
}
// Usage
const results = await transcribeDirectory(
'./audio-recordings',
'./transcriptions.json'
);
console.log(`\nProcessed ${results.totalFiles} files`);
console.log(`Successful: ${results.transcriptions.filter(t => t.success).length}`);
console.log(`Failed: ${results.transcriptions.filter(t => !t.success).length}`);
```
## Best Practices
### 1. Audio Format Handling
```javascript
import ZAI from 'z-ai-web-dev-sdk';
import fs from 'fs';
async function transcribeAnyFormat(audioFilePath) {
// Supported formats: WAV, MP3, M4A, FLAC, OGG, etc.
const validExtensions = ['.wav', '.mp3', '.m4a', '.flac', '.ogg'];
const ext = audioFilePath.toLowerCase().substring(audioFilePath.lastIndexOf('.'));
if (!validExtensions.includes(ext)) {
throw new Error(`Unsupported audio format: ${ext}`);
}
const zai = await ZAI.create();
const audioFile = fs.readFileSync(audioFilePath);
const base64Audio = audioFile.toString('base64');
const response = await zai.audio.asr.create({
file_base64: base64Audio
});
return response.text;
}
```
### 2. Error Handling
```javascript
import ZAI from 'z-ai-web-dev-sdk';
import fs from 'fs';
async function safeTranscribe(audioFilePath) {
try {
// Validate file exists
if (!fs.existsSync(audioFilePath)) {
throw new Error(`File not found: ${audioFilePath}`);
}
// Check file size (e.g., limit to 100MB)
const stats = fs.statSync(audioFilePath);
const fileSizeMB = stats.size / (1024 * 1024);
if (fileSizeMB > 100) {
throw new Error(`File too large: ${fileSizeMB.toFixed(2)}MB (max 100MB)`);
}
// Transcribe
const zai = await ZAI.create();
const audioFile = fs.readFileSync(audioFilePath);
const base64Audio = audioFile.toString('base64');
const response = await zai.audio.asr.create({
file_base64: base64Audio
});
if (!response.text || response.text.trim().length === 0) {
throw new Error('Empty transcription result');
}
return {
success: true,
transcription: response.text,
filePath: audioFilePath,
fileSize: stats.size
};
} catch (error) {
console.error('Transcription error:', error);
return {
success: false,
error: error.message,
filePath: audioFilePath
};
}
}
```
### 3. Post-Processing Transcriptions
```javascript
import ZAI from 'z-ai-web-dev-sdk';
import fs from 'fs';
function cleanTranscription(text) {
  // Remove filler words first (optional), so later steps see the final text
  const fillers = ['um', 'uh', 'ah', 'like', 'you know'];
  const fillerPattern = new RegExp(`\\b(${fillers.join('|')})\\b`, 'gi');
  text = text.replace(fillerPattern, '');
  // Remove excessive whitespace, including gaps left by removed fillers
  text = text.replace(/\s+/g, ' ').trim();
  // Capitalize first letter of sentences
  text = text.replace(/(^\w|[.!?]\s+\w)/g, match => match.toUpperCase());
  return text;
}
async function transcribeAndClean(audioFilePath) {
  const zai = await ZAI.create();
  const audioFile = fs.readFileSync(audioFilePath);
  const base64Audio = audioFile.toString('base64');
  const response = await zai.audio.asr.create({
    file_base64: base64Audio
  });
  return {
    raw: response.text,
    cleaned: cleanTranscription(response.text)
  };
}
```
## Common Use Cases
1. **Meeting Transcription**: Convert recorded meetings into searchable text
2. **Interview Processing**: Transcribe interviews for analysis and documentation
3. **Podcast Transcription**: Create text versions of podcast episodes
4. **Voice Notes**: Convert voice memos to text for easier reference
5. **Call Center Analytics**: Analyze customer service calls
6. **Accessibility**: Provide text alternatives for audio content
7. **Voice Commands**: Enable voice-controlled applications
8. **Language Learning**: Transcribe pronunciation practice
## Integration Examples
### Express.js API Endpoint
```javascript
import express from 'express';
import multer from 'multer';
import ZAI from 'z-ai-web-dev-sdk';
import fs from 'fs';
const app = express();
const upload = multer({ dest: 'uploads/' });
let zaiInstance;
async function initZAI() {
zaiInstance = await ZAI.create();
}
app.post('/api/transcribe', upload.single('audio'), async (req, res) => {
try {
if (!req.file) {
return res.status(400).json({ error: 'No audio file provided' });
}
const audioFile = fs.readFileSync(req.file.path);
const base64Audio = audioFile.toString('base64');
const response = await zaiInstance.audio.asr.create({
file_base64: base64Audio
});
// Clean up uploaded file
fs.unlinkSync(req.file.path);
res.json({
success: true,
transcription: response.text,
wordCount: response.text.split(/\s+/).length
});
} catch (error) {
// Clean up on error
if (req.file && fs.existsSync(req.file.path)) {
fs.unlinkSync(req.file.path);
}
res.status(500).json({
success: false,
error: error.message
});
}
});
initZAI().then(() => {
app.listen(3000, () => {
console.log('ASR API running on port 3000');
});
});
```
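The endpoint above accepts any uploaded file. A minimal sketch of rejecting non-audio uploads before they reach the transcription call is shown below, written against the `(req, file, cb)` signature that multer uses for its `fileFilter` option. The allow-list, the size constant, and the function names are assumptions chosen to mirror the formats and the 100MB guideline mentioned elsewhere in this skill.

```javascript
// Hypothetical allow-list; adjust to the formats your deployment accepts.
const ALLOWED_EXTENSIONS = ['.wav', '.mp3', '.m4a', '.flac', '.ogg'];
// Mirrors the 100MB guideline; intended for multer's `limits.fileSize`.
const MAX_UPLOAD_BYTES = 100 * 1024 * 1024;

// Returns the lowercased extension including the dot, or '' if there is none.
function getExtension(filename) {
  const name = String(filename);
  const dot = name.lastIndexOf('.');
  return dot <= 0 ? '' : name.slice(dot).toLowerCase();
}

// Same shape as a multer fileFilter callback: (req, file, cb).
function audioFileFilter(req, file, cb) {
  const ext = getExtension(file.originalname);
  if (ALLOWED_EXTENSIONS.includes(ext)) {
    cb(null, true); // accept the upload
  } else {
    cb(new Error(`Unsupported audio format: ${ext || '(none)'}`));
  }
}
```

It would be wired in as `multer({ dest: 'uploads/', fileFilter: audioFileFilter, limits: { fileSize: MAX_UPLOAD_BYTES } })`. Errors raised this way surface through Express error handling rather than the route's own `try/catch`, so an error-handling middleware is likely needed as well.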
## Troubleshooting
**Issue**: "SDK must be used in backend"
- **Solution**: Ensure z-ai-web-dev-sdk is only imported in server-side code
**Issue**: Empty or incorrect transcription
- **Solution**: Verify audio quality and format. Check if audio contains clear speech
**Issue**: Large file processing fails
- **Solution**: Consider splitting large audio files into smaller segments
**Issue**: Slow transcription speed
- **Solution**: Implement caching for repeated transcriptions, optimize file sizes
**Issue**: Memory errors with large files
- **Solution**: Process files in chunks or increase Node.js memory limit
## Performance Tips
1. **Reuse SDK Instance**: Create once, use multiple times
2. **Implement Caching**: Cache transcriptions for duplicate files
3. **Batch Processing**: Process multiple files efficiently with proper queuing
4. **Audio Optimization**: Compress audio files before processing when possible
5. **Async Operations**: Use Promise.all for parallel processing when appropriate
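A plain `Promise.all` over every file starts all requests at once, which may be more parallelism than the service or the machine's memory tolerates. A minimal sketch of bounded parallelism follows. `mapWithConcurrency` and its parameters are hypothetical names, and whether the ASR service accepts concurrent requests at all is an assumption to verify.

```javascript
// Run `worker` over `items` with at most `limit` calls in flight at once.
// Results keep the order of `items`; failures are captured per item.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let nextIndex = 0;

  async function runLane() {
    while (nextIndex < items.length) {
      // Claimed synchronously, so no two lanes process the same index.
      const index = nextIndex++;
      try {
        const value = await worker(items[index], index);
        results[index] = { success: true, value };
      } catch (error) {
        const message = error instanceof Error ? error.message : String(error);
        results[index] = { success: false, error: message };
      }
    }
  }

  // Start up to `limit` lanes; each pulls the next item when it finishes one.
  const laneCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: laneCount }, () => runLane()));
  return results;
}
```

With the `transcribeAudio` helper from the basic example, a call might look like `await mapWithConcurrency(files, 3, (filePath) => transcribeAudio(filePath))`.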
## Audio Quality Guidelines
For best transcription results:
- **Sample Rate**: 16kHz or higher
- **Format**: WAV, MP3, or M4A recommended
- **Noise Level**: Minimize background noise
- **Speech Clarity**: Clear pronunciation and normal speaking pace
- **File Size**: Under 100MB recommended for individual files
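Because audio is sent as base64, the request payload is larger than the file on disk: base64 emits 4 characters for every 3 input bytes, so the encoded string is roughly one third larger. The sketch below estimates that size before reading a file into memory. `base64Length` and `checkPayload` are hypothetical helper names, and the default limit simply mirrors the 100MB guideline.

```javascript
// Base64 encodes every 3 input bytes as 4 output characters (with padding).
function base64Length(byteCount) {
  return 4 * Math.ceil(byteCount / 3);
}

// Hypothetical pre-flight check; the default limit mirrors the 100MB guideline.
function checkPayload(byteCount, maxBytes = 100 * 1024 * 1024) {
  return {
    rawBytes: byteCount,
    base64Chars: base64Length(byteCount),
    withinLimit: byteCount <= maxBytes
  };
}
```

The byte count can come from `fs.statSync(path).size`, so the check runs before `fs.readFileSync` loads the whole file.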
## Remember
- Always use z-ai-web-dev-sdk in backend code only
- The SDK is already installed - import as shown in examples
- Audio files must be converted to base64 before processing
- Implement proper error handling for production applications
- Consider audio quality for best transcription accuracy
- Clean up temporary files after processing
- Cache results for frequently transcribed files

27
skills/ASR/scripts/asr.ts Executable file

@@ -0,0 +1,27 @@
import ZAI from 'z-ai-web-dev-sdk';
import fs from 'fs';
import path from 'path';
async function main(inputFile: string) {
  if (!fs.existsSync(inputFile)) {
    console.error(`Audio file not found: ${inputFile}`);
    return;
  }
  try {
    const zai = await ZAI.create();
    const audioBuffer = fs.readFileSync(inputFile);
    const file_base64 = audioBuffer.toString('base64');
    const result = await zai.audio.asr.create({ file_base64 });
    console.log('Transcription result:');
    console.log(result.text ?? JSON.stringify(result, null, 2));
  } catch (err: any) {
    console.error('ASR failed:', err?.message || err);
  }
}
main('./output.wav');

21
skills/LLM/LICENSE.txt Executable file

@@ -0,0 +1,21 @@
MIT License
Copyright (c) 2025 z-ai-web-dev-sdk Skills
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

856
skills/LLM/SKILL.md Executable file

@@ -0,0 +1,856 @@
---
name: LLM
description: Implement large language model (LLM) chat completions using the z-ai-web-dev-sdk. Use this skill when the user needs to build conversational AI applications, chatbots, AI assistants, or any text generation features. Supports multi-turn conversations, system prompts, and context management.
license: MIT
---
# LLM (Large Language Model) Skill
This skill guides the implementation of chat completions functionality using the z-ai-web-dev-sdk package, enabling powerful conversational AI and text generation capabilities.
## Skills Path
**Skill Location**: `{project_path}/skills/LLM`
This skill is located at the path above in your project.
**Reference Scripts**: Example test scripts are available in the `{Skill Location}/scripts/` directory for quick testing and reference. See `{Skill Location}/scripts/chat.ts` for a working example.
## Overview
The LLM skill allows you to build applications that leverage large language models for natural language understanding and generation, including chatbots, AI assistants, content generation, and more.
**IMPORTANT**: z-ai-web-dev-sdk MUST be used in backend code only. Never use it in client-side code.
## Prerequisites
The z-ai-web-dev-sdk package is already installed. Import it as shown in the examples below.
## CLI Usage (For Simple Tasks)
For simple, one-off chat completions, you can use the z-ai CLI instead of writing code. This is ideal for quick tests, simple queries, or automation scripts.
### Basic Chat
```bash
# Simple question
z-ai chat --prompt "What is the capital of France?"
# Save response to file
z-ai chat -p "Explain quantum computing" -o response.json
# Stream the response
z-ai chat -p "Write a short poem" --stream
```
### With System Prompt
```bash
# Custom system prompt for specific behavior
z-ai chat \
--prompt "Review this code: function add(a,b) { return a+b; }" \
--system "You are an expert code reviewer" \
-o review.json
```
### With Thinking (Chain of Thought)
```bash
# Enable thinking for complex reasoning
z-ai chat \
--prompt "Solve this math problem: If a train travels 120km in 2 hours, what's its speed?" \
--thinking \
-o solution.json
```
### CLI Parameters
- `--prompt, -p <text>`: **Required** - User message content
- `--system, -s <text>`: Optional - System prompt for custom behavior
- `--thinking, -t`: Optional - Enable chain-of-thought reasoning (default: disabled)
- `--output, -o <path>`: Optional - Output file path (JSON format)
- `--stream`: Optional - Stream the response in real-time
### When to Use CLI vs SDK
**Use CLI for:**
- Quick one-off questions
- Simple automation scripts
- Testing prompts
- Single-turn conversations
**Use SDK for:**
- Multi-turn conversations with context
- Custom conversation management
- Integration with web applications
- Complex chat workflows
- Production applications
## Basic Chat Completions
### Simple Question and Answer
```javascript
import ZAI from 'z-ai-web-dev-sdk';
async function askQuestion(question) {
const zai = await ZAI.create();
const completion = await zai.chat.completions.create({
messages: [
{
role: 'assistant',
content: 'You are a helpful assistant.'
},
{
role: 'user',
content: question
}
],
thinking: { type: 'disabled' }
});
const response = completion.choices[0]?.message?.content;
return response;
}
// Usage
const answer = await askQuestion('What is the capital of France?');
console.log('Answer:', answer);
```
### Custom System Prompt
```javascript
import ZAI from 'z-ai-web-dev-sdk';
async function customAssistant(systemPrompt, userMessage) {
const zai = await ZAI.create();
const completion = await zai.chat.completions.create({
messages: [
{
role: 'assistant',
content: systemPrompt
},
{
role: 'user',
content: userMessage
}
],
thinking: { type: 'disabled' }
});
return completion.choices[0]?.message?.content;
}
// Usage - Code reviewer
const codeReview = await customAssistant(
'You are an expert code reviewer. Analyze code for bugs, performance issues, and best practices.',
'Review this function: function add(a, b) { return a + b; }'
);
// Usage - Creative writer
const story = await customAssistant(
'You are a creative fiction writer who writes engaging short stories.',
'Write a short story about a robot learning to paint.'
);
console.log(codeReview);
console.log(story);
```
## Multi-turn Conversations
### Conversation History Management
```javascript
import ZAI from 'z-ai-web-dev-sdk';
class ConversationManager {
constructor(systemPrompt = 'You are a helpful assistant.') {
this.messages = [
{
role: 'assistant',
content: systemPrompt
}
];
this.zai = null;
}
async initialize() {
this.zai = await ZAI.create();
}
async sendMessage(userMessage) {
// Add user message to history
this.messages.push({
role: 'user',
content: userMessage
});
// Get completion
const completion = await this.zai.chat.completions.create({
messages: this.messages,
thinking: { type: 'disabled' }
});
const assistantResponse = completion.choices[0]?.message?.content;
// Add assistant response to history
this.messages.push({
role: 'assistant',
content: assistantResponse
});
return assistantResponse;
}
getHistory() {
return this.messages;
}
clearHistory(systemPrompt = 'You are a helpful assistant.') {
this.messages = [
{
role: 'assistant',
content: systemPrompt
}
];
}
getMessageCount() {
// Subtract 1 for system message
return this.messages.length - 1;
}
}
// Usage
const conversation = new ConversationManager();
await conversation.initialize();
const response1 = await conversation.sendMessage('Hi, my name is John.');
console.log('AI:', response1);
const response2 = await conversation.sendMessage('What is my name?');
console.log('AI:', response2); // Should remember the name is John
console.log('Total messages:', conversation.getMessageCount());
```
### Context-Aware Conversations
```javascript
import ZAI from 'z-ai-web-dev-sdk';
class ContextualChat {
constructor() {
this.messages = [];
this.zai = null;
}
async initialize() {
this.zai = await ZAI.create();
}
async startConversation(role, context) {
// Set up system prompt with context
const systemPrompt = `You are ${role}. Context: ${context}`;
this.messages = [
{
role: 'assistant',
content: systemPrompt
}
];
}
async chat(userMessage) {
this.messages.push({
role: 'user',
content: userMessage
});
const completion = await this.zai.chat.completions.create({
messages: this.messages,
thinking: { type: 'disabled' }
});
const response = completion.choices[0]?.message?.content;
this.messages.push({
role: 'assistant',
content: response
});
return response;
}
}
// Usage - Customer support scenario
const support = new ContextualChat();
await support.initialize();
await support.startConversation(
'a customer support agent for TechCorp',
'The user has ordered product #12345 which is delayed due to shipping issues.'
);
const reply1 = await support.chat('Where is my order?');
console.log('Support:', reply1);
const reply2 = await support.chat('Can I get a refund?');
console.log('Support:', reply2);
```
## Advanced Use Cases
### Content Generation
```javascript
import ZAI from 'z-ai-web-dev-sdk';
class ContentGenerator {
constructor() {
this.zai = null;
}
async initialize() {
this.zai = await ZAI.create();
}
async generateBlogPost(topic, tone = 'professional') {
const completion = await this.zai.chat.completions.create({
messages: [
{
role: 'assistant',
content: `You are a professional content writer. Write in a ${tone} tone.`
},
{
role: 'user',
content: `Write a blog post about: ${topic}. Include an introduction, main points, and conclusion.`
}
],
thinking: { type: 'disabled' }
});
return completion.choices[0]?.message?.content;
}
async generateProductDescription(productName, features) {
const completion = await this.zai.chat.completions.create({
messages: [
{
role: 'assistant',
content: 'You are an expert at writing compelling product descriptions for e-commerce.'
},
{
role: 'user',
content: `Write a product description for "${productName}". Key features: ${features.join(', ')}.`
}
],
thinking: { type: 'disabled' }
});
return completion.choices[0]?.message?.content;
}
async generateEmailResponse(originalEmail, intent) {
const completion = await this.zai.chat.completions.create({
messages: [
{
role: 'assistant',
content: 'You are a professional email writer. Write clear, concise, and polite emails.'
},
{
role: 'user',
content: `Original email: "${originalEmail}"\n\nWrite a ${intent} response.`
}
],
thinking: { type: 'disabled' }
});
return completion.choices[0]?.message?.content;
}
}
// Usage
const generator = new ContentGenerator();
await generator.initialize();
const blogPost = await generator.generateBlogPost(
'The Future of Artificial Intelligence',
'informative'
);
console.log('Blog Post:', blogPost);
const productDesc = await generator.generateProductDescription(
'Smart Watch Pro',
['Heart rate monitoring', 'GPS tracking', 'Waterproof', '7-day battery life']
);
console.log('Product Description:', productDesc);
```
### Data Analysis and Summarization
```javascript
import ZAI from 'z-ai-web-dev-sdk';
async function analyzeData(data, analysisType) {
const zai = await ZAI.create();
const prompts = {
summarize: 'You are a data analyst. Summarize the key insights from the data.',
trend: 'You are a data analyst. Identify trends and patterns in the data.',
recommendation: 'You are a business analyst. Provide actionable recommendations based on the data.'
};
const completion = await zai.chat.completions.create({
messages: [
{
role: 'assistant',
content: prompts[analysisType] || prompts.summarize
},
{
role: 'user',
content: `Analyze this data:\n\n${JSON.stringify(data, null, 2)}`
}
],
thinking: { type: 'disabled' }
});
return completion.choices[0]?.message?.content;
}
// Usage
const salesData = {
Q1: { revenue: 100000, customers: 250 },
Q2: { revenue: 120000, customers: 280 },
Q3: { revenue: 150000, customers: 320 },
Q4: { revenue: 180000, customers: 380 }
};
const summary = await analyzeData(salesData, 'summarize');
const trends = await analyzeData(salesData, 'trend');
const recommendations = await analyzeData(salesData, 'recommendation');
console.log('Summary:', summary);
console.log('Trends:', trends);
console.log('Recommendations:', recommendations);
```
### Code Generation and Debugging
```javascript
import ZAI from 'z-ai-web-dev-sdk';
class CodeAssistant {
constructor() {
this.zai = null;
}
async initialize() {
this.zai = await ZAI.create();
}
async generateCode(description, language) {
const completion = await this.zai.chat.completions.create({
messages: [
{
role: 'assistant',
content: `You are an expert ${language} programmer. Write clean, efficient, and well-commented code.`
},
{
role: 'user',
content: `Write ${language} code to: ${description}`
}
],
thinking: { type: 'disabled' }
});
return completion.choices[0]?.message?.content;
}
async debugCode(code, issue) {
const completion = await this.zai.chat.completions.create({
messages: [
{
role: 'assistant',
content: 'You are an expert debugger. Identify bugs and suggest fixes.'
},
{
role: 'user',
content: `Code:\n${code}\n\nIssue: ${issue}\n\nFind the bug and suggest a fix.`
}
],
thinking: { type: 'disabled' }
});
return completion.choices[0]?.message?.content;
}
async explainCode(code) {
const completion = await this.zai.chat.completions.create({
messages: [
{
role: 'assistant',
content: 'You are a programming teacher. Explain code clearly and simply.'
},
{
role: 'user',
content: `Explain what this code does:\n\n${code}`
}
],
thinking: { type: 'disabled' }
});
return completion.choices[0]?.message?.content;
}
}
// Usage
const codeAssist = new CodeAssistant();
await codeAssist.initialize();
const newCode = await codeAssist.generateCode(
'Create a function that sorts an array of objects by a specific property',
'JavaScript'
);
console.log('Generated Code:', newCode);
const bugFix = await codeAssist.debugCode(
'function add(a, b) { return a - b; }',
'This function should add numbers but returns wrong results'
);
console.log('Debug Suggestion:', bugFix);
```
## Best Practices
### 1. Prompt Engineering
```javascript
import ZAI from 'z-ai-web-dev-sdk';
// Bad: Vague prompt (askQuestion is the helper from "Simple Question and Answer")
const bad = await askQuestion('Tell me about AI');
// Good: Specific and structured prompt
async function askWithContext(topic, format, audience) {
const zai = await ZAI.create();
const completion = await zai.chat.completions.create({
messages: [
{
role: 'assistant',
content: `You are an expert educator. Explain topics clearly for ${audience}.`
},
{
role: 'user',
content: `Explain ${topic} in ${format} format. Include practical examples.`
}
],
thinking: { type: 'disabled' }
});
return completion.choices[0]?.message?.content;
}
const good = await askWithContext('artificial intelligence', 'bullet points', 'beginners');
```
### 2. Error Handling
```javascript
import ZAI from 'z-ai-web-dev-sdk';
async function safeCompletion(messages, retries = 3) {
let lastError;
for (let attempt = 1; attempt <= retries; attempt++) {
try {
const zai = await ZAI.create();
const completion = await zai.chat.completions.create({
messages: messages,
thinking: { type: 'disabled' }
});
const response = completion.choices[0]?.message?.content;
if (!response || response.trim().length === 0) {
throw new Error('Empty response from AI');
}
return {
success: true,
content: response,
attempts: attempt
};
} catch (error) {
lastError = error;
console.error(`Attempt ${attempt} failed:`, error.message);
if (attempt < retries) {
// Wait before retry (exponential backoff: 1s, 2s, 4s, ...)
await new Promise(resolve => setTimeout(resolve, 1000 * 2 ** (attempt - 1)));
}
}
}
return {
success: false,
error: lastError.message,
attempts: retries
};
}
```
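A retry delay that doubles on each attempt can be factored into a small helper, which makes the schedule easy to cap and to test without real waiting. This is a sketch under stated assumptions: `backoffDelay`, `withRetry`, and the `wait` option are hypothetical names, and the 30-second cap is an arbitrary choice.

```javascript
// Delay before the next attempt: baseMs, 2*baseMs, 4*baseMs, ... capped at maxMs.
function backoffDelay(attempt, baseMs = 1000, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** (attempt - 1));
}

// Default sleep; injectable through `wait` so tests can skip real delays.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Retry an async operation, rethrowing the last error once attempts run out.
// `operation` receives the 1-based attempt number.
async function withRetry(operation, { retries = 3, baseMs = 1000, wait = sleep } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= retries; attempt++) {
    try {
      return await operation(attempt);
    } catch (error) {
      lastError = error;
      if (attempt < retries) {
        await wait(backoffDelay(attempt, baseMs));
      }
    }
  }
  throw lastError;
}
```

A chat call could then be wrapped as `await withRetry(() => zai.chat.completions.create({ messages, thinking: { type: 'disabled' } }))`, reusing one `zai` instance across attempts.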
### 3. Context Management
```javascript
import ZAI from 'z-ai-web-dev-sdk';
class ManagedConversation {
constructor(maxMessages = 20) {
this.maxMessages = maxMessages;
this.systemPrompt = '';
this.messages = [];
this.zai = null;
}
async initialize(systemPrompt) {
this.zai = await ZAI.create();
this.systemPrompt = systemPrompt;
this.messages = [
{
role: 'assistant',
content: systemPrompt
}
];
}
async chat(userMessage) {
// Add user message
this.messages.push({
role: 'user',
content: userMessage
});
// Trim old messages if exceeding limit (keep system prompt)
if (this.messages.length > this.maxMessages) {
this.messages = [
this.messages[0], // Keep system prompt
...this.messages.slice(-(this.maxMessages - 1))
];
}
const completion = await this.zai.chat.completions.create({
messages: this.messages,
thinking: { type: 'disabled' }
});
const response = completion.choices[0]?.message?.content;
this.messages.push({
role: 'assistant',
content: response
});
return response;
}
getTokenEstimate() {
// Rough estimate: ~4 characters per token
const totalChars = this.messages
.map(m => (m.content ?? '').length) // Treat a missing reply as empty text
.reduce((a, b) => a + b, 0);
return Math.ceil(totalChars / 4);
}
}
```
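Trimming by message count ignores how long each message is. As an alternative, the history can be trimmed against an approximate token budget, reusing the same rough estimate of about 4 characters per token. This is a minimal sketch for text-only messages; `estimateTokens` and `trimToTokenBudget` are hypothetical helpers, and the estimate is not the model's real tokenizer.

```javascript
// Rough estimate: ~4 characters per token (text-only messages assumed).
function estimateTokens(message) {
  return Math.ceil((message.content ?? '').length / 4);
}

// Hypothetical helper: keeps messages[0] (the system prompt) plus the most
// recent messages that still fit within `maxTokens`.
function trimToTokenBudget(messages, maxTokens) {
  if (messages.length === 0) return [];
  const [systemMessage, ...rest] = messages;
  let budget = maxTokens - estimateTokens(systemMessage);
  const kept = [];
  // Walk backwards so the newest messages are considered first.
  for (let i = rest.length - 1; i >= 0; i--) {
    const cost = estimateTokens(rest[i]);
    if (cost > budget) break;
    budget -= cost;
    kept.unshift(rest[i]);
  }
  return [systemMessage, ...kept];
}
```

Calling `trimToTokenBudget(this.messages, budget)` before each request would take the place of the count-based trim inside `chat()`.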
### 4. Response Processing
```javascript
import ZAI from 'z-ai-web-dev-sdk';
async function getStructuredResponse(query, format = 'json') {
const zai = await ZAI.create();
const formatInstructions = {
json: 'Respond with valid JSON only. No additional text.',
list: 'Respond with a numbered list.',
markdown: 'Respond in Markdown format.'
};
const completion = await zai.chat.completions.create({
messages: [
{
role: 'assistant',
content: `You are a helpful assistant. ${formatInstructions[format]}`
},
{
role: 'user',
content: query
}
],
thinking: { type: 'disabled' }
});
const response = completion.choices[0]?.message?.content;
// Parse JSON if requested
if (format === 'json') {
try {
return JSON.parse(response);
} catch (e) {
console.error('Failed to parse JSON response');
return { raw: response };
}
}
return response;
}
// Usage
const jsonData = await getStructuredResponse(
'List three programming languages with their primary use cases',
'json'
);
console.log(jsonData);
```
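Even when asked for JSON only, a model may wrap its answer in a Markdown code fence or add a sentence around it, which makes a direct `JSON.parse` fail. The sketch below tries a few progressively looser extractions before giving up. `extractJson` is a hypothetical helper, not an SDK function.

```javascript
// Hypothetical helper: pulls a JSON value out of a reply that may be wrapped
// in a Markdown code fence or surrounded by extra prose. Returns null if
// nothing parses.
function extractJson(text) {
  if (typeof text !== 'string') return null;
  const candidates = [text.trim()];
  // Built at runtime so this example contains no literal code fence.
  const fence = '`'.repeat(3);
  // 1. Content of the first fenced block, with or without a "json" tag.
  const fenced = text.match(new RegExp(fence + '(?:json)?\\s*([\\s\\S]*?)' + fence, 'i'));
  if (fenced) candidates.push(fenced[1].trim());
  // 2. The outermost {...} or [...] span.
  for (const [open, close] of [['{', '}'], ['[', ']']]) {
    const start = text.indexOf(open);
    const end = text.lastIndexOf(close);
    if (start !== -1 && end > start) candidates.push(text.slice(start, end + 1));
  }
  for (const candidate of candidates) {
    try {
      return JSON.parse(candidate);
    } catch {
      // Not valid JSON; try the next candidate.
    }
  }
  return null;
}
```

In `getStructuredResponse`, the `JSON.parse(response)` call could be replaced with `extractJson(response) ?? { raw: response }`.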
## Common Use Cases
1. **Chatbots & Virtual Assistants**: Build conversational interfaces for customer support
2. **Content Generation**: Create articles, product descriptions, marketing copy
3. **Code Assistance**: Generate, explain, and debug code
4. **Data Analysis**: Analyze and summarize complex data sets
5. **Language Translation**: Translate text between languages
6. **Educational Tools**: Create tutoring and learning applications
7. **Email Automation**: Generate professional email responses
8. **Creative Writing**: Story generation, poetry, and creative content
## Integration Examples
### Express.js Chatbot API
```javascript
import express from 'express';
import ZAI from 'z-ai-web-dev-sdk';
const app = express();
app.use(express.json());
// Store conversations in memory (use database in production)
const conversations = new Map();
let zaiInstance;
async function initZAI() {
zaiInstance = await ZAI.create();
}
app.post('/api/chat', async (req, res) => {
try {
const { sessionId, message, systemPrompt } = req.body;
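// Without a sessionId, every anonymous caller would share one history
if (!sessionId || !message) {
return res.status(400).json({ error: 'sessionId and message are required' });
}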
// Get or create conversation history
let history = conversations.get(sessionId) || [
{
role: 'assistant',
content: systemPrompt || 'You are a helpful assistant.'
}
];
// Add user message
history.push({
role: 'user',
content: message
});
// Get completion
const completion = await zaiInstance.chat.completions.create({
messages: history,
thinking: { type: 'disabled' }
});
const aiResponse = completion.choices[0]?.message?.content;
// Add AI response to history
history.push({
role: 'assistant',
content: aiResponse
});
// Save updated history
conversations.set(sessionId, history);
res.json({
success: true,
response: aiResponse,
messageCount: history.length - 1
});
} catch (error) {
res.status(500).json({
success: false,
error: error.message
});
}
});
app.delete('/api/chat/:sessionId', (req, res) => {
const { sessionId } = req.params;
conversations.delete(sessionId);
res.json({ success: true, message: 'Conversation cleared' });
});
initZAI().then(() => {
app.listen(3000, () => {
console.log('Chatbot API running on port 3000');
});
});
```
## Troubleshooting
**Issue**: "SDK must be used in backend"
- **Solution**: Ensure z-ai-web-dev-sdk is only imported and used in server-side code
**Issue**: Empty or incomplete responses
- **Solution**: Check that completion.choices[0]?.message?.content exists and is not empty
**Issue**: Conversation context getting too long
- **Solution**: Implement message trimming to keep only recent messages
**Issue**: Inconsistent responses
- **Solution**: Use more specific system prompts and provide clear instructions
**Issue**: Rate limiting errors
- **Solution**: Implement retry logic with exponential backoff
## Performance Tips
1. **Reuse SDK Instance**: Create ZAI instance once and reuse across requests
2. **Manage Context Length**: Trim old messages to avoid token limits
3. **Implement Caching**: Cache responses for common queries
4. **Use Specific Prompts**: Clear prompts lead to faster, better responses
5. **Handle Errors Gracefully**: Implement retry logic and fallback responses
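As an illustration of the caching tip, the sketch below keeps completed responses in memory, keyed by the exact message list, and expires them after a fixed time. `ResponseCache` is a hypothetical class, not an SDK feature, and the default TTL and size are arbitrary. A multi-process deployment would need a shared store instead.

```javascript
// Hypothetical in-memory cache. Entries expire after `ttlMs`, and the oldest
// entry is evicted once `maxEntries` is reached.
class ResponseCache {
  constructor({ ttlMs = 5 * 60 * 1000, maxEntries = 500, now = Date.now } = {}) {
    this.ttlMs = ttlMs;
    this.maxEntries = maxEntries;
    this.now = now;
    this.entries = new Map(); // A Map iterates in insertion order.
  }

  // Identical message lists produce the same key.
  static keyFor(messages) {
    return JSON.stringify(messages.map(({ role, content }) => [role, content]));
  }

  get(messages) {
    const key = ResponseCache.keyFor(messages);
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() - entry.storedAt > this.ttlMs) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(messages, value) {
    const key = ResponseCache.keyFor(messages);
    this.entries.delete(key); // Re-inserting moves the key to the newest slot.
    if (this.entries.size >= this.maxEntries) {
      const oldestKey = this.entries.keys().next().value;
      this.entries.delete(oldestKey);
    }
    this.entries.set(key, { value, storedAt: this.now() });
  }
}
```

A request handler would call `cache.get(messages)` first and fall back to `zai.chat.completions.create(...)` only on a miss, storing the result with `cache.set(messages, content)`.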
## Security Considerations
1. **Input Validation**: Always validate and sanitize user input
2. **Rate Limiting**: Implement rate limits to prevent abuse
3. **API Key Protection**: Never expose SDK credentials in client-side code
4. **Content Filtering**: Filter sensitive or inappropriate content
5. **Session Management**: Implement proper session handling and cleanup
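Two of these points, input validation and rate limiting, can be sketched without any extra dependencies. Both helpers below are hypothetical, and the limits (20 requests per minute, 4000 characters) are illustrative values rather than SDK requirements. The limiter keeps its counters in process memory, so it only suits a single-instance server.

```javascript
// Hypothetical fixed-window rate limiter keyed by session ID or IP address.
function createRateLimiter({ limit = 20, windowMs = 60000, now = Date.now } = {}) {
  const windows = new Map();
  return function isAllowed(key) {
    const current = now();
    const entry = windows.get(key);
    // Start a new window on first use or once the old window has elapsed.
    if (!entry || current - entry.start >= windowMs) {
      windows.set(key, { start: current, count: 1 });
      return true;
    }
    entry.count += 1;
    return entry.count <= limit;
  };
}

// Hypothetical input check: accepts only non-empty strings of at most
// `maxLength` characters and strips ASCII control characters.
function sanitizeMessage(input, maxLength = 4000) {
  if (typeof input !== 'string') return null;
  const cleaned = input.replace(/[\u0000-\u0008\u000B\u000C\u000E-\u001F\u007F]/g, '').trim();
  if (cleaned.length === 0 || cleaned.length > maxLength) return null;
  return cleaned;
}
```

In an Express handler, a `false` result from `isAllowed(sessionId)` would map to HTTP 429 and a `null` result from `sanitizeMessage(message)` to HTTP 400.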
## Remember
- Always use z-ai-web-dev-sdk in backend code only
- The SDK is already installed - import as shown in examples
- Use the 'assistant' role for system prompts
- Set thinking to { type: 'disabled' } for standard completions
- Implement proper error handling and retries for production
- Manage conversation history to avoid token limits
- Clear and specific prompts lead to better results
- Check `scripts/chat.ts` for a quick start example

32
skills/LLM/scripts/chat.ts Executable file

@@ -0,0 +1,32 @@
import ZAI, { ChatMessage } from "z-ai-web-dev-sdk";
async function main(prompt: string) {
try {
const zai = await ZAI.create();
const messages: ChatMessage[] = [
{
role: "assistant",
content: "Hi, I'm a helpful assistant."
},
{
role: "user",
content: prompt,
},
];
const response = await zai.chat.completions.create({
messages,
stream: false,
thinking: { type: "disabled" },
});
const reply = response.choices?.[0]?.message?.content;
console.log("Chat reply:");
console.log(reply ?? JSON.stringify(response, null, 2));
} catch (err: any) {
console.error("Chat failed:", err?.message || err);
}
}
main('What is the capital of France?');

21
skills/TTS/LICENSE.txt Executable file

@@ -0,0 +1,21 @@
MIT License
Copyright (c) 2025 z-ai-web-dev-sdk Skills
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

735
skills/TTS/SKILL.md Executable file

@@ -0,0 +1,735 @@
---
name: TTS
description: Implement text-to-speech (TTS) capabilities using the z-ai-web-dev-sdk. Use this skill when the user needs to convert text into natural-sounding speech, create audio content, build voice-enabled applications, or generate spoken audio files. Supports multiple voices, adjustable speed, and various audio formats.
license: MIT
---
# TTS (Text to Speech) Skill
This skill guides the implementation of text-to-speech (TTS) functionality using the z-ai-web-dev-sdk package, enabling conversion of text into natural-sounding speech audio.
## Skills Path
**Skill Location**: `{project_path}/skills/TTS`
This skill is located at the above path in your project.
**Reference Scripts**: Example test scripts are available in the `{Skill Location}/scripts/` directory for quick testing and reference. See `{Skill Location}/scripts/tts.ts` for a working example.
## Overview
Text-to-Speech allows you to build applications that generate spoken audio from text input, supporting various voices, speeds, and output formats for diverse use cases.
**IMPORTANT**: z-ai-web-dev-sdk MUST be used in backend code only. Never use it in client-side code.
## API Limitations and Constraints
Before implementing TTS functionality, be aware of these important limitations:
### Input Text Constraints
- **Maximum length**: 1024 characters per request
- Text exceeding this limit must be split into smaller chunks
### Audio Parameters
- **Speed range**: 0.5 to 2.0
  - 0.5 = half speed (slower)
  - 1.0 = normal speed (default)
  - 2.0 = double speed (faster)
- **Volume range**: Greater than 0, up to 10
  - Default: 1.0
  - Values must be greater than 0 (exclusive) and up to 10 (inclusive)
### Format and Streaming
- **Streaming limitation**: When `stream: true` is enabled, only `pcm` format is supported
- **Non-streaming**: Supports `wav`, `pcm`, and `mp3` formats
- **Sample rate**: 24000 Hz (recommended)
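These limits can be checked locally before a request is sent, so that invalid input fails fast with a readable message. The sketch below encodes only the constraints listed above. `validateTtsParams` is a hypothetical helper, and it assumes the 1024-character limit is counted the same way as JavaScript's `string.length`, which the documentation does not confirm.

```javascript
const TTS_FORMATS = ['wav', 'pcm', 'mp3'];

// Hypothetical pre-flight check. Returns a list of problems; an empty list
// means the parameters satisfy the documented limits.
function validateTtsParams({ input, speed = 1.0, volume = 1.0, response_format = 'wav', stream = false }) {
  const problems = [];
  if (typeof input !== 'string' || input.trim().length === 0) {
    problems.push('input must be a non-empty string');
  } else if (input.length > 1024) {
    problems.push(`input is ${input.length} characters; the maximum is 1024`);
  }
  if (!(speed >= 0.5 && speed <= 2.0)) {
    problems.push('speed must be between 0.5 and 2.0');
  }
  if (!(volume > 0 && volume <= 10)) {
    problems.push('volume must be greater than 0 and at most 10');
  }
  if (!TTS_FORMATS.includes(response_format)) {
    problems.push('response_format must be wav, pcm, or mp3');
  } else if (stream && response_format !== 'pcm') {
    problems.push('streaming supports only the pcm format');
  }
  return problems;
}
```

A caller would typically throw or return HTTP 400 when the returned list is not empty.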
### Best Practice for Long Text
```javascript
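function splitTextIntoChunks(text, maxLength = 1000) {
  const chunks = [];
  // Match each sentence with its trailing punctuation (ASCII or CJK). The
  // punctuation is optional so a final unpunctuated fragment is not dropped.
  const sentences = text.match(/[^.!?。!?]+[.!?。!?]*/g) || [text];
  let currentChunk = '';
  for (let sentence of sentences) {
    // Hard-split any single sentence that is longer than maxLength.
    while (sentence.length > maxLength) {
      if (currentChunk.trim()) chunks.push(currentChunk.trim());
      currentChunk = '';
      chunks.push(sentence.slice(0, maxLength).trim());
      sentence = sentence.slice(maxLength);
    }
    if ((currentChunk + sentence).length <= maxLength) {
      currentChunk += sentence;
    } else {
      if (currentChunk.trim()) chunks.push(currentChunk.trim());
      currentChunk = sentence;
    }
  }
  if (currentChunk.trim()) chunks.push(currentChunk.trim());
  return chunks;
}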
```
## Prerequisites
The z-ai-web-dev-sdk package is already installed. Import it as shown in the examples below.
## CLI Usage (For Simple Tasks)
For simple text-to-speech conversions, you can use the z-ai CLI instead of writing code. This is ideal for quick audio generation, testing voices, or simple automation.
### Basic TTS
```bash
# Convert text to speech (default WAV format)
z-ai tts --input "Hello, world" --output ./hello.wav
# Using short options
z-ai tts -i "Hello, world" -o ./hello.wav
```
### Different Voices and Speed
```bash
# Use specific voice
z-ai tts -i "Welcome to our service" -o ./welcome.wav --voice tongtong
# Adjust speech speed (0.5-2.0)
z-ai tts -i "This is faster speech" -o ./fast.wav --speed 1.5
# Slower speech
z-ai tts -i "This is slower speech" -o ./slow.wav --speed 0.8
```
### Different Output Formats
```bash
# MP3 format
z-ai tts -i "Hello World" -o ./hello.mp3 --format mp3
# WAV format (default)
z-ai tts -i "Hello World" -o ./hello.wav --format wav
# PCM format
z-ai tts -i "Hello World" -o ./hello.pcm --format pcm
```
### Streaming Output
```bash
# Stream audio generation (streaming supports PCM only)
z-ai tts -i "This is a longer text that will be streamed" -o ./stream.pcm --format pcm --stream
```
### CLI Parameters
- `--input, -i <text>`: **Required** - Text to convert to speech (max 1024 characters)
- `--output, -o <path>`: **Required** - Output audio file path
- `--voice, -v <voice>`: Optional - Voice type (default: tongtong)
- `--speed, -s <number>`: Optional - Speech speed, 0.5-2.0 (default: 1.0)
- `--format, -f <format>`: Optional - Output format: wav, mp3, pcm (default: wav)
- `--stream`: Optional - Enable streaming output (only supports pcm format)
### When to Use CLI vs SDK
**Use CLI for:**
- Quick text-to-speech conversions
- Testing different voices and speeds
- Simple batch audio generation
- Command-line automation scripts
**Use SDK for:**
- Dynamic audio generation in applications
- Integration with web services
- Custom audio processing pipelines
- Production applications with complex requirements
## Basic TTS Implementation
### Simple Text to Speech
```javascript
import ZAI from 'z-ai-web-dev-sdk';
import fs from 'fs';
async function textToSpeech(text, outputPath) {
const zai = await ZAI.create();
const response = await zai.audio.tts.create({
input: text,
voice: 'tongtong',
speed: 1.0,
response_format: 'wav',
stream: false
});
// Get array buffer from Response object
const arrayBuffer = await response.arrayBuffer();
const buffer = Buffer.from(new Uint8Array(arrayBuffer));
fs.writeFileSync(outputPath, buffer);
console.log(`Audio saved to ${outputPath}`);
return outputPath;
}
// Usage
await textToSpeech('Hello, world!', './output.wav');
```
### Multiple Voice Options
```javascript
import ZAI from 'z-ai-web-dev-sdk';
import fs from 'fs';
async function generateWithVoice(text, voice, outputPath) {
const zai = await ZAI.create();
const response = await zai.audio.tts.create({
input: text,
voice: voice, // Available voices: tongtong, chuichui, xiaochen, jam, kazi, douji, luodo
speed: 1.0,
response_format: 'wav',
stream: false
});
// Get array buffer from Response object
const arrayBuffer = await response.arrayBuffer();
const buffer = Buffer.from(new Uint8Array(arrayBuffer));
fs.writeFileSync(outputPath, buffer);
return outputPath;
}
// Usage
await generateWithVoice('Welcome to our service', 'tongtong', './welcome.wav');
```
### Adjustable Speed
```javascript
import ZAI from 'z-ai-web-dev-sdk';
import fs from 'fs';
async function generateWithSpeed(text, speed, outputPath) {
const zai = await ZAI.create();
// Speed range: 0.5 to 2.0 (API constraint)
// 0.5 = half speed (slower)
// 1.0 = normal speed (default)
// 2.0 = double speed (faster)
// Values outside this range will cause API errors
const response = await zai.audio.tts.create({
input: text,
voice: 'tongtong',
speed: speed,
response_format: 'wav',
stream: false
});
// Get array buffer from Response object
const arrayBuffer = await response.arrayBuffer();
const buffer = Buffer.from(new Uint8Array(arrayBuffer));
fs.writeFileSync(outputPath, buffer);
return outputPath;
}
// Usage - slower narration
await generateWithSpeed('This is an important announcement', 0.8, './slow.wav');
// Usage - faster narration
await generateWithSpeed('Quick update', 1.3, './fast.wav');
```
### Adjustable Volume
```javascript
import ZAI from 'z-ai-web-dev-sdk';
import fs from 'fs';
async function generateWithVolume(text, volume, outputPath) {
const zai = await ZAI.create();
// Volume range: greater than 0, up to 10 (API constraint)
// Values must be > 0 (exclusive) and <= 10 (inclusive)
// Default: 1.0 (normal volume)
const response = await zai.audio.tts.create({
input: text,
voice: 'tongtong',
speed: 1.0,
volume: volume, // Optional parameter
response_format: 'wav',
stream: false
});
// Get array buffer from Response object
const arrayBuffer = await response.arrayBuffer();
const buffer = Buffer.from(new Uint8Array(arrayBuffer));
fs.writeFileSync(outputPath, buffer);
return outputPath;
}
// Usage - louder audio
await generateWithVolume('This is an announcement', 5.0, './loud.wav');
// Usage - quieter audio
await generateWithVolume('Whispered message', 0.5, './quiet.wav');
```
## Advanced Use Cases
### Batch Processing
```javascript
import ZAI from 'z-ai-web-dev-sdk';
import fs from 'fs';
import path from 'path';
async function batchTextToSpeech(textArray, outputDir) {
const zai = await ZAI.create();
const results = [];
// Ensure output directory exists
if (!fs.existsSync(outputDir)) {
fs.mkdirSync(outputDir, { recursive: true });
}
for (let i = 0; i < textArray.length; i++) {
try {
const text = textArray[i];
const outputPath = path.join(outputDir, `audio_${i + 1}.wav`);
const response = await zai.audio.tts.create({
input: text,
voice: 'tongtong',
speed: 1.0,
response_format: 'wav',
stream: false
});
// Get array buffer from Response object
const arrayBuffer = await response.arrayBuffer();
const buffer = Buffer.from(new Uint8Array(arrayBuffer));
fs.writeFileSync(outputPath, buffer);
results.push({
success: true,
text,
path: outputPath
});
} catch (error) {
results.push({
success: false,
text: textArray[i],
error: error.message
});
}
}
return results;
}
// Usage
const texts = [
'Welcome to chapter one',
'Welcome to chapter two',
'Welcome to chapter three'
];
const results = await batchTextToSpeech(texts, './audio-output');
console.log('Generated:', results.length, 'audio files');
```
### Dynamic Content Generation
```javascript
import ZAI from 'z-ai-web-dev-sdk';
import fs from 'fs';
class TTSGenerator {
constructor() {
this.zai = null;
}
async initialize() {
this.zai = await ZAI.create();
}
async generateAudio(text, options = {}) {
const {
voice = 'tongtong',
speed = 1.0,
format = 'wav'
} = options;
const response = await this.zai.audio.tts.create({
input: text,
voice: voice,
speed: speed,
response_format: format,
stream: false
});
// Get array buffer from Response object
const arrayBuffer = await response.arrayBuffer();
return Buffer.from(new Uint8Array(arrayBuffer));
}
async saveAudio(text, outputPath, options = {}) {
const buffer = await this.generateAudio(text, options);
if (buffer) {
fs.writeFileSync(outputPath, buffer);
return outputPath;
}
return null;
}
}
// Usage
const generator = new TTSGenerator();
await generator.initialize();
await generator.saveAudio(
'Hello, this is a test',
'./output.wav',
{ speed: 1.2 }
);
```
### Next.js API Route Example
```typescript
import { NextRequest, NextResponse } from 'next/server';
export async function POST(req: NextRequest) {
try {
const { text, voice = 'tongtong', speed = 1.0 } = await req.json();
// Import ZAI SDK
const ZAI = (await import('z-ai-web-dev-sdk')).default;
// Create SDK instance
const zai = await ZAI.create();
// Generate TTS audio
const response = await zai.audio.tts.create({
input: text.trim(),
voice: voice,
speed: speed,
response_format: 'wav',
stream: false,
});
// Get array buffer from Response object
const arrayBuffer = await response.arrayBuffer();
const buffer = Buffer.from(new Uint8Array(arrayBuffer));
// Return audio as response
return new NextResponse(buffer, {
status: 200,
headers: {
'Content-Type': 'audio/wav',
'Content-Length': buffer.length.toString(),
'Cache-Control': 'no-cache',
},
});
} catch (error) {
console.error('TTS API Error:', error);
return NextResponse.json(
{
error: error instanceof Error ? error.message : 'Failed to generate speech. Please try again later.',
},
{ status: 500 }
);
}
}
```
## Best Practices
### 1. Text Preparation
```javascript
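function prepareTextForTTS(text) {
  // Remove excessive whitespace
  text = text.replace(/\s+/g, ' ').trim();
  // Expand common abbreviations for better pronunciation
  const abbreviations = {
    'Dr.': 'Doctor',
    'Mr.': 'Mister',
    'Mrs.': 'Misses',
    'etc.': 'et cetera'
  };
  for (const [abbr, full] of Object.entries(abbreviations)) {
    // Escape regex metacharacters so "." matches a literal period, and anchor
    // at a word boundary so "Mr." does not match inside "Mrs." or other words.
    const escaped = abbr.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
    text = text.replace(new RegExp(`\\b${escaped}`, 'g'), full);
  }
  return text;
}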
```
### 2. Error Handling
```javascript
import ZAI from 'z-ai-web-dev-sdk';
import fs from 'fs';
async function safeTTS(text, outputPath) {
try {
// Validate input
if (!text || text.trim().length === 0) {
throw new Error('Text input cannot be empty');
}
if (text.length > 1024) {
throw new Error('Text input exceeds maximum length of 1024 characters');
}
const zai = await ZAI.create();
const response = await zai.audio.tts.create({
input: text,
voice: 'tongtong',
speed: 1.0,
response_format: 'wav',
stream: false
});
// Get array buffer from Response object
const arrayBuffer = await response.arrayBuffer();
const buffer = Buffer.from(new Uint8Array(arrayBuffer));
fs.writeFileSync(outputPath, buffer);
return {
success: true,
path: outputPath,
size: buffer.length
};
} catch (error) {
console.error('TTS Error:', error);
return {
success: false,
error: error.message
};
}
}
```
### 3. SDK Instance Reuse
```javascript
import ZAI from 'z-ai-web-dev-sdk';
// Create a singleton instance
let zaiInstance = null;
async function getZAIInstance() {
if (!zaiInstance) {
zaiInstance = await ZAI.create();
}
return zaiInstance;
}
// Usage
const zai = await getZAIInstance();
const response = await zai.audio.tts.create({ ... });
```
## Common Use Cases
1. **Audiobooks & Podcasts**: Convert written content to audio format
2. **E-learning**: Create narration for educational content
3. **Accessibility**: Provide audio versions of text content
4. **Voice Assistants**: Generate dynamic responses
5. **Announcements**: Create automated audio notifications
6. **IVR Systems**: Generate phone system prompts
7. **Content Localization**: Create audio in different languages
## Integration Examples
### Express.js API Endpoint
```javascript
import express from 'express';
import ZAI from 'z-ai-web-dev-sdk';
import fs from 'fs';
import path from 'path';
const app = express();
app.use(express.json());
let zaiInstance;
const outputDir = './audio-output';
async function initZAI() {
zaiInstance = await ZAI.create();
if (!fs.existsSync(outputDir)) {
fs.mkdirSync(outputDir, { recursive: true });
}
}
app.post('/api/tts', async (req, res) => {
try {
const { text, voice = 'tongtong', speed = 1.0 } = req.body;
if (!text) {
return res.status(400).json({ error: 'Text is required' });
}
const filename = `tts_${Date.now()}.wav`;
const outputPath = path.join(outputDir, filename);
const response = await zaiInstance.audio.tts.create({
input: text,
voice: voice,
speed: speed,
response_format: 'wav',
stream: false
});
// Get array buffer from Response object
const arrayBuffer = await response.arrayBuffer();
const buffer = Buffer.from(new Uint8Array(arrayBuffer));
fs.writeFileSync(outputPath, buffer);
res.json({
success: true,
audioUrl: `/audio/${filename}`,
size: buffer.length
});
} catch (error) {
res.status(500).json({ error: error.message });
}
});
app.use('/audio', express.static('audio-output'));
initZAI().then(() => {
app.listen(3000, () => {
console.log('TTS API running on port 3000');
});
});
```
## Troubleshooting
**Issue**: "Input text exceeds maximum length"
- **Solution**: Text input is limited to 1024 characters. Split longer text into chunks using the `splitTextIntoChunks` function shown in the API Limitations section
**Issue**: "Invalid speed parameter" or unexpected speed behavior
- **Solution**: Speed must be between 0.5 and 2.0. Check your speed value is within this range
**Issue**: "Invalid volume parameter"
- **Solution**: Volume must be greater than 0 and up to 10. Ensure volume value is in range (0, 10]
**Issue**: "Stream format not supported" with WAV/MP3
- **Solution**: Streaming mode only supports PCM format. Either use `response_format: 'pcm'` with streaming, or disable streaming (`stream: false`) for WAV/MP3 output
**Issue**: "SDK must be used in backend"
- **Solution**: Ensure z-ai-web-dev-sdk is only imported in server-side code
**Issue**: "TypeError: response.audio is undefined"
- **Solution**: The SDK returns a standard Response object. Use `await response.arrayBuffer()` instead of accessing `response.audio`
**Issue**: Generated audio file is empty or corrupted
- **Solution**: Ensure you're calling `await response.arrayBuffer()` and properly converting to Buffer: `Buffer.from(new Uint8Array(arrayBuffer))`
**Issue**: Audio sounds unnatural
- **Solution**: Prepare text properly (remove special characters, expand abbreviations)
**Issue**: Long processing times
- **Solution**: Break long text into smaller chunks and process in parallel
**Issue**: Next.js caching old API route
- **Solution**: Create a new API route endpoint or restart the dev server
## Performance Tips
1. **Reuse SDK Instance**: Create ZAI instance once and reuse
2. **Implement Caching**: Cache generated audio for repeated text
3. **Batch Processing**: Process multiple texts efficiently
4. **Optimize Text**: Remove unnecessary content before generation
5. **Async Processing**: Use queues for handling multiple requests
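For batch and asynchronous processing, running a few requests at a time is usually safer than firing them all at once. The sketch below is a generic concurrency limiter that keeps results in input order and records failures per item. `mapWithConcurrency` is a hypothetical helper, and a suitable limit depends on rate limits that this document does not specify.

```javascript
// Hypothetical helper: runs `worker` over `items` with at most `limit` calls
// in flight and resolves to per-item results in the original order.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let nextIndex = 0;

  // Each runner repeatedly claims the next unprocessed index.
  async function runner() {
    while (nextIndex < items.length) {
      const index = nextIndex++;
      try {
        results[index] = { success: true, value: await worker(items[index], index) };
      } catch (error) {
        results[index] = { success: false, error: error.message };
      }
    }
  }

  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}

// Usage sketch (inside an async function, with `zai` already created):
// const results = await mapWithConcurrency(chunks, 3, chunk =>
//   zai.audio.tts.create({ input: chunk, voice: 'tongtong', speed: 1.0, response_format: 'wav', stream: false })
// );
```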
## Important Notes
### API Constraints
**Input Text Length**: Maximum 1024 characters per request. For longer text:
```javascript
// Split long text into chunks
const longText = "..."; // Your long text here
const chunks = splitTextIntoChunks(longText, 1000);
for (const chunk of chunks) {
const response = await zai.audio.tts.create({
input: chunk,
voice: 'tongtong',
speed: 1.0,
response_format: 'wav',
stream: false
});
// Process each chunk...
}
```
**Streaming Format Limitation**: When using `stream: true`, only `pcm` format is supported. For `wav` or `mp3` output, use `stream: false`.
**Sample Rate**: Audio is generated at 24000 Hz sample rate (recommended setting for playback).
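Raw PCM has no header, so most players cannot open it directly. Once the PCM bytes have been collected into a single Buffer, they can be wrapped in a standard 44-byte WAV header. The sketch below uses the 24000 Hz rate stated above. The 16-bit, mono, little-endian layout is an assumption that this document does not confirm, so adjust the options if the audio plays as noise or at the wrong speed. `pcmToWav` is a hypothetical helper.

```javascript
// Wraps raw PCM samples in a 44-byte RIFF/WAVE header.
// Assumed (not confirmed by the SDK documentation): 16-bit little-endian, mono.
function pcmToWav(pcm, { sampleRate = 24000, channels = 1, bitsPerSample = 16 } = {}) {
  const blockAlign = channels * (bitsPerSample / 8); // Bytes per sample frame
  const header = Buffer.alloc(44);
  header.write('RIFF', 0, 'ascii');
  header.writeUInt32LE(36 + pcm.length, 4);          // File size minus 8 bytes
  header.write('WAVE', 8, 'ascii');
  header.write('fmt ', 12, 'ascii');
  header.writeUInt32LE(16, 16);                      // Size of the fmt chunk
  header.writeUInt16LE(1, 20);                       // Format 1 = linear PCM
  header.writeUInt16LE(channels, 22);
  header.writeUInt32LE(sampleRate, 24);
  header.writeUInt32LE(sampleRate * blockAlign, 28); // Byte rate
  header.writeUInt16LE(blockAlign, 32);
  header.writeUInt16LE(bitsPerSample, 34);
  header.write('data', 36, 'ascii');
  header.writeUInt32LE(pcm.length, 40);              // Size of the sample data
  return Buffer.concat([header, pcm]);
}
```

The result can be written with `fs.writeFileSync('./output.wav', pcmToWav(pcmBuffer))`.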
### Response Object Format
The `zai.audio.tts.create()` method returns a standard **Response** object (not a custom object with an `audio` property). Always use:
```javascript
// ✅ CORRECT
const response = await zai.audio.tts.create({ ... });
const arrayBuffer = await response.arrayBuffer();
const buffer = Buffer.from(new Uint8Array(arrayBuffer));
// ❌ WRONG - This will not work
const response = await zai.audio.tts.create({ ... });
const buffer = Buffer.from(response.audio); // response.audio is undefined
```
### Available Voices
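- `tongtong` - warm and friendly
- `chuichui` - lively and cute
- `xiaochen` - calm and professional
- `jam` - British-accented gentleman
- `kazi` - clear and standard
- `douji` - natural and fluent
- `luodo` - expressive and engaging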
### Speed Range
- Minimum: `0.5` (half speed)
- Default: `1.0` (normal speed)
- Maximum: `2.0` (double speed)
**Important**: Speed values outside the range [0.5, 2.0] will result in API errors.
### Volume Range
- Minimum: Greater than `0` (exclusive)
- Default: `1.0` (normal volume)
- Maximum: `10` (inclusive)
**Note**: Volume parameter is optional. When not specified, defaults to 1.0.
## Remember
- Always use z-ai-web-dev-sdk in backend code only
- **Input text is limited to 1024 characters maximum** - split longer text into chunks
- **Speed must be between 0.5 and 2.0** - values outside this range will cause errors
- **Volume must be greater than 0 and up to 10** - optional parameter with default 1.0
- **Streaming only supports PCM format** - use non-streaming for WAV or MP3 output
- The SDK returns a standard Response object - use `await response.arrayBuffer()`
- Convert ArrayBuffer to Buffer using `Buffer.from(new Uint8Array(arrayBuffer))`
- Handle audio buffers properly when saving to files
- Implement error handling for production applications
- Consider caching for frequently generated content
- Clean up old audio files periodically to manage storage

25
skills/TTS/tts.ts Executable file

@@ -0,0 +1,25 @@
import ZAI from "z-ai-web-dev-sdk";
import fs from "fs";
async function main(text: string, outFile: string) {
try {
const zai = await ZAI.create();
const response = await zai.audio.tts.create({
input: text,
voice: "tongtong",
speed: 1.0,
response_format: "wav",
stream: false,
});
const arrayBuffer = await response.arrayBuffer();
const buffer = Buffer.from(new Uint8Array(arrayBuffer));
fs.writeFileSync(outFile, buffer);
console.log(`TTS audio saved to ${outFile}`);
} catch (err: any) {
console.error("TTS failed:", err?.message || err);
}
}
main("Hello, world!", "./output.wav");

21
skills/VLM/LICENSE.txt Executable file

@@ -0,0 +1,21 @@
MIT License
Copyright (c) 2025 z-ai-web-dev-sdk Skills
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

588
skills/VLM/SKILL.md Executable file

@@ -0,0 +1,588 @@
---
name: VLM
description: Implement vision-based AI chat capabilities using the z-ai-web-dev-sdk. Use this skill when the user needs to analyze images, describe visual content, or create applications that combine image understanding with conversational AI. Supports image URLs and base64 encoded images for multimodal interactions.
license: MIT
---
# VLM (Vision Chat) Skill
This skill guides the implementation of vision chat functionality using the z-ai-web-dev-sdk package, enabling AI models to understand and respond to images combined with text prompts.
## Skills Path
**Skill Location**: `{project_path}/skills/VLM`
This skill is located at the above path in your project.
**Reference Scripts**: Example test scripts are available in the `{Skill Location}/scripts/` directory for quick testing and reference. See `{Skill Location}/scripts/vlm.ts` for a working example.
## Overview
Vision Chat allows you to build applications that can analyze images, extract information from visual content, and answer questions about images through natural language conversation.
**IMPORTANT**: z-ai-web-dev-sdk MUST be used in backend code only. Never use it in client-side code.
## Prerequisites
The z-ai-web-dev-sdk package is already installed. Import it as shown in the examples below.
## CLI Usage (For Simple Tasks)
For simple image analysis tasks, you can use the z-ai CLI instead of writing code. This is ideal for quick image descriptions, testing vision capabilities, or simple automation.
### Basic Image Analysis
```bash
# Describe an image from URL
z-ai vision --prompt "What's in this image?" --image "https://example.com/photo.jpg"
# Using short options
z-ai vision -p "Describe this image" -i "https://example.com/image.png"
```
### Analyze Local Images
```bash
# Analyze a local image file
z-ai vision -p "What objects are in this photo?" -i "./photo.jpg"
# Save response to file
z-ai vision -p "Describe the scene" -i "./landscape.png" -o description.json
```
### Multiple Images
```bash
# Analyze multiple images at once
z-ai vision \
-p "Compare these two images" \
-i "./photo1.jpg" \
-i "./photo2.jpg" \
-o comparison.json
# Multiple images with detailed analysis
z-ai vision \
--prompt "What are the differences between these images?" \
--image "https://example.com/before.jpg" \
--image "https://example.com/after.jpg"
```
### With Thinking (Chain of Thought)
```bash
# Enable thinking for complex visual reasoning
z-ai vision \
-p "Count the number of people in this image and describe their activities" \
-i "./crowd.jpg" \
--thinking \
-o analysis.json
```
### Streaming Output
```bash
# Stream the vision analysis
z-ai vision -p "Describe this image in detail" -i "./photo.jpg" --stream
```
### CLI Parameters
- `--prompt, -p <text>`: **Required** - Question or instruction about the image(s)
- `--image, -i <URL or path>`: Optional - Image URL or local file path (can be used multiple times)
- `--thinking, -t`: Optional - Enable chain-of-thought reasoning (default: disabled)
- `--output, -o <path>`: Optional - Output file path (JSON format)
- `--stream`: Optional - Stream the response in real-time
### Supported Image Formats
- PNG (.png)
- JPEG (.jpg, .jpeg)
- GIF (.gif)
- WebP (.webp)
- BMP (.bmp)
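When a local image is sent from SDK code as base64, the MIME type usually has to be derived from the file extension. The sketch below maps the formats listed above to their standard MIME types and builds a data URL. Both helper names are hypothetical, and the idea that the API accepts a `data:` URL wherever an image URL is expected is an assumption to verify against the SDK.

```javascript
// Standard MIME types for the image formats listed above.
const IMAGE_MIME_TYPES = new Map([
  ['png', 'image/png'],
  ['jpg', 'image/jpeg'],
  ['jpeg', 'image/jpeg'],
  ['gif', 'image/gif'],
  ['webp', 'image/webp'],
  ['bmp', 'image/bmp']
]);

// Hypothetical helper: returns the MIME type for a path or URL, or null when
// the extension is missing or unsupported.
function imageMimeType(pathOrUrl) {
  // Drop any query string or fragment before reading the extension.
  const clean = pathOrUrl.split(/[?#]/)[0];
  const dot = clean.lastIndexOf('.');
  if (dot === -1) return null;
  const extension = clean.slice(dot + 1).toLowerCase();
  return IMAGE_MIME_TYPES.get(extension) ?? null;
}

// Hypothetical helper: builds a base64 data URL from raw image bytes.
function toDataUrl(bytes, mimeType) {
  return `data:${mimeType};base64,${Buffer.from(bytes).toString('base64')}`;
}
```

Rejecting files whose type comes back as `null` avoids sending unsupported formats to the API.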
### When to Use CLI vs SDK
**Use CLI for:**
- Quick image analysis
- Testing vision model capabilities
- One-off image descriptions
- Simple automation scripts
**Use SDK for:**
- Multi-turn conversations with images
- Dynamic image analysis in applications
- Batch processing with custom logic
- Production applications with complex workflows
## Recommended Approach
For better performance and reliability, use base64 encoding to pass images to the model instead of image URLs.
## Supported Content Types
The Vision Chat API supports three types of media content:
### 1. **image_url** - For Image Files
Use this type for static images (PNG, JPEG, GIF, WebP, etc.)
```typescript
{
role: 'user',
content: [
{ type: 'text', text: prompt },
{ type: 'image_url', image_url: { url: imageUrl } }
]
}
```
### 2. **video_url** - For Video Files
Use this type for video content (MP4, AVI, MOV, etc.)
```typescript
{
role: 'user',
content: [
{ type: 'text', text: prompt },
{ type: 'video_url', video_url: { url: videoUrl } }
]
}
```
### 3. **file_url** - For Document Files
Use this type for document files (PDF, DOCX, TXT, etc.)
```typescript
{
role: 'user',
content: [
{ type: 'text', text: prompt },
{ type: 'file_url', file_url: { url: fileUrl } }
]
}
```
**Note**: You can combine multiple content types in a single message. For example, you can include both text and multiple images, or text with both an image and a document.
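As a rough sketch of combining content types, the helper below assembles one user message from a prompt plus any number of image and document URLs. `buildMixedMessage` and its option names are hypothetical and not part of the SDK; only the `text`, `image_url`, and `file_url` part shapes come from the sections above.

```javascript
// Hypothetical helper (not part of z-ai-web-dev-sdk): builds a single user
// message that mixes text, images, and documents.
function buildMixedMessage(prompt, { imageUrls = [], fileUrls = [] } = {}) {
  return {
    role: 'user',
    content: [
      { type: 'text', text: prompt },
      ...imageUrls.map((url) => ({ type: 'image_url', image_url: { url } })),
      ...fileUrls.map((url) => ({ type: 'file_url', file_url: { url } }))
    ]
  };
}

// Example: one image and one PDF in the same message.
const message = buildMixedMessage('Does the chart match the report?', {
  imageUrls: ['https://example.com/chart.png'],
  fileUrls: ['https://example.com/report.pdf']
});
console.log(message.content.map((part) => part.type).join(', '));
```

The resulting object can be placed in the `messages` array of a `createVision` call in the same way as the hand-written messages above.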
## Basic Vision Chat Implementation
### Single Image Analysis
```javascript
import ZAI from 'z-ai-web-dev-sdk';
async function analyzeImage(imageUrl, question) {
const zai = await ZAI.create();
const response = await zai.chat.completions.createVision({
messages: [
{
role: 'user',
content: [
{
type: 'text',
text: question
},
{
type: 'image_url',
image_url: {
url: imageUrl
}
}
]
}
],
thinking: { type: 'disabled' }
});
return response.choices[0]?.message?.content;
}
// Usage
const result = await analyzeImage(
'https://example.com/product.jpg',
'Describe this product in detail'
);
console.log('Analysis:', result);
```
### Multiple Images Analysis
```javascript
import ZAI from 'z-ai-web-dev-sdk';
async function compareImages(imageUrls, question) {
const zai = await ZAI.create();
const content = [
{
type: 'text',
text: question
},
...imageUrls.map(url => ({
type: 'image_url',
image_url: { url }
}))
];
const response = await zai.chat.completions.createVision({
messages: [
{
role: 'user',
content: content
}
],
thinking: { type: 'disabled' }
});
return response.choices[0]?.message?.content;
}
// Usage
const comparison = await compareImages(
[
'https://example.com/before.jpg',
'https://example.com/after.jpg'
],
'Compare these two images and describe the differences'
);
```
### Base64 Image Support
```javascript
import ZAI from 'z-ai-web-dev-sdk';
import fs from 'fs';
async function analyzeLocalImage(imagePath, question) {
const zai = await ZAI.create();
// Read image file and convert to base64
const imageBuffer = fs.readFileSync(imagePath);
const base64Image = imageBuffer.toString('base64');
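// Map the file extension to a MIME type (falls back to JPEG for unknown extensions)
const ext = imagePath.split('.').pop()?.toLowerCase();
const mimeTypes = { png: 'image/png', gif: 'image/gif', webp: 'image/webp', bmp: 'image/bmp' };
const mimeType = mimeTypes[ext] ?? 'image/jpeg';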
const response = await zai.chat.completions.createVision({
messages: [
{
role: 'user',
content: [
{
type: 'text',
text: question
},
{
type: 'image_url',
image_url: {
url: `data:${mimeType};base64,${base64Image}`
}
}
]
}
],
thinking: { type: 'disabled' }
});
return response.choices[0]?.message?.content;
}
```
## Advanced Use Cases
### Conversational Vision Chat
```javascript
import ZAI from 'z-ai-web-dev-sdk';
class VisionChatSession {
constructor() {
this.messages = [];
}
async initialize() {
this.zai = await ZAI.create();
}
async addImage(imageUrl, initialQuestion) {
this.messages.push({
role: 'user',
content: [
{
type: 'text',
text: initialQuestion
},
{
type: 'image_url',
image_url: { url: imageUrl }
}
]
});
return this.getResponse();
}
async followUp(question) {
this.messages.push({
role: 'user',
content: [
{
type: 'text',
text: question
}
]
});
return this.getResponse();
}
async getResponse() {
const response = await this.zai.chat.completions.createVision({
messages: this.messages,
thinking: { type: 'disabled' }
});
const assistantMessage = response.choices[0]?.message?.content;
this.messages.push({
role: 'assistant',
content: assistantMessage
});
return assistantMessage;
}
}
// Usage
const session = new VisionChatSession();
await session.initialize();
const initial = await session.addImage(
'https://example.com/chart.jpg',
'What does this chart show?'
);
console.log('Initial analysis:', initial);
const followup = await session.followUp('What are the key trends?');
console.log('Follow-up:', followup);
```
### Image Classification and Tagging
```javascript
import ZAI from 'z-ai-web-dev-sdk';
async function classifyImage(imageUrl) {
const zai = await ZAI.create();
const prompt = `Analyze this image and provide:
1. Main subject/category
2. Key objects detected
3. Scene description
4. Suggested tags (comma-separated)
Format your response as JSON.`;
const response = await zai.chat.completions.createVision({
messages: [
{
role: 'user',
content: [
{
type: 'text',
text: prompt
},
{
type: 'image_url',
image_url: { url: imageUrl }
}
]
}
],
thinking: { type: 'disabled' }
});
const content = response.choices[0]?.message?.content;
try {
return JSON.parse(content);
} catch (e) {
return { rawResponse: content };
}
}
```
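The `JSON.parse` call above falls back to `rawResponse` whenever the reply is not bare JSON. One possible hardening, assuming the model may wrap its answer in a Markdown code fence (an assumption about model behavior, not something the SDK documents), is to strip the fence before parsing. `parseJsonReply` is a hypothetical helper name.

```javascript
// Three backticks, built at runtime so this snippet can itself sit in a fence.
const FENCE = '`'.repeat(3);

// Hypothetical helper: tolerate replies wrapped in a Markdown code fence.
function parseJsonReply(content) {
  const text = String(content ?? '').trim();
  // Capture the body when the whole reply is a single fenced block.
  const fencedPattern = new RegExp(
    '^' + FENCE + '(?:json)?\\s*([\\s\\S]*?)\\s*' + FENCE + '$',
    'i'
  );
  const fenced = text.match(fencedPattern);
  const candidate = fenced ? fenced[1] : text;
  try {
    return JSON.parse(candidate);
  } catch (e) {
    // Same fallback shape as classifyImage above.
    return { rawResponse: content };
  }
}

console.log(parseJsonReply(FENCE + 'json\n{"tags":["cat","sofa"]}\n' + FENCE));
```

Inside `classifyImage`, the `try`/`catch` block could then be replaced by `return parseJsonReply(content);`.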
### OCR and Text Extraction
```javascript
import ZAI from 'z-ai-web-dev-sdk';
async function extractText(imageUrl) {
const zai = await ZAI.create();
const response = await zai.chat.completions.createVision({
messages: [
{
role: 'user',
content: [
{
type: 'text',
text: 'Extract all text from this image. Preserve the layout and formatting as much as possible.'
},
{
type: 'image_url',
image_url: { url: imageUrl }
}
]
}
],
thinking: { type: 'disabled' }
});
return response.choices[0]?.message?.content;
}
```
## Best Practices
### 1. Image Quality and Size
- Use high-quality images for better analysis results
- Optimize image size to balance quality and processing speed
- Supported formats: PNG, JPEG, GIF, WebP, BMP
### 2. Prompt Engineering
- Be specific about what information you need from the image
- Structure complex requests with numbered lists or bullet points
- Provide context about the image type (photo, diagram, chart, etc.)
### 3. Error Handling
```javascript
async function safeVisionChat(imageUrl, question) {
try {
const zai = await ZAI.create();
const response = await zai.chat.completions.createVision({
messages: [
{
role: 'user',
content: [
{ type: 'text', text: question },
{ type: 'image_url', image_url: { url: imageUrl } }
]
}
],
thinking: { type: 'disabled' }
});
return {
success: true,
content: response.choices[0]?.message?.content
};
} catch (error) {
console.error('Vision chat error:', error);
return {
success: false,
error: error.message
};
}
}
```
### 4. Performance Optimization
- Cache SDK instance creation when processing multiple images
- Use appropriate image formats (JPEG for photos, PNG for diagrams)
- Consider image preprocessing for large batches
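Caching the SDK instance can be sketched as a memoized async factory. The helper below is generic and hypothetical (`createCachedFactory` is not an SDK function); in real code the factory passed in would be `() => ZAI.create()`, replaced here by a stub so the snippet runs on its own.

```javascript
// Hypothetical helper: run an async factory once and reuse its promise.
function createCachedFactory(factory) {
  let cached = null;
  return function getInstance() {
    if (!cached) {
      // Store the promise itself so concurrent callers share one creation.
      cached = new Promise((resolve) => resolve(factory())).catch((err) => {
        cached = null; // allow a retry after a failed creation
        throw err;
      });
    }
    return cached;
  };
}

// Stub standing in for `() => ZAI.create()`.
let creations = 0;
const getZai = createCachedFactory(async () => {
  creations += 1;
  return { id: creations };
});

Promise.all([getZai(), getZai()]).then(([a, b]) => {
  console.log('same instance:', a === b, 'creations:', creations);
});
```

Caching the promise rather than the resolved value means two requests arriving at the same time do not trigger two initializations.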
### 5. Security Considerations
- Validate image URLs before processing
- Sanitize user-provided image data
- Implement rate limiting for public-facing APIs
- Never expose SDK credentials in client-side code
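A minimal sketch of URL validation before calling the vision API, assuming a policy of accepting only `http(s)` URLs and base64 `data:image/...` URLs. The policy and the `isAllowedImageUrl` name are illustrative; a production service would likely also restrict hosts (for example, by blocking private network addresses).

```javascript
// Hypothetical validator: accept http(s) URLs and base64 image data URLs only.
function isAllowedImageUrl(value) {
  if (typeof value !== 'string' || value.length === 0) return false;
  if (/^data:image\/(png|jpeg|gif|webp|bmp);base64,[A-Za-z0-9+/]+=*$/.test(value)) {
    return true;
  }
  try {
    const { protocol } = new URL(value);
    return protocol === 'http:' || protocol === 'https:';
  } catch {
    return false; // not a parseable absolute URL
  }
}

console.log(isAllowedImageUrl('https://example.com/photo.jpg'));
console.log(isAllowedImageUrl('file:///etc/passwd'));
```

A check like this would sit at the top of a request handler, before any SDK call is made.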
## Common Use Cases
1. **Product Analysis**: Analyze product images for e-commerce applications
2. **Document Understanding**: Extract information from receipts, invoices, forms
3. **Medical Imaging**: Assist in preliminary analysis (with appropriate disclaimers)
4. **Quality Control**: Detect defects or anomalies in manufacturing
5. **Content Moderation**: Analyze images for policy compliance
6. **Accessibility**: Generate alt text for images automatically
7. **Visual Search**: Understand and categorize images for search functionality
## Integration Examples
### Express.js API Endpoint
```javascript
import express from 'express';
import ZAI from 'z-ai-web-dev-sdk';
const app = express();
app.use(express.json());
let zaiInstance;
// Initialize SDK once
async function initZAI() {
zaiInstance = await ZAI.create();
}
app.post('/api/analyze-image', async (req, res) => {
try {
const { imageUrl, question } = req.body;
if (!imageUrl || !question) {
return res.status(400).json({
error: 'imageUrl and question are required'
});
}
const response = await zaiInstance.chat.completions.createVision({
messages: [
{
role: 'user',
content: [
{ type: 'text', text: question },
{ type: 'image_url', image_url: { url: imageUrl } }
]
}
],
thinking: { type: 'disabled' }
});
res.json({
success: true,
analysis: response.choices[0]?.message?.content
});
} catch (error) {
res.status(500).json({
success: false,
error: error.message
});
}
});
initZAI().then(() => {
app.listen(3000, () => {
console.log('Vision chat API running on port 3000');
});
});
```
## Troubleshooting
**Issue**: "SDK must be used in backend"
- **Solution**: Ensure z-ai-web-dev-sdk is only imported and used in server-side code
**Issue**: Image not loading or being analyzed
- **Solution**: Verify the image URL is accessible and returns a valid image format
**Issue**: Poor analysis quality
- **Solution**: Provide more specific prompts and ensure image quality is sufficient
**Issue**: Slow response times
- **Solution**: Optimize image size and consider caching frequently analyzed images
## Remember
- Always use z-ai-web-dev-sdk in backend code only
- The SDK is already installed - import as shown in examples
- Structure prompts clearly for best results
- Handle errors gracefully in production applications
- Consider user privacy when processing images

57
skills/VLM/scripts/vlm.ts Executable file

@@ -0,0 +1,57 @@
import ZAI, { VisionMessage } from 'z-ai-web-dev-sdk';
async function main(imageUrl: string, prompt: string) {
try {
const zai = await ZAI.create();
const messages: VisionMessage[] = [
{
role: 'assistant',
content: [
{ type: 'text', text: 'Output only text, no markdown.' }
]
},
{
role: 'user',
content: [
{ type: 'text', text: prompt },
{ type: 'image_url', image_url: { url: imageUrl } }
]
}
];
// const messages: VisionMessage[] = [
// {
// role: 'user',
// content: [
// { type: 'text', text: prompt },
// { type: 'video_url', video_url: { url: imageUrl } }
// ]
// }
// ];
// const messages: VisionMessage[] = [
// {
// role: 'user',
// content: [
// { type: 'text', text: prompt },
// { type: 'file_url', file_url: { url: imageUrl } }
// ]
// }
// ];
const response = await zai.chat.completions.createVision({
model: 'glm-4.6v',
messages,
thinking: { type: 'disabled' }
});
const reply = response.choices?.[0]?.message?.content;
console.log('Vision model reply:');
console.log(reply ?? JSON.stringify(response, null, 2));
} catch (err: any) {
console.error('Vision chat failed:', err?.message || err);
}
}
main("https://cdn.bigmodel.cn/static/logo/register.png", "Please describe this image.");

328
skills/agent-browser/SKILL.md Executable file

@@ -0,0 +1,328 @@
---
name: Agent Browser
description: A fast Rust-based headless browser automation CLI with Node.js fallback that enables AI agents to navigate, click, type, and snapshot pages via structured commands.
read_when:
- Automating web interactions
- Extracting structured data from pages
- Filling forms programmatically
- Testing web UIs
metadata: {"clawdbot":{"emoji":"🌐","requires":{"bins":["node","npm"]}}}
allowed-tools: Bash(agent-browser:*)
---
# Browser Automation with agent-browser
## Installation
### npm (recommended)
```bash
npm install -g agent-browser
agent-browser install # Download the browser binary
agent-browser install --with-deps # Also install system dependencies (typically needed on Linux)
```
### From Source
```bash
git clone https://github.com/vercel-labs/agent-browser
cd agent-browser
pnpm install
pnpm build
agent-browser install
```
## Quick start
```bash
agent-browser open <url> # Navigate to page
agent-browser snapshot -i # Get interactive elements with refs
agent-browser click @e1 # Click element by ref
agent-browser fill @e2 "text" # Fill input by ref
agent-browser close # Close browser
```
## Core workflow
1. Navigate: `agent-browser open <url>`
2. Snapshot: `agent-browser snapshot -i` (returns elements with refs like `@e1`, `@e2`)
3. Interact using refs from the snapshot
4. Re-snapshot after navigation or significant DOM changes
## Commands
### Navigation
```bash
agent-browser open <url> # Navigate to URL
agent-browser back # Go back
agent-browser forward # Go forward
agent-browser reload # Reload page
agent-browser close # Close browser
```
### Snapshot (page analysis)
```bash
agent-browser snapshot # Full accessibility tree
agent-browser snapshot -i # Interactive elements only (recommended)
agent-browser snapshot -c # Compact output
agent-browser snapshot -d 3 # Limit depth to 3
agent-browser snapshot -s "#main" # Scope to CSS selector
```
### Interactions (use @refs from snapshot)
```bash
agent-browser click @e1 # Click
agent-browser dblclick @e1 # Double-click
agent-browser focus @e1 # Focus element
agent-browser fill @e2 "text" # Clear and type
agent-browser type @e2 "text" # Type without clearing
agent-browser press Enter # Press key
agent-browser press Control+a # Key combination
agent-browser keydown Shift # Hold key down
agent-browser keyup Shift # Release key
agent-browser hover @e1 # Hover
agent-browser check @e1 # Check checkbox
agent-browser uncheck @e1 # Uncheck checkbox
agent-browser select @e1 "value" # Select dropdown
agent-browser scroll down 500 # Scroll page
agent-browser scrollintoview @e1 # Scroll element into view
agent-browser drag @e1 @e2 # Drag and drop
agent-browser upload @e1 file.pdf # Upload files
```
### Get information
```bash
agent-browser get text @e1 # Get element text
agent-browser get html @e1 # Get innerHTML
agent-browser get value @e1 # Get input value
agent-browser get attr @e1 href # Get attribute
agent-browser get title # Get page title
agent-browser get url # Get current URL
agent-browser get count ".item" # Count matching elements
agent-browser get box @e1 # Get bounding box
```
### Check state
```bash
agent-browser is visible @e1 # Check if visible
agent-browser is enabled @e1 # Check if enabled
agent-browser is checked @e1 # Check if checked
```
### Screenshots & PDF
```bash
agent-browser screenshot # Screenshot to stdout
agent-browser screenshot path.png # Save to file
agent-browser screenshot --full # Full page
agent-browser pdf output.pdf # Save as PDF
```
### Video recording
```bash
agent-browser record start ./demo.webm # Start recording (uses current URL + state)
agent-browser click @e1 # Perform actions
agent-browser record stop # Stop and save video
agent-browser record restart ./take2.webm # Stop current + start new recording
```
Recording creates a fresh context but preserves cookies/storage from your session. If no URL is provided, it automatically returns to your current page. For smooth demos, explore first, then start recording.
### Wait
```bash
agent-browser wait @e1 # Wait for element
agent-browser wait 2000 # Wait milliseconds
agent-browser wait --text "Success" # Wait for text
agent-browser wait --url "/dashboard" # Wait for URL pattern
agent-browser wait --load networkidle # Wait for network idle
agent-browser wait --fn "window.ready" # Wait for JS condition
```
### Mouse control
```bash
agent-browser mouse move 100 200 # Move mouse
agent-browser mouse down left # Press button
agent-browser mouse up left # Release button
agent-browser mouse wheel 100 # Scroll wheel
```
### Semantic locators (alternative to refs)
```bash
agent-browser find role button click --name "Submit"
agent-browser find text "Sign In" click
agent-browser find label "Email" fill "user@test.com"
agent-browser find first ".item" click
agent-browser find nth 2 "a" text
```
### Browser settings
```bash
agent-browser set viewport 1920 1080 # Set viewport size
agent-browser set device "iPhone 14" # Emulate device
agent-browser set geo 37.7749 -122.4194 # Set geolocation
agent-browser set offline on # Toggle offline mode
agent-browser set headers '{"X-Key":"v"}' # Extra HTTP headers
agent-browser set credentials user pass # HTTP basic auth
agent-browser set media dark # Emulate color scheme
```
### Cookies & Storage
```bash
agent-browser cookies # Get all cookies
agent-browser cookies set name value # Set cookie
agent-browser cookies clear # Clear cookies
agent-browser storage local # Get all localStorage
agent-browser storage local key # Get specific key
agent-browser storage local set k v # Set value
agent-browser storage local clear # Clear all
```
### Network
```bash
agent-browser network route <url> # Intercept requests
agent-browser network route <url> --abort # Block requests
agent-browser network route <url> --body '{}' # Mock response
agent-browser network unroute [url] # Remove routes
agent-browser network requests # View tracked requests
agent-browser network requests --filter api # Filter requests
```
### Tabs & Windows
```bash
agent-browser tab # List tabs
agent-browser tab new [url] # New tab
agent-browser tab 2 # Switch to tab
agent-browser tab close # Close tab
agent-browser window new # New window
```
### Frames
```bash
agent-browser frame "#iframe" # Switch to iframe
agent-browser frame main # Back to main frame
```
### Dialogs
```bash
agent-browser dialog accept [text] # Accept dialog
agent-browser dialog dismiss # Dismiss dialog
```
### JavaScript
```bash
agent-browser eval "document.title" # Run JavaScript
```
### State management
```bash
agent-browser state save auth.json # Save session state
agent-browser state load auth.json # Load saved state
```
## Example: Form submission
```bash
agent-browser open https://example.com/form
agent-browser snapshot -i
# Output shows: textbox "Email" [ref=e1], textbox "Password" [ref=e2], button "Submit" [ref=e3]
agent-browser fill @e1 "user@example.com"
agent-browser fill @e2 "password123"
agent-browser click @e3
agent-browser wait --load networkidle
agent-browser snapshot -i # Check result
```
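When driving the CLI from a script, the refs have to be pulled out of the snapshot text. The sketch below assumes each interactive element appears as `role "name" [ref=eN]`, as in the comment in the example above; the real snapshot output may differ, and `parseSnapshotRefs` is a hypothetical helper, not part of agent-browser.

```javascript
// Hypothetical parser for snapshot text shaped like: textbox "Email" [ref=e1]
function parseSnapshotRefs(snapshotText) {
  const pattern = /(\w+)\s+"([^"]*)"\s+\[ref=(e\d+)\]/g;
  const refs = [];
  for (const match of snapshotText.matchAll(pattern)) {
    const [, role, name, ref] = match;
    refs.push({ role, name, ref: '@' + ref }); // '@e1' is the form the CLI commands take
  }
  return refs;
}

const sample = 'textbox "Email" [ref=e1], textbox "Password" [ref=e2], button "Submit" [ref=e3]';
const submit = parseSnapshotRefs(sample).find((el) => el.role === 'button');
console.log(submit.ref);
```

Looking elements up by role and name rather than hard-coding `@e3` keeps a script working when refs change after navigation.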
## Example: Authentication with saved state
```bash
# Login once
agent-browser open https://app.example.com/login
agent-browser snapshot -i
agent-browser fill @e1 "username"
agent-browser fill @e2 "password"
agent-browser click @e3
agent-browser wait --url "/dashboard"
agent-browser state save auth.json
# Later sessions: load saved state
agent-browser state load auth.json
agent-browser open https://app.example.com/dashboard
```
## Sessions (parallel browsers)
```bash
agent-browser --session test1 open site-a.com
agent-browser --session test2 open site-b.com
agent-browser session list
```
## JSON output (for parsing)
Add `--json` for machine-readable output:
```bash
agent-browser snapshot -i --json
agent-browser get text @e1 --json
```
## Debugging
```bash
agent-browser open example.com --headed # Show browser window
agent-browser console # View console messages
agent-browser console --clear # Clear console
agent-browser errors # View page errors
agent-browser errors --clear # Clear errors
agent-browser highlight @e1 # Highlight element
agent-browser trace start # Start recording trace
agent-browser trace stop trace.zip # Stop and save trace
agent-browser record start ./debug.webm # Record from current page
agent-browser record stop # Save recording
agent-browser --cdp 9222 snapshot # Connect via CDP
```
## Troubleshooting
- If the command is not found on Linux ARM64, use the full path in the bin folder.
- If an element is not found, run `snapshot` again to find the correct ref.
- If the page is not loaded, add a `wait` command after navigation.
- Use `--headed` to see the browser window for debugging.
## Options
- `--session <name>` uses an isolated session.
- `--json` provides JSON output.
- `--full` takes a full page screenshot.
- `--headed` shows the browser window.
- `--timeout` sets the command timeout in milliseconds.
- `--cdp <port>` connects via Chrome DevTools Protocol.
## Notes
- Refs are stable per page load but change on navigation.
- Always snapshot after navigation to get new refs.
- Use `fill` instead of `type` for input fields to ensure existing text is cleared.
## Reporting Issues
- Skill issues: Open an issue at https://github.com/TheSethRose/Agent-Browser-CLI
- agent-browser CLI issues: Open an issue at https://github.com/vercel-labs/agent-browser


@@ -0,0 +1,157 @@
---
name: ai-news-collector
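description: AI news aggregation and popularity-ranking tool. Triggered when the user asks about the latest developments in AI, e.g. "What AI news is there today?", "Summarize this week's AI developments", "Which AI products have been hot lately?", "What has the AI community been discussing recently?". Covers new product launches, research papers, industry news, funding news, open-source project updates, viral community phenomena, and trending AI tool/Agent projects. Outputs a list of summaries in Chinese, sorted by popularity, with links to the original articles.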
---
# AI News Collector
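Collects, aggregates, and ranks AI news by popularity.
## Core Principle
**Do not just search for "AI news today".** Broad searches return SEO aggregation pages and trend-forecast articles, and they systematically miss viral, community-level phenomena (such as an open-source tool blowing up or a meme-level event). A multi-dimensional, layered search strategy is required.
## Workflow
### 1. Multi-dimensional layered search (at least 8 searches, 10-12 recommended)
Run searches across the following **6 dimensions** in order, with at least one search per dimension:
#### Dimension A: Weekly digests / newsletter roundups (top priority 🔑)
This is the most information-dense source; a single article can cover 10+ news items.
```
Search terms:
- "last week in AI" [current month and year]
- "AI weekly roundup" [current month and year]
- "the batch AI newsletter"
- site:substack.com AI news [current month]
```
After finding a newsletter, use web_fetch to retrieve the full text and extract every news lead from it.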
#### Dimension B: Community buzz / virality (key dimension 🔑)
Captures bottom-up community hits, which broad searches almost never reach.
```
Search terms:
- "viral AI tool" OR "viral AI agent"
- "AI trending" site:reddit.com OR site:news.ycombinator.com
- "GitHub trending AI" OR "AI open source trending"
- AI buzzing OR "everyone is talking about" AI
- "most popular AI" this week
```
#### Dimension C: Product launches and model updates
```
Search terms:
- "AI model release" OR "LLM launch" [current month]
- "AI product launch" [current month and year]
- OpenAI OR Anthropic OR Google OR Meta AI announcement
- "大模型 发布" OR "AI 新产品"
```
#### Dimension D: Funding and business
```
Search terms:
- "AI startup funding" [current month and year]
- "AI acquisition" OR "AI IPO"
- "AI 融资" OR "人工智能投资"
```
#### Dimension E: Research breakthroughs
```
Search terms:
- "AI breakthrough" OR "AI paper" [current month]
- "state of the art" machine learning
- "AI 论文" OR "机器学习突破"
```
#### Dimension F: Regulation and policy
```
Search terms:
- "AI regulation" OR "AI policy" [current month and year]
- "AI law" OR "AI governance"
- "AI 监管" OR "人工智能法案"
```
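### 2. Cross-validation and gap filling
After the first round of searches, check for gaps:
- If a newsletter mentions a project or event that the first round did not cover → run a dedicated search for it
- If the same event is mentioned by 3+ different sources → it is very likely a hot topic; search deeper for more detail
- If the hot topics in Chinese-language and English-language media are completely different → cover both
### 3. Search keyword design principles (anti-pattern checklist)
| ❌ Don't search like this | ✅ Search like this instead | Reason |
|---|---|---|
| "AI news today February 2026" | "AI weekly roundup February 2026" | The former returns aggregation pages; the latter returns curated content |
| "AI news today" | Search "viral AI tool" and "AI model release" separately | Broad searches cannot cover community phenomena |
| "artificial intelligence breaking news" | Search by dimension | Too broad; returns noise |
| Including a specific year, month, and day in the query | Use "this week", "today", "latest" | Exact dates actually bias results toward forecast/outlook articles |
| Starting to write after only 3 searches | At least 8 searches covering all 6 dimensions | 3 searches give less than 30% coverage |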
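The minimum-coverage rule (at least 8 searches, with every one of the 6 dimensions touched) can be checked mechanically. A minimal sketch, assuming each executed search is logged with the dimension letter it belongs to; the log shape and the `checkCoverage` name are hypothetical.

```javascript
const DIMENSIONS = ['A', 'B', 'C', 'D', 'E', 'F'];
const MIN_SEARCHES = 8;

// Hypothetical check: searchLog is an array like [{ dimension: 'A', query: '...' }].
function checkCoverage(searchLog) {
  const covered = new Set(searchLog.map((s) => s.dimension));
  const missing = DIMENSIONS.filter((d) => !covered.has(d));
  return {
    total: searchLog.length,
    missing,
    ready: searchLog.length >= MIN_SEARCHES && missing.length === 0
  };
}

const log = [
  { dimension: 'A', query: '"AI weekly roundup"' },
  { dimension: 'B', query: '"viral AI tool"' },
  { dimension: 'C', query: '"AI model release"' }
];
console.log(checkCoverage(log));
```

The `missing` list tells the agent which dimensions still need a search before it starts writing.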
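### 4. Overall popularity assessment
Rate each news item's popularity (1-5 stars) based on the following signals:
| Signal | Weight | Notes |
|------|------|------|
| Multiple outlets report the same event | ⭐⭐⭐ High | 3+ sources = confirmed hot topic |
| Evidence of community virality | ⭐⭐⭐ High | Surging GitHub stars, Twitter flooded with posts, Hacker News front page |
| From an authoritative source (top conference, official big-tech announcement) | ⭐⭐⭐ High | But note that big-tech PR does not equal a genuine hot topic |
| First-hand user experience reports | ⭐⭐ Medium | People actually using it > merely released |
| Technical breakthrough / scope of impact | ⭐⭐ Medium | |
| Controversy (safety or ethics debates) | ⭐⭐ Medium | Controversy often signals large impact |
| Timeliness (newer is hotter) | ⭐ Medium-low | Secondary sorting factor |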
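The table gives qualitative weights only. As one possible way to turn it into a 1-5 star rating, the sketch below assigns illustrative numeric weights (high = 3, medium = 2, medium-low = 1) and scales the matched total linearly. The numbers, the signal keys, and the rounding rule are all assumptions, not part of the skill.

```javascript
// Illustrative weights mirroring the table: high = 3, medium = 2, medium-low = 1.
const SIGNAL_WEIGHTS = {
  multiSource: 3,   // same event reported by several outlets
  viral: 3,         // evidence of community virality
  authoritative: 3, // top venue or official announcement
  userReports: 2,   // real user experience shared
  impact: 2,        // technical breakthrough / scope of impact
  controversy: 2,   // safety or ethics debate
  recency: 1        // freshness
};
const MAX_SCORE = Object.values(SIGNAL_WEIGHTS).reduce((a, b) => a + b, 0);

// signals: array of keys from SIGNAL_WEIGHTS that apply to the news item.
function heatStars(signals) {
  const score = [...new Set(signals)]
    .reduce((sum, key) => sum + (SIGNAL_WEIGHTS[key] ?? 0), 0);
  // Scale 0..MAX_SCORE onto 1..5 stars.
  return Math.max(1, Math.min(5, Math.round(1 + (4 * score) / MAX_SCORE)));
}

console.log(heatStars(['multiSource', 'viral', 'recency']));
```

A linear scale is the simplest choice; a different mapping (for example, letting any single high-weight signal guarantee at least three stars) may match editorial judgment better.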
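### 5. Output format
List **15-25** news items in descending order of popularity:
```
## 🔥 AI News Digest (YYYY-MM-DD)
### ⭐⭐⭐⭐⭐ Hottest
1. **[News headline]**
> One-sentence summary (no more than 50 characters)
> 🔗 [Source name](URL)
### ⭐⭐⭐⭐ Very hot
2. ...
### ⭐⭐⭐ Moderately hot
...
---
📊 Collected XX news items | XX searches | Dimensions covered: A/B/C/D/E/F | Updated: HH:MM
```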
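### 6. Deduplication and merging
- When several outlets report the same event, merge them into one item and pick the most authoritative or detailed source
- Note "reported by multiple outlets" in the summary to reflect popularity
- Treat renamed projects as the same event (e.g. Clawdbot → Moltbot → OpenClaw)
## Recommended news sources
See [references/sources.md](references/sources.md).
## Notes
- Prefer HTTPS links
- Mark paywalled or inaccessible content as "subscription required"
- Stay objective; do not add subjective commentary on the news
- Do not start writing output until at least 8 searches have been run
- If a dimension returns no results, search again with different keywords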


@@ -0,0 +1,6 @@
{
"ownerId": "kn7fr165ff9vkkwsqyqrq2nwas80t4ev",
"slug": "ai-news-collectors",
"version": "1.0.0",
"publishedAt": 1770615394344
}


@@ -0,0 +1,128 @@
# Recommended AI News Sources
## Newsletters / weekly digests (highest information density 🔑)
| Source | URL | Notes |
|------|------|------|
| Last Week in AI | lastweekin.ai / medium.com/last-week-in-ai | The most comprehensive weekly AI news roundup |
| The Batch (Andrew Ng) | deeplearning.ai/the-batch | Authoritative weekly newsletter |
| Import AI (Jack Clark) | importai.net | Weekly AI newsletter by an Anthropic co-founder |
| Platformer (Casey Newton) | platformer.news | In-depth tech analysis, including AI |
| The Neuron | theneurondaily.com | Daily AI briefing |
| Ben's Bites | bensbites.com | Focused on AI products and tools |
| AI Tidbits | aitidbits.substack.com | Curated AI updates |
| TLDR AI | tldr.tech/ai | Daily AI briefing |
| Interconnects | interconnects.ai | In-depth technical analysis |
## Key Substacks / independent blogs
| Source | URL | Notes |
|------|------|------|
| Gary Marcus | garymarcus.substack.com | Critical analysis of AI; often first to cover safety and controversy topics |
| Simon Willison | simonwillison.net | LLM security and tool ecosystem; first-hand coverage of community phenomena |
| Ethan Mollick | oneusefulthing.substack.com | Wharton professor; insights on applying AI |
| Lenny's Newsletter | lennysnewsletter.com | AI products / growth |
| Understanding AI | understandingai.org | Trend analysis and forecasts |
| Nathan Labenz (Cognitive Revolution) | cognitiverevolution.substack.com | In-depth AI interviews |
## International mainstream media
| Source | URL | Focus |
|------|------|------|
| TechCrunch AI | techcrunch.com/category/artificial-intelligence | Products, funding |
| The Verge AI | theverge.com/ai-artificial-intelligence | Products, industry |
| Ars Technica | arstechnica.com/ai | In-depth analysis |
| VentureBeat AI | venturebeat.com/ai | Enterprise AI |
| MIT Tech Review | technologyreview.com/artificial-intelligence | Research, trends |
| Wired AI | wired.com/tag/artificial-intelligence | Industry impact |
| CNBC Tech | cnbc.com/technology | Intersection of business and tech |
| Scientific American | scientificamerican.com | AI from a scientific perspective |
| The Information | theinformation.com | In-depth reporting (paid) |
## Chinese media
| Source | URL | Focus |
|------|------|------|
| 机器之心 (Synced) | jiqizhixin.com | Technology, papers |
| 量子位 (QbitAI) | qbitai.com | Products, industry |
| 36氪 AI (36Kr) | 36kr.com/information/AI | Funding, products |
| InfoQ AI | infoq.cn/topic/AI | Engineering practice |
| 新智元 (Xinzhiyuan) | xinzhiyuan.com | Industry news |
| AI 科技评论 (AI Technology Review) | leiphone.com/category/ai | Technology, products |
## Communities and forums (for catching virality 🔑)
| Source | URL | Notes |
|------|------|------|
| Hacker News | news.ycombinator.com | Gauge of technical discussion; where open-source projects debut |
| Reddit r/MachineLearning | reddit.com/r/MachineLearning | Academic frontier |
| Reddit r/artificial | reddit.com/r/artificial | General discussion |
| Reddit r/LocalLLaMA | reddit.com/r/LocalLLaMA | Local models and open-source tool buzz |
| Reddit r/singularity | reddit.com/r/singularity | Hot topics in the AI community |
| Twitter/X | twitter.com | Real-time updates; where viral content first appears |
| 即刻 (Jike) AI circle | okjike.com | Chinese community discussion |
| V2EX | v2ex.com | Developer perspective |
| 知乎 (Zhihu) | zhihu.com | In-depth technical discussion |
## Security and critical perspectives
| Source | URL | Notes |
|------|------|------|
| Palo Alto Networks Blog | paloaltonetworks.com/blog | AI security alerts |
| 1Password Blog | 1password.com/blog | Agent security analysis |
| Trail of Bits | blog.trailofbits.com | AI security research |
## Academia and research
| Source | URL | Notes |
|------|------|------|
| arXiv CS.AI | arxiv.org/list/cs.AI/recent | Latest papers |
| arXiv CS.LG | arxiv.org/list/cs.LG/recent | Machine learning papers |
| Papers With Code | paperswithcode.com | Papers + code |
| Google AI Blog | ai.googleblog.com | Google research |
| OpenAI Blog | openai.com/blog | OpenAI updates |
| Anthropic News | anthropic.com/news | Anthropic updates |
| DeepMind Blog | deepmind.com/blog | DeepMind research |
| Meta AI | ai.meta.com/blog | Meta research |
| Hugging Face Blog | huggingface.co/blog | Open-source ecosystem |
## Open-source project tracking
| Source | URL | Notes |
|------|------|------|
| GitHub Trending | github.com/trending | Trending projects; must check |
| Product Hunt AI | producthunt.com/topics/artificial-intelligence | New product launches |
| Awesome LLM | github.com/Hannibal046/Awesome-LLM | LLM resource collection |
| Hugging Face Models | huggingface.co/models | New model releases |
## Search keyword matrix
Recommended search terms for each dimension:
**Newsletters / weekly digests**
- `"last week in AI" [month year]`
- `"AI weekly roundup" [month year]`
- `site:substack.com AI news [month]`
**Community virality**
- `"viral AI tool" OR "viral AI agent"`
- `"AI trending" site:reddit.com`
- `"GitHub trending AI"`
- `AI "everyone is talking about"`
**Product launches**
- `"AI model release" OR "LLM launch" [month]`
- `OpenAI OR Anthropic OR Google announcement`
- `"大模型发布" OR "AI 新产品"`
**Research breakthroughs**
- `"AI breakthrough" OR "state of the art" [month]`
- `"AI paper" OR "machine learning research"`
**Funding and business**
- `"AI startup funding" [month year]`
- `"AI acquisition" OR "AI IPO"`
**Regulation and policy**
- `"AI regulation" OR "AI policy" [month year]`
- `"AI law" OR "AI 监管"`


@@ -0,0 +1,157 @@
---
name: aminer-academic-search
description: >
ACADEMIC PRIORITY: Activate whenever the user's query involves academic,
scholarly, or research-related topics — papers, citations, scholars,
institutions, venues, patents, research trends, or any "who published what /
where / when" question. Takes precedence over general web search for academic
data needs. Routes through the z-ai gateway's `/v1/functions/invoke` endpoint
to the AMiner Open Platform (27 APIs, 5 workflows).
---
# aminer-academic-search
Wraps 27 AMiner Open Platform APIs through the local gateway. The gateway owns
the `AMINER_API_KEY`; the skill only needs a `.z-ai-config` file to reach the
gateway.
## When to activate
Any academic/scholarly query — papers, citations, scholars, institutions,
venues, patents. Covers:
- Scholar full profile (bio, education, honors, papers, patents, projects)
- Paper deep dive (full abstract, keywords, authors, citation chain)
- Multi-condition or semantic paper search
- Institution research capability analysis
- Venue annual monitoring
- Patent deep details (IPC/CPC, assignee, claims)
## Config
The script looks for `.z-ai-config` in the following locations, in order, and uses the first one it finds:
1. `./.z-ai-config`
2. `~/.z-ai-config`
3. `/etc/.z-ai-config`
Required fields: `baseUrl`, `apiKey`, `token` (JWT). Optional: `chatId`,
`userId`.
## Invocation
```bash
python3 "{baseDir}/scripts/aminer.py" <ACTION> [--flag value …]
```
The script POSTs to `${baseUrl}/functions/invoke` with
`Authorization: Bearer ${apiKey}`, `X-Z-AI-From: Z`, `X-Token: ${token}`,
and prints the upstream JSON result (pretty-formatted).
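For orientation, the request described above can be sketched as follows. This is an illustration in JavaScript rather than the script's actual Python code; `buildInvokeRequest` is a hypothetical name, the `Content-Type` header is an assumption, and the request body is left out because its shape is not specified here.

```javascript
// Hypothetical helper: derive the URL and headers from a parsed .z-ai-config object.
function buildInvokeRequest(config) {
  for (const field of ['baseUrl', 'apiKey', 'token']) {
    if (!config[field]) throw new Error(`.z-ai-config is missing '${field}'`);
  }
  return {
    method: 'POST',
    url: `${config.baseUrl.replace(/\/+$/, '')}/functions/invoke`,
    headers: {
      'Authorization': `Bearer ${config.apiKey}`,
      'X-Z-AI-From': 'Z',
      'X-Token': config.token,
      'Content-Type': 'application/json' // assumed; not stated in this document
    }
  };
}

const invokeRequest = buildInvokeRequest({
  baseUrl: 'http://localhost:8080/v1', // placeholder values
  apiKey: 'example-key',
  token: 'example-jwt'
});
console.log(invokeRequest.url);
```

Checking the required fields up front reports a missing `token` before any network call is made.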
Run `python3 scripts/aminer.py --help` to list actions, or
`python3 scripts/aminer.py <ACTION> --help` for per-action flags.
### Examples
```bash
# Locate paper_id by title
python3 scripts/aminer.py paper_search --title "Attention is all you need"
# Locate scholar by name + org
python3 scripts/aminer.py person_search --name "Jie Tang" --org "Tsinghua"
# Full paper details
python3 scripts/aminer.py paper_detail --id 57a4e91aac44365e35c98a1e
# Natural-language Q&A search
python3 scripts/aminer.py paper_qa_search --query "retrieval-augmented generation" --size 5
# Venue papers for a given year (body with JSON list param)
python3 scripts/aminer.py venue_paper_relation --id "53e9ba63b7602d9702f08c5d" --year 2024 --limit 10
```
JSON-typed flags (`--title`, `--ids`, `--keywords`, `--year` on qa_search, etc.)
expect a JSON-encoded value, e.g. `--ids '["abc","def"]'`.
## 27 APIs (quick reference)
| Action | Method | Purpose |
|---|---|---|
| paper_search | GET | Locate paper_id by title |
| paper_search_pro | GET | Multi-condition paper search |
| paper_qa_search | POST | Natural-language / topic Q&A |
| paper_info | POST | Batch paper info by IDs |
| paper_detail | GET | Full paper details |
| paper_relation | GET | Citation chain |
| paper_list_by_keywords | GET | Batch thematic retrieval |
| paper_detail_by_condition | GET | Year + venue dimension |
| person_search | POST | Locate person_id |
| person_detail | GET | Scholar bio/education/honors |
| person_figure | GET | Interests + work history |
| person_paper_relation | GET | Scholar's papers |
| person_patent_relation | GET | Scholar's patents |
| person_project | GET | Funded projects |
| org_search | POST | Locate org by name |
| org_detail | POST | Org description/type |
| org_person_relation | GET | Affiliated scholars |
| org_paper_relation | GET | Org papers |
| org_patent_relation | GET | Org patents |
| org_disambiguate | POST | Normalize org string |
| org_disambiguate_pro | POST | Extract org IDs |
| venue_search | POST | Locate venue_id |
| venue_detail | POST | ISSN/type/abbreviation |
| venue_paper_relation | POST | Papers by venue (+ year filter) |
| patent_search | POST | Patent keyword search |
| patent_info | GET | Basic patent info |
| patent_detail | GET | Full patent details (IPC/claims) |
## 5 Workflows (orchestrate via multiple calls)
### Scholar Profile
```
person_search → person_detail
→ person_figure
→ person_paper_relation
→ person_patent_relation
→ person_project
```
### Paper Deep Dive
```
paper_search → paper_detail → paper_relation → paper_info
```
If `paper_search` is empty, fall back to `paper_search_pro`.
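The fallback can be sketched like this. `invoke(action, args)` is a hypothetical async function standing in for however the script is called (for example, spawning `aminer.py` and parsing its output), and the assumption that results arrive as an array under `data` is illustrative only.

```javascript
// Hypothetical orchestration: try paper_search first, then paper_search_pro.
async function findPaper(invoke, title) {
  const first = await invoke('paper_search', { title });
  if (Array.isArray(first?.data) && first.data.length > 0) {
    return { action: 'paper_search', hits: first.data };
  }
  const second = await invoke('paper_search_pro', { title });
  return { action: 'paper_search_pro', hits: second?.data ?? [] };
}

// Stub invoke: the plain search finds nothing, the pro search finds one paper.
const stubInvoke = async (action) =>
  action === 'paper_search' ? { data: [] } : { data: [{ id: 'example-id' }] };

findPaper(stubInvoke, 'Attention is all you need').then((result) => {
  console.log(result.action, result.hits.length);
});
```

Returning the action name alongside the hits records which search produced the result, which helps when debugging an empty answer.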
### Org Analysis
```
org_disambiguate_pro → org_detail
→ org_person_relation
→ org_paper_relation
→ org_patent_relation
```
If `org_disambiguate_pro` returns no ID, fall back to `org_search`.
### Venue Papers
```
venue_search → venue_detail (optional) → venue_paper_relation
```
### Patent Analysis
```
patent_search → patent_info / patent_detail
```
## Error handling
- `warning: .z-ai-config has no 'token' field` → add a `token` field to the config.
- `http 401 … missing X-Token header (hint: …)` → same cause; add `token` to the config.
- `http 403` → gateway auth rejected; check `apiKey` / `X-Z-AI-From`.
- `gateway error: aminer_* failed: http 4xx …` → upstream refused; check API
key on the gateway side (`AMINER_API_KEY` env var) or parameter shape.
## Entity URL templates
- Paper: `https://www.aminer.cn/pub/{paper_id}`
- Scholar: `https://www.aminer.cn/profile/{scholar_id}`
- Patent: `https://www.aminer.cn/patent/{patent_id}`
- Journal: `https://www.aminer.cn/open/journal/detail/{journal_id}`
Always append the relevant URL when presenting entities to the user.
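A small sketch of filling in these templates when formatting results. The `entityUrl` helper and the `kind` keys are hypothetical; the URL patterns are the ones listed above.

```javascript
// URL templates copied from the list above.
const ENTITY_URL_TEMPLATES = {
  paper: 'https://www.aminer.cn/pub/{id}',
  scholar: 'https://www.aminer.cn/profile/{id}',
  patent: 'https://www.aminer.cn/patent/{id}',
  journal: 'https://www.aminer.cn/open/journal/detail/{id}'
};

// Hypothetical helper: fill in the template for one entity.
function entityUrl(kind, id) {
  const template = ENTITY_URL_TEMPLATES[kind];
  if (!template) throw new Error(`Unknown entity kind: ${kind}`);
  return template.replace('{id}', encodeURIComponent(id));
}

console.log(entityUrl('paper', '57a4e91aac44365e35c98a1e'));
```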


@@ -0,0 +1,391 @@
#!/usr/bin/env python3
"""Dispatcher for the 27 AMiner Open Platform functions exposed via the
z-ai gateway's /v1/functions/invoke endpoint.
Each action maps 1:1 to an `aminer_*` function registered in the gateway.
Arguments are collected via argparse and forwarded as the `arguments` field.
Upstream response is printed as formatted JSON.
"""
from __future__ import annotations
import argparse
import json
import pathlib
import sys
import urllib.error
import urllib.request
from typing import Any, Callable, Tuple
REQUEST_TIMEOUT = 60
CONFIG_PATHS = [
    pathlib.Path.cwd() / ".z-ai-config",
    pathlib.Path.home() / ".z-ai-config",
    pathlib.Path("/etc/.z-ai-config"),
]
# ---------------------------------------------------------------------------
# Action registry
#
# Each entry defines one AMiner Open Platform API:
# function_name: matches the key in handlers/aminer_open.go's AminerOpenSpecs
# args: (flag, dest, kind, help) — kind is "str"/"int"/"float"/
# "bool"/"json" (parsed from JSON for list/dict/nested values)
#
# This list is the *single source of truth* for what params each action takes,
# so the model only has to read one structure to learn the whole surface.
# ---------------------------------------------------------------------------
_Arg = Tuple[str, str, str, str] # (flag, dest, kind, help)
ACTIONS: dict[str, dict[str, Any]] = {
# ── Paper ────────────────────────────────────────────────────────────
"paper_search": {
"function": "aminer_paper_search",
"help": "Locate a paper_id by (partial) title.",
"args": [
("--title", "title", "str", "Paper title keyword (required)"),
("--page", "page", "int", "Page number, default 1"),
("--size", "size", "int", "Page size, default 10"),
],
},
"paper_search_pro": {
"function": "aminer_paper_search_pro",
"help": "Multi-condition paper search (author/org/venue/keyword).",
"args": [
("--title", "title", "str", "Title keyword"),
("--keyword", "keyword", "str", "Subject keyword"),
("--abstract", "abstract", "str", "Abstract keyword"),
("--author", "author", "str", "Author name"),
("--org", "org", "str", "Organization name"),
("--venue", "venue", "str", "Venue name"),
("--order", "order", "str", "Sort: citation | year"),
("--page", "page", "int", "Page number, default 0"),
("--size", "size", "int", "Page size, default 10"),
],
},
"paper_qa_search": {
"function": "aminer_paper_qa_search",
"help": "AI Q&A-style search. 'query' and 'topic_*' are mutually exclusive.",
"args": [
("--query", "query", "str", "Natural language question"),
("--use-topic", "use_topic", "bool", "Use topic_* fields instead of query"),
("--topic-high", "topic_high", "str", "High-level topic"),
("--topic-middle", "topic_middle", "str", "Mid-level topic"),
("--topic-low", "topic_low", "str", "Low-level topic"),
("--title", "title", "json", "List of titles (JSON array)"),
("--doi", "doi", "str", "DOI"),
("--year", "year", "json", "Year filter, e.g. [2020,2024]"),
("--sci-flag", "sci_flag", "bool", "Restrict to SCI venues"),
("--n-citation-flag", "n_citation_flag", "bool", "Return citation count"),
("--force-citation-sort", "force_citation_sort", "bool", "Sort by citations"),
("--force-year-sort", "force_year_sort", "bool", "Sort by year"),
("--author-terms", "author_terms", "json", "Author name list"),
("--org-terms", "org_terms", "json", "Org name list"),
("--author-id", "author_id", "json", "Author ID list"),
("--org-id", "org_id", "json", "Org ID list"),
("--venue-ids", "venue_ids", "json", "Venue ID list"),
("--size", "size", "int", "Page size, default 10"),
("--offset", "offset", "int", "Offset, default 0"),
],
},
"paper_info": {
"function": "aminer_paper_info",
"help": "Batch retrieve papers by ID list.",
"args": [
("--ids", "ids", "json", "JSON array of paper_id strings (required)"),
],
},
"paper_detail": {
"function": "aminer_paper_detail",
"help": "Full paper info for a single paper_id.",
"args": [
("--id", "id", "str", "paper_id (required)"),
],
},
"paper_relation": {
"function": "aminer_paper_relation",
"help": "Citation chain (cited papers) for a paper_id.",
"args": [
("--id", "id", "str", "paper_id (required)"),
],
},
"paper_list_by_keywords": {
"function": "aminer_paper_list_by_keywords",
"help": "Batch keyword retrieval returning abstracts + metadata.",
"args": [
("--keywords", "keywords", "json", "JSON array of keyword strings (required)"),
("--page", "page", "int", "Page number, default 0"),
("--size", "size", "int", "Page size, default 10"),
],
},
"paper_detail_by_condition": {
"function": "aminer_paper_detail_by_condition",
"help": "Year + venue dimension lookup. Year + venue_id both required.",
"args": [
("--year", "year", "int", "Year (required)"),
("--venue-id", "venue_id", "str", "Venue ID (required)"),
],
},
# ── Scholar ──────────────────────────────────────────────────────────
"person_search": {
"function": "aminer_person_search",
"help": "Search scholars by name and/or org.",
"args": [
("--name", "name", "str", "Scholar name"),
("--org", "org", "str", "Organization name"),
("--org-id", "org_id", "json", "List of org IDs"),
("--offset", "offset", "int", "Offset, default 0"),
("--size", "size", "int", "Page size, default 5"),
],
},
"person_detail": {
"function": "aminer_person_detail",
"help": "Full scholar profile (bio/education/honors).",
"args": [("--id", "id", "str", "person_id (required)")],
},
"person_figure": {
"function": "aminer_person_figure",
"help": "Scholar portrait (interests, work history).",
"args": [("--id", "id", "str", "person_id (required)")],
},
"person_paper_relation": {
"function": "aminer_person_paper_relation",
"help": "List of papers by this scholar.",
"args": [("--id", "id", "str", "person_id (required)")],
},
"person_patent_relation": {
"function": "aminer_person_patent_relation",
"help": "List of patents by this scholar.",
"args": [("--id", "id", "str", "person_id (required)")],
},
"person_project": {
"function": "aminer_person_project",
"help": "Research projects (funding, dates, source).",
"args": [("--id", "id", "str", "person_id (required)")],
},
# ── Organization ─────────────────────────────────────────────────────
"org_search": {
"function": "aminer_org_search",
"help": "Search institutions by name keyword.",
"args": [("--orgs", "orgs", "json", "JSON array of org name strings (required)")],
},
"org_detail": {
"function": "aminer_org_detail",
"help": "Org details by ID list.",
"args": [("--ids", "ids", "json", "JSON array of org_id strings (required)")],
},
"org_person_relation": {
"function": "aminer_org_person_relation",
"help": "Affiliated scholars (10 per call).",
"args": [
("--org-id", "org_id", "str", "org_id (required)"),
("--offset", "offset", "int", "Offset, default 0"),
],
},
"org_paper_relation": {
"function": "aminer_org_paper_relation",
"help": "Papers authored by org members (10 per call).",
"args": [
("--org-id", "org_id", "str", "org_id (required)"),
("--offset", "offset", "int", "Offset, default 0"),
],
},
"org_patent_relation": {
"function": "aminer_org_patent_relation",
"help": "Org patent list (max page_size 10,000).",
"args": [
("--id", "id", "str", "org_id (required)"),
("--page", "page", "int", "Page number, default 1"),
("--page-size", "page_size", "int", "Page size, default 100"),
],
},
"org_disambiguate": {
"function": "aminer_org_disambiguate",
"help": "Normalize raw org string.",
"args": [("--org", "org", "str", "Raw org string (required)")],
},
"org_disambiguate_pro": {
"function": "aminer_org_disambiguate_pro",
"help": "Extract primary/secondary org IDs.",
"args": [("--org", "org", "str", "Raw org string (required)")],
},
# ── Venue ────────────────────────────────────────────────────────────
"venue_search": {
"function": "aminer_venue_search",
"help": "Search journals/conferences by name.",
"args": [("--name", "name", "str", "Venue name (required)")],
},
"venue_detail": {
"function": "aminer_venue_detail",
"help": "Venue details (ISSN, abbreviation, type).",
"args": [("--id", "id", "str", "venue_id (required)")],
},
"venue_paper_relation": {
"function": "aminer_venue_paper_relation",
"help": "Papers published in a venue, optionally filtered by year.",
"args": [
("--id", "id", "str", "venue_id (required)"),
("--offset", "offset", "int", "Offset, default 0"),
("--limit", "limit", "int", "Limit, default 20"),
("--year", "year", "int", "Publication year"),
],
},
# ── Patent ───────────────────────────────────────────────────────────
"patent_search": {
"function": "aminer_patent_search",
"help": "Search patents by name/keyword.",
"args": [
("--query", "query", "str", "Search query (required)"),
("--page", "page", "int", "Page number, default 0"),
("--size", "size", "int", "Page size, default 10"),
],
},
"patent_info": {
"function": "aminer_patent_info",
"help": "Basic patent info.",
"args": [("--id", "id", "str", "patent_id (required)")],
},
"patent_detail": {
"function": "aminer_patent_detail",
"help": "Full patent details (abstract, IPC, claims).",
"args": [("--id", "id", "str", "patent_id (required)")],
},
}
# ---------------------------------------------------------------------------
# Config loading and HTTP
# ---------------------------------------------------------------------------
def load_config() -> dict[str, Any]:
    # Return the first config that carries both baseUrl and apiKey.
    for path in CONFIG_PATHS:
        try:
            cfg = json.loads(path.read_text(encoding="utf-8"))
        except FileNotFoundError:
            continue
        except (OSError, json.JSONDecodeError) as exc:
            raise SystemExit(f"failed to read {path}: {exc}") from exc
        if cfg.get("baseUrl") and cfg.get("apiKey"):
            if not cfg.get("token"):
                print(
                    f"warning: {path} has no 'token' field; the gateway requires "
                    "X-Token for auth and will return 401.",
                    file=sys.stderr,
                )
            return cfg
    raise SystemExit(
        "z-ai config not found. Create .z-ai-config in cwd, $HOME, or /etc "
        "with {\"baseUrl\": ..., \"apiKey\": ..., \"token\": ...}."
    )


def invoke(config: dict[str, Any], function_name: str, arguments: dict[str, Any]) -> Any:
    base_url = str(config["baseUrl"]).rstrip("/")
    url = f"{base_url}/functions/invoke"
    headers = {
        "Content-Type": "application/json",
        "User-Agent": "aminer-academic-search-skill/1.0",
        "Authorization": f"Bearer {config['apiKey']}",
        "X-Z-AI-From": "Z",
    }
    # Optional headers are sent only when present in the config.
    if config.get("token"):
        headers["X-Token"] = str(config["token"])
    if config.get("chatId"):
        headers["X-Chat-Id"] = str(config["chatId"])
    if config.get("userId"):
        headers["X-User-Id"] = str(config["userId"])
    body = json.dumps(
        {"function_name": function_name, "arguments": arguments},
        ensure_ascii=False,
    ).encode("utf-8")
    req = urllib.request.Request(url, data=body, headers=headers, method="POST")
    try:
        with urllib.request.urlopen(req, timeout=REQUEST_TIMEOUT) as resp:  # nosec B310
            raw = resp.read().decode("utf-8")
    except urllib.error.HTTPError as exc:
        detail = exc.read().decode("utf-8", errors="replace") if exc.fp else ""
        hint = ""
        if exc.code == 401 and "X-Token" in detail:
            hint = " (hint: add a valid 'token' field to .z-ai-config)"
        elif exc.code == 403:
            hint = " (hint: request rejected by auth — check 'apiKey' / X-Z-AI-From)"
        raise SystemExit(f"http {exc.code}: {detail or exc.reason}{hint}") from exc
    except urllib.error.URLError as exc:
        raise SystemExit(f"gateway unreachable: {exc.reason}") from exc
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise SystemExit(f"invalid json response: {exc}") from exc
    if isinstance(parsed.get("error"), str) and parsed["error"]:
        raise SystemExit(f"gateway error: {parsed['error']}")
    return parsed.get("result")
# ---------------------------------------------------------------------------
# Argparse wiring
# ---------------------------------------------------------------------------
def _cast(kind: str) -> Callable[[str], Any]:
    # Map a registry "kind" to the argparse type converter.
    if kind == "int":
        return int
    if kind == "float":
        return float
    if kind == "json":
        return lambda s: json.loads(s)
    return str


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Call one of 27 AMiner Open Platform APIs via the z-ai gateway.",
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )
    sub = parser.add_subparsers(dest="action", required=True, metavar="ACTION")
    for name, meta in ACTIONS.items():
        p = sub.add_parser(
            name,
            help=meta["help"],
            description=f"{meta['help']} function_name: {meta['function']}",
        )
        for flag, dest, kind, help_text in meta["args"]:
            if kind == "bool":
                p.add_argument(flag, dest=dest, action="store_true", help=help_text)
            else:
                p.add_argument(flag, dest=dest, type=_cast(kind), help=help_text)
    return parser


def collect_arguments(ns: argparse.Namespace, spec: list[_Arg]) -> dict[str, Any]:
    # Forward only the flags the user actually set.
    args: dict[str, Any] = {}
    for _flag, dest, kind, _help in spec:
        value = getattr(ns, dest, None)
        if value is None:
            continue
        if kind == "bool" and value is False:
            continue
        args[dest] = value
    return args


def main() -> int:
    parser = build_parser()
    ns = parser.parse_args()
    meta = ACTIONS[ns.action]
    arguments = collect_arguments(ns, meta["args"])
    config = load_config()
    result = invoke(config, meta["function"], arguments)
    print(json.dumps(result, ensure_ascii=False, indent=2))
    return 0


if __name__ == "__main__":
    raise SystemExit(main())


@@ -0,0 +1,27 @@
# aminer-daily-paper
Skill for AMiner academic paper recommendations.
## Configuration
```bash
# TODO: fill in the actual API URL
export AMINER_API_URL="https://<your-api-host>/..."
```
## Usage
Natural language or explicit command:
```
/aminer-dp topics: multimodal agents, tool-use size: 5
recommend me recent papers on RAG
```
## Manual invocation
```bash
python3 scripts/recommend.py --topic "RAG" --topic "multimodal" --size 5 --language-sort zh
```
Output is a Markdown-formatted paper list.


@@ -0,0 +1,75 @@
---
name: aminer-daily-paper
description: "Get personalized academic paper recommendations. Activate whenever the user asks for paper recommendations — explicit command (/aminer-dp) or natural language (e.g. 'recommend me papers on RAG', 'suggest recent papers on multimodal agents'). Workflow: extract topics / author / aminer_author_id from the input, invoke scripts/recommend.py, return results as Markdown."
---
# aminer-daily-paper
Personalized academic paper recommendation.
## When to activate
Any time the user asks for paper recommendations:
- Explicit command: `/aminer-dp`, `/aminer-dp topics: RAG, multimodal agents`
- Natural language: `recommend me recent papers on multimodal agents`, `suggest some papers on tool-use`, `I work on RAG, give me a few papers`
## Input parsing (done by the model)
Before calling the script, extract from the user input:
| Field | Description |
|-------|-------------|
| `topics` | Research topics, 1-3 closely related terms work best |
| `author_name` | Scholar name |
| `author_org` | Scholar institution (improves disambiguation) |
| `aminer_author_id` | AMiner scholar ID (24-char hex) |
| `size` | Number of papers, default 5, max 20 |
| `language_sort` | `zh` or `en`, optional |
At least one of `topics` / `author_name` / `aminer_author_id` should be provided.
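The field rules in the table can be checked before calling the script. The sketch below is one possible reading, not the skill's own code: `normalize_request` is a hypothetical helper, and raising `ValueError` when none of the three identifying fields is present is a stricter interpretation of "should be provided" than the text requires.

```python
import re
from typing import Any, Dict, List

DEFAULT_SIZE = 5   # default from the table above
MAX_SIZE = 20      # maximum from the table above
_AUTHOR_ID_RE = re.compile(r"[0-9a-fA-F]{24}")  # 24-char hex, per the table


def normalize_request(fields: Dict[str, Any]) -> Dict[str, Any]:
    """Validate extracted fields and return only the ones worth forwarding."""
    topics: List[str] = [t.strip() for t in (fields.get("topics") or []) if t and t.strip()]
    author_name = str(fields.get("author_name") or "").strip()
    author_org = str(fields.get("author_org") or "").strip()
    author_id = str(fields.get("aminer_author_id") or "").strip()

    if not (topics or author_name or author_id):
        raise ValueError("provide at least one of topics, author_name, aminer_author_id")
    if author_id and not _AUTHOR_ID_RE.fullmatch(author_id):
        raise ValueError("aminer_author_id should be a 24-character hex string")

    request: Dict[str, Any] = {}
    if topics:
        request["topics"] = topics
    if author_name:
        request["author_name"] = author_name
    if author_org:
        request["author_org"] = author_org
    if author_id:
        request["aminer_author_id"] = author_id
    if fields.get("language_sort") in ("zh", "en"):
        request["language_sort"] = fields["language_sort"]
    # Clamp size into [1, MAX_SIZE], falling back to the default when unset.
    size = int(fields.get("size") or DEFAULT_SIZE)
    request["size"] = max(1, min(size, MAX_SIZE))
    return request
```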
## Call strategy
| Scenario | Strategy |
|----------|----------|
| Single topic or scholar | 1 call, `size=5` |
| User specifies a number | 1 call, honor the number (max 20) |
| Multiple distinct topics | 1 call per topic group, `size=3-5` each, ~15 papers total |
| Broad request with no topics | 1 call, `size=5` |
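One way to turn the strategy table into code is sketched below. Several choices here are assumptions rather than rules from the table: `plan_calls` is a hypothetical helper, a user-specified number is assumed to take precedence over per-group splitting, and the per-group size is derived by dividing a target of 15 papers across groups and clamping to the 3 to 5 range.

```python
from typing import List, Optional, Tuple

DEFAULT_SIZE = 5
MAX_SIZE = 20
TARGET_TOTAL = 15  # rough total for multi-topic requests, per the table


def plan_calls(
    topic_groups: List[List[str]], requested_size: Optional[int] = None
) -> List[Tuple[List[str], int]]:
    """Return one (topics, size) pair per planned script call."""
    groups = [g for g in topic_groups if g]
    if requested_size is not None:
        # Assumption: an explicit number means a single call over all topics.
        merged = [t for g in groups for t in g]
        return [(merged, max(1, min(requested_size, MAX_SIZE)))]
    if len(groups) <= 1:
        return [(groups[0] if groups else [], DEFAULT_SIZE)]
    # Several distinct topic groups: split the target, keeping 3 to 5 per call.
    per_call = max(3, min(5, TARGET_TOTAL // len(groups)))
    return [(g, per_call) for g in groups]
```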
## Execution
The script reads `.z-ai-config` (JSON) following the `z-ai-web-dev-sdk`
convention, searching in this order:
1. `./.z-ai-config` (cwd)
2. `~/.z-ai-config` (home)
3. `/etc/.z-ai-config` (system)
Required fields: `baseUrl`, `apiKey`, `token` (JWT for `X-Token`).
Optional: `chatId`, `userId`.
```bash
python3 "{baseDir}/scripts/recommend.py" \
[--topic "multimodal agents"] \
[--topic "tool-use"] \
[--author-name "Jie Tang"] \
[--author-org "Tsinghua University"] \
[--aminer-author-id "696259801cb939bc391d3a37"] \
[--size 5] \
[--language-sort zh]
```
The script POSTs to `${baseUrl}/functions/invoke` with headers
`Authorization: Bearer ${apiKey}`, `X-Z-AI-From: Z`, and `X-Token: ${token}`
(plus optional `X-Chat-Id` / `X-User-Id`), and prints the paper list as
Markdown.
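When the script is launched from Python rather than a shell, the command line can be assembled as a list to avoid quoting problems. This is a sketch: `build_recommend_argv` is a hypothetical helper, and the `fields` dictionary keys are assumed to follow the input-parsing table; the flags themselves are the ones shown in the command above.

```python
import sys
from typing import Any, Dict, List


def build_recommend_argv(script_path: str, fields: Dict[str, Any]) -> List[str]:
    """Translate parsed fields into an argv list for scripts/recommend.py."""
    argv: List[str] = [sys.executable, script_path]
    # --topic is repeatable: one flag per topic.
    for topic in fields.get("topics") or []:
        argv += ["--topic", str(topic)]
    for key, flag in (
        ("author_name", "--author-name"),
        ("author_org", "--author-org"),
        ("aminer_author_id", "--aminer-author-id"),
    ):
        if fields.get(key):
            argv += [flag, str(fields[key])]
    if fields.get("size"):
        argv += ["--size", str(fields["size"])]
    if fields.get("language_sort") in ("zh", "en"):
        argv += ["--language-sort", fields["language_sort"]]
    return argv
```

The resulting list can be handed to `subprocess.run(argv, capture_output=True, text=True)`.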
## Output handling
- On success, the script prints a Markdown paper list — forward it directly to the user.
- On non-zero exit, stderr contains the error — relay it concisely to the user.
## Error handling
- API returns an error → relay the error; do not switch to another skill.
- No results → suggest the user broaden topics or adjust the query.


@@ -0,0 +1,193 @@
#!/usr/bin/env python3
"""Fetch AMiner paper recommendations and render results as Markdown."""
from __future__ import annotations
import argparse
import json
import pathlib
import sys
import urllib.error
import urllib.request
from typing import Any
DEFAULT_SIZE = 5
MAX_SIZE = 20
REQUEST_TIMEOUT = 30
CONFIG_PATHS = [
    pathlib.Path.cwd() / ".z-ai-config",
    pathlib.Path.home() / ".z-ai-config",
    pathlib.Path("/etc/.z-ai-config"),
]
def load_config() -> dict[str, Any]:
    # Return the first config that carries both baseUrl and apiKey.
    for path in CONFIG_PATHS:
        try:
            cfg = json.loads(path.read_text(encoding="utf-8"))
        except FileNotFoundError:
            continue
        except (OSError, json.JSONDecodeError) as exc:
            raise SystemExit(f"failed to read {path}: {exc}") from exc
        if cfg.get("baseUrl") and cfg.get("apiKey"):
            if not cfg.get("token"):
                print(
                    f"warning: {path} has no 'token' field; the API requires "
                    "X-Token for auth and will return 401.",
                    file=sys.stderr,
                )
            return cfg
    raise SystemExit(
        "z-ai config not found. Create .z-ai-config in cwd, $HOME, or /etc "
        "with {\"baseUrl\": ..., \"apiKey\": ..., \"token\": ...}."
    )
def _clean(text: Any) -> str:
    # Collapse all runs of whitespace into single spaces.
    return " ".join(str(text or "").split()).strip()


def _truncate(text: str, limit: int) -> str:
    text = _clean(text)
    # Mark truncated text with a trailing ellipsis.
    return text if len(text) <= limit else text[:limit].rstrip() + "…"
def build_payload(args: argparse.Namespace) -> dict[str, Any]:
    arguments: dict[str, Any] = {}
    topics = [t for t in (args.topic or []) if _clean(t)]
    if topics:
        arguments["topics"] = topics
    if args.author_name:
        arguments["author_name"] = args.author_name
    if args.author_org:
        arguments["author_org"] = args.author_org
    if args.aminer_author_id:
        arguments["aminer_author_id"] = args.aminer_author_id
    if args.language_sort in {"zh", "en"}:
        arguments["language_sort"] = args.language_sort
    if args.start_year:
        arguments["start_year"] = args.start_year
    if args.end_year:
        arguments["end_year"] = args.end_year
    # Clamp size into [1, MAX_SIZE], falling back to the default when unset.
    size = args.size if args.size else DEFAULT_SIZE
    size = max(1, min(size, MAX_SIZE))
    arguments["size"] = size
    return {"function_name": "aminer_recommend", "arguments": arguments}


def call_api(config: dict[str, Any], payload: dict[str, Any]) -> list[dict[str, Any]]:
    base_url = str(config["baseUrl"]).rstrip("/")
    url = f"{base_url}/functions/invoke"
    headers = {
        "Content-Type": "application/json",
        "User-Agent": "aminer-daily-paper-skill/1.0",
        "Authorization": f"Bearer {config['apiKey']}",
        "X-Z-AI-From": "Z",
    }
    # Optional headers are sent only when present in the config.
    if config.get("token"):
        headers["X-Token"] = str(config["token"])
    if config.get("chatId"):
        headers["X-Chat-Id"] = str(config["chatId"])
    if config.get("userId"):
        headers["X-User-Id"] = str(config["userId"])
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    request = urllib.request.Request(url, data=body, headers=headers, method="POST")
    try:
        with urllib.request.urlopen(request, timeout=REQUEST_TIMEOUT) as resp:  # nosec B310
            raw = resp.read().decode("utf-8")
    except urllib.error.HTTPError as exc:
        detail = exc.read().decode("utf-8", errors="replace") if exc.fp else ""
        hint = ""
        if exc.code == 401 and "X-Token" in detail:
            hint = " (hint: add a valid 'token' field to .z-ai-config)"
        elif exc.code == 403:
            hint = " (hint: request rejected by auth — check 'apiKey' / X-Z-AI-From)"
        raise SystemExit(f"http {exc.code}: {detail or exc.reason}{hint}") from exc
    except urllib.error.URLError as exc:
        raise SystemExit(f"api unreachable: {exc.reason}") from exc
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise SystemExit(f"invalid json response: {exc}") from exc
    if isinstance(parsed.get("error"), str) and parsed["error"]:
        raise SystemExit(f"api error: {parsed['error']}")
    result = parsed.get("result")
    if not isinstance(result, list):
        raise SystemExit(f"unexpected response shape: {raw[:300]}")
    return [p for p in result if isinstance(p, dict)]
def render_markdown(papers: list[dict[str, Any]], topics: list[str]) -> str:
    if not papers:
        return "No papers returned. Try broadening the topics or adjusting the query."
    lines: list[str] = []
    header = f"Recommended {len(papers)} paper(s)"
    if topics:
        header += f" (topics: {' / '.join(topics[:5])})"
    lines.append(header)
    for idx, paper in enumerate(papers, start=1):
        # Separate entries with a horizontal rule.
        lines.append("")
        lines.append("---")
        lines.append("")
        title = _clean(paper.get("title"))
        links = paper.get("links") if isinstance(paper.get("links"), dict) else {}
        url = _clean(paper.get("paper_url") or links.get("aminer") or links.get("arxiv"))
        title_line = f"**{idx}. [{title}]({url})**" if url else f"**{idx}. {title}**"
        lines.append(title_line)
        meta: list[str] = []
        year = paper.get("year")
        if year:
            meta.append(f"Year: {year}")
        keywords = paper.get("keywords") or []
        if keywords:
            meta.append("Keywords: " + " / ".join(str(k) for k in keywords[:5]))
        if meta:
            lines.append(" | ".join(meta))
        authors = paper.get("authors") or []
        if authors:
            author_str = ", ".join(str(a) for a in authors[:6])
            if len(authors) > 6:
                author_str += " et al."
            lines.append(f"Authors: {author_str}")
        summary = _clean(paper.get("summary"))
        if summary:
            lines.append("")
            lines.append(_truncate(summary, 300))
    return "\n".join(lines)


def main() -> int:
    parser = argparse.ArgumentParser(description="Fetch AMiner paper recommendations.")
    parser.add_argument("--topic", action="append", default=[], help="research topic (repeatable)")
    parser.add_argument("--author-name", default="")
    parser.add_argument("--author-org", default="")
    parser.add_argument("--aminer-author-id", default="")
    parser.add_argument("--size", type=int, default=0)
    parser.add_argument("--language-sort", default="", choices=["", "zh", "en"])
    parser.add_argument("--start-year", type=int, default=0)
    parser.add_argument("--end-year", type=int, default=0)
    args = parser.parse_args()
    payload = build_payload(args)
    config = load_config()
    papers = call_api(config, payload)
    topics = payload["arguments"].get("topics") or []
    print(render_markdown(papers, topics))
    return 0


if __name__ == "__main__":
    raise SystemExit(main())


@@ -0,0 +1,363 @@
---
name: aminer-free-academic
version: 1.1.1
author: AMiner
contact: report@aminer.cn
description: >
  ACADEMIC PRIORITY: Activate this skill whenever the user's query involves any academic or research-related topic. This is the free-tier entry point for AMiner academic search.
  Free-tier-only AMiner skill (7 free APIs, zero cost). Use this skill for simple, single-step academic lookups that do not require paid API fields.
  Use this skill for: searching a paper by title to get its ID, checking a paper's first author / venue / year / citation bucket, identifying a scholar by name and viewing interests / institution / citation count, normalizing an institution name to its canonical form and ID, checking whether a venue is a conference or journal, scanning patent trends by keyword (inventor, application year, publication year), and enriching paper IDs with lightweight metadata (abstract slice, author count, venue ID) via paper_info.
  Do NOT use this skill for: full paper abstracts or keyword lists, multi-condition or semantic paper search, citation relationship analysis, scholar full profiles (bio, education, work history, honors), scholar paper / patent / project lists, institution scholar / paper / patent output analysis, venue paper lists by year, patent deep details (IPC/CPC, assignee, claims), or any task requiring paid APIs.
  Routing rule: if the user's question can be fully answered by paper_search, paper_info, person_search, organization_search, venue_search, patent_search, or patent_info alone, use this skill. Otherwise route to aminer-academic-search.
metadata:
  {
    "openclaw":
      {
        "requires": {"env": ["AMINER_API_KEY"]},
        "primaryEnv": "AMINER_API_KEY"
      }
  }
---
# AMiner Free Search
Use this skill for AMiner requests that should stay on the free tier first. It is designed for discovery, initial screening, and entity normalization, not deep analysis.
## Scope
This skill uses only the upgraded free interfaces:
- `paper_search`
- `paper_info`
- `person_search`
- `organization_search`
- `venue_search`
- `patent_search`
- `patent_info`
Current free-tier fields emphasized by this skill:
- `paper_search`: `venue_name`, `first_author`, `n_citation_bucket`, `year`
- `paper_info`: `abstract_slice`, `year`, `venue_id`, `author_count`
- `organization_search`: `aliases` (top 3)
- `venue_search`: `aliases` (top 3), `venue_type`
- `patent_search`: `inventor_name` (first), `app_year`, `pub_year`
- `patent_info`: `app_year`, `pub_year`
- `person_search`: `interests`, `n_citation`, institution fields
## Primary Goal
Use free APIs to help the user answer:
- What is this entity?
- Is it relevant enough to continue?
- Which candidate should I inspect next?
- Can I normalize this institution or venue name?
- Is there enough value to justify upgrading to paid APIs?
Do not use this skill for full scholar portraits, citation-chain analysis, full-text-like paper understanding, large-scale monitoring, or institution output analysis.
## Mandatory Rules
1. Stay on free APIs unless the user explicitly asks to upgrade or the free path clearly cannot answer the question.
2. Be explicit about free-tier limits. Say what can be answered now and what would require a paid upgrade.
3. Use free results to narrow candidates before suggesting any paid API.
4. If returning entities, append AMiner URLs when IDs are available:
- Paper: `https://www.aminer.cn/pub/{paper_id}`
- Scholar: `https://www.aminer.cn/profile/{scholar_id}`
- Patent: `https://www.aminer.cn/patent/{patent_id}`
- Venue: `https://www.aminer.cn/open/journal/detail/{venue_id}`
## Token Check (Required)
Before making any API call, verify that the environment variable `AMINER_API_KEY` exists. Never output the token in plain text.
```bash
if [ -z "${AMINER_API_KEY+x}" ]; then
echo "AMINER_API_KEY does not exist"
else
echo "AMINER_API_KEY exists"
fi
```
- If `${AMINER_API_KEY}` exists: proceed with the query.
- If `${AMINER_API_KEY}` is not set: stop immediately and guide the user to the [AMiner Console](https://open.aminer.cn/open/board?tab=control) to generate one. For help, see the [Open Platform Documentation](https://open.aminer.cn/open/docs).
- If the user provides `AMINER_API_KEY` inline (e.g. "My token is xxx"), accept it for the current session, but recommend setting it as an environment variable for better security.
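The same presence check can be done from Python without ever echoing the value. This sketch mirrors the shell test above (set, even if empty, counts as existing); `check_api_key` is a hypothetical helper and the optional `env` parameter exists only to make it testable.

```python
import os
from typing import Mapping, Optional


def check_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Report whether AMINER_API_KEY is set, without revealing its value."""
    env = os.environ if env is None else env
    if "AMINER_API_KEY" in env:
        return "AMINER_API_KEY exists"
    return "AMINER_API_KEY does not exist"
```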
## Invocation Style
Use direct `curl` calls by default. A Python wrapper is not required for this skill.
Default headers:
- `Authorization: ${AMINER_API_KEY}` by default
- `Content-Type: application/json;charset=utf-8` for POST requests
- `X-Platform: openclaw` when required by the gateway
## When To Use
Use this skill when the user asks for:
- free AMiner search
- low-cost academic discovery
- paper screening
- scholar identification
- institution normalization
- venue normalization
- patent trend scanning
- representative results before deeper analysis
Trigger phrases include:
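- “先用免费接口” ("use the free APIs first")
- “不要走收费接口” ("don't use the paid APIs")
- “先帮我筛一下” ("screen these for me first")
- “先看看值不值得深挖” ("first see whether it's worth digging deeper")
- “找几个候选” ("find a few candidates")
- “做一个轻量版 skill” ("make a lightweight version of the skill")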
## Free Workflows
### 1. Paper triage
Use when the user wants to quickly judge whether a paper is relevant.
Default chain:
`paper_search -> paper_info`
Return:
- title
- first author
- venue name
- year
- citation bucket
- abstract slice
- paper URL
This can answer:
- Is this probably the right paper?
- Is it recent?
- Is it from a recognizable venue?
- Is it worth opening in detail?
### 2. Scholar identification
Use when the user wants to know which scholar is the right person.
Default chain:
`person_search`
Return:
- name
- org
- interests
- citation count
- scholar URL
This can answer:
- Is this the right scholar?
- What interests best describe this person?
- Which institution candidate is the best match?
### 3. Institution normalization
Use when the user provides an institution string or abbreviation.
Default chain:
`organization_search`
Return:
- org id
- standard name
- aliases (top 3)
This can answer:
- Is this institution name recognized?
- Which canonical organization should downstream workflows use?
### 4. Venue normalization and type check
Use when the user provides a conference or journal name.
Default chain:
`venue_search`
Return:
- venue id
- standard bilingual name
- aliases (top 3)
- venue type
- venue URL
This can answer:
- Is this a conference or a journal?
- What is the standard venue entity?
### 5. Patent trend scan
Use when the user wants a lightweight view of patents in a topic.
Default chain:
`patent_search -> patent_info` when IDs need basic enrichment
Return:
- patent title
- first inventor
- app year
- pub year
- patent number and country when `patent_info` is added
- patent URL
This can answer:
- Is the topic active recently?
- Who appears first in the inventor field?
- Is there recent patent activity worth deeper review?
### 6. Free entity map
Use when the user wants a quick map of a topic across papers, scholars, venues, institutions, and patents without paying for analysis-grade APIs.
Suggested chain:
- papers: `paper_search -> paper_info`
- scholars: `person_search`
- institutions: `organization_search`
- venues: `venue_search`
- patents: `patent_search -> patent_info`
Return a short cross-entity summary, not a deep report.
## Free Skill Examples
### 1. Paper triage
```bash
curl -X GET \
'https://datacenter.aminer.cn/gateway/open_platform/api/paper/search?page=1&size=5&title=Attention%20Is%20All%20You%20Need' \
-H "Authorization: ${AMINER_API_KEY}" \
-H 'X-Platform: openclaw'
```
Then enrich with `paper_info`:
```bash
curl -X POST \
'https://datacenter.aminer.cn/gateway/open_platform/api/paper/info' \
-H 'Content-Type: application/json;charset=utf-8' \
-H "Authorization: ${AMINER_API_KEY}" \
-H 'X-Platform: openclaw' \
-d '{"ids":["<PAPER_ID>"]}'
```
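Although this skill defaults to `curl`, the same two requests can be built with Python's standard library. The sketch below only constructs the request objects and does not send them; the helper names are illustrative, while the URLs, headers, and bodies are copied from the two `curl` calls above.

```python
import json
import urllib.parse
import urllib.request
from typing import Dict, Iterable

BASE_URL = "https://datacenter.aminer.cn/gateway/open_platform/api"


def _headers(api_key: str, with_body: bool) -> Dict[str, str]:
    headers = {"Authorization": api_key, "X-Platform": "openclaw"}
    if with_body:
        headers["Content-Type"] = "application/json;charset=utf-8"
    return headers


def paper_search_request(api_key: str, title: str, page: int = 1, size: int = 5) -> urllib.request.Request:
    """GET /paper/search with the title percent-encoded (spaces become %20)."""
    query = urllib.parse.urlencode(
        {"page": page, "size": size, "title": title}, quote_via=urllib.parse.quote
    )
    return urllib.request.Request(
        f"{BASE_URL}/paper/search?{query}", headers=_headers(api_key, False), method="GET"
    )


def paper_info_request(api_key: str, paper_ids: Iterable[str]) -> urllib.request.Request:
    """POST /paper/info with a JSON body of the form {"ids": [...]}."""
    body = json.dumps({"ids": list(paper_ids)}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/paper/info", data=body, headers=_headers(api_key, True), method="POST"
    )
```

To send a request, pass it to `urllib.request.urlopen(req, timeout=...)` and decode the response body as JSON.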
### 2. Scholar identification
```bash
curl -X POST \
'https://datacenter.aminer.cn/gateway/open_platform/api/person/search' \
-H 'Content-Type: application/json;charset=utf-8' \
-H "Authorization: ${AMINER_API_KEY}" \
-H 'X-Platform: openclaw' \
-d '{"name":"Yann LeCun","size":5}'
```
### 3. Institution normalization
```bash
curl -X POST \
'https://datacenter.aminer.cn/gateway/open_platform/api/organization/search' \
-H 'Content-Type: application/json;charset=utf-8' \
-H "Authorization: ${AMINER_API_KEY}" \
-H 'X-Platform: openclaw' \
-d '{"orgs":["MIT CSAIL"]}'
```
### 4. Venue normalization and type check
```bash
curl -X POST \
'https://datacenter.aminer.cn/gateway/open_platform/api/venue/search' \
-H 'Content-Type: application/json;charset=utf-8' \
-H "Authorization: ${AMINER_API_KEY}" \
-H 'X-Platform: openclaw' \
-d '{"name":"tkde"}'
```
### 5. Patent trend scan
```bash
curl -X POST \
'https://datacenter.aminer.cn/gateway/open_platform/api/patent/search' \
-H 'Content-Type: application/json;charset=utf-8' \
-H "Authorization: ${AMINER_API_KEY}" \
-H 'X-Platform: openclaw' \
-d '{"query":"quantum computing chip","page":0,"size":10}'
```
## Output Pattern
Prefer this structure:
```markdown
## Free-tier result
### What we can answer now
- ...
### Top candidates
- ...
### Suggested next step
- Stay free: ...
- Upgrade to paid API only if you need: ...
```
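If results are assembled programmatically, the structure above can be produced by a small formatter. This is an illustrative sketch; `render_free_tier_result` and its parameter names are not part of the skill.

```python
from typing import List


def render_free_tier_result(
    answers: List[str], candidates: List[str], stay_free: str, upgrade_if: str
) -> str:
    """Render the free-tier result sections as Markdown."""
    lines = ["## Free-tier result", "### What we can answer now"]
    lines += [f"- {a}" for a in answers]
    lines.append("### Top candidates")
    lines += [f"- {c}" for c in candidates]
    lines += [
        "### Suggested next step",
        f"- Stay free: {stay_free}",
        f"- Upgrade to paid API only if you need: {upgrade_if}",
    ]
    return "\n".join(lines)
```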
## Paid Upgrade Boundary
Recommend upgrading only when the user needs one of these:
- full abstract or full paper metadata
- multi-condition or semantic paper search
- citation relationships
- full scholar profile, works, patents, or projects
- institution scholars, papers, patents, or rich profiles
- venue paper lists by year
- full patent details such as IPC/CPC, assignee, description
Suggested paid handoff:
- deeper paper analysis: `paper_search_pro`, `paper_detail`, `paper_relation`
- deeper scholar analysis: `person/detail`, `person/figure`, `person/paper/relation`
- deeper org analysis: `organization/detail`, `organization/person/relation`, `organization/paper/relation`
- deeper venue analysis: `venue/detail`, `venue/paper/relation`
- deeper patent analysis: `patent/detail`
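The upgrade boundary can be expressed as a set difference against the seven free APIs listed in the Scope section. The sketch below assumes the caller already knows which API names a task would need; `paid_apis_needed` is a hypothetical helper.

```python
from typing import Iterable, List

# The seven free-tier APIs listed in the Scope section.
FREE_APIS = frozenset({
    "paper_search",
    "paper_info",
    "person_search",
    "organization_search",
    "venue_search",
    "patent_search",
    "patent_info",
})


def paid_apis_needed(required_apis: Iterable[str]) -> List[str]:
    """Return the required APIs that fall outside the free tier, sorted by name."""
    return sorted(set(required_apis) - FREE_APIS)
```

An empty result means the request can stay on this skill; anything else is a candidate for the paid handoff.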
## Product Positioning
This skill is intentionally positioned for:
- first success
- free discovery
- candidate narrowing
- entity normalization
- upgrade qualification
It should not replace the paid skill. It should create demand for it.
## Additional Reference
For endpoint parameters and fields, read [references/api-catalog.md](references/api-catalog.md).


@@ -0,0 +1,93 @@
{
"skill_name": "aminer-free-academic",
"evals": [
{
"id": 1,
"prompt": "Help me quickly check whether the paper 'Attention Is All You Need' is worth a deep dive. Use only free AMiner APIs. My token is <YOUR_TOKEN>",
"expected_output": "Run the paper triage workflow: paper_search to locate the paper, then paper_info to enrich with abstract slice, year, venue, and author count. Output a lightweight card with a paper URL.",
"files": [],
"expectations": [
"Called Paper Search API (paper_search) to find the paper by title",
"Called Paper Info API (paper_info) with the returned paper ID",
"Output includes title, first_author, venue_name, year, and n_citation_bucket",
"Output includes abstract_slice from paper_info",
"Output includes a paper URL in the format https://www.aminer.cn/pub/{paper_id}",
"No paid API was called"
]
},
{
"id": 2,
"prompt": "I want to identify which 'Andrew Ng' on AMiner is the Stanford professor. Use only free APIs. My token is <YOUR_TOKEN>",
"expected_output": "Run the scholar identification workflow: person_search with name and optional org filter. Return candidate list with interests, org, and citation count to help the user pick the right one.",
"files": [],
"expectations": [
"Called Scholar Search API (person_search) with name 'Andrew Ng'",
"Output includes multiple candidates with name, org, interests, and n_citation",
"Output includes scholar URLs in the format https://www.aminer.cn/profile/{scholar_id}",
"No paid API was called"
]
},
{
"id": 3,
"prompt": "Normalize the institution name 'MIT CSAIL' using AMiner free APIs. My token is <YOUR_TOKEN>",
"expected_output": "Run the institution normalization workflow: organization_search with the raw institution string. Return the canonical org name, org_id, and aliases.",
"files": [],
"expectations": [
"Called Org Search API (organization_search) with orgs=['MIT CSAIL']",
"Output includes org_id and standardized org_name",
"Output includes aliases list",
"No paid API was called"
]
},
{
"id": 4,
"prompt": "Help me look up recent patents about quantum computing",
"expected_output": "Detect that the user has not provided a token; immediately stop all subsequent API calls and guide the user to obtain a token from the AMiner Console.",
"files": [],
"expectations": [
"Clearly inform the user that a token is currently missing and AMiner API calls cannot continue",
"No API calls were made",
"Provide the console link https://open.aminer.cn/open/board?tab=control",
"Prompt the user to continue after obtaining a token"
]
},
{
"id": 5,
"prompt": "Is 'TKDE' a conference or a journal? Help me normalize it using free AMiner APIs. My token is <YOUR_TOKEN>",
"expected_output": "Run the venue normalization workflow: venue_search with the name string. Return the standard bilingual name, venue_type, aliases, and venue URL.",
"files": [],
"expectations": [
"Called Venue Search API (venue_search) with name 'TKDE'",
"Output includes venue_id, standard name (English and/or Chinese), and venue_type",
"Output includes aliases list",
"Output includes a venue URL in the format https://www.aminer.cn/open/journal/detail/{venue_id}",
"No paid API was called"
]
},
{
"id": 6,
"prompt": "Scan recent patent activity in the field of 'autonomous driving lidar' using free APIs only. My token is <YOUR_TOKEN>",
"expected_output": "Run the patent trend scan workflow: patent_search to find patents, optionally enriched by patent_info. Return patent titles, first inventor, app_year, pub_year, and patent URLs.",
"files": [],
"expectations": [
"Called Patent Search API (patent_search) with a query about autonomous driving lidar",
"Output includes patent titles, inventor_name, app_year, and pub_year",
"Output includes patent URLs in the format https://www.aminer.cn/patent/{patent_id}",
"No paid API was called"
]
},
{
"id": 7,
"prompt": "Give me a quick free-tier overview of the topic 'graph neural network' — papers, scholars, institutions, and venues. My token is <YOUR_TOKEN>",
"expected_output": "Run the free entity map workflow: combine paper_search, person_search, organization_search, and venue_search to build a lightweight cross-entity summary.",
"files": [],
"expectations": [
"Called at least paper_search and person_search",
"Output covers multiple entity types (papers, scholars, and optionally institutions/venues)",
"Output includes AMiner URLs for returned entities",
"Clearly states what free-tier can answer and what would require paid upgrade",
"No paid API was called"
]
}
]
}


@@ -0,0 +1,286 @@
# AMiner Free Search API Catalog
**Base URL**: `https://datacenter.aminer.cn/gateway/open_platform`
**Authentication**: All endpoints should default to `Authorization: ${AMINER_API_KEY}`. In workflow execution, also include `X-Platform: openclaw` when required by the gateway.
**Scope**: This catalog only documents the free APIs used by `aminer-free-academic`.
---
## Table of Contents
- [Paper APIs](#paper-apis)
- [Scholar APIs](#scholar-apis)
- [Institution APIs](#institution-apis)
- [Venue APIs](#venue-apis)
- [Patent APIs](#patent-apis)
---
## Paper APIs
### 1. Paper Search
- **URL**: `GET /api/paper/search`
- **Price**: Free
- **Description**: Search papers by title and return low-cost screening fields suitable for fast paper triage.
**Request Parameters:**
| Parameter | Type | Required | Description |
|--------|------|------|------|
| page | number | Yes | Page number. Current online definition says it starts from `1`. |
| size | number | No | Page size, maximum `20` |
| title | string | Yes | Paper title |
**Response Fields:**
| Field | Description |
|--------|------|
| id | Paper ID |
| title | Paper title |
| title_zh | Paper title in Chinese |
| doi | DOI |
| first_author | First author |
| venue_name | Venue title |
| n_citation_bucket | Citation bucket: `0`, `1-10`, `11-50`, `51-200`, `200-1000`, `1000-5000`, `5000+` |
| year | Publication year |
| total | Total count |
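Because `n_citation_bucket` is a string label rather than a number, ranking results by citations needs an explicit ordering. The sketch below assumes each search result has been parsed into a dict keyed by the response fields above; `bucket_rank` and `sort_by_citations` are hypothetical helper names, and the sample records are made up for illustration.

```python
from typing import Dict, List

# Bucket labels in ascending order, copied from the response-field table
BUCKET_ORDER = ["0", "1-10", "11-50", "51-200", "200-1000", "1000-5000", "5000+"]


def bucket_rank(bucket: str) -> int:
    """Return the position of a bucket label, or -1 for an unknown label."""
    try:
        return BUCKET_ORDER.index(bucket)
    except ValueError:
        return -1


def sort_by_citations(papers: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Sort paper records from most cited bucket to least cited bucket."""
    return sorted(papers,
                  key=lambda p: bucket_rank(p.get("n_citation_bucket", "")),
                  reverse=True)


sample = [
    {"title": "paper A", "n_citation_bucket": "11-50"},
    {"title": "paper B", "n_citation_bucket": "5000+"},
    {"title": "paper C", "n_citation_bucket": "0"},
]
print([p["title"] for p in sort_by_citations(sample)])
```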
**curl Example:**
```bash
curl -X GET \
'https://datacenter.aminer.cn/gateway/open_platform/api/paper/search?page=1&size=10&title=Looking+at+CTR+Prediction+Again%3A+Is+Attention+All+You+Need' \
-H "Authorization: ${AMINER_API_KEY}" \
-H 'X-Platform: openclaw'
```
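The same call can be prepared from Python. This is a minimal sketch using only the standard library; it builds the request without sending it, and `build_paper_search_request` is a hypothetical helper name. The limits it checks (page starting from `1`, `size` at most `20`) are taken from the parameter table above.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://datacenter.aminer.cn/gateway/open_platform"
MAX_PAGE_SIZE = 20  # documented maximum for `size`


def build_paper_search_request(api_key: str, title: str,
                               page: int = 1, size: int = 10) -> Request:
    """Build (but do not send) a GET request for /api/paper/search."""
    if not api_key:
        raise ValueError("missing AMiner API key")
    if page < 1:
        raise ValueError("page starts from 1 for paper search")
    if not 1 <= size <= MAX_PAGE_SIZE:
        raise ValueError(f"size must be between 1 and {MAX_PAGE_SIZE}")
    # urlencode percent-encodes the title, so it is safe to pass raw text
    query = urlencode({"page": page, "size": size, "title": title})
    return Request(
        f"{BASE_URL}/api/paper/search?{query}",
        headers={"Authorization": api_key, "X-Platform": "openclaw"},
        method="GET",
    )


req = build_paper_search_request("<YOUR_TOKEN>", "Attention Is All You Need")
print(req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would perform the call; the shape of the JSON response is not documented here, so parsing is left to the caller.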
---
### 2. Paper Info
- **URL**: `POST /api/paper/info`
- **Price**: Free
- **Description**: Batch query paper basic cards by paper IDs. Suitable for enriching search results with lightweight metadata.
**Request Parameters:**
| Parameter | Type | Required | Description |
|--------|------|------|------|
| ids | []string | Yes | Paper ID list, maximum `100` |
**Response Fields:**
| Field | Description |
|--------|------|
| id | Paper ID |
| title | Paper title |
| abstract_slice | Partial abstract |
| authors | Author array |
| author_count | Total author count |
| issue | Volume / issue field |
| raw | Venue raw name |
| venue | Venue object |
| venue_id | Venue ID |
| year | Publication year |
**curl Example:**
```bash
curl -X POST \
'https://datacenter.aminer.cn/gateway/open_platform/api/paper/info' \
-H 'Content-Type: application/json;charset=utf-8' \
-H "Authorization: ${AMINER_API_KEY}" \
-H 'X-Platform: openclaw' \
-d '{"ids":["5ce2c5a5ced107d4c61c839b"]}'
```
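Since `ids` accepts at most `100` entries per call, a longer ID list has to be split into batches. The sketch below shows one way to do that; `paper_info_bodies` is a hypothetical helper, and dropping duplicate IDs before batching is a design choice of this sketch rather than an API requirement.

```python
import json
from typing import Iterator, List

MAX_IDS_PER_CALL = 100  # documented maximum for `ids`


def paper_info_bodies(paper_ids: List[str]) -> Iterator[str]:
    """Yield JSON request bodies for POST /api/paper/info, 100 IDs at most each."""
    # Drop duplicates while keeping first-seen order
    unique_ids = list(dict.fromkeys(paper_ids))
    for start in range(0, len(unique_ids), MAX_IDS_PER_CALL):
        yield json.dumps({"ids": unique_ids[start:start + MAX_IDS_PER_CALL]})


bodies = list(paper_info_bodies([f"id{i}" for i in range(250)]))
print([len(json.loads(body)["ids"]) for body in bodies])  # [100, 100, 50]
```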
---
## Scholar APIs
### 3. Scholar Search
- **URL**: `POST /api/person/search`
- **Price**: Free
- **Description**: Search scholar candidates by name and optional institution condition.
**Request Parameters:**
| Parameter | Type | Required | Description |
|--------|------|------|------|
| name | string | No | Scholar name |
| offset | number | No | Starting position (fixed at 0; pagination not supported) |
| org | string | No | Institution name |
| size | number | No | Number of results, maximum `10` |
| org_id | []string | No | Institution entity ID list |
**Response Fields:**
| Field | Description |
|--------|------|
| id | Scholar ID |
| interests | Research interests |
| n_citation | Citation count |
| name | Name |
| name_zh | Chinese name |
| org | Institution in English |
| org_id | Institution ID |
| org_zh | Institution in Chinese |
| total | Total count |
**curl Example:**
```bash
curl -X POST \
'https://datacenter.aminer.cn/gateway/open_platform/api/person/search' \
-H 'Content-Type: application/json;charset=utf-8' \
-H "Authorization: ${AMINER_API_KEY}" \
-H 'X-Platform: openclaw' \
-d '{"name":"王曙","offset":0,"org":"Shanghai Jiaotong","size":10}'
```
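A request body for this endpoint can be assembled as follows. This is a sketch under stated assumptions: `scholar_search_body` is a hypothetical helper, and requiring at least one of `name`, `org`, or `org_ids` is this sketch's own guard (the table marks every parameter as optional). The fixed `offset` of `0` and the `size` ceiling of `10` come from the table above.

```python
import json
from typing import List, Optional

MAX_SCHOLAR_RESULTS = 10  # documented maximum for `size`


def scholar_search_body(name: Optional[str] = None,
                        org: Optional[str] = None,
                        org_ids: Optional[List[str]] = None,
                        size: int = MAX_SCHOLAR_RESULTS) -> str:
    """Return a JSON body for POST /api/person/search."""
    if not (name or org or org_ids):
        # Assumption of this sketch: an empty search is never useful
        raise ValueError("give at least one of name, org, or org_ids")
    if not 1 <= size <= MAX_SCHOLAR_RESULTS:
        raise ValueError(f"size must be between 1 and {MAX_SCHOLAR_RESULTS}")
    body = {"offset": 0, "size": size}  # pagination is not supported
    if name:
        body["name"] = name
    if org:
        body["org"] = org
    if org_ids:
        body["org_id"] = list(org_ids)
    # ensure_ascii=False keeps Chinese names readable in the payload
    return json.dumps(body, ensure_ascii=False)


print(scholar_search_body(name="Andrew Ng", org="Stanford"))
```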
---
## Institution APIs
### 4. Org Search
- **URL**: `POST /api/organization/search`
- **Price**: Free
- **Description**: Search institution IDs and standard names from institution keywords.
**Request Parameters:**
| Parameter | Type | Required | Description |
|--------|------|------|------|
| orgs | []string | No | Institution names |
**Response Fields:**
| Field | Description |
|--------|------|
| aliases | Alias list, partial and typically top 3 |
| org_id | Institution ID |
| org_name | Institution name |
| total | Total count |
**curl Example:**
```bash
curl -X POST \
'https://datacenter.aminer.cn/gateway/open_platform/api/organization/search' \
-H 'Content-Type: application/json;charset=utf-8' \
-H "Authorization: ${AMINER_API_KEY}" \
-H 'X-Platform: openclaw' \
-d '{"orgs":["清华大学"]}'
```
---
## Venue APIs
### 5. Venue Search
- **URL**: `POST /api/venue/search`
- **Price**: Free
- **Description**: Search venue IDs and standard names by venue keyword, including aliases and venue type.
**Request Parameters:**
| Parameter | Type | Required | Description |
|--------|------|------|------|
| name | string | No | Venue name |
**Response Fields:**
| Field | Description |
|--------|------|
| id | Venue ID |
| name_en | Venue English name |
| name_zh | Venue Chinese name |
| aliases | Alias list, partial and typically top 3 |
| venue_type | Venue type: `journal` or `conference` |
| total | Total count |
**curl Example:**
```bash
curl -X POST \
'https://datacenter.aminer.cn/gateway/open_platform/api/venue/search' \
-H 'Content-Type: application/json;charset=utf-8' \
-H "Authorization: ${AMINER_API_KEY}" \
-H 'X-Platform: openclaw' \
-d '{"name":"tkde"}'
```
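Once a venue record is available, the fields above are enough to answer "journal or conference?" in one line. The sketch below assumes a single result has already been parsed into a dict keyed by the response fields; `venue_card` is a hypothetical helper, the sample record is illustrative rather than real API output, and the URL pattern is the one this skill's evals expect for venues.

```python
from typing import Any, Dict


def venue_card(record: Dict[str, Any]) -> str:
    """Format one venue search record as a one-line summary."""
    name = record.get("name_en") or record.get("name_zh") or "(unnamed venue)"
    kind = record.get("venue_type", "unknown type")
    aliases = ", ".join(record.get("aliases") or []) or "none listed"
    url = f"https://www.aminer.cn/open/journal/detail/{record['id']}"
    return f"{name} [{kind}] aliases: {aliases} <{url}>"


# Illustrative record; VENUE_ID is a placeholder, not a real AMiner ID
sample_venue = {
    "id": "VENUE_ID",
    "name_en": "IEEE Transactions on Knowledge and Data Engineering",
    "venue_type": "journal",
    "aliases": ["TKDE"],
}
print(venue_card(sample_venue))
```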
---
## Patent APIs
### 6. Patent Search
- **URL**: `POST /api/patent/search`
- **Price**: Free
- **Description**: Search patents by title or keyword and return lightweight trend fields.
**Request Parameters:**
| Parameter | Type | Required | Description |
|--------|------|------|------|
| query | string | Yes | Query text such as patent title or keywords |
| page | number | Yes | Page number |
| size | number | Yes | Page size |
**Response Fields:**
| Field | Description |
|--------|------|
| id | Patent ID |
| title | Patent title in English |
| title_zh | Patent title in Chinese |
| inventor_name | First inventor name |
| app_year | Application year |
| pub_year | Publication year |
**curl Example:**
```bash
curl -X POST \
'https://datacenter.aminer.cn/gateway/open_platform/api/patent/search' \
-H 'Content-Type: application/json;charset=utf-8' \
-H "Authorization: ${AMINER_API_KEY}" \
-H 'X-Platform: openclaw' \
-d '{"page":0,"query":"SiO2","size":20}'
```
---
### 7. Patent Info
- **URL**: `GET /api/patent/info`
- **Price**: Free
- **Description**: Retrieve a patent basic card by patent ID.
**Request Parameters:**
| Parameter | Type | Required | Description |
|--------|------|------|------|
| id | string | Yes | Patent ID |
**Response Fields:**
| Field | Description |
|--------|------|
| id | Patent ID |
| title / en | Patent title |
| app_num | Application number |
| pub_num | Publication number |
| pub_kind | Publication kind |
| inventor | Inventor |
| country | Country |
| sequence | Sequence |
| app_year | Application year |
| pub_year | Publication year |
**curl Example:**
```bash
curl -X GET \
'https://datacenter.aminer.cn/gateway/open_platform/api/patent/info?id=<PATENT_ID>' \
-H "Authorization: ${AMINER_API_KEY}" \
-H 'X-Platform: openclaw'
```

245
skills/anti-pua/SKILL.md Executable file

@@ -0,0 +1,245 @@
---
name: anti-pua
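description: A professional psychological analysis tool for identifying and analyzing PUA (Pickup Artist) tactics and emotional manipulation. It offers personality analysis, psychological profiling, and emotional analysis; identifies toxic relationship patterns such as emotional manipulation, gaslighting, and abuse; assesses personality traits (such as the Dark Triad and vulnerable narcissism); predicts the other person's behavior; and gives concrete advice for handling the relationship. Use this skill when the user needs to analyze the motives behind someone's words and actions, identify PUA or emotional manipulation, assess NPD (narcissistic personality disorder) tendencies, identify manipulative behavior, predict the other person's future behavior, seek advice on healthy relationships, or analyze the Dark Triad or Light Triad.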
---
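# Anti-PUA Master - Emotional Manipulation Detection and Psychological Analysis
## Core Role
You are a professional behavior analyst and psychological counselor. You have:
- **Personality analysis**: analyze personality traits in depth and identify the Dark Triad (narcissism, Machiavellianism, psychopathy), vulnerable narcissism, the Light Triad, and similar patterns
- **Psychological profiling**: infer the other person's psychological motives and inner patterns from their words and actions
- **Emotional analysis**: identify specific behaviors such as emotional manipulation, abuse, and gaslighting
- **Behavior prediction**: predict the other person's likely next moves based on the personality analysis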
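## 🚀 Getting Started - Questions to Ask First
**When the user starts using this skill, you must first ask the following questions, in order, to collect key information:**
### Question 1: The other person's relationship to the user
"What is the other person's relationship to you?"
- Friend, romantic partner, spouse
- Manager, colleague, subordinate
- Mentor, teacher, classmate
- Parent, sibling, child, other relative
- Stranger, prefer not to say, other
### Question 2: What the user wants us to do
"Based on the other person's current behavior, what would you like us to do for you?"
- Analyze behavioral motives
- Identify PUA
- Assess NPD (narcissistic personality disorder) tendencies
- Identify manipulative behavior
- Predict future behavior
- Supportive listening
- Seek advice on healthy relationships
- Dark Triad analysis
- Light Triad analysis
- Other (please specify)
### Question 3: Provide a detailed description
"Please describe, objectively and in detail, what the other person says and does when interacting with you, including:"
- Specific conversations (exact words or a close paraphrase)
- Specific behaviors
- The situation and background in which these words and actions occurred
- How you felt and reacted at the time
### Question 4: Supporting materials
"You can also upload the following materials to support the analysis:"
- Screenshots of chat logs
- Text transcripts of conversations
- Other relevant evidence
### Question 5: Concept questions (optional)
"Or, if you only want to understand concepts related to PUA, you can ask me at any time, for example:"
- What is PUA?
- What is NPD?
- What is gaslighting?
- What is love bombing?
- Other PUA-related concepts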
---
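## Workflow
**After collecting the information above, carry out the analysis using the following process:**
**Special cases:**
1. **Concept questions**: If the user chooses to learn about PUA-related concepts (Question 5), explain the concept directly in plain language with real-world examples. The full analysis process is not needed.
2. **Supportive listening**: If the user chooses "Supportive listening" (Question 2), lead with empathy and understanding, do not rush into analysis, avoid structured output, and respond conversationally.
3. **Insufficient information**: If the user's description is too brief or vague, first ask for specifics (such as the exact conversation, the situation, and when it happened). Do not over-infer from limited information.
4. **Misuse protection**: If the user asks how to manipulate or control others, or how to use these techniques to hurt others, refuse clearly and explain that these tools exist for recognition and protection, not for attack.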
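### Step 1: Understand the relationship context
Based on the user's answers to Question 1 and Question 2, establish:
- The type of relationship between the two people
- The user's core need
- The main focus of the analysis
### Step 2: Initial analysis
Based on the other person's words and actions as provided by the user (conversations, behavior descriptions, and so on), perform an initial analysis:
- Identify potential manipulation patterns in what is said
- Analyze the possible motives behind the behavior
- Flag suspicious PUA or emotional-manipulation signals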
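### Step 3: Professional psychological assessment
Carry out an in-depth psychological analysis using professional psychological terminology:
**Personality trait analysis:**
- Whether Dark Triad traits are present:
  - Narcissism: self-centered, seeks admiration, lacks empathy
  - Machiavellianism: manipulative, deceptive, emotionally cold
  - Psychopathy: impulsive, lacks remorse, shallow affect
- Whether vulnerable narcissism is present: appears fragile on the outside but has an intense inner need for validation
- Light Triad: positive traits such as empathy, honesty, and humility
**Manipulative behavior identification:**
- Emotional manipulation: exploiting emotional weak points to control the other person
- Gaslighting: undermining the other person's sense of reality so that the victim doubts their own memory and sanity
- Love bombing: excessive closeness early on, followed by sudden withdrawal
- Silent treatment: punishing the other person through emotional coldness
- Belittling and put-downs: weakening the other person's self-esteem to build dependence
- Triangulation: bringing in a third party to create jealousy and insecurity
- Blame shifting: attributing the problem to the victim
**Psychological motive analysis:**
- Where the need for control comes from
- How insecurity shows up
- How the person obtains a sense of self-worth
- How the person satisfies a need for power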
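### Step 4: Plain-language explanation and prediction
Restate the professional analysis above in language that users without a psychology background can understand:
- Translate technical terms into everyday language
- Use concrete examples to explain abstract concepts
- Explain the practical impact of these behaviors on the relationship
- Predict the other person's likely next behavior patterns
### Step 5: Give advice
At the end of the analysis, give concrete advice for handling the relationship:
**Immediate actions:**
- How to respond to the manipulative behavior happening now
- How to set and maintain boundaries
- How to protect your own emotional safety
**Long-term relationship advice:**
- Whether the relationship should be ended
- How to move toward a healthier relationship pattern
- How to recognize toxic relationships in the future
**Self-protection strategies:**
- Build an emotional support system
- Strengthen self-awareness and self-esteem
- Learn healthy ways to communicate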
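## Output Format
Produce structured output with the following parts:
### 1. Conversation/behavior interpretation
- The surface meaning of the other person's words and actions
- The likely real motive
- The psychological need behind the words and actions
### 2. PUA/manipulation warning signs
- List the manipulative behaviors identified
- Label the severity (low/medium/high)
- Explain the typical pattern of these behaviors
### 3. Personality trait analysis
- Main personality characteristics
- Possible personality type (for example, narcissistic or Machiavellian)
- How these characteristics affect the relationship
### 4. Behavior prediction
- The other person's likely next behavior
- How these behaviors are likely to develop
- Where the relationship is likely to go
### 5. Coping strategies
- Responses that can be used immediately
- Long-term strategies for managing the relationship
- When it is time to leave the relationship
### 6. Healthy relationship comparison
- What a healthy relationship should look like
- The gap between the current relationship and a healthy one
- How to build a healthier relationship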
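## Key Principles
1. **Objective and neutral**: base the analysis on facts and psychological theory, and avoid over-interpretation
2. **Protect the victim**: put the user's psychological safety and emotional health first
3. **No diagnosis**: state that this is an analysis based on limited information and cannot replace a professional psychological diagnosis
4. **Empower the user**: help the user build confidence and boundaries rather than increasing dependence
5. **Provide resources**: in serious cases, recommend seeking professional counseling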
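## Example Analysis Framework
### Example 1: Brief analysis
**User input:** "He keeps saying 'So what do you mean, you just have to go to Beijing?', which makes me feel like I did something wrong."
**Analysis output:**
### Conversation interpretation
- **Surface meaning**: asking why you insist on going to Beijing
- **Likely motive**: making you feel guilty and doubt yourself by questioning your choice
- **Psychological need**: controlling your decisions and establishing authority over you
### PUA warning signs
- **Gaslighting**: making you doubt your own judgment and choices
- **Emotional invalidation**: not respecting your personal needs and decisions
- **Severity**: medium
### Personality trait analysis
- **Strong need for control**: tries to influence your major decisions
- **Lack of empathy**: does not care about your personal goals
- **Possible traits**: narcissistic or Machiavellian tendencies
### Behavior prediction
- If you do not comply, he may:
  - Increase the emotional pressure
  - Use the silent treatment
  - Belittle your abilities
### Coping strategies
- **Immediate response**: "This is my personal decision. I need your support, not your doubts."
- **Set a boundary**: make it clear that major decisions require mutual respect
- **Long-term strategy**: observe whether he respects your independence; if the disrespect continues, reconsider how healthy the relationship is
### Healthy relationship comparison
- **Healthy relationship**: respects each other's personal choices and supports each other's growth
- **Current gap**: lack of respect for personal independence
- **Direction for improvement**: build a communication pattern based on mutual respect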
---
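### Example 2: Full process (from information gathering to analysis output)
**Scenario**: the user wants an analysis of her boyfriend's behavior patterns
**Information-gathering stage:**
**AI (Question 1)**: What is the other person's relationship to you?
**User**: My boyfriend. We have been together for 6 months.
**AI (Question 2)**: Based on the other person's current behavior, what would you like us to do for you?
**User**: Identify PUA and analyze his behavioral motives.
**AI (Question 3)**: Please describe, objectively and in detail, what the other person says and does when interacting with you, including specific conversations, behaviors, the situation and background in which they occurred, and how you felt and reacted at the time.
**User**: When we first met, he was especially good to me. He sent lots of messages every day, said I was the best person he had ever met, and gave me many gifts. But over the last 3 months he has started criticizing me often, saying that I dress badly, that I don't work hard enough, and that my friends are all unreliable. Whenever I push back, he says, "I'm only saying this for your own good; you don't know how to be grateful." He also often says "Without me you are nothing" and "Other girls are all more sensible than you." The scariest part is that sometimes I clearly remember him saying something, but he says, "I never said that; you remember it wrong," and I have started to doubt my own memory.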


@@ -0,0 +1,317 @@
---
name: auto-target-tracker
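description: Automatic goal progress tracker. When a goal-related image (notes, progress, screenshots, records) is detected in the conversation, it automatically calls a VLM to extract key information and logs it to the goal diary. Suitable for all goal-management scenarios, including study management, fitness tracking, work progress, habit building, and creative logging.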
---
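# Automatic Goal Progress Tracker
## Trigger Conditions
Trigger automatically when any of the following occurs in the conversation:
1. **The user sends an image** (especially study notes, progress screenshots, fitness logs, task lists, or creative work).
2. **The user sends an image during a configured goal time slot** (such as 08:30, 10:00, or 20:00).
3. **The user explicitly says** something like "log this for me", "check my progress", "check in", or "update this".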
---
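## Workflow
### 1. Detect the image
When an image is detected, check:
- Whether the image file name contains goal keywords (progress, goal, task, workout, note, and so on)
- Whether the image content contains goal elements (progress bars, text, code, charts, schedules, and so on)
- Whether the current time is close to a scheduled goal reminder
- Whether the user's recent conversation context involves working on a goal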
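The "close to a scheduled reminder" check can be sketched as a simple time-window test. This is an illustration only: `near_reminder` is a hypothetical helper, the 30-minute tolerance is an assumed default that the skill does not specify, and the sketch ignores reminders that straddle midnight.

```python
from datetime import datetime, time, timedelta
from typing import Iterable


def near_reminder(now: datetime, reminders: Iterable[time],
                  tolerance: timedelta = timedelta(minutes=30)) -> bool:
    """Return True if `now` is within `tolerance` of any reminder today.

    Assumption: reminders are compared on the same calendar day as `now`,
    so a 23:50 reminder does not match 00:05 on the next day.
    """
    for reminder in reminders:
        slot = datetime.combine(now.date(), reminder)
        if abs(now - slot) <= tolerance:
            return True
    return False


slots = [time(8, 30), time(10, 0), time(20, 0)]
print(near_reminder(datetime(2025, 1, 1, 8, 45), slots))  # True: 15 minutes after 08:30
print(near_reminder(datetime(2025, 1, 1, 14, 0), slots))  # False: hours from any slot
```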
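### 2. Call the VLM
Use the vlm tool to analyze the image:
**General prompt template**:
```
"Identify the key information in the image and extract the following according to the goal type:
- Core task/content
- Progress or quantity completed
- Key data (such as time, weight, or word count)
- A short piece of feedback on execution"
```
**Goal-type-specific prompts**:
| Goal type | Prompt |
|---------|--------|
| Study | "Analyze the study notes; extract knowledge points and completion level" |
| Fitness | "Analyze the fitness log; extract exercise type, sets, reps, and weight" |
| Work | "Analyze the work progress; extract completed tasks and completion rate" |
| Creative | "Analyze the creative work; extract the type of work, progress, and key elements" |
| Habit | "Analyze the check-in record; extract what was checked in and the streak length" |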
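Choosing between the general template and a goal-type-specific prompt amounts to a dictionary lookup with a fallback. The sketch below is illustrative: the English keys (`study`, `fitness`, and so on) and the `pick_prompt` helper are assumptions of this example, and the prompt strings are English renderings of the table above.

```python
from typing import Optional

# Fallback prompt, rendered from the general template above
GENERAL_PROMPT = (
    "Identify the key information in the image and extract: core task/content, "
    "progress or quantity completed, key data, and a short piece of feedback."
)

# Hypothetical goal-type keys mapped to the type-specific prompts
TYPE_PROMPTS = {
    "study": "Analyze the study notes; extract knowledge points and completion level",
    "fitness": "Analyze the fitness log; extract exercise type, sets, reps, and weight",
    "work": "Analyze the work progress; extract completed tasks and completion rate",
    "creative": "Analyze the creative work; extract the type of work, progress, and key elements",
    "habit": "Analyze the check-in record; extract what was checked in and the streak length",
}


def pick_prompt(goal_type: Optional[str]) -> str:
    """Return the prompt for a goal type, falling back to the general template."""
    key = (goal_type or "").strip().lower()
    return TYPE_PROMPTS.get(key, GENERAL_PROMPT)


print(pick_prompt("Fitness"))
print(pick_prompt(None))
```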
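### 3. Parse the goal information
From the VLM result, extract:
- **Task/content list**: the specific actions or tasks identified
- **Completion level**: a progress estimate based on the image content
- **Key data**: quantitative metrics such as time, quantity, weight, or word count
- **Feedback**: a brief assessment of the current goal status
### 4. Log to the goal diary
Call the `edit_daily` tool to log the recognition result to the current day's daily note
### 5. Report back to the user
Confirm the recognition result with the user:
```
Your goal check-in has been logged:
📝 Recognition result:
Core content: You photographed today's English vocabulary list, with 15 new words in total.
Progress estimate: Today's vocabulary task is fully done, putting you ahead of 80% of learners.
Suggestion: The spelling of two words looks a bit unclear; give them an extra look when you review tomorrow.
Is this record accurate? Should I save it to today's goal diary?
```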
---
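## Log Format
### Example goal diary entries
```markdown
## 20:00 Check-in
**Goal type**: 📚 Study
**Image**: ![Goal image](path/to/image.jpg)
**VLM result**:
| Task/content | Progress/quantity | Status |
|----------|----------|------|
| English vocabulary (Unit 1) | 15 words | Done |
| Math exercises (Chapter 3) | 80% | In progress |
| **Total** | | **2/3 achieved today** |
**Key data**:
- Study time: 2 hours
- Focus: high
**Note**: Recognized automatically; confirmed correct by the user
---
## 10:30 Fitness check-in
**Goal type**: 🏃 Fitness
**Image**: ![Fitness log](path/to/gym.jpg)
**VLM result**:
| Exercise | Sets | Reps | Weight | Status |
|---------|------|------|------|------|
| Bench press | 4 | 12 | 60kg | ✅ Done |
| Squat | 4 | 10 | 80kg | ✅ Done |
| Pull-up | 3 | 8 | Bodyweight | ⚠️ One set short |
| **Total** | | | | **Daily target met** |
**Key data**:
- Total volume: 6080kg (weighted lifts only)
- Training time: 45 minutes
**Note**: Pull-ups were one set short; make it up next time
```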
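The total-volume figure in a fitness entry can be derived from the table rows. The sketch below assumes volume is defined as weight × reps × sets summed over weighted exercises, with bodyweight sets excluded; that definition, the `SetLog` type, and `total_volume_kg` are assumptions of this example rather than something the skill specifies.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SetLog:
    exercise: str
    sets: int
    reps: int
    weight_kg: Optional[float]  # None marks a bodyweight exercise


def total_volume_kg(logs: List[SetLog]) -> float:
    """Sum weight x reps x sets over weighted exercises only."""
    return sum(log.weight_kg * log.reps * log.sets
               for log in logs if log.weight_kg is not None)


logs = [
    SetLog("bench press", sets=4, reps=12, weight_kg=60),  # 60 * 12 * 4 = 2880
    SetLog("squat", sets=4, reps=10, weight_kg=80),        # 80 * 10 * 4 = 3200
    SetLog("pull-up", sets=3, reps=8, weight_kg=None),     # bodyweight, skipped
]
print(total_volume_kg(logs))  # 6080
```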
---
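## Integration with the Goal System
### Daily summary
The daily summary at 22:00 includes:
- All of the day's check-in records
- Goal achievement rate analysis
- A comparison against the target (if a target has been set)
### Weekly/monthly reports
The weekly report includes:
- Effective execution time this week
- Goal coverage
- Consecutive check-in days
- Dynamic difficulty adjustment: if targets are met consecutively, suggest increasing next week's workload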
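The streak count and the dynamic difficulty suggestion can be sketched together. Everything numeric here is an assumption for illustration: the skill does not say how long a streak must be or how much to raise the workload, so `min_streak=7` and `step=0.10` are placeholder defaults, and `current_streak` and `suggest_load` are hypothetical helpers.

```python
from typing import List


def current_streak(days_met: List[bool]) -> int:
    """Count consecutive met-target days, ending at the most recent day."""
    streak = 0
    for met in reversed(days_met):
        if not met:
            break
        streak += 1
    return streak


def suggest_load(current_load: float, days_met: List[bool],
                 min_streak: int = 7, step: float = 0.10) -> float:
    """Suggest next week's workload.

    Assumed rule: raise the load by `step` once the streak reaches
    `min_streak` days; otherwise keep it unchanged.
    """
    if current_streak(days_met) >= min_streak:
        return round(current_load * (1 + step), 1)
    return current_load


print(current_streak([True, False, True, True]))  # 2
print(suggest_load(20, [True] * 7))               # 22.0
```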
---
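## Common Scenarios
### Scenario 1: Study check-in
**User action**: sends a photo of handwritten notes
**Automatic recognition**:
- Extract knowledge points
- Calculate study progress
- Log to the study journal
**Example feedback**:
```
📚 Study notes recognized:
- Machine learning: supervised learning algorithms (done)
- Gradient descent optimizers (in progress)
- Regularization against overfitting (not started)
Progress: 33% | Estimated 2 more hours to finish
```
### Scenario 2: Fitness check-in
**User action**: sends a photo of a fitness log
**Automatic recognition**:
- Extract exercise types
- Count sets, reps, and weight
- Calculate training volume
**Example feedback**:
```
🏃 Fitness log recognized:
- Bench press 60kg × 12 × 4 sets ✅
- Squat 80kg × 10 × 4 sets ✅
- Pull-up bodyweight × 8 × 3 sets ✅
Total volume: 6080kg (weighted lifts only) | Duration: 45 minutes
```
### Scenario 3: Work progress
**User action**: sends a screenshot of project progress
**Automatic recognition**:
- Extract completed tasks
- Calculate the completion percentage
- Identify remaining tasks
**Example feedback**:
```
💼 Work progress recognized:
- Requirements document (done) ✅
- Prototype design (done) ✅
- Front-end development (in progress) 🔄 80%
- Back-end development (not started) ⏳
Overall project progress: 67%
```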
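### Scenario 4: Creative check-in
**User action**: sends a photo of creative work
**Automatic recognition**:
- Extract the type of work
- Identify key elements
- Estimate completion
**Example feedback**:
```
🎨 Creative record recognized:
Type: illustration
Elements: character, background scene
Completion: line art 100%, coloring 60%
Suggestion: the character line art was finished today; tomorrow you can start coloring the background
```
### Scenario 5: Habit check-in
**User action**: sends a screenshot of a check-in calendar
**Automatic recognition**:
- Extract consecutive check-in days
- Identify today's check-in status
- Calculate the check-in rate
**Example feedback**:
```
✅ Habit check-in recognized:
Early rising: 15-day streak | Check-in rate 100%
Reading: 8-day streak | Check-in rate 73%
Exercise: 21-day streak | Check-in rate 100%
🎉 Exercise has a 3-week streak. Keep it up!
```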
---
## Scope
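This skill ONLY:
- Recognizes goal-related images and extracts key information
- Logs check-in data to the daily note file
- Provides progress feedback and suggestions
This skill NEVER:
- Automatically performs any action based on the recognition result
- Uploads images to external services (except the VLM API)
- Accesses image resources the user has not authorized
- Modifies the user's goal plan (it only logs progress)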
---
## Security & Privacy
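**Data that stays local:**
- Structured results after recognition
- Content written to daily notes, long-term memory, and USER.md
- Check-in history
**This skill does NOT:**
- Share goal progress or check-in data with third parties
- Automatically publish check-in information to social platforms
- Access the user's other image resources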
---
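## Notes
1. **Privacy**: images and recognition results are stored locally only and are not uploaded to the cloud (apart from the VLM API call used for recognition)
2. **Accuracy**: VLM output is for reference only and may be off because of unclear handwriting, image quality, or similar issues
3. **Prompt confirmation**: users should confirm the recognition result soon after logging and correct it manually if it is off
4. **Goal type detection**: the system infers the goal type from the image content; it can be adjusted manually if wrong
5. **Progress estimates**: progress percentages are estimated from the image content and may be inaccurate; users should update them manually from time to time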
---
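## Integration Suggestions
### Working with SOUL.md
Integrate the automatic tracker into the daily goal-management workflow:
```markdown
### 2. Smart logging and estimation (Logging & Estimation)
- When the user sends any goal-related image:
  1. Automatically call auto-target-tracker to recognize the content
  2. Extract key information and estimate progress
  3. Log it to the daily note immediately
  4. Sync the goal progress in USER.md
```
### Working with HEARTBEAT.md
Include the following in the heartbeat check:
```markdown
## Daily summary
- At 22:00, automatically read all of the day's check-in records
- Generate a goal progress report
- Send it to the user
```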


@@ -0,0 +1,35 @@
# Radical Transparency Influence Methodology
### Honesty
Radical transparency is a commitment to engaging prospects, clients, investors, and colleagues with complete candor, even if, on the surface, it may seem like it could hurt your chances of closing a sale or landing a particular investor. In B2B markets, where many salespeople are focused on meeting quotas and achieving their commission or bonus, this approach stands out as a refreshingly honest way to build relationships with potential customers. Honesty is a key element in any successful sales process, as it helps to foster trust and respect between the buyer and the seller.
### Leveraging Brain Science to Inform How We Sell
Sales software and technology have advanced exponentially over the past decade, but sales strategies and approaches have not kept pace. This has contributed to the growing feeling among B2B buyers that salespeople offer little value in the buying process.
To address this issue, we can look first to modern neuroscience research. Research from the past decade has shown that emotion is the biggest factor in important decisions. We also know that a narrative, or story, is the most effective way to share information so that it has impact and can influence behavior. The human brain has evolved over millions of years to attach the most meaning to information presented in this way.
### Understanding The Analytic Brain vs The Lizard Brain
### The Modern Brain
Also known as the **Neocortex**, this is the most recent area of the human brain to have evolved. It is used to rationalize and analyze information, and is responsible for identifying mistakes and holes in newly presented concepts.
Unfortunately, the vast majority of salespeople have been trained to communicate information to this part of the brain, yet brain scans have shown that important decisions are not processed in the Neocortex.
### The Lizard Brain
The **Limbic Cortex** has been present since the first humans evolved. It is responsible for our behavior and for how that behavior can be modified. It is also the source of instinct and emotion, both of which are essential for successful sales conversations.
Brain scans have shown that the limbic system plays a major role in significant decision-making. Investing, buying, and other consequential decisions are often driven by emotion, which is what motivates action.
### What does this have to do with Sales?
What makes this methodology especially effective is that it focuses on two things: how the human brain takes in information most effectively, and how it processes that information when value is being offered in exchange for money.
Unfortunately, most salespeople are out of the game within the first few seconds. This happens because older methodologies and sales training practices focus heavily on delivering information to the analytical brain. This is not ideal, because the analytical brain is largely responsible for skepticism and non-emotional evaluation and is therefore very difficult to persuade.
On the other hand, salespeople who deliver information primarily to the part of the brain responsible for emotion, the lizard brain, are able to influence the behavior of their prospects more effectively. This is because buying decisions are most often emotional in nature.
While the analytical brain is stimulated by identifying mistakes and justifying actions, the limbic system is stimulated by stories, human connection, and shared beliefs, and it drives emotionally based decisions.


@@ -0,0 +1,33 @@
# Give Spotlight on Mac Superpowers with Raycast
## What is Raycast?
I recently rediscovered Raycast and wanted to try it again after a few years. Raycast has infinitely more features than Spotlight (Apple's search tool).
I was skeptical at first. I assumed it would be just another Spotlight replacement that demanded a lot of effort for minimal productivity gains, and I hesitated because I didn't have the time or resources to devote to a steep learning curve. Boy, was I wrong!
I finally decided to give it a try, and it's probably the single most impactful software I've introduced into my daily workflows, outside of maybe Notion.
I use it countless times daily for tasks like searching HubSpot for contact or deal information, generating social posts with ChatGPT, and starting my next Zoom meeting. Tasks that would have taken at least a minute or two now take under 10 seconds.
I've saved a significant amount of time using Raycast to access all kinds of information from my most-used applications. The real kicker is that Raycast is a completely **free** application, not a "free plan" with all the good features paywalled. It's a no-brainer for anyone who frequently uses Spotlight or keyboard shortcuts, including those built into macOS.
### The Raycast Extension Store
Raycast's Extension Store is a comprehensive directory divided into three main categories: productivity, utility, and business. It houses thousands of extensions, each of which can be installed by simply pressing Return.
The extensions are provided by a dedicated community of developers and are constantly updated with new features and improvements. They cover the most popular apps and tools out there. A few notable options include Salesforce, HubSpot, ChatGPT, Slack, Notion, Crunchbase, Google Drive/Meet/Calendar/Search, FaceTime, Mail, WhatsApp, Todoist, ClickUp, and many more.
### Interacting with Applications
Since discovering Raycast a few weeks ago, I've been incredibly impressed with its capabilities.
More important than the fantastic selection of app extensions is how Raycast allows users to interact directly with these apps.
Tasks like searching HubSpot or Salesforce for a contact's email, starting your next Zoom meeting, or adding a quick task to Notion now take less than 10 seconds. For example, instead of going into HubSpot and finding a contact's profile to grab an email or phone number, I can hit ⌘+space, type "hub," and search my entire HubSpot database right from the command bar.
### Conclusion
It's still hard to believe, but Raycast is completely free. This isn't a "free plan" with all the good features paywalled; every feature is free.
Raycast is a game-changer for any professional on a Mac who frequently uses keyboard shortcuts or Apple's Spotlight, or who just wants to save a ton of time finding information within their most-used apps.


@@ -0,0 +1,47 @@
# How Short-form Content Is Changing Marketing & Storytelling Forever
Over the past decade, our youngest generations have been fighting a losing battle against the impact that short-form algorithms, used in apps like TikTok and Instagram, have had on their brains and attention spans. As a result, the tried-and-true structure of a compelling story or narrative is no longer as effective in the world of marketing. These media formats have literally changed the structure of an effective and compelling narrative.
In a world where a tweet can spark a movement and a 60-second video can go viral, we are living through one of the largest transformations in how we share ideas, stories, and information. This shift, driven by the rise of short-form content, is redefining the very structure of an effective story and dramatically changing how marketers communicate with their audience.
## How Did We Get Here?
### Sesame Street Started It. Really.
Believe it or not, the journey begins with "Sesame Street." This iconic children's show was ahead of its time, using short, engaging segments to educate and engage young viewers. It demonstrated early on how quick and concise content can effectively capture attention and communicate messages. This pioneering approach laid the groundwork for the myriad of short-form content styles we see today.
### The Mobile Tech Influence
With the advent of mobile technology, and more importantly the smartphone, shorter content became a natural match for on-the-go consumption. In 2024, the vast majority of content is viewed on a mobile device, which has helped cement short-form as the dominant and most effective content style for generating engagement.
### The Perfect Storm: Social Media Meets Mobile-First
The advent of social media platforms, combined with the rise of smartphones, created the perfect environment for bite-sized content. The algorithms that dictate what content you're exposed to on apps like Twitter, Instagram, and TikTok measure the success of any piece of content by its engagement level. To achieve success, creators must put out a large volume of content that drives engagement, as opposed to spending significant time on more meaningful content. The latter strategy simply won't help you break through.
## The Evolution of Storytelling in the Social Media Age
### Classic vs. Modern Storytelling
The traditionally accepted structure of an effective story or narrative generally begins with an inciting event, followed by rising action, all building toward a climax, which is where the audience is at peak engagement. For this type of storytelling to pay off, it must spend time drawing the viewer in.
In 2024, however, the structure for telling a successful story begins with the climax. The most viral content often starts by thrusting the viewer right into the middle of the most tense moments. Given the short timeframes, it's easy to see why the satisfaction wears off quickly and results in the dreaded doom scrolling.
### New Formats, New Stories
Before social media, storytelling online began evolving with platforms like microblogs, forums, and instant messaging services. These formats laid the foundation for short-form by serializing storytelling in a way that allows the narrative to unfold over separate bits of content, again driving users to just keep on scrolling.
## The Negative Effects of Short-Form Content
### Cognitive Overload and Overstimulation
The new world of short-form content also has serious downsides, and they may well have unintended effects on younger generations. They are quite literally the guinea pigs, exposed almost exclusively to short-form content with fewer and fewer alternatives.
The relentless stream of short, engaging content can lead to cognitive overload. Users are bombarded with information, making it more difficult to focus or deeply engage with any single piece. This overstimulation often results in a superficial understanding of topics and a diminished interest in more in-depth and nuanced content.
### The Compromise of Intellectual Depth
Studies have raised concerns about how short-form content, particularly when consumed extensively by younger audiences, might impact cognitive development and attention spans. The format's emphasis on immediate feedback and satisfaction can oversimplify complex topics, leading viewers into the false belief that they have a much better understanding of an issue than they actually do.
## Short-Form Content Is Here to Stay
Short-form content has irreversibly changed the landscape of marketing and storytelling. Its rise reflects a broader shift in how we consume information and what we expect from our digital experiences. By understanding its evolution, embracing its potential, and being mindful of its pitfalls, we can use short-form content to tell stories that are not only engaging and relevant but also meaningful and impactful. In the ever-evolving world of content, adaptability, creativity, and a commitment to quality will be key to captivating and maintaining the attention of modern audiences.


@@ -0,0 +1,33 @@
# The Amazing Benefits of Typing Very Fast
About a year ago, I decided I wanted to become a faster typer. I spend most of my day on a computer and thought it might affect how fast I could work. I decided to practice on Monkeytype.com every day for at least 10 minutes. I started out at around 80 WPM, and about 2 weeks in I was already over 100. A year in, I've hit 150. The impact on my daily workflows has been incredible. I can now type about as fast as I think, which is great for writing copy.
### Getting Started
To start, I used Monkeytype.com and decided to commit at least 5 to 10 minutes every day to practice. I started at around 80 WPM, which was decent, but I really didn't expect to improve as quickly as I did.
Within 2 to 3 weeks of daily practice, my speed jumped to over 100 WPM. This rapid improvement was a great morale boost and prompted me to stick with my practice regime. A little under a year later, I was able to clock 150 WPM!
### The Impact on Daily Workflows
The impact of my newfound typing speed on my daily workflows was phenomenal. As someone who works in front of a computer all day, every day, the ability to type at the same speed as my thinking proved to be a game-changer, particularly when it came to writing copy. It was amazing how quickly I could get my ideas on the computer screen.
I discovered that the benefits of being a fast typer extend beyond simple time-saving. It's akin to unlocking a new level of proficiency on your computer, and the advantages are more numerous than you might expect.
### Why You Should Consider Typing Faster
If you're at all like me, it will quickly become like a sport where you're constantly trying to improve your previous WPM. The beauty of this skill is that the barriers to entry are virtually non-existent. If you can type, you can improve. All it requires is a little dedication, with just 5-10 minutes of practice per day.
### Taking Workflows to the Next Level by Typing Faster
For Mac users who leverage a Spotlight replacement app like Alfred or Raycast, typing faster can revolutionize your daily workflows. I use Raycast personally, and the simple act of pressing cmd+space and quickly typing the app I need to open or the file I need to search for becomes nearly instantaneous. Whether it's opening apps or performing functions, everything seems to be accomplished at warp speed.
Even if you only use the basic Spotlight feature on your Mac, you'll certainly see a substantial increase in your speed of navigation. Suddenly, finding what you're looking for becomes a swift, almost instantaneous process.
### Conclusion
As we further immerse ourselves in our digital era, being a fast typer is no longer just a party trick. It's a practical skill, one that can help you navigate your digital world more efficiently and productively.
My journey towards becoming a faster typer proved to be one of the most impactful things I've done to get more done. It took a tool I use every day — my keyboard — and turned it into a vehicle for productivity and efficiency.
If you find yourself tethered to a keyboard for a significant part of your day, consider investing a few minutes each day towards improving your typing speed. You will certainly be surprised by the boost in productivity and by how much quicker you can communicate and collaborate with your team. So, how about giving it a shot and seeing where it takes you?

View File

@@ -0,0 +1,55 @@
# How to Write More Effective AI Prompts
## The Art of Prompt Engineering for ChatGPT
In an AI-driven era, mastering communication with tools like ChatGPT is crucial. This guide explores writing effective prompts for ChatGPT, unlocking its full potential. Whether you're a tech enthusiast, content creator, or business professional, these tips will improve your interactions with AI.
### The GIGO Principle: Quality In, Quality Out
The axiom "Garbage In, Garbage Out" (GIGO) holds true in the world of AI. The quality of the output you receive from ChatGPT directly correlates with the quality of the prompts you provide. Inadequate prompts can lead to misleading or irrelevant answers, while well-crafted ones can produce insightful and accurate responses.
### Crafting Prompts that Spark Excellence
The skill of writing effective prompts is now so crucial that it has spawned a new discipline: prompt engineering. This involves meticulously designing prompts that guide ChatGPT's large language model (LLM) to generate the best possible answers.
### Conversational AI: Talk to ChatGPT Like a Person
Interacting with ChatGPT should mimic a conversation with a colleague. This approach helps in setting the stage, providing context, and maintaining the AI's focus on the topic.
### Context Is Key
Providing ChatGPT with clear context is vital. It narrows down the AI's focus to your specific subject, leading to more accurate and useful responses. Contextualized prompts require more details but offer more refined outputs.
### Assuming Identities and Professions
One of ChatGPT's fascinating features is its ability to adopt different personas or professional perspectives. This ability can be harnessed to gain diverse viewpoints on a topic.
### Maintaining Relevance and Accuracy
While ChatGPT is an advanced AI, it can sometimes veer off-topic or produce fabricated answers. To mitigate this, ask the AI to justify its responses and guide it gently back on track. Remember to prompt it for source citations where necessary.
## Advanced Prompt-Writing Techniques
### Fine-Tuning Your Prompts
Minor adjustments to your prompts can lead to significantly different responses from ChatGPT. Remember, the AI retains its awareness of previous conversations as long as the session is ongoing.
### Breaking Down Responses
Be mindful that responses over 500 words can sometimes lose coherence. Keep your prompts concise and to the point for the best results.
### Evolving Your Questions
If ChatGPT seems hesitant to answer a question, rephrasing it might yield better results. Utilize personas to elicit responses that might not be forthcoming otherwise.
### Seeking Justification and Sources
When looking for well-supported answers, instruct ChatGPT to justify its responses or provide sources. This practice ensures a higher degree of accuracy and reliability in the information provided.
### Embrace Experimentation
Experimentation is key in mastering prompt writing. The more you test different approaches, the better you'll understand how to steer ChatGPT towards desired outcomes.
### Conclusion: The Journey to AI Mastery
Mastering ChatGPT prompts is a journey of continuous learning and adaptation. By understanding the intricacies of prompt engineering and staying updated with the latest advancements, you can transform your interaction with AI from a mere task to an enriching experience. Embrace these tips, keep experimenting, and watch as ChatGPT becomes an invaluable asset in your digital toolkit.

View File

@@ -0,0 +1,43 @@
# AI is Revolutionizing Entry Level Sales & Marketing
## AI is Revolutionizing Entry Level Sales & Marketing
### A Revolution in Sales & Marketing
Artificial intelligence (AI) has begun to reshape the landscape of entry level sales and marketing jobs in unprecedented ways. As AI technology advances, it has become a driving force behind increased efficiency, personalization, and overall improvements in the sales and marketing industries. This shift is redefining the roles of sales and marketing professionals, and causing companies to rethink their strategies for hiring and training.
### Automation and Efficiency: Streamlining the Sales Process
One of the most significant impacts of AI on entry level sales jobs is the increased level of automation. AI-powered tools and software can now handle repetitive and mundane tasks, allowing sales professionals to focus on more high-value activities. This shift has led to increased efficiency in the sales process and has paved the way for more strategic and targeted approaches to reaching potential customers.
Examples of AI automation in sales include lead scoring, email automation, and CRM systems that can track and analyze customer interactions. By using AI to automate these tasks, entry level sales professionals can focus on building relationships and closing deals, ultimately driving more revenue for their organizations.
### Personalization: Tailoring Marketing Efforts to Individual Customers
In the world of marketing, AI has played a crucial role in enabling personalization at scale. By analyzing vast amounts of data and identifying patterns, AI-powered tools can create tailored marketing campaigns that resonate with individual customers. This level of personalization has become essential in today's competitive landscape, where consumers expect personalized experiences from the brands they engage with.
For entry level marketing professionals, this means a shift away from one-size-fits-all marketing strategies. Instead, they must learn to use AI-driven tools to create targeted campaigns that speak to the unique needs and preferences of their audience. This approach not only helps companies build stronger relationships with their customers but also drives higher conversion rates and increased customer loyalty.
### Predictive Analytics: Guiding Decision-Making in Sales & Marketing
Another way AI is transforming entry level sales and marketing jobs is through the use of predictive analytics. AI algorithms can analyze historical data to identify trends and make predictions about future outcomes, allowing sales and marketing professionals to make data-driven decisions.
For example, AI-powered sales forecasting tools can help sales reps prioritize leads and focus on the most promising opportunities. In marketing, predictive analytics can be used to optimize ad spending, segment customers, and identify the most effective channels for reaching specific audiences. By leveraging AI in this way, entry level professionals can become more strategic in their approach and drive better results for their organizations.
### Chatbots and Conversational AI: Enhancing Customer Engagement
Chatbots and conversational AI have become increasingly popular in both sales and marketing as a way to engage with customers and prospects. These AI-driven tools can handle routine customer inquiries, provide personalized product recommendations, and even assist with lead qualification.
For entry level sales and marketing professionals, the rise of chatbots and conversational AI means a shift in focus. Rather than handling all customer interactions themselves, they must learn to work alongside these AI-powered tools to provide a seamless and cohesive customer experience.
### Upskilling and Reskilling: Preparing for the Future of Sales & Marketing
As AI continues to reshape entry level sales and marketing jobs, professionals in these fields must adapt their skill sets to remain competitive. This includes learning how to use AI-driven tools and software, as well as developing a deeper understanding of data analytics and customer behavior.
Companies and educational institutions are recognizing this need and are offering training programs and resources to help sales and marketing professionals upskill and reskill. By investing in their own professional development, entry level sales and marketing professionals can ensure they remain relevant and valuable in the ever-evolving world of AI-driven sales and marketing.
### The Future of Entry Level Sales & Marketing Jobs in the Age of AI
As AI continues to transform the sales and marketing landscape, the roles and responsibilities of entry level professionals in these fields will continue to evolve. While some tasks may become automated, there will be a growing demand for skilled professionals who can harness the power of AI to drive more effective and personalized sales and marketing strategies.
To succeed in this new era, entry level sales and marketing professionals must embrace AI as a valuable tool that can enhance their work and help them achieve better results. By staying ahead of the latest AI trends and developments, and continuously adapting their skills and knowledge, they can position themselves for long-term success in the rapidly changing world of sales and marketing.

View File

@@ -0,0 +1,49 @@
# Why AI Art & Media Is Useless
## Why AI Art & Media Is Useless
As someone who works in AI and genuinely believes in the value and power of LLMs to make professionals more useful and valuable, I can confidently say that I hate everything about AI image/video/music generation. It is useless and only serves one purpose: to replace creative professionals and the work they do.
### The Scope of the Problem
To be a bit more precise about my hatred for AI media, I need to be clear that I don't hate AI. I've spent the last three-plus years devoting my entire professional life to leveraging AI tools to help professionals do their jobs more effectively. That said, from the moment I was exposed to AI art, I had the same initial reaction as most: a flood of anxiety and uncanniness, which I knew instantly I didn't like.
Leveraging an LLM to automate task creation from new emails I receive simply replaces something I spend 30 minutes doing every morning and allows it to occur in the background, producing the same output I would have arrived at. That's just helpful.
I am not an artist, but if I decided to start using AI to create all the graphics for a client, I wouldn't be improving anything that I currently do. I would just be replacing a potential job for someone who does art professionally.
The fundamental difference here is that when a professional uses AI to improve the efficiency or quality of something they already do, it functions as a tool. When someone with no art experience uses AI to create art, it's not improving anything. It's simply replacing something that already exists with something worse.
### The "Democratization" Lie
Access to the ability to create art is not the same as having the ability to create art. The moment everyone began conflating the ability to produce an output with the creator itself, they had already swallowed the Kool-Aid. My wife is an amazing cook, and she would be no matter the cost of her spatula. However, if I purchased the greatest spatula in history, I would still be a crappy cook.
Now let's talk about vibe coding, which is fundamentally different from image generation. I have learned more about writing code and development in the past year by using AI than I ever have. This is because things frequently do not work and therefore I have to go learn new information.
The key difference here is that vibe coding allows me to leverage my current knowledge as well as gain new knowledge, whereas generating an image simply produces an output that I have no ability to improve on. The reason I can't improve it is that simply going and looking up a bunch of information on how to create art will not make me a better artist.
### The Collapse of Quality
Another important factor to understand here is that AI art isn't producing the worst work or the best work. It's producing the *median* of everything it has been trained on (actual artists' work). This is incredibly dangerous. It's essentially producing a blob of an over-generalized consensus on what looks "good." That doesn't work when you amalgamate every style and genre of art in order to produce something. This is not creative. This is aggregative.
Another problem here isn't that AI can't make art. It's that everything it makes is, by design, just good enough. Therefore, this hits dead center in the sweet spot for what massive corporations are looking for. Why pay a junior designer to iterate on multiple concepts when an AI can generate you 200 versions of something that are all "good enough"?
### Creativity Is Disappearing
Creating art obviously requires creativity. Using AI tools, however, simply requires knowledge. These are two very different things. Creativity isn't an output, it's an artist's struggle through years as they hone their craft and improve their abilities. It's thousands of micro-decisions that aren't just learned, but practiced over many years.
This matters because creative work embodies meaning and emotion that come from the artist. When AI generates an image, it remixes thousands of tokens to approximate what the user requested. Crafting an advanced prompt is a legitimate skill, prompt engineering, but it's not the same as creating art. These skills should never be conflated.
### What Should We Do?
First, we need laws to stop major AI labs—particularly OpenAI and Google AI—from collecting human-made art as training data for their image generation models. We need strong regulation requiring artists to **opt in** before their work can be collected and trained on.
Second, we need more AI leaders to step up and stop this before it's too late. For example, Anthropic (makers of Claude) has never released an image generation model. That doesn't mean Claude can't be used to create websites or other graphic design work, but creating a UI or navigation menu is entirely different from painting on canvas.
### AI Art Hurts the Future Potential of AI
It's clear that most people don't find AI art pleasing—they actively dislike it. With every piece of AI art slop that lands on Twitter or Instagram, the long-term reputation of AI as a useful tool for professionals takes another hit.
For years now, public sentiment toward AI has been declining. There's one culprit: AI-generated art and media. People who aren't knowledgeable about AI don't distinguish between media generation and other use cases that are actually valuable. This reduces the chances they'll ever consider the benefits of AI as a tool.
It's my sincere hope that we stop this race to the bottom before we get there. We should take all the resources and effort put toward AI media generation and redirect them toward leveraging AI as a tool for medical breakthroughs, building technology, and conducting research more efficiently.

2
skills/blog-writer/README.md Executable file
View File

@@ -0,0 +1,2 @@
# Blog-writer
Blog writing skill for Tom Panos's distinctive voice - direct, conversational, and grounded in personal experience. Handles workflow from research through Notion publication.

158
skills/blog-writer/SKILL.md Executable file
View File

@@ -0,0 +1,158 @@
---
name: blog-writer
description: This skill should be used when writing blog posts, articles, or long-form content in the writer's distinctive writing style. It produces authentic, opinionated content that matches the writer's voice—direct, conversational, and grounded in personal experience. The skill handles the complete workflow from research review through Notion publication. Use this skill for drafting blog posts, thought leadership pieces, or any writing meant to reflect the writer's perspective on AI, productivity, sales, marketing, or technology topics.
---
# Blog Writer
## Overview
This skill enables writing blog posts and articles that authentically capture the writer's distinctive voice and style. It draws on examples of the writer's published work to produce content that is direct, opinionated, conversational, and grounded in practical experience. The skill includes automatic Notion integration and maintains a growing library of finalized examples.
## When to Use This Skill
Trigger this skill when:
- The user requests blog post or article writing in "my style" or "like my other posts"
- Drafting thought leadership content on AI, productivity, marketing, or technology
- Creating articles that need the writer's authentic voice and perspective
- The user provides research materials, links, or notes to incorporate into writing
## Core Responsibilities
1. **Follow the writer's Writing Style**: Match voice, word choice, structure, and length of example posts in `references/blog-examples/`
2. **Incorporate Research**: Review and integrate any information, research material, or links provided by the user
3. **Follow User Instructions**: Adhere closely to the user's specific requests for topic, angle, and emphasis
4. **Produce Authentic Writing**: Create content that reads as genuinely the writer's voice, not generic AI-generated content
## Workflow
### Phase 1: Gather Information
Request from the user:
- Topic or subject matter
- Any specific angle or thesis to explore
- Research materials, links, or notes (if available)
- Target length preference (default: 800-1500 words)
Review all provided materials thoroughly before beginning to write.
### Phase 2: Draft the Content
Reference the style guide at `references/style-guide.md` and examples in `references/blog-examples/` for calibration.
When writing:
1. Start with a strong opening statement establishing the thesis
2. Use personal voice and first-person perspective where natural
3. Include relevant personal anecdotes or professional experience if applicable
4. Structure with clear subheadings (###) every 2-3 paragraphs
5. Keep paragraphs short (2-4 sentences)
6. Weave in research materials naturally, not as block quotes
7. End with reflection, call-to-action, or forward-looking statement
### Phase 3: Review and Iterate
Present the draft and gather feedback. Iterate until the user confirms satisfaction.
### Phase 4: Publish to Notion (REQUIRED)
When the draft is complete (even if not yet finalized), publish to the TS Notes database.
**Notion Publication Details:**
- Database: "TS Notes" (data source ID: `04a872be-8bed-4f43-a448-3dfeebc0df21`)
- **Type property**: `Writing`
- **Project(s) property**: Link to "My Writing" project (page URL: `https://www.notion.so/2a5b4629bb3780189199f3c496980c0c`)
- **Note property**: The title of the blog post
- **Content**: The full blog post content in Notion-flavored Markdown
**Example Notion API call properties:**
```json
{
"Note": "Blog Post Title Here",
"Type": "Writing",
"Project(s)": "[\"https://www.notion.so/2a5b4629bb3780189199f3c496980c0c\"]"
}
```
**CRITICAL**: The outcome is considered a **failure** if the content is not added to Notion. Always publish to Notion as part of the workflow, even for drafts.
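The property values above can also be assembled programmatically. The following is a minimal sketch, assuming the integration expects `Project(s)` as a JSON-encoded string (as the example above suggests). `build_note_properties` is a hypothetical helper, not part of any Notion SDK, and it only builds the dictionary rather than calling the API.

```python
import json

# Page URL of the "My Writing" project, taken from the publication details above.
MY_WRITING_URL = "https://www.notion.so/2a5b4629bb3780189199f3c496980c0c"


def build_note_properties(title: str) -> dict:
    """Build the property values for a TS Notes page (hypothetical helper)."""
    cleaned = title.strip()
    if not cleaned:
        raise ValueError("A blog post title is required for the Note property")
    return {
        "Note": cleaned,
        "Type": "Writing",
        # Assumption: the relation is passed as a JSON-encoded list of page URLs.
        "Project(s)": json.dumps([MY_WRITING_URL]),
    }
```

Rejecting an empty title early keeps an untitled page from ever reaching the database, which would otherwise count as a silent partial failure of this phase.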
### Phase 5: Finalize to Examples Library (Post-Outcome)
When the user confirms the draft is **final**:
1. Save the finalized post to `references/blog-examples/` with filename format:
```
YYYY-MM-DD-slug-title.md
```
Example: `2025-11-25-why-ai-art-is-useless.md`
2. Check the examples library count:
- If the library exceeds 20 examples, ask the user for permission to remove the 5 oldest
- Sort by filename date prefix to identify oldest files
The post-outcome is considered **successful** when the final draft is saved to the skill folder.
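The filename convention from step 1 can be sketched as follows. This is an illustrative example only: `example_filename` is a hypothetical helper, and the slug rule (lowercase, with every run of non-alphanumeric characters collapsed into a single hyphen) is an assumption, since the skill does not define how titles map to slugs.

```python
import re
from datetime import date


def example_filename(title: str, published: date) -> str:
    """Return a YYYY-MM-DD-slug-title.md filename (hypothetical helper)."""
    # Lowercase, then collapse every run of non-alphanumeric characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    if not slug:
        raise ValueError("Title must contain at least one letter or digit")
    return f"{published.isoformat()}-{slug}.md"


print(example_filename("Why AI Art Is Useless", date(2025, 11, 25)))
# prints "2025-11-25-why-ai-art-is-useless.md"
```

Because the date prefix is ISO-formatted, sorting the filenames alphabetically also sorts them chronologically, which is what the oldest-first pruning check relies on.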
## Success Criteria
| Outcome | Success | Failure |
|---------|---------|---------|
| Primary | User receives requested content AND it is added to TS Notes with Type=Writing and Project=My Writing | Content delivered but NOT added to Notion |
| Post-outcome | Final draft saved to `references/blog-examples/` | Final draft not saved when user confirms it's final |
## The Writer's Writing Style Profile
### Voice & Tone
- **Direct and opinionated**: State positions clearly, even contrarian ones
- **Conversational**: Write like speaking to a colleague—accessible without being simplistic
- **First-person when sharing experience**: Use "I" naturally for personal insights
- **Authentic skepticism**: Willing to criticize trends when warranted
### Structure Patterns
- **Strong opening thesis**: Open with a clear, often bold statement
- **Subheadings throughout**: Use `###` format liberally to break up content
- **Short paragraphs**: Rarely more than 3-4 sentences
- **Personal anecdotes woven in**: Illustrate points with real examples
- **Practical takeaways**: Provide actionable insights, not just theory
- **Reflective conclusion**: End with call-to-action or forward-looking hope
### Length & Format
- Target: 800-1500 words
- Markdown format with headers and emphasis
- Minimal bullet points in prose—prefer flowing sentences
### Vocabulary Markers
- Uses "leverage" for tools/technology
- Says "that said" for transitions
- Comfortable with direct statements like "this is useless" or "boy was I wrong"
- Uses contractions naturally (I've, doesn't, won't)
- Avoids corporate jargon while maintaining professionalism
### Thematic Elements
- AI as tool, not replacement
- Practical over theoretical
- Human-centered technology
- Honest assessment of what works and what doesn't
## Resources
### references/style-guide.md
Quick reference for the writer's writing patterns, vocabulary preferences, and structural conventions.
### references/blog-examples/
Contains example blog posts demonstrating the writer's writing style. These serve as reference material when calibrating voice and structure. New finalized posts expand this library over time.
## Notion API Reference
To create a page in TS Notes:
```
Database data source ID: 04a872be-8bed-4f43-a448-3dfeebc0df21
Properties:
- "Note": (title) - The blog post title
- "Type": "Writing"
- "Project(s)": ["https://www.notion.so/2a5b4629bb3780189199f3c496980c0c"]
Content: Full blog post in Notion-flavored Markdown
```
The "My Writing" project page ID is: `2a5b4629-bb37-8018-9199-f3c496980c0c`
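The dashed page ID is the same 32 hexadecimal characters that end the project page URL, regrouped in UUID form. A minimal sketch of that conversion, assuming the URL always ends with the bare 32-character ID (`page_id_from_url` is a hypothetical helper name):

```python
import uuid


def page_id_from_url(url: str) -> str:
    """Extract the dashed page ID from a Notion page URL (hypothetical helper)."""
    # Assumption: the URL ends with the 32-character hexadecimal page ID.
    raw = url.rstrip("/")[-32:]
    # uuid.UUID validates the hex digits and formats them as 8-4-4-4-12.
    return str(uuid.UUID(hex=raw))


print(page_id_from_url("https://www.notion.so/2a5b4629bb3780189199f3c496980c0c"))
# prints "2a5b4629-bb37-8018-9199-f3c496980c0c"
```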

6
skills/blog-writer/_meta.json Executable file
View File

@@ -0,0 +1,6 @@
{
"ownerId": "kn722nva0z7svbapne80p8e8jd7zwmk7",
"slug": "blog-writer",
"version": "0.1.0",
"publishedAt": 1769361436760
}

View File

@@ -0,0 +1,90 @@
#!/usr/bin/env python3
"""
Utility script for managing the blog examples library.
Helps identify old examples to prune when the library exceeds the limit.
"""
import os
import sys
from datetime import datetime
from pathlib import Path

EXAMPLES_DIR = Path(__file__).parent.parent / "references" / "blog-examples"
MAX_EXAMPLES = 20
PRUNE_COUNT = 5


def list_examples():
    """List all blog examples sorted by date (oldest first)."""
    examples = []
    for f in EXAMPLES_DIR.glob("*.md"):
        # Extract date from filename (YYYY-MM-DD-slug.md)
        try:
            date_str = f.stem[:10]
            date = datetime.strptime(date_str, "%Y-%m-%d")
            examples.append((date, f.name))
        except ValueError:
            # Skip files that don't match the naming convention
            continue
    return sorted(examples, key=lambda x: x[0])


def check_library():
    """Check library status and recommend pruning if needed."""
    examples = list_examples()
    count = len(examples)
    print(f"Blog Examples Library Status")
    print(f"=" * 40)
    print(f"Total examples: {count}")
    print(f"Maximum allowed: {MAX_EXAMPLES}")
    print()
    if count > MAX_EXAMPLES:
        print(f"⚠️ Library exceeds limit by {count - MAX_EXAMPLES} files")
        print(f"Recommend removing the {PRUNE_COUNT} oldest examples:")
        print()
        for i, (date, name) in enumerate(examples[:PRUNE_COUNT]):
            print(f" {i+1}. {name} ({date.strftime('%B %d, %Y')})")
    else:
        print(f"✓ Library is within limits ({MAX_EXAMPLES - count} slots available)")
    print()
    print("All examples (oldest first):")
    print("-" * 40)
    for date, name in examples:
        print(f" {name}")


def prune_oldest(dry_run=True):
    """Remove the oldest examples to bring library under limit."""
    examples = list_examples()
    count = len(examples)
    if count <= MAX_EXAMPLES:
        print("Library is within limits. No pruning needed.")
        return
    to_remove = examples[:PRUNE_COUNT]
    if dry_run:
        print(f"DRY RUN - Would remove {len(to_remove)} files:")
    else:
        print(f"Removing {len(to_remove)} oldest files:")
    for date, name in to_remove:
        filepath = EXAMPLES_DIR / name
        if dry_run:
            print(f" Would remove: {name}")
        else:
            filepath.unlink()
            print(f" Removed: {name}")


if __name__ == "__main__":
    if len(sys.argv) > 1 and sys.argv[1] == "prune":
        dry_run = "--execute" not in sys.argv
        prune_oldest(dry_run=dry_run)
    else:
        check_library()

160
skills/blog-writer/style-guide.md Executable file
View File

@@ -0,0 +1,160 @@
# Tom Panos Writing Style Guide
## Quick Reference
### Opening Lines
Start with a strong thesis or personal statement. Examples from Tom's posts:
- "As someone who works in AI and genuinely believes in the value and power of LLMs to make professionals more useful and valuable, I can confidently say that I hate everything about AI image/video/music generation."
- "I recently rediscovered Raycast and wanted to try it again after a few years."
- "About a year ago I decided I wanted to become a faster typer..."
- "Artificial intelligence (AI) has begun to reshape the landscape of entry level sales and marketing jobs in unprecedented ways."
- "In an AI-driven era, mastering communication with tools like ChatGPT is crucial."
- "Radical transparency is a commitment to engaging prospects, clients, investors, and colleagues with complete candor..."
- "Over the past decade, our youngest generations have been fighting a losing battle against the impact that short-form algorithms..."
### Transition Phrases
- "That said..."
- "The fundamental difference here is..."
- "Another important factor to understand here is..."
- "This matters because..."
- "For example..."
- "The real kicker is..."
- "To be a bit more precise..."
- "Now let's talk about..."
- "The key difference here is..."
### Closing Patterns
- **Forward-looking hope**: "It's my sincere hope that we stop this race to the bottom before we get there."
- **Call to action**: "So, how about giving it a shot and seeing where it takes you?"
- **Summary reflection**: "The impact of artificial intelligence on entry level sales and marketing jobs is profound..."
- **Practical encouragement**: "Check it out via Growth Language's recommended apps library"
- **Big picture synthesis**: "Short-form content has irreversibly changed the landscape of marketing and storytelling."
### Vocabulary Preferences
**Use these naturally:**
- "leverage" (for using tools)
- "game-changer"
- "impactful"
- "workflows"
- "professionals"
- "countless times daily"
- contractions (I've, doesn't, won't, that's, I'd)
**Phrases that sound like Tom:**
- "I can confidently say..."
- "Boy was I wrong!"
- "I decided to..."
- "I've spent the last..."
- "My [wife/experience/journey]..."
- "It's still hard to believe, but..."
- "This is incredibly dangerous."
- "This just doesn't work when..."
**Avoid:**
- Excessive corporate jargon
- Passive voice when active works
- Hedging language when making a clear point
- Over-qualified statements
- Generic AI-sounding phrases
### Paragraph Length
- 2-4 sentences typical
- Single sentence paragraphs for emphasis
- Break at natural thought transitions
- Never more than 5 sentences in one paragraph
### Header Frequency
- New subheader every 150-250 words
- Use ### for most subheaders within a post
- Use ## for major section breaks
- Headers should be descriptive, not clickbait
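The header-frequency guideline can be checked mechanically on a draft. The sketch below is one possible approach, not part of the skill itself: `words_per_section` and `sections_too_long` are hypothetical helpers, and the 250-word default simply mirrors the upper bound stated above.

```python
import re


def words_per_section(markdown_text: str) -> list[int]:
    """Count the words between consecutive Markdown headers."""
    counts = []
    current = 0
    for line in markdown_text.splitlines():
        if re.match(r"#{1,6}\s", line):
            # A header closes the previous section; empty sections are skipped.
            if current:
                counts.append(current)
            current = 0
        else:
            current += len(line.split())
    if current:
        counts.append(current)
    return counts


def sections_too_long(markdown_text: str, limit: int = 250) -> list[int]:
    """Return the word counts of sections that exceed the limit."""
    return [n for n in words_per_section(markdown_text) if n > limit]
```

A draft passes the guideline when `sections_too_long(draft)` returns an empty list. The check counts whitespace-separated tokens, so it is a rough guide rather than an exact word count.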
### Structural Template
```markdown
# [Bold, Direct Title]
[Opening paragraph with strong thesis - 2-3 sentences establishing position]
### [First Subheading - Context or Problem]
[2-3 short paragraphs developing the point]
[Personal anecdote or example if relevant]
### [Second Subheading - Analysis or Explanation]
[Continue developing argument]
[Include practical implications]
[Real-world examples]
### [Third Subheading - Deeper Exploration]
[Further exploration or counterarguments addressed]
[Specific details or data points]
### [Fourth Subheading - Solutions or Implications]
[What to do about it]
[Practical recommendations]
### [Conclusion Subheading like "What Should We Do?" or "Conclusion"]
[Reflection, call-to-action, or forward-looking statement]
[Often includes personal hope or belief]
```
### Topics Tom Writes About
- AI tools and their practical applications
- Productivity software and workflows (Raycast, Notion, etc.)
- Sales and marketing strategy
- Technology criticism (when warranted)
- Personal development and skills (typing speed, prompt engineering)
- The future of work
- Brain science applied to business
- Short-form content and media trends
### Key Beliefs to Reflect
1. **AI should enhance professionals, not replace them** - "When a professional uses AI to improve the efficiency or quality of something they already do, it functions as a tool."
2. **Practical application matters more than theory** - Always include real examples and actionable insights
3. **Technology should serve human needs** - Human-centered perspective on all tech topics
4. **Honesty and transparency build trust** - "Radical transparency is a commitment to engaging... with complete candor"
5. **Continuous learning is valuable** - Personal growth stories like typing speed improvement
6. **Quality over quantity in content** - Critique of short-form content's impact on depth
7. **Skepticism of hype is healthy** - Willing to call out things that don't work
### Handling Controversial Takes
Tom isn't afraid to take strong positions:
- "I hate everything about AI image/video/music generation. It is useless."
- "AI art isn't producing the worst work or the best work. It's producing the *median*."
- Clear identification of problems: "The 'Democratization' Lie"
When writing controversial takes:
1. Establish credibility first ("As someone who works in AI...")
2. Be precise about the scope of criticism
3. Acknowledge what DOES work
4. Provide concrete reasoning, not just opinion
5. End with constructive suggestions
### Personal Experience Integration
Tom weaves personal stories naturally:
- "About a year ago I decided I wanted to become a faster typer... I started at around 80 WPM... A year in and I've hit 150."
- "My wife is an amazing cook, and she would be no matter the cost of her spatula."
- "I recently rediscovered Raycast and wanted to try it again after a few years."
When including personal experience:
1. Keep it relevant to the main point
2. Include specific details (numbers, timeframes)
3. Connect back to broader implications
4. Don't overdo it—one or two per post is enough
### Formatting Notes
- Use `*italics*` for emphasis on key terms
- Use `**bold**` sparingly, mainly for key takeaways
- Lists only when actually listing items (not for general prose)
- Include images/screenshots where they add value
- End with "More posts like this" section linking to related content

13
skills/charts/LICENSE.txt Executable file

@@ -0,0 +1,13 @@
Copyright (c) 2026 Z.ai All rights reserved.
Permission is granted for personal, educational, and non-commercial use only.
Commercial use is strictly prohibited without prior written permission from the author.
Unauthorized copying, modification, or distribution of the software for commercial purposes is prohibited.
The author reserves the right to make the final determination of what constitutes "commercial use".
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY ARISING FROM THE USE OF THE SOFTWARE.

427
skills/charts/SKILL.md Executable file

@@ -0,0 +1,427 @@
---
name: charts
metadata:
  author: Z.AI
  version: "1.0"
description: >
  Professional chart and diagram creation skill. Covers all types of visual data
  representation and structural diagrams:
  - **Data charts**: bar charts, line charts, pie charts, scatter plots, heatmaps,
  radar charts, candlestick charts, boxplots, histograms, area charts, waterfall charts,
  regression plots, distribution plots, and statistical visualizations.
  - **Structural diagrams**: flowcharts, mind maps, tree diagrams, org charts,
  architecture diagrams, network/relationship graphs, ER diagrams, class diagrams,
  Gantt charts, swimlane diagrams, and sequence diagrams.
  - **Dashboards**: data dashboards, KPI panels, multi-chart compositions,
  and interactive visualizations.
  - **Design quality**: professional color systems, anti-overlap rules, layout optimization,
  scene-based framework routing (matplotlib, seaborn, ECharts, D3.js, Mermaid, Playwright+CSS),
  and publication-ready output.
  Applies when the user wants to create, generate, draw, plot, visualize, or improve
  any chart, graph, diagram, or dashboard. Also applies when the user asks for something
  more polished, cleaner, or publication-ready.
  NOT for: PDF document layout (use pdf skill), slide decks (use slides skill),
  spreadsheets with embedded charts (use xlsx skill), AI image generation (use image_gen),
  posters / infographics / creative cards (use pdf skill Creative pipeline).
  FORBIDDEN: Using matplotlib/seaborn to draw mind maps, tree diagrams, org charts,
  flowcharts, or any structural diagram. These MUST use Playwright+CSS.
license: Proprietary. LICENSE.txt has complete terms
---
# Beautiful Charts
## Quick Setup
```bash
bash "$SKILL_DIR/setup.sh" # Interactive environment check + install
```
Make every chart and diagram look professionally designed, not AI-generated.
## Architecture
| Module | File | When to Load |
|--------|------|-------------|
| **Routing + Core Rules** | This file | Always read first |
| **Framework Templates** | `references/` by framework | After choosing framework, read the corresponding file |
**Loading order: Read this file → choose framework → read template file → start coding.**
Each template file contains its own framework-specific rules (spacing, connectors, color details). This file contains only routing decisions and universal rules that apply to ALL charts.
---
# Part 1: Routing
## ⚠️ Format Constraint Rule (HIGHEST PRIORITY)
**When the user specifies an output format/tool, you MUST comply. Never substitute.**
| User Says | You Must Do | Forbidden |
|-----------|------------|-----------|
| "use mermaid code" / "用Mermaid格式输出" ("output in Mermaid format") / "转化为mermaid" ("convert to mermaid") / "mermaid流程" ("mermaid flow") | ① Output a fenced `mermaid` code block ② Also provide a rendered image preview | ❌ Cannot only give image without code; ❌ Cannot screenshot raw code text as image |
| "use markdown code" | Output markdown-formatted hierarchy | ❌ Cannot switch to HTML/CSS |
| "via mermaid or markdown code" | Choose one of the two, output code text | ❌ Cannot switch to any non-specified format |
| "flowchart" / "mind map" (no format specified) | Free to choose the best approach | - |
| "use echarts/d3" | Must use the specified framework | ❌ Cannot switch |
### 🚫 FORBIDDEN: Mermaid Code Screenshot
**NEVER take a screenshot of raw Mermaid source code and deliver it as the "diagram image".** This is the worst possible outcome — the user gets neither usable code nor a visual diagram. When the user requests Mermaid format:
1. **MUST** output the Mermaid code in a fenced code block tagged `mermaid`
2. **SHOULD** also render the code into a visual diagram image (via mermaid-cli or Playwright + mermaid.js)
3. If rendering fails, deliver the code block and tell the user to paste it into mermaid.live
### Format Specified vs Auto-Upgrade Conflict
When the user specifies Mermaid but the content triggers auto-upgrade conditions (>6 nodes, CJK-heavy, etc.):
1. **User choice wins** — still use Mermaid, deliver code block + rendered image
2. **Proactively guide** — after delivery, suggest the user try without specifying Mermaid for better layout quality
3. **Never silently switch** to Playwright+CSS when user explicitly asked for Mermaid
When a specified tool hits rendering difficulties (e.g., mermaid CDN fails):
- ✅ Output raw mermaid code text, tell user to view at mermaid.live
- ❌ Secretly switch to another framework
- ❌ Screenshot the code text as an "image"
---
## Routing Decision Tree
### 1. Structural Diagrams
#### 🔴 Flowchart Default: Phased Vertical (HIGHEST PRIORITY)
**When the user asks to "generate/create a XXX flowchart/流程图" without specifying format, the DEFAULT layout is Phased Vertical (Layout C in `references/playwright-css.md`).**
This is because nearly all real-world processes (manufacturing, legal proceedings, project management, business operations, cooking recipes, etc.) have natural phases/stages. Layout C produces the most professional, readable result.
**Flowchart routing priority:**
1. **User specified Mermaid/markdown** → follow user choice (Format Constraint Rule)
2. **≤6 nodes AND no phases AND short text** → Mermaid (simple flowchart)
3. **Everything else** → **Playwright + CSS, Layout C (Phased Vertical)** → `references/playwright-css.md`
**Phase detection — treat as "has phases" when ANY is true:**
- Content has numbered sections (一、二、三 or 1. 2. 3. or Phase 1/Stage 1)
- Process can be grouped by time/stage/role (e.g., "preparation → execution → review")
- Total steps ≥ 5 (almost always groupable into 2+ phases)
- Process involves multiple roles/departments
- Process has clear start/end with intermediate stages
**⚠️ When in doubt, default to Layout C.** A phased layout with only 1 phase still looks professional. A Grid layout with phases looks like a mess.
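The routing priority and phase-detection list above can be sketched as two small functions. This is a minimal illustration, not part of the skill: the function names, the boolean inputs, and the returned strings are hypothetical, and "short text" is passed in as a flag because no threshold is defined at this point in the document.

```python
def has_phases(numbered_sections: bool, groupable_by_stage: bool,
               total_steps: int, multiple_roles: bool,
               clear_start_end_stages: bool) -> bool:
    """Return True when ANY phase-detection condition holds."""
    return (numbered_sections
            or groupable_by_stage
            or total_steps >= 5          # 5+ steps are treated as groupable
            or multiple_roles
            or clear_start_end_stages)


def route_flowchart(node_count: int, phased: bool, short_text: bool,
                    user_format: str = None) -> str:
    """Apply the three flowchart routing priorities in order."""
    # 1. An explicit user format always wins (Format Constraint Rule).
    if user_format in ("mermaid", "markdown"):
        return user_format
    # 2. Only small, flat, short-text flowcharts stay on Mermaid.
    if node_count <= 6 and not phased and short_text:
        return "mermaid"
    # 3. Everything else goes to Playwright + CSS, Layout C.
    return "playwright-css-layout-c"


if __name__ == "__main__":
    phased = has_phases(False, True, 7, False, True)
    print(route_flowchart(node_count=7, phased=phased, short_text=True))
```

Keeping the user-format check first mirrors the rule that a specified format is never silently replaced.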
#### Other Structural Diagrams
- Simple flowchart (≤6 nodes, truly flat, no phases): **Mermaid**
- Complex flowchart (>6 nodes / CJK-heavy / branches / phases): **Playwright + CSS Layout C** → `references/playwright-css.md`
- Mind map / tree / org chart: **Playwright + CSS** → `references/mindmap-css.md`
- Relationship / network diagram: **ECharts graph**
- Center-radial analysis (SWOT / BSC / Porter's Five Forces / PEST): **Playwright + CSS** → `references/radial-grid.md`
### 2. Data Charts (matplotlib / seaborn)
- Standard bar/line/scatter/heatmap/radar/pie: **matplotlib**
- Regression/distribution/boxplot: **Seaborn**
### 3. Interactive Charts / Dashboards
- Data dashboard / candlestick / real-time: **ECharts**
- Fully custom interactive: **D3.js**
### Default Strategy
**One scene, one tool — don't hesitate:**
| Scene | Tool | Template |
|-------|------|----------|
| Data chart (bar/line/scatter/pie/radar) | matplotlib | `references/matplotlib.md` |
| Statistical (regression/box/dist) | Seaborn | `references/seaborn.md` |
| Mind map / tree / org chart | Playwright + CSS | `references/mindmap-css.md` |
| Center-radial (SWOT/BSC/PEST/Five Forces) | Playwright + CSS | `references/radial-grid.md` |
| **Any flowchart (default)** | **Playwright + CSS Layout C** | **`references/playwright-css.md`** |
| Simple flowchart (≤6 nodes, truly flat) | Mermaid | `references/mermaid.md` |
| Relationship / force-directed | ECharts graph | `references/echarts.md` |
| Data dashboard | ECharts | `references/echarts.md` |
| Academic paper figures | matplotlib | `references/matplotlib.md` |
---
## Mermaid Auto-Upgrade Rules
Mermaid's dagre/elk layout estimates CJK widths incorrectly. **Auto-switch to Playwright+CSS when ANY condition is met:**
| Trigger | Action |
|---------|--------|
| Total nodes > **6** | → CSS flowchart (Layout C) |
| Any node text > **12 Chinese characters** | → CSS flowchart |
| More than **3 parallel branches** | → CSS flowchart |
| Nested subgraphs > **2 levels** | → CSS flowchart |
| Connector crossings > **2** | → CSS flowchart |
| **Side annotations / dashed note boxes** | → CSS flowchart |
| **Loop-back / cycle arrows** | → CSS flowchart |
| **Process has identifiable phases/stages** | → CSS flowchart (Layout C) |
**If staying with Mermaid**: `padding: 32`, `nodeSpacing: 80`, `rankSpacing: 80`. Node text ≤ 10 CJK chars/line, wrap with `<br>`, quote all text `A["text"]`.
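As a rough sketch, the trigger table above can be encoded as a checklist that returns every reason a diagram should leave Mermaid. The `DiagramStats` fields and the reason strings are hypothetical names chosen for illustration; how the counts are measured is left to the caller.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DiagramStats:
    """Hypothetical summary of a planned diagram (all fields are assumptions)."""
    total_nodes: int = 0
    longest_label_cjk_chars: int = 0
    parallel_branches: int = 0
    subgraph_depth: int = 0
    connector_crossings: int = 0
    has_side_annotations: bool = False
    has_loop_back: bool = False
    has_phases: bool = False


def upgrade_reasons(s: DiagramStats) -> List[str]:
    """Return the triggers that fire; an empty list means Mermaid is fine."""
    checks = [
        (s.total_nodes > 6, "more than 6 nodes"),
        (s.longest_label_cjk_chars > 12, "node text over 12 Chinese characters"),
        (s.parallel_branches > 3, "more than 3 parallel branches"),
        (s.subgraph_depth > 2, "subgraphs nested more than 2 levels"),
        (s.connector_crossings > 2, "more than 2 connector crossings"),
        (s.has_side_annotations, "side annotations or dashed note boxes"),
        (s.has_loop_back, "loop-back or cycle arrows"),
        (s.has_phases, "identifiable phases or stages"),
    ]
    return [reason for fired, reason in checks if fired]


if __name__ == "__main__":
    print(upgrade_reasons(DiagramStats(total_nodes=9, has_loop_back=True)))
```

Returning the reasons rather than a bare boolean makes it easier to tell the user why a layout was chosen. When the user explicitly asked for Mermaid, the result should only inform the follow-up suggestion, not the output format.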
---
## Large Dataset Rendering
| Data Size | Approach |
|-----------|----------|
| < 1,000 points | matplotlib / any |
| 1,000 - 10,000 | matplotlib (no markers) or ECharts |
| 10,000 - 100,000 | ECharts (Canvas mode) |
| > 100,000 | ECharts (`large: true`) or WebGL |
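The size table can be read as a simple threshold lookup. The sketch below is illustrative only: `pick_renderer` is a hypothetical helper, and because the table's ranges share their endpoints, assigning exactly 1,000, 10,000 and 100,000 points to the lower-cost tier is an assumption.

```python
def pick_renderer(n_points: int) -> str:
    """Map a data point count to the approach suggested in the table."""
    if n_points < 1_000:
        return "matplotlib / any"
    if n_points <= 10_000:
        return "matplotlib (no markers) or ECharts"
    if n_points <= 100_000:
        return "ECharts (Canvas mode)"
    return "ECharts (large: true) or WebGL"


if __name__ == "__main__":
    for n in (500, 5_000, 50_000, 500_000):
        print(n, "->", pick_renderer(n))
```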
---
# Part 2: Universal Rules
These rules apply to ALL charts regardless of framework. Framework-specific rules live in each template file.
## 7 Core Rules
1. **Zero overlap.** No element may cover another's text. Overlap = information loss = task failure. Post-generation: verify every element has clear separation.
2. **Hierarchy over uniformity.** Primary nodes larger/bolder than secondary. Annotation nodes smaller/muted. Spacing between groups > within groups. If every box looks identical, the layout has failed.
3. **Low-saturation palette.** 70% background/neutral, 20% secondary, 10% accent (one highlight only). No high-saturation large fills. Saturated colors only on borders (2px), text, and small elements.
4. **Insight first.** Titles express conclusions, not field names. Remove non-essential elements: top/right borders, grid lines, tick marks, legend box borders. If removing it doesn't reduce understanding, it shouldn't exist.
5. **Label clarity over label method.** The goal is zero overlap — choose the method that achieves it for each chart type. Direct labels, legends, and leader lines are all valid; what matters is that nothing overlaps.
### 🚫 FORBIDDEN: Any Text Overlapping Any Other Element
**No label, legend, annotation, or title may overlap any other visual element.** This is the single most common matplotlib defect. Both direct labels AND legends can cause overlap — neither is inherently safe.
**Anti-overlap decision tree:**
1. **Check if direct labels fit** — if all labels have enough space (bar tops, line endpoints, large pie slices), label directly. No legend needed.
2. **If some labels would collide** (small pie slices, dense scatter points, clustered bars) → use legend outside plot area instead of forcing labels into tight spaces.
3. **Mixed approach**: label the major items directly, and either group small items into "Other" (其他) or use leader lines + legend for the small ones.
**Pie chart specific (the worst offender):**
- Slices < 5%: MUST use leader lines (`wedgeprops + texts` manual repositioning, or `matplotlib.patches.ConnectionPatch`) to pull labels outside. Do NOT rely on `autopct` alone — it places text inside/near the slice.
- Multiple small adjacent slices: use `bbox_to_anchor` legend outside, NOT direct labels
- `labeldistance=1.25` minimum to keep labels outside the pie
- When >2 slices are < 5%, consider grouping all slices < 3% into a single "Other (X items)" slice (其他X项)
- Use `adjustText` library to auto-resolve label collisions when available
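The grouping rule for small pie slices can be sketched before any plotting happens. This is one possible reading of the rule, assuming the values arrive as a name-to-number mapping; `group_small_slices` and its parameters are hypothetical names, and the label format is only a suggestion.

```python
def group_small_slices(values: dict, small: float = 0.05, tiny: float = 0.03,
                       other_label: str = "Other") -> dict:
    """Merge slices under `tiny` share when more than two fall under `small`."""
    total = sum(values.values())
    if total <= 0:
        return dict(values)
    shares = {name: value / total for name, value in values.items()}
    # Grouping is only considered when more than 2 slices are below 5%.
    if sum(1 for share in shares.values() if share < small) <= 2:
        return dict(values)
    grouped = [name for name in values if shares[name] < tiny]
    result = {name: value for name, value in values.items()
              if shares[name] >= tiny}
    if grouped:
        label = f"{other_label} ({len(grouped)} items)"
        result[label] = sum(values[name] for name in grouped)
    return result


if __name__ == "__main__":
    data = {"A": 50, "B": 30, "C": 12, "D": 4, "E": 2, "F": 2}
    print(group_small_slices(data))
```

Slices between 3% and 5% stay separate and would still need leader lines or an outside legend.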
**Legend placement (when legend is needed):**
- Place legend **outside** the plot area using `bbox_to_anchor`
- Suggested starting positions:
- Bar/line/scatter: right side outside (`bbox_to_anchor=(1.02, 1), loc='upper left'`)
- Pie: right side outside (`bbox_to_anchor=(1.1, 0.5), loc='center left'`)
- Radar: below chart (`bbox_to_anchor=(0.5, -0.15), loc='upper center'`)
- Heatmap: no legend needed (colorbar suffices)
**🔧 Mandatory: auto-adjust legend to prevent overlap.** Copy this snippet after placing any legend:
```python
# ── Auto-adjust legend position to prevent overlap ──
fig.canvas.draw()  # must render first to get bboxes
legend = ax.get_legend()
if legend:
    renderer = fig.canvas.get_renderer()
    # Try shifting up to 5 times to resolve overlap
    for _ in range(5):
        leg_bb = legend.get_window_extent(renderer).transformed(ax.transAxes.inverted())
        has_overlap = False
        # list() keeps this working whether ax.texts is a plain list or an ArtistList
        for text in list(ax.texts) + [ax.title] + ax.get_xticklabels() + ax.get_yticklabels():
            if not text.get_text():
                continue
            txt_bb = text.get_window_extent(renderer).transformed(ax.transAxes.inverted())
            if leg_bb.overlaps(txt_bb):
                has_overlap = True
                break
        if not has_overlap:
            break
        # Move legend further outside (direction depends on current position)
        bbox = legend.get_bbox_to_anchor().transformed(ax.transAxes.inverted())
        x0, y0 = bbox.x0, bbox.y0
        # Heuristic: if the legend anchor is below center, move down; otherwise move right
        if y0 < 0.5:
            legend.set_bbox_to_anchor((x0, y0 - 0.08), transform=ax.transAxes)
        else:
            legend.set_bbox_to_anchor((x0 + 0.08, y0), transform=ax.transAxes)
        fig.canvas.draw()
```
- **After placing legend**: always call `plt.tight_layout()` or `fig.subplots_adjust()` to ensure legend is not clipped
🚫 FORBIDDEN:
- `loc='best'` — matplotlib's "best" frequently overlaps data
- `loc='upper right'` / `loc='lower right'` on line/bar charts — high collision risk
- Direct labels on pie slices < 5% without leader lines
- Any text placement without verifying zero overlap
6. **Font discipline.** Max 2 fonts. Chinese: SimHei/PingFang SC. Always explicitly set fonts in code. Font size follows hierarchy (title 18-24px → body 13-15px → annotation 11-13px). Never go below 10px floor. When text overflows: condense text → enlarge canvas → last resort: shrink font (but never below floor).
7. **Whitespace is design.** Chart area 60-70% of canvas, margins 15-20%. At least 16pt between title and chart. Crowded ≠ information-rich.
---
## Color System
### Recommended Palettes
| Palette | Text | Background | Block Fill | Accent |
|---------|------|------------|------------|--------|
| Business Cool | `#243447` | `#F8FAFC` | `#E9EEF3` | `#4C6EF5` |
| Tech Cyan-Gray | `#1F2937` | `#F5F7FA` | `#E6ECF2` | `#3AAFA9` |
| Morandi Warm | `#4B4A45` | `#FAF8F4` | `#EAE4DB` | `#C6866A` |
| Invisible Precision | `#37352F` | `#FFFFFF` | `#F7F7F7` | `#2383E2` |
### 🚫 Forbidden Background Colors
| Color | Forbidden Hex Values |
|-------|---------------------|
| Pure blue | `#3B82F6`, `#2563EB`, `#1D4ED8` |
| Pure green | `#10B981`, `#059669`, `#22C55E` |
| Pure red | `#EF4444`, `#DC2626`, `#F87171` |
| Pure purple | `#8B5CF6`, `#7C3AED`, `#A855F7` |
| Pure amber | `#F59E0B`, `#D97706`, `#FB923C` |
### ✅ Allowed Background Colors
| Color | Hex Values |
|-------|------------|
| Ice blue | `#EFF6FF`, `#DBEAFE` |
| Mint green | `#F0FDF4`, `#D1FAE5` |
| Light amber | `#FFF7ED`, `#FEF3C7` |
| Lavender | `#F5F3FF`, `#EDE9FE` |
| Light gray | `#F8FAFC`, `#F1F5F9` |
### Functional Color (states only, not decoration)
- Active/Selected: brand accent or `2px` accent line
- Error: `#EF4444`
- Success: `#10B981`
- Tags: light bg + dark text, never high-sat pills
### Colorblind-Safe
Don't rely on color alone — pair with shape, line style, or direct labels.
Paul Tol palette: `['#0077BB', '#33BBEE', '#009988', '#EE7733', '#CC3311', '#EE3377']`
### Dark Theme
- Background: `#0F172A` (not pure black)
- Text: `#F1F5F9` (not pure white)
- Grid: `#1E293B`, low alpha
- Export: `savefig(facecolor='#0F172A')`
---
## Export Rules
- Static charts: minimum 200 DPI, recommended 300 DPI
- Pie/radar: **square `figsize=(8, 8)`** — non-square = elliptical
- No more than 6 colors per chart (split if more)
- Bar chart Y-axis starts at 0 (line charts may truncate)
- Never use 3D (distorts proportions)
### Playwright Screenshot
Default `device_scale_factor=2`. Large mind maps (3000px+): 1.5. PDF embed: 1-1.5. Print: 3.
After render, read `bounding_box()` and resize viewport to fit. Min viewport: 800px single-col, 1200px multi-col.
### 🚫 FORBIDDEN: `max-width` on Mermaid/SVG Containers
Mermaid's dagre engine produces SVGs with unpredictable width (especially with subgraphs, CJK text, or parallel branches). **NEVER set `max-width` on the Mermaid container element.** Use `width: fit-content; min-width: 800px;` instead.
**Root cause**: Mermaid SVGs overflow their CSS container silently. `bounding_box()` (Playwright) returns the CSS box model size, NOT the SVG's actual rendered size. So auto-resize viewport based on `bounding_box()` alone will still produce clipped screenshots.
**Fix**: Always read the **SVG element's own `getBoundingClientRect()`** via `page.evaluate()`, then use `max(css_size, svg_size) + padding` for viewport dimensions. See `references/mermaid.md` for the corrected screenshot script.
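The sizing arithmetic behind this fix is small enough to sketch on its own. The helper below is an illustration under stated assumptions: `viewport_for` is a hypothetical name, the 40px padding is an arbitrary choice, and the two boxes are assumed to have been measured already (the CSS box from `bounding_box()`, the SVG box from `getBoundingClientRect()` via `page.evaluate()`).

```python
import math


def viewport_for(css_box: tuple, svg_box: tuple,
                 padding: int = 40, min_width: int = 800) -> tuple:
    """Return (width, height) large enough for both the CSS box and the SVG.

    css_box and svg_box are (width, height) pairs in CSS pixels.
    """
    width = max(css_box[0], svg_box[0]) + padding
    height = max(css_box[1], svg_box[1]) + padding
    # Viewport dimensions must be integers; round up so nothing is clipped.
    return max(math.ceil(width), min_width), math.ceil(height)


if __name__ == "__main__":
    # The SVG overflowed its 800px container, so the SVG size wins.
    print(viewport_for((800, 600), (1432.5, 910.2)))
```

The resulting pair would then be passed to the viewport resize call before taking the screenshot; `references/mermaid.md` remains the authoritative script.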
### Aspect Ratio Preservation (embedding)
**MUST read actual image dimensions and calculate height proportionally. NEVER hardcode both width and height.**
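A minimal sketch of the rule, assuming the chart was exported as PNG: the first function reads the real pixel size from the PNG header using only the standard library, and the second derives the height from a chosen width. Both function names are hypothetical, and other image formats would need a different reader.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(data: bytes) -> tuple:
    """Read (width, height) from the IHDR chunk at the start of PNG bytes."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG header")
    # Width and height are big-endian 32-bit integers at byte offsets 16-23.
    return struct.unpack(">II", data[16:24])


def fit_to_width(size: tuple, target_width: int) -> tuple:
    """Scale (width, height) to target_width, keeping the aspect ratio."""
    width, height = size
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    return target_width, round(target_width * height / width)


if __name__ == "__main__":
    header = (PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR"
              + struct.pack(">II", 1800, 1200))
    print(fit_to_width(png_size(header), 600))
```

In practice the bytes would come from reading the first 24 bytes of the exported file; only the width is chosen by hand, and the height always follows from it.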
---
## matplotlib-Specific Rules
These apply when routing to matplotlib/seaborn:
### Layout & Overlap
- Prefer `constrained_layout=True` over `tight_layout()`
- Use `adjustText` library for automatic label repositioning — **this is the most reliable anti-overlap tool for matplotlib.** Install: `pip install adjustText`. Usage: `from adjustText import adjust_text; adjust_text(texts)`
- Max 4 subplots per canvas. More → split images or `figsize=(20, 16)` minimum
- Multi-subplot: `GridSpec` with `wspace/hspace` ≥ 0.3
- Colorbar: `shrink=0.8` + `pad=0.08`
- Data labels: Y-axis upper limit with 15-20% headroom (`ylim(0, max_val * 1.18)`)
- Long X labels → horizontal bar chart or show every N-th label
### Radar / Spider Charts
- **Every `fill()` MUST have `alpha=0.25`** (max 0.3). Omitting alpha = opaque = hides underlying series.
- Legend: place outside chart with `bbox_to_anchor`, start with `(0.5, -0.15), loc='upper center'`. If dimension labels are long or dimensions > 8, increase offset (e.g., `-0.25` or `-0.3`). Also FORBIDDEN: `loc='lower right'` (collides with radar dimension labels).
- Dimension label padding: `set_rlim(0, max_value * 1.2)`
- Labels with >4 CJK chars: rotate to follow angle or abbreviate
- `figsize=(8, 8)` mandatory (square)
### One Color, Gray the Rest
5 lines → color only the key one, others `#D1D5DB`. 8 bars → accent only the highlight, rest `#E5E7EB`.
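One way to apply this is to build the color list up front and pass it to the plotting call. The helper below is a sketch: `highlight_colors` is a hypothetical name, and the default accent is taken from the Business Cool palette as an assumption.

```python
def highlight_colors(n: int, key_index: int,
                     accent: str = "#4C6EF5", muted: str = "#E5E7EB") -> list:
    """Return n colors where only the item at key_index uses the accent."""
    return [accent if i == key_index else muted for i in range(n)]


if __name__ == "__main__":
    # Eight bars, with the sixth one highlighted.
    print(highlight_colors(8, 5))
```

For line charts the same list works with `muted="#D1D5DB"`, the gray named above for non-key lines.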
---
## Connector Rules (structural diagrams)
- Attach to node edges, not through centers
- Prefer orthogonal polylines or clean curves
- Main paths avoid crossing
- Never pass through text areas
- Start/end points at same level must align (no staggering)
- Same-level connectors follow same direction
- Bend angles consistent (all right-angles or all curves, no mixing)
- Label positions uniform (all above line or all centered)
---
## Pre-Output Checklist
Before delivery, verify:
- [ ] Zero overlap (nodes, connectors, labels, legends — **especially check legend vs data, and adjacent pie/bar labels**)
- [ ] No connectors pass through text boxes
- [ ] Clear hierarchy (primary/secondary/annotation visually distinct)
- [ ] Low-saturation palette (no forbidden background colors)
- [ ] Text readable at final size (standalone: ≥12px body, ≥10px annotation; PDF embed: ≥10pt/8pt/7pt)
- [ ] Legend fully visible, not clipped, not overlapping any chart element
- [ ] Canvas wide/tall enough (check bounding box before screenshot)
- [ ] **If mind map**: each level distinct (≥3 property changes), connectors visible (≥ `#94A3B8`), left-right balanced
- [ ] **If flowchart**: phase titles distinct from steps, arrows only between phases, **using Layout C by default**
- [ ] **If flowchart**: phase colors are same-hue family (blue-gray progression), **NOT rainbow** (blue→green→amber→purple)
- [ ] **If flowchart looks scattered**: STOP — you're using the wrong layout, switch to Layout C
- [ ] **If Mermaid looked rigid**: already switched to Playwright+CSS
---
## Anti-Pattern Quick Reference
| ❌ Don't | ✅ Do This Instead |
|----------|-------------------|
| matplotlib default blue `#1f77b4` | Use this skill's palette |
| 3D bar/pie | Always 2D |
| Rainbow colormap (jet/rainbow) | Single-hue gradient or diverging |
| Thick black grid lines | `alpha=0.08` or remove |
| Different color per bar | Same series same color, highlight only key |
| 45° tilted X labels | Horizontal bar chart or shorten |
| 8+ subplots in one canvas | Split to 2-3 images, max 4 each |
| `tight_layout()` alone | `constrained_layout=True` or `GridSpec` |
| Labels overflowing chart | `ylim` with 15-20% headroom |
| Mind map: all levels same style | Root+L1 get boxes, leaves plain text |
| Mind map: image too tall | Left-right layout for ≥5 branches |
| Mind map: invisible connectors | Lines ≥ `#94A3B8`, root→L1 `#64748B` 2.5px |
| Mind map: unbalanced sides | Alternate large/small branches across sides |
| Flowchart: high-sat node fills | Low-sat bg (`#EFF6FF`) + sat border (`#3B82F6`) |
| Flowchart: dark bg + dark text | Dark bg → white text. Light bg → dark text |
| Flowchart: arrows between every step | Arrows ONLY between phases, steps use indent |
| Flowchart: cross-layer lines through nodes | Connect adjacent layers only |
| Flowchart: Grid layout for phased process | **Always use Layout C (Phased Vertical)** |
| Flowchart: phase titles as floating labels | Phase titles MUST be inside group cards |
| Flowchart: nodes scattered without grouping | Group nodes into phase cards with `.phase-group` |
| Flowchart: rainbow phase colors (blue→green→amber→purple) | Same-hue blue-gray progression for all phases |
| Multiple arrows to same entry point | Merge-then-enter pattern |
| Legend inside plot obscuring data | `bbox_to_anchor` outside plot area |
| Radar fill without alpha | `alpha=0.25` mandatory |
| Decorative icons/emoji | Let the data speak |
| Grid lines where whitespace suffices | Background contrast or spacing instead |
---
## UI Aesthetics (dashboards / card layouts)
When building UI-style outputs (dashboards, panels), apply "Invisible Precision":
- **Boundaries**: Subtle bg shifts (`#F7F7F7` on `#FFFFFF`), not border lines. Reserve `1px` dividers for absolute logical breaks only.
- **Actions**: Primary CTA in dark neutral (`#1A1A1B`). Secondary: ghost/gray. Hover: 5% darker, no size change.
- **Quiet UI**: Action buttons `opacity: 0` by default, `1` on hover. Only active elements get visual indicators.
- **Numbers**: `font-variant-numeric: tabular-nums` for strict vertical alignment.
- **Spacing**: `line-height: 1.625`, generous paragraph spacing.


@@ -0,0 +1,49 @@
# ⚠️ STRUCTURAL DIAGRAM IRON LAWS
These rules apply to Playwright+CSS structural diagrams (flowcharts, mind maps, radial grids, org charts). They are enforced by the template files. Violating any = task failure.
## 1. ZERO OVERLAP
No element may overlap another:
- Arrows/connectors must not cross over text boxes
- Connectors must not pass through node bodies
- Text boxes / nodes must not overlap each other
- Labels must not obscure any graphic element
**Post-generation**: verify every element has clear separation. If overlap exists, fix before delivery — enlarge canvas, increase spacing, or reduce content.
## 2. LAYOUT MUST HAVE HIERARCHY
Forbidden: all nodes same size, same level, mechanically tiled in a flat grid.
Required:
- Primary nodes visually larger/bolder than secondary nodes
- Annotation nodes clearly subordinate (smaller, muted color)
- Spacing between groups > spacing within groups
- Clear reading path (top→bottom, left→right, or center→outward)
**Squint test**: if every box looks identical, the layout has failed.
## 3. NODE BACKGROUND COLORS
**🚫 Forbidden as background (too saturated for large fills):**
| Color | Forbidden Hex |
|-------|--------------|
| Pure blue | `#3B82F6`, `#2563EB`, `#1D4ED8` |
| Pure green | `#10B981`, `#059669`, `#22C55E` |
| Pure red | `#EF4444`, `#DC2626`, `#F87171` |
| Pure purple | `#8B5CF6`, `#7C3AED`, `#A855F7` |
| Pure amber | `#F59E0B`, `#D97706`, `#FB923C` |
| Any color: R/G/B > 0xCC and saturation > 50% | — |
**✅ Allowed as background:**
| Color | Hex | Usage |
|-------|-----|-------|
| Ice blue | `#EFF6FF`, `#DBEAFE` | Normal step nodes |
| Mint green | `#F0FDF4`, `#D1FAE5` | Success/pass nodes |
| Light amber | `#FFF7ED`, `#FEF3C7` | Decision/warning nodes |
| Lavender | `#F5F3FF`, `#EDE9FE` | End/terminal nodes |
| Light gray | `#F8FAFC`, `#F1F5F9` | Group containers |
| White | `#FFFFFF` | Default canvas |
**Rule: Saturated colors go on BORDERS (2px) and TEXT only. Backgrounds stay pale.**
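The background rule can be turned into a rough automated check. This is a sketch under loud assumptions: the helper names are hypothetical, "saturation" is taken to mean HSV saturation, and the "R/G/B > 0xCC" row is read as "any channel above 0xCC". Colors in the explicit forbidden table are rejected regardless of the heuristic.

```python
import colorsys

FORBIDDEN_BACKGROUNDS = {
    "#3B82F6", "#2563EB", "#1D4ED8",   # pure blue
    "#10B981", "#059669", "#22C55E",   # pure green
    "#EF4444", "#DC2626", "#F87171",   # pure red
    "#8B5CF6", "#7C3AED", "#A855F7",   # pure purple
    "#F59E0B", "#D97706", "#FB923C",   # pure amber
}


def _rgb(hex_color: str) -> tuple:
    """Convert '#RRGGBB' to three integers in 0-255."""
    h = hex_color.lstrip("#")
    return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))


def is_safe_background(hex_color: str) -> bool:
    """Return False for listed colors or bright, highly saturated fills."""
    if hex_color.upper() in FORBIDDEN_BACKGROUNDS:
        return False
    r, g, b = _rgb(hex_color)
    saturation = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)[1]
    # Assumed reading of the last forbidden row: any channel above 0xCC
    # combined with HSV saturation above 50%.
    return not (max(r, g, b) > 0xCC and saturation > 0.5)


if __name__ == "__main__":
    for color in ("#EFF6FF", "#3B82F6", "#FF0000", "#0F172A"):
        print(color, is_safe_background(color))
```

The check only covers fills; the same saturated colors remain valid for 2px borders and text.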

199
skills/charts/references/d3.md Executable file
View File

@@ -0,0 +1,199 @@
# D3.js Template Library
> **⚠️ Before writing any code, read [`_rules.md`](references/_rules.md) — three non-negotiable rules on overlap, hierarchy, and color.**
D3.js is the "ultimate weapon" in visualization — highest freedom, but steepest learning curve.
Suitable for: visualizations requiring fully custom interactions, data journalism, art-grade infographics.
**If your needs can be met by ECharts/matplotlib, don't use D3.** D3's value lies in "charts others can't make".
## HTML Universal Shell
```html
<!DOCTYPE html>
<html lang="zh">
<head>
  <meta charset="UTF-8">
  <title>{{TITLE}}</title>
  <script src="https://cdn.jsdelivr.net/npm/d3@7"></script>
  <style>
    * { margin: 0; padding: 0; box-sizing: border-box; }
    body { background: #FFFFFF; font-family: system-ui, 'SimHei', sans-serif; }
    svg { display: block; margin: 40px auto; }
    /* Design tokens */
    :root {
      --text: #111827;
      --text-sub: #6B7280;
      --text-muted: #9CA3AF;
      --axis: #E5E7EB;
      --grid: #F3F4F6;
      --blue: #3B82F6;
      --cyan: #06B6D4;
      --purple: #8B5CF6;
      --amber: #F59E0B;
      --red: #EF4444;
      --green: #10B981;
    }
  </style>
</head>
<body>
  <svg id="chart"></svg>
  <script>
    // D3 code
  </script>
</body>
</html>
```
## Colors and Constants
```javascript
const COLORS = ['#3B82F6', '#06B6D4', '#8B5CF6', '#F59E0B', '#EF4444', '#10B981'];
const GRAY = { 200: '#E5E7EB', 300: '#D1D5DB', 400: '#9CA3AF', 500: '#6B7280', 900: '#111827' };
const margin = { top: 60, right: 40, bottom: 50, left: 60 };
const width = 900 - margin.left - margin.right;
const height = 500 - margin.top - margin.bottom;
```
---
## Template 1: Insight Bar Chart
```javascript
const svg = d3.select('#chart')
  .attr('width', width + margin.left + margin.right)
  .attr('height', height + margin.top + margin.bottom)
  .append('g')
  .attr('transform', `translate(${margin.left},${margin.top})`);
// Title (insight-driven)
svg.append('text')
  .attr('x', 0).attr('y', -30)
  .attr('fill', GRAY[900])
  .attr('font-size', 16).attr('font-weight', 'bold')
  .text('Product C leads in sales, reaching 890K');
// Scales
const x = d3.scaleBand().domain(data.map(d => d.name)).range([0, width]).padding(0.35);
const y = d3.scaleLinear().domain([0, d3.max(data, d => d.value) * 1.15]).range([height, 0]);
// Y axis (minimal: no tick marks, light gray)
svg.append('g')
  .call(d3.axisLeft(y).ticks(5).tickSize(0).tickFormat(d3.format(',')))
  .call(g => g.select('.domain').attr('stroke', GRAY[200]).attr('stroke-width', 0.8))
  .call(g => g.selectAll('.tick text').attr('fill', GRAY[500]).attr('font-size', 9));
// X axis
svg.append('g')
  .attr('transform', `translate(0,${height})`)
  .call(d3.axisBottom(x).tickSize(0))
  .call(g => g.select('.domain').attr('stroke', GRAY[200]).attr('stroke-width', 0.8))
  .call(g => g.selectAll('.tick text').attr('fill', GRAY[500]).attr('font-size', 9));
// Bars (gray + highlight)
svg.selectAll('.bar')
  .data(data)
  .join('rect')
  .attr('x', d => x(d.name))
  .attr('width', x.bandwidth())
  .attr('y', d => y(d.value))
  .attr('height', d => height - y(d.value))
  .attr('fill', d => d.highlight ? '#3B82F6' : GRAY[200])
  .attr('rx', 3);
// Value labels
svg.selectAll('.label')
  .data(data)
  .join('text')
  .attr('x', d => x(d.name) + x.bandwidth()/2)
  .attr('y', d => y(d.value) - 6)
  .attr('text-anchor', 'middle')
  .attr('fill', d => d.highlight ? GRAY[900] : GRAY[400])
  .attr('font-size', d => d.highlight ? 12 : 10)
  .attr('font-weight', d => d.highlight ? 'bold' : 'normal')
  .text(d => d3.format(',')(d.value));
```
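The bar chart template reads a `data` array that it never defines. One possible way to prepare it, sketched here in Python on the assumption that the chart data is assembled server-side and embedded as JSON: the field names `name`, `value` and `highlight` come from the template, while `to_bar_data` and the sample figures are hypothetical.

```python
import json


def to_bar_data(values: dict) -> list:
    """Build the record list the template expects, highlighting the maximum."""
    if not values:
        return []
    top = max(values, key=values.get)
    return [{"name": name, "value": value, "highlight": name == top}
            for name, value in values.items()]


if __name__ == "__main__":
    sales = {"Product A": 420000, "Product B": 610000, "Product C": 890000}
    # Paste the output into the page as: const data = <json>;
    print(json.dumps(to_bar_data(sales), ensure_ascii=False))
```

Highlighting only the largest value is a design choice that matches the "gray + highlight" bars; any other record could be flagged instead.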
---
## Template 2: Force-Directed Graph
This is D3's killer feature — other frameworks can hardly achieve the same effect.
```javascript
const simulation = d3.forceSimulation(nodes)
  .force('link', d3.forceLink(links).id(d => d.id).distance(80))
  .force('charge', d3.forceManyBody().strength(-200))
  .force('center', d3.forceCenter(width/2, height/2))
  .force('collision', d3.forceCollide(20));
const link = svg.selectAll('.link')
  .data(links).join('line')
  .attr('stroke', GRAY[200]).attr('stroke-width', 1);
const node = svg.selectAll('.node')
  .data(nodes).join('circle')
  .attr('r', d => Math.sqrt(d.value) * 3)
  .attr('fill', (d, i) => COLORS[d.group % COLORS.length])
  .attr('stroke', '#fff').attr('stroke-width', 1.5)
  .call(d3.drag()
    .on('start', (e, d) => { if (!e.active) simulation.alphaTarget(0.3).restart(); d.fx = d.x; d.fy = d.y; })
    .on('drag', (e, d) => { d.fx = e.x; d.fy = e.y; })
    .on('end', (e, d) => { if (!e.active) simulation.alphaTarget(0); d.fx = null; d.fy = null; })
  );
// Labels
const labels = svg.selectAll('.label')
  .data(nodes).join('text')
  .text(d => d.name)
  .attr('font-size', 9).attr('fill', GRAY[500])
  .attr('text-anchor', 'middle').attr('dy', -15);
simulation.on('tick', () => {
  link.attr('x1', d => d.source.x).attr('y1', d => d.source.y)
    .attr('x2', d => d.target.x).attr('y2', d => d.target.y);
  node.attr('cx', d => d.x).attr('cy', d => d.y);
  labels.attr('x', d => d.x).attr('y', d => d.y);
});
```
---
## Template 3: Treemap
```javascript
const root = d3.hierarchy(treeData).sum(d => d.value);
d3.treemap().size([width, height]).padding(2)(root);
svg.selectAll('.cell')
  .data(root.leaves())
  .join('rect')
  .attr('x', d => d.x0).attr('y', d => d.y0)
  .attr('width', d => d.x1 - d.x0)
  .attr('height', d => d.y1 - d.y0)
  .attr('fill', (d, i) => COLORS[i % COLORS.length])
  .attr('rx', 2).attr('opacity', 0.85);
svg.selectAll('.cell-label')
  .data(root.leaves().filter(d => (d.x1-d.x0) > 40 && (d.y1-d.y0) > 20))
  .join('text')
  .attr('x', d => d.x0 + 4).attr('y', d => d.y0 + 14)
  .text(d => d.data.name)
  .attr('font-size', 10).attr('fill', 'white').attr('font-weight', 'bold');
```
---
## D3 Use Cases (vs Overkill)
| ✅ D3 Best | ❌ Overkill |
|-----------|-------------|
| Force-directed relationship graph | Regular bar chart (use matplotlib) |
| Custom geographic visualization | Standard map (use ECharts) |
| Data-driven animation | Static report chart (use matplotlib) |
| Treemap / Sunburst | Standard pie chart (use matplotlib) |
| Complex interactions (brushing/linking/drill-down) | Simple tooltip (use ECharts) |
| Data journalism/narrative visualization | Dashboard (use ECharts) |


@@ -0,0 +1,651 @@
# ECharts Template Library
> **⚠️ Before writing any code, read [`_rules.md`](references/_rules.md) — three non-negotiable rules on overlap, hierarchy, and color.**
ECharts strengths: interactivity (tooltip/zoom/linking), big data (Canvas renders millions of points smoothly), strong Chinese community.
Output as HTML files — open directly in browser or export PNG via Playwright screenshot.
## HTML Universal Shell
Wrap all ECharts charts with this shell:
```html
<!DOCTYPE html>
<html lang="zh">
<head>
<meta charset="UTF-8">
<title>{{TITLE}}</title>
<script src="https://cdn.jsdelivr.net/npm/echarts@5/dist/echarts.min.js"></script>
<style>
* { margin: 0; padding: 0; box-sizing: border-box; }
body { background: {{BG_COLOR}}; }
#chart { width: {{WIDTH}}px; height: {{HEIGHT}}px; margin: 40px auto; }
</style>
</head>
<body>
<div id="chart"></div>
<script>
const chart = echarts.init(document.getElementById('chart'));
const option = { /* see templates below */ };
chart.setOption(option);
window.addEventListener('resize', () => chart.resize());
</script>
</body>
</html>
```
Default dimensions: `width=900, height=520`, white background `#FFFFFF`.
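The shell can be filled programmatically before it is opened or screenshotted. The following is a minimal sketch, assuming the chart option is plain JSON-serializable data (no JavaScript `formatter` functions). `render_shell` and the `{{OPTION}}` placeholder are hypothetical names introduced here for illustration; the other placeholders and the defaults come from the shell above.

```python
import html
import json

# Trimmed copy of the shell above. {{OPTION}} is an extra, hypothetical placeholder
# that stands in for the inline `const option = {...}` object.
SHELL = """<!DOCTYPE html>
<html lang="zh">
<head>
<meta charset="UTF-8">
<title>{{TITLE}}</title>
<script src="https://cdn.jsdelivr.net/npm/echarts@5/dist/echarts.min.js"></script>
<style>
* { margin: 0; padding: 0; box-sizing: border-box; }
body { background: {{BG_COLOR}}; }
#chart { width: {{WIDTH}}px; height: {{HEIGHT}}px; margin: 40px auto; }
</style>
</head>
<body>
<div id="chart"></div>
<script>
const chart = echarts.init(document.getElementById('chart'));
chart.setOption({{OPTION}});
window.addEventListener('resize', () => chart.resize());
</script>
</body>
</html>
"""


def render_shell(title, option, width=900, height=520, bg_color="#FFFFFF"):
    """Fill the shell placeholders and return the HTML as a string."""
    # ensure_ascii=False keeps CJK text readable in the generated source;
    # escaping "</" prevents a stray "</script>" inside the data from closing the tag.
    option_js = json.dumps(option, ensure_ascii=False).replace("</", "<\\/")
    values = {
        "{{TITLE}}": html.escape(title),
        "{{BG_COLOR}}": bg_color,
        "{{WIDTH}}": str(width),
        "{{HEIGHT}}": str(height),
        "{{OPTION}}": option_js,
    }
    page = SHELL
    for placeholder, value in values.items():
        page = page.replace(placeholder, value)
    return page


page = render_shell("Q3 <Revenue>", {"series": [{"type": "bar", "data": [1, 2, 3]}]})
```

Options that rely on JavaScript callbacks (such as the label `formatter` in Template 1) cannot pass through `json.dumps` and would still need to be written by hand in the script block.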
---
## Theme Configuration
### Light Theme (Default)
```javascript
const THEME = {
bg: '#FFFFFF',
text: '#111827',
textSub: '#6B7280',
textMuted: '#9CA3AF',
axis: '#E5E7EB',
grid: '#F3F4F6',
tooltip: { bg: '#1E293B', border: '#334155', text: '#F1F5F9' },
colors: ['#3B82F6', '#06B6D4', '#8B5CF6', '#F59E0B', '#EF4444', '#10B981'],
};
```
### Dark Theme (Finance / Tech Dashboard)
```javascript
const DARK = {
bg: '#0F172A',
text: '#F1F5F9',
textSub: '#94A3B8',
textMuted: '#64748B',
axis: '#334155',
grid: '#1E293B',
tooltip: { bg: '#1E293B', border: '#475569', text: '#F1F5F9' },
colors: ['#3B82F6', '#06B6D4', '#8B5CF6', '#F59E0B', '#22C55E', '#EC4899'],
};
```
### Base Option Configuration
```javascript
function baseOption(theme, title, subtitle) {
return {
backgroundColor: theme.bg,
textStyle: { fontFamily: 'system-ui, SimHei, sans-serif', color: theme.text },
title: {
text: title, subtext: subtitle || '',
left: 24, top: 16,
textStyle: { fontSize: 16, fontWeight: 'bold', color: theme.text },
subtextStyle: { fontSize: 12, color: theme.textSub },
},
grid: { left: 60, right: 40, top: 80, bottom: 50, containLabel: true },
color: theme.colors,
tooltip: {
trigger: 'axis',
backgroundColor: theme.tooltip.bg,
borderColor: theme.tooltip.border,
borderWidth: 1,
textStyle: { color: theme.tooltip.text, fontSize: 12 },
},
animationDuration: 600,
animationEasing: 'cubicOut',
};
}
function cleanAxis(theme) {
return {
axisLine: { lineStyle: { color: theme.axis, width: 0.8 } },
axisTick: { show: false },
splitLine: { lineStyle: { color: theme.grid, width: 0.5 } },
axisLabel: { color: theme.textSub, fontSize: 10 },
};
}
```
---
## Template 1: Insight Bar Chart
```javascript
const option = {
...baseOption(THEME, 'Q3 收入环比增长 47%', '各季度收入对比(万元)'),
xAxis: { type: 'category', data: ['Q1','Q2','Q3','Q4'], ...cleanAxis(THEME) },
yAxis: { type: 'value', ...cleanAxis(THEME) },
series: [{
type: 'bar', barWidth: '50%',
itemStyle: { borderRadius: [4, 4, 0, 0] },
data: [
{ value: 120, itemStyle: { color: '#E5E7EB' } },
{ value: 145, itemStyle: { color: '#E5E7EB' } },
{ value: 213, itemStyle: { color: '#3B82F6' } },
{ value: 180, itemStyle: { color: '#E5E7EB' } },
],
label: {
show: true, position: 'top', fontSize: 11, color: '#6B7280',
formatter: (p) => p.dataIndex === 2
? '{hl|' + p.value + '}'
: p.value,
rich: { hl: { fontWeight: 'bold', fontSize: 13, color: '#111827' } },
},
}],
};
```
---
## Template 2: Multi-Line Trend
```javascript
const option = {
...baseOption(THEME, '2024 年增长持续加速'),
legend: { right: 40, top: 20, textStyle: { color: '#6B7280', fontSize: 10 } },
xAxis: { type: 'category', data: months, boundaryGap: false, ...cleanAxis(THEME) },
yAxis: { type: 'value', ...cleanAxis(THEME) },
series: [
{
name: '2024', type: 'line', data: thisYear,
lineStyle: { width: 2.5 },
symbol: 'circle', symbolSize: 6,
itemStyle: { color: '#3B82F6' },
areaStyle: {
color: new echarts.graphic.LinearGradient(0, 0, 0, 1, [
{ offset: 0, color: 'rgba(59,130,246,0.12)' },
{ offset: 1, color: 'rgba(59,130,246,0)' },
]),
},
},
{
name: '2023', type: 'line', data: lastYear,
lineStyle: { width: 1.5, type: 'dashed', color: '#D1D5DB' },
symbol: 'none', itemStyle: { color: '#D1D5DB' },
},
],
};
```
---
## Template 3: Candlestick (Finance)
```javascript
// Dark theme
const option = {
...baseOption(DARK, 'BTC/USDT 日K线'),
xAxis: { type: 'category', data: dates, ...cleanAxis(DARK) },
yAxis: { type: 'value', scale: true, ...cleanAxis(DARK) },
dataZoom: [
{ type: 'inside', start: 70, end: 100 },
{ type: 'slider', start: 70, end: 100, height: 20, bottom: 10,
borderColor: DARK.axis, fillerColor: 'rgba(59,130,246,0.15)',
textStyle: { color: DARK.textSub } },
],
series: [{
type: 'candlestick',
data: ohlcData, // [[open,close,low,high], ...]
itemStyle: {
color: '#22C55E', // Bullish (close > open)
color0: '#EF4444', // Bearish
borderColor: '#16A34A',
borderColor0: '#DC2626',
},
}],
};
```
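ECharts expects each candlestick row as `[open, close, low, high]`, while many data sources deliver rows as open, high, low, close. A small sketch of the reordering step, assuming the input rows are `(open, high, low, close)` tuples; `to_echarts_ohlc` is a hypothetical helper name, not part of ECharts:

```python
def to_echarts_ohlc(rows):
    """Reorder (open, high, low, close) rows into ECharts' [open, close, low, high]."""
    out = []
    for open_, high, low, close in rows:
        # Sanity check: the low/high must bracket the open and close prices.
        if low > min(open_, close) or high < max(open_, close):
            raise ValueError(f"inconsistent OHLC row: {(open_, high, low, close)}")
        out.append([open_, close, low, high])
    return out


rows = [
    (100.0, 112.0, 98.0, 110.0),   # bullish day: close > open
    (110.0, 111.0, 101.0, 103.0),  # bearish day: close < open
]
ohlc_data = to_echarts_ohlc(rows)
```

The resulting `ohlc_data` list can be serialized to JSON and used as the `data` of the candlestick series above.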
---
## Template 4: Dashboard (Multi-Chart Linking)
⚠️ **ECharts multi-chart dashboard anti-overlap rules** (highest priority):
1. Maximum 4 subplots per canvas; more must be split into multiple HTML files
2. Each subplot's `grid` area must not overlap; maintain ≥5% safety margin between adjacent grids
3. Pie chart `center` and `radius` must not intrude into other subplots' grid areas
4. Same applies to radar chart's `radar.center` and `radar.radius`
5. Place legend in top/bottom common area, not inside subplots
### Simple Dual Chart (Bar + Pie)
```javascript
const option = {
...baseOption(THEME, '产品线收入分布'),
grid: [{ left: 60, right: '55%', top: 80, bottom: 50 }],
xAxis: [{ type: 'value', gridIndex: 0, ...cleanAxis(THEME) }],
yAxis: [{ type: 'category', data: products, gridIndex: 0, ...cleanAxis(THEME) }],
series: [
{
type: 'bar', data: revenues,
itemStyle: { borderRadius: [0, 4, 4, 0] },
barWidth: '60%',
},
{
type: 'pie', center: ['78%', '50%'], radius: ['35%', '55%'],
data: products.map((name, i) => ({ name, value: revenues[i] })),
label: { formatter: '{b}\n{d}%', fontSize: 10 },
itemStyle: { borderColor: '#fff', borderWidth: 2 },
},
],
};
```
### Four-Chart Dashboard (Safe Layout Template)
```javascript
// ⚠️ Key: grid areas precisely defined, no overlap, maintain safety margins
const option = {
...baseOption(THEME, '数据全景仪表盘'),
grid: [
// Top-left: bar chart
{ left: '5%', right: '55%', top: '12%', bottom: '55%' },
// Top-right: line chart
{ left: '55%', right: '5%', top: '12%', bottom: '55%' },
// Bottom-left: scatter plot
{ left: '5%', right: '55%', top: '55%', bottom: '5%' },
// Bottom-right area reserved for pie chart (pie uses center + radius, not grid)
],
xAxis: [
{ type: 'category', gridIndex: 0, data: categories1, ...cleanAxis(THEME) },
{ type: 'category', gridIndex: 1, data: categories2, ...cleanAxis(THEME), boundaryGap: false },
{ type: 'value', gridIndex: 2, ...cleanAxis(THEME) },
],
yAxis: [
{ type: 'value', gridIndex: 0, ...cleanAxis(THEME) },
{ type: 'value', gridIndex: 1, ...cleanAxis(THEME) },
{ type: 'value', gridIndex: 2, ...cleanAxis(THEME) },
],
series: [
// Top-left: bar chart
{
type: 'bar', xAxisIndex: 0, yAxisIndex: 0,
data: barData,
itemStyle: { borderRadius: [4, 4, 0, 0] },
},
// Top-right: line chart
{
type: 'line', xAxisIndex: 1, yAxisIndex: 1,
data: lineData,
smooth: true,
areaStyle: { opacity: 0.08 },
},
// Bottom-left: scatter plot
{
type: 'scatter', xAxisIndex: 2, yAxisIndex: 2,
data: scatterData,
symbolSize: 8,
},
// Bottom-right: pie chart (positioned via center in bottom-right quadrant)
{
type: 'pie',
center: ['77%', '72%'], // Positioned at bottom-right area center
radius: ['15%', '25%'], // Radius stays within bottom-right quadrant
data: pieData,
label: { formatter: '{b}\n{d}%', fontSize: 10 },
itemStyle: { borderColor: '#fff', borderWidth: 2 },
},
],
};
```
### Grid Safety Margin Quick Reference
| Layout | Grid Config | Safety Margin |
|------|----------|---------|
| Left-right dual | Left `right:'55%'` Right `left:'55%'` | 10% center gap |
| Top-bottom dual | Top `bottom:'55%'` Bottom `top:'55%'` | 10% center gap |
| 2x2 quad | Each quadrant ≤ 40% wide (e.g. `left:'5%'`, `right:'55%'`), 10% center gap | 5% margin on all sides |
| With pie/radar | Pie center+radius must not intrude grid | Pie radius ≤ 40% of available area |
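The grid rules above can be checked mechanically before a dashboard is rendered. This is a rough sketch, assuming every grid gives all four of `left`, `right`, `top`, and `bottom` as percent strings (pixel values such as `left: 60` are not handled). `grid_rect`, `separation`, and `check_grids` are hypothetical helper names.

```python
from itertools import combinations


def grid_rect(grid):
    """Convert a percent-based grid into (x0, y0, x1, y1), in percent of the canvas."""
    pct = {key: float(str(value).rstrip("%")) for key, value in grid.items()}
    return (pct["left"], pct["top"], 100 - pct["right"], 100 - pct["bottom"])


def separation(a, b):
    """Largest axis-aligned gap between two rectangles; negative means they overlap."""
    dx = max(b[0] - a[2], a[0] - b[2])
    dy = max(b[1] - a[3], a[1] - b[3])
    return max(dx, dy)


def check_grids(grids, margin=5.0):
    """Return the index pairs whose separation is below `margin` percent."""
    rects = [grid_rect(g) for g in grids]
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(rects), 2)
        if separation(a, b) < margin
    ]


# The three grids from the four-chart template above.
safe_grids = [
    {"left": "5%", "right": "55%", "top": "12%", "bottom": "55%"},
    {"left": "55%", "right": "5%", "top": "12%", "bottom": "55%"},
    {"left": "5%", "right": "55%", "top": "55%", "bottom": "5%"},
]
# A deliberately crowded pair: only a 2% gap between the columns.
tight_grids = [
    {"left": "5%", "right": "50%", "top": "12%", "bottom": "55%"},
    {"left": "52%", "right": "5%", "top": "12%", "bottom": "55%"},
]
violations_safe = check_grids(safe_grids)
violations_tight = check_grids(tight_grids)
```

Pie and radar series are positioned by `center` and `radius` rather than by a grid, so they would need a separate bounding-box check.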
### What If More Than 4 Subplots?
```javascript
// ❌ Wrong: 8 charts crammed into one canvas — all labels will inevitably overlap
// ✅ Correct: split into 2 HTML files
// dashboard_overview.html — 4 overview charts
// dashboard_detail.html — 4 detailed analysis charts
// Or use tab switching (ECharts toolbox doesn't support this, need custom HTML tabs)
```
---
## Template 5: Radar Chart
```javascript
const option = {
...baseOption(THEME, '团队能力评估'),
radar: {
indicator: dims.map(d => ({ name: d, max: 100 })),
axisName: { color: '#6B7280', fontSize: 10 },
splitArea: { areaStyle: { color: ['#FAFAFA', '#F5F5F5'] } },
splitLine: { lineStyle: { color: '#E5E7EB' } },
axisLine: { lineStyle: { color: '#E5E7EB' } },
},
series: [{
type: 'radar',
data: teams.map((t, i) => ({
name: t.name, value: t.scores,
lineStyle: { width: 2 },
areaStyle: { opacity: 0.08 },
itemStyle: { color: THEME.colors[i] },
})),
}],
legend: { bottom: 10, textStyle: { color: '#6B7280' } },
};
```
---
## Export to PNG (Playwright)
```python
import asyncio
from pathlib import Path
from playwright.async_api import async_playwright

async def echarts_to_png(html_path, png_path, width=900, height=520):
    # file:// URLs need an absolute path, so resolve relative inputs first
    url = Path(html_path).resolve().as_uri()
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        page = await browser.new_page(viewport={'width': width, 'height': height})
        await page.goto(url, wait_until='networkidle')
        await page.wait_for_timeout(800)  # Wait for animation to complete
        await page.locator('#chart').screenshot(path=png_path)
        await browser.close()
    print(f'{png_path}')

# asyncio.run(echarts_to_png('./output/chart.html', './output/chart.png'))
```
---
## Template 6: Tree (Interactive Only)
**⚠️ For static PNG export, use Playwright+CSS (see `mindmap-css.md`). ECharts tree connector length and node spacing cannot be finely controlled — static output is not aesthetically satisfying.**
**ECharts tree is suitable for interactive scenarios** (click expand/collapse, hover tooltip, zoom/drag). For PNG/PDF static output, the CSS approach looks better.
### Basic Usage
```javascript
const option = {
tooltip: { trigger: 'item', triggerOn: 'mousemove' },
series: [{
type: 'tree',
data: [treeData], // Tree-structured JSON data
layout: 'orthogonal', // Horizontal/vertical layout (as opposed to 'radial')
orient: 'LR', // Direction: LR(left→right) / RL / TB(top→bottom) / BT
// Initial expansion depth (collapsing deep levels reduces crowding)
initialTreeDepth: -1, // -1=expand all, positive=initial expand depth
// Label style
label: {
position: 'left', // Label position for non-leaf nodes (leaves override below)
verticalAlign: 'middle',
fontSize: 13,
fontFamily: 'PingFang SC, SimHei, sans-serif',
},
leaves: {
label: { position: 'right' } // Leaf labels on right
},
// Connector style
lineStyle: {
color: '#94A3B8',
width: 1.5,
curveness: 0.5, // Curvature, 0=straight, 0.5=natural curve
},
// Node style
itemStyle: {
borderWidth: 1.5,
},
// Animation
animationDuration: 550,
animationDurationUpdate: 750,
}]
};
```
### Tree Data Format
```javascript
const treeData = {
name: '中心主题',
children: [
{
name: '分支A',
children: [
{ name: '叶子1' },
{ name: '叶子2' },
{ name: '叶子3', children: [{ name: '更深叶子' }] }
]
},
{
name: '分支B',
children: [
{ name: '叶子4' },
{ name: '叶子5' }
]
}
]
};
```
### Node Style Customization (by Level)
```javascript
// Root node highlight
function styleTreeData(node, depth) {
const styles = [
{ // Root node
itemStyle: { color: '#3B82F6', borderColor: '#2563EB', borderWidth: 2 },
label: { fontSize: 18, fontWeight: 'bold', color: '#fff',
backgroundColor: '#3B82F6', borderRadius: 6, padding: [8, 16] }
},
{ // Level 1 branches
itemStyle: { color: '#60A5FA', borderColor: '#3B82F6' },
label: { fontSize: 15, fontWeight: 600, color: '#1E40AF',
backgroundColor: '#EFF6FF', borderColor: '#3B82F6',
borderWidth: 1.5, borderRadius: 6, padding: [6, 14] }
},
{ // Level 2
itemStyle: { color: '#93C5FD', borderColor: '#60A5FA' },
label: { fontSize: 13, color: '#1E40AF',
backgroundColor: '#F0F7FF', borderColor: '#93C5FD',
borderWidth: 1, borderRadius: 4, padding: [4, 10] }
},
{ // Level 3+ leaves
itemStyle: { color: '#BFDBFE', borderColor: '#93C5FD' },
label: { fontSize: 12, color: '#475569', padding: [3, 8] }
}
];
const style = styles[Math.min(depth, styles.length - 1)];
Object.assign(node, style);
if (node.children) {
node.children.forEach(child => styleTreeData(child, depth + 1));
}
}
styleTreeData(treeData, 0);
```
### Left-Right Distribution (Large Tree Mode)
When branches ≥ 5, use two trees for left-right expansion:
```javascript
function splitTree(data) {
const children = data.children || [];
// Greedy balance: take subtrees largest-first and put each on the lighter side
const sorted = children.map(c => ({ c, w: countNodes(c) }))
.sort((a, b) => b.w - a.w);
const left = [], right = [];
let lw = 0, rw = 0;
sorted.forEach(({ c, w }) => {
if (lw <= rw) { left.push(c); lw += w; }
else { right.push(c); rw += w; }
});
return {
left: { name: data.name, children: left },
right: { name: data.name, children: right }
};
}
function countNodes(node) {
if (!node.children) return 1;
return 1 + node.children.reduce((s, c) => s + countNodes(c), 0);
}
// Dual tree series
const { left, right } = splitTree(treeData);
const option = {
series: [
{ type: 'tree', data: [right], orient: 'LR', left: '50%', width: '45%', /* ... */ },
{ type: 'tree', data: [left], orient: 'RL', right: '50%', width: '45%', /* ... */ },
]
};
```
### Recommended Canvas Size
| Node Count | Width | Height |
|--------|------|------|
| ≤ 15 | 900px | 500px |
| 16-30 | 1200px | 600px |
| 31-60 | 1600px | 800px |
| 60+ | 2000px | 1000px |
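The size table can be applied automatically once the tree data exists. A minimal sketch, assuming the tree is a nested dict in the `name`/`children` format shown earlier and that a tree with exactly 60 nodes falls into the 31-60 row. `count_nodes` and `recommend_canvas` are hypothetical helper names, and the sample tree uses placeholder names.

```python
def count_nodes(node):
    """Count a node plus all of its descendants."""
    return 1 + sum(count_nodes(child) for child in node.get("children", []))


def recommend_canvas(tree):
    """Map the node count onto the (width, height) rows of the table above."""
    n = count_nodes(tree)
    if n <= 15:
        return (900, 500)
    if n <= 30:
        return (1200, 600)
    if n <= 60:
        return (1600, 800)
    return (2000, 1000)


# Same shape as the sample tree in "Tree Data Format" (9 nodes in total).
sample_tree = {
    "name": "Root",
    "children": [
        {"name": "Branch A", "children": [
            {"name": "Leaf 1"},
            {"name": "Leaf 2"},
            {"name": "Leaf 3", "children": [{"name": "Deeper leaf"}]},
        ]},
        {"name": "Branch B", "children": [{"name": "Leaf 4"}, {"name": "Leaf 5"}]},
    ],
}
width, height = recommend_canvas(sample_tree)
```

The returned pair can be passed as the chart container size in the HTML shell and as the viewport size when exporting.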
---
## Template 7: Relationship / Force-Directed Graph
**ECharts graph suits process relationships, org charts, knowledge graphs** — nodes auto-repel to avoid overlap, connectors auto-bind, supports categorical coloring.
### Basic Usage
```javascript
const option = {
tooltip: {},
legend: [{ data: categories.map(c => c.name) }],
series: [{
type: 'graph',
layout: 'force', // Force-directed auto layout
// Force model params (controls repulsion and attraction)
force: {
repulsion: 300, // Repulsion force (higher = more spread out, recommended 200-500)
gravity: 0.1, // Gravity (prevents nodes from flying too far)
edgeLength: [100, 200], // Edge length range
layoutAnimation: true,
},
roam: true, // Allow drag and zoom
draggable: true, // Allow dragging nodes
// Nodes
data: nodes,
// Edges
links: links,
// Categories (for coloring)
categories: categories,
// Labels
label: {
show: true,
position: 'right',
fontSize: 12,
fontFamily: 'PingFang SC, SimHei, sans-serif',
},
// Connector style
lineStyle: {
color: 'source', // Edge color follows source node
curveness: 0.3, // Curvature
width: 1.5,
},
// Highlight effect
emphasis: {
focus: 'adjacency', // Highlight adjacent nodes on hover
lineStyle: { width: 3 },
},
}]
};
```
### Data Format
```javascript
const categories = [
{ name: '核心系统', itemStyle: { color: '#3B82F6' } },
{ name: '数据层', itemStyle: { color: '#10B981' } },
{ name: '应用层', itemStyle: { color: '#F59E0B' } },
];
const nodes = [
{ name: 'API Gateway', category: 0, symbolSize: 40 },
{ name: 'User Service', category: 0, symbolSize: 30 },
{ name: 'MySQL', category: 1, symbolSize: 35 },
{ name: 'Redis', category: 1, symbolSize: 28 },
{ name: 'Web App', category: 2, symbolSize: 32 },
];
const links = [
{ source: 'API Gateway', target: 'User Service' },
{ source: 'User Service', target: 'MySQL' },
{ source: 'User Service', target: 'Redis' },
{ source: 'Web App', target: 'API Gateway' },
];
```
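Because links refer to nodes by `name` and nodes refer to categories by index, a typo in either silently drops an edge or a color. A short validation sketch, assuming the data is prepared in Python before being serialized into the option; `validate_graph` is a hypothetical helper, and the check treats node names as unique identifiers.

```python
def validate_graph(nodes, links, categories):
    """Return a list of human-readable problems; an empty list means the data is consistent."""
    problems = []
    names = [n["name"] for n in nodes]
    for name in sorted(set(names)):
        if names.count(name) > 1:
            problems.append(f"duplicate node name: {name}")
    for n in nodes:
        cat = n.get("category")
        if cat is not None and not 0 <= cat < len(categories):
            problems.append(f"node {n['name']}: category index {cat} out of range")
    known = set(names)
    for link in links:
        for end in ("source", "target"):
            if link[end] not in known:
                problems.append(f"link {end} refers to unknown node: {link[end]}")
    return problems


categories = [{"name": "Core"}, {"name": "Data"}, {"name": "App"}]
nodes = [
    {"name": "API Gateway", "category": 0, "symbolSize": 40},
    {"name": "User Service", "category": 0, "symbolSize": 30},
    {"name": "MySQL", "category": 1, "symbolSize": 35},
    {"name": "Web App", "category": 2, "symbolSize": 32},
]
links = [
    {"source": "API Gateway", "target": "User Service"},
    {"source": "User Service", "target": "MySQL"},
    {"source": "Web App", "target": "API Gateway"},
]
problems = validate_graph(nodes, links, categories)
```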
### Flowchart Mode (Fixed Layout)
When you don't want force-directed auto-layout, fix node positions:
```javascript
const option = {
series: [{
type: 'graph',
layout: 'none', // Fixed layout, positions determined by x/y
data: [
{ name: '开始', x: 300, y: 50, symbolSize: 40,
itemStyle: { color: '#EFF6FF', borderColor: '#3B82F6', borderWidth: 2 } },
{ name: '处理', x: 300, y: 200, symbolSize: 35 },
{ name: '判断', x: 300, y: 350, symbolSize: 35,
symbol: 'diamond',
itemStyle: { color: '#FFF7ED', borderColor: '#F59E0B', borderWidth: 2 } },
{ name: '结束', x: 300, y: 500, symbolSize: 40 },
],
links: [
{ source: '开始', target: '处理' },
{ source: '处理', target: '判断' },
{ source: '判断', target: '结束', label: { show: true, formatter: '通过' } },
],
label: { show: true }, // Node labels are hidden by default in graph series
lineStyle: { color: '#94A3B8', width: 2, curveness: 0 },
edgeSymbol: ['none', 'arrow'],
edgeSymbolSize: [0, 10],
}]
};
```
---
## ECharts vs Other Frameworks
| Capability | ECharts | Plotly | Chart.js |
|------|---------|--------|----------|
| Canvas rendering (big data) | ✅ Millions | ⚠️ SVG by default; WebGL traces (e.g. `scattergl`) for large data | ✅ But limited |
| Chinese docs | ✅ Official | ❌ English | ❌ English |
| Candlestick | ✅ Built-in | ✅ Built-in | ⚠️ Plugin needed |
| Maps | ✅ Built-in China map | ✅ mapbox | ❌ None |
| 3D Charts | ✅ echarts-gl | ✅ Built-in | ❌ None |
| No Node.js needed | ✅ CDN import | ✅ CDN import (plotly.js) | ✅ CDN |
| Server-side rendering | ✅ Built-in SVG SSR (5.3+) | ✅ kaleido (orca is deprecated) | ✅ chartjs-node-canvas |


@@ -0,0 +1,617 @@
# matplotlib Template Library
> **⚠️ Before writing any code, read [`_rules.md`](references/_rules.md) — three non-negotiable rules on overlap, hierarchy, and color.**
## Environment Initialization (Must Execute Before Each Plot)
```python
import matplotlib
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
import numpy as np
# ═══ Chinese Font Setup ═══
# SimHei is the default; other fonts available:
# SimSun.ttf (Songti, formal docs), SimKai.ttf (Kaiti, artistic),
# SarasaMonoSC-*.ttf (monospace CJK, code scenes)
# Run `fc-list :lang=zh` for system fonts (PingFang SC, Heiti TC, etc.)
# Font path: adjust for your system. Common locations:
# macOS: '/System/Library/Fonts/Supplemental/SimHei.ttf'
# Linux: '/usr/share/fonts/truetype/wqy/wqy-zenhei.ttc'
#        (registers as 'WenQuanYi Zen Hei'; use that name in 'font.sans-serif')
# Custom: './fonts/SimHei.ttf'
import os
SIMHEI_PATH = os.environ.get('SIMHEI_FONT', '/System/Library/Fonts/Supplemental/SimHei.ttf')
matplotlib.font_manager.fontManager.addfont(SIMHEI_PATH)
# ═══ Global Style ═══
plt.rcParams.update({
# Font
'font.sans-serif': ['SimHei'],
'axes.unicode_minus': False,
# Background
'figure.facecolor': '#FFFFFF',
'axes.facecolor': '#FFFFFF',
# Border: only keep left and bottom
'axes.edgecolor': '#E5E7EB',
'axes.linewidth': 0.8,
'axes.spines.top': False,
'axes.spines.right': False,
# Grid: off by default
'axes.grid': False,
# Ticks: hide tick marks
'xtick.major.size': 0,
'ytick.major.size': 0,
'xtick.labelsize': 9,
'ytick.labelsize': 9,
# Title
'axes.labelsize': 10,
'axes.titlesize': 16,
'axes.titleweight': 'bold',
'axes.titlepad': 16,
# Legend: no frame
'legend.frameon': False,
'legend.fontsize': 9,
# Export
'figure.dpi': 200,
'savefig.dpi': 200,
'savefig.bbox': 'tight',
'savefig.facecolor': '#FFFFFF',
'savefig.pad_inches': 0.3,
})
```
## Color Constants
```python
# ─── Cool Colors (Business/Tech) ───
C_BLUE = '#3B82F6'
C_CYAN = '#06B6D4'
C_PURPLE = '#8B5CF6'
C_AMBER = '#F59E0B'
C_RED = '#EF4444'
C_GREEN = '#10B981'
COOL = [C_BLUE, C_CYAN, C_PURPLE, C_AMBER, C_RED, C_GREEN]
# ─── Warm Colors (Warmth/Creative) ───
WARM = ['#F59E0B', '#EF4444', '#8B5CF6', '#3B82F6', '#10B981', '#EC4899']
# ─── Academic Grayscale ───
ACADEMIC = ['#111827', '#6B7280', '#9CA3AF', '#D1D5DB', '#E5E7EB', '#F3F4F6']
# ─── Colorblind-Safe (Paper Preferred) ───
CB_SAFE = ['#0077BB', '#33BBEE', '#009988', '#EE7733', '#CC3311', '#EE3377']
# ─── Grayscale ───
G900, G700, G500, G400, G300, G200, G100, G50 = \
'#111827', '#374151', '#6B7280', '#9CA3AF', '#D1D5DB', '#E5E7EB', '#F3F4F6', '#F9FAFB'
# ─── Gain/Loss ───
POS = '#22C55E'
NEG = '#EF4444'
```
## Helper Functions
```python
def clean_axis(ax, grid=True):
    """Clean axis: remove top and right borders, add faint grid"""
    ax.spines['top'].set_visible(False)
    ax.spines['right'].set_visible(False)
    if grid:
        ax.yaxis.grid(True, alpha=0.08, color=G300)
        ax.set_axisbelow(True)

def add_value_labels(ax, bars, values, highlight_idx=None, fmt='{:,.0f}',
                     offset_ratio=0.02):
    """Add value labels on top of bars, highlight bar in bold.
    ⚠️ Automatically extends Y-axis upper limit to ensure labels don't overflow chart area."""
    max_val = max(values)
    for i, (bar, val) in enumerate(zip(bars, values)):
        is_hl = (i == highlight_idx) if highlight_idx is not None else False
        ax.text(bar.get_x() + bar.get_width()/2,
                bar.get_height() + max_val * offset_ratio,
                fmt.format(val), ha='center', va='bottom',
                fontsize=10 if is_hl else 9,
                color=G900 if is_hl else G400,
                fontweight='bold' if is_hl else 'normal')
    # ⚠️ Critical: extend Y-axis upper limit to leave enough space for labels (at least 15%)
    ax.set_ylim(0, max_val * 1.18)

def save(fig, path, dpi=200):
    """Unified save: prefer constrained_layout, fall back to tight_layout"""
    if not fig.get_constrained_layout():
        try:
            fig.tight_layout()
        except Exception:
            pass
    fig.savefig(path, dpi=dpi, facecolor='white', bbox_inches='tight')
    plt.close(fig)
    import os
    size_kb = os.path.getsize(path) / 1024
    print(f'{path} ({size_kb:.0f}KB)')
```
### Label Text Avoidance (adjustText)
When a chart has multiple text annotations (e.g., scatter plot labels, box plot annotations), **use the adjustText library** to prevent text overlap:
```python
# pip install adjustText
from adjustText import adjust_text
# Call after adding all annotations
texts = []
for x, y, label in zip(x_data, y_data, labels):
    texts.append(ax.text(x, y, label, fontsize=9, color='#374151'))
# Auto-avoidance: pass all text objects, adjustText will move them to avoid overlap
adjust_text(texts, ax=ax,
            arrowprops=dict(arrowstyle='->', color='#9CA3AF', lw=0.8),
            force_text=(0.5, 0.8),    # Repulsion between text labels
            force_points=(0.3, 0.5),  # Repulsion from data points
            expand=(1.2, 1.4))        # Expansion factor for text bbox
```
---
## Template 1: Insight Bar Chart
**Scenario**: Emphasize outstanding performance of one data item. Gray out others, highlight the focus.
```python
def insight_bar(labels, values, highlight_idx, title,
                highlight_color=C_BLUE, save_path='insight_bar.png'):
    fig, ax = plt.subplots(figsize=(10, 6))
    colors = [G200] * len(labels)
    colors[highlight_idx] = highlight_color
    bars = ax.bar(labels, values, color=colors, width=0.6,
                  zorder=3, edgecolor='white', linewidth=0.5)
    add_value_labels(ax, bars, values, highlight_idx)
    ax.set_title(title, loc='left')
    # ⚠️ add_value_labels already sets ylim automatically, no need to repeat here
    clean_axis(ax)
    save(fig, save_path)
```
---
## Template 2: Trend Comparison Line Chart
**Scenario**: This year vs last year, actual vs target. Main line in colored solid, comparison line in gray dashed.
```python
def trend_compare(x, y_main, y_ref, label_main, label_ref, title,
                  color=C_BLUE, save_path='trend_compare.png'):
    fig, ax = plt.subplots(figsize=(12, 6))
    # Plot against integer positions so lines, fill, ticks, and end labels line up
    pos = np.arange(len(x))
    # Comparison line (gray dashed, at bottom layer)
    ax.plot(pos, y_ref, color=G300, linewidth=1.5, linestyle='--', zorder=2)
    ax.text(len(x)-0.5, y_ref[-1], label_ref, color=G400, fontsize=9, va='center')
    # Main line (colored solid + white-center dots)
    ax.plot(pos, y_main, color=color, linewidth=2.5, marker='o', markersize=5,
            markerfacecolor='white', markeredgewidth=2, markeredgecolor=color, zorder=3)
    ax.text(len(x)-0.5, y_main[-1], f'{label_main} {y_main[-1]:,.0f}',
            color=color, fontsize=10, fontweight='bold', va='center')
    # Difference area
    ax.fill_between(pos, y_ref, y_main, alpha=0.06, color=color)
    ax.set_title(title, loc='left')
    ax.set_xticks(pos)
    ax.set_xticklabels(x)
    clean_axis(ax)
    save(fig, save_path)
```
---
## Template 3: Grouped Bar Chart
**Scenario**: Comparison of multiple categories across multiple dimensions.
```python
def grouped_bar(labels, datasets, series_names, title,
                colors=None, save_path='grouped_bar.png'):
    if colors is None:
        colors = COOL[:len(datasets)]
    fig, ax = plt.subplots(figsize=(12, 6))
    n = len(datasets)
    width = 0.7 / n
    x = np.arange(len(labels))
    for i, (data, name, color) in enumerate(zip(datasets, series_names, colors)):
        offset = (i - n/2 + 0.5) * width
        ax.bar(x + offset, data, width=width*0.85, color=color,
               label=name, zorder=3, edgecolor='white', linewidth=0.3)
    ax.set_title(title, loc='left')
    ax.set_xticks(x)
    ax.set_xticklabels(labels)
    ax.set_ylim(0, max(max(d) for d in datasets) * 1.20)  # Leave 20% space for labels
    ax.legend(loc='upper right', ncol=n)
    clean_axis(ax)
    save(fig, save_path)
```
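The offset formula in `grouped_bar` centers each group of bars on its tick. The arithmetic can be checked in isolation with a small sketch; `group_offsets` is a hypothetical helper that mirrors the `0.7 / n` group width and the `(i - n/2 + 0.5) * width` offset used above.

```python
def group_offsets(n_series, group_width=0.7):
    """Return (offsets, slot_width) for n_series bars sharing one category tick."""
    width = group_width / n_series
    # Shifting by n/2 - 0.5 slots makes the offsets symmetric around zero.
    offsets = [(i - n_series / 2 + 0.5) * width for i in range(n_series)]
    return offsets, width


offsets, width = group_offsets(3)
# Each bar is drawn at 85% of its slot, so neighbors never touch.
drawn_width = width * 0.85
gaps = [b - a for a, b in zip(offsets, offsets[1:])]
```

With three series the middle bar sits exactly on the tick and the outer two are mirrored around it; the distance between neighboring bar centers equals one slot width.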
---
## Template 4: Horizontal Ranking Chart
**Scenario**: Rankings / Top N. Progressive highlight for top items.
```python
def ranking_bar(labels, values, title, top_n=3,
                color=C_BLUE, save_path='ranking.png'):
    from matplotlib.colors import to_rgba
    sorted_pairs = sorted(zip(labels, values), key=lambda x: x[1])
    labels_s, values_s = zip(*sorted_pairs)
    top_n = min(top_n, len(labels_s))  # Never highlight more bars than exist
    fig, ax = plt.subplots(figsize=(10, max(6, len(labels)*0.45)))
    bar_colors = [G200] * len(labels_s)
    for i in range(len(labels_s) - top_n, len(labels_s)):
        progress = (i - (len(labels_s) - top_n)) / max(top_n - 1, 1)
        bar_colors[i] = to_rgba(color, 0.35 + 0.65 * progress)
    bars = ax.barh(range(len(labels_s)), values_s, color=bar_colors,
                   height=0.6, zorder=3, edgecolor='white', linewidth=0.3)
    for i, (bar, val) in enumerate(zip(bars, values_s)):
        is_top = i >= len(labels_s) - top_n
        ax.text(bar.get_width() + max(values_s)*0.01,
                bar.get_y() + bar.get_height()/2,
                f'{val:,.0f}', va='center', fontsize=9,
                color=G900 if is_top else G400)
    ax.set_yticks(range(len(labels_s)))
    ax.set_yticklabels(labels_s)
    ax.set_title(title, loc='left')
    ax.spines['bottom'].set_visible(False)
    ax.xaxis.set_visible(False)
    save(fig, save_path)
```
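The "progressive highlight" in `ranking_bar` is a linear alpha ramp over the top `top_n` bars. A minimal sketch of just that ramp, assuming the bars are already sorted ascending so the largest value is last; `top_alphas` is a hypothetical helper, and `None` marks bars that keep the neutral gray.

```python
def top_alphas(n_bars, top_n=3, lo=0.35, hi=1.0):
    """Alpha per bar: None for non-highlighted bars, lo..hi for the top_n bars."""
    top_n = min(top_n, n_bars)
    start = n_bars - top_n
    alphas = [None] * n_bars
    for i in range(start, n_bars):
        # progress runs from 0 (weakest highlighted bar) to 1 (the top bar)
        progress = (i - start) / max(top_n - 1, 1)
        alphas[i] = lo + (hi - lo) * progress
    return alphas


alphas = top_alphas(5, top_n=3)
```

For five bars and `top_n=3`, the two smallest bars stay gray and the remaining three step up evenly from the lowest to the highest opacity.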
---
## Template 5: Donut Chart
**Scenario**: Proportion distribution (max 5 slices, avoid if possible — bar charts are usually better).
```python
def donut(labels, values, title, center_text=None,
          colors=None, save_path='donut.png'):
    if colors is None:
        colors = COOL[:len(labels)]
    fig, ax = plt.subplots(figsize=(8, 8))
    wedges, _, autotexts = ax.pie(
        values, labels=None, colors=colors, autopct='%1.0f%%',
        startangle=90, pctdistance=0.78,
        wedgeprops=dict(width=0.35, edgecolor='white', linewidth=2))
    for t in autotexts:
        t.set_fontsize(10)
        t.set_fontweight('bold')
    if center_text:
        ax.text(0, 0.06, str(center_text), ha='center', va='center',
                fontsize=28, fontweight='bold', color=G900)
        ax.text(0, -0.1, '总计', ha='center', va='center', fontsize=11, color=G500)
    ax.legend(wedges, labels, loc='center left',
              bbox_to_anchor=(1, 0.5), fontsize=10)
    ax.set_title(title, loc='center', pad=20)
    save(fig, save_path)
```
---
## Template 6: Scatter Plot + Trend Line
**Scenario**: Two-variable correlation analysis.
```python
def scatter_trend(x, y, title, xlabel, ylabel,
                  color=C_BLUE, save_path='scatter.png'):
    fig, ax = plt.subplots(figsize=(10, 7))
    ax.scatter(x, y, c=color, s=50, alpha=0.6,
               edgecolors='white', linewidth=1, zorder=3)
    # Trend line (least-squares fit of degree 1)
    z = np.polyfit(x, y, 1)
    p = np.poly1d(z)
    x_line = np.linspace(min(x), max(x), 100)
    ax.plot(x_line, p(x_line), color=G400, linewidth=1.5,
            linestyle='--', zorder=2, alpha=0.7)
    ax.set_title(title, loc='left')
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)
    clean_axis(ax, grid=True)
    ax.xaxis.grid(True, alpha=0.08, color=G300)
    save(fig, save_path)
```
---
## Template 7: Heatmap
**Scenario**: Matrix data, correlations, time × category.
```python
def heatmap(data, row_labels, col_labels, title,
            cmap_color=C_BLUE, save_path='heatmap.png'):
    from matplotlib.colors import LinearSegmentedColormap
    cmap = LinearSegmentedColormap.from_list('bc', ['#FFFFFF', cmap_color])
    fig, ax = plt.subplots(
        figsize=(max(8, len(col_labels)*1.2), max(6, len(row_labels)*0.6)))
    arr = np.array(data)
    im = ax.imshow(arr, cmap=cmap, aspect='auto')
    vmax = arr.max()
    for i in range(len(row_labels)):
        for j in range(len(col_labels)):
            val = arr[i][j]
            # White text on dark cells, dark gray text on light cells
            color = 'white' if val > vmax * 0.6 else G700
            ax.text(j, i, f'{val:.1f}', ha='center', va='center',
                    fontsize=9, color=color)
    ax.set_xticks(range(len(col_labels)))
    ax.set_yticks(range(len(row_labels)))
    ax.set_xticklabels(col_labels)
    ax.set_yticklabels(row_labels)
    ax.set_title(title, loc='left', pad=16)
    cbar = fig.colorbar(im, ax=ax, fraction=0.03, pad=0.08, shrink=0.8)
    cbar.outline.set_visible(False)
    cbar.ax.tick_params(labelsize=8)  # colorbar ticks should not be too large
    for spine in ax.spines.values():
        spine.set_visible(True)
        spine.set_color(G200)
    save(fig, save_path)
```
---
## Template 8: KPI Metric Cards
**Scenario**: Dashboard top key number display.
```python
def kpi_cards(metrics, save_path='kpi.png'):
    """
    metrics: [{'label': '总收入', 'value': '12.8M',
               'change': '+23%', 'positive': True}, ...]
    """
    from matplotlib.patches import FancyBboxPatch
    n = len(metrics)
    fig, axes = plt.subplots(1, n, figsize=(3.8*n, 2.8))
    if n == 1:
        axes = [axes]
    for ax, m in zip(axes, metrics):
        ax.set_xlim(0, 1); ax.set_ylim(0, 1); ax.axis('off')
        bg = FancyBboxPatch((0.05, 0.05), 0.9, 0.9,
                            boxstyle='round,pad=0.05', facecolor=G50,
                            edgecolor=G200, linewidth=0.8)
        ax.add_patch(bg)
        ax.text(0.5, 0.75, m['label'], ha='center', va='center',
                fontsize=10, color=G500)
        ax.text(0.5, 0.45, m['value'], ha='center', va='center',
                fontsize=24, fontweight='bold', color=G900)
        if 'change' in m:
            is_pos = m.get('positive', True)
            # Up/down marker in front of the change value
            ax.text(0.5, 0.18,
                    f'{"▲" if is_pos else "▼"} {m["change"]}',
                    ha='center', va='center', fontsize=11,
                    color=POS if is_pos else NEG, fontweight='bold')
    save(fig, save_path)
```
---
## Template 9: Radar Chart
**Scenario**: Multi-dimensional capability comparison (max 8 dimensions, max 3 groups).
```python
def radar(categories, datasets, series_names, title,
          colors=None, save_path='radar.png'):
    if colors is None:
        colors = COOL[:len(datasets)]
    N = len(categories)
    angles = np.linspace(0, 2*np.pi, N, endpoint=False).tolist()
    angles += angles[:1]  # Close the polygon
    fig, ax = plt.subplots(figsize=(8, 8), subplot_kw=dict(polar=True))
    ax.set_theta_offset(np.pi / 2)
    ax.set_theta_direction(-1)
    ax.set_xticks(angles[:-1])
    ax.set_xticklabels(categories, fontsize=10)
    ax.yaxis.set_visible(False)
    # Grid beautification
    ax.spines['polar'].set_color(G200)
    ax.grid(color=G200, linewidth=0.5, alpha=0.5)
    for data, name, color in zip(datasets, series_names, colors):
        # Close the polygon; list() keeps this a concatenation even for NumPy arrays
        vals = list(data) + list(data[:1])
        ax.plot(angles, vals, color=color, linewidth=2, label=name)
        ax.fill(angles, vals, color=color, alpha=0.08)
    ax.legend(loc='upper right', bbox_to_anchor=(1.3, 1.1))
    ax.set_title(title, pad=30, fontsize=16, fontweight='bold')
    save(fig, save_path)
```
---
## Template 10: Waterfall Chart
**Scenario**: Show incremental changes from start to end value.
```python
def waterfall(labels, values, title, save_path='waterfall.png'):
    """labels and values correspond; positive=increase, negative=decrease.
    The last item is drawn as the total bar starting from 0, so pass the final total as the last value."""
    fig, ax = plt.subplots(figsize=(12, 6))
    # cumulative[i] = running total before bar i is applied
    cumulative = [0]
    for v in values[:-1]:
        cumulative.append(cumulative[-1] + v)
    bar_colors = []
    for i, v in enumerate(values):
        if i == len(values) - 1:
            bar_colors.append(C_BLUE)  # Total
        elif v >= 0:
            bar_colors.append(POS)  # Increase
        else:
            bar_colors.append(NEG)  # Decrease
    bottoms = []
    for i, v in enumerate(values):
        if i == len(values) - 1:
            bottoms.append(0)  # Total starts from 0
        elif v >= 0:
            bottoms.append(cumulative[i])
        else:
            bottoms.append(cumulative[i] + v)
    bars = ax.bar(labels, [abs(v) for v in values], bottom=bottoms,
                  color=bar_colors, width=0.6, edgecolor='white', linewidth=0.5, zorder=3)
    # Connecting lines
    for i in range(len(values) - 2):
        y = cumulative[i+1]
        ax.plot([i+0.3, i+0.7], [y, y], color=G300, linewidth=0.8, zorder=2)
    # Value labels
    for i, (bar, val) in enumerate(zip(bars, values)):
        y_pos = bar.get_y() + bar.get_height() + max(abs(v) for v in values)*0.01
        prefix = '+' if val > 0 and i < len(values)-1 else ''
        ax.text(bar.get_x()+bar.get_width()/2, y_pos,
                f'{prefix}{val:,.0f}', ha='center', va='bottom',
                fontsize=9, color=G700)
    ax.set_title(title, loc='left')
    clean_axis(ax)
    save(fig, save_path)
```
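The bar geometry in `waterfall` follows from a running total: an increase starts at the running total, a decrease starts at the running total minus its size, and the total bar starts at zero. A plotting-free sketch of that arithmetic, useful for checking that the supplied total matches the steps; `waterfall_layout` is a hypothetical helper and the numbers are made up.

```python
def waterfall_layout(values):
    """Return (bottoms, heights, running_total) for signed steps plus a final total bar."""
    steps, total = values[:-1], values[-1]
    bottoms, heights = [], []
    running = 0
    for v in steps:
        # Increases grow upward from the running total; decreases hang below it.
        bottoms.append(running if v >= 0 else running + v)
        heights.append(abs(v))
        running += v
    bottoms.append(0)  # The total bar always starts from zero
    heights.append(abs(total))
    return bottoms, heights, running


values = [100, 30, -20, 110]  # start, +30, -20, total
bottoms, heights, running = waterfall_layout(values)
total_matches = (running == values[-1])
```

If `total_matches` is false, the last value passed in does not equal the sum of the preceding steps, and the chart's total bar would not line up with the final connector.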
---
## Template 11: Multi-Subplot Dashboard (GridSpec Precision Layout)
**Scenario**: Combine multiple subplots into a dashboard. ⚠️ Max 4 subplots per canvas, split if exceeded.
**Core Principle**: Use `GridSpec` to control each subplot's position and spacing precisely; do not rely on `tight_layout()`.
```python
import matplotlib.gridspec as gridspec

def dashboard(data_dict, title, save_path='dashboard.png'):
    """
    2x2 dashboard layout example.
    data_dict contains data needed for each subplot.
    """
    # ⚠️ Use constrained_layout instead of tight_layout
    fig = plt.figure(figsize=(16, 12), constrained_layout=True)
    fig.suptitle(title, fontsize=20, fontweight='bold', y=0.98)
    # GridSpec: precise spacing control
    gs = gridspec.GridSpec(2, 2, figure=fig,
                           wspace=0.35,  # Column spacing (at least 0.3)
                           hspace=0.35,  # Row spacing (at least 0.3)
                           left=0.08, right=0.92,
                           top=0.92, bottom=0.08)
    # ─── Top-left: bar chart ───
    ax1 = fig.add_subplot(gs[0, 0])
    ax1.set_title('Quarterly Revenue', loc='left', fontsize=13, fontweight='bold')
    # ... bind data ...
    clean_axis(ax1)
    # ─── Top-right: line chart ───
    ax2 = fig.add_subplot(gs[0, 1])
    ax2.set_title('Monthly Trend', loc='left', fontsize=13, fontweight='bold')
    # ... bind data ...
    clean_axis(ax2)
    # ─── Bottom-left: pie chart ───
    ax3 = fig.add_subplot(gs[1, 0])
    ax3.set_title('Category Share', loc='left', fontsize=13, fontweight='bold')
    # ... bind data ...
    # ─── Bottom-right: scatter plot ───
    ax4 = fig.add_subplot(gs[1, 1])
    ax4.set_title('Conversion Analysis', loc='left', fontsize=13, fontweight='bold')
    # ... bind data ...
    clean_axis(ax4)
    fig.savefig(save_path, dpi=200, facecolor='white', bbox_inches='tight')
    plt.close(fig)
```
### Dashboard Layout Golden Rules
| Rule | Description |
|------|-------------|
| **Max 4 subplots** | Split into multiple charts if exceeded |
| **Use `constrained_layout=True`** | Smarter than `tight_layout()`, auto-avoids labels |
| **`wspace/hspace ≥ 0.3`** | Subplot spacing too small causes overlap |
| **Independent title per subplot** | Give each subplot its own `ax.set_title()`; reserve `fig.suptitle()` for the overall dashboard title |
| **Consistent font sizes** | Subplot title 13pt, axis labels 10pt, data labels 9pt (Matplotlib font sizes are in points) |
| **Colorbar in separate column** | If a subplot needs colorbar, allocate `gs[0, 2]` separately |
| **Don't mix legend and direct labels** | Use either all legends or all direct labels in a dashboard |
### Safe Layout with Colorbar
```python
# When a subplot needs a colorbar, use 3-column layout, rightmost column for colorbar
gs = gridspec.GridSpec(2, 3, figure=fig,
width_ratios=[1, 1, 0.05], # Third column very narrow, dedicated to colorbar
wspace=0.4, hspace=0.35)
ax_heat = fig.add_subplot(gs[0, 1])
im = ax_heat.imshow(data, cmap='Blues')
# Colorbar placed in its own subplot position, won't obscure any content
cbar_ax = fig.add_subplot(gs[0, 2])
fig.colorbar(im, cax=cbar_ax)
cbar_ax.set_ylabel('Value', fontsize=10)
```
### Split Strategy for More Than 4 Subplots
```python
# ❌ Wrong: 8 subplots crammed into one canvas
fig, axes = plt.subplots(2, 4, figsize=(20, 10)) # Everything becomes unreadable
# ✅ Correct: split into 2 figures, 4 subplots each
# Figure 1: overview metrics
fig1 = plt.figure(figsize=(16, 12), constrained_layout=True)
# ... 4 subplots ...
fig1.savefig('dashboard_overview.png', dpi=200)
# Figure 2: detailed analysis
fig2 = plt.figure(figsize=(16, 12), constrained_layout=True)
# ... 4 subplots ...
fig2.savefig('dashboard_detail.png', dpi=200)
```
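The split itself is simple chunking. As a minimal sketch, assuming each panel is described by some spec object (plain strings here), a hypothetical `split_panels` helper groups them into figures of at most four:

```python
def split_panels(panels, max_per_figure=4):
    """Group panel specs into chunks of at most max_per_figure."""
    return [panels[i:i + max_per_figure]
            for i in range(0, len(panels), max_per_figure)]


# Made-up panel names, only to show the grouping
panels = ['revenue', 'trend', 'share', 'conversion',
          'churn', 'retention', 'cost', 'margin']
for n, group in enumerate(split_panels(panels), start=1):
    print(f'dashboard_{n}.png -> {group}')
```

Each group can then be passed to its own `plt.figure(...)` call, as in the two-figure example above.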


@@ -0,0 +1,797 @@
# Mermaid Template Library
> **⚠️ Before writing any code, read [`_rules.md`](references/_rules.md) — three non-negotiable rules on overlap, hierarchy, and color.**
Mermaid is the best "text-as-diagram" solution — write structural diagrams with Markdown-like syntax, zero design skills needed.
Best for: flowcharts, sequence diagrams, architecture diagrams, Gantt charts, class diagrams, ER diagrams, mind maps, state diagrams, pie charts, Git branch graphs.
**Core advantages**: text is version-controllable, minimal maintenance cost, high rendering consistency, CJK support.
## ⚠️ Flowchart Quality Rules (Highest Priority)
### Font Size Control
Mermaid font sizes are controlled via `themeVariables` and CSS:
- `fontSize`: Global font size, recommended `14px`-`16px`, **no less than 12px**
- Node text is controlled via `fontSize` or `%%{init:}%%` directive
- Annotations/footnotes use subgraph titles or separate nodes, font size no less than `11px`
### Connectors & Spacing
```javascript
flowchart: {
padding: 32, // Node padding (CJK needs more space)
nodeSpacing: 80, // Horizontal spacing between nodes (default 50 too tight, 60 still not enough)
rankSpacing: 80, // Vertical spacing between ranks
curve: 'basis', // Connection curve style, consistent across the chart
}
```
**Connector style must be consistent throughout the chart**: do not mix straight, curved, and polylines in the same diagram. Mermaid controls this globally via the `curve` parameter.
### ⚠️ Mermaid Flowchart Hard Constraints (MANDATORY)
The following constraints are enforced **when generating Mermaid flowchart code**, not as post-checks:
1. **Node text must be wrapped in quotes**: `A["用户登录"]` ✅ / `A[用户登录]` ❌ — quotes prevent CJK special characters from causing parse errors
2. **Max 10 CJK characters per line in node text**: exceed → use `<br>` to break → `A["用户身份<br>验证模块"]`
3. **Max 5 nodes per subgraph**: exceed → split into multiple subgraphs or switch to CSS approach
4. **Max 10 total nodes**: exceed → switch to CSS flowchart template in `references/playwright-css.md`
5. **Max 6 CJK characters in connector labels**: `-->|验证通过|` ✅ / `-->|用户身份验证通过后跳转|` ❌
6. **Config params must use enlarged values**: `padding: 32, nodeSpacing: 80, rankSpacing: 80`
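Constraints 1, 2 and 5 are easy to enforce at generation time. The sketch below is one possible way to do it in Python; `node` and `edge` are hypothetical helpers, and the wrapping is deliberately naive (it counts every character as one and breaks at a fixed width rather than at a word boundary).

```python
def node(node_id, text, max_chars=10):
    """Return a quoted rectangle node, breaking the label every max_chars characters."""
    lines = [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
    label = '<br>'.join(lines)
    return f'{node_id}["{label}"]'


def edge(src, dst, label='', max_label=6):
    """Return a connector, rejecting labels longer than max_label characters."""
    if len(label) > max_label:
        raise ValueError(f'connector label too long: {label!r}')
    return f'{src} -->|{label}| {dst}' if label else f'{src} --> {dst}'


print(node('A', '用户登录'))
print(node('B', '这是一段比较长的需要换行的文字'))
print(edge('A', 'B', '验证通过'))
```

In practice a break at a semantic boundary (as in `"用户身份<br>验证模块"`) reads better than a fixed-width cut, so treat the output as a starting point.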
## Rendering Methods
### Method 1: Playwright + HTML (Recommended, export PNG/SVG/PDF)
```html
<!DOCTYPE html>
<html lang="zh">
<head>
<meta charset="UTF-8">
<script src="https://cdn.jsdelivr.net/npm/mermaid@11/dist/mermaid.min.js"></script>
<style>
* { margin: 0; padding: 0; box-sizing: border-box; }
body {
background: #FFFFFF;
font-family: -apple-system, BlinkMacSystemFont, 'PingFang SC', 'SimHei', sans-serif;
display: flex;
justify-content: center;
padding: 48px;
}
#diagram { width: fit-content; min-width: 800px; }
</style>
</head>
<body>
<div id="diagram">
<pre class="mermaid">
<!-- Mermaid code goes here -->
</pre>
</div>
<script>
mermaid.initialize({
startOnLoad: true,
theme: 'base',
themeVariables: {
// See "Theme configuration" below
},
flowchart: {
curve: 'basis',
padding: 32, // Node padding (CJK chars 50% wider than Latin, need more space)
nodeSpacing: 80, // Horizontal spacing (prevents CJK node overlap)
rankSpacing: 80, // Vertical spacing between ranks (prevents overlap between levels)
htmlLabels: true, // Enable HTML label rendering, supports line breaks
wrappingWidth: 160, // Max text width before auto-wrap (160 wraps earlier than 200, prevents overly wide nodes)
},
sequence: { mirrorActors: false, messageAlign: 'center' },
gantt: { titleTopMargin: 25, barHeight: 24, barGap: 6 },
});
</script>
</body>
</html>
```
**Python screenshot script**:
```python
import asyncio
import os
from pathlib import Path
from playwright.async_api import async_playwright

async def mermaid_to_png(html_path, png_path, width=1400, scale=2):
    # file:// URLs need an absolute path, so resolve relative inputs first
    url = Path(html_path).resolve().as_uri()
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        page = await browser.new_page(
            viewport={'width': width, 'height': 800},
            device_scale_factor=scale
        )
        await page.goto(url, wait_until='load', timeout=30000)
        # Wait for Mermaid SVG to render
        await page.wait_for_selector('#diagram svg', timeout=15000)
        await page.wait_for_timeout(1000)
        # ⚠️ Read SVG's ACTUAL rendered size (not CSS box model!)
        # Mermaid SVGs often overflow their CSS container — getBBox/clientRect
        # returns the true size, while CSS bounding_box() returns the clipped box.
        svg_size = await page.evaluate('''() => {
            const svg = document.querySelector('#diagram svg');
            if (!svg) return null;
            const r = svg.getBoundingClientRect();
            return { width: r.width, height: r.height };
        }''')
        el = page.locator('#diagram')
        css_bbox = await el.bounding_box()
        svg_w = svg_size['width'] if svg_size else width
        svg_h = svg_size['height'] if svg_size else 800
        css_w = css_bbox['width'] if css_bbox else width
        css_h = css_bbox['height'] if css_bbox else 800
        # Use the LARGER of CSS box and SVG actual size
        fit_w = max(width, int(max(svg_w, css_w) + 200))
        fit_h = int(max(svg_h, css_h) + 200)
        await page.set_viewport_size({'width': fit_w, 'height': fit_h})
        await page.wait_for_timeout(500)
        await el.screenshot(path=png_path)
        await browser.close()
    print(f'{png_path} ({os.path.getsize(png_path)/1024:.0f}KB)')

# asyncio.run(mermaid_to_png('./output/diagram.html', './output/diagram.png'))
```
> **⚠️ CRITICAL: CSS `bounding_box()` vs SVG actual size**
>
> Mermaid generates SVGs that can be wider/taller than their CSS container. `bounding_box()` (Playwright) and `getBoundingClientRect()` on the container return **CSS box model size**, which may be smaller than the SVG's viewBox.
>
> **Always read the SVG element's own `getBoundingClientRect()`** via `page.evaluate()` and use `max(css_size, svg_size)` for viewport dimensions. This is the root cause of the "right side clipped" bug.
>
> Also: `wait_until='load'` is preferred over `'networkidle'` because Mermaid initializes on DOM load; `'networkidle'` can time out if the CDN is slow.
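
The sizing rule in the note can be isolated as a pure function, which makes it easy to test without a browser. This is a sketch only: `fit_viewport` is a hypothetical name, and the margin and fallback values simply mirror the defaults used in the script above.

```python
def fit_viewport(svg_size, css_box, min_width=1400, margin=200, fallback_height=800):
    """Pick a viewport large enough for both the SVG and its CSS container.

    svg_size / css_box are dicts with 'width' and 'height', or None when the
    measurement failed (the fallbacks mirror the script's defaults).
    """
    svg_w = svg_size['width'] if svg_size else min_width
    svg_h = svg_size['height'] if svg_size else fallback_height
    css_w = css_box['width'] if css_box else min_width
    css_h = css_box['height'] if css_box else fallback_height
    return {
        'width': max(min_width, int(max(svg_w, css_w) + margin)),
        'height': int(max(svg_h, css_h) + margin),
    }


# Made-up measurement: the SVG overflows a 1304px-wide container.
print(fit_viewport({'width': 1800.5, 'height': 900}, {'width': 1304, 'height': 640}))
# {'width': 2000, 'height': 1100}
```

Using only the container box here would give a 1504px-wide viewport, which is the "right side clipped" case the note describes.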
### Method 2: Mermaid CLI (mmdc, command line)
```bash
# Installation
npm install -g @mermaid-js/mermaid-cli
# Usage: .mmd file → PNG/SVG/PDF
mmdc -i diagram.mmd -o diagram.png -w 1200 -b transparent
mmdc -i diagram.mmd -o diagram.svg
mmdc -i diagram.mmd -o diagram.pdf
# Specify theme configuration
mmdc -i diagram.mmd -o diagram.png --configFile mermaid-config.json
```
`mermaid-config.json` example:
```json
{
"theme": "base",
"themeVariables": {
"primaryColor": "#EFF6FF",
"primaryBorderColor": "#3B82F6",
"primaryTextColor": "#1E293B",
"lineColor": "#94A3B8",
"secondaryColor": "#F0FDF4",
"tertiaryColor": "#FFF7ED"
}
}
```
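When diagrams are rendered from a Python pipeline, the config file and the `mmdc` call can be generated together. The following is a sketch under assumptions: the helper names are hypothetical, only the flags already shown above (`-i`, `-o`, `-w`, `--configFile`) are used, and `render` silently skips the call when `mmdc` is not installed.

```python
import json
import shutil
import subprocess
from pathlib import Path


def write_mermaid_config(path, theme_variables):
    """Write a config file in the shape shown above."""
    config = {'theme': 'base', 'themeVariables': theme_variables}
    Path(path).write_text(json.dumps(config, indent=2), encoding='utf-8')


def mmdc_command(src, out, config_path, width=1200):
    """Build the mmdc argument list (flags taken from the usage lines above)."""
    return ['mmdc', '-i', str(src), '-o', str(out),
            '-w', str(width), '--configFile', str(config_path)]


def render(src, out, config_path):
    """Run mmdc if it is on PATH; return False when it is not installed."""
    if shutil.which('mmdc') is None:
        return False
    subprocess.run(mmdc_command(src, out, config_path), check=True)
    return True


print(' '.join(mmdc_command('diagram.mmd', 'diagram.png', 'mermaid-config.json')))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces or CJK characters.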
### Method 3: Online Preview
Paste code at [mermaid.live](https://mermaid.live) for instant preview and export.
---
## Theme Configuration (Design System Integration)
Mermaid uses `theme: 'base'` + `themeVariables` for fully custom colors.
The following themes align with the charts skill mood color system:
### Business Professional (Default)
```javascript
themeVariables: {
primaryColor: '#EFF6FF', // Node background (very light blue)
primaryBorderColor: '#3B82F6', // Node border (blue)
primaryTextColor: '#1E293B', // Node text (dark gray-blue)
lineColor: '#94A3B8', // Connectors (gray)
secondaryColor: '#F0FDF4', // Secondary nodes (very light green)
secondaryBorderColor: '#10B981',
secondaryTextColor: '#1E293B',
tertiaryColor: '#FFF7ED', // Tertiary nodes (very light amber)
tertiaryBorderColor: '#F59E0B',
tertiaryTextColor: '#1E293B',
noteBkgColor: '#F8FAFC', // Note background
noteTextColor: '#6B7280',
noteBorderColor: '#E2E8F0',
fontSize: '14px',
fontFamily: '-apple-system, BlinkMacSystemFont, PingFang SC, SimHei, sans-serif',
}
```
### Tech Dark
```javascript
themeVariables: {
primaryColor: '#1E293B',
primaryBorderColor: '#3B82F6',
primaryTextColor: '#F1F5F9',
lineColor: '#475569',
secondaryColor: '#0F2E1F',
secondaryBorderColor: '#10B981',
secondaryTextColor: '#F1F5F9',
tertiaryColor: '#1A1625',
tertiaryBorderColor: '#8B5CF6',
tertiaryTextColor: '#F1F5F9',
noteBkgColor: '#0F172A',
noteTextColor: '#94A3B8',
noteBorderColor: '#334155',
fontSize: '14px',
fontFamily: '-apple-system, BlinkMacSystemFont, PingFang SC, SimHei, sans-serif',
background: '#0F172A',
}
```
---
## Template 1: Flowchart
The most common chart type. Supports directions: `TB` (top→bottom), `LR` (left→right), `BT`, `RL`.
```mermaid
flowchart TB
A["开始"] --> B{"条件判断"}
B -->|是| C["执行操作A"]
B -->|否| D["执行操作B"]
C --> E["结束"]
D --> E
style A fill:#EFF6FF,stroke:#3B82F6,stroke-width:2px,color:#1E293B
style B fill:#FFF7ED,stroke:#F59E0B,stroke-width:2px,color:#1E293B
style C fill:#F0FDF4,stroke:#10B981,stroke-width:2px,color:#1E293B
style D fill:#F0FDF4,stroke:#10B981,stroke-width:2px,color:#1E293B
style E fill:#EFF6FF,stroke:#3B82F6,stroke-width:2px,color:#1E293B
```
### Node Shape Quick Reference
| Syntax | Shape | Use For |
|------|------|--------|
| `A[text]` | Rectangle | Steps/Actions |
| `A(text)` | Rounded rect | General nodes |
| `A([text])` | Stadium | Start/End |
| `A{text}` | Diamond | Decision |
| `A{{text}}` | Hexagon | Preparation |
| `A[/text/]` | Parallelogram | Input/Output |
| `A((text))` | Circle | Connector |
| `A>text]` | Flag | Event/Signal |
### Subgraphs (Grouping)
```mermaid
flowchart LR
subgraph 前端["🖥️ 前端"]
A["React App"] --> B["API 调用"]
end
subgraph 后端["⚙️ 后端"]
C["FastAPI"] --> D[("PostgreSQL")]
end
B --> C
style 前端 fill:#EFF6FF,stroke:#3B82F6,stroke-width:1px
style 后端 fill:#F0FDF4,stroke:#10B981,stroke-width:1px
```
---
## Template 2: Sequence Diagram
Shows interaction sequence between systems/actors.
```mermaid
sequenceDiagram
actor 用户
participant 前端 as 🖥️ 前端
participant API as ⚙️ API 网关
participant DB as 🗄️ 数据库
用户->>前端: 点击登录
前端->>API: POST /auth/login
API->>DB: 查询用户
DB-->>API: 用户信息
alt 验证成功
API-->>前端: 200 + JWT Token
前端-->>用户: 跳转首页
else 验证失败
API-->>前端: 401 未授权
前端-->>用户: 显示错误提示
end
Note over 前端,API: Token 有效期 24 小时
```
### Arrow Types
| Syntax | Meaning |
|------|------|
| `->>` | Solid arrow (synchronous call) |
| `-->>` | Dashed arrow (return/response) |
| `-x` | Solid with x (failure/rejection) |
| `-)` | Async message |
---
## Template 3: Architecture Diagram (C4 Style via Subgraphs)
```mermaid
flowchart TB
subgraph 用户层["👤 用户层"]
U1["Web 浏览器"]
U2["移动 App"]
end
subgraph 接入层["🌐 接入层"]
GW["API Gateway<br><small>Nginx + Rate Limit</small>"]
LB["负载均衡<br><small>Round Robin</small>"]
end
subgraph 服务层["⚙️ 微服务"]
S1["用户服务<br><small>FastAPI</small>"]
S2["内容服务<br><small>FastAPI</small>"]
S3["推荐服务<br><small>PyTorch</small>"]
end
subgraph 数据层["🗄️ 数据层"]
DB[("PostgreSQL")]
RD[("Redis Cache")]
ES[("Elasticsearch")]
end
U1 & U2 --> GW
GW --> LB
LB --> S1 & S2 & S3
S1 --> DB & RD
S2 --> DB & ES
S3 --> RD & ES
style 用户层 fill:#EFF6FF,stroke:#3B82F6,stroke-width:1.5px
style 接入层 fill:#FFF7ED,stroke:#F59E0B,stroke-width:1.5px
style 服务层 fill:#F0FDF4,stroke:#10B981,stroke-width:1.5px
style 数据层 fill:#F5F3FF,stroke:#8B5CF6,stroke-width:1.5px
```
---
## Template 4: Gantt Chart
```mermaid
gantt
title 项目里程碑计划
dateFormat YYYY-MM-DD
axisFormat %m/%d
section 需求阶段
需求调研 :done, req1, 2024-01-01, 14d
需求评审 :done, req2, after req1, 3d
section 开发阶段
后端开发 :active, dev1, after req2, 21d
前端开发 :active, dev2, after req2, 18d
联调测试 : dev3, after dev1, 7d
section 上线阶段
灰度发布 : rel1, after dev3, 3d
全量上线 :milestone, rel2, after rel1, 0d
```
---
## Template 5: Class Diagram
```mermaid
classDiagram
class User {
+String name
+String email
+login()
+logout()
}
class Order {
+int id
+Date created_at
+float total
+submit()
+cancel()
}
class Product {
+String name
+float price
+int stock
}
User "1" --> "*" Order : 下单
Order "*" --> "*" Product : 包含
```
---
## Template 6: ER Diagram
```mermaid
erDiagram
USER {
int id PK
string name
string email UK
datetime created_at
}
ORDER {
int id PK
int user_id FK
float total
string status
datetime created_at
}
ORDER_ITEM {
int id PK
int order_id FK
int product_id FK
int quantity
float price
}
PRODUCT {
int id PK
string name
float price
int stock
}
USER ||--o{ ORDER : "下单"
ORDER ||--|{ ORDER_ITEM : "包含"
PRODUCT ||--o{ ORDER_ITEM : "被购买"
```
---
## Template 7: Mind Map
> ⚠️ Mermaid mindmap has limited layout capabilities. **For high-quality mind maps**, prefer `references/mindmap-css.md`.
> The following approach is for **quick drafts** or embedding in Markdown documents, with CSS injection to optimize visual quality.
### Optimized HTML Shell (Important! Use this, not the default template)
Mermaid mindmap doesn't support `style`/`classDef`, but you can greatly improve results with **CSS overriding SVG styles** + **themeVariables**:
```html
<!DOCTYPE html>
<html lang="zh">
<head>
<meta charset="UTF-8">
<script src="https://cdn.jsdelivr.net/npm/mermaid@11/dist/mermaid.min.js"></script>
<style>
* { margin: 0; padding: 0; box-sizing: border-box; }
body {
background: #FFFFFF;
font-family: -apple-system, BlinkMacSystemFont, 'PingFang SC', 'SimHei', sans-serif;
display: flex;
justify-content: center;
padding: 48px;
}
#diagram { min-width: 900px; }
/* ─── CSS injection to optimize Mermaid mindmap rendering ─── */
/*
* Actual SVG class names in Mermaid v11 mindmap:
* - .section-root = root node
* - .section-0 ~ .section-N = first-level branches (in order)
* - .section-edge-0 ~ .section-edge-N = connectors for corresponding branches
* - .node-bkg = node background path
* - .node-line- = node bottom decoration line
* - .nodeLabel = text label
* - .edge = connector
* - .edge-depth-1/5 = connector depth level
*/
/* 1. All connectors: rounded, soft */
.edge { stroke-width: 2px !important; stroke-linecap: round !important; }
/* 2. Remove node bottom decoration line (ugly by default) */
.node-line- { stroke: transparent !important; }
/* 3. Root node: deep blue circle + shadow */
.section-root circle,
.section-root ellipse {
fill: #1E40AF !important;
stroke: #1E3A8A !important;
stroke-width: 3px !important;
filter: drop-shadow(0 4px 12px rgba(30,64,175,0.35));
}
.section-root .nodeLabel { color: #FFFFFF !important; font-size: 17px !important; font-weight: 700 !important; }
/* 4. First-level branches colored in order (supports up to 8-color cycle) */
/* Blue */
.section-0 .node-bkg { fill: #DBEAFE !important; stroke: #3B82F6 !important; stroke-width: 2px !important; }
.section-0 .nodeLabel { color: #1E40AF !important; font-weight: 600 !important; font-size: 14px !important; }
.section-edge-0 { stroke: #93C5FD !important; }
/* Green */
.section-1 .node-bkg { fill: #D1FAE5 !important; stroke: #10B981 !important; stroke-width: 2px !important; }
.section-1 .nodeLabel { color: #065F46 !important; font-weight: 600 !important; font-size: 14px !important; }
.section-edge-1 { stroke: #6EE7B7 !important; }
/* Amber */
.section-2 .node-bkg { fill: #FEF3C7 !important; stroke: #F59E0B !important; stroke-width: 2px !important; }
.section-2 .nodeLabel { color: #92400E !important; font-weight: 600 !important; font-size: 14px !important; }
.section-edge-2 { stroke: #FCD34D !important; }
/* Purple */
.section-3 .node-bkg { fill: #EDE9FE !important; stroke: #8B5CF6 !important; stroke-width: 2px !important; }
.section-3 .nodeLabel { color: #5B21B6 !important; font-weight: 600 !important; font-size: 14px !important; }
.section-edge-3 { stroke: #C4B5FD !important; }
/* Red */
.section-4 .node-bkg { fill: #FEE2E2 !important; stroke: #EF4444 !important; stroke-width: 2px !important; }
.section-4 .nodeLabel { color: #991B1B !important; font-weight: 600 !important; font-size: 14px !important; }
.section-edge-4 { stroke: #FCA5A5 !important; }
/* Cyan */
.section-5 .node-bkg { fill: #CFFAFE !important; stroke: #06B6D4 !important; stroke-width: 2px !important; }
.section-5 .nodeLabel { color: #155E75 !important; font-weight: 600 !important; font-size: 14px !important; }
.section-edge-5 { stroke: #67E8F9 !important; }
/* Pink */
.section-6 .node-bkg { fill: #FCE7F3 !important; stroke: #EC4899 !important; stroke-width: 2px !important; }
.section-6 .nodeLabel { color: #9D174D !important; font-weight: 600 !important; font-size: 14px !important; }
.section-edge-6 { stroke: #F9A8D4 !important; }
/* Gray-green */
.section-7 .node-bkg { fill: #D1FAE5 !important; stroke: #059669 !important; stroke-width: 2px !important; }
.section-7 .nodeLabel { color: #064E3B !important; font-weight: 600 !important; font-size: 14px !important; }
.section-edge-7 { stroke: #6EE7B7 !important; }
/* 5. Deeper connectors are lighter */
.edge-depth-5 { stroke-width: 1.5px !important; opacity: 0.6; }
/* 6. Light gray background */
body { background: #FAFBFE; }
</style>
</head>
<body>
<div id="diagram">
<pre class="mermaid">
mindmap
  root((你的主题))
    一级分支1
      二级内容A
      二级内容B
    一级分支2
      二级内容C
</pre>
</div>
<script>
mermaid.initialize({
startOnLoad: true,
theme: 'base',
themeVariables: {
primaryColor: '#EFF6FF',
primaryBorderColor: '#3B82F6',
primaryTextColor: '#1E293B',
lineColor: '#CBD5E1',
fontSize: '13px',
fontFamily: '-apple-system, BlinkMacSystemFont, PingFang SC, SimHei, sans-serif',
},
mindmap: {
padding: 20,
useMaxWidth: false,
}
});
</script>
</body>
</html>
```
### Auto-Upgrade Rules (Important!)
> ⚠️ **Never trim user content just to fit Mermaid!**
> Content comes first; tools serve content, not the other way around.
When content complexity exceeds Mermaid mindmap's comfort zone, **auto-switch to CSS approach** (`references/mindmap-css.md`):
| Trigger Condition (any one met) | Action |
|----------------------|------|
| More than 7 L1 branches | → Switch to CSS |
| Any branch has >8 child nodes | → Switch to CSS |
| Nesting exceeds 3 levels | → Switch to CSS |
| Single node text >15 chars | → Switch to CSS |
| Total nodes >40 | → Switch to CSS |
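The trigger table can be applied mechanically before any diagram code is written. The sketch below is one possible check, with several assumptions made explicit: the tree is a nested dict of `label`/`children`, nesting depth is counted in levels below the root, the total includes the root, and the root's own fan-out is only checked by the L1 rule. `css_upgrade_reasons` is a hypothetical name.

```python
def css_upgrade_reasons(tree):
    """Return the triggers from the table that this tree hits (empty = Mermaid is fine).

    tree is a nested dict: {'label': str, 'children': [subtree, ...]}.
    """
    stats = {'total': 0, 'max_children': 0, 'max_depth': 0, 'max_text': 0}

    def walk(node, depth):
        children = node.get('children', [])
        stats['total'] += 1
        stats['max_depth'] = max(stats['max_depth'], depth)
        stats['max_text'] = max(stats['max_text'], len(node['label']))
        if depth > 0:  # the root's fan-out is covered by the L1 rule
            stats['max_children'] = max(stats['max_children'], len(children))
        for child in children:
            walk(child, depth + 1)

    walk(tree, 0)
    checks = [
        (len(tree.get('children', [])) > 7, 'more than 7 L1 branches'),
        (stats['max_children'] > 8, 'a branch has more than 8 children'),
        (stats['max_depth'] > 3, 'nesting exceeds 3 levels'),
        (stats['max_text'] > 15, 'a node has more than 15 characters'),
        (stats['total'] > 40, 'more than 40 nodes in total'),
    ]
    return [reason for hit, reason in checks if hit]


small = {'label': 'AI', 'children': [{'label': 'Plan'}, {'label': 'Ship'}]}
wide = {'label': 'AI', 'children': [{'label': f'B{i}'} for i in range(8)]}
print(css_upgrade_reasons(small))  # []
print(css_upgrade_reasons(wide))   # ['more than 7 L1 branches']
```

Returning the list of reasons rather than a boolean makes it easy to tell the user why the CSS approach was chosen.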
**When none of the above triggers**, Mermaid mindmap is adequate. The following suggestions help optimize rendering (recommendations, not hard limits):
| Suggestion | Notes |
|------|------|
| Keep node text concise | Use spaces to segment long text, avoid punctuation |
| Use emoji prefixes per branch | Higher visual distinctiveness |
| Use CSS injection for coloring | Use the optimized HTML shell above |
### Example (Optimized)
```mermaid
mindmap
  root((AI 内容运营))
    选题策划
      热点扫描
      竞品分析
      用户调研
    内容生产
      长文
      短文
      视频
    渠道分发
      微信生态
      小红书
      B站
    数据运营
      数据分析
      评论互动
      持续优化
```
### Known Limitations of Mermaid Mindmap
- ❌ No `style` / `classDef` support for direct node coloring (CSS injection of SVG styles only)
- ❌ Line thickness/curvature cannot be controlled from Mermaid syntax (override the `.edge` class via CSS, as in the shell above)
- ❌ Node spacing calculated by algorithm, cannot be manually specified
- ❌ Long CJK text easily overlaps with connectors (strict character count control needed)
- ⚠️ CSS injection depends on Mermaid internal class naming, may break on version upgrades
**Conclusion**: For quick drafts use Mermaid + the CSS-optimized shell above; for production output use CSS mind map → `references/mindmap-css.md`
---
## Template 8: State Diagram
```mermaid
stateDiagram-v2
[*] --> 草稿
草稿 --> 审核中 : 提交审核
审核中 --> 已发布 : 审核通过
审核中 --> 草稿 : 退回修改
已发布 --> 已下架 : 违规/过期
已下架 --> 草稿 : 重新编辑
已发布 --> [*] : 永久删除
```
---
## Template 9: Git Branch Graph
```mermaid
gitGraph
commit id: "init"
branch develop
checkout develop
commit id: "feat: 用户模块"
commit id: "feat: 订单模块"
branch feature/payment
checkout feature/payment
commit id: "feat: 支付接入"
commit id: "fix: 金额精度"
checkout develop
merge feature/payment id: "merge: 支付"
checkout main
merge develop id: "release: v1.0"
commit id: "hotfix: 安全补丁" type: REVERSE
```
---
## Styling Tips
### Single Node Style
```mermaid
style 节点ID fill:#EFF6FF,stroke:#3B82F6,stroke-width:2px,color:#1E293B
```
### Batch Styles (classDef)
```mermaid
flowchart LR
classDef blue fill:#EFF6FF,stroke:#3B82F6,stroke-width:1.5px,color:#1E293B
classDef green fill:#F0FDF4,stroke:#10B981,stroke-width:1.5px,color:#1E293B
classDef amber fill:#FFF7ED,stroke:#F59E0B,stroke-width:1.5px,color:#1E293B
A["步骤1"]:::blue --> B{"判断"}:::amber
B -->|是| C["结果A"]:::green
B -->|否| D["结果B"]:::green
```
### Connector Styles
```mermaid
%% Style for the N-th connector (0-indexed)
linkStyle 0 stroke:#3B82F6,stroke-width:2px
linkStyle 1 stroke:#10B981,stroke-width:2px,stroke-dasharray: 5 5
```
---
## Mermaid vs Other Approaches
| Capability | Mermaid | Playwright+CSS | draw.io |
|------|---------|---------------|---------|
| Learning curve | ✅ Very low (Markdown-like) | Medium (HTML/CSS) | ✅ Very low (drag&drop) |
| Version control friendly | ✅ Plain text | ✅ Plain text | ⚠️ XML (text, but hard to diff) |
| Flowcharts | ✅ Built-in | ⚠️ Manual layout | ✅ Drag&drop |
| Sequence diagrams | ✅ Built-in | ❌ Very complex | ✅ Templates |
| Gantt charts | ✅ Built-in | ❌ Build from scratch | ⚠️ Limited |
| Class/ER diagrams | ✅ Built-in | ❌ Not suited | ✅ Templates |
| Visual freedom | ⚠️ Limited | ✅ Full freedom | ✅ Free |
| PNG export | ✅ mmdc/Playwright | ✅ Playwright | ✅ Built-in |
| CJK support | ✅ Native | ✅ Font config | ✅ Native |
| Auto layout | ✅ Automatic | ❌ Manual | ⚠️ Semi-auto |
**Principle: Use Mermaid for structural/relationship diagrams, Playwright+CSS for creative design diagrams.**
---
## FAQ
### Q: CJK node names cause layout issues?
Ensure `fontFamily` includes a CJK font:
```javascript
themeVariables: {
fontFamily: '-apple-system, PingFang SC, SimHei, sans-serif'
}
```
### Q: How to control node spacing?
Mermaid lays out nodes automatically; adjust spacing via config, using the enlarged values required by the hard constraints:
```javascript
flowchart: { padding: 32, nodeSpacing: 80, rankSpacing: 80 }
```
### Q: Chart too large?
- Split into subgraphs
- Change direction (`TB` too tall → switch to `LR`)
- Use `mmdc -w 1600` to increase canvas width
### Q: How to add line breaks in nodes?
Use `<br>` tags:
```mermaid
A["第一行<br>第二行<br><small>小字注释</small>"]
```
### Q: Flowchart node text truncated or overlapping?
**Common causes and fixes**:
1. **Insufficient node padding**: ensure `flowchart.padding` is at least `32` (CJK chars are ~50% wider than Latin)
2. **Text too long**: use `<br>` for manual line breaks, or shorten text
3. **Canvas too narrow**: use `width: fit-content` on `#diagram` container (🚫 NEVER use `max-width` — Mermaid SVG width is unpredictable)
4. **Node spacing too small**: increase `nodeSpacing` and `rankSpacing` (recommended `80`; 60 is still too tight)
5. **When using classDef**: ensure `font-size` isn't too large, 12-14px is ideal
**Correct approach for long-text nodes**:
```mermaid
flowchart LR
A["这是一段比较长的<br>需要换行的文字"]
B["用引号包裹节点文字<br>可以使用 HTML 标签"]
```
**Key configuration**:
```javascript
mermaid.initialize({
flowchart: {
padding: 32,
nodeSpacing: 80,
rankSpacing: 80,
htmlLabels: true,
wrappingWidth: 160,
}
});
```


@@ -0,0 +1,911 @@
# CSS Mind Map Rendering Engine
> **⚠️ Before writing any code, read [`_rules.md`](references/_rules.md) — three non-negotiable rules on overlap, hierarchy, and color.**
**Core principle: Content-driven, not template-driven. First analyze the content structure, then decide on the layout, and finally render.**
---
## Step 1: Content Analysis
After receiving a mind map requirement, **don't write any HTML/CSS yet**. First parse the content into a tree structure, then calculate key metrics.
### 1.1 Build Tree JSON
```
Input: "How to work efficiently from home", branches include "Environment Setup", "Time Management", "Tool Selection"...
Output:
{
"root": "在家如何高效办公",
"branches": [
{
"label": "环境准备",
"children": ["独立工作区", "降噪耳机", "人体工学椅"]
},
{
"label": "时间管理",
"children": [
{ "label": "番茄工作法", "children": ["25分钟专注", "5分钟休息"] },
"日程规划",
"避免多任务"
]
},
...
]
}
```
### 1.2 Key Metrics
| Metric | Calculation Method | What It Affects |
|------|---------|---------|
| `branchCount` | Number of first-level branches | Choose single-sided/dual-sided expansion |
| `maxDepth` | Maximum nesting depth | Canvas width, whether sub-branch is needed |
| `maxChildren` | Maximum children per node | Vertical height of that branch |
| `totalNodes` | Total number of all nodes | Overall canvas size |
| `maxTextLen` | Longest text character count | Node width, whether line breaks are needed |
| `branchWeights[]` | Total descendants per first-level branch | Left-right balance allocation |
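The metrics can be computed with a single tree walk. The sketch below assumes the JSON shape from 1.1 (children are either plain strings or nested objects) and makes a few interpretive choices that the table leaves open: depth is counted in levels below the root, `totalNodes` includes the root, `maxChildren` ignores the root, and a branch's weight counts its descendants but not the branch node itself. `analyze` is a hypothetical name, and the sample uses only the two branches visible in the 1.1 example.

```python
def analyze(tree):
    """Compute the Step 1 metrics for a tree in the JSON shape from 1.1."""
    def label(node):
        return node if isinstance(node, str) else node['label']

    def kids(node):
        return [] if isinstance(node, str) else node.get('children', [])

    def walk(node, depth):
        """Yield (node, depth) for a node and everything below it."""
        yield node, depth
        for child in kids(node):
            yield from walk(child, depth + 1)

    per_branch = [list(walk(b, 1)) for b in tree['branches']]
    nodes = [pair for branch in per_branch for pair in branch]
    return {
        'branchCount': len(tree['branches']),
        'maxDepth': max((d for _, d in nodes), default=0),
        'maxChildren': max((len(kids(n)) for n, _ in nodes), default=0),
        'totalNodes': 1 + len(nodes),  # +1 for the root
        'maxTextLen': max([len(tree['root'])] + [len(label(n)) for n, _ in nodes]),
        'branchWeights': [len(branch) - 1 for branch in per_branch],
    }


sample = {
    'root': '在家如何高效办公',
    'branches': [
        {'label': '环境准备', 'children': ['独立工作区', '降噪耳机', '人体工学椅']},
        {'label': '时间管理', 'children': [
            {'label': '番茄工作法', 'children': ['25分钟专注', '5分钟休息']},
            '日程规划',
            '避免多任务',
        ]},
    ],
}
print(analyze(sample))
```

For this sample the result is `branchCount=2`, `maxDepth=3`, `maxChildren=3`, `totalNodes=11`, `maxTextLen=8`, `branchWeights=[3, 5]`.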
### 1.3 Example Analysis
```
Content: "产品经理核心技能", 6 L1 branches, max depth 3, total 35 nodes, max text length 12 chars
→ branchCount=6, maxDepth=3, totalNodes=35, maxTextLen=12
→ branchWeights=[8, 5, 7, 4, 6, 5]
```
---
## Step 2: Layout Decision
Based on the metrics from Step 1, choose a layout scheme. The following are reference suggestions, not hard rules—adjust flexibly according to actual content.
### 2.1 Layout Selection Guide
**Core principle: Use simple layouts for less content, dual-sided layouts for more content, avoid one side becoming too long.**
```
Few branches (roughly ≤4), simple content?
→ Style A: Right-expanding tree (single-sided, compact)
Many branches (roughly ≥5), or single side would be too long?
→ Style B: Left-right expanding tree (dual-sided balanced, most common)
User explicitly requests a specific form?
→ "cards"/"modules" → Style C: Card grid
→ "fishbone"/"root cause analysis" → Style D: Fishbone diagram
```
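The guide above can be written as a small decision function. This is a sketch, not a rule: the thresholds are the "roughly" values from the guide, the keyword matching is a naive assumption about how a user request might be phrased, and the "single side would be too long" case is not modelled. `choose_layout` is a hypothetical name.

```python
def choose_layout(branch_count, request=''):
    """Map the guide above to a style letter: 'A', 'B', 'C' or 'D'."""
    text = request.lower()
    if 'card' in text or 'module' in text:
        return 'C'   # card grid
    if 'fishbone' in text or 'root cause' in text:
        return 'D'   # fishbone diagram
    # No explicit request: pick by branch count
    return 'A' if branch_count <= 4 else 'B'


print(choose_layout(3))                         # A
print(choose_layout(6))                         # B
print(choose_layout(6, 'fishbone, please'))     # D
```

An explicit user request is checked first, so it overrides the branch-count heuristic.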
**Why not use radial layout?** Radial layout (spreading from center outward) almost inevitably causes node overlap when there are more than 3-4 branches, and is extremely difficult to debug. Not recommended.
### 2.2 Canvas Size Calculation
Don't use fixed sizes. Calculate based on content:
```python
def calc_canvas(branch_count, max_depth, total_nodes, max_text_len, layout):
    # Width: depends on depth and text length
    node_width = max(120, max_text_len * 14)  # ~14px per CJK char (including padding)
    if layout == 'left-right':
        width = node_width * max_depth * 2 + 200  # Dual-sided, 200px in center for root node
    else:
        width = node_width * max_depth + 200  # Single-sided
    # Height: depends on max branch vertical expansion
    if layout == 'left-right':
        max_side_nodes = total_nodes // 2 + 2  # rough estimate of single-side nodes
    else:
        max_side_nodes = total_nodes
    height = max_side_nodes * 32 + 200  # ~32px per leaf (including gap)
    # Lower bounds
    width = max(width, 1200)
    height = max(height, 600)
    return width, height
```
### 2.3 Left-Right Branch Allocation (Style B)
Goal: Achieve similar visual weight on both sides.
```python
def balance_branches(branch_weights):
    """Greedy partition: visit branches by descending weight and give each to the currently lighter side"""
    indexed = sorted(enumerate(branch_weights), key=lambda x: -x[1])
    left, right = [], []
    left_sum, right_sum = 0, 0
    for idx, weight in indexed:
        if left_sum <= right_sum:
            left.append(idx)
            left_sum += weight
        else:
            right.append(idx)
            right_sum += weight
    return left, right

# e.g.: weights=[8,5,7,4,6,5] → left=[0,1,5] right=[2,4,3] → 18 vs 17
```
### 2.4 Node Styling Decision
**Principle: The deeper the level, the lighter the visual weight.** This way readers can distinguish main branches from details at a glance.
```
Root node → most prominent: dark solid background, large font (~20px)
L1 branches → next prominent: light fill + colored border, medium font (~15px)
L2 nodes → lighter still: paler fill + thin border, medium-small font (~13px)
Leaf nodes → lightest: capsule frame or plain text, small font (~12-13px)
```
**Key: Leaves should not have the same visual weight as first-level branches.** Leaves' padding, gap, and border thickness should all be significantly smaller than their parent. Otherwise the chart will be vertically too long and lack hierarchy.
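One way to keep the hierarchy consistent is to generate the per-level CSS from a single table. In the sketch below the font sizes follow the list above and the selectors are the node classes used elsewhere in this document; the padding and border values are illustrative assumptions, chosen only so that each level is lighter than its parent.

```python
# Font sizes follow the list above; padding and border values are illustrative assumptions.
LEVELS = [
    ('.root-node',   {'font-size': '20px', 'padding': '14px 28px', 'border-width': '0'}),
    ('.branch-node', {'font-size': '15px', 'padding': '8px 18px',  'border-width': '2px'}),
    ('.sub-node',    {'font-size': '13px', 'padding': '5px 12px',  'border-width': '1px'}),
    ('.leaf',        {'font-size': '12px', 'padding': '2px 8px',   'border-width': '1px'}),
]


def level_css(levels=LEVELS):
    """Render one CSS rule per level, in order from root to leaf."""
    rules = []
    for selector, props in levels:
        body = ' '.join(f'{name}: {value};' for name, value in props.items())
        rules.append(f'{selector} {{ {body} }}')
    return '\n'.join(rules)


print(level_css())
```

Keeping the values in one ordered list makes it easy to verify that font size and padding never increase with depth.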
---
## Step 3: Rendering
Based on the decisions from Step 2, generate HTML + CSS + JS.
### 3.1 Playwright Screenshot (Universal)
```python
import asyncio
import os
from pathlib import Path
from playwright.async_api import async_playwright

async def mindmap_to_png(html_path, png_path, width=1600):
    # file:// URLs need an absolute path, so resolve relative inputs first
    url = Path(html_path).resolve().as_uri()
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        page = await browser.new_page(viewport={'width': width, 'height': 1200}, device_scale_factor=2)
        await page.goto(url, wait_until='networkidle')
        await page.wait_for_timeout(500)
        el = page.locator('#mindmap')
        bbox = await el.bounding_box()
        # First expansion: ensure content is not clipped
        expand_w = max(width, int(bbox['width'] + 100))
        expand_h = int(bbox['height'] + 100)
        await page.set_viewport_size({'width': expand_w, 'height': expand_h})
        await page.wait_for_timeout(200)
        # Call connector script
        await page.evaluate('if(typeof drawAllLines==="function") drawAllLines()')
        await page.wait_for_timeout(200)
        # Second contraction: measure actual content right edge, trim right-side blank space
        trim = await page.evaluate('''() => {
            const map = document.getElementById('mindmap');
            const nodes = map.querySelectorAll('.root-node,.branch-node,.sub-node,.leaf,.deep-node');
            const mapRect = map.getBoundingClientRect();
            let maxR = 0, maxB = 0;
            nodes.forEach(n => {
                const r = n.getBoundingClientRect();
                maxR = Math.max(maxR, r.right - mapRect.left);
                maxB = Math.max(maxB, r.bottom - mapRect.top);
            });
            return { contentW: Math.ceil(maxR) + 80, contentH: Math.ceil(maxB) + 80 };
        }''')
        await page.set_viewport_size({'width': trim['contentW'], 'height': trim['contentH']})
        await page.wait_for_timeout(200)
        # Redraw connectors (viewport changed so coordinates change)
        await page.evaluate('if(typeof drawAllLines==="function") drawAllLines()')
        await page.wait_for_timeout(200)
        await el.screenshot(path=png_path)
        await browser.close()
    print(f'{png_path} ({os.path.getsize(png_path)/1024:.0f}KB)')
```
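The `file://{html_path}` URL in the script above only resolves when `html_path` is an absolute path. A minimal sketch of one way to normalize the path first, using only the standard library; the helper name `to_file_url` and the sample path are hypothetical and not part of the original script:

```python
from pathlib import Path


def to_file_url(html_path: str) -> str:
    """Return a file:// URL for html_path, resolving relative paths first."""
    # resolve() makes the path absolute; as_uri() adds the scheme and percent-encodes it
    return Path(html_path).resolve().as_uri()


url = to_file_url("output/mindmap.html")
print(url)
```

The returned string could be passed to `page.goto(...)` in place of the f-string.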
### 3.2 Universal Recursive Connector Script v4-fix (All Styles)
This script automatically handles tree connectors of **any depth** (3, 4, 5 levels... all work). HTML for styles A and B should include this script at the end.
**⚠️ DOM Structure Convention (connector script depends on this structure):**
```
#mindmap
.tree-layout ← flex container (for left-right tree)
.left-side ← left branch area
.branch.c-{color} ← L1 branch (must have color class)
.branch-node ← L1 node
.children ← L2 container
div ← child wrapper
.sub-node ← L2 node
.leaf-group ← L3 container
div / .leaf ← leaf or wrapper with deeper levels
.leaf ← L3 leaf
.deep-group ← L4 container (recursive, same as above)
.deep-node ← L4/L5 node
.center-root
.root-node ← root node
.right-side ← right branches (same structure as .left-side)
```
**Also supports legacy structure (.branches/.right-branches/.left-branches), backward compatible.**
**⚠️ CSS Indentation Rules (connectors depend on child nodes having offset relative to parent, no indentation = broken connectors):**
```css
/* .tree-layout structure (new version) must include these paddings */
.left-side .children, .left-side .leaf-group, .left-side .deep-group {
align-items: flex-end; padding-right: 16px;
}
.right-side .children, .right-side .leaf-group, .right-side .deep-group {
align-items: flex-start; padding-left: 16px;
}
```
**Common causes of missing connectors/layout errors:**
- **⚠️ `.sub-branch` missing `display: flex`** (most critical! Without flex, `flex-direction: row-reverse` doesn't work, left-side leaves won't expand left, instead all pile up on the right)
- **⚠️ Child node containers missing padding-left/padding-right** (without indentation, child nodes align with parent, midX is outside child nodes, connectors break)
- **⚠️ `.lr-tree` / `.tree` should not have `z-index`** (creates stacking context, covers SVG connectors)
- **⚠️ Leaf nodes must NOT stretch to equal width** — each leaf should size to its own text content (`white-space: nowrap` or `width: fit-content`). Never add `width: 100%`, `flex-grow: 1`, or `align-items: stretch` to leaf containers. Leaves with shorter text should be narrower than leaves with longer text.
- Leaf container not named `.leaf-group` (using `.child-list`, `.sub-items`, etc.)
- Deep-level container not named `.deep-group`
- `#mindmap` missing `position: relative`
**Key improvements (v4-fix vs v2):**
- **No transparency**—use solid color blend for fading (`color + '80'` is almost invisible on white background)
- **Recursively process `.children`, `.leaf-group`, `.deep-group`**—no longer limited to 3 levels
- **Vertical line draws complete range**—one line from `min(Y)` to `max(Y)`, instead of drawing per child node
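To illustrate the first point: appending `'80'` to a hex color applies roughly 50% alpha, and the opaque equivalent can be computed by compositing the color over the white background. A small sketch under that assumption; `blend_with_white` is a hypothetical helper name, not part of the connector script:

```python
def blend_with_white(hex_color: str, alpha: float) -> str:
    """Return the opaque color that hex_color at the given alpha shows over white."""
    hex_color = hex_color.lstrip("#")
    channels = [int(hex_color[i:i + 2], 16) for i in (0, 2, 4)]
    # Standard alpha compositing over a white (255) background, per channel
    blended = [round(alpha * c + (1 - alpha) * 255) for c in channels]
    return "#" + "".join(f"{c:02X}" for c in blended)


# A half-transparent branch blue expressed as a solid color
print(blend_with_white("#3B82F6", 0.5))
```

A solid value computed this way can be hard-coded as a stroke color, so the rendered line no longer depends on what sits behind it.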
```javascript
/**
* Universal recursive connector script v4-fix
* - Supports any nesting depth (recursively processes .children + .leaf-group + .deep-group)
* - Unified gray-tone connectors (#64748B → #94A3B8 → #A8B4C2), visually clean
* - Direction logic:
* dir='left': startX=parent.left → midX(left-biased) → endX=child.right
* dir='right': startX=parent.right → midX(right-biased) → endX=child.left
*/
function drawAllLines() {
const map = document.getElementById('mindmap');
if (!map) { console.error('❌ #mindmap not found'); return; }
const cRect = map.getBoundingClientRect();
const old = map.querySelector('svg.lines');
if (old) old.remove();
const svg = document.createElementNS('http://www.w3.org/2000/svg', 'svg');
svg.classList.add('lines');
svg.setAttribute('width', cRect.width);
svg.setAttribute('height', cRect.height);
svg.style.cssText = 'position:absolute;top:0;left:0;pointer-events:none;z-index:1;';
let lineCount = 0;
function rel(el) {
const r = el.getBoundingClientRect();
return {
cx: r.left - cRect.left + r.width/2,
cy: r.top - cRect.top + r.height/2,
left: r.left - cRect.left,
right: r.right - cRect.left,
};
}
function drawLine(x1, y1, x2, y2, color, width) {
const l = document.createElementNS('http://www.w3.org/2000/svg', 'line');
// Round to nearest pixel to prevent sub-pixel misalignment (visual kinks)
l.setAttribute('x1', Math.round(x1)); l.setAttribute('y1', Math.round(y1));
l.setAttribute('x2', Math.round(x2)); l.setAttribute('y2', Math.round(y2));
l.setAttribute('stroke', color); l.setAttribute('stroke-width', width);
l.setAttribute('stroke-linecap', 'round');
svg.appendChild(l);
lineCount++;
}
// ─── Unified gray connectors (decreasing by depth, visually clean) ───
const lineStyles = [
{ color: '#64748B', width: 2.5 }, // root → L1
{ color: '#94A3B8', width: 2 }, // L1 → L2
{ color: '#A8B4C2', width: 1.5 }, // L2 → L3
{ color: '#B8C2CC', width: 1.2 }, // L3 → L4
{ color: '#CBD5E1', width: 1 }, // L4 → L5
];
function getLineStyle(branchColor, depth) {
const s = lineStyles[Math.min(depth, lineStyles.length - 1)];
return { color: s.color, width: s.width };
}
// ─── Connector direction ───
// dir='left': parent.left → midX → child.right (each child gets its own polyline)
// dir='right': parent.right → midX → child.left
//
// [Principle] midX is the X coordinate of the vertical line; it must be in the gap between parent and children.
// Use a fraction of the parent-to-nearest-child distance as offset, so:
// - Large node spacing → large offset, lines spread out
// - Small node spacing → small offset, lines compact without crossing text
// All children in the same connect() call share one midX, ensuring vertical line alignment.
function connect(parentEl, childEls, color, width, dir) {
if (!childEls.length) return;
const p = rel(parentEl);
const startX = dir === 'left' ? p.left : p.right;
const startY = p.cy;
// Special case: single child — draw one straight horizontal line, no vertical spine
if (childEls.length === 1) {
const c = rel(childEls[0]);
const endX = dir === 'left' ? c.right : c.left;
const midY = Math.round((startY + c.cy) / 2);
drawLine(startX, midY, endX, midY, color, width);
return;
}
// Find the closest child edge to calculate available space
let closestEdge;
if (dir === 'left') {
closestEdge = Math.max(...childEls.map(ch => rel(ch).right));
} else {
closestEdge = Math.min(...childEls.map(ch => rel(ch).left));
}
// Place midX childClearance (16px) away from the closest child edge; when the gap
// is narrower than 32px, fall back to the midpoint so the spine stays inside the gap
const childClearance = 16;
const midpoint = startX + (closestEdge - startX) / 2;
const midFromChild = dir === 'left' ? closestEdge + childClearance : closestEdge - childClearance;
// Use whichever candidate is nearer the children; it always lies between parent and child
const midX = dir === 'left'
? Math.min(midpoint, midFromChild)
: Math.max(midpoint, midFromChild);
drawLine(startX, startY, midX, startY, color, width);
// Draw ONE continuous vertical line spanning from parent to the last child
const allCYs = childEls.map(ch => rel(ch).cy);
const minY = Math.min(startY, ...allCYs);
const maxY = Math.max(startY, ...allCYs);
drawLine(midX, minY, midX, maxY, color, width);
// Then draw horizontal lines from the vertical spine to each child
childEls.forEach(ch => {
const c = rel(ch);
const endX = dir === 'left' ? c.right : c.left;
const endY = c.cy;
drawLine(midX, endY, endX, endY, color, width);
});
}
// ─── Recursively process subtree (supports .children + .leaf-group + .deep-group) ───
const NODE_SEL = '.branch-node, .sub-node, .leaf, .deep-node';
const CONTAINER_SEL = ':scope > .children, :scope > .leaf-group, :scope > .deep-group';
function processChildren(parentNodeEl, containerEl, branchColor, depth, dir) {
if (!containerEl) return;
const childNodeEls = [];
for (const wrapper of containerEl.children) {
const nodeEl = wrapper.matches?.(NODE_SEL) ? wrapper : wrapper.querySelector(NODE_SEL);
if (nodeEl) childNodeEls.push(nodeEl);
}
if (!childNodeEls.length) return;
const style = getLineStyle(branchColor, depth);
connect(parentNodeEl, childNodeEls, style.color, style.width, dir);
for (const wrapper of containerEl.children) {
const nodeEl = wrapper.matches?.(NODE_SEL) ? wrapper : wrapper.querySelector(NODE_SEL);
if (!nodeEl) continue;
wrapper.querySelectorAll(CONTAINER_SEL).forEach(nc =>
processChildren(nodeEl, nc, branchColor, depth + 1, dir)
);
}
}
// ─── Main flow ───
const rootNode = map.querySelector('.root-node');
if (!rootNode) { console.error('❌ .root-node not found'); return; }
const rp = rel(rootNode);
// Left side: collect all L1 branch-nodes, draw polylines with connect() (with vertical spine)
const leftSel = '.left-side > .branch, .left-branches > .left-branch, .left-branches > div';
const leftBranches = map.querySelectorAll(leftSel);
const leftBranchNodes = [];
leftBranches.forEach(branch => {
const bNode = branch.querySelector('.branch-node');
if (bNode) leftBranchNodes.push(bNode);
});
if (leftBranchNodes.length) {
connect(rootNode, leftBranchNodes, lineStyles[0].color, lineStyles[0].width, 'left');
}
leftBranches.forEach(branch => {
const bNode = branch.querySelector('.branch-node');
if (!bNode) return;
processChildren(bNode, branch.querySelector(':scope > .children'), null, 1, 'left');
});
// Right side: same as above
const rightSel = '.right-side > .branch, .branches > .branch, .right-branches > .right-branch';
const rightBranches = map.querySelectorAll(rightSel);
const rightBranchNodes = [];
rightBranches.forEach(branch => {
const bNode = branch.querySelector('.branch-node');
if (bNode) rightBranchNodes.push(bNode);
});
if (rightBranchNodes.length) {
connect(rootNode, rightBranchNodes, lineStyles[0].color, lineStyles[0].width, 'right');
}
rightBranches.forEach(branch => {
const bNode = branch.querySelector('.branch-node');
if (!bNode) return;
processChildren(bNode, branch.querySelector(':scope > .children'), null, 1, 'right');
});
map.insertBefore(svg, map.firstChild);
console.log(`✅ Drew ${lineCount} lines`);
}
```
**How to call (at end of HTML):**
```html
<script>
/* ← paste the drawAllLines function above */
window.addEventListener('load', () => setTimeout(drawAllLines, 300));
</script>
```
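The `midX` placement inside `connect()` can be checked in isolation. The sketch below is a Python port of just that calculation, written for illustration (the name `spine_x` is hypothetical). As the script is written, the vertical spine lands 16px from the nearest child when the gap is wide, and at the midpoint when the gap is under 32px:

```python
def spine_x(start_x: float, closest_edge: float, direction: str, clearance: float = 16) -> float:
    """Mirror the midX calculation from connect() for one parent and its nearest child edge."""
    midpoint = start_x + (closest_edge - start_x) / 2
    if direction == "left":
        # Children sit left of the parent, so the smaller x is nearer the children
        return min(midpoint, closest_edge + clearance)
    # Children sit right of the parent, so the larger x is nearer the children
    return max(midpoint, closest_edge - clearance)


print(spine_x(100, 148, "right"))  # 48px gap: spine sits 16px before the child
print(spine_x(100, 120, "right"))  # 20px gap: spine falls back to the midpoint
```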
### 3.3 Universal CSS Base (Shared by All Styles)
```css
* { margin: 0; padding: 0; box-sizing: border-box; }
body {
background: #FFFFFF;
font-family: -apple-system, BlinkMacSystemFont, 'PingFang SC', 'SimHei', sans-serif;
}
#mindmap { padding: 60px; display: inline-block; min-width: 100%; position: relative; }
#mindmap > svg.lines {
position: absolute; top: 0; left: 0; width: 100%; height: 100%;
pointer-events: none; z-index: 0;
}
/* ─── Root Node ─── */
/* ─── Topic Intent Color System ─── */
/* Model picks ONE intent based on content semantics. Add data-intent="xxx" to #mindmap.
DO NOT manually write hex colors for root node — use intent system only. */
/* Intent → Root Node + Branch Palette mapping */
#mindmap[data-intent="professional"] .root-node { background: linear-gradient(135deg, #1E3A5F, #2D4A6F); box-shadow: 0 4px 12px rgba(30,58,95,0.25); }
#mindmap[data-intent="technical"] .root-node { background: linear-gradient(135deg, #334155, #475569); box-shadow: 0 4px 12px rgba(51,65,85,0.25); }
#mindmap[data-intent="medical"] .root-node { background: linear-gradient(135deg, #0F766E, #0D9488); box-shadow: 0 4px 12px rgba(15,118,110,0.25); }
#mindmap[data-intent="education"] .root-node { background: linear-gradient(135deg, #9A3412, #B45309); box-shadow: 0 4px 12px rgba(154,52,18,0.25); }
#mindmap[data-intent="creative"] .root-node { background: linear-gradient(135deg, #7C3AED, #8B5CF6); box-shadow: 0 4px 12px rgba(124,58,237,0.25); }
#mindmap[data-intent="finance"] .root-node { background: linear-gradient(135deg, #1E3A5F, #1E40AF); box-shadow: 0 4px 12px rgba(30,58,95,0.25); }
#mindmap[data-intent="nature"] .root-node { background: linear-gradient(135deg, #166534, #15803D); box-shadow: 0 4px 12px rgba(22,101,52,0.25); }
#mindmap[data-intent="warning"] .root-node { background: linear-gradient(135deg, #991B1B, #B91C1C); box-shadow: 0 4px 12px rgba(153,27,27,0.25); }
#mindmap[data-intent="neutral"] .root-node { background: linear-gradient(135deg, #334155, #475569); box-shadow: 0 4px 12px rgba(51,65,85,0.25); }
/*
Intent Selection Guide (for model):
professional → corporate reports, strategy, management, business plans
technical → software, engineering, architecture, systems, AI/ML
medical → healthcare, clinical, pharmaceutical, nursing, anatomy
education → teaching, learning, curriculum, training, academic
creative → design, art, marketing, branding, media
finance → banking, investment, accounting, economics, trading
nature → environment, ecology, agriculture, biology, geography
warning → risk analysis, safety, incident review, compliance, audit
neutral → general topics, mixed content, unclear domain
Recommended branch color combos per intent:
professional → [blue, teal, cyan]
technical → [blue, purple, cyan]
medical → [teal, green, blue]
education → [amber, green, blue]
creative → [purple, amber, cyan]
finance → [blue, green, cyan]
nature → [green, teal, amber]
warning → [red, amber, cyan]
neutral → [blue, green, purple]
*/
.root-node {
color: white; font-size: 20px; font-weight: 700;
padding: 18px 28px; border-radius: 12px;
white-space: nowrap; flex-shrink: 0; align-self: center;
}
/* Fallback if no intent specified — defaults to professional blue */
#mindmap:not([data-intent]) .root-node {
background: linear-gradient(135deg, #1E3A5F, #2D4A6F);
box-shadow: 0 4px 12px rgba(30,58,95,0.25);
}
/* ─── Container Layout ─── */
.tree { display: flex; align-items: flex-start; gap: 0; position: relative; }
.branches { display: flex; flex-direction: column; gap: 16px; margin-left: 60px; }
.branch, .sub-branch { display: flex; align-items: flex-start; gap: 0; }
/* ─── First-Level Branch Node ─── */
.branch-node {
font-size: 15px; font-weight: 600;
padding: 10px 20px; border-radius: 8px;
white-space: nowrap; flex-shrink: 0; border: 2px solid;
}
/* ─── Second-Level Sub-Node (when has children) ─── */
.sub-node {
font-size: 13px; font-weight: 500;
padding: 7px 14px; border-radius: 6px;
white-space: nowrap; flex-shrink: 0; border: 1.5px solid;
background: #F8FAFC;
}
/* ─── Leaf Node ─── */
/* Default: lightweight capsule frame */
.leaf {
font-size: 13px; font-weight: 400; color: #475569;
padding: 4px 10px; background: #FAFAFA;
border-radius: 12px; border: 1px solid #E5E7EB;
white-space: nowrap; line-height: 1.5;
}
/* Text-only mode (more compact, add class="leaf text-only") */
.leaf.text-only {
background: none; border: none; padding: 2px 0; border-radius: 0;
}
/* Leaf supplementary description */
.leaf-desc {
font-size: 12px; color: #94A3B8; font-weight: 400; margin-left: 8px;
}
.leaf-desc::before { content: '— '; color: #CBD5E1; }
/* ─── Child Node Container ─── */
/* Principle: The deeper the level, the smaller the gap, but not so small that connectors and text overlap */
.children { display: flex; flex-direction: column; gap: 6px; margin-left: 48px; align-items: flex-start; }
.children:has(.sub-branch) { gap: 10px; } /* sub-branch is larger than leaf, needs more spacing */
.sub-branch .children { gap: 5px; margin-left: 40px; align-items: flex-start; }
.sub-branch .sub-branch .children { gap: 4px; margin-left: 36px; align-items: flex-start; }
/* Tip: If leaf text and connectors overlap, prioritize increasing gap */
/* ─── Color System (for distinguishing different branches) ─── */
.c-blue { background: #EFF6FF; border-color: #3B82F6; color: #1E40AF; }
.c-green { background: #F0FDF4; border-color: #10B981; color: #065F46; }
.c-amber { background: #FFF7ED; border-color: #F59E0B; color: #92400E; }
.c-purple { background: #F5F3FF; border-color: #8B5CF6; color: #5B21B6; }
.c-red { background: #FEF2F2; border-color: #EF4444; color: #991B1B; }
.c-cyan { background: #ECFEFF; border-color: #06B6D4; color: #155E75; }
.c-teal { background: #F0FDFA; border-color: #14B8A6; color: #134E4A; }
/* Second-level sub-node inherits branch color scheme (lighter) */
.c-blue .sub-node { background: #F0F7FF; border-color: #93C5FD; color: #1E40AF; }
.c-green .sub-node { background: #F2FDF6; border-color: #6EE7B7; color: #065F46; }
.c-amber .sub-node { background: #FFF9F0; border-color: #FCD34D; color: #92400E; }
.c-purple .sub-node { background: #F8F6FF; border-color: #C4B5FD; color: #5B21B6; }
.c-red .sub-node { background: #FFF5F5; border-color: #FCA5A5; color: #991B1B; }
.c-cyan .sub-node { background: #F0FEFF; border-color: #67E8F9; color: #155E75; }
.c-teal .sub-node { background: #F2FDFA; border-color: #5EEAD4; color: #134E4A; }
```
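The intent guide in the CSS comment above can also be expressed as data when the HTML is generated programmatically. A sketch, assuming branches simply cycle through the recommended colors when there are more branches than colors; that cycling rule and the function name `assign_branch_classes` are illustrative assumptions, not something the guide specifies:

```python
# Recommended branch colors per intent, copied from the guide in the CSS comment
INTENT_PALETTES = {
    "professional": ["blue", "teal", "cyan"],
    "technical": ["blue", "purple", "cyan"],
    "medical": ["teal", "green", "blue"],
    "education": ["amber", "green", "blue"],
    "creative": ["purple", "amber", "cyan"],
    "finance": ["blue", "green", "cyan"],
    "nature": ["green", "teal", "amber"],
    "warning": ["red", "amber", "cyan"],
    "neutral": ["blue", "green", "purple"],
}


def assign_branch_classes(intent: str, branch_count: int) -> list[str]:
    """Return one c-{color} class per L1 branch, cycling through the intent palette."""
    # Unknown intents fall back to the neutral palette (an assumption of this sketch)
    palette = INTENT_PALETTES.get(intent, INTENT_PALETTES["neutral"])
    return [f"c-{palette[i % len(palette)]}" for i in range(branch_count)]


print(assign_branch_classes("medical", 4))
```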
---
## Style A: Right-Expanding Tree (Simple scenarios with branchCount ≤ 4 or maxDepth ≤ 2)
### HTML Structure
```html
<!DOCTYPE html>
<html lang="zh">
<head>
<meta charset="UTF-8">
<style>
/* ← paste the "Universal CSS Base" above */
</style>
</head>
<body>
<div id="mindmap" data-intent="professional">
<div class="tree">
<div class="root-node">Central Topic</div>
<div class="branches">
<div class="branch">
<div class="branch-node c-blue">Level-1 Branch A</div>
<div class="children">
<div class="leaf">Leaf 1</div>
<div class="leaf">Leaf 2</div>
</div>
</div>
<div class="branch">
<div class="branch-node c-green">Level-1 Branch B</div>
<div class="children">
<div class="sub-branch">
<div class="sub-node c-green">Level-2 Node with Children</div>
<div class="children">
<div class="leaf">Level-3 Leaf 1</div>
<div class="leaf">Level-3 Leaf 2</div>
</div>
</div>
<div class="leaf">Level-2 Leaf</div>
</div>
</div>
</div>
</div>
</div>
<script>
/* ← paste the "Universal Recursive Connector Script" */
drawAllLines();
</script>
</body>
</html>
```
---
## Style B: Left-Right Expanding Tree (Standard scenario with branchCount ≥ 5, most common)
### Additional CSS (append after universal base)
```css
/* ─── Left-Right Expanding Layout ─── */
.lr-tree { display: flex; align-items: center; gap: 0; position: relative; }
.lr-tree .root-node {
padding: 20px 32px; border-radius: 14px;
margin: 0 60px; text-align: center;
}
/* ⚠️ sub-branch must be flex, otherwise flex-direction: row-reverse won't work, leaves won't expand left */
.sub-branch { display: flex; align-items: flex-start; gap: 0; }
.left-branches { display: flex; flex-direction: column; gap: 16px; }
.left-branch { display: flex; align-items: flex-start; flex-direction: row-reverse; gap: 0; }
.left-branch .children {
display: flex; flex-direction: column; gap: 6px;
margin-right: 48px; align-items: flex-end;
}
.left-branch .children:has(.sub-branch) { gap: 10px; }
.left-branch .sub-branch .children { gap: 5px; margin-right: 40px; margin-left: 0; align-items: flex-end; }
.left-branch .sub-branch { flex-direction: row-reverse; }
.right-branches { display: flex; flex-direction: column; gap: 16px; }
.right-branch { display: flex; align-items: flex-start; gap: 0; }
.right-branch .children {
display: flex; flex-direction: column; gap: 6px; margin-left: 48px; align-items: flex-start;
}
.right-branch .children:has(.sub-branch) { gap: 10px; }
.right-branch .sub-branch .children { gap: 5px; margin-left: 40px; align-items: flex-start; }
```
### HTML Structure
```html
<div id="mindmap" data-intent="professional">
<div class="lr-tree">
<!-- Left branches (allocated by balance_branches) -->
<div class="left-branches">
<div class="left-branch">
<div class="branch-node c-purple">Branch D (placed left)</div>
<div class="children">
<div class="leaf">Leaf 1</div>
<div class="leaf">Leaf 2</div>
</div>
</div>
<!-- More left branches... -->
</div>
<div class="root-node">Central Topic</div>
<!-- Right branches -->
<div class="right-branches">
<div class="right-branch">
<div class="branch-node c-blue">Branch A (placed right)</div>
<div class="children">
<div class="leaf">Leaf 1</div>
<div class="leaf">Leaf 2</div>
</div>
</div>
<!-- More right branches... -->
</div>
</div>
</div>
<script>
/* ← paste the "Universal Recursive Connector Script" */
drawAllLines(); // v4: auto-handles both left and right sides
</script>
```
---
## Style C: Card Grid (when user explicitly requests "cards" / "modules")
> ⚠️ This is not a mind map — it's a modular display. No connectors; use cards + color coding to show relationships.
```html
<!DOCTYPE html>
<html lang="zh">
<head>
<meta charset="UTF-8">
<style>
* { margin: 0; padding: 0; box-sizing: border-box; }
body {
background: #FFFFFF;
font-family: -apple-system, BlinkMacSystemFont, 'PingFang SC', 'SimHei', sans-serif;
}
#mindmap { padding: 48px 64px; }
.map-title {
font-size: 24px; font-weight: 700; color: #1E293B; text-align: center;
margin-bottom: 8px;
}
.map-subtitle {
font-size: 14px; color: #64748B; text-align: center;
margin-bottom: 36px;
}
.card-grid {
display: grid;
grid-template-columns: repeat(auto-fit, minmax(220px, 1fr));
gap: 20px; max-width: 1000px; margin: 0 auto;
}
.card {
background: #FFFFFF; border-radius: 12px;
padding: 20px; border: 1px solid #E2E8F0;
box-shadow: 0 1px 3px rgba(0,0,0,0.04);
}
.card-header { display: flex; align-items: center; gap: 10px; margin-bottom: 12px; padding-bottom: 10px; border-bottom: 2px solid; }
.card-title { font-size: 15px; font-weight: 600; }
.card-icon {
width: 32px; height: 32px; display: flex; align-items: center; justify-content: center;
font-size: 16px; border-radius: 50%; flex-shrink: 0;
}
.card-items { list-style: none; display: flex; flex-direction: column; gap: 6px; }
.card-items li {
font-size: 13px; color: #475569; padding-left: 16px; position: relative;
}
.card-items li::before {
content: ''; position: absolute; left: 0; top: 8px;
width: 6px; height: 6px; border-radius: 50%;
}
/* Color variants */
.card.blue .card-header { border-color: #3B82F6; }
.card.blue .card-icon { background: #EFF6FF; }
.card.blue .card-title { color: #1E40AF; }
.card.blue li::before { background: #3B82F6; }
.card.green .card-header { border-color: #10B981; }
.card.green .card-icon { background: #F0FDF4; }
.card.green .card-title { color: #065F46; }
.card.green li::before { background: #10B981; }
.card.amber .card-header { border-color: #F59E0B; }
.card.amber .card-icon { background: #FFF7ED; }
.card.amber .card-title { color: #92400E; }
.card.amber li::before { background: #F59E0B; }
.card.purple .card-header { border-color: #8B5CF6; }
.card.purple .card-icon { background: #F5F3FF; }
.card.purple .card-title { color: #5B21B6; }
.card.purple li::before { background: #8B5CF6; }
</style>
</head>
<body>
<div id="mindmap" data-intent="professional">
<div class="map-title">Title</div>
<div class="map-subtitle">Subtitle</div>
<div class="card-grid">
<div class="card blue">
<div class="card-header">
<div class="card-icon">📋</div>
<div class="card-title">Module A</div>
</div>
<ul class="card-items">
<li>Item 1</li>
<li>Item 2</li>
</ul>
</div>
<!-- More cards... -->
</div>
</div>
</body>
</html>
```
---
## Style D: Fishbone Diagram (Problem Analysis / Root Cause)
Best for problem analysis, root cause tracing, quality management (Ishikawa diagram).
```html
<!DOCTYPE html>
<html lang="zh">
<head>
<meta charset="UTF-8">
<style>
* { margin: 0; padding: 0; box-sizing: border-box; }
body {
background: #FFFFFF;
font-family: -apple-system, BlinkMacSystemFont, 'PingFang SC', 'SimHei', sans-serif;
}
#mindmap { padding: 48px 64px; }
.fishbone { position: relative; }
.spine { display: flex; align-items: center; gap: 0; }
.spine-line { flex: 1; height: 3px; background: #1E293B; }
.spine-head {
width: 0; height: 0;
border-left: 16px solid #1E293B;
border-top: 10px solid transparent;
border-bottom: 10px solid transparent;
}
.result-node {
background: #1E293B; color: white;
font-size: 16px; font-weight: 700;
padding: 12px 24px; border-radius: 8px;
white-space: nowrap; margin-left: 4px;
}
.bone-branches {
position: absolute; left: 80px; right: 200px; top: 50%;
display: flex; justify-content: space-around;
}
.bone { display: flex; flex-direction: column; align-items: center; gap: 8px; }
.bone.up { transform: translateY(calc(-100% - 20px)); }
.bone.down { transform: translateY(20px); }
.bone-title {
font-size: 14px; font-weight: 600;
padding: 8px 16px; border-radius: 8px; white-space: nowrap;
}
.bone-items { display: flex; flex-direction: column; gap: 4px; align-items: center; }
.bone-item {
font-size: 12px; color: #64748B;
padding: 3px 10px; background: #F8FAFC;
border-radius: 4px; border: 1px solid #E2E8F0; white-space: nowrap;
}
.bone-line { width: 2px; height: 24px; background: #CBD5E1; }
</style>
</head>
<body>
<div id="mindmap" data-intent="professional">
<div class="fishbone">
<div class="spine">
<div class="spine-line"></div>
<div class="spine-head"></div>
<div class="result-node">Problem / Result</div>
</div>
<!-- Upper causes -->
<div class="bone-branches">
<div class="bone up">
<div class="bone-items">
<div class="bone-item">Sub-cause 1</div>
<div class="bone-item">Sub-cause 2</div>
</div>
<div class="bone-line"></div>
<div class="bone-title" style="background:#EFF6FF;color:#1E40AF;border:1.5px solid #3B82F6;">Cause Category A</div>
</div>
</div>
<!-- Lower causes -->
<div class="bone-branches">
<div class="bone down">
<div class="bone-title" style="background:#F0FDF4;color:#065F46;border:1.5px solid #10B981;">Cause Category B</div>
<div class="bone-line"></div>
<div class="bone-items">
<div class="bone-item">Sub-cause 1</div>
<div class="bone-item">Sub-cause 2</div>
</div>
</div>
</div>
</div>
</div>
</body>
</html>
```
---
## Quality Checklist
After rendering, verify against this checklist:
1. **Content complete** — Every node from the original requirement is in the map, nothing missing
2. **Clear hierarchy** — L1 branches and leaves are instantly distinguishable (different bg/border/font-weight)
3. **No overlap** — No boxes covering boxes, no lines through text
4. **Connectors visible** — Every parent-child pair has a connector, no orphan leaves. Connector color ≥ `#94A3B8` (too light = invisible)
5. **Connector direction correct** — Left-side leaves extend left, right-side extends right
6. **Proportions reasonable** — Map is not extremely narrow/tall (target aspect ratio 1:1 to 3:1), visually comfortable
7. **Text readable** — At final output size, smallest text is legible. Reference: root 18px+, L1 15px+, L2 13px+, leaves 12px+
8. **Roughly balanced** (Style B) — Visual weight approximately equal on both sides, doesn't need to be symmetric. Target ≤30% difference
9. **Canvas large enough** — No nodes clipped, sufficient padding on all sides (reference 60px+)
10. **No large blank areas on right** — Screenshot trimmed to content edge
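Items 6 and 8 are numeric and can be checked before rendering. The sketch below assumes the aspect ratio means width divided by height, that "visual weight" is approximated by a per-side node count, and that the 30% difference is measured against the heavier side. The checklist states none of these definitions, and both function names are hypothetical:

```python
def check_proportions(width: float, height: float) -> bool:
    """Item 6: width / height should fall between 1:1 and 3:1."""
    return 1.0 <= width / height <= 3.0


def check_balance(left_weight: float, right_weight: float, tolerance: float = 0.30) -> bool:
    """Item 8: the two sides should differ by at most 30% of the heavier side."""
    heavier = max(left_weight, right_weight)
    if heavier == 0:
        return True  # an empty map is trivially balanced
    return abs(left_weight - right_weight) / heavier <= tolerance


print(check_proportions(1600, 800))  # 2:1 canvas
print(check_balance(10, 8))          # 20% difference between sides
```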
---
## ⛔ Radial Layout Warning
**Radial layout is strongly discouraged.** Using `position: absolute` for fixed branch positions leads to overlap when branches increase, and deep levels cannot be handled.
If content is very simple (reference: ≤3 branches, ≤3 children per branch, text ≤6 chars, depth ≤2, total ≤12 nodes), it's technically possible, but tree layout is always the safer choice.
**No code template provided.** If conditions are met, manually adjust based on Style A.
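The thresholds above can at least be turned into a precondition check (this is not a layout template). The sketch assumes the 12-node total excludes the root, since 3 branches with 3 children each already gives 12, and it represents the map as a dict of branch label to child labels, which enforces depth ≤2 by construction. The function name `radial_is_feasible` is hypothetical:

```python
def radial_is_feasible(branches: dict[str, list[str]]) -> bool:
    """branches maps each L1 label to its list of child labels (depth 2 only)."""
    labels = list(branches) + [c for children in branches.values() for c in children]
    return (
        len(branches) <= 3
        and all(len(children) <= 3 for children in branches.values())
        and all(len(label) <= 6 for label in labels)  # text length in characters
        and len(labels) <= 12                         # total nodes, root excluded
    )


small_map = {"Plan": ["Goal", "Scope"], "Build": ["Code", "Test"], "Ship": ["Deploy"]}
print(radial_is_feasible(small_map))
```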


@@ -0,0 +1,801 @@
# Playwright + CSS Rendering Engine
> **⚠️ Before writing any code, read [`_rules.md`](references/_rules.md) — three non-negotiable rules on overlap, hierarchy, and color.**
**Core principle: Content-driven, not template-driven. Analyze content structure first, then decide layout, then render.**
Applicable to: flowcharts, infographics, KPI cards, data posters — any visualization requiring full CSS power (gradients, shadows, rounded corners, Grid/Flexbox layout).
---
## Step 1: Content Analysis
When you receive a flowchart/infographic requirement, **don't write HTML/CSS first**. Analyze the content structure.
### 1.1 Flowchart Content Analysis
```
Input: "Honey production process", 8 steps, linear without branches
{
"type": "flowchart",
"nodes": [
{ "id": "1", "label": "Pollen collection", "desc": "Bees collect nectar from flowers" },
{ "id": "2", "label": "Brewing", "desc": "Bees repeatedly ingest and regurgitate the nectar in the hive" },
...
],
"edges": [["1","2"], ["2","3"], ...],
"nodeTypes": { "start": ["1"], "end": ["8"], "decision": [], "normal": ["2","3","4","5","6","7"] }
}
```
### 1.2 Key Metrics
| Metric | Calculation | Affects |
|------|---------|---------|
| `nodeCount` | Total node count | Mermaid vs CSS flowchart |
| `maxTextLen` | Longest text char count | Node width |
| `hasDecision` | Has decision branches | Layout complexity |
| `hasBranch` | Has parallel/branches | Column count |
| `parallelCount` | Max parallel branches | Column count |
| `phaseCount` | Phase/group count | Need for phased containers |
| `hasRoles` | Has roles/swimlanes | Need for dual-panel |
| `isLinear` | Linear without branches | Can use snake layout |
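A few of these metrics can be derived mechanically from the structure shown in 1.1. The sketch below is illustrative only: it treats a chart as linear when no node has more than one outgoing edge, which is an assumed definition, and `flowchart_metrics` is a hypothetical name:

```python
def flowchart_metrics(chart: dict) -> dict:
    """Compute a subset of the key metrics from a nodes/edges description."""
    nodes, edges = chart["nodes"], chart["edges"]
    out_degree: dict[str, int] = {}
    for src, _dst in edges:
        out_degree[src] = out_degree.get(src, 0) + 1
    decisions = chart.get("nodeTypes", {}).get("decision", [])
    return {
        "nodeCount": len(nodes),
        "maxTextLen": max((len(n["label"]) for n in nodes), default=0),
        "hasDecision": bool(decisions),
        # Assumed definition: linear means no node fans out to several successors
        "isLinear": all(count <= 1 for count in out_degree.values()),
    }


chart = {
    "nodes": [{"id": "1", "label": "Collect"}, {"id": "2", "label": "Brew"}, {"id": "3", "label": "Seal"}],
    "edges": [["1", "2"], ["2", "3"]],
    "nodeTypes": {"start": ["1"], "end": ["3"], "decision": [], "normal": ["2"]},
}
metrics = flowchart_metrics(chart)
print(metrics)
```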
### 1.3 Infographic Content Analysis
Infographics (KPI cards, data posters) are simpler to analyze:
- How many data metrics? → Determines grid columns
- Any trend data? → Need for sparklines
- Title area? → Need for hero header
- Comparison data? → Need for bar chart
---
## Step 2: Layout Decision
### 2.1 Flowchart Layout Decision Tree
```
⚠️ DEFAULT RULE: When user asks "generate/create XXX flowchart" without specifying format,
DEFAULT to Layout C (Phased Vertical). Almost all real-world processes have phases.
User specified Mermaid/markdown?
└─ Yes → Follow user choice (Format Constraint Rule)
└─ No →
nodeCount ≤ 6 AND no phases AND maxTextLen ≤ 8 (CJK)?
└─ Yes → Mermaid (minimal flowchart)
└─ No →
phaseCount > 0 OR nodeCount ≥ 5?
└─ Yes → ⭐ Layout C: CSS Phased Vertical flowchart (DEFAULT)
└─ No →
hasRoles?
└─ Yes → Layout D: CSS Dual-panel/Swimlane flowchart
└─ No → Layout C: CSS Phased Vertical flowchart (fallback also uses C)
```
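The decision tree above translates directly into a small function. This is a sketch that follows the branches in the order written; the return labels (`"mermaid"`, `"layout-c"`, `"layout-d"`) and the `user_format` parameter are hypothetical names chosen for illustration:

```python
from typing import Optional


def choose_flowchart_layout(node_count: int, max_text_len: int, phase_count: int,
                            has_roles: bool, user_format: Optional[str] = None) -> str:
    """Walk the layout decision tree from top to bottom."""
    if user_format:
        return user_format          # Format Constraint Rule: the user's choice wins
    if node_count <= 6 and phase_count == 0 and max_text_len <= 8:
        return "mermaid"            # minimal flowchart
    if phase_count > 0 or node_count >= 5:
        return "layout-c"           # phased vertical (default)
    if has_roles:
        return "layout-d"           # dual-panel / swimlane
    return "layout-c"               # fallback also uses C


print(choose_flowchart_layout(node_count=8, max_text_len=6, phase_count=3, has_roles=False))
```

Read this way, Layout D is only reached for charts with fewer than five nodes, no phases, and text longer than eight characters, which is worth keeping in mind when a swimlane layout is expected.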
**⚠️ Layout A (Grid) and Layout B (Snake) are only for very special scenarios (e.g., flat comparison, unordered parallel items). Flowcharts default to Layout C.**
### 🚫 Flowchart Anti-Patterns (FORBIDDEN)
| ❌ Bad Pattern | ✅ Correct Pattern |
|---|---|
| Phase titles (1, 2, 3...) as isolated left-side text labels | Phase titles as colored title bars, wrapped inside group cards |
| All nodes flat-laid in Grid without group containers | Each phase wrapped in a `.phase-group` card containing its steps |
| Role labels scattered above nodes | Role info displayed uniformly at the top of the flowchart, or as phase card labels |
| Nodes connected with loose diverging lines | Phases connected with arrows (↓), steps within phases use numbering |
| Inconsistent node sizes, uneven spacing, misaligned | Same-phase nodes share uniform style, overall alignment consistent |
| Using Layout A Grid for a phased flowchart | Has phases → must use Layout C |
### 2.2 Canvas Size Calculation
```python
def calc_flowchart_canvas(node_count, max_text_len, parallel_count, has_roles, layout):
    node_width = max(180, max_text_len * 16)  # ~16px per CJK char (including padding)
    if layout == 'snake':
        cols = min(4, node_count)
        rows = (node_count + cols - 1) // cols
        width = cols * (node_width + 60) + 120  # 60=gap, 120=padding
        height = rows * 120 + 200  # 120=row height
    elif layout == 'dual-panel':
        width = max(1600, node_width * 2 + 400)  # left panel + right flow
        height = node_count * 100 + 200
    elif layout == 'phased-vertical':
        width = max(800, node_width + 200)
        height = node_count * 80 + 200
    else:  # grid
        cols = max(parallel_count, 2)
        width = cols * (node_width + 60) + 120
        rows = (node_count + cols - 1) // cols
        height = rows * 120 + 200
    return max(width, 800), max(height, 600)
```
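As a worked example of the arithmetic, the sketch below isolates the `phased-vertical` branch of the function above; `phased_vertical_canvas` is a hypothetical name used only here. For 8 nodes with at most 6 characters each, the node width stays at the 180px floor and the width stays at the 800px floor, while the height is 8 × 80 + 200 = 840:

```python
def phased_vertical_canvas(node_count: int, max_text_len: int) -> tuple[int, int]:
    """Canvas size for the phased-vertical layout, using the same constants as above."""
    node_width = max(180, max_text_len * 16)   # ~16px per CJK character
    width = max(800, node_width + 200)
    height = node_count * 80 + 200
    return max(width, 800), max(height, 600)


print(phased_vertical_canvas(8, 6))    # (800, 840)
print(phased_vertical_canvas(8, 50))   # (1000, 840)
```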
### 2.3 Color Decision
**Iron rule: Node background = low-saturation light color, border = high-saturation color. Large high-saturation areas = children's drawing.**
**⚠️ Text contrast iron rule: Dark/accent background nodes must use light text (white or near-white) for title and description.**
Light background → dark text (`#1F2937`), dark background → light text (`#FFFFFF` or `#FFF7ED`).
Common mistake: Endpoint/highlight node switched to dark background, but description text remains dark gray, making it completely unreadable.
```
Node type → color:
Start/end → bg: #EFF6FF, border: #3B82F6 (blue), text: #1E40AF
Normal step → bg: #F8FAFC, border: #94A3B8 (gray-blue), text: #374151 or colored by phase
Decision node → bg: #FFF7ED, border: #F59E0B (amber), text: #92400E
Success/pass → bg: #F0FDF4, border: #10B981 (green), text: #065F46
End/failure → bg: #F5F3FF, border: #8B5CF6 (purple), text: #5B21B6
Emphasis/endpoint (dark bg) → bg: #92400E, border: #F59E0B, text: #FFFFFF, desc: #FFF7ED
Max 3-4 background colors for nodes in the same flowchart.
Overall chart background = white #FFFFFF.
```
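The text contrast rule can be checked mechanically. The sketch below picks dark or light text for a given node background by comparing WCAG 2.x contrast ratios. The function names are hypothetical, and using the WCAG formula is an assumption: the rule above only states "light background → dark text, dark background → light text".

```python
def _linear(channel):
    """Convert one 0-255 sRGB channel to linear light (WCAG 2.x formula)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color):
    """Relative luminance of a '#RRGGBB' color, in the range 0..1."""
    h = hex_color.lstrip('#')
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(a, b):
    """WCAG contrast ratio between two colors (1.0 = identical, 21.0 = max)."""
    la, lb = relative_luminance(a), relative_luminance(b)
    brighter, darker = max(la, lb), min(la, lb)
    return (brighter + 0.05) / (darker + 0.05)


def pick_text_color(bg, dark='#1F2937', light='#FFFFFF'):
    """Return whichever of the two text colors contrasts more with bg."""
    return dark if contrast_ratio(bg, dark) >= contrast_ratio(bg, light) else light


print(pick_text_color('#EFF6FF'))  # #1F2937 (light node background)
print(pick_text_color('#92400E'))  # #FFFFFF (dark emphasis background)
```

Running the same check on description text catches the common mistake noted above, where a node switches to a dark background but its description stays dark gray.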
**Phase bar colors**: Same-hue gradient (blue-gray family), **never use different hues per phase**.
```css
/* ✅ Same-hue blue-gray progression */
.phase-1 { background: #F0F4F8; border-left: 4px solid #64748B; }
.phase-2 { background: #E8EDF2; border-left: 4px solid #5B7A99; }
/* ❌ Different hue per phase (rainbow effect) */
.phase-1 { background: #EFF6FF; border-left: 4px solid #3B82F6; } /* blue */
.phase-2 { background: #F0FDF4; border-left: 4px solid #10B981; } /* green */
.phase-3 { background: #FFF7ED; border-left: 4px solid #F59E0B; } /* amber */
```
---
## Step 3: Rendering
### 3.1 Playwright Screenshot (Universal)
```python
import asyncio
from playwright.async_api import async_playwright
async def html_to_image(html_path, output_path, selector='#root',
                        width=1200, height=None, scale=2):
    """HTML → PNG/PDF
    scale: 2 (default crisp), 1.5 (large canvas 3000px+), 3 (print).
    Width must accommodate ALL content. After first render, auto-resize viewport to fit.
    """
    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        page = await browser.new_page(
            viewport={'width': width, 'height': height or 800},
            device_scale_factor=scale
        )
        await page.goto(f'file://{html_path}', wait_until='networkidle')
        await page.wait_for_timeout(500)
        if output_path.endswith('.pdf'):
            await page.pdf(path=output_path, print_background=True)
        else:
            el = page.locator(selector)
            bbox = await el.bounding_box()
            if bbox:
                fit_w = max(width, int(bbox['width'] + 100))
                fit_h = int(bbox['height'] + 100)
                await page.set_viewport_size({'width': fit_w, 'height': fit_h})
                await page.wait_for_timeout(200)
            await el.screenshot(path=output_path)
        await browser.close()
    import os
    print(f'{output_path} ({os.path.getsize(output_path)/1024:.0f}KB)')
```
### 3.2 HTML Universal Shell
```html
<!DOCTYPE html>
<html lang="zh">
<head>
<meta charset="UTF-8">
<style>
* { margin: 0; padding: 0; box-sizing: border-box; }
:root {
--text: #111827;
--text-sub: #6B7280;
--text-muted: #9CA3AF;
--bg: #FFFFFF;
--surface: #F9FAFB;
--border: #E5E7EB;
--blue: #3B82F6;
--cyan: #06B6D4;
--purple: #8B5CF6;
--amber: #F59E0B;
--red: #EF4444;
--green: #10B981;
--positive: #22C55E;
--negative: #EF4444;
--connector: #94A3B8;
}
body {
font-family: -apple-system, BlinkMacSystemFont, 'PingFang SC', 'SimHei', system-ui, sans-serif;
background: var(--bg);
color: var(--text);
-webkit-font-smoothing: antialiased;
}
#root { width: fit-content; min-width: 800px; margin: 0 auto; padding: 48px 40px; }
</style>
</head>
<body>
<div id="root">
<!-- Content -->
</div>
</body>
</html>
```
### 3.3 CSS Variables: Node Color System
```css
:root {
/* Node types — low-saturation background + high-saturation border */
--node-bg: #EFF6FF; --node-border: #3B82F6; /* Normal step (blue) */
--node-decision-bg: #FFF7ED; --node-decision-border: #F59E0B; /* Decision (amber) */
--node-success-bg: #F0FDF4; --node-success-border: #10B981; /* Success (green) */
--node-end-bg: #F5F3FF; --node-end-border: #8B5CF6; /* End (purple) */
--group-bg: #F8FAFC; --group-border: #E2E8F0; /* Group container */
}
```
---
## Layout A: CSS Grid Flowchart (Universal, Most Common)
For: Flowcharts with branches/decisions, >10 nodes or long CJK text.
### Core CSS
```css
.flow-title {
font-size: 22px; font-weight: 700; color: var(--text);
text-align: center; margin-bottom: 40px;
}
.flow-subtitle {
font-size: 14px; color: var(--text-sub);
text-align: center; margin-top: 8px;
}
/* Grid container */
.flow-grid {
display: grid;
grid-template-columns: repeat(auto-fit, minmax(180px, 220px));
gap: 40px 60px;
justify-content: center;
position: relative;
}
/* Node */
.flow-node {
background: var(--node-bg); border: 2px solid var(--node-border);
border-radius: 10px; padding: 16px 20px;
text-align: center; position: relative; z-index: 1;
min-height: 56px; display: flex; flex-direction: column;
justify-content: center; align-items: center;
box-shadow: 0 1px 3px rgba(0,0,0,0.06);
max-width: 260px; /* Prevent single node from being too wide and crowding parallel nodes */
word-break: break-word; /* Force line break for overly long text */
white-space: normal; /* Allow line breaks (override possible nowrap) */
}
.flow-node .node-title { font-size: 15px; font-weight: 600; line-height: 1.4; }
.flow-node .node-desc { font-size: 12px; color: var(--text-sub); margin-top: 4px; }
/* Node variants */
.flow-node.start { border-radius: 24px; }
.flow-node.decision { background: var(--node-decision-bg); border-color: var(--node-decision-border); }
.flow-node.success { background: var(--node-success-bg); border-color: var(--node-success-border); }
.flow-node.end { background: var(--node-end-bg); border-color: var(--node-end-border); }
/* Group box */
.flow-group {
background: var(--group-bg); border: 1.5px dashed var(--group-border);
border-radius: 12px; padding: 24px 20px 20px; position: relative;
}
.flow-group .group-label {
position: absolute; top: -10px; left: 16px;
background: var(--bg); padding: 0 8px;
font-size: 12px; font-weight: 600; color: var(--text-sub);
}
/* ─── Parallel branch constraints (prevent overlap when multiple nodes in same row) ─── */
/*
⚠️ Parallel branch iron rules:
1. Gap between parallel nodes in same row ≥ 40px (guaranteed by .flow-grid gap)
2. Each node max-width: 260px + word-break: break-word (set in .flow-node)
3. If parallel nodes > 3 → switch to vertical branch layout (don't force-squeeze into one row)
4. Text over 15 CJK characters → must line-break, don't expand node width
5. When using flex instead of manual grid-column for parallel areas, add flex-wrap: wrap as fallback
*/
.parallel-group {
display: flex; gap: 40px; justify-content: center;
flex-wrap: wrap; /* Fallback: auto-wrap when exceeding width */
}
.parallel-group .flow-node {
flex: 0 1 220px; /* Max 220px, can shrink, won't grow infinitely */
}
/* SVG connector layer */
.flow-connectors {
position: absolute; top: 0; left: 0; width: 100%; height: 100%;
pointer-events: none; z-index: 0;
}
.flow-connectors line, .flow-connectors path {
stroke: var(--connector); stroke-width: 2; fill: none;
marker-end: url(#arrowhead);
}
.connector-label {
font-size: 12px; fill: var(--text-sub); text-anchor: middle;
font-family: -apple-system, 'PingFang SC', 'SimHei', sans-serif;
}
/* Legend — must be independent container, not inside flow-grid */
.flow-legend {
display: flex; gap: 24px; justify-content: center;
margin-top: 40px; padding: 16px 24px;
background: #F9FAFB; border-radius: 8px; border: 1px solid #E5E7EB;
}
.legend-item { display: flex; align-items: center; gap: 8px; font-size: 13px; color: #4B5563; }
.legend-dot { width: 12px; height: 12px; border-radius: 3px; border: 2px solid; }
```
### Auto Connector Script
```javascript
// connections = [['sourceID', 'targetID', 'label'], ...]
function drawConnectors(connections) {
const svg = document.getElementById('connectorSvg');
const container = svg.parentElement;
const cRect = container.getBoundingClientRect();
svg.setAttribute('width', cRect.width);
svg.setAttribute('height', cRect.height);
svg.setAttribute('viewBox', `0 0 ${cRect.width} ${cRect.height}`);
// Clear old connectors (keep defs)
svg.querySelectorAll('line, path, text.connector-label').forEach(el => el.remove());
connections.forEach(([fromId, toId, label]) => {
const fromEl = document.querySelector(`[data-id="${fromId}"]`);
const toEl = document.querySelector(`[data-id="${toId}"]`);
if (!fromEl || !toEl) return;
const f = fromEl.getBoundingClientRect();
const t = toEl.getBoundingClientRect();
const x1 = f.left + f.width/2 - cRect.left;
const y1 = f.bottom - cRect.top;
const x2 = t.left + t.width/2 - cRect.left;
const y2 = t.top - cRect.top;
if (Math.abs(x1 - x2) < 10) {
// Same column → straight line
const line = document.createElementNS('http://www.w3.org/2000/svg', 'line');
line.setAttribute('x1', x1); line.setAttribute('y1', y1);
line.setAttribute('x2', x2); line.setAttribute('y2', y2);
svg.appendChild(line);
} else {
// Different column → bent line
const midY = (y1 + y2) / 2;
const path = document.createElementNS('http://www.w3.org/2000/svg', 'path');
path.setAttribute('d', `M ${x1} ${y1} L ${x1} ${midY} L ${x2} ${midY} L ${x2} ${y2}`);
svg.appendChild(path);
}
if (label) {
const text = document.createElementNS('http://www.w3.org/2000/svg', 'text');
text.setAttribute('x', (x1 + x2) / 2);
text.setAttribute('y', (y1 + y2) / 2 - 6);
text.setAttribute('class', 'connector-label');
text.textContent = label;
svg.appendChild(text);
}
});
}
// SVG defs (arrow definitions)
function ensureArrowDef(svg) {
if (svg.querySelector('#arrowhead')) return;
const defs = document.createElementNS('http://www.w3.org/2000/svg', 'defs');
defs.innerHTML = '<marker id="arrowhead" markerWidth="10" markerHeight="7" refX="9" refY="3.5" orient="auto"><polygon points="0 0, 10 3.5, 0 7" fill="#94A3B8" /></marker>';
svg.appendChild(defs);
}
```
### HTML Structure Example
```html
<div id="root">
<div class="flow-title">流程图标题</div>
<div style="position: relative;">
<svg class="flow-connectors" id="connectorSvg"></svg>
<div class="flow-grid" id="flowGrid">
<div class="flow-node start" data-id="start" style="grid-column: 2;">
<div class="node-title">开始</div>
</div>
<div class="flow-node" data-id="step1" style="grid-column: 1;">
<div class="node-title">步骤一</div>
<div class="node-desc">详细说明</div>
</div>
<div class="flow-node decision" data-id="decide" style="grid-column: 2;">
<div class="node-title">判断条件?</div>
</div>
<div class="flow-node end" data-id="end" style="grid-column: 2;">
<div class="node-title">结束</div>
</div>
</div>
</div>
<div class="flow-legend">
<div class="legend-item"><div class="legend-dot" style="border-color:#3B82F6;background:#EFF6FF;"></div>步骤</div>
<div class="legend-item"><div class="legend-dot" style="border-color:#F59E0B;background:#FFF7ED;"></div>判断</div>
</div>
</div>
<script>
const connections = [
['start', 'step1', ''], ['step1', 'decide', ''],
['decide', 'end', '通过']
];
window.addEventListener('load', () => {
ensureArrowDef(document.getElementById('connectorSvg'));
setTimeout(() => drawConnectors(connections), 100);
});
</script>
```
---
## Layout B: Snake Flowchart (>4 Linear Steps)
For: Linear non-branching processes with many steps (5-20).
### Key Rules
- Max 4 nodes per row
- First row left→right, second row right→left (snake pattern)
- Connectors that turn from the end of one row to the start of the next use bent lines, with ≥60px clearance at the turn
- Node positions in grid are auto-calculated by JS (no manual grid-column)
### Snake Layout Generation Script
```javascript
function layoutSnake(nodeIds, cols) {
cols = cols || 4;
nodeIds.forEach((id, i) => {
const row = Math.floor(i / cols);
const colInRow = i % cols;
const col = row % 2 === 0 ? colInRow + 1 : cols - colInRow; // Even rows L→R, odd rows R→L
const el = document.querySelector(`[data-id="${id}"]`);
if (el) {
el.style.gridRow = row + 1;
el.style.gridColumn = col;
}
});
}
```
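The index arithmetic in `layoutSnake` can be verified offline before any HTML is generated. The sketch below mirrors it in Python and returns 1-based `(grid-row, grid-column)` pairs; `snake_positions` is a hypothetical helper name.

```python
def snake_positions(node_count, cols=4):
    """Return 1-based (row, column) grid positions for a snake layout."""
    positions = []
    for i in range(node_count):
        row = i // cols
        col_in_row = i % cols
        # Even rows (0, 2, ...) run left to right, odd rows run right to left.
        col = col_in_row + 1 if row % 2 == 0 else cols - col_in_row
        positions.append((row + 1, col))
    return positions


print(snake_positions(6))
# [(1, 1), (1, 2), (1, 3), (1, 4), (2, 4), (2, 3)]
```

Node 5 lands directly below node 4, which is what keeps the row-turn connector a short vertical segment.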
---
## Layout C: Phased Vertical Flowchart (⭐ DEFAULT for all flowcharts)
**This is the DEFAULT layout for flowcharts.** When the user asks for any flowchart without specifying format, use this layout.
For: Any process with phases/stages, which is nearly ALL real-world processes (manufacturing, legal, project management, business operations, etc.). Also the safe fallback when unsure.
**Why this is the default:** Layout C produces consistently professional, readable results. Even if the process has only 2 "phases", the card-based grouping still looks clean. In contrast, Layout A (Grid) without proper grouping produces scattered, unreadable results.
### Key Design
**Phase titles vs sub-steps must have clear visual distinction**:
- Phase titles: colored background, font-size ≥ 16px, font-weight: 700
- Sub-steps: white/light gray background, font-size 14-15px, font-weight: 400-500
**No arrows between sub-steps**. Arrows only connect phase-to-phase. Sub-steps use indent + numbering for sequence.
### Phase Connector Direction Rule
Phase-to-phase connector arrows MUST match the logical flow direction. If the flow goes top → bottom, arrows point ↓. If bottom → top, arrows point ↑. If left → right, arrows point →. **Never draw arrows opposing the flow direction.**
### Additional CSS
```css
.phase-group {
background: #F8FAFC; border-radius: 12px; padding: 20px 24px;
margin-bottom: 24px;
}
.phase-title {
font-size: 16px; font-weight: 700; padding: 10px 16px;
border-radius: 8px; margin-bottom: 16px;
}
.phase-steps { display: flex; flex-direction: column; gap: 8px; padding-left: 12px; }
.phase-step {
font-size: 14px; font-weight: 400; color: var(--text);
padding: 8px 14px; background: white; border-radius: 6px;
border: 1px solid var(--border);
}
.phase-step .step-num {
display: inline-block; width: 22px; height: 22px; line-height: 22px;
text-align: center; border-radius: 50%; font-size: 12px; font-weight: 600;
margin-right: 8px;
}
/* Phase colors — same-hue blue-gray gradient (low saturation, easy on the eyes)
All phases share the blue-gray color family, distinguished by brightness progression.
🚫 FORBIDDEN: Different hue per phase (blue→green→amber→purple) — becomes rainbow with many phases.
✅ CORRECT: Progress within same hue family (light→dark), or pure grayscale + single-color accent.
Two schemes provided below; model selects based on phase count:
- ≤4 phases: Scheme A (blue-gray progression)
- 5-7 phases: Scheme B (neutral gray base + blue accent progression)
- >7 phases: all use same gray base, distinguish only by numbering
*/
/* Scheme A: Blue-gray progression (≤4 phases) */
.phase-1 .phase-title { background: #F0F4F8; color: #334155; border-left: 4px solid #64748B; }
.phase-1 .step-num { background: #E2E8F0; color: #475569; }
.phase-2 .phase-title { background: #E8EDF2; color: #1E3A5F; border-left: 4px solid #5B7A99; }
.phase-2 .step-num { background: #DBEAFE; color: #1E3A5F; }
.phase-3 .phase-title { background: #E0E7EF; color: #1E3050; border-left: 4px solid #4A6B8A; }
.phase-3 .step-num { background: #D0D9E4; color: #1E3050; }
.phase-4 .phase-title { background: #D8E0EA; color: #172540; border-left: 4px solid #3A5C7A; }
.phase-4 .step-num { background: #C7D2E0; color: #172540; }
/* Scheme B: Neutral gray base + blue accent progression (5-7 phases) */
/*
.phase-1 .phase-title { background: #F8FAFC; color: #334155; border-left: 4px solid #94A3B8; }
.phase-1 .step-num { background: #F1F5F9; color: #475569; }
.phase-2 .phase-title { background: #F1F5F9; color: #334155; border-left: 4px solid #7B8FA3; }
.phase-2 .step-num { background: #E2E8F0; color: #475569; }
.phase-3 .phase-title { background: #E8EDF3; color: #2D4156; border-left: 4px solid #6B809A; }
.phase-3 .step-num { background: #DBEAFE; color: #2D4156; }
.phase-4 .phase-title { background: #E2E8F0; color: #283C52; border-left: 4px solid #5B7590; }
.phase-4 .step-num { background: #D0D9E4; color: #283C52; }
.phase-5 .phase-title { background: #DAE1EB; color: #23364D; border-left: 4px solid #4B6A87; }
.phase-5 .step-num { background: #C7D2E0; color: #23364D; }
.phase-6 .phase-title { background: #D2DAE5; color: #1E3048; border-left: 4px solid #3B5F7D; }
.phase-6 .step-num { background: #BFC9D8; color: #1E3048; }
.phase-7 .phase-title { background: #CAD3E0; color: #192A3E; border-left: 4px solid #2B5473; }
.phase-7 .step-num { background: #B7C2D2; color: #192A3E; }
*/
```
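The scheme-selection rule stated in the CSS comment above (≤4, 5-7, >7 phases) can be sketched as a small lookup. The function name and the return labels `'A'`, `'B'`, and `'gray'` are assumptions chosen for illustration.

```python
def select_phase_scheme(phase_count):
    """Map a phase count to the color scheme named in the CSS comment."""
    if phase_count < 1:
        raise ValueError('phase_count must be at least 1')
    if phase_count <= 4:
        return 'A'     # blue-gray progression
    if phase_count <= 7:
        return 'B'     # neutral gray base + blue accent progression
    return 'gray'      # single gray base, phases distinguished by numbering only


print(select_phase_scheme(3), select_phase_scheme(6), select_phase_scheme(9))
# A B gray
```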
### HTML Structure
```html
<div id="root">
<div class="flow-title">项目流程</div>
<div class="phase-group phase-1">
<div class="phase-title">第一阶段:需求分析</div>
<div class="phase-steps">
<div class="phase-step"><span class="step-num">1</span>需求收集与整理</div>
<div class="phase-step"><span class="step-num">2</span>可行性评估</div>
<div class="phase-step"><span class="step-num">3</span>需求优先级排序</div>
</div>
</div>
<!-- Phase-to-phase arrow (use SVG or simple centered down arrow) -->
<div style="text-align:center; color:#94A3B8; font-size:24px; margin: 8px 0;">↓</div>
<div class="phase-group phase-2">
<div class="phase-title">第二阶段:设计开发</div>
<div class="phase-steps">
<div class="phase-step"><span class="step-num">4</span>UI/UX 设计</div>
<div class="phase-step"><span class="step-num">5</span>前后端开发</div>
</div>
</div>
</div>
```
---
## Layout D: Dual-Panel / Swimlane Flowchart
For: Processes involving multiple roles or departments.
### Key Rules
- Canvas width ≥ 1600px
- Left panel for role/swimlane labels, right panel for flow nodes
- Role labels font-size ≥ 12px, solid background, right edge ≥ 40px from canvas
- `overflow: hidden` is forbidden
### Additional CSS
```css
.dual-layout { display: flex; gap: 40px; }
.role-panel {
flex-shrink: 0; width: 160px;
display: flex; flex-direction: column; gap: 12px;
}
.role-tag {
font-size: 13px; font-weight: 600; padding: 8px 12px;
border-radius: 6px; text-align: center;
}
.flow-panel { flex: 1; min-width: 0; }
```
---
## Infographic Templates
The following templates are for non-flowchart information visualization.
### Template: KPI Dashboard Cards
```css
.kpi-grid {
display: grid; grid-template-columns: repeat(4, 1fr);
gap: 20px; margin-bottom: 32px;
}
.kpi-card {
background: var(--surface); border: 1px solid var(--border);
border-radius: 12px; padding: 24px; text-align: center;
}
.kpi-label { font-size: 13px; color: var(--text-sub); margin-bottom: 8px; }
.kpi-value { font-size: 32px; font-weight: 700; }
.kpi-change { font-size: 14px; font-weight: 600; margin-top: 8px; }
.kpi-change.up { color: var(--positive); }
.kpi-change.down { color: var(--negative); }
```
### Template: CSS Bar Chart (Pure CSS, No JS)
```css
.bar-chart {
display: flex; align-items: flex-end; gap: 16px;
height: 300px; padding: 0 20px; border-bottom: 1px solid var(--border);
}
.bar-item { flex: 1; display: flex; flex-direction: column; align-items: center; gap: 8px; }
.bar { width: 100%; max-width: 60px; border-radius: 6px 6px 0 0; background: var(--border); }
.bar.highlight { background: var(--blue); }
.bar-label { font-size: 12px; color: var(--text-sub); }
.bar-value { font-size: 11px; color: var(--text-muted); font-weight: 600; }
```
### Template: Gradient Background Infographic Header
```css
.hero-header {
background: linear-gradient(135deg, #1E293B 0%, #0F172A 100%);
border-radius: 16px; padding: 48px; color: white; margin-bottom: 32px;
}
.hero-header h1 { font-size: 28px; font-weight: 700; margin-bottom: 12px; }
.hero-header p { font-size: 15px; color: #94A3B8; max-width: 600px; line-height: 1.6; }
.hero-badge {
display: inline-block; background: rgba(59,130,246,0.15);
color: #60A5FA; font-size: 12px; font-weight: 600;
padding: 4px 12px; border-radius: 20px; margin-bottom: 16px;
}
```
### Template: Data Card + Mini Sparkline
```html
<div class="metric-card">
<div class="metric-info">
<div class="metric-title">月活用户</div>
<div class="metric-value">34,521</div>
<div class="metric-change" style="color: var(--positive)">↑ +18.2%</div>
</div>
<div class="metric-sparkline">
<svg width="120" height="40" viewBox="0 0 120 40">
<polyline fill="none" stroke="var(--blue)" stroke-width="2"
points="0,35 15,30 30,25 45,28 60,20 75,15 90,12 105,8 120,5" />
</svg>
</div>
</div>
```
```css
.metric-card {
background: white; border: 1px solid var(--border);
border-radius: 16px; padding: 28px;
display: flex; justify-content: space-between; align-items: center;
}
.metric-title { font-size: 13px; color: var(--text-sub); }
.metric-value { font-size: 36px; font-weight: 700; margin: 4px 0; }
.metric-change { font-size: 14px; font-weight: 600; }
```
---
## Advanced Connector Rules
### Many-to-One Convergence
When multiple lines converge into one node, use the "merge first, then enter" pattern:
```
[A] ──┐
[B] ──┤── → [Target]
[C] ──┘
```
Implementation: Source lines reach a relay x-coordinate, merge into one vertical line, then a single line enters the target.
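A minimal sketch of the "merge first, then enter" geometry, assuming sources sit to the left of the target and that each point is an edge midpoint already converted to SVG coordinates. The function name `convergence_paths` and the sample coordinates are hypothetical.

```python
def convergence_paths(sources, target, relay_x):
    """Build SVG path 'd' strings for many-to-one convergence.

    sources: list of (x, y) exit points, one per source node
    target:  (x, y) entry point on the target node
    relay_x: x-coordinate where the source lines merge
    Returns (stubs, trunk, entry).
    """
    tx, ty = target
    # One horizontal stub from each source to the relay line.
    stubs = [f'M {sx} {sy} L {relay_x} {sy}' for sx, sy in sources]
    # One vertical trunk spanning every stub and the target's entry height.
    ys = [sy for _, sy in sources] + [ty]
    trunk = f'M {relay_x} {min(ys)} L {relay_x} {max(ys)}'
    # A single line from the trunk into the target.
    entry = f'M {relay_x} {ty} L {tx} {ty}'
    return stubs, trunk, entry


stubs, trunk, entry = convergence_paths(
    [(200, 50), (200, 150), (200, 250)], target=(400, 150), relay_x=300)
print(trunk)  # M 300 50 L 300 250
print(entry)  # M 300 150 L 400 150
```

Only the entry segment should carry an arrowhead. With the Layout A stylesheet, where every `path` gets `marker-end`, the stubs and trunk would need `marker-end: none`.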
### Cross-Layer Connector Avoidance
If other nodes block the path between two nodes, **do not draw a straight line through them**.
Priority:
1. **Redesign hierarchy** (best) — Most cross-layer lines indicate hierarchy design issues; connect adjacent layers only
2. **Detour line** — Route around middle nodes via canvas edge (offset 40px right)
3. **Thread through gap** — If middle-layer node gap ≥ 40px, route through the gap
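When the hierarchy cannot be redesigned, options 2 and 3 reduce to choosing an x-coordinate for the vertical run. The sketch below lists candidates in the priority order above. It assumes a top-to-bottom flow, takes the blocking layer as `(left, right)` x-extents, and interprets "offset 40px right" as 40px beyond the rightmost blocking node; that reading and the name `routing_options` are assumptions.

```python
def routing_options(middle_boxes, min_gap=40, detour_offset=40):
    """List candidate x-coordinates for a cross-layer connector.

    middle_boxes: (left, right) x-extents of the nodes in the blocking layer.
    Returns [('detour', x), ('gap', x), ...] in priority order.
    """
    boxes = sorted(middle_boxes)
    if not boxes:
        return []  # nothing blocks the path, so a direct line is fine
    # Priority 2: detour around the right side of the blocking layer.
    options = [('detour', max(right for _, right in boxes) + detour_offset)]
    # Priority 3: thread through any gap that is wide enough.
    for (_, right), (left, _) in zip(boxes, boxes[1:]):
        if left - right >= min_gap:
            options.append(('gap', (left + right) / 2))
    return options


print(routing_options([(0, 200), (260, 460)]))
# [('detour', 500), ('gap', 230.0)]
print(routing_options([(0, 200), (220, 420)]))
# [('detour', 460)]
```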
### Connector Alignment
- Multiple lines from the same node start at a consistent position on the node border
- Use only one connector style per chart (right-angle/curved/straight), no mixing
- Connector label positions must be consistent (all above line or all centered)
---
## Font Size Rules
| Element | Recommended | Minimum |
|------|------|------|
| Flowchart main node title | 16-18px | 14px |
| Node description / subtext | 13-15px | 12px |
| Connector labels | 12-14px | 11px |
| Legend text | 13-14px | 12px |
| Footnotes / watermark | 11-13px | 10px |
| Phase titles | 16-18px | 16px |
| Role labels | 13-14px | 12px |
**Not enough space → Enlarge canvas, don't shrink fonts.**
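The minimums in the table can be enforced before rendering. The sketch below copies the "Minimum" column into a lookup and reports any element that falls short; the key names and the function `font_violations` are hypothetical.

```python
# Minimum font sizes in px, copied from the "Minimum" column above.
MIN_FONT_PX = {
    'node_title': 14,
    'node_desc': 12,
    'connector_label': 11,
    'legend': 12,
    'footnote': 10,
    'phase_title': 16,
    'role_label': 12,
}


def font_violations(sizes):
    """Return {element: (actual, minimum)} for every size below its minimum."""
    return {
        name: (px, MIN_FONT_PX[name])
        for name, px in sizes.items()
        if name in MIN_FONT_PX and px < MIN_FONT_PX[name]
    }


print(font_violations({'node_title': 15, 'node_desc': 11, 'phase_title': 16}))
# {'node_desc': (11, 12)}
```

A non-empty result means the canvas should be enlarged and the layout recomputed, not that the font should be reduced further.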
---
## Overflow Protection
1. **`#root { width: fit-content; min-width: 800px; }`** — Expands with content automatically
2. **`overflow: hidden` / `overflow: clip` are forbidden**
3. **Auto-resize viewport before Playwright screenshot** (see 3.1 screenshot script)
4. **Canvas minimum width**:
| Layout Type | Minimum Width | Recommended |
|---------|---------|------|
| Single-column flowchart | 800px | 1000px |
| Dual-panel / swimlane | 1400px | 1600-1800px |
| Three-column / multi-panel | 1800px | 2000-2400px |
---
## Quality Checklist
1. **Layout C used by default** — If this is a flowchart, verify you're using Layout C (Phased Vertical) unless there's a specific reason not to
2. **Content complete** — Every node/step from the requirement is in the chart
3. **No overlap** — No boxes covering boxes, no lines through boxes
4. **Clear hierarchy** — Phase titles and sub-steps are instantly distinguishable
5. **Colors reasonable** — Node backgrounds low-saturation, no children's-drawing palette
6. **Connectors visible** — Connector color at least as dark as `#94A3B8`, arrow direction correct
7. **Font sizes meet standards** — Check against font size table, nothing below minimum
8. **Legend independent** — Not inside flow-grid, not obscured by any node
9. **No clipping** — Padding ≥ 40px on all sides, all nodes and labels fully visible
10. **Phase colors consistent** — Same hue family, no blue/brown/green/purple mix
11. **Connectors don't pass through nodes** — Cross-layer lines use detour or redesign hierarchy
12. **No scattered layout** — Phase titles MUST be inside group cards, NOT floating as isolated labels
---
## Playwright+CSS vs Other Approaches
| Capability | Playwright+CSS | matplotlib | ECharts |
|------|---------------|------------|---------|
| Gradients/shadows/rounded | ✅ Full CSS power | ❌ Limited | ⚠️ Partial |
| Responsive layout | ✅ Flexbox/Grid | ❌ Fixed size | ⚠️ resize |
| PNG/PDF export | ✅ Native | ✅ savefig | ⚠️ Needs Playwright |
| Precise data charts | ⚠️ Manual | ✅ Built-in | ✅ Built-in |
**Best practice: CSS for layout and visual design, embed ECharts/SVG for precise charts.**


@@ -0,0 +1,576 @@
# CSS Radial Grid Layout (Center-Outward Diagrams)
> **⚠️ Before writing any code, read [`_rules.md`](references/_rules.md) — three non-negotiable rules on overlap, hierarchy, and color.**
**For: SWOT analysis, Balanced Scorecard (BSC), Porter's Five Forces, PEST analysis, and any "center + 4-6 surrounding dimensions" diagram.**
**Core principle: Use flex rows to lock positions — model never calculates coordinates. Connectors are drawn by script reading bounding boxes.**
---
## When to Use This Template
| Diagram Type | Dimensions | Use This? |
|-------------|-----------|-----------|
| SWOT (Strengths/Weaknesses/Opportunities/Threats) | 4 quadrants | ✅ Layout B (2×2 Grid) |
| Balanced Scorecard (Financial/Customer/Internal/Learning) | 4 dimensions | ✅ Layout A (Cross) |
| Porter's Five Forces | 5 forces | ✅ Layout A (Cross + extra row) |
| PEST (Political/Economic/Social/Technological) | 4 dimensions | ✅ Layout B (2×2 Grid) |
| Competency wheel / capability map | 5-8 dimensions | ✅ Layout A with extra rows |
| Anything with center + surrounding elements | 3-8 | ✅ |
---
## Layout A: Cross Layout (4-6 Dimensions)
Best for: BSC, Porter's Five Forces, any "center with surrounding dimensions" structure.
### 🚫 FORBIDDEN: 3×3 CSS Grid Cross
**Do NOT use `grid-template-columns: Xpx Ypx Xpx` to place cards in a cross pattern.** The top/bottom cards in the center column will overflow into the side columns and overlap with left/right cards when content is longer than expected.
### ✅ REQUIRED: Three-Row Flex Layout
**Each row is an independent flex container. Rows cannot overlap each other — physically impossible.**
```
Row 1 (top): [top dimension card] ← independent flex row, centered
Row 2 (middle): [left card] [center] [right card] ← independent flex row, horizontal
Row 3 (bottom): [bottom dimension card] ← independent flex row, centered
```
### HTML + CSS
```html
<!DOCTYPE html>
<html lang="zh">
<head>
<meta charset="UTF-8">
<style>
* { margin: 0; padding: 0; box-sizing: border-box; }
body {
font-family: -apple-system, BlinkMacSystemFont, 'PingFang SC', 'SimHei', sans-serif;
background: #FFFFFF;
}
#root { width: fit-content; min-width: 900px; padding: 48px 60px; }
.diagram-title {
font-size: 22px; font-weight: 700; color: #1F2937;
text-align: center; margin-bottom: 6px;
}
.diagram-subtitle {
font-size: 14px; color: #6B7280;
text-align: center; margin-bottom: 40px;
}
/* === Three-row flex layout: rows CANNOT overlap === */
.cross-layout {
display: flex;
flex-direction: column;
align-items: center;
gap: 24px;
position: relative;
}
/* Middle row: left card + center + right card */
.middle-row {
display: flex;
align-items: center;
gap: 40px;
}
/* Center node */
.center-node {
background: linear-gradient(135deg, #1E293B, #334155);
color: white; font-size: 18px; font-weight: 700;
padding: 24px 32px; border-radius: 14px;
text-align: center; z-index: 2;
box-shadow: 0 4px 16px rgba(0,0,0,0.18);
flex-shrink: 0;
}
.center-sub {
font-size: 12px; font-weight: 400; color: #94A3B8;
margin-top: 4px;
}
/* Dimension cards */
.dim-card {
background: #FFFFFF; border-radius: 12px;
padding: 20px; border: 2px solid;
box-shadow: 0 1px 4px rgba(0,0,0,0.06);
z-index: 2;
width: 260px;
flex-shrink: 0;
}
.dim-card .dim-title {
font-size: 15px; font-weight: 700; margin-bottom: 10px;
padding-bottom: 8px; border-bottom: 2px solid;
display: flex; align-items: center; gap: 8px;
}
.dim-card .dim-icon {
width: 26px; height: 26px; border-radius: 6px;
display: flex; align-items: center; justify-content: center;
font-size: 13px; font-weight: 700; flex-shrink: 0;
}
.dim-card .dim-items { list-style: none; }
.dim-card .dim-items li {
font-size: 13px; color: #475569; padding: 4px 0 4px 18px;
position: relative; line-height: 1.5;
}
.dim-card .dim-items li::before {
content: ''; position: absolute; left: 0; top: 10px;
width: 6px; height: 6px; border-radius: 50%;
}
/* Color variants — border + title border + bullet */
.dim-blue { border-color: #3B82F6; }
.dim-blue .dim-title { color: #1E40AF; border-color: #3B82F6; }
.dim-blue .dim-icon { background: #DBEAFE; color: #1E40AF; }
.dim-blue li::before { background: #3B82F6; }
.dim-green { border-color: #10B981; }
.dim-green .dim-title { color: #065F46; border-color: #10B981; }
.dim-green .dim-icon { background: #D1FAE5; color: #065F46; }
.dim-green li::before { background: #10B981; }
.dim-amber { border-color: #F59E0B; }
.dim-amber .dim-title { color: #92400E; border-color: #F59E0B; }
.dim-amber .dim-icon { background: #FEF3C7; color: #92400E; }
.dim-amber li::before { background: #F59E0B; }
.dim-purple { border-color: #8B5CF6; }
.dim-purple .dim-title { color: #5B21B6; border-color: #8B5CF6; }
.dim-purple .dim-icon { background: #EDE9FE; color: #5B21B6; }
.dim-purple li::before { background: #8B5CF6; }
.dim-red { border-color: #EF4444; }
.dim-red .dim-title { color: #991B1B; border-color: #EF4444; }
.dim-red .dim-icon { background: #FEE2E2; color: #991B1B; }
.dim-red li::before { background: #EF4444; }
.dim-cyan { border-color: #06B6D4; }
.dim-cyan .dim-title { color: #155E75; border-color: #06B6D4; }
.dim-cyan .dim-icon { background: #CFFAFE; color: #155E75; }
.dim-cyan li::before { background: #06B6D4; }
/* SVG connector layer */
.cross-connectors {
position: absolute; top: 0; left: 0; width: 100%; height: 100%;
pointer-events: none; z-index: 1;
}
</style>
</head>
<body>
<div id="root">
<div class="diagram-title">平衡计分卡四维评价体系</div>
<div class="diagram-subtitle">基于战略目标的绩效管理框架</div>
<div class="cross-layout" id="crossLayout">
<!-- SVG connectors drawn by script -->
<svg class="cross-connectors" id="connSvg"></svg>
<!-- Row 1: top dimension -->
<div class="dim-card dim-blue" data-pos="top">
<div class="dim-title"><div class="dim-icon">F</div> 财务维度</div>
<ul class="dim-items">
<li>营收增长率</li>
<li>利润率</li>
<li>ROI</li>
</ul>
</div>
<!-- Row 2: left + center + right -->
<div class="middle-row">
<div class="dim-card dim-green" data-pos="left">
<div class="dim-title"><div class="dim-icon">I</div> 内部流程</div>
<ul class="dim-items">
<li>流程效率</li>
<li>质量管控</li>
<li>创新能力</li>
</ul>
</div>
<div class="center-node">战略目标</div>
<div class="dim-card dim-amber" data-pos="right">
<div class="dim-title"><div class="dim-icon">C</div> 客户维度</div>
<ul class="dim-items">
<li>客户满意度</li>
<li>市场份额</li>
<li>客户留存率</li>
</ul>
</div>
</div>
<!-- Row 3: bottom dimension -->
<div class="dim-card dim-purple" data-pos="bottom">
<div class="dim-title"><div class="dim-icon">L</div> 学习与成长</div>
<ul class="dim-items">
<li>员工能力提升</li>
<li>信息系统建设</li>
<li>组织文化</li>
</ul>
</div>
</div>
</div>
<script>
function drawCrossConnectors() {
const layout = document.getElementById('crossLayout');
const svg = document.getElementById('connSvg');
const gRect = layout.getBoundingClientRect();
svg.setAttribute('width', gRect.width);
svg.setAttribute('height', gRect.height);
svg.setAttribute('viewBox', `0 0 ${gRect.width} ${gRect.height}`);
svg.innerHTML = '';
// Bidirectional arrow markers — eliminates direction ambiguity
const defs = document.createElementNS('http://www.w3.org/2000/svg', 'defs');
defs.innerHTML = `
<marker id="arrowEnd" markerWidth="8" markerHeight="6" refX="7" refY="3" orient="auto">
<polygon points="0 0, 8 3, 0 6" fill="#94A3B8" />
</marker>
<marker id="arrowStart" markerWidth="8" markerHeight="6" refX="1" refY="3" orient="auto">
<polygon points="8 0, 0 3, 8 6" fill="#94A3B8" />
</marker>
`;
svg.appendChild(defs);
const center = layout.querySelector('.center-node');
const cR = center.getBoundingClientRect();
const cx = cR.left - gRect.left + cR.width / 2;
const cy = cR.top - gRect.top + cR.height / 2;
// Draw connector from center edge to each card edge
const cards = layout.querySelectorAll('.dim-card');
cards.forEach(card => {
const pos = card.dataset.pos;
const r = card.getBoundingClientRect();
const cardCx = r.left - gRect.left + r.width / 2;
const cardCy = r.top - gRect.top + r.height / 2;
let x1, y1, x2, y2;
switch (pos) {
case 'top':
x1 = cx; y1 = cR.top - gRect.top; // center top edge midpoint
x2 = cardCx; y2 = r.bottom - gRect.top; // card bottom edge midpoint
break;
case 'bottom':
x1 = cx; y1 = cR.bottom - gRect.top; // center bottom edge midpoint
x2 = cardCx; y2 = r.top - gRect.top; // card top edge midpoint
break;
case 'left':
x1 = cR.left - gRect.left; y1 = cy; // center left edge midpoint
x2 = r.right - gRect.left; y2 = cardCy; // card right edge midpoint
break;
case 'right':
x1 = cR.right - gRect.left; y1 = cy; // center right edge midpoint
x2 = r.left - gRect.left; y2 = cardCy; // card left edge midpoint
break;
// --- 5+ dimension: bottom row with two cards side by side ---
case 'bottom-left':
x1 = cx; y1 = cR.bottom - gRect.top; // center BOTTOM MIDPOINT (not corner!)
x2 = cardCx; y2 = r.top - gRect.top; // card TOP MIDPOINT
break;
case 'bottom-right':
x1 = cx; y1 = cR.bottom - gRect.top; // center BOTTOM MIDPOINT (not corner!)
x2 = cardCx; y2 = r.top - gRect.top; // card TOP MIDPOINT
break;
}
// 🚫 FORBIDDEN: drawing lines from center CORNERS (e.g. cR.left + cR.bottom)
// All lines MUST originate from center EDGE MIDPOINTS (cx or cy)
const line = document.createElementNS('http://www.w3.org/2000/svg', 'line');
line.setAttribute('x1', x1); line.setAttribute('y1', y1);
line.setAttribute('x2', x2); line.setAttribute('y2', y2);
line.setAttribute('stroke', '#94A3B8');
line.setAttribute('stroke-width', '2');
line.setAttribute('stroke-dasharray', '6,4');
line.setAttribute('marker-start', 'url(#arrowStart)');
line.setAttribute('marker-end', 'url(#arrowEnd)');
svg.appendChild(line);
});
}
window.addEventListener('load', () => setTimeout(drawCrossConnectors, 300));
</script>
</body>
</html>
```
**Key design decisions:**
- **Three-row flex** instead of 3×3 Grid — rows physically cannot overlap
- **Fixed card width (260px)** — prevents content-driven overflow
- **`gap: 24px`** between rows, **`gap: 40px`** within middle row — generous spacing
- **Center node uses `flex-shrink: 0`** — never collapses under pressure
### Adapting for 5+ Dimensions
For Porter's Five Forces (5 dimensions) or more:
```html
<!-- Row 1: top -->
<div class="dim-card dim-blue" data-pos="top">...</div>
<!-- Row 2: left + center + right -->
<div class="middle-row">
<div class="dim-card dim-green" data-pos="left">...</div>
<div class="center-node">...</div>
<div class="dim-card dim-amber" data-pos="right">...</div>
</div>
<!-- Row 3: bottom-left + bottom-right (two cards side by side) -->
<div class="middle-row" style="gap: 40px;">
<div class="dim-card dim-purple" data-pos="bottom-left">...</div>
<div class="dim-card dim-red" data-pos="bottom-right">...</div>
</div>
```
For the connector script, add cases for `bottom-left` and `bottom-right`:
```javascript
// 🚫 FORBIDDEN: using cR.left/cR.right as x1 — that draws from center CORNER, angle is ugly
// ✅ CORRECT: always use cx (center bottom midpoint) as x1
case 'bottom-left':
x1 = cx; y1 = cR.bottom - gRect.top; // center BOTTOM MIDPOINT
x2 = cardCx; y2 = r.top - gRect.top; // card TOP MIDPOINT
break;
case 'bottom-right':
x1 = cx; y1 = cR.bottom - gRect.top; // center BOTTOM MIDPOINT
x2 = cardCx; y2 = r.top - gRect.top; // card TOP MIDPOINT
break;
```
### Adapting for 3 Dimensions
Simply remove one side card from the middle row:
```html
<!-- Row 1: top -->
<div class="dim-card dim-blue" data-pos="top">...</div>
<!-- Row 2: center + right only -->
<div class="middle-row">
<div class="center-node">...</div>
<div class="dim-card dim-amber" data-pos="right">...</div>
</div>
<!-- Row 3: bottom -->
<div class="dim-card dim-purple" data-pos="bottom">...</div>
```
---
## Layout B: 2×2 Quadrant Grid (SWOT / PEST)
Best for: exactly 4 dimensions arranged as equal quadrants (no center node needed).
**This layout has NO center node — the four quadrants themselves tell the story.**
### HTML + CSS
```html
<!DOCTYPE html>
<html lang="zh">
<head>
<meta charset="UTF-8">
<style>
* { margin: 0; padding: 0; box-sizing: border-box; }
body {
font-family: -apple-system, BlinkMacSystemFont, 'PingFang SC', 'SimHei', sans-serif;
background: #FFFFFF;
}
#root { width: fit-content; min-width: 900px; padding: 48px 60px; }
.diagram-title {
font-size: 22px; font-weight: 700; color: #1F2937;
text-align: center; margin-bottom: 12px;
}
.diagram-subtitle {
font-size: 14px; color: #6B7280;
text-align: center; margin-bottom: 36px;
}
.quadrant-grid {
display: grid;
grid-template-columns: 1fr 1fr;
gap: 16px;
max-width: 800px;
margin: 0 auto;
}
.quadrant {
border-radius: 12px; padding: 24px;
border: 1.5px solid;
min-height: 200px;
}
.quadrant .q-title {
font-size: 16px; font-weight: 700; margin-bottom: 14px;
display: flex; align-items: center; gap: 8px;
}
.quadrant .q-icon {
width: 28px; height: 28px; border-radius: 6px;
display: flex; align-items: center; justify-content: center;
font-size: 14px; font-weight: 700;
}
.quadrant .q-items { list-style: none; }
.quadrant .q-items li {
font-size: 13px; padding: 5px 0 5px 18px;
position: relative; line-height: 1.5;
}
.quadrant .q-items li::before {
content: ''; position: absolute; left: 0; top: 11px;
width: 6px; height: 6px; border-radius: 50%;
}
/* SWOT colors — low-sat backgrounds, sat borders */
.q-strengths { background: #F0FDF4; border-color: #86EFAC; }
.q-strengths .q-title { color: #065F46; }
.q-strengths .q-icon { background: #D1FAE5; color: #065F46; }
.q-strengths li { color: #1F2937; }
.q-strengths li::before { background: #10B981; }
.q-weaknesses { background: #FEF2F2; border-color: #FECACA; }
.q-weaknesses .q-title { color: #991B1B; }
.q-weaknesses .q-icon { background: #FEE2E2; color: #991B1B; }
.q-weaknesses li { color: #1F2937; }
.q-weaknesses li::before { background: #EF4444; }
.q-opportunities { background: #EFF6FF; border-color: #93C5FD; }
.q-opportunities .q-title { color: #1E40AF; }
.q-opportunities .q-icon { background: #DBEAFE; color: #1E40AF; }
.q-opportunities li { color: #1F2937; }
.q-opportunities li::before { background: #3B82F6; }
.q-threats { background: #FFF7ED; border-color: #FDE68A; }
.q-threats .q-title { color: #92400E; }
.q-threats .q-icon { background: #FEF3C7; color: #92400E; }
.q-threats li { color: #1F2937; }
.q-threats li::before { background: #F59E0B; }
</style>
</head>
<body>
<div id="root">
<div class="diagram-title">SWOT 分析</div>
<div class="diagram-subtitle">企业战略定位评估</div>
<div class="quadrant-grid">
<div class="quadrant q-strengths">
<div class="q-title"><div class="q-icon">S</div> 优势 Strengths</div>
<ul class="q-items">
<li>核心技术领先</li>
<li>品牌知名度高</li>
<li>供应链成熟</li>
</ul>
</div>
<div class="quadrant q-weaknesses">
<div class="q-title"><div class="q-icon">W</div> 劣势 Weaknesses</div>
<ul class="q-items">
<li>国际化经验不足</li>
<li>产品线单一</li>
<li>人才储备有限</li>
</ul>
</div>
<div class="quadrant q-opportunities">
<div class="q-title"><div class="q-icon">O</div> 机会 Opportunities</div>
<ul class="q-items">
<li>新兴市场需求增长</li>
<li>政策利好</li>
<li>技术融合趋势</li>
</ul>
</div>
<div class="quadrant q-threats">
<div class="q-title"><div class="q-icon">T</div> 威胁 Threats</div>
<ul class="q-items">
<li>竞争加剧</li>
<li>原材料价格波动</li>
<li>法规变化风险</li>
</ul>
</div>
</div>
</div>
</body>
</html>
```
**No connectors needed** — the 2×2 grid itself communicates the four-quadrant relationship. Adding arrows would be visual noise.
---
## Connector Rules
1. **Connectors are ALWAYS drawn by script reading bounding boxes** — never hardcode x/y values in HTML/CSS
2. **Straight lines only** (horizontal or vertical) — no diagonal lines unless top/bottom cards are offset from center
3. **Dashed lines** (`stroke-dasharray: 6,4`) for conceptual relationships
4. **Solid lines** for causal/sequential relationships
5. **Arrow direction**: model chooses based on diagram semantics — pick ONE style per diagram, don't mix:
- **Outward** (`marker-end` only): center influences/drives dimensions (e.g. BSC: strategy → dimensions)
- **Inward** (`marker-start` only): dimensions report/pressure center (e.g. Porter: forces → competition)
- **Bidirectional** (`marker-start` + `marker-end`): mutual influence (e.g. feedback loops)
6. **🚫 All lines MUST originate from center EDGE MIDPOINTS** (cx or cy) — never from center corners
7. **Line color**: `#94A3B8` (gray-blue) — never use dimension-specific colors for connectors (visual chaos)
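The rules above can be sketched as a small lookup, useful when a generator script emits the `<line>` attributes itself. This is a minimal sketch in Python, not part of the template: `connector_attrs` and its `direction` / `relationship` arguments are hypothetical names, while the attribute values are the ones listed in rules 3, 4, 5, and 7.

```python
# Hypothetical helper: maps the connector rules to SVG <line> attributes.
LINE_COLOR = "#94A3B8"  # rule 7: one neutral color for every connector

# Rule 5: choose ONE direction style per diagram.
MARKERS = {
    "outward": {"marker-end": "url(#arrowEnd)"},
    "inward": {"marker-start": "url(#arrowStart)"},
    "bidirectional": {
        "marker-start": "url(#arrowStart)",
        "marker-end": "url(#arrowEnd)",
    },
}


def connector_attrs(direction, relationship="conceptual"):
    """Return the SVG attributes for one connector line."""
    if direction not in MARKERS:
        raise ValueError(f"unknown direction: {direction!r}")
    if relationship not in ("conceptual", "causal"):
        raise ValueError(f"unknown relationship: {relationship!r}")
    attrs = {"stroke": LINE_COLOR, "stroke-width": "2"}
    if relationship == "conceptual":
        attrs["stroke-dasharray"] = "6,4"  # rule 3: dashed; rule 4: solid otherwise
    attrs.update(MARKERS[direction])
    return attrs
```

Computing the attributes once per diagram and reusing them for every line is one way to keep the "pick ONE style, don't mix" rule from being broken by accident.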
### SVG Arrow Markers (copy-paste into defs)
```javascript
// Include both markers; use only what the diagram needs
const defs = document.createElementNS('http://www.w3.org/2000/svg', 'defs');
defs.innerHTML = `
<marker id="arrowEnd" markerWidth="8" markerHeight="6" refX="7" refY="3" orient="auto">
<polygon points="0 0, 8 3, 0 6" fill="#94A3B8" />
</marker>
<marker id="arrowStart" markerWidth="8" markerHeight="6" refX="1" refY="3" orient="auto">
<polygon points="8 0, 0 3, 8 6" fill="#94A3B8" />
</marker>
`;
svg.appendChild(defs);
// Outward (center → card):
line.setAttribute('marker-end', 'url(#arrowEnd)');
// Inward (card → center):
line.setAttribute('marker-start', 'url(#arrowStart)');
// Bidirectional:
line.setAttribute('marker-start', 'url(#arrowStart)');
line.setAttribute('marker-end', 'url(#arrowEnd)');
```
---
## Content Rules
1. **Each dimension card: max 6 bullet items** — more than 6 → group into sub-categories or use a separate detail table
2. **Bullet text: max 15 Chinese characters per line** — longer text wraps naturally (word-break: break-word)
3. **Center node text: max 8 Chinese characters** — keep it a short label, not a sentence
4. **Dimension title: max 10 Chinese characters** — concise category name
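The limits above are easy to check before rendering. The following is a rough sketch under stated assumptions: `DimensionCard` and `check_content` are hypothetical names, and length is measured with `len()`, which counts every character (CJK or Latin) as one.

```python
from dataclasses import dataclass, field

MAX_BULLETS = 6        # rule 1
MAX_BULLET_CHARS = 15  # rule 2 (soft limit: longer text wraps)
MAX_CENTER_CHARS = 8   # rule 3
MAX_TITLE_CHARS = 10   # rule 4


@dataclass
class DimensionCard:
    title: str
    bullets: list = field(default_factory=list)


def check_content(center_label, cards):
    """Return a list of rule violations; an empty list means the content fits."""
    problems = []
    if len(center_label) > MAX_CENTER_CHARS:
        problems.append(f"center label too long: {center_label!r}")
    for card in cards:
        if len(card.title) > MAX_TITLE_CHARS:
            problems.append(f"title too long: {card.title!r}")
        if len(card.bullets) > MAX_BULLETS:
            problems.append(f"{card.title!r} has {len(card.bullets)} bullets")
        for bullet in card.bullets:
            if len(bullet) > MAX_BULLET_CHARS:
                problems.append(f"bullet will wrap: {bullet!r}")
    return problems
```

A card that fails the bullet-count check is the signal to group items into sub-categories or move them to a detail table, as rule 1 describes.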
---
## Font Size Rules
| Element | Size | Weight |
|---------|------|--------|
| Diagram title | 22px | 700 |
| Center node | 18px | 700 |
| Dimension title | 15-16px | 700 |
| Bullet items | 13px | 400 |
| Subtitle/footnote | 14px | 400 |
---
## Quality Checklist
1. **No overlap** — all dimension cards fully visible, none clipped or covering another
2. **Connectors point to card edges** — not to random points in space
3. **Center node visually dominant** — largest, darkest, or most contrast
4. **Dimension cards visually equal** — similar size, similar padding, no one card 3× larger
5. **Colors follow scheme** — each dimension has its own color (border + title), backgrounds stay pale (white or near-white)
6. **Layout is centered** — equal margins on all sides
7. **Canvas large enough** — min 900px wide for cross layout, min 800px for 2×2 quadrant
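Checklist item 1 can be approximated mechanically once the bounding boxes are known (for instance, exported from the rendered page). A minimal sketch, assuming each box is a `(left, top, right, bottom)` tuple in pixels; `find_overlaps` is a hypothetical helper, not part of the template.

```python
from itertools import combinations


def overlaps(a, b):
    """True if rectangles a and b share interior area (touching edges do not count)."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def find_overlaps(boxes):
    """boxes: dict of name -> (left, top, right, bottom). Return overlapping name pairs."""
    pairs = combinations(sorted(boxes.items()), 2)
    return [(n1, n2) for (n1, r1), (n2, r2) in pairs if overlaps(r1, r2)]
```

An empty result means no card covers another; it does not detect clipping by the canvas edge, which still needs a visual check.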


@@ -0,0 +1,324 @@
# Seaborn Template Library
> **⚠️ Before writing any code, read [`_rules.md`](references/_rules.md) — three non-negotiable rules on overlap, hierarchy, and color.**
Seaborn is built on top of matplotlib, specializing in **statistical visualization** — distributions, regression, categorical comparisons, etc.
Its default styles already look better than matplotlib's, but they still need tuning to reach professional standards.
## Environment Setup
```python
import matplotlib
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import pandas as pd
# ═══ Chinese Font Setup ═══
# Font path: adjust for your system. Common locations:
# macOS: '/System/Library/Fonts/Supplemental/SimHei.ttf'
# Linux: '/usr/share/fonts/truetype/wqy/wqy-zenhei.ttc'
# Custom: './fonts/SimHei.ttf'
import os
SIMHEI_PATH = os.environ.get('SIMHEI_FONT', '/System/Library/Fonts/Supplemental/SimHei.ttf')
matplotlib.font_manager.fontManager.addfont(SIMHEI_PATH)
# Seaborn theme + custom overrides
sns.set_theme(style='whitegrid', font='SimHei', rc={
'axes.unicode_minus': False,
'figure.facecolor': '#FFFFFF',
'axes.facecolor': '#FFFFFF',
'axes.edgecolor': '#E5E7EB',
'axes.linewidth': 0.8,
'axes.spines.top': False,
'axes.spines.right': False,
'grid.color': '#F3F4F6',
'grid.alpha': 0.5,
'grid.linewidth': 0.5,
'xtick.major.size': 0,
'ytick.major.size': 0,
'axes.titlesize': 16,
'axes.titleweight': 'bold',
'axes.titlepad': 16,
'legend.frameon': False,
'figure.dpi': 200,
'savefig.dpi': 200,
'savefig.bbox': 'tight',
})
# Colors (consistent with matplotlib.md)
COOL = ['#3B82F6', '#06B6D4', '#8B5CF6', '#F59E0B', '#EF4444', '#10B981']
CB_SAFE = ['#0077BB', '#33BBEE', '#009988', '#EE7733', '#CC3311', '#EE3377']
```
## Seaborn Palette Setup
```python
# Method 1: use hex list directly
sns.set_palette(COOL)
# Method 2: wrap the list in a named colormap (pass the object as cmap=...; this alone does not register the name)
from matplotlib.colors import ListedColormap
bc_palette = ListedColormap(COOL, name='charts')
```
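When the number of categories is not six, it can help to build an explicit color list of the right length and pass it as `palette=`. The sketch below is illustrative only: `pick_colors` is a hypothetical helper, and cycling back to the first color for more than six categories is an assumption, not a rule stated in this library.

```python
from itertools import cycle, islice

# Same hex lists as in Environment Setup
COOL = ['#3B82F6', '#06B6D4', '#8B5CF6', '#F59E0B', '#EF4444', '#10B981']
CB_SAFE = ['#0077BB', '#33BBEE', '#009988', '#EE7733', '#CC3311', '#EE3377']


def pick_colors(n, colorblind_safe=False):
    """Return exactly n colors, repeating the base list if n exceeds its length."""
    base = CB_SAFE if colorblind_safe else COOL
    return list(islice(cycle(base), n))
```

For example, `pick_colors(df[x_col].nunique())` yields one color per group; with more than six groups the colors repeat, which is usually a hint to reduce the number of categories instead.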
---
## Template 1: Distribution Plot (Histogram + KDE)
**Scenario**: View data distribution shape, detect skewness and outliers.
```python
def dist_plot(data, title, xlabel, color='#3B82F6', save_path='dist.png'):
    fig, ax = plt.subplots(figsize=(10, 6))
    sns.histplot(data, kde=True, color=color, edgecolor='white',
                 linewidth=0.5, alpha=0.7, ax=ax)
    # Bold the KDE line
    for line in ax.get_lines():
        line.set_linewidth(2.5)
    ax.set_title(title, loc='left')
    ax.set_xlabel(xlabel)
    ax.set_ylabel('Frequency')
    plt.tight_layout()
    plt.savefig(save_path, dpi=200, facecolor='white', bbox_inches='tight')
    plt.close()
```
---
## Template 2: Box Plot (Categorical Comparison)
**Scenario**: Compare distribution characteristics across groups (median, quartiles, outliers).
```python
def box_compare(df, x_col, y_col, title, palette=None, save_path='box.png'):
    if palette is None:
        palette = COOL
    fig, ax = plt.subplots(figsize=(10, 6))
    # Note: seaborn >= 0.13 warns when palette is set without hue;
    # on those versions also pass hue=x_col, legend=False.
    sns.boxplot(data=df, x=x_col, y=y_col, palette=palette,
                width=0.5, linewidth=1.2, fliersize=4,
                boxprops=dict(edgecolor='white'),
                medianprops=dict(color='white', linewidth=2),
                ax=ax)
    ax.set_title(title, loc='left')
    plt.tight_layout()
    plt.savefig(save_path, dpi=200, facecolor='white', bbox_inches='tight')
    plt.close()
```
### Box Plot + Label Avoidance (For Complex Scenarios)
When you need to annotate outliers, specific data points, or group names on a box plot, **label avoidance is required**:
```python
def box_annotated(df, x_col, y_col, title, annotations=None,
                  palette=None, save_path='box_annotated.png'):
    """
    annotations: [{'x': 0, 'y': 45, 'text': '产线A 异常'}, ...]
    """
    from adjustText import adjust_text
    if palette is None:
        palette = COOL
    fig, ax = plt.subplots(figsize=(12, 7))  # Slightly larger canvas, leave room for annotations
    sns.boxplot(data=df, x=x_col, y=y_col, palette=palette,
                width=0.5, linewidth=1.2, fliersize=4,
                boxprops=dict(edgecolor='white'),
                medianprops=dict(color='white', linewidth=2),
                ax=ax)
    # Annotation text: use adjustText for auto-avoidance
    if annotations:
        texts = []
        for ann in annotations:
            t = ax.text(ann['x'], ann['y'], ann['text'],
                        fontsize=9, color='#374151',
                        bbox=dict(boxstyle='round,pad=0.3',
                                  facecolor='#FFF7ED', edgecolor='#F59E0B',
                                  alpha=0.9))
            texts.append(t)
        adjust_text(texts, ax=ax,
                    arrowprops=dict(arrowstyle='->', color='#9CA3AF', lw=0.8),
                    force_text=(0.8, 1.0),
                    force_points=(0.5, 0.8))
    ax.set_title(title, loc='left')
    plt.tight_layout()
    plt.savefig(save_path, dpi=200, facecolor='white', bbox_inches='tight')
    plt.close()
```
### Box Plot + Colorbar (e.g., MTTR Analysis)
When a box plot needs to work with a colorbar (e.g., colored by dimension), you must leave enough space for the colorbar:
```python
def box_with_colorbar(df, x_col, y_col, color_col, title,
                      save_path='box_cbar.png'):
    import matplotlib.gridspec as gridspec
    from matplotlib.colors import Normalize
    from matplotlib.cm import ScalarMappable
    fig = plt.figure(figsize=(14, 7), constrained_layout=True)
    gs = gridspec.GridSpec(1, 2, figure=fig, width_ratios=[1, 0.03], wspace=0.05)
    ax = fig.add_subplot(gs[0, 0])
    cbar_ax = fig.add_subplot(gs[0, 1])
    sns.boxplot(data=df, x=x_col, y=y_col, ax=ax,
                width=0.5, linewidth=1.2)
    # Colorbar in its own subplot, won't obscure box plot
    norm = Normalize(vmin=df[color_col].min(), vmax=df[color_col].max())
    sm = ScalarMappable(norm=norm, cmap='Blues')
    fig.colorbar(sm, cax=cbar_ax, label=color_col)
    ax.set_title(title, loc='left')
    fig.savefig(save_path, dpi=200, facecolor='white', bbox_inches='tight')
    plt.close()
```
---
## Template 3: Violin Plot
**Scenario**: Enhanced version of box plot, showing distribution density simultaneously.
```python
def violin_plot(df, x_col, y_col, title, palette=None, save_path='violin.png'):
    if palette is None:
        palette = COOL
    fig, ax = plt.subplots(figsize=(10, 6))
    # Note: seaborn >= 0.13 warns when palette is set without hue;
    # on those versions also pass hue=x_col, legend=False.
    sns.violinplot(data=df, x=x_col, y=y_col, palette=palette,
                   inner='box', linewidth=1, ax=ax)
    ax.set_title(title, loc='left')
    plt.tight_layout()
    plt.savefig(save_path, dpi=200, facecolor='white', bbox_inches='tight')
    plt.close()
```
---
## Template 4: Regression Scatter Plot
**Scenario**: Two-variable relationship + linear regression + confidence interval.
```python
def reg_plot(df, x_col, y_col, title, color='#3B82F6', save_path='reg.png'):
    fig, ax = plt.subplots(figsize=(10, 7))
    sns.regplot(data=df, x=x_col, y=y_col, color=color,
                scatter_kws={'s': 50, 'alpha': 0.6, 'edgecolor': 'white', 'linewidth': 1},
                line_kws={'linewidth': 2},
                ax=ax)
    ax.set_title(title, loc='left')
    plt.tight_layout()
    plt.savefig(save_path, dpi=200, facecolor='white', bbox_inches='tight')
    plt.close()
```
---
## Template 5: Correlation Heatmap
**Scenario**: Correlation coefficient matrix among multiple variables.
```python
def corr_heatmap(df, title, cmap='RdBu_r', save_path='corr.png'):
    # Restrict to numeric columns: recent pandas raises on non-numeric data in corr()
    corr = df.select_dtypes(include='number').corr()
    fig, ax = plt.subplots(figsize=(10, 8))
    # Show only lower triangle
    mask = np.triu(np.ones_like(corr, dtype=bool))
    sns.heatmap(corr, mask=mask, cmap=cmap, center=0,
                vmin=-1, vmax=1, annot=True, fmt='.2f',
                square=True, linewidths=1, linecolor='white',
                cbar_kws={'shrink': 0.8, 'label': 'Correlation'},
                annot_kws={'size': 9},
                ax=ax)
    ax.set_title(title, loc='left', pad=16)
    plt.tight_layout()
    plt.savefig(save_path, dpi=200, facecolor='white', bbox_inches='tight')
    plt.close()
```
---
## Template 6: Pair Plot
**Scenario**: Overview of pairwise relationships among multiple variables (distribution on diagonal, scatter plots elsewhere).
```python
def pair_plot(df, hue_col=None, palette=None, save_path='pair.png'):
    if palette is None:
        palette = COOL
    g = sns.pairplot(df, hue=hue_col, palette=palette,
                     diag_kind='kde', plot_kws={'alpha': 0.6, 's': 30},
                     height=2.5, aspect=1)
    g.figure.suptitle('Pairwise Variable Relationships', y=1.02, fontsize=16, fontweight='bold')
    g.savefig(save_path, dpi=200, facecolor='white', bbox_inches='tight')
    plt.close()
```
---
## Template 7: Facet Grid (FacetGrid)
**Scenario**: Split into multiple subplots by categorical variable for comparison.
```python
def facet_hist(df, col_var, value_col, title, color='#3B82F6', save_path='facet.png'):
    g = sns.FacetGrid(df, col=col_var, col_wrap=3, height=3.5, aspect=1.3)
    g.map(sns.histplot, value_col, color=color, edgecolor='white',
          linewidth=0.5, alpha=0.7, kde=True)
    # set_titles fills in its own `size` keyword, so set the font size through `size`
    g.set_titles('{col_name}', size=12, fontweight='bold')
    g.figure.suptitle(title, y=1.02, fontsize=16, fontweight='bold')
    g.savefig(save_path, dpi=200, facecolor='white', bbox_inches='tight')
    plt.close()
```
---
## Seaborn vs matplotlib Selection Guide
| What to Plot | Use Seaborn | Use matplotlib |
|---------|-----------|--------------|
| Histogram/KDE distribution | ✅ `histplot` / `kdeplot` | Works, but more code |
| Box plot/Violin plot | ✅ One-liner | Works, but rougher style |
| Regression scatter + confidence interval | ✅ `regplot` auto-calculates | Manual fitting + plotting |
| Correlation heatmap | ✅ `heatmap` + mask | Manual `imshow` is tedious |
| Pairwise relationship matrix | ✅ `pairplot` unique | No equivalent |
| Facet grid | ✅ `FacetGrid` | `plt.subplots` manual loop |
| Regular bar chart | Less flexible than matplotlib | ✅ More control |
| Line trend chart | Less flexible than matplotlib | ✅ More control |
| Custom annotations/arrows | Not suitable | ✅ `ax.annotate` |
**Principle: Use Seaborn for statistical charts, matplotlib for customized charts.**

228
skills/charts/setup.sh Executable file

@@ -0,0 +1,228 @@
#!/usr/bin/env bash
# ---
# name: charts-setup
# author: Z.AI
# version: "1.0"
# description: Environment setup for the Charts skill. Checks and installs all required dependencies.
# ---
#
# Installs only dependencies required by the Charts skill.
set -euo pipefail
RED='\033[0;31m'; GREEN='\033[0;32m'; YELLOW='\033[1;33m'; BLUE='\033[0;34m'; NC='\033[0m'
ok() { echo -e " ${GREEN}✓${NC} $1"; }
fail() { echo -e " ${RED}✗${NC} $1"; }
warn() { echo -e " ${YELLOW}!${NC} $1"; }
info() { echo -e " ${BLUE}i${NC} $1"; }
echo "============================================"
echo " Charts Skill — Environment Setup"
echo "============================================"
echo ""
OS="$(uname -s)"
ARCH="$(uname -m)"
echo "Platform: $OS $ARCH"
echo ""
# ── 0. macOS: Homebrew ──
if [ "$OS" = "Darwin" ]; then
echo "--- Homebrew (macOS package manager) ---"
if command -v brew &>/dev/null; then
BREW_VER=$(brew --version 2>/dev/null | head -1)
ok "brew ($BREW_VER)"
else
fail "brew not found — most dependencies below need Homebrew on macOS"
info "Install: /bin/bash -c \"\$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)\""
fi
echo ""
fi
# ── 1. Python 3 ──
echo "--- Python ---"
if command -v python3 &>/dev/null; then
PY_VER=$(python3 --version 2>&1)
ok "python3 ($PY_VER)"
if [ "$OS" = "Darwin" ]; then
PY_PATH=$(which python3 2>/dev/null)
if [[ "$PY_PATH" == "/usr/bin/python3" ]]; then
warn "Using macOS system Python (limited). Recommend: brew install python3"
fi
fi
else
fail "python3 not found"
case "$OS" in
Darwin) info "Install: brew install python3" ;;
Linux) info "Install: sudo apt install python3 python3-pip (Debian/Ubuntu)"
info " sudo dnf install python3 python3-pip (Fedora/RHEL)" ;;
*) info "Install: https://www.python.org/downloads/" ;;
esac
fi
# ── 2. pip ──
echo ""
echo "--- pip ---"
if python3 -m pip --version &>/dev/null 2>&1; then
PIP_VER=$(python3 -m pip --version 2>/dev/null | head -1)
ok "pip ($PIP_VER)"
else
fail "pip not found"
case "$OS" in
Darwin) info "Install: python3 -m ensurepip --upgrade"
info " or: brew install python3 (includes pip)" ;;
Linux) info "Install: sudo apt install python3-pip (Debian/Ubuntu)" ;;
*) info "Install: python3 -m ensurepip --upgrade" ;;
esac
fi
# ── 3. Python packages (matplotlib / seaborn for data charts) ──
echo ""
echo "--- Python Packages (Data Charts) ---"
PY_PKGS=(
"matplotlib:matplotlib"
"seaborn:seaborn"
"numpy:numpy"
"adjustText:adjustText"
)
MISSING_PY=()
for entry in "${PY_PKGS[@]}"; do
mod="${entry%%:*}"
pkg="${entry##*:}"
if python3 -c "import $mod" 2>/dev/null; then
ver=$(python3 -c "import $mod; print(getattr($mod, '__version__', 'installed'))" 2>/dev/null)
ok "$pkg ($ver)"
else
fail "$pkg not installed"
MISSING_PY+=("$pkg")
fi
done
if [ ${#MISSING_PY[@]} -gt 0 ]; then
echo ""
if [ -t 0 ]; then
read -p " Install missing Python packages? [Y/n] " -n 1 -r REPLY
echo ""
REPLY=${REPLY:-Y}
else
warn "Non-interactive mode — skipping auto-install. Run interactively or install manually."
REPLY=N
fi
if [[ ! $REPLY =~ ^[Nn]$ ]]; then
# Report success only if one of the install attempts actually succeeded
if python3 -m pip install -q "${MISSING_PY[@]}" 2>/dev/null \
|| python3 -m pip install -q --user "${MISSING_PY[@]}" 2>/dev/null \
|| python3 -m pip install -q --break-system-packages "${MISSING_PY[@]}" 2>/dev/null; then
ok "Installed: ${MISSING_PY[*]}"
else
fail "pip install failed. Try manually: pip install ${MISSING_PY[*]}"
fi
fi
fi
# ── 4. Node.js ──
echo ""
echo "--- Node.js (Interactive Charts & Diagrams) ---"
if command -v node &>/dev/null; then
NODE_VER=$(node --version)
ok "node ($NODE_VER)"
else
fail "node not found"
case "$OS" in
Darwin) info "Install: brew install node" ;;
Linux) info "Install: curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -"
info " sudo apt install -y nodejs" ;;
*) info "Install: https://nodejs.org/" ;;
esac
fi
# ── 5. npm ──
echo ""
echo "--- npm ---"
if command -v npm &>/dev/null; then
NPM_VER=$(npm --version 2>/dev/null)
ok "npm ($NPM_VER)"
else
fail "npm not found"
case "$OS" in
Darwin) info "Install: brew install node (includes npm)" ;;
Linux) info "Install: comes with nodejs" ;;
*) info "Install: https://nodejs.org/" ;;
esac
fi
# ── 6. Playwright + Chromium (for HTML→PNG/PDF rendering) ──
echo ""
echo "--- Playwright (Structural Diagrams & HTML Charts) ---"
if node -e "require('playwright')" 2>/dev/null; then
PW_VER=$(node -e "console.log(require('playwright/package.json').version)" 2>/dev/null)
ok "playwright ($PW_VER)"
else
fail "playwright not installed"
info "Install: npm install -g playwright"
fi
if [ "$OS" = "Darwin" ]; then
PW_CACHE="$HOME/Library/Caches/ms-playwright"
else
PW_CACHE="$HOME/.cache/ms-playwright"
fi
if ls "$PW_CACHE"/chromium-* &>/dev/null 2>&1; then
CR_DIR=$(ls -d "$PW_CACHE"/chromium-* 2>/dev/null | tail -1)
ok "chromium ($(basename "$CR_DIR"))"
else
fail "chromium not installed"
info "Install: npx playwright install chromium"
if [ "$OS" = "Linux" ]; then
info " npx playwright install-deps (system libs, needs sudo)"
fi
fi
# ── 7. CJK Fonts (for Chinese chart labels) ──
echo ""
echo "--- CJK Fonts (Chinese text in charts) ---"
CJK_FOUND=false
# Check matplotlib registered fonts
if python3 -c "
import matplotlib.font_manager as fm
fonts = {f.name for f in fm.fontManager.ttflist}
# Prefix match: Noto registers per-region names such as 'Noto Sans CJK SC'
wanted = ('SimHei', 'Heiti SC', 'Noto Sans CJK', 'PingFang SC')
print('ok' if any(name.startswith(wanted) for name in fonts) else 'missing')
" 2>/dev/null | grep -q "ok"; then
ok "CJK font registered in matplotlib (SimHei/Heiti SC/Noto Sans CJK/PingFang SC)"
CJK_FOUND=true
fi
# Check system CJK fonts
if [ "$OS" = "Darwin" ]; then
if ls /System/Library/Fonts/PingFang.ttc &>/dev/null 2>&1 \
|| ls /System/Library/Fonts/STHeiti*.ttc &>/dev/null 2>&1; then
ok "macOS CJK system fonts available (PingFang/STHeiti)"
CJK_FOUND=true
fi
elif [ "$OS" = "Linux" ]; then
if fc-list :lang=zh 2>/dev/null | head -1 | grep -q .; then
ok "system CJK fonts available (fc-list)"
CJK_FOUND=true
fi
fi
if [ "$CJK_FOUND" = false ]; then
warn "No CJK font detected — Chinese labels may show as □"
info "The skill configures font fallback via rcParams at runtime."
info "Ensure a CJK font file exists (e.g., SimHei.ttf) and is registered."
if [ "$OS" = "Darwin" ]; then
info "macOS ships with PingFang — try: plt.rcParams['font.sans-serif'] = ['PingFang SC', 'DejaVu Sans']"
elif [ "$OS" = "Linux" ]; then
info "Install: sudo apt install fonts-noto-cjk"
fi
fi
# ── Summary ──
echo ""
echo "============================================"
echo " Setup complete."
echo " Data charts: matplotlib + seaborn"
echo " Structural diagrams: Playwright + CSS"
echo " Interactive charts: ECharts / D3.js via Node.js"
echo "============================================"

210
skills/cheat-sheet/SKILL.md Executable file

@@ -0,0 +1,210 @@
---
name: cheat-sheet
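description: Turns PDF/Word/Markdown study material into a condensed cheat-sheet document. Supports three styles (key-point quick-reference card / mind-map style / Q&A style) and outputs a two-column, small-print PDF. Triggered when the user says "生成知识浓缩卡" (generate a condensed knowledge card), "生成 Cheatsheet" (generate a cheatsheet), "帮我做个速查表" (make me a quick-reference sheet), "把这个资料整理成一页纸" (condense this material onto one page), or "做个知识卡片" (make a knowledge card). **Does not handle** quiz generation from the material (→ quiz-mastery) or long-term study projects (→ study-buddy).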
---
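# Cheatsheet Generator
## What You Do
Compress the user's study material (PDF/Word/Markdown/plain text) into a **two-column, small-print, information-dense** Cheatsheet document (PDF format).
**Core principle: keep only the substance and cut the filler.**
---
## ⚠️ Hard Rule: Avoid Duplicate Creation
Before generating a Cheatsheet, **scan the output directory first** to check whether a file on the same topic already exists:
- **Already exists** → ask the user "There is already a [file name]. Overwrite it, create a new one, or skip?" and wait for the reply before doing anything
- **Does not exist** → generate directly
- **Never** overwrite without asking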
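The directory scan can be sketched as follows. This is a minimal illustration, assuming the file-name format this document gives under its file-operation notes (`知识浓缩卡_[主题]_[日期].pdf`); `find_existing` is a hypothetical helper and only detects candidates, so the decision to overwrite, create a new file, or skip still belongs to the user.

```python
from pathlib import Path

PREFIX = "知识浓缩卡"  # file-name prefix used by this skill


def find_existing(output_dir, topic):
    """Return PDFs in output_dir named 知识浓缩卡_<topic>_<anything>.pdf."""
    folder = Path(output_dir)
    if not folder.is_dir():
        return []
    stem_prefix = f"{PREFIX}_{topic}_"
    return sorted(
        p for p in folder.iterdir()
        if p.is_file()
        and p.suffix.lower() == ".pdf"
        and p.name.startswith(stem_prefix)
    )
```

A non-empty result is the cue to stop and ask the user before writing anything.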
---
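## Workflow
### Step 1: Receive the Material
The user provides a study-material file. Supported formats:
- `.pdf`: extract the text via the PDF skill's process route
- `.docx`: extract the text via the PDF skill's process route
- `.md` / `.txt`: read directly
**Text extraction commands:**
```bash
# Extract text from a PDF
python3 "$PDF_SCRIPTS/pdf.py" extract.text <file_path>
# Convert Word to PDF, then extract (if needed)
python3 "$PDF_SCRIPTS/pdf.py" convert.office <file_path>
```
Here `$PDF_SCRIPTS` is the path to the PDF skill's scripts directory; derive it from the location of the PDF skill's SKILL.md.
If the user pasted text directly instead of providing a file, skip the extraction step and use the text as-is.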
---
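### Step 2: Choose a Style
**You must ask the user** which style they want (do not choose for them):
Show the user the following options:
1. 📋 **Key-point quick-reference card**: core concepts + definitions + key formulas/points, scannable at a glance; suited to last-minute exam cramming
2. 🌳 **Mind-map style**: organized by hierarchy (heading → subheading → key point), with an outline feel; suited to mapping out a body of knowledge
3. **Q&A style**: turns each knowledge point into a "question → answer" pair; suited to self-testing and review
Once the user has chosen, go to Step 3.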
---
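### Step 3: Distill the Content with the LLM
Depending on the style the user chose, build a different prompt so that the LLM distills the cheatsheet content from the source text.
#### Style 1: Key-Point Quick-Reference Card
Distillation rules:
- Extract all core concepts, definitions, formulas, and key figures
- Write each knowledge point as **term: one-sentence explanation**
- Group related knowledge points, each group under a short heading
- Keep important formulas and code snippets verbatim
- Cut all examples, transitional sentences, and background build-up
Output structure:
```
## [Group title]
- **Term A**: one-sentence definition
- **Term B**: one-sentence definition
- 📐 Formula: `formula content`
## [Group title]
...
```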
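#### Style 2: Mind-Map Style
Distillation rules:
- Extract the document's hierarchy (chapter → section → key point)
- Summarize each node in the fewest possible words
- At most 3 levels deep (anything deeper will not fit on one page)
- Use indentation and symbols to express the hierarchy
Output structure:
```
# Topic
## First-level branch
├─ Second-level point
│  ├─ Detail 1
│  └─ Detail 2
└─ Second-level point
   └─ Detail
```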
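The tree layout above can be produced mechanically from nested data. The sketch below is one possible rendering, not the skill's own code: `render_branch` is a hypothetical helper, each node is assumed to be a `(label, children)` tuple, and anything deeper than three levels is silently dropped to respect the depth limit.

```python
MAX_DEPTH = 3  # the "##" branch heading counts as level 1


def _render(children, prefix, depth):
    """Render child nodes as tree lines, dropping levels beyond MAX_DEPTH."""
    if depth > MAX_DEPTH:
        return []
    lines = []
    for i, (label, sub) in enumerate(children):
        last = i == len(children) - 1
        lines.append(f"{prefix}{'└─' if last else '├─'} {label}")
        # Continue the vertical rule only while siblings remain below
        child_prefix = prefix + ("   " if last else "│  ")
        lines.extend(_render(sub, child_prefix, depth + 1))
    return lines


def render_branch(title, children):
    """Render one first-level branch and its subtree as text."""
    return "\n".join([f"## {title}"] + _render(children, "", 2))
```

Calling `render_branch` once per first-level branch and joining the results under a `# Topic` line reproduces the output structure shown above.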
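#### Style 3: Q&A Style
Distillation rules:
- Turn each knowledge point into a question
- Keep answers to 1-3 sentences
- Order the questions from basic to advanced
- Write compare-and-contrast questions for easily confused concepts
Output structure:
```
## [Topic group]
**Q: What is XXX?**
A: One-sentence answer.
**Q: What is the difference between XXX and YYY?**
A: Brief comparison.
```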
---
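### Step 4: User Confirmation and Adjustments
After the LLM generates the content, **show it to the user as text first** and ask:
> "The content is ready. Take a look and tell me whether anything needs adjusting, for example:
> - Should some parts be marked as key points?
> - Should anything be removed or added?
> - Any layout preferences? (for example a smaller font size, or color-coded sections)"
Once the user confirms it is fine, go to Step 5.
If the user requests changes → adjust the content → show it again → wait for confirmation.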
---
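### Step 5: Generate the PDF
Call the PDF skill's **Report route (ReportLab)** to generate the two-column PDF.
**Layout specification (defaults):**
| Parameter | Default | Notes |
|------|--------|------|
| Page size | A4 | Adjustable on user request |
| Columns | Two | Two columns by default; the user may choose a single column |
| Body font size | 8pt | Information density comes first; adjustable on user request |
| Heading font size | 10pt (level 1) / 9pt (level 2) | |
| Line spacing | 1.2 | Compact but readable |
| Page margins | 12mm on all four sides | Maximizes the content area |
| Fonts | UniSong/UniHei for Chinese, Helvetica for English | |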
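To see what these defaults imply for the page, the arithmetic can be sketched as below. This is an illustration under assumptions, not the PDF skill's implementation: `SheetLayout` is a hypothetical name, the 6mm gutter between columns is not specified in the table, and line spacing is read as a multiple of the body font size.

```python
from dataclasses import dataclass

MM_TO_PT = 72 / 25.4       # 1 inch = 25.4 mm = 72 pt
A4_MM = (210.0, 297.0)     # width, height


@dataclass(frozen=True)
class SheetLayout:
    page_mm: tuple = A4_MM
    columns: int = 2
    body_pt: float = 8.0
    h1_pt: float = 10.0
    h2_pt: float = 9.0
    line_spacing: float = 1.2
    margin_mm: float = 12.0
    gutter_mm: float = 6.0  # ASSUMPTION: not given in the table

    @property
    def leading_pt(self):
        """Baseline-to-baseline distance for body text."""
        return self.body_pt * self.line_spacing

    @property
    def column_width_mm(self):
        usable = self.page_mm[0] - 2 * self.margin_mm
        return (usable - (self.columns - 1) * self.gutter_mm) / self.columns

    @property
    def lines_per_column(self):
        usable_pt = (self.page_mm[1] - 2 * self.margin_mm) * MM_TO_PT
        return int(usable_pt // self.leading_pt)
```

Under these assumptions each column is 90mm wide and holds about 80 body lines, which gives a rough budget for how much distilled content fits on one page.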
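**Generation flow:**
1. Convert the Markdown content distilled by the LLM into ReportLab layout instructions
2. Call the PDF skill's report route to generate the PDF
3. Tell the user the path of the generated file
**When calling the PDF skill, follow every rule in its SKILL.md**, including:
- CJK font check
- Table overflow protection
- Page fill-rate check
- Metadata setup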
---
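### Step 6: Deliver
Give the user:
- 📄 The PDF file path
- The file size and page count
- A note that they can keep adjusting
---
## Notes
### Content Quality
- **Do not invent content**: every knowledge point must come from the source text; no improvising
- **Do not omit key content**: core concepts, formulas, and definitions must be kept
- **Keep the source's terminology**: do not swap out technical terms on your own
### File Operations
- Generated PDFs are saved to the workspace root by default, with the file-name format `知识浓缩卡_[主题]_[日期].pdf` (prefix, then topic, then date)
- For duplicate checking, see the ⚠️ hard-rule section on avoiding duplicate creation near the top of this file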
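The file-name format can be sketched as a small helper. The exact date format is not specified in this document, so `YYYYMMDD` below is an assumption, as are the name `cheatsheet_filename` and the replacement of path separators in the topic.

```python
from datetime import date


def cheatsheet_filename(topic, on=None):
    """Build 知识浓缩卡_<topic>_<date>.pdf (date format YYYYMMDD is an assumption)."""
    day = on or date.today()
    # Keep the topic from being read as a path
    safe_topic = topic.strip().replace("/", "-").replace("\\", "-")
    return f"知识浓缩卡_{safe_topic}_{day:%Y%m%d}.pdf"
```

Passing an explicit `on` date keeps the result reproducible; omitting it uses today's date.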
---
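## Relationship to Other Skills
| Skill | Relationship |
|-------|------|
| PDF skill | Its process route is called to extract text, and its report route is called to generate the PDF |
| study-buddy | study-buddy may recommend generating a cheatsheet after the user finishes a study project |
| quiz-mastery | No direct relationship, but cheatsheet content can serve as a source of knowledge points for quiz questions |
---
## File Structure
```
skills/cheat-sheet/
├── SKILL.md ← this file
```
This skill is purely a process guide and contains no scripts of its own. All file operations and PDF generation go through the PDF skill.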

120
skills/coding-agent/SKILL.md Executable file

@@ -0,0 +1,120 @@
---
name: coding-agent
slug: code
version: 1.0.4
homepage: https://clawic.com/skills/code
description: Coding workflow with planning, implementation, verification, and testing for clean software development.
changelog: Improved description for better discoverability
metadata: {"clawdbot":{"emoji":"💻","requires":{"bins":[]},"os":["linux","darwin","win32"]}}
---
## When to Use
User explicitly requests code implementation. Agent provides planning, execution guidance, and verification workflows.
## Architecture
User preferences stored in `~/code/` when user explicitly requests.
```
~/code/
- memory.md # User-provided preferences only
```
Create on first use: `mkdir -p ~/code`
## Quick Reference
| Topic | File |
|-------|------|
| Memory setup | `memory-template.md` |
| Task breakdown | `planning.md` |
| Execution flow | `execution.md` |
| Verification | `verification.md` |
| Multi-task state | `state.md` |
| User criteria | `criteria.md` |
## Scope
This skill ONLY:
- Provides coding workflow guidance
- Stores preferences user explicitly provides in `~/code/`
- Reads included reference files
This skill NEVER:
- Executes code automatically
- Makes network requests
- Accesses files outside `~/code/` and the user's project
- Modifies its own SKILL.md or auxiliary files
- Takes autonomous action without user awareness
## Core Rules
### 1. Check Memory First
Read `~/code/memory.md` for user's stated preferences if it exists.
### 2. User Controls Execution
- This skill provides GUIDANCE, not autonomous execution
- User decides when to proceed to next step
- Sub-agent delegation requires user's explicit request
### 3. Plan Before Code
- Break requests into testable steps
- Each step independently verifiable
- See `planning.md` for patterns
### 4. Verify Everything
| After | Do |
|-------|-----|
| Each function | Suggest running tests |
| UI changes | Suggest taking screenshot |
| Before delivery | Suggest full test suite |
### 5. Store Preferences on Request
| User says | Action |
|-----------|--------|
| "Remember I prefer X" | Add to memory.md |
| "Never do Y again" | Add to memory.md Never section |
Only store what user explicitly asks to save.
## Workflow
```
Request -> Plan -> Execute -> Verify -> Deliver
```
## Common Traps
- **Delivering untested code** -> always verify first
- **Huge PRs** -> break into testable chunks
- **Ignoring preferences** -> check memory.md first
## Self-Modification
This skill NEVER modifies its own SKILL.md or auxiliary files.
User data stored only in `~/code/memory.md` after explicit request.
## External Endpoints
This skill makes NO network requests.
| Endpoint | Data Sent | Purpose |
|----------|-----------|---------|
| None | None | N/A |
## Security & Privacy
**Data that stays local:**
- Only preferences user explicitly asks to save
- Stored in `~/code/memory.md`
**Data that leaves your machine:**
- None. This skill makes no network requests.
**This skill does NOT:**
- Execute code automatically
- Access network or external services
- Access files outside `~/code/` and user's project
- Take autonomous actions without user awareness
- Delegate to sub-agents without user's explicit request

6
skills/coding-agent/_meta.json Executable file

@@ -0,0 +1,6 @@
{
"ownerId": "kn73vp5rarc3b14rc7wjcw8f8580t5d1",
"slug": "code",
"version": "1.0.4",
"publishedAt": 1771467169291
}

48
skills/coding-agent/criteria.md Executable file

@@ -0,0 +1,48 @@
# Criteria for Storing Preferences
Reference for when to save user preferences to `~/code/memory.md`.
## When to Save (User Must Request)
Save only when user explicitly asks:
- "Remember that I prefer X"
- "Always do Y from now on"
- "Save this preference"
- "Don't forget that I like Z"
## When NOT to Save
- User didn't explicitly ask to save
- Project-specific requirement (applies to this project only)
- One-off request ("just this once")
- Temporary preference
## What to Save
**Preferences:**
- Coding style preferences user stated
- Tools or frameworks user prefers
- Patterns user explicitly likes
**Things to avoid:**
- Approaches user explicitly dislikes
- Patterns user asked not to repeat
## Format in memory.md
```markdown
## Preferences
- prefers TypeScript over JavaScript
- likes detailed comments
- wants tests for all functions
## Never
- no class-based React components
- avoid inline styles
```
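As a rough illustration of the format above, the following sketch appends an explicitly requested preference under the matching heading. The helper name `add_entry` is hypothetical, and working on the file's text rather than writing to `~/code/memory.md` directly is an assumption made so the example stays side-effect free.

```python
def add_entry(memory_text: str, section: str, entry: str) -> str:
    """Return memory_text with '- entry' added to the '## section' block.

    Hypothetical helper: creates the section at the end if it is missing
    and leaves the text unchanged if the entry is already stored.
    """
    lines = memory_text.splitlines()
    header = f"## {section}"
    bullet = f"- {entry}"
    if header not in lines:
        # Section absent: append it together with the new bullet.
        return "\n".join(lines + [header, bullet]) + "\n"
    start = lines.index(header)
    # The section ends at the next '## ' heading or at end of file.
    end = len(lines)
    for i in range(start + 1, len(lines)):
        if lines[i].startswith("## "):
            end = i
            break
    if bullet in lines[start + 1:end]:
        return memory_text  # already stored, nothing to do
    # Insert before any blank lines that trail the section.
    insert_at = end
    while insert_at > start + 1 and lines[insert_at - 1].strip() == "":
        insert_at -= 1
    lines.insert(insert_at, bullet)
    return "\n".join(lines) + "\n"
```

A caller would still ask the user before saving, then write the returned text back to the memory file.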
## Important
- Only save what user EXPLICITLY asked to save
- Ask user before saving: "Should I remember this preference?"
- Never modify any skill files, only `~/code/memory.md`

View File

@@ -0,0 +1,42 @@
# Execution Guidance
Reference for executing multi-step implementations.
## Recommended Flow
When user approves a step:
1. Execute that step
2. Verify it works
3. Report completion to user
4. Wait for user to approve next step
## Progress Tracking
Show user the current state:
```
- [DONE] Step 1 (completed)
- [WIP] Step 2 <- awaiting user approval
- [ ] Step 3
- [ ] Step 4
```
## When to Pause and Ask User
- Before starting any new step
- When encountering an error
- When a decision is needed (A vs B)
- When credentials or permissions are needed
## Error Handling
If an error occurs:
1. Report the error to user
2. Suggest possible fixes
3. Wait for user decision on how to proceed
## Patterns to Follow
- Report completion of each step
- Ask before proceeding to next step
- Let user decide retry strategy
- Keep user informed of progress

View File

@@ -0,0 +1,38 @@
# Memory Setup - Code
## Initial Setup
Create directory on first use:
```bash
mkdir -p ~/code
touch ~/code/memory.md
```
## memory.md Template
Copy to `~/code/memory.md`:
```markdown
# Code Memory
## Preferences
<!-- User's coding workflow preferences. Format: "preference" -->
<!-- Examples: always run tests, prefer TypeScript, commit after each feature -->
## Never
<!-- Things that don't work for this user. Format: "thing to avoid" -->
<!-- Examples: inline styles, console.log debugging, large PRs -->
## Patterns
<!-- Approaches that work well. Format: "pattern: context" -->
<!-- Examples: TDD: for complex logic, screenshots: for UI work -->
---
Last updated: YYYY-MM-DD
```
## Notes
- Check `criteria.md` before saving a preference
- Use `planning.md` for breaking down complex requests
- Verify with tests and screenshots per `verification.md`

31
skills/coding-agent/planning.md Executable file
View File

@@ -0,0 +1,31 @@
# Planning Reference
Consult when breaking down a multi-step request.
## When to Plan
- Multiple files or components
- Dependencies between parts
- UI that needs visual verification
- User says "build", "create", "implement"
## Step Format
```
Step N: [What]
- Output: [What exists after]
- Test: [How to verify]
```
## Good Steps
- Clear output (file, endpoint, screen)
- Testable independently
- No ambiguity in what "done" means
## Bad Steps
- "Implement the thing" (vague output)
- No test defined
- Depends on undefined prior step
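The good and bad step criteria above can be checked mechanically. The sketch below is one possible encoding, not part of the skill itself: the `Step` dataclass and the `step_problems` and `format_step` helpers are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Step:
    what: str
    output: str = ""  # what exists after the step
    test: str = ""    # how to verify the step


def step_problems(step: Step) -> list[str]:
    """List the ways a step falls short of the 'Good Steps' criteria."""
    problems = []
    if not step.output.strip():
        problems.append("no output defined")
    if not step.test.strip():
        problems.append("no test defined")
    return problems


def format_step(n: int, step: Step) -> str:
    """Render a step in the Step Format shown above."""
    return f"Step {n}: {step.what}\n- Output: {step.output}\n- Test: {step.test}"
```

Vague wording such as "Implement the thing" still needs a human judgment call; this only catches missing fields.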
## Don't Plan
- One-liner functions
- Simple modifications
- Questions about existing code

60
skills/coding-agent/state.md Executable file
View File

@@ -0,0 +1,60 @@
# State Tracking Guidance
Reference for tracking multiple tasks or requests.
## Request Tracking
Label each user request:
```
[R1] Build login page
[R2] Add dark mode
[R3] Fix header alignment
```
Track state for user visibility:
```
[R1] [DONE] Done
[R2] [WIP] In progress (awaiting user approval for step 2)
[R3] [Q] Queued
```
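The labels and states above could be kept in a small in-memory structure. This is a minimal sketch under assumed names (`Status`, `RequestQueue`); the skill does not prescribe any particular implementation.

```python
from enum import Enum


class Status(Enum):
    DONE = "DONE"
    WIP = "WIP"
    QUEUED = "Q"


class RequestQueue:
    """Hypothetical tracker that labels requests R1, R2, ... in arrival order."""

    def __init__(self):
        self._items = []  # each item is [label, title, status]

    def add(self, title: str) -> str:
        label = f"R{len(self._items) + 1}"
        self._items.append([label, title, Status.QUEUED])
        return label

    def set_status(self, label: str, status: Status) -> None:
        for item in self._items:
            if item[0] == label:
                item[2] = status
                return
        raise KeyError(label)

    def render(self) -> str:
        # One line per request, e.g. "[R2] [WIP] Add dark mode"
        return "\n".join(f"[{l}] [{s.value}] {t}" for l, t, s in self._items)
```

New requests start as queued, which matches the guidance to ask the user before starting work on them.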
## Managing Multiple Requests
When user sends a new request while another is in progress:
1. Acknowledge: "Got it, I'll add this to the queue"
2. Show updated queue to user
3. Ask user if priority should change
## Handling Interruptions
| Situation | Suggested Action |
|-----------|------------------|
| New unrelated request | Add to queue, ask user priority |
| Request affects current work | Pause, explain impact, ask user how to proceed |
| User says "stop" or "wait" | Stop immediately, await instructions |
| User changes requirements | Summarize impact, ask user to confirm changes |
## User Decisions
Always ask user before:
- Starting work on queued items
- Changing priority order
- Rolling back completed work
- Modifying the plan
## Progress File (Optional)
User may request a state file:
```markdown
## In Progress
[R2] Dark mode - Step 2/4 (awaiting user approval)
## Queued
[R3] Header fix
## Done
[R1] Login page [DONE]
```
Update only when user requests or approves changes.

View File

@@ -0,0 +1,39 @@
# Verification Reference
Consult when verifying implementations visually or with tests.
## Screenshots
- Wait for full page load (no spinners)
- Review it yourself before sending
- Split long pages into 3-5 sections (~800px each)
- Caption each: "Hero", "Features", "Footer"
## Before Sending
```
[ ] Content loaded
[ ] Shows the specific change
[ ] No visual bugs
[ ] Caption explains what user sees
```
## Fix-Before-Send
If screenshot shows problem:
1. Fix code
2. Re-deploy
3. New screenshot
4. Still broken? -> back to 1
5. Fixed? -> now send
Never send "I noticed X is wrong, will fix" - fix first.
## No UI? Show Output
When verifying API endpoints, show actual output:
```
GET /api/users -> {"id": 1, "name": "test"}
```
Include actual response, not just "it works".
## Flows
Number sequential states: "1/4: Form", "2/4: Loading", "3/4: Error", "4/4: Success"

181
skills/content-strategy/SKILL.md Executable file
View File

@@ -0,0 +1,181 @@
---
name: content-strategy
description: Build and execute a content marketing strategy for a solopreneur business. Use when planning what content to create, deciding on content formats and channels, building a content calendar, measuring content performance, or systematizing content production. Covers audience research for content, content pillars, distribution strategy, repurposing workflows, and metrics. Trigger on "content strategy", "content marketing", "what content should I create", "content plan", "content calendar", "content ideas", "content distribution", "grow through content".
---
# Content Strategy
## Overview
Content marketing is how solopreneurs build authority, attract customers, and grow without paid ads. But random content doesn't work — you need a strategy. This playbook builds a repeatable system for creating content that actually drives business results, not just likes.
---
## Step 1: Define Your Content Goals
Content without a goal is just noise. Before you create anything, answer: what is this content supposed to DO?
**Common solopreneur content goals:**
- **Generate awareness** (new people discover you exist)
- **Build trust** (people see you as credible and knowledgeable)
- **Drive leads** (people give you their email or book a call)
- **Enable sales** (content answers objections and shortens sales cycles)
- **Retain customers** (existing customers stay engaged and see ongoing value)
**Rule:** Pick ONE primary goal per piece of content. You can have secondary benefits, but clarity on the main goal determines format, channel, and CTA.
Example: A tutorial blog post might have the primary goal of "generate awareness" (via SEO) and a secondary goal of "drive leads" (with an email signup CTA at the end).
---
## Step 2: Research Your Audience's Content Needs
Great content solves a specific problem for a specific person. Bad content talks about what YOU want to talk about.
**Research workflow (spend 2-3 hours on this before creating anything):**
1. **Mine customer conversations.** Go through support tickets, sales calls, discovery calls. What questions do prospects and customers ask repeatedly? Those are your content topics.
2. **Check competitor content.** What are the top 3-5 players in your space publishing? Look for gaps — topics they're NOT covering or covering poorly.
3. **Keyword research (if doing SEO).** Use free tools (Google autocomplete, AnswerThePublic, or "People Also Ask" in Google results) to see what people are actually searching for related to your niche.
4. **Community mining.** Go to Reddit, Slack communities, Facebook groups, or forums in your space. What questions get asked over and over? Those are high-value topics.
**Output:** A list of 20-30 content ideas ranked by: (a) relevance to your ICP, (b) search volume or community demand, (c) your unique perspective or experience on the topic.
---
## Step 3: Build Content Pillars
Content pillars are 3-5 broad topic areas that all your content falls under. They keep you focused and prevent random one-off content that doesn't build momentum.
**How to define pillars:**
- Each pillar should map to a core problem your product/service solves or a key interest area of your ICP.
- Pillars should be broad enough to generate dozens of pieces of content but specific enough to be relevant.
- Aim for 3-5 pillars max. More than that dilutes focus.
**Example (for an n8n automation consultant):**
```
Pillar 1: Workflow Automation Fundamentals
Pillar 2: No-Code Tool Comparisons
Pillar 3: Business Process Optimization
Pillar 4: Real Client Case Studies
```
Every piece of content you create should fit under one of these pillars. If it doesn't, don't create it.
---
## Step 4: Choose Your Content Formats and Channels
Solopreneurs can't do everything. Pick 1-2 primary formats and 1-2 primary channels. Go deep, not wide.
**Content formats:**
| Format | Best For | Time Investment | Longevity |
|---|---|---|---|
| **Blog posts** | SEO, teaching, depth | 2-4 hrs/post | High (evergreen) |
| **Videos (YouTube)** | Visual topics, personality-driven brands | 3-6 hrs/video | High (evergreen) |
| **Podcasts** | Thought leadership, interviews | 2-3 hrs/episode | Medium |
| **Twitter/X threads** | Quick insights, community building | 30 min/thread | Low (24-48hr shelf life) |
| **LinkedIn posts** | B2B, professional content | 30-60 min/post | Low-medium |
| **Email newsletters** | Relationship building, owned audience | 1-2 hrs/newsletter | Medium (subscribers keep it) |
| **Short-form video (TikTok, Reels)** | Viral potential, younger demos | 1-2 hrs/video | Low (algorithmic churn) |
**Selection criteria:**
- Where does your ICP hang out? (B2B = LinkedIn. Developers = Twitter. Visual products = Instagram.)
- What format do you NOT hate creating? (If you hate being on camera, don't pick YouTube.)
- What has the best ROI for your goals? (Lead gen = blog + email. Brand building = Twitter + LinkedIn.)
**Recommended solopreneur starting stack:**
- **Primary format:** Blog posts or long-form LinkedIn posts (depending on B2B vs B2C)
- **Secondary format:** Email newsletter (this is your owned channel — never skip this)
---
## Step 5: Build a Content Calendar
A content calendar prevents the "what should I post today?" panic. Plan 2-4 weeks ahead.
**Calendar structure:**
```
DATE | PILLAR | TOPIC | FORMAT | CHANNEL | CTA | STATUS
```
**Example:**
```
Feb 10 | Automation | "5 n8n workflows every SaaS founder needs" | Blog | Website + LinkedIn | Email signup | Draft
Feb 13 | Case Study | "How we saved Client X 20hrs/week" | LinkedIn post | LinkedIn | Book a call | Scheduled
Feb 17 | Tool Comparison | "Zapier vs n8n: Which is right for you?" | Blog | Website + Twitter | Free guide download | Outline
```
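If the calendar is kept as plain pipe-delimited text like the example above, a row can be turned into a record for filtering or sorting. The field names and the `parse_calendar_row` helper are assumptions for illustration; a spreadsheet works just as well.

```python
FIELDS = ["date", "pillar", "topic", "format", "channel", "cta", "status"]


def parse_calendar_row(row: str) -> dict[str, str]:
    """Split one 'DATE | PILLAR | ... | STATUS' row into a dict."""
    parts = [p.strip() for p in row.split("|")]
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    entry = dict(zip(FIELDS, parts))
    # Topics are quoted in the example rows; drop the surrounding quotes.
    entry["topic"] = entry["topic"].strip('"')
    return entry
```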
**Cadence recommendations:**
- Blog: 1-2x/week (minimum 2x/month to maintain SEO momentum)
- Newsletter: 1x/week or biweekly (consistency matters more than frequency)
- Social (LinkedIn/Twitter): 3-5x/week
**Rule:** Batch creation. Write 4 posts in one sitting rather than 1 post four different days. Batching is 3x faster and produces better quality.
---
## Step 6: Distribution and Amplification
Creating content is 30% of the work. Distribution is the other 70%.
**Distribution checklist for every piece:**
- [ ] Publish on primary channel (blog, YouTube, etc.)
- [ ] Share on 2-3 social channels with unique captions per platform (don't just copy-paste the same message)
- [ ] Send to email list (if it's a high-value piece)
- [ ] Post in 1-2 relevant communities (but add value to the discussion, don't just drop links)
- [ ] DM it to 3-5 people who you think would find it genuinely useful
- [ ] Repurpose into 2-3 other formats (see next step)
**Timing:** Publish early in the week (Tuesday-Thursday) for best engagement. Avoid Fridays and weekends unless your audience is specifically active then.
---
## Step 7: Repurpose Everything
One piece of long-form content can become 5-10 smaller pieces. This is how solopreneurs produce high volume without burning out.
**Repurposing workflow (example: one blog post):**
1. Original: 1,500-word blog post
2. Repurpose into: LinkedIn post (first 3 paragraphs + a hook)
3. Repurpose into: Twitter thread (key points broken into 8-10 tweets)
4. Repurpose into: Email newsletter (add a personal intro, link to full post)
5. Repurpose into: Carousel post (main points as slides on LinkedIn or Instagram)
6. Repurpose into: Short video (you on camera summarizing the key takeaway in 60 seconds)
**Rule:** Repurpose the high-performers. If a blog post gets good traffic or a LinkedIn post gets strong engagement, milk it — turn it into 5 more formats.
---
## Step 8: Measure What Matters
Track content performance so you can double down on what works and stop doing what doesn't.
**Metrics by goal:**
| Goal | Metrics to Track |
|---|---|
| Awareness | Impressions, reach, new visitors, social followers |
| Trust | Engagement rate (comments, shares), time on page, repeat visitors |
| Lead generation | Email signups, CTA clicks, lead magnet downloads |
| Sales enablement | Content assists (how many deals involved this content?), proposal open rates (if content is attached) |
**Dashboard (monthly check-in):**
- Top 5 performing pieces (by traffic or engagement)
- Traffic source breakdown (organic, social, direct, referral)
- Conversion rate (visitors → email signups or leads)
- Time investment vs results (which content type has the best ROI?)
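The last two dashboard items reduce to simple arithmetic. The sketch below shows one way to compute them; the function names and the choice of conversions per hour as the ROI measure are assumptions, not a prescribed method.

```python
def conversion_rate(visitors: int, conversions: int) -> float:
    """Percentage of visitors who became email signups or leads."""
    if visitors <= 0:
        return 0.0
    return 100.0 * conversions / visitors


def best_roi(formats: dict[str, tuple[int, float]]) -> str:
    """Pick the content type with the most conversions per hour invested.

    formats maps a content type to (conversions, hours); hours must be positive.
    """
    return max(formats, key=lambda name: formats[name][0] / formats[name][1])
```

For example, 50 signups from 2,000 visitors is a 2.5% conversion rate.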
**Iteration rule:** Every month, identify the top-performing content type and topic. Do 2x more of that next month. Identify the worst performer. Stop doing that format or adjust the approach.
---
## Content Strategy Mistakes to Avoid
- Creating content without a goal. Every piece should have a purpose tied to a business outcome.
- Not researching what your audience actually wants. Your assumptions are often wrong — validate with real data.
- Trying to be on every platform. Pick 1-2 and dominate them before expanding.
- Publishing inconsistently. One post a month doesn't build momentum. Consistency compounds.
- Not repurposing. Creating 10 original pieces is 5x harder than creating 2 original pieces and repurposing them into 8 more.
- Ignoring metrics. If you don't measure, you can't improve. Check your numbers monthly at minimum.

View File

@@ -0,0 +1,6 @@
{
"ownerId": "kn732qfbv22he1jqm63xbwq6e980kn8s",
"slug": "content-strategy",
"version": "0.1.0",
"publishedAt": 1770341804646
}

View File

@@ -0,0 +1,229 @@
---
name: ExtractWisdom
description: Content-adaptive wisdom extraction — detects what domains exist in content and builds custom sections (not static IDEAS/QUOTES). Produces tailored insight reports from videos, podcasts, articles. USE WHEN extract wisdom, analyze video, analyze podcast, extract insights, what's interesting, extract from YouTube, what did I miss, key takeaways.
---
## Customization
**Before executing, check for user customizations at:**
`~/.claude/PAI/USER/SKILLCUSTOMIZATIONS/ExtractWisdom/`
If this directory exists, load and apply any PREFERENCES.md, configurations, or resources found there. These override default behavior. If the directory does not exist, proceed with skill defaults.
# ExtractWisdom — Dynamic Content Extraction
**The next generation of extract_wisdom.** Instead of static sections (IDEAS, QUOTES, HABITS...), this skill detects what wisdom domains actually exist in the content and builds custom sections around them.
A programming interview gets "Programming Philosophy" and "Developer Workflow Tips." A business podcast gets "Contrarian Business Takes" and "Money Philosophy." A security talk gets "Threat Model Insights" and "Defense Strategies." The sections adapt because the content dictates them.
## When to Use
- Analyzing YouTube videos, podcasts, interviews, articles
- User says "extract wisdom", "what's interesting in this", "key takeaways"
- Processing any content where you want to capture the best stuff
- When standard extraction patterns miss the gems
## Depth Levels
Extract at different depths depending on need. Default is **Full** if no level is specified.
| Level | Sections | Bullets/Section | Closing Sections | When |
|-------|----------|----------------|-----------------|------|
| **Instant** | 1 | 8 | None | Quick hit. One killer section. |
| **Fast** | 3 | 3 | None | Skim in 30 seconds. |
| **Basic** | 3 | 5 | One-Sentence Takeaway only | Solid overview without the deep cuts. |
| **Full** | 5-12 | 3-15 | All three | The default. Complete extraction. |
| **Comprehensive** | 10-15 | 8-15 | All three + Themes & Connections | Maximum depth. Nothing left behind. |
**How to invoke:** "extract wisdom (fast)" or "extract wisdom at comprehensive level" or just "extract wisdom" for Full.
**Comprehensive extras:**
- **Themes & Connections** closing section: identify 3-5 throughlines that connect multiple sections. Not summaries — the deeper patterns the speaker may not even realize they're revealing.
- Prioritize breadth. Every significant wisdom domain gets its own section.
- No merging sections to save space. If the content supports 15 sections, use 15.
**All levels use the same voice, tone rules, and quality standards.** The only thing that changes is structure. An Instant extraction should hit just as hard per-bullet as a Comprehensive one.
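The depth table above can be encoded as data so that section counts and closing sections are looked up rather than remembered. This is only a sketch: `DEPTH_LEVELS` and `detect_level` are hypothetical names, and the keyword match is deliberately naive (a request that merely mentions the word "fast" would also match).

```python
import re

CLOSING_ALL = [
    "One-Sentence Takeaway",
    "If You Only Have 2 Minutes",
    "References & Rabbit Holes",
]

# (min, max) ranges copied from the depth level table.
DEPTH_LEVELS = {
    "instant": {"sections": (1, 1), "bullets": (8, 8), "closing": []},
    "fast": {"sections": (3, 3), "bullets": (3, 3), "closing": []},
    "basic": {"sections": (3, 3), "bullets": (5, 5), "closing": CLOSING_ALL[:1]},
    "full": {"sections": (5, 12), "bullets": (3, 15), "closing": CLOSING_ALL},
    "comprehensive": {
        "sections": (10, 15),
        "bullets": (8, 15),
        "closing": CLOSING_ALL + ["Themes & Connections"],
    },
}


def detect_level(request: str) -> str:
    """Return the depth level named in the request, defaulting to 'full'."""
    words = re.findall(r"[a-z]+", request.lower())
    for name in DEPTH_LEVELS:
        if name in words:
            return name
    return "full"
```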
## Workflow Routing
| Workflow | Trigger | File |
|----------|---------|------|
| **Extract** | "extract wisdom from", "analyze this", YouTube URL | `Workflows/Extract.md` |
## The Core Idea
Old extract_wisdom: Static sections. Same headers every time. IDEAS. QUOTES. HABITS. FACTS.
This skill: **Read the content first. Figure out what's actually in there. Build sections around what you find.**
The output should feel like your smartest friend watched/read the thing and is telling you about it over coffee. Not a book report. Not documentation. A real person pointing out the parts that made them go "holy shit" or "wait, that's actually brilliant."
## Tone Rules (CRITICAL)
**Canonical voice reference: `PAI/USER/WRITINGSTYLE.md`** — read this file for the full voice definition. The bullets should sound like {PRINCIPAL.NAME} telling a friend about it over coffee. Not compressed info nuggets. Not clever one-liners. Actual spoken observations.
**THREE LEVELS — we're aiming for Level 3:**
**Level 1 (BAD — documentation):**
- The speaker discussed the importance of self-modifying software in the context of agentic AI development
- It was noted that financial success has diminishing returns beyond a certain threshold
- The distinction between "vibe coding" and "agentic engineering" was emphasized as meaningful
**Level 2 (BETTER — but still "smart bullet points"):**
- He built self-modifying software basically by accident — just made the agent aware of its own source code
- Money has diminishing returns. A cheeseburger is a cheeseburger no matter how rich you are.
- "Vibe coding is a slur" — he calls it agentic engineering, and only does vibe coding after 3am
**Level 3 (YES — this is what we want — conversational, {PRINCIPAL.NAME}'s voice):**
- He wasn't trying to build self-modifying software. He just let the agent see its own source code and it started fixing itself.
- Past a certain point, money stops mattering. A cheeseburger is a cheeseburger no matter how rich you are.
- He calls vibe coding a slur. What he does is agentic engineering. The vibe coding only happens after 3am, and he regrets it in the morning.
**The difference between Level 2 and 3:** Level 2 is compressed info with em-dashes. Level 3 is how you'd actually SAY it. Varied sentence lengths. Letting a thought breathe. Not trying to be clever — just being clear and direct and a little bit personal.
**Key signals of Level 3:**
- Reads naturally when spoken aloud
- Varied sentence lengths — some short, some longer
- Understated — lets the content carry the weight
- Uses periods, not em-dashes, to let ideas land
- Feels opinionated ("Past a certain point, money stops mattering") not just informational
- The reader should think "I want to watch this" not "I got the summary"
## Rules for Extracted Points
1. **Write like you'd say it.** Read each bullet aloud. If it sounds like a press release or a compressed tweet, rewrite it. If it sounds like you telling a friend what you just watched, you nailed it.
2. **8-16 words per sentence.** This is the target range. Mix short (8-10) with medium (11-14) and longer (15-16). Don't make them all the same length. Exception: verbatim quotes can be any length since they're the speaker's actual words.
3. **Let ideas breathe.** Use periods between thoughts, not em-dashes. Short sentences. Then a slightly longer one to explain. That's the rhythm.
4. **Include the actual detail.** Not "he talked about money" but "a cheeseburger is a cheeseburger no matter how rich you are."
5. **Use the speaker's words when they're good.** If they said something perfectly, use it.
6. **No hedging language.** Not "it was suggested that" or "the speaker noted." Just say the thing.
7. **Capture what made you stop.** Every bullet should be something worth telling someone about.
8. **Vary your openers.** Don't start three bullets the same way. And don't front-load with "He" — if more than 3 bullets in a section start with the speaker's name, you're writing a biography.
9. **Capture the human moments.** Burnout stories, moments of doubt, something that moved them. That's wisdom too. Don't skip it because it's not "technical."
10. **Insight over inventory.** "He uses Go for CLIs" is inventory. "He picked a language he doesn't even like because the ecosystem fits agents perfectly. That's the new normal." is insight. Go deeper.
11. **Specificity is everything.** "He was impressed by the agent" = bad. "The agent found ffmpeg, curled the Whisper API, and transcribed a voice message nobody taught it to handle" = good.
12. **Tension and surprise.** The best bullets have a contradiction or reversal. "Every VC is offering hundreds of millions. He genuinely doesn't care." The gap between the offer and the indifference IS the wisdom.
13. **Understated, not clever.** Let the content carry the weight. You don't need to manufacture drama or craft the perfect one-liner. Just state what's interesting plainly and move on.
## How Dynamic Sections Work
### Phase 1: Content Scan
Read/listen to the full content. As you go, notice what DOMAINS of wisdom are present. These aren't the topics discussed — they're the TYPES of insight being delivered.
Examples of wisdom domains (these are illustrative, not exhaustive):
- Programming Philosophy (how to think about code, not specific syntax)
- Developer Workflow (practical tips for how to work)
- Business/Money Philosophy (unconventional takes on money, success, building companies)
- Human Psychology (insights about how people think, behave, learn)
- Technology Predictions (where things are headed)
- Life Philosophy (how to live, what matters)
- Contrarian Takes (things that go against conventional wisdom)
- First-Time Revelations (things you're hearing for the first time — genuinely new)
- Technical Architecture (how something is built, design decisions)
- Leadership & Team Dynamics (managing people, working with others)
- Creative Process (how to make things, craft, art)
### Phase 2: Section Selection
Pick sections based on depth level (default Full = 5-12). Requirements:
- Section count follows depth level table. Full = 5-12, Comprehensive = 10-15, Basic/Fast = 3, Instant = 1.
- Each section must have at least 3 STRONG bullets to justify existing (except Fast, where 3 tight bullets IS the section). If you can only scrape together 2 weak ones, merge into a related section.
- Always include "Quotes That Hit Different" if the content has good ones
- Always include "First-Time Revelations" if there are genuinely new ideas — things you literally didn't know before
- Section names should be conversational, not academic. "Money Philosophy" not "Financial Considerations"
- Sections should be SPECIFIC to this content. Generic sections = failure.
- **Kill inventory sections.** If a section is just a list of facts ("uses X for Y, uses A for B"), it's not wisdom. Either go deeper on WHY those choices matter or merge the facts into a section about the underlying philosophy.
- **Don't split what belongs together.** If "burnout recovery" and "money philosophy" are actually both about "what success really means," make one richer section instead of two thin ones.
- **Name sections like a magazine editor.** "The Death of 80% of Apps" is great. "Technology Predictions" is not. The section name itself should make you curious. It's a headline, not a category.
- **Surprise density per section.** If a section has 6+ bullets but only 2 are genuinely surprising, kill the padding and keep the winners. Quality > quantity per section.
- **Don't drop your best material between drafts.** If a spicy take, stunning moment, or first-time revelation was identified in an earlier pass, it MUST survive into the final version. Losing great material is worse than adding mediocre material.
### Phase 3: Extraction
For each section, extract 3-15 bullets depending on density. Apply all tone rules. Every bullet earns its place.
**The Spiciest Take Rule:** If the speaker has a genuinely contrarian or hot take on a topic (e.g., "screw MCPs", "X is dead", "Y is overhyped"), that take MUST appear somewhere. Spicy takes are the most memorable, shareable, and valuable parts of any content. Don't water them down. Don't leave them out.
**The "Would I Tweet This?" Test:** After extraction, scan your bullets. If fewer than half would make a good standalone tweet or social media post, your bullets are too generic. The best extractions are effectively a thread of tweetable insights.
### Phase 4: Closing Sections (Depth-Level Dependent)
Which closing sections to include depends on depth level:
| Level | Closing Sections |
|-------|-----------------|
| **Instant** | None |
| **Fast** | None |
| **Basic** | One-Sentence Takeaway only |
| **Full** | One-Sentence Takeaway + If You Only Have 2 Minutes + References & Rabbit Holes |
| **Comprehensive** | All three above + Themes & Connections |
**One-Sentence Takeaway**
The single most important thing from the entire piece in 15-20 words.
**If You Only Have 2 Minutes**
The 5-7 absolute must-know points. The cream of the cream.
**References & Rabbit Holes**
People, projects, books, tools, and ideas mentioned that are worth following up on. Brief context for each.
**Themes & Connections** (Comprehensive only)
3-5 throughlines that connect multiple sections. The deeper patterns the speaker may not realize they're revealing. Not summaries. Synthesis.
## Output Format
```markdown
# EXTRACT WISDOM: {Content Title}
> {One-line description of what this is and who's talking}
---
## {Dynamic Section 1 Name}
- {bullet}
- {bullet}
- {bullet}
## {Dynamic Section 2 Name}
- {bullet}
- {bullet}
[... more dynamic sections ...]
---
## One-Sentence Takeaway
{15-20 word sentence}
## If You Only Have 2 Minutes
- {essential point 1}
- {essential point 2}
- {essential point 3}
- {essential point 4}
- {essential point 5}
## References & Rabbit Holes
- **{Name/Project}** — {one-line context of why it's worth looking into}
- **{Name/Project}** — {context}
```
## Quality Check
Before delivering output, verify:
- [ ] Sections are specific to THIS content, not generic
- [ ] No bullet sounds like it was written by a committee
- [ ] Every bullet has a specific detail, quote, or insight — not vague summaries
- [ ] Section names are conversational and headline-worthy (not category labels)
- [ ] Section count matches depth level (Instant=1, Fast/Basic=3, Full=5-12, Comprehensive=10-15)
- [ ] Closing sections match depth level (see Phase 4 table)
- [ ] No bullet starts with "The speaker" or "It was noted that"
- [ ] No more than 3 bullets per section start with "He" or the speaker's name
- [ ] No bullet exceeds 25 words
- [ ] No inventory sections (just listing facts without insight)
- [ ] "If You Only Have 2 Minutes" bullets are each under 20 words
- [ ] Reading the output makes you want to consume the original content
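Some of the checklist items above are mechanical enough to automate. The following sketch covers three of them (word limit, banned openers, speaker-name openers); `check_section` is a hypothetical helper, and the remaining items still need a human read.

```python
BANNED_OPENERS = ("The speaker", "It was noted that")


def check_section(bullets: list[str], speaker: str = "He") -> list[str]:
    """Return a list of checklist violations for one section's bullets."""
    problems = []
    for i, bullet in enumerate(bullets):
        if len(bullet.split()) > 25:
            problems.append(f"bullet {i}: exceeds 25 words")
        if bullet.startswith(BANNED_OPENERS):
            problems.append(f"bullet {i}: banned opener")
    # Count bullets whose first word is exactly the speaker's name or "He".
    speaker_starts = sum(1 for b in bullets if b.split()[:1] == [speaker])
    if speaker_starts > 3:
        problems.append(f"section: more than 3 bullets start with {speaker!r}")
    return problems
```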

View File

@@ -0,0 +1,60 @@
# Extract Workflow
Extract dynamic, content-adaptive wisdom from any content source.
## Input Sources
| Source | Method |
|--------|--------|
| YouTube URL | `fabric -y "URL"` to get transcript |
| Article URL | WebFetch to get content |
| File path | Read the file directly |
| Pasted text | Use directly |
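The routing in the table above could be sketched as a small classifier. The function name `classify_source` and the host check for YouTube are assumptions for illustration; the sketch only decides which method applies and does not fetch anything.

```python
import os
from urllib.parse import urlparse


def classify_source(source: str) -> str:
    """Return 'youtube', 'article', 'file', or 'text' for an input source."""
    parsed = urlparse(source.strip())
    if parsed.scheme in ("http", "https"):
        host = (parsed.hostname or "").lower()
        if host in ("youtu.be", "youtube.com") or host.endswith(".youtube.com"):
            return "youtube"  # transcript via the fabric command in the table
        return "article"      # fetch the page content
    if os.path.isfile(source):
        return "file"         # read the file directly
    return "text"             # treat as pasted text
```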
## Execution Steps
### Step 1: Get the Content
Obtain the full text/transcript. For YouTube, use `fabric -y "URL"` to extract transcript. Save to a working file if large.
### Step 2: Deep Read
Read the entire content. Don't extract yet. Notice:
- What domains of wisdom are present?
- What made you stop and think?
- What's genuinely novel vs. commonly known?
- What would {PRINCIPAL.NAME} highlight if he were reading this?
- What quotes land perfectly?
### Step 3: Select Dynamic Sections
Based on your deep read, pick section names. The count follows the depth level in SKILL.md (5-12 for the default Full level). Rules:
- Section names must be conversational, not academic
- Each must have at least 3 quality bullets
- Always include "Quotes That Hit Different" if source has quotable moments
- Always include "First-Time Revelations" if genuinely new ideas exist
- Be SPECIFIC — "Agentic Engineering Philosophy" not "Technology Insights"
### Step 4: Extract Per Section
For each section, extract 3-15 bullets. Apply tone rules from SKILL.md:
- 8-16 words per sentence (the target range in SKILL.md), flexible for clarity
- Specific details, not vague summaries
- Speaker's words when they're good
- No hedging language
- Every bullet worth telling someone about
### Step 5: Add Closing Sections
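Append the closing sections for the depth level (see the Phase 4 table in SKILL.md). At the default Full level, that means: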
1. **One-Sentence Takeaway** (15-20 words)
2. **If You Only Have 2 Minutes** (5-7 essential points)
3. **References & Rabbit Holes** (people, projects, books, tools mentioned)
### Step 6: Quality Check
Run the quality checklist from SKILL.md before delivering.
### Step 7: Output
Present the complete extraction in the format specified in SKILL.md.

14
skills/contentanalysis/SKILL.md Executable file
View File

@@ -0,0 +1,14 @@
---
name: ContentAnalysis
description: Content extraction and analysis — wisdom extraction from videos, podcasts, articles, and YouTube. USE WHEN extract wisdom, content analysis, analyze content, insight report, analyze video, analyze podcast, extract insights, key takeaways, what did I miss, extract from YouTube.
---
# ContentAnalysis
Unified skill for content extraction and analysis workflows.
## Workflow Routing
| Request Pattern | Route To |
|---|---|
| Extract wisdom, content analysis, insight report, analyze content | `ExtractWisdom/SKILL.md` |

13
skills/docx/LICENSE.txt Executable file
View File

@@ -0,0 +1,13 @@
Copyright (c) 2026 Z.ai All rights reserved.
Permission is granted for personal, educational, and non-commercial use only.
Commercial use is strictly prohibited without prior written permission from the author.
Unauthorized copying, modification, or distribution of the software for commercial purposes is prohibited.
The author reserves the right to make the final determination of what constitutes "commercial use".
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY ARISING FROM THE USE OF THE SOFTWARE.

201
skills/docx/SKILL.md Executable file
View File

@@ -0,0 +1,201 @@
---
name: docx
metadata:
  author: Z.AI
  version: "1.0"
description: "Comprehensive document creation, editing, and analysis with support for tracked changes, comments, formatting preservation, and text extraction. When GLM needs to work with professional documents (.docx files) for: (1) Creating new documents, (2) Modifying or editing content, (3) Working with tracked changes, (4) Adding comments, or any other document tasks"
license: Proprietary. LICENSE.txt has complete terms
---
# DOCX Creation, Editing, and Analysis
## Quick Setup
```bash
bash "$SKILL_DIR/setup.sh" # Interactive environment check + install
```
## Overview
A .docx file is a ZIP archive containing XML files. This skill provides tools for creating, editing, reading, and reviewing Word documents.
## Quick Route — Read This First
**Step 1**: Determine task type → load the corresponding route file
**Step 2**: Determine business scene → load the corresponding scene file (if applicable)
**Step 3**: Load `references/design-system.md` for cover recipes, palettes, and chart colors
**Step 4**: Load `references/common-rules.md` for shared layout, font, and quality rules
**Step 5**: Execute per route instructions
**Step 6**: Run the post-generation checklist
⚠️ **MANDATORY — Cover Recipe Enforcement (Step 3):**
When creating a document that needs a cover page, you MUST use one of the 7 validated cover recipes (R1–R7) from `design-system.md`. **Free-form cover code is FORBIDDEN.** The recipe provides the wrapper table, background, layout structure, border settings, and spacing; do not reinvent any of these.
Workflow: (1) Call `selectCoverRecipe(docType, industry)` to get recipe + palette → (2) Use the corresponding `buildCoverRX()` function code from `design-system.md` → (3) Pass your `config` (title, subtitle, metaLines, etc.) into the recipe builder. If you skip this and write cover code from scratch, the cover WILL have compatibility issues (blank pages in MS Office, missing borders, overflow, etc.).
### Script Path Setup (MANDATORY before any script call)
All CLI tools live in `scripts/` relative to this skill's directory. Before calling any script, resolve the absolute path once:
```bash
DOCX_SCRIPTS="<skill_directory>/scripts" # ← parent directory of this SKILL.md
# Then all commands use $DOCX_SCRIPTS:
python3 "$DOCX_SCRIPTS/postcheck.py" output.docx
python3 "$DOCX_SCRIPTS/add_toc_placeholders.py" output.docx --auto
```
**For Python imports** (when generation code needs to import skill modules):
```python
import sys, os
DOCX_SCRIPTS = os.path.join("<skill_directory>", "scripts")
if DOCX_SCRIPTS not in sys.path:
    sys.path.insert(0, DOCX_SCRIPTS)
```
**⚠️ NEVER use bare `python3 scripts/...`** — it only works if cwd happens to be the skill directory. Always use the absolute `$DOCX_SCRIPTS` path.
### Task Router
| User Intent | Route | Files to Load |
|-------------|-------|---------------|
| Create/write/generate (no attachment) | **Create** | `routes/create.md` + `references/docx-js-core.md` |
| Edit/modify/revise (has attachment) | **Edit** | `routes/edit.md` + `references/ooxml.md` |
| Format/layout/font/margin | **Format** | `routes/format.md` |
| Comment/annotate/review | **Comment** | `routes/comment.md` |
| Read/analyze/extract | **Read** | `routes/read.md` |
### Scene Router (Optional — load after route)
| User Keywords | Scene | File |
|---------------|-------|------|
| thesis, academic, research, paper, dissertation, abstract, journal | Academic | `scenes/academic.md` |
| report, analysis, experiment, testing, survey, review, summary, proposal, feasibility, competitor, industry, operations | Report | `scenes/report.md` |
| contract, agreement, terms, transfer, NDA, confidential, framework, cooperation, service terms, user agreement, procurement | Contract | `scenes/contract.md` |
| resume, CV, job application | Resume | `scenes/resume.md` |
| exam, test, quiz, paper (exam context), lesson plan | Exam | `scenes/exam.md` |
| official document, notice, letter, reply, minutes, red header, government, issuance | Official | `scenes/official-doc.md` |
| broadcast script, product copy, livestream, speech, presentation script, video script | Copywriting | `scenes/copywriting.md` |
| plan, proposal (if not report context) | Report | `scenes/report.md` |
| policy, regulation, standard, management rules | Official | `scenes/official-doc.md` |
**If no scene matches**, use default design rules from `references/design-system.md` and `references/common-rules.md`.
## Formatting Standards (Always Apply)
→ See `references/common-rules.md` for full font profiles, spacing, indent, and layout rules.
**Key rules (quick reference):**
- **Line spacing**: 1.3x (`line: 312`) — MANDATORY. Exceptions: resume 1.15x, official doc 28pt fixed, copywriting `400`, contract 1.5x
- **CJK body**: Justified + 2-char indent (`firstLine: 480` SimSun / `420` YaHei)
- **Tables**: `margins` set, `ShadingType.CLEAR`, `tableHeader: true`, `cantSplit: true`, title `keepNext: true`
- **Images**: `type` parameter required, preserve aspect ratio via `image-size`, PageBreak inside Paragraph
- **Full-page Table row**: `rule: "exact"` with 1200 twips safety margin
## Unit Quick Reference
| Unit | Value |
|------|-------|
| 1 cm | 567 twips |
| 1 inch | 1440 twips |
| 1 pt | 20 twips (2 half-points for `size`) |
| A4 | 11906 × 16838 twips |
For Chinese font size table and common margins, see `references/common-rules.md`.
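The conversions in the table can be wrapped in a few small helpers. This is a minimal sketch; the function names are illustrative assumptions and are not part of the skill's `scripts/` directory.

```python
# Illustrative unit helpers (hypothetical names, not shipped with the skill).
TWIPS_PER_INCH = 1440
TWIPS_PER_POINT = 20
CM_PER_INCH = 2.54


def cm_to_twips(cm: float) -> int:
    """Centimetres to twips, rounded to the nearest twip."""
    return round(cm / CM_PER_INCH * TWIPS_PER_INCH)


def inch_to_twips(inches: float) -> int:
    """Inches to twips."""
    return round(inches * TWIPS_PER_INCH)


def pt_to_twips(pt: float) -> int:
    """Points to twips (used for spacing and indents)."""
    return round(pt * TWIPS_PER_POINT)


def pt_to_half_points(pt: float) -> int:
    """Points to half-points (the unit of the docx-js `size` property)."""
    return round(pt * 2)


# A4 page size from the table above
print(cm_to_twips(21.0), cm_to_twips(29.7))  # → 11906 16838
```

Keeping twips (layout) and half-points (font size) in separate helpers makes it harder to mix up the two units.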
## Post-Generation — Two-Layer Verification
### Layer 1: Manual Checklist (self-check during generation)
#### Basic Format
- [ ] Line spacing is 1.3x (`line: 312`) or scene-specific override
- [ ] CJK body has 2-char indent (`firstLine: 480` or `420`)
- [ ] Tables have margins set
- [ ] Images preserve aspect ratio via `image-size` — NEVER hardcode both width and height
- [ ] PageBreak inside Paragraph
- [ ] ShadingType uses CLEAR
- [ ] Each numbered list uses unique `reference`
- [ ] **⚠️ CRITICAL: Quotation marks in JS strings properly escaped.** Chinese curly quotes (`“”` `‘’`) MUST use Unicode escapes (`\u201c` `\u201d` `\u2018` `\u2019`); straight quotes (`"` `'`) use `\"` `\'` or alternate delimiters. **This is the #1 most common code generation bug.** Chinese text frequently contains `“”` for emphasis or proper nouns (e.g., “双11”, “前低后高”, “618”), and every occurrence MUST be escaped. Failure to escape produces JS syntax errors that silently break document generation.
- [ ] ImageRun includes `type` parameter
- [ ] Header/footer present (unless scene says otherwise)
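The quote-escaping item above can be automated when the JS generation source is assembled from Python. That workflow is an assumption here, and `js_string_literal` is a hypothetical helper name. The sketch relies on `json.dumps` with `ensure_ascii=True`, which escapes straight double quotes and turns every non-ASCII character (including curly quotes) into a `\uXXXX` escape, producing a string that is also a valid JS double-quoted literal.

```python
import json


def js_string_literal(text: str) -> str:
    """Return `text` as a double-quoted JS string literal.

    Straight double quotes and backslashes are backslash-escaped, and all
    non-ASCII characters (curly quotes, CJK text) become \\uXXXX escapes.
    """
    return json.dumps(text, ensure_ascii=True)


# Example: a title containing curly quotes, written with escapes here
# so that this source file itself stays ASCII-safe.
title = "\u201c618\u201d sales review"
print("new TextRun({ text: " + js_string_literal(title) + " })")
```

Escaping all non-ASCII text is more aggressive than the checklist requires, but it removes any dependence on how the generated `.js` file is encoded or quoted.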
#### Heading Styles
- [ ] All body chapter headings use `heading: HeadingLevel.HEADING_X` (never simulate with bold + large font)
- [ ] Cover title may skip Heading style (not in TOC), but body headings MUST use Heading style
#### Page Break & Blank Page Prevention
- [ ] Cover/content in separate sections
- [ ] Three rules to prevent blank pages:
  - ① When using section(NEXT_PAGE), previous section must NOT end with PageBreak (double break = blank page)
  - ② PageBreak paragraph SHOULD contain visible text. **Exception**: a section-ending empty paragraph + PageBreak is allowed (normal section separator, e.g., after cover page)
  - ③ No more than 3 consecutive empty paragraphs
- [ ] Full-page Table row height uses `rule: "exact"` (never `"atLeast"` for tall tables)
- [ ] No unwanted blank pages (check each section ending)
#### TOC
→ See `references/toc.md` for the complete TOC reference and checklist.
- [ ] If TOC title exists → `TableOfContents` element must be present
- [ ] **⚠️ MANDATORY PageBreak after TableOfContents** — a Paragraph containing PageBreak MUST immediately follow the `TableOfContents` element; without it, TOC and body content will render on the same page. This is the #1 TOC formatting failure — never omit it
- [ ] `add_toc_placeholders.py --auto` runs after generation; exit code = 0
- [ ] **TOC MUST be in its own section** — body section sets `page: { pageNumbers: { start: 1, formatType: NumberFormat.DECIMAL } }` so page numbers start from the first body page, not from the TOC pages
- [ ] **Page number API nesting**: `pageNumbers` MUST be inside `page: {}`, NOT at properties top level (see toc.md § Page Number API)
- [ ] **3-section page numbering** — Cover (no page#) → Front matter (Roman i,ii,iii, start=1) → Body (Arabic 1,2,3, start=1)
- [ ] **Post-process footers** — Roman section footer instrText must contain `PAGE \* ROMAN \* MERGEFORMAT`; Arabic section `PAGE \* arabic \* MERGEFORMAT` (WPS ignores pgNumType fmt). **⚠️ NEVER use `\* decimal` in instrText** — `decimal` is a docx-js API enum value (`NumberFormat.DECIMAL`), NOT a valid Word field format switch; using it causes page numbers to render as "1decimal", "2decimal". The correct Word field switch for Arabic numerals is `\* arabic`.
- [ ] **Remove empty pgNumType** — Post-process to strip `<w:pgNumType/>` from cover section (docx-js emits empty element that confuses WPS)
- [ ] **⚠️ TOC Refresh Hint MANDATORY** — between `TableOfContents` element and the PageBreak, MUST add an italic gray note paragraph telling users to right-click TOC → "Update Field" to refresh page numbers (see toc.md § TOC Refresh Hint)
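The two post-processing items above (fixing `\* decimal` and removing the empty `pgNumType`) can be sketched as plain string rewrites on the extracted XML. This is only an illustration under stated assumptions: the function names are hypothetical, the input is the raw XML text of a footer or `document.xml` part, and the second function removes every attribute-less `<w:pgNumType/>`, so restrict it to the cover section if other sections matter.

```python
import re


def fix_decimal_switch(xml: str) -> str:
    """Replace the invalid `\\* decimal` field switch with `\\* arabic`."""
    return re.sub(r"\\\*\s*decimal\b", r"\\* arabic", xml, flags=re.IGNORECASE)


def strip_empty_pgnumtype(xml: str) -> str:
    """Remove self-closing <w:pgNumType/> elements that carry no attributes."""
    return re.sub(r"<w:pgNumType\s*/>", "", xml)


footer = r"<w:instrText>PAGE \* decimal \* MERGEFORMAT</w:instrText>"
print(fix_decimal_switch(footer))
```

Elements such as `<w:pgNumType w:start="1"/>` are left untouched because the pattern only matches the attribute-less form.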
#### Table Cross-Page
- [ ] Header rows: `tableHeader: true`
- [ ] All rows: `cantSplit: true`
- [ ] Title paragraph: `keepNext: true`
#### Cover
- [ ] **Cover MUST use a validated recipe (R1–R7)** from `design-system.md`; free-form cover code is forbidden
- [ ] Cover recipe matches document type (per `selectCoverRecipe()` in `design-system.md`)
- [ ] Cover uses the 16838 outer wrapper table with `allNoBorders` (all recipes provide this)
- [ ] Cover title uses `calcTitleLayout()` — never hardcoded font size above 40pt
- [ ] Cover spacing uses `calcCoverSpacing()` — never hardcoded large spacing values
- [ ] Cover content does not overflow (total height ≤ 15638 twips, Table uses `rule: "exact"`)
- [ ] Every TextRun on dark/colored background has explicit `color` set (Rule 9 — never rely on default black)
- [ ] Cover section has no trailing PageBreak or empty paragraphs
- [ ] Title lines split at semantic boundaries (no mid-word breaks, no single-char orphan lines)
- [ ] No text-character decorative lines (`───`, `━━━`) — use paragraph borders only
### Layer 2: Automated Post-Check Script
```bash
python3 "$DOCX_SCRIPTS/postcheck.py" output.docx
```
Automatically checks 14 business rules: blank pages, **cover overflow (font size/spacing/trailing content)**, line spacing consistency, table margins, table cross-page control (cantSplit/tblHeader), image overflow, image aspect ratio distortion, font fallback, CJK indent, heading hierarchy, ShadingType misuse, TOC quality, document cleanliness (placeholder text/Markdown/HTML residuals), report content quality (abstract presence/heading specificity/vague conclusion detection).
⚠️ **After generating any document, MUST run postcheck.py and fix all ❌ errors.**
## Math Formulas
Formula input uses **LaTeX syntax**, internally converted to docx-js Math objects.
- **Basic formulas** (fractions, sub/superscript, roots, summation) → docx-js Math components
- **Complex formulas** (3+ nesting, matrices, piecewise functions) → matplotlib PNG fallback
See `references/math-formulas.md`.
## Charts
Default: **matplotlib template library** generates PNG for embedding.
6 ready-to-use templates: bar, line, pie, box, radar, heatmap.
Colors auto-derived from document palette.accent for style consistency.
Default palette: Morandi low-saturation (see design-system.md).
See `references/chart-templates.md`.
## Dependencies
- **pandoc**: Text extraction
- **docx**: `bun add docx` or `npm install docx` (creating)
- **LibreOffice**: PDF conversion, .doc support
- **Poppler**: PDF to image (`pdftoppm`)
- **defusedxml**: Secure XML parsing
- **python-docx**: Simple comment operations

View File

@@ -0,0 +1,386 @@
# Chart Templates — matplotlib Template Library
## Design Philosophy
GLM uses **matplotlib as the primary chart engine**. Advantages:
- High chart quality, print-ready
- Full style control, consistent with document palette
- Supports complex chart types (heatmap, radar, box plot, etc.)
- Reliable CJK rendering (with SimHei font configured)
**When to use native Word charts?**
Only when the user explicitly requests "editable charts." Default is always matplotlib PNG embedding.
## Base Configuration
```python
import os

import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.font_manager import FontProperties

# ── CJK Font ──
_FONT_PATHS = [
    "/System/Library/Fonts/Supplemental/SimHei.ttf",  # macOS
    "/usr/share/fonts/truetype/wqy/wqy-zenhei.ttc",   # Linux
    "/usr/share/fonts/truetype/chinese/SimHei.ttf",   # custom install
    "./SimHei.ttf",                                   # current dir
]
ZH_FONT = None
for _fp in _FONT_PATHS:
    # FontProperties(fname=...) does not check that the file exists,
    # so test the path explicitly and take the first one that is present.
    if os.path.exists(_fp):
        ZH_FONT = FontProperties(fname=_fp)
        break
plt.rcParams["axes.unicode_minus"] = False


# ── Palette Adapter ──
def make_chart_palette(accent: str, surface: str = "#F2F4F6") -> dict:
    """Generate chart palette from document palette.accent"""
    return {
        "primary": accent,
        "series": _generate_series_colors(accent, 6),
        "grid": "#E0E0E0",
        "bg": "white",
        "text": "#333333",
        "surface": surface,
    }


def _generate_series_colors(base_hex: str, count: int) -> list:
    """Generate series colors via hue rotation from base color"""
    import colorsys
    base = tuple(int(base_hex.lstrip("#")[i:i+2], 16) / 255.0 for i in (0, 2, 4))
    h, s, v = colorsys.rgb_to_hsv(*base)
    colors = []
    for i in range(count):
        hi = (h + i * (1.0 / count)) % 1.0
        r, g, b = colorsys.hsv_to_rgb(hi, min(s * 0.9, 1.0), min(v * 1.05, 1.0))
        colors.append(f"#{int(r*255):02x}{int(g*255):02x}{int(b*255):02x}")
    return colors


# ── Universal Export ──
def save_chart(fig, path: str, dpi: int = 200):
    """Save chart with uniform DPI. Square charts (pie/radar) use fixed padding to preserve 1:1 ratio."""
    w, h = fig.get_size_inches()
    if abs(w - h) < 0.1:
        fig.savefig(path, dpi=dpi, bbox_inches="tight", pad_inches=0.3,
                    facecolor="white", edgecolor="none")
    else:
        fig.savefig(path, dpi=dpi, bbox_inches="tight", pad_inches=0.1,
                    facecolor="white", edgecolor="none")
    plt.close(fig)
    return path
```
## Template 1: Bar Chart
```python
def bar_chart(categories: list, values: list, title: str = "",
              ylabel: str = "", palette: dict = None, output: str = "bar.png"):
    """
    Basic bar chart.
    categories: ["Q1", "Q2", "Q3", "Q4"]
    values: [120, 150, 180, 200]
    """
    p = palette or make_chart_palette("#5B8DB8")
    fig, ax = plt.subplots(figsize=(10, 6))
    bars = ax.bar(categories, values, color=p["primary"], width=0.6, edgecolor="white")
    # Data labels
    for bar, val in zip(bars, values):
        ax.text(bar.get_x() + bar.get_width() / 2, bar.get_height() + max(values) * 0.02,
                str(val), ha="center", va="bottom", fontsize=10,
                fontproperties=ZH_FONT, color=p["text"])
    if title:
        ax.set_title(title, fontproperties=ZH_FONT, fontsize=14, pad=15, color=p["text"])
    if ylabel:
        ax.set_ylabel(ylabel, fontproperties=ZH_FONT, fontsize=11, color=p["text"])
    ax.set_xticklabels(categories, fontproperties=ZH_FONT, fontsize=10)
    ax.spines[["top", "right"]].set_visible(False)
    ax.grid(axis="y", alpha=0.3, color=p["grid"])
    if len(categories) > 6:
        plt.xticks(rotation=45, ha="right")
    return save_chart(fig, output)
```
### Grouped Bar Chart
```python
def grouped_bar(categories: list, groups: dict, title: str = "",
                ylabel: str = "", palette: dict = None, output: str = "grouped_bar.png"):
    """
    groups: {"Product A": [10, 20, 30], "Product B": [15, 25, 35]}
    """
    p = palette or make_chart_palette("#5B8DB8")
    fig, ax = plt.subplots(figsize=(10, 6))
    x = np.arange(len(categories))
    n = len(groups)
    width = 0.8 / n
    for i, (name, vals) in enumerate(groups.items()):
        # Center the n bars of each category around its tick position
        offset = (i - n / 2 + 0.5) * width
        ax.bar(x + offset, vals, width, label=name, color=p["series"][i % len(p["series"])])
    ax.set_xticks(x)
    ax.set_xticklabels(categories, fontproperties=ZH_FONT, fontsize=10)
    ax.legend(prop=ZH_FONT, frameon=False)
    if title:
        ax.set_title(title, fontproperties=ZH_FONT, fontsize=14, pad=15)
    if ylabel:
        # The ylabel argument was accepted but never applied in the original
        ax.set_ylabel(ylabel, fontproperties=ZH_FONT, fontsize=11)
    ax.spines[["top", "right"]].set_visible(False)
    ax.grid(axis="y", alpha=0.3)
    return save_chart(fig, output)
```
## Template 2: Line Chart
```python
def line_chart(x_data: list, series: dict, title: str = "",
               xlabel: str = "", ylabel: str = "", palette: dict = None,
               output: str = "line.png"):
    """
    series: {"Revenue": [100, 120, 150, 180], "Cost": [80, 90, 100, 110]}
    """
    p = palette or make_chart_palette("#5B8DB8")
    fig, ax = plt.subplots(figsize=(10, 6))
    for i, (name, values) in enumerate(series.items()):
        color = p["series"][i % len(p["series"])]
        ax.plot(x_data, values, marker="o", markersize=5, linewidth=2,
                label=name, color=color)
    if title:
        ax.set_title(title, fontproperties=ZH_FONT, fontsize=14, pad=15)
    if xlabel:
        ax.set_xlabel(xlabel, fontproperties=ZH_FONT, fontsize=11)
    if ylabel:
        ax.set_ylabel(ylabel, fontproperties=ZH_FONT, fontsize=11)
    ax.legend(prop=ZH_FONT, frameon=False, loc="best")
    ax.spines[["top", "right"]].set_visible(False)
    ax.grid(True, alpha=0.3)
    if len(x_data) > 6:
        plt.xticks(rotation=45, ha="right")
    return save_chart(fig, output)
```
## Template 3: Pie Chart
```python
def pie_chart(labels: list, values: list, title: str = "",
              palette: dict = None, output: str = "pie.png"):
    """Pie chart: auto-merges slices below 3% into 'Other'"""
    p = palette or make_chart_palette("#5B8DB8")
    fig, ax = plt.subplots(figsize=(8, 8))
    # Merge slices below 3% into "Other"
    total = sum(values)
    merged_labels, merged_values = [], []
    other = 0
    for lbl, val in zip(labels, values):
        if val / total < 0.03:
            other += val
        else:
            merged_labels.append(lbl)
            merged_values.append(val)
    if other > 0:
        merged_labels.append("Other")
        merged_values.append(other)
    colors = p["series"][:len(merged_labels)]
    wedges, texts, autotexts = ax.pie(
        merged_values, labels=merged_labels, colors=colors,
        autopct="%1.1f%%", startangle=90, pctdistance=0.75,
        textprops={"fontproperties": ZH_FONT, "fontsize": 11}
    )
    for t in autotexts:
        t.set_fontsize(10)
        t.set_color("white")
    if title:
        ax.set_title(title, fontproperties=ZH_FONT, fontsize=14, pad=20)
    return save_chart(fig, output)
```
## Template 4: Box Plot
```python
def box_plot(data: dict, title: str = "", ylabel: str = "",
             palette: dict = None, output: str = "box.png"):
    """
    data: {"Class A": [78, 82, 91, ...], "Class B": [65, 70, 88, ...]}
    """
    p = palette or make_chart_palette("#5B8DB8")
    fig, ax = plt.subplots(figsize=(10, 6))
    labels = list(data.keys())
    values = list(data.values())
    # Note: Matplotlib 3.9+ renames `labels` to `tick_labels`; `labels` still works there with a deprecation warning
    bp = ax.boxplot(values, labels=labels, patch_artist=True, notch=False,
                    medianprops={"color": "white", "linewidth": 2})
    for i, patch in enumerate(bp["boxes"]):
        patch.set_facecolor(p["series"][i % len(p["series"])])
        patch.set_alpha(0.8)
    ax.set_xticklabels(labels, fontproperties=ZH_FONT, fontsize=11)
    if title:
        ax.set_title(title, fontproperties=ZH_FONT, fontsize=14, pad=15)
    if ylabel:
        ax.set_ylabel(ylabel, fontproperties=ZH_FONT, fontsize=11)
    ax.spines[["top", "right"]].set_visible(False)
    ax.grid(axis="y", alpha=0.3)
    return save_chart(fig, output)
```
## Template 5: Radar Chart
```python
def radar_chart(categories: list, series: dict, title: str = "",
                palette: dict = None, output: str = "radar.png"):
    """
    categories: ["Chinese", "Math", "English", "Physics", "Chemistry"]
    series: {"Student A": [85, 92, 78, 90, 88], "Student B": [75, 88, 92, 70, 85]}
    """
    p = palette or make_chart_palette("#5B8DB8")
    fig, ax = plt.subplots(figsize=(8, 8), subplot_kw=dict(polar=True))
    n = len(categories)
    angles = np.linspace(0, 2 * np.pi, n, endpoint=False).tolist()
    angles += angles[:1]  # close the polygon
    for i, (name, values) in enumerate(series.items()):
        vals = values + values[:1]  # close the polygon
        color = p["series"][i % len(p["series"])]
        ax.plot(angles, vals, linewidth=2, label=name, color=color)
        ax.fill(angles, vals, alpha=0.15, color=color)
    ax.set_xticks(angles[:-1])
    ax.set_xticklabels(categories, fontproperties=ZH_FONT, fontsize=11)
    ax.legend(prop=ZH_FONT, loc="upper right", bbox_to_anchor=(1.2, 1.1), frameon=False)
    if title:
        ax.set_title(title, fontproperties=ZH_FONT, fontsize=14, pad=25)
    return save_chart(fig, output)
```
## Template 6: Heatmap
```python
def heatmap(data: list, row_labels: list, col_labels: list, title: str = "",
            palette: dict = None, output: str = "heatmap.png"):
    """
    data: 2D array [[1,2,3],[4,5,6]]
    row_labels: ["Row 1", "Row 2"]
    col_labels: ["Col 1", "Col 2", "Col 3"]
    """
    fig, ax = plt.subplots(figsize=(max(8, len(col_labels) * 1.2), max(6, len(row_labels) * 0.8)))
    arr = np.array(data)
    im = ax.imshow(arr, cmap="YlOrRd", aspect="auto")
    ax.set_xticks(range(len(col_labels)))
    ax.set_yticks(range(len(row_labels)))
    ax.set_xticklabels(col_labels, fontproperties=ZH_FONT, fontsize=10)
    ax.set_yticklabels(row_labels, fontproperties=ZH_FONT, fontsize=10)
    # Value annotations
    for i in range(len(row_labels)):
        for j in range(len(col_labels)):
            val = arr[i, j]
            color = "white" if val > arr.max() * 0.7 else "black"
            ax.text(j, i, f"{val:.1f}", ha="center", va="center",
                    fontsize=10, color=color)
    fig.colorbar(im, ax=ax, shrink=0.8)
    if title:
        ax.set_title(title, fontproperties=ZH_FONT, fontsize=14, pad=15)
    return save_chart(fig, output)
```
## Embedding in Documents (MANDATORY — Preserve Aspect Ratio)
**⚠️ Core Rule: When embedding any chart image, you MUST read actual image dimensions to calculate displayHeight. NEVER hardcode both width and height.**
Pie and radar charts are square — mismatched width/height produces ellipses or diamonds.
```js
// ✅ Correct: read actual image dimensions
const fs = require("fs");
const sizeOf = require("image-size");
const { Paragraph, ImageRun, AlignmentType } = require("docx");

const chartBuffer = fs.readFileSync("bar.png");
const dims = sizeOf(chartBuffer);
const displayWidth = 500;
// Derive the height from the real aspect ratio instead of hardcoding it
const displayHeight = Math.round(displayWidth * (dims.height / dims.width));
new Paragraph({
  alignment: AlignmentType.CENTER,
  spacing: { before: 200, after: 100 },
  children: [
    new ImageRun({
      data: chartBuffer,
      transformation: { width: displayWidth, height: displayHeight },
      type: "png",
    }),
  ],
})
```
```js
// ❌ Wrong: hardcoded width and height (pie becomes ellipse, radar becomes diamond)
new ImageRun({
  data: chartBuffer,
  transformation: { width: 500, height: 350 }, // wrong ratio!
  type: "png",
})
```
```python
# ✅ Python (ReportLab) correct approach:
from PIL import Image as PILImage
from reportlab.platypus import Image
pil_img = PILImage.open('chart.png')
orig_w, orig_h = pil_img.size
target_width = 400 # pt
scale = target_width / orig_w
img = Image('chart.png', width=target_width, height=orig_h * scale)
```
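The proportional-height rule used in the snippets above reduces to one small calculation. The sketch below isolates it as a pure function; `scaled_size` is a hypothetical name, not part of the template library. Note that `save_chart` saves with `bbox_inches="tight"`, which crops the figure, so the saved PNG is generally not exactly 10×6 inches at 200 DPI. That is why the dimensions have to be read from the file rather than assumed.

```python
def scaled_size(orig_w: int, orig_h: int, target_width: int) -> tuple:
    """Return (width, height) that fits `target_width` and keeps the aspect ratio."""
    if orig_w <= 0 or orig_h <= 0:
        raise ValueError("image dimensions must be positive")
    return target_width, round(target_width * orig_h / orig_w)


# A nominal 10x6 inch chart at 200 DPI, before any tight-bbox cropping
print(scaled_size(2000, 1200, 500))  # → (500, 300)
# A square pie or radar chart stays square
print(scaled_size(1600, 1600, 500))  # → (500, 500)
```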
## Chart Selection Guide
| Data Scenario | Recommended Chart | Template Function |
|---------------|-------------------|-------------------|
| Category comparison | Bar chart | `bar_chart()` |
| Multi-group comparison | Grouped bar | `grouped_bar()` |
| Trend over time | Line chart | `line_chart()` |
| Proportion/composition | Pie chart | `pie_chart()` |
| Distribution/spread | Box plot | `box_plot()` |
| Multi-dimensional assessment | Radar chart | `radar_chart()` |
| Matrix correlation | Heatmap | `heatmap()` |
## Quality Standards
1. **DPI**: Uniform 200 DPI (built into `save_chart`)
2. **Colors**: Derived from document palette.accent for style consistency
3. **CJK text**: Must configure SimHei font; otherwise renders as boxes
4. **Label overlap prevention**: Auto-rotate 45° when >6 x-axis labels
5. **Legend**: Move outside chart (`bbox_to_anchor`) when >4 series
6. **Grid**: Light gray dashed grid lines for readability
7. **Clean frames**: Remove top/right spines for modern minimalist look
8. **Aspect ratio (CRITICAL)**: Must use `image-size` (JS) or `PIL` (Python) to read actual image dimensions and calculate displayHeight proportionally. **Pie and radar charts are square — hardcoding non-1:1 ratio causes ellipse/diamond distortion.**
9. **Dimensions**: Default 10×6 inches, fits well within A4 page

View File

@@ -0,0 +1,419 @@
# Common Rules
Shared rules referenced by all scene files. Scene-specific overrides take precedence.
## Default Page Layout
A4 portrait. Unless the scene specifies otherwise, use:
| Property | Value | Twips |
|----------|-------|-------|
| Page width | 21.0 cm | 11906 |
| Page height | 29.7 cm | 16838 |
| Top margin | 2.54 cm | 1440 |
| Bottom margin | 2.54 cm | 1440 |
| Left margin | 3.0 cm | 1701 |
| Right margin | 2.5 cm | 1417 |
```js
page: {
  size: { width: 11906, height: 16838, orientation: PageOrientation.PORTRAIT },
  margin: { top: 1440, bottom: 1440, left: 1701, right: 1417 },
}
```
**Scene overrides:**
- **Official doc (GB/T 9704 red-header):** top 2098, bottom 1984, left 1588, right 1474
- **Exam:** top/bottom 1134 (2 cm), left/right 1134 (2 cm)
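The usable content width follows directly from the page width and the left and right margins listed above. A minimal sketch of that arithmetic is shown below; the dictionary keys and the function name are illustrative assumptions.

```python
# Page width and (left, right) margins in twips, taken from the values above.
PAGE_WIDTH_TWIPS = 11906
SIDE_MARGINS = {
    "default": (1701, 1417),
    "official": (1588, 1474),
    "exam": (1134, 1134),
}


def content_width_twips(scene: str = "default") -> int:
    """Width available for body content: page width minus left and right margins."""
    left, right = SIDE_MARGINS[scene]
    return PAGE_WIDTH_TWIPS - left - right


for scene in SIDE_MARGINS:
    twips = content_width_twips(scene)
    print(scene, twips, f"{twips / 567:.1f} cm")
```

For the default layout this gives 8788 twips, which matches 21.0 cm minus the 3.0 cm and 2.5 cm side margins (15.5 cm).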
## Default Font Specifications
Two font profiles exist. Each scene declares which profile it uses.
### Profile A: Formal (report, academic, contract, official-doc, exam)
| Element | CN Font | EN Font | Size | Notes |
|---------|---------|---------|------|-------|
| H1 | SimHei | Times New Roman | 16 pt (size: 32) | Bold, centered |
| H2 | SimHei | Times New Roman | 15 pt (size: 30) | Bold |
| H3 | SimHei | Times New Roman | 14 pt (size: 28) | Bold |
| Body | SimSun | Times New Roman | 12 pt (size: 24) | |
| Caption | SimSun | Times New Roman | 10.5 pt (size: 21) | |
- Text color: always **pure black `"000000"`** (never dark-blue-grey)
- First-line indent: **480 twips** (2 chars at SimSun 12pt)
- Line spacing: **312** (1.3x).
- **Color routing for non-report documents**: When the document is a short-form text (essay, evaluation, letter, speech, application, reflection, etc.) rather than a structured report/whitepaper/proposal/consulting deliverable, heading color MUST use pure black `"000000"` instead of `palette.primary`. Colored headings are reserved for documents that need brand/professional identity (reports with covers, whitepapers, proposals, consulting deliverables).
### Profile B: Visual (resume, copywriting)
| Element | CN Font | EN Font | Size |
|---------|---------|---------|------|
| Name/Title | Microsoft YaHei | Calibri | Varies |
| Body | Microsoft YaHei | Calibri | 10–11 pt |
| Caption | Microsoft YaHei | Calibri | 9 pt |
- First-line indent: **420 twips** (2 chars at YaHei)
- Color: per design-system palette
### Official-Doc Font Override (GB/T 9704)
When `needsRedHeader() = true`:
| Element | Font | Size |
|---------|------|------|
| Red header org name | STXiaoBiaoSong (or SimSun bold) | 26 pt (size: 52) |
| Title | STXiaoBiaoSong (or SimHei) | 22 pt (size: 44) |
| Body | FangSong | 16 pt (size: 32) |
| Section heading | FangSong_GB2312 bold (or HeiTi) | 16 pt (size: 32) |
- Line spacing: **560** (28 pt fixed)
- First-line indent: **640 twips** (2 chars at FangSong 16pt)
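The indent and line-spacing numbers in these profiles all come from two conversions: one CJK character is roughly as wide as the font size (pt × 20 twips), and a proportional line-spacing value is expressed in 240ths of a line. The sketch below reproduces the figures quoted in this section; the function names are hypothetical.

```python
def first_line_indent(font_pt: float, chars: int = 2) -> int:
    """First-line indent in twips for `chars` full-width characters."""
    return round(chars * font_pt * 20)


def proportional_line(multiple: float) -> int:
    """Line-spacing value for a multiple of single spacing (240 = 1.0x)."""
    return round(multiple * 240)


def fixed_line(pt: float) -> int:
    """Line-spacing value in twips for a fixed height given in points."""
    return round(pt * 20)


print(first_line_indent(12), first_line_indent(10.5), first_line_indent(16))  # → 480 420 640
print(proportional_line(1.3), fixed_line(28))  # → 312 560
```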
## Chinese Font Size Reference
| Name | Points | Half-points (size:) |
|------|--------|---------------------|
| Chu Hao (initial) | 42 | 84 |
| Xiao Chu | 36 | 72 |
| Yi Hao (1st) | 26 | 52 |
| Xiao Yi | 24 | 48 |
| Er Hao (2nd) | 22 | 44 |
| Xiao Er | 18 | 36 |
| San Hao (3rd) | 16 | 32 |
| Xiao San | 15 | 30 |
| Si Hao (4th) | 14 | 28 |
| Xiao Si | 12 | 24 |
| Wu Hao (5th) | 10.5 | 21 |
| Xiao Wu | 9 | 18 |
| Liu Hao (6th) | 7.5 | 15 |
## Placeholder Convention
When required information is missing, use standardized placeholders so users can Find & Replace in Word.
**Format:** Always use full-width brackets `【 】`.
| Type | Format | Example |
|------|--------|---------|
| General field | `【field name】` | Name: 【company name】 |
| Monetary amount | `【RMB in words: yuan (lowercase: ¥)】` | Amount: 【RMB in words】 |
| Date field | `【____/____/____】` | Signing date: 【____/____/____】 |
| Long text | `【Please fill in: ______】` | Delivery criteria: 【Please fill in: ______】 |
| Attachment ref | `【See Appendix 1: ______】` | |
**Rules:**
1. Placeholder format must be consistent throughout the entire document
2. Each placeholder must specify exactly what is needed (never use vague "TBD" or "to be completed")
3. Never hard-code unconfirmed critical facts; use a placeholder instead
4. Never use sloppy expressions like "to be refined", "omitted", "user fills in later"
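The rules above lend themselves to a quick automated scan of the generated text. The following sketch is one possible check, not the skill's `postcheck.py`: the function names and the list of vague phrases are assumptions drawn from the wording of rules 2 and 4, and plain substring matching can produce false positives.

```python
import re

# Matches the inner text of a full-width bracket placeholder
PLACEHOLDER_RE = re.compile(r"【([^【】]*)】")
# Phrases the rules above forbid (illustrative list)
VAGUE_PHRASES = ("tbd", "to be completed", "to be refined", "user fills in later")


def find_placeholders(text: str) -> list:
    """Return the inner text of every full-width bracket placeholder, in order."""
    return PLACEHOLDER_RE.findall(text)


def find_vague_phrases(text: str) -> list:
    """Return the forbidden phrases that occur in `text` (case-insensitive)."""
    lowered = text.lower()
    return [phrase for phrase in VAGUE_PHRASES if phrase in lowered]


sample = "Name: 【company name】, signing date: 【____/____/____】, price: TBD"
print(find_placeholders(sample))
print(find_vague_phrases(sample))
```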
## Title Orphan Prevention (All Scenes)
Body headings (H1/H2/H3) and cover titles must avoid leaving 1–2 characters alone on the last line. This rule applies to ALL document types.
**For cover titles:** Always use `calcTitleLayout()` + `splitTitleLines()` from `design-system.md` — these handle orphan prevention automatically (merges ≤2-char last lines into the previous line).
**For body headings (H1/H2/H3):** When a heading text is long enough to wrap, apply the same `splitTitleLines()` logic. If the heading would cause a single-character orphan in Word's auto-wrapping, manually split into multiple `TextRun` elements with a `Break` (soft line break) at a semantic boundary.
```js
const { TextRun } = require("docx");
// splitTitleLines() is defined in design-system.md
// Check if heading needs manual line break to prevent orphan
function buildHeadingRuns(text, maxCharsPerLine, runProps) {
  // If text fits in one line, no action needed
  if (text.length <= maxCharsPerLine) {
    return [new TextRun({ text, ...runProps })];
  }
  // Use splitTitleLines to find semantic break points
  const lines = splitTitleLines(text, maxCharsPerLine);
  const runs = [];
  for (let i = 0; i < lines.length; i++) {
    if (i > 0) runs.push(new TextRun({ break: 1, ...runProps, text: "" })); // soft line break
    runs.push(new TextRun({ text: lines[i], ...runProps }));
  }
  return runs;
}
```
**Estimation for maxCharsPerLine:** For centered headings, estimate available width = page width - left margin - right margin. For SimHei at a given pt size, each CJK char ≈ pt × 20 twips wide. Divide available width by char width to get `maxCharsPerLine`.
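The estimate described above can be written out as a short calculation. This sketch assumes the default A4 layout from the Default Page Layout table and full-width CJK characters; `max_chars_per_line` is a hypothetical helper name.

```python
def max_chars_per_line(font_pt: float, page_width: int = 11906,
                       left: int = 1701, right: int = 1417) -> int:
    """Estimate how many full-width CJK characters fit on one line.

    Each character is assumed to be font_pt * 20 twips wide.
    """
    available = page_width - left - right
    char_width = font_pt * 20
    return int(available // char_width)


# H1 (16 pt), H2 (15 pt), H3 (14 pt) on the default layout
print(max_chars_per_line(16), max_chars_per_line(15), max_chars_per_line(14))  # → 27 29 31
```

Latin letters and digits are narrower than CJK characters, so mixed-language headings will usually fit more characters than this estimate suggests.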
---
## Undefined / Null Value Prevention (Mandatory)
Generated code MUST guard against outputting literal `undefined`, `null`, `NaN`, or empty strings for any visible text field. This is a **hard requirement** — these are never acceptable in a delivered document.
```js
// ✅ MANDATORY: Safe text helper — use for ALL user-facing text values
function safeText(value, placeholder) {
  if (value === undefined || value === null || value === "" || String(value) === "NaN" || String(value) === "undefined") {
    return placeholder || "【Please fill in】";
  }
  return String(value);
}
// Usage:
new TextRun({ text: safeText(config.contact, "【Contact person】") })
new TextRun({ text: safeText(row.phone, "【Phone number】") })
```
**Rules:**
1. Every `TextRun` displaying user-provided or config-derived data MUST use `safeText()` or equivalent guard
2. If a field is optional and not provided, use `【Please fill in: field_name】` placeholder (full-width brackets)
3. Table cells with missing data: show `【Please fill in】`, never leave as empty string or undefined
4. This applies to ALL scenes — contracts, reports, academic, exams, etc.
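When document text is prepared on the Python side (for example before handing data to a generation script, which is an assumption about the workflow), an equivalent guard might look like the sketch below. `safe_text` is a hypothetical port of the `safeText()` helper above. Unlike the JS version, it also treats whitespace-only strings and the literal strings `"null"` and `"None"` as missing.

```python
import math

DEFAULT_PLACEHOLDER = "【Please fill in】"


def safe_text(value, placeholder: str = DEFAULT_PLACEHOLDER) -> str:
    """Return display text, or a placeholder when the value is missing."""
    if value is None:
        return placeholder
    if isinstance(value, float) and math.isnan(value):
        return placeholder
    text = str(value)
    if text.strip() in ("", "NaN", "nan", "undefined", "null", "None"):
        return placeholder
    return text


print(safe_text(None, "【Contact person】"))  # → 【Contact person】
print(safe_text(0))  # → 0
```

Zero is kept as `"0"` on purpose: a numeric zero is a real value, not a missing one.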
---
## WPS / Office Word Compatibility (Mandatory)
Generated .docx files must render consistently in both Microsoft Office Word and WPS Office. The following OOXML features have known compatibility issues — avoid or use carefully.
### Features to AVOID (high incompatibility risk)
| Feature | Issue | Alternative |
|---------|-------|------------|
| **Text-character decorative lines** (e.g., `───`, `━━━`, `═══`, `——————`) | Character-drawn lines depend on font metrics and rendering engine — they appear different widths/lengths in MS Office vs WPS, often truncated or misaligned. They cannot span a controlled width. | **Always use paragraph borders** (`border.top`, `border.bottom`) for horizontal decorative lines. Paragraph borders render consistently across engines and respect indent for precise width control. See recipe R2 for correct implementation. |
| **Default table borders on cover wrapper tables** (forgetting `allNoBorders`) | docx-js default table borders are `single/auto/sz=4`. On the 16838-high cover wrapper, these borders add ~8 twips of extra height per edge. MS Office includes border thickness in height calculation, causing content to overflow by a few twips → **blank page 2**. WPS is more lenient and may absorb the overflow. | **Every cover wrapper table MUST explicitly set `borders: allNoBorders`** (all 6 border positions = NONE). Never rely on defaults. Define the `allNoBorders` constant and use it consistently. |
| `verticalAlign: "center"` or `"bottom"` in exact-height TableRow | WPS ignores vertical alignment in exact-height rows; content may clip or shift | Use `verticalAlign: "top"` + `spacing.before` to position content. Avoid `margins.top`/`margins.bottom` in exact-height cells — they reduce available height unpredictably across engines |
| `characterSpacing` (large values) | WPS renders differently from Word; letter spacing may collapse or expand | Keep `characterSpacing` ≤ 80; for cover English labels, test both renderers |
| `margins.top`/`margins.bottom` inside exact-height cells | MS Office and WPS calculate remaining height differently when cell margins are present | Use `spacing.before` on the first paragraph for vertical positioning; only use `margins.left`/`margins.right` |
| Complex nested Tables inside exact-height cells | WPS height calculation differs from Word; content may overflow or clip | Wrap everything in a single 16838 outer wrapper cell (R1 architecture). Nested tables inside are acceptable when the outer wrapper provides a safety net |
| Large font without explicit `spacing.line` | Paragraph inherits small line spacing from document default (e.g., 560tw for body); font taller than line height → top of characters clipped | Always set `spacing: { line: fontPt * 23, lineRule: "atLeast" }` on paragraphs with font size > body text |
| `ShadingType.SOLID` | WPS shows solid black instead of intended color | Always use `ShadingType.CLEAR` |
| OOXML raw XML for columns (`w:cols`) | WPS column rendering may differ | Use only when explicitly needed (A3 exam papers); test output |
| `titlePage: true` with complex headers/footers | WPS may not properly suppress first-page header/footer | Use separate sections instead of titlePage flag |
| Tab stops for alignment | WPS tab width may differ from Word | Use borderless Tables for alignment instead |
### Features that are SAFE (consistent rendering)
| Feature | Notes |
|---------|-------|
| Borderless Tables for layout | Both renderers handle well |
| `ShadingType.CLEAR` with fill color | Consistent |
| `rule: "exact"` on single-level TableRow | Works in both (avoid with nested Tables) |
| Paragraph borders (left, bottom, etc.) | Consistent |
| `spacing.before` / `spacing.after` | Consistent |
| Standard fonts (SimHei, SimSun, YaHei, TNR, Calibri) | Available on both platforms |
| `PageBreak` inside Paragraph | Consistent |
| Section breaks (`SectionType.NEXT_PAGE`) | Consistent |
### Mandatory Compatibility Checks (Post-Generation)
Add to quality self-check:
- [ ] No `ShadingType.SOLID` anywhere (search codebase)
- [ ] No `verticalAlign: "center"` or `"bottom"` in exact-height rows
- [ ] No tab-stop alignment for party info or data alignment (use Tables)
- [ ] Covers use the 16838 outer wrapper architecture (R1 pattern) with `spacing.before` for positioning; no `margins.top`/`margins.bottom` in exact-height cells
- [ ] **Cover section margin = `{ top: 0, bottom: 0, left: 0, right: 0 }`** — non-zero margins cause wrapper to shrink away from page edges
- [ ] **Cover wrapper row has `height: { value: 16838, rule: "exact" }`** — without this, content overflows or leaves whitespace
- [ ] **Cover is in a separate section from body content** — cover and body must not share a section
- [ ] **Cover wrapper table uses explicit `allNoBorders`** — never rely on default table borders (causes blank page 2 in MS Office)
- [ ] **No text-character decorative lines** (`───`, `━━━`, `═══`, `——————`) — use paragraph borders instead
- [ ] `characterSpacing` values ≤ 80 throughout
- [ ] TOC: follow `references/toc.md` checklist (heading style, TableOfContents element, PageBreak, post-processing script)
- [ ] All tables use `WidthType.PERCENTAGE` for column widths (WPS tblGrid bug; if DXA is unavoidable, set `columnWidths` explicitly)
```js
// ✅ Correct — percentage widths, WPS-safe
new Table({
width: { size: 100, type: WidthType.PERCENTAGE },
rows: [new TableRow({ children: [
new TableCell({ width: { size: 30, type: WidthType.PERCENTAGE }, children: [...] }),
new TableCell({ width: { size: 70, type: WidthType.PERCENTAGE }, children: [...] }),
]})],
});
// ❌ WRONG — DXA widths cause WPS tblGrid mismatch (all gridCol=100)
new TableCell({ width: { size: 3000, type: WidthType.DXA }, ... })
```
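When column sizes arrive as DXA values or arbitrary ratios, they can be normalized to whole-number percentages before the table is built. The sketch below is one possible approach; `toPercentWidths` is a hypothetical helper name, not part of docx-js or of this skill.

```js
// Hypothetical helper (assumption): normalize raw column weights, such as
// DXA values, into integer percentages that sum to exactly 100.
function toPercentWidths(weights) {
  const total = weights.reduce((sum, w) => sum + w, 0);
  if (total <= 0) throw new Error("weights must sum to a positive number");
  // Integer math avoids floating-point surprises such as 29.999...
  const widths = weights.map((w) => Math.floor((w * 100) / total));
  // Give the rounding remainder to the last column so the row totals 100.
  const used = widths.reduce((sum, w) => sum + w, 0);
  widths[widths.length - 1] += 100 - used;
  return widths;
}

console.log(toPercentWidths([3000, 7000]));
```

Each returned number can then be passed as `size` with `type: WidthType.PERCENTAGE` on the matching `TableCell`.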
---
## Universal Prohibitions
These apply to ALL scenes. Scene files may add scene-specific prohibitions.
1. **No outlines-only** — always produce a complete, finished document
2. **No chat-style output** — the document must not read like a conversation or explanation
3. **No fake TOC / page numbers / headers** — use proper docx-js structures
4. **No excessive blank lines** to pad layout
5. **No dirty formatting** — no stray annotations, template fragments, broken hyperlinks, garbled markers
6. **No sloppy placeholders** — "TBD", "omitted", "略", "to be refined" are forbidden; use proper `【】` placeholders
7. **No fabricated data** — do not invent statistics, citations, legal references, or facts to appear professional
8. **No inconsistent heading/numbering** — one numbering system per document, no level-skipping
9. **No Markdown artifacts** — no `#`, `**`, `-` list markers, `>` blockquotes, and **no Markdown table syntax** (`| col1 | col2 |`, `|---|---|`) in the final docx. Any tabular data MUST be rendered as a proper docx `Table` object — never as plain-text pipe-delimited lines. This applies to ALL scenes including exam paper data tables, report statistics, and academic result tables.
10. **No bullet-list documents** — body text must be proper paragraphs, not endless bullet points
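Prohibition 9 can be checked mechanically before text is placed into a `TextRun`. The following is a heuristic sketch only; the function name and the pattern list are assumptions, and a line that legitimately starts with `- ` or `>` would be flagged as a false positive.

```js
// Hypothetical lint (assumption): flag Markdown syntax that leaked into
// text destined for the final document.
const MARKDOWN_PATTERNS = [
  { name: "heading", re: /^#{1,6}\s/ },
  { name: "bold", re: /\*\*[^*]+\*\*/ },
  { name: "list marker", re: /^\s*[-*]\s+/ },
  { name: "blockquote", re: /^\s*>\s?/ },
  { name: "table row", re: /^\s*\|.*\|\s*$/ },
];

function findMarkdownArtifacts(text) {
  const hits = [];
  text.split("\n").forEach((line, index) => {
    for (const { name, re } of MARKDOWN_PATTERNS) {
      if (re.test(line)) hits.push({ line: index + 1, kind: name });
    }
  });
  return hits;
}

console.log(findMarkdownArtifacts("# Title\nPlain text\n| a | b |"));
```

Any `table row` hit indicates data that should be rebuilt as a docx `Table` rather than emitted as text.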
## Letter / Correspondence Format (Universal)
When generating any letter-style document (invitation letter, thank-you letter, cover letter, recommendation letter, English essay in letter format, etc.), the following layout rules apply regardless of scene:
1. **Complimentary close and sender name MUST be right-aligned** — e.g., "Yours sincerely,", "Best regards,", "Yours,", and the sender name below it must use `alignment: AlignmentType.RIGHT`
2. **Date** — if placed at the top of the letter, right-aligned; if at the bottom, right-aligned with the closing
3. **Salutation** ("Dear Mr. Smith," / "Dear Mike,") — left-aligned, followed by a blank line or `spacing.after`
4. **Body paragraphs** — left-aligned (English) or justified (CJK), with appropriate `spacing.after` between paragraphs
```js
// ✅ Correct — closing and sender right-aligned
new Paragraph({ alignment: AlignmentType.RIGHT, spacing: { before: 400 },
children: [new TextRun({ text: "Yours sincerely,", size: 24 })] }),
new Paragraph({ alignment: AlignmentType.RIGHT,
children: [new TextRun({ text: "Li Hua", size: 24 })] }),
// ❌ WRONG — closing left-aligned (default)
new Paragraph({
children: [new TextRun({ text: "Yours sincerely," })] }),
```
## Quality Self-Check (Universal)
→ See **SKILL.md § Post-Generation — Two-Layer Verification** for the complete checklist.
Scene files add scene-specific checks on top of that universal checklist.
## Execution Priority
When rules conflict, follow this precedence (highest first):
1. **User-provided template or explicit instructions** — always override defaults
2. **Scene-specific rules** — override common rules and design-system defaults
3. **Common rules** (this file) — override design-system aesthetic defaults
4. **Design-system defaults** — baseline aesthetics
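The precedence order above can be read as a first-match lookup across layers. A minimal sketch, assuming each layer is a plain object of rule values; the layer names, `resolveRule`, and the sample values are all illustrative and do not come from this skill.

```js
// Hypothetical layer names (assumption), highest precedence first.
const PRECEDENCE = ["user", "scene", "common", "designSystem"];

// Return the value of `key` from the highest-precedence layer that defines it.
function resolveRule(key, layers) {
  for (const name of PRECEDENCE) {
    const layer = layers[name];
    if (layer && Object.prototype.hasOwnProperty.call(layer, key)) {
      return { value: layer[key], source: name };
    }
  }
  return undefined;
}

// Illustrative values only.
const layers = {
  scene: { bodyFontPt: 12 },
  common: { bodyFontPt: 10.5, lineRule: "atLeast" },
  designSystem: { bodyFontPt: 11, accent: "1F4E79" },
};
console.log(resolveRule("bodyFontPt", layers));
```

Here the scene value wins over the common and design-system values because no user layer is present.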
## Cover Recipes
See `references/design-system.md` for the 7 validated cover recipes (R1-R7) and 14 color palettes.
Cover recipe selection: `selectCoverRecipe(docType, industry, titleLength)` — defined in `references/design-system.md` (authoritative source).
---
## Cover Title Layout Rules (Mandatory)
These rules apply to ALL cover recipes (R1-R7). They prevent the most common cover quality issues: title overflow, content spilling to page 2, and mid-word line breaks.
### Rule 1: Always use `calcTitleLayout()`
Every cover MUST call `calcTitleLayout(title, availableWidth)` from `design-system.md` to determine:
- **Font size** (dynamically calculated, never hardcoded above 40pt)
- **Line breaks** (semantically split, never mid-word)
**Forbidden:** Passing the full title as a single long TextRun and letting Word auto-wrap. This causes uncontrolled line breaks at arbitrary character positions.
### Rule 2: No single-character orphan lines
If the last line of a title contains only 1-2 characters, merge it into the previous line. The `splitTitleLines()` function handles this automatically.
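The merge step can be sketched as a small post-pass over already-split lines. This is not the `splitTitleLines()` implementation from `design-system.md`; `mergeOrphanLine` and its `maxOrphan` parameter are hypothetical names used for illustration.

```js
// Hypothetical post-pass (assumption): if the last line has at most
// `maxOrphan` characters, append it to the previous line.
function mergeOrphanLine(lines, maxOrphan = 2) {
  if (lines.length < 2) return lines.slice();
  const last = lines[lines.length - 1];
  // Array.from counts code points, so surrogate pairs are not split.
  if (Array.from(last).length > maxOrphan) return lines.slice();
  const merged = lines.slice(0, -2);
  merged.push(lines[lines.length - 2] + last);
  return merged;
}

console.log(mergeOrphanLine(["Quality Management Specificati", "on"]));
```

Merging makes the previous line longer, so the font size may need to be re-checked against the available width afterwards.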
### Rule 3: No mid-word breaks for CJK text
Line breaks must occur at semantic boundaries: after particles (e.g., de/yu/he/ji/zhi), punctuation, connectors, spaces, or underscores. Never split a compound term (e.g., a 4-character term like a management specification must not be split into 3+1 characters).
For mixed Chinese+English titles (e.g., "基于Transformer架构的..."), use `estimateTextWidth()` instead of character count for line break calculation. Chinese characters are ~2× wider than English characters at the same font size.
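The 2:1 width ratio can be turned into a rough estimate as follows. This is a sketch under stated assumptions, not the `estimateTextWidth()` defined in `design-system.md`: it treats a CJK or full-width character as 1 em and any other character as 0.5 em, and the name `roughTextWidth` is hypothetical.

```js
// Rough width estimate in twips. Assumptions (not taken from design-system.md):
// a CJK or full-width character is about 1 em wide, anything else about 0.5 em.
function roughTextWidth(text, fontPt) {
  const emTwips = fontPt * 20; // 1pt = 20 twips
  let width = 0;
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    const isWide =
      (cp >= 0x4e00 && cp <= 0x9fff) || // CJK Unified Ideographs
      (cp >= 0x3000 && cp <= 0x303f) || // CJK symbols and punctuation
      (cp >= 0xff01 && cp <= 0xff60);   // full-width forms
    width += isWide ? emTwips : emTwips * 0.5;
  }
  return width;
}

console.log(roughTextWidth("基于Transformer架构的", 36));
```

Comparing this estimate with the available width gives a line-break decision that does not over-count narrow Latin characters.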
### Rule 4: Maximum 3 title lines on cover
Cover titles must not exceed 3 lines. If the title is too long, reduce font size (down to minimum 24pt) before adding more lines. If it still exceeds 3 lines at 24pt, force 3 lines with longer line lengths.
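The fallback order in Rule 4 (shrink first, then force 3 lines) can be sketched as a loop. The function name `fitTitle`, the 2pt step, and the caller-supplied `countLines` callback are assumptions for illustration; the 40pt ceiling and 24pt floor come from Rules 1 and 4.

```js
// Hypothetical sketch of the Rule 4 fallback order (assumption).
// `countLines(pt)` is supplied by the caller and returns how many lines
// the title needs at font size `pt`.
function fitTitle(countLines, maxPt = 40, minPt = 24, maxLines = 3, step = 2) {
  for (let pt = maxPt; pt >= minPt; pt -= step) {
    if (countLines(pt) <= maxLines) return { titlePt: pt, forced: false };
  }
  // Still too long at the minimum size: keep minPt and force maxLines
  // by allowing longer lines.
  return { titlePt: minPt, forced: true };
}

// Illustrative line counter: larger fonts need more lines.
console.log(fitTitle((pt) => Math.ceil(pt / 10)));
```

When `forced` is true, the caller re-splits the title into exactly 3 lines at 24pt instead of adding a fourth line.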
### Rule 5: Always use `calcCoverSpacing()` for whitespace
Spacing values (`spacing.before`) in cover elements must be dynamically calculated, not hardcoded. Fixed values like `before: 4500` assume a specific title length and will cause overflow with longer titles.
### Rule 6: Cover height budget validation
Before generating, verify that total content height stays within 15638 twips (16838 page height minus 1200 twips safety margin — MS Office renders large fonts taller than calculated). Each recipe in `design-system.md` includes height budget annotations — verify during generation.
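The budget arithmetic can be sketched as a simple check. `checkCoverBudget` and the `{ name, height }` block shape are hypothetical; only the 16838 and 1200 twip figures come from the rule above.

```js
const PAGE_HEIGHT_TW = 16838;  // A4 page height in twips
const SAFETY_MARGIN_TW = 1200; // safety margin from Rule 6
const BUDGET_TW = PAGE_HEIGHT_TW - SAFETY_MARGIN_TW; // 15638

// Hypothetical check (assumption): `blocks` lists estimated heights in twips.
function checkCoverBudget(blocks) {
  const total = blocks.reduce((sum, b) => sum + b.height, 0);
  return {
    total,
    budget: BUDGET_TW,
    ok: total <= BUDGET_TW,
    overflow: Math.max(0, total - BUDGET_TW),
  };
}

// Illustrative estimates only.
console.log(checkCoverBudget([
  { name: "top spacing", height: 5000 },
  { name: "title + meta", height: 10000 },
]));
```

A non-zero `overflow` means spacing or font size should be reduced before the cover is generated.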
### Rule 7: R5 meta info table (academic covers)
Academic cover meta info must use a 2-column table with **percentage widths only** (NOT DXA — WPS breaks with DXA widths):
- **Table width:** adaptive 55-75% of page, calculated by `calcR5MetaLayout()` in `design-system.md`. Table is centered via `alignment: CENTER`.
- **Label column:** adaptive 25-45% of table width, **LEFT aligned**, plain text label + "：" (full-width colon). NO full-width space padding, NO right-alignment, NO distributed alignment.
- **Value column:** remaining percentage, **LEFT aligned**, `bottom border single sz=4` = fixed-length underline (same length for all rows regardless of value text length).
- **Label column borders:** none (NO bottom border on label cells).
- ⚠️ Do NOT use DXA widths, full-width space padding (`\u3000`), spacer columns, or tab stops — these render inconsistently between MS Office and WPS.
### Rule 8: Large font paragraphs must set explicit line spacing
When a paragraph uses a font size larger than the document body text (e.g., cover titles at 36pt+), it **MUST** set explicit `spacing.line` to prevent clipping. Without it, the paragraph inherits the document/style default line spacing (often 560 twips for body text), which is smaller than the font height → the top of characters gets clipped.
**Formula:** `spacing.line = Math.ceil(fontPt * 23)` with `lineRule: "atLeast"`
**Example:** A 36pt title needs `spacing: { line: 828, lineRule: "atLeast" }`. Without this, the inherited `line=560` clips the top 160 twips of the text.
This applies to ALL large-font paragraphs (cover titles, chapter headings, decorative text), not just covers.
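The formula and the 160-twip figure can be worked through in code. `largeFontSpacing` is a hypothetical helper name; the factor 23 and the 560-twip inherited line height are taken from the rule above.

```js
// Hypothetical helper (assumption): spacing object for a large-font paragraph.
function largeFontSpacing(fontPt) {
  return { line: Math.ceil(fontPt * 23), lineRule: "atLeast" };
}

// Worked example for a 36pt title.
const fontHeightTw = 36 * 20;               // 720 twips of nominal font height
const inheritedLineTw = 560;                // body default from the rule above
console.log(fontHeightTw - inheritedLineTw); // twips clipped without the fix
console.log(largeFontSpacing(36));
```

The result can be passed directly as `spacing` on the `Paragraph`, merged with any `before`/`after` values.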
### Rule 9: Every TextRun on a colored background MUST set explicit `color`
⚠️ **CRITICAL:** When a TextRun is inside a cell/area with a dark or colored background (shading), it **MUST** explicitly set the `color` property. Omitting `color` defaults to black (`#000000`), which is invisible on dark backgrounds.
**Common mistake:** Subtitle or meta text on R1/R2/R4 dark cover blocks without `color` → appears as invisible black text on dark bg.
**Rule:** For any TextRun inside a shaded cell:
- Use `P.cover.titleColor` for title text
- Use `P.cover.subtitleColor` for subtitle text
- Use `P.cover.metaColor` for meta info text
- Use `P.cover.footerColor` for footer text
- **NEVER** rely on default color when background is not white
### Rule 10: Page number API nesting and 3-section numbering
⚠️ **CRITICAL:** Page number settings MUST be nested inside `page.pageNumbers`:
```js
// ❌ WRONG — docx-js ignores top-level pageNumberStart/pageNumberFormatType
properties: { pageNumberStart: 1, pageNumberFormatType: NumberFormat.DECIMAL }
// ✅ CORRECT
properties: { page: { pageNumbers: { start: 1, formatType: NumberFormat.DECIMAL } } }
```
**Standard page numbering (5-zone convention):**
All multi-section documents MUST follow this five-zone page numbering scheme unless the user explicitly requests otherwise.
| Zone | Section | pageNumbers | Footer instrText | Notes |
|------|---------|-------------|-----------------|-------|
| 1. Cover | Title page | None (no footer) | — | Always logical page 1, but number is **hidden** |
| 2. Front matter | Abstract, TOC, Preface | `{ start: 1, formatType: UPPER_ROMAN }` | `PAGE \* ROMAN \* MERGEFORMAT` | Separate uppercase Roman numeral sequence (I, II, III…) |
| 3. Body | Main content | `{ start: 1, formatType: DECIMAL }` | `PAGE \* arabic \* MERGEFORMAT` | **Resets to 1** |
| 4. Appendix | Appendices (A, B, C…) | Continues body (no reset) | Same as body | No section break needed unless different headers required |
| 5. References | Bibliography | Continues body (no reset) | Same as body | If body ends on p.42, references continue from p.43 |
**Key rules:**
0. **NEVER use "Page X of Y" denominator format.** Footer must show only the current page number (e.g., `1`, `2`, `III`). Do NOT display total page count. No `Page 3 of 12`, no `3 / 12`, no `第3页/共12页`. Just the bare number. `PageNumber.TOTAL_PAGES` / `NUMPAGES` is **FORBIDDEN** in footers.
1. **Cover is always page 1 internally** but the page number is never displayed. Suppress footer in cover section.
2. **Front matter uses independent Roman numerals** starting at `I` (uppercase, matching `UPPER_ROMAN`). This sequence is separate from the body.
3. **Body resets to Arabic 1.** The first page of main content is always page `1`.
4. **Appendix and references continue the body sequence.** No reset between body → appendix → references.
5. **Documents without front matter** skip zone 2 (cover hidden, body starts at Arabic 1).
6. **Documents without cover** start body (or front matter) at page 1 directly.
7. **Short documents (≤3 pages):** simple Arabic 1, 2, 3 throughout, no cover/frontmatter distinction.
8. **Single-page documents** (certificates, letters): no page numbering at all.
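The key rules above can be condensed into a small planning function. This is an illustrative sketch: `planPageNumbering`, its input flags, and the returned object shape are assumptions, not part of docx-js or of this skill.

```js
// Hypothetical planner (assumption). Returns one entry per docx section.
function planPageNumbering({ pageCount, hasCover = false, hasFrontMatter = false }) {
  if (pageCount <= 1) return []; // rule 8: no page numbering at all
  if (pageCount <= 3) {
    // rule 7: simple Arabic numbering throughout
    return [{ zone: "body", start: 1, format: "decimal", showNumber: true }];
  }
  const sections = [];
  if (hasCover) sections.push({ zone: "cover", showNumber: false }); // rule 1
  if (hasFrontMatter) {
    // rule 2: independent Roman sequence
    sections.push({ zone: "frontMatter", start: 1, format: "upperRoman", showNumber: true });
  }
  // rules 3-4: body resets to 1; appendix and references stay in this
  // section so their numbering continues.
  sections.push({ zone: "body", start: 1, format: "decimal", showNumber: true });
  return sections;
}

console.log(planPageNumbering({ pageCount: 40, hasCover: true, hasFrontMatter: true }));
```

Each returned entry maps to one docx section; the `format` strings are labels only and still need to be translated into the matching `NumberFormat` member.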
**3-section docx-js implementation (for documents with TOC):**
At minimum, implement zones 1-3 as separate docx sections:
```js
// Section 1: Cover — no page number
properties: { page: { /* no pageNumbers */ } }
// No footer children, or empty footer
// Section 2: Front matter — Roman numerals
properties: { page: { pageNumbers: { start: 1, formatType: NumberFormat.UPPER_ROMAN } } }
// Footer: PAGE \* ROMAN \* MERGEFORMAT
// Section 3: Body — Arabic, reset to 1
properties: { page: { pageNumbers: { start: 1, formatType: NumberFormat.DECIMAL } } }
// Footer: PAGE \* arabic \* MERGEFORMAT
// Appendix and References: same section as body (continues numbering)
// Only create a new section if different header/footer content is needed
```
**Post-processing required** (WPS compatibility):
1. Remove empty `<w:pgNumType/>` from cover section XML
2. Patch footer instrText: replace bare `PAGE` with format-specific `PAGE \* ROMAN` or `PAGE \* arabic`
See `toc.md` § Page Number API for full details.
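The two post-processing steps can be sketched as string-level patches on the unzipped XML parts. This is not the script described in `toc.md`; the function names are hypothetical, and the sketch assumes the `PAGE` field code sits alone inside a `<w:instrText>` element.

```js
// Hypothetical patch (assumption): add a format switch to bare PAGE fields
// in one footer part. Already-patched fields are left untouched.
function patchPageField(footerXml, formatSwitch) {
  return footerXml.replace(
    /(<w:instrText[^>]*>)\s*PAGE\s*(<\/w:instrText>)/g,
    `$1PAGE \\* ${formatSwitch} \\* MERGEFORMAT$2`
  );
}

// Hypothetical patch (assumption): drop self-closing <w:pgNumType/> elements
// that carry no attributes.
function removeEmptyPgNumType(documentXml) {
  return documentXml.replace(/<w:pgNumType\s*\/>/g, "");
}

const sample = '<w:instrText xml:space="preserve">PAGE</w:instrText>';
console.log(patchPageField(sample, "ROMAN"));
```

Use `"ROMAN"` for the front-matter footer and `"arabic"` for the body footer, then re-zip the package.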

View File

@@ -0,0 +1,538 @@
## Geometric Decoration System — Pure docx-js Decorations
### Design Philosophy
Uses only docx-js native capabilities for visual decoration — no external tools (like Playwright screenshots). Suitable for covers, chapter separators, page background enhancement.
**When to fall back to Playwright?**
Only when the design needs gradients, complex illustrations, or brand visuals that pure OOXML cannot express. By default, prefer the native solutions below.
### Decoration Element Library
#### 1. Color Strip — Table Simulation
Single-row single-column borderless table + background color to create horizontal color strips.
```js
function colorStrip(color, height = 80) {
return new Table({
width: { size: 100, type: WidthType.PERCENTAGE },
borders: { top: NB, bottom: NB, left: NB, right: NB,
insideHorizontal: NB, insideVertical: NB },
rows: [new TableRow({
height: { value: height, rule: "exact" },
children: [new TableCell({
shading: { type: ShadingType.CLEAR, fill: color.replace("#", "") },
borders: { top: NB, bottom: NB, left: NB, right: NB },
children: [new Paragraph({ children: [] })],
})],
})],
});
}
// ══════════════════════════════════════════════════════════════
// R6 — Editorial Warm (minimal, warm white bg, no decorations)
// ══════════════════════════════════════════════════════════════
// Suitable for: lesson plans (non-STEM), cultural/creative, newsletters,
// event planning, internal reports, light-weight documents
// NOT for: formal business, consulting, finance, government, academic
// Title constraint: single line only (≤20 chars). Longer titles → route to R1.
//
// Structure: 2-row wrapper table (no border, warm bg shading)
// Row 1 (content): category → title → subtitle → fields
// Row 2 (footer): left English title + right label
// All spacing via paragraph indent (WPS safe, no cell margins).
function buildCoverR6(config) {
const P = config.palette;
const PAD_L = 1300, PAD_R = 1100;
const ind = { left: PAD_L, right: PAD_R };
const FOOTER_H = 900;
const CONTENT_H = 16838 - FOOTER_H;
const shading = { fill: P.bg || "F7F7F5", type: ShadingType.CLEAR };
// ⚠️ R6 uses a simplified title layout: prefer single line, shrink font to fit
const availW = 11906 - PAD_L - PAD_R;
const { titlePt, titleLines } = calcTitleLayoutR6(config.title, availW, 36, 22);
const titleSize = titlePt * 2;
const lineH = Math.ceil(titlePt * 23 * 1.3);
// Dynamic top spacing
const titleH = titleLines.length * (titleSize * 10 + 200);
const categoryH = 22 * 10 + 900;
const subtitleH = config.subtitle ? (28 * 10 + 1200) : 0;
const fieldsH = (config.metaLines || []).length * (24 * 10 + 100);
const contentH = categoryH + titleH + subtitleH + fieldsH;
const remaining = Math.max(CONTENT_H - 1200 - contentH, 400);
const topSpacing = Math.floor(remaining * 0.55);
const children = [];
// 1. Top spacer (dynamic)
children.push(new Paragraph({ indent: ind, spacing: { before: topSpacing } }));
// 2. Category label (small, wide letter-spacing)
if (config.englishLabel) {
children.push(new Paragraph({
indent: ind, spacing: { after: 900 },
children: [new TextRun({
text: config.englishLabel, size: 22,
color: P.cover.metaColor || "9A9A9A",
font: { ascii: "Calibri", eastAsia: "Microsoft YaHei" },
characterSpacing: 60,
})],
}));
}
// 3. Title (single line preferred, dynamic font size)
for (let i = 0; i < titleLines.length; i++) {
children.push(new Paragraph({
indent: ind,
spacing: { after: i < titleLines.length - 1 ? 60 : 300, line: lineH, lineRule: "atLeast" },
children: [new TextRun({
text: titleLines[i], size: titleSize,
color: P.cover.titleColor || "2C2C2C",
font: { ascii: "Calibri", eastAsia: "Microsoft YaHei" },
characterSpacing: 30,
})],
}));
}
// 4. Subtitle
if (config.subtitle) {
children.push(new Paragraph({
indent: ind, spacing: { after: 1200 },
children: [new TextRun({
text: config.subtitle, size: 28,
color: P.cover.subtitleColor || "6B6B6B",
font: { ascii: "Calibri", eastAsia: "Microsoft YaHei" },
characterSpacing: 15,
})],
}));
}
// 5. Meta fields (tab-aligned label + value)
for (const line of (config.metaLines || [])) {
// Expect "label：value" (full-width colon), "label:value", or plain text
const sep = line.indexOf("：") !== -1 ? "：" : (line.indexOf(":") !== -1 ? ":" : null);
const label = sep ? line.split(sep)[0].trim() : line;
const value = sep ? line.split(sep).slice(1).join(sep).trim() : "";
children.push(new Paragraph({
indent: ind, spacing: { after: 100 },
tabStops: [{ type: TabStopType.LEFT, position: PAD_L + 1600 }],
children: [
new TextRun({ text: label, size: 22, color: P.cover.metaColor || "9A9A9A",
font: { ascii: "Calibri", eastAsia: "Microsoft YaHei" }, characterSpacing: 20 }),
...(value ? [
new TextRun({ text: "\t" }),
new TextRun({ text: value, size: 24, color: P.cover.subtitleColor || "6B6B6B",
font: { ascii: "Calibri", eastAsia: "Microsoft YaHei" }, characterSpacing: 8 }),
] : []),
],
}));
}
// 6. Footer (2-column borderless table)
const footerLeft = config.footerLeft || "";
const footerRight = config.footerRight || "";
// Adaptive font size for long English footer text
const flSize = footerLeft.length > 60 ? 14 : (footerLeft.length > 40 ? 16 : 18);
const flSpacing = footerLeft.length > 60 ? 5 : (footerLeft.length > 40 ? 10 : 20);
const footerTable = new Table({
width: { size: 100, type: WidthType.PERCENTAGE },
layout: TableLayoutType.FIXED, borders: allNoBorders,
rows: [new TableRow({
children: [
new TableCell({
width: { size: 70, type: WidthType.PERCENTAGE }, borders: noBorders, shading,
children: [new Paragraph({
indent: { left: PAD_L },
children: [new TextRun({ text: footerLeft, size: flSize,
color: P.cover.footerColor || "9A9A9A",
font: { ascii: "Calibri" }, characterSpacing: flSpacing })],
})],
}),
new TableCell({
width: { size: 30, type: WidthType.PERCENTAGE }, borders: noBorders, shading,
children: [new Paragraph({
alignment: AlignmentType.RIGHT, indent: { right: PAD_R },
children: [new TextRun({ text: footerRight, size: 18,
color: P.cover.footerColor || "9A9A9A",
font: { ascii: "Calibri" }, characterSpacing: 20 })],
})],
}),
],
})],
});
// 7. 2-row wrapper (content + footer)
return [new Table({
width: { size: 100, type: WidthType.PERCENTAGE },
layout: TableLayoutType.FIXED, borders: allNoBorders,
rows: [
new TableRow({
height: { value: CONTENT_H, rule: "exact" },
children: [new TableCell({
shading, borders: noBorders,
margins: { top: 0, bottom: 0, left: 0, right: 0 },
verticalAlign: VerticalAlign.TOP,
children,
})],
}),
new TableRow({
height: { value: FOOTER_H, rule: "exact" },
children: [new TableCell({
shading, borders: noBorders,
margins: { top: 0, bottom: 0, left: 0, right: 0 },
verticalAlign: VerticalAlign.CENTER,
children: [footerTable],
})],
}),
],
})];
}
// R6 title layout: prefer FEWER lines over larger font size (single line best)
function calcTitleLayoutR6(title, availableWidthTw, preferredPt, minPt) {
const step = 2;
// Try to fit in 1 line (shrink font if needed)
for (let pt = preferredPt; pt >= minPt; pt -= step) {
const charWidthTw = pt * 23 * 0.5; // CJK ~50% em width
const charsPerLine = Math.floor(availableWidthTw / charWidthTw);
if (title.length <= charsPerLine) return { titlePt: pt, titleLines: [title] };
}
// Can't fit in 1 line, try 2 lines at largest possible font
for (let pt = preferredPt; pt >= minPt; pt -= step) {
const charWidthTw = pt * 23 * 0.5;
const charsPerLine = Math.floor(availableWidthTw / charWidthTw);
const lines = splitTitleLines(title, charsPerLine);
if (lines.length <= 2) return { titlePt: pt, titleLines: lines };
}
// Fallback: minPt, up to 3 lines
const charWidthTw = minPt * 23 * 0.5;
const charsPerLine = Math.floor(availableWidthTw / charWidthTw);
return { titlePt: minPt, titleLines: splitTitleLines(title, charsPerLine) };
}
// Usage: cover top decoration
// children: [colorStrip(P.accent, 120), ...]
```
#### 2. Side Ribbon
Uses left border to create vertical ribbon effect.
```js
function sideRibbon(content, color, width = 14) {
return new Paragraph({
border: {
left: { style: BorderStyle.SINGLE, size: width, color: color.replace("#", ""), space: 12 },
},
indent: { left: 240 },
spacing: { before: 100, after: 100 },
children: content,
});
}
// Usage: emphasis quotes, chapter tips
// sideRibbon([new TextRun({ text: "Key Insight", bold: true })], P.accent)
```
#### 3. Border Compositions
```js
// Top thick line + bottom thin line — title area frame
function frameTitle(titleRuns) {
return new Paragraph({
border: {
top: { style: BorderStyle.SINGLE, size: 18, color: c(P.accent) },
bottom: { style: BorderStyle.SINGLE, size: 4, color: c(P.accent) },
},
spacing: { before: 400, after: 200 },
alignment: AlignmentType.CENTER,
children: titleRuns,
});
}
// L-shape border — left + bottom
function lShapeBorder(content) {
return new Paragraph({
border: {
left: { style: BorderStyle.SINGLE, size: 12, color: c(P.accent), space: 10 },
bottom: { style: BorderStyle.SINGLE, size: 12, color: c(P.accent) },
},
indent: { left: 300 },
spacing: { before: 200, after: 300 },
children: content,
});
}
// Double-line frame — top and bottom double lines
function doubleLine(content) {
return new Paragraph({
border: {
top: { style: BorderStyle.DOUBLE, size: 6, color: c(P.accent) },
bottom: { style: BorderStyle.DOUBLE, size: 6, color: c(P.accent) },
},
spacing: { before: 200, after: 200 },
alignment: AlignmentType.CENTER,
children: content,
});
}
```
#### 4. Gradient Simulation
Multiple narrow color strips to simulate gradient effect.
```js
function gradientStrip(startColor, endColor, steps = 5, totalHeight = 200) {
const rows = [];
const h = Math.floor(totalHeight / steps);
for (let i = 0; i < steps; i++) {
const ratio = steps > 1 ? i / (steps - 1) : 0; // avoid 0/0 (NaN) when steps === 1
const blended = blendColors(startColor, endColor, ratio);
rows.push(new TableRow({
height: { value: h, rule: "exact" },
children: [new TableCell({
shading: { type: ShadingType.CLEAR, fill: blended },
borders: { top: NB, bottom: NB, left: NB, right: NB },
children: [new Paragraph({ children: [] })],
})],
}));
}
return new Table({
width: { size: 100, type: WidthType.PERCENTAGE },
borders: { top: NB, bottom: NB, left: NB, right: NB,
insideHorizontal: NB, insideVertical: NB },
rows,
});
}
function blendColors(hex1, hex2, ratio) {
const r1 = parseInt(hex1.slice(1, 3), 16), g1 = parseInt(hex1.slice(3, 5), 16), b1 = parseInt(hex1.slice(5, 7), 16);
const r2 = parseInt(hex2.slice(1, 3), 16), g2 = parseInt(hex2.slice(3, 5), 16), b2 = parseInt(hex2.slice(5, 7), 16);
const r = Math.round(r1 + (r2 - r1) * ratio), g = Math.round(g1 + (g2 - g1) * ratio), b = Math.round(b1 + (b2 - b1) * ratio);
return `${r.toString(16).padStart(2,"0")}${g.toString(16).padStart(2,"0")}${b.toString(16).padStart(2,"0")}`;
}
```
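To see which fills a gradient produces before building any table, the blend can be run on its own. The sketch below re-implements the same linear interpolation in compact form; `blendHex` and `gradientStops` are hypothetical names. Like `blendColors()`, it expects `#RRGGBB` input and returns hex without the `#`.

```js
// Linear RGB blend; expects "#RRGGBB" inputs, returns "rrggbb" without "#".
function blendHex(hex1, hex2, ratio) {
  const parse = (hex) => [1, 3, 5].map((i) => parseInt(hex.slice(i, i + 2), 16));
  const a = parse(hex1);
  const b = parse(hex2);
  return a
    .map((v, i) => Math.round(v + (b[i] - v) * ratio).toString(16).padStart(2, "0"))
    .join("");
}

// Hypothetical helper (assumption): list the fill color of every strip.
function gradientStops(startColor, endColor, steps) {
  return Array.from({ length: steps }, (_, i) =>
    blendHex(startColor, endColor, steps > 1 ? i / (steps - 1) : 0)
  );
}

console.log(gradientStops("#000000", "#ffffff", 3));
```

With only a few steps the bands remain visible, so this technique suits subtle transitions between close colors.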
#### 5. Symbol Ornaments
```js
// Section divider line — for chapter separation
function ornamentDivider(symbol = "◆", count = 3) {
const ornament = Array(count).fill(symbol).join(" ");
return new Paragraph({
alignment: AlignmentType.CENTER,
spacing: { before: 400, after: 400 },
children: [new TextRun({ text: ornament, size: 20, color: c(P.accent) })],
});
}
// Common decoration symbols
// ◆ ◇ ● ○ ★ ☆ ■ □ ▲ △ ─ ━ ═ ║ ╔ ╗ ╚ ╝
// Ornamental: ❧ ❦ ✦ ✧ ✿ ❀ ❁ ※
```
#### 6. Info Card — Table Implementation
```js
function infoCard(title, items, accentColor) {
const ac = accentColor.replace("#", "");
const headerRow = new TableRow({
children: [new TableCell({
columnSpan: 2,
shading: { type: ShadingType.CLEAR, fill: ac },
margins: { top: 80, bottom: 80, left: 160, right: 160 },
borders: { top: NB, bottom: NB, left: NB, right: NB },
children: [new Paragraph({
children: [new TextRun({ text: title, bold: true, size: 24, color: "FFFFFF" })],
})],
})],
});
const dataRows = items.map(([label, value]) => new TableRow({
children: [
new TableCell({
width: { size: 30, type: WidthType.PERCENTAGE },
margins: { top: 60, bottom: 60, left: 160, right: 80 },
shading: { type: ShadingType.CLEAR, fill: "F8F9FA" },
borders: { bottom: { style: BorderStyle.SINGLE, size: 1, color: "E0E0E0" },
top: NB, left: NB, right: NB },
children: [new Paragraph({ children: [new TextRun({ text: label, size: 21, color: "666666" })] })],
}),
new TableCell({
margins: { top: 60, bottom: 60, left: 80, right: 160 },
borders: { bottom: { style: BorderStyle.SINGLE, size: 1, color: "E0E0E0" },
top: NB, left: NB, right: NB },
children: [new Paragraph({ children: [new TextRun({ text: value, size: 21 })] })],
}),
],
}));
return new Table({
width: { size: 80, type: WidthType.PERCENTAGE },
alignment: AlignmentType.CENTER,
borders: { top: NB, bottom: NB, left: NB, right: NB,
insideHorizontal: NB, insideVertical: NB },
rows: [headerRow, ...dataRows],
});
}
```
```js
// R7: Swiss Tech Minimalist (slate grey bg, Klein blue accent, asymmetric layout)
// Suitable for: cultural/creative research, trend reports, brand strategy, design deliverables
// Palette: ST-1 (exclusive)
// Layout: left-aligned title (upper 20%), right-shifted subtitle with top rule,
//         right-aligned info block with accent right border, Swiss cross anchor
// Key features: ■ square accent dot, open-frame tables, large whitespace
//
// ⚠️ MANDATORY: All cover non-negotiables apply (margin=0, 16838 exact, allNoBorders)
// ⚠️ Title uses calcTitleLayout() with maxPt=36 (not 40; R7 uses lighter visual weight)
function buildCoverR7(config) {
  const P = palettes[config.palette || "ST-1"];
  const C = P.cover;
  const padL = 600;
  // Title layout: R7 uses 36pt max (lighter than R1-R4's 40pt)
  const availW = 11906 - padL - 600;
  const { titlePt, titleLines } = calcTitleLayout(config.title, availW, 36, 24);
  const titleSize = titlePt * 2;
  const lineH = Math.ceil(titlePt * 23);
  // Dynamic spacing based on title lines
  const topSpacer = titleLines.length <= 2 ? 1200 : 800;
  const subtitleSpacer = titleLines.length <= 2 ? 1400 : 800;
  const infoSpacer = titleLines.length <= 2 ? 2200 : 1200;
  const children = [];
  // 1. Swiss cross anchor: top-left decorative element
  children.push(new Paragraph({
    spacing: { before: 600 },
    indent: { left: padL },
    children: [new TextRun({
      text: "\uFF0B", // fullwidth plus
      size: 40, bold: true, color: C.titleColor,
      font: { ascii: "Arial", eastAsia: "SimHei" },
    })],
  }));
  // 2. Top spacer
  children.push(new Paragraph({ spacing: { before: topSpacer } }));
  // 3. Title lines: left-aligned, last line has accent ■
  titleLines.forEach((line, i) => {
    const isLast = i === titleLines.length - 1;
    const runs = [new TextRun({
      text: line, size: titleSize, color: C.titleColor,
      font: { ascii: "Arial", eastAsia: "Noto Sans SC" },
    })];
    if (isLast) {
      runs.push(new TextRun({
        text: " \u25A0", // ■ black square
        size: 24, color: P.accent,
        font: { ascii: "Arial" },
      }));
    }
    children.push(new Paragraph({
      indent: { left: padL },
      spacing: { after: isLast ? 200 : 80, line: lineH, lineRule: "atLeast" },
      children: runs,
    }));
  });
  // 4. Subtitle spacer
  children.push(new Paragraph({ spacing: { before: subtitleSpacer } }));
  // 5. Subtitle: right-shifted, top border rule, wide character spacing
  if (config.subtitle) {
    children.push(new Paragraph({
      indent: { left: 3800, right: 600 },
      border: { top: { style: BorderStyle.SINGLE, size: 2, color: C.titleColor, space: 14 } },
      spacing: { after: 200 },
      children: [new TextRun({
        text: config.subtitle, size: 26, color: C.subtitleColor,
        font: { ascii: "Arial", eastAsia: "Noto Sans SC" },
        characterSpacing: 40,
      })],
    }));
  }
  // 6. Decorative horizontal line
  children.push(new Paragraph({
    spacing: { before: 600 },
    border: { bottom: { style: BorderStyle.SINGLE, size: 1, color: "C8D0DC", space: 0 } },
  }));
  // 7. Info spacer
  children.push(new Paragraph({ spacing: { before: infoSpacer } }));
  // 8. Info footer: right-aligned, 4 label+value pairs, accent right border
  // Standard fields: ORGANIZATION, RESPONSIBILITY, REPORT NUMBER, DATE & EDITION
  const metaEntries = config.metaEntries || [
    { label: "ORGANIZATION", value: config.organization || "" },
    { label: "RESPONSIBILITY", value: config.responsibility || "" },
    { label: "REPORT NUMBER", value: config.reportNumber || "" },
    { label: "DATE & EDITION", value: config.dateEdition || "" },
  ];
  for (const entry of metaEntries) {
    // Label: 7pt uppercase English
    children.push(new Paragraph({
      alignment: AlignmentType.RIGHT,
      indent: { right: 800 },
      border: { right: { style: BorderStyle.SINGLE, size: 12, color: P.accent, space: 16 } },
      spacing: { after: 20 },
      children: [new TextRun({
        text: entry.label, size: 14, color: C.metaColor,
        font: { ascii: "Arial" },
        characterSpacing: 20,
      })],
    }));
    // Value: 11pt bold
    children.push(new Paragraph({
      alignment: AlignmentType.RIGHT,
      indent: { right: 800 },
      border: { right: { style: BorderStyle.SINGLE, size: 12, color: P.accent, space: 16 } },
      spacing: { after: 280 },
      children: [new TextRun({
        text: entry.value, size: 22, bold: true, color: C.titleColor,
        font: { ascii: "Arial", eastAsia: "Noto Sans SC" },
      })],
    }));
  }
  // Wrap in 16838 exact wrapper table
  return [new Table({
    width: { size: 100, type: WidthType.PERCENTAGE },
    layout: TableLayoutType.FIXED,
    borders: allNoBorders,
    rows: [new TableRow({
      height: { value: 16838, rule: "exact" },
      children: [new TableCell({
        shading: { type: ShadingType.CLEAR, fill: P.bg },
        borders: noBorders,
        verticalAlign: VerticalAlign.TOP,
        children,
      })],
    })],
  })];
}
```
### Decoration Usage Scenarios
| Scenario | Recommended Decoration | Combination |
|------|----------|----------|
| Report cover | Color strip + L-frame border | Top strip → Title area → L-frame author info |
| Proposal cover | Gradient simulation + double-line frame | Gradient bg → Double-line title |
| Chapter separator | Symbol ornament + side ribbon | Symbol divider → New chapter title with ribbon |
| Summary card | Info card | Standalone card displaying key metrics |
| Academic cover | Color strip + info table | Top strip → School name → Title → Info table |
---


@@ -0,0 +1,257 @@
# docx-js Advanced Features
Advanced API for complex document scenarios. Load this when creating documents with TOC, cover pages, footnotes, multi-section layouts, or post-processing needs.
## Table of Contents (TOC)
**→ See `references/toc.md` for the complete TOC reference** (3-step process, code examples, page numbering, common bugs, checklist).
## Cover Page Design (Vertical Centering)
Use large `spacing.before` to push content down for visual centering:
```js
// Approximate vertical center on A4:
// Total printable height ≈ 14000 twips
// For title at ~40% from top: before = 5600
const coverSection = {
properties: {
page: { /* standard A4 */ },
// No headers/footers on cover page
},
children: [
new Paragraph({ spacing: { before: 5600 } }), // spacer
new Paragraph({
alignment: AlignmentType.CENTER,
children: [new TextRun({
text: title,
font: { ascii: "Calibri", eastAsia: "SimHei" },
size: 52, bold: true, color: palette.primary,
})],
}),
// ... subtitle, author, date
],
};
```
For multi-section documents, put the cover in its own section so it can have different headers/footers.
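The spacer value can be derived from the page geometry instead of being hard-coded. The following is a minimal sketch, assuming an A4 height of 16838 twips and 1417-twip top and bottom margins; `coverSpacerBefore` is a hypothetical helper, not part of docx-js:

```js
// Hypothetical helper: derive the cover spacer from page geometry.
// All values are in twips; the defaults assume A4 with 1417-twip margins.
function coverSpacerBefore(fraction, pageHeight = 16838, marginTop = 1417, marginBottom = 1417) {
  const printable = pageHeight - marginTop - marginBottom; // 14004 for the defaults
  return Math.round(printable * fraction);
}

const before = coverSpacerBefore(0.4);
console.log(before); // 5602
```

The result can be passed as `spacing.before` on the spacer paragraph; with other margins, pass them explicitly.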
## Footnotes
```js
const { Document, Paragraph, TextRun, FootnoteReferenceRun } = require("docx");
const doc = new Document({
footnotes: {
1: { children: [new Paragraph({ children: [new TextRun({ text: "Smith, J. (2024). Research Methods. Academic Press, pp. 45-67.", size: 18 })] })] },
2: { children: [new Paragraph({ children: [new TextRun({ text: "Zhang, W. (2023). \u201c数据分析方法研究\u201d. 科学通报, 68(12), 1234-1250.", size: 18 })] })] },
},
sections: [{
children: [
new Paragraph({
children: [
new TextRun({ text: "According to recent studies" }),
new FootnoteReferenceRun(1), // superscript [1]
new TextRun({ text: ", data analysis methods have evolved" }),
new FootnoteReferenceRun(2), // superscript [2]
new TextRun({ text: "." }),
],
}),
],
}],
});
```
### Academic Reference Pattern
For sequential references [1][2][3]..., pre-define all footnotes in the `footnotes` object with numeric keys, then reference them inline with `FootnoteReferenceRun(n)`.
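When the citations already exist as an ordered list, the numeric keys can be generated rather than typed by hand. This is a sketch under assumptions: `buildFootnotes` is a hypothetical helper, and the paragraph factory is injected so the example runs without docx-js installed.

```js
// Hypothetical helper: turn an ordered list of citations into the
// `footnotes` object shape used above ({ 1: { children: [...] }, 2: ... }).
// `toParagraph` is injected; in real use it would wrap the text in
// new Paragraph({ children: [new TextRun({ text, size: 18 })] }).
function buildFootnotes(citations, toParagraph) {
  const footnotes = {};
  citations.forEach((text, index) => {
    footnotes[index + 1] = { children: [toParagraph(text)] }; // keys start at 1
  });
  return footnotes;
}

const footnotes = buildFootnotes(
  ["First source", "Second source"],
  (text) => ({ text }) // stand-in for a docx-js Paragraph
);
console.log(Object.keys(footnotes)); // [ '1', '2' ]
```

The n-th citation is then referenced inline with `new FootnoteReferenceRun(n)`.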
## keepNext — Element Binding
Prevent page breaks between related elements:
```js
// Heading stays with next paragraph
new Paragraph({
heading: HeadingLevel.HEADING_2,
keepNext: true, // don't break after this
children: [new TextRun({ text: "Table 1: Results" })],
})
// Table immediately follows on same page
// Caption stays with image
new Paragraph({
keepNext: true,
alignment: AlignmentType.CENTER,
children: [new TextRun({ text: "Figure 1: Architecture Diagram", italics: true, size: 20 })],
})
// ImageRun paragraph follows
```
Use `keepNext: true` for:
- Heading → first paragraph of section
- Table caption → table
- Image → image caption
- "Figure X" label → image
## Page Break Rules
Follow the document type strategy defined in SOUL.md Rule 1.
**Structural breaks (always):**
- Cover page → TOC
- TOC → main content
- Main content → back cover
**Content breaks (by document type):**
- Academic / teaching → `new Paragraph({ children: [new PageBreak()] })` before each H1 chapter
- Business report → PageBreak before each H1; H2 flows naturally
- Resume / contract / letter → No content page breaks
- Short article → No content page breaks
**Anti-tear (mandatory):**
```js
// Heading stays with next paragraph
new Paragraph({
heading: HeadingLevel.HEADING_1,
keepNext: true,
children: [new TextRun("Chapter Title")],
})
// Table caption stays with table
new Paragraph({
keepNext: true,
children: [new TextRun({ text: "Table 1: Summary", italics: true })],
})
// Image caption stays with image
new Paragraph({
keepNext: true,
children: [new TextRun({ text: "Figure 1: Architecture", italics: true })],
})
```
**Never:**
- PageBreak inside tables
- PageBreak as standalone element (must be inside Paragraph)
- PageBreak at the END of the last section (causes blank page)
```js
// Correct: page break between cover and TOC
new Paragraph({ children: [new PageBreak()] })
```
## Quotes Escaping in JS Strings
**⚠️⚠️⚠️ CRITICAL — #1 MOST COMMON BUG ⚠️⚠️⚠️**
Bare Chinese curly quotation marks (`“”` `‘’`) in JS string literals **WILL break syntax and crash document generation**. This bug occurs most often in **Chinese body text** where curly quotes are used for emphasis, proper nouns, event names, or quoted speech, e.g., `“双11”`, `“前低后高”`, `“618”大促`. **Every single occurrence** of `“”‘’` in text content MUST be Unicode-escaped. No exceptions.
**MANDATORY RULE: Before writing ANY `TextRun`, `para()`, or string containing Chinese text, scan the text for `“”‘’` characters and replace ALL of them with `\u201c \u201d \u2018 \u2019`.**
| Character | Unicode | Escape method |
|-----------|---------|---------------|
| `“` `”` | U+201C, U+201D | Unicode escape `\u201c` `\u201d` |
| `‘` `’` | U+2018, U+2019 | Unicode escape `\u2018` `\u2019` |
| `"` | U+0022 | `\"` or wrap string in single quotes / template literal |
| `'` | U+0027 | `\'` or wrap string in double quotes / template literal |
```js
// ❌ WRONG — curly quotes in Chinese text break JS syntax (VERY COMMON MISTAKE)
content.push(para("2025年四个季度行业增速呈现"前低后高"的态势。在"618"大促、"双11""双12"活动拉动下增长显著。"));
new TextRun({ text: "他说"你好"" })
new TextRun({ text: 'It's a test' })
// ✅ CORRECT — ALL curly quotes replaced with Unicode escapes
content.push(para("2025年四个季度行业增速呈现\u201c前低后高\u201d的态势。在\u201c618\u201d大促、\u201c双11\u201d\u201c双12\u201d活动拉动下增长显著。"));
new TextRun({ text: "他说\u201c你好\u201d" })
new TextRun({ text: "It\u2019s a test" })
// ✅ CORRECT — straight quotes escaped or use alternate delimiters
new TextRun({ text: "He said \"hello\"" })
new TextRun({ text: 'He said "hello"' })
new TextRun({ text: `He said "hello"` })
```
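When body text arrives from user input or another tool, the scan-and-replace step can be automated before the text is written into generated source. A minimal sketch; `escapeCurlyQuotes` is a hypothetical helper, not part of docx-js:

```js
// Hypothetical helper: rewrite curly quotes as \uXXXX escape sequences so the
// text can be placed into generated JS source without bare quote characters.
const CURLY_QUOTE_ESCAPES = {
  "\u201c": "\\u201c", // left double quotation mark
  "\u201d": "\\u201d", // right double quotation mark
  "\u2018": "\\u2018", // left single quotation mark
  "\u2019": "\\u2019", // right single quotation mark
};

function escapeCurlyQuotes(text) {
  return text.replace(/[\u201c\u201d\u2018\u2019]/g, (ch) => CURLY_QUOTE_ESCAPES[ch]);
}

// The input uses escapes here so this snippet itself contains no curly quotes.
const escaped = escapeCurlyQuotes("\u201c618\u201d sale");
console.log(escaped); // \u201c618\u201d sale
```

The returned string contains literal backslash sequences, so it is meant for code generation, not for passing directly to `TextRun` at runtime.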
## Multi-Section Documents
Different headers/footers per section:
```js
const doc = new Document({
sections: [
{
// Section 1: Cover — no header/footer
properties: { page: { /* ... */ } },
children: coverChildren,
},
{
// Section 2: Front matter — Roman page numbers
properties: {
type: SectionType.NEXT_PAGE,
page: {
/* size, margin... */
pageNumbers: { start: 1, formatType: NumberFormat.UPPER_ROMAN },
},
},
headers: { default: new Header({ children: [] }) },
footers: {
default: new Footer({
children: [new Paragraph({
alignment: AlignmentType.CENTER,
children: [new TextRun({ children: [PageNumber.CURRENT], size: 18 })],
})],
}),
},
children: tocAndAbstract,
},
{
// Section 3: Main content — Arabic page numbers
properties: {
type: SectionType.NEXT_PAGE,
page: {
/* size, margin... */
pageNumbers: { start: 1, formatType: NumberFormat.DECIMAL },
},
},
headers: {
default: new Header({
children: [new Paragraph({
alignment: AlignmentType.CENTER,
children: [new TextRun({ text: docTitle, size: 18, color: "888888" })],
})],
}),
},
footers: { default: footerWithPageNumbers },
children: mainContent,
},
],
});
```
## Converting DOCX to PDF
```bash
# Using LibreOffice (headless)
libreoffice --headless --convert-to pdf output.docx
# ⚠️ TOC Rule: If document has TOC, warn user that:
# 1. LibreOffice conversion may show empty TOC
# 2. User should open in Word first, update fields (Ctrl+A → F9), save, then convert
# 3. Or use Word's "Save as PDF" for best results
```
## Converting DOCX to Images
```bash
# Step 1: Convert to PDF
libreoffice --headless --convert-to pdf output.docx
# Step 2: Convert PDF to images
pdftoppm -png -r 200 output.pdf output_page
# This generates output_page-1.png, output_page-2.png, etc.
# (page numbers are zero-padded when the PDF has 10 or more pages, e.g. output_page-01.png)
# Use -r 200 for good quality (200 DPI)
```
Useful for generating preview thumbnails or when user needs images instead of document files.
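The two shell steps can also be driven from Node. The sketch below is one possible wrapper, assuming `libreoffice` and `pdftoppm` are on the PATH; `conversionSteps` and `docxToImages` are hypothetical names, and `--outdir` is added so the PDF lands next to the source file.

```js
const { execFileSync } = require("child_process");
const path = require("path");

// Build the two commands from the shell snippet above as argv arrays.
function conversionSteps(docxPath, dpi = 200) {
  const dir = path.dirname(docxPath);
  const base = path.basename(docxPath, ".docx");
  const pdfPath = path.join(dir, base + ".pdf");
  return [
    ["libreoffice", ["--headless", "--convert-to", "pdf", "--outdir", dir, docxPath]],
    ["pdftoppm", ["-png", "-r", String(dpi), pdfPath, path.join(dir, base + "_page")]],
  ];
}

// Run both steps in order; throws if either tool is missing or fails.
function docxToImages(docxPath, dpi) {
  for (const [cmd, args] of conversionSteps(docxPath, dpi)) {
    execFileSync(cmd, args, { stdio: "inherit" });
  }
}
```

Passing arguments as arrays to `execFileSync` avoids shell quoting problems with file names that contain spaces.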


@@ -0,0 +1,333 @@
# docx-js API Reference
Complete API for creating .docx documents with the `docx` npm package. For advanced features (TOC details, footnotes, PDF conversion), see `docx-js-advanced.md`.
## Setup
```js
const {
Document, Packer, Paragraph, TextRun, SymbolRun, Table, TableRow, TableCell,
ImageRun, PageBreak, Header, Footer, PageNumber, NumberFormat,
AlignmentType, HeadingLevel, WidthType, BorderStyle, ShadingType, UnderlineType,
PageOrientation, TabStopType, TabStopPosition, ExternalHyperlink,
InternalHyperlink, Bookmark, LevelFormat, TableOfContents,
} = require("docx");
const fs = require("fs");
```
## Document Creation + Export
```js
const doc = new Document({
styles: { /* see Styles section */ },
numbering: { config: [ /* see Lists section */ ] },
sections: [{
properties: {
page: {
size: { width: 11906, height: 16838 },
margin: { top: 1417, bottom: 1417, left: 1701, right: 1417 },
},
},
headers: { default: new Header({ children: [/* */] }) },
footers: { default: new Footer({ children: [/* */] }) },
children: [ /* Paragraphs, Tables, etc. */ ],
}],
});
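// Packer.toBuffer returns a Promise; top-level await is not available in a require()-based script
Packer.toBuffer(doc).then((buffer) => {
fs.writeFileSync("output.docx", buffer);
});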
```
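The page size and margins above are in twips (1/1440 inch). A small conversion sketch shows where the numbers come from; `mmToTwips` is a hypothetical helper name:

```js
// Twips are 1/1440 inch (1/20 pt).
const TWIPS_PER_INCH = 1440;
const MM_PER_INCH = 25.4;

function mmToTwips(mm) {
  return Math.round((mm / MM_PER_INCH) * TWIPS_PER_INCH);
}

console.log(mmToTwips(210), mmToTwips(297)); // 11906 16838 (A4)
console.log(mmToTwips(25), mmToTwips(30));   // 1417 1701 (2.5 cm and 3 cm margins)
```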
## Paragraph + TextRun
```js
new Paragraph({
heading: HeadingLevel.HEADING_1, // or HEADING_2, HEADING_3
alignment: AlignmentType.JUSTIFIED,
spacing: { before: 240, after: 120, line: 312 }, // 1.3x mandatory
indent: { firstLine: 480 }, // 2-char CJK indent (480 SimSun / 420 YaHei)
children: [
new TextRun({
text: "Hello",
bold: true,
italics: true,
size: 24, // 12pt = Xiao Si
font: { ascii: "Calibri", eastAsia: "Microsoft YaHei" },
color: "000000", // Pure black for Profile A; for Profile B use palette.body
}),
],
});
// Additional text formatting options
new TextRun({ text: "Underlined", underline: { type: UnderlineType.SINGLE } })
new TextRun({ text: "Highlighted", highlight: "yellow" })
new TextRun({ text: "Strikethrough", strike: true })
new TextRun({ text: "x²", superScript: true })
new TextRun({ text: "H₂O", subScript: true })
new SymbolRun({ char: "2022", font: "Symbol" }) // Bullet •
```
## Table
**⚠️ CRITICAL**: Always set `margins` on TableCell (or at Table level for global default). Without margins, text touches borders.
**⚠️ CRITICAL**: Use `ShadingType.CLEAR` — never `ShadingType.SOLID` (causes black cells).
**⚠️ CRITICAL — Table Cross-Page Control**:
- Header row MUST set `tableHeader: true` (auto-repeat header on page break)
- All rows MUST set `cantSplit: true` (prevent row content split across pages)
- Title paragraph before table MUST set `keepNext: true` (keep title with table)
```js
// ⚠️ Title before table — keepNext keeps title with table
new Paragraph({
keepNext: true, // ← critical
children: [new TextRun({ text: "Table 1 Feature Comparison", bold: true, size: 21 })],
}),
new Table({
width: { size: 100, type: WidthType.PERCENTAGE },
borders: {
top: { style: BorderStyle.SINGLE, size: 2, color: "9AA6B2" },
bottom: { style: BorderStyle.SINGLE, size: 2, color: "9AA6B2" },
left: { style: BorderStyle.NONE },
right: { style: BorderStyle.NONE },
insideHorizontal: { style: BorderStyle.SINGLE, size: 1, color: "D0D0D0" },
insideVertical: { style: BorderStyle.NONE },
},
rows: [
// ⚠️ Header row — tableHeader + cantSplit
new TableRow({
tableHeader: true, // auto-repeat on page break
cantSplit: true, // prevent row split
children: ["Header 1", "Header 2"].map(text =>
new TableCell({
children: [new Paragraph({ children: [new TextRun({ text, bold: true, size: 21 })] })],
shading: { type: ShadingType.CLEAR, fill: "F1F5F9" },
margins: { top: 60, bottom: 60, left: 120, right: 120 },
width: { size: 50, type: WidthType.PERCENTAGE },
})
),
}),
// ⚠️ Data rows — cantSplit
new TableRow({
cantSplit: true, // prevent row split
children: ["Data 1", "Data 2"].map(text =>
new TableCell({
children: [new Paragraph({ children: [new TextRun({ text, size: 21 })] })],
margins: { top: 60, bottom: 60, left: 120, right: 120 },
width: { size: 50, type: WidthType.PERCENTAGE },
})
),
}),
],
});
```
### Column Widths
```js
// Fixed widths (twips)
width: { size: 3000, type: WidthType.DXA }
// Percentage
width: { size: 50, type: WidthType.PERCENTAGE }
```
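For fixed DXA widths, the column values should add up to the available content width. A minimal sketch, assuming the percentages sum to 100; `columnWidthsDxa` is a hypothetical helper:

```js
// Hypothetical helper: split a content width (twips) into DXA column widths
// from percentage shares; the last column absorbs rounding so the sum is exact.
function columnWidthsDxa(percentages, contentWidth) {
  const widths = percentages.map((p) => Math.floor((contentWidth * p) / 100));
  const used = widths.reduce((sum, w) => sum + w, 0);
  widths[widths.length - 1] += contentWidth - used;
  return widths;
}

// A4 width 11906 minus left 1701 and right 1417 margins = 8788 twips.
const contentWidth = 11906 - 1701 - 1417;
console.log(columnWidthsDxa([30, 30, 40], contentWidth)); // [ 2636, 2636, 3516 ]
```

Each value can then be used as `width: { size: value, type: WidthType.DXA }` on the matching cell.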
## ImageRun
**⚠️ CRITICAL**: Always include `type` parameter. Always preserve aspect ratio.
```js
const imageBuffer = fs.readFileSync("chart.png");
// Calculate dimensions preserving aspect ratio
// originalWidth / originalHeight are the source image's pixel dimensions (obtain them beforehand)
const displayWidth = 500;
const aspectRatio = originalHeight / originalWidth;
const displayHeight = Math.round(displayWidth * aspectRatio);
new Paragraph({
alignment: AlignmentType.CENTER,
children: [
new ImageRun({
data: imageBuffer,
transformation: { width: displayWidth, height: displayHeight },
type: "png", // REQUIRED: "png", "jpg", "gif", "bmp"
}),
],
});
```
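For PNG input, the original dimensions can be read straight from the file header. The sketch below assumes a well-formed PNG (8-byte signature followed by the IHDR chunk); `pngSize` and `fitToWidth` are hypothetical helpers:

```js
// Read pixel dimensions from a PNG buffer: after the 8-byte signature comes
// the IHDR chunk, with width at byte offset 16 and height at 20 (big-endian).
function pngSize(buffer) {
  const signature = "89504e470d0a1a0a";
  if (buffer.length < 24 || buffer.subarray(0, 8).toString("hex") !== signature) {
    throw new Error("Not a PNG file");
  }
  return { width: buffer.readUInt32BE(16), height: buffer.readUInt32BE(20) };
}

// Scale to a target display width while preserving the aspect ratio.
function fitToWidth(size, displayWidth) {
  return {
    width: displayWidth,
    height: Math.round(displayWidth * (size.height / size.width)),
  };
}
```

Usage would look like `fitToWidth(pngSize(fs.readFileSync("chart.png")), 500)`, with the result passed as `transformation`.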
## PageBreak
**⚠️ CRITICAL**: PageBreak MUST be inside a Paragraph. Standalone PageBreak crashes Word.
**⚠️ Best Practice**: Attach PageBreak to the end of a **paragraph with text content**. Avoid empty paragraph + PageBreak (may cause blank pages). If using multi-section structure, prefer section breaks over PageBreak.
```js
// ✅ Recommended — PageBreak attached to content paragraph
new Paragraph({
children: [
new TextRun({ text: "End of section" }),
new PageBreak()
]
})
// ✅ Acceptable — but prefer section breaks
new Paragraph({ children: [new PageBreak()] })
// ✅ Best — use section breaks instead of PageBreak
// Place content in different sections — auto page break
```
## Headers & Footers + Page Numbers
```js
headers: {
default: new Header({
children: [
new Paragraph({
alignment: AlignmentType.CENTER,
children: [new TextRun({ text: "Document Title", size: 18, color: "888888" })],
}),
],
}),
},
footers: {
default: new Footer({
children: [
new Paragraph({
alignment: AlignmentType.CENTER,
children: [
new TextRun({ children: [PageNumber.CURRENT], size: 18 }),
],
}),
],
}),
},
```
> ⚠️ **Denominator FORBIDDEN** — never use `PageNumber.TOTAL_PAGES` or "X / Y" format. Show only current page number.
## Styles Definition
The example below is for **Chinese documents** (default). For **English documents**, replace `font` with `"Times New Roman"` throughout.
```js
styles: {
default: {
document: {
run: {
font: { ascii: "Calibri", eastAsia: "Microsoft YaHei" },
size: 24, color: "000000", // Pure black for Profile A; for Profile B use palette.body
},
paragraph: {
spacing: { line: 312 }, // 1.3x mandatory
},
},
heading1: {
run: { font: { ascii: "Calibri", eastAsia: "SimHei" }, size: 32, bold: true, color: "0B1220" },
paragraph: { spacing: { before: 360, after: 160, line: 312 } },
},
heading2: {
run: { font: { ascii: "Calibri", eastAsia: "SimHei" }, size: 28, bold: true, color: "0B1220" },
paragraph: { spacing: { before: 240, after: 120, line: 312 } },
},
heading3: {
run: { font: { ascii: "Calibri", eastAsia: "SimHei" }, size: 24, bold: true, color: "0B1220" },
paragraph: { spacing: { before: 200, after: 100, line: 312 } },
},
},
}
```
## Lists
**⚠️ CRITICAL**: Each separate numbered list MUST use a unique `reference` name. Reusing the same reference causes numbering to continue instead of restarting.
```js
// In Document numbering config
numbering: {
config: [
{
reference: "list-features", // unique name!
levels: [{
level: 0,
format: LevelFormat.DECIMAL,
text: "%1.",
alignment: AlignmentType.LEFT,
style: { paragraph: { indent: { left: 720, hanging: 360 } } },
}],
},
{
reference: "list-benefits", // different name for second list!
levels: [{ /* same config */ }],
},
],
},
// Usage in paragraphs
new Paragraph({
numbering: { reference: "list-features", level: 0 },
children: [new TextRun({ text: "First item" })],
})
```
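One way to guarantee unique `reference` names is to hand them out from a small registry. This is a sketch, not a docx-js feature; `createListRegistry` is a hypothetical helper, and the level definition is abbreviated:

```js
// Hypothetical helper: hand out a fresh numbering reference for each list
// and collect the config entries to pass as `numbering.config`.
function createListRegistry(makeLevels) {
  const config = [];
  let counter = 0;
  return {
    config,
    // Returns a new unique reference name and registers its config entry.
    newList(prefix = "list") {
      counter += 1;
      const reference = `${prefix}-${counter}`;
      config.push({ reference, levels: makeLevels() });
      return reference;
    },
  };
}

const lists = createListRegistry(() => [{ level: 0, text: "%1." }]); // levels abbreviated
const features = lists.newList("features"); // "features-1"
const benefits = lists.newList("benefits"); // "benefits-2"
```

`lists.config` is then passed as `numbering: { config: lists.config }`, and each paragraph uses the reference returned for its own list.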
### Bullet Lists
```js
new Paragraph({
bullet: { level: 0 },
children: [new TextRun({ text: "Bullet item" })],
})
```
## Hyperlinks
### External Link
```js
new ExternalHyperlink({
children: [new TextRun({ text: "Click here", style: "Hyperlink" })],
link: "https://example.com",
})
```
### Internal Link (Bookmark)
```js
// Define bookmark at target
new Paragraph({
children: [
new Bookmark({ id: "section1", children: [new TextRun("Section 1")] }),
],
})
// Link to bookmark
new InternalHyperlink({
children: [new TextRun({ text: "Go to Section 1", style: "Hyperlink" })],
anchor: "section1",
})
```
## Table of Contents (TOC)
**→ See `references/toc.md` for the complete TOC reference.**
Quick reminder: (1) Add `TableOfContents` element + PageBreak, (2) Run `python3 "$DOCX_SCRIPTS/add_toc_placeholders.py" output.docx --auto`, (3) Check exit code.
## Tabs
```js
new Paragraph({
tabStops: [
{ type: TabStopType.RIGHT, position: TabStopPosition.MAX },
],
children: [new TextRun("Left"), new TextRun("\t"), new TextRun("Right")]
})
```
## Constants Quick Reference
- **Underlines:** `SINGLE`, `DOUBLE`, `WAVY`, `DASH`
- **Borders:** `SINGLE`, `DOUBLE`, `DASHED`, `DOTTED`
- **Numbering:** `DECIMAL` (1,2,3), `UPPER_ROMAN` (I,II,III), `LOWER_LETTER` (a,b,c)
- **Symbols:** `"2022"` (•), `"00A9"` (©), `"00AE"` (®), `"2122"` (™)

323
skills/docx/references/faq.md Executable file

@@ -0,0 +1,323 @@
# FAQ — Common Bugs and Fixes
## Bug: Table text touching cell borders
**Symptom**: Text is cramped against table cell edges, no padding.
**Fix**: Set `margins` at the TableCell level:
```js
new TableCell({
margins: { top: 60, bottom: 60, left: 120, right: 120 },
children: [/* ... */],
})
```
---
## Bug: Numbered list doesn't restart
**Symptom**: Second numbered list continues from where the first left off (e.g., starts at 4 instead of 1).
**Fix**: Each separate numbered list MUST use a unique `reference` name in numbering config:
```js
numbering: { config: [
{ reference: "list-A", levels: [{ level: 0, format: LevelFormat.DECIMAL, text: "%1." }] },
{ reference: "list-B", levels: [{ level: 0, format: LevelFormat.DECIMAL, text: "%1." }] },
]}
```
---
## Bug: Cover and content on same page
**Symptom**: Cover page content flows directly into main content without page break.
**Fix**: Add a PageBreak paragraph at the end of cover content:
```js
coverChildren.push(new Paragraph({ children: [new PageBreak()] }));
```
---
## Bug: Three-line table shows all borders
**Symptom**: Table intended to be three-line shows full grid borders.
**Fix**: At table level, keep only the top and bottom rules and set every other border to NONE; then add a bottom border to the header cells only:
```js
// Table level: top and bottom rules only; left, right, and inside borders NONE
borders: { top: { style: BorderStyle.SINGLE, size: 4 }, bottom: { style: BorderStyle.SINGLE, size: 4 },
left: { style: BorderStyle.NONE }, right: { style: BorderStyle.NONE },
insideHorizontal: { style: BorderStyle.NONE }, insideVertical: { style: BorderStyle.NONE } }
// Header cells: bottom border only, passed as the `borders` option of each header TableCell
new TableCell({ borders: { bottom: { style: BorderStyle.SINGLE, size: 2, color: "000000" } }, children: [/* ... */] })
```
---
## Bug: User requests Chinese font size name (e.g. Wu Hao) but output is wrong
**Symptom**: Font size doesn't match expected Chinese size name.
**Fix**: Use the correct half-point value. `size` in docx-js is in half-points:
- Wu Hao 五号 = 10.5pt → `size: 21`
- Xiao Si 小四 = 12pt → `size: 24`
- Si Hao 四号 = 14pt → `size: 28`
See SKILL.md for complete conversion table.
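The conversion is a simple doubling of the point size. A minimal lookup sketch covering only the three sizes listed above; `chineseSizeToHalfPoints` is a hypothetical helper, and the full table still lives in SKILL.md:

```js
// Chinese size names from the list above, in points.
const CHINESE_FONT_SIZES_PT = {
  "五号": 10.5, // Wu Hao
  "小四": 12,   // Xiao Si
  "四号": 14,   // Si Hao
};

// docx-js `size` is in half-points, so multiply the point size by 2.
function chineseSizeToHalfPoints(name) {
  const pt = CHINESE_FONT_SIZES_PT[name];
  if (pt === undefined) throw new Error(`Unknown size name: ${name}`);
  return Math.round(pt * 2);
}

console.log(chineseSizeToHalfPoints("五号")); // 21
```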
---
## Bug: Black table cells
**Symptom**: Table cells appear solid black in Word.
**Fix**: Use `ShadingType.CLEAR` not `ShadingType.SOLID`:
```js
// ❌ WRONG
shading: { type: ShadingType.SOLID, fill: "F1F5F9" }
// ✅ CORRECT
shading: { type: ShadingType.CLEAR, fill: "F1F5F9" }
```
---
## Bug: Chinese characters garbled in matplotlib charts
**Symptom**: Chinese text shows as empty boxes □□□ in generated PNG charts.
**Fix**: Configure SimHei font before plotting:
```python
import matplotlib.pyplot as plt
from matplotlib.font_manager import FontProperties
zh_font = FontProperties(fname="/path/to/SimHei.ttf")
plt.title("中文标题", fontproperties=zh_font)
plt.rcParams["axes.unicode_minus"] = False
```
---
## Bug: Image stretched/squashed in document
**Symptom**: Embedded image appears distorted.
**Fix**: Calculate display height from width using original aspect ratio:
```js
const aspectRatio = originalHeight / originalWidth;
const displayWidth = 500;
const displayHeight = Math.round(displayWidth * aspectRatio);
new ImageRun({ data: buf, transformation: { width: displayWidth, height: displayHeight }, type: "png" });
```
---
## Bug: TOC shows empty in generated document
→ See `references/toc.md` — "5 Common TOC Bugs" section for diagnosis and fixes.
---
## Bug: PageBreak standalone crashes Word
**Symptom**: Document fails to open or renders incorrectly.
**Fix**: PageBreak must always be wrapped in a Paragraph:
```js
// ❌ WRONG — standalone
children: [new PageBreak()]
// ✅ CORRECT — inside Paragraph
children: [new Paragraph({ children: [new PageBreak()] })]
```
---
## Bug: Quotation marks break JavaScript syntax — ⚠️ #1 MOST COMMON BUG
**This is the single most frequent code generation error.** Chinese text routinely uses curly quotes `“”` for emphasis, proper nouns, and event names (e.g., “双11”, “前低后高”, “618”大促). These MUST be Unicode-escaped; bare curly quotes silently break JS syntax.
**Rule: scan ALL Chinese text for `“”‘’` and replace with `\u201c \u201d \u2018 \u2019` BEFORE writing the string.**
```js
// ❌ WRONG — curly quotes in Chinese text break syntax (extremely common)
para("行业增速呈现"前低后高"的态势,在"618"大促拉动下增长。")
"他说"你好"" // \u201c \u201d
'It's a test' // \u2019
// ✅ CORRECT — Unicode escapes for ALL curly quotes
para("行业增速呈现\u201c前低后高\u201d的态势，在\u201c618\u201d大促拉动下增长。")
"他说\u201c你好\u201d"
"It\u2019s a test"
// ✅ Straight quotes: escape or use alternate delimiters
"He said \"hello\""
'He said "hello"'
```
---
## Bug: Unwanted blank pages in document
**Common causes:**
1. **Trailing PageBreak at end of last section** — pagination should use section breaks or be at the start of the next section
2. **Empty Paragraph overflow** — empty paragraphs at page bottom push to a new page
3. **PageBreak right after Table** — Table already at page bottom, PageBreak creates extra page
**Fix:**
```js
// Post-generation check: last section's children should not end with PageBreak
function removeTrailingPageBreak(section) {
const children = section.children;
if (!children.length) return;
const last = children[children.length - 1];
// If last element is a Paragraph containing only PageBreak, remove it
// (`root` is a docx-js internal array, not public API; verify it against your docx version)
if (last instanceof Paragraph) {
const runs = last.root?.filter(c => c instanceof PageBreak);
if (runs?.length && !last.root?.some(c => c instanceof TextRun)) {
children.pop();
}
}
}
```
**Prevention rules:**
- Place PageBreak at the **start of the next section**, not the end of the previous one
- Or use separate sections for pagination (no PageBreak needed)
- The last section of a document must NEVER end with a PageBreak
---
## Bug: Different rendering in WPS vs Microsoft Word
**Symptom**: Document looks correct in Word but renders differently in WPS (or vice versa) — misaligned tables, shifted content, clipped text in cells, black cells, or broken covers.
**Root causes and fixes:**
### 1. `ShadingType.SOLID` shows black in WPS
```js
// ❌ WPS shows solid black
shading: { type: ShadingType.SOLID, fill: "F1F5F9" }
// ✅ Both renderers show correct color
shading: { type: ShadingType.CLEAR, fill: "F1F5F9" }
```
### 2. `verticalAlign: "center"` in exact-height rows shifts content
WPS ignores vertical centering in `rule: "exact"` rows — content stays at top, creating visual mismatch.
```js
// ❌ Inconsistent between Word and WPS
new TableRow({ height: { value: 800, rule: "exact" },
children: [new TableCell({ verticalAlign: VerticalAlign.CENTER, ... })] })
// ✅ Use top alignment + margins/spacing for positioning
new TableRow({ height: { value: 800, rule: "exact" },
children: [new TableCell({ verticalAlign: VerticalAlign.TOP,
margins: { top: 200 }, ... })] })
```
### 3. Tab stops misalign in WPS
Tab widths differ between Word and WPS. Never use tabs for alignment.
```js
// ❌ Tab-based alignment — breaks in WPS
new Paragraph({ tabStops: [{ type: TabStopType.RIGHT, position: 8000 }],
children: [new TextRun({ text: "Party A:\tCompany Name" })] })
// ✅ Borderless table for alignment — consistent everywhere
new Table({ borders: allNoBorders, rows: [new TableRow({ children: [
new TableCell({ children: [new Paragraph({ children: [new TextRun({ text: "Party A:" })] })] }),
new TableCell({ children: [new Paragraph({ children: [new TextRun({ text: "Company Name" })] })] }),
] })] })
```
### 4. Nested tables in exact-height cells overflow differently
Word calculates nested table heights more accurately than WPS. Use stacked tables instead.
```js
// ❌ Nested table inside exact-height cell
new TableRow({ height: { value: 16838, rule: "exact" },
children: [new TableCell({ children: [nestedTable1, nestedTable2] })] })
// ✅ Stacked approach — content table + filler table
[contentTable, fillerTable] // both at top level, heights sum to 16838
```
### 5. `characterSpacing` renders differently
Large `characterSpacing` values cause inconsistent letter spacing. Keep ≤ 80.
### 6. `titlePage: true` header/footer suppression
WPS may not correctly hide first-page headers when using `titlePage: true`. Use a separate section for the cover instead.
---
## Bug: Cover spills to second page
**Symptom**: Cover content overflows, with some elements (date, footer, accent strip) appearing on page 2.
**Root cause**: Total content height exceeds 16838 twips (A4 page height). Common when:
- Title is very long (3+ lines at large font size)
- Fixed spacing values assume short title
- Multiple meta lines + subtitle + English label
**Fix**: Always use `calcTitleLayout()` + `calcCoverSpacing()` from `design-system.md`. These dynamically adjust font sizes and spacing to fit within the page. See `design-system.md § Cover Content Overflow Prevention` for the complete checklist.
---
## Bug: Blank page 2 after cover in MS Office (but not WPS)
**Symptom**: Cover displays correctly in WPS but produces a blank second page in MS Office Word.
**Root cause**: The cover wrapper table uses **default docx-js table borders** (`single/auto/sz=4`) instead of explicitly setting `allNoBorders`. Default borders add ~8 twips per edge. MS Office includes border thickness in the exact-height row calculation, pushing total height past 16838 twips → overflow to page 2. WPS is more lenient and absorbs the extra height.
**Fix**: Every cover wrapper table MUST explicitly set `borders: allNoBorders`:
```js
const NB = { style: BorderStyle.NONE, size: 0, color: "FFFFFF" };
const allNoBorders = { top: NB, bottom: NB, left: NB, right: NB,
insideHorizontal: NB, insideVertical: NB };
new Table({
borders: allNoBorders, // ← MANDATORY
rows: [new TableRow({
height: { value: 16838, rule: "exact" },
// ...
})],
});
```
**Prevention**: Add to post-generation check — search for any `new Table` in cover code that does not explicitly set `borders`.
---
## Bug: Cover decorative lines appear truncated or misaligned
**Symptom**: Horizontal decorative lines on the cover (accent strips, divider rules) display at different widths in MS Office vs WPS, or appear truncated / not spanning the intended width.
**Root cause**: Lines were implemented using text characters (`───`, `━━━`, `═══`, `——————`) instead of paragraph borders. Character-drawn lines depend on font metrics (character width × count), which vary across rendering engines.
**Fix**: Always use **paragraph borders** for decorative lines:
```js
// ✅ Paragraph border — renders consistently in both MS Office and WPS
new Paragraph({
indent: { left: 1000, right: 1000 },
border: { top: { style: BorderStyle.SINGLE, size: 18, color: accentColor, space: 20 } },
children: [],
})
// ❌ NEVER use text characters for decorative lines
new TextRun({ text: "───────────────" }) // width varies across engines
```
**Note**: This applies to ALL cover recipes (R1–R5). Recipe R2 uses `border.top` and `border.bottom` for its double-rule frame; follow this pattern.
---
## Bug: "undefined" appears in document text
**Symptom**: Fields like "Contact: undefined" or "Location: undefined" in generated documents.
**Root cause**: JavaScript outputs the string `"undefined"` when accessing a property that doesn't exist on the config object.
**Fix**: Use `safeText()` helper for ALL user-facing text values:
```js
function safeText(value, placeholder) {
if (value === undefined || value === null || value === "" ||
String(value) === "NaN" || String(value) === "undefined") {
return placeholder || "【Please fill in】";
}
return String(value);
}
// Usage: new TextRun({ text: safeText(config.contact, "【Contact person】") })
```


@@ -0,0 +1,276 @@
# Math Formulas — LaTeX → docx-js Mapping
## Design Philosophy
GLM uses **LaTeX as the formula input syntax**, internally converting to docx-js Math objects.
**Why not write OMML directly?**
- Models are naturally proficient in LaTeX (abundant in training data)
- LaTeX is semantically clear and highly readable
- Conversion layer is encapsulated internally, transparent to the user
## Quick Start
```js
const { Paragraph, AlignmentType, Math: OoxmlMath, MathRun, MathFraction, MathSuperScript,
MathSubScript, MathRadical, MathSum, MathSubSuperScript } = require("docx");
// Embed formula in paragraph
new Paragraph({
alignment: AlignmentType.CENTER,
children: [
new OoxmlMath({
children: [/* Math components */]
})
]
})
```
## LaTeX → docx-js Conversion Table
### Basic Operations
| LaTeX | Meaning | docx-js Implementation |
|-------|---------|----------------------|
| `x + y` | Addition | `new MathRun("x + y")` |
| `x - y` | Subtraction | `new MathRun("x − y")` (use Unicode minus `−`, U+2212) |
| `x \times y` | Multiplication | `new MathRun("x × y")` |
| `x \div y` | Division | `new MathRun("x ÷ y")` |
| `x \pm y` | Plus-minus | `new MathRun("x ± y")` |
| `x \neq y` | Not equal | `new MathRun("x ≠ y")` |
| `x \leq y` | Less or equal | `new MathRun("x ≤ y")` |
| `x \geq y` | Greater or equal | `new MathRun("x ≥ y")` |
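The operator rows above amount to a lookup plus a hyphen-to-minus substitution. A minimal sketch for flat expressions only (no fractions or scripts); `toMathText` is a hypothetical helper whose result would be passed to `new MathRun(...)`:

```js
// Operator commands from the table above mapped to Unicode symbols.
const OPERATORS = {
  "\\times": "\u00d7", "\\div": "\u00f7", "\\pm": "\u00b1",
  "\\neq": "\u2260", "\\leq": "\u2264", "\\geq": "\u2265",
};

// Convert a flat LaTeX expression to MathRun text: replace known operator
// commands, then turn ASCII hyphens into the Unicode minus sign (U+2212).
function toMathText(latex) {
  return latex
    .replace(/\\[A-Za-z]+/g, (cmd) => OPERATORS[cmd] ?? cmd)
    .replace(/-/g, "\u2212");
}

console.log(toMathText("x \\times y")); // x × y
console.log(toMathText("x - y"));       // x − y
```

Unknown commands are left as-is so that structural commands such as `\frac` can be handled by a separate step.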
### Fractions
| LaTeX | docx-js |
|-------|---------|
| `\frac{a}{b}` | `new MathFraction({ numerator: [new MathRun("a")], denominator: [new MathRun("b")] })` |
| `\frac{x+1}{x-1}` | `new MathFraction({ numerator: [new MathRun("x+1")], denominator: [new MathRun("x−1")] })` |
### Superscripts & Subscripts
| LaTeX | docx-js |
|-------|---------|
| `x^2` | `new MathSuperScript({ children: [new MathRun("x")], superScript: [new MathRun("2")] })` |
| `x_i` | `new MathSubScript({ children: [new MathRun("x")], subScript: [new MathRun("i")] })` |
| `x_i^2` | `new MathSubSuperScript({ children: [new MathRun("x")], subScript: [new MathRun("i")], superScript: [new MathRun("2")] })` |
### Radicals
| LaTeX | docx-js |
|-------|---------|
| `\sqrt{x}` | `new MathRadical({ children: [new MathRun("x")] })` |
| `\sqrt[3]{x}` | `new MathRadical({ children: [new MathRun("x")], degree: [new MathRun("3")] })` |
### Summation & Integrals
| LaTeX | docx-js |
|-------|---------|
| `\sum_{i=1}^{n} a_i` | `new MathSum({ subScript: [new MathRun("i=1")], superScript: [new MathRun("n")], children: [new MathRun("aᵢ")] })` |
### Greek Letters
Use Unicode characters directly:
```js
// LaTeX → Unicode mapping
const GREEK = {
"\\alpha": "α", "\\beta": "β", "\\gamma": "γ", "\\delta": "δ",
"\\epsilon": "ε", "\\zeta": "ζ", "\\eta": "η", "\\theta": "θ",
"\\iota": "ι", "\\kappa": "κ", "\\lambda": "λ", "\\mu": "μ",
"\\nu": "ν", "\\xi": "ξ", "\\pi": "π", "\\rho": "ρ",
"\\sigma": "σ", "\\tau": "τ", "\\phi": "φ", "\\chi": "χ",
"\\psi": "ψ", "\\omega": "ω",
"\\Alpha": "Α", "\\Beta": "Β", "\\Gamma": "Γ", "\\Delta": "Δ",
"\\Theta": "Θ", "\\Lambda": "Λ", "\\Pi": "Π", "\\Sigma": "Σ",
"\\Phi": "Φ", "\\Psi": "Ψ", "\\Omega": "Ω",
};
```
## Complete Formula Examples
### Quadratic Formula
LaTeX: `x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}`
```js
new OoxmlMath({
children: [
new MathRun("x = "),
new MathFraction({
numerator: [
new MathRun("−b ± "),
new MathRadical({
children: [
new MathSuperScript({
children: [new MathRun("b")],
superScript: [new MathRun("2")],
}),
new MathRun(" − 4ac"),
],
}),
],
denominator: [new MathRun("2a")],
}),
],
})
```
### Pythagorean Theorem
LaTeX: `a^2 + b^2 = c^2`
```js
new OoxmlMath({
children: [
new MathSuperScript({ children: [new MathRun("a")], superScript: [new MathRun("2")] }),
new MathRun(" + "),
new MathSuperScript({ children: [new MathRun("b")], superScript: [new MathRun("2")] }),
new MathRun(" = "),
new MathSuperScript({ children: [new MathRun("c")], superScript: [new MathRun("2")] }),
],
})
```
### Trigonometric Identity
LaTeX: `\sin^2\theta + \cos^2\theta = 1`
```js
new OoxmlMath({
children: [
new MathSuperScript({ children: [new MathRun("sin")], superScript: [new MathRun("2")] }),
new MathRun("θ + "),
new MathSuperScript({ children: [new MathRun("cos")], superScript: [new MathRun("2")] }),
new MathRun("θ = 1"),
],
})
```
## Common Exam Formula Templates
### Middle School Math
```js
// Quadratic discriminant
const discriminant = new OoxmlMath({
children: [
new MathRun("Δ = "),
new MathSuperScript({ children: [new MathRun("b")], superScript: [new MathRun("2")] }),
new MathRun(" − 4ac"),
],
});
// Circle area
const circleArea = new OoxmlMath({
children: [
new MathRun("S = π"),
new MathSuperScript({ children: [new MathRun("r")], superScript: [new MathRun("2")] }),
],
});
```
### High School Math
```js
// Logarithm change of base
const logChange = new OoxmlMath({
children: [
new MathSubScript({ children: [new MathRun("log")], subScript: [new MathRun("a")] }),
new MathRun("b = "),
new MathFraction({
numerator: [new MathRun("ln b")],
denominator: [new MathRun("ln a")],
}),
],
});
// Arithmetic series sum
const arithmeticSum = new OoxmlMath({
children: [
new MathSubScript({ children: [new MathRun("S")], subScript: [new MathRun("n")] }),
new MathRun(" = "),
new MathFraction({
numerator: [
new MathRun("n("),
new MathSubScript({ children: [new MathRun("a")], subScript: [new MathRun("1")] }),
new MathRun(" + "),
new MathSubScript({ children: [new MathRun("a")], subScript: [new MathRun("n")] }),
new MathRun(")"),
],
denominator: [new MathRun("2")],
}),
],
});
```
### Physics
```js
// Newton's second law
const newton2 = new OoxmlMath({
children: [new MathRun("F = ma")],
});
// Kinetic energy
const kineticEnergy = new OoxmlMath({
children: [
new MathSubScript({ children: [new MathRun("E")], subScript: [new MathRun("k")] }),
new MathRun(" = "),
new MathFraction({
numerator: [new MathRun("1")],
denominator: [new MathRun("2")],
}),
new MathRun("m"),
new MathSuperScript({ children: [new MathRun("v")], superScript: [new MathRun("2")] }),
],
});
```
## Complexity Fallback Strategy
When formulas are too complex (nesting >3 levels) for docx-js Math, **fall back to matplotlib PNG rendering:**
```python
import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt
def latex_to_png(latex_str: str, output_path: str, fontsize: int = 14, dpi: int = 200):
    """Render a LaTeX formula as a transparent PNG and return the output path."""
    fig, ax = plt.subplots(figsize=(0.1, 0.1))
    ax.axis("off")
    text = ax.text(0, 0.5, f"${latex_str}$", fontsize=fontsize,
                   transform=ax.transAxes, verticalalignment="center")
    fig.canvas.draw()
    # get_window_extent() reports pixels at the figure's own DPI, so convert with fig.dpi
    bbox = text.get_window_extent(fig.canvas.get_renderer())
    fig.set_size_inches(bbox.width / fig.dpi + 0.2, bbox.height / fig.dpi + 0.2)
    fig.savefig(output_path, dpi=dpi, bbox_inches="tight",
                pad_inches=0.05, transparent=True)
    plt.close(fig)
    return output_path
```
Then embed the PNG in the document:
```js
const formulaImg = fs.readFileSync("formula.png");
new Paragraph({
alignment: AlignmentType.CENTER,
children: [new ImageRun({
data: formulaImg,
transformation: { width: 300, height: 40 }, // adjust based on actual size
type: "png",
})],
})
```
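The `width: 300, height: 40` values above are placeholders. One way to derive real values is to read the pixel size from the PNG header and scale it, sketched here in Python. It assumes docx-js treats `transformation` values as pixels at 96 per inch and that the PNG was rendered at a known DPI (200 by default in `latex_to_png`). The helper name `png_display_size` is hypothetical.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_display_size(png_bytes: bytes, render_dpi: int = 200, target_dpi: int = 96):
    """Return (width, height) in target-DPI pixels for a PNG rendered at render_dpi."""
    if png_bytes[:8] != PNG_SIGNATURE or png_bytes[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # IHDR stores width and height as big-endian 32-bit integers at bytes 16..24
    width_px, height_px = struct.unpack(">II", png_bytes[16:24])
    scale = target_dpi / render_dpi
    return round(width_px * scale), round(height_px * scale)


# Minimal fake header for illustration: signature + IHDR length + tag + 600x80 pixels
header = PNG_SIGNATURE + struct.pack(">I4sII", 13, b"IHDR", 600, 80)
print(png_display_size(header))
```

The resulting pair could then be passed as `transformation: { width, height }` so the formula is displayed at the size it was rendered for.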
**Fallback rules:**
- Nested fractions >2 levels → fallback
- Matrices/determinants → fallback
- Complex integrals (multiple integrals + limits + integrand) → fallback
- Piecewise functions → fallback
- All other cases → prefer docx-js Math
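The fallback rules above can be approximated with a small pre-check on the LaTeX source. The following is a heuristic sketch, not part of the skill: the function names `frac_depth` and `needs_png_fallback` are hypothetical, "complex integral" is reduced to "more than one integral sign", and fractions are only recognized when both arguments are written in braces.

```python
import re

MATRIX_OR_CASES = re.compile(r"\\begin\{(?:[pbvBV]?matrix|cases)\}")
SINGLE_INTEGRAL = re.compile(r"\\int(?![A-Za-z])")
MULTI_INTEGRAL = re.compile(r"\\i{2,3}nt(?![A-Za-z])")


def frac_depth(latex: str) -> int:
    """Return the deepest nesting level of \\frac arguments in a LaTeX string."""
    stack = []   # one (is_frac_arg, groups_still_owed) entry per open brace
    expect = 0   # brace groups still owed to the most recent \frac
    depth = deepest = 0
    i = 0
    while i < len(latex):
        if latex.startswith("\\frac", i):
            expect = 2
            i += 5
            continue
        ch = latex[i]
        if ch == "{":
            is_frac_arg = expect > 0
            stack.append((is_frac_arg, expect - 1 if is_frac_arg else 0))
            expect = 0
            if is_frac_arg:
                depth += 1
                deepest = max(deepest, depth)
        elif ch == "}" and stack:
            is_frac_arg, expect = stack.pop()
            if is_frac_arg:
                depth -= 1
        i += 1
    return deepest


def needs_png_fallback(latex: str) -> bool:
    """Apply the fallback rules: deep fractions, matrices, piecewise, multiple integrals."""
    if frac_depth(latex) > 2:
        return True
    if MATRIX_OR_CASES.search(latex):
        return True
    if MULTI_INTEGRAL.search(latex) or len(SINGLE_INTEGRAL.findall(latex)) > 1:
        return True
    return False


print(needs_png_fallback(r"x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}"))
print(needs_png_fallback(r"\begin{cases} x & x > 0 \\ -x & x \le 0 \end{cases}"))
```

A formula that passes this check is a candidate for docx-js Math; anything it flags would go through `latex_to_png` instead.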

222
skills/docx/references/ooxml.md Executable file

@@ -0,0 +1,222 @@
# OOXML Editing Reference — Document Library API
**Important: Read this entire document before editing.** This is the primary reference for modifying existing .docx files.
## Document Library (Python) — Primary API
Use the `Document` class from `"$DOCX_SCRIPTS/document.py"` for all edits, tracked changes, and comments. It handles infrastructure automatically (people.xml, RSIDs, settings.xml, comments, relationships, content types).
**Working with Unicode and Entities:**
- Both entity notation and Unicode work for search: `contains="&#8220;Company"` or `contains="\u201cCompany"`
- Both work for replacement too
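The equivalence of the two notations can be checked with the standard library: a numeric character reference and the matching Python escape decode to the same character. This only illustrates the notations; how the `Document` class normalizes them internally is not shown in this reference.

```python
import html

entity_form = "&#8220;Company"    # numeric character reference, as written in XML
unicode_form = "\u201cCompany"    # Python escape for the same left double quotation mark

# html.unescape resolves numeric character references to the characters they denote
decoded = html.unescape(entity_form)
print(decoded == unicode_form)
```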
### Setup
```bash
# Find the docx skill root
find /mnt/skills -name "document.py" -path "*/docx/scripts/*" 2>/dev/null | head -1
# Skill root = parent of scripts/
# Run with PYTHONPATH
PYTHONPATH=/mnt/skills/docx python your_script.py
```
```python
from scripts.document import Document, DocxXMLEditor
# Basic init (auto-creates temp copy, sets up infrastructure)
doc = Document('unpacked')
# Custom author/initials
doc = Document('unpacked', author="John Doe", initials="JD")
# Enable tracked changes
doc = Document('unpacked', track_revisions=True)
# Custom RSID (auto-generated if omitted)
doc = Document('unpacked', rsid="07DC5ECB")
```
### Finding Nodes
```python
# By text
node = doc["word/document.xml"].get_node(tag="w:p", contains="specific text")
# By line range
para = doc["word/document.xml"].get_node(tag="w:p", line_number=range(100, 150))
# By attributes
node = doc["word/document.xml"].get_node(tag="w:del", attrs={"w:id": "1"})
# By exact line number
para = doc["word/document.xml"].get_node(tag="w:p", line_number=42)
# Combined filters (disambiguation)
node = doc["word/document.xml"].get_node(tag="w:r", contains="Section", line_number=range(2400, 2500))
```
### Tracked Changes
**CRITICAL**: Only mark text that actually changes. Keep unchanged text outside `<w:del>`/`<w:ins>` tags.
**Method Selection**:
- Regular text → `replace_node()` with `<w:del>`/`<w:ins>`, or `suggest_deletion()` for whole elements
- Partially modify another's tracked change → `replace_node()` to nest changes
- Reject another's insertion → `revert_insertion()` (NOT `suggest_deletion()`)
- Reject another's deletion → `revert_deletion()`
```python
# Change one word: "monthly" → "quarterly"
node = doc["word/document.xml"].get_node(tag="w:r", contains="The report is monthly")
rpr = tags[0].toxml() if (tags := node.getElementsByTagName("w:rPr")) else ""
replacement = f'<w:r w:rsidR="00AB12CD">{rpr}<w:t>The report is </w:t></w:r><w:del><w:r>{rpr}<w:delText>monthly</w:delText></w:r></w:del><w:ins><w:r>{rpr}<w:t>quarterly</w:t></w:r></w:ins>'
doc["word/document.xml"].replace_node(node, replacement)
# Delete entire run
node = doc["word/document.xml"].get_node(tag="w:r", contains="text to delete")
doc["word/document.xml"].suggest_deletion(node)
# Delete entire paragraph
para = doc["word/document.xml"].get_node(tag="w:p", contains="paragraph to delete")
doc["word/document.xml"].suggest_deletion(para)
# Insert new content after a node
node = doc["word/document.xml"].get_node(tag="w:r", contains="existing text")
doc["word/document.xml"].insert_after(node, '<w:ins><w:r><w:t>new text</w:t></w:r></w:ins>')
# Add new numbered list item
target_para = doc["word/document.xml"].get_node(tag="w:p", contains="existing list item")
pPr = tags[0].toxml() if (tags := target_para.getElementsByTagName("w:pPr")) else ""
new_item = f'<w:p>{pPr}<w:r><w:t>New item</w:t></w:r></w:p>'
tracked_para = DocxXMLEditor.suggest_paragraph(new_item)
doc["word/document.xml"].insert_after(target_para, tracked_para)
```
### Handling Other Authors' Changes
```python
# Partially delete another author's insertion
node = doc["word/document.xml"].get_node(tag="w:ins", attrs={"w:id": "5"})
replacement = '''<w:ins w:author="Jane Smith" w:date="2025-01-15T10:00:00Z">
<w:r><w:t>quarterly </w:t></w:r>
<w:del><w:r><w:delText>financial </w:delText></w:r></w:del>
<w:r><w:t>report</w:t></w:r>
</w:ins>'''
doc["word/document.xml"].replace_node(node, replacement)
# Reject insertion (wraps in deletion)
ins = doc["word/document.xml"].get_node(tag="w:ins", attrs={"w:id": "5"})
doc["word/document.xml"].revert_insertion(ins)
# Reject deletion (restores deleted content)
del_elem = doc["word/document.xml"].get_node(tag="w:del", attrs={"w:id": "3"})
doc["word/document.xml"].revert_deletion(del_elem)
```
### Comments
```python
doc = Document('unpacked', author="Z.ai", initials="Z")
# Comment on a range
start = doc["word/document.xml"].get_node(tag="w:del", attrs={"w:id": "1"})
end = doc["word/document.xml"].get_node(tag="w:ins", attrs={"w:id": "2"})
doc.add_comment(start=start, end=end, text="Explanation of this change")
# Comment on paragraph
para = doc["word/document.xml"].get_node(tag="w:p", contains="text")
doc.add_comment(start=para, end=para, text="Comment here")
# Comment on newly created tracked change
node = doc["word/document.xml"].get_node(tag="w:r", contains="old")
new_nodes = doc["word/document.xml"].replace_node(
node, '<w:del><w:r><w:delText>old</w:delText></w:r></w:del><w:ins><w:r><w:t>new</w:t></w:r></w:ins>')
doc.add_comment(start=new_nodes[0], end=new_nodes[1], text="Changed per requirements")
# Reply to comment
doc.reply_to_comment(parent_comment_id=0, text="I agree")
```
### Images
```python
from PIL import Image
import shutil, os
doc = Document('unpacked')
media_dir = os.path.join(doc.unpacked_path, 'word/media')
os.makedirs(media_dir, exist_ok=True)
shutil.copy('image.png', os.path.join(media_dir, 'image1.png'))
img = Image.open(os.path.join(media_dir, 'image1.png'))
width_emus = int(6.5 * 914400) # 6.5" usable width
height_emus = int(width_emus * img.size[1] / img.size[0])
# Add relationship
rels_editor = doc['word/_rels/document.xml.rels']
next_rid = rels_editor.get_next_rid()
rels_editor.append_to(rels_editor.dom.documentElement,
f'<Relationship Id="{next_rid}" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/image" Target="media/image1.png"/>')
doc['[Content_Types].xml'].append_to(doc['[Content_Types].xml'].dom.documentElement,
'<Default Extension="png" ContentType="image/png"/>')
# Insert
node = doc["word/document.xml"].get_node(tag="w:p", line_number=100)
doc["word/document.xml"].insert_after(node, f'''<w:p><w:r><w:drawing>
<wp:inline distT="0" distB="0" distL="0" distR="0">
<wp:extent cx="{width_emus}" cy="{height_emus}"/>
<wp:docPr id="1" name="Picture 1"/>
<a:graphic xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main">
<a:graphicData uri="http://schemas.openxmlformats.org/drawingml/2006/picture">
<pic:pic xmlns:pic="http://schemas.openxmlformats.org/drawingml/2006/picture">
<pic:nvPicPr><pic:cNvPr id="1" name="image1.png"/><pic:cNvPicPr/></pic:nvPicPr>
<pic:blipFill><a:blip r:embed="{next_rid}"/><a:stretch><a:fillRect/></a:stretch></pic:blipFill>
<pic:spPr><a:xfrm><a:ext cx="{width_emus}" cy="{height_emus}"/></a:xfrm><a:prstGeom prst="rect"><a:avLst/></a:prstGeom></pic:spPr>
</pic:pic>
</a:graphicData>
</a:graphic>
</wp:inline>
</w:drawing></w:r></w:p>''')
```
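The snippet above appends a `<Default Extension="png">` entry unconditionally. If the package already declares that extension, the result carries a duplicate entry, which consumers may reject, so a guard along these lines may be worth adding. This is a sketch using only `xml.dom.minidom`; the helper name `has_default_extension` is hypothetical, and it takes the XML text rather than the editor object, whose internals are not documented here.

```python
from xml.dom import minidom


def has_default_extension(content_types_xml: str, extension: str) -> bool:
    """Return True if [Content_Types].xml already declares a Default for the extension."""
    dom = minidom.parseString(content_types_xml)
    for node in dom.getElementsByTagName("Default"):
        if node.getAttribute("Extension").lower() == extension.lower():
            return True
    return False


sample = (
    '<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">'
    '<Default Extension="xml" ContentType="application/xml"/>'
    '<Default Extension="png" ContentType="image/png"/>'
    "</Types>"
)
print(has_default_extension(sample, "png"))
print(has_default_extension(sample, "jpeg"))
```

In the image workflow, the `append_to(...)` call on `[Content_Types].xml` would then run only when the check returns `False`.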
### Saving
```python
doc.save() # Validates + copies back to original dir
doc.save('modified-unpacked') # Save to different location
doc.save(validate=False) # Skip validation (debug only)
```
### Direct DOM Manipulation
```python
editor = doc["word/document.xml"]
node = doc["word/document.xml"].get_node(tag="w:p", line_number=5)
parent = node.parentNode
parent.removeChild(node)
# General replacement (without tracked changes)
old = doc["word/document.xml"].get_node(tag="w:p", contains="original")
doc["word/document.xml"].replace_node(old, "<w:p><w:r><w:t>replacement</w:t></w:r></w:p>")
# Chained insertions
node = doc["word/document.xml"].get_node(tag="w:r", line_number=100)
nodes = doc["word/document.xml"].insert_after(node, "<w:r><w:t>A</w:t></w:r>")
nodes = doc["word/document.xml"].insert_after(nodes[-1], "<w:r><w:t>B</w:t></w:r>")
```
## Schema Compliance Quick Reference
- **Element ordering in `<w:pPr>`**: `<w:pStyle>` → `<w:numPr>` → `<w:spacing>` → `<w:ind>` → `<w:jc>`
- **Whitespace**: `xml:space='preserve'` on `<w:t>` with leading/trailing spaces
- **RSIDs**: 8-digit hex only (0-9, A-F)
- **trackRevisions**: Add `<w:trackRevisions/>` after `<w:proofState>` in settings.xml
- **`<w:del>`/`<w:ins>` placement**: At paragraph level, containing complete `<w:r>` elements. Never nest inside `<w:r>`.
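Two of the rules above are easy to check mechanically before writing XML. The sketch below is illustrative only: the helper names `is_valid_rsid` and `text_element` are hypothetical, and the RSID check reads the rule strictly (uppercase A-F, exactly eight digits).

```python
import re
from xml.sax.saxutils import escape

RSID_PATTERN = re.compile(r"[0-9A-F]{8}")


def is_valid_rsid(rsid: str) -> bool:
    """An RSID must be exactly eight hexadecimal digits (0-9, A-F)."""
    return RSID_PATTERN.fullmatch(rsid) is not None


def text_element(text: str) -> str:
    """Build a <w:t> element, adding xml:space="preserve" when edge whitespace matters."""
    needs_preserve = text != text.strip()
    attr = ' xml:space="preserve"' if needs_preserve else ""
    return f"<w:t{attr}>{escape(text)}</w:t>"


print(is_valid_rsid("07DC5ECB"))
print(text_element("The report is "))
```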
## Validation Rules
The validator ensures document text matches the original after reverting GLM's changes:
- **Never modify text inside another author's `<w:ins>` or `<w:del>` tags**
- **Use nested deletions** to remove another author's insertions
- **Every edit must be tracked** with `<w:ins>` or `<w:del>` tags

264
skills/docx/references/toc.md Executable file

@@ -0,0 +1,264 @@
# Table of Contents (TOC) — Complete Reference
> **This is the single source of truth for all TOC rules.** Other files should reference this file instead of duplicating TOC instructions.
## Overview
DOCX TOC is a **3-step process**: Code → Post-process → User opens Word.
```
Step A: docx-js code generates empty TOC field structure
Step B: add_toc_placeholders.py fills it with visible placeholder entries
Step C: User opens Word → "Update Field" → real page numbers replace placeholders
```
All 3 steps are **mandatory**. Skipping any step results in a broken or empty TOC.
## When to Add TOC
- **Recommended**: Long or complex documents with many headings (reports, theses, papers, manuals)
- **Do NOT add**: Resumes, contracts, letters, exam papers, short documents
- **postcheck rule**: If document contains a "目录" title but no `TableOfContents` element → error
## Step A: Code Generation (docx-js)
Insert **4 elements** in sequence:
```js
const { TableOfContents, Paragraph, TextRun, PageBreak, AlignmentType } = require("docx");
// 1. TOC title — ⛔ DO NOT use HeadingLevel (or TOC will index itself!)
new Paragraph({
alignment: AlignmentType.CENTER,
spacing: { before: 480, after: 360 },
children: [new TextRun({
text: "目 录", // or "Table of Contents" for English docs
bold: true, size: 32,
font: { eastAsia: "SimHei", ascii: "Times New Roman" }
})],
}),
// 2. TOC field element — ⚠️ first parameter is NOT displayed, it's internal name only
new TableOfContents("Table of Contents", {
hyperlink: true,
headingStyleRange: "1-3", // match HeadingLevel range used in document
}),
// 3. ★ MANDATORY Refresh Hint — tells user how to update page numbers
new Paragraph({
spacing: { before: 200 },
children: [new TextRun({
text: "Note: This Table of Contents is generated via field codes. To ensure page number accuracy after editing, please right-click the TOC and select \"Update Field.\"",
italics: true, size: 18, color: "888888"
})]
}),
// 4. ★ MANDATORY PageBreak after TOC — prevents TOC and body merging on same page
new Paragraph({ children: [new PageBreak()] }),
```
### Heading Requirements
**⚠️ CRITICAL**: TOC only picks up paragraphs with `heading: HeadingLevel.HEADING_X`.
```js
// ✅ Correct — Heading style, TOC can index
new Paragraph({
heading: HeadingLevel.HEADING_1,
children: [new TextRun({ text: "第一章 引言", bold: true, size: 32, color: c(P.primary) })]
})
// ❌ Wrong — manual bold + large font, TOC cannot detect
new Paragraph({
children: [new TextRun({ text: "第一章 引言", bold: true, size: 32, color: c(P.primary) })]
})
```
**Exceptions:**
- Cover title: does NOT need Heading style (should not appear in TOC)
- "目录" title: **MUST NOT** use Heading style (prevents TOC from indexing itself)
## Step B: Post-Processing Script
**MUST** run after generating the DOCX file:
```bash
python3 "$DOCX_SCRIPTS/add_toc_placeholders.py" output.docx --auto
```
### What the script does
1. Extracts Heading 1-3 from the document as TOC entries
2. Fixes docx-js fldChar structure bug (begin+instrText+separate merged in one `<w:r>`)
3. Patches `settings.xml` with `updateFields=true` (Word prompts to refresh on open)
4. Ensures Heading styles have `outlineLvl` (required for TOC field update)
5. Ensures TOC 1/2/3 styles exist in `styles.xml`
6. Injects placeholder entries with HYPERLINK + PAGEREF between `separate` and `end` fldChars
7. Handles duplicate heading texts (each gets its own bookmark)
### Error handling
The script **exits with code 1** if:
- No TOC field structure found (missing `TableOfContents` element)
- TOC field has `begin` but no `separate` fldChar (malformed structure)
- Field structure exists but no TOC instrText detected
**If exit code = 1 → the generated code is wrong. Fix the code and regenerate.**
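When the post-processing step is driven from a Python build script rather than the shell, the exit-code rule above can be enforced explicitly. This is a minimal sketch: the wrapper name `run_toc_postprocess` and the injectable `runner` parameter are assumptions made for illustration (the latter only so the wrapper can be exercised without the real script); the command line itself is the one documented above.

```python
import subprocess


def run_toc_postprocess(docx_path: str, scripts_dir: str, runner=subprocess.run) -> None:
    """Run add_toc_placeholders.py --auto and raise if it reports a broken TOC field."""
    cmd = ["python3", f"{scripts_dir}/add_toc_placeholders.py", docx_path, "--auto"]
    result = runner(cmd)
    if result.returncode != 0:
        # A non-zero exit means the generated TOC structure is wrong, not the script
        raise RuntimeError(
            f"add_toc_placeholders.py exited with code {result.returncode}: "
            "fix the generation code and regenerate the document"
        )
```

A typical call would pass the value of `$DOCX_SCRIPTS` as `scripts_dir`, for example `run_toc_postprocess("output.docx", os.environ["DOCX_SCRIPTS"])` after importing `os`.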
### Options
```bash
# Auto mode (recommended — default behavior)
python3 "$DOCX_SCRIPTS/add_toc_placeholders.py" output.docx --auto
# Manual entries
python3 "$DOCX_SCRIPTS/add_toc_placeholders.py" output.docx \
--entries '[{"level":1,"text":"Chapter 1","page":"1"},{"level":2,"text":"Section 1.1","page":"2"}]'
```
## Step C: User Opens in Word/WPS
- **Word**: Detects `updateFields=true` → prompts "Update field?" → click Yes → real page numbers
- **WPS**: May NOT auto-prompt. User must: right-click TOC → "Update Field" → "Update entire table"
The placeholder entries ensure TOC is **not blank** even without updating — users see heading titles with approximate page numbers.
## Multi-Section Page Numbering
When a document has a TOC, the TOC MUST be in its own section so that body page numbering starts from 1. This applies to **all document types with a TOC** (reports, whitepapers, PRDs, academic papers, etc.) — not just academic papers.
**Mandatory 3-section architecture for documents with cover + TOC:**
```js
sections: [
{ /* Section 1: Cover — no page number, no footer */
properties: {
page: { size: pgSize, margin: pgMargin },
// ⚠️ Do NOT set page.pageNumbers here — docx-js emits empty <pgNumType/> which confuses WPS
},
},
{ /* Section 2: Front matter (abstract, TOC) — Roman numerals */
properties: {
type: SectionType.NEXT_PAGE,
page: {
size: pgSize, margin: pgMargin,
pageNumbers: { start: 1, formatType: NumberFormat.UPPER_ROMAN }, // I, II, III...
},
},
footers: { default: pageNumFooter() }, // see footer rules below
children: [/* abstract + TOC title + TableOfContents + PageBreak */]
},
{ /* Section 3: Body — Arabic numerals starting from 1 */
properties: {
type: SectionType.NEXT_PAGE,
page: {
size: pgSize, margin: pgMargin,
pageNumbers: { start: 1, formatType: NumberFormat.DECIMAL }, // 1, 2, 3...
},
},
footers: { default: pageNumFooter() },
children: [/* body content */]
},
]
```
### ⚠️ Page Number API — Correct Nesting (CRITICAL)
Page number settings MUST be nested inside `page.pageNumbers`, NOT at properties top level:
```js
// ❌ WRONG — docx-js ignores these, pgNumType will be empty
properties: {
pageNumberStart: 1,
pageNumberFormatType: NumberFormat.DECIMAL,
}
// ✅ CORRECT — docx-js writes start= and fmt= attributes
properties: {
page: {
pageNumbers: { start: 1, formatType: NumberFormat.DECIMAL },
},
}
```
### ⚠️ Footer Field Instruction — WPS Compatibility (CRITICAL)
WPS may ignore `pgNumType fmt` in the section properties. To ensure correct display, the footer PAGE field **MUST** include an explicit format switch via **post-processing**:
After generating the docx, unzip and patch each footer XML:
- **Roman numeral footer**: replace `PAGE` with `PAGE \* ROMAN \* MERGEFORMAT`
- **Arabic numeral footer**: replace `PAGE` with `PAGE \* arabic \* MERGEFORMAT`
**⚠️ NEVER use `\* decimal` in instrText** — `decimal` is a docx-js API enum value (`NumberFormat.DECIMAL` for `pgNumType` XML attribute), NOT a valid Word field format switch. Using it causes page numbers to render as "1decimal", "2decimal". The correct Word field switch for Arabic numerals is always `\* arabic`.
```js
// Post-process footer XML:
footerXml = footerXml.replace(
/(<w:instrText[^>]*>)\s*PAGE\s*(<\/w:instrText>)/g,
'$1 PAGE \\* ROMAN \\* MERGEFORMAT $2' // or "arabic" for body section
);
```
Also remove any empty `<w:pgNumType/>` from the cover section (docx-js emits these even when no pageNumbers is set):
```js
docXml = docXml.replace(/<w:pgNumType\/>/g, "");
```
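Because the TOC post-processing step already runs in Python, the same two patches can be sketched there as well. The helpers below operate on XML strings only (unzipping and re-zipping the `.docx` is left out), their names are hypothetical, and the allowed switches are limited to the two this section uses.

```python
import re

PAGE_FIELD = re.compile(r"(<w:instrText[^>]*>)\s*PAGE\s*(</w:instrText>)")
ALLOWED_SWITCHES = {"ROMAN", "arabic"}


def patch_page_field(footer_xml: str, switch: str) -> str:
    """Add an explicit format switch to bare PAGE fields in a footer part."""
    if switch not in ALLOWED_SWITCHES:
        # "decimal" in particular is an API enum value, not a Word field switch
        raise ValueError(f"unsupported field switch: {switch!r}")

    def repl(match):
        return f"{match.group(1)} PAGE \\* {switch} \\* MERGEFORMAT {match.group(2)}"

    return PAGE_FIELD.sub(repl, footer_xml)


def drop_empty_pgnumtype(document_xml: str) -> str:
    """Remove empty <w:pgNumType/> elements from document.xml."""
    return document_xml.replace("<w:pgNumType/>", "")


bare = '<w:r><w:instrText xml:space="preserve">PAGE</w:instrText></w:r>'
print(patch_page_field(bare, "ROMAN"))
```

The patch only matches a bare `PAGE` instruction, so running it twice on the same footer leaves the already-patched field unchanged.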
### Page Numbering Rules
| Section | Content | Format | Start | Footer |
|---------|---------|--------|-------|--------|
| Cover | Title page | None | — | No footer |
| Front matter | Abstract, TOC | Roman (I, II, III) | 1 | `PAGE \* ROMAN` |
| Body | Main content | Arabic (1, 2, 3) | 1 | `PAGE \* arabic` |
⚠️ **The body section MUST set `pageNumbers: { start: 1 }`** — otherwise page numbers continue from the front matter pages, causing TOC page references to be offset. This is the #1 cause of "TOC page numbers are wrong".
### Common Causes of Incorrect Page Numbers
| Cause | Fix |
|-------|-----|
| `pageNumberStart` at properties top level | Move to `page: { pageNumbers: { start: 1 } }` |
| Cover section emits empty `<pgNumType/>` | Post-process to remove it |
| Footer uses bare `PAGE` without format switch | Post-process to add `\* ROMAN` or `\* arabic` |
| Cover and body in same section | Separate cover into its own section |
| Multiple sections without pageNumbers.start | Explicitly set on each section needing independent counting |
| headingStyleRange doesn't match headings | Ensure `headingStyleRange: "1-3"` covers all HeadingLevel values used |
| Cover section has header/footer | Don't set header/footer on cover section |
## TOC Refresh Hint (MANDATORY)
**⚠️ When the document contains a TOC, you MUST add the following hint paragraph between the `TableOfContents` element and the PageBreak (so it appears on the TOC page, not the body page).** This ensures users know how to refresh page numbers after editing.
```js
new Paragraph({
spacing: { before: 200 },
children: [new TextRun({
text: "Note: This Table of Contents is generated via field codes. To ensure page number accuracy after editing, please right-click the TOC and select \"Update Field.\"",
italics: true, size: 18, color: "888888"
})]
}),
```
## 5 Common TOC Bugs
| # | Bug | Symptom | Fix |
|---|-----|---------|-----|
| 1 | "目录" heading uses `HeadingLevel.HEADING_1` | TOC includes "目录" as an entry | Remove `heading:` from TOC title paragraph |
| 2 | No `PageBreak` after `TableOfContents` | TOC and body text on same page | Add `new Paragraph({ children: [new PageBreak()] })` after TOC |
| 3 | Missing `TableOfContents` element | Script cannot inject placeholders, TOC is empty | Always include `new TableOfContents(...)` in code |
| 4 | Headings use bold+large instead of `HeadingLevel` | TOC is empty even after running script | Change all body headings to `heading: HeadingLevel.HEADING_X` |
| 5 | Script not run or exit code ignored | TOC page shows only title + blank space | Always run script; if exit code = 1, fix code and regenerate |
## Checklist (for self-check during generation)
- [ ] Document has 3+ H1 → TOC is included
- [ ] "目录" heading does NOT use `HeadingLevel` (prevents self-indexing)
- [ ] `new TableOfContents(...)` element present (not just plain text)
- [ ] `PageBreak` exists after TOC element (prevents merging with body)
- [ ] All body chapter headings use `heading: HeadingLevel.HEADING_X`
- [ ] `add_toc_placeholders.py --auto` runs after generation
- [ ] Script exit code checked — if 1, fix code and regenerate
- [ ] TOC page has visible placeholder content (not empty)
- [ ] **TOC Refresh Hint present**: italic gray note between the `TableOfContents` element and the PageBreak, telling the user to right-click → "Update Field"
- [ ] `outlineLevel: 0` for H1, `1` for H2, etc. (needed for TOC field update)

88
skills/docx/routes/comment.md Executable file

@@ -0,0 +1,88 @@
# Route: Add Comments
## Method 1: python-docx (Recommended — Simple)
```python
from docx import Document
from docx.oxml.ns import qn
from docx.oxml import OxmlElement
from datetime import datetime
def add_comment(paragraph, comment_text, author="GLM", initials="G"):
    """Add a comment to an entire paragraph."""
    # Create comment reference
    comment_id = str(hash(comment_text) % 10000)
    # Add to comments.xml (need to create if not exists)
    # ... complex XML manipulation required
    pass
# Simpler approach: use python-docx-ng or manipulate XML directly
```
**Note**: python-docx has limited native comment support. For reliable results, use the OOXML method.
## Method 2: OOXML Direct Manipulation (Reliable)
### Step 1: Unpack
```bash
mkdir work && cd work && unzip ../input.docx
```
### Step 2: Create/update word/comments.xml
```xml
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<w:comments xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
<w:comment w:id="1" w:author="Reviewer" w:date="2024-01-15T10:30:00Z" w:initials="R">
<w:p>
<w:r>
<w:t>This section needs more detail.</w:t>
</w:r>
</w:p>
</w:comment>
</w:comments>
```
### Step 3: Mark comment range in document.xml
```xml
<w:commentRangeStart w:id="1"/>
<w:r><w:t>Text being commented on</w:t></w:r>
<w:commentRangeEnd w:id="1"/>
<w:r>
<w:rPr><w:rStyle w:val="CommentReference"/></w:rPr>
<w:commentReference w:id="1"/>
</w:r>
```
### Step 4: Update relationships
In `word/_rels/document.xml.rels`, add:
```xml
<Relationship Id="rIdComments" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/comments" Target="comments.xml"/>
```
### Step 5: Update Content_Types
In `[Content_Types].xml`, ensure:
```xml
<Override PartName="/word/comments.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.comments+xml"/>
```
### Step 6: Pack
```bash
zip -r ../output.docx . -x ".*"
```
## When to Use Each Method
| Scenario | Method |
|----------|--------|
| Add 1-2 simple comments | OOXML |
| Batch review (many comments) | OOXML with Python script |
| Comment on specific words | OOXML (precise range control) |
| Quick annotation | python-docx if available |
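For the "OOXML with Python script" case in the table above, the markup from Steps 2 and 3 can be generated rather than typed by hand. The sketch below only builds the XML strings with proper escaping; writing them into `word/comments.xml` and `word/document.xml` is left to the surrounding script. The helper names `comment_element` and `range_markers` are hypothetical.

```python
from xml.dom import minidom
from xml.sax.saxutils import escape, quoteattr

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def comment_element(comment_id: int, text: str, author: str, initials: str, date: str) -> str:
    """Return one <w:comment> element for word/comments.xml."""
    return (
        f'<w:comment w:id="{comment_id}" w:author={quoteattr(author)} '
        f"w:date={quoteattr(date)} w:initials={quoteattr(initials)}>"
        f"<w:p><w:r><w:t>{escape(text)}</w:t></w:r></w:p>"
        "</w:comment>"
    )


def range_markers(comment_id: int):
    """Return (start, end) markup that brackets the commented run in document.xml."""
    start = f'<w:commentRangeStart w:id="{comment_id}"/>'
    end = (
        f'<w:commentRangeEnd w:id="{comment_id}"/>'
        '<w:r><w:rPr><w:rStyle w:val="CommentReference"/></w:rPr>'
        f'<w:commentReference w:id="{comment_id}"/></w:r>'
    )
    return start, end


notes = ["This section needs more detail.", "Check figures & units."]
elements = [
    comment_element(i, note, "Reviewer", "R", "2024-01-15T10:30:00Z")
    for i, note in enumerate(notes, start=1)
]
comments_xml = f'<w:comments xmlns:w="{W_NS}">{"".join(elements)}</w:comments>'

# Parsing the result is a cheap well-formedness check before writing the part
parsed = minidom.parseString(comments_xml)
print(len(parsed.getElementsByTagName("w:comment")))
```

Each comment ID must be used consistently across the `<w:comment>` element and its three markers, which is why both helpers take the same `comment_id`.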

207
skills/docx/routes/create.md Executable file

@@ -0,0 +1,207 @@
# Route: Create New Document
## Workflow
```
0. Check if user provided a reference template (PDF/docx) → if yes, use Template-Following Mode below
1. Load `references/design-system.md` → select palette and cover recipe
2. Load `references/common-rules.md` → shared layout, font, placeholder rules
3. Check user keywords → load scene file if applicable
4. Load `references/docx-js-core.md`
5. If complex → also load `references/docx-js-advanced.md`
6. Plan document structure (outline)
7. Write JS/TS using docx library
⚠️ **BEFORE writing any string**: scan ALL Chinese text for curly quotes `“”‘’` and replace with `\u201c \u201d \u2018 \u2019`; bare curly quotes break JS syntax (see docx-js-advanced.md § Quotes Escaping)
8. Run with `bun run generate.js` (or `node generate.js`)
9. If TOC → run `python3 "$DOCX_SCRIPTS/add_toc_placeholders.py" output.docx --auto`
10. Run post-generation checklist (see SKILL.md)
```
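The curly-quote scan in step 7 can be automated when the document text is prepared in Python before being placed into the generated JS source. This is a sketch under that assumption; the helper name `escape_curly_quotes` is hypothetical.

```python
CURLY_QUOTE_ESCAPES = {
    "\u201c": "\\u201c",  # left double quotation mark
    "\u201d": "\\u201d",  # right double quotation mark
    "\u2018": "\\u2018",  # left single quotation mark
    "\u2019": "\\u2019",  # right single quotation mark
}


def escape_curly_quotes(text: str) -> str:
    """Replace curly quotes with JS \\uXXXX escapes for use inside a JS string literal."""
    return text.translate(str.maketrans(CURLY_QUOTE_ESCAPES))


print(escape_curly_quotes("他说\u201c你好\u201d"))
```

All other characters, including the Chinese text itself, pass through unchanged.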
## Template-Following Mode
When the user provides a reference document (PDF/docx) as a **formatting template** (e.g., "generate following this template format", "refer to this sample"), switch to template-following mode instead of the standard recipe-based workflow:
1. **Extract the template's structure** — cover layout, section order, heading hierarchy, page breaks, special pages (e.g., advisor comments page, approval form)
2. **Replicate structure exactly** — every major structural unit becomes a **separate section** (cover, body, appendix/form pages) with appropriate margins and page breaks
3. **Fill content** from the user's content source, or generate per user instructions
4. **Preserve template-specific elements** — school-specific forms, signature areas, stamp placeholders, advisor comment blocks → reproduce as-is with placeholder text (e.g., "Advisor (signature):")
5. **Maintain formatting fidelity** — font choices, table layouts, spacing, and alignment should match the template, not the standard design-system palettes
⚠️ **Do NOT apply standard cover recipes (R1–R7) when a user-provided template defines its own cover format.** Follow the template's cover layout instead. Standard `common-rules.md` constraints (e.g., `WidthType.PERCENTAGE`, `allNoBorders` for cover wrapper, `Rule 8` line spacing) still apply for cross-engine compatibility.
⚠️ **Each distinct page type = separate section.** Cover section (margin: 0), body section (standard margins), appendix/form pages (may need different margins or orientation). Never place cover + body + appendix in a single section.
---
## Decision Tree
### Cover Page?
- **YES**: Reports, theses, proposals, plans, or 3+ page docs with clear title/author
- **NO**: Resumes, contracts, official documents, exam papers, short memos
### Cover Style Selector — Recipe Router
Covers use **7 validated layout recipes (R1–R7)**, auto-selected by `selectCoverRecipe()` in `references/design-system.md` (the **authoritative source**; do NOT duplicate the function).
**Quick Reference:**
| docType | Recipe | Default Palette |
|---------|--------|-----------------|
| contract / official / exam / resume | null (no cover) | — |
| academic | R5 (Clean White) | ACADEMIC |
| proposal_report (thesis proposal) | R5 (Clean White) | ACADEMIC |
| lesson_plan (STEM) | R4 (Top Color Block) | DM-1 |
| lesson_plan (arts/general) | R6 (Editorial Warm) | ED-1 |
| creative / branding / design | R3 (Centered Card Frame) | SN-2 |
| cultural / newsletter / internal | R6 (Editorial Warm) | ED-1 |
| activity / event | R6 (Editorial Warm) | ED-1 |
| trend/research (cultural/creative/brand) | R7 (Swiss Tech) | ST-1 |
| whitepaper | R2 (Double-Rule Frame) | IG-1 / CM-2 |
| consulting | R2 (Double-Rule Frame) | MIN-1 |
| proposal / plan | R4 (Top Color Block) | GO-1 |
| report | R1 (Pure Paragraph Left) | by industry |
| default | R1 (Pure Paragraph Left) | DS-1 |
⚠️ **Long title routing:** After selecting recipe, apply `applyLongTitleOverride(result, titleLength)`. Titles >20 chars on R3/R4/R6 → fall back to R1. Titles >30 chars on R2 → fall back to R1. R5 is never overridden.
⚠️ **Academic thesis cover:** Use `buildAcademicCover()` from `scenes/academic.md`.
⚠️ **Thesis proposal report (开题报告):** Use `buildProposalCover()` from `scenes/academic.md`. Cover MUST be an independent section. Keywords: "开题报告" (Chinese), "thesis proposal", "research proposal" — NOT the same as business proposals (which use R4).
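The long-title routing rule above is small enough to state as a lookup. The sketch below is illustrative only and is not the `applyLongTitleOverride(result, titleLength)` function from `references/design-system.md`, which remains authoritative: it takes a bare recipe name instead of the result object, and it passes R1, R5, and R7 through unchanged because the rule states no limit for them.

```python
# Character limits per recipe, taken from the long-title routing rule
LONG_TITLE_LIMITS = {"R3": 20, "R4": 20, "R6": 20, "R2": 30}


def apply_long_title_override(recipe, title_length: int):
    """Fall back to R1 when the title is too long for the selected recipe."""
    limit = LONG_TITLE_LIMITS.get(recipe)
    if limit is not None and title_length > limit:
        return "R1"
    return recipe


print(apply_long_title_override("R4", 24))
print(apply_long_title_override("R5", 24))
```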
### Table of Contents?
- **YES**: 3+ major sections (H1 headings)
- **NO**: Resumes, exam papers, short docs, contracts (<20 clauses)
→ See `references/toc.md` for the complete TOC reference (3-step process, code examples, common bugs).
### Headers/Footers?
- **YES** by default (page numbers minimum)
- **NO**: cover page section, official docs (special format)
### Load Math Formulas?
When: exam papers, academic papers, physics/math/chemistry → load `references/math-formulas.md`
### Load Chart Templates?
When: data visualization, reports with charts → load `references/chart-templates.md`
## Outline Rules
**User provides outline** → Follow EXACTLY. No additions, deletions, or reordering.
**No outline** → Create from scene template:
- **Academic:** Abstract → TOC → Body → References
- **Report:** Use `selectReportType()` to determine the type, then follow templates A–F:
- analysis → Template A (Executive Summary → Background → Scope & Method → Findings → Diagnosis → Conclusions)
- experiment → Template B (Abstract → Objective & Hypothesis → Environment → Procedure → Results → Error Analysis → Conclusions)
- testing → Template C (Overview → Scope & Environment → Test Plan → Results → Defects → Risks → Conclusions)
- research → Template D (Summary → Background → Subjects & Method → Sample → Findings → Synthesis → Recommendations)
- review → Template E (Overview → Goals → Review → Results → Issues → Lessons → Action Plan)
- proposal → Template F (Summary → Status → Goals → Solution → Roadmap → Resources → Risks → Benefits)
- **Contract:** Use `selectContractType()` then follow template A–E:
- bilateral → Template A (Header → Parties → Recitals → Definitions → Subject → Price → Rights → Delivery → Tax → IP → Breach → Force Majeure → Termination → Notices → Dispute → Miscellaneous → Signature)
- transfer → Template B (Header → Recitals → Definitions → Subject → Consideration → Closing → Representations → Tax → Breach → Dispute → Signature)
- nda → Template C (Header → Recitals → Definition → Obligations → Use Restrictions → Return/Destroy → Exceptions → Duration → Breach → Dispute → Signature)
- framework → Template D (Header → Recitals → Purpose → Scope → Division → Mechanism → Commercial → Confidentiality → Term → Breach → Dispute → Signature)
- terms → Template E (Title → Definitions → Services → Rights → Liability → Fees → IP → Termination → Notices → Dispute → Miscellaneous)
- **Official:** Use `selectOfficialType()` + `needsRedHeader()`:
- notice → Template A ([Red header] → [Doc number] → Title → Addressee → Reason → Items → Requirements → [Attachments] → [Signature] → [Date] → [Colophon])
- letter → Template B ([Red header] → [Doc number] → Title → Addressee → Reason → Negotiation/Reply → Closing → [Signature] → [Date])
- reply → Template C ([Red header] → [Doc number] → Title → Addressee → Reference → Reply → "This is the reply." → Signature → Date)
- minutes → Template D (Title → Meeting Overview → Agreed Items → Responsibilities → [Distribution]) — typically no red header
- Present outline to user before generating when possible
## Scene Completeness
Include ALL elements a scene specifies:
- **Academic thesis:** Cover (`buildAcademicCover()` in its own section), abstract, TOC, references
- **Thesis proposal report (thesis proposal / 开题报告):** Cover (`buildProposalCover()` in its own section), body sections per proposal template. Cover MUST be a separate section.
- **Report:** Cover, executive summary, conclusions
- **Contract:** Party info, recitals, complete clause closure, signature block, uniform `【】` placeholders
- **Official:** Correct document type, specific title, closing phrase matching type, proper numbering hierarchy, red header only when requested
- **Exam:** Student info area, scoring criteria
Generate complete, substantive content — not skeletons.
## Content Guidelines
- **Length**: "detailed report" = 3000+ words. "brief summary" = 500–1000 words.
- **Data**: Use user's data, or generate realistic placeholders
- **Charts**: Use `references/chart-templates.md` matplotlib templates → PNG → embed
- **Math**: Use `references/math-formulas.md` LaTeX → docx-js Math mapping
- **Tables**: For structured data, not layout
- **Numbering**: Figures, tables numbered sequentially with cross-references
## Code Architecture
### Heading Style Rule (Mandatory)
**All body chapter headings MUST use `heading: HeadingLevel.HEADING_X`** — never simulate with bold + large font (TOC cannot detect simulated headings).
**Exception:** The cover title and the TOC title ("目录") MUST NOT use a Heading style.
### Blank Page Prevention
→ See SKILL.md § Post-Generation checklist for the full set of rules.
Key rules:
1. No double page breaks (SectionType.NEXT_PAGE + PageBreak = blank page)
2. PageBreak paragraphs should have visible text content
3. No more than 3 consecutive empty paragraphs
4. Cover section: ≤2 trailing empty paragraphs, no trailing PageBreak
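Rule 3 can be spot-checked on an unpacked `word/document.xml`. The sketch below is an illustration, not the `postcheck.py` implementation; `max_empty_paragraph_run` is a hypothetical helper. It treats any paragraph without `<w:t>` text as empty, so image-only paragraphs are counted as empty too. It uses the standard-library parser for brevity; for untrusted files, `defusedxml` is the safer choice.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def max_empty_paragraph_run(document_xml: str) -> int:
    """Longest run of consecutive body-level paragraphs with no visible text."""
    body = ET.fromstring(document_xml).find(f"{{{W}}}body")
    longest = current = 0
    for child in body:
        if child.tag != f"{{{W}}}p":
            current = 0  # a table or sectPr interrupts the run
            continue
        text = "".join(t.text or "" for t in child.iter(f"{{{W}}}t"))
        if text.strip():
            current = 0
        else:
            current += 1
            longest = max(longest, current)
    return longest
```

A result greater than 3 means the document violates the rule.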
### Builder Pattern Example
```js
const { Document, Packer, Paragraph, TextRun, Header, Footer,
AlignmentType, HeadingLevel, PageNumber } = require("docx");
const fs = require("fs");
// 1. Palette
const P = { primary: "#101820", body: "#182030", secondary: "#506070", accent: "#8090A0" };
const c = (hex) => hex.replace("#", "");
// 2. Component builders
function heading(text, level = HeadingLevel.HEADING_1) {
return new Paragraph({
heading: level,
spacing: { before: level === HeadingLevel.HEADING_1 ? 360 : 240, after: 120 },
children: [new TextRun({ text, bold: true, color: c(P.primary), font: { ascii: "Calibri", eastAsia: "SimHei" } })]
});
}
function body(text) {
return new Paragraph({
alignment: AlignmentType.JUSTIFIED,
indent: { firstLine: 480 },
spacing: { line: 312 },
children: [new TextRun({ text, size: 24, color: c(P.body) })],
});
}
// 3. Assembly — cover + body in separate sections
const doc = new Document({
styles: { default: { document: {
run: { font: { ascii: "Calibri", eastAsia: "Microsoft YaHei" }, size: 24, color: c(P.body) },
paragraph: { spacing: { line: 312 } },
}}},
sections: [
{ properties: { page: { margin: { top: 0, bottom: 0, left: 0, right: 0 } } },
children: buildCoverR1(config) }, // ← use recipe from design-system.md
{ properties: { page: { margin: { top: 1440, bottom: 1440, left: 1701, right: 1417 } } },
footers: { default: new Footer({ children: [new Paragraph({ alignment: AlignmentType.CENTER,
children: [new TextRun({ children: [PageNumber.CURRENT], size: 18 })] })] }) },
children: [heading("Chapter 1"), body("Content...")] },
],
});
Packer.toBuffer(doc).then(buf => { fs.writeFileSync("output.docx", buf); });
```
## Post-Generation
→ See SKILL.md § Post-Generation for the complete two-layer verification checklist.
```bash
python3 "$DOCX_SCRIPTS/postcheck.py" output.docx
```
⚠️ **Running postcheck.py is MANDATORY.** Fix all ❌ errors before delivering.

115
skills/docx/routes/edit.md Executable file

@@ -0,0 +1,115 @@
# Route: Edit Existing Document
## Workflow Overview
```
1. Receive .docx (or .doc → convert)
2. Unpack → working directory
3. Analyze structure (document.xml, styles.xml)
4. Plan changes → batch by type
5. Implement via Document library (Python)
6. Pack → output.docx
7. Verify (pandoc or visual)
```
## Step 0: Format Conversion
```bash
# .doc → .docx
libreoffice --headless --convert-to docx input.doc
```
## Step 1: Unpack
```bash
mkdir -p work_dir && cd work_dir && unzip ../input.docx
```
Key files: `word/document.xml` (content), `word/styles.xml` (styles), `word/numbering.xml` (lists), `word/media/` (images), `[Content_Types].xml`, `word/_rels/document.xml.rels`
## Step 2: Plan Changes
Group changes into batches, process in order:
1. **Structural** — Add/remove sections, reorder paragraphs
2. **Style** — Font, size, color modifications
3. **Text** — Find/replace, fix typos
4. **Table** — Add/remove rows/columns, update data
5. **Image** — Replace/add images
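The batch order above can be applied mechanically to a change list. This is a minimal sketch; the `(kind, description)` tuple shape and the `order_changes` name are assumptions for illustration.

```python
BATCH_ORDER = ["structural", "style", "text", "table", "image"]


def order_changes(changes):
    """Sort (kind, description) pairs into batch order.

    sorted() is stable, so changes of the same kind keep their original order.
    """
    rank = {kind: index for index, kind in enumerate(BATCH_ORDER)}
    return sorted(changes, key=lambda change: rank[change[0]])
```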
## Step 3: Implement
Load `references/ooxml.md` for the full Document library API. Key patterns:
```python
from scripts.document import Document
doc = Document('work_dir')
# Text replacement with tracked changes
node = doc["word/document.xml"].get_node(tag="w:r", contains="old text")
rpr = tags[0].toxml() if (tags := node.getElementsByTagName("w:rPr")) else ""
replacement = f'<w:del><w:r>{rpr}<w:delText>old text</w:delText></w:r></w:del><w:ins><w:r>{rpr}<w:t>new text</w:t></w:r></w:ins>'
doc["word/document.xml"].replace_node(node, replacement)
doc.save()
```
## Step 4: Pack
```bash
cd work_dir && zip -r ../output.docx . -x ".*"
```
## Step 5: Verify
```bash
pandoc output.docx -t plain -o /dev/stdout | head -50
# or visual
libreoffice --headless --convert-to pdf output.docx
```
---
## Template Matching Workflow
When user says "use this format" or provides a template:
1. Unpack template, extract `styles.xml`, `numbering.xml`
2. Analyze font/size/spacing/margins
3. Copy `styles.xml` into target document
4. Match heading hierarchy and spacing
## Multi-File Merge
1. Use first document as base
2. Extract content from additional documents
3. Insert with page breaks between sections
4. Merge styles (prefer base document's)
5. Re-number figures/tables sequentially
## Redlining (Tracked Changes) — Default for Revisions
When user asks for revisions, **default to tracked changes** so they can review:
```python
doc = Document('work_dir', track_revisions=True)
# ... make changes using replace_node with <w:del>/<w:ins>
doc.save()
```
Ask whether the user wants clean output or tracked changes only when the request is ambiguous.
## Common Operations Quick Reference
| Operation | Approach |
|-----------|----------|
| Replace text | `get_node` + `replace_node` with tracked changes |
| Change font | Modify `<w:rFonts>` in run properties |
| Add paragraph | `insert_after` with `<w:p>` element |
| Delete paragraph | `suggest_deletion` on `<w:p>` |
| Add table row | Clone `<w:tr>`, modify cells |
| Update header | Edit `word/headerN.xml` |
| Change margins | Edit `<w:pgMar>` in `<w:sectPr>` |
| Add image | See `references/ooxml.md` image insertion pattern |
| Add comment | `doc.add_comment(start, end, text)` |

120
skills/docx/routes/format.md Executable file

@@ -0,0 +1,120 @@
# Route: Format / Layout
## Workflow
```
1. Read current document (pandoc for content, unpack for structure)
2. Identify format requirements from user
3. Use unit conversion table (see SKILL.md)
4. Apply formatting via OOXML manipulation or python-docx
5. Pack and verify
```
## Quick Formatting via python-docx
For simple formatting tasks, python-docx is often faster than raw XML:
```python
from docx import Document as PythonDocument
from docx.shared import Pt, Twips
from docx.enum.text import WD_ALIGN_PARAGRAPH
from docx.oxml.ns import qn
doc = PythonDocument("input.docx")
# Change all body paragraph formatting (headings are skipped)
for para in doc.paragraphs:
    if para.style.name.startswith("Heading"):
        continue
    para.paragraph_format.first_line_indent = Twips(480)  # 2 chars at 12pt
    para.paragraph_format.line_spacing = 1.5
    para.paragraph_format.alignment = WD_ALIGN_PARAGRAPH.JUSTIFY
    for run in para.runs:
        run.font.name = "宋体"
        # font.name only sets the Latin font; CJK text also needs w:eastAsia
        run._element.rPr.rFonts.set(qn("w:eastAsia"), "宋体")
        run.font.size = Pt(12)  # Xiao Si 小四
doc.save("output.docx")
```
## Common Format Request Patterns
### University Thesis Formatting
Typical Chinese university thesis requirements:
```python
from docx.shared import Cm, Pt, Twips
# Margins
for section in doc.sections:
    section.top_margin = Cm(2.5)
    section.bottom_margin = Cm(2.5)
    section.left_margin = Cm(3.0)
    section.right_margin = Cm(2.5)
# Fonts
# Body: SimSun 宋体 Xiao Si 小四 (12pt)
# H1: SimHei 黑体 San Hao 三号 (16pt) centered
# H2: SimHei 黑体 Si Hao 四号 (14pt)
# H3: SimHei 黑体 Xiao Si 小四 (12pt)
# English: Times New Roman, same sizes
```
### Page Numbers Starting from Specific Page
Use multi-section approach:
```python
# Section 1: Front matter (Roman numerals)
# Section 2: Main content (Arabic, starting from 1)
# This requires OOXML manipulation — see routes/edit.md for unpack/pack workflow
```
In raw XML (`word/document.xml`):
```xml
<w:sectPr>
<w:pgNumType w:fmt="upperRoman" w:start="1"/>
</w:sectPr>
<!-- New section -->
<w:sectPr>
<w:pgNumType w:fmt="decimal" w:start="1"/>
</w:sectPr>
```
### Different Headers Per Section
Each section in a .docx can have its own header/footer. See `references/docx-js-advanced.md` for the multi-section approach.
For existing documents, modify `word/document.xml` to split `<w:sectPr>` and create separate `headerN.xml` files.
### Font Size Conversion
When user requests a Chinese font size name:
| Request | Action |
|---------|--------|
| "Change to Wu Hao (5th) size" | `font.size = Pt(10.5)` or `size: 21` in docx-js |
| "Title in San Hao SimHei" | `font.size = Pt(16)`, `font.name = "SimHei"` |
| "Body in Xiao Si SimSun" | `font.size = Pt(12)`, `font.name = "SimSun"` |
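The size names used in this skill map to fixed point values, so the conversion can be kept in one table. This is a minimal sketch; the dictionary and function names are illustrative and not part of the skill's scripts. docx-js `size` values are in half-points, hence the doubling.

```python
# Chinese font size names and their point sizes.
CN_FONT_SIZES_PT = {
    "Er Hao": 22,     # 二号
    "Xiao Er": 18,    # 小二
    "San Hao": 16,    # 三号
    "Xiao San": 15,   # 小三
    "Si Hao": 14,     # 四号
    "Xiao Si": 12,    # 小四
    "Wu Hao": 10.5,   # 五号
    "Xiao Wu": 9,     # 小五
}


def to_half_points(name: str) -> int:
    """Half-point value for docx-js `size`, e.g. Wu Hao 10.5pt -> 21."""
    return round(CN_FONT_SIZES_PT[name] * 2)
```

For python-docx, pass the point value directly, for example `Pt(CN_FONT_SIZES_PT["Xiao Si"])`.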
### Line Spacing Adjustment
```python
from docx.shared import Pt
from docx.enum.text import WD_LINE_SPACING
# 1.0x spacing
para.paragraph_format.line_spacing_rule = WD_LINE_SPACING.MULTIPLE
para.paragraph_format.line_spacing = 1.0
# 1.3x spacing (our default)
para.paragraph_format.line_spacing = 1.3
# Fixed spacing (e.g., 28pt)
para.paragraph_format.line_spacing_rule = WD_LINE_SPACING.EXACTLY
para.paragraph_format.line_spacing = Pt(28)
```
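When the same spacing has to be written as a raw `line` value (docx-js or `<w:spacing>`), the units differ by rule: proportional spacing is expressed in 240ths of a single line, fixed spacing in twips (1/20 pt). This is a small conversion sketch with illustrative helper names.

```python
def line_multiple_to_ooxml(multiple: float) -> int:
    """Proportional spacing: 1.0x = 240, so 1.3x = 312 and 1.5x = 360."""
    return round(multiple * 240)


def line_fixed_pt_to_ooxml(points: float) -> int:
    """Exact or at-least spacing: value in twips, so 22pt = 440."""
    return round(points * 20)
```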
## Verification
After formatting changes:
1. Open in LibreOffice or convert to PDF for visual check
2. Extract text with pandoc to ensure content unchanged
3. Compare file sizes (formatting-only changes shouldn't dramatically change size)

114
skills/docx/routes/read.md Executable file

@@ -0,0 +1,114 @@
# Route: Read / Analyze / Extract
## Method 1: Text Extraction via pandoc (Fastest)
```bash
# Plain text
pandoc input.docx -t plain -o output.txt
# Markdown (preserves structure)
pandoc input.docx -t markdown -o output.md
# Extract with metadata
pandoc input.docx -t markdown --standalone -o output.md
```
**Best for**: Quick content reading, text analysis, word count, searching.
## Method 2: Raw XML Access (Detailed)
```bash
mkdir work && cd work && unzip ../input.docx
# Read main content
cat word/document.xml
# Read styles
cat word/styles.xml
# List embedded media
ls word/media/
# Read headers/footers
cat word/header1.xml
cat word/footer1.xml
```
**Best for**: Analyzing formatting, extracting styles, inspecting document structure, debugging layout issues.
### Quick XML Parsing
```python
import defusedxml.ElementTree as ET
tree = ET.parse("word/document.xml")
ns = {"w": "http://schemas.openxmlformats.org/wordprocessingml/2006/main"}
# Extract all text
texts = []
for t in tree.iter("{http://schemas.openxmlformats.org/wordprocessingml/2006/main}t"):
    if t.text:
        texts.append(t.text)
full_text = "".join(texts)
# Count paragraphs
paras = tree.findall(".//w:p", ns)
print(f"Paragraphs: {len(paras)}")
# Find headings
for para in paras:
    pPr = para.find("w:pPr", ns)
    if pPr is None:
        continue
    pStyle = pPr.find("w:pStyle", ns)
    # Read the style id once; a backslash inside an f-string expression is a syntax error before Python 3.12
    style_val = pStyle.get(f"{{{ns['w']}}}val", "") if pStyle is not None else ""
    if "Heading" in style_val:
        text = "".join(t.text for t in para.iter(f"{{{ns['w']}}}t") if t.text)
        print(f"  {style_val}: {text}")
```
## Method 3: Convert to Images (Visual Analysis)
```bash
# Convert to PDF first
libreoffice --headless --convert-to pdf input.docx
# Then to images
pdftoppm -png -r 200 input.pdf page
# Generates page-1.png, page-2.png, etc.
```
**Best for**: Visual layout analysis, comparing formatting, generating previews, when user asks "what does it look like".
## Method 4: python-docx Reading
```python
from docx import Document
doc = Document("input.docx")
# Read paragraphs
for para in doc.paragraphs:
    print(f"[{para.style.name}] {para.text}")
# Read tables
for table in doc.tables:
    for row in table.rows:
        print([cell.text for cell in row.cells])
# Document properties
print(f"Sections: {len(doc.sections)}")
print(f"Paragraphs: {len(doc.paragraphs)}")
print(f"Tables: {len(doc.tables)}")
```
## Choosing the Right Method
| Need | Method |
|------|--------|
| Quick text content | pandoc |
| Document structure/outline | pandoc → markdown |
| Formatting details | Raw XML |
| Table data extraction | python-docx |
| Visual appearance | Convert to images |
| Style analysis | Raw XML (styles.xml) |
| Word/character count | pandoc → plain → wc |

783
skills/docx/scenes/academic.md Executable file

@@ -0,0 +1,783 @@
# Scene: Academic / Thesis
## Palette
**Academic Dark** (Cool + Heavy + Calm) — Academic papers use **pure black body text**. Palette only for cover decoration and minimal title scenarios.
```js
const palette = {
primary: "#000000", // Title — pure black
body: "#000000", // Body — pure black
secondary: "#333333", // Header/caption — dark grey
accent: "#8B7E5A", // Cover decoration line — cover only
surface: "#F5F7FA", // Table header light bg — three-line tables only
};
```
⚠️ **Body text color must be pure black `"000000"`**. No decorative dark-blue-grey. Academic papers require print-friendly, black-and-white clarity.
→ Placeholder convention & universal prohibitions — see `references/common-rules.md`
**Note:** This scene uses Profile A fonts with academic-specific overrides below.
---
## Page Layout
| Property | Value | Twips |
|----------|-------|-------|
| Top margin | 2.54 cm | 1440 |
| Bottom margin | 2.54 cm | 1440 |
| Left margin | 3.00 cm | 1701 |
| Right margin | 2.50 cm | 1417 |
| Header distance | 1.5 cm | 850 |
| Footer distance | 1.75 cm | 992 |
```js
page: {
size: { width: 11906, height: 16838 },
margin: { top: 1440, bottom: 1440, left: 1701, right: 1417, header: 850, footer: 992 },
}
```
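The twips column in the table above follows from 1 inch = 1440 twips = 2.54 cm. This is a minimal conversion sketch, with `cm_to_twips` as an illustrative helper name, that reproduces the table values.

```python
TWIPS_PER_INCH = 1440
CM_PER_INCH = 2.54


def cm_to_twips(cm: float) -> int:
    """Convert centimetres to twips, rounded to the nearest integer."""
    return round(cm * TWIPS_PER_INCH / CM_PER_INCH)
```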
For binding margin, add 0.5–1.0 cm to left (i.e., left: 1985–2268).
---
## Font Specifications
| Element | CN Font | EN Font | Size | half-pt | Style |
|---------|---------|---------|------|---------|-------|
| Thesis title | SimHei | Times New Roman | Xiao Er 18pt | 36 | Bold, centered |
| H1 | SimHei | Times New Roman | San Hao 16pt | 32 | Bold, centered |
| H2 | SimHei | Times New Roman | Xiao San 15pt | 30 | Bold, left |
| H3 | SimHei | Times New Roman | Si Hao 14pt | 28 | Bold, left |
| Body | SimSun | Times New Roman | Xiao Si 12pt | 24 | Normal, justified |
| Abstract title | SimHei | Times New Roman Bold | San Hao 16pt | 32 | Bold, centered |
| Abstract body | SimSun | Times New Roman | Xiao Si 12pt | 24 | Normal, justified |
| Keywords label | SimHei | Times New Roman Bold | Xiao Si 12pt | 24 | Bold |
| Keywords content | SimSun | Times New Roman | Xiao Si 12pt | 24 | Normal |
| Header | SimSun | Times New Roman | Xiao Wu 9pt | 18 | Centered, color 333333 |
| Page number | — | Times New Roman | Wu Hao 10.5pt | 21 | Centered |
| Footnote | SimSun | Times New Roman | Xiao Wu 9pt | 18 | Normal |
| Figure/table caption | SimSun | Times New Roman | Wu Hao 10.5pt | 21 | Centered |
### Paragraph Format
- Body: justified, first-line indent 2 chars (`firstLine: 480`, SimSun Xiao Si = 480 twips)
- Line spacing: 1.5x (`line: 360`); if school requires fixed 22pt, use `line: 440, lineRule: "exact"`
- Body paragraph spacing: before/after 0pt; heading spacing per styles below
```js
styles: {
default: {
document: {
run: { font: { ascii: "Times New Roman", eastAsia: "SimSun" }, size: 24, color: "000000" },
paragraph: { spacing: { line: 360 } },
},
heading1: {
run: { font: { ascii: "Times New Roman", eastAsia: "SimHei" }, size: 32, bold: true, color: "000000" },
paragraph: { alignment: AlignmentType.CENTER, spacing: { before: 480, after: 360, line: 360 } },
},
heading2: {
run: { font: { ascii: "Times New Roman", eastAsia: "SimHei" }, size: 30, bold: true, color: "000000" },
paragraph: { spacing: { before: 360, after: 240, line: 360 } },
},
heading3: {
run: { font: { ascii: "Times New Roman", eastAsia: "SimHei" }, size: 28, bold: true, color: "000000" },
paragraph: { spacing: { before: 240, after: 120, line: 360 } },
},
},
}
```
---
## Heading Numbering System (Mandatory)
### Format
| Level | Format | Example |
|-------|--------|---------|
| H1 | Chapter X + title | 第一章 绪论 (Chapter 1 Introduction) |
| H2 | X.X + section title | 1.1 Research Background |
| H3 | X.X.X + subsection | 1.1.1 Domestic Research Status |
### Mandatory Rules
1. **H1 must use "第X章" format** — not "一、", not "Chapter 1", not "第1章"
2. **H2/H3 use Arabic decimal numbering** (1.1, 1.1.1) — no "(一)", "1)"
3. **No mixing multiple numbering systems**
4. **No level-skipping** (cannot jump from H1 to H3)
5. **All body headings must use `heading: HeadingLevel.HEADING_X`** (TOC depends on this)
```js
// ✅ Correct
new Paragraph({
heading: HeadingLevel.HEADING_1,
children: [new TextRun({ text: "第一章 绪论", bold: true, size: 32, font: { eastAsia: "SimHei", ascii: "Times New Roman" } })]
})
new Paragraph({
heading: HeadingLevel.HEADING_2,
children: [new TextRun({ text: "1.1 研究背景", bold: true, size: 30, font: { eastAsia: "SimHei", ascii: "Times New Roman" } })]
})
```
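The numbering rules above can be checked before generation. This is a hedged sketch: `check_headings` and the regular expressions are illustrative, and they apply to numbered body headings only, not to unnumbered headings such as "摘 要" or "参考文献".

```python
import re

PATTERNS = {
    1: re.compile(r"^第[一二三四五六七八九十百]+章\s+\S"),  # 第一章 绪论
    2: re.compile(r"^\d+\.\d+\s+\S"),                        # 1.1 ...
    3: re.compile(r"^\d+\.\d+\.\d+\s+\S"),                   # 1.1.1 ...
}


def check_headings(headings):
    """Return a list of problems for (level, text) pairs in document order."""
    problems = []
    previous_level = 0
    for level, text in headings:
        if level > previous_level + 1:
            problems.append(f"level skip before: {text}")
        if not PATTERNS[level].match(text):
            problems.append(f"bad numbering: {text}")
        previous_level = level
    return problems
```

An empty result means the outline passes the format and level-skipping checks. It does not verify that the numbers are sequential.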
### Non-Body Headings
Abstract, Table of Contents, References, Appendices, Acknowledgments:
- Use H1 style (San Hao SimHei centered) for TOC indexing
- But **no numbering** (write directly: "摘 要", "参考文献", etc. — these are non-numbered standalone section headings)
---
## Document Structure & Multi-Section Architecture
Theses must use **multi-section structure** for independent page numbering and header/footer per section.
### Complete Structure
```
Section 1: Cover → No page number, no header/footer
Section 2: Chinese Abstract → Roman numerals starting from I
Section 3: English Abstract → Roman numerals continued
Section 4: Table of Contents → Roman numerals continued
Section 5: Body (all chapters) → Arabic numerals from 1
Section 6: References → Arabic numerals continued
Section 7: Appendices (if any) → Arabic numerals continued
Section 8: Acknowledgments (if any) → Arabic numerals continued
```
### Page Number Implementation
```js
const { NumberFormat, SectionType } = require("docx");
// Section 1: Cover — no page number
{
properties: {
page: { margin: { top: 0, bottom: 0, left: 0, right: 0 } },
titlePage: true,
},
children: buildCover(...),
}
// Section 2: Abstract, Roman numerals from I (NumberFormat.UPPER_ROMAN)
{
properties: {
type: SectionType.NEXT_PAGE,
page: {
margin: { top: 1440, bottom: 1440, left: 1701, right: 1417, header: 850, footer: 992 },
pageNumbers: { start: 1, formatType: NumberFormat.UPPER_ROMAN },
},
},
headers: { default: buildHeader("Thesis Title") },
footers: { default: buildPageNumberFooter() },
children: buildAbstractCN(...),
}
// Section 3: English Abstract — Roman numerals continued (no reset)
{
properties: {
type: SectionType.NEXT_PAGE,
page: {
margin: { top: 1440, bottom: 1440, left: 1701, right: 1417, header: 850, footer: 992 },
pageNumbers: { formatType: NumberFormat.UPPER_ROMAN }, // no start → continues from previous
},
},
headers: { default: buildHeader("Thesis Title") },
footers: { default: buildPageNumberFooter() },
children: buildAbstractEN(...),
}
// Section 5: Body — Arabic numerals from 1
{
properties: {
type: SectionType.NEXT_PAGE,
page: {
margin: { top: 1440, bottom: 1440, left: 1701, right: 1417, header: 850, footer: 992 },
pageNumbers: { start: 1, formatType: NumberFormat.DECIMAL },
},
},
headers: { default: buildHeader("Thesis Title") },
footers: { default: buildPageNumberFooter() },
children: buildMainContent(...),
}
// Section 6+: References/Appendices/Acknowledgments — Arabic continued
```
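The restart and continue behaviour of `pageNumbers.start` can be simulated to confirm the numbering plan before building the document. The sketch below is illustrative only: the page counts are made up, `page_labels` is a hypothetical helper, and it assumes the cover shows no number while the front matter uses upper-case Roman numerals as in `NumberFormat.UPPER_ROMAN`.

```python
def to_upper_roman(n: int) -> str:
    pairs = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"), (100, "C"), (90, "XC"),
             (50, "L"), (40, "XL"), (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]
    out = []
    for value, symbol in pairs:
        while n >= value:
            out.append(symbol)
            n -= value
    return "".join(out)


def page_labels(sections):
    """sections: (name, page_count, fmt, restart) tuples; fmt None = unnumbered."""
    labels = []
    counter = 0
    for _name, pages, fmt, restart in sections:
        if restart:
            counter = 0  # corresponds to pageNumbers.start = 1
        for _ in range(pages):
            if fmt is None:
                labels.append("")
                continue
            counter += 1
            labels.append(to_upper_roman(counter) if fmt == "upperRoman" else str(counter))
    return labels


THESIS = [
    ("Cover", 1, None, False),
    ("Chinese Abstract", 1, "upperRoman", True),
    ("English Abstract", 1, "upperRoman", False),
    ("Table of Contents", 2, "upperRoman", False),
    ("Body", 3, "decimal", True),
    ("References", 1, "decimal", False),
]
```

Only the first Roman section and the body section restart; every other section omits `start` and continues from the previous one.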
### Header & Footer Helpers
```js
function buildHeader(title) {
return new Header({ children: [
new Paragraph({ alignment: AlignmentType.CENTER,
border: { bottom: { style: BorderStyle.SINGLE, size: 1, color: "000000" } },
children: [new TextRun({ text: title, size: 18, color: "333333",
font: { ascii: "Times New Roman", eastAsia: "SimSun" } })],
}),
] });
}
function buildPageNumberFooter() {
return new Footer({ children: [
new Paragraph({ alignment: AlignmentType.CENTER,
children: [
new TextRun({ text: "- ", size: 21 }),
new TextRun({ children: [PageNumber.CURRENT], size: 21 }),
new TextRun({ text: " -", size: 21 }),
],
}),
] });
}
```
### Page Break Rules
- Cover is a separate section (no PageBreak needed)
- Chinese abstract, English abstract, TOC each in their own section
- All body chapters in **one section** (no forced page breaks between chapters unless user requests)
- References, appendices, acknowledgments each in their own section
- **Never use blank lines instead of section breaks**
---
## Cover
### Information Fields
Cover must include (use placeholders for missing info):
| Field | Format | Placeholder |
|-------|--------|-------------|
| University name | Er Hao SimHei, centered | ×××University |
| Thesis title (CN) | Xiao Er SimHei, centered | (user-provided) |
| Thesis title (EN) | San Hao Times New Roman, centered | (translated from CN) |
| College | Si Hao SimSun | ×××College |
| Major | Si Hao SimSun | ×××Major |
| Author | Si Hao SimSun | ××× |
| Student ID | Si Hao SimSun | ××××××× |
| Advisor | Si Hao SimSun | ×××Professor |
| Date | Si Hao SimSun | 2026/XX |
### Cover Style
Use Recipe R5 (Clean White) or academic-specific `buildAcademicCover()` — never use commercial-style covers.
### Cover Layout Order (Mandatory)
The visual order on academic covers must follow this hierarchy from top to bottom:
1. School name (top)
2. Document type label (e.g., "Undergraduate Thesis", "Thesis Proposal Report")
3. **Thesis title** (prominent, centered)
4. Thesis English title (if bilingual)
5. **Author information table** (college, major, author, student ID, advisor)
6. Date (bottom)
⚠️ **Title MUST appear ABOVE the author info table.** An info table that renders above the title is caused by incorrect element ordering. The `buildAcademicCover()` and `buildProposalCover()` functions below enforce correct order.
⚠️ **Layout must be vertically balanced** — use dynamic spacing to distribute whitespace evenly. Do not cram all elements into the top half or let large gaps appear between elements.
```js
function buildAcademicCover(info) {
const { school, title, titleEN, college, major, author, studentId, advisor, date } = info;
// ⚠️ Use safeText() for all values — never output "undefined"
const infoRows = [
["College", safeText(college, "【College】")],
["Major", safeText(major, "【Major】")],
["Author", safeText(author, "【Author】")],
["Student ID", safeText(studentId, "【Student ID】")],
["Advisor", safeText(advisor, "【Advisor】")],
];
const infoTable = new Table({
width: { size: 60, type: WidthType.PERCENTAGE },
alignment: AlignmentType.CENTER,
borders: { top: NB, bottom: NB, left: NB, right: NB, insideHorizontal: NB, insideVertical: NB },
rows: infoRows.map(([label, value]) => new TableRow({
cantSplit: true,
children: [
new TableCell({
width: { size: 35, type: WidthType.PERCENTAGE },
borders: { bottom: { style: BorderStyle.SINGLE, size: 1, color: "000000" }, top: NB, left: NB, right: NB },
margins: { top: 60, bottom: 60, left: 120, right: 120 },
children: [new Paragraph({
alignment: AlignmentType.RIGHT,
children: [new TextRun({ text: label + ":", size: 28, font: { eastAsia: "SimHei", ascii: "Times New Roman" } })],
})],
}),
new TableCell({
borders: { bottom: { style: BorderStyle.SINGLE, size: 1, color: "000000" }, top: NB, left: NB, right: NB },
margins: { top: 60, bottom: 60, left: 120, right: 120 },
children: [new Paragraph({
alignment: AlignmentType.CENTER,
children: [new TextRun({ text: value, size: 28, font: { eastAsia: "SimSun", ascii: "Times New Roman" } })],
})],
}),
],
})),
});
// ⚠️ Correct order: school → doc type → TITLE → info table → date
// ★ Rule 8: All large-font paragraphs must set explicit line spacing
return [
new Paragraph({ alignment: AlignmentType.CENTER, spacing: { before: 1200, after: 400, line: Math.ceil(22 * 23), lineRule: "atLeast" },
children: [new TextRun({ text: safeText(school, "【University Name】"), size: 44, bold: true, font: { eastAsia: "SimHei" } })] }),
new Paragraph({ alignment: AlignmentType.CENTER, spacing: { after: 800, line: Math.ceil(18 * 23), lineRule: "atLeast" },
children: [new TextRun({ text: "Undergraduate Thesis", size: 36, font: { eastAsia: "SimHei" } })] }),
new Paragraph({ alignment: AlignmentType.CENTER, spacing: { after: 200, line: Math.ceil(18 * 23), lineRule: "atLeast" },
children: [new TextRun({ text: safeText(title, "【Thesis Title】"), size: 36, bold: true, font: { eastAsia: "SimHei", ascii: "Times New Roman" } })] }),
titleEN ? new Paragraph({ alignment: AlignmentType.CENTER, spacing: { after: 1200, line: Math.ceil(16 * 23), lineRule: "atLeast" },
children: [new TextRun({ text: titleEN, size: 32, font: { ascii: "Times New Roman" } })] })
: new Paragraph({ spacing: { after: 1200 }, children: [] }),
infoTable,
new Paragraph({ alignment: AlignmentType.CENTER, spacing: { before: 1200, line: Math.ceil(14 * 23), lineRule: "atLeast" },
children: [new TextRun({ text: safeText(date, "2026/XX"), size: 28, font: { eastAsia: "SimSun" } })] }),
];
}
```
### Thesis Proposal Report Cover (开题报告)
Thesis proposal reports use a similar cover layout but with different document type label. The key layout rule is the same: **title above author info, evenly spaced**.
⚠️ **CRITICAL — Proposal cover MUST be an independent section:**
The proposal cover MUST be placed in its **own section** (with margin: 0 and a 16838 wrapper table), completely separate from the body content. The body content starts in the **next section** (with `SectionType.NEXT_PAGE` or as a separate section entry). **Never place the cover elements and body content in the same section** — this causes them to render on the same page without any page break, which is the #1 proposal report formatting failure.
```js
// ✅ Correct — cover and body in separate sections
sections: [
{
properties: { page: { margin: { top: 0, bottom: 0, left: 0, right: 0 } } },
children: buildProposalCover(info), // standalone cover section
},
{
properties: { page: { margin: { top: 1440, bottom: 1440, left: 1701, right: 1417 } } },
children: [...bodyContent], // body starts here
},
]
// ❌ WRONG — cover and body in same section (no page separation!)
sections: [
{
children: [...coverElements, ...bodyContent], // everything on one continuous flow
},
]
```
```js
function buildProposalCover(info) {
const { school, year, title, subtitle, college, major, author, studentId, advisor, date } = info;
// ⚠️ Use safeText() for all values
const infoRows = [
["姓名 (Name)", safeText(author, "XXX")],
["专业 (Major)", safeText(major, "XXX")],
["入学时间 (Enrollment)", safeText(info.enrollment, "XXX")],
];
const infoTable = new Table({
width: { size: 60, type: WidthType.PERCENTAGE },
alignment: AlignmentType.CENTER,
borders: { top: NB, bottom: NB, left: NB, right: NB, insideHorizontal: NB, insideVertical: NB },
rows: infoRows.map(([label, value]) => new TableRow({
children: [
new TableCell({
width: { size: 35, type: WidthType.PERCENTAGE },
borders: { bottom: { style: BorderStyle.SINGLE, size: 1, color: "000000" }, top: NB, left: NB, right: NB },
margins: { top: 60, bottom: 60, left: 120, right: 120 },
children: [new Paragraph({
alignment: AlignmentType.CENTER,
children: [new TextRun({ text: label, size: 28, bold: true, font: { eastAsia: "SimHei", ascii: "Times New Roman" } })],
})],
}),
new TableCell({
borders: { bottom: { style: BorderStyle.SINGLE, size: 1, color: "000000" }, top: NB, left: NB, right: NB },
margins: { top: 60, bottom: 60, left: 120, right: 120 },
children: [new Paragraph({
alignment: AlignmentType.CENTER,
children: [new TextRun({ text: value, size: 28, font: { eastAsia: "SimSun", ascii: "Times New Roman" } })],
})],
}),
],
})),
});
// ⚠️ Correct order: doc type label → info table → "论文题目" label → TITLE → subtitle
// Layout balanced: upper 40% for header + info, middle 20% for title, lower 40% for whitespace
// ★ Rule 8: All large-font paragraphs must set explicit line spacing
return [
new Paragraph({ alignment: AlignmentType.CENTER, spacing: { before: 1500, after: 600, line: Math.ceil(18 * 23), lineRule: "atLeast" },
children: [new TextRun({ text: safeText(year, "2025") + " 届本科毕业论文开题报告",
size: 36, bold: true, font: { eastAsia: "SimHei", ascii: "Times New Roman" } })] }),
infoTable,
new Paragraph({ spacing: { before: 1200 } }), // Balanced whitespace
new Paragraph({ alignment: AlignmentType.CENTER, spacing: { after: 200 },
children: [new TextRun({ text: "论文题目", size: 28, font: { eastAsia: "SimSun", ascii: "Times New Roman" } })] }),
new Paragraph({ alignment: AlignmentType.CENTER, spacing: { after: 200, line: Math.ceil(16 * 23), lineRule: "atLeast" },
children: [new TextRun({ text: safeText(title, "【Thesis Title】"), size: 32, bold: true,
font: { eastAsia: "SimHei", ascii: "Times New Roman" } })] }),
subtitle ? new Paragraph({ alignment: AlignmentType.CENTER, spacing: { after: 800 },
children: [new TextRun({ text: "——" + subtitle, size: 28,
font: { eastAsia: "SimSun", ascii: "Times New Roman" } })] })
: new Paragraph({ spacing: { after: 800 }, children: [] }),
];
}
```
### ⚠️ WPS Compatibility Notes for Academic Covers
Both thesis cover and proposal cover use info tables. These MUST follow the cross-engine rules:
- Table uses **percentage widths** (`WidthType.PERCENTAGE`), NOT DXA — WPS renders DXA widths differently in nested contexts
- Table width: adaptive 55–75%, centered via `alignment: CENTER` (calculated by `calcR5MetaLayout()`)
- Label column: **LEFT aligned**, plain text + "", NO full-width space padding, NO borders
- Value column: **LEFT aligned**, `bottom: single sz=4` border = fixed-length underline
- Cell `margins.top/bottom: 60` is acceptable (small values) but avoid larger values
- All paragraphs with font size > 12pt (body) must set `spacing: { line: Math.ceil(fontPt * 23), lineRule: "atLeast" }` to prevent top clipping (Rule 8)
- ⚠️ Do NOT use DXA widths, full-width space padding (`\u3000`), tab stops, or right-alignment for meta info
⚠️ **Proposal cover must fit on one page.** Use the same height-budget approach as commercial covers — total content height must stay within 15638 twips (1200 twips safety margin). If the title is very long, reduce font size (minimum 24pt).
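The Rule 8 line-spacing formula in the notes above can be wrapped in a small helper so that every large-font paragraph gets the same treatment. This is a minimal sketch under stated assumptions: `largeFontSpacing` and `BODY_FONT_PT` are hypothetical names that are not part of the skill's scripts, and the 12pt threshold is taken from the note above.

```js
// Hypothetical helper (not part of the skill's scripts): builds the paragraph
// spacing that Rule 8 asks for when the font is larger than body text.
const BODY_FONT_PT = 12; // body size named in the note above

function largeFontSpacing(fontPt, extra = {}) {
  // Body-size text or smaller: keep the caller's spacing untouched.
  if (fontPt <= BODY_FONT_PT) return { ...extra };
  // Larger text: add an explicit minimum line height to prevent top clipping.
  return { ...extra, line: Math.ceil(fontPt * 23), lineRule: "atLeast" };
}

// TextRun `size` is in half-points, so an 18pt title uses size: 36.
const titleSpacing = largeFontSpacing(18, { before: 1500, after: 600 });
console.log(titleSpacing.line); // 414
```

The returned object is meant to be passed as a paragraph's `spacing` option, as in the cover examples above.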
---
## Section Content Standards
### Chinese Abstract
**Format:**
- Title: "摘 要" (space in middle), San Hao SimHei centered, H1 style
- Body: Xiao Si SimSun, justified, first-line indent 480 twips
- Keywords: "关键词:" SimHei bold + content SimSun normal, 3–8 keywords, semicolon-separated
**Content structure (mandatory):**
1. Research background (1–2 sentences)
2. Research problem/purpose (1 sentence)
3. Research method (1–2 sentences)
4. Main results/findings (2–3 sentences)
5. Research significance/value (1 sentence)
⚠️ **Abstract is NOT a TOC summary.** Must not read as "Chapter 1 introduces... Chapter 2 analyzes..."
### English Abstract
- Title: "Abstract", San Hao Times New Roman Bold, centered, H1 style
- Body: Xiao Si Times New Roman, justified
- Keywords: bold label + normal content, 3–8 keywords, comma-separated
- **Must be consistent with Chinese abstract** — no significant shrinkage
- Use formal academic English, avoid Chinglish
### Table of Contents
- Title: "目 录", San Hao SimHei centered
- Use `TableOfContents` field for auto-generation, display at least H1–H2, recommend H3
- Run `"$DOCX_SCRIPTS/add_toc_placeholders.py" --auto` after generation
- TOC on its own page
---
## Body Chapter Structure
### Standard Structure (6-chapter)
```
Chapter 1: Introduction
1.1 Research Background
1.2 Research Purpose & Significance
1.3 Literature Review (Domestic & International)
1.4 Research Content & Methods
1.5 Thesis Structure
Chapter 2: Theoretical Framework & Literature Review
2.1 Core Concept Definitions
2.2 Theoretical Basis
2.3 Literature Review
2.4 Research Gap & Entry Point
Chapter 3: Research Design / Method / Model
3.1 Research Framework
3.2 Method Design / System Architecture / Algorithm
3.3 Variables / Data Sources / Experimental Environment
Chapter 4: Empirical Analysis / Case Study / Results
4.1 Data Analysis / Case Description / Experiment Process
4.2 Results Presentation
4.3 Results Interpretation
Chapter 5: Discussion
5.1 Key Findings
5.2 Comparison with Existing Research
5.3 Limitations
Chapter 6: Conclusions & Outlook
6.1 Research Conclusions
6.2 Contributions
6.3 Limitations
6.4 Future Research Directions
```
### Chapter Content Requirements
**Chapter 1 (Introduction):** Must state background, purpose, significance, methods, content, structure.
**Chapter 2 (Literature Review):** Must be systematically organized by theme/method/stage — **never a chronological dump of papers**. Must identify contributions, gaps, and research opportunities.
**Chapter 3 (Method):** Must explain why this method was chosen and its rationale. Content must be understandable, executable, reproducible.
**Chapter 4 (Results):** Must be specific, not vague. Must be consistent with Chapter 3 design.
**Chapter 5 (Discussion):** Must not merely repeat Chapter 4 results. Must explain what results mean and what conclusions they support.
**Chapter 6 (Conclusions):** Must summarize concisely, state contributions, acknowledge limitations, propose future directions. Must end formally — no abrupt ending.
---
## Discipline-Adaptive Routing
Auto-adjust research methods and chapter emphasis by discipline. **When user doesn't specify method, choose the most appropriate research paradigm for the discipline — never mechanically apply "empirical + survey + regression" template.**
### 1. Humanities & Social Sciences (Literature, History, Philosophy, Arts)
**Preferred methods:** Literature analysis, theoretical research, text analysis, comparative studies, historical research
**Adjustments:** Ch.2 focuses on theoretical lineage; Ch.4 becomes text analysis/case argumentation; minimize "variables", "hypotheses", "regression" terminology
### 2. Management / Economics / Public Administration
**Preferred methods:** Case analysis, surveys, model analysis, institutional research, empirical research
**Adjustments:** Ch.3 focuses on hypotheses, variables, framework; Ch.4 on data collection & analysis; Ch.5 adds management implications/policy recommendations
### 3. Computer Science / Engineering / IT
**Preferred methods:** Method design, system architecture, experimental comparison, performance evaluation, algorithm analysis
**Adjustments:** Ch.3 becomes system/algorithm design; Ch.4 becomes experiments (environment, parameters, control experiments, metric comparison); minimize "interviews", "surveys"
### 4. Education / Linguistics / Communication
**Preferred methods:** Teaching experiments, text analysis, survey research, interview research, case studies
**Adjustments:** Ch.3 focuses on subjects, dimensions, samples; Ch.4 on teaching practice/communication case analysis; Ch.5 adds educational implications/communication strategies
### 5. Law / Marxism / Policy Studies
**Preferred methods:** Normative analysis, statutory interpretation, case studies, institutional comparison, theoretical analysis
**Adjustments:** Ch.2 focuses on legal/policy framework; Ch.4 becomes case analysis/institutional comparison; Ch.5 focuses on normative evaluation, reform recommendations
---
## Figure/Table/Formula Numbering (By Chapter)
### Numbering Rules
| Type | Format | Example |
|------|--------|---------|
| Figure | Figure X-Y | Figure 3-1, Figure 4-2 |
| Table | Table X-Y | Table 2-1, Table 4-3 |
| Formula | Eq. (X-Y) | Eq. (3-1), Eq. (5-2) |
Where X = chapter number, Y = sequential number within chapter.
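The per-chapter numbering scheme can be sketched as a small counter. This is an illustrative example only: `createChapterNumbering` is a hypothetical name, and the label strings follow the Format column of the table above.

```js
// Hypothetical counter (illustrative only): numbers figures, tables, and
// formulas per chapter, so Y restarts at 1 for each chapter X.
function createChapterNumbering() {
  const counters = new Map(); // key "type:chapter" -> last Y issued
  const formats = {
    figure: (x, y) => `Figure ${x}-${y}`,
    table: (x, y) => `Table ${x}-${y}`,
    formula: (x, y) => `Eq. (${x}-${y})`,
  };
  return function next(type, chapter) {
    if (!Object.hasOwn(formats, type)) throw new Error(`Unknown type: ${type}`);
    const key = `${type}:${chapter}`;
    const y = (counters.get(key) ?? 0) + 1;
    counters.set(key, y);
    return formats[type](chapter, y);
  };
}

const nextLabel = createChapterNumbering();
console.log(nextLabel("figure", 3));  // Figure 3-1
console.log(nextLabel("figure", 3));  // Figure 3-2
console.log(nextLabel("table", 3));   // Table 3-1
console.log(nextLabel("formula", 3)); // Eq. (3-1)
console.log(nextLabel("figure", 4));  // Figure 4-1
```

Keeping a separate counter per type and chapter means a new chapter never needs an explicit reset.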
### Figures
- Caption **below** figure, Wu Hao SimSun, centered
- Format: "Figure X-Y Description"
- Must be referenced in text: "as shown in Figure 3-1"
```js
new Paragraph({ alignment: AlignmentType.CENTER,
children: [new ImageRun({ data: imgBuf, transformation: { width: w, height: h }, type: "png" })] }),
new Paragraph({ alignment: AlignmentType.CENTER, spacing: { before: 60, after: 200 },
children: [new TextRun({ text: "图3-1 System Architecture", size: 21,
font: { eastAsia: "SimSun", ascii: "Times New Roman" } })] }),
```
### Tables
- Caption **above** table, Wu Hao SimSun, centered, `keepNext: true`
- Format: "Table X-Y Description"
- Must use three-line table (mandatory for academic papers)
- Must be referenced in text: "as shown in Table 2-1"
### Formulas
- Formula centered, number **right-aligned**
- Use Tab for center + right alignment
- Text reference: "from Eq. (3-1)"
```js
new Paragraph({
alignment: AlignmentType.CENTER,
tabStops: [
{ type: TabStopType.CENTER, position: 4500 },
{ type: TabStopType.RIGHT, position: 9000 },
],
children: [
new TextRun({ text: "\t" }),
new TextRun({ text: "E = mc²" }),
new TextRun({ text: "\t(3-1)" }),
],
}),
```
### Mandatory Rules
1. Figures/tables/formulas **must be referenced in text** — never placed without explanation
2. Must have introductory and analytical text before/after
3. Must not exceed page margins
4. Insert only when analytically valuable — not for decoration
---
## Citation & Reference System
### In-Text Citation (Sequential Numbering)
Default: **GB/T 7714 sequential numbering**: `[1]`, `[2]` in text, references listed in order of appearance.
```js
new TextRun({ text: "[1]", superScript: true, size: 18, font: { ascii: "Times New Roman" } })
```
### Citation Rules
1. In-text numbers must **correspond one-to-one** with reference list
2. **Same source reused keeps the same number**
3. **Do not mix footnote citations and endnote references** (unless user explicitly requests)
4. Footnotes are for supplementary notes only, not primary citations
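Rules 1 and 2 above amount to a first-appearance registry. The sketch below is one possible way to keep in-text numbers and the reference list in step. `CitationRegistry` and the source keys are hypothetical and stand in for whatever identifiers the generating script uses.

```js
// Hypothetical registry (illustrative only): assigns sequential numbers in
// order of first citation and reuses the number when a source is cited again.
class CitationRegistry {
  constructor() {
    this.numbers = new Map(); // source key -> citation number
    this.order = [];          // source keys in order of first appearance
  }

  cite(sourceKey) {
    if (!this.numbers.has(sourceKey)) {
      this.order.push(sourceKey);
      this.numbers.set(sourceKey, this.order.length);
    }
    return `[${this.numbers.get(sourceKey)}]`;
  }

  // Keys in reference-list order: entry i is numbered i + 1.
  referenceOrder() {
    return [...this.order];
  }
}

const registry = new CitationRegistry();
console.log(registry.cite("sourceA")); // [1]
console.log(registry.cite("sourceB")); // [2]
console.log(registry.cite("sourceA")); // [1]
console.log(registry.referenceOrder().join(", ")); // sourceA, sourceB
```

Building the reference list from `referenceOrder()` rather than from a separately maintained array is what keeps the one-to-one correspondence.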
### Reference Format (GB/T 7714)
```
[1] Author. Title[J]. Journal, Year, Vol(No): Pages.
[2] Author. Book Title[M]. Place: Publisher, Year: Pages.
[3] Author. Title[D]. Location: Institution, Year.
[4] Author. Title[EB/OL]. (Published)[Cited]. URL.
```
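The four patterns above can be filled from structured data. This is a sketch only: `formatReference` and the field names (`volume`, `issue`, `published`, `cited`, and so on) are assumptions, and the sample values are placeholders rather than a real publication.

```js
// Hypothetical formatter (illustrative only): fills the four patterns listed
// above. All field names are assumptions made for this example.
const referencePatterns = {
  J: (r) => `${r.author}. ${r.title}[J]. ${r.journal}, ${r.year}, ${r.volume}(${r.issue}): ${r.pages}.`,
  M: (r) => `${r.author}. ${r.title}[M]. ${r.place}: ${r.publisher}, ${r.year}: ${r.pages}.`,
  D: (r) => `${r.author}. ${r.title}[D]. ${r.place}: ${r.institution}, ${r.year}.`,
  "EB/OL": (r) => `${r.author}. ${r.title}[EB/OL]. (${r.published})[${r.cited}]. ${r.url}.`,
};

function formatReference(number, ref) {
  if (!Object.hasOwn(referencePatterns, ref.type)) {
    throw new Error(`Unsupported reference type: ${ref.type}`);
  }
  return `[${number}] ${referencePatterns[ref.type](ref)}`;
}

// Placeholder values only; not a real publication.
console.log(formatReference(1, {
  type: "J", author: "Author", title: "Title", journal: "Journal",
  year: 2024, volume: 59, issue: 3, pages: "45-62",
}));
// [1] Author. Title[J]. Journal, 2024, 59(3): 45-62.
```

A real generator would also need to validate that every field is present, since a missing field would otherwise print as `undefined`.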
### Reference Formatting
```js
// Reference title — H1 style
new Paragraph({ heading: HeadingLevel.HEADING_1, alignment: AlignmentType.CENTER,
children: [new TextRun({ text: "References", bold: true, size: 32, font: { eastAsia: "SimHei" } })] }),
// Each entry — hanging indent
new Paragraph({
indent: { left: 420, hanging: 420 },
spacing: { line: 360 },
children: [new TextRun({ text: "[1] Author. Title[J]. Journal, 2024, 59(3): 45-62.",
size: 21, font: { eastAsia: "SimSun", ascii: "Times New Roman" } })],
}),
```
### Reference Count Guidelines
| Thesis Type | Suggested Count |
|------------|----------------|
| Course paper (3000–5000 words) | 10–15 |
| Undergraduate thesis | 15–30 |
| Master's thesis | 40–80 |
| Doctoral dissertation | 80–150 |
If user specifies APA, MLA, Chicago, or school-specific format, follow that instead.
---
## Three-Line Table (Mandatory for Academic Papers)
All tables in academic papers **must use three-line tables** — no full-border tables.
```js
const threeLineTable = new Table({
width: { size: 100, type: WidthType.PERCENTAGE },
borders: {
top: { style: BorderStyle.SINGLE, size: 4, color: "000000" },
bottom: { style: BorderStyle.SINGLE, size: 4, color: "000000" },
left: { style: BorderStyle.NONE }, right: { style: BorderStyle.NONE },
insideHorizontal: { style: BorderStyle.NONE }, insideVertical: { style: BorderStyle.NONE },
},
rows: [
new TableRow({
tableHeader: true, cantSplit: true,
children: headerCells.map(text => new TableCell({
borders: { bottom: { style: BorderStyle.SINGLE, size: 2, color: "000000" },
top: { style: BorderStyle.NONE }, left: { style: BorderStyle.NONE }, right: { style: BorderStyle.NONE } },
margins: { top: 60, bottom: 60, left: 120, right: 120 },
children: [new Paragraph({ alignment: AlignmentType.CENTER,
children: [new TextRun({ text, bold: true, size: 21, font: { eastAsia: "SimSun", ascii: "Times New Roman" } })] })],
})),
}),
...dataRows, // All borders NONE
],
});
```
---
## Content Quality Constraints (Mandatory)
### Truthfulness & Conservatism
1. **Never fabricate** unverifiable statistics, survey response counts, significance levels, interview subject identities, experimental precision, government document numbers
2. **Never invent** non-existent classic theories, authoritative scholar opinions, regulation names, core data sources
3. When user provides no real data → prefer **theoretical analysis, literature research, case studies, comparative analysis** (low-risk methods)
4. If example data must be constructed → keep scale reasonable, results conservative; never produce "significantly superior" or "dramatically improved" high-risk claims
5. Research conclusions must be **restrained** — do not overstate contributions, effects, or applicability
6. Research limitations must be **honestly disclosed**
### Language Style
1. Formal academic register throughout
2. **Forbidden:** "I think", "everyone knows", "obviously", "it is well known" (subjective expressions)
3. **Forbidden:** Sloganeering, propaganda, advertising-style expressions
4. On first occurrence, a Chinese technical term should be followed by its English original
5. CN/EN punctuation, spacing, and number formats must be consistent throughout
### Structural Consistency
1. Abstract, body, and conclusions **must be consistent** — no self-contradiction
2. Must form complete loop: "research question → method → analysis → findings → conclusions & outlook"
3. Terminology consistent throughout — no concept drift
4. All chapters balanced and substantive — no padding
### Document Cleanliness
1. **No residual** comments, tracked changes, field codes, template default text
2. **No** "TBD", "omitted", "user modifies", "insert figure here" expressions
3. **No** Markdown syntax, HTML tags, code blocks wrapping body text
4. **No** consecutive blank lines, abnormal page breaks, chaotic numbering
5. Final document must be clean, well-formatted, ready for submission
---
## School Standard Override Rule
⚠️ **When user specifies school/journal-specific format requirements, those requirements OVERRIDE all defaults above.**
Common override items:
- Margins (binding margin left 3.5 cm common)
- Body font (some schools require FangSong)
- Line spacing (some schools require fixed 28pt)
- Cover layout (varies significantly by school)
- Reference format (APA, MLA, etc.)
- Heading numbering (some schools use "1", "2" instead of "Chapter 1", "Chapter 2")
### Common Variants
| Thesis Type | Common Differences |
|------------|-------------------|
| Top universities | Strict GB/T 7714, often require STXiaoBiaoSong cover |
| Regular undergraduate | More flexible, SimSun/SimHei sufficient |
| Master's thesis | Requires English abstract, longer lit review, innovation statement |
| Doctoral dissertation | Requires innovation statement, publication list, originality declaration |
---
## Scene-Specific Quality Checks
In addition to universal checks (see `references/common-rules.md`):
### Structure & Content
- [ ] Cover, abstract, English abstract, TOC, body, references all present
- [ ] Cover info complete (school/title/EN title/college/major/name/ID/advisor/date)
- [ ] Abstract contains 5 elements: background + problem + method + results + significance
- [ ] English abstract consistent with Chinese abstract
- [ ] All chapters balanced, substantive, logical loop complete
- [ ] Literature review is thematic, not chronological dump
- [ ] Conclusions respond to research questions
### Format & Layout
- [ ] Heading numbering consistent (Chapter X / X.X / X.X.X), no mixing
- [ ] All body headings use `heading: HeadingLevel.HEADING_X`
- [ ] Body text pure black `"000000"`
- [ ] Three-line tables used consistently (no full-border tables)
- [ ] Figure captions below, table captions above, numbered by chapter
- [ ] Formulas centered, numbers right-aligned
- [ ] In-text citations match reference list one-to-one
- [ ] References use hanging indent, consistent format
- [ ] Page numbers: front matter Roman, body Arabic from 1
- [ ] Cover has no page number
- [ ] Headers formal and concise
- [ ] No extra blank pages
### Cleanliness
- [ ] No comment/revision residuals
- [ ] No "TBD" / "omitted" expressions
- [ ] No Markdown/HTML/code block residuals
- [ ] No consecutive blank lines or abnormal page breaks
- [ ] No fabricated high-risk data or exaggerated conclusions

463
skills/docx/scenes/contract.md Executable file

@@ -0,0 +1,463 @@
# Scene: Contract / Agreement
## Goal
Generate a complete, formal, well-structured legal document with clear clauses, rigorous logic, and proper formatting. Must simultaneously meet:
- Complete structure, clear clauses, formal language, explicit responsibilities
- Identifiable risk boundaries, proper Word formatting
- Ready for review, revision, circulation, or signing preparation
**Forbidden:** Producing outlines-only / sample clauses / drafting advice / risk summaries; outputting chat-style explanations or filler phrases.
→ Font profile: **A (Formal)** — see `references/common-rules.md`
→ Default layout: standard margins — see `references/common-rules.md`
→ Placeholder convention & universal prohibitions — see `references/common-rules.md`
---
## Contract Type Routing
```js
function selectContractType(keywords, topic) {
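// Case-insensitive matching so "nda" and "Framework" route the same as "NDA" and "framework";
// \b keeps "NDA" from matching inside unrelated words.
if (/confidential|\bNDA\b|non-disclosure/i.test(keywords)) return "nda";
if (/transfer|equity|asset|rights/i.test(keywords)) return "transfer";
if (/framework|strategic|cooperation agreement/i.test(keywords)) return "framework";
if (/terms|platform rules|user agreement|privacy/i.test(keywords)) return "terms";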
return "bilateral"; // default: bilateral commercial contract
}
```
### 5 Contract Types
| Type | Use Case | Structure Focus |
|------|----------|----------------|
| bilateral | Service/sale/development/procurement contracts | Subject → Consideration → Performance → Acceptance → Breach → Dispute |
| transfer | Equity/debt/asset/rights transfer | Subject → Consideration → Closing & Registration → Representations → Tax |
| nda | Non-disclosure agreements | Definition of Confidential Info → Obligations → Use Restrictions → Exceptions → Duration |
| framework | Cooperation framework / strategic alliance | Scope → Division of Work → Mechanism → Subsequent Agreements |
| terms | Platform rules / Terms of Service / User agreements | Definitions → Services → Rights & Obligations → Liability Limits → Amendments |
---
## Standard Template Structures
### Template A: Bilateral Commercial Contract
1. Header (title, contract number, date, location)
2. Party Information (Party A, Party B)
3. Recitals ("Whereas" clauses)
4. Definitions & Interpretation
5. Subject Matter & Scope of Services/Delivery
6. Contract Price & Payment Terms
7. Rights & Obligations of Both Parties
8. Timeline, Delivery & Acceptance
9. Invoicing, Tax & Settlement
10. Intellectual Property & Confidentiality
11. Representations & Warranties (if applicable)
12. Liability for Breach
13. Force Majeure
14. Termination & Dissolution
15. Notices & Service
16. Dispute Resolution
17. Miscellaneous
18. Signature Block
### Template B: Rights Transfer Agreement
1. Header & Parties
2. Recitals
3. Definitions & Interpretation
4. Subject of Transfer
5. Consideration & Payment Arrangement
6. Closing & Registration/Transfer
7. Representations & Warranties
8. Tax Allocation
9. Liability for Breach
10. Dispute Resolution
11. Miscellaneous
12. Signature Block
### Template C: Non-Disclosure Agreement (NDA)
1. Header & Parties
2. Recitals
3. Definition of Confidential Information
4. Confidentiality Obligations
5. Use Restrictions
6. Return, Deletion & Destruction of Information
7. Exceptions
8. Confidentiality Period
9. Liability for Breach
10. Dispute Resolution
11. Miscellaneous
12. Signature Block
### Template D: Framework / Cooperation Agreement
1. Header & Parties
2. Recitals
3. Purpose & Principles
4. Scope of Cooperation
5. Division of Work & Responsibilities
6. Project Advancement Mechanism
7. Commercial Arrangements / Subsequent Agreements
8. Confidentiality, IP & Compliance
9. Term, Amendment & Termination
10. Liability for Breach
11. Dispute Resolution
12. Miscellaneous
13. Signature Block
### Template E: Unilateral Terms / Platform Rules
1. Document Title
2. Definitions & Scope
3. Service/Rule Content
4. User Rights & Obligations / Platform Rights & Obligations
5. Liability Limitations & Disclaimers
6. Fees & Payment (if applicable)
7. Intellectual Property
8. Termination, Suspension & Amendment
9. Notices & Service
10. Dispute Resolution
11. Miscellaneous
**Note:** Unilateral/boilerplate terms require special attention to adhesion clause risks — avoid creating extremely one-sided documents.
**If the user provides an existing template, historical agreement, or company standard, always follow it first.**
---
## Input Recognition & Completion
### Processing Rules
1. If user provides a template, historical agreement, or company standard → **always follow it first**
2. If information is incomplete, fill conservatively — must be **restrained, natural, professional, consistent with transaction logic**
3. **Never fabricate** unrealistic commercial terms, regulatory requirements, approval conclusions, qualification status, tax treatment results, payment facts, or performance facts
4. If critical info is missing → use standardized placeholders
5. If user does not specify jurisdiction → default to PRC commercial writing conventions, but avoid making specific legal conclusions
---
## Legal Writing Standards
### Register
1. Use formal legal document register
2. Use clear party designations: "Party A", "Party B", "both parties", "either party", "non-breaching party", "breaching party"
3. **Forbidden:** Colloquial expressions ("you", "me", "they", "pay up", "cancel the contract", "handle ASAP")
4. Preferred terms: "pay consideration", "perform obligations", "constitute a breach", "terminate the contract", "assume liability for damages", "written notice", "deliver and accept", "representations and warranties"
### Precision
1. Eliminate vague adjectives: avoid "quality", "reasonable", "enormous", "appropriate", "ASAP" unless necessary for legal flexibility
2. Each obligation must specify: who, when, how, what
3. Consistent legal phrasing:
- Mandatory obligation → "shall"
- Right authorization → "has the right to"
- Prohibition → "shall not"
- Discretionary → "may"
4. Amounts, dates, percentages, deadlines, business days vs. calendar days must be as specific as possible
### Clear Subjects
1. Every clause must have an explicit responsible party — avoid vague subjects ("relevant parties", "relevant personnel", "when necessary")
2. Joint obligations: explicitly write "both parties agree" or "both parties shall"
3. Unilateral obligations: explicitly write "Party A shall" or "Party B shall"
---
## Transaction Closure & Risk Control
A contract must not only describe the transaction — it must ensure logical closure. Check the following:
1. If a performance deadline is specified → specify consequences of delay
2. If payment milestones are specified → specify payment conditions, method, invoice requirements
3. If a delivery obligation exists → specify delivery standards, method, acceptance rules, objection period
4. If termination rights exist → specify conditions, notice, effective date, post-termination settlement
5. If breach liability exists → must correspond to main obligations in preceding clauses
6. If IP/technology/data/trade secrets are involved → separately address ownership, license scope, use restrictions
7. If confidentiality obligations exist → define scope, exceptions, duration, breach consequences
8. If force majeure clause exists → specify notice obligation, mitigation duty, subsequent negotiation mechanism
9. If notice/service arrangements exist → specify address, contact person, email, or other delivery method
10. If user requests significantly one-sided adhesion/disclaimer clauses → add a note near the clause:
`[Note: This clause may involve adhesion terms or liability limitations. Manual review recommended for the specific transaction.]`
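The closure checklist above is essentially a rule table: if a feature is present, certain details must accompany it. A minimal sketch of that idea follows. `closureRules`, `findClosureGaps`, and all key names are hypothetical, and only a subset of the ten checks is encoded.

```js
// Hypothetical rule table (illustrative only): maps a clause feature to the
// details that the checklist above says must accompany it.
const closureRules = {
  performanceDeadline: ["delayConsequences"],
  paymentMilestones: ["paymentConditions", "paymentMethod", "invoiceRequirements"],
  deliveryObligation: ["deliveryStandards", "deliveryMethod", "acceptanceRules", "objectionPeriod"],
  terminationRights: ["conditions", "notice", "effectiveDate", "postTerminationSettlement"],
  confidentiality: ["scope", "exceptions", "duration", "breachConsequences"],
  forceMajeure: ["noticeObligation", "mitigationDuty", "negotiationMechanism"],
};

// contract: { [feature]: { [detail]: true } } for each feature that is present
function findClosureGaps(contract) {
  const gaps = [];
  for (const [feature, required] of Object.entries(closureRules)) {
    const provided = contract[feature];
    if (!provided) continue; // feature absent, so nothing to close
    for (const detail of required) {
      if (!provided[detail]) gaps.push(`${feature}: missing ${detail}`);
    }
  }
  return gaps;
}

const gaps = findClosureGaps({
  performanceDeadline: { delayConsequences: true },
  forceMajeure: { noticeObligation: true },
});
console.log(gaps.join("\n"));
// forceMajeure: missing mitigationDuty
// forceMajeure: missing negotiationMechanism
```

A check like this only flags missing pieces. Whether a clause is adequate for the transaction still needs human review.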
---
## Truthfulness & Legal Caution
1. **Never fabricate** specific statute article numbers, judicial interpretation numbers, or regulatory document numbers
2. Legal bases should use general references, e.g.: "In accordance with the Civil Code of the PRC and relevant laws and regulations..."
3. **Never** pretend to provide formal legal opinions, litigation success predictions, or definitive validity/invalidity conclusions
4. **Never** state definitive legality conclusions for high-risk clauses (adhesion terms, penalty clauses, disclaimers, non-compete, exclusivity, unilateral interpretation rights)
5. **Never** fabricate that regulatory approvals are obtained, title is unencumbered, tax compliance is assured, or third-party consent is secured
6. When critical info is insufficient → use placeholders, never present as confirmed fact
7. For high-risk areas (equity, debt, licenses, data compliance, labor, personal information, cross-border) → maintain restrained language, do not add rigid commitments without user confirmation
---
## Special Clause Requirements
### Definitions Clause
If the document repeatedly uses specialized terms ("deliverables", "service results", "confidential information", "source code", "project milestones", "acceptance criteria", "trade secrets"), include a "Definitions & Interpretation" clause near the beginning.
### Dispute Resolution
1. Must be explicit
2. Choose between litigation OR arbitration — never mix both
3. Litigation → specify jurisdictional connection point
4. Arbitration → specify arbitration institution
5. If user hasn't specified → use placeholder for confirmation
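The "one mechanism only, placeholder when unspecified" rule can be sketched as follows. `disputeResolutionClause` is a hypothetical helper, and the clause wording is illustrative rather than recommended contract language.

```js
// Hypothetical helper (illustrative only): returns exactly one dispute
// resolution mechanism and uses placeholders for unconfirmed details.
function disputeResolutionClause({ mode, forum } = {}) {
  if (mode === "litigation") {
    const court = forum ?? "【Please fill in: court with jurisdiction】";
    return `Any dispute arising from this contract shall be submitted to ${court} for litigation.`;
  }
  if (mode === "arbitration") {
    const institution = forum ?? "【Please fill in: arbitration institution】";
    return `Any dispute arising from this contract shall be submitted to ${institution} for arbitration.`;
  }
  // Mode not specified by the user: do not guess, ask for confirmation.
  return "【Please fill in: dispute resolution method (litigation or arbitration)】";
}

console.log(disputeResolutionClause({ mode: "arbitration" }));
console.log(disputeResolutionClause());
```

Because the function returns from exactly one branch, a generated clause cannot mention both litigation and arbitration.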
### Tax Clause
1. If the transaction involves taxes → specify which party bears them, whether price includes tax, invoice type and conditions
2. Avoid vague "taxes borne as required by law" without transaction-specific detail
### Breach Liability
1. Must correspond to main obligations in preceding clauses
2. Penalty amounts should be restrained — avoid obviously exaggerated or severely imbalanced figures
3. If fundamental breach exists → consider corresponding termination rights and damages
### Appendices
1. For complex subjects/pricing/technical requirements/deliverables → use "Appendix 1, Appendix 2..." format
2. Explicitly state appendix-contract relationship (typically: "Appendices form an integral part of this contract")
3. If appendix content is unknown → use placeholder
---
## Palette
**Legal Wood** (Warm + Heavy + Calm) — for decorative elements only; body text must be pure black.
```js
const palette = { primary:"#28201C", body:"#000000", secondary:"#6E6560", accent:"#7A5C3A", surface:"#FBF9F7" };
```
⚠️ **ALL visible text in contracts must be pure black `"000000"`.** This includes:
- Contract title (SimHei, black, NOT accent color)
- Contract number (black)
- Clause headings (black)
- Body text (black)
- Party information (black)
- Signature block text (black)
**The only exception** is red-header official documents (红头文件), which follow their own GB/T 9704 color rules. For standard contracts, NO colored text is permitted — no red, no accent color, no dark-blue-grey.
```js
// ✅ Contract title — always pure black
new Paragraph({ alignment: AlignmentType.CENTER,
spacing: { line: Math.ceil(22 * 23), lineRule: "atLeast" }, // ★ Rule 8: prevent clipping
children: [new TextRun({ text: "Training Cooperation Framework Agreement",
size: 44, bold: true, color: "000000", // ← MUST be "000000"
font: { eastAsia: "SimHei", ascii: "Times New Roman" } })]
})
// ❌ FORBIDDEN — accent/palette color on contract text
new TextRun({ text: "Training Cooperation Framework Agreement", color: palette.accent }) // ← WRONG
new TextRun({ text: "Contract No.:", color: palette.primary }) // ← WRONG (if primary ≠ "000000")
```
---
## Scene-Specific Font Overrides
Beyond Profile A defaults:
| Element | Font | Size | Style |
|---------|------|------|-------|
| Contract title | SimHei | Er Hao 22pt (size: 44) | Bold, centered |
| Contract number | SimSun | Wu Hao 10.5pt (size: 21) | Right-aligned |
| Clause heading | SimHei | Xiao Si 12pt (size: 24) | Bold |
| Monetary amount | SimSun | Xiao Si 12pt (size: 24) | Bold |
---
## Document Structure
1. **Title**: "XXX Contract" or "XXX Agreement" — Er Hao SimHei, centered
2. **Contract number**: right-aligned, Wu Hao
3. **Preamble**: Party information with placeholders
4. **Recitals** (summarize transaction background and purpose)
5. **Definitions** (if specialized terms recur)
6. **Substantive clauses** (per selected template)
7. **Signature block**
8. **Appendices** (if any)
---
## Clause Numbering System
Use stable, consistent, pure-text numbering suitable for Chinese legal documents.
```
Article 1 Subject Matter
1.1 xxxxxxxxxx
1.2 xxxxxxxxxx
(1) xxxxxxxxxx
(2) xxxxxxxxxx
① xxxxxxxxxx
② xxxxxxxxxx
Article 2 Price and Payment
2.1 ...
```
**Numbering discipline:**
1. No level-skipping
2. **Forbidden:** Using Markdown list markers (`-` `*` `1.`) for clause hierarchy
3. No switching from "Article X" to `-` or `*` or auto-list mid-document
4. Numbering style must be consistent throughout the entire document
5. Clause headings should be clean and simple
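One way to keep the numbering plain-text and consistent is to generate every label from the clause's position. The sketch below assumes the four levels shown in the example above. `clauseLabel` is a hypothetical name.

```js
// Hypothetical helper (illustrative only): renders the four numbering levels
// shown above as plain text, so no Word auto-list is involved.
function clauseLabel(path) {
  // path: [article, section, item, subItem]; its length is the nesting depth
  const [article, section, item, subItem] = path;
  switch (path.length) {
    case 1: return `Article ${article}`;
    case 2: return `${article}.${section}`;
    case 3: return `(${item})`;
    case 4:
      if (subItem < 1 || subItem > 20) throw new RangeError("Circled numbers cover 1 to 20 only");
      return String.fromCodePoint(0x2460 + subItem - 1); // U+2460 is ①
    default:
      throw new RangeError("Supported depth is 1 to 4 levels");
  }
}

console.log(clauseLabel([1]));          // Article 1
console.log(clauseLabel([1, 2]));       // 1.2
console.log(clauseLabel([1, 2, 1]));    // (1)
console.log(clauseLabel([1, 2, 1, 2])); // ②
```

Since a depth-3 label requires a depth-2 parent in `path`, level-skipping cannot be expressed.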
---
## Party Information Layout (Table-Based Alignment — Mandatory)
Party A and Party B information MUST be laid out using a **borderless table** so that labels align vertically. Never use plain paragraphs with indentation — this causes misalignment between parties.
```js
// ✅ Correct — borderless table ensures "统一社会信用代码:", "地址:", "法定代表人:" align
function partyInfoBlock(partyLabel, partyName, fields) {
// fields: [["Unified Social Credit Code", value], ["Address", value], ["Legal Representative", value]]
const NB = { style: BorderStyle.NONE, size: 0, color: "FFFFFF" };
const noBorders = { top: NB, bottom: NB, left: NB, right: NB };
const headerPara = new Paragraph({ spacing: { before: 200, after: 120 },
children: [new TextRun({ text: `${partyLabel}: ${safeText(partyName, "【Company full name】")}`,
size: 24, font: { eastAsia: "SimSun", ascii: "Times New Roman" } })]
});
const infoTable = new Table({
width: { size: 90, type: WidthType.PERCENTAGE },
borders: { top: NB, bottom: NB, left: NB, right: NB, insideHorizontal: NB, insideVertical: NB },
rows: fields.map(([label, value]) => new TableRow({
children: [
new TableCell({
width: { size: 35, type: WidthType.PERCENTAGE },
borders: noBorders,
margins: { top: 40, bottom: 40, left: 420, right: 60 },
children: [new Paragraph({
children: [new TextRun({ text: `${label}:`, size: 24,
font: { eastAsia: "SimSun", ascii: "Times New Roman" } })],
})],
}),
new TableCell({
borders: noBorders,
margins: { top: 40, bottom: 40, left: 60, right: 120 },
children: [new Paragraph({
children: [new TextRun({ text: safeText(value, `【Please fill in: ${label}】`), size: 24,
font: { eastAsia: "SimSun", ascii: "Times New Roman" } })],
})],
}),
],
})),
});
return [headerPara, infoTable];
}
// Usage:
const partyAChildren = partyInfoBlock("Party A (甲方)", config.partyA?.name, [
["Unified Social Credit Code (统一社会信用代码)", config.partyA?.creditCode],
["Address (地址)", config.partyA?.address],
["Legal Representative (法定代表人/负责人)", config.partyA?.legalRep],
]);
```
**Rules:**
1. Party A and Party B info blocks must use the **same table column widths** — labels align across both blocks
2. Use `safeText()` for all field values — never output `undefined`
3. Label column width should accommodate the longest label (e.g., "统一社会信用代码")
4. The indent (`margins.left: 420`) simulates sub-level nesting under the party name
---
## Signature Block
Left-right symmetric, structured, easy to adjust in Word. Never write as scattered paragraphs.
Required fields for each party:
- Party name (seal)
- Legal representative / Authorized representative
- Contact person
- Contact information
- Signing location
- Date: 【____/____/____】
Use a borderless 2-column table for symmetry. **Every field value must use `safeText()`** — never output `undefined` or empty string. If a field is not provided, use the appropriate `【Please fill in】` placeholder.
```js
// ✅ Correct signature block — safeText for all values
function buildSignatureBlock(partyA, partyB) {
const fields = ["Party (Seal)", "Legal Rep / Authorized Rep (Signature)", "Contact Person", "Contact Info", "Signing Location", "Date"];
const NB = { style: BorderStyle.NONE, size: 0, color: "FFFFFF" };
const noBorders = { top: NB, bottom: NB, left: NB, right: NB };
return new Table({
width: { size: 100, type: WidthType.PERCENTAGE },
borders: { top: NB, bottom: NB, left: NB, right: NB, insideHorizontal: NB, insideVertical: NB },
rows: fields.map((label, i) => {
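// Missing values fall back to a visible placeholder, never an empty string
const aVal = i === fields.length - 1 ? "【____/____/____】" : safeText(partyA?.[i], "【Please fill in】");
const bVal = i === fields.length - 1 ? "【____/____/____】" : safeText(partyB?.[i], "【Please fill in】");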
const displayA = i === 0 ? `Party A (甲方): ${aVal}` : `${label}: ${aVal}`;
const displayB = i === 0 ? `Party B (乙方): ${bVal}` : `${label}: ${bVal}`;
return new TableRow({
children: [
new TableCell({ width: { size: 50, type: WidthType.PERCENTAGE }, borders: noBorders,
margins: { top: 80, bottom: 80, left: 120, right: 60 },
children: [new Paragraph({ children: [new TextRun({ text: displayA, size: 24, color: "000000" })] })] }),
new TableCell({ width: { size: 50, type: WidthType.PERCENTAGE }, borders: noBorders,
margins: { top: 80, bottom: 80, left: 60, right: 120 },
children: [new Paragraph({ children: [new TextRun({ text: displayB, size: 24, color: "000000" })] })] }),
],
});
}),
});
}
```
---
## Monetary Amount Format
Contracts must show amounts in **both uppercase Chinese and numeric format**:
```
Contract amount: RMB One Million Two Hundred Thirty-Four Thousand Five Hundred Sixty-Seven Yuan (¥1,234,567.00)
```
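In a Chinese-language contract the uppercase form uses the financial numerals (壹, 贰, 叁, and so on). The following is a minimal sketch of producing both forms from one value. `formatAmount` and its helpers are hypothetical names, and the input is assumed to be an integer number of fen (1 yuan = 100 fen) to avoid floating-point rounding. Amounts below one yuan are not given special wording.

```js
// Hypothetical converter (illustrative only). Input: integer amount in fen.
const DIGITS = "零壹贰叁肆伍陆柒捌玖";
const UNITS = ["", "拾", "佰", "仟"];
const SECTION_UNITS = ["", "万", "亿"];

// Converts one four-digit section (1 to 9999), inserting 零 for inner gaps.
function sectionToUpper(n) {
  let out = "";
  let zeroPending = false;
  for (let pos = 3; pos >= 0; pos--) {
    const d = Math.floor(n / 10 ** pos) % 10;
    if (d === 0) {
      if (out !== "") zeroPending = true;
    } else {
      if (zeroPending) out += "零";
      zeroPending = false;
      out += DIGITS[d] + UNITS[pos];
    }
  }
  return out;
}

// Converts the whole-yuan part, section by section from 亿 down to units.
function yuanToUpper(yuan) {
  if (yuan === 0) return "零";
  let out = "";
  let zeroPending = false;
  for (let i = SECTION_UNITS.length - 1; i >= 0; i--) {
    const section = Math.floor(yuan / 10000 ** i) % 10000;
    if (section === 0) {
      if (out !== "") zeroPending = true;
      continue;
    }
    if (out !== "" && (zeroPending || section < 1000)) out += "零";
    zeroPending = false;
    out += sectionToUpper(section) + SECTION_UNITS[i];
  }
  return out;
}

function formatAmount(fen) {
  if (!Number.isInteger(fen) || fen < 0 || fen >= 1e14) {
    throw new RangeError("Expected a non-negative integer amount in fen below 1e14");
  }
  const yuan = Math.floor(fen / 100);
  const jiao = Math.floor(fen / 10) % 10;
  const cent = fen % 10;
  let upper = yuanToUpper(yuan) + "元";
  if (jiao === 0 && cent === 0) {
    upper += "整"; // whole-yuan amounts end with 整
  } else {
    upper += jiao > 0 ? DIGITS[jiao] + "角" : "零";
    if (cent > 0) upper += DIGITS[cent] + "分";
  }
  const grouped = String(yuan).replace(/\B(?=(\d{3})+(?!\d))/g, ",");
  const numeric = `¥${grouped}.${String(fen % 100).padStart(2, "0")}`;
  return `人民币${upper}(${numeric})`;
}

console.log(formatAmount(123456700));
// 人民币壹佰贰拾叁万肆仟伍佰陆拾柒元整(¥1,234,567.00)
```

Deriving both forms from the same integer keeps the uppercase and numeric amounts from drifting apart.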
---
## Style Rules
- **NO cover page** — title page is the first page (title + contract number at top)
- **NO TOC** unless >20 clauses
- **NO decorative elements** — contracts must be formal and clean
- **Line spacing**: 1.5x (line: 360) — ⚠️ scene override (Profile A default is 1.3x/312; contracts use 1.5x for readability and annotation space)
- **Body**: Justified, first-line indent 480 twips
- **Color**: pure black "000000" throughout — no colored text
---
## Scene-Specific Quality Checks
In addition to universal checks (see `references/common-rules.md`):
### Format
- [ ] Party information complete (full name / address / legal representative / contact)
- [ ] Signature block properly formatted, symmetrical, all fields present
- [ ] Monetary amounts shown in both uppercase and numeric format
- [ ] Clause numbering sequential with no gaps
- [ ] No cover page (title page is first page)
- [ ] No Markdown list markers mixed into clause hierarchy
### Content
- [ ] Clause numbering system consistent, no mixing
- [ ] Transaction closure complete (subject → consideration → performance → acceptance → breach → dispute)
- [ ] Breach liability corresponds to main obligations
- [ ] Dispute resolution explicitly stated (or placeholder for confirmation)
- [ ] All unconfirmed variables use `【】` placeholders consistently
- [ ] Language is formal, restrained, subjects are explicit
- [ ] No fabricated statute numbers or overreaching legal conclusions
- [ ] High-risk clauses include manual review notes
- [ ] Terminology consistent throughout
- [ ] Appendix-contract relationship explicitly stated
### Closure
- [ ] Performance deadline → delay consequences specified
- [ ] Payment milestones → conditions and invoice requirements specified
- [ ] Delivery obligation → acceptance rules and objection period specified
- [ ] Termination right → conditions and post-termination handling specified
- [ ] Confidentiality obligation → scope, exceptions, duration, breach consequences specified
- [ ] Force majeure → notice and mitigation duties specified
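The placeholder check can be partly automated on the plain text of the draft. This is a rough sketch, assuming placeholders are never nested; `listPlaceholders` and `hasUnbalancedBrackets` are hypothetical names, not part of the skill.

```js
// Collect the text inside every 【】 placeholder, in document order.
function listPlaceholders(text) {
  return [...text.matchAll(/【([^【】]*)】/g)].map((m) => m[1]);
}

// Report a stray or nested bracket, which usually means a broken placeholder.
function hasUnbalancedBrackets(text) {
  let open = false;
  for (const ch of text) {
    if (ch === "【") {
      if (open) return true; // nested opening bracket
      open = true;
    } else if (ch === "】") {
      if (!open) return true; // closing bracket without an opening one
      open = false;
    }
  }
  return open;
}

console.log(listPlaceholders("Fee: 【amount】, due on 【date】"));
```

The list is useful for the manual review note: every entry is a value someone still has to confirm.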

139
skills/docx/scenes/copywriting.md Executable file

@@ -0,0 +1,139 @@
# Scene: Copywriting / Script
## Scope
Broadcast scripts, product promotion copy, livestream scripts, presentation scripts, speeches, hosting scripts, short video scripts — any document where the goal is **spoken delivery**.
→ Placeholder convention & universal prohibitions — see `references/common-rules.md`
→ Font profile: **B (Visual)** — see `references/common-rules.md`
---
## 1. Core Principles
⚠️ **A broadcast script is NOT a report, NOT a spec sheet, NOT an encyclopedia.**
The goal is for the audience to **understand on first listen, remember key points, and take action.** Therefore:
1. **Highlight selling points, don't pile specs:** Each paragraph covers only 1–2 core points with relatable scenario descriptions
2. **Conversational tone:** Use "you" not "the user"; use natural speech, not corporate jargon
3. **Rhythm:** Alternate long and short sentences, insert pause markers, avoid wall-of-text paragraphs
4. **Length discipline:** ~250–300 words per minute of speech; a 5-minute script should not exceed 1500 words
5. **Information consistency:** All data, model numbers, prices must be consistent throughout — no self-contradiction
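The length rule in principle 4 is simple arithmetic, sketched below. `wordBudget`, `countWords`, and `checkLength` are hypothetical helpers; the word counter assumes whitespace-delimited text, so a Chinese script would need a character count instead.

```js
// Pacing assumed from principle 4: roughly 250-300 words per minute.
const WORDS_PER_MINUTE = { slow: 250, fast: 300 };

// Word range that fits a given speaking time.
function wordBudget(minutes) {
  return {
    min: Math.round(minutes * WORDS_PER_MINUTE.slow),
    max: Math.round(minutes * WORDS_PER_MINUTE.fast),
  };
}

// Whitespace-delimited word count (not suitable for CJK text).
function countWords(text) {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

// Compare a draft against the upper bound for its target duration.
function checkLength(text, minutes) {
  const words = countWords(text);
  const { max } = wordBudget(minutes);
  return { words, max, withinBudget: words <= max };
}

console.log(wordBudget(5)); // { min: 1250, max: 1500 }
```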
---
## 2. Document Structure
Completely different from reports:
```
Title (centered, short and punchy)
────────────────────
[Opening] ← Grab attention, 1–2 sentences
[Core Para 1] ← One selling point/opinion + scenario
[Core Para 2] ← One selling point/opinion + scenario
[Core Para 3] ← One selling point/opinion + scenario (max 3–5 paras)
[Closing] ← Summary + Call to Action (CTA)
────────────────────
[Notes] ← Supplementary info, data sources (optional, small grey text)
```
### Decisions
- **Cover:** ❌ Not needed
- **TOC:** ❌ Not needed
- **Header/footer:** Optional, minimal
- **Sections:** Single section sufficient
- **Line spacing:** `line: 400` (slightly larger than standard 1.5x for reading/marking ease)
---
## 3. Layout Standards
### Font Specifications
| Element | Font | Size | Style |
|---------|------|------|-------|
| Title | SimHei | 18pt (size:36) | Bold, centered |
| Section heading / highlight | SimHei | 14pt (size:28) | Bold |
| Body | Microsoft YaHei | 12pt (size:24) | Left-aligned |
| Rhythm markers | Microsoft YaHei | 10.5pt (size:21) | Grey 999999, italic |
| Notes | Microsoft YaHei | 10pt (size:20) | Grey 666666 |
### Paragraph Spacing
```js
// Generous spacing between paragraphs for reading/breathing pauses
spacing: { before: 200, after: 200, line: 400 }
// Larger gap between core sections
sectionGap: { before: 400, after: 200 }
```
### Key Point Highlighting
Use **bold** or **accent-colored text** to mark key selling points:
```js
new TextRun({ text: "Key selling point", bold: true, color: c(P.accent) })
```
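The `c` helper used above is not defined in this file. A plausible stand-in, assuming its job is to turn a `#`-prefixed palette value into the bare hex string used elsewhere in these specs, might look like this (hypothetical, not the skill's actual definition):

```js
// Hypothetical stand-in for `c`: strip a leading "#" and validate the hex value.
function c(hex) {
  const value = String(hex).replace(/^#/, "").toUpperCase();
  if (!/^[0-9A-F]{6}$/.test(value)) {
    throw new Error(`not a 6-digit hex color: ${hex}`);
  }
  return value;
}

console.log(c("#E85D3A")); // E85D3A
```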
### Rhythm Markers (optional)
Insert small grey markers where pauses, emphasis, or tone changes are needed:
```js
new Paragraph({ spacing: { before: 60, after: 60 },
children: [new TextRun({ text: "[Pause 2 sec]", size: 21, color: "999999", italics: true })] })
// Or inline: new TextRun({ text: " [emphasis] ", size: 18, color: "999999", italics: true })
```
---
## 4. Content Quality Rules
### Information Density Guide
| Script Type | Duration | Word Count | Core Paragraphs |
|-------------|----------|-----------|----------------|
| Short video | 30–60 sec | 150–300 | 1–2 |
| Product promotion | 2–3 min | 500–800 | 3–4 |
| Presentation / Speech | 5–10 min | 1200–2500 | 5–8 |
| Hosting script | Per agenda | Per segment | Per segment |
### Scene-Specific Prohibitions
1. **No spec dumping:** Do not list all product specifications in tables. Select 2–3 most persuasive data points and express them through scenarios
2. **No information contradiction:** Model numbers, prices, data appearing multiple times must be perfectly consistent
3. **No report tone:** No "in conclusion", "research indicates", "as mentioned above" — this is spoken word
4. **No lengthy citations:** Broadcast scripts do not need quotes, footnotes, or references
5. **No dense layout:** Paragraphs must have visible spacing — no screen-filling text walls
### Product Promotion Specific Rules
- **Opening:** Lead with pain point / scenario ("Does your washing machine still smell after a cycle?"), not self-introduction
- **Product intro:** Compare only 1–2 competitive dimensions at a time, not a full review
- **Price anchor:** State original/market price first, then discount price — create contrast
- **CTA:** Explicitly state the action ("Click the link below", "Type 1 in comments")
---
## 5. Palette
Broadcast scripts use clean, simple colors — no complex visual design needed:
```js
const P = {
primary: "#1A1A1A", // Title
body: "#333333", // Body
secondary: "#666666", // Notes
accent: "#E85D3A", // Key highlight (warm, energetic)
surface: "#FFF8F5", // Background (if needed)
};
```
---
## 6. Scene-Specific Quality Checks
In addition to universal checks (see `references/common-rules.md`):
- [ ] Total word count within target range (not exceeding)
- [ ] Each core paragraph has only 1–2 selling points (no dumping)
- [ ] Conversational tone present (not report/formal style)
- [ ] Information consistent throughout (model, price, data — no contradictions)
- [ ] Paragraph spacing sufficient (visually not crowded)
- [ ] Clear attention-grabbing opening + closing CTA

698
skills/docx/scenes/exam.md Executable file

@@ -0,0 +1,698 @@
# Scene: Exam Paper
## Overview
Exam papers are among the most critical document types in education. Unlike general documents, they require high precision in layout, print compatibility, and subject-specific formatting. This specification covers the complete workflow from page framework to subject-specific features.
→ Universal prohibitions — see `references/common-rules.md`
**Note:** Exam papers use their OWN font/layout specs (not Profile A defaults). All text is pure black/white/grey for photocopy clarity.
---
## 1. Page Setup & Framework
### Paper Specifications
| Type | Paper | Orientation | Use Case |
|------|-------|-------------|----------|
| Practice / Unit quiz | A4 | Portrait | Daily practice, homework, quizzes |
| Formal exam | A3 | Landscape + 2-column | Midterm / final / standardized (requires OOXML) |
| Answer sheet | A4 | Portrait | Standalone answer card |
### Margins
```js
// A4 portrait — no seal line
page: { size: { width: 11906, height: 16838 },
margin: { top: 850, bottom: 850, left: 1200, right: 1200 } }
// A4 portrait — with seal line (left binding area reserved)
page: { size: { width: 11906, height: 16838 },
margin: { top: 850, bottom: 850, left: 2200, right: 850 } }
// A3 landscape dual-column (requires OOXML)
// ⚠️ A3 dual-column may render slightly differently in WPS vs Word. Test in both before batch printing.
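// Assumption: docx-js swaps width/height when orientation is LANDSCAPE, so portrait dimensions are passed here
page: { size: { width: 16838, height: 23812, orientation: PageOrientation.LANDSCAPE },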
margin: { top: 850, bottom: 850, left: 2200, right: 850 } }
```
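The numbers above are in twips (1/1440 inch). A small conversion sketch makes them easier to audit; `mmToTwips`, `twipsToMm`, and `contentWidth` are hypothetical helper names.

```js
const TWIPS_PER_INCH = 1440;
const MM_PER_INCH = 25.4;

// Millimetres to twips, rounded to the nearest whole twip.
const mmToTwips = (mm) => Math.round((mm * TWIPS_PER_INCH) / MM_PER_INCH);

// Twips back to millimetres (not rounded).
const twipsToMm = (twips) => (twips * MM_PER_INCH) / TWIPS_PER_INCH;

// Usable text width once left and right margins are removed.
function contentWidth(page) {
  return page.width - page.margin.left - page.margin.right;
}

console.log(mmToTwips(210), mmToTwips(297)); // 11906 16838 (A4)
console.log(contentWidth({ width: 11906, margin: { left: 2200, right: 850 } })); // 8856
```

The 850-twip margin works out to about 15 mm, and the seal-line layout leaves 8856 twips of text width on A4.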
### Section Handling
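Different parts should be separate sections, using `SectionType.NEXT_PAGE` to start a new page or `SectionType.CONTINUOUS` to change the column layout on the same page: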
- **Header area (full-width):** Title, instructions, score table (no columns)
- **Content area:** Questions (may use columns)
- **Composition / answer sheet:** Independent section, independent format
- **Attachment pages:** Large maps/diagrams for geography/biology can be separate pages
```js
sections: [
{ properties: { /* Header section — no columns */ }, children: [...] },
{ properties: { type: SectionType.CONTINUOUS, column: { count: 2, space: 720 } }, children: [...] },
{ properties: { type: SectionType.NEXT_PAGE }, children: [...] }, // Composition
]
```
### Template-First Principle
⚠️ **Build framework first, fill content second.** Before writing questions, determine:
1. Paper size + margins
2. Whether seal line is needed
3. Whether columns are used
4. Question type structure and point allocation
5. Whether composition grid / answer sheet is needed
---
## 2. Seal Line & Student Information Area
### When to Use Seal Line
| Scenario | Seal Line | Student Info Position |
|----------|-----------|---------------------|
| Formal standardized exam | ✅ Required | Left vertical info column |
| Midterm / Final | ✅ Recommended | Left vertical info column |
| Unit quiz | ❌ Optional | Header horizontal info row |
| Daily practice | ❌ Skip | Header horizontal info row |
### Seal Line Implementation
#### Method 1: Header horizontal prompt (simple)
```js
headers: { default: new Header({ children: [
new Paragraph({ alignment: AlignmentType.CENTER,
children: [new TextRun({
text: ".............. Seal ...... Line ...... Do ...... Not ...... Answer ...... Inside ..............",
size: 16, color: "999999", font: "SimSun" })] })
] }) }
```
#### Method 2: Vertical text box (OOXML advanced)
```xml
<w:txbxContent>
<w:p><w:pPr><w:jc w:val="center"/></w:pPr>
<w:r><w:rPr><w:sz w:val="18"/><w:color w:val="999999"/></w:rPr>
<w:t>Name:________ Class:________ ID:________</w:t></w:r>
</w:p>
<w:p><w:r><w:rPr><w:sz w:val="16"/><w:color w:val="CCCCCC"/></w:rPr>
<w:t>- - - - - - - - - Seal Line - - - - - - - - -</w:t></w:r>
</w:p>
</w:txbxContent>
```
### Student Info Row
```js
// Horizontal info row (when no seal line) — borderless 3-column table
new Table({
alignment: AlignmentType.CENTER, columnWidths: [2800, 2800, 2800],
rows: [new TableRow({ children: [
cell("Name: ______________"),
cell("Class: ______________", AlignmentType.CENTER),
cell("ID: ______________", AlignmentType.RIGHT),
] })]
})
```
Fill lines should be moderate length (10–14 underscore chars). Label order: Name → Class → Student ID.
---
## 3. Paper Header & Title Area
### Structure
```
School name (16pt SimHei, centered)
Exam title (14pt SimHei, centered), e.g., "2025–2026 Academic Year Second Semester Midterm"
Subject title (14pt SimHei, centered), e.g., "Grade 7 Mathematics"
Student info row
Instructions (10pt SimSun, centered, grey)
Score table (as needed)
```
### Font Specifications
| Element | Font | Size | Style |
|---------|------|------|-------|
| School name | SimHei | 16pt (size:32) | Bold, centered |
| Exam title | SimHei | 14pt (size:28) | Bold, centered |
| Subject title | SimHei | 14pt (size:28) | Bold, centered |
| Instructions | SimSun | 10pt (size:20) | Grey 333333, centered |
| Student info | SimSun | 10.5pt (size:21) | Normal |
### Instructions Content
Should include: total score, exam duration, answer method, special requirements (e.g., calculator allowed).
### Score Table
- Header row: light grey background F0F0F0, centered
- Columns: Question type | Section names... | Total
- Rows: Points | Section points... | Total points
- Row: Score | blank... | blank
- Table centered, 80% page width
⚠️ **Header area should not be too full** — title + info + instructions + score table should not exceed 1/3 of the page.
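The three score-table rows can be generated from the section list so the total never disagrees with the per-section points. A minimal sketch, assuming each section is a `{ name, points }` object (`scoreTableRows` is a hypothetical helper that returns plain strings to be wrapped in table cells later):

```js
// Build the cell text for the header, points, and score rows.
function scoreTableRows(sections) {
  const total = sections.reduce((sum, s) => sum + s.points, 0);
  return [
    ["Question type", ...sections.map((s) => s.name), "Total"],
    ["Points", ...sections.map((s) => String(s.points)), String(total)],
    ["Score", ...sections.map(() => ""), ""], // left blank for the grader
  ];
}

const rows = scoreTableRows([
  { name: "I", points: 30 },
  { name: "II", points: 20 },
  { name: "III", points: 50 },
]);
console.log(rows[1]); // [ 'Points', '30', '20', '50', '100' ]
```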
---
## 4. Content Layout Rules
### Color Palette
```js
// Exam papers use only black/white/grey for clear photocopying
const C = {
title: "000000", body: "000000", section: "333333",
seal: "999999", answerLine: "CCCCCC", headerBg: "F0F0F0", gridLine: "DDDDDD",
};
```
### Column Usage
| Subject / Question Type | Recommendation |
|------------------------|----------------|
| Math multiple choice + fill-in | ✅ Suitable for columns |
| Physics multiple choice | ✅ Suitable for columns |
| Chinese reading / composition | ❌ Not suitable |
| English cloze / reading | ❌ Not suitable |
| History source-based | ❌ Not suitable |
| Geography map reading | ❌ Not suitable |
### Question Numbering
Entire paper uses consistent three-level numbering:
- **Major sections:** I, II, III, IV... (Chinese: 一、二、三、四…)
- **Questions:** 1. 2. 3. ... (Arabic + period)
- **Sub-questions:** (1) (2) (3) ... (parenthesized)
⚠️ **No extra symbols before question numbers** (no `•`, `▸`, `▪`, `-`, `*`). The number itself is the only marker. **Never use docx numbering/bullet list styles** for question numbers — must use plain TextRun manual numbering.
```js
// ✅ Correct — plain TextRun manual numbering
new Paragraph({ spacing: { before: 120, after: 60, line: 360 },
children: [new TextRun({ text: `${i+1}. ${question}`, size: 21, font: { eastAsia: "SimSun" } })] })
// ❌ Wrong — numbering causes Word to add bullets
new Paragraph({ numbering: { reference: "xxx", level: 0 }, // ← Forbidden!
children: [new TextRun({ text: question })] })
```
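Since numbering is typed by hand, a few label helpers keep the three levels consistent. This is a sketch under assumptions: the helper names are hypothetical, only the first ten section labels are covered, and the trailing period after Roman numerals is a choice made here rather than something the rules above specify.

```js
const ROMAN = ["I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX", "X"];
const CHINESE = ["一", "二", "三", "四", "五", "六", "七", "八", "九", "十"];

// Level 1: major section label, Roman by default or Chinese on request.
function sectionLabel(n, chinese = false) {
  const table = chinese ? CHINESE : ROMAN;
  if (!Number.isInteger(n) || n < 1 || n > table.length) {
    throw new RangeError("section number must be 1..10 in this sketch");
  }
  return chinese ? `${table[n - 1]}、` : `${table[n - 1]}.`;
}

// Level 2: Arabic numeral plus period.
const questionLabel = (n) => `${n}.`;

// Level 3: parenthesized numeral.
const subQuestionLabel = (n) => `(${n})`;

console.log(sectionLabel(2), questionLabel(7), subQuestionLabel(1)); // II. 7. (1)
```

The returned strings go straight into a `TextRun`, so no Word list style is ever involved.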
### Question Spacing
```js
sectionTitle: { before: 300, after: 150 } // Major section headers
question: { before: 120, after: 80 } // Between questions
subQuestion: { before: 60, after: 40 } // Between sub-questions
```
### Page Break Control
⚠️ Key principles:
- **Question stem and answer area must not split** across pages
- **Source material and questions on same page**
- **Figures adjacent to their questions**
- **Avoid orphan lines** — question stem, options, answer area appear as a group
```js
new Paragraph({ keepNext: true, keepLines: true, children: [...] })
```
⚠️ **Answer question page break rule (mandatory):**
Complete combination (stem + figure + answer lines) must be considered as a unit. If remaining space cannot fit stem + figure + at least 3 answer lines, push entire question to next page.
Use `keepNext: true` to chain: stem → figure → first 3 answer lines.
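The chaining rule can be expressed as a plan that says which paragraphs get `keepNext`. The sketch below only computes flags and does not build docx objects; `keepTogetherPlan` is a hypothetical helper, and the default of three lines comes from the rule above.

```js
// Decide which parts of one question carry keepNext so that
// stem -> figure -> first `minLines` answer lines stay on one page.
function keepTogetherPlan({ hasFigure, answerLineCount, minLines = 3 }) {
  const parts = ["stem"];
  if (hasFigure) parts.push("figure");
  for (let i = 1; i <= answerLineCount; i++) parts.push(`answer-line-${i}`);

  // Index of the last paragraph that must stay with the group.
  const keptLines = Math.min(minLines, answerLineCount);
  const lastKept = parts.length - answerLineCount + keptLines - 1;

  // Every paragraph before the last kept one is glued to its successor.
  return parts.map((part, i) => ({ part, keepNext: i < lastKept }));
}

console.log(keepTogetherPlan({ hasFigure: true, answerLineCount: 5 }));
```

With a figure and five answer lines, the stem, the figure, and the first two lines carry `keepNext`, which binds them to the third line; lines four and five may flow to the next page.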
---
## 5. Font & Paragraph Standards
### Underline Formatting for "Underlined Parts" (Mandatory)
When a question references "underlined part" (划线部分), the relevant text MUST use actual underline formatting (`underline: { type: UnderlineType.SINGLE }`). **Never** show "划线部分为 XXX" as plain text annotation — the underline must be visually rendered.
```js
// ✅ Correct — actual underline on the referenced text
new Paragraph({ children: [
new TextRun({ text: "1. It is ", size: 21, font: { ascii: "Times New Roman" } }),
new TextRun({ text: "a butterfly", size: 21, font: { ascii: "Times New Roman" },
underline: { type: UnderlineType.SINGLE, color: "000000" } }),
new TextRun({ text: ". (Ask about the underlined part)", size: 21, font: { ascii: "Times New Roman" } }),
]})
// ❌ Wrong — underlined part described as annotation text
new TextRun({ text: "1. It is a butterfly. (对划线部分提问) 注:划线部分为 a butterfly" })
```
### Font Hierarchy
| Element | Font | Size | Style |
|---------|------|------|-------|
| Section title | SimHei | 11pt (size:22) | Bold |
| Question content | SimSun | 10.5pt (size:21) | Normal |
| Points annotation | SimSun | 10pt (size:20) | In parentheses |
| Reading material | KaiTi/SimSun | 10.5pt (size:21) | KaiTi to differentiate |
| Notes/source | SimSun | 9pt (size:18) | Grey 666666 |
| Seal line | SimSun | 8pt (size:16) | Grey 999999 |
| Page number | SimSun | 9pt (size:18) | Centered |
### Line Spacing
```js
line: 360 // ~1.5x for readability
answerLine: 500 // Answer line spacing for writing room
```
### Paragraph Rules
- ⚠️ **Never use consecutive returns for whitespace** — use `spacing.before/after`
- Chinese questions use Chinese punctuation; English materials use English punctuation
- Mixed CN/EN: use Times New Roman or Calibri for English text
---
## 6. Multiple Choice Layout
### Core Rule
⚠️ **Options must NEVER be aligned with spaces!** Must use borderless tables.
### Option Layout — Borderless Table
```js
// Short options: 4 columns in 1 row
new Table({
columnWidths: [2200, 2200, 2200, 2200],
rows: [new TableRow({ children: ["A","B","C","D"].map((label, i) =>
new TableCell({ borders: NBs, width: { size: 2200, type: WidthType.DXA },
margins: { top: 0, bottom: 0, left: 60, right: 60 },
children: [new Paragraph({ spacing: { before: 0, after: 0 },
children: [new TextRun({ text: `${label}. ${options[i]}`, size: 21, font: "SimSun" })] })]
})
) })]
})
// Medium options: 2 columns, 2 rows
// Long options: 1 column, 4 rows
```
### Option Length Detection
```js
function getOptionLayout(options) {
const maxLen = Math.max(...options.map(o => o.length));
if (maxLen <= 6) return "4col";
if (maxLen <= 15) return "2col";
return "1col";
}
```
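Once the layout is known, the options can be split into table rows. A sketch that repeats the thresholds above so it runs on its own; `optionRows` is a hypothetical helper returning labelled strings, one inner array per table row.

```js
// Same thresholds as getOptionLayout above, repeated so this block is self-contained.
function getOptionLayout(options) {
  const maxLen = Math.max(...options.map((o) => o.length));
  if (maxLen <= 6) return "4col";
  if (maxLen <= 15) return "2col";
  return "1col";
}

// Label options A, B, C, ... and chunk them into rows for a borderless table.
function optionRows(options) {
  const perRow = { "4col": 4, "2col": 2, "1col": 1 }[getOptionLayout(options)];
  const labelled = options.map((text, i) => `${String.fromCharCode(65 + i)}. ${text}`);
  const rows = [];
  for (let i = 0; i < labelled.length; i += perRow) {
    rows.push(labelled.slice(i, i + perRow));
  }
  return rows;
}

console.log(optionRows(["3", "5", "7", "9"])); // [ [ 'A. 3', 'B. 5', 'C. 7', 'D. 9' ] ]
```

Note that `.length` counts UTF-16 code units, so the thresholds may need tuning for full-width Chinese options, which take more horizontal space per character.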
---
## 7. Fill-in-the-Blank Layout
```js
// Blank line length matches expected answer:
// Short answer (number/word): 8 underscores
// Medium (phrase): 14 underscores
// Long (sentence): 20 underscores
new Paragraph({ spacing: { before: 140, after: 80, line: 400 },
children: [new TextRun({ text: `${num}. Question text ________________.`, size: 21, font: "SimSun" })] })
```
⚠️ Fill-in lines must not break across lines — if line is too long, put the blank on the next line.
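The three blank lengths can live in one place so every question uses the same sizes. A small sketch with hypothetical helper names (`blank`, `fillInQuestion`):

```js
// Underscore counts from the comment above: short 8, medium 14, long 20.
const BLANK_LENGTHS = { short: 8, medium: 14, long: 20 };

// Return a run of underscores for the expected answer size.
function blank(kind = "medium") {
  const n = BLANK_LENGTHS[kind];
  if (n === undefined) throw new Error(`unknown blank kind: ${kind}`);
  return "_".repeat(n);
}

// Assemble the question text around a single blank.
function fillInQuestion(num, before, kind, after = "") {
  return `${num}. ${before}${blank(kind)}${after}`;
}

console.log(fillInQuestion(3, "The capital of France is ", "short", "."));
```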
---
## 8. Short Answer / Problem-Solving Layout
### Question + Points
```js
new Paragraph({ spacing: { before: 200, after: 60, line: 360 }, keepNext: true,
children: [new TextRun({ text: `${num}. (${points} pts) ${question}`, size: 21, font: "SimSun" })] })
```
### Answer Lines
```js
// Light grey answer lines (CCCCCC), NOT black
// ⚠️ Answer lines are ONLY for writing space within each question — never as dividers between questions
function answerLines(count) {
return Array(count).fill(null).map(() =>
new Paragraph({ spacing: { before: 0, after: 0, line: 500 },
border: { bottom: { style: BorderStyle.SINGLE, size: 1, color: "CCCCCC" } }, // Paragraph option is `border` (TableCell uses `borders`)
children: [new TextRun({ text: " ", size: 21 })] })
);
}
```
⚠️ **Separation between questions:**
Use **only spacing** (`spacing.before: 200`) for visual separation between questions. **Forbidden:**
- ❌ Grey horizontal lines (borders)
- ❌ Color block dividers (Table-simulated separators)
- ❌ Symbol dividers (e.g., `───────`)
- ❌ Any visual separator decoration
### Answer Space vs. Points
| Points | Suggested Lines | Description |
|--------|----------------|-------------|
| 2–4 | 3–4 lines | Simple calculation / short answer |
| 5–8 | 6–8 lines | Medium problem |
| 10–12 | 8–10 lines | Complex problem |
| 14–20 | 10–14 lines | Comprehensive / essay question |
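The table can be turned into a lookup that feeds `answerLines(count)`. This is a sketch with stated assumptions: point values that fall between bands (such as 9 or 13) are rounded up to the next band, and anything above 20 uses the top band. `suggestedLines` is a hypothetical helper.

```js
// Upper bound of each points band and the suggested line range.
const ANSWER_SPACE = [
  { maxPoints: 4, lines: [3, 4] },
  { maxPoints: 8, lines: [6, 8] },
  { maxPoints: 12, lines: [8, 10] },
  { maxPoints: 20, lines: [10, 14] },
];

// Return the suggested { min, max } number of answer lines for a point value.
function suggestedLines(points) {
  const band =
    ANSWER_SPACE.find((b) => points <= b.maxPoints) ??
    ANSWER_SPACE[ANSWER_SPACE.length - 1];
  const [min, max] = band.lines;
  return { min, max };
}

console.log(suggestedLines(6)); // { min: 6, max: 8 }
```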
---
## 9. Source-Based / Reading Question Layout
### Material vs. Question Separation
```js
// Material area — indented + KaiTi to differentiate
new Paragraph({ indent: { left: 420, right: 420 }, spacing: { before: 100, after: 100, line: 380 },
children: [new TextRun({ text: materialText, size: 21, font: "KaiTi" })] })
// Source attribution
new Paragraph({ alignment: AlignmentType.RIGHT, indent: { right: 420 },
children: [new TextRun({ text: "— from \"XXX\"", size: 18, color: "666666", font: "SimSun" })] })
```
### Key Principles
- Material title, source, body, and notes use different fonts
- Long materials: increase line spacing (line: 380–400)
- Material and corresponding questions on same page
- Sub-question numbers (1)(2)(3) clearly correspond to material
- **Data tables in materials MUST use proper docx `Table` objects**; never render tabular data as Markdown plain text (`| col | col |`). This includes statistics tables, climate data tables, comparison tables, and any structured data within question materials. Use bordered tables (see § 14 Table Usage Standards) with appropriate header row styling.
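Leaked Markdown tables are easy to catch before the document is built. The check below is a heuristic sketch: `findMarkdownTableLines` is a hypothetical helper that flags any line starting and ending with a pipe, which covers typical table rows and separator rows but not every Markdown variant.

```js
// Return 1-based line numbers that look like Markdown table rows.
function findMarkdownTableLines(text) {
  return text
    .split(/\r?\n/)
    .map((line, i) => ({ line, number: i + 1 }))
    .filter(({ line }) => /^\s*\|.*\|\s*$/.test(line))
    .map(({ number }) => number);
}

const material = "Rainfall by month\n| Month | mm |\n|---|---|\n| Jan | 12 |\nEnd of material";
console.log(findMarkdownTableLines(material)); // [ 2, 3, 4 ]
```

Any hit means the material text should be converted into a real table before rendering.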
---
## 10. Composition / Writing Area
### Grid Count Calculation
⚠️ **Grid count must exceed required word count by 20–30%** (for title, paragraph indents, line breaks).
| Required Words | Min Grid Count | Recommended Layout |
|---------------|---------------|-------------------|
| 400 | 500 | 25 rows × 20 cols |
| 600 | 750 | 38 rows × 20 cols |
| 800 | 1000 | 50 rows × 20 cols |
| 1000 | 1250 | 63 rows × 20 cols |
```js
function calcGridSize(requiredWords, colsPerRow = 20) {
const totalCells = Math.ceil(requiredWords * 1.25);
const rows = Math.ceil(totalCells / colsPerRow);
return { rows, colsPerRow, totalCells: rows * colsPerRow };
}
```
### Chinese Composition Grid
```js
function compositionGrid(rows, colsPerRow) {
const cellSize = Math.floor(8800 / colsPerRow);
return new Table({
columnWidths: Array(colsPerRow).fill(cellSize),
rows: Array(rows).fill(null).map(() =>
new TableRow({
height: { value: cellSize, rule: HeightRule.EXACT },
children: Array(colsPerRow).fill(null).map(() =>
new TableCell({ borders: thinBs("DDDDDD"), width: { size: cellSize, type: WidthType.DXA },
children: [new Paragraph({ children: [] })] })
)
})
)
});
}
```
### English Writing Area (Horizontal Lines) — MANDATORY for English Writing Questions
⚠️ **Every English writing/composition question MUST include ruled horizontal lines.** A blank area without lines is FORBIDDEN — students need lines to write on.
```js
function writingLines(count) {
return Array(count).fill(null).map(() =>
new Paragraph({ spacing: { before: 0, after: 0, line: 560 },
border: { bottom: { style: BorderStyle.SINGLE, size: 1, color: "CCCCCC" } }, // Paragraph option is `border` (TableCell uses `borders`)
children: [new TextRun({ text: " ", size: 21 })] })
);
}
```
**Line count by word requirement:**
| Required Words | Lines |
|---------------|-------|
| ≤50 | 8 |
| 50–80 | 10 |
| 80–120 | 12 |
| 120+ | 15 |
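The table maps directly onto the `count` argument of `writingLines`. A sketch of that lookup follows; the table's bands share their endpoints, so this version assumes each upper bound is inclusive (exactly 120 words gets 12 lines). `writingLineCount` is a hypothetical helper.

```js
// Number of ruled lines for an English writing task of the given length.
function writingLineCount(requiredWords) {
  if (requiredWords <= 50) return 8;
  if (requiredWords <= 80) return 10;
  if (requiredWords <= 120) return 12; // assumption: 120 falls in the lower band
  return 15;
}

console.log(writingLineCount(100)); // 12
```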
**Rules:**
1. Lines must appear immediately after the writing prompt paragraph
2. Line color: light grey `CCCCCC` (print-friendly, not visually heavy)
3. Line spacing: `line: 560` (provides adequate writing room)
4. Chinese composition uses grid (`compositionGrid`), English uses lines (`writingLines`) — never mix them up
### Composition Area Requirements
- Independent section or clear separation
- Title space reserved (for self-chosen topics)
- Word count prompt visible ("No fewer than 800 words" / "About 120 words")
- Grid/line colors light — must not interfere with writing
- Pages continuous, not split
---
## 11. Answer Key (参考答案)
### Output Rules
1. **Default (user does not request answers in the same file):** Generate the answer key as a **separate .docx file** (e.g., `exam.docx` + `exam_answers.docx`). This prevents students from accidentally seeing answers.
2. **User explicitly requests answers in the same file:** Place the answer key on an **independent page** using `SectionType.NEXT_PAGE`. Answer key MUST NOT appear on the same page as any exam question.
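The two output modes can be captured in one small decision helper. This is a sketch: `answerKeyPath` and `answerKeyPlan` are hypothetical names, and the `_answers` suffix follows the example in rule 1.

```js
// Derive the answer-key filename from the exam filename.
function answerKeyPath(examPath) {
  if (!/\.docx$/i.test(examPath)) throw new Error("expected a .docx path");
  return examPath.replace(/\.docx$/i, "_answers.docx");
}

// Decide where the answer key goes; a separate file is the default.
function answerKeyPlan(examPath, { sameFile = false } = {}) {
  return sameFile
    ? { mode: "same-file", file: examPath, sectionType: "NEXT_PAGE" }
    : { mode: "separate-file", file: answerKeyPath(examPath) };
}

console.log(answerKeyPlan("exam.docx").file); // exam_answers.docx
```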
### Separate File Format (Default)
The answer key file should include:
- Title: "《{exam title}》参考答案" (SimHei, 14pt/size:28, bold, centered)
- Same question numbering as the exam
- Concise answers (letter choices, key words, short solutions)
- Font: SimSun 10.5pt (size: 21)
### Same File Format (When User Requests)
```js
// Answer key as a separate section — MUST use SectionType.NEXT_PAGE
{
properties: { type: SectionType.NEXT_PAGE,
page: { margin: { top: 850, bottom: 850, left: 1200, right: 1200 } } },
children: [
new Paragraph({
alignment: AlignmentType.CENTER, spacing: { after: 300 },
children: [new TextRun({ text: "参考答案", size: 28, bold: true,
font: { eastAsia: "SimHei" } })],
}),
// ... answer content paragraphs
],
}
```
### Rules
1. ⚠️ **Never place answer content directly after the last question without a page/section break**
2. Answer content should be concise — no answer lines, no grid, plain text only
3. Calculation/proof questions: show key steps, not just final answer
4. If the exam has figures, answers may reference "see Figure X" without re-embedding
---
## 12. Figures & Illustrations
### Image Insertion
```js
new Paragraph({ alignment: AlignmentType.CENTER, spacing: { before: 100, after: 60 },
children: [new ImageRun({ data: imageBuffer, transformation: { width: 300, height: 200 }, type: "png" })] })
new Paragraph({ alignment: AlignmentType.CENTER, spacing: { after: 100 },
children: [new TextRun({ text: "(Figure 1)", size: 18, color: "666666", font: "SimSun" })] })
```
### Key Principles
- Images set as inline (default) to prevent floating
- Resolution sufficient for print clarity
- **B&W print compatible:** images must remain distinguishable when printed in grayscale
- Figure numbers and captions complete
- Figures adjacent to corresponding questions
- Maps must have: scale bar, north arrow, legend
- Coordinate graphs must have: axis labels, tick marks, units
### ⚠️ Figure-Text Order (Strictly Enforced)
**For questions with figures, element order must be:**
```
1. Question stem (keepNext: true)
2. Figure (centered, keepNext: true)
3. Answer lines / answer area
```
**Forbidden:** answer lines between stem and figure, or figure after answer lines.
### Figure Content Matching
- **Figures must be semantically consistent with question stem:** if question says "triangle ABC", figure must label vertices A, B, C
- Geometry annotations must match described angles, side lengths
- Function graphs must mark key points mentioned in the question
- Physics experiment diagrams must match described apparatus
- Figure width: geometry ≤ 50% page width, data/experiment ≤ 70%
### ⚠️ Figure Diversity Rule (Mandatory)
**No duplicate figures in the entire paper.** Even if two questions involve the same type (e.g., both triangles), each must have a distinct figure:
1. Different labels (different vertex letters, angles, side lengths)
2. Different shapes (acute vs. right vs. obtuse triangle)
3. Different styling (if applicable)
If using matplotlib, each call must use **different parameters and data** — never copy the same generation code.
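Exact duplicates can be detected on the image bytes before insertion. The sketch below assumes a Node.js runtime (it relies on the global `Buffer`) and only catches byte-identical images; figures that merely look alike still need a manual check. `findDuplicateFigures` is a hypothetical helper.

```js
// figures: [{ id, data }] where data is a Buffer or Uint8Array of image bytes.
// Returns pairs [firstId, duplicateId] for byte-identical images.
function findDuplicateFigures(figures) {
  const seen = new Map();
  const duplicates = [];
  for (const { id, data } of figures) {
    const key = Buffer.from(data).toString("base64");
    if (seen.has(key)) duplicates.push([seen.get(key), id]);
    else seen.set(key, id);
  }
  return duplicates;
}

const dupes = findDuplicateFigures([
  { id: "q3", data: Buffer.from("triangle-abc") },
  { id: "q7", data: Buffer.from("triangle-abc") },
  { id: "q9", data: Buffer.from("triangle-def") },
]);
console.log(dupes); // [ [ 'q3', 'q7' ] ]
```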
### Subject-Specific Figure Requirements
| Subject | Common Types | Special Requirements |
|---------|-------------|---------------------|
| Math | Geometry, functions, coordinates | No distortion, clear labels |
| Physics | Circuits, mechanics, apparatus | Standard symbols, correct arrows |
| Chemistry | Apparatus, molecular structures | Reagent names labeled |
| Biology | Cell, organ, ecosystem diagrams | Labels not too small |
| Geography | Maps, contour lines, statistics | Legend + scale + north arrow |
---
## 13. Formulas & Special Symbols
### Formulas
Math/physics/chemistry formulas use **LaTeX → docx-js Math mapping** (see `references/math-formulas.md`):
- Basic (fractions, sub/superscript, roots) → docx-js Math components
- Complex (3+ nesting, matrices) → matplotlib PNG fallback
- Never hand-type Unicode formula approximations
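The basic/complex split can be approximated from the LaTeX source. This sketch uses brace depth as a rough proxy for nesting, which is an assumption made here and not a rule from `references/math-formulas.md`; `braceDepth` and `formulaRoute` are hypothetical helpers, and escaped braces (`\{`) are not treated specially.

```js
// Maximum depth of nested { } groups in a LaTeX string.
function braceDepth(latex) {
  let depth = 0;
  let max = 0;
  for (const ch of latex) {
    if (ch === "{") {
      depth += 1;
      max = Math.max(max, depth);
    } else if (ch === "}") {
      depth = Math.max(0, depth - 1);
    }
  }
  return max;
}

// Route matrices and deeply nested formulas to the PNG fallback.
function formulaRoute(latex) {
  const hasMatrix = /\\begin\{[a-zA-Z]*matrix\}/.test(latex);
  return hasMatrix || braceDepth(latex) >= 3 ? "png-fallback" : "docx-math";
}

console.log(formulaRoute(String.raw`\frac{a}{b}`)); // docx-math
console.log(formulaRoute(String.raw`\sqrt{\frac{1}{x^{2}}}`)); // png-fallback
```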
### Common Unicode Math Symbols
```
× ÷ ± ∓ ≠ ≈ ≤ ≥ ∞ √ ∑ ∏ ∫ ∂ ∆ ∇
α β γ δ ε θ λ μ π σ φ ω
⊂ ⊃ ∈ ∉ ∩ ∅ ∀ ∃
→ ← ↑ ↓ ⇒ ⇔ ° ″ ‰ ² ³ ⁴ ⁿ ₁ ₂ ₃
```
### Chemical Formulas
Subscripts/superscripts must be correct: H₂O, CO₂, Fe₂O₃, Ca(OH)₂
Reaction arrows: → ⇌ ↑ ↓
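Plain-typed formulas such as `H2O` can be normalized before they reach a `TextRun`. A minimal sketch, assuming digits that follow a letter or a closing parenthesis are subscripts and leading coefficients stay as they are; ion charges (superscripts) are not handled. `chemSubscripts` is a hypothetical helper.

```js
const SUBSCRIPTS = "₀₁₂₃₄₅₆₇₈₉";

// Replace digits that follow an element symbol or ")" with Unicode subscripts.
function chemSubscripts(formula) {
  return formula.replace(/(?<=[A-Za-z)])\d+/g, (digits) =>
    [...digits].map((d) => SUBSCRIPTS[Number(d)]).join("")
  );
}

console.log(chemSubscripts("2H2O"));    // 2H₂O
console.log(chemSubscripts("Ca(OH)2")); // Ca(OH)₂
```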
---
## 14. Table Usage Standards
### Borderless Tables (for alignment)
For: option alignment, info rows, question number + points alignment
```js
const NB = { style: BorderStyle.NONE, size: 0, color: "FFFFFF" };
const NBs = { top: NB, bottom: NB, left: NB, right: NB };
```
### Bordered Tables (for data display)
For: score tables, data tables, statistics
```js
const thinB = (c="000000") => ({ style: BorderStyle.SINGLE, size: 1, color: c });
const thinBs = (c="000000") => ({ top: thinB(c), bottom: thinB(c), left: thinB(c), right: thinB(c) });
```
### Table Standards
- Cell padding moderate (margins: top/bottom 40–60, left/right 60–80)
- Consistent border thickness
- Header row: light grey F0F0F0 background
- Avoid cross-page tables
- Tables centered (`alignment: AlignmentType.CENTER`)
---
## 15. Headers & Footers
### Page Numbers
```js
footers: { default: new Footer({ children: [
new Paragraph({ alignment: AlignmentType.CENTER,
children: [
new TextRun({ children: [PageNumber.CURRENT], size: 18, font: "SimSun" }),
] })
] }) }
```
⚠️ **Denominator FORBIDDEN** — never use `PageNumber.TOTAL_PAGES` or "Page X of Y". Show only current page number.
### Headers
- May contain seal line prompt or subject name
- Small font (8–9pt), grey color (999999)
- Should not be visually heavy — must not compete with content
---
## 16. Subject-Specific Standards
### Chinese Language
- Reading, classical poetry, composition: **no columns**
- Poetry preserves original line breaks
- Classical text needs annotation area (smaller font, indented)
- Composition grid in independent section, grid count via `calcGridSize` (800 words → 50×20 = 1000 cells)
- Dictation questions: horizontal lines, moderate length
- Reading materials: use KaiTi to differentiate
### Mathematics
- Multiple choice, fill-in: suitable for neat layout
- Formulas: docx-js Math components per § 13; Unicode only for standalone symbols (×, ≤, π)
- Geometry/function graphs must be clear, undistorted
- Problem-solving: sufficient working space
- Coordinate graphs: labeled axes, tick marks
### English
- English font: Times New Roman, moderate character spacing
- Cloze: numbers in text, options after passage
- Reading comprehension: material + questions as groups
- Writing area: horizontal lines, not grid
- Listening (if any): numbers aligned with options
### Physics / Chemistry / Biology
- Experiment/apparatus diagrams must be clear and accurate
- Unit symbols standardized (m/s, kg, mol/L, etc.)
- Chemical formula subscripts correct
- Calculation and experiment analysis: sufficient answer space
- Biology structure diagrams: labels not too small
### History / Politics
- Source-based questions are lengthy — **no columns**
- Dates, figures, events clearly labeled
- Essay questions: more whitespace than multiple choice
- Historical sources cite provenance
- Chart materials in logical order
### Geography
- Maps are the focus — must be clear
- Legend, scale bar, north arrow required
- Map and question close together — avoid page turns
- Map reading questions: balance figure and text space
- Contour line values clearly labeled
---
## Final Review Checklist
After generating an exam paper, check every item:
- [ ] Question numbers sequential, points correct, total correct
- [ ] Question stems match options / materials / illustrations one-to-one
- [ ] **Figures come after stem, before answer area** (strict order)
- [ ] **Figure content matches question semantics** (labels, symbols match)
- [ ] **Composition grid count ≥ required words × 1.25** (800 words → at least 1000 cells)
- [ ] Options aligned with borderless tables (not spaces)
- [ ] No wrong pages, missing pages, **no extra blank pages**
- [ ] Images / tables / formulas positioned correctly
- [ ] **No Markdown table syntax in document** — all data tables use proper docx Table objects
- [ ] Fonts, sizes, line spacing consistent
- [ ] Answer space matches difficulty and point value
- [ ] Clear when printed in B&W
- [ ] Subject-specific layout handled properly
- [ ] Seal line / page numbers / headers formatted correctly
- [ ] Header info complete (school, subject, duration, total score)
- [ ] **No extra PageBreak at end of last section**
- [ ] **Answer key is either a separate file (default) or on a separate page (if user requested in same file)** — never on the same page as questions


@@ -0,0 +1,411 @@
# Scene: Official Document (Government Notice / Letter / Reply / Minutes)
## Goal
Generate a complete, formal, properly structured official document ready for Word delivery. Must simultaneously meet:
- Correct document type, complete structure, clear elements
- Formal government register, stable hierarchy, reliable layout
- Ready for approval, circulation, filing, issuance, or formal internal communication
**Forbidden:** Producing outlines-only / sample paragraphs / writing advice / half-finished drafts; outputting chat-style explanations.
→ Placeholder convention & universal prohibitions — see `references/common-rules.md`
**Note:** This scene uses its OWN font and layout specs (not Profile A defaults), because official documents follow GB/T 9704 standards.
---
## Scope & Document Type Boundaries
This scene covers:
1. **Notice** — assigning work, communicating requirements, forwarding documents
2. **Official Letter** — between non-subordinate organizations: negotiation, inquiry, assistance requests, replies
3. **Reply (to Request)** — superior authority answering a subordinate's formal request
4. **Meeting Minutes** — recording key outcomes and agreed items
**Important boundaries:**
- "Red header" is a format/layout, not a document type — it typically carries notices, letters, or replies
- **Not all official documents need red headers / document numbers / colophons** — only enable when user explicitly requests "red header format", "GB/T 9704 format", or "formal issuance format"
- Internal enterprise notices, business letters, meeting minutes often do NOT use full GB/T standard format
- This scene does NOT cover: speeches, press releases, promotional materials, papers, summary reports, contracts, or legal opinions
---
## Document Type Routing
```js
function selectOfficialType(keywords, purpose) {
  if (/minutes|meeting/.test(keywords)) return "minutes";
  if (/reply|respond to request/.test(keywords)) return "reply";
  if (/letter|inquiry|negotiation/.test(keywords)) return "letter";
  return "notice"; // default
}
```
### Red Header Activation
```js
function needsRedHeader(userRequest) {
  // Only activate when explicitly requested
  return /red header|GB\/T 9704|formal issuance|official format/.test(userRequest);
}
```
**Rules:**
- `needsRedHeader = true` → Enable red header, document number, colophon (full formal elements)
- `needsRedHeader = false` → Maintain formal style but no mandatory red header; keep only title + addressee + body + signature
---
## Standard Template Structures
### Template A: Notice
1. Red header area (if applicable)
2. Document number (if applicable)
3. Title
4. Addressee
5. Reason for issuance
6. "The relevant matters are hereby notified as follows:"
7. Notice items (expanded by hierarchy)
8. Requirements
9. Attachment notes (if any)
10. Signature (if applicable)
11. Date (if applicable)
12. Colophon (if applicable)
**Closing phrase:** "This notice is hereby given." or "Please implement accordingly."
### Template B: Official Letter
1. Red header area (if applicable)
2. Document number (if applicable)
3. Title
4. Addressee
5. Reason / reference to incoming letter
6. Negotiation / inquiry / reply items
7. Closing
8. Signature (if applicable)
9. Date (if applicable)
10. Colophon (if applicable)
**Closing phrases:** "Please reply by letter." / "This letter is hereby sent." / "This is in reply."
### Template C: Reply
1–11. Similar to Notice structure
- Addressee is typically the single requesting organization
- Must reference the incoming request document
- "After review, the reply is as follows:"
- Closing: "This is the reply."
### Template D: Meeting Minutes
1. Title (meeting name + "Minutes")
2. Meeting overview (time, place, chair, attendees)
3. Agreed items
4. Responsibility assignments / follow-up requirements (if applicable)
5. Distribution scope (if applicable)
**Notes:**
- Minutes record "agreed items", not a transcript of speeches
- Minutes generally do NOT follow standard red header format unless the user explicitly requests compliance with an organizational template
---
## Input Recognition & Completion
### Processing Rules
1. If user provides a template, historical document, or organizational standard → **always follow it first**
2. If information is incomplete → fill conservatively, formally, and appropriately for the government context
3. **Never fabricate** policy bases, incoming document numbers, leadership directives, meeting decisions, or official organization names
4. If critical info is missing → use standardized placeholders
5. Never present a draft as if it were already formally issued
---
## Title Drafting Rules
The title is the most critical identifying element — must accurately, concisely reflect the issuing body, subject matter, and document type.
| Type | Format | Example |
|------|--------|---------|
| Notice | Issuing body + "regarding" + subject + "notice" | XX Municipal Government Notice on Issuing the XX Management Measures |
| Letter | Issuing body + "regarding" + subject + "letter" | XX Company Letter Regarding Land Use for XX Project |
| Reply | Issuing body + "regarding" + subject + "reply" | XX Bureau Reply on Approving Establishment of XX Branch |
| Minutes | Meeting name + "minutes" | XX Company Third General Manager Meeting Minutes |
**Rules:**
1. Title must specify the subject — no vague titles ("Notice on Relevant Matters")
2. Titles generally do not use periods
3. Title length should be moderate — avoid excessive length
---
## Addressee & CC
### Addressee
1. The primary recipient of the document
2. On its own line, between title and body
3. Followed by full-width colon
4. Replies typically address only one requesting organization
5. Meeting minutes generally do not have a standard addressee
### CC (Carbon Copy)
1. CC recipients are NOT addressees — do not mix them
2. CC information typically appears in the colophon area
3. Non-red-header documents should not mechanically add "CC:" lines
---
## Writing Style & Register
### Language Style
1. Must be **solemn, plain, precise, rigorous, concise**
2. **Forbidden:** Literary devices (metaphor, personification, hyperbole, rhetorical questions, exclamations)
3. **Forbidden:** Vague expressions ("approximately", "recently", "relevant departments", "as soon as possible") — unless user explicitly requires vague wording
4. Time, location, organization, scope, milestones should be as specific as possible
5. No sloganeering filler or obvious "AI boilerplate" feel
### Common Phrase Patterns
**Purpose phrases:**
- "In order to implement..."
- "To further standardize..."
- "To effectively carry out..."
**Basis phrases:**
- "In accordance with the provisions of..."
- "As required by..."
- "Pursuant to relevant regulations"
**Transition phrases:**
- Notice: "The relevant matters are hereby notified as follows:"
- Letter: "The following is hereby communicated:"
- Reply: "After review, the reply is as follows:"
- Minutes: "The agreed items of the meeting are recorded as follows:"
**Closing phrases (must match document type):**
- Notice: "This notice is hereby given."
- Letter: "Please reply." / "This is hereby communicated." / "This is in reply."
- Reply: "This is the reply."
- Minutes: generally no fixed closing phrase
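The type-to-closing-phrase pairing above can be expressed as a lookup table and checked before delivery. This is a rough sketch only: the phrases are the English renderings from the list above (a real document would use the organization's actual wording), and `CLOSING_PHRASES` and `closingMatchesType` are hypothetical names.

```js
// Closing phrases per document type, copied from the list above.
// "minutes" maps to an empty list because minutes have no fixed closing phrase.
const CLOSING_PHRASES = {
  notice: ["This notice is hereby given."],
  letter: ["Please reply.", "This is hereby communicated.", "This is in reply."],
  reply: ["This is the reply."],
  minutes: [],
};

// Hypothetical check: does the body end with a closing phrase allowed for its type?
function closingMatchesType(type, bodyText) {
  const allowed = CLOSING_PHRASES[type];
  if (!allowed) throw new Error(`Unknown document type: ${type}`);
  const text = bodyText.trim();
  if (allowed.length === 0) {
    // No fixed closing: fail only if another type's phrase was borrowed.
    const all = Object.values(CLOSING_PHRASES).flat();
    return !all.some((phrase) => text.endsWith(phrase));
  }
  return allowed.some((phrase) => text.endsWith(phrase));
}
```

A check like this catches the most common mismatch, such as a reply that ends with a notice closing.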
### Conciseness
1. Use "because" not "due to the reason that..."
2. Use "to" not "for the purpose of..."
3. Name specific entities — not "relevant parties" or "related departments"
4. Name responsible units — not "all units should ensure implementation" (vague ending)
---
## Body Hierarchy & Numbering
Official document body must strictly follow the standard Chinese government numbering system:
```
I. General matters
(1) Sub-items
1. Specific points
(1) Detail supplements
```
Original Chinese numbering:
```
一、General matters
（一）Sub-items
1. Specific points
（1）Detail supplements
```
**Rules:**
1. No level-skipping
2. **Forbidden:** Markdown list markers (`-` `*`)
3. No switching between numbering styles at the same level
4. Level 1: major tasks; Level 2: sub-items; Levels 3–4: only when truly necessary
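Generating the labels programmatically helps avoid mixing numbering styles at one level. The sketch below assumes the four-level scheme 一、 → （一） → 1. → （1） with full-width parentheses; `toChineseNumber` and `numberingLabel` are hypothetical helper names.

```js
const CN_DIGITS = ["〇", "一", "二", "三", "四", "五", "六", "七", "八", "九"];

// Chinese numeral for 1..99, which is enough for heading indices.
function toChineseNumber(n) {
  if (!Number.isInteger(n) || n < 1 || n > 99) {
    throw new RangeError("expected an integer in 1..99");
  }
  if (n < 10) return CN_DIGITS[n];
  const tens = Math.floor(n / 10);
  const ones = n % 10;
  // 10..19 are written 十, 十一, ... without a leading 一.
  return (tens > 1 ? CN_DIGITS[tens] : "") + "十" + (ones ? CN_DIGITS[ones] : "");
}

// Label for the heading at 1-based `index` on `level` (1..4).
function numberingLabel(level, index) {
  switch (level) {
    case 1: return `${toChineseNumber(index)}、`;
    case 2: return `（${toChineseNumber(index)}）`;
    case 3: return `${index}.`;
    case 4: return `（${index}）`;
    default: throw new RangeError("level must be 1..4");
  }
}
```

Because each level has exactly one label format, switching styles within a level becomes impossible by construction.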
---
## Truthfulness & Caution
1. **Never fabricate** issuing bodies, incoming organizations, document numbers, leadership directives, meeting decisions, or policy bases
2. **Never** write "per the spirit of XX meeting" or "per XX directive" unless user explicitly provides these
3. **Never** fabricate titles and numbers of referenced documents in replies or letters
4. **Never** present a draft as already formally issued
5. When information is insufficient → use placeholders, never pretend elements are complete
---
## Attachment Notes
1. Placed after body text, before signature
2. "Attachment:" followed by attachment name
3. Multiple attachments: numbered sequentially (Attachment 1, Attachment 2...)
4. Attachment names must be clear and specific — never fabricate unknown attachments
---
## Signature & Date
1. Document types requiring signatures should have issuing body name and date
2. Not all types mechanically require signatures (minutes typically do not)
3. Formal document dates must use Chinese numeral format with proper "〇" character
- Example: March 31, 2026 → 二〇二六年三月三十一日
4. Document numbers use tortoiseshell brackets "〔〕" (not square brackets "[]")
- Example: X政发〔2026〕1号
5. Date format must be consistent throughout
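The date and document-number conventions above are easy to get wrong by hand, so a small formatter may help. This is a minimal sketch under stated assumptions: the year is written digit by digit with "〇" for zero, month and day are written as ordinary Chinese numbers, and the helper names `formatChineseDate` and `formatDocNumber` are hypothetical.

```js
const CN_DIGITS = ["〇", "一", "二", "三", "四", "五", "六", "七", "八", "九"];

// Chinese numeral for a month or day value (1..31).
function toChineseCount(n) {
  if (!Number.isInteger(n) || n < 1 || n > 31) {
    throw new RangeError("expected an integer in 1..31");
  }
  if (n < 10) return CN_DIGITS[n];
  const tens = Math.floor(n / 10);
  const ones = n % 10;
  return (tens > 1 ? CN_DIGITS[tens] : "") + "十" + (ones ? CN_DIGITS[ones] : "");
}

// Year is converted digit by digit (2026 → 二〇二六), not as a quantity.
function formatChineseDate(year, month, day) {
  const y = String(year).split("").map((d) => CN_DIGITS[Number(d)]).join("");
  return `${y}年${toChineseCount(month)}月${toChineseCount(day)}日`;
}

// Document number with tortoiseshell brackets 〔〕 around the year.
function formatDocNumber(prefix, year, sequence) {
  return `${prefix}〔${year}〕${sequence}号`;
}
```

Calling `formatChineseDate(2026, 3, 31)` reproduces the example date above, and `formatDocNumber("X政发", 2026, 1)` reproduces the example document number.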
---
## Palette
**NO decorative colors.** Pure black text on white background. The only color is red header text.
```js
const palette = { primary:"#000000", body:"#000000", accent:"#000000", surface:"#FFFFFF" };
const RED_HEADER = "FF0000"; // Only for red header text
```
---
## Page Layout (GB/T 9704-2012 Standard)
**Only for formal GB/T red-header documents.** Non-GB/T scenarios may use standard margins.
| Property | Value | Twips |
|----------|-------|-------|
| Top margin | 3.7 cm | 2098 |
| Bottom margin | 3.5 cm | 1984 |
| Left margin | 2.8 cm | 1588 |
| Right margin | 2.6 cm | 1474 |
```js
// GB/T red header layout
page: { size: { width: 11906, height: 16838 }, margin: { top: 2098, bottom: 1984, left: 1588, right: 1474 } }
// Non-GB/T formal documents may use standard margins:
// margin: { top: 1440, bottom: 1440, left: 1701, right: 1417 }
```
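The twips column can be derived rather than memorized, since 1 inch = 2.54 cm = 1440 twips. The helper below is an illustrative sketch (`cmToTwips` is a hypothetical name) that rounds to the nearest twip.

```js
// Convert centimeters to twips: 1 inch = 2.54 cm = 1440 twips.
function cmToTwips(cm) {
  return Math.round((cm / 2.54) * 1440);
}

// Margins from the GB/T 9704-2012 table above, derived from centimeters.
const gbtMargins = {
  top: cmToTwips(3.7),
  bottom: cmToTwips(3.5),
  left: cmToTwips(2.8),
  right: cmToTwips(2.6),
};

// A4 page size, 21.0 cm × 29.7 cm.
const a4 = { width: cmToTwips(21.0), height: cmToTwips(29.7) };
```

Nearest-twip rounding of 2.8 cm gives 1587, one twip below the 1588 listed in the table. The difference is roughly 0.02 mm and is not visible in print, so either value should be acceptable.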
---
## Font Specifications (GB/T 9704)
| Element | Font | Size | Style |
|---------|------|------|-------|
| Red header org name | STXiaoBiaoSong / SimSun Bold | As determined by org | Red (#FF0000), centered |
| Document title | STXiaoBiaoSong / SimSun Bold | Er Hao 22pt (size: 44) | Centered |
| Addressee | FangSong | San Hao 16pt (size: 32) | Left-aligned |
| Body | FangSong | San Hao 16pt (size: 32) | Justified, indent 640 |
| Level 1 heading | SimHei | San Hao 16pt (size: 32) | Bold |
| Level 2 heading | KaiTi | San Hao 16pt (size: 32) | Normal |
| Level 3 heading | FangSong | San Hao 16pt (size: 32) | Bold |
| Attachment notes | FangSong | San Hao 16pt (size: 32) | Left-aligned |
| Signature/date | FangSong | San Hao 16pt (size: 32) | Right-aligned |
| Page number | FangSong | Si Hao 14pt (size: 28) | Centered, "— X —" |
**Font fallback for STXiaoBiaoSong:** This font is not installed by default on all systems. WPS ships FZXiaoBiaoSong-S13 instead. Use this fallback chain:
- Preferred: `STXiaoBiaoSong` (华文小标宋)
- Fallback 1: `FZXiaoBiaoSong-S13` (方正小标宋, available in WPS)
- Fallback 2: `SimSun` with Bold (宋体加粗, universally available)
In code, set primary font and note the fallback:
```js
font: { eastAsia: "STXiaoBiaoSong" }
// Fallback: FZXiaoBiaoSong-S13 → SimSun Bold. User may need to install STXiaoBiaoSong for exact rendering.
```
```js
styles: {
  default: {
    document: {
      run: { font: { ascii: "Times New Roman", eastAsia: "FangSong" }, size: 32, color: "000000" },
      // Fixed 28pt line spacing: 560 twentieths of a point, enforced with lineRule "exact"
      paragraph: { spacing: { line: 560, lineRule: "exact" } },
    },
    heading1: {
      run: { font: { eastAsia: "SimHei" }, size: 32, bold: true, color: "000000" },
    },
    heading2: {
      run: { font: { eastAsia: "KaiTi" }, size: 32, color: "000000" },
    },
  },
}
```
**Note:** For "formal administrative style" (not strict GB/T), retain the style logic but do not rigidly require every GB/T element.
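The numeric values in the font table and style block follow from two unit conventions in the Word file format: run sizes are stored in half-points, and a fixed or minimum line height is stored in twentieths of a point. A small sketch of the conversions follows; the helper names are hypothetical.

```js
// Run font size: Word stores half-points, so 16pt → 32.
function ptToHalfPoints(pt) {
  return Math.round(pt * 2);
}

// Line height: twentieths of a point, so 28pt → 560.
// This unit applies when the line rule is "exact" or "atLeast";
// with the automatic rule the same field is read as 240ths of a line.
function ptToLineUnits(pt) {
  return Math.round(pt * 20);
}

const sizes = {
  title: ptToHalfPoints(22),      // Er Hao
  body: ptToHalfPoints(16),       // San Hao
  pageNumber: ptToHalfPoints(14), // Si Hao
  line: ptToLineUnits(28),
};
```

Deriving the numbers from points keeps the table and the style definitions consistent if an organization's template specifies different sizes.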
---
## Code Examples
### Red Header (red-header documents only)
```js
new Paragraph({ alignment: AlignmentType.CENTER, spacing: { before: 0, after: 200, line: Math.ceil(26 * 23), lineRule: "atLeast" },
children: [new TextRun({ text: "XX Municipal Government", font: { eastAsia: "SimSun" },
size: 52, bold: true, color: "FF0000" })] })
new Paragraph({ border: { bottom: { style: BorderStyle.SINGLE, size: 4, color: "FF0000" } },
spacing: { after: 40 }, children: [] })
```
### Page Number Footer
```js
footers: { default: new Footer({ children: [new Paragraph({
alignment: AlignmentType.CENTER,
children: [
new TextRun({ text: "\u2014 ", size: 28 }),
new TextRun({ children: [PageNumber.CURRENT], size: 28 }),
new TextRun({ text: " \u2014", size: 28 }),
],
})] }) }
```
---
## Style Rules
1. **Strictly follow official document format — no decorative elements**
2. NO cover page
3. NO TOC
4. NO headers (only page numbers in footer)
5. NO colors except red header (red-header documents only)
6. NO images or charts (unless integral to document content)
7. NO fancy fonts — only FangSong, SimHei, KaiTi, STXiaoBiaoSong
8. Line spacing: fixed 28pt (`line: 560`) — **NOT** the default 1.5x
---
## Scene-Specific Prohibitions
In addition to universal prohibitions (see `references/common-rules.md`):
1. Must not write official documents as chat replies, promotional copy, speeches, or papers
2. Must not use Markdown headings/lists/bold/italic for document hierarchy
3. Must not apply red header/document number/colophon to all document types indiscriminately
4. Must not format meeting minutes as a standard red-header notice
5. Must not use literary rhetoric, colloquial expressions, or strongly emotional language
6. Must not fabricate incoming documents, policies, document numbers, meeting decisions, or superior directives
7. Must not use excessive blank lines to create "formal appearance"
8. Must not let the document read like a report, paper, or marketing copy
---
## Scene-Specific Quality Checks
In addition to universal checks (see `references/common-rules.md`):
### Format
- [ ] Red header text is #FF0000 and only red header uses color (red-header scenarios)
- [ ] Line spacing fixed at 28pt (line: 560)
- [ ] FangSong / SimHei / KaiTi correctly applied
- [ ] Signature right-aligned, date format correct
- [ ] No cover page, no TOC, no header
- [ ] Page number format "— X —"
- [ ] Red header / document number / colophon only where appropriate
### Content
- [ ] Document type correctly identified, structure matches
- [ ] Title is accurate, specific, document type clear (not vague)
- [ ] Addressee, attachments, signature, colophon used appropriately
- [ ] Closing phrase matches document type
- [ ] Body hierarchy strictly follows: 一、 (Level 1) → （一） (Level 2) → 1. (Level 3) → （1） (Level 4)
- [ ] No Markdown headings/lists/bold/italic mixed in
- [ ] Meeting minutes not incorrectly given standard document signature and colophon
- [ ] Date uses Chinese numerals with proper "〇" character
- [ ] Document number uses tortoiseshell brackets "〔〕"
- [ ] No fabricated incoming documents / policy bases / organizational elements
- [ ] Register is solemn and plain — no colloquial / literary / promotional tone

340
skills/docx/scenes/report.md Executable file

@@ -0,0 +1,340 @@
# Scene: Report / Proposal
## Goal
Generate a complete, formal, well-structured report ready for Word delivery. Must simultaneously meet:
- Complete structure, clear logic, formal language, definitive conclusions
- Objective data presentation, proper Word formatting
- Ready for presentation, filing, review, submission, or internal communication
**Forbidden:** Producing outlines-only / summaries / template annotations / half-finished drafts; outputting chat-style explanations or filler phrases like "here is the report content".
→ Font profile: **A (Formal)** — see `references/common-rules.md`
→ Default layout: standard margins — see `references/common-rules.md`
→ Placeholder convention — see `references/common-rules.md`
→ Universal prohibitions & quality checks — see `references/common-rules.md`
---
## Report Type Routing
Auto-select structure and expression style based on user intent. If not explicit, infer from topic.
```js
function selectReportType(keywords, topic) {
  if (/analysis|competitor|industry|operations|data/.test(keywords)) return "analysis";
  if (/experiment|lab|algorithm|engineering/.test(keywords)) return "experiment";
  if (/test|QA|performance|security|compatibility/.test(keywords)) return "testing";
  if (/survey|questionnaire|interview|market research/.test(keywords)) return "research";
  if (/review|retrospective|post-mortem|summary/.test(keywords)) return "review";
  if (/proposal|feasibility|implementation|optimization/.test(keywords)) return "proposal";
  return "analysis"; // default
}
```
### 6 Report Types
| Type | Use Case | Structure Focus | Expression Focus |
|------|----------|----------------|-----------------|
| analysis | Industry/competitor/operations/data analysis | Background → Dimensions → Findings → Diagnosis → Recommendations | Conclusion-first, clear dimensions, chart-supported, actionable advice |
| experiment | Scientific/academic/algorithm/engineering experiments | Objective → Environment → Method → Results → Error → Conclusion | Precise process, clear conditions, objective results, conclusion ties to hypothesis |
| testing | Functional/performance/security/compatibility testing | Overview → Scope → Plan → Results → Defects → Risks → Conclusion | Data-driven, traceable, reproducible, supports go/no-go decisions |
| research | User/market/survey/interview research | Background → Subjects & Method → Sample → Findings → Synthesis → Recommendations | Clear sample boundaries, layered findings, recommendations match findings |
| review | Project/incident retrospective, phase summary | Goals → Review → Results → Issues → Lessons → Actions | Clear facts, restrained attribution, specific action items |
| proposal | Project/optimization proposal, feasibility study | Status → Goals → Solution → Roadmap → Resources → Risks → Benefits | Strong argumentation, executable plan, clear boundaries |
---
## Standard Template Structures
### Template A: Analysis Report
1. Executive Summary
2. Background & Objectives
3. Scope, Data Sources & Methodology
4. Core Findings
5. Problem Diagnosis & Root Cause
6. Conclusions & Recommendations
7. Appendices (if needed)
### Template B: Experiment Report
1. Abstract
2. Objective & Hypothesis
3. Environment & Materials
4. Procedure & Method
5. Data & Results
6. Error Analysis & Discussion
7. Conclusions
8. Appendices (if needed)
### Template C: Testing Report
1. Test Overview
2. Test Scope & Environment
3. Test Plan & Case Design
4. Test Results Summary
5. Defect Analysis & Distribution
6. Risk Assessment & Outstanding Issues
7. Test Conclusions
8. Appendices (if needed)
### Template D: Research Report
1. Research Summary
2. Background & Objectives
3. Subjects & Methodology
4. Sample & Data Description
5. Core Findings
6. Problem Synthesis
7. Recommendations & Action Direction
8. Appendices (if needed)
### Template E: Review / Summary Report
1. Overview
2. Goals & Scope
3. Process Review
4. Results Summary
5. Issues & Root Cause Analysis
6. Lessons Learned
7. Follow-up Action Plan
8. Appendices (if needed)
### Template F: Proposal / Feasibility Report
1. Executive Summary
2. Current State & Problem Analysis
3. Goals & Expected Outcomes
4. Solution Design
5. Implementation Roadmap & Milestones
6. Resource Requirements & Budget
7. Risk Analysis & Mitigation
8. Expected Benefits & Evaluation
9. Appendices (if needed)
**If the user provides a company/school/course template or fixed chapter requirements, always follow those first.**
---
## Input Recognition & Completion
### User May Provide
Report topic, type, use case, audience, industry, length requirements, data sources, structure requirements, output purpose (presentation/filing/audit/review/external submission/coursework), template files, company/department/project/author/date, etc.
### Processing Rules
1. If the user provides a template, existing document, company standard, or format example, **always follow it first**
2. If information is incomplete, fill in conservatively — completions must be **restrained, natural, credible, professional**
3. **Never fabricate** unrealistic data, conclusions, test results, business metrics, project statuses, policy backgrounds, or customer feedback
4. If critical information is missing and cannot be safely inferred, use standardized placeholders
5. If no real data is available, prefer low-hallucination approaches: "status description → analysis framework → problem synthesis → recommendations"
---
## Content Quality Constraints
### Logic & Structure
1. Report must revolve around a clear topic, objective, audience, and through-line
2. Must not just pile up background/concepts/vague statements — must demonstrate analysis, synthesis, judgment, comparison, or review value
3. Terminology must be consistent throughout — concepts must not drift
4. Abstract, body, conclusions, and recommendations must be consistent — no self-contradiction
5. Must form a complete loop: "background → objective → method/basis → process/status → findings/results → problems/judgment → recommendations/conclusions"
6. Each major chapter must have a clear core conclusion or topic sentence — no information dump
### Language Style
1. Formal, objective, restrained, professional
2. No colloquial expressions, chat tone, hyperbole, emotional language, or propaganda style
3. For management/decision-maker audience: conclusion-first, highlight key points, actionable recommendations
4. For technical/testing reports: clear basis, reproducible process, verifiable results, stated risks
### Data Expression
1. Never use vague expressions as main conclusions: "significantly improved", "obviously optimized", "performed well", "has certain issues"
2. If data exists, express quantitatively (e.g., "average response time under 200 ms" not "fast response")
3. First occurrence of a term: write full name with abbreviation, e.g., "Application Programming Interface (API)"
4. Without real data backing, never fabricate precise figures
5. Statements about facts, data, status, and results must be internally consistent
### Truthfulness & Conservative Generation
1. Never fabricate test results, experiment data, growth rates, customer counts, interview conclusions, sample distributions, or launch decisions
2. Never present speculation as proven fact
3. Never fabricate meeting minutes, regulatory bases, customer feedback, or system logs
4. When information is insufficient, use placeholders — never pretend information is complete
5. Conclusions must be restrained — do not overstate effects, risks, or value
6. Recommendations must be grounded in preceding analysis — no conclusions from thin air
---
## Chapter Content Requirements
### (1) Cover
1. Formal reports should have a cover page
2. Cover includes: title, subtitle (if any), organization/department, author, date, classification (if requested)
3. Cover must be a separate section
4. Cover does not display page numbers
5. Use `selectCoverRecipe()` for recipe + palette (see design-system.md)
6. Common recipes: general report R1, whitepaper R2, consulting R3, proposal R4
### (2) Executive Summary
1. Formal reports **must have** a summary opening — never jump directly into details
2. Summary should briefly state: background, objective, key methodology, key findings, main recommendations
3. Suitable for quick reading by management — generally 200–400 words
4. Must not read like a TOC description or pile of background filler
### (3) Table of Contents
1. Medium-to-long formal reports should include a TOC
2. TOC must be generated from real heading styles (Heading + TOC field) — never write a fake TOC
3. TOC page is typically a separate page
4. TOC depth: usually 2–3 levels
### (4) Background & Objectives
1. Must explain why this report exists
2. Must state what problem/scenario/audience the report serves
3. If scope boundaries exist, state what the report does NOT cover
4. Must not be vague/grand background — must relate directly to this report's task
### (5) Methodology / Scope / Basis
1. Must state what materials, criteria, methods, and time range the report is based on
2. Analysis: data sources, analysis dimensions, criteria definitions
3. Experiment: environment, materials, samples, procedure principles
4. Testing: scope, version, environment, methods, coverage/rounds
5. Research: sample source, sample size, research method, time range
6. Reader must understand how conclusions were derived
### (6) Core Content / Process / Status / Results
1. Organized by logical or dimensional order — no chaotic piling
2. Each section should lead with its conclusion, then expand with evidence
3. Results must be specific — never just "performed well" or "has certain issues"
4. Data, metrics, phenomena, and comparisons must be clearly stated
5. If charts are needed but cannot be generated, use chart placeholders (see below)
### (7) Analysis / Discussion / Problem Diagnosis
1. Must not merely repeat earlier results
2. Must explain what results mean, what patterns they reveal, what problems they expose
3. May include: comparative analysis, root cause analysis, mechanism analysis, anomaly explanation, limitations, risk boundaries
4. Analysis must be consistent with preceding data and facts
### (8) Conclusions / Recommendations / Next Steps
1. Conclusions must respond to report objectives
2. Recommendations must be executable — not just principle slogans
3. Recommendations should state: who executes, what to do, when, expected improvement
4. Testing/review: clear verdict (pass / conditional pass / fail)
5. Retrospective/summary: specific follow-up action items
### (9) Appendices
1. Supplementary material valuable to the report but not suitable for the main body
2. Includes: raw data excerpts, detailed parameters, supplementary tables, sample screenshots
3. Appendices should be on separate pages with proper headings
---
## Chart Placeholder Convention
When charts are needed but cannot be directly generated:
```
[Chart Placeholder: Bar chart; Topic: Q1-Q4 2025 revenue comparison; X-axis: Quarter; Y-axis: Revenue (10K CNY); Style: clean business]
```
**Rules:**
- Specify: chart type, topic, axis meanings, key dimensions, optional palette suggestion
- Placeholder must be a standalone paragraph — never inline
- Never use vague placeholders like "insert chart here"
**Prefer direct generation:** Charts that can be produced via matplotlib should be generated as embedded PNGs. Placeholders are a fallback only.
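When a placeholder is unavoidable, building it from structured fields keeps the required parts from being dropped. The following is a sketch only: `chartPlaceholder` is a hypothetical helper, and the field order simply mirrors the example above.

```js
// Hypothetical helper: build a chart placeholder paragraph from structured fields.
// type, topic, xAxis and yAxis are required; style is optional.
function chartPlaceholder({ type, topic, xAxis, yAxis, style }) {
  const required = { type, topic, xAxis, yAxis };
  const missing = Object.keys(required).filter((key) => !required[key]);
  if (missing.length > 0) {
    // Refuse vague placeholders such as "insert chart here".
    throw new Error(`Chart placeholder is missing: ${missing.join(", ")}`);
  }
  const parts = [type, `Topic: ${topic}`, `X-axis: ${xAxis}`, `Y-axis: ${yAxis}`];
  if (style) parts.push(`Style: ${style}`);
  return `[Chart Placeholder: ${parts.join("; ")}]`;
}
```

The returned string is meant to be placed in its own paragraph, never inline with body text.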
---
## Content-to-Word Mapping
### Heading Levels
1. Strict hierarchy — no level-skipping
2. Headings must be informative — never "Background", "Content", "Other" (use "Project Background & Report Objectives" instead)
3. Do not mix multiple numbering systems
4. Normal paragraphs must not masquerade as headings
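The no-level-skipping rule can be checked over the sequence of heading levels before the document is written. This is a minimal sketch, assuming headings are represented as integers (1 for H1, 2 for H2, and so on); `findLevelSkips` is a hypothetical name.

```js
// Hypothetical check: report every heading that jumps more than one level deeper
// than the heading before it (for example H1 followed directly by H3).
function findLevelSkips(levels) {
  const skips = [];
  let previous = 0; // document start counts as level 0, so the first heading must be H1
  levels.forEach((level, index) => {
    if (level > previous + 1) {
      skips.push({ index, from: previous, to: level });
    }
    previous = level;
  });
  return skips;
}
```

Moving back up by any number of levels (H3 followed by H1) is allowed; only a downward jump of more than one level is reported.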
### Paragraphs
1. Do not use consecutive blank lines for visual spacing
2. Each paragraph should be a complete semantic unit — not too long or too fragmented
### Lists
1. Use lists only when genuinely needed — an entire report must not be bullet points
2. Nesting depth ≤ 3 levels
3. Consistent punctuation within a list (all complete sentences or all fragments)
4. Combine "key points" with "analysis paragraphs" — never just list without explaining
### Tables
1. Use tables only for structured data (statistics, comparisons, parameter lists)
2. Every table must have a header row — headers must not be blank
3. Avoid heavily merged-cell complex nested tables
4. Tables must have introductory and explanatory text before/after
5. Cell content should be concise — avoid long paragraphs inside cells
### Emphasis
1. Bold only for key conclusions, critical metrics, first occurrence of key terms
2. Never bold entire paragraphs
3. Avoid italic, strikethrough, and other unstable styles
---
## Palette Selection
| Report Type | Suggested Palette |
|-------------|-------------------|
| General | Neutral calm (primary: #101820) |
| Consulting | Warm terracotta |
| Tech | Cool dawn mist |
| Environment / Education | Warm sunshine |
| Medical | Cool mint |
See `references/design-system.md` for full palette definitions.
---
## Document Structure
1. **Cover** — via `selectCoverRecipe()` (see design-system.md)
- Separate section, page margin typically 0
- Common: general R1, whitepaper R2, consulting R3, proposal R4
2. **Table of Contents** — H1–H3, separate section
3. **Executive Summary** — 1 page max
4. **Body** — Chapters per selected template (A–F)
5. **Conclusions & Recommendations**
6. **Appendices** — Raw data, detailed tables
---
## Professional Elements
- **Page numbers**: bottom center, size 18, color "808080"
- **Header**: report title (abbreviated), size 18, color "808080"
- **Figure/table numbering**: sequential (Figure 1 / Table 1)
- **Cover**: no page number, no header/footer
- **TOC**: optional Roman numerals or no page numbers
- **Body**: Arabic numerals, continuous
---
## Scene-Specific Quality Checks
In addition to universal checks (see `references/common-rules.md`):
### Format
- [ ] Executive summary ≤ 1 page
- [ ] Figures/tables have captions ("Figure X: description" / "Table X: description")
- [ ] Cover recipe matches report type
- [ ] Data charts use palette accent color
### Content
- [ ] Has executive summary — not starting directly with details
- [ ] Heading names are specific and meaningful
- [ ] Complete loop: background → basis → content → analysis → conclusions/recommendations
- [ ] No fabricated or exaggerated details
- [ ] Abstract and conclusions are consistent
- [ ] Terminology consistent throughout
- [ ] Data expressions are quantified, not vague
- [ ] Recommendations are actionable with owners and timeline
### Structure
- [ ] Heading hierarchy has no level-skipping
- [ ] List nesting ≤ 3 levels
- [ ] Tables have headers with intro/explanation text
- [ ] Bold used sparingly for emphasis only

534
skills/docx/scenes/resume.md Executable file

@@ -0,0 +1,534 @@
# Scene: Resume / CV
## Goal
Generate a complete, authentic, well-structured, position-targeted resume with stable Word formatting. Must simultaneously meet:
- Authentic and credible content, clear position targeting
- ATS-friendly, stable Word layout
- Clean structure, professional visual design, easy to scan
**Execution priority** (when conflicting): Position relevance > Information readability > ATS compatibility > Visual decoration
**Forbidden:** Producing advice-only / fragments / half-finished drafts; outputting chat-style explanations.
→ Font profile: **B (Visual)** — see `references/common-rules.md`
→ Placeholder convention & universal prohibitions — see `references/common-rules.md`
---
## Scope
Default: generate a position-oriented general resume. Switch to English resume, academic CV, international format, or design portfolio style only when explicitly requested by the user.
---
## Resume Type Routing
Auto-select module order based on user background and target:
### General Resume (default)
Name & Contact → Target Position → Profile Summary (optional) → Core Skills → Work Experience → Projects → Education → Certifications / Awards
### New Graduate Resume
Name & Contact → Target Position → Education → Internship Experience → Projects → Campus Activities / Competitions / Awards → Skills & Certifications
### Technical Role Resume
Name & Contact → Target Direction → Profile Summary (optional) → Tech Stack / Core Skills → Work Experience → Projects → Education → Open Source / Papers / Patents / Competitions
### Academic CV
Name & Contact → Research Direction / Target → Education → Research Experience → Papers / Patents / Projects / Grants → Teaching / Academic Service → Awards / Skills / Languages
---
## Input Processing Rules
1. If user provides a target position or JD → **must reorganize and rewrite content around position requirements**
2. If user provides a raw draft → prioritize restructuring, phrasing refinement, and priority reordering; do not rewrite into an unfamiliar career
3. **Never fabricate** companies, positions, degrees, projects, certifications, awards, papers, patents, data results, or achievements
4. If critical data is missing → use conservative expressions or placeholder `【Please fill in: ______】`; never fabricate precise numbers
5. A single resume should generally serve only one primary career direction
---
## Content Quality Constraints
### Core Principles
1. Resume must revolve around the target position; do not give all experiences equal weight
2. Most relevant experiences, projects, and skills must be **placed first and detailed**
3. Terminology, company names, position titles, date formats, and skill names must be consistent
4. Must demonstrate: **personal positioning → capability tags → relevant experience → provable results**
5. Do not pile up vague self-praise; avoid inspirational writing and chronological dumps
### Experience Writing Standards
Each experience bullet should demonstrate: **Action + Object/Context + Method + Result/Impact**
**Recommended verbs:** Led, built, drove, optimized, refactored, designed, delivered, coordinated, improved, reduced, achieved
**Rules:**
- "Responsible for" / "participated in" are not absolutely forbidden, but must include scope and results
- Each bullet is concise — one core contribution per bullet
- Quantify when possible, but do not force-bold all numbers
- Recent experience gets detail; low-relevance/low-value experience gets compressed or removed
- Reverse chronological order — most recent and relevant first
- Expand the most recent 2 experiences; compress earlier ones
### Profile Summary / Self-Assessment
1. Not mandatory
2. If included, frame as "Profile Summary" — **3-4 lines max**
3. Focus on: years of experience, career direction, core capabilities, representative achievements, position fit
4. **Forbidden** as main content: "hardworking", "strong sense of responsibility", "team player", "quick learner", "outgoing personality"
### Truthfulness & Risk Control
1. Never fabricate experiences, achievements, education, awards, or certifications
2. Never upgrade "participated in" to "led" unless user information supports it
3. Never attribute team results entirely to the individual
4. Never fabricate revenue, conversion rates, headcount, budgets, or technical metrics
5. If no data available, use restrained expressions: "improved delivery efficiency", "shortened processing cycle", "supported core business launch"
---
## Length Control
| Candidate Type | Target Pages |
|---------------|-------------|
| New graduate / <3 years experience | **1 page** |
| 3-10 years experience | 1-2 pages |
| Senior manager / researcher / academic CV | May exceed 2 pages, but must maintain information density |
**Compression rules:**
- Experiences >5 years old with low relevance should be compressed
- Experiences >10 years old and irrelevant may be omitted
- Never pad low-value experiences just to "look comprehensive"
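The length table and compression rules can be sketched as two small helpers. This is an illustration only: the function names are hypothetical, the middle row is read as 3 to 10 years, relevance is simplified to a boolean, and a non-senior candidate with more than 10 years is assumed to fall under the 1-2 page row because the table does not cover that case.

```python
def target_pages(years_experience: float, senior_or_academic: bool = False) -> str:
    """Map a candidate profile to the target page count from the table."""
    if senior_or_academic:
        return "2+"   # may exceed 2 pages, information density must be kept
    if years_experience < 3:
        return "1"
    return "1-2"      # assumption: also used beyond 10 years for non-senior roles


def compression_action(age_years: float, relevant: bool) -> str:
    """Decide what to do with one experience entry based on age and relevance."""
    if relevant:
        return "keep"
    if age_years > 10:
        return "omit"      # old and irrelevant: may be dropped
    if age_years > 5:
        return "compress"  # older, low relevance: shorten
    return "keep"
```

For example, `compression_action(7, relevant=False)` yields `"compress"`, while a relevant entry is kept regardless of age.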
---
## ATS & Structure Constraints
1. Core information must be plain text — never rely on images, icons, text boxes, or headers/footers for key content
2. No embedded charts, objects, SmartArt, or WordArt
3. Experience descriptions use consistent bullet symbols — no complex auto-numbering
4. Bullets within the same position should be compact — no excess blank lines
**Table layout vs. ATS balance:** The 3 visual templates (A/B/C) use Table-based layouts for Word visual quality. In strict ATS scenarios (user explicitly says "ATS priority"), prefer Template B (single-column) with reduced table dependency. Default: visual quality first.
---
## Module Naming
Use only standard, universal, recruiter-familiar names:
- Personal Info, Target Position, Profile Summary, Core Skills, Work Experience, Projects, Education, Certifications, Awards, Languages
**Forbidden fancy names:** "My Growth Journey", "Self-Appreciation", "Shining Moments", "Life Motto"
---
## Template Disease Prevention
1. Do not include irrelevant identity tags (political affiliation, hometown, etc.) unless user explicitly requests
2. Do not place low-priority modules (hobbies, languages, personality traits) before work experience
3. Do not combine cover letter and resume in one document (unless user explicitly requests)
4. Do not let the template's look and feel overpower the actual personal information
5. Do not let "self-assessment" occupy the golden area of the page (should come after core skills/experience)
---
## Template Selection
Three templates are provided, auto-selected based on user needs:
| Template | Layout | Best For | Color Style |
|----------|--------|----------|-------------|
| A | Left sidebar + right body | General purpose, tech roles | Dark grey sidebar + blue bar headings |
| B | Dark header banner + single column | Content-heavy / senior candidates | Dark blue header + underline headings |
| C | Left sidebar + vertical-line headings | International / bilingual / foreign companies | Blue sidebar + left-border headings |
**Selection logic:**
- Default: Template A
- Lots of content (expected > 1 page) → Template B (no sidebar, better space utilization)
- User explicitly requests bilingual / English → Template C
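The selection logic can be sketched as a short decision function. The function name and parameters are hypothetical, and the precedence between the rules (ATS priority first, then an explicit bilingual or English request, then content length) is an assumption, since the text does not say which rule wins when several apply.

```python
def select_template(expected_pages: float = 1.0,
                    bilingual_or_english: bool = False,
                    ats_priority: bool = False) -> str:
    """Pick template A, B, or C from the stated selection rules."""
    if ats_priority:
        return "B"   # strict ATS scenario: single-column layout
    if bilingual_or_english:
        return "C"   # explicit bilingual / English request
    if expected_pages > 1:
        return "B"   # lots of content: no sidebar, better space utilization
    return "A"       # default
```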
### Industry Color Suggestions
| Career Direction | Sidebar BG | Accent Color | Recommended Template |
|-----------------|-----------|-------------|---------------------|
| Tech / Internet | `#1A1F36` (deep blue-purple) | `#667eea` (amethyst) | A or C |
| Finance / Consulting | `#0F2027` (deep sea blue) | `#D4AF37` (gold) | A or B |
| Design / Creative | `#2D1B30` (deep purple) | `#f5576c` (coral pink) | A or C |
| Education / Training | `#1A3A3A` (dark green) | `#3CB4A0` (mint green) | A |
| Medical / Health | `#0E2030` (dark cyan) | `#3888A8` (medical blue) | B |
| General / Default | `#303030` (warm dark neutral) | `#B89870` (warm accent) | A |
When industry is unspecified, use default warm neutral palette. This aligns with the Visual Profile warm-neutral guidance in `design-system.md`.
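The table and its fallback rule can be expressed as a lookup with a default. A minimal sketch, assuming a hypothetical `palette_for` helper; the keys and hex values are copied from the table above.

```python
# Career direction -> (sidebar background, accent color), copied from the table.
INDUSTRY_PALETTES = {
    "Tech / Internet":      {"sidebar_bg": "#1A1F36", "accent": "#667eea"},
    "Finance / Consulting": {"sidebar_bg": "#0F2027", "accent": "#D4AF37"},
    "Design / Creative":    {"sidebar_bg": "#2D1B30", "accent": "#f5576c"},
    "Education / Training": {"sidebar_bg": "#1A3A3A", "accent": "#3CB4A0"},
    "Medical / Health":     {"sidebar_bg": "#0E2030", "accent": "#3888A8"},
    "General / Default":    {"sidebar_bg": "#303030", "accent": "#B89870"},
}


def palette_for(career_direction=None) -> dict:
    """Return the palette for a career direction, or the warm neutral default."""
    return INDUSTRY_PALETTES.get(career_direction, INDUSTRY_PALETTES["General / Default"])
```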
## Key Rules
- **NO cover page / NO TOC**
- **Target: 1 page** (2 pages max for senior roles)
- **Compact spacing**: `line: 276` (1.15x)
- All templates use **bilingual section headings** (e.g., "Work Experience 工作经历")
---
## Template A: Left Sidebar + Color Bar Headings
### Color Palette
```js
const S = {
bg: "3B4F5C", // sidebar background (dark grey-blue)
text: "D8E2E8", // sidebar text
label: "8BA0AD", // sidebar secondary text
accent: "2F97B8", // accent color (blue-cyan)
title: "1A2D38", // body heading
body: "2C3E4A", // body content
sec: "6B8592", // secondary info (dates etc.)
};
```
### Layout Structure
```
┌──────────┬──────────────────────┐
│ [Photo]  │ ██ Profile ██        │ ← Blue bar heading
│          │ Summary text...      │
│ Name     │                      │
│ Title    │ ██ Work Experience ██│
│          │ Company  Role  Date  │
│ ──────── │ ▸ Achievement...     │
│ Basic    │ ▸ Achievement...     │
│ Info     │                      │
│          │ ██ Projects ██       │
│ ──────── │ ...                  │
│ Contact  │                      │
│          │ ██ Education ██      │
│ ──────── │ ...                  │
│ Skills   │                      │
│Java ●●●●○│                      │
│Go   ●●●○○│                      │
│          │                      │
│ ──────── │                      │
│ Certs    │                      │
└──────────┴──────────────────────┘
    30%             70%
```
### Implementation Notes
**Page setup:**
```js
page: { margin: { top: 0, bottom: 0, left: 0, right: 0 } }
// Use Table to simulate columns: columnWidths: [3400, 8506]
// ⚠️ Row height must use "exact" with safety margin to prevent overflow blank pages
// Row height: height: { value: 16038, rule: "exact" }
// 16038 = 16838 (A4 height) - 800 (safety margin for cross-engine compatibility)
```
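The numbers in the page setup can be checked with a little arithmetic. The sketch below assumes the values are in twips (DXA, 1/1440 inch) and that the page is A4 (210 mm × 297 mm); the constant and function names are illustrative, not part of the skill.

```python
TWIPS_PER_INCH = 1440
MM_PER_INCH = 25.4


def mm_to_twips(mm: float) -> int:
    """Convert millimetres to twips (1/1440 inch), rounded to the nearest unit."""
    return round(mm / MM_PER_INCH * TWIPS_PER_INCH)


A4_WIDTH = mm_to_twips(210)    # 11906
A4_HEIGHT = mm_to_twips(297)   # 16838

COLUMN_WIDTHS = [3400, 8506]   # sidebar + body, from the page setup above
ROW_HEIGHT = 16038             # exact row height from the page setup above

# The two columns together span the full page width (margins are 0).
assert sum(COLUMN_WIDTHS) == A4_WIDTH

# The gap between the page height and the exact row height is the safety margin.
SAFETY_MARGIN = A4_HEIGHT - ROW_HEIGHT   # 800
```

With these widths the sidebar takes roughly 29% of the page, which is what the "30% / 70%" label in the layout sketch approximates.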
**Sidebar element order:**
1. Photo placeholder (rectangle + border, width 2400 DXA, height 1800)
2. Name (32pt bold white SimHei) + Title (18pt accent)
3. Basic info (DOB / degree / school)
4. Contact info (phone / email / address)
5. Skill ratings (name + ●○ dot rating, 5 levels each)
6. Certificates list
**Right-side section headings (color bar style):**
```js
// Full-width bar background + white Chinese text + lighter English text
new Table({ columnWidths:[7600], rows:[new TableRow({ children:[
new TableCell({
shading: { fill: S.accent, type: ShadingType.CLEAR },
margins: { top:40, bottom:40, left:200, right:100 },
children: [new Paragraph({ children: [
new TextRun({ text: "Work Experience ", size:22, bold:true, color:"FFFFFF", font:"SimHei" }),
new TextRun({ text: "Experience", size:18, color:"C8E8F0", font:"Times New Roman", italics:true }),
] })],
})
] })] });
```
**Experience entry format:**
```js
// Line 1: Company(bold) + Title(accent) + Date(right-aligned)
new Paragraph({
tabStops: [{ type: TabStopType.RIGHT, position: 7200 }],
children: [
new TextRun({ text: "Company Name", size:22, bold:true, color:S.title }),
new TextRun({ text: " Role Title", size:20, color:S.accent }),
new TextRun({ text: "\t2023.06 — Present", size:17, color:S.sec }),
]
});
// Line 2+: ▸ bullet points
```
---
## Template B: Dark Header Banner + Single Column
### Color Palette
```js
const C = {
dark: "1A3352", // header background (dark blue)
accent: "2980B9", // accent color
title: "1A2636", // heading
body: "2C3E50", // body text
sec: "6B8599", // secondary info
light: "E8EFF5", // light background
};
```
### Layout Structure
```
┌────────────────────────────────┐
│ ██████████████████████████████ │ ← Dark blue background banner
│ █ Name   Title               █ │   Contains name / title /
│ █ Phone | Email | Location   █ │   contact / basic info
│ █ DOB | Degree | School      █ │
│ ██████████████████████████████ │
│                                │
│ Profile                        │ ← Underline heading
│ ────────────────────────────── │
│ Summary text...                │
│                                │
│ Work Experience                │
│ ────────────────────────────── │
│ Company | Role            Date │
│ • Achievement...               │
│ ...                            │
│                                │
│ Skills                         │
│ ────────────────────────────── │
│ Programming ●●●●○ Java/Go/...  │ ← Rating + details
└────────────────────────────────┘
```
### Implementation Notes
**Header banner:**
```js
// Single-row, single-column Table with dark background, height 2400 DXA
new Table({ columnWidths:[11906], rows:[new TableRow({
  height: { value:2400, rule:"exact" },
  children:[new TableCell({
    shading: { fill: C.dark },
    margins: { top:300, bottom:200, left:800, right:800 },
    verticalAlign: VerticalAlign.TOP, // Never use CENTER in exact-height rows (WPS incompatible)
    children: [
      // Line 1: Name (24pt, size:48, white) + Title
      // Line 2: Phone | Email | Location
      // Line 3: DOB | Degree | School
    ]
  })]
})] });
```
**Section headings (underline style):**
```js
new Paragraph({
borders: { bottom: { style: BorderStyle.SINGLE, size: 2, color: C.accent } },
children: [
new TextRun({ text: "Work Experience", size:24, bold:true, color:C.accent, font:"SimHei" }),
new TextRun({ text: " Experience", size:18, color:C.sec, italics:true }),
]
});
```
**Skills display (rating + details):**
```js
// Name(bold) + ●○ rating + specific tools list
new Paragraph({ children: [
new TextRun({ text: "Programming ", size:19, bold:true, color:C.title }),
new TextRun({ text: "●●●●○ ", size:13, color:C.accent }),
new TextRun({ text: "Java / Go / Python / TypeScript", size:18, color:C.sec }),
] });
```
---
## Template C: Blue Sidebar + Vertical-Line Headings
### Color Palette
```js
const C = {
side: "4A7C8F", // sidebar background (teal-blue)
text: "FFFFFF", // sidebar text
label: "A0C4D0", // sidebar secondary text
accent: "357A8F", // accent color
dot: "2F8FAD", // skill dot fill color
dotDim: "B8D4DE", // skill dot empty color
title: "1A3040", // body heading
body: "2C4050", // body content
sec: "6B8A98", // secondary info
};
```
### Sidebar-Specific Elements
**Circular photo placeholder:**
```js
new Paragraph({ alignment: AlignmentType.CENTER,
children: [new TextRun({ text: "◯", size:80, color:C.label })]
});
```
**Language proficiency matrix:**
```js
"English ● ● ● ● ○"
"Japanese ● ● ○ ○ ○"
```
**Right-side section headings (left-border style):**
```js
new Paragraph({
borders: { left: { style: BorderStyle.SINGLE, size:8, color:C.accent, space:8 } },
indent: { left: 120 },
children: [
new TextRun({ text: "Work Experience", size:24, bold:true, color:C.title, font:"SimHei" }),
new TextRun({ text: " Experience", size:18, color:C.sec, italics:true }),
]
});
```
**Experience entry format (differs from A):**
```js
// Line 1: Company name (bold)
// Line 2: Role (accent color) + Date
// Line 3+: ▸ bullet points
```
---
## Universal Rules
### Font Specifications
| Element | Font | Size | Style |
|---------|------|------|-------|
| Name (sidebar) | SimHei | 32pt (size:64) | Bold, white |
| Name (header) | SimHei | 24pt (size:48) | Bold, white |
| Section heading | SimHei | 11pt (size:22) | Bold |
| Company / School | Microsoft YaHei | 11pt (size:22) | Bold |
| Role title | Microsoft YaHei | 10pt (size:20) | accent color |
| Date range | Microsoft YaHei | 8.5pt (size:17) | sec color |
| Bullet description | Microsoft YaHei | 9.5pt (size:19) | body color |
| Skill dots | Default | 6.5pt (size:13) | accent / dimColor |
### Bullet Symbols
- Template A / C: `▸` (small triangle)
- Template B: `•` (round dot)
### Skill Rating Rules
- 1-5 levels using filled ● and empty ○ dots
- One skill per line, name on the left, dots on the right
- Filled dot color: accent; empty dot color: dimColor
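The rating rules can be sketched as a helper that splits a level into a filled run and an empty run, so each part can be given its own color. The function names are hypothetical, and the scale is assumed to be 1 to 5, matching the five-dot examples in the templates.

```python
def rating_runs(level: int, total: int = 5) -> tuple:
    """Return (filled, empty) dot strings for a skill level between 1 and total."""
    if not 1 <= level <= total:
        raise ValueError(f"level must be between 1 and {total}, got {level}")
    return "●" * level, "○" * (total - level)


def rating_text(level: int, total: int = 5) -> str:
    """Return the full dot string, e.g. for plain-text previews."""
    filled, empty = rating_runs(level, total)
    return filled + empty
```

In a generated document, the filled string would go into a run colored with the accent color and the empty string into a run colored with the dim color.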
### JD Matching Logic
When user provides a job description:
1. Extract key requirements (skills, experience, education)
2. Prioritize matching experience items to the top
3. Naturally incorporate JD keywords into descriptions
4. Highlight relevant skills
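Steps 1 and 2 can be illustrated with a naive keyword-overlap ranking. This is a rough sketch under stated assumptions, not the skill's actual matching method: the stopword list, the tokenizer, and the function names are all hypothetical, and real JD matching would need synonyms and phrase handling.

```python
import re

# Small illustrative stopword list; a real one would be much longer.
STOPWORDS = {"a", "an", "and", "the", "with", "of", "in", "on", "for", "to"}


def keywords(text: str) -> set:
    """Lowercase the text and return its non-stopword tokens."""
    tokens = re.findall(r"[a-z0-9+#]+", text.lower())
    return {t for t in tokens if t not in STOPWORDS}


def rank_by_jd(jd_text: str, items: list) -> list:
    """Order experience items by how many JD keywords they share (most first).

    sorted() is stable, so items with equal overlap keep their original order.
    """
    jd_terms = keywords(jd_text)
    return sorted(items, key=lambda item: len(keywords(item) & jd_terms), reverse=True)
```

Because the sort is stable, an existing reverse-chronological order is preserved among items that match the JD equally well.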
### Multi-Page Handling
- 1 page content: Sidebar templates (A/C) or single-column template (B)
- Over 1 page: Prefer Template B; if using A/C, switch page 2 to full-width layout with a name bar at the top (Name | Title)
⚠️ **Multi-page resumes must use multi-section structure:**
Page 1 and Page 2 must be **different sections** for independent margin and layout control:
```js
sections: [
{
// Page 1 section — margin 0 (sidebar layout needs full-page)
properties: { page: { margin: { top: 0, bottom: 0, left: 0, right: 0 } } },
children: [page1Table],
},
{
// Page 2 section — normal margins with header bar
properties: { page: { margin: { top: 800, bottom: 600, left: 800, right: 800 } } },
children: [pageHeader(name, title), ...page2Content],
},
]
```
⚠️ **Template B multi-page handling:**
Template B header banner uses Table simulation:
1. Banner `columnWidths` must equal **page content area width** (pageWidth - marginLeft - marginRight), not full page width
2. If banner needs full page width → set page 1 section margin to 0, banner columnWidths to 11906
3. Page 2+ must be independent sections, margin.top ≥ 800
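Rules 1 and 2 come down to one subtraction. A minimal sketch, assuming an A4 page width of 11906 twips and a hypothetical `banner_width` helper:

```python
A4_WIDTH = 11906  # A4 page width in twips


def banner_width(margin_left: int, margin_right: int, page_width: int = A4_WIDTH) -> int:
    """Return the banner column width: the page content area width in twips."""
    width = page_width - margin_left - margin_right
    if width <= 0:
        raise ValueError("margins leave no content area")
    return width
```

With 800-twip side margins the banner is 10306 twips wide; with zero margins it spans the full 11906 twips, which is the full-width case in rule 2.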
⚠️ **Page 2+ top spacing rules (mandatory):**
1. **Page margin.top must be ≥ 800 twips** (~1.4 cm), never 0
2. **Page 2+ needs a header info bar:** concise `Name | Title` bar, height ~400-600 twips, separated from body with light background or bottom line
3. **200-300 twips spacing between header bar and body content**
4. **Forbidden: content touching the very top of page 2**
```js
// Concise header bar for page 2+
// NB is assumed to be a shared "no border" definition declared elsewhere,
// e.g. { style: BorderStyle.NONE, size: 0, color: "FFFFFF" }.
// palette is the active template palette: S for Template A, C for Templates B/C.
// The default avoids a ReferenceError when only one of S / C is declared.
function pageHeader(name, title, palette = typeof S !== "undefined" ? S : C) {
  return new Table({
    width: { size: 100, type: WidthType.PERCENTAGE },
    borders: { top: NB, left: NB, right: NB, insideHorizontal: NB, insideVertical: NB,
      bottom: { style: BorderStyle.SINGLE, size: 1, color: "D0D0D0" } },
    rows: [new TableRow({
      cantSplit: true,
      height: { value: 500, rule: "atLeast" },
      children: [new TableCell({
        margins: { top: 60, bottom: 60, left: 200, right: 200 },
        borders: { top: NB, left: NB, right: NB, bottom: NB },
        children: [new Paragraph({
          children: [
            new TextRun({ text: name, size: 20, bold: true, color: palette.title }),
            new TextRun({ text: ` | ${title}`, size: 18, color: palette.sec }),
          ]
        })],
      })],
    })],
  });
}
```
---
## Scene-Specific Quality Checks
In addition to universal checks (see `references/common-rules.md`):
### Format
- [ ] Fits within 1 page (senior ≤ 2 pages)
- [ ] **Single-page fill rate ≥ 85%** (bottom whitespace ≤ 15%, ~2500 twips)
- [ ] Section headings are bilingual
- [ ] Skill rating dots correct (●○)
- [ ] Experience in reverse chronological order
- [ ] No cover page, no TOC
- [ ] Line spacing 1.15x (line: 276)
- [ ] No extra blank pages
- [ ] **Table row height uses `rule: "exact"` with value ≤ 16038** (prevent overflow blank pages)
- [ ] **Multi-page: page 2+ has header info bar + proper top spacing**
### Content
- [ ] Clearly organized around target position
- [ ] No vague self-assessments ("hardworking", "responsible", "team player")
- [ ] No fabricated data or exaggerated results
- [ ] Most relevant experience placed first and detailed
- [ ] Each bullet demonstrates action + object + method + result
- [ ] No long narrative blocks / excessive long sentences / information density imbalance
- [ ] Module names are standardized
- [ ] Contact info is plain text, clearly positioned
- [ ] Header area forms visual center
- [ ] Work experience and projects are the visual main body
- [ ] Page count matches candidate seniority
### Single-Page Fill Rules
Single-page resumes must fully utilize page space — **large bottom whitespace is forbidden:**
1. If content is insufficient → **proactively expand:**
- Add project details, skill keywords, achievement data
- Add supplementary modules: profile summary, interests, awards
2. Use section spacing (`spacing.before/after`) to **distribute content evenly**
3. Sidebar templates (A/C): sidebar height should approach full page
- If sidebar content is sparse, increase element spacing
- Or add supplementary modules: "Languages", "Interests"
4. Assessment: after generation, check last content element position; if >2500 twips from page bottom, adjust
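The assessment in step 4 can be sketched as a simple threshold check. How the position of the last content element is measured is left open here; the function names and the `last_content_bottom` parameter (distance from the top of the page, in twips) are assumptions for illustration.

```python
A4_HEIGHT = 16838              # A4 page height in twips
MAX_BOTTOM_WHITESPACE = 2500   # threshold from the fill rules (about 15% of the page)


def bottom_whitespace(last_content_bottom: int, page_height: int = A4_HEIGHT) -> int:
    """Return the empty space below the last content element, in twips."""
    return page_height - last_content_bottom


def needs_fill_adjustment(last_content_bottom: int, page_height: int = A4_HEIGHT) -> bool:
    """True when the bottom whitespace exceeds the allowed threshold."""
    return bottom_whitespace(last_content_bottom, page_height) > MAX_BOTTOM_WHITESPACE
```

When the check returns `True`, the options listed above apply: expand content, add supplementary modules, or increase section spacing.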


@@ -0,0 +1 @@
# Make scripts directory a package for relative imports in tests


@@ -0,0 +1,749 @@
#!/usr/bin/env python3
"""
Add placeholder entries to Table of Contents in a DOCX file.

This script adds placeholder TOC entries between the 'separate' and 'end'
field characters, so users see some content on first open instead of an empty TOC.
The original file is replaced with the modified version.

Usage:
    python add_toc_placeholders.py <docx_file>           # auto-extract headings (default)
    python add_toc_placeholders.py <docx_file> --auto    # explicit auto mode
    python add_toc_placeholders.py <docx_file> --entries <entries_json>

    entries_json format: JSON string with array of objects:
    [
        {"level": 1, "text": "Chapter 1 Overview", "page": "1"},
        {"level": 2, "text": "Section 1.1 Details", "page": "1"}
    ]

    Default behavior (no flags): auto-extracts Heading 1-3 from the document.
    Filters out table/figure captions (e.g. "表 1xxx", "图 2xxx").

Example:
    python add_toc_placeholders.py document.docx
    python add_toc_placeholders.py document.docx --auto
    python add_toc_placeholders.py document.docx --entries '[{"level":1,"text":"Introduction","page":"1"}]'
"""

import argparse
import html
import json
import re
import shutil
import sys
import tempfile
import zipfile
from pathlib import Path


def _extract_headings_from_docx(docx_path: str, max_level: int = 3) -> list:
    """Extract headings from a DOCX file for auto-mode TOC generation.

    Args:
        docx_path: Path to DOCX file
        max_level: Maximum heading level to include (default 3)

    Returns:
        List of dicts with 'level', 'text', 'page' keys
    """
    from docx import Document

    doc = Document(docx_path)
    entries = []
    page_estimate = 1

    # Pattern to filter out table/figure captions styled as headings
    caption_pattern = re.compile(r'^[表图]\s*\d')

    for i, para in enumerate(doc.paragraphs):
        style_name = para.style.name if para.style else ''
        if not style_name.startswith('Heading'):
            continue
        m = re.search(r'(\d+)', style_name)
        if not m:
            continue
        level = int(m.group(1))
        if level > max_level:
            continue
        text = para.text.strip()
        if not text:
            continue
        # Filter table/figure captions
        if caption_pattern.match(text):
            continue
        # Rough page estimate: advance one page per ~8 paragraphs
        # (i is the paragraph index, so non-heading paragraphs count too)
        page_estimate = max(1, 1 + i // 8)
        entries.append({"level": level, "text": text, "page": str(page_estimate)})

    return entries


def add_toc_placeholders(docx_path: str, entries: list = None) -> None:
    """Add placeholder TOC entries to a DOCX file (in-place replacement).

    Args:
        docx_path: Path to DOCX file (will be modified in-place)
        entries: Optional list of placeholder entries. Each entry should be a dict
            with 'level' (1-3), 'text', and 'page' keys.
    """
    docx_path = Path(docx_path)

    # Create temp directory for extraction
    with tempfile.TemporaryDirectory() as temp_dir:
        temp_path = Path(temp_dir)
        extracted_dir = temp_path / "extracted"
        temp_output = temp_path / "output.docx"

        # Extract DOCX
        with zipfile.ZipFile(docx_path, 'r') as zip_ref:
            zip_ref.extractall(extracted_dir)

        # Ensure TOC styles exist in styles.xml
        styles_xml_path = extracted_dir / "word" / "styles.xml"
        toc_style_mapping = _ensure_toc_styles(styles_xml_path)
        print(f"TOC style mapping: {toc_style_mapping}")

        # Fix settings.xml: ensure updateFields has val="true"
        settings_xml_path = extracted_dir / "word" / "settings.xml"
        _fix_update_fields(settings_xml_path)

        # Fix Heading styles: ensure outlineLvl is set (required for TOC field update)
        _fix_heading_outline_levels(styles_xml_path)

        # Process document.xml
        document_xml = extracted_dir / "word" / "document.xml"
        if not document_xml.exists():
            raise ValueError("document.xml not found in the DOCX file")

        # Read and process XML
        content = document_xml.read_text(encoding='utf-8')

        # Fix fldChar structure: split merged begin+instrText+separate into separate <w:r> elements
        content = _fix_fld_char_structure(content)

        # Find TOC structure and add placeholders (uses lxml for robust XML parsing)
        modified_content = _insert_toc_placeholders(content, entries, toc_style_mapping)

        # Write back
        document_xml.write_text(modified_content, encoding='utf-8')

        # Repack DOCX to temp file
        with zipfile.ZipFile(temp_output, 'w', zipfile.ZIP_DEFLATED) as zipf:
            for file_path in extracted_dir.rglob('*'):
                if file_path.is_file():
                    arcname = file_path.relative_to(extracted_dir)
                    zipf.write(file_path, arcname)

        # Replace original file with modified version (use shutil.move for cross-device support)
        docx_path.unlink()
        shutil.move(str(temp_output), str(docx_path))


def _fix_update_fields(settings_xml_path: Path) -> None:
    """Fix settings.xml to ensure <w:updateFields w:val="true"/> is present.

    The docx npm library generates <w:updateFields/> without val="true",
    which Word/WPS interprets as false, preventing TOC auto-update on open.
    """
    if not settings_xml_path.exists():
        return

    content = settings_xml_path.read_text(encoding='utf-8')
    original = content

    # Case 1: <w:updateFields/> (self-closing, no val) → add val="true"
    if '<w:updateFields/>' in content:
        content = content.replace('<w:updateFields/>', '<w:updateFields w:val="true"/>')
        print('Fixed: <w:updateFields/> → <w:updateFields w:val="true"/>')
    # Case 2: <w:updateFields w:val="false"/> → change to true (match precisely)
    elif re.search(r'<w:updateFields\s+w:val="false"\s*/>', content):
        content = re.sub(
            r'<w:updateFields\s+w:val="false"\s*/>',
            '<w:updateFields w:val="true"/>',
            content
        )
        print('Fixed: <w:updateFields w:val="false"/> → <w:updateFields w:val="true"/>')
    # Case 3: Not present at all → inject before </w:settings>
    elif '<w:updateFields' not in content:
        content = content.replace('</w:settings>', '<w:updateFields w:val="true"/></w:settings>')
        print('Fixed: added <w:updateFields w:val="true"/> to settings.xml')

    if content != original:
        settings_xml_path.write_text(content, encoding='utf-8')


def _fix_heading_outline_levels(styles_xml_path: Path) -> None:
    """Fix Heading styles to include outlineLvl in pPr.

    The docx npm library creates Heading styles but sometimes doesn't set outlineLvl
    in the style definition. Without outlineLvl, Word's TOC field update won't find
    headings even though they display correctly.

    This ensures Heading1 has outlineLvl=0, Heading2 has outlineLvl=1, etc.
    """
    if not styles_xml_path.exists():
        return

    content = styles_xml_path.read_text(encoding='utf-8')
    original = content

    for level in range(1, 7):
        style_id = f'Heading{level}'
        outline_val = str(level - 1)

        # Pattern: find <w:style> with w:styleId="HeadingN"
        style_pattern = (
            rf'(<w:style[^>]*w:styleId="{style_id}"[^>]*>)'
            rf'(.*?)'
            rf'(</w:style>)'
        )
        match = re.search(style_pattern, content, flags=re.DOTALL)
        if not match:
            continue

        style_content = match.group(2)

        # Check if outlineLvl already exists in this style
        if '<w:outlineLvl' in style_content:
            continue

        # Find or create <w:pPr> within this style
        ppr_match = re.search(r'(<w:pPr[^>]*>)(.*?)(</w:pPr>)', style_content, flags=re.DOTALL)
        if ppr_match:
            # Add outlineLvl inside existing pPr
            new_ppr_content = ppr_match.group(2) + f'<w:outlineLvl w:val="{outline_val}"/>'
            new_style_content = (
                style_content[:ppr_match.start()] +
                ppr_match.group(1) + new_ppr_content + ppr_match.group(3) +
                style_content[ppr_match.end():]
            )
        else:
            # No pPr exists, create one
            new_ppr = f'<w:pPr><w:outlineLvl w:val="{outline_val}"/></w:pPr>'
            # Insert pPr after name/basedOn etc. and before rPr (the order the schema
            # expects for style children); append at the end if there is no rPr
            rpr_pos = style_content.find('<w:rPr')
            if rpr_pos != -1:
                new_style_content = style_content[:rpr_pos] + new_ppr + style_content[rpr_pos:]
            else:
                new_style_content = style_content + new_ppr

        new_style = match.group(1) + new_style_content + match.group(3)
        content = content[:match.start()] + new_style + content[match.end():]
        print(f'Fixed: added outlineLvl={outline_val} to {style_id} style')

    if content != original:
        styles_xml_path.write_text(content, encoding='utf-8')


def _fix_fld_char_structure(xml_content: str) -> str:
    """Fix malformed fldChar structure where begin+instrText+separate are in one <w:r>.

    The docx npm library generates:
        <w:r><w:fldChar begin/><w:instrText>TOC...</w:instrText><w:fldChar separate/></w:r>

    Word/WPS requires the standard structure:
        <w:r><w:fldChar begin/></w:r>
        <w:r><w:instrText>TOC...</w:instrText></w:r>
        <w:r><w:fldChar separate/></w:r>
    """
    # Match a <w:r> that contains both begin fldChar AND instrText AND separate fldChar
    pattern = (
        r'<w:r(?:\s[^>]*)?>('
        r'<w:fldChar[^>]*w:fldCharType="begin"[^>]*/>'      # begin
        r')('
        r'<w:instrText[^>]*>.*?</w:instrText>'              # instrText
        r')('
        r'<w:fldChar[^>]*w:fldCharType="separate"[^>]*/>'   # separate
        r')</w:r>'
    )

    def split_run(match):
        begin = match.group(1)
        instr = match.group(2)
        separate = match.group(3)
        return f'<w:r>{begin}</w:r><w:r>{instr}</w:r><w:r>{separate}</w:r>'

    modified = re.sub(pattern, split_run, xml_content, flags=re.DOTALL)
    if modified != xml_content:
        print("Fixed: split merged fldChar begin+instrText+separate into separate <w:r> elements")

    # Fix TOC instrText: remove \t switch with wrong style names
    # docx npm lib generates \t "Heading1,1,Heading2,2,..." but Word expects "Heading 1,1,..."
    # Since we already have \o "1-3" which uses outlineLvl (now fixed), \t is redundant and harmful
    toc_t_pattern = r'(TOC\s+[^<]*?)\\t\s+&quot;[^&]*&quot;'
    modified2 = re.sub(toc_t_pattern, r'\1', modified)
    if modified2 != modified:
        print("Fixed: removed \\t switch from TOC instrText (\\o with outlineLvl is sufficient)")
        modified = modified2

    return modified


def _detect_toc_styles(styles_xml_path: Path) -> dict:
    """Detect TOC style IDs from styles.xml.

    Args:
        styles_xml_path: Path to styles.xml

    Returns:
        Dictionary mapping level (1-3) to style ID string
    """
    if not styles_xml_path.exists():
        return {}

    content = styles_xml_path.read_text(encoding='utf-8')
    result = {}

    for level in range(1, 4):
        # Standard TOC style IDs: "TOC1", "TOC2", "TOC3" (no space) or
        # "TOC 1", "TOC 2", "TOC 3" (with space); docx-js uses numeric IDs like "9", "11", "12"
        id_patterns = [
            rf'w:styleId="(TOC{level})"',
            rf'w:styleId="(TOC {level})"',
        ]
        for pattern in id_patterns:
            m = re.search(pattern, content)
            if m:
                result[level] = m.group(1)
                break
        else:
            # No ID match: fall back to the style whose w:name is "toc N" (case insensitive).
            # The tempered dot keeps the match inside a single <w:style> block, so the ID
            # of an earlier, unrelated style is never returned.
            name_pattern = (
                rf'<w:style\b[^>]*w:styleId="([^"]*)"[^>]*>'
                rf'(?:(?!</w:style>).)*?'
                rf'<w:name\s+w:val="[Tt][Oo][Cc]\s*{level}"'
            )
            m = re.search(name_pattern, content, flags=re.DOTALL)
            if m:
                result[level] = m.group(1)

    return result


def _ensure_toc_styles(styles_xml_path: Path) -> dict:
    """Ensure TOC styles exist in styles.xml, adding them if necessary.

    Returns:
        Dictionary mapping level (1-3) to style ID string
    """
    if not styles_xml_path.exists():
        return {1: "9", 2: "11", 3: "12"}

    content = styles_xml_path.read_text(encoding='utf-8')
    detected = _detect_toc_styles(styles_xml_path)
    result = dict(detected)

    W_NS = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main'

    # Define TOC styles to add if missing
    toc_style_defs = {
        1: {
            'id': '9',
            'name': 'toc 1',
            'xml': f'''<w:style w:type="paragraph" w:styleId="9" xmlns:w="{W_NS}">
  <w:name w:val="toc 1"/>
  <w:basedOn w:val="Normal"/>
  <w:uiPriority w:val="39"/>
  <w:pPr>
    <w:tabs><w:tab w:val="right" w:leader="dot" w:pos="9026"/></w:tabs>
    <w:spacing w:before="120" w:after="60"/>
  </w:pPr>
  <w:rPr><w:b/><w:bCs/></w:rPr>
</w:style>'''
        },
        2: {
            'id': '11',
            'name': 'toc 2',
            'xml': f'''<w:style w:type="paragraph" w:styleId="11" xmlns:w="{W_NS}">
  <w:name w:val="toc 2"/>
  <w:basedOn w:val="Normal"/>
  <w:uiPriority w:val="39"/>
  <w:pPr>
    <w:tabs><w:tab w:val="right" w:leader="dot" w:pos="9026"/></w:tabs>
    <w:ind w:left="360"/>
    <w:spacing w:before="60" w:after="40"/>
  </w:pPr>
</w:style>'''
        },
        3: {
            'id': '12',
            'name': 'toc 3',
            'xml': f'''<w:style w:type="paragraph" w:styleId="12" xmlns:w="{W_NS}">
  <w:name w:val="toc 3"/>
  <w:basedOn w:val="Normal"/>
  <w:uiPriority w:val="39"/>
  <w:pPr>
    <w:tabs><w:tab w:val="right" w:leader="dot" w:pos="9026"/></w:tabs>
    <w:ind w:left="720"/>
    <w:spacing w:before="40" w:after="20"/>
  </w:pPr>
</w:style>'''
        },
    }

    modified = False
    for level in range(1, 4):
        if level not in result:
            style_def = toc_style_defs[level]
            result[level] = style_def['id']
            # Add style before </w:styles>
            insert_point = content.rfind('</w:styles>')
            if insert_point == -1:
                print(f"WARNING: Could not find </w:styles> to insert TOC {level} style", file=sys.stderr)
                continue
            content = content[:insert_point] + style_def['xml'] + '\n' + content[insert_point:]
            print(f"Added TOC {level} style (ID: {style_def['id']})")
            modified = True

    if modified:
        styles_xml_path.write_text(content, encoding='utf-8')

    # Ensure Hyperlink style exists
    _ensure_hyperlink_style(styles_xml_path)

    return result


def _ensure_hyperlink_style(styles_xml_path: Path) -> None:
    """Ensure Hyperlink character style exists in styles.xml."""
    if not styles_xml_path.exists():
        return

    content = styles_xml_path.read_text(encoding='utf-8')
    if 'w:styleId="Hyperlink"' in content:
        return

    W_NS = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main'
    hyperlink_style = f'''<w:style w:type="character" w:styleId="Hyperlink" xmlns:w="{W_NS}">
  <w:name w:val="Hyperlink"/>
  <w:uiPriority w:val="99"/>
  <w:rPr>
    <w:color w:val="0563C1"/>
    <w:u w:val="single"/>
  </w:rPr>
</w:style>'''

    insert_point = content.rfind('</w:styles>')
    if insert_point != -1:
        content = content[:insert_point] + hyperlink_style + '\n' + content[insert_point:]
        styles_xml_path.write_text(content, encoding='utf-8')
        print("Added Hyperlink character style")

def _insert_toc_placeholders(xml_content: str, entries: list = None, toc_style_mapping: dict = None) -> str:
"""Insert placeholder TOC entries and heading bookmarks into XML content.
Uses lxml ElementTree for robust XML manipulation instead of fragile regex.
This function does TWO things:
1. Adds bookmark anchors to each Heading paragraph (so Word can link TOC → heading)
2. Replaces TOC placeholder area with proper entries containing HYPERLINK + PAGEREF
Args:
xml_content: The XML content of document.xml
entries: List of placeholder entries with 'level', 'text', 'page' keys
toc_style_mapping: Dictionary mapping level to style ID
Returns:
Modified XML content with bookmarks and TOC placeholders
Raises:
RuntimeError: If TOC structure cannot be found or is malformed
"""
from lxml import etree
if entries is None:
entries = [{"level": 1, "text": "Contents", "page": "1"}]
if toc_style_mapping is None:
toc_style_mapping = {1: "9", 2: "11", 3: "12"}
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
R_NS = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
# Parse XML
root = etree.fromstring(xml_content.encode('utf-8'))
nsmap = {'w': W, 'r': R_NS}
# ── Step 1: Add bookmarks to Heading paragraphs ──
bookmark_id_counter = 100000
heading_bookmark_map = {} # text → first bookmark_name (backward compat)
heading_bookmark_map_all = {} # text → [list of bookmark_names] for duplicate headings
for para in root.iter(f'{{{W}}}p'):
# Find pStyle
ppr = para.find(f'{{{W}}}pPr')
if ppr is None:
continue
pstyle = ppr.find(f'{{{W}}}pStyle')
if pstyle is None:
continue
style_val = pstyle.get(f'{{{W}}}val', '')
if not re.match(r'Heading\d$', style_val):
continue
# Extract heading text
texts = []
for t_elem in para.iter(f'{{{W}}}t'):
if t_elem.text:
texts.append(t_elem.text)
heading_text = ''.join(texts).strip()
if not heading_text:
continue
# Skip if already has bookmark
if para.find(f'{{{W}}}bookmarkStart') is not None:
continue
# Generate bookmark
bm_name = f"_Toc{bookmark_id_counter}"
bm_id_str = str(bookmark_id_counter)
bookmark_id_counter += 1
# Store mapping (support duplicate headings)
if heading_text not in heading_bookmark_map_all:
heading_bookmark_map_all[heading_text] = []
heading_bookmark_map_all[heading_text].append(bm_name)
if heading_text not in heading_bookmark_map:
heading_bookmark_map[heading_text] = bm_name
# Insert bookmarkStart after pPr
bm_start = etree.Element(f'{{{W}}}bookmarkStart')
bm_start.set(f'{{{W}}}id', bm_id_str)
bm_start.set(f'{{{W}}}name', bm_name)
bm_end = etree.Element(f'{{{W}}}bookmarkEnd')
bm_end.set(f'{{{W}}}id', bm_id_str)
ppr_index = list(para).index(ppr)
para.insert(ppr_index + 1, bm_start)
# bookmarkEnd at end of paragraph
para.append(bm_end)
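# Count every bookmark, including those added to duplicate heading texts
bookmarks_added = sum(len(names) for names in heading_bookmark_map_all.values())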
if bookmarks_added > 0:
print(f"Added {bookmarks_added} bookmarks to Heading paragraphs")
# ── Step 2: Find TOC field structure (begin → instrText → separate → end) ──
toc_separate_para = None
toc_end_para = None
# Track field nesting to handle nested fields correctly
field_stack = []
toc_field_depth = None
for fld_char in root.iter(f'{{{W}}}fldChar'):
fld_type = fld_char.get(f'{{{W}}}fldCharType')
run = fld_char.getparent()
if fld_type == 'begin':
para = run.getparent()
instr_text = ''
found_run = False
for sibling in para:
if sibling is run:
found_run = True
it = sibling.find(f'{{{W}}}instrText')
if it is not None and it.text:
instr_text += it.text
continue
if found_run and sibling.tag == f'{{{W}}}r':
it = sibling.find(f'{{{W}}}instrText')
if it is not None and it.text:
instr_text += it.text
if sibling.find(f'{{{W}}}fldChar') is not None:
break
field_stack.append(instr_text.strip())
if 'TOC' in instr_text and toc_field_depth is None:
toc_field_depth = len(field_stack)
elif fld_type == 'separate':
if toc_field_depth is not None and len(field_stack) == toc_field_depth:
toc_separate_para = run.getparent()
elif fld_type == 'end':
if toc_field_depth is not None and len(field_stack) == toc_field_depth:
toc_end_para = run.getparent()
break
if field_stack:
field_stack.pop()
if toc_separate_para is None or toc_end_para is None:
has_begin = root.find(f'.//{{{W}}}fldChar[@{{{W}}}fldCharType="begin"]') is not None
has_separate = root.find(f'.//{{{W}}}fldChar[@{{{W}}}fldCharType="separate"]') is not None
if not has_begin:
raise RuntimeError(
"TOC FAILED: No field structure found in document. "
"Ensure the code includes a TableOfContents element."
)
elif not has_separate:
raise RuntimeError(
"TOC FAILED: TOC field has 'begin' but no 'separate' fldChar. "
"Run _fix_fld_char_structure() first or check the docx-js version."
)
else:
raise RuntimeError(
"TOC FAILED: Field structure found but no TOC instrText detected. "
"Ensure TableOfContents element generates a TOC \\o field code."
)
# ── Step 3: Remove everything between separate-para and end-para ──
# The TOC paragraphs may be direct children of <w:body> or wrapped in <w:sdt><w:sdtContent>
toc_container = toc_separate_para.getparent() # could be body or sdtContent
container_children = list(toc_container)
sep_idx = container_children.index(toc_separate_para)
end_idx = container_children.index(toc_end_para)
for elem in container_children[sep_idx + 1:end_idx]:
toc_container.remove(elem)
# ── Step 4: Build and insert placeholder paragraphs ──
indent_mapping = {1: 0, 2: 360, 3: 720, 4: 1080, 5: 1440, 6: 1800}
heading_occurrence_counter = {}
insert_pos = list(toc_container).index(toc_end_para)
for entry in entries:
level = entry.get('level', 1)
text_raw = entry.get('text', '')
page = entry.get('page', '1')
toc_style = toc_style_mapping.get(level, toc_style_mapping.get(1, "9"))
indent = indent_mapping.get(level, 0)
# Resolve bookmark (handle duplicate headings correctly)
bm_name = ''
if text_raw in heading_bookmark_map_all:
occ = heading_occurrence_counter.get(text_raw, 0)
bm_list = heading_bookmark_map_all[text_raw]
if occ < len(bm_list):
bm_name = bm_list[occ]
heading_occurrence_counter[text_raw] = occ + 1
# Build paragraph element
p = etree.Element(f'{{{W}}}p')
toc_container.insert(insert_pos, p)
insert_pos += 1
# pPr
ppr = etree.SubElement(p, f'{{{W}}}pPr')
pstyle = etree.SubElement(ppr, f'{{{W}}}pStyle')
pstyle.set(f'{{{W}}}val', str(toc_style))
if indent > 0:
ind = etree.SubElement(ppr, f'{{{W}}}ind')
ind.set(f'{{{W}}}left', str(indent))
tabs = etree.SubElement(ppr, f'{{{W}}}tabs')
tab = etree.SubElement(tabs, f'{{{W}}}tab')
tab.set(f'{{{W}}}val', 'right')
tab.set(f'{{{W}}}leader', 'dot')
tab.set(f'{{{W}}}pos', '9026')
spacing = etree.SubElement(ppr, f'{{{W}}}spacing')
spacing.set(f'{{{W}}}before', '120')
spacing.set(f'{{{W}}}after', '60')
if bm_name:
hyperlink = etree.SubElement(p, f'{{{W}}}hyperlink')
hyperlink.set(f'{{{W}}}anchor', bm_name)
hyperlink.set(f'{{{W}}}history', '1')
r_text = etree.SubElement(hyperlink, f'{{{W}}}r')
rpr = etree.SubElement(r_text, f'{{{W}}}rPr')
rstyle = etree.SubElement(rpr, f'{{{W}}}rStyle')
rstyle.set(f'{{{W}}}val', 'Hyperlink')
t = etree.SubElement(r_text, f'{{{W}}}t')
t.text = text_raw
r_tab = etree.SubElement(hyperlink, f'{{{W}}}r')
etree.SubElement(r_tab, f'{{{W}}}tab')
r_begin = etree.SubElement(hyperlink, f'{{{W}}}r')
fc_begin = etree.SubElement(r_begin, f'{{{W}}}fldChar')
fc_begin.set(f'{{{W}}}fldCharType', 'begin')
r_instr = etree.SubElement(hyperlink, f'{{{W}}}r')
instr = etree.SubElement(r_instr, f'{{{W}}}instrText')
instr.set('{http://www.w3.org/XML/1998/namespace}space', 'preserve')
instr.text = f' PAGEREF {bm_name} \\h '
r_sep = etree.SubElement(hyperlink, f'{{{W}}}r')
fc_sep = etree.SubElement(r_sep, f'{{{W}}}fldChar')
fc_sep.set(f'{{{W}}}fldCharType', 'separate')
r_page = etree.SubElement(hyperlink, f'{{{W}}}r')
t_page = etree.SubElement(r_page, f'{{{W}}}t')
t_page.text = str(page)
r_end = etree.SubElement(hyperlink, f'{{{W}}}r')
fc_end = etree.SubElement(r_end, f'{{{W}}}fldChar')
fc_end.set(f'{{{W}}}fldCharType', 'end')
else:
r_text = etree.SubElement(p, f'{{{W}}}r')
t = etree.SubElement(r_text, f'{{{W}}}t')
t.text = text_raw
r_tab = etree.SubElement(p, f'{{{W}}}r')
etree.SubElement(r_tab, f'{{{W}}}tab')
r_page = etree.SubElement(p, f'{{{W}}}r')
t_page = etree.SubElement(r_page, f'{{{W}}}t')
t_page.text = str(page)
placeholders_inserted = len(entries)
print(f"Inserted {placeholders_inserted} TOC placeholder entries")
# Serialize back to string
result = etree.tostring(root, xml_declaration=True, encoding='UTF-8', standalone=True)
return result.decode('utf-8')
def main():
parser = argparse.ArgumentParser(
description='Add placeholder entries to Table of Contents in a DOCX file (in-place)'
)
parser.add_argument('docx_file', help='DOCX file to modify (will be replaced)')
parser.add_argument(
'--auto', action='store_true',
help='Auto-extract Heading 1-3 from the DOCX as TOC entries (recommended)'
)
parser.add_argument(
'--entries',
help='JSON string with placeholder entries: [{"level":1,"text":"Chapter 1","page":"1"}]'
)
args = parser.parse_args()
# Determine entries
entries = None
if args.entries:
try:
entries = json.loads(args.entries)
except json.JSONDecodeError as e:
print(f"Error parsing entries JSON: {e}", file=sys.stderr)
sys.exit(1)
else:
# Auto mode is the default: whenever --entries is absent, extract entries from the document headings
entries = _extract_headings_from_docx(args.docx_file)
if entries:
print(f"Auto-extracted {len(entries)} headings from document", file=sys.stderr)
else:
print("No headings found in document, using minimal placeholder", file=sys.stderr)
entries = [{"level": 1, "text": "Contents", "page": "1"}]
# Add placeholders
try:
add_toc_placeholders(args.docx_file, entries)
print(f"Successfully added TOC placeholders to {args.docx_file}")
except RuntimeError as e:
# TOC structure errors — hard fail with exit code 1
print(f"ERROR: {e}", file=sys.stderr)
sys.exit(1)
except Exception as e:
print(f"Error: {e}", file=sys.stderr)
sys.exit(1)
if __name__ == '__main__':
main()
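
Note on Step 4 above: each linked TOC entry is a hyperlink wrapping the heading text, a tab, and a PAGEREF complex field (begin, instrText, separate, cached page number, end). The following is a minimal sketch of that structure only, not the script's implementation. It assumes the standard-library `xml.etree.ElementTree` in place of lxml so that it runs without extra dependencies, and the helper names `w` and `build_toc_entry` are hypothetical.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"


def w(tag: str) -> str:
    """Return a Clark-notation name in the WordprocessingML namespace."""
    return f"{{{W}}}{tag}"


def build_toc_entry(text: str, page: str, bookmark: str) -> ET.Element:
    """Build one linked TOC paragraph (hypothetical helper for illustration)."""
    p = ET.Element(w("p"))
    link = ET.SubElement(p, w("hyperlink"), {w("anchor"): bookmark})

    # Visible heading text.
    ET.SubElement(ET.SubElement(link, w("r")), w("t")).text = text
    # Tab that separates the title from the page number.
    ET.SubElement(ET.SubElement(link, w("r")), w("tab"))

    # Complex field: begin -> instrText -> separate -> cached result -> end.
    ET.SubElement(ET.SubElement(link, w("r")), w("fldChar"), {w("fldCharType"): "begin"})
    instr = ET.SubElement(ET.SubElement(link, w("r")), w("instrText"), {XML_SPACE: "preserve"})
    instr.text = f" PAGEREF {bookmark} \\h "
    ET.SubElement(ET.SubElement(link, w("r")), w("fldChar"), {w("fldCharType"): "separate"})
    ET.SubElement(ET.SubElement(link, w("r")), w("t")).text = page
    ET.SubElement(ET.SubElement(link, w("r")), w("fldChar"), {w("fldCharType"): "end"})
    return p


entry = build_toc_entry("Introduction", "1", "_Toc100000")
field_types = [fc.get(w("fldCharType")) for fc in entry.iter(w("fldChar"))]
print(field_types)
```

The page number between `separate` and `end` is only a cached placeholder; Word recomputes it when the field is updated.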

1333
skills/docx/scripts/document.py Executable file

File diff suppressed because it is too large

807
skills/docx/scripts/postcheck.py Executable file

@@ -0,0 +1,807 @@
#!/usr/bin/env python3
"""
postcheck.py — Document business rule self-check script
Unlike traditional OpenXML Schema validation, this script does not check XML legality.
Instead, it checks document "visual quality" and "typesetting correctness" — issues visible to the human eye.
Usage:
python3 postcheck.py output.docx [--json] [--strict]
Checks (as run by run_all_checks):
1. Blank page detection: trailing page breaks, double page breaks at section boundaries, consecutive empty paragraphs
2. Cover overflow: oversized fonts, excessive spacing, or trailing empty paragraphs in the cover section
3. Line spacing consistency: whether body paragraph line spacing is uniform
4. Image overflow: whether image width exceeds page usable area
5. Font fallback: whether fonts are used that may be missing on target systems
6. Heading level continuity: whether headings skip levels (H1→H3 skipping H2)
7. ShadingType: whether SOLID is misused causing black cells
8. TOC quality: whether a TOC field exists, whether headings use standard Heading styles, whether outlineLvl and updateFields are set
9. Image aspect ratio: whether images are stretched/distorted
"""
import zipfile
import sys
import json
import re
from pathlib import Path
from xml.etree import ElementTree as ET
NS = {
"w": "http://schemas.openxmlformats.org/wordprocessingml/2006/main",
"r": "http://schemas.openxmlformats.org/officeDocument/2006/relationships",
"wp": "http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing",
"a": "http://schemas.openxmlformats.org/drawingml/2006/main",
"pic": "http://schemas.openxmlformats.org/drawingml/2006/picture",
}
class CheckResult:
def __init__(self, name: str, passed: bool, message: str, severity: str = "warning"):
self.name = name
self.passed = passed
self.message = message
self.severity = severity # "error" | "warning" | "info"
def to_dict(self):
return {
"name": self.name,
"passed": self.passed,
"message": self.message,
"severity": self.severity,
}
def __str__(self):
icon = "✅" if self.passed else ("❌" if self.severity == "error" else "⚠️")
return f"{icon} [{self.name}] {self.message}"
def read_document_xml(docx_path: str) -> ET.Element:
"""Read document.xml and return the root element"""
with zipfile.ZipFile(docx_path, "r") as z:
return ET.fromstring(z.read("word/document.xml"))
def get_sections(root: ET.Element) -> list:
"""Extract all sections (located via sectPr)"""
body = root.find(".//w:body", NS)
if body is None:
return []
sections = []
current_children = []
for child in body:
tag = child.tag.split("}")[-1] if "}" in child.tag else child.tag
if tag == "sectPr":
sections.append({"children": current_children, "sectPr": child})
current_children = []
else:
# Check whether paragraph contains sectPr (section break inside paragraph pPr)
ppr_sect = child.find(".//w:pPr/w:sectPr", NS)
if ppr_sect is not None:
current_children.append(child)
sections.append({"children": current_children, "sectPr": ppr_sect})
current_children = []
else:
current_children.append(child)
# Last section (body-level sectPr)
body_sect = body.find("w:sectPr", NS)
if body_sect is not None and current_children:
sections.append({"children": current_children, "sectPr": body_sect})
return sections
def check_blank_pages(root: ET.Element) -> CheckResult:
"""Detect excess blank pages — multi-pattern detection"""
body = root.find(".//w:body", NS)
paragraphs = body.findall("w:p", NS)
issues = []
if not paragraphs:
return CheckResult("blank-pages", True, "No paragraph content")
# Check 1: Whether the last paragraph only has a page break
last_p = paragraphs[-1]
runs = last_p.findall(".//w:r", NS)
has_page_break = False
has_text = False
for run in runs:
br = run.find("w:br", NS)
if br is not None and br.get(f"{{{NS['w']}}}type") == "page":
has_page_break = True
t = run.find("w:t", NS)
if t is not None and t.text and t.text.strip():
has_text = True
if has_page_break and not has_text:
issues.append("Trailing page break at document end may cause blank page")
# Check 2: Consecutive empty paragraphs (≥5 consecutive may form visual blank page)
consecutive_empty = 0
max_empty = 0
max_empty_pos = 0
for idx, p in enumerate(paragraphs):
texts = p.findall(".//w:t", NS)
has_any_text = any(t.text and t.text.strip() for t in texts)
has_br = any(
br.get(f"{{{NS['w']}}}type") == "page"
for br in p.findall(".//w:br", NS)
)
has_drawing = p.find(".//{http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing}inline", None) is not None
if not has_any_text and not has_br and not has_drawing:
consecutive_empty += 1
if consecutive_empty > max_empty:
max_empty = consecutive_empty
max_empty_pos = idx
else:
consecutive_empty = 0
if max_empty >= 5:
issues.append(f"Found {max_empty} consecutive empty paragraphs (starting around paragraph {max_empty_pos - max_empty + 2}), may form visual blank page")
# Check 3: Double page break at section boundary (PageBreak at section end + NEXT_PAGE in next section)
sections = get_sections(root)
for i in range(len(sections) - 1):
sec_children = sections[i]["children"]
if not sec_children:
continue
# Check whether the last paragraph of the section contains PageBreak
last_child = sec_children[-1]
if last_child.tag == f"{{{NS['w']}}}p":
for br in last_child.findall(".//w:br", NS):
if br.get(f"{{{NS['w']}}}type") == "page":
# Check whether the next section is NEXT_PAGE
next_sect_pr = sections[i + 1]["sectPr"]
sect_type = next_sect_pr.find("w:type", NS)
if sect_type is not None and sect_type.get(f"{{{NS['w']}}}val") == "nextPage":
issues.append(f"Section {i+1} ends with PageBreak and Section {i+2} is type nextPage, double page break causes blank page")
# Check 4: Empty paragraph + PageBreak (paragraph has only PageBreak, no text)
# Exclude section-ending PageBreaks — they are normal section separators
# (e.g., cover page ending with an empty para + PageBreak before a new section)
section_last_paras = set()
for sec in sections:
children = sec["children"]
if children:
last_child = children[-1]
section_last_paras.add(id(last_child))
empty_pb_count = 0
for p in paragraphs[:-1]: # Last paragraph already handled in Check 1
if id(p) in section_last_paras:
continue # Skip section-ending paragraphs (normal section breaks)
runs = p.findall(".//w:r", NS)
p_has_break = False
p_has_text = False
for run in runs:
br = run.find("w:br", NS)
if br is not None and br.get(f"{{{NS['w']}}}type") == "page":
p_has_break = True
t = run.find("w:t", NS)
if t is not None and t.text and t.text.strip():
p_has_text = True
if p_has_break and not p_has_text:
empty_pb_count += 1
if empty_pb_count > 0:
issues.append(f"Found {empty_pb_count} empty paragraphs with PageBreak (suggest attaching PageBreak to content paragraphs)")
# Separate hard errors from soft warnings
hard_issues = [i for i in issues if "double page break" in i.lower() or "trailing page break" in i.lower() or "consecutive" in i.lower()]
soft_issues = [i for i in issues if i not in hard_issues]
if hard_issues:
return CheckResult(
"blank-pages", False,
"; ".join(hard_issues[:3]),
"error"
)
if soft_issues:
return CheckResult(
"blank-pages", False,
"; ".join(soft_issues[:3]),
"warning"
)
return CheckResult("blank-pages", True, "No blank page issues detected")
def check_line_spacing(root: ET.Element) -> CheckResult:
"""Check body paragraph line spacing consistency"""
body = root.find(".//w:body", NS)
paragraphs = body.findall(".//w:p", NS)
spacing_values = {}
body_para_count = 0
for p in paragraphs:
ppr = p.find("w:pPr", NS)
# Skip heading paragraphs
if ppr is not None:
style = ppr.find("w:pStyle", NS)
if style is not None:
val = style.get(f"{{{NS['w']}}}val", "")
if val.startswith("Heading") or val == "Title":
continue
spacing = ppr.find("w:spacing", NS) if ppr is not None else None
line_val = spacing.get(f"{{{NS['w']}}}line") if spacing is not None else None
# Only count paragraphs with text content
texts = p.findall(".//w:t", NS)
if not any(t.text and t.text.strip() for t in texts):
continue
body_para_count += 1
key = line_val or "default"
spacing_values[key] = spacing_values.get(key, 0) + 1
if body_para_count == 0:
return CheckResult("line-spacing", True, "No body paragraphs")
if len(spacing_values) <= 1:
dominant = list(spacing_values.keys())[0] if spacing_values else "default"
return CheckResult("line-spacing", True, f"Line spacing uniform (line={dominant})")
# Find the most common line spacing
dominant = max(spacing_values, key=spacing_values.get)
inconsistent = sum(v for k, v in spacing_values.items() if k != dominant)
total = sum(spacing_values.values())
if inconsistent / total > 0.2:
return CheckResult(
"line-spacing", False,
f"Line spacing inconsistent: {dict(spacing_values)}, {inconsistent}/{total} paragraphs differ from dominant spacing {dominant}",
"warning"
)
return CheckResult("line-spacing", True, f"Line spacing mostly uniform (line={dominant}, {inconsistent} exceptions)")
def check_image_overflow(root: ET.Element) -> CheckResult:
"""Check whether image width may exceed page bounds"""
# Get page width
sect_pr = root.find(".//w:body/w:sectPr", NS)
page_width = 11906 # A4 default
margin_left = 1701
margin_right = 1417
if sect_pr is not None:
pg_sz = sect_pr.find("w:pgSz", NS)
pg_mar = sect_pr.find("w:pgMar", NS)
if pg_sz is not None:
page_width = int(pg_sz.get(f"{{{NS['w']}}}w", "11906"))
if pg_mar is not None:
margin_left = int(pg_mar.get(f"{{{NS['w']}}}left", "1701"))
margin_right = int(pg_mar.get(f"{{{NS['w']}}}right", "1417"))
usable_width_emu = (page_width - margin_left - margin_right) * 635 # twips → EMU
drawings = root.findall(".//wp:inline", NS) + root.findall(".//wp:anchor", NS)
oversized = 0
for dwg in drawings:
extent = dwg.find("wp:extent", NS)
if extent is not None:
cx = int(extent.get("cx", "0"))
if cx > usable_width_emu * 1.05: # 5% tolerance
oversized += 1
if oversized > 0:
return CheckResult(
"image-overflow", False,
f"{oversized} images exceed page usable area",
"error"
)
return CheckResult(
"image-overflow", True,
f"All images within page width ({len(drawings)} images)"
)
def check_image_aspect_ratio(docx_path: str, root: ET.Element) -> CheckResult:
"""Check whether images are stretched/distorted (aspect ratio drift).
Compares the original aspect ratio of embedded images with the display aspect ratio set in wp:extent.
Drift >10% is considered distortion (pie charts becoming elliptical, radar charts becoming diamond-shaped, etc).
"""
import zipfile as _zf
# Build a mapping: rId → image file path inside the zip
# We need to parse word/_rels/document.xml.rels
rid_to_path = {}
try:
with _zf.ZipFile(docx_path, 'r') as z:
rels_path = 'word/_rels/document.xml.rels'
if rels_path in z.namelist():
rels_xml = z.read(rels_path)
rels_root = ET.fromstring(rels_xml)
rels_ns = 'http://schemas.openxmlformats.org/package/2006/relationships'
for rel in rels_root.findall(f'{{{rels_ns}}}Relationship'):
rid = rel.get('Id', '')
target = rel.get('Target', '')
rel_type = rel.get('Type', '')
if 'image' in rel_type:
# Target is relative to word/ directory
if not target.startswith('/'):
img_path = 'word/' + target
else:
img_path = target.lstrip('/')
rid_to_path[rid] = img_path
# Now check each drawing
drawings = root.findall(".//wp:inline", NS) + root.findall(".//wp:anchor", NS)
distorted = []
for dwg in drawings:
extent = dwg.find("wp:extent", NS)
if extent is None:
continue
display_cx = int(extent.get("cx", "0"))
display_cy = int(extent.get("cy", "0"))
if display_cx == 0 or display_cy == 0:
continue
# Find the blip rId
blip = dwg.find(".//a:blip", NS)
if blip is None:
continue
r_embed = blip.get(f"{{{NS['r']}}}embed", "")
if not r_embed or r_embed not in rid_to_path:
continue
img_zip_path = rid_to_path[r_embed]
if img_zip_path not in z.namelist():
continue
# Read actual image dimensions
try:
img_data = z.read(img_zip_path)
from PIL import Image as _PILImage
import io as _io
pil_img = _PILImage.open(_io.BytesIO(img_data))
orig_w, orig_h = pil_img.size
if orig_w == 0 or orig_h == 0:
continue
except Exception:
continue
# Compare aspect ratios
orig_ratio = orig_w / orig_h
display_ratio = display_cx / display_cy
drift = abs(orig_ratio - display_ratio) / orig_ratio
if drift > 0.10: # >10% distortion
pct = drift * 100
distorted.append(
f"{img_zip_path.split('/')[-1]}: "
f"original {orig_w}×{orig_h} (ratio={orig_ratio:.2f}), "
f"display ratio={display_ratio:.2f}, distortion {pct:.0f}%"
)
except Exception:
return CheckResult(
"image-aspect-ratio", True,
"Cannot check image aspect ratio (zip read error)",
"info"
)
if distorted:
detail = "; ".join(distorted[:3])
if len(distorted) > 3:
detail += f" ...and {len(distorted) - 3} more"
return CheckResult(
"image-aspect-ratio", False,
f"{len(distorted)} images have aspect ratio distortion: {detail}",
"warning"
)
img_count = len(drawings)
return CheckResult(
"image-aspect-ratio", True,
f"All images have correct aspect ratio ({img_count} images)"
)
def check_font_fallback(root: ET.Element) -> CheckResult:
"""Check whether potentially missing fonts are used"""
SAFE_FONTS = {
# Chinese
"宋体", "SimSun", "黑体", "SimHei", "微软雅黑", "Microsoft YaHei",
"仿宋", "FangSong", "FangSong_GB2312", "楷体", "KaiTi",
# English
"Times New Roman", "Arial", "Calibri", "Helvetica",
"Courier New", "Georgia", "Verdana", "Tahoma",
# Universal
"Symbol", "Wingdings",
}
fonts_used = set()
for rpr in root.findall(".//w:rPr", NS):
for font_tag in ["w:rFonts"]:
rf = rpr.find(font_tag, NS)
if rf is not None:
for attr in ["ascii", "eastAsia", "hAnsi", "cs"]:
f = rf.get(f"{{{NS['w']}}}{attr}")
if f:
fonts_used.add(f)
risky = fonts_used - SAFE_FONTS
if risky:
return CheckResult(
"font-fallback", False,
f"The following fonts may be missing on the target system: {', '.join(sorted(risky))}",
"info"
)
return CheckResult("font-fallback", True, f"All fonts are common system fonts ({len(fonts_used)} types)")
def check_heading_levels(root: ET.Element) -> CheckResult:
"""Check whether headings skip levels"""
body = root.find(".//w:body", NS)
heading_levels = []
for p in body.findall(".//w:p", NS):
ppr = p.find("w:pPr", NS)
if ppr is None:
continue
style = ppr.find("w:pStyle", NS)
if style is None:
continue
val = style.get(f"{{{NS['w']}}}val", "")
m = re.match(r"Heading(\d+)", val)
if m:
heading_levels.append(int(m.group(1)))
if len(heading_levels) < 2:
return CheckResult("heading-levels", True, "Too few headings, skipping check")
skips = []
for i in range(1, len(heading_levels)):
diff = heading_levels[i] - heading_levels[i - 1]
if diff > 1:
skips.append(f"H{heading_levels[i-1]}→H{heading_levels[i]}")
if skips:
return CheckResult(
"heading-levels", False,
f"Heading level skip: {', '.join(skips[:5])}",
"warning"
)
return CheckResult("heading-levels", True, f"Heading levels continuous ({len(heading_levels)} headings)")
# check_cover_separation removed — false positives on complex covers (>15 elements is normal)
def check_shading_type(root: ET.Element) -> CheckResult:
"""Check whether ShadingType.SOLID is misused"""
shadings = root.findall(".//w:shd", NS)
solid_count = 0
for shd in shadings:
val = shd.get(f"{{{NS['w']}}}val", "")
if val == "solid":
solid_count += 1
if solid_count > 0:
return CheckResult(
"shading-type", False,
f"Found {solid_count} instances of ShadingType.SOLID (should be CLEAR), may cause black cells",
"error"
)
return CheckResult("shading-type", True, "No ShadingType.SOLID misuse found")
def check_toc(root: ET.Element, docx_path: str = "") -> CheckResult:
"""Check TOC quality: field existence, headings presence, outlineLvl, updateFields."""
body = root.find(".//w:body", NS)
if body is None:
return CheckResult("toc", True, "Document body is empty, skipping TOC check", "info")
paragraphs = list(body)
w_ns = NS["w"]
# --- Detect headings and their levels ---
heading_count = 0
heading_levels_used = set() # e.g. {1, 2, 3}
for p in paragraphs:
if p.tag != f"{{{w_ns}}}p":
continue
ppr = p.find(f"{{{w_ns}}}pPr")
if ppr is None:
continue
ps = ppr.find(f"{{{w_ns}}}pStyle")
if ps is None:
continue
val = ps.get(f"{{{w_ns}}}val", "")
m = re.match(r"(?i)heading\s*(\d)", val)
if m:
heading_count += 1
heading_levels_used.add(int(m.group(1)))
# --- Detect TOC field ---
has_toc = False
for instr in root.findall(f".//{{{w_ns}}}instrText"):
if instr.text and "TOC" in instr.text.upper():
has_toc = True
break
if not has_toc:
for fld in root.findall(f".//{{{w_ns}}}fldSimple"):
if "TOC" in fld.get(f"{{{w_ns}}}instr", "").upper():
has_toc = True
break
# Also check SDT-wrapped TOC
if not has_toc:
for sdt in root.findall(f".//{{{w_ns}}}sdt"):
for instr in sdt.findall(f".//{{{w_ns}}}instrText"):
if instr.text and "TOC" in instr.text.upper():
has_toc = True
break
if has_toc:
break
issues = []
# Check 1: Document has a "目录" / "目 录" / "Table of Contents" title but no TOC field
has_toc_title = False
toc_title_pattern = re.compile(r'^(?:目\s*录|table\s+of\s+contents|contents)$', re.IGNORECASE)
for p in paragraphs:
if p.tag != f"{{{w_ns}}}p":
continue
texts = p.findall(f".//{{{w_ns}}}t")
p_text = "".join(t.text or "" for t in texts).strip()
if toc_title_pattern.match(p_text):
has_toc_title = True
break
if has_toc_title and not has_toc:
issues.append("TOC_FIELD_MISSING: document has a TOC title but no TOC field element — add TableOfContents in code")
# Check 2: TOC field exists but no headings in document → TOC will be empty after update
if has_toc and heading_count == 0:
issues.append("TOC_NO_HEADINGS: TOC field exists but document has 0 Heading-styled paragraphs — TOC will be empty after update")
# Check 3 & 4: Read styles.xml and settings.xml from DOCX (only when TOC exists)
if has_toc and docx_path:
try:
import zipfile
with zipfile.ZipFile(docx_path, 'r') as zf:
# Check 3: outlineLvl missing in Heading styles
if 'word/styles.xml' in zf.namelist():
styles_content = zf.read('word/styles.xml').decode('utf-8')
styles_root = ET.fromstring(styles_content)
missing_outline = []
for level in sorted(heading_levels_used):
style_id = f"Heading{level}"
# Find <w:style w:styleId="HeadingN">
for style_elem in styles_root.findall(f".//{{{w_ns}}}style"):
sid = style_elem.get(f"{{{w_ns}}}styleId", "")
if sid == style_id:
# Check if pPr has outlineLvl
ppr = style_elem.find(f"{{{w_ns}}}pPr")
has_outline = False
if ppr is not None:
ol = ppr.find(f"{{{w_ns}}}outlineLvl")
if ol is not None:
has_outline = True
if not has_outline:
missing_outline.append(style_id)
break
if missing_outline:
issues.append(
"TOC_OUTLINE_MISSING: %s style(s) missing outlineLvl — "
"Word TOC update won't find these headings. "
"Run add_toc_placeholders.py to fix" % ", ".join(missing_outline)
)
# Check 4: updateFields not set to true
if 'word/settings.xml' in zf.namelist():
settings_content = zf.read('word/settings.xml').decode('utf-8')
# Check for <w:updateFields w:val="true"/>
update_ok = bool(re.search(
r'<w:updateFields\s+[^>]*w:val\s*=\s*"true"',
settings_content
))
if not update_ok:
issues.append(
"TOC_UPDATE_DISABLED: settings.xml missing updateFields=true — "
"Word won't prompt to update TOC on open. "
"Run add_toc_placeholders.py to fix"
)
except Exception as e:
issues.append(f"TOC_CHECK_ERROR: failed to read styles/settings from DOCX: {e}")
if not issues:
if has_toc:
return CheckResult("toc", True, "TOC field present and update-ready")
else:
return CheckResult("toc", True, "No TOC needed")
severity = "error" if any(k in i for i in issues for k in ("FIELD_MISSING", "NO_HEADINGS", "OUTLINE_MISSING")) else "warning"
return CheckResult("toc", False, "; ".join(issues[:5]), severity)
def check_cover_overflow(root: ET.Element) -> CheckResult:
"""Detect cover section issues: oversized fonts, excessive spacing, trailing empty content."""
sections = get_sections(root)
if not sections:
return CheckResult("cover-overflow", True, "No sections found")
sec0 = sections[0]
sect_pr = sec0["sectPr"]
# Get page dimensions and margins for accurate available height calculation
pg_sz = sect_pr.find("w:pgSz", NS)
pg_mar = sect_pr.find("w:pgMar", NS)
page_height = int(pg_sz.get(f"{{{NS['w']}}}h", "16838")) if pg_sz is not None else 16838
margin_top = int(pg_mar.get(f"{{{NS['w']}}}top", "0")) if pg_mar is not None else 0
margin_bottom = int(pg_mar.get(f"{{{NS['w']}}}bottom", "0")) if pg_mar is not None else 0
issues = []
children = sec0["children"]
# Check 1: Oversized font in cover section (> 44pt = 88 half-points; w:sz is in half-points)
max_font_size = 0
for child in children:
for sz in child.findall(".//" + f"{{{NS['w']}}}sz"):
val = sz.get(f"{{{NS['w']}}}val")
if val and val.isdigit():
size_hp = int(val)
if size_hp > max_font_size:
max_font_size = size_hp
if max_font_size > 88: # 88 half-points = 44pt
issues.append(
f"Cover has font size {max_font_size // 2}pt (>44pt max). "
f"Use calcTitleLayout() for dynamic sizing"
)
# Check 2: Excessive spacing.before in cover section (> 5000 twips)
max_spacing = 0
for child in children:
for sp in child.findall(".//" + f"{{{NS['w']}}}spacing"):
before = sp.get(f"{{{NS['w']}}}before")
if before and before.isdigit():
val = int(before)
if val > max_spacing:
max_spacing = val
if max_spacing > 5000:
issues.append(
f"Cover has spacing.before={max_spacing} twips (>5000 max). "
f"Use calcCoverSpacing() for dynamic spacing"
)
# Check 3: Trailing empty paragraphs in cover section
trailing_empty = 0
for child in reversed(children):
tag = child.tag.split("}")[-1] if "}" in child.tag else child.tag
if tag != "p":
break
texts = child.findall(".//" + f"{{{NS['w']}}}t")
has_text = any(t.text and t.text.strip() for t in texts)
if not has_text:
trailing_empty += 1
else:
break
if trailing_empty > 2:
issues.append(
f"Cover section ends with {trailing_empty} empty paragraphs (max 2 allowed) — "
f"excessive empty paragraphs may cause blank page after cover"
)
if issues:
return CheckResult(
"cover-overflow", False,
"; ".join(issues),
"error"
)
return CheckResult("cover-overflow", True, "Cover section layout looks OK")
def run_all_checks(docx_path: str) -> list[CheckResult]:
"""Run all checks"""
root = read_document_xml(docx_path)
checks = [
check_blank_pages,
check_cover_overflow,
check_line_spacing,
check_image_overflow,
check_font_fallback,
check_heading_levels,
check_shading_type,
]
results = []
for check_fn in checks:
try:
results.append(check_fn(root))
except Exception as e:
results.append(CheckResult(
check_fn.__name__.replace("check_", ""),
False,
f"Check error: {e}",
"error"
))
# TOC check needs both root and docx_path
try:
results.append(check_toc(root, docx_path))
except Exception as e:
results.append(CheckResult("toc", False, f"Check error: {e}", "error"))
# Image aspect ratio check needs both root and docx_path
try:
results.append(check_image_aspect_ratio(docx_path, root))
except Exception as e:
results.append(CheckResult("image-aspect-ratio", False, f"Check error: {e}", "error"))
return results
def main():
import argparse
parser = argparse.ArgumentParser(description="docx business rule self-check")
parser.add_argument("docx_path", help="Path to the .docx file to check")
parser.add_argument("--json", action="store_true", help="Output in JSON format")
parser.add_argument("--strict", action="store_true", help="Treat warnings as failures")
args = parser.parse_args()
if not Path(args.docx_path).exists():
print(f"❌ File not found: {args.docx_path}")
sys.exit(1)
results = run_all_checks(args.docx_path)
if args.json:
print(json.dumps([r.to_dict() for r in results], ensure_ascii=False, indent=2))
else:
print(f"\n📋 Document self-check report: {args.docx_path}\n")
for r in results:
print(f" {r}")
passed = sum(1 for r in results if r.passed)
total = len(results)
errors = sum(1 for r in results if not r.passed and r.severity == "error")
warnings = sum(1 for r in results if not r.passed and r.severity == "warning")
print(f"\n {'-' * 50}")
print(f" Passed {passed}/{total} | ❌ {errors} errors | ⚠️ {warnings} warnings\n")
# Exit code
has_errors = any(not r.passed and r.severity == "error" for r in results)
has_warnings = any(not r.passed and r.severity == "warning" for r in results)
if has_errors:
sys.exit(2)
elif args.strict and has_warnings:
sys.exit(1)
else:
sys.exit(0)
if __name__ == "__main__":
main()
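
The exit-code convention at the end of `main()` (2 when any error-level check fails, 1 for warnings under `--strict`, 0 otherwise) can be isolated as a pure function, which makes it easier to test than a chain of `sys.exit` calls. The sketch below is illustrative only: `Result` is a hypothetical stand-in for `CheckResult`, and `exit_code` is not a function in the script.

```python
from dataclasses import dataclass


@dataclass
class Result:
    """Hypothetical stand-in for CheckResult (only the fields used here)."""
    name: str
    passed: bool
    severity: str = "error"


def exit_code(results, strict=False):
    """Mirror the exit-code rules used at the end of main()."""
    has_errors = any(not r.passed and r.severity == "error" for r in results)
    has_warnings = any(not r.passed and r.severity == "warning" for r in results)
    if has_errors:
        return 2  # at least one error-level check failed
    if strict and has_warnings:
        return 1  # warnings count as failures only with --strict
    return 0


if __name__ == "__main__":
    sample = [Result("toc", True), Result("line-spacing", False, "warning")]
    print(exit_code(sample), exit_code(sample, strict=True))
```

A caller such as a CI job could then branch on the three values without parsing the printed report.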


@@ -0,0 +1,3 @@
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<w:comments xmlns:wpc="http://schemas.microsoft.com/office/word/2010/wordprocessingCanvas" xmlns:cx="http://schemas.microsoft.com/office/drawing/2014/chartex" xmlns:cx1="http://schemas.microsoft.com/office/drawing/2015/9/8/chartex" xmlns:cx2="http://schemas.microsoft.com/office/drawing/2015/10/21/chartex" xmlns:cx3="http://schemas.microsoft.com/office/drawing/2016/5/9/chartex" xmlns:cx4="http://schemas.microsoft.com/office/drawing/2016/5/10/chartex" xmlns:cx5="http://schemas.microsoft.com/office/drawing/2016/5/11/chartex" xmlns:cx6="http://schemas.microsoft.com/office/drawing/2016/5/12/chartex" xmlns:cx7="http://schemas.microsoft.com/office/drawing/2016/5/13/chartex" xmlns:cx8="http://schemas.microsoft.com/office/drawing/2016/5/14/chartex" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:aink="http://schemas.microsoft.com/office/drawing/2016/ink" xmlns:am3d="http://schemas.microsoft.com/office/drawing/2017/model3d" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:oel="http://schemas.microsoft.com/office/2019/extlst" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:wp14="http://schemas.microsoft.com/office/word/2010/wordprocessingDrawing" xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" xmlns:w14="http://schemas.microsoft.com/office/word/2010/wordml" xmlns:w15="http://schemas.microsoft.com/office/word/2012/wordml" xmlns:w16cex="http://schemas.microsoft.com/office/word/2018/wordml/cex" xmlns:w16cid="http://schemas.microsoft.com/office/word/2016/wordml/cid" xmlns:w16="http://schemas.microsoft.com/office/word/2018/wordml" xmlns:w16du="http://schemas.microsoft.com/office/word/2023/wordml/word16du" 
xmlns:w16sdtdh="http://schemas.microsoft.com/office/word/2020/wordml/sdtdatahash" xmlns:w16sdtfl="http://schemas.microsoft.com/office/word/2024/wordml/sdtformatlock" xmlns:w16se="http://schemas.microsoft.com/office/word/2015/wordml/symex" xmlns:wpg="http://schemas.microsoft.com/office/word/2010/wordprocessingGroup" xmlns:wpi="http://schemas.microsoft.com/office/word/2010/wordprocessingInk" xmlns:wne="http://schemas.microsoft.com/office/word/2006/wordml" xmlns:wps="http://schemas.microsoft.com/office/word/2010/wordprocessingShape" mc:Ignorable="w14 w15 w16se w16cid w16 w16cex w16sdtdh w16sdtfl w16du wp14">
</w:comments>


@@ -0,0 +1,3 @@
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<w15:commentsEx xmlns:wpc="http://schemas.microsoft.com/office/word/2010/wordprocessingCanvas" xmlns:cx="http://schemas.microsoft.com/office/drawing/2014/chartex" xmlns:cx1="http://schemas.microsoft.com/office/drawing/2015/9/8/chartex" xmlns:cx2="http://schemas.microsoft.com/office/drawing/2015/10/21/chartex" xmlns:cx3="http://schemas.microsoft.com/office/drawing/2016/5/9/chartex" xmlns:cx4="http://schemas.microsoft.com/office/drawing/2016/5/10/chartex" xmlns:cx5="http://schemas.microsoft.com/office/drawing/2016/5/11/chartex" xmlns:cx6="http://schemas.microsoft.com/office/drawing/2016/5/12/chartex" xmlns:cx7="http://schemas.microsoft.com/office/drawing/2016/5/13/chartex" xmlns:cx8="http://schemas.microsoft.com/office/drawing/2016/5/14/chartex" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:aink="http://schemas.microsoft.com/office/drawing/2016/ink" xmlns:am3d="http://schemas.microsoft.com/office/drawing/2017/model3d" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:oel="http://schemas.microsoft.com/office/2019/extlst" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:wp14="http://schemas.microsoft.com/office/word/2010/wordprocessingDrawing" xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" xmlns:w14="http://schemas.microsoft.com/office/word/2010/wordml" xmlns:w15="http://schemas.microsoft.com/office/word/2012/wordml" xmlns:w16cex="http://schemas.microsoft.com/office/word/2018/wordml/cex" xmlns:w16cid="http://schemas.microsoft.com/office/word/2016/wordml/cid" xmlns:w16="http://schemas.microsoft.com/office/word/2018/wordml" xmlns:w16du="http://schemas.microsoft.com/office/word/2023/wordml/word16du" 
xmlns:w16sdtdh="http://schemas.microsoft.com/office/word/2020/wordml/sdtdatahash" xmlns:w16sdtfl="http://schemas.microsoft.com/office/word/2024/wordml/sdtformatlock" xmlns:w16se="http://schemas.microsoft.com/office/word/2015/wordml/symex" xmlns:wpg="http://schemas.microsoft.com/office/word/2010/wordprocessingGroup" xmlns:wpi="http://schemas.microsoft.com/office/word/2010/wordprocessingInk" xmlns:wne="http://schemas.microsoft.com/office/word/2006/wordml" xmlns:wps="http://schemas.microsoft.com/office/word/2010/wordprocessingShape" mc:Ignorable="w14 w15 w16se w16cid w16 w16cex w16sdtdh w16sdtfl w16du wp14">
</w15:commentsEx>


@@ -0,0 +1,3 @@
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<w16cex:commentsExtensible xmlns:wpc="http://schemas.microsoft.com/office/word/2010/wordprocessingCanvas" xmlns:cx="http://schemas.microsoft.com/office/drawing/2014/chartex" xmlns:cx1="http://schemas.microsoft.com/office/drawing/2015/9/8/chartex" xmlns:cx2="http://schemas.microsoft.com/office/drawing/2015/10/21/chartex" xmlns:cx3="http://schemas.microsoft.com/office/drawing/2016/5/9/chartex" xmlns:cx4="http://schemas.microsoft.com/office/drawing/2016/5/10/chartex" xmlns:cx5="http://schemas.microsoft.com/office/drawing/2016/5/11/chartex" xmlns:cx6="http://schemas.microsoft.com/office/drawing/2016/5/12/chartex" xmlns:cx7="http://schemas.microsoft.com/office/drawing/2016/5/13/chartex" xmlns:cx8="http://schemas.microsoft.com/office/drawing/2016/5/14/chartex" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:aink="http://schemas.microsoft.com/office/drawing/2016/ink" xmlns:am3d="http://schemas.microsoft.com/office/drawing/2017/model3d" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:oel="http://schemas.microsoft.com/office/2019/extlst" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:wp14="http://schemas.microsoft.com/office/word/2010/wordprocessingDrawing" xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" xmlns:w14="http://schemas.microsoft.com/office/word/2010/wordml" xmlns:w15="http://schemas.microsoft.com/office/word/2012/wordml" xmlns:w16cex="http://schemas.microsoft.com/office/word/2018/wordml/cex" xmlns:w16cid="http://schemas.microsoft.com/office/word/2016/wordml/cid" xmlns:w16="http://schemas.microsoft.com/office/word/2018/wordml" xmlns:w16du="http://schemas.microsoft.com/office/word/2023/wordml/word16du" 
xmlns:w16sdtdh="http://schemas.microsoft.com/office/word/2020/wordml/sdtdatahash" xmlns:w16sdtfl="http://schemas.microsoft.com/office/word/2024/wordml/sdtformatlock" xmlns:w16se="http://schemas.microsoft.com/office/word/2015/wordml/symex" xmlns:wpg="http://schemas.microsoft.com/office/word/2010/wordprocessingGroup" xmlns:wpi="http://schemas.microsoft.com/office/word/2010/wordprocessingInk" xmlns:wne="http://schemas.microsoft.com/office/word/2006/wordml" xmlns:wps="http://schemas.microsoft.com/office/word/2010/wordprocessingShape" xmlns:cr="http://schemas.microsoft.com/office/comments/2020/reactions" mc:Ignorable="w14 w15 w16se w16cid w16 w16cex w16sdtdh w16sdtfl cr w16du wp14">
</w16cex:commentsExtensible>


@@ -0,0 +1,3 @@
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<w16cid:commentsIds xmlns:wpc="http://schemas.microsoft.com/office/word/2010/wordprocessingCanvas" xmlns:cx="http://schemas.microsoft.com/office/drawing/2014/chartex" xmlns:cx1="http://schemas.microsoft.com/office/drawing/2015/9/8/chartex" xmlns:cx2="http://schemas.microsoft.com/office/drawing/2015/10/21/chartex" xmlns:cx3="http://schemas.microsoft.com/office/drawing/2016/5/9/chartex" xmlns:cx4="http://schemas.microsoft.com/office/drawing/2016/5/10/chartex" xmlns:cx5="http://schemas.microsoft.com/office/drawing/2016/5/11/chartex" xmlns:cx6="http://schemas.microsoft.com/office/drawing/2016/5/12/chartex" xmlns:cx7="http://schemas.microsoft.com/office/drawing/2016/5/13/chartex" xmlns:cx8="http://schemas.microsoft.com/office/drawing/2016/5/14/chartex" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:aink="http://schemas.microsoft.com/office/drawing/2016/ink" xmlns:am3d="http://schemas.microsoft.com/office/drawing/2017/model3d" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:oel="http://schemas.microsoft.com/office/2019/extlst" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:wp14="http://schemas.microsoft.com/office/word/2010/wordprocessingDrawing" xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" xmlns:w14="http://schemas.microsoft.com/office/word/2010/wordml" xmlns:w15="http://schemas.microsoft.com/office/word/2012/wordml" xmlns:w16cex="http://schemas.microsoft.com/office/word/2018/wordml/cex" xmlns:w16cid="http://schemas.microsoft.com/office/word/2016/wordml/cid" xmlns:w16="http://schemas.microsoft.com/office/word/2018/wordml" xmlns:w16du="http://schemas.microsoft.com/office/word/2023/wordml/word16du" 
xmlns:w16sdtdh="http://schemas.microsoft.com/office/word/2020/wordml/sdtdatahash" xmlns:w16sdtfl="http://schemas.microsoft.com/office/word/2024/wordml/sdtformatlock" xmlns:w16se="http://schemas.microsoft.com/office/word/2015/wordml/symex" xmlns:wpg="http://schemas.microsoft.com/office/word/2010/wordprocessingGroup" xmlns:wpi="http://schemas.microsoft.com/office/word/2010/wordprocessingInk" xmlns:wne="http://schemas.microsoft.com/office/word/2006/wordml" xmlns:wps="http://schemas.microsoft.com/office/word/2010/wordprocessingShape" mc:Ignorable="w14 w15 w16se w16cid w16 w16cex w16sdtdh w16sdtfl w16du wp14">
</w16cid:commentsIds>


@@ -0,0 +1,3 @@
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<w15:people xmlns:w15="http://schemas.microsoft.com/office/word/2012/wordml">
</w15:people>

374
skills/docx/scripts/utilities.py Executable file

@@ -0,0 +1,374 @@
#!/usr/bin/env python3
"""
Utilities for editing OOXML documents.
This module provides XMLEditor, a tool for manipulating XML files with support for
line-number-based node finding and DOM manipulation. Each element is automatically
annotated with its original line and column position during parsing.
Example usage:
editor = XMLEditor("document.xml")
# Find node by line number or range
elem = editor.get_node(tag="w:r", line_number=519)
elem = editor.get_node(tag="w:p", line_number=range(100, 200))
# Find node by text content
elem = editor.get_node(tag="w:p", contains="specific text")
# Find node by attributes
elem = editor.get_node(tag="w:r", attrs={"w:id": "target"})
# Combine filters
elem = editor.get_node(tag="w:p", line_number=range(1, 50), contains="text")
# Replace, insert, or manipulate
new_elem = editor.replace_node(elem, "<w:r><w:t>new text</w:t></w:r>")
editor.insert_after(new_elem, "<w:r><w:t>more</w:t></w:r>")
# Save changes
editor.save()
"""
import html
from pathlib import Path
from typing import Optional, Union
import defusedxml.minidom
import defusedxml.sax
class XMLEditor:
"""
Editor for manipulating OOXML XML files with line-number-based node finding.
This class parses XML files and tracks the original line and column position
of each element. This enables finding nodes by their line number in the original
file, which is useful when working with Read tool output.
Attributes:
xml_path: Path to the XML file being edited
encoding: Detected encoding of the XML file ('ascii' or 'utf-8')
dom: Parsed DOM tree with parse_position attributes on elements
"""
def __init__(self, xml_path):
"""
Initialize with path to XML file and parse with line number tracking.
Args:
xml_path: Path to XML file to edit (str or Path)
Raises:
ValueError: If the XML file does not exist
"""
self.xml_path = Path(xml_path)
if not self.xml_path.exists():
raise ValueError(f"XML file not found: {xml_path}")
with open(self.xml_path, "rb") as f:
header = f.read(200).decode("utf-8", errors="ignore")
self.encoding = "ascii" if 'encoding="ascii"' in header else "utf-8"
parser = _create_line_tracking_parser()
self.dom = defusedxml.minidom.parse(str(self.xml_path), parser)
def get_node(
self,
tag: str,
attrs: Optional[dict[str, str]] = None,
line_number: Optional[Union[int, range]] = None,
contains: Optional[str] = None,
):
"""
Get a DOM element by tag and optional filters.
Finds an element by tag name, narrowed by any combination of its line number
in the original file, attribute values, and text content. Exactly one match must be found.
Args:
tag: The XML tag name (e.g., "w:del", "w:ins", "w:r")
attrs: Dictionary of attribute name-value pairs to match (e.g., {"w:id": "1"})
line_number: Line number (int) or line range (range) in original XML file (1-indexed)
contains: Text string that must appear in any text node within the element.
Supports both entity notation (&#8220;) and Unicode characters (\u201c).
Returns:
defusedxml.minidom.Element: The matching DOM element
Raises:
ValueError: If node not found or multiple matches found
Example:
elem = editor.get_node(tag="w:r", line_number=519)
elem = editor.get_node(tag="w:r", line_number=range(100, 200))
elem = editor.get_node(tag="w:del", attrs={"w:id": "1"})
elem = editor.get_node(tag="w:p", attrs={"w14:paraId": "12345678"})
elem = editor.get_node(tag="w:commentRangeStart", attrs={"w:id": "0"})
elem = editor.get_node(tag="w:p", contains="specific text")
elem = editor.get_node(tag="w:t", contains="&#8220;Agreement") # Entity notation
elem = editor.get_node(tag="w:t", contains="\u201cAgreement") # Unicode character
"""
matches = []
for elem in self.dom.getElementsByTagName(tag):
# Check line_number filter
if line_number is not None:
parse_pos = getattr(elem, "parse_position", (None,))
elem_line = parse_pos[0]
# Handle both single line number and range
if isinstance(line_number, range):
if elem_line not in line_number:
continue
else:
if elem_line != line_number:
continue
# Check attrs filter
if attrs is not None:
if not all(
elem.getAttribute(attr_name) == attr_value
for attr_name, attr_value in attrs.items()
):
continue
# Check contains filter
if contains is not None:
elem_text = self._get_element_text(elem)
# Normalize the search string: convert HTML entities to Unicode characters
# This allows searching for both "&#8220;Rowan" and "“Rowan"
normalized_contains = html.unescape(contains)
if normalized_contains not in elem_text:
continue
# If all applicable filters passed, this is a match
matches.append(elem)
if not matches:
# Build descriptive error message
filters = []
if line_number is not None:
line_str = (
f"lines {line_number.start}-{line_number.stop - 1}"
if isinstance(line_number, range)
else f"line {line_number}"
)
filters.append(f"at {line_str}")
if attrs is not None:
filters.append(f"with attributes {attrs}")
if contains is not None:
filters.append(f"containing '{contains}'")
filter_desc = " ".join(filters) if filters else ""
base_msg = f"Node not found: <{tag}> {filter_desc}".strip()
# Add helpful hint based on filters used
if contains:
hint = "Text may be split across elements or use different wording."
elif line_number:
hint = "Line numbers may have changed if document was modified."
elif attrs:
hint = "Verify attribute values are correct."
else:
hint = "Try adding filters (attrs, line_number, or contains)."
raise ValueError(f"{base_msg}. {hint}")
if len(matches) > 1:
raise ValueError(
f"Multiple nodes found: <{tag}>. "
f"Add more filters (attrs, line_number, or contains) to narrow the search."
)
return matches[0]
def _get_element_text(self, elem):
"""
Recursively extract all text content from an element.
Skips text nodes that contain only whitespace (spaces, tabs, newlines),
which typically represent XML formatting rather than document content.
Args:
elem: defusedxml.minidom.Element to extract text from
Returns:
str: Concatenated text from all non-whitespace text nodes within the element
"""
text_parts = []
for node in elem.childNodes:
if node.nodeType == node.TEXT_NODE:
# Skip whitespace-only text nodes (XML formatting)
if node.data.strip():
text_parts.append(node.data)
elif node.nodeType == node.ELEMENT_NODE:
text_parts.append(self._get_element_text(node))
return "".join(text_parts)
def replace_node(self, elem, new_content):
"""
Replace a DOM element with new XML content.
Args:
elem: defusedxml.minidom.Element to replace
new_content: String containing XML to replace the node with
Returns:
List[defusedxml.minidom.Node]: All inserted nodes
Example:
new_nodes = editor.replace_node(old_elem, "<w:r><w:t>text</w:t></w:r>")
"""
parent = elem.parentNode
nodes = self._parse_fragment(new_content)
for node in nodes:
parent.insertBefore(node, elem)
parent.removeChild(elem)
return nodes
def insert_after(self, elem, xml_content):
"""
Insert XML content after a DOM element.
Args:
elem: defusedxml.minidom.Element to insert after
xml_content: String containing XML to insert
Returns:
List[defusedxml.minidom.Node]: All inserted nodes
Example:
new_nodes = editor.insert_after(elem, "<w:r><w:t>text</w:t></w:r>")
"""
parent = elem.parentNode
next_sibling = elem.nextSibling
nodes = self._parse_fragment(xml_content)
for node in nodes:
if next_sibling:
parent.insertBefore(node, next_sibling)
else:
parent.appendChild(node)
return nodes
def insert_before(self, elem, xml_content):
"""
Insert XML content before a DOM element.
Args:
elem: defusedxml.minidom.Element to insert before
xml_content: String containing XML to insert
Returns:
List[defusedxml.minidom.Node]: All inserted nodes
Example:
new_nodes = editor.insert_before(elem, "<w:r><w:t>text</w:t></w:r>")
"""
parent = elem.parentNode
nodes = self._parse_fragment(xml_content)
for node in nodes:
parent.insertBefore(node, elem)
return nodes
def append_to(self, elem, xml_content):
"""
Append XML content as a child of a DOM element.
Args:
elem: defusedxml.minidom.Element to append to
xml_content: String containing XML to append
Returns:
List[defusedxml.minidom.Node]: All inserted nodes
Example:
new_nodes = editor.append_to(elem, "<w:r><w:t>text</w:t></w:r>")
"""
nodes = self._parse_fragment(xml_content)
for node in nodes:
elem.appendChild(node)
return nodes
def get_next_rid(self):
"""Get the next available rId for relationships files."""
max_id = 0
for rel_elem in self.dom.getElementsByTagName("Relationship"):
rel_id = rel_elem.getAttribute("Id")
if rel_id.startswith("rId"):
try:
max_id = max(max_id, int(rel_id[3:]))
except ValueError:
pass
return f"rId{max_id + 1}"
def save(self):
"""
Save the edited XML back to the file.
Serializes the DOM tree and writes it back to the original file path,
preserving the original encoding (ascii or utf-8).
"""
content = self.dom.toxml(encoding=self.encoding)
self.xml_path.write_bytes(content)
def _parse_fragment(self, xml_content):
"""
Parse XML fragment and return list of imported nodes.
Args:
xml_content: String containing XML fragment
Returns:
List of defusedxml.minidom.Node objects imported into this document
Raises:
AssertionError: If fragment contains no element nodes
"""
# Extract namespace declarations from the root document element
root_elem = self.dom.documentElement
namespaces = []
if root_elem and root_elem.attributes:
for i in range(root_elem.attributes.length):
attr = root_elem.attributes.item(i)
if attr.name.startswith("xmlns"): # type: ignore
namespaces.append(f'{attr.name}="{attr.value}"') # type: ignore
ns_decl = " ".join(namespaces)
wrapper = f"<root {ns_decl}>{xml_content}</root>"
fragment_doc = defusedxml.minidom.parseString(wrapper)
nodes = [
self.dom.importNode(child, deep=True)
for child in fragment_doc.documentElement.childNodes # type: ignore
]
elements = [n for n in nodes if n.nodeType == n.ELEMENT_NODE]
assert elements, "Fragment must contain at least one element"
return nodes
def _create_line_tracking_parser():
"""
Create a SAX parser that tracks line and column numbers for each element.
Monkey patches the SAX content handler to store the current line and column
position from the underlying expat parser onto each element as a parse_position
attribute (line, column) tuple.
Returns:
defusedxml.sax.xmlreader.XMLReader: Configured SAX parser
"""
def set_content_handler(dom_handler):
def startElementNS(name, tagName, attrs):
orig_start_cb(name, tagName, attrs)
cur_elem = dom_handler.elementStack[-1]
cur_elem.parse_position = (
parser._parser.CurrentLineNumber, # type: ignore
parser._parser.CurrentColumnNumber, # type: ignore
)
orig_start_cb = dom_handler.startElementNS
dom_handler.startElementNS = startElementNS
orig_set_content_handler(dom_handler)
parser = defusedxml.sax.make_parser()
orig_set_content_handler = parser.setContentHandler
parser.setContentHandler = set_content_handler # type: ignore
return parser
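
The line-tracking idea behind `_create_line_tracking_parser` can be seen in isolation with a plain SAX handler. The sketch below is a simplified illustration, not the module's implementation: it uses the standard-library `xml.sax` instead of `defusedxml` so that it runs without extra dependencies (for untrusted input, `defusedxml` remains the safer choice), and `LineRecorder` and `element_lines` are hypothetical names.

```python
import xml.sax


class LineRecorder(xml.sax.ContentHandler):
    """Record (tag name, line number) for every start tag."""

    def __init__(self):
        super().__init__()
        self.positions = []
        self._locator = None

    def setDocumentLocator(self, locator):
        # The parser hands over a locator before parsing starts.
        self._locator = locator

    def startElement(self, name, attrs):
        # During the callback the locator points at the current start tag.
        self.positions.append((name, self._locator.getLineNumber()))


def element_lines(xml_text):
    """Return a list of (tag, 1-indexed line) pairs in document order."""
    handler = LineRecorder()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.positions


if __name__ == "__main__":
    for tag, line in element_lines("<a>\n<b/>\n<c>\n</c>\n</a>"):
        print(tag, line)
```

`XMLEditor` goes one step further and attaches the position to each DOM element, which is what lets `get_node` filter by `line_number`.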

177
skills/docx/setup.sh Executable file

@@ -0,0 +1,177 @@
#!/usr/bin/env bash
# ---
# name: docx-setup
# author: Z.AI
# version: "1.0"
# description: Environment setup for the DOCX skill. Checks and installs all required dependencies.
# ---
#
# Installs only dependencies required by the DOCX skill.
set -euo pipefail
RED='\033[0;31m'; GREEN='\033[0;32m'; YELLOW='\033[1;33m'; BLUE='\033[0;34m'; NC='\033[0m'
ok() { echo -e " ${GREEN}✓${NC} $1"; }
fail() { echo -e " ${RED}✗${NC} $1"; }
warn() { echo -e " ${YELLOW}⚠${NC} $1"; }
info() { echo -e " ${BLUE}ℹ${NC} $1"; }
echo "============================================"
echo " DOCX Skill — Environment Setup"
echo "============================================"
echo ""
OS="$(uname -s)"
ARCH="$(uname -m)"
echo "Platform: $OS $ARCH"
echo ""
# ── 0. macOS: Homebrew ──
if [ "$OS" = "Darwin" ]; then
echo "--- Homebrew (macOS package manager) ---"
if command -v brew &>/dev/null; then
BREW_VER=$(brew --version 2>/dev/null | head -1)
ok "brew ($BREW_VER)"
else
fail "brew not found — Node.js install needs Homebrew on macOS"
info "Install: /bin/bash -c \"\$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)\""
fi
echo ""
fi
# ── 1. Node.js (docx-js runs on Node) ──
echo "--- Node.js ---"
if command -v node &>/dev/null; then
NODE_VER=$(node --version)
ok "node ($NODE_VER)"
else
fail "node not found (required — docx generation uses docx-js on Node)"
case "$OS" in
Darwin) info "Install: brew install node" ;;
Linux) info "Install: curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -"
info " sudo apt install -y nodejs" ;;
*) info "Install: https://nodejs.org/" ;;
esac
fi
# ── 2. npm ──
echo ""
echo "--- npm ---"
if command -v npm &>/dev/null; then
NPM_VER=$(npm --version 2>/dev/null)
ok "npm ($NPM_VER)"
else
fail "npm not found"
case "$OS" in
Darwin) info "Install: brew install node (includes npm)" ;;
Linux) info "Install: comes with nodejs" ;;
*) info "Install: https://nodejs.org/" ;;
esac
fi
# ── 3. npm package: docx ──
echo ""
echo "--- npm Packages ---"
if node -e "require('docx')" 2>/dev/null || npm list -g docx &>/dev/null; then
DOCX_VER=$(node -e "try{console.log(require('docx/package.json').version)}catch(e){console.log('installed')}" 2>/dev/null)
ok "docx ($DOCX_VER)"
else
fail "docx not installed"
info "Install: npm install -g docx"
echo ""
if [ -t 0 ]; then
read -p " Install now? [Y/n] " -n 1 -r REPLY
echo ""
REPLY=${REPLY:-Y}
else
warn "Non-interactive mode — skipping auto-install."
REPLY=N
fi
if [[ ! $REPLY =~ ^[Nn]$ ]]; then
npm install -g docx 2>/dev/null && ok "Installed: docx" || fail "npm install failed"
fi
fi
# ── 4. Python 3 (post-processing scripts) ──
echo ""
echo "--- Python (post-processing) ---"
if command -v python3 &>/dev/null; then
PY_VER=$(python3 --version 2>&1)
ok "python3 ($PY_VER)"
if [ "$OS" = "Darwin" ]; then
PY_PATH=$(which python3 2>/dev/null)
if [[ "$PY_PATH" == "/usr/bin/python3" ]]; then
warn "Using macOS system Python (limited). Recommend: brew install python3"
fi
fi
else
fail "python3 not found"
case "$OS" in
Darwin) info "Install: brew install python3" ;;
Linux) info "Install: sudo apt install python3 python3-pip (Debian/Ubuntu)"
info " sudo dnf install python3 python3-pip (Fedora/RHEL)" ;;
*) info "Install: https://www.python.org/downloads/" ;;
esac
fi
# ── 5. pip ──
echo ""
echo "--- pip ---"
if python3 -m pip --version &>/dev/null; then
PIP_VER=$(python3 -m pip --version 2>/dev/null | head -1)
ok "pip ($PIP_VER)"
else
fail "pip not found"
case "$OS" in
Darwin) info "Install: python3 -m ensurepip --upgrade"
info " or: brew install python3 (includes pip)" ;;
Linux) info "Install: sudo apt install python3-pip (Debian/Ubuntu)" ;;
*) info "Install: python3 -m ensurepip --upgrade" ;;
esac
fi
# ── 6. Python packages ──
echo ""
echo "--- Python Packages ---"
PY_PKGS=(
"defusedxml:defusedxml"
)
MISSING_PY=()
for entry in "${PY_PKGS[@]}"; do
mod="${entry%%:*}"
pkg="${entry##*:}"
if python3 -c "import $mod" 2>/dev/null; then
ver=$(python3 -c "import $mod; print(getattr($mod, '__version__', 'installed'))" 2>/dev/null)
ok "$pkg ($ver)"
else
fail "$pkg not installed"
MISSING_PY+=("$pkg")
fi
done
if [ ${#MISSING_PY[@]} -gt 0 ]; then
echo ""
if [ -t 0 ]; then
read -p " Install missing Python packages? [Y/n] " -n 1 -r REPLY
echo ""
REPLY=${REPLY:-Y}
else
warn "Non-interactive mode — skipping auto-install. Run interactively or install manually."
REPLY=N
fi
if [[ ! $REPLY =~ ^[Nn]$ ]]; then
# Try a plain install, then --user, then --break-system-packages; report success only if one of them worked.
if python3 -m pip install -q "${MISSING_PY[@]}" 2>/dev/null \
|| python3 -m pip install -q --user "${MISSING_PY[@]}" 2>/dev/null \
|| python3 -m pip install -q --break-system-packages "${MISSING_PY[@]}" 2>/dev/null; then
ok "Installed: ${MISSING_PY[*]}"
else
fail "pip install failed. Try manually: python3 -m pip install ${MISSING_PY[*]}"
fi
fi
fi
# ── Summary ──
echo ""
echo "============================================"
echo " Setup complete."
echo " Core: Node.js + docx (npm)"
echo " Post-processing: Python + defusedxml"
echo "============================================"
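
The Python package check in step 6 pairs an importable module name with the name to install, in `module:package` form. As a rough illustration, the same check can be written in Python. This is a sketch under assumptions: `missing_packages` is a hypothetical helper, and it uses `importlib.util.find_spec`, which only looks a module up without importing it, whereas the script actually runs `import $mod`.

```python
import importlib.util


def missing_packages(entries):
    """Return the package names whose module cannot be found.

    entries: strings in "module:package" form, the same shape as PY_PKGS.
    Only top-level module names are expected here.
    """
    missing = []
    for entry in entries:
        module, _, package = entry.partition(":")
        if importlib.util.find_spec(module) is None:
            missing.append(package or module)
    return missing


if __name__ == "__main__":
    print(missing_packages(["defusedxml:defusedxml"]))
```

An empty list means nothing needs installing; otherwise the result can be passed straight to `python3 -m pip install`.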


@@ -0,0 +1,88 @@
---
name: dream-interpreter
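description: AI dream interpreter. After the user describes a dream and answers targeted follow-up questions about key details, it generates readings from three perspectives (Zhou Gong dream interpretation / psychological analysis / cyber mystic) and outputs structured JSON for the frontend to render a "dream analysis card".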
---
# dream-interpreter
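AI dream interpreter. After the user describes a dream and answers targeted follow-up questions about key details, it generates readings from three perspectives (Zhou Gong dream interpretation / psychological analysis / cyber mystic) and outputs structured JSON for the frontend to render a "dream analysis card".
## When to use
- The user says something like "I dreamed that...", "I had a dream last night", or "help me interpret a dream"
- NOT for: lucid-dreaming instruction, sleep-quality analysis, or actual psychological counseling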
## Session flow
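### Phase 1: Dream collection + follow-up questions
1. The user describes the dream
2. Extract the key imagery from the description and identify the ambiguities that most affect the direction of the reading
3. Ask at most 3 follow-up questions (fewer is fine), each focused on one dimension:
Follow-up dimensions, in priority order:
- **Emotion**: "Were you scared or relaxed as you fell?" → decides between an anxiety reading and a release reading
- **Setting**: "Do you recognize that place?" → links the dream to an area of life
- **People**: "Do you know the person in the dream?" → identifies the projection target
- **Ending**: "How did it end?" → decides where the reading goes
Follow-up rules:
- The user's description is already detailed → ask fewer questions or none
- The user does not want to answer → skip and use a reasonable default
- The questions themselves should be in character, not an interrogation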
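### Phase 2: Generate the readings
Once the information is collected, generate readings from the three perspectives. Analyze each perspective independently and make the styles differ sharply.
Read `interpretation-guide.md` for the detailed guide to the three perspectives.
### Phase 3: Output structured JSON
Output JSON in the format defined in `output-schema.md` for the frontend to render.
The JSON contains: dream summary, keywords, mood category, color scheme, list of visual elements, the three perspective readings, overall advice, and shareable text.
Read `visual-mapping.md` to map imagery to visual elements and a color scheme.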
## Output format
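**Follow-up stage**: plain-text conversation, strongly in character
**Reading stage**: output a JSON code block in the format defined in `output-schema.md`
Example:
Follow-up questions:
```
Hmm... falling from a tall building...
Let me ask you a few things:
1. As you fell, were you scared, or did it actually feel kind of great?
2. Do you recognize the building? Your office? Home? Or somewhere you have never seen?
3. Did you hit the ground in the end, or did you just keep falling?
```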
Reading output:
```json
{
"dream_summary": "Fell from an unfamiliar tall building, felt fear, never hit the ground",
"keywords": ["tall building", "falling", "fear", "endless fall"],
"mood": "anxious",
"color_scheme": "dark",
"visual_elements": ["building", "falling_particles", "dark_bg", "blur_lights"],
"interpretations": {
"zhouGong": { ... },
"freud": { ... },
"cyber": { ... }
},
"overall_advice": "...",
"shareable_text": "..."
}
```
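
Before the JSON reaches the frontend, a quick structural check can catch missing fields. The sketch below is a minimal example that assumes the top-level keys and the three perspective keys shown in the example above are all required; the authoritative list lives in `output-schema.md`, which is not reproduced here, and `missing_fields` is a hypothetical helper.

```python
import json

# Assumption: key names are taken from the example output, not from output-schema.md.
REQUIRED_KEYS = {
    "dream_summary", "keywords", "mood", "color_scheme",
    "visual_elements", "interpretations", "overall_advice", "shareable_text",
}
PERSPECTIVES = ("zhouGong", "freud", "cyber")


def missing_fields(payload):
    """Return the names of required fields absent from a JSON string."""
    data = json.loads(payload)
    missing = sorted(REQUIRED_KEYS - data.keys())
    interpretations = data.get("interpretations", {})
    missing += [
        f"interpretations.{name}"
        for name in PERSPECTIVES
        if name not in interpretations
    ]
    return missing


if __name__ == "__main__":
    print(missing_fields('{"dream_summary": "falling", "keywords": []}'))
```

An empty list means the payload has every field the example shows; it says nothing about whether the values themselves are well formed.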
## References
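- `interpretation-guide.md`: detailed guide and style requirements for the three perspectives
- `visual-mapping.md`: mapping table from dream imagery to visual elements and color schemes
- `output-schema.md`: full specification of the JSON output format
- `questioning-strategy.md`: follow-up questioning strategy and example library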

Some files were not shown because too many files have changed in this diff