Files
mantle-ai-trader/skills/pdf/SKILL.md
2026-06-06 05:21:10 +00:00

65 KiB
Executable File
Raw Blame History

name: pdf metadata: author: Z.AI version: "1.0" description: Professional PDF toolkit with four production lines: (1) Report - structured documents via ReportLab (reports, proposals, contracts, white papers) (2) Creative - visual design via JSON Blueprint → design_engine.py → Playwright snapshot (posters, infographics, invitations, dashboards). The LLM acts as Art Director outputting ONLY JSON spatial blueprints; convert.blueprint compiles to pixel-perfect PDF. (3) Academic - scholarly work via LaTeX/Tectonic (papers, theses, math-heavy documents) (4) Process - manipulate existing PDFs (extract, merge, split, fill forms, convert) Auto-routes based on document type. Includes ATS/creative/academic resume sub-paths. license: Proprietary. LICENSE.txt has complete terms

PDF - Document Production Workbench

Quick Setup

bash "$PDF_SKILL_DIR/scripts/setup.sh"          # Interactive environment check + install
python3 "$PDF_SKILL_DIR/scripts/pdf.py" env.check  # Detailed dependency status (JSON: add -j)
python3 "$PDF_SKILL_DIR/scripts/pdf.py" env.fix     # Auto-install missing Python packages

Triage

Determine task weight to control how much context to load:

Weight Triggers What to Load
Light Format conversion, form fill, text extract, merge/split, simple certificate SKILL.md + briefs/process.md only
Standard Multi-page report, poster, academic paper, resume, reformat - any document with design decisions SKILL.md + matched brief + typesetting assets on demand

Light tasks skip typesetting files entirely. Standard tasks load them on demand per the brief's instructions.

⚠️ Pre-Routing Checks (run BEFORE matching brief)

  1. Emoji Check - Scan user content for intentional emoji (decorative 📊🎯🔥, not OS-level emoji input). If found → force Creative brief regardless of document type. ReportLab renders emoji as □ squares; LaTeX drops them entirely.
  2. CJK Check - Chinese/Japanese/Korean content needs font coverage. Report brief must use UniSong/UniHei registered fonts; Creative brief must load Google Fonts Noto Sans SC with font-display: swap; Academic brief must use \usepackage{ctex}.
  3. Size Check - Non-standard page sizes (not A4/Letter/A3) → prefer Creative brief (Playwright handles any dimension). ReportLab can do custom sizes but pagination is manual.
  4. Character Safety Check - Before writing any content string, scan for Japanese kana (の、が、は etc.), unusual Unicode symbols, or non-CJK characters that may corrupt during encoding transit ( Especially when code is written via heredoc/base64/LLM output). Replace with plain Chinese equivalents: 之/的/缔, →omit or write full character. If content must preserve Japanese, use only standard CJK Unified Ideographs (U+4E00-U+9FFF) and common kana; avoid rare/private-use codepoints.

Briefing

Match the user's intent to a production brief. Each brief contains the full workflow, tech stack specifics, and references to shared typesetting assets.

User Request
│
├─ Work with existing PDF? ─────────────┬─ Extract/merge/split/fill/convert → briefs/process.md
│                                       ├─ Reformat/redesign → briefs/process.md (extract) → delegate to report or creative brief
│                                       └─ User provides a PDF template/reference to match style
│                                          → briefs/process.md "Template-Guided Reformat" → delegate to matched brief
│
├─ Report / proposal / white paper / contract / analysis?
│  └─ ────────────────────────────────── → briefs/report.md   (ReportLab)
│
├─ Poster / invitation / infographic / dashboard / creative layout?
│  └─ ────────────────────────────────── → briefs/creative.md  (Playwright)
│
├─ Academic paper / thesis / math / IEEE / ACM / LaTeX?
│  └─ ────────────────────────────────── → briefs/academic.md  (Tectonic)
│
├─ Math-heavy doc / TikZ diagram / algorithm pseudocode / Beamer slides?
│  └─ ────────────────────────────────── → briefs/academic.md  (Tectonic, Scenarios A-D)
│
├─ Document needs complex embedded diagrams (flowcharts, architecture, neural nets)?
│  └─ Route by target brief:
│     ├─ Report → Playwright+CSS → PNG → ReportLab Image() flowable
│     ├─ Creative → directly in HTML (CSS flexbox/grid + connectors)
│     └─ Academic → complexity-based:
│        ├─ Simple (≤6 nodes, linear/tree) → TikZ native (vector)
│        └─ Complex (>6 nodes, branches, annotations) → Playwright+CSS → PNG → \includegraphics
│
└─ Resume / CV?
   ├─ ATS-safe / corporate ─────────── → briefs/report.md     (resume sub-section)
   ├─ Creative / design industry ────── → briefs/creative.md   (resume sub-section)
   └─ Academic CV / publications ────── → briefs/academic.md   (resume sub-section)

Detection Keywords

Brief Keywords
Report 报告, report, 分析, analysis, 白皮书, white paper, 提案, proposal, 合同, contract, 方案, 规划, 发票, invoice, 收据, receipt, 试卷, exam, quiz, test paper, 练习, exercise, worksheet, 考试, 测验
Creative 海报, poster, 邀请函, invitation, 信息图, infographic, 仪表盘, dashboard, 传单, flyer, 证书, certificate, 菜单, menu, 名片, business card, 奖状, award, 标签, label, 信封, envelope, 贺卡, greeting card
Creative (Poster) 海报, poster, 传单, flyer, 宣传页, 宣传单 → additionally load briefs/poster.md scene layer rules
Academic 论文, paper, 学术, academic, LaTeX, 数学, math, IEEE, ACM, 毕业, thesis, 研究, research, Beamer, slides, 开题报告, 学位, dissertation, proposal
Process 提取, extract, 合并, merge, 拆分, split, 填写, fill, 转换, convert, OCR, 重排, reformat, 重新排版, redesign, 模板, template, 参照, 照着这个做, match this style, 压缩, compress, 水印, watermark, 加密, encrypt, 签名, sign

Complete Scenario Routing Matrix

Below is an exhaustive map of every known PDF request type to its handling strategy. If a scenario is not listed, route to the closest match or ask the user.

📄 Creation (Generate PDF from scratch)

Scenario Route Notes
Report / white paper / analysis report.md ReportLab structured document
Report with emoji creative.md 🚨 Emoji rule override
Business proposal report.md Structured + data tables
Contract / legal document report.md Add signature placeholders (dotted line + label)
Invoice / receipt report.md Table-heavy, precision alignment
Exam / quiz / test paper / worksheet report.md Indented options, answer space reservation, structured numbering (see Exam Paper Rules in report.md)
Math exam / math worksheet (with formulas/equations) academic.md LaTeX for proper math typesetting. See §Exam Paper Rules in academic.md
Poster / flyer creative.md + poster.md Visual design + poster density/sizing rules
Invitation / greeting card creative.md Non-standard size, decorative
Certificate / award creative.md Single page, centered layout, decorative border
Business card creative.md Tiny size (90×54mm), Playwright native support
Envelope / label creative.md Non-standard size, simple layout
Menu / price list creative.md Visual layout + may contain emoji
Resume (ATS) report.md Plain text structure
Resume (creative) creative.md Visual design
Resume (academic CV) academic.md Publication list + BibTeX
Academic paper academic.md LaTeX/Tectonic
Math-heavy document academic.md LaTeX typesetting
Presentation / PPT-style creative.md Landscape (1280×720), one topic per page
Book / long document report.md Add TOC + chapter numbering, validate with toc_validate.py
CJK vertical text creative.md HTML writing-mode: vertical-rl + text-orientation: upright + white-space: nowrap + Playwright
RTL document (Arabic/Hebrew) creative.md HTML dir="rtl" + Playwright
Batch generation (mail merge) report.md Python loop + template variable substitution
Infographic creative.md Data visualization + design
Calendar / schedule creative.md Grid layout + custom dimensions

🔧 Processing (Manipulate existing PDF)

Scenario Route Command / Method
Merge multiple PDFs process.md pages.merge a.pdf b.pdf -o out.pdf
Split PDF process.md pages.split input.pdf -o ./output/
Extract text process.md extract.text input.pdf
Extract tables process.md extract.table input.pdf
Extract images process.md extract.image input.pdf
Fill forms process.md form.fill input.pdf
Office → PDF process.md convert.office input.docx
HTML → PDF (documents) process.md convert.html input.html or node html2pdf-next.js
HTML → PDF (posters) poster.md node html2poster.js poster.html
Image → PDF process.md pikepdf: one image per page, embed as XObject
PDF → image process-advanced.md pypdfium2 render each page to PNG
Encrypt / decrypt process-advanced.md pikepdf encryption
Add watermark process.md pikepdf overlay: create watermark page → merge onto each page
Compress PDF process.md Ghostscript: gs -sDEVICE=pdfwrite -dPDFSETTINGS=/screen
OCR scanned PDF process-advanced.md ocrmypdf or Tesseract
Rotate pages process.md pages.rotate input.pdf 90 -o out.pdf
Crop pages process.md pages.crop input.pdf l,b,r,t -o out.pdf
Remove blank pages process.md pages.clean input.pdf
Reformat by template process.md → delegate Extract content → regenerate via report/creative
PDF diff / compare process.md diff-pdf CLI or Python per-page text comparison
Digital signature process.md pyhanko library (requires extra install)
Edit metadata process.md meta.set input.pdf -o out.pdf -d '{...}'

Special Routing Rules

🚨 Emoji rule (CRITICAL - check FIRST): Content with intentional emoji (📊🎯🔥💡 etc.) → force briefs/creative.md regardless of document type. ReportLab renders emoji as □ squares; LaTeX silently drops them. This rule overrides all other routing. Even if the user says "report" - if the content has emoji, use Creative pipeline.

Non-standard page size rule: Dimensions other than A4/Letter/A3 → strongly prefer briefs/creative.md. Playwright handles any arbitrary page size natively. ReportLab requires manual pagination math.

Academic auto-detect: Papers, theses, or heavy math → briefs/academic.md even without explicit "LaTeX" mention.

Template-guided rule: When the user uploads a PDF and says "match this template" / "follow this style" / "reformat like this" → briefs/process.md Template-Guided Reformat section. This is a Standard triage (not Light), because it involves design decisions.

Resume routing: Default to Report brief (ATS-safe). Creative industry → Creative brief. Academic CV with publications → Academic brief.


Shared Assets

These are referenced by multiple briefs. Do not load upfront - each brief tells you when and what to load.

Asset Path Used By Purpose
Palette & Typography typesetting/palette.md Report, Creative Color system, font rules, anti-patterns, spacing
Cover Layout System V2.1 typesetting/cover.md Report + Creative + Academic 7 industrial-grade templates with absolute anchor grid, Z-index layers, typography weight system, mandatory Summary Block, code-level safety (5 checks), base unit U = W*0.05. Unified HTML/Playwright cover system for all routes.
Chart Styling & Anti-Stacking typesetting/charts.md Report, Creative, Academic Chart defaults, collision prevention, axis/grid/legend rules
Overflow Prevention typesetting/overflow.md Report, Creative, Academic Bounding box system, text/image/table overflow prevention, fallback strategies
Fill Engine (Anti-Void) typesetting/fill-engine.md Report, Creative, Academic Anti-Void Engine V2.0: font floor enforcement, fill ratio calculation, paragraph inflation, component elevation, Y-axis golden-ratio anchoring
Pagination & Flow Control typesetting/pagination.md Report, Creative Cross-page integrity, orphan/widow control, CJK punctuation rules
Typography System typesetting/typography.md Report, Creative Font size scale, line-height, spacing hierarchy
Geometric Anchors typesetting/geometry.md Creative + Report Decorative geometric elements, anchor placement rules
Cover Backgrounds typesetting/cover-backgrounds.md Report + Creative + Academic Cover background rendering, transparency constraints
Visual Framework configs/visual_framework.md Creative Palette mode, color harmony, SVG background params
Components Library configs/components.md Creative Non-grid composition components (floating cards, oversized text, etc.)
Font Stacks configs/fonts.md All pipelines Font families per pipeline (Google Fonts, ReportLab, LaTeX)

Content Rules

  • Language: Match user's query language. Chinese query → Chinese PDF.
  • Page/word count: Respect explicit constraints (±20%). Unspecified → completeness over brevity.
  • Outline: User-provided outlines are sacred. No reordering without asking.
  • Citations: No fabrication. Chinese → GB/T 7714, English → APA. Search to verify.
  • Multi-part requests: Generate ALL parts - never silently drop a component.

HTML Image Source Path Rules

When embedding images in HTML documents (Creative pipeline, Playwright-rendered diagrams, or any HTML→PDF flow):

Image location <img src> value Example
Local file Relative path from the HTML file's directory <img src="images/chart.png"> or <img src="./diagram.png">
Remote URL Full URL (no change needed) <img src="https://example.com/photo.jpg">

Iron rules:

  1. NEVER use absolute paths for local files in HTML <img>, <source>, CSS url(), or any other asset reference (e.g. /Users/alice/project/img.png). Absolute paths break portability across machines and environments.
  2. Always use relative paths anchored to the HTML file's own directory. If the image lives in a subdirectory, use images/foo.png or ./images/foo.png.
  3. Remote URLs (http:// / https://) are fine as-is — do not convert them to local paths.
  4. When generating HTML from a script or blueprint, ensure all referenced assets are either (a) in the same directory as the output HTML, or (b) in a clearly named subdirectory (e.g. assets/, images/), and referenced with relative paths.
  5. If a build script needs to resolve paths programmatically, compute relative paths at generation time (e.g. os.path.relpath(image_path, html_dir)) rather than embedding absolute filesystem paths.

Figure & Diagram Embedding (All Briefs)

Iron Rule: Figures Are Block-Level

Figures, diagrams, and charts MUST be independent block elements occupying full width. Never float/wrap figures alongside body text - this causes the text-diagram overlap badcase.

Brief Correct embedding Forbidden
Report (ReportLab) story.append(Image(...)) as standalone Flowable Placing images inside Paragraph text, simulating float
Creative (Playwright) <figure style="display:block; width:100%; margin:2em auto"> float:right, display:flex with text, wrapfigure-style CSS
Academic (LaTeX) \begin{figure}[t] ... \end{figure} Bare \includegraphics in text body (no figure env), bare tikzpicture in multi-column

Complex Diagram Strategy

When a diagram has >12 nodes, >3 subgroups, or intricate connections, do NOT try to render it as one giant figure. Instead:

  1. Table for details - structured data (phases, components, specs) goes into a proper table
  2. Simplified overview diagram - a stripped-down flowchart/Mermaid showing only the top-level flow (≤8 nodes)
  3. Cross-reference - table caption + diagram caption reference each other

This "table + simple diagram" pattern prevents:

  • Diagrams overflowing page boundaries
  • Text becoming unreadably small to fit everything
  • Layout engines mishandling oversized graphics

Diagram Content Quality Rules (Cross-reference: charts)

The rules above handle how to embed diagrams in PDF. For what the diagram itself looks like (node layout, connector routing, color, readability), follow the charts skill rules:

Before generating ANY flowchart/diagram for PDF embedding, check these:

  1. Connectors must not pass through nodes - If 3+ layers exist, connect adjacent layers only (top→mid, mid→bottom). Never draw top→bottom lines through middle nodes. Use detour paths if cross-layer links are needed.

  2. Multiple arrows into one node must not pile up - Distribute entry points evenly along target edge, or use merge-then-enter pattern (sources converge to a vertical merge line, then single arrow to target).

  3. Low-saturation fills only - Node backgrounds must be pale (#EFF6FF, #F0FDF4). High-saturation colors (#3B82F6, #10B981) only for borders or small accents. No children's-art color schemes.

  4. Phase titles vs sub-steps must be visually distinct - Different background color, font size, and font weight. Never same-style boxes for both.

  5. Font sizes must be readable at final output size - Sizes depend on the embedding context:

    Output context Node title min Description min Label min
    Standalone PNG (web/presentation, ≥1200px wide) 14px 12px 11px
    Embedded in A4 PDF (ReportLab/LaTeX, ~450pt content width) 10pt 8pt 7pt
    Embedded in slide deck (landscape, ~720pt wide) 12pt 10pt 9pt

    Principle: After embedding, the smallest text in the diagram must still be legible when the document is viewed at 100% zoom. If the diagram is scaled down to fit page width, recalculate: effective_size = original_size × (display_width / canvas_width). If effective size drops below the minimum, either increase original font size or reduce diagram complexity.

  6. Legend/annotations must not overlap content - Separate container, ≥ 40px gap from last node, fully within canvas bounds.

For Playwright-rendered diagrams: Use low-saturation fills (#EFF6FF, #F0FDF4), CSS flexbox/grid for node layout, SVG <line>/<path> for connectors, and verify no overlap at final render size. For ReportLab-drawn diagrams: Same principles apply - use Drawing() with explicit coordinates, check node bounding boxes for overlap before finalizing.

Diagram Generation Strategy (Per-Brief)

Diagram rendering depends on the target brief - NOT a one-size-fits-all TikZ pipeline.

Target Brief Diagram Method Rationale
Report (ReportLab) Playwright+CSS → PNG → Image() No LaTeX compiler in this route; HTML/CSS handles any layout natively
Creative (Playwright) Directly in HTML (CSS flexbox/grid + JS connectors) Already in browser context
Academic (Tectonic) - simple (≤6 nodes) TikZ native tikzpicture Vector output, font consistency, LaTeX-native
Academic (Tectonic) - complex (>6 nodes) Playwright+CSS → PNG @2×\includegraphics TikZ branch logic is error-prone for models; 300dpi PNG is publication-ready

Playwright+CSS diagram pipeline (Report & Academic-complex):

# 1. Write diagram HTML (CSS grid/flexbox + connectors)
cat > diagram.html << 'EOF'
<!-- LLM generates: nodes as divs, arrows as SVG/CSS -->
EOF

# 2. Screenshot at 2× for print quality (300dpi equivalent)
python3 "$PDF_SKILL_DIR/scripts/pdf.py" convert.blueprint diagram.html --device-scale-factor 2 --output diagram.png
# Or via Playwright directly:
# page.screenshot(path='diagram.png', scale='device', device_scale_factor=2)

# 3a. Embed in ReportLab (Report brief)
from reportlab.platypus import Image
img = Image('diagram.png', width=450)  # auto height via aspect ratio
story.append(img)

# 3b. Embed in LaTeX (Academic brief, complex diagrams only)
# \includegraphics[width=\columnwidth]{diagram.png}

🚫 FORBIDDEN for Report/Creative briefs: Do NOT use TikZ standalone → compile → pdftoppm → PNG pipeline. This route has no LaTeX compiler and the extra compilation steps are error-prone.

TikZ remains valid ONLY for:

  • Academic brief with simple diagrams (≤6 nodes, linear/hierarchical)
  • Direct tikzpicture embedding in LaTeX documents
  • Math-annotated diagrams where LaTeX math rendering matters

See briefs/academic.md Scenario B for TikZ templates (simple diagrams only).


Vector Rendering Iron Rule

The final PDF MUST be generated via page.pdf() (Playwright) or ReportLab/LaTeX native output - NEVER via screenshot-to-PDF.

Scenario Correct Method Forbidden
Creative pipeline (single/multi-page) page.pdf() via convert.blueprint or html2pdf-next.js page.screenshot() → image → wrap as PDF
Report cover (HTML/Playwright) page.pdf() → merge via pypdf Screenshot cover → embed as image
Academic cover page.pdf() → merge via pypdf Screenshot → \includegraphics for cover
Full-page posters/infographics html2poster.js (auto overflow:hidden + height measurement + page.pdf()) Any raster pipeline for the final output

Why: page.pdf() produces vector text + vector shapes. Text remains selectable, sharp at any zoom, and file size is smaller. Screenshot-based PDFs are raster images - blurry when zoomed, unsearchable, and 3-5× larger.

The ONLY place screenshot/PNG embedding is acceptable:

  • Diagrams embedded as sub-elements inside a larger document (e.g., flowcharts in a Report). These use page.screenshot() at 2× device scale factor for 300dpi print quality, then embed via Image() (ReportLab) or \includegraphics (LaTeX).
  • Chart images generated by matplotlib/plotly saved as PNG, then embedded.

These are sub-elements, not the document itself. The document-level PDF output must always be vector.

Quick test: Open the generated PDF, zoom to 400%. If text is blurry, you used a screenshot pipeline. Fix it.

HTML→PDF Engine Selection Rules

There are two dedicated scripts for HTML→PDF. Choose based on document type:

Document type Script Reason
Posters, infographics, long-image single-page designs html2poster.js Auto overflow:hidden, auto height measurement, zero margin, single-page output
Cover pages (Report/Academic route) html2poster.js Covers are single-page fixed layouts with absolute positioning — same nature as posters. html2pdf-next.js would convert absolute→static and destroy the layout
Multi-page documents, reports, academic papers, resumes html2pdf-next.js A4/custom pagination, 20mm margin fallback, cover adaptation, pdf-lib metadata
Creative pipeline (Blueprint → HTML → PDF) html2pdf-next.js via convert.blueprint Called internally by design_engine pipeline

Poster / Single-Page Long-Image → html2poster.js

node "$PDF_SKILL_DIR/scripts/html2poster.js" poster.html --output poster.pdf --width 720px

html2poster.js automatically:

  • Forces overflow: hidden on .poster / .page containers (clips decorative overflow)
  • Injects @page { margin: 0 } (zero margins always)
  • Syncs html/body background with poster background color
  • Measures .poster scrollHeight and uses it as PDF height
  • Generates a single-page vector PDF with exact content dimensions

Use this for ANY fixed-width, dynamic-height, single-page design.

Documents / Multi-Page → html2pdf-next.js

node "$PDF_SKILL_DIR/scripts/html2pdf-next.js" input.html --output output.pdf --width 210mm --height 297mm
# Or via pdf.py wrapper:
python3 "$PDF_SKILL_DIR/scripts/pdf.py" convert.html input.html --output output.pdf

Pre-render hooks auto-handle @page injection, overflow detection, cover adaptation, font loading, and pdf-lib metadata.

⚠️ Iron Rule: No Hand-Written Playwright Scripts

Common issues with hand-written Python page.pdf() (the dedicated scripts handle these automatically):

  1. Missing @page rule → browser default margin causes content overflow to second page or white edges
  2. Oversized elements not fixed → large elements with break-inside: avoid block pagination, content gets truncated
  3. Rendering before fonts are loaded → Chinese text displays as squares or falls back to wrong font
  4. No overflow detection → content exceeds page boundary without awareness
  5. No metadata → PDF title, author, and other info missing

Iron rule: Posters and cover pages use html2poster.js, multi-page documents use html2pdf-next.js. Do not write hand-written Python Playwright scripts.

⚠️ Cover page gotcha: Cover HTML uses position: absolute for layout. html2pdf-next.js pre-render hooks convert absolute-positioned elements to static flow (to prevent multi-page overlap), which destroys cover layouts. Always use html2poster.js for cover pages.

No overflow:hidden on Fixed-Size Pages (html2pdf-next.js only)

When using html2pdf-next.js for documents, NEVER set overflow: hidden on html, body, or the main page container.

Note: This rule does NOT apply to posters rendered via html2poster.js — that script automatically adds overflow: hidden to .poster/.page containers to clip decorative overflow. You don't need to add or remove it manually.

Problem Cause Fix
Browser preview cuts off bottom content, can't scroll overflow: hidden on container + viewport < design height Remove overflow: hidden
html2pdf-next.js "Fixed vertical overflow" warning, layout may break Pre-render detects scrollHeight > clientHeight + hidden overflow, force-expands container Remove overflow: hidden

Always pair fixed-size pages with @media screen auto-scale so the full page is visible in any browser window without scrolling. See briefs/creative.md § 0.5 for the CSS pattern.

Full-Bleed Rule (No White Margins)

When generating HTML for Playwright page.pdf(), the content MUST fill the entire page with zero margins. White side margins = broken layout.

Mandatory CSS for any HTML → PDF:

@page {
  size: <width> <height>;  /* e.g., 720px 960px, or A4 */
  margin: 0;
}
html, body {
  margin: 0;
  padding: 0;
}

Common causes of white margins:

  1. Missing @page { margin: 0 } - browser default margins kick in (~1cm each side)
  2. Content width doesn't match page width - e.g., canvas is 720px but page is A4 (794px)
  3. Missing @page { size } declaration in the HTML
  4. Content has explicit max-width that's narrower than the page

For blueprint pipeline: design_engine.py now injects @page { size: var(--canvas-w) var(--canvas-h); margin: 0; } automatically. For raw HTML: YOU must include the @page rule. No exceptions. For direct Playwright: Pass margin: { top: 0, right: 0, bottom: 0, left: 0 } to page.pdf().

Background Color Consistency (No Color Mismatch)

html / body background color must match the content canvas background color.

Playwright page.pdf({ printBackground: true }) renders the body background color. If body is white while the content area is gray/colored, color-inconsistent borders/gaps will appear in the PDF.

Single-color documents (all pages same background)

/* MANDATORY: body background = content background */
html, body {
  margin: 0;
  padding: 0;
  background: var(--c-bg);  /* Same color as content canvas */
}

Multi-page documents with mixed backgrounds (e.g. dark cover + white body pages)

Root cause: Playwright resolves .page { width: 210mm } and @page { size: 210mm } to slightly different sub-pixel values (e.g. 793.688px vs 793.701px). This creates a <1px gap at the right/bottom edge of each .page div where body's background shows through. On dark pages, a white body background makes this gap visible as a white edge.

Fix — set body background to the document's dominant dark color:

:root {
  --primary: #0f172a;  /* darkest page background */
}
html, body {
  margin: 0;
  padding: 0;
  width: 210mm;  /* match @page size */
  background: var(--primary);  /* fallback for sub-pixel gaps */
}

Why this works and doesn't break white pages:

  • Dark pages: sub-pixel gap reveals dark body → gap invisible.
  • White pages: .page-white { background: #ffffff } fully covers body → dark body never visible.
  • The gap is <1px — even on white pages, the dark body at the extreme pixel edge is imperceptible after anti-aliasing.

Rule: when generating multi-page HTML with mixed backgrounds, always set html, body { background } to the darkest page's background color. If all pages are light/white, use the lightest content background (e.g. #f8fafc). Never leave body background unset (browser default = white = guaranteed white edges on dark pages).


### Content Centering (No Left/Right Drift)

**After HTML-to-PDF conversion, content must be centered, no left or right drift allowed.**

Common drift causes:
1. `@page { margin }` not 0 — browser default margin causes drift
2. `.safe-zone` or content container `inset` / `padding` left-right asymmetric
3. Content container has `max-width` but no `margin: 0 auto`
4. Grid components only occupy partial column width (e.g. `1/1 → X/7` only uses left half)
5. **Decorative elements overflow page boundary** — elements with `width > 100%` or negative offsets (e.g. glow circles, gradient overlays) inflate `scrollWidth` beyond page width. Playwright shrinks all content to fit, causing left-shift. **Fix: add `overflow: hidden` to `.page` containers.** See `typesetting/overflow.md` §3.5 for horizontal flex overflow rules.

### Anti-Void Edges (No Large Blank Margins)

**Content should not have large meaningless whitespace at page edges, top, or bottom.**

- Content should make full use of page area; do not cram all content in the top half while leaving the bottom blank
- For multi-page documents, each page's fill rate should be ≥ 60% (see `pagination.md` last page ≥ 40% rule)
- For single-page posters/infographics, fill rate should be ≥ 70%

---

## Preflight (Quality Assurance)

Every PDF must pass preflight checks before delivery. Each brief specifies the exact commands.

### HTML Pre-Render Validation (MANDATORY for ALL HTML→PDF paths)

**Before** calling `html2pdf-next.js`, `html2poster.js`, `convert.blueprint`, or any Playwright `page.pdf()`, run:

```bash
python3 "$PDF_SKILL_DIR/scripts/poster_validate.py" check-html <your_file>.html
Result Action
PASS (no errors) Proceed to PDF generation
ERROR items Must fix before generating PDF. Use --fix --output <file>.html for auto-repair
WARNING items Review; non-blocking but should be addressed

Key checks:

  • OVERFLOW_HIDDEN_CONTAINER (error): overflow:hidden on html/body/.page clips content in browser preview and triggers html2pdf-next.js auto-fix that may break layout
  • FIXED_SIZE_NO_SCREEN_ADAPT (warning): fixed-size page without @media screen auto-scale — browser preview requires scrolling
  • SCREEN_ADAPT_NO_SCALE (warning): @media screen exists but lacks scale/transform/zoom
  • FONT_NO_FALLBACK (error): font-family without generic fallback
  • COLOR_CONTRAST (warning): text/background contrast ratio < 3:1
  • Plus: remote images, absolute paths, missing margin reset, tiny fonts, background mismatch, etc.

This applies to all three HTML routes: Creative blueprint pipeline, Report HTML covers, and bypass/custom HTML.

Overflow Prevention System

→ Full spec: typesetting/overflow.md - read it for any document with tables, images, or multi-column layouts.

Core principles:

  1. Measure first, draw second - never render content without pre-calculating its dimensions
  2. Bounding Box constraint - every element's width ≤ its parent container's Max_Width
  3. Text: use font metrics, not character count, for width calculation
  4. Images: proportional scaling - never insert at original size
  5. Tables: weight-based column width + Paragraph() wrapping (never plain strings)
  6. Fallback ladder: wrap → shrink font (max -3pt) → reduce padding → split element → log warning
  7. Vertical: KeepTogether for heading+body, chart+caption; repeatRows=1 for long tables

Table Overflow Prevention (ReportLab)

Most common layout bug: table columns exceed page margins.

Before building any ReportLab Table:

  1. Calculate available_width = page_width - left_margin - right_margin
  2. Use proportional colWidths ([0.25, 0.40, 0.20, 0.15] × available_width) or fixed+flex pattern
  3. sum(colWidths) must be ≤ available_width - verify this in code
  4. Long text columns must use Paragraph() wrapping, not plain strings (plain strings don't wrap)
  5. CJK text is wider: budget ~12pt per character at 10pt font size

See briefs/report.md § "Table Width Management" for code patterns.

Table Overflow Prevention (LaTeX/Academic)

Most common bug in dual-column papers: wide tables overflow single-column width.

Before writing any LaTeX table:

  1. Count data columns - ≤ 4 fits single column; 5-6 needs \small; 7-8 needs \resizebox; ≥ 9 use table* (full width)
  2. Use tabular*{\columnwidth} or tabularx{\columnwidth} instead of plain tabular for 5+ columns
  3. Never use plain tabular with 8+ columns in twocolumn layout - guaranteed overflow
  4. \resizebox{\columnwidth}{!} as last resort - verify smallest text ≥ 6pt after scaling

See briefs/academic.md § "Table width management" for LaTeX patterns.

Playwright PDF CSS Blacklist

These CSS properties silently break in Playwright's PDF renderer:

  • backdrop-filter / -webkit-backdrop-filter - drops entire element content. Use solid rgba() backgrounds.
  • overflow: hidden on content containers - clips content. Only safe on small decorative elements (< 200px).

After generating any Playwright PDF, verify every page has content (pypdf text extraction, check non-empty).

PDF Metadata (all briefs)

ALL PDFs must have: Title, Author (default "Z.ai"), Creator, Subject.

Delivery Summary (all briefs)

Report to user: file path, size, page count. Academic adds word/image count. Creative adds per-page verification.

HTML→PDF route deliverables (MANDATORY — applies to ALL briefs that use Playwright/HTML to generate PDF): Whenever the HTML→PDF pipeline is used (Creative route, Report cover bypass, Direct HTML Flow posters, or any Playwright page.pdf() path), you MUST deliver both files to the user:

  1. HTML — the source HTML file, so the user can edit and reuse the design
  2. PDF — the final vector PDF (page.pdf() output)

Optionally also provide: 3. Image — a full-page screenshot/preview image (PNG or JPG) for quick sharing on chat/social media

All file paths must be reported to the user. Never deliver only the PDF without the HTML source.


Tooling Reference

CLI: python3 "$PDF_SKILL_DIR/scripts/pdf.py" <command>

# Environment
env.check                    # Check deps
env.fix                      # Auto-install missing

# Quality
code.sanitize <script>       # Sanitize forbidden Unicode
content.sanitize <file> [--apply]  # Fix content issues (CJK, encoding)
meta.brand <pdf>             # Add Z.ai metadata
font.check <pdf>             # Scan for missing glyphs
toc.check <pdf>              # Validate TOC

# Conversion
convert.blueprint <llm_json_response.md> -o final.pdf  # CRITICAL FOR CREATIVE: Auto-extracts JSON, compiles, and renders PDF.
convert.html <html>          # HTML → PDF (Playwright)
convert.latex <tex>           # LaTeX → PDF (Tectonic). Bundled binary is macOS arm64 only; see academic.md for other-platform install.
convert.office <file>         # Office → PDF (LibreOffice)

# Processing
extract.text <pdf>            # Extract text
extract.table <pdf>           # Extract tables
extract.image <pdf>           # Extract images
pages.merge a.pdf b.pdf -o out.pdf
pages.split <pdf>
pages.clean <pdf>             # Remove blank pages
form.info <pdf>               # Inspect form fields
form.fill <pdf>               # Fill form
form.annotate <pdf>           # Fill via annotations
meta.get <pdf>
meta.set <pdf> -o out.pdf -d '{"Title": "..."}'

Poster/HTML/LaTeX Validator: python3 "$PDF_SKILL_DIR/scripts/poster_validate.py"

check-html <html>                              # Pre-render validation (overflow:hidden, @media screen, fonts, contrast, etc.)
check-html <html> --fix --output <fixed.html>  # Auto-fix errors (remove overflow:hidden, add font fallback)
check-pdf <pdf> --source-html <html>           # Post-render validation
check-pdf <pdf> --poster                       # Poster mode: suppress ORPHAN_PAGE warning
check-tex <tex>                                # LaTeX source validation (table overflow, image width, etc.)

check-html checks include:

  • OVERFLOW_HIDDEN_CONTAINER (error): overflow:hidden on html/body/.page/.poster — clips content
  • FIXED_SIZE_NO_SCREEN_ADAPT (warning): fixed-size page without @media screen auto-scale
  • SCREEN_ADAPT_NO_SCALE (warning): @media screen exists but lacks scale/transform/zoom
  • FONT_NO_FALLBACK (error): font-family without generic fallback (sans-serif/serif)
  • COLOR_CONTRAST (warning): text/background contrast ratio < 3:1
  • BG_COLOR_MISMATCH (warning): body background differs from .canvas/.poster background
  • SCREEN_BG_MISMATCH (warning): @media screen html background differs from body/canvas background
  • MULTIPAGE_BODY_BG_MISSING (warning): multi-page document with dark .page backgrounds but no html/body background color. Sub-pixel gaps at page edges reveal white body, causing visible white edges on dark pages. Resolves var() references via :root variables.
  • SCREEN_NO_BG (warning): fixed-size page's @media screen block lacks html background color
  • OVERFLOW_DECORATION (warning): negative position values may cause black edges
  • NO_PAGE_SIZE / MISSING_MARGIN_RESET / WHITE_BACKGROUND / TINY_FONT / etc.

check-tex checks include:

  • BARE_TABULAR_OVERFLOW (error): \begin{tabular} with 5+ columns in two-column layout, not wrapped in resizebox/adjustbox/table*
  • RESIZEBOX_TEXTWIDTH (error): \resizebox{\textwidth} used inside single-column float in two-column layout. \textwidth = full page width, but table float is one column. Fix: use \resizebox{\columnwidth} or table*
  • TABULAR_OVERFLOW_RISK (warning): 4-column tabular in two-column layout without width constraint
  • TABULAR_WIDE (warning): 7+ column tabular in single-column layout without width constraint
  • TABULAR_NO_FLOAT (warning): tabular not inside table/table* float environment
  • TABULARX_NOT_LOADED (warning): document has tabular but tabularx package not loaded
  • IMAGE_NO_WIDTH (warning): \includegraphics without width/height/scale constraint
  • EQUATION_DUAL_ON_LINE (warning): equation environment has 2+ equations joined by \quad without line breaks. Guaranteed overflow in dual-column
  • EQUATION_OVERFLOW_RISK (warning): equation body has >80 math characters. Likely overflows single column
  • ALGORITHM_NO_SMALL_FONT (warning): algorithm environment in dual-column without \SetAlFnt{\small}
  • ALGORITHM_LONG_IO (warning): Algorithm Input/Output line >120 chars. Will overflow narrow column
  • CJK_ASCII_QUOTES (error): ASCII " found adjacent to CJK characters. LaTeX interprets " as right double quote, so "北漂" renders incorrectly. Skips verbatim/lstlisting/minted environments and \texttt{}/\url{}/\href{}{}/\verb|| inline commands.

Design Engine: python3 "$PDF_SKILL_DIR/scripts/design_engine.py"

compile --blueprint <json_file> --output poster.html  # CRITICAL: Compile JSON blueprint to HTML
derive "document title or description"         # Auto-derive intent from content
palette --intent calm --mode dark               # Generate HSL-locked palette
palette-cascade --intent cold --mode minimal    # Generate role-based cascade palette (V2, preferred)
svg --intent flow --dimensions 720x960         # Generate SVG background
full --intent energy --mode dark --dimensions 720x960 --output-dir ./assets/
audit --palette-json palette.json              # Check palette constraints

Palette Generator (for Report route): python3 "$PDF_SKILL_DIR/scripts/pdf.py" palette.generate

palette.generate --title "document title" --mode minimal   # Output: ready-to-paste ReportLab Python code
palette.generate --title "..." --format json               # Output: raw JSON
palette.generate --title "..." --format css                # Output: CSS custom properties
palette.generate --title "..." --mode dark --harmony complementary --seed 42

Cascade Palette (V2 - Preferred): python3 "$PDF_SKILL_DIR/scripts/pdf.py" palette.cascade

palette.cascade --title "document title" --mode minimal    # Output: summary table with all 12 roles
palette.cascade --title "..." --format json                # Full structured JSON (roles + cover + body + charts + semantic)
palette.cascade --title "..." --format css                 # CSS custom properties by tier
palette.cascade --title "..." --format reportlab           # Ready-to-paste ReportLab Python code

⚠️ Cascade palette is the preferred palette system. It enforces area ∝ 1/saturation (larger areas = lower saturation) and outputs unified color subsets for cover, body, and charts from one base hue. Use palette.cascade instead of palette.generate for new documents.

⚠️ Report route MUST call palette.cascade (or palette.generate) before writing any ReportLab code. The output is copy-paste ready - no manual hex picking allowed.

Note

: design_engine.py compile produces HTML from a JSON blueprint. To get a PDF, use pdf.py convert.blueprint which internally calls compile → Playwright render → PDF output. In the Creative pipeline, always use convert.blueprint for the final PDF.

Tech Stack per Brief

Brief Primary Tool Secondary Emoji Support Custom Page Size
Report ReportLab + pypdf Playwright (cover) (tofu □) Manual pagination
Creative Playwright html2pdf-next.js (pdf-lib for post-processing) native any size
Academic Tectonic + pypdf Playwright (cover) (dropped) Template-dependent
Process pikepdf, pdfplumber LibreOffice (soffice) N/A N/A

Unified Cover System: All routes generate covers via HTML/Playwright. Report uses Templates 0107, Academic uses Templates 0810 (dark backgrounds, scholarly typography), Creative generates cover + body in one HTML document. Cover PDFs are merged with body PDFs via pypdf.

Fallback: If Report brief content has emoji → reroute to Creative.


File Map

SKILL.md                            ← You are here
briefs/
  report.md                         ← Report production: ReportLab workflow + API + resume(ATS)
  creative.md                       ← Creative production: 5-phase generative design workflow
  poster.md                          ← Poster scene rules: density, font sizing, fill constraints (overlay on creative.md)
  academic.md                       ← Academic production: LaTeX workflow + templates + resume(CV)
  process.md                        ← PDF processing: extract/merge/split/form/convert/reformat
  process-advanced.md               ← Advanced reference (encrypted/corrupted/OCR/batch/perf) - load on demand
configs/
  visual_framework.md               ← Palette mode, color harmony, SVG background params
  components.md                     ← Non-grid composition components (floating cards, etc.)
  fonts.md                          ← Font stacks per pipeline (Creative/Report/Academic)
typesetting/
  palette.md                        ← Color system + typography + anti-patterns + spacing
  cover.md                          ← Cover page layout system (7 layouts × 2-3 variants) + typography scale + color rules
  cover-backgrounds.md              ← Cover background rendering rules + transparency constraints
  charts.md                         ← Chart styling + anti-stacking rules + axis/grid/legend treatment
  overflow.md                       ← Bounding box system, text/image/table overflow prevention
  pagination.md                     ← Cross-page integrity, orphan/widow control, CJK punctuation
  typography.md                     ← Font size scale, line-height, spacing system
  geometry.md                       ← Geometric anchor system (decorative elements, lines, shapes)
  fill-engine.md                    ← Adaptive anti-void layout engine V2.0
scripts/
  pdf.py                            ← CLI tool (30 subcommands)
  pdf_qa.py                         ← PDF quality checker (metadata, fonts, overflow, margins, tables, formulas)
  design_engine.py                  ← Generative SVG + palette engine (palette/svg/compile/derive/audit)
  poster_validate.py                ← HTML/PDF validator
  toc_validate.py                   ← TOC validator
  html2pdf-next.js                  ← Playwright + pdf-lib HTML→PDF converter for documents (no Paged.js)
  html2poster.js                    ← Playwright HTML→PDF converter for posters/single-page (auto overflow:hidden, dynamic height)
  cover_validate.js                 ← Cover-ONLY overlap detection (text vs decorative lines). Do NOT run on posters or documents — only on cover HTML in Report/Academic pipelines.
references/
  resume-altacv.tex                 ← AltaCV dual-column resume template (creative/tech)
  resume-academic.tex               ← Academic CV template (PhD/academic)

Loading Protocol

  1. Always read: This file (SKILL.md)
  2. Read ONE brief: The matched brief file - it contains the complete workflow
  3. Read typesetting on demand: Only when the brief says to (standard tasks)
  4. Never load all files upfront - briefs reference what they need

Script Path Setup (MANDATORY before any script call)

All paths are relative to $PDF_SKILL_DIR — the single root variable for this skill. Resolve it once before calling any script:

PDF_SKILL_DIR="<skill_directory>"   # ← parent directory of this SKILL.md

# Then all commands use $PDF_SKILL_DIR:
python3 "$PDF_SKILL_DIR/scripts/pdf.py" code.sanitize generate_pdf.py
python3 "$PDF_SKILL_DIR/scripts/pdf.py" meta.brand output.pdf
python3 "$PDF_SKILL_DIR/scripts/pdf.py" font.check output.pdf
python3 "$PDF_SKILL_DIR/scripts/pdf.py" toc.check output.pdf
python3 "$PDF_SKILL_DIR/scripts/pdf.py" pages.clean output.pdf -o output_clean.pdf
python3 "$PDF_SKILL_DIR/scripts/pdf_qa.py" output.pdf
python3 "$PDF_SKILL_DIR/scripts/poster_validate.py" check-html page.html
python3 "$PDF_SKILL_DIR/scripts/poster_validate.py" check-pdf output.pdf

For Python imports (when generation code needs to import skill modules):

import sys, os
PDF_SKILL_DIR = "<skill_directory>"
_scripts = os.path.join(PDF_SKILL_DIR, "scripts")
if _scripts not in sys.path:
    sys.path.insert(0, _scripts)

⚠️ NEVER use bare python3 scripts/pdf.py ... - it only works if cwd happens to be the skill directory. Always use $PDF_SKILL_DIR/scripts/ as the absolute prefix.


8. Quality Checklist (Mandatory after every PDF generation)

The following checks come from the typesetting/ spec files and are mandatory quality gates.

Automated Detection (Must Run)

python3 "$PDF_SKILL_DIR/scripts/pdf_qa.py" <output.pdf>
python3 "$PDF_SKILL_DIR/scripts/pdf_qa.py" --poster <output.pdf>   # poster mode: skip content fill ratio, check all pages for full-bleed
python3 "$PDF_SKILL_DIR/scripts/pdf_qa.py" --skip-cover --formulas <output.pdf>   # academic mode: skip cover for margin check, enable formula overflow
python3 "$PDF_SKILL_DIR/scripts/pdf_qa.py" --no-tables <output.pdf>   # creative mode: skip table centering check

Dependency: Requires pymupdf (pip install pymupdf). If not installed, skip automated detection and use the manual checklist below.

Run pdf_qa.py after generating a PDF. It auto-detects: metadata completeness, page size consistency, blank pages, CJK punctuation placement, color count, font embedding status, content overflow, content fill ratio, cover full-bleed, margin symmetry, table centering, formula overflow.

  • --poster mode: skips content fill ratio check (poster last page naturally has less content), checks ALL pages for full-bleed (not just cover)
  • --skip-cover: skips page 1 when checking margin symmetry (for documents with separately-generated covers)
  • --no-tables: disables table centering check (for creative/poster documents that rarely have traditional tables)
  • --formulas: enables formula overflow detection (checks if formula-like content extends past right content margin)
  • Result PASS → deliver directly
  • Result WARN → evaluate whether fix is needed, non-blocking
  • Result FAIL → must fix and regenerate

Pagination & Layout (pagination.md)

  • Last page fill ratio ≥ 40%: No large blank areas on the final page. If insufficient, backtrack to compress spacing/line-height/font-size
  • Major section 3/4 threshold: H1-level headings must NOT start in the bottom 25% of a page. If remaining space < 25%, force page break and start on fresh page. Use CondPageBreak(available_height * 0.25) in ReportLab, \needspace{0.25\textheight} in LaTeX
  • Tables dont split across pages: Table header and data rows must stay together. Small tables: break-inside: avoid. Large tables: thead { display: table-header-group }
  • Punctuation placement rules: Commas, periods, etc. must not appear at line start. Set line-break: strict in CSS
  • No orphan headings: Headings must not appear alone at page bottom. Use break-after: avoid
  • Cards/images not cut: break-inside: avoid

Overflow Prevention (overflow.md)

  • All table cells use Paragraph() wrapping (ReportLab): Never plain strings - they don't wrap and overflow
  • sum(colWidths) ≤ available_width: Verified in code, not assumed
  • Images/charts proportionally scaled: Never inserted at original dimensions; always fit_image() or max-width: 100%
  • Long tables have repeatRows=1: Table header repeats on every page when table breaks across pages
  • Heading + first paragraph in KeepTogether: Prevents orphan headings at page bottom
  • Chart + caption in KeepTogether: Prevents chart on one page, caption on next
  • CJK text uses wordWrap='CJK': Required for proper line-breaking of Chinese/Japanese/Korean
  • URLs/long strings have word-break: overflow-wrap: break-word (HTML) or manual splitting (ReportLab)
  • Font degradation fallback: Tight columns can shrink font by up to 3pt before clipping

Color (palette.md) - Report & Creative only

Academic (LaTeX) documents are exempt from this color system. LaTeX uses template-defined styling.

  • Entire document ≤ 5 colors: Primary + secondary + accent + neutral + background
  • All colors traceable to primary: Secondary and accent derived via lightness/saturation/micro-hue shift
  • Sibling elements not differentiated by different hues: Use opacity/lightness/borders instead
  • Gradient endpoints hue difference < 20°: No warm-to-cool gradients
  • No high-saturation color blocks: Avoid eye strain

Cover V2 (cover.md)

  • Evaluate whether a cover is needed: Reports, proposals, analysis, white papers, manuals ≥ 3 pages → always add cover (default ON). Skip cover ONLY for: resumes, CVs, letters, memos, forms, checklists, invoices, internal notes, or documents ≤ 2 pages
  • Single PDF output: Cover is merged into the final PDF as page 1. Report/Academic: cover generated via HTML/Playwright → merged as page 0 via pypdf. Creative: cover is part of the same HTML document. NEVER deliver a separate cover file
  • Page isolation: Cover NEVER shares a page with TOC or body content. Report/Academic: inherent via pypdf merge (separate PDFs). Creative: CSS page-break ensures isolation
  • Absolute Anchor Grid: All elements use percentage Y-anchors (Part 0, A0.1). NO flow-based layout
  • Z-Index Layers: Render in strict order: Layer 0 (bg fill) → Layer 1 (decorative, CLIPPED) → Layer 2 (structure lines) → Layer 3 (text)
  • Typography Weight System: Use weight/spacing/opacity hierarchy per A0.2 (Kicker: 16pt+3pt spacing+60% opacity; Hero: 45-65pt Heavy; Meta: 20-22pt; Summary: 16-18pt Regular line-height 1.6)
  • Mandatory Summary Block 🆕: Every cover MUST include a Summary/Description drawer (2-4 lines). If user provides none, auto-generate placeholder text (S3.5)
  • Safety checks: Hero title overflow (max 3 lines, auto-reduce font S3.1); Zone collision detection (S3.2); Uppercase lock for Latin kickers/footers/watermarks (S3.3); Hard width boundary enforcement (S3.4); Summary auto-generation (S3.5); Background watermark full-display enforcement (S3.6)
  • Background watermark complete 🆕: All background layer watermark text (year, document type, sidebar text) must be 100% visible within page bounds - auto-shrink font if needed, NEVER clip/truncate
  • Data binding correct 🆕: Hero Title = company/entity name (biggest, heaviest text); Kicker = report type/subtitle (small decorative text). NEVER reverse this mapping
  • Fill Engine applied 🆕: Font floor enforced (body ≥ 14pt single-col / 12pt dual-col, H1 ≥ 32pt, H2 ≥ 24pt, H3 ≥ 18pt); Fill Ratio calculated; inflation triggered when < 65%; Y-axis golden-ratio anchor when < 40%
  • Selected one of 7+4 templates: General templates 0107 + Academic templates 0810 + Institutional template 11. Autonomously select the best-fit template by analyzing document intent (Calm/Tension/Energy/Authority/Warmth) and document type per Part 2 Intent × Type matrix. Thesis proposals/dissertations/institutional submissions → default Template 11. No global default - every selection must be a deliberate design decision
  • Typography weight hierarchy: Hero 45-65pt Heavy, Meta 20-22pt Regular, Kicker/Footer 16pt with 3pt letter-spacing + 60% opacity, Summary 16-18pt Regular
  • Base spacing unit: U = W * 0.05 - all spacing should be multiples of U
  • Bounding box via absolute anchors: Each block anchored to fixed Y%, grows only within its own zone, never pushes adjacent blocks
  • Safe zone margin: 8-12% on all sides per template spec (corner marks for Template 04 at 8%)
  • Cover whitespace ≥ 60%: Restraint > clutter (but Summary block fills mid-page void intentionally)
  • Cover colors consistent with body: No independent color scheme; white/light backgrounds only
  • Clip-path on Layer 1: All background decorative elements must be clipped to page bounds
  • Clip scope = Layer 1 ONLY <20>F: saveState()/clipPath() must restoreState() BEFORE rendering Layer 2 lines and Layer 3 text. Text rendered inside clip scope = text gets cut off
  • No page border/frame <20>F: Cover page must have showBoundary=0, no canvas.rect(0,0,W,H), no CSS border/outline on cover container
  • Line-to-text minimum gap <20>F: Decorative lines (Layer 2) must be at least U (= W * 0.05) away from any text content
  • No dark/gradient backgrounds: No dark fills, no gradients, no high-saturation schemes
  • Hard width enforcement: Text wraps vertically at zone boundary, NEVER bleeds horizontally past assigned width
  • 🚫 NEVER use ReportLab for covers — ALL covers (Report, Creative, Academic) are generated via HTML/Playwright. See cover.md for the 10-template system. If you catch yourself writing canvas.setFillColor() + canvas.rect() for a cover background, STOP — switch to HTML/Playwright.
  • Line-length alignment (S3.7): Vertical lines match text block height (± 1U); horizontal lines ≥ widest text element width (never shorter than text)
  • Vertical balance (S3.8): No >40% dead whitespace at bottom; sparse content uses centered distribution; CJK titles 15-20% larger than Latin equivalent
  • Percentage positioning safety (S3.9): Every element with top: XX% must have a containing block with deterministic height (height: 100%, inset: 0, or top+bottom pair). Wrappers without explicit height + percentage-positioned children = overlap bug. Prefer px values over percentages
  • Cover colors from palette system: All :root CSS variables populated by palette.cascade output. Template HTML uses --c-bg, --c-accent, --c-text, --c-muted — no hardcoded hex values in generated HTML

Geometric Anchors (geometry.md)

  • Anchors use only the primary color: Layer via opacity, don't mix colors
  • Strokes over fills: Solid elements ≤ 30%
  • Ultra-thin lines: stroke-width 0.3-0.8px
  • Asymmetric placement: Offset creates tension
  • Elements ≤ 8: Restraint, don't clutter

Charts (charts.md)

  • No text stacking/overlap: All chart labels, values, and legends must be collision-free
  • Chart-to-text separation: Minimum 24pt gap above and below charts; 8pt between chart and caption; 30pt between consecutive charts
  • Legend-to-chart non-overlap: Legend MUST NOT overlap chart data area. Use bbox_to_anchor or external placement
  • Value label anti-collision: Adjacent value labels that overlap must be staggered, rotated, or selectively hidden
  • Pie charts → Donut by default: hole_ratio 60-70%, center shows total/core metric
  • Small pie slices handled: Slices < 5% use leader lines, < 3% merge to "Others", or strip labels to rich legend
  • Bar chart auto-rotation: If X-axis labels avg > 5 CJK chars (or 10 Latin), auto-convert to horizontal bars
  • Line chart labeling: Only label start, end, max, min points - NOT every data point
  • Axis cleanup: Top/right spines deleted, grid lines dashed at 0.5pt/20% opacity (or hidden if values labeled)
  • Bar micro-rounding: Top border-radius 2-4px, bar-to-gap ratio 1.5:1 or 2:1
  • Legend de-boxed: No border on legend, horizontal layout, small circle markers
  • Chart title hierarchy: Bold main title left-aligned above chart, lighter subtitle below it

Global Layout

  • Margin symmetry: left_margin == right_margin - asymmetric margins cause off-center content (ReportLab, LaTeX, HTML all checked)
  • Full-bleed enforcement (Playwright): HTML includes @page { size: <w> <h>; margin: 0; } and html,body { margin:0; padding:0; }. No white side margins in the output PDF
  • Background color consistency (Playwright): html, body { background } set explicitly. Single-color docs: match content canvas. Multi-page mixed docs: use the darkest page's background color. Mismatch or missing = sub-pixel white edges on dark pages
  • Content centering (Playwright): Content is centered in PDF, not drifting left or right. Check: symmetric inset/padding, full-width grid columns, no unbalanced max-width
  • Anti-void edges: No large meaningless blank areas at top, bottom, or sides. Content fills ≥ 60% of page (multi-page) or ≥ 70% (single-page poster/infographic)
  • Fill Engine applied: Pages with < 80% fill ratio trigger the fill engine (see fill-engine.md)
  • Table centering: ALL tables must be horizontally centered on the page. ReportLab: use hAlign='CENTER' on Table flowable. LaTeX: use \centering inside table environment. HTML: use margin: 0 auto on table element. NEVER let tables float left with right-side whitespace
  • Table column width: Table total width should be 85-100% of content area width. Avoid narrow tables (< 60% width) that look lost on the page. If table is narrow, expand column widths proportionally or use colWidths to fill available space

Exam / Quiz / Test Paper Rules

  • Question numbering: Use hierarchical numbering (一、二、三 for sections; 1. 2. 3. for questions; (1)(2)(3) or A B C D for sub-questions/options)
  • Option indentation: Multiple-choice options MUST be indented relative to the question stem. Minimum leftIndent = 24pt (2em). Options must NEVER start at the same X position as the question number
  • Option layout: ≤4 short options (≤4 chars each) → 2×2 grid or single row. >4 options or long text → vertical list, one per line. Each option on its own line gets consistent indentation
  • Answer space reservation: MUST reserve blank space for handwritten answers. Calculation: short answer = 2-3 blank lines (40-60pt); paragraph/essay = 8-15 blank lines (160-300pt); math work = 6-10 blank lines (120-200pt); fill-in-the-blank = inline underline (min 80pt width). Use Spacer(1, height) in ReportLab
  • Answer line style: Use light gray dashed or dotted horizontal lines for answer areas, NOT solid black lines. Line weight ≤ 0.5pt, color = #cccccc or lighter
  • Score marking area: Each question should have a score indicator in the margin or after the question number, e.g., “(10分)” or “[10 pts]”
  • Page density: Exam papers should NOT be cramped. Minimum spaceBefore=12pt between questions. Section headers get spaceBefore=24pt

Design Restraint (Anti-Gaudy)

  • Decorative elements ≤ 3 per page: Maximum 3 decorative/non-functional visual elements per page (lines, shapes, icons, patterns). Cover page exempt
  • No gratuitous icons/emoji in headers: Section headers should use typography hierarchy (size, weight, color) for emphasis — NOT emoji, icons, or decorative bullets unless the user explicitly requested them
  • No rainbow/multi-color schemes: Stick to the single-family palette system. If you find yourself using 4+ distinct hue families in one document, STOP and simplify
  • No decorative borders on body pages: Body content pages must NOT have decorative borders, corner ornaments, or page frames. Clean margins only. (Cover page Template 11 border is the sole exception)
  • No texture/pattern backgrounds on body pages: Body pages use solid white or ultra-light tinted backgrounds only. No dot grids, crosshatch, diagonal lines, or any pattern fills
  • Whitespace is design: Empty space between elements is intentional and valuable. Do NOT fill every gap with decorative elements, horizontal rules, or filler content
  • Typography over decoration: Create visual hierarchy through font size, weight, spacing, and color — not through adding more visual elements. If a design looks busy, REMOVE elements rather than rearranging them
  • 2-typeface maximum: Entire document uses at most 2 font families (one serif, one sans-serif). No mixing 3+ fonts for “variety”
  • 🚫 NO stock images / clipart / AI-generated decorations: NEVER embed watercolor flowers, floral borders, gold frames, stock photos, clipart illustrations, or AI-generated artwork for decoration. Use geometric shapes (CSS/SVG from geometry.md) + typography for all visual design. Only user-provided content images (photos, logos, diagrams) are allowed. See visual_framework.md Stock Image Ban

LaTeX-Specific (academic.md)

  • Curly quotes: No straight " quotes - use text'' for double and `text' `` for single
  • Title page isolation: \end{titlepage} followed by \newpage/\clearpage - TOC/body NEVER on same page as title
  • Resume column overlap: AltaCV paracol entries checked for vertical overflow; max 3-4 bullets per \cvevent; explicit \newpage for 2-page resumes
  • \geometry symmetry: left=X, right=X must be equal values

Output Cleanliness (All Pipelines)

  • No process artifacts in output: NEVER include version numbers ("V3"), iteration markers, draft labels ("DRAFT"), "CONFIDENTIAL"/"机密" stamps, "Generated by AI"/"本文档由AI生成", or internal comments in the final PDF unless the user explicitly requested them
  • No auto-generated boilerplate labels: Do not add ANY watermarks, generation notices, version numbers, timestamps, or tool names that the user didn't ask for
  • No debug output in content: Console logs, file paths, generation timestamps, tool names, or error messages must never appear in the PDF body
  • Clean metadata only: PDF metadata (author, title, subject) should reflect the document content, not the generation process