# docx-js Advanced Features Advanced API for complex document scenarios. Load this when creating documents with TOC, cover pages, footnotes, multi-section layouts, or post-processing needs. ## Table of Contents (TOC) **→ See `references/toc.md` for the complete TOC reference** (3-step process, code examples, page numbering, common bugs, checklist). ## Cover Page Design (Vertical Centering) Use large `spacing.before` to push content down for visual centering: ```js // Approximate vertical center on A4: // Total printable height ≈ 14000 twips // For title at ~40% from top: before = 5600 const coverSection = { properties: { page: { /* standard A4 */ }, // No headers/footers on cover page }, children: [ new Paragraph({ spacing: { before: 5600 } }), // spacer new Paragraph({ alignment: AlignmentType.CENTER, children: [new TextRun({ text: title, font: { ascii: "Calibri", eastAsia: "SimHei" }, size: 52, bold: true, color: palette.primary, })], }), // ... subtitle, author, date ], }; ``` For multi-section documents, put the cover in its own section so it can have different headers/footers. ## Footnotes ```js const { FootnoteReferenceRun, Footnote } = require("docx"); const doc = new Document({ footnotes: { 1: { children: [new Paragraph({ children: [new TextRun({ text: "Smith, J. (2024). Research Methods. Academic Press, pp. 45-67.", size: 18 })] })] }, 2: { children: [new Paragraph({ children: [new TextRun({ text: "Zhang, W. (2023). \u201c数据分析方法研究\u201d. 科学通报, 68(12), 1234-1250.", size: 18 })] })] }, }, sections: [{ children: [ new Paragraph({ children: [ new TextRun({ text: "According to recent studies" }), new FootnoteReferenceRun(1), // superscript [1] new TextRun({ text: ", data analysis methods have evolved" }), new FootnoteReferenceRun(2), // superscript [2] new TextRun({ text: "." }), ], }), ], }], }); ``` ### Academic Reference Pattern For sequential references [1][2][3]..., pre-define all footnotes in the `footnotes` object with numeric keys, then reference them inline with `FootnoteReferenceRun(n)`. ## keepNext — Element Binding Prevent page breaks between related elements: ```js // Heading stays with next paragraph new Paragraph({ heading: HeadingLevel.HEADING_2, keepNext: true, // don't break after this children: [new TextRun({ text: "Table 1: Results" })], }) // Table immediately follows on same page // Caption stays with image new Paragraph({ keepNext: true, alignment: AlignmentType.CENTER, children: [new TextRun({ text: "Figure 1: Architecture Diagram", italics: true, size: 20 })], }) // ImageRun paragraph follows ``` Use `keepNext: true` for: - Heading → first paragraph of section - Table caption → table - Image → image caption - "Figure X" label → image ## Page Break Rules Follow the document type strategy defined in SOUL.md Rule 1. **Structural breaks (always):** - Cover page → TOC - TOC → main content - Main content → back cover **Content breaks (by document type):** - Academic / teaching → `new Paragraph({ children: [new PageBreak()] })` before each H1 chapter - Business report → PageBreak before each H1; H2 flows naturally - Resume / contract / letter → No content page breaks - Short article → No content page breaks **Anti-tear (mandatory):** ```js // Heading stays with next paragraph new Paragraph({ heading: HeadingLevel.HEADING_1, keepNext: true, children: [new TextRun("Chapter Title")], }) // Table caption stays with table new Paragraph({ keepNext: true, children: [new TextRun({ text: "Table 1: Summary", italics: true })], }) // Image caption stays with image new Paragraph({ keepNext: true, children: [new TextRun({ text: "Figure 1: Architecture", italics: true })], }) ``` **Never:** - PageBreak inside tables - PageBreak as standalone element (must be inside Paragraph) - PageBreak at the END of the last section (causes blank page) ```js // Correct: page break between cover and TOC new Paragraph({ children: [new PageBreak()] }) ``` ## Quotes Escaping in JS Strings **⚠️⚠️⚠️ CRITICAL — #1 MOST COMMON BUG ⚠️⚠️⚠️** Bare Chinese curly quotation marks (`""` `''`) in JS string literals **WILL break syntax and crash document generation**. This bug occurs most often in **Chinese body text** where curly quotes are used for emphasis, proper nouns, event names, or quoted speech — e.g., `"双11"`, `"前低后高"`, `"618"大促`. **Every single occurrence** of `""''` in text content MUST be Unicode-escaped. No exceptions. **MANDATORY RULE: Before writing ANY `TextRun`, `para()`, or string containing Chinese text, scan the text for `""''` characters and replace ALL of them with `\u201c \u201d \u2018 \u2019`.** | Character | Unicode | Escape method | |-----------|---------|---------------| | `"` `"` | `\u201c` `\u201d` | Unicode escape `\u201c` `\u201d` | | `'` `'` | `\u2018` `\u2019` | Unicode escape `\u2018` `\u2019` | | `"` | U+0022 | `\"` or wrap string in single quotes / template literal | | `'` | U+0027 | `\'` or wrap string in double quotes / template literal | ```js // ❌ WRONG — curly quotes in Chinese text break JS syntax (VERY COMMON MISTAKE) content.push(para("2025年四个季度行业增速呈现"前低后高"的态势。在"618"大促、"双11""双12"活动拉动下增长显著。")); new TextRun({ text: "他说"你好"" }) new TextRun({ text: 'It's a test' }) // ✅ CORRECT — ALL curly quotes replaced with Unicode escapes content.push(para("2025年四个季度行业增速呈现\u201c前低后高\u201d的态势。在\u201c618\u201d大促、\u201c双11\u201d\u201c双12\u201d活动拉动下增长显著。")); new TextRun({ text: "他说\u201c你好\u201d" }) new TextRun({ text: "It\u2019s a test" }) // ✅ CORRECT — straight quotes escaped or use alternate delimiters new TextRun({ text: "He said \"hello\"" }) new TextRun({ text: 'He said "hello"' }) new TextRun({ text: `He said "hello"` }) ``` ## Multi-Section Documents Different headers/footers per section: ```js const doc = new Document({ sections: [ { // Section 1: Cover — no header/footer properties: { page: { /* ... */ } }, children: coverChildren, }, { // Section 2: Front matter — Roman page numbers properties: { type: SectionType.NEXT_PAGE, page: { /* size, margin... */ pageNumbers: { start: 1, formatType: NumberFormat.UPPER_ROMAN }, }, }, headers: { default: new Header({ children: [] }) }, footers: { default: new Footer({ children: [new Paragraph({ alignment: AlignmentType.CENTER, children: [new TextRun({ children: [PageNumber.CURRENT], size: 18 })], })], }), }, children: tocAndAbstract, }, { // Section 3: Main content — Arabic page numbers properties: { type: SectionType.NEXT_PAGE, page: { /* size, margin... */ pageNumbers: { start: 1, formatType: NumberFormat.DECIMAL }, }, }, headers: { default: new Header({ children: [new Paragraph({ alignment: AlignmentType.CENTER, children: [new TextRun({ text: docTitle, size: 18, color: "888888" })], })], }), }, footers: { default: footerWithPageNumbers }, children: mainContent, }, ], }); ``` ## Converting DOCX to PDF ```bash # Using LibreOffice (headless) libreoffice --headless --convert-to pdf output.docx # ⚠️ TOC Rule: If document has TOC, warn user that: # 1. LibreOffice conversion may show empty TOC # 2. User should open in Word first, update fields (Ctrl+A → F9), save, then convert # 3. Or use Word's "Save as PDF" for best results ``` ## Converting DOCX to Images ```bash # Step 1: Convert to PDF libreoffice --headless --convert-to pdf output.docx # Step 2: Convert PDF to images pdftoppm -png -r 200 output.pdf output_page # This generates output_page-1.png, output_page-2.png, etc. # Use -r 200 for good quality (200 DPI) ``` Useful for generating preview thumbnails or when user needs images instead of document files.