Files

admin b723e2bd7d Reorganize: Move all skills to skills/ folder

- Created skills/ directory
- Moved 272 skills to skills/ subfolder
- Kept agents/ at root level
- Kept installation scripts and docs at root level

Repository structure:
- skills/           - All 272 skills from skills.sh
- agents/           - Agent definitions
- *.sh, *.ps1       - Installation scripts
- README.md, etc.   - Documentation

Co-Authored-By: Claude <noreply@anthropic.com>

2026-01-23 18:05:17 +00:00

4.5 KiB

Raw Permalink Blame History

name, description

name	description
baoyu-url-to-markdown	Fetch any URL and convert to markdown using Chrome CDP. Supports two modes - auto-capture on page load, or wait for user signal (for pages requiring login). Use when user wants to save a webpage as markdown.

URL to Markdown

Fetches any URL via Chrome CDP and converts HTML to clean markdown.

Script Directory

Important: All scripts are located in the scripts/ subdirectory of this skill.

Agent Execution Instructions:

Determine this SKILL.md file's directory path as SKILL_DIR
Script path = ${SKILL_DIR}/scripts/<script-name>.ts
Replace all ${SKILL_DIR} in this document with the actual path

Script Reference:

Script	Purpose
`scripts/main.ts`	CLI entry point for URL fetching

Features

Chrome CDP for full JavaScript rendering
Two capture modes: auto or wait-for-user
Clean markdown output with metadata
Handles login-required pages via wait mode

Usage

# Auto mode (default) - capture when page loads
npx -y bun ${SKILL_DIR}/scripts/main.ts <url>

# Wait mode - wait for user signal before capture
npx -y bun ${SKILL_DIR}/scripts/main.ts <url> --wait

# Save to specific file
npx -y bun ${SKILL_DIR}/scripts/main.ts <url> -o output.md

Options

Option	Description
`<url>`	URL to fetch
`-o <path>`	Output file path (default: auto-generated)
`--wait`	Wait for user signal before capturing
`--timeout <ms>`	Page load timeout (default: 30000)

Capture Modes

Auto Mode (default)

Page loads → waits for network idle → captures immediately.

Best for:

Public pages
Static content
No login required

Wait Mode (`--wait`)

Page opens → user can interact (login, scroll, etc.) → user signals ready → captures.

Best for:

Login-required pages
Dynamic content needing interaction
Pages with lazy loading

Agent workflow for wait mode:

Run script with --wait flag
Script outputs: Page opened. Press Enter when ready to capture...
Use AskUserQuestion to ask user if page is ready
When user confirms, send newline to stdin to trigger capture

Output Format

---
url: https://example.com/page
title: "Page Title"
description: "Meta description if available"
author: "Author if available"
published: "2024-01-01"
captured_at: "2024-01-15T10:30:00Z"
---

# Page Title

Converted markdown content...

Mode Selection Guide

When user requests URL capture, help select appropriate mode:

Suggest Auto Mode when:

URL is public (no login wall visible)
Content appears static
User doesn't mention login requirements

Suggest Wait Mode when:

User mentions needing to log in
Site known to require authentication
User wants to scroll/interact before capture
Content is behind paywall

Ask user when unclear:

The page may require login or interaction before capturing.

Which mode should I use?
1. Auto - Capture immediately when loaded
2. Wait - Wait for you to interact first

Output Directory

Each capture creates a file organized by domain:

url-to-markdown/
└── <domain>/
    └── <slug>.md

Path Components:

<domain>: Site domain (e.g., example.com, github.com)
<slug>: Generated from page title or URL path (kebab-case)

Slug Generation:

Extract from page title (preferred) or URL path
Convert to kebab-case, 2-6 words
Example: "Getting Started with React" → getting-started-with-react

Conflict Resolution: If url-to-markdown/<domain>/<slug>.md already exists:

Append timestamp: <slug>-YYYYMMDD-HHMMSS.md
Example: getting-started.md exists → getting-started-20260118-143052.md

Error Handling

Error	Resolution
Chrome not found	Install Chrome or set `URL_CHROME_PATH` env
Page timeout	Increase `--timeout` value
Capture failed	Try wait mode for complex pages
Empty content	Page may need JS rendering time

Environment Variables

Variable	Description
`URL_CHROME_PATH`	Custom Chrome executable path
`URL_DATA_DIR`	Custom data directory
`URL_CHROME_PROFILE_DIR`	Custom Chrome profile directory

Extension Support

Custom configurations via EXTEND.md.

Check paths (priority order):

.baoyu-skills/baoyu-url-to-markdown/EXTEND.md (project)
~/.baoyu-skills/baoyu-url-to-markdown/EXTEND.md (user)

If found, load before workflow. Extension content overrides defaults.

4.5 KiB Raw Permalink Blame History