---
name: agent-pipeline-builder
description: Build multi-agent pipelines with structured data flow between agents. Use when creating workflows where each agent has a specialized role and passes output to the next agent.
allowed-tools: Write, Edit, Read, Bash, WebSearch
license: MIT
---

# Agent Pipeline Builder

Build reliable multi-agent workflows where each agent has a single, focused responsibility and outputs structured data that the next agent consumes.

## When to Use This Skill

Use this skill when:

- Building complex workflows that need multiple specialized agents
- Creating content pipelines (research → analysis → writing)
- Designing data processing flows with validation at each stage
- Implementing "funnel" patterns where broad input becomes focused output

## Pipeline Pattern

A pipeline consists of:

1. **Stage 1: Researcher/Gatherer** - Fetches raw data (WebSearch, file reading, API calls)
2. **Stage 2: Analyzer/Filter** - Processes and selects the best options
3. **Stage 3: Creator/Writer** - Produces the final output

Each stage:

- Has ONE job
- Outputs structured JSON (or YAML)
- Wraps output in markers (e.g., `<<>>...<<>>`)
- Passes data to the next stage via stdin or file

## RalphLoop "Tackle Until Solved" Integration

For complex pipelines (3+ stages or complexity >= 5), agent-pipeline-builder automatically delegates to Ralph Orchestrator for autonomous pipeline construction and testing.

### When Ralph is Triggered

Ralph mode activates for pipelines with:

- 3 or more stages
- Complex stage patterns (external APIs, complex processing, conditional logic)
- Parallel stage execution
- User opt-in via `RALPH_AUTO=true` or `PIPELINE_USE_RALPH=true`

### Using Ralph Integration

When a complex pipeline is detected:

1. Check for the Python integration module:

   ```bash
   python3 /home/uroma/.claude/skills/agent-pipeline-builder/ralph-pipeline.py --test-complexity
   ```

2. If complex, delegate to Ralph:

   ```bash
   /home/uroma/obsidian-web-interface/bin/ralphloop -i .ralph/PIPELINE.md
   ```

3. Monitor Ralph's progress in `.ralph/state.json`
4. On completion, use the generated pipeline from `.ralph/iterations/pipeline.md`

### Manual Ralph Invocation

For explicit Ralph mode on any pipeline:

```bash
export PIPELINE_USE_RALPH=true
# or
export RALPH_AUTO=true
```

Then invoke `/agent-pipeline-builder` as normal.

### Ralph-Generated Pipeline Structure

When Ralph builds the pipeline autonomously, it creates:

```
.claude/agents/[pipeline-name]/
├── researcher.md            # Agent definition
├── analyzer.md              # Agent definition
└── writer.md                # Agent definition

scripts/
└── run-[pipeline-name].ts   # Orchestration script

.ralph/
├── PIPELINE.md              # Manifest
├── state.json               # Progress tracking
└── iterations/
    └── pipeline.md          # Final generated pipeline
```

## Creating a Pipeline

### Step 1: Define Pipeline Manifest

Create a `pipeline.md` file:

```markdown
# Pipeline: [Name]

## Stages
1. researcher - Finds/fetches raw data
2. analyzer - Processes and selects
3. writer - Creates final output

## Data Format
All stages use JSON with markers: `<<>>...<<>>`
```

### Step 2: Create Agent Definitions

For each stage, create an agent file at `.claude/agents/[pipeline-name]/[stage-name].md`:

````markdown
---
name: researcher
description: What this agent does
model: haiku  # or sonnet, opus
---

You are a [role] agent.

## CRITICAL: NO EXPLANATION - JUST ACTION

DO NOT explain what you will do. Just USE tools immediately, then output.

## Instructions
1. Use [specific tool] to get data
2. Output JSON in the exact format below
3. Wrap in markers as specified

## Output Format
<<>>
```json
{
  "data": [...]
}
```
<<>>
````
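Each stage's output then has to be unwrapped before the next stage can consume it. Below is a minimal sketch of a marker-extraction helper; the file path, the `extractPayload` name, and the assumption that each response contains exactly one marker-wrapped block are illustrative, not part of the skill.

```typescript
// scripts/extract-payload.ts
// Sketch: pull the JSON payload out of an agent's marker-wrapped output.
// Assumes one <<>> ... <<>> block per response, optionally fenced as ```json.

export function extractPayload<T = unknown>(raw: string): T {
  // Grab everything between the first pair of <<>> markers.
  const match = raw.match(/<<>>([\s\S]*?)<<>>/);
  if (!match) {
    throw new Error('No <<>> ... <<>> block found in agent output');
  }

  // Strip an optional ```json fence so only the JSON body remains.
  const body = match[1]
    .replace(/```json/g, '')
    .replace(/```/g, '')
    .trim();

  return JSON.parse(body) as T;
}

// Example: const research = extractPayload<{ data: unknown[] }>(researcherOutput);
```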
### Step 3: Implement Pipeline Script

Create a script that orchestrates the agents:

```typescript
// scripts/run-pipeline.ts
import { runAgent } from '@anthropic-ai/claude-agent-sdk';

async function runPipeline() {
  // Stage 1: Researcher
  const research = await runAgent('researcher', {
    context: { topic: 'AI news' }
  });

  // Stage 2: Analyzer (uses research output)
  const analysis = await runAgent('analyzer', {
    input: research,
    context: { criteria: 'impact' }
  });

  // Stage 3: Writer (uses analysis output)
  const final = await runAgent('writer', {
    input: analysis,
    context: { format: 'tweet' }
  });

  return final;
}
```

## Pipeline Best Practices

### 1. Single Responsibility

Each agent does ONE thing:

- ✓ researcher: Fetches data
- ✓ analyzer: Filters and ranks
- ✗ researcher-analyzer: Does both (too complex)

### 2. Structured Data Flow

- Use JSON or YAML for all inter-agent communication
- Define schemas upfront
- Validate output before passing it to the next stage

### 3. Error Handling

- Each agent should fail gracefully
- Use fallback outputs
- Log errors for debugging

### 4. Deterministic Patterns

- Constrain agents with specific tools
- Use detailed system prompts
- Avoid open-ended requests

## Example Pipeline: AI News Tweet

### Manifest

```yaml
name: ai-news-tweet
stages:
  - researcher: Gets today's AI news
  - analyzer: Picks most impactful story
  - writer: Crafts engaging tweet
```

### Researcher Agent

````markdown
---
name: researcher
description: Finds recent AI news using WebSearch
model: haiku
---

Use WebSearch to find AI news from TODAY ONLY.

Output:
<<>>
```json
{
  "items": [
    {
      "title": "...",
      "summary": "...",
      "url": "...",
      "published_at": "YYYY-MM-DD"
    }
  ]
}
```
<<>>
````

### Analyzer Agent

````markdown
---
name: analyzer
description: Analyzes news and selects best story
model: sonnet
---

Input: Researcher output (stdin)

Select the most impactful story based on:
- Technical significance
- Broad interest
- Credibility of source

Output:
<<>>
```json
{
  "selected": {
    "title": "...",
    "summary": "...",
    "reasoning": "..."
  }
}
```
<<>>
````

### Writer Agent

````markdown
---
name: writer
description: Writes engaging tweet
model: sonnet
---

Input: Analyzer output (stdin)

Write a tweet that:
- Hooks attention
- Conveys the key insight
- Fits 280 characters
- Includes relevant hashtags

Output:
<<>>
```json
{
  "tweet": "...",
  "hashtags": ["..."]
}
```
<<>>
````

## Running the Pipeline

### Method 1: Sequential Script

```bash
./scripts/run-pipeline.ts
```

### Method 2: Using Task Tool

```typescript
// Launch each stage as a separate agent task
await Task('Research stage', researchPrompt, 'haiku');
await Task('Analysis stage', analysisPrompt, 'sonnet');
await Task('Writing stage', writingPrompt, 'sonnet');
```

### Method 3: Using Claude Code Skills

Create a skill that orchestrates the pipeline with proper error handling.

## Testing Pipelines

### Unit Tests

Test each agent independently:

```bash
# Test researcher
npm run test:researcher

# Test analyzer with mock data
npm run test:analyzer

# Test writer with mock analysis
npm run test:writer
```

### Integration Tests

Test the full pipeline:

```bash
npm run test:pipeline
```

## Debugging Tips

1. **Enable verbose logging** - See what each agent outputs
2. **Validate JSON schemas** - Catch malformed data early
3. **Use mock inputs** - Test downstream agents independently
4. **Check marker format** - Agents must use exact markers
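Tips 2 and 3 pair naturally: a lightweight guard on each stage's payload catches malformed data before it reaches the next agent and doubles as a fixture check in unit tests. The sketch below hand-rolls the check instead of assuming any particular schema library; the `ResearchPayload` shape mirrors the researcher agent's example output above, and the function and file names are assumptions.

```typescript
// scripts/validate-research.ts
// Sketch: guard the researcher payload before feeding it to the analyzer.
// The shape mirrors the researcher agent's documented output; adapt per stage.

interface ResearchItem {
  title: string;
  summary: string;
  url: string;
  published_at: string; // YYYY-MM-DD
}

interface ResearchPayload {
  items: ResearchItem[];
}

export function assertResearchPayload(data: unknown): ResearchPayload {
  const payload = data as ResearchPayload;
  if (!payload || !Array.isArray(payload.items) || payload.items.length === 0) {
    throw new Error('researcher output must contain a non-empty "items" array');
  }
  for (const item of payload.items) {
    if (typeof item !== 'object' || item === null) {
      throw new Error('researcher items must be objects');
    }
    for (const field of ['title', 'summary', 'url', 'published_at'] as const) {
      if (typeof item[field] !== 'string') {
        throw new Error(`researcher item is missing string field "${field}"`);
      }
    }
  }
  return payload;
}

// Example: const research = assertResearchPayload(extractPayload(rawResearcherOutput));
```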
## Common Patterns

### Funnel Pattern

```
Many inputs → Filter → Select One → Output
```

Example: News aggregator → analyzer → best story

### Transformation Pattern

```
Input → Transform → Validate → Output
```

Example: Raw data → clean → validate → structured data

### Assembly Pattern

```
Part A + Part B → Assemble → Complete
```

Example: Research + style guide → formatted article
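These patterns are just different shapes of the same stage composition. As a rough, generic sketch of the funnel pattern, the stage functions below are placeholders for whatever actually runs each agent in your pipeline; the `Stage` type and `funnel` helper are illustrative names, not part of the skill.

```typescript
// funnel-pattern.ts
// Sketch: the funnel pattern as a generic composition. Many candidate items
// narrow to one selection, then to a single focused output.

type Stage<In, Out> = (input: In) => Promise<Out>;

export function funnel<Raw, Item, Selected, Final>(
  gather: Stage<Raw, Item[]>,      // broad input: many items
  select: Stage<Item[], Selected>, // filter: pick one
  produce: Stage<Selected, Final>, // focused output
): Stage<Raw, Final> {
  return async (input) => {
    const items = await gather(input);
    if (items.length === 0) {
      throw new Error('funnel received no items to filter');
    }
    const selected = await select(items);
    return produce(selected);
  };
}

// Example (names hypothetical): const newsTweet = funnel(runResearcher, runAnalyzer, runWriter);
```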