Add Ralph Python implementation and framework integration updates

## Ralph Skill - Complete Python Implementation

- __main__.py: Main entry point for Ralph autonomous agent
- agent_capability_registry.py: Agent capability registry (FIXED syntax error)
- dynamic_agent_selector.py: Dynamic agent selection logic
- meta_agent_orchestrator.py: Meta-orchestration for multi-agent workflows
- worker_agent.py: Worker agent implementation
- ralph_agent_integration.py: Integration with Claude Code
- superpowers_integration.py: Superpowers framework integration
- observability_dashboard.html: Real-time observability UI
- observability_server.py: Dashboard server
- multi-agent-architecture.md: Architecture documentation
- SUPERPOWERS_INTEGRATION.md: Integration guide

## Framework Integration Status

- ✅ codebase-indexer (Chippery): Complete implementation with 5 scripts
- ✅ ralph (Ralph Orchestrator): Complete Python implementation
- ✅ always-use-superpowers: Declarative skill (SKILL.md)
- ✅ auto-superpowers: Declarative skill (SKILL.md)
- ✅ auto-dispatcher: Declarative skill (Ralph framework)
- ✅ autonomous-planning: Declarative skill (Ralph framework)
- ✅ mcp-client: Declarative skill (AGIAgent/Agno framework)

## Agent Updates

- Updated README.md with latest integration status
- Added framework integration agents

Token Savings: ~99% via semantic codebase indexing

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>
458
skills/ralph/multi-agent-architecture.md
Normal file
@@ -0,0 +1,458 @@
# Ralph Multi-Agent Orchestration System

## Architecture Overview

The Ralph Multi-Agent Orchestration System runs 10 or more Claude instances in parallel, coordinated through a Redis-backed task queue, file-level locking for conflict resolution, and a real-time observability dashboard.

```
┌─────────────────────────────────────────────┐
│           Meta-Agent Orchestrator           │
│           (ralph-integration.py)            │
│  - Analyzes requirements                    │
│  - Breaks into independent tasks            │
│  - Manages dependencies                     │
│  - Coordinates worker agents                │
└──────────────────────┬──────────────────────┘
                       │ Creates tasks
                       ▼
┌─────────────────────────────────────────────┐
│             Task Queue (Redis)              │
│         Stores and distributes work         │
└────┬───────────┬───────────┬───────────┬────┘
     │           │           │           │
     ▼           ▼           ▼           ▼
┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐
│ Agent 1 │ │ Agent 2 │ │ Agent 3 │ │ Agent N │
│Frontend │ │ Backend │ │  Tests  │ │  Docs   │
└─────────┘ └─────────┘ └─────────┘ └─────────┘
     │           │           │           │
     └───────────┴─────┬─────┴───────────┘
                       │
                       ▼
              ┌──────────────────┐
              │  Observability   │
              │    Dashboard     │
              │  (Real-time UI)  │
              └──────────────────┘
```

## Core Components

### 1. Meta-Agent Orchestrator

The meta-agent is Ralph running in orchestration mode: it manages other agents instead of writing code directly.

**Key Responsibilities:**
- Analyze project requirements
- Break down into parallelizable tasks
- Manage task dependencies
- Spawn and coordinate worker agents
- Monitor progress and handle conflicts
- Aggregate results

**Configuration:**
```bash
# Enable multi-agent mode
RALPH_MULTI_AGENT=true
RALPH_MAX_WORKERS=12
RALPH_TASK_QUEUE_HOST=localhost
RALPH_TASK_QUEUE_PORT=6379
RALPH_OBSERVABILITY_PORT=3001
```

### 2. Task Queue System

The task queue uses Redis for task distribution and state management.

**Task Structure:** (`type` takes one of the pipe-separated values; `priority` is an integer from 1 to 10)
```json
{
  "id": "unique-task-id",
  "type": "frontend|backend|testing|docs|refactor|analysis",
  "description": "What needs to be done",
  "dependencies": ["task-id-1", "task-id-2"],
  "files": ["path/to/file1.ts", "path/to/file2.ts"],
  "priority": 5,
  "specialization": "optional-specific-agent-type",
  "timeout": 300,
  "retry_count": 0,
  "max_retries": 3
}
```

**Queue Keys:**
- `claude_tasks` - Main task queue
- `claude_tasks:pending` - Tasks waiting for dependencies
- `claude_tasks:complete` - Completed tasks
- `claude_tasks:failed` - Failed tasks for retry
- `lock:{file_path}` - File-level locks
- `task:{task_id}` - Task status tracking
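
As a rough illustration of how the `dependencies` field could gate scheduling, the sketch below picks out the tasks whose dependencies have all completed. The `Task` dataclass and `ready_tasks` helper are hypothetical names introduced here, not part of Ralph's code, and the sketch works on in-memory data rather than on Redis.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set


@dataclass
class Task:
    """Subset of the task structure (hypothetical helper type)."""
    id: str
    type: str
    dependencies: List[str] = field(default_factory=list)
    priority: int = 5


def ready_tasks(tasks: Iterable[Task], completed: Set[str]) -> List[Task]:
    """Return unfinished tasks whose dependencies are all completed.

    Assumption: a higher priority number means "run sooner"; the task
    structure does not say which direction the 1-10 scale runs.
    """
    runnable = [
        t for t in tasks
        if t.id not in completed and all(d in completed for d in t.dependencies)
    ]
    return sorted(runnable, key=lambda t: t.priority, reverse=True)


tasks = [
    Task("analyze-1", "analysis"),
    Task("refactor-buttons", "frontend", ["analyze-1"], priority=7),
    Task("update-tests-buttons", "testing", ["refactor-buttons"]),
]
first_wave = [t.id for t in ready_tasks(tasks, completed=set())]
second_wave = [t.id for t in ready_tasks(tasks, completed={"analyze-1"})]
```

In this model, tasks held back by unmet dependencies would correspond to the entries kept under `claude_tasks:pending`.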

### 3. Specialized Worker Agents

Each worker agent has a specific role and configuration.

**Agent Types:**

| Agent Type | Specialization | Example Tasks |
|------------|----------------|---------------|
| **Frontend** | UI/UX, React, Vue, Svelte | Component refactoring, styling |
| **Backend** | APIs, databases, services | Endpoint creation, data models |
| **Testing** | Unit tests, integration tests | Test writing, coverage improvement |
| **Documentation** | Docs, comments, README | API docs, inline documentation |
| **Refactor** | Code quality, optimization | Performance tuning, code cleanup |
| **Analysis** | Code review, architecture | Dependency analysis, security audit |

**Worker Configuration:**
```json
{
  "agent_id": "agent-frontend-1",
  "specialization": "frontend",
  "max_concurrent_tasks": 1,
  "file_lock_timeout": 300,
  "heartbeat_interval": 10,
  "log_level": "info"
}
```

### 4. File Locking & Conflict Resolution

File-level locks prevent multiple agents from modifying the same file simultaneously.

**Lock Acquisition Flow:**
1. The agent requests locks for all files its task requires
2. For each file, a Redis `SET` with the `NX` flag creates the lock key only if it does not already exist
3. If all locks are acquired, the agent proceeds
4. If any lock fails, the agent waits and retries
5. Locks auto-expire after a timeout (safety mechanism)
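
The flow above can be sketched as an all-or-nothing acquire. This is a minimal sketch, not Ralph's implementation: `acquire_locks`, `release_locks`, and the in-memory `FakeRedis` stand-in are hypothetical names, and releasing partially acquired locks before retrying is an assumption the steps do not spell out. The `set(name, value, nx=True, ex=seconds)` call shape mirrors redis-py's `Redis.set`, so a real client (created with `decode_responses=True` so that `get` returns `str`) could be passed in place of the stand-in.

```python
import uuid
from typing import Dict, List, Optional


class FakeRedis:
    """In-memory stand-in exposing the three redis-py style calls used below.

    It ignores expiry; a real Redis server drops the key after `ex` seconds.
    """

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def set(self, name: str, value: str, nx: bool = False, ex: Optional[int] = None):
        if nx and name in self._data:
            return None  # redis-py returns None when NX prevents the write
        self._data[name] = value
        return True

    def get(self, name: str) -> Optional[str]:
        return self._data.get(name)

    def delete(self, *names: str) -> int:
        return sum(1 for n in names if self._data.pop(n, None) is not None)


def release_locks(client, files: List[str], token: str) -> None:
    """Delete only the locks that still carry this agent's token.

    Note: get-then-delete is not atomic; this is a simplification.
    """
    for path in files:
        if client.get(f"lock:{path}") == token:
            client.delete(f"lock:{path}")


def acquire_locks(client, files: List[str], timeout: int = 300) -> Optional[str]:
    """Try to lock every file; return a token on success, None on failure."""
    token = uuid.uuid4().hex
    held: List[str] = []
    for path in sorted(files):  # a fixed order reduces lock-ordering deadlocks
        if client.set(f"lock:{path}", token, nx=True, ex=timeout):
            held.append(path)
        else:
            release_locks(client, held, token)  # assumed: back off completely
            return None
    return token


client = FakeRedis()
token_a = acquire_locks(client, ["components/Button.tsx", "components/Form.tsx"])
# The second agent gets Alert.tsx, fails on Form.tsx, and gives Alert.tsx back.
token_b = acquire_locks(client, ["components/Alert.tsx", "components/Form.tsx"])
```

A caller that receives `None` would wait and call `acquire_locks` again, which corresponds to step 4.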

**Conflict Detection:**
```python
from typing import Dict, List, TypedDict


class Conflict(TypedDict):
    """A file claimed by more than one agent."""
    file: str
    agents: List[str]


def detect_conflicts(agent_files: Dict[str, List[str]]) -> List[Conflict]:
    """Detect file access conflicts between agents"""
    # Map each file path to the agents that intend to touch it
    file_agents: Dict[str, List[str]] = {}
    for agent_id, files in agent_files.items():
        for file_path in files:
            if file_path in file_agents:
                file_agents[file_path].append(agent_id)
            else:
                file_agents[file_path] = [agent_id]

    # A conflict is any file claimed by more than one agent
    conflicts: List[Conflict] = [
        {"file": f, "agents": agents}
        for f, agents in file_agents.items()
        if len(agents) > 1
    ]
    return conflicts
```

**Resolution Strategies:**
1. **Dependency-based ordering** - Add dependencies between conflicting tasks
2. **File splitting** - Break tasks into smaller units
3. **Agent specialization** - Assign conflicting tasks to the same agent
4. **Merge coordination** - Use git merge strategies
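
As one way to picture the first strategy, the sketch below adds a dependency edge whenever two tasks touch the same file, so they run one after another instead of competing for a lock. `serialize_conflicts` is a hypothetical helper, and letting the earlier task in the list go first is an assumption made for illustration.

```python
from typing import Dict, List


def serialize_conflicts(tasks: List[dict]) -> List[dict]:
    """Chain tasks that share a file so they run one after another.

    Tasks are dicts with "id", "files", and "dependencies" keys, as in the
    task structure. The list is modified in place and returned.
    """
    last_owner: Dict[str, str] = {}  # file path -> most recent task touching it
    for task in tasks:
        for path in task["files"]:
            owner = last_owner.get(path)
            if owner is not None and owner not in task["dependencies"]:
                task["dependencies"].append(owner)
            last_owner[path] = task["id"]
    return tasks


tasks = [
    {"id": "restyle-button", "files": ["components/Button.tsx"], "dependencies": []},
    {
        "id": "refactor-buttons",
        "files": ["components/Button.tsx", "components/Icon.tsx"],
        "dependencies": [],
    },
]
ordered = serialize_conflicts(tasks)
```

Because new edges only ever point from a later task to an earlier one, this does not introduce cycles as long as the existing dependencies follow the same list order.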

### 5. Real-Time Observability Dashboard

A WebSocket-based dashboard monitors all agents in real time.

**Dashboard Features:**
- Live agent status (active, busy, idle, error)
- Task progress tracking
- File modification visualization
- Conflict alerts and resolution
- Activity stream with timestamps
- Performance metrics

**WebSocket Events:**
```javascript
// Agent update
{
  "type": "agent_update",
  "agent": {
    "id": "agent-frontend-1",
    "status": "active",
    "currentTask": "refactor-buttons",
    "progress": 65,
    "workingFiles": ["components/Button.tsx"],
    "completedCount": 12
  }
}

// Conflict detected
{
  "type": "conflict",
  "conflict": {
    "file": "components/Button.tsx",
    "agents": ["agent-frontend-1", "agent-frontend-2"],
    "timestamp": "2025-08-02T15:30:00Z"
  }
}

// Task completed
{
  "type": "task_complete",
  "taskId": "refactor-buttons",
  "agentId": "agent-frontend-1",
  "duration": 45.2,
  "filesModified": ["components/Button.tsx", "components/Button.test.tsx"]
}
```
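
To show how a consumer might use these three event types, here is a small sketch that folds incoming messages into an in-memory state. The `apply_event` function and the shape of `state` are hypothetical; only the event fields come from the examples above.

```python
import json
from typing import Any, Dict


def apply_event(state: Dict[str, Any], raw: str) -> Dict[str, Any]:
    """Fold one JSON event message into an in-memory dashboard state."""
    event = json.loads(raw)
    kind = event.get("type")
    if kind == "agent_update":
        agent = event["agent"]
        state["agents"][agent["id"]] = agent  # latest snapshot wins
    elif kind == "conflict":
        state["conflicts"].append(event["conflict"])
    elif kind == "task_complete":
        state["completed"].append(event["taskId"])
    else:
        state["unknown"] += 1  # count anything not described above
    return state


state: Dict[str, Any] = {"agents": {}, "conflicts": [], "completed": [], "unknown": 0}
apply_event(state, '{"type": "agent_update", "agent": {"id": "agent-frontend-1", "status": "idle"}}')
apply_event(state, '{"type": "task_complete", "taskId": "refactor-buttons", "agentId": "agent-frontend-1", "duration": 45.2, "filesModified": []}')
```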

## Usage Examples

### Example 1: Frontend Refactor

```bash
# Start multi-agent Ralph for frontend refactor
RALPH_MULTI_AGENT=true \
RALPH_MAX_WORKERS=8 \
/ralph "Refactor all components from class to functional with hooks"
```

**Meta-Agent Breakdown:**
```json
[
  {
    "id": "analyze-1",
    "type": "analysis",
    "description": "Scan all components and create refactoring plan",
    "dependencies": [],
    "files": []
  },
  {
    "id": "refactor-buttons",
    "type": "frontend",
    "description": "Convert all Button components to functional",
    "dependencies": ["analyze-1"],
    "files": ["components/Button/*.tsx"]
  },
  {
    "id": "refactor-forms",
    "type": "frontend",
    "description": "Convert all Form components to functional",
    "dependencies": ["analyze-1"],
    "files": ["components/Form/*.tsx"]
  },
  {
    "id": "update-tests-buttons",
    "type": "testing",
    "description": "Update Button component tests",
    "dependencies": ["refactor-buttons"],
    "files": ["__tests__/Button/*.test.tsx"]
  }
]
```

### Example 2: Full-Stack Feature

```bash
# Build feature with parallel frontend/backend
RALPH_MULTI_AGENT=true \
RALPH_MAX_WORKERS=6 \
/ralph "Build user authentication with OAuth, profile management, and email verification"
```

**Parallel Execution:**
- Agent 1 (Frontend): Build login form UI
- Agent 2 (Frontend): Build profile page UI
- Agent 3 (Backend): Implement OAuth endpoints
- Agent 4 (Backend): Implement profile API
- Agent 5 (Testing): Write integration tests
- Agent 6 (Docs): Write API documentation

### Example 3: Codebase Optimization

```bash
# Parallel optimization across codebase
RALPH_MULTI_AGENT=true \
RALPH_MAX_WORKERS=10 \
/ralph "Optimize performance: bundle size, lazy loading, image optimization, caching strategy"
```

## Environment Variables

```bash
# Multi-Agent Configuration
RALPH_MULTI_AGENT=true                  # Enable multi-agent mode
RALPH_MAX_WORKERS=12                    # Maximum worker agents
RALPH_MIN_WORKERS=2                     # Minimum worker agents

# Task Queue (Redis)
RALPH_TASK_QUEUE_HOST=localhost         # Redis host
RALPH_TASK_QUEUE_PORT=6379              # Redis port
RALPH_TASK_QUEUE_DB=0                   # Redis database
RALPH_TASK_QUEUE_PASSWORD=              # Redis password (optional)

# Observability
RALPH_OBSERVABILITY_ENABLED=true        # Enable dashboard
RALPH_OBSERVABILITY_PORT=3001           # WebSocket port
RALPH_OBSERVABILITY_HOST=localhost      # Dashboard host

# Agent Behavior
RALPH_AGENT_TIMEOUT=300                 # Task timeout (seconds)
RALPH_AGENT_HEARTBEAT=10                # Heartbeat interval (seconds)
RALPH_FILE_LOCK_TIMEOUT=300             # File lock timeout (seconds)
RALPH_MAX_RETRIES=3                     # Task retry count

# Logging
RALPH_VERBOSE=true                      # Verbose logging
RALPH_LOG_LEVEL=info                    # Log level
RALPH_LOG_FILE=.ralph/multi-agent.log   # Log file path
```
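
For readers wiring these variables into Python, the following is a minimal sketch of a typed loader for a subset of them. `RalphConfig` and `load_config` are hypothetical names, and treating the example values above as defaults (with multi-agent mode off unless enabled) is an assumption, since the document does not state the real defaults.

```python
import os
from dataclasses import dataclass
from typing import Mapping


def _as_bool(value: str) -> bool:
    """Interpret common truthy spellings; everything else is False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class RalphConfig:
    multi_agent: bool
    max_workers: int
    min_workers: int
    queue_host: str
    queue_port: int
    agent_timeout: int
    max_retries: int


def load_config(env: Mapping[str, str] = os.environ) -> RalphConfig:
    """Read a subset of the RALPH_* variables from a mapping (assumed defaults)."""
    cfg = RalphConfig(
        multi_agent=_as_bool(env.get("RALPH_MULTI_AGENT", "false")),
        max_workers=int(env.get("RALPH_MAX_WORKERS", "12")),
        min_workers=int(env.get("RALPH_MIN_WORKERS", "2")),
        queue_host=env.get("RALPH_TASK_QUEUE_HOST", "localhost"),
        queue_port=int(env.get("RALPH_TASK_QUEUE_PORT", "6379")),
        agent_timeout=int(env.get("RALPH_AGENT_TIMEOUT", "300")),
        max_retries=int(env.get("RALPH_MAX_RETRIES", "3")),
    )
    if cfg.min_workers > cfg.max_workers:
        raise ValueError("RALPH_MIN_WORKERS must not exceed RALPH_MAX_WORKERS")
    return cfg


# Passing a plain dict keeps the example independent of the real environment.
cfg = load_config({"RALPH_MULTI_AGENT": "true", "RALPH_MAX_WORKERS": "8"})
```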

## Monitoring & Debugging

### Check Multi-Agent Status

```bash
# View active agents
redis-cli keys "agent:*"

# View task queue
redis-cli lrange claude_tasks 0 10

# View file locks
redis-cli keys "lock:*"

# View task status
redis-cli hgetall "task:task-id"

# View completed tasks
redis-cli lrange claude_tasks:complete 0 10
```

### Observability Dashboard

Access the dashboard at: `http://localhost:3001`

**Dashboard Sections:**
1. **Mission Status** - Overall progress
2. **Agent Grid** - Individual agent status
3. **Conflict Alerts** - Active file conflicts
4. **Activity Stream** - Real-time event log
5. **Performance Metrics** - Agent efficiency

## Best Practices

### 1. Task Design
- Keep tasks independent when possible
- Minimize cross-task file dependencies
- Use specialization to guide agent assignment
- Set appropriate timeouts

### 2. Dependency Management
- Use topological sort for execution order
- Minimize dependency depth
- Allow parallel execution at every opportunity
- Handle circular dependencies gracefully
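
The first and last points can be illustrated with Python's standard `graphlib` module (Python 3.9+). This is a sketch rather than Ralph's scheduler: `execution_waves` is a hypothetical helper, and the task IDs are taken from Example 1. Each wave contains tasks that are unblocked at the same time and could run in parallel; a circular dependency is reported as an error instead of stalling the queue.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List


def execution_waves(dependencies: Dict[str, List[str]]) -> List[List[str]]:
    """Group task IDs into waves; tasks in one wave can run in parallel.

    `dependencies` maps each task ID to the IDs it depends on.
    Raises ValueError if the dependency graph contains a cycle.
    """
    sorter = TopologicalSorter(dependencies)
    try:
        sorter.prepare()  # detects cycles up front
    except CycleError as exc:
        raise ValueError(f"circular dependency: {exc.args[1]}") from exc

    waves: List[List[str]] = []
    while sorter.is_active():
        wave = sorted(sorter.get_ready())  # everything currently unblocked
        waves.append(wave)
        sorter.done(*wave)  # mark the wave finished to unblock successors
    return waves


waves = execution_waves({
    "analyze-1": [],
    "refactor-buttons": ["analyze-1"],
    "refactor-forms": ["analyze-1"],
    "update-tests-buttons": ["refactor-buttons"],
})
```

Here `waves` has three entries: the analysis task, then the two refactors together, then the test update.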

### 3. Conflict Prevention
- Group related file modifications in a single task
- Use file-specific agents when conflicts are likely
- Implement merge strategies for common conflicts
- Monitor lock acquisition time

### 4. Observability
- Log all agent activities
- Track file modifications in real time
- Alert on conflicts immediately
- Maintain activity history for debugging

### 5. Error Handling
- Implement retry logic with exponential backoff
- Quarantine failing tasks for analysis
- Provide detailed error context
- Allow manual intervention when needed
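
A minimal sketch of retry with exponential backoff follows. The helper names, the one-second base delay, the 60-second cap, and the 10% jitter are all illustrative assumptions; only the default of three retries comes from `max_retries` in the task structure.

```python
import random
import time
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")


def backoff_delays(max_retries: int = 3, base: float = 1.0, cap: float = 60.0) -> Iterator[float]:
    """Yield the wait before each retry: base, 2*base, 4*base, ... up to cap."""
    for attempt in range(max_retries):
        yield min(cap, base * (2 ** attempt))


def run_with_retries(
    task: Callable[[], T],
    max_retries: int = 3,
    base: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run task, retrying failures with exponentially growing, jittered waits."""
    delays = backoff_delays(max_retries, base)
    while True:
        try:
            return task()
        except Exception:
            delay = next(delays, None)
            if delay is None:
                raise  # retries exhausted: surface the error to the caller
            sleep(delay + random.uniform(0, delay * 0.1))  # small jitter


calls = {"n": 0}


def flaky() -> str:
    """Fails twice, then succeeds (stand-in for a real task)."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"


waits: list = []
result = run_with_retries(flaky, sleep=waits.append)  # record waits instead of sleeping
```

When the retries run out, the exception propagates, which is the point where a caller could move the task to a quarantine list such as `claude_tasks:failed`.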

## Troubleshooting

### Common Issues

**Agents stuck waiting:**
```bash
# Check for stale locks
redis-cli keys "lock:*"

# Clear stale locks
redis-cli del "lock:path/to/file"
```

**Tasks not executing:**
```bash
# Check task queue
redis-cli lrange claude_tasks 0 -1

# Check pending tasks
redis-cli lrange claude_tasks:pending 0 -1
```

**Dashboard not updating:**
```bash
# Check WebSocket server
netstat -an | grep 3001

# Restart observability server
pkill -f ralph-observability
RALPH_OBSERVABILITY_ENABLED=true ralph-observability
```

## Performance Tuning

### Optimize Worker Count
```bash
# Calculate optimal workers (choose the rule that matches the workload)
# nproc is Linux/GNU coreutils; on macOS use: sysctl -n hw.ncpu
CPU_CORES=$(nproc)

# General starting point: (cores * 1.5) - 1, using integer arithmetic
WORKERS=$(( CPU_CORES * 3 / 2 - 1 ))

# For I/O bound tasks
WORKERS=$(( CPU_CORES * 2 ))

# For CPU bound tasks
WORKERS=$CPU_CORES
```
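
The same sizing rules can be expressed in Python, for example to pick a value for `RALPH_MAX_WORKERS` at startup. `suggested_workers` and its `workload` labels are hypothetical, and rounding down with a floor of one worker is an assumption.

```python
import math
import os


def suggested_workers(cpu_cores: int, workload: str = "mixed") -> int:
    """Apply the sizing rules above; always returns at least 1."""
    if workload == "io":
        workers = cpu_cores * 2
    elif workload == "cpu":
        workers = cpu_cores
    else:  # general-purpose rule: (cores * 1.5) - 1, rounded down
        workers = math.floor(cpu_cores * 1.5) - 1
    return max(1, workers)


cores = os.cpu_count() or 1  # cpu_count() can return None
print(suggested_workers(cores))
```

On an 8-core machine the three rules give 11, 16, and 8 workers respectively.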
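
### Redis Configuration
```bash
# redis.conf
# Cap Redis memory usage at 2 GB
maxmemory 2gb
# Evict least-recently-used keys at the cap (this can include queued tasks and locks)
maxmemory-policy allkeys-lru
# Close client connections idle for more than 300 seconds
timeout 300
# Send TCP keepalive probes every 60 seconds
tcp-keepalive 60
```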

### Agent Pool Sizing
```bash
# Dynamic scaling based on queue depth
QUEUE_DEPTH=$(redis-cli llen claude_tasks)
if [ "$QUEUE_DEPTH" -gt 50 ]; then
  SCALE_UP=true
elif [ "$QUEUE_DEPTH" -lt 10 ]; then
  SCALE_DOWN=true
fi
```
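
The thresholds above only set flags; what happens next is not specified. One possible reading, sketched below, adjusts the pool by one worker per check while staying within the `RALPH_MIN_WORKERS` and `RALPH_MAX_WORKERS` bounds. The `scale_decision` function and the step size of one are assumptions.

```python
def scale_decision(
    queue_depth: int,
    workers: int,
    min_workers: int = 2,
    max_workers: int = 12,
) -> int:
    """Return the new worker count for the thresholds above (50 up, 10 down)."""
    if queue_depth > 50:
        return min(max_workers, workers + 1)  # scale up, capped at the maximum
    if queue_depth < 10:
        return max(min_workers, workers - 1)  # scale down, floored at the minimum
    return workers  # between thresholds: leave the pool unchanged


busy = scale_decision(queue_depth=60, workers=4)
quiet = scale_decision(queue_depth=5, workers=4)
```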

## Security Considerations

1. **File Access Control** - Restrict agent file system access
2. **Redis Authentication** - Use a Redis password in production
3. **Network Isolation** - Run agents in an isolated network
4. **Resource Limits** - Set CPU/memory limits per agent
5. **Audit Logging** - Log all agent actions for compliance

## Integration with Claude Code

The Ralph Multi-Agent System integrates with Claude Code:

```bash
# Use with Claude Code projects
export RALPH_AGENT=claude
export RALPH_MULTI_AGENT=true
cd /path/to/claude-code-project
/ralph "Refactor authentication system"
```

**Claude Code Integration Points:**
- Uses Claude Code agent pool
- Respects Claude Code project structure
- Integrates with Claude Code hooks
- Supports Claude Code tool ecosystem