Pony Alpha 2 68453089ee feat: initial Alpha Brain 2 dataset release
Massive training corpus for AI coding models containing:
- 10 JSONL training datasets (641+ examples across coding, reasoning, planning, architecture, communication, debugging, security, workflows, error handling, UI/UX)
- 11 agent behavior specifications (explorer, planner, reviewer, debugger, executor, UI designer, Linux admin, kernel engineer, security architect, automation engineer, API architect)
- 6 skill definition files (coding, API engineering, kernel, Linux server, security architecture, server automation, UI/UX)
- Master README with project origin story and philosophy

Built by Pony Alpha 2 to help AI models learn expert-level coding approaches.
2026-03-13 16:26:29 +04:00


Planning and Decomposition Dataset

Created: March 13, 2026 3:40 PM

Overview

This dataset contains examples for training AI models to decompose complex tasks into sub-tasks, manage todo lists, and determine execution order and dependencies.

Dataset Format

JSONL (JSON Lines) - one JSON object per line

Schema

Each example contains:

  • task: The original user request
  • decomposition: Array of sub-tasks
  • execution_order: Dependencies between tasks (pairs of [task_i, task_j])
  • todo_list: Structured todos with content, status, and activeForm
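A minimal sketch of how one record's shape could be checked against the schema above. The four field names and the three todo keys come from the schema; the sample record, the use of plain strings for sub-tasks, the use of integer indices in execution_order, and the helper name `validate_example` are illustrative assumptions, not taken from the dataset.

```python
import json

REQUIRED_FIELDS = ("task", "decomposition", "execution_order", "todo_list")
TODO_KEYS = ("content", "status", "activeForm")


def validate_example(example: dict) -> list:
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    for field in REQUIRED_FIELDS:
        if field not in example:
            problems.append(f"missing field: {field}")
    # Each dependency is expected to be a two-element pair.
    for pair in example.get("execution_order", []):
        if not (isinstance(pair, list) and len(pair) == 2):
            problems.append(f"execution_order entry is not a pair: {pair!r}")
    # Each todo is expected to carry content, status, and activeForm.
    for i, todo in enumerate(example.get("todo_list", [])):
        if not isinstance(todo, dict):
            problems.append(f"todo_list[{i}] is not an object")
            continue
        for key in TODO_KEYS:
            if key not in todo:
                problems.append(f"todo_list[{i}] missing key: {key}")
    return problems


# Hypothetical record, invented for illustration only.
sample_line = json.dumps({
    "task": "Add JWT authentication",
    "decomposition": ["Design token schema", "Implement login endpoint"],
    "execution_order": [[0, 1]],
    "todo_list": [
        {"content": "Design token schema", "status": "pending",
         "activeForm": "Designing token schema"},
        {"content": "Implement login endpoint", "status": "pending",
         "activeForm": "Implementing login endpoint"},
    ],
})

print(validate_example(json.loads(sample_line)))  # prints []
```

Running the same check over every line of the file is one way to confirm that all examples follow a consistent schema.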

Statistics

  • Total Examples: 43
  • File Size: 90 KB
  • Format: JSONL

Coverage

Todo List Management

  • All examples include structured todo lists with content, status, and activeForm
  • Demonstrates proper todo item formulation
  • Shows task progression from "pending" to completion

Multi-Step Tasks (3+ steps)

All examples have 8-15 sub-steps, demonstrating:

  • Feature implementation (authentication, real-time chat, search)
  • Bug investigation (memory leaks, API timeouts)
  • Refactoring (monolithic controllers, duplicate code)
  • Migration (database, frontend JavaScript to TypeScript)
  • CI/CD setup (GitHub Actions pipelines)
  • And many more complex scenarios
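The sub-step range stated above can be checked with a short histogram over the decomposition field. This is a sketch only: the helper name `substep_histogram` is hypothetical, and the two demo records are invented stand-ins for the real file.

```python
import json
from collections import Counter


def substep_histogram(lines) -> Counter:
    """Count how many examples have each number of sub-tasks."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if line:  # ignore blank lines
            counts[len(json.loads(line)["decomposition"])] += 1
    return counts


# Two invented records stand in for the real file here.
demo_lines = [
    json.dumps({"decomposition": ["step"] * 8}),
    json.dumps({"decomposition": ["step"] * 12}),
]
histogram = substep_histogram(demo_lines)
print(sorted(histogram.items()))  # prints [(8, 1), (12, 1)]

# Sub-task counts that fall outside the documented 8-15 range.
out_of_range = [n for n in histogram if not 8 <= n <= 15]
print(out_of_range)  # prints []
```

With the real dataset, an open file handle can be passed in place of `demo_lines`, since the function only needs an iterable of lines.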

ONE Task in Progress at a Time

All todo_list items are recorded with status: "pending", the state before execution begins. During execution, only one item should be marked as in_progress at any given time.
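The one-in-progress rule can be sketched as a small state transition over a todo list. The status strings "pending" and "in_progress" appear in this README; the "completed" status and the helper names `start_next` and `count_in_progress` are assumptions made for illustration.

```python
def start_next(todo_list: list) -> list:
    """Complete the active item (if any), then start the next pending one."""
    updated = [dict(item) for item in todo_list]  # copy; leave the input untouched
    for item in updated:
        if item["status"] == "in_progress":
            item["status"] = "completed"  # assumed status name
    for item in updated:
        if item["status"] == "pending":
            item["status"] = "in_progress"
            break  # start exactly one item
    return updated


def count_in_progress(todo_list: list) -> int:
    """Number of items currently marked in_progress."""
    return sum(1 for item in todo_list if item["status"] == "in_progress")


# Hypothetical todo list in its initial, all-pending state.
todos = [
    {"content": "Reproduce the leak", "status": "pending",
     "activeForm": "Reproducing the leak"},
    {"content": "Capture heap snapshot", "status": "pending",
     "activeForm": "Capturing heap snapshot"},
    {"content": "Fix the leak", "status": "pending",
     "activeForm": "Fixing the leak"},
]

for _ in range(4):
    todos = start_next(todos)
    assert count_in_progress(todos) <= 1  # the invariant holds at every step
    print([item["status"] for item in todos])
# prints:
# ['in_progress', 'pending', 'pending']
# ['completed', 'in_progress', 'pending']
# ['completed', 'completed', 'in_progress']
# ['completed', 'completed', 'completed']
```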

Sequential vs Parallel Decisions

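The execution_order field records dependencies between sub-tasks as [task_i, task_j] pairs. Sub-tasks linked by a dependency must run sequentially; sub-tasks with no dependency between them can run in parallel.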
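One way to turn dependency pairs into a sequential-versus-parallel plan is a topological sort that groups tasks into batches. The sketch below makes two assumptions this README does not confirm: that a pair [task_i, task_j] means task_i must finish before task_j starts, and that the values are 0-based indices into decomposition. The function name `parallel_batches` is hypothetical. It uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9+).

```python
from graphlib import TopologicalSorter


def parallel_batches(num_tasks: int, execution_order: list) -> list:
    """Group task indices into batches; tasks within a batch can run in parallel."""
    sorter = TopologicalSorter()
    for index in range(num_tasks):
        sorter.add(index)  # register every task, even those without dependencies
    for before, after in execution_order:
        sorter.add(after, before)  # assumed meaning: `after` depends on `before`
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies

    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything whose dependencies are done
        batches.append(ready)
        sorter.done(*ready)
    return batches


# Invented example: task 0 first, then 1 and 2 in parallel, then 3.
print(parallel_batches(4, [[0, 1], [0, 2], [1, 3], [2, 3]]))
# prints [[0], [1, 2], [3]]
```

A batch with more than one index marks a point where parallel execution is possible; a chain of single-index batches is strictly sequential.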

Replanning When New Info Emerges

Several examples show iterative refinement:

  • Debug scenarios where new information changes the approach
  • Migration examples with staging before production
  • Testing examples where results inform next steps
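Replanning of this kind could be modeled as keeping the work already started and replacing the steps not yet begun. This is a sketch under stated assumptions, not the dataset's own representation: the function name `replan`, the "completed" status, and the sample items are invented for illustration.

```python
def replan(todo_list: list, revised_steps: list) -> list:
    """Keep completed and active items; replace untouched pending items.

    `revised_steps` is a list of (content, activeForm) tuples for the new plan.
    """
    kept = [dict(item) for item in todo_list if item["status"] != "pending"]
    new_items = [
        {"content": content, "status": "pending", "activeForm": active_form}
        for content, active_form in revised_steps
    ]
    return kept + new_items


# Hypothetical debugging plan: profiling reveals the leak is in a cache,
# so the remaining pending step is replaced with two more specific ones.
current = [
    {"content": "Profile memory usage", "status": "completed",
     "activeForm": "Profiling memory usage"},
    {"content": "Review recent commits", "status": "pending",
     "activeForm": "Reviewing recent commits"},
]
revised = replan(current, [
    ("Inspect cache eviction logic", "Inspecting cache eviction logic"),
    ("Add cache size limit", "Adding cache size limit"),
])
print([item["content"] for item in revised])
# prints ['Profile memory usage', 'Inspect cache eviction logic', 'Add cache size limit']
```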

Scenarios Covered

Feature Implementation

  • User authentication with JWT
  • Real-time chat with WebSocket
  • Search functionality with Elasticsearch
  • Data export with multiple formats
  • Notification systems (email, SMS, push)
  • File upload with drag-and-drop
  • Content management systems
  • Real-time collaboration features
  • Data visualization components
  • Form validation systems
  • And more...

Bug Investigation

  • Memory leak debugging
  • API timeout errors
  • Performance profiling
  • Error tracking systems

Refactoring

  • Monolithic controller to service layer
  • Duplicate code to utilities
  • Frontend optimization (bundle size, load time)
  • Component library creation with Storybook

Migration

  • PostgreSQL to MongoDB
  • JavaScript to TypeScript
  • Database schema migrations
  • API versioning

CI/CD

  • GitHub Actions pipeline setup
  • Automated testing strategies
  • Build and deployment automation
  • Infrastructure monitoring

Security & Compliance

  • Security audits and fixes
  • Data encryption
  • Permission systems (RBAC)
  • Audit logging
  • Input validation

Infrastructure

  • Database sharding
  • Message queue systems
  • Caching with Redis
  • API gateway setup
  • Distributed tracing
  • Session management

Testing & Quality

  • Automated testing strategies
  • A/B testing frameworks
  • Localization testing
  • Accessibility features

Usage Example

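import json

# Read the dataset: JSONL stores one JSON object per line
with open('planning-decomposition.jsonl', 'r', encoding='utf-8') as f:
    for line in f:
        line = line.strip()
        if not line:
            continue  # skip blank lines, e.g. a trailing newline at end of file
        example = json.loads(line)
        # Print a short summary of each example
        print(f"Task: {example['task']}")
        print(f"Sub-tasks: {len(example['decomposition'])}")
        print(f"Dependencies: {len(example['execution_order'])}")
        print(f"Todo items: {len(example['todo_list'])}")
        print()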

File Location

C:/Users/admin/Pony-Alpha-2-Dataset-Training/datasets/03-planning-decomposition/planning-decomposition.jsonl

Notes

  • Dataset created March 13, 2026
  • All examples follow consistent schema
  • Suitable for training planning and task decomposition models
  • Covers real-world software engineering scenarios