5.5 KiB
5.5 KiB
🚀 Goose Ultra: Master Plan v3.0 (Complete Architecture)
Executive Summary
SAP (Streaming Artifact Protocol) fixes parsing reliability but does NOT fix:
- ❌ Skipped plan approval (users go straight to broken builds)
- ❌ Wrong app generation (CBT game requested → dashboard generated)
- ❌ Redesign drift (small edits cause full regeneration)
- ❌ Cross-talk (old sessions pollute new ones)
This plan implements SAP + 4 Critical Layers as a single atomic upgrade.
Layer 0: SAP (Streaming Artifact Protocol) ✅ DONE
- XML-based output format with
<goose_file>tags - CDATA wrapping to prevent escaping issues
- Fallback to legacy markdown parsing
- Status: Implemented in previous commit
Layer 1: PLAN_FIRST_STATE_MACHINE
Rule
"Idea submission must generate a plan first; build is forbidden until user approves."
State Machine
STATES: IDLE → PLANNING → PLAN_READY → BUILDING → PREVIEW_READY
↓ ↑
ERROR ←───────┘
Transitions
| From | To | Event | Guard |
|---|---|---|---|
| IDLE | PLANNING | SUBMIT_IDEA | - |
| PLANNING | PLAN_READY | PLAN_COMPLETE | Plan text received |
| PLAN_READY | BUILDING | APPROVE_PLAN | User clicked Approve |
| PLAN_READY | PLANNING | EDIT_PLAN | User edited and resubmitted |
| PLAN_READY | IDLE | REJECT_PLAN | User clicked Reject |
| BUILDING | PREVIEW_READY | BUILD_SUCCESS | Files written & QA passed |
| BUILDING | ERROR | BUILD_FAIL | QA failed or timeout |
Hard Guards (Enforced in Code)
- No BUILDING without APPROVE_PLAN: The
handleApprove()function is the ONLY path to BUILDING state. - Approve button disabled until PLAN_COMPLETE: UI shows disabled button during PLANNING.
- No auto-build: Removing any code that transitions directly from PLANNING → BUILDING.
Implementation
- File:
src/types.ts- Add missing states (PLAN_READY) - File:
src/orchestrator.ts- Enforce transitions - File:
src/components/Views.tsx- Guard UI buttons
Layer 2: SESSION_GATING
Rule
"Prevent cross-talk: only the active sessionId may update UI or write files."
Requirements
- Every stream emits sessionId: Wrap all
electron.onChatChunk/Complete/Errorcalls with sessionId tracking. - UI ignores stale events: Before dispatching any action, check
if (sessionId !== activeSessionId) return; - Cancel marks session as cancelled:
dispatch({ type: 'CANCEL_SESSION', sessionId })sets a flag. - Single finalize path: All sessions end via one of: COMPLETE, ERROR, CANCEL, TIMEOUT.
Implementation
- Add
activeSessionIdto orchestrator state - Add
START_SESSIONandEND_SESSIONactions - Wrap all stream handlers with session checks
- Add timeout watchdog (30s default)
Layer 3: PATCH_ONLY_MODIFICATIONS
Rule
"Existing project edits must be patch-based; no full regeneration."
Requirements
- Patch JSON format: Model outputs bounded operations only:
{ "patches": [ { "op": "replace", "anchor": "<!-- HERO_SECTION -->", "content": "..." }, { "op": "insert_after", "anchor": "</header>", "content": "..." } ] } - Deterministic applier: Local code applies patches, enforces:
- Max 500 lines changed per patch
- Forbidden zones (e.g.,
<head>metadata)
- REDESIGN_OK gate: Full regeneration blocked unless user explicitly says "redesign" or "rebuild from scratch".
Implementation
- New file:
src/services/PatchApplier.ts - Update:
applyPlanToExistingHtml()to use patch format - Update: System prompt for modification mode
Layer 4: QUALITY_AND_TASK_MATCH_GUARDS
Rule
"Block broken UI and wrong-app output before writing or previewing."
Quality Gates (Already Partially Implemented)
| Gate | Check | Action on Fail |
|---|---|---|
| artifact_type_gate | No [PLAN] markers, no markdown headings without HTML | Block |
| html_validity_gate | Has DOCTYPE, html, body tags | Block |
| styling_presence_gate | Has Tailwind CDN or >20 CSS rules | Warn + Retry |
| runtime_sanity_gate | No console errors in sandboxed render | Warn |
Task Match Gate (NEW)
- Rule: If user asked for "X" but AI generated "Y", block and retry.
- Implementation:
- Extract keywords from original prompt (e.g., "CBT game", "stress relief")
- Analyze generated HTML for matching keywords in titles, headings, content
- If mismatch score > 0.7, block and auto-retry with:
RETRY REASON: User requested "CBT mini games" but output appears to be "Dashboard".
Auto-Repair
- Max 2 retry attempts
- Each retry includes: failure reasons + original request + project context
Implementation Order
| Phase | Layer | Files | Complexity |
|---|---|---|---|
| 1 | PLAN_FIRST_STATE_MACHINE | types.ts, orchestrator.ts, Views.tsx | High |
| 2 | SESSION_GATING | orchestrator.ts, Views.tsx, LayoutComponents.tsx | High |
| 3 | PATCH_ONLY_MODIFICATIONS | PatchApplier.ts, automationService.ts | Medium |
| 4 | QUALITY_AND_TASK_MATCH_GUARDS | automationService.ts (extend gates) | Medium |
| 5 | Integration Testing | - | - |
Definition of Done
- SAP implemented ✅
- No build starts without plan approval
- No cross-talk between sessions
- Small changes do not redesign apps
- Broken/unstyled outputs are blocked and repaired before preview
- Wrong-app outputs are blocked (task-match gate)
Status: Ready for Approval Execution Agent: Opus 4.5