Release v1.01 Enhanced: Vi Control, TUI Gen5, Core Stability

Gemini AI
2025-12-20 01:12:45 +04:00
parent 2407c42eb9
commit 142aaeee1e
254 changed files with 44888 additions and 31025 deletions

docs/FEATURE-CHECKLIST.md Normal file

@@ -0,0 +1,70 @@
# OpenQode TUI Feature Checklist (QA)
Statuses:
- `OK`: verified via automated/static checks in this repo
- `PARTIAL`: implemented, but needs manual QA in a real terminal session
- `TODO`: not implemented yet
## Automated Sanity (OK)
- `node --check bin/opencode-ink.mjs`
- `npm test`
- Extension validation:
  - `node final_validation.js`
  - `node final_integration_test.js`
  - `node runtime_simulation_test.js`
  - `node validate_extension.js`
## Startup / Core Flow
- TUI start (`npm start`) — `PARTIAL` (manual: confirm no crash in an interactive terminal)
- Project picker — `PARTIAL` (manual: pick current/recent/new path)
- Agent picker (`/agents`) — `PARTIAL`
## Layout / Smoothness
- Responsive sizing (`bin/tui-layout.mjs`) — `OK` (tests)
- Reduced jitter streaming (`bin/tui-stream-buffer.mjs`) — `PARTIAL` (manual: long response should not “shake”)
- Fixed/reserved strip heights (header/run/flow/footer/input) — `PARTIAL` (manual: transcript shouldn't jump)
- Reduce motion toggle (`/motion on|off`) — `PARTIAL` (manual: fewer spinners/less jitter)
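One way to picture the reduced-jitter streaming item is a small coalescing buffer that batches streamed chunks and redraws once per interval. The sketch below is an illustration under that assumption, not the actual contents of `bin/tui-stream-buffer.mjs`; `createStreamBuffer`, `onFlush` and `intervalMs` are hypothetical names.

```javascript
// Hypothetical sketch: coalesce streamed chunks so the UI redraws at most
// once per interval instead of once per token.
function createStreamBuffer(onFlush, intervalMs = 50) {
  let pending = '';
  let timer = null;

  function flush() {
    if (timer !== null) {
      clearTimeout(timer);
      timer = null;
    }
    if (pending.length > 0) {
      const text = pending;
      pending = '';
      onFlush(text); // one redraw for the whole batch
    }
  }

  return {
    // Queue a chunk; the first chunk of a batch schedules the flush.
    push(chunk) {
      pending += chunk;
      if (timer === null) {
        timer = setTimeout(flush, intervalMs);
      }
    },
    // Force out whatever is queued (for example when the stream ends).
    flush,
  };
}
```

A longer interval means fewer redraws at the cost of slightly delayed text, which is presumably the trade-off the manual "long response should not shake" check is meant to judge.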
## Sidebar + IDE Loop
- Explorer sidebar (default ON) — `PARTIAL`
  - Toggle: `Ctrl+E`, `/explorer on|off`
  - Navigate: arrows, `Enter` open, `Space` select
- Preview tabs panel (`bin/ui/components/FilePreviewTabs.mjs`) — `PARTIAL`
  - `/open <path[:line]>` opens into tabs
- Project search (`/search [query]`) — `PARTIAL` (requires `rg` in PATH)
- Recent/Hot pickers — `PARTIAL`
  - `/recent`, `/hot`
  - `Ctrl+R` (recent), `Ctrl+H` (hot)
- “Add to context” pack from explorer selection — `PARTIAL`
- Persistent UI prefs (`.opencode/ui_prefs.json`) — `PARTIAL`
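For manual QA of `/open <path[:line]>`, it helps to know how the argument might be split. The following is a minimal sketch, assuming only a trailing `:<digits>` is treated as a line number; `parseOpenTarget` is a hypothetical name and the real parser may behave differently.

```javascript
// Hypothetical parser for the `<path[:line]>` argument of `/open`.
// Only a trailing `:<digits>` counts as a line number, so Windows drive
// prefixes such as `C:\` are left alone.
function parseOpenTarget(arg) {
  const trimmed = arg.trim();
  const match = /^(.+):(\d+)$/.exec(trimmed);
  if (match) {
    return { path: match[1], line: Number(match[2]) };
  }
  return { path: trimmed, line: null };
}
```

Worth checking by hand: a plain path, a path with a line suffix, and a Windows path with a drive letter.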
## Commands / Tools
- Command palette (`Ctrl+P`, `Ctrl+K`, `/settings`) — `PARTIAL`
- Safe mode (`/safe on|off`) — `PARTIAL`
- Safe confirm overlay (Enter run once, Esc cancel) — `PARTIAL`
- `/doctor` diagnostics — `PARTIAL`
- Task Wizard (`/new <goal>`) — `PARTIAL`
## File Writes / Diff Review
- Pending file diffs + `/write` — `PARTIAL`
- Diff review overlay — `PARTIAL`
- Hunk staging / partial apply — `PARTIAL`
- Apply + reopen / apply + run tests actions — `PARTIAL`
## Automation (IQ Exchange)
- Request → plan preview → run (`PreviewPlan` / `AutomationTimeline`) — `PARTIAL`
- Step-by-step gate + step editor — `PARTIAL`
- Inspector panels (Browser/Desktop/Server) — `PARTIAL`
- Desktop automation script parsing + basic commands — `OK`
  - `powershell -NoProfile -ExecutionPolicy Bypass -File bin/input.ps1 key LWIN`
  - `powershell -NoProfile -ExecutionPolicy Bypass -File bin/input.ps1 startmenu`
## Project Intelligence
- Index cache (`/index`, `.opencode/index.json`) — `PARTIAL`
- Symbols map (`/symbols`) — `PARTIAL`
- Recent/Hot tracking — `PARTIAL`
## Nano Dev (Self-Improve Safely)
- Nano Dev fork workflow (`/nanodev <goal>`) — `PARTIAL`
- Fork verify (`/nanodev verify`) — `PARTIAL`


@@ -0,0 +1,104 @@
# OpenQode TUI QA + Enhancements Report
## What's Verified Automatically (PASS)
- `npm test`
- `node --check bin/opencode-ink.mjs`
- Extension validation:
  - `node final_validation.js`
  - `node final_integration_test.js`
  - `node runtime_simulation_test.js`
  - `node validate_extension.js`
## Feature-by-Feature Status + Suggestions
### Desktop Automation (Start Menu / Clicks / UIA)
**Implemented**
- `bin/input.ps1` parses and core commands run (`key`, `click`, `uiclick`, `screenshot`, etc).
- Added `startmenu` command (`Ctrl+Esc`) to reliably open Start when `key LWIN` is flaky.
**Manual QA**
- From the TUI automation preview, run a plan that calls:
  - `powershell -NoProfile -ExecutionPolicy Bypass -File "…/bin/input.ps1" startmenu`
- Confirm Start opens consistently.
**Enhancements**
- Add a “foreground focus” step helper (bring shell/desktop to foreground before key presses).
- Add automatic retry strategy: `startmenu` → `uiclick "Start"` → coordinate fallback.
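The retry strategy proposed above could be expressed as an ordered list of attempts that stops at the first success. This is a sketch only: `runWithFallback` and the `{ name, run }` strategy shape are hypothetical, and each `run` would wrap whatever actually invokes `bin/input.ps1`.

```javascript
// Hypothetical helper: try each strategy in order until one reports success.
// A strategy is { name, run }, where run() resolves to true on success.
async function runWithFallback(strategies) {
  const failures = [];
  for (const { name, run } of strategies) {
    try {
      if (await run()) {
        return { ok: true, used: name, failures };
      }
      failures.push({ name, reason: 'reported failure' });
    } catch (err) {
      failures.push({ name, reason: err.message });
    }
  }
  return { ok: false, used: null, failures };
}
```

Recording why each earlier strategy failed keeps the fallback visible in the automation timeline instead of silently masking a flaky `key LWIN`.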
### IDE Loop (Explorer + Tabs + Open/Search + Context Pack)
**Implemented**
- Explorer sidebar toggle: `Ctrl+E`, `/explorer on|off` (default ON).
- Tabs UI: file preview tabs; `/open <path[:line]>`.
- Search overlay: `Ctrl+Shift+F`, `/search [query]` (ripgrep picker).
- Context pack: selected explorer files are appended once to the next prompt then cleared.
- Recent/Hot pickers: `Ctrl+R`/`Ctrl+H`, `/recent`, `/hot`.
**Manual QA**
- Toggle Explorer, select multiple files, send a prompt; selection clears and is used as context.
- `/search <term>` lists matches; Enter opens the match into tabs.
**Enhancements**
- Make Recent/Hot lists selectable directly in the sidebar (mouse-free “jump to”).
- Add “pin tab” and “reveal in explorer” actions in tabs.
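The "appended once, then cleared" behaviour of the context pack is the part most worth testing. A minimal sketch of that contract follows; `ContextPack`, `toggle` and `consume` are hypothetical names, and file reading is injected so the sketch stays independent of the TUI.

```javascript
// Hypothetical sketch of the context pack: explorer selections are attached
// to exactly one outgoing prompt and then cleared.
class ContextPack {
  constructor() {
    this.selected = new Set();
  }

  // Select or deselect a file (what `Space` does in the explorer).
  toggle(filePath) {
    if (this.selected.has(filePath)) this.selected.delete(filePath);
    else this.selected.add(filePath);
  }

  // Build the prompt for the next send and clear the selection.
  consume(prompt, readFile) {
    if (this.selected.size === 0) return prompt;
    const sections = [...this.selected].map(
      (p) => `--- ${p} ---\n${readFile(p)}`
    );
    this.selected.clear();
    return `${prompt}\n\n[Context files]\n${sections.join('\n\n')}`;
  }
}
```

A second `consume` call with no new selection returns the prompt unchanged, which mirrors the manual QA step above.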
### Diff Workflow (Git-Like Hunk Staging)
**Implemented**
- Hunk staging in diff view (toggle hunks; apply selected).
- Quick actions: apply, apply+reopen, apply+run tests.
**Enhancements**
- Add per-line staging inside hunks.
- Add “apply all + keep review open” for iterative editing.
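Hunk staging boils down to applying a chosen subset of edits to the original text. The sketch below assumes a simplified, hypothetical hunk shape (`id`, 0-based `start`, `oldLines`, `newLines`) with non-overlapping hunks; the diff overlay's real data model is not shown in this report.

```javascript
// Hypothetical hunk model: each hunk replaces `oldLines.length` lines starting
// at 0-based index `start` in the original file with `newLines`.
function applySelectedHunks(originalText, hunks, selectedIds) {
  const lines = originalText.split('\n');
  const chosen = hunks
    .filter((h) => selectedIds.includes(h.id))
    // Apply bottom-up so earlier indices stay valid.
    .sort((a, b) => b.start - a.start);

  for (const h of chosen) {
    const current = lines.slice(h.start, h.start + h.oldLines.length);
    if (current.join('\n') !== h.oldLines.join('\n')) {
      throw new Error(`Hunk ${h.id} no longer matches the file`);
    }
    lines.splice(h.start, h.oldLines.length, ...h.newLines);
  }
  return lines.join('\n');
}
```

Per-line staging, as proposed above, would amount to splitting a hunk into smaller hunks before calling the same function.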
### Quality Rails (Safe Mode + Confirm + Doctor)
**Implemented**
- `/safe on|off` guard rails for destructive runs and batches.
- Interactive Safe Confirm overlay (Enter run once, Esc cancel).
- `/doctor` environment diagnostics.
**Enhancements**
- Add “confirm all for this session” for repeated safe operations.
- Add a “dry-run” rendering for batches (expanded quoting + cwd).
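The dry-run enhancement could be as small as a function that renders each batch step with its working directory and visibly quoted arguments. The sketch below is for display only and is not a safe shell escaper; `quoteArg`, `renderDryRun` and the `{ command, args, cwd }` step shape are hypothetical.

```javascript
// Hypothetical dry-run renderer: show each command of a batch as it would be
// launched, with quoting expanded and the working directory visible.
function quoteArg(arg) {
  // Quote only when needed; escape embedded double quotes.
  return /^[\w@%+=:,.\/\\-]+$/.test(arg) ? arg : `"${arg.replace(/"/g, '\\"')}"`;
}

function renderDryRun(batch) {
  return batch.map((step, i) => {
    const cmd = [step.command, ...step.args].map(quoteArg).join(' ');
    return `${i + 1}. (cwd: ${step.cwd}) ${cmd}`;
  });
}
```

Showing the rendered lines in the Safe Confirm overlay would let the user see exactly which argument boundaries a batch will use before pressing Enter.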
### Task Wizard (`/new`)
**Implemented**
- `/new <goal>` seeds a checklist into the todo list; `Ctrl+T` opens tasks.
**Enhancements**
- Add “wizard phases” that map to Ask → Preview → Run → Verify with a progress ribbon.
### Project Intelligence (Index / Symbols / Recent / Hot)
**Implemented**
- `/index` builds `.opencode/index.json` (`rg --files` preferred).
- `/symbols [path]` basic symbol extraction.
- Recent/Hot tracking used by pickers.
**Enhancements**
- Add `/find <filename>` backed by index cache.
- Add lightweight “hot files” heuristics (recent churn + git status).
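A "recent churn + git status" heuristic might look like the following sketch: recent touches decay with a half-life and files with uncommitted changes get a fixed boost. Everything here is an assumption made for illustration (`scoreHotFiles`, the one-hour half-life, the boost of 1); how touches and dirty paths are collected is left to the caller.

```javascript
// Hypothetical "hot files" score: recent edits decay over time, and files
// with uncommitted changes (per git status) get a fixed boost.
function scoreHotFiles(touches, dirtyPaths, now, halfLifeMs = 60 * 60 * 1000) {
  const scores = new Map();
  for (const { path, at } of touches) {
    const age = Math.max(0, now - at);
    const weight = Math.pow(0.5, age / halfLifeMs); // 1.0 now, 0.5 after one half-life
    scores.set(path, (scores.get(path) || 0) + weight);
  }
  for (const path of dirtyPaths) {
    scores.set(path, (scores.get(path) || 0) + 1);
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([path, score]) => ({ path, score }));
}
```

The half-life and the size of the dirty-file boost are the two knobs to tune once real usage data from the Recent/Hot tracking is available.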
### Chat-to-App / Instant Preview / One-Click Deploy
**Implemented**
- `/app <name> <desc>` scaffolds an app into `web/apps/<name>`.
- `/preview [name]` runs local preview server.
- `/deployapp <name>` deploys via Vercel (requires `vercel` configured).
**Enhancements**
- Persist deploy history in `.opencode/deploy_history.json` and expose `/deploys`.
### Nano Dev (Self-Modify Safely, Fork-First)
**Implemented**
- `/nanodev <goal>` creates a worktree fork under `.opencode/nano-dev/*` and instructs AI to write only there.
- Helpers: `/nanodev status|diff|verify|cleanup`.
- Fixed startup crash by avoiding temporal-dead-zone behavior for `runNanoDevVerify`.
**Enhancements**
- Add `/nanodev merge` that produces a patch and applies it only after verify + clean git status.
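For readers unfamiliar with the temporal dead zone mentioned in the crash fix, the sketch below reproduces the general failure mode and one common way around it. It is an illustration only; the function names are stand-ins and the actual change made around `runNanoDevVerify` is not shown in this report.

```javascript
// Illustration only; the names below are stand-ins, not the project's code.

// Reading a `const` binding before its declaration has been evaluated
// throws a ReferenceError (the temporal dead zone).
function callBeforeInit() {
  try {
    return verifyLater(); // `verifyLater` is still in its TDZ here
  } catch (err) {
    return err.name;
  }
  const verifyLater = () => 'verified'; // never reached
}

// A function declaration is hoisted together with its body, so the same
// call order works.
function callHoisted() {
  return verifyNow();
  function verifyNow() {
    return 'verified';
  }
}

console.log(callBeforeInit()); // ReferenceError
console.log(callHoisted());    // verified
```

Moving the `const` definition above its first use is the other usual fix when a hoisted declaration is not desirable.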
### UI Polish (“Jaggy/Shaky” Feel)
**Implemented**
- Reduce motion toggle: `/motion on|off` (removes extra animated spinners to reduce redraw jitter).
**Enhancements**
- Add an FPS/terminal-capability section to `/doctor` to guide ideal terminal settings.


@@ -0,0 +1,355 @@
# Goose Super - Current State Analysis & Enhancement Plan
## Executive Summary
**Goose Super** is currently a functional Electron-based AI coding assistant that combines Qwen LLM with basic computer automation. However, to achieve the vision of a **noob-proof, all-in-one AI coding environment** (like Lovable but with full computer control), significant enhancements are needed.
---
## Current State Assessment
### ✅ What Works Today
| Feature | Status | File |
|---------|--------|------|
| **Qwen LLM Integration** | ✅ Working | `bin/qwen-bridge.mjs`, `bin/goose-launch.mjs` |
| **Electron GUI** | ✅ Working | `bin/goose-electron-app/main.cjs` |
| **Chat Interface** | ✅ Working | `bin/goose-electron-app/renderer.js` (1970 lines) |
| **Screenshots** | ✅ Working | `computer-use.cjs` - PowerShell capture |
| **Mouse Click** | ✅ Working | PowerShell mouse_event simulation |
| **Keyboard Input** | ✅ Working | SendKeys via PowerShell |
| **Key Combinations** | ✅ Working | Ctrl+C, Alt+Tab, etc. |
| **Shell Commands** | ✅ Working | `exec()` with timeout |
| **Window Listing** | ✅ Working | Get-Process filtering |
| **Window Focus** | ✅ Working | SetForegroundWindow via WinAPI |
| **App Opening** | ✅ Working | Common apps mapped |
| **Preview Panel** | ✅ Working | Webview for HTML preview |
| **Playwright Bridge** | ⚠️ Basic | `playwright-bridge.js` - navigate/click/type |
| **AI Suggestions** | ✅ Working | Pre-defined prompt cards |
| **Terminal Panel** | ✅ Working | Command execution UI |
### ❌ What's Missing (vs. Goal)
| Gap | Impact | Priority |
|-----|--------|----------|
| **No Vision/OCR Element Finding** | Can't "see" and click buttons by name | 🔴 CRITICAL |
| **No Self-Correction Loop** | Doesn't verify if actions worked | 🔴 CRITICAL |
| **No Vibe Coding Flow** | Can't create/preview apps like Lovable | 🔴 CRITICAL |
| **No Project/File Management** | No file tree, save/load projects | 🟠 HIGH |
| **No Embedded IDE** | No Monaco editor, syntax highlighting | 🟠 HIGH |
| **No Server/SSH Management** | Can't deploy/manage remote servers | 🟡 MEDIUM |
| **No Git Integration** | Can't commit/push/pull | 🟡 MEDIUM |
| **Browser Automation is Surface-Level** | No DOM inspection, smart selectors | 🟡 MEDIUM |
| **No Memory/Context Persistence** | Forgets between sessions | 🟡 MEDIUM |
---
## Reference Implementations Deep-Dive
### 1. Windows-Use (CursorTouch)
**Best for: Desktop automation without computer vision**
```
windows_use/
├── agent/ # Agent orchestration
├── llms/ # LLM providers (Ollama, Google, etc.)
├── messages/ # Conversation handling
├── tool/ # Tool definitions
└── telemetry/ # Analytics
```
**Key Innovations:**
- Uses **UIAutomation** (Windows Accessibility API) to find elements by name/role
- **PyAutoGUI** for mouse/keyboard (more reliable than raw SendKeys)
- Works with **any LLM** (Qwen, Gemini, Ollama) - not tied to specific models
- **Grounding** - Shows how it "sees" the screen with labeled elements
**What to Take:**
- UIAutomation for element discovery (instead of blind x,y clicking)
- Agent loop pattern with tool execution
- LLM abstraction layer
---
### 2. Browser-Use
**Best for: Comprehensive web automation**
```
browser_use/
├── actor/ # Action execution
├── agent/ # Agent service
├── browser/ # Playwright wrapper
├── code_use/ # Code execution sandbox
├── controller/ # Action controller
├── dom/ # DOM manipulation & analysis
├── filesystem/ # File operations
├── llm/ # LLM integrations
├── mcp/ # Model Context Protocol
├── sandbox/ # Safe execution environment
├── skills/ # Reusable action patterns
└── tools/ # Custom tool definitions
```
**Key Innovations:**
- **Smart DOM analysis** - extracts meaningful selectors
- **Multi-tab support** with session persistence
- **Custom tools API** - `@tools.action(description='...')`
- **Sandbox execution** for safe code running
- **Cloud deployment** option
- **Form filling with validation**
- **CAPTCHA handling** (via stealth browsers)
**What to Take:**
- DOM extraction and smart selector logic
- Tools/actions decorator pattern
- Multi-tab browser session management
- Sandbox for safe code execution
---
### 3. Open-Interface
**Best for: Simple LLM → Screenshot → Execute loop**
```
app/
├── core.py # Main orchestration loop
├── interpreter.py # Parse LLM responses
├── llm.py # LLM communication
├── ui.py # Tkinter UI (18KB)
└── utils/ # Helpers
```
**Architecture:**
```
User Request → Screenshot → LLM → Parse Instructions → Execute → Repeat
```
**Key Innovations:**
- **Course-correction** via screenshot feedback loop
- **Stop button** + corner detection to interrupt
- Simple, understandable architecture
- Works across Windows/Mac/Linux
**What to Take:**
- The "screenshot → analyze → execute → verify" loop
- Interrupt mechanisms (corner detection)
- Cross-platform automation patterns
---
### 4. OpenCode TUI (sst/opencode)
**Best for: Terminal-based IDE experience**
```
packages/
├── core/ # Core logic
├── tui/ # Terminal UI (Ink-based)
├── lsp/ # Language Server Protocol
└── ...
```
**Key Innovations:**
- Uses **Bun** for speed
- **LSP integration** for code intelligence
- **SST** infrastructure for deployment
- Beautiful TUI with Ink
**What to Take:**
- LSP integration for code completion/diagnostics
- Bun for faster package management
- TUI patterns (if we add TUI mode)
---
### 5. Mini-Agent (MiniMax)
**Best for: Lightweight Python agent framework**
```
mini_agent/
├── agent.py # Agent implementation
├── tools.py # Tool definitions
└── memory.py # Context management
```
**What to Take:**
- Memory/context management patterns
- Simple agent abstraction
---
## Gap Analysis: Current State vs. Noob-Proof Vision
### 🎯 User Experience Goals
| Goal | Current | Required |
|------|---------|----------|
| "Build me a website" | ❌ Can chat, can't create | ✅ One prompt → working preview |
| "Click the Settings button" | ⚠️ Blind x,y click | ✅ Find element by name/OCR |
| "Deploy this to my server" | ❌ No SSH | ✅ Connect, upload, run commands |
| "Open my last project" | ❌ No persistence | ✅ Project save/load |
| "Edit this file" | ❌ No editor | ✅ Monaco with syntax highlighting |
| "Show me what you see" | ⚠️ Can screenshot | ✅ Annotated vision with element labels |
---
## Proposed Architecture
### Layered Super-Powers
```
┌────────────────────────────────────────────────────────────┐
│                       GOOSE SUPER UI                       │
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌────────┐ │
│ │  Chat   │ │ Preview │ │ Editor  │ │ Browser │ │Terminal│ │
│ │  Panel  │ │  Panel  │ │  Panel  │ │  Panel  │ │ Panel  │ │
│ └─────────┴─┴─────────┴─┴─────────┴─┴─────────┴─┴────────┘ │
└────────────────────────────────────────────────────────────┘
        ┌─────────────────────┼─────────────────────┐
        ▼                     ▼                     ▼
┌───────────────┐     ┌───────────────┐     ┌───────────────┐
│   AI BRAIN    │     │   EXECUTION   │     │    CONTEXT    │
│               │     │     LAYER     │     │     LAYER     │
│ • Qwen Bridge │     │ • Computer Use│     │ • Memory      │
│ • Planning    │     │ • Browser Use │     │ • Projects    │
│ • Verification│     │ • Server Mgmt │     │ • Sessions    │
│ • Correction  │     │ • File Ops    │     │ • History     │
└───────────────┘     └───────────────┘     └───────────────┘
```
### Phase-by-Phase Enhancement
#### Phase 1: Vision & Smart Automation
**Make the AI truly "see" and interact reliably**
1. **Add UIAutomation element discovery** (from Windows-Use)
   - Find buttons/inputs by name, not x,y
   - Label screenshot with element overlays
2. **Implement verification loop** (from Open-Interface)
   - After each action, screenshot and verify success
   - Self-correct if needed
3. **Enhanced computer-use.cjs**
   - Add `findElement(name)` using UIAutomation
   - Add `getElementsOnScreen()` for element listing
   - Add `clickElement(name)` for reliable interaction
#### Phase 2: Vibe Coding Experience
**Create apps from prompts like Lovable**
1. **Embedded Monaco Editor**
   - File tree sidebar
   - Multi-tab editing
   - Syntax highlighting
   - Live error detection
2. **Project System**
   - Create/save/load projects
   - Auto-scaffold HTML/CSS/JS
   - Template library
3. **Live Preview Enhancement**
   - Hot reload on file save
   - Dev server auto-start (Vite integration)
   - Console output in UI
#### Phase 3: Full Automation Power
**Control everything**
1. **Server Management**
   - SSH connection panel
   - Remote command execution
   - Log streaming
   - File upload/download
2. **Browser Automation**
   - DOM inspection
   - Smart element selectors
   - Multi-tab support
   - Cookie/auth persistence
3. **Git Integration**
   - Clone/commit/push
   - Branch management
   - Diff visualization
#### Phase 4: Noob-Proof Polish
**Make it intuitive for anyone**
1. **Onboarding Wizard**
   - API key setup
   - Permissions check
   - Quick tutorial
2. **AI Suggestions**
   - Context-aware suggestions
   - One-click actions
   - Visual tutorials
3. **Error Recovery**
   - Smart retry
   - User-friendly error messages
   - Undo/redo history
---
## Technical Debt to Address
| Issue | Risk | Fix |
|-------|------|-----|
| PowerShell scripts for automation | Slow, fragile | Use native Node.js (robotjs/nut.js) |
| 1970-line renderer.js | Hard to maintain | Modularize into components |
| No TypeScript | Type errors | Migrate to TypeScript |
| No tests | Regressions | Add jest/playwright tests |
| Hardcoded paths | Portability | Use config files |
---
## Recommended Priorities
### 🔴 Immediate (Week 1)
1. Add UIAutomation element discovery
2. Implement screenshot → verify loop
3. Fix any existing bugs in computer-use
### 🟠 Short-term (Week 2-3)
4. Monaco Editor integration
5. Project save/load system
6. Enhanced preview with hot reload
### 🟡 Medium-term (Week 4-6)
7. SSH/Server management panel
8. Git integration
9. Browser DOM inspection
### 🟢 Long-term (Week 7+)
10. Onboarding wizard
11. AI-driven auto-correction
12. Multi-agent support
---
## Questions for User
1. **Which LLMs to support?** Currently Qwen only - add OpenAI/Claude/Ollama?
2. **Deployment target?** Windows only or also Mac/Linux?
3. **Cloud features?** Should Goose have cloud sync/remote execution?
4. **Monetization?** Any commercial plans affecting architecture?
5. **Performance priority?** Speed vs. reliability trade-off?
---
## Summary
Goose Super has a solid foundation but needs these critical additions to become "noob-proof":
1. **Vision** - UIAutomation element discovery (not blind clicking)
2. **Verification** - Screenshot → analyze → correct loop
3. **IDE** - Monaco editor with project management
4. **Server** - SSH/deployment capabilities
5. **Polish** - Onboarding, error handling, undo/redo
The reference repos provide excellent patterns to adopt. Windows-Use gives us UIAutomation. Browser-Use gives us smart DOM handling. Open-Interface gives us the verification loop. OpenCode gives us TUI patterns.
**Next step recommendation:** Start with Phase 1 (Vision & Smart Automation) as it unblocks all other "noob-proof" features.


@@ -0,0 +1,559 @@
# 📦 FEATURE EXTRACTION MAP - Reference Projects
> What specific code/patterns to take from each project for Goose Super
---
## 1. Windows-Use (CursorTouch)
**Repo:** https://github.com/CursorTouch/Windows-Use
### What They Have
```
windows_use/
├── agent/ # Agent orchestration loop
├── llms/ # LLM provider abstraction (Ollama, Google, OpenAI)
├── messages/ # Conversation/message handling
├── tool/ # Tool definitions for automation
└── telemetry/ # Usage analytics
```
### What We Take
| Feature | Their File | Our Implementation |
|---------|------------|-------------------|
| **UIAutomation element finder** | `tool/uia.py` | Port to `computer-use.cjs` |
| **Element grounding (labeling)** | `tool/grounding.py` | New `services/GroundingService.js` |
| **Smart click by name** | `tool/actions.py` | Enhance `input.ps1 uiclick` |
| **LLM abstraction layer** | `llms/*.py` | New `services/LLMService.js` |
| **Agent loop pattern** | `agent/agent.py` | Enhance `lib/iq-exchange.mjs` |
### Specific Code to Port
```python
# FROM: windows_use/tool/uia.py - UIAutomation element discovery
def find_element_by_name(name: str, control_type: str = None):
    """Find UI element using Windows Accessibility API"""
    automation = UIAutomation.GetRootElement()
    condition = automation.CreatePropertyCondition(
        UIA.NamePropertyId, name
    )
    if control_type:
        condition = automation.CreateAndCondition(
            condition,
            automation.CreatePropertyCondition(
                UIA.ControlTypePropertyId, control_type
            )
        )
    return automation.FindFirst(TreeScope.Descendants, condition)
```
**Port to Node.js/PowerShell:**
```powershell
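# Enhanced input.ps1 - find_element function
function Find-ElementByName {
    param([string]$Name, [string]$ControlType = $null)
    Add-Type -AssemblyName UIAutomationClient, UIAutomationTypes
    $root = [System.Windows.Automation.AutomationElement]::RootElement
    $condition = [System.Windows.Automation.PropertyCondition]::new(
        [System.Windows.Automation.AutomationElement]::NameProperty,
        $Name
    )
    # Optionally narrow the match by control type name, e.g. "Button"
    if ($ControlType) {
        $typeCondition = [System.Windows.Automation.PropertyCondition]::new(
            [System.Windows.Automation.AutomationElement]::ControlTypeProperty,
            [System.Windows.Automation.ControlType]::$ControlType
        )
        $condition = [System.Windows.Automation.AndCondition]::new($condition, $typeCondition)
    }
    return $root.FindFirst([System.Windows.Automation.TreeScope]::Descendants, $condition)
}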
```
### Grounding System (Label Elements on Screenshot)
```python
# FROM: windows_use/tool/grounding.py
def create_grounding_map(screenshot_path):
    """Overlay element labels on screenshot for AI understanding"""
    elements = get_all_interactive_elements()
    labeled_image = draw_labels_on_screenshot(screenshot_path, elements)
    return {
        "image": labeled_image,
        "elements": [
            {"id": i, "name": el.name, "type": el.type, "bounds": el.rect}
            for i, el in enumerate(elements)
        ]
    }
```
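On the Node.js side, the data half of the grounding map (numbering elements and computing click targets) can be sketched without any image library. This is a hypothetical counterpart for the proposed `services/GroundingService.js`; the input shape (`name`, `type`, `bounds` with `x`, `y`, `w`, `h`) is an assumption, and element discovery and label drawing are out of scope.

```javascript
// Hypothetical JavaScript counterpart of the grounding map: number each
// visible element and compute where a click on it should land.
function createGroundingMap(elements, screen) {
  return elements
    .filter((el) => {
      const { x, y, w, h } = el.bounds;
      // Keep only elements with a non-empty area that intersects the screen.
      return w > 0 && h > 0 && x < screen.w && y < screen.h && x + w > 0 && y + h > 0;
    })
    .map((el, id) => ({
      id,
      name: el.name,
      type: el.type,
      bounds: el.bounds,
      // Click target: the centre of the bounding box.
      center: {
        x: Math.round(el.bounds.x + el.bounds.w / 2),
        y: Math.round(el.bounds.y + el.bounds.h / 2),
      },
    }));
}
```

The resulting `id` values are what the model would refer to ("click element 3") once the same numbers are drawn onto the screenshot.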
---
## 2. Open-Interface (AmberSahdev)
**Repo:** https://github.com/AmberSahdev/Open-Interface
### What They Have
```
app/
├── core.py # Main loop: screenshot → LLM → execute → verify
├── interpreter.py # Parse LLM response into actions
├── llm.py # LLM communication
├── ui.py # Tkinter UI (18KB)
└── utils/ # Helpers
```
### What We Take
| Feature | Their File | Our Implementation |
|---------|------------|-------------------|
| **Screenshot → Execute → Verify loop** | `core.py` | Enhance IQ Exchange loop |
| **Action interpreter/parser** | `interpreter.py` | Improve `extractExecutables()` |
| **Corner detection (interrupt)** | `ui.py` | Add to Electron app |
| **Course-correction logic** | `core.py` | Add to self-heal flow |
### Specific Code to Port
```python
# FROM: app/core.py - Main automation loop
class AutomationCore:
    def run_task(self, user_request):
        while not self.task_complete and self.attempts < self.max_attempts:
            # 1. Capture current state
            screenshot = self.capture_screen()
            # 2. Send to LLM with context
            response = self.llm.analyze(
                screenshot=screenshot,
                request=user_request,
                previous_actions=self.action_history
            )
            # 3. Parse and execute actions
            actions = self.interpreter.parse(response)
            for action in actions:
                result = self.executor.run(action)
                self.action_history.append(result)
            # 4. Verify progress
            new_screenshot = self.capture_screen()
            progress = self.llm.verify_progress(screenshot, new_screenshot, user_request)
            if progress.is_complete:
                self.task_complete = True
            elif progress.is_stuck:
                self.attempts += 1
                # Self-correction: ask LLM for alternative approach
```
**Port to our IQ Exchange:**
```javascript
// Enhanced lib/iq-exchange.mjs
async process(userRequest) {
  while (!this.taskComplete && this.attempts < this.maxRetries) {
    // 1. Screenshot current state
    const before = await this.captureScreen();
    // 2. Get AI instructions
    const response = await this.sendToAI(
      this.buildPromptWithVision(userRequest, before, this.history)
    );
    // 3. Execute
    const actions = this.extractExecutables(response);
    for (const action of actions) {
      await this.executeAny(action.content);
    }
    // 4. Verify
    const after = await this.captureScreen();
    const verified = await this.verifyProgress(before, after, userRequest);
    if (verified.complete) {
      this.taskComplete = true;
    } else if (verified.stuck) {
      this.attempts++;
      // Request alternative approach
    }
  }
}
```
### Mouse Corner Interrupt
```python
# FROM: app/ui.py - Corner detection to stop automation
def check_corner_interrupt(self):
    """Stop if user drags mouse to corner (safety interrupt)"""
    x, y = pyautogui.position()
    screen_w, screen_h = pyautogui.size()
    corners = [
        (0, 0), (screen_w, 0),
        (0, screen_h), (screen_w, screen_h)
    ]
    for cx, cy in corners:
        if abs(x - cx) < 50 and abs(y - cy) < 50:
            self.stop_automation()
            return True
    return False
```
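For the Electron app, the same corner check can be written as a pure function so that any input source can feed it. This is a sketch under assumptions: `isCornerInterrupt` is a hypothetical name, and how the cursor position and screen size are obtained (for example from Electron's `screen` module) is left to the caller.

```javascript
// Hypothetical JavaScript port of the corner check. Cursor position and
// screen size are passed in, so any input library can supply them.
function isCornerInterrupt(cursor, screen, margin = 50) {
  const corners = [
    [0, 0],
    [screen.w, 0],
    [0, screen.h],
    [screen.w, screen.h],
  ];
  return corners.some(
    ([cx, cy]) => Math.abs(cursor.x - cx) < margin && Math.abs(cursor.y - cy) < margin
  );
}
```

Polling this between automation steps, and stopping the run when it returns `true`, would give the same safety interrupt as the Python version.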
---
## 3. Browser-Use
**Repo:** https://github.com/browser-use/browser-use
### What They Have (Most Comprehensive!)
```
browser_use/
├── agent/ # Agent service and logic
├── browser/ # Playwright wrapper with smart features
├── controller/ # Action controller
├── dom/ # DOM analysis and manipulation
├── code_use/ # Code execution sandbox
├── filesystem/ # File operations
├── tools/ # Custom tool definitions
├── skills/ # Reusable action patterns
├── sandbox/ # Safe execution environment
├── llm/ # LLM integrations
└── mcp/ # Model Context Protocol
```
### What We Take
| Feature | Their File | Our Implementation |
|---------|------------|-------------------|
| **Smart DOM extraction** | `dom/` | New `services/DOMService.js` |
| **Element selectors by text** | `browser/element.py` | Enhance `playwright-bridge.js` |
| **Multi-tab management** | `browser/session.py` | Add to browser panel |
| **Tools decorator pattern** | `tools/registry.py` | New tool registration system |
| **Sandbox execution** | `sandbox/` | Safe code runner |
| **Form auto-detection** | `dom/forms.py` | Auto-fill capability |
| **Smart waiting** | `browser/waits.py` | Better wait logic |
### Specific Code to Port
```python
# FROM: browser_use/dom/extractor.py - Smart DOM extraction
class DOMExtractor:
    def extract_interactive_elements(self, page):
        """Extract all clickable/fillable elements with smart selectors"""
        return page.evaluate('''() => {
            const elements = [];
            const interactive = document.querySelectorAll(
                'a, button, input, select, textarea, [role="button"], [onclick]'
            );
            for (let i = 0; i < interactive.length; i++) {
                const el = interactive[i];
                const rect = el.getBoundingClientRect();
                if (rect.width > 0 && rect.height > 0) {
                    elements.push({
                        index: i,
                        tag: el.tagName.toLowerCase(),
                        text: el.textContent?.trim().slice(0, 50) || '',
                        placeholder: el.placeholder || '',
                        id: el.id,
                        name: el.name,
                        type: el.type,
                        selector: generateUniqueSelector(el),
                        rect: { x: rect.x, y: rect.y, w: rect.width, h: rect.height }
                    });
                }
            }
            return elements;
        }''')
```
**Port to our Playwright bridge:**
```javascript
// Enhanced bin/playwright-bridge.js
async function extractInteractiveElements(page) {
  return await page.evaluate(() => {
    const elements = [];
    const selectors = 'a, button, input, select, textarea, [role="button"], [onclick]';
    document.querySelectorAll(selectors).forEach((el, i) => {
      const rect = el.getBoundingClientRect();
      if (rect.width > 0 && rect.height > 0) {
        elements.push({
          index: i,
          tag: el.tagName.toLowerCase(),
          text: (el.textContent || '').trim().slice(0, 50),
          placeholder: el.placeholder || '',
          ariaLabel: el.getAttribute('aria-label') || '',
          selector: el.id ? `#${el.id}` : generateSelector(el),
          rect: { x: rect.x, y: rect.y, w: rect.width, h: rect.height }
        });
      }
    });
    return elements;
  });
}
```
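The bridge code above calls `generateSelector(el)` without defining it. One possible shape is sketched below, assuming a simple strategy: stop at the nearest ancestor with an `id`, otherwise use `:nth-of-type()` steps. It is hypothetical, not Browser-Use's algorithm. Because Playwright serializes the callback passed to `page.evaluate()`, a helper like this would have to be declared inside that callback (or injected into the page) rather than in the Node.js scope.

```javascript
// Hypothetical generateSelector: walk up from the element, using an id when
// one exists and a positional :nth-of-type() step otherwise.
// It relies only on tagName, id, parentElement and children.
function generateSelector(el) {
  const parts = [];
  let node = el;
  while (node && node.tagName) {
    if (node.id) {
      parts.unshift(`#${node.id}`); // ids are assumed unique: stop here
      break;
    }
    const tag = node.tagName.toLowerCase();
    const parent = node.parentElement;
    if (!parent) {
      parts.unshift(tag);
      break;
    }
    const sameTag = Array.from(parent.children).filter(
      (child) => child.tagName === node.tagName
    );
    const index = sameTag.indexOf(node) + 1;
    parts.unshift(sameTag.length > 1 ? `${tag}:nth-of-type(${index})` : tag);
    node = parent;
  }
  return parts.join(' > ');
}
```

Ids containing characters that are special in CSS would additionally need escaping (for example with `CSS.escape` in the page), which this sketch leaves out.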
### Custom Tools Pattern
```python
# FROM: browser_use/tools/registry.py
from browser_use import Tools

tools = Tools()

@tools.action(description='Search for something on Google')
async def google_search(query: str):
    await browser.navigate('https://google.com')
    await browser.fill('input[name="q"]', query)
    await browser.press('Enter')
    return await browser.get_content()
```
**Port to our system:**
```javascript
// New: lib/tools-registry.js
class ToolsRegistry {
  constructor() {
    this.tools = new Map();
  }
  register(name, description, handler) {
    this.tools.set(name, { description, handler });
  }
  async execute(name, params) {
    const tool = this.tools.get(name);
    if (!tool) throw new Error(`Unknown tool: ${name}`);
    return await tool.handler(params);
  }
  getSchema() {
    return Array.from(this.tools.entries()).map(([name, tool]) => ({
      name,
      description: tool.description
    }));
  }
}
// Register tools
// (`playwright` is assumed to be the bridge object from bin/playwright-bridge.js)
const tools = new ToolsRegistry();
tools.register('google_search', 'Search Google', async ({ query }) => {
  await playwright.navigate('https://google.com');
  await playwright.fill('input[name="q"]', query);
  await playwright.press('Enter');
});
```
---
## 4. VS Code / Monaco Editor
**Repo:** https://github.com/microsoft/vscode
### What They Have
```
VS Code is massive! We only need:
├── monaco-editor # Core editor component (npm package)
├── Language services # IntelliSense for JS/TS/CSS/HTML
└── TextMate grammars # Syntax highlighting definitions
```
### What We Take
| Feature | Their Package | Our Implementation |
|---------|---------------|-------------------|
| **Monaco Editor** | `monaco-editor` npm | Embed in Electron |
| **Language services** | Built into Monaco | Enable for JS/TS/CSS/HTML |
| **File tree UI** | Custom | Build our own simple tree |
| **Multi-tab** | Custom | Build tabbed interface |
### Implementation
```javascript
// Install (run in a shell): npm install monaco-editor

// Embed in Electron (renderer)
import * as monaco from 'monaco-editor';

// Configure for web languages
monaco.languages.typescript.javascriptDefaults.setDiagnosticsOptions({
  noSemanticValidation: false,
  noSyntaxValidation: false
});

// Create editor
const editor = monaco.editor.create(document.getElementById('editor-container'), {
  value: '// Start coding...',
  language: 'javascript',
  theme: 'vs-dark',
  automaticLayout: true,
  minimap: { enabled: false },
  fontSize: 14,
  lineNumbers: 'on',
  wordWrap: 'on'
});

// Listen for changes
editor.onDidChangeModelContent(() => {
  const code = editor.getValue();
  // Auto-save or trigger preview refresh
});
```
---
## 5. Goose (Block)
**Repo:** https://github.com/block/goose
### What They Have
```
crates/
├── goose-cli/ # CLI interface
├── goose-server/ # Web server
├── goose/ # Core agent logic
└── mcp-*/ # Model Context Protocol extensions
ui/ # Web UI (React)
documentation/ # Recipes and guides
```
### What We Take
| Feature | Their Location | Our Implementation |
|---------|----------------|-------------------|
| **Recipe system** | `documentation/recipes/` | Project templates |
| **Session management** | `crates/goose/session.rs` | Session persistence |
| **Multi-agent support** | `crates/goose/agent.rs` | Future: agent switching |
| **MCP protocol** | `crates/mcp-*/` | Tool discovery |
| **Web UI patterns** | `ui/` | UI component ideas |
### Session Pattern
```rust
// FROM: crates/goose/session.rs (concept)
struct Session {
    id: String,
    messages: Vec<Message>,
    tools: Vec<Tool>,
    context: Context,
}

impl Session {
    fn save(&self, path: &Path) { /* serialize to disk */ }
    fn load(path: &Path) -> Self { /* deserialize from disk */ }
}
```
**Port to our system:**
```javascript
// New: services/SessionService.js
const fs = require('fs');
const path = require('path');

class SessionService {
  constructor(dataDir = '.opencode/sessions') {
    this.dataDir = dataDir;
    // Make sure the directory exists before the first save/list
    fs.mkdirSync(this.dataDir, { recursive: true });
  }
  save(session) {
    const file = path.join(this.dataDir, `${session.id}.json`);
    fs.writeFileSync(file, JSON.stringify({
      id: session.id,
      created: session.created,
      messages: session.messages,
      files: session.files,
      projectPath: session.projectPath
    }, null, 2));
  }
  load(sessionId) {
    const file = path.join(this.dataDir, `${sessionId}.json`);
    return JSON.parse(fs.readFileSync(file, 'utf8'));
  }
  list() {
    return fs.readdirSync(this.dataDir)
      .filter(f => f.endsWith('.json'))
      .map(f => this.load(f.replace(/\.json$/, '')));
  }
}
```
---
## 6. OpenCode TUI (SST)
**Repo:** https://github.com/sst/opencode
### What They Have
```
packages/
├── core/ # Core agent logic
├── tui/ # Terminal UI (Ink-based)
├── lsp/ # Language Server Protocol
└── providers/ # LLM providers
```
### What We Take
| Feature | Their Location | Our Implementation |
|---------|----------------|-------------------|
| **LSP integration** | `packages/lsp/` | Code intelligence |
| **Beautiful TUI** | `packages/tui/` | Reference for Ink components |
| **Provider abstraction** | `packages/providers/` | Multi-LLM support |
| **Status indicators** | `packages/tui/components/` | Progress UI |
---
## 7. Mini-Agent (MiniMax)
**Repo:** https://github.com/MiniMax-AI/Mini-Agent
### What They Have
```
mini_agent/
├── agent.py # Lightweight agent
├── tools.py # Tool definitions
└── memory.py # Context/memory management
```
### What We Take
| Feature | Their File | Our Implementation |
|---------|------------|-------------------|
| **Memory management** | `memory.py` | Context window optimization |
| **Simple agent loop** | `agent.py` | Reference for clarity |
### Memory Pattern
```python
# FROM: mini_agent/memory.py
class Memory:
    def __init__(self, max_tokens=8000):
        self.messages = []
        self.max_tokens = max_tokens

    def add(self, message):
        self.messages.append(message)
        self._prune_if_needed()

    def _count_tokens(self):
        # Placeholder estimate (assumption): roughly 4 characters per token
        return sum(len(str(m)) // 4 for m in self.messages)

    def _prune_if_needed(self):
        """Remove old messages to stay under token limit"""
        while self._count_tokens() > self.max_tokens:
            # Keep system message, remove oldest user/assistant
            if len(self.messages) > 2:
                self.messages.pop(1)
            else:
                break  # nothing left to prune; avoid looping forever
```
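Since Goose Super itself is written in JavaScript, the same pruning idea might be ported roughly as follows. This is a sketch under assumptions: messages are `{ role, content }` objects, the first message is the system prompt, and token counts are estimated at about four characters per token rather than measured with a real tokenizer.

```javascript
// Hypothetical JavaScript port of the pruning idea. Token counts are a rough
// estimate (about 4 characters per token), not a real tokenizer.
class Memory {
  constructor(maxTokens = 8000) {
    this.messages = [];
    this.maxTokens = maxTokens;
  }

  add(message) {
    this.messages.push(message);
    this.pruneIfNeeded();
  }

  countTokens() {
    return this.messages.reduce(
      (sum, m) => sum + Math.ceil(m.content.length / 4),
      0
    );
  }

  // Drop the oldest non-system message until the estimate fits.
  // The first message (assumed to be the system prompt) and the newest
  // message are always kept, which also guarantees the loop terminates.
  pruneIfNeeded() {
    while (this.countTokens() > this.maxTokens && this.messages.length > 2) {
      this.messages.splice(1, 1);
    }
  }
}
```

Dropping whole messages is the simplest policy; summarizing the removed turns instead would preserve more context at the cost of an extra model call.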
---
## Summary: Code Extraction Checklist
### Immediate Priority
- [ ] **UIAutomation from Windows-Use** → `computer-use.cjs`
- [ ] **DOM extraction from browser-use** → `playwright-bridge.js`
- [ ] **Verify loop from Open-Interface** → `lib/iq-exchange.mjs`
- [ ] **Monaco Editor from VS Code** → New editor panel
### Medium Priority
- [ ] **Tools registry from browser-use** → `lib/tools-registry.js`
- [ ] **Session management from Goose** → `services/SessionService.js`
- [ ] **Grounding system from Windows-Use** → `services/GroundingService.js`
### Nice to Have
- [ ] **LSP from OpenCode** → Code intelligence
- [ ] **Memory optimization from Mini-Agent** → Context management
- [ ] **MCP protocol from Goose** → Tool discovery
---
*This document maps exactly what code to extract from each project.*


@@ -0,0 +1,605 @@
# 🦆 GOOSE SUPER - MASTER PLAN
## Vision Statement
**Goose Super** = **Lovable** (vibe coding) + **Antigravity** (computer use) + **VS Code** (IDE) in one self-contained package.
> "Everything happens INSIDE Goose. No external tools unless user approves."
---
## Architecture Overview
```
┌────────────────────────────────────────────────────────────────────────────┐
│ GOOSE SUPER UI │
│ │
│ ┌──────────────────────────────────────────────────────────────────────┐ │
│ │ TOP BAR │ │
│ │ 🏠 Home │ 📝 Editor │ 🌐 Preview │ 🖥️ Computer │ 🔗 Server │ │
│ └──────────────────────────────────────────────────────────────────────┘ │
│ │
│ ┌─────────────────┬────────────────────────────────────────────────────┐ │
│ │ FILE TREE │ MAIN WORKSPACE │ │
│ │ │ │ │
│ │ 📁 my-project │ ┌─────────────────┬─────────────────────────┐ │ │
│ │ 📄 index.html│ │ MONACO EDITOR │ LIVE PREVIEW │ │ │
│ │ 📄 style.css │ │ │ (Built-in Browser) │ │ │
│ │ 📄 script.js │ │ [code here] │ [renders app here] │ │ │
│ │ │ │ │ │ │ │
│ │ │ └─────────────────┴─────────────────────────┘ │ │
│ │ │ │ │
│ │ │ ┌────────────────────────────────────────────┐ │ │
│ │ │ │ TERMINAL / AI CHAT │ │ │
│ │ │ │ > npm run dev │ │ │
│ │ │ │ 🧠 IQ Exchange: Building coffee shop... │ │ │
│ │ │ └────────────────────────────────────────────┘ │ │
│ └─────────────────┴────────────────────────────────────────────────────┘ │
└────────────────────────────────────────────────────────────────────────────┘
```
---
## Core Systems
### 1. 🧠 IQ Exchange (The Brain)
**What it does:** Translates natural language → executable commands → self-heals on failure
```
User: "Open notepad and type hello"
IQ Exchange Translation Layer
[powershell input.ps1 open "Notepad"]
[powershell input.ps1 waitfor "Untitled" 5]
[powershell input.ps1 type "Hello"]
Execute → Screenshot → Verify → Self-Correct if needed
```
**Current State:** ✅ Working in `lib/iq-exchange.mjs` (536 lines)
- Translation layer with task type detection
- Execution with retry loop (max 5 attempts)
- Self-healing prompts sent to AI on failure
- Supports: Browser (Playwright), Desktop (input.ps1), Shell commands
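The retry-and-self-heal behavior listed above can be sketched as follows. This is an illustration under assumptions, not the code in `lib/iq-exchange.mjs`: `runWithSelfHeal`, `execute`, and `heal` are hypothetical names, and the executor and healer are injected so the loop can be exercised without a real AI or desktop.

```javascript
// Run a list of commands; on failure, ask a "healer" for replacement commands and retry.
// `execute(cmd)` runs one command and throws on failure.
// `heal(commands, error)` returns a corrected command list (e.g. from the AI).
async function runWithSelfHeal(commands, execute, heal, maxAttempts = 5) {
  let current = commands;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const results = [];
      for (const cmd of current) {
        results.push(await execute(cmd));
      }
      return { ok: true, attempts: attempt, results };
    } catch (error) {
      if (attempt === maxAttempts) {
        return { ok: false, attempts: attempt, error: String(error.message ?? error) };
      }
      // Send the failure context back and retry with whatever is proposed.
      current = await heal(current, error);
    }
  }
  return { ok: false, attempts: 0, error: 'maxAttempts must be at least 1' };
}
```

Capping the loop at five attempts mirrors the limit stated above and guarantees the loop terminates even if the healer keeps proposing failing commands.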
**Improvements Needed:**
| Gap | Fix |
|-----|-----|
| No visual verification | Add screenshot → OCR → compare after each action |
| Limited element finding | Add UIAutomation element discovery |
| Looping is basic | Enhance with success criteria detection |
---
### 2. 📝 Built-In IDE (Monaco Editor)
**Purpose:** Write, edit, and manage code without leaving Goose
**Features to Add:**
- **Monaco Editor** from VS Code (same core)
- **File Tree** sidebar with create/delete/rename
- **Multi-Tab** editing
- **Syntax Highlighting** for all languages
- **IntelliSense** for JS/TS/CSS/HTML
- **Live Error Detection** (red underlines)
- **Git Panel** for version control
**Integration:**
```javascript
// Install first (shell command): npm install monaco-editor
// Embed in the Electron renderer
import * as monaco from 'monaco-editor';
const editor = monaco.editor.create(document.getElementById('editor'), {
  value: '// Start coding here', // initial buffer contents
  language: 'javascript',
  theme: 'vs-dark'
});
```
---
### 3. 🌐 Built-In Browser (Preview & Automation)
**Purpose:** Preview apps AND automate web tasks without external browsers
**Features:**
| Feature | Implementation |
|---------|----------------|
| Live Preview | Electron BrowserView (Chromium) |
| Hot Reload | Watch file changes → refresh |
| Dev Tools | Chrome DevTools Protocol |
| DOM Inspector | Expose element tree to AI |
| Multi-Tab | Multiple BrowserViews in tabs |
| Screenshot | `browserView.webContents.capturePage()` |
**Goose Always Uses Built-In Browser:**
```
1. AI generates HTML/CSS/JS
2. Files saved to project folder
3. Auto-starts Vite dev server
4. Preview loads in built-in browser
5. Changes appear instantly (hot reload)
```
**Only Ask User for External Browser When:**
- OAuth login requires real browser
- Site blocks embedded browsers
- User explicitly requests "open in Chrome"
---
### 4. 🖥️ Computer Use (Desktop Automation)
**Purpose:** Control Windows desktop - click, type, screenshot, etc.
**Current State:** ✅ Working in `computer-use.cjs` + `input.ps1`
**Tools Available:**
```powershell
# Vision (The Eyes)
input.ps1 app_state "Window Name" # UI tree structure
input.ps1 ocr "full" # Read all text on screen
input.ps1 screenshot "file.png" # Capture screen
# Navigation
input.ps1 open "App Name" # Launch/focus app
input.ps1 waitfor "Text" 10 # Wait for text to appear
input.ps1 focus "Element" # Focus element
# Interaction (Reliable via UIAutomation)
input.ps1 uiclick "Button Name" # Click by name
input.ps1 uipress "Item Name" # Toggle/select
input.ps1 type "Hello World" # Type text
input.ps1 key "Enter" # Press key
input.ps1 hotkey "Ctrl+C" # Key combo
# Fallback (Coordinates)
input.ps1 mouse 100 200 # Move mouse
input.ps1 click # Click at position
```
**Improvements Needed:**
- Replace PowerShell with native Node.js (`robotjs`/`nut.js`) for speed
- Add element grounding (label screenshot with clickable elements)
- Better OCR integration for finding text on screen
---
### 5. 🔗 Server Management (SSH/Deploy)
**Purpose:** Connect to remote servers, run commands, deploy apps
**Features to Add:**
```javascript
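// Using the ssh2 library
const { Client } = require('ssh2');
const conn = new Client();
conn.on('ready', () => {
  conn.exec('pm2 restart all', (err, stream) => {
    if (err) throw err;
    stream.on('data', (data) => console.log(data.toString()));
    stream.on('close', () => conn.end()); // close the connection when the command finishes
  });
});
conn.on('error', (err) => console.error('SSH error:', err.message));
conn.connect({
  host: '86.105.224.125',
  username: 'root',
  // Assumption: the secret is supplied via an SSH_PASSWORD environment variable
  password: process.env.SSH_PASSWORD
});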
```
**UI Panel:**
- Saved connections list
- Command input with history
- Log stream viewer
- SFTP file browser
---
## Complete Tool Ecosystem
### Always Built-In (Never Ask)
| Tool | Purpose |
|------|---------|
| Monaco Editor | Code editing |
| File Manager | Create/edit/delete files |
| Terminal | Run commands |
| Live Preview | Show app in browser |
| Dev Server | Vite for hot reload |
| Console Output | Show logs |
| Screenshot | Capture for verification |
### Ask Once (Then Remember)
| Tool | When to Ask |
|------|-------------|
| SSH Connection | First time connecting to server |
| API Keys | First time using external API |
| Git Credentials | First time pushing to repo |
### Ask Every Time (Safety)
| Tool | Why |
|------|-----|
| External Browser | User might not want visible browser |
| System Settings | Prevent accidental changes |
| File Delete | Prevent data loss |
| Server Restart | Prevent downtime |
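The three permission tiers above can be expressed as a small lookup. This is a minimal sketch, assuming hypothetical tool identifiers (`'ssh-connection'`, `'file-delete'`, and so on) and a hypothetical `needsPrompt` helper; unknown tools are treated as "ask every time" so the default is the safe one.

```javascript
// Tiers copied from the tables above; the tool keys are hypothetical identifiers.
const PERMISSION_TIERS = {
  never: ['monaco-editor', 'file-manager', 'terminal', 'live-preview',
          'dev-server', 'console-output', 'screenshot'],
  once: ['ssh-connection', 'api-keys', 'git-credentials'],
  always: ['external-browser', 'system-settings', 'file-delete', 'server-restart'],
};

// Returns true when the user must be prompted before the tool runs.
// `remembered` is a Set of "ask once" tools the user has already approved.
function needsPrompt(tool, remembered = new Set()) {
  if (PERMISSION_TIERS.never.includes(tool)) return false;
  if (PERMISSION_TIERS.once.includes(tool)) return !remembered.has(tool);
  // "Ask every time" tools and anything unknown fall through to a prompt.
  return true;
}
```

For example, `needsPrompt('ssh-connection')` is true on first use and false once `'ssh-connection'` is in the remembered set, while `needsPrompt('file-delete')` stays true regardless.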
---
## IQ Exchange Flow (Enhanced)
```
┌─────────────────────────────────────────────────────────────────────┐
│ IQ EXCHANGE MAIN LOOP │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ User Request: "Build me a coffee shop landing page" │
│ ↓ │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ STEP 1: DETECT TASK TYPE │ │
│ │ → [code, file, browser] │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ ↓ │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ STEP 2: TRANSLATION LAYER │ │
│ │ "Build landing page" → │ │
│ │ • Create index.html │ │
│ │ • Create style.css │ │
│ │ • Create script.js │ │
│ │ • Start dev server │ │
│ │ • Open in preview │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ ↓ │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ STEP 3: EXECUTE (with built-in tools) │ │
│ │ [FILE] Write index.html │ │
│ │ [FILE] Write style.css │ │
│ │ [FILE] Write script.js │ │
│ │ [TERMINAL] npx vite │ │
│ │ [PREVIEW] Load http://localhost:5173 │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ ↓ │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ STEP 4: VERIFY (screenshot + analyze) │ │
│ │ • Screenshot preview panel │ │
│ │ • Send to AI: "Did this render correctly?" │ │
│ │ • If error detected → STEP 5 │ │
│ │ • If success → COMPLETE │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ ↓ │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ STEP 5: SELF-HEAL (if verification failed) │ │
│ │ • Analyze error │ │
│ │ • Generate fix │ │
│ │ • Re-execute │ │
│ │ • Retry up to 5 times │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ ↓ │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ STEP 6: LOOP (for persistent tasks) │ │
│ │ • Continue until user says stop │ │
│ │ • Or until success criteria met │ │
│ │ • Keep chat context for follow-ups │ │
│ └─────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────┘
```
---
## Session & Project Management
### Home Screen (First Thing User Sees)
```
┌──────────────────────────────────────────────────────────────┐
│ 🦆 GOOSE SUPER │
├──────────────────────────────────────────────────────────────┤
│ │
│ RECENT PROJECTS [+ New Project]│
│ ───────────────────────────────────────────────────────────│
│ 📁 Coffee Shop Landing Last edited: 2 hours ago [Open] │
│ 📁 Dashboard App Last edited: Yesterday [Open] │
│ 📁 API Server Last edited: 3 days ago [Open] │
│ │
│ RECENT SESSIONS [View All] │
│ ───────────────────────────────────────────────────────────│
│ 💬 "Build landing page" Dec 16, 3:30 PM [Load] │
│ 💬 "Fix login bug" Dec 15, 10:00 AM [Load] │
│ 💬 "Server setup help" Dec 14, 5:00 PM [Load] │
│ │
└──────────────────────────────────────────────────────────────┘
```
### Task Tracker (During Coding)
```
┌──────────────────────────────────────────────────────────────┐
│ ✅ TASK TRACKER [X] │
├──────────────────────────────────────────────────────────────┤
│ Current Goal: "Build coffee shop landing page" │
│ ───────────────────────────────────────────────── │
│ [x] Create project folder │
│ [x] Write index.html structure │
│ [/] Style with CSS (in progress...) │
│ [ ] Add JavaScript interactions │
│ [ ] Test responsiveness │
│ │
│ [Save Progress] [Clear Completed] [+ Add Task] │
└──────────────────────────────────────────────────────────────┘
```
### Data Storage
```javascript
// .opencode/projects.json - All projects
{
"projects": [
{
"id": "proj_001",
"name": "Coffee Shop Landing",
"path": "E:/Projects/coffee-shop",
"lastOpened": "2024-12-16T14:30:00Z",
"files": ["index.html", "style.css", "script.js"]
}
]
}
// .opencode/sessions/session_001.json - Chat sessions
{
"id": "session_001",
"title": "Build landing page",
"created": "2024-12-16T14:30:00Z",
"projectId": "proj_001",
"messages": [...],
"tasks": [
{ "text": "Create index.html", "done": true },
{ "text": "Style with CSS", "done": false }
]
}
```
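Working with these two records might look like the sketch below. It assumes the JSON shapes shown above; `touchProject` and `taskProgress` are hypothetical helper names, and both are pure functions so reading and writing the files stays a separate concern.

```javascript
// Insert or update a project record and keep the list sorted by most recently opened.
// Returns a new store object instead of mutating the input or writing to disk.
function touchProject(store, project, now = new Date()) {
  const entry = { ...project, lastOpened: now.toISOString() };
  const others = store.projects.filter((p) => p.id !== project.id);
  const projects = [entry, ...others].sort(
    (a, b) => Date.parse(b.lastOpened) - Date.parse(a.lastOpened)
  );
  return { ...store, projects };
}

// Fraction of a session's tasks that are done, e.g. for a Task Tracker progress bar.
function taskProgress(session) {
  const tasks = session.tasks ?? [];
  if (tasks.length === 0) return 0;
  return tasks.filter((t) => t.done).length / tasks.length;
}
```

Sorting by `lastOpened` on every update means the Home Screen can render the "Recent Projects" list directly from the stored order.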
---
## Implementation Phases
### Phase 0: Critical Fixes (FIRST - Before New Features)
> ⚠️ These are existing bugs that must be fixed before adding new capabilities
| Task | Effort | Priority |
|------|--------|----------|
| **2. Fix internal file save + preview pipeline** | 2 hours | 🔴 CRITICAL |
| **3. Fix layout squeeze and panel overlap** | 2 hours | 🔴 CRITICAL |
| **4. Repair IQ Exchange and action loop** | 3 hours | 🔴 CRITICAL |
| **5. Restore TODO/checklist sidebar + animations** | 2 hours | 🔴 CRITICAL |
| **6. Verify Playwright/browser automation flows** | 2 hours | 🔴 CRITICAL |
| **7. Run smoke tests and document results** | 1 hour | 🔴 CRITICAL |
**Details:**
2. **Internal File Save + Preview Pipeline**
   - Fix file write operations in `main.cjs`
   - Ensure saved files trigger preview refresh
   - Verify hot-reload works with Vite/dev server
   - Test create → save → preview flow end-to-end
3. **Layout Squeeze & Panel Overlap**
   - Fix CSS flexbox/grid issues in `styles.css`
   - Ensure panels resize correctly
   - Test all panel combinations
4. **IQ Exchange & Action Loop**
   - Debug `lib/iq-exchange.mjs` translation layer
   - Fix action execution in `renderer.js`
   - Verify self-healing retry loop works
5. **TODO/Checklist Sidebar**
   - Restore sidebar visibility
   - Add smooth slide animations
   - Persist tasks to session
6. **Playwright/Browser Automation**
   - Test all `playwright-bridge.js` commands
   - Verify session persistence
   - Fix any broken selectors
7. **Smoke Tests**
   - Run full QA checklist
   - Document pass/fail in `tests/SMOKE_TEST_RESULTS.md`
   - Fix any critical failures
### ⚡ Optimized Phase 0 Execution Strategy (Batching)
> **Goal:** Complete the estimated 12 hours of Phase 0 work in ~6 hours by grouping related file edits.
| Batch | Tasks Covered | Files Touched |
|-------|---------------|---------------|
| **BATCH 1** | #2 (Save), #3 (Layout), #5 (Sidebar) | `renderer.js`, `styles.css`, `main.cjs` |
| **BATCH 2** | #4 (IQ Loop) | `lib/iq-exchange.mjs`, `renderer.js` |
| **BATCH 3** | #6 (Playwright), #7 (Smoke Tests) | `playwright-bridge.js`, `tests/*` |
---
## Safety & Telemetry (Critical Protocol)
1. **🛑 Global STOP Button**: Must be visible in the UI at all times. Clicking it immediately kills `input.ps1` and Playwright processes.
2. **🔒 Safe Mode**: Computer Use defaults to "Ask for Confirmation" before every click. It includes a toggle to switch to "**AUTO ACCEPT**" mode for trusted sessions.
3. **📝 Log Panel**: A minimal log viewer must be added to the bottom panel to debug "silent" failures.
- **Smart Error Handling**: Any error log must include an "**IQ Exchange**" button. Clicking it sends the error context back to the AI with a prompt to analyze, generate a fix, and self-heal.
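
One way to back the global STOP button is a registry of spawned automation processes. The sketch below is an assumption about how this could be structured: `ProcessRegistry` is a hypothetical class, and it only relies on each tracked object having `kill()` and `once('exit', ...)`, which is what `child_process.spawn()` returns in Node.js.

```javascript
// Tracks spawned automation processes so a global STOP can terminate all of them.
class ProcessRegistry {
  constructor() {
    this.children = new Set();
  }

  // Register anything with kill() and once('exit', ...), e.g. a spawn() result.
  track(child) {
    this.children.add(child);
    child.once('exit', () => this.children.delete(child));
    return child;
  }

  // Signal every tracked process; returns how many kill calls reported success.
  stopAll(signal = 'SIGTERM') {
    let stopped = 0;
    for (const child of this.children) {
      if (child.kill(signal)) stopped++;
    }
    this.children.clear();
    return stopped;
  }
}
```

The STOP handler would then call `registry.stopAll()`. Whether grandchild processes started by `input.ps1` also exit is platform-dependent and would need to be checked during manual QA.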
---
### Phase 1: Core IDE Experience (Week 1-2)
| Task | Effort | Priority |
|------|--------|----------|
| Monaco Editor integration | 6 hours | 🔴 CRITICAL |
| File tree panel | 4 hours | 🔴 CRITICAL |
| Multi-tab editing | 3 hours | 🔴 CRITICAL |
| Live preview with hot reload | 4 hours | 🔴 CRITICAL |
| Project save/load | 2 hours | 🟠 HIGH |
| Template scaffolding | 2 hours | 🟠 HIGH |
### Phase 2: Enhanced IQ Exchange (Week 2-3)
| Task | Effort | Priority |
|------|--------|----------|
| Screenshot verification loop | 3 hours | 🔴 CRITICAL |
| UIAutomation element finder | 4 hours | 🔴 CRITICAL |
| Success criteria detection | 2 hours | 🟠 HIGH |
| Better looping logic | 2 hours | 🟠 HIGH |
| Native Node.js automation (robotjs) | 4 hours | 🟡 MEDIUM |
### Phase 3: Built-In Browser (Week 3-4)
| Task | Effort | Priority |
|------|--------|----------|
| Multi-tab BrowserView | 4 hours | 🟠 HIGH |
| DOM inspector for AI | 3 hours | 🟠 HIGH |
| Dev tools integration | 3 hours | 🟡 MEDIUM |
| Network monitor | 2 hours | 🟡 MEDIUM |
### Phase 4: Server & Deploy (Week 4-5)
| Task | Effort | Priority |
|------|--------|----------|
| SSH connection panel | 4 hours | 🟠 HIGH |
| Remote command execution | 2 hours | 🟠 HIGH |
| SFTP file transfer | 3 hours | 🟡 MEDIUM |
| Log streaming | 2 hours | 🟡 MEDIUM |
### Phase 5: Polish & UX (Week 5-6)
| Task | Effort | Priority |
|------|--------|----------|
| Onboarding wizard | 3 hours | 🟡 MEDIUM |
| Modularize renderer.js | 4 hours | 🟡 MEDIUM |
| Visual polish (dark mode, animations) | 4 hours | 🟡 MEDIUM |
| Error recovery UI | 2 hours | 🟡 MEDIUM |
---
## File Changes Summary
### New Files to Create
```
bin/goose-electron-app/
├── panels/
│ ├── EditorPanel.js # Monaco wrapper
│ ├── FileTreePanel.js # Project file browser
│ ├── PreviewPanel.js # BrowserView wrapper
│ ├── TerminalPanel.js # xterm.js terminal
│ ├── BrowserPanel.js # Web browsing (separate from preview)
│ └── ServerPanel.js # SSH/deployment UI
├── services/
│ ├── ProjectService.js # Save/load projects
│ ├── DevServerService.js # Vite integration
│ ├── SSHService.js # SSH connections
│ └── VerificationService.js # Screenshot → analyze loop
└── components/
├── OnboardingWizard.js # First-run setup
└── PermissionDialog.js # External tool approval
```
### Files to Modify
| File | Changes |
|------|---------|
| `main.cjs` | Add IPC handlers for new services |
| `renderer.js` | Refactor into modular panels |
| `preload.js` | Expose new APIs to renderer |
| `index.html` | Add panel containers, tabs |
| `styles.css` | Panel layouts, dark theme polish |
| `lib/iq-exchange.mjs` | Enhance with verification loop |
| `computer-use.cjs` | Add UIAutomation element finder |
### Dependencies to Add
```json
{
"dependencies": {
"monaco-editor": "^0.45.0",
"xterm": "^5.3.0",
"xterm-addon-fit": "^0.8.0",
"ssh2": "^1.15.0",
"isomorphic-git": "^1.25.0",
"robotjs": "^0.6.0"
}
}
```
---
## Verification Plan
### Automated Tests
```bash
# Run existing tests
npm test
# Add new tests for:
# - File operations
# - Project save/load
# - IQ Exchange translation
# - Screenshot capture
```
### Manual Testing Checklist
1. **Editor Test**
- Create new project
- Write HTML/CSS/JS
- See live preview update
2. **IQ Exchange Test**
- Say "open notepad"
- Verify it opens
- Say "type hello world"
- Verify text appears
3. **Browser Test**
- Navigate to google.com in built-in browser
- Search for "weather"
- Verify results appear
4. **Server Test**
- Connect via SSH
- Run `ls -la`
- Verify output shows
---
## Questions for User Approval
1. **Start with Phase 1?** (Monaco Editor + File Tree + Preview)
2. **Skip any phases?** (Maybe delay Server/SSH if not needed)
3. **Specific LLMs?** (Keep Qwen only or add OpenAI/Claude?)
4. **Design preferences?** (Specific colors, layouts?)
5. **Priority features?** (What to do first if time is limited?)
---
## Summary
**Goose Super will become a self-contained AI coding powerhouse that:**
✅ Writes code in a full IDE (Monaco Editor)
✅ Previews apps instantly (Built-in Browser)
✅ Controls the desktop intelligently (IQ Exchange + UIAutomation)
✅ Deploys to servers (SSH/SFTP)
✅ Self-heals on errors (Translation Layer + Looping)
✅ Never uses external tools without permission
**Total estimated effort: ~84 hours over 6 weeks** (12 hours for Phase 0 plus 72 hours for Phases 1-5, summing the task tables above)
---
*Ready for your approval to begin execution.*


@@ -0,0 +1,680 @@
# 🦆 GOOSE SUPER - CORE SUBSYSTEMS DETAILED SPECIFICATIONS
## Table of Contents
1. [Playwright Integration](#1-playwright-integration)
2. [Browser Use System](#2-browser-use-system)
3. [Computer Use System](#3-computer-use-system)
4. [Full Vision Capabilities](#4-full-vision-capabilities)
5. [Vibe Server Management](#5-vibe-server-management)
---
# 1. Playwright Integration
## Purpose
Automate web browsers with precision - navigate, click, fill forms, extract data, take screenshots.
## Architecture
```
┌────────────────────────────────────────────────────────────────┐
│ PLAYWRIGHT SUBSYSTEM │
├────────────────────────────────────────────────────────────────┤
│ │
│ User: "Search for hotels in Dubai on Booking.com" │
│ ↓ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ IQ Exchange Translation Layer │ │
│ │ → BROWSER action detected │ │
│ │ → Route to Playwright Bridge │ │
│ └──────────────────────────────────────────────────────────┘ │
│ ↓ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Playwright Bridge (bin/playwright-bridge.js) │ │
│ │ │ │
│ │ Commands: │ │
│ │ navigate <url> → Go to URL │ │
│ │ click <selector|text> → Click element │ │
│ │ fill <selector> <text> → Type in input │ │
│ │ type <text> → Type at cursor │ │
│ │ press <key> → Press key (Enter, Tab, etc) │ │
│ │ wait <selector> <ms> → Wait for element │ │
│ │ screenshot <file> → Capture page │ │
│ │ content → Get page text │ │
│ │ elements → List all elements │ │
│ │ close → Close browser session │ │
│ └──────────────────────────────────────────────────────────┘ │
│ ↓ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Chromium Browser (Headless or Visible) │ │
│ │ • Persistent session across commands │ │
│ │ • Cookies & auth remembered │ │
│ │ • DevTools access for debugging │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │
└────────────────────────────────────────────────────────────────┘
```
## Current State vs. Enhancement
| Feature | Current | Enhanced |
|---------|---------|----------|
| Session persistence | ✅ Basic | ✅ Full cookie/auth persistence |
| Element selection | Text/CSS only | Smart AI-assisted selector |
| Form filling | Manual fields | Auto-detect all form fields |
| Multi-tab | ❌ Single tab | ✅ Multiple tabs |
| Screenshots | ✅ Basic | ✅ With element annotations |
| DOM extraction | ❌ | ✅ Structured for AI analysis |
## Enhanced Playwright Bridge Commands
```bash
# NEW: Smart element finding (AI-assisted)
node playwright-bridge.js smart-click "the search button"
node playwright-bridge.js smart-fill "email" "user@example.com"
# NEW: Element discovery for AI
node playwright-bridge.js list-interactive   # Lists all clickable/fillable elements
node playwright-bridge.js describe-page      # AI-friendly page description
# NEW: Multi-tab
node playwright-bridge.js new-tab
node playwright-bridge.js switch-tab 2
node playwright-bridge.js list-tabs
# NEW: Authentication helpers
node playwright-bridge.js save-cookies "site-name"
node playwright-bridge.js load-cookies "site-name"
# NEW: Form automation
node playwright-bridge.js auto-fill-form '{"email":"x","password":"y"}'
```
## Integration with IQ Exchange
```javascript
// In lib/iq-exchange.mjs - TASK_PATTERNS
const TASK_PATTERNS = {
  browser: /\b(website|browser|search|navigate|booking\.com|amazon|google|fill.*form|click.*button|login|signup)\b/i,
  // ...patterns for the other task types
};
// Translation example:
// User: "Book a hotel in Dubai on Booking.com"
// IQ Exchange generates:
const commands = [
  'node playwright-bridge.js navigate "https://booking.com"',
  'node playwright-bridge.js wait "[name=ss]" 5000',
  'node playwright-bridge.js fill "[name=ss]" "Dubai hotels"',
  'node playwright-bridge.js click "[type=submit]"',
  'node playwright-bridge.js screenshot "search-results.png"'
];
```
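Task type detection from these patterns can be sketched as below. Only the `browser` pattern comes from the snippet above; the `desktop` and `server` patterns and the `detectTaskTypes` name are hypothetical placeholders added to make the example self-contained.

```javascript
// `browser` is the pattern shown above; `desktop` and `server` are hypothetical
// placeholders, not the real patterns from lib/iq-exchange.mjs.
const TASK_PATTERNS = {
  browser: /\b(website|browser|search|navigate|booking\.com|amazon|google|fill.*form|click.*button|login|signup)\b/i,
  desktop: /\b(notepad|paint|spotify|desktop|window)\b/i,
  server: /\b(server|ssh|nginx|deploy)\b/i,
};

// Return every task type whose pattern matches the request, in declaration order.
function detectTaskTypes(request) {
  return Object.entries(TASK_PATTERNS)
    .filter(([, pattern]) => pattern.test(request))
    .map(([type]) => type);
}
```

Returning a list rather than a single type lets one request route to several subsystems, for instance a deploy request that touches both the browser preview and the server module.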
---
# 2. Browser Use System
## Purpose
Built-in browser panel inside Goose for:
1. **App Preview** - See your code rendered live
2. **Web Browsing** - AI can browse the web inside Goose
3. **Automation Control** - Visual feedback for Playwright actions
## Architecture
```
┌────────────────────────────────────────────────────────────────┐
│ BROWSER USE SYSTEM │
├────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ BUILT-IN BROWSER (Electron BrowserView) │ │
│ │ │ │
│ │ ┌───────────────────────────────────────────────────┐ │ │
│ │ │ 🌐 http://localhost:5173 🔄 ◀ ▶ 🔍 🛠️ 📸 │ │ │
│ │ ├───────────────────────────────────────────────────┤ │ │
│ │ │ │ │ │
│ │ │ YOUR APP RENDERS HERE │ │ │
│ │ │ │ │ │
│ │ │ [Live preview of HTML/CSS/JS you write] │ │ │
│ │ │ │ │ │
│ │ │ │ │ │
│ │ └───────────────────────────────────────────────────┘ │ │
│ │ │ │
│ │ Toolbar: │ │
│ │ 🔄 Refresh ◀ Back ▶ Forward 🔍 URL Bar │ │
│ │ 🛠️ DevTools 📸 Screenshot 📍 Element Inspect │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │
│ Modes: │
│ [Preview] - Shows your local app (default) │
│ [Browse] - Navigate to any URL │
│ [Automate]- Watch Playwright actions in real-time │
│ │
└────────────────────────────────────────────────────────────────┘
```
## Features
### Preview Mode (Vibe Coding)
```
User: "Make me a landing page"
Goose writes HTML/CSS/JS
Auto-starts Vite dev server (port 5173)
Built-in browser loads http://localhost:5173
User: "Make the header blue"
Goose edits CSS → Hot reload → Preview updates instantly
```
### Browse Mode
```
User: "Go to GitHub and star the browser-use repo"
Built-in browser navigates to github.com
AI uses Playwright to interact with the page
User can SEE every action happening
```
### Automate Mode (Visible Automation)
```
When Playwright runs:
1. Built-in browser shows the exact page
2. Red highlight appears on clicked elements
3. Green highlight on filled inputs
4. Status bar shows current action
```
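The click and fill highlights could be produced by injecting a small script into the built-in browser. This is a sketch under assumptions: `buildHighlightScript` is a hypothetical helper that only builds the script string, which would then be passed to `webContents.executeJavaScript()`.

```javascript
// Build a script string that briefly outlines one element in the page.
// JSON.stringify escapes the selector and color so they cannot break out of the script.
function buildHighlightScript(selector, color = 'red', durationMs = 800) {
  return `(() => {
    const el = document.querySelector(${JSON.stringify(selector)});
    if (!el) return false;
    const previous = el.style.outline;
    el.style.outline = '3px solid ' + ${JSON.stringify(color)};
    setTimeout(() => { el.style.outline = previous; }, ${Number(durationMs)});
    return true;
  })()`;
}
```

A click on the search button might then be preceded by `browserView.webContents.executeJavaScript(buildHighlightScript('[type=submit]', 'red'))`, with `'green'` used for filled inputs.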
## IPC Handlers (main.cjs)
```javascript
// Browser panel control
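ipcMain.handle('browser:create-view', async (event, options = {}) => {
  // Isolate page content from Node.js; callers may pass extra webPreferences
  browserView = new BrowserView({
    webPreferences: { contextIsolation: true, nodeIntegration: false, ...options.webPreferences }
  });
  mainWindow.addBrowserView(browserView);
  // Assumed option: the view is not visible until it has bounds
  if (options.bounds) browserView.setBounds(options.bounds);
  return { success: true };
});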
ipcMain.handle('browser:navigate', async (event, url) => {
await browserView.webContents.loadURL(url);
return { success: true, url };
});
ipcMain.handle('browser:screenshot', async () => {
const image = await browserView.webContents.capturePage();
return { success: true, base64: image.toDataURL() };
});
ipcMain.handle('browser:execute-js', async (event, script) => {
const result = await browserView.webContents.executeJavaScript(script);
return { success: true, result };
});
ipcMain.handle('browser:get-dom', async () => {
const dom = await browserView.webContents.executeJavaScript(`
Array.from(document.querySelectorAll('a, button, input, select, textarea'))
.map(el => ({
tag: el.tagName,
text: el.textContent?.trim().slice(0, 50),
id: el.id,
name: el.name,
type: el.type,
placeholder: el.placeholder
}))
`);
return { success: true, elements: dom };
});
```
---
# 3. Computer Use System
## Purpose
Control the Windows desktop - open apps, click UI elements, type text, take screenshots.
## Architecture
```
┌────────────────────────────────────────────────────────────────┐
│ COMPUTER USE SYSTEM │
├────────────────────────────────────────────────────────────────┤
│ │
│ User: "Open Spotify and play some jazz" │
│ ↓ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ IQ Exchange Translation Layer │ │
│ │ → DESKTOP action detected │ │
│ │ → Route to Computer Use module │ │
│ └──────────────────────────────────────────────────────────┘ │
│ ↓ │
│ ┌────────────────────────────────────┬─────────────────────┐ │
│ │ input.ps1 (PowerShell) │ computer-use.cjs │ │
│ │ (Current - 66KB!) │ (Node.js native) │ │
│ │ │ │ │
│ │ • UIAutomation via .NET │ • robotjs for speed │ │
│ │ • OCR via Windows API │ • screenshot-desktop│ │
│ │ • Window management │ • Cross-platform │ │
│ └────────────────────────────────────┴─────────────────────┘ │
│ ↓ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Windows Desktop │ │
│ │ • Apps, dialogs, menus │ │
│ │ • Task bar, system tray │ │
│ │ • Any visible UI element │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │
└────────────────────────────────────────────────────────────────┘
```
## Available Commands (input.ps1)
### Vision Commands (The Eyes)
```powershell
# See what's on screen
input.ps1 screenshot "screen.png" # Capture full screen
input.ps1 ocr "full" # Read ALL text on screen
input.ps1 ocr "region" 100 100 500 400 # Read text in region
input.ps1 app_state "Notepad" # Get UI tree of app (buttons, menus, etc.)
```
### Navigation Commands
```powershell
# Open and focus apps
input.ps1 open "Spotify" # Launch or focus app
input.ps1 startmenu # Open Windows Start menu
input.ps1 focus "Save As" # Focus specific dialog/element
input.ps1 waitfor "Ready" 10 # Wait for text to appear (max 10s)
```
### Interaction Commands (Smart - via UIAutomation)
```powershell
# Click by element name (RELIABLE!)
input.ps1 uiclick "Play" # Click button named "Play"
input.ps1 uiclick "File" # Click menu item
input.ps1 uipress "Remember me" # Toggle checkbox
input.ps1 uipress "Jazz" # Select list item
# Type text
input.ps1 type "Hello World" # Type into focused element
input.ps1 keyboard "search query" # Alternative typing method
input.ps1 key "Enter" # Press single key
input.ps1 hotkey "Ctrl+S" # Key combination
```
### Fallback Commands (Coordinate-based)
```powershell
# When UIAutomation fails, use coordinates
input.ps1 mouse 500 300 # Move mouse to x,y
input.ps1 click # Click at current position
input.ps1 drag 100 100 500 500 # Drag from point to point
```
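When these commands are launched from Node.js, passing an argument array avoids shell-quoting problems with names such as `"Save As"`. The sketch below is an assumption about the calling side: `buildInputArgs` is a hypothetical helper, and the `bin` script directory is a guess at where `input.ps1` lives.

```javascript
const path = require('node:path');

// Build the argument vector for running input.ps1 through PowerShell
// without assembling a shell string. `scriptDir` is an assumed location.
function buildInputArgs(command, params = [], scriptDir = 'bin') {
  // Only allow simple command names such as "uiclick" or "app_state".
  if (!/^[a-z_]+$/.test(command)) {
    throw new Error(`Unexpected input.ps1 command: ${command}`);
  }
  return [
    '-NoProfile',
    '-ExecutionPolicy', 'Bypass',
    '-File', path.join(scriptDir, 'input.ps1'),
    command,
    ...params.map(String),
  ];
}
```

The result can be handed to `execFile('powershell.exe', buildInputArgs('mouse', [500, 300]))` from `node:child_process`, which passes each value as a separate argument.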
## Example Flow
```
User: "Open Paint and draw a red circle"
IQ Exchange generates:
1. input.ps1 open "Paint"
2. input.ps1 waitfor "Untitled" 5
3. input.ps1 uiclick "Brushes"
4. input.ps1 uiclick "Red"
5. input.ps1 mouse 400 300
6. input.ps1 drag 400 300 500 400
After each step:
→ Screenshot
→ OCR/app_state to verify
→ If failed → Self-heal and retry
```
## Enhancements Needed
| Current | Enhanced |
|---------|----------|
| PowerShell scripts (slow startup) | Node.js native with robotjs |
| Basic element finding | Smart element grounding with labels |
| Manual verification | Auto-screenshot after each action |
| English-only OCR | Multi-language OCR support |
---
# 4. Full Vision Capabilities
## Purpose
Let the AI "see" the screen like a human - understand what's visible, where elements are, and verify actions worked.
## Architecture
```
┌────────────────────────────────────────────────────────────────┐
│ VISION SYSTEM │
├────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ SCREENSHOT LAYER │ │
│ │ • Full screen capture │ │
│ │ • Window-specific capture │ │
│ │ • Region capture │ │
│ │ • Before/after comparison │ │
│ └──────────────────────────────────────────────────────────┘ │
│ ↓ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ ANALYSIS LAYER │ │
│ │ │ │
│ │ ┌────────────────┬────────────────┬──────────────────┐ │ │
│ │ │ OCR Engine │ UI Automation │ AI Vision (LLM) │ │ │
│ │ │ │ │ │ │ │
│ │ │ • Read text │ • Element tree │ • Understand │ │ │
│ │ │ • Find words │ • Button names │ context │ │ │
│ │ │ • Coordinates │ • Input fields │ • Verify success │ │ │
│ │ │ of text │ • Menus │ • Describe scene │ │ │
│ │ └────────────────┴────────────────┴──────────────────┘ │ │
│ └──────────────────────────────────────────────────────────┘ │
│ ↓ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ GROUNDING LAYER (Element Labeling) │ │
│ │ │ │
│ │ Screenshot with overlays: │ │
│ │ ┌─────────────────────────────────────────┐ │ │
│ │ │ [1] File [2] Edit [3] View │ │ │
│ │ │ ┌─────────────────────────────┐ │ │ │
│ │ │ │ [4] Search... │ │ │ │
│ │ │ └─────────────────────────────┘ │ │ │
│ │ │ │ │ │
│ │ │ [5] Save [6] Cancel │ │ │
│ │ └─────────────────────────────────────────┘ │ │
│ │ │ │
│ │ AI can say: "Click element [5]" → We click Save button │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │
└────────────────────────────────────────────────────────────────┘
```
## Vision Flow
### Before Action
```
1. Screenshot current state
2. OCR to find text locations
3. UIAutomation to get element names
4. Create "grounding map" of clickable elements
5. Send to AI: "Here's what I see on screen..."
```
### After Action
```
1. Screenshot new state
2. Compare with previous
3. OCR to verify expected text appeared
4. Send to AI: "Did the action succeed?"
5. If failed → Generate fix → Retry
```
## Grounding Map Example
```json
{
"screenshot": "base64://...",
"elements": [
{ "id": 1, "type": "button", "text": "File", "bounds": [10, 5, 50, 25] },
{ "id": 2, "type": "button", "text": "Edit", "bounds": [60, 5, 100, 25] },
{ "id": 3, "type": "input", "placeholder": "Search...", "bounds": [150, 50, 400, 80] },
{ "id": 4, "type": "button", "text": "Save", "bounds": [300, 500, 380, 530] },
{ "id": 5, "type": "button", "text": "Cancel", "bounds": [400, 500, 480, 530] }
],
"ocr_text": [
{ "text": "Untitled - Notepad", "bounds": [100, 0, 300, 20] },
{ "text": "Hello World", "bounds": [50, 100, 200, 120] }
]
}
```
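Turning an element label into a click position could work as in the sketch below. It assumes `bounds` is `[left, top, right, bottom]` in screen pixels, which is how the example values read but is not stated explicitly; `clickPointFor` is a hypothetical name.

```javascript
// Find an element in the grounding map by its label id and return the point to click.
// Assumes bounds are [left, top, right, bottom] in screen pixels.
function clickPointFor(groundingMap, elementId) {
  const element = groundingMap.elements.find((el) => el.id === elementId);
  if (!element) return null; // unknown label: let the caller re-prompt the AI
  const [left, top, right, bottom] = element.bounds;
  return {
    x: Math.round((left + right) / 2),
    y: Math.round((top + bottom) / 2),
  };
}
```

Clicking the center of the bounding box is a simple default; the resulting coordinates map directly onto the `input.ps1 mouse` and `click` fallback commands.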
## AI Prompt with Vision
```
You are controlling a Windows computer. Here's what you see:
[SCREENSHOT: base64 image]
INTERACTIVE ELEMENTS:
[1] Button: "File" at (10, 5)
[2] Button: "Edit" at (60, 5)
[3] Input: "Search..." at (150, 50)
[4] Button: "Save" at (300, 500)
[5] Button: "Cancel" at (400, 500)
VISIBLE TEXT:
- "Untitled - Notepad" (title bar)
- "Hello World" (document content)
USER REQUEST: "Save this file as hello.txt"
YOUR ACTION:
- To click an element, say: CLICK [4]
- To type, say: TYPE "hello.txt"
- To press key, say: KEY Enter
```
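The reply grammar in the prompt above (`CLICK [n]`, `TYPE "..."`, `KEY name`) still has to be mapped back to concrete input events. A minimal sketch of that step, assuming one action per reply and `bounds` given as `[left, top, right, bottom]`; `parseAction` is a hypothetical name, not an existing Goose function.

```javascript
// Hypothetical parser for the reply grammar shown in the prompt above.
// Assumes bounds = [left, top, right, bottom] and one action per reply.
function parseAction(reply, elements) {
  const line = reply.trim();
  let m;
  if ((m = /^CLICK \[(\d+)\]$/.exec(line))) {
    const el = elements.find((e) => e.id === Number(m[1]));
    if (!el) throw new Error(`Unknown element [${m[1]}]`);
    const [left, top, right, bottom] = el.bounds;
    // Click the centre of the element rather than its top-left corner.
    return { kind: 'click', x: (left + right) / 2, y: (top + bottom) / 2 };
  }
  if ((m = /^TYPE "(.*)"$/.exec(line))) return { kind: 'type', text: m[1] };
  if ((m = /^KEY (\S+)$/.exec(line))) return { kind: 'key', key: m[1] };
  throw new Error(`Unrecognized action: ${line}`);
}
```

Rejecting unknown element ids and unrecognized replies outright gives the retry loop a clear failure to report back to the model instead of a stray click.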
---
# 5. Vibe Server Management
## Purpose
**"Vibe Server Management"** = Server tasks for people who don't know servers.
No SSH commands to memorize. Just tell Goose what you want in plain English.
## Architecture
```
┌────────────────────────────────────────────────────────────────┐
│ VIBE SERVER MANAGEMENT │
├────────────────────────────────────────────────────────────────┤
│ │
│ User: "My website is down, fix it" │
│ ↓ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ IQ Exchange Translation Layer │ │
│ │ → SERVER action detected │ │
│ │ → Route to Server Management module │ │
│ └──────────────────────────────────────────────────────────┘ │
│ ↓ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ SERVER DIAGNOSIS FLOW │ │
│ │ │ │
│ │ 1. Check if we have SSH credentials │ │
│ │ → If not: "I need SSH access. Enter credentials?" │ │
│ │ │ │
│ │ 2. Connect to server │ │
│ │ → ssh root@86.105.224.125 │ │
│ │ │ │
│ │ 3. Diagnose the issue │ │
│ │ → systemctl status nginx │ │
│ │ → pm2 status │ │
│ │ → tail -n 50 /var/log/nginx/error.log │ │
│ │ │ │
│ │ 4. Fix the issue │ │
│ │ → systemctl restart nginx │ │
│ │ → pm2 restart all │ │
│ │ │ │
│ │ 5. Verify fix worked │ │
│ │ → curl http://localhost │ │
│ │ → Report back to user │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │
└────────────────────────────────────────────────────────────────┘
```
## Noob-Friendly Commands (What User Says → What Goose Does)
| User Says | Goose Does |
|-----------|------------|
| "Is my server running?" | `ssh` → `systemctl status nginx` → parse output |
| "Restart my website" | `pm2 restart all` or `systemctl restart nginx` |
| "Check the logs" | `tail -n 100 /var/log/nginx/error.log` |
| "Deploy my latest code" | `git pull` → `npm install` → `npm run build` → `pm2 restart` |
| "My server is slow" | Check CPU/memory, identify resource hogs |
| "Set up SSL" | `certbot --nginx -d domain.com` |
| "Add a new website" | Create nginx config, set up PM2 process |
| "Backup my database" | `pg_dump` or `mysqldump` |
| "My disk is full" | `df -h`, find large files, clean up safely |
## Server Panel UI
```
┌────────────────────────────────────────────────────────────────┐
│ 🔗 SERVER MANAGEMENT [X] │
├────────────────────────────────────────────────────────────────┤
│ │
│ ┌───────────────────────────────────────────────────────────┐│
│ │ CONNECTIONS [+ Add New] ││
│ │ ───────────────────────────────────────────────────────── ││
│ │ ● Production Server (86.105.224.125) [Connect] ││
│ │ ○ Staging Server (192.168.1.100) [Connect] ││
│ └───────────────────────────────────────────────────────────┘│
│ │
│ ┌───────────────────────────────────────────────────────────┐│
│ │ QUICK ACTIONS ││
│ │ ───────────────────────────────────────────────────────── ││
│ │ [🔄 Restart Website] [📊 Check Status] [📋 View Logs] ││
│ │ [🚀 Deploy Latest] [💾 Backup DB] [🔒 SSL Setup] ││
│ └───────────────────────────────────────────────────────────┘│
│ │
│ ┌───────────────────────────────────────────────────────────┐│
│ │ LIVE TERMINAL ││
│ │ ───────────────────────────────────────────────────────── ││
│ │ root@server:~# systemctl status nginx ││
│ │ ● nginx.service - A high performance web server ││
│ │ Loaded: loaded (/lib/systemd/system/nginx.service) ││
│ │ Active: active (running) since Mon 2025-12-16 10:00 ││
│ │ ││
│ │ $> _ ││
│ └───────────────────────────────────────────────────────────┘│
│ │
│ ┌───────────────────────────────────────────────────────────┐│
│ │ 💬 ASK GOOSE (natural language) ││
│ │ ───────────────────────────────────────────────────────── ││
│ │ "Why is my website slow?" [Ask Goose] ││
│ └───────────────────────────────────────────────────────────┘│
│ │
└────────────────────────────────────────────────────────────────┘
```
## SSH Service Implementation
```javascript
// services/SSHService.js
const { Client } = require('ssh2');

// Single-quote a value so the remote shell treats it as one literal argument.
const shq = (value) => "'" + String(value).replace(/'/g, "'\\''") + "'";

class SSHService {
  constructor() {
    this.connections = new Map(); // connection name -> ssh2 Client
  }

  // Open a connection and remember it under `name`.
  async connect(name, config) {
    const conn = new Client();
    return new Promise((resolve, reject) => {
      conn.on('ready', () => {
        this.connections.set(name, conn);
        resolve({ success: true });
      });
      conn.on('error', reject);
      conn.connect(config);
    });
  }

  // Run a command on a named connection and resolve with its stdout.
  // Note: exit codes are not checked, so callers cannot detect a failed command.
  async exec(name, command) {
    const conn = this.connections.get(name);
    if (!conn) throw new Error('Not connected');
    return new Promise((resolve, reject) => {
      conn.exec(command, (err, stream) => {
        // Return early: `stream` is undefined when exec fails.
        if (err) return reject(err);
        let output = '';
        stream.on('data', (data) => { output += data; });
        stream.on('close', () => resolve(output));
      });
    });
  }

  // High-level helper methods
  async checkWebsiteStatus(name) {
    const nginx = await this.exec(name, 'systemctl is-active nginx');
    const pm2 = await this.exec(name, 'pm2 jlist');
    return { nginx: nginx.trim(), pm2: JSON.parse(pm2) };
  }

  async restartWebsite(name) {
    await this.exec(name, 'pm2 restart all');
    await this.exec(name, 'systemctl reload nginx');
    return { success: true };
  }

  async getLogs(name, lines = 50) {
    // Coerce to an integer so arbitrary text cannot reach the shell.
    const count = Number.parseInt(lines, 10) || 50;
    return this.exec(name, `tail -n ${count} /var/log/nginx/error.log`);
  }

  async deploy(name, path = '/var/www/app') {
    const dir = shq(path); // quote the path before interpolating it into a command
    await this.exec(name, `cd ${dir} && git pull`);
    await this.exec(name, `cd ${dir} && npm install`);
    await this.exec(name, `cd ${dir} && npm run build`);
    await this.exec(name, 'pm2 restart all');
    return { success: true };
  }
}

module.exports = SSHService;
```
## Saved Server Configurations
`.opencode/servers.json`:
```json
{
  "production": {
    "name": "Production Server",
    "host": "86.105.224.125",
    "port": 22,
    "username": "root",
    "authType": "password",
    "webRoot": "/var/www/myapp",
    "pm2App": "myapp"
  },
  "staging": {
    "name": "Staging Server",
    "host": "192.168.1.100",
    "port": 22,
    "username": "deploy",
    "authType": "key",
    "keyPath": "~/.ssh/id_rsa",
    "webRoot": "/var/www/staging"
  }
}
```
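A saved entry still has to be turned into the options object that `ssh2`'s `Client#connect()` accepts (`host`, `port`, `username`, and either `password` or `privateKey`). The sketch below is one possible mapping. `toConnectConfig` is a hypothetical helper, and the choice to pass the password and a `readFile` function in from outside (so no secret lives in the JSON file) is an assumption, not something the spec prescribes.

```javascript
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: convert one saved entry (shape shown above) into
// options for ssh2's Client#connect(). Secrets are injected by the caller.
function toConnectConfig(entry, { password, readFile } = {}) {
  const base = { host: entry.host, port: entry.port ?? 22, username: entry.username };
  if (entry.authType === 'key') {
    // Expand a leading "~/" because Node does not do shell-style tilde expansion.
    const keyPath = entry.keyPath.startsWith('~/')
      ? path.join(os.homedir(), entry.keyPath.slice(2))
      : entry.keyPath;
    return { ...base, privateKey: readFile(keyPath) };
  }
  if (entry.authType === 'password') {
    if (!password) throw new Error('Password required; prompt the user for it');
    return { ...base, password };
  }
  throw new Error(`Unsupported authType: ${entry.authType}`);
}
```

In real use `readFile` would typically be `fs.readFileSync`, and the password would come from the credentials dialog rather than from disk.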
---
# Summary: All Five Subsystems
| Subsystem | Purpose | Key Tech |
|-----------|---------|----------|
| **1. Playwright** | Browser automation | Playwright + persistent sessions |
| **2. Browser Use** | Built-in browser panel | Electron BrowserView |
| **3. Computer Use** | Desktop automation | input.ps1 + UIAutomation |
| **4. Full Vision** | See and verify | OCR + Screenshot + Grounding |
| **5. Vibe Server** | Noob-friendly server ops | SSH2 + high-level helpers |
All subsystems connect through **IQ Exchange** which:
- Detects task type from natural language
- Routes to appropriate subsystem
- Executes with self-healing loop
- Verifies success using vision
- Retries until success or max attempts
---
*This document provides implementation-ready specifications for each core subsystem.*

@@ -0,0 +1,267 @@
# Goose Super - Built-In Tools Ecosystem
## Vision: Lovable + Antigravity Hybrid
**Goose Super should be a self-contained super-IDE where:**
- Everything happens INSIDE Goose (no external tools unless approved)
- Apps/pages are built, previewed, and tested within Goose's built-in browser
- AI sees everything and can interact with its own UI
- Users go from idea → working app → deployed in one session
---
## Core Principle: Built-In First
```
┌────────────────────────────────────────────────────────────────────┐
│ GOOSE SUPER │
│ │
│ User: "Build me a coffee shop website" │
│ ↓ │
│ ┌─────────────────────┬─────────────────────┬─────────────────┐ │
│ │ 📝 BUILT-IN EDITOR │ 🌐 BUILT-IN BROWSER│ 💬 AI CHAT │ │
│ │ (Monaco/VS Code) │ (Electron webview) │ (Qwen/Claude) │ │
│ │ │ │ │ │
│ │ - File tree │ - Live preview │ - Plan tasks │ │
│ │ - Multi-tab │ - Dev tools │ - Execute │ │
│ │ - Syntax highlight │ - Console output │ - Verify │ │
│ │ - IntelliSense │ - Network monitor │ - Self-correct │ │
│ └─────────────────────┴─────────────────────┴─────────────────┘ │
│ ↓ │
│ ┌─────────────────────┬─────────────────────┬─────────────────┐ │
│ │ 🖥️ COMPUTER USE │ 🔗 SERVER MGMT │ 📦 PACKAGE MGR │ │
│ │ (Desktop control) │ (SSH/Deploy) │ (npm/pip/etc) │ │
│ └─────────────────────┴─────────────────────┴─────────────────┘ │
└────────────────────────────────────────────────────────────────────┘
```
---
## Complete Built-In Tools Matrix
### 🎨 **DEVELOPMENT TOOLS** (Always Use Built-In)
| Tool | Purpose | Implementation | External Fallback? |
|------|---------|----------------|-------------------|
| **Monaco Editor** | Code editing with IntelliSense | VS Code's Monaco (same core) | No - always built-in |
| **File Manager** | Create/edit/delete files/folders | Node.js fs operations | No - always built-in |
| **Terminal** | Run commands in project context | Node.js child_process | No - always built-in |
| **Git Panel** | Version control | isomorphic-git | Ask user for GitHub Desktop |
| **Package Manager** | Install dependencies | npm/yarn/pnpm via terminal | No - always built-in |
| **Live Preview** | See app in real-time | Electron BrowserView | No - always built-in |
| **Dev Server** | Hot-reload for React/Vue/etc | Vite built-in | No - always built-in |
| **Console** | View console.log output | Capture from BrowserView | No - always built-in |
| **Network Monitor** | See API calls | DevTools protocol | No - always built-in |
---
### 🌐 **BROWSER TOOLS** (Built-In Chromium)
| Tool | Purpose | Implementation |
|------|---------|----------------|
| **Built-In Browser** | Preview apps, browse web | Electron BrowserView (Chromium) |
| **DOM Inspector** | Inspect/modify elements | Chrome DevTools Protocol |
| **Screenshot** | Capture visible page | BrowserView.capturePage() |
| **Form Filler** | Auto-fill forms | DOM manipulation |
| **Navigation** | Forward/back/refresh | BrowserView methods |
| **Multi-Tab** | Multiple pages open | Multiple BrowserViews |
| **Cookie Manager** | Save/restore sessions | Electron session API |
**Browser Priority:**
```
1. Use BUILT-IN browser for preview and web actions
2. Only open EXTERNAL browser if:
- User explicitly requests it ("open in Chrome")
- Authentication flow requires it (OAuth callbacks)
- Site blocks embedded browsers
→ Always ASK user first: "Need to open external browser for X. Approve?"
```
---
### 🖥️ **COMPUTER AUTOMATION** (Desktop Control)
| Tool | Purpose | Implementation |
|------|---------|----------------|
| **Screenshot** | Capture screen state | PowerShell/screencapture |
| **Element Finder** | Find UI elements by name (UIAutomation) | NEW - Windows Accessibility API |
| **Click** | Click at position or element | mouse_event / robotjs |
| **Type** | Keyboard input | SendKeys / robotjs |
| **Key Press** | Hotkeys (Ctrl+C, etc) | SendKeys / robotjs |
| **Window Manager** | Focus/minimize/move windows | WinAPI calls |
| **App Launcher** | Open applications | start / open / exec |
| **Clipboard** | Read/write clipboard | Electron clipboard API |
---
### 🔗 **SERVER & DEPLOYMENT** (Remote Operations)
| Tool | Purpose | Implementation | Ask User? |
|------|---------|----------------|-----------|
| **SSH Connect** | Connect to remote server | ssh2 library | Yes - show credentials dialog |
| **Remote Exec** | Run commands on server | ssh2 exec | No - once connected |
| **SFTP Upload** | Upload files to server | ssh2-sftp | No - once connected |
| **SFTP Download** | Download files from server | ssh2-sftp | No - once connected |
| **Log Stream** | Tail remote log files | ssh2 stream | No - once connected |
| **Docker** | Container management | SSH + docker commands | No |
| **PM2** | Process management | SSH + pm2 commands | No |
---
### 📦 **PROJECT SCAFFOLDING** (Templates)
| Template | Stack | Files Created |
|----------|-------|---------------|
| **Static Site** | HTML/CSS/JS | index.html, style.css, script.js |
| **React App** | Vite + React | Full Vite React setup |
| **Vue App** | Vite + Vue | Full Vite Vue setup |
| **Next.js** | Next.js | Full Next.js setup |
| **API Server** | Express | Express with routes |
| **Full-Stack** | Next.js + DB | Full-stack with Prisma |
---
## IDE Integration Strategy (VS Code Components)
### Option A: **Monaco Editor Only** (Recommended)
Use VS Code's Monaco editor component standalone:
```
What we take from VS Code:
├── monaco-editor # The editor component
├── TextMate grammars # Syntax highlighting
├── Language services # IntelliSense for JS/TS/CSS/HTML
└── Extension host # (Optional) Support VS Code extensions
```
**Pros:** Lightweight, well-documented, npm package available
**Cons:** No built-in file tree, terminals, or panels
### Option B: **VS Code in Electron** (Heavy)
Embed full VS Code via code-server or similar:
```
What we embed:
├── Full VS Code UI
├── All extensions
├── Terminal
└── File explorer
```
**Pros:** Full IDE experience
**Cons:** 200MB+ size, complex integration, version conflicts
### **Recommendation: Option A with custom panels**
```
// Our architecture
goose-electron-app/
├── panels/
│   ├── EditorPanel.js        // Monaco wrapper
│   ├── FileTreePanel.js      // Custom file explorer
│   ├── TerminalPanel.js      // xterm.js integration
│   ├── PreviewPanel.js       // BrowserView wrapper
│   ├── ChatPanel.js          // AI conversation
│   └── BrowserPanel.js       // Web browsing
└── services/
    ├── LanguageService.js    // Monaco language features
    ├── FileService.js        // File operations
    ├── ProjectService.js     // Project management
    └── DevServerService.js   // Vite/webpack integration
```
---
## Goose Super Flow: User Story
### "Build me a landing page for a coffee shop"
```
Step 1: User types request
Step 2: Goose creates project folder
Step 3: Goose scaffolds HTML/CSS/JS files
├── Opens in BUILT-IN Editor (Monaco)
└── Shows in BUILT-IN Preview (BrowserView)
Step 4: Goose writes code, preview updates live
Step 5: User: "Make the header sticky"
Step 6: Goose edits CSS, preview shows change instantly
Step 7: User: "Deploy to my server"
Step 8: Goose: "I need SSH credentials. Connect now?"
├── User provides credentials
└── Goose connects, uploads, restarts nginx
Step 9: Goose: "Done! Live at http://your-server.com"
└── Shows in BUILT-IN Browser
```
---
## External Tool Permission Flow
When Goose needs something external:
```
┌─────────────────────────────────────────────────────────────┐
│ ⚠️ Goose needs to use an external tool │
│ │
│ Tool: Google Chrome │
│ Reason: OAuth login requires external browser │
│ │
│ [✓ Approve] [✗ Deny] [Always allow for OAuth] │
└─────────────────────────────────────────────────────────────┘
```
**Permission Categories:**
1. **Always Built-In**: Editor, preview, terminal, file ops
2. **Ask Once**: SSH connections, API keys
3. **Ask Every Time**: External browser, external apps
4. **Never Without Approval**: Purchases, account actions
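
The four categories above can be expressed as a small policy table. This is a sketch under assumptions: the tool keys and the `needsPrompt` function are hypothetical, and unknown tools default to the cautious "ask every time" behaviour.

```javascript
// Hypothetical policy table mirroring the four permission categories above.
const POLICY = {
  editor: 'always', preview: 'always', terminal: 'always', fileOps: 'always',
  ssh: 'ask-once', apiKey: 'ask-once',
  externalBrowser: 'ask-every-time', externalApp: 'ask-every-time',
  purchase: 'never-without-approval', accountAction: 'never-without-approval',
};

// Returns true when the user must be prompted before `tool` is used.
// `approved` is a Set of tools the user has already approved this session.
function needsPrompt(tool, approved) {
  const rule = POLICY[tool] ?? 'ask-every-time'; // unknown tools get the cautious default
  if (rule === 'always') return false;
  if (rule === 'ask-once') return !approved.has(tool);
  return true; // ask-every-time and never-without-approval always prompt
}
```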
---
## Implementation Priority
### Week 1: Core Editor + Preview
- [ ] Monaco Editor integration
- [ ] File tree panel
- [ ] Live preview with BrowserView
- [ ] Hot reload via Vite
### Week 2: Browser Superpowers
- [ ] Built-in browser for web actions
- [ ] DOM inspection
- [ ] Screenshot capture from browser
- [ ] Multi-tab support
### Week 3: Smart Automation
- [ ] UIAutomation element finder
- [ ] Screenshot → verify → correct loop
- [ ] Better click/type reliability
### Week 4: Server & Deploy
- [ ] SSH connection panel
- [ ] SFTP file upload
- [ ] Remote command execution
- [ ] Log streaming
---
## Summary: What Makes Goose SUPER
| Capability | Current | Goal |
|------------|---------|------|
| **Editor** | ❌ None | ✅ Full Monaco IDE |
| **Preview** | ⚠️ Basic webview | ✅ Live hot-reload with dev tools |
| **Browser** | ⚠️ External only | ✅ Built-in Chromium |
| **Automation** | ⚠️ Blind clicking | ✅ Smart element discovery |
| **Server** | ❌ None | ✅ SSH/SFTP/Deploy |
| **Projects** | ❌ None | ✅ Save/load/scaffold |
**Goose Super = Lovable (vibe coding) + Antigravity (computer use) + VS Code (IDE) in one package.**

62
docs/KNOWN_ISSUES.md Normal file
@@ -0,0 +1,62 @@
# Goose Super - Known Issues & Bugs
## Critical (Fix Before Release)
### 1. ❌ Preview ERR_FILE_NOT_FOUND
**Screenshot:** 2024-12-16
**Issue:** Preview shows `file:///E:/calculator.html` instead of full project path
**Root Cause:** `normalizeUrlForPreview()` not resolving properly in all cases
**Fix:** Need to check if file exists before loading, and fall back to projectRootPath
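
A rough sketch of the proposed fix, for discussion only. It is not the actual `normalizeUrlForPreview()`; `resolvePreviewUrl` and its injected `exists` callback (for example `fs.existsSync`) are hypothetical names.

```javascript
const path = require('node:path');
const { pathToFileURL } = require('node:url');

// Hypothetical sketch: check that the file exists, falling back to projectRootPath.
// `exists` is injected so the logic can be exercised without touching the disk.
function resolvePreviewUrl(target, projectRootPath, exists) {
  if (/^https?:\/\//.test(target)) return target; // remote URLs pass through unchanged
  const candidates = path.isAbsolute(target)
    // An absolute path may point at the drive root by mistake; retry inside the project.
    ? [target, path.join(projectRootPath, path.basename(target))]
    : [path.resolve(projectRootPath, target)];
  const found = candidates.find((p) => exists(p));
  if (!found) throw new Error(`Preview target not found: ${target}`);
  return pathToFileURL(found).href;
}
```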
### 2. ❌ "use code editor" Opens Notepad Instead of Built-In Editor
**Screenshot:** 2024-12-16
**Issue:** User says "use code editor" but IQ Exchange outputs `OPEN_APP app="notepad"`
**Root Cause:** Translation prompt doesn't know about built-in Editor panel
**Fix:** Add [ACTION:OPEN_EDITOR] action and teach translation about built-in tools
### 3. ⚠️ IQ Exchange Translation Issues
**Issue:** Sometimes outputs "needs more specific instructions" instead of actions
**Root Cause:** Translation prompt may not be specific enough for simple tasks
**Fix:** Improve `translateTaskToActions()` prompt or add fallback to chat
---
## Medium Priority
### 4. Panel Layout Edge Cases
- [ ] Very narrow windows may still cause overlap
- [ ] Multiple rapid panel toggles can cause animation glitches
### 5. TODO/Follow-up Persistence
- [ ] Items persist in localStorage but not synced to session file
- [ ] No pagination for many items
### 6. Code Block Copy Button
- [ ] `copyCode()` function defined but button may not work in all cases
- [ ] Need to verify clipboard API works in Electron
---
## Low Priority / Polish
### 7. Typography & Icons
- [ ] Some emoji icons may render inconsistently on Windows
- [ ] Font loading can be slow on first launch
### 8. Status Indicator Accuracy
- [ ] "Ready" shows even when auth not verified
- [ ] Token counter is character-based, not actual tokens
---
## Recently Fixed (Phase 0)
- ✅ Layout squeeze with multiple panels (CSS transitions)
- ✅ IQ Exchange mojibake (🧠⚡✅ emojis)
- ✅ TODO sidebar animations (slideInLeft)
- ✅ Code block wrapper CSS
- ✅ File save + preview auto-refresh
---
*Updated: 2024-12-16*

@@ -0,0 +1,75 @@
# Goose Super - Phase 0 & 1 Walkthrough
## Summary
Completed Phase 0 (Critical Fixes) and Phase 1 (Core IDE Experience) for Goose Super.
---
## Phase 0: Critical Fixes ✅
### Batch 1: Core UI & Reliability
- **Layout squeeze** - Fixed CSS transitions for thinking-panel, added min-width to chat-container
- **File save + preview** - Added auto-refresh when web files are saved
- **TODO animations** - Added slideInLeft animation, hover effects, emoji icons
- **Code block styling** - Added `.code-block-wrapper`, copy button CSS
### Batch 2: IQ Exchange Repair
- **Mojibake fixed** - Restored emojis (🧠⚡✅) in IQ Exchange messages
- **Action policy verified** - External actions blocked unless explicitly allowed
### Batch 3: Documentation
- Created `CREDITS.md` - Reference project acknowledgements
- Created `tests/SMOKE_TEST_RESULTS.md` - Manual test checklist
---
## Phase 1: Core IDE Experience ✅
### Monaco Editor Integration
- **EditorPanel.js** (470 lines)
  - Monaco Editor with VS Code-like experience
  - File tree sidebar with recursive scanning
  - Multi-tab support with viewState preservation
  - Auto-save with debounce
  - Language detection from file extension
  - Status bar (file, position, language)
### Critical Bug Fix
- **Preview ERR_FILE_NOT_FOUND** - Fixed relative path resolution
  - Before: `calculator-app/index.html` → `file:///calculator-app/index.html` ❌
  - After: Resolves to full Goose project path ✅
### UI Additions
- Added 📝 **Editor** button to header
- Added Editor CSS (160+ lines) for sidebar, tabs, status bar
---
## Files Changed
| File | Action |
|------|--------|
| `bin/goose-electron-app/EditorPanel.js` | ✨ New |
| `bin/goose-electron-app/index.html` | ✏️ Added Editor button |
| `bin/goose-electron-app/styles.css` | ✏️ Layout fixes + Editor CSS |
| `bin/goose-electron-app/renderer.js` | ✏️ IQ fix + Preview path fix |
| `package.json` | ✏️ Added monaco-editor, xterm deps |
| `CREDITS.md` | ✨ New |
| `docs/KNOWN_ISSUES.md` | ✨ New |
| `tests/SMOKE_TEST_RESULTS.md` | ✨ New |
---
## Known Issues (Documented in docs/KNOWN_ISSUES.md)
1. ⚠️ Preview may still fail if `projectRootPath` not initialized
2. ⚠️ IQ Exchange sometimes needs more specific prompts
3. ⚠️ Monaco loads from CDN (bundle for offline later)
---
## Next Steps
- **Install dependencies**: `npm install` to get monaco-editor, xterm
- **Test app**: Run Goose Super and try "build a calculator"
- **Continue Phase 2**: Enhanced IQ Exchange with screenshot verification

146
docs/QA_FEATURE_REPORT.md Normal file
@@ -0,0 +1,146 @@
# OpenQode Gen5 — Feature QA Checklist + Enhancements
This report is **code-level QA** (feature existence + wiring + tests + sanity guards). It does **not** replace manual runtime QA in a real terminal session (especially for Windows UI Automation + browser popups).
Last updated: 2025-12-15
## Legend
- ✅ Implemented + wired + basic automated checks pass
- 🟡 Implemented but requires manual runtime verification
- 🔴 Missing / incomplete (recommended next work)
---
## 1) Core TUI Shell (Layout, Stability, UX)
- ✅ Split-pane layout with responsive sidebar (`bin/opencode-ink.mjs`)
- ✅ Reduced resize “shake” via debounced terminal sizing (`bin/opencode-ink.mjs`)
- 🟡 Resize jitter may still occur depending on terminal renderer and font; verify in Windows Terminal + cmd + PowerShell
Enhancements
- Add layout hysteresis near breakpoint boundaries (prevents mode flapping during resize).
- Expose `/motion on|off` as a visible toggle in Settings/Palette (the command already exists).
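
The hysteresis idea can be sketched as a dead band around the breakpoint. The `breakpoint` and `margin` values below are illustrative assumptions, not taken from `bin/tui-layout.mjs`, and `nextLayoutMode` is a hypothetical name.

```javascript
// Hypothetical sketch of breakpoint hysteresis for the sidebar layout.
// `breakpoint` and `margin` are illustrative values only.
function nextLayoutMode(current, columns, { breakpoint = 100, margin = 4 } = {}) {
  if (current === 'narrow' && columns >= breakpoint + margin) return 'wide';
  if (current === 'wide' && columns < breakpoint - margin) return 'narrow';
  return current; // inside the dead band: keep the current mode
}
```

Because the switch up and the switch down use different thresholds, dragging the terminal edge back and forth across the breakpoint no longer flips the layout on every resize event.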
---
## 2) Sidebar + Explorer (IDE-style File Manager)
- ✅ Explorer tree component (`bin/ui/components/FileTree.mjs`)
- ✅ Sidebar rendering + selection wiring (`bin/ui/components/PremiumSidebar.mjs`, `bin/opencode-ink.mjs`)
- ✅ Reveal controls:
  - `Ctrl+E` toggles Explorer
  - `/explorer on|off`
- ✅ Selected files become a “context pack” for the next prompt (`bin/opencode-ink.mjs`)
- 🟡 Manual QA: ensure Explorer appears in narrow mode (Tab to expand sidebar, `Ctrl+E` should expand/focus)
Enhancements
- Add a dedicated “Explorer focus mode” hotkey (e.g. `Ctrl+Shift+E`) to jump into tree navigation.
- Add file actions: rename/delete/new file (safe-mode gated).
---
## 3) File Preview Tabs + Open/Search
- ✅ File tabs UI (`bin/ui/components/FilePreviewTabs.mjs`)
- ✅ `/open <path[:line]>` opens a file tab (`bin/opencode-ink.mjs`)
- ✅ `/search <rg query>` + search overlay (`bin/ui/components/SearchOverlay.mjs`, `bin/opencode-ink.mjs`)
- ✅ Recent/Hot file pickers (`Ctrl+R`, `Ctrl+H`) + tracking (`bin/opencode-ink.mjs`)
- 🟡 Manual QA: open → edit → reopen behavior with large files
Enhancements
- Add “pin tab” + “close others” actions.
- Add symbol/outline panel (basic parsing for JS/TS/Python).
---
## 4) Better Edit Workflow (Diff + Hunks + One-Key Actions)
- ✅ Diff viewer with hunk toggles + apply actions (`bin/ui/components/DiffView.mjs`)
  - `TAB` toggles diff/hunks
  - `T` toggles hunk, `A` all, `X` none
  - `R` apply + reopen, `V` apply + run tests
- 🟡 Manual QA: ensure apply writes correct files and doesn't mangle newlines
Enhancements
- Add “stage to git” (interactive `git add -p`) integration with safe-mode prompts.
---
## 5) Guided “Task Wizard” (`/new`)
- ✅ `/new <goal>` seeds a checklist into Tasks (`Ctrl+T`) (`bin/opencode-ink.mjs`)
- 🟡 Manual QA: tasks persist per-project and render correctly in overlay
Enhancements
- Add “run next step” action that triggers `/search`, `/open`, test runs, etc.
---
## 6) Quality Rails (Safe Mode, Verify Step, Doctor)
- ✅ Safe mode (`/safe on|off`) for destructive command blocking (`bin/opencode-ink.mjs`)
- ✅ `/doctor` diagnostics for terminal + tools + preview server state (`bin/opencode-ink.mjs`)
- ✅ Automation plans auto-append lightweight verify steps for browser flows (`bin/opencode-ink.mjs`)
Enhancements
- Add “safe-mode confirmations” UI in preview plan (show WHY a step is risky).
---
## 7) Chat-to-App + Instant Preview + Deploy
- ✅ Chat-to-App scaffold writes vanilla app to `web/apps/<name>` (`bin/opencode-ink.mjs`)
- ✅ Preview server start/stop + browser open (`/preview`) (`bin/opencode-ink.mjs`)
- ✅ One-click deploy:
  - `/deploy` deploys current project to Vercel
  - `/deployapp <name>` deploys a generated app dir (`bin/opencode-ink.mjs`)
- 🟡 Manual QA: verify `server.js` exists and preview server serves expected routes
Enhancements
- Add per-app “deploy pipeline” UI card with last deploy URL + logs.
---
## 8) IQ Exchange (Computer Use / Browser Use / Server Management)
- ✅ Request detection + routing (`lib/iq-exchange.mjs`, `bin/opencode-ink.mjs`)
- ✅ Empty-plan/stuck-state guards (no “0 steps” preview; clears preview state on translation failure)
- ✅ Browser routing improvements to prefer Playwright (reduces Edge first-run popups breaking flows)
- ✅ Playwright popup dismissal best-effort (`bin/playwright-bridge.js`)
- ✅ “Vision-like” auto-observations for self-heal (DOM content + OCR + window lists) on failures (`bin/opencode-ink.mjs`)
- 🟡 Manual QA: Edge popups + Windows Start menu automation on your machine
Enhancements
- Add a “preflight” step library the translator can reference (focus checks, window activation, retries).
- Add an internal “failure classifier” (timeouts vs focus vs popup vs permission) to drive more reliable retries.
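
A minimal sketch of such a failure classifier. The patterns are illustrative guesses, not messages known to be emitted by IQ Exchange or Playwright, and `classifyFailure` is a hypothetical name.

```javascript
// Hypothetical failure classifier; the patterns are illustrative guesses only.
const FAILURE_RULES = [
  { kind: 'timeout',    pattern: /timed? ?out/i },
  { kind: 'permission', pattern: /access is denied|permission denied|EACCES/i },
  { kind: 'popup',      pattern: /dialog|popup|modal/i },
  { kind: 'focus',      pattern: /not focused|foreground|window not found/i },
];

// Returns the first matching failure kind, or 'unknown' when nothing matches.
function classifyFailure(message) {
  const rule = FAILURE_RULES.find((r) => r.pattern.test(message));
  return rule ? rule.kind : 'unknown';
}
```

A retry policy could then branch on the returned kind, for example waiting longer after `timeout` and re-activating the window after `focus`.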
---
## 9) Nano Dev (Self-modifying TUI on a fork/worktree)
- ✅ `/nanodev <goal>` creates a git worktree fork under `.opencode/nano-dev/*` (`bin/opencode-ink.mjs`)
- ✅ Auto-verify after writes land in fork (runs `node --check` + `npm test`)
- ✅ Fix for verify init crash by moving verifier out of component scope (`bin/opencode-ink.mjs`)
- 🟡 Manual QA: requires git repo state suitable for worktrees
Enhancements
- Add `/nanodev merge` flow (safe-mode gated, with “diff preview” first).
---
## 10) Goose “Windows App” Launch (Lovable/Vibe Coding Companion)
Goal: launch Goose from the TUI while using the **same Qwen auth** as OpenQode.
- ✅ Local OpenAI-compatible proxy backed by Qwen CLI (`bin/qwen-openai-proxy.mjs`)
  - Supports `/v1/chat/completions` (stream/non-stream) and `/v1/models`
- ✅ Goose launcher that:
  - starts proxy if needed
  - launches `goose web` using the proxy (`bin/goose-launch.mjs`)
- ✅ TUI command: `/goose` (`bin/opencode-ink.mjs`)
- 🟡 Manual QA: requires Goose + Rust toolchain (or `goose` binary) installed, and a browser for the web UI
Enhancements
- Add “desktop/electron” mode bootstrap (large install; best behind explicit `/goose setup-desktop`).
- Add context handoff: selected Explorer files → push into Goose session via a preloaded system prompt.
---
## Quick Manual QA Script (Recommended)
1. Launch TUI and verify Explorer: `Ctrl+E`, `Ctrl+R`, `Ctrl+H`.
2. Verify `/open package.json`, `/search TODO`.
3. Run `/doctor` and confirm tools show expected states.
4. Run `/preview on 15044` and confirm browser opens.
5. Trigger a browser automation task and confirm it routes to Playwright and can dismiss popups.
6. Run `/goose` (requires Goose prerequisites).

38
docs/task.md Normal file
@@ -0,0 +1,38 @@
# Goose Super Enhancement Tasks
## Phase 0: Critical Fixes (Prioritized)
- [ ] **Batch 1: Core UI & Reliability** <!-- id: 0 -->
    - [ ] Fix internal file save + preview pipeline <!-- id: 1 -->
    - [ ] Fix layout squeeze and panel overlap in `styles.css` <!-- id: 2 -->
    - [ ] Restore TODO sidebar & animations <!-- id: 3 -->
- [ ] **Batch 2: IQ Exchange Repair** <!-- id: 4 -->
    - [ ] Debug translation layer in `lib/iq-exchange.mjs` <!-- id: 5 -->
    - [ ] Fix action execution loop in `renderer.js` <!-- id: 6 -->
    - [ ] Implement self-healing retry mechanism <!-- id: 7 -->
- [ ] **Batch 3: Automation Verification** <!-- id: 8 -->
    - [ ] Verify `playwright-bridge.js` commands <!-- id: 9 -->
    - [ ] Run full smoke tests (`tests/SMOKE_TEST_RESULTS.md`) <!-- id: 10 -->
    - [ ] Implement Global STOP Button & Log Panel <!-- id: 11 -->
## Phase 1: Core IDE Experience
- [ ] Integrate Monaco Editor <!-- id: 12 -->
- [ ] Implement File Tree Panel <!-- id: 13 -->
- [ ] Enable Live Preview with Hot Reload <!-- id: 14 -->
- [ ] Implement Session & Project Management <!-- id: 15 -->
## Phase 2: Enhanced IQ Exchange
- [ ] Screenshot verification loop <!-- id: 16 -->
- [ ] UIAutomation element finder <!-- id: 17 -->
- [ ] Advanced self-healing prompts <!-- id: 18 -->
## Phase 3: Built-In Browser
- [ ] Multi-tab BrowserView <!-- id: 19 -->
- [ ] DOM Inspector for AI <!-- id: 20 -->
## Phase 4: Server & Deploy
- [ ] SSH Connection Panel <!-- id: 21 -->
- [ ] Vibe Server Management Commands <!-- id: 22 -->
## Phase 5: Polish & UX
- [ ] Onboarding Wizard <!-- id: 23 -->
- [ ] Visual Polish <!-- id: 24 -->