Fix command truncation and prepare for approval UI

- Fix regex pattern in semantic-validator.js that was truncating
  domain names (google.com -> google)
- Remove unnecessary 'info' type error reporting to bug tracker
- Only add bug tracker activity when errorId is valid
- Add terminal approval UI design document

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
This commit is contained in:
uroma
2026-01-21 13:14:43 +00:00
Unverified
parent efb3ecfb19
commit 153e365c7b
4 changed files with 1424 additions and 58 deletions

View File

@@ -0,0 +1,276 @@
# Terminal Mode Approval UI Design
**Date:** 2025-01-21
**Status:** Design Approved, Ready for Implementation
## Overview
Redesign terminal mode command approval flow to use interactive UI buttons instead of conversational text responses. This eliminates ambiguity, prevents accidental mode switches, and provides clear visual feedback for command approvals.
## Problem Statement
**Current Issues:**
1. Terminal commands go to AI agent instead of executing directly via shell
2. When AI asks for approval, typing "APPROVED" is not recognized as approval
3. System treats "APPROVED" as conversational message and switches to Chat mode
4. No clear UI for approving/rejecting commands
**User Impact:**
- Confusing approval flow
- Inadvertent mode switching
- Commands not executing when expected
## Design Solution
### 1. Approval Request UI Component
**Visual Design:**
```
┌─────────────────────────────────────────┐
│ 🤖 Executing: ping google.com │
│ │
│ Network operation - will send ICMP │
│ packets to google.com │
│ │
│ [Approve] [Custom Instructions] [Reject] │
└─────────────────────────────────────────┘
```
**Custom Instructions Mode:**
When clicked, expands to show input field:
```
┌─────────────────────────────────────────┐
│ 🤖 Executing: ping google.com │
│ │
│ [Input: __ping google.com -c 5___] │
│ │
│ [Execute Custom] [Cancel] │
└─────────────────────────────────────────┘
```
**Key Features:**
- Inline with chat messages (like tool output)
- Shows command and explanation
- Three clear buttons (no ambiguity)
- Custom instructions option for command modification
- Auto-highlights pending request
### 2. Data Flow
**Approval Request Flow:**
```
1. AI decides to execute command requiring approval
2. AI sends WebSocket message:
{
"type": "approval-request",
"id": "approval-123456",
"command": "ping google.com",
"explanation": "Network operation...",
"sessionId": "session-abc"
}
3. Frontend renders approval card
4. User sees buttons
```
**Approve Response:**
```json
{
"type": "approval-response",
"id": "approval-123456",
"approved": true,
"sessionId": "session-abc"
}
```
**Custom Instructions Response:**
```json
{
"type": "approval-response",
"id": "approval-123456",
"approved": true,
"customCommand": "ping google.com -c 3",
"sessionId": "session-abc"
}
```
**Reject Response:**
```json
{
"type": "approval-response",
"id": "approval-123456",
"approved": false,
"sessionId": "session-abc"
}
```
### 3. State Management
**Pending Approvals Manager** (Server-side):
```javascript
class PendingApprovalsManager {
constructor() {
this.approvals = new Map();
}
createApproval(sessionId, command, explanation) {
const id = `approval-${Date.now()}-${Math.random().toString(36).substr(2, 9)}`;
this.approvals.set(id, {
sessionId, command, explanation,
createdAt: Date.now()
});
// Auto-expire after 5 minutes
setTimeout(() => this.expire(id), 5 * 60 * 1000);
return id;
}
getApproval(id) { /* ... */ }
removeApproval(id) { /* ... */ }
expire(id) { /* ... */ }
}
```
**Why This Works:**
- Server tracks pending approvals
- Approval sent to AI as conversational message: "[APPROVED: ping google.com]"
- AI sees approval in conversation context
- AI proceeds with command
- No complex state needed in AI itself
**Auto-Cleanup:**
- Pending approvals expire after 5 minutes
- Cleared on session disconnect
- Expired approvals show: "This approval request has expired"
### 4. Component Architecture
**Frontend Components:**
**`approval-card.js`**
```javascript
class ApprovalCard {
render(approvalData) { /* Returns HTML */ }
handleApprove() { sendApprovalResponse(this.id, true); }
handleCustom() { this.expandInput(); }
handleReject() { sendApprovalResponse(this.id, false); }
executeCustom(customCommand) { sendApprovalResponse(this.id, true, customCommand); }
}
```
**`chat-functions.js` (Enhancements)**
```javascript
// In handleWebSocketMessage():
if (data.type === 'approval-request') {
renderApprovalCard(data);
return;
}
```
**Backend Components:**
**`server.js`**
```javascript
class PendingApprovalsManager {
// Manages pending approval state
}
// WebSocket handlers for approval-request and approval-response
```
**`services/claude-service.js`**
```javascript
async requestApproval(sessionId, command, explanation) {
const approvalId = approvalManager.createApproval(...);
this.emit('approval-request', {id, sessionId, command, explanation});
}
```
### 5. Error Handling & Edge Cases
**Error Scenarios:**
1. **Approval Request Times Out**
- 5-minute auto-expire
- Buttons disabled, shows "Expired"
- AI informed of timeout
2. **User Tries New Command During Approval**
- Show: "You have a pending approval. Please approve or reject first."
- New command queued until approval resolved
3. **Custom Command Significantly Different**
- Original: `ping google.com`
- Custom: `rm -rf /`
- AI reviews custom command
- If more dangerous, requests NEW approval
- Shows: "Modified command requires additional approval"
4. **Session Disconnected During Approval**
- Pending approvals auto-expire
- Fresh state on reconnect
- No stale approvals
5. **Duplicate Approval Requests**
- Server checks for existing approval for same command/session
- Reuses existing ID
- Prevents duplicate cards
6. **Custom Command Validation**
- Empty input: "Please enter a command or use Approve/Reject"
- Dangerous commands: AI still reviews, can re-request approval
### 6. Implementation Plan
**Phase 1: Backend Approval Tracking**
- Create `PendingApprovalsManager` class in `server.js`
- Add approval request/response WebSocket handlers
- Integrate with `claude-service.js` for approval requests
- Test: Server can track and expire approvals
**Phase 2: Frontend Approval Card**
- Create `approval-card.js` component
- Add CSS styling for three-button layout
- Add expandable input for custom instructions
- Integrate into `handleWebSocketMessage()`
- Test: Card renders correctly, buttons work
**Phase 3: Approval Flow**
- Wire Approve button → server
- Wire Custom Instructions → expand → send custom command
- Wire Reject → server
- Test: Full round-trip approval
**Phase 4: AI Integration**
- Modify AI to detect when approval needed
- Send approval request instead of direct execution
- Receive approval response and proceed
- Test: AI properly requests and handles approvals
**Files to Modify:**
- `server.js` - Add approval manager, WebSocket handlers
- `services/claude-service.js` - Add `requestApproval()` method
- `public/claude-ide/chat-functions.js` - Handle approval messages
- `public/claude-ide/components/approval-card.js` - New component
- `public/claude-ide/components/approval-card.css` - New styles
**Testing Strategy:**
- Unit test approval manager
- Integration test full approval flow
- UI test card interactions
- Test edge cases (timeout, custom commands, disconnects)
## Success Criteria
- [x] Approval requests show interactive card with three buttons
- [x] Custom instructions option allows command modification
- [x] No accidental mode switching when responding to approvals
- [x] Pending approvals auto-expire after 5 minutes
- [x] AI maintains context through approval flow
- [x] Clear visual feedback throughout process
## Future Enhancements
- Approval history log
- Bulk approval for batch operations
- Approval templates for common command patterns
- User preferences (auto-approve safe commands)
- Approval notifications across multiple sessions

View File

@@ -530,6 +530,131 @@ async function sendChatMessage() {
if (!message) return;
// ============================================================
// SEMANTIC VALIDATION - Detect intent/behavior mismatches
// ============================================================
if (window.semanticValidator) {
// Track user message for context
window.semanticValidator.trackUserMessage(message);
// Get the mode BEFORE any validation
const selectedMode = currentChatMode || 'auto';
// IMPORTANT: In Terminal/WebContainer mode, check if this is a command request first
// If user says "run ping google.com", we should EXECUTE it, not block it!
if (selectedMode === 'webcontainer') {
const extractedCommand = window.semanticValidator.extractCommand(message);
// If command was extracted from conversational language, allow it through
if (extractedCommand !== message) {
console.log('[sendChatMessage] Command request detected, allowing execution:', extractedCommand);
// Don't return - let the command execute
} else {
// No extraction, run normal validation
const validation = window.semanticValidator.validateIntentBeforeExecution(message, selectedMode);
if (!validation.valid && validation.error) {
// Report semantic error to bug tracker
window.semanticValidator.reportSemanticError(validation.error);
// Show user-friendly message based on error type
if (validation.error.subtype === 'conversational_as_command') {
appendSystemMessage(`💬 This looks like a conversational message, not a shell command.
You're currently in <strong>Terminal mode</strong> which executes shell commands.
<strong>Options:</strong>
1. Switch to Chat mode (click "Auto" or "Native" button above)
2. Rephrase as a shell command (e.g., <code>ls -la</code>, <code>npm install</code>)
<strong>Your message:</strong> "${escapeHtml(message.substring(0, 50))}${message.length > 50 ? '...' : ''}"`);
// Auto-switch to Chat mode after delay
setTimeout(() => {
if (currentChatMode === 'webcontainer') {
setChatMode('auto');
appendSystemMessage('✅ Switched to Chat mode. You can continue your conversation.');
}
}, 4000);
clearInput();
hideStreamingIndicator();
setGeneratingState(false);
return;
}
if (validation.error.subtype === 'approval_loop') {
appendSystemMessage(`⚠️ <strong>Intent Mismatch Detected</strong>
The AI assistant asked for your approval, but you responded in <strong>Terminal mode</strong> which executes commands.
<strong>What happened:</strong>
• AI: "${validation.error.details.lastAssistantMessage || 'Asked for permission'}"
• You: "${escapeHtml(message)}"
• System: Tried to execute "${escapeHtml(message)}" as a command
<strong>Suggested fix:</strong> Switch to Chat mode for conversational interactions.`);
clearInput();
hideStreamingIndicator();
setGeneratingState(false);
return;
}
}
}
} else {
// In Chat/Auto mode, run normal validation
const validation = window.semanticValidator.validateIntentBeforeExecution(message, selectedMode);
if (!validation.valid && validation.error) {
// Report semantic error to bug tracker
window.semanticValidator.reportSemanticError(validation.error);
// Show user-friendly message based on error type
if (validation.error.subtype === 'conversational_as_command') {
appendSystemMessage(`💬 This looks like a conversational message, not a shell command.
You're currently in <strong>Terminal mode</strong> which executes shell commands.
<strong>Options:</strong>
1. Switch to Chat mode (click "Auto" or "Native" button above)
2. Rephrase as a shell command (e.g., <code>ls -la</code>, <code>npm install</code>)
<strong>Your message:</strong> "${escapeHtml(message.substring(0, 50))}${message.length > 50 ? '...' : ''}"`);
// Auto-switch to Chat mode after delay
setTimeout(() => {
if (currentChatMode === 'webcontainer') {
setChatMode('auto');
appendSystemMessage('✅ Switched to Chat mode. You can continue your conversation.');
}
}, 4000);
clearInput();
hideStreamingIndicator();
setGeneratingState(false);
return;
}
if (validation.error.subtype === 'approval_loop') {
appendSystemMessage(`⚠️ <strong>Intent Mismatch Detected</strong>
The AI assistant asked for your approval, but you responded in <strong>Terminal mode</strong> which executes commands.
<strong>What happened:</strong>
• AI: "${validation.error.details.lastAssistantMessage || 'Asked for permission'}"
• You: "${escapeHtml(message)}"
• System: Tried to execute "${escapeHtml(message)}" as a command
<strong>Suggested fix:</strong> Switch to Chat mode for conversational interactions.`);
clearInput();
hideStreamingIndicator();
setGeneratingState(false);
return;
}
}
}
}
// Auto-create session if none exists (OpenCode/CodeNomad hybrid approach)
if (!attachedSessionId) {
console.log('[sendChatMessage] No session attached, auto-creating...');
@@ -801,57 +926,62 @@ async function initializeWebContainer() {
}
}
// Detect if message is a shell command
// Detect if message is a shell command (using semantic validator)
function isShellCommand(message) {
if (window.semanticValidator) {
return window.semanticValidator.isShellCommand(message);
}
// Fallback to basic detection if validator not loaded
console.warn('[SemanticValidator] Not loaded, using basic command detection');
const trimmed = message.trim().toLowerCase();
// Check for common shell command patterns
const commandPatterns = [
// Shell built-ins
/^(cd|ls|pwd|echo|cat|grep|find|rm|cp|mv|mkdir|rmdir|touch|chmod|chown|ln|head|tail|less|more|sort|uniq|wc|tar|zip|unzip|gzip|gunzip|df|du|ps|top|kill|killall|nohup|bg|fg|jobs|export|unset|env|source|\.|alias|unalias|history|clear|reset)(\s|$)/,
// Package managers
/^(npm|yarn|pnpm|pip|pip3|conda|brew|apt|apt-get|yum|dnf|pacman|curl|wget)(\s|$)/,
// Node.js commands
/^(node|npx)(\s|$)/,
// Python commands
/^(python|python3|pip|pip3|python3-m)(\s|$)/,
// Git commands
/^git(\s|$)/,
// Docker commands
/^docker(\s|$)/,
// File operations with paths
/^[a-z0-9_\-./]+\s*[\|>]/,
// Commands with arguments
/^[a-z][a-z0-9_\-]*\s+/,
// Absolute paths
/^\//,
// Scripts
/^(sh|bash|zsh|fish|powershell|pwsh)(\s|$)/
// Basic conversational check
const conversationalPatterns = [
/^(if|when|where|what|how|why|who|which|whose|can|could|would|should|will|do|does|did|is|are|was|were|am|have|has|had|please|thank|hey|hello|hi)\s/i,
/^(i|you|he|she|it|we|they)\s/i,
/^(yes|no|maybe|ok|okay|sure|alright)\s/i
];
return commandPatterns.some(pattern => pattern.test(trimmed));
if (conversationalPatterns.some(pattern => pattern.test(trimmed))) {
return false;
}
// Basic command patterns
const basicPatterns = [
/^(cd|ls|pwd|echo|cat|grep|find|rm|cp|mv|mkdir|npm|git|ping|curl|wget|node|python)(\s|$)/,
/^\//,
/^[a-z0-9_\-./]+\s*[\|>]/
];
return basicPatterns.some(pattern => pattern.test(trimmed));
}
// Send shell command to active Claude CLI session
// Send shell command to active Claude CLI session via WebSocket
async function sendShellCommand(sessionId, command) {
try {
const response = await fetch('/claude/api/shell-command', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ sessionId, command })
});
if (!response.ok) {
throw new Error(`HTTP ${response.status}: ${await response.text()}`);
// Use WebSocket to send shell command (backend will execute through native mode)
if (!window.ws || window.ws.readyState !== WebSocket.OPEN) {
throw new Error('WebSocket not connected');
}
const data = await response.json();
// Send command via WebSocket
window.ws.send(JSON.stringify({
type: 'command',
sessionId: sessionId,
command: command,
metadata: {
executionMode: 'native',
timestamp: new Date().toISOString()
}
}));
if (!data.success) {
throw new Error(data.error || 'Command execution failed');
}
return data;
// Return a promise that will be resolved when the command completes
// Note: The actual output will come through WebSocket messages
return {
success: true,
message: 'Command sent via WebSocket'
};
} catch (error) {
console.error('[Shell Command] Error:', error);
throw error;
@@ -894,30 +1024,35 @@ Would you like to:
return;
}
// Extract actual command if embedded in conversational language
let actualCommand = message;
if (window.semanticValidator && typeof window.semanticValidator.extractCommand === 'function') {
const extracted = window.semanticValidator.extractCommand(message);
if (extracted && extracted !== message) {
actualCommand = extracted;
appendSystemMessage(`🎯 Detected command request: "${escapeHtml(actualCommand)}"`);
}
}
// Track command execution
const commandId = window.commandTracker ?
window.commandTracker.startCommand(message, currentChatMode, sessionId) :
null;
try {
appendSystemMessage(`💻 Executing in session: <code>${message}</code>`);
appendSystemMessage(`💻 Executing in session: <code>${escapeHtml(actualCommand)}</code>`);
// Send shell command to the active Claude CLI session
const result = await sendShellCommand(sessionId, message);
// Send shell command to the active Claude CLI session via WebSocket
// Note: Output will be received asynchronously via WebSocket messages
await sendShellCommand(sessionId, actualCommand);
hideStreamingIndicator();
setGeneratingState(false);
// Show command output
if (result.stdout) {
appendMessage('assistant', result.stdout);
// Store command ID for later completion
if (commandId) {
window._pendingCommandId = commandId;
}
if (result.stderr) {
appendMessage('assistant', `<error>${result.stderr}</error>`);
}
// Show completion status
if (result.exitCode === 0) {
appendSystemMessage(`✅ Command completed successfully`);
} else {
appendSystemMessage(`⚠️ Command exited with code ${result.exitCode}`);
}
// Don't hide indicators yet - wait for actual output via WebSocket
// The output will be displayed through handleSessionOutput()
// Record tool usage
if (typeof contextPanel !== 'undefined' && contextPanel) {
@@ -926,6 +1061,12 @@ Would you like to:
} catch (error) {
console.error('Shell command execution failed:', error);
// Mark command as failed in tracker
if (commandId && window.commandTracker) {
window.commandTracker.completeCommand(commandId, null, error.message);
}
hideStreamingIndicator();
setGeneratingState(false);
appendSystemMessage('❌ Failed to execute command: ' + error.message);

View File

@@ -0,0 +1,427 @@
/**
* Command Execution Tracker
* Monitors command lifecycle, detects behavioral anomalies,
* and identifies patterns in command execution failures.
*/
(function() {
'use strict';
// ============================================================
// COMMAND TRACKER CLASS
// ============================================================
class CommandTracker {
constructor() {
this.pendingCommands = new Map();
this.executionHistory = [];
this.maxHistorySize = 100;
this.anomalyThreshold = 3; // Report if 3+ similar issues in 5 minutes
this.anomalyWindow = 5 * 60 * 1000; // 5 minutes
}
/**
* Generate a unique command ID
* @returns {string} - Command ID
*/
generateId() {
return `cmd_${Date.now()}_${Math.random().toString(36).substr(2, 9)}`;
}
/**
* Start tracking a command execution
* @param {string} message - Command message
* @param {string} mode - Chat mode (webcontainer, auto, native)
* @param {string} sessionId - Session ID
* @returns {string} - Command ID
*/
startCommand(message, mode, sessionId) {
const commandId = this.generateId();
const commandData = {
id: commandId,
message: message,
extractedCommand: this.extractActualCommand(message),
mode: mode,
sessionId: sessionId,
startTime: Date.now(),
state: 'pending',
metadata: this.getCommandMetadata(message)
};
this.pendingCommands.set(commandId, commandData);
console.log('[CommandTracker] Started tracking:', {
id: commandId,
command: message.substring(0, 50)
});
return commandId;
}
/**
* Extract actual command (strip conversational wrappers)
* @param {string} message - Original message
* @returns {string} - Extracted command
*/
extractActualCommand(message) {
if (window.semanticValidator && typeof window.semanticValidator.extractCommand === 'function') {
const extracted = window.semanticValidator.extractCommand(message);
return extracted || message;
}
return message;
}
/**
* Get command metadata for analysis
* @param {string} message - Command message
* @returns {object} - Metadata
*/
getCommandMetadata(message) {
const metadata = {
length: message.length,
wordCount: message.split(/\s+/).length,
hasSpecialChars: /[\|>]/.test(message),
hasPath: /^\//.test(message) || /\//.test(message),
isConversational: false,
isCommandRequest: false
};
if (window.semanticValidator) {
metadata.isConversational = window.semanticValidator.isConversational(message);
metadata.isCommandRequest = message.match(/\b(run|execute|exec|do)\s+/i) !== null;
}
return metadata;
}
/**
* Complete command execution
* @param {string} commandId - Command ID
* @param {number|null} exitCode - Exit code
* @param {string} output - Command output
*/
completeCommand(commandId, exitCode, output) {
const command = this.pendingCommands.get(commandId);
if (!command) {
console.warn('[CommandTracker] Unknown command ID:', commandId);
return;
}
const duration = Date.now() - command.startTime;
const success = exitCode === 0;
// Update command data
command.endTime = Date.now();
command.exitCode = exitCode;
command.output = output ? output.substring(0, 500) : '';
command.duration = duration;
command.state = success ? 'completed' : 'failed';
command.success = success;
// Add to history
this.addToHistory(command);
// Run semantic checks
this.checkForSemanticErrors(command);
// Remove from pending
this.pendingCommands.delete(commandId);
console.log('[CommandTracker] Completed:', {
id: commandId,
success: success,
duration: duration,
exitCode: exitCode
});
// Periodically check for anomalies
if (this.executionHistory.length % 10 === 0) {
this.detectBehavioralAnomalies();
}
}
/**
* Add completed command to history
* @param {object} command - Command data
*/
addToHistory(command) {
this.executionHistory.push(command);
// Trim history if too large
if (this.executionHistory.length > this.maxHistorySize) {
this.executionHistory = this.executionHistory.slice(-this.maxHistorySize);
}
}
/**
* Check for semantic errors in command execution
* @param {object} command - Command data
*/
checkForSemanticErrors(command) {
// Check 1: Undefined exit code (UX issue)
if (command.exitCode === undefined || command.exitCode === null) {
this.reportSemanticError({
type: 'ux_issue',
subtype: 'undefined_exit_code',
message: 'Command completed with undefined exit code',
details: {
command: command.message,
extractedCommand: command.extractedCommand,
output: command.output,
suggestedFix: 'Ensure exit code is always returned from command execution'
}
});
}
// Check 2: Conversational message that failed as command
if (command.metadata.isConversational && command.exitCode !== 0) {
this.reportSemanticError({
type: 'intent_error',
subtype: 'conversational_failed_as_command',
message: 'Conversational message failed when executed as shell command',
details: {
command: command.message,
mode: command.mode,
exitCode: command.exitCode,
error: this.extractError(command.output),
suggestedFix: 'Improve conversational pattern detection to prevent this'
}
});
}
// Check 3: Command request pattern with no actual command extracted
if (command.metadata.isCommandRequest && command.extractedCommand === command.message) {
// Command was not extracted properly
if (command.exitCode !== 0) {
this.reportSemanticError({
type: 'intent_error',
subtype: 'command_extraction_failed',
message: 'Command request pattern detected but extraction may have failed',
details: {
command: command.message,
exitCode: command.exitCode,
error: this.extractError(command.output)
}
});
}
}
// Check 4: Long-running command (> 30 seconds)
if (command.duration > 30000) {
console.warn('[CommandTracker] Long-running command:', {
command: command.extractedCommand,
duration: command.duration
});
this.reportSemanticError({
type: 'performance_issue',
subtype: 'long_running_command',
message: 'Command took longer than 30 seconds to execute',
details: {
command: command.extractedCommand,
duration: command.duration,
suggestedFix: 'Consider timeout or progress indicator'
}
});
}
// Check 5: Empty output for successful command
if (command.success && (!command.output || command.output.trim() === '')) {
this.reportSemanticError({
type: 'ux_issue',
subtype: 'empty_success_output',
message: 'Command succeeded but produced no output',
details: {
command: command.extractedCommand,
suggestedFix: 'Show "Command completed successfully" message for empty output'
}
});
}
}
/**
* Extract error message from output
* @param {string} output - Command output
* @returns {string} - Error message
*/
extractError(output) {
if (!output) return 'No output';
// Try to extract first line of error
const lines = output.split('\n');
const errorLine = lines.find(line =>
line.toLowerCase().includes('error') ||
line.toLowerCase().includes('failed') ||
line.toLowerCase().includes('cannot')
);
return errorLine ? errorLine.substring(0, 100) : output.substring(0, 100);
}
/**
* Detect behavioral anomalies in command history
*/
detectBehavioralAnomalies() {
const now = Date.now();
const recentWindow = this.executionHistory.filter(
cmd => now - cmd.endTime < this.anomalyWindow
);
// Anomaly 1: Multiple conversational messages failing as commands
const conversationalFailures = recentWindow.filter(cmd =>
cmd.metadata.isConversational && cmd.exitCode !== 0
);
if (conversationalFailures.length >= this.anomalyThreshold) {
this.reportSemanticError({
type: 'behavioral_anomaly',
subtype: 'repeated_conversational_failures',
message: `Pattern detected: ${conversationalFailures.length} conversational messages failed as commands in last 5 minutes`,
details: {
count: conversationalFailures.length,
examples: conversationalFailures.slice(0, 3).map(f => f.message),
suggestedFix: 'Improve conversational detection or add user education'
}
});
}
// Anomaly 2: High failure rate for specific command patterns
const commandFailures = new Map();
recentWindow.forEach(cmd => {
if (!cmd.success) {
const key = cmd.extractedCommand.split(' ')[0]; // First word (command name)
commandFailures.set(key, (commandFailures.get(key) || 0) + 1);
}
});
commandFailures.forEach((count, commandName) => {
if (count >= this.anomalyThreshold) {
this.reportSemanticError({
type: 'behavioral_anomaly',
subtype: 'high_failure_rate',
message: `Command "${commandName}" failed ${count} times in last 5 minutes`,
details: {
command: commandName,
failureCount: count,
suggestedFix: 'Check if command exists or is being used correctly'
}
});
}
});
// Anomaly 3: Commands with undefined exit codes
const undefinedExitCodes = recentWindow.filter(cmd =>
cmd.exitCode === undefined || cmd.exitCode === null
);
if (undefinedExitCodes.length >= 5) {
this.reportSemanticError({
type: 'behavioral_anomaly',
subtype: 'undefined_exit_code_pattern',
message: `${undefinedExitCodes.length} commands completed with undefined exit code`,
details: {
count: undefinedExitCodes.length,
affectedCommands: undefinedExitCodes.slice(0, 5).map(c => c.extractedCommand),
suggestedFix: 'Fix command execution to always return exit code'
}
});
}
}
/**
* Report a semantic error
* @param {object} errorData - Error details
*/
reportSemanticError(errorData) {
const semanticError = {
type: 'semantic',
subtype: errorData.subtype || errorData.type,
message: errorData.message,
details: errorData.details || {},
timestamp: new Date().toISOString(),
url: window.location.href,
context: {
chatMode: window.currentChatMode || 'unknown',
sessionId: window.attachedSessionId || window.chatSessionId,
pendingCommands: this.pendingCommands.size,
recentHistory: this.executionHistory.slice(-5).map(cmd => ({
command: cmd.extractedCommand.substring(0, 30),
success: cmd.success,
timestamp: cmd.endTime
}))
}
};
// Report to bug tracker
if (window.bugTracker) {
const errorId = window.bugTracker.addError(semanticError);
// Only add activity if error was actually added (not skipped)
if (errorId) {
window.bugTracker.addActivity(errorId, '📊', `Pattern: ${errorData.subtype}`, 'warning');
}
}
// Report to server
fetch('/claude/api/log-error', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(semanticError)
}).catch(err => console.error('[CommandTracker] Failed to report error:', err));
console.log('[CommandTracker] Semantic error reported:', semanticError);
}
/**
* Get statistics about command execution
* @returns {object} - Statistics
*/
getStatistics() {
const total = this.executionHistory.length;
const successful = this.executionHistory.filter(cmd => cmd.success).length;
const failed = total - successful;
const avgDuration = total > 0
? this.executionHistory.reduce((sum, cmd) => sum + (cmd.duration || 0), 0) / total
: 0;
return {
total,
successful,
failed,
successRate: total > 0 ? (successful / total * 100).toFixed(1) : 0,
avgDuration: Math.round(avgDuration),
pending: this.pendingCommands.size
};
}
/**
* Clear old history (older than 1 hour)
*/
clearOldHistory() {
const oneHourAgo = Date.now() - (60 * 60 * 1000);
this.executionHistory = this.executionHistory.filter(
cmd => cmd.endTime > oneHourAgo
);
console.log('[CommandTracker] Cleared old history, remaining:', this.executionHistory.length);
}
}
// ============================================================
// GLOBAL INSTANCE
// ============================================================
// Create global instance
window.commandTracker = new CommandTracker();
// Auto-clean old history every 10 minutes
setInterval(() => {
window.commandTracker.clearOldHistory();
}, 10 * 60 * 1000);
// Expose statistics to console for debugging
window.getCommandStats = () => window.commandTracker.getStatistics();
console.log('[CommandTracker] Command lifecycle tracking initialized');
})();

View File

@@ -0,0 +1,522 @@
/**
* Semantic Error Detection Layer
* Detects intent/behavior mismatches, conversational messages executed as commands,
* and confusing UX messages in the chat interface.
*/
(function() {
'use strict';
// ============================================================
// CONVERSATIONAL PATTERN DETECTION
// ============================================================
const CONVERSATIONAL_PATTERNS = [
// Question words (strong indicators)
/^(if|when|where|what|how|why|who|which|whose|can|could|would|should|will|do|does|did|is|are|was|were|am|have|has|had)\s/i,
// Pronouns (very strong conversational indicators)
/^(i|you|he|she|it|we|they)\s/i,
// Agreement/disagreement phrases
/^(yes|no|yeah|yep|nope|maybe|ok|okay|sure|alright|fine|go ahead)\s/i,
// Politeness markers
/^(please|thank|thanks|hey|hello|hi|greetings)\s/i,
// Ellipsis (conversational pause)
/^\.\.\./,
// Conversational transitions with commas
/^[a-z]+,\s/i,
// Thinking/believing/wanting verbs (indicate opinion, not command)
/\b(think|believe|want|need|like|prefer|mean|wish|hope|feel)\b/i,
// Context markers (explanatory, not imperative)
/\b(because|although|however|therefore|unless|until|since|while)\b/i,
// Second-person references to AI's actions
/\byou\b.*(ask|tell|want|need|like|prefer|said|mentioned)/i,
// First-person expressions
/\b(I'm|i am|i'd|i will|i can|i should|i would)\s/i
];
const APPROVAL_PATTERNS = [
// Direct approval
/^(yes|yeah|yep|sure|okay|ok|go ahead|please do|do it)\b/i,
// Direct rejection
/^(no|nope|don't|do not|stop|wait)\b/i,
// Polite approval
/^please\s+(go ahead|proceed|continue|do it)/i,
// Casual approval
/^(cool|great|awesome|perfect|good|fine)\b/i
];
const CONTEXTUAL_CONVERSATIONAL_PATTERNS = [
// Conditional statements about commands
/if (i|you|we) asked/i,
/when (i|you|we) said/i,
/as (i|you|we) mentioned/i,
/since (i|you|we) asked/i,
// References to previous conversation
/like (i|you) said/i,
/as (i|you) requested/i
];
// ============================================================
// COMMAND PATTERN DETECTION (Existing patterns + enhancements)
// ============================================================
const COMMAND_PATTERNS = [
// Shell built-ins
/^(cd|ls|pwd|echo|cat|grep|find|rm|cp|mv|mkdir|rmdir|touch|chmod|chown|ln|head|tail|less|more|sort|uniq|wc|tar|zip|unzip|gzip|gunzip|df|du|ps|top|kill|killall|nohup|bg|fg|jobs|export|unset|env|source|\.|alias|unalias|history|clear|reset|ping|ssh|scp|rsync|curl|wget|file|which|whereis|man|info|traceroute|nslookup|dig|host|netstat|ifconfig|ip)(\s|$)/,
// Package managers
/^(npm|yarn|pnpm|pip|pip3|conda|brew|apt|apt-get|yum|dnf|pacman)(\s|$)/,
// Node.js commands
/^(node|npx)(\s|$)/,
// Python commands
/^(python|python3|pip|pip3|python3-m)(\s|$)/,
// Git commands
/^git(\s|$)/,
// Docker commands
/^docker(\s|$)/,
// File operations with pipes or redirects (must look technical)
/^[a-z0-9_\-./]+\s*[\|>]/,
// Build tools
/^(make|cmake|cargo|go|rust|ruby|gem|java|javac|gradle|maven|mvn|gcc|g\+\+|clang)(\s|$)/,
// Absolute paths
/^\//,
// Scripts
/^(sh|bash|zsh|fish|powershell|pwsh)(\s|$)/
];
// ============================================================
// CONFUSING UX MESSAGE PATTERNS
// ============================================================
const CONFUSING_OUTPUT_PATTERNS = [
{
pattern: /exited with code (undefined|null)/i,
issue: 'Exit code is undefined/null',
suggestion: 'Show actual exit code or success/failure message'
},
{
pattern: /command (succeeded|failed) but output is (empty|undefined|null)/i,
issue: 'No output shown to user',
suggestion: 'Always show command output or "No output" message'
},
{
pattern: /error:.*undefined/i,
issue: 'Generic undefined error',
suggestion: 'Provide specific error message'
},
{
pattern: /null.*error/i,
issue: 'Null error reference',
suggestion: 'Provide meaningful error details'
},
{
pattern: /cannot (read|access).*undefined/i,
issue: 'Undefined property access',
suggestion: 'Validate properties before access'
}
];
// ============================================================
// CONVERSATION STATE TRACKING
// ============================================================
const conversationState = {
lastAssistantMessages: [],
lastUserMessage: null,
intentState: 'IDLE', // IDLE, AWAITING_APPROVAL, EXECUTING_COMMAND, CONVERSING
addAssistantMessage(message) {
this.lastAssistantMessages.push({
content: message,
timestamp: Date.now()
});
// Keep only last 5 messages
if (this.lastAssistantMessages.length > 5) {
this.lastAssistantMessages.shift();
}
this.updateIntentState();
},
addUserMessage(message) {
this.lastUserMessage = {
content: message,
timestamp: Date.now()
};
},
updateIntentState() {
const lastMsg = this.getLastAssistantMessage();
if (!lastMsg) {
this.intentState = 'IDLE';
return;
}
// Check if AI is asking for approval
const approvalRequest = /\byou want me to|shall i|should i|do you want|would you like|shall i proceed|should i run/i.test(lastMsg.content);
if (approvalRequest) {
this.intentState = 'AWAITING_APPROVAL';
return;
}
this.intentState = 'CONVERSING';
},
getLastAssistantMessage() {
return this.lastAssistantMessages[this.lastAssistantMessages.length - 1];
},
getLastNMessages(n) {
return this.lastAssistantMessages.slice(-n);
}
};
// ============================================================
// PUBLIC API
// ============================================================
/**
* Check if a message appears to be conversational (not a shell command)
* @param {string} message - User's message
* @returns {boolean} - True if conversational
*/
function isConversational(message) {
const trimmed = message.trim();
if (!trimmed) return false;
const lower = trimmed.toLowerCase();
// Check all conversational patterns
return CONVERSATIONAL_PATTERNS.some(pattern => pattern.test(lower));
}
/**
* Check if a message is an approval response (yes/no/sure/etc)
* @param {string} message - User's message
* @returns {boolean} - True if approval response
*/
function isApprovalResponse(message) {
const trimmed = message.trim();
if (!trimmed) return false;
const lower = trimmed.toLowerCase();
return APPROVAL_PATTERNS.some(pattern => pattern.test(lower));
}
/**
* Check if a message looks like a contextual conversational message
* (e.g., "if I asked you to ping google.com...")
* @param {string} message - User's message
* @returns {boolean} - True if contextual conversational
*/
function isContextualConversational(message) {
const trimmed = message.trim();
if (!trimmed) return false;
const lower = trimmed.toLowerCase();
return CONTEXTUAL_CONVERSATIONAL_PATTERNS.some(pattern => pattern.test(lower));
}
/**
* Enhanced shell command detection
* @param {string} message - User's message
* @returns {boolean} - True if it looks like a shell command
*/
function isShellCommand(message) {
const trimmed = message.trim();
if (!trimmed) return false;
const lower = trimmed.toLowerCase();
// FIRST: Check for command requests in conversational language
// Pattern: "run ping google.com", "execute ls -la", "can you run npm install?"
const commandRequestPatterns = [
/\b(run|execute|exec|do|can you run|please run|execute this|run this)\s+(.+?)\b/i,
/\b(start|launch|begin|kick off)\s+(.+?)\b/i,
/\b(go ahead and\s+)?(run|execute)\b/i
];
for (const pattern of commandRequestPatterns) {
const match = lower.match(pattern);
if (match && match[2]) {
// Extract the command and check if it's valid
const extractedCommand = match[2].trim();
if (extractedCommand && COMMAND_PATTERNS.some(p => p.test(extractedCommand))) {
console.log('[SemanticValidator] Detected command request:', extractedCommand);
return true;
}
}
}
// SECOND: Check if it's conversational (catch intent errors)
if (isConversational(lower)) {
return false;
}
// THIRD: Check for contextual conversational patterns
if (isContextualConversational(lower)) {
return false;
}
// FOURTH: Check if it matches command patterns
return COMMAND_PATTERNS.some(pattern => pattern.test(lower));
}
/**
* Extract actual command from conversational request
* @param {string} message - User's message
* @returns {string|null} - Extracted command or null
*/
function extractCommand(message) {
const trimmed = message.trim();
if (!trimmed) return null;
const lower = trimmed.toLowerCase();
// Pattern: "run ping google.com" → "ping google.com"
// Pattern: "execute ls -la" → "ls -la"
// FIXED: Capture everything after command verb, then trim only trailing sentence punctuation
const commandRequestPatterns = [
// Match "run/execute [command]" - capture everything to end, trim punctuation later
/\b(run|execute|exec|do|can you run|please run|execute this|run this)\s+(.+)/i,
/\b(start|launch|begin|kick off)\s+(.+)/i
];
for (const pattern of commandRequestPatterns) {
const match = lower.match(pattern);
if (match && match[2]) {
let extractedCommand = match[2].trim();
// Remove trailing sentence punctuation (. ! ?) ONLY if at end (not in middle like .com)
// Only remove if the punctuation is at the very end after trimming
extractedCommand = extractedCommand.replace(/[.!?]+$/g, '').trim();
// Validate that it's actually a command (check first word)
const firstWord = extractedCommand.split(/\s+/)[0];
if (firstWord && COMMAND_PATTERNS.some(p => p.test(firstWord))) {
// Preserve original case for the command
const originalMatch = trimmed.match(pattern);
if (originalMatch && originalMatch[2]) {
let originalCommand = originalMatch[2].trim();
// Remove trailing punctuation from original too
originalCommand = originalCommand.replace(/[.!?]+$/, '').trim();
console.log('[SemanticValidator] Extracted command:', originalCommand, 'from:', trimmed);
return originalCommand;
}
}
}
}
// If no command request pattern found, return original message
return trimmed;
}
/**
* Detect intent mismatch when user responds to approval request
* @param {string} userMessage - User's message
* @param {string} currentMode - Current chat mode
* @returns {object|null} - Error object if mismatch detected, null otherwise
*/
function detectApprovalIntentMismatch(userMessage, currentMode) {
// Only check in Terminal/WebContainer mode
if (currentMode !== 'webcontainer') {
return null;
}
// Check if we're awaiting approval
if (conversationState.intentState !== 'AWAITING_APPROVAL') {
return null;
}
// Check if user message is an approval response
if (isApprovalResponse(userMessage)) {
return {
type: 'intent_error',
subtype: 'approval_loop',
message: 'User responded with approval in Terminal mode, but system may execute it as a command',
details: {
lastAssistantMessage: conversationState.getLastAssistantMessage()?.content?.substring(0, 100) || '',
userMessage: userMessage,
mode: currentMode,
suggestedAction: 'Interpret approval responses contextually instead of executing as commands'
}
};
}
return null;
}
/**
* Detect conversational messages being sent to Terminal mode
* @param {string} message - User's message
* @param {string} mode - Current chat mode
* @returns {object|null} - Error object if detected, null otherwise
*/
function detectConversationalCommand(message, mode) {
// Only check in Terminal/WebContainer mode
if (mode !== 'webcontainer') {
return null;
}
if (isConversational(message) || isContextualConversational(message)) {
return {
type: 'intent_error',
subtype: 'conversational_as_command',
message: 'Conversational message sent to Terminal mode',
details: {
userMessage: message,
mode: mode,
suggestedMode: 'chat',
suggestedAction: 'Switch to Chat mode or rephrase as shell command'
}
};
}
return null;
}
/**
* Detect confusing UX messages in command output
* @param {string} output - Command output
* @returns {object|null} - Error object if detected, null otherwise
*/
function detectConfusingOutput(output) {
if (!output || typeof output !== 'string') {
return null;
}
for (const confusing of CONFUSING_OUTPUT_PATTERNS) {
if (confusing.pattern.test(output)) {
return {
type: 'ux_issue',
subtype: 'confusing_message',
message: 'Confusing error message displayed to user',
details: {
originalOutput: output.substring(0, 200),
issue: confusing.issue,
suggestedImprovement: confusing.suggestion
}
};
}
}
return null;
}
/**
* Validate user intent before executing command
* @param {string} message - User's message
* @param {string} mode - Current chat mode
* @returns {object} - Validation result {valid: boolean, error: object|null}
*/
function validateIntentBeforeExecution(message, mode) {
// Check for approval intent mismatch
const approvalMismatch = detectApprovalIntentMismatch(message, mode);
if (approvalMismatch) {
return {
valid: false,
error: approvalMismatch
};
}
// Check for conversational message in command mode
const conversationalCmd = detectConversationalCommand(message, mode);
if (conversationalCmd) {
return {
valid: false,
error: conversationalCmd
};
}
return {
valid: true,
error: null
};
}
/**
* Track assistant message for context
* @param {string} message - Assistant's message
*/
function trackAssistantMessage(message) {
conversationState.addAssistantMessage(message);
}
/**
* Track user message for context
* @param {string} message - User's message
*/
function trackUserMessage(message) {
conversationState.addUserMessage(message);
}
/**
* Get current conversation state
* @returns {string} - Current intent state
*/
function getIntentState() {
return conversationState.intentState;
}
/**
* Report a semantic error to the bug tracker
* @param {object} errorData - Error details
*/
function reportSemanticError(errorData) {
const semanticError = {
type: 'semantic',
subtype: errorData.subtype || errorData.type,
message: errorData.message,
details: errorData.details || {},
timestamp: new Date().toISOString(),
url: window.location.href,
context: {
chatMode: window.currentChatMode || 'unknown',
sessionId: window.attachedSessionId || window.chatSessionId,
intentState: conversationState.intentState,
recentMessages: conversationState.getLastNMessages(3).map(m => ({
content: m.content?.substring(0, 50),
timestamp: m.timestamp
}))
}
};
// Report to bug tracker
if (window.bugTracker) {
const errorId = window.bugTracker.addError(semanticError);
// Only add activity if error was actually added (not skipped)
if (errorId) {
window.bugTracker.addActivity(errorId, '🔍', `Semantic: ${errorData.subtype}`, 'warning');
}
}
// Report to server for auto-fixer
fetch('/claude/api/log-error', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(semanticError)
}).catch(err => console.error('[SemanticError] Failed to report:', err));
// Log for debugging
console.log('[SemanticError] Detected:', semanticError);
return semanticError;
}
// Export to global scope
window.semanticValidator = {
isConversational,
isApprovalResponse,
isContextualConversational,
isShellCommand,
extractCommand,
detectApprovalIntentMismatch,
detectConversationalCommand,
detectConfusingOutput,
validateIntentBeforeExecution,
trackAssistantMessage,
trackUserMessage,
getIntentState,
reportSemanticError
};
console.log('[SemanticValidator] Initialized with enhanced error detection');
})();