# GUI Automation Skill for QwenClaw

## Overview

This skill gives QwenClaw **GUI automation capabilities** through Playwright: it can control web browsers, interact with page elements, capture screenshots, and automate multi-step web workflows.

**Version:** 1.0.0
**Category:** Automation
**Dependencies:** Playwright

---

## Features

### 🌐 **Web Browser Automation**
- Launch Chromium, Firefox, or WebKit
- Navigate to any URL
- Handle multiple tabs/windows
- Custom viewport and user agent

### 🖱️ **Element Interaction**
- Click buttons and links
- Type text into inputs
- Fill forms
- Select dropdown options
- Check/uncheck checkboxes
- Hover over elements

### 📸 **Screenshot & Capture**
- Full page screenshots
- Element-specific screenshots
- Save to configurable directory
- Automatic timestamps
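
Automatic timestamps could work along the following lines. This is a minimal sketch, not the skill's actual implementation: `timestampedName` is a hypothetical helper, and the naming scheme (an ISO 8601 UTC timestamp with `:` and `.` replaced so the name is valid on every filesystem) is an assumption.

```typescript
// Hypothetical helper: builds a screenshot filename that embeds a UTC timestamp.
// The naming scheme is an assumption for illustration only.
function timestampedName(base: string, date: Date = new Date()): string {
  // toISOString() yields e.g. "2026-02-26T09:05:07.000Z"; ":" and "." are
  // replaced because ":" is not allowed in Windows filenames.
  const stamp = date.toISOString().replace(/[:.]/g, "-");
  return base + "-" + stamp + ".png";
}

console.log(timestampedName("example", new Date(Date.UTC(2026, 1, 26, 9, 5, 7))));
// prints "example-2026-02-26T09-05-07-000Z.png"
```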

### 📊 **Data Extraction**
- Extract text from elements
- Get attributes (href, src, etc.)
- Scrape tables and lists
- Export to JSON/CSV
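
To illustrate the export step, here is a rough sketch of turning scraped rows into CSV and JSON. The `Row` shape and the `csvField` and `toCsv` helpers are hypothetical and not part of the skill's documented API; the quoting follows the usual CSV convention of doubling embedded quotes.

```typescript
// Hypothetical row shape: one record per scraped item.
type Row = Record<string, string | number>;

// Quote a field when it contains a comma, a double quote, or a line break;
// embedded double quotes are doubled.
function csvField(value: string | number): string {
  const text = String(value);
  return /[",\r\n]/.test(text) ? '"' + text.replace(/"/g, '""') + '"' : text;
}

// Convert rows to CSV, using the keys of the first row as the header line.
function toCsv(rows: Row[]): string {
  if (rows.length === 0) return "";
  const columns = Object.keys(rows[0]);
  const lines = [columns.map((c) => csvField(c)).join(",")];
  for (const row of rows) {
    lines.push(columns.map((c) => csvField(row[c] ?? "")).join(","));
  }
  return lines.join("\n");
}

const scrapedRows: Row[] = [
  { title: 'Show HN: "Tiny" tool, v2', score: 120 },
  { title: "Plain title", score: 42 },
];
console.log(toCsv(scrapedRows));
console.log(JSON.stringify(scrapedRows, null, 2));
```

JSON export needs no helper because `JSON.stringify` already handles escaping.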

### ⌨️ **Keyboard & Navigation**
- Press keyboard shortcuts
- Wait for elements
- Wait for navigation
- Execute JavaScript

### 📥 **File Operations**
- Download files
- Upload files
- Handle file dialogs

---

## Installation

### 1. Install Playwright

```bash
cd qwenclaw
bun add playwright
```

### 2. Install Browser Binaries

```bash
# Install all browsers
npx playwright install

# Or specific browsers
npx playwright install chromium
npx playwright install firefox
npx playwright install webkit
```

### 3. Enable Skill

Copy this skill to the QwenClaw skills directory:

```bash
cp -r skills/gui-automation ~/.qwen/qwenclaw/skills/gui-automation
```

---

## Usage

### From Qwen Code Chat

```
Use gui-automation to take a screenshot of https://example.com

Use gui-automation to navigate to GitHub and click the sign in button

Use gui-automation to fill the login form with username "test" and password "pass123"

Use gui-automation to extract all article titles from the page
```

### From Terminal CLI

```bash
# Take screenshot
qwenclaw gui screenshot https://example.com

# Navigate to page
qwenclaw gui navigate https://github.com

# Click element
qwenclaw gui click "#login-button"

# Type text
qwenclaw gui type "search query" "#search-input"

# Extract text
qwenclaw gui extract ".article-title"

# Get page title
qwenclaw gui title
```

### Programmatic Usage

```typescript
import { GUIAutomation } from './tools/gui-automation';

const automation = new GUIAutomation({
  browserType: 'chromium',
  headless: true,
});

await automation.launch();
await automation.goto('https://example.com');
await automation.screenshot('example.png');
await automation.close();
```

---

## API Reference

### GUIAutomation Class

#### Constructor

```typescript
new GUIAutomation(config?: Partial<GUIAutomationConfig>)
```

**Config:**
```typescript
interface GUIAutomationConfig {
  browserType: 'chromium' | 'firefox' | 'webkit';
  headless: boolean;
  viewport?: { width: number; height: number };
  userAgent?: string;
}
```
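
Because the constructor accepts a `Partial<GUIAutomationConfig>`, omitted fields presumably fall back to defaults. The sketch below shows one common way to do that merge. `DEFAULT_CONFIG` and `resolveConfig` are hypothetical names, and the default values are assumptions rather than the skill's confirmed behaviour.

```typescript
interface GUIAutomationConfig {
  browserType: "chromium" | "firefox" | "webkit";
  headless: boolean;
  viewport?: { width: number; height: number };
  userAgent?: string;
}

// Assumed defaults; the skill's real defaults may differ.
const DEFAULT_CONFIG: GUIAutomationConfig = {
  browserType: "chromium",
  headless: true,
};

// Hypothetical helper: fields present in `overrides` win over the defaults.
function resolveConfig(overrides: Partial<GUIAutomationConfig> = {}): GUIAutomationConfig {
  return { ...DEFAULT_CONFIG, ...overrides };
}

const resolved = resolveConfig({
  browserType: "firefox",
  viewport: { width: 1280, height: 720 },
});
console.log(resolved.browserType, resolved.headless);
// prints "firefox true"
```

Note that an object spread copies explicit `undefined` values too, so a caller passing `{ headless: undefined }` would override the default in this sketch.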

#### Methods

| Method | Description | Example |
|--------|-------------|---------|
| `launch()` | Launch browser | `await automation.launch()` |
| `close()` | Close browser | `await automation.close()` |
| `goto(url)` | Navigate to URL | `await automation.goto('https://...')` |
| `screenshot(name?)` | Take screenshot | `await automation.screenshot('page.png')` |
| `click(selector)` | Click element | `await automation.click('#btn')` |
| `type(selector, text)` | Type text | `await automation.type('#input', 'hello')` |
| `fill(selector, value)` | Fill field | `await automation.fill('#email', 'test@test.com')` |
| `select(selector, value)` | Select option | `await automation.select('#country', 'US')` |
| `check(selector)` | Check checkbox | `await automation.check('#agree')` |
| `getText(selector)` | Get element text | `const text = await automation.getText('#title')` |
| `getAttribute(sel, attr)` | Get attribute | `const href = await automation.getAttribute('a', 'href')` |
| `waitFor(selector)` | Wait for element | `await automation.waitFor('.loaded')` |
| `evaluate(script)` | Execute JS | `const result = await automation.evaluate('2+2')` |
| `scrollTo(selector)` | Scroll to element | `await automation.scrollTo('#footer')` |
| `hover(selector)` | Hover element | `await automation.hover('#menu')` |
| `press(key)` | Press key | `await automation.press('Enter')` |
| `getTitle()` | Get page title | `const title = await automation.getTitle()` |
| `getUrl()` | Get page URL | `const url = await automation.getUrl()` |
| `getHTML()` | Get page HTML | `const html = await automation.getHTML()` |

---

## Examples

### Example 1: Take Screenshot

```typescript
import { GUIAutomation } from './tools/gui-automation';

const automation = new GUIAutomation();
await automation.launch();
await automation.goto('https://example.com');
await automation.screenshot('example-homepage.png');
await automation.close();
```

### Example 2: Fill and Submit Form

```typescript
const automation = new GUIAutomation();
await automation.launch();

await automation.goto('https://example.com/login');
await automation.fill('#email', 'user@example.com');
await automation.fill('#password', 'secret123');
await automation.check('#remember-me');
await automation.click('#submit-button');

await automation.screenshot('logged-in.png');
await automation.close();
```

### Example 3: Web Scraping

```typescript
const automation = new GUIAutomation();
await automation.launch();

await automation.goto('https://news.ycombinator.com');

// Extract all article titles
const titles = await automation.findAll('.titleline > a');
console.log('Articles:', titles.map(t => t.text));

// Extract specific data
const data = await automation.extractData({
  topStory: '.titleline > a',
  score: '.score',
  comments: '.subtext a:last-child',
});

await automation.close();
```

### Example 4: Wait for Dynamic Content

```typescript
const automation = new GUIAutomation();
await automation.launch();

await automation.goto('https://example.com');

// Wait for element to appear
await automation.waitFor('.loaded-content', { timeout: 10000 });

// Wait for navigation after click
await automation.click('#load-more');
await automation.waitForNavigation({ waitUntil: 'networkidle' });

await automation.screenshot('loaded.png');
await automation.close();
```

### Example 5: Execute JavaScript

```typescript
const automation = new GUIAutomation();
await automation.launch();
await automation.goto('https://example.com');

// Get window size
const size = await automation.evaluate(() => ({
  width: window.innerWidth,
  height: window.innerHeight,
}));

// Modify page
await automation.evaluate(() => {
  document.body.style.background = 'red';
});

await automation.screenshot('modified.png');
await automation.close();
```

---

## Common Selectors

| Selector Type | Example |
|--------------|---------|
| ID | `#login-button` |
| Class | `.article-title` |
| Tag | `input`, `button`, `a` |
| Attribute | `[type="email"]`, `[href*="github"]` |
| CSS | `div > p.text`, `ul li:first-child` |
| XPath | `//button[text()="Submit"]` |

---

## Configuration

### gui-config.json

Create `~/.qwen/qwenclaw/gui-config.json`:

```json
{
  "browserType": "chromium",
  "headless": true,
  "viewport": {
    "width": 1920,
    "height": 1080
  },
  "userAgent": "Mozilla/5.0...",
  "screenshotsDir": "~/.qwen/qwenclaw/screenshots",
  "timeout": 30000,
  "waitForNetworkIdle": true
}
```
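
One detail worth noting about `screenshotsDir`: the `~` shorthand is expanded by shells, not by JSON parsers or Node's file APIs, so whatever loads this file has to expand it. Whether the skill already does so is not stated here. The sketch below shows one way it could be done; `expandHome` is a hypothetical helper, and the home directory is passed in explicitly (in Node it would typically come from `os.homedir()`).

```typescript
// Hypothetical helper: replace a leading "~" with the user's home directory.
// `home` is a parameter so the function stays pure and easy to test.
function expandHome(path: string, home: string): string {
  if (path === "~") return home;
  if (path.slice(0, 2) === "~/") return home + path.slice(1);
  return path; // absolute or relative paths are returned unchanged
}

// Parse a config document and normalise the screenshots directory.
const rawConfig = '{ "screenshotsDir": "~/.qwen/qwenclaw/screenshots", "timeout": 30000 }';
const parsedConfig = JSON.parse(rawConfig) as { screenshotsDir: string; timeout: number };

console.log(expandHome(parsedConfig.screenshotsDir, "/home/alice"));
// prints "/home/alice/.qwen/qwenclaw/screenshots"
```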

### Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `PLAYWRIGHT_BROWSERS_PATH` | Browser binaries location | `~/.cache/ms-playwright` (Linux) |
| `QWENCLAW_GUI_HEADLESS` | Run headless | `true` |
| `QWENCLAW_GUI_BROWSER` | Default browser | `chromium` |
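
As a rough illustration of how the two `QWENCLAW_GUI_*` variables and their defaults might be resolved, consider the sketch below. `settingsFromEnv` and `parseBrowser` are hypothetical helpers, and the rule that only the literal string `false` disables headless mode is an assumption. In Node, `process.env` would be passed as the environment map.

```typescript
type BrowserName = "chromium" | "firefox" | "webkit";

interface EnvSettings {
  browserType: BrowserName;
  headless: boolean;
}

const BROWSERS: BrowserName[] = ["chromium", "firefox", "webkit"];

// Validate a browser name against the three supported engines.
function parseBrowser(value: string): BrowserName {
  for (const candidate of BROWSERS) {
    if (candidate === value) return candidate;
  }
  throw new Error("Unsupported browser: " + value);
}

// Hypothetical resolver: reads the variables from an environment map
// and applies the defaults listed in the table above.
function settingsFromEnv(env: Record<string, string | undefined>): EnvSettings {
  const browserType = parseBrowser(env["QWENCLAW_GUI_BROWSER"] ?? "chromium");
  // Assumption: only the literal string "false" turns headless mode off.
  const headless = env["QWENCLAW_GUI_HEADLESS"] !== "false";
  return { browserType, headless };
}

const fromEnv = settingsFromEnv({ QWENCLAW_GUI_BROWSER: "firefox", QWENCLAW_GUI_HEADLESS: "false" });
console.log(fromEnv.browserType, fromEnv.headless);
// prints "firefox false"
```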

---

## Integration with QwenClaw Daemon

### Add to rig-service

Update `rig-service/src/agent.rs`:

```rust
// Add GUI automation capability.
// Sketch only: assumes a Rust-side `GUIAutomation` wrapper exposing the same
// operations as the TypeScript class.
async fn automate_gui(task: &str) -> Result<String> {
    let automation = GUIAutomation::default();
    automation.launch().await?;

    // Parse and execute task
    let result = automation.execute(task).await?;

    automation.close().await?;
    Ok(result)
}
```

### Add to QwenClaw skills

The skill is automatically available when installed in `skills/gui-automation/`.

---

## Troubleshooting

### Issue: "Browser not found"

**Solution:**
```bash
# Install browser binaries
npx playwright install chromium
```

### Issue: "Timeout waiting for element"

**Solution:**
```typescript
// Increase timeout
await automation.waitFor('.element', { timeout: 30000 });

// Or wait for specific event
await automation.waitForNavigation({ waitUntil: 'networkidle' });
```

### Issue: "Screenshot not saved"

**Solution:**
```bash
# Ensure directory exists
mkdir -p ~/.qwen/qwenclaw/screenshots

# Check permissions
chmod 755 ~/.qwen/qwenclaw/screenshots
```

---

## Best Practices

### 1. **Always Close Browser**

```typescript
try {
  await automation.launch();
  // ... do work
} finally {
  await automation.close(); // Always close
}
```
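
The same try/finally pattern can be wrapped once and reused. The sketch below is illustrative only: `withAutomation` and the `Closable` interface are hypothetical, and a stand-in object replaces a real `GUIAutomation` instance so the example runs without a browser.

```typescript
// Minimal shape this helper relies on; the skill's GUIAutomation class is
// assumed to satisfy it.
interface Closable {
  launch(): Promise<void>;
  close(): Promise<void>;
}

// Hypothetical wrapper: launches, runs `work`, and always closes,
// even when `work` throws.
async function withAutomation<A extends Closable, T>(
  automation: A,
  work: (automation: A) => Promise<T>,
): Promise<T> {
  try {
    await automation.launch();
    return await work(automation);
  } finally {
    await automation.close();
  }
}

// Demonstration with a stand-in object instead of a real browser.
const lifecycle: string[] = [];
const fakeAutomation: Closable = {
  launch: async () => { lifecycle.push("launch"); },
  close: async () => { lifecycle.push("close"); },
};

withAutomation(fakeAutomation, async () => { throw new Error("boom"); })
  .catch(() => console.log(lifecycle.join(",")));
// prints "launch,close"
```

The browser is closed even though the work function failed, and the original error still reaches the caller.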

### 2. **Use Headless for Automation**

```typescript
const automation = new GUIAutomation({ headless: true });
```

### 3. **Wait for Elements**

```typescript
// Don't assume element exists
await automation.waitFor('#dynamic-content');
const text = await automation.getText('#dynamic-content');
```

### 4. **Handle Errors**

```typescript
try {
  await automation.click('#button');
} catch (err) {
  console.error('Click failed:', err);
  // Fallback or retry
}
```
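
For the "retry" branch, one possible approach is a small generic helper. This is a sketch under stated assumptions: `withRetry` is hypothetical, the attempt count and delays are arbitrary, and `flakyClick` stands in for a call such as `automation.click('#button')`.

```typescript
// Hypothetical helper: retries an async action up to `attempts` times,
// waiting a little longer after each failure (linear backoff).
async function withRetry<T>(
  action: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 200,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await action();
    } catch (err) {
      lastError = err;
      if (attempt < attempts) {
        await new Promise<void>((resolve) => setTimeout(resolve, baseDelayMs * attempt));
      }
    }
  }
  // Every attempt failed: surface the most recent error.
  throw lastError;
}

// Stand-in for a click that fails twice and then succeeds.
let clickAttempts = 0;
const flakyClick = async (): Promise<string> => {
  clickAttempts++;
  if (clickAttempts < 3) throw new Error("element not ready");
  return "clicked";
};

withRetry(flakyClick, 3, 10).then((result) => console.log(result, clickAttempts));
// prints "clicked 3"
```

Retrying suits transient failures such as slow rendering; a wrong selector will fail on every attempt, so the final error should still be logged or rethrown.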

### 5. **Use Descriptive Selectors**

```typescript
// Good
await automation.click('[data-testid="submit-button"]');

// Bad (brittle)
await automation.click('div > div:nth-child(3) > button');
```
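
When test IDs come from variables, building the selector through a helper avoids quoting mistakes. `byTestId` below is a hypothetical utility, not part of the skill. Playwright itself also offers `getByTestId` locators, though whether this skill exposes them is not stated here.

```typescript
// Hypothetical helper: builds an attribute selector for a data-testid value,
// escaping backslashes and double quotes so the quoted value stays intact.
function byTestId(id: string): string {
  const escaped = id.replace(/\\/g, "\\\\").replace(/"/g, '\\"');
  return '[data-testid="' + escaped + '"]';
}

console.log(byTestId("submit-button"));
// prints [data-testid="submit-button"]
```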

---

## Security Considerations

### 1. **Don't Automate Sensitive Sites**

Avoid automating:
- Banking websites
- Password managers
- Private data entry
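
One way to enforce such a rule is to check each URL against a blocklist before navigating. The sketch below is an assumption about how that could look: `isBlockedUrl` is hypothetical, and the hosts are placeholders rather than a vetted list.

```typescript
// Hypothetical blocklist; the entries are placeholders, not a vetted list.
const BLOCKED_HOSTS: string[] = ["bank.example", "vault.example"];

// Returns true when the URL's host is a blocked host or one of its subdomains.
// `new URL` throws on malformed input, which callers may want to treat as blocked.
function isBlockedUrl(rawUrl: string, blocked: string[] = BLOCKED_HOSTS): boolean {
  const host = new URL(rawUrl).hostname.toLowerCase();
  return blocked.some((b) => host === b || host.slice(-(b.length + 1)) === "." + b);
}

console.log(isBlockedUrl("https://login.bank.example/account")); // prints true
console.log(isBlockedUrl("https://example.com/"));               // prints false
```

A caller would run this check before `automation.goto(url)` and refuse to proceed when it returns `true`.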

### 2. **Use Headless Mode**

```json
{
  "headless": true
}
```

### 3. **Clear Cookies After Session**

```typescript
await automation.close(); // Clears all session data
```

---

## Resources

- **Playwright Docs:** https://playwright.dev/
- **Playwright Selectors:** https://playwright.dev/docs/selectors
- **Playwright API:** https://playwright.dev/docs/api/class-playwright

---

## Changelog

### v1.0.0 (2026-02-26)
- Initial release
- Full browser automation
- Screenshot capture
- Element interaction
- Data extraction
- JavaScript execution

---

## License

MIT License - See LICENSE file for details.

---

**GUI Automation skill ready for QwenClaw!** 🖥️🤖