# GUI Automation Skill for QwenClaw

## Overview

This skill provides **full GUI automation capabilities** to QwenClaw using Playwright. Control web browsers, interact with elements, capture screenshots, and automate complex web workflows.

- **Version:** 1.0.0
- **Category:** Automation
- **Dependencies:** Playwright

---

## Features

### 🌐 **Web Browser Automation**
- Launch Chromium, Firefox, or WebKit
- Navigate to any URL
- Handle multiple tabs/windows
- Custom viewport and user agent

### 🖱️ **Element Interaction**
- Click buttons and links
- Type text into inputs
- Fill forms
- Select dropdown options
- Check/uncheck checkboxes
- Hover over elements

### 📸 **Screenshot & Capture**
- Full page screenshots
- Element-specific screenshots
- Save to configurable directory
- Automatic timestamps

### 📊 **Data Extraction**
- Extract text from elements
- Get attributes (href, src, etc.)
- Scrape tables and lists
- Export to JSON/CSV

### ⌨️ **Keyboard & Navigation**
- Press keyboard shortcuts
- Wait for elements
- Wait for navigation
- Execute JavaScript

### 📥 **File Operations**
- Download files
- Upload files
- Handle file dialogs

---

## Installation

### 1. Install Playwright

```bash
cd qwenclaw
bun add playwright
```

### 2. Install Browser Binaries

```bash
# Install all browsers
npx playwright install

# Or specific browsers
npx playwright install chromium
npx playwright install firefox
npx playwright install webkit
```

### 3. Enable Skill

Copy this skill to the QwenClaw skills directory:

```bash
cp -r skills/gui-automation ~/.qwen/qwenclaw/skills/gui-automation
```

---

## Usage

### From Qwen Code Chat

```
Use gui-automation to take a screenshot of https://example.com
Use gui-automation to navigate to GitHub and click the sign in button
Use gui-automation to fill the login form with username "test" and password "pass123"
Use gui-automation to extract all article titles from the page
```

### From Terminal CLI

```bash
# Take screenshot
qwenclaw gui screenshot https://example.com

# Navigate to page
qwenclaw gui navigate https://github.com

# Click element
qwenclaw gui click "#login-button"

# Type text
qwenclaw gui type "search query" "#search-input"

# Extract text
qwenclaw gui extract ".article-title"

# Get page title
qwenclaw gui title
```

### Programmatic Usage

```typescript
import { GUIAutomation } from './tools/gui-automation';

const automation = new GUIAutomation({
  browserType: 'chromium',
  headless: true,
});

await automation.launch();
await automation.goto('https://example.com');
await automation.screenshot('example.png');
await automation.close();
```

---

## API Reference

### GUIAutomation Class

#### Constructor

```typescript
new GUIAutomation(config?: Partial<GUIAutomationConfig>)
```

**Config:**

```typescript
interface GUIAutomationConfig {
  browserType: 'chromium' | 'firefox' | 'webkit';
  headless: boolean;
  viewport?: { width: number; height: number };
  userAgent?: string;
}
```

#### Methods

| Method | Description | Example |
|--------|-------------|---------|
| `launch()` | Launch browser | `await automation.launch()` |
| `close()` | Close browser | `await automation.close()` |
| `goto(url)` | Navigate to URL | `await automation.goto('https://...')` |
| `screenshot(name?)` | Take screenshot | `await automation.screenshot('page.png')` |
| `click(selector)` | Click element | `await automation.click('#btn')` |
| `type(selector, text)` | Type text | `await automation.type('#input', 'hello')` |
| `fill(selector, value)` | Fill field | `await automation.fill('#email', 'test@test.com')` |
| `select(selector, value)` | Select option | `await automation.select('#country', 'US')` |
| `check(selector)` | Check checkbox | `await automation.check('#agree')` |
| `getText(selector)` | Get element text | `const text = await automation.getText('#title')` |
| `getAttribute(sel, attr)` | Get attribute | `const href = await automation.getAttribute('a', 'href')` |
| `waitFor(selector)` | Wait for element | `await automation.waitFor('.loaded')` |
| `evaluate(script)` | Execute JS | `const result = await automation.evaluate('2+2')` |
| `scrollTo(selector)` | Scroll to element | `await automation.scrollTo('#footer')` |
| `hover(selector)` | Hover element | `await automation.hover('#menu')` |
| `press(key)` | Press key | `await automation.press('Enter')` |
| `getTitle()` | Get page title | `const title = await automation.getTitle()` |
| `getUrl()` | Get page URL | `const url = await automation.getUrl()` |
| `getHTML()` | Get page HTML | `const html = await automation.getHTML()` |

---

## Examples

### Example 1: Take Screenshot

```typescript
import { GUIAutomation } from './tools/gui-automation';

const automation = new GUIAutomation();
await automation.launch();
await automation.goto('https://example.com');
await automation.screenshot('example-homepage.png');
await automation.close();
```

### Example 2: Fill and Submit Form

```typescript
const automation = new GUIAutomation();
await automation.launch();

await automation.goto('https://example.com/login');
await automation.fill('#email', 'user@example.com');
await automation.fill('#password', 'secret123');
await automation.check('#remember-me');
await automation.click('#submit-button');

await automation.screenshot('logged-in.png');
await automation.close();
```

### Example 3: Web Scraping

```typescript
const automation = new GUIAutomation();
await automation.launch();
await automation.goto('https://news.ycombinator.com');

// Extract all article titles
const titles = await automation.findAll('.titleline > a');
console.log('Articles:', titles.map(t => t.text));

// Extract specific data
const data = await automation.extractData({
  topStory: '.titleline > a',
  score: '.score',
  comments: '.subtext a:last-child',
});

await automation.close();
```

### Example 4: Wait for Dynamic Content

```typescript
const automation = new GUIAutomation();
await automation.launch();
await automation.goto('https://example.com');

// Wait for element to appear
await automation.waitFor('.loaded-content', { timeout: 10000 });

// Wait for navigation after click
await automation.click('#load-more');
await automation.waitForNavigation({ waitUntil: 'networkidle' });

await automation.screenshot('loaded.png');
await automation.close();
```

### Example 5: Execute JavaScript

```typescript
const automation = new GUIAutomation();
await automation.launch();
await automation.goto('https://example.com');

// Get window size
const size = await automation.evaluate(() => ({
  width: window.innerWidth,
  height: window.innerHeight,
}));

// Modify page
await automation.evaluate(() => {
  document.body.style.background = 'red';
});

await automation.screenshot('modified.png');
await automation.close();
```

---

## Common Selectors

| Selector Type | Example |
|--------------|---------|
| ID | `#login-button` |
| Class | `.article-title` |
| Tag | `input`, `button`, `a` |
| Attribute | `[type="email"]`, `[href*="github"]` |
| CSS | `div > p.text`, `ul li:first-child` |
| XPath | `//button[text()="Submit"]` |

---

## Configuration

### config.json

Create `~/.qwen/qwenclaw/gui-config.json`:

```json
{
  "browserType": "chromium",
  "headless": true,
  "viewport": {
    "width": 1920,
    "height": 1080
  },
  "userAgent": "Mozilla/5.0...",
  "screenshotsDir": "~/.qwen/qwenclaw/screenshots",
  "timeout": 30000,
  "waitForNetworkIdle": true
}
```

### Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `PLAYWRIGHT_BROWSERS_PATH` | Browser binaries location | `~/.cache/ms-playwright` |
| `QWENCLAW_GUI_HEADLESS` | Run headless | `true` |
| `QWENCLAW_GUI_BROWSER` | Default browser | `chromium` |

---

## Integration with QwenClaw Daemon

### Add to rig-service

Update `rig-service/src/agent.rs`:

```rust
// Add GUI automation capability
async fn automate_gui(task: &str) -> Result {
    let automation = GUIAutomation::default();
    automation.launch().await?;

    // Parse and execute task
    let result = automation.execute(task).await?;

    automation.close().await?;
    Ok(result)
}
```

### Add to QwenClaw skills

The skill is automatically available when installed in `skills/gui-automation/`.

---

## Troubleshooting

### Issue: "Browser not found"

**Solution:**

```bash
# Install browser binaries
npx playwright install chromium
```

### Issue: "Timeout waiting for element"

**Solution:**

```typescript
// Increase timeout
await automation.waitFor('.element', { timeout: 30000 });

// Or wait for a specific event
await automation.waitForNavigation({ waitUntil: 'networkidle' });
```

### Issue: "Screenshot not saved"

**Solution:**

```bash
# Ensure directory exists
mkdir -p ~/.qwen/qwenclaw/screenshots

# Check permissions
chmod 755 ~/.qwen/qwenclaw/screenshots
```

---

## Best Practices

### 1. **Always Close Browser**

```typescript
const automation = new GUIAutomation();
try {
  await automation.launch();
  // ... do work
} finally {
  await automation.close(); // Always close, even if the work above throws
}
```

### 2. **Use Headless for Automation**

```typescript
const automation = new GUIAutomation({ headless: true });
```

### 3. **Wait for Elements**

```typescript
// Don't assume the element exists
await automation.waitFor('#dynamic-content');
const text = await automation.getText('#dynamic-content');
```

### 4. **Handle Errors**

```typescript
try {
  await automation.click('#button');
} catch (err) {
  console.error('Click failed:', err);
  // Fallback or retry
}
```

### 5. **Use Descriptive Selectors**

```typescript
// Good
await automation.click('[data-testid="submit-button"]');

// Bad (brittle)
await automation.click('div > div:nth-child(3) > button');
```

---

## Security Considerations

### 1. **Don't Automate Sensitive Sites**

Avoid automating:
- Banking websites
- Password managers
- Private data entry

### 2. **Use Headless Mode**

```json
{
  "headless": true
}
```

### 3. **Clear Cookies After Session**

```typescript
await automation.close(); // Clears all session data
```

---

## Resources

- **Playwright Docs:** https://playwright.dev/
- **Playwright Selectors:** https://playwright.dev/docs/selectors
- **Playwright API:** https://playwright.dev/docs/api/class-playwright

---

## Changelog

### v1.0.0 (2026-02-26)
- Initial release
- Full browser automation
- Screenshot capture
- Element interaction
- Data extraction
- JavaScript execution

---

## License

MIT License - See LICENSE file for details.

---

**GUI Automation skill ready for QwenClaw!** 🖥️🤖
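
## Appendix: Resolving Configuration (Sketch)

The Configuration section lists `browserType` and `headless` both as `gui-config.json` keys and as environment variables (`QWENCLAW_GUI_BROWSER`, `QWENCLAW_GUI_HEADLESS`), but does not state which source takes priority. The sketch below shows one plausible way to combine them. It is not taken from the skill's source: the helper name `resolveConfig`, the precedence order (environment variable over config file over default), and the rule for parsing the headless flag are all assumptions made for illustration.

```typescript
type BrowserType = "chromium" | "firefox" | "webkit";

interface ResolvedConfig {
  browserType: BrowserType;
  headless: boolean;
}

// Defaults as listed in the Environment Variables table.
const DEFAULTS: ResolvedConfig = { browserType: "chromium", headless: true };

const BROWSERS: readonly BrowserType[] = ["chromium", "firefox", "webkit"];

// Hypothetical helper, not part of the documented GUIAutomation API.
// Assumed precedence: environment variable > config file > default.
function resolveConfig(
  fileConfig: Partial<ResolvedConfig>,
  env: Record<string, string | undefined>,
): ResolvedConfig {
  // Start from the defaults, then overlay whatever the config file provides.
  const merged: ResolvedConfig = { ...DEFAULTS, ...fileConfig };

  const browser = env["QWENCLAW_GUI_BROWSER"];
  if (browser !== undefined) {
    // Reject values outside the three browsers the skill supports.
    if (BROWSERS.indexOf(browser as BrowserType) === -1) {
      throw new Error(`Unsupported browser: ${browser}`);
    }
    merged.browserType = browser as BrowserType;
  }

  const headless = env["QWENCLAW_GUI_HEADLESS"];
  if (headless !== undefined) {
    // Assumption: only the literal string "false" (any case) disables headless mode.
    merged.headless = headless.toLowerCase() !== "false";
  }

  return merged;
}
```

In a Node or Bun process this could be called as `resolveConfig(JSON.parse(fileText), process.env)`, where `fileText` is the content of `gui-config.json`, and the result passed to `new GUIAutomation(...)`. Taking the environment as a parameter rather than reading it inside the function keeps the helper easy to test.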