feat: Add intelligent auto-router and enhanced integrations
- Add intelligent-router.sh hook for automatic agent routing - Add AUTO-TRIGGER-SUMMARY.md documentation - Add FINAL-INTEGRATION-SUMMARY.md documentation - Complete Prometheus integration (6 commands + 4 tools) - Complete Dexto integration (12 commands + 5 tools) - Enhanced Ralph with access to all agents - Fix /clawd command (removed disable-model-invocation) - Update hooks.json to v5 with intelligent routing - 291 total skills now available - All 21 commands with automatic routing 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
200
dexto/agents/nano-banana-agent/README.md
Normal file
200
dexto/agents/nano-banana-agent/README.md
Normal file
@@ -0,0 +1,200 @@
|
||||
# Nano Banana Agent
|
||||
|
||||
A Dexto agent that provides access to Google's **Gemini 2.5 Flash Image** model for image generation and editing through a lean, powerful MCP server.
|
||||
|
||||
## 🎯 What is Gemini 2.5 Flash Image?
|
||||
|
||||
Gemini 2.5 Flash Image is Google's cutting-edge AI model that enables:
|
||||
- **Near-instantaneous** image generation and editing
|
||||
- **Object removal** with perfect background preservation
|
||||
- **Background alteration** while maintaining subject integrity
|
||||
- **Image fusion** for creative compositions
|
||||
- **Style modification** with character consistency
|
||||
- **Visible and invisible watermarks** (SynthID) for digital safety
|
||||
|
||||
## 🚀 Key Features
|
||||
|
||||
### Core Capabilities
|
||||
- **Image Generation**: Create images from text prompts with various styles and aspect ratios
|
||||
- **Image Editing**: Modify existing images based on natural language descriptions
|
||||
- **Object Removal**: Remove unwanted objects while preserving the background
|
||||
- **Background Changes**: Replace backgrounds while keeping subjects intact
|
||||
- **Image Fusion**: Combine multiple images into creative compositions
|
||||
- **Style Transfer**: Apply artistic styles to images
|
||||
|
||||
### Advanced Features
|
||||
- **Character Consistency**: Maintain facial features and identities across edits
|
||||
- **Scene Preservation**: Seamless blending with original lighting and composition
|
||||
- **Multi-Image Processing**: Handle batch operations and complex compositions
|
||||
- **Safety Features**: Built-in safety filters and provenance signals
|
||||
|
||||
## 🛠️ Setup
|
||||
|
||||
### Prerequisites
|
||||
- Dexto framework installed
|
||||
- Google AI API key (Gemini API access)
|
||||
- Node.js 20.0.0 or higher
|
||||
|
||||
### Installation
|
||||
1. **Set up environment variables**:
|
||||
```bash
|
||||
export GOOGLE_GENERATIVE_AI_API_KEY="your-google-ai-api-key"
|
||||
# or
|
||||
export GEMINI_API_KEY="your-google-ai-api-key"
|
||||
```
|
||||
|
||||
2. **Run the agent** (the MCP server will be automatically downloaded via npx):
|
||||
```bash
|
||||
# From the dexto repository root
|
||||
npx dexto -a agents/nano-banana-agent/nano-banana-agent.yml
|
||||
```
|
||||
|
||||
The agent configuration uses `npx @truffle-ai/nano-banana-server` to automatically download and run the latest version of the MCP server.
|
||||
|
||||
## 📋 Available Tools
|
||||
|
||||
The agent provides access to 3 essential tools:
|
||||
|
||||
### 1. `generate_image`
|
||||
Generate new images from text prompts.
|
||||
|
||||
**Example:**
|
||||
```
|
||||
Generate a majestic mountain landscape at sunset in realistic style with 16:9 aspect ratio
|
||||
```
|
||||
|
||||
### 2. `process_image`
|
||||
Process existing images based on detailed instructions. This tool can handle any image editing task including object removal, background changes, style transfer, adding elements, and more.
|
||||
|
||||
**Example:**
|
||||
```
|
||||
Remove the red car in the background from /path/to/photo.jpg
|
||||
```
|
||||
|
||||
**Example:**
|
||||
```
|
||||
Change the background of /path/to/portrait.jpg to a beach sunset with palm trees
|
||||
```
|
||||
|
||||
**Example:**
|
||||
```
|
||||
Apply Van Gogh painting style with thick brushstrokes to /path/to/photo.jpg
|
||||
```
|
||||
|
||||
### 3. `process_multiple_images`
|
||||
Process multiple images together based on detailed instructions. This tool can combine images, create collages, blend compositions, or perform any multi-image operation.
|
||||
|
||||
**Example:**
|
||||
```
|
||||
Place the person from /path/to/person.jpg into the landscape from /path/to/landscape.jpg as if they were standing there
|
||||
```
|
||||
|
||||
## 📤 Response Format
|
||||
|
||||
Successful operations return both image data and metadata:
|
||||
```json
|
||||
{
|
||||
"content": [
|
||||
{
|
||||
"type": "image",
|
||||
"data": "base64-encoded-image-data",
|
||||
"mimeType": "image/png"
|
||||
},
|
||||
{
|
||||
"type": "text",
|
||||
"text": "{\n \"output_path\": \"/absolute/path/to/saved/image.png\",\n \"size_bytes\": 12345,\n \"format\": \"image/png\"\n}"
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
## 🎨 Popular Use Cases
|
||||
|
||||
### 1. **Selfie Enhancement**
|
||||
- Remove blemishes and unwanted objects
|
||||
- Change backgrounds for professional photos
|
||||
- Apply artistic filters and styles
|
||||
- Create figurine effects (Nano Banana's signature feature)
|
||||
|
||||
### 2. **Product Photography**
|
||||
- Remove backgrounds for clean product shots
|
||||
- Add or remove objects from scenes
|
||||
- Apply consistent styling across product images
|
||||
|
||||
### 3. **Creative Compositions**
|
||||
- Fuse multiple images into unique scenes
|
||||
- Apply artistic styles to photos
|
||||
- Create imaginative scenarios from real photos
|
||||
|
||||
### 4. **Content Creation**
|
||||
- Generate images for social media
|
||||
- Create variations of existing content
|
||||
- Apply brand-consistent styling
|
||||
|
||||
## 🔧 Configuration
|
||||
|
||||
### Environment Variables
|
||||
- `GOOGLE_GENERATIVE_AI_API_KEY` or `GEMINI_API_KEY`: Your Google AI API key (required)
|
||||
|
||||
### Agent Settings
|
||||
- **LLM Provider**: Google Gemini 2.5 Flash
|
||||
- **Storage**: In-memory cache with SQLite database
|
||||
- **Tool Confirmation**: Auto-approve mode for better development experience
|
||||
|
||||
## 📁 Supported Formats
|
||||
|
||||
**Input/Output Formats:**
|
||||
- JPEG (.jpg, .jpeg)
|
||||
- PNG (.png)
|
||||
- WebP (.webp)
|
||||
- GIF (.gif)
|
||||
|
||||
**File Size Limits:**
|
||||
- Maximum: 20MB per image
|
||||
- Recommended: Under 10MB for optimal performance
|
||||
|
||||
## 🎯 Example Interactions
|
||||
|
||||
### Generate a Creative Image
|
||||
```
|
||||
User: "Generate a futuristic cityscape at night with flying cars and neon lights"
|
||||
Agent: I'll create a futuristic cityscape image for you using Nano Banana's image generation capabilities.
|
||||
```
|
||||
|
||||
### Remove Unwanted Objects
|
||||
```
|
||||
User: "Remove the power lines from this photo: /path/to/landscape.jpg"
|
||||
Agent: I'll remove the power lines from your landscape photo while preserving the natural background.
|
||||
```
|
||||
|
||||
### Create Figurine Effect
|
||||
```
|
||||
User: "Transform this selfie into a mini figurine on a desk: /path/to/selfie.jpg"
|
||||
Agent: I'll create Nano Banana's signature figurine effect, transforming your selfie into a mini figurine displayed on a desk.
|
||||
```
|
||||
|
||||
### Change Background
|
||||
```
|
||||
User: "Change the background of this portrait to a professional office setting: /path/to/portrait.jpg"
|
||||
Agent: I'll replace the background with a professional office setting while keeping you as the main subject.
|
||||
```
|
||||
|
||||
## 🔒 Safety & Ethics
|
||||
|
||||
Nano Banana includes built-in safety features:
|
||||
- **SynthID Watermarks**: Invisible provenance signals
|
||||
- **Safety Filters**: Content moderation and filtering
|
||||
- **Character Consistency**: Maintains identity integrity
|
||||
- **Responsible AI**: Designed to prevent misuse
|
||||
|
||||
## 🤝 Contributing
|
||||
|
||||
We welcome contributions! Please see our [Contributing Guidelines](../../CONTRIBUTING.md) for details.
|
||||
|
||||
## 📄 License
|
||||
|
||||
This project is licensed under the MIT License - see the [LICENSE](../../LICENSE) file for details.
|
||||
|
||||
---
|
||||
|
||||
**Note**: This agent provides access to Google's Gemini 2.5 Flash Image model through the MCP protocol. The implementation returns both image content (base64-encoded) and text metadata according to MCP specifications, allowing for direct image display in compatible clients. A valid Google AI API key is required and usage is subject to Google's terms of service and usage limits.
|
||||
Reference in New Issue
Block a user