Nano Banana Agent

A Dexto agent that provides access to Google's Gemini 2.5 Flash Image model for image generation and editing through a lean, powerful MCP server.

🎯 What is Gemini 2.5 Flash Image?

Gemini 2.5 Flash Image is Google's image generation and editing model. It supports:

  • Fast image generation and editing
  • Object removal that preserves the surrounding background
  • Background alteration while maintaining subject integrity
  • Image fusion for creative compositions
  • Style modification with character consistency
  • Visible watermarks and invisible SynthID watermarks that identify images as AI-generated

🚀 Key Features

Core Capabilities

  • Image Generation: Create images from text prompts with various styles and aspect ratios
  • Image Editing: Modify existing images based on natural language descriptions
  • Object Removal: Remove unwanted objects while preserving the background
  • Background Changes: Replace backgrounds while keeping subjects intact
  • Image Fusion: Combine multiple images into creative compositions
  • Style Transfer: Apply artistic styles to images

Advanced Features

  • Character Consistency: Maintain facial features and identities across edits
  • Scene Preservation: Seamless blending with original lighting and composition
  • Multi-Image Processing: Handle batch operations and complex compositions
  • Safety Features: Built-in safety filters and provenance signals

🛠️ Setup

Prerequisites

  • Dexto framework installed
  • Google AI API key (Gemini API access)
  • Node.js 20.0.0 or higher

Installation

  1. Set up environment variables:

    export GOOGLE_GENERATIVE_AI_API_KEY="your-google-ai-api-key"
    # or
    export GEMINI_API_KEY="your-google-ai-api-key"
    
  2. Run the agent (the MCP server will be automatically downloaded via npx):

    # From the dexto repository root
    npx dexto -a agents/nano-banana-agent/nano-banana-agent.yml
    

The agent configuration uses npx @truffle-ai/nano-banana-server to automatically download and run the latest version of the MCP server.

📋 Available Tools

The agent provides access to three tools:

1. generate_image

Generate new images from text prompts.

Example:

Generate a majestic mountain landscape at sunset in realistic style with 16:9 aspect ratio

2. process_image

Process existing images based on detailed instructions. This tool can handle any image editing task including object removal, background changes, style transfer, adding elements, and more.

Example (object removal):

Remove the red car in the background from /path/to/photo.jpg

Example (background change):

Change the background of /path/to/portrait.jpg to a beach sunset with palm trees

Example (style transfer):

Apply Van Gogh painting style with thick brushstrokes to /path/to/photo.jpg

3. process_multiple_images

Process multiple images together based on detailed instructions. This tool can combine images, create collages, blend compositions, or perform any multi-image operation.

Example:

Place the person from /path/to/person.jpg into the landscape from /path/to/landscape.jpg as if they were standing there
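
In normal use you type prompts like the ones above and the agent issues the tool call for you. For readers curious what such a call looks like on the wire, here is a minimal Python sketch of the JSON-RPC `tools/call` request that MCP defines. The argument names (`prompt`, `aspect_ratio`) are assumptions made for illustration and are not taken from this README; the server's actual input schema is what `tools/list` reports. `build_tool_call` is a hypothetical helper.

```python
import json
from itertools import count

# JSON-RPC ids only need to be unique within a session.
_request_ids = count(1)


def build_tool_call(tool_name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` request in the shape MCP defines."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical argument names: check the server's tool schema before relying on them.
request = build_tool_call(
    "generate_image",
    {
        "prompt": "A majestic mountain landscape at sunset, realistic style",
        "aspect_ratio": "16:9",
    },
)
print(json.dumps(request, indent=2))
```

The tool names (`generate_image`, `process_image`, `process_multiple_images`) come from the list above; everything inside `arguments` is a placeholder.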

📤 Response Format

Successful operations return both image data and metadata:

{
  "content": [
    {
      "type": "image",
      "data": "base64-encoded-image-data",
      "mimeType": "image/png"
    },
    {
      "type": "text",
      "text": "{\n  \"output_path\": \"/absolute/path/to/saved/image.png\",\n  \"size_bytes\": 12345,\n  \"format\": \"image/png\"\n}"
    }
  ]
}
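
A client that receives a response in the shape above needs to base64-decode the image item and parse the JSON string carried in the text item. The following Python sketch shows one way to do that. `parse_tool_result` is a hypothetical helper, and the sample payload is a stand-in built to match the documented shape rather than real server output.

```python
import base64
import json


def parse_tool_result(result: dict) -> tuple[bytes, dict]:
    """Return (image_bytes, metadata) from a response shaped like the example above."""
    image_bytes = b""
    metadata: dict = {}
    for item in result.get("content", []):
        if item.get("type") == "image":
            # The image payload is base64 text; decode it to raw bytes.
            image_bytes = base64.b64decode(item["data"])
        elif item.get("type") == "text":
            # The text item carries a JSON document encoded as a string.
            metadata = json.loads(item["text"])
    return image_bytes, metadata


# Stand-in payload with placeholder bytes instead of a real PNG.
sample = {
    "content": [
        {
            "type": "image",
            "data": base64.b64encode(b"fake-png-bytes").decode("ascii"),
            "mimeType": "image/png",
        },
        {
            "type": "text",
            "text": json.dumps(
                {"output_path": "/tmp/image.png", "size_bytes": 14, "format": "image/png"}
            ),
        },
    ]
}

image, meta = parse_tool_result(sample)
print(len(image), meta["output_path"])
```

Because the server also saves the image and reports `output_path`, a client that only needs the file on disk can read the metadata and skip decoding.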

🌟 Popular Use Cases

1. Selfie Enhancement

  • Remove blemishes and unwanted objects
  • Change backgrounds for professional photos
  • Apply artistic filters and styles
  • Create figurine effects (Nano Banana's signature feature)

2. Product Photography

  • Remove backgrounds for clean product shots
  • Add or remove objects from scenes
  • Apply consistent styling across product images

3. Creative Compositions

  • Fuse multiple images into unique scenes
  • Apply artistic styles to photos
  • Create imaginative scenarios from real photos

4. Content Creation

  • Generate images for social media
  • Create variations of existing content
  • Apply brand-consistent styling

🔧 Configuration

Environment Variables

  • GOOGLE_GENERATIVE_AI_API_KEY or GEMINI_API_KEY: Your Google AI API key (required)
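
A script that wraps the agent could confirm that a key is present before launching it. This is a minimal Python sketch, assuming `GOOGLE_GENERATIVE_AI_API_KEY` wins when both variables are set; the README does not state a precedence, so treat that order as a guess. `resolve_api_key` and `KEY_VARS` are illustrative names.

```python
import os

# Assumed lookup order; the source only says either variable is accepted.
KEY_VARS = ("GOOGLE_GENERATIVE_AI_API_KEY", "GEMINI_API_KEY")


def resolve_api_key(env=None) -> str:
    """Return the first non-empty API key found in the environment mapping."""
    env = os.environ if env is None else env
    for name in KEY_VARS:
        value = env.get(name, "").strip()
        if value:
            return value
    raise RuntimeError(
        "Set GOOGLE_GENERATIVE_AI_API_KEY or GEMINI_API_KEY before starting the agent"
    )
```

Passing a plain dict as `env` makes the function easy to exercise without touching the real environment, for example `resolve_api_key({"GEMINI_API_KEY": "abc"})`.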

Agent Settings

  • LLM Provider: Google Gemini 2.5 Flash
  • Storage: In-memory cache with SQLite database
  • Tool Confirmation: Auto-approve mode, so tool calls run without a confirmation prompt during development

📁 Supported Formats

Input/Output Formats:

  • JPEG (.jpg, .jpeg)
  • PNG (.png)
  • WebP (.webp)
  • GIF (.gif)

File Size Limits:

  • Maximum: 20MB per image
  • Recommended: Under 10MB for optimal performance
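
These limits can be checked on the client side before a file is handed to the agent. The Python sketch below encodes the formats and sizes listed above. It reads "MB" as MiB (1024 × 1024 bytes), which is an assumption, and `check_image` is a hypothetical helper, not part of the server.

```python
from pathlib import Path

SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".gif"}
MAX_BYTES = 20 * 1024 * 1024          # assumption: "20MB" read as 20 MiB
RECOMMENDED_BYTES = 10 * 1024 * 1024  # assumption: "10MB" read as 10 MiB


def check_image(path: str, size_bytes: int) -> tuple[bool, list[str]]:
    """Return (ok, notes); ok is False when the file breaks a hard limit."""
    ok = True
    notes: list[str] = []
    ext = Path(path).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        ok = False
        notes.append(f"unsupported format: {ext or '(none)'}")
    if size_bytes > MAX_BYTES:
        ok = False
        notes.append("file exceeds the 20MB limit")
    elif size_bytes > RECOMMENDED_BYTES:
        # Allowed, but above the size recommended for optimal performance.
        notes.append("file is over the recommended 10MB")
    return ok, notes
```

The size is passed in as an argument so the sketch runs without touching the filesystem; for a real file it can be obtained with `os.path.getsize(path)`.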

🎯 Example Interactions

Generate a Creative Image

User: "Generate a futuristic cityscape at night with flying cars and neon lights"
Agent: I'll create a futuristic cityscape image for you using Nano Banana's image generation capabilities.

Remove Unwanted Objects

User: "Remove the power lines from this photo: /path/to/landscape.jpg"
Agent: I'll remove the power lines from your landscape photo while preserving the natural background.

Create Figurine Effect

User: "Transform this selfie into a mini figurine on a desk: /path/to/selfie.jpg"
Agent: I'll create Nano Banana's signature figurine effect, transforming your selfie into a mini figurine displayed on a desk.

Change Background

User: "Change the background of this portrait to a professional office setting: /path/to/portrait.jpg"
Agent: I'll replace the background with a professional office setting while keeping you as the main subject.

🔒 Safety & Ethics

Nano Banana includes built-in safety features:

  • SynthID Watermarks: Invisible provenance signals
  • Safety Filters: Content moderation and filtering
  • Character Consistency: Maintains identity integrity
  • Responsible AI: Designed to prevent misuse

🤝 Contributing

We welcome contributions! Please see our Contributing Guidelines for details.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


Note: This agent provides access to Google's Gemini 2.5 Flash Image model through the Model Context Protocol (MCP). Responses contain both base64-encoded image content and text metadata, as the MCP specification allows, so compatible clients can display images directly. A valid Google AI API key is required, and usage is subject to Google's terms of service and usage limits.