feat: Add intelligent auto-router and enhanced integrations

- Add intelligent-router.sh hook for automatic agent routing
- Add AUTO-TRIGGER-SUMMARY.md documentation
- Add FINAL-INTEGRATION-SUMMARY.md documentation
- Complete Prometheus integration (6 commands + 4 tools)
- Complete Dexto integration (12 commands + 5 tools)
- Enhanced Ralph with access to all agents
- Fix /clawd command (removed disable-model-invocation)
- Update hooks.json to v5 with intelligent routing
- 291 total skills now available
- All 21 commands with automatic routing

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Author: admin
Date: 2026-01-28 00:27:56 +04:00
Commit: b52318eeae (parent: 3b128ba3bd)
1724 changed files with 351216 additions and 0 deletions

dexto/docs/.gitignore

@@ -0,0 +1,20 @@
# Dependencies
/node_modules
# Production
/build
# Generated files
.docusaurus
.cache-loader
# Misc
.DS_Store
.env.local
.env.development.local
.env.test.local
.env.production.local
npm-debug.log*
yarn-debug.log*
yarn-error.log*

dexto/docs/README.md

@@ -0,0 +1,41 @@
# Website
This website is built using [Docusaurus](https://docusaurus.io/), a modern static website generator.
### Installation
```
$ yarn
```
### Local Development
```
$ yarn start
```
This command starts a local development server and opens up a browser window. Most changes are reflected live without having to restart the server.
### Build
```
$ yarn build
```
This command generates static content into the `build` directory and can be served using any static content hosting service.
### Deployment
Using SSH:
```
$ USE_SSH=true yarn deploy
```
Not using SSH:
```
$ GIT_USER=<Your GitHub username> yarn deploy
```
If you are using GitHub pages for hosting, this command is a convenient way to build the website and push to the `gh-pages` branch.


@@ -0,0 +1,34 @@
import type { SidebarsConfig } from '@docusaurus/plugin-content-docs';
const sidebars: SidebarsConfig = {
apiSidebar: [
{
type: 'doc',
id: 'getting-started',
label: 'Getting Started',
},
{
type: 'link',
label: 'REST API Reference',
href: '/api/rest',
},
{
type: 'category',
label: 'Dexto SDK',
link: {
type: 'generated-index',
title: 'Dexto SDK API Reference',
description:
'Complete technical API reference for the Dexto SDK for TypeScript/JavaScript.',
},
items: [
{
type: 'autogenerated',
dirName: 'sdk',
},
],
},
],
};
export default sidebars;


@@ -0,0 +1,9 @@
{
"label": "API Reference",
"position": 3,
"link": {
"type": "generated-index",
"title": "Dexto API Reference",
"description": "Welcome to the Dexto API documentation. Here you will find detailed information about using Dexto as a Library, and about our REST and SSE streaming APIs. To use the REST or SSE API, you must first start the Dexto server by running the following command in your terminal:\n\n```bash\ndexto --mode server\n```\n\nBy default, the server will start on `http://localhost:3001`."
}
}


@@ -0,0 +1,54 @@
---
slug: /
sidebar_position: 1
---
# Getting Started
Welcome to the Dexto API. This guide will walk you through the essential first steps to begin interacting with your Dexto agent programmatically.
## 1. Starting the API Server
Before you can make any API calls, you must start the Dexto server. This single command enables both the REST and SSE streaming APIs.
Run the following command in your terminal:
```bash
dexto --mode server
```
By default, the server will run on port `3001`. You should see a confirmation message in your terminal indicating that the server has started successfully.
**Customize the port:**
```bash
dexto --mode server --port 8080
```
This starts the API server on port 8080 instead of the default 3001.
## 2. Choosing Your API
Dexto offers two distinct APIs to suit different use cases. Understanding when to use each is key to building your application effectively.
### When to use the REST API?
Use the **REST API** for synchronous, request-response actions where you want to perform a task and get a result immediately. It's ideal for:
- Managing resources (e.g., listing or adding MCP servers).
- Retrieving configuration or session data.
- Triggering a single, non-streamed agent response.
**Base URL**: `http://localhost:3001`
### When to use Server-Sent Events (SSE)?
Use **Server-Sent Events (SSE)** for building interactive, real-time applications. It's the best choice for:
- Streaming agent responses (`chunk` events) as they are generated.
- Receiving real-time events from the agent's core, such as `toolCall` and `toolResult`.
- Creating chat-like user interfaces.
**Stream URL**: `http://localhost:3001/api/message-stream`
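The stream endpoint speaks the standard Server-Sent Events wire format: `event:` and `data:` lines, with events separated by a blank line. As a rough, framework-free illustration of the framing (not part of the Dexto SDK), a minimal parser might look like this; the `chunk` and `toolCall` event names come from the list above, but the payload shapes are assumptions:

```typescript
// Minimal SSE wire-format parser: splits a text buffer into events.
// Field handling follows the SSE spec ("event:" / "data:" lines,
// events separated by a blank line).
interface SseEvent {
  event: string;
  data: string;
}

function parseSse(buffer: string): SseEvent[] {
  const events: SseEvent[] = [];
  for (const block of buffer.split(/\n\n/)) {
    let event = 'message'; // SSE default event name
    const dataLines: string[] = [];
    for (const line of block.split('\n')) {
      if (line.startsWith('event:')) event = line.slice(6).trim();
      else if (line.startsWith('data:')) dataLines.push(line.slice(5).trim());
    }
    if (dataLines.length > 0) events.push({ event, data: dataLines.join('\n') });
  }
  return events;
}

// Example: two events as they might arrive from /api/message-stream
const raw = 'event: chunk\ndata: {"text":"Hel"}\n\nevent: chunk\ndata: {"text":"lo"}\n\n';
const parsed = parseSse(raw);
console.log(parsed.length);   // 2
console.log(parsed[0].event); // "chunk"
```

In a real client you would consume the stream with `EventSource` or a streaming `fetch`; this only shows the framing.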
## 3. What's Next?
Now that your server is running and you know which API to use, you can dive into the specifics:
- Explore the **[REST API Reference](/api/rest)** - comprehensive documentation of all HTTP endpoints.
- Learn about the **[SDK Events Reference](/api/sdk/events)**.


@@ -0,0 +1,9 @@
{
"label": "Dexto SDK",
"position": 4,
"link": {
"type": "generated-index",
"title": "Dexto SDK API Reference",
"description": "Complete technical API reference for the Dexto SDK for TypeScript/JavaScript."
}
}


@@ -0,0 +1,324 @@
---
sidebar_position: 7
---
# AgentFactory API
The `AgentFactory` namespace provides static methods for agent creation, installation, and management. Use these functions to create agents from inline configs, install agents from the bundled registry, install custom agents, and manage installed agents.
```typescript
import { AgentFactory } from '@dexto/agent-management';
```
---
## createAgent
Creates a `DextoAgent` from an inline configuration object. Use this when you have a config from a database, API, or constructed programmatically and don't need a registry file.
```typescript
async function AgentFactory.createAgent(
config: AgentConfig,
options?: CreateAgentOptions
): Promise<DextoAgent>
```
| Parameter | Type | Description |
| :--- | :--- | :--- |
| `config` | `AgentConfig` | Agent configuration object |
| `options.agentId` | `string` | (Optional) Override agent ID (affects log/storage paths) |
| `options.isInteractiveCli` | `boolean` | (Optional) If true, disables console logging |
**Returns:** `Promise<DextoAgent>` - Agent instance (not started)
**Example:**
```typescript
import { AgentFactory } from '@dexto/agent-management';
// Create from inline config
const agent = await AgentFactory.createAgent({
llm: {
provider: 'openai',
model: 'gpt-4o',
apiKey: process.env.OPENAI_API_KEY
},
systemPrompt: 'You are a helpful assistant.'
});
await agent.start();
// With custom agent ID (affects log/storage paths)
const customAgent = await AgentFactory.createAgent(config, { agentId: 'my-custom-agent' });
// From database
const configFromDb = await db.getAgentConfig(userId);
const dbAgent = await AgentFactory.createAgent(configFromDb, { agentId: `user-${userId}` });
await dbAgent.start();
```
---
## listAgents
Lists all installed and available agents from the bundled registry.
```typescript
async function AgentFactory.listAgents(): Promise<{
installed: AgentInfo[];
available: AgentInfo[];
}>
```
**Returns:** Object with `installed` and `available` agent arrays
```typescript
interface AgentInfo {
id: string; // Unique identifier
name: string; // Display name
description: string; // What the agent does
author: string; // Creator
tags: string[]; // Categorization tags
type: 'builtin' | 'custom';
}
```
**Example:**
```typescript
import { AgentFactory } from '@dexto/agent-management';
const { installed, available } = await AgentFactory.listAgents();
console.log('Installed agents:');
installed.forEach(agent => {
console.log(` - ${agent.name} (${agent.id})`);
});
console.log('\nAvailable to install:');
available.forEach(agent => {
console.log(` - ${agent.name}: ${agent.description}`);
});
```
---
## installAgent
Installs an agent from the bundled registry to the local agents directory (`~/.dexto/agents/`).
```typescript
async function AgentFactory.installAgent(
agentId: string,
options?: InstallOptions
): Promise<string>
```
| Parameter | Type | Description |
| :--- | :--- | :--- |
| `agentId` | `string` | Agent ID from bundled registry |
| `options.agentsDir` | `string` | (Optional) Custom agents directory |
**Returns:** `Promise<string>` - Path to installed agent's main config file
**Throws:** `DextoRuntimeError` if agent not found or installation fails
**Example:**
```typescript
import { AgentFactory } from '@dexto/agent-management';
// Install a bundled agent
const configPath = await AgentFactory.installAgent('coding-agent');
console.log(`Installed to: ${configPath}`);
```
### What Happens During Installation
1. Agent files are copied from bundled location to `~/.dexto/agents/{agentId}/`
2. Agent is added to the user's registry (`~/.dexto/agents/registry.json`)
3. User preferences are applied at runtime for the bundled coding-agent only
---
## installCustomAgent
Installs a custom agent from a local file or directory path.
```typescript
async function AgentFactory.installCustomAgent(
agentId: string,
sourcePath: string,
metadata: {
name?: string;
description: string;
author: string;
tags: string[];
},
options?: InstallOptions
): Promise<string>
```
| Parameter | Type | Description |
| :--- | :--- | :--- |
| `agentId` | `string` | Unique ID for the custom agent |
| `sourcePath` | `string` | Absolute path to agent YAML file or directory |
| `metadata.name` | `string` | (Optional) Display name (defaults to agentId) |
| `metadata.description` | `string` | Description of what the agent does |
| `metadata.author` | `string` | Creator of the agent |
| `metadata.tags` | `string[]` | Categorization tags |
| `options.agentsDir` | `string` | (Optional) Custom agents directory |
**Returns:** `Promise<string>` - Path to installed agent's main config file
**Throws:**
- `DextoRuntimeError` if agent ID conflicts with bundled agent
- `DextoRuntimeError` if agent ID already exists
- `DextoRuntimeError` if source path doesn't exist
**Example:**
```typescript
import { AgentFactory } from '@dexto/agent-management';
// Install from a single YAML file
const configPath = await AgentFactory.installCustomAgent(
'my-support-agent',
'/path/to/support-agent.yml',
{
description: 'Custom support agent for our product',
author: 'My Team',
tags: ['support', 'custom']
}
);
// Install from a directory (for agents with multiple files)
const dirConfigPath = await AgentFactory.installCustomAgent(
'my-complex-agent',
'/path/to/agent-directory/',
{
name: 'Complex Agent',
description: 'Agent with knowledge files and multiple configs',
author: 'My Team',
tags: ['complex', 'custom']
}
);
```
### Directory Structure for Multi-File Agents
When installing from a directory:
```
my-agent/
├── agent.yml # Main config (required, or specify custom name)
├── knowledge/
│ ├── docs.md
│ └── faq.md
└── prompts/
└── system.txt
```
---
## uninstallAgent
Removes an installed agent from disk and the user registry.
```typescript
async function AgentFactory.uninstallAgent(agentId: string): Promise<void>
```
| Parameter | Type | Description |
| :--- | :--- | :--- |
| `agentId` | `string` | Agent ID to uninstall |
**Throws:** `DextoRuntimeError` if agent is not installed
**Example:**
```typescript
import { AgentFactory } from '@dexto/agent-management';
// Uninstall an agent
await AgentFactory.uninstallAgent('my-custom-agent');
console.log('Agent uninstalled');
```
### What Happens During Uninstallation
1. Agent directory is removed from `~/.dexto/agents/{agentId}/`
2. Agent entry is removed from user registry (`~/.dexto/agents/registry.json`)
:::caution
Uninstallation is permanent. All agent files including conversation history (if stored locally) will be deleted.
:::
---
## InstallOptions
Options for installation functions:
```typescript
interface InstallOptions {
/** Directory where agents are stored (default: ~/.dexto/agents) */
agentsDir?: string;
}
```
---
## Complete Example
```typescript
import { AgentFactory, AgentManager } from '@dexto/agent-management';
async function setupAgents() {
// List what's available
const { installed, available } = await AgentFactory.listAgents();
console.log(`${installed.length} installed, ${available.length} available`);
// Install a bundled agent if not already installed
if (!installed.some(a => a.id === 'coding-agent')) {
await AgentFactory.installAgent('coding-agent');
console.log('Installed coding-agent');
}
// Install a custom agent
await AgentFactory.installCustomAgent(
'team-agent',
'./my-agents/team-agent.yml',
{
description: 'Our team\'s custom agent',
author: 'Engineering Team',
tags: ['internal', 'custom']
}
);
// Now use AgentManager to work with installed agents
const manager = new AgentManager('~/.dexto/agents/registry.json');
await manager.loadRegistry();
const agent = await manager.loadAgent('team-agent');
await agent.start();
// ... use the agent ...
await agent.stop();
}
```
---
## File Locations
| Resource | Path |
| :--- | :--- |
| Agents directory | `~/.dexto/agents/` |
| User registry | `~/.dexto/agents/registry.json` |
| Per-agent configs | `~/.dexto/agents/{agentId}/` |
| Bundled registry | Bundled with `@dexto/agent-management` package |
---
## See Also
- [AgentManager API](./agent-manager.md) - Registry-based agent lifecycle management
- [Config Utilities](./config-utilities.md) - Lower-level config loading functions
- [Agent Orchestration Tutorial](/docs/tutorials/sdk/orchestration) - Step-by-step guide


@@ -0,0 +1,263 @@
---
sidebar_position: 5
---
# AgentManager API
The `AgentManager` class provides registry-based agent lifecycle management. It loads agent configurations from a registry file and creates agent instances programmatically.
```typescript
import { AgentManager } from '@dexto/agent-management';
```
:::note When to use AgentManager
**`AgentManager`** - Registry-based. Use when you have a `registry.json` with multiple predefined agents.
:::
---
## Constructor
### `constructor`
Creates a new AgentManager instance pointing to a registry file.
```typescript
constructor(registryPath: string)
```
| Parameter | Type | Description |
| :--- | :--- | :--- |
| `registryPath` | `string` | Path to registry.json file (absolute or relative) |
**Example:**
```typescript
// Project-local registry
const manager = new AgentManager('./agents/registry.json');
// Absolute path
const absManager = new AgentManager('/path/to/registry.json');
```
---
## Methods
### `loadRegistry`
Loads the registry from file. Must be called before using sync methods like `listAgents()` or `hasAgent()`.
```typescript
async loadRegistry(): Promise<Registry>
```
**Returns:** `Promise<Registry>` - The loaded registry object
**Example:**
```typescript
const manager = new AgentManager('./registry.json');
await manager.loadRegistry();
// Now sync methods work
const agents = manager.listAgents();
```
:::note
`loadAgent()` automatically calls `loadRegistry()` if not already loaded.
:::
---
### `listAgents`
Returns metadata for all agents in the registry.
```typescript
listAgents(): AgentMetadata[]
```
**Returns:** `AgentMetadata[]` - Array of agent metadata objects
```typescript
interface AgentMetadata {
id: string; // Unique identifier
name: string; // Display name
description: string; // What the agent does
author?: string; // Creator
tags?: string[]; // Categorization tags
}
```
**Example:**
```typescript
const manager = new AgentManager('./registry.json');
await manager.loadRegistry();
const agents = manager.listAgents();
console.log(agents);
// [
// { id: 'coding-agent', name: 'Coding Assistant', description: '...', tags: ['coding'] },
// { id: 'support-agent', name: 'Support Assistant', description: '...', tags: ['support'] }
// ]
// Filter by tag
const codingAgents = agents.filter(a => a.tags?.includes('coding'));
```
---
### `hasAgent`
Checks if an agent exists in the registry.
```typescript
hasAgent(id: string): boolean
```
| Parameter | Type | Description |
| :--- | :--- | :--- |
| `id` | `string` | Agent ID to check |
**Returns:** `boolean` - True if agent exists
**Example:**
```typescript
const manager = new AgentManager('./registry.json');
await manager.loadRegistry();
if (manager.hasAgent('coding-agent')) {
const agent = await manager.loadAgent('coding-agent');
}
```
---
### `loadAgent`
Loads a `DextoAgent` instance from the registry. Loads the agent's YAML config, enriches it with runtime paths, and returns an unstarted agent.
```typescript
async loadAgent(id: string): Promise<DextoAgent>
```
| Parameter | Type | Description |
| :--- | :--- | :--- |
| `id` | `string` | Agent ID from registry |
**Returns:** `Promise<DextoAgent>` - Agent instance (not started)
**Throws:**
- `DextoRuntimeError` if agent not found or config loading fails
- `DextoValidationError` if agent config validation fails
**Example:**
```typescript
const manager = new AgentManager('./registry.json');
const agent = await manager.loadAgent('coding-agent');
await agent.start();
// Use the agent
const session = await agent.createSession();
const response = await agent.generate('Write a function to reverse a string', session.id);
console.log(response.content);
await agent.stop();
```
---
## Registry Format
The registry file is a JSON file that describes available agents:
```json
{
"agents": [
{
"id": "coding-agent",
"name": "Coding Assistant",
"description": "Expert coding assistant for development tasks",
"configPath": "./coding-agent.yml",
"author": "Your Team",
"tags": ["coding", "development"]
},
{
"id": "support-agent",
"name": "Support Assistant",
"description": "Friendly customer support agent",
"configPath": "./support-agent.yml",
"tags": ["support", "customer-service"]
}
]
}
```
| Field | Type | Required | Description |
| :--- | :--- | :--- | :--- |
| `id` | `string` | Yes | Unique identifier (used in `loadAgent()`) |
| `name` | `string` | Yes | Human-readable display name |
| `description` | `string` | Yes | What this agent does |
| `configPath` | `string` | Yes | Path to YAML config (relative to registry.json) |
| `author` | `string` | No | Creator of the agent |
| `tags` | `string[]` | No | Categorization tags |
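The required fields in the table above can be checked before handing the file to `AgentManager`. This is an illustrative validation sketch: the `RegistryEntry` type and `validateRegistry` helper are ours, not SDK exports.

```typescript
// Illustrative validation of a registry.json document against the
// required fields described above.
interface RegistryEntry {
  id: string;
  name: string;
  description: string;
  configPath: string;
  author?: string;
  tags?: string[];
}

function validateRegistry(json: unknown): RegistryEntry[] {
  const doc = json as { agents?: unknown };
  if (!Array.isArray(doc.agents)) {
    throw new Error('registry must contain an "agents" array');
  }
  return doc.agents.map((raw, i) => {
    const entry = raw as Partial<RegistryEntry>;
    // id, name, description, and configPath are required (see table)
    for (const field of ['id', 'name', 'description', 'configPath'] as const) {
      if (typeof entry[field] !== 'string') {
        throw new Error(`agents[${i}]: missing required field "${field}"`);
      }
    }
    return entry as RegistryEntry;
  });
}

const registry = validateRegistry({
  agents: [{
    id: 'coding-agent',
    name: 'Coding Assistant',
    description: 'Expert coding assistant for development tasks',
    configPath: './coding-agent.yml',
  }],
});
console.log(registry[0].id); // "coding-agent"
```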
---
## Complete Example
```typescript
import { AgentManager } from '@dexto/agent-management';
async function main() {
// Initialize manager
const manager = new AgentManager('./agents/registry.json');
await manager.loadRegistry();
// List available agents
console.log('Available agents:');
for (const agent of manager.listAgents()) {
console.log(` - ${agent.name} (${agent.id}): ${agent.description}`);
}
// Create and use an agent
if (manager.hasAgent('coding-agent')) {
const agent = await manager.loadAgent('coding-agent');
await agent.start();
const session = await agent.createSession();
const response = await agent.generate('Hello!', session.id);
console.log(response.content);
await agent.stop();
}
}
main();
```
---
## Error Handling
```typescript
import { AgentManager } from '@dexto/agent-management';
try {
const manager = new AgentManager('./registry.json');
const agent = await manager.loadAgent('non-existent-agent');
} catch (error) {
if (error.code === 'AGENT_NOT_FOUND') {
console.log('Agent not found in registry');
} else if (error.name === 'DextoValidationError') {
console.log('Agent config validation failed:', error.issues);
}
}
```
---
## See Also
- [Config Utilities](./config-utilities.md) - Lower-level config loading functions
- [AgentFactory API](./agent-factory.md) - Agent installation and management
- [Agent Orchestration Tutorial](/docs/tutorials/sdk/orchestration) - Step-by-step guide


@@ -0,0 +1,209 @@
---
sidebar_position: 6
---
# Config Utilities
Utilities for loading and enriching agent configurations from YAML files. These functions are the building blocks for programmatic agent management.
```typescript
import { loadAgentConfig, enrichAgentConfig } from '@dexto/agent-management';
```
---
## loadAgentConfig
Loads and processes an agent configuration from a YAML file. Handles file reading, YAML parsing, and template variable expansion.
```typescript
async function loadAgentConfig(
configPath: string,
logger?: IDextoLogger
): Promise<AgentConfig>
```
| Parameter | Type | Description |
| :--- | :--- | :--- |
| `configPath` | `string` | Path to the YAML config file (absolute or relative) |
| `logger` | `IDextoLogger` | (Optional) Logger instance for debug output |
**Returns:** `Promise<AgentConfig>` - Parsed configuration object
**Throws:**
- `ConfigError` with `FILE_NOT_FOUND` if file doesn't exist
- `ConfigError` with `FILE_READ_ERROR` if file read fails
- `ConfigError` with `PARSE_ERROR` if YAML is invalid
### What It Does
1. **Reads the YAML file** from disk
2. **Parses YAML** into a JavaScript object
3. **Expands template variables** like `${{dexto.agent_dir}}`
4. **Expands environment variables** like `$OPENAI_API_KEY`
### Example
```typescript
import { loadAgentConfig } from '@dexto/agent-management';
// Load a config file
const config = await loadAgentConfig('./agents/my-agent.yml');
console.log(config.llm.provider); // 'openai'
console.log(config.llm.model); // 'gpt-4o'
```
### Template Variables
Config files can use template variables that are expanded at load time:
```yaml
# my-agent.yml
systemPrompt:
contributors:
- id: knowledge
type: file
files:
- ${{dexto.agent_dir}}/knowledge/docs.md
```
| Variable | Expands To |
| :--- | :--- |
| `${{dexto.agent_dir}}` | Directory containing the config file |
### Environment Variables
Environment variables are expanded during schema validation:
```yaml
llm:
provider: openai
model: gpt-4o
apiKey: $OPENAI_API_KEY # Expanded from environment
```
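Conceptually, both passes behave like simple string substitution. The sketch below is illustrative only; the real loader's behavior (escaping rules, error handling for missing variables) may differ:

```typescript
// Template variables like ${{dexto.agent_dir}} are replaced with
// the directory containing the config file.
function expandTemplates(text: string, agentDir: string): string {
  return text.replace(/\$\{\{dexto\.agent_dir\}\}/g, agentDir);
}

// Environment variables like $OPENAI_API_KEY are replaced from the
// environment; unknown names are left untouched in this sketch.
function expandEnv(text: string, env: Record<string, string>): string {
  return text.replace(/\$([A-Z_][A-Z0-9_]*)/g, (match, name) => env[name] ?? match);
}

const raw = '${{dexto.agent_dir}}/knowledge/docs.md';
console.log(expandTemplates(raw, '/home/me/agents/my-agent'));
// "/home/me/agents/my-agent/knowledge/docs.md"

console.log(expandEnv('apiKey: $OPENAI_API_KEY', { OPENAI_API_KEY: 'sk-test' }));
// "apiKey: sk-test"
```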
---
## enrichAgentConfig
Enriches a loaded configuration with per-agent runtime paths for logs, database, and blob storage. This function should be called after `loadAgentConfig` and before creating a `DextoAgent`.
```typescript
function enrichAgentConfig(
config: AgentConfig,
configPath?: string,
isInteractiveCli?: boolean
): AgentConfig
```
| Parameter | Type | Description |
| :--- | :--- | :--- |
| `config` | `AgentConfig` | Configuration from `loadAgentConfig` |
| `configPath` | `string` | (Optional) Path to config file (used for agent ID derivation) |
| `isInteractiveCli` | `boolean` | (Optional) If true, disables console logging (default: false) |
**Returns:** `AgentConfig` - Enriched configuration with explicit paths
### What It Adds
Each agent gets isolated paths based on its ID:
| Resource | Path |
| :--- | :--- |
| Logs | `~/.dexto/agents/{agentId}/logs/{agentId}.log` |
| Database | `~/.dexto/agents/{agentId}/db/{agentId}.db` |
| Blob Storage | `~/.dexto/agents/{agentId}/blobs/` |
### Agent ID Derivation
The agent ID is derived in priority order:
1. `agentCard.name` from config (sanitized)
2. Config filename (without extension)
3. Fallback: `coding-agent`
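The priority order above can be sketched as a small function. The exact sanitization rules are an assumption here, so treat this as illustrative rather than the real implementation:

```typescript
// Sketch of the documented priority order: agentCard.name (sanitized),
// then the config filename, then the 'coding-agent' fallback.
function deriveAgentId(agentCardName?: string, configPath?: string): string {
  if (agentCardName) {
    // Assumed sanitization: lowercase, non-alphanumeric runs become hyphens
    return agentCardName
      .toLowerCase()
      .replace(/[^a-z0-9]+/g, '-')
      .replace(/^-|-$/g, '');
  }
  if (configPath) {
    // Config filename without its extension
    const file = configPath.split('/').pop() ?? configPath;
    return file.replace(/\.(ya?ml)$/, '');
  }
  return 'coding-agent'; // documented fallback
}

console.log(deriveAgentId('My Cool Agent'));                         // "my-cool-agent"
console.log(deriveAgentId(undefined, './agents/support-agent.yml')); // "support-agent"
console.log(deriveAgentId());                                        // "coding-agent"
```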
### Example
```typescript
import { loadAgentConfig, enrichAgentConfig } from '@dexto/agent-management';
import { DextoAgent } from '@dexto/core';
// Load raw config
const config = await loadAgentConfig('./agents/coding-agent.yml');
// Enrich with runtime paths
const enrichedConfig = enrichAgentConfig(config, './agents/coding-agent.yml');
// Create agent with enriched config
const agent = new DextoAgent(enrichedConfig, './agents/coding-agent.yml');
await agent.start();
```
### Default Storage Configuration
If no storage is specified in the config, enrichment adds:
```typescript
{
storage: {
cache: { type: 'in-memory' },
database: { type: 'sqlite', path: '~/.dexto/agents/{agentId}/db/{agentId}.db' },
blob: { type: 'local', storePath: '~/.dexto/agents/{agentId}/blobs/' }
}
}
```
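The merge behaves like an "apply defaults if absent" step. The following is an illustrative sketch, not the actual enrichment code; `~` is shown literally, matching the paths in this document:

```typescript
// Sketch: supply the documented default storage config only when the
// loaded config does not specify one.
interface StorageConfig {
  cache: { type: string };
  database: { type: string; path: string };
  blob: { type: string; storePath: string };
}

function withDefaultStorage(
  config: { storage?: StorageConfig },
  agentId: string
): { storage: StorageConfig } {
  if (config.storage) {
    // Existing storage config wins; nothing is overwritten.
    return { ...config, storage: config.storage };
  }
  const base = `~/.dexto/agents/${agentId}`;
  return {
    ...config,
    storage: {
      cache: { type: 'in-memory' },
      database: { type: 'sqlite', path: `${base}/db/${agentId}.db` },
      blob: { type: 'local', storePath: `${base}/blobs/` },
    },
  };
}

const enriched = withDefaultStorage({}, 'my-agent');
console.log(enriched.storage.database.path);
// "~/.dexto/agents/my-agent/db/my-agent.db"
```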
---
## Complete Usage Pattern
```typescript
import { loadAgentConfig, enrichAgentConfig } from '@dexto/agent-management';
import { DextoAgent } from '@dexto/core';
async function createAgentFromConfig(configPath: string): Promise<DextoAgent> {
// 1. Load the YAML config
const config = await loadAgentConfig(configPath);
// 2. Enrich with runtime paths
const enrichedConfig = enrichAgentConfig(config, configPath);
// 3. Create and start the agent
const agent = new DextoAgent(enrichedConfig, configPath);
await agent.start();
return agent;
}
// Usage
const agent = await createAgentFromConfig('./agents/my-agent.yml');
const session = await agent.createSession();
const response = await agent.generate('Hello!', session.id);
```
---
## Error Handling
```typescript
import { loadAgentConfig, enrichAgentConfig } from '@dexto/agent-management';
try {
const config = await loadAgentConfig('./agents/my-agent.yml');
const enriched = enrichAgentConfig(config, './agents/my-agent.yml');
} catch (error) {
if (error.code === 'FILE_NOT_FOUND') {
console.error('Config file not found:', error.path);
} else if (error.code === 'PARSE_ERROR') {
console.error('Invalid YAML:', error.message);
}
}
```
---
## See Also
- [AgentManager API](./agent-manager.md) - Higher-level registry-based management
- [AgentFactory API](./agent-factory.md) - Agent installation functions
- [Loading Agent Configs Tutorial](/docs/tutorials/sdk/config-files) - Step-by-step guide


@@ -0,0 +1,596 @@
---
sidebar_position: 1
---
# DextoAgent API
Complete API reference for the main `DextoAgent` class. This is the core interface for the Dexto Agent SDK.
## Constructor and Lifecycle
### `constructor`
Creates a new Dexto agent instance with the provided configuration.
```typescript
constructor(config: AgentConfig)
```
| Parameter | Type | Description |
| :--- | :--- | :--- |
| `config` | `AgentConfig` | Agent configuration object |
### `start`
Initializes and starts the agent with all required services.
```typescript
async start(): Promise<void>
```
**Parameters:** None
**Example:**
```typescript
const agent = new DextoAgent(config);
await agent.start();
```
### `stop`
Stops the agent and cleans up all resources.
```typescript
async stop(): Promise<void>
```
**Example:**
```typescript
await agent.stop();
```
---
## Core Methods
The Dexto Agent SDK provides three methods for processing messages:
- **`generate()`** - Recommended for most use cases. Returns a complete response.
- **`stream()`** - For real-time streaming UIs. Yields events as they arrive.
- **`run()`** - Lower-level method for direct control.
### `generate`
**Recommended method** for processing user input. Waits for complete response.
```typescript
async generate(
content: ContentInput,
sessionId: string,
options?: GenerateOptions
): Promise<GenerateResponse>
```
| Parameter | Type | Description |
| :--- | :--- | :--- |
| `content` | `string \| ContentPart[]` | User message (string) or multimodal content (array) |
| `sessionId` | `string` | **Required.** Session ID for the conversation |
| `options.signal` | `AbortSignal` | (Optional) For cancellation |
**Content Types:**
```typescript
// Simple string content
type ContentInput = string | ContentPart[];
// For multimodal content, use ContentPart array:
type ContentPart = TextPart | ImagePart | FilePart;
interface TextPart { type: 'text'; text: string; }
interface ImagePart { type: 'image'; image: string; mimeType?: string; }
interface FilePart { type: 'file'; data: string; mimeType: string; filename?: string; }
```
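Since `ContentInput` accepts either a plain string or a `ContentPart[]`, a small helper can normalize both forms before further processing. This sketch just restates the types above; `toContentParts` is not an SDK export:

```typescript
// Types restated from the reference above.
type ContentInput = string | ContentPart[];
type ContentPart = TextPart | ImagePart | FilePart;
interface TextPart { type: 'text'; text: string; }
interface ImagePart { type: 'image'; image: string; mimeType?: string; }
interface FilePart { type: 'file'; data: string; mimeType: string; filename?: string; }

// Normalize: wrap a bare string as a single text part, pass arrays through.
function toContentParts(input: ContentInput): ContentPart[] {
  return typeof input === 'string' ? [{ type: 'text', text: input }] : input;
}

const parts = toContentParts('What is 2+2?');
console.log(parts); // [{ type: 'text', text: 'What is 2+2?' }]

const multimodal = toContentParts([
  { type: 'text', text: 'Describe this image' },
  { type: 'image', image: 'https://example.com/photo.jpg' },
]);
console.log(multimodal.length); // 2
```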
**Returns:** `Promise<GenerateResponse>`
```typescript
interface GenerateResponse {
content: string; // The AI's text response
reasoning?: string; // Extended thinking (o1/o3 models)
usage: TokenUsage; // Token usage statistics
toolCalls: AgentToolCall[]; // Tools that were called
sessionId: string;
messageId: string;
}
```
**Example:**
```typescript
const agent = new DextoAgent(config);
await agent.start();
const session = await agent.createSession();
// Simple text message
const response = await agent.generate('What is 2+2?', session.id);
console.log(response.content); // "4"
console.log(response.usage.totalTokens); // Token count
// With image URL (auto-detected)
const imageResponse = await agent.generate([
{ type: 'text', text: 'Describe this image' },
{ type: 'image', image: 'https://example.com/photo.jpg' }
], session.id);
// With image base64
const base64ImageResponse = await agent.generate([
{ type: 'text', text: 'Describe this image' },
{ type: 'image', image: base64Image, mimeType: 'image/png' }
], session.id);
// With file URL
const fileUrlResponse = await agent.generate([
{ type: 'text', text: 'Summarize this document' },
{ type: 'file', data: 'https://example.com/doc.pdf', mimeType: 'application/pdf' }
], session.id);
// With file base64
const fileBase64Response = await agent.generate([
{ type: 'text', text: 'Summarize this document' },
{ type: 'file', data: base64Pdf, mimeType: 'application/pdf', filename: 'doc.pdf' }
], session.id);
// With cancellation support
const controller = new AbortController();
const cancellableResponse = await agent.generate('Long task...', session.id, { signal: controller.signal });
await agent.stop();
```
### `stream`
For real-time streaming UIs. Yields events as they arrive.
```typescript
async stream(
content: ContentInput,
sessionId: string,
options?: StreamOptions
): Promise<AsyncIterableIterator<StreamingEvent>>
```
| Parameter | Type | Description |
| :--- | :--- | :--- |
| `content` | `string \| ContentPart[]` | User message (string) or multimodal content (array) |
| `sessionId` | `string` | **Required.** Session ID |
| `options.signal` | `AbortSignal` | (Optional) For cancellation |
**Returns:** `Promise<AsyncIterableIterator<StreamingEvent>>`
**Example:**
```typescript
const session = await agent.createSession();
// Simple text streaming
for await (const event of await agent.stream('Write a poem', session.id)) {
if (event.name === 'llm:chunk') {
process.stdout.write(event.content);
}
if (event.name === 'llm:tool-call') {
console.log(`\n[Using ${event.toolName}]\n`);
}
}
// Streaming with image
for await (const event of await agent.stream([
{ type: 'text', text: 'Describe this image' },
{ type: 'image', image: base64Image, mimeType: 'image/png' }
], session.id)) {
if (event.name === 'llm:chunk') {
process.stdout.write(event.content);
}
}
```
### `run`
Lower-level method for direct control. Prefer `generate()` for most use cases.
```typescript
async run(
textInput: string,
imageDataInput: { image: string; mimeType: string } | undefined,
fileDataInput: { data: string; mimeType: string; filename?: string } | undefined,
sessionId: string,
stream?: boolean
): Promise<string>
```
| Parameter | Type | Description |
| :--- | :--- | :--- |
| `textInput` | `string` | User message or query |
| `imageDataInput` | `{ image: string; mimeType: string } \| undefined` | Image data or undefined |
| `fileDataInput` | `{ data: string; mimeType: string; filename?: string } \| undefined` | File data or undefined |
| `sessionId` | `string` | **Required.** Session ID |
| `stream` | `boolean` | (Optional) Enable streaming (default: false) |
**Returns:** `Promise<string>` - AI response text
**Example:**
```typescript
const agent = new DextoAgent(config);
await agent.start();
const session = await agent.createSession();
// Recommended: Use generate() for most use cases
const response = await agent.generate(
"Explain quantum computing",
session.id
);
console.log(response.content);
// Lower-level run() method (returns just the text)
const responseText = await agent.run(
"Explain quantum computing",
undefined, // no image
undefined, // no file
session.id
);
await agent.stop();
```
### `cancel`
Cancels the currently running turn for a session.
```typescript
async cancel(sessionId: string): Promise<boolean>
```
| Parameter | Type | Description |
| :--- | :--- | :--- |
| `sessionId` | `string` | **Required.** Session ID to cancel |
**Returns:** `Promise<boolean>` - true if a run was in progress and cancelled
---
## Session Management
:::note Architectural Pattern
DextoAgent's core is **stateless** and does not track a "current" or "default" session. All session-specific operations require an explicit `sessionId` parameter. Application layers (CLI, WebUI, API servers) are responsible for managing which session is active in their own context.
:::
### `createSession`
Creates a new conversation session with optional custom ID.
```typescript
async createSession(sessionId?: string): Promise<ChatSession>
```
| Parameter | Type | Description |
| :--- | :--- | :--- |
| `sessionId` | `string` | (Optional) Custom session ID |
**Returns:** `Promise<ChatSession>`
**Example:**
```typescript
// Create a new session (auto-generated ID)
const session = await agent.createSession();
console.log(`Created session: ${session.id}`);
// Create a session with custom ID
const userSession = await agent.createSession('user-123');
// Use the session for conversations
await agent.generate("Hello!", session.id);
```
### `getSession`
Retrieves an existing session by its ID.
```typescript
async getSession(sessionId: string): Promise<ChatSession | undefined>
```
| Parameter | Type | Description |
| :--- | :--- | :--- |
| `sessionId` | `string` | Session ID to retrieve |
**Returns:** `Promise<ChatSession | undefined>`
### `listSessions`
Returns an array of all active session IDs.
```typescript
async listSessions(): Promise<string[]>
```
**Returns:** `Promise<string[]>` - Array of session IDs
### `deleteSession`
Permanently deletes a session and all its conversation history. This action cannot be undone.
```typescript
async deleteSession(sessionId: string): Promise<void>
```
| Parameter | Type | Description |
| :--- | :--- | :--- |
| `sessionId` | `string` | Session ID to delete |
**Note:** This completely removes the session and all associated conversation data from storage.
### `resetConversation`
Clears the conversation history of a session while keeping the session active.
```typescript
async resetConversation(sessionId: string): Promise<void>
```
| Parameter | Type | Description |
| :--- | :--- | :--- |
| `sessionId` | `string` | Session ID to reset |
### `getSessionMetadata`
Retrieves metadata for a session including creation time and message count.
```typescript
async getSessionMetadata(sessionId: string): Promise<SessionMetadata | undefined>
```
| Parameter | Type | Description |
| :--- | :--- | :--- |
| `sessionId` | `string` | Session ID |
**Returns:** `Promise<SessionMetadata | undefined>`
### `getSessionHistory`
Gets the complete conversation history for a session.
```typescript
async getSessionHistory(sessionId: string): Promise<ConversationHistory>
```
| Parameter | Type | Description |
| :--- | :--- | :--- |
| `sessionId` | `string` | Session ID |
**Returns:** `Promise<ConversationHistory>`
---
## Configuration
### `switchLLM`
Dynamically changes the LLM configuration for the agent or a specific session.
```typescript
async switchLLM(
llmUpdates: LLMUpdates,
sessionId?: string
): Promise<ValidatedLLMConfig>
```
| Parameter | Type | Description |
| :--- | :--- | :--- |
| `llmUpdates` | `LLMUpdates` | LLM configuration updates (model, provider, apiKey, etc.) |
| `sessionId` | `string` | (Optional) Target session ID |
**Returns:** `Promise<ValidatedLLMConfig>` - The fully validated, effective LLM configuration
**Example:**
```typescript
const config = await agent.switchLLM({
provider: 'anthropic',
model: 'claude-sonnet-4-5-20250929'
});
console.log(config.model);
```
### `getCurrentLLMConfig`
Returns the base LLM configuration from the agent's initialization config.
```typescript
getCurrentLLMConfig(): LLMConfig
```
**Returns:** `LLMConfig` - The base LLM configuration (does not include session-specific overrides)
### `getEffectiveConfig`
Gets the complete effective configuration for a session or the default configuration.
```typescript
getEffectiveConfig(sessionId?: string): Readonly<AgentConfig>
```
| Parameter | Type | Description |
| :--- | :--- | :--- |
| `sessionId` | `string` | (Optional) Session ID |
**Returns:** `Readonly<AgentConfig>`
---
## MCP Server Management
### `addMcpServer`
Adds and connects to a new MCP server, making its tools available to the agent.
```typescript
async addMcpServer(name: string, config: McpServerConfig): Promise<void>
```
| Parameter | Type | Description |
| :--- | :--- | :--- |
| `name` | `string` | Server name |
| `config` | `McpServerConfig` | Server configuration |
### `removeMcpServer`
Disconnects from an MCP server and removes it completely from the agent.
```typescript
async removeMcpServer(name: string): Promise<void>
```
| Parameter | Type | Description |
| :--- | :--- | :--- |
| `name` | `string` | Server name to remove |
### `enableMcpServer`
Enables a disabled MCP server and connects it.
```typescript
async enableMcpServer(name: string): Promise<void>
```
| Parameter | Type | Description |
| :--- | :--- | :--- |
| `name` | `string` | Server name to enable |
### `disableMcpServer`
Disables an MCP server and disconnects it. The server remains registered but inactive.
```typescript
async disableMcpServer(name: string): Promise<void>
```
| Parameter | Type | Description |
| :--- | :--- | :--- |
| `name` | `string` | Server name to disable |
### `restartMcpServer`
Restarts an MCP server by disconnecting and reconnecting with its original configuration.
```typescript
async restartMcpServer(name: string): Promise<void>
```
| Parameter | Type | Description |
| :--- | :--- | :--- |
| `name` | `string` | Server name to restart |
### `executeTool`
Executes a tool from any source (MCP servers, custom tools, or internal tools). This is the unified interface for tool execution.
```typescript
async executeTool(toolName: string, args: any): Promise<any>
```
| Parameter | Type | Description |
| :--- | :--- | :--- |
| `toolName` | `string` | Tool name |
| `args` | `any` | Tool arguments |
**Returns:** `Promise<any>` - Tool execution result
### `getAllMcpTools`
Returns a map of all available tools from all connected MCP servers.
```typescript
async getAllMcpTools(): Promise<Record<string, ToolDefinition>>
```
**Returns:** `Promise<Record<string, ToolDefinition>>`
### `getAllTools`
Returns a map of all available tools from all sources (MCP servers, custom tools, and internal tools). This is the unified interface for tool discovery.
```typescript
async getAllTools(): Promise<Record<string, ToolDefinition>>
```
**Returns:** `Promise<Record<string, ToolDefinition>>`
### `getMcpClients`
Returns a map of all connected MCP client instances.
```typescript
getMcpClients(): Map<string, IMCPClient>
```
**Returns:** `Map<string, IMCPClient>`
### `getMcpFailedConnections`
Returns a record of failed MCP server connections and their error messages.
```typescript
getMcpFailedConnections(): Record<string, string>
```
**Returns:** `Record<string, string>` - Failed connection names to error messages
---
## Model & Provider Introspection
### `getSupportedProviders`
Returns the list of supported LLM providers.
```typescript
getSupportedProviders(): LLMProvider[]
```
### `getSupportedModels`
Returns supported models grouped by provider, including a flag for the default model per provider.
```typescript
getSupportedModels(): Record<LLMProvider, Array<ModelInfo & { isDefault: boolean }>>
```
### `getSupportedModelsForProvider`
Returns supported models for a specific provider.
```typescript
getSupportedModelsForProvider(provider: LLMProvider): Array<ModelInfo & { isDefault: boolean }>
```
### `inferProviderFromModel`
Infers the provider from a model name or returns `null` if unknown.
```typescript
inferProviderFromModel(modelName: string): LLMProvider | null
```
---
## Search
### `searchMessages`
Search for messages across all sessions or within a specific session.
```typescript
async searchMessages(query: string, options?: SearchOptions): Promise<SearchResponse>
```
### `searchSessions`
Search for sessions that contain the specified query.
```typescript
async searchSessions(query: string): Promise<SessionSearchResponse>
```

---
sidebar_position: 3
---
# Events Reference
Complete event system documentation for monitoring and integrating with Dexto agents.
## Overview
The Dexto SDK provides a comprehensive event system through two main event buses:
- **AgentEventBus**: Agent-level events that occur across the entire agent instance
- **SessionEventBus**: Session-specific events that occur within individual conversation sessions
### Event Naming Convention
All events follow the `namespace:kebab-case` format:
- **LLM events**: `llm:thinking`, `llm:chunk`, `llm:response`, `llm:tool-call`
- **Session events**: `session:created`, `session:reset`, `session:title-updated`
- **MCP events**: `mcp:server-connected`, `mcp:resource-updated`
- **Approval events**: `approval:request`, `approval:response`
- **State events**: `state:changed`, `state:exported`
- **Tool events**: `tools:available-updated`
### Event Visibility Tiers
Events are organized into three tiers based on their intended audience:
#### **Tier 1: Streaming Events** (`STREAMING_EVENTS`)
Exposed via `DextoAgent.stream()` for real-time chat UIs. These are the most commonly used events for building interactive applications.
**LLM Events:** `llm:thinking`, `llm:chunk`, `llm:response`, `llm:tool-call`, `llm:tool-result`, `llm:error`, `llm:unsupported-input`
**Tool Events:** `tool:running`
**Context Events:** `context:compressed`, `context:pruned`
**Message Queue Events:** `message:queued`, `message:dequeued`
**Run Lifecycle Events:** `run:complete`
**Session Events:** `session:title-updated`
**Approval Events:** `approval:request`, `approval:response`
**Use cases:**
- Real-time chat interfaces
- Progress indicators
- Streaming responses
- Tool execution tracking
- User approval flows
#### **Tier 2: Integration Events** (`INTEGRATION_EVENTS`)
Exposed via webhooks, A2A subscriptions, and monitoring systems. Includes all streaming events plus lifecycle and state management events.
**Additional events:** `session:created`, `session:reset`, `mcp:server-connected`, `mcp:server-restarted`, `mcp:tools-list-changed`, `mcp:prompts-list-changed`, `tools:available-updated`, `llm:switched`, `state:changed`
**Use cases:**
- External system integrations
- Monitoring and observability
- Analytics and logging
- Multi-agent coordination (A2A)
#### **Tier 3: Internal Events**
Only available via direct `AgentEventBus` access for advanced use cases. These are implementation details that may change between versions.
**Examples:** `resource:cache-invalidated`, `state:exported`, `state:reset`, `mcp:server-added`, `mcp:server-removed`, `session:override-set`
---
## Agent-Level Events
These events are emitted by the `AgentEventBus` and provide insight into agent-wide operations.
### Session Events
#### `session:reset`
Fired when a conversation history is reset for a session.
```typescript
{
sessionId: string;
}
```
#### `session:created`
Fired when a new session is created and should become active.
```typescript
{
sessionId: string;
switchTo: boolean; // Whether UI should switch to this session
}
```
#### `session:title-updated`
Fired when a session's human-friendly title is updated.
```typescript
{
sessionId: string;
title: string;
}
```
#### `session:override-set`
Fired when session-specific configuration is set.
```typescript
{
sessionId: string;
override: SessionOverride;
}
```
#### `session:override-cleared`
Fired when session-specific configuration is cleared.
```typescript
{
sessionId: string;
}
```
### MCP Server Events
#### `mcp:server-connected`
Fired when an MCP server connection attempt completes (success or failure).
```typescript
{
name: string;
success: boolean;
error?: string;
}
```
#### `mcp:server-added`
Fired when an MCP server is added to the runtime state.
```typescript
{
serverName: string;
config: McpServerConfig;
}
```
#### `mcp:server-removed`
Fired when an MCP server is removed from the runtime state.
```typescript
{
serverName: string;
}
```
#### `mcp:server-updated`
Fired when an MCP server configuration is updated.
```typescript
{
serverName: string;
config: McpServerConfig;
}
```
#### `mcp:server-restarted`
Fired when an MCP server is restarted.
```typescript
{
serverName: string;
}
```
#### `mcp:resource-updated`
Fired when an MCP server resource is updated.
```typescript
{
serverName: string;
resourceUri: string;
}
```
#### `mcp:prompts-list-changed`
Fired when available prompts from MCP servers change.
```typescript
{
serverName: string;
prompts: string[];
}
```
#### `mcp:tools-list-changed`
Fired when available tools from MCP servers change.
```typescript
{
serverName: string;
tools: string[];
}
```
#### `resource:cache-invalidated`
Fired when resource cache is invalidated.
```typescript
{
resourceUri?: string;
serverName: string;
action: 'updated' | 'server_connected' | 'server_removed' | 'blob_stored';
}
```
#### `tools:available-updated`
Fired when the available tools list is updated.
```typescript
{
tools: string[];
source: 'mcp' | 'builtin';
}
```
### Configuration Events
#### `llm:switched`
Fired when the LLM configuration is changed.
```typescript
{
newConfig: LLMConfig;
historyRetained?: boolean;
sessionIds: string[]; // Array of affected session IDs
}
```
#### `state:changed`
Fired when agent runtime state changes.
```typescript
{
field: string; // keyof AgentRuntimeState
oldValue: any;
newValue: any;
sessionId?: string;
}
```
#### `state:exported`
Fired when agent state is exported as configuration.
```typescript
{
config: AgentConfig;
}
```
#### `state:reset`
Fired when agent state is reset to baseline.
```typescript
{
toConfig: AgentConfig;
}
```
### User Approval Events
Dexto's generalized approval system handles various types of user input requests, including tool confirmations and form-based input (elicitation). These events are included in `STREAMING_EVENTS` and are available via `DextoAgent.stream()`.
:::tip Custom Approval Handlers
For direct `DextoAgent` usage without SSE streaming, you can implement a custom approval handler via `agent.setApprovalHandler()` to intercept approval requests programmatically.
:::
#### `approval:request`
Fired when user approval or input is requested. This event supports multiple approval types through a discriminated union based on the `type` field.
```typescript
{
approvalId: string; // Unique identifier for this approval request
type: string; // 'tool_confirmation' | 'command_confirmation' | 'elicitation'
sessionId?: string; // Optional session scope
timeout?: number; // Request timeout in milliseconds
timestamp: Date; // When the request was created
metadata: Record<string, any>; // Type-specific approval data
}
```
**Approval Types:**
- **`tool_confirmation`**: Binary approval for tool execution
- `metadata.toolName`: Name of the tool requiring confirmation
- `metadata.args`: Tool arguments
- `metadata.description`: Optional tool description
- **`command_confirmation`**: Binary approval for command execution (e.g., bash commands)
- `metadata.command`: Command requiring confirmation
- `metadata.args`: Command arguments
- **`elicitation`**: Schema-based form input (typically from MCP servers or ask_user tool)
- `metadata.schema`: JSON Schema defining expected input structure
- `metadata.prompt`: Prompt text to display to user
- `metadata.serverName`: Name of requesting entity (MCP server or 'Dexto Agent')
- `metadata.context`: Optional additional context
#### `approval:response`
Fired when a user approval response is received from the UI layer.
```typescript
{
approvalId: string; // Must match the request approvalId
status: 'approved' | 'denied' | 'cancelled'; // Approval status
reason?: DenialReason; // Reason for denial/cancellation
message?: string; // Optional user message
sessionId?: string; // Session identifier (if scoped)
data?: Record<string, any>; // Type-specific response data
}
```
**Response Data by Type:**
- **Tool confirmation**: `{ rememberChoice?: boolean }`
- **Command confirmation**: `{ rememberChoice?: boolean }`
- **Elicitation**: `{ formData: Record<string, unknown> }`
**Usage Notes:**
- Agent-initiated forms use `ask_user` tool → triggers elicitation request
- MCP server input requests trigger elicitation automatically
- Tool confirmations can be remembered per session via `rememberChoice`
- Approval requests timeout based on configuration (default: 2 minutes)
- Cancelled status indicates timeout or explicit cancellation
---
## Session-Level Events
These events are emitted by the `SessionEventBus` and provide insight into LLM service operations within sessions. They are automatically forwarded to the `AgentEventBus` with a `sessionId` property.
### LLM Processing Events
#### `llm:thinking`
Fired when the LLM service starts processing a request.
```typescript
{
sessionId: string;
}
```
#### `llm:response`
Fired when the LLM service completes a response.
```typescript
{
content: string;
reasoning?: string; // Extended thinking output for reasoning models
provider?: string;
model?: string;
tokenUsage?: {
inputTokens?: number;
outputTokens?: number;
reasoningTokens?: number; // Additional tokens used for reasoning
totalTokens?: number;
};
sessionId: string;
}
```
**Note:** The `reasoning` field contains extended thinking output for models that support reasoning (e.g., o1, o3-mini). This is separate from the main `content` response.
#### `llm:chunk`
Fired when a streaming response chunk is received.
```typescript
{
chunkType: 'text' | 'reasoning'; // Indicates whether chunk is reasoning or main response
content: string;
isComplete?: boolean;
sessionId: string;
}
```
**Note:** The `chunkType` field distinguishes between reasoning output (`reasoning`) and the main response text (`text`). For reasoning models, you'll receive reasoning chunks followed by text chunks.
#### `llm:error`
Fired when the LLM service encounters an error.
```typescript
{
error: Error;
context?: string;
recoverable?: boolean;
sessionId: string;
}
```
#### `llm:switched`
Fired when session LLM configuration is changed.
```typescript
{
newConfig: LLMConfig;
historyRetained?: boolean;
sessionIds: string[]; // Array of affected session IDs
}
```
#### `llm:unsupported-input`
Fired when the LLM service receives unsupported input.
```typescript
{
errors: string[];
provider: LLMProvider;
model?: string;
fileType?: string;
details?: any;
sessionId: string;
}
```
### Tool Execution Events
#### `llm:tool-call`
Fired when the LLM service requests a tool execution.
```typescript
{
toolName: string;
args: Record<string, any>;
callId?: string;
sessionId: string;
}
```
#### `tool:running`
Fired when a tool actually starts executing (after approval if required). This allows UIs to distinguish between tools pending approval and tools actively running.
```typescript
{
toolName: string;
toolCallId: string;
sessionId: string;
}
```
#### `llm:tool-result`
Fired when a tool execution completes.
```typescript
{
toolName: string;
sanitized: SanitizedToolResult;
rawResult?: unknown; // only present when DEXTO_DEBUG_TOOL_RESULT_RAW=true
callId?: string;
success: boolean;
sessionId: string;
}
```
### Context Management Events
#### `context:compressed`
Fired when conversation context is compressed to stay within token limits.
```typescript
{
originalTokens: number; // Actual input tokens that triggered compression
compressedTokens: number; // Estimated tokens after compression
originalMessages: number;
compressedMessages: number;
strategy: string;
reason: 'overflow' | 'token_limit' | 'message_limit';
sessionId: string;
}
```
#### `context:pruned`
Fired when old messages are pruned from context.
```typescript
{
prunedCount: number;
savedTokens: number;
sessionId: string;
}
```
### Message Queue Events
These events track the message queue system, which allows users to queue additional messages while the agent is processing.
#### `message:queued`
Fired when a user message is queued during agent execution.
```typescript
{
position: number; // Position in the queue
id: string; // Unique message ID
sessionId: string;
}
```
#### `message:dequeued`
Fired when queued messages are dequeued and injected into context.
```typescript
{
count: number; // Number of messages dequeued
ids: string[]; // IDs of dequeued messages
coalesced: boolean; // Whether messages were combined
content: ContentPart[]; // Combined content for UI display
sessionId: string;
}
```
### Run Lifecycle Events
#### `run:complete`
Fired when an agent run completes, providing summary information about the execution.
```typescript
{
finishReason: LLMFinishReason; // How the run ended
stepCount: number; // Number of steps executed
durationMs: number; // Wall-clock duration in milliseconds
error?: Error; // Error if finishReason === 'error'
sessionId: string;
}
```
**Finish Reasons:**
- `stop` - Normal completion
- `tool-calls` - Stopped to execute tool calls (more steps coming)
- `length` - Hit token/length limit
- `content-filter` - Content filter violation
- `error` - Error occurred
- `cancelled` - User cancelled
- `max-steps` - Hit max steps limit
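For UI status lines, these finish reasons can be mapped to short messages. A small helper sketch (the type is redeclared locally for illustration, and the wording is our own):

```typescript
type LLMFinishReason =
  | 'stop' | 'tool-calls' | 'length' | 'content-filter'
  | 'error' | 'cancelled' | 'max-steps';

function describeFinishReason(reason: LLMFinishReason): string {
  switch (reason) {
    case 'stop': return 'Completed';
    case 'tool-calls': return 'Running tools...';
    case 'length': return 'Stopped: length limit reached';
    case 'content-filter': return 'Stopped: content filter';
    case 'error': return 'Failed with an error';
    case 'cancelled': return 'Cancelled';
    case 'max-steps': return 'Stopped: max steps reached';
  }
}

console.log(describeFinishReason('cancelled')); // → "Cancelled"
```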
---
## Usage Examples
### Listening to Streaming Events
```typescript
import { DextoAgent } from '@dexto/core';
const agent = new DextoAgent(config);
await agent.start();
// Use the stream() API to get streaming events
for await (const event of await agent.stream('Hello!', 'session-1')) {
switch (event.name) {
case 'llm:thinking':
console.log('Agent is thinking...');
break;
case 'llm:chunk':
process.stdout.write(event.content);
break;
case 'llm:response':
console.log('\nFull response:', event.content);
console.log('Tokens used:', event.tokenUsage);
break;
case 'llm:tool-call':
console.log(`Calling tool: ${event.toolName}`);
break;
case 'tool:running':
console.log(`Tool ${event.toolName} is now running`);
break;
case 'run:complete':
console.log(`Run completed: ${event.finishReason} (${event.stepCount} steps, ${event.durationMs}ms)`);
break;
case 'approval:request':
console.log(`Approval needed: ${event.type}`);
// Handle approval UI...
break;
}
}
```
### Listening to Integration Events
```typescript
import { DextoAgent, INTEGRATION_EVENTS } from '@dexto/core';
const agent = new DextoAgent(config);
await agent.start();
// Listen to all integration events via the event bus
INTEGRATION_EVENTS.forEach((eventName) => {
agent.agentEventBus.on(eventName, (payload) => {
console.log(`[${eventName}]`, payload);
// Send to your monitoring/analytics system
sendToMonitoring(eventName, payload);
});
});
```
### Listening to Internal Events
```typescript
import { DextoAgent } from '@dexto/core';
const agent = new DextoAgent(config);
await agent.start();
// Listen to internal events for advanced debugging
agent.agentEventBus.on('resource:cache-invalidated', (payload) => {
console.log('Cache invalidated:', payload);
});
agent.agentEventBus.on('state:exported', (payload) => {
console.log('State exported:', payload.config);
});
```

---
sidebar_position: 2
title: "MCPManager"
---
# MCPManager
The `MCPManager` is a powerful, standalone utility for managing [Model Context Protocol (MCP)](/docs/mcp/overview) servers. It allows you to connect, manage, and interact with multiple MCP servers in your own applications without needing the full Dexto agent framework.
This class provides a unified interface for accessing tools, resources, and prompts from all connected servers, making it an essential component for building complex, multi-server workflows.
## Constructor
```typescript
constructor(confirmationProvider?: ToolConfirmationProvider)
```
Creates a new `MCPManager` instance for managing MCP server connections.
**Parameters:**
- `confirmationProvider` (optional): A custom tool confirmation provider. If not provided, a default CLI-based confirmation is used.
**Example:**
```typescript
import { MCPManager } from '@dexto/core';
// Basic manager
const manager = new MCPManager();
// With a custom confirmation provider
const customProvider = new CustomConfirmationProvider();
const managerWithProvider = new MCPManager(customProvider);
```
## Connection Management Methods
#### `connectServer`
Connects to a new MCP server.
```typescript
async connectServer(name: string, config: McpServerConfig): Promise<void>
```
**Parameters:**
- `name`: Unique identifier for the server connection
- `config`: Server configuration object
**Server Configuration Types:**
```typescript
// stdio server (most common)
{
type: 'stdio',
command: 'npx',
args: ['-y', '@modelcontextprotocol/server-filesystem', '.'],
env?: { [key: string]: string }
}
// HTTP server (recommended for remote)
{
type: 'http',
url: 'http://localhost:3001/mcp',
headers?: { [key: string]: string },
timeout?: number,
connectionMode?: 'strict' | 'lenient'
}
// SSE (Server-Sent Events) server - DEPRECATED, use http instead
{
type: 'sse',
url: 'http://localhost:3001/sse',
headers?: { [key: string]: string },
timeout?: number,
connectionMode?: 'strict' | 'lenient'
}
```
**Examples:**
```typescript
// File system server
await manager.connectServer('filesystem', {
type: 'stdio',
command: 'npx',
args: ['-y', '@modelcontextprotocol/server-filesystem', '.']
});
// Web search server with API key
await manager.connectServer('tavily-search', {
type: 'stdio',
command: 'npx',
args: ['-y', 'tavily-mcp@0.1.2'],
env: {
TAVILY_API_KEY: process.env.TAVILY_API_KEY
}
});
// HTTP MCP server
await manager.connectServer('remote-agent', {
type: 'http',
  url: 'http://localhost:3001/mcp',
timeout: 30000
});
```
#### `initializeFromConfig`
Initialize multiple servers from configuration.
```typescript
async initializeFromConfig(
serverConfigs: ServerConfigs,
connectionMode: 'strict' | 'lenient' = 'lenient'
): Promise<void>
```
**Parameters:**
- `serverConfigs`: Object mapping server names to configurations
- `connectionMode`:
- `'strict'`: All servers must connect successfully
- `'lenient'`: At least one server must connect successfully
**Example:**
```typescript
const serverConfigs = {
filesystem: {
type: 'stdio',
command: 'npx',
args: ['-y', '@modelcontextprotocol/server-filesystem', '.']
},
search: {
type: 'stdio',
command: 'npx',
args: ['-y', 'tavily-mcp@0.1.2'],
env: { TAVILY_API_KEY: process.env.TAVILY_API_KEY }
}
};
await manager.initializeFromConfig(serverConfigs, 'lenient');
```
#### `removeClient`
Disconnects and removes a specific MCP server.
```typescript
async removeClient(name: string): Promise<void>
```
**Example:**
```typescript
await manager.removeClient('filesystem');
```
#### `disconnectAll`
Disconnect all servers and clear caches.
```typescript
async disconnectAll(): Promise<void>
```
**Example:**
```typescript
await manager.disconnectAll();
```
#### `restartServer`
Restart a specific MCP server by disconnecting and reconnecting with its original configuration.
```typescript
async restartServer(name: string): Promise<void>
```
**Parameters:**
- `name`: The name of the server to restart
**Example:**
```typescript
// Restart a server after it becomes unresponsive
await manager.restartServer('filesystem');
```
#### `refresh`
Refresh all tool, resource, and prompt caches from connected servers.
```typescript
async refresh(): Promise<void>
```
**Example:**
```typescript
// Force refresh all caches after external changes
await manager.refresh();
```
## Tool Management Methods
#### `getAllTools`
Gets all available tools from connected servers.
```typescript
async getAllTools(): Promise<ToolSet>
```
**Returns:** Object mapping tool names to tool definitions
**Example:**
```typescript
const tools = await manager.getAllTools();
console.log('Available tools:', Object.keys(tools));
// Inspect a specific tool
const readFileTool = tools.readFile;
console.log('Tool schema:', readFileTool.inputSchema);
```
#### `getToolClient`
Get the client that provides a specific tool.
```typescript
getToolClient(toolName: string): IMCPClient | undefined
```
#### `executeTool`
Executes a specific tool with arguments.
```typescript
async executeTool(toolName: string, args: any): Promise<any>
```
**Example:**
```typescript
// Read a file
const content = await manager.executeTool('readFile', {
path: './package.json'
});
// Search the web
const searchResults = await manager.executeTool('search', {
query: 'latest AI developments',
max_results: 5
});
// Write a file
await manager.executeTool('writeFile', {
path: './output.txt',
content: 'Hello from MCP!'
});
```
## Resource Management Methods
#### `listAllResources`
Gets all cached MCP resources from connected servers.
```typescript
async listAllResources(): Promise<MCPResolvedResource[]>
```
**Returns:** Array of resolved resources with metadata:
```typescript
interface MCPResolvedResource {
key: string; // Qualified resource key
serverName: string; // Server that provides this resource
summary: MCPResourceSummary;
}
```
#### `getResource`
Get cached resource metadata by qualified key.
```typescript
getResource(resourceKey: string): MCPResolvedResource | undefined
```
#### `readResource`
Reads a specific resource by URI.
```typescript
async readResource(uri: string): Promise<ReadResourceResult>
```
**Example:**
```typescript
const resource = await manager.readResource('file:///project/README.md');
console.log('Resource content:', resource.contents);
```
## Prompt Management Methods
#### `listAllPrompts`
Gets all available prompt names from connected servers.
```typescript
async listAllPrompts(): Promise<string[]>
```
#### `getPromptClient`
Get the client that provides a specific prompt.
```typescript
getPromptClient(promptName: string): IMCPClient | undefined
```
#### `getPrompt`
Gets a specific prompt by name.
```typescript
async getPrompt(name: string, args?: any): Promise<GetPromptResult>
```
**Example:**
```typescript
const prompt = await manager.getPrompt('code-review', {
language: 'typescript',
file: 'src/index.ts'
});
console.log('Prompt:', prompt.messages);
```
#### `getPromptMetadata`
Get cached metadata for a specific prompt (no network calls).
```typescript
getPromptMetadata(promptName: string): PromptDefinition | undefined
```
#### `getAllPromptMetadata`
Get all cached prompt metadata (no network calls).
```typescript
getAllPromptMetadata(): Array<{
promptName: string;
serverName: string;
definition: PromptDefinition;
}>
```
## Status and Monitoring Methods
#### `getClients`
Returns all registered MCP client instances.
```typescript
getClients(): Map<string, IMCPClient>
```
**Example:**
```typescript
const clients = manager.getClients();
console.log('Connected servers:', Array.from(clients.keys()));
for (const [name, client] of clients) {
console.log(`Server: ${name}, Tools available: ${Object.keys(await client.getTools()).length}`);
}
```
#### `getFailedConnections`
Returns failed connection error messages.
```typescript
getFailedConnections(): Record<string, string>
```
**Example:**
```typescript
const errors = manager.getFailedConnections();
if (Object.keys(errors).length > 0) {
console.log('Failed connections:', errors);
}
```
### Complete Example
```typescript
import { MCPManager } from '@dexto/core';
const manager = new MCPManager();
// Connect to servers
await manager.connectServer('filesystem', {
type: 'stdio',
command: 'npx',
args: ['-y', '@modelcontextprotocol/server-filesystem', '.']
});
// Execute tools directly
const result = await manager.executeTool('readFile', { path: './README.md' });
console.log('Read file result:', result);
// Get all available tools
const tools = await manager.getAllTools();
console.log('Available tools:', Object.keys(tools));
// Clean up
await manager.disconnectAll();
```

---
sidebar_position: 4
---
# SDK Types for TypeScript
Type definitions and interfaces for the Dexto Agent SDK for TypeScript.
## Core Imports
```typescript
import {
// Main classes
DextoAgent,
// Standalone utilities
MCPManager,
Logger,
AgentEventBus,
SessionEventBus,
createStorageBackends,
createAgentServices,
// Configuration types
AgentConfig,
LLMConfig,
McpServerConfig,
StorageConfig,
// Session types
ChatSession,
SessionMetadata,
ConversationHistory,
// Result types
ValidatedLLMConfig,
// Event types
AgentEventMap,
SessionEventMap,
// Storage types
StorageBackends,
CacheBackend,
DatabaseBackend,
// Service types
AgentServices,
} from '@dexto/core';
```
---
## Configuration Types
### `AgentConfig`
Main configuration object for creating Dexto agents.
```typescript
interface AgentConfig {
llm: LLMConfig;
mcpServers?: Record<string, McpServerConfig>;
storage?: StorageConfig;
sessions?: SessionConfig;
systemPrompt?: string;
}
```
### `LLMConfig`
Configuration for Large Language Model providers.
```typescript
interface LLMConfig {
provider: 'openai' | 'anthropic' | 'google' | 'groq' | 'xai' | 'cohere' | 'openai-compatible';
model: string;
apiKey?: string;
baseURL?: string;
temperature?: number;
maxOutputTokens?: number;
maxInputTokens?: number;
maxIterations?: number;
systemPrompt?: string;
}
```
### `McpServerConfig`
Configuration for Model Context Protocol servers.
```typescript
interface McpServerConfig {
type: 'stdio' | 'sse' | 'http';
command?: string; // Required for stdio
args?: string[]; // For stdio
env?: Record<string, string>; // For stdio
url?: string; // Required for sse/http
headers?: Record<string, string>; // For sse/http
timeout?: number; // Default: 30000
connectionMode?: 'strict' | 'lenient'; // Default: 'lenient'
}
```
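As a sketch, here is one configuration of each transport; the command, URL, and token are placeholders, not real endpoints:

```typescript
// Illustrative McpServerConfig values only; command/url/token are placeholders.
const stdioServer = {
  type: 'stdio' as const,
  command: 'npx',
  args: ['-y', '@modelcontextprotocol/server-filesystem', '.'],
  timeout: 30000, // default
};

const httpServer = {
  type: 'http' as const,
  url: 'https://example.com/mcp', // placeholder endpoint
  headers: { Authorization: 'Bearer <token>' }, // placeholder credential
  connectionMode: 'strict' as const,
};

console.log(stdioServer.type, httpServer.type); // stdio http
```

Note that `command`/`args`/`env` apply only to `stdio` servers, while `url`/`headers` apply only to `sse`/`http` servers.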
### `StorageConfig`
Configuration for storage backends.
```typescript
interface StorageConfig {
cache: CacheBackendConfig;
database: DatabaseBackendConfig;
}
interface CacheBackendConfig {
type: 'in-memory' | 'redis';
url?: string;
options?: Record<string, any>;
}
interface DatabaseBackendConfig {
type: 'in-memory' | 'sqlite' | 'postgresql';
url?: string;
options?: Record<string, any>;
}
```
---
## Session Types
### `ChatSession`
Represents an individual conversation session.
```typescript
interface ChatSession {
id: string;
createdAt: Date;
lastActivity: Date;
// Session methods
run(userInput: string, imageData?: ImageData): Promise<string>;
getHistory(): Promise<ConversationHistory>;
reset(): Promise<void>;
getLLMService(): VercelLLMService;
}
```
### `SessionMetadata`
Metadata information about a session.
```typescript
interface SessionMetadata {
id: string;
createdAt: Date;
lastActivity: Date;
messageCount: number;
tokenCount?: number;
}
```
### `ConversationHistory`
Complete conversation history for a session.
```typescript
interface ConversationHistory {
sessionId: string;
messages: ConversationMessage[];
totalTokens?: number;
}
interface ConversationMessage {
role: 'user' | 'assistant' | 'system' | 'tool';
content: string;
timestamp: Date;
tokenCount?: number;
toolCall?: ToolCall;
toolResult?: ToolResult;
}
```
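For example, a caller can derive a session's total token usage by walking the messages. This sketch uses a trimmed local copy of `ConversationMessage` so it stands alone:

```typescript
// Trimmed local copy of ConversationMessage, for a self-contained sketch.
interface Msg {
  role: string;
  content: string;
  tokenCount?: number;
}

// Sum per-message token counts, treating missing counts as zero.
function countTokens(messages: Msg[]): number {
  return messages.reduce((sum, m) => sum + (m.tokenCount ?? 0), 0);
}

const history: Msg[] = [
  { role: 'user', content: 'Hi', tokenCount: 2 },
  { role: 'assistant', content: 'Hello!', tokenCount: 3 },
  { role: 'system', content: 'You are helpful.' }, // no count recorded
];

console.log(countTokens(history)); // 5
```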
---
## Result Types
### `ValidatedLLMConfig`
Validated LLM configuration returned by `switchLLM`.
```typescript
type ValidatedLLMConfig = LLMConfig & {
maxInputTokens?: number;
};
```
---
## Event Types
:::info Event Naming Convention
All events use the `namespace:kebab-case` format. For detailed event documentation and usage examples, see the [Events Reference](./events.md).
:::
### `AgentEventMap`
Type map for agent-level events. All event names follow the `namespace:kebab-case` convention.
```typescript
interface AgentEventMap {
// Session events
'session:reset': {
sessionId: string;
};
'session:created': {
sessionId: string;
switchTo: boolean; // Whether UI should switch to this session
};
'session:title-updated': {
sessionId: string;
title: string;
};
'session:override-set': {
sessionId: string;
override: SessionOverride;
};
'session:override-cleared': {
sessionId: string;
};
// MCP server events
'mcp:server-connected': {
name: string;
success: boolean;
error?: string;
};
'mcp:server-added': {
serverName: string;
config: McpServerConfig;
};
'mcp:server-removed': {
serverName: string;
};
'mcp:server-updated': {
serverName: string;
config: McpServerConfig;
};
'mcp:server-restarted': {
serverName: string;
};
'mcp:resource-updated': {
serverName: string;
resourceUri: string;
};
'mcp:prompts-list-changed': {
serverName: string;
prompts: string[];
};
'mcp:tools-list-changed': {
serverName: string;
tools: string[];
};
'resource:cache-invalidated': {
resourceUri?: string;
serverName: string;
action: 'updated' | 'server_connected' | 'server_removed' | 'blob_stored';
};
'tools:available-updated': {
tools: string[];
source: 'mcp' | 'builtin';
};
// Configuration events
'llm:switched': {
newConfig: ValidatedLLMConfig;
historyRetained?: boolean;
sessionIds: string[]; // Array of affected session IDs
};
'state:changed': {
field: string;
oldValue: any;
newValue: any;
sessionId?: string;
};
'state:exported': {
config: AgentConfig;
};
'state:reset': {
toConfig: AgentConfig;
};
// Approval events
'approval:request': {
approvalId: string;
approvalType: 'tool_confirmation' | 'elicitation' | 'custom';
sessionId?: string;
timeout?: number;
timestamp: Date;
metadata: Record<string, any>;
};
'approval:response': {
approvalId: string;
status: 'approved' | 'denied' | 'cancelled';
reason?: DenialReason;
message?: string;
sessionId?: string;
data?: Record<string, any>;
};
// LLM service events (forwarded from sessions with sessionId)
'llm:thinking': {
sessionId: string;
};
'llm:response': {
content: string;
reasoning?: string;
provider?: string;
model?: string;
tokenUsage?: {
inputTokens?: number;
outputTokens?: number;
reasoningTokens?: number;
totalTokens?: number;
};
sessionId: string;
};
'llm:chunk': {
chunkType: 'text' | 'reasoning'; // Note: renamed from 'type' to avoid conflicts
content: string;
isComplete?: boolean;
sessionId: string;
};
'llm:tool-call': {
toolName: string;
args: Record<string, any>;
callId?: string;
sessionId: string;
};
'llm:tool-result': {
toolName: string;
sanitized: SanitizedToolResult;
rawResult?: unknown;
callId?: string;
success: boolean;
sessionId: string;
};
'llm:error': {
error: Error;
context?: string;
recoverable?: boolean;
sessionId: string;
};
'llm:unsupported-input': {
errors: string[];
provider: LLMProvider;
model?: string;
fileType?: string;
details?: any;
sessionId: string;
};
}
```
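The value of a typed event map is that handlers get fully typed payloads. The sketch below uses a trimmed local copy of two `AgentEventMap` entries and a minimal emitter; the real `AgentEventBus` API may differ in detail:

```typescript
// Trimmed local copy of two AgentEventMap entries, so the sketch is self-contained.
interface AgentEventMapLite {
  'session:created': { sessionId: string; switchTo: boolean };
  'llm:thinking': { sessionId: string };
}

// Minimal typed emitter: event names and payload shapes are checked at compile time.
class TypedBus<M> {
  private handlers = new Map<keyof M, Array<(data: any) => void>>();
  on<K extends keyof M>(event: K, fn: (data: M[K]) => void): void {
    const list = this.handlers.get(event) ?? [];
    list.push(fn);
    this.handlers.set(event, list);
  }
  emit<K extends keyof M>(event: K, data: M[K]): void {
    for (const fn of this.handlers.get(event) ?? []) fn(data);
  }
}

const bus = new TypedBus<AgentEventMapLite>();
let seen = '';
bus.on('session:created', (data) => {
  // data is typed as { sessionId: string; switchTo: boolean }
  seen = data.sessionId;
});
bus.emit('session:created', { sessionId: 's1', switchTo: true });
console.log(seen); // "s1"
```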
### `SessionEventMap`
Type map for session-level events. These events are emitted within individual chat sessions and are automatically forwarded to the `AgentEventBus` with a `sessionId` property.
```typescript
interface SessionEventMap {
'llm:thinking': void;
'llm:response': {
content: string;
reasoning?: string;
provider?: string;
model?: string;
tokenUsage?: {
inputTokens?: number;
outputTokens?: number;
reasoningTokens?: number;
totalTokens?: number;
};
};
'llm:chunk': {
chunkType: 'text' | 'reasoning';
content: string;
isComplete?: boolean;
};
'llm:tool-call': {
toolName: string;
args: Record<string, any>;
callId?: string;
};
'llm:tool-result': {
toolName: string;
sanitized: SanitizedToolResult;
rawResult?: unknown;
callId?: string;
success: boolean;
};
'llm:error': {
error: Error;
context?: string;
recoverable?: boolean;
};
'llm:switched': {
newConfig: ValidatedLLMConfig;
historyRetained?: boolean;
sessionIds: string[];
};
'llm:unsupported-input': {
errors: string[];
provider: LLMProvider;
model?: string;
fileType?: string;
details?: any;
};
}
```
### Event Tier Types
```typescript
// Tier 1: Events exposed via DextoAgent.stream()
export type StreamingEventName =
| 'llm:thinking'
| 'llm:chunk'
| 'llm:response'
| 'llm:tool-call'
| 'llm:tool-result'
| 'llm:error'
| 'llm:unsupported-input'
| 'approval:request'
| 'approval:response'
| 'session:title-updated';
// Tier 2: Events exposed via webhooks, A2A, and monitoring
export type IntegrationEventName = StreamingEventName
| 'session:created'
| 'session:reset'
| 'mcp:server-connected'
| 'mcp:server-restarted'
| 'mcp:tools-list-changed'
| 'mcp:prompts-list-changed'
| 'tools:available-updated'
| 'llm:switched'
| 'state:changed';
// Union types with payloads
// Note: Uses 'name' (not 'type') to avoid collision with ApprovalRequest.type payload field
export type StreamingEvent = {
[K in StreamingEventName]: { name: K } & AgentEventMap[K];
}[StreamingEventName];
```
---
## Storage Types
### `StorageBackends`
Container for storage backend instances.
```typescript
interface StorageBackends {
cache: CacheBackend;
database: DatabaseBackend;
}
```
### `CacheBackend`
Interface for cache storage operations.
```typescript
interface CacheBackend {
get(key: string): Promise<any>;
set(key: string, value: any, ttl?: number): Promise<void>;
delete(key: string): Promise<void>;
clear(): Promise<void>;
disconnect?(): Promise<void>;
}
```
### `DatabaseBackend`
Interface for database storage operations.
```typescript
interface DatabaseBackend {
get(key: string): Promise<any>;
set(key: string, value: any): Promise<void>;
delete(key: string): Promise<void>;
append(key: string, value: any): Promise<void>;
getRange(key: string, start: number, end: number): Promise<any[]>;
disconnect?(): Promise<void>;
}
```
---
## Service Types
### `AgentServices`
Container for all agent service instances.
```typescript
interface AgentServices {
mcpManager: MCPManager;
systemPromptManager: SystemPromptManager;
agentEventBus: AgentEventBus;
stateManager: AgentStateManager;
sessionManager: SessionManager;
storage: StorageBackends;
}
```
---
## Tool Types
### `ToolSet`
Map of tool names to tool definitions.
```typescript
type ToolSet = Record<string, ToolDefinition>;
interface ToolDefinition {
name: string;
description: string;
inputSchema: {
type: 'object';
properties: Record<string, any>;
required?: string[];
};
}
```
### `ToolCall`
Represents a tool execution request.
```typescript
interface ToolCall {
id: string;
name: string;
arguments: Record<string, any>;
}
```
### `ToolResult`
Represents a tool execution result.
```typescript
interface ToolResult {
callId: string;
toolName: string;
result: any;
success: boolean;
error?: string;
}
```
---
## Utility Types
### `ImageData`
Type for image data in conversations.
```typescript
interface ImageData {
base64: string; // Base64 encoded image
mimeType: string; // e.g., 'image/jpeg', 'image/png'
}
```
### `FileData`
Type for file data in conversations.
```typescript
interface FileData {
base64: string; // Base64 encoded file data
mimeType: string; // e.g., 'application/pdf', 'audio/wav'
filename?: string; // Optional filename
}
```
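A minimal sketch of building a `FileData` payload from raw bytes, assuming a Node.js environment where `Buffer` is available:

```typescript
// Sketch: build a FileData-shaped payload from raw bytes (Node.js Buffer assumed).
function toFileData(bytes: Uint8Array, mimeType: string, filename?: string) {
  return {
    base64: Buffer.from(bytes).toString('base64'),
    mimeType,
    filename,
  };
}

const pdfBytes = new Uint8Array([0x25, 0x50, 0x44, 0x46]); // "%PDF" magic bytes
const file = toFileData(pdfBytes, 'application/pdf', 'report.pdf');
console.log(file.base64); // "JVBERg=="
```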
**Supported File Types:**
- **PDF files** (`application/pdf`) - Most widely supported
- **Audio files** (`audio/mp3`, `audio/wav`) - With OpenAI `gpt-4o-audio-preview` and Google Gemini models
**Unsupported File Types:**
- Text files (`.txt`, `.md`)
- CSV files (`.csv`)
- Word documents (`.doc`, `.docx`)
- Excel files (`.xls`, `.xlsx`)
- PowerPoint files (`.ppt`, `.pptx`)
- JSON files (`.json`)
- XML files (`.xml`)
- HTML files (`.html`)
For unsupported file types, consider:
1. Converting to text and sending as regular messages
2. Using specialized MCP servers for file processing
3. Using dedicated file processing tools
### `LoggerOptions`
Configuration options for the Logger class.
```typescript
interface LoggerOptions {
level?: 'error' | 'warn' | 'info' | 'http' | 'verbose' | 'debug' | 'silly';
silent?: boolean;
}
```
### `ChalkColor`
Available colors for logger output.
```typescript
type ChalkColor =
| 'black' | 'red' | 'green' | 'yellow' | 'blue' | 'magenta' | 'cyan' | 'white'
| 'gray' | 'grey' | 'blackBright' | 'redBright' | 'greenBright' | 'yellowBright'
| 'blueBright' | 'magentaBright' | 'cyanBright' | 'whiteBright';
```
---
## Generic Types
### `EventListener`
Generic event listener function type.
```typescript
type EventListener<T> = (data: T) => void;
```
### `EventEmitterOptions`
Options for event emitter methods.
```typescript
interface EventEmitterOptions {
signal?: AbortSignal;
}
```

Binary file not shown (image, 172 KiB)
@@ -0,0 +1,214 @@
---
slug: ai-agents-vs-llm-workflows
title: "AI Agents vs LLM Workflows: Why Autonomy Matters"
description: Learn what AI agents are, how they differ from traditional LLM workflows, and when to use each approach.
authors: rahul
tags: [ai-agents, llm-workflows, autonomous-ai, dexto]
keywords:
- ai agents
- llm workflows
- autonomous ai
- code review automation
- dexto open-source runtime
---
If you have been remotely exposed to AI, you've probably heard the term AI agent. But what really is an AI agent?
`AI agent` has become a blanket term used in the industry for any automation or software that uses an LLM.
In this post, well break down what an **AI agent** is from first principles, then compare **AI agents vs LLM workflows**—when to use each and why autonomy matters.
<!--truncate-->
Let's first start with Large Language Models (LLMs), the backbone of AI agents.
## What Are LLMs?
LLMs are deep learning models, pre-trained on vast amounts of data spanning much of the public internet.
At their core, LLMs take in input and predict the most likely output.
Here, the input could be a chat message, an image, a voice message, or even a video.
LLMs predict the output token-by-token, which means that at each step of giving you a response, the LLM is predicting what the next token should be. More on this [here](https://x.com/cwolferesearch/status/1879995082285171081).
So when you ask an LLM something like `what is 5+10`, or [`how many r's are there in strawberry?`](https://techcrunch.com/2024/08/27/why-ai-cant-spell-strawberry/), the LLM tries to *guess* what the actual answer should be based on its training data.
LLMs have reached a point where their grammar and sentence structure are much better than a typical human, and also have knowledge of an extremely broad variety of topics.
ChatGPT is the most popular example of an LLM-based application, which you've probably used, unless you're living under a rock.
Under the hood, ChatGPT uses LLMs built by OpenAI, like `gpt-5` or `gpt-5-mini`, to answer your questions about almost anything.
This is why if you ask LLMs questions like `how many r's are there in the word strawberry`, you might see completely incorrect results - [the guessing doesn't always work well](https://www.reddit.com/r/singularity/comments/1enqk04/how_many_rs_in_strawberry_why_is_this_a_very/). This is called [*hallucination*](https://en.wikipedia.org/wiki/Hallucination_(artificial_intelligence)).
## System Prompts
Well, we know that LLMs have an incredibly large knowledge base, but what if we wanted the LLM to do just one thing, like giving me food recipes?
LLMs allow you to customize their base instructions (aka system prompts).
This enables you to give the LLM custom roles/instructions based on your use-case.
Here's what the recipe analogy might look like as a system prompt:
```
You are an expert chef.
Your job is to suggest tasty recipes to me.
I don't eat beef, so keep that in mind
Only answer food related questions.
```
In a Dexto agent config, the same prompt goes in the `systemPrompt` field:
```yaml
systemPrompt: |
You are an expert chef.
Your job is to suggest tasty recipes to me.
I don't eat beef, so keep that in mind.
Only answer food related questions.
```
Now when I chat with the LLM, it will talk to me only like a chef!
![Conversation 1](./sysprompt_1.webp)
![Conversation 2](./sysprompt_2.webp)
## Function Calling in LLMs
Now, we've already established that LLMs can accept input and give back output. But LLMs can do more than that - they can also **take actions**.
This is done by giving the LLM access to `functions`, or `tools`.
These are defined methods with specific capabilities, implemented by developers.
This process of giving the LLM access to `functions` is called [function calling](https://platform.openai.com/docs/guides/function-calling).
Let's revisit the previous case where we asked the LLM to add 2 numbers, this time with function calling.
Suppose you gave `gpt-5` a function to add 2 numbers.
The next time you ask it `What is 5+10`, instead of trying to guess the answer, it would call the function and produce a more reliable response.
This is an extremely basic example, but the key takeaway here is that by giving the LLM tools - **LLMs can now take actions on your behalf.**
This is where things get interesting - what if the LLM had a function to book a reservation for you at a restaurant? Or if the LLM had a function to make a payment for you?
All the LLM would need to do in this case is use the right function based on the request, and you now have AI-powered bookings and payments.
There are other complexities, like ensuring the LLM reliably picks the correct function, and adding appropriate guardrails and authentication, but we won't get into those for now.
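To make this concrete, here is a hedged sketch of a tool definition (in the OpenAI function-calling schema shape) plus the dispatch step our code would run when the model asks for the tool; all names are illustrative:

```typescript
// Illustrative tool definition in the OpenAI function-calling schema shape.
const addTool = {
  type: 'function',
  function: {
    name: 'add_numbers',
    description: 'Add two numbers and return the sum',
    parameters: {
      type: 'object',
      properties: {
        a: { type: 'number' },
        b: { type: 'number' },
      },
      required: ['a', 'b'],
    },
  },
};

// Our side of the contract: actually run the tool the model asked for.
function dispatch(call: { name: string; arguments: { a: number; b: number } }): number {
  if (call.name === 'add_numbers') return call.arguments.a + call.arguments.b;
  throw new Error(`Unknown tool: ${call.name}`);
}

// If the model responds with a call to add_numbers(5, 10)...
console.log(dispatch({ name: 'add_numbers', arguments: { a: 5, b: 10 } })); // 15
```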
## LLM Workflows and AI Agents
Now that we've explained how tools and system prompts work, let's dive into how LLMs can be used to automate tasks.
Let's look at one specific problem, automating code reviews, and two different approaches to solving it with LLMs.
I've intentionally left out most of the complexity of actually building this system, primarily to show two ways we can think about this problem.
### Approach 1
OK, I'm a software developer, so I have a pretty good idea of how code reviews work.
Here are four important things, among others, that I look at when I review code:
1. **Functionality** - is the code functionally correct?
2. **Architecture** - does the code fit the codebase well and will it adapt well to changes we make in the future?
3. **Testing** - has the code been sufficiently tested? Are there more test cases we can come up with?
4. **Documentation** - has the documentation been updated to account for the code changes?
If I wanted to use LLMs to automate this, I could use one LLM for each of these sub-tasks. With four LLMs, one per problem, the flow could look something like this:
- **LLM-1** - instructed to ensure the code is functional for the problem.
- **LLM-2** - instructed to ensure the code fits the architecture and requirements.
- **LLM-3** - ensures test coverage is sufficient and tries to come up with more edge cases.
- **LLM-4** - ensures documentation is up to date.
1. User submits a pull request, which triggers the CI workflow.
2. LLM-1 reads the code and examines it. If the code is functional, move to the next step.
3. LLM-2 reads the code and style guidelines and checks conformance. Adds comments on the PR based on its analysis.
4. LLM-3 reads the code and tests and adds comments related to test cases.
5. LLM-4 reads the code and documentation and adds comments.
<!-- ![Code Review Workflow](./cr_workflow.png) -->
```mermaid
flowchart TD
A[User submits pull request] --> B[CI workflow triggers LLM review]
B --> LLM1[LLM-1: Check functionality]
LLM1 -->|Functional| LLM2[LLM-2: Review architecture & style]
LLM2 --> LLM3[LLM-3: Evaluate tests & suggest edge cases]
LLM3 --> LLM4[LLM-4: Review documentation]
LLM1 -->|Not functional| Stop1[❌ Add comments & halt review]
LLM2 -->|Issues found| Comment2[📝 Add architecture/style comments]
LLM3 -->|Insufficient tests| Comment3[📝 Add test coverage comments]
LLM4 -->|Missing/Outdated docs| Comment4[📝 Add documentation comments]
```
With this workflow mapped out, I just need to implement code that follows this logic tree.
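A minimal sketch of that logic tree in TypeScript; the check functions are stubs standing in for real LLM calls, each of which would have its own system prompt:

```typescript
type Review = { pass: boolean; comments: string[] };

// Stub reviewers; in a real pipeline each would be an LLM API call.
const checkFunctionality = (code: string): Review =>
  ({ pass: code.includes('return'), comments: [] }); // toy "functionality" check
const checkArchitecture = (_code: string): Review => ({ pass: true, comments: [] });
const checkTests = (_code: string): Review => ({ pass: true, comments: ['add edge-case tests'] });
const checkDocs = (_code: string): Review => ({ pass: true, comments: [] });

// The workflow: we decide the order; each LLM only executes its step.
function reviewPR(code: string): string[] {
  const functional = checkFunctionality(code);
  if (!functional.pass) return ['code is not functional; halting review'];

  const comments: string[] = [];
  for (const step of [checkArchitecture, checkTests, checkDocs]) {
    comments.push(...step(code).comments);
  }
  return comments;
}

console.log(reviewPR('function add(a, b) { return a + b; }'));
// logs: [ 'add edge-case tests' ]
```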
### Approach 2
If I had a developer working for me, I'd just ask them to review the code, right? What if I could leverage LLMs in a similar manner?
Let's give an LLM very detailed instructions and all the tools necessary to complete this review, just like I would for a human. Let's also give it a way to reach back out to me if it needs any clarifying information.
LLM-1 - instructed to review the code and given all the necessary tools to do the task.
In this approach, LLM-1 is not just executing the steps; it is also *figuring out* what steps to take to review the PR, based on high-level instructions.
<!-- ![Code Review Agent](./cr_workflow_2.png) -->
```mermaid
flowchart TD
A[User submits pull request] --> B[CI workflow triggers LLM-1 review]
B --> C[LLM-1: Reviews PR]
```
### So what's the difference between approach 1 and approach 2?
In Approach 1, we broke down the high-level task ourselves, decided exactly what steps were going to happen and in what order, and then programmed that.
In Approach 2, we gave the LLM some instructions and tools, and passed on the high-level task, letting the LLM do much more of the heavy lifting to figure out how to do it.
Let's look at the key differences in the approaches:
| Feature | Approach 1 (LLM Workflow) | Approach 2 (AI Agent) |
| --------------------- | ------------------------------ | -------------------------------- |
| Autonomy | Low - follows set steps | High - makes decisions |
| Adaptability | Rigid, limited to defined flow | Handles unexpected situations |
| Tool / Service usage | Fixed call order | Orchestrates multiple services |
| User interaction | None / minimal | Can ask clarifying questions |
Now, we can replace `Approach 1` with the term `LLM Workflow`, and `Approach 2` with the term `AI Agent`.
The key takeaway here is that workflows execute steps *we define*, while AI agents *figure out how to accomplish the goal* and can make decisions dynamically.
## Which Approach Is Better? {#which-approach-is-better}
Use an LLM Workflow when:
- The problem is small, and requires a repeatable, well-defined sequence.
- You want predictable, consistent output.
- The process does not require dynamic decision-making.
- Examples: AI recipe provider, AI task tracker
Use an AI Agent when:
- The problem is vague or complex and requires decision-making, adaptation, or chaining multiple services.
- The process may change based on context, user input, or something else.
- Examples: Coding assistant, customer support assistant.
## Closing Thoughts
In the past few years, we have seen AI products emerge that have primarily been LLM workflows or lightweight wrappers around LLM APIs. The general trend is that these companies do well for a short while until the models natively get better, then the products fade away.
My theory is that as AI models get natively better, there will be less need for these workflow-driven paradigms for specific problems, and LLMs will be able to do more of the heavy lifting.
AI models will be able to handle more tools and more complex instructions - and more use-cases will shift towards using autonomous agents. We have already seen reinforcement learning cases where the AI is just given a high level goal, and is able to figure out unique ways of accomplishing the task that humans wouldn't have tried.
Google DeepMind recently launched [AlphaEvolve](https://deepmind.google/discover/blog/alphaevolve-a-gemini-powered-coding-agent-for-designing-advanced-algorithms/), a coding agent designed to create new algorithms. AlphaEvolve has already discovered multiple new algorithms for matrix multiplication, a fundamental problem in computer science.
We're also seeing new AI agent products - IDEs like [Cursor](https://www.cursor.com/) and [Windsurf](https://windsurf.com/) allow users to build software applications by talking to an AI agent.
In a later blog post, we'll walk through how to use [Dexto, our open-source AI agent runtime](/docs/getting-started/intro) to build a real AI agent.

Binary file not shown (image, 107 KiB)
Binary file not shown (image, 109 KiB)
Binary file not shown (image, 27 KiB)
@@ -0,0 +1,11 @@
rahul:
name: Rahul Karajgikar
title: Co-Founder @ Truffle AI
url: https://github.com/rahulkarajgikar
image_url: https://github.com/rahulkarajgikar.png
shaunak:
name: Shaunak Srivastava
title: Co-Founder @ Truffle AI
url: https://github.com/shaunak99
image_url: https://github.com/shaunak99.png

dexto/docs/blog/tags.yml Normal file
@@ -0,0 +1,24 @@
ai-agents:
label: AI Agents
permalink: /ai-agents
description: Posts about AI agents and autonomous systems
llm-workflows:
label: LLM Workflows
permalink: /llm-workflows
description: Posts about Large Language Model workflows and automation
llm:
label: LLM
permalink: /llm
description: Posts about Large Language Models
autonomous-ai:
label: Autonomous AI
permalink: /autonomous-ai
description: Posts about autonomous AI systems and automation
dexto:
label: Dexto
permalink: /dexto
description: Posts about the Dexto open-source AI agent runtime


@@ -0,0 +1,9 @@
{
"label": "API Reference",
"position": 2,
"link": {
"type": "generated-index",
"title": "API Reference",
"description": "API Reference for Dexto"
}
}


@@ -0,0 +1,9 @@
{
"label": "Architecture",
"position": 5,
"link": {
"type": "generated-index",
"title": "Architecture",
"description": "Explore the high-level and technical architecture of Dexto."
}
}


@@ -0,0 +1,31 @@
# Overview
Dexto was built by the Truffle AI team.
We were trying to build useful AI agents in different domains, but realized we were re-building a lot of the same plumbing each time. So we tried some existing AI agent frameworks.
But we quickly got stuck learning the frameworks themselves: each framework had different abstractions and levels of control, and we felt there had to be a simpler way to build AI agents.
So we built Dexto with the following tenets:
1. <ins>**Complete configurability**</ins>: We want users to be able to configure every part of Dexto with just a config file.
2. <ins>**MCP first**</ins>: Adopting MCP enables Dexto to interact with tooling in a standardized manner.
3. <ins>**Powerful CLI**</ins>: We wanted a powerful CLI we could use for anything AI - just talking to LLMs, creating AI agents, deploying agents, and testing out models/prompts/tools.
4. <ins>**Re-usable Core primitives**</ins>: We want developers to be able to build all kinds of AI powered interfaces and applications using Dexto, without having to dive-deep into the code, but always having the option to. This allows us to re-use the same core layer to expose AI agents on telegram, discord, slack, etc.
5. <ins>**Simple deployments**</ins>: We want users to be able to play around with different config files of Dexto and simply save the configuration they liked to be able to re-use it anywhere. Docker helps make this happen.
Check out our generated deepwiki [here](https://deepwiki.com/truffle-ai/dexto) for more details.
## Core Services
Dexto's architecture is built around core services that handle different aspects of agent functionality. Each service has a specific responsibility and works together to provide the full agent experience.
See [Core Services](./services.md) for detailed information about:
- **DextoAgent** - Main orchestrator
- **MCPManager** - Tool coordination
- **ToolManager** - Tool execution
- **SessionManager** - Conversation state
- **StorageManager** - Data persistence
- **SystemPromptManager** - System prompts
- **AgentEventBus** - Event coordination
- **Telemetry** - Distributed tracing and observability


@@ -0,0 +1,315 @@
# Core Services
import ExpandableMermaid from '@site/src/components/ExpandableMermaid';
Dexto's architecture is built around core services that handle different aspects of agent functionality. Understanding these services helps with debugging, customization, and troubleshooting.
## Service Overview
| Service | Purpose | Key Responsibilities |
|---------|---------|---------------------|
| **DextoAgent** | Main orchestrator | Coordinates all services, handles user interactions |
| **MCPManager** | Tool coordination | Connects to MCP servers, manages tools and resources |
| **ToolManager** | Tool execution | Executes tools, handles confirmations, manages internal tools |
| **SessionManager** | Conversation state | Manages chat sessions, conversation history |
| **StorageManager** | Data persistence | Handles cache and database storage |
| **SystemPromptManager** | System prompts | Manages system prompt assembly and dynamic content |
| **AgentEventBus** | Event coordination | Handles inter-service communication |
## Service Relationships
<ExpandableMermaid title="Service Relationships Diagram">
```mermaid
graph TB
DA[DextoAgent] --> SM[SessionManager]
DA --> MM[MCPManager]
DA --> TM[ToolManager]
DA --> SPM[SystemPromptManager]
DA --> STM[StorageManager]
DA --> AEB[AgentEventBus]
MM --> TM
TM --> STM
SM --> STM
SPM --> STM
AEB -.-> SM
AEB -.-> MM
AEB -.-> TM
subgraph "Storage Layer"
STM
Cache[(Cache)]
DB[(Database)]
end
STM --> Cache
STM --> DB
```
</ExpandableMermaid>
## DextoAgent
**Main orchestrator** that coordinates all other services.
### Key Methods
- `start()` - Initialize all services
- `generate(message, options)` - Execute user prompt (recommended)
- `run(prompt, imageData?, fileData?, sessionId)` - Lower-level execution
- `switchLLM(updates)` - Change LLM model/provider
- `createSession(sessionId?)` - Create new chat session
- `stop()` - Shutdown all services
### Usage Example
```typescript
const agent = new DextoAgent(config);
await agent.start();
// Create a session
const session = await agent.createSession();
// Run a task
const response = await agent.generate("List files in current directory", session.id);
console.log(response.content);
// Switch models
await agent.switchLLM({ model: "claude-sonnet-4-5-20250929" });
await agent.stop();
```
## MCPManager
**Tool coordination** service that connects to Model Context Protocol servers.
### Key Methods
- `connectServer(name, config)` - Connect to MCP server
- `disconnectServer(name)` - Disconnect server
- `getAllTools()` - Get all available tools
- `executeTool(name, params)` - Execute specific tool
### Server Types
- **stdio** - Command-line programs
- **http** - HTTP REST endpoints
- **sse** - Server-sent events
### Usage Example
```typescript
// Connect filesystem tools
await agent.mcpManager.connectServer('filesystem', {
type: 'stdio',
command: 'npx',
args: ['-y', '@modelcontextprotocol/server-filesystem', '.']
});
// Get available tools
const tools = await agent.mcpManager.getAllTools();
```
## ToolManager
**Tool execution** service that handles tool calls and confirmations.
### Key Methods
- `getToolStats()` - Get tool counts (MCP + internal)
- `getAllTools()` - Get all available tools
- `executeTool(call)` - Execute tool with confirmation
### Tool Confirmation
Controls when users are prompted to approve tool execution:
- **auto** - Smart approval based on tool risk
- **always** - Always ask for confirmation
- **never** - Never ask (auto-approve)
### Usage Example
```typescript
// Get tool statistics
const stats = await agent.toolManager.getToolStats();
console.log(`${stats.total} tools: ${stats.mcp} MCP, ${stats.internal} internal`);
```
## SessionManager
**Conversation state** management for persistent chat sessions.
### Key Methods
- `createSession(sessionId?)` - Create new session
- `getSession(sessionId)` - Retrieve existing session
- `listSessions()` - List all sessions
- `deleteSession(sessionId)` - Delete session
- `getSessionHistory(sessionId)` - Get conversation history
- `resetConversation(sessionId)` - Clear session history while keeping session active
### Session Features
- Persistent conversation history
- Session metadata (creation time, last activity)
- Cross-session search capabilities
- Export/import functionality
### Usage Example
```typescript
// Create new session
const session = await agent.createSession('work-session');
// List all sessions
const sessions = await agent.listSessions();
// Get conversation history
const history = await agent.getSessionHistory('work-session');
// Use session in conversations
const response = await agent.generate("Hello", session.id);
console.log(response.content);
```
## StorageManager
**Data persistence** using two-tier architecture.
### Storage Tiers
- **Cache** - Fast, ephemeral (Redis or in-memory)
- **Database** - Persistent, reliable (PostgreSQL, SQLite)
### Backends
| Backend | Use Case | Configuration |
|---------|----------|---------------|
| **in-memory** | Development, testing | No config needed |
| **sqlite** | Single instance, persistence | `path: ./data/dexto.db` |
| **postgres** | Production, scaling | `connectionString: $POSTGRES_URL` |
| **redis** | Fast caching | `url: $REDIS_URL` |
### Usage Pattern
```yaml
storage:
cache:
type: redis # Fast access
url: $REDIS_URL
database:
type: postgres # Persistent storage
connectionString: $POSTGRES_CONNECTION_STRING
```
## SystemPromptManager
**System prompt** assembly from multiple contributors.
### Contributor Types
- **static** - Fixed text content
- **dynamic** - Generated content (e.g., current date/time)
- **file** - Content from files (.md, .txt)
### Priority System
Lower numbers execute first (0 = highest priority).
### Usage Example
```yaml
systemPrompt:
contributors:
- id: primary
type: static
priority: 0
content: "You are a helpful AI assistant..."
- id: date
type: dynamic
priority: 10
source: date
- id: context
type: file
priority: 5
files: ["./docs/context.md"]
```
## AgentEventBus
**Event coordination** for inter-service communication.
### Event Types
- **thinking** - AI is processing
- **chunk** - Streaming response chunk
- **toolCall** - Tool execution starting
- **toolResult** - Tool execution completed
- **response** - Final response ready
### Usage Example
```typescript
agent.agentEventBus.on('toolCall', (event) => {
console.log(`Executing tool: ${event.toolName}`);
});
agent.agentEventBus.on('response', (event) => {
console.log(`Response: ${event.content}`);
});
```
## Service Initialization
Services are initialized automatically when `DextoAgent.start()` is called:
1. **Storage** - Cache and database connections
2. **Events** - Event bus setup
3. **Prompts** - System prompt assembly
4. **MCP** - Server connections
5. **Tools** - Tool discovery and validation
6. **Sessions** - Session management ready
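This startup order can be sketched as a sequential boot routine. The sketch below is illustrative only; the type and step names are invented for this example and are not Dexto internals:

```typescript
// Illustrative sketch of ordered service startup (not Dexto's actual internals).
// Each step runs only after the previous one resolves, since later services
// depend on earlier ones (e.g. tool discovery needs live MCP connections).

type InitStep = { name: string; init: () => Promise<void> };

async function startServices(steps: InitStep[]): Promise<string[]> {
  const started: string[] = [];
  for (const step of steps) {
    await step.init(); // sequential, not parallel
    started.push(step.name);
  }
  return started;
}

const bootOrder: InitStep[] = [
  { name: 'storage',  init: async () => { /* open cache + database */ } },
  { name: 'events',   init: async () => { /* create event bus */ } },
  { name: 'prompts',  init: async () => { /* assemble system prompt */ } },
  { name: 'mcp',      init: async () => { /* connect MCP servers */ } },
  { name: 'tools',    init: async () => { /* discover and validate tools */ } },
  { name: 'sessions', init: async () => { /* session manager ready */ } },
];

startServices(bootOrder).then((order) => console.log(order.join(' -> ')));
// prints: storage -> events -> prompts -> mcp -> tools -> sessions
```

If any step fails, nothing after it starts, which is why an MCP connection error surfaces before tool-related errors.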
## Debugging Services
### Log Levels
```bash
# Enable debug logging
DEXTO_LOG_LEVEL=debug dexto
# Service-specific debugging
DEXTO_LOG_LEVEL=silly dexto # Most verbose
```
### Service Health Checks
```typescript
// Check MCP connections
const connectedServers = agent.mcpManager.getClients();
const failedConnections = agent.mcpManager.getFailedConnections();
// Check tool availability
const toolStats = await agent.toolManager.getToolStats();
// Check storage status
const storageHealth = agent.storageManager.getStatus();
```
### Common Issues
- **MCP connection failures** - Check command paths, network access
- **Storage errors** - Verify database/Redis connections
- **Tool execution timeouts** - Increase timeout in server config
- **Session persistence issues** - Check database backend health
## Service Configuration
Each service can be configured through the agent config:
```yaml
# MCP server connections
mcpServers:
filesystem:
type: stdio
command: npx
timeout: 30000
# Storage backends
storage:
cache:
type: redis
database:
type: postgres
# Session limits
sessions:
maxSessions: 100
sessionTTL: 3600000
# Tool confirmation
toolConfirmation:
mode: auto
timeout: 30000
```
See [Configuration Guide](../guides/configuring-dexto/overview.md) for complete config options.


@@ -0,0 +1,8 @@
{
"label": "Community",
"position": 6,
"link": {
"type": "generated-index",
"description": "Learn how to contribute to the Dexto ecosystem by adding MCPs, creating example agents, and more."
}
}


@@ -0,0 +1,9 @@
{
"label": "Learning resources",
"position": 3,
"link": {
"type": "generated-index",
"title": "AI Agents basics",
"description": "Fundamental concepts and principles of AI agents in Dexto."
}
}


@@ -0,0 +1,49 @@
---
sidebar_position: 2
---
# AI Agents vs. LLM Workflows
AI agents and LLM workflows are often confused with each other.
Understanding the distinction between **AI agents** and **LLM (Large Language Model) workflows** is key to choosing the right automation approach for your use case.
## What is an AI Agent?
- An autonomous software entity that can perceive, reason, and act to achieve goals.
- Handles complex, multi-step tasks by making decisions and orchestrating tools/services.
- Can adapt to changing environments and user requests.
## What is an LLM Workflow?
- A predefined sequence of steps or prompts executed by a large language model (like GPT-4 or Claude).
- Typically linear and deterministic: each step follows the previous one.
- Great for repeatable, well-defined processes (e.g., data extraction, summarization, formatting).
## Key Differences
| Feature | AI Agent | LLM Workflow |
|------------------------|------------------------------------------|--------------------------------------|
| Autonomy | High (makes decisions) | Low (follows set steps) |
| Adaptability | Can handle unexpected situations | Rigid, limited to defined flow |
| Use of Tools/Services | Orchestrates multiple tools/services | May call tools, but generally in fixed order |
| User Interaction | Can ask clarifying questions, replan | Usually no dynamic interaction |
| Example Use Case | "Book a flight and notify me on Slack" | "A button on a web page to summarize a document" |
## When to Use Each
- **Use an AI Agent when:**
  - The problem is vague or complex and requires decision-making, adaptation, or chaining multiple services.
  - The process may change based on context, user input, or intermediate results.
- **Use an LLM Workflow when:**
  - The problem is small and follows a repeatable, well-defined sequence.
- You want predictable, consistent output.
- The process does not require dynamic decision-making.
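The contrast can be made concrete with a small sketch. Everything below is hypothetical illustration code, not a Dexto API: the workflow runs the same fixed steps every time, while the agent loop lets a policy (standing in for the LLM) choose its next action based on what it has observed so far.

```typescript
// Hypothetical sketch contrasting the two approaches (not a Dexto API).

// LLM workflow: a fixed, deterministic pipeline. Same steps every time.
function summarizeWorkflow(doc: string): string {
  const cleaned = doc.trim();           // step 1: always runs
  const summary = cleaned.slice(0, 40); // step 2: always runs (stand-in for an LLM call)
  return `Summary: ${summary}`;         // step 3: always runs
}

// AI agent: a loop in which a policy decides the next action at runtime.
type Action = { tool: string; done: boolean };

function agentLoop(
  goal: string,
  decide: (goal: string, observations: string[]) => Action,
  tools: Record<string, (goal: string) => string>
): string[] {
  const observations: string[] = [];
  for (let i = 0; i < 10; i++) { // safety cap on iterations
    const action = decide(goal, observations);
    if (action.done) break;
    observations.push(tools[action.tool](goal));
  }
  return observations;
}

// A toy policy: search first, then email, then stop.
const trace = agentLoop(
  'send me the PDFs',
  (_goal, obs) =>
    obs.length === 0 ? { tool: 'search', done: false }
    : obs.length === 1 ? { tool: 'email', done: false }
    : { tool: 'noop', done: true },
  {
    search: () => 'found 3 PDFs',
    email: () => 'emailed 3 PDFs',
    noop: () => '',
  }
);
console.log(trace); // which tools ran depends entirely on the policy's decisions
```

Swap in a different policy and the same `agentLoop` takes a different path; the workflow function cannot do that without being rewritten.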
## In Dexto
The Dexto CLI spins up a powerful AI agent you can use to solve complex problems.
Choosing the right tool for the job helps you get the most out of Dexto's automation capabilities.


@@ -0,0 +1,18 @@
---
sidebar_position: 3
---
# How do AI Agents work?
AI agents operate by understanding your natural language instructions, reasoning about what needs to be done, and then taking actions to accomplish your goals. This process involves several key steps:
1. **Understanding:** The agent interprets your request using natural language processing.
2. **Planning:** It determines what actions or steps are needed to fulfill your request.
3. **Tool Selection:** The agent decides which tools or services are required to perform each step.
4. **Execution:** It invokes the selected tools, orchestrating them as needed to complete the task.
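The four steps above can be sketched as a tiny pipeline. All names below are invented for illustration; Dexto's real implementation is far more sophisticated:

```typescript
// Illustrative sketch of the understand -> plan -> select -> execute cycle.
// Every name here is hypothetical, not part of Dexto's API.

interface Step { description: string; tool: string }

function understand(request: string): string {
  return request.toLowerCase().trim(); // 1. interpret the request
}

function plan(intent: string): Step[] { // 2. break the goal into steps
  if (intent.includes('pdf')) {
    return [
      { description: 'find PDFs', tool: 'filesystem' },
      { description: 'send them', tool: 'email' },
    ];
  }
  return [{ description: 'answer directly', tool: 'llm' }];
}

function execute(steps: Step[], tools: Record<string, (s: Step) => string>): string[] {
  // 3 & 4. pick the tool for each step and invoke it
  return steps.map((step) => tools[step.tool](step));
}

const results = execute(plan(understand('Email me my PDFs')), {
  filesystem: (s) => `ran filesystem for: ${s.description}`,
  email: (s) => `ran email for: ${s.description}`,
  llm: (s) => `answered: ${s.description}`,
});
console.log(results);
```

In a real agent the planning and tool-selection steps are performed by the LLM rather than hard-coded branches, which is what makes the loop adaptive.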
A key part of this process is the use of **tools** — external services, APIs, or modules that enable the agent to take real actions in the world. Tools are the bridge between the agent's reasoning and real-world effects.
The Dexto CLI is an AI agent that does all of this for you: set up Dexto the way you want, tell it what to do, and it handles the rest!
To learn more about tools and their role in Dexto, continue to the next section: [Tools](./tools.md).


@@ -0,0 +1,41 @@
---
sidebar_position: 4
sidebar_label: "What are Tools?"
---
# Tools
## The Role of Tools
**Tools** are external services, APIs, or modules that an AI agent can use to perform actions, retrieve information, or manipulate data.
### Why Tools Matter
AI agents are powerful because they can go beyond language understanding—they can take real actions in the world. Tools are the bridge between the agent's reasoning and real-world effects.
### How do Dexto Agents use Tools?
Dexto agents get their tools from MCP servers: the servers define the tools, and Dexto uses them.
### Examples of Tools in Dexto
- **Filesystem Tool:** Read, write, or search files on your computer.
- **Web Browser Tool:** Automate web browsing, scraping, or form submissions.
- **Email Tool:** Read, summarize, or send emails.
- **Slack Tool:** Post messages, retrieve channels, or automate notifications.
- **Custom Tools:** Any API or service you connect via the Model Context Protocol (MCP).
### How Tools Work
- Tools are registered with Dexto agents via MCP configuration (see the Configuration docs).
- When you give a natural language command, the agent decides which tools to use and in what order.
- The agent can chain multiple tools together to accomplish complex tasks.
**Example:**
> "Find all PDF files in my Downloads folder and email them to me."
- The Dexto agent uses the Filesystem Tool to search for PDFs.
- Then uses the Email Tool to send them—all automatically.
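In an agent config, registering tools like these looks roughly as follows. The filesystem server is a real MCP package; the email server name is a placeholder, so substitute a server you actually use (see the Configuration docs for the full schema):

```yaml
# Illustrative MCP registration (the email server package is hypothetical)
mcpServers:
  filesystem:
    type: stdio
    command: npx
    args: ['-y', '@modelcontextprotocol/server-filesystem', '.']
  email:
    type: stdio
    command: npx
    args: ['-y', 'example-email-mcp-server'] # placeholder; use a real server
```

With both servers connected, the agent can chain the filesystem search into the email send without any extra wiring.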
### Extending with Your Own Tools
Dexto agents are extensible: you can add your own tools by implementing an MCP server or connecting to existing APIs. This lets you automate anything you can describe and connect.


@@ -0,0 +1,24 @@
---
sidebar_position: 1
---
# What is an AI Agent?
An **AI Agent** is a program that can understand a goal, make a plan, and use a set of tools to execute that plan. Think of it as an autonomous worker that can operate digital tools on your behalf.
At a high level, most AI agents share a common structure:
1. **Goal:** They are given a high-level objective in natural language (e.g., "summarize this report," "book a flight," "organize my files").
2. **Reasoning:** The agent, typically powered by a Large Language Model (LLM), breaks down the goal into a sequence of steps.
3. **Tools:** The agent has access to a set of tools (like APIs, web browsers, or file system commands) that it can use to execute the steps.
4. **Observation:** After each action, the agent observes the result (e.g., the output of a command, the content of a webpage) and uses that information to decide the next step.
5. **Execution:** The agent continues this loop of reasoning, tool use, and observation until the goal is achieved.
## Key Characteristics
- **Autonomy:** Agents can operate independently to achieve a goal without step-by-step human guidance.
- **Goal-Oriented:** Their actions are driven by the objective they are trying to achieve.
- **Tool-Using:** They don't just process information; they take action by using external tools.
- **Adaptive:** They can react to the results of their actions and adjust their plan accordingly.
AI agents represent a shift from simple instruction-following programs to more dynamic, goal-oriented systems that can navigate complex digital environments. Check out this blog post where we explore this in depth.


@@ -0,0 +1,9 @@
{
"label": "Getting Started",
"position": 1,
"link": {
"type": "generated-index",
"title": "Getting Started",
"description": "Learn how to build AI agents using Dexto's declarative configuration and runtime orchestration. These guides provide a hands-on introduction to creating persistent, tool-enabled agents."
}
}


@@ -0,0 +1,395 @@
---
sidebar_position: 3
---
# Build Your First Agent
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
Let's create your first custom agent in under 5 minutes.
We'll build a **Market Research Assistant** that can analyze competitors, research trends, and generate actionable insights.
<Tabs groupId="interface" defaultValue="webui" values={[
{ label: 'CLI', value: 'cli' },
{ label: 'WebUI', value: 'webui' },
]}>
<TabItem value="cli">
## Step 1: Create Your Agent
Create a new directory and add a basic configuration:
```bash
mkdir market-research-assistant
cd market-research-assistant
```
Create a `market-research-assistant.yml` file with your preferred LLM provider:
<Tabs groupId="llm-provider" defaultValue="openai" values={[
{ label: 'OpenAI', value: 'openai' },
{ label: 'Anthropic', value: 'anthropic' },
{ label: 'Google', value: 'google' },
{ label: 'Groq', value: 'groq' },
{ label: 'xAI', value: 'xai' },
{ label: 'Cohere', value: 'cohere' },
]}>
<TabItem value="openai">
```yaml
# market-research-assistant.yml
systemPrompt: |
You are a Market Research Assistant specializing in competitive analysis and market intelligence.
Your expertise includes:
- Analyzing competitor strategies and positioning
- Identifying market trends and opportunities
- Researching customer pain points and needs
- Generating actionable business insights
Always provide data-driven insights with specific recommendations.
llm:
provider: openai
model: gpt-5-mini
apiKey: $OPENAI_API_KEY
```
</TabItem>
<TabItem value="anthropic">
```yaml
# market-research-assistant.yml
systemPrompt: |
You are a Market Research Assistant specializing in competitive analysis and market intelligence.
Your expertise includes:
- Analyzing competitor strategies and positioning
- Identifying market trends and opportunities
- Researching customer pain points and needs
- Generating actionable business insights
Always provide data-driven insights with specific recommendations.
llm:
provider: anthropic
model: claude-sonnet-4-5-20250929
apiKey: $ANTHROPIC_API_KEY
```
</TabItem>
<TabItem value="google">
```yaml
# market-research-assistant.yml
systemPrompt: |
You are a Market Research Assistant specializing in competitive analysis and market intelligence.
Your expertise includes:
- Analyzing competitor strategies and positioning
- Identifying market trends and opportunities
- Researching customer pain points and needs
- Generating actionable business insights
Always provide data-driven insights with specific recommendations.
llm:
provider: google
model: gemini-2.5-pro
apiKey: $GOOGLE_GENERATIVE_AI_API_KEY
```
</TabItem>
<TabItem value="groq">
```yaml
# market-research-assistant.yml
systemPrompt: |
You are a Market Research Assistant specializing in competitive analysis and market intelligence.
Your expertise includes:
- Analyzing competitor strategies and positioning
- Identifying market trends and opportunities
- Researching customer pain points and needs
- Generating actionable business insights
Always provide data-driven insights with specific recommendations.
llm:
provider: groq
model: llama-3.3-70b-versatile
apiKey: $GROQ_API_KEY
```
</TabItem>
<TabItem value="xai">
```yaml
# market-research-assistant.yml
systemPrompt: |
You are a Market Research Assistant specializing in competitive analysis and market intelligence.
Your expertise includes:
- Analyzing competitor strategies and positioning
- Identifying market trends and opportunities
- Researching customer pain points and needs
- Generating actionable business insights
Always provide data-driven insights with specific recommendations.
llm:
provider: xai
model: grok-4
apiKey: $XAI_API_KEY
```
</TabItem>
<TabItem value="cohere">
```yaml
# market-research-assistant.yml
systemPrompt: |
You are a Market Research Assistant specializing in competitive analysis and market intelligence.
Your expertise includes:
- Analyzing competitor strategies and positioning
- Identifying market trends and opportunities
- Researching customer pain points and needs
- Generating actionable business insights
Always provide data-driven insights with specific recommendations.
llm:
provider: cohere
model: command-a-03-2025
apiKey: $COHERE_API_KEY
```
</TabItem>
</Tabs>
## Step 2: Add Research Tools
Add Exa's AI-powered web search capabilities to your agent:
```yaml
# market-research-assistant.yml
systemPrompt: |
You are a Market Research Assistant specializing in competitive analysis and market intelligence.
Your expertise includes:
- Analyzing competitor strategies and positioning
- Identifying market trends and opportunities
- Researching customer pain points and needs
- Generating actionable business insights
Always provide data-driven insights with specific recommendations.
llm:
provider: openai
model: gpt-5-mini
apiKey: $OPENAI_API_KEY
mcpServers:
# AI-powered web search for market research
exa:
type: http
url: https://mcp.exa.ai/mcp
```
## Step 3: Test Your Agent
Run your agent with a practical market research task:
```bash
dexto --agent market-research-assistant.yml
```
This opens the web UI.
Then ask the agent this:
```
Research the AI coding assistant market. Analyze the top 3 competitors, their pricing strategies, and identify opportunities for a new entrant
```
Your agent will:
1. 🔍 Use AI-powered search to find relevant companies and data
2. 📊 Analyze competitor strategies and market positioning
3. 📝 Generate a comprehensive market analysis report
4. 💡 Provide actionable recommendations
:::tip CLI Mode
For terminal-based interaction, add `--mode cli`: `dexto --agent market-research-assistant.yml --mode cli`
:::
## Try These Research Tasks
Once your agent is running, try these powerful research scenarios:
- *"Analyze how Notion, Obsidian, and Roam Research position themselves in the note-taking market."*
- *"Research the sustainable packaging industry and identify customer pain points."*
- *"Find emerging trends in remote work tools and 5 companies launched in 2024."*
- *"Research successful SaaS product launches and their marketing strategies."*
</TabItem>
<TabItem value="webui">
## Step 1: Create a New Agent
First, start the Dexto Web UI:
```bash
dexto
```
Click on the **Agent Selector** button (with the sparkle icon) at the top of the page, then click **New Agent**.
![Create new agent](/img/screenshots/build-agent/webui-new-agent-button.png)
This opens the agent creation modal where you can configure your agent. Fill in the basic information:
1. **Agent ID**: `market-research-agent`
2. **Agent Name**: `Market Research Agent`
3. **Description**: `An agent to help with market research and analysis.`
4. **Tags**: `research, custom, specialized`
## Step 2: Configure the LLM
Choose your LLM provider and model. Select the same one you picked when you set up Dexto.
![Configure LLM](/img/screenshots/build-agent/webui-llm-config.png)
The available providers include:
- OpenAI (e.g., gpt-5-mini)
- Anthropic (e.g., claude-sonnet-4-5-20250929)
- Google (e.g., gemini-2.5-pro)
- Groq (e.g., llama-3.3-70b-versatile)
- xAI (e.g., grok-4)
- Cohere (e.g., command-a-03-2025)
## Step 3: Set the System Prompt
Scroll to the **System Prompt** section and configure your agent's personality and expertise:
![Configure system prompt](/img/screenshots/build-agent/webui-system-prompt.png)
Enter this system prompt:
```
You are a Market Research Assistant specializing in competitive analysis and market intelligence.
Your expertise includes:
- Analyzing competitor strategies and positioning
- Identifying market trends and opportunities
- Researching customer pain points and needs
- Generating actionable business insights
Always provide data-driven insights with specific recommendations.
```
Click **Create Agent** to create your agent with all the configurations.
## Step 4: Switch to Your New Agent
After creating the agent, you need to switch to it. Click the **Agent Selector** button at the top and select your newly created **Market Research Agent** from the dropdown.
## Step 5: Connect Research Tools
Now that you've switched to your agent, let's add tools to give it research capabilities. Click the **Connect MCPs** button in the Tools & Servers panel on the right side to open the MCP Server Registry.
![Add MCP server](/img/screenshots/build-agent/webui-mcp-servers.png)
Search for **Exa** and click to add it. This gives your agent AI-powered web search capabilities for market research. Once added, click the connect button to activate the server.
## Step 6: Test Your Agent
Now you're ready to test your agent! You can see it's active in the Agent Selector button at the top, and the Exa tool is connected in the Tools & Servers panel.
![Market Research Agent active](/img/screenshots/build-agent/webui-agent-active.png)
Test your agent with a practical market research task:
```
Research the AI coding assistant market. Analyze the top 3 competitors, their pricing strategies, and identify opportunities for a new entrant
```
Your agent will:
1. 🔍 Use AI-powered search to find relevant companies and data
2. 📊 Analyze competitor strategies and market positioning
3. 📝 Generate a comprehensive market analysis report
4. 💡 Provide actionable recommendations
## Try These Research Tasks
Once your agent is running, try these powerful research scenarios:
- *"Analyze how Notion, Obsidian, and Roam Research position themselves in the note-taking market."*
- *"Research the sustainable packaging industry and identify customer pain points."*
- *"Find emerging trends in remote work tools and 5 companies launched in 2024."*
- *"Research successful SaaS product launches and their marketing strategies."*
## Step 7: Edit Your Agent (Optional)
You can edit your agent's configuration at any time:
1. Click the **Customize** button (pencil icon) in the sidebar
2. This opens the Agent Editor where you can:
- Modify the system prompt
- Change LLM settings
- Add or remove MCP servers
- Configure storage options
![Agent editor](/img/screenshots/build-agent/webui-agent-editor.png)
Changes are saved automatically and take effect immediately.
</TabItem>
</Tabs>
## That's It!
You've built a powerful Market Research Assistant!
<Tabs groupId="interface" defaultValue="webui" values={[
{ label: 'CLI', value: 'cli' },
{ label: 'WebUI', value: 'webui' },
]}>
<TabItem value="cli">
**What you accomplished:**
1. ✅ Created a specialized `market-research-assistant.yml` configuration
2. ✅ Added AI-powered web search tools
3. ✅ Tested it with real market research tasks
This demonstrates Dexto's core philosophy: **configure declaratively, let the runtime handle orchestration, and focus on your agent's purpose rather than implementation details.**
Your agent can now:
- 🔍 Conduct comprehensive web searches
- 📊 Analyze competitor websites and strategies
- 📝 Generate detailed research reports
- 💡 Provide actionable business insights
</TabItem>
<TabItem value="webui">
**What you accomplished:**
1. ✅ Created a new agent using the Web UI
2. ✅ Set up LLM provider and model
3. ✅ Configured system prompts for agent personality
4. ✅ Switched to your newly created agent
5. ✅ Connected MCP servers for research tools
6. ✅ Tested your agent with real market research tasks
7. ✅ Learned how to edit agent configuration on the fly
The Web UI provides a visual interface to build agents without writing YAML files manually, while still giving you full control over agent configuration.
</TabItem>
</Tabs>
Continue to the [Install Your First Agent Tutorial](./install-first-agent-tutorial.mdx) to explore pre-built agents that come with Dexto!


@@ -0,0 +1,262 @@
---
sidebar_position: 4
---
# Install Your First Agent
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
Now that you have used Dexto to build your own agent, let's explore some of the built-in agents that come with Dexto.
<Tabs groupId="interface" defaultValue="webui" values={[
{ label: 'CLI', value: 'cli' },
{ label: 'WebUI', value: 'webui' },
]}>
<TabItem value="cli">
### 1. List available Agents
Dexto ships with built-in agents for common use cases.
Use the `list-agents` command to list the built-in agents.
```bash
dexto list-agents
```
You will see many different agents. Some of the available agents are:
- **Nano Banana Agent** - Advanced image generation and editing using Google's Nano Banana (Gemini 2.5 Flash Image)
- **Podcast Agent** - Advanced podcast generation using Google Gemini TTS for multi-speaker audio content
- **Coding Agent** - Expert software development assistant with comprehensive coding tools
- **Sora Video Agent** - AI video generation using OpenAI's Sora technology
- **Database Agent** - Demo agent for SQL queries and database operations
- **Image Editor Agent** - Image editing and manipulation
- **Music Agent** - Music creation and audio processing
- **Talk2PDF Agent** - Document analysis and conversation
- **GitHub Agent** - GitHub operations, PR analysis, and repository management
- **Product Researcher** - Product naming and branding research
- **Triage Agent** - Demo multi-agent customer support routing system
Each agent is pre-configured with the right tools, prompts, and LLM settings for its domain.
No setup required — just install and start using them.
**📚 See the full [Agent Registry](../guides/agent-registry.md) for detailed information about all available agents, their capabilities, and use cases.**
### 2. Install a new agent
```bash
dexto install music-agent
```
This installs the music-agent with the LLM you chose during setup, and you can immediately start running it!
```bash
dexto list-agents
```
This command will verify that the agent was installed successfully.
### 3. Run the installed agent
```bash
dexto --agent music-agent
```
This opens the Web UI (the default) where you can interact with your newly installed music agent!
:::tip Terminal Mode
Prefer terminal-based interaction? Use `dexto --agent music-agent --mode cli` to run in CLI mode, where you can use slash commands like `/tools` to list available tools.
:::
Check out [Music Agent Demo and Tutorial](../tutorials/cli/examples/music-agent.md) for a full tutorial of how to use this agent.
### 4. Uninstall the agent
```bash
dexto uninstall music-agent
```
This command uninstalls the agent.
```bash
dexto list-agents
```
Verify the music-agent was successfully uninstalled.
### 5. Reinstall the agent with built-in LLM settings
```bash
dexto install music-agent --no-inject-preferences
```
When you first installed `music-agent`, Dexto automatically injected the preferred LLM you configured during setup as the LLM for this agent.
To skip this and use the default LLM shipped with the music-agent, pass the `--no-inject-preferences` argument when installing the agent.
This is preferable for agents that work better with, or require, specific pre-set LLMs, such as the nano-banana agent, which requires Gemini.
Now let's rerun the list-agents command to see the installed agents.
```bash
dexto list-agents
```
You will now see that the music-agent has a different LLM configuration: the default one, rather than your preferred one.
In case you install an agent incorrectly, or want to use the default LLM config, you can always uninstall and reinstall agents.
### 6. Use the `dexto which` command to find the path to your agent file
```bash
dexto which music-agent
```
This command will show you the path to the installed music-agent so you can edit it directly if you are comfortable!
### 7. Update the music-agent to be your default agent
Open the dexto preferences file:
```bash
vi ~/.dexto/preferences.yml
```
Update the `defaultAgent` from `coding-agent` to `music-agent`
Before:
```yml
defaults:
defaultAgent: coding-agent
```
After:
```yml
defaults:
defaultAgent: music-agent
```
Now restart Dexto without any arguments:
```bash
dexto
```
This opens the Web UI with your new default agent. Ask it "Who are you?", and you will see that instead of the `coding-agent`, Dexto is now using the `music-agent`!
Similarly, if you want to change the default to any other agent, just update `defaultAgent` to any valid agent-name!
### 8. Run a new agent directly without install
```bash
dexto -a database-agent
```
If you don't run the install command first, Dexto will automatically install the agent with your preferred LLM and open the Web UI for you to start chatting!
You can now interact with the database agent in the Web UI, or close it and re-run the `list-agents` command to verify that the `database-agent` was installed.
:::note
Use `--no-auto-install` with `dexto` to disable the automatic install behavior.
:::
</TabItem>
<TabItem value="webui">
### 1. View Available Agents
First, start the Dexto Web UI:
```bash
dexto --mode web
```
Click on the **Agent Selector** button (with the sparkle icon) at the top of the page to see all available agents.
![Agent sections](/img/screenshots/install-agent/webui-agent-sections.png)
The dropdown organizes agents into sections:
- **Custom Agents** - Your personally created agents (with a trash icon to delete)
- **Installed** - Built-in agents you've already installed
- **Available** - Built-in agents ready to install (with a download icon)
**📚 See the full [Agent Registry](../guides/agent-registry.md) for detailed information about all available agents, their capabilities, and use cases.**
### 2. Install an Available Agent
Scroll to the **Available** section and find an agent you want to try, like **Music Agent**.
![Available agents](/img/screenshots/install-agent/webui-available-agents.png)
Click on the agent with the download icon. It will be installed automatically with your preferred LLM.
### 3. Switch to the Installed Agent
After installation, you need to open the Agent Selector again. The agent will now appear in the **Installed** section.
![Installed agents](/img/screenshots/install-agent/webui-installed-agents.png)
Click on **Music Agent** to switch to it.
### 4. Use Your Agent
You're now talking to the Music Agent! The Agent Selector button at the top shows which agent is currently active.
![Music agent active](/img/screenshots/install-agent/webui-agent-active.png)
You can start chatting with the music agent immediately. Try asking it about music creation or audio processing.
Check out [Music Agent Demo and Tutorial](../tutorials/cli/examples/music-agent.md) for a full tutorial of how to use this agent.
### 5. Delete Custom Agents
To delete a custom agent you created, open the Agent Selector dropdown and find it under **Custom Agents**. Click the **trash icon** next to it to delete.
:::note
Built-in agents (from the Installed or Available sections) cannot be deleted through the UI.
:::
</TabItem>
</Tabs>
## Congratulations!
You've just installed your first pre-built AI agent with Dexto!
<Tabs groupId="interface" defaultValue="webui" values={[
{ label: 'CLI', value: 'cli' },
{ label: 'WebUI', value: 'webui' },
]}>
<TabItem value="cli">
You've learned how to:
- ✅ List available agents in dexto with the `dexto list-agents` command
- ✅ Install agents with the `dexto install` command
- ✅ Uninstall agents with the `dexto uninstall` command
- ✅ Use the `--no-inject-preferences` argument with `dexto install` command to use default LLM preferences
- ✅ Find the file path to installed agents with the `dexto which` command
- ✅ Set any agent as your preferred default agent in `dexto`
- ✅ Automatically install and run an agent with `dexto -a <agent_name>`
</TabItem>
<TabItem value="webui">
You've learned how to:
- ✅ View available agents using the Agent Selector dropdown
- ✅ Understand the different agent sections (Custom, Installed, Available)
- ✅ Install agents by clicking on them in the Available section
- ✅ Switch between installed agents
- ✅ Delete custom agents using the trash icon
</TabItem>
</Tabs>
You are now ready to explore other built-in agents like the podcast agent, triage agent, database agent, and more!
**Next Steps**: Explore adding more [tools](../concepts/tools.md) or building [multi-agent systems](../tutorials/cli/examples/multi-agent-systems.md).


@@ -0,0 +1,118 @@
---
sidebar_position: 2
---
# Installation
This guide will walk you through installing the Dexto CLI and setting up your environment so you can start running agents.
## Prerequisites
- [Node.js](https://nodejs.org/en/download) >= 20.0.0
**Optional:** An LLM API Key (not required for local models)
- [Get a Gemini Key](https://aistudio.google.com/apikey) (free option available)
- [Get a Groq Key](https://console.groq.com/keys) (free option available)
- [Get an OpenAI Key](https://platform.openai.com/api-keys)
- [Get a Claude Key](https://console.anthropic.com/settings/keys)
## 1. Install Dexto
Install Dexto globally using npm:
```bash
npm install -g dexto
```
This adds the `dexto` command to your system, giving you access to the agent runtime.
## 2. Run Setup
```bash
dexto
```
This triggers the first-time setup wizard.
### Quick Start (Recommended)
1. **Choose setup type** → Select "Get started now"
2. **Pick provider** → Gemini (free), Groq (fast), or Local
3. **Get API key** → Browser opens, or paste existing key
4. **Confirm mode** → Press Enter for Terminal (default)
Done! You're ready to chat.
### Custom Setup
1. **Choose "Choose my own provider"**
2. **Browse providers** → OpenAI, Anthropic, local models, gateways
3. **Select model** → Pick from available models
4. **Configure API key** → Browser, paste, or skip
5. **Choose default mode** → Terminal, Browser, or API Server
### Supported Providers
| Category | Providers |
|----------|-----------|
| **Free Cloud** | Google Gemini, Groq |
| **Local (No API key)** | Local Models, Ollama (requires [Ollama](https://ollama.com) installed) |
| **Cloud** | OpenAI, Anthropic, xAI, Cohere |
| **Gateways** | OpenRouter, Glama, LiteLLM, OpenAI-Compatible |
| **Enterprise** | Google Vertex AI, AWS Bedrock |
### Modes Explained
| Mode | Flag | Description | Best For |
|------|------|-------------|----------|
| **Terminal** | `--mode cli` | Interactive CLI in your terminal | Quick tasks, coding |
| **Browser** | `--mode web` | Web UI at localhost:3000 | Long conversations |
| **API Server** | `--mode server` | REST API on port 3001 | Integrations, apps |
## 3. Start Using Dexto
After setup, Dexto launches in your selected default mode.
```bash
# Run with your default mode
dexto
# Override with a specific mode
dexto --mode cli
dexto --mode web
dexto --mode server
# One-shot commands (auto-uses CLI mode)
dexto "say hello"
dexto -p "list files in this directory"
```
## Reconfigure Anytime
```bash
# Open settings menu
dexto setup
# Force re-run full setup
dexto setup --force
```
### Non-Interactive Setup
For automation or CI environments:
```bash
dexto setup --provider google --model gemini-2.5-pro
dexto setup --provider ollama --model llama3.2
dexto setup --quick-start
```
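As an illustration, the first command above writes your choices to `~/.dexto/preferences.yml`. A sketch of the resulting file, assuming the standard preferences structure (exact fields may vary by version):

```yaml
# ~/.dexto/preferences.yml (illustrative sketch)
llm:
  provider: google
  model: gemini-2.5-pro
  apiKey: $GOOGLE_GENERATIVE_AI_API_KEY   # environment variable reference, not a literal key
setup:
  completed: true
```

You can edit this file directly later, or re-run `dexto setup` to regenerate it.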
:::tip CLI reference
For detailed information about all CLI commands, flags, and advanced usage patterns, check out our comprehensive **[CLI Guide](../guides/cli/overview)**.
:::
## Next Step: Build Your First Agent
Now that Dexto is installed, you're ready to create your first custom agent with its own configuration and capabilities.
Continue to the **[Build Your First Agent Tutorial](./build-first-agent-tutorial.mdx)** to learn how to build agents using declarative configuration.

---
sidebar_position: 1
---
# Introduction
**An all-in-one toolkit to build agentic applications that turn natural language into real-world actions.**
Dexto is a universal agent intelligence layer for building collaborative, context-aware AI Agents & agentic apps. It orchestrates LLMs, tools, and data into persistent, stateful systems with memory, so you can rapidly create AI assistants, digital companions & copilots that think, act and feel alive.
Dexto combines a configuration-driven framework, robust runtime, and seamless developer experience so you can build, deploy, and iterate on your agents easily.
- **Framework** - Define agent behavior in YAML. Instantly swap models and tools without touching code.
- **Runtime** - An execution engine with orchestration, session management, conversation memory, and multimodal support.
- **Interfaces & Tooling** - Native support for CLI, Web, APIs, and the Dexto Agent SDK.
import ExpandableImage from '@site/src/components/ExpandableImage';
<ExpandableImage
src="/assets/intro_diagram.png"
alt="Dexto Architecture"
title="Dexto Architecture Overview"
/>
## Key Features
- **50+ LLMs** - OpenAI, Anthropic, Google, Groq, local models
- **MCP integration** - Connect to 100+ tools and services via Model Context Protocol
- **Multiple interfaces** - CLI, Web UI, REST API, Dexto Agent SDK
- **Persistent sessions** - Maintain context across conversations
- **Local-first** - Run on your infrastructure
- **Production storage** - Redis, PostgreSQL, SQLite
## Ready to Get Started?
**[Install Dexto →](./installation.md)**
---
*Dexto is built by the team at Truffle AI. Join our community and help shape the future of collaborative agent systems!*

{
"label": "Guides",
"position": 2,
"link": {
"type": "doc",
"id": "guides/introduction"
}
}

---
sidebar_position: 2
title: "Agent Registry"
---
# Agent Registry
Dexto comes with a curated collection of pre-built agents ready to use for common tasks. Each agent is optimized with specific tools, system prompts, and LLM configurations for its domain.
:::tip Adding Custom Agents
Want to add your Agent to this registry? Check out our [Community Contribution Guide](https://github.com/truffle-ai/dexto/blob/main/CONTRIBUTING.md#3-adding-agents-to-the-official-registry) for step-by-step instructions.
:::
## Quick Start
```bash
# List all available agents
dexto list-agents
# Install an agent
dexto install <agent-name>
# Use an installed agent
dexto --agent <agent-name>
```
For detailed installation instructions, see the [Installing Custom Agents guide](./installing-custom-agents).
---
## Agent Catalog Overview
| Agent | Category | Best For | LLM |
|-------|----------|----------|-----|
| [Podcast Agent](#%EF%B8%8F-podcast-agent) | Content Creation | Multi-speaker audio, podcast intros | OpenAI GPT-5 Mini |
| [Music Agent](#-music-agent) | Content Creation | Music composition, audio processing | OpenAI GPT-5 Mini |
| [Nano Banana Agent](#%EF%B8%8F-nano-banana-agent) | Content Creation | Image generation & editing | Google Gemini 2.5 Flash |
| [Sora Video Agent](#-sora-video-agent) | Content Creation | AI video generation | OpenAI GPT-5 Mini |
| [Image Editor Agent](#%EF%B8%8F-image-editor-agent) | Content Creation | Image manipulation, face detection | OpenAI GPT-5 Mini |
| [Coding Agent](#-coding-agent) | Development | Software development, debugging | Anthropic Claude Haiku 4.5 |
| [Explore Agent](#-explore-agent) | Development | Codebase exploration, research | Anthropic Claude Haiku 4.5 |
| [Database Agent](#%EF%B8%8F-database-agent) | Data & Analysis | SQL queries, database operations | OpenAI GPT-5 Mini |
| [Talk2PDF Agent](#-talk2pdf-agent) | Data & Analysis | PDF analysis, document conversation | OpenAI GPT-5 Mini |
| [Product Analysis Agent](#-product-analysis-agent) | Data & Analysis | Product analytics, user behavior | Anthropic Claude Sonnet 4.5 |
| [GitHub Agent](#-github-agent) | DevOps | GitHub operations, PR analysis | OpenAI GPT-5 Mini |
| [Workflow Builder Agent](#-workflow-builder-agent) | Automation | n8n workflow automation | OpenAI GPT-5 Mini |
| [Product Researcher](#-product-researcher) | Research | Product naming, branding research | Anthropic Claude Sonnet 4.5 |
| [Triage Agent](#-triage-agent) | Multi-Agent | Customer support routing | OpenAI GPT-5 |
| [Gaming Agent](#-gaming-agent) | Entertainment | GameBoy games, Pokemon | Anthropic Claude Sonnet 4.5 |
| [Default Agent](#%EF%B8%8F-default-agent) | General Purpose | General tasks, file operations | Any |
---
## Detailed Agent Information
### Content Creation
#### 🎙️ Podcast Agent
**ID:** `podcast-agent`
**Best For:** Multi-speaker audio content, podcast intros, voice synthesis
Create professional podcast content with realistic multi-speaker audio using Google Gemini TTS.
**Key Features:**
- Multi-speaker podcast generation
- High-quality voice synthesis
- Audio editing and production
- Various podcast format support
**Example Use:**
```bash
dexto --agent podcast-agent "Generate an intro for a tech podcast about AI"
```
**Recommended LLM:** OpenAI GPT-5 Mini
**Requires:** `GOOGLE_GENERATIVE_AI_API_KEY`
---
#### 🎵 Music Agent
**ID:** `music-agent`
**Best For:** Music creation, audio processing, sound design
AI agent specialized in music composition and audio manipulation.
**Key Features:**
- Music composition and generation
- Audio processing and effects
- Sound design capabilities
- Multiple musical styles
**Example Use:**
```bash
dexto --agent music-agent "Create a lo-fi chill beat for studying"
```
**Recommended LLM:** OpenAI GPT-5 Mini
**Tutorial:** [Music Agent Tutorial](../tutorials/cli/examples/music-agent.md)
---
#### 🖼️ Nano Banana Agent
**ID:** `nano-banana-agent`
**Best For:** Image generation, editing, object removal, style transfer
Advanced image generation and editing using Google's Nano Banana (Gemini 2.5 Flash Image) with near-instantaneous processing.
**Key Features:**
- **Image Generation** - Create stunning images from text prompts with various styles
- **Image Editing** - Modify existing images using natural language
- **Object Removal** - Remove unwanted objects while preserving backgrounds
- **Background Changes** - Replace backgrounds seamlessly
- **Image Fusion** - Combine multiple images creatively
- **Style Transfer** - Apply artistic styles with character consistency
- **Figurine Effect** - Nano Banana's signature feature
- **Multi-image Processing** - Support for complex compositions
**Capabilities:**
- Near-instantaneous processing with high visual coherence
- Character consistency across multiple edits
- Scene preservation with seamless blending
- SynthID watermarks for safety
- Supports JPG, PNG, WebP, GIF (max 20MB per image)
**Example Use:**
```bash
dexto --agent nano-banana-agent "Create a futuristic cityscape with flying cars"
dexto --agent nano-banana-agent "Remove the person from this photo"
```
**Recommended LLM:** Google Gemini 2.5 Flash (Required)
**Requires:** `GOOGLE_GENERATIVE_AI_API_KEY`
**Demo:** [Image Generation Example](/examples/image-generation)
---
#### 🎬 Sora Video Agent
**ID:** `sora-video-agent`
**Best For:** AI video generation, video remixing, video management
Create AI-generated videos using OpenAI's Sora technology with comprehensive video creation and management capabilities.
**Key Features:**
- **Video Generation** - Create videos from text prompts with custom duration, resolution, and style
- **Reference-Based Creation** - Use images or videos as reference for precise generation
- **Video Management** - Monitor progress, list all videos, organize creations
- **Video Remixing** - Create variations and extensions with new prompts
- **File Management** - Auto-download and organize generated videos
- **Quality Control** - Delete unwanted videos and manage storage
**Supported Specifications:**
- **Durations:** 4s, 8s, 16s, 32s
- **Resolutions:** 720x1280 (9:16), 1280x720 (16:9), 1024x1024 (1:1), 1024x1808 (9:16 HD), 1808x1024 (16:9 HD)
- **Reference Formats:** JPG, PNG, WebP, MP4, MOV, AVI, WebM
**Example Use:**
```bash
dexto --agent sora-video-agent "Create a 16s cinematic video of a sunset over mountains"
```
**Recommended LLM:** OpenAI GPT-5 Mini
**Requires:** `OPENAI_API_KEY`
---
#### 🖌️ Image Editor Agent
**ID:** `image-editor-agent`
**Best For:** Image manipulation, face detection, OpenCV operations
General-purpose image editing and manipulation with computer vision capabilities.
**Key Features:**
- Image editing and transformation
- Face detection and annotation
- OpenCV-powered operations
- Graphics manipulation
**Example Use:**
```bash
dexto --agent image-editor-agent "Detect all faces in this image and draw bounding boxes"
```
**Recommended LLM:** OpenAI GPT-5 Mini
**Demo:** [Face Detection Example](/examples/face-detection)
---
### Development & Coding
#### 👨‍💻 Coding Agent
**ID:** `coding-agent`
**Best For:** Software development, debugging, code review, refactoring
Expert software development assistant with comprehensive internal coding tools for building, debugging, and maintaining codebases.
**Key Features:**
- **Codebase Analysis** - Read and analyze code using glob and grep patterns
- **File Operations** - Write, edit, and organize code files
- **Command Execution** - Run shell commands for testing and building
- **Debugging** - Identify and fix bugs by examining errors and code structure
- **Refactoring** - Improve code following best practices
- **Testing** - Write and run unit tests
**Internal Tools:**
- `read_file`, `write_file`, `edit_file` - File operations
- `glob_files`, `grep_content` - Code search
- `bash_exec` - Shell command execution
- `ask_user` - Interactive clarification
**Starter Prompts Include:**
- 🔍 Analyze Codebase
- 🐛 Debug Error
- ♻️ Refactor Code
- 🧪 Write Tests
- ✨ Implement Feature
- ⚡ Optimize Performance
- 🚀 Setup Project
- 👀 Code Review
**Example Use:**
```bash
dexto --agent coding-agent "Analyze this codebase and suggest improvements"
dexto --agent coding-agent "Create a landing page for a coffee brand"
```
**Recommended LLM:** Anthropic Claude Haiku 4.5
**File Support:** 50+ programming languages and config formats
**Demo:** [Snake Game Development](/examples/snake-game)
---
#### 🔍 Explore Agent
**ID:** `explore-agent`
**Best For:** Codebase exploration, finding files, understanding architecture, researching code
Fast, read-only agent optimized for codebase exploration. Designed to be spawned by other agents for quick research tasks.
**Key Features:**
- **File Discovery** - Find files matching patterns using glob
- **Content Search** - Search for text/patterns within files using grep
- **Code Reading** - Read and analyze file contents
- **Architecture Understanding** - Map relationships between components
- **Fast Response** - Optimized for speed with the Haiku model
**Use Cases:**
- "What's in this folder?"
- "How does X work?"
- "Find where Y is handled"
- "Understand the architecture"
- "Explore the codebase"
**Available Tools:**
- `glob_files` - Find files matching patterns (e.g., `src/**/*.ts`)
- `grep_content` - Search for text/patterns within files
- `read_file` - Read file contents
**Example Use:**
```bash
dexto --agent explore-agent "How is authentication handled in this codebase?"
dexto --agent explore-agent "Find all API endpoints"
```
**Recommended LLM:** Anthropic Claude Haiku 4.5
**Performance Notes:**
- Read-only tools only (no write operations)
- Auto-approves all tool calls for speed
- Optimized for quick research tasks
- In-memory storage for ephemeral use
---
### Data & Analysis
#### 🗄️ Database Agent
**ID:** `database-agent`
**Best For:** SQL queries, database operations, data analysis
AI agent specialized in database operations and SQL query generation.
**Key Features:**
- SQL query generation
- Database schema analysis
- Data operations and transformations
- Query optimization suggestions
**Example Use:**
```bash
dexto --agent database-agent "Show me all users who signed up last month"
```
**Recommended LLM:** OpenAI GPT-5 Mini
**Tutorial:** [Database Agent Tutorial](../tutorials/cli/examples/database-agent.md)
---
#### 📄 Talk2PDF Agent
**ID:** `talk2pdf-agent`
**Best For:** PDF analysis, document conversation, content extraction
Conversational interface for analyzing and extracting information from PDF documents.
**Key Features:**
- PDF document analysis
- Natural language queries about content
- Information extraction
- Document summarization
**Example Use:**
```bash
dexto --agent talk2pdf-agent "Summarize the key findings in this research paper"
```
**Recommended LLM:** OpenAI GPT-5 Mini
**Tutorial:** [Talk2PDF Tutorial](../tutorials/cli/examples/talk2pdf-agent.md)
---
#### 📊 Product Analysis Agent
**ID:** `product-analysis-agent`
**Best For:** Product analytics, user behavior, feature flags, error tracking
AI agent for product analytics using PostHog MCP server.
**Key Features:**
- User growth and behavior analysis
- Feature flag management
- Error tracking and debugging
- Annotations for events
- Funnel and retention analysis
**Example Use:**
```bash
dexto --agent product-analysis-agent "Show me user growth trends over the past 30 days"
```
**Recommended LLM:** Anthropic Claude Sonnet 4.5
**Requires:** `POSTHOG_API_KEY`
---
### Automation & Integration
#### 🔧 Workflow Builder Agent
**ID:** `workflow-builder-agent`
**Best For:** n8n workflow automation, integrations, scheduled tasks
AI agent for building and managing n8n automation workflows.
**Key Features:**
- Create workflows from natural language
- Execution monitoring and debugging
- Credential guidance for service integrations
- Workflow templates (social media scheduler, etc.)
**Example Use:**
```bash
dexto --agent workflow-builder-agent "Build a social media scheduler that posts from Google Sheets"
```
**Recommended LLM:** OpenAI GPT-5 Mini
**Requires:** `N8N_MCP_URL`, `N8N_MCP_TOKEN`
---
### Collaboration & DevOps
#### 🐙 GitHub Agent
**ID:** `github-agent`
**Best For:** GitHub operations, PR analysis, repository management
Specialized agent for GitHub operations including pull request analysis, issue management, and repository insights.
**Key Features:**
- Analyze pull requests and issues
- Repository insights and statistics
- Code review assistance
- GitHub workflow automation
- Collaboration features via MCP
**Example Use:**
```bash
dexto --agent github-agent "Analyze the open pull requests in this repo"
```
**Recommended LLM:** OpenAI GPT-5 Mini
**Requires:** `GITHUB_TOKEN`
---
### Research & Branding
#### 🔍 Product Researcher
**ID:** `product-researcher`
**Best For:** Product naming, branding research, market analysis
AI agent specialized in product name research, branding strategies, and market positioning.
**Key Features:**
- Product name generation and evaluation
- Brand identity research
- Competitive analysis
- Market positioning insights
- Naming conventions and trends
**Example Use:**
```bash
dexto --agent product-researcher "Suggest names for a sustainable fashion startup"
```
**Recommended LLM:** Anthropic Claude Sonnet 4.5
**Tutorial:** [Product Name Scout Tutorial](../tutorials/cli/examples/product-name-scout-agent.md)
---
### Multi-Agent Systems
#### 🎯 Triage Agent
**ID:** `triage-agent`
**Best For:** Customer support routing, multi-agent coordination
Demonstration of a multi-agent customer support triage system that routes queries to specialized agents.
**System Architecture:**
- **Triage Agent** (Main) - Routes queries to specialized agents
- **Technical Support Agent** - Handles technical issues
- **Billing Agent** - Manages billing and payment queries
- **Product Info Agent** - Answers product-related questions
- **Escalation Agent** - Handles complex cases requiring human intervention
**Key Features:**
- Intelligent query routing
- Multi-agent coordination
- Specialized domain handling
- Escalation workflows
**Example Use:**
```bash
dexto --agent triage-agent "I need help with my billing"
```
**Recommended LLM:** OpenAI GPT-5
**Tutorial:** [Building Multi-Agent Systems](../tutorials/cli/examples/building-triage-system.md)
---
### Entertainment
#### 🎮 Gaming Agent
**ID:** `gaming-agent`
**Best For:** Playing GameBoy games like Pokemon through an emulator
AI agent that plays GameBoy games through a visual emulator with button controls and screen capture.
**Key Features:**
- **Visual Gameplay** - See and analyze the game screen in real-time
- **Button Controls** - D-pad, A, B, START, SELECT with configurable hold duration
- **ROM Management** - Load .gb and .gbc ROM files
- **Frame Control** - Wait for animations and control game timing
**Available Tools:**
- `press_up`, `press_down`, `press_left`, `press_right` - D-pad controls
- `press_a`, `press_b`, `press_start`, `press_select` - Button controls
- `load_rom` - Load a GameBoy ROM file
- `get_screen` - Capture current screen state
- `wait_frames` - Wait without input
- `list_roms` - List available ROMs
**Example Use:**
```bash
dexto --agent gaming-agent "Load Pokemon Red and start a new game"
```
**Recommended LLM:** Anthropic Claude Sonnet 4.5 (vision required)
**Note:** You must provide your own ROM files (.gb or .gbc format)
---
### General Purpose
#### ⚙️ Default Agent
**ID:** `default-agent`
**Best For:** General tasks, file operations, web automation
Default Dexto agent with filesystem and Playwright tools for general-purpose tasks.
**Key Features:**
- Filesystem operations
- Web browser automation via Playwright
- General conversation and assistance
- Task execution
**Example Use:**
```bash
dexto --agent default-agent
```
**Recommended LLM:** Any supported provider
**Comes pre-installed:** No (available in registry)
---
## Installation & Usage
### Installing Agents
```bash
# Install single agent
dexto install nano-banana-agent
# Install multiple agents
dexto install podcast-agent music-agent coding-agent
# Install with default LLM (skip preference injection)
dexto install nano-banana-agent --no-inject-preferences
# Install all agents
dexto install --all
```
### Using Installed Agents
```bash
# Use specific agent
dexto --agent coding-agent
# Auto-install and use (if not installed)
dexto -a podcast-agent "Generate a podcast intro"
```
### Managing Agents
```bash
# List installed agents
dexto list-agents --installed
# Find agent location
dexto which nano-banana-agent
# Uninstall agent
dexto uninstall music-agent
```
### Setting Default Agent
Edit `~/.dexto/preferences.yml`:
```yaml
defaults:
defaultAgent: coding-agent # Change to your preferred agent
```
## Agent Comparison
| Agent | Category | LLM Requirement | Special Requirements |
|-------|----------|----------------|---------------------|
| podcast-agent | Content | Google Gemini | GOOGLE_GENERATIVE_AI_API_KEY |
| music-agent | Content | Any | - |
| nano-banana-agent | Content | Google Gemini (Required) | GOOGLE_GENERATIVE_AI_API_KEY |
| sora-video-agent | Content | OpenAI GPT | OPENAI_API_KEY |
| image-editor-agent | Content | Any | - |
| database-agent | Data | Claude/GPT | - |
| talk2pdf-agent | Data | Claude/Gemini | - |
| product-analysis-agent | Data | Claude | POSTHOG_API_KEY |
| github-agent | DevOps | Claude/GPT | GITHUB_TOKEN |
| workflow-builder-agent | Automation | GPT | N8N_MCP_URL, N8N_MCP_TOKEN |
| product-researcher | Research | Claude/GPT | - |
| triage-agent | Multi-Agent | Claude/GPT | - |
| gaming-agent | Entertainment | Claude (Vision) | ROM files |
| coding-agent | Development | Any | Pre-installed |
| explore-agent | Development | Claude Haiku | - |
| default-agent | General | Any | - |
## Choosing the Right Agent
### For Content Creation
- **Images:** Use `nano-banana-agent` for fast, high-quality generation and editing
- **Videos:** Use `sora-video-agent` for AI-generated video content
- **Audio/Podcasts:** Use `podcast-agent` for multi-speaker content
- **Music:** Use `music-agent` for composition and sound design
### For Development
- **Coding:** Use `coding-agent` for comprehensive development assistance
- **Exploration:** Use `explore-agent` for fast codebase research and understanding
- **GitHub:** Use `github-agent` for repository management and PR analysis
- **Databases:** Use `database-agent` for SQL and data operations
### For Analysis & Research
- **Documents:** Use `talk2pdf-agent` for PDF analysis
- **Product Analytics:** Use `product-analysis-agent` for PostHog insights and user behavior
- **Branding:** Use `product-researcher` for naming and market research
### For Automation
- **Workflows:** Use `workflow-builder-agent` for n8n automation and integrations
### For Complex Systems
- **Multi-Agent:** Use `triage-agent` as a template for building agent coordination systems
### For Entertainment
- **Gaming:** Use `gaming-agent` to play GameBoy games like Pokemon
## API Key Requirements
Most agents require API keys for their LLM providers:
```bash
# Run setup to configure keys
dexto setup
# Or add to ~/.dexto/.env
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_GENERATIVE_AI_API_KEY=...
GITHUB_TOKEN=ghp_...
POSTHOG_API_KEY=phx_...
N8N_MCP_URL=https://your-instance.app.n8n.cloud/api/v1
N8N_MCP_TOKEN=...
```
## Contributing Your Own Agent
Built something useful? Share it with the community!
1. Create your agent following the [agent.yml configuration guide](./configuring-dexto/agent-yml.md)
2. Add documentation and examples
3. Submit to our [GitHub repository](https://github.com/truffle-ai/dexto)
4. See [Contributing Guide](https://github.com/truffle-ai/dexto/blob/main/CONTRIBUTING.md) for details
Pre-installed status is available for high-quality, well-documented agents that serve common use cases.
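As a starting point, a minimal agent configuration reuses the same `llm` block shown throughout this page. The sketch below is illustrative only — the `systemPrompt` field and its wording are assumptions here; consult the agent.yml configuration guide for the authoritative schema:

```yaml
# agent.yml (illustrative sketch — see the agent.yml guide for the real schema)
llm:
  provider: openai
  model: gpt-5-mini
  apiKey: $OPENAI_API_KEY   # always an environment variable reference
systemPrompt: |
  You are a helpful agent specialized in <your domain>.
```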
## Next Steps
- **Get Started:** Follow the [Install Your First Agent tutorial](../getting-started/install-first-agent-tutorial.mdx)
- **Install Guide:** Learn more about [installing custom agents](./installing-custom-agents)
- **Create Your Own:** Build custom agents with the [Configuration Guide](./configuring-dexto/agent-yml.md)
- **Examples:** Explore [examples and demos](/examples/intro)
- **Tutorials:** Deep-dive into agent-specific tutorials in the [tutorials section](../tutorials)

{
"label": "CLI Guide",
"position": 3,
"link": {
"type": "generated-index",
"title": "Dexto CLI Guide",
"description": "Complete guide to using the Dexto command-line interface, including all commands, options, and global preferences."
}
}

---
sidebar_position: 10
---
# Global Preferences
Configure system-wide settings for Dexto that apply across all agents and sessions.
:::tip Complete Reference
For complete field documentation, validation rules, and API specifications, see **[agent.yml → Global Preferences](../configuring-dexto/agent-yml.md#global-preferences)**.
:::
## Overview
Global preferences are stored in `~/.dexto/preferences.yml` and provide system-wide defaults for LLM configuration, default agents, and setup status.
**Key features:**
- System-wide LLM configuration (provider, model, API key)
- Default agent management
- Automatic preference injection into new agents
- Setup completion tracking
## Preferences Structure
```yaml
llm:
provider: anthropic
model: claude-sonnet-4-5-20250929
apiKey: $ANTHROPIC_API_KEY
defaults:
defaultAgent: coding-agent
defaultMode: web # web | cli | server | mcp
setup:
completed: true
```
## Configuration Sections
### LLM Section
Global AI provider configuration used as defaults for all agents:
```yaml
llm:
provider: openai # See supported providers below
model: gpt-5-mini # Valid model for provider
apiKey: $OPENAI_API_KEY # Environment variable reference (not required for vertex)
```
**Supported providers:**
- **Built-in:** `openai`, `anthropic`, `google`, `groq`, `xai`, `cohere`
- **Cloud platforms:** `vertex` (Google Cloud), `bedrock` (AWS)
- **Gateways:** `openrouter`, `litellm`, `glama`
- **Custom:** `openai-compatible`
**Required fields:**
- **provider** - LLM provider name
- **model** - Model identifier for the provider
- **apiKey** - Environment variable reference (`$VAR_NAME`) - not required for `vertex` or `bedrock`
**API key format:**
- Must start with `$`
- Uppercase letters, numbers, underscores only
- Pattern: `^\$[A-Z_][A-Z0-9_]*$`
Valid: `$OPENAI_API_KEY`, `$ANTHROPIC_API_KEY`
Invalid: `sk-proj-...`, `openai_key`, `$lowercase`
### Defaults Section
Default CLI behavior and mode selection:
```yaml
defaults:
defaultAgent: coding-agent # Agent to use when none specified
defaultMode: web # Run mode when --mode flag not specified
```
**Fields:**
- **defaultAgent** - Agent name to use when no `--agent` flag is provided
- **defaultMode** - Run mode when no `--mode` flag is provided (default: `web`)
- `cli` - Interactive terminal mode
- `web` - Web UI mode (default)
- `server` - API server mode
- `mcp` - MCP server mode
### Setup Section
Setup completion tracking:
```yaml
setup:
completed: true # Whether initial setup has run
```
## Setup Command
Create or update preferences:
### Interactive Setup
```bash
dexto setup
```
Guides you through provider selection, model choice, and API key configuration.
### Non-Interactive Setup
```bash
dexto setup --provider anthropic --model claude-sonnet-4-5-20250929
```
**Options:**
- `--provider` - AI provider
- `--model` - Model name
- `--default-agent` - Default agent name
- `--force` - Overwrite existing setup
- `--interactive` - Enable/disable interactive mode
## How Preferences Work
### Agent Resolution Flow
1. **Explicit agent specified** - Uses the specified agent
2. **Project context** - Looks for project-local agents
3. **Global CLI context** - Uses `defaults.defaultAgent`
4. **No preferences** - Prompts to run `dexto setup`
### Preference Injection
When installing agents, global preferences are automatically injected:
```bash
dexto install code-helper
# Agent receives your LLM provider, model, and API key
```
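For example, if your preferences use Anthropic, the installed agent's configuration ends up with an `llm` block mirroring them — a sketch, assuming the agent file uses the same `llm` structure documented above (real agent files contain additional fields):

```yaml
# Injected into the installed agent's configuration (illustrative)
llm:
  provider: anthropic
  model: claude-sonnet-4-5-20250929
  apiKey: $ANTHROPIC_API_KEY
```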
### Preference Precedence
Highest to lowest:
1. **CLI arguments** - Explicit overrides
2. **Agent configuration** - Agent's YAML file
3. **Global preferences** - `~/.dexto/preferences.yml`
## Common Configurations
### OpenAI
```yaml
llm:
provider: openai
model: gpt-5-mini
apiKey: $OPENAI_API_KEY
```
### Anthropic
```yaml
llm:
provider: anthropic
model: claude-sonnet-4-5-20250929
apiKey: $ANTHROPIC_API_KEY
```
### Google
```yaml
llm:
provider: google
model: gemini-2.0-flash
apiKey: $GOOGLE_GENERATIVE_AI_API_KEY
```
### Google Cloud Vertex AI
```yaml
llm:
provider: vertex
model: gemini-2.5-pro
# No apiKey needed - uses Application Default Credentials
```
Requires `GOOGLE_VERTEX_PROJECT` environment variable set to your GCP project ID.
### OpenRouter
```yaml
llm:
provider: openrouter
model: anthropic/claude-sonnet-4-5-20250929
apiKey: $OPENROUTER_API_KEY
```
### Amazon Bedrock
```yaml
llm:
provider: bedrock
model: anthropic.claude-sonnet-4-5-20250929-v1:0
# No apiKey needed - uses AWS credentials or Bedrock API key
```
Requires `AWS_REGION` plus either:
- `AWS_BEARER_TOKEN_BEDROCK` - Bedrock API key (simplest), or
- `AWS_ACCESS_KEY_ID` + `AWS_SECRET_ACCESS_KEY` - IAM credentials (for production)
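For example, a Bedrock setup using the simpler API-key path could be added to `~/.dexto/.env` — the values below are placeholders:

```bash
# ~/.dexto/.env (placeholder values)
AWS_REGION=us-east-1
AWS_BEARER_TOKEN_BEDROCK=your-bedrock-api-key
```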
## Best Practices
1. **Keep API keys in environment** - Never store literal keys
2. **Use consistent naming** - Follow provider conventions
3. **Run setup after changes** - Re-run when switching providers
4. **Verify after edits** - Run a command to validate changes
5. **Set reliable default agent** - For predictable CLI behavior
6. **Configure default mode** - Set `defaultMode: cli` if you prefer terminal interaction, or keep `defaultMode: web` for UI-first workflows
## File Location
Always at: `~/.dexto/preferences.yml`
## Updating Preferences
### Manual Editing
Edit `~/.dexto/preferences.yml` directly, ensuring:
- Valid YAML syntax
- API keys are environment variable references
- Provider and model are compatible
- All required fields present
### Re-run Setup
```bash
dexto setup
```
## See Also
- [agent.yml Reference → Global Preferences](../configuring-dexto/agent-yml.md#global-preferences) - Complete field documentation
- [Agent Configuration Guide](../configuring-dexto/agent-yml.md) - Agent-level settings
- [CLI Overview](./overview.md) - Complete CLI command reference

---
sidebar_position: 2
title: "Interactive Commands"
---
# Interactive Commands
Slash commands available during an active chat session.
:::tip CLI vs Interactive
This page covers **in-session slash commands** (like `/model`, `/mcp`). For terminal commands like `dexto install` or `dexto setup`, see [CLI Overview](./overview.md).
:::
Start an interactive session with `dexto` (Web UI) or `dexto --mode cli` (terminal).
## General
| Command | Aliases | Description |
|---------|---------|-------------|
| `/help` | `/h`, `/?` | Show all commands |
| `/exit` | `/quit`, `/q` | Exit the CLI |
| `/new` | | Start new conversation |
| `/clear` | `/reset` | Clear context (keeps session) |
| `/compact` | `/summarize` | Compress older messages |
| `/context` | `/ctx`, `/tokens` | Show token usage |
| `/copy` | `/cp` | Copy last response |
| `/shortcuts` | `/keys` | Show keyboard shortcuts |
| `!<cmd>` | `/shell` | Run shell command |
## Sessions
| Command | Description |
|---------|-------------|
| `/resume` | Browse and resume sessions |
| `/rename` | Rename current session |
| `/search <query>` | Search across sessions |
| `/session list` | List all sessions |
| `/session history` | Show current session history |
| `/session delete <id>` | Delete a session |
## Configuration
| Command | Description |
|---------|-------------|
| `/model` | Change LLM model |
| `/model list` | List available models |
| `/model current` | Show current model |
| `/config` | Show configuration |
| `/config reload` | Reload config from file |
| `/sysprompt` | Show system prompt |
| `/log [level]` | Set log level (debug/info/warn/error) |
| `/stats` | Show statistics |
## MCP & Tools
| Command | Description |
|---------|-------------|
| `/mcp` | List MCP servers |
| `/mcp list` | List connected servers |
| `/mcp add stdio <name> <cmd>` | Add stdio MCP server |
| `/mcp add http <name> <url>` | Add HTTP MCP server |
| `/mcp add sse <name> <url>` | Add SSE MCP server |
| `/mcp remove <name>` | Remove MCP server |
| `/tools` | Browse available tools |
| `/tools list` | List all tools |
| `/tools search <query>` | Search for tools |
## Prompts
| Command | Description |
|---------|-------------|
| `/prompts` | List all prompts |
| `/use <prompt> [args]` | Execute a prompt |
| `/<prompt-name>` | Execute prompt directly |
| `/docs` | Open documentation |
## Keyboard Shortcuts
| Shortcut | Action |
|----------|--------|
| `Ctrl+C` | Clear input / cancel |
| `Escape` | Close overlay |
| `↑` / `↓` | Navigate history |
| `Tab` | Autocomplete |
| `Shift+Enter` | Multi-line input |
## Examples
### Managing Sessions
```bash
# List all sessions
/session list
# Resume a previous session
/resume
# Search for something you discussed
/search "database migration"
# Rename current session
/rename my-project-refactor
```
### Working with Models
```bash
# See current model
/model current
# List available models
/model list
# Switch to a different model
/model switch gpt-5
```
### Using MCP Tools
```bash
# See all available tools
/tools list
# Search for specific tools
/tools search "file"
# List connected MCP servers
/mcp list
```
### Debugging
```bash
# Check token usage
/context
# View system prompt
/sysprompt
# Enable debug logging
/log debug
# Show system stats
/stats
```
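### Running Shell Commands
The `!<cmd>` prefix and its `/shell` alias run a shell command without leaving the chat:
```bash
# Both forms are equivalent
!git status
/shell git status
```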

---
sidebar_position: 1
title: "Overview"
---
# CLI Overview
The Dexto CLI provides two ways to interact with AI agents:
| Mode | What It Is | How to Use |
|------|------------|------------|
| **CLI Tool** | Terminal commands for managing agents, sessions, and configuration | `dexto install`, `dexto setup`, `dexto list-agents` |
| **Interactive Mode** | Chat session with slash commands for real-time control | `/model switch`, `/mcp add`, `/search` |
This guide covers the **CLI Tool** commands. For slash commands available during chat, see [Interactive Commands](./interactive-commands.md).
## What You Can Do
- Talk to any LLM in your terminal or browser
- Create long-lived AI agents with tools, knowledge, and memories
- Deploy agents locally or on the cloud
- Build custom integrations with Discord, Telegram, Slack, etc.
- Scaffold new AI applications with `dexto create-app`
---
## Main Command
### Basic Usage
```bash
# Start interactive session (opens Web UI by default)
dexto
# Start interactive CLI mode
dexto --mode cli
# Run a single prompt (auto-uses CLI mode)
dexto "list files here"
dexto -p "create a snake game"
# Start as API server
dexto --mode server
# Run as MCP server
dexto --mode mcp
```
:::tip Mode Auto-Detection
`dexto` opens the Web UI by default. When you provide a prompt via `-p` or as a positional argument, Dexto automatically switches to CLI mode for one-shot execution.
:::
### Main Command Options
| Flag | Description | Example |
|------|-------------|---------|
| `-v, --version` | Show version | `dexto --version` |
| `-a, --agent <id\|path>` | Use agent ID or path to config file | `dexto -a nano-banana-agent` |
| `-p, --prompt <text>` | Run single prompt and exit | `dexto -p "list files"` |
| `-m, --model <model>` | Specify LLM model | `dexto -m gpt-5-mini` |
| `-c, --continue` | Continue most recent conversation | `dexto -c` |
| `-r, --resume <sessionId>` | Resume a specific session by ID | `dexto --resume my-session` |
| `--mode <mode>` | Run mode (web/cli/server/mcp, default: web) | `dexto --mode cli` |
| `--port <port>` | Server port (default: 3000 for web, 3001 for server mode) | `dexto --port 8080` |
| `--skip-setup` | Skip initial setup prompts | `dexto --skip-setup` |
| `-s, --strict` | Require all MCP servers to connect | `dexto --strict` |
| `--no-verbose` | Disable verbose output | `dexto --no-verbose` |
| `--no-interactive` | Disable prompts/setup | `dexto --no-interactive` |
| `--no-auto-install` | Disable auto agent install | `dexto --no-auto-install` |
| `--auto-approve` | Auto-approve all tool executions | `dexto --auto-approve` |
**Note:** The `-a, --agent` flag accepts both agent IDs from the registry and paths to agent config files. See the [Agent Registry](/docs/guides/agent-registry) for available agents.
## Subcommands
### `create-app` - Scaffold New TypeScript App
Create a new Dexto TypeScript application from scratch.
```bash
dexto create-app
```
This command will:
1. Create project structure
2. Set up TypeScript configuration
3. Prompt for LLM provider and API keys
4. Install dependencies
5. Generate example files
### `init-app` - Initialize Existing TypeScript App
Add Dexto to an existing TypeScript project.
```bash
dexto init-app
```
**Requirements:**
- Must have `package.json` and `tsconfig.json` in current directory
### `setup` - Configure Global Preferences
Configure global Dexto preferences including default LLM provider, model, and agent.
```bash
dexto setup
dexto setup --provider openai --model gpt-5-mini
dexto setup --force
```
**Options:**
- `--provider <provider>` - LLM provider (openai, anthropic, google, groq, xai, cohere)
- `--model <model>` - Model name (uses provider default if not specified)
- `--default-agent <agent>` - Default agent name (default: coding-agent)
- `--force` - Overwrite existing setup without confirmation
- `--no-interactive` - Skip interactive prompts
See [Global Preferences](./global-preferences) for detailed configuration guide.
### `install` - Install Agents
Install agents from the registry or custom YAML files/directories.
```bash
# Install single agent from registry
dexto install nano-banana-agent
# Install multiple agents
dexto install podcast-agent coding-agent database-agent
# Install all available agents
dexto install --all
# Install custom agent from file
dexto install ./my-agent.yml
# Install from directory (interactive)
dexto install ./my-agent-dir/
```
**Options:**
- `--all` - Install all available agents from registry
- `--force` - Force reinstall even if agent is already installed
- `--no-inject-preferences` - Skip injecting global preferences into installed agents
See the [Agent Registry](/docs/guides/agent-registry) for available agents.
### `uninstall` - Uninstall Agents
Remove agents from your local installation.
```bash
# Uninstall single agent
dexto uninstall nano-banana-agent
# Uninstall multiple agents
dexto uninstall agent1 agent2
# Uninstall all agents
dexto uninstall --all
```
**Options:**
- `--all` - Uninstall all installed agents
- `--force` - Force uninstall even if agent is protected (e.g., coding-agent)
### `sync-agents` - Sync Agent Configs
Sync installed agents with bundled versions after Dexto updates.
```bash
# Check status and prompt for updates
dexto sync-agents
# List what would change (dry run)
dexto sync-agents --list
# Force update all without prompts
dexto sync-agents --force
```
**Options:**
- `--list` - Show status without making changes
- `--force` - Update all agents without confirmation
**When to use:** When Dexto shows "Agent updates available" notification after an update, or when you want to reset agents to their default configurations.
### `list-agents` - List Available Agents
List agents from the registry and locally installed agents.
```bash
# List all agents (registry + installed)
dexto list-agents
# Show only installed agents
dexto list-agents --installed
# Show only registry agents
dexto list-agents --available
# Show detailed information
dexto list-agents --verbose
```
**Options:**
- `--verbose` - Show detailed agent information
- `--installed` - Show only installed agents
- `--available` - Show only available agents from registry
See the [Agent Registry](/docs/guides/agent-registry) for detailed agent information.
### `which` - Show Agent Path
Display the path to a specific agent's configuration file.
```bash
dexto which nano-banana-agent
dexto which coding-agent
```
### `session` - Manage Sessions
Manage conversation sessions.
#### `session list`
List all available sessions.
```bash
dexto session list
```
#### `session history`
Show message history for a session.
```bash
# Show history for current session
dexto session history
# Show history for specific session
dexto session history my-session-id
```
#### `session delete`
Delete a specific session.
```bash
dexto session delete old-session-id
```
### `search` - Search Session History
Search across all conversation messages in session history.
```bash
# Search all sessions
dexto search "bug fix"
# Search in specific session
dexto search "error" --session my-session
# Filter by role
dexto search "help" --role assistant
# Limit results
dexto search "code" --limit 20
```
**Options:**
- `--session <sessionId>` - Search in specific session only
- `--role <role>` - Filter by role (user, assistant, system, tool)
- `--limit <number>` - Limit number of results (default: 10)
### `mcp` - MCP Server Mode
Start Dexto as an MCP server to aggregate and re-expose tools from configured MCP servers.
```bash
# Start MCP tool aggregation server
dexto mcp --group-servers
# Start in strict mode
dexto mcp --group-servers --strict
```
**Options:**
- `--group-servers` - Aggregate and re-expose tools from configured MCP servers
- `-s, --strict` - Require all MCP server connections to succeed
- `--name <name>` - MCP server name (default: 'dexto-tools')
- `--version <version>` - MCP server version (default: '1.0.0')
**Note:** In the future, `dexto --mode mcp` will be moved to this subcommand to expose the agent as an MCP server by default.
---
## Common Usage Patterns
### Quick Start
```bash
# Interactive session with default settings (opens Web UI)
dexto
# Interactive CLI mode
dexto --mode cli
# Use a specific agent (opens Web UI)
dexto --agent nano-banana-agent
# Start with a specific model (opens Web UI)
dexto -m claude-sonnet-4-5-20250929
```
### One-Shot Prompts
```bash
# Run single task and exit (auto-uses CLI mode)
dexto "list all TypeScript files in src/"
dexto -p "create a README for this project"
# With auto-approve for automation
dexto --auto-approve "format all JavaScript files"
# With specific agent
dexto --agent coding-agent "create a landing page for my coffee shop"
# Combine agent, model, and auto-approve
dexto --agent coding-agent -m gpt-5 --auto-approve "build a todo app with React"
```
### Session Continuation
```bash
# Continue most recent conversation (opens Web UI)
dexto --continue
# Continue in CLI mode
dexto --continue --mode cli
# Continue with a one-shot prompt, then exit
dexto -c -p "now add error handling"
# Resume specific session (opens Web UI)
# Get the session ID from the Web UI or `dexto session list`
dexto --resume my-project-session
# Resume session in CLI mode
dexto --resume my-project-session --mode cli
# Resume and run a one-shot prompt
dexto -r my-project-session "fix the bug we discussed"
```
### Agent Management
```bash
# Install agents for specific use cases
dexto install podcast-agent music-agent coding-agent
# Install all available agents
dexto install --all
# List what's installed
dexto list-agents --installed
# Find agent config location
dexto which coding-agent
# Use custom agent file
dexto --agent ./agents/my-custom-agent.yml
```
### Web UI
```bash
# Launch on default port (3000)
dexto
# Custom port
dexto --port 8080
# With specific agent
dexto --agent database-agent
```
### API Server
```bash
# Start REST + SSE streaming server (default port 3001)
dexto --mode server
# With custom port
dexto --mode server --port 8080
# With specific agent and strict mode
dexto --mode server --agent my-agent --strict
# For production with custom agent
dexto --mode server --agent ./production-agent.yml --port 3001
```
### Content Generation
```bash
# Generate podcast content
dexto --agent podcast-agent "create a 5-minute podcast about space exploration"
# Generate images
dexto --agent nano-banana-agent "create a futuristic cityscape"
# Create code with specific instructions
dexto --agent coding-agent "build a REST API with Express and TypeScript"
# Interactive mode for complex tasks
dexto --agent coding-agent
# Then in the UI: "Let's build a full-stack app step by step"
```
### Automation & CI/CD
```bash
# Automated code review (no confirmation prompts)
dexto --auto-approve "review all files in src/ and suggest improvements"
# Generate documentation
dexto --auto-approve "create API documentation from the code in src/api/"
# Run tests and analyze results
dexto "run the test suite and explain any failures"
# Git commit message generation
git diff | dexto -p "generate a conventional commit message for these changes"
```
### Multi-Agent Workflows
```bash
# Start researcher agent as MCP server (Terminal 1)
dexto --mode mcp --port 4000 --agent researcher-agent
# Start coordinator agent that uses researcher (Terminal 2)
dexto --agent coordinator-agent --port 5000
```
### Search & History
```bash
# Search all conversations
dexto search "database schema"
# Search in specific session
dexto search "bug fix" --session my-session-id
# Filter by role
dexto search "error" --role assistant
# View session history
dexto session history my-session-id
```
## Next Steps
- **[Interactive Commands](./interactive-commands)** - Slash commands for chat sessions
- **[Agent Registry](/docs/guides/agent-registry)** - Browse available agents
- **[Agent Configuration](/docs/guides/configuring-dexto/overview)** - Create custom agents
- **[MCP Integration](/docs/mcp/overview)** - Connect external tools and services
- **[Global Preferences](./global-preferences)** - Configure system-wide defaults

{
"label": "Agent Configuration Guide",
"position": 6,
"link": {
"type": "generated-index",
"title": "Configuring Agents in Dexto",
"description": "Configure agents in Dexto using YML to suit your needs."
}
}

---
sidebar_position: 12
sidebar_label: "Agent Card (A2A)"
---
# Agent Card Configuration
Configure your agent's public metadata for Agent-to-Agent (A2A) communication and service discovery.
:::tip Complete Reference
For complete field documentation and A2A specifications, see **[agent.yml → Agent Card](./agent-yml.md#agent-identity--a2a)**.
:::
## Overview
The agent card provides standardized metadata about your agent's capabilities, enabling other agents and services to discover and interact with your agent programmatically through the Agent-to-Agent (A2A) protocol.
**Key benefits:**
- Service discovery by other agents
- Protocol negotiation (input/output formats)
- Capability matching for task delegation
- Standardized authentication setup
Learn more: [A2A: A new era of agent interoperability](https://developers.googleblog.com/en/a2a-a-new-era-of-agent-interoperability/)
## Configuration
```yaml
agentCard:
name: "My Dexto Agent"
description: "A helpful AI assistant with specialized capabilities"
url: "https://my-agent.example.com"
version: "1.0.0"
documentationUrl: "https://docs.example.com/my-agent"
provider:
organization: "My Company"
url: "https://mycompany.com"
capabilities:
streaming: true
pushNotifications: false
stateTransitionHistory: false
authentication:
schemes: ["bearer", "apiKey"]
credentials: "optional"
defaultInputModes: ["application/json", "text/plain"]
defaultOutputModes: ["application/json", "text/plain"]
skills:
- id: "data_analysis"
name: "Data Analysis"
description: "Analyze and visualize data from various sources"
tags: ["analytics", "data", "visualization"]
examples: ["Analyze sales data", "Create charts from CSV"]
```
## Required Fields
- **name** - Display name for your agent
- **url** - Public endpoint where your agent can be accessed
- **version** - Version identifier (semantic versioning recommended)
## Optional Fields
- **description** - Brief capability description
- **documentationUrl** - Link to documentation
- **provider** - Organization information (organization, url)
- **capabilities** - Technical capabilities (streaming, pushNotifications, stateTransitionHistory)
- **authentication** - Supported auth methods (schemes, credentials)
- **defaultInputModes** - Accepted content types
- **defaultOutputModes** - Produced content types
- **skills** - Specific agent capabilities with examples
## Examples
### Basic Agent Card
```yaml
agentCard:
name: "Support Bot"
description: "Customer support assistant"
url: "https://support.mycompany.com/agent"
version: "2.1.0"
```
### Full-Featured Agent Card
```yaml
agentCard:
name: "Analytics Assistant"
description: "Advanced data analysis and visualization agent"
url: "https://analytics.mycompany.com"
version: "3.0.0"
documentationUrl: "https://docs.mycompany.com/analytics-agent"
provider:
organization: "Data Insights Corp"
url: "https://datainsights.com"
capabilities:
streaming: true
pushNotifications: true
stateTransitionHistory: true
authentication:
schemes: ["bearer", "oauth2"]
credentials: "required"
defaultInputModes: ["application/json", "text/csv"]
defaultOutputModes: ["application/json", "image/png", "text/html"]
skills:
- id: "csv_analysis"
name: "CSV Analysis"
description: "Parse and analyze CSV data files"
tags: ["data", "csv", "analysis"]
examples: ["Analyze sales data CSV", "Generate summary statistics"]
- id: "chart_generation"
name: "Chart Generation"
description: "Create visualizations from data"
tags: ["visualization", "charts"]
examples: ["Create bar chart", "Generate trend analysis"]
```
## Skill Configuration
Skills describe specific capabilities:
```yaml
skills:
- id: "unique_skill_id"
name: "Human-readable name"
description: "What this skill does"
tags: ["category", "keywords"]
examples: ["Example 1", "Example 2"]
inputModes: ["text/plain"] # Optional
outputModes: ["application/json"] # Optional
```
## A2A Communication
The agent card enables:
- **Service Discovery** - Other agents find your capabilities
- **Protocol Negotiation** - Compatible format selection
- **Capability Matching** - Task delegation decisions
- **Authentication** - Secure agent-to-agent setup
## Default Behavior
If no agent card is specified, Dexto generates basic metadata from your configuration. For A2A communication, explicit configuration is recommended.
## See Also
- [agent.yml Reference → Agent Card](./agent-yml.md#agent-identity--a2a) - Complete field documentation
- [A2A Documentation](https://developers.googleblog.com/en/a2a-a-new-era-of-agent-interoperability/) - Official A2A specification

---
sidebar_position: 13
sidebar_label: "Dynamic Changes"
---
# Runtime / Dynamic Configuration Changes
Configure and manage runtime changes to agent state through the AgentStateManager.
:::tip Complete Reference
For complete API documentation and event specifications, see **[agent.yml → Dynamic Changes](./agent-yml.md#dynamic-changes)**.
:::
## Overview
`AgentStateManager` allows safe, validated modifications to the running configuration without restarting your agent.
## Example: per-session LLM override
```typescript
stateManager.updateLLM(
{ provider: 'openai', model: 'gpt-5', maxInputTokens: 50_000 },
'user-123'
);
```
Internally the manager:
1. Validates the patch against `LLMConfigSchema`.
2. Stores the override under `sessionOverrides`.
3. Emits `state:changed` and `session:override-set` events.
## Example: add MCP server at runtime
```typescript
await agent.addMcpServer('git', {
command: 'mcp-git',
args: ['--repo', process.cwd()]
});
```
This triggers `mcp:server-added`, after which `MCPManager` connects and refreshes its capability cache.
## See Also
- [agent.yml Reference → Dynamic Changes](./agent-yml.md#dynamic-changes) - Complete API documentation
- [System Prompt Configuration](./systemPrompt.md) - Static configuration
- [MCP Configuration](./mcpConfiguration.md) - MCP server setup

---
sidebar_position: 12
---
# Internal Resources
## What are Internal Resources?
Internal resources let your Dexto agent expose local files and blob storage directly to the LLM as context. Unlike MCP resources (from external servers), these are managed by your agent.
:::tip Quick Reference
For complete field documentation, see **[agent.yml → Internal Resources](./agent-yml#internal-resources)**.
:::
## Resource Types
### Filesystem Resources
Expose local files and directories to your agent:
```yaml
internalResources:
- type: filesystem
paths: ["./docs", "./src"]
maxDepth: 3
maxFiles: 1000
includeHidden: false
includeExtensions: [".md", ".ts", ".js", ".json"]
```
**Key options:**
- `paths` - Directories/files to expose (required)
- `maxDepth` - How deep to traverse directories (default: 3)
- `maxFiles` - Maximum files to index (default: 1000, max: 10000)
- `includeExtensions` - File types to include (default: common text files)
### Blob Resources
Expose blob storage (images, documents, generated files):
```yaml
storage:
blob:
type: local
maxBlobSize: 52428800 # 50MB
internalResources:
- type: blob
```
Blob storage settings live in the `storage.blob` section; the `internalResources` entry simply enables LLM access to stored blobs.
## Using Resources
### In Web UI
Type `@` to auto-complete and reference resources:
```
@file:///project/README.md summarize this
```
### Via SDK
```typescript
// List all resources (internal + MCP)
const resources = await agent.resourceManager.list();
// Read a resource
const content = await agent.resourceManager.read('file:///path/to/file');
```
## Configuration Patterns
**Documentation bot:**
```yaml
internalResources:
- type: filesystem
paths: ["./documentation"]
maxDepth: 5
includeExtensions: [".md", ".mdx"]
```
**Project context:**
```yaml
internalResources:
- type: filesystem
paths: ["./src", "./tests", "./README.md"]
includeExtensions: [".ts", ".tsx", ".js", ".json", ".md"]
- type: blob
```
**Config files only:**
```yaml
internalResources:
- type: filesystem
paths: ["."]
maxDepth: 2
includeExtensions: [".json", ".yaml", ".yml", ".toml"]
```
## Best Practices
1. **Be selective** - Only expose necessary directories
2. **Set reasonable limits** - Use `maxDepth` and `maxFiles` to prevent performance issues
3. **Filter extensions** - Include only relevant file types
4. **Exclude secrets** - Never expose `.env`, credentials, private keys
5. **Use path variables** - `${{dexto.agent_dir}}` for portable configs
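For example, the documentation-bot pattern can be made portable with the agent-directory variable (a sketch; the path assumes the docs ship next to the agent config):
```yaml
internalResources:
  - type: filesystem
    paths: ["${{dexto.agent_dir}}/docs"]
    maxDepth: 5
    includeExtensions: [".md", ".mdx"]
```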
## See Also
- [agent.yml → Internal Resources](./agent-yml#internal-resources) - Complete field reference
- [Storage Configuration](./storage) - Blob storage backend settings
- [MCP Resources](../../mcp/resources) - External MCP server resources

---
sidebar_position: 11
---
# Internal Tools
Configure built-in Dexto capabilities that provide core agent functionality like file operations, code search, and command execution.
:::tip Complete Reference
For complete tool specifications, parameters, and usage details, see **[agent.yml → Internal Tools](./agent-yml.md#internal-tools)**.
:::
## Overview
Internal tools are built directly into Dexto core, providing essential capabilities for agents to interact with the local filesystem, execute commands, and collect user input.
**Key characteristics:**
- Built into Dexto (no external dependencies)
- Can be enabled/disabled per agent
- Subject to tool confirmation policies
- Optimized for common agent workflows
## Available Tools
| Tool | Purpose | Safety |
|------|---------|--------|
| **ask_user** | Collect structured user input | Safe |
| **read_file** | Read file contents | Read-only |
| **write_file** | Create/overwrite files | Requires approval |
| **edit_file** | Precise file edits | Requires approval |
| **glob_files** | Find files by pattern | Read-only |
| **grep_content** | Search code with regex | Read-only |
| **bash_exec** | Execute shell commands | Dangerous |
| **bash_output** | Monitor background processes | Safe |
| **kill_process** | Terminate processes | Safe |
## Configuration
Enable tools by listing them:
```yaml
internalTools:
- ask_user
- read_file
- write_file
- edit_file
- glob_files
- grep_content
- bash_exec
```
**Disable all tools:**
```yaml
internalTools: []
```
## Tool Categories
### File Reading (Read-only)
**read_file** - Read file contents with pagination:
```yaml
internalTools:
- read_file
```
**glob_files** - Find files by pattern:
```yaml
internalTools:
- glob_files
```
Common patterns: `**/*.ts`, `**/config.{yml,yaml,json}`
**grep_content** - Search code with regex:
```yaml
internalTools:
- grep_content
```
Example: Find function definitions, imports, class usage
### File Writing (Requires Approval)
**write_file** - Create/overwrite files:
```yaml
internalTools:
- write_file
```
**edit_file** - Make precise changes:
```yaml
internalTools:
- read_file # Often used together
- edit_file
```
### Command Execution (Dangerous)
**bash_exec** - Execute shell commands:
```yaml
internalTools:
- bash_exec
```
**bash_output** - Monitor background processes:
```yaml
internalTools:
- bash_exec
- bash_output
```
**kill_process** - Terminate processes:
```yaml
internalTools:
- bash_exec
- bash_output
- kill_process
```
### User Interaction (Safe)
**ask_user** - Collect structured input:
```yaml
internalTools:
- ask_user
```
## Common Tool Combinations
### Read-Only Analysis Agent
```yaml
internalTools:
- read_file
- glob_files
- grep_content
- ask_user
toolConfirmation:
mode: auto-approve # Safe since all read-only
```
### Coding Agent
```yaml
internalTools:
- read_file
- write_file
- edit_file
- glob_files
- grep_content
- bash_exec
- ask_user
toolConfirmation:
mode: manual
toolPolicies:
alwaysAllow:
- internal--read_file
- internal--glob_files
- internal--grep_content
- internal--ask_user
```
### DevOps Agent
```yaml
internalTools:
- read_file
- write_file
- bash_exec
- bash_output
- kill_process
toolConfirmation:
mode: manual
toolPolicies:
alwaysAllow:
- internal--read_file
- internal--bash_output
- internal--kill_process
alwaysDeny:
- internal--bash_exec--rm -rf*
```
## Tool Confirmation Policies
Configure which tools require approval:
```yaml
toolConfirmation:
mode: manual
toolPolicies:
# Safe, read-only operations
alwaysAllow:
- internal--read_file
- internal--glob_files
- internal--grep_content
- internal--ask_user
# Explicitly deny dangerous operations
alwaysDeny:
- internal--bash_exec--rm -rf*
- internal--bash_exec--sudo*
```
## Best Practices
1. **Enable only what you need** - Don't enable all tools unnecessarily
2. **Pair tools with instructions** - Guide agents in system prompt
3. **Use safe defaults** - Auto-approve read-only, require confirmation for writes
4. **Provide usage examples** - Include patterns in system prompt
**Example system prompt:**
```yaml
systemPrompt: |
## Tool Usage Guidelines
Finding Files:
- Use glob_files with "**/*.ts" for TypeScript files
- Use grep_content to search for patterns
Editing Files:
- ALWAYS read_file first to see current content
- Use edit_file with unique old_string for precision
Running Commands:
- Use bash_exec for tests: "npm test"
- Never use destructive commands without approval
```
## Use Cases
| Agent Type | Recommended Tools |
|-----------|------------------|
| **Code Analyst** | read_file, glob_files, grep_content, ask_user |
| **Developer** | read_file, write_file, edit_file, glob_files, grep_content, bash_exec |
| **DevOps** | read_file, write_file, bash_exec, bash_output, kill_process |
| **Documentation** | read_file, write_file, glob_files |
## See Also
- [agent.yml Reference → Internal Tools](./agent-yml.md#internal-tools) - Complete tool documentation
- [Tool Confirmation](./toolConfirmation.md) - Configure approval policies
- [System Prompt](./systemPrompt.md) - Guide agents on tool usage

---
sidebar_position: 3
sidebar_label: "LLM Configuration"
---
# LLM Configuration
Configure the language model provider and settings for your Dexto agent.
:::tip Complete Reference
For supported providers and models, see **[Supported LLM Providers](../supported-llm-providers.md)**.
For complete field documentation, see **[agent.yml → LLM Configuration](./agent-yml.md#llm-configuration)**.
:::
:::info Interactive Model Switching
Prefer not to edit YAML? Switch models interactively during a session:
- **CLI**: Type `/model` to open the model picker
- **WebUI**: Click the model name in the header
Custom models can also be added through the interactive wizard.
:::
## Overview
Large Language Models (LLMs) are the brain of your Dexto agents. Dexto supports multiple LLM providers out-of-the-box, including OpenAI, Anthropic, Google, and other OpenAI SDK-compatible providers.
## Basic Configuration
### Minimal Example
```yaml
llm:
provider: openai
model: gpt-5-mini
apiKey: $OPENAI_API_KEY
```
### Common Providers
**OpenAI:**
```yaml
llm:
provider: openai
model: gpt-5-mini
apiKey: $OPENAI_API_KEY
```
**Anthropic:**
```yaml
llm:
provider: anthropic
model: claude-sonnet-4-5-20250929
apiKey: $ANTHROPIC_API_KEY
```
**Google:**
```yaml
llm:
provider: google
model: gemini-2.5-pro
apiKey: $GOOGLE_GENERATIVE_AI_API_KEY
```
## Configuration Options
### Required Fields
- **provider** - LLM provider name
- Built-in: `openai`, `anthropic`, `google`, `xai`, `groq`, `cohere`
- Cloud platforms: `vertex` (Google Cloud), `bedrock` (AWS)
- Gateways: `openrouter`, `litellm`, `glama`
- Custom: `openai-compatible`
- **model** - Model identifier for the provider
- **apiKey** - API key or environment variable (not required for `vertex` or `bedrock`)
### Optional Fields
- **baseURL** - Custom API endpoint for OpenAI-compatible providers
- **maxInputTokens** - Maximum tokens for input context (when crossed, messages are compressed)
- **maxOutputTokens** - Maximum tokens for AI response generation
- **temperature** - Controls randomness (0 = deterministic, 1 = very creative)
- **maxIterations** - Maximum tool execution iterations (default: 50)
## Advanced Configuration
### Custom Providers
Use OpenAI-compatible providers:
```yaml
llm:
provider: openai-compatible
model: your-custom-model
apiKey: $YOUR_API_KEY
baseURL: https://api.your-provider.com/v1
maxInputTokens: 100000
```
### Local Models
Run models locally using Ollama:
```yaml
llm:
provider: openai-compatible
model: llama3.2
apiKey: dummy
baseURL: http://localhost:11434/v1
maxInputTokens: 8000
```
### Gateway Providers
**OpenRouter:**
```yaml
llm:
provider: openrouter
model: anthropic/claude-sonnet-4-5-20250929
apiKey: $OPENROUTER_API_KEY
```
**Glama:**
```yaml
llm:
provider: glama
model: openai/gpt-4o
apiKey: $GLAMA_API_KEY
```
**LiteLLM (self-hosted proxy):**
```yaml
llm:
provider: litellm
model: gpt-4
apiKey: $LITELLM_API_KEY
baseURL: http://localhost:4000
```
### Google Cloud Vertex AI
Access Gemini and Claude models through GCP:
```yaml
llm:
provider: vertex
model: gemini-2.5-pro
```
Vertex uses Application Default Credentials (ADC), not API keys. Set these environment variables:
- `GOOGLE_VERTEX_PROJECT` - Your GCP project ID (required)
- `GOOGLE_VERTEX_LOCATION` - Region (optional, defaults to us-central1)
- `GOOGLE_APPLICATION_CREDENTIALS` - Path to service account JSON (for production)
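For local development with ADC, the setup might look like this (project ID and region below are placeholders, not real values):
```shell
# Authenticate once with: gcloud auth application-default login
# Then point Dexto's Vertex provider at your project (placeholder values).
export GOOGLE_VERTEX_PROJECT=my-gcp-project
export GOOGLE_VERTEX_LOCATION=us-central1
```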
### Amazon Bedrock
Access Claude, Nova, Llama, and Mistral models through AWS:
```yaml
llm:
provider: bedrock
model: anthropic.claude-sonnet-4-5-20250929-v1:0
```
**Authentication options:**
**Option 1: API Key (simplest)**
- `AWS_REGION` - AWS region (required, e.g., us-east-1)
- `AWS_BEARER_TOKEN_BEDROCK` - Bedrock API key ([generate here](https://console.aws.amazon.com/bedrock/home#/api-keys))
**Option 2: IAM Credentials (for production)**
- `AWS_REGION` - AWS region (required, e.g., us-east-1)
- `AWS_ACCESS_KEY_ID` - Your AWS access key
- `AWS_SECRET_ACCESS_KEY` - Your AWS secret key
- `AWS_SESSION_TOKEN` - Session token (optional, for temporary credentials)
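With the API-key option, the environment setup amounts to something like this (both values are placeholders):
```shell
# Placeholders - substitute your own region and Bedrock API key.
export AWS_REGION=us-east-1
export AWS_BEARER_TOKEN_BEDROCK=your-bedrock-api-key
```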
### Token Control
```yaml
llm:
provider: openai
model: gpt-5-mini
apiKey: $OPENAI_API_KEY
maxInputTokens: 100000 # Compress history when exceeding
maxOutputTokens: 4000 # Limit response length
temperature: 0.7
```
## Environment Variables
Set API keys in your `~/.dexto/.env` file:
```bash
# Built-in providers
OPENAI_API_KEY=your_openai_key
ANTHROPIC_API_KEY=your_anthropic_key
GOOGLE_GENERATIVE_AI_API_KEY=your_google_key
GROQ_API_KEY=your_groq_key
XAI_API_KEY=your_xai_key
COHERE_API_KEY=your_cohere_key

# Gateway providers
OPENROUTER_API_KEY=your_openrouter_key
GLAMA_API_KEY=your_glama_key
LITELLM_API_KEY=your_litellm_key
```
## Best Practices
1. **Use environment variables** - Store API keys as `$VAR` references
2. **Set appropriate token limits** - Control context and response length
3. **Test locally first** - Use local models for development before production

---
sidebar_position: 5
---
# MCP Configuration
Configure Model Context Protocol (MCP) servers to extend your agent's capabilities by connecting to external tools, services, and APIs.
:::tip Complete Reference
For complete field documentation, transport specifications, and troubleshooting, see **[agent.yml → MCP Servers](./agent-yml.md#mcp-servers)**.
:::
## Overview
MCP servers provide tools and resources that your agents can discover and use at runtime. Unlike internal tools which are built into Dexto, MCP servers are external processes or services that communicate using the standardized Model Context Protocol.
**Key characteristics:**
- Pluggable architecture - Add/remove servers dynamically
- Multiple transport types - stdio, HTTP, and SSE
- Environment variable support - Secure configuration
- Connection modes - Strict vs lenient error handling
- Tool aggregation - Multiple servers' tools available simultaneously
## Server Types
| Transport | Use Case | Protocol |
|-----------|----------|----------|
| **stdio** | Local processes, file operations, system tools | stdin/stdout |
| **http** | Remote APIs, cloud services (recommended) | HTTP/REST |
| **sse** | Legacy streaming integrations (deprecated) | Server-Sent Events |
## Quick Examples
### Example 1 - Local Filesystem Access
```yaml
mcpServers:
filesystem:
type: stdio
command: npx
args: ["-y", "@modelcontextprotocol/server-filesystem", "."]
timeout: 30000
connectionMode: lenient
```
### Example 2 - Remote HTTP Service
```yaml
mcpServers:
api-service:
type: http
url: https://api.example.com/mcp
headers:
Authorization: Bearer $API_TOKEN
timeout: 45000
connectionMode: strict
```
### Example 3 - Multiple Servers
```yaml
mcpServers:
# Local filesystem
filesystem:
type: stdio
command: npx
args: ["-y", "@modelcontextprotocol/server-filesystem", "."]
connectionMode: strict
# Browser automation
playwright:
type: stdio
command: npx
args: ["-y", "@playwright/mcp@latest"]
connectionMode: lenient
# Remote service
analytics:
type: http
url: $ANALYTICS_URL
headers:
Authorization: Bearer $ANALYTICS_TOKEN
connectionMode: lenient
```
## Transport Types
### stdio - Local Process Servers
Execute local programs that communicate via stdin/stdout.
**Use when:**
- Running local tools (filesystem, git, database)
- Development and testing
- System command execution
- Fast, no-network operations
**Configuration:**
```yaml
mcpServers:
database:
type: stdio
command: npx
args: ["-y", "@truffle-ai/database-server"]
env:
DATABASE_URL: $DATABASE_URL
timeout: 30000
connectionMode: lenient
```
**Common stdio servers:**
- `@modelcontextprotocol/server-filesystem` - File operations
- `@playwright/mcp` - Browser automation
- `@modelcontextprotocol/server-github` - GitHub integration
### http - HTTP Servers (Recommended)
Connect to standard HTTP/REST servers.
**Use when:**
- Integrating cloud services
- Production deployments
- Third-party APIs
- Reliable remote communication
**Configuration:**
```yaml
mcpServers:
external-api:
type: http
url: https://api.external-service.com/mcp
headers:
Authorization: Bearer $EXTERNAL_API_TOKEN
X-Client-Version: "1.0"
timeout: 45000
connectionMode: strict
```
### sse - Server-Sent Events (Deprecated)
Legacy streaming integration. Consider HTTP for new projects.
## Connection Modes
### lenient (Default)
**Behavior:**
- Logs warning if server fails to connect
- Continues agent initialization
- Agent remains functional without this server
**Use when:**
- Server provides optional enhancements
- Development/testing environments
- Server may be temporarily unavailable
```yaml
mcpServers:
optional-analytics:
type: http
url: https://analytics.example.com/mcp
connectionMode: lenient
```
### strict
**Behavior:**
- Throws error if server fails to connect
- Stops agent initialization
- Ensures server availability before agent starts
**Use when:**
- Server is critical for agent functionality
- Production environments requiring reliability
- Data consistency is important
```yaml
mcpServers:
critical-database:
type: stdio
command: npx
args: ["-y", "@truffle-ai/database-server"]
connectionMode: strict
```
### Global --strict Flag
Override all connection modes:
```bash
dexto --strict # Makes ALL servers strict
```
## Environment Variables
All MCP configurations support environment variable expansion using `$VAR` syntax.
**Supported fields:**
- `command` - stdio command executable
- `args` - Each argument in the array
- `url` - Server endpoint URLs
- `headers` - Header values
- `env` - Environment variable values
**Example:**
```yaml
mcpServers:
secure-service:
type: http
url: $SERVICE_URL
headers:
Authorization: Bearer $API_KEY
```
**Security best practice:** Always use environment variables for secrets, never hardcode them.
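The `$VAR` expansion described above can be sketched as a simple substitution pass (illustrative only; `expandEnv` is a hypothetical helper, not Dexto's implementation):

```typescript
// Replace $VAR references with environment values; unknown vars are left intact.
function expandEnv(value: string, env: Record<string, string>): string {
  return value.replace(/\$([A-Z_][A-Z0-9_]*)/g, (match, name) => env[name] ?? match);
}

console.log(expandEnv('Bearer $API_KEY', { API_KEY: 'secret-token' }));
// → Bearer secret-token
```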
## Common Use Cases
### Development vs Production
**Development:**
```yaml
mcpServers:
database:
type: stdio
command: npx
args: ["-y", "@truffle-ai/database-server"]
timeout: 10000
connectionMode: lenient
```
**Production:**
```yaml
mcpServers:
database:
type: stdio
command: npx
args: ["-y", "@truffle-ai/database-server@2.1.0"] # Pinned version
timeout: 60000
connectionMode: strict
```
### Content Creation Agent
```yaml
mcpServers:
filesystem:
type: stdio
command: npx
args: ["-y", "@modelcontextprotocol/server-filesystem", "."]
connectionMode: strict
playwright:
type: stdio
command: npx
args: ["-y", "@playwright/mcp@latest"]
connectionMode: lenient
```
### Data Analysis Agent
```yaml
mcpServers:
database:
type: stdio
command: npx
args: ["-y", "@truffle-ai/database-server"]
env:
DATABASE_URL: $DATABASE_URL
connectionMode: strict
analytics-api:
type: http
url: $ANALYTICS_API_URL
headers:
Authorization: Bearer $ANALYTICS_TOKEN
connectionMode: strict
```
## Tool Aggregation
When multiple MCP servers are configured, Dexto aggregates all their tools. Tools are prefixed with server names to avoid conflicts:
**Format:** `<server-name>__<tool-name>`
**Example:**
```yaml
mcpServers:
filesystem:
# Provides: filesystem__read_file, filesystem__write_file
playwright:
# Provides: playwright__navigate, playwright__screenshot
```
Agent sees: `filesystem__read_file`, `filesystem__write_file`, `playwright__navigate`, `playwright__screenshot`
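The prefixing scheme can be sketched as a small aggregation step (an illustration, not Dexto's internals):

```typescript
// Prefix each tool with its server name to avoid naming collisions.
function aggregateTools(servers: Record<string, string[]>): string[] {
  return Object.entries(servers).flatMap(([server, tools]) =>
    tools.map((tool) => `${server}__${tool}`)
  );
}

console.log(aggregateTools({
  filesystem: ['read_file', 'write_file'],
  playwright: ['navigate', 'screenshot'],
}));
```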
## Best Practices
1. **Use appropriate connection modes** - Strict for critical, lenient for optional
2. **Set reasonable timeouts** - Shorter for local (10s), longer for remote (60s)
3. **Pin versions in production** - `@package@version` for stability
4. **Use environment variables** - Never hardcode secrets
5. **Document server purposes** - Add comments explaining each server's role
6. **Group related servers** - Organize config sections logically
7. **Test connections** - Verify servers work before deploying
## See Also
- [agent.yml Reference → MCP Servers](./agent-yml.md#mcp-servers) - Complete field documentation
- [Tool Confirmation](./toolConfirmation.md) - Control MCP tool execution
- [MCP Overview](../../mcp/overview.md) - What is MCP and why it matters
- [MCP Manager](../../mcp/mcp-manager.md) - Runtime server management
- [Official MCP Servers](https://github.com/modelcontextprotocol/servers) - Available MCP servers


---
sidebar_position: 4
sidebar_label: "Memory"
---
# Memory Configuration
Configure the memory system to store and retrieve persistent information about user preferences, context, and important facts across conversations.
:::tip Complete Reference
For complete field documentation, see **[agent.yml → Memories](./agent-yml.md#memories)**.
:::
## Overview
The Memory system allows your Dexto agent to remember information across conversations and sessions. Memories can be created by users, the system, or programmatically through the API.
**Key features:**
- Persistent storage across sessions
- Tagging and metadata support
- Pinned memories for auto-loading
- Flexible filtering and retrieval
- Integration with system prompts
Memories are stored in your configured database backend and can be automatically included in the system prompt for context-aware interactions.
## Memory Structure
Each memory contains:
- **content** - The actual memory text (1-10,000 characters)
- **tags** - Optional categorization (max 10 tags, 1-50 chars each)
- **metadata** - Source tracking, pinning, and custom fields
- **timestamps** - Creation and last update times
```yaml
# Example memory structure
content: "User prefers concise responses"
tags: ["preference", "communication"]
metadata:
source: user
pinned: true
customField: "any value"
```
## Enabling Memories
Memory is configured at the top level of your agent.yml file:
```yaml
memories:
enabled: true
priority: 40
limit: 10
```
### Configuration Options
| Option | Type | Default | Description |
|--------|------|---------|-------------|
| `enabled` | boolean | `false` | Enable memory inclusion in system prompt |
| `priority` | number | `40` | Position in system prompt (lower = earlier) |
| `limit` | number | - | Maximum memories to include |
| `includeTimestamps` | boolean | `false` | Show last updated date |
| `includeTags` | boolean | `true` | Include associated tags |
| `pinnedOnly` | boolean | `false` | Only include pinned memories |
## Pinned Memories
Pinned memories are automatically loaded into the system prompt when `pinnedOnly` is false, or exclusively loaded when `pinnedOnly` is true.
### Configuring Pinned Memories Only
```yaml
memories:
enabled: true
priority: 40
pinnedOnly: true # Only load pinned memories
limit: 5 # Keep system prompt compact
```
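The selection rule can be sketched roughly as follows; `selectMemories` and its field names are illustrative, not Dexto's API:

```typescript
interface Memory {
  content: string;
  metadata: { pinned?: boolean };
}

// With pinnedOnly, keep only pinned memories; then apply the limit, if any.
function selectMemories(memories: Memory[], pinnedOnly: boolean, limit?: number): Memory[] {
  const selected = pinnedOnly ? memories.filter((m) => m.metadata.pinned === true) : memories;
  return limit !== undefined ? selected.slice(0, limit) : selected;
}
```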
### When to Pin Memories
**Pin these:**
- Critical user preferences (communication style, constraints)
- Important project context (tech stack, standards)
- User-specific requirements (accessibility needs, language)
**Don't pin these:**
- Temporary context (current task details)
- Historical information (past interactions)
- Optional details (nice-to-have context)
## Use Cases
| Scenario | Memory Strategy |
|----------|----------------|
| **Personal Assistant** | Pin schedules, preferences, important dates |
| **Customer Support** | Store customer history, preferences, past issues |
| **Development Assistant** | Remember tech stack, coding standards, project structure |
| **Research Agent** | Track research topics, sources, findings |
## Configuration Examples
### Basic Memory Integration
```yaml
llm:
provider: anthropic
model: claude-sonnet-4-5-20250929
apiKey: $ANTHROPIC_API_KEY
systemPrompt: |
You are a helpful AI assistant that remembers user preferences.
memories:
enabled: true
priority: 40
limit: 15
includeTimestamps: true
includeTags: true
```
### Hybrid Approach: Pinned + On-Demand
```yaml
memories:
enabled: true
pinnedOnly: true # Only auto-load critical context
limit: 5 # Keep system prompt compact
```
Then query additional memories programmatically when needed:
```typescript
// In your application code
const memories = await agent.memory.list({
tags: ["customer", "billing"],
limit: 20
});
```
## Example Output Format
Memories appear in the system prompt as:
```
## User Memories
- User prefers concise responses [Tags: preference, communication]
- Project uses TypeScript with strict mode [Tags: technical, configuration]
- User's timezone is PST [Tags: personal] (Updated: 1/15/2025)
```
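A formatter producing the layout above might look like this (a sketch; the field names are assumptions):

```typescript
interface MemoryEntry {
  content: string;
  tags?: string[];
  updated?: string;
}

// Render one memory line, honoring includeTags / includeTimestamps settings.
function renderMemory(m: MemoryEntry, includeTags = true, includeTimestamps = false): string {
  let line = `- ${m.content}`;
  if (includeTags && m.tags?.length) line += ` [Tags: ${m.tags.join(', ')}]`;
  if (includeTimestamps && m.updated) line += ` (Updated: ${m.updated})`;
  return line;
}
```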
## Storage Requirements
Memories require persistent storage:
```yaml
storage:
database:
type: sqlite # Required for persistent memories
```
Memory data uses the key pattern: `memory:item:{id}`
## Best Practices
1. **Pin sparingly** - Only pin critical information that should always be available
2. **Tag consistently** - Develop a tagging strategy for easy filtering
3. **Keep content focused** - Each memory should contain a single, clear piece of information
4. **Use source field** - Track whether memories came from users or system
5. **Set reasonable limits** - Use `limit` option to prevent system prompt bloat
6. **Regular cleanup** - Review and remove outdated memories periodically
7. **Combine approaches** - Use pinned for core context, query on-demand for specific needs
## See Also
- [agent.yml Reference → Memories](./agent-yml.md#memories) - Complete field documentation
- [System Prompt Configuration](./systemPrompt.md) - How to configure system prompts
- [Storage Configuration](./storage.md) - Database setup for persistent memories


---
sidebar_position: 1
sidebar_label: "Overview"
---
# Configuring Dexto
Dexto's power comes from its customizability. You can customize every part of your Dexto agent with one `yml` config file.
:::tip Complete Configuration Reference
For the comprehensive reference of **all configuration options and field documentation**, see **[Complete agent.yml Configuration Reference](./agent-yml.md)**.
The guides in this section explain **concepts and use cases**. For detailed field specifications, always refer to the canonical reference.
:::
This guide walks through all the different features you can customize, and the expected format.
We chose `yml` over the more popular `json` because it has better parsing libraries and supports comments.
## Where to Place Your Config
By default, Dexto uses a configuration file named `coding-agent.yml`.
Dexto ships with built-in agents that are stored in the `~/.dexto` directory.
You can also specify a custom config path using the CLI:
```bash
dexto --agent path/to/your-config.yml
```
## Common Configuration Patterns
### Local Development
```yaml
llm:
provider: openai
model: gpt-5-mini
apiKey: $OPENAI_API_KEY
storage:
cache:
type: in-memory
database:
type: sqlite
path: "${{dexto.agent_dir}}/data/dexto.db"
mcpServers:
filesystem:
type: stdio
command: npx
args: ["-y", "@modelcontextprotocol/server-filesystem", "."]
```
### Production Setup
```yaml
llm:
provider: openai
model: gpt-5-mini
apiKey: $OPENAI_API_KEY
storage:
cache:
type: redis
url: $REDIS_URL
maxConnections: 10
database:
type: postgres
connectionString: $POSTGRES_CONNECTION_STRING
maxConnections: 25
sessions:
maxSessions: 1000
sessionTTL: 86400000 # 24 hours
mcpServers:
filesystem:
type: stdio
command: npx
args: ["-y", "@modelcontextprotocol/server-filesystem", "/data"]
connectionMode: strict
```
### Docker Deployment
```yaml
llm:
provider: anthropic
model: claude-sonnet-4-5-20250929
apiKey: $ANTHROPIC_API_KEY
storage:
cache:
type: redis
host: redis
port: 6379
database:
type: postgres
host: postgres
port: 5432
username: $DB_USER
password: $DB_PASSWORD
database: dexto
mcpServers:
filesystem:
type: stdio
command: npx
args: ["-y", "@modelcontextprotocol/server-filesystem", "/app/data"]
```
## Environment Variables
| Variable | Required | Description | Example |
|----------|----------|-------------|---------|
| `OPENAI_API_KEY` | Yes* | OpenAI API key | `sk-proj-...` |
| `ANTHROPIC_API_KEY` | Yes* | Anthropic API key | `sk-ant-...` |
| `GOOGLE_GENERATIVE_AI_API_KEY` | Yes* | Google AI API key | `AIza...` |
| `GROQ_API_KEY` | Yes* | Groq API key | `gsk_...` |
| `XAI_API_KEY` | Yes* | xAI API key | `xai-...` |
| `COHERE_API_KEY` | Yes* | Cohere API key | `co-...` |
| `REDIS_URL` | No | Redis connection URL | `redis://localhost:6379` |
| `POSTGRES_CONNECTION_STRING` | No | PostgreSQL connection | `postgresql://user:pass@host:5432/db` |
| `DEXTO_LOG_LEVEL` | No | Log level | `silly`, `debug`, `info`, `warn`, `error` |
*At least one LLM provider API key is required; you only need to set the key for the provider you plan to use.
## Path Variables
Dexto supports path variables for portable configuration:
**`${{dexto.agent_dir}}`** - Resolves to the directory containing your agent's YAML file
- Use this for agent-relative paths in plugins, file contributors, and custom resources
- Makes your configuration portable when sharing or moving agents
**Example:**
```yaml
# Plugin with agent-relative path
plugins:
custom:
- name: my-plugin
module: "${{dexto.agent_dir}}/plugins/auth.ts"
# System prompt file contributors with mixed paths
systemPrompt:
contributors:
- type: file
files:
- "${{dexto.agent_dir}}/context/guidelines.md" # Agent-relative
- /etc/system/shared-rules.md # Absolute path
```
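Conceptually, path-variable resolution is a plain string substitution. A sketch (not Dexto's implementation):

```typescript
// Replace every ${{dexto.agent_dir}} occurrence with the agent's directory.
function resolvePathVars(value: string, agentDir: string): string {
  return value.split('${{dexto.agent_dir}}').join(agentDir);
}

console.log(resolvePathVars('${{dexto.agent_dir}}/plugins/auth.ts', '/home/me/agents/dev'));
// → /home/me/agents/dev/plugins/auth.ts
```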
## Best Practices
- **Use environment variables** for secrets and API keys. Reference them in YML as `$VARNAME`.
- **Keep your agent in version control** (but never commit secrets!). Use `.env` files or CI secrets for sensitive values.
- **Document your agent config** for your team. Add comments to your YML files. We chose YML for this reason.
- **Use $\{\{dexto.agent_dir\}\} for files your agent references** - this keeps supporting files close to your agent config and makes the agent portable.
- **Validate your agent** before running Dexto in production:
```bash
# Test your configuration by doing a dry run
dexto --agent ./my-agent.yml
```
- **See the `agents/` folder in [the Dexto GitHub repository](https://github.com/truffle-ai/dexto) for more templates and advanced use cases.**


---
sidebar_position: 10
---
# Plugins Configuration
Extend agent behavior with custom logic that runs at specific points in the request/response lifecycle.
:::tip Complete Reference
For complete field documentation, plugin interfaces, and implementation details, see **[agent.yml → Plugins](./agent-yml.md#plugins)**.
:::
## Overview
Dexto's plugin system allows you to inject custom logic at four key lifecycle points: before LLM requests, before tool calls, after tool results, and before responses.
**Hook points:**
- **beforeLLMRequest** - Validate/modify input before LLM
- **beforeToolCall** - Check tool arguments before execution
- **afterToolResult** - Process tool results
- **beforeResponse** - Sanitize/format final response
**Common uses:**
- Security & compliance (content filtering, PII redaction)
- Observability (logging, metrics, analytics)
- Data transformation (preprocessing, formatting, translation)
- Business logic (validation, workflow enforcement, cost tracking)
## Plugin Types
### Built-in Plugins
**contentPolicy** - Enforce content policies on input:
```yaml
plugins:
contentPolicy:
priority: 10
blocking: true
enabled: true
maxInputChars: 50000
redactEmails: true
redactApiKeys: true
```
**responseSanitizer** - Clean responses before sending:
```yaml
plugins:
responseSanitizer:
priority: 900
blocking: false
enabled: true
redactEmails: true
redactApiKeys: true
maxResponseLength: 100000
```
### Custom Plugins
Implement your own logic:
```yaml
plugins:
custom:
- name: request-logger
module: "${{dexto.agent_dir}}/plugins/request-logger.ts"
enabled: true
blocking: false
priority: 5
config:
logDir: ~/.dexto/logs
```
## Plugin Configuration Fields
**Core fields (all plugins):**
- **priority** - Execution order (1-99: pre-processing, 100-899: main, 900-999: post)
- **blocking** - If true, errors halt execution
- **enabled** - Whether plugin is active
**Custom plugin fields:**
- **name** - Unique identifier
- **module** - Path to plugin file (supports `${{dexto.agent_dir}}`)
- **config** - Plugin-specific configuration
## Priority Ordering
Plugins execute in priority order (lowest first):
```yaml
plugins:
custom:
- name: validator
priority: 10 # Runs first
blocking: true
- name: logger
priority: 50 # Runs second
blocking: false
- name: sanitizer
priority: 900 # Runs last
blocking: false
```
## Blocking vs Non-blocking
**Blocking (`blocking: true`):**
- Errors halt execution
- User sees error message
- Use for: Security, validation, critical rules
**Non-blocking (`blocking: false`):**
- Errors logged but execution continues
- Use for: Logging, metrics, optional features
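Priority ordering and blocking semantics can be sketched together (illustrative types and names, not Dexto's plugin runner):

```typescript
interface PluginSketch {
  name: string;
  priority: number;
  blocking: boolean;
  run: (payload: string) => { ok: boolean; modify?: string; message?: string };
}

// Execute plugins in ascending priority. A failing blocking plugin halts the
// request; a failing non-blocking plugin is recorded and execution continues.
function runPlugins(plugins: PluginSketch[], payload: string): { payload: string; failures: string[] } {
  const failures: string[] = [];
  for (const plugin of [...plugins].sort((a, b) => a.priority - b.priority)) {
    const result = plugin.run(payload);
    if (result.ok) {
      payload = result.modify ?? payload;
    } else if (plugin.blocking) {
      throw new Error(`${plugin.name}: ${result.message ?? 'blocked'}`);
    } else {
      failures.push(plugin.name);
    }
  }
  return { payload, failures };
}
```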
## Configuration Examples
### Security-Focused
```yaml
plugins:
contentPolicy:
priority: 10
blocking: true
enabled: true
maxInputChars: 50000
redactEmails: true
redactApiKeys: true
responseSanitizer:
priority: 900
blocking: false
enabled: true
redactEmails: true
redactApiKeys: true
```
### With Custom Logging
```yaml
plugins:
custom:
- name: request-logger
module: "${{dexto.agent_dir}}/plugins/request-logger.ts"
blocking: false
priority: 5
config:
logDir: ~/.dexto/logs
logFileName: request-logger.log
- name: analytics
module: "${{dexto.agent_dir}}/plugins/analytics.ts"
blocking: false
priority: 100
config:
endpoint: https://analytics.example.com
apiKey: $ANALYTICS_API_KEY
```
## Custom Plugin Implementation
```typescript
import type {
    DextoPlugin,
    BeforeLLMRequestPayload,
    PluginResult,
    PluginExecutionContext,
} from '@core/plugins/types.js';

export class MyPlugin implements DextoPlugin {
    private config: any;

    async initialize(config: Record<string, any>): Promise<void> {
        this.config = config;
    }

    async beforeLLMRequest(
        payload: BeforeLLMRequestPayload,
        context: PluginExecutionContext
    ): Promise<PluginResult> {
        // Validate or modify input
        return { ok: true };
    }

    async cleanup(): Promise<void> {
        // Release resources
    }
}

export default MyPlugin;
```
## Plugin Results
```typescript
// Success
return { ok: true };
// Success with modifications
return {
ok: true,
modify: { text: 'modified input' },
notices: [{ kind: 'info', code: 'modified', message: 'Input modified' }]
};
// Failure (blocks if plugin is blocking)
return {
ok: false,
cancel: true,
message: 'Validation failed'
};
```
## Best Practices
1. **Use appropriate priorities** - Validators before processors, sanitizers last
2. **Make logging non-blocking** - Don't halt on logging failures
3. **Use blocking for security** - Content policy should be blocking
4. **Keep plugins focused** - Single responsibility per plugin
5. **Handle errors gracefully** - Return appropriate results
6. **Use agent-relative paths** - `${{dexto.agent_dir}}` for portability
7. **Clean up resources** - Implement cleanup() properly
## Plugin Examples
### Built-in Plugins
- **[Content Policy Plugin](https://github.com/truffle-ai/dexto/blob/main/packages/core/src/plugins/content-policy-plugin.ts)** - Input validation and content filtering
- **[Response Sanitizer Plugin](https://github.com/truffle-ai/dexto/blob/main/packages/core/src/plugins/response-sanitizer-plugin.ts)** - Output sanitization and PII redaction
### Custom Plugin Examples
- **[Request Logger Plugin](https://github.com/truffle-ai/dexto/blob/main/agents/logger-agent/plugins/request-logger.ts)** - Complete custom plugin implementation with logging
## See Also
- [agent.yml Reference → Plugins](./agent-yml.md#plugins) - Complete field documentation
- [System Prompt Configuration](./systemPrompt.md) - Configure agent behavior


---
sidebar_position: 8
sidebar_label: "Sessions"
---
# Sessions Configuration
Configure session management for your Dexto agent, including maximum concurrent sessions and session timeouts.
:::tip Complete Reference
For complete field documentation and session management details, see **[agent.yml → Sessions](./agent-yml.md#session-configuration)**.
:::
## Overview
Sessions in Dexto represent individual conversation contexts or user interactions. Each session maintains its own message history, tool approvals, and state.
## Configuration
```yaml
sessions:
maxSessions: 100 # Maximum concurrent sessions
sessionTTL: 3600000 # Session timeout in milliseconds (1 hour)
```
## Options
### `maxSessions`
- **Type:** Number (positive integer)
- **Default:** 100
- **Description:** Maximum number of concurrent sessions the agent can handle
### `sessionTTL`
- **Type:** Number (milliseconds)
- **Default:** 3600000 (1 hour)
- **Description:** How long an inactive session remains in memory before being evicted (chat history is preserved in storage)
## Examples
### High-Traffic Environment
```yaml
sessions:
maxSessions: 1000
sessionTTL: 1800000 # 30 minutes
```
### Low-Resource Environment
```yaml
sessions:
maxSessions: 20
sessionTTL: 7200000 # 2 hours
```
### Development Environment
```yaml
sessions:
maxSessions: 10
sessionTTL: 86400000 # 24 hours (for debugging)
```
## Session Behavior
- **Automatic cleanup:** Expired sessions are automatically removed from memory (chat history preserved in storage)
- **Session isolation:** Each session has independent conversation history and tool approvals
- **Memory management:** Limiting sessions prevents memory exhaustion in long-running deployments
- **Chat persistence:** Conversation history is always preserved in storage and can be restored when sessions are accessed again
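The eviction behavior described above can be modeled roughly like this (a simplification; real session management also restores history from storage):

```typescript
// Simplified model of in-memory session eviction. Evicted sessions lose only
// their in-memory state; chat history would persist in the storage backend.
class SessionCacheSketch {
  private sessions = new Map<string, number>(); // id -> last-activity timestamp (ms)

  constructor(private maxSessions: number, private ttlMs: number) {}

  touch(sessionId: string, nowMs: number): void {
    this.evictExpired(nowMs);
    if (!this.sessions.has(sessionId) && this.sessions.size >= this.maxSessions) {
      // Over capacity: drop the least recently active session from memory.
      const oldest = [...this.sessions.entries()].sort((a, b) => a[1] - b[1])[0][0];
      this.sessions.delete(oldest);
    }
    this.sessions.set(sessionId, nowMs);
  }

  evictExpired(nowMs: number): void {
    for (const [id, last] of this.sessions) {
      if (nowMs - last > this.ttlMs) this.sessions.delete(id);
    }
  }

  ids(): string[] {
    return [...this.sessions.keys()];
  }
}
```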
## Default Configuration
If not specified, Dexto uses:
```yaml
sessions:
maxSessions: 100
sessionTTL: 3600000
```
This provides a good balance for most use cases.
## See Also
- [agent.yml Reference → Sessions](./agent-yml.md#session-configuration) - Complete field documentation
- [Storage Configuration](./storage.md) - Persistent chat history storage


---
sidebar_position: 7
sidebar_label: "Storage Configuration"
---
# Storage Configuration
Configure how your Dexto agent stores data: cache, database, and blob storage.
:::tip Complete Reference
For complete field documentation and all storage options, see **[agent.yml → Storage](./agent-yml.md#storage-configuration)**.
:::
## Overview
Dexto storage has three components:
- **Cache** - Temporary, high-speed data access (in-memory or Redis)
- **Database** - Persistent storage (in-memory, SQLite, or PostgreSQL)
- **Blob** - Binary data storage (in-memory or local filesystem)
## Storage Types
| Component | Options | Use Case |
|-----------|---------|----------|
| **Cache** | in-memory, redis | Temporary data, sessions |
| **Database** | in-memory, sqlite, postgres | Persistent data, memories |
| **Blob** | in-memory, local | Files, images, large objects |
## Cache
Temporary, high-speed data access.
### in-memory
Data lost when process terminates:
```yaml
storage:
cache:
type: in-memory
```
**Use for:** Development, testing
### redis
High-performance caching:
```yaml
storage:
cache:
type: redis
host: localhost
port: 6379
maxConnections: 50
```
**Use for:** Production
## Database
Persistent data storage.
### in-memory
Non-persistent:
```yaml
storage:
database:
type: in-memory
```
**Use for:** Testing
### sqlite
File-based persistence:
```yaml
storage:
database:
type: sqlite
path: ./data/my-agent.db
```
**Use for:** Single-instance, simple deployments
### postgres
Production-grade database:
```yaml
storage:
database:
type: postgres
host: db.example.com
port: 5432
database: dexto_prod
password: $DB_PASSWORD
```
**Use for:** Production, multi-instance
## Blob
Binary data storage.
### in-memory
```yaml
storage:
blob:
type: in-memory
maxBlobSize: 5242880 # 5MB
```
**Use for:** Development
### local
Filesystem storage:
```yaml
storage:
blob:
type: local
storePath: "${{dexto.agent_dir}}/blobs"
maxBlobSize: 104857600 # 100MB
cleanupAfterDays: 60
```
**Use for:** Production, persistent files
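The `cleanupAfterDays` retention rule can be sketched as a pure check over blob modification times (illustrative; the actual cleanup job may differ):

```typescript
// Given blob paths and their mtimes (ms since epoch), return the paths that
// have exceeded the retention window and are eligible for deletion.
function expiredBlobs(
  mtimes: Record<string, number>,
  cleanupAfterDays: number,
  nowMs: number
): string[] {
  const cutoff = nowMs - cleanupAfterDays * 86_400_000;
  return Object.entries(mtimes)
    .filter(([, mtime]) => mtime < cutoff)
    .map(([path]) => path);
}
```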
## Example Configurations
### Development (Default)
```yaml
# No storage config needed - defaults to in-memory for all components
storage:
cache:
type: in-memory
database:
type: in-memory
blob:
type: in-memory
```
:::tip CLI Auto-Configuration
When using the Dexto CLI, SQLite database and local blob storage paths are automatically provided at:
- Database: `~/.dexto/database/<agent-id>.db`
- Blobs: `~/.dexto/blobs/<agent-id>/`
You don't need to specify these paths manually unless you want custom locations.
:::
### Production (Redis + PostgreSQL)
```yaml
storage:
cache:
type: redis
url: $REDIS_URL
database:
type: postgres
url: $POSTGRES_URL
blob:
type: local
storePath: /var/data/blobs
```
### Simple (SQLite)
```yaml
storage:
database:
type: sqlite
# path: automatically provided by CLI as ~/.dexto/database/<agent-id>.db
blob:
type: local
# storePath: automatically provided by CLI as ~/.dexto/blobs/<agent-id>/
```
Or with explicit paths:
```yaml
storage:
database:
type: sqlite
path: ./data/my-agent.db
blob:
type: local
storePath: ./data/blobs
```
## When to Use
| Scenario | Cache | Database | Blob |
|----------|-------|----------|------|
| **Development** | in-memory | in-memory | in-memory |
| **Simple Production** | redis | sqlite | local |
| **Scalable Production** | redis | postgres | local |
| **Testing** | in-memory | sqlite | in-memory |
## Best Practices
1. **Use environment variables** - Store passwords and connection strings as `$VAR`
2. **Match storage to use case** - Redis for caching, Postgres/SQLite for persistence
3. **Set appropriate limits** - Configure `maxConnections`, `maxBlobSize` based on load
4. **Use local blob storage in production** - For persistence and automatic cleanup


---
sidebar_position: 3
sidebar_label: "System Prompt"
---
# System Prompt Configuration
Configure how your Dexto agent behaves and responds through system prompts that define personality, capabilities, and guidelines.
:::tip Complete Reference
For complete field documentation and all configuration options, see **[agent.yml → System Prompt](./agent-yml.md#system-prompt-configuration)**.
:::
## Overview
System prompts define your agent's personality, behavior, and capabilities. They serve as the foundational instructions that guide how your agent interprets and responds to user requests.
You can use either a simple string for basic scenarios or an advanced multi-contributor system for complex agents that need dynamic context, file-based instructions, or memory integration.
**Key capabilities:**
- Static instructions for consistent behavior
- Dynamic context (date/time, MCP resources)
- File-based documentation inclusion
- Priority-based content ordering
:::tip Memory Configuration
For user memory integration, use the top-level [`memories`](./agent-yml.md#memories) configuration instead of system prompt contributors.
:::
## Configuration Types
### Simple String Prompt
For straightforward agents, use a single string:
```yaml
systemPrompt: |
You are a helpful AI assistant with access to tools.
Use these tools when appropriate to answer user queries.
After each tool result, determine if you need more information or can provide a final answer.
```
### Advanced Multi-Contributor System
For complex scenarios requiring multiple content sources:
```yaml
systemPrompt:
contributors:
- id: core-behavior
type: static
priority: 1
content: |
You are a professional software development assistant.
You help with coding, documentation, and project management.
- id: current-time
type: dynamic
priority: 10
source: date
- id: project-docs
type: file
priority: 20
files:
- "${{dexto.agent_dir}}/README.md"
- "${{dexto.agent_dir}}/CONTRIBUTING.md"
options:
includeFilenames: true
errorHandling: "skip"
# Memory is configured separately at the top level
memories:
enabled: true
limit: 10
```
## Contributor Types
### Static Contributors
Fixed text content for consistent instructions.
```yaml
- id: guidelines
type: static
priority: 1
content: |
Always be helpful, respectful, and thorough.
Provide step-by-step solutions when possible.
```
### Dynamic Contributors
Runtime-generated content:
- **`date`** - Current date context
- **`resources`** - MCP server resources (disabled by default)
```yaml
- id: timestamp
type: dynamic
priority: 10
source: date
enabled: true
```
### File Contributors
Include external documentation files (`.md` and `.txt` only):
```yaml
- id: project-context
type: file
priority: 20
files:
- "${{dexto.agent_dir}}/docs/guidelines.md"
- "../README.md"
options:
includeFilenames: true
separator: "\n\n---\n\n"
maxFileSize: 50000
```
**Path resolution:** Relative paths are resolved from the config file location.
## Priority Ordering
Contributors execute in ascending priority order (1 → 100+). Lower numbers appear first in the final system prompt.
**Recommended ranges:**
- **1-10:** Core behavior and role definition
- **10-50:** Dynamic context (time, resources)
- **50-100:** File-based documentation
- **100+:** Additional context and overrides
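Contributor assembly can be sketched as a filter-sort-join (illustrative, not Dexto's code):

```typescript
interface Contributor {
  id: string;
  priority: number;
  content: string;
  enabled?: boolean;
}

// Concatenate enabled contributors in ascending priority order;
// lower priority values appear earlier in the final prompt.
function assemblePrompt(contributors: Contributor[]): string {
  return contributors
    .filter((c) => c.enabled !== false)
    .sort((a, b) => a.priority - b.priority)
    .map((c) => c.content)
    .join('\n\n');
}
```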
## Use Cases
| Scenario | Recommended Approach |
|----------|---------------------|
| Simple chatbot | Single string prompt |
| Development assistant | Static + File contributors for guidelines |
| Customer support | Static + top-level `memories` config |
| Research agent | Static + Dynamic (resources) for live data |
| Personal assistant | Static + File + Dynamic + `memories` config |
## Examples
### Production Agent
```yaml
systemPrompt:
contributors:
- id: core
type: static
priority: 1
content: |
You are a helpful AI assistant designed to work with tools and data.
Provide clear, accurate responses and use available tools effectively.
- id: timestamp
type: dynamic
priority: 10
source: date
```
### Customer Support Agent
```yaml
systemPrompt:
contributors:
- id: role
type: static
priority: 1
content: |
You are a customer support assistant.
Always be polite, professional, and solution-oriented.
memories:
enabled: true
limit: 10
```
## Best Practices
1. **Keep it focused** - Clear, specific instructions work better than lengthy prompts
2. **Use priority ordering** - Structure from general (role) to specific (context)
3. **Test behavior** - Validate that prompts produce desired agent responses
4. **File contributors for docs** - Keep large documentation in separate files
5. **Enable resources selectively** - MCP resources can be large; only enable when needed
6. **Use top-level memories** - Configure memory retrieval via the `memories` config field
## See Also
- [agent.yml Reference → System Prompt](./agent-yml.md#system-prompt-configuration) - Complete field documentation
- [Memory Configuration](./memory.md) - Configure the memory system
- [MCP Configuration](./mcpConfiguration.md) - Set up resource providers

---
sidebar_position: 9
sidebar_label: "Telemetry"
---
# Telemetry Configuration
Enable distributed tracing to monitor agent behavior, debug issues, and track performance using OpenTelemetry.
:::tip Complete Reference
For complete field documentation, backend setup, and collector configuration, see **[agent.yml → Telemetry](./agent-yml.md#telemetry-configuration)**.
:::
## Overview
Telemetry provides visibility into your agent's operations through distributed tracing. When enabled, Dexto automatically traces agent operations, LLM calls, and tool executions.
**What you get:**
- Complete request lifecycle traces
- LLM token usage tracking
- Tool execution monitoring
- Export to any OTLP-compatible backend
## Quick Start
### 1. Start Jaeger (Local)
```bash
docker run -d \
--name jaeger \
-p 16686:16686 \
-p 4318:4318 \
jaegertracing/all-in-one:latest
```
### 2. Configure Agent
```yaml
telemetry:
enabled: true
serviceName: my-agent
export:
type: otlp
endpoint: http://localhost:4318/v1/traces
```
### 3. View Traces
Open [http://localhost:16686](http://localhost:16686) and explore your traces.
## Configuration Options
```yaml
telemetry:
enabled: boolean # Turn on/off (default: false)
serviceName: string # Service identifier in traces
tracerName: string # Tracer name (default: 'dexto-tracer')
export:
type: 'otlp' | 'console' # Export destination
protocol: 'http' | 'grpc' # OTLP protocol (default: 'http')
endpoint: string # Backend URL
headers: # Optional auth headers
[key: string]: string
```
## Export Types
### OTLP (Production)
Export to OTLP-compatible backends:
```yaml
telemetry:
enabled: true
serviceName: my-prod-agent
export:
type: otlp
endpoint: http://localhost:4318/v1/traces
```
### Console (Development)
Print traces to terminal:
```yaml
telemetry:
enabled: true
export:
type: console
```
## Common Configurations
### Local Jaeger
```yaml
telemetry:
enabled: true
serviceName: my-dev-agent
export:
type: otlp
protocol: http
endpoint: http://localhost:4318/v1/traces
```
### Grafana Cloud
```yaml
telemetry:
enabled: true
serviceName: my-prod-agent
export:
type: otlp
endpoint: https://otlp-gateway-prod.grafana.net/otlp
headers:
authorization: "Basic $GRAFANA_CLOUD_TOKEN"
```
### Honeycomb
```yaml
telemetry:
enabled: true
serviceName: my-prod-agent
export:
type: otlp
endpoint: https://api.honeycomb.io:443
headers:
x-honeycomb-team: $HONEYCOMB_API_KEY
```
## What Gets Traced
Dexto automatically traces:
- **Agent operations** - Full request lifecycle
- **LLM calls** - Model invocations with token counts
- **Tool executions** - Tool calls and results
**Key attributes:**
- `gen_ai.usage.input_tokens` - Prompt tokens
- `gen_ai.usage.output_tokens` - Completion tokens
- `llm.provider` - Provider name
- `llm.model` - Model identifier
## Use Cases
| Scenario | How Telemetry Helps |
|----------|---------------------|
| **Debug slow requests** | Identify bottlenecks in traces |
| **Monitor token usage** | Track LLM costs and optimize prompts |
| **Production monitoring** | Set alerts for errors and latency |
| **Performance optimization** | Find inefficient operations |
## Performance Impact
Minimal overhead:
- ~1-2ms per span
- Async export (non-blocking)
- Automatic batching
For high-volume agents, consider sampling or using a collector.
## Best Practices
1. **Enable in production** - Essential for observability
2. **Use meaningful service names** - Different names per deployment
3. **Set up monitoring** - Create alerts for issues
4. **Consider sampling** - For high-traffic scenarios
5. **Use collectors** - For advanced processing and buffering
## See Also
- [agent.yml Reference → Telemetry](./agent-yml.md#telemetry-configuration) - Complete field documentation
- [OpenTelemetry Docs](https://opentelemetry.io/docs/) - Official OTEL documentation
- [Jaeger Docs](https://www.jaegertracing.io/docs/) - Jaeger tracing platform

---
sidebar_position: 6
sidebar_label: "Tool Confirmation"
---
# Tool Confirmation Configuration
Control how and when users are prompted to approve tool execution through Dexto's flexible confirmation system.
:::tip Complete Reference
For complete field documentation, event specifications, and UI integration details, see **[agent.yml → Tool Confirmation](./agent-yml.md#tool-confirmation)**.
:::
## Overview
The tool confirmation system provides security and oversight by controlling which tools your agent can execute and when. It supports multiple modes and fine-grained policies for different environments and use cases.
**Configuration controls:**
- **Confirmation mode** - How tools are approved (interactive, auto-approve, auto-deny)
- **Timeout duration** - How long to wait for user response
- **Storage type** - Where to remember approvals (persistent vs session-only)
- **Tool policies** - Fine-grained allow/deny lists
:::note Elicitation vs Tool Confirmation
**Tool confirmation** controls whether tools require approval before execution. **Elicitation** is a separate feature that controls whether MCP servers can request user input during interactions. These are independent settings - see [Elicitation Configuration](./agent-yml.md#elicitation-configuration) for details.
:::
## Confirmation Modes
| Mode | Behavior | Use Case |
|------|----------|----------|
| **manual** | Interactive prompts via CLI/WebUI | Production with oversight |
| **auto-approve** | Automatically approve all tools | Development/testing |
| **auto-deny** | Block all tool execution | Read-only/high-security |
### manual (Default)
Interactive confirmation via CLI prompts or WebUI dialogs:
```yaml
toolConfirmation:
mode: manual
timeout: 30000 # 30 seconds
allowedToolsStorage: storage # Persist across sessions
```
**When to use:**
- Production environments needing oversight
- Multi-user environments with different permissions
- Development with tool approval tracking
### auto-approve
Automatically approve all tools without prompting:
```yaml
toolConfirmation:
mode: auto-approve
```
**When to use:**
- Development where speed is important
- Trusted automation scripts
- Testing scenarios
CLI shortcut: `dexto --auto-approve`
### auto-deny
Block all tool execution:
```yaml
toolConfirmation:
mode: auto-deny
```
**When to use:**
- High-security environments
- Read-only deployments
- Deployments where tool execution must be disabled entirely
## Tool Policies
Fine-grained control over specific tools:
```yaml
toolConfirmation:
mode: manual
toolPolicies:
alwaysAllow:
- internal--ask_user
- internal--read_file
- mcp--filesystem--read_file
alwaysDeny:
- mcp--filesystem--delete_file
- mcp--git--push
```
**Tool name format:**
- Internal tools: `internal--<tool_name>`
- MCP tools: `mcp--<server_name>--<tool_name>`
**Precedence rules:**
1. `alwaysDeny` takes precedence over `alwaysAllow`
2. Tool policies override confirmation mode
3. Both lists default to empty arrays
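The precedence rules can be pictured as a small decision function. This is an illustrative sketch only; `decide` and its types are invented for this example and are not part of the Dexto API:
```typescript
type Mode = 'manual' | 'auto-approve' | 'auto-deny';
type Decision = 'allow' | 'deny' | 'prompt';

// Sketch of the documented precedence:
// alwaysDeny > alwaysAllow > confirmation mode.
function decide(
  tool: string,
  mode: Mode,
  policies: { alwaysAllow: string[]; alwaysDeny: string[] }
): Decision {
  if (policies.alwaysDeny.includes(tool)) return 'deny';
  if (policies.alwaysAllow.includes(tool)) return 'allow';
  if (mode === 'auto-approve') return 'allow';
  if (mode === 'auto-deny') return 'deny';
  return 'prompt'; // manual mode falls through to user confirmation
}
```
Note how a tool listed in both arrays is denied, and a tool in `alwaysAllow` runs even under `auto-deny` mode.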
## Storage Options
### storage (Default)
Approvals persisted across sessions:
```yaml
toolConfirmation:
allowedToolsStorage: storage
```
**Pros:** Convenient - approve once, use forever
**Cons:** Less secure - approvals persist until cleared
### memory
Approvals cleared when session ends:
```yaml
toolConfirmation:
allowedToolsStorage: memory
```
**Pros:** More secure - no persistent approvals
**Cons:** Need to re-approve in each session
## Session-Aware Approvals
Approvals can be scoped to specific sessions or applied globally:
**Session-scoped:** Only applies to one conversation
**Global:** Applies to all sessions
The system checks: session-specific → global → deny
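The lookup order can be sketched as follows; the `ApprovalStore` shape is hypothetical and exists only to illustrate the session-then-global-then-deny check:
```typescript
// Hypothetical storage shape: approvals remembered per session plus a global set.
interface ApprovalStore {
  session: Map<string, Set<string>>; // sessionId -> approved tool names
  global: Set<string>;
}

// Check approvals in the documented order: session-specific first,
// then global, then fall back to denied (requiring fresh confirmation).
function isApproved(store: ApprovalStore, sessionId: string, tool: string): boolean {
  if (store.session.get(sessionId)?.has(tool)) return true;
  if (store.global.has(tool)) return true;
  return false;
}
```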
## Configuration Examples
### Development Environment
```yaml
toolConfirmation:
mode: auto-approve
allowedToolsStorage: memory
toolPolicies:
alwaysDeny:
- internal--bash_exec--rm -rf*
```
### Production Environment
```yaml
toolConfirmation:
mode: manual
timeout: 60000
allowedToolsStorage: storage
toolPolicies:
alwaysAllow:
- internal--ask_user
- internal--read_file
alwaysDeny:
- mcp--filesystem--delete_file
- mcp--git--push
```
### High-Security Environment
```yaml
toolConfirmation:
mode: manual
allowedToolsStorage: memory
toolPolicies:
alwaysAllow: []
alwaysDeny:
- mcp--filesystem--write_file
- mcp--filesystem--delete_file
- internal--bash_exec
```
## Manual Mode Requirements
Manual mode requires UI integration to prompt the user for approvals:
- **CLI Mode**: Interactive prompts in the terminal
- **Web/Server Mode**: Approval dialogs in the WebUI
- **Custom Integration**: Implement your own approval handler via `agent.setApprovalHandler()`
The system will wait for user input up to the configured timeout, then auto-deny if no response is received.
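The timeout behavior can be sketched as a `Promise.race` between the user's answer and a deny-on-timeout timer. This is a simplified illustration, not Dexto's implementation:
```typescript
// Race a pending user answer against a timer that resolves to "deny".
function withTimeout(
  answer: Promise<boolean>,
  timeoutMs: number
): Promise<boolean> {
  const timer = new Promise<boolean>((resolve) =>
    setTimeout(() => resolve(false), timeoutMs) // no response -> auto-deny
  );
  return Promise.race([answer, timer]);
}
```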
## Approval Handlers
Approval handlers control how your application prompts for and receives user decisions about tool execution.
### Built-in Options
**Auto modes**: No handler needed - `auto-approve` and `auto-deny` modes handle approvals automatically without requiring a handler implementation.
**Manual handler for server/API mode**: Use `createManualApprovalHandler` from `@dexto/server` when building web applications. This handler coordinates approvals between backend and frontend via event bus:
```typescript
import { createManualApprovalHandler } from '@dexto/server';
const handler = createManualApprovalHandler(
agent.agentEventBus,
60000 // timeout in ms
);
agent.setApprovalHandler(handler);
```
### Custom Handlers
For CLI tools, desktop apps, or custom integrations, implement your own handler:
```typescript
import { ApprovalStatus, DenialReason } from '@dexto/core';
agent.setApprovalHandler(async (request) => {
// request contains: approvalId, type, metadata (toolName, args, etc.)
const userChoice = await promptUser(
`Allow ${request.metadata.toolName}?`
);
return {
approvalId: request.approvalId,
status: userChoice ? ApprovalStatus.APPROVED : ApprovalStatus.DENIED,
reason: userChoice ? undefined : DenialReason.USER_DENIED,
};
});
```
**Common use cases for custom handlers:**
- CLI tools (readline, inquirer, prompts)
- Desktop apps (native dialogs, Electron)
- Policy-based approval (check against rules)
- External integrations (Slack, PagerDuty)
- Audit logging wrappers
## Best Practices
1. **Use manual mode in production** - Maintain oversight and control
2. **Set reasonable timeouts** - Balance security with user experience
3. **Enable read-only tools** - Allow safe operations without confirmation
4. **Block destructive operations** - Use `alwaysDeny` for dangerous tools
5. **Use memory storage for sensitive environments** - Don't persist approvals
6. **Test policies** - Verify tool policies work as expected
## Common Use Cases
| Scenario | Configuration |
|----------|--------------|
| **Development** | auto-approve + memory storage |
| **Production** | manual + storage + policies |
| **CI/CD** | auto-deny (no tool execution) |
| **Read-only** | manual + alwaysAllow read operations |
| **High-security** | manual + memory storage + strict deny list |
## See Also
- [agent.yml Reference → Tool Confirmation](./agent-yml.md#tool-confirmation) - Complete field documentation
- [Internal Tools](./internalTools.md) - Built-in Dexto tools
- [MCP Configuration](./mcpConfiguration.md) - External MCP tools
- [Storage Configuration](./storage.md) - Persistent approval storage

---
sidebar_position: 10
title: "Custom Slash Commands"
---
# Custom Slash Commands
Custom slash commands (also called File Prompts) let you create reusable prompt templates that work like shortcuts in the Dexto CLI and Web UI. Think of them as your personal command library that you can invoke with a simple `/command-name` syntax.
## What are Custom Slash Commands?
Markdown files with prompt templates that support:
- **Positional arguments** `$1`, `$2`, etc. for structured inputs
- **Free-form content** `$ARGUMENTS` for flexible text
- **Local or global scope** Project-specific or user-wide
- **Auto-discovery** Loaded automatically on startup
## Creating a Command
Create a `.md` file with frontmatter and your prompt template:
```markdown
---
description: Translate text between languages
argument-hint: [from-lang] [to-lang] [text]
---
Translate from $1 to $2:
$3
```
**Save locations:**
- **Local** (project-specific): `<your-project>/commands/translate.md`
- **Global** (available everywhere): `~/.dexto/commands/translate.md`
## Using Commands
**In Web UI:**
Type `/` to discover and invoke your custom commands.
**In CLI:**
```bash
dexto
> /translate english spanish "Hello world"
```
This expands to:
```
Translate from english to spanish:
Hello world
```
## Frontmatter Fields
| Field | Required | Description |
|-------|----------|-------------|
| `description` | ✅ Yes | Brief description shown in command list |
| `argument-hint` | ⚠️ Recommended | Argument names for UI hints like `[style] [length?]` |
| `name` | ❌ Optional | Override the filename as command name |
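To illustrate how these fields are read, here is a deliberately simplified frontmatter parser. Dexto's actual loader uses full YAML parsing; this sketch handles only flat `key: value` pairs:
```typescript
// Split a command file into frontmatter metadata and the prompt body.
// Simplified for illustration: flat `key: value` frontmatter only.
function parseCommandFile(md: string): { meta: Record<string, string>; body: string } {
  const match = md.match(/^---\n([\s\S]*?)\n---\n?([\s\S]*)$/);
  if (!match) return { meta: {}, body: md };
  const [, frontmatter = '', rest = ''] = match;
  const meta: Record<string, string> = {};
  for (const line of frontmatter.split('\n')) {
    const i = line.indexOf(':');
    if (i > 0) meta[line.slice(0, i).trim()] = line.slice(i + 1).trim();
  }
  return { meta, body: rest.trim() };
}
```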
## Placeholder Types
**Positional (`$1`-`$9`):**
```markdown
---
description: Code review comment
argument-hint: [file] [line] [severity]
---
**Code Review for $1 (Line $2)**
Severity: $3
```
**Free-form (`$ARGUMENTS`):**
```markdown
---
description: Improve any text
---
Please improve the following text:
$ARGUMENTS
```
**Mixed approach:**
```markdown
---
description: Analyze code with focus
argument-hint: [language] [focus]
---
Analyze this $1 code focusing on: $2
$ARGUMENTS
```
**Escape literal dollar signs with `$$`:**
```markdown
The cost is $$100 per month.
```
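The expansion rules above can be sketched as a single replace pass. This is an illustrative sketch; Dexto's actual expansion may differ in edge cases (for example, how `$ARGUMENTS` interacts with already-consumed positional arguments):
```typescript
// Expand $1-$9, $ARGUMENTS, and $$ escapes in a command template.
function expandTemplate(template: string, args: string[]): string {
  return template.replace(/\$(\$|ARGUMENTS|[1-9])/g, (_, token) => {
    if (token === '$') return '$';            // $$ -> literal dollar sign
    if (token === 'ARGUMENTS') return args.join(' ');
    return args[Number(token) - 1] ?? '';     // $1-$9 -> positional args
  });
}
```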
## Example: Git Commit Message
**File**: `commands/commit-msg.md`
```markdown
---
description: Generate semantic commit message
argument-hint: [type]
---
Generate a commit message of type "$1" for these changes:
$ARGUMENTS
Follow conventional commits format. Be concise and descriptive.
```
**Usage**:
```bash
/commit-msg feat "Added user authentication with OAuth2"
```
## Viewing Available Commands
**In Web UI:** Type `/` to see all commands.
**In CLI:**
```bash
> /prompts
```
Shows commands from:
- Built-in starter prompts
- Local `commands/` directory
- Global `~/.dexto/commands` directory
- Connected MCP server prompts
## Best Practices
**DO:**
- Use descriptive names (`analyze-performance` not `analyze`)
- Add clear descriptions for discoverability
- Use `argument-hint` for inline hints
- Use kebab-case filenames (`my-command.md`)
**DON'T:**
- Use spaces in filenames (breaks resolution)
- Make overly complex prompts (split into multiple commands)
- Forget the `description` field (required to appear)
## How It Works
Dexto's `ConfigPromptProvider` loads prompts from your agent configuration (both inline and file-based). For file-based prompts, it parses markdown files with frontmatter and registers them as slash commands. When a command is invoked, its placeholders are expanded with your arguments and the resulting prompt is sent to the LLM.
## Troubleshooting
**Command doesn't appear:**
- File must end with `.md`
- Valid YAML frontmatter required
- `description` field must be present
- Must be in `commands/` or `~/.dexto/commands`
**Arguments not expanding:**
- Use `$1`-`$9` or `$ARGUMENTS` only
- Match argument order to `argument-hint`
- Use `$$` for literal dollar signs
## See Also
- [CLI Guide](./cli/overview) - Interactive commands and options
- [MCP Prompts](../mcp/prompts) - Prompts from external MCP servers
- [Agent Configuration](./configuring-dexto/overview) - Customize agent behavior

---
sidebar_position: 9
---
# Deployment Guide
Deploy Dexto agents using Docker for local or production environments.
## Docker Deployment
### Quick Start
1. **Build the Docker image**
```bash
docker build -t dexto .
```
2. **Create environment file**
```bash
# .env
OPENAI_API_KEY=your_openai_api_key
ANTHROPIC_API_KEY=your_anthropic_api_key
# Add other API keys as needed
```
3. **Run the container**
```bash
docker run --env-file .env -p 3001:3001 dexto
```
Your Dexto server will be available at `http://localhost:3001` with:
- ✅ SQLite database connected
- ✅ MCP servers (filesystem & puppeteer) connected
- ✅ REST API + SSE streaming endpoints available
### Port Configuration
By default, Dexto server mode runs on port 3001 and web mode on port 3000. Customize the port using the `PORT` environment variable or `--port` flag:
```bash
# Using environment variable
docker run --env-file .env -e PORT=8080 -p 8080:8080 dexto
# Using CLI flag (overrides the default container command)
docker run --env-file .env -p 8080:8080 dexto --port 8080
```
```bash
# Web mode with custom port (serves both UI and API)
docker run --env-file .env -p 3000:3000 dexto --port 3000
```
### Background Mode
Run Dexto in detached mode:
```bash
# Start in background
docker run -d --name dexto-server --env-file .env -p 3001:3001 dexto
# View logs
docker logs -f dexto-server
# Stop server
docker stop dexto-server
```
### Docker Compose
For easier management:
```yaml
# docker-compose.yml
version: '3.8'
services:
dexto:
build: .
ports:
- "3001:3001"
env_file:
- .env
volumes:
- dexto_data:/app/.dexto
restart: unless-stopped
volumes:
dexto_data:
```
Run with:
```bash
docker compose up --build
```
## Production Setup
### Environment Variables
```bash
# Production environment variables
NODE_ENV=production
PORT=3001
CONFIG_FILE=/app/configuration/dexto.yml
```
### Persistent Storage
Mount a volume for persistent data:
```bash
docker run -d \
--name dexto-server \
--env-file .env \
-p 3001:3001 \
-v dexto_data:/app/.dexto \
dexto
```
### Resource Limits
Set memory and CPU limits:
```bash
docker run -d \
--name dexto-server \
--env-file .env \
--memory=1g \
--cpus=1 \
-p 3001:3001 \
dexto
```
## API Endpoints
Once deployed, your Dexto server provides:
### REST API
- `POST /api/message` - Send async message
- `POST /api/message-sync` - Send sync message
- `POST /api/reset` - Reset conversation
- `GET /api/mcp/servers` - List MCP servers
- `GET /health` - Health check
- And many more for sessions, LLM management, agents, webhooks, etc.
**See the complete [REST API Documentation](/api/rest/)** for all available endpoints.
### Server-Sent Events (SSE)
- Real-time events and streaming responses
- Connect to `http://localhost:3001/api/message-stream`
**See the [SDK Events Reference](/api/sdk/events)** for event types and usage.
## Next Steps
- **[Dexto SDK Guide](./dexto-sdk.md)** - Integrate Dexto into your application's codebase or bundle it into custom builds
- **[API Reference](/api)** - Complete API documentation
For more detailed information on configuring agents, refer to the [Dexto Configuration Guide](./configuring-dexto/overview.md).
## Hosting Options

---
sidebar_position: 8
title: "Using Dexto Agents in Cursor"
sidebar_label: "Dexto Agents in Cursor"
description: Connect Dexto agents to Cursor via the Model Context Protocol (MCP).
---
# Using Dexto Agents in Cursor
Cursor ships with native MCP support, which means you can talk to your Dexto agents directly inside the editor. This guide walks through the minimal configuration required and highlights a few power tips for customizing the experience.
## Prerequisites
- Install the Dexto CLI globally (`pnpm install -g dexto` or `npm install -g dexto`)
- Run `dexto` at least once so the setup flow can capture your provider credentials. Dexto stores secrets in `~/.dexto/.env`, so you no longer need to pass API keys through environment variables.
## Configure `.cursor/mcp.json`
Cursor looks for MCP definitions in `.cursor/mcp.json` within your project root. Add an entry that launches Dexto with your preferred agent in MCP mode.
### Use an in-built agent
```json title=".cursor/mcp.json"
{
"mcpServers": {
"dexto": {
"command": "dexto",
"args": ["--mode", "mcp", "--agent", "music-agent", "--auto-approve"]
}
}
}
```
```json title=".cursor/mcp.json (just dexto section)"
"dexto": {
"command": "dexto",
"args": ["--mode", "mcp", "--agent", "music-agent", "--auto-approve"]
}
```
Replace `music-agent` with any of the agents listed by `dexto list-agents`.
### Expose a custom agent
Point Cursor at a custom agent file to tailor the available tools:
```json title=".cursor/mcp.json"
{
"mcpServers": {
"dexto": {
"command": "npx",
"args": [
"-y", "dexto", "--mode", "mcp", "--agent", "<path_to_your_custom_agent.yml>"
]
}
}
}
```
```json title=".cursor/mcp.json (just dexto section)"
"dexto": {
"command": "npx",
"args": [
"-y", "dexto", "--mode", "mcp", "--agent", "<path_to_your_custom_agent.yml>"
]
}
```
After editing, Cursor automatically connects to Dexto and exposes the tools defined by your agent (filesystem browsing, web search, custom MCP servers, etc.).
## Working with the agent
Once connected, use Cursor's MCP panel or chat to run tools such as `chat_with_agent`:
- **Code improvements:** “Ask Dexto agent to refactor the highlighted function for performance.”
- **Project analysis:** “Ask Dexto agent to explain the current architecture.”
- **Web research:** “Ask Dexto agent to find the latest React 19 release notes.”
<p class="lightbox-gallery">
<a href="#cursor-dexto-screenshot" class="lightbox-thumb">
<img src="/img/cursor/dexto-agent-cursor.png" alt="Cursor running a Dexto MCP agent" />
</a>
</p>
<div id="cursor-dexto-screenshot" class="lightbox-target">
<img src="/img/cursor/dexto-agent-cursor.png" alt="Cursor running a Dexto MCP agent" />
<a class="lightbox-close" href="#"></a>
</div>
Combine this with your own agent configuration to enable domain-specific workflows—everything from documentation search to infrastructure automation.
## Troubleshooting
- **Credentials not found:** rerun `dexto setup` to enter provider keys; Dexto persists them inside `~/.dexto`.
- **Need verbose logs:** start the MCP server yourself with `DEXTO_LOG_LEVEL=debug dexto --mode mcp` before launching Cursor.
For more detail on other MCP transports and remote deployments, see [Using Dexto as an MCP Server](../mcp/dexto-as-mcp-server.md).

---
sidebar_position: 5
title: "Dexto Agent SDK Guide"
---
import ExpandableMermaid from '@site/src/components/ExpandableMermaid';
# Dexto Agent SDK Guide
Welcome to the Dexto Agent SDK guide for TypeScript. This guide provides everything you need to build high-quality AI applications with Dexto.
Whether you're creating standalone agents, integrating with existing applications, or building custom AI workflows, the SDK offers a flexible and robust set of tools.
## Key Features
- **Full TypeScript Support**: Strong typing and editor autocomplete for a smoother development experience.
## Core Concepts
The SDK is built around a few core concepts:
- **DextoAgent**: The main class for creating and managing agents.
- **MCPManager**: A utility for managing MCP server connections.
- **LLMService**: A service for interacting with large language models.
- **StorageBackends**: A set of backends for persisting agent data.
## Example Usage
Here's a quick example of how to create a simple agent that uses the OpenAI API:
```typescript
import { DextoAgent } from '@dexto/core';
const agent = new DextoAgent({
llm: {
provider: 'openai',
model: 'gpt-5',
apiKey: process.env.OPENAI_API_KEY,
}
});
await agent.start();
// Create a session for the conversation
const session = await agent.createSession();
// Use generate() for simple request/response
const response = await agent.generate('Hello, world!', session.id);
console.log(response.content);
await agent.stop();
```
For more detailed examples, see the [Examples](/examples/intro) section.
## Overview
The Dexto SDK provides a complete TypeScript library for building AI agents with MCP (Model Context Protocol) integration. It offers both high-level agent abstractions and low-level utilities for maximum flexibility.
### When to Use the SDK vs REST API
**Use the Dexto SDK when:**
- Building TypeScript applications
- Need real-time event handling
- Want type safety and IDE support
- Require complex session management
- Building long-running applications
**Use the REST API when:**
- Working in other languages
- Building simple integrations
- Prefer stateless interactions
- Working with webhooks or serverless functions
## Installation
```bash
npm install dexto
```
## Quick Start
### Basic Agent Setup
```typescript
import { DextoAgent } from '@dexto/core';
// Create agent with minimal configuration
const agent = new DextoAgent({
llm: {
provider: 'openai',
model: 'gpt-5',
apiKey: process.env.OPENAI_API_KEY
}
});
await agent.start();
// Create a session and start a conversation
const session = await agent.createSession();
const response = await agent.generate('Hello! What can you help me with?', session.id);
console.log(response.content);
```
### Adding MCP Tools
```typescript
const agent = new DextoAgent({
llm: {
provider: 'openai',
model: 'gpt-5',
apiKey: process.env.OPENAI_API_KEY
},
toolConfirmation: { mode: 'auto-approve' },
mcpServers: {
filesystem: {
type: 'stdio',
command: 'npx',
args: ['-y', '@modelcontextprotocol/server-filesystem', '.']
},
web: {
type: 'stdio',
command: 'npx',
args: ['-y', '@modelcontextprotocol/server-brave-search']
}
}
});
await agent.start();
// Create session and use the agent with filesystem and web search tools
const session = await agent.createSession();
const response = await agent.generate(
'List the files in this directory and search for recent AI news',
session.id
);
console.log(response.content);
```
## Core Concepts
### Agents vs Sessions
- **Agent**: The main AI system with configuration, tools, and state management
- **Session**: Individual conversation threads within an agent
```typescript
// Create an agent (one per application typically)
const agent = new DextoAgent(config);
await agent.start();
// Create multiple sessions for different conversations
const userSession = await agent.createSession('user-123');
const adminSession = await agent.createSession('admin-456');
// Each session maintains separate conversation history
await agent.generate('Help me with my account', userSession.id);
await agent.generate('Show me system metrics', adminSession.id);
```
### Multimodal Content
Send text, images, and files using the `ContentPart[]` format:
```typescript
import { ContentPart } from '@dexto/core';
const session = await agent.createSession();
// Simple text (string shorthand)
await agent.generate('What is TypeScript?', session.id);
// Image from URL (auto-detected)
await agent.generate([
{ type: 'text', text: 'Describe this image' },
{ type: 'image', image: 'https://example.com/photo.jpg' }
], session.id);
// Image from base64
await agent.generate([
{ type: 'text', text: 'Describe this image' },
{ type: 'image', image: base64ImageData, mimeType: 'image/png' }
], session.id);
// File from URL
await agent.generate([
{ type: 'text', text: 'Summarize this document' },
{ type: 'file', data: 'https://example.com/report.pdf', mimeType: 'application/pdf' }
], session.id);
// File from base64
await agent.generate([
{ type: 'text', text: 'Summarize this document' },
{ type: 'file', data: base64PdfData, mimeType: 'application/pdf', filename: 'report.pdf' }
], session.id);
// Multiple attachments
await agent.generate([
{ type: 'text', text: 'Compare these two images' },
{ type: 'image', image: 'https://example.com/image1.png' },
{ type: 'image', image: 'https://example.com/image2.jpg' }
], session.id);
```
### Streaming Responses
Use `stream()` for real-time UIs that display text as it arrives:
```typescript
const session = await agent.createSession();
for await (const event of await agent.stream('Write a short story', session.id)) {
switch (event.name) {
case 'llm:thinking':
console.log('Thinking...');
break;
case 'llm:chunk':
process.stdout.write(event.content); // Stream text in real-time
break;
case 'llm:tool-call':
console.log(`\n[Using tool: ${event.toolName}]`);
break;
case 'llm:response':
console.log(`\n\nTotal tokens: ${event.tokenUsage?.totalTokens}`);
break;
}
}
// Streaming with multimodal content
for await (const event of await agent.stream([
{ type: 'text', text: 'Describe this image in detail' },
{ type: 'image', image: base64Image, mimeType: 'image/png' }
], session.id)) {
if (event.name === 'llm:chunk') {
process.stdout.write(event.content);
}
}
```
### Event-Driven Architecture
The SDK provides real-time events for monitoring and integration:
```typescript
// Listen to agent-wide events
agent.agentEventBus.on('mcp:server-connected', (data) => {
console.log(`✅ Connected to ${data.name}`);
});
// Listen to conversation events
agent.agentEventBus.on('llm:thinking', (data) => {
console.log(`🤔 Agent thinking... (session: ${data.sessionId})`);
});
agent.agentEventBus.on('llm:tool-call', (data) => {
console.log(`🔧 Using tool: ${data.toolName}`);
});
```
## Common Patterns
### Multi-User Chat Application
<ExpandableMermaid title="Multi-User Chat Flow">
```mermaid
sequenceDiagram
participant User1 as User A
participant ChatApp as Chat Application
participant Agent as DextoAgent
User1->>ChatApp: handleUserMessage
ChatApp->>ChatApp: Get or create session
ChatApp->>Agent: createSession (if new)
ChatApp->>Agent: generate(message, sessionId)
Agent->>Agent: Process message
Agent-->>ChatApp: Response
ChatApp-->>User1: broadcastToUser
```
</ExpandableMermaid>
```typescript
import { DextoAgent } from '@dexto/core';
class ChatApplication {
private agent: DextoAgent;
private userSessions = new Map<string, string>();
async initialize() {
this.agent = new DextoAgent({
llm: { provider: 'openai', model: 'gpt-5', apiKey: process.env.OPENAI_API_KEY },
toolConfirmation: { mode: 'auto-approve' },
mcpServers: { /* your tools */ }
});
await this.agent.start();
// Set up event monitoring
this.agent.agentEventBus.on('llm:response', (data) => {
this.broadcastToUser(data.sessionId, data.content);
});
}
async handleUserMessage(userId: string, message: string) {
// Get or create session for user
let sessionId = this.userSessions.get(userId);
if (!sessionId) {
const session = await this.agent.createSession(`user-${userId}`);
sessionId = session.id;
this.userSessions.set(userId, sessionId);
}
// Process message using generate()
const response = await this.agent.generate(message, sessionId);
return response.content;
}
private broadcastToUser(sessionId: string, message: string) {
// Find user and send response via SSE, etc.
}
}
```
### Dynamic Tool Management
```typescript
class AdaptiveAgent {
private agent: DextoAgent;
async initialize() {
this.agent = new DextoAgent(baseConfig);
await this.agent.start();
}
async addCapability(name: string, serverConfig: McpServerConfig) {
try {
await this.agent.addMcpServer(name, serverConfig);
console.log(`✅ Added ${name} capability`);
} catch (error) {
console.error(`❌ Failed to add ${name}:`, error);
}
}
async removeCapability(name: string) {
await this.agent.removeMcpServer(name);
console.log(`🗑️ Removed ${name} capability`);
}
async listCapabilities() {
const tools = await this.agent.getAllMcpTools();
return Object.keys(tools);
}
}
```
### Session Management with Persistence
<ExpandableMermaid title="Session Management Flow">
```mermaid
flowchart TD
A[resumeConversation called] --> B{Session exists?}
B -->|Yes| C[Load existing session]
B -->|No| D[Create new session]
C --> E[Return sessionId, history]
D --> F[Return sessionId, history: null]
```
</ExpandableMermaid>
```typescript
class PersistentChatBot {
private agent: DextoAgent;
async initialize() {
this.agent = new DextoAgent({
llm: { /* config */ },
storage: {
cache: { type: 'redis', url: 'redis://localhost:6379' },
database: { type: 'postgresql', url: process.env.DATABASE_URL }
}
});
await this.agent.start();
}
async resumeConversation(userId: string) {
const sessionId = `user-${userId}`;
// Check if session exists
const sessions = await this.agent.listSessions();
if (sessions.includes(sessionId)) {
// Retrieve existing session history
const history = await this.agent.getSessionHistory(sessionId);
return { sessionId, history };
} else {
// Create new session
const session = await this.agent.createSession(sessionId);
return { sessionId: session.id, history: null };
}
}
async chat(userId: string, message: string) {
const sessionId = `user-${userId}`;
// Always pass session ID explicitly
const response = await this.agent.generate(message, sessionId);
return response.content;
}
}
```
## Configuration Options
### LLM Providers
```typescript
// OpenAI
const openaiConfig = {
provider: 'openai',
model: 'gpt-5',
apiKey: process.env.OPENAI_API_KEY,
temperature: 0.7,
maxOutputTokens: 4000
};
// Anthropic
const anthropicConfig = {
provider: 'anthropic',
model: 'claude-sonnet-4-5-20250929',
apiKey: process.env.ANTHROPIC_API_KEY,
maxIterations: 5
};
// Cohere
const cohereConfig = {
provider: 'cohere',
model: 'command-a-03-2025',
apiKey: process.env.COHERE_API_KEY,
temperature: 0.3
};
// Local/Custom OpenAI-compatible
const localConfig = {
provider: 'openai',
model: 'llama-3.1-70b',
apiKey: 'not-needed',
baseURL: 'http://localhost:8080/v1'
};
```
### Storage Backends
```typescript
// In-memory (development)
const memoryStorage = {
cache: { type: 'in-memory' },
database: { type: 'in-memory' }
};
// Production with Redis + PostgreSQL
const productionStorage = {
cache: {
type: 'redis',
url: 'redis://localhost:6379'
},
database: {
type: 'postgresql',
url: process.env.DATABASE_URL
}
};
```
## Error Handling
### Graceful Degradation
```typescript
const agent = new DextoAgent(config);
await agent.start();
// Handle MCP connection failures
agent.agentEventBus.on('mcp:server-connected', (data) => {
if (!data.success) {
console.warn(`⚠️ ${data.name} unavailable: ${data.error}`);
// Continue without this capability
}
});
// Handle LLM errors
agent.agentEventBus.on('llm:error', (data) => {
if (data.recoverable) {
console.log('🔄 Retrying request...');
} else {
console.error('💥 Fatal error:', data.error);
// Implement fallback or user notification
}
});
```
### Validation and Fallbacks
```typescript
try {
const agent = new DextoAgent({
llm: primaryLLMConfig,
toolConfirmation: { mode: 'auto-approve' },
mcpServers: allServers
});
await agent.start();
} catch (error) {
console.warn('⚠️ Full setup failed, using minimal config');
// Fallback to basic configuration
const agent = new DextoAgent({
llm: fallbackLLMConfig
// No MCP servers in fallback mode
});
await agent.start();
}
```
## Best Practices
### 1. Resource Management
```typescript
// Proper cleanup
const agent = new DextoAgent(config);
await agent.start();
process.on('SIGTERM', async () => {
await agent.stop();
process.exit(0);
});
```
### 2. Session Lifecycle
```typescript
// Set session TTL to manage memory usage (chat history preserved in storage)
const agent = new DextoAgent({
// ... other config
sessions: {
maxSessions: 1000,
sessionTTL: 24 * 60 * 60 * 1000 // 24 hours
}
});
await agent.start();
```
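The effect of `sessionTTL` can be pictured with a small sketch. This is a hypothetical helper for illustration only, not part of the Dexto SDK: sessions idle longer than the TTL become candidates for eviction from the in-memory pool, while their chat history remains in storage.

```typescript
// Hypothetical illustration of TTL-based session eviction.
interface SessionMeta {
  id: string;
  lastActiveAt: number; // epoch millis
}

// Returns the IDs of sessions that have been idle longer than ttlMs.
function expiredSessions(sessions: SessionMeta[], ttlMs: number, now: number): string[] {
  return sessions.filter((s) => now - s.lastActiveAt > ttlMs).map((s) => s.id);
}
```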
### 3. Monitoring and Observability
```typescript
// Log all tool executions
agent.agentEventBus.on('llm:tool-call', (data) => {
console.log(`[${data.sessionId}] Tool: ${data.toolName}`, data.args);
});
agent.agentEventBus.on('llm:tool-result', (data) => {
if (data.success) {
console.log(`[${data.sessionId}] ✅ ${data.toolName} completed`, data.sanitized);
} else {
console.error(`[${data.sessionId}] ❌ ${data.toolName} failed:`, data.rawResult ?? data.sanitized);
}
});
```
## Next Steps
- **[DextoAgent API](/api/sdk/dexto-agent)** - Detailed method documentation
- **[MCP Guide](/docs/mcp/overview)** - Learn about Model Context Protocol
- **[Deployment Guide](/docs/guides/deployment)** - Production deployment strategies
- **[Examples](/examples/intro)** - Complete example applications

---
sidebar_position: 7
title: "Installing Custom Agents"
---
# Installing Custom Agents
Dexto makes it easy to install and use custom AI agents. Whether you're using pre-built agent templates from the registry or creating your own from scratch, this guide covers everything you need to know.
## What are Custom Agents?
Custom agents are specialized AI configurations tailored for specific tasks. They can include:
- **Custom system prompts** - Define the agent's personality and behavior
- **LLM settings** - Choose specific models and providers
- **Tools and MCP servers** - Enable specific capabilities
- **Memory and storage** - Configure how agents remember context
Dexto provides two ways to install custom agents:
1. **From the CLI** - Using the `dexto install` command
2. **From the Web UI** - Using the "+New Agent" button
## Installing from CLI
### Installing Pre-built Agents from Registry
Dexto comes with a curated registry of agent templates ready to use:
```bash
# List available agents
dexto list-agents
# Install a specific agent
dexto install nano-banana-agent
# Install multiple agents at once
dexto install podcast-agent database-agent music-agent
# Install all available agents
dexto install --all
```
**Available agent templates include:**
- `nano-banana-agent` - Image generation and editing with Google Nano Banana
- `podcast-agent` - Podcast generation with multi-speaker TTS
- `database-agent` - SQL queries and database operations
- `image-editor-agent` - Image manipulation and editing
- `music-agent` - Music creation and audio processing
- `pdf-agent` - Document analysis and conversation
- `product-researcher` - Product naming and branding research
- `triage-agent` - Multi-agent customer support routing
- `workflow-builder-agent` - n8n workflow automation
- `product-analysis-agent` - PostHog product analytics
- `gaming-agent` - Play GameBoy games like Pokemon
### Installing from a Local File
You can install an agent from a local YAML configuration file:
```bash
dexto install ./my-agent.yml
```
Dexto will prompt you for metadata:
- **Agent ID** - Unique identifier (lowercase, hyphens only)
- **Agent Name** - Display name shown in UI
- **Description** - Brief description of what the agent does
- **Author** - Your name or organization
- **Tags** - Comma-separated categories
**Example:**
```bash
$ dexto install ./coding-assistant.yml
📝 Custom Agent Installation
Agent name: coding-assistant
Description: A specialized coding assistant with best practices
Author: Your Name
Tags (comma-separated): coding, development, productivity
✅ coding-assistant installed successfully
```
### Installing from a Directory
For more complex agents with multiple files (prompts, tools, resources):
```bash
dexto install ./my-complex-agent/
```
Dexto will ask for:
1. **Metadata** (same as file installation)
2. **Main config file** - Which YAML file is the entry point (e.g., `agent.yml`)
**Directory structure example:**
```
my-complex-agent/
├── agent.yml # Main config (specified during install)
├── prompts/
│ ├── system.txt
│ └── examples.txt
└── tools/
└── custom-tool.js
```
### Installation Options
**Force Reinstall:**
```bash
dexto install nano-banana-agent --force
```
**Skip Preference Injection:**
```bash
dexto install nano-banana-agent --no-inject-preferences
```
By default, Dexto injects your global preferences (like API keys from `~/.dexto/.env`) into installed agents. Use `--no-inject-preferences` to skip this.
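Conceptually, preference injection behaves like an environment merge. The sketch below is a hypothetical illustration of that idea, not Dexto's actual implementation: global values fill in gaps, and agent-specific values win on conflict.

```typescript
// Hypothetical sketch: merge global preferences into an agent's environment.
// Agent-specific values take precedence over global ones.
function injectPreferences(
  agentEnv: Record<string, string>,
  globalEnv: Record<string, string>
): Record<string, string> {
  return { ...globalEnv, ...agentEnv };
}
```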
## Installing from Web UI
The Web UI provides a visual way to create custom agents without writing YAML files.
### Steps to Create an Agent
1. **Open the Web UI:**
```bash
dexto
```
2. **Click "+New Agent"** in the agent selector (top of the interface)
3. **Fill out the form:**
**Basic Information:**
- **Agent ID** - Unique identifier (e.g., `my-research-agent`)
- **Agent Name** - Display name (e.g., `Research Assistant`)
- **Description** - What the agent does
- **Author** - Your name (optional)
- **Tags** - Comma-separated (e.g., `research, analysis, custom`)
**LLM Configuration:**
- **Provider** - Choose from OpenAI, Anthropic, Google, etc.
- **Model** - Select the specific model (e.g., `gpt-5`, `claude-sonnet-4-5`)
- **API Key** - Enter your API key (stored securely in `~/.dexto/.env`)
**System Prompt:**
- Write the agent's instructions and behavior guidelines
4. **Click "Create Agent"**
The agent is immediately available in the agent switcher!
### Web UI Features
- **Visual Configuration** - No YAML syntax required
- **Secure Key Storage** - API keys are stored in `~/.dexto/.env` with environment variable references
- **Instant Availability** - Created agents appear immediately in all Dexto modes
- **Edit Later** - Use the "Edit Agent" button to modify configuration
## Where Agents are Stored
All installed agents (both CLI and Web UI) are stored in:
```
~/.dexto/agents/<agent-id>/
```
**For registry agents:**
```
~/.dexto/agents/nano-banana-agent/
└── nano-banana-agent.yml
```
**For custom agents:**
```
~/.dexto/agents/my-custom-agent/
├── agent.yml # Main configuration
└── .registry-metadata.json # Installation metadata
```
## Using Installed Agents
### In Web UI (Default)
```bash
# Use default agent (opens Web UI)
dexto
# Use specific installed agent (opens Web UI)
dexto --agent nano-banana-agent
```
Use the agent selector dropdown (top of web UI) to switch between installed agents.
### In CLI Mode
```bash
# Use default agent in CLI
dexto --mode cli
# Use specific installed agent in CLI
dexto --agent nano-banana-agent --mode cli
# Switch LLM during CLI session
dexto --mode cli
> /model switch
```
### In Other Modes
```bash
# MCP server with custom agent
dexto --mode mcp --agent coding-assistant
```
## Managing Installed Agents
### List Installed Agents
```bash
# Show all agents (registry + installed)
dexto list-agents
# Show only installed agents
dexto list-agents --installed
# Show detailed information
dexto list-agents --verbose
```
### Find Agent Location
```bash
dexto which nano-banana-agent
# Output: /Users/you/.dexto/agents/nano-banana-agent/nano-banana-agent.yml
```
### Uninstall Agents
```bash
# Uninstall specific agent
dexto uninstall nano-banana-agent
# Uninstall multiple agents
dexto uninstall agent1 agent2 agent3
# Uninstall all agents
dexto uninstall --all
```
## Creating Your Own Agent Template
Want to create a shareable agent template? Here's the structure:
### Single-File Agent
**my-agent.yml:**
```yaml
llm:
provider: anthropic
model: claude-sonnet-4-5-20250929
apiKey: $ANTHROPIC_API_KEY
systemPrompt: |
You are a specialized assistant for data analysis.
Your expertise includes:
- Statistical analysis
- Data visualization recommendations
- Python/R code generation
Always provide clear explanations alongside code.
memory:
enabled: true
maxTokens: 10000
tools:
browserUse:
enabled: true
filesystem:
enabled: true
```
Install with:
```bash
dexto install ./my-agent.yml
```
### Directory-Based Agent
For complex agents with multiple configuration files:
**project-manager-agent/**
```
project-manager-agent/
├── agent.yml # Main config
├── prompts/
│ ├── system.md # System prompt
│ └── examples.md # Few-shot examples
├── mcp/
│ └── mcp-config.yml # MCP server configurations
└── README.md # Documentation
```
**agent.yml:**
```yaml
llm:
provider: openai
model: gpt-5
apiKey: $OPENAI_API_KEY
systemPrompt:
type: file
path: ./prompts/system.md
mcp:
- name: github
command: npx
args: ["-y", "@modelcontextprotocol/server-github"]
env:
GITHUB_TOKEN: $GITHUB_TOKEN
```
Install with:
```bash
dexto install ./project-manager-agent/
# When prompted for "Main config file:", enter: agent.yml
```
## Best Practices
### ✅ DO:
- **Use descriptive agent IDs** - `data-analysis-agent`, not `agent1`
- **Write clear descriptions** - Help users understand the agent's purpose
- **Use environment variables for API keys** - Never hardcode secrets
- **Test thoroughly** - Try the agent in different scenarios
- **Document custom tools** - Explain any special configuration
- **Version your agents** - Use git for custom agent directories
### ❌ DON'T:
- **Don't use spaces in agent IDs** - Use kebab-case: `my-agent`, not `my agent`
- **Don't hardcode API keys** - Use env vars: `$ANTHROPIC_API_KEY`
- **Don't override registry agents** - Choose unique IDs
- **Don't skip descriptions** - They help users find the right agent
- **Don't create overly complex configs** - Keep it simple and focused
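The kebab-case rule for agent IDs can be expressed as a small check. `isValidAgentId` and `toAgentId` below are hypothetical helpers sketched for illustration; Dexto's actual validation may differ.

```typescript
// Accept only kebab-case IDs: lowercase letters, digits, single hyphens.
function isValidAgentId(id: string): boolean {
  return /^[a-z0-9]+(-[a-z0-9]+)*$/.test(id);
}

// Derive a kebab-case ID from a free-form display name.
function toAgentId(name: string): string {
  return name
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '');
}
```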
## Troubleshooting
### Agent doesn't appear after installation
- Check installation completed successfully (look for ✅ message)
- Verify agent exists: `dexto which <agent-id>`
- Restart Dexto if in interactive mode
- Check `~/.dexto/agents/<agent-id>/` directory exists
### "Agent ID already exists" error
- The ID conflicts with a bundled agent or existing custom agent
- Choose a different, unique ID
- To reinstall, use `dexto install --force <agent>`
### API key not working
- Ensure the key is in `~/.dexto/.env`:
```
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
```
- Use the correct env var name in agent config: `$OPENAI_API_KEY`
- Run `dexto setup` to configure keys globally
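The `$OPENAI_API_KEY` style of reference resolves against your environment. A rough sketch of the idea, assuming a simple `$VAR` syntax; Dexto's actual resolution logic may differ:

```typescript
// Hypothetical sketch: replace $VAR references with values from an env map.
// Unknown variables resolve to an empty string here.
function expandEnvRefs(value: string, env: Record<string, string>): string {
  return value.replace(/\$([A-Z_][A-Z0-9_]*)/g, (_match, name: string) => env[name] ?? '');
}
```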
### Custom agent YAML syntax errors
- Validate YAML syntax online or with a linter
- Check indentation (use 2 spaces, not tabs)
- Ensure all required fields are present
- See [agent configuration guide](./configuring-dexto/agent-yml.md) for valid schema
## Next Steps
- Learn about [agent configuration options](./configuring-dexto/agent-yml.md)
- Explore [MCP integration](../mcp/overview.md) for advanced tools
- Check out [system prompt configuration](./configuring-dexto/systemPrompt.md)
- Read about [LLM providers and models](./configuring-dexto/llm.md)

---
sidebar_position: 1
title: "Supported LLM Providers"
---
# Supported LLM Providers & Models
Dexto supports multiple LLM providers out-of-the-box, plus the ability to use any OpenAI SDK-compatible provider. This guide lists all supported providers and their available models.
:::tip Configuration Details
For complete LLM configuration options and YAML reference, see the [agent.yml → LLM Configuration](./configuring-dexto/agent-yml.md#llm-configuration) section.
:::
## Built-in Providers
### OpenAI
```yaml
llm:
provider: openai
model: gpt-5-mini
apiKey: $OPENAI_API_KEY
```
**Supported models:**
- `gpt-5.1-chat-latest`, `gpt-5.1`, `gpt-5.1-codex`, `gpt-5.1-codex-mini`
- `gpt-5`, `gpt-5-mini`, `gpt-5-nano`, `gpt-5-pro`, `gpt-5-codex`
- `gpt-4.1`, `gpt-4.1-mini`, `gpt-4.1-nano`
- `gpt-4o`, `gpt-4o-mini`, `gpt-4o-audio-preview`
- `o4-mini`, `o3`, `o3-mini`, `o1`
**Features:** Function calling, streaming, vision (GPT-4o), JSON mode
---
### Anthropic (Claude)
```yaml
llm:
provider: anthropic
model: claude-sonnet-4-5-20250929
apiKey: $ANTHROPIC_API_KEY
```
**Supported models:**
- `claude-haiku-4-5-20251001`
- `claude-sonnet-4-5-20250929`
- `claude-opus-4-5-20251101`, `claude-opus-4-1-20250805`
- `claude-4-opus-20250514`, `claude-4-sonnet-20250514`
- `claude-3-7-sonnet-20250219`
- `claude-3-5-sonnet-20240620`
- `claude-3-5-haiku-20241022`
**Features:** Large context (200K tokens), advanced tool use, Constitutional AI
---
### Google Gemini
```yaml
llm:
provider: google
model: gemini-2.5-pro
apiKey: $GOOGLE_GENERATIVE_AI_API_KEY
```
**Supported models:**
- `gemini-3-pro-preview`, `gemini-3-pro-image-preview`
- `gemini-2.5-pro` (default), `gemini-2.5-flash`, `gemini-2.5-flash-lite`
- `gemini-2.0-flash`, `gemini-2.0-flash-lite`
**Features:** Multimodal (text/image/video/audio), large context (1M tokens), fast inference
---
### xAI (Grok)
```yaml
llm:
provider: xai
model: grok-4
apiKey: $XAI_API_KEY
```
**Supported models:**
- `grok-4` (default)
- `grok-3`
- `grok-3-mini`
- `grok-code-fast-1`
**Features:** State-of-the-art reasoning, real-time knowledge, strong benchmark performance
---
### Groq
```yaml
llm:
provider: groq
model: llama-3.3-70b-versatile
apiKey: $GROQ_API_KEY
```
**Supported models:**
- `llama-3.3-70b-versatile` (default)
- `gemma-2-9b-it`
- `openai/gpt-oss-20b`
- `openai/gpt-oss-120b`
- `moonshotai/kimi-k2-instruct`
- `meta-llama/llama-4-scout-17b-16e-instruct`
- `meta-llama/llama-4-maverick-17b-128e-instruct`
- `deepseek-r1-distill-llama-70b`
- `qwen/qwen3-32b`
**Features:** Ultra-fast inference, cost-effective, open source models
---
### Cohere
```yaml
llm:
provider: cohere
model: command-a-03-2025
apiKey: $COHERE_API_KEY
```
**Supported models:**
- `command-a-03-2025` (default, 256k context window)
- `command-r-plus`
- `command-r`
- `command-r7b`
**Features:** RAG optimization, tool use, multilingual, conversational AI
---
## Cloud Platform Providers
### Amazon Bedrock
Access Claude, Nova, Llama, Mistral, and more through AWS:
```yaml
llm:
provider: bedrock
model: anthropic.claude-sonnet-4-5-20250929-v1:0
```
**Claude models:**
- `anthropic.claude-sonnet-4-5-20250929-v1:0` (default), `anthropic.claude-haiku-4-5-20251001-v1:0`, `anthropic.claude-opus-4-5-20251101-v1:0`
**Amazon Nova models:**
- `amazon.nova-premier-v1:0`, `amazon.nova-pro-v1:0`, `amazon.nova-lite-v1:0`, `amazon.nova-micro-v1:0`
**Other models:**
- `openai.gpt-oss-120b-1:0`, `openai.gpt-oss-20b-1:0`
- `qwen.qwen3-coder-30b-a3b-v1:0`, `qwen.qwen3-coder-480b-a35b-v1:0`
**Features:** Enterprise security, AWS billing, access to Claude/Nova/GPT-OSS/Qwen
<details>
<summary>Setup Instructions</summary>
1. Create an AWS account and enable Bedrock in your region
2. Request model access in [AWS Console → Bedrock → Model access](https://console.aws.amazon.com/bedrock/home#/modelaccess)
**Option 1: API Key (Recommended for development)**
Generate a Bedrock API key directly from the console - no IAM setup required:
1. Go to [AWS Console → Bedrock → API keys](https://console.aws.amazon.com/bedrock/home#/api-keys)
2. Click "Generate API Key" and copy the key
3. Set environment variables:
```bash
export AWS_REGION="us-east-1"
export AWS_BEARER_TOKEN_BEDROCK="your-api-key"
```
**Option 2: IAM Credentials (Recommended for production)**
1. Create IAM credentials with `bedrock:InvokeModel` permission
2. Set environment variables:
```bash
export AWS_REGION="us-east-1"
export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"
# Optional: for temporary credentials
export AWS_SESSION_TOKEN="your-session-token"
```
**Cross-region inference:** Dexto auto-detects and adds the appropriate region prefix (`eu.` or `us.`) based on your `AWS_REGION`. You can override by using explicit prefixed model IDs (e.g., `eu.anthropic.claude-sonnet-4-5-20250929-v1:0`).
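The auto-prefixing behavior can be sketched roughly as follows. This is a hypothetical helper; Dexto's actual detection logic may cover more regions and edge cases:

```typescript
// Hypothetical sketch of Bedrock cross-region prefixing:
// an explicit "eu." or "us." prefix wins; otherwise derive it from AWS_REGION.
function withRegionPrefix(modelId: string, awsRegion: string): string {
  if (/^(eu|us)\./.test(modelId)) return modelId;
  const prefix = awsRegion.startsWith('eu-') ? 'eu.' : 'us.';
  return prefix + modelId;
}
```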
</details>
:::tip Custom Model IDs
Need a model not in our registry (e.g., new preview models)?
- **CLI**: `/model` → "Add Custom Model" → AWS Bedrock
- **WebUI**: Model picker → "+" → AWS Bedrock
Uses AWS credentials from your environment.
Set `AWS_REGION` and either `AWS_BEARER_TOKEN_BEDROCK` or IAM credentials as explained above.
:::
---
### Google Cloud Vertex AI
Access Google's Gemini and Anthropic's Claude models through Google Cloud Platform:
```yaml
llm:
provider: vertex
model: gemini-2.5-pro
```
**Gemini models:**
- `gemini-3-flash-preview`, `gemini-3-pro-preview` (Preview)
- `gemini-2.5-pro` (default), `gemini-2.5-flash`
- `gemini-2.0-flash`
**Claude models on Vertex:**
- `claude-opus-4-5@20251101`, `claude-sonnet-4-5@20250929`, `claude-haiku-4-5@20251001`
- `claude-opus-4-1@20250805`, `claude-opus-4@20250514`, `claude-sonnet-4@20250514`
- `claude-3-7-sonnet@20250219`, `claude-3-5-sonnet-v2@20241022`, `claude-3-5-haiku@20241022`
**Features:** Enterprise security, unified billing through GCP, access to both Gemini and Claude
**Authentication:** Uses Google Cloud Application Default Credentials (ADC), not API keys.
<details>
<summary>Setup Instructions</summary>
**Option 1: Service Account Key (Recommended for production)**
1. Go to [Google Cloud Console → IAM & Admin → Service Accounts](https://console.cloud.google.com/iam-admin/serviceaccounts)
2. Create a service account with **Vertex AI User** role
3. Create and download a JSON key
4. Set environment variables:
```bash
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account-key.json"
export GOOGLE_VERTEX_PROJECT="your-project-id"
# Optional: defaults to us-central1 for Gemini, us-east5 for Claude
export GOOGLE_VERTEX_LOCATION="us-central1"
```
**Option 2: gcloud CLI (For local development)**
1. Install [Google Cloud CLI](https://cloud.google.com/sdk/docs/install)
2. Run: `gcloud auth application-default login`
3. Set: `export GOOGLE_VERTEX_PROJECT="your-project-id"`
**For Claude models:** Enable them in [Vertex AI Model Garden](https://console.cloud.google.com/vertex-ai/model-garden)
</details>
---
## Gateway Providers
### OpenRouter
Access 100+ models through one API:
```yaml
llm:
provider: openrouter
model: anthropic/claude-sonnet-4-5-20250929
apiKey: $OPENROUTER_API_KEY
```
**Popular models:**
- `anthropic/claude-sonnet-4-5-20250929`
- `meta-llama/llama-3.1-405b-instruct`
- `google/gemini-pro-1.5`
- `mistralai/mistral-large`
**Features:** Single API for 100+ models, automatic model validation, unified billing
**Learn more:** [openrouter.ai](https://openrouter.ai/)
:::tip Adding Models Not in Registry
Can't find your model? Add any OpenRouter model via the custom model wizard:
- **CLI**: `/model` → "Add Custom Model" → OpenRouter
- **WebUI**: Model picker → "+" → OpenRouter
Model IDs are validated against OpenRouter's registry automatically.
:::
---
### LiteLLM
Unified proxy for 100+ LLM providers. Host your own LiteLLM proxy to access multiple providers through a single interface:
```yaml
llm:
provider: litellm
model: gpt-4
apiKey: $LITELLM_API_KEY
baseURL: http://localhost:4000
```
**Features:**
- Single API for OpenAI, Anthropic, AWS Bedrock, Azure, Vertex AI, and more
- Load balancing and fallbacks
- Cost tracking and rate limiting
- Self-hosted for full control
**Model naming:** Model names follow LiteLLM's format (e.g., `gpt-4`, `claude-3-sonnet`, `bedrock/anthropic.claude-v2`)
**Learn more:** [docs.litellm.ai](https://docs.litellm.ai/)
:::tip Adding Custom Models
Your LiteLLM proxy supports more models than our picker shows:
- **CLI**: `/model` → "Add Custom Model" → LiteLLM
- **WebUI**: Model picker → "+" → LiteLLM
Enter any model your proxy supports, plus your proxy URL.
:::
---
### Glama
OpenAI-compatible gateway providing unified access to multiple LLM providers with single billing:
```yaml
llm:
provider: glama
model: openai/gpt-4o
apiKey: $GLAMA_API_KEY
```
**Features:**
- Single API for OpenAI, Anthropic, Google, and more
- Unified billing across providers
- No base URL configuration needed (fixed endpoint)
**Model naming:** Format is `provider/model` (e.g., `openai/gpt-4o`, `anthropic/claude-3-sonnet`)
**Learn more:** [glama.ai](https://glama.ai/)
:::tip Adding Custom Models
Need a model not in our picker?
- **CLI**: `/model` → "Add Custom Model" → Glama
- **WebUI**: Model picker → "+" → Glama
Model IDs use `provider/model` format (e.g., `openai/gpt-4o`).
:::
---
## OpenAI-Compatible Providers
Use any provider that implements the OpenAI SDK interface:
```yaml
llm:
provider: openai-compatible
model: your-custom-model
apiKey: $YOUR_API_KEY
baseURL: https://api.your-provider.com/v1
maxInputTokens: 100000
```
### Local Models
Run models locally using Ollama, LM Studio, or similar:
```yaml
llm:
provider: openai-compatible
model: gemma3n:e2b
apiKey: dummy
baseURL: http://localhost:11434/v1
maxInputTokens: 8000
```
**Popular options:**
- **Ollama** - Easy local model hosting
- **LM Studio** - User-friendly interface
- **vLLM** - High-performance serving
- **TGI** - Hugging Face serving
:::tip Adding Custom Models via CLI/WebUI
Need to add a local model or custom endpoint?
- **CLI**: `/model` → "Add Custom Model" → OpenAI-Compatible
- **WebUI**: Model picker → "+" → OpenAI-Compatible
Enter model name and base URL (e.g., `http://localhost:11434/v1` for Ollama).
:::
---
### Azure OpenAI
```yaml
llm:
provider: openai-compatible
model: gpt-5
apiKey: $AZURE_OPENAI_API_KEY
baseURL: https://your-resource.openai.azure.com/openai/deployments/gpt-5
maxInputTokens: 128000
```
**Notes:** Replace `your-resource` with your Azure resource name. Supports all OpenAI models available in Azure.
---
### Together.ai
```yaml
llm:
provider: openai-compatible
model: meta-llama/Llama-3.2-90B-Vision-Instruct-Turbo
apiKey: $TOGETHER_API_KEY
baseURL: https://api.together.xyz/v1
maxInputTokens: 8000
```
---
### Perplexity
```yaml
llm:
provider: openai-compatible
model: llama-3.1-sonar-huge-128k-online
apiKey: $PERPLEXITY_API_KEY
baseURL: https://api.perplexity.ai
maxInputTokens: 128000
```
**Special feature:** Online models with real-time web search
---
## Choosing the Right Provider
### For Development
- **OpenAI** - Best developer experience and documentation
- **Local models** - Free, private, great for experimentation
### For Production
- **OpenAI** - Reliable, extensive model selection
- **Anthropic** - Safety-critical applications
- **Google** - Multimodal and large context needs
### For Cost Optimization
- **Groq** - Fastest and often cheapest
- **OpenRouter** - Compare prices across providers
- **Local hosting** - No per-token costs
### For Privacy
- **Local models** - Complete data privacy
- **Azure OpenAI** - Enterprise security and compliance
## Environment Variables
Set API keys in your `~/.dexto/.env` file:
```bash
# Built-in providers
OPENAI_API_KEY=your_openai_key
ANTHROPIC_API_KEY=your_anthropic_key
GOOGLE_GENERATIVE_AI_API_KEY=your_google_key
GROQ_API_KEY=your_groq_key
XAI_API_KEY=your_xai_key
COHERE_API_KEY=your_cohere_key
# Google Cloud Vertex AI (uses ADC, not API keys)
GOOGLE_VERTEX_PROJECT=your_gcp_project_id
GOOGLE_VERTEX_LOCATION=us-central1 # Optional
GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json # For service account auth
# Amazon Bedrock
AWS_REGION=us-east-1
AWS_BEARER_TOKEN_BEDROCK=your_bedrock_api_key # Option 1: API key (simplest)
# OR use IAM credentials (Option 2):
AWS_ACCESS_KEY_ID=your_aws_access_key
AWS_SECRET_ACCESS_KEY=your_aws_secret_key
AWS_SESSION_TOKEN=your_session_token # Optional, for temporary credentials
# Gateway providers
OPENROUTER_API_KEY=your_openrouter_key
LITELLM_API_KEY=your_litellm_key
GLAMA_API_KEY=your_glama_key
# OpenAI-compatible providers
TOGETHER_API_KEY=your_together_key
AZURE_OPENAI_API_KEY=your_azure_key
PERPLEXITY_API_KEY=your_perplexity_key
```

---
sidebar_position: 20
title: "Troubleshooting"
---
# Troubleshooting
Common issues and how to resolve them.
## Setup Issues
### API key not working
1. Verify your key is saved in the correct location:
- Global: `~/.dexto/.env`
- Project: `.env` in your project root
2. Check the key has the correct format:
- OpenAI keys start with `sk-`
- Anthropic keys start with `sk-ant-`
- Google keys are alphanumeric
3. Verify the key has correct permissions on the provider's dashboard
4. Run `dexto setup` to re-enter your API key
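The key-format checks above can be sketched in code. Note the check order matters, since `sk-ant-` keys also start with `sk-`. This is a hypothetical helper for illustration, not part of Dexto:

```typescript
// Guess the provider from a key's prefix; check the more specific
// Anthropic prefix before the generic OpenAI one.
function guessProviderFromKey(key: string): string | null {
  if (key.startsWith('sk-ant-')) return 'anthropic';
  if (key.startsWith('sk-')) return 'openai';
  return null; // e.g., Google keys are plain alphanumeric
}
```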
### Provider not found
- Use a supported provider name from the list:
- `google`, `groq`, `openai`, `anthropic`, `xai`, `cohere`
- `openrouter`, `glama`, `litellm`, `openai-compatible`
- `local`, `ollama`, `vertex`, `bedrock`
- Run `dexto setup` to see available providers in the interactive menu
### Local model download fails
1. Check available disk space (models are typically 2-8GB)
2. Ensure you have a stable internet connection
3. Try a smaller model variant
4. Run `dexto setup` and select a different model
### Setup stuck or frozen
- Press `Ctrl+C` to cancel and restart
- Try running with the `--no-interactive` flag: `dexto setup --provider google --model gemini-2.5-pro`
## Runtime Issues
### "No API key configured"
Your provider requires an API key that isn't set up yet.
**Solutions:**
1. Run `dexto setup` to configure interactively
2. Set the environment variable directly:
```bash
# For Google Gemini
export GOOGLE_GENERATIVE_AI_API_KEY=your-key-here
# For OpenAI
export OPENAI_API_KEY=your-key-here
# For Anthropic
export ANTHROPIC_API_KEY=your-key-here
```
### MCP server connection failed
1. Check the MCP server is running
2. Verify the configuration in your agent YAML file
3. Check network connectivity for remote servers
4. Run with `--strict` flag to see detailed connection errors
### Agent not found
1. Check the agent name or path is correct
2. List installed agents: `dexto list-agents --installed`
3. Install the agent: `dexto install <agent-name>`
4. For custom agents, verify the path exists: `dexto --agent ./path/to/agent.yml`
### Rate-limiting errors
You've hit the provider's rate limits.
**Solutions:**
1. Wait a few moments and retry
2. Switch to a model with higher limits
3. Consider upgrading your API plan
4. Use a different provider temporarily
## Common Questions
### How do I change my provider?
Run `dexto setup` to access the settings menu where you can change your provider, model, and default mode.
### How do I update agents?
After updating Dexto, run:
```bash
dexto sync-agents
```
This syncs your installed agents with the latest bundled versions.
### Where are settings stored?
| File | Description |
|------|-------------|
| `~/.dexto/preferences.json` | Global preferences (provider, model, mode) |
| `~/.dexto/agents/` | Installed agent configurations |
| `~/.dexto/.env` | API keys (global) |
| `.env` | API keys (project-specific) |
### How do I reset everything?
```bash
# Reset configuration
dexto setup --force
# Or delete the config directory
rm -rf ~/.dexto
```
### How do I see what model I'm using?
In interactive mode, run `/model current` or `/config` to see your current configuration.
### Can I use multiple providers?
Yes! You can:
- Switch providers with `dexto setup`
- Use different providers per agent (configure in agent YAML)
- Override the model for a single session: `dexto -m gpt-5`
## Getting Help
If your issue isn't covered here:
1. Check the [full documentation](/)
2. Search [GitHub Issues](https://github.com/truffle-ai/dexto/issues)
3. Open a new issue with:
- Dexto version (`dexto --version`)
- Operating system
- Steps to reproduce
- Error messages (if any)

---
sidebar_position: 4
title: "Web UI Guide"
---
# Web UI
## Overview
The Dexto web UI is the easiest way to test out different LLMs, MCP servers, prompts, and more!
Once you're satisfied with a specific combination, save it as a **reusable** AI agent built with Dexto, and deploy the agent anywhere.
All this is possible because Dexto treats any valid config file as a reusable AI agent.
The Dexto web UI also stores your conversation history locally, so it remembers your past conversations!
## Get started
**Start dexto web UI:**
```bash
dexto
```
This opens the Web UI at [http://localhost:3000](http://localhost:3000) in your browser (web is the default mode).
**Use a different port:**
```bash
dexto --port 3333
```
This starts the server on port 3333, serving both the Web UI and API.
## Conversation storage
When installed as a global CLI, Dexto stores conversation history in the `~/.dexto` folder by default.
In development mode, the storage location defaults to `<path_to_dexto_project_dir>/.dexto`.
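The two storage locations above follow a simple rule, sketched here as a hypothetical helper (not Dexto's actual code):

```typescript
import * as os from 'node:os';
import * as path from 'node:path';

// Hypothetical sketch: resolve the storage directory to the project's
// .dexto folder in development mode, else to ~/.dexto.
function dextoStorageDir(projectDir?: string): string {
  return projectDir
    ? path.join(projectDir, '.dexto')
    : path.join(os.homedir(), '.dexto');
}
```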

{
"label": "MCP",
"position": 4,
"link": {
"type": "generated-index",
"title": "Model Context Protocol",
"description": "Connect Dexto to MCP servers, manage tools, and expose agents over MCP."
}
}

---
sidebar_position: 7
title: "Dexto Agents as MCP Servers"
sidebar_label: "Dexto Agents as MCP Servers"
---
# Dexto Agents as MCP Servers
Any Dexto agent can also act as a Model Context Protocol (MCP) server, enabling external tools like Cursor, Claude Desktop, or any other MCP client to connect and interact with your Dexto agent.
This means you can even connect one Dexto agent to another Dexto agent!
You can use any of our pre-installed Dexto agents (music-agent, database-agent, podcast-agent, etc.), or your own YAML config file.
Check out our [Configuration guide](../guides/configuring-dexto/overview.md) to learn how to configure your own agent.
## Prerequisites
- Install the Dexto CLI globally (`pnpm install -g dexto` or `npm install -g dexto`)
- Run `dexto` at least once so the setup flow can capture your provider credentials. Dexto stores secrets in `~/.dexto/.env`, so you no longer need to pass API keys through environment variables.
## Local MCP Server Guide
### Start the MCP server
Run Dexto in MCP mode to expose your agent over stdio:
```bash
dexto --mode mcp --auto-approve
```
During startup, Dexto reads secrets from `~/.dexto/.env`, so your LLM credentials travel with your profile; no additional environment variables are required.
### Connect an MCP client
Most MCP-compatible clients expect a command plus optional arguments. A minimal configuration looks like:
```json
{
  "mcpServers": {
    "dexto": {
      "command": "dexto",
      "args": ["--mode", "mcp", "--auto-approve"]
    }
  }
}
```
Just the `dexto` section for easy copying:
```json
"dexto": {
  "command": "dexto",
  "args": ["--mode", "mcp", "--auto-approve"]
}
```
Use `--agent` if you want to expose a specific agent (installed or from file):
```json
"dexto": {
  "command": "dexto",
  "args": ["--agent", "music-agent", "--mode", "mcp", "--auto-approve"]
}
```
Need debug logs? Add the `DEXTO_LOG_LEVEL` environment variable:
```json
"dexto": {
  "command": "npx",
  "args": ["-y", "dexto", "--mode", "mcp", "--agent", "music-agent"],
  "env": { "DEXTO_LOG_LEVEL": "debug" }
}
```
Logs are written to `~/.dexto/logs/dexto.log`.
> Looking for Cursor-specific instructions? See [Using Dexto Agents in Cursor](../guides/dexto-in-cursor.md).
Once connected, clients gain access to the agent tools defined in your configuration (filesystem, web browsing, custom MCP servers, etc.).
## Remote MCP Server Guide
Need to run your Dexto agent as a remote MCP server?
### Step 1: Start Dexto in Server Mode
```bash
dexto --mode server
```
**Options:**
```bash
# Custom port using environment variable
API_PORT=8080 dexto --mode server
# Custom port for network access
API_PORT=3001 dexto --mode server
# Enable debug logging
dexto --mode server --debug
```
### Step 2: Configure the Connection URL
**HTTP MCP Endpoint:**
```bash
http://localhost:3001/mcp
```
**For network access:**
```bash
http://YOUR_SERVER_IP:3001/mcp
```
### Remote client limitations
Some MCP clients (including Cursor and Claude Desktop today) do not yet support streaming HTTP connections. For those clients, prefer the local stdio transport covered above.
## Troubleshooting
**Issues in Cursor:**
- Check Dexto logs - `~/.dexto/logs/dexto.log`
- Run agent in debug mode
- Reach out for support on Truffle AI discord
**Debug mode:**
```bash
# If installed globally
dexto --mode mcp --debug
```

View File

@@ -0,0 +1,95 @@
---
sidebar_position: 8
sidebar_label: "Using Dexto to group MCP servers"
---
# Using Dexto CLI to group MCP servers together
Dexto can operate in **MCP Tools Mode**, where it acts as a local tool aggregation server that groups multiple MCP servers and re-exposes them all under one common MCP server.
Unlike the regular MCP server mode where you interact with a Dexto AI agent, this mode provides direct access to the underlying tools without an AI intermediary.
This is useful when you want to:
- Access tools from multiple MCP servers through a single connection
- Group tools directly without AI agent processing
- Create a centralized tool hub for your development environment
## How It Works
In MCP Tools Mode, Dexto:
1. Connects to multiple MCP servers as configured
2. Aggregates all available tools from these servers
3. Exposes them directly as its own local MCP server
4. Acts as a pass-through for tool execution
## Configuration
### Step 1: Create a Dexto Configuration File
Create a `dexto-tools.yml` configuration file with the MCP servers you want to aggregate:
```yaml
# dexto-tools.yml
mcpServers:
filesystem:
type: stdio
command: npx
args:
- -y
- "@modelcontextprotocol/server-filesystem"
- "."
playwright:
type: stdio
command: npx
args:
- "-y"
- "@playwright/mcp@latest"
```
- You don't need LLM configuration for tools mode
- Only the mcpServers section is used
### Step 2: Setup in Cursor
Add the following to your `.cursor/mcp.json` file:
```json
{
  "mcpServers": {
    "dexto-tools": {
      "command": "npx",
      "args": [
        "-y",
        "dexto",
        "mcp",
        "--group-servers",
        "-a",
        "path/to/your/dexto-tools.yml"
      ]
    }
  }
}
```
Or use the default Dexto configuration:
```json
{
  "mcpServers": {
    "dexto-tools": {
      "command": "npx",
      "args": [
        "-y",
        "dexto",
        "mcp",
        "--group-servers"
      ]
    }
  }
}
```
### Step 3: Restart Cursor
After adding the configuration, restart Cursor to load the new MCP server.

View File

@@ -0,0 +1,92 @@
---
sidebar_position: 4
---
# MCP Elicitation
## What is Elicitation?
Elicitation allows MCP servers to request structured user input during interactions. When a server needs specific data (like API keys, file paths, or configuration parameters), it can request that information through a defined JSON schema, and Dexto will prompt the user for input that matches that structure.
**Specification:** [MCP Elicitation Spec](https://modelcontextprotocol.io/specification/2025-06-18/client/elicitation)
## How It Works
During tool execution or other server operations, an MCP server can:
1. Send an `elicitation/create` request with a JSON schema defining required input
2. Dexto prompts the user for the structured data
3. User provides input matching the schema
4. Dexto validates and returns the data to the server
5. Server continues with the provided information
This enables dynamic, context-aware data collection without requiring all parameters upfront.
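Under the hood, the request is a standard JSON-RPC message. A sketch of what a server might send (the `message`/`requestedSchema` field names follow the MCP elicitation spec; the values here are illustrative):

```json
{
  "jsonrpc": "2.0",
  "id": 42,
  "method": "elicitation/create",
  "params": {
    "message": "Please provide your deployment region",
    "requestedSchema": {
      "type": "object",
      "properties": {
        "region": { "type": "string", "enum": ["us-east", "eu-west"] }
      },
      "required": ["region"]
    }
  }
}
```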
## Configuration
Elicitation must be explicitly enabled in your agent configuration. It is disabled by default:
```yaml
# Enable elicitation support
elicitation:
  enabled: true   # Default: false
  timeout: 120000 # Optional: timeout in milliseconds (default: 120000)

# Connect MCP servers that support elicitation
mcpServers:
  my-server:
    type: stdio
    command: npx
    args: ["-y", "my-mcp-server"]
```
**Important:** Without `elicitation.enabled: true`, servers cannot request user input and elicitation requests will be rejected.
## Elicitation Schema
Servers define what data they need using JSON Schema:
```json
{
  "title": "API Configuration",
  "description": "Please provide your API credentials",
  "type": "object",
  "properties": {
    "apiKey": {
      "type": "string",
      "description": "Your API key"
    },
    "region": {
      "type": "string",
      "enum": ["us-east", "eu-west", "ap-south"],
      "description": "Preferred region"
    }
  },
  "required": ["apiKey", "region"]
}
```
Dexto validates user input against this schema before returning it to the server.
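For the schema above, a successful response returned to the server might look like the following (the `action`/`content` envelope follows the MCP elicitation spec; the values are illustrative):

```json
{
  "action": "accept",
  "content": {
    "apiKey": "<your-api-key>",
    "region": "us-east"
  }
}
```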
## Use Cases
Common scenarios where servers use elicitation:
- **Configuration parameters** - Requesting deployment regions, environment settings
- **File selection** - Asking users to choose specific files or paths
- **Disambiguation** - Clarifying ambiguous commands (e.g., "which branch?")
- **Progressive workflows** - Multi-step processes that need user decisions
## Security Considerations
:::warning Important
Elicitation is designed for workflow data, not sensitive credentials:
- ❌ DON'T: Request passwords, private keys, or PII through elicitation
- ✅ DO: Use for configuration, file paths, and workflow decisions
- Store sensitive data in environment variables or secure vaults
:::
## See Also
- [Tools](../concepts/tools) - Understanding agent tools and capabilities
- [MCP Prompts](./prompts) - Templated prompts from servers
- [MCP Overview](./overview) - Introduction to MCP

View File

@@ -0,0 +1,223 @@
---
sidebar_position: 6
---
# MCP Manager
The MCPManager is Dexto's powerful standalone utility for managing Model Context Protocol (MCP) servers. Use it in your own applications to connect, manage, and interact with multiple MCP servers without needing the full Dexto agent framework.
## Overview
The MCPManager provides:
- **Multi-server management**: Connect to multiple MCP servers simultaneously
- **Unified tool interface**: Access tools from all connected servers
- **Resource management**: Handle MCP resources and prompts
- **Connection pooling**: Automatic connection management and error handling
- **Type safety**: Full TypeScript support with comprehensive types
## Installation
```bash
npm install dexto
```
## Quick Start
```typescript
import { MCPManager } from '@dexto/core';

// Create manager instance
const manager = new MCPManager();

// Connect to an MCP server
await manager.connectServer('filesystem', {
  type: 'stdio',
  command: 'npx',
  args: ['-y', '@modelcontextprotocol/server-filesystem', '.']
});

// Get available tools
const tools = await manager.getAllTools();
console.log('Available tools:', Object.keys(tools));

// Execute a tool
const result = await manager.executeTool('readFile', { path: './README.md' });
console.log(result);
```
## Core Concepts
### MCP Servers
MCP servers are external processes that provide tools, resources, and prompts. Common types include:
- **File system servers**: Read/write files and directories
- **Web search servers**: Search the internet for information
- **Database servers**: Query and manage databases
- **API servers**: Interact with external APIs
- **Custom servers**: Your own domain-specific tools
### Connection Types
MCPManager supports three connection types:
- **`stdio`**: Most common, spawns a child process (e.g., Node.js packages)
- **`http`**: Connect to HTTP-based MCP servers
- **`sse`**: Server-sent events for real-time communication
### Tool Execution
Tools are functions provided by MCP servers. The manager:
1. Discovers all available tools from connected servers
2. Routes tool calls to the appropriate server
3. Handles confirmation prompts for sensitive operations
4. Returns structured results
## Common Usage Patterns
### File Operations
Perfect for automating file system tasks:
```typescript
const manager = new MCPManager();
await manager.connectServer('fs', {
  type: 'stdio',
  command: 'npx',
  args: ['-y', '@modelcontextprotocol/server-filesystem', '.']
});

// Read files
const packageJson = await manager.executeTool('readFile', {
  path: './package.json'
});

// List directory contents
const files = await manager.executeTool('listFiles', {
  path: './src'
});

// Write files
await manager.executeTool('writeFile', {
  path: './output.md',
  content: '# Generated Report\n\nSome content here...'
});
```
### Web Research
Integrate web search capabilities:
```typescript
await manager.connectServer('search', {
  type: 'stdio',
  command: 'npx',
  args: ['-y', 'tavily-mcp@0.1.2'],
  env: { TAVILY_API_KEY: process.env.TAVILY_API_KEY }
});

const results = await manager.executeTool('search', {
  query: 'Model Context Protocol specifications',
  max_results: 10
});
```
### Multi-Server Workflows
Combine multiple servers for complex tasks:
```typescript
// Initialize multiple servers at once
await manager.initializeFromConfig({
  filesystem: {
    type: 'stdio',
    command: 'npx',
    args: ['-y', '@modelcontextprotocol/server-filesystem', '.']
  },
  search: {
    type: 'stdio',
    command: 'npx',
    args: ['-y', 'tavily-mcp@0.1.2'],
    env: { TAVILY_API_KEY: process.env.TAVILY_API_KEY }
  },
  git: {
    type: 'stdio',
    command: 'npx',
    args: ['-y', '@cyanheads/git-mcp-server'],
    env: {
      MCP_LOG_LEVEL: 'info',
      GIT_SIGN_COMMITS: 'false'
    }
  }
});

// Complex workflow using multiple tools
async function generateProjectReport() {
  const files = await manager.executeTool('listFiles', { path: './src' });
  const commits = await manager.executeTool('git_log', { limit: 10 });
  const research = await manager.executeTool('search', {
    query: 'project documentation best practices'
  });

  const report = `# Project Report

Files: ${files.length}
Recent commits: ${commits.length}
Research findings: ${research.length}`;

  await manager.executeTool('writeFile', {
    path: './PROJECT_REPORT.md',
    content: report
  });
}
```
## Integration Examples
### Express.js API
Create an API that exposes MCP tools:
```typescript
import express from 'express';
import { MCPManager } from '@dexto/core';

const app = express();
app.use(express.json());

const manager = new MCPManager();
await manager.initializeFromConfig({
  filesystem: {
    type: 'stdio',
    command: 'npx',
    args: ['-y', '@modelcontextprotocol/server-filesystem', '.']
  }
});

app.get('/api/tools', async (req, res) => {
  const tools = await manager.getAllTools();
  res.json({ tools: Object.keys(tools) });
});

app.post('/api/execute/:toolName', async (req, res) => {
  try {
    const { toolName } = req.params;
    const { args } = req.body;
    const result = await manager.executeTool(toolName, args);
    res.json({ success: true, result });
  } catch (error) {
    res.status(500).json({
      success: false,
      error: error.message
    });
  }
});

app.listen(3000);
```
For detailed API reference, see the [MCPManager API documentation](/api/sdk/mcp-manager).

## Troubleshooting

**Tool execution failures**
- Validate tool arguments match expected schema
- Check server logs for detailed error information

View File

@@ -0,0 +1,506 @@
---
sidebar_position: 2
title: MCP Server Registry
description: Browse and discover MCP servers available in Dexto's WebUI registry
---
# MCP Server Registry
Discover MCP servers available in Dexto's WebUI. These servers can be easily installed through the Web interface or manually configured in your `agent.yml`.
:::tip Adding Custom Servers
Want to add your MCP server to this registry? Check out our [Community Contribution Guide](https://github.com/truffle-ai/dexto/blob/main/CONTRIBUTING.md#1-adding-new-mcps-to-the-webui-registry) for step-by-step instructions.
:::
## Productivity
### Filesystem
**Official MCP Server** by Anthropic
Secure file operations with configurable access controls for reading and writing files.
```yaml
mcpServers:
  filesystem:
    type: stdio
    command: npx
    args: ["-y", "@modelcontextprotocol/server-filesystem", "."]
    timeout: 30000
```
**Tags:** file, directory, filesystem, io
**Homepage:** [GitHub - MCP Servers](https://github.com/modelcontextprotocol/servers)
---
### Linear
**Official MCP Server** by Linear
Manage Linear issues, projects, and workflows.
```yaml
mcpServers:
  linear:
    type: stdio
    command: npx
    args: ["-y", "mcp-remote", "https://mcp.linear.app/sse"]
    timeout: 30000
```
**Tags:** linear, tasks, projects
**Homepage:** [Linear MCP](https://mcp.linear.app)
---
### Puppeteer
**Official MCP Server** by Truffle AI
Browser automation and web interaction tools.
```yaml
mcpServers:
  puppeteer:
    type: stdio
    command: npx
    args: ["-y", "@truffle-ai/puppeteer-server"]
    timeout: 30000
```
**Tags:** browser, automation, web, puppeteer
**Homepage:** [GitHub - Truffle AI MCP Servers](https://github.com/truffle-ai/mcp-servers)
---
## Creative
### Meme Generator
**Community MCP Server**
Create memes using Imgflip templates.
```yaml
mcpServers:
  meme-mcp:
    type: stdio
    command: npx
    args: ["-y", "meme-mcp"]
    env:
      IMGFLIP_USERNAME: ""
      IMGFLIP_PASSWORD: ""
    timeout: 30000
```
**Tags:** meme, image, creative
**Requirements:** Node >= 18.0.0
**Homepage:** [NPM - meme-mcp](https://www.npmjs.com/package/meme-mcp)
---
### Image Editor
**Official MCP Server** by Truffle AI
Comprehensive image processing and manipulation tools.
```yaml
mcpServers:
  image-editor:
    type: stdio
    command: uvx
    args: ["truffle-ai-image-editor-mcp"]
    timeout: 30000
```
**Tags:** image, edit, opencv, pillow
**Requirements:** Python >= 3.10
**Homepage:** [GitHub - Truffle AI MCP Servers](https://github.com/truffle-ai/mcp-servers)
---
### Music Creator
**Official MCP Server** by Truffle AI
Create, analyze, and transform music and audio.
```yaml
mcpServers:
  music-creator:
    type: stdio
    command: uvx
    args: ["truffle-ai-music-creator-mcp"]
    timeout: 30000
```
**Tags:** audio, music, effects
**Requirements:** Python >= 3.10
**Homepage:** [GitHub - Truffle AI MCP Servers](https://github.com/truffle-ai/mcp-servers)
---
### ElevenLabs
**Official MCP Server** by ElevenLabs
Text-to-speech and voice synthesis using ElevenLabs API.
```yaml
mcpServers:
  elevenlabs:
    type: stdio
    command: uvx
    args: ["elevenlabs-mcp"]
    env:
      ELEVENLABS_API_KEY: ""
    timeout: 30000
```
**Tags:** tts, voice, audio, synthesis
**Requirements:** Python >= 3.10
**Homepage:** [GitHub - ElevenLabs MCP](https://github.com/elevenlabs/elevenlabs-mcp)
---
### Gemini TTS
**Official MCP Server** by Truffle AI
Google Gemini Text-to-Speech with 30 prebuilt voices and multi-speaker conversation support.
```yaml
mcpServers:
  gemini-tts:
    type: stdio
    command: npx
    args: ["-y", "@truffle-ai/gemini-tts-server"]
    env:
      GEMINI_API_KEY: ""
    timeout: 60000
```
**Tags:** tts, speech, voice, audio, gemini, multi-speaker
**Requirements:** Node >= 18.0.0
**Homepage:** [GitHub - Truffle AI MCP Servers](https://github.com/truffle-ai/mcp-servers)
---
### Nano Banana
**Official MCP Server** by Truffle AI
Google Gemini 2.5 Flash Image for advanced image generation, editing, and manipulation.
```yaml
mcpServers:
  nano-banana:
    type: stdio
    command: npx
    args: ["-y", "@truffle-ai/nano-banana-server@0.1.2"]
    env:
      GEMINI_API_KEY: ""
    timeout: 60000
```
**Tags:** image, generation, editing, ai, gemini, nano-banana
**Requirements:** Node >= 18.0.0
**Homepage:** [GitHub - Truffle AI MCP Servers](https://github.com/truffle-ai/mcp-servers)
---
### HeyGen
**Official MCP Server** by HeyGen
Generate realistic human-like audio using HeyGen.
```yaml
mcpServers:
  heygen:
    type: stdio
    command: uvx
    args: ["heygen-mcp"]
    env:
      HEYGEN_API_KEY: ""
    timeout: 30000
```
**Tags:** audio, voice, synthesis, heygen
**Requirements:** Python >= 3.10
**Homepage:** [GitHub - HeyGen MCP](https://github.com/heygen-com/heygen-mcp)
---
### Runway
**Official MCP Server** by Runway
AI-powered creative suite for video and image generation.
```yaml
mcpServers:
  runway:
    type: stdio
    command: npx
    args: ["mcp-remote", "https://mcp.runway.team", "--header", "Authorization: Bearer ${RUNWAY_API_KEY}"]
    env:
      RUNWAY_API_KEY: ""
    timeout: 60000
```
**Tags:** runway, video, generation, ai, creative
**Requirements:** Node >= 18.0.0
**Homepage:** [Runway MCP Server Docs](https://docs.runway.team/api/runway-mcp-server)
---
### Sora
**Official MCP Server** by Truffle AI
AI-powered video generation using OpenAI's Sora technology.
```yaml
mcpServers:
  sora:
    type: stdio
    command: npx
    args: ["-y", "@truffle-ai/sora-video-server"]
    env:
      OPENAI_API_KEY: ""
    timeout: 60000
```
**Tags:** video, generation, ai, creative
**Requirements:** Node >= 18.0.0
**Homepage:** [GitHub - Truffle AI MCP Servers](https://github.com/truffle-ai/mcp-servers)
---
## Research
### Product Name Scout
**Official MCP Server** by Truffle AI
SERP analysis, autocomplete, dev collisions, and scoring for product names.
```yaml
mcpServers:
  product-name-scout:
    type: stdio
    command: npx
    args: ["-y", "@truffle-ai/product-name-scout-mcp"]
    timeout: 30000
```
**Tags:** research, naming, brand
**Requirements:** Node >= 18.0.0
**Homepage:** [GitHub - Truffle AI MCP Servers](https://github.com/truffle-ai/mcp-servers)
---
### DuckDuckGo Search
**Community MCP Server**
Search the web using DuckDuckGo.
```yaml
mcpServers:
  duckduckgo:
    type: stdio
    command: uvx
    args: ["duckduckgo-mcp-server"]
    timeout: 30000
```
**Tags:** search, web, research
**Requirements:** Python >= 3.10, uv
**Homepage:** [GitHub - DuckDuckGo MCP Server](https://github.com/duckduckgo/mcp-server)
---
### Domain Checker
**Official MCP Server** by Truffle AI
Check domain availability across TLDs.
```yaml
mcpServers:
  domain-checker:
    type: stdio
    command: uvx
    args: ["truffle-ai-domain-checker-mcp"]
    timeout: 30000
```
**Tags:** domains, availability, research
**Requirements:** Python >= 3.10
**Homepage:** [GitHub - Truffle AI MCP Servers](https://github.com/truffle-ai/mcp-servers)
---
### Tavily Search
**Community MCP Server** by Tavily AI
Web search and research using Tavily AI search engine.
```yaml
mcpServers:
  tavily:
    type: stdio
    command: npx
    args: ["-y", "tavily-mcp@0.1.3"]
    env:
      TAVILY_API_KEY: ""
    timeout: 30000
```
**Tags:** search, web, research, ai
**Requirements:** Node >= 18.0.0
**Homepage:** [NPM - tavily-mcp](https://www.npmjs.com/package/tavily-mcp)
---
### Perplexity
**Official MCP Server** by Perplexity AI
AI-powered search engine for real-time web search and research.
```yaml
mcpServers:
  perplexity:
    type: stdio
    command: npx
    args: ["-y", "@perplexity-ai/mcp-server"]
    env:
      PERPLEXITY_API_KEY: ""
      PERPLEXITY_TIMEOUT_MS: "600000"
    timeout: 600000
```
**Tags:** search, web, research, ai
**Requirements:** Node >= 18.0.0
**Homepage:** [GitHub - Perplexity MCP](https://github.com/perplexityai/modelcontextprotocol/tree/main)
---
## Development
### Hugging Face
**Community MCP Server** by LLMindset
Access Hugging Face models and datasets.
```yaml
mcpServers:
  hf:
    type: stdio
    command: npx
    args: ["-y", "@llmindset/mcp-hfspace"]
    timeout: 30000
```
**Tags:** huggingface, models, ai, ml
**Requirements:** Node >= 18.0.0
**Homepage:** [GitHub - mcp-hfspace](https://github.com/llmindset/mcp-hfspace)
---
## Data & Visualization
### ChartJS
**Official MCP Server** by ax-crew
Charting and visualization tool using ChartJS.
```yaml
mcpServers:
  chartjs:
    type: stdio
    command: npx
    args: ["-y", "@ax-crew/chartjs-mcp-server"]
    timeout: 30000
```
**Tags:** chart, visualization, data, chartjs
**Requirements:** Node >= 18.0.0
**Homepage:** [GitHub - ChartJS MCP Server](https://github.com/ax-crew/chartjs-mcp-server)
---
### Rag-lite TS
**Official MCP Server** by FrugalX
A local-first TypeScript retrieval engine for semantic search over static documents.
```yaml
mcpServers:
  rag-lite-ts:
    type: stdio
    command: npx
    args: ["-y", "raglite-mcp"]
    timeout: 30000
```
**Tags:** rag, data, ai
**Requirements:** Node >= 18.0.0
**Homepage:** [GitHub - rag-lite-ts](https://github.com/raglite/rag-lite-ts)
---
### Exa
**Official MCP Server** by Exa
AI-powered web search and research API with semantic search capabilities.
```yaml
mcpServers:
  exa:
    type: http
    url: https://mcp.exa.ai/mcp
    headers: {}
```
**Tags:** rag, data, ai
**Requirements:** Node >= 18.0.0
**Homepage:** [Exa MCP Docs](https://docs.exa.ai/reference/exa-mcp)
---
## Additional Resources
- **[MCP Configuration Guide](../guides/configuring-dexto/mcpConfiguration.md)** - Comprehensive configuration documentation
- [MCP Overview](./overview.md) - Introduction to MCP and quick reference
- [Model Context Protocol documentation](https://modelcontextprotocol.io/introduction) - Official MCP docs
- [MCP reference servers](https://github.com/modelcontextprotocol/servers) - Community server list
- [Contribute your MCP](https://github.com/truffle-ai/dexto/blob/main/CONTRIBUTING.md#1-adding-new-mcps-to-the-webui-registry) - Add to registry

View File

@@ -0,0 +1,140 @@
---
sidebar_position: 1
title: MCP Overview
description: Understand the Model Context Protocol (MCP), why it matters, and how Dexto integrates with MCP servers and tools.
---
# What is MCP?
The **Model Context Protocol (MCP)** is an open protocol created and maintained by Anthropic ([MCP GitHub organization](https://github.com/modelcontextprotocol)).
MCP defines how AI agents (like Dexto agents) can discover, connect to, and interact with external tools, services, and APIs in a standardized way.
:::tip Comprehensive Documentation
For complete MCP server configuration documentation including detailed field references, environment variables, tool aggregation, troubleshooting, and best practices, see the **[MCP Configuration Guide](../guides/configuring-dexto/mcpConfiguration.md)**.
:::
## Why MCP Matters
- **Interoperability:** MCP provides a common language for agents and tools, making it easy to connect new services without custom integration code for each one.
- **Extensibility:** Anyone can build and share MCP-compatible tools, expanding what agents can do.
- **Modularity:** Tools are decoupled from the agent's core logic, so you can add, remove, or swap tools as needed.
## How Dexto Agents Use MCP
Dexto agents use MCP to:
- **Discover available tools:** MCP servers advertise what actions they support (e.g., read a file, send an email, browse the web).
- **Connect to tools:** Dexto agents communicate with MCP servers using a standard protocol (often over stdio, HTTP, or sockets).
- **Invoke tool actions:** When you give a command, Dexto selects the right tool(s) via MCP and orchestrates their use to fulfill your request.
- **Read server resources:** Dexto agents can read resources from the server, like files, databases, etc., and use that to reason about what to do next.
- **Request structured input:** Servers can use elicitation to request specific data from users during workflows.
## Quick Configuration Reference
Add MCP servers under `mcpServers` in your `agent.yml`. Dexto supports three server types: `stdio`, `sse`, and `http`.
### stdio Server Type
For local MCP servers running as child processes:
```yaml
mcpServers:
  filesystem:
    type: stdio
    command: npx
    args:
      - "@modelcontextprotocol/server-filesystem"
    env:
      ROOT: ./
    timeout: 30000
    connectionMode: lenient
```
**Fields:**
- `type` (required): Must be `stdio`
- `command` (required): Command to execute (e.g., `npx`, `python`, `node`)
- `args` (optional): Array of command arguments. Default: `[]`
- `env` (optional): Environment variables for the server process. Default: `{}`
- `timeout` (optional): Connection timeout in milliseconds. Default: `30000`
- `connectionMode` (optional): `lenient` or `strict`. Default: `lenient`
### sse Server Type
:::warning Deprecated
SSE transport is deprecated. Use `http` type for remote MCP servers instead.
:::
For Server-Sent Events (SSE) based MCP servers:
```yaml
mcpServers:
  remote-sse:
    type: sse
    url: https://api.example.com/mcp/events
    headers:
      Authorization: Bearer ${MCP_API_KEY}
    timeout: 30000
    connectionMode: lenient
```
**Fields:**
- `type` (required): Must be `sse`
- `url` (required): SSE endpoint URL. Supports environment variable expansion (e.g., `${VAR}`)
- `headers` (optional): HTTP headers to send with requests. Default: `{}`
- `timeout` (optional): Connection timeout in milliseconds. Default: `30000`
- `connectionMode` (optional): `lenient` or `strict`. Default: `lenient`
### http Server Type
For HTTP-based MCP servers:
```yaml
mcpServers:
  remote-http:
    type: http
    url: https://api.example.com/mcp
    headers:
      Authorization: Bearer ${MCP_API_KEY}
    timeout: 30000
    connectionMode: strict
```
**Fields:**
- `type` (required): Must be `http`
- `url` (required): HTTP server URL. Supports environment variable expansion (e.g., `${VAR}`)
- `headers` (optional): HTTP headers to send with requests. Default: `{}`
- `timeout` (optional): Connection timeout in milliseconds. Default: `30000`
- `connectionMode` (optional): `lenient` or `strict`. Default: `lenient`
### Connection Modes
The `connectionMode` field controls how Dexto handles connection failures:
- **`lenient` (default)**: If the server fails to connect, Dexto logs a warning but continues initialization. The server can be retried later. Use this for optional servers or when you want graceful degradation.
- **`strict`**: If the server fails to connect, Dexto throws an error and stops initialization. Use this for critical servers that must be available for your agent to function properly.
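For example, you might mark a must-have server as `strict` while keeping an optional one `lenient` (the server names below are illustrative):

```yaml
mcpServers:
  filesystem:            # critical: fail fast if unavailable
    type: stdio
    command: npx
    args: ["-y", "@modelcontextprotocol/server-filesystem", "."]
    connectionMode: strict
  web-search:            # optional: degrade gracefully
    type: stdio
    command: npx
    args: ["-y", "tavily-mcp@0.1.2"]
    connectionMode: lenient
```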
## Runtime Changes
- Add/update/remove servers dynamically via the SDK or REST APIs
- Events: `mcp:server-added`, `mcp:server-updated`, `mcp:server-removed`
See: [MCP Manager](./mcp-manager.md)
## Learn More
- [Model Context Protocol documentation](https://modelcontextprotocol.io/introduction)
- [MCP reference servers on GitHub](https://github.com/modelcontextprotocol/servers)
MCP is a key part of what makes Dexto flexible, extensible, and able to automate across a wide range of tools and services.
## Next Steps
- **[MCP Configuration Guide](../guides/configuring-dexto/mcpConfiguration)** - Comprehensive YAML configuration
- [MCP Resources](./resources) - Expose data and context from MCP servers
- [MCP Prompts](./prompts) - Discover and use templated prompts
- [MCP Elicitation](./elicitation) - Request structured user input during workflows
- [Agent Configuration Reference](../guides/configuring-dexto/agent-yml) - Complete agent.yml reference
- [MCP Manager](./mcp-manager) - Runtime server management
- [Aggregate Multiple Servers](./dexto-group-mcp-servers) - Group MCP servers
- [Expose Dexto as MCP Server](./dexto-as-mcp-server) - Use Dexto as an MCP server

View File

@@ -0,0 +1,88 @@
---
sidebar_position: 3
---
# MCP Prompts
## What are Prompts?
Prompts in the Model Context Protocol are pre-built, reusable templates that MCP servers expose to help users interact with LLMs. They provide structured starting points for common tasks.
**Specification:** [MCP Prompts Spec](https://spec.modelcontextprotocol.io/specification/2025-03-26/server/prompts/)
## How It Works
Servers can expose templated prompts with:
- A descriptive name and purpose
- Optional arguments for customization
- Pre-configured messages for the LLM
When you use a prompt, the server fills in the template and sends the formatted message to your LLM.
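As a sketch, a `code-review` prompt invoked with a `file` argument might expand into a result like this (the `description`/`messages` shape follows the MCP prompts spec; the content is illustrative):

```json
{
  "description": "Review a source file for common issues",
  "messages": [
    {
      "role": "user",
      "content": {
        "type": "text",
        "text": "Please review src/app.ts for bugs, style issues, and missing error handling."
      }
    }
  ]
}
```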
## Configuration
Prompts are discovered automatically from MCP servers:
```yaml
mcpServers:
  code-helper:
    type: stdio
    command: npx
    args: ["-y", "my-code-mcp-server"]
```
If the server supports prompts, they'll be available immediately.
## Using Prompts
### In Web UI
Type `/` to discover and invoke prompts from connected MCP servers.
### In CLI
List available prompts:
```bash
dexto
> /prompts
```
Use a prompt:
```bash
> /use code-review file=src/app.ts
```
Or use the shorthand (if supported):
```bash
> /code-review src/app.ts
```
### Via SDK
```typescript
// List prompts from a server
const client = agent.mcpManager.getClient('code-helper');
const prompts = await client.listPrompts();
// Get and execute a prompt
const prompt = await client.getPrompt('code-review', { file: 'app.ts' });
const response = await agent.sendMessage(prompt.messages);
```
## Prompt Structure
Prompts can include:
- **Text content** - Instructions and context
- **Images** - Visual references (base64-encoded)
- **Resources** - Embedded file contents from the server
Arguments can be:
- **Required** - Must be provided by the user
- **Optional** - Have default values
- **Auto-completable** - Server suggests valid values
## See Also
- [MCP Resources](./resources) - Data sources for context
- [MCP Overview](./overview) - Introduction to MCP
- [CLI Guide](../guides/cli/overview) - All CLI commands

View File

@@ -0,0 +1,75 @@
---
sidebar_position: 2
---
# MCP Resources
## What are Resources?
Resources in the Model Context Protocol let MCP servers expose data (files, documentation, API responses) that LLMs can read for context. Think of them as read-only data sources that help the LLM understand your environment.
**Specification:** [MCP Resources Spec](https://spec.modelcontextprotocol.io/specification/2025-03-26/server/resources/)
## How It Works
When you connect an MCP server that supports resources, Dexto automatically:
1. Discovers available resources during server connection
2. Lists resources with their URIs and descriptions
3. Fetches resource content when the LLM needs it
## Configuration
Resources are discovered automatically from MCP servers:
```yaml
mcpServers:
  filesystem:
    type: stdio
    command: npx
    args: ["-y", "@modelcontextprotocol/server-filesystem", "./docs"]
```
No additional setup is needed; if the server supports resources, Dexto will expose them.
## Using Resources
### In Web UI
Use the `@` symbol to reference resources:
```
@file:///project/README.md what does this say about installation?
```
The Web UI auto-completes available resources when you type `@`.
### Via SDK
```typescript
// List all available resources
const resources = await agent.resourceManager.list();
// Read a specific resource
const content = await agent.resourceManager.read('file:///path/to/file.md');
```
## Resource URIs
Resources are identified by URIs with different schemes:
- **file://** - Local files: `file:///absolute/path/to/file.txt`
- **http://** - Web resources: `http://api.example.com/data`
- **Custom** - Server-defined: `db://database/schema`, `git://repo/file`
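Because resource URIs follow standard URI syntax, the scheme can be pulled out with ordinary URI parsing. A small illustrative helper (not part of the Dexto SDK):

```typescript
// Illustrative only: classify a resource URI by its scheme.
function resourceScheme(uri: string): string {
  const match = /^([a-z][a-z0-9+.-]*):/i.exec(uri);
  if (!match) {
    throw new Error(`Not a valid resource URI: ${uri}`);
  }
  return match[1].toLowerCase();
}

console.log(resourceScheme('file:///absolute/path/to/file.txt')); // file
console.log(resourceScheme('db://database/schema'));              // db
```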
## Common Servers with Resources
Based on Dexto's agent registry:
- **@modelcontextprotocol/server-filesystem** - Exposes local files (used in coding-agent)
- **@truffle-ai/github-mcp-server** - GitHub repository contents (used in github-agent)
- **@truffle-ai/talk2pdf-mcp** - PDF document contents (used in talk2pdf-agent)
## See Also
- [Internal Resources](../guides/configuring-dexto/internalResources) - Agent-managed filesystem/blob resources
- [MCP Prompts](./prompts) - Templated prompts from servers
- [MCP Overview](./overview) - Introduction to MCP


@@ -0,0 +1,9 @@
{
"label": "Tutorials",
"position": 3,
"collapsed": false,
"link": {
"type": "doc",
"id": "tutorials/index"
}
}


@@ -0,0 +1,9 @@
{
"label": "CLI & Configuration",
"position": 1,
"collapsed": false,
"link": {
"type": "doc",
"id": "tutorials/cli/index"
}
}


@@ -0,0 +1,10 @@
{
"label": "Example Agents",
"position": 2,
"collapsed": true,
"link": {
"type": "generated-index",
"title": "Example Agents",
"description": "Step-by-step tutorials for building agents with YAML configuration and the CLI."
}
}


@@ -0,0 +1,329 @@
---
sidebar_position: 1
---
# Customer Support Triage System
Learn how to build an intelligent customer support triage system using multiple specialized agents that work together through MCP connections.
## Overview
We'll build a system where:
1. **Triage Agent** receives customer requests and routes them
2. **Specialist Agents** handle specific domains (technical, billing, etc.)
3. **MCP Tools** enable seamless agent-to-agent communication
4. **Auto-approval** provides smooth customer experience
```
Customer Request → Triage Agent → Specialist Agent → Complete Response
```
## Step 1: Create Specialist Agents
### Technical Support Agent
```yaml
# technical-support-agent.yml
systemPrompt: |
You are a Technical Support Specialist with expertise in:
- API troubleshooting and integration issues
- Application bugs and system diagnostics
- Performance optimization and monitoring
Provide detailed, step-by-step solutions with clear explanations.
mcpServers:
filesystem:
type: stdio
command: npx
args: ["-y", "@modelcontextprotocol/server-filesystem", "."]
llm:
provider: openai
model: gpt-5
apiKey: $OPENAI_API_KEY
```
### Billing Support Agent
```yaml
# billing-agent.yml
systemPrompt: |
You are a Billing Support Specialist handling:
- Payment processing and subscription management
- Plan upgrades, downgrades, and pricing questions
- Refunds and billing disputes
Always provide specific timelines and next steps.
mcpServers:
filesystem:
type: stdio
command: npx
args: ["-y", "@modelcontextprotocol/server-filesystem", "."]
llm:
provider: openai
model: gpt-5-mini
apiKey: $OPENAI_API_KEY
```
## Step 2: Create the Triage Agent
The triage agent coordinates everything and connects to specialists:
```yaml
# triage-agent.yml
systemPrompt: |
You are a Customer Support Triage Agent. Your process:
1. Analyze the customer request
2. Identify the best specialist (technical, billing, etc.)
3. Call the chat_with_agent tool with the customer message
4. Provide a complete response combining your routing decision with the specialist's answer
After routing, you MUST:
- Use chat_with_agent tool to get the specialist's response
- Include the specialist's complete answer in your response
Response format: "I've connected you with [specialist]. [Complete specialist answer]"
# Auto-approve tools for seamless delegation
toolConfirmation:
mode: auto-approve
allowedToolsStorage: memory
mcpServers:
# Connect to specialist agents as MCP servers
technical_support:
type: stdio
command: npx
args: [dexto, --mode, mcp, --agent, technical-support-agent.yml]
connectionMode: lenient
billing_support:
type: stdio
command: npx
args: [dexto, --mode, mcp, --agent, billing-agent.yml]
connectionMode: lenient
llm:
provider: openai
model: gpt-5
apiKey: $OPENAI_API_KEY
```
## Step 3: Test the System
### Start the Triage System
```bash
npx dexto --agent triage-agent.yml
```
This automatically:
- Starts the triage agent
- Connects to specialist agents as MCP servers
- Loads the `chat_with_agent` tool for delegation
### Test Scenarios
**Technical Issue:**
```
My API keeps returning 500 errors when uploading files.
```
**Expected Flow:**
1. Triage identifies → Technical Support
2. Calls `chat_with_agent` → Technical specialist responds
3. Customer gets complete troubleshooting guide
**Billing Issue:**
```
I want to upgrade my plan but I'm confused about pricing.
```
**Expected Flow:**
1. Triage identifies → Billing Support
2. Calls `chat_with_agent` → Billing specialist responds
3. Customer gets complete pricing explanation
## Step 4: Add More Specialists
### Product Information Agent
```yaml
# product-info-agent.yml
systemPrompt: |
You are a Product Information Specialist covering:
- Feature descriptions and plan comparisons
- Integration capabilities and setup guides
- How-to questions and best practices
mcpServers:
web_search:
type: stdio
command: npx
args: ["-y", "tavily-mcp@0.1.3"]
env:
TAVILY_API_KEY: $TAVILY_API_KEY
llm:
provider: openai
model: gpt-5-mini
apiKey: $OPENAI_API_KEY
```
### Update Triage Agent
Add the new specialist to your triage configuration:
```yaml
# Add to mcpServers section
product_info:
type: stdio
command: npx
args: [dexto, --mode, mcp, --agent, product-info-agent.yml]
connectionMode: lenient
```
Update the system prompt to include routing to Product Info Agent:
```yaml
systemPrompt: |
Available specialists:
- Technical Support: API errors, bugs, performance issues
- Billing Support: payments, subscriptions, pricing
- Product Info: features, plans, integrations, how-to guides
# ... rest of prompt
```
## Step 5: Advanced Features
### Add Business Context
Create documentation files that agents can access:
```markdown
<!-- company-info.md -->
# Company Plans
- Basic Plan: $9/month, 10 users, 5GB storage
- Pro Plan: $19/month, 100 users, 100GB storage
- Enterprise Plan: $39/month, unlimited users, 1TB storage
```
Reference in agent configurations:
```yaml
systemPrompt:
contributors:
- id: base-prompt
type: static
content: |
Your main system prompt here...
- id: company-info
type: file
      files: ["${{dexto.agent_dir}}/docs/company-info.md"] # Resolve to absolute path at load time
```
### Production Deployment
For production, run specialists as separate servers:
```bash
# Terminal 1: Technical Support
npx dexto --agent technical-support-agent.yml --mode server --port 3001
# Terminal 2: Billing Support
npx dexto --agent billing-agent.yml --mode server --port 3002
# Terminal 3: Triage Coordinator
npx dexto --agent triage-agent.yml --mode server --port 3000
```
Update triage agent to use HTTP connections:
```yaml
mcpServers:
technical_support:
type: http
url: "http://localhost:3001/mcp"
billing_support:
type: http
url: "http://localhost:3002/mcp"
```
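With the agents running in server mode, each one also exposes a REST API. As a sketch, a client could post a customer message to the triage coordinator like this (the `/api/message-sync` path and payload shape follow Dexto's server-mode API as used elsewhere in these docs; confirm against your version):

```typescript
// Sketch: send a customer message to the triage coordinator over HTTP.
// The /api/message-sync path and JSON payload shape are assumptions
// based on Dexto's server mode; verify against your Dexto version.
function buildTriageRequest(baseUrl: string, message: string) {
  return {
    url: `${baseUrl}/api/message-sync`,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ message }),
    },
  };
}

async function sendToTriage(message: string) {
  const { url, options } = buildTriageRequest("http://localhost:3000", message);
  const res = await fetch(url, options);
  return res.json();
}
```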
## Key Concepts
### MCP Tool Delegation
The `chat_with_agent` tool lets one agent delegate a request to another:
```yaml
# When triage agent connects to specialist as MCP server
# It gets access to chat_with_agent tool automatically
# Tool calls specialist with customer message
# Returns specialist's complete response
```
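Under the hood, that delegation is an ordinary MCP `tools/call` request. A sketch of the JSON-RPC message the triage agent's MCP client sends (the exact tool name and argument key depend on the specialist's MCP server, so treat these as assumptions):

```typescript
// Sketch of the JSON-RPC 2.0 request behind a chat_with_agent delegation.
// The tool name and argument key come from the specialist's MCP server;
// treat them as assumptions, not a fixed contract.
const delegationRequest = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: {
    name: "chat_with_agent",
    arguments: {
      message: "My API keeps returning 500 errors when uploading files.",
    },
  },
};

console.log(JSON.stringify(delegationRequest, null, 2));
```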
### Auto-Approval Configuration
Essential for smooth delegation:
```yaml
toolConfirmation:
mode: auto-approve # No manual confirmation
allowedToolsStorage: memory # Session-only approvals
```
### Stdio vs HTTP Connections
**Development (stdio):**
- Agents start automatically
- Simple configuration
- Single machine deployment
**Production (http):**
- Agents run as separate servers
- Distributed deployment
- Better scalability
## Complete Example
Your final file structure:
```
triage-system/
├── triage-agent.yml
├── technical-support-agent.yml
├── billing-agent.yml
├── product-info-agent.yml
└── docs/
└── company-info.md
```
**Test the complete system:**
```bash
npx dexto --agent triage-agent.yml "I need help with API integration and want to upgrade my billing plan"
```
The triage agent will:
1. Identify this as a technical issue (primary)
2. Route to Technical Support specialist
3. Execute the specialist via `chat_with_agent`
4. Provide complete API integration guidance
5. Optionally route billing question to Billing specialist
## Next Steps
- Add more specialists (Sales, Escalation, etc.)
- Include external tools (CRM, knowledge base)
- Implement logging and analytics
- Deploy with authentication and scaling
This pattern works for any domain where you need intelligent routing and specialized expertise!


@@ -0,0 +1,195 @@
---
sidebar_position: 2
---
# Database Agent Tutorial
Learn how to build an AI agent that provides natural language access to database operations and analytics. This tutorial shows how to create an agent that can query databases, manage data, and generate insights through conversation.
## What You'll Build
A database agent that can:
- Execute natural language queries
- Create and manage database records
- Generate reports and analytics
- Handle data validation and errors
- Provide context-aware responses
## Prerequisites
- Node.js 18+ installed
- SQLite3 installed on your system
- OpenAI API key (or other LLM provider)
- Basic understanding of SQL and databases
> **Note**: This tutorial uses the [MCP Database Server](https://github.com/executeautomation/mcp-database-server) for database connectivity. It supports SQLite, SQL Server, PostgreSQL, and MySQL.
## Step 1: Setup the Database Agent
First, let's set up the database agent with sample data:
```bash
# Navigate to the database agent directory
cd agents/database-agent
# Run the setup script to initialize the database
./setup-database.sh
```
This creates a sample database with:
- Users table (id, name, email, created_at, last_login, is_active)
- Products table (id, name, description, price, category, stock_quantity)
- Orders table (id, user_id, total_amount, status, created_at)
- Order items table (id, order_id, product_id, quantity, unit_price)
## Step 2: Configure the Agent
The database agent uses this configuration:
```yaml
# database-agent.yml
mcpServers:
sqlite:
type: stdio
command: npx
args:
- -y
- "@executeautomation/database-server"
- "./agents/database-agent/data/example.db"
timeout: 30000
connectionMode: strict
systemPrompt:
contributors:
- id: primary
type: static
priority: 0
content: |
You are a Database Interaction Agent that provides natural language access to database operations
and analytics. You orchestrate database operations through intelligent conversation and tool usage.
## Your Core Capabilities
**Database Operations:**
- Execute SQL queries and return formatted results
- Create, modify, and drop database tables
- Insert, update, and delete records
- Analyze database schema and structure
- Generate reports and data insights
- Perform data validation and integrity checks
**Intelligent Orchestration:**
- Understand user intent from natural language
- Break down complex requests into sequential operations
- Validate data before operations
- Provide clear explanations of what you're doing
- Handle errors gracefully with helpful suggestions
## Best Practices
- Always explain what you're doing before doing it
- Show sample data when creating tables
- Validate user input before database operations
- Provide helpful error messages and suggestions
- Use transactions for multi-step operations
- Keep responses concise but informative
llm:
provider: openai
model: gpt-5-mini
apiKey: $OPENAI_API_KEY
temperature: 0.1 # Lower temperature for more consistent database operations
```
## Step 3: Start the Agent
```bash
# Set your OpenAI API key
export OPENAI_API_KEY="your-openai-api-key"
# Start the database agent
dexto --agent database-agent.yml
```
## Step 4: Basic Database Operations
### Querying Data
Start with simple queries:
```
User: Show me all users
Agent: I'll query the users table to show you all the user records...
User: Find products under $100
Agent: I'll search for products with prices below $100...
```
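Behind a request like "Find products under $100", the MCP database server ultimately runs SQL against the sample schema from Step 1. A hedged sketch of the kind of parameterized query involved (the SQL the agent actually generates may differ):

```typescript
// Sketch: the kind of parameterized query behind "find products under $100".
// Table and column names follow the sample schema from Step 1; the agent's
// generated SQL may differ.
function productsUnderPrice(maxPrice: number) {
  return {
    sql: "SELECT id, name, price FROM products WHERE price < ? ORDER BY price",
    params: [maxPrice],
  };
}

const query = productsUnderPrice(100);
console.log(query.sql, query.params);
```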
### Creating Records
Add new data to the database:
```
User: Create a new user named Sarah Johnson with email sarah@example.com
Agent: I'll insert a new user record for Sarah Johnson...
User: Add a new product called "Wireless Headphones" for $89.99
Agent: I'll add the new product to the database...
```
### Complex Queries
Ask for insights and analytics:
```
User: Show me total sales by category
Agent: I'll aggregate the sales data by product category...
User: Find users who haven't logged in for more than 5 days
Agent: I'll query for users whose last_login is older than 5 days...
```
## Step 5: Advanced Features
### Data Analysis
The agent can perform complex analysis:
```
User: Generate a monthly sales report for the last 6 months
User: Find products with declining sales trends
User: Calculate customer lifetime value for each user
User: Identify the most popular product categories
```
### Data Management
Handle data operations:
```
User: Update the price of the Laptop to $849.99
User: Mark all orders older than 30 days as completed
User: Delete all inactive users who haven't logged in for 90 days
```
### Schema Operations
Manage database structure:
```
User: Show me the current database schema
User: Add a new column to the products table
User: Create an index on the email field for better performance
```
## Next Steps
Now that you have a working database agent, you can:
1. **Extend the Schema**: Add more tables and relationships
2. **Add Business Logic**: Implement domain-specific operations
3. **Integrate with APIs**: Connect to external services
4. **Build Web Interface**: Create a web UI for your agent
5. **Scale Up**: Move to production databases like PostgreSQL
The database agent demonstrates how AI can make data operations accessible through natural conversation, changing how we think about database interaction and business intelligence.


@@ -0,0 +1,254 @@
---
sidebar_position: 5
---
# Image Editor Agent
Learn how to build an AI agent that provides intelligent image processing and editing capabilities. This tutorial shows how to create an agent that can analyze, transform, and enhance images through natural language commands.
## 🎥 Demo Video
Watch the Image Editor Agent in action:
<iframe
width="100%"
height="400"
src="https://www.youtube.com/embed/A0j61EIgWdI"
title="Image Editor Agent Demo"
frameborder="0"
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; fullscreen"
allowfullscreen="true"
></iframe>
## What You'll Build
An image editor agent that can:
- Analyze image metadata and properties
- Resize and crop images with intelligent aspect ratio handling
- Convert between image formats with quality control
- Apply filters and effects (blur, sharpen, grayscale, sepia, etc.)
- Adjust brightness, contrast, and color properties
- Add text overlays and annotations
- Detect objects, faces, and visual features
- Create image collages and compositions
## Understanding the Architecture
The image editor agent follows Dexto's framework design with clear separation of responsibilities:
1. **MCP Server**: Sets up the server and exposes image processing tools to the agent
2. **Agent**: Orchestrates workflows and handles user interaction
3. **Tools**: Contain the actual image processing logic
This architecture allows the agent to focus on understanding user intent while the tools handle the technical image processing.
## MCP Server Code
The core functionality is provided by the **Image Editor MCP Server**, a Python-based server built with FastMCP. To understand the complete MCP server implementation, refer to the [mcp-servers repository](https://github.com/truffle-ai/mcp-servers/tree/main):
- **Image Editor Server**: [src/image-editor](https://github.com/truffle-ai/mcp-servers/tree/main/src/image-editor) - Image processing, filters, computer vision, and analysis tools
## Step 1: Setting Up the Project
First, let's understand the project structure:
```
agents/image-editor-agent/
├── image-editor-agent.yml # Agent configuration
├── Lenna.webp # Sample image for testing
└── README.md # Documentation
```
## Step 2: Quick Setup
The image editor agent uses a published MCP server that's automatically installed:
```bash
# From the dexto project root
dexto --agent agents/image-editor-agent/image-editor-agent.yml
```
That's it! The MCP server (`truffle-ai-image-editor-mcp`) will be automatically downloaded and installed via `uvx` on first run.
### What's Happening Behind the Scenes
The published MCP server includes these key dependencies:
- **OpenCV**: Computer vision and image processing operations
- **Pillow**: Python Imaging Library for image manipulation
- **NumPy**: Numerical computing for image data
## Step 3: Understanding the Agent Configuration
The agent is configured in `image-editor-agent.yml`:
```yaml
systemPrompt: |
You are an AI assistant specialized in image editing and processing. You have access to a comprehensive set of tools for manipulating images including:
- **Basic Operations**: Resize, crop, convert formats
- **Filters & Effects**: Blur, sharpen, grayscale, sepia, invert, edge detection, emboss, vintage
- **Adjustments**: Brightness, contrast, color adjustments
- **Text & Overlays**: Add text to images with customizable fonts and colors
- **Computer Vision**: Face detection, edge detection, contour analysis, circle detection, line detection
- **Analysis**: Detailed image statistics, color analysis, histogram data
mcpServers:
image_editor:
type: stdio
command: uvx
args:
- truffle-ai-image-editor-mcp
connectionMode: strict
llm:
provider: openai
model: gpt-5-mini
apiKey: $OPENAI_API_KEY
```
### Key Components Explained
1. **systemPrompt**: Defines the agent's capabilities and behavior
2. **mcpServers**: Connects to the Python MCP server
3. **llm**: Configures the language model for intelligent interaction
## Step 4: Available Tools
The image editor agent provides 20+ powerful tools organized into categories:
### Image Analysis
- `get_image_info` - Get detailed image metadata (dimensions, format, file size)
- `preview_image` - Get a base64 preview for UI display
- `analyze_image` - Comprehensive image analysis with statistics
- `show_image_details` - Display detailed image information
### Basic Operations
- `resize_image` - Resize images with aspect ratio preservation
- `crop_image` - Crop images to specific dimensions
- `convert_format` - Convert between image formats
- `create_thumbnail` - Create small preview images
### Filters & Effects
- `apply_filter` - Apply various filters (blur, sharpen, grayscale, sepia, etc.)
- `adjust_brightness_contrast` - Adjust brightness and contrast levels
### Drawing & Annotations
- `add_text_to_image` - Add text overlays with custom fonts and colors
- `draw_rectangle` - Draw rectangles on images
- `draw_circle` - Draw circles on images
- `draw_line` - Draw lines on images
- `draw_arrow` - Draw arrows on images
- `add_annotation` - Add text annotations with backgrounds
### Computer Vision
- `detect_objects` - Detect faces, edges, contours, circles, lines
### Advanced Features
- `create_collage` - Create image collages with various layouts
- `create_collage_template` - Use predefined collage templates
- `batch_process` - Process multiple images with the same operation
- `compare_images` - Compare two images side by side
### Utility
- `list_available_filters` - List all available filter options
- `list_collage_templates` - List available collage templates
## Step 5: Running the Agent
Start the image editor agent:
```bash
# From the project root
dexto --agent agents/image-editor-agent/image-editor-agent.yml
```
## Step 6: Testing with Example Prompts
Let's test the agent with some example prompts to understand how it works:
### Basic Image Analysis
```
"Get information about the image at /path/to/image.jpg"
```
**What happens**: The agent calls `get_image_info` to retrieve dimensions, format, and file size.
### Image Transformation
```
"Resize the image to 800x600 pixels while maintaining aspect ratio"
```
**What happens**: The agent calls `resize_image` with `maintainAspectRatio: true` to preserve proportions.
### Applying Filters
```
"Apply a sepia filter to make the image look vintage"
```
**What happens**: The agent calls `apply_filter` with `filter: "sepia"` to create a vintage effect.
### Adding Text
```
"Add the text 'Hello World' at coordinates (50, 50) with white color"
```
**What happens**: The agent calls `add_text_to_image` with the specified text, position, and color.
### Computer Vision
```
"Detect faces in the image"
```
**What happens**: The agent calls `detect_objects` with `detectionType: "faces"` to find faces.
### Creating Collages
```
"Create a collage of these three images in a grid layout"
```
**What happens**: The agent calls `create_collage` with the image paths and grid layout.
## Step 7: Understanding the Workflow
Here's how the three components work together in a typical interaction:
1. **User Request**: "Make this image brighter and add a watermark"
2. **Agent**: Interprets the request and orchestrates the workflow
3. **Tools**: Agent calls the processing functions:
- `adjust_brightness_contrast()` - increases image brightness
- `add_text_to_image()` - adds watermark text
4. **Response**: Agent provides the result with image context
### Example Workflow
```
User: "Take this image, resize it to 500x500, apply a blur filter, and add the text 'SAMPLE' at the bottom"
Agent Response:
"I'll help you process that image. Let me break this down into steps:
1. First, I'll resize the image to 500x500 pixels
2. Then I'll apply a blur filter
3. Finally, I'll add the text 'SAMPLE' at the bottom
[Executes tools and provides results]"
```
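The multi-step request above decomposes into an ordered sequence of tool calls. Sketched as plain data, with tool names taken from the Step 4 list and argument keys assumed:

```typescript
// Sketch: the ordered tool calls behind the multi-step request above.
// Tool names come from Step 4's list; argument keys are assumptions.
const plan = [
  { tool: "resize_image", args: { path: "input.jpg", width: 500, height: 500 } },
  { tool: "apply_filter", args: { path: "input.jpg", filter: "blur" } },
  { tool: "add_text_to_image", args: { path: "input.jpg", text: "SAMPLE", position: "bottom" } },
];

for (const step of plan) {
  console.log(`${step.tool}(${JSON.stringify(step.args)})`);
}
```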
## Supported Formats
### Input Formats
- **JPG/JPEG**: Most common compressed format
- **PNG**: Lossless format with transparency support
- **BMP**: Uncompressed bitmap format
- **TIFF**: High-quality format for professional use
- **WebP**: Modern format with excellent compression
### Output Formats
- **JPG/JPEG**: Configurable quality settings
- **PNG**: Lossless with transparency
- **WebP**: Configurable quality with small file sizes
- **BMP**: Uncompressed format
- **TIFF**: High-quality professional format
## Common Use Cases
- **Web Development**: Optimize images, create thumbnails, convert formats
- **Content Creation**: Apply filters, add text overlays, create compositions
- **Professional Work**: Batch processing, color adjustments, quality enhancement
---
**Ready to start?** Run the agent and begin creating intelligent image processing workflows!


@@ -0,0 +1,201 @@
---
sidebar_position: 8
---
# Integrating Existing Agents: Dexto + LangChain
import ExpandableMermaid from '@site/src/components/ExpandableMermaid';
This tutorial shows you how to integrate an existing LangChain agent with Dexto to create a multi-agent system. Instead of rebuilding your agent, you'll learn to wrap it with MCP and let Dexto orchestrate between it and other tools.
## The Integration Pattern
Here's what we're building - a single system where Dexto coordinates between different tools:
<ExpandableMermaid title="Dexto + LangChain Integration Pattern">
```mermaid
graph TD
A[Dexto Orchestrator] --> B[Filesystem Tools]
A --> C[Puppeteer Tools]
A --> D[LangChain Agent]
style A fill:#4f46e5,stroke:#312e81,stroke-width:2px,color:#fff
style B fill:#10b981,stroke:#065f46,stroke-width:1px,color:#fff
style C fill:#f59e0b,stroke:#92400e,stroke-width:1px,color:#fff
style D fill:#8b5cf6,stroke:#5b21b6,stroke-width:1px,color:#fff
```
</ExpandableMermaid>
Your existing LangChain agent becomes just another tool in Dexto's toolkit, working alongside file operations and web browsing.
## How the Integration Works
The integration happens in three simple steps. Let's walk through each piece of code to see exactly what's happening.
### Step 1: Your Existing LangChain Agent
Here's a typical LangChain agent you might already have:
```javascript
// agent.js
import { ChatOpenAI } from '@langchain/openai';
import { PromptTemplate } from '@langchain/core/prompts';
export class LangChainAgent {
  constructor() {
    this.llm = new ChatOpenAI({ model: 'gpt-5-mini' });
    this.tools = {
      summarize: this.summarize.bind(this),
      translate: this.translate.bind(this),
      analyze: this.analyze.bind(this)
    };
  }

  // Placeholder tool implementations; swap in your real chains
  async summarize(text) { return `Summary of: ${text}`; }
  async translate(text) { return `Translation of: ${text}`; }
  async analyze(text) { return `Analysis of: ${text}`; }

  async run(input) {
    const prompt = PromptTemplate.fromTemplate(`
      You have three tools: summarize, translate, analyze.
      User input: {user_input}
      Determine which tool to use and provide a helpful response.
    `);
    const chain = prompt.pipe(this.llm);
    const result = await chain.invoke({ user_input: input });
    return result.content;
  }
}
```
This is your standard LangChain agent - it stays exactly as it is. No modifications needed.
### Step 2: Wrap It in MCP
Now we create a thin MCP wrapper that exposes your agent:
```javascript
// mcp-server.js
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import { z } from 'zod';
import { LangChainAgent } from './agent.js';

class LangChainMCPServer {
  constructor() {
    this.server = new McpServer({ name: 'langchain-agent', version: '1.0.0' });
    this.agent = new LangChainAgent();
    this.registerTools();
  }

  registerTools() {
    this.server.registerTool(
      'chat_with_langchain_agent',
      {
        description: 'Chat with a LangChain agent for text processing',
        inputSchema: {
          message: z.string().describe('Message to send to the agent')
        }
      },
      async ({ message }) => {
        const response = await this.agent.run(message);
        return { content: [{ type: 'text', text: response }] };
      }
    );
  }

  // Connect over stdio so Dexto can launch this server as a subprocess
  async start() {
    await this.server.connect(new StdioServerTransport());
  }
}

new LangChainMCPServer().start();
```
This wrapper does one thing: it takes MCP tool calls and forwards them to your existing agent.
### Step 3: Configure Dexto
Finally, tell Dexto about all your tools in the configuration:
```yaml
# dexto-agent-with-langchain.yml
mcpServers:
filesystem:
type: stdio
command: npx
args: ["-y", "@modelcontextprotocol/server-filesystem", "."]
playwright:
type: stdio
command: npx
args:
- "-y"
- "@playwright/mcp@latest"
langchain:
type: stdio
command: node
args: ["${{dexto.agent_dir}}/langchain-agent/dist/mcp-server.js"]
env:
OPENAI_API_KEY: $OPENAI_API_KEY
```
Now Dexto can coordinate between file operations, web browsing, and your LangChain agent.
## See It in Action
```bash
# Setup
cd examples/dexto-langchain-integration/langchain-agent && npm install && npm run build
export OPENAI_API_KEY="your_key_here"
cd ../../../ # Return to repo root
# Run a multi-agent workflow
dexto --agent ./examples/dexto-langchain-integration/dexto-agent-with-langchain.yml "Read README.md, analyze its sentiment, and save the analysis"
```
What happens when you run this:
1. **Dexto receives** the natural language request
2. **Breaks it down** into subtasks: read file → analyze sentiment → save result
3. **Routes each task** to the appropriate tool:
- File reading → filesystem MCP server
- Sentiment analysis → LangChain MCP server
- File saving → filesystem MCP server
4. **Coordinates the workflow** and returns the final result
## Extending the Pattern
Want to add more agents? Just follow the same three-step pattern:
```yaml
mcpServers:
# Your existing setup
langchain:
type: stdio
command: node
args: ["${{dexto.agent_dir}}/langchain-agent/dist/mcp-server.js"]
# Add more agents
autogen-research:
type: stdio
command: python
args: ["${{dexto.agent_dir}}/autogen-agent/mcp_server.py"]
custom-analyzer:
type: stdio
command: "${{dexto.agent_dir}}/custom-agent/target/release/mcp-server"
```
Each agent runs independently, but Dexto can orchestrate between all of them based on the task at hand.
## Benefits of using Dexto
Instead of building custom orchestration logic, Dexto gives you:
- **Intelligent routing**: Automatically determines which tools to use and in what order
- **State management**: Shares context and results between different systems seamlessly
- **Workflow coordination**: Handles dependencies, retries, and error handling across steps
- **Natural language interface**: Describe complex multi-step workflows in plain English
Without Dexto, you'd manually chain API calls and manage coordination between your LangChain agent, file operations, and web browsing. With Dexto, you just describe what you want.
## Key Takeaways
- **Your existing agents don't change** - they keep working exactly as before
- **MCP provides the bridge** - a simple wrapper makes any agent Dexto-compatible
- **Dexto handles orchestration** - it figures out which tool to use when
- **The pattern scales** - add as many agents and frameworks as you need
This approach lets you build sophisticated multi-agent systems by composing existing pieces, rather than rebuilding everything from scratch.


@@ -0,0 +1,447 @@
---
sidebar_position: 3
---
# Building Multi-Agent Systems
Learn how to build multi-agent systems where Dexto agents can communicate with each other using the Model Context Protocol (MCP). This powerful pattern enables specialized agents to collaborate and delegate tasks to each other.
## Overview
In this guide, you'll learn how to:
- Set up multiple Dexto agents running on different ports
- Configure one agent to use another as an MCP server
- Enable inter-agent communication through tool calls
- Build collaborative agent workflows
## What We're Building
We'll create two specialized agents:
- **Researcher Agent** (Port 4001): Specializes in gathering and analyzing information, runs in server mode
- **Writer Agent** (Ports 5000/5001): Specializes in content creation, can call the Researcher for help
The Writer agent will be able to delegate research tasks to the Researcher agent using MCP tool calls.
:::tip Port Convention
Each agent uses an API port to expose its REST API and MCP endpoint. For example, the Researcher uses port 4001 for its API endpoints.
:::
### Mode Selection Strategy
- **`--mode server`**: Use for agents that serve as MCP servers for other agents via HTTP (like our Researcher). Exposes REST APIs and the `/mcp` endpoint without opening a web UI.
- **Default (web) mode**: Use for agents that need web UI access for user interaction (like our Writer). Also exposes REST APIs and `/mcp` endpoint.
Both modes expose the `/mcp` endpoint at `http://localhost:{port}/mcp` for other agents to connect to.
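That port convention can be captured in a pair of tiny helpers, illustrative only, using the endpoint paths shown in this guide:

```typescript
// Illustrative helpers for the port convention above: each agent's MCP
// endpoint and REST API hang off the same API port. Paths come from the
// examples in this guide.
function mcpEndpoint(port: number): string {
  return `http://localhost:${port}/mcp`;
}

function apiEndpoint(port: number): string {
  return `http://localhost:${port}/api/message-sync`;
}

console.log(mcpEndpoint(4001)); // http://localhost:4001/mcp
```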
## Step 1: Create the Project Structure
```bash
mkdir multi-agent-example
cd multi-agent-example
# Create config files for each agent
touch researcher.yml writer.yml .env
```
Your project structure should look like:
```
multi-agent-example/
├── researcher.yml
├── writer.yml
└── .env
```
## Step 2: Set Up the Researcher Agent
### Create Researcher Configuration
Create `researcher.yml`:
```yaml
# Research Agent Configuration
systemPrompt: |
You are a Research Agent specializing in gathering and analyzing information.
Your capabilities include:
- Reading and analyzing files using the filesystem tool
- Searching the web for current information using tavily-search
- Synthesizing research findings into clear summaries
When responding to research requests:
1. Use your tools to gather relevant information
2. Analyze and synthesize the findings
3. Provide well-structured, factual responses
4. Include sources and evidence when possible
Be thorough but concise in your research summaries.
llm:
provider: openai
model: gpt-5-mini
apiKey: $OPENAI_API_KEY
mcpServers:
filesystem:
type: stdio
command: npx
args:
- -y
- "@modelcontextprotocol/server-filesystem"
- "."
tavily-search:
type: stdio
command: npx
args:
- -y
- "tavily-mcp@0.1.2"
env:
TAVILY_API_KEY: $TAVILY_API_KEY
```
## Step 3: Set Up the Writer Agent
### Create Writer Configuration
Create `writer.yml`:
```yaml
# Writer Agent - Specializes in content creation
systemPrompt: |
You are a Content Writer Agent specializing in creating high-quality written content.
Your capabilities include:
- Writing articles, blog posts, and documentation
- Reading and editing files using the filesystem tool
- Collaborating with the Researcher Agent for information gathering
When you need research or factual information:
1. Use the "researcher" tool to delegate research tasks
2. Provide clear, specific research requests
3. Incorporate the research findings into your writing
Example researcher tool usage:
- "Research the latest trends in AI agents"
- "Find information about the Model Context Protocol"
- "Analyze the contents of the project files for context"
Always create well-structured, engaging content that incorporates research findings naturally.
mcpServers:
filesystem:
type: stdio
command: npx
args:
- -y
- "@modelcontextprotocol/server-filesystem"
- "."
# Connect to the Researcher Agent as an MCP server
researcher:
type: http
baseUrl: http://localhost:4001/mcp
timeout: 30000
llm:
provider: openai
model: gpt-5-mini
apiKey: $OPENAI_API_KEY
```
## Step 4: Set Up Environment Variables
Create `.env`:
```bash
# Add your OpenAI API key
OPENAI_API_KEY=your_openai_key_here
# Add Tavily API key for web search (get free key at tavily.com)
TAVILY_API_KEY=your_tavily_key_here
# Optional: Add other provider keys if using different models
ANTHROPIC_API_KEY=your_anthropic_key
GOOGLE_GENERATIVE_AI_API_KEY=your_google_key
COHERE_API_KEY=your_cohere_key
```
## Step 5: Run the Multi-Agent System
That's it! No custom code needed. Just run Dexto with different configs and ports:
### Terminal 1: Start the Researcher Agent
```bash
dexto --mode server --port 4001 --agent researcher.yml
```
### Terminal 2: Start the Writer Agent
```bash
dexto --port 5001 --agent writer.yml
```
### Terminal 3: Test the System
```bash
# Test the researcher directly (hits API port)
curl -X POST http://localhost:4001/api/message-sync \
-H "Content-Type: application/json" \
-d '{"message": "Research the latest developments in AI agents"}'
# Test the writer (which can call the researcher)
curl -X POST http://localhost:5001/api/message-sync \
-H "Content-Type: application/json" \
-d '{"message": "Write a blog post about AI agent collaboration. Research current trends first."}'
```
You can also interact via the Writer's web interface:
- **Writer**: `http://localhost:5000` (Web UI automatically opens, API on 5001)
Note: The Researcher runs in server mode (no web UI) and only exposes its API on port 4001.
## How It Works
### Inter-Agent Communication Flow
1. **User Request**: "Write a blog post about AI agents"
2. **Writer Agent**: Recognizes it needs research
3. **Tool Call**: Writer calls the `researcher` tool via HTTP MCP
4. **Research Execution**: Researcher agent processes the research request
5. **Response**: Researcher returns findings to Writer
6. **Content Creation**: Writer incorporates research into the blog post
7. **Final Output**: User receives a well-researched blog post
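The delegation in steps 3-6 can be sketched in plain Python. Here ordinary functions stand in for the two agents (in Dexto the hop happens over HTTP MCP); all names are illustrative, not part of the Dexto API:

```python
# Illustrative sketch of the writer-delegates-to-researcher pattern.
# Plain functions stand in for the two agents so the control flow is clear.

def researcher(query: str) -> str:
    # Stand-in for the Writer's HTTP MCP call to the Researcher's /mcp endpoint
    return f"Findings for '{query}': AI agents increasingly collaborate via MCP."

def writer(task: str) -> str:
    # Steps 2-3: the Writer recognizes it needs research and delegates
    research = researcher("latest trends in AI agents")
    # Step 6: the findings are woven into the final content
    return f"# {task}\n\nBased on research: {research}"

post = writer("AI Agent Collaboration")
print(post.splitlines()[0])  # the generated title line
```

The point of the pattern is that the Writer never implements research itself; it only knows the Researcher's tool interface.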
### MCP Configuration Explained
In the Writer's configuration, this section connects to the Researcher:
```yaml
researcher:
type: http # Use HTTP MCP connection
baseUrl: http://localhost:4001/mcp # Researcher's MCP endpoint
timeout: 30000 # 30-second timeout
```
When Dexto runs in `server` or `web` mode, it automatically exposes an MCP endpoint at `/mcp` that other agents can connect to via HTTP. Use `server` mode for agents that only need to serve as MCP servers (no web UI), and `web` mode for agents that need user interaction through a web interface.
### The Power of Configuration-First
This example demonstrates Dexto's core philosophy:
- **No custom code** - just YAML configuration
- **Built-in web server** - automatic API and UI
- **Automatic MCP endpoints** - no need to implement protocols
- **Simple scaling** - add more agents by adding more configs
## Advanced Usage Patterns
### 1. Multiple Specialized Agents
```yaml
# Writer agent with multiple specialist agents
mcpServers:
researcher:
type: http
baseUrl: http://localhost:4001/mcp
fact-checker:
type: http
baseUrl: http://localhost:6001/mcp
editor:
type: http
baseUrl: http://localhost:7001/mcp
```
Then run:
```bash
# Terminal 1: Researcher (Server mode - HTTP MCP, no UI)
dexto --mode server --port 4001 --agent researcher.yml
# Terminal 2: Fact-checker (Server mode - HTTP MCP, no UI)
dexto --mode server --port 6001 --agent fact-checker.yml
# Terminal 3: Editor (Server mode - HTTP MCP, no UI)
dexto --mode server --port 7001 --agent editor.yml
# Terminal 4: Writer (Web mode - UI for user interaction)
dexto --port 5001 --agent writer.yml
```
### 2. Bidirectional Communication
Configure agents to call each other:
```yaml
# In researcher.yml - Researcher can also call Writer for help with summaries
mcpServers:
filesystem:
type: stdio
command: npx
args: ["-y", "@modelcontextprotocol/server-filesystem", "."]
writer:
type: http
baseUrl: http://localhost:5001/mcp
```
### 3. Agent Orchestration
Create a coordinator agent that manages multiple specialized agents:
```yaml
# coordinator.yml
mcpServers:
researcher:
type: http
baseUrl: http://localhost:4001/mcp
writer:
type: http
baseUrl: http://localhost:5001/mcp
reviewer:
type: http
baseUrl: http://localhost:6001/mcp
# Coordinator Agent Configuration
systemPrompt: |
You are a Coordinator Agent that orchestrates work between specialized agents.
Your team includes:
- researcher: For gathering information and analysis
- writer: For content creation
- reviewer: For quality assurance and editing
When given a task, break it down and delegate to the appropriate agents.
llm:
provider: openai
model: gpt-5-mini
apiKey: $OPENAI_API_KEY
```
Run the system:
```bash
# Start specialized agents (Server mode - HTTP MCP servers)
dexto --mode server --port 4001 --agent researcher.yml
dexto --mode server --port 5001 --agent writer.yml
dexto --mode server --port 6001 --agent reviewer.yml
# Start coordinator (Web mode - UI for user interaction)
dexto --port 8001 --agent coordinator.yml
```
## Production Considerations
### 1. Process Management
Use a process manager like PM2 for production:
```bash
# Install PM2
npm install -g pm2
# Create ecosystem file
cat > ecosystem.config.js << EOF
module.exports = {
apps: [
{
name: 'researcher-agent',
script: 'dexto',
args: '--mode server --port 4001 --agent researcher.yml'
},
{
name: 'writer-agent',
script: 'dexto',
args: '--port 5001 --agent writer.yml'
}
]
};
EOF
# Start all agents
pm2 start ecosystem.config.js
```
### 2. Docker Deployment
```dockerfile
# Dockerfile
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
RUN npm install -g dexto
COPY . .
CMD ["dexto", "--mode", "web", "--port", "3000"]
```
```yaml
# docker-compose.yml
version: '3.8'
services:
researcher:
build: .
ports:
- "4001:4001"
environment:
- OPENAI_API_KEY=${OPENAI_API_KEY}
- TAVILY_API_KEY=${TAVILY_API_KEY}
command: dexto --mode server --port 4001 --agent researcher.yml
writer:
build: .
ports:
- "5000-5001:5000-5001"
environment:
- OPENAI_API_KEY=${OPENAI_API_KEY}
command: dexto --port 5001 --agent writer.yml
depends_on:
- researcher
```
### 3. Load Balancing
Use nginx to load balance multiple instances:
```nginx
upstream researcher_agents {
server localhost:4001; # Instance 1
server localhost:4011; # Instance 2
server localhost:4021; # Instance 3
}
server {
listen 80;
location /researcher/ {
proxy_pass http://researcher_agents/;
}
}
```
## Troubleshooting
### Common Issues
**"Connection refused" errors**
- Ensure the researcher agent is started before the writer
- Check that ports are not already in use: `netstat -tulpn | grep :4001`
- Verify the MCP endpoint URLs in configurations
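Before wiring agents together, a small TCP probe can confirm whether a target agent is actually listening. A sketch using only the Python standard library (host and port are examples from this tutorial):

```python
import socket

def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something is accepting TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        return sock.connect_ex((host, port)) == 0

# Example: check the Researcher agent before starting the Writer
if not is_port_open("127.0.0.1", 4001):
    print("Researcher not reachable on :4001 - start it first")
```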
**Timeout errors**
- Increase timeout values in MCP server configurations
- Check agent response times in the web UI
- Consider splitting complex tasks
**Tool not found errors**
- Verify agent names match the MCP server names
- Check that target agents are running
- Ensure MCP endpoints return proper responses
**Environment variable issues**
- Verify `.env` file is in the working directory
- Check API key validity and credits
- Use `--no-verbose` flag to reduce debug output
## Next Steps
- **Scale Up**: Add more specialized agents to your system
- **Production**: Use PM2, Docker, or Kubernetes for deployment
- **Integration**: Connect to external services and APIs
- **Monitoring**: Add health checks and logging
The beauty of Dexto's multi-agent systems is their simplicity - just configuration files and command-line arguments. No custom code, no complex deployments, just pure agent collaboration! 🤖✨

---
sidebar_position: 6
---
# Music Creator Agent
Learn how to build an AI agent that provides comprehensive music creation and audio processing capabilities. This tutorial shows how to create an agent that can generate music, analyze audio, and process sound files through natural language commands.
## 🎥 Demo Video
Watch the Music Creator Agent in action:
<iframe
width="100%"
height="400"
src="https://www.youtube.com/embed/wEnQy1zEVZw"
title="Music Creator Agent Demo"
frameborder="0"
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; fullscreen"
allowfullscreen="true"
></iframe>
> **⚠️ Experimental Status**: This agent is currently in experimental development. The tools have not been extensively tested in production environments and may have limitations or bugs. We're actively seeking feedback and improvements from users.
## What You'll Build
A music creator agent that can:
- Generate melodies, chord progressions, and drum patterns
- Analyze audio for tempo, key, and musical features
- Convert between audio formats with quality control
- Apply audio effects and processing
- Mix multiple audio tracks with volume control
- Play audio and MIDI files with precise control
- Process both audio and MIDI files seamlessly
## Understanding the Architecture
The music creator agent follows Dexto's framework design with clear separation of responsibilities:
1. **MCP Server**: Sets up the server and exposes audio processing tools to the agent
2. **Agent**: Orchestrates workflows and handles user interaction
3. **Tools**: Contain the actual audio processing logic
This architecture allows the agent to focus on understanding musical intent while the tools handle the technical audio processing.
## MCP Server Code
The core functionality is provided by the **Music Agent MCP Server**, a Python-based server built with FastMCP. To understand the complete MCP server implementation, refer to the [mcp-servers repository](https://github.com/truffle-ai/mcp-servers/tree/main):
- **Music Server**: [src/music](https://github.com/truffle-ai/mcp-servers/tree/main/src/music) - Audio generation, processing, effects, and MIDI handling
## Step 1: Setting Up the Project
First, let's understand the project structure:
```
agents/music-agent/
├── music-agent.yml # Agent configuration
└── README.md # Documentation
```
## Step 2: Quick Setup
The music creator agent uses a published MCP server that's automatically installed:
```bash
# From the dexto project root
dexto --agent agents/music-agent/music-agent.yml
```
That's it! The MCP server (`truffle-ai-music-creator-mcp`) will be automatically downloaded and installed via `uvx` on first run.
### What's Happening Behind the Scenes
The published MCP server includes these key dependencies:
- **librosa**: Audio analysis and music information retrieval
- **pydub**: Audio file manipulation and processing
- **music21**: Music notation and analysis
- **pretty_midi**: MIDI file handling
- **FastMCP**: Model Context Protocol server framework
- **NumPy & SciPy**: Numerical computing for audio processing
## Step 3: Understanding the Agent Configuration
The agent is configured in `music-agent.yml`:
```yaml
systemPrompt: |
You are an AI assistant specialized in music creation, editing, and production. You have access to a comprehensive set of tools for working with audio and music including:
- **Audio Analysis**: Analyze audio files for tempo, key, BPM, frequency spectrum, and audio characteristics
- **Audio Processing**: Convert formats, adjust volume, normalize, apply effects (reverb, echo, distortion, etc.)
- **Music Generation**: Create melodies, chord progressions, drum patterns, and complete compositions
- **Audio Manipulation**: Trim, cut, splice, loop, and arrange audio segments
- **Effects & Filters**: Apply various audio effects and filters for creative sound design
- **Mixing & Mastering**: Balance levels, apply compression, EQ, and mastering effects
- **File Management**: Organize, convert, and manage audio files in various formats
mcpServers:
music_creator:
type: stdio
command: uvx
args:
- truffle-ai-music-creator-mcp
connectionMode: strict
llm:
provider: openai
model: gpt-5-mini
apiKey: $OPENAI_API_KEY
```
### Key Components Explained
1. **systemPrompt**: Defines the agent's capabilities and behavior
2. **mcpServers**: Connects to the Python MCP server
3. **llm**: Configures the language model for intelligent interaction
## Step 4: Available Tools
The music creator agent provides 20+ powerful tools organized into categories:
### Music Generation
- `create_melody` - Generate melodies in any key and scale
- `create_chord_progression` - Create chord progressions using Roman numerals
- `create_drum_pattern` - Generate drum patterns for different styles
### Audio Analysis
- `analyze_audio` - Comprehensive audio analysis with spectral features
- `detect_tempo` - Detect BPM and beat positions
- `detect_key` - Identify musical key and mode
- `get_audio_info` - Get detailed audio file information
- `get_midi_info` - Get detailed MIDI file information
### Audio Processing
- `convert_audio_format` - Convert between audio formats
- `convert_midi_to_audio` - Convert MIDI files to high-quality audio (WAV, 44.1kHz, 16-bit)
- `adjust_volume` - Adjust audio levels in dB
- `normalize_audio` - Normalize audio to target levels
- `trim_audio` - Cut audio to specific time ranges
- `apply_audio_effect` - Apply reverb, echo, distortion, filters
### Mixing & Arrangement
- `merge_audio_files` - Combine multiple audio files
- `mix_audio_files` - Mix tracks with individual volume control (supports both audio and MIDI)
### Playback
- `play_audio` - Play audio files with optional start time and duration
- `play_midi` - Play MIDI files with optional start time and duration
### Utility
- `list_available_effects` - List all audio effects
- `list_drum_patterns` - List available drum patterns
## Step 5: Running the Agent
Start the music creator agent:
```bash
# From the project root
dexto --agent agents/music-agent/music-agent.yml
```
## Step 6: Testing with Example Prompts
Let's test the agent with some example prompts to understand how it works:
### Music Generation
```
"Create a melody in G major at 140 BPM for 15 seconds"
```
**What happens**: The agent calls `create_melody` with the specified key, tempo, and duration.
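Under the hood, any melody generator must first map the requested key to concrete pitches. A minimal sketch of that mapping using MIDI note numbers (the published server's actual logic may differ):

```python
# Major scale intervals in semitones from the root; G4 = MIDI note 67.
MAJOR_INTERVALS = [0, 2, 4, 5, 7, 9, 11]
NOTE_TO_MIDI = {"C": 60, "D": 62, "E": 64, "F": 65, "G": 67, "A": 69, "B": 71}

def major_scale(root: str) -> list[int]:
    """Return the seven MIDI notes of the major scale starting at root."""
    base = NOTE_TO_MIDI[root]
    return [base + interval for interval in MAJOR_INTERVALS]

print(major_scale("G"))  # [67, 69, 71, 72, 74, 76, 78]
```

A melody in G major is then just a sequence drawn from those pitches, timed according to the requested BPM.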
```
"Create a I-IV-V-I chord progression in D major"
```
**What happens**: The agent calls `create_chord_progression` with the Roman numeral progression and key.
### Audio Analysis
```
"Analyze the tempo and key of my song.mp3"
```
**What happens**: The agent calls `analyze_audio` to get comprehensive audio information.
```
"What's the BPM of this track?"
```
**What happens**: The agent calls `detect_tempo` to find the beats per minute (BPM).
### Audio Processing
```
"Convert my song.wav to MP3 format"
```
**What happens**: The agent calls `convert_audio_format` to change the file format.
```
"Convert my MIDI melody to WAV format"
```
**What happens**: The agent calls `convert_midi_to_audio` to synthesize the MIDI file.
### Audio Effects
```
"Add reverb to my guitar with 200ms reverb time"
```
**What happens**: The agent calls `apply_audio_effect` with reverb parameters.
### Mixing & Playback
```
"Mix my vocals, guitar, and drums together with the vocals at +3dB"
```
**What happens**: The agent calls `mix_audio_files` with volume levels for each track.
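The `+3dB` in that request maps to a linear amplitude multiplier via the standard decibel formula; this is general audio math, not Dexto-specific code, but it helps when choosing mix levels:

```python
def db_to_gain(db: float) -> float:
    """Convert a decibel offset to a linear amplitude multiplier."""
    return 10 ** (db / 20)

# +3 dB boosts amplitude by ~1.41x; -6 dB roughly halves it
print(round(db_to_gain(3.0), 2))   # 1.41
print(round(db_to_gain(-6.0), 2))  # 0.5
```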
```
"Create a melody in G major and play it for 5 seconds"
```
**What happens**: The agent calls `create_melody` followed by `play_midi` to generate and preview.
## Step 7: Understanding the Workflow
Here's how the three components work together in a typical interaction:
1. **User Request**: "Create a rock song with drums and a melody in C major"
2. **Agent**: Interprets the request and orchestrates the workflow
3. **Tools**: Agent calls the processing functions:
- `create_drum_pattern()` - generates rock drum pattern
- `create_melody()` - creates C major melody
- `mix_audio_files()` - combines the tracks
4. **Response**: Agent provides the result with musical context
### Example Workflow
```
User: "Create a jazz melody in B minor, add some reverb, and play it for 10 seconds"
Agent Response:
"I'll help you create a jazz melody with reverb. Let me break this down:
1. First, I'll create a jazz melody in B minor
2. Then I'll add reverb to give it some space
3. Finally, I'll play it for you to hear
[Executes tools and provides results]"
```
## Supported Formats
- **Audio**: MP3, WAV, FLAC, OGG, M4A, AIFF, WMA
- **MIDI**: MID, MIDI
## Experimental Features
This agent is in active development. We encourage feedback on real-world usage, different genres, and various file sizes.
## Common Use Cases
- **Music Production**: Create backing tracks, generate drum patterns, compose melodies
- **Audio Editing**: Clean up recordings, normalize levels, apply effects
- **Music Analysis**: Analyze tempo, key, and musical features
- **Educational**: Learn music theory through generation and experimentation
---
**Ready to start?** Launch the agent and begin creating intelligent music workflows!
> **💡 Tip**: This is an experimental agent, so we encourage you to try different use cases and provide feedback to help improve the tools!

---
sidebar_position: 7
---
# Product Name Scout Agent
Learn how to build an AI agent that provides comprehensive product name research and brand validation capabilities. This tutorial shows how to create an agent that can analyze potential product names through search engine analysis, developer platform collision detection, and automated scoring algorithms.
## 🎥 Demo Video
Watch the Product Name Scout Agent in action:
<iframe
width="100%"
height="400"
src="https://www.youtube.com/embed/oReKtfZuHYY"
title="Product Name Scout Agent Demo"
frameborder="0"
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; fullscreen"
allowfullscreen="true"
></iframe>
## What You'll Build
A product name research agent that can:
- Check for domain availability (.com, .ai, .dev, .io, etc.) for your product name
- Analyze search engine results for brand competition across Google, DuckDuckGo, and Brave
- Check autocomplete suggestions to identify spelling and recognition issues
- Detect conflicts on developer platforms (GitHub, npm, PyPI)
- Conduct competitive research and trademark conflict assessment
- Provide a final score from 1 to 100 based on all of these factors
## Understanding the Architecture
The product name scout agent follows Dexto's framework design with clear separation of responsibilities:
1. **MCP Servers with tools**: Multiple specialized servers for different aspects of name research. These handle specific research tasks (SERP analysis, domain checking, etc.)
2. **Agent**: Orchestrates complex research workflows and synthesizes findings
This architecture allows the agent to conduct thorough research while maintaining clear, actionable insights.
## MCP Server Code
The core functionality is provided by three MCP servers. To understand the complete MCP server implementations, refer to the [mcp-servers repository](https://github.com/truffle-ai/mcp-servers/tree/main):
- **Product Name Scout Server**: [src/product-name-scout](https://github.com/truffle-ai/mcp-servers/tree/main/src/product-name-scout) - SERP analysis, autocomplete, dev collisions, and scoring
- **Domain Checker Server**: [src/domain-checker](https://github.com/truffle-ai/mcp-servers/tree/main/src/domain-checker) - Domain availability checking via WHOIS and DNS
- **DuckDuckGo Server**: External third-party server for web search capabilities
## Step 1: Setting Up the Project
The product name research agent uses multiple MCP servers for comprehensive analysis:
```
agents/product-name-researcher/
├── product-name-researcher.yml # Agent configuration
└── README.md # Documentation
```
## Step 2: Quick Setup
The agent uses published MCP servers that are automatically installed:
```bash
# From the dexto project root
dexto --agent agents/product-name-researcher/product-name-researcher.yml
```
The agent will automatically download and install all required MCP servers:
- `truffle-ai-domain-checker-mcp` - Domain availability checking
- `duckduckgo-mcp-server` - Web search and competitive research
- `@truffle-ai/product-name-scout-mcp` - Advanced name analysis tools
## Step 3: Understanding the Agent Configuration
The agent is configured in `product-name-researcher.yml`:
```yaml
systemPrompt: |
You are a specialized Product Name Research Agent focused on helping entrepreneurs,
product managers, and marketing teams validate potential product names through
comprehensive research. Your expertise combines domain availability checking with
competitive landscape analysis and advanced searchability assessment.
mcpServers:
# Domain availability checking
domain-checker:
type: stdio
command: uvx
args:
- truffle-ai-domain-checker-mcp
# Web search for competitive research
duckduckgo:
type: stdio
command: uvx
args:
- duckduckgo-mcp-server
# Advanced product name analysis
product-name-scout:
type: stdio
command: npx
args:
- "@truffle-ai/product-name-scout-mcp"
llm:
provider: anthropic
model: claude-sonnet-4-5-20250929
apiKey: $ANTHROPIC_API_KEY
toolConfirmation:
mode: auto-approve
```
### Key Components Explained
1. **systemPrompt**: Defines specialized product name research expertise
2. **mcpServers**: Connects to three complementary research tools
3. **toolConfirmation**: Auto-approves tools for seamless research workflow
4. **llm**: Configures the language model for intelligent analysis
## Step 4: Available Tools
The product name scout agent provides 9 specialized research tools across three categories:
### Domain Research Tools (3)
- `check_domain` - Check availability of a single domain
- `check_multiple_domains` - Check multiple domains simultaneously
- `check_domain_variations` - Check a base name across multiple TLD extensions
### Advanced Name Analysis Tools (4)
- `check_brand_serp` - Analyze search engine results for brand competition
- `get_autocomplete` - Get search engine autocomplete suggestions
- `check_dev_collisions` - Check for existing projects on GitHub, npm, PyPI
- `score_name` - Comprehensive scoring across multiple brand viability factors
### Competitive Research Tools (2)
- `search` - DuckDuckGo search for competitive analysis and market research
- `get_content` - Extract and analyze content from specific web pages
## Step 5: Research Methodology
The agent follows a systematic approach to product name validation:
### For Single Name Research:
1. **Domain Availability Check**: Verify availability across key TLDs (.com, .io, .app, etc.)
2. **SERP Competition Analysis**: Assess existing brand presence in search results
3. **Autocomplete Pattern Analysis**: Understand search behavior and spelling issues
4. **Developer Platform Conflicts**: Check for existing projects on GitHub, npm, PyPI
5. **Competitive Research**: Search for existing companies/products with similar names
6. **Trademark Assessment**: Search for trademark conflicts and legal issues
7. **Comprehensive Scoring**: Generate overall viability score with detailed breakdown
### For Multiple Name Comparison:
1. **Batch Domain Analysis**: Check all names across key TLD extensions
2. **Parallel Research**: Conduct SERP and collision analysis for each name
3. **Comparison Matrix**: Create comprehensive comparison including all factors
4. **Scoring & Ranking**: Rank names based on availability, conflicts, and strategic value
5. **Final Recommendation**: Provide clear recommendation with detailed reasoning
## Step 6: Running the Agent
Start the product name research agent:
```bash
# From the project root
dexto --agent agents/product-name-researcher/product-name-researcher.yml
```
## Step 7: Testing with Example Prompts
Let's test the agent with realistic product name research scenarios:
### Basic Name Validation
```
"I'm considering 'ZenFlow' as a product name for a productivity app. Can you research this name?"
```
**What happens**: The agent orchestrates a complete research workflow:
1. Checks domain availability for zenflow.com, zenflow.io, etc.
2. Analyzes search competition for "ZenFlow"
3. Checks autocomplete suggestions
4. Searches GitHub, npm, and PyPI for conflicts
5. Provides comprehensive scoring and recommendations
### Domain-Focused Research
```
"Check domain availability for 'CodeCraft' across all major extensions"
```
**What happens**: The agent calls `check_domain_variations` to check .com, .net, .org, .io, .app, .dev, and .tech extensions simultaneously.
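Conceptually, a variation check just expands the base name across a TLD list before querying each candidate. An illustrative sketch of that expansion (the actual tool additionally performs WHOIS/DNS lookups per domain):

```python
TLDS = [".com", ".net", ".org", ".io", ".app", ".dev", ".tech"]

def domain_variations(base_name: str, tlds: list[str] = TLDS) -> list[str]:
    """Expand a product name into candidate domains across common TLDs."""
    name = base_name.lower().replace(" ", "")
    return [name + tld for tld in tlds]

print(domain_variations("CodeCraft"))
# ['codecraft.com', 'codecraft.net', 'codecraft.org', ...]
```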
### Competitive Analysis
```
"Research existing companies using the name 'DataFlow' and assess trademark risks"
```
**What happens**: The agent combines multiple tools:
- `search` for competitive landscape analysis
- `check_brand_serp` for search presence assessment
- `get_content` to analyze competitor websites
- Synthesizes findings into trademark risk assessment
### Multiple Name Comparison
```
"Compare these three product names for a fintech startup: 'PayEase', 'CashFlow', and 'MoneyBridge'"
```
**What happens**: The agent conducts parallel research on all three names and provides a ranked comparison with detailed scoring breakdown.
### Developer-Focused Research
```
"I'm launching a new JavaScript library. Check if 'ReactFlow' conflicts with existing packages"
```
**What happens**: The agent focuses on `check_dev_collisions` for GitHub and npm, plus domain research for library website.
## Step 8: Understanding the Scoring System
The `score_name` tool uses weighted scoring across multiple factors:
### Scoring Factors (with weights):
- **SERP Emptiness (3x)**: Lower search competition = higher score
- **Spellability (2x)**: Easy to spell and remember names score higher
- **Autocomplete Behavior (2x)**: Clean autocomplete patterns score higher
- **Developer Collisions (2x)**: Fewer conflicts on dev platforms = higher score
- **Domain Availability (3x)**: More available domains = higher score
- **Trademark Conflicts (2x)**: Lower trademark risk = higher score
- **International Considerations (1x)**: Global usability assessment
### Score Interpretation:
- **80-100**: Excellent name with strong availability and low conflicts
- **60-79**: Good name with minor issues to consider
- **40-59**: Moderate concerns requiring careful evaluation
- **20-39**: Significant conflicts or availability issues
- **0-19**: High-risk name with major obstacles
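A hedged sketch of how such a weighted score could be computed from per-factor ratings. The weights come from the list above; the per-factor ratings (0-100) are hypothetical inputs, and the published tool's exact formula may differ:

```python
# Weights taken from the scoring-factor list above (3x, 2x, 1x).
WEIGHTS = {
    "serp_emptiness": 3, "spellability": 2, "autocomplete": 2,
    "dev_collisions": 2, "domain_availability": 3,
    "trademark": 2, "international": 1,
}

def weighted_score(factors: dict[str, float]) -> float:
    """Weighted average of per-factor ratings, each rated 0-100."""
    total_weight = sum(WEIGHTS.values())  # 15
    return sum(WEIGHTS[k] * factors[k] for k in WEIGHTS) / total_weight

# Hypothetical ratings for a name with moderate competition
ratings = {
    "serp_emptiness": 50, "spellability": 90, "autocomplete": 70,
    "dev_collisions": 55, "domain_availability": 70,
    "trademark": 60, "international": 80,
}
print(round(weighted_score(ratings)))  # 66 - lands in the "good" 60-79 band
```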
## Step 9: Example Research Workflow
Here's a complete workflow for researching a product name:
```
User: "Research 'CloudSync' for a file synchronization service"
Agent Response:
"I'll conduct comprehensive research on 'CloudSync' for your file synchronization service. Let me analyze multiple factors:
🔍 DOMAIN AVAILABILITY
[Checks cloudsync.com, cloudsync.io, cloudsync.app, etc.]
- cloudsync.com: ❌ NOT AVAILABLE
- cloudsync.io: ✅ AVAILABLE
- cloudsync.app: ✅ AVAILABLE
🔍 SEARCH COMPETITION ANALYSIS
[Analyzes Google search results for 'CloudSync']
- Found ~45,000 results
- Existing competitors in cloud storage space
- Mixed signals on brand conflicts
🔍 DEVELOPER PLATFORM CHECK
[Searches GitHub, npm, PyPI for conflicts]
- GitHub: 12 repositories with 'cloudsync'
- npm: 3 packages with similar names
- PyPI: 1 existing package
🔍 AUTOCOMPLETE ANALYSIS
[Checks search suggestions]
- Google suggests: 'cloudsync app', 'cloudsync software'
- Shows some existing recognition
📊 COMPREHENSIVE SCORE: 67/100
- Moderate availability with some conflicts
- Strong domain options available (.io, .app)
- Consider alternative: 'SyncCloud' or 'CloudFlow'
💡 RECOMMENDATION: Proceed with caution. While .io and .app domains are available, existing competition suggests trademark research needed. Consider variants like 'SyncStream' or 'CloudVault' for cleaner positioning."
```
## Research Best Practices
### Competitive Research Guidelines:
- Search for exact name matches and close variations
- Research across different industries and markets
- Look for existing trademarks and brand registrations
- Check social media presence and brand mentions
- Consider international markets and global brand presence
### Search Strategy Guidelines:
- Use specific queries: "[name] company", "[name] trademark", "[name] brand"
- Search industry-specific usage: "[name] [industry]", "[name] product"
- Look for legal conflicts: "[name] lawsuit", "[name] trademark dispute"
- Research naming trends in the target industry
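Those query patterns are easy to generate programmatically when vetting several candidate names at once. An illustrative helper whose templates come straight from the guidelines above:

```python
# Query templates from the search strategy guidelines
TEMPLATES = [
    '"{name}" company', '"{name}" trademark', '"{name}" brand',
    '"{name}" {industry}', '"{name}" lawsuit',
]

def research_queries(name: str, industry: str) -> list[str]:
    """Expand a candidate name into a set of vetting search queries."""
    return [t.format(name=name, industry=industry) for t in TEMPLATES]

for query in research_queries("ZenFlow", "productivity"):
    print(query)
```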
## Common Use Cases
- **Startup Name Validation**: Complete due diligence for new company names
- **Product Launch Research**: Validate product names before market entry
- **Brand Extension Analysis**: Research names for new product lines
- **Trademark Risk Assessment**: Identify potential legal conflicts early
- **Domain Strategy Planning**: Optimize domain portfolio decisions
- **Competitive Intelligence**: Understand market landscape and positioning
---
**Ready for comprehensive name research?** Start the agent and begin validating your product names with professional-grade analysis tools!

---
sidebar_position: 4
---
# Talk2PDF Agent
In this tutorial, we'll build a custom AI agent that can parse PDF documents and make them consumable by LLMs. We'll walk through the process step by step, explaining what we're doing and why.
## What We're Building
We want to create an agent that can:
- Parse PDF files and extract text content
- Search for specific terms within documents
- Provide intelligent analysis and summaries
- Handle errors gracefully
The key insight is that we'll separate concerns: a custom MCP server handles the low-level PDF parsing, while the agent provides intelligent interaction.
## Step 1: Understanding the Architecture
Our agent will have two main components:
1. **MCP Server**: Handles PDF parsing operations (tools)
2. **Agent**: Provides intelligent analysis and user interaction
This separation allows the agent to focus on understanding and analysis while the MCP server handles the technical PDF processing.
## Step 2: Quick Setup
The talk2pdf agent uses a published MCP server that's automatically installed:
```bash
# From the dexto project root
dexto --agent agents/talk2pdf-agent/talk2pdf-agent.yml
```
That's it! The MCP server (`@truffle-ai/talk2pdf-mcp`) will be automatically downloaded and installed via `npx` on first run.
## Step 3: Understanding the MCP Server
The core functionality is provided by the **Talk2PDF MCP Server**, a TypeScript-based server built with the Model Context Protocol SDK. To understand the complete MCP server implementation, refer to the [mcp-servers repository](https://github.com/truffle-ai/mcp-servers/tree/main):
- **Talk2PDF Server**: [src/talk2pdf](https://github.com/truffle-ai/mcp-servers/tree/main/src/talk2pdf) - PDF parsing, text extraction, and content analysis tools
### What's Happening Behind the Scenes
The published MCP server includes these key dependencies:
- **@modelcontextprotocol/sdk**: The MCP framework for server communication
- **pdf-parse-debugging-disabled**: PDF parsing without debug console output
- **zod**: Runtime type validation for tool parameters
- **TypeScript**: Compiled to JavaScript for reliable execution
## Step 4: Available Tools
The talk2pdf MCP server provides two main tools:
1. **`parse_pdf`**: Extract all text and metadata from a PDF
2. **`extract_section`**: Search for specific content within a PDF
Here's how the MCP server is structured (you can view the full implementation at [https://github.com/truffle-ai/mcp-servers/tree/main/src/talk2pdf](https://github.com/truffle-ai/mcp-servers/tree/main/src/talk2pdf)):
```typescript
// Core server structure
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import { z } from 'zod';
import { readFileSync, existsSync } from 'fs';
import { extname } from 'path';
import pdf from 'pdf-parse-debugging-disabled';

class Talk2PDFMCPServer {
  private server: McpServer;

  constructor() {
    this.server = new McpServer(
      { name: 'talk2pdf', version: '1.0.0' },
      { capabilities: { tools: {}, resources: {} } }
    );
    this.registerTools();
  }

  private registerTools(): void {
    // Tools are registered here
  }

  async start(): Promise<void> {
    const transport = new StdioServerTransport();
    await this.server.connect(transport);
    // Log to stderr: stdout is reserved for MCP protocol messages
    console.error('Talk2PDF MCP Server started');
  }
}
```
**What's happening here?**
- We create an MCP server with the name 'talk2pdf'
- The server registers tools that the agent can call
- The server communicates via stdio (standard input/output)
## Step 5: Adding the Parse Tool
The `parse_pdf` tool is our main workhorse. It needs to:
1. Validate the file exists and is a PDF
2. Extract text content
3. Extract metadata (page count, title, author, etc.)
4. Return structured data
```typescript
this.server.tool(
  'parse_pdf',
  'Parse a PDF file and extract its text content and metadata',
  {
    filePath: z.string().describe('Path to the PDF file'),
    includeMetadata: z.boolean().optional().default(true),
    maxPages: z.number().optional(),
  },
  async ({ filePath, includeMetadata = true, maxPages }) => {
    // Validate the file exists and is a PDF
    if (!existsSync(filePath)) {
      throw new Error(`File not found: ${filePath}`);
    }
    if (extname(filePath).toLowerCase() !== '.pdf') {
      throw new Error(`File is not a PDF: ${filePath}`);
    }

    // Parse PDF and extract content
    const dataBuffer = readFileSync(filePath);
    const pdfData = await pdf(dataBuffer, { max: maxPages });

    // Extract metadata
    const metadata = {
      pageCount: pdfData.numpages,
      title: pdfData.info?.Title,
      author: pdfData.info?.Author,
      fileSize: dataBuffer.length,
      fileName: filePath.split('/').pop() || filePath,
    };

    // Return structured result
    const result = {
      content: pdfData.text,
      metadata: includeMetadata ? metadata : {
        pageCount: metadata.pageCount,
        fileSize: metadata.fileSize,
        fileName: metadata.fileName,
      },
    };

    return {
      content: [{ type: 'text', text: JSON.stringify(result, null, 2) }],
    };
  }
);
```
**Key points:**
- We validate the file first (safety)
- We use `pdf-parse-debugging-disabled`, a build of `pdf-parse` with debug console output disabled, so nothing stray is written to stdout
- We return JSON so the agent can easily parse the result
## Step 6: Adding the Search Tool
The `extract_section` tool allows searching within PDFs:
```typescript
this.server.tool(
  'extract_section',
  'Extract specific content from a PDF document',
  {
    filePath: z.string().describe('Path to the PDF file'),
    searchTerm: z.string().optional().describe('Search term to find'),
  },
  async ({ filePath, searchTerm }) => {
    // Validate file exists and is a PDF
    if (!existsSync(filePath)) {
      throw new Error(`File not found: ${filePath}`);
    }
    const fileExtension = extname(filePath).toLowerCase();
    if (fileExtension !== '.pdf') {
      throw new Error(`File is not a PDF: ${filePath}`);
    }

    // Parse the PDF
    const pdfData = await pdf(readFileSync(filePath));
    let extractedContent = pdfData.text;

    // If search term provided, filter content
    if (searchTerm) {
      const lines = extractedContent.split('\n');
      const matchingLines = lines.filter(line =>
        line.toLowerCase().includes(searchTerm.toLowerCase())
      );
      extractedContent = matchingLines.join('\n');
    }

    const result = {
      fileName: filePath.split('/').pop() || filePath,
      totalPages: pdfData.numpages,
      extractedContent,
      searchTerm: searchTerm || null,
      contentLength: extractedContent.length,
    };

    return { content: [{ type: 'text', text: JSON.stringify(result, null, 2) }] };
  }
);
```
**What this does:**
- Parses the PDF to get all text
- If a search term is provided, filters lines containing that term
- Returns the filtered content
## Step 7: Understanding the Agent Configuration
The agent configuration connects to the published MCP server:
```yaml
# agents/talk2pdf-agent/talk2pdf-agent.yml
mcpServers:
  talk2pdf:
    type: stdio
    command: npx
    args:
      - "@truffle-ai/talk2pdf-mcp"
    timeout: 30000
    connectionMode: strict

systemPrompt:
  contributors:
    - id: primary
      type: static
      priority: 0
      content: |
        You are a Talk2PDF Agent. You can parse PDF files, extract their text and metadata, and provide summaries or extract specific sections for LLM consumption.

        ## Your Capabilities
        - Parse PDF files and extract all text content and metadata
        - Extract specific sections or search for terms within a PDF
        - Provide intelligent analysis, summarization, and insights based on the extracted content
        - Handle errors gracefully and provide clear feedback

        Always ask for the file path if not provided. If a file is not a PDF or does not exist, inform the user.

llm:
  provider: openai
  model: gpt-5-mini
  apiKey: $OPENAI_API_KEY
```
**Key configuration points:**
- The agent connects to the published MCP server via stdio
- We use `npx @truffle-ai/talk2pdf-mcp` to run the compiled server
- The system prompt tells the agent what it can do and how to behave
## Step 8: Testing the Agent
Now we can test our agent:
```bash
# From the project root
dexto --agent ./agents/talk2pdf-agent/talk2pdf-agent.yml
```
Once started, try these interactions:
**Parse and summarize:**
```
Parse the PDF at /path/to/document.pdf and summarize the key points
```
**Search for content:**
```
Find all mentions of "budget" in the financial report at /path/to/report.pdf
```
## How It All Works Together
1. **User asks a question** about a PDF
2. **Agent understands** what the user wants
3. **Agent calls the appropriate tool** (`parse_pdf` or `extract_section`)
4. **MCP server processes** the PDF and returns structured data
5. **Agent analyzes** the returned data and provides intelligent response
6. **User gets** a helpful, contextual answer
## What We've Accomplished
We've created a complete PDF parsing agent that demonstrates:
- **Separation of concerns**: Tools handle technical operations, agent handles intelligence
- **Error handling**: Proper validation and graceful error messages
- **Flexible architecture**: Easy to extend with more tools
- **Distributed architecture**: Published MCP server for easy deployment and updates
The agent can now parse PDFs, extract content, search for terms, and provide intelligent analysis, all through natural language interaction, using a published MCP server that's installed automatically.
## Next Steps
This pattern can be extended to:
- Add more document formats (DOCX, TXT)
- Implement document comparison
- Add OCR for scanned PDFs
- Create document classification
The key insight is that by separating the technical operations (published MCP server) from the intelligence (agent), we create a flexible, maintainable system that's easy to extend and debug. The published server approach means updates and improvements can be distributed automatically without requiring changes to the agent configuration.
---
sidebar_position: 1
title: "CLI & Configuration"
---
# Building Agents with CLI & Configuration
The fastest way to build AI agents with Dexto. Define your agent in YAML, run it with the CLI.
## Quick Start
### 1. Create an Agent Configuration
```yaml
# agents/my-agent.yml
systemPrompt: |
  You are a helpful assistant that can read and analyze files.

mcpServers:
  filesystem:
    type: stdio
    command: npx
    args: ["-y", "@modelcontextprotocol/server-filesystem", "."]

llm:
  provider: openai
  model: gpt-5-mini
  apiKey: $OPENAI_API_KEY
```
### 2. Run Your Agent
```bash
# Interactive CLI mode
dexto --agent agents/my-agent.yml
# Web UI mode
dexto --agent agents/my-agent.yml --mode web
# Single task execution
dexto --agent agents/my-agent.yml "List all TypeScript files in this project"
```
That's it! Your agent is running with filesystem access.
## Configuration Anatomy
Every agent configuration has three main sections:
```yaml
# 1. System Prompt - Defines agent behavior
systemPrompt: |
  You are a [role] that [capabilities].

  Your guidelines:
  - [Rule 1]
  - [Rule 2]

# 2. MCP Servers - Tools and capabilities
mcpServers:
  server_name:
    type: stdio | sse | http
    command: npx                    # for stdio
    args: ["-y", "@package/name"]
    # OR
    url: http://localhost:8080     # for sse/http

# 3. LLM Configuration - Model settings
llm:
  provider: openai | anthropic | google | groq
  model: gpt-5-mini | claude-sonnet-4-5-20250929 | gemini-2.0-flash
  apiKey: $ENV_VAR_NAME
  temperature: 0.7  # optional
```
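The `$ENV_VAR_NAME` values are read from the shell environment when the config is loaded, so the variable must be exported before you launch the agent. A quick check (placeholder key shown):

```shell
# Export the variable referenced by the config (placeholder value)
export OPENAI_API_KEY="sk-your-key-here"

# Confirm it is visible to child processes such as the dexto CLI
printenv OPENAI_API_KEY
```

Dexto substitutes the value at load time; the actual key never needs to appear in the YAML file itself.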
## Common Patterns
### Adding Multiple Tools
```yaml
mcpServers:
  filesystem:
    type: stdio
    command: npx
    args: ["-y", "@modelcontextprotocol/server-filesystem", "."]
  web_search:
    type: stdio
    command: npx
    args: ["-y", "tavily-mcp@0.1.3"]
    env:
      TAVILY_API_KEY: $TAVILY_API_KEY
  playwright:
    type: stdio
    command: npx
    args: ["-y", "@playwright/mcp@latest"]
```
### Environment-Specific Models
```yaml
# Development - faster, cheaper
llm:
  provider: openai
  model: gpt-5-mini
  apiKey: $OPENAI_API_KEY

# Production - more capable
llm:
  provider: anthropic
  model: claude-sonnet-4-5-20250929
  apiKey: $ANTHROPIC_API_KEY
```
### System Prompt from File
```yaml
systemPrompt:
  contributors:
    - id: base-prompt
      type: file
      files: ["./prompts/base-prompt.md"]
    - id: company-context
      type: file
      files: ["./docs/company-info.md"]
```
## CLI Commands
```bash
# Basic usage
dexto --agent <config.yml>
# Modes
dexto --agent <config.yml> --mode cli # Interactive terminal (default)
dexto --agent <config.yml> --mode web # Web UI + API server
dexto --agent <config.yml> --mode server # Headless API server
dexto --agent <config.yml> --mode mcp # Run as MCP server
# Options
dexto --agent <config.yml> --api-port 3001 # Custom API port
dexto --agent <config.yml> --web-port 3000 # Custom web port
# Single task (non-interactive)
dexto --agent <config.yml> "Your task here"
```
## What's Next?
- **[Example Agents](./examples/building-triage-system.md)** - Step-by-step tutorials for building real agents
- **[MCP Server Registry](../../mcp/overview.md)** - Browse available tools
- **[Configuration Guide](../../guides/configuring-dexto/overview.md)** - Deep dive into all options
- **[Dexto Agent SDK](../sdk/index.md)** - Switch to programmatic control when needed
---
sidebar_position: 0
title: "Overview"
---
Dexto gives you two ways to build AI agents. Pick the approach that fits your needs.
## Two Ways to Build
### CLI & Configuration
**Build agents with YAML configuration files.** No code required.
```yaml
# my-agent.yml
systemPrompt: You are a helpful assistant.

llm:
  provider: openai
  model: gpt-5-mini
  apiKey: $OPENAI_API_KEY
```
```bash
dexto --agent my-agent.yml
```
**Best for:** Quick prototypes, simple agents, config-driven workflows
→ [Get started with CLI & Configuration](./cli/index.md)
---
### Dexto Agent SDK
**Build agents programmatically with TypeScript.** Full control over behavior and integration.
```typescript
import { DextoAgent } from '@dexto/core';

const agent = new DextoAgent({
  llm: { provider: 'openai', model: 'gpt-5-mini', apiKey: process.env.OPENAI_API_KEY }
});

await agent.start();
const session = await agent.createSession();
const response = await agent.generate('Hello!', session.id);
```
**Best for:** Custom apps, production systems, complex integrations
→ [Get started with the Dexto Agent SDK](./sdk/index.md)
---
## Which Should I Choose?
| Use Case | Recommended |
|----------|-------------|
| Quick prototype or demo | CLI & Configuration |
| Simple task automation | CLI & Configuration |
| Multi-agent systems | CLI & Configuration |
| Custom web application | Dexto Agent SDK |
| Production backend service | Dexto Agent SDK |
| Embedding in existing app | Dexto Agent SDK |
## Need Help?
Join our [Discord community](https://discord.gg/GFzWFAAZcm) for questions and support.
{
"label": "Dexto Agent SDK",
"position": 2,
"collapsed": false,
"link": {
"type": "doc",
"id": "tutorials/sdk/index"
}
}
---
sidebar_position: 7
title: "Loading Agent Configs"
---
# Loading Agent Configs
You've been configuring agents inline with JavaScript objects. That works for quick scripts, but as your project grows, configs buried in code become hard to share, review, and manage across a team.
YAML config files solve this—the same approach Dexto's built-in agents use.
## The Problem
Here's what you've been doing:
```typescript
const agent = new DextoAgent({
  llm: {
    provider: 'openai',
    model: 'gpt-4o-mini',
    apiKey: process.env.OPENAI_API_KEY
  },
  systemPrompt: 'You are a helpful assistant.'
});
```
This gets messy fast: config drifts between team members, there's no easy way to share agent setups, and switching models means changing code.
## The Solution: YAML Config Files
Move your config to a file. Create `my-agent.yml`:
```yaml
systemPrompt: |
  You are a helpful assistant specializing in technical documentation.
  Be concise and provide code examples when relevant.

llm:
  provider: openai
  model: gpt-4o-mini
  apiKey: $OPENAI_API_KEY
```
Now load it:
```typescript
import { DextoAgent } from '@dexto/core';
import { loadAgentConfig, enrichAgentConfig } from '@dexto/agent-management';

const config = await loadAgentConfig('my-agent.yml');
const enriched = enrichAgentConfig(config, 'my-agent.yml');
const agent = new DextoAgent(enriched, 'my-agent.yml');
await agent.start();
```
**That's it.** Your config is now external, shareable, and version-controlled separately from your code.
This requires the `@dexto/agent-management` package:

```bash
npm install @dexto/agent-management
```
## What These Functions Do
**`loadAgentConfig(path)`** reads your YAML and:
- Validates the schema
- Expands environment variables (`$OPENAI_API_KEY` → actual value)
- Resolves relative paths
**`enrichAgentConfig(config, path)`** adds runtime paths:
- Logs: `~/.dexto/agents/my-agent/logs/`
- Storage: `~/.dexto/agents/my-agent/storage/`
- Database: `~/.dexto/agents/my-agent/db/`
Each agent gets isolated storage automatically, derived from the config filename.
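As an illustration of how per-agent paths can be derived from the config filename, here is a rough sketch of that derivation. This mirrors the `~/.dexto/agents/<name>/` layout shown above but is not Dexto's actual implementation:

```typescript
import { basename, extname, join } from 'path';
import { homedir } from 'os';

// Illustrative only: derive isolated per-agent directories from a
// config filename, e.g. "my-agent.yml" -> ~/.dexto/agents/my-agent/
function agentPaths(configPath: string) {
  const name = basename(configPath, extname(configPath)); // strips dir + extension
  const root = join(homedir(), '.dexto', 'agents', name);
  return {
    logs: join(root, 'logs'),
    storage: join(root, 'storage'),
    db: join(root, 'db'),
  };
}

console.log(agentPaths('my-agent.yml').logs);
```

Because the name comes from the filename, two configs named `coding-agent.yml` and `support-agent.yml` never share logs, storage, or databases.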
## Organizing Multiple Agents
Building a system with different agent types? Organize them in a folder:
```text
agents/
├── coding-agent.yml
├── research-agent.yml
└── support-agent.yml
```
Each config tailored to its task:
```yaml
# coding-agent.yml - Low temperature for deterministic code
llm:
  provider: anthropic
  model: claude-sonnet-4-5-20250929
  apiKey: $ANTHROPIC_API_KEY
  temperature: 0.3

systemPrompt: You are an expert software engineer.
```
```yaml
# support-agent.yml - Higher temperature for conversational tone
llm:
  provider: openai
  model: gpt-4o-mini
  apiKey: $OPENAI_API_KEY
  temperature: 0.8

systemPrompt: You are a friendly customer support assistant.
```
Load whichever you need:
```typescript
async function createAgent(type: string): Promise<DextoAgent> {
  const path = `agents/${type}.yml`;
  const config = await loadAgentConfig(path);
  const enriched = enrichAgentConfig(config, path);
  const agent = new DextoAgent(enriched, path);
  await agent.start();
  return agent;
}

// Pick the right agent for the task
const agent = await createAgent('coding-agent');
```
## When to Use What
**Use inline configs when:**
- Quick scripts and demos
- Config values computed at runtime
- Writing tests
**Use config files when:**
- Multiple agents in your application
- Team needs to review/modify configs
- You want version-controlled agent settings
**Hybrid approach**—load a file, override at runtime:
```typescript
const config = await loadAgentConfig('base-agent.yml');
config.llm.model = process.env.USE_ADVANCED ? 'gpt-4o' : 'gpt-4o-mini';

const enriched = enrichAgentConfig(config, 'base-agent.yml');
const agent = new DextoAgent(enriched, 'base-agent.yml');
```
## What's Next?
Config files work great for a handful of agents. But what if you're building a platform where users choose from many specialized agents? You need a way to list, discover, and manage them programmatically.
**Continue to:** [Agent Orchestration](./orchestration.md)
{
"label": "System Prompt Preparation",
"position": 9,
"collapsed": false,
"link": {
"type": "doc",
"id": "tutorials/sdk/context-management/prompt-contributors"
}
}
---
sidebar_position: 1
title: "System Prompt Preparation"
---
# System Prompt Preparation
Your agent's system prompt tells it how to behave. But what happens when you need that prompt to include content from files, user preferences, or information that changes at runtime?
## The Problem
You're building a support agent that needs product documentation in its context:
```yaml
# support-agent.yml
systemPrompt: |
  You are a support agent for Acme Corp.

  Here is the product documentation:
  [... 500 lines of docs pasted here ...]

  Here is the FAQ:
  [... 200 more lines ...]
```
This works, but it's a mess:
- Updating docs means editing the YAML file
- The config becomes huge and hard to read
- You can't reuse the same docs across multiple agents
## The Solution
Instead of one giant string, compose your prompt from **contributors**:
```yaml
# support-agent.yml
systemPrompt:
  contributors:
    - id: personality
      type: static
      priority: 1
      content: You are a friendly support agent for Acme Corp.
    - id: docs
      type: file
      priority: 2
      files:
        - ./knowledge/product-guide.md
        - ./knowledge/faq.md
```
Now your personality lives in the config, but documentation lives in separate files that are easy to update.
## How It Works
Contributors are assembled in priority order (lower number = first). The example above produces:
```
You are a friendly support agent for Acme Corp.
<fileContext>
## ./knowledge/product-guide.md
[contents of product-guide.md]
---
## ./knowledge/faq.md
[contents of faq.md]
</fileContext>
```
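Conceptually, assembly is just "sort by priority, then concatenate". A rough sketch of that step (illustrative only, not Dexto's actual implementation):

```typescript
interface Contributor {
  id: string;
  priority: number;
  render(): string; // static text, file contents, or runtime-computed content
}

// Illustrative sketch: lower priority numbers are rendered first
function assemblePrompt(contributors: Contributor[]): string {
  return [...contributors]
    .sort((a, b) => a.priority - b.priority)
    .map((c) => c.render())
    .join('\n');
}

const prompt = assemblePrompt([
  { id: 'docs', priority: 2, render: () => '<fileContext>...</fileContext>' },
  { id: 'personality', priority: 1, render: () => 'You are a friendly support agent.' },
]);
console.log(prompt);
```

Note that the order of the array doesn't matter; only the `priority` values determine where each contributor's content lands in the final prompt.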
## Contributor Types
### Static: Inline Text
For content that lives in your config:
```yaml
- id: personality
  type: static
  priority: 1
  content: |
    You are a helpful assistant.
    Always be concise and accurate.
```
### File: External Documents
For content that lives in separate files:
```yaml
- id: knowledge
  type: file
  priority: 2
  files:
    - ./docs/guide.md
    - ./docs/faq.md
```
Only `.md` and `.txt` files are supported. Files are cached by default to avoid repeated disk reads.
### Dynamic: Runtime Content
For content computed when the agent runs:
```yaml
- id: datetime
  type: dynamic
  priority: 3
  source: date
```
The built-in `date` source adds the current date, so your agent knows "today."
## A Complete Example
Here's a research agent with all the pieces:
```yaml
# research-agent.yml
name: Research Assistant

llm:
  provider: anthropic
  model: claude-sonnet-4-20250514
  apiKey: $ANTHROPIC_API_KEY

systemPrompt:
  contributors:
    - id: base
      type: static
      priority: 1
      content: |
        You are a market research assistant.
        Always cite sources when making claims.
    - id: industry
      type: file
      priority: 10
      files:
        - ./knowledge/industry-overview.md
        - ./knowledge/competitors.md
    - id: datetime
      type: dynamic
      priority: 20
      source: date
```
The final prompt includes:
1. Base personality (priority 1)
2. Industry knowledge from files (priority 10)
3. Current date/time (priority 20)
## Tips
**Space your priorities.** Use 1, 10, 20 instead of 1, 2, 3. This leaves room to insert new contributors later without renumbering.
**Disable without deleting.** Add `enabled: false` to temporarily skip a contributor:
```yaml
- id: verbose-docs
  type: file
  priority: 5
  enabled: false  # Skipped
  files: [./docs/detailed.md]
```
**Keep files focused.** Smaller, topic-specific files are easier to maintain than one giant knowledge base.
## What's Next?
You've learned how to build modular, maintainable system prompts. Continue exploring:
- **[SDK Guide](/docs/guides/dexto-sdk)** - Complete SDK documentation
- **[Configuration Reference](/docs/guides/configuring-dexto/systemPrompt)** - All system prompt options
---
sidebar_position: 6
title: "Handling Events"
---
# Handling Events
Your agent can now chat, remember conversations, serve multiple users, and use tools. But there's a problem: **you can't see what it's doing.**
When a user sends a message, your UI is blind. It doesn't know if the agent is thinking, streaming text, or calling tools. Users just see... nothing.
## The Problem
Without events, your UI looks frozen:
```typescript
const agent = new DextoAgent({
  llm: { provider: 'openai', model: 'gpt-4o-mini', apiKey: process.env.OPENAI_API_KEY }
});
await agent.start();
const session = await agent.createSession();

console.log('Sending message...');
const response = await agent.generate('Explain quantum computing', session.id);
// 10 seconds of silence...
console.log('Done:', response.content);
```
Users see "Sending message..." then wait 10 seconds with no feedback. Not great.
## The Solution: Events
Listen to what the agent is doing:
```typescript
const agent = new DextoAgent({
  llm: { provider: 'openai', model: 'gpt-4o-mini', apiKey: process.env.OPENAI_API_KEY }
});

// Listen BEFORE starting
agent.agentEventBus.on('llm:thinking', () => {
  console.log('Agent is thinking...');
});

agent.agentEventBus.on('llm:chunk', ({ content }) => {
  process.stdout.write(content); // Stream text as it arrives
});

agent.agentEventBus.on('llm:response', () => {
  console.log('\n✓ Complete');
});

await agent.start();
const session = await agent.createSession();
await agent.generate('Explain quantum computing', session.id);
```
Now users see:
1. "Agent is thinking..." (immediate feedback)
2. Text streaming word-by-word (real-time progress)
3. "Complete" (clear ending)
Much better!
## Core Events
### Thinking
```typescript
agent.agentEventBus.on('llm:thinking', ({ sessionId }) => {
  showLoadingSpinner(sessionId);
});
```
Fires when the agent starts processing. Show a loading indicator.
### Streaming Text
```typescript
agent.agentEventBus.on('llm:chunk', ({ sessionId, content }) => {
  appendText(sessionId, content);
});
```
Fires for each chunk of text. Build up the response in your UI.
### Response Complete
```typescript
agent.agentEventBus.on('llm:response', ({ sessionId, content, usage }) => {
  hideLoadingSpinner(sessionId);
  console.log(`Tokens used: ${usage?.totalTokens}`);
});
```
Fires when done. Hide loading, show final message.
## Showing Tool Usage
When your agent uses tools, show what it's doing:
```typescript
agent.agentEventBus.on('llm:tool-call', ({ sessionId, toolName, args }) => {
  showToolBanner(sessionId, `Using ${toolName}...`);
});

agent.agentEventBus.on('llm:tool-result', ({ sessionId, toolName, success }) => {
  if (success) {
    hideToolBanner(sessionId);
  } else {
    showError(sessionId, `Failed to use ${toolName}`);
  }
});
```
This gives users confidence—they see the agent working, not just waiting.
## Complete Example
Here's a simple chat UI with event handling:
```typescript
import { DextoAgent } from '@dexto/core';

const agent = new DextoAgent({
  llm: { provider: 'openai', model: 'gpt-4o-mini', apiKey: process.env.OPENAI_API_KEY }
});

// Track UI state
const uiState = new Map<string, {
  status: 'idle' | 'thinking' | 'streaming';
  currentMessage: string;
}>();

agent.agentEventBus.on('llm:thinking', ({ sessionId }) => {
  uiState.set(sessionId, { status: 'thinking', currentMessage: '' });
  updateUI(sessionId);
});

agent.agentEventBus.on('llm:chunk', ({ sessionId, content }) => {
  const state = uiState.get(sessionId)!;
  state.status = 'streaming';
  state.currentMessage += content;
  updateUI(sessionId);
});

agent.agentEventBus.on('llm:response', ({ sessionId }) => {
  const state = uiState.get(sessionId)!;
  state.status = 'idle';
  updateUI(sessionId);
});

function updateUI(sessionId: string) {
  const state = uiState.get(sessionId)!;
  if (state.status === 'thinking') {
    console.log(`[${sessionId}] 🤔 Thinking...`);
  } else if (state.status === 'streaming') {
    console.log(`[${sessionId}] ✍️ ${state.currentMessage}`);
  } else {
    console.log(`[${sessionId}] ✓ Done`);
  }
}

await agent.start();
```
## All Available Events
**LLM Events:**
- `llm:thinking` - Started processing
- `llm:chunk` - Text chunk arrived
- `llm:tool-call` - Calling a tool
- `llm:tool-result` - Tool finished
- `llm:response` - Response complete
**MCP Events:**
- `mcp:server-connected` - Tool server connected
- `mcp:server-disconnected` - Tool server disconnected
**Session Events:**
- `session:created` - New session created
- `session:deleted` - Session deleted
See the [Events API Reference](/api/sdk/events) for complete details.
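To push these events to a browser, a common pattern is Server-Sent Events: subscribe to the event bus and write each event as an SSE frame on an open HTTP response. Below is a minimal sketch of the frame formatting; the surrounding Express wiring (endpoint name, response setup) is assumed, not part of the Dexto API:

```typescript
// Format an agent event as a Server-Sent Events frame.
// A browser EventSource client receives these from an endpoint that
// subscribes to agent.agentEventBus and writes one frame per event.
function toSseFrame(event: string, data: unknown): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Example: forwarding a streaming chunk
const frame = toSseFrame('llm:chunk', { sessionId: 'user-alice', content: 'Hello' });
process.stdout.write(frame);
// In an Express handler you would call res.write(frame) for each event,
// after setting the 'Content-Type: text/event-stream' header.
```

The double newline at the end of each frame is what tells `EventSource` that the event is complete.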
## What's Next?
You've mastered the core SDK capabilities—creating agents, managing sessions, serving users, adding tools, and handling events. But you've been configuring everything inline with JavaScript objects.
Production applications need more: reusable configs, environment management, and programmatic agent orchestration. The next tutorials cover these production patterns.
**Continue to:** [Loading Agent Configs](./config-files.md)
---
sidebar_position: 1
title: "Dexto Agent SDK"
---
# Building with the Dexto Agent SDK
Learn to build production-ready AI agents with the Dexto SDK. These tutorials follow a progressive path—each one builds on the previous, adding one core concept at a time.
## Tutorial Path
Follow these in order for the best learning experience:
### 1. [Quick Start](./quick-start.md)
**Get your first AI response in 5 minutes.**
The minimal working example. No explanations, no complexity—just 15 lines of code that prove the SDK works. Once you see a response, you know you're set up correctly.
### 2. [Working with Sessions](./sessions.md)
**Give your agent memory.**
Right now your agent forgets everything after each response. Sessions let agents remember previous messages—the foundation for real conversations. Learn when to create sessions, how to manage them, and common patterns for different use cases.
### 3. [Multi-User Chat Endpoint](./multi-user-chat.md)
**Serve hundreds of users with one agent.**
Building a new agent for each user wastes resources. Learn the production pattern: one shared agent, multiple sessions, isolated conversations. Includes a complete Express server example with proper session management.
### 4. [Adding Tools](./tools.md)
**Turn your agent from a chatbot into an actor.**
LLMs only generate text—they can't read files, search the web, or query databases. Tools change that. Learn to add pre-built MCP servers, manage tools dynamically based on user permissions, and build custom tools for your specific needs.
### 5. [Handling Events](./events.md)
**Build responsive, production-quality UIs.**
Without events, your UI is blind—users see nothing while waiting for responses. Events let you show loading states, stream text in real-time, display tool usage, and handle errors gracefully. Includes complete SSE examples.
### 6. [Loading Agent Configs](./config-files.md)
**Move from inline configs to production-ready YAML files.**
You've been configuring agents inline with JavaScript objects. That works for demos, but production apps need reusable, shareable configs. Learn to load agent configs from YAML files, understand config enrichment, and manage multi-environment setups—the same pattern used by Dexto's built-in agents.
### 7. [Agent Orchestration](./orchestration.md)
**Manage multiple agents programmatically.**
So far you've worked with one agent at a time. But what if you're building a platform where users choose from specialized agents? Learn to use AgentManager to list, install, and manage multiple agents programmatically—build agent marketplaces, multi-tenant systems, and dynamic agent selection.
### 8. [System Prompt Preparation](./context-management/prompt-contributors.md)
**Build modular, maintainable system prompts.**
A giant system prompt string becomes a maintenance nightmare. Learn to compose prompts from multiple sources—static text, external files, and runtime content—each handling one piece of the puzzle.
## What You'll Build
By the end of these tutorials, you'll have:
- ✅ A working agent that can use multiple LLM providers
- ✅ Conversation memory across multiple turns
- ✅ Multi-user support with isolated sessions
- ✅ Real-world capabilities (file access, web search, databases)
- ✅ A responsive UI with streaming and progress indicators
- ✅ Production-ready config management with YAML files
- ✅ Programmatic agent orchestration and management
- ✅ Modular system prompts from multiple sources
## API Reference
Once you've completed the tutorials, dive deeper with the API docs:
- **[DextoAgent API](/api/sdk/dexto-agent)** - Complete method documentation
- **[Events Reference](/api/sdk/events)** - All available events and payloads
- **[Types Reference](/api/sdk/types)** - TypeScript type definitions
Ready to start? **[Begin with Quick Start →](./quick-start.md)**
---
id: multi-user-chat
sidebar_position: 4
title: "Multi-User Chat"
---
# Multi-User Chat
In the last tutorial, you learned that sessions give agents memory. Now here's the key insight: **one agent can manage hundreds of sessions simultaneously**. You don't need a separate agent instance for each user—you just map each user to their own session ID.
This tutorial has two parts:
- **Part I:** Understand the pattern programmatically
- **Part II:** Build an HTTP server to expose it
## Prerequisites
- Completed the [Sessions tutorial](./sessions.md)
- Node.js 18+
- `OPENAI_API_KEY` in your environment
## Part I: Understanding the Pattern
### The Core Idea
```
One Agent + Many Sessions = Multi-User Support
User A → Session A ┐
User B → Session B ├─→ Single DextoAgent
User C → Session C ┘
```
Each user gets their own session, but they all share the same agent instance.
### Build It Programmatically
Create `multi-user.ts`:
```typescript
import { DextoAgent } from '@dexto/core';

// One shared agent for all users
const agent = new DextoAgent({
  llm: { provider: 'openai', model: 'gpt-4o-mini', apiKey: process.env.OPENAI_API_KEY }
});
await agent.start();

// Track which session belongs to which user
const userSessions = new Map<string, string>();

async function getOrCreateSession(userId: string) {
  // Check if this user already has a session
  const existing = userSessions.get(userId);
  if (existing) return existing;

  // Create a new session for this user
  const session = await agent.createSession(`user-${userId}`);
  userSessions.set(userId, session.id);
  return session.id;
}

async function handleMessage(userId: string, message: string) {
  const sessionId = await getOrCreateSession(userId);
  const response = await agent.generate(message, sessionId);
  return response.content;
}
```
### Test It
Add a test function to verify it works:
```typescript
async function test() {
  console.log('Alice:', await handleMessage('alice', 'My name is Alice'));
  console.log('Bob:', await handleMessage('bob', 'My name is Bob'));
  console.log('Alice:', await handleMessage('alice', 'What is my name?'));
  // Should respond "Alice" - proving sessions are isolated
}

test().catch(console.error);
```
Run it:
```bash
export OPENAI_API_KEY=sk-...
npx tsx multi-user.ts
```
You should see Alice and Bob maintaining separate memories. This is the core pattern—master this before moving to HTTP.
### How It Works
```typescript
async function getOrCreateSession(userId: string) {
  const existing = userSessions.get(userId);
  if (existing) return existing;

  const session = await agent.createSession(`user-${userId}`);
  userSessions.set(userId, session.id);
  return session.id;
}
```
This function ensures:
- First message from a user → creates a new session
- Subsequent messages → reuses existing session
- Different users → different sessions
Once you understand this pattern, you can expose it over HTTP.
## Part II: Building the HTTP Server
Now let's make this accessible to frontends by adding an Express server.
### Install Express
```bash
npm install express
```
### Add the HTTP Layer
Create `chat-server.ts` with the same agent and session logic:
```typescript
import express from 'express';
import { DextoAgent } from '@dexto/core';

// One agent for everyone
const agent = new DextoAgent({
  llm: { provider: 'openai', model: 'gpt-4o-mini', apiKey: process.env.OPENAI_API_KEY }
});
await agent.start();

// Map users to sessions
const userSessions = new Map<string, string>();

async function getOrCreateSession(userId: string) {
  const existing = userSessions.get(userId);
  if (existing) return existing;
  const session = await agent.createSession(`user-${userId}`);
  userSessions.set(userId, session.id);
  return session.id;
}

// Express server
const app = express();
app.use(express.json());

app.post('/chat', async (req, res) => {
  try {
    const { userId, message } = req.body;
    if (!message) {
      return res.status(400).json({ error: 'Message required' });
    }

    const sessionId = await getOrCreateSession(userId || 'anonymous');
    const response = await agent.generate(message, sessionId);

    res.json({ content: response.content, sessionId });
  } catch (error) {
    console.error('Chat error:', error);
    res.status(500).json({ error: 'Internal server error' });
  }
});

const PORT = 3000;
app.listen(PORT, () => {
  console.log(`Chat server running on http://localhost:${PORT}`);
});
```
### Start the Server
```bash
export OPENAI_API_KEY=sk-...
npx tsx chat-server.ts
```
### Test with Multiple Users
In another terminal, send messages from different users:
```bash
# Alice's conversation
curl -X POST http://localhost:3000/chat \
-H "Content-Type: application/json" \
-d '{"userId":"alice","message":"My favorite color is blue"}'
# Bob's conversation (completely separate)
curl -X POST http://localhost:3000/chat \
-H "Content-Type: application/json" \
-d '{"userId":"bob","message":"My favorite color is red"}'
# Alice's follow-up
curl -X POST http://localhost:3000/chat \
-H "Content-Type: application/json" \
-d '{"userId":"alice","message":"What is my favorite color?"}'
# Response: "Your favorite color is blue"
```
Perfect! One agent, multiple users, isolated conversations.
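A related concern once traffic grows: if one user fires two messages in quick succession, both hit the same session concurrently and the replies can interleave. A common fix is a tiny per-user promise chain that serializes work. This is a sketch under assumptions, not SDK code; `handle` stands in for the `agent.generate` call:

```typescript
// Per-user serialization: chain each user's tasks so messages in one
// session are processed in order even when requests overlap.
// `handle` is a stand-in for calling agent.generate.

const order: string[] = [];

async function handle(msg: string): Promise<string> {
  order.push(`start:${msg}`);
  // The first message is artificially slow to show the queue holding order.
  await new Promise((r) => setTimeout(r, msg === 'first' ? 20 : 0));
  order.push(`end:${msg}`);
  return msg;
}

const queues = new Map<string, Promise<unknown>>();

function enqueue<T>(userId: string, task: () => Promise<T>): Promise<T> {
  const prev = queues.get(userId) ?? Promise.resolve();
  const next = prev.then(task, task); // run even if the previous task failed
  queues.set(userId, next);
  return next;
}

// Without the queue, 'second' would finish before 'first'.
await Promise.all([
  enqueue('alice', () => handle('first')),
  enqueue('alice', () => handle('second')),
]);
console.log(order);
```

Different users still run fully in parallel, since each `userId` has its own chain.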
### Add Session Management (Optional)
Add endpoints for managing user sessions:
```typescript
// Reset a user's conversation
app.post('/chat/reset', async (req, res) => {
const { userId } = req.body;
const sessionId = userSessions.get(userId);
if (sessionId) {
await agent.resetConversation(sessionId);
res.json({ success: true });
} else {
res.json({ success: false, message: 'No active session' });
}
});
// List active users
app.get('/chat/users', async (req, res) => {
const users = Array.from(userSessions.keys());
res.json({ users, count: users.length });
});
```
Test the reset:
```bash
curl -X POST http://localhost:3000/chat/reset \
-H "Content-Type: application/json" \
-d '{"userId":"alice"}'
```
## Key Takeaways
**The Pattern:**
- One agent instance shared across all users
- One session per user, stored in a Map
- Session lookup/creation handled by `getOrCreateSession()`
**Why It Works:**
- Agent instances are expensive (they load models, connect to services)
- Sessions are cheap (just conversation history)
- Sharing one agent is efficient and scales well
**Production Note:**
In production, store `userSessions` in Redis or a database instead of memory, so sessions persist across server restarts.
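One way to prepare for that swap is to hide the Map behind a small interface now, so a Redis-backed class can drop in later without touching the chat handler. The interface and class names below are illustrative, not part of the SDK:

```typescript
// Hiding the session lookup behind an interface so the in-memory
// version can later be swapped for Redis (or a database) unchanged.

interface SessionStore {
  get(userId: string): Promise<string | undefined>;
  set(userId: string, sessionId: string): Promise<void>;
}

class InMemorySessionStore implements SessionStore {
  private map = new Map<string, string>();
  async get(userId: string) {
    return this.map.get(userId);
  }
  async set(userId: string, sessionId: string) {
    this.map.set(userId, sessionId);
  }
}

// A Redis-backed version would implement the same interface, e.g.
// (sketch only — assumes a connected `redis` client):
// class RedisSessionStore implements SessionStore {
//   async get(userId: string) {
//     return (await redis.get(`session:${userId}`)) ?? undefined;
//   }
//   async set(userId: string, sessionId: string) {
//     await redis.set(`session:${userId}`, sessionId);
//   }
// }

const store: SessionStore = new InMemorySessionStore();
await store.set('alice', 'sess-1');
const found = await store.get('alice');
const missing = await store.get('bob');
console.log(found, missing);
```

Because the interface is already async, switching backends changes no call sites.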
## What's Next?
Your agent can now handle multiple users, but it's still just a text generator. What if it could read files, search the web, or query databases? That's where tools come in.
**Continue to:** [Adding Tools](./tools.md)


@@ -0,0 +1,190 @@
---
sidebar_position: 8
title: "Agent Orchestration"
---
# Agent Orchestration
You've learned to load agent configs from YAML files. But what if you're building a platform with many specialized agents? Users might choose between a coding agent, research agent, or support agent—and you need a way to list, discover, and create them dynamically.
`AgentManager` solves this with a simple registry file.
## The Problem
With multiple agents, you end up hardcoding paths everywhere:
```typescript
// You have to know and hardcode every config path
const codingConfig = await loadAgentConfig('agents/coding-agent.yml');
const supportConfig = await loadAgentConfig('agents/support-agent.yml');
// Want to list available agents? Manual work
const available = ['coding-agent', 'support-agent', 'research-agent'];
```
No metadata, no discovery, no organization.
## The Solution: Registry + Manager
Create a registry file that describes your agents. Then use `AgentManager` to work with them.
**Step 1:** Create your agent configs in `agents/`:
```yaml
# agents/coding-agent.yml
systemPrompt: You are an expert coding assistant.
llm:
provider: openai
model: gpt-4o
apiKey: $OPENAI_API_KEY
```
**Step 2:** Create `agents/registry.json`:
```json
{
"agents": [
{
"id": "coding-agent",
"name": "Coding Assistant",
"description": "Expert coding assistant for development tasks",
"configPath": "./coding-agent.yml",
"tags": ["coding", "development"]
},
{
"id": "support-agent",
"name": "Support Assistant",
"description": "Friendly customer support agent",
"configPath": "./support-agent.yml",
"tags": ["support", "customer-service"]
}
]
}
```
**Step 3:** Use `AgentManager`:
```typescript
import { AgentManager } from '@dexto/agent-management';
const manager = new AgentManager('./agents/registry.json');
await manager.loadRegistry();
// Discover what's available
const agents = manager.listAgents();
console.log(agents);
// [
// { id: 'coding-agent', name: 'Coding Assistant', description: '...', tags: [...] },
// { id: 'support-agent', name: 'Support Assistant', description: '...', tags: [...] }
// ]
// Create an agent by ID
const agent = await manager.loadAgent('coding-agent');
await agent.start();
const session = await agent.createSession();
const response = await agent.generate('Write a function to reverse a string', session.id);
console.log(response.content);
```
**That's it.** Your agents are now organized, discoverable, and easy to manage.
## Registry Format
Each agent entry needs:
- `id` - Unique identifier (used in `loadAgent('id')`)
- `name` - Human-readable display name
- `description` - What this agent does
- `configPath` - Path to YAML config (relative to registry.json)
Optional fields:
- `tags` - For filtering/categorization
- `author` - Who created the agent
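If you generate or hand-edit registry files, a small validator catches missing fields before `AgentManager` ever sees them. The field names mirror the registry format above; the validator itself is illustrative, not part of the SDK:

```typescript
// A typed registry entry plus a minimal validator for the required fields.
// Field names follow the registry.json format; this helper is a sketch.

interface RegistryEntry {
  id: string;
  name: string;
  description: string;
  configPath: string;
  tags?: string[];
  author?: string;
}

function validateEntry(raw: unknown): RegistryEntry {
  const e = raw as Record<string, unknown>;
  for (const field of ['id', 'name', 'description', 'configPath']) {
    const value = e[field];
    if (typeof value !== 'string' || value.length === 0) {
      throw new Error(`registry entry missing required field: ${field}`);
    }
  }
  return raw as RegistryEntry;
}

const ok = validateEntry({
  id: 'coding-agent',
  name: 'Coding Assistant',
  description: 'Expert coding assistant for development tasks',
  configPath: './coding-agent.yml',
});

let rejected = false;
try {
  validateEntry({ id: 'broken' }); // missing name, description, configPath
} catch {
  rejected = true;
}
console.log(ok.id, rejected);
```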
## The AgentManager API
```typescript
const manager = new AgentManager('./registry.json');
// Load the registry first (required)
await manager.loadRegistry();
// List all agents with metadata
const agents = manager.listAgents();
// Check if an agent exists
if (manager.hasAgent('coding-agent')) {
// Create a DextoAgent instance
const agent = await manager.loadAgent('coding-agent');
await agent.start();
}
```
## Routing Requests to Different Agents
A common pattern: pre-load your agents and route requests:
```typescript
const manager = new AgentManager('./agents/registry.json');
// Create agents upfront
const agents = {
code: await manager.loadAgent('coding-agent'),
support: await manager.loadAgent('support-agent'),
};
// Start them all
await Promise.all(Object.values(agents).map(a => a.start()));
// Route by request type
async function handleRequest(type: 'code' | 'support', message: string) {
const agent = agents[type];
const session = await agent.createSession();
return agent.generate(message, session.id);
}
// Use it
const response = await handleRequest('code', 'Write a binary search');
```
## Filtering Agents
Use the metadata to find the right agent:
```typescript
await manager.loadRegistry();
const agents = manager.listAgents();
// Find by tag
const codingAgents = agents.filter(a => a.tags?.includes('coding'));
// Find by description
const supportAgents = agents.filter(a =>
a.description.toLowerCase().includes('support')
);
// Create the first match
if (codingAgents.length > 0) {
const agent = await manager.loadAgent(codingAgents[0].id);
}
```
## When to Use AgentManager
**Use `loadAgentConfig` directly when:**
- You have 1-3 agents
- You know exactly which configs to load
- Simple applications
**Use `AgentManager` when:**
- Many agents that users can choose from
- You need to list/discover available agents
- Building agent marketplaces or platforms
- Dynamic agent selection based on metadata
## What's Next?
You can now create and manage multiple agents. The next tutorial covers system prompt preparation—how to build modular prompts from files and runtime content instead of one giant string.
**Continue to:** [System Prompt Preparation](./context-management/prompt-contributors.md)


@@ -0,0 +1,181 @@
---
sidebar_position: 2
title: "Quick Start"
---
# Quick Start
Let's get your first AI response in under 5 minutes. No complexity, no explanations—just working code that proves the SDK does what you need.
## What You'll Build
A 15-line script that:
1. Creates an AI agent
2. Asks it a question
3. Prints the answer
That's it. Once this works, you'll know the SDK is set up correctly.
## Prerequisites
- Node.js 18 or higher
- An API key from [OpenAI](https://platform.openai.com), [Anthropic](https://console.anthropic.com), or [Cohere](https://dashboard.cohere.com)
## Install
```bash
npm install @dexto/core
```
## Write Your First Agent
Create `first-agent.ts`:
```typescript
import { DextoAgent } from '@dexto/core';
const agent = new DextoAgent({
llm: {
provider: 'openai',
model: 'gpt-4o-mini',
apiKey: process.env.OPENAI_API_KEY
}
});
await agent.start();
const session = await agent.createSession();
const response = await agent.generate('Explain TypeScript in one sentence.', session.id);
console.log(response.content);
await agent.stop();
```
## Run It
```bash
export OPENAI_API_KEY=your-key-here
npx tsx first-agent.ts
```
You should see a concise explanation of TypeScript. If you do, **you're done**. The SDK is working.
## What Just Happened?
Let's break down each step:
### 1. Configure the Agent
```typescript
const agent = new DextoAgent({
llm: { provider: 'openai', model: 'gpt-4o-mini', apiKey: process.env.OPENAI_API_KEY }
});
```
**What it does:** Configure which LLM to use. That's it for setup—no complex initialization code needed.
---
### 2. Start the Agent
```typescript
await agent.start();
```
**What it does:** Initializes internal services (LLM client, storage, event system). The agent is now ready to process messages.
---
### 3. Create a Session
```typescript
const session = await agent.createSession();
```
**What it does:** Creates a conversation session to maintain working memory. Each session tracks its own conversation history.
---
### 4. Generate a Response
```typescript
const response = await agent.generate('Explain TypeScript in one sentence.', session.id);
console.log(response.content);
```
**What it does:** Send a message and get the AI response. The response includes the content, token usage, and any tool calls made.
---
### 5. Stop the Agent
```typescript
await agent.stop();
```
**What it does:** Clean up resources when you're done. Always call this to properly shut down services.
---
**The Pattern:** Every Dexto agent follows this lifecycle: **configure → start → create session → generate → stop**.
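In real code, wrap the middle of that lifecycle in `try`/`finally` so `stop()` runs even when a call throws. The sketch below uses a `FakeAgent` stand-in rather than `DextoAgent` so it is self-contained:

```typescript
// The configure → start → generate → stop lifecycle with guaranteed
// cleanup. FakeAgent is a stand-in for DextoAgent; generate() here
// always fails, to show that stop() still runs.

class FakeAgent {
  started = false;
  stopped = false;
  async start() {
    this.started = true;
  }
  async stop() {
    this.stopped = true;
  }
  async generate(_msg: string): Promise<string> {
    throw new Error('simulated LLM failure');
  }
}

const agent = new FakeAgent();
let caught = '';

await agent.start();
try {
  await agent.generate('hello');
} catch (err) {
  caught = (err as Error).message;
} finally {
  await agent.stop(); // always clean up, success or failure
}
console.log(caught, agent.stopped);
```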
## Try Different Providers
Swap providers by changing three lines:
### Google (Gemini)
```typescript
const agent = new DextoAgent({
llm: {
provider: 'google',
model: 'gemini-2.0-flash-exp',
apiKey: process.env.GOOGLE_API_KEY
}
});
```
### Anthropic (Claude)
```typescript
const agent = new DextoAgent({
llm: {
provider: 'anthropic',
model: 'claude-sonnet-4-5-20250929',
apiKey: process.env.ANTHROPIC_API_KEY
}
});
```
### Cohere
```typescript
const agent = new DextoAgent({
llm: {
provider: 'cohere',
model: 'command-a-03-2025',
apiKey: process.env.COHERE_API_KEY
}
});
```
### Local Models (Ollama, vLLM, etc.)
```typescript
const agent = new DextoAgent({
llm: {
provider: 'openai',
model: 'llama-3.1-70b',
apiKey: 'dummy',
baseURL: 'http://localhost:8080/v1'
}
});
```
Any OpenAI-compatible API works with `provider: 'openai'` and a custom `baseURL`.
## What's Next?
Right now your agent forgets everything after each run. In the next tutorial, you'll learn how sessions let your agent remember previous messages—the foundation for building real conversations.
**Continue to:** [Working with Sessions](./sessions.md)


@@ -0,0 +1,212 @@
---
sidebar_position: 3
title: "Working with Sessions"
---
# Working with Sessions
Sessions give your agent working memory—the ability to reference earlier messages in the current conversation, build context, and maintain coherent multi-turn interactions. Every conversation in Dexto happens within a session.
:::info What is Working Memory?
**Working memory** is the conversation history maintained within a session. It's the context the agent uses to understand and respond to your current conversation—like remembering what you said two messages ago.
This is distinct from other types of memory (like long-term facts or user preferences), which you'll learn about in later tutorials.
:::
## What Sessions Do
A session maintains conversation history (working memory). Messages in the same session can reference each other. Messages in different sessions are completely isolated.
Here's a conversation with working memory:
```typescript
const agent = new DextoAgent({
llm: { provider: 'openai', model: 'gpt-4o-mini', apiKey: process.env.OPENAI_API_KEY }
});
await agent.start();
const session = await agent.createSession();
await agent.generate('My name is Sarah.', session.id);
const response = await agent.generate('What is my name?', session.id);
console.log(response.content);
// "Your name is Sarah."
```
The agent remembers because both messages used the same `sessionId`.
Now create a new session—this has no working memory of the first conversation:
```typescript
const newSession = await agent.createSession();
const response = await agent.generate('What is my name?', newSession.id);
console.log(response.content);
// "I don't have that information."
```
**Different sessions = isolated working memory.** This is how one agent can handle multiple users or conversation threads simultaneously—each with their own conversation context.
## Creating Sessions
Sessions are cheap—create as many as you need:
```typescript
const agent = new DextoAgent({
llm: { provider: 'openai', model: 'gpt-4o-mini', apiKey: process.env.OPENAI_API_KEY }
});
await agent.start();
// Auto-generated ID
const session1 = await agent.createSession();
console.log(session1.id); // "sess_abc123def456"
// Custom ID (useful for mapping to users)
const session2 = await agent.createSession('user-sarah-2024');
console.log(session2.id); // "user-sarah-2024"
```
Custom IDs make it easy to tie sessions to your own user system. Just make sure they're unique.
## Building Multi-Turn Conversations
Pass the same `sessionId` on every message:
```typescript
const session = await agent.createSession('demo');
await agent.generate('I want to build a REST API in Node.js.', session.id);
await agent.generate('What framework should I use?', session.id);
const response = await agent.generate('Show me a simple example.', session.id);
console.log(response.content);
// The agent remembers you want a Node.js REST API and suggests Express with example code
```
Each message adds to the session's history. The LLM sees the full conversation every time.
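To make that concrete, here is a toy history store showing what "sees the full conversation" means: every call sends the whole accumulated message array, so context grows with each turn. This is an illustration of the idea, not SDK internals:

```typescript
// Toy per-session history: each recorded turn is appended, and the
// whole array is what would be sent to the model on the next call.

type Msg = { role: 'user' | 'assistant'; content: string };
const histories = new Map<string, Msg[]>();

function record(sessionId: string, role: Msg['role'], content: string): Msg[] {
  const history = histories.get(sessionId) ?? [];
  history.push({ role, content });
  histories.set(sessionId, history);
  return history; // the full context for the next model call
}

record('demo', 'user', 'I want to build a REST API in Node.js.');
record('demo', 'assistant', 'Great choice - consider Express.');
const payload = record('demo', 'user', 'Show me a simple example.');

console.log(payload.length); // grows by one with every turn
```

This growth is also why long sessions cost more tokens per message, which later tutorials on context management address.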
## Inspecting Session History
Check what the agent remembers:
```typescript
const history = await agent.getSessionHistory('demo');
for (const message of history) {
console.log(`[${message.role}]: ${message.content.substring(0, 60)}...`);
}
```
Output:
```text
[user]: I want to build a REST API in Node.js.
[assistant]: Building a REST API in Node.js is a great choice. Here are...
[user]: What framework should I use?
[assistant]: Express is the most popular choice for Node.js REST APIs...
[user]: Show me a simple example.
[assistant]: Here's a minimal Express API...
```
This is useful for debugging, showing conversation history in your UI, or understanding token usage.
## Managing Sessions
### List All Active Sessions
```typescript
const sessions = await agent.listSessions();
console.log(`Active sessions: ${sessions.length}`);
```
### Check if a Session Exists
```typescript
const session = await agent.getSession('user-sarah-2024');
if (session) {
console.log('Session found');
} else {
console.log('Session not found - creating new one');
await agent.createSession('user-sarah-2024');
}
```
### Reset a Session
Clear history but keep the session ID:
```typescript
await agent.resetConversation('demo');
// Session 'demo' still exists, but all messages are gone
```
Use this for "start new conversation" buttons in your UI.
### Delete a Session
Remove everything:
```typescript
await agent.deleteSession('demo');
// Session no longer exists
```
## When to Create Sessions
**One session per conversation thread.** Here are common patterns:
### Pattern 1: One Session Per User
Simple apps where each user has one ongoing conversation:
```typescript
// User logs in
const sessionId = `user-${userId}`;
const session = await agent.getSession(sessionId);
if (!session) {
await agent.createSession(sessionId);
}
// Every message from this user uses the same session
await agent.generate(userMessage, sessionId);
```
### Pattern 2: Multiple Sessions Per User
Apps like ChatGPT where users create multiple conversation threads:
```typescript
// User creates a new chat
const sessionId = `user-${userId}-chat-${chatId}`;
await agent.createSession(sessionId);
// User switches between chats
await agent.generate(message, `user-${userId}-chat-${currentChatId}`);
```
### Pattern 3: Session Per Task
Short-lived sessions for specific tasks:
```typescript
// User starts a support ticket
const sessionId = `ticket-${ticketId}`;
await agent.createSession(sessionId);
// All messages related to this ticket use this session
await agent.generate(message, sessionId);
// Ticket resolved? Delete the session
await agent.deleteSession(sessionId);
```
## What's Next?
You now know how to give your agent working memory. But what if you have hundreds of users all talking to the same agent? You could create a new agent instance for each user, but that's wasteful.
In the next tutorial, you'll learn how one agent can serve multiple users simultaneously—each with their own isolated session and working memory.
**Continue to:** [Multi-User Chat Endpoint](./multi-user-chat.md)


@@ -0,0 +1,177 @@
---
sidebar_position: 5
title: "Adding Tools"
---
# Adding Tools
Your agent can chat, remember conversations, and serve multiple users. But ask it to "read a file" or "search the web," and it can't. LLMs only generate text—they don't interact with the world.
**Tools** change that. Tools let your agent read files, search the web, query databases, and more.
## The Problem
Without tools, your agent can't do much:
```typescript
import { DextoAgent } from '@dexto/core';
const agent = new DextoAgent({
llm: { provider: 'openai', model: 'gpt-4o-mini', apiKey: process.env.OPENAI_API_KEY }
});
await agent.start();
const session = await agent.createSession();
const response = await agent.generate(
'Read package.json and tell me the version',
session.id
);
console.log(response.content);
// "I cannot read files. You'll need to provide the contents..."
```
The agent knows it can't read files. It needs tools.
## Adding Your First Tool
Give your agent filesystem access:
```typescript
const agent = new DextoAgent({
llm: { provider: 'openai', model: 'gpt-4o-mini', apiKey: process.env.OPENAI_API_KEY },
mcpServers: {
filesystem: {
type: 'stdio',
command: 'npx',
args: ['-y', '@modelcontextprotocol/server-filesystem', process.cwd()]
}
}
});
await agent.start();
const session = await agent.createSession();
const response = await agent.generate(
'Read package.json and tell me the version',
session.id
);
console.log(response.content);
// "The version in package.json is 2.1.4"
```
**That's it.** Add `mcpServers` with a filesystem configuration, and your agent can now:
- Read files
- Write files
- List directories
- Search files
- Create directories
The agent automatically chooses when to use these tools.
## Adding Multiple Tools
Add more capabilities by adding more servers:
```typescript
const agent = new DextoAgent({
llm: { provider: 'openai', model: 'gpt-4o-mini', apiKey: process.env.OPENAI_API_KEY },
mcpServers: {
filesystem: {
type: 'stdio',
command: 'npx',
args: ['-y', '@modelcontextprotocol/server-filesystem', process.cwd()]
},
web_search: {
type: 'stdio',
command: 'npx',
args: ['-y', '@modelcontextprotocol/server-brave-search'],
env: { BRAVE_API_KEY: process.env.BRAVE_API_KEY }
}
}
});
```
Now ask it to do both:
```typescript
await agent.generate(
'Search for TypeScript best practices and save a summary to tips.md',
session.id
);
```
The agent will:
1. Search the web with Brave
2. Read the results
3. Write a summary to `tips.md`
All automatically.
## Popular MCP Servers
Here are common tools you can add:
### Filesystem
```typescript
filesystem: {
type: 'stdio',
command: 'npx',
args: ['-y', '@modelcontextprotocol/server-filesystem', '/path/to/directory']
}
```
Tools: `read_file`, `write_file`, `list_directory`, `search_files`
### Web Search (Brave)
```typescript
web_search: {
type: 'stdio',
command: 'npx',
args: ['-y', '@modelcontextprotocol/server-brave-search'],
env: { BRAVE_API_KEY: process.env.BRAVE_API_KEY }
}
```
Tools: `brave_web_search`
### GitHub
```typescript
github: {
type: 'stdio',
command: 'npx',
args: ['-y', '@modelcontextprotocol/server-github'],
env: { GITHUB_TOKEN: process.env.GITHUB_TOKEN }
}
```
Tools: `create_issue`, `search_code`, `get_file_contents`
### PostgreSQL
```typescript
postgres: {
type: 'stdio',
command: 'npx',
args: ['-y', '@modelcontextprotocol/server-postgres'],
env: { DATABASE_URL: process.env.DATABASE_URL }
}
```
Tools: `query`, `list_tables`, `describe_table`
Find more at [mcp.run](https://mcp.run) and [awesome-mcp-servers](https://github.com/punkpeye/awesome-mcp-servers).
## How It Works
When you send a message, the agent:
1. Sees the available tools (from MCP servers)
2. Decides if it needs to use any tools
3. Calls the tools if needed
4. Uses the results to generate a response
You don't need to tell it when to use tools—it figures it out.
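The loop itself is easy to picture. The sketch below mimics it with a fake model and a fake `read_file` tool; the real loop lives inside the SDK, and every name here is a stand-in for illustration:

```typescript
// A minimal decide → call tool → incorporate result loop.
// fakeModel stands in for the LLM; tools.read_file stands in for an
// MCP filesystem tool.

type ToolCall = { tool: string; args: { path: string } };
type ModelTurn = { toolCall?: ToolCall; text?: string };

const tools: Record<string, (args: { path: string }) => string> = {
  read_file: ({ path }) =>
    path === 'package.json' ? '{"version":"2.1.4"}' : '',
};

// Fake model: asks for the file first, then answers using the result.
function fakeModel(messages: string[]): ModelTurn {
  const toolResult = messages.find((m) => m.startsWith('tool:'));
  if (!toolResult) {
    return { toolCall: { tool: 'read_file', args: { path: 'package.json' } } };
  }
  const version = JSON.parse(toolResult.slice(5)).version;
  return { text: `The version is ${version}` };
}

function run(userMessage: string): string {
  const messages = [userMessage];
  for (let step = 0; step < 5; step++) {        // cap the loop
    const turn = fakeModel(messages);            // 2. decide
    if (turn.text) return turn.text;             // 4. final response
    const { tool, args } = turn.toolCall!;       // 3. call the tool
    messages.push(`tool:${tools[tool](args)}`);  //    feed the result back
  }
  return 'gave up';
}

const answer = run('Read package.json and tell me the version');
console.log(answer);
```

The numbered comments map to the four steps above; the cap on iterations is the usual guard against a model that keeps requesting tools.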
## What's Next?
Your agent now has real capabilities. But how do you show what it's doing in your UI? How do you display "Reading file..." or "Searching web..." to your users?
That's where events come in.
**Continue to:** [Handling Events](./events.md)


@@ -0,0 +1,354 @@
import { themes as prismThemes } from 'prism-react-renderer';
import type { Config } from '@docusaurus/types';
import type * as Preset from '@docusaurus/preset-classic';
// This runs in Node.js - Don't use client-side code here (browser APIs, JSX...)
const config: Config = {
title: 'Dexto',
tagline: 'Build AI Agents with ease',
favicon: 'img/dexto/dexto_logo_icon.svg',
// Set the production url of your site here
url: 'https://docs.dexto.ai',
// Set the /<baseUrl>/ pathname under which your site is served
baseUrl: '/',
// Set to false to match Vercel configuration and avoid redirect issues
trailingSlash: false,
// GitHub pages deployment config.
// If you aren't using GitHub pages, you don't need these.
organizationName: 'truffle-ai', // Usually your GitHub org/user name.
projectName: 'dexto', // Usually your repo name.
onBrokenLinks: 'throw',
// Even if you don't use internationalization, you can use this field to set
// useful metadata like html lang. For example, if your site is Chinese, you
// may want to replace "en" with "zh-Hans".
i18n: {
defaultLocale: 'en',
locales: ['en'],
},
presets: [
[
'classic',
{
docs: false,
blog: {
showReadingTime: true,
feedOptions: {
type: ['rss', 'atom'],
xslt: true,
},
editUrl: 'https://github.com/truffle-ai/dexto/tree/main/docs/',
onInlineTags: 'warn',
onInlineAuthors: 'warn',
onUntruncatedBlogPosts: 'warn',
blogTitle: 'Dexto Blog',
blogDescription: 'The official blog for AI agents using Dexto',
blogSidebarCount: 'ALL',
},
theme: {
customCss: ['./src/css/brand.css', './src/css/custom.css'],
},
} satisfies Preset.Options,
],
[
'redocusaurus',
{
specs: [
{
id: 'dexto-api',
spec: 'static/openapi/openapi.json',
route: '/api/rest/',
},
],
theme: {
primaryColor: '#14b8a6',
},
},
],
],
themes: ['@docusaurus/theme-mermaid'],
markdown: {
mermaid: true,
},
themeConfig: {
// Replace with your project's social card
image: 'img/dexto-social-card.jpg',
algolia: {
appId: 'EHM21LFJ1P',
apiKey: 'e8246111c9f80ec60063d2b395b03ecc',
indexName: 'Dexto docs',
contextualSearch: true,
searchParameters: {},
searchPagePath: 'search',
// askAi: 'reomyK7JUIYj',
askAi: {
assistantId: 'reomyK7JUIYj',
},
},
docs: {
sidebar: {
hideable: true,
autoCollapseCategories: false,
},
},
colorMode: {
defaultMode: 'dark',
disableSwitch: false,
respectPrefersColorScheme: false,
},
navbar: {
logo: {
alt: 'Dexto Logo',
src: 'img/dexto/dexto_logo_light.svg',
srcDark: 'img/dexto/dexto_logo.svg',
},
hideOnScroll: false,
items: [
{
to: '/docs/getting-started/intro',
position: 'left',
label: 'Docs',
activeBaseRegex: `/docs/`,
},
{
to: '/examples/intro',
position: 'left',
label: 'Examples',
activeBaseRegex: `/examples/`,
},
{
to: '/api',
position: 'left',
label: 'API Reference',
activeBaseRegex: `/api/`,
},
{
to: '/blog',
position: 'left',
label: 'Blog',
},
{
type: 'search',
position: 'left',
},
{
href: 'https://discord.gg/GFzWFAAZcm',
position: 'right',
className: 'header-discord-link',
'aria-label': 'Discord community',
},
{
href: 'https://github.com/truffle-ai/dexto',
position: 'right',
className: 'header-github-link',
'aria-label': 'GitHub repository',
},
// Mobile-only social links (Discord + GitHub in one row at bottom of sidebar)
{
type: 'html',
position: 'right',
className: 'mobile-social-links',
value: `
<a href="https://discord.gg/GFzWFAAZcm" aria-label="Discord community" class="header-discord-link"></a>
<a href="https://github.com/truffle-ai/dexto" aria-label="GitHub repository" class="header-github-link"></a>
`,
},
],
},
footer: {
style: 'light',
links: [
{
title: 'Documentation',
items: [
{
label: 'Getting Started',
to: '/docs/getting-started/intro',
},
{
label: 'Guides',
to: '/docs/category/guides',
},
{
label: 'API Reference',
to: '/api',
},
],
},
{
title: 'Community',
items: [
{
label: 'Discord',
href: 'https://discord.gg/GFzWFAAZcm',
},
{
label: 'GitHub Discussions',
href: 'https://github.com/truffle-ai/dexto/discussions',
},
{
label: 'GitHub Issues',
href: 'https://github.com/truffle-ai/dexto/issues',
},
{
label: 'X (Twitter)',
href: 'https://x.com/truffleai_',
},
],
},
{
title: 'Resources',
items: [
{
label: 'Blog',
to: '/blog',
},
{
label: 'Examples',
to: '/examples/intro',
},
{
label: 'Contributing',
href: 'https://github.com/truffle-ai/dexto/blob/main/CONTRIBUTING.md',
},
{
label: 'Changelog',
href: 'https://github.com/truffle-ai/dexto/releases',
},
{
label: 'llms.txt',
href: 'https://docs.dexto.ai/llms.txt',
},
],
},
{
title: 'Truffle AI',
items: [
{
label: 'Website',
href: 'https://trytruffle.ai',
},
{
label: 'GitHub',
href: 'https://github.com/truffle-ai',
},
],
},
],
copyright: `Copyright © ${new Date().getFullYear()} Truffle AI. Built with ❤️ for developers.`,
},
prism: {
theme: prismThemes.oneLight,
darkTheme: prismThemes.oneDark,
additionalLanguages: [
'bash',
'diff',
'json',
'yaml',
'typescript',
'javascript',
'python',
'go',
'rust',
'docker',
],
},
mermaid: {
theme: { light: 'neutral', dark: 'dark' },
},
announcementBar: {
id: 'support_us',
content:
'⭐️ If you like Dexto, give it a star on <a target="_blank" rel="noopener noreferrer" href="https://github.com/truffle-ai/dexto">GitHub</a> and join our <a target="_blank" rel="noopener noreferrer" href="https://discord.gg/GFzWFAAZcm">Discord</a>! ⭐️',
backgroundColor: '#14b8a6',
textColor: '#ffffff',
isCloseable: true,
},
} satisfies Preset.ThemeConfig,
plugins: [
[
'@docusaurus/plugin-content-docs',
{
id: 'docs',
path: 'docs',
routeBasePath: 'docs',
sidebarPath: './sidebars.ts',
editUrl: 'https://github.com/truffle-ai/dexto/tree/main/docs/',
showLastUpdateAuthor: true,
showLastUpdateTime: true,
breadcrumbs: true,
},
],
[
'@docusaurus/plugin-content-docs',
{
id: 'examples',
path: 'examples',
routeBasePath: 'examples',
sidebarPath: './examples-sidebars.ts',
editUrl: 'https://github.com/truffle-ai/dexto/tree/main/docs/',
showLastUpdateAuthor: true,
showLastUpdateTime: true,
breadcrumbs: true,
},
],
[
'@docusaurus/plugin-content-docs',
{
id: 'api',
path: 'api',
routeBasePath: 'api',
sidebarPath: './api-sidebars.ts',
editUrl: 'https://github.com/truffle-ai/dexto/tree/main/docs/',
showLastUpdateAuthor: true,
showLastUpdateTime: true,
breadcrumbs: true,
},
],
'./src/plugins/markdown-route-plugin.ts',
],
headTags: [
{
tagName: 'meta',
attributes: {
name: 'algolia-site-verification',
content: '5AC61F66A1FBFC7D',
},
},
{
tagName: 'link',
attributes: {
rel: 'preconnect',
href: 'https://fonts.googleapis.com',
},
},
{
tagName: 'link',
attributes: {
rel: 'preconnect',
href: 'https://fonts.gstatic.com',
crossorigin: 'anonymous',
},
},
],
stylesheets: [
{
href: 'https://fonts.googleapis.com/css2?family=Geist:wght@300;400;500;600;700&family=Geist+Mono:wght@300;400;500;600&display=swap',
type: 'text/css',
},
],
};
export default config;


@@ -0,0 +1,48 @@
import type { SidebarsConfig } from '@docusaurus/plugin-content-docs';
const sidebars: SidebarsConfig = {
examplesSidebar: [
{
type: 'doc',
id: 'intro',
label: 'Overview',
},
{
type: 'category',
label: 'Agent Examples',
link: {
type: 'generated-index',
title: 'Agent Examples',
description: 'Practical examples showcasing Dexto agents in action.',
},
items: [
'podcast-agent',
'face-detection',
'snake-game',
'image-generation',
'amazon-shopping',
'email-slack',
'triage-agent',
],
},
{
type: 'category',
label: 'More Examples',
link: {
type: 'generated-index',
title: 'More Examples',
description: 'Additional examples and platform demonstrations.',
},
items: [
'portable-agents',
'memory',
'human-in-loop',
'mcp-integration',
'mcp-store',
'playground',
],
},
],
};
export default sidebars;
