@measey/mycoder-agent v1.0.0
MyCoder Agent
Core AI agent system that powers the MyCoder CLI tool. This package provides a modular tool-based architecture that allows AI agents to interact with files, execute commands, make network requests, spawn sub-agents for parallel task execution, and automate browser interactions.
Overview
The MyCoder Agent system is built around these key concepts:
- Extensible Tool System: Modular architecture with various tool categories
- Parallel Execution: Ability to spawn sub-agents for concurrent task processing
- Multi-LLM Support: Works with Anthropic Claude, OpenAI GPT models, and Ollama
- Web Automation: Built-in browser automation for web interactions
- Smart Logging: Hierarchical, color-coded logging system for clear output
- Advanced Text Editing: Powerful file manipulation capabilities
- MCP Integration: Support for the Model Context Protocol
For support, join the MyCoder.ai Discord: https://discord.gg/5K6TYrHGHt
Installation
npm install @measey/mycoder-agent
API Key Required
Before using MyCoder Agent, you need access to one of the following LLM providers:
- Anthropic: Set ANTHROPIC_API_KEY as an environment variable or in a .env file (get a key from https://www.anthropic.com/api)
- OpenAI: Set OPENAI_API_KEY as an environment variable or in a .env file
- Ollama: Use a locally running Ollama instance (no API key required)
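The provider options above could be resolved at startup with a small helper like the one below. This is a minimal sketch, not part of the package API: `chooseProvider`, `ProviderChoice`, and the priority order (Anthropic, then OpenAI, then a local Ollama fallback) are assumptions made for illustration.

```typescript
// Hypothetical helper: not part of the mycoder-agent API.
type Provider = 'anthropic' | 'openai' | 'ollama';

interface ProviderChoice {
  provider: Provider;
  apiKey?: string; // Ollama runs locally and needs no key
}

function chooseProvider(
  env: Record<string, string | undefined>,
): ProviderChoice {
  // Assumption: prefer Anthropic when both keys are present.
  if (env.ANTHROPIC_API_KEY) {
    return { provider: 'anthropic', apiKey: env.ANTHROPIC_API_KEY };
  }
  if (env.OPENAI_API_KEY) {
    return { provider: 'openai', apiKey: env.OPENAI_API_KEY };
  }
  // Assumption: fall back to a local Ollama instance when no key is set.
  return { provider: 'ollama' };
}
```

In a Node.js program this could be called as `chooseProvider(process.env)`, with the result passed on as the `provider` and `apiKey` options.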
Core Components
Tool System
The tool system is the foundation of the MyCoder agent's capabilities:
- Modular Design: Each tool is a standalone module with clear inputs and outputs
- Type Safety: Tools use Zod for schema validation and TypeScript for type safety
- Token Tracking: Built-in token usage tracking to optimize API costs
- Parallel Execution: Tools can run concurrently for efficiency
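To illustrate the "clear inputs and outputs" idea, here is a rough sketch of a tool that validates its input before executing. The package itself uses Zod for validation; this sketch hand-rolls a validator only to stay dependency-free. The `Tool` interface, `runTool`, and `repeatTool` are hypothetical names and do not reflect the package's actual types.

```typescript
// Hypothetical tool shape for illustration; the real package validates with Zod.
interface Tool<P, R> {
  name: string;
  description: string;
  parse: (input: unknown) => P; // throws on invalid input
  execute: (params: P) => R;
}

type ToolResult<R> = { ok: true; result: R } | { ok: false; error: string };

// Validate the raw input first, then execute; report failures as data.
function runTool<P, R>(tool: Tool<P, R>, rawInput: unknown): ToolResult<R> {
  try {
    const params = tool.parse(rawInput);
    return { ok: true, result: tool.execute(params) };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { ok: false, error: `${tool.name}: ${message}` };
  }
}

interface RepeatParams {
  text: string;
  times: number;
}

const repeatTool: Tool<RepeatParams, string> = {
  name: 'repeat',
  description: 'Repeat a string a given number of times',
  parse(input) {
    if (typeof input !== 'object' || input === null) {
      throw new Error('expected an object');
    }
    const { text, times } = input as Record<string, unknown>;
    if (typeof text !== 'string') {
      throw new Error('text must be a string');
    }
    if (typeof times !== 'number' || !Number.isInteger(times) || times < 0) {
      throw new Error('times must be a non-negative integer');
    }
    return { text, times };
  },
  execute: ({ text, times }) => text.repeat(times),
};
```

Returning validation errors as values rather than throwing lets an agent loop feed the error text back to the model so it can correct its next tool call.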
Agent System
The agent system orchestrates the execution flow:
- Main Agent: Primary agent that handles the overall task
- Sub-Agents: Specialized agents for parallel task execution
- Agent State Management: Tracking agent status and communication
- LLM Integration: Supports multiple LLM providers (Anthropic, OpenAI, Ollama)
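The main-agent and sub-agent relationship can be pictured as a fan-out over independent tasks. The sketch below is an assumption-laden illustration, not the package's implementation: `SubAgent`, `SubAgentOutcome`, and `runSubAgents` are hypothetical names, and a plain async function stands in for a real sub-agent.

```typescript
// Hypothetical stand-in for a sub-agent; the real package starts these via agentStart.
type SubAgent = (task: string) => Promise<string>;

interface SubAgentOutcome {
  task: string;
  status: 'done' | 'failed';
  output: string;
}

// Run every task concurrently and collect one outcome per task.
// A failing sub-agent is recorded instead of aborting the whole batch.
async function runSubAgents(
  tasks: string[],
  agent: SubAgent,
): Promise<SubAgentOutcome[]> {
  return Promise.all(
    tasks.map(async (task): Promise<SubAgentOutcome> => {
      try {
        return { task, status: 'done', output: await agent(task) };
      } catch (err) {
        return { task, status: 'failed', output: String(err) };
      }
    }),
  );
}
```

Catching errors per task keeps one failed sub-agent from discarding the results of the others, which matters when each sub-agent run consumes tokens.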
LLM Providers
The agent supports multiple LLM providers:
- Anthropic: Claude models with full tool use support
- OpenAI: GPT-4 and other OpenAI models with function calling
- Ollama: Local LLM support for privacy and offline use
Model Context Protocol (MCP)
MyCoder Agent supports the Model Context Protocol:
- Resource Loading: Load context from MCP-compatible servers
- Server Configuration: Configure multiple MCP servers
- Tool Integration: Use MCP-provided tools
Available Tools
File & Text Manipulation
- textEditor: View, create, and edit files with persistent state
- Commands: view, create, str_replace, insert, undo_edit
- Line number support and partial file viewing
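The textEditor commands can be pictured with an in-memory sketch like the one below. It assumes semantics similar to other text-editor tools of this kind: `str_replace` requires exactly one match, `insert` places text after a given line number, and `undo_edit` reverts the most recent change. `EditBuffer` is a hypothetical class; the real tool works on files on disk and its exact behavior may differ.

```typescript
// In-memory sketch; the real textEditor tool operates on files on disk.
class EditBuffer {
  private content: string;
  private history: string[] = [];

  constructor(initial: string) {
    this.content = initial;
  }

  get text(): string {
    return this.content;
  }

  // Return lines startLine..endLine (1-indexed, inclusive) with line numbers.
  view(startLine = 1, endLine?: number): string {
    const lines = this.content.split('\n');
    const end = endLine ?? lines.length;
    return lines
      .slice(startLine - 1, end)
      .map((line, i) => `${startLine + i}: ${line}`)
      .join('\n');
  }

  // Assumption: the old string must occur exactly once, so edits are unambiguous.
  strReplace(oldStr: string, newStr: string): void {
    if (oldStr === '') throw new Error('old string must not be empty');
    const count = this.content.split(oldStr).length - 1;
    if (count !== 1) {
      throw new Error(`expected exactly one match, found ${count}`);
    }
    this.history.push(this.content);
    // A replacer function avoids special handling of "$" in newStr.
    this.content = this.content.replace(oldStr, () => newStr);
  }

  // Insert text after the given line number (0 inserts at the top).
  insert(afterLine: number, text: string): void {
    const lines = this.content.split('\n');
    if (afterLine < 0 || afterLine > lines.length) {
      throw new Error('line number out of range');
    }
    this.history.push(this.content);
    lines.splice(afterLine, 0, text);
    this.content = lines.join('\n');
  }

  // Revert the most recent edit.
  undoEdit(): void {
    const previous = this.history.pop();
    if (previous === undefined) throw new Error('nothing to undo');
    this.content = previous;
  }
}
```

Keeping a snapshot per edit is the simplest way to support undo; it trades memory for clarity, which is acceptable for a sketch.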
System Interaction
- shellStart: Execute shell commands with sync/async modes
- shellMessage: Interact with running shell processes
- shellExecute: One-shot shell command execution
- listShells: List all running shell processes
Agent Management
- agentStart: Create sub-agents for parallel tasks
- agentMessage: Send messages to sub-agents and retrieve their output (including captured logs)
- agentDone: Complete the current agent's execution
- listAgents: List all running agents
The agent system automatically captures log, warn, and error messages from agents and their immediate tools, which are included in the output returned by agentMessage.
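One way to picture this capture behavior is a logger that records its own messages and forwards them to its direct parent only. This is a sketch under stated assumptions, not the package's logger: `CapturingLogger` is a hypothetical class, and reading "immediate tools" as "direct child loggers" is an interpretation.

```typescript
type CapturedLevel = 'log' | 'warn' | 'error';

// Hypothetical logger: records log/warn/error and forwards them one level up.
class CapturingLogger {
  readonly captured: string[] = [];
  readonly name: string;
  readonly parent?: CapturingLogger;

  constructor(name: string, parent?: CapturingLogger) {
    this.name = name;
    this.parent = parent;
  }

  // Create a logger for a tool (or nested component) under this one.
  child(name: string): CapturingLogger {
    return new CapturingLogger(name, this);
  }

  log(message: string): void {
    this.emit('log', message);
  }

  warn(message: string): void {
    this.emit('warn', message);
  }

  error(message: string): void {
    this.emit('error', message);
  }

  // Debug output is intentionally not captured in this sketch.
  debug(_message: string): void {}

  private emit(level: CapturedLevel, message: string): void {
    const line = `[${level}] ${this.name}: ${message}`;
    this.captured.push(line);
    // Forward to the direct parent only, so deeper descendants stay out
    // of the grandparent's captured output.
    if (this.parent) {
      this.parent.captured.push(line);
    }
  }
}
```

With this shape, an agent's `captured` array holds its own messages plus those of its direct tool loggers, which is the kind of output agentMessage is described as returning.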
Network & Web
- fetch: Make HTTP requests to APIs
- sessionStart: Start browser automation sessions
- sessionMessage: Control browser sessions (navigation, clicking, typing)
- listSessions: List all browser sessions
Utility Tools
- sleep: Pause execution for a specified duration
- userPrompt: Request input from the user
Project Structure
src/
├── core/                # Core agent and LLM abstraction
│   ├── llm/             # LLM providers and interfaces
│   │   └── providers/   # Anthropic, OpenAI, Ollama implementations
│   ├── mcp/             # Model Context Protocol integration
│   └── toolAgent/       # Tool agent implementation
├── tools/               # Tool implementations
│   ├── agent/           # Sub-agent tools
│   ├── fetch/           # HTTP request tools
│   ├── interaction/     # User interaction tools
│   ├── session/         # Browser automation tools
│   ├── shell/           # Shell execution tools
│   ├── sleep/           # Execution pause tool
│   └── textEditor/      # File manipulation tools
└── utils/               # Utility functions and logger
Technical Requirements
- Node.js >= 18.0.0
- pnpm >= 10.2.1
Browser Automation
The agent includes powerful browser automation capabilities using Playwright:
- Web Navigation: Visit websites and follow links
- Content Extraction: Extract and filter page content
- Element Interaction: Click buttons, fill forms, and interact with UI elements
- Waiting Strategies: Smart waiting for page loads and element visibility
Usage Example
import {
  toolAgent,
  textEditorTool,
  shellStartTool,
  Logger,
  LogLevel,
} from '@measey/mycoder-agent';
// Create a logger
const logger = new Logger({ name: 'MyAgent', logLevel: LogLevel.info });
// Define available tools
const tools = [textEditorTool, shellStartTool];
// Run the agent
const result = await toolAgent(
'Write a simple Node.js HTTP server and save it to server.js',
tools,
{
getSystemPrompt: () => 'You are a helpful coding assistant...',
maxIterations: 10,
},
{
logger,
provider: 'anthropic',
model: 'claude-3-opus-20240229',
apiKey: process.env.ANTHROPIC_API_KEY,
workingDirectory: process.cwd(),
},
);
console.log('Agent result:', result);
Contributing
We welcome contributions! Please see our CONTRIBUTING.md for development workflow and guidelines.
License
MIT