1.1.9 • Published 4 months ago

@agent-infra/mcp-server-browser v1.1.9

License
MIT
Repository
github

Browser Use MCP Server


A fast, lightweight Model Context Protocol (MCP) server that empowers LLMs with browser automation via Puppeteer’s structured accessibility data, featuring optional vision mode for complex visual understanding and flexible, cross-platform configuration.


Key Features

  • ⚡ Fast & lightweight. Uses Puppeteer's label index and the accessibility DOM tree rather than pixel-based input.
  • 👁️ Vision Mode Support. Optional visual understanding capabilities for complex layouts and visual elements when structured data isn't sufficient.
  • 🤖 LLM-optimized. No vision model is needed by default; the server operates purely on structured data, which reduces context token usage.
  • 🧩 Flexible Runtime Configuration. Customize viewport size, coordinate system factors, and User-Agent at runtime via HTTP headers.
  • 🌐 Cross-Platform & Extensible. Supports remote and local browsers, as well as a custom browser engine.

Requirements

  • Node.js 18 or newer
  • VS Code, Cursor, Windsurf, Claude Desktop or any other MCP client

Getting started

Local (Stdio)

First, install the Browser MCP server with your client. A typical configuration looks like this:

{
  "mcpServers": {
    "browser": {
      "command": "npx",
      "args": [
        "@agent-infra/mcp-server-browser@latest"
      ]
    }
  }
}

You can also install the Browser MCP server using the VS Code CLI:

# For VS Code
code --add-mcp '{"name":"browser","command":"npx","args":["@agent-infra/mcp-server-browser@latest"]}'

After installation, the Browser MCP server will be available for use with your GitHub Copilot agent in VS Code.

In Cursor, go to Cursor Settings -> MCP -> Add new MCP Server. Name it as you like and use the command type with the command npx @agent-infra/mcp-server-browser. You can also verify the config or add command-line arguments by clicking Edit.

{
  "mcpServers": {
    "browser": {
      "command": "npx",
      "args": [
        "@agent-infra/mcp-server-browser@latest"
      ]
    }
  }
}

Follow the Windsurf MCP documentation and use the following configuration:

{
  "mcpServers": {
    "browser": {
      "command": "npx",
      "args": [
        "@agent-infra/mcp-server-browser@latest"
      ]
    }
  }
}

For Claude Desktop, follow the MCP install guide and use the following configuration:

{
  "mcpServers": {
    "browser": {
      "command": "npx",
      "args": [
        "@agent-infra/mcp-server-browser@latest"
      ]
    }
  }
}

Remote (SSE / Streamable HTTP)

Alternatively, start the Browser MCP server with the --port $your_port argument to run it as an SSE and Streamable HTTP server.

# run as a remote MCP server
npx @agent-infra/mcp-server-browser --port 8089

# run with DISPLAY environment for VNC or other virtual display
DISPLAY=:0 npx @agent-infra/mcp-server-browser --port 8089

You can use either of the two remote MCP server endpoints:

  • Streamable HTTP (recommended): http://127.0.0.1:8089/mcp
  • SSE: http://127.0.0.1:8089/sse

Then, in the MCP client config, set the url to the SSE endpoint:

{
  "mcpServers": {
    "browser": {
      "url": "http://127.0.0.1:8089/sse"
    }
  }
}

To use Streamable HTTP instead, set the url to the /mcp endpoint. Include "type" only if your MCP client supports it:

{
  "mcpServers": {
    "browser": {
      "type": "streamable-http",
      "url": "http://127.0.0.1:8089/mcp"
    }
  }
}
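Both endpoints share the same host and port and differ only in the path. As a minimal sketch of that pattern, the helper below derives the two URLs and wraps one in a client config. The names remoteEndpoints and sseClientConfig are hypothetical and not part of @agent-infra/mcp-server-browser.

```typescript
// Hypothetical helpers for illustration; not exported by the package.
interface RemoteEndpoints {
  streamableHttp: string;
  sse: string;
}

// Derive both remote endpoint URLs from the host and the --port value.
function remoteEndpoints(host: string, port: number): RemoteEndpoints {
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new RangeError(`invalid port: ${port}`);
  }
  const base = `http://${host}:${port}`;
  return { streamableHttp: `${base}/mcp`, sse: `${base}/sse` };
}

// Build the "mcpServers" entry that points a client at the SSE endpoint.
function sseClientConfig(host: string, port: number) {
  return { mcpServers: { browser: { url: remoteEndpoints(host, port).sse } } };
}

console.log(JSON.stringify(sseClientConfig('127.0.0.1', 8089)));
// prints {"mcpServers":{"browser":{"url":"http://127.0.0.1:8089/sse"}}}
```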

In-memory call

If your MCP client is written in JavaScript / TypeScript, you can call the server in-process, so your users don't need to install the command-line interface to use Browser MCP.

import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { InMemoryTransport } from '@modelcontextprotocol/sdk/inMemory.js';

// type: module project usage
import { createServer } from '@agent-infra/mcp-server-browser';
// commonjs project usage
// const { createServer } = await import('@agent-infra/mcp-server-browser')

const client = new Client(
  {
    name: 'test browser client',
    version: '1.0',
  },
  {
    capabilities: {},
  },
);

const server = createServer();
const [clientTransport, serverTransport] = InMemoryTransport.createLinkedPair();

await Promise.all([
  client.connect(clientTransport),
  server.connect(serverTransport),
]);

// list tools
const result = await client.listTools();
console.log(result);

// call tool
const toolResult = await client.callTool({
  name: 'browser_navigate',
  arguments: {
    url: 'https://www.google.com',
  },
});
console.log(toolResult);
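Logging the whole tool result is handy for debugging, but a client usually wants just the text. The sketch below assumes the result follows the usual MCP tool-result shape (a content array of typed items plus an optional isError flag). The types ToolContentItem and ToolResultLike and the helper toolResultText are hypothetical, defined here for illustration only.

```typescript
// Hypothetical, simplified view of an MCP tool result (an assumption,
// not a type exported by the SDK or by this package).
interface ToolContentItem {
  type: string;
  text?: string;
}

interface ToolResultLike {
  content?: ToolContentItem[];
  isError?: boolean;
}

// Join all text items of a tool result; throw if the tool reported an error.
function toolResultText(result: ToolResultLike): string {
  const text = (result.content || [])
    .filter((item) => item.type === 'text' && typeof item.text === 'string')
    .map((item) => item.text as string)
    .join('\n');
  if (result.isError) {
    throw new Error(text || 'tool call failed');
  }
  return text;
}
```

With the example above, this could be called as toolResultText(toolResult as ToolResultLike); the cast may or may not be needed depending on the SDK version's typings.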

Configuration

The Browser MCP server supports the following arguments. They can be provided in the JSON configuration above, as part of the "args" list:

> npx @agent-infra/mcp-server-browser@latest -h
  -V, --version              output the version number
  --browser <browser>        browser or chrome channel to use, possible values: chrome, edge, firefox.
  --cdp-endpoint <endpoint>  CDP endpoint to connect to, for example "http://127.0.0.1:9222/json/version"
  --ws-endpoint <endpoint>   WebSocket endpoint to connect to, for example "ws://127.0.0.1:9222/devtools/browser/{id}"
  --executable-path <path>   path to the browser executable.
  --headless                 run browser in headless mode, headed by default
  --host <host>              host to bind server to. Default is localhost. Use 0.0.0.0 to bind to all interfaces.
  --port <port>              port to listen on for SSE and HTTP transport.
  --proxy-bypass <bypass>    comma-separated domains to bypass proxy, for example ".com,chromium.org,.domain.com"
  --proxy-server <proxy>     specify proxy server, for example "http://myproxy:3128" or "socks5://myproxy:8080"
  --user-agent <ua string>   specify user agent string
  --user-data-dir <path>     path to the user data directory.
  --viewport-size <size>     specify browser viewport size in pixels, for example "1280, 720"
  --vision                   Run server that uses screenshots (Aria snapshots are used by default)
  -h, --help                 display help for command
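To show how these flags fit into the "args" list, here is a minimal sketch that builds a stdio client configuration from a few options. The LaunchOptions type and the buildArgs and buildStdioConfig helpers are hypothetical. Only the flag names and the "1280, 720" example value come from the help output above.

```typescript
// Hypothetical option bag covering a subset of the documented flags.
interface LaunchOptions {
  browser?: 'chrome' | 'edge' | 'firefox';
  headless?: boolean;
  vision?: boolean;
  viewportSize?: string; // passed through as-is, e.g. "1280, 720"
}

const PACKAGE_SPEC = '@agent-infra/mcp-server-browser@latest';

// Translate options into the argument list that follows "npx".
function buildArgs(options: LaunchOptions): string[] {
  const args: string[] = [PACKAGE_SPEC];
  if (options.browser) args.push('--browser', options.browser);
  if (options.headless) args.push('--headless');
  if (options.vision) args.push('--vision');
  if (options.viewportSize) args.push('--viewport-size', options.viewportSize);
  return args;
}

// Wrap the arguments in the "mcpServers" structure used by MCP clients.
function buildStdioConfig(options: LaunchOptions) {
  return {
    mcpServers: {
      browser: { command: 'npx', args: buildArgs(options) },
    },
  };
}

console.log(
  JSON.stringify(buildStdioConfig({ headless: true, viewportSize: '1280, 720' }), null, 2),
);
```

Each flag and its value are separate array elements, so a value containing a space or comma stays a single argument.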

Runtime Configuration

The browser runtime supports configuring the viewport size, vision model coordinate factors, and User-Agent at runtime. These are passed through the corresponding HTTP headers:

Header            Description
x-viewport-size   Browser viewport size, format: width,height separated by comma
x-vision-factors  Vision model coordinate system factors, format: x_factor,y_factor separated by comma
x-user-agent      User Agent string, defaults to system User Agent if not specified

Note: Header names are case-insensitive.

Example:

x-viewport-size: 1920,1080
x-vision-factors: 1.0,1.0
x-user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36
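The header values are plain comma-separated pairs, so they are easy to build on the client side. The following sketch assembles the three headers from typed options and parses a pair back for checking. RuntimeOptions, buildRuntimeHeaders, and parsePairHeader are hypothetical names, and the parser illustrates the documented format rather than the server's own parsing code.

```typescript
// Hypothetical client-side options for the three documented headers.
interface RuntimeOptions {
  viewport?: { width: number; height: number };
  visionFactors?: { x: number; y: number };
  userAgent?: string;
}

// Build the HTTP headers; omitted options produce no header at all.
function buildRuntimeHeaders(options: RuntimeOptions): Record<string, string> {
  const headers: Record<string, string> = {};
  if (options.viewport) {
    headers['x-viewport-size'] = `${options.viewport.width},${options.viewport.height}`;
  }
  if (options.visionFactors) {
    headers['x-vision-factors'] = `${options.visionFactors.x},${options.visionFactors.y}`;
  }
  if (options.userAgent) {
    headers['x-user-agent'] = options.userAgent;
  }
  return headers;
}

// Parse an "a,b" header value into two numbers, rejecting malformed input.
function parsePairHeader(value: string): [number, number] {
  const parts = value.split(',').map((part) => part.trim());
  const invalid = parts.some((part) => part === '' || !Number.isFinite(Number(part)));
  if (parts.length !== 2 || invalid) {
    throw new Error(`expected "a,b" but got "${value}"`);
  }
  return [Number(parts[0]), Number(parts[1])];
}
```

The resulting object can be passed wherever your MCP client accepts custom HTTP headers for the remote transport.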

Docker

We have unified the deployment of VNC and MCP under a single URL endpoint. The Dockerfile and DockerHub image will be published together!

Development

Start the development server, then access http://127.0.0.1:6274/:

npm run dev