@presidio-dev/playwright-core v1.0.5
Playwright Core
A powerful wrapper around the Playwright browser automation library, specifically designed for Large Language Models (LLMs) to control web browsers. This package enhances the capabilities of the factifai-agent by providing an interaction system that is optimized for AI models. It offers simplified browser control, intelligent element detection, and rich visual debugging tools that make browser automation more reliable and easier to troubleshoot.
Purpose
This library serves as the browser automation engine for factifai-agent, providing:
- Enhanced Browser Control: Session-based browser management with improved stability
- Smart Element Detection: Automatic identification of interactive page elements
- Visual Debugging Tools: Visualization of detected elements with numbered overlays
- Simplified API: High-level functions that abstract away Playwright complexity
- LLM-friendly Interface: Streamlined coordinate-based approach optimized for AI models to control browsers without needing complex DOM selectors
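To make the coordinate-based approach concrete, here is a minimal sketch of how a caller might turn a detected element's bounding box into a click target. The `ElementBox` shape and the `centerOf` helper are hypothetical and not part of this package; the field names mirror those logged in the Page Element Data example, and treating `x`/`y` as the top-left corner is an assumption.

```typescript
// Hypothetical shape: field names mirror the Page Element Data example.
interface ElementBox {
  tagName: string;
  x: number;      // assumed: left edge, in pixels
  y: number;      // assumed: top edge, in pixels
  width: number;
  height: number;
}

interface Point {
  x: number;
  y: number;
}

// Return the rounded center of a bounding box, a natural click target.
function centerOf(box: ElementBox): Point {
  return {
    x: Math.round(box.x + box.width / 2),
    y: Math.round(box.y + box.height / 2),
  };
}

const searchBox: ElementBox = { tagName: 'INPUT', x: 100, y: 150, width: 200, height: 50 };
const target = centerOf(searchBox);
// target could then be passed as the coordinates argument of click(sessionId, target).
```

Working with a single point per element is what lets a model act on a numbered screenshot without composing DOM selectors.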
Quick Start
# Install the package
npm install @presidio-dev/playwright-core
// Create a basic automation script
import { BrowserService, navigate, click, type } from '@presidio-dev/playwright-core';
const run = async () => {
const sessionId = `test-${Date.now()}`;
// Open a page and navigate
await navigate(sessionId, 'https://example.com');
// Interact with the page
await click(sessionId, { x: 150, y: 200 });
await type(sessionId, 'Hello, World!');
// Capture with highlighted elements
const screenshot = await BrowserService.getInstance().takeMarkedScreenshot(sessionId);
// Clean up
await BrowserService.getInstance().closePage(sessionId);
};
run();
Core Features
| Feature | Description |
|---|---|
| Session Management | Control browser sessions with unique IDs |
| Visual Debugging | Highlight and number elements for visual inspection |
| Element Detection | Find interactive elements automatically |
| Screenshot Tools | Capture screenshots with optional element highlighting |
| Simplified API | Streamlined wrappers for common browser operations |
API Reference
BrowserService
The main service for managing browser sessions and interactions.
- getInstance(): Get the singleton instance of BrowserService
- getPage(sessionId): Get the active page for a session
- takeScreenshot(sessionId, minWaitMs?): Capture a screenshot
- captureScreenshotAndInfer(sessionId): Capture screenshot with page element data
- getAllPageElements(sessionId): Get all clickable and input elements
- takeMarkedScreenshot(sessionId, options?): Take screenshot with marked elements
- closePage(sessionId): Close a session
- closeAll(): Close all sessions
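The combination of a singleton service and per-session lookup can be illustrated with a small stand-alone model. This is a sketch of the pattern only, not the package's implementation: `SessionRegistry` and `FakePage` are hypothetical names, and creating a page on first access is an assumption about behavior.

```typescript
// Stand-in for a browser page; the real package manages Playwright pages.
interface FakePage {
  url: string;
  closed: boolean;
}

class SessionRegistry {
  private static instance: SessionRegistry | undefined;
  private readonly pages = new Map<string, FakePage>();

  // A private constructor forces callers through getInstance().
  private constructor() {}

  static getInstance(): SessionRegistry {
    if (!SessionRegistry.instance) {
      SessionRegistry.instance = new SessionRegistry();
    }
    return SessionRegistry.instance;
  }

  // Return the page for a session, creating it on first use (assumed behavior).
  getPage(sessionId: string): FakePage {
    let page = this.pages.get(sessionId);
    if (!page) {
      page = { url: 'about:blank', closed: false };
      this.pages.set(sessionId, page);
    }
    return page;
  }

  // Close one session and forget it; returns false if the ID is unknown.
  closePage(sessionId: string): boolean {
    const page = this.pages.get(sessionId);
    if (!page) return false;
    page.closed = true;
    return this.pages.delete(sessionId);
  }

  // Close every open session.
  closeAll(): void {
    this.pages.forEach((page) => {
      page.closed = true;
    });
    this.pages.clear();
  }

  get size(): number {
    return this.pages.size;
  }
}

const registry = SessionRegistry.getInstance();
const first = registry.getPage('session-a');
registry.getPage('session-b');
```

Keying everything by session ID means independent runs never share page state, as long as each run uses a distinct ID.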
Navigation Functions
- navigate(sessionId, url, options?): Navigate to a URL
- getCurrentUrl(sessionId): Get the current page URL
- reload(sessionId): Reload the current page
- goBack(sessionId): Navigate back in history
- goForward(sessionId): Navigate forward in history
- wait(sessionId, ms): Wait for a specified time
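How `navigate`, `goBack`, and `goForward` relate is easiest to see in a conceptual model of a session's history. The `SessionHistory` class below is hypothetical and only mimics standard browser history semantics; it does not call the package or Playwright.

```typescript
// Conceptual model of one session's history; not the package's implementation.
class SessionHistory {
  private entries: string[] = [];
  private index = -1;

  // Visiting a new URL discards any forward entries, as browsers do.
  navigate(url: string): void {
    this.entries = this.entries.slice(0, this.index + 1);
    this.entries.push(url);
    this.index = this.entries.length - 1;
  }

  // Step back if possible and return the URL that is now current.
  goBack(): string | undefined {
    if (this.index > 0) this.index -= 1;
    return this.current();
  }

  // Step forward if possible and return the URL that is now current.
  goForward(): string | undefined {
    if (this.index < this.entries.length - 1) this.index += 1;
    return this.current();
  }

  current(): string | undefined {
    return this.index >= 0 ? this.entries[this.index] : undefined;
  }
}

const sessionHistory = new SessionHistory();
sessionHistory.navigate('https://example.com/a');
sessionHistory.navigate('https://example.com/b');
sessionHistory.goBack(); // the current URL is now https://example.com/a
```

An agent that tracks history this way can predict where a back or forward step will land before issuing it.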
Interaction Functions
- click(sessionId, coordinates, options?): Click at specific coordinates
- type(sessionId, text, options?): Type text
- clear(sessionId, coordinates?): Clear input field
- scrollToNextChunk(sessionId): Scroll down one viewport
- scrollToPrevChunk(sessionId): Scroll up one viewport
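Scrolling "one viewport" at a time amounts to simple offset arithmetic with clamping at the page edges. The sketch below shows that arithmetic under assumed names (`ScrollState`, `nextChunkOffset`, `prevChunkOffset`); how the package computes its scroll positions internally is not documented here.

```typescript
// Hypothetical scroll state; all values are in pixels.
interface ScrollState {
  scrollY: number;        // current vertical offset
  viewportHeight: number; // visible height
  pageHeight: number;     // total scrollable height of the document
}

// Offset after scrolling down one viewport, clamped to the bottom of the page.
function nextChunkOffset(s: ScrollState): number {
  const maxOffset = Math.max(0, s.pageHeight - s.viewportHeight);
  return Math.min(s.scrollY + s.viewportHeight, maxOffset);
}

// Offset after scrolling up one viewport, clamped to the top of the page.
function prevChunkOffset(s: ScrollState): number {
  return Math.max(s.scrollY - s.viewportHeight, 0);
}

const start: ScrollState = { scrollY: 0, viewportHeight: 800, pageHeight: 2000 };
const afterFirst = nextChunkOffset(start);
const afterSecond = nextChunkOffset({ ...start, scrollY: afterFirst });
```

Because the last step is clamped, the final chunk can overlap the previous one, which matters if element numbering is compared across screenshots.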
Element Marking Functions
- markVisibleElements(sessionId, options?): Mark elements with numbered boxes
- removeElementMarkers(sessionId): Remove element markers
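Numbered overlays need a stable rule for assigning numbers to boxes. One plausible rule, shown here purely as an illustration, is reading order: top to bottom, then left to right. The `Box`, `NumberedBox`, and `numberBoxes` names are hypothetical, and the ordering the package actually uses is not stated in this README.

```typescript
interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

interface NumberedBox extends Box {
  label: string;
}

// Number boxes in reading order (assumed rule): top to bottom, then left to right.
function numberBoxes(boxes: Box[]): NumberedBox[] {
  return boxes
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => a.y - b.y || a.x - b.x)
    .map((box, i) => ({ ...box, label: String(i + 1) }));
}

const numbered = numberBoxes([
  { x: 300, y: 10, width: 80, height: 30 },
  { x: 20, y: 10, width: 80, height: 30 },
  { x: 0, y: 200, width: 400, height: 50 },
]);
```

A deterministic order keeps the same element on the same number across repeated screenshots of an unchanged page, which makes model instructions such as "click element 2" reproducible.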
Advanced Usage
Custom Element Marking
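// Mark interactive elements with custom colors
// Assumption: browser is the BrowserService singleton, which the snippet never defines
const browser = BrowserService.getInstance();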
await browser.markVisibleElements(sessionId, {
boxColor: 'blue',
textColor: 'white',
borderWidth: 2,
elements: [
{ x: 100, y: 150, width: 200, height: 50, label: 'Search Box' }
]
});
Page Element Data
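// Get detailed info about page elements
// Assumption: browser is the BrowserService singleton, which the snippet never defines
const browser = BrowserService.getInstance();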
const elements = await browser.getAllPageElements(sessionId);
console.log(`Found ${elements.length} interactive elements:`);
elements.forEach(el => {
console.log(`- ${el.tagName} at (${el.x}, ${el.y}), size: ${el.width}x${el.height}`);
});
Requirements
- Node.js 18+
- Playwright (peer dependency)
- Browser binaries (Chromium, Firefox, and/or WebKit)
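The Node.js requirement can be checked up front before any browser is launched. The helper below is a hypothetical sketch, not something the package exports; it only parses a version string and compares the major version.

```typescript
// Return true when a version string such as "18.17.0" or "v20.11.1"
// satisfies a minimum major version (18 by default, per the requirements above).
function meetsNodeRequirement(version: string, minMajor: number = 18): boolean {
  const major = parseInt(version.replace(/^v/, '').split('.')[0], 10);
  return !isNaN(major) && major >= minMajor;
}

const supported = meetsNodeRequirement('v20.11.1');
const tooOld = meetsNodeRequirement('16.20.2');
```

In a real script the argument would typically be `process.versions.node`, which Node.js exposes without the leading `v`.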
Installation
# Install the package
npm install @presidio-dev/playwright-core
# yarn
yarn add @presidio-dev/playwright-core
# pnpm
pnpm add @presidio-dev/playwright-core
# IMPORTANT: Install Playwright globally first
npm install -g playwright
# Then install browser dependencies (required)
npx playwright install --with-deps
Follow the installation steps in order:
1. First, install Playwright globally to ensure the CLI tools are properly recognized
2. Then run npx playwright install --with-deps which installs:
- Browser binaries (Chromium, Firefox, WebKit)
- Required system dependencies for proper browser operation
- Font packages and media codecs needed for complete rendering
License
MIT © PRESIDIO®