agent-browser

Automate web navigation, form filling, screenshots, and data extraction from the command line

by vercel-labs/agent-browser

## Overview

agent-browser automates browser interactions from the command line for web testing, form filling, screenshots, video recording, and data extraction. It provides a compact CLI that navigates pages, inspects interactive elements, and acts on them by element reference or semantic locator.

## How it works

Open a page, run `snapshot` to get element refs (e.g. `@e1`), then interact via `click`, `fill`, `type`, and `wait`. A snapshot returns either the accessibility tree or a filtered list of interactive elements. Refs belong to the snapshot that produced them, so re-snapshot after navigation or DOM changes to refresh them. The CLI also supports sessions, video recording, network interception, and device emulation.
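The open, snapshot, interact, re-snapshot loop can be sketched in TypeScript as follows. This is an illustration under assumptions, not the project's own API: `Runner`, `extractRefs`, `fillAndSubmit`, and `dryRun` are hypothetical names, the `[ref=e1]` snapshot line format and the `fill <ref> <text>` argument order are assumed, and the page is imagined to expose a text field followed by a submit button.

```typescript
// A runner executes one agent-browser command and returns its stdout.
// Passing it in keeps the flow testable without launching a browser.
type Runner = (args: string[]) => string;

// Collect refs such as "@e1" from snapshot text.
// ASSUMPTION: refs show up as "[ref=e1]" or "@e1"; adjust to the real output.
function extractRefs(snapshot: string): string[] {
  const pattern = /(?:\[ref=|@)(e\d+)/g;
  const refs: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(snapshot)) !== null) {
    const ref = "@" + match[1];
    if (refs.indexOf(ref) === -1) refs.push(ref);
  }
  return refs;
}

// Open a page, snapshot, fill the first ref, click the second, re-snapshot.
function fillAndSubmit(run: Runner, url: string, value: string): string[] {
  run(["open", url]);
  const refs = extractRefs(run(["snapshot", "-i"]));
  if (refs.length < 2) {
    throw new Error("expected at least two interactive refs");
  }
  // ASSUMPTION: first ref is a text field, second is the submit button.
  run(["fill", refs[0], value]);
  run(["click", refs[1]]);
  // The click may change the DOM, so the old refs are treated as stale.
  return extractRefs(run(["snapshot", "-i"]));
}

// Dry run: record the commands instead of executing them.
const log: string[] = [];
const dryRun: Runner = (args) => {
  log.push("agent-browser " + args.join(" "));
  // Canned snapshot text in the assumed format.
  return args[0] === "snapshot"
    ? '- textbox "Email" [ref=e1]\n- button "Submit" [ref=e2]'
    : "";
};

fillAndSubmit(dryRun, "https://example.com/form", "user@example.com");
console.log(log.join("\n"));
```

To drive a real browser, `dryRun` could be swapped for a runner that spawns the `agent-browser` binary and returns its stdout; the flow function itself would not need to change.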

## When to use

- End-to-end or regression testing of web applications.
- Automated form submission and validation across environments.
- Page scraping and structured data extraction.
- Capturing screenshots, PDFs, or recording interaction videos for demos.

## Best practices

- Always run `snapshot -i` to get stable interactive refs before performing actions.
- Re-snapshot after navigation or significant DOM updates.
- Use `--session` to isolate parallel browser sessions.
- Combine `wait` commands to make flows deterministic.
- Use `--json` for machine-readable output.
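As a rough illustration of how these practices combine, the TypeScript sketch below builds the argument lists for two isolated sessions that each open a page, wait, and take a JSON snapshot. `GlobalOptions`, `buildArgs`, and `planCheck` are hypothetical helpers. Placing `--session` before the subcommand and `--json` after it is an assumption, as is passing a selector to `wait`, so check the CLI help before relying on either.

```typescript
interface GlobalOptions {
  session?: string; // value for --session, isolates one browser session
  json?: boolean;   // append --json for machine-readable output
}

// Build the argument list for a single CLI call.
// ASSUMPTION: --session precedes the subcommand and --json follows it.
function buildArgs(command: string[], opts: GlobalOptions = {}): string[] {
  const args: string[] = [];
  if (opts.session) args.push("--session", opts.session);
  for (const part of command) args.push(part);
  if (opts.json) args.push("--json");
  return args;
}

// Commands for one isolated check: open, wait, then snapshot as JSON.
function planCheck(session: string, url: string, waitFor: string): string[][] {
  return [
    buildArgs(["open", url], { session }),
    // ASSUMPTION: wait accepts a selector or ref as its argument.
    buildArgs(["wait", waitFor], { session }),
    buildArgs(["snapshot", "-i"], { session, json: true }),
  ];
}

// Two plans with distinct session names can run side by side.
const plans = [
  planCheck("staging", "https://staging.example.com", "#app"),
  planCheck("prod", "https://www.example.com", "#app"),
];

for (const plan of plans) {
  for (const args of plan) {
    console.log("agent-browser " + args.join(" "));
  }
}
```

Keeping the session name in one options object means a command cannot accidentally run in the wrong session, and the explicit wait step before the snapshot avoids reading refs from a page that is still loading.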

## Platforms

Claude Code, Cursor, Codex CLI, Gemini CLI

## Triggers

open browser, fill form, take screenshot, test web app

## Topics

browser, automation, testing, scraping


License: MIT

Category: Agent Skills

Stack: TypeScript

