preprint

Rust ★ 19 updated 24d ago

A new substrate for browser agents: files, actions, diffs, logs, and artifacts.

A Rust CLI tool that lets AI agents control a real Chrome browser through plain text files. A background daemon keeps each tab's state in a markdown file, and the agent triggers actions by appending commands to that file.

Rust, Node.js, Chrome; setup: moderate; complexity: 3/5

This project, called preprint, is a tool that lets AI agents control a web browser by reading and writing plain text files instead of learning complex browser automation protocols. The core idea is that a background program (a daemon) runs a real Chrome browser instance and continuously writes the current state of each open tab into a markdown file. The agent reads that file to see what is on screen, then appends a single action command at the bottom of the file. The daemon carries out that action in the live browser and rewrites the file with the new state, all within about a second.
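The read, append, re-read cycle described above can be sketched from the agent's side in a few lines of Python. This is a minimal illustration, not preprint's own client: the function names, the polling approach, and the `click e1` command text are all assumptions made for the example, and the real command syntax should be taken from the project's documentation.

```python
import time
from pathlib import Path


def read_state(tab_file: Path) -> str:
    """Return the current markdown snapshot of a tab."""
    return tab_file.read_text(encoding="utf-8")


def append_action(tab_file: Path, command: str) -> str:
    """Append one action line and return the text now on disk.

    The command string is passed through untouched; its syntax is
    whatever the daemon expects (hypothetical in this sketch).
    """
    with tab_file.open("a", encoding="utf-8") as handle:
        handle.write("\n" + command + "\n")
    return read_state(tab_file)


def wait_for_rewrite(tab_file: Path, pending: str,
                     timeout: float = 5.0, poll: float = 0.1) -> str:
    """Poll until the file no longer matches `pending`.

    A change is taken to mean the daemon has rewritten the file with
    the new page state. A partially written file could be observed
    here, so a real agent may want to re-read once more before acting.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        current = read_state(tab_file)
        if current != pending:
            return current
        time.sleep(poll)
    raise TimeoutError(f"{tab_file} was not rewritten within {timeout}s")
```

An agent step would then be `pending = append_action(tab, "click e1")` followed by `state = wait_for_rewrite(tab, pending)`. Polling on content rather than on modification time avoids depending on filesystem timestamp resolution.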

Setting it up involves installing the package through npm and running a command that opens a web page inside a managed Chrome window. A folder called "preprint" appears in your working directory with one file per tab. Each tab file shows the page content as a simplified tree of clickable elements, along with the last action taken and any console output from the page. A second file tracks what changed since the previous snapshot.
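Because the state lives in ordinary files, an agent can discover open tabs with a directory listing. The sketch below assumes the tab files carry a `.md` extension and sit directly inside the "preprint" folder; the source only says they are markdown files, so the extension, the helper name, and the flat layout are assumptions. If the change-tracking file is also markdown, it would show up in this listing too.

```python
from pathlib import Path


def list_tab_files(workdir: Path, folder: str = "preprint") -> list[Path]:
    """Return the markdown files in the state folder, sorted by name.

    Returns an empty list when the folder does not exist yet, for
    example before the daemon has been started.
    """
    state_dir = workdir / folder
    if not state_dir.is_dir():
        return []
    return sorted(
        path for path in state_dir.iterdir()
        if path.is_file() and path.suffix == ".md"
    )
```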

The actions an agent can trigger include clicking a button (referenced by a short code from the page file), typing text into a field, pressing a key, scrolling the page, navigating to a different URL, taking a screenshot, or recording a video clip. After each action the agent re-reads the updated tab file before deciding the next step. Screenshots and recordings are saved to a separate artifacts folder and persist even after the tab is closed.
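An agent that generates commands from model output may want to validate them before appending anything to a tab file. The sketch below covers the action kinds listed above; the keyword spellings, argument counts, and space-separated line format are hypothetical placeholders, not preprint's documented syntax.

```python
# Hypothetical action keywords mapped to the number of arguments each takes.
# The real tool may use different names and argument shapes.
ACTIONS = {
    "click": 1,       # element code from the tab file
    "type": 2,        # element code, text to enter
    "press": 1,       # key name
    "scroll": 1,      # direction or amount
    "navigate": 1,    # target URL
    "screenshot": 0,
    "record": 0,
}


def format_action(kind: str, *args: str) -> str:
    """Build a single-line command, rejecting unknown or malformed actions."""
    if kind not in ACTIONS:
        raise ValueError(f"unknown action: {kind!r}")
    if len(args) != ACTIONS[kind]:
        raise ValueError(f"{kind} expects {ACTIONS[kind]} argument(s), got {len(args)}")
    if any("\n" in arg for arg in args):
        # One command per line, so embedded newlines are refused.
        raise ValueError("arguments must not contain newlines")
    return " ".join((kind, *args))
```

Rejecting embedded newlines keeps each append to exactly one command, which matches the one-action-then-re-read rhythm described above.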

The tool supports multiple browser sessions with different identities. You can use your own Chrome profile, a named separate profile, or a completely clean browser with no stored login data. Sessions stay open until you explicitly close them or run a stop command that shuts everything down.

This is a command-line developer tool, written in Rust, aimed at programmers building AI agents that need to interact with live web pages. It is not a hosted service and has no graphical interface. With fewer than 20 stars on GitHub, it appears to be a new or early-stage experiment.

Where it fits