preprint
A new substrate for browser agents: files, actions, diffs, logs, and artifacts.
A Rust CLI tool that lets AI agents control a real Chrome browser through plain text files. A background daemon keeps each tab's state in a markdown file, and the agent triggers actions by appending commands to that file.
This project, called preprint, is a tool that lets AI agents control a web browser by reading and writing plain text files instead of learning complex browser automation protocols. The core idea is that a background program (a daemon) runs a real Chrome browser instance and continuously writes the current state of each open tab into a markdown file. The agent reads that file to see what is on screen, then appends a single action command at the bottom of the file. The daemon carries out that action in the live browser and rewrites the file with the new state, all within about a second.
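The read, append, re-read cycle described above can be sketched in a few lines of Python. This is a minimal illustration rather than the project's own client: the helper names `append_action` and `wait_for_rewrite`, the polling approach, and the example command text are all assumptions, since the description does not specify the command syntax or how an agent should detect the rewrite.

```python
import time
from pathlib import Path


def append_action(tab_file: Path, command: str) -> str:
    """Append one action line to the tab file.

    Returns the full content the file holds right after the append, so the
    caller can later tell when the daemon has replaced it.
    """
    before = tab_file.read_text(encoding="utf-8")
    # Make sure the command starts on its own line.
    sep = "" if before == "" or before.endswith("\n") else "\n"
    with tab_file.open("a", encoding="utf-8") as fh:
        # Assumption: one command per line at the end of the file.
        fh.write(sep + command + "\n")
    return before + sep + command + "\n"


def wait_for_rewrite(tab_file: Path, pending: str,
                     timeout: float = 5.0, poll: float = 0.05) -> str:
    """Poll until the file differs from the content we left behind."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        current = tab_file.read_text(encoding="utf-8")
        # An empty read is treated as a rewrite in progress, not as new state.
        if current and current != pending:
            return current
        time.sleep(poll)
    raise TimeoutError(f"{tab_file} was not rewritten within {timeout} s")
```

An agent would call `append_action` with a tab file path and a command string, then pass the returned text to `wait_for_rewrite` and feed the result to its next decision step. Polling on content keeps the sketch portable; a file-watching library would avoid the sleep but adds a dependency.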
Setting it up involves installing the package through npm and running a command that opens a web page inside a managed Chrome window. A folder called "preprint" appears in your working directory with one file per tab. Each tab file shows the page content as a simplified tree of clickable elements, along with the last action taken and any console output from the page. A second file tracks what changed since the previous snapshot.
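To illustrate what a "what changed" file can convey, the sketch below computes removed and added lines between two snapshots with Python's standard `difflib`. It is not the daemon's actual diff algorithm or output format, which the description does not specify; the function name `changed_lines` and the sample element lines are hypothetical.

```python
from difflib import SequenceMatcher


def changed_lines(previous: str, current: str) -> tuple[list[str], list[str]]:
    """Return (removed, added) lines between two tab snapshots."""
    old = previous.splitlines()
    new = current.splitlines()
    removed: list[str] = []
    added: list[str] = []
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # "replace" contributes to both sides; "equal" contributes to neither.
        if tag in ("replace", "delete"):
            removed.extend(old[i1:i2])
        if tag in ("replace", "insert"):
            added.extend(new[j1:j2])
    return removed, added


# Hypothetical snapshot lines; the real tree format may look different.
before = "- [e1] button Sign in\n- [e2] textbox Email\n"
after = "- [e1] button Sign in\n- [e2] textbox Email: a@b.c\n- [e3] button Submit\n"
removed, added = changed_lines(before, after)
```

Reading only the changed lines lets an agent confirm that its last action had an effect without re-parsing the whole page tree.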
The actions an agent can trigger include clicking a button (referenced by a short code from the tab file), typing text into a field, pressing a key, scrolling the page, navigating to a different URL, taking a screenshot, or recording a video clip. After each action, the agent re-reads the updated tab file before deciding the next step. Screenshots and recordings are saved to a separate artifacts folder and persist even after the tab is closed.
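One way an agent might keep its output well-formed is to model each action as a small typed value and render it to a single line before appending it. The verbs and argument counts below are hypothetical stand-ins for the seven action kinds listed above; the tool's real keywords and syntax may differ, so treat this as a sketch of the validation idea only.

```python
from dataclasses import dataclass

# Hypothetical verbs mapped to the number of arguments each one takes.
# These names are illustrative and are not taken from the tool itself.
VERBS = {
    "click": 1,       # element code
    "type": 2,        # element code, text
    "press": 1,       # key name
    "scroll": 1,      # direction
    "goto": 1,        # URL
    "screenshot": 0,
    "record": 0,
}


@dataclass(frozen=True)
class Action:
    verb: str
    args: tuple[str, ...] = ()

    def render(self) -> str:
        """Validate the action and return it as a single command line."""
        expected = VERBS.get(self.verb)
        if expected is None:
            raise ValueError(f"unknown action: {self.verb!r}")
        if len(self.args) != expected:
            raise ValueError(f"{self.verb} takes {expected} argument(s)")
        if any("\n" in arg for arg in self.args):
            # A line break would split the command into two lines.
            raise ValueError("arguments must not contain line breaks")
        return " ".join((self.verb, *self.args))
```

Rejecting malformed actions before they reach the file means a bad LLM output fails in the agent process, where it can be retried, instead of landing in the tab file.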
The tool supports multiple browser sessions with different identities. You can use your own Chrome profile, a named separate profile, or a completely clean browser with no stored login data. Sessions stay open until you explicitly close them or run a stop command that shuts everything down.
This is a command-line developer tool, written in Rust, aimed at programmers building AI agents that need to interact with live web pages. It is not a hosted service and has no graphical interface. With 17 stars on GitHub, it appears to be a new or early-stage experiment.
Where it fits
- Build an AI agent that browses the web by reading tab state from markdown files and writing click or type commands, without any browser SDK
- Automate multi-step web tasks like logging in, filling forms, and clicking buttons using a plain text loop any LLM can drive
- Record browser sessions as video clips while an AI agent performs tasks, for debugging or auditing the agent's decisions