rgr

TypeScript ★ 22 updated 22d ago

Strict Red-Green-Refactor CLI gate for coding agents

A command-line tool that enforces the Red-Green-Refactor coding cycle on AI coding agents, preventing them from skipping steps or cheating tests by locking test snapshots and verifying they have not changed.

TypeScriptBunsetup: easycomplexity 2/5

RGR is a command-line tool that enforces a specific coding discipline on AI coding agents. It follows the Red-Green-Refactor pattern, a practice where you write a failing test before writing any production code, prove the test actually fails, then write the implementation to make it pass, and only then clean up. RGR adds a layer of enforcement that prevents an AI agent from skipping or cheating these steps.

The way it works: you initialize a goal, then run a command that records your failing test. RGR takes a snapshot of that test file and locks it with a hash. When you tell RGR the implementation is ready (the Green step), it checks that the test file has not changed since you recorded the Red state. If the agent quietly edited the test to make it easier to pass, RGR refuses to proceed. This makes it harder for an agent to produce results that look correct but are not actually verified.

The tool also includes a companion plugin called intent-contract, which lets you sign a locked description of what a task is supposed to accomplish before any coding starts. Once locked, RGR can check that the final code changes stayed within the boundaries of that original description, not just that the tests pass. Together, the two tools are meant to let you accept an agent's work without having to trust the agent's honesty.

RGR has no external dependencies and runs using Bun, a JavaScript runtime. It can be installed as a plugin for Claude Code or Codex, or cloned and run directly from a local checkout. Every run writes a log of events, snapshots, and diffs to a local directory so there is a record of what happened.

The readme is thorough and includes installation steps, a full description of the workflow commands, examples of advanced replay modes, and a section on the threat model explaining what the tool can and cannot protect against.

Where it fits

Enforce test-first discipline on AI coding agents like Claude Code or Codex so they cannot quietly change tests to make them pass.
Lock a failing test snapshot before writing any code and have RGR verify the test file was not altered during implementation.
Sign off on a task description with intent-contract before coding starts, then verify the final changes stayed within that original scope.

Open on GitHub → Full breakdown on explaingit →