
semble_rs

Rust ★ 132 updated 18d ago

Fast, AI-agent-native code search in Rust — hybrid BM25 + semantic, Tree-sitter AST chunking, dependency & impact analysis. Drop-in replacement for grep/cat/read/ls in Claude Code, Codex, Cursor, Aider, OpenHands.

semble_rs is a command-line code search tool written in Rust, designed specifically to help AI coding agents like Claude Code, Codex, and Cursor work more efficiently inside large codebases. The core problem it solves is that when an AI agent explores a codebase using standard tools like grep, cat, or ls, it sends enormous amounts of text back to the language model, consuming a huge number of tokens (the units that determine AI processing cost and context window limits) without necessarily finding the most relevant code.

The tool addresses this with three main capabilities. The search command uses a hybrid approach that combines BM25 (a keyword relevance algorithm) with static semantic embeddings from Model2Vec to find the code chunks most relevant to a plain-English question, returning only the useful portions rather than entire files. The tree command replaces the standard directory listing and collapses irrelevant folders such as build outputs and node_modules, shrinking a listing of roughly 400,000 characters to roughly 500 (a reported reduction of up to 747 times). The digest command compresses build and test output from tools like cargo, pytest, and GitHub Actions by collapsing progress lines while always preserving error messages and stack traces, achieving up to 99 percent size reduction on real CI logs.
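To make the hybrid search idea concrete, here is a minimal sketch of blending a BM25 keyword score with an embedding similarity score. It is not semble_rs's actual implementation. The function names (`bm25_scores`, `hybrid_rank`), the weighted-sum fusion, the `alpha` weight, and the hand-written two-dimensional vectors standing in for Model2Vec embeddings are all assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Lowercase and split on anything that is not a letter or digit.
fn tokenize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}

/// Okapi BM25 score of every chunk for `query` (k1 = 1.2, b = 0.75).
fn bm25_scores(query: &str, chunks: &[&str]) -> Vec<f64> {
    let (k1, b) = (1.2_f64, 0.75_f64);
    let docs: Vec<Vec<String>> = chunks.iter().map(|c| tokenize(c)).collect();
    let n = docs.len() as f64;
    let avg_len = docs.iter().map(|d| d.len() as f64).sum::<f64>() / n.max(1.0);
    let terms = tokenize(query);
    docs.iter()
        .map(|doc| {
            // Term frequencies for this chunk.
            let mut tf: HashMap<&str, f64> = HashMap::new();
            for t in doc {
                *tf.entry(t.as_str()).or_insert(0.0) += 1.0;
            }
            terms
                .iter()
                .map(|term| {
                    let f = tf.get(term.as_str()).copied().unwrap_or(0.0);
                    if f == 0.0 {
                        return 0.0;
                    }
                    // Number of chunks containing the term.
                    let df = docs.iter().filter(|d| d.contains(term)).count() as f64;
                    let idf = ((n - df + 0.5) / (df + 0.5) + 1.0).ln();
                    let len_norm = 1.0 - b + b * doc.len() as f64 / avg_len;
                    idf * f * (k1 + 1.0) / (f + k1 * len_norm)
                })
                .sum::<f64>()
        })
        .collect()
}

fn norm(v: &[f64]) -> f64 {
    v.iter().map(|x| x * x).sum::<f64>().sqrt()
}

/// Cosine similarity; returns 0.0 if either vector is all zeros.
fn cosine(a: &[f64], b: &[f64]) -> f64 {
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let (na, nb) = (norm(a), norm(b));
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Rank chunk indices, best first, by
/// alpha * (BM25 / max BM25) + (1 - alpha) * cosine.
/// The weighted sum is an assumed fusion rule, chosen for simplicity.
fn hybrid_rank(
    query: &str,
    query_vec: &[f64],
    chunks: &[&str],
    chunk_vecs: &[Vec<f64>],
    alpha: f64,
) -> Vec<usize> {
    let bm25 = bm25_scores(query, chunks);
    let max = bm25.iter().cloned().fold(0.0_f64, f64::max);
    let scores: Vec<f64> = bm25
        .iter()
        .zip(chunk_vecs)
        .map(|(s, v)| {
            let keyword = if max > 0.0 { s / max } else { 0.0 };
            alpha * keyword + (1.0 - alpha) * cosine(query_vec, v)
        })
        .collect();
    let mut order: Vec<usize> = (0..scores.len()).collect();
    order.sort_by(|&i, &j| {
        scores[j]
            .partial_cmp(&scores[i])
            .unwrap_or(std::cmp::Ordering::Equal)
    });
    order
}
```

With a query such as "parse config", a chunk that shares no keywords with the query can still outrank an unrelated chunk if its vector lies close to the query vector. That is the practical benefit of adding a semantic signal on top of BM25.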

It also includes a dependency graph feature that shows what a file imports and which other code would be affected if that file changed. The tool runs as a single binary with no external services, no API keys, and no GPU required. It is aimed at AI coding agents, and the developers who run them, that need to cut token usage when working in large repositories.
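The impact side of a dependency graph can be sketched as a reverse traversal over import edges. This is an illustrative outline only. The `impacted_by` function, the in-memory map of imports, and the file names in the usage note are hypothetical and do not reflect how semble_rs builds or stores its graph.

```rust
use std::collections::{BTreeSet, HashMap, VecDeque};

/// `imports` maps each file to the files it imports (hypothetical input).
/// Returns every file that directly or transitively imports `changed`,
/// i.e. the files that might be affected if `changed` is modified.
fn impacted_by(imports: &HashMap<&str, Vec<&str>>, changed: &str) -> BTreeSet<String> {
    // Invert the edges: imported file -> files that import it.
    let mut importers: HashMap<&str, Vec<&str>> = HashMap::new();
    for (file, deps) in imports {
        for dep in deps {
            importers.entry(*dep).or_default().push(*file);
        }
    }

    // Breadth-first walk up the importer chain; `seen` also guards against cycles.
    let mut seen = BTreeSet::new();
    let mut queue = VecDeque::from([changed]);
    while let Some(current) = queue.pop_front() {
        for &parent in importers.get(current).into_iter().flatten() {
            if parent != changed && seen.insert(parent.to_string()) {
                queue.push_back(parent);
            }
        }
    }
    seen
}
```

For example, if main.rs imports cli.rs and search.rs, cli.rs imports search.rs, and search.rs imports index.rs, then a change to index.rs impacts cli.rs, main.rs, and search.rs, while a change to main.rs impacts nothing else.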