gitmyhub

mistral.rs

Rust ★ 7.3k updated 9h ago

Fast, flexible LLM inference

A fast Rust-based tool for running AI language and vision models locally from Hugging Face, with a built-in chat UI, Python API, automatic quantization, and support for agentic tool-calling workflows.

RustPythonHugging FaceMCPsetup: easycomplexity 3/5

Mistral.rs is a tool for running AI language models on your own computer. It is built for speed and designed to work with models published on Hugging Face, the main public repository where AI researchers and companies share their models. You point the program at a model name and it handles the rest, detecting the model's format and starting it without requiring any configuration files.

The tool supports far more than text conversations. The same engine handles text generation, image understanding, video input, audio, speech-to-text, image generation, and text embeddings. You can chat with a model through a built-in web interface by running a single command, or call the program from your own code using either a Python package or a Rust library.

Because large AI models can require significant memory, mistral.rs includes detailed quantization support. Quantization is a technique that reduces model file size and memory usage at some cost to precision. The tool supports many quantization formats and can automatically benchmark your hardware and select the best settings for your specific machine. You can also control quantization settings on a per-layer basis if you need fine-grained control.

The project includes what it calls agentic features, meaning the model can do more than generate text. It can call external tools, search the web, connect to external services via a standard protocol called MCP, and loop through multiple tool calls automatically before returning a final answer. These capabilities let you build AI assistants that interact with real systems rather than just producing text.

The supported model list is extensive, covering dozens of well-known text, vision, and speech models. Installation is a one-line command on Linux, macOS, or Windows. The project is actively maintained, with recent updates adding new model families and quantization methods.

Where it fits