xsv
A fast CSV command line toolkit written in Rust.
xsv is a fast Rust command-line tool for slicing, filtering, joining, sorting, and computing statistics on CSV files. Note that it is unmaintained; qsv and xan are the actively developed alternatives.
xsv is a command-line tool written in Rust for working with CSV files. It provides a collection of subcommands that cover the most common operations on tabular data: counting rows, selecting columns, filtering by regex, joining multiple files, sorting, slicing a range of rows, computing statistics like mean and standard deviation, and splitting one large file into many smaller ones. The design goal is for each command to be fast, composable with Unix pipes, and explicit about its performance trade-offs.
One notable feature is indexing. Running xsv index on a CSV file creates a small companion index file that enables certain later commands to skip directly to the relevant rows instead of parsing everything from the start. This makes slice operations on large files nearly instant and speeds up statistics gathering significantly. The README demonstrates this on a 3.1-million-row world cities dataset where indexed operations finish in seconds.
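The idea behind such an index can be illustrated with a minimal Python sketch. This is not xsv's implementation or its on-disk index format; the function names `build_index` and `slice_rows` are hypothetical, and the sketch assumes that no quoted field contains a newline, so each line is exactly one record. It records the byte offset of every data row once, then answers a slice request by seeking straight to the first requested row.

```python
import csv
import io


def build_index(path):
    """Return the byte offset at which each data row starts (header excluded).

    Assumption: no quoted field contains a newline, so one line is one record.
    """
    offsets = []
    with open(path, "rb") as f:
        f.readline()  # skip the header row
        while True:
            pos = f.tell()
            line = f.readline()
            if not line:
                break
            offsets.append(pos)
    return offsets


def slice_rows(path, offsets, start, end):
    """Return data rows [start, end) by seeking directly to the first one."""
    end = min(end, len(offsets))
    if start >= end:
        return []
    with open(path, "rb") as f:
        f.seek(offsets[start])
        if end < len(offsets):
            # Read only the bytes between the first and last requested rows.
            data = f.read(offsets[end] - offsets[start])
        else:
            # The slice runs to the end of the file.
            data = f.read()
    return list(csv.reader(io.StringIO(data.decode("utf-8"), newline="")))
```

Building the offset list costs one full pass over the file, but every later slice reads only the bytes it needs, which is why the work pays off on large files that are queried repeatedly.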
The tool also formats CSV output into aligned columns in a terminal via the table command, handles files with unusual quoting or delimiter rules, and can do inner, outer, and cross joins between files using a hash-based approach that keeps things fast without requiring pre-sorted input.
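To show why a hash-based join needs no pre-sorted input, here is a minimal Python sketch of an inner join. It illustrates the general technique only, not xsv's actual code; the function name `hash_inner_join` and the sample rows are hypothetical. One input is bucketed by its join key in a dictionary, and the other input is then streamed past it in whatever order it arrives.

```python
from collections import defaultdict


def hash_inner_join(left_rows, left_key, right_rows, right_key):
    """Inner-join two lists of dict rows without sorting either side."""
    # Build phase: bucket the right-hand rows by their join key.
    buckets = defaultdict(list)
    for row in right_rows:
        buckets[row[right_key]].append(row)

    # Probe phase: stream the left-hand rows and look up each key.
    joined = []
    for row in left_rows:
        for match in buckets.get(row[left_key], []):
            # On a column-name clash, the right-hand value wins in this sketch.
            joined.append({**row, **match})
    return joined


cities = [
    {"city": "Oslo", "country": "no"},
    {"city": "Lima", "country": "pe"},
    {"city": "Bergen", "country": "no"},
]
countries = [
    {"abbrev": "pe", "name": "Peru"},
    {"abbrev": "no", "name": "Norway"},
]
result = hash_inner_join(cities, "country", countries, "abbrev")
```

Each lookup is a constant-time dictionary access on average, so the join runs in time roughly proportional to the combined size of the inputs plus the output, at the cost of holding one side in memory.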
The project is now unmaintained. The author recommends looking at qsv or xan as actively developed alternatives that cover similar ground. The repository remains available for reference and historical use, and the code compiles and runs fine for users who want to install it from source or via the crates.io package. It is dual-licensed under MIT and the Unlicense.
Where it fits
- Count rows, select specific columns, and filter a large CSV file from the terminal without opening a spreadsheet.
- Join two CSV files on a shared column using a fast hash-based approach that does not require pre-sorted input.
- Compute statistics like mean, median, and standard deviation on a CSV column in seconds using an optional index file.