gitmyhub

trino

Java ★ 13k updated 12h ago

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)

Trino is an open-source SQL engine that lets you query massive amounts of data across multiple databases, data lakes, and cloud storage at once, without moving the data first.

Java | Maven | Docker | setup: hard | complexity 5/5

Trino is a query engine that lets you run SQL queries across large amounts of data stored in many different places at once. Companies typically collect data in data warehouses, cloud storage buckets, relational databases, and other systems. Rather than moving all that data into one place first, Trino can connect to those sources simultaneously and run a single query that pulls results from all of them, returning answers quickly even when the underlying data is enormous.
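To make the idea of one query spanning several sources concrete, here is a minimal Java sketch. It assumes the Trino JDBC driver is on the classpath and a cluster is reachable at localhost:8080; the catalog names `lake` and `crm`, their schemas and tables, and the user name `demo` are all hypothetical. Tables are addressed as catalog.schema.table, which is how a single statement can join data living in different systems.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

final class FederatedQuerySketch {
    // Builds a Trino JDBC URL of the form jdbc:trino://host:port.
    static String jdbcUrl(String host, int port) {
        return "jdbc:trino://" + host + ":" + port;
    }

    // One SQL statement joining two hypothetical catalogs:
    // "lake" (a data lake) and "crm" (a relational database).
    static String federatedSql() {
        return String.join("\n",
            "SELECT c.region, sum(o.total) AS revenue",
            "FROM lake.sales.orders AS o",
            "JOIN crm.public.customers AS c ON o.customer_id = c.id",
            "GROUP BY c.region");
    }

    // Runs the query through plain JDBC; needs a live cluster and the driver.
    static void run(String url, String user) throws SQLException {
        try (Connection conn = DriverManager.getConnection(url, user, null);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(federatedSql())) {
            while (rs.next()) {
                System.out.println(rs.getString("region") + "\t" + rs.getObject("revenue"));
            }
        }
    }

    public static void main(String[] args) throws SQLException {
        String url = jdbcUrl("localhost", 8080);
        System.out.println(url);
        System.out.println(federatedSql());
        // Only contact the cluster when explicitly asked to.
        if (args.length > 0 && args[0].equals("--execute")) {
            run(url, "demo");
        }
    }
}
```

Without the `--execute` argument the program only prints the URL and the SQL text, so it can be tried without a running cluster.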

The name reflects the project's history: it was originally created at Facebook as Presto, later continued as PrestoSQL, and was then renamed Trino by its community. Today it is maintained by an independent open-source community and is widely used by data analytics teams at companies that need to query petabytes of data across distributed infrastructure.

Trino is written in Java and runs as a cluster of machines working together. One node, the coordinator, plans each query, while worker nodes execute pieces of it in parallel. Users submit queries to the cluster in standard SQL, so anyone who knows how to write a database query can use it. It also ships with a command-line client for running queries interactively.
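The coordinator and worker split can be illustrated with a toy, single-process sketch. This is not Trino's code: threads stand in for worker nodes, a plain list stands in for table data, and the class and method names are invented for illustration. The "coordinator" cuts the rows into slices, hands each slice to a worker, and merges the partial sums, which mirrors the general shape of a parallel aggregate.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class CoordinatorWorkerSketch {
    // Splits rows into up to `workers` slices, sums each slice on its own
    // thread, then merges the partial results. `workers` must be positive.
    static long parallelSum(List<Integer> rows, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            int sliceSize = (rows.size() + workers - 1) / workers; // ceiling division
            List<Future<Long>> partials = new ArrayList<>();
            for (int start = 0; start < rows.size(); start += sliceSize) {
                List<Integer> slice = rows.subList(start, Math.min(start + sliceSize, rows.size()));
                // "Worker": computes a partial sum over its slice only.
                partials.add(pool.submit(() -> {
                    long sum = 0;
                    for (int value : slice) {
                        sum += value;
                    }
                    return sum;
                }));
            }
            // "Coordinator": merges the partial sums into the final answer.
            long total = 0;
            for (Future<Long> partial : partials) {
                total += partial.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("worker failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 1; i <= 1000; i++) {
            rows.add(i);
        }
        System.out.println(parallelSum(rows, 4)); // prints 500500
    }
}
```

A real cluster also has to move data between machines and recover from failures, which this sketch deliberately leaves out.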

The project supports connections to many data sources through a plugin system. Common sources include data lakes in formats like Delta Lake, traditional relational databases, and object storage services. Each connection type is handled by a connector, and the codebase includes built-in connectors for several popular systems.
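As a rough illustration of the connector idea, the sketch below maps catalog names to connector objects and routes a fully qualified table name to the right one. It is a simplified stand-in, not Trino's plugin interface: `ToyConnector`, `ConnectorRegistrySketch`, and the catalog names are hypothetical and exist only for this example.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ConnectorRegistrySketch {
    // Hypothetical, heavily simplified connector: it can only list tables.
    interface ToyConnector {
        List<String> listTables(String schema);
    }

    private final Map<String, ToyConnector> catalogs = new HashMap<>();

    // Associates a catalog name with the connector that serves it.
    void register(String catalogName, ToyConnector connector) {
        catalogs.put(catalogName, connector);
    }

    // Picks the connector for a name written as catalog.schema.table.
    ToyConnector resolve(String qualifiedTable) {
        String[] parts = qualifiedTable.split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected catalog.schema.table: " + qualifiedTable);
        }
        ToyConnector connector = catalogs.get(parts[0]);
        if (connector == null) {
            throw new IllegalArgumentException("unknown catalog: " + parts[0]);
        }
        return connector;
    }

    public static void main(String[] args) {
        ConnectorRegistrySketch registry = new ConnectorRegistrySketch();
        registry.register("lake", schema -> List.of("orders", "line_items"));
        registry.register("crm", schema -> List.of("customers"));
        System.out.println(registry.resolve("crm.public.customers").listTables("public"));
    }
}
```

The point of the design is that the query layer only deals with catalog names; what sits behind each name is the connector's concern.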

Building Trino from source requires Java 25 and Docker, and the build is managed through Maven, which is a standard Java build tool. The repository's README is primarily a guide for developers who want to run or modify the engine locally rather than an introduction for end users. End-user documentation lives at a separate site.

Where it fits