risingwave

Rust ★ 9.1k updated 1d ago

Event streaming platform for agentic AI. Continuously ingest, transform, and serve event streams in real time, at scale.

A streaming database that processes live data continuously and keeps query results up to date in real time, replacing the multiple separate tools usually needed for stream processing.

Tags: Rust, SQL, PostgreSQL, Kafka, Apache Iceberg, Docker, Kubernetes, S3
Setup: moderate
Complexity: 4/5

RisingWave is a database built for working with data that changes constantly. Rather than analyzing data after the fact by running queries against a stored snapshot, it processes incoming data streams continuously and keeps results up to date. When you query it, you get the current state, not information from minutes ago.

The problem it addresses: building a system that ingests live data, transforms it, and serves fresh results has traditionally required stringing together several separate tools. You might use one tool to capture database changes, another to move those changes around, a third to process and aggregate them, and a fourth to store the results for querying. RisingWave replaces that whole chain with a single system.

You write queries in standard SQL (the same language used to query most databases), and RisingWave maintains what it calls materialized views: precomputed results it updates automatically as new data arrives. Because queries read those precomputed results rather than rescanning raw data, query latency is meant to stay under 100 milliseconds even as data volume grows. RisingWave speaks the same wire protocol as PostgreSQL, so existing PostgreSQL tools and database drivers can connect without changes.
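To make the idea of a materialized view concrete, here is a minimal conceptual sketch in Python. It is not RisingWave's implementation or API; the class name OrderTotalsView, its methods, and the order data are all hypothetical. It mimics what a view defined as "SELECT customer, COUNT(*), SUM(amount) ... GROUP BY customer" would hold, updating the stored result on every incoming change instead of rescanning all past events.

```python
from collections import defaultdict


class OrderTotalsView:
    """Toy stand-in for an incrementally maintained materialized view.

    Hypothetical illustration only: keeps a per-customer order count and
    total, adjusted on each change event rather than recomputed from scratch.
    """

    def __init__(self):
        self.counts = defaultdict(int)   # customer -> number of orders
        self.totals = defaultdict(int)   # customer -> sum of amounts (in cents)

    def apply(self, customer, amount_cents, sign=1):
        """Apply one change: sign=+1 for an insert, sign=-1 for a delete."""
        self.counts[customer] += sign
        self.totals[customer] += sign * amount_cents
        if self.counts[customer] == 0:
            # No rows left for this group, so drop it from the result.
            del self.counts[customer]
            del self.totals[customer]

    def query(self, customer):
        """Read the current result for one customer; no rescan is needed."""
        if customer not in self.counts:
            return None
        return (self.counts[customer], self.totals[customer])


if __name__ == "__main__":
    view = OrderTotalsView()
    view.apply("alice", 1000)            # new order arrives
    view.apply("alice", 500)             # another order arrives
    view.apply("alice", 1000, sign=-1)   # first order is deleted upstream
    print(view.query("alice"))
```

The cost of each update depends on the size of the change, not on the amount of history, and reads are simple lookups. That is the general reason precomputed views can answer quickly while data keeps flowing.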

Data can come from multiple sources: changes captured from PostgreSQL and MySQL databases, messages from Kafka or other streaming platforms, HTTP webhooks from external services, or files from storage systems like S3. All these sources can be queried together with the same SQL interface.

For long-term storage, RisingWave writes to Apache Iceberg, an open table format that other tools such as Spark, DuckDB, and Trino can also read. RisingWave handles ongoing maintenance of those Iceberg tables automatically, so no separate maintenance jobs are needed. Hot data for fast queries is kept in an internal row store; cooler historical data lives in object storage, which is substantially cheaper than keeping everything in memory.

RisingWave can be deployed as a managed cloud service or self-hosted using Docker or Kubernetes. The source code is released under the Apache 2.0 license.

Where it fits

Write SQL queries that automatically update as new Kafka messages arrive, replacing a separate stream processor with a single database.
Capture changes from a PostgreSQL or MySQL database and query the live results with sub-100-millisecond latency.
Build a real-time analytics dashboard that queries continuously updated materialized views instead of stale snapshots.
Store streaming results in Apache Iceberg so Spark or DuckDB can also query the same data for batch workloads.
