quack-on-demand

Scala · ★ 32 · updated 2d ago

Production-grade Arrow Flight SQL gateway in front of DuckDB Quack + DuckLake. Multi-tenant pools, pluggable auth (DB/JWT/OIDC), table-level ACLs, role-aware routing, and a live admin console.

Scala gateway that adds multi-user, multi-tenant security and load balancing to DuckDB's Quack extension. Exposes an Arrow Flight SQL interface for standard database clients (DBeaver, Spark, JDBC/ODBC). Includes authentication, per-table access control, node pooling, and a web admin console.

Scala · DuckDB · Arrow Flight SQL · PostgreSQL · Docker Compose · Keycloak · JWT
setup: moderate · complexity 4/5

Quack on Demand is a Scala-based gateway that sits in front of DuckDB's new client-server protocol and adds the features needed to run it in a shared, production environment. DuckDB is a fast analytical database that runs inside your application, and its Quack extension lets DuckDB instances talk to each other over a network. But Quack ships with minimal security: one static token for authentication, no concept of multiple users or tenants, and no way to restrict which data different users can see. Quack on Demand fills those gaps.

The gateway exposes an Arrow Flight SQL interface, which means any database client that speaks that open protocol (tools like DBeaver, Spark, or custom applications using standard JDBC or ODBC drivers) can connect without needing DuckDB-specific software. Incoming connections go through a pluggable authentication system that supports passwords stored in a database, external tokens in JWT format, or login via identity providers like Keycloak, Google, Azure AD, or AWS Cognito. After authentication, a per-statement access control layer checks whether the logged-in user has permission to read or write the specific tables referenced in their query.
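The per-statement check described above can be pictured with a small model. This is a minimal sketch under stated assumptions, not the project's actual code: the names `Grant`, `StatementAccess`, and `check` are hypothetical, and the sets of tables read and written are assumed to come from a SQL parser that is not shown.

```scala
// Hypothetical model: none of these names come from the Quack on Demand codebase.
object TableAclSketch {
  sealed trait Privilege
  case object Read extends Privilege
  case object Write extends Privilege

  // A grant gives one user one privilege on one table.
  final case class Grant(user: String, table: String, privilege: Privilege)

  // Tables a single statement touches, as a SQL parser would report them (assumed input).
  final case class StatementAccess(reads: Set[String], writes: Set[String])

  // Returns Right(()) when every referenced table is covered by a matching grant,
  // otherwise Left with the (table, privilege) pairs that are missing.
  def check(
      user: String,
      access: StatementAccess,
      grants: Set[Grant]
  ): Either[List[(String, Privilege)], Unit] = {
    val needed: List[(String, Privilege)] =
      access.reads.toList.sorted.map(t => (t, Read: Privilege)) ++
        access.writes.toList.sorted.map(t => (t, Write: Privilege))
    val missing = needed.filterNot { case (table, priv) =>
      grants.contains(Grant(user, table, priv))
    }
    if (missing.isEmpty) Right(()) else Left(missing)
  }
}
```

In this model a statement is rejected as soon as one referenced table lacks a grant, which matches the idea of checking each statement rather than each connection.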

For workloads that need more than one database node, the gateway manages pools of Quack nodes and routes queries to the right one based on whether the query is a read or a write. If a node crashes, the gateway detects it and restarts it automatically before accepting further traffic on that node.
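A rough illustration of read/write routing over a pool follows. It is a sketch under assumptions rather than the gateway's implementation: `Node`, `Pool`, and `isRead` are hypothetical names, statements are classified by their leading keyword instead of a real SQL parser, and unhealthy nodes are simply skipped (the restart behaviour described above is not modelled).

```scala
import java.util.concurrent.atomic.AtomicLong

// Hypothetical model: none of these names come from the Quack on Demand codebase.
object ReadWriteRouterSketch {
  sealed trait Role
  case object Reader extends Role
  case object Writer extends Role

  final case class Node(id: String, role: Role, healthy: Boolean)

  // Crude classification by leading keyword. Anything not recognised as a read
  // is treated as a write, which is the safer default.
  def isRead(sql: String): Boolean = {
    val first = sql.trim.takeWhile(c => !c.isWhitespace).toUpperCase
    Set("SELECT", "SHOW", "DESCRIBE").contains(first)
  }

  final class Pool(nodes: Vector[Node]) {
    private val counter = new AtomicLong(0)

    // Picks a healthy node with the matching role, round-robin across candidates.
    // Returns None when no healthy node of that role is available.
    def route(sql: String): Option[Node] = {
      val wanted: Role = if (isRead(sql)) Reader else Writer
      val candidates = nodes.filter(n => n.healthy && n.role == wanted)
      if (candidates.isEmpty) None
      else {
        val i = (counter.getAndIncrement() % candidates.size).toInt
        Some(candidates(i))
      }
    }
  }
}
```

Sending unrecognised statements to the writer costs some read capacity but avoids running a mutation on a node that is only meant to serve reads.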

A web-based admin console at the manager's HTTP port provides a live dashboard showing how many queries each node is handling, typical response times, and a history of recent statements. From the same UI you can add or remove tenants, configure pools, and edit access permissions per table.

The whole system deploys as a single JAR file and uses Postgres to store catalog metadata, pool state, user records, and access grants. The quick start uses Docker Compose and takes a few minutes to produce a working setup with sample data loaded. The README is long and covers native runs, JDBC client configuration, REST API usage, and load testing.

Where it fits

Run DuckDB as a shared multi-tenant analytical database for teams or multiple applications
Add enterprise authentication (SSO via Keycloak, Google, Azure AD) and table-level access control to DuckDB
Load balance read-heavy analytical workloads across multiple DuckDB nodes with automatic failover
Expose DuckDB via standard database protocols (JDBC/ODBC) so BI tools and Spark can query it directly
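
For the JDBC use case, a client would typically go through the Apache Arrow Flight SQL JDBC driver. The sketch below assumes that driver is on the classpath and that the gateway accepts username/password credentials over an unencrypted connection; the host, port, and credentials are placeholders, not values taken from the project's README.

```scala
import java.sql.{Connection, DriverManager}
import java.util.Properties

// Hypothetical helper: the host, port, and credentials are placeholders.
object FlightSqlJdbcSketch {
  // Builds a URL in the scheme used by the Arrow Flight SQL JDBC driver.
  def jdbcUrl(host: String, port: Int, useEncryption: Boolean): String =
    s"jdbc:arrow-flight-sql://$host:$port/?useEncryption=$useEncryption"

  // Opens a connection; requires the Arrow Flight SQL JDBC driver on the
  // classpath and a reachable gateway, so it is not exercised here.
  def connect(host: String, port: Int, user: String, password: String): Connection = {
    val props = new Properties()
    props.setProperty("user", user)
    props.setProperty("password", password)
    DriverManager.getConnection(jdbcUrl(host, port, useEncryption = false), props)
  }
}
```

Check the README's JDBC client configuration section for the actual port and connection options the gateway expects.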
