gitmyhub

feast

Python ★ 7.1k updated 1d ago

The Open Source Feature Store for AI/ML

Feast is an open-source feature store for machine learning that keeps training data and production data consistent, preventing silent model degradation from features computed in two different places.

Python · Snowflake · BigQuery · Redshift · PostgreSQL · Parquet · setup: moderate · complexity 4/5

Feast is an open-source feature store for machine learning, written in Python. In machine learning, a "feature" is a piece of data used to train a model or make predictions, such as a user's average purchase amount or how recently they logged in. Feast keeps those features organized, consistent, and available whether you are training a model on historical data or serving predictions in real time.

The main problem Feast solves is consistency between training and serving. Without a dedicated feature store, teams often compute the same numbers in two separate places: once for training, once for production. The values can drift apart, quietly degrading model quality. Feast fixes this by acting as a single source of truth. It maintains an offline store for processing large amounts of historical data (used during training) and a low-latency online store for fetching features quickly at prediction time.
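The relationship between the two stores can be illustrated with a small, library-free sketch. This is not Feast's implementation; the row layout and the helper name materialize_latest are hypothetical, chosen only to show how an online store holding the newest value per entity can be derived from the full history kept offline.

```python
from datetime import datetime

# Hypothetical offline store: the full history of a feature,
# one row per (entity id, event time, feature value).
offline_rows = [
    ("user_1", datetime(2024, 1, 1), 20.0),
    ("user_1", datetime(2024, 1, 5), 35.0),
    ("user_2", datetime(2024, 1, 3), 12.5),
]


def materialize_latest(rows):
    """Build an online view that keeps only the newest value per entity."""
    newest = {}
    for entity_id, event_time, value in rows:
        current = newest.get(entity_id)
        if current is None or event_time > current[0]:
            newest[entity_id] = (event_time, value)
    # Drop the timestamps: the online view maps entity id -> latest value.
    return {entity_id: value for entity_id, (_, value) in newest.items()}


# Hypothetical online store: a key-value lookup used at prediction time.
online_store = materialize_latest(offline_rows)
print(online_store)
```

Because both views come from the same rows, training and serving cannot disagree about how a value was computed.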

Feast also protects against a subtle and costly error called data leakage, where information from the future accidentally gets included in training data. It does this by generating point-in-time correct datasets: when you ask for a feature value, Feast looks up the value that existed at that specific moment in history, not a later one.
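A point-in-time lookup can be sketched in a few lines of plain Python. This is an illustration of the idea under simplifying assumptions (a single entity, history already sorted by time), not Feast's actual join logic; the function name value_as_of and the sample data are hypothetical.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical feature history for one entity, sorted by event time.
history = [
    (datetime(2024, 1, 1), 20.0),
    (datetime(2024, 1, 5), 35.0),
    (datetime(2024, 1, 9), 50.0),
]


def value_as_of(history, as_of):
    """Return the latest value recorded at or before `as_of`, or None."""
    times = [event_time for event_time, _ in history]
    # Index of the first record strictly later than `as_of`.
    idx = bisect_right(times, as_of)
    return history[idx - 1][1] if idx else None


# A training label observed on Jan 6 must only see features known by then,
# so the Jan 9 value is excluded even though it exists in the history.
label_time = datetime(2024, 1, 6)
feature_at_label_time = value_as_of(history, label_time)
print(feature_at_label_time)
```

A naive join on the latest value would have returned the Jan 9 record here, which is exactly the kind of leakage a point-in-time lookup avoids.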

Getting started involves installing the Python package, creating a feature repository with a single command, and defining your features in Python files alongside a YAML configuration file. From there you register them with feast apply, retrieve historical features to build training sets, materialize current values into the online store, and read them back at low latency in production. A built-in web UI lets you browse and explore registered features.
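The workflow above might look roughly like the following with Feast's Python SDK. Treat this as a sketch under assumptions: it presumes a repository already created with feast init and registered with feast apply, and the feature view name driver_hourly_stats, its feature names, and the driver_id entity are taken from the project's quickstart template and may not match your repository.

```python
from datetime import datetime

# Assumed feature references in "feature_view:feature" form
# (names follow the quickstart template; adjust to your repository).
FEATURES = [
    "driver_hourly_stats:conv_rate",
    "driver_hourly_stats:acc_rate",
]


def training_and_serving_sketch(repo_path="."):
    """Build a training set, refresh the online store, then read from it."""
    # Imported lazily so this module loads even without Feast installed.
    import pandas as pd
    from feast import FeatureStore

    store = FeatureStore(repo_path=repo_path)

    # Entities and the timestamps at which their features should be looked up.
    entity_df = pd.DataFrame(
        {
            "driver_id": [1001, 1002],
            "event_timestamp": [
                datetime(2021, 4, 12, 10, 59, 42),
                datetime(2021, 4, 12, 8, 12, 10),
            ],
        }
    )

    # Offline store: point-in-time correct training data.
    training_df = store.get_historical_features(
        entity_df=entity_df, features=FEATURES
    ).to_df()

    # Load the newest feature values into the online store.
    store.materialize_incremental(end_date=datetime.now())

    # Online store: low-latency lookup at prediction time.
    online = store.get_online_features(
        features=FEATURES, entity_rows=[{"driver_id": 1001}]
    ).to_dict()
    return training_df, online
```

Calling training_and_serving_sketch() from inside a feature repository returns the training DataFrame and a dictionary of online values; the same feature references are used for both, which is the point of the design.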

Feast connects to many common data sources, including Snowflake, BigQuery, Redshift, PostgreSQL, and Parquet files. Community plugins extend support further. The project is licensed under Apache 2.0 and actively maintained, with a public roadmap that includes vector search support for AI workloads.

Where it fits