Feast: a feature store practitioner's perspective
I have been building a feature store at PalFish (a Series-C online education company based in China) since Feb 2021 (see the blog for what we build!). As a feature store practitioner, I have learned a lot from Feast. In this article, I will walk you through what Feast is, why it is great, and what it can do better.
Feast is the most starred open-source feature store implementation. Incubated at Gojek and later joining Tecton and the Linux Foundation for AI, Feast has been adopted by many teams besides Gojek and Tecton, including Twitter, Salesforce, Shopify, and Robinhood.
Feast keeps moving towards a feature-complete feature store implementation, and what it can do is best summarized in the architecture diagram below (source: Feast v0.13 release announcement).
As of v0.14, Feast supports:
- All sorts of mainstream batch data sources, including data warehouses (BigQuery and Redshift), object stores (GCS and S3), files (Parquet), and custom ones.
- Materialization that syncs data from offline stores to online stores, ensuring train-serve consistency.
- On-demand transformation.
- Feature serving via both SDK and servers.
- Point-in-time correct joins that build correct training datasets.
The next major moves include supporting stream data sources and data quality monitoring.
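To make two of these ideas concrete, here is a minimal sketch in plain Python of a point-in-time correct join and of materialization. It is not Feast's implementation or API. The sample data, the feature name `trips_today`, and the helper functions are all hypothetical and exist only for illustration.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical offline feature rows: (driver_id, feature_timestamp, trips_today)
feature_rows = [
    ("d1", datetime(2021, 11, 1, 8), 3),
    ("d1", datetime(2021, 11, 1, 12), 7),
    ("d2", datetime(2021, 11, 1, 9), 1),
]

# Hypothetical label events to build training rows for: (driver_id, event_timestamp)
entity_rows = [
    ("d1", datetime(2021, 11, 1, 10)),
    ("d1", datetime(2021, 11, 1, 13)),
    ("d2", datetime(2021, 11, 1, 8)),
]


def index_by_entity(rows):
    """Group feature rows by entity id, each group sorted by timestamp."""
    index = {}
    for entity_id, ts, value in sorted(rows, key=lambda r: (r[0], r[1])):
        timestamps, values = index.setdefault(entity_id, ([], []))
        timestamps.append(ts)
        values.append(value)
    return index


def point_in_time_join(entity_rows, feature_rows):
    """For each event, pick the latest feature value known at or before it."""
    index = index_by_entity(feature_rows)
    joined = []
    for entity_id, event_ts in entity_rows:
        timestamps, values = index.get(entity_id, ([], []))
        # Number of feature rows with timestamp <= event_ts
        pos = bisect_right(timestamps, event_ts)
        value = values[pos - 1] if pos > 0 else None  # None: nothing known yet
        joined.append((entity_id, event_ts, value))
    return joined


def materialize(feature_rows):
    """Keep only the latest value per entity, as an online store would serve it."""
    index = index_by_entity(feature_rows)
    return {entity_id: values[-1] for entity_id, (_, values) in index.items()}


training_rows = point_in_time_join(entity_rows, feature_rows)
online_store = materialize(feature_rows)
```

The join never looks at feature values recorded after the event, which is what prevents label leakage in training data. Materialization serves the same values online that the offline store holds, which is where train-serve consistency comes from. This sketch ignores details a real system has to handle, such as feature expiry.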
There are 4 things I love about Feast.
Feast clearly defines what a feature store is. Prior to Feast, I looked into multiple in-house feature stores built by tech giants via blogs and presentations, and I was discouraged to find that their architectures looked drastically different because they focused on different problems - Airbnb emphasized feature backfilling; Twitter focused on feature discovery and sharing; Netflix was too coupled with Spark; Uber's Michelangelo gave the best picture, but there was no reference to dive deep into and no author to ask. I needed a conceptual framework that would truly help me think about these systems coherently, and I was not alone. Luckily, I found Feast, which makes the concept extremely clear, with articles, docs, code, and even a SaaS product, Tecton.
Feast builds a community that helps people understand feature stores. I started my feature store research by reading a lot of posts and presentations by Uber, Airbnb, Netflix, etc. When I ran into questions, it was difficult to reach the right person to ask. This is not the case with the Feast community. Any time I ask something about feature stores in Feast's Slack group, Willem and friends will help.
Feast delivers what users want. The team collects feedback from the community, prioritizes properly, and iterates fast to meet users' needs. Here are 2 examples.
- Feast was heavy - it ran on k8s only, and it had built-in Spark jobs and a Java server for feature retrieval - just too much infrastructure to maintain. With version 0.10, Feast became lightweight, allowing users to run the whole feature store within a Python process.
- Feast was opinionated - it had its own preferred infrastructure, which prevented users from integrating Feast with their existing infrastructure. With version 0.10, Feast became pluggable, allowing users to choose the built-in infrastructure or implement their own by extending the provider interface.
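To illustrate why a pluggable interface helps with integration, here is a minimal sketch of the general pattern. It is not Feast's actual provider interface. The class names `OnlineStore` and `InMemoryOnlineStore`, the method names, and the `serve` function are all hypothetical.

```python
from abc import ABC, abstractmethod


class OnlineStore(ABC):
    """Hypothetical plug-in interface; NOT Feast's actual provider API."""

    @abstractmethod
    def write(self, entity_id, features):
        """Store a dict of feature name -> value for one entity."""

    @abstractmethod
    def read(self, entity_id, feature_names):
        """Return a dict of feature name -> value (None if missing)."""


class InMemoryOnlineStore(OnlineStore):
    """A built-in default that runs inside a single Python process."""

    def __init__(self):
        self._rows = {}

    def write(self, entity_id, features):
        self._rows.setdefault(entity_id, {}).update(features)

    def read(self, entity_id, feature_names):
        row = self._rows.get(entity_id, {})
        return {name: row.get(name) for name in feature_names}


def serve(store, entity_id, feature_names):
    """Serving code depends only on the interface, not on a concrete backend."""
    return store.read(entity_id, feature_names)


store = InMemoryOnlineStore()
store.write("d1", {"trips_today": 7})
result = serve(store, "d1", ["trips_today", "rating"])
```

A team that already runs its own key-value store would add another subclass of the interface and pass it to `serve`, leaving the rest of the code untouched.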
Feast is neutral. Despite being sponsored by Tecton, Feast never disparages its SaaS competitors with a table like the one below, which I find many others do 🤦.
| | Feast | Competitor A | Competitor B | Competitor C |
|---|---|---|---|---|
| Awesome Feature 1 | ✅ | ✅ | ❌ | ❌ |
| Awesome Feature 2 | ✅ | ✅ | ❌ | ✅ |
| Awesome Feature 3 | ✅ | ❌ | ✅ | ❌ |
| Awesome Feature 4 | ✅ | ✅ | ✅ | ❌ |
| Awesome Feature 5 | ✅ | ❌ | ❌ | ❌ |
Feast is great, but it can do better.
Feast could be written in a statically typed language rather than Python. Feast is in Python because it originated in the data science team at Gojek, who knew Python better. Nowadays, however, feature stores are mostly developed and maintained by backend engineers. Also, there are many better choices than Python for this kind of system-side (not ML-side) mission-critical software, and I know people who are working on Golang or JVM versions of Feast.
Many thanks to the Feast team for their endless effort to build the standard for feature stores, so that I have the luxury to judge. I hope Feast keeps moving the needle, and I highly suggest that everyone interested in feature stores check out Feast. You are sure to learn a lot whether you decide to adopt, buy, or build.
© Yik San Chan.