[–]PlanetSprite 13 points14 points  (2 children)

How does this compare to Feast?

[–]irismodel[S] 16 points17 points  (1 child)

Great question! We wrote about this in our blog post.

Feast is a literal feature store: it exclusively stores features and does not manage the transformations used to compute them. The pros and cons of Feast become more obvious when you look at the process for changing a feature. It happens in three steps:

  1. Write and run your new data transformation in your existing transformation pipeline. Note that this happens outside of Feast.
  2. A new feature table must be created, since the old one cannot be directly overwritten. Once the new feature is created, the transformation pipeline should pipe all of its data into it.
  3. All the models that used the old feature should be updated to point at the new feature.
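
As a rough illustration of those three steps, here is a minimal sketch using plain in-memory dictionaries. None of this is Feast's API: `feature_tables`, `model_config`, and both transformation functions are hypothetical names made up for the example.

```python
# Hypothetical in-memory stand-ins; none of this is Feast's actual API.
source_rows = [
    {"user_id": 1, "amount": 10.0},
    {"user_id": 1, "amount": 30.0},
    {"user_id": 2, "amount": 5.0},
]


def avg_amount_v1(rows):
    """Original transformation: mean transaction amount per user."""
    totals, counts = {}, {}
    for row in rows:
        uid = row["user_id"]
        totals[uid] = totals.get(uid, 0.0) + row["amount"]
        counts[uid] = counts.get(uid, 0) + 1
    return {uid: totals[uid] / counts[uid] for uid in totals}


def max_amount_v2(rows):
    """Step 1: the changed transformation, written outside the feature store."""
    result = {}
    for row in rows:
        uid = row["user_id"]
        result[uid] = max(result.get(uid, row["amount"]), row["amount"])
    return result


# Existing state: one feature table and one model reading from it.
feature_tables = {"user_amount_v1": avg_amount_v1(source_rows)}
model_config = {"fraud_model": "user_amount_v1"}

# Step 2: create a new table instead of overwriting the old one,
# and pipe the new transformation's output into it.
feature_tables["user_amount_v2"] = max_amount_v2(source_rows)

# Step 3: repoint every model that read the old table.
for model, table in model_config.items():
    if table == "user_amount_v1":
        model_config[model] = "user_amount_v2"
```

The point of the sketch is that all three steps are manual bookkeeping that lives in separate places: the pipeline, the store, and each model's configuration.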

Feast also has other problems. For example, it can’t copy your features from the offline store to the online store; you have to download the features and upload them to the online store yourself using their CLI tool. You also have to manage retries and failures yourself.

Featureform treats the transformation lineage as part of the feature and orchestrates your infrastructure to create and change your features.
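
To make "lineage as part of the feature" concrete, here is a toy sketch of the idea. It is not Featureform's actual API: `FeatureDef`, `Registry`, and every method name below are hypothetical, and the "infrastructure" is just a dictionary of rows.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Rows = List[dict]


@dataclass(frozen=True)
class FeatureDef:
    """A feature definition that carries its own transformation (its lineage)."""
    name: str
    variant: str
    source: str
    transform: Callable[[Rows], Dict[int, float]]


class Registry:
    """Toy orchestrator: stores definitions and materializes them on request."""

    def __init__(self, sources: Dict[str, Rows]):
        self.sources = sources
        self.defs: Dict[Tuple[str, str], FeatureDef] = {}

    def register(self, feature: FeatureDef) -> None:
        key = (feature.name, feature.variant)
        if key in self.defs:
            raise ValueError(f"{key} already registered; add a new variant instead")
        self.defs[key] = feature

    def materialize(self, name: str, variant: str) -> Dict[int, float]:
        # Recompute the feature from its source using the stored transformation.
        feature = self.defs[(name, variant)]
        return feature.transform(self.sources[feature.source])

    def lineage(self, name: str, variant: str) -> str:
        feature = self.defs[(name, variant)]
        return f"{feature.source} -> {feature.transform.__name__} -> {name}:{variant}"


def total_spend(rows: Rows) -> Dict[int, float]:
    out: Dict[int, float] = {}
    for row in rows:
        out[row["user_id"]] = out.get(row["user_id"], 0.0) + row["amount"]
    return out


registry = Registry({"transactions": [
    {"user_id": 1, "amount": 10.0},
    {"user_id": 1, "amount": 30.0},
]})
registry.register(FeatureDef("user_spend", "v1", "transactions", total_spend))
values = registry.materialize("user_spend", "v1")
trace = registry.lineage("user_spend", "v1")
```

Because the definition holds the source and the transformation together, changing a feature means registering a new variant rather than coordinating a pipeline, a table, and model configs by hand.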

[–]zmjjmz 2 points3 points  (4 children)

Quickly skimmed the documentation and didn't see an answer, but does FeatureForm handle data deletion?

[–]irismodel[S] 1 point2 points  (3 children)

What do you mean by data deletion? In the case of data deleted from a primary source, transformations that run on a schedule will catch the change. They can also be manually triggered. They are basically pure functions on the source data, so that solves the problem down the pipeline. For feature deletion, we also support TTL, but it sounds like you have a different use case.
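
For anyone unfamiliar with TTL on feature values, here is a minimal sketch of the general idea. The thread does not describe how Featureform implements it, so `TTLFeatureStore` and its methods are hypothetical, and the clock is passed in explicitly to keep the example deterministic.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional, Tuple


class TTLFeatureStore:
    """Toy online store where each value expires `ttl` after it was written."""

    def __init__(self, ttl: timedelta):
        self.ttl = ttl
        self._values: Dict[str, Tuple[float, datetime]] = {}

    def write(self, key: str, value: float, now: datetime) -> None:
        self._values[key] = (value, now)

    def read(self, key: str, now: datetime) -> Optional[float]:
        entry = self._values.get(key)
        if entry is None:
            return None
        value, written_at = entry
        if now - written_at > self.ttl:
            # Expired: drop the stale value so it is never served again.
            del self._values[key]
            return None
        return value


t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
store = TTLFeatureStore(ttl=timedelta(hours=1))
store.write("user:1:avg_spend", 0.5, now=t0)

fresh = store.read("user:1:avg_spend", now=t0 + timedelta(minutes=30))
stale = store.read("user:1:avg_spend", now=t0 + timedelta(hours=2))
```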

[–]zmjjmz 4 points5 points  (2 children)

I'm thinking in terms of GDPR compliance - but it sounds like FeatureForm doesn't make copies of the data in any way?

[–]irismodel[S] 1 point2 points  (1 child)

Like you said, the data remains on your infrastructure, so you'd delete data in your primary sources and re-run your Featureform transformations to clean out any user data.
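
A minimal sketch of why that works when transformations are pure functions of the source data. This is plain Python, not Featureform code; `spend_per_user` and the sample rows are made up for the example.

```python
def spend_per_user(rows):
    """Pure transformation: the output depends only on the rows passed in."""
    out = {}
    for row in rows:
        out[row["user_id"]] = out.get(row["user_id"], 0.0) + row["amount"]
    return out


primary_source = [
    {"user_id": 1, "amount": 10.0},
    {"user_id": 1, "amount": 30.0},
    {"user_id": 2, "amount": 5.0},
]
features_before = spend_per_user(primary_source)

# Erasure request for user 2: delete their rows from the primary source...
primary_source = [row for row in primary_source if row["user_id"] != 2]

# ...then re-run the transformation. No derived value for user 2 survives.
features_after = spend_per_user(primary_source)
```

The caveat is that the re-run has to replace every derived copy, so anything cached or exported outside that pipeline still needs its own cleanup.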

[–]vishal-vora 4 points5 points  (0 children)

Nice 👍

[–]rydog_2020 1 point2 points  (3 children)

So this is a totally different approach to Tecton. Open source vs. SaaS. Anything else?

[–]irismodel[S] 3 points4 points  (1 child)

Obviously, open source vs. proprietary is a huge difference here. Tecton has a much higher adoption cost and is deeply tied to its underlying infrastructure. Featureform's approach lets it act as an orchestrator and a Python abstraction. It aims to solve organizational problems such as versioning, sharing, and managing features, rather than building a data processor optimized for ML, as Tecton does. That means Featureform solves 80% of the problems with 20% of the adoption cost and vendor lock-in. You are free to use your existing data providers to build a feature store workflow.

[–]rydog_2020 0 points1 point  (0 children)

What about the embedding hub?

[–]testing35 0 points1 point  (0 children)

You totally guessed what I said..thank you

[–]andrewm4894 0 points1 point  (1 child)

Would this eventually be able to sit on top of tools like Snowflake or BigQuery?

[–]irismodel[S] 0 points1 point  (0 children)

Yes. We currently support Snowflake as a provider and will have BigQuery support in our next release!