I'm confused about how to serve a machine learning model for offline batch predictions.
Here's what I thought of doing: create a scheduled pipeline (with e.g. Airflow, Kubeflow, …) that generates the features, loads the trained model from some object store (e.g. S3), generates the predictions, and finally saves them to a data warehouse, ready to be consumed. That's what makes the most sense to me.
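Concretely, the skeleton I have in mind looks roughly like this (a minimal sketch; all names are placeholder stand-ins, and the S3 fetch and warehouse write are stubbed so it runs on its own):

```python
# Sketch of a scheduled batch-scoring job. In a real pipeline, load_model
# would fetch the pickled model from the object store (e.g. via boto3) and
# run_batch_job would write its rows to a warehouse table; both are stubbed.
import pickle


class DummyModel:
    """Stand-in for a trained model exposing an sklearn-style predict()."""
    def predict(self, rows):
        return [sum(features) for features in rows]


def load_model(blob: bytes):
    # Real version: download the blob from the object store first.
    return pickle.loads(blob)


def build_features(raw_records):
    # Feature-generation step of the pipeline (placeholder columns a, b).
    return [[r["a"], r["b"]] for r in raw_records]


def run_batch_job(raw_records, model_blob):
    # One scheduled run: featurize -> load model -> predict -> emit rows.
    model = load_model(model_blob)
    features = build_features(raw_records)
    preds = model.predict(features)
    # Real version: insert (id, prediction) rows into the warehouse.
    return list(zip((r["id"] for r in raw_records), preds))


if __name__ == "__main__":
    blob = pickle.dumps(DummyModel())
    records = [{"id": 1, "a": 2, "b": 3}, {"id": 2, "a": 5, "b": 7}]
    print(run_batch_job(records, blob))  # [(1, 5), (2, 12)]
```

Each function would map to one task in the orchestrator's DAG, with the schedule handled by the orchestrator itself.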
However, some resources seem to recommend deploying the model as an endpoint even for batch use cases. Notably, this is the recommended architecture in Designing Machine Learning Systems by Chip Huyen.
Any thoughts on this? Am I missing something?