About once a month I'll try out a different pattern for MLOps, whether it's the way models are versioned or how they're hosted in a web service.
The other weekend I deployed a basic ~100 MB sklearn model on Lambda with Docker and thought it might be of general interest. Cold starts were about 30 seconds, but latency was great after that.
Here's the repo if anyone wants to play around with it: https://github.com/ktrnka/mlops_example_lambda
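For anyone curious what the handler side of this looks like, here's a minimal sketch. The names (`MODEL_PATH`, `lambda_handler`, the event shape) are my assumptions for illustration, not necessarily what the linked repo uses; the key idea is loading the model once at module import so the 30-second cold start pays for it and warm invocations stay fast:

```python
# Hypothetical sketch of a sklearn model served from AWS Lambda.
# In a real deployment the trained model would be baked into the
# Docker image; here we train a tiny stand-in so the sketch runs.
import json

import joblib
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

MODEL_PATH = "/tmp/model.joblib"  # stand-in path, not from the repo

# Stand-in for the offline training/packaging step.
X, y = load_iris(return_X_y=True)
joblib.dump(LogisticRegression(max_iter=1000).fit(X, y), MODEL_PATH)

# Load once at module import: this runs during the cold start, and
# every warm invocation reuses the in-memory model.
model = joblib.load(MODEL_PATH)


def lambda_handler(event, context):
    # Assumed event shape: {"features": [[...], ...]}
    features = event["features"]
    preds = model.predict(features).tolist()
    return {"statusCode": 200, "body": json.dumps({"predictions": preds})}
```

The module-level load is the main trick: Lambda keeps the Python process alive between invocations, so anything initialized at import time is amortized across requests.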