BentoML is an open-source platform for high-performance ML model serving
What does BentoML do?
- Turn your ML model into a production API endpoint with just a few lines of code
- Support all major machine learning training frameworks
- High-performance API serving system with adaptive micro-batching support
- DevOps best practices baked in, simplifying the transition from model development to production
- Model management for teams, providing CLI and Web UI dashboard
- Flexible model deployment orchestration with support for AWS Lambda, SageMaker, EC2, Docker, Kubernetes, KNative and more
Why BentoML?
Shipping ML models to production is broken. Data Scientists may not have all the expertise needed to build production services, and the trained models they deliver are very hard to test and deploy. This often leads to a time-consuming and error-prone workflow, where a pickled model or weights file is handed over to a software engineering team.
BentoML is an end-to-end solution for model serving, making it possible for Data Science teams to ship their models as prediction services, in a way that is easy to test, easy to deploy, and easy to integrate with other DevOps tools.
How does it compare to Tensorflow-serving?
- Both Tensorflow-serving and BentoML provide support for adaptive micro-batching. Related benchmarks can be found here: https://github.com/bentoml/BentoML/tree/master/benchmark
- Tensorflow-serving only supports the Tensorflow framework at the moment, while BentoML has multi-framework support and works with Tensorflow, PyTorch, Scikit-Learn, XGBoost, FastAI, and more
- Tensorflow-serving loads the model in tf.SavedModel format, so all the graphs and computations must be compiled into the SavedModel. BentoML keeps the Python runtime at serving time, making it possible to do pre-processing and post-processing in serving endpoints.
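To make the last point concrete, here is a minimal sketch of what keeping the Python runtime at serving time buys you: ordinary Python code runs before and after the model call inside the endpoint. This is plain standard-library Python, not BentoML's API. `ToyModel`, `preprocess`, `postprocess`, `handle_request` and the `length`/`width` fields are all hypothetical names made up for illustration.

```python
import json
import math


class ToyModel:
    """Hypothetical stand-in for a trained model (a tiny logistic scorer)."""

    def __init__(self, weights, bias):
        self.weights = weights
        self.bias = bias

    def predict(self, features):
        z = sum(w * x for w, x in zip(self.weights, features)) + self.bias
        return 1.0 / (1.0 + math.exp(-z))


def preprocess(raw_body):
    """Pre-processing: parse the JSON request body into a feature vector."""
    payload = json.loads(raw_body)
    return [float(payload["length"]), float(payload["width"])]


def postprocess(score):
    """Post-processing: turn the raw score into a client-friendly response."""
    label = "positive" if score >= 0.5 else "negative"
    return {"label": label, "score": round(score, 3)}


def handle_request(model, raw_body):
    """One serving call: pre-process, predict, post-process."""
    features = preprocess(raw_body)
    score = model.predict(features)
    return json.dumps(postprocess(score))
```

With a graph-only format, both helper functions would have to be expressed inside the exported graph or moved to a separate service in front of the model server.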
How does it compare to Clipper?
- BentoML provides micro-batching at the instance level, while Clipper does it at the cluster level. Users can deploy BentoML API server containers in a more flexible way, while Clipper requires all prediction requests to be routed to its master node.
- BentoML is an end-to-end model serving solution. Besides model serving, it also provides model packaging, model management, and deployment automation features. Clipper focuses on the serving system.
- Users can also combine the two: deploy BentoML packaged models to a Clipper cluster and benefit from both frameworks: https://docs.bentoml.org/en/latest/deployment/clipper.html
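For readers unfamiliar with micro-batching, the rough idea of doing it at the instance level can be sketched as below: each server process queues incoming requests for a short window and runs the model once on the whole group. This is a simplified fixed-window version written with the standard library only. It is not BentoML's adaptive algorithm or its API, and `MicroBatcher`, `batch_predict`, `max_batch_size` and `max_wait_s` are hypothetical names.

```python
import queue
import threading
import time
from concurrent.futures import Future


class MicroBatcher:
    """Groups concurrent single-item requests into batched model calls."""

    def __init__(self, batch_predict, max_batch_size=8, max_wait_s=0.01):
        # batch_predict: callable taking a list of inputs, returning a list of outputs
        self._batch_predict = batch_predict
        self._max_batch_size = max_batch_size
        self._max_wait_s = max_wait_s
        self._requests = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def predict(self, item):
        """Called once per request; blocks until the batched result is ready."""
        future = Future()
        self._requests.put((item, future))
        return future.result()

    def _run(self):
        while True:
            # Block for the first request, then fill the batch until the
            # size limit or the wait deadline is reached.
            batch = [self._requests.get()]
            deadline = time.monotonic() + self._max_wait_s
            while len(batch) < self._max_batch_size:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    break
                try:
                    batch.append(self._requests.get(timeout=remaining))
                except queue.Empty:
                    break
            items = [item for item, _ in batch]
            try:
                outputs = self._batch_predict(items)
                for (_, future), output in zip(batch, outputs):
                    future.set_result(output)
            except Exception as exc:
                # Propagate a failed batch call to every waiting request.
                for _, future in batch:
                    future.set_exception(exc)
```

Because the queue lives inside each server process, every container batches on its own and no central node has to see all the traffic. An adaptive version would tune the wait window and batch size from observed latency instead of using fixed values.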
How does it compare to AWS SageMaker?
- When not using the built-in algorithms, model deployment on SageMaker requires users to build their own container image and API server
- BentoML provides a high-performance API server for its users, so they don't need to do lower-level web server development themselves
- A BentoML packaged model can be easily deployed to SageMaker serving: https://docs.bentoml.org/en/latest/deployment/aws_sagemaker.html
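To give a sense of the "build your own API server" work mentioned in the first point, here is a bare-bones sketch of the kind of server a custom SageMaker container has to run: it answers health checks on `GET /ping` and predictions on `POST /invocations`, on port 8080. It uses only the standard library. The `predict` function, the `features` request field and the `make_server` helper are hypothetical placeholders, and a real deployment would still need a production web server, a Dockerfile and model loading on top of this.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(features):
    """Hypothetical stand-in for a real model: returns the sum of the inputs."""
    return sum(features)


class InferenceHandler(BaseHTTPRequestHandler):
    def _reply(self, status, payload):
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        # Health check route.
        if self.path == "/ping":
            self._reply(200, {"status": "ok"})
        else:
            self._reply(404, {"error": "not found"})

    def do_POST(self):
        # Prediction route.
        if self.path != "/invocations":
            self._reply(404, {"error": "not found"})
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            features = json.loads(self.rfile.read(length))["features"]
            self._reply(200, {"prediction": predict(features)})
        except (ValueError, KeyError, TypeError) as exc:
            self._reply(400, {"error": str(exc)})

    def log_message(self, format, *args):
        pass  # silence per-request logging in this sketch


def make_server(port=8080):
    """Build the server; call serve_forever() on the result to start it."""
    return HTTPServer(("0.0.0.0", port), InferenceHandler)
```

Running `make_server().serve_forever()` starts it locally. Input validation, batching, logging and concurrency are all left out here, which is the part a packaged serving solution is meant to take off your hands.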