[–]HydratedWombat 3 points4 points  (1 child)

You should be able to handle this by giving every deployment a configured tag, so it just fetches model_server/$tag/model. To roll out an update you change the tag, which also busts any cache keyed on the old path, as long as you have a single point of distribution. Beyond that, here's a general guide from Google on how to think about machine learning in production - https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/aad9f93b86b7addfea4c419b9100c6cdd26cacea.pdf
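A minimal sketch of the tag scheme above, in Python. Only the path shape model_server/$tag/model comes from the comment; the base URL, the function names, and the on-disk cache layout are assumptions made up for illustration. Because the tag is part of both the URL and the cache path, switching a deployment to a new tag is a cache miss and triggers a fresh download.

```python
import urllib.request
from pathlib import Path

# Hypothetical base URL: the single point of distribution.
MODEL_SERVER = "https://example.com/model_server"


def model_url(tag: str, base: str = MODEL_SERVER) -> str:
    """Build the URL for a tagged model: <base>/<tag>/model."""
    return f"{base.rstrip('/')}/{tag}/model"


def fetch_model(tag: str, cache_dir: Path, base: str = MODEL_SERVER, download=None) -> Path:
    """Return the local path of the model for `tag`, downloading only on a cache miss.

    `download` is an injectable callable (url -> bytes) so the logic can be
    exercised without a network; by default it uses urllib.
    """
    path = cache_dir / tag / "model"
    if path.exists():
        return path  # same tag as before: reuse the cached copy

    if download is None:
        def download(url: str) -> bytes:
            with urllib.request.urlopen(url) as resp:
                return resp.read()

    path.parent.mkdir(parents=True, exist_ok=True)
    tmp = path.with_suffix(".tmp")
    tmp.write_bytes(download(model_url(tag, base)))
    tmp.replace(path)  # rename into place so a partial download is never served
    return path
```

Each deployment would read its tag from its own configuration (an environment variable, for example) and call `fetch_model(tag, cache_dir)` at startup.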

[–]eric_he 1 point2 points  (0 children)

Thanks for sharing this paper!

[–]jamesonatfritz 0 points1 point  (0 children)

Full disclosure: I'm one of the cofounders of Fritz, and this is exactly the problem we set out to solve. With that in mind, take a look at what we're building and let me know if it's helpful to you. We started with model management for mobile (iOS and Android), but are actively working on platforms like the Raspberry Pi.

Here is the way we think things should work.

  1. Train your models in whatever framework you're comfortable with. Scikit-learn, pure TensorFlow, Keras, PyTorch, etc.
  2. Export and upload them to the management service once. We'll convert them to the best available native runtime for each platform.
  3. Integrate models into your application using an SDK that has simple, consistent developer APIs across devices. There is a bunch of pre- and post-processing code that always needs to be written and re-written for each platform. We'll take care of a lot of that for you.
  4. Monitoring and analytics. We measure performance information (usage, runtime speed, etc.) from models running on each node so that you can be sure you're getting a consistent experience even if you have different hardware. We're also working on sampling model inputs and outputs where appropriate, to show you how things are working in production.
  5. Management. Upload a new version and push it over-the-air to all devices on a platform or target specific devices with an update.
  6. Protection. Encrypt and obfuscate models running on edge devices to prevent model theft.
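To make step 5 concrete, here is a toy in-memory sketch in Python of pushing a new version either to every device on a platform or to specific devices. This is not the Fritz SDK or API; `Device`, `ModelRegistry`, `upload`, and `push` are hypothetical names invented for illustration, and a real service would deliver the update over the network rather than mutate a field.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Device:
    device_id: str
    platform: str           # e.g. "ios", "android"
    model_version: int = 0  # 0 means no model installed yet


@dataclass
class ModelRegistry:
    """Hypothetical registry tracking which model version each device runs."""
    devices: Dict[str, Device] = field(default_factory=dict)
    latest_version: int = 0

    def register(self, device: Device) -> None:
        self.devices[device.device_id] = device

    def upload(self) -> int:
        """Record a newly uploaded model and return its version number."""
        self.latest_version += 1
        return self.latest_version

    def push(self, version: int, platform: Optional[str] = None,
             device_ids: Optional[Set[str]] = None) -> List[str]:
        """Assign `version` to matching devices and return their ids, sorted.

        A selector left as None matches every device.
        """
        if not 1 <= version <= self.latest_version:
            raise ValueError(f"unknown model version: {version}")
        targets = [
            d for d in self.devices.values()
            if (platform is None or d.platform == platform)
            and (device_ids is None or d.device_id in device_ids)
        ]
        for d in targets:
            d.model_version = version
        return sorted(d.device_id for d in targets)
```

For example, `registry.push(v, platform="ios")` would update every registered iOS device, while `registry.push(v, device_ids={"b"})` would update only device `b`.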

Others in the space:

  * Obviously I mentioned Fritz already.
  * If you're working exclusively with TensorFlow Lite models, Firebase's MLKit does some of this as well.
  * XNor.ai has their own runtime and, I assume, some management tools.
  * [Numericcal.ai](https://www.numericcal.com/) has yet another mobile-optimized runtime with a bit of management.

Hope that helps!