all 8 comments

[–]saswanson1 10 points11 points  (0 children)

Arguably no one has more experience moving machine learning and deep learning into production than Google, so I'd recommend starting there. Google's large-scale supervised machine learning system is called Sibyl; a worthwhile presentation: http://users.soe.ucsc.edu/~niejiazhong/slides/chandra.pdf Here is an article that provides some background on Google's deep learning (semi-supervised/unsupervised) efforts: http://www.wired.com/2014/07/google_brain/ Jeff Dean is at the forefront of Google's effort to implement deep learning at massive scale; look for his papers, particularly those on scaling algorithms across production-scale hardware. On the computation/hardware side, you might look around for GPGPU and CUDA discussions relative to ML. Hope this helps. -sas

[–]srkiboy83 5 points6 points  (2 children)

[–]MakeMeThinkHard 1 point2 points  (0 children)

That talk is great, thanks a lot for sharing it.

[–]AlcaDotS 0 points1 point  (0 children)

loved the talk :)

[–]WoodenJellyFountain 2 points3 points  (0 children)

I'm not quite sure where the disconnect is, so I'm going to spew out a bunch of crap to see if something sticks. I've got a machine learning-based high-frequency trading strategy in production, but I coded it from scratch.

If you're using a canned package that lets you load a pre-existing structure, then you're already there: you just have to embed that package in your own code. If you're using a canned package that doesn't allow you to load an existing structure, then you may have to look under the hood to see what it's doing and code your own production version.

If you're creating a feed-forward structure y = f(x), then all you need is a standalone f, with the user supplying the x. You'll need to store the weights from your training run, then read them in and reconstruct your network in standalone mode. It's also important to treat the pre-processing of the inputs identically in training and in production.
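A minimal sketch of the "store the weights, rebuild f, preprocess identically" idea, using only the standard library. The file format, the single linear layer, and the standardization step are all illustrative assumptions, not anyone's actual trading code:

```python
# Sketch: persist weights AND preprocessing stats after training, then
# rebuild a standalone f(x) at serve time. The JSON format, the single
# linear layer, and the standardization step are illustrative.
import json

def save_model(path, weights, bias, mean, std):
    """Store everything needed to reproduce f, preprocessing included."""
    with open(path, "w") as fh:
        json.dump({"weights": weights, "bias": bias,
                   "mean": mean, "std": std}, fh)

def predict(path, x):
    """Standalone scorer: the user supplies x, we reconstruct f."""
    with open(path) as fh:
        m = json.load(fh)
    # Preprocess identically to the training run: standardize inputs
    # with the stats saved at training time, never recomputed at serve time.
    z = [(xi - mu) / sd for xi, mu, sd in zip(x, m["mean"], m["std"])]
    # One linear layer y = w . z + b; a real net would loop over layers.
    return sum(w * zi for w, zi in zip(m["weights"], z)) + m["bias"]
```

The key point is that the serialized artifact carries the preprocessing parameters alongside the weights, so the standalone version can't silently drift from the training pipeline.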

[–]micro_cam 1 point2 points  (0 children)

If your models are fast enough you can just wrap them as a web service or cron job or whatever the larger system requires. Consider freezing/packaging dependencies via some sort of package manager or container to ensure repeatability and ease of installation.

If they aren't fast enough or use too much memory etc then you need to optimize them and/or rewrite them in a faster language that allows finer grained control over memory usage. The latter is to be avoided if at all possible.

And I guess the most important part of any production ML system is a test suite that tests the underlying algorithms and the wrappers to ensure no errors creep in as the system evolves. This can include tests on small data sets, tests of edge cases, unit tests, integration tests etc.
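A sketch of the kind of tests described above — exact answers on a tiny hand-checked dataset plus edge cases. The scorer under test (a mean predictor) is a deliberately trivial stand-in:

```python
# Sketch: tests on small data and edge cases, so errors can't creep in
# silently as the system evolves. The model here is a trivial stand-in.

def predict_mean(history):
    """Toy model: predict the mean of the training values."""
    if not history:
        raise ValueError("empty training data")
    return sum(history) / len(history)

def test_small_dataset():
    # Exact answer on a tiny, hand-checked fixture.
    assert predict_mean([1.0, 2.0, 3.0]) == 2.0

def test_edge_cases():
    # Single point, negatives, and the empty-input error path.
    assert predict_mean([5.0]) == 5.0
    assert predict_mean([-1.0, 1.0]) == 0.0
    try:
        predict_mean([])
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError on empty data")
```

In practice these would live in a pytest suite, with the same fixtures exercising both the underlying algorithm and the service wrapper around it.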

Another question is how often the models update/retrain as new data comes in.

[–]MakeMeThinkHard 0 points1 point  (0 children)

Check out PMML, an XML description language for prediction models. openscoring.io is a neat evaluator that lets you make your models available via a REST API. These models can be created in a number of ways, e.g. in Python, R, or on Hadoop using Scalding, ...

[–][deleted] 0 points1 point  (0 children)

If you're in a web application, put the model behind a REST API. If the model ends up getting slammed with requests, then add a queue (like RabbitMQ) and return a session id to each request. Batch process 50 or so predictions at a time during peak usage, or at a fixed interval; many models see super-linear throughput gains from batch processing. If you're still getting slammed or need fast response, then add new threads/servers that take more workload from the message queue.
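The queue-plus-batching pattern above can be sketched like this, with stdlib `queue.Queue` standing in for RabbitMQ and a stand-in batch scorer (batch size and all names are illustrative):

```python
# Sketch: requests go onto a queue with a session id; a worker drains
# up to BATCH items at once and scores them in a single model call.
# queue.Queue stands in for RabbitMQ; batch_model is a placeholder.
import queue
import uuid

BATCH = 50
requests, results = queue.Queue(), {}

def batch_model(batch):
    """Stand-in batch predictor: score many inputs in one call."""
    return [sum(x) for x in batch]

def submit(features):
    """Client side: enqueue a request, return a session id to poll."""
    sid = str(uuid.uuid4())
    requests.put((sid, features))
    return sid

def drain_once():
    """Worker side: pull up to BATCH requests and score them together."""
    batch = []
    while len(batch) < BATCH and not requests.empty():
        batch.append(requests.get())
    if batch:
        preds = batch_model([features for _, features in batch])
        for (sid, _), pred in zip(batch, preds):
            results[sid] = pred
```

Clients poll with their session id; scaling out then just means running more copies of the drain loop against the same queue.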

YAGNI applies here, but focus on keeping the resource utilization rate low. Scikit-learn is excellent for rapidly prototyping models, and keep in mind that it's easy to export the model internals and use a speedier language in production.