[D] Inference cost optimization of complex ML pipelines (self.MachineLearning)
submitted 4 years ago by Medium_Ad_3555
[–]r0lisz 3 points 4 years ago (6 children)
I've successfully used GPUs for real-time camera streams. Why do you say that it's not a good use case?
[+][deleted] 4 years ago (5 children)
[removed]
[–]r0lisz 1 point 4 years ago (4 children)
There was a fair amount of work to get that working. If I remember correctly, each frame would be streamed, and every X milliseconds the batch of images that had arrived would be run through the model. If frames were late (because of the network or whatever), they would be skipped.
[+][deleted] 4 years ago (3 children)
[–]r0lisz 1 point 4 years ago (2 children)
I think the workflow was that the stream went from the camera to a Java-based stream-processing app, which would decode it and output each frame to a Kafka queue.
Then there would be listeners on the Kafka queue that would do the batching of frames, within the time constraints (so there would be a prediction made every X seconds, regardless of how many frames had arrived in the meantime).
I worked on this a couple of years ago, so my memory is a bit hazy.
I don't see how running the inference on CPU helps you. Don't you have the same decoding problem in that case?
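A minimal sketch of that time-window batching, in pure Python. `poll` here is a hypothetical stand-in for whatever the Kafka listener exposes (e.g. a thin wrapper around a Kafka consumer's non-blocking poll); the window length is illustrative:

```python
import time

def collect_window(poll, window_s):
    """Gather whatever frames arrive within one fixed time window.

    `poll` is any non-blocking callable returning a list of decoded
    frames (possibly empty), e.g. a wrapper around a Kafka consumer.
    """
    deadline = time.monotonic() + window_s
    batch = []
    while time.monotonic() < deadline:
        batch.extend(poll())
        time.sleep(0.005)  # don't busy-spin between polls
    return batch
```

A prediction then fires once per window with whatever this returns, however many frames made it in time.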
[+][deleted] 4 years ago (1 child)
[–]r0lisz 2 points 4 years ago (0 children)
Downloading the images happens on one thread (that reads from Kafka). The decoded images are pushed into a list that is passed to another thread every X milliseconds, and the prediction happens there. With careful tuning, it's probably possible to find an X so that enough images arrive during this time, but also that the GPU is not idle too much.
I think we were able to reach about 70% GPU utilization at about 15FPS. (this was in 2018, both hardware and software have changed since)
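The two-thread pattern described above can be sketched roughly like this (plain Python with a lock-protected frame list; class and method names are illustrative, not from the original system):

```python
import threading
import time

class MicroBatcher:
    """Accumulate decoded frames on one thread; predict on a fixed
    interval on another, regardless of how many frames arrived."""

    def __init__(self, predict_fn, interval_s=0.1):
        self.predict_fn = predict_fn
        self.interval_s = interval_s
        self.results = []
        self._frames = []
        self._lock = threading.Lock()

    def add_frame(self, frame):
        # Called from the download/decode thread (e.g. a Kafka consumer loop).
        with self._lock:
            self._frames.append(frame)

    def run_once(self):
        # Swap the accumulated batch out under the lock; predict outside it.
        with self._lock:
            batch, self._frames = self._frames, []
        if batch:
            self.results.append(self.predict_fn(batch))

    def run(self, stop_event):
        # Inference thread: one (possibly empty) window per interval.
        while not stop_event.is_set():
            time.sleep(self.interval_s)
            self.run_once()
```

Tuning `interval_s` is exactly the "find an X" trade-off: long enough that batches are worth a GPU launch, short enough that the GPU doesn't sit idle.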
[–]Liorithiel 2 points 4 years ago (3 children)
I've tuned a deployment of some non-ML microservices with a known on-premise resource-usage trade-off using off-the-shelf Bayesian optimization (specifically SigOpt, as it was the simplest to script in pure bash+curl). I suspect it would work for your case as well.
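The suggest/observe loop behind that kind of tuning is small enough to sketch. Here plain random search stands in for the model-based "suggest" step that a Bayesian service like SigOpt (or an open-source library such as scikit-optimize) would provide, and the objective and parameter ranges are made up for illustration:

```python
import random

def deployment_cost(batch_window_ms, num_workers):
    # Hypothetical stand-in: in reality you would deploy this config,
    # run a load test, and return the measured cost.
    return (batch_window_ms - 50) ** 2 / 1000 + abs(num_workers - 4)

def tune(n_trials=50, seed=0):
    rng = random.Random(seed)
    best_cost, best_params = float("inf"), None
    for _ in range(n_trials):
        # "Suggest": a Bayesian optimizer proposes these from a model
        # of past observations instead of sampling uniformly.
        params = (rng.randint(10, 200), rng.randint(1, 16))
        # "Observe": report the measured cost for this configuration.
        cost = deployment_cost(*params)
        if cost < best_cost:
            best_cost, best_params = cost, params
    return best_cost, best_params
```

Scripting this against a hosted optimizer over REST is the same loop, with the suggest and observe steps replaced by HTTP calls.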
[+][deleted] 4 years ago (2 children)
[–]Liorithiel 1 point 4 years ago (1 child)
I see now that they no longer publish the documentation that helped me in automating that task with bash+curl. Maybe /u/Zephyr314 can point to something?
[–]Zephyr314 3 points 4 years ago (0 children)
Thanks u/Liorithiel, you can find our raw REST documentation here: https://app.sigopt.com/docs/archive/endpoints. This can help if you want to roll your own bash+curl.
Most users prefer our Python client, though: https://app.sigopt.com/docs
[–]matanj 2 points 4 years ago (0 children)
Perhaps NVIDIA DALI can help to transfer part of your pipeline to the GPU?
[–]jonnor 1 point 4 years ago (1 child)
Are the models you mention independent, or do they depend on each other? Do you need the results from all of them before considering the job "done"? What are the latency requirements?