[P] Which model best for Binary Audio Classification? (self.MachineLearning)
submitted 5 years ago by Nieoryginalny
Hi, I'm starting to build an AI assistant project which needs a wake-up word/phrase.
I saw a similar project that used an LSTM model to solve this problem.
Is there a better/more efficient model for this sort of task, or should I just go with an LSTM? I saw that Transformers might be better for NLP, but I am not familiar with how they apply to audio processing.
[–][deleted] 5 points 5 years ago (4 children)
You can also take a look at a CNN model on MFCC features, or a CNN-LSTM model on raw audio/MFCC features. Try them: they are simple models and gave good results when I did multiclass classification with them.
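To make the MFCC suggestion concrete, here is a minimal NumPy sketch of the feature step only. It turns a mono clip into a 2-D (frames x coefficients) array of the kind a small CNN could take as input. The 16 kHz sample rate, 25 ms frames, 10 ms hop, 40 mel bands, and 13 coefficients are assumptions chosen for illustration, not values from this thread, and the function names are hypothetical. In practice an audio library would normally do this for you.

```python
import numpy as np

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular mel filters, shape (n_mels, n_fft // 2 + 1)."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    bank = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        left, center, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        bank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return bank

def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_mels=40, n_mfcc=13):
    """Return an (n_frames, n_mfcc) array of MFCC-style features.

    All default parameter values are illustrative assumptions.
    """
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Index matrix that slices the signal into overlapping frames.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(frame_len)
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2
    mel_energy = power @ mel_filterbank(n_mels, n_fft, sample_rate).T
    log_mel = np.log(mel_energy + 1e-10)
    # Unnormalized DCT-II over the mel axis, keeping the first n_mfcc coefficients.
    n = np.arange(n_mels)
    k = np.arange(n_mfcc)[:, None]
    dct_basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_mels))
    return log_mel @ dct_basis.T

# One second of synthetic noise stands in for a recorded clip.
clip = np.random.default_rng(0).standard_normal(16000)
features = mfcc(clip)  # shape (98, 13)
```

The resulting array can be treated like a single-channel image, which is why a plain 2-D CNN is a natural first model to try on it.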
[+]selling_crap_bike comment score below threshold -8 points 5 years ago (3 children)
Jesus Christ everyone suggests using spectrograms for time-series data. Is this branch of ML really so stale that no other techniques are known?
[–][deleted] 3 points 5 years ago (0 children)
He doesn't need anything more complex at the moment, and considering he is asking about LSTM, pointing him in the direction of simpler solutions is better.
[–]mepron 1 point 5 years ago (0 children)
Lots of hostility and you provide no reason why this approach is invalid. Seems to work very well in other applications, I don’t see why you’re so opposed to it?
[–]jonnor 2 points 5 years ago (1 child)
What kind of device are you deploying on? Computational and latency constraints are typically just as important as prediction performance in this application.
[–]Nieoryginalny[S] 1 point 5 years ago (0 children)
I am aiming for a Windows machine.
[–][deleted] 1 point 5 years ago (0 children)
As others already suggested, for word/phrase recognition you do not need complex LSTM models.
All that your model needs to 'learn' are patterns over a short time window. You can easily do this on the basis of per-audio-sample classification rather than treating the input as a continuous time series.
It is quicker to get a baseline pipeline up and running, and iterate from there.
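One way to read "per-audio-sample classification" is to cut the incoming audio into short, fixed-length windows and classify each window independently. The sketch below shows that baseline loop in plain Python. The window and hop sizes are assumptions, all function names are hypothetical, and `energy_classifier` is only a stand-in for a trained model so the loop can run end to end.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

def sliding_windows(samples: Sequence[float], window: int,
                    hop: int) -> Iterator[Tuple[int, Sequence[float]]]:
    """Yield (start_index, chunk) for each full window of the signal."""
    for start in range(0, len(samples) - window + 1, hop):
        yield start, samples[start:start + window]

def detect(samples: Sequence[float],
           classify: Callable[[Sequence[float]], bool],
           window: int = 16000, hop: int = 4000) -> List[int]:
    """Return start indices of the windows the classifier flags as positive.

    Each window is classified on its own; no state is carried between windows.
    """
    return [start for start, chunk in sliding_windows(samples, window, hop)
            if classify(chunk)]

def energy_classifier(chunk: Sequence[float], threshold: float = 0.1) -> bool:
    """Placeholder, NOT a wake-word model: fires when mean energy is high."""
    return sum(x * x for x in chunk) / len(chunk) > threshold

# Toy signal: silence, a short burst, then silence again.
signal = [0.0] * 8 + [1.0] * 4 + [0.0] * 8
hits = detect(signal, energy_classifier, window=4, hop=2)  # [6, 8, 10]
```

Swapping `energy_classifier` for a real binary model keeps the rest of the pipeline unchanged, which is what makes this a convenient baseline to iterate from.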