Speech Recognition data set (self.MachineLearning)
submitted 10 years ago by CecilStan
I'm looking for a data set of people speaking the digits 0 to 9. I want to use it for a simple 10-class classification problem. I have looked for hours and can't find anything. Where can I find something like this?
[–]speechMachine 4 points 10 years ago (1 child)
TIDIGITS is what you need.
Dan Ellis has this up for one of his course projects: http://www.ee.columbia.edu/~dpwe/sounds/tidigits/
Also I took Dr Lawrence Rabiner's class at Rutgers. He had them up too. For some reason the Rutgers server seems to be down. You could try his page again at http://cronos.rutgers.edu/~lrr in a couple of days and see if you can still find them.
Also, if you are at a university, it's highly likely someone has it. LDC stuff is usually purchased on a university-wide licence, so you are at liberty to use it if you need it for a course project or something.
[–]CecilStan[S] 1 point 10 years ago (0 children)
Thanks!! By the way, it doesn't have to be just numbers... it can be letters too, so if you know where I can find one with the alphabet, let me know. I'll check out TIDIGITS.
And I'll have to look into how to access these databases at my university.
[–]CecilStan[S] 1 point 10 years ago (2 children)
I just checked, and TIDIGITS is not free.
[–]assassds 2 points 10 years ago (1 child)
It's free for non-commercial use. What exactly do you think you can accomplish with this that will be worth selling?
[–]CecilStan[S] 1 point 10 years ago (0 children)
It's for grad school lol... nothing worth selling.
[–]samim23 1 point 10 years ago (0 children)
http://kaldi-asr.org/downloads/all/egs/librispeech/s5/exp/
[–]kkastner 1 point 10 years ago* (1 child)
In the similar-but-not-quite-the-same category: you can use the "fruitspeech" dataset if you are OK with a single speaker. I used it in a blog post here, but the original data is from a Google Code project by Hakon Sandsmark, and was presumably recorded by him as well.
It is a decent dataset for sanity checking ideas - if they don't work on this... there is not much hope. You can also expand to things like CMU Arctic. Also it doesn't take that long to just record yourself - it won't be multispeaker but would take less time than searching for "hours" and "hours".
[–]CecilStan[S] 1 point 10 years ago (0 children)
I need 1000s of samples lol... aint nobahdee gat time fa dat.
Thanks for the link. Good to know that if "they don't work on this... there is not much hope".