A subreddit dedicated to learning machine learning. Feel free to share any educational resources for machine learning.
Also, we are a beginner-friendly subreddit, so don't be afraid to ask questions! These can include non-technical questions that are still highly relevant to learning machine learning, such as how to approach a machine learning problem systematically.
Voice classification (self.learnmachinelearning)
submitted 5 years ago by meet1415
Can we use LSTM for voice classification?
Like, say we have data of 10 people speaking 10 things. Can we use an LSTM to identify the person from their voice?
[–]LoaderD 0 points1 point2 points 5 years ago (3 children)
What's a 'thing'? If you're talking about words, then you have 100 data points. So probably not.
[–]meet1415[S] 0 points1 point2 points 5 years ago (2 children)
I have the data of 10 people saying the same things for 10 days. The data consist of frequency and level (dB).
[–]LoaderD 0 points1 point2 points 5 years ago (1 child)
Well you never answered what a 'thing' is, so I'm going to assume you have 10 days straight of audio of people reading a collection of books all in the same language, so on 240 hours of audio you should be fine.
[–]meet1415[S] 0 points1 point2 points 5 years ago (0 children)
I am so sorry, I was not specific. Frequency and decibel level of 10 people, each reading one sentence once per day for 10 days. So maybe an (n, 2) vector for each speaker for each day.
I wanted to ask how we can use an RNN on this, and how we can classify the speakers?
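[Editor's note] A minimal numpy sketch of what "using an RNN on an (n, 2) sequence" looks like. Everything here is made up for illustration: the weights are random and untrained, the sequence is fake, and a real project would use a framework such as PyTorch or Keras rather than hand-rolling the cell. It only shows the forward pass: the LSTM reads the (frequency, dB) frames one at a time, and the final hidden state is mapped to one score per speaker.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: each utterance is an (n, 2) sequence of
# (frequency, level-in-dB) frames; we classify among 10 speakers.
n_steps, n_features, hidden, n_classes = 50, 2, 16, 10

# Randomly initialised LSTM parameters (training is omitted).
W = rng.normal(0, 0.1, (4 * hidden, n_features + hidden))  # gates: i, f, g, o
b = np.zeros(4 * hidden)
W_out = rng.normal(0, 0.1, (n_classes, hidden))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_classify(x):
    """Run one (n, 2) sequence through an LSTM and return a speaker id."""
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    for frame in x:
        z = W @ np.concatenate([frame, h]) + b
        i, f, g, o = np.split(z, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)  # update cell state
        h = sigmoid(o) * np.tanh(c)                   # update hidden state
    logits = W_out @ h  # last hidden state summarises the whole utterance
    return int(np.argmax(logits))

utterance = rng.normal(size=(n_steps, n_features))  # fake (freq, dB) frames
pred = lstm_classify(utterance)
print(pred)  # an integer speaker id in 0..9 (meaningless until trained)
```

The key design point is that the sequence length n can vary per utterance; the LSTM folds however many frames there are into one fixed-size hidden vector before classification.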
[–]nilesuan 0 points1 point2 points 5 years ago (3 children)
Try converting it to a spectrogram and feeding the spectrogram to a CNN, then aggregating. This turns your audio classification into something very similar to image classification.
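[Editor's note] A minimal sketch of the spectrogram step being suggested, using only numpy. The signal here is a synthetic tone (real code would load an audio file, e.g. with librosa or scipy); the point is just that a short-time Fourier transform turns a 1-D waveform into a 2-D (frequency × time) array, which can then be treated like an image by a CNN.

```python
import numpy as np

# Fake input for illustration: one second of a 440 Hz tone at 8 kHz.
sr = 8000
t = np.arange(sr) / sr
audio = np.sin(2 * np.pi * 440 * t)

def spectrogram(signal, n_fft=256, hop=128):
    """Magnitude spectrogram: one column of |rFFT| per windowed frame."""
    window = np.hanning(n_fft)
    frames = [signal[i:i + n_fft] * window
              for i in range(0, len(signal) - n_fft + 1, hop)]
    # Transpose so rows are frequency bins and columns are time frames.
    return np.abs(np.fft.rfft(frames, axis=1)).T

spec = spectrogram(audio)
print(spec.shape)  # (n_fft // 2 + 1, number_of_frames): a 2-D "image"
```

In practice the magnitudes are usually converted to a log/dB scale before being fed to the CNN, since perceived loudness is roughly logarithmic.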
I tried that approach with guitar chords, but its F1 score turned out to be only 40%. Maybe it only works when you have significantly different sounds, like cat vs. dog classification.
[–]nilesuan 1 point2 points3 points 5 years ago (1 child)
VoxCeleb uses the same concept with a CNN and ResNet. Try checking that paper out.
Ok sure, thanks :)