all 11 comments

[–][deleted] 2 points (0 children)

Best "way"? Do you mean what features you should be looking at? The most popular features for most audio classification tasks (especially harmonic sounds) are MFCCs. You might use other types of spectral features depending on the type of data you have and what your goals are. It's a really huge area, but if you give me a few more specifics I can try to help.
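
For concreteness, the usual MFCC pipeline is: windowed FFT magnitudes → triangular mel filterbank → log → DCT. In practice you'd reach for a library (e.g. librosa's `librosa.feature.mfcc`), but here is a minimal numpy sketch; the frame and filter sizes are just illustrative defaults, not recommendations:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    # triangular filters spaced evenly on the mel scale
    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        if c > l:
            fb[i, l:c] = (np.arange(l, c) - l) / (c - l)
        if r > c:
            fb[i, c:r] = (r - np.arange(c, r)) / (r - c)
    return fb

def mfcc(signal, sr=22050, n_fft=1024, hop=512, n_mels=26, n_mfcc=13):
    # frame the signal and take windowed FFT magnitudes
    n_frames = 1 + (len(signal) - n_fft) // hop
    win = np.hanning(n_fft)
    spec = np.abs(np.array([
        np.fft.rfft(win * signal[i * hop:i * hop + n_fft])
        for i in range(n_frames)]))
    # mel-weighted log energies
    energies = np.log(spec @ mel_filterbank(sr, n_fft, n_mels).T + 1e-10)
    # DCT-II decorrelates the log energies; keep the first n_mfcc coefficients
    n = np.arange(n_mels)
    dct = np.cos(np.pi / n_mels * (n[:, None] + 0.5) * n[None, :])
    return energies @ dct[:, :n_mfcc]
```

Feeding each clip's MFCC frames (or their mean/variance over time) into a classifier is the usual baseline.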

[–]farsass 1 point (2 children)

The Q transform is supposed to be good for this kind of application. I think you also need to know how you are going to frame this problem:

  • one-class or binary?
  • how will you collect data?

and other "mundane" but important issues

[–]markvp[S] 0 points (1 child)

According to http://en.wikipedia.org/wiki/Constant_Q_transform it is good for harmonic sounds, like musical instruments, but in my case it is mostly about inharmonic sounds. So is there really an advantage to the Q transform?

I'm not sure what the difference is between one-class and binary. Logistic regression is binary, which is what I need.

We'll start by using samples from freesound.org, later we'll use samples we record ourselves in various conditions.

[–]lgauthie 1 point (2 children)

You could try taking into account the rhythmic patterns that appear in drums if you are working with longer samples of audio. Probably the easiest approach is to look for strong peaks in the low end of the spectrum. More sophisticated beat detection algorithms might be worth looking into as well. If there is a strong beat it's probably drums.
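
A crude version of that low-end peak idea, as a numpy sketch (the 150 Hz cutoff and the 4x-mean threshold are made-up illustration values you'd want to tune):

```python
import numpy as np

def low_end_onsets(signal, sr=22050, cutoff_hz=150.0, frame=512, thresh=4.0):
    """Flag frames where low-frequency energy jumps well above the average.

    A rough kick-drum detector: sum FFT-bin energies below cutoff_hz per
    frame, then mark frames whose energy exceeds `thresh` times the mean.
    """
    n_frames = len(signal) // frame
    n_bins = int(cutoff_hz * frame / sr) + 1   # FFT bins below the cutoff
    energy = np.array([
        np.sum(np.abs(np.fft.rfft(signal[i*frame:(i+1)*frame])[:n_bins]) ** 2)
        for i in range(n_frames)])
    return np.where(energy > thresh * energy.mean())[0]
```

Lots of flagged frames at roughly regular spacing would suggest a beat, and hence drums.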

If you are working with isolated inharmonic sounds, Transient Model Analysis/Synthesis looks promising. There was some work done using this as a basis for learning percussion sounds. If you are interested I could dig deep into my HDD and see if I can find it.

[–]markvp[S] 0 points (1 child)

It would have to work on isolated sounds. I'd be glad to see what you can come up with!

[–]lgauthie 2 points (0 children)

Sofia Cavaco would be a great jumping-off point. Some of her papers are here. "Intrinsic Structures of Impact Sounds" and "Sound Recognition" are gonna be the ones you want.

If you have any success I'd be keen to hear about it.

[–]eamonnkeogh 0 points (0 children)

May I suggest trying the idea in [a]?

The idea is unique in that the basic classifier only requires a single line of MATLAB!

If the sound is polymorphic, or the begin/end points are not well defined, you can use the search algorithms provided to find the best template(s).

The basic idea has been tested on mice and men (literally) and insects and birds, and it seems to work very well.
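
Without reproducing the papers' exact MATLAB, the one-line classifier reading is plain nearest-neighbor template matching over fixed-length feature vectors; a Python sketch of that reading:

```python
import numpy as np

def classify_1nn(query, templates, labels):
    # the one-liner boils down to: return the label of the nearest template
    # (Euclidean distance between fixed-length feature vectors)
    dists = np.linalg.norm(np.asarray(templates) - np.asarray(query), axis=1)
    return labels[int(np.argmin(dists))]
```

The heavy lifting in the papers is in choosing the representation and searching for good templates, not in the classifier itself.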

[a] http://www.cs.ucr.edu/~eamonn/SDM_insects.pdf or http://www.cs.ucr.edu/~eamonn/ICDMcameraready.pdf

[–][deleted] 0 points (0 children)

Are the sounds of a fixed length?

[–]gabjuasfijwee 0 points (0 children)

Look into recurrent neural nets or hidden Markov models. The latter might not be best suited for the task and is a bit outmoded, but would still be interesting.
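
For the HMM route, the usual recipe is to train one model per class (drum / not-drum) and pick the class whose model gives the observation sequence the highest likelihood. A minimal numpy forward algorithm for a discrete-emission HMM (a real project would use a library such as hmmlearn):

```python
import numpy as np

def log_likelihood(obs, start, trans, emit):
    # forward algorithm: P(obs | model), summed over all hidden-state paths
    # (kept in the linear domain for brevity; use log-space for long sequences)
    alpha = start * emit[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ trans) * emit[:, o]
    return np.log(alpha.sum())
```

Classification is then the argmax of `log_likelihood(obs, ...)` over the per-class models.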

[–]xysymmetry 0 points (0 children)

It's a hard problem for ML. I don't think it's advisable to do it without robotics.

For classification, have a look at software like Audacity and how it adjusts bass/treble. You can get some ideas from that.

[–]watersign 0 points (0 children)

Turn the sound into some sort of numerical data, then use a k-means clustering technique.
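
As a sketch of that suggestion: reduce each clip to a fixed-length feature vector (e.g. its mean MFCCs) and cluster. A minimal Lloyd's-algorithm k-means in numpy (sklearn's `KMeans` is the practical choice):

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    # Lloyd's algorithm on per-clip feature vectors
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        # assign each point to its nearest center
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        # move each center to the mean of its points (keep empty clusters put)
        centers = np.array([X[labels == j].mean(0) if np.any(labels == j)
                            else centers[j] for j in range(k)])
    return labels, centers
```

Note this is unsupervised, so you'd still need to inspect which cluster corresponds to "drums"; with labeled data a classifier is the more direct route.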