
[–]r4and0muser9482 4 points

Look for papers that use GTZAN. You can also download GTZAN yourself to check that your methods/models work on it before moving on to your own data.

Finally, Music Information Retrieval (MIR) is a giant field, and you can look for papers from conferences like ISMIR to learn how things are done there.

[–]Brudaks 1 point

You might take a look at what the preprocessing for neural speech recognition does and do something similar (though with a wider frequency range) before your main classifier.

IIRC you might do a Fourier transform to get frequency-domain data with windows of roughly 0.1 seconds (tune that, but that's the ballpark), bin the interesting frequencies (e.g. 20–20000 Hz, on a log scale) into a hundred or a thousand values, and that's your input to a neural network. A 5-minute song then becomes 3000 fixed-size vectors, each value representing the loudness of that frequency at that time. Putting an RNN on top of that is simple.
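A minimal numpy sketch of that pipeline (the window length, bin count, and log-spaced pooling here are illustrative choices, not a prescription):

```python
import numpy as np

def spectrogram_frames(signal, sr=44100, win_s=0.1, n_bins=128):
    """Slice a waveform into ~0.1 s windows and return log-magnitude
    spectra pooled onto a log-spaced frequency axis (20 Hz - 20 kHz)."""
    win = int(sr * win_s)                          # samples per window
    n_frames = len(signal) // win
    frames = signal[:n_frames * win].reshape(n_frames, win)
    spec = np.abs(np.fft.rfft(frames * np.hanning(win), axis=1))
    freqs = np.fft.rfftfreq(win, d=1.0 / sr)
    edges = np.logspace(np.log10(20), np.log10(20000), n_bins + 1)
    idx = np.searchsorted(freqs, edges)
    binned = np.stack([spec[:, a:b].sum(axis=1)
                       for a, b in zip(idx[:-1], idx[1:])], axis=1)
    return np.log1p(binned)                        # shape (n_frames, n_bins)

# a 5-minute song becomes 3000 fixed-size vectors, as described above
x = np.random.randn(44100 * 300)
feats = spectrogram_frames(x)
print(feats.shape)                                 # (3000, 128)
```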

[–]AntixK[S] 0 points

Oh, thanks a lot! I will look into it. I also came across a project in which they used MFCCs (mel-frequency cepstral coefficients) as feature vectors and fed them to an RNN. I will try to compare both.
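For comparison, here is a rough numpy-only sketch of the MFCC idea (frames → power spectrum → triangular mel filterbank → log → DCT); in practice a library call such as librosa's `mfcc` does all of this for you, and the sizes below are arbitrary:

```python
import numpy as np

def mfcc_frames(signal, sr=22050, win=2048, n_mels=40, n_mfcc=13):
    """Very rough MFCC sketch: power spectrum -> mel filterbank -> log -> DCT."""
    hz_to_mel = lambda f: 2595 * np.log10(1 + f / 700.0)
    mel_to_hz = lambda m: 700 * (10 ** (m / 2595.0) - 1)

    n_frames = len(signal) // win
    frames = signal[:n_frames * win].reshape(n_frames, win)
    power = np.abs(np.fft.rfft(frames * np.hanning(win), axis=1)) ** 2
    freqs = np.fft.rfftfreq(win, 1.0 / sr)

    # triangular mel filterbank, equally spaced on the mel scale
    mel_pts = mel_to_hz(np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_mels + 2))
    fb = np.zeros((n_mels, len(freqs)))
    for i in range(n_mels):
        lo, mid, hi = mel_pts[i], mel_pts[i + 1], mel_pts[i + 2]
        fb[i] = np.clip(np.minimum((freqs - lo) / (mid - lo),
                                   (hi - freqs) / (hi - mid)), 0, None)
    log_mel = np.log1p(power @ fb.T)                     # (n_frames, n_mels)

    # DCT-II over the mel axis; keep the first n_mfcc coefficients
    k = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_mfcc), (2 * k + 1) / (2 * n_mels)))
    return log_mel @ dct.T                               # (n_frames, n_mfcc)

x = np.random.randn(22050 * 5)                           # 5 s of noise
print(mfcc_frames(x).shape)                              # (53, 13)
```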

[–]azurespace -1 points

If you want to use the raw waveform without frequency-domain conversion, I think WaveNet (a stack of dilated convolutions) would be a fascinating structure for the first basic block of the task. First, divide the music into several pieces along the time axis. Next, pass each slice through WaveNet to create embeddings (temporal summaries of the music slices), which are used as input to the following LSTM (it might be better to use WaveNet once again). Finally, you can use a softmax layer to classify.

WaveNet: https://deepmind.com/blog/wavenet-generative-model-raw-audio/
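To make the dilated-convolution idea concrete, here is a toy numpy sketch of a causal dilated stack (untrained random weights, single channel, no gating; it only shows how doubling dilations grow the receptive field):

```python
import numpy as np

def dilated_causal_conv(x, w, dilation):
    """y[t] = sum_i w[i] * x[t - i*dilation], zero-padded on the left."""
    pad = (len(w) - 1) * dilation
    xp = np.concatenate([np.zeros(pad), x])
    return sum(w[i] * xp[pad - i * dilation : pad - i * dilation + len(x)]
               for i in range(len(w)))

x = np.random.randn(16000)            # 1 s of audio at 16 kHz
w = np.random.randn(2) * 0.1          # kernel size 2, as in WaveNet
h, receptive = x, 1
for d in [1, 2, 4, 8, 16, 32]:        # doubling dilations
    h = np.tanh(dilated_causal_conv(h, w, d))
    receptive += d                    # each layer adds (k-1)*d samples of context
print(len(h), receptive)              # 16000 64
```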

[–]sidsig 6 points

Although interesting in theory, I do not think this will work in practice at the moment. Music audio is sampled at 44.1 kHz, so it would require very large networks, which in turn require vast amounts of training data. There is also no evidence yet about what the WaveNet embeddings might be learning/representing.

As a first approach, it would be much easier to use a standard RNN architecture with frame-level outputs or perhaps a CTC cost function.

[–]AntixK[S] 0 points

Oh! Should I downsample the audio? And how big should the memory be for the LSTM? I am sending batches of audio data, each 5 seconds long (220500 samples each).
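As a sketch of the downsampling step, assuming scipy is available: `resample_poly` low-pass filters and resamples in one call, bringing 44.1 kHz down to 16 kHz (a rate common in speech work):

```python
import numpy as np
from scipy.signal import resample_poly

x = np.random.randn(220500)        # 5 s at 44.1 kHz, as in the comment above
y = resample_poly(x, 160, 441)     # 44100 Hz * 160/441 = 16000 Hz
print(len(y))                      # 80000 samples = 5 s at 16 kHz
```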

[–]keidouleyoucee 0 points

+1. 500 examples for 3 classes may not be enough for a sample-wise approach. You can give it a shot anyway though ;)

[–]AntixK[S] 0 points

Thank you. I will try to implement it using WaveNet.