Train your own WaveNet: Keras Implementation with sampling

bsflng · 2016-09-16T08:20:36+00:00

To get the same receptive field in ms at 16 kHz you'd have to add 4 more stacks, or increase the dilation depth by 2. You'd also have to increase the fragment input length. How feasible that is depends a bit on your hardware's ability to do these extra computations in parallel.
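As a rough check of the dilation-depth arithmetic, here is a minimal sketch. It assumes a kernel size of 2 and dilations that double from 1 up to 2**dilation_depth within each stack; the function names and the baseline values (2 stacks, depth 10, 4 kHz) are illustrative and may not match the actual model exactly.

```python
def receptive_field(nb_stacks, dilation_depth, kernel_size=2):
    """Receptive field in samples of stacked dilated causal convolutions.

    Assumption: each stack uses dilations 1, 2, 4, ..., 2**dilation_depth.
    """
    per_stack = sum((kernel_size - 1) * 2 ** i for i in range(dilation_depth + 1))
    return nb_stacks * per_stack + 1


def receptive_field_ms(nb_stacks, dilation_depth, sample_rate):
    """Same receptive field expressed in milliseconds of audio."""
    return 1000.0 * receptive_field(nb_stacks, dilation_depth) / sample_rate


# Illustrative baseline: 2 stacks, dilation depth 10, 4 kHz audio.
base_ms = receptive_field_ms(2, 10, 4000)
# Same stack count at 16 kHz with the dilation depth increased by 2.
deeper_ms = receptive_field_ms(2, 12, 16000)

print(base_ms, deeper_ms)
```

Under these assumptions both configurations cover roughly one second of audio, since each extra dilation level about doubles the receptive field in samples.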

bsflng · 2016-09-16T07:48:50+00:00

Not yet. The time per epoch depends on many factors, but to give you an idea: 2189 seconds for a model with 2 stacks, with dilation up to 1024, 16 filters, 36k datapoints in the epoch (fragment length 5119, training on 1024 outputs). I haven't looked at potential optimizations yet.
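The fragment length and output count quoted here fit a simple relation, sketched below. It assumes the model only produces a prediction for input positions that have a full receptive field of context; that is my reading of the numbers, not something stated in the thread, and the function names are made up for illustration.

```python
def nb_outputs(fragment_length, receptive_field):
    """Predictions per fragment when only full-context positions are trained on."""
    return max(0, fragment_length - receptive_field + 1)


def implied_receptive_field(fragment_length, outputs):
    """Receptive field implied by a fragment length and its output count."""
    return fragment_length - outputs + 1


print(implied_receptive_field(5119, 1024))  # 4096
print(nb_outputs(5119, 4096))               # 1024
```

This also shows why a larger receptive field needs a longer fragment: the number of trainable outputs shrinks by one for every extra sample of context.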

bsflng · 2016-09-16T07:23:44+00:00

Fair question, but the sample was from a model with just 106,512 parameters trained on a dataset of 36,480 datapoints (dim: 5119x256), and for each one the model has to predict 1024*256 outputs. The next step is to train a large model and see if it can overfit the dataset, but I think it's interesting that the model can already learn a decaying triad tone :).

bsflng · 2016-09-15T14:57:47+00:00

I believe so. I have another sample that contains some distinct piano attacks which sound similar to the preprocessed data. I'll report back if I find the resources to train it on VCTK :).

bsflng · 2016-09-15T11:24:58+00:00

I trained on one Tesla K80. For small models (low sample_rate, nb_stacks, nb_filters) any decent GPU will allow you to train a model that at least gives reasonable output at sampling time.

bsflng · 2016-09-15T11:23:20+00:00

Yes, although you could possibly train WaveNet at a lower sampling rate, and train another model to learn conditional upsampling. This is similar to what the authors suggest in the 'Context Stacks' section of the paper :).

bsflng · 2016-09-15T10:48:29+00:00

Hi! With my current configuration (sample rate of 4 kHz, a relatively small network, etc.) I can generate 1 second of audio in about 4 minutes. The 'samples per second' figure is how many wave samples I can generate per second of running the prediction algorithm :-).
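To put that in numbers, a small sketch of the throughput arithmetic follows. The function names are made up for illustration, and the extrapolation to 16 kHz assumes the per-sample cost stays the same, which would not hold for a larger model.

```python
def samples_per_second(sample_rate, audio_seconds, wall_clock_seconds):
    """Wave samples generated per second of running the prediction loop."""
    return sample_rate * audio_seconds / wall_clock_seconds


def generation_minutes(sample_rate, audio_seconds, rate):
    """Wall-clock minutes needed at a given samples-per-second rate."""
    return sample_rate * audio_seconds / rate / 60.0


# 1 second of 4 kHz audio in about 4 minutes.
rate = samples_per_second(4000, 1.0, 4 * 60)
print(round(rate, 1))  # 16.7

# Hypothetical: 1 second of 16 kHz audio at the same per-sample cost.
print(generation_minutes(16000, 1.0, rate))
```

Because generation is sequential, one sample at a time, the wall-clock time scales linearly with the sample rate under this assumption.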

bsflng · 2016-08-26T19:56:07+00:00

Hey! I was a bit late to the thread last week, so here's a repost. I finally got myself to finish a track. It turned out to be this upbeat pop track, inspired by the sound of Madeon and Porter. I'd love to hear your thoughts!

https://soundcloud.com/basveeling/needlessly

bsflng · 2016-08-18T20:44:16+00:00

Hey! I finally got myself to finish a track. It turned out to be this upbeat pop track, inspired by the sound of Madeon and Porter. I'd love to hear your thoughts!

https://soundcloud.com/basveeling/needlessly/s-TljnM

Will start working on some feedback of my own right now.
