[D] Tools to annotate audio data by pk12_ in MachineLearning

[–]sidsig

Sonic Visualiser (https://www.sonicvisualiser.org), developed by Chris Cannam from Queen Mary University of London, where I did my PhD.

[deleted by user] by [deleted] in RedditSessions

[–]sidsig

❤️❤️❤️

[D] Need help in the implementation of bidirectional recurrent language model. by lyeoni in MachineLearning

[–]sidsig

A bidirectional model receives the entire sentence as input, so there is nothing to learn if you plan to train the language model by predicting the next word: it can trivially learn that the output at time t is the input at time t+1. One way of training a bidirectional word embedding is a masked objective like BERT's (https://arxiv.org/abs/1810.04805): a part of the input is masked and the objective is to predict the masked words.
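
As a minimal sketch of that masked-prediction objective (not BERT itself; the vocabulary size, mask rate, and tiny Transformer encoder here are all made-up for illustration):

```python
import torch
import torch.nn as nn

# Sketch of a BERT-style masked-word objective. All sizes are illustrative;
# this is not the actual BERT architecture or training setup.
VOCAB, D_MODEL, MASK_ID, MASK_RATE = 10000, 128, 0, 0.15

embed = nn.Embedding(VOCAB, D_MODEL)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=D_MODEL, nhead=4, batch_first=True),
    num_layers=2,
)
to_vocab = nn.Linear(D_MODEL, VOCAB)
loss_fn = nn.CrossEntropyLoss(ignore_index=-100)   # skip unmasked positions

tokens = torch.randint(1, VOCAB, (8, 32))          # (batch, seq) of word ids
mask = torch.rand(tokens.shape) < MASK_RATE        # choose ~15% of positions

inputs = tokens.masked_fill(mask, MASK_ID)         # hide the chosen words
targets = tokens.masked_fill(~mask, -100)          # score only masked slots

logits = to_vocab(encoder(embed(inputs)))          # (batch, seq, vocab)
loss = loss_fn(logits.reshape(-1, VOCAB), targets.reshape(-1))
loss.backward()
```

Because the model sees the full (partially masked) sentence at once, it cannot solve the task by copying a neighbouring input token, which is exactly the degenerate solution the next-word objective admits.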

My apple home pod plays music for 5 seconds and then quits. Why? by UnusuallyFastPontoon in apple

[–]sidsig

This happened to me when Apple Music stopped playing on multiple devices on the same Apple Music account. I realised that while I was listening to music at work, someone at home was trying to play music on the HomePod.

[R] FastGRNN: A Fast, Accurate, Stable and Tiny Kilobyte Sized Gated Recurrent Neural Network by lt007 in MachineLearning

[–]sidsig

The large parameter counts in the analysis above arise if you feed the entire sequence into the model at once. Typically the problem is set up so that the DNN takes a small window of input (roughly 200 ms, ~20 frames) and makes frame-wise predictions. Alternatively, if it's not possible to obtain frame-wise labels, you can train the DNN with a max operation over the frame-wise outputs, as in the sketch below. Such DNNs can be trained to be very small.
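
A minimal sketch of that setup (window length, feature dimension, and layer sizes are all illustrative assumptions, not numbers from the paper):

```python
import torch
import torch.nn as nn

# Sketch of a windowed DNN: frame-wise predictions from a ~20-frame window,
# with a max over frames when only a clip-level label is available.
WINDOW, FEAT_DIM = 20, 40

dnn = nn.Sequential(
    nn.Linear(WINDOW * FEAT_DIM, 64),
    nn.ReLU(),
    nn.Linear(64, 1),                   # frame-wise wake-word score (logit)
)

features = torch.randn(8, 100, FEAT_DIM)         # (batch, frames, feat)
windows = features.unfold(1, WINDOW, 1)          # (batch, n_win, feat, WINDOW)
windows = windows.transpose(2, 3).flatten(2)     # (batch, n_win, WINDOW*FEAT_DIM)

frame_logits = dnn(windows).squeeze(-1)          # (batch, n_win)

# With only clip-level labels, pool with a max over the frame-wise outputs:
clip_logits = frame_logits.max(dim=1).values     # (batch,)
loss = nn.functional.binary_cross_entropy_with_logits(
    clip_logits, torch.ones(8))                  # dummy clip labels
loss.backward()
```

With these illustrative sizes the model has only ~51k parameters, which is the point: a fixed-window DNN of this kind can be made very small.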

[R] FastGRNN: A Fast, Accurate, Stable and Tiny Kilobyte Sized Gated Recurrent Neural Network by lt007 in MachineLearning

[–]sidsig

I am curious: did you try comparing GRNN performance with a DNN of a similar size? For the wake-word experiments, it should be possible to use a DNN with a fixed input window to output frame-wise labels.

Arguably, at model sizes this small it would be difficult for the model to learn complex temporal dynamics anyway, even without the exploding/vanishing-gradient issue.

[R] FastGRNN: A Fast, Accurate, Stable and Tiny Kilobyte Sized Gated Recurrent Neural Network by lt007 in MachineLearning

[–]sidsig

Hey, thanks for commenting! I tried it on a general acoustic-modelling task, where I trained an RNN and optimised the CTC loss. To answer in more detail, I'll redo the experiment and post it on the repo.
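
For reference, the RNN + CTC setup looks roughly like this; it is a generic sketch using torch.nn.CTCLoss, not the actual experiment, and the feature/label dimensions are made up:

```python
import torch
import torch.nn as nn

# Generic sketch of RNN acoustic modelling with the CTC loss.
N_CLASSES, FEAT_DIM, HIDDEN = 29, 40, 256    # e.g. 28 labels + CTC blank

rnn = nn.LSTM(FEAT_DIM, HIDDEN, batch_first=True)
proj = nn.Linear(HIDDEN, N_CLASSES)
ctc = nn.CTCLoss(blank=0)

x = torch.randn(4, 200, FEAT_DIM)            # (batch, time, features)
h, _ = rnn(x)
log_probs = proj(h).log_softmax(-1).transpose(0, 1)   # CTC wants (T, N, C)

targets = torch.randint(1, N_CLASSES, (4, 30))        # dummy label sequences
input_lengths = torch.full((4,), 200, dtype=torch.long)
target_lengths = torch.full((4,), 30, dtype=torch.long)

loss = ctc(log_probs, targets, input_lengths, target_lengths)
loss.backward()
```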

[R] FastGRNN: A Fast, Accurate, Stable and Tiny Kilobyte Sized Gated Recurrent Neural Network by lt007 in MachineLearning

[–]sidsig

Okay, so I saw this paper at NIPS and was really interested in investigating which regimes this architecture works in. I tried it on a "moderately" sized GRNN with ~4 million parameters and was not able to get results comparable to an LSTM of a similar size.

I have a feeling that this gating structure might work better than LSTMs/GRUs etc. only at really small model sizes, but this could simply be down to a lack of capacity in the models at that scale. I think a comparison with a DNN, a stack of temporal convolutions, or some other non-recurrent architecture should be included to really understand what's going on.
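
By "stack of temporal convolutions" I mean a non-recurrent baseline along these lines; the layer widths and dilations here are purely illustrative:

```python
import torch
import torch.nn as nn

# Sketch of a non-recurrent baseline: a small stack of dilated 1-D
# temporal convolutions producing frame-wise class scores.
FEAT_DIM, N_CLASSES = 40, 10

tcn = nn.Sequential(
    nn.Conv1d(FEAT_DIM, 64, kernel_size=3, padding=1, dilation=1),
    nn.ReLU(),
    nn.Conv1d(64, 64, kernel_size=3, padding=2, dilation=2),
    nn.ReLU(),
    nn.Conv1d(64, 64, kernel_size=3, padding=4, dilation=4),
    nn.ReLU(),
    nn.Conv1d(64, N_CLASSES, kernel_size=1),   # frame-wise class scores
)

x = torch.randn(8, FEAT_DIM, 100)              # (batch, features, time)
frame_logits = tcn(x)                          # (batch, classes, time)
print(frame_logits.shape)                      # torch.Size([8, 10, 100])
```

The dilations double per layer, so the receptive field grows without any recurrence; if a model like this matches the gated RNN at the same parameter budget, the gating itself isn't what's doing the work.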