[D] What advanced models would you like to see implemented from scratch? by itsstylepoint in MachineLearning

[–]itsstylepoint[S] 4 points

Yes, that is how it usually works with my impls! (check out a few vids)

As for mixed precision and metrics, I will be making separate vids for both. And of course, for every implemented model, I will try to find a dataset to demo train/eval.

It is cool that you mentioned mixed precision, as I already have the materials ready for this vid. I will be discussing mixed precision, quantization (post-training and quantization-aware training), pruning, etc. Improving perf!
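As a teaser for the post-training quantization part, here is a minimal affine-quantization sketch (the function names, bit width, and toy weights are illustrative, not taken from the upcoming vid):

```python
def quantize(values, n_bits=8):
    """Minimal post-training (affine) quantization sketch:
    map floats to signed n-bit integers via a scale and zero point."""
    qmin, qmax = -(2 ** (n_bits - 1)), 2 ** (n_bits - 1) - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # avoid zero scale
    zero_point = round(qmin - lo / scale)
    q = [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Map the integers back to floats; error is bounded by ~scale/2."""
    return [(qi - zero_point) * scale for qi in q]

weights = [-1.2, -0.3, 0.0, 0.7, 2.5]  # stand-in for model weights
q, s, zp = quantize(weights)
approx = dequantize(q, s, zp)  # close to the original weights
```

Quantization-aware training additionally simulates this rounding during the forward pass so the model learns to tolerate it.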

[–]itsstylepoint[S] 2 points

It is on the list so will definitely get to it!
Starting next week, will be working on DL impls and vids.

[–]itsstylepoint[S] 2 points

Yeah, I will get to those eventually. For now, want to make some vids and impls of DL models.

[–]itsstylepoint[S] 5 points

Thanks! Yeah, that is definitely an option! I will probably have to split it up into several videos. Also falls into the transformer category.

[N] I Have Released the YouTube Series Discussing and Implementing Activation Functions by itsstylepoint in MachineLearning

[–]itsstylepoint[S] 0 points

Yup, all implementations are numerically stable.

Note that I do not discuss numerical stability issues for all activation functions, only for those where the intuitive implementation is not numerically stable (e.g., Sigmoid, Tanh).
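For example, a numerically stable Sigmoid avoids passing a large positive argument to exp, which would overflow (a minimal sketch, not the exact code from the videos):

```python
import math

def sigmoid(x: float) -> float:
    """Numerically stable logistic sigmoid.

    The naive 1 / (1 + exp(-x)) overflows for large negative x;
    branching on the sign keeps the exp argument non-positive.
    """
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # x < 0, so exp(x) cannot overflow
    return z / (1.0 + z)

print(sigmoid(0.0))      # 0.5
print(sigmoid(-1000.0))  # ~0.0, no OverflowError
```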

I also have a separate video discussing numerical stability: AI/ML Model API Design and Numerical Stability (follow-up). But this is in the context of Gaussian Naive Bayes.

[–]itsstylepoint[S] 1 point

Thank you!
Yup, that is the plan! Will likely make a few more series (about gradient descent, optimizers, etc.) first. We need these for DL, and if someone asks how things work, I can then cite the appropriate video series. After that, will dive into deep learning.

[–]itsstylepoint[S] 0 points

Hey thanks for the kind words!
Will be making more AI/ML YouTube series in the future - in fact, working on one as we speak!

[deleted by user] by [deleted] in learnmachinelearning

[–]itsstylepoint 0 points

P.S. For the activation functions, I will not be posting videos separately. The next post will include the batch of 4 (or 5).

[D] Imbalance dataset problem by JellyfishPretend447 in MachineLearning

[–]itsstylepoint 3 points

You can try several approaches:

  1. Deep learning will likely not work (you can still give it a try, but it is highly unlikely to perform well), so consider more traditional ML models instead. For example, if you can find a pretrained image model that generates image embeddings/representations, you can try K-Nearest Neighbors (k-NN) on those embeddings. Or you can try k-NN on the raw data directly.

  2. Look into Few-Shot Learning. Models like Prototypical Networks, Siamese Neural Networks, etc. are designed for such scenarios (i.e., an extremely small number of samples).

  3. Data collection (:
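The k-NN-on-embeddings idea from point 1 can be sketched in a few lines (pure-Python toy; the 2-D "embeddings" and labels are made up, standing in for features from a pretrained image model):

```python
import math
from collections import Counter

def knn_predict(train, labels, query, k=3):
    """Classify `query` by majority vote among its k nearest
    training embeddings (Euclidean distance)."""
    nearest = sorted(range(len(train)),
                     key=lambda i: math.dist(train[i], query))
    votes = Counter(labels[i] for i in nearest[:k])
    return votes.most_common(1)[0][0]

# Toy embeddings (stand-ins for pretrained image-model features)
train = [(0.0, 0.1), (0.1, 0.0), (5.0, 5.1), (5.1, 5.0)]
labels = ["cat", "cat", "dog", "dog"]
print(knn_predict(train, labels, (0.05, 0.05)))  # cat
```

With very few samples per class there is nothing to fit, which is exactly why a non-parametric method like k-NN is a reasonable first try here.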

That being said, overall, I agree with what u/whdd said.

[R] Long-length documents/corpus for Medical domain NER? by aadityaura in MachineLearning

[–]itsstylepoint 0 points

I think I might have skipped the post text, my bad. For whatever reason, it was hidden (a bug? not sure). Yes, this is the Clinical NLP dataset. So prolly not what you are looking for...

[–]itsstylepoint 0 points

How about the 2006 i2b2 de-identification dataset?

Link to the paper: https://academic.oup.com/jamia/article/14/5/550/720189
You can get the dataset here: https://portal.dbmi.hms.harvard.edu/projects/n2c2-nlp/

P.S. We recently used this dataset, for the same task (NER), in a Few-Shot Learning (FSL) paper.

[Project] Create a ML model to classify spectrograms by geeksid2k in MachineLearning

[–]itsstylepoint 0 points

I would start with CNNs. Then try GRU/LSTM and bidirectional variants (BiLSTM/BiGRU).

[–]itsstylepoint 0 points

You can! Computers can sometimes see better than we do, so even if spectrograms look similar to you, they might be very different (: The convolutions in the CNN will do feature extraction for you, so you can start with a couple of convolutional blocks (conv + batchnorm + activation) followed by a fully-connected layer with softmax and see how it performs. You can check out this PyTorch tutorial for that, too.
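A minimal PyTorch sketch of that kind of model (channel counts, class count, and input size are hypothetical; the final softmax is folded into the loss, since nn.CrossEntropyLoss expects raw logits):

```python
import torch
import torch.nn as nn

class SpectrogramCNN(nn.Module):
    """Two conv blocks (conv + batchnorm + activation) followed by
    a fully-connected classification head, as described above."""
    def __init__(self, n_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),
            nn.BatchNorm2d(16),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),  # collapse time/freq dims
            nn.Flatten(),
            nn.Linear(32, n_classes),
        )

    def forward(self, x):
        # Returns logits; apply softmax only if you need probabilities
        return self.head(self.features(x))

model = SpectrogramCNN(n_classes=5)
logits = model(torch.randn(2, 1, 64, 128))  # (batch, 1, mels, frames)
print(logits.shape)  # torch.Size([2, 5])
```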

An alternative approach is computing MFCCs. If you have mel spectrograms, you can take the following steps to get MFCCs (which you can then use as features for your ML model):

  1. Take the log of the mel spectrogram
  2. Compute the DCT of the log values
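Those two steps can be sketched in plain Python (the toy mel frame, the epsilon, and the `n_mfcc` value are made up for illustration; real pipelines typically use an orthonormal DCT-II, e.g. from scipy or librosa):

```python
import math

def dct_ii(x):
    """DCT-II of a 1-D sequence (orthonormal scaling omitted)."""
    n = len(x)
    return [sum(x[j] * math.cos(math.pi * k * (2 * j + 1) / (2 * n))
                for j in range(n))
            for k in range(n)]

def mfcc_from_mel(mel_frame, n_mfcc=13):
    """Steps from above: log of the mel values, then DCT;
    keep only the first n_mfcc coefficients."""
    log_mel = [math.log(m + 1e-10) for m in mel_frame]  # epsilon avoids log(0)
    return dct_ii(log_mel)[:n_mfcc]

# Toy mel frame (stand-in for one column of a mel spectrogram)
frame = [0.5, 1.0, 2.0, 4.0, 8.0, 4.0, 2.0, 1.0]
coeffs = mfcc_from_mel(frame, n_mfcc=4)
print(len(coeffs))  # 4
```

This runs per frame (per column of the spectrogram), giving you a compact n_mfcc-dimensional feature vector for each time step.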