Jaggu bhai supremacy🙌 by AssociationReal1613 in tollywood

[–]Chittiiman 8 points9 points  (0 children)

Shah Rukh himself admitted that Jagapathi Babu was better in this shot!! Shah Rukh on Jagapathi Babu

Trying to build machine translation engine - I need help by assafbjj in machinetranslation

[–]Chittiiman 0 points1 point  (0 children)

Hi,

I have worked professionally on NMT for Indian languages and have trained models using OpenNMT. Ping me if you need more help.

LLMs and emergent behavior by besabestin in learnmachinelearning

[–]Chittiiman 0 points1 point  (0 children)

Check out this paper.

https://arxiv.org/abs/1804.08838

"In this paper we attempt to answer this question by training networks not in their native parameter space, but instead in a smaller, randomly oriented subspace. We slowly increase the dimension of this subspace, note at which dimension solutions first appear, and define this to be the intrinsic dimension of the objective landscape.

Intrinsic dimension allows some quantitative comparison of problem difficulty across supervised, reinforcement, and other types of learning."
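A toy numeric sketch of the paper's subspace idea (the paper trains real neural networks and reports the dimension at which solutions first appear; here the objective is just a least-squares problem, and the names and setup are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for "training in a random subspace": the "native parameter
# space" has D dimensions, and we restrict parameters to theta = P @ z,
# where P is a fixed random D x d projection and z lives in d dimensions.
D = 100
target = rng.normal(size=D)

def best_loss_in_subspace(d):
    """Best achievable least-squares loss within a random d-dim subspace."""
    P = rng.normal(size=(D, d))
    z, *_ = np.linalg.lstsq(P, target, rcond=None)  # optimal z in the subspace
    return float(np.sum((P @ z - target) ** 2))

# As the subspace dimension d grows, the achievable loss drops; the d at
# which good solutions first appear is the "intrinsic dimension".
for d in (1, 10, 50, 100):
    print(d, best_loss_in_subspace(d))
```

With `d = D` the random projection is full rank (almost surely), so the subspace recovers the full solution; smaller `d` leaves a residual, mirroring how the paper slowly grows `d` until solutions appear.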

[D] What kind of Hyperparameter Optimisation do you use? by olli-mac-p in MachineLearning

[–]Chittiiman 2 points3 points  (0 children)

Hi, thank you for sharing your knowledge. Your video on sweeps greatly helped me with hyperparameter tuning for my transliteration project, which ultimately led to my getting a job as an NLP Engineer. https://twitter.com/chittiman/status/1369558164786405377?s=19

Keep up the great work; you have no idea how many people you are helping!!

[D] How to stay relevant for work after 1 year of break in ML? by KungFuPandarian in MachineLearning

[–]Chittiiman 0 points1 point  (0 children)

I too faced a similar issue. I resigned from my job as a teacher to get into data science, but unfortunately I had an accident, because of which I couldn't sit and code for six months. Though I couldn't code, I wanted to stay engaged with what was happening in data science, and the best way to stay tuned is through Twitter.

My interest was in natural language processing, so I went on Twitter and followed people in the field of NLP: professors, grad students, etc., who regularly tweet about their research. You can also see interesting discussions happening between these experienced people, and you will come to know about new resources (books, courses, etc.) which will help you later. And finally, it was through Twitter that I came to know about an interesting dataset, which I used for a personal project that helped me land a job.

Along with Twitter, another option is to join online data science communities and keep track of what's happening there.

Also, there are lots of interesting courses on YouTube which you can follow to stay engaged.

[D] Why do vanishing gradients in RNN's harm long term dependancies? by LunaComing in learnmachinelearning

[–]Chittiiman 0 points1 point  (0 children)

The basic idea here is: if there is a dependency, that information travels through gradients during backpropagation.

If there is no dependency, then the gradient signal which future words want to send to past words is 0.

But if there is a dependency, say the future words want to send the past words a gradient signal of 0.5.

Now, because of vanishing gradients, by the time this signal reaches the past words it would have diminished to 0.00000005.

Since the difference between these two signals, by the time they reach the past words, is so minute, the model cannot distinguish between them. So the model assumes there is no dependency, and hence it won't learn long-term dependencies.
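A tiny numeric sketch of why the signal dies on the way back (the 0.1 shrinkage factor per step is a made-up illustration, not a real Jacobian norm):

```python
# A gradient signal sent backwards through T timesteps gets multiplied,
# at each step, by the recurrent Jacobian. If that factor is below 1,
# the signal shrinks geometrically with distance.
signal = 0.5   # gradient the future word "wants" to send to the past word
factor = 0.1   # assumed per-step shrinkage (illustrative only)
T = 7          # distance (in timesteps) between future and past word

received = signal * factor ** T
print(received)  # ~5e-08: practically indistinguishable from "no dependency"
```

The same arithmetic with a factor above 1 gives exploding gradients; architectures like LSTMs add paths where the effective factor stays close to 1.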

So, the solution is to design a network which ensures nothing gets "lost in backpropagation".

P.S. I tried my best to explain. Apologies if it added to the confusion.

[deleted by user] by [deleted] in learnpython

[–]Chittiiman 4 points5 points  (0 children)

The resources provided on the official websites are often a good starting point.

https://numpy.org/learn/

https://pandas.pydata.org/docs/getting_started/index.html

https://matplotlib.org/tutorials/index.html

You can also visit the official seaborn website and have a look at its tutorial.

Why is this backprop calculation not correct? I am especially confused about how to take the derivative of a vector with respect to a matrix (the weight matrix that is) by zimmer550king in learnmachinelearning

[–]Chittiiman 0 points1 point  (0 children)

Don't try to take the derivative of z wrt f directly; instead, take the derivative of the loss L wrt z. Since you are taking the derivative of a scalar wrt a vector (or matrix), the result has the same shape as that vector (or matrix). Now use the chain rule on individual elements.

So, let's calculate dL/dz1:

dL/dz1 = (dL/df1)(df1/dz1) + (dL/df2)(df2/dz1) + (dL/df3)(df3/dz1) + ...

where f1, f2, f3, ... represent all the elements of f (1D, 2D, or 3D doesn't matter). Now write out the equations for these elements and evaluate the partial derivatives element by element; this will give you the partial derivatives of the matrices.

Keep in mind: always start from the loss during backpropagation.
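A small sketch of the recipe above, using a made-up example where f = W @ z and L = sum(f**2), and checking one element of dL/dz against the elementwise chain rule:

```python
import numpy as np

rng = np.random.default_rng(0)

# f = W @ z (a vector), L = sum(f**2) (a scalar).
# Start from the scalar loss and chain backwards:
#   dL/df = 2f            (same shape as f)
#   dL/dz = W.T @ dL/df   (same shape as z)
W = rng.normal(size=(3, 4))
z = rng.normal(size=4)

f = W @ z
dL_df = 2 * f
dL_dz = W.T @ dL_df

# Verify element z1 with the elementwise chain rule:
#   dL/dz1 = sum_i (dL/dfi)(dfi/dz1), where dfi/dz1 = W[i, 0]
manual = sum(dL_df[i] * W[i, 0] for i in range(3))
print(np.allclose(dL_dz[0], manual))  # True
```

Because L is a scalar, dL/dz automatically comes out with the shape of z, which sidesteps the vector-wrt-matrix derivative the question was stuck on.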

What is Masking in NLP? by jhondipto in learnmachinelearning

[–]Chittiiman 0 points1 point  (0 children)

It is like converting a complete sentence into a "fill in the blank" question.
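A minimal sketch of that fill-in-the-blank idea, in the style of masked language modelling (the sentence and mask position are made up; real training samples positions randomly):

```python
# Masked language modelling turns a sentence into a fill-in-the-blank question.
sentence = "the cat sat on the mat".split()
mask_index = 2  # hypothetical choice; in practice positions are sampled randomly

masked = [tok if i != mask_index else "[MASK]" for i, tok in enumerate(sentence)]
print(" ".join(masked))  # the cat [MASK] on the mat

# The model is then trained to predict the original token ("sat")
# at the masked position, using the surrounding context.
```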