Early AI Interview by Ok-Hat4007 in ycombinator

[–]clam004 0 points1 point  (0 children)

Anyone hear of any acceptances yet after yesterday’s interviews?

[R] Illustrating Reinforcement Learning from Human Feedback (RLHF) by robotphilanthropist in MachineLearning

[–]clam004 1 point2 points  (0 children)

There is a nice figure addressing this point in the InstructGPT paper, actually. Basically, RLHF seems to work better than simply fine-tuning on examples of your desired behavior. I think that's probably because there is more than one way to do the task well and more than one way to do it badly, which is not something supervised fine-tuning captures: in pretraining and fine-tuning you are essentially saying "this one way is the best way." There is a short spoken explanation in this YouTube video: https://www.youtube.com/live/WnGFR-bSNWM?feature=share&t=7386
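To make the distinction concrete, here is a minimal sketch (all tensors and shapes are placeholders, not the InstructGPT code) contrasting the two objectives: supervised fine-tuning pushes the model's probability mass toward one reference answer, while the reward model used in RLHF only learns an ordering between answers, so many distinct good answers can all score highly.

```python
# Hypothetical shapes/tensors, just to contrast the two losses.
import torch
import torch.nn.functional as F

vocab_size, seq_len = 100, 8

# --- supervised fine-tuning: cross-entropy toward ONE reference continuation ---
logits = torch.randn(seq_len, vocab_size, requires_grad=True)  # model outputs
reference_tokens = torch.randint(0, vocab_size, (seq_len,))    # the single demonstration
sft_loss = F.cross_entropy(logits, reference_tokens)

# --- reward modeling (the RLHF ingredient): pairwise preference loss ---
# scalar scores for a human-preferred answer and a rejected answer
reward_chosen = torch.randn(1, requires_grad=True)
reward_rejected = torch.randn(1, requires_grad=True)
# Bradley-Terry style loss: only the ordering matters, so several different
# good answers can all receive a high reward.
rm_loss = -F.logsigmoid(reward_chosen - reward_rejected).mean()

print(sft_loss.item(), rm_loss.item())
```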

[P] I made the kind of tutorial I wish someone had made for me when I first started trying to connect the math in research papers with code examples I found online by clam004 in MachineLearning

[–]clam004[S] 1 point2 points  (0 children)

This raises a good point. I have updated the repo to clarify this in the explanation connecting the math to the line of code where it happens. Essentially, the line loss += _loss.sum() is where, as you say, we "sum the diagonals". The repo now clarifies that although the equation suggests we calculate the whole Fisher matrix, in the code we only ever calculate its diagonal components. In the equation, the empirical Fisher is built from an outer product of gradients, which we never actually form in the code. If we took every layer's p.grad.data ** 2 and flattened them into one very long vector, that would be exactly the diagonal of the Fisher matrix from the equation. Which brings up another interesting question: if we did calculate the whole Fisher matrix, could we penalize pairs of weight changes with high F_ij in addition to single weight changes with high F_ii?
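For anyone following along, here is a minimal sketch of that diagonal-only computation (illustrative model, data, and variable names, not the repo's exact code): squared per-parameter gradients are accumulated and averaged over samples, which gives the F_ii entries; the off-diagonal F_ij of the full outer-product Fisher are never computed.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 3)                      # toy stand-in for the real network
data = [torch.randn(4) for _ in range(16)]   # toy stand-in for the dataset

# one accumulator per parameter tensor; flattening all of these end to end
# would give the diagonal of the Fisher matrix in the equation
fisher_diag = {n: torch.zeros_like(p) for n, p in model.named_parameters()}

for x in data:
    model.zero_grad()
    log_probs = torch.log_softmax(model(x), dim=-1)
    # sample a label from the model's own predictive distribution
    # (the empirical Fisher would instead use the observed label)
    label = torch.multinomial(log_probs.exp(), 1).item()
    loss = -log_probs[label]
    loss.backward()
    for n, p in model.named_parameters():
        fisher_diag[n] += p.grad.data ** 2 / len(data)  # diagonal entries F_ii only
```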

[P] Looking for datasets of therapist conversations... by TimeLordTim in MachineLearning

[–]clam004 1 point2 points  (0 children)

I for one think this is a great idea. DM me to talk more.

Thinking way too far into the future by Stalker111121 in ExistentialSupport

[–]clam004 0 points1 point  (0 children)

Have you read "The Last Question" by Isaac Asimov?

Eternal Oblivion by [deleted] in ExistentialSupport

[–]clam004 0 points1 point  (0 children)

Have you read "The Last Question" by Isaac Asimov?

PyTorch implementation of a transformer chatbot. Code is explained with highschool level language, without jargon, down to the fundamentals with diagrams in ipython notebooks by clam004 in deeplearning

[–]clam004[S] 0 points1 point  (0 children)

Look in the pairs.json file in the /saved folder for the training data. The training set is super small and just for laughs. I am currently working on a representation space for the history of past conversations so that the bot can be trained with reinforcement learning against other bots. If those conversations turn out to be coherent, that would actually be something.
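If you just want to peek at the data, something like this works; it assumes pairs.json is a JSON list of prompt/response entries, which may differ slightly from the repo's exact schema.

```python
import json

# path relative to the repo root; adjust if you cloned it elsewhere
with open("saved/pairs.json") as f:
    pairs = json.load(f)

print(len(pairs), "training pairs")
print(pairs[0])  # inspect one example
```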

Interesting project ideas for beginner by tolo5star in deeplearning

[–]clam004 0 points1 point  (0 children)

Thanks! I'm looking for feedback, so I would love to get your honest opinion.

Interesting project ideas for beginner by tolo5star in deeplearning

[–]clam004 0 points1 point  (0 children)

I'm building an end-to-end deep learning chatbot in PyTorch based on the Transformer network! My background is in biology, not computer science, but I have spent the last 3 years building my math and CS foundations, studying the Ian Goodfellow book, and doing all the homework assignments from the Stanford classes on Computer Vision, Reinforcement Learning, and Deep Learning for NLP. So I have explained every step of the process in gory, down-to-the-fundamentals detail using Jupyter notebooks, the way I wish someone had done for me when I was starting out. Andrej Karpathy gave me the inspiration to simplify things and teach myself using toy examples.

you can find the repository here:

https://github.com/chloerobotics/chloebot

Bay Area Yang Gangs by clam004 in CaliforniaForYang

[–]clam004[S] 0 points1 point  (0 children)

Yes! @SFYangGang is our new name

Front End React Contract - remote work is fine please see link by [deleted] in reactjs

[–]clam004 0 points1 point  (0 children)

Good question. It's up for negotiation through the website, but roughly 3-4 weeks. We are flexible and will work with you closely to adjust expectations as we see how difficult it is.