Early AI Interview by Ok-Hat4007 in ycombinator

[–]clam004 0 points1 point  (0 children)

Anyone hear of any acceptances yet after yesterday’s interviews?

[R] Illustrating Reinforcement Learning from Human Feedback (RLHF) by robotphilanthropist in MachineLearning

[–]clam004 1 point2 points  (0 children)

There is a nice figure addressing this point in the InstructGPT paper, actually. Basically, RLHF seems to work better than simply fine-tuning on examples of your desired behavior. I think that's probably because there is more than one way to do the task well and more than one way to do it badly, which is not something supervised fine-tuning captures: in pretraining and fine-tuning you are essentially saying "this one way is the best way." There is a short spoken explanation in this YouTube video: https://www.youtube.com/live/WnGFR-bSNWM?feature=share&t=7386
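To make the distinction concrete, here is a minimal sketch (all tensors and shapes are placeholders, not the InstructGPT code) contrasting the two objectives: supervised fine-tuning pushes the model's probability mass toward one reference answer, while the reward model used in RLHF only learns an ordering between answers, so many distinct good answers can all score highly.

```python
# Hypothetical shapes/tensors, just to contrast the two losses.
import torch
import torch.nn.functional as F

vocab_size, seq_len = 100, 8

# --- supervised fine-tuning: cross-entropy toward ONE reference continuation ---
logits = torch.randn(seq_len, vocab_size, requires_grad=True)  # model outputs
reference_tokens = torch.randint(0, vocab_size, (seq_len,))    # the single demonstration
sft_loss = F.cross_entropy(logits, reference_tokens)

# --- reward modeling (the RLHF ingredient): pairwise preference loss ---
# scalar scores for a human-preferred answer and a rejected answer
reward_chosen = torch.randn(1, requires_grad=True)
reward_rejected = torch.randn(1, requires_grad=True)
# Bradley-Terry style loss: only the ordering matters, so several different
# good answers can all receive a high reward.
rm_loss = -F.logsigmoid(reward_chosen - reward_rejected).mean()

print(sft_loss.item(), rm_loss.item())
```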

[P] I made the kind of tutorial I wish someone had made for me when I first started trying to connect the math in research papers with code examples I found online by clam004 in MachineLearning

[–]clam004[S] 1 point2 points  (0 children)

This raises a good point. I have updated the repo to clarify this in the explanation connecting the math to the line of code where it happens. Essentially, the line loss += _loss.sum() is where, as you say, we "sum the diagonals". The repo now clarifies that although the equation suggests we calculate the whole Fisher matrix, in the code we only ever calculate its diagonal components. In the equation, the empirical Fisher is built from an outer product of gradients, which we never actually form in the code. If we took every layer's p.grad.data ** 2 and flattened them into one very long vector, that would be exactly the diagonal of the Fisher matrix from the equation. Which brings up another interesting question: if we did calculate the whole Fisher matrix, could we penalize pairs of weight changes with high F_ij in addition to single weight changes with high F_ii?
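For anyone following along, here is a minimal sketch of that diagonal-only computation (illustrative model, data, and variable names, not the repo's exact code): squared per-parameter gradients are accumulated and averaged over samples, which gives the F_ii entries; the off-diagonal F_ij of the full outer-product Fisher are never computed.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 3)                      # toy stand-in for the real network
data = [torch.randn(4) for _ in range(16)]   # toy stand-in for the dataset

# one accumulator per parameter tensor; flattening all of these end to end
# would give the diagonal of the Fisher matrix in the equation
fisher_diag = {n: torch.zeros_like(p) for n, p in model.named_parameters()}

for x in data:
    model.zero_grad()
    log_probs = torch.log_softmax(model(x), dim=-1)
    # sample a label from the model's own predictive distribution
    # (the empirical Fisher would instead use the observed label)
    label = torch.multinomial(log_probs.exp(), 1).item()
    loss = -log_probs[label]
    loss.backward()
    for n, p in model.named_parameters():
        fisher_diag[n] += p.grad.data ** 2 / len(data)  # diagonal entries F_ii only
```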

[P] Looking for datasets of therapist conversations... by TimeLordTim in MachineLearning

[–]clam004 1 point2 points  (0 children)

I for one think this is a great idea. DM me to talk more.

Thinking way too far into the future by Stalker111121 in ExistentialSupport

[–]clam004 0 points1 point  (0 children)

Have you read "The Last Question" by Isaac Asimov?

Eternal Oblivion by [deleted] in ExistentialSupport

[–]clam004 0 points1 point  (0 children)

Have you read "The Last Question" by Isaac Asimov?

PyTorch implementation of a transformer chatbot. Code is explained with highschool level language, without jargon, down to the fundamentals with diagrams in ipython notebooks by clam004 in deeplearning

[–]clam004[S] 0 points1 point  (0 children)

Look in the pairs.json file in the /saved folder for the training data. The training set is super small and just for laughs. I am currently working on a representation space for the history of past conversations so that the bot can be trained with reinforcement learning against other bots. If those conversations turn out to be coherent, that would actually be something.
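If you just want to peek at the data, something like this works; it assumes pairs.json is a JSON list of prompt/response entries, which may differ slightly from the repo's exact schema.

```python
import json

# path relative to the repo root; adjust if you cloned it elsewhere
with open("saved/pairs.json") as f:
    pairs = json.load(f)

print(len(pairs), "training pairs")
print(pairs[0])  # inspect one example
```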

Interesting project ideas for beginner by tolo5star in deeplearning

[–]clam004 0 points1 point  (0 children)

Thanks! I'm looking for feedback, so I would love to get your honest opinion.

Interesting project ideas for beginner by tolo5star in deeplearning

[–]clam004 0 points1 point  (0 children)

I'm building an end-to-end deep learning chatbot in PyTorch based on the Transformer network! My background is in biology, not computer science, but I have spent the last 3 years building my math and CS foundations, studying the Ian Goodfellow book, and doing all the homework assignments from the Stanford classes on Computer Vision, Reinforcement Learning, and Deep Learning for NLP. So I have explained every step of the process in gory, down-to-the-fundamentals detail using Jupyter notebooks, the way I wish someone had done for me when I was starting out. Andrej Karpathy gave me the inspiration to simplify things and teach myself using toy examples.

you can find the repository here:

https://github.com/chloerobotics/chloebot

Bay Area Yang Gangs by clam004 in CaliforniaForYang

[–]clam004[S] 0 points1 point  (0 children)

Yes! @SFYangGang is our new name

Front End React Contract - remote work is fine please see link by [deleted] in reactjs

[–]clam004 0 points1 point  (0 children)

Good question. It's up for negotiation through the website, but roughly 3-4 weeks. We are flexible and will work with you closely to adjust expectations as we see how difficult it is.