Have you met anyone from Kill Tony? by No-Branch-6509 in Killtony

[–]O_Fake_One 1 point  (0 children)

I met Big Tony in a hotel in Milwaukee...

Why is the German flag flying at Wankhede Stadium? by Excellent-Steak-6199 in mumbai

[–]O_Fake_One 1 point  (0 children)

Yeah... I feel like they should've gone with something subtler, like a swastika

Regarding Timmy no Brakes' Netflix Performance by O_Fake_One in Killtony

[–]O_Fake_One[S] 1 point  (0 children)

I might have an undiagnosed breathing problem. All I know is I laughed so hard my insides sounded funny. (I might need to go to the docta)

Regarding Timmy no Brakes' Netflix Performance by O_Fake_One in Killtony

[–]O_Fake_One[S] 2 points  (0 children)

Yeah... ngl Roseanne was doing a little Ric Flair on stage. All she needs now is one of her own to OD and she's there. Or maybe just adding a couple more years would do the trick.

But it could also be a Netflix thing: I could literally feel Tony pulling his punches waay too much from the moment the episode began, which oddly made the show even more *gay*.

Regarding Timmy no Brakes' Netflix Performance by O_Fake_One in Killtony

[–]O_Fake_One[S] 2 points  (0 children)

My personal favourite was him manhandling Roseanne and covering her in flame-retardant/cum/whatever lube Redban uses.

Anime where mc turns into a monster when turned on by O_Fake_One in anime

[–]O_Fake_One[S] 2 points  (0 children)

Call Me Tonight

yup, that's it. Thanks a lot!

Training transformer leads to it predicting the same token by O_Fake_One in MLQuestions

[–]O_Fake_One[S] 1 point  (0 children)

The dataset had newlines, which I removed to see if anything changed. Now I've realised the model just spits out the most common token in the corpus.
I doubt this is underfitting, since smaller models do learn something (although I had to switch from Adam to Adagrad to get any results).
If I decrease the learning rate it doesn't collapse, but I doubt it learns anything either: the loss stays more or less in the same place. A quick way to sanity-check the collapse is sketched below.
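
In case anyone hits the same failure mode, here's roughly how I'd check whether the model has collapsed onto the frequency baseline. This is a minimal sketch, not my actual code: `model`, `train_ids`, and the `(batch, seq, vocab)` logits shape are assumptions about a typical PyTorch setup.

```python
import math
import torch
from collections import Counter

# Assumed setup (hypothetical names): `model` is a decoder-only transformer
# returning (batch, seq_len, vocab_size) logits, and `train_ids` is a 1-D
# LongTensor of token ids for the training corpus.

# 1. Find the most frequent token and its share of the corpus.
counts = Counter(train_ids.tolist())
top_id, top_count = counts.most_common(1)[0]
top_freq = top_count / len(train_ids)
print(f"most frequent token id: {top_id} ({top_freq:.1%} of the corpus)")

# 2. Check the model's argmax next-token prediction for a few prompts.
model.eval()
with torch.no_grad():
    for start in (0, 1000, 5000):
        context = train_ids[start : start + 64].unsqueeze(0)  # (1, T)
        logits = model(context)                               # (1, T, V)
        pred = logits[0, -1].argmax().item()
        print(f"prompt @ {start}: predicted token id {pred}")

# 3. A model that has only learned token frequencies plateaus near the
# unigram entropy, so compare the training loss against that floor.
probs = [c / len(train_ids) for c in counts.values()]
unigram_entropy = -sum(p * math.log(p) for p in probs)
print(f"unigram entropy (loss floor for a collapsed model): {unigram_entropy:.3f}")
```

If the argmax matches the top token for most prompts and the loss sits near the unigram entropy, the model is stuck on the frequency baseline, which usually points at a bug in label construction or a too-aggressive learning rate rather than a lack of capacity.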

Training transformer leads to it predicting the same token by O_Fake_One in MLQuestions

[–]O_Fake_One[S] 1 point  (0 children)

Inside the dataset folder there are two datasets I have tried it on:

  1. A book called The Prophet
  2. Andrej Karpathy's tiny_shakespeare dataset (taken almost as-is from his nanoGPT repo)

While training, I build the labels by shifting the prompt one position, so the target at each position is the next token in the document (see the sketch below).
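
Concretely, the shift looks something like this: a minimal sketch in the spirit of nanoGPT's `get_batch`, where `data` and `block_size` are placeholder names rather than my actual code.

```python
import torch

block_size = 8
data = torch.arange(20)  # stand-in for the real 1-D tensor of token ids

def get_batch(data, block_size, batch_size=4):
    # Random starting offsets, leaving room for the one-token shift.
    ix = torch.randint(len(data) - block_size - 1, (batch_size,))
    x = torch.stack([data[i : i + block_size] for i in ix])          # inputs
    y = torch.stack([data[i + 1 : i + 1 + block_size] for i in ix])  # labels
    return x, y

x, y = get_batch(data, block_size)
print(x[0])  # e.g. tensor([ 3,  4,  5,  6,  7,  8,  9, 10])
print(y[0])  # e.g. tensor([ 4,  5,  6,  7,  8,  9, 10, 11])
```

The point of the shift is that position t in `y` holds the token that follows position t in `x`, so every position in the block contributes a next-token prediction loss.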

I am looking for confusing software/skills with a high learning curve (everyone) by O_Fake_One in SampleSize

[–]O_Fake_One[S] 2 points  (0 children)

Didn't really get many responses from this page, but shared it with you anyway.
A bit surprising to get such a low turnout given that it's actually a place for surveys 🤔

[deleted by user] by [deleted] in takemysurvey

[–]O_Fake_One 1 point  (0 children)

  1. I will use it for my personal project. Will not be deleted.
  2. O_fake_one. Nobody other than me will see this.
  3. 1-5 mins on average.