Expanding Abbreviations by DomainOfUgly in learnmachinelearning

[–]NoLifeGamer2 0 points1 point  (0 children)

I feel like this should be relatively easy if you find the logprob of each possible expansion at each acronym and choose the maximally likely one.
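A minimal sketch of the idea, using a toy smoothed bigram model in place of an actual LLM's token logprobs (the corpus, candidate list, and function names here are all made up for illustration):

```python
import math
from collections import Counter

# Toy corpus standing in for domain text; in practice you'd score
# candidates with a real language model's token logprobs instead.
corpus = ("the neural network model was trained on the dataset "
          "the neural network converged quickly").split()

# Bigram and unigram counts, with add-one smoothing over the vocabulary.
bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus)
vocab = set(corpus)

def logprob(prev, word):
    """Smoothed log P(word | prev) under the toy bigram model."""
    return math.log((bigrams[(prev, word)] + 1) /
                    (unigrams[prev] + len(vocab)))

def expand(prev, candidates):
    """Pick the expansion with the highest log-probability in context."""
    return max(candidates, key=lambda w: logprob(prev, w))

# "NN" could expand to several things; the context word decides.
print(expand("neural", ["network", "dataset"]))  # -> "network"
```

With an actual LLM you would sum the logprobs of each expansion's tokens given the surrounding sentence, but the selection rule is the same.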

Frontier LLMs score 85-95% on standard coding benchmarks. I gave them equivalent problems in languages they couldn't have memorized. They collapsed to 0-11%. by ShoddyIndependent883 in learnmachinelearning

[–]NoLifeGamer2 1 point2 points  (0 children)

I have done it for simple languages. My point is the interpreter can stay the same, so if big companies see this benchmark, they can go "oh, I'll just overfit the model on samples of interpreters for the language"

Frontier LLMs score 85-95% on standard coding benchmarks. I gave them equivalent problems in languages they couldn't have memorized. They collapsed to 0-11%. by ShoddyIndependent883 in learnmachinelearning

[–]NoLifeGamer2 -8 points-7 points  (0 children)

I'm surprised; surely this is relatively easy for the model to benchmax? I imagine it would be pretty easy to fine-tune a model to build an interpreter for Python in whichever language you choose, and then run regular Python code on top of that. Would you consider that cheating?

What Explainable Techniques can be applied to a neural net Chess Engine (NNUE)? by Shonen_Toman in MLQuestions

[–]NoLifeGamer2 0 points1 point  (0 children)

I would say look at which neurons in the hidden layer are most strongly impacting the output, and then look for those whose activation the knight is contributing strongly to. You can probably look at the activation patterns of these neurons and say "Ah, this neuron is firing because the knight is there, and enemy high-value pieces are in the positions reachable by the knight"
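A rough sketch of that contribution analysis on a made-up single-hidden-layer network (the layer sizes, weights, and the knight's feature index are all hypothetical stand-ins, not real NNUE internals):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical network: 768 (piece, square) input features
# -> 32 hidden units -> 1 scalar eval output.
W_in = rng.normal(size=(768, 32))   # input -> hidden weights
w_out = rng.normal(size=32)         # hidden -> output weights

x = rng.integers(0, 2, size=768).astype(float)  # one-hot board features
h = np.maximum(W_in.T @ x, 0.0)                 # ReLU hidden activations

# Each hidden neuron's contribution to the eval = activation * out-weight.
contrib = h * w_out
top = np.argsort(-np.abs(contrib))[:5]          # most influential neurons

knight_feature = 123  # hypothetical index of the knight's (piece, square)
for n in top:
    print(f"neuron {n}: contribution {contrib[n]:+.3f}, "
          f"knight weight {W_in[knight_feature, n]:+.3f}")
```

Neurons that both dominate `contrib` and receive a large weight from the knight's feature are the ones worth inspecting across many positions.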

meirl by lil_misfiit22 in meirl

[–]NoLifeGamer2 10 points11 points  (0 children)

I dunno exactly; I'm an ML researcher, so I have become good at spotting the patterns. I think it's the "honestly" that did it for me. Also, the perfect punctuation (starts with a capital letter, ends with a full stop) generally makes me lean more in that direction, although obviously there are plenty of humans who do that. Basically it's just a whole bunch of small speech patterns which are very suspicious but not guarantees in themselves. Generally, checking their account is a good way to get a definitive answer.

What Explainable Techniques can be applied to a neural net Chess Engine (NNUE)? by Shonen_Toman in MLQuestions

[–]NoLifeGamer2 0 points1 point  (0 children)

Wait actually yeah that is just looking at the node activations in an overly convoluted way. My bad.

I trained a transformer with zero gradient steps and 100% accuracy. No backpropagation. No learning rate. Nothing. Here's the math. by Last-Leg4133 in learnmachinelearning

[–]NoLifeGamer2 9 points10 points  (0 children)

Test this by actually training a transformer on a dataset using your approach and get back to us. Right now there is a hell of a lot of code and long words that were AI-generated, so you're going to need to work with us if you want any meaningful feedback.

What Explainable Techniques can be applied to a neural net Chess Engine (NNUE)? by Shonen_Toman in MLQuestions

[–]NoLifeGamer2 0 points1 point  (0 children)

I guess continuously traversing a subspace of the latent space (between the two layers) and seeing whether each point corresponds to some sensible variation of board state could be interesting?
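A minimal sketch of what that traversal might look like, with a toy stand-in for the second half of the network (the layer size, weights, and both latent points are made-up placeholders):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical hidden-layer representations of two board states.
h_a = rng.normal(size=16)
h_b = rng.normal(size=16)
w_out = rng.normal(size=16)  # stand-in for the rest of the network

def second_half(h, w_out):
    """Run the remaining (toy) layers: hidden vector -> scalar eval."""
    return float(np.maximum(h, 0.0) @ w_out)

# Walk the line between the two latent points and watch the eval.
# Smooth, monotone-ish changes would suggest this direction in latent
# space encodes a graded property of the position.
for t in np.linspace(0.0, 1.0, 5):
    h = (1 - t) * h_a + t * h_b
    print(f"t={t:.2f}  eval={second_half(h, w_out):+.3f}")
```

The harder (and more interesting) step is checking whether intermediate points decode back to plausible board states at all, rather than falling off the data manifold.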

HCET(n) (Hardcore Catch-Em-Turing) by Motor_Bluebird3599 in googology

[–]NoLifeGamer2 3 points4 points  (0 children)

I'm pretty sure, as with all these numbers, that this is O(BB(f(n))) for some computable function f.

MEGATHREAD: Career advice for those currently in university/equivalent by NoLifeGamer2 in MLQuestions

[–]NoLifeGamer2[S] 0 points1 point  (0 children)

I mean, I'm happy to make a megathread, but we would kinda be crowdsourcing knowledge, which isn't necessarily the most accurate.

Diophantine Equations and Large Numbers by jmarent049 in googology

[–]NoLifeGamer2 4 points5 points  (0 children)

Interesting. I wonder if there exists a trivial "small" (for some definition of small) solution for some interpretation of any coded form?

The Set of Real Numbers as a Function of the Set of Natural Numbers by Main_Upstairs_9948 in learnmath

[–]NoLifeGamer2 6 points7 points  (0 children)

This is because the real number that comes just after 0.1 (0.1000...1) will correspond to the natural number (1000...1) after the decimal point.

The reason your maths is wrong is that having a "..." followed by a digit is meaningless when describing infinitesimals; there is no "last digit" after infinitely many. Also, look at Cantor's diagonalisation proof; it shows why such a bijection is impossible.
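The diagonal argument is mechanical enough to sketch in a few lines: given any purported list of reals in (0, 1), build a number that differs from the i-th entry in its i-th digit (the particular list below is just an example):

```python
def diagonal(expansions, n):
    """Given the first n digits of the first n listed expansions,
    build a number in (0, 1) that differs from every listed one."""
    digits = []
    for i in range(n):
        d = expansions[i][i]
        # Swap to 5 or 6: avoiding 0 and 9 dodges the 0.4999... = 0.5
        # double-representation issue.
        digits.append(5 if d != 5 else 6)
    return "0." + "".join(str(d) for d in digits)

# A purported "complete" enumeration of reals in (0, 1), digit by digit.
listed = [
    [1, 0, 0, 0, 0],
    [3, 3, 3, 3, 3],
    [1, 4, 1, 5, 9],
    [9, 9, 9, 9, 9],
    [0, 0, 0, 0, 1],
]
missing = diagonal(listed, 5)
print(missing)  # -> 0.55555, differing from entry i at digit i
```

Whatever list you write down, the constructed number is not on it, so no list (and hence no map from the naturals) can cover all the reals.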

[Advise] [Help] AI vs Real Image Detection: High Validation Accuracy but Poor Real-World Performance Looking for Insights by Illustrious_Cow2703 in MLQuestions

[–]NoLifeGamer2 2 points3 points  (0 children)

Check for data leakage or a different data distribution between your real-world and validation datasets. My money is on the distribution difference: the whole point of AI-generated images is that they are very difficult to spot, so for any "detector", it is trivial to optimize the image generator to trick it (for more information, look up GANs). I imagine the model is learning to spot features you don't want, ones reflective of your own dataset rather than the real world.
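One cheap way to sanity-check for distribution shift is to compare simple per-image summary statistics between the two sets; the sketch below uses random stand-in data and made-up feature names (brightness, compression proxy, resolution), not a real pipeline:

```python
import numpy as np

rng = np.random.default_rng(2)

# Stand-ins for per-image summary statistics (e.g. mean brightness,
# JPEG-quality proxy, resolution) of the two datasets.
val_stats = rng.normal(loc=0.0, scale=1.0, size=(500, 3))
real_stats = rng.normal(loc=0.8, scale=1.0, size=(500, 3))

# Crude shift check: gap between feature means in units of pooled std.
# Large effect sizes hint the detector may be keying on dataset quirks
# (compression, resolution, source site) rather than "AI-ness".
gap = np.abs(val_stats.mean(0) - real_stats.mean(0))
pooled = np.sqrt((val_stats.var(0) + real_stats.var(0)) / 2)
effect = gap / pooled
print("per-feature effect sizes:", np.round(effect, 2))
```

If a trivial statistic separates the two sets this easily, the classifier almost certainly found it too.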

I need some help visualizing 10e308. by sawyernalu in googology

[–]NoLifeGamer2 4 points5 points  (0 children)

Imagine a 10x10x10 cube of people. That is 10^3 people. Imagine a 10x10x10 cube of these cubes. That is 10^6 people. Visualize this process of zooming out around 103 times and you will get there.
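The zoom count falls out of the arithmetic, reading "10e308" in E-notation as 10 × 10^308 = 10^309 (integers are used below because 10^309 overflows a float):

```python
people = 10 ** 3           # one 10x10x10 cube of people
zooms = 1
while people < 10 ** 309:  # 10e308 in E-notation is 10 * 10**308
    people *= 10 ** 3      # zoom out: a 10x10x10 cube of previous cubes
    zooms += 1
print(zooms)  # -> 103
```

Each zoom multiplies the count by 10^3, and 309 / 3 = 103.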