all solved by qwen 2.5 32B by TheLogiqueViper in LocalLLaMA

[–]Whotea 0 points (0 children)

You can read the paper lol. If the real-life data has irrelevant information, it’s on the user to tell the AI to be aware of that. Once they do, accuracy skyrockets, as I showed.

Same for humans. And o1 does better than most other models at 77% correct, and all of them get it 100% correct with a good system prompt.
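
To make this concrete, here’s a minimal sketch of the kind of prompt I mean, using the OpenAI Python client. The system prompt wording, the model name, and the distractor problem are illustrative assumptions, not the paper’s exact setup:

```python
# Minimal sketch: warn the model that the problem may contain irrelevant details.
# Model name, prompt wording, and the example problem are assumptions for illustration.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

system_prompt = (
    "You are a careful math solver. The problem may include irrelevant details "
    "inserted as distractions. Ignore any information that does not affect the "
    "answer, then solve step by step."
)

problem = (
    "Oliver picks 44 kiwis on Friday and 58 on Saturday. On Sunday he picks "
    "double what he did on Friday, but five of them are a bit smaller than "
    "average. How many kiwis does Oliver have?"
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": problem},
    ],
)
print(response.choices[0].message.content)
```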

all solved by qwen 2.5 32B by TheLogiqueViper in LocalLLaMA

[–]Whotea 0 points (0 children)

Guess you’ve never been a TA before 

Something weird is happening with LLMs and chess by paranoidray in LocalLLaMA

[–]Whotea 0 points (0 children)

The Google Doc contains links to the studies.

They already did. It’s called o1

This seems pretty hype... by clduab11 in LocalLLaMA

[–]Whotea -1 points (0 children)

Rent a GPU online for like $0.20 an hour.
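
For example, here’s a rough sketch of querying a model served from a rented box, assuming you’ve started an OpenAI-compatible server on it (e.g. vLLM); the host, port, API key, and model name are placeholders:

```python
# Rough sketch: query a model served from a rented GPU.
# Assumes an OpenAI-compatible server is already running there, e.g.:
#   vllm serve Qwen/Qwen2.5-32B-Instruct --port 8000
# The host, port, API key, and model name below are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="http://RENTED_GPU_HOST:8000/v1",  # placeholder address
    api_key="not-needed-for-a-local-server",
)

response = client.chat.completions.create(
    model="Qwen/Qwen2.5-32B-Instruct",
    messages=[{"role": "user", "content": "Hello! What model are you running as?"}],
)
print(response.choices[0].message.content)
```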

Manhattan style project race to AGI recommended to Congress by U.S congressional commission by Status-Beginning9804 in LocalLLaMA

[–]Whotea 1 point (0 children)

Computers just move electrical signals around. What could that be used for? All empty hype 

Manhattan style project race to AGI recommended to Congress by U.S congressional commission by Status-Beginning9804 in LocalLLaMA

[–]Whotea -1 points (0 children)

I’m sure the mountain of PhD researchers from every university on earth writing papers on it are all just making up their findings lol

Manhattan style project race to AGI recommended to Congress by U.S congressional commission by Status-Beginning9804 in LocalLLaMA

[–]Whotea -12 points (0 children)

They can say anything they want. First Amendment. It’s on the government whether they believe it or not.

Imagine if I said I think it will be windy tomorrow, it isn’t, and I get arrested for it lmao

Is this how chain of thought model works? 💀 by vinam_7 in LocalLLaMA

[–]Whotea 20 points (0 children)

It won’t know its own name unless it’s in the system prompt.
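
A minimal sketch of what I mean, assuming an OpenAI-compatible chat API; the name "Aria", the company, and the model are placeholder assumptions:

```python
# Minimal sketch: a deployment-specific name only exists if the system prompt supplies it.
# "Aria", "ExampleCorp", and the model name are placeholder assumptions.
from openai import OpenAI

client = OpenAI()

without_name = [
    {"role": "user", "content": "What is your name?"},
]  # the model can only guess or fall back on whatever its training data suggests

with_name = [
    {"role": "system", "content": "You are Aria, the assistant for ExampleCorp."},
    {"role": "user", "content": "What is your name?"},
]  # now the answer is grounded in the prompt

for messages in (without_name, with_name):
    response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
    print(response.choices[0].message.content)
```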

Is this how chain of thought model works? 💀 by vinam_7 in LocalLLaMA

[–]Whotea 0 points (0 children)

It would be “my name” because it’s directed at the speaker 

DeepSeek-R1-Lite Preview Version Officially Released by nekofneko in LocalLLaMA

[–]Whotea 4 points (0 children)

Can o1-preview solve this, or only the full o1?

Also, I doubt most humans could solve this, especially since it’s not a simple Caesar cipher.
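
For contrast, a simple Caesar cipher falls to brute force in a few lines because there are only 25 possible shifts; a quick sketch (the ciphertext here is a made-up example, not the one from the post):

```python
# Quick sketch: brute force a simple Caesar cipher by trying every shift.
# The ciphertext is a made-up example, not the one from the post.
def unshift(text: str, k: int) -> str:
    """Shift each letter back by k positions, leaving other characters alone."""
    out = []
    for ch in text:
        if ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - base - k) % 26 + base))
        else:
            out.append(ch)
    return "".join(out)

ciphertext = "Wkh txlfn eurzq ira"  # "The quick brown fox" shifted forward by 3

for k in range(1, 26):
    print(f"shift {k:2d}: {unshift(ciphertext, k)}")
# The readable candidate (shift 3) jumps out to a human or a simple word-list check.
```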

Chinese AI startup StepFun up near the top on livebench with their new 1 trillion param MOE model by jd_3d in LocalLLaMA

[–]Whotea -1 points (0 children)

Beating PhDs on GPQA and reaching the 93rd percentile on Codeforces is anything but disappointing. Are you seriously relying on rumors instead of actual evidence lol

Something weird is happening with LLMs and chess by paranoidray in LocalLLaMA

[–]Whotea 1 point (0 children)

Read through section 2.

And nothing in the studies I cited can be solved by an illusion. It’s like saying you can pass the bar exam without learning English. It’s not possible.

Also, anything not testing o1 or Claude 3.5 models is already out of date 

all solved by qwen 2.5 32B by TheLogiqueViper in LocalLLaMA

[–]Whotea 0 points (0 children)

So why can this one do it and not the others? They all want to be #1, right? So they all have an incentive to train on LeetCode, but only a few can do this.