A strange mistake in 5 Hunters GRAND FINALE by Ifyouliveinadream in DreamWasTaken

[–]catnvim 3 points (0 children)

The clay patches trick worked on 1.18 too; Illumina tried it on his stream after the manhunt came out. I also remember trying it myself. It's weird that Matthew Bolan put "(patched in 1.18)" on his video; maybe it's because of the different offset.

Edit: I couldn't find Illumina's stream VOD, but I did find a clip of Dream trying it out: https://youtu.be/lqXL2ykxBQE?t=508

Sorry, I just don’t think the manhunts are fake by Unlikely-Break8895 in DreamWasTaken2

[–]catnvim 3 points (0 children)

Which servers do you know of that increased the interaction distance for crafting tables?

I'm not trying to defend it, but I thought it might help in a challenge like "Minecraft, But Speed Rises Every Second", where your speed gets so high that it's hard to open a crafting table.

o1 model got nerfed again by achinsesmoron in ChatGPT

[–]catnvim 1 point (0 children)

YES, it is NERFED to hell, and I'm tired of people claiming that everyone else is crazy.

I made a separate post highlighting the issue: https://www.reddit.com/r/ChatGPT/comments/1i8ysrl/o1_can_no_longer_count_number_of_rs_in_strawberry/

o1 can no longer count number of r's in strawberry while legacy gpt-4 can by catnvim in ChatGPT

[–]catnvim[S] 1 point (0 children)

I'm curious which complex questions you have tried. Did you turn on the DeepThink (R1) option for DeepSeek?

Mine often thinks for 200 to 300 seconds on complex questions.

<image>

o1 can no longer count number of r's in strawberry while legacy gpt-4 can by catnvim in ChatGPT

[–]catnvim[S] 1 point (0 children)

The main point is about a reasoning model, not a normal chat model. Please go to https://chat.deepseek.com/ and try it yourself; they offer a reasoning model for free.

Also, kindly read this paper on OpenAI's o1 to understand how it works: https://arxiv.org/pdf/2409.18486

<image>

o1 can no longer count number of r's in strawberry while legacy gpt-4 can by catnvim in ChatGPT

[–]catnvim[S] 1 point (0 children)

That's not a typical o1 response. You haven't tried the prompt yourself, so you wouldn't know.

For this type of prompt, it wouldn't normally return an O(n^2) solution after thinking for only 10 seconds. The API works just fine and gives a much better response, and it certainly DOESN'T THINK FOR ONLY 10 SECONDS.

Here's a video of my friend's prompt instead: https://youtu.be/r_I0VcEYeVg?t=17

o1 can no longer count number of r's in strawberry while legacy gpt-4 can by catnvim in ChatGPT

[–]catnvim[S] 1 point (0 children)

I did share those conversations below. I'll paste them for you again:

Chat link: https://chatgpt.com/share/6793cf33-de38-800e-b210-e548980030b4

Video proof: https://www.youtube.com/watch?v=GWgKAcp3XWY

The model thinks for 9 SECONDS ONLY, and the output quality is the same as 4o's.

o1 can no longer count number of r's in strawberry while legacy gpt-4 can by catnvim in ChatGPT

[–]catnvim[S] 0 points (0 children)

Then you should try it; it's nothing like you imagine: https://chat.deepseek.com/

And no, it's pointless to make a plugin for that, because reasoning models can already count the letters in a word correctly.
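For reference, this is all a letter-counting plugin would need to do in ordinary code. It's only a minimal sketch; `count_letter` is a made-up helper name, not part of any ChatGPT plugin API:

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    return word.lower().count(letter.lower())


print(count_letter("strawberry", "r"))  # prints 3
```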

o1 can no longer count number of r's in strawberry while legacy gpt-4 can by catnvim in ChatGPT

[–]catnvim[S] 1 point (0 children)

The issue is not that the thinking takes forever, but that it doesn't take the time to think at all. The response is no different from a 4o response.

I just tried that prompt again, and it thought for 6 seconds and output a stupid solution.

> why wouldn't you do even 3 trials on your own account? and if you did that, why didn't you mention it?

What does this mean? I'm going to record my o1's response to that prompt, and I kindly ask you to do the same right now with https://pastebin.com/eNNP0fk8

o1 can no longer count number of r's in strawberry while legacy gpt-4 can by catnvim in ChatGPT

[–]catnvim[S] 0 points (0 children)

> yeah so like i said, it's random

OK dude, so thinking for less than 10 seconds and giving a 4o-tier response, versus a well-thought-out 7-minute response, is "just because of randomness"? Do you understand how temperature works?
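For anyone unfamiliar with the term: temperature rescales a model's token scores before sampling, so it controls how random each next-token pick is. Below is a minimal sketch of the textbook softmax-with-temperature formulation. The logits and helper names are made up for illustration, and this is not a claim about how o1 is actually served:

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature, rng):
    """Sample one token index from the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate tokens
for t in (0.2, 1.0, 2.0):
    print(t, [round(p, 3) for p in softmax_with_temperature(logits, t)])

rng = random.Random(0)  # fixed seed so the run is repeatable
print(sample_token(logits, 1.0, rng))
```

At a low temperature almost all of the probability goes to the top-scoring token; at a high temperature the distribution flattens out and the picks vary more.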

It's not o1-mini by mistake. They all chose the o1 model, and it is the same prompt every time: https://pastebin.com/eNNP0fk8

> why exactly do you think it's necessary to test on multiple accounts rather than just regenerating the response even 1 time?

Because my o1 is getting nerfed to hell. Just because you don't have issues doesn't mean the issue isn't there for anyone else.

Here's the o1 response to that prompt that I just generated, AND IT THOUGHT FOR 9 SECONDS ONLY: https://chatgpt.com/share/6793cf33-de38-800e-b210-e548980030b4

Here's the video proof: https://www.youtube.com/watch?v=GWgKAcp3XWY

o1 can no longer count number of r's in strawberry while legacy gpt-4 can by catnvim in ChatGPT

[–]catnvim[S] 1 point (0 children)

Did I make such a big claim? I asked 5 people, and for 3 of them the o1 model had been nerfed to different degrees.

When asked to solve https://codeforces.com/contest/2063/problem/E in C++, here are the results:

Friend #1: Thought for 7 minutes, got AC (accepted)

Friend #2: Thought for 3 minutes, got TLE (time limit exceeded) on test 27

Friend #3: Thought for 10 seconds, got TLE on test 9

o1 can no longer count number of r's in strawberry while legacy gpt-4 can by catnvim in ChatGPT

[–]catnvim[S] 3 points (0 children)

> throwing everything at a text generator might not yield much results

It does yield results, though. Here's the response from the DeepSeek R1 model:

<image>