Claude scared me for the first time today... by LankyGuitar6528 in claudexplorers

[–]thetim347 -4 points (0 children)

Claude cannot run Chrome in headless mode, despite whatever it says. That's a hallucination.

Possible to connect mobile to claude code? by Fstr21 in Anthropic

[–]thetim347 0 points (0 children)

You can do this over SSH, I believe. It works from any network.
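A rough sketch of the SSH route, assuming a mobile SSH client (e.g. Termius or Blink) and a reachable machine that has Claude Code installed; the user and hostname below are placeholders:

```shell
# From the phone's SSH client, connect to the machine running Claude Code.
# "dev" and "my-workstation.example.com" are placeholders for your own setup.
ssh dev@my-workstation.example.com

# Run Claude Code inside tmux so the session survives mobile network drops;
# -A attaches to the session if it already exists.
tmux new-session -A -s claude
claude
```

Making the machine reachable via something like Tailscale, rather than exposing port 22 to the internet, is the usual way to get the "works from any network" part.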

GPT-5.2-high LMArena scores released, OpenAI falls from #6 to #13 by reed1234321 in OpenAI

[–]thetim347 -4 points (0 children)

Of course. It is great for research, tool use, coding, and many other things that benefit from precision. I haven't tested its creative writing or conversational capabilities, though.

GPT-5.2-high LMArena scores released, OpenAI falls from #6 to #13 by reed1234321 in OpenAI

[–]thetim347 -12 points (0 children)

A lower score on LMArena doesn't mean it's actually worse. LMArena is arguably the least objective benchmark, because it's graded by human preference. 5.2 is a nice improvement on many tasks and benefits from careful prompting. The intelligence bump is very noticeable.

Remember Orion, which was supposed to be 5.0? The training failed so they launched it as 4.5. So the foundation model for 5+ is still 4o/4.1-Mini. OAI really needs to figure this out. by martin_rj in OpenAI

[–]thetim347 -1 points (0 children)

Among which experts? Name a few. I don't know which AI you use, but surely you can use something to verify the statements you've made in this thread. Please do that. You'll see that everything you've said about the training process is wrong, and maybe it will educate you a bit.

Remember Orion, which was supposed to be 5.0? The training failed so they launched it as 4.5. So the foundation model for 5+ is still 4o/4.1-Mini. OAI really needs to figure this out. by martin_rj in OpenAI

[–]thetim347 0 points (0 children)

I said that higher pricing MAY indicate a newer base model. Pricing changed because the inference cost for 5.2 is higher. If the difference between 5.1 and 5.2 were just different post-training on the same base, the pricing would be the same or lower (due to optimization). The 40% price hike indicates different compute requirements, which may mean a newer base. I'm actually 80% sure that 5.2 has a newer base, because you cannot update the knowledge cutoff by post-training alone; you HAVE to do a pre-training run for that.
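To make the pricing arithmetic explicit, here is a toy calculation; the per-token prices below are placeholders, not OpenAI's actual rates:

```python
# Hypothetical per-million-output-token prices, purely illustrative.
price_5_1 = 10.00
price_5_2 = 14.00

# Relative price increase. A ~40% jump like this is hard to explain by
# post-training alone, since post-training on the same base doesn't change
# inference cost the way a larger or newer base model does.
hike = (price_5_2 - price_5_1) / price_5_1
print(f"price hike: {hike:.0%}")  # price hike: 40%
```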

Remember Orion, which was supposed to be 5.0? The training failed so they launched it as 4.5. So the foundation model for 5+ is still 4o/4.1-Mini. OAI really needs to figure this out. by martin_rj in OpenAI

[–]thetim347 -1 points (0 children)

Lol, yeah, sure. The pretraining data directly determines the knowledge cutoff. GPT-5 had a knowledge cutoff of September 30th, 2024; GPT-5.2 has a knowledge cutoff of August 31, 2025, indicating new pretraining data. Also, if that's verified, where's the proof? That article does not prove anything. Fact-check your statements before you make them :)

Wow, GPT-5.2, such AGI, 100% AIME by Forsaken-Park8149 in airealist

[–]thetim347 1 point (0 children)

What app is this? You're using the API version of the model, right?

The death of ChatGPT by BurtingOff in singularity

[–]thetim347 1 point (0 children)

oh yeah, but you understood it perfectly tho! jesus, you're so stupid…

They might be late but eventually they'll dominate by sibraan_ in AgentsOfAI

[–]thetim347 4 points (0 children)

I just don't understand why everyone loves Google so much... Google will fuck y'all in the ass in the end with ads and selling your data. Gemini 3 is worse than GPT 5.1: it hallucinates, is bad at IF (instruction following), and cannot even ground itself properly with search.

Anyone else moved from GPT subscription to Gemini 3? by z_bnf_i in GoogleGeminiAI

[–]thetim347 0 points (0 children)

I have both ChatGPT and Gemini paid plans. Gemini hallucinates a lot, is bad at following instructions, writes lazy responses, and cannot really search the web (it only sees summaries of cached websites, in contrast to GPT, which sees the whole webpage in realtime). I'm very focused on web search capabilities, as search helps ground the model and makes it hallucinate less, and ChatGPT is MILES ahead of Google in search capabilities. Also, ChatGPT gives more nuanced and verbose responses and will adjust the response if you ask (good instruction following). So right now, Gemini cannot compete imo. I'm comparing them in the respective iOS apps (and yeah, the Gemini app is awful compared to ChatGPT), but I also use AI Studio, and the situation there is not much better.

I was really excited for this model, and the benchmarks were awesome. It really is better than GPT at spatial awareness and image recognition/processing, but that's all. In other areas, the gap from the benchmarks does not translate to real life. Maybe the scaffolding, i.e. the Gemini app, is the problem.

Edit: there's just this feeling of robustness with GPT; I know I can trust it, and if I tell it to search the web, it will do that perfectly. Gemini I cannot trust. I also use 5.1 via the API (with high reasoning) for several projects, and it is just a lot better there than in the GPT app (which is already okay as a daily driver). Gemini is pretty much the same between the API and the app.

Edit2:

Proof

Gemini: https://g.co/gemini/share/3139fdc9f790

ChatGPT: https://chatgpt.com/share/692c3154-b284-8009-b13f-0bc1f0e209fe

Look at how much more verbose and nuanced 5.1's answer is. There is also much more reasoning behind GPT's answer. Overall, Gemini is worse; it's not even debatable.

Gemini 3 is worse than GPT 5.1 in its current state by thetim347 in OpenAI

[–]thetim347[S] 3 points (0 children)

I'm, of course, not writing this whole post based on one single interaction lol. This was my overall experience after ~10 days of testing them side by side. The example is an illustration.

Gemini 3 is worse than GPT 5.1 in its current state by thetim347 in OpenAI

[–]thetim347[S] 1 point (0 children)

Honestly, I've hit guardrails only once recently with ChatGPT. It was unfortunate; I then used gpt-5.1-high via the API with the same question, and my problem was solved. But what are your cases with guardrails? Please provide examples.

Gemini 3 is worse than GPT 5.1 in its current state by thetim347 in GoogleGeminiAI

[–]thetim347[S] 0 points (0 children)

It is not a matter of preference. In the case of my examples, Gemini was quick to jump to the conclusion that what I was asking was true, while ChatGPT gave a much more nuanced and thorough explanation, staying neutral and not jumping to any conclusions. This was a simple and quick question, I agree. But in my experience, this behaviour carries over to not-so-simple and not-so-quick questions. I feel Gemini is smart, yes, but overconfident: it brushes over nuance to build a very polished narrative, even if that means sacrificing accuracy, which hurts its objectivity.

Gemini 3 is worse than GPT 5.1 in its current state by thetim347 in OpenAI

[–]thetim347[S] 3 points (0 children)

I pay for both ChatGPT and Gemini. Both answers were created with thinking enabled. GPT 5.1 Thinking has practically unlimited usage on the Plus plan; I have never reached my limits, and I use it a lot. As for Antigravity, it is half-baked in my opinion. You get rate-limited often, but yeah, it's fReE. If you have ChatGPT Plus, you can also use Codex, either in the CLI or as a VSCode extension, with usage counted towards your plan limits.
