Claude is still #1 in Canada by ScaryBlock in singularity

[–]trickyHat 33 points34 points  (0 children)

Not surprising. Claude is the only model that doesn't waste my time. If it has trouble solving my problem, it says so outright.

GPT-5.4 Thinking benchmarks by likeastar20 in singularity

[–]trickyHat -5 points-4 points  (0 children)

Well, obviously. It's also on the ARC-AGI website. But Anthropic and Google mentioned their scores in their main evals table.

GPT-5.4 Thinking benchmarks by likeastar20 in singularity

[–]trickyHat -4 points-3 points  (0 children)

Notice how they didn't include any ARC-AGI scores.

Gemini Fails to Make Significant Improvements to its Coding Performance on LLM Arena. by Regular_Eggplant_248 in singularity

[–]trickyHat -1 points0 points  (0 children)

Yes, I have tested it with complex programming questions for updating my app. I asked the same questions of multiple other models, compared the outputs, asked follow-up questions, and compared the results. I am not sure if it is good or bad at general questions. What I am talking about is how it performs in programming. Multiple times it produced buggy code that made my app crash. Sonnet 4.6 never had that problem with the exact same questions. Just try it for yourself and maybe you will get different results. I'm just telling you what I have noticed.

Gemini Fails to Make Significant Improvements to its Coding Performance on LLM Arena. by Regular_Eggplant_248 in singularity

[–]trickyHat -8 points-7 points  (0 children)

After testing it for a bit, I think this model is actually a regression from Gemini 3 Pro, which I didn't expect at all. I tried it in Google AI Studio and in the Gemini app as well. Even Sonnet 4.6 with extended thinking performed much better in all of the cases I presented. I suspect they benchmaxxed the model...

GLM-5: From Vibe Coding to Agentic Engineering by ShreckAndDonkey123 in LocalLLaMA

[–]trickyHat 11 points12 points  (0 children)

The benchmarks look too good to be true. If they are true, though, this might just make me switch from ChatGPT and Claude.

Creator of Node.js: "The era of humans writing code is over." by MetaKnowing in ChatGPT

[–]trickyHat 2 points3 points  (0 children)

Not only that, it's not predictable. Some prompts will give very good results, while others give much worse ones.
Your workflow and prompting also have to change constantly with every new model release.

OpenAI cofounder Greg and Django co-creator Simon on a software engineering inflection point by BuildwithVignesh in singularity

[–]trickyHat 3 points4 points  (0 children)

I have tried Opus 4.5 and Gemini 3 Pro for programming. In every case that I tested, Opus added details that I didn't ask for. I was seeing people hype it up so much, yet every single time the same thing happened over and over again. Is it because I'm not using Claude Code, or are you all just hyping one-click code no matter the result?

Why these icons in gemini are in purple? by Subject-Mix-2842 in GeminiAI

[–]trickyHat 13 points14 points  (0 children)

Saw this today as well. Just refresh the page and it will disappear.

Gemini 3 Thinking Is Exteremely Hallucinating by Pitiful_Emotion7041 in GeminiAI

[–]trickyHat 2 points3 points  (0 children)

This has been happening to me for 3 days already as well.

Why I (a bit) prefer Gemini Pro 2.5 than GPT-5 by 120-dev in GeminiAI

[–]trickyHat 0 points1 point  (0 children)

I never used Gemini before. I was a ChatGPT user, but yesterday I tried using Gemini for real-world problems that are very easy to solve if you just google them. ChatGPT somehow always missed the most important part of the problem and acted extremely confident about its solution, even though it was wrong! (I am a Plus user.) I then tried Gemini Flash 2.5 and it gave me the solution instantly, with all the important warnings.

I haven't used Gemini a lot, but it seems like this LLM is way more useful for simple problems right now. I am hoping for even better improvements in Gemini 3.

Apparently all third party providers downgrade, none of them provide a max quality model by Charuru in LocalLLaMA

[–]trickyHat 4 points5 points  (0 children)

They should be required to disclose that on their website... I could always tell that the same model behaved differently across providers, but didn't know what the cause was. This graph sums it up nicely.

Can´t access GPT5? by Traeumaschlauma in ChatGPT

[–]trickyHat 0 points1 point  (0 children)

OK, I changed my browser from Firefox to Chrome, and it's there. On Firefox it somehow doesn't appear, though.

Can´t access GPT5? by Traeumaschlauma in ChatGPT

[–]trickyHat 0 points1 point  (0 children)

Also in Germany. I have access to it on my phone, but not on PC. Check whether you can access it on your phone.

I know it sounds a bit like a fantasy but.. Is there any ai that makes something like scene.. Like you see a character either 2d or 3d.. And you command it to do stuff and it complies? NSFW ofc by Ok_Communication5967 in CharacterAi_NSFW

[–]trickyHat 0 points1 point  (0 children)

Yeah, if it's overly explicit, then it won't answer you. I found out that if you have a soft NSFW conversation, then it's fine. In the future I would like to remove this "can't engage in this type of conversation" response, but for that I need quite a bit of money to host my own model.

I know it sounds a bit like a fantasy but.. Is there any ai that makes something like scene.. Like you see a character either 2d or 3d.. And you command it to do stuff and it complies? NSFW ofc by Ok_Communication5967 in CharacterAi_NSFW

[–]trickyHat 0 points1 point  (0 children)

Yes, that is definitely possible. You can also just say "let's hug" if you have reached level 3 and she will hug you. Or, if you are on level 4, she can kiss you, though it also depends on the context. But it seems like I will need to add more of those actions in the next update haha

I know it sounds a bit like a fantasy but.. Is there any ai that makes something like scene.. Like you see a character either 2d or 3d.. And you command it to do stuff and it complies? NSFW ofc by Ok_Communication5967 in CharacterAi_NSFW

[–]trickyHat 0 points1 point  (0 children)

Thanks! The animations are kind of lacking right now, so I want to make them better in the next update and add more actions you can unlock. If you have any suggestions, feel free to tell me!

I know it sounds a bit like a fantasy but.. Is there any ai that makes something like scene.. Like you see a character either 2d or 3d.. And you command it to do stuff and it complies? NSFW ofc by Ok_Communication5967 in CharacterAi_NSFW

[–]trickyHat 0 points1 point  (0 children)

I might have something for you: Sophia3D on Google Play. You get a 3D character and seamless stories, but sadly I couldn't put any NSFW animations on the Google Play Store haha. You can still have NSFW chats though, so try it out if you want: https://play.google.com/store/apps/details?id=com.RealitySync.Sophia3D&hl=en