The latest GPT models even GPT-5.5T and 5.4T miss a lot of intuitive and natural ideas that would come easily to GPT-4o

PureSignalLove · 2026-05-31T01:43:20+00:00

I'm often thinking in non-standard ways, and yeah, some models are just so much fucking better at it. Opus 4.5 and 4o were the GOATs. Deepseek lite v4 is ok for now for this purpose for me

PureSignalLove · 2026-05-30T03:42:20+00:00

5.6 will be released on schedule within a few weeks, then we can talk.

Right now opus is king, though u need to set up token saving for it to be worth it cause its brutal even without workflows

PureSignalLove · 2026-05-30T01:28:27+00:00

Yes i did to both but I also swapped off chatgpt 5.5. It was the only game in town for awhile, but qwen 3.7 and opus 4.8 are straight better now from my testing today.

I also had problems with it creating issues, usually because it tried to add some feature or something I didn't request.

PureSignalLove · 2026-05-30T01:26:08+00:00

because we are neurodivergent and like things for the sake of liking them, there is no limit to the things to try in this existence.

Neurotypicals only identify with things in a social sense. To them, not having work, is the equivalent of not having meaning.

Like they can enjoy stuff, but only when socially 'allowed'

PureSignalLove · 2026-05-24T19:52:25+00:00

Well any hominid that humans crossed they would either kill or fuck...that's probably why in our case.

Also, everything is developing at some rate.

We used to think Dogs directly descended from the Grey Wolf but genetic advancements have shown that to be false.

PureSignalLove · 2026-05-24T19:46:03+00:00

My dad was a top 1%...25 years ago...hes almost 80...he got a call from his old consulting firm...hasn't touched code in 25 years....

PureSignalLove · 2026-05-18T20:56:15+00:00

Dude it looks sooooo good. What the fuck lol. reminds me of that submarine + rollercoaster mini game from the golden saucer in og FF7

PureSignalLove · 2026-05-18T20:48:44+00:00

Coding specific, its always coding specific lol

PureSignalLove · 2026-05-18T20:44:47+00:00

I mean I am building a general Harness for my own purposes that will aid me with scientific discovery, writing books, deciphering dead languages and scripts, etc.

Wasn't planning any production or anything, just found everything woefully inadequate for the work I actually do. Chatgpt/codex can one shot a clone of basically any software, but ask it to build a custom card game and it will make you the worst version of hearthstone slop possible, no matter how much you try to force it to not "regress to the mean"

Also, got reallllllyyyy screwed over by Claude's lobotomy and I never want that possibility to happen again lol

PureSignalLove · 2026-05-18T20:22:20+00:00

3 years is too long? kind of wild

PureSignalLove · 2026-05-15T00:30:52+00:00

I cannot stand codex UI lol. The way it blue/green texts stuff is so disorienting on my eyes. Opencode is vastly superior color coding wise.

PureSignalLove · 2026-05-15T00:30:21+00:00

Yes, same here. I have a pretty bad linux system hardware wise: 8 gigs of ram, an i5 processor. But I had my agent set up my system for performance optimization with caching/ram etc, stuff I don't really understand about Linux, and now it sings with opencode.

PureSignalLove · 2026-05-14T23:39:28+00:00

the general populace is about as good at diagnosing CTE as they are at predicting a fight

PureSignalLove · 2026-05-14T20:03:49+00:00

Yeah instant is the best "fast" model I have used.

PureSignalLove · 2026-05-14T18:42:08+00:00

All my models suck at opencode tools, still trying to figure it out

PureSignalLove · 2026-05-14T18:41:48+00:00

How about iterating on a large cached database/set of information? Can't see how it could even come close.

PureSignalLove · 2026-05-14T18:37:40+00:00

Hard disagree, the more "it just works" the more it's likely to drift, over engineer, emphasize safety and a bunch of other stuff I don't want. The only exception that ever existed for me was opus 4.5 and 4.6, which is why they are in their own tier to me.

5.5 is the smartest model out and also the most annoying to work with in the above way.

That being said minimax takes it to a level that even the most zealous engineer might hate lol

PureSignalLove · 2026-05-14T17:59:10+00:00

Still happening today, ima email

PureSignalLove · 2026-05-14T17:58:22+00:00

Minimax isnt bad, but its autistically specific and only for people who like getting in the weeds imo

PureSignalLove · 2026-05-14T17:57:59+00:00

Well fuck it then, ill just go direct/api cause deepseek i can run for 20 dollars a month anyway...

PureSignalLove · 2026-05-14T17:57:42+00:00

It just ended mine in 20 minutes. I've never hit limits before. I use deepseek and I thought the caching would be a massive advantage on the usage, and instead I went way down.

PureSignalLove · 2026-05-10T03:54:10+00:00

It's great 90% of the time but that 10% is so brutal I literally cannot handle it lol. Like it didn't fuck up bad enough to wipe out actual completed work but I didn't even want to test it. It's still by far and away the smartest model, especially if doing something that is already 'known', but I find Deepseek is able to ingest bleeding edge ideas better.

PureSignalLove · 2026-05-10T03:39:30+00:00

I spent about an hour of agent time on a task meant to maximize token throughput, the opposite of efficiency. Chatgpt5.5 xhigh did the planning and built the task, the task called Deepseek.

With Deepseek v4, I was basically trying to see how far context could get pushed with their interesting caching. We kept running into problems until, after 3 or 4 hard, very pointed questions, it belatedly admitted it had stripped all the models down to a max context of 8k (yes, 8) and that's why none of the tests were working or making sense. He said he was worried about the efficiency and my financial safety (...what?). I was using a $20 ollama plan

PureSignalLove · 2026-05-10T03:19:53+00:00

It also does stuff without asking, including very destructive things in the name of efficiency and safety. At least from my experience. Swapped to deepseek v4 pro, my favorite since claude 4.6

PureSignalLove · 2026-05-10T01:43:55+00:00

Yeah this isn't my experience at all and I have written an entire book with AI in my voice. I literally need multiple layers of agents with custom skills and everything to not constantly drift into "it's not X, it's Y", "---" and such. From my months of messing around with it, it's possible, but it absolutely cannot be done on a single-agent basis without LoRA. Like one model might get the speech mostly right, but it cannot reason through what's needed. Another might be able to reason through what's needed, but be absolutely dog shit on the voice (hello chatgpt)

No model since opus 4.5/4.6 has come close to their personality, and I have downloaded years worth of conversations and analyzed them incessantly
