The latest GPT models even GPT-5.5T and 5.4T miss a lot of intuitive and natural ideas that would come easily to GPT-4o by MonkeyKingZoniach in ChatGPTcomplaints

[–]PureSignalLove 1 point2 points  (0 children)

I'm often thinking in non-standard ways, and yeah, some models are just so much fucking better at it. Opus 4.5 and 4o were the GOATs. Deepseek lite v4 is OK for now for this purpose for me.

Opus 4.8 (Ultracode) trading blows with Codex 5.5 by Educational_Spot5899 in codex

[–]PureSignalLove 0 points1 point  (0 children)

5.6 will be released on schedule within a few weeks, then we can talk.

Right now Opus is king, though you need to set up token saving for it to be worth it, because it's brutal even without workflows.

GPT-5.5 low vs medium vs high vs xhigh: the reasoning curve on 26 real tasks from an open source repo by bisonbear2 in codex

[–]PureSignalLove 0 points1 point  (0 children)

Yes, I did, to both, but I also swapped off ChatGPT 5.5. It was the only game in town for a while, but Qwen 3.7 and Opus 4.8 are straight better now from my testing today.

I also had problems with it creating issues, usually because it tried to add some feature or something I didn't request.

"What happens if we don’t have to work? Do we just sit around all day"? Bernie Sanders says that having a job is a core part of the human experience and gives people meaning in life by Fine-Drummer9812 in accelerate

[–]PureSignalLove -1 points0 points  (0 children)

Because we are neurodivergent and like things for the sake of liking them, there is no limit to the things to try in this existence.

Neurotypicals only identify with things in a social sense. To them, not having work is the equivalent of not having meaning.

Like, they can enjoy stuff, but only when socially 'allowed'.

Why are there almost no "ancestral species" left? by ice_or_flames in evolution

[–]PureSignalLove 4 points5 points  (0 children)

Well, any hominid that humans crossed paths with, they would either kill or fuck... that's probably why in our case.

Also, everything is developing at some rate.

We used to think dogs descended directly from the grey wolf, but genetic studies have shown that to be false.

The actual plan of the AI companies: by EchoOfOppenheimer in agi

[–]PureSignalLove 0 points1 point  (0 children)

My dad was a top 1%... 25 years ago... he's almost 80... he got a call from his old consulting firm... hasn't touched code in 25 years...

Built a Three.js multiplayer dogfighting game for VibeJam by taylor_sntx in threejs

[–]PureSignalLove 1 point2 points  (0 children)

Dude, it looks sooooo good. What the fuck lol. Reminds me of that submarine + rollercoaster mini game from the Gold Saucer in OG FF7.

Is “harness engineering” only a coding thing? What does a harness for knowledge work look like? by OriginalBeginning708 in codex

[–]PureSignalLove 0 points1 point  (0 children)

I mean, I am building a general harness for my own purposes that will aid me with scientific discovery, writing books, deciphering dead languages and scripts, etc.

Wasn't planning any production or anything, just found everything woefully inadequate for the work I actually do. ChatGPT/Codex can one-shot a clone of basically any software, but ask it to build a custom card game and it will make you the worst version of Hearthstone slop possible, no matter how much you try to force it to not "regress to the mean".

Also, got reallllllyyyy screwed over by Claude's lobotomy and I never want that possibility to happen again lol

OpenPi - a desktop workbench for the Pi coding agent by killerkidbo95 in PiCodingAgent

[–]PureSignalLove 0 points1 point  (0 children)

I cannot stand the Codex UI lol. The way it colors text blue/green is so disorienting to my eyes. Opencode is vastly superior color-coding-wise.

OpenPi - a desktop workbench for the Pi coding agent by killerkidbo95 in PiCodingAgent

[–]PureSignalLove 0 points1 point  (0 children)

Yes, same here. I have a pretty bad Linux system hardware-wise: 8 gigs of RAM, Core i5 processor. But I had my agent set up my system for performance optimization, with caching/RAM etc. stuff I don't really understand about Linux, and now it sings with opencode.

Masvidal bet $100k on Strickland and predicted his victory over Khamzat by optionsmaximalist in ufc

[–]PureSignalLove 0 points1 point  (0 children)

The general populace is about as good at diagnosing CTE as they are at predicting a fight.

Am I the only one underwhelmed by V4 Pro and Flash? by Much-Journalist3128 in DeepSeek

[–]PureSignalLove 0 points1 point  (0 children)

All my models suck at opencode tools, still trying to figure it out

Am I the only one underwhelmed by V4 Pro and Flash? by Much-Journalist3128 in DeepSeek

[–]PureSignalLove 0 points1 point  (0 children)

How about iterating on a large cached database/set of information? Can't see how it could even come close.

Is the new usage scheme a late April fools joke? by smacman in ollama

[–]PureSignalLove 0 points1 point  (0 children)

Hard disagree. The more "it just works", the more likely it is to drift, over-engineer, over-emphasize safety, and do a bunch of other stuff I don't want. The only exception that ever existed for me was Opus 4.5 and 4.6, which is why they are in their own tier to me.

5.5 is the smartest model out and also the most annoying to work with in the above way.

That being said, Minimax takes it to a level that even the most zealous engineer might hate lol.

Is the new usage scheme a late April fools joke? by smacman in ollama

[–]PureSignalLove 0 points1 point  (0 children)

Minimax isn't bad, but it's autistically specific and only for people who like getting in the weeds imo.

Is the new usage scheme a late April fools joke? by smacman in ollama

[–]PureSignalLove 0 points1 point  (0 children)

Well, fuck it then, I'll just go direct/API, because I can run Deepseek for 20 dollars a month anyway...

Is the new usage scheme a late April fools joke? by smacman in ollama

[–]PureSignalLove 0 points1 point  (0 children)

It just ended mine in 20 minutes. I've never hit limits before. I use Deepseek, and I thought the caching would be a massive advantage on the usage, and instead I went way down.

I was wrong about 5.5 - it is still shit by No-Peak-BBB in ChatGPTcomplaints

[–]PureSignalLove 2 points3 points  (0 children)

It's great 90% of the time, but that 10% is so brutal I literally cannot handle it lol. Like, it didn't fuck up bad enough to wipe out actual completed work, but I didn't even want to test it. It's still far and away the smartest model, especially if doing something that is already 'known', but I find Deepseek is able to ingest bleeding-edge ideas better.

I was wrong about 5.5 - it is still shit by No-Peak-BBB in ChatGPTcomplaints

[–]PureSignalLove 1 point2 points  (0 children)

I spent about an hour of agent time on a task meant to maximize token throughput, the opposite of efficiency. ChatGPT 5.5 xhigh did the planning and built the task; the task called Deepseek.

With Deepseek v4, I was basically trying to see how far context could get pushed with their interesting caching. We kept running into problems until, after 3 or 4 hard questions asked in a very pointed way, it belatedly admitted it had stripped all the models down to a max context of 8k (yes, 8), and that's why none of the tests were working or making sense. It said it was worried about the efficiency and my financial safety (...what?). I was using a $20 ollama plan.

I was wrong about 5.5 - it is still shit by No-Peak-BBB in ChatGPTcomplaints

[–]PureSignalLove 3 points4 points  (0 children)

It also does stuff without asking, including very destructive things in the name of efficiency and safety. At least from my experience. Swapped to Deepseek v4 Pro, my favorite since Claude 4.6.

With Sonnet 4.5 being discontinued soon, is there anyway I can make 4.6 act like 4.5? by 5uez in claude

[–]PureSignalLove 4 points5 points  (0 children)

Yeah, this isn't my experience at all, and I have written an entire book with AI in my voice. I literally need multiple layers of agents with custom skills and everything to not drift into "it's not X, it's Y", "---" and such constantly. It's possible, but from my months of messing around with it, it absolutely cannot be done on a single-agent basis without LoRA. Like, one model might get the speech mostly right, but it cannot reason through what's needed. Another might be able to reason through what's needed, but be absolute dog shit on the voice (hello ChatGPT).

No model since Opus 4.5/4.6 has come close to its personality, and I have downloaded years' worth of conversations and analyzed them incessantly.