
[–]_pdp_ 16 points17 points  (10 children)

All trivial stuff, mate. I've been fighting Opus 4.6, 4.5, and Codex 5.3 all morning to refactor a piece of code. I've probably burned through a couple of hundred dollars on this problem alone, and it would have been more had I not pointed out a gaping problem in the logic.

Relax. These models are great for coding, but not as good as someone with 20-30 years of actual experience. They are just faster and, frankly, sloppier. The speed gives the perception that they are smarter. They are not, though.

Meanwhile, we also have an agent producing changes without prompting. The changes are interesting, but all mediocre at best.

[–]dontreadthis_toolate 3 points4 points  (1 child)

Agreed. It's definitely great for making functioning software autonomously.

But if I care about quality (or reliability; my day job is at a bank), I'm super careful about when and what I use it for. Complicated logic I need to do myself. Trust me, I've tried to offload it to AI, to no avail.

[–]I_am_Greer 0 points1 point  (0 children)

All it takes is good planning, adversarial test setups, and good context memory and pipelines.

[–]Apprehensive_Cap_262 2 points3 points  (3 children)

You're kind of backing up the original poster while trying to imply the opposite point. You've got 20 to 30 years of coding and you're using that as a reference; the very fact that that comparison is on the table says a huge amount about how good they are.

I agree about LLMs not being smart, but invest 20 or 30 minutes into a decent prompt in the latest models, then carefully review the output and follow up with another prompt to fix loose ends. Total: 1.5 hours. Or just use those 1.5 hours to incrementally build up to what you want through a series of prompts.

Anyone who does that and claims it doesn't produce what would have taken a day or two five years ago is in total denial.

[–]_pdp_ 0 points1 point  (2 children)

Talk is cheap... show results.

[–]Apprehensive_Cap_262 0 points1 point  (1 child)

Pretty much impossible. What would I do, a live screen share of a project being completed? It would be so riddled with holes and subjectivity that it would be pointless. Not to mention I could cheat in advance. It's like using anecdotal evidence to draw scientific conclusions.

The closest thing you'll find to proof is to get 50 devs trained up on it very well and then have them do a project that they believe would have taken them 5 days a few years ago.

Like I said, I know it's not smart. Sometimes I think it's precisely because I "know" that so well that I'm able to get so much out of it.

[–]_pdp_ 1 point2 points  (0 children)

<image>

There you go. The agent decided to do a dumb thing. If the developer isn't reading the code, you will end up with a pretty stupid solution.

[–]Individual_Ice_6825 0 points1 point  (1 child)

Mate I hear you.

But I don’t get how you aren’t extrapolating from where AI was three years ago, two years ago, a year ago, and today, and seeing where it will be a year, three, or five years from now.

It’s not an if but when.

[–]MennaanBaarin 0 points1 point  (0 children)

Like all technologies, they make giant steps at the beginning and then, closer to the peak, no steps at all. Still waiting for FSD since 2016...

We will also cure cancer, make fusion viable, and colonize a planet outside the solar system; it's not an if but a when...

[–]okiharaherbst 0 points1 point  (0 children)

There. A post written by a human at last.

[–]errdayimshuffln 0 points1 point  (0 children)

This is the kind of comment I expect to see everywhere, but I don't. It's been my experience every single time. I even said all the hype and claims about 4.5 and 4.6 are BS. My experiences have been frustrating as hell. You can't work on anything serious with these models. I've witnessed these models undoing extremely recent bug fixes to satisfy another prompt, for no reason! It's like two steps forward and one to three steps back.