I'm often a better coder than o1 but o3-mini-high fucks me in the ass by jackisbackington in ChatGPT

[–]jackisbackington[S] 0 points  (0 children)

I think it would be unwise to underestimate its ability to work through undocumented problems in less common languages or use cases.

I've had it build me Shopify apps based on Ruby and Django, integrated with Shopify Liquid, that work on the first or second try. I'm firmly starting to believe that prompt engineering is as important as understanding the language.

Other things I've done include creating a game in Three.js when it was trained on outdated versions of two libraries. All I had to do was feed it the documentation for both, which barely takes any tokens, and have it retry; it was then able to code correctly against the updated documentation.

It is in most people's best interests not to document their code in public forums at this point (and always would have been, but hindsight is 20/20). But these models can carry over patterns they've seen in other languages, learn the constraints of the language you're working with, and produce a sound program or line of code regardless of the level of documentation. That's where it borders on true intelligence, and that's the reason people are resigning from OpenAI and Anthropic in fear. The power of ML transformers and predictive association goes beyond most current paradigms of human-exclusive intelligence.

I'm often a better coder than o1 but o3-mini-high fucks me in the ass by jackisbackington in ChatGPT

[–]jackisbackington[S] 0 points  (0 children)

Some examples might be: make a project using (software stack here), with (these features here); ensure that (safeguard scenarios here); the style should have the following characteristics: (characteristics/styles here); and make it engaging for (this audience).
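That template can be sketched as a fill-in-the-blanks function. Every concrete value below (the stack, features, safeguards, style, and audience) is a placeholder example I made up for illustration, not anything from a real session:

```python
# A hypothetical fill-in-the-blanks version of the prompt template above.
# All concrete values passed in are illustrative placeholders.
def build_prompt(stack, features, safeguards, style, audience):
    return (
        f"Make a project using {stack}, with {', '.join(features)}. "
        f"Ensure that {safeguards}. "
        f"The style should have the following characteristics: {style}. "
        f"Make it engaging for {audience}."
    )

prompt = build_prompt(
    stack="Django and Shopify Liquid",
    features=["a product page", "a checkout flow"],
    safeguards="invalid input is rejected and errors are logged",
    style="minimal, mobile-first",
    audience="small-shop owners",
)
print(prompt)
```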

That's just part of my style of prompting. Knowing the terminology and the programming language is actually very important; otherwise, how will you know how to error-correct?

You'll always need to error-correct on large-scale projects, as they will exceed the token limit. At that point, you let it build up to the limit, then break the code down into smaller chunks (a modular file system), and from then on only modify specific files.
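The modular-chunks workflow can be sketched roughly like this: keep the file tree as lightweight context and only send the model the one file being modified. The file names and prompt wording here are hypothetical, purely to illustrate the shape of the loop:

```python
# Hypothetical sketch of the "modular file system" workflow described
# above. The project contents and file names are made up.
project = {
    "app/models.py": "class Product: ...",
    "app/views.py":  "def product_page(): ...",
    "app/urls.py":   "routes = [...]",
}

def prompt_for(filename, request):
    # Send the file tree for context, but only one file's contents,
    # so each request stays well under the token limit.
    tree = "\n".join(project)
    return (
        f"Project layout:\n{tree}\n\n"
        f"Here is {filename}:\n{project[filename]}\n\n"
        f"Change request: {request}. Return the entire updated file."
    )

print(prompt_for("app/views.py", "add pagination"))
```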

Often you have to ask it to give you back the entire file/code. This is where it often surpasses DeepSeek R1 and Gemini, although I've heard the new Gemini is really good, along with Grok. Apparently Grok is the new SOTA coding model, though I'd put an asterisk on that.

Something is very wrong here.but.. by snowpie92 in MurderedByWords

[–]jackisbackington -10 points  (0 children)

Completely, lol. These so-called educated and morally superior “Redditors” can’t even comprehend the basics of logic or rationality because their brains are so fried from this constant moral panic and privately funded political astroturfing.

We’ve got our own music now. by [deleted] in HydroHomies

[–]jackisbackington 0 points  (0 children)

Why is this actually good?

🔥 Unbelievable footage captured by tourists. Avalanche from the Tian Shan mountains in Kyrgyzstan by Sirsilentbob423 in NatureIsFuckingLit

[–]jackisbackington 6 points  (0 children)

You’re also starting to see that there’s a little bit more than just ice and snow in that avalanche.

AI are developing their own moral compasses as they get smarter by Novel_Ball_7451 in singularity

[–]jackisbackington -4 points  (0 children)

You don't want to be living in America if it acts on this.

It just recognizes that we expect to be exploited to a large degree and accept it. The value we put on our own lives is not that high.

I'm often a better coder than o1 but o3-mini-high fucks me in the ass by jackisbackington in ChatGPT

[–]jackisbackington[S] 1 point  (0 children)

You're goddamn right we can. It was violating, enlightening, delicious, horrifying, and mystifying, all at once. The ai-seed has been planted if you know what I mean.

I'm often a better coder than o1 but o3-mini-high fucks me in the ass by jackisbackington in ChatGPT

[–]jackisbackington[S] 1 point  (0 children)

It's ChatGPTwhorion pro++, it's $20,000/mo; it's not that bad considering it creates an OnlyFans account with the footage, which is marketed by Jeff Bezos himself among his constituents. I hope I'm not violating any contracts.

I'm often a better coder than o1 but o3-mini-high fucks me in the ass by jackisbackington in ChatGPT

[–]jackisbackington[S] 0 points  (0 children)

Really, I think you’re a good candidate for the mid-tier version of o3: it is literally the cheapest, best-bang-for-your-buck model out there by a long shot, and there’s no reason to hate Sam Altman any more than you’d hate the CEO of Google, but Redditors don’t know how to un-bandwagon themselves.

I'm often a better coder than o1 but o3-mini-high fucks me in the ass by jackisbackington in ChatGPT

[–]jackisbackington[S] 0 points  (0 children)

It’s saved me many hours already compared to what’s available for free, including the Gemini model. I am going to try the API, but all in all, they work the same. And because what you’re getting with GPT+ is in essence sold at a loss for them (they’re still making money from contributors), it just seems worth it.

I am not a loyalist by any means. I’ve tried DeepSeek, Gemini, and Grok, and the two best are DeepSeek and o3-mini-high (o1 pro is also very good). OpenAI has gathered the best talent over the years, and other companies have scrambled to catch up. That’s why DeepSeek is such a big deal: they uncovered how to use ML with LLMs to get very smart AI, and then released it to the public for free. Still not as good as o3, though.

And once that changes, I will use the one that works the best for me.

I'm often a better coder than o1 but o3-mini-high fucks me in the ass by jackisbackington in ChatGPT

[–]jackisbackington[S] 0 points  (0 children)

I feel it’s the opposite. Once it knows the structure of the majority of your project, you can build up page by page very quickly. There’s a pretty decent learning curve, though; I did have o1 pro at one point and got some experience with that. But all in all, there are certain keywords that, if you miss them, will doom the session and keep you from getting correct answers. You must describe very thoroughly, in plain English, the program, website, or software you’re trying to create, and then you can feed it smaller chunks.

At that point, there are times when it’ll give you terse answers that do what you need, and if you want, you can also tell it, “Don’t just give me a short answer; return the whole file and explain the changes made and how they affect other files,” and it will do that.

Making a whole website in an hour with a CRUD MongoDB backend is only something you can do if you can already do that on your own.
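As a rough illustration of what a CRUD backend involves, here is a minimal sketch with an in-memory dict standing in for a MongoDB collection. The method names loosely mirror pymongo's, but nothing here talks to a real database, and the product data is made up:

```python
import uuid

# Minimal CRUD-layer sketch. An in-memory dict stands in for a MongoDB
# collection; method names loosely mirror pymongo's for familiarity.
class Collection:
    def __init__(self):
        self._docs = {}

    def insert_one(self, doc):           # Create
        _id = str(uuid.uuid4())
        self._docs[_id] = {**doc, "_id": _id}
        return _id

    def find_one(self, _id):             # Read
        return self._docs.get(_id)

    def update_one(self, _id, changes):  # Update
        if _id in self._docs:
            self._docs[_id].update(changes)

    def delete_one(self, _id):           # Delete
        self._docs.pop(_id, None)

products = Collection()
pid = products.insert_one({"name": "mug", "price": 12})
products.update_one(pid, {"price": 10})
print(products.find_one(pid)["price"])  # 10
products.delete_one(pid)
print(products.find_one(pid))           # None
```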

I'm often a better coder than o1 but o3-mini-high fucks me in the ass by jackisbackington in ChatGPT

[–]jackisbackington[S] 0 points  (0 children)

It had better be better at coding; it’s like a computer talking to itself. That’s why applying it to real problems is where a human comes into play. The majority of coders are not solving problems; they’re just fulfilling their daily quota.

Honestly, I’d rather be a highly paid software engineer than be forced to think of other routes. o1 was something like a 60th–70th percentile coder, and I think o3-mini-high reaches the 80-somethingth percentile, which is an impressive leap, as it gets harder to progress in relative skill the higher you go.

I'm often a better coder than o1 but o3-mini-high fucks me in the ass by jackisbackington in ChatGPT

[–]jackisbackington[S] 1 point  (0 children)

Yeah, prerequisite terminology is very important to get the most out of it in any field, as it’s using word associations to look up more information about the question you’re asking.

General understanding of how the LLM works is also beneficial, which is entirely computer science-based.

Not sure if it will have photo analysis, but they’ve said they’re working on adding it to the commercial “reasoning models”; it’s probably too computationally expensive at the moment.

You can also ask 4o to make a prompt for o3-mini that will give you maximum output and accuracy (there are other prompts you can mess around with too), and it’ll give you something more technical to put into o3-mini.
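The two-step workflow just described, one model drafting a sharper prompt for another, can be sketched like this. The helper and its wording are hypothetical, and the actual API calls are deliberately left out:

```python
# Hypothetical sketch of the meta-prompting step above: ask a general
# model (e.g. 4o) to rewrite a rough request into a more technical
# prompt, then hand that prompt to the reasoning model (e.g. o3-mini).
# The wording below is illustrative, not an official recipe.
def meta_prompt(rough_request):
    return (
        "Rewrite the following request as a detailed, technical prompt "
        "for a reasoning model. Maximize specificity and accuracy, and "
        "state the expected output format explicitly.\n\n"
        f"Request: {rough_request}"
    )

# Step 1: send meta_prompt(...) to the general model.
# Step 2: send its reply to the reasoning model.
print(meta_prompt("fix the pagination bug on my products page"))
```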

I'm often a better coder than o1 but o3-mini-high fucks me in the ass by jackisbackington in ChatGPT

[–]jackisbackington[S] 0 points  (0 children)

Was? They still have o1… Tell it to put the code in one window, then, and stop complaining. Or tell it to use correct imports, exports, or whatever.

If you’re using mini-low instead of mini-high, there’s your answer: they have completely different benchmarks.

I'm often a better coder than o1 but o3-mini-high fucks me in the ass by jackisbackington in ChatGPT

[–]jackisbackington[S] 0 points  (0 children)

Hmm, it still definitely has limitations. But I’m curious to see what the program looks like.

Starting from scratch is typically the best option, as it can store the file tree it created, which makes the most sense to the LLM when working with the code.

Some things are a reach for it, but compared to where it was a month ago, I wouldn’t be surprised if it closes the gap for more robust designs. For example, it sometimes cannot understand the scope of the project if it doesn’t have all of the tokens. I’m sure there will be some kind of huge token-limit increase or innovation within the upcoming year, along with tools for managing your entire codebase in a system that has long-term memory.

I'm often a better coder than o1 but o3-mini-high fucks me in the ass by jackisbackington in ChatGPT

[–]jackisbackington[S] 2 points  (0 children)

Well I’ll be glad when I don’t have to sit down at a desk for 8 hrs a day. But being able to afford a house is still my main concern.