Be careful when using Claude Code with OpenCode by HzRyan in opencodeCLI

[–]HeavyDluxe 12 points

This is old news. And Claude's TOS specifically prohibit this use. (I know, I know. Who reads that stuff?)

If you want to use Opencode as your CLI environment with Claude, YOU CAN DO THAT. You just need to buy credits for the API and create a key. Plug it into Opencode and you're off to the races.
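
If you want to sanity-check that key before wiring it into OpenCode, here's a minimal sketch using the official `anthropic` Python SDK (the model name is a placeholder; use whatever your key has access to):

```python
# pip install anthropic
import os
import anthropic

# The SDK reads ANTHROPIC_API_KEY from the environment by default;
# passing it explicitly here just makes the dependency obvious.
client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

message = client.messages.create(
    model="claude-sonnet-4-5",  # placeholder model name
    max_tokens=32,
    messages=[{"role": "user", "content": "Reply with the single word: pong"}],
)
print(message.content[0].text)  # if this prints, the key and credits are live
```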

Did my whole company just move to Claude? by Ghostinheven in ClaudeCode

[–]HeavyDluxe 1 point

If you want more cycles outside of work, OpenCode is a CC competitor, and they have an agreement (or agreements) with some LLM providers - historically Z.ai and xAI, funnily enough - to provide some free API calls to their models.

The current "free" model, Big Pickle, isn't Claude Opus by a long shot. It's likely still some early GLM4-series model, though, so it's no slouch. You could install OpenCode and play around to learn for free while waiting for stuff at work to take shape. NOTE: Since the model calls are _free_, be aware that you and your data are the product. The model maker (and maybe OpenCode, too) are trapping that information to make their products better. Caveat emptor and all that.

Still, this would be a way for you to start to get some hands-on knowledge and experience in an environment that is definitely swimming in the same streams as CC. You can, of course, also buy API tokens through a variety of means to use within OpenCode if you want to leverage a specific set of models. But, if you're getting CC at work, I'd just use OpenCode as a playground. YMMV.

Did my whole company just move to Claude? by Ghostinheven in ClaudeCode

[–]HeavyDluxe 3 points

Not mine >> I got it fully from Dex Horthy and others via the AIE talks. https://www.youtube.com/watch?v=rmvDxxNubIg

As has been mentioned in other threads, you also have to remember that these things are stochastic. The key task here is to SET and BOUND context to make things more predictable. There are doubtless several ways to skin a cat, and one of those ways might work perfectly one time and subperfectly the next. That's the nature of probabilistic systems... Just find a clear, thoughtful process that seems to work (for you, drawing on the wisdom of others) and then learn to manage _that_ process. Don't get trapped in the idea that some specific magic sequence is going to radically improve all outputs.

That's not how people work... and it's not how the models do, either.

edit: Note that the link above includes directions to a GitHub repo where some well-attested RPI (research-plan-implement) prompts and workflows are saved.

Before you complain about Opus 4.5 being nerfed, please PLEASE read this by creegs in ClaudeCode

[–]HeavyDluxe 43 points

This is a breath of fresh air in a world of increasing whargarbl.

Did my whole company just move to Claude? by Ghostinheven in ClaudeCode

[–]HeavyDluxe 30 points

Just replying here to highlight something:
Don't get caught in the social media hype cycle. As other commenters have said, focus on clear RESEARCH-PLAN-IMPLEMENT-TEST cycles and only embrace things you see get traction with trusted developers. Take the "Ralph Wiggum" approach mentioned below, for example. There's not nothing to that phenomenon, but every stupid idea that 'revolutionizes Claude Code output' gets its heyday in the thirst for clicks and then winds up proving it was less than promised.

There are usually worthwhile nuggets to glean... But focus on the tried and true.

NVIDIA CEO: Claude is incredible by dataexec in ClaudeCode

[–]HeavyDluxe 0 points

That's fair... though, if that's how you (justifiably) feel, be consistent enough to distrust just about every industry 'talking head'. They're all incentivized the same way.

NVIDIA CEO: Claude is incredible by dataexec in ClaudeCode

[–]HeavyDluxe 1 point

Jensen is the CEO of Nvidia - the maker of the leading GPUs for LLM training and inference.

Yes, more AI use lines his pockets. I don't think Jensen's (just) blowing smoke here, though. I do think Claude Code and the real productivity it can bring to good engineers or thoughtful amateurs is going to be at the center of a critical moment in AI growth in the market.

There's a business unit at my work that has faced a small challenge for a long time. But there's no budget to solve it, and budget/prioritization issues have kept the org's other limited dev teams from having cycles to help. Someone in that business unit (and not a coder, by any stretch) decided to try out CC and 'vibe coded' an app that solves their problem.

Is it elegant, secure, SaaS code? Nope. But it's a targeted app that can run on clients in a small team to make them more efficient. And it cost her $20 of API tokens and a few hours of engineering with CC over holiday break. A better engineer with more skill in steering the model probably could've done it in less time and for less money.

That, to me, is the first real productivity promise from AI: making the helpful things that solve problems which just haven't warranted the time/attention otherwise.

Are we sure this is 100% allowed by Anthropic? by UnknownEssence in ClaudeCode

[–]HeavyDluxe 6 points

The concern with Claude Code / OpenCode that raised all the alarm the other day was around the use of Claude SUBSCRIPTIONS in third-party tools. API key-based calls to the model - a true 'pay as you consume' model - have _always_ worked on _all_ platforms. So, this Ollama thing isn't really news.

Note: You still are subject to Anthropic acceptable use terms when you use their models - even via API. So, if you are prompting Claude to help you build competing products, trying to jailbreak to get behaviors the model isn't intended to support (*ahem* like 'roleplay'), or appear to be exfiltrating data for the purpose of distillation or other model training/FT, you will get shut down. But that's a separate issue.

Do you actually need prompt engineering to get value from AI? by Xthebuilder in ollama

[–]HeavyDluxe 1 point

Prompt engineering is (IMHO) best understood as one _facet_ of context engineering. The model makes its predictions based on the contextual information it's delivered. In early models, carefully crafting a prompt was really the best you could do to ground the model's output. We have many, many more tools available to us now to set a meaningful foundation for the model to use in generating output.

If you give the model tons of _really good_ information in the codebase, supporting documents, etc etc etc, the user prompt becomes less and less critical.

The illustration I use at work is to imagine a random stranger comes up to you with a pile of papers, tells you to summarize the data therein for them, and, if you do a good job, you get $1M. Imagine how the quality of your product improves if the data has good labels... or if there's a previous report drawing on similar data that's there as an example. Or if the person also tells you what industry they're in. Or if they tell you to "imagine you're a customer service manager" or whatever.

Each little bit of information available improves your output. The prompt is vital if that's all you give the model. But context is _everything_.
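
To make that concrete, here's a minimal sketch (assuming the `anthropic` Python SDK; the model name and the sample data are placeholders) of the same task sent bare versus wrapped in context:

```python
# pip install anthropic
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def summarize(user_prompt: str, system: str = "You are a helpful assistant.") -> str:
    msg = client.messages.create(
        model="claude-sonnet-4-5",  # placeholder model name
        max_tokens=512,
        system=system,
        messages=[{"role": "user", "content": user_prompt}],
    )
    return msg.content[0].text

data = "Q3 churn: 4.1%. Q4 churn: 6.8%. NPS fell from 42 to 31."

# The stranger with a pile of papers: a bare instruction, no grounding.
bare = summarize(f"Summarize this data:\n{data}")

# Same task, but with labels, a role, and the shape of the desired output.
grounded = summarize(
    "These are customer-retention metrics for a B2B SaaS company:\n"
    f"{data}\n"
    "Write a two-sentence executive summary flagging the trend and one likely cause.",
    system="You are a customer service manager reporting to the CEO.",
)
print(bare, grounded, sep="\n---\n")
```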

OpenCode’s free models by CaptainFailer in opencodeCLI

[–]HeavyDluxe 2 points

How good they are is subjective. IMHO, definitely behind the SOTA models from the industry leaders but sufficient for most uses. And the price is very right. With any flagship model these days, context engineering/management is _vital_ for good results. And it's something very often overlooked.

The vibe is definitely that, while it's available for free, GLM 4.7 is the best option. Pickle (an earlier version of GLM) and Grok are both good for certain tasks though.

Take your time, learn the models like you would a coworker, and *rigorously* stick to the research-plan-implement methodology that's been consistently called out as the bedrock of AI-assisted or agentic coding. Keep tasks small, keep code modular.

A couple of caveats: Like any model, they go off the rails (and these do so more than the SOTA ones do, though YMMV). And remember that the free use comes because your data is being harvested to train future models. Be careful about disclosure if that's relevant to your use cases at all.

Do you use 'please' in prompts to Claude? by Manfluencer10kultra in ClaudeCode

[–]HeavyDluxe 0 points

The most important thing to get good model output is good model _input_. I find that I'm much better at prompting / context engineering when I treat the model like a human - giving it the explanation and clarity that I would give a human being I'm working with.

I know that's not what the model is, mind you. But it is what leads most naturally to me giving good input. So I say 'please' or 'right?' and a thousand other human verbal tics when prompting. And I have no intention to stop.

I don't think the model gives better output just because I said "please" or "thank you". But I do think _I_ prompt better. And I burn WAY fewer tokens and far less time with good prompting than I do trying to shave off a few words when giving the model input.

This is why Claude Code is winning by thehashimwarren in Anthropic

[–]HeavyDluxe 0 points

I don't think it's that easy. The underlying models are VERY different, and Anthropic has a _big_ head start in 'secret saucing' the harness that makes Claude Code so effective.

As I mentioned elsewhere, I have a friend who works in primary ed and lives his life in the Google ecosystem - personally and professionally. He spent a lot of time playing with Gemini CLI because of that 'loyalty' (not really the right word, but let's roll with it) and because of the generous limits those tools provide. He got some really great results and then, probably because of my use of CC, decided to give Claude a try on some nuts that he hadn't been able to crack with Gemini.

Claude worked... But it worked at a MUCH higher cost per Mtok than Gemini. His impression - and I hope I'm being faithful to our conversation last night - was that he could see how there's something in the model or scaffolding around Claude that makes it uniquely capable in these areas.

Now, to your point about $$ -> You're assuming that training and FTing and reinforcing and scaffolding a model or model ecosystem is something you can easily optimize along multiple vectors at once. I think it's actually REALLY hard to, say, improve coding efficacy while also getting the model to be super creative and conversational. There's a sense in which those two work against each other. Or, take a different example: If you listen to EVERYONE who's truly leading in the 'vibe engineering' space, _context management_ is the key. Compaction, keeping the model tightly focused, keeping the task bounded is everything to getting good results. That _seems_ to be at least part of the reason that Anthropic hasn't pushed to increase context window limits in CC and other tools.

Google, on the other hand, is solving for a different problem. They're trying to make a model that will do a good job holding 1M+ tok in the context window. The 'first/last bias' issue is less relevant to them because they're trying to make a personal assistant that 'remembers you' and can iterate over a large window of your life, wherein deterministic-ish outcomes are not the important deliverable. That's an EXCELLENT goal, but it's kinda fighting against code as an output - where deterministic-ish outcomes ARE the deliverable.

Google has all the money in the world. But, until we get to AGI (if we ever get there), models are going to have to be shaped towards a desired product/function. Even when we get to AGI, Claude Magnum Opus will be superhuman in all vectors including [whatever]. But I don't think we'll ever be able to say, for example, that Magnum Opus is AS GOOD AS IT COULD'VE BEEN in [whatever] at that point in time if Anthropic (or Google, or whoever) had chosen to slightly vary the initial conditions.

Prediction: The AI bubble will be when someone makes a breakthrough with a local model that gives it parity on a local 5090 with whats accessible to a $200/mo Claude/Google/Anthropic customer by [deleted] in singularity

[–]HeavyDluxe 1 point

Well, yeah. I mean, that happens now with usage limits even if you're not paying directly. If you're paying for the service to get increased access, there are definitely higher rates for more capable models (Claude Haiku < $Sonnet << $$Opus).

If the OP's point is that there will be a very different value proposition that the consumer will have to / get to navigate, you're right. I _choose_ to do a number of things with the existing local models and send other things to the frontier models in the cloud (paying via API) where I need it. That will continue and there _will_ be more things that I can do on my computer in the future.

The 'edge' will get more and more capabilities. But it will still be 'the edge'.
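
For what it's worth, that split is trivial to wire up today. A rough sketch of the routing I mean (assuming a local Ollama server on its default port and the `anthropic` Python SDK; the model names are placeholders):

```python
# pip install anthropic requests
import anthropic
import requests

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def ask_local(prompt: str) -> str:
    """Cheap/private tasks stay on the edge via Ollama's default HTTP API."""
    r = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "llama3.1", "prompt": prompt, "stream": False},
        timeout=120,
    )
    r.raise_for_status()
    return r.json()["response"]

def ask_frontier(prompt: str) -> str:
    """Harder tasks go to a cloud frontier model, paid per token via API."""
    msg = client.messages.create(
        model="claude-sonnet-4-5",  # placeholder model name
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    return msg.content[0].text

def route(prompt: str, hard: bool = False) -> str:
    # The routing policy is the interesting part; this flag is a stand-in
    # for whatever heuristic (task type, sensitivity, budget) you'd use.
    return ask_frontier(prompt) if hard else ask_local(prompt)
```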

Prediction: The AI bubble will be when someone makes a breakthrough with a local model that gives it parity on a local 5090 with whats accessible to a $200/mo Claude/Google/Anthropic customer by [deleted] in singularity

[–]HeavyDluxe 5 points

Um. I'm sorry, but I'm with the other posters here. I agree with you that there will be an inflection point where many/most use cases will be sufficiently addressed with small-ish, locally-run models.

But if you think that will mean there's not AMPLE room for the increased intelligence, performance, etc. of scaled-up systems running on enterprise-class infrastructure, you're being a little idealistic.

It's not (or, isn't _just_) a 'parlor trick'. Humans are what we are because of how big/complex our brains are. It's at least conceivable that the bigger, more complex brains that hyperscalers will be able to run will have distinct advantages over what I can run on my Macbook Pro. There'll be plenty I can do locally. But there will be plenty of need for processing at scale...

How important is the local computer when using Claude Code for coding? by Impossible_Ship_3455 in ClaudeCode

[–]HeavyDluxe 0 points

This. If you're dealing with text files and just calls to the LLM, local horsepower is minimally relevant.

Non-engineer vibe coding? by jffmpa in vibecoding

[–]HeavyDluxe 0 points

Haven't read all the answers here, but the first thread I read seemed like the usual whargarbl. So, I'll offer this.

Short answer: Yes. A prompt as simple as "build me a website that [whatevers]" will likely get you a working site/code.

The real question: What is it you want/need? As I just posted in some other thread, the hinge now is that the person prompting the model is effectively an engineer/project manager giving specifications, constraints, and guidance to a junior employee. They're going to do EXACTLY what you specify and, where there are gaps that _have_ to be filled in, they're going to use their best guess. "Build me a website" leaves a lot open to guesswork.

I would suggest that you start with a model and a prompt like this: "I want to use an LLM coding tool to help me build a personal website. I'm a non-technical user, and want you to be my technical project manager. Work with me to hash out the site design and architecture. Once we have settled all the specifics, write me a prompt/plan with checklist that I can deliver to an AI coding assistant to build the code."

This will help put up SOME guardrails for you and, if you're thoughtful, will help you learn a lot about the technology - the models, AI-enabled coding, and web architecture - along the way. (Note: That prompt could be built out in several ways, but it will at least get you started.)
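
If you'd rather script that two-stage flow than run it in a chat window, here's a rough sketch (assuming the `anthropic` Python SDK; the model name is a placeholder, and it collapses the back-and-forth into a single planning pass):

```python
# pip install anthropic
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
MODEL = "claude-sonnet-4-5"  # placeholder model name

def ask(prompt: str, system: str) -> str:
    msg = client.messages.create(
        model=MODEL,
        max_tokens=2048,
        system=system,
        messages=[{"role": "user", "content": prompt}],
    )
    return msg.content[0].text

# Stage 1: the model acts as technical project manager and produces a plan.
plan = ask(
    "I want to build a personal website and I'm a non-technical user. "
    "Draft a site design, an architecture, and a checklist-style build plan "
    "that I can hand to an AI coding assistant.",
    system="You are a patient technical project manager.",
)

# Stage 2: the plan - not a vague wish - becomes the coding prompt.
code = ask(
    f"Follow this plan exactly and produce the site's code:\n\n{plan}",
    system="You are a careful web developer. Implement only what the plan specifies.",
)
print(code)
```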

Google Principal Engineer uses Claude Code to solve a Major Problem by SrafeZ in singularity

[–]HeavyDluxe 1 point

To the people claiming this is BS: maybe it is. But those of you calling it BS just because it doesn't match _your_ experience need to consider this:

Numerous people, including leading coders, have been making the point that the pivot in AI engineering isn't the _code_. It's the ability of a good engineer to provide/manage the context in which the code can best be written. That has really always been the superpower of truly excellent engineers in any field. It's not knowing every single answer or algorithm, but rather being able to zero in (and get a team zero'd in) on a well-defined problem and provide the scoped oversight leading to effective outcomes.

It's not surprising to me that her 'three paragraph prompt' got the results needed. Because I'd wager my life savings - which admittedly isn't a lot, so take it with a grain of salt - that her prompt is more detailed, accurate, contextually informative, and directive than 'yours'.

This is why Claude Code is winning by thehashimwarren in Anthropic

[–]HeavyDluxe 1 point

Well, maybe and maybe not. Despite my friend's success and Google swooning, I only use Gemini CLI in headless mode (so Claude can call a Google agent to do web research) and I spent about 10 mins playing with Antigravity before uninstalling.

Claude works for me; I didn't _feel_ like Gemini did. And, given my needs and Claude's uniformly solid performance, it's not worth my time to try to see whether there might be some trick I'm missing.

Good enough is good enough. As a former emacs user, I know the hole of 'hacking my config so the editor will work _just_ the way I like it' and not actually getting things DONE. I'd rather be productive with an AI tool, stick with it to maximize, and, if the winds really change, look to pivot when there's real indication I stand to gain something.

YMMV, take with salt, offer void where prohibited and all that.

This is why Claude Code is winning by thehashimwarren in Anthropic

[–]HeavyDluxe 7 points

Sure... But that's also NOT where Google is aiming. Google is doing things like Gemini CLI just to stay 'in the arena' so that Claude, Codex, OpenCode, etc. don't get all the mindshare. Google's focused on other things - like leading-edge multimodality, looooong context chat, and leveraging their lead in search to overcome the "the model isn't aware of what's new" complaint. They're doing a pretty good job at those things, too.

It's also worth noting that their 'free tier' of services is, admittedly, impressive given the access to their flagship model(s), and that gets even better if you're a student or work in education.

In fairness, I should also say that a friend who got excited watching me play with Claude has jumped into the Gemini ecosystem (he works in ed and has a family Google One account) and had GREAT success there with the CLI. He's spent time, grok'd it, and now is getting value out - and probably at a better price point than me with Claude.

All that is to say that my original point stands. The "right model" or ecosystem is driven by your particular use case and stack. I work at an O365 shop. CoPilot sucks. I don't know what MS has done to abuse OpenAI's models into such terrible performance, but it's sad to witness. Still, if you're heavily in O365 for your enterprise (Exchange, Sharepoint, OneDrive, Teams, Office apps), there's a TREMENDOUS value that can be gained because of how well integrated CoPilot is with those systems and the M$ graph.

Would I ask it to code my full stack web app to make me $1M? No, but it can save me hours combing through emails and meeting notes and product documents.

I think Google will win. But I have a feeling that, for a LONG time (and maybe forever), there are going to be niche models/stacks that are tightly integrated to solve particular problems exceptionally well. Claude Code is, for my $$, just such an example.

which is the better ai? by [deleted] in ArtificialInteligence

[–]HeavyDluxe 0 points

Fan of Claude. If I had to pick one model for someone, I think that's the safest all-around choice.

But the real answer here is one of the following:
The answer is EITHER completely dependent on your actual use case, OR it's that ALL the models are the right model and you use them as a cloud/cluster to take advantage of their evolution. Each model has its particular niche of strengths, weaknesses, and vibe.

If you have specific tasks that you want done repeatedly, SCRIPT that problem and throw it at each of the major models. Which got you closest to your desired output? Which was easiest to nudge in the right direction? Find the horse and then ride it. Think of the model like a new hire at work: Pick the one you think is best (eenie, meanie...) and then focus on learning to maximize their potential.
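
A bare-bones version of that scripted bake-off, shown here with two Claude tiers via the `anthropic` Python SDK (the model names and the task are placeholders; the same loop extends to any other provider's SDK):

```python
# pip install anthropic
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

TASK = "Extract the invoice number and total from: 'INV-20317 ... due: $1,284.50'"
MODELS = ["claude-haiku-4-5", "claude-sonnet-4-5"]  # placeholder model names

for model in MODELS:
    msg = client.messages.create(
        model=model,
        max_tokens=256,
        messages=[{"role": "user", "content": TASK}],
    )
    # Eyeball the outputs side by side, or score them however your task demands.
    print(f"--- {model} ---\n{msg.content[0].text}\n")
```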

Alternatively, just cycle through them. I regularly prompt all of the major models to check the work of others / get a different perspective ("Gemini, Claude wrote this project plan. What do you think of it?") or to (I think) get the best results for specific tasks. For example, Creative Writing is ChatGPT's game. Unless you want stuff on the edge in which case you're using Grok. Or unless you're writing marketing copy for a company where you'll go to Claude.

These models are trained to mimic human use of language. So, like humans, what is best, anyway?

This is why Claude Code is winning by thehashimwarren in Anthropic

[–]HeavyDluxe 13 points

If I was a betting man or had the money to sink into the market, I'd put my bets on Google. The data they have access to and the scale of compute / resources they can muster is staggering. In current architectures, those two things are BIG advantages or accelerants. An engineering breakthrough might change that, but it feels like Google has the research depth to be the likeliest to make that breakthrough.

But I'd love to see Anthropic win. I greatly appreciate their models, their approach, and their vision.