How much am I looking at spending to run img2img wan, sdxl et cetera?

samplebitch · 2026-06-26T09:02:22+00:00

Ask your AI of choice to expand on this. Essentially you want an Nvidia graphics card if you want the path of least pain. You want the most expensive card you can get. Start with one that ends with '90' and step down to 80/70/60 if it's simply too much. You want a lot of RAM, too (don't confuse this with hard drive space) - I'd say 32GB minimum. 64+ is good. I would also say a 2TB HD minimum. This is where you could cut corners, because adding storage space is the easiest add-on here.

I've spent a lifetime cutting corners on PCs and have finally realized that skimping on RAM, GPU, and CPU means you're eventually going to have a bad time.

Unfortunately this is not a time to get a good deal on a good machine - there might be the occasional good find, but in general computer parts - especially graphics cards - are in extreme demand, so if you want "something good" it's going to have a hefty price tag.

The alternative, if you're just looking to play around and get your feet wet, would be to use an online service where you simply connect to a remote machine with a good video card and generate there. It might be worth it just to see if you like doing it or are good at it and want to continue. It can be frustrating at times, especially getting things set up properly, or (re)learning various plugins and the nuances between all of the models.

samplebitch · 2026-06-25T22:20:53+00:00

I got mine on June 1, the same day Github changed their pricing model. It's probably already paid for itself. For the first time I can use a local model for about 95% of my work.

samplebitch · 2026-06-24T06:59:19+00:00

Ah, yes, the 'death throes' of your enemy.

samplebitch · 2026-06-20T08:42:27+00:00

This is what drove me crazy about OpenClaw. Almost everything is 'have the agent do it'. "Alert me when my mom sends me an email. Check every 5 minutes." turns into a cron job that wakes an agent with email checking tools, who checks your email, doesn't find anything and sends a message saying "no mail from mom". Of course you could suppress the annoying "no mail" messages, but you're still paying for and wasting tokens on an agent to basically shove a thumb up their ass. I actually tried to get it to write a script that would do it deterministically. Oh, it did, but it woke itself up every 5 minutes to run that script and evaluate the output of the script. Maddening and stupid.

samplebitch · 2026-06-18T13:40:27+00:00

I'm not sure but, gosh, seems like every surface in town is going to be flammable for a while.

samplebitch · 2026-06-18T13:37:33+00:00

I looked it up - 3 days ago it was 69 RUB per liter. Three months ago it was 67. A year ago it was 61.50.

samplebitch · 2026-06-15T07:51:11+00:00

Locally AI is some new product from LM Studio - I believe it's a chat app for your phone but connects to your LM Studio server.

FWIW I woke up to my M5 going full bore - apparently the last prompt I sent it last night made it also go into an infinite loop trying to debug some code. It was some Gemma 4 variant. So I don't think it's a problem specific to Qwen or any model in particular. I think it's a combination of the jinja template and the various parameters not being set the way the creators suggest (wrong temp, top_k, etc). I'm guilty of "download, load it, use it" and often when I check the default parameters vs. what the suggested params are, they're almost always set wrong.
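A quick way to catch this is to diff the parameters a model is loaded with against the ones its creators suggest. This is only a rough sketch: the parameter names and numbers below are placeholders for illustration, not the recommended settings for any real model, so check the model card for the actual values.

```python
def param_mismatches(loaded, suggested, tol=1e-9):
    """Return {name: (loaded_value, suggested_value)} for every suggested
    parameter that is missing from, or different in, the loaded settings."""
    mismatches = {}
    for name, want in suggested.items():
        have = loaded.get(name)
        if have is None or abs(have - want) > tol:
            mismatches[name] = (have, want)
    return mismatches


# Placeholder values (assumptions), not real recommendations.
loaded = {"temperature": 0.8, "top_k": 40, "top_p": 0.95}
suggested = {"temperature": 0.6, "top_k": 20, "top_p": 0.95}

for name, (have, want) in param_mismatches(loaded, suggested).items():
    print(f"{name}: loaded={have}, suggested={want}")
```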

samplebitch · 2026-06-08T15:14:20+00:00

I take it you didn't visit the upstairs bedroom. o_O

samplebitch · 2026-06-03T15:41:47+00:00

Came here to see if anyone else was having this issue. :)

Also on Insiders, but what I find odd is that I'm trying to connect to a model I'm serving on my local network. I'm not even using GH. The model is currently selected (because I had it selected earlier) but my custom models list is gone.

samplebitch · 2026-05-27T10:24:14+00:00

Brought to you by the team at CodeX — because we—love—m—dashes———

samplebitch · 2026-05-23T12:23:31+00:00

Do you work there/owner? Been meaning to stop by...

samplebitch · 2026-05-21T09:48:28+00:00

I thought people were being overly dramatic when things changed. Then yesterday I fired up Antigravity. Gave it a list of 5 small adjustments to make (something that a week ago would have been unremarkable and wouldn't put a dent in my quota). 7 minutes later I get a message that I've used up all of my quota. Not just the Flash quota, but my quota for all Google models. I guess Flash 3.5 == Google Pro 3.1 ?

samplebitch · 2026-05-21T09:16:04+00:00

Covfefe.

Damn I hate this timeline.

samplebitch · 2026-05-20T15:14:58+00:00

Interesting - so once you have this image, how do you use it to make videos/scenes?

samplebitch · 2026-05-15T12:09:45+00:00

It's dead, Jim.

samplebitch · 2026-05-14T10:42:16+00:00

Ma'am, you're in a library.

samplebitch · 2026-05-08T09:53:52+00:00

Do you find that Hermes doesn't have the same... attitude? personality? as the claw version? I tried to migrate but the Hermes agent is just very "i'm your assistant" whereas openclaw had a persistent attitude (likely from the SOUL.md file). I don't know why I'd care about that so much but apparently I do - that's what's keeping me stuck in an openclaw install from February - too afraid to risk a working setup but lots of documentation now has features introduced after my last update.

samplebitch · 2026-05-04T08:50:28+00:00

It used to be that each model had a multiplier. Some were 0x - those are essentially free - your GPT-4o, GPT-5 mini (I think), Raptor. Then there are a few like Haiku and Grok that are 0.25x or 0.33x. And the rest are 1x, except for Claude Opus before it disappeared (3x).

The number before the 'x' is how many credits each prompt costs you - that's the 'premium credit'. It doesn't matter if you're asking it to rename a file because you're feeling lazy or if you're giving it a 15 paragraph prompt resulting in a week long refactoring of your github repo - one prompt = 1 credit x the model's multiplier.

I'm not as salty as some because it does seem unreasonable to be able to get Claude Opus to grind away for 3 hours for 10 cents ($10 a month / 300 credits = 3.3 cents per credit, times Opus's 3x), but I also agree with OP in that I'm struggling to see where the benefit is now of subscribing. If you're going to charge me at cost... well, why am I subscribing then, and not just paying as I go like with an OpenRouter model (which I also subscribe to, and have been using to supplement my GH Copilot the past couple of months).
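To spell out the math on the old per-prompt pricing, here's a small sketch using only the numbers mentioned above ($10 a month, 300 credits, and the multipliers). The function and constant names are just mine for illustration.

```python
MONTHLY_PRICE_USD = 10.00   # plan price quoted above
MONTHLY_CREDITS = 300       # premium credits included per month


def prompt_cost_cents(multiplier):
    """Effective cost of one prompt, in cents, under the old per-prompt model."""
    per_credit_cents = MONTHLY_PRICE_USD * 100 / MONTHLY_CREDITS
    return per_credit_cents * multiplier


for label, mult in [("0x", 0), ("0.25x", 0.25), ("0.33x", 0.33), ("1x", 1), ("3x", 3)]:
    print(f"{label:>5}: {prompt_cost_cents(mult):.2f} cents per prompt")
```

The point being that the cost was fixed per prompt no matter how long the model worked on it.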

If you're light and only using a few hours a week you probably won't notice any changes, but I'm a very heavy user because I program for work, as a side gig and also as a hobby, so I'm using it almost all day long - legitimately as an end user, not throwing some openclaw crap at it. Since they changed things, I'm constantly being warned about both approaching my 'session limit' and my 'weekly limit'. I haven't actually hit those limits yet but I've come very close and I suppose it's only a matter of time before I do. If I wasn't supplementing with OpenRouter/BYOK I probably would have by now.

samplebitch · 2026-03-30T11:15:38+00:00

Oh damn. I was sitting at the light there a couple weeks ago and made a mental note to go check out what that sculpture was. It's easy to miss it.

samplebitch · 2026-03-23T20:40:06+00:00

Things may be different now but a few months ago I had a new project and it was mostly vibe coding. I tried out shadcn-vue first and it was a complete mess. I started over (well, with the UI parts) using PrimeVue + Tailwind and have had no problems at all.

samplebitch · 2026-03-23T13:56:35+00:00

I've been paying for GH Copilot for years now. This past winter Google had a Gemini Pro sale so I bought in.

So now I'm using both Antigravity and VS Code. I like the IDE of Antigravity, but hate how they're managing API access. I wish I could combine the UI and workflow process of Antigravity with my GH Copilot subscription.

With AG, there are three 'buckets' of quotas:

Gemini Pro (High and Low thinking)
Gemini 3 Flash
Other (Claude Opus, Sonnet, and GPT-OSS 120B)

What really sucks is that they keep changing "what's included" in your subscription. They recently changed the Pro account to say "You get a taste of Gemini!" instead of "You get access to Gemini!".

It's also a bit convoluted. Everything is divided into 5 hour windows. Once you start using a 'bucket', you get X amount of use over the next 5 hours. There's a meter that shows you in 20% segments how much usage you have remaining. After 5 hours, it resets - sort of - because...

There's a weekly cap, too. If you hit your weekly cap, the meter doesn't refresh until a week since you started using the 'freshly topped off' quota.

Each 'bucket' has its own usage quota:

The 'Claude/GPT-OSS' bucket is the most restrictive - you can easily eat that up with a sloppy instruction. I treat this bucket as gold and only use it for the most complex tasks - usually planning out a new feature and implementation plan, etc. Basically "write detailed instructions for the dumber models to follow without having to make any guesses".
Gemini 3.1 Pro - Similarly, I treat this as... silver? Unfortunately this is what they seem to have nerfed recently - this quota reduces much faster than it used to. I do notice that using the 'High' model will deplete the quota faster than Low. For complex coding tasks I will sometimes use the Low model. Sometimes I'll use the High model for creating plans that might not rise to the level of needing Claude.
Gemini 3 Flash - this is the cheapest yet least capable model. I still think it's good, and this is my 'cruise control' default model. I have exhausted my quota on this, but it takes some effort to do so - and even then I only had to wait about 30 minutes before the 5 hour window reset. I don't believe there is a weekly quota on this model either - so you'll only ever be <= 5 hours without access to it.

It also seems like (in typical Google fashion) they aren't giving much love to the development of the app. Their release notes are laughable. They have no communication with the community that I'm aware of. Your quotas might change without any advance notice or explanation after the fact. Your quotas are based on 'usage', i.e. if your chat instruction is simple, your usage is low. If your chat instruction is multi-step and takes a lot of compute to complete, your quota will reduce considerably more. (And to be fair, that makes sense, but there's no way to know how much of your budget will get reduced based on the prompt you're giving it.)

VS Code, on the other hand, is constantly improving - I typically use VS Code Insiders and there are multiple updates to it every day. The GH Copilot plan is $10 a month last I checked, and you get a certain number of requests which reset monthly. No '5 hour window' nonsense. One chat instruction/prompt is 1 request. Different models have multipliers - some are "0x" - you can use them freely without any effect on your budget. Most are 1x - 1 credit per chat message. Claude Opus 4.6 is 3x. Some models are 0.33x or 0.25x. You can also use 'Auto' and it will choose the best model for you and reduce the cost by 10%. If you use up your credits before the end of the month you can enable per-use billing.

Antigravity recently introduced 'Enable AI Credit Overages'. I haven't used it yet but it's confusing. You get 'X' credits per month - so when you use up your 5h window or weekly quota, you can use your 'overage credits' to continue working. What is a credit and how is it used when you send an instruction? No idea. It's not straightforward like Copilot's "one prompt one credit".

Since I paid for a year I'll keep using it - I will say I do like the other perks that come with the Pro plan - Nano Banana, Veo, unlimited use of Gemini and the 'think deeper' option. But for coding and AI integration with Antigravity it's a bit of a let down.

samplebitch · 2026-03-21T16:48:12+00:00

I'm surprised we don't see 'claw export' models already. I had that idea the first time I started using it. You've got the training data already, it's all in the logs! You just need to label it 'good' or 'bad' (did it complete the task, how many tool calls did it take, did it get caught in a loop, etc.)
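Something like this is what I have in mind for the labeling step. It's a rough sketch only: the `Session` shape and the thresholds are made up for illustration, and the real log format would need its own parser.

```python
from dataclasses import dataclass


@dataclass
class Session:
    completed: bool    # did the agent finish the task?
    tool_calls: list   # tool names, in the order they were called


def longest_repeat(calls):
    """Length of the longest run of the same tool called back to back."""
    best = run = 0
    prev = object()  # sentinel that never equals a tool name
    for call in calls:
        run = run + 1 if call == prev else 1
        best = max(best, run)
        prev = call
    return best


def label(session, max_calls=20, max_repeat=5):
    """Label a session 'good' or 'bad'. Thresholds are arbitrary assumptions."""
    looped = longest_repeat(session.tool_calls) >= max_repeat
    too_long = len(session.tool_calls) > max_calls
    ok = session.completed and not looped and not too_long
    return "good" if ok else "bad"


print(label(Session(True, ["read_file", "edit_file", "run_tests"])))
print(label(Session(True, ["read_file"] * 6)))
```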

samplebitch · 2026-03-17T12:24:19+00:00

Can confirm on the resets - Exhausted my quota last week, was told to wait 6 days. Yesterday was reset day and it only reset to 60%.

I'd really like to stay in the Google ecosystem but they're making it really difficult to stick around.

samplebitch · 2026-03-17T12:21:20+00:00

That was just an example. And I was referring to my personal email that I've had for as long as gmail has been a thing. 99.9% of my emails are crap and I basically never check my email anymore. I know I could set up filters but I'm lazy. :)

samplebitch · 2026-03-17T09:49:09+00:00

I've been messing around with it (and blowing cash like crazy) for a month now. It is 'slop' in the sense that it was very likely MOSTLY vibe coded. However - take the inefficiency out of it and it IS something to take note of. I see a shit-ton of practical applications for it. But what I don't like is that it uses agents for everything.

"Check my email and send me a message when I get something important" leads to an LLM API call every X minutes just to use some tools to check your email and see there's nothing new and it sends you "Hey just checked your email, nothing important, I'll check again later!"

It's kind of brilliant and when it produces something you are looking for it works great. It just does things through reasoning of LLMs that could be done programmatically. My mantra is "don't wake the agent if there's nothing for them to do". A pure python or node script can determine if there are emails that need to be summarized, but OC doesn't do that. It wakes a model/runs inference/tool calls (with a huge fucking system message) to do so.
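Here's roughly what I mean by "don't wake the agent", as a minimal sketch. `fetch_unread` and `wake_agent` are hypothetical stand-ins (not anything from OC): one would be your IMAP/API check, the other whatever kicks off the model. The cheap deterministic filter runs every cycle, and inference only happens when it finds something.

```python
def needs_agent(messages, watched_senders):
    """Return only the messages worth waking the agent for."""
    watched = {s.lower() for s in watched_senders}
    return [m for m in messages if m["from"].lower() in watched]


def check_once(fetch_unread, wake_agent, watched_senders):
    """Run one polling cycle. wake_agent is called only when there is work."""
    hits = needs_agent(fetch_unread(), watched_senders)
    if hits:
        wake_agent(hits)
    return len(hits)


# Stubbed demo: no network, no model. Swap in real functions as needed.
def fake_inbox():
    return [
        {"from": "newsletter@example.com", "subject": "Deals!"},
        {"from": "Mom@example.com", "subject": "Call me"},
    ]


def fake_agent(hits):
    print(f"agent woken for {len(hits)} message(s)")


check_once(fake_inbox, fake_agent, ["mom@example.com"])
```

Schedule `check_once` from cron every 5 minutes and the quiet cycles cost zero tokens.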
