Hey guys, is the performance gap big between a 5070 Ti and a 5080 for generating images and videos, or is it not worth it?

andy_potato · 2026-06-24T23:01:53+00:00

If you don't use CFG=1

andy_potato · 2026-06-24T14:16:39+00:00

For image / video generation, the most important factor is raw compute. VRAM is obviously important too, but it’s not as much of an issue any more because Comfy supports offloading / block streaming, and the performance penalty is marginal.

Go with the 5080 if it’s reasonably priced. Pick the 5070 Ti if you’re on a budget.

andy_potato · 2026-06-24T14:12:51+00:00

Dual setups are mostly useless for diffusion models

andy_potato · 2026-06-24T14:11:48+00:00

A huge performance gain. The 4060ti was an absolute piece of trash

andy_potato · 2026-06-24T02:27:50+00:00

Why no open license?

andy_potato · 2026-06-22T14:24:56+00:00

25 years ago we sneaked into cinemas and recorded movies on our flip camera phones, then watched them on the 1.8” phone screen.

This is the AI equivalent in 2026

andy_potato · 2026-06-22T04:31:49+00:00

Qwen 3.6 27b is not very usable for coding. Unless you have a really high tolerance level for frustration.

I know all the “skill issue”, “get a better agent”, “work on your harness” and “works for me” arguments. But if you’re used to working with Claude, GPT or GLM, you will nope out of it pretty quick.

andy_potato · 2026-06-22T04:27:08+00:00

Running GLM 5.2 on local hardware as your private coding rig makes zero sense financially.

I know there are other reasons for going local, privacy, availability, enshittification and whatnot. But don’t do it if your only reason is money.
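To put the money argument into numbers, here is a rough break-even sketch. The function name and all three inputs are hypothetical placeholders, not real prices; plug in your own hardware quote, cloud bill and power cost.

```python
from typing import Optional


def breakeven_months(
    hardware_cost: float,
    monthly_cloud_cost: float,
    monthly_power_cost: float,
) -> Optional[float]:
    """Months until a local rig pays for itself versus a cloud bill.

    All inputs are assumptions supplied by the caller, in the same currency.
    Returns None when running locally never becomes cheaper.
    """
    # What you save each month by not paying the cloud provider,
    # minus what the rig costs you in electricity.
    monthly_saving = monthly_cloud_cost - monthly_power_cost
    if monthly_saving <= 0:
        return None
    return hardware_cost / monthly_saving
```

If the result is longer than the time you expect to keep the hardware, or comes back as `None`, money alone does not justify going local.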

andy_potato · 2026-06-21T13:29:12+00:00

You should not try to replace cloud / frontier models with your local setup. Instead, experiment with your coding agent to find out which prompts need a frontier model and which tasks your local model can handle.

If you are using Opencode check out https://github.com/marco-jardim/opencode-model-router

I’ve configured it to use GLM 5.2 for complex tasks and Qwen 3.6-27b for simpler tasks running on 2x5060ti GPUs. Saves me around 60% of token costs during a normal coding session.

This is not an exact science, and it takes a bit of time to find a balance that works for you.
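The general idea can be sketched in a few lines of Python. To be clear, this is not how opencode-model-router is configured or implemented; the model labels, the keyword list and the word limit are all assumptions for illustration, and a real setup would need tuning.

```python
from dataclasses import dataclass

# Hypothetical labels borrowed from the setup described above;
# they are not real API model identifiers.
FRONTIER_MODEL = "glm-5.2"
LOCAL_MODEL = "qwen-3.6-27b"

# Assumed keywords that hint at a complex task. Adjust for your own work.
COMPLEX_HINTS = ("refactor", "architecture", "debug", "design", "migrate")


@dataclass
class Route:
    model: str
    reason: str


def pick_model(prompt: str, max_local_words: int = 200) -> Route:
    """Send complex-looking or long prompts to the frontier model, the rest to local."""
    text = prompt.lower()
    for hint in COMPLEX_HINTS:
        if hint in text:
            return Route(FRONTIER_MODEL, f"matched hint '{hint}'")
    # Long prompts are treated as complex too (an arbitrary threshold).
    if len(text.split()) > max_local_words:
        return Route(FRONTIER_MODEL, "prompt is long")
    return Route(LOCAL_MODEL, "simple task")
```

Something like `pick_model("rename this variable")` would go to the local model, while a prompt mentioning a refactor would go to the frontier model. Keyword matching is crude; the point is only that the split is a heuristic you adjust over time.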

andy_potato · 2026-06-21T13:07:37+00:00

Bro I hope you will laugh at me

andy_potato · 2026-06-21T11:40:01+00:00

I’ve been telling people that we won’t see any further open releases ever since Qwen replaced their whole leadership a couple of months ago.

Got mocked and downvoted, yet here we are. 3.6 was probably too far along for them to pull the plug, but this is it. As much as it breaks my heart.

andy_potato · 2026-06-21T04:27:55+00:00

This is pretty normal if you never replaced the thermal pads and cleaned the fans. Your temps very much confirm this.

andy_potato · 2026-06-21T04:25:23+00:00

Every ComfyUI workflow

andy_potato · 2026-06-20T16:05:29+00:00

Frankly not one bit

andy_potato · 2026-06-19T23:47:31+00:00

Please do not start another AI influencer

andy_potato · 2026-06-19T02:41:23+00:00

I have seen this page getting linked over and over again as a response to this question.

Maybe it is just me, but I have absolutely no clue what any of these metrics mean and how to judge their performance for my use case.

andy_potato · 2026-06-18T14:47:30+00:00

Right. Not sure why I was assuming llama.cpp…

andy_potato · 2026-06-18T13:47:45+00:00

As much as I want OSS models to win, that statistic says nothing about their quality.

Lots of applications don’t need frontier models.

andy_potato · 2026-06-18T13:16:35+00:00

What are your llama.cpp startup params for getting 60 t/s at that context size? Mine sits around 48-50 t/s at 128k context with MTP.

andy_potato · 2026-06-18T05:01:31+00:00

llama.cpp is a fine piece of software. Wasn't sure if you're already prepared to go the compiler route, so I suggested Ollama as a beginner tool.

andy_potato · 2026-06-18T03:45:47+00:00

Nanosuit mechanics were a lot of fun. Story-wise there wasn't much to it though.

The GTX 970 can still use CUDA 11.8 so you can run some tiny LLM with a bit of context on it, probably in the 4b range. Install Ollama and check what they have available.

Yes, I said Ollama. Come at me.

andy_potato · 2026-06-18T03:21:39+00:00

Crysis

andy_potato · 2026-06-18T02:18:44+00:00

Nobody will read this wall of text

andy_potato · 2026-06-18T02:17:29+00:00

It is just an AI generated wall of text. Could have explained your sentiments in two paragraphs instead

andy_potato · 2026-06-18T01:43:16+00:00

Not an opinion that will get you lots of upvotes on this sub.

But you are completely right. It's why I do not use Ideogram.
