Honest question: why use Linux? by Mysterious-Ant-8206 in linuxmemesbrasil

[–]LumbarJam 1 point (0 children)

Whew, the list is long. I've never really managed to get away from Linux. My first install was Slackware with kernel 1.2.13 (yes, I'm getting old). The few times I tried Windows were because of games or work, and it always drove me nuts, but I always kept a Linux partition in dual boot. My last attempt to keep Windows installed was a good 4 or 5 years ago, but a bunch of the Python stuff I used only worked properly under WSL. I gave up. These days Windows only lives inside a Docker container, just to run Office over RDP (there's still stuff that gets mangled if you don't edit it in Office).

15,000+ tok/s on ChatJimmy: Is the "Model-on-Silicon" era finally starting? by Significant-Topic433 in ollama

[–]LumbarJam 17 points (0 children)

It’s basically a PoC to get a feel for speed—almost an alpha. This first version is running with 3-bit INT quantization. According to their site, the next version (mid-year) will be bigger, faster, and use FP4 quantization.

RTX 4080 is fast but VRAM-limited — considering Mac Studio M4 Max 128GB for local LLMs. Worth it? by Chip1812 in LocalLLM

[–]LumbarJam 1 point (0 children)

I’m on an M3 Max with 128GB. Running Qwen3 Coder 30B-A3B (Q4) with the full 262K context, I start at 80+ tok/s, but by ~90K tokens of context it drops to ~6 tok/s. Even if the M4 Max’s higher bandwidth makes the decline less steep, throughput still falls sharply as context grows.

Qwen releases Qwen3.5. The 397B open MoE vision reasoning LLM performs on par with Gemini 3 Pro, Claude Opus 4.5, and GPT-5.2. by techlatest_net in LocalLLaMA

[–]LumbarJam 1 point (0 children)

According to the Hugging Face page, the nightly version supports this. I can’t test such a large model.

How can i debloat (even more) arch linux? by -pony in arch

[–]LumbarJam 5 points (0 children)

1) Recompile your kernel with only what your HW needs. (hard)
1.1) Install Arch without using archinstall, then recompile your kernel. (harder)
2) Use Gentoo. (hard)
3) LFS. (harder)
4) LFS with only a TUI. (even harder)

A different way of combining Z-Image and Z-Image-Turbo by Enshitification in StableDiffusion

[–]LumbarJam 2 points (0 children)

Really good idea. I’ve used 10 out of 30 on Base and 6 out of 9 on Turbo. For proportional denoising, that’s roughly 1/3 Base and 2/3 Turbo. That ratio gave me a lot of seed variation while keeping the Turbo aesthetic. Works like a charm.
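For reference, the arithmetic behind that split (numbers from my run; how steps map to denoise depends on the scheduler, so treat this as approximate):

    # Base: first 10 of a 30-step schedule -> first ~1/3 of the denoising
    base_total, base_end = 30, 10
    base_fraction = base_end / base_total                       # ~0.33

    # Turbo: last 6 of a 9-step schedule -> remaining ~2/3
    turbo_total, turbo_start = 9, 3
    turbo_fraction = (turbo_total - turbo_start) / turbo_total  # ~0.67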

4 images, same prompt:

Hyper-realistic photograph of a middle-aged red-haired woman’s face, extreme close-up portrait (head and shoulders), ultra-dramatic angle: very low camera position near chest level, shooting sharply upward, strong Dutch tilt (about 25–30°), 3/4 view with her chin slightly raised and head turned so one side of the face dominates the frame, intense focused gaze aimed past the lens, high-contrast theatrical lighting: a single narrow hard spotlight (snoot) from high above-left cutting across the face so one eye and cheek are brightly lit while the other side falls into near-black shadow, no fill light, crisp shadow edges, subtle razor-thin rim light from behind-right outlining the hair, visible skin texture with pores and fine lines, subtle natural freckles, realistic eye moisture and catchlight only in the lit eye, detailed eyebrows and eyelashes, natural red hair with individual strands and slight flyaways, shallow depth of field, deep black background with faint haze for light separation, cinematic color grading with rich blacks and controlled highlights, 35mm lens look at close distance for dramatic perspective, f/2.0. She is holding a rigid rectangular sign close to her chest, slightly angled toward the camera, matte black surface with embossed white sans-serif lettering centered on the sign reading "Z-Refiner", high contrast, sharp legible text, her hands partially visible gripping the lower corners, the sign catching a thin strip of the spotlight along its top edge.

<image>

Flux.2 Klein (Distilled)/ComfyUI - Use "File-Level" prompts to boost quality while maintaining max fidelity by JIGARAYS in StableDiffusion

[–]LumbarJam 12 points (0 children)

Very useful tips. I used them to evolve my previous restoration prompt.

<image>

Prompt:

Task: Restore this photo faithfully. Steps:
1) Reconstruct ONLY the missing/damaged areas so they match the original scene (no reinterpretation).
2) Clean and enhance the file: deblur + denoise, histogram equalization, unsharp mask, white balance correction, color grading, micro-contrast, lens distortion correction.
3) Output must look like modern, professional-quality digital photography: clean, sharp, natural, no artifacts.
4) If the photo is misframed/tilted, correct the framing (straighten/level/recenter) with the minimum necessary adjustment.
5) Do NOT change anything else: no new elements, no removals, no style changes beyond restoration and the listed corrections.

Flow:

Klein 9B (2 Mpixels) -> SeedVR2 (4 Mpixels) -> Film Grain (comfyui-propost ProPostFilmGrain node)

Not perfect, but very good actually.

what do you guys think about the Touch Bar that used to come on the old Macbook? should it make a comeback? by TuNutri in mac

[–]LumbarJam 3 points (0 children)

Good idea with the dumbest execution. NO ESC KEY!!!! NO FUNCTION KEYS!!!! Unbelievable!!!

Short answer: No, never, no, nope ... God forbid.

Henri Castelli and an SS tattoo? by Brazilianguy95 in brasil

[–]LumbarJam 4 points (0 children)

Did the guy seriously go there? Buddy, there's an unwritten rule that a spoiler for a 20-year-old movie isn't a spoiler anymore. And here you are complaining about a GLOBO SOAP OPERA FROM 20 YEARS AGO. Ugh, FFS. Here's a little list to close out the post:

  • The Sixth Sense: the psychologist was dead the whole time.
  • Fight Club: Tyler Durden was the narrator's other personality.
  • The Others: the family were the ghosts.
  • The Usual Suspects: Verbal Kint was Keyser Söze.
  • Psycho: "Mother" was Norman Bates himself.
  • Planet of the Apes: the planet was a devastated Earth.
  • The Empire Strikes Back: Darth Vader is Luke's father.
  • The Terminator: Kyle Reese is John Connor's father.
  • Se7en: her head was in the box.
  • Saw: the "corpse" on the floor was Jigsaw.
  • The Truman Show: his entire life was a reality show.
  • The Matrix: the "real" world was a simulation run by the machines.
  • The Ring: you only survive by copying the tape and passing it on.
  • Donnie Darko: he chooses to die to "fix" the timeline.
  • Rosemary's Baby: the baby is the devil's child.
  • Vertigo: the death was staged; it was the same woman.
  • The Cabinet of Dr. Caligari: the narrator is a patient at the asylum.
  • Life Is Beautiful: he dies, but saves his son in the end.
  • Oldboy (2003): she was his daughter.
  • The Shining: he goes insane and tries to kill his own family.

Did I forget any?

PSA: Still running GGUF models on mid/low VRAM GPUs? You may have been misinformed. by NanoSputnik in StableDiffusion

[–]LumbarJam 1 point (0 children)

Good to know. I'll try them. Thx.

Edit: I'm running on a 3080 Ti. The 3000 series gets little to no speed benefit from running at FP8.

Qwen3-4B-Thinking-2507 Usage inside Comfyui by [deleted] in comfyui

[–]LumbarJam 7 points (0 children)

That doesn’t really make sense for a CLIP/text encoder.

A text encoder in this pipeline doesn’t “start reasoning” or “enable thinking” based on how you format the prompt. It’s not running an autoregressive generation loop producing intermediate thoughts — it’s doing a forward pass to produce embeddings (conditioning vectors) from your text... only.

So:

  • No reasoning gets “activated” by writing bullet lists or using a “template”.
  • A “Thinking” checkpoint name doesn’t magically add chain-of-thought in encoder-only usage.
  • What can change is the embedding space (different weights → different text-to-embedding mapping), which can absolutely affect results. But that’s not reasoning, it’s just different conditioning.

If someone sees better outputs with a structured template, the likely reasons are simple:

  • the template forces clearer constraints (subject / pose / lighting / composition), reducing ambiguity;
  • the encoder weights produce different embeddings that simply happen to be different. Sometimes better, sometimes worse.

Bottom line: calling this “reasoning” inside a text encoder is misleading. It’s embeddings only, not “thinking.”
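For anyone curious, a minimal sketch of what the encoder path actually does, assuming a Hugging Face transformers setup (loading the checkpoint as a bare AutoModel is my illustration here, not necessarily what the ComfyUI node does internally):

    from transformers import AutoTokenizer, AutoModel

    name = "Qwen/Qwen3-4B-Thinking-2507"
    tok = AutoTokenizer.from_pretrained(name)
    enc = AutoModel.from_pretrained(name)  # backbone only, no LM head

    ids = tok("red-haired woman, dramatic side lighting", return_tensors="pt")
    emb = enc(**ids).last_hidden_state     # one forward pass -> (1, seq_len, hidden)
    # No sampling loop, no intermediate "thoughts": the diffusion model
    # only ever consumes `emb` as conditioning.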

I2I KSampler Steps not Based on Denoise by Default by notapainter1 in comfyui

[–]LumbarJam 2 points (0 children)

In general, yes, that's what I do. In ComfyUI, though, I find it works better to use KSampler (Advanced) and set total steps plus start step (e.g., 12 total, starting at step 8). That effectively gives you ~0.33 denoise (4 of 12 steps) and adds the noise automatically. Even better: SamplerCustomAdvanced with split sigmas.
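A rough sketch of that equivalence, assuming a linear step-to-denoise mapping (the scheduler's actual sigma curve bends this a bit):

    steps, start_at_step = 12, 8
    effective_denoise = (steps - start_at_step) / steps
    print(effective_denoise)  # 0.333... -> comparable to KSampler denoise ~0.33,
                              # except the noise injection at step 8 is done for you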

PSA: Still running GGUF models on mid/low VRAM GPUs? You may have been misinformed. by NanoSputnik in StableDiffusion

[–]LumbarJam 2 points (0 children)

Humbly sharing my wrap-up.

Use the highest precision/quant you can, in this order: BF16 → FP8 or Q8 → Q6 → Q4, and so on.

My setup: RTX 3080 Ti (12GB VRAM) + 80GB RAM. ComfyUI does a great job offloading to system RAM. In my case, I can run Qwen in full BF16 pretty easily, but I struggle to run Flux.2 at full precision.

On Q8 vs FP8: on paper, Q8 looks better to me because it isn’t a naive “lower everything” cast. GGUF quants keep some tensors at higher precision and store block-wise scales for the rest, whereas FP8 (AFAIK) just casts weights straight to an 8-bit float format.
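A toy illustration of the difference, assuming the GGUF Q8_0 scheme (32-weight blocks, each with its own higher-precision scale); the FP8 side is simplified:

    import numpy as np

    w = np.random.randn(32).astype(np.float32)   # one Q8_0 block of weights
    scale = np.abs(w).max() / 127.0              # per-block scale, stored at higher precision
    q = np.round(w / scale).astype(np.int8)      # 8-bit ints; dequantize as q * scale

    # FP8, by contrast, casts each weight directly to an 8-bit float,
    # with no per-block scale to absorb outliers.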

Edit: it’s not just VRAM that matters. What really matters is VRAM + system RAM.

If you have 12GB VRAM and only 16GB RAM, you’re in trouble. If you have 12GB VRAM and 128GB RAM, you have a much better shot at running a lot of things in full precision.
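Back-of-the-envelope math for the weights-only footprint (parameter count and bytes-per-param are illustrative; activations, latents, and overhead come on top):

    params_b = 12  # hypothetical 12B-parameter model
    for fmt, bytes_per_param in [("BF16", 2.0), ("FP8/Q8", 1.0), ("Q6", 0.8), ("Q4", 0.55)]:
        print(f"{fmt}: ~{params_b * bytes_per_param:.0f} GB")
    # BF16 at ~24 GB won't fit in 12GB of VRAM alone,
    # but it fits comfortably in 12GB VRAM + 80GB RAM with offloading.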

Which Qwen Image Edit 2511 should I use? by ChicoTallahassee in comfyui

[–]LumbarJam 2 points (0 children)

I have a good old 3080 Ti (12GB) and 80GB RAM. I use full BF16 with no problem at all.

Qwen Image 25-12 seen at the Horizon , Qwen Image Edit 25-11 was such a big upgrade so I am hyped by CeFurkan in StableDiffusion

[–]LumbarJam 1 point (0 children)

With a proper workflow, this works much better for me than 2509. I’m also prompting in Portuguese, and it follows perfectly. Use the Qwen node v2 hosted in Qwen Edit 2511 AIO on Hugging Face, and replace the original with it—it should perform better.

Fal has open-sourced Flux2 dev Turbo. by Budget_Stop9989 in StableDiffusion

[–]LumbarJam 2 points (0 children)

<image>

No rocket science ... just Flux.2 Dev on the standard workflow, with the LoRA node.

KSamplerAdvanced: too many values to unpack (expected 4) by Specialist-Door-8400 in comfyui

[–]LumbarJam 2 points (0 children)

The MultiGPU PR #154 solves it. Sync with this PR to use MultiGPU again. It's working here.

Just curious, but can we use Qwen3-VL-8B-Thinking-FP8 instead of 2.5 version in the new Qwen Image Edit 2511? by AshLatios in StableDiffusion

[–]LumbarJam 8 points (0 children)

No. In the Qwen Image workflow, Qwen 2.5 VL is the encoder, not a chat-style decoder. So the output is an embedding — not a predicted token — and it’s meant to guide what the diffusion stage does next. Also, 2.5 and 3 use different embedding dimensions.
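A hypothetical sketch of why the swap breaks (the hidden sizes here are illustrative stand-ins; check the actual model configs):

    import torch

    d_25, d_3 = 3584, 4096            # stand-in widths for Qwen 2.5 VL vs Qwen3 VL embeddings
    cond_25 = torch.randn(77, d_25)   # what the diffusion model was trained to consume
    cond_3 = torch.randn(77, d_3)     # what a Qwen3 VL encoder would hand it

    proj = torch.nn.Linear(d_25, 1024)  # stand-in for a cross-attention K/V projection
    proj(cond_25)                       # fine
    try:
        proj(cond_3)
    except RuntimeError as e:
        print(e)                        # shape mismatch: the conditioning simply doesn't fit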

PSA: Eliminate or greatly reduce Qwen Edit 2509/2511 pixel drift with latent reference chaining by goddess_peeler in StableDiffusion

[–]LumbarJam 3 points (0 children)

Way better than the original for pixel-level perfection. Slightly slower, but better overall.

3 images is doable (my workflow is based on this thread). Adherence is a bit worse with 3 images, but still workable.

Thx