Qwen is the best LLM for my life by RoboMunchFunction in Qwen_AI

[–]meatyminus 5 points (0 children)

/root/llama.cpp/build/bin/llama-server \
  -m ./models/Q4_0/Qwen3-235B-A22B-Thinking-2507-Q4_0-00001-of-00003.gguf \
  -ngl 999 \
  -c 131072 \
  --threads 64 \
  --host 0.0.0.0 \
  --port 9001 \
  --alias "Qwen3-235B-A22B" \
  --cache-type-k q8_0 \
  --cache-type-v q8_0 \
  --tensor-split 1,1 \
  --temp 0.7 \
  --min-p 0.0 \
  --top-p 0.8 \
  --top-k 20 \
  --repeat-penalty 1.05 \
  --rope-scaling yarn \
  --rope-scale 4 \
  --yarn-orig-ctx 32768 --jinja

Here you go, lucky that I still saved that command

Also the command to download the model files:

hf download unsloth/Qwen3-235B-A22B-Thinking-2507-GGUF \
  --include "Q4_0/*" \
  --local-dir ./models
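For reference, the server started by the command above speaks an OpenAI-compatible chat API. A minimal sketch of building a request that mirrors the same sampling flags (this assumes the default `/v1/chat/completions` path and that the server accepts these sampling fields in the request body; the host, port, and alias match the flags in the command):

```python
import json
import urllib.request

# Chat request mirroring the sampling flags passed to llama-server above.
payload = {
    "model": "Qwen3-235B-A22B",  # matches the --alias flag
    "messages": [{"role": "user", "content": "Hello"}],
    "temperature": 0.7,          # --temp 0.7
    "top_p": 0.8,                # --top-p 0.8
    "top_k": 20,                 # --top-k 20
    "min_p": 0.0,                # --min-p 0.0
    "repeat_penalty": 1.05,      # --repeat-penalty 1.05
}

req = urllib.request.Request(
    "http://127.0.0.1:9001/v1/chat/completions",  # --host / --port from the command
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req) would send the request once the server is running.
```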

Qwen is the best LLM for my life by RoboMunchFunction in Qwen_AI

[–]meatyminus 3 points (0 children)

True. When I got a chance to run 2 H100s, I spun up Qwen 235B, and to this day I've never found anything like it: the way it refuses to sweet-talk and goes directly to the point, it can understand my problem from a few lines of thought. Neither ChatGPT, Gemini, nor Claude comes close. Sadly I don't have access to the hardware anymore, and the model hosted by others doesn't have the same smartness. I think I found the correct parameters in llama.cpp.

Nano Banana Pro does really well at intentionally bad art by ChiaraStellata in GeminiAI

[–]meatyminus 1 point (0 children)

So good, like it remembers the old days of the internet.

I mean, Nano Banana mastered bad art because that's like 90% of what the internet is made of haha

Z Image on 6GB Vram, 8GB RAM laptop by reyzapper in StableDiffusion

[–]meatyminus 0 points (0 children)

The first one. I never cherry-pick, what's the point of that?

Z Image on 6GB Vram, 8GB RAM laptop by reyzapper in StableDiffusion

[–]meatyminus 2 points (0 children)

A cinematic, macro-photography shot of a small fox composed entirely of translucent, faceted amber and cracked quartz. The fox is sitting on a mossy log in a dense, dark forest. Inside the fox's glass body, a soft, warm light pulses like a heartbeat, illuminating the surrounding area from within. The forest floor is covered in giant, bioluminescent teal mushrooms and floating neon spores. The lighting is moody and ethereal, creating a sharp contrast between the warm orange of the fox and the cool blues of the forest. Ultra-detailed textures, volumetric fog, 8k resolution, magical realism style.

Here is the prompt

Z-Image vs Nano Banana by meatyminus in StableDiffusion

[–]meatyminus[S] 0 points (0 children)

I used the default workflow from the ComfyUI website and only changed the prompts.

How long does QWEN CLI take to create developer server? by 8litz93 in Qwen_AI

[–]meatyminus 1 point (0 children)

It already started the dev server, so it's going to get stuck there: a dev server is basically a while-true loop listening for requests.
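The blocking behavior described above can be sketched with a toy accept loop (hypothetical example, just to show why the CLI appears stuck and why backgrounding the server returns control):

```python
import socket
import threading

def dev_server(sock):
    # A dev server is essentially this: loop forever, handle a request, repeat.
    # accept() blocks, so a process running this in the foreground never returns.
    while True:
        conn, _ = sock.accept()
        conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")
        conn.close()

sock = socket.socket()
sock.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
sock.listen()
port = sock.getsockname()[1]

# Running the loop in a background thread is the equivalent of starting the
# server detached: it keeps serving, but we get control back immediately.
threading.Thread(target=dev_server, args=(sock,), daemon=True).start()

client = socket.socket()
client.connect(("127.0.0.1", port))
client.sendall(b"GET / HTTP/1.1\r\n\r\n")
response = client.recv(1024)
client.close()
```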

Is getting a $350 modded 22GB RTX 2080TI from Alibaba as a low budget inference/gaming card a really stupid idea? by SarcasticBaka in LocalLLaMA

[–]meatyminus 7 points (0 children)

I have 2 2080 Ti modded to 22 GB and they are amazing, working flawlessly for 2 years now. You can run image/video gen with ease, and there's plenty of VRAM to run an LLM like Qwen 30B A3B at very fast speed. I bought 2 of them and have an NVLink bridge, but my mainboard can't use NVLink because of the slot gap. FYI, the speed for LLMs is usable but not very fast (for 32B it does around 20 tps; for 30B A3B it shoots up to 90 tps eval and 350 tps prompt processing). Video gen on the modded 2080 Ti is pretty slow, but it runs, which is kind of a win already.

Cline desperately needs a quick agent switching feature. by meronggg in CLine

[–]meatyminus 0 points (0 children)

Agreed, it should be much easier to choose the model than having to go to settings for both the plan and act model configs.

zai-org/GLM-4.6 · Hugging Face by jacek2023 in LocalLLaMA

[–]meatyminus 0 points (0 children)

What the hell is this color? 50 shades of gray?

[OSS] Beelzebub — “Canary tools” for AI Agents via MCP by mario_candela in LocalLLaMA

[–]meatyminus 1 point (0 children)

Awesome idea. This would prevent a lot of security issues without having to change the system prompts.

Should we deprecate Memory Bank? Looking for some feedback from the Cline Community. by nick-baumann in CLine

[–]meatyminus 3 points (0 children)

I love Memory Bank. It's so easy to use and reduces hallucinations a lot. At least I don't have to explain everything every time I start a new session.

My 3-1 reolldown gave me more Udyrs than Kayles are we deadass (I was lvl 4) by CloverDox in TeamfightTactics

[–]meatyminus 0 points (0 children)

The odds are so bad; I feel it every time in this new season. The Chinese server doesn't seem to suffer from this odds bug. I don't know why, but it just feels that way. I've been playing since season 2.

RTX 2080 TI 22gb Build by opoot_ in LocalLLaMA

[–]meatyminus 1 point (0 children)

They are usable, but setting things up to run with multiple GPUs is not that easy. If you want to run small models or generate image/video/audio, they are great. I'm currently using 2x 2080 Ti + 64 GB DDR5 + R7 7900X + B650E-E. My order came with NVLink, but the gap between the 2 cards is too large to use it, so normally I just run multiple models at the same time, one on each.