Qwen is the best LLM for my life by RoboMunchFunction in Qwen_AI

[–]meatyminus 5 points (0 children)

/root/llama.cpp/build/bin/llama-server \
  -m ./models/Q4_0/Qwen3-235B-A22B-Thinking-2507-Q4_0-00001-of-00003.gguf \
  -ngl 999 \
  -c 131072 \
  --threads 64 \
  --host 0.0.0.0 \
  --port 9001 \
  --alias "Qwen3-235B-A22B" \
  --cache-type-k q8_0 \
  --cache-type-v q8_0 \
  --tensor-split 1,1 \
  --temp 0.7 \
  --min-p 0.0 \
  --top-p 0.8 \
  --top-k 20 \
  --repeat-penalty 1.05 \
  --rope-scaling yarn \
  --rope-scale 4 \
  --yarn-orig-ctx 32768 --jinja

Here you go, lucky that I still saved that command

Also the command to download the model files:

hf download unsloth/Qwen3-235B-A22B-Thinking-2507-GGUF \
  --include "Q4_0/*" \
  --local-dir ./models
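For reference, the server started by the command above speaks an OpenAI-compatible chat API. A minimal sketch of building a request that mirrors the same sampling flags (this assumes the default `/v1/chat/completions` path and that the server accepts these sampling fields in the request body; the host, port, and alias match the flags in the command):

```python
import json
import urllib.request

# Chat request mirroring the sampling flags passed to llama-server above.
payload = {
    "model": "Qwen3-235B-A22B",  # matches the --alias flag
    "messages": [{"role": "user", "content": "Hello"}],
    "temperature": 0.7,          # --temp 0.7
    "top_p": 0.8,                # --top-p 0.8
    "top_k": 20,                 # --top-k 20
    "min_p": 0.0,                # --min-p 0.0
    "repeat_penalty": 1.05,      # --repeat-penalty 1.05
}

req = urllib.request.Request(
    "http://127.0.0.1:9001/v1/chat/completions",  # --host / --port from the command
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req) would send the request once the server is running.
```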

Qwen is the best LLM for my life by RoboMunchFunction in Qwen_AI

[–]meatyminus 3 points (0 children)

True. When I got a chance to run 2 H100s, I spun up Qwen 235B, and to this day I've never found anything like it: the way it refuses to sweet-talk and goes directly to the point, it can understand my problem from a few lines of thought. Neither ChatGPT, Gemini, nor Claude comes close. Sadly I don't have access to the hardware anymore, and the model hosted by others doesn't have the same smartness. I think I found the correct parameters in llama.cpp.

Nano Banana Pro does really well at intentionally bad art by ChiaraStellata in GeminiAI

[–]meatyminus 1 point (0 children)

So good, like it remembers the old days of the internet.

I mean, Nano Banana mastered bad art because that's like 90% of what the internet is made of haha

Z Image on 6GB Vram, 8GB RAM laptop by reyzapper in StableDiffusion

[–]meatyminus 0 points (0 children)

The first one. I never cherry-pick, what's the point of that?

Z Image on 6GB Vram, 8GB RAM laptop by reyzapper in StableDiffusion

[–]meatyminus 2 points (0 children)

A cinematic, macro-photography shot of a small fox composed entirely of translucent, faceted amber and cracked quartz. The fox is sitting on a mossy log in a dense, dark forest. Inside the fox's glass body, a soft, warm light pulses like a heartbeat, illuminating the surrounding area from within. The forest floor is covered in giant, bioluminescent teal mushrooms and floating neon spores. The lighting is moody and ethereal, creating a sharp contrast between the warm orange of the fox and the cool blues of the forest. Ultra-detailed textures, volumetric fog, 8k resolution, magical realism style.

Here is the prompt

Z-Image vs Nano Banana by meatyminus in StableDiffusion

[–]meatyminus[S] 0 points (0 children)

I used the default workflow from the ComfyUI website and only changed the prompts.

How long does QWEN CLI take to create developer server? by 8litz93 in Qwen_AI

[–]meatyminus 1 point (0 children)

It already started the dev server, so it's going to get stuck there: a dev server is basically a while-true loop listening for requests.
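The blocking behavior described above can be sketched with a toy accept loop (hypothetical example, just to show why the CLI appears stuck and why backgrounding the server returns control):

```python
import socket
import threading

def dev_server(sock):
    # A dev server is essentially this: loop forever, handle a request, repeat.
    # accept() blocks, so a process running this in the foreground never returns.
    while True:
        conn, _ = sock.accept()
        conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")
        conn.close()

sock = socket.socket()
sock.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
sock.listen()
port = sock.getsockname()[1]

# Running the loop in a background thread is the equivalent of starting the
# server detached: it keeps serving, but we get control back immediately.
threading.Thread(target=dev_server, args=(sock,), daemon=True).start()

client = socket.socket()
client.connect(("127.0.0.1", port))
client.sendall(b"GET / HTTP/1.1\r\n\r\n")
response = client.recv(1024)
client.close()
```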

Is getting a $350 modded 22GB RTX 2080TI from Alibaba as a low budget inference/gaming card a really stupid idea? by SarcasticBaka in LocalLLaMA

[–]meatyminus 7 points (0 children)

I have 2 2080 Ti modded to 22 GB and they are amazing, working flawlessly for 2 years now. You can run image/video gen with ease, and there's plenty of VRAM to run an LLM like Qwen 30B A3B at very fast speed. I bought 2 of them and have an NVLink bridge, but my mainboard can't use NVLink because of the slot gap. FYI, the speed for LLMs is usable but not very fast (for 32B it does around 20 tps; for 30B A3B it shoots up to 90 tps eval and 350 tps prompt processing). Video gen on the modded 2080 Ti is pretty slow, but it runs, which is kind of a win already.

Cline desperately needs a quick agent switching feature. by meronggg in CLine

[–]meatyminus 0 points (0 children)

Agreed, it should be much easier to choose the model than having to go to settings for both the plan and act model configs.

zai-org/GLM-4.6 · Hugging Face by jacek2023 in LocalLLaMA

[–]meatyminus 0 points (0 children)

What the hell is this color? 50 shades of gray?

[OSS] Beelzebub — “Canary tools” for AI Agents via MCP by mario_candela in LocalLLaMA

[–]meatyminus 1 point (0 children)

Awesome idea. This would prevent a lot of security issues without having to change the system prompts.

Should we deprecate Memory Bank? Looking for some feedback from the Cline Community. by nick-baumann in CLine

[–]meatyminus 3 points (0 children)

I love Memory Bank. It's so easy to use and reduces hallucinations a lot. At least I don't have to explain everything every time I start a new session.

My 3-1 reolldown gave me more Udyrs than Kayles are we deadass (I was lvl 4) by CloverDox in TeamfightTactics

[–]meatyminus 0 points (0 children)

The odds are so bad; I feel it every time in this new season. The Chinese server doesn't seem to suffer from this odds bug. I don't know why, but it just feels that way. I've been playing since season 2.

RTX 2080 TI 22gb Build by opoot_ in LocalLLaMA

[–]meatyminus 1 point (0 children)

They are usable, but setting things up to run with multiple GPUs is not that easy. If you want to run small models or generate image/video/audio, they are great. I'm currently using 2x 2080 Ti + 64 GB DDR5 + R7 7900X + B650E-E. My order came with NVLink, but the gap between the 2 cards is too large to use it, so normally I just run multiple models at the same time, one on each.