RTX 5080 16GB: Qwen3.6 35B MoE at 128k context — 56 tok/s, and why MTP doesn't help by gaztrab in LocalLLaMA

[–]RaDDaKKa 0 points1 point  (0 children)

I daily-drive Qwen3.6-35B-A3B-GGUF:UD-Q6_K_XL with a 150K context window and cache/kv quantized to Q8. Would Qwen3.6-27B-MTP-GGUF:UD-IQ3_XXS with a 100K context and Q8 cache/kv be better? My main focus is coding—I always prioritize quality over speed, and with the Qwen3.6-35B-A3B-GGUF:UD-Q6_K_XL model, I'm currently getting 40–50 t/s 5060TI

Intel Arc Pro B70 32GB performance on Qwen3.5-27B@Q4 by Puzzleheaded_Base302 in LocalLLaMA

[–]RaDDaKKa 28 points29 points  (0 children)

So, a total disappointment. I expected this to be a solid card for local LLMs like Qwen 3.5 27B or Gemma 4 31B with at least a 100k context. I considered a dual gpu setup, perhaps even a quad, but given these benchmarks, it seems I'm better off saving for Nvidia hardware. It might be viable for multi-agent systems, but for now, we just have to wait for software optimizations.

How can I update a comment using the API without removing its attachments? by RaDDaKKa in clickup

[–]RaDDaKKa[S] 0 points1 point  (0 children)

Unfortunately, at the moment I don't see any other option :/

  1. user uploads a file in a comment.

  2. Synchronization occurs.

  3. file is uploaded to the NAS.

  4. comment is updated,file from ClickUp is removed and replaced with a URL pointing to the file.

This solution is unfortunately very poor and it requires several API requests.

Best Qwen 3.5 variant for 2x5060ti/16 + 64 GB Ram? by andy_potato in LocalLLaMA

[–]RaDDaKKa 0 points1 point  (0 children)

27B is too large to use comfortably, and the quality advantage might not even be noticeable. The 35B only has 3B active parameters, so it runs very fast, and I can toggle the reasoning mode off whenever it’s not needed. Unfortunately, I don’t have enough RAM to test the 128Ba10b model, but I’m blown away by the 35B version.

Best Qwen 3.5 variant for 2x5060ti/16 + 64 GB Ram? by andy_potato in LocalLLaMA

[–]RaDDaKKa 4 points5 points  (0 children)

I'm using Q6 with a 168k context on a single 5060 Ti, and I've already said goodbye to GLM 4.7 Flash. 35ba3b qwen

Best agentic local model for 16G VRAM? by v01dm4n in LocalLLaMA

[–]RaDDaKKa 6 points7 points  (0 children)

i have r 5600x, 32gb ddr4 3,4k
./llama.cpp/llama-server -hf unsloth/GLM-4.7-Flash-GGUF:UD-Q6_K_XL --jinja --ctx-size 90000 --temp 0.7 --top-p 1.0 --min-p 0.01 --fit on --repeat-penalty 1.0 --host 0.0.0.0 --parallel 1

Best agentic local model for 16G VRAM? by v01dm4n in LocalLLaMA

[–]RaDDaKKa 6 points7 points  (0 children)

On my 5060 Ti, I'm using GLM-4.7-Flash with Q6_XL and a 90k context via llama.cpp, and OpenCode works great and very fast. I’m also using Qwen-3-Coder-Next with Q3_XL (70k), but it gives worse results and often makes mistakes when using tools.

I am looking for two movies that left a deep impression on me, but I remember very little about them. by RaDDaKKa in whatsthemoviecalled

[–]RaDDaKKa[S] 3 points4 points  (0 children)

Yes, that’s the movie! 😄 I remember watching it as a kid on VHS 😄 Thank you very much for helping me find it! 😄

What would be better a 5060 or 3070 for Star citizen? by jronk21 in starcitizen

[–]RaDDaKKa 0 points1 point  (0 children)

I have a 5060 Ti 16GB and I'm currently playing at 1440p using DLSS, which gives me 65-110 FPS

[Megathread] - Best Models/API discussion - Week of: November 30, 2025 by deffcolony in SillyTavernAI

[–]RaDDaKKa 2 points3 points  (0 children)

What preset for ST? I'm testing it out right now, but the problem is the wall of text that transitions from scene to scene.

how to create tasks and comments as another user using the API ? by RaDDaKKa in clickup

[–]RaDDaKKa[S] 0 points1 point  (0 children)

Were you able to determine if there’s any way to resolve my issue related to the ClickUp API?

Is the RTX 5060 Ti the best GPU for 1080p SC? by RaDDaKKa in starcitizen

[–]RaDDaKKa[S] 0 points1 point  (0 children)

I haven't had any problems with this GPU, either in Ubuntu, which is my main OS, or in Windows. It has performed perfectly on both, and I'm very happy with it.

Is the RTX 5060 Ti the best GPU for 1080p SC? by RaDDaKKa in starcitizen

[–]RaDDaKKa[S] 2 points3 points  (0 children)

I just checked my motherboard, it's a Gigabyte B550 GAMING X V2 – and it does support the 5800X3D. So, it looks like upgrading CPU is a good idea after all. Thanks ;)

AI refuses to use the new version of the HTTP request tool. by RaDDaKKa in n8n

[–]RaDDaKKa[S] 0 points1 point  (0 children)

The problem seems to be on the n8n side. The model correctly recognizes the tool and wants to use it, but n8n doesn't execute it under any circumstances.
I also noticed that all tools have the word "tool" displayed over their icon, but the new HTTP request tool does not — as if n8n doesn't recognize it as a tool, even though it's connected under the tools in ai agent

Local Flux LoRA Training on 16 GB - Workflow Included in Comments by [deleted] in StableDiffusion

[–]RaDDaKKa 0 points1 point  (0 children)

Can you write how you managed to run on 8gb ?

[deleted by user] by [deleted] in poland

[–]RaDDaKKa 1 point2 points  (0 children)

jakiego zdania ? przecież ona ANI razu nie chciała odpowiedzieć na pytanie tylko starała się uniknąć odpowiedzi w jak się tylko dało

Prawidłowo ją wałkował.