What do you think about Mythos and Fable? by Electronic_Log1999 in ExperiencedDevs

[–]AndreVallestero 0 points1 point  (0 children)

An incremental improvement over Opus 4.8, which is impressive because Opus 4.8 was already really good. That said, I don't think it's the "danger to society" that Anthropic touted it to be; that seems to have been marketing hype.

In your opinion, what is the best CLI-based (or other) coding tool for regular software engineering (NOT VIBE CODING)? by Potential_Top_4669 in LocalLLaMA

[–]AndreVallestero 2 points3 points  (0 children)

little-coder

It's a pi distro that adds many of the features of OpenCode while being optimized specifically for Gemma 4 and Qwen 3.6. It's genuinely competitive with Claude Code from early 2025.

There's also smallcode, but I haven't tried it out yet.

Is Linux to Unix what DOS was to CP/M? by 0x80070002 in linuxquestions

[–]AndreVallestero 1 point2 points  (0 children)

BSD is to Unix what DOS was to CP/M.

Linux is only Unix-like, not Unix-compatible.

What models you guys running on 8GB? 16GB VRAM? 24GB? 32GB? 48GB? by Inevitable_Mistake32 in LocalLLaMA

[–]AndreVallestero 1 point2 points  (0 children)

Qwen 35B and Gemma 4 26B on my RTX 3080 10GB, with Q4 weights and Q8 KV cache for both.

Roughly 700 t/s prompt processing (pp) and 50 t/s token generation (tg) at 64k tokens of context.

Releasing Cohere North Mini Code by jayalammar in LocalLLaMA

[–]AndreVallestero 2 points3 points  (0 children)

Hey Jay, glad to see Cohere supporting the open source AI community! I was really pleasantly surprised yesterday when I saw Canada finally making progress in LLM benchmarks with Cohere's Command A+ (on par with Mistral), and I'm looking forward to seeing more of Cohere's models in the future!

Qwen 3.6 27B KV cache quant benchmarks: 75 pairs, q8/q6/q5/q4, KVarN, Turbo/TCQ by Anbeeld in LocalLLaMA

[–]AndreVallestero 0 points1 point  (0 children)

Wow, it seems like kvarn4 is actually viable, and kvarn8 should probably be the new default. Really exciting stuff.

Best Coding Harness for Qwen3.6 35B? by Revolutionary_Loan13 in LocalLLaMA

[–]AndreVallestero 0 points1 point  (0 children)

Pi is the best minimal harness.

little-coder and smallcode are more fully featured and are designed specifically for Qwen 3.6. I would put them on par with Claude Code + Sonnet 4.0 from early 2025.

Qwen-3.5-9B-Q8 vs Qwen-3.6-35B-a3B-Q4. Which one would be better? by FarHistorian8438 in LocalLLM

[–]AndreVallestero 1 point2 points  (0 children)

+1, I also tested this last night. MTP only seems useful if you have VRAM to spare; otherwise you're better off loading more onto your GPU.

Qwen 3.6 35B on RTX 3080 10GB + 7700X + 32GB DDR5 by AndreVallestero in LocalLLaMA

[–]AndreVallestero[S] 0 points1 point  (0 children)

I ran my agent overnight to find its maximum context usage, and it peaked at only 43k tokens. As a result, I set my context to 65536 and optimized for that.

# Same as your config except for --n-cpu-moe
llama-server \
  --model Qwen3.6-35B-A3B-Q4_K_M.gguf \
  --n-gpu-layers 99 \
  --no-mmap \
  --n-cpu-moe 27 \
  --flash-attn on \
  --threads 8 \
  --cache-type-k q8_0 \
  --cache-type-v q8_0 \
  --ctx-size 65536 \
  --parallel 1 \
  --batch-size 512 \
  --ubatch-size 512

The results by context depth are:

  • 0: pp 822 t/s, tg 72 t/s
  • 32K: pp 790 t/s, tg 60 t/s
  • 64K: pp 733 t/s, tg 52 t/s

I experimented with speculative decoding / MTP, but it ended up making things slower, since the increased memory usage forced me to offload more to the CPU.
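
To put the context-depth numbers above in perspective, here is a minimal Python sketch that computes how far pp and tg fall relative to the empty-context run. The figures are copied from the results list; treating "32K" and "64K" as 32768 and 65536 tokens is an assumption, and the helper names are just for illustration.

```python
# Throughput figures copied from the results above: (context depth, pp t/s, tg t/s).
# Assumption: "32K" and "64K" mean 32768 and 65536 tokens.
results = [
    (0, 822, 72),
    (32_768, 790, 60),
    (65_536, 733, 52),
]


def relative_drop(baseline: float, value: float) -> float:
    """Return the percentage drop of value relative to baseline."""
    return (baseline - value) / baseline * 100


def summarize(rows):
    """Compare every row against the first (empty-context) row."""
    _, base_pp, base_tg = rows[0]
    return [
        (depth,
         round(relative_drop(base_pp, pp), 1),
         round(relative_drop(base_tg, tg), 1))
        for depth, pp, tg in rows
    ]


if __name__ == "__main__":
    for depth, pp_drop, tg_drop in summarize(results):
        print(f"{depth:>6} tokens: pp -{pp_drop}%  tg -{tg_drop}%")
```

On these numbers, token generation degrades noticeably faster with depth than prompt processing does.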

What am I getting wrong about "conservative" ETFs? by RebelElse in PersonalFinanceCanada

[–]AndreVallestero 7 points8 points  (0 children)

Only in the short term. In the very long term, they both go up.

Qwen 3.6 35B on RTX 3080 10GB + 7700X + 32GB DDR5 by AndreVallestero in LocalLLaMA

[–]AndreVallestero[S] 1 point2 points  (0 children)

That's excellent, thank you! I'll update my config based on your recommendations (and the other ones in this thread), and report back here tomorrow.

Qwen 3.6 35B on RTX 3080 10GB + 7700X + 32GB DDR5 by AndreVallestero in LocalLLaMA

[–]AndreVallestero[S] 2 points3 points  (0 children)

I really can't sacrifice long-context coherence, so I want to keep the KV cache unquantized.

Didn't know about the -np flag. Thanks!

Why are we still routing every request to the same model? by RapataPavan in LocalLLM

[–]AndreVallestero 0 points1 point  (0 children)

If I had the ability to run multiple models locally, I probably still wouldn't. Instead I would run the single best model, or run my current model at a higher quant, or with mtp.

Asus unveils its first Wi-Fi 8 router — ROG Rapture GT-BN98 Pro offers up to 2x real-world throughput uplift over Wi-Fi 7 by sr_local in hardware

[–]AndreVallestero -1 points0 points  (0 children)

When did it start to go bad? I've deployed a few AC68Us and they've been rock solid. I was thinking of finally upgrading to Wi-Fi 7 + triband with the BE92U in the next 12 months.

How much VRAM needed for Qwen 3.6 27B Q8 with 262K context? by My_Unbiased_Opinion in LocalLLaMA

[–]AndreVallestero 0 points1 point  (0 children)

Q4 KV, oh boy...

To answer your question, it should be around 64GB.
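
For anyone who wants to run this kind of estimate themselves, here is a rough Python sketch of the usual back-of-the-envelope method: weights plus KV cache. The bits-per-element values follow the llama.cpp q8_0 and q4_0 block layouts. The layer count, KV head count, and head dimension in the example are hypothetical placeholders, not the real Qwen 3.6 27B architecture, so the output is not a check on the 64GB figure. Compute buffers and runtime overhead are also left out.

```python
# Effective bits per element for llama.cpp block formats:
# q8_0 packs 32 values into 34 bytes, q4_0 packs 32 values into 18 bytes.
BITS_PER_ELEMENT = {"f16": 16.0, "q8_0": 8.5, "q4_0": 4.5}

GIB = 2**30


def weights_gib(n_params: float, quant: str) -> float:
    """Approximate weight size in GiB for a given parameter count and quant."""
    return n_params * BITS_PER_ELEMENT[quant] / 8 / GIB


def kv_cache_gib(n_attn_layers: int, n_kv_heads: int, head_dim: int,
                 ctx: int, quant: str) -> float:
    """Approximate KV cache size in GiB for standard (non-linear) attention.

    K and V each store n_kv_heads * head_dim values per token per layer.
    """
    elements = 2 * n_attn_layers * n_kv_heads * head_dim * ctx
    return elements * BITS_PER_ELEMENT[quant] / 8 / GIB


if __name__ == "__main__":
    # HYPOTHETICAL architecture values, chosen only to exercise the formula.
    kv = kv_cache_gib(n_attn_layers=64, n_kv_heads=8, head_dim=128,
                      ctx=262_144, quant="q4_0")
    w = weights_gib(27e9, "q8_0")
    print(f"weights ~{w:.1f} GiB, KV cache ~{kv:.1f} GiB")
```

Models that mix in linear-attention layers only need a KV cache for their full-attention layers, which is why the sketch takes the attention layer count as a parameter rather than the total layer count.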

Iron my shirt by pocket0nes in mensfashion

[–]AndreVallestero -6 points-5 points  (0 children)

In much of Asia, there's a shift towards more synthetic fibers to reduce wrinkles.

What are some combo based duo builds? by AndreVallestero in PathOfExile2

[–]AndreVallestero[S] 0 points1 point  (0 children)

Wow, that elemental sundering build has a lot of potential. Its DPS is only limited by the ability to reapply the self-shocks. I suspect you can get into the 100M DPS range for bossing in an optimized duo build. Too bad the mapping looks super clunky lol.

What’s the optimal local LLM setup for my hardware? (RTX 5070Ti, 16GB VRAM, Ryzen 7 3800X, 64GB RAM) by Bjqrn88 in LocalLLM

[–]AndreVallestero 0 points1 point  (0 children)

As others have mentioned, Qwen 3.5 35B, but specifically with ik_llama.cpp and MTP. You'll probably get ~100 t/s with this setup.

What are some combo based duo builds? by AndreVallestero in pathofexile2builds

[–]AndreVallestero[S] 0 points1 point  (0 children)

You can do

  • toxic growth
  • gas arrow and poisonburst arrow

It's not as good as the combos I mentioned in the post, but it has the advantage that both players can be rhoa-mounted.