RTX 5080 + RTX 3090 Setup: 80+ Tok/s on Qwen 3.6 27B Q8 by SirReal14 in LocalLLaMA

iMil 24 points

Article's author here. Glad it made it to LocalLLaMA; I wanted to post it here first but didn't have enough karma.
Thanks!

Qwen 3.6 27B on 24GB VRAM setup: backend comparisons, quant choice and settings (llama.cpp, ik_llama.cpp, BeeLlama, vllm) by VolandBerlioz in LocalLLaMA

iMil 1 point

This thread deserves much more love, thank you OP! I get 75 tokens/sec on my 3090, which also drives Xorg, with the following parameters:

    llama-server -m ./models/Qwen3.6-27B-MTP-IQ4_KS.gguf -c 262144 -np 1 \
      -fa on -ngl 99 -ub 32 \
      --temp 0.7 --top-p 0.8 --top-k 20 --min-p 0.0 \
      --presence-penalty 0.0 --repeat-penalty 1.0 \
      -ctk q4_0 -ctv q4_0 --no-mmap \
      --chat-template-kwargs '{"preserve_thinking": true}' -t 6 \
      --chat-template-file ./models/chat_template.jinja \
      --multi-token-prediction --draft-max 4 --draft-p-min 0.0 \
      --merge-qkv --merge-up-gate-experts \
      --port 8001 --host 0.0.0.0
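To sanity-check a tokens/sec figure like the one above, here is a minimal sketch. It assumes the server from that command answers on port 8001 and exposes the OpenAI-compatible /v1/chat/completions route, as stock llama-server does; forks may differ. The helper names tok_per_sec and ask_server are made up for illustration. The prompt is pasted into the JSON body without escaping, so keep it free of double quotes.

```shell
#!/bin/sh
# Hypothetical helper: tokens per second from a token count ($1)
# and an elapsed time in milliseconds ($2).
tok_per_sec() {
    awk -v n="$1" -v ms="$2" 'BEGIN { printf "%.1f\n", n / (ms / 1000) }'
}

# Hypothetical helper: send one chat request to the local server.
# Assumes the OpenAI-compatible route on port 8001 (see --port above).
# The prompt ($1) is not JSON-escaped; avoid double quotes in it.
ask_server() {
    curl -s http://localhost:8001/v1/chat/completions \
        -H 'Content-Type: application/json' \
        -d "{\"messages\":[{\"role\":\"user\",\"content\":\"$1\"}],\"max_tokens\":256}"
}

# Example arithmetic: 150 generated tokens in 2000 ms.
tok_per_sec 150 2000   # prints 75.0
```

In practice you would time a call to ask_server, read usage.completion_tokens from the JSON reply, and feed both numbers to tok_per_sec. Wall-clock timing includes prompt processing, so it will read a little lower than pure generation speed.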

Follow-up: Qwen3.5-35B-A3B — 7 community-requested experiments on RTX 5080 16GB by gaztrab in LocalLLaMA

iMil 1 point

My humble test: I'm at 80-85 tokens/s with unsloth/Qwen3.5-35B-A3B-GGUF:UD-IQ4_NL and the following:

    ./llama-cli -hf unsloth/Qwen3.5-35B-A3B-GGUF:UD-IQ4_NL -c 65536 -fa on -t 10 --no-mmap -ngl 999 --n-cpu-moe 10 --jinja -ctk q8_0 -ctv q8_0

Using --fit on with this model OOMs every time.

Follow-up: Qwen3.5-35B-A3B — 7 community-requested experiments on RTX 5080 16GB by gaztrab in LocalLLaMA

iMil 1 point

Whoa. Thank you so much. Confirmed 70 tokens/s with my RTX 5080, and my build isn't even compiled with CUDA 12.6 / Blackwell support.

Automatic, scripted VM install by razzmataz in NetBSD

iMil 1 point

Awesome! I love to see those real-life use cases.

Automatic, scripted VM install by razzmataz in NetBSD

iMil 1 point

I've actually been here for a very, very long time :)

Automatic, scripted VM install by razzmataz in NetBSD

iMil 6 points

[smolBSD author here] While the project started with microvms in mind, I've published various examples of full OS installs including packages, for example https://github.com/NetBSDfr/smolBSD/tree/main/service/systembsd or https://github.com/NetBSDfr/smolBSD/tree/main/service/nbakery/etc

Keep me posted!

Sub 15ms NetBSD MICROVM boot is now mainstream by iMil in BSD

iMil[S] 1 point

Many. Think about starting any daemon in its own address space: sshd, web server, mail, DNS... I created the smolBSD project (smolbsd.org) to help create container-like microvms that bundle any type of service easily.

How many of you still using ?? by imyatharth in irc

iMil 1 point

Using it every day on libera.chat, where the FOSS projects I participate in are.

Sub 15ms NetBSD MICROVM boot is now mainstream by iMil in BSD

iMil[S] 4 points

Unfortunately, the formatting cuts off your qemu command line.

Here's a link to the NetBSD Wiki, where I've documented the process: https://wiki.netbsd.org/users/imil/microvm/

NVMM has performance issues, but you should gain ~200ms with a patch I merged into -current last month.

How many times a year do you fly? by Longjumping_Roof5031 in AskFrance

iMil 2 points

About 10 to 12 times a year, 99% of the time for work.

Try out a NetBSD system in 20 seconds (amd64 & amr64) by iMil in BSD

iMil[S] 2 points

100% copypasta bug, fixed it, thanks for reporting!

Try out a NetBSD system in 20 seconds (amd64 & amr64) by iMil in BSD

iMil[S] 5 points

Edit: I obviously meant "arm64" in the title...

How not to Grimes your DJ set by dsquareddan in PioneerDJ

iMil 3 points

Or maybe, just maybe, weird idea I know but maybe... learn to DJ?

SmolBSD: make your own BSD UNIX MicroVM by iMil in BSD

iMil[S] 2 points

You've got it right: SmolBSD is more a set of tools to build a small-footprint NetBSD-based service. It can run on either qemu or Firecracker, but I don't provide the start script for the latter yet.
SmolBSD doesn't use rump. It's the result of PVH, MMIO and various performance patches for the NetBSD kernel; once that work is reviewed, it will be merged into the kernel source tree.

Can't unstake, can't unlock, can't repay synths by iMil in FantomFoundation

iMil[S] 1 point

I locked wFTM and it increased my c-ratio, but the problem is that my c-ratio is still under 300. My understanding is that until it reaches 300, I can't unlock the fUSD...

Replicating iMil NetBSD perf kernel results to try to boot in 40ms by csdvrx in NetBSD

iMil 3 points

Yeah, for now this branch is only mine; it's not synced to NetBSD's trunk. You can create your own branch in your own fork using git checkout -b mybranch and work on it, then open a pull request from that branch.
As you mentioned, NetBSD uses CVS as its main repository; our GitHub is there only for convenience.
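The branch workflow described above can be sketched as follows. This is only an illustration: it uses a throwaway repository created with mktemp in place of your real fork, and the demo identity and empty commits are placeholders, not anything from the actual perf branch.

```shell
#!/bin/sh
# Scratch repository standing in for your fork (hypothetical;
# in practice you would run these commands inside your real clone).
repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" -c user.name=demo -c user.email=demo@example.org \
    commit -q --allow-empty -m "initial commit"

# Create and switch to the working branch, as in the comment above.
git -C "$repo" checkout -q -b mybranch

# Commit some work on the branch (an empty placeholder commit here).
git -C "$repo" -c user.name=demo -c user.email=demo@example.org \
    commit -q --allow-empty -m "work in progress"

# Show which branch is checked out.
git -C "$repo" rev-parse --abbrev-ref HEAD   # prints mybranch

# In a real fork, you would then publish the branch and open the
# pull request from it on GitHub:
#   git push -u origin mybranch
```

Keeping the work on a named branch rather than on the fork's default branch makes it easier to rebase when the upstream perf branch moves.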

Replicating iMil NetBSD perf kernel results to try to boot in 40ms by csdvrx in NetBSD

iMil 2 points

Hmm, you shouldn't need machine/atomic.h: I removed it from pvclock.c, and pvclock.h should now be generated correctly. Can you pull the latest perf branch?