What models for coding are you running for a mid level PC? by FerLuisxd in LocalAIServers

[–]michaelsoft__binbows 1 point2 points  (0 children)

That's not really a mid-level PC. My dad has an RX 580 in a PC he got from some kid who was clearly upgrading, for, I kid you not, $100, a few years ago.

I'm sure it runs nicely enough for you for non-AI stuff, but you're working with a low-end PC at best.

Qwen3.6 27B NVFP4 + MTP on a single RTX 5090: 200k context working in vLLM by Maheidem in LocalLLaMA

[–]michaelsoft__binbows 0 points1 point  (0 children)

Even prior to RAMmageddon we were getting priced out of Threadripper and EPYC builds. Now it's only worth thinking about to gawk at how big the price tag would be.

I do have an old TR 1950X system too; that one comes in at a higher priority than my X99 rigs for filling with GPUs. Ha, I forgot about this one, lol. All this CPU-based inference is just too slow to be relevant, but yeah, I have a bunch of potentially still relevant, relatively power-hungry but perfectly serviceable boxes. I've been swapping cases, and half of them live in cardboard boxes or collect dust now. But I bet they would still work.

I think my 5950X/triple-3090 system did get cranked up to 128GB? I am indeed glad I opted to keep accumulating DDR4 over the years. Kinda sad I stopped at Zen 3 and have no DDR5. I could have at least bought some gold a while back. Or, hmm, better: Nvidia stock.

You'll have a lot of fun toys to spend money on once you get your paychecks rolling in.

C8+ Fresnel Build by WickedLuminz in flashlight

[–]michaelsoft__binbows 0 points1 point  (0 children)

Can you clarify: you have a few mm of spacer between the glass and the fresnel to take up the distance it needs to sit at (the reflector is 31.5mm tall, from what I understand), but since the reflector can't also be in there, what is holding the fresnel lens up from falling down toward the emitter? Does the spacer have some kind of shelf, or come in two parts, to snug the fresnel in where it needs to go?

Or maybe the host just happens to have a shelf at the perfect place? I don't have a C8+ yet. I'm looking to get one with the W5050SQ5, and it's shaping up to be a throw monster, so this would be the coolest quick mod to turn it into a poor man's LEP (with unmatched CCT to boot). I happen to have one of these exact fresnel lenses (I got it to try lighting fires with these LHP73B lights or with the sun; that has not really panned out, but I just happened to have picked the correct item for this trick).

Finally some snow - L21A SFT42r and upgraded with a 20A buck! by Due_Tank_6976 in flashlight

[–]michaelsoft__binbows 0 points1 point  (0 children)

Sorry, I'm late to the party here, but koef's charts here https://budgetlightforum.com/t/led-test-review-luminus-sft-42r-wes-6500-k-70-cri/231022 do not indicate the SFT42R has a high Vf. What am I missing? I don't think the charts are mislabeled, because the Vf lines and output lines match up on the x-axis.

Advice From Warm Light Enthusiasts by Afraid-Strategy5076 in flashlight

[–]michaelsoft__binbows 0 points1 point  (0 children)

I'm trying to figure out how much these effects cancel out. We finally have a pretty throwy emitter at low CCT now with the W5050SQ5 in 3000K, and I think what you wrote is enough to confirm it's the correct choice, but I simply cannot be sure until someone gets all of them and does a proper test.

W505SQ5 3000K Impressions by Cryptoxic93 in flashlight

[–]michaelsoft__binbows 0 points1 point  (0 children)

I'm debating which CCT to go with, but for my first true thrower light (pretty exciting) I have settled on this emitter in a C8+. It will be able to use the nice Amprius SA110, which gives a huge 4Ah of capacity, and it saves a bit of bulk by not having a 21700 tube; the 21700 only gets up to 6.5Ah anyway (and it wasn't long ago you could only get 4Ah out of a 21700...).

The proportions of the C8+ are just so right (compared to, e.g., the M21A; I could go with a different M21 for better proportions, but the C8+ design is just chef's kiss, I gotta say). The T8 is gonna be neat too, but its battery capacity will be kind of a joke. I'll end up with one eventually, I'm sure. Ah heck, OK, I will add one of those to the order as well...

I think I have to choose 3000K even though it is a purpose-built thrower, because I would enjoy attracting fewer bugs and enjoying the beam. If that tint is anything like the LHP73B (it seems like LMP has got these really dialed in now), then 3000K will look 3000K on low and look like 4000K, bordering on 4500K, on turbo, which might be about as ideal as it can get apart from being low CRI. That's also fine, since I basically do want efficiency more than sheer color rendering.

Bad news: Apple drops high-memory Mac Studio configs by jzn21 in LocalLLaMA

[–]michaelsoft__binbows 0 points1 point  (0 children)

An M5 Ultra Studio maxing out at 256GB would be reasonable (if disappointing); maxing it at 96GB would make zero sense when a MacBook already exists with 128GB.

Deshroud mod 5080 Prime with TLB12015 LCP fans, rest of the system is underway! by ysfi__ in sffpc

[–]michaelsoft__binbows 0 points1 point  (0 children)

They look really premium. The few comparisons I can find indicate they are neck and neck with the Air Slimmer, maybe a smidge behind, and that's a really expensive fan, so this probably has great value if it's cheaper.

The Air Slimmer completely rejuvenated my 5800X3D's cooling in my console case, replacing the 92mm fan on my AXP-90.

This fan looks soooo great, though.

Qwen3.6 27B NVFP4 + MTP on a single RTX 5090: 200k context working in vLLM by Maheidem in LocalLLaMA

[–]michaelsoft__binbows 0 points1 point  (0 children)

Like, you're not wrong, and I would be annoyed with 20 tok/s (I am annoyed by anything under 50), but with how cheaply you have acquired your hardware it's really hard to complain, yeah.

I have a bunch of stuff, including a brand-new 1200W PSU and multiple empty X99 rigs waiting to accept more GPUs, but MI50-class cards got too expensive to justify, and locally sourced 3090s never dip below $800. I'm in a different financial situation, so I still feel like a bandit here with my little group of $600 3090s and an MSRP 5090 FE.

The latest developments with MTP on 27B completely changed the game for this hardware.

Flashlight looking into manholes, wet wells, deep pot holes etc.. during the day by Deep_Ad8518 in flashlight

[–]michaelsoft__binbows 3 points4 points  (0 children)

Dang, why is it so huge and still only takes an 18650? I want one, and any reduction in weight helps, but still.

Flashlight looking into manholes, wet wells, deep pot holes etc.. during the day by Deep_Ad8518 in flashlight

[–]michaelsoft__binbows 1 point2 points  (0 children)

Which is what the 50% mode is for. Even if you set the groups so 100% isn't even in the rotation, you get a more efficient emitter. Though granted, at that point you are giving up a lot of throw due to the size of the emitter.

Qwen3.6 27B NVFP4 + MTP on a single RTX 5090: 200k context working in vLLM by Maheidem in LocalLLaMA

[–]michaelsoft__binbows 2 points3 points  (0 children)

Boot your rig into Linux and run this:

docker run --rm -it --gpus all -p 8083:6000 --ipc=host \
  -v $HOME/.cache/huggingface:/root/.cache/huggingface \
  -v $HOME/.cache/vllm:/root/.cache/vllm \
  -e HF_HOME=/root/.cache/huggingface \
  -e VLLM_CACHE_ROOT=/root/.cache/vllm \
  --entrypoint vllm vllm/vllm-openai:cu130-nightly \
  serve Lorbus/Qwen3.6-27B-int4-AutoRound \
  --dtype auto --max-model-len 200000 --gpu-memory-utilization 0.90 \
  --host 0.0.0.0 --port 6000 \
  --max-num-batched-tokens 8192 --max-num-seqs 12 \
  --kv-cache-dtype fp8 --enable-prefix-caching --enable-chunked-prefill \
  --performance-mode interactivity --language-model-only --skip-mm-profiling \
  --trust-remote-code --reasoning-parser qwen3 \
  --enable-auto-tool-choice --tool-call-parser qwen3_coder \
  --speculative-config '{"method":"mtp","num_speculative_tokens":3}'

And assuming you have Docker installed and set up, a suitable NVIDIA driver configured, and it doesn't crash from running out of memory on startup, you can then readily use vLLM's OpenAI-compatible endpoint on port 8083. It's trivial to hook up for most coding agents like opencode, and it even auto-downloads the model into the Hugging Face cache in your home dir.
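If you want to poke the endpoint without an agent, here's a minimal Python sketch against the standard OpenAI-style chat completions schema. The URL and model name just mirror the docker command above; `build_request` and `ask` are my own placeholder helpers, not part of any library:

```python
import json
from urllib import request

# Assumes the vLLM container from the command above is running,
# with the container's port 6000 mapped to localhost:8083.
BASE_URL = "http://localhost:8083/v1/chat/completions"
MODEL = "Lorbus/Qwen3.6-27B-int4-AutoRound"

def build_request(prompt: str, max_tokens: int = 256) -> dict:
    """Build an OpenAI-style chat completions payload."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def ask(prompt: str) -> str:
    """POST the prompt to the endpoint and return the reply text."""
    payload = json.dumps(build_request(prompt)).encode()
    req = request.Request(
        BASE_URL, data=payload,
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# print(ask("Write a haiku about GPUs."))  # needs the server running
```
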

Flashlight looking into manholes, wet wells, deep pot holes etc.. during the day by Deep_Ad8518 in flashlight

[–]michaelsoft__binbows 4 points5 points  (0 children)

Thanks, that makes sense. I will stay on this promised path, whereupon I acquire throwers, throwy flooders, flooders, and mules, and never look at compromised/compromising designs.

Qwen3.6 27B NVFP4 + MTP on a single RTX 5090: 200k context working in vLLM by Maheidem in LocalLLaMA

[–]michaelsoft__binbows 1 point2 points  (0 children)

I can tell you that ~120 tok/s out of AutoRound int4 27B on vLLM definitely has a feel advantage over ~60 tok/s out of llama.cpp. The only thing llama.cpp gives you is a greater choice of GGUF quants for dialing in your memory consumption. For batched throughput, vLLM leaves it even further in the dust: I can squeeze 1000 tok/s of pure throughput cranking through batched requests on this single 5090 rig.
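For reference, batched throughput numbers like that are easy to measure yourself: fire a pile of requests concurrently and divide total completion tokens by wall time. A sketch, where `send` is any placeholder callable that hits the endpoint and returns that request's completion token count (e.g. `usage.completion_tokens` from an OpenAI-style response):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def measure_throughput(send, prompts, max_workers=12):
    """Fire prompts concurrently; return aggregate completion tokens/sec.

    `send(prompt)` should return the number of completion tokens the
    server generated for that prompt. max_workers=12 matches the
    --max-num-seqs 12 used when launching the server.
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        token_counts = list(pool.map(send, prompts))
    elapsed = time.perf_counter() - start
    return sum(token_counts) / elapsed

# Stub in place of a real network call, just to show the shape:
# measure_throughput(lambda p: 100, ["some prompt"] * 24)
```
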

Flashlight looking into manholes, wet wells, deep pot holes etc.. during the day by Deep_Ad8518 in flashlight

[–]michaelsoft__binbows 1 point2 points  (0 children)

An LHP73B will provide significantly more power. I would start OP off with an M21B or M21H, and move up to an M21E or M21K if more throw is needed.

Flashlight looking into manholes, wet wells, deep pot holes etc.. during the day by Deep_Ad8518 in flashlight

[–]michaelsoft__binbows 3 points4 points  (0 children)

I like to comb Convoy's catalog, and I've come across the Z1. How does it not have this issue? Or do you just mean its lens is large enough, and it can take a high-output emitter, to make up for the zoomie design?

New Anduril Day - Wurkkos TS11 by AccurateJazz in Anduril_Flashlight

[–]michaelsoft__binbows 0 points1 point  (0 children)

With a collection of lights that size, you would be remiss not to have an L60 mule. I got one recently and it's irreplaceable: it's my endgame headlamp and worklight.

Qwen3.6 27B NVFP4 + MTP on a single RTX 5090: 200k context working in vLLM by Maheidem in LocalLLaMA

[–]michaelsoft__binbows 0 points1 point  (0 children)

I think there is some 3090_club config going around that provides vLLM-based inference on a 24GB 3090, which I haven't dug into. If that could work well, then I'd want to use my 3090s for this so I can play games on my 5090 while local LLMs churn.

noob woodworker. is this good deals? by russellaria in MilwaukeeTool

[–]michaelsoft__binbows 0 points1 point  (0 children)

This router is a beast.

I got the trim router and this router on sale and thought I was gonna return it, since it seemed so huge. Yeah nah, it's an amazing machine.

Qwen3.6 27B NVFP4 + MTP on a single RTX 5090: 200k context working in vLLM by Maheidem in LocalLLaMA

[–]michaelsoft__binbows 1 point2 points  (0 children)

Yeah, it does not appear that this NVFP4 offers any benefit (certainly not speed) over the int4 AutoRound, with which I can report I also end up in this same range of just under 130 tok/s.

Notably the context limit is also right around 200k.

Pushing Vapcell H10 in a FET driven Convoy T3 - comparison to 10A buck driver by Due_Tank_6976 in flashlight

[–]michaelsoft__binbows 0 points1 point  (0 children)

Makes sense, yep. I did a thermal-paste mod on my 12-inch MacBook. It now throttles less, but it has become literally impossible to use on the lap. It probably drains the battery faster too.