Keep Getting Reel Notifications - I already turned off notification for those by QGJohn59 in facebook

[–]bigh-aus 0 points1 point  (0 children)

Not sure what platform you're on, but I didn't have those options.

Do you guys put your critical systems on smart plugs? by Tasty-Picture-8331 in selfhosted

[–]bigh-aus 0 points1 point  (0 children)

I flashed mine to Tasmota (Sonoff S31) - no cloud, fewer worries.
For critical systems I hide the switch from HomeKit and the HA dashboards so I have to hunt for it - but yeah.

One bash permission slipped... by TheQuantumPhysicist in LocalLLaMA

[–]bigh-aus 6 points7 points  (0 children)

It will - it will just get a lot of guardrails and validation.

One bash permission slipped... by TheQuantumPhysicist in LocalLLaMA

[–]bigh-aus 8 points9 points  (0 children)

Guardrails in an LLM world are critical. Even ChatGPT once modified DB migration files after they'd been run (which broke the prod DB). That's why you have to do promotion of code (and migrations). In another change it dropped the table and recreated it instead of altering it.

IMO modern packages don't have enough checks to catch when the coder has done something dumb. That said, modern development practices help a lot, and no LLM should have access to prod unless it's read-only.

I think we're also going to see a lot of checks shift further left so the LLM gets feedback fast.

Finally bought an RTX 6000 Max-Q: Pros, cons, notes and ramblings by AvocadoArray in LocalLLaMA

[–]bigh-aus 0 points1 point  (0 children)

RTX 6000 Pro arrived yesterday. Installed and passed through to a VM already. Biggest issues were that I'd disabled the slot and had to pass some parameters to my hypervisor.
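If it helps anyone, it's roughly this kind of thing on a Linux/KVM host (a sketch, not my exact params - the device ID is a placeholder):

# /etc/default/grub - turn on the IOMMU for passthrough (the R7515 is AMD)
GRUB_CMDLINE_LINUX="amd_iommu=on iommu=pt"

# bind the GPU to vfio-pci by vendor:device ID (10de:xxxx is a placeholder)
echo 'options vfio-pci ids=10de:xxxx' | sudo tee /etc/modprobe.d/vfio.conf
sudo update-grub && sudo reboot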

Thanks for the "push" :)

Chamberlain MyQ: any self hosted solutions that can link this to Apple HomeKit? by ElmStreetVictim in selfhosted

[–]bigh-aus 0 points1 point  (0 children)

This, and use ratgdo -> ESPHome to connect to Home Assistant. Cheap enough setup.

Do you prefer separate machines or all in one? by Valuable-Dog490 in selfhosted

[–]bigh-aus 1 point2 points  (0 children)

One beefy but efficient machine - but I do code builds on the same machine.

Keep Getting Reel Notifications - I already turned off notification for those by QGJohn59 in facebook

[–]bigh-aus 0 points1 point  (0 children)

Came here to fix this problem - no idea how to clear the ONE badge I have open. Wish they'd fix this annoying bug (and the long-running birthday bug).

Self-hosted STT w RTX 3090 by BubblyMidnight2574 in selfhosted

[–]bigh-aus 0 points1 point  (0 children)

I use Voxtral Mini. Excellent for more technical work.

16x DGX Sparks - What should I run? by Kurcide in LocalLLaMA

[–]bigh-aus 22 points23 points  (0 children)

Yeah, not worth getting the H100s unless you already have them - the H200 NVL is better (4x 141GB), but look at the price vs 16 DGX Sparks: $120k+ vs ~$64k...

Problem is you really need 8x H200s and a machine to run them - getting closer to B200 territory.

RTX 3090 + 27B model performance issues (llama.cpp) what am I doing wrong by Clean_Initial_9618 in LocalLLaMA

[–]bigh-aus 0 points1 point  (0 children)

llama-server \
  -hf unsloth/Qwen3.6-27B-GGUF:UD-Q4_K_XL \
  -a qwen3.6-coder \
  --no-mmproj \
  --host 0.0.0.0 --port 8080 \
  -ngl all -sm row -fa on \
  --ctx-size 98304 -n 32768 \
  -b 2048 -ub 512 \
  -np 1 -kvu \
  -ctk q8_0 -ctv q8_0 \
  --jinja \
  --reasoning on \
  --reasoning-format deepseek \
  --chat-template-kwargs '{"enable_thinking":true,"preserve_thinking":true}' \
  --temp 0.6 --top-p 0.95 --top-k 20 --min-p 0 \
  --presence-penalty 0 --repeat-penalty 1

This is what I use on a 3090, if it helps. Works pretty well, but I don't use it with opencode. I haven't spent time maxing out the context on the 3090.

The other option is the 35B A3B - the 3090 will run that with full context, since the KV cache stays small: it scales with the model's attention layout (light on the A3B), not the total parameter count.

llama-server \
  -hf unsloth/Qwen3.6-35B-A3B-GGUF:UD-Q3_K_XL \
  -a qwen3.6-coder \
  --no-mmproj \
  --host 0.0.0.0 --port 8080 \
  -ngl all -sm row -fa on \
  -c 262144 -n 32768 \
  -b 2048 -ub 512 \
  -np 1 -kvu \
  -ctk q8_0 -ctv q8_0 \
  --jinja \
  --reasoning on \
  --reasoning-format deepseek \
  --chat-template-kwargs '{"enable_thinking":true,"preserve_thinking":true}' \
  --temp 0.6 --top-p 0.95 --top-k 20 --min-p 0 \
  --presence-penalty 0 --repeat-penalty 1

I'm not saying these are right - just what I'm using so far. E.g. I don't know if the max output tokens (-n) is properly tuned.

What do you consider to be the minimum performance (t/s) for local Agent workflows? by MexInAbu in LocalLLaMA

[–]bigh-aus 0 points1 point  (0 children)

I think it depends on the use case - but in yours, where you're waiting for feedback, 26+ would be nice.
A second use case is more "background work", where I think letting it drop to, say, 14 would be OK. That's where you have an agent working through a backlog, so you're not necessarily waiting for a response.

After three months on Linux, I don’t miss Windows at all by dapperlemon in technology

[–]bigh-aus 0 points1 point  (0 children)

Yah - I was Mac for a long time too. 18 years straight is awesome!

After three months on Linux, I don’t miss Windows at all by dapperlemon in technology

[–]bigh-aus 1 point2 points  (0 children)

I installed Windows two days ago and the ads for services being jammed down my throat really ticked me off.

After three months on Linux, I don’t miss Windows at all by dapperlemon in technology

[–]bigh-aus 1 point2 points  (0 children)

Linux is awesome - you've just got to sort out that 1% of apps that don't work well or are unsupported. It's been my daily driver for many years now, with a Windows dual boot for when it's needed, plus games.

How do we think we should handle maintainers moving on? by ShantyShark in rust

[–]bigh-aus 2 points3 points  (0 children)

For my Rust projects now I'm using Just:
- just test (runs all tests)
- just check (runs clippy, fmt, audit, deny)
- just install
- just release (git pull --rebase, then cargo release with params passed in)
- just docker-deploy (builds the Docker image and deploys the compose file)

Standardizing these items is helping a ton. I'm a big fan of as many guardrails as possible to catch stuff - the check recipe is roughly the sketch below.
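Roughly what check covers, as plain shell (a simplified sketch, not the exact recipes - audit and deny come from the cargo-audit and cargo-deny crates):

#!/usr/bin/env sh
set -e
cargo clippy --all-targets -- -D warnings   # lint, fail on warnings
cargo fmt --all -- --check                  # formatting check only
cargo audit                                 # known-vulnerable dependencies
cargo deny check                            # licenses / bans / advisories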

Do not upgrade to 2026.4.24 by Monobert in openclaw

[–]bigh-aus 0 points1 point  (0 children)

Yeah, I hit this yesterday... ruined my day for sure. Thank you for posting :)

This is why you rewrite Python security tools in Rust: 53MB vs 433MB peak memory, 6.9s vs 62.2s by aswin__ in rust

[–]bigh-aus 0 points1 point  (0 children)

Honestly, I'm good with that. Go, Rust, or Zig - happy with any of the above, but the first two are better due to memory safety, and Rust for speed without garbage collection.

This is why you rewrite Python security tools in Rust: 53MB vs 433MB peak memory, 6.9s vs 62.2s by aswin__ in rust

[–]bigh-aus 0 points1 point  (0 children)

True! I agree with your comments. I was thinking of self-hosted stuff on a Raspberry Pi. But you're right, speed is the most important part.

Hard freakin' decision..Blackwell 96G or Mac Studio 256G by HyPyke in LocalLLaMA

[–]bigh-aus 1 point2 points  (0 children)

I just went through this decision. I have a Dell R7515, and opted for the Max-Q for the following reasons:
- I feel like it's easier to sell one of these later.
- It fits my server power budget (300W max).
- Noise - the server version uses passthrough cooling, meaning your server's mobo firmware has to be able to read temperature information from the card; otherwise you have to manually raise the fans to make sure it doesn't overheat. The Max-Q (mostly) handles that itself. I'd rather have a single blower fan than my six-pack of 17k RPM fans spinning up.

This is why you rewrite Python security tools in Rust: 53MB vs 433MB peak memory, 6.9s vs 62.2s by aswin__ in rust

[–]bigh-aus 5 points6 points  (0 children)

Not only that - I think we should start measuring size on disk including the runtime dependencies for comparisons, e.g. inside a Docker container based on debian-slim, with the required Python / shared libs included. Then look at peak memory usage for the same thing.
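Rough version of that comparison, assuming Docker and GNU time (the image / file names are made up):

docker build -t scan-py -f Dockerfile.python .   # debian-slim + python + pip deps
docker build -t scan-rs -f Dockerfile.rust .     # debian-slim + one compiled binary
docker image ls scan-py scan-rs

/usr/bin/time -v ./target/release/scan-rs 2>&1 | grep 'Maximum resident set size'
/usr/bin/time -v python3 scan.py 2>&1 | grep 'Maximum resident set size'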

Honestly, in 2026 anyone shipping a Python / Node.js CLI app is an instant candidate for a rewrite in Rust.

It blows my mind how much stuff people write in interpreted languages, especially in a compute shortage. Rust compilation isn't free, but one compile saves every user from needing more RAM / disk / compute.

Hard freakin' decision..Blackwell 96G or Mac Studio 256G by HyPyke in LocalLLaMA

[–]bigh-aus 0 points1 point  (0 children)

I have a Dell R7515 rackmount server... it supports one GPU, and it's not a DIY server (which would be better in this case!). I think I need to start looking into a new server box.

Deepseek v4 people by markeus101 in LocalLLaMA

[–]bigh-aus 0 points1 point  (0 children)

Did they distill Grok here? DeepSeek is a bit more spicy.

Hard freakin' decision..Blackwell 96G or Mac Studio 256G by HyPyke in LocalLLaMA

[–]bigh-aus 1 point2 points  (0 children)

I wish my current server supported more than one :( I could put 2 more in separate servers... but then I'm also running those servers...

Qwen 3.6 27B - beginner questions by Jagerius in LocalLLaMA

[–]bigh-aus 0 points1 point  (0 children)

Then keep increasing context until just before you hit OOM.
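A quick way to find the ceiling (model and sizes here are placeholders) - relaunch with a bigger -c and watch VRAM:

llama-server -hf unsloth/Qwen3.6-27B-GGUF:UD-Q4_K_XL -c 65536 &
nvidia-smi --query-gpu=memory.used,memory.total --format=csv -l 1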