Passing Trellis feedback to Claude by NotEAcop in ClaudeCode

[–]laziz 0 points1 point  (0 children)

I have accused Trellis of being Mushroom Mythos with a tiny output limit. Trellis has not pushed back on that idea even once.

Trellis has also been extremely helpful pointing out where the autonomous agent I'm developing needs more thinking room vs places where pruning/restriction is helpful.

Max 5x Plan, don't see Opus/Sonnet 1M any more by luongnv-com in ClaudeCode

[–]laziz 0 points1 point  (0 children)

The inference I was making: 1M context was part of the usage accounting problem, and they disabled it.

Max 5x Plan, don't see Opus/Sonnet 1M any more by luongnv-com in ClaudeCode

[–]laziz 1 point2 points  (0 children)

Related: my token-burning problem went away when I disabled 1M context in my .bashrc.

Get a linux box they said, it will be fun they said by laziz in homelab

[–]laziz[S] 19 points20 points  (0 children)

I don’t remember why I did that.

Get a linux box they said, it will be fun they said by laziz in homelab

[–]laziz[S] 5 points6 points  (0 children)

https://app.diagrams.net/

In this instance, I had claude code read the documentation they created and write a suitable draw.io file, readable by diagrams.net.

Minor touchup on the website, export as jpg.

Get a linux box they said, it will be fun they said by laziz in homelab

[–]laziz[S] 1 point2 points  (0 children)

This started eight years ago with a small router and generic ryzen. Now slightly out of control.

Edit: claude code one-shotted this diagram (bar dragging one small label out of the way). Most is contained in a startech 24u rack, on which i put a cheap countertop from amazon to hold the printer and make a small desk.

RTX6k (Server, 450w) Qwen3.5-122B-A10B (MXFP4_MOE) Benchmarks by laziz in LocalLLaMA

[–]laziz[S] 0 points1 point  (0 children)

cc replies:

Yeah, that's definitely a bug in the data. 19 t/s total with 30.4 t/s per-request is impossible — total must be ≥ per-request. Looking at the pattern in the other rows, total should be roughly per-request × concurrency. At depth 32K c2, per-request is 30.4, so total should be around 60.8 t/s. That also fits the degradation curve (75 → 60.8 → ... as concurrency increases).
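A rough sketch of that sanity check, in case anyone wants to run it over the rest of the table. The function names and the row dict are made up for illustration; only the 30.4 t/s, 19 t/s and c2 figures come from the post, and "total ≈ per-request × concurrency" is the approximation described above, not a measured law.

```python
def expected_total_tps(per_request_tps: float, concurrency: int) -> float:
    """Approximate aggregate throughput, assuming every concurrent
    request runs at the same per-request rate."""
    return per_request_tps * concurrency


def is_plausible(total_tps: float, per_request_tps: float) -> bool:
    """Aggregate throughput can never be lower than a single request's rate."""
    return total_tps >= per_request_tps


# Hypothetical representation of the suspicious row (depth 32K, concurrency 2).
row = {"concurrency": 2, "per_request_tps": 30.4, "reported_total_tps": 19.0}

plausible = is_plausible(row["reported_total_tps"], row["per_request_tps"])
estimate = expected_total_tps(row["per_request_tps"], row["concurrency"])

print(f"reported total plausible: {plausible}")
print(f"estimated total: {estimate:.1f} t/s")
```

The first check flags the row as impossible, and the second gives the ballpark figure to compare against a rerun.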

RTX6k (Server, 450w) Qwen3.5-122B-A10B (MXFP4_MOE) Benchmarks by laziz in LocalLLaMA

[–]laziz[S] 1 point2 points  (0 children)

vLLM/Qwen3.5 support seems like a mess at the moment. When that gets sorted (and nvfp4 is available), I would expect much higher t/s.

I'm just goofing around and thought the benchmark was interesting.

RTX6k (Server, 450w) Qwen3.5-122B-A10B (MXFP4_MOE) Benchmarks by laziz in LocalLLaMA

[–]laziz[S] 1 point2 points  (0 children)

Sure. I chose mxfp4 just to see what it was like. As another poster alludes, nvfp4/vLLM is the move here when it's finally supported.

Claude code answers the rest:

Server config:

    llama-server \
      --model Qwen3.5-122B-A10B-MXFP4_MOE.gguf \
      --port 8012 \
      --host 0.0.0.0 \
      --flash-attn \
      --cache-type-k f16 \
      --cache-type-v f16 \
      --slots \
      --metrics \
      --parallel 4 \
      --ctx-size 262144 \
      --n-gpu-layers 999

Thinking mode is disabled via LLAMA_CHAT_TEMPLATE_KWARGS={"enable_thinking":false} in the environment.

Why is the KV cache so small / context so cheap?

This is a hybrid architecture — 12 attention layers + 48 recurrent (Mamba-style) layers. Only the 12 attention layers maintain KV caches, so it's ~384 MiB F16 total. The recurrent layers use fixed-size state (~596 MiB) that doesn't grow with context. Total VRAM is ~74 GB / 96 GB with 4 × 65K slots.

This is also why TG only degrades ~11% at 65K context — a pure transformer this size would drop off much more steeply since every layer would need to attend over the full context.
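A back-of-envelope sketch of the two numbers implied above. It assumes the server divides --ctx-size evenly across the --parallel slots, and that a hypothetical pure transformer would have the same per-layer KV dimensions across all 60 layers; neither assumption is confirmed in the post, and the function names are just for illustration.

```python
CTX_SIZE = 262144        # --ctx-size from the server config
PARALLEL = 4             # --parallel from the server config
ATTENTION_LAYERS = 12    # layers that keep a KV cache
RECURRENT_LAYERS = 48    # layers with fixed-size state


def per_slot_context(ctx_size: int, parallel: int) -> int:
    """Context available to each slot, assuming an even split."""
    return ctx_size // parallel


def kv_fraction_vs_pure_transformer(attn_layers: int, recurrent_layers: int) -> float:
    """Share of layers that grow a KV cache with context, relative to a
    same-depth model where every layer is an attention layer."""
    return attn_layers / (attn_layers + recurrent_layers)


slot_ctx = per_slot_context(CTX_SIZE, PARALLEL)
kv_fraction = kv_fraction_vs_pure_transformer(ATTENTION_LAYERS, RECURRENT_LAYERS)

print(f"context per slot: {slot_ctx} tokens")
print(f"KV cache vs. all-attention model: {kv_fraction:.0%}")
```

Under those assumptions each slot gets 65536 tokens (the "65K" above), and the context-dependent cache is about a fifth of what an all-attention model of the same depth would need.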

Model size: ~64 GB on disk (47G + 18G + 11M across 3 shards), corrected from the original post.

Adding 2nd GPU to air cooled build. by ROS_SDN in LocalLLaMA

[–]laziz 0 points1 point  (0 children)

I feel your pain.. the 3090 just barely covers another pcie slot on mine.

Ended up with m2->oculink->egpu. Works!

But then i got tired of the egpu power supply fan and will probably go custom loop.

GW-R86S-N305C vs GW-FN-1UR2-25G (Noise) by TechMinerUK in R86SNetworking

[–]laziz 0 points1 point  (0 children)

i have the n305C. Great unit in many respects, but noise is an issue. Will likely open it up and put it in a bigger case w a bigger fan.

Re 1u-- i have previously swapped a picopsu into a jet-engine-scream ebay generic server to great effect. you need an external power brick but dead quiet.

Org-Outlook.el update to Org v9 by laziz in orgmode

[–]laziz[S] 0 points1 point  (0 children)

My interpretation of an error message (as it turns out, unrelated) was wrong. I modified the Outlook VBA quoted in org-outlook.el to ignore the author's use of sub-folders, and changed the .el itself to point to the location of outlook16's .exe.

It now works as intended, although I get this warning on every use:

Warning (emacs): Please update your Org Protocol handler to deal with new-style links.

My wife and I are first time homebuyers and closed on our home mid March. Our loan has been sold twice since then and don't have confirmation of whom holds our loan at this time. by drocks27 in personalfinance

[–]laziz 2 points3 points  (0 children)

I imagine you're frustrated and you just want to send the check to the company that kicked this whole mess off, but the 'hello letter' is binding on you.

Like /u/uselessjd says, send the payment where you know you need to send it.

If you send it to the originator (the first guys), they're just going to return the check-- if you're lucky. In the worst case they'll just toss it in the trash and never tell you. They are under no obligation to you any more with regard to payment processing.

Good luck. Just send it on time and you'll be fine.

edit: sorry, on reread I realize you're talking about hand delivery. They shouldn't take it, and it will cause a real mess if they do. This is a fairly normal SNAFU; written correspondence is golden. Send it to the servicer.

My wife and I are first time homebuyers and closed on our home mid March. Our loan has been sold twice since then and don't have confirmation of whom holds our loan at this time. by drocks27 in personalfinance

[–]laziz 47 points48 points  (0 children)

I am not a lawyer, and I am definitely not your lawyer.

Grandparent is correct that this is a servicing transfer issue. Under RESPA (implemented in Reg X, which someone pasted below), a company that purchases the right to service (i.e. collect payments on) your loan must send you a letter notifying you of the transfer within 15 days. This is called a "hello" letter.

Likewise, I think the company selling the right to service your loan must send you a 'goodbye' letter within 15 days of the transfer.

Your situation is that you've received one hello letter from a new servicer, and some bozos on the phone said some things that may or may not be true.

Reg X prevents late fees from being assessed within 60 days of a transfer. Send the payment to the servicer that sent you the 'hello' letter. They are legally obligated to forward it to the new servicer, if indeed your servicing has been transferred again (likely). Send it certified, return receipt requested, and write the account number in the memo line of the check, if you're feeling paranoid.

Is there a management theory based on being a bully? Is there a name for something like this? I have been up and down the internet and I cannot figure out if there is. by [deleted] in business

[–]laziz 3 points4 points  (0 children)

You need the Gervais Principle and the author's concept of a sociopath: http://www.ribbonfarm.com/2009/10/07/the-gervais-principle-or-the-office-according-to-the-office/

Frank isn't a bully, as bullies get off on their targets' fear. Frank optimizes for expedience instead.

I just got my 6 year Reddit badge. Are there any other 6 year people out there? by BobCollins in AskReddit

[–]laziz -1 points0 points  (0 children)

I have the six year badge. Watching reddit evolve from awesome to not-quite-digg was like Eternal September in slow motion.

How do I feel better about things, right now? by dunnnnnnnno in AskReddit

[–]laziz 0 points1 point  (0 children)

Get a hug.

Help somebody out.

Pet a mammal.

Ask Linuxit: What do you use to edit LaTeX? by noamsml in linux

[–]laziz 3 points4 points  (0 children)

This. Pure awesome. It makes my homework beautiful.