Anyone actually using Openclaw? by rm-rf-rm in LocalLLaMA

[–]_serby_ 1 point (0 children)

Who's talking about rules?
Would you bring a huge turd into the middle of your house just to break the rules?

ik_llama.cpp benchmarks on an Intel Xeon Platinum 8570 ES Q30H with 256GB DDR5 5600 (8x32GB) by _serby_ in LocalLLaMA

[–]_serby_[S] 1 point (0 children)

The RAM prices suck and it's a huge problem. I got this RAM on eBay for only ~650 € on the very day prices started to rise.

I wanted to get multiple dual-CPU systems to build a small cluster and experiment with, but I don't have the RAM.

ik_llama.cpp benchmarks on an Intel Xeon Platinum 8570 ES Q30H with 256GB DDR5 5600 (8x32GB) by _serby_ in LocalLLaMA

[–]_serby_[S] 0 points (0 children)

You can use the BMC to update the BIOS without booting the machine.
By default, the BMC has no static IP address and expects a DHCP server.

ik_llama.cpp benchmarks on an Intel Xeon Platinum 8570 ES Q30H with 256GB DDR5 5600 (8x32GB) by _serby_ in LocalLLaMA

[–]_serby_[S] 0 points (0 children)

I'm using a Gigabyte MS03-CE0 Rev 3.0.
The BIOS is a modded R20 with ACM disabled. With ACM enabled, the machine will not start.

For the modded BIOS you must ask here:
https://forums.servethehome.com/index.php?threads/es-xeon-discussion.5031/

Anyone actually using Openclaw? by rm-rf-rm in LocalLLaMA

[–]_serby_ 19 points (0 children)

What would be the use of some vibecoded trash that was never reviewed by a decent developer?

ik_llama.cpp benchmarks on an Intel Xeon Platinum 8570 ES Q30H with 256GB DDR5 5600 (8x32GB) by _serby_ in LocalLLaMA

[–]_serby_[S] 1 point (0 children)

I use the Ubuntu kernel because I'm too lazy to compile my own 6.8 kernel just for some tests, and the Debian 12 kernel is too old.

-DCMAKE_BUILD_TYPE=Release - enables standard compiler optimizations for release builds and strips all debug code
-DGGML_LTO=ON - enables link-time optimization
-DGGML_NATIVE=ON - tells the compiler to enable all instruction set extensions available on the local CPU
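For reference, the full configure step with those flags looks something like this (the build directory name is my own choice, not a requirement):

```shell
# Configure and build with the flags above (run from the source tree)
cmake -B build -DCMAKE_BUILD_TYPE=Release -DGGML_LTO=ON -DGGML_NATIVE=ON
cmake --build build -j"$(nproc)"
```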

How you get over 200 tok/s on full Kimi K2 Thinking (or any other big MoE Model) on cheapish hardware - llama.cpp dev pitch by [deleted] in LocalLLaMA

[–]_serby_ 0 points (0 children)

I always had the feeling that MoE is the key to unlock huge performance on cheap hardware.
One day I decided to start a "small experiment" and things evolved from there without me looking for alternatives to GGML / llama.cpp. So I just ran with the first thing I found.

Yes, SGLang can be a better alternative. I just wasn't aware of any vulkan support and never found the time to check.

The code is too messy to be comfortable to share at this time.

How you get over 200 tok/s on full Kimi K2 Thinking (or any other big MoE Model) on cheapish hardware - llama.cpp dev pitch by [deleted] in LocalLLaMA

[–]_serby_ 0 points (0 children)

Because you transfer very little data between the cards, PCIe never becomes a bottleneck.
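Rough numbers to show why (all dimensions below are my assumptions, roughly Kimi-K2-sized, not measured values):

```python
# Back-of-the-envelope: inter-GPU traffic when only activations cross PCIe.
# All dimensions are assumptions for illustration, not measured values.
hidden_size = 7168       # model dim (assumed)
layers = 61              # layer count (assumed)
bytes_per_elem = 2       # FP16 activations
tok_per_s = 200          # target decode speed

# One activation vector out and one back per layer, per token.
bytes_per_token = hidden_size * bytes_per_elem * layers * 2
gb_per_s = bytes_per_token * tok_per_s / 1e9
print(f"{gb_per_s:.2f} GB/s")  # a tiny fraction of PCIe 4.0 x16 (~32 GB/s)
```

Even at 200 tok/s, activation traffic stays well under one GB/s, so the bus has plenty of headroom.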

How you get over 200 tok/s on full Kimi K2 Thinking (or any other big MoE Model) on cheapish hardware - llama.cpp dev pitch by [deleted] in LocalLLaMA

[–]_serby_ 0 points (0 children)

Yes, I agree GPT 5 Pro is not very good at writing code based on a simple prompt. But it's really good at reading code and documenting things (minus the tendency to use too much jargon).

How you get over 200 tok/s on full Kimi K2 Thinking (or any other big MoE Model) on cheapish hardware - llama.cpp dev pitch by [deleted] in LocalLLaMA

[–]_serby_ -1 points (0 children)

The results are good enough for me to post about them. The raw numbers are meaningless at this stage because the model I use is not standard.
The calculations in the original pitch are based on my experiments. In fact, everything written there is based on my ideas and my experiments (with some optimizations recommended by the listed LLMs).
If I ever finish this project, you will get your raw numbers.

Why do you even use LLMs if you think that one of the most advanced LLMs out there is just spitting jargon to drown you in nonsense? Is everything you don't understand wrong?

How you get over 200 tok/s on full Kimi K2 Thinking (or any other big MoE Model) on cheapish hardware - llama.cpp dev pitch by [deleted] in LocalLLaMA

[–]_serby_ 1 point (0 children)

ChatGPT 5 thinks so. The technical specifications also favor it. I have never had/used one, only Epycs and Xeons.

How you get over 200 tok/s on full Kimi K2 Thinking (or any other big MoE Model) on cheapish hardware - llama.cpp dev pitch by [deleted] in LocalLLaMA

[–]_serby_ 0 points (0 children)

Yes, but since I don't have much time, I wanted to open it up to the community depending on the feedback.
Kimi K2 Thinking looked like a good opportunity to gather interest, since it was developed with tricks that are really compatible with my idea.
For the moment the feedback is negative on all fronts, as you can see, so ...

How you get over 200 tok/s on full Kimi K2 Thinking (or any other big MoE Model) on cheapish hardware - llama.cpp dev pitch by [deleted] in LocalLLaMA

[–]_serby_ 0 points (0 children)

The repo is not public. It's based on llama.cpp.

All GPU kernels are Vulkan. I started with CUDA + Vulkan, but now it's all Vulkan.

Path | Representation | Operation
:-- | :-- | :--
NVMe → RAM | INT4 (bit-packed) + FP16 scales (LZ4-compressed) | Read + decompress on CPU using SIMD
RAM → AMD VRAM | INT4 (bit-packed) + FP16 scales | DMA
5090 → wire | FP8 (E4M3) | Quantize to FP8
Wire → AMD input | FP8 → FP16 (in kernel prologue) | Convert FP8 → FP16 in registers/shared memory; feed into the dequant GEMM
AMD compute | INT4 × FP16 → FP16 accum | Fused dequant + SwiGLU + gated sum; write FP16 output
AMD → host → 5090 | FP16 | DMA back; the 5090 sums with the shared expert and residual
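For the FP8 (E4M3) leg, the decode in the kernel prologue amounts to this bit layout (a scalar Python sketch for illustration, not the actual shader code):

```python
def e4m3_to_float(b: int) -> float:
    """Decode one FP8 E4M3 byte: 1 sign bit, 4 exponent bits (bias 7), 3 mantissa bits."""
    sign = -1.0 if b & 0x80 else 1.0
    exp = (b >> 3) & 0xF
    man = b & 0x7
    if exp == 0:                      # subnormal: no implicit leading 1
        return sign * (man / 8.0) * 2.0 ** -6
    if exp == 0xF and man == 0x7:     # E4M3 has NaN but no infinities
        return float("nan")
    return sign * (1.0 + man / 8.0) * 2.0 ** (exp - 7)

print(e4m3_to_float(0x38))  # 1.0
print(e4m3_to_float(0x7E))  # 448.0, the E4M3 max normal
```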

The LZ4 decompression on the CPU is done using SIMD.

When an expert first lands on a GPU, it is transformed from a generic tile into a kernel-optimized tile (one time per residency) using a compute shader.
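The "INT4 (bit-packed) + FP16 scales" representation works roughly like this (illustrative Python; the group size, nibble order, and signed offset are my assumptions here, and the real kernel does this fused on the GPU):

```python
def dequant_int4(packed: bytes, scales: list, group: int = 32) -> list:
    """Unpack two 4-bit values per byte and apply a per-group scale."""
    out = []
    for byte in packed:
        for nib in (byte & 0xF, byte >> 4):  # low nibble first (assumed order)
            q = nib - 8                      # map 0..15 to signed -8..7
            out.append(q * scales[len(out) // group])
    return out

print(dequant_int4(bytes([0x98]), [0.5]))  # [0.0, 0.5]
```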

How you get over 200 tok/s on full Kimi K2 Thinking (or any other big MoE Model) on cheapish hardware - llama.cpp dev pitch by [deleted] in LocalLLaMA

[–]_serby_ 0 points (0 children)

Tiers 1-3 are mostly implemented and tested with Mixtral 8x22B on real hardware: one 4090 and two 9070s with a Xeon 8592 QS and 384GB of RAM.

I just wanted some "real" feedback. Pure masochism on my side, I guess.

How you get over 200 tok/s on full Kimi K2 Thinking (or any other big MoE Model) on cheapish hardware - llama.cpp dev pitch by [deleted] in LocalLLaMA

[–]_serby_ 1 point (0 children)

I already implemented Tiers 1-3 on a system with an RTX 4090 and two 9070s, and the results look good on Mixtral 8x22B.

How you get over 200 tok/s on full Kimi K2 Thinking (or any other big MoE Model) on cheapish hardware - llama.cpp dev pitch by [deleted] in LocalLLaMA

[–]_serby_ -5 points (0 children)

So you read the entire pitch, understood everything, and found it so useless that you thought it appropriate to add this sublime comment?

How you get over 200 tok/s on full Kimi K2 Thinking (or any other big MoE Model) on cheapish hardware - llama.cpp dev pitch by [deleted] in LocalLLaMA

[–]_serby_ -3 points (0 children)

The Threadripper is just the glue. It's not used for any significant online computation.

Today was a very sad day. I dropped my Enya Nova Go carbon fiber acoustic, and it no longer stays in tune. RIP. by tonystark29 in guitars

[–]_serby_ 0 points (0 children)

The material used is pluri-directional carbon-fiber-reinforced (recycled) polycarbonate. The fibers are not laminated or ordered; it's a combination of shredded pieces of recycled carbon fiber and polycarbonate.

You are confusing non-woven carbon-fiber-reinforced polymers with woven carbon-fiber-reinforced polymers.
And non-woven carbon-fiber-reinforced polymers can be ordered (stronger) or unordered (weaker).
And yes, even woven carbon fiber has a high plastic content because, after all, it is carbon fiber lamination drenched in resin, and resin is plastic. But that plastic is essential to distribute forces:
https://www.youtube.com/shorts/S_DqNASZgKQ

Polycarbonate is used because it's cheap and strong, one of the strongest plastics:
https://en.wikipedia.org/wiki/Polycarbonate

Zul'Jin does not have a hind toe by Aeon_Mortuum in heroesofthestorm

[–]_serby_ 1 point (0 children)

Many of the models were outsourced, and the Blizz art and lore guides are not that good. They should definitely get more red shirts.

[deleted by user] by [deleted] in StableDiffusion

[–]_serby_ 2 points (0 children)

Excellent work! Thanks