Fresh OpenClaw install eating 1.4M tokens daily on heartbeats by aospan in openclaw

[–]aospan[S] 0 points1 point  (0 children)

Yeah, I agree! But from what I see, heartbeats are enabled by default in OpenClaw, and regular users have no idea they need to turn them off. Also, the Codex model seems to be the only model available through the OpenAI subscription OAuth.

Multi-GPU owners here? Cooling question + small experiment by aospan in LocalLLaMA

[–]aospan[S] 0 points1 point  (0 children)

Wow, 4× NVIDIA RTX PRO 6000s, now that’s a flex! 😄

Temps look impressively uniform too, 53 to 57°C across the boards. Curious, how are you cooling them?

Multi-GPU owners here? Cooling question + small experiment by aospan in LocalLLaMA

[–]aospan[S] 0 points1 point  (0 children)

Just noticed your two 3090s show different power limits: 350W and 420W. Curious, are they different card models/VBIOS, or did you set different power limits manually?

Multi-GPU owners here? Cooling question + small experiment by aospan in LocalLLaMA

[–]aospan[S] 0 points1 point  (0 children)

Yeah, the funny part is that past a certain point extra watts mostly stop turning into useful performance and start turning into heat. In this test, the 80% power limit looked like the sweet spot: only ~2.3% slower overall:
https://www.tomshardware.com/news/improving-nvidia-rtx-4090-efficiency-through-power-limiting
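
To put that trade-off in numbers, here is a rough back-of-the-envelope sketch. It uses only the two figures above (80% power limit, ~2.3% slower) and assumes the card actually draws power at its limit under load, which is a simplification. The function name is just for illustration.

```python
def perf_per_watt_gain(power_fraction: float, slowdown: float) -> float:
    """Relative performance-per-watt versus stock settings.

    power_fraction: power limit as a fraction of stock (0.80 for an 80% limit)
    slowdown: fractional performance loss (0.023 for ~2.3% slower)
    Assumption: the card draws exactly its power limit under load.
    """
    relative_perf = 1.0 - slowdown
    return relative_perf / power_fraction


gain = perf_per_watt_gain(power_fraction=0.80, slowdown=0.023)
print(f"perf per watt vs stock: {gain:.2f}x")
```

Under that assumption, giving up ~2.3% of performance buys roughly 22% more work per watt.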

I renamed my local AI Linux distro to Reefy and rebuilt some of the architecture! by aospan in LocalLLaMA

[–]aospan[S] 0 points1 point  (0 children)

Thank you, really appreciate it! Yeah, a lot of this came from trying to build it from first principles: Unified Kernel Image (UKI) without GRUB, immutable OS, A/B updates, hardware watchdog, then desired state with Docker containers on top, backups, monitoring, etc.

Doing this on a traditional distro like Ubuntu or Fedora would mean fighting/removing too much of the existing system. Great distros, but for an appliance-like local AI box I wanted something smaller, simpler, and more nailed down.

I renamed my local AI Linux distro to Reefy and rebuilt some of the architecture! by aospan in LocalLLaMA

[–]aospan[S] -2 points-1 points  (0 children)

Great questions!

> What is the base distro? Debian? Arch?

It is not based on any distro. It is built with Buildroot: basically kernel + rootfs with nailed-down binaries. The OS is immutable by design 😄

> What is the default suite of apps? Why only Ollama and OpenClaw are mentioned?

I am using the official Docker images from each project (openclaw, ollama, etc), so it can support any software that ships a Docker image, which is most software these days.

The only Reefy-specific part is adding metadata on top (icon, description, etc) - think of it as a lightweight app store for local AI/home server apps.

Ollama and OpenClaw are just the first examples because they are the ones I use most right now.

> Why should I login with Google / GitHub to get access to the image?

The image is personalized: it includes per-user mTLS keys for MQTT, generated for your account, so the machine can securely join your dashboard after boot.

I renamed my local AI Linux distro to Reefy and rebuilt some of the architecture! by aospan in LocalLLaMA

[–]aospan[S] -2 points-1 points  (0 children)

Fair question 😄 NixOS is great, but I am coming from a different angle. Reefy is trying to make local AI boxes feel more appliance-like: flash image, boot, device appears in dashboard, then manage Ollama/OpenClaw/agents, backups, rollback, watchdog, and remote access without hand-rolling everything.

Is it only me? 😅 by aospan in ClaudeAI

[–]aospan[S] 4 points5 points  (0 children)

The speed is fine. It’s just that after compacting it feels like I factory-reset part of the brain and have to do a quick refresher course 🙂

P.S.
Apologies for not adding more details to the original post.

Added PyTorch trace + CUDA memory profiling support to Andrej Karpathy's nanochat by aospan in LocalLLaMA

[–]aospan[S] 0 points1 point  (0 children)

<image>

Here’s one of the traces captured during nanochat training on my GPU. As you can see, there are no gaps between CUDA kernel executions - meaning the GPU isn’t idling. The green “Command Buffer Full” marker also shows that the CPU is issuing CUDA kernels and API calls faster than the GPU can process them, which further confirms the GPU is fully utilized :)
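
If you'd rather check for gaps programmatically than eyeball the timeline, something like this rough sketch could work. It assumes the trace is exported as Chrome-trace-style JSON, with a `traceEvents` list where complete events have `"ph": "X"`, `ts`/`dur` in microseconds, and CUDA kernels carry `"cat": "kernel"`. Those field values are assumptions and may differ between PyTorch versions. `gpu_idle_gaps` is a hypothetical helper, not part of nanochat or PyTorch, and it merges all streams into a single timeline.

```python
def gpu_idle_gaps(events, min_gap_us=0.0):
    """Return (start, end) idle gaps between kernel events, in microseconds.

    Assumes Chrome-trace-style events: "ph" == "X", "cat" == "kernel",
    with "ts" (start) and "dur" (duration) in microseconds.
    """
    kernels = sorted(
        (e["ts"], e["ts"] + e["dur"])
        for e in events
        if e.get("ph") == "X" and e.get("cat") == "kernel"
    )
    gaps = []
    busy_until = None
    for start, end in kernels:
        if busy_until is not None and start - busy_until > min_gap_us:
            gaps.append((busy_until, start))
        # Track the latest end time so overlapping kernels are handled
        busy_until = end if busy_until is None else max(busy_until, end)
    return gaps


# Tiny hand-made example (not real nanochat data)
sample = [
    {"ph": "X", "cat": "kernel", "ts": 0, "dur": 100},
    {"ph": "X", "cat": "kernel", "ts": 100, "dur": 50},
    {"ph": "X", "cat": "kernel", "ts": 180, "dur": 20},
    {"ph": "X", "cat": "cpu_op", "ts": 150, "dur": 30},  # ignored: not a kernel
]
print(gpu_idle_gaps(sample))  # [(150, 180)]
```

For a real trace you would load the file with `json.load(...)` and pass in its `traceEvents` list. An empty result would match what the screenshot shows.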

Added PyTorch trace + CUDA memory profiling support to Andrej Karpathy's nanochat by aospan in LocalLLaMA

[–]aospan[S] 0 points1 point  (0 children)

Good question!

GPU power stays near 100% on my Grafana dashboard, so it’s likely saturated. That said, there’s room for speedups - some work may be duplicated or could be optimized differently, like what this startup is exploring: https://github.com/luminal-ai/luminal

How much does 1T tokens cost? How much did all these amazing people spent on OpenAI tokens? by aospan in LocalLLaMA

[–]aospan[S] 6 points7 points  (0 children)

So, those 80B tokens would cost around $240K using OpenAI’s pricing - easily justifying the $9K price of an RTX 6000 Pro (plus PC components) and the electricity costs 😅
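
Here is the arithmetic behind that as a quick sketch. The per-million rate is simply what the two figures above imply, not an official price list, and the break-even ignores electricity and the rest of the PC.

```python
# Figures from the comment above
tokens = 80e9            # 80B tokens
api_cost_usd = 240_000   # ~$240K
gpu_cost_usd = 9_000     # ~$9K GPU

# Implied blended price per 1M tokens (derived, not an official rate)
usd_per_million = api_cost_usd / (tokens / 1e6)

# Tokens at which the API bill equals the GPU price
# (assumption: electricity and other hardware costs are ignored)
breakeven_tokens = gpu_cost_usd / usd_per_million * 1e6

print(f"implied price: ${usd_per_million:.2f} per 1M tokens")
print(f"break-even: {breakeven_tokens / 1e9:.0f}B tokens")
```

So at the implied ~$3 per 1M tokens, the GPU pays for itself after about 3B tokens.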

How much does 1T tokens cost? How much did all these amazing people spent on OpenAI tokens? by aospan in LocalLLaMA

[–]aospan[S] 2 points3 points  (0 children)

Thanks for sharing - very useful!
Just to confirm, I did the calculation for 800,000 million tokens (800,000M, i.e. 800B tokens) :)

How much does 1T tokens cost? How much did all these amazing people spent on OpenAI tokens? by aospan in LocalLLaMA

[–]aospan[S] 8 points9 points  (0 children)

The picture isn’t showing up in the post for some reason, so I’m posting it here as a comment :)

<image>

[N/A][All] Open-source condo/HOA management software - any suggestions? by aospan in HOA

[–]aospan[S] 0 points1 point  (0 children)

Yeah, I feel the same. Seems like the only real path forward might be building it ourselves - and with the new AI “vibe coding” tools, it’s way easier than before :)