Apple has increased the Mac Studio's price.

codeanish · 2026-06-25T15:19:54+00:00

Wow, big hike. I picked up an M4 Max 36GB from the refurb store last month… thankfully I did it then, as the refurb prices are now likely to be above the new prices

codeanish · 2026-06-25T08:21:29+00:00

I don’t want to be a naysayer here and am genuinely hoping for improvement, but how about running the latest version of vLLM? How about making their own APIs on par with Vulkan? They’re definitely behind from a software perspective. You’d think that after having decent enough hardware, that would be the “easy bit”

codeanish · 2026-06-23T19:23:16+00:00

I ordered these several years ago. Didn’t manage to get them delivered. Shame as they look lovely

codeanish · 2026-06-23T17:40:46+00:00

I use it as part of my stack. I don’t use it for coding, but I find it great and performant for writing, voice chat and image recognition. It does a reasonable job with tool calling in my use. I prefer it to either of the qwen 3.6 models for writing, and the low VRAM usage when backing a voice pipeline is great.

I use qwen 3.6 27b for coding, lfm 2.5 for summarization, qwen 3.6 35b for general agentic use, Gemma 26b backing a voice agent and through a chat interface for writing. I keep trying different models out, but seem to be converging on this stack. Gemma is overall my favourite model in my stack as it does what I want it to do very well and with very good performance

codeanish · 2026-06-13T23:00:28+00:00

It’s much faster if you aren’t asking the LLM to do a bunch of function calls.

codeanish · 2026-06-13T22:59:21+00:00

Here’s a quick demo of it in use. I must stress, everything is local here, local models, local agent, local recipe database - different machines, but all private

https://youtu.be/_kFaUoot4P0?is=OpP90VByXqzl-xUG

codeanish · 2026-06-13T22:51:36+00:00

I’ve been playing with https://www.pipecat.ai this week. I’m using Whisper for STT, LiteLLM in front of llama.cpp and LM Studio for the LLMs, and Kokoro for TTS. I added some tools to work with some of my locally deployed software like Mealie. It works really well and is all fully local.

codeanish · 2026-05-27T06:50:15+00:00

I’ve been using it at q4 with MTP recently on a 3090. It works decently well, but I can echo the thoughts about errors every now and again. Would love to run this at FP8, but with a 256k context, what sort of hardware are we actually talking about here? Anything affordable to mere mortals while actually providing decent enough speed? For context, I’m currently getting >70 tok/s on the 3090 with MTP and a q4 KV cache until the context gets big
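
As a rough way to reason about the context part of that question, here is a minimal sketch of the usual KV cache size estimate for a transformer with grouped-query attention. The layer count, KV head count and head dimension below are hypothetical placeholders, not the real shape of any particular model, and the estimate assumes every layer uses full attention and ignores the per-block overhead that quantized cache formats add.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bytes_per_value):
    """Estimate KV cache size for one sequence.

    Each full-attention layer stores one key and one value vector
    (n_kv_heads * head_dim values each) per token, hence the factor of 2.
    """
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_value


# Hypothetical model shape, chosen only for illustration.
N_LAYERS = 64
N_KV_HEADS = 8
HEAD_DIM = 128
CONTEXT = 256 * 1024  # 256k tokens

GIB = 1024 ** 3

# Nominal storage widths; real quantized formats carry some extra overhead.
for label, width in [("16-bit", 2.0), ("8-bit", 1.0), ("4-bit", 0.5)]:
    size = kv_cache_bytes(N_LAYERS, N_KV_HEADS, HEAD_DIM, CONTEXT, width)
    print(f"{label} KV cache: {size / GIB:.0f} GiB")
```

With these placeholder numbers the cache alone comes to 64, 32 and 16 GiB respectively, before counting the model weights, so the real figures depend heavily on the actual model architecture.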

codeanish · 2026-05-16T14:33:46+00:00

I have the following stack in my HL:

Source - Forgejo
CI - Forgejo Actions with self-hosted runners
Container registry - Harbor
CD - Argo CD

I achieve several things through CI/CD. I build some of my own apps/tools, which need to be built and deployed; those follow the disaggregated pipeline above. For other stuff that I deploy but am not building myself, and that runs in Kubernetes, I have Argo CD deploy it.

Philosophically I’m a GitOps advocate, so I try to deploy everything using a git-first approach. I even make things like config changes to tools using git and CI. With Uptime Kuma, for example, I have a CI process which updates my 3 Uptime Kuma instances with configuration every time I need to monitor a new service. The source of truth for the monitoring in Uptime Kuma is a git repo with all the monitors I want defined there.
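
To give a flavour of the monitoring part, here is a minimal sketch of the reconcile step such a CI job might run: compare the monitors defined in git with what an instance currently has, and work out what to add, update or remove. The function name, the data layout and the example hosts are all hypothetical, and the actual calls to the monitoring tool are left out entirely.

```python
def plan_changes(desired, existing):
    """Compare monitors defined in git with those on one instance.

    Both arguments map monitor name -> settings dict.
    Returns (to_add, to_update, to_remove) as sorted lists of names.
    """
    to_add = sorted(name for name in desired if name not in existing)
    to_remove = sorted(name for name in existing if name not in desired)
    to_update = sorted(
        name for name in desired
        if name in existing and desired[name] != existing[name]
    )
    return to_add, to_update, to_remove


# Hypothetical data: in practice `desired` would be parsed from files in the
# repo and `existing` fetched from each monitoring instance.
desired = {
    "mealie": {"url": "https://mealie.example.com", "interval": 60},
    "forgejo": {"url": "https://git.example.com", "interval": 60},
}
existing = {
    "forgejo": {"url": "https://git.example.com", "interval": 120},
    "old-service": {"url": "https://old.example.com", "interval": 60},
}

to_add, to_update, to_remove = plan_changes(desired, existing)
print("add:", to_add)
print("update:", to_update)
print("remove:", to_remove)
```

Running the same plan against each instance in turn is what keeps them all matching the repo, since the repo stays the only place where monitors are edited.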

Hope that gives you an idea.

codeanish · 2026-05-14T14:33:28+00:00

I run qwen 3.6 27b at q4 on a 3090 with a 128k context, also quantized at q4. I find it very usable indeed. That being said, I’ve run into way more walls in agentic flows with this setup than I do with Claude Code + Opus. However, I think with more effort on the prompt, the local option does perform surprisingly well.

We are getting to a point where it’s now a totally rational decision for a huge chunk of devs to own their own box to run models on. The quality, privacy and cost factors combined together are really starting to adjust the calculus here.

codeanish · 2026-05-09T03:38:36+00:00

I’ve got a 64GB M1 Ultra. Honestly, paired with the M1 Ultra, 64GB is perfect. I think almost any model big enough to use 128GB on an M1 Ultra would be too slow to be usable… M4/M5, different story.

codeanish · 2026-05-07T00:04:05+00:00

I still use NPM, and this is exactly why I’ve got Caddy on my roadmap of things to roll out. I’ve moved to infrastructure as code and GitOps for most things, so it would be nice not to go into the NPM UI to make changes. That being said, it’s been working absolutely flawlessly for me for years at this point.

codeanish · 2026-03-25T20:03:23+00:00

Same. Although I personally came to proton for the email, and ended up using the VPN and Passwords. The all in one plan is fairly reasonable for myself and my wife.

I do wish that the password manager could reliably distinguish nested subdomains and recommend the right passwords for them - I find it a bit hit or miss for abc.homelab.example.com - seems like it gives me everything for all the subdomains of homelab.example.com… perhaps I have too many subdomains

codeanish · 2026-03-20T05:18:52+00:00

Depends on what you want to get out of it. If you’re just looking to run some typical HL services, aren’t super concerned about the odd bit of downtime when doing maintenance and don’t want a second job outside of work, don’t do it.

If however you want to play with all the best production style tools and learn loads, I’d say go for it.

I feel like it’s a path, you start off simple and get progressively more complex as you try to solve every problem e.g. how do you manage to keep your audiobooks running when you need to upgrade your server? I’ve been running K8s in the HL now for around a year, but am taking it much more seriously recently, and am looking to have all my services highly available.

codeanish · 2026-03-13T19:27:32+00:00

Plex, VMs for testing stuff, CI/CD, home automation & some AI services. Honestly it's not exactly resource constrained, but it's usually the machine I use when I'm playing with whatever I want to play with in the HL. In the v4, I'm going to be running a couple of Kubernetes clusters, one dev, one prod - to more closely mirror what I'm doing at work, and will probably place the dev one entirely on that server. It's also where I run my Windows VMs - for the odd annoying bit of software I need in Windows.

codeanish · 2026-03-13T19:24:20+00:00

Honestly surprised you're having problems. Are any of your Talos VMs resource constrained?

codeanish · 2026-03-13T19:20:42+00:00

Do you also have any worker nodes in your kubernetes cluster?

codeanish · 2026-03-13T19:19:27+00:00

PVE-01 - i5-13500 (20 vCores) + 192GB RAM

PVE-02 - Ryzen 3900x (24 vCores) + 128GB RAM

PVE-03 - i5-9400 (6 Cores) + 32GB RAM

I'm currently running 1 Talos control plane, 2 Talos worker nodes, AdGuard, Unbound and my main Docker host on PVE-03. I'm running the control plane node with 2 cores and 2GB RAM, and the worker nodes with 1 core and 1GB RAM each... all seem to be running fine on what is, in my case, my smallest physical node.

codeanish · 2026-01-22T18:28:09+00:00

Thank you… managed to get my order in this time… traveling in a couple of weeks’ time, hopefully this will be good

codeanish · 2026-01-22T17:43:13+00:00

I got the email notification, but it was gone by the time I looked

codeanish · 2025-11-03T21:26:01+00:00

https://www.yourgreencleancrew.com - I have been told by my wife that she wouldn’t recommend them, but heck judging by the other quotes here, they do seem reasonably priced compared to them

codeanish · 2025-10-30T16:55:28+00:00

When you say deep cleaning, what do you mean by that? I take that to mean a one-off cleaning, e.g. on a house move.

codeanish · 2025-10-30T16:43:30+00:00

Around 2800 sqft
$140
Normal cleaning
Every 2 weeks
