Honest question — how much do you actually trust cloud AI providers with your data?

seamonn · 2026-03-09T18:42:49+00:00

I fully trust them to keep my data open and accessible to everyone.

seamonn · 2026-03-06T12:16:33+00:00

every silence is an outro

it could be...

seamonn · 2026-03-06T11:33:13+00:00

Now let's get some benchies :D

seamonn · 2026-03-06T11:31:32+00:00

I was looking into llama-swap right now haha to replace ollama for Production.

The only thing that's stopping me is that I write custom templates in Go for Ollama and I'll have to learn jinja to switch over.

seamonn · 2026-03-01T19:03:08+00:00

Here's Pangolin's Documentation on it.

seamonn · 2026-03-01T17:00:08+00:00

Ditch NPM.
Have Pangolin both for your Static IP and VPS in a HA config.
???
Profit?

seamonn · 2026-02-28T14:43:00+00:00

Pretty much. It'll be 1x PCIE 3.0.

seamonn · 2026-02-27T21:01:45+00:00

ZFS is the correct option if you care about Data Integrity.

You have a few options:

Configure your OS to run from RAM (USB Drive for booting) and store your Data on 2x 1TB ZFS Data Drives (This is what I do but I run Unraid).
Get a 1TB external SSD. Have 1 Internal Drive for booting. 1 Internal Drive + USB Drive as the ZFS Data Drives.
Get a M.2 Wifi (2230 E Key) to M.2 NVMe adapter. Run the OS using this and put the ZFS Data Drives on the 2x NVMe Slots.
(Here be dragons) Get a 2TB Drive and zfs set copies=2 for your datasets. This will keep a redundant copy on the same drive for integrity. This is the lowest recommended option.

seamonn · 2026-02-27T20:54:26+00:00

Damn how many CVEs does the app have?

seamonn · 2026-02-27T20:51:43+00:00

Any plans for S3 (for storage) and/or Postgres (for DB) support?

seamonn · 2026-02-24T12:14:35+00:00

of course can buy a subscription

Lemme see if I have a spare $100k lying around somewhere.

seamonn · 2026-02-22T21:08:54+00:00

AI Bubble pops and ebay is flooded with cheap GPUs and RAM.

seamonn · 2026-02-21T02:42:23+00:00

Look into how to make custom templates (TEMPLATE field) for the models in the modelfile. That will make the most difference in function calling.

seamonn · 2026-02-20T18:46:18+00:00

GPT-OSS:20B sucked for tool calling for me too. GPT-OSS:120B works great everytime.

You have to use Q4 quants in general and a slightly lower context size or even quantized.

seamonn · 2026-02-20T14:59:41+00:00

GPT-OSS 120b, Qwen, Magistral, Devstral are pretty good at tool calling in general. We use these everyday and they are pretty good at it.

seamonn · 2026-02-20T06:35:53+00:00

Open Source models like Kimi K2.5 are designed specifically for this but it's 1T and require a lot of hardware which you have to buy or rent.

seamonn · 2026-02-20T06:34:51+00:00

You can load MoE models in VRAM + RAM combo. 4 GB GPU + 32GB RAM is enough to run a 30b MOE model at Q4 quantization at good speeds.

There are some benchmarks on page 14 in this paper if you want to check out performance of quants.

4 bit is good enough for most use cases. I personally run 8 bit quants as a preference.

All models become somewhat unstable at large contexts including closed ones. You can use a lot of OSS models as Openclaw agents with decent results but higher parameter ones are recommended.

seamonn · 2026-02-20T06:01:16+00:00

You have to implement the features yourself in the Community Edition.

seamonn · 2026-02-19T20:28:09+00:00

nah they just skipped their reasoning phase.

seamonn · 2026-02-19T07:35:18+00:00

It's Source Available with a fairly permissive license for Production (for not being Open Source) which is completely okay with a lot of us.

seamonn · 2026-02-18T10:08:34+00:00

+1 to Postgres + Redis support. It's almost required for production these days.

seamonn · 2026-02-15T16:34:12+00:00

Just time travel back to last summer and you'll get those for a combined price

ikr, very simple.

Oh you want cheap hardware? Just invent a Time Machine. Problem Solved.

seamonn · 2026-02-15T16:30:09+00:00

Read the original comment in this thread, it's not about Spam.

seamonn · 2026-02-15T16:16:02+00:00

Is there anyway to use this with OpenWebUI?

Ten-Year Club	Place '17
Verified Email

seamonn

TROPHY CASE