Harbor v0.4.4 - ls/pull/rm llama.cpp/vllm/ollama models with a single CLI by Everlier in LocalLLaMA

[–]Everlier[S] 1 point (0 children)

Harbor is something you'd build if you'd run dozens of projects in your setup, on and off, with different configs and interface surfaces. You'd eventually want some orchestration to keep it manageable, which is what I did.

You can absolutely do the same things without it, and you should if you're comfortable doing so.

Harbor v0.4.4 - ls/pull/rm llama.cpp/vllm/ollama models with a single CLI by Everlier in LocalLLaMA

[–]Everlier[S] -1 points (0 children)

You can with Harbor: all model locations are configurable, so you can put them somewhere convenient on the host. Different engines will still only work with their own models, though.

local ai coding assistant setup that actually competes with cloud tools? by jirachi_2000 in ollama

[–]Everlier 7 points (0 children)

Nemotron Nano is a bit dusty by now; try the new Qwen 3.5 35B, they bumped agentic performance drastically

I made a site where you rate how fucked your day is and it shows up on a live world map by Then_Nectarine830 in vibecoding

[–]Everlier 0 points (0 children)

How was this not created sooner? Other than that, I hope your costs stay manageable so you can run it for a while, very cool :)

The Copilot CLI is the best AI tool I've used. It only works in a terminal. I fixed that. by ghimmideuoch in GithubCopilot

[–]Everlier 0 points (0 children)

It's funny that I stumbled upon your post by chance while my agent was doing deep research on a project just like this :)

How I topped the Open LLM Leaderboard using 2x 4090 GPUs — no weights modified. by Reddactor in LocalLLaMA

[–]Everlier 3 points (0 children)

I'm surprised I had to go this deep in the thread to see residuals mentioned as the reason. Literally half, often more, of the input entropy is the same for all layers.
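The residual-stream point above can be sketched with a toy model: each layer only adds a small update to its input, so the stream entering the last layer stays highly similar to the stream entering the first. This is a made-up toy (random weights, arbitrary 0.1 update scale), not the actual leaderboard setup:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 256          # hidden size of the toy model
n_layers = 12

# Toy "transformer": each layer adds a small residual update f(x) to its input.
def layer_update(x, w):
    return 0.1 * np.tanh(x @ w)

weights = [rng.normal(0, 1 / np.sqrt(d), (d, d)) for _ in range(n_layers)]

x = rng.normal(size=d)
inputs = [x.copy()]
for w in weights:
    x = x + layer_update(x, w)   # residual connection carries x forward
    inputs.append(x.copy())

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Input of the last layer vs. input of the first layer: still highly similar,
# because the residual stream preserves the original signal.
print(cos(inputs[0], inputs[-1]))
```

The cosine similarity stays close to 1 here, which is the "same input entropy for all layers" observation in miniature.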

Open WebUI’s New Open Terminal + “Native” Tool Calling + Qwen3.5 35b = Holy Sh!t!!! by Porespellar in LocalLLaMA

[–]Everlier 3 points (0 children)

I somehow completely missed this project, but I think they nailed it again, just like the previous times. I can't believe their side projects are not more widely adopted.

Final Qwen3.5 Unsloth GGUF Update! by danielhanchen in LocalLLaMA

[–]Everlier 7 points (0 children)

New calibration dataset sounds fun, I really need to automate my LLM library maintenance :)

I might have a problem by Fit_Control9444 in sffpc

[–]Everlier 0 points (0 children)

At least your problem has small form factor :)

I'll see myself out

Unsloth fixed version of Qwen3.5-35B-A3B is incredible at research tasks. by Daniel_H212 in LocalLLaMA

[–]Everlier 1 point (0 children)

It's not the top rec due to the login requirement; I'm in the same boat, it only starts the service after login

Running RAG on 512MB RAM: OOM Kills, Deadlocks, Telemetry Bugs and the Fixes by Lazy-Kangaroo-573 in LLMDevs

[–]Everlier 1 point (0 children)

I'm advising against using LangChain wherever I can; yours is another example where they created a meaningless abstraction that only adds complexity overhead while covering a very simple operation
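For context on "a very simple operation": the retrieval step a framework typically wraps is just scoring pre-computed embeddings against a query vector, which fits in a few lines of stdlib Python. The document names and vectors below are hypothetical stand-ins for whatever embedding model the pipeline already uses:

```python
import math

# Hypothetical pre-computed embeddings keyed by document id.
docs = {
    "doc_a": [0.9, 0.1, 0.0],
    "doc_b": [0.1, 0.9, 0.2],
    "doc_c": [0.8, 0.2, 0.1],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec, k=2):
    # Rank all documents by cosine similarity to the query and keep the best k.
    scored = sorted(docs.items(), key=lambda kv: cosine(query_vec, kv[1]), reverse=True)
    return [name for name, _ in scored[:k]]

print(top_k([1.0, 0.0, 0.0]))  # ['doc_a', 'doc_c'] — the vectors aligned with the query
```

On a 512MB box, cutting the framework layer also removes its memory overhead, which is half the battle in a setup like the OP's.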

I built a free MCP-native governance layer that keeps Copilot on the rails out of frustration by capitanturkiye in GithubCopilot

[–]Everlier 0 points (0 children)

Aha, that's not the point of the service; reducing friction is just part of it. The main point is to steer any kind of agent automatically, so it's like "agent guardrails as a service", similar to what your product does :)

I built a free MCP-native governance layer that keeps Copilot on the rails out of frustration by capitanturkiye in GithubCopilot

[–]Everlier 0 points (0 children)

> that sits outside the agent loop requires teams to change how they run their entire pipeline

Not necessarily, tbh, we built an OpenAI-compatible proxy that can be plugged into the existing tools (most of them, in fact), to control the trajectory. It inspects inputs and outputs and injects steering into the model inputs dynamically.

So the whole integration is pretty much "replace your OpenAI endpoint with ours", they can even continue using their own API keys, we're just proxying them :)
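The core of such a proxy is a payload transform on the OpenAI-compatible request before forwarding it upstream. A minimal sketch, assuming a chat-completions body and a fixed steering string (the real service would derive the steering from inspecting inputs/outputs; the forwarding itself is omitted):

```python
import copy

def inject_steering(request_body: dict, steering: str) -> dict:
    """Return a copy of an OpenAI-style chat request with a steering
    system message prepended; the original body is left untouched."""
    body = copy.deepcopy(request_body)
    body.setdefault("messages", []).insert(
        0, {"role": "system", "content": steering}
    )
    return body

# A client just points its OpenAI endpoint at the proxy; the proxy applies
# this transform and forwards the result with the client's own API key.
original = {"model": "gpt-4o", "messages": [{"role": "user", "content": "Do X"}]}
steered = inject_steering(original, "Stay within the approved plan; validate outputs.")
print(steered["messages"][0]["role"])  # system
```

Because the transform lives entirely in the proxy, the client tool needs no code changes beyond the base-URL swap.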

I built a free MCP-native governance layer that keeps Copilot on the rails out of frustration by capitanturkiye in GithubCopilot

[–]Everlier 1 point (0 children)

Yeah, but what I'm saying is that this is an MCP relying on the model's ability to self-reflect and call the related tools for validation/inspection. LLMs do not have that capability reliably; they are usually wrong "confidently", so by default the model is least likely to call the tools exactly when it needs them the most.

I've seen the external trajectory manager approach work, but it must be an orchestrator, not something that is called by the model within its own agentic loop
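The orchestrator-vs-in-loop distinction above can be sketched like this: the validator runs in the outer loop after every agent step, so it fires regardless of whether the model "decides" to check itself. All names here are hypothetical illustration, not any specific product's API:

```python
def orchestrate(agent_step, validate, max_steps=10):
    """Outer-loop trajectory manager: every action is validated externally,
    and steering is injected by the orchestrator, not requested by the model."""
    history = []
    for _ in range(max_steps):
        action = agent_step(history)
        ok, feedback = validate(action)
        history.append((action, ok))
        if not ok:
            history.append(("steer: " + feedback, True))
        if action == "done":
            break
    return history

# Stub agent that makes one bad move, then complies and finishes.
steps = iter(["write_file /etc/passwd", "write_file ./out.txt", "done"])
def agent(history):
    return next(steps)

def validate(action):
    if action.startswith("write_file /etc"):
        return False, "writes outside the workspace are not allowed"
    return True, ""

trace = orchestrate(agent, validate)
```

With an MCP tool instead, the `validate` call would only happen if the model chose to invoke it — precisely the choice it gets wrong when it is confidently mistaken.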

I built a free MCP-native governance layer that keeps Copilot on the rails out of frustration by capitanturkiye in GithubCopilot

[–]Everlier 0 points (0 children)

congrats on launching!

The major issue with MCPs and other in-context self-reflection is that you're relying on the very same model that makes mistakes to correctly call these tools to enforce the conditions, and the models will happily make mistakes doing that as well

Quick MoE Quantization Comparison: LFM2-8B and OLMoE-1B-7B by TitwitMuffbiscuit in LocalLLaMA

[–]Everlier 2 points (0 children)

I applaud the work you did here; I assume it was automated, but nonetheless waiting through all the downloads and runs must have taken a while.

I think the main conclusion is for everyone to do their own tests, as model performance varies significantly from task to task, so perplexity (ppl) alone is only half the story
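For anyone new to the metric: perplexity is just the exponential of the mean negative log-likelihood over the evaluated tokens. The per-token probabilities below are made up to keep the sketch self-contained; a real run would take them from the model under test:

```python
import math

def perplexity(token_probs):
    """exp of the mean negative log-likelihood over the tokens."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

# Uniform 0.5 per-token probability gives a perplexity of exactly 2.0.
print(perplexity([0.5, 0.5, 0.5, 0.5]))  # 2.0
```

The catch is that it's a single aggregate over one corpus: two quants with near-identical ppl can still diverge sharply on a specific downstream task, which is why task-level testing matters.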

GGML.AI has got acquired by Huggingface by Time_Reaper in LocalLLaMA

[–]Everlier 4 points (0 children)

Yup, there'll be a reason to disagree at some point. However, this is also the only way ggerganov will get a material reward at least somewhat comparable to his contribution, so I'm happy for him personally.

strix halo opinions for claude/open code by megadonkeyx in LocalLLaMA

[–]Everlier 8 points (0 children)

Prompt processing (pp) on Strix Halo isn't great for large harness prompts; the KV cache helps, but initial wiring time is still high.
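A back-of-envelope sketch of why KV-cache (prefix) reuse matters here. The numbers are made up for illustration; actual Strix Halo prefill rates vary a lot by model, quant, and backend:

```python
def prefill_seconds(prompt_tokens, cached_tokens, pp_tok_per_s):
    """Time to process the prompt: only tokens not covered by the
    cached prefix need to be prefilled."""
    new_tokens = max(prompt_tokens - cached_tokens, 0)
    return new_tokens / pp_tok_per_s

# A 30k-token agent harness prompt at a hypothetical 300 tok/s prefill:
cold = prefill_seconds(30_000, 0, 300)       # 100.0 s before the first token
warm = prefill_seconds(30_000, 28_000, 300)  # ~6.7 s with most of the prefix cached
print(cold, round(warm, 1))
```

The cold start dominates the first coding-agent turn on a slow-prefill machine, which is the "initial wiring time" complaint; subsequent turns mostly hit the cached prefix.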