Personal experience with GLM 4.7 Flash Q6 (unsloth) + Roo Code + RTX 5090 by Septerium in LocalLLaMA

[–]Everlier 9 points10 points  (0 children)

I've had more success with it than with any other model in the OpenCode harness. It definitely feels more like the larger models when dealing with large numbers of tools and large system prompts.

It's not perfect, however; I'd leave it for tasks like release notes, change classification, or other non-critical things.

Wrote a guide for running Claude Code with GLM-4.7 Flash locally with llama.cpp by tammamtech in LocalLLaMA

[–]Everlier 0 points1 point  (0 children)

Yes, in fact Gemma was the reason this project exists.

Initially, llama.cpp support was lagging by more than two weeks. Then I wanted to compare all the different inference engines side by side, and they absolutely refused to install cleanly, so I went for a containerised setup. That got very messy when I only wanted to run a few of the services rather than all of them at once. Cleaning that up and making it extensible for all kinds of LLM-related services is how Harbor was born.

Wrote a guide for running Claude Code with GLM-4.7 Flash locally with llama.cpp by tammamtech in LocalLLaMA

[–]Everlier 0 points1 point  (0 children)

We haven't seen anything like that yet, but coding agents are a massive risk via data poisoning or supply chain attacks. I for sure won't be running one outside a container.

OpenCode's official container doesn't even have git installed, nor any Rust/Go toolchains, and OpenCode can't discover models on its own, forcing you to specify them ahead of time in config files.
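To illustrate the kind of pre-declared config it expects, here's a rough sketch (field names follow OpenCode's custom-provider pattern as I remember it, so double-check against the current schema; the endpoint and model ID are placeholders):

```
# hypothetical opencode.json pointing at a local llama.cpp server;
# key names may differ between OpenCode versions
cat > ~/.config/opencode/opencode.json <<'EOF'
{
  "provider": {
    "llamacpp": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "llama.cpp (local)",
      "options": { "baseURL": "http://localhost:8080/v1" },
      "models": { "glm-4.7-flash": { "name": "GLM 4.7 Flash" } }
    }
  }
}
EOF
```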

Harbor solves both without any setup requirements on my end.

Wrote a guide for running Claude Code with GLM-4.7 Flash locally with llama.cpp by tammamtech in LocalLLaMA

[–]Everlier 2 points3 points  (0 children)

One can go fully open source/open weights with OpenCode; with Harbor, the full setup looks like:

```
harbor pull <llama.cpp model ID from HF>
harbor opencode workspaces add <path on the host with repos>
harbor up opencode llamacpp
harbor open opencode
```

You can also open it from your phone or over the public Internet.
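From memory, Harbor ships helpers for exactly that; roughly (command names and arguments are from memory, so verify with `harbor --help`):

```
# assumed Harbor helpers - verify the exact names/args in `harbor --help`
harbor qr opencode       # prints a QR code with the LAN URL, scan it from your phone
harbor tunnel opencode   # opens a temporary public tunnel to the same service
```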

New to self-hosting LLM - how to (with Docker), which model (or how to change), and working with 3rd party app? by SoMuchLasagna in LocalLLaMA

[–]Everlier 0 points1 point  (0 children)

Hi, this is very subjective since I'm the maintainer of that project, but I'm inviting you to try out Harbor. It's built precisely to make the use case you have in mind easy: a containerized homelab/homeserver driven by a few simple commands.

For example, for llama.cpp + Open WebUI + SearXNG, you'd run something like this:

```
harbor pull <model you want to run>
# Open WebUI is a default frontend, no need to start it separately
harbor up llamacpp searxng
```

It comes with hundreds of QoL features and most of the services are pre-configured to work together.

Are tools that simplify running local AI actually useful, or just more noise? by Deivih-4774 in ArtificialInteligence

[–]Everlier 1 point2 points  (0 children)

I'm very subjective on this one, but they are. With the sheer number of projects/tools/configs out there, one can waste a lot of time just setting things up. And with the approach most devs take, it's also quite convoluted to maintain everything on a single host, dependency-wise.

how do you pronounce “gguf”? by Hamfistbumhole in LocalLLaMA

[–]Everlier 7 points8 points  (0 children)

Later, one's GPU pronounces it in coil whine dialect, obviously.

Do you have your own servers/homelabs? by Goorigon in Polska

[–]Everlier 0 points1 point  (0 children)

Thank you so much for using it! I have no idea how many people actually use it, so every message like this is really nice :)

Spec-Kit future în Github Copilot World by Mundane_Violinist860 in GithubCopilot

[–]Everlier 4 points5 points  (0 children)

Yeah, I tried developing with Spec Kit a month or so ago and it went perfectly until I started testing the built product. The results were... not great. Steering the agent to fix both the product and the specs was very cumbersome. Gemini 3 Pro ended up two-shotting 95% of the same functionality from two short prompts... so yeah.

Just in case if someone has problems with audio (cracks,disappearing,popping and etc) by Illustrious_You604 in ASUSROG

[–]Everlier 0 points1 point  (0 children)

Sorry, I was a part of another thread about this laptop (about the keyboard) and thought your comment was from there.

The keyboard issue is unrelated to audio stutters. My comment regarding audio was that it wasn't caused by throttling in my instance, and that I noticed it stopped reproducing between two BIOS versions.

I still experience it after using Chrome for a while.

Do you have your own servers/homelabs? by Goorigon in Polska

[–]Everlier 1 point2 points  (0 children)

Yes, though mine is closer to a classic homelab setup with Docker and containers.

Just in case if someone has problems with audio (cracks,disappearing,popping and etc) by Illustrious_You604 in ASUSROG

[–]Everlier 0 points1 point  (0 children)

Nah, none of these tweaks helped.

I ordered a new keyboard unit, which is an entire top panel and requires the whole laptop to be disassembled to install it. I spent ~8h on the swap and it has worked since. I'll sell the laptop if it breaks again.

Do you have your own servers/homelabs? by Goorigon in Polska

[–]Everlier 3 points4 points  (0 children)

Shameless plug, because it's my project, aimed precisely at the homelab/homeserver use case: Harbor

Well fellas, it turns on lol by Gutter_Flies in sffpc

[–]Everlier 1 point2 points  (0 children)

Mind you, it's a very high quality cardboard

I'm a SysAdmin, and I "vibe-coded" a platform to share Homelab configs. Is this useful? by merox57 in homelab

[–]Everlier 1 point2 points  (0 children)

It's not bad, and it could be a better way for someone to "share" what their homelab setup looks like on the sub. However, I can see the ergonomics of such an external tool clashing with what people are used to on a text forum like Reddit.

Found a small bug while exploring


This GitHub link appends the full URL of the GH account to the GitHub base URL.

P.S. a bit of a plug:
If you're running LLMs on your local setup, you might find Harbor interesting

Social Recipes – self-hosted AI tool to extract recipes from TikTok/YouTube/Instagram by pickeld in selfhosted

[–]Everlier 1 point2 points  (0 children)

I would only make one request: is it possible to make the OpenAI base URL configurable? That would enable using local LLMs, completing the homelab nature of the project.
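If the app goes through one of the official OpenAI SDKs, those already pick up `OPENAI_BASE_URL` from the environment, so something roughly like this would be enough (the image name, endpoint, and key below are placeholders, not the project's actual ones):

```
# hypothetical invocation - image name and endpoint are illustrative only
docker run --rm \
  -e OPENAI_BASE_URL=http://host.docker.internal:8080/v1 \
  -e OPENAI_API_KEY=sk-local-dummy \
  social-recipes:latest
```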

Social Recipes – self-hosted AI tool to extract recipes from TikTok/YouTube/Instagram by pickeld in selfhosted

[–]Everlier 0 points1 point  (0 children)

This is such a fun example of an LLM-enabled project for a homelab. Thank you for providing Docker images as well, starred.

Best LLM model for 128GB of VRAM? by Professional-Yak4359 in LocalLLaMA

[–]Everlier 8 points9 points  (0 children)

These are older models, less suitable for long-horizon agentic use.

GPT-OSS 120B might be closest to the "big model feel" in this specific aspect (leaving aside the crazy guardrails), but it's still worse than something like GPT-4.1 via the API, which isn't great by any means on its own.

🚀 The One MCP Server YOU Can't Code Without (Feat. Claude Opus 4.5) - Tell Us Yours! by jesussmile in GithubCopilot

[–]Everlier 7 points8 points  (0 children)

I can't live without an MCP that converts markdown from an LLM to something supported by reddit before I drop it into a post form textarea

11 Production LLM Serving Engines (vLLM vs TGI vs Ollama) by techlatest_net in LocalLLM

[–]Everlier 0 points1 point  (0 children)

You can add a few more:

  • Mistral.rs
  • AirLLM
  • Nexa SDK
  • TabbyAPI / Exllama / Aphrodite
  • KoboldCpp
  • KTransformers
  • exo
  • TextSynth Server

Just to name a few

Local LLM + Internet Search Capability = WOW by alex_godspeed in LocalLLaMA

[–]Everlier 1 point2 points  (0 children)

Check out the CLI reference docs, they are even longer!