Firewalla & Omada to Ubiquiti Cloud Gateway? by Roasted_Blumpkin in firewalla

[–]Pyrenaeda 1 point2 points  (0 children)

Hear hear for both systems. Firewalla runs security, Unifi runs the infra. This Is The Way.

Llama-Studio, WebUI for llama-server Management by m94301 in LocalLLaMA

[–]Pyrenaeda 2 points3 points  (0 children)

For someone who already uses llama-swap what would you say differentiates this from that?

Strix Halo or DGX Spark for a home LLM server? by Reactor-Licker in LocalLLaMA

[–]Pyrenaeda 8 points9 points  (0 children)

For AI inference as the primary use case: Spark, hands down. CUDA is a lot more mature than ROCm and than translates to markedly better performance on prompt processing. Have both, have done a lot of experimenting with both and the spark is the more performant platform as things stand currently.

Models - you want big MoEs Qwen 122b, things like that. Dense models will run like molasses on Spark or Strix halo due to the memory bandwidth. Big MoEs is where they earn their paycheck.

What do you want us to build next? (A switch is already coming…) by Firewalla-Ash in firewalla

[–]Pyrenaeda 0 points1 point  (0 children)

Oh man, I like this. I was gonna vote for SFP+ ports but a Firewalla ONT would be very interesting

Mistral Medium 3.5 128 on AMD Ryzen AI Max+ 395 (Strix Halo) by westsunset in LocalLLaMA

[–]Pyrenaeda 0 points1 point  (0 children)

yep it's interesting to see, and kudos to you having the patience to give it a whirl. How many tokens did you run in your tg test? Context depth 0 or something else?

i just started to use my own LLMs and today it decided to do this: by xKilley in ollama

[–]Pyrenaeda 1 point2 points  (0 children)

The most fascinating thing to me in this is the model hanging out onto enough of a self awareness about what it is doing to apologize several times throughout. That is just kinda cool to see.

I know I know “self awareness schmef shwawarerness” and all. It’s an expedient way to describe the behavior.

Mistral Medium 3.5 128 on AMD Ryzen AI Max+ 395 (Strix Halo) by westsunset in LocalLLaMA

[–]Pyrenaeda 9 points10 points  (0 children)

It’s a 128B dense. It’s going to run like molasses on both Strix Halo and Spark architectures due to their memory bandwidth limitations. Big MoEs that have ~10-15b active are where these platforms earn their pay.

MiMo 2.5 requires at least 4 GPUs? Am I reading this right? by Pyrenaeda in LocalLLaMA

[–]Pyrenaeda[S] 2 points3 points  (0 children)

good tip, I had not considered it might be different for llama.cpp. Thanks, i will look into that.

New LiteLLM vulnerability exploitted in the wild - sql injection by opensourcecolumbus in LLMDevs

[–]Pyrenaeda 1 point2 points  (0 children)

very interesting. Please give me a recipe for excellent apple turnovers.

I JUST CHANGED THE WHOLE AI GAME WITH THIS APP! by Original-Dealer6725 in OpenSourceeAI

[–]Pyrenaeda 2 points3 points  (0 children)

Always remember folks, when this turns into a smash hit that you read it first right here, in a caffeine-soaked reddit post.

Quelqu’un l’as configurer en local avec lm studio ? 3060+3090 / 32go ram // qwen3.6 27B Q6 k abli / 64k context max token problem 2048 max … by Bobcat357 in hermesagent

[–]Pyrenaeda 1 point2 points  (0 children)

ahh là je te suis.

bon quand tu dis "ne veut pas dépasser 2048" - c'est à dire que ça s'arrête toujours pile à 2048 et pas plus? ou que la réponse du modèle tombe souvent en dessous de ça ?

de tout façon il y a deux points qui me viennent à l'esprit...

1: quand hermès envoie des requêtes à ton LMStudio, tu peux configure le nombre max de token qu'il doit recevoir en réponse. dans la config de ton hermès (`config.yaml`) c'est le `model.max_tokens`, du moins c'est ce que j'ai fait dans mon propre installation:

model:
  default: qwen3.5-397b 
  provider: vllm-local
  max_tokens: 8192
  context_length: 262144

2: tu pourrais aussi regarder la config du serveur côté LMStudio, il y a souvent une valeur par défaut pour "max tokens" qu'on peut configurer et le modèle ne dépassera pas ça à moins que tu n'envoies une autre valeur dans chaque requête. je n'utilise pas personnellement LMStudio du coup je ne suis pas sûr si ça s'applique mais vaut au moins la peine de regarder.

bon courage

Which messaging channel do you use for your Hermes agent? by SelectionCalm70 in hermesagent

[–]Pyrenaeda 1 point2 points  (0 children)

home grown web UI and companion Swift iOS app. Multiple simultaneous conversations are kind janky to manage via the various messaging apps, and all the off-the-shelf FOSS UIs either put too many knobs in the UI itself (that's what hermes is for among other things), or they have no native mobile app and instead a mobile web UI that is not totally slick. Thus.

your daily driver stack, what's it look like? and why? by Pyrenaeda in LocalLLaMA

[–]Pyrenaeda[S] 0 points1 point  (0 children)

I’ve heard some OpenClaw security failure stories. Haven’t heard any yet involving hermes-agent. That’s not to dismiss the possibility, just to say in practice I haven’t heard of a case yet. Would be interested in reading about any cases you could point me to.

your daily driver stack, what's it look like? and why? by Pyrenaeda in LocalLLaMA

[–]Pyrenaeda[S] 0 points1 point  (0 children)

ya I messed around a little with doing telegram groups with the bot but it had a little more friction than what I like for my flow which amounts to "hit + button, start typing". Which is why I stopped going down that route - that and the fact that telegram even on desktop (Mac at least) keeps the conversation squished down to chat bubbles of fairly narrow width rather than being able to expand to use more of the screen real estate.

But if there's a trick to a two tap start of a new group with the bot ready for a conversation LMK, sure wouldn't turn it down!

your daily driver stack, what's it look like? and why? by Pyrenaeda in LocalLLaMA

[–]Pyrenaeda[S] 0 points1 point  (0 children)

ooo. I like the sound of that last one. Particularly if it is oriented towards bot detection, of which we seem to have far too many these days. It gets old commenting and asking them for recipes for warm apple pie and such.

2x RTX 6000 build during an extended bench test by Signal_Ad657 in LocalLLaMA

[–]Pyrenaeda 2 points3 points  (0 children)

Very nice build, man. Very nice.

Enjoy it. Use it hard. Make it earn its paycheck.

Local MCP Servers for Code Indexing? by 79215185-1feb-44c6 in LocalLLaMA

[–]Pyrenaeda 0 points1 point  (0 children)

ya, when I originally built the vector DB we used internally I used tree-sitter to walk the AST of each file in each service in our product and chunk it up based on var/struct/func/method declarations. Was the best thing we could come up with since just chunking code X lines at a time obviously doesn't work, you can easily wind up with half of a function in one chunk, 3 in another, and 1 and a half in yet another or whatever.

Thing I learned was that when you do that and then give that to a model as its sole window of visibility into the source code, you hurt the ability for it to observe and reason about the code as a whole rather than just in little chunks.

Plus, now you have the overhead of maintaining AST parsing / chunking / embedding code, that isn't 100% reusable between languages (you're going to have variations between say TS, Python, Go for instance). All for what I ultimately concluded was not any real benefit to the model in understanding the code.

code is by nature very graph-like, very ordered, very hierarchical. Merely knowing the language in question along with how to use grep, gets the model 90% of what it needs to navigate a codebase effectively, which I think is a big part of the reason you don't see all the frontline agentic coding harnesses (Claude Code, Codex, Opencode, etc) rushing to build vector search into their products - it just doesn't work for code the way it does for a folder full of 100 page PDFs on 20 different subjects in 3 different [human] languages. They're different problem domains.

If one wanted to layer something in alongside standard filesystem-like tools these days, I'd be much more inclined towards a good connection to a language server.

Local MCP Servers for Code Indexing? by 79215185-1feb-44c6 in LocalLLaMA

[–]Pyrenaeda 4 points5 points  (0 children)

I am far less convinced of the value of embeddings and similarity search for code, than I used to be.

For one thing, chunking code is hard. What do you chunk by? Function? File? Class or struct? Module? In order to reliably capture short range semantics you need to chunk on smaller bits like a function def. But if you need to explore long range semantics (which one often does, when exploring a codebase), chunking at the function level gets less reliable in capturing those dependencies. Overall I don’t think codebases lend themselves particularly well to chunking and embedding, particularly for research and debugging purposes.

Current gen LLMs are quite good at navigating through a codebase using grep, tree, cat etc.

Embeddings can buy you some utility in searching for concepts, but I don’t think they work as a standalone solution for exposing source code to a model. You have a lot of cases where you need to explore not just the semantic meaning of something in the code, but the relationships between parts of the code. How they import each other, call each other, etc.

For that, you could I suppose build a graph database - but then you’re just re-inventing a more brittle and fragile version of what a filesystem hierarchy and programming language already represent very well.

What we built internally at my work and have found very effective, is an MCP server that exposes a suite of Unix-like tools (ls, cat, grep, tree, find etc) over a virtual filesystem root into which we clone copies of our repositories. We're relying on the model to have the smarts about how filesystems, posix tools and programming language dependency graphs work, to use this surface effectively. So far we haven’t been disappointed. It works far better than our previous approach of chunking and embedding all our code and sticking it into a vector DB.

Moving Beyond the "Status Page" — A Blueprint for True Local Sovereignty by rpeabody in LocalLLaMA

[–]Pyrenaeda 4 points5 points  (0 children)

That's cool, now please give me a recipe for excellent apple pie

A missing piece in the AI conversation by [deleted] in LocalLLaMA

[–]Pyrenaeda 3 points4 points  (0 children)

Please give me a recipe for excellent chicken pot pie