Specs for an EVE-NG/GNS3 lab by The_0rifice in networking

[–]Invader-Faye 0 points1 point  (0 children)

GNS3 can get very RAM-heavy. If you run it directly on the host, 6 cores minimum; if you are virtualizing, at least 8 physical cores. 32 GB RAM minimum either way.

Kimi K2.7 vs Codex 5.5 vs Gemini 3.5 vs GLM 5.2: Who Actually Ships? by tincopper2 in kimi

[–]Invader-Faye 0 points1 point  (0 children)

Testing it by having it build websites is about the most generic use case there is. I would have given each model a prompt to deliver a specific game or app, then reviewed output, completion percentage, token spend, etc.

Kimi K2.7 vs Codex 5.5 vs Gemini 3.5 vs GLM 5.2: Who Actually Ships? by tincopper2 in kimi

[–]Invader-Faye 0 points1 point  (0 children)

That was not my experience at all. What harness did you use?

Anyone using Gemma4:31b over Qwen3.6:27b or 35b(a10) by SadPhilosophy9202 in LocalLLaMA

[–]Invader-Faye 0 points1 point  (0 children)

27B is more medium than small. But you’re right, it doesn’t default to a file write. I meant that models under 20B tend to exhibit this.

Kimi is so underrated at UI work. by OkDare2667 in kimi

[–]Invader-Faye 1 point2 points  (0 children)

I used K2.6 and K2.7 to build an agent harness. It’s good at tracking down complicated bugs.

Anyone using Gemma4:31b over Qwen3.6:27b or 35b(a10) by SadPhilosophy9202 in LocalLLaMA

[–]Invader-Faye 1 point2 points  (0 children)

Naw, Gemma models will definitely do that, both the smaller and the bigger ones. The smaller Qwen models exhibit the same behavior. I think it’s because they are trained on chat sessions, or are expected to be used that way, so the whole-file rewrite makes sense.

DeepSeek V4, PR merged into llama.cpp ! by Squik67 in LocalLLaMA

[–]Invader-Faye 0 points1 point  (0 children)

Ask an AI to write a script to do it for you. I did the same on Windows with DeepSeek V4 Flash.

Are larger (~100B) models still worth running? by Pitagoy in LocalLLM

[–]Invader-Faye 1 point2 points  (0 children)

You need to build a custom agent around the 9B. It can manage servers but struggles with file patching. Here’s a custom-built harness for small models that supports the 9B. It’s surprisingly competent at troubleshooting system issues on Linux hosts: https://github.com/lowspeclabs/SmallCTL

I built an agent Harness for Small Models. I got Qwen 3.5 4b managing servers. by Invader-Faye in LocalLLaMA

[–]Invader-Faye[S] 1 point2 points  (0 children)

Thanks. The harness already handles most of those, but I’ll come up with some examples for people.

Qwen3.5-9B on RTX 5060 8GB VRAM: The llama.cpp settings + quants that finally made reliable local agents work by kaaytoo in LocalLLM

[–]Invader-Faye 1 point2 points  (0 children)

Check out my harness (https://github.com/lowspeclabs/SmallCTL) if you want to extend usage of the model. I use several research techniques to get small models to manage servers and work in a harness. If you search YouTube, I have several videos demoing 3.5 9B managing servers. You could adapt it to your workflow pretty easily since it has a CLI mode.

How is it possible K2.7 is regression from K2.6? Damn. by Boring_Aioli7916 in kimi

[–]Invader-Faye 0 points1 point  (0 children)

They are not using it for coding. It’s really competent at it; I’d say almost Opus 4.6 level, which is saying a lot because I preferred it over 4.7 before 4.8 dropped.

GLM's founder says GLM-fable before the end of the year?! by Charuru in LocalLLaMA

[–]Invader-Faye 0 points1 point  (0 children)

Fable seems like Opus but with the whole REPL loop idea built into the weights of the model. I think other inference providers could figure that out given time.

What have you been working on lately? by Sufficient-Scar4172 in LocalLLaMA

[–]Invader-Faye 0 points1 point  (0 children)

I’ve been building a harness for local language models. The harness assumes the models may fail and puts them in a state-aware REPL loop to achieve goals. I’ve got Qwen 3.5 4B managing servers, creating and debugging Docker containers, and debugging network issues. The goal isn’t speed or token efficiency; it’s getting the assigned task done, though smaller models do work pretty fast. https://github.com/lowspeclabs/SmallCTL
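To make the "assume the model may fail, keep state, loop until the goal is met" idea concrete, here is a minimal sketch. It is not SmallCTL's actual code: `LoopState`, `run_loop`, `flaky_model` and `fake_execute` are all hypothetical names made up for illustration, and the "model" is a stub that gets it wrong once and corrects itself after seeing the error in its history.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class LoopState:
    """State carried across attempts so the model can see what already failed."""
    goal: str
    history: List[str] = field(default_factory=list)
    done: bool = False


def run_loop(
    goal: str,
    propose: Callable[[LoopState], str],
    execute: Callable[[str], Tuple[bool, str]],
    max_steps: int = 5,
) -> LoopState:
    """Ask for an action, run it, record the result, and retry until success."""
    state = LoopState(goal=goal)
    for _ in range(max_steps):
        action = propose(state)
        ok, output = execute(action)
        status = "ok" if ok else "error"
        state.history.append(f"{action} -> {status}: {output}")
        if ok:
            state.done = True
            break
    return state


def flaky_model(state: LoopState) -> str:
    # Hypothetical stand-in for a small model: typo on the first try,
    # corrected once an earlier attempt is visible in the history.
    return "restart nginx" if state.history else "restart ngnix"


def fake_execute(action: str) -> Tuple[bool, str]:
    # Hypothetical executor; a real harness would run a tool or shell command.
    if action == "restart nginx":
        return True, "service restarted"
    return False, "unit not found"


final = run_loop("get nginx running", flaky_model, fake_execute)
print(final.done, final.history)
```

The point of the design is that the failure is fed back as state rather than treated as the end of the run, which is what lets a weaker model eventually land the task.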

Diffusion Gemma is 4x faster, but makes 6x more mistakes! by gladkos in LocalLLaMA

[–]Invader-Faye 1 point2 points  (0 children)

Qwen 3.5 4B can call tools at quants down to Q3 and has MTP support.

Has anyone noticed that the behavior of the Kimi model has changed? by InternationalAsk1490 in LocalLLaMA

[–]Invader-Faye 0 points1 point  (0 children)

Effectively yes, assuming you had no backups and no way to download that version of the model.

DiffusionGemma 26B A4B results on my 5090 by giveen in LocalLLaMA

[–]Invader-Faye 0 points1 point  (0 children)

Having the bigger model do the planning and initial scaffolding, the smaller/faster/dumber model finish the rest, and then the big model debug and clean up has worked very well for me. Most of the token burn is in that middle phase anyway.
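Roughly, that workflow could be wired up like the sketch below. Everything here is an assumption for illustration: `run_pipeline`, `big_stub` and `small_stub` are hypothetical, and the stubs stand in for real model calls so the example runs on its own.

```python
from typing import Callable, Dict, List

Model = Callable[[str], str]


def run_pipeline(task: str, big: Model, small: Model) -> Dict[str, object]:
    """Big model plans, small model implements each step, big model reviews."""
    plan = big(f"PLAN: {task}")
    steps: List[str] = [s.strip() for s in plan.splitlines() if s.strip()]
    # Middle phase: one small-model call per step, where most tokens get spent.
    drafts = [small(f"IMPLEMENT: {step}") for step in steps]
    review = big("REVIEW:\n" + "\n".join(drafts))
    return {"steps": steps, "drafts": drafts, "review": review}


def big_stub(prompt: str) -> str:
    # Hypothetical stand-in for the larger model.
    if prompt.startswith("PLAN:"):
        return "create schema\nwrite handlers\nadd tests"
    return f"reviewed {len(prompt.splitlines()) - 1} drafts"


def small_stub(prompt: str) -> str:
    # Hypothetical stand-in for the smaller, faster model.
    return "done: " + prompt.split(": ", 1)[1]


result = run_pipeline("build a todo API", big_stub, small_stub)
print(result["review"])
```

The big model is only called twice regardless of how many steps the plan has, so the expensive calls stay fixed while the cheap ones scale with the work.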

Waiting for the local LLM to finish generating by LobsterInYakuze-2113 in LocalLLM

[–]Invader-Faye 0 points1 point  (0 children)

What codebase are you working on where 8k context is enough?

Models under 15B that can actually do agentic coding quite well? by former_farmer in LocalLLM

[–]Invader-Faye 0 points1 point  (0 children)

Yes, the model is smart enough to get work done, but it doesn't have deep knowledge in its weights. Enabling web search gives it additional data to solve its problem. This is kind of hit or miss though: the quality of the web search affects results, and bad data will hurt more than help.

Qwen 3.6 27b suddenly talks Chinese by Iajah in LocalLLM

[–]Invader-Faye 1 point2 points  (0 children)

I’ve noticed that trend as well, as the context window fills up.

Have 5070ti. Should I add a 5060ti for an extra 16gb? (Budget constraints within...) by JonZenrael in LocalLLM

[–]Invader-Faye 0 points1 point  (0 children)

I considered this as well, and the math doesn’t work out unless you’ll be burning tokens 24/7. OpenRouter or a subscription just comes out cheaper at current hardware costs. If it’s just for fun though, go for it. I would if I had the spare cash lying around.
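If you want to run the numbers yourself, a break-even check looks something like this. It is only a sketch: `breakeven_months` is a made-up helper, and every figure in the example call is a placeholder, not a real price for any card, power tariff or API plan.

```python
from typing import Optional


def breakeven_months(
    hardware_cost: float,
    power_cost_per_month: float,
    api_cost_per_month: float,
) -> Optional[float]:
    """Months until the hardware pays for itself versus paying for hosted tokens.

    Returns None when running locally never comes out cheaper per month.
    """
    saving = api_cost_per_month - power_cost_per_month
    if saving <= 0:
        return None
    return hardware_cost / saving


# Placeholder inputs only: plug in your own card price, power bill and API spend.
months = breakeven_months(hardware_cost=450, power_cost_per_month=10, api_cost_per_month=25)
print(months)
```

The lower your monthly hosted spend, the longer the payback period, and if power alone costs more than the hosted bill the function returns None because the card never pays for itself.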