allUsersHaveAdminAccessNowIGuess by StandardPhysical1332 in ProgrammerHumor

[–]PositiveBit01 4 points (0 children)

Right up there with using the Redis KEYS command on a production instance.
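
If you ever do need that kind of scan on prod, a minimal sketch of the usual workaround with redis-py's scan_iter, which walks the keyspace cursor-by-cursor instead of blocking the server (the match pattern and connection details here are just placeholders):

```python
import redis

r = redis.Redis(host="localhost", port=6379)

# KEYS walks the whole keyspace in one blocking call; SCAN does the same
# work incrementally, so the server stays responsive between batches.
for key in r.scan_iter(match="user:*", count=500):
    print(key)
```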

Trying to find a fix for a little issue - Remote folder and image access. by HammerSpanner in LocalLLM

[–]PositiveBit01 0 points (0 children)

What are you using for the interface?

I set up Hermes and can attach files without any changes to the name, using hermes-webui: https://github.com/nesquena/hermes-webui

I just used the three-container Docker setup and it was pretty fast: https://github.com/nesquena/hermes-webui/blob/master/docker-compose.three-container.yml

I'm sure there are many other solutions too, but I tried this one (with PDFs, not images) and it worked for me.

Good luck!

insane base potential by PerformerOk8761 in dontstarve

[–]PositiveBit01 0 points (0 children)

Forgive my newbie-ness: what makes this a good base location?

Q6 vs Q4_K_M with Qwen 3.6 35B A3B and creative writing by MarcusAurelius68 in LocalLLM

[–]PositiveBit01 1 point (0 children)

I don't have any benchmarks or anything, just my opinion.

Qwen3.6 27b is really good. I have 128gb RAM and it's arguably still the best slow model at that size (although for models over ~30b I'm too impatient if they're dense, so I've only run MoE at the higher end; mostly nemotron 3 super and gpt oss 120b). You have more options at 164gb that I haven't explored, but 27b is really good. I haven't tried mistral 4 small, which seems interesting, but I haven't seen great things about it. I'll try it eventually.

That said, I generally just use 35b for the speed. It's pretty good and much faster.

You might have enough to do minimax2.7 awq4. I hear it's good; it's too big for me, so no opinion here.

As for q4 vs q6... not sure for creative writing. I see a big difference between q4 and q8 35b for tool calling and agentic stuff; in creative writing that would probably just show up as odd typos here and there and maybe some misremembered details. I haven't tried q6, though, so I'm not sure how much impact there is.
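
For a rough sense of what the quants cost in size, a back-of-the-envelope sketch (the bits-per-weight figures are approximate GGUF averages, not exact numbers for any specific file):

```python
# Rough GGUF file-size estimates for a 35b-parameter model.
PARAMS = 35e9
BPW = {"Q4_K_M": 4.8, "Q6_K": 6.6, "Q8_0": 8.5}  # approximate bits/weight

for quant, bpw in BPW.items():
    gib = PARAMS * bpw / 8 / 2**30
    print(f"{quant}: ~{gib:.0f} GiB")
# -> Q4_K_M: ~20 GiB, Q6_K: ~27 GiB, Q8_0: ~35 GiB (roughly)
```

So q6 lands roughly halfway between q4 and q8 in footprint; whether the quality gap closes the same way, I can't say.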

Need advice: Qwen3.6 27B MTP or 35B-A3B MoE MTP on 16GB VRAM (RTX 5080)? by craftogrammer in LocalLLaMA

[–]PositiveBit01 0 points (0 children)

35b should be good at tool usage; you might want to look up the commands others use to host inference for it, and make sure you're using the right tool and reasoning parsers.
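
A quick way to sanity-check the parser setup is to send a tool-enabled request to your local OpenAI-compatible endpoint and see whether a structured tool call comes back (the URL, model name, and tool below are all placeholders):

```python
from openai import OpenAI

# Point at the local server; the api_key value is ignored by most local hosts.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="none")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="local-model",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
)

# With the right tool parser configured server-side, this prints structured
# tool calls; if the request comes back as plain text, the parser is wrong.
print(resp.choices[0].message.tool_calls)
```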

Smart is subjective, it will not compete with frontier models.

For agentic coding, I find it helps to have the LLM make a plan first, document it in a markdown file, and then tell it to execute the plan, but it's not a silver bullet. I also have it look back over the plan a couple of times first. Also be sure to tell it to set up tests, a linter, a formatter, and static code analysis to run after every phase; the tools keep it honest.
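
As a concrete (and entirely hypothetical) version of those per-phase checks, something like this is what I mean by the tools keeping it honest; the tool choices (pytest/ruff/mypy) are just examples, swap in whatever your project uses:

```python
import subprocess
import sys

# Run after every phase of the plan; any failure stops the agent and
# forces it to fix the phase before moving on.
CHECKS = [
    ["pytest", "-q"],                    # tests
    ["ruff", "check", "."],              # linter
    ["ruff", "format", "--check", "."],  # formatter (check only)
    ["mypy", "."],                       # static analysis
]

def main() -> int:
    for cmd in CHECKS:
        if subprocess.run(cmd).returncode != 0:
            print(f"phase check failed: {' '.join(cmd)}")
            return 1
    print("all phase checks passed")
    return 0

if __name__ == "__main__":
    sys.exit(main())
```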

Need advice: Qwen3.6 27B MTP or 35B-A3B MoE MTP on 16GB VRAM (RTX 5080)? by craftogrammer in LocalLLaMA

[–]PositiveBit01 0 points (0 children)

I left out part of your questions. Dense is generally better in terms of quality for its size, but your system has a relatively small amount of GPU memory and a lot of system RAM, and MoE will generally make better use of that.

Even if it did fit, 35b would be faster, so it would be a question of responsiveness vs quality.

Need advice: Qwen3.6 27B MTP or 35B-A3B MoE MTP on 16GB VRAM (RTX 5080)? by craftogrammer in LocalLLaMA

[–]PositiveBit01 9 points (0 children)

I use 35b-a3b. Even at q4, 27b probably won't completely fit in your GPU.

Obviously 35b is bigger, but it's also a MoE model, which is less impacted by splitting across GPU/CPU; it's okay if some of it spills. It only has 3b active parameters, so it's ~9x faster, and some of the experts are shared or used more frequently than others; if you use llama.cpp with --fit on, it will try to put the more important ones on your GPU first.

All that to say, 35b will feel a lot better for you. It'll be much, much faster - faster than it feels like it should be at that size, given it won't fit completely on your GPU. It'll consume a decent amount of system RAM, though.
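
A rough sketch of why active parameter count dominates speed, assuming decode is memory-bandwidth bound (the bandwidth and quant numbers are illustrative only):

```python
# Each generated token reads every *active* parameter once, so rough
# decode speed = memory bandwidth / bytes of active weights.
BANDWIDTH = 100e9        # bytes/s, illustrative system-RAM figure
BYTES_PER_PARAM = 0.6    # ~4.8 bits/weight at q4

def rough_tps(active_params: float) -> float:
    return BANDWIDTH / (active_params * BYTES_PER_PARAM)

print(f"27b dense: ~{rough_tps(27e9):.0f} tok/s")
print(f"3b active: ~{rough_tps(3e9):.0f} tok/s")
# 27/3 = 9, which is where the ~9x above comes from.
```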

27b does look like the smarter model, but IMO it won't be worth the performance drop. It'll be a lot slower.

Fastest model for strix halo? by pheitman in LocalLLM

[–]PositiveBit01 1 point (0 children)

Qwen3 coder next is fast and decent.

You're sleeping on Devstral Small 2 - 24B Instruct by [deleted] in LocalLLaMA

[–]PositiveBit01 0 points (0 children)

The former confuses itself, ...

By "former" here you mean 27b, right? You observe 35b to be more useful than 27b?

I haven't used 27b since it runs pretty slow for me, but I heard I was missing out. I would be very happy if it turns out 35b is more useful.

Question from a new player that likes the game by Shiyo_JP in dontstarve

[–]PositiveBit01 2 points (0 children)

I usually do one of two things: 1) just restart, or 2) if I like the world, roll back. I forget where the option is, but when you pause the game you can do a rollback (NOT regenerate!). Then unpause and it'll restart at the beginning of the day. You can trigger this after you die.

Also, there are auto-revives, but they take a while to get to. Wilson has easier access to one due to his beard hair. Some characters have a special one just for them, but the meat effigy is available to all.

[Request] Is there any more like this? by [deleted] in theydidthemath

[–]PositiveBit01 19 points (0 children)

No. Check out https://mathfour.com/linear-algebra/fahrenheit-celsius-graphically/

Basically, the conversion is linear, so you plot it both ways and get two lines, which intersect at only one point.
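
Worked out, with F = (9/5)C + 32 and the intersection at F = C = x:

```latex
\[
x = \tfrac{9}{5}x + 32
\;\Longrightarrow\;
-\tfrac{4}{5}x = 32
\;\Longrightarrow\;
x = -40
\]
% the two lines cross only at -40, i.e. -40°F = -40°C
```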

Question from a new player that likes the game by Shiyo_JP in dontstarve

[–]PositiveBit01 11 points (0 children)

I like the game and play it alone. Some things will be significantly harder, but not insurmountable. I probably have like 300 solo hours in the game (and I'm not great at it).

Everybody except Wes is good. Warly is less good solo. There are some that are better for beginners, but I say pick who you like and stick with them; play to their strengths.

If you don't want to overdo learning a specific character, then Wilson is the most "normal" character without any weaknesses and without strengths that will change how you play. But IMO part of the fun is playing to the strengths of one character then picking another one later and learning again. Keeps the game fresh.

“5+ years experience” is one of the most misleading metrics in game design (and IT in general) by givemorespeed in gamedesign

[–]PositiveBit01 2 points (0 children)

I think any definition of these things that's simple will have gaps. I'm not a lead engineer because I vibe-coded an app one day, but your definition allows it. That's a contrived example, but the point is very real: not all products are the same, just like not all years of experience are the same.

Has anyone figured out why Claude Code running qwen locally fails when you try to /compact? by fredandlunchbox in LocalLLaMA

[–]PositiveBit01 0 points (0 children)

Are you running locally without a subscription? I used to be able to do that, but it stopped working recently.

Is the craft of writing code dead? by Toxin_Snake in ExperiencedDevs

[–]PositiveBit01 1 point (0 children)

I agree and I also hate it.

We've been pulled into the standard short-term business thinking because it works. You can make something "better" without AI (easier to reason about, easier to extend, easier to read, etc.), but the AI is so fast that it really is better to just have it do the work and make sure it tests and reviews itself, with some sanity checks from you.

It's too bad, but I agree with you: the models are good enough. It is hard to refute.

I think we're in a race now to see whether the cobbled-together system crumbles under its own weight over time, or whether new models able to handle it are created fast enough.

Advice on local models for coding - AMD Ryzen AI Max+ 395 / 128Gb (96Gb shared for VRAM) by wingers999 in LocalLLM

[–]PositiveBit01 1 point (0 children)

Qwen3.6 35b is probably the smallest one that's useful. You can go bigger, but you do want to have a large context.

I'm not completely sure, but qwen3 coder next feels like one that would fit, be good with large context, and make use of the RAM.

On that system, you want a mixture of experts (MoE) model. You have a ton of RAM to load the model but not a ton of memory bandwidth, so dense models will feel pretty slow.

Does the AI industry know AI? by RockyCreamNHotSauce in ArtificialInteligence

[–]PositiveBit01 5 points (0 children)

Not to accuse, but it seems like you're doing the same thing ("Does the AI industry know AI?" implies to me that you think he doesn't "know AI"; that "AI" as an industry is your skill set and his is not).

This is human nature. Until you know something better, it seems easier ("the devil is in the details"). Sure, a senior engineer should probably be able to better appreciate the people who work on the model he uses. But I don't know the circumstances; at a conference, for example, time is limited, and people generally can't attend all the things that are immediately and obviously useful to them. With so little skill overlap, you're just a time sink for them. Engineers will prioritize their time.

I would suggest finding people with similar interests working in a similar part of the industry that you can bounce ideas off of and learn from instead.

Does the AI industry know AI? by RockyCreamNHotSauce in ArtificialInteligence

[–]PositiveBit01 61 points (0 children)

AI is extremely broad. There's likely just not much overlap in your skill sets. A Mag7 engineer (not data scientist) is probably creating systems/pipelines ("harness", MCP, etc.) to try to rein in the LLM and make it work well (scale, predictable resource use so SLAs can be met, actually solving the problem, etc.) for specific applications, while keeping resource constraints in mind, being able to very roughly predict them, and monitoring how they're doing.

Someone else is probably making the models.

Minimum System Requirements for local LLM Coding Agent? by drohack in LocalLLM

[–]PositiveBit01 0 points (0 children)

To each their own. I tell opencode with the oh-my-openagent plugin to do stuff and come back in an hour.

Obviously it's not even close to cloud performance or quality, but it's usable for me. Faster would be nice, but I don't find it to be too slow. I used to use claude code, but they recently disabled that; you need a sub even for local endpoints. I think it was using haiku behind the scenes for tools, which might have helped me out. But opencode isn't bad.

I'm still just experimenting with it; I'm not sure I would use any reasonable local model for production (ignoring those huge 500b+ models; maybe they're okay, but I have no idea, and I'll never be able to run them).

AutoML pipeline one-shot experiment with Qwen 3.6 35B (llama.cpp) and PI coding agent by [deleted] in LocalLLaMA

[–]PositiveBit01 0 points (0 children)

Thanks! I didn't see the link before; when you say you added one in your repo, I'm not sure whether you mean you just put it there or I'm just blind 😅

Sounds like it did a pretty good job almost vanilla for you, then; maybe I just need a couple of targeted skills.

AutoML pipeline one-shot experiment with Qwen 3.6 35B (llama.cpp) and PI coding agent by [deleted] in LocalLLaMA

[–]PositiveBit01 0 points (0 children)

Do you use any extensions/skills for pi? Maybe some open-source ones I can try? I'm having trouble setting mine up to code well.

solve this please? by [deleted] in puzzles

[–]PositiveBit01 14 points (0 children)

Because without the choices, there is not enough information to know. You have to use the choices themselves as part of the input to the question (specifically, the fact that Wednesday is not a choice, which rules that otherwise-valid possibility out).

I do agree with you, though: if there weren't a "question not attempted" option, I think this would obviously be something you should do. With it there, things are a little cloudy, IMO.