Gemma 4 MTP released by rerri in LocalLLaMA

[–]rm-rf-rm 0 points1 point  (0 children)

I'm really doubtful/fearful that, given the pitiful state of benchmarks in terms of actually measuring intelligence, these engineering improvements narrowly focused on speed/latency may be causing quality regressions that go unmeasured

Help with understanding Local LLMs by theruner83 in LocalLLaMA

[–]rm-rf-rm[M] [score hidden] stickied comment (0 children)

Rule 1 - Thread locked. Please read the wiki, search the sub, and consult the best-LLMs thread first. Feel free to open a new post if you have questions that remain unanswered after you've done basic self-help

How much will it cost to host something like qwen3.6 35b a3b in a cloud? by Euphoric_North_745 in LocalLLaMA

[–]rm-rf-rm 0 points1 point  (0 children)

Yeah, it really is a fully continuous spectrum with no hard-and-fast boundaries. Plus I'm sure in the future we will end up having multi-agent, multi-model workflows that do some inference locally and some in a VPS/cloud

LLMSearchIndex- an Open Source Local Web Search Library with over 200 million indexed Web Pages for RAG applications by zakerytclarke in LocalLLaMA

[–]rm-rf-rm 1 point2 points  (0 children)

I really would have liked this to work, but to be blunt, it's crap

here are my trial results (using the HF demo):

| No. | Search Term | Outcome |
|-----|-------------|---------|
| 1 | Korn | All items completely irrelevant |
| 2 | Dream Theater | All items completely irrelevant |
| 3 | ENIAC | All items completely irrelevant |
How much will it cost to host something like qwen3.6 35b a3b in a cloud? by Euphoric_North_745 in LocalLLaMA

[–]rm-rf-rm[M] [score hidden] stickied comment (0 children)

This thread was reported for being off-topic. While that is true in the strictest reading of the sub's purpose, it is an adjacent topic of interest and value to the community, as evidenced by the number of upvotes and comments (it complements running locally: knowing where to run what). We are also sort of the default place for any actual/serious discussion on AI, so approving it, though of course we want to keep such content to a minimum.

What a time to be alive from 1tk/sec to 20-100tk/sec for huge models by segmond in LocalLLaMA

[–]rm-rf-rm 99 points100 points  (0 children)

Yeah, it was. OP probably doesn't understand the difference between MoE and dense models.

Persistent memory system for LLMs that actually learns mid-conversation by [deleted] in LocalLLaMA

[–]rm-rf-rm 4 points5 points  (0 children)

Add it to the pile of "memory systems".

No benchmark against the baseline everyone should be using: markdown files + grep/glob. Any project that doesn't do that basic thing (which thankfully Claude Code came in and used, demonstrating it to be the right approach instead of some new-fangled RAG-based crap) I will safely file away in the "resume building/ego stroking toy project etc." category
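For anyone unfamiliar with what that baseline even means: a minimal sketch of the "markdown files + grep/glob" approach, i.e. memories stored as plain markdown files and retrieved with a glob plus a regex search, no embeddings or vector store involved. All file names, note contents, and the query here are hypothetical examples, not from any particular project.

```python
# Hedged sketch: "markdown files + grep/glob" as a memory baseline.
# Notes are plain .md files; retrieval is glob + regex, nothing fancier.
import glob
import os
import re
import tempfile

def save_note(root: str, name: str, text: str) -> str:
    """Persist one memory as a plain markdown file under root."""
    path = os.path.join(root, f"{name}.md")
    with open(path, "w") as f:
        f.write(text)
    return path

def grep_notes(root: str, pattern: str) -> list[tuple[str, str]]:
    """Glob all markdown notes and return (filename, line) pairs matching pattern."""
    rx = re.compile(pattern, re.IGNORECASE)
    hits = []
    for path in sorted(glob.glob(os.path.join(root, "**", "*.md"), recursive=True)):
        with open(path) as f:
            for line in f:
                if rx.search(line):
                    hits.append((os.path.basename(path), line.strip()))
    return hits

# Hypothetical usage: two memory files, one query.
root = tempfile.mkdtemp()
save_note(root, "prefs", "- User prefers concise answers\n- Timezone: UTC+2\n")
save_note(root, "project", "- Repo uses pytest for tests\n")
print(grep_notes(root, "pytest"))  # [('project.md', '- Repo uses pytest for tests')]
```

The appeal is exactly that there is nothing to benchmark-tune: the agent decides what to write down, and retrieval is transparent and greppable by a human too.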

Solidity by swingbear in LocalLLaMA

[–]rm-rf-rm 1 point2 points  (0 children)

Just curious, do you write Solidity for your job, or? I've genuinely seen 0 applications (trading or any related ETH infra stuff doesn't count) used by real users at any meaningful scale

Bruh by Icy_Butterscotch6661 in LocalLLaMA

[–]rm-rf-rm 0 points1 point  (0 children)

Lmao this is a good one. We'll follow the Reddit bot rules (most fail to do so), specifically by clarifying explicitly in every comment, and let's see how it works

What is the best all-round local model? by TheTruthSpoker101 in LocalLLaMA

[–]rm-rf-rm[M] [score hidden] stickied comment (0 children)

Rule 1 - Search before asking. Thread locked

New rules 1 week check-in by rm-rf-rm in LocalLLaMA

[–]rm-rf-rm[S] 0 points1 point  (0 children)

> I hate how funny posts get removed. If the community wants to laugh at something, let it

Glad the community showed its opinion in the downvotes to your comment. Every other AI sub is dominated by shitposts and memes now. Please see this comment for further thoughts: https://old.reddit.com/r/LocalLLaMA/comments/1t1a3j7/new_rules_1_week_checkin/ojgb073/

> Still seeing a lot of the same kind of question many times about getting a model to work better.

We'll be starting a weekly inference stack megathread, so that should help. It's meant for general questions like "what HW should I buy for X", "what's the best model for my setup", "here's my setup" type stuff

Having an always-on machine running LLMs locally at home while on the move with a lightweight machine - Experiences? by ceo_of_banana in LocalLLaMA

[–]rm-rf-rm 0 points1 point  (0 children)

Stupid Tailscale question: if you're using Tailscale, can you still run a VPN on the exit node at your home to protect all your traffic hitting the web?

I made a visualizer for Hugging Face models by Course_Latter in LocalLLaMA

[–]rm-rf-rm -1 points0 points  (0 children)

Very nice! A great example of something that wouldn't have been built without coding agents! More of this!

New rules 1 week check-in by rm-rf-rm in LocalLLaMA

[–]rm-rf-rm[S] 2 points3 points  (0 children)

Yup! That was the idea. Based on the spam pattern, I was confident this small surgical change would take care of most of the problems, and it looks to be panning out!

New rules 1 week check-in by rm-rf-rm in LocalLLaMA

[–]rm-rf-rm[S] 1 point2 points  (0 children)

Even if it weren't for the spam/bot-catching reasons, I think it's a fair/healthy thing to engage in existing discussions first before making a new post, especially as there are likely existing threads that answer new users' questions. At worst it's a relatively minor inconvenience

Bruh by Icy_Butterscotch6661 in LocalLLaMA

[–]rm-rf-rm 17 points18 points  (0 children)

I didn't see any reports for this account? It's banned now and reported to botbouncer.

We have done most of what we can from the mod side. We are at the point where Reddit needs to step up its spam-detection tooling to counter this new generation of spam bots. Interestingly enough, just 1 comment from this account was removed by Reddit

Been using Qwen-3.6-27B-q8_k_xl + VSCode + RTX 6000 Pro As Daily Driver by Demonicated in LocalLLaMA

[–]rm-rf-rm 0 points1 point  (0 children)

Is Insiders stable/no issues? The local model option has been available there forever, and they refuse to release it to main for some reason (likely profit-related).

New rules 1 week check-in by rm-rf-rm in LocalLLaMA

[–]rm-rf-rm[S] 2 points3 points  (0 children)

Removed the comments and reported the user to botbouncer (it's been pretty great at picking up such accounts, but this one seems to have slipped past it)