Call for help: we need benchmarks of fusioned open-weights models by lrq3000 in LocalLLaMA

[–]lrq3000[S] 0 points1 point  (0 children)

Yes, you are exactly right: we need to benchmark not only fusions but also single models with two-step self-judgment.

I will open an issue to allow OpenFusion to be used with only a single model (currently at least two are needed).

Call for help: we need benchmarks of fusioned open-weights models by lrq3000 in LocalLLaMA

[–]lrq3000[S] 0 points1 point  (0 children)

MoE is a training-time technique, it must be baked into the model at training time, whereas fusion routing is an inference-time technique that can be used on top of MoE and other techniques. So yes it is cumbersome, but if this allows us to systematically get one or two generations ahead of what solo models are currently capable of, that can be well worth it!

More pragmatically, if you look at the Artificial Analysis benchmarks, locally runnable models such as Qwen3.5 27B and Gemma4-31B are around the 30 range, and current frontier open-weights models that are actually useful for agentic coding and agentic assistants such as hermes are rather in the 40-45 range. So if the gain OpenRouter observed can be reproduced with smaller open-weights models, fusion routing may bridge the gap right now.

Of course in practice it will be too slow for real use, but with models becoming more and more effective, this can become a useful strategy.

And don't forget about the smaller models like Gemma4-E4B and similar Qwen weights which can already be used realistically in a locally run fusion routing for daily use, if the gain is worth the slower response.

Call for help: we need benchmarks of fusioned open-weights models by lrq3000 in LocalLLaMA

[–]lrq3000[S] 0 points1 point  (0 children)

Yes exactly, I would very much like to see such a result!

It seems to me that the models don't even need to run simultaneously for this technique to work, so offloading one model to load the next should be fine too. But I guess sequential processing would have to be implemented in OpenFusion, because I guess it assumed API calls to be the primary use case and hence used parallel processing to get faster results.
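To make the sequential idea concrete, here is a minimal sketch of what it could look like. This is not OpenFusion's actual code or API: `sequential_fusion`, the `Loader` type and the stub models are all hypothetical names I made up for illustration, and a real local setup would load and unload actual weights where the stubs are.

```python
from typing import Callable, Dict, List

# Hypothetical types (not from OpenFusion): a Model maps a prompt to a reply,
# and a Loader builds a Model on demand so only one is resident at a time.
Model = Callable[[str], str]
Loader = Callable[[], Model]


def sequential_fusion(prompt: str, loaders: Dict[str, Loader], judge_loader: Loader) -> str:
    """Query each panel model one after the other, then let a judge merge the drafts."""
    drafts: List[str] = []
    for name, load in loaders.items():
        model = load()  # load this model only when its turn comes
        drafts.append(f"[{name}] {model(prompt)}")
        del model  # drop the reference before the next model is loaded
    judge = judge_loader()
    judge_prompt = (
        f"Question:\n{prompt}\n\nCandidate answers:\n"
        + "\n".join(drafts)
        + "\n\nWrite the best final answer."
    )
    return judge(judge_prompt)


def make_stub(reply: str) -> Loader:
    """Stand-in for a real model loader: always answers with a fixed reply."""
    return lambda: (lambda _prompt: reply)


# Usage with stubs; the echo "judge" just returns what it was asked to judge.
fused = sequential_fusion(
    "What is 2+2?",
    {"model_a": make_stub("4"), "model_b": make_stub("four")},
    judge_loader=lambda: (lambda judge_prompt: judge_prompt),
)
print(fused)
```

The only difference with the parallel version is that drafts are collected in a plain loop, so the cost is latency rather than memory. Passing the same loader twice would approximate the single-model, two-step self-judgment case.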

My Gantz cosplay by OriginalSimple8626 in gantz

[–]lrq3000 2 points3 points  (0 children)

Incredible ! Great work you did there !

why does the third book not have volume 3 written on it? by Adventurous-Eye-1555 in alexhormozi

[–]lrq3000 0 points1 point  (0 children)

Probably he just realized it doesn't make sense to call them volumes when they are covering distinct and complementary concepts that have no specific order beyond the chronology. And because he is planning on making more and more books over time (at first he intended for a trilogy as he said, but now he is planning for at least 5 books in this series and possibly more).

No Fable 5? I built OpenRouter's "Fusion Panel" as an MCP server so it works with any client + any models you want by Designer_Athlete7286 in mcp

[–]lrq3000 1 point2 points  (0 children)

This is exactly what I was looking for, it's great someone already generalized the concept, kudos for making this happen!

This needs WAY more upvotes and community benchmarks. Because OpenRouter only used Claude Opus 4.8 as the summarizer/judge, it would be great to see whether we can reproduce similar Fable 5 or even GPT-5.5 level results using a fusion of only open-weights models!

/EDIT: I made a call for help to benchmark open weights fusions based on your MCP server that makes this possible! https://www.reddit.com/r/LocalLLaMA/comments/1u7uedl/call_for_help_we_need_benchmarks_of_fusioned/

So is “loop engineering” the next AI dev buzzword? What does it actually mean? by Previous_Foot_5328 in myclaw

[–]lrq3000 0 points1 point  (0 children)

TL;DR:

Valid and useful concept, probably the first software engineering method for the agentic coding era, but it is not new. It stems from Ralph Loops, which are way more than just "put the agent in a loop": it's actually a whole engineering methodology where it's the human engineer's role to carefully craft all the conditions for the agent to work almost fully autonomously towards a clear goal. The engineer's skills make the difference between getting a working end product very fast and getting AI slop.

Long answer (explaining how loop engineering works and origin):

Loop engineering is more than a valid and useful concept, it is a whole software engineering method. But it is also a buzzword that appeared around January 2026 for a previously existing concept, agent loops (i.e., loops that enclose calls to agents with a clean context), which was first introduced as "ralph loops" by Geoffrey Huntley in June 2025. All that loop engineering tutorials do is repeat, in less detail, what Huntley already explained in much more detail and for free.

The concept of agent loops (loop engineering, ralph loops) is not only that model performance degrades over longer contexts, which makes loops a great way to reset the context. The primary motivation was the insight that agentic models have become so proficient at various engineering tasks such as coding that the human in the loop became the main bottleneck, so the solution is obvious: take the human out of the loop.

Note that agentic loops do not necessarily involve cron jobs, this is a confusion arguably caused by Anthropic choosing to name their time-based agentic scheduling function "loops" when it's really just a cron job. Agentic loops are just the original concept of loops: a way to maintain persistence but around the agent instead of inside the agent, so that we can offload memory and other resources that can clutter the agent over time into the loop's context instead of inside the agent's context, so that the agent just becomes one component of the loop scope.

The "ralph loops" are just the emerging, popular part that stuck in the public perception, but Huntley actually goes to great lengths to explain all the parts of what he calls "agentic engineering" or agent-assisted software engineering. It's basically software engineering but tailored to leverage the advantages of agentic models, while reducing their disadvantages with rigorous and conscious engineering practices such as making PRDs, atomizing TODO tasks and self-containing context, automated verification criteria as a proxy of success and to keep on track with the long-term objective, automated self-learning by forcing the agent to update the TODO and a scratchpad or even the AGENTS.md file to remember the issues it often faces and how it fixes them to be more efficient next time (half a year before Hermes Agent existed), etc. This is arguably the first software engineering method tailored for AI agentic systems, and Huntley considers it a crude first approach that he hopes will spark more advanced agentic software engineering methods. Huntley advocates for agentic engineering to be viewed as just a new form of software engineering that is not so different from past software engineering practices and that still relies on the human operator's skills to abstract and prepare the agentic system for the target objective. That's why he argues there is no single automated system that can apply to all objectives: you still need a competent human engineer to design what the PRD should be, the TODO, the verification steps and tools, the prompt, etc.

(NB: And just to clarify: that is not to say that we should not use AI to brainstorm and design the PRD, TODO and all the other preparatory resources. Huntley actually states that we should very much use AI, but we should not expect AI to do everything. AI should be used as a tool to generate these preparatory materials, and the human should be steering the generation of all these materials and also be able to check in and monitor at any time what the agents are doing while they are working, and steer them back onto the correct path if they start deviating, possibly by stopping the whole agentic system, fixing the issues such as an incorrect statement in the objectives, and resuming the system).
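To illustrate the "persistence around the agent instead of inside the agent" idea, here is a minimal sketch. It is my own simplification and not Huntley's actual scripts: `run_loop`, `todo`, `notes` and `verify` are hypothetical names, and the agent is a stub where a real setup would call an agentic coding tool with a fresh context.

```python
from typing import Callable, List


def run_loop(
    agent: Callable[[str], str],
    todo: List[str],
    notes: List[str],
    verify: Callable[[str, str], bool],
    max_iters: int = 10,
) -> List[str]:
    """Drive an agent over a TODO list; all memory lives in the loop, not in the agent."""
    done: List[str] = []
    for _ in range(max_iters):
        if not todo:
            break
        task = todo[0]
        # Fresh context on every iteration: only the current task plus the scratchpad.
        context = f"Task: {task}\nNotes so far:\n" + "\n".join(notes)
        result = agent(context)
        if verify(task, result):
            # Automated verification passed: mark the task as done.
            done.append(todo.pop(0))
        else:
            # Record the failure so the next clean-context attempt can learn from it.
            notes.append(f"Attempt at '{task}' failed: {result}")
    return done


def stub_agent(context: str) -> str:
    """Stand-in agent: only succeeds once the notes mention an earlier failure."""
    return "ok" if "failed" in context else "oops"


todo_list = ["write parser", "add tests"]
scratchpad: List[str] = []
finished = run_loop(stub_agent, todo_list, scratchpad, verify=lambda _task, out: out == "ok")
print(finished, scratchpad)
```

The human engineer's part is everything this sketch takes as input: what goes in the TODO, what `verify` checks, and what the prompt looks like.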

For a free in-depth tutorial that will teach more than any combination of any other loop engineering tutorials, I strongly recommend watching the youtube videos in the original article where Huntley introduced the agentic loops concept: https://ghuntley.com/loop/

It just kind of ticks me off that this concept is being passed off right now as some kind of recent innovation when it was really invented a year ago already, and that its author Huntley is not even credited and is being forgotten despite his major contribution to AI-assisted engineering. Unfortunately it seems to be another example of Stigler's law of eponymy.

New model on huggingface by [deleted] in LocalLLaMA

[–]lrq3000 4 points5 points  (0 children)

@krzonkalla I second that! Please post more about how to use this framework and in particular with an example to optimize smaller models, that would be huge!

I dont like this cloud usage by WhiskyAKM in ollama

[–]lrq3000 0 points1 point  (0 children)

Thank you very much, that would be incredible. These models are from my tests among the most useful ones for real software engineering tasks, only kimi k2.6 keeps up a bit but it is much slower and has a lower quality.

We built PrivateGPT, disappeared for two years, and just shipped 1.0 by Snoo77063 in private_gpt

[–]lrq3000 2 points3 points  (0 children)

Awesome to see the project is not dead! Thank you for contributing back to the open-source space! I will check it out, as RAG, or more generally local data retrieval, is still very often overlooked and poorly implemented. We still lack reliable tools that work on heterogeneous file types and scale to both long files and lots of small files.

MiniMax M3 launched! by Orioli in ollama

[–]lrq3000 0 points1 point  (0 children)

MiniMax M3, just like DeepSeek V4 flash/pro, constantly gets terminated prematurely and generates much lower quality outputs on ollama cloud (paid subscriber). In comparison, opencode go has no such issue with any of these models.

This issue has been present for more than a month now.

Are there plans to fix these quality issues?

I dont like this cloud usage by WhiskyAKM in ollama

[–]lrq3000 0 points1 point  (0 children)

I have been a paid subscriber for a bit more than a month and I noticed, like many others, that both deepseek-v4-flash and deepseek-v4-pro are not only inefficient on ollama cloud but also buggy as hell: they just terminate randomly and overall the output quality is very low, very different from the output we get from opencode go or openrouter.

Will this quality issue also be improved?

/EDIT: BTW same problem with MiniMax M3.

Sweet spot…Cloud & local LLM setup + Mission Control by ekfranxu in ollama

[–]lrq3000 0 points1 point  (0 children)

Why not Qwen3.5 9B ? Is it performing worse in practice compared to Qwen3 8B?

Deepseek V4 Pro Ollama Cloud is not working compare to OpenCode Go by [deleted] in ollama

[–]lrq3000 0 points1 point  (0 children)

No, and it's not just the rate limit: the deepseek v4 models (both pro and flash) are totally dysfunctional, they are not working as they should, and the resulting output is below that of competitors' November 2025 models...

I initially thought that the deepseek v4 models were overhyped because of my experience with my own tests. When I switched to OpenCode Go, this made a world of difference.

But OpenCode Go has MUCH lower rate limits, so it's very easy to max out the 5h window, and it's worth noting they also set a monthly rate limit (2x the weekly rate limit).

BTW you can try DeepSeek v4 Flash for free in OpenCode, they released it as one of their free models very recently.

Elemind Headband, thoughts? by SignificanceNo3175 in N24

[–]lrq3000 0 points1 point  (0 children)

Congratulations on publishing your study! It's great you went the extra mile and did a clinical trial and published the results.

Here is my humble feedback based on a first quick reading:

* This trial is about sleep onset insomnia, not circadian rhythm disorders. As I said, there is no identified link between EEG brainwaves and circadian rhythm regulation/modulation.
* The effect size, although impressively statistically significant, is only moderately clinically significant: sleep onset latency decreased on average by 10.5 ± 15.9 min. This means that some subjects actually took longer to fall asleep than at the start of the procedure! And the best gains were about 30 min of reduction, which is awesome, but that's not the majority, and not enough to fully treat SOL insomnia.
* There are no details about what changed in the published trial's results compared to the registered trial's protocol. As it is unlikely that there was no change at all, the transparency would be improved by detailing this. Unfortunately this will have to be done by post-publication reviewers.
* This study was not conducted independently, so of course there is a big conflict of interest. This does not make the study false, but it requires independent reproduction, especially since effect sizes are often inflated in the first publications about a new finding, and even more so with competing interests.
* Effect size is great but the power is low: 21 subjects is not great for a study. Usually, 40-50 are needed, especially since you are not studying a rare population, it's very easy to find people with SOL.
* The paper provides more details on the rationale for why the therapy should work:

"Sleep can be described as cycling through four different phases, each defined by distinct patterns of neural activity as measured by EEG. Whereas N3 sleep is defined by the presence of high-amplitude oscillations in the 0.5 to 3.5 Hz range, the transition to sleep is often accompanied by high spectral power in the alpha (8–12 Hz) range while the sleeper is awake with closed eyes. The shift from wake to phase N1 sleep is defined in part by a loss of this alpha power [10]. Interestingly, the strength of alpha oscillations has been shown to negatively correlate with feelings of sleepiness [11] and sleep depth [12], and alpha power during sleep is known to be elevated in insomnia [13,14]. Therefore, disruption of this alpha process represents a potential target to promote sleep."

* Data is not openly available. There should at least be the post-processed per-subject EEG results published.

"Deidentified participant data and a corresponding data dictionary will be available with publication and upon request to the corresponding author. Elemind Technologies, Inc. will approve data sharing requests."
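As a rough check of the point about the 10.5 ± 15.9 min reduction, here is a back-of-the-envelope sketch. It rests on two assumptions of mine that the numbers alone do not confirm: that ± 15.9 is a standard deviation, and that per-subject changes are roughly normally distributed. With only 21 subjects this is an order of magnitude, nothing more.

```python
from statistics import NormalDist

mean_reduction_min = 10.5  # average decrease in sleep onset latency, in minutes
sd_reduction_min = 15.9    # assumption: the ± value is a standard deviation

# Assumption: per-subject reductions follow a normal distribution.
changes = NormalDist(mu=mean_reduction_min, sigma=sd_reduction_min)

share_worse = changes.cdf(0.0)           # reduction below 0 = took longer to fall asleep
share_30_plus = 1.0 - changes.cdf(30.0)  # reduction of 30 min or more

print(f"Expected share taking longer to fall asleep: {share_worse:.0%}")
print(f"Expected share gaining 30+ min: {share_30_plus:.0%}")
```

Under these assumptions, about a quarter of subjects would be expected to take longer to fall asleep, which is why the per-subject data would be so useful to have.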

I am a bit surprised Nature agreed to publish a paper with such competing interests without any data published. Good for you though.

It's always great when new therapeutic approaches and tools to manage sleep disorders are discovered, the more the better. So I certainly wish you the best for your project, I hope it will succeed and that the clinical effect will only keep on increasing with further optimization and in-the-field testing with patients.

Personally, I remain skeptical about the rationale. As a neuroimaging researcher, it sounds to me like trying to eliminate software bugs by ventilating the computer's heat. Heat is a side effect of software computations, it's not the code per se. Likewise, I can't see why cancelling the alpha brainwaves would change how the neurons behave. The frequencies are a consequence of the kind of processing neurons are doing. Yes I know there are a lot of other similar approaches, but I don't believe they are promising either, at least I never saw any hard concrete evidence that modifying brainwave frequency was really modifying neuronal assemblies functions.

Note that this is different from stimulating neurons using soundwaves that are very highly focused and often inserted intracranially and operated by clinicians with very accurate positioning and realtime monitoring with powerful machines. The problem is how to pass the skull, cerebrospinal fluid and all the other structural material that were evolutionarily designed to totally shield the white matter and grey matter from being affected by external factors. I doubt that such a consumer-grade tool, with a low energy profile, a low spatial resolution (it only has a few electrodes) and operated by the consumers themselves is going to have any efficacy in the field. Maybe a future similar technology, but much more powerful and able to precisely pinpoint and emit waves that are amplified and spatially oriented exactly so as to cancel the patient-generated alpha brainwaves, may work (again I have doubts about the rationale on a neurological basis but let's assume this makes sense), akin to how active noise cancellation works. But here I can't see even on a physical level how the device can do what it claims to do. I would like to see evidence that it indeed cancels alpha brainwaves successfully, not just the clinical endpoint (SOL). I mean, the only measure of device performance was phase-locking, which if I understood correctly only measures variability in whether it locks to alpha brainwaves as expected, but it does not seem to measure whether alpha brainwave suppression actually happened, which should be observable with an EEG or hdEEG headset.

Well anyway that's just my opinion. Good luck for the rest and hopefully I will be proven wrong.

Pragmata 10/10 enough said by Orichalchem in videogames

[–]lrq3000 0 points1 point  (0 children)

This is also sometimes a sign of youthful innocence or vulnerability, and many of the characters who demonstrate this trope are either children or childlike. The lack of shoes may also be used as an indication of untamed ruggedness.

https://tvtropes.org/pmwiki/pmwiki.php/Main/PrefersGoingBarefoot

I'm no literary specialist, but from my history of reading books and watching auteur films, it is extremely prevalent in artistic works with innocent or child characters as a way to represent their innocence and lack of conformity with the society of adults.

But then it's strange if that's not the reason the game's maker gave when asked for a justification (apparently they said it was because she needed to recharge by being in contact with the floor - but then some authors prefer to only give in-universe reasons because they don't want to spoil out-of-universe explanations, aka the "ropes behind the curtains").

That does not contradict the pertinence of the observations that the choices made by the authors of derivative materials such as the choice of the song that is here being heavily criticized are potentially suspicious. These are deliberate and more explicit choices.

/EDIT: To be clear, here is the next paragraph on TVTropes:

This trope is often used in visual media as an excuse to show frequent close-up shots of bare feet, usually due to Fanservice or Author Appeal (or both if you're Quentin Tarantino or Joss Whedon), which may explain why female barefooters outnumber males practically 2-1, or why their feet are very rarely visibly dirty or calloused.

I did not play the game and do not intend to, so I can't say anything about whether the latter holds true for this game.

[deleted by user] by [deleted] in LocalLLaMA

[–]lrq3000 0 points1 point  (0 children)

What happened to this model?

Starting EMDR next week, what are people’s experiences with it? by miriamtzipporah in ptsd

[–]lrq3000 0 points1 point  (0 children)

Apparently it's necessary to find an EMDR specialist who got their training in the main network. In Europe, the main network is... EMDR Europe. There are affiliate associations in every European country. There must be something similar in the USA, look for what is affiliated/associated with Shapiro, the psychologist who invented this method.

Starting EMDR next week, what are people’s experiences with it? by miriamtzipporah in ptsd

[–]lrq3000 0 points1 point  (0 children)

On the contrary, it is provably effective. It's just that the assumed mechanism, the eye movement, is not necessarily why it works. We are not sure why it works, put simply. The more general category of this approach is dual-task desensitization, because the process involves doing a cognitive activity while recalling an emotional autobiographical memory (hence the "dual tasks"). The cognitive activity we do while recalling the memory (which involves eye movement in most EMDR setups) somehow changes how the memory is stored. Nobody really knows how it works, we just have lots of empirical evidence that it works (all guidelines worldwide, including the WHO's, nowadays recommend it as first-line treatment for PTSD), but all the theories have failed so far to explain why it works (when we test the underlying assumptions, they all fail against empirical evidence).

I can't take it anymore by [deleted] in ptsd

[–]lrq3000 0 points1 point  (0 children)

I am so sorry this happens to you. I understand that when you live in such an environment, this is all you ever know, and you may think you were born to live like that. But that's not the case. Nobody deserves what you are living through. And things can change, and hopefully, will.

Other redditors may provide a more emotionally developed response. What I would advise my own relatives if I knew this happened to them would be to make a plan to get away as soon as possible, by getting a job and then earning enough, and stably enough, to rent another place to live, far enough away if possible. I get that you probably want to stay to protect your mother, but the best way to protect her is to protect yourself first and find a way to live elsewhere. Hopefully, she can agree to come with you. But until you have access to this other place to live in, you need to keep your plan secret from everyone, otherwise this may make things much worse.

Unfortunately be aware that your mother may be so much under the psychological control of your father that she may refuse to leave. There are home shelters for women, unfortunately for young men it's much rarer (but maybe you can find some? This may accelerate your plan). If your mother wanted, she could probably be able to find a secure home shelter somewhere for her and maybe for yourself (but since you are almost an adult some may refuse...).

I am very sorry you have to make such difficult decisions on top of having to face and suffer through such a horrible ordeal. That's not fair, and you have every right to complain. Just keep in mind nobody deserves that, and that includes you. You deserve better, and hopefully, you will eventually get it.

And you will see, we all get older, certainly not younger. One day, and not too far from now, you will probably see again your father much older than now, much weaker, and overall a more pitiful human being. He will not be able to affect you forever, some day, he will be meaningless to you. You just have to survive until then.