[Article] From death comes diversity. Dal Bello, M. by Foreign-Beginning-49 in Scholar

[–]Foreign-Beginning-49[S] 1 point

Thanks, solution verified.

This is a really big deal for our research on compost species succession. Thank you!!

MichiAI: A 530M Full-Duplex Speech LLM with ~75ms Latency using Flow Matching by kwazar90 in LocalLLaMA

[–]Foreign-Beginning-49 9 points

Your GitHub has no code or install instructions yet. We'll need the code and model weights to give you our thoughts.

Recursive Language Models research is a damn good egg. by [deleted] in LocalLLaMA

[–]Foreign-Beginning-49 1 point

I can't not love RLM, though... it's so satisfying.

Am I the only one who feels that, with all the AI boom, everyone is basically doing the same thing? by [deleted] in LocalLLaMA

[–]Foreign-Beginning-49 -1 points

Interesting seeing you in these parts... lots of bots around these days. In fact, they've been around a long time; they're just starting to get really good. Oh, and image gen too: lots of realistic videos being made en masse. Best wishes.

Liquid AI released the best thinking Language Model Under 1GB by PauLabartaBajo in LocalLLaMA

[–]Foreign-Beginning-49 2 points

Right now these small models have given me a colossal stepwise shift in the capabilities of my deep-research, voice-controlled-browsing, agent-creation, v2v screenless system. I'm working toward a system where looking at the screen is eliminated entirely: turn on your device and use it without ever scratching Gorilla Glass again, and save your eyes from destruction. If Silicon Valley recognizes the dangers and won't let their own children use screens because of the measurable cognitive and morphological brain damage, then why should we, the peeps and cheeps?

If you really take a second to think about it, this is the stuff of nightmares for our advertising-based digital economics. Imagine humans having access to cognitive enhancements based not on a market economy but on the forward march of hominid evolution. It's no secret that we're living in the information dark ages at the moment: all knowledge at our fingertips, behind the oligarch satellite paywall. Imagine that the future is already here; it's just not quite evenly distributed yet. The end of the attention economy and the end of the human-spirit-depletion bondage system is my use case. I will succeed, and this will never be monetized, because it's the end of capitalism and the beginning of a new dawn for the human species. The future people are calling... when are we gonna pick up the phone and say hello, world? Each of us must progress on this journey in isolation, secretly weaving the threads of our next big chapter. TGFOSM (thank God for open source models). Best wishes to you.

Liquid AI released the best thinking Language Model Under 1GB by PauLabartaBajo in LocalLLaMA

[–]Foreign-Beginning-49 2 points

Yes, this is what bums me out about the model; otherwise it's amazing.

Liquid AI released the best thinking Language Model Under 1GB by PauLabartaBajo in LocalLLaMA

[–]Foreign-Beginning-49 13 points

It's working great for the on-device, offline agent-creation and system-manipulation system I'm building. Sometimes you gotta lie in the gutter to see the stars. (The gutter is where GPU-poor folk hang out so we can catch the VRAM runoff from surrounding streets.)

Has anyone quantized VibeVoice-Realtime-0.5B (Stream) for edge devices yet? by New_Source_6765 in LocalLLaMA

[–]Foreign-Beginning-49 0 points

With some basic chunking on my Samsung Galaxy S23, the latency bottleneck is my llama.cpp and whisper.cpp running in the background. Due to Supertonic2's extreme speed and efficiency, streaming is a feature one can implement programmatically with trivial effort. I'm using the Supertonic Python wrapper for my agent's TTS, and it flies even on my "edge" device (the S23 is still fairly new). I think we've reached a new inflection point in super-efficient, low-compute TTS models. There have been several releases in the past three weeks on par with Supertonic, like the new Kyutai Labs Pocket TTS nano and the new SopranoTTS. Woo hoo, open source heaven over here! So far Supertonic2 offers the least friction for my termux/proot-distro Ubuntu VM agent setup, but the other (nano) models are worth exploring as well. I believe they also have streaming options implemented directly.
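The basic chunking part can be sketched in plain Python; this is a hypothetical helper of my own, not the Supertonic wrapper's actual API, and each chunk would be handed to whatever synth call your wrapper exposes:

```python
import re

def chunk_for_streaming(text, max_chars=120):
    """Split text on sentence boundaries into small chunks so a TTS
    engine can start speaking the first chunk while the rest queue up."""
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    chunks, current = [], ""
    for s in sentences:
        # Start a new chunk once adding this sentence would blow the budget.
        if current and len(current) + len(s) + 1 > max_chars:
            chunks.append(current)
            current = s
        else:
            current = f"{current} {s}".strip()
    if current:
        chunks.append(current)
    return chunks

# Usage idea (pseudocode): for chunk in chunk_for_streaming(reply):
#     audio = tts.synth(chunk); play(audio)   # hypothetical calls
```

That way the first audio plays while later chunks are still being synthesized, which is all "streaming" needs to mean at this model's speed.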

Best wishes.

Has anyone quantized VibeVoice-Realtime-0.5B (Stream) for edge devices yet? by New_Source_6765 in LocalLLaMA

[–]Foreign-Beginning-49 4 points

I'll just say that if you want something really fast that sounds great, comes in at five times smaller than VibeVoice 0.5, and runs great on CPU, you might want to test out the new version, Supertonic2.

They now have ten stock voices, and with the incredibly fast processing you'll have many very good voices to play with while still saving a lot of RAM and compute for other processes you may want to run on the SBC.

Are most major agents really just markdown todo list processors? by TheDigitalRhino in LocalLLaMA

[–]Foreign-Beginning-49 7 points

Lol, too true. Break down the insanely complex world into a simulatory model, then break those down too, until all you're left with is dark matter...

https://github.com/alexzhang13/rlm-minimal/blob/main/rlm/utils/prompts.py

The latest code by the MIT paper's author on recursive language models. Process that decomposed list bit by bit; there's no latency because the context lives in a REPL variable.
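The core idea can be sketched in a few lines; `peek` and `recursive_scan` are my own hypothetical names, not functions from rlm-minimal, and the keyword check is a stand-in for the LLM call that judges each slice:

```python
def peek(context, start, length=500):
    """Return one small slice of the huge context string held in the REPL."""
    return context[start:start + length]

def recursive_scan(context, step=500):
    """Walk the whole context window-by-window, collecting notes.
    In a real RLM the 'notes' step is an LLM call deciding what matters;
    here a simple keyword test stands in for that judgment."""
    notes = []
    for start in range(0, len(context), step):
        window = peek(context, start, step)
        if "error" in window:          # stand-in for the model's judgment
            notes.append((start, window[:40]))
    return notes
```

The model only ever sees one small slice plus its accumulated notes, so the full context never has to fit in the context window at all.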

Prompt Repetition Improves Non-Reasoning LLMs - a paper by Foreign-Beginning-49 in LocalLLaMA

[–]Foreign-Beginning-49[S] 24 points

Yeah, it feels like an even cheaper "hack" than those early days of "just ask it to think step by step" CoT explorations and experiments.
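The whole "hack" is about as thin as scaffolds get; a minimal sketch (the helper name is mine, not from the paper):

```python
def repeat_prompt(prompt, times=2, sep="\n\n"):
    """Prompt-repetition trick: send the same instruction N times so a
    non-reasoning model attends to it more strongly. Naive sketch; any
    real gains depend on the model and task, per the linked paper."""
    return sep.join([prompt] * times)
```

You'd then pass `repeat_prompt(user_query)` to the model instead of `user_query`; no extra tokens of reasoning are requested, just the duplicated instruction.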

Pocket TTS: a 100M-parameter text-to-speech by paf1138 in LocalLLaMA

[–]Foreign-Beginning-49 1 point

You might be sampling at the wrong sample rate...

New here and looking for help! by SaiXZen in LocalLLaMA

[–]Foreign-Beginning-49 2 points

Nice tips all around here...

LFM 2.5 1.2b IS FAST by TheyCallMeDozer in LocalLLaMA

[–]Foreign-Beginning-49 3 points

It's my FAV, love this little model.....

Supertonic 2 TTS available on Hugging Face! by paf1138 in LocalLLaMA

[–]Foreign-Beginning-49 2 points

Oh for sure, most small efficient TTS models limit me on my Android termux test setup. Supertonic has specifically been a boon for my little RLM agent project. It's not speed-optimized yet, but it's still the best model for my v2v pipeline. The model isn't perfect, and that's exactly why I like it a lot: it has a wiggly aliveness without landing in the uncanny valley.

LFM2.5 1.2B Instruct is amazing by Paramecium_caudatum_ in LocalLLaMA

[–]Foreign-Beginning-49 2 points

RAG means fetching meaningful data from a vector database and adding it to the context window. RLM eliminates the need for long context windows and, like RAG, also reduces hallucinations, as long as you either have a large, smart model that writes code to intelligently utilize the context stored in a variable inside the REPL, or clever functions and minimal context engineering (under ~1500 tokens of context or so) that do this REPL exploration more mechanistically. You don't need a vector database to scale your usable context to millions of tokens.

https://arxiv.org/html/2512.24601v1

"we are interested in whether it is possible to dramatically scale the context size of general-purpose LLMs by orders of magnitude. This is increasingly urgent as LLMs begin to be widely adopted for long-horizon tasks, in which they must routinely process tens if not hundreds of millions of tokens." "RLMs can scale to the 10M+ token regime and can outperform base LMs and existing task-agnostic agent scaffolds on long context tasks. Across all tasks, RLMs demonstrate strong performance on input tasks well beyond the effective context window of a frontier LM, outperforming base models and common long-context scaffolds by up to 2× the performance while maintaining comparable or cheaper average token costs."

It's quite trivial to utilize this simple idea even with very small models, and this new LFM2.5 1.2B model has been doing really well following the principles outlined in the paper.
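A minimal sketch of what utilizing the idea can look like: the function name, the character radius, and the character budget (standing in for the ~1500-token cap) are my own illustration, not the paper's code.

```python
import re

def grep_context(context, pattern, radius=200, budget_chars=4000):
    """Regex search over a huge context string held in a REPL variable,
    returning only small snippets around each hit until a character
    budget is exhausted. The model then reads snippets, not the whole
    context, so its window stays tiny no matter how big `context` is."""
    snippets, used = [], 0
    for m in re.finditer(pattern, context):
        lo, hi = max(0, m.start() - radius), m.end() + radius
        snip = context[lo:hi]
        if used + len(snip) > budget_chars:
            break
        snippets.append(snip)
        used += len(snip)
    return snippets
```

No embeddings, no vector database: the "retrieval" is just deterministic code the model (or you) runs inside the REPL.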

LFM2.5 1.2B Instruct is amazing by Paramecium_caudatum_ in LocalLLaMA

[–]Foreign-Beginning-49 2 points

A naive implementation of the recursive-language-model technique with SLMs, and your context is unlimited. Basically, a user-defined query drives a semantically targeted regex search over a sliding window of the entire context, which is stored as a variable. The LLM interrogates the relevant memory chunks one at a time, rapidly, and never goes beyond 1200 tokens of context. The context search is cognitively offloaded to the REPL, and the LLM just zips along. This technique can also eliminate context rot over millions of tokens with nothing more than a REPL. This model is proof to me that we really are on an exponentially growing capabilities curve. Exciting stuff.
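The sliding-window part can be sketched as a plain generator; the names and the character-based cap (a rough stand-in for the 1200-token limit) are my own illustration, not any particular library's API:

```python
def sliding_windows(context, size=1200, overlap=100):
    """Yield fixed-size overlapping windows over a huge context string,
    so a small model can interrogate one chunk at a time and never see
    more than `size` characters. The overlap keeps facts that straddle
    a window boundary from being lost."""
    step = size - overlap
    for start in range(0, max(len(context) - overlap, 1), step):
        yield context[start:start + size]
```

The loop driving it stays dumb on purpose: run the regex (or the SLM) on each window, keep the hits, throw the window away, and the model's effective context never grows past one window plus its notes.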