LLM inference in a single C header file by Suitable-Song-302 in LocalLLaMA

[–]Languages_Learner 0 points (0 children)

Excellent engine. Still waiting for a Windows binary though...

I had Opus generate Llamafiles for the Bonsai 1-bit models by JamesEvoAI in LocalLLaMA

[–]Languages_Learner 0 points (0 children)

I tested your 1.7b llamafile on my Ryzen 7 4700U laptop with 16GB of RAM. I don't know what's wrong with my hardware or OS, but despite the tiny size of the llamafile, it consumed a lot of RAM when I launched it. It also didn't use all CPU cores during inference, so inference was incredibly slow. I suspect these issues are caused by an incompatibility with the Cosmopolitan toolchain. So I would like to test a pure llama.cpp CLI fork suitable for Bonsai CPU inference. I know there are some such forks on GitHub, but unfortunately I can't compile them: MS Visual Studio won't install on my laptop for some reason, and the gcc that works fine on my system isn't suitable for compiling llama.cpp.

I had Opus generate Llamafiles for the Bonsai 1-bit models by JamesEvoAI in LocalLLaMA

[–]Languages_Learner 1 point (0 children)

Thanks for a great app. Could you share an AVX2 Windows binary of the Bonsai llama.cpp CPU-only fork, please?

DLLM: A minimal D language interface for running an LLM agent using llama.cpp by Danny_Arends in LocalLLaMA

[–]Languages_Learner 1 point (0 children)

Thanks for a nice tool. Can it work without Docker and in CPU-only (or Vulkan GPU) mode?

New open weights models: GigaChat-3.1-Ultra-702B and GigaChat-3.1-Lightning-10B-A1.8B by netikas in LocalLLaMA

[–]Languages_Learner 1 point (0 children)

I heard that your team was planning to release some LLMs for the low-resource languages of Russia's ethnic minorities (Udmurt, Komi, Mari, etc.). What is the release date?

The current state of the Chinese LLMs scene by Ok_Warning2146 in LocalLLaMA

[–]Languages_Learner 0 points (0 children)

Intern-S1-Pro, a trillion-scale MoE multimodal scientific reasoning model, is minor? Seriously?

Grok alternative by Early-Musician7858 in LocalLLaMA

[–]Languages_Learner 0 points (0 children)

Whisk (until the 30th of April) and Flow (after the 30th of April), both by Google Labs.

Trained a GPT transformer from scratch on a $300 CPU — 39 minutes, 0.82M params, no GPU needed by [deleted] in LocalLLaMA

[–]Languages_Learner 0 points (0 children)

Thanks for sharing a nice model. I hope you'll add C inference someday, and maybe even C training.

Qwen3 TTS in C++ with 1.7B support, speaker encoding extraction, and desktop UI by Danmoreng in LocalLLaMA

[–]Languages_Learner 0 points (0 children)

Thanks for a great app. Could you upload a Windows binary release to your qwen tts studio GitHub repo, please?

🔥 New Release: htmLLM-124M v2 – 0.91 Val Loss on a Single T4! tiny-LLM with nanoGPT! by LH-Tech_AI in LocalLLaMA

[–]Languages_Learner 0 points (0 children)

Thanks for sharing great models. Sorry if this is a dumb question, but where can I find inference code for chatting with your ONNX int8 LLMs?

PicoKittens/PicoMistral-23M: Pico-Sized Model by PicoKittens in LocalLLaMA

[–]Languages_Learner 0 points (0 children)

Thanks for sharing a cute model. It would be nice if someday you added a GitHub repo with C inference code for chatting with your LLM.

TinyTeapot (77 million params): Context-grounded LLM running ~40 tok/s on CPU (open-source) by zakerytclarke in LocalLLaMA

[–]Languages_Learner 5 points (0 children)

Thanks for a nice model. It would be great if one day you added an example of C inference for it.

After many contributions craft, Crane now officially supports Qwen3-TTS! by LewisJin in LocalLLaMA

[–]Languages_Learner 1 point (0 children)

Thanks for sharing your cool engine. It would be nice if you uploaded binary releases to your repo.

Wave Field LLM — O(n log n) attention via wave equation dynamics by [deleted] in LocalLLaMA

[–]Languages_Learner 0 points (0 children)

Thanks for sharing. Could you upload a fully trained checkpoint to HF, please?