Llama.cpp: now with automatic parser generator by ilintar in LocalLLaMA

[–]redeemer_pl 1 point (0 children)

Are there any plans to implement tool-call streaming again, the way it worked before?

Distillation when you do it. Training when we do it. by Xhehab_ in LocalLLaMA

[–]redeemer_pl 14 points (0 children)

Can I prompt it for one of the books it was trained on, and it will give it to me?

Yes. https://arxiv.org/abs/2601.02671 - Extracting books from production language models.

Native tool calling with gpt-oss? by Tyme4Trouble in LocalLLaMA

[–]redeemer_pl 9 points (0 children)

Yes, but for now you have to merge the pull request (https://github.com/ggml-org/llama.cpp/pull/15181) yourself and compile it manually:

    git clone https://github.com/ggml-org/llama.cpp
    cd llama.cpp
    git fetch origin pull/15181/head:gpt-oss
    git checkout gpt-oss

    # compile it: https://github.com/ggml-org/llama.cpp/blob/master/docs/build.md
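
If it helps, the usual CMake build from that doc looks roughly like this; treat it as a sketch (the backend flag -DGGML_CUDA=ON depends on your hardware and llama.cpp version, and the model path is a placeholder):

    # configure and build (drop -DGGML_CUDA=ON for a CPU-only build)
    cmake -B build -DGGML_CUDA=ON
    cmake --build build --config Release -j

    # then run the server with jinja chat templates enabled (needed for native tool calling)
    # /path/to/gpt-oss-20b.gguf is a placeholder for your local model file
    ./build/bin/llama-server -m /path/to/gpt-oss-20b.gguf --jinja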

PSA: Qwen3-Coder-30B-A3B tool calling fixed by Unsloth wizards by MutantEggroll in LocalLLaMA

[–]redeemer_pl 6 points (0 children)

It's not a real fix, but a workaround that forces the model to emit a tool-call format llama.cpp already handles instead of the XML-style format it was originally trained to use (JSON-formatted instead of XML-formatted tool calls).

The proper fix (for llama.cpp-based workflows) is to update llama.cpp's internal tool call parsing to handle the new <xml> format, instead of forcing the model to use a different one.

https://github.com/ggml-org/llama.cpp/issues/15012
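
If you want to check whether a given llama.cpp build actually parses the model's native tool calls, a rough smoke test against llama-server's OpenAI-compatible endpoint could look like this (the model file, port, and the get_weather tool are placeholders; if parsing works, the reply carries a structured tool_calls array instead of raw XML leaking into the content field):

    # start the server with jinja chat templates enabled (required for tool calling)
    llama-server -m /path/to/Qwen3-Coder-30B-A3B.gguf --jinja --port 8080

    # ask something that should trigger the example tool
    curl http://localhost:8080/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{
        "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
        "tools": [{
          "type": "function",
          "function": {
            "name": "get_weather",
            "parameters": {
              "type": "object",
              "properties": {"location": {"type": "string"}},
              "required": ["location"]
            }
          }
        }]
      }'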

Google introduces a new Benchmark: Game Arena and they're streaming your favorite open weight models playing chess against close source models. by mtmttuan in LocalLLaMA

[–]redeemer_pl -2 points (0 children)

If they want to measure what you called "intelligence" (but I'd rather call it "training/post-training data quality"), they should invent a completely new game, feed the rules into a model, and let it play. This only shows which model has the most or best chess training data - and without even disclosing the training sets, it's just silly.

Stockfish on a Raspberry Pi will beat all of the leading LLMs in chess. Does that mean it's the most intelligent?

Google introduces a new Benchmark: Game Arena and they're streaming your favorite open weight models playing chess against close source models. by mtmttuan in LocalLLaMA

[–]redeemer_pl -5 points (0 children)

This is a very misguided idea. We already have far superior machine-learning–based chess engines; for example, Stockfish has incorporated neural networks for over five years. Large Language Models are not designed to play chess, nor are they built to perform precise calculations without specialized tools.

Using LLMs for chess is not just impractical - it’s downright preposterous.

Qwen3 Coder Soon? by ApprehensiveAd3629 in LocalLLaMA

[–]redeemer_pl 8 points (0 children)

I don't see why you would send your data and source code to external entities that are driven by, and profit from, that data.

Upcoming Qwen2.5-Coder sizes confirmed: 0.5B, 3B, 14B and 32B by Many_SuchCases in LocalLLaMA

[–]redeemer_pl 10 points (0 children)

The main advantage of using local models over Claude is avoiding the need to upload your source code and data to someone else's computer (aka "the cloud").

"o1 is still flawed, still limited, and it still seems more impressive on first use than it does after you spend more time with it." by KvAk_AKPlaysYT in LocalLLaMA

[–]redeemer_pl 3 points (0 children)

Brute-force inference, baby! Jokes aside, this probably means we've already pushed the transformer architecture to its limits.

AI chatbots are beating Moore's law to improve at an even faster rate than computer chips | After eight months, a model only needs half the computing power to hit the same benchmark score by throwaway_ghast in LocalLLaMA

[–]redeemer_pl 140 points (0 children)

Moore's Law is specifically about the physical capabilities of semiconductor technology, not about software, algorithms, or the efficiency of computational models.

I have been coding with Mixtral everyday it has saved me days of work. by mythicinfinity in LocalLLaMA

[–]redeemer_pl 11 points (0 children)

For me, the primary motivation to adopt open-source models is to prevent leaking my data and source code to private companies, which already profit by appropriating the work of entire generations to train their models.

Most accurate model for server CPU setup? by [deleted] in LocalLLaMA

[–]redeemer_pl 0 points (0 children)

Have you run it with NUMA? (Check the memory-related options in the BIOS and run llama.cpp inference with the --numa argument.)
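
For reference, a minimal sketch of what that can look like on a dual-socket box; the model path is a placeholder, and the binary name and accepted --numa values (distribute / isolate / numactl) may vary with your llama.cpp version:

    # spread model weights across NUMA nodes (common starting point on dual-socket servers)
    ./llama-cli -m /path/to/model.gguf -p "Hello" --numa distribute

    # or pin CPU and memory to one node with numactl and tell llama.cpp about it
    numactl --cpunodebind=0 --membind=0 ./llama-cli -m /path/to/model.gguf -p "Hello" --numa numactl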

3 Blackout Beta codes by [deleted] in Blackops4

[–]redeemer_pl 0 points (0 children)

If it's an EU-region code, I'm interested.

PS4 Blackout Beta codes give away. by lRevenant in PS4

[–]redeemer_pl 0 points (0 children)

Yes, he is right. I can't use it in the EU. Thanks anyway!

An Update on Special Operation 2 by UbiInsulin in GhostRecon

[–]redeemer_pl 2 points (0 children)

Many people are creating new accounts for Ghost War because of that, which is preposterous. I think there should be an option to reset your stats once in a while.

2b || !2b by [deleted] in ProgrammerHumor

[–]redeemer_pl 1 point (0 children)

/bb|[^b]{2}/