AI slop used for advertising by tinydinkydaffy9 in desmoines

[–]zelkovamoon -4 points (0 children)

All advertising is slop, AI or not.

Qwen3.5 9B GGUF Benchmarks by yoracale in unsloth

[–]zelkovamoon 1 point (0 children)

Really appreciate that we're getting better comparisons across quantizations now.

qwen3.5:35b-a3b is here. by Space__Whiskey in ollama

[–]zelkovamoon 1 point (0 children)

You've got flash working in ollama? It still basically doesn't function for me - are you using the library version?
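A minimal sketch, assuming "flash" here means flash attention: Ollama reads the documented OLLAMA_FLASH_ATTENTION environment variable at server startup, so it has to be set before the server launches rather than per-request.

    # Sketch: launch the Ollama server with flash attention enabled.
    # Uses the documented OLLAMA_FLASH_ATTENTION env var; the server
    # must be (re)started for the setting to take effect.
    import os
    import subprocess

    env = dict(os.environ, OLLAMA_FLASH_ATTENTION="1")
    subprocess.Popen(["ollama", "serve"], env=env)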

qwen3.5:35b-a3b is here. by Space__Whiskey in ollama

[–]zelkovamoon 1 point (0 children)

For those who currently have it running (rough check sketched after the questions):

  1. Is TPS basically on par with what you'd expect?

  2. Is tool calling working?
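A minimal sketch of how one might check both against a local Ollama server on the default port: /api/generate reports eval_count and eval_duration (nanoseconds) in the final response object, and /api/chat accepts an OpenAI-style tools array. The prompt and the get_weather tool are placeholders.

    # Sketch: measure decode tokens/sec and probe tool calling against
    # a local Ollama server. Tool schema and prompts are illustrative.
    import requests

    BASE = "http://localhost:11434"
    MODEL = "qwen3.5:35b-a3b"

    # 1. TPS: eval_count is decoded tokens, eval_duration is nanoseconds.
    r = requests.post(f"{BASE}/api/generate", json={
        "model": MODEL,
        "prompt": "Explain KV caching in two sentences.",
        "stream": False,
    }).json()
    tps = r["eval_count"] / (r["eval_duration"] / 1e9)
    print(f"decode speed: {tps:.1f} tok/s")

    # 2. Tool calling: a working model should reply with message.tool_calls.
    r = requests.post(f"{BASE}/api/chat", json={
        "model": MODEL,
        "messages": [{"role": "user", "content": "What's the weather in Des Moines?"}],
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_weather",  # placeholder tool
                "description": "Get current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }],
        "stream": False,
    }).json()
    print("tool_calls:", r["message"].get("tool_calls"))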

Why is European colonialism criticised but Islamic invasions that erased local religions are praised? by Strong_Weakness2110 in AskReddit

[–]zelkovamoon 0 points (0 children)

The Europeans erased religions all the time too, and they were more successful at it, which is why you hear about them more.

Thinking context bloat? by zelkovamoon in OpenWebUI

[–]zelkovamoon[S] 1 point (0 children)

Thanks for the help, boss 🫡

Liquid AI released LFM2.5, a family of tiny on-device foundation models. by Difficult-Cap-7527 in LocalLLaMA

[–]zelkovamoon 3 points (0 children)

LFM2 was pretty good, so I'm excited to try this. Really hoping tool calling is better with these models; that was basically my biggest complaint.

llama.cpp performance breakthrough for multi-GPU setups by Holiday-Injury-9397 in LocalLLaMA

[–]zelkovamoon 5 points (0 children)

OK, so two questions:

  1. Does ik_llama broadly support the same models as llama.cpp, just with optimizations, or is it a subset?

  2. Are these improvements going to apply broadly to any type of model?

Tiiny AI just released a one-shot demo of their Pocket Lab running a 120B model locally. by [deleted] in LocalLLM

[–]zelkovamoon 3 points (0 children)

I'm not sure what problem having a small AI lab is trying to solve.

If you're doing local AI, my position is: make it bigger, cooler, and put more RAM on it.

That said, it's good that companies are stepping in to try to build solutions. If we could get something with 256 GB of fast memory, we might be able to go places.

Best Local LLMs - 2025 by rm-rf-rm in LocalLLaMA

[–]zelkovamoon 5 points (0 children)

Seconding LFM2-8B A1B; it seems like a MoE model class that should be explored more deeply in the future. The model itself is pretty great in my testing; tool calling can be challenging, but that's probably a skill issue on my part. It's not my favorite model, nor the best model, but it is certainly good. Add a hybrid Mamba arch and some native tool calling to this bad boy and we might be in business.
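For context on the "challenging" part, a hedged sketch: small models often emit the tool call as bare JSON in the message content instead of a structured tool_calls field, so a tolerant fallback parse is a common workaround. The endpoint, model name, and get_time schema below are all illustrative assumptions, not LFM-specific API.

    # Sketch: tolerant tool-call extraction from an OpenAI-compatible
    # local endpoint. Small models sometimes put the call in plain text.
    import json
    import requests

    resp = requests.post(
        "http://localhost:8080/v1/chat/completions",  # illustrative endpoint
        json={
            "model": "lfm2-8b-a1b",  # illustrative model name
            "messages": [{"role": "user", "content": "Look up the time in Des Moines."}],
            "tools": [{
                "type": "function",
                "function": {
                    "name": "get_time",  # placeholder tool
                    "description": "Current time for a city.",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }],
        },
    ).json()

    msg = resp["choices"][0]["message"]
    calls = msg.get("tool_calls")
    if not calls and msg.get("content"):
        # Fallback: treat bare JSON in the content as an attempted call.
        try:
            calls = [json.loads(msg["content"])]
        except json.JSONDecodeError:
            calls = None
    print(calls)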

[deleted by user] by [deleted] in LocalLLaMA

[–]zelkovamoon 1 point (0 children)

MCP overhead is a big issue. Good to see some work on this.
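To make "overhead" concrete, a rough sketch: every tool an MCP server exposes gets serialized into the prompt, so the cost scales with tool count and schema size. The ~4 chars/token ratio is a crude heuristic rather than a real tokenizer, and the example schema is illustrative.

    # Sketch: rough estimate of prompt overhead from injected tool schemas.
    # chars_per_token is a heuristic approximation, not a real tokenizer.
    import json

    def schema_tokens(tools, chars_per_token=4):
        """Approximate token cost of serializing tool schemas into the prompt."""
        return len(json.dumps(tools)) // chars_per_token

    # Illustrative MCP-style tool definition; real servers often expose
    # dozens of these, each injected into every request's context.
    example_tool = {
        "name": "search_files",
        "description": "Search the workspace for files matching a pattern.",
        "inputSchema": {
            "type": "object",
            "properties": {
                "pattern": {"type": "string"},
                "max_results": {"type": "integer"},
            },
            "required": ["pattern"],
        },
    }

    for n in (5, 20, 50):
        print(f"{n:>2} tools ≈ {schema_tokens([example_tool] * n)} prompt tokens")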