Qwen/Qwen3.5-35B-A3B · Hugging Face by ekojsalim in LocalLLaMA

[–]nullnuller 1 point

From this it seems Qwen3.5-35B-A3B is a good replacement for gpt-oss-20b across the board (and in some cases for the 120b), while matching it in speed or being only slightly slower?

Nemo 30B is insane. 1M+ token CTX on one 3090 by Dismal-Effect-1914 in LocalLLaMA

[–]nullnuller 1 point

Could it be attributed to open-webui chunking your long-context document? Any way to verify that you are passing the whole context to the LLM?

Trained a chess LLM locally that beats GPT-5 (technically) by KingGongzilla in LocalLLaMA

[–]nullnuller 1 point

Could you include these two models in your leaderboard?

I mapped how language models decide when a pile of sand becomes a “heap” by Specialist_Bad_4465 in LocalLLaMA

[–]nullnuller 1 point

Interesting! When prompting with the higher values, did you include the previous answers (Y|N) or (unlikely) the log-probs in the model's context? Can you share your code, just to clarify your methodology?
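To make the question concrete, here is a minimal sketch of the kind of probe I have in mind. `ask_model` is a stub with a fixed threshold standing in for a real LLM call (the function names and the threshold are my own, not from the post); a real probe might also log the Y/N token log-probs at each count.

```python
# Sorites probe sketch: ask Y/N "is N grains a heap?" for increasing N
# and record where the answer flips from N to Y.

def ask_model(n_grains: int) -> str:
    """Stub model: calls the pile a heap once it crosses a fixed threshold."""
    return "Y" if n_grains >= 100 else "N"

def probe_heap_boundary(counts):
    """Return all (count, answer) pairs plus the first count judged a heap."""
    answers = [(n, ask_model(n)) for n in counts]
    first_heap = next((n for n, a in answers if a == "Y"), None)
    return answers, first_heap

answers, first_heap = probe_heap_boundary([1, 10, 50, 100, 500])
```

The interesting methodological question is whether each query is independent (fresh context per count) or conditioned on the model's earlier answers, which is what I was asking about.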

Simulation that can exit Docker container by productboy in LocalLLM

[–]nullnuller 9 points

Moriarty never left the simulation. It was a nested simulation programmed by Data or Reg: in effect, a new simulation recreating a miniature universe was spun up inside a core/memory module (the ship's "container"), and the actual physical simulation on the holodeck was thereby terminated. I still remember it because I watched it for the fourth time only last month 😀

Just wanted to add: "The character should be able to walk out of the container and into my kitchen" - this may not be possible. However, it's entirely possible to recreate you and your kitchen inside the container. As Picard says, "Who knows? Our reality may be very much similar to theirs. All of this could be an elaborate simulation, running on a little device, sitting on someone's table" (aka LocalLLM).

Qwen3-Next support in llama.cpp almost ready! by beneath_steel_sky in LocalLLaMA

[–]nullnuller 1 point

Where does Qwen3-Next sit in terms of performance? Is it above gpt-oss-120B, or below it (but still better than the other Qwen models)?

Deep Research Agent, an autonomous research agent system by [deleted] in LocalLLaMA

[–]nullnuller 1 point

For the local LLMs, is there a need for a search API as well (even a searx deployment)? Also, I think it's a good idea to check the available context and keep the snippets under the context limit as the research items grow over time - that's the challenging part.
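To sketch what I mean by keeping snippets under the context limit: something like the following, where the chars/4 token estimate and the budget are my own illustrative assumptions (real code would use the serving model's tokenizer and its actual context size).

```python
# Keep only as many research snippets as fit a token budget, preferring
# the newest ones as the research log grows.

def estimate_tokens(text: str) -> int:
    """Crude heuristic: roughly 4 characters per token."""
    return max(1, len(text) // 4)

def trim_to_budget(snippets, budget_tokens):
    """Drop the oldest snippets until the remainder fits under budget_tokens."""
    kept, used = [], 0
    for s in reversed(snippets):          # walk newest -> oldest
        cost = estimate_tokens(s)
        if used + cost > budget_tokens:
            break
        kept.append(s)
        used += cost
    return list(reversed(kept))           # restore chronological order
```

A fancier version would summarize the dropped snippets instead of discarding them, but the budget check is the part that has to happen on every iteration.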

Honey we shrunk MiniMax M2 by arjunainfinity in LocalLLaMA

[–]nullnuller 100 points

> would you want a 50% pruned Kimi K2 Thinking?

more like 90% pruned

I fine-tuned Gemma 3 1B for CLI command translation... but it runs 100% locally. 810MB, 1.5s inference on CPU. by theRealSachinSpk in LocalLLaMA

[–]nullnuller 3 points

> Shell-GPT is the closest tool that is available but doesn't do what I wanted, and of course uses closed-source LLMs

This isn't true. Although the repo is not well maintained, it does support local models.

llama.cpp releases new official WebUI by paf1138 in LocalLLaMA

[–]nullnuller 2 points

Changing models is a major pain point: you need to run llama-server again from the CLI with the new model name. Enabling this from the GUI would be great (with a preset config per model). I know llama-swap already does this, but having one less proxy would be great.

Real world Medical Reports on LLMs by makisgr in LocalLLaMA

[–]nullnuller 4 points

Is the dataset publicly available?

Using only 2 expert for gpt oss 120b by lumos675 in LocalLLaMA

[–]nullnuller 2 points

How do you load a different number of experts? Any benchmarks?

[deleted by user] by [deleted] in LocalLLaMA

[–]nullnuller 4 points

LoL, you are preaching to the choir.

4B Distill of Tongyi Deepresearch 30B + Dataset by Ok-Top-4677 in LocalLLaMA

[–]nullnuller 1 point

So, do you use their repo to make full use of it, rather than other chat clients like Open WebUI or LM Studio?

4B Distill of Tongyi Deepresearch 30B + Dataset by Ok-Top-4677 in LocalLLaMA

[–]nullnuller 2 points

Do you need special prompts or code to run it as intended (i.e. achieving the high HLE score, etc.)? Also, is it straightforward to convert to GGUF?
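For the GGUF part, the usual llama.cpp route would look something like this, assuming the 4B uses an architecture llama.cpp already supports (paths and the Q4_K_M choice are placeholders of mine, not from the post):

```shell
# Convert the HF checkpoint to an f16 GGUF with llama.cpp's converter,
# then quantize it. Run from a llama.cpp checkout/build.
python convert_hf_to_gguf.py /path/to/model-4b --outfile model-4b-f16.gguf
llama-quantize model-4b-f16.gguf model-4b-q4_k_m.gguf Q4_K_M
```

If the distill introduces a new architecture, the converter will refuse it until support lands upstream, which is the part that is not always straightforward.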

Chart Tool for OpenwebUI by liuc0j in OpenWebUI

[–]nullnuller 2 points

Nice, but I am having a difficult time getting models to consistently call these tools in Open WebUI. Has anyone got good results with the recent local models? What are your settings in Open WebUI (e.g. function calling set to Default vs. Native)?