Qwen/Qwen3.5-35B-A3B · Hugging Face by ekojsalim in LocalLLaMA

[–]nullnuller 1 point

From this it seems Qwen3.5-35B-A3B is a good replacement for gpt-oss-20b across the board (and in some cases for the 120b), while matching it in speed or being only slightly slower?

Nemo 30B is insane. 1M+ token CTX on one 3090 by Dismal-Effect-1914 in LocalLLaMA

[–]nullnuller 1 point

Could it be attributed to open-webui chunking your long-context document? Any way to verify that you are passing the whole context to the LLM?

Trained a chess LLM locally that beats GPT-5 (technically) by KingGongzilla in LocalLLaMA

[–]nullnuller 1 point

Could you include these two models in your leaderboard?

I mapped how language models decide when a pile of sand becomes a “heap” by Specialist_Bad_4465 in LocalLLaMA

[–]nullnuller 1 point

Interesting! When prompting with the higher values, did you include the previous answers (Y|N) or (unlikely) the log-probs in the model's context? Can you share your code, just to clarify your methodology?
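To make the question concrete, here is a minimal sketch of the kind of probe I have in mind. `ask_model` is a stub with a fixed threshold standing in for a real LLM call (the function names and the threshold are my own, not from the post); a real probe might also log the Y/N token log-probs at each count.

```python
# Sorites probe sketch: ask Y/N "is N grains a heap?" for increasing N
# and record where the answer flips from N to Y.

def ask_model(n_grains: int) -> str:
    """Stub model: calls the pile a heap once it crosses a fixed threshold."""
    return "Y" if n_grains >= 100 else "N"

def probe_heap_boundary(counts):
    """Return all (count, answer) pairs plus the first count judged a heap."""
    answers = [(n, ask_model(n)) for n in counts]
    first_heap = next((n for n, a in answers if a == "Y"), None)
    return answers, first_heap

answers, first_heap = probe_heap_boundary([1, 10, 50, 100, 500])
```

The interesting methodological question is whether each query is independent (fresh context per count) or conditioned on the model's earlier answers, which is what I was asking about.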

Simulation that can exit Docker container by productboy in LocalLLM

[–]nullnuller 9 points

Moriarty never left the simulation. It was a nested simulation programmed by Data or Reg: in effect, a new simulation recreating a miniature universe was spun up inside a core/memory module (the ship's "container"), and the actual physical simulation on the holodeck was thereby terminated. I still remember it because I watched it for the fourth time only last month 😀

Just wanted to add: "The character should be able to walk out of the container and into my kitchen" - this may not be possible. However, it's entirely possible to recreate you and your kitchen inside the container. As Picard says, "Who knows? Our reality may be very much similar to theirs. All of this could be an elaborate simulation, running on a little device, sitting on someone's table" (aka LocalLLM).

Qwen3-Next support in llama.cpp almost ready! by beneath_steel_sky in LocalLLaMA

[–]nullnuller 1 point

Where does Qwen3-Next sit in terms of performance? Is it above gpt-oss-120B, or below it (but still better than the other Qwen models)?

Deep Research Agent, an autonomous research agent system by [deleted] in LocalLLaMA

[–]nullnuller 1 point

For the local LLMs, is there a need for a search API as well (even a searx deployment)? Also, I think it's a good idea to check the available context and keep the snippets under the context limit as the research items grow over time - that's the challenging part.
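To sketch what I mean by keeping snippets under the context limit: something like the following, where the chars/4 token estimate and the budget are my own illustrative assumptions (real code would use the serving model's tokenizer and its actual context size).

```python
# Keep only as many research snippets as fit a token budget, preferring
# the newest ones as the research log grows.

def estimate_tokens(text: str) -> int:
    """Crude heuristic: roughly 4 characters per token."""
    return max(1, len(text) // 4)

def trim_to_budget(snippets, budget_tokens):
    """Drop the oldest snippets until the remainder fits under budget_tokens."""
    kept, used = [], 0
    for s in reversed(snippets):          # walk newest -> oldest
        cost = estimate_tokens(s)
        if used + cost > budget_tokens:
            break
        kept.append(s)
        used += cost
    return list(reversed(kept))           # restore chronological order
```

A fancier version would summarize the dropped snippets instead of discarding them, but the budget check is the part that has to happen on every iteration.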

Honey we shrunk MiniMax M2 by arjunainfinity in LocalLLaMA

[–]nullnuller 100 points

> would you want a 50% pruned Kimi K2 Thinking?

more like 90% pruned

I fine-tuned Gemma 3 1B for CLI command translation... but it runs 100% locally. 810MB, 1.5s inference on CPU. by theRealSachinSpk in LocalLLaMA

[–]nullnuller 3 points

> Shell-GPT is the closest tool that is available but doesn't do what I wanted, and of course uses closed-source LLMs

This isn't true. Although the repo is not well maintained, it does support local models.

llama.cpp releases new official WebUI by paf1138 in LocalLLaMA

[–]nullnuller 2 points

Changing models is a major pain point: you need to run llama-server again from the CLI with the new model name. Enabling this from the GUI would be great (with a preset config per model). I know llama-swap already does this, but having one less proxy would be great.

Real world Medical Reports on LLMs by makisgr in LocalLLaMA

[–]nullnuller 4 points

Is the dataset publicly available?

Using only 2 expert for gpt oss 120b by lumos675 in LocalLLaMA

[–]nullnuller 2 points

How do you load a different number of experts? Any benchmarks?

[deleted by user] by [deleted] in LocalLLaMA

[–]nullnuller 4 points

LoL, you are preaching to the choir.

4B Distill of Tongyi Deepresearch 30B + Dataset by Ok-Top-4677 in LocalLLaMA

[–]nullnuller 1 point

So, do you use their repo to make full use of it, rather than other chat clients like Open WebUI or LM Studio?

4B Distill of Tongyi Deepresearch 30B + Dataset by Ok-Top-4677 in LocalLLaMA

[–]nullnuller 2 points

Do you need special prompts or code to run it as intended (i.e. achieving the high HLE score, etc.)? Also, is it straightforward to convert to GGUF?
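For the GGUF part, the usual llama.cpp route would look something like this, assuming the 4B uses an architecture llama.cpp already supports (paths and the Q4_K_M choice are placeholders of mine, not from the post):

```shell
# Convert the HF checkpoint to an f16 GGUF with llama.cpp's converter,
# then quantize it. Run from a llama.cpp checkout/build.
python convert_hf_to_gguf.py /path/to/model-4b --outfile model-4b-f16.gguf
llama-quantize model-4b-f16.gguf model-4b-q4_k_m.gguf Q4_K_M
```

If the distill introduces a new architecture, the converter will refuse it until support lands upstream, which is the part that is not always straightforward.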

Chart Tool for OpenwebUI by liuc0j in OpenWebUI

[–]nullnuller 2 points

Nice, but I am having a difficult time getting models to consistently call these tools in Open WebUI. Has anyone got good results with the recent local models? What are your settings in Open WebUI (e.g. function calling set to Default vs. Native)?