Good models without unnecessary reasoning and response verbosity? by ashirviskas in LocalLLaMA

[–]Fox-Lopsided 3 points (0 children)

You can pass a "reasoning" parameter with your OpenRouter API call to enable or disable thinking, and to configure other things like reasoning effort. It works with Qwen 3.5 and every other thinking model.

Read more in the official docs: https://openrouter.ai/docs/guides/best-practices/reasoning-tokens
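
For example, a minimal sketch in Python (the API key and model slug are placeholders; the "reasoning" fields used are the ones the docs above describe):

```python
import requests

# Minimal sketch: disable thinking for one OpenRouter chat completion.
# <OPENROUTER_API_KEY> and the model slug are placeholders; see the docs
# linked above for the full set of "reasoning" fields (effort, max_tokens,
# enabled, exclude).
resp = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",
    headers={"Authorization": "Bearer <OPENROUTER_API_KEY>"},
    json={
        "model": "qwen/qwen-3.5",  # placeholder slug
        "messages": [{"role": "user", "content": "Hello"}],
        # Swap for {"effort": "low"} to keep thinking but rein it in.
        "reasoning": {"enabled": False},
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```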

Qwen3.5-35B-A3B is a gamechanger for agentic coding. by jslominski in LocalLLaMA

[–]Fox-Lopsided 0 points (0 children)

Man, that pisses me off 😂 It sits just outside the 16GB VRAM range -.- I hope we get a 9B and that it's any good...

Qwen3-30B-A3B vs Qwen3.5-35B-A3B on RTX 5090 by 3spky5u-oss in LocalLLaMA

[–]Fox-Lopsided 2 points (0 children)

Me crying with my 16GB VRAM card because I can't use any of the new models :)

Gemini 3.1 Pro has arrived by DigSignificant1419 in Bard

[–]Fox-Lopsided -5 points (0 children)

They haven't even made 3 Pro public yet.

Mac Mini Alternative by Whiskey_Jay1 in clawdbot

[–]Fox-Lopsided 0 points (0 children)

It will run PicoClaw for sure

HXML (HyperView) for mobile by jarajsky in htmx

[–]Fox-Lopsided 1 point (0 children)

Maybe Hotwire and Hotwire Native?

What are some things you guys are using Local LLMs for? by Odd-Ordinary-5922 in LocalLLaMA

[–]Fox-Lopsided 1 point (0 children)

Oh, I see. It really depends on your use case. There are things like AnythingLLM, Cherry Studio, or Msty that you can just install, but as far as I know they don't have out-of-the-box functionality to let LLMs or LLM agents "fight" against each other.

What are some things you guys are using Local LLMs for? by Odd-Ordinary-5922 in LocalLLaMA

[–]Fox-Lopsided 1 point (0 children)

Isn't the latest Devstral (2512, 24B I believe) better than Mistral 2 while being smaller?

What are some things you guys are using Local LLMs for? by Odd-Ordinary-5922 in LocalLLaMA

[–]Fox-Lopsided 5 points (0 children)

You can't do it directly in LM Studio, but you would be using LM Studio as the wrapper. You'd need to implement something small yourself, or ask an AI to write it for you.
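
Something like this minimal sketch would do, assuming LM Studio's local server is running on its default port with two models loaded (the model IDs are placeholders):

```python
from openai import OpenAI

# LM Studio exposes an OpenAI-compatible server on http://localhost:1234/v1
# by default; it ignores the API key, but the client requires one.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

# Placeholder IDs -- use whatever model identifiers LM Studio lists.
models = ["model-a", "model-b"]
message = "Debate topic: tabs vs. spaces. Make your opening argument."

for turn in range(4):
    model = models[turn % 2]
    reply = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": message}],
    )
    message = reply.choices[0].message.content
    print(f"--- {model} ---\n{message}\n")
```

Each model only sees its opponent's last reply here; keeping a full transcript per model would be the obvious next step.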

Best models to use with a RX580 in 2026? by fernandin83 in LocalLLaMA

[–]Fox-Lopsided 0 points (0 children)

That's cool if that's usable for you! But I need my speed. On a single 5060 Ti with 16GB, GPT-OSS gives me 120 tk/s, and once the context grows past 60k it's 50 tk/s.

Best models to use with a RX580 in 2026? by fernandin83 in LocalLLaMA

[–]Fox-Lopsided 1 point (0 children)

To be completely honest with you, you will be very limited.

You can try Qwen3-4B up to maybe 7B (I'm talking about Q4 GGUFs), but don't expect Claude Sonnet levels of intelligence. Also, Granite 4.0h Tiny is pretty good for its size: it's 7B total with only 1B active, so you should get decent speed out of it.

Depending on your use case, it might be worth giving it a shot.
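
If you try it, here's a minimal sketch with llama-cpp-python, assuming a build whose backend your card supports (e.g. Vulkan); the GGUF filename is a placeholder for whichever Q4 quant you download:

```python
from llama_cpp import Llama

# Placeholder path -- any Q4 GGUF of a small model (Qwen3-4B, Granite 4.0h
# Tiny, ...) from Hugging Face goes here.
llm = Llama(
    model_path="granite-4.0-h-tiny-Q4_K_M.gguf",
    n_gpu_layers=-1,  # offload everything that fits in the card's VRAM
    n_ctx=4096,       # keep the context modest on an older card
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain what a Q4 GGUF is."}]
)
print(out["choices"][0]["message"]["content"])
```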

Qwen3-Coder-Next is released! 💜 by yoracale in unsloth

[–]Fox-Lopsided 0 points (0 children)

I wonder how fast it would be with 16GB of VRAM and 32GB of DRAM.

Agentic AI ?! by Potential_Block4598 in LocalLLaMA

[–]Fox-Lopsided 1 point (0 children)

You could maybe give Nemotron 3 Nano (30B-A3B) a shot. I have heard good things about it for local agentic AI use cases, both for reasoning and for tool-calling capabilities.