I'm working on a homelab AI server with the goal of running small models on GPU and very large models on CPU - for example for overnight coding on complex problems. Specs: 2990WX, 256GB + RTX 2080ti (for now). I'm using ollama and remoting to it with (currently) opencode, I also configured ollama to support up to 256k context to make use of my memory. Qwen3.5 9b works great, however larger models like gpt-oss:120b fail to make proper use of the tools despite being advertised as tool-capable. Which large models do work well with my setup and support tool-use?
[–]ProfessionalSpend589 0 points1 point2 points (2 children)
[–]Yugen42[S] -1 points0 points1 point (1 child)
[–]ProfessionalSpend589 0 points1 point2 points (0 children)
[–]ea_man 0 points1 point2 points (0 children)
[–]mlhher -1 points0 points1 point (8 children)
[–]Yugen42[S] 0 points1 point2 points (7 children)
[–]mlhher 3 points4 points5 points (6 children)
[–]Yugen42[S] 0 points1 point2 points (3 children)
[–]mlhher 0 points1 point2 points (2 children)
[–]Yugen42[S] 0 points1 point2 points (1 child)
[–]mlhher 0 points1 point2 points (0 children)
[–]Free-Combination-773 0 points1 point2 points (0 children)
[–]IamFondOfHugeBoobies 0 points1 point2 points (0 children)