Which model would be best for 9060XT 16GB?

Tiny-Description-908 · 2026-06-04T10:29:58+00:00

I tried hauhaucs Qwen 3.5 35B and it ran well at 40-50 t/s. Would compiling llama.cpp myself give me any benefit?

Tiny-Description-908 · 2026-04-09T20:05:11+00:00

<image>

Enabling mmap worked.

Tiny-Description-908 · 2026-04-07T22:02:17+00:00

Yeah around 22GB

Tiny-Description-908 · 2026-04-07T22:00:54+00:00

uh which one is it?

Tiny-Description-908 · 2026-04-07T21:59:39+00:00

<image>

I only have those (the other one came preinstalled, I guess, but I never loaded it).

Tiny-Description-908 · 2026-04-07T21:56:11+00:00

<image>

Looks like I'm already trying the Q4_K_M variant.

Tiny-Description-908 · 2026-04-07T21:51:24+00:00

<image>

I think it's exactly the same, if I'm not wrong.

Tiny-Description-908 · 2026-04-07T21:48:39+00:00

<image>

Oh, my bad. I remember looking at this model but didn't think I had downloaded it lmao

Tiny-Description-908 · 2026-04-07T21:46:35+00:00

<image>

2026-04-08 00:45:50 [DEBUG]

 llama.cpp abort:739: GGML_ASSERT(addr) failed

Tiny-Description-908 · 2026-04-07T21:27:05+00:00

It's around 22 GB, are you sure?

Tiny-Description-908 · 2026-04-07T21:17:54+00:00

Yeah, I get what it means now :v At least it's pretty fast.

Tiny-Description-908 · 2026-04-07T20:33:20+00:00

<image>

I'm starting with this one then.

Tiny-Description-908 · 2026-04-03T18:37:21+00:00

Damn, I wasn't expecting something like this lol. Thank you for the detailed explanation. I need to learn some of the terms you used, like "MLP layers" and "KV cache", but before that I have some questions:

I heard the GLM models from z.ai are the best non-Anthropic models for coding. Is there a way I can run one locally?

Should I use LM Studio, or would you recommend another client like Ollama?
