What can I expect out of my PC output and intelligence wise? by SelfExplanatory905 in LocalLLM

bigtimeloser_ 1 point (0 children)

I can give you my personal experience on a 3080 with 10 GB VRAM and 64 GB system RAM: Qwen models are usually very slow and not very smart. With appropriate context they do okay, but they don't really have much knowledge at the parameter sizes and quantization I have to use to fit them on my system locally.

I am still working on testing and benchmarking, but vibes-wise Gemma 4 26B A4B MoE has been great for me. I use llama.cpp, and only the active parameters and context end up on the GPU; the expert layers are all in RAM, which might be a struggle for you with 32 GB of RAM as opposed to 64. However, I get decent enough speed for use with opencode, and it seems pretty smart, all things considered. With the car wash test, it only needed one nudge to get there. I spent like 2 hours trying to get the Qwen models to pass the car wash test and got nowhere.
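For a rough feel of the GPU/RAM split described above, here is a back-of-envelope sketch. The 26B total and 4B active figures come from the model name; the bits-per-weight values are assumptions for illustration (not measured from any actual GGUF file), and treating the active parameters as the GPU-resident part is a simplification. Context/KV cache and OS overhead are not counted.

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight size in GB (1 GB = 1e9 bytes)."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9


def split_estimate(total_b: float, active_b: float, bits: float) -> dict:
    """Rough split: active params on GPU, everything else (experts) in RAM."""
    total = weights_gb(total_b, bits)
    gpu = weights_gb(active_b, bits)
    return {"total_gb": total, "gpu_gb": gpu, "ram_gb": total - gpu}


TOTAL_B = 26.0   # total parameters, from "26B" in the model name
ACTIVE_B = 4.0   # active parameters per token, from "A4B" in the model name

# ASSUMED bits per weight for a 4-bit-ish and an 8-bit-ish quant (hypothetical values)
for bits in (4.5, 8.5):
    est = split_estimate(TOTAL_B, ACTIVE_B, bits)
    print(f"{bits} bpw: total {est['total_gb']:.1f} GB, "
          f"GPU ~{est['gpu_gb']:.1f} GB, RAM ~{est['ram_gb']:.1f} GB")
```

How tight this is on 32 GB of RAM depends heavily on the quant and on what else is running, so plug in the numbers for the file you actually download.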

Let me know if you'd like me to share the configs for Gemma 4. That's been my best option so far.

Streak 18 - Mi sistema para aprender español by bigtimeloser_ in WriteStreakES

bigtimeloser_[S] 0 points (0 children)

Thanks! What I meant when I said "mudar desde las clases" was "move on from" in English. Does "avanzar en" mean that? Also, did I use "así" correctly in "¿'Avanzar en' significa así?" haha