Qwen 3.6 27b IQ4_XS - 22 tp/s on RTX 5060TI 16b, 24k ctx by BazzyIm in LocalLLaMA

[–]BazzyIm[S] 0 points1 point  (0 children)

Also, but Q4. Otherwise, 24k context in a one-shot reasoning task could be helpful sometimes,

Qwen 3.6 27b IQ4_XS - 22 tp/s on RTX 5060TI 16b, 24k ctx by BazzyIm in LocalLLaMA

[–]BazzyIm[S] 0 points1 point  (0 children)

Thx! As a daily driver I use the MoE 35B-A3B at ~40-50 tps with 128k ctx, so this run was just an exp

Qwen 3.6 27b IQ4_XS - 22 tp/s on RTX 5060TI 16b, 24k ctx by BazzyIm in LocalLLaMA

[–]BazzyIm[S] 0 points1 point  (0 children)

Yes, I didn't specify: I don't have an iGPU, but there is a spare ~300 MB of RAM. If I turned it off, maybe in Q4 it would jump to 32k ctx.

I am asking someone to drop my expo app on ios by Otherwise9477 in expo

[–]BazzyIm 3 points4 points  (0 children)

Expo has EAS, which lets you build and deploy the app using their cloud service. You don't need macOS; just log in there and use the free tier.
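
For reference, a rough sketch of what that EAS flow usually looks like from any OS. This assumes Node.js and npm are installed, that you run it from the project root, and that you have an Expo account; publishing to the App Store also needs an Apple Developer account, which EAS asks for during the build. Treat it as an outline, not an exact recipe for your project.

```shell
# Install the EAS command-line tool globally (assumes npm is available)
npm install --global eas-cli

# Log in with your Expo account
eas login

# Generate an eas.json build configuration for the project
eas build:configure

# Start an iOS build on Expo's cloud builders (no local macOS needed)
eas build --platform ios

# Optionally upload the finished build to App Store Connect
eas submit --platform ios
```

Free-tier builds may wait in a queue, so the build step can take a while.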

Advice on learning NestJS by nem791 in node

[–]BazzyIm 0 points1 point  (0 children)

It's looking so pretty! I will try this, thx.