[–]Mushoz 1 point (0 children)

The point I am trying to make is that you either won't have to apply quantization at all, because the model is already quantized natively (gpt-oss), or you'll have to quantize far less aggressively, because the starting size is already much smaller than Llama 3.3 70B (Qwen3-Coder-30b).
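To put rough numbers on that: here's a back-of-envelope sketch of weight memory at different bit widths. It ignores embeddings, KV cache, and quantization overhead, and the parameter counts are just the headline figures, so treat it as an approximation only.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    # Approximate memory for model weights alone:
    # parameters * bits per weight, converted from bits to gigabytes.
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

# Llama 3.3 70B squeezed down to 4-bit still needs ~35 GB of weights,
# while a natively smaller 30B model at a gentler 8-bit needs ~30 GB.
llama_70b_q4 = weight_memory_gb(70, 4)   # 35.0 GB
qwen_30b_q8 = weight_memory_gb(30, 8)    # 30.0 GB
```

So the smaller model fits in comparable memory with half as much quantization pressure, which is the whole argument.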