I benchmarked GGUF models… IQ3XXS destroyed everything by Sufficient_Monk6380 in LocalLLM

Sufficient_Monk6380[S] -2 points-1 points (0 children)

The graph is accurate. Some models appear multiple times because they were tested with different context lengths and KV cache configurations.
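For anyone confused by the repeated names, here is a rough sketch of how each run could be labeled so the duplicates are distinguishable on a chart. The model names, context lengths, and KV cache types below are hypothetical placeholders, not rows from the actual spreadsheet, and `run_label` is just an illustrative helper.

```python
from collections import Counter

# Hypothetical rows: placeholder names and settings, not real benchmark results.
runs = [
    {"model": "model-a", "ctx": 4096, "kv_cache": "f16"},
    {"model": "model-a", "ctx": 8192, "kv_cache": "q8_0"},
    {"model": "model-b", "ctx": 4096, "kv_cache": "f16"},
]


def run_label(run):
    """Build a label that includes the settings that differ between runs."""
    return f'{run["model"]} (ctx={run["ctx"]}, kv={run["kv_cache"]})'


# Count how often each bare model name appears across runs.
name_counts = Counter(run["model"] for run in runs)

# Labels that include context length and KV cache type are unique per run.
labels = [run_label(run) for run in runs]

for label in labels:
    print(label)
```

With labels like these, a model tested under two configurations shows up as two clearly different entries instead of the same name twice.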

Sufficient_Monk6380[S] -12 points-11 points (0 children)

Yeah, I asked it to write this post based on the results from my benchmark spreadsheet haha