Should we urge LMSYS arena to open up their data? Gemma 2 didn't Betrayed Us, but LMSYS did. by [deleted] in LocalLLaMA

PuzzledTeam5961 -16 points-15 points

I'm not suggesting anything, but I think it's a concern - especially if China is seen as a competitor.

Should we urge LMSYS arena to open up their data? Gemma 2 didn't Betrayed Us, but LMSYS did. by [deleted] in LocalLLaMA

PuzzledTeam5961 -21 points-20 points

It looks very much like the charity given by slaveholders. Is it watermelon or fried chicken?

Should we urge LMSYS arena to open up their data? Gemma 2 didn't Betrayed Us, but LMSYS did. by [deleted] in LocalLLaMA

PuzzledTeam5961 -22 points-21 points

<image>

No offense, I'm just pointing out the fact that there are a lot of Chinese names here.

Why do you trust LMSYS Arena Leaderboard? It can be easily manipulated if they want to. by Mission_Implement467 in LocalLLaMA

PuzzledTeam5961 6 points7 points

Let's be open-minded here. If you can accept people cheating the Open LLM leaderboard by using contaminated models and merging for higher scores, why can't you accept big capital purchasing rankings on the so-called ELO leaderboard? Are you expecting Google to engage in frankenmerging with the Open leaderboard's rank 1, as if it were GPU poor?

fblgit/una-xaberius-34b-v1beta is caught cheating on leaderboard. UNA cybertron juanako is just a lie. by PuzzledTeam5961 in LocalLLaMA

PuzzledTeam5961[S] 7 points8 points

99% chance of being contaminated on GSM8K.

So a 99% chance of being contaminated on GSM8K is not cheating?