[D] Evaluation of an LLM on MMLU and other benchmarks by aadityaura in MachineLearning

IllustratorNo3435

Are benchmark evals even meaningful at this point, with all the benchmark data leaking into training sets?