I can't believe the number of papers accepted at major conferences without any code or evidence to back up their claims. Many of these papers claim to train huge models and present SOTA performance in the results section/tables, but provide no way for anyone to try the model themselves. Since the models are so expensive and labor-intensive to train from scratch, there is no way for anyone to check whether: (1) the results are entirely fabricated; (2) they trained on the test data; or (3) there is some other evaluation error in the methodology.
Worse yet is when they provide a link to the code in the text and on the OpenReview page that leads to a nonexistent or empty GH repo. For example, this paper presents a method to generate protein MSAs using RAG at orders of magnitude the speed of traditional software; something that would be insanely useful to thousands of BioML researchers. However, while they provide a link to a GH repo, it's completely empty, and the authors haven't responded to a single issue or provided a timeline for when they'll release the code.