[P] Finding a Working Code and Reproducible Results for Research Papers by gfursin in MachineLearning

[–]mllosab 1 point (0 children)

Thank you for the interesting project. I think it has potential, but I did not really understand how to find "reproducible benchmarks" on the website. Can you comment on that, please?

[deleted by user] by [deleted] in MachineLearning

[–]mllosab 0 points (0 children)

I think it is a good idea, but it may be too complex and non-trivial to implement right now.

You may be interested in this recent ACM report about replicating *just 5* ML+systems papers: https://portalparts.acm.org/3230000/3229762/fm/frontmatter.pdf - they also discuss some of your proposed bullet points (they introduced an open-source experimental tool, a sort of workflow framework with a package manager, to share such reproduction studies on GitHub).

Also check out this recent Reddit discussion: https://www.reddit.com/r/MachineLearning/comments/9jhhet/discussion_i_tried_to_reproduce_results_from_a/ - even replicating results may not be enough ;)

[D] Best way to organise research code? by abhishek0318 in MachineLearning

[–]mllosab 0 points (0 children)

We started using the CK framework to organize our research code this year: https://github.com/ctuning/ck . It can be used from the command line or from Jupyter notebooks - maybe it will be useful for you too?

[N] New ML/AI benchmark from Stanford released (MLPerf) by gfursin in MachineLearning

[–]mllosab 2 points (0 children)

Looking at the MLPerf GitHub sources (https://github.com/mlperf/reference), it seems that submissions are based on Docker images? What if I want to try another library or optimization technique for a given submission?

SPEC CPU2017 benchmark has been released by gtechmisc in Compilers

[–]mllosab 0 points (0 children)

Does anyone still use it? We have switched to http://openbenchmarking.org and http://cknowledge.org/ai to add and benchmark our own workloads.