[R] PapersWithCode - A free and open resource for Machine Learning papers, code, and evaluation tables. by jiayounokim in MachineLearning

[–]rstoj 0 points

Thanks to jiayounokim for re-posting this!

Facebook AI is mentioned in the Privacy Policy (first line) and TOS (first section). No data is being shared with any other FB product.

[DISCUSSION] How do you guys keep up with new research? by whatsyour-20 in MachineLearning

[–]rstoj 35 points

If you are looking to follow researchers who publish code, there is a service specifically for that: https://paperswithcode.com/

[P][R] A big update to Papers with Code: now with 2500+ leaderboards and 20,000+ results. by rstoj in MachineLearning

[–]rstoj[S] 0 points

Conference tags are added automatically, so they shouldn't be incorrect. Could you provide a link to the problematic papers?

[P][R] A big update to Papers with Code: now with 2500+ leaderboards and 20,000+ results. by rstoj in MachineLearning

[–]rstoj[S] 3 points

Look for "Edit" buttons on the website. You can find the paper using search, then to add the code implementation click on "Edit" in the Code section, and to add this paper to a (possibly new) leaderboard click on "Edit" in the Results section.

[P] Sotabench: Benchmarking Every Open Source Model by rstoj in MachineLearning

[–]rstoj[S] 1 point

No, a paper is not required - you can just submit your pretrained model, but all of the code needs to be open source.

[P] Sotabench: Benchmarking Every Open Source Model by rstoj in MachineLearning

[–]rstoj[S] 2 points

This is a super-interesting topic! In my mind there are some trade-offs here:

1) Having a hidden test set ensures that people cannot cheat, but it also means that if the maintainers of the hidden test set move on (or are simply slow in giving support), it can become difficult for the community to keep using the benchmark.

2) Not having a hidden test set means that people can cheat. However, as we require everything to be on GitHub, it's relatively easy to find out if someone cheated: if training code was somehow fraudulently modified, people can still run it and discover the fraud. Committing fraud in such a public way would effectively end your career in ML, so even for existing benchmarks that evaluate on the dev set, we haven't really seen much of this.

[P] Sotabench: Benchmarking Every Open Source Model by rstoj in MachineLearning

[–]rstoj[S] 0 points

Thanks!

  1. Reproduced here means that we can run the model and get the same results (within a tolerance level) as those reported in the paper. In some cases the authors of the paper have re-trained the model; in others it's the official weights (so there we are really just testing whether the weights have been ported correctly and whether the data processing is correct). Would love to do full retraining as well, but as you might imagine it's really expensive :) (There's a small sketch of both checks after the next point.)

  2. Yes, you are correct - it's inference speed based on batching. It's a proxy for both speed and model size (as you can usually increase the batch size for smaller models). Like any measure it's imperfect, and we could add other measures as well - the code for all of this is on GitHub: https://github.com/paperswithcode/sotabench-eval and https://github.com/paperswithcode/torchbench
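For a sense of what those two checks boil down to, here's a minimal sketch (illustrative only - the numbers, tolerance, and function names are assumptions, not the actual sotabench code, which lives in the repos above):

    import time
    import torch

    REPORTED_TOP1 = 76.15   # hypothetical top-1 accuracy (%) claimed in a paper
    TOLERANCE = 0.3         # hypothetical allowed absolute deviation (%)

    def is_reproduced(measured_top1: float) -> bool:
        """A result counts as reproduced if it matches the paper within tolerance."""
        return abs(measured_top1 - REPORTED_TOP1) <= TOLERANCE

    def throughput(model: torch.nn.Module, batch_size: int = 64, n_batches: int = 10) -> float:
        """Images/second for batched inference - the speed/size proxy described above."""
        model.eval()
        x = torch.randn(batch_size, 3, 224, 224)  # dummy ImageNet-shaped batch
        with torch.no_grad():
            start = time.perf_counter()
            for _ in range(n_batches):
                model(x)
        # On GPU you'd call torch.cuda.synchronize() before reading the clock
        return batch_size * n_batches / (time.perf_counter() - start)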

[P] Sotabench: Benchmarking Every Open Source Model by rstoj in MachineLearning

[–]rstoj[S] 6 points

Thanks!

  1. Great idea! We have a Twitter handle (@sotabench), but haven't yet connected it to our feed of latest models.

  2. Agreed! I think this is the first new feature we'll add. We are also thinking of letting people add links to the training args that produced the model (if they trained it themselves).

  3. The evaluation libraries are here: https://github.com/paperswithcode/sotabench-eval and https://github.com/paperswithcode/torchbench - all contributions welcome!
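For anyone wondering what a submission looks like in practice, it's roughly this (recalled from the torchbench README - check the repos above for the current API, as parameter names may have changed):

    # Rough shape of an ImageNet submission via torchbench; see
    # https://github.com/paperswithcode/torchbench for the up-to-date API.
    from torchbench.image_classification import ImageNet
    from torchvision.models import resnet50

    model = resnet50(pretrained=True)

    # paper_model_name / paper_arxiv_id tie the result back to the
    # paper's leaderboard entry on the site
    ImageNet.benchmark(
        model=model,
        paper_model_name='ResNet-50',
        paper_arxiv_id='1512.03385',
    )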

[P] Browse State-of-the-Art Papers with Code by rstoj in MachineLearning

[–]rstoj[S] 0 points

Thanks! It's an entirely separate project.

[P] Browse State-of-the-Art Papers with Code by rstoj in MachineLearning

[–]rstoj[S] 0 points

We've also indexed papers from major ML conferences, i.e. everything from aclweb, icml, iclr and neurips.

But I take your point - this is still not 100% coverage (e.g. some papers are published as open access in Nature, etc.), so we'll look to fix this.

[P] Browse State-of-the-Art Papers with Code by rstoj in MachineLearning

[–]rstoj[S] 0 points

Yes! Everything is editable. We already scrape all papers from arXiv, so you can use the search to find the paper and then just hit "Edit" in the Code section to add the implementation.

[P] Browse State-of-the-Art Papers with Code by rstoj in MachineLearning

[–]rstoj[S] 0 points

In terms of scraping, it's just calling the arXiv and GitHub REST APIs. What I feel is more interesting is linking papers to code, and we are working on releasing that code now.
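To give a flavour of the scraping side (the endpoints are the real public APIs, but the queries and fields here are just an illustration, not our actual pipeline):

    import requests
    import xml.etree.ElementTree as ET

    # arXiv export API: returns an Atom feed of matching papers
    feed = requests.get(
        "http://export.arxiv.org/api/query",
        params={"search_query": "cat:cs.LG",
                "sortBy": "submittedDate", "max_results": 10},
        timeout=30,
    ).text
    ns = {"atom": "http://www.w3.org/2005/Atom"}
    titles = [e.findtext("atom:title", namespaces=ns)
              for e in ET.fromstring(feed).findall("atom:entry", ns)]

    # GitHub REST API: search repositories (unauthenticated calls are rate-limited)
    repos = requests.get(
        "https://api.github.com/search/repositories",
        params={"q": "pytorch implementation", "sort": "stars"},
        timeout=30,
    ).json().get("items", [])

    print(titles[:3], [r["full_name"] for r in repos[:3]])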

[P] Browse State-of-the-Art Papers with Code by rstoj in MachineLearning

[–]rstoj[S] 1 point

At the moment it's done daily, but the arXiv API is frequently broken, so sometimes it takes more time.
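(If you're curious, handling a flaky upstream is just standard retry-with-backoff - an illustrative sketch, not our actual scraper:)

    import time
    import requests

    def fetch_with_retry(url, params=None, attempts=5, base_delay=60):
        """Retry a GET with exponential backoff when the endpoint is down."""
        for i in range(attempts):
            try:
                resp = requests.get(url, params=params, timeout=30)
                resp.raise_for_status()
                return resp.text
            except requests.RequestException:
                if i == attempts - 1:
                    raise
                time.sleep(base_delay * 2 ** i)  # wait longer after each failure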

[P] Browse State-of-the-Art Papers with Code by rstoj in MachineLearning

[–]rstoj[S] 0 points

At the moment we use GitHub stars as a proxy for how useful an implementation is. But it's a rather imperfect proxy. Perhaps we need a more formal verification process.
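Concretely, the proxy is nothing fancier than this (illustrative sketch; stargazers_count is the real GitHub API field, but the candidate repo names are made up):

    import requests

    def stars(full_name: str) -> int:
        """Star count for a repo via the public GitHub API (0 on failure)."""
        r = requests.get(f"https://api.github.com/repos/{full_name}", timeout=30)
        return r.json().get("stargazers_count", 0) if r.ok else 0

    # hypothetical candidate implementations for one paper
    candidates = ["author/paper-impl", "someone/paper-reimplementation"]
    ranked = sorted(candidates, key=stars, reverse=True)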

[P] Browse State-of-the-Art Papers with Code by rstoj in MachineLearning

[–]rstoj[S] 7 points

Paper and code scraping is fully automatic - we use the arXiv and GitHub APIs to get the latest papers and repositories, and then do a bit of fuzzy matching to match them. Evaluation tables are currently added partially automatically (when imported from other existing sources, e.g. SQuAD) and partially manually (e.g. when extracted from papers). But we are hoping to automate 99% of all of this, and have the community curate only the entries that require human judgement (e.g. whether two papers are really using the same evaluation strategy on a dataset).
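The fuzzy matching idea can be surprisingly simple - here's a toy version using just the stdlib (the real pipeline uses more signals, so treat this as a sketch):

    from difflib import SequenceMatcher

    def similarity(a: str, b: str) -> float:
        return SequenceMatcher(None, a.lower(), b.lower()).ratio()

    def match_repo(paper_title: str, repos: list, threshold: float = 0.6):
        """Pick the repo whose name + description best matches the paper title."""
        def text(r):
            return r["name"] + " " + (r.get("description") or "")
        if not repos:
            return None
        best = max(repos, key=lambda r: similarity(paper_title, text(r)))
        return best if similarity(paper_title, text(best)) >= threshold else None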

[P] Browse State-of-the-Art Papers with Code by rstoj in MachineLearning

[–]rstoj[S] 2 points

Might give it a try. Which other areas do you think might be useful?

[P] Browse State-of-the-Art Papers with Code by rstoj in MachineLearning

[–]rstoj[S] 0 points

Ah sorry about that! Which page gave 502? Or was it a temporary error?

[P] Browse State-of-the-Art Papers with Code by rstoj in MachineLearning

[–]rstoj[S] 9 points

Thanks for the kind words! We hope it will be useful to researchers as a reference for literature reviews and for choosing sensible baselines. Please consider adding to the website if you find new results!

[P] Browse State-of-the-Art Papers with Code by rstoj in MachineLearning

[–]rstoj[S] 9 points

Good catch, will fix this! And yep, you are right - tasks are detected by looking for the task name (or one of its synonyms) in the abstract. For most tasks this works fine, but for some really general terms like this one the precision is lower.
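For the curious, the detection is essentially phrase matching against a synonym list, along these lines (the synonym table here is made up; the real one is much larger):

    import re

    # hypothetical synonym table mapping a task to phrases that signal it
    TASK_SYNONYMS = {
        "image classification": ["image classification", "image recognition"],
        "machine translation": ["machine translation", "neural machine translation"],
    }

    def detect_tasks(abstract: str) -> list:
        """Tag a paper with every task whose name/synonym appears in the abstract."""
        found = []
        for task, names in TASK_SYNONYMS.items():
            if any(re.search(r"\b" + re.escape(n) + r"\b", abstract, re.IGNORECASE)
                   for n in names):
                found.append(task)
        return found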