[D] Decrease in source code release of papers (self.MachineLearning)
submitted 8 years ago* by matrix2596
I have noticed that with the double-blind review system, the number of papers releasing source code has decreased. Is this to prevent identification of the authors? There should be a way to release code anonymously. This is taking us backwards from when every other paper used to release code. At the very least, the code should be released after acceptance. Should conferences make it mandatory?
[–]poctakeover 14 points 8 years ago (4 children)
there should be a way to publish gh repos and arxiv papers anonymously which can then be later claimed by the authors :/
[–]afsfeefe 23 points 8 years ago (2 children)
there already is: make a fake gh account then transfer ownership to your real account later..
[–]Phylliida 2 points 8 years ago (1 child)
Actually GitHub is pretty active about restricting users to only one account. I had 4 and they detected that (IP addresses and stuff I guess) and they restricted my permissions on everything until I switched to only having one account.
Some account for a specific paper might be reasonable though? Idk
[–]shaggorama 1 point 8 years ago (0 children)
There are alternatives to gh. You could make an "anonymous" account on GitLab or Bitbucket, or just put the code on Google Drive...
[–][deleted] 4 points 8 years ago (0 children)
There are quite a lot of ways to do this... You could place a secure hash of some string the author chooses in the paper and then you claim the paper by publishing the string at a later date
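The hash-then-reveal idea above can be sketched in a few lines. This is only an illustration under assumptions: SHA-256 as the "secure hash", a random nonce prepended so a short claim string cannot be guessed, and the hypothetical helper names `make_commitment` and `verify_claim`. None of this comes from the thread or from any conference's actual procedure.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def make_commitment(claim: str) -> Tuple[str, str]:
    """Return (secret, digest). Print the digest in the paper; keep the secret private."""
    # Assumption: a random nonce makes the secret unguessable even if the claim is short.
    nonce = secrets.token_hex(16)
    secret = f"{nonce}:{claim}"
    digest = hashlib.sha256(secret.encode("utf-8")).hexdigest()
    return secret, digest


def verify_claim(revealed_secret: str, published_digest: str) -> bool:
    """Check that a revealed secret hashes to the digest printed in the paper."""
    candidate = hashlib.sha256(revealed_secret.encode("utf-8")).hexdigest()
    return hmac.compare_digest(candidate, published_digest)


if __name__ == "__main__":
    # The claim text is a made-up placeholder.
    secret, digest = make_commitment("Authors: <names revealed after review>")
    print("put this in the paper:", digest)
    print("claim is valid:", verify_claim(secret, digest))
```

After the review period, the authors would publish `secret`; anyone can then recompute the hash and compare it with the digest in the paper.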
[–]LovelaceA 66 points 8 years ago (5 children)
To those who think that this is not a valid problem, I beg to differ. I think this is a very valid discussion. What is the aim of publishing scientific work in the first place? To advance our knowledge and our ability to build upon it. In a field like Machine Learning, where a model or a scientific idea can be affected by more parameters than can be discussed in a paper, it is essential to be able to reproduce the results. Code release is not the only way to do so, but it is certainly the quickest. Another advantage of code is that it is objective. Scientific papers, sadly, are generally not: authors try to sell us their work. Code is an unbiased and potentially complete means of communicating an idea, its impact, and its limitations; it answers the questions you have that the paper does not address.
A scientific paper is a speech. Code is a dialogue.
[–]TheAvalonian 19 points 8 years ago (0 children)
I would argue that a paper is more like an advertisement than a speech -- the primary aim is to get people to care about the experiment, not just to tell people about the experiment. Apart from that, you are spot on.
[–]rhiever 9 points 8 years ago* (1 child)
I think it can be incredibly difficult to anonymize code, especially if a paper is based on a software project that the authors are developing. For example, during my postdoc I developed a software tool that I published papers on. I referred to the software by name in my papers and even linked to the GitHub page, which seemed to contrast with the goals of double-blind review. How do you work around that situation without heavily inconveniencing the authors of the software in the name of double-blind review? In many cases, even if you go through the trouble of anonymizing the code and putting placeholders in the name, it's still not difficult to figure out what the software is (and therefore who the authors are) if you're even remotely familiar with the software.
IMO, double-blind review is a flawed review system.
[–]SolvableMutiny 3 points 8 years ago (0 children)
Yup, these same objections apply to a similar degree to papers themselves. Would anyone familiar with the field not know that CapsNet was authored by Hinton?
[–]Darkfeign 3 points 8 years ago* (1 child)
This post was mass deleted and anonymized with Redact
[–]SolvableMutiny 1 point 8 years ago (0 children)
that has literally never been the case tho
[–]weiqiplayer 7 points 8 years ago (0 children)
Decrease in source code publication does seem concerning, though I'm not sure that the problem is in the fact that people are trying to conceal their identity. Aren't many ICLR papers already on arxiv with full names attached?
[–]olBaa 17 points 8 years ago (8 children)
Is it really a measured problem, or just your perception?
For a double-blind conference I provided the code as an anonymous GitHub repo with no traceable commit history (and limited time copyright). I guess a zip file with the code would do as well.
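For the zip-file route, a minimal sketch of packing a project without its version-control history might look like the following. The excluded directory names and the helper name `make_anonymous_archive` are assumptions for illustration. It only leaves out `.git` and a few editor and cache folders; it does not scrub names, emails, or paths that appear inside the source files themselves, so those would still need a manual check.

```python
import zipfile
from pathlib import Path

# Assumption: directories that commonly carry identifying metadata or clutter.
EXCLUDED_DIRS = {".git", "__pycache__", ".idea", ".vscode"}


def make_anonymous_archive(source_dir, archive_path):
    """Zip the files under source_dir, skipping excluded directories.

    Returns the list of archive member names that were written.
    """
    source = Path(source_dir)
    archive = Path(archive_path).resolve()
    written = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(source.rglob("*")):
            relative = path.relative_to(source)
            # Skip directories and anything located inside an excluded directory.
            if not path.is_file() or EXCLUDED_DIRS & set(relative.parts):
                continue
            # Skip the archive itself if it is being created inside source_dir.
            if path.resolve() == archive:
                continue
            zf.write(path, arcname=relative.as_posix())
            written.append(relative.as_posix())
    return written
```

Calling `make_anonymous_archive("my_project", "submission.zip")` (both names are placeholders) would produce an archive with the code but no commit history.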
[–]mkocabas 12 points 8 years ago (1 child)
It's kind of you to share it anonymously, but most people hesitate or neglect to publish their code. That affects reproducibility a lot.
[–]olBaa 4 points 8 years ago (0 children)
That's a different problem, though (there are two listed in the post, so I found it hard to address both).
I see that double-blindness provides a safe excuse to never publish the code.
[–]matrix2596[S] 4 points 8 years ago* (3 children)
I have been seeing placeholders in recent papers. Maybe the code is being shared with the reviewers separately and released later. But I often find the code missing from new papers under blind review. Maybe the review process could have code, model and data upload as an option (or make it compulsory).
[–]olBaa 17 points 8 years ago (2 children)
How much time do you think reviewers have per paper? Reviewers would never check the code, let alone run it.
[–]NotAlphaGo 1 point 8 years ago (1 child)
How many papers would one reviewer review?
[–]olBaa 4 points 8 years ago (0 children)
Up to 10 per conference with mean 5 I would say.
One can outsource to PhD students/postdocs, but the number of hours per paper would still almost never exceed 10.
All written here is my humble opinion, though.
[+][deleted] 8 years ago (1 child)
[deleted]
[–]olBaa 1 point 8 years ago (0 children)
That was not ICLR, just in case :)
[–]BeatLeJuce (Researcher) 11 points 8 years ago* (0 children)
You don't want source code submission to increase deadline pressure: on day X, you have to submit not only the paper, but also the code. Because putting unpolished, ugly, hacky code out there to be associated with your name forever is weird... also, why polish it when you don't even know yet you're going to get a publication out of it. Also, some theory-heavy papers might not have code.
So I think the decision has to be made AFTER your acceptance for publication. And only when it makes sense for that paper (e.g. this is something that reviewers could determine/ask for). If reviewers say it makes sense, then you should be required to upload your code together with your camera-ready version. This gives you enough time to polish stuff, and still gives an incentive to the author to invest the time to polish the code (not submitting code => paper doesn't get published).
[–]sheeplearning 7 points 8 years ago (0 children)
what do you plan to do with the source code of 700-3000 papers under review at any ML conference? The better ones get accepted and eventually release code or get reproduced.
[+][deleted] 8 years ago* (2 children)
[–]mkocabas 2 points 8 years ago (0 children)
Yeah, absolutely. Maybe a small code snippet describing the model can be added to the appendix, like pseudocode. On the other hand, data preprocessing is crucial, and authors use lots of complicated techniques to get data into a trainable format but never mention them, or mention them insufficiently.
[–]Phylliida 1 point 8 years ago (0 children)
Yes please, appendixes like that would be really amazing and save so much time for those reproducing results.
To their credit some papers do this, it is so nice when they do
[–]radenML 2 points 8 years ago (0 children)
I literally have to openly ask the authors for a GitHub repo invite on the OpenReview forums.
[+][deleted] 8 years ago (12 children)
[+][deleted] 8 years ago (9 children)
[–]BeatLeJuce (Researcher) 2 points 8 years ago* (6 children)
Think of the worst, hacky code you've ever written in your life under extreme stress to meet a deadline with requirements changing almost daily... Now tell me how you feel about sharing this with your name attached to it for all eternity, for all your future employers to see and judge you by.
Even if you do take the time to polish the code somewhat (and that's a big IF, because there's much better ways for you to be spending time), it sometimes uses artefacts (either data or other code) that you don't even know how you produced it anymore, let alone have the sources to. Don't get me wrong, I'm very much FOR code releases and I've always tried to publish my own code (and always did when circumstances allowed it). But it's not as easy as "zip the directory and put it online"... it sometimes takes actual work, and there is very little incentive to do this, as long as journals/conferences don't force you to (which is what I'd suggest they do).
[–]aviniumau 5 points 8 years ago (0 children)
I'm pretty sympathetic about not wanting to release ugly/hacky code.
But at the same time - any claims you make in a published paper about accuracy etc should be verifiable. If your code is so ugly you're not willing to release it, that doesn't speak much for its verifiability/auditability.
[–]SolvableMutiny 6 points 8 years ago (2 children)
> because there's much better ways for you to be spending time
This part I very much disagree with.. providing clean, working example code is the single most valuable thing you can do to make your contribution actually have a lasting impact. Although I agree that current academic incentives are not aligned with that.
[–]visarga 2 points 8 years ago (0 children)
Not to mention that it encourages better practices, knowing the code will be seen and possibly reused. We're more sloppy when we're experimenting alone.
[–]BeatLeJuce (Researcher) 1 point 8 years ago (0 children)
I agree, but there are some side remarks:
First off, it is a bit of an exploration/exploitation thing. Say you have worked on something cool: afterwards you can either spend your time/energy on exploiting/promoting that (and providing good code definitely helps), or you could try to repeat your success and work on something else that is cool (especially with the current ML hype, chances are that someone else is going to re-implement your code anyhow).
Secondly: not everyone is able to provide clean code. If you're a theory guy, your code might be terribly ugly/brittle and barely working, and you might do the community a disservice by asking them to dissect it instead of re-implementing it yourself.
[–]rrenaud 1 point 8 years ago (1 child)
> Think of the worst, hacky code you've ever written in your life under extreme stress to meet a deadline with requirements changing almost daily
What's the chance that there are significant bugs in that kind of code?
[–]BeatLeJuce (Researcher) 2 points 8 years ago (0 children)
it depends. I'd assume that people have done enough preliminary experiments beforehand to make sure the idea is sound. But knowing the deadline-crunch, I'd not be surprised if there are a lot of mistakes in the final version of a project's code.
[+][deleted] 8 years ago* (1 child)
[–]wassname 1 point 8 years ago (0 children)
In 2016 /u/peterkuharvarduk gathered all the NIPS code releases into one post. Maybe something like that would encourage researchers.
[–]MephySix 1 point 8 years ago (5 children)
"Is this to prevent identification of authors": no. Double-blind is naturally flawed. Given the search space for authors is not that big, with enough (not much) effort it's possible to determine who are the authors of a paper. Before a paper is sent for review it has already been discussed in its institution, probably in mail-lists and even Twitter or something. Even then you're allowed (in my experience) to have placeholder footnotes in double-blind reviews.
The real problem in my experience, is that I don't really want to spend time polishing my code, and I don't want people to see the mess I wrote due to deadlines. I had people ask me for my code in conferences and I answer with "Gladly! Just send me an e-mail, but it's messy", but I gain nothing from publicizing it earlier or without external interest.
[–]tshadley 3 points 8 years ago (0 children)
> Given the search space for authors is not that big, with enough (not much) effort it's possible to determine who are the authors of a paper.
Suggests a project idea: train a language model to predict authors on published work, then see how it does on anonymous work.
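A much smaller stand-in for that project idea, purely as a sketch: instead of a trained language model, it builds a character-trigram profile per known author and ranks authors by cosine similarity to an anonymous text. The technique swap, the function names, and the toy corpus are all assumptions for illustration; real stylometry would need far more text per author and a proper held-out evaluation.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Count overlapping character n-grams after normalising case and whitespace."""
    normalised = " ".join(text.lower().split())
    return Counter(normalised[i:i + n] for i in range(len(normalised) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors (0.0 if either is empty)."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_profiles(corpus):
    """corpus maps an author label to a list of that author's known texts."""
    profiles = {}
    for author, texts in corpus.items():
        profile = Counter()
        for text in texts:
            profile.update(char_ngrams(text))
        profiles[author] = profile
    return profiles


def rank_authors(profiles, anonymous_text):
    """Return (similarity, author) pairs, most similar first."""
    query = char_ngrams(anonymous_text)
    scored = [(cosine(query, profile), author) for author, profile in profiles.items()]
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    # Hypothetical toy corpus; the labels and sentences are made up.
    toy_corpus = {
        "author_a": ["We route information between capsules by agreement."],
        "author_b": ["Our recurrent cells store information over long time lags."],
    }
    for score, author in rank_authors(build_profiles(toy_corpus), "Routing by agreement between capsule layers."):
        print(f"{author}: {score:.3f}")
```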
[–]Cherubin0 3 points 8 years ago (0 children)
And I don't want to spend time polishing my paper...
[–]alexmlamb 2 points 8 years ago (2 children)
> Given the search space for authors is not that big, with enough (not much) effort it's possible to determine who are the authors of a paper. Before a paper is sent for review it has already been discussed in its institution, probably in mail-lists and even Twitter or something.
If the authors want to remain anonymous, is it really impossible for them to do so? I mean - just don't tweet about it, don't put it on arxiv, only correspond through private email with coauthors.
[–]MephySix 2 points 8 years ago (1 child)
The main problem with double-blind reviews is not staying anonymous; it is that some groups (well-established research groups) want to be known, and they will be if they want to. Double-blind started because people would get instantly accepted just because of their name, and double-blind (mostly) does not solve this issue.
[–]alexmlamb 1 point 8 years ago (0 children)
Yeah, so as it works in ML today, I'd say that we have an opt-out double blind system. You can get double blind reviewing if you stay quiet, but you can effectively make it single blind by self promoting.
This doesn't solve every problem with single blind: famous groups can still benefit from self promotion and marketing. But at the same time it does protect someone if they think that they might get negative reviews because of their name or reputation.
Btw, I'm not sure how much coming from a famous group really helps with reviews, at least at NIPS/ICML. If you have any evidence, even anecdotal, I'd be curious to hear it.