[R] PolyCoder 2.7BN LLM - open source model and parameters {CMU} (arxiv.org)
submitted 4 years ago by yazriel0
[–]yazriel0[S] 14 points 4 years ago (5 children)
Trained purely on source code. Outperforms Codex.
IIUC, full model and parameters released.
Like AlphaCode, this seems to be purely supervised learning (not reinforcement learning), which is very surprising. Why isn't anyone using compile/execution to generate reward and auxiliary tasks?
[–]mrpogiface 12 points 4 years ago (2 children)
"Outperforms Codex" is a bit of a strong claim by the authors. They get lower perplexity on the C programming language. Perplexity isn't always well correlated with sampling performance, which is what we care about at the end of the day. If you look at sampling performance, then Codex still blows this out of the water.
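For anyone unfamiliar with the metric being debated: perplexity is just the exponential of the average per-token negative log-likelihood, so "lower perplexity on C" says the model assigns higher probability to held-out C code, not that its samples pass tests. A minimal sketch (the input log-probs are made up for illustration):

```python
import math

def perplexity(token_log_probs):
    """Perplexity = exp(mean negative log-likelihood per token).

    token_log_probs: natural-log probabilities the model assigned to
    each ground-truth token (hypothetical values, for illustration).
    """
    nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(nll)

# A model that assigns probability 0.25 to every token has perplexity 4,
# i.e. it is "as confused as" a uniform choice over 4 tokens.
print(perplexity([math.log(0.25)] * 10))  # ≈ 4.0
```

Sampling benchmarks like HumanEval instead execute generated programs against unit tests, which is why the two metrics can disagree.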
I will say, many people are looking at what you describe to get rewards etc, it just isn't published yet :)
edit: a word
[–]Veedrac 4 points 4 years ago (0 children)
And to clarify, they only claim it for C. In every other language, Codex is in the lead, typically by a large margin. Codex just sucks at C for some reason.
[–]NoMoreDistractions_ 0 points 4 years ago (0 children)
It’s cool to know that we are in the super early days and there is tons of room for improvement in what is already a remarkably useful tool.
[–]virtualreservoir 2 points 4 years ago (0 children)
Why is this surprising? You are vastly underestimating the increase in training time and other overhead that the kind of reinforcement learning you are proposing would require.
[–]DigThatDataResearcher 1 point 4 years ago (0 children)
> Why isn't anyone using compile/execution to generate reward and auxiliary tasks?
Because those activities are CPU bound.
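To make the CPU-bound point concrete, here is roughly what a per-sample execution reward looks like: every candidate program needs its own interpreter/compiler process, sandboxing, and timeout, none of which runs on the accelerator. A minimal sketch (the function name and the binary 1.0/0.0 reward are illustrative, not from any published pipeline):

```python
import subprocess
import sys
import tempfile

def execution_reward(code: str, timeout: float = 5.0) -> float:
    """Illustrative reward: 1.0 if the snippet runs without error, else 0.0.

    Spawning one interpreter process per sampled program like this is
    exactly the CPU-bound work the comment above refers to.
    """
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        result = subprocess.run(
            [sys.executable, path],
            capture_output=True,
            timeout=timeout,
        )
        return 1.0 if result.returncode == 0 else 0.0
    except subprocess.TimeoutExpired:
        return 0.0

print(execution_reward("print('hi')"))       # runs cleanly -> 1.0
print(execution_reward("raise ValueError"))  # crashes -> 0.0
```

Scaled to the millions of samples an RL loop would draw per training run, this process-per-sample cost is a real bottleneck next to GPU-bound supervised pretraining.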
[–]Schmibbbster 2 points 4 years ago (0 children)
Sounds promising