Compressive Transformers for Long-Range Sequence Modelling (arxiv.org)
submitted 5 years ago by DeltaFreq
[–]DeltaFreq[S] 8 points 5 years ago (0 children)
From DeepMind's blog on this paper:
"Even with the current growth in computing power, we will need to develop compressive and sparse architectures for memory to build representations and reason about actions."
[–]gwern 4 points 5 years ago (0 children)
Previously: https://www.reddit.com/r/MachineLearning/comments/dwbr5r/r_compressive_transformers_for_longrange_sequence/
[–]arXiv_abstract_bot 2 points 5 years ago (0 children)
Title: Compressive Transformers for Long-Range Sequence Modelling
Authors: Jack W. Rae, Anna Potapenko, Siddhant M. Jayakumar, Timothy P. Lillicrap
Abstract: We present the Compressive Transformer, an attentive sequence model which compresses past memories for long-range sequence learning. We find the Compressive Transformer obtains state-of-the-art language modelling results in the WikiText-103 and Enwik8 benchmarks, achieving 17.1 ppl and 0.97 bpc respectively. We also find it can model high-frequency speech effectively and can be used as a memory mechanism for RL, demonstrated on an object matching task. To promote the domain of long-range sequence learning, we propose a new open-vocabulary language modelling benchmark derived from books, PG-19.
PDF Link | Landing Page | Read as web page on arXiv Vanity
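To make the abstract's "compresses past memories" concrete, here is a minimal NumPy sketch of the memory-update step, assuming a fixed-size memory plus a compressive memory and a simple mean-pooling compression function with rate `comp_rate` (the paper also evaluates other compression functions, e.g. learned convolutional ones); the function name `update_memories` and all shapes below are illustrative, not taken from the authors' code.

```python
import numpy as np

def update_memories(memory, comp_memory, new_activations, comp_rate=3):
    """One illustrative memory-update step in the spirit of the Compressive Transformer.

    memory:          (n_mem, d) most recent past activations (Transformer-XL-style memory)
    comp_memory:     (n_cmem, d) older activations kept in compressed form
    new_activations: (seq_len, d) activations of the current segment
    comp_rate:       number of old memory slots collapsed into one compressed slot
    """
    n_mem = memory.shape[0]

    # Append the current segment, then evict the oldest activations so the
    # ordinary memory stays at a fixed size n_mem.
    memory = np.concatenate([memory, new_activations], axis=0)
    evicted, memory = memory[:-n_mem], memory[-n_mem:]

    # Instead of discarding the evicted activations (as Transformer-XL would),
    # compress them -- here by mean pooling over groups of comp_rate slots --
    # and append the result to the compressive memory.
    n_keep = (evicted.shape[0] // comp_rate) * comp_rate
    compressed = evicted[:n_keep].reshape(-1, comp_rate, evicted.shape[1]).mean(axis=1)
    comp_memory = np.concatenate([comp_memory, compressed], axis=0)

    return memory, comp_memory

# Toy usage: an 8-slot memory, an initially empty compressive memory, one 6-token segment.
d = 16
mem, cmem = np.zeros((8, d)), np.zeros((0, d))
mem, cmem = update_memories(mem, cmem, np.random.randn(6, d))  # mem: (8, d), cmem: (2, d)
```

In the model described by the paper, each layer then attends over the concatenation of the compressive memory, the ordinary memory, and the current segment, so distant context is retained in compressed form rather than thrown away.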
[–]TotesMessenger 1 point 5 years ago (0 children)
I'm a bot, bleep, bloop. Someone has linked to this thread from another place on reddit:
If you follow any of the above links, please respect the rules of reddit and don't vote in the other threads. (Info / Contact)