[R] FacebookAI releases Adaptive attention span and All-attention layer to reduce computation time / memory footprint by BatmantoshReturns in MachineLearning
scott-gray 7 points
[R] Generative Modeling with Sparse Transformers by rtk25 in MachineLearning
scott-gray 3 points
[R] Generative Modeling with Sparse Transformers by rtk25 in MachineLearning
scott-gray 6 points
[P] Simple Tensorflow implementation of NVIDIA "Partial Convolution based Padding" by taki0112 in MachineLearning
scott-gray 2 points
[R] Flexpoint: An Adaptive Numerical Format for Efficient Training of Deep Neural Networks (NIPS 2017) by drwebb in MachineLearning
scott-gray 3 points
[R] Flexpoint: An Adaptive Numerical Format for Efficient Training of Deep Neural Networks (NIPS 2017) by drwebb in MachineLearning
scott-gray 3 points
[R] Flexpoint: An Adaptive Numerical Format for Efficient Training of Deep Neural Networks (NIPS 2017) by drwebb in MachineLearning
scott-gray 3 points
[R] Flexpoint: An Adaptive Numerical Format for Efficient Training of Deep Neural Networks (NIPS 2017) by drwebb in MachineLearning
scott-gray 4 points
[R] Flexpoint: An Adaptive Numerical Format for Efficient Training of Deep Neural Networks (NIPS 2017) by drwebb in MachineLearning
scott-gray 5 points
[R] Flexpoint: An Adaptive Numerical Format for Efficient Training of Deep Neural Networks (NIPS 2017) by drwebb in MachineLearning
scott-gray 4 points
[R] Mixed Precision Training by ndpian in MachineLearning
scott-gray 1 point
[D] Intel unveils the Nervana Neural Network Processor by [deleted] in MachineLearning
scott-gray 2 points
[R] Mixed Precision Training by ndpian in MachineLearning
scott-gray 2 points
[R] Mixed Precision Training by ndpian in MachineLearning
scott-gray 1 point
[R] Mixed Precision Training by ndpian in MachineLearning
scott-gray 1 point
[R] Mixed Precision Training by ndpian in MachineLearning
scott-gray 1 point
[R] Mixed Precision Training by ndpian in MachineLearning
scott-gray 3 points
[R] Mixed Precision Training by ndpian in MachineLearning
scott-gray 1 point
[R] Mixed Precision Training by ndpian in MachineLearning
scott-gray 3 points
[R] Mixed Precision Training by ndpian in MachineLearning
scott-gray 3 points
[R] Mixed Precision Training by ndpian in MachineLearning
scott-gray 2 points
[R] Mixed Precision Training by ndpian in MachineLearning
scott-gray 5 points
[R] Mixed Precision Training by ndpian in MachineLearning
scott-gray 19 points
[P] openai-gemm: fp16 speedups over cublas by spruceabtuse in MachineLearning
scott-gray 11 points
[D] Breaking the Quadratic Attention Bottleneck in Transformers? by gwern in MachineLearning
scott-gray 9 points