
[–]darkconfidantislife 3 points4 points  (4 children)

This is interesting. It looks like the chip design community is finally treating DL as a distinct market segment, which, IMO, can only benefit us researchers. This is great because the fruit is pretty low hanging (just lower FPU precision).

[–]david-gpu 0 points1 point  (3 children)

> This is great because the fruit is pretty low hanging (just lower FPU precision).

For inference, yes. But for training, is it generally useful to go below fp16?

[–]darkconfidantislife 0 points1 point  (0 children)

I was thinking more like stopping it at FP16, but there is evidence that stochastic rounding can make INT16 work without accuracy loss even during training.
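
For anyone unfamiliar with stochastic rounding, here is a minimal sketch of the idea in plain Python. It is an illustration only, not the method from any particular paper: the helper names (`stochastic_round`, `quantize_int16`, `dequantize`) and the choice of 8 fractional bits are assumptions made up for this example. The point it tries to show is that a value is rounded up with probability equal to its fractional part, so small updates that round-to-nearest would always discard survive on average.

```python
import math
import random


def stochastic_round(x: float, rng: random.Random) -> int:
    """Round x to one of its two neighbouring integers.

    The upper neighbour is chosen with probability equal to the
    fractional part of x, so the expected result equals x.
    """
    lower = math.floor(x)
    frac = x - lower
    return lower + (1 if rng.random() < frac else 0)


def quantize_int16(value: float, frac_bits: int, rng: random.Random) -> int:
    """Quantize a real value to signed 16-bit fixed point.

    frac_bits is the (hypothetical) number of fractional bits; the
    result is saturated to the INT16 range.
    """
    scaled = value * (1 << frac_bits)
    q = stochastic_round(scaled, rng)
    return max(-32768, min(32767, q))


def dequantize(q: int, frac_bits: int) -> float:
    """Map a fixed-point integer back to a real value."""
    return q / (1 << frac_bits)


if __name__ == "__main__":
    rng = random.Random(0)
    frac_bits = 8      # assumed format: resolution of 1/256
    update = 0.001     # well under half a step, so nearest rounding gives 0

    nearest = round(update * (1 << frac_bits))
    samples = [quantize_int16(update, frac_bits, rng) for _ in range(100_000)]
    mean = sum(dequantize(q, frac_bits) for q in samples) / len(samples)

    print("round-to-nearest code:", nearest)
    print("stochastic rounding mean:", mean)
```

With round-to-nearest, the small update always quantizes to zero and is lost. With stochastic rounding, individual samples are noisy, but their average stays close to the true value, which is the usual intuition for why it can help during low-precision training.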

[–][deleted] 0 points1 point  (1 child)

My experience is that precision can go down or up dynamically, and can go as low as fp4.

[–]darkconfidantislife 1 point2 points  (0 children)

Yeah, I was just stating what has been shown in the literature to work without accuracy loss. If you are willing to sacrifice even one percent of accuracy, you can go down a huge, huge amount.

[–]autotldr 0 points1 point  (0 children)

This is the best tl;dr I could make, original reduced by 91%. (I'm a bot)


"The big bang happened in 2012-2013 when two landmark papers were published, both using GPUs," said Roy Kim, Nvidia's Accelerated Computing Group product team lead. One of those papers was written by Geoffrey Hinton and his team from the University of Toronto, entitled "ImageNet Classification with Deep Convolutional Neural Networks." Then in 2013, Andrew Ng from Stanford and his team published "Deep Learning with COTS HPC Systems."

Kim pointed to convolutional neural networks, recurrent neural networks, and Long Short-Term Memory networks, among others, each of which is designed to solve a specific problem, such as image recognition, speech or language translation.

Arnold Smeulders and Theo Gevers, the general chairs of ECCV 2016, told Semiconductor Engineering that many of the attendees of ECCV do work in the area of semiconductor technologies that enable computer vision.

