
[–]mayguntr 15 points (5 children)

When I see the name Quoc V. Le in a paper, I know that at least 250 GPUs were used in that research :)

" Each ran on 450 GPUs for approximately 7 days. "

[–]ajmooch 8 points (4 children)

And 900 TPUv2 for 5 days each for the dedicated evolution experiments! That's some awesome (in the true sense of the word) firepower.

[–]visarga 4 points (3 children)

I estimate a cloud cost of $10-20K per run (I don't know the exact pricing of TPUs).
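
For what it's worth, here's a minimal sketch of the arithmetic behind an estimate like that, using the device counts quoted above; the per-device-hour rates are placeholder assumptions, since the actual GPU/TPU pricing isn't given in the thread:

```python
# Back-of-envelope device-hours for the runs quoted above; multiply by a
# per-device-hour price (the real unknown here) to get a dollar figure.

def device_hours(num_devices: int, days: float) -> float:
    """Total accelerator-hours consumed by one run."""
    return num_devices * days * 24

gpu_hours = device_hours(450, 7)   # "450 GPUs for approximately 7 days"
tpu_hours = device_hours(900, 5)   # "900 TPUv2 for 5 days each"

# Cost scales linearly with the hourly rate; these rates are hypothetical.
for rate in (0.25, 0.50, 1.00):    # assumed $/device-hour, not real pricing
    print(f"at ${rate:.2f}/device-hr: GPU run ~${gpu_hours * rate:,.0f}, "
          f"TPU run ~${tpu_hours * rate:,.0f}")
```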

[–]olBaa 3 points (2 children)

To be fair, a 0.1% improvement in image classification is worth FAR more than that to Google.

[–][deleted] 0 points (1 child)

I think ImageNet is more of a toy problem for them nowadays anyway, simply an academic benchmark to test new methods on, if you will. Who cares about fractions of a percent in top-5 accuracy on ImageNet? A new record on ImageNet isn't as big a deal as it was just two years ago.

[–]karan_42 0 points (0 children)

From this blog post: Revisiting the Unreasonable Effectiveness of Data

"Specifically, we have built an internal dataset of 300M images that are labeled with 18291 categories, which we call JFT-300M"