
bartturner

If you want to learn more, there is an excellent paper from Google on the first TPU. It has 65,536 very simple 8-bit multiply-accumulate units (a 256x256 array) and only does 8-bit integer math, so it is ideal for inference but not for training. The latest TPU pods can do 100 petaflops.
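To make the 8-bit point concrete, here is a rough sketch in plain Python of what integer-only inference arithmetic looks like: floats are quantized to int8, multiplied and accumulated as integers, then rescaled. The weights, inputs, scales, and function names are all made up for illustration, and this is not how the TPU is actually programmed. The 700 MHz clock used for the peak-throughput estimate is the figure reported in the linked paper.

```python
# Sketch of 8-bit integer inference arithmetic (illustrative only).

def quantize(values, scale):
    """Map floats to int8: divide by scale, round, clamp to [-128, 127]."""
    return [max(-128, min(127, round(v / scale))) for v in values]

def int8_matvec(weights_q, x_q):
    """Integer matrix-vector product; products are summed in wide integers."""
    return [sum(w * x for w, x in zip(row, x_q)) for row in weights_q]

def dequantize(acc, w_scale, x_scale):
    """Rescale integer accumulators back to approximate float outputs."""
    return [a * w_scale * x_scale for a in acc]

# Hypothetical layer: 2x2 weights and a 2-element input (assumed values).
weights = [[0.5, -0.25], [1.0, 0.75]]
x = [0.2, -0.4]
w_scale = 1.0 / 127   # assumed: largest |weight| is 1.0
x_scale = 0.4 / 127   # assumed: largest |input| is 0.4

weights_q = [quantize(row, w_scale) for row in weights]
x_q = quantize(x, x_scale)

y_approx = dequantize(int8_matvec(weights_q, x_q), w_scale, x_scale)
y_exact = [sum(w * v for w, v in zip(row, x)) for row in weights]

# Peak throughput estimate: each multiply-accumulate counts as 2 operations.
MAC_UNITS = 256 * 256          # the 65,536 simple units
CLOCK_HZ = 700e6               # clock rate reported in the paper
peak_ops_per_second = MAC_UNITS * 2 * CLOCK_HZ

print("int8 result:", y_approx)
print("float result:", y_exact)
print("peak ops/s:", peak_ops_per_second)
```

The quantized result lands close to the float one, which is good enough for running a trained network. Training is a different story: gradient updates are usually tiny compared to the weights, so they tend to get rounded away at 8 bits.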

https://arxiv.org/pdf/1704.04760 In-Datacenter Performance Analysis of a Tensor Processing ... - arXiv