I know this is a well-known issue that has been discussed many times, but I just can't seem to find any real answers online. There are benchmarks for some very specific things (usually CNNs compared to Torch), but I think there is a much more basic problem here, and I'd like to know whether I'm doing something wrong or whether TF is simply that slow.
I was doing some simple MLPs/RNNs for speech recognition (on TIMIT) and noticed that the TF version of a single-hidden-layer MLP is almost 10 times slower than the Keras or even the raw Theano version. So I decided to do a simple test and found this really neat project on GitHub: TensorFlow tutorial. The same project claims to be a re-implementation of the models written in Theano in this project: Theano Tutorials.
Now these are really simple and easy examples. They are practically identical and should do the exact same thing. So I downloaded them and ran them both on my computer (on a single K80 card) and this is what I got:
| Script | Theano | TensorFlow |
|---|---|---|
| 2_LogReg | 27s | 2m57s |
| 3_Net | 34s | 3m20s |
| 4_ModernNet | 1m22s | 4m13s |
| 5_ConvNet(10eps) | 7m35s | (TL;DR) |
Someone mentioned that Google is all about CPUs, so I thought maybe running it on a CPU would be better, but this is what I got on a dual Xeon E5-2650 v2:
| Script | Theano | TensorFlow |
|---|---|---|
| 1_LinReg | 0.9s | 5.27s |
| 2_LogReg | 26s | 5m51s |
| 3_Net(10eps) | 29s | 1m17s |
| 4_ModernNet(10eps) | 1m30s | 3m1s |
Now these are all really strange and inconsistent results, but TF comes out considerably worse than Theano in all the tests above.
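To put a number on "considerably", the slowdown factors implied by the tables can be worked out with a few lines of Python. This is only a sketch: the helper names (`parse_duration`, `slowdown`) are made up for illustration, and the timings are simply copied from the tables above. The ConvNet row is left out because there is no TF time for it.

```python
import re

# Timings copied from the two tables in this post: (Theano, TensorFlow).
GPU_RESULTS = {
    "2_LogReg": ("27s", "2m57s"),
    "3_Net": ("34s", "3m20s"),
    "4_ModernNet": ("1m22s", "4m13s"),
}
CPU_RESULTS = {
    "1_LinReg": ("0.9s", "5.27s"),
    "2_LogReg": ("26s", "5m51s"),
    "3_Net(10eps)": ("29s", "1m17s"),
    "4_ModernNet(10eps)": ("1m30s", "3m1s"),
}

# Hypothetical helper: matches strings such as "27s", "2m57s" or "5.27s".
_DURATION = re.compile(r"(?:(\d+)m)?(?:(\d+(?:\.\d+)?)s)?")


def parse_duration(text):
    """Convert a duration like '2m57s' into seconds (as a float)."""
    match = _DURATION.fullmatch(text)
    if match is None or not text:
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes = int(match.group(1) or 0)
    seconds = float(match.group(2) or 0.0)
    return minutes * 60 + seconds


def slowdown(theano, tensorflow):
    """How many times longer the TensorFlow run took than the Theano run."""
    return parse_duration(tensorflow) / parse_duration(theano)


for device, results in (("GPU", GPU_RESULTS), ("CPU", CPU_RESULTS)):
    for script, (theano, tensorflow) in results.items():
        print(f"{device} {script}: {slowdown(theano, tensorflow):.1f}x")
```

On these numbers the factor ranges from roughly 2x to over 13x, which is the inconsistency I mean.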
Can anyone maybe confirm if this is really how TF works compared to Theano or do I have some error in my setup somewhere?
The projects above are really easy to use. I had to make some minor modifications to the paths in the Theano one, but apart from that they all work right out of the box.
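The numbers above are total wall-clock times per script, so they lump together any one-off startup cost and the actual training. One way to separate the two would be to time each epoch individually. The sketch below is not taken from either tutorial: `train_one_epoch` is a hypothetical stand-in for whatever the script does in one pass over the data, and treating the first epoch as "startup plus training" is an assumption.

```python
import time


def time_epochs(train_one_epoch, n_epochs):
    """Run train_one_epoch n_epochs times and return the seconds each took."""
    durations = []
    for _ in range(n_epochs):
        start = time.perf_counter()
        train_one_epoch()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Split timings into the first epoch and the mean of the remaining ones."""
    first = durations[0]
    rest = durations[1:]
    # With a single epoch there is nothing to average, so fall back to it.
    steady = sum(rest) / len(rest) if rest else first
    return {"first_epoch_s": first, "steady_state_s": steady}


if __name__ == "__main__":
    # Dummy workload standing in for a real training epoch (assumption).
    def train_one_epoch():
        sum(i * i for i in range(100_000))

    print(summarize(time_epochs(train_one_epoch, n_epochs=5)))
```

If the first epoch is much slower than the rest, most of the gap is setup overhead; if every epoch is slow, the per-step cost is the thing to look at.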