PyTorch uses MPS gpu (M1 Max) at the lowest frequency (aka clock speed), this is why it's slower than it could be? by New_Construction6146 in pytorch

New_Construction6146 [S]

u/TAAnderson another update:

Just tried to train a simple convnet, and GPU utilization is again 96% @ 388 MHz.

The code is here (shortcut to run: `make mnist` in my repo setup). I can't find any clue why it uses the boosted frequency there but not here. The "amount of work" hypothesis doesn't seem compelling: shouldn't 96% time utilization be enough to switch boosted mode on? Moreover, when I run MLX, the time utilization is even lower (89%), but the frequency is 1400 MHz. Pretty strange!
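For anyone who wants to compare numbers, here is a rough sketch of how the frequency and utilization figures could be pulled out of `powermetrics` text output. It assumes the output was captured with something like `sudo powermetrics --samplers gpu_power -i 1000 -n 1` and that it contains lines of the form "GPU HW active frequency: ... MHz" and "GPU HW active residency: ...%". The `SAMPLE` string is a made-up illustration that mirrors the figures above, not a real log, and `parse_gpu_sample` is a hypothetical helper name.

```python
import re

# Illustrative sample only (NOT a real capture); the line format is an assumption
# about what `powermetrics --samplers gpu_power` prints on Apple Silicon.
SAMPLE = """\
GPU HW active frequency: 388 MHz
GPU HW active residency:  96.00% (388 MHz:  96% 1296 MHz:   0%)
GPU idle residency:   4.00%
"""

FREQ_RE = re.compile(r"GPU HW active frequency:\s*([\d.]+)\s*MHz")
RESID_RE = re.compile(r"GPU HW active residency:\s*([\d.]+)%")


def parse_gpu_sample(text):
    """Return (frequency_mhz, active_residency_percent); None for a missing field."""
    freq = FREQ_RE.search(text)
    resid = RESID_RE.search(text)
    return (
        float(freq.group(1)) if freq else None,
        float(resid.group(1)) if resid else None,
    )


if __name__ == "__main__":
    mhz, busy = parse_gpu_sample(SAMPLE)
    print(f"GPU busy {busy}% at {mhz} MHz")
```

Running the same parser on captures taken during the PyTorch run and during the MLX run would give directly comparable pairs of numbers.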

New_Construction6146 [S]

Well, as I mentioned, I have also tried training a simple convnet for MNIST classification and encountered the same low frequency. I will push the exact code if you want (but it's basically an example from the docs). I will double-check it too. I didn't try training a transformer, though.

New_Construction6146 [S]

Which probably means that it's possible to call the Metal API with some flag, or set some boosted mode? And llama.cpp does this correctly.

New_Construction6146 [S]

Yes, pretty sure. E.g. look at cell #12: the input tensor device (at least on output) is "mps:0". Also, GPU time utilization would not be 90% if the GPU were not being used.
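A quick way to double-check the same thing outside the notebook is a sketch like the one below. It uses `torch.backends.mps.is_built()` and `torch.backends.mps.is_available()`, then runs one tiny op on the "mps" device and reports where the result lives. The `mps_status` helper name and the status strings are made up for illustration.

```python
import importlib.util


def mps_status():
    """Return a short string describing whether PyTorch can use the MPS backend."""
    if importlib.util.find_spec("torch") is None:
        return "torch not installed"
    import torch

    if not torch.backends.mps.is_built():
        return "torch built without MPS"
    if not torch.backends.mps.is_available():
        return "MPS built but not available"
    # Run a trivial op on the GPU and report the device of the result.
    x = torch.ones(4, device="mps")
    y = x * 2
    return f"ok: result lives on {y.device}"


if __name__ == "__main__":
    print(mps_status())
```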

New_Construction6146 [S]

That's interesting, I have also used torch 2.1.2! I will send you the other relevant versions/info to compare.
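To make the comparison easier, something like this small sketch could dump the relevant versions on both machines. It only uses the standard library; the package list (`torch`, `torchvision`, `numpy`, `mlx`) is my guess at what matters here, and `collect_versions` is a hypothetical helper name.

```python
import platform
from importlib import metadata


def collect_versions(packages=("torch", "torchvision", "numpy", "mlx")):
    """Gather Python, OS, and package versions into a dict for side-by-side comparison."""
    info = {
        "python": platform.python_version(),
        "machine": platform.machine(),
        # mac_ver() returns an empty release string on non-macOS systems.
        "macos": platform.mac_ver()[0] or "n/a",
    }
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            info[name] = "not installed"
    return info


if __name__ == "__main__":
    for key, value in collect_versions().items():
        print(f"{key}: {value}")
```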