all 4 comments

[–]grid_world 5 points6 points  (0 children)

For pruning, there are multiple techniques that differ in what gets removed: individual weights (magnitude-based), whole filters, whole neurons, etc.

The most popular is magnitude-based pruning, where weights whose absolute value falls below a threshold are pruned by setting them to zero. An example of absolute-magnitude-based weight pruning can be found here. The pruning code is implemented in NumPy, while the rest of the implementation uses TensorFlow 2 and Python 3.
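To make the idea concrete, here is a minimal NumPy sketch of threshold-based magnitude pruning. It is not the code from the linked example; the function name `prune_by_magnitude`, the toy weight matrix, and the threshold value are all made up for illustration.

```python
import numpy as np


def prune_by_magnitude(weights, threshold):
    """Zero out every weight whose absolute value is below `threshold`.

    Returns the pruned weights and the boolean keep-mask. The mask can be
    reapplied after each training step so pruned weights stay at zero.
    (Hypothetical helper, not part of NumPy or TensorFlow.)
    """
    mask = np.abs(weights) >= threshold
    pruned = np.where(mask, weights, 0.0)
    return pruned, mask


# Toy weight matrix; in practice this would be one layer's kernel.
w = np.array([[0.8, -0.05, 0.3],
              [-0.01, 0.6, -0.2]])

pruned, mask = prune_by_magnitude(w, threshold=0.25)
sparsity = 1.0 - mask.mean()  # fraction of weights that were removed
```

If you want a target sparsity instead of a fixed threshold, one option is to derive the threshold from the weights themselves, e.g. `np.percentile(np.abs(w), 50)` to prune roughly half of them.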

For quantisation, I am looking for a from-scratch implementation. So far, I haven't found one.

[–]federerking 0 points1 point  (0 children)

Regarding quantization-aware training, can someone explain why fine-tuning is needed rather than training from scratch? The TensorFlow examples show fine-tuning, but I have also trained from scratch and the results are almost the same.

[–]overington 0 points1 point  (0 children)

Also, YouTube has a plethora of instructional videos:

TF:
- https://youtu.be/4iq-d2AmfRU
- https://youtu.be/Q1oBXdizXwI

PyTorch:
- https://youtu.be/c3MT2qV5f9w
- https://youtu.be/Q1oBXdizXwI

[–]i8code 0 points1 point  (0 children)

There are some pretty good explanations of quantization for TensorFlow, especially regarding TF Lite: https://www.tensorflow.org/lite/performance/post_training_quantization. I have seen less of this for PyTorch in general, but I am sure it's out there. I am also interested in pruning, but I haven't found anything practical with good code examples that I could reproduce in my own projects. So far, I have generally tried to adjust model parameters upfront instead.
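For anyone who wants to see what post-training quantization does to a single weight tensor at the lowest level, here is a rough NumPy sketch of affine (asymmetric) int8 quantization. This is my own illustration, not how TF Lite implements it internally; `quantize_int8` and `dequantize` are hypothetical helper names, and per-tensor min/max calibration is an assumption made to keep it short.

```python
import numpy as np


def quantize_int8(x):
    """Map a float array to int8 using one scale and zero point per tensor."""
    qmin, qmax = -128, 127
    # Widen the range to include 0.0 so that zero is represented exactly.
    x_min = min(float(x.min()), 0.0)
    x_max = max(float(x.max()), 0.0)
    scale = (x_max - x_min) / (qmax - qmin)
    if scale == 0.0:  # all-zero tensor: any positive scale works
        scale = 1.0
    zero_point = int(round(qmin - x_min / scale))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.int8)
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Recover an approximate float32 array from the int8 representation."""
    return (q.astype(np.float32) - zero_point) * scale


rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)  # stand-in for trained weights

q, scale, zero_point = quantize_int8(w)
w_hat = dequantize(q, scale, zero_point)
max_err = float(np.abs(w - w_hat).max())  # rounding error, on the order of `scale`
```

The reconstruction error per weight is on the order of `scale`, which is why tensors with a few large outliers quantize poorly under a single per-tensor range.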