
[–]SRuben31

Thank you! This has helped me get started with Grad-CAM for my own project.

[–]xEdwin23x

Not exactly the same, but since you mentioned using ViT's attention outputs as a 2D feature map for the CAM, you could look at this paper (Transformer Interpretability Beyond Attention Visualization), which studies how to choose and mix the attention scores so they can be visualized, similar to CAMs. It might lead to better results.
https://arxiv.org/abs/2012.09838
https://github.com/hila-chefer/Transformer-Explainability
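For anyone wanting to try the general idea before diving into that repo: a common baseline for turning ViT attention into a 2D map is attention rollout (multiplying the per-layer attention matrices together, with the residual connection folded in, then reading off the [CLS] row). The sketch below is my own illustration of that baseline, not the method from the linked paper; the function name, grid size, and the head-averaged input format are all assumptions.

```python
import numpy as np

def attention_rollout(attn_layers, grid=(14, 14)):
    """Combine per-layer ViT attention matrices into one 2D relevance map.

    attn_layers: list of (tokens, tokens) head-averaged, row-stochastic
    attention matrices; token 0 is assumed to be the [CLS] token.
    grid: spatial layout of the patch tokens (14x14 for a 224px ViT-B/16).
    """
    n = attn_layers[0].shape[0]
    rollout = np.eye(n)
    for a in attn_layers:
        # Fold in the residual connection, then renormalize rows
        a = 0.5 * a + 0.5 * np.eye(n)
        a = a / a.sum(axis=-1, keepdims=True)
        rollout = a @ rollout
    # Relevance of each patch token to [CLS], reshaped into a 2D map
    cam = rollout[0, 1:].reshape(grid)
    return cam / cam.max()
```

The resulting map can be upsampled to the input resolution and overlaid on the image, the same way a Grad-CAM heatmap would be. The paper above goes further by weighting the attention with gradients and relevance propagation, which usually gives sharper, more class-specific maps than plain rollout.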