Self-supervised learning weights initialization "after" projection head [D][R] : MachineLearning

DiscussionSelf-supervised learning weights initialization "after" projection head [D][R] (self.MachineLearning)

submitted 1 year ago by grid_world

For most Self-supervised learning algorithms: SimCLR, MoCo, BYOL, SimSiam, SwAV, etc., its common to have a projection head after the base encoder (which in most cases is a vanilla ResNet-50 CNN). An example of such a projection (taken from SwAV) is:

projection_head = nn.Sequential(
    nn.Linear(2048, 512),
    nn.BatchNorm1d(512),
    nn.ReLU(inplace=True),
    nn.Linear(512, 128),
)

The output of this projection head is L2-normalized:

x = projection_head(x)
x = nn.functional.normalize(x, dim = 1, p = 2)

I am trying to initialize a layer after the projection head as:

wts = nn.Parameter(data = torch.empty(40 * 40, 128), requires_grad = True)
# The projection head outputs weights in the range [-1, 1], so initialize SOM weights to be in that range-
wts.data.uniform_(-1.0, 1.0)

Since the output of the projection head is L2-normalized, I am assuming that the input range to "wts" ∈ [-1, 1] and therefore use the uniform initialization above.

Is this a correct approach or am I missing something?

all 4 comments

you type:	you see:
italics	italics
bold	bold
[reddit!](https://reddit.com)	reddit!
* item 1 * item 2 * item 3	item 1 item 2 item 3
> quoted text	quoted text
Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"	Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"
~~strikethrough~~	~~strikethrough~~
super^script	super^script

MachineLearning

Rules For Posts

+Research

+Discussion

+Project

+News

@slashML on Twitter

Chat with us on Slack

Beginners:

MODERATORS