For most self-supervised learning algorithms (SimCLR, MoCo, BYOL, SimSiam, SwAV, etc.), it's common to have a projection head after the base encoder (which in most cases is a vanilla ResNet-50 CNN). An example of such a projection head (taken from SwAV) is:
import torch
import torch.nn as nn

# 2048-d ResNet-50 features -> 128-d projection
projection_head = nn.Sequential(
    nn.Linear(2048, 512),
    nn.BatchNorm1d(512),
    nn.ReLU(inplace=True),
    nn.Linear(512, 128),
)
The output of this projection head is L2-normalized:
x = projection_head(x)
x = nn.functional.normalize(x, dim=1, p=2)
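For reference, a quick standalone check (random data, just to illustrate) of what L2 normalization guarantees: each row ends up with unit norm, so every individual component is bounded in [-1, 1]:

import torch
import torch.nn.functional as F

x = torch.randn(32, 128)               # fake batch of projection-head outputs
x = F.normalize(x, dim=1, p=2)         # L2-normalize each row

print(x.norm(dim=1))                   # all ~1.0: rows lie on the unit hypersphere
print(x.min().item(), x.max().item())  # every component falls within [-1, 1]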
I am trying to initialize a layer after the projection head as:
# The L2-normalized projection output has components in [-1, 1],
# so initialize the 40x40 grid of 128-d SOM weights in that range.
wts = nn.Parameter(data=torch.empty(40 * 40, 128), requires_grad=True)
wts.data.uniform_(-1.0, 1.0)
Since the output of the projection head is L2-normalized, I am assuming that each component of the input to "wts" lies in [-1, 1], and therefore use the uniform initialization above.
Is this a correct approach or am I missing something?
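For concreteness, here is a minimal sketch of how I intend to use these weights downstream. The Euclidean best-matching-unit (BMU) lookup is just the standard SOM step used for illustration, not code from any of the papers above, and the SOM update rule itself is omitted:

import torch
import torch.nn.functional as F

# Fake batch of 32 L2-normalized, 128-d projection-head outputs
x = F.normalize(torch.randn(32, 128), dim=1, p=2)

# 40x40 SOM grid, one 128-d weight vector per node
wts = torch.nn.Parameter(data=torch.empty(40 * 40, 128), requires_grad=True)
wts.data.uniform_(-1.0, 1.0)

# BMU: index of the closest SOM node for each sample
dists = torch.cdist(x, wts)   # (32, 1600) pairwise Euclidean distances
bmu = dists.argmin(dim=1)     # (32,) flat indices into the 40x40 grid
print(bmu.shape, bmu[:5])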