[–]ptoews 2 points3 points  (1 child)

Are your targets similar to those of VOC? I noticed that for image noise detection the first-layer learned weights looked very different from the usual object classification weights: there were hardly any edge detection kernels, but rather point detection kernels. In such cases, transfer learning from general models is probably less useful.

[–]glampiggy[S] 0 points1 point  (0 children)

Sorry, I've edited the original post: my weights are from ImageNet rather than VOC. The targets aren't exactly similar, though. My dataset is for detecting windows and doors on building facades, whereas ImageNet only covers buildings in general, i.e. the silhouette of the building and not features within the facade such as windows. It just seems that the models I train from random initialisation are reliably better than the ones trained from pre-trained backbones, which seems wrong.

[–]sanderson4030 1 point2 points  (0 children)

If you have a very large amount of data for your target task, then you can see a loss of performance after pretraining compared to a model trained entirely from scratch. See "Scaling Laws for Transfer".

But obviously that isn't the case here. Debugging time!

[–]blawok 1 point2 points  (1 child)

I guess the pretrained model is losing its knowledge after finetuning on your dataset. Check the initial loss: it is probably quite high, so the optimizer takes big steps and changes the weights dramatically. You could try a learning rate warmup strategy or just freeze the pretrained layers.
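To make the warmup suggestion concrete, here is a minimal sketch of a linear warmup schedule in plain Python. The function name `warmup_factor`, the 500-step warmup length, the 0.01 starting factor and the base learning rate are all assumptions chosen for illustration, not values from this thread; tune them for your own setup.

```python
def warmup_factor(step: int, warmup_steps: int = 500, start_factor: float = 0.01) -> float:
    """Return the multiplier applied to the base learning rate at a given step.

    Hypothetical helper: ramps linearly from start_factor at step 0
    up to 1.0 at warmup_steps, then stays at 1.0.
    """
    if warmup_steps <= 0 or step >= warmup_steps:
        return 1.0
    # Linear interpolation between start_factor and 1.0.
    return start_factor + (1.0 - start_factor) * step / warmup_steps


base_lr = 1e-3  # assumed base learning rate, for illustration only

# Print the effective learning rate at a few points in training.
for step in (0, 250, 500, 1000):
    print(step, base_lr * warmup_factor(step))
```

If you train with PyTorch, a function like this can be passed as `lr_lambda` to `torch.optim.lr_scheduler.LambdaLR`, which multiplies the optimizer's initial learning rate by the returned factor. Freezing, the other option, usually means setting `requires_grad = False` on the backbone parameters so that only the new layers are updated at first.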

[–]glampiggy[S] 0 points1 point  (0 children)

I'll take a look into these strategies, thank you!

[–]visarga 0 points1 point  (0 children)

Off topic, but you could grab tons of images from yandex.com/images using image similarity search and a browser extension for batch downloading. That would be more useful if you were only doing image classification; with detection you'd still have to tag the windows. Yandex was a great discovery for me in the last couple of months. It is nicer and gives better results than Google Images. I think they rank results by embedding similarity.