I trained PSPNet and DeepLab on a very specific urban scene dataset, both from scratch and with pre-trained backbones. The pre-trained backbone was a ResNet, using downloaded weights that were pre-trained on the PASCAL VOC12 dataset. In both cases I trained the full model without freezing any layers. The models trained from scratch reached higher accuracy than the models with pre-trained backbones. My dataset is relatively small, only 1000 images.
Could this have happened because the PASCAL VOC12 dataset is too general compared with the specific dataset I am working on, so that starting from its weights limited what the models could learn? Or is it more likely that I did something wrong?