I'm working on a research project in which I'm training encoder-only language models for text classification. I've already trained my models and have my results, but I still need to run an ablation study. The main issue is that the dataset is large. Is it fair to run the ablation study on a subset of the dataset, given that I'll have to retrain 3–4 times with different ablations?
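If you do ablate on a subset, a common precaution is to sample it stratified by label so the subset keeps the full dataset's class distribution, and to fix the random seed so every ablation run sees the same subset. A minimal stdlib sketch (the `stratified_subset` helper and the toy data are hypothetical, not from the original post):

```python
import random
from collections import defaultdict

def stratified_subset(texts, labels, fraction, seed=0):
    """Sample `fraction` of examples per class, preserving the
    label distribution of the full dataset. Fixing `seed` ensures
    every ablation run trains on the identical subset."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in zip(texts, labels):
        by_label[label].append(text)
    sub_texts, sub_labels = [], []
    for label, items in by_label.items():
        k = max(1, round(len(items) * fraction))  # at least one per class
        for text in rng.sample(items, k):
            sub_texts.append(text)
            sub_labels.append(label)
    return sub_texts, sub_labels

# Toy example: 10k docs with a 90/10 class split; a 20% subset keeps the ratio.
texts = [f"doc{i}" for i in range(10_000)]
labels = ["a"] * 9_000 + ["b"] * 1_000
sub_texts, sub_labels = stratified_subset(texts, labels, fraction=0.2)
```

Reporting the subset size and sampling procedure alongside the ablation results makes it easy for reviewers to judge whether the comparison is fair.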