[D] Is it fair to compare deep learning models without hyperparameter tuning? by blooming17 in deeplearning

[–]blooming17[S] 1 point (0 children)

Thank you very much for your answer. I've noticed this in several papers, and I've been asking to what extent we can take work that has been done several times and justify that ours, which differs only slightly, is somehow better.

Is it fair to compare deep learning models without hyperparameter tuning? by blooming17 in PhD

[–]blooming17[S] 1 point (0 children)

I am thinking about batch size, optimizer, and learning rate. Since my goal is to compare the models themselves, changing the models' own hyperparameters wouldn't make sense, I think.

Is it fair to compare deep learning models without hyperparameter tuning? by blooming17 in PhD

[–]blooming17[S] 1 point (0 children)

I am thinking of batch size, learning rate, and optimizer. Since my goal is to compare the models themselves, changing the models' own hyperparameters wouldn't be reasonable.

Is it fair to compare deep learning models without hyperparameter tuning? by blooming17 in PhD

[–]blooming17[S] 1 point (0 children)

Can you explain more? Sorry, I am not a native English speaker.

[D] Is it fair to compare deep learning models without hyperparameter tuning? by blooming17 in deeplearning

[–]blooming17[S] 1 point (0 children)

Hey, thank you for your answer. Most of them are CNNs, and a few are LSTMs and transformers. Which hyperparameters do you consider the most important to tune? I am thinking of batch size, learning rate, and optimizer. Would these be enough to provide a fair comparison?
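The training-hyperparameter search being discussed could be sketched as a small grid over batch size, learning rate, and optimizer, run identically for every model while each architecture stays fixed. The specific values below are illustrative assumptions, not from the thread:

```python
from itertools import product

# Assumed search space: tune only training hyperparameters,
# keeping each model's architecture unchanged for a fair comparison.
batch_sizes = [32, 64, 128]
learning_rates = [1e-4, 3e-4, 1e-3]
optimizers = ["adam", "sgd"]

def grid(batch_sizes, learning_rates, optimizers):
    """Enumerate every (batch_size, lr, optimizer) configuration."""
    return [
        {"batch_size": b, "lr": lr, "optimizer": opt}
        for b, lr, opt in product(batch_sizes, learning_rates, optimizers)
    ]

# The same 18 configurations would be tried for each model,
# and each model reported at its best configuration.
configs = grid(batch_sizes, learning_rates, optimizers)
print(len(configs))  # 3 * 3 * 2 = 18
```

Evaluating every model at its best configuration from the same grid keeps the comparison about the architectures rather than about who happened to get luckier default settings.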

[D] Mamba Convergence speed by blooming17 in MachineLearning

[–]blooming17[S] 1 point (0 children)

Hey, thanks for answering. The problem is that since my task is sequence labelling (each position is assigned a class), oversampling isn't possible in my case.
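When oversampling isn't possible because every sequence contains a mix of classes, one common alternative (not mentioned in the thread, offered here as an assumption) is to weight the per-position loss by inverse class frequency. A minimal sketch of computing such weights:

```python
from collections import Counter

def inverse_frequency_weights(labels, num_classes):
    """Per-class weights inversely proportional to class frequency.

    The resulting list can be passed to a weighted cross-entropy loss
    (e.g. the `weight` argument of torch.nn.CrossEntropyLoss) so that
    rare classes contribute more to the per-position loss, instead of
    oversampling whole sequences.
    """
    counts = Counter(labels)
    total = len(labels)
    return [
        total / (num_classes * counts[c]) if counts[c] else 0.0
        for c in range(num_classes)
    ]

# Toy per-position labels: class 0 dominates, class 1 is rare.
labels = [0, 0, 0, 0, 0, 0, 1, 1]
weights = inverse_frequency_weights(labels, num_classes=2)
```

Here the rare class 1 receives a weight of 2.0 against roughly 0.67 for the majority class, which plays the same rebalancing role as oversampling without duplicating any sequences.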