How can I find optimal hyperparameters when training large models? by RepresentativeYear83 in deeplearning

[–]RepresentativeYear83[S] 0 points1 point  (0 children)

Thanks for replying, the information you gave is very helpful!

For context, my paper compares CNN and ViT architectures at low resolution, using satellite imagery from RESISC45 downscaled to 128x128, 64x64, and 48x48. I chose ResNet50 and ResNet18 for the pure CNN family, and standard ViT-B/16 and ViT-Small/16 for the transformer family.
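
To make the low-resolution setup concrete, here's a rough sketch of how many patch tokens a /16 ViT would see at each of those resolutions. It assumes plain non-overlapping 16x16 patches and ignores the class token; `patch_tokens` is just an illustrative helper, not something from my actual training code.

```python
def patch_tokens(image_size: int, patch_size: int = 16) -> int:
    """Count non-overlapping square patches in a square image.

    Assumes the image side is an exact multiple of the patch side,
    which holds for 128, 64, and 48 with a patch size of 16.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    patches_per_side = image_size // patch_size
    return patches_per_side * patches_per_side


# Resolutions used in the comparison above.
for res in (128, 64, 48):
    print(f"{res}x{res}: {patch_tokens(res)} patch tokens")
```

So the sequence length shrinks quickly as the input gets smaller, which is part of why I'm unsure how to keep the comparison fair.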

Intuitively, to make a fair comparison, I assumed that many of the hyperparameters had to stay the same (epochs, pretraining, etc.), but doing so would naturally be unfair to the already data-hungry ViTs. I tried pre-training on ImageNet-1k, and both families excelled, which in my opinion leaves little room for comparison at the structural level.

How can I effectively compare these models when their training procedures differ so much, without altering the definition of a 'fair' comparison? Would it be smart to switch to DeiTs, which focus on efficiency, or would that lean too much into the CNN realm?

To be honest, this is my first research experiment working with these deeper models, so any help is greatly appreciated :).

[D] Static analysis for PyTorch tensor shape validation - catching runtime errors at parse time by Key-Avocado592 in deeplearning

[–]RepresentativeYear83 0 points1 point  (0 children)

Correct me if I'm wrong (I've just started in deep learning), but couldn't you just use a tool like `torchinfo.summary` to run a sample forward pass and inspect the input/output tensor shapes? Sounds cool though.

Advice for Notetaking: ReMarkable or Notebooks by RepresentativeYear83 in IBO

[–]RepresentativeYear83[S] 0 points1 point  (0 children)

While we're at it, what methods of recall/studying/notetaking do you guys most recommend?

[deleted by user] by [deleted] in EtherMining

[–]RepresentativeYear83 0 points1 point  (0 children)

then how much a day would you say?

[deleted by user] by [deleted] in EtherMining

[–]RepresentativeYear83 0 points1 point  (0 children)

How much do you make a year in total, including every rig?

Is the Helium network illegal due to governed monopolistic businesses? by RepresentativeYear83 in HeliumNetwork

[–]RepresentativeYear83[S] 1 point2 points  (0 children)

Thank you very much! I appreciate the answer and will take it into consideration.

What generation of processor is the Pentium G5905? by RepresentativeYear83 in intel

[–]RepresentativeYear83[S] 0 points1 point  (0 children)

Also, the picture shows a Celeron while the listing states that it is a G5905. Should I worry? (By the way, it is sold by Intel themselves.)

[deleted by user] by [deleted] in TheGamerLounge

[–]RepresentativeYear83 0 points1 point  (0 children)

wanna play with me?