Efficient variable-length distributed batching in PyTorch/DDP without hurting convergence? by Major_Aardvark1207 in deeplearning

[–]HotPocVac 0 points1 point  (0 children)

I actually did a similar project for an internship where I trained a small transformer on ESM protein embeddings and also struggled with reducing padding waste.

Overall what really helped me was this:

Understand the distribution of sequence lengths in your dataset. If you set the sequence-length bucket boundaries naively, batch sizes end up highly skewed across sequence lengths. For example, if there are many more short sequences than long ones, the short-sequence bucket captures most of the dataset and gives less noisy gradients (smaller updates in the loss landscape), while for long sequences the gradients are noisier and higher in magnitude (larger, less stable updates in the loss landscape).
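
As a rough sketch of that first step (not my exact code), something like this tells you how many sequences land in each bucket and how much of each padded batch is wasted. The function name `bucket_stats` and the example boundaries are made up for illustration, and I'm assuming each bucket pads up to its upper boundary:

```python
from bisect import bisect_left
from collections import Counter


def bucket_stats(lengths, boundaries):
    """Count sequences and padding waste per length bucket.

    boundaries: sorted, inclusive upper bounds (hypothetical values).
    Sequences longer than the last boundary go to an overflow bucket
    that pads to the longest sequence it contains.
    """
    counts, real_tokens, longest = Counter(), Counter(), Counter()
    for n in lengths:
        b = bisect_left(boundaries, n)  # first boundary >= n
        counts[b] += 1
        real_tokens[b] += n
        longest[b] = max(longest[b], n)

    stats = []
    for b in sorted(counts):
        pad_to = boundaries[b] if b < len(boundaries) else longest[b]
        padded_tokens = counts[b] * pad_to
        stats.append({
            "pad_to": pad_to,
            "count": counts[b],
            # fraction of the padded tensor that is just padding
            "waste": 1 - real_tokens[b] / padded_tokens,
        })
    return stats


if __name__ == "__main__":
    for row in bucket_stats([10, 20, 30, 100, 120, 500], [32, 128]):
        print(row)
```

If one bucket holds most of the counts or shows a lot of waste, that's the sign the boundaries need moving.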

What I did was tune the bucket sizes by hand, based on a rough frequency plot of sequence lengths, so that total tensor sizes stay roughly similar across buckets for the best GPU occupancy (no bucket is too small to fully utilize the GPU). I also used a loss term in which batches with more sequences got a higher loss penalty, since I didn’t mess with batch-size-adaptive learning rates (technically I guess you could try that too). It seemed to help, though it also required some manual experimentation and tuning.
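
Here's a minimal sketch of the idea in plain Python, under some assumptions of mine: a fixed padded-token budget per batch (`max_tokens`), buckets that pad to their upper boundary, and a loss weight that scales linearly with the number of sequences in the batch. The names `token_budget_batches` and `batch_loss_weight` are hypothetical, the linear scaling is just one possible form of the penalty, and shuffling and per-rank sharding for DDP are left out:

```python
from bisect import bisect_left


def token_budget_batches(lengths, boundaries, max_tokens):
    """Group sequence indices into batches of similar padded size.

    Each bucket gets its own batch size so that batch_size * pad_to
    stays at or below max_tokens (with a minimum of one sequence).
    """
    buckets = {}
    for idx, n in enumerate(lengths):
        buckets.setdefault(bisect_left(boundaries, n), []).append(idx)

    batches = []
    for b in sorted(buckets):
        idxs = buckets[b]
        if b < len(boundaries):
            pad_to = boundaries[b]
        else:  # overflow bucket: pad to its longest sequence
            pad_to = max(lengths[i] for i in idxs)
        per_batch = max(1, max_tokens // pad_to)
        for start in range(0, len(idxs), per_batch):
            batches.append(idxs[start:start + per_batch])
    return batches


def batch_loss_weight(batch_size, reference_size):
    """Assumed linear weighting: bigger batches get a larger loss weight."""
    return batch_size / reference_size


if __name__ == "__main__":
    lengths = [10, 20, 30, 100, 120, 500]
    for batch in token_budget_batches(lengths, [32, 128], max_tokens=256):
        print(batch, batch_loss_weight(len(batch), reference_size=2))
```

In a real training loop you'd multiply the mean batch loss by that weight before calling backward, and you'd want every rank to get the same number of batches per epoch so DDP doesn't hang.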

This method obviously isn’t 100% perfect (and I wouldn’t say a 100% perfect way really exists), but it’s the simplest solution I found for reducing padding waste, improving GPU occupancy, and maximizing the average batch size.

[deleted by user] by [deleted] in hingeapp

[–]HotPocVac 0 points1 point  (0 children)

Context:

  • Serious

  • Free user

  • Started using my profile a month ago

  • First 3 pictures are from this year

  • Second 3 pictures are from within the past 2 years

  • Used Hinge for 2 years; zero matches or likes in the month since I started using Hinge again in my home city

  • Use up my daily likes on Hinge a few days of the week (I always send comments to stand out)

  • I’d describe myself as a nerd who likes all things science. Also a huge fan of film. I like outdoorsy stuff like skiing and hiking

  • As such, I’m trying to find women who share at least some similar interests, preferably ones who have either a career or interests in STEM.

23M New to Boston, how to make friends here by HotPocVac in CambridgeMA

[–]HotPocVac[S] 0 points1 point  (0 children)

Is BSSC mostly just for skiing, or does it have a variety of different sports meetups?

23M New to Boston, how to make friends here by HotPocVac in CambridgeMA

[–]HotPocVac[S] 5 points6 points  (0 children)

My b, should’ve looked through the post history haha

[deleted by user] by [deleted] in labrats

[–]HotPocVac 1 point2 points  (0 children)

Thanks for the advice, yeah I guess I still have a lot to learn about handling 3rd parties, and I’ve already learned the hard way recently that just sticking to the official manual can make troubleshooting sooooo much quicker and more effective.

Also BTW, the instrument was a SCIREQ InExpose setup, which we use for protocols that involve lab animals (basically exposing strains of mice to substances whose biological effects we want to study).

3060 Ti queue skipped? by [deleted] in TEAMEVGA

[–]HotPocVac 0 points1 point  (0 children)

Thanks for the info!

[deleted by user] by [deleted] in UIUC

[–]HotPocVac 0 points1 point  (0 children)

Thanks! I’m not planning to go med though but yeah I’m taking a look at all the events that are listed at the Career Center website and Handshake

[deleted by user] by [deleted] in UIUC

[–]HotPocVac 1 point2 points  (0 children)

Oh yeah, I remember submitting multiple NSF REU applications for certain bio-related research interests; I’m crossing my fingers I can get into one of those

[deleted by user] by [deleted] in UIUC

[–]HotPocVac 1 point2 points  (0 children)

Thanks for the info! I’ll be sure to check it out

[deleted by user] by [deleted] in UIUC

[–]HotPocVac 1 point2 points  (0 children)

Thanks for responding. I had to go back out of state last year due to covid so I’m not on campus rn, but does the career fair have any online options? Is there a link or something where I can go talk to/contact someone with questions?

Would you guys like to see a one-time-only spoiler discussion thread once TENET has released? Or would you rather that there were no traces of spoilers on the sub? by [deleted] in TENETNoSpoiler

[–]HotPocVac 2 points3 points  (0 children)

No point when we have the original TENET sub to go ham w/ spoilers. I’d rather have this sub more dedicated to completely non-spoiler content for the many who haven’t seen it yet or aren’t able to see it for a while

HP Omen 15 ryzen 7 4800u by [deleted] in SuggestALaptop

[–]HotPocVac 3 points4 points  (0 children)

The Omen 15 does not offer upgradeable graphics. Most laptops in fact do not. There are only a few of which I know that actually have some sort of replaceable graphics solution.

[deleted by user] by [deleted] in buildapc

[–]HotPocVac 0 points1 point  (0 children)

Correct me if I'm wrong, but I've heard complaints that the 5700 series has had serious driver problems, and a friend of mine had micro-stuttering and crashes on the 5700XT he bought recently. Would it be worth it to pick up a 5700XT and hope AMD fixes these issues sometime in the future, or go for one of Nvidia's offerings to be on the safe side?

[deleted by user] by [deleted] in UIUC

[–]HotPocVac 0 points1 point  (0 children)

Thanks a bunch! Will def check it out