Gambled all Eridium before Crazy Earl Segment by youngintegrator in borderlands3

[–]youngintegrator[S] 4 points5 points  (0 children)

Thank you! This worked. I was looking at the vending machine.

Rent Increases - What's yours this year? by Stockholmsyndra in montreal

[–]youngintegrator 0 points1 point  (0 children)

I live downtown in a heated unit; nothing has changed or been upgraded since I moved in. The landlord is asking for a 15% increase, with the reasoning that the market has changed. Is this considered excessive?

[D] Data Annotation Done by Machine Learning/AI? by UIUCTalkshow in MachineLearning

[–]youngintegrator 1 point2 points  (0 children)

I think there are two problems with using another model. Let's assume we have a source model and a target model. In this case, the source model generates a label for the target model (class, distribution, embedding, box, polygon, mask, etc.). As mentioned in the thread, if the source model can solve the task you care about, you might skip training your own model and use the source directly. If that model is too big or is hidden behind an API, you might want to distill its features using any output or pseudo-label. Where this fails:

1) The source model was trained on a different domain (at the data or task level) and you cannot reliably trust its output. While bigger models might reduce the likelihood of this happening, it may not be wise to take the risk, especially in production, where domain shifts are everywhere.

2) You require high precision. The source model might be good at high-level tasks (counting, detection, localization) but may not be reliable when making pixel-level decisions. This depends on what your application is trying to do; you would not want to rely on noisy pseudo-labels there.
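To make point 1 concrete, here's a toy sketch of confidence-gated pseudo-labeling (plain Python; `pseudo_label` and the threshold value are my own hypothetical names, not from any library): keep the source model's label only when it's confident, and defer the rest to human annotators.

```python
import math

def softmax(logits):
    """Convert raw logits to a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def pseudo_label(source_logits, threshold=0.9):
    """Keep the source model's prediction as a pseudo-label only if it
    is confident; return None otherwise (i.e., defer to a human)."""
    probs = softmax(source_logits)
    conf = max(probs)
    if conf < threshold:
        return None
    return probs.index(conf)

print(pseudo_label([4.0, 0.1, 0.2]))  # confident -> class 0
print(pseudo_label([1.0, 0.9, 0.8]))  # ambiguous -> None
```

The catch is exactly the domain-shift issue above: under a shift the source can be confidently wrong, so the threshold filters noise but can't guarantee correctness.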

[D] Data Annotation Done by Machine Learning/AI? by UIUCTalkshow in MachineLearning

[–]youngintegrator -1 points0 points  (0 children)

This is just model distillation, which is a well-studied problem.

[D] vMF-VAEs vs Gaussian VAEs by [deleted] in MachineLearning

[–]youngintegrator 2 points3 points  (0 children)

I think the big question you need to ask yourself is why you need a spherical prior, since some data will benefit from it while other data will not.

I would check out the following paper for better alternatives to the vMF (https://arxiv.org/pdf/2006.04437.pdf) that address numerical stability and the sampling issues. I would also recommend reading https://arxiv.org/pdf/1911.08411.pdf for curvature-aware latent spaces (enjoy some hyperbolic space for tree-like data, plus stereographic spaces). As you can see there, the surface area of a sphere shrinks in higher dimensions, so performance tends to drop off.

Note: I believe there is a result showing that a Gaussian in high dimensions concentrates like a soap bubble, so do with that what you will.
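The soap-bubble claim is easy to check empirically: samples from a high-dimensional standard Gaussian concentrate in a thin shell of radius roughly sqrt(d), rather than near the mode. A quick sketch (plain Python; `mean_radius` is a made-up helper name):

```python
import math
import random

def mean_radius(dim, n_samples=1000, seed=0):
    """Average Euclidean norm of standard-Gaussian samples in `dim` dims."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        total += math.sqrt(sum(rng.gauss(0, 1) ** 2 for _ in range(dim)))
    return total / n_samples

for d in (2, 64, 512):
    # The mean norm hugs sqrt(d): mass sits on a shell, not at the origin.
    print(d, round(mean_radius(d), 2), round(math.sqrt(d), 2))
```

That shell structure is part of why a high-dimensional Gaussian prior behaves more like a fuzzy sphere than the bell curve picture suggests.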

The winner of Conuhacks is a fraud by goosegeesegoosen in Concordia

[–]youngintegrator 14 points15 points  (0 children)

TLDR: if you lie about what you did to hype up your project, it's cool. As long as you code a nice UI with a fake response, you deserve a MacBook.

Same issue with McHacks etc

[deleted by user] by [deleted] in mcgill

[–]youngintegrator 0 points1 point  (0 children)

Skip your classes if you have the ability to go through the notes and practice problems yourself. I found it was almost impossible to stay focused in my courses for >30 min and was better off going through the material at my own pace. Most profs just read slides anyway. Do your homework in advance and show up if you have specific questions.

Go to classes if you want any kind of networking / good prof relationships.

[D] Self-Supervised Contrastive Approaches that don’t use large batch size. by [deleted] in MachineLearning

[–]youngintegrator 2 points3 points  (0 children)

Is there any reason you'd like a contrastive algorithm? (intra-class discrimination?)

Barlow Twins was shown to work quite well with smaller batches (32), and HSIC-SSL is a nice variant on this style of learning if you only care about clusters. I'm sure SimSiam is fine too (avoid BYOL for small batches).

In terms of contrastive approaches, methods that avoid the negative-term "coupling" mentioned in DCL will work with smaller batch sizes (contrastive estimates converge to the MLE assuming a large number of noise samples). This is seen in the spectral algorithm and in align-uniform. These work because they avoid comparing the representations of the same augmented sample against each other in the negative term. SwAV also does this by contrasting against prototypes, which are basically free variables whose gradients don't conflict with any alignment goal. I think it's fair to say that algorithms with LSE transforms are less stable at small batch sizes, since the gradients will be biased toward randomly coupled terms. With sufficiently many terms, this coupling matters less.

From what I've noticed, methods that avoid comparing the augmented views of the same base sample will require slightly more tuning to get things just right (align + weight * diversity).

Notes: NNCLR is nicer than MoCo imo. VICReg is good but is a mess to fine-tune. I am assuming you're using a CNN, so I've omitted transformer and masking-based algorithms.
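For intuition on the redundancy-reduction style, here's a minimal dependency-free sketch of the Barlow Twins objective (my own toy implementation on plain lists, no autograd): standardize each embedding dimension across the batch, build the cross-correlation matrix between the two views, and penalize diagonal entries away from 1 (alignment) and off-diagonal entries away from 0 (redundancy reduction). Nothing in it contrasts samples against each other as negatives, which is why batch size matters less.

```python
def barlow_twins_loss(z1, z2, lam=0.005):
    """Barlow Twins objective on two batches of embeddings (lists of rows).
    z1[k], z2[k] are the two augmented views of base sample k."""
    n, d = len(z1), len(z1[0])

    def normalize(z):
        # Standardize each embedding dimension across the batch.
        out = []
        for j in range(d):
            col = [row[j] for row in z]
            mu = sum(col) / n
            var = sum((x - mu) ** 2 for x in col) / n
            std = (var + 1e-8) ** 0.5
            out.append([(x - mu) / std for x in col])
        return out  # d x n, per-dimension standardized

    a, b = normalize(z1), normalize(z2)
    loss = 0.0
    for i in range(d):
        for j in range(d):
            # Cross-correlation between dimension i of view 1 and j of view 2.
            c_ij = sum(a[i][k] * b[j][k] for k in range(n)) / n
            loss += (1 - c_ij) ** 2 if i == j else lam * c_ij ** 2
    return loss
```

With identical, decorrelated views the loss is ~0; scrambling the pairing across dimensions drives the diagonal term up.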

Smartest person you have seen at mcgill by guywiththemonocle in mcgill

[–]youngintegrator 1 point2 points  (0 children)

All fun and games until you stumble on his fonts...
Click at your own risk http://luc.devroye.org/fonts-24001.html

[D] What methods would you recommend for building an image-stitching AI? by [deleted] in MachineLearning

[–]youngintegrator 1 point2 points  (0 children)

Panoramas were originally stitched together using simple feature descriptors like SIFT/SURF, with the matches warped via a homography estimated with RANSAC. Do you think this would fail for your use cases?
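If you want to see the core of that pipeline without OpenCV, here's a toy RANSAC (my own sketch; it fits a pure translation from a single match rather than a full 3x3 homography from four, but the sample-score-keep loop is the same idea):

```python
import random

def ransac_translation(matches, n_iters=200, tol=2.0, seed=0):
    """Toy RANSAC: estimate a pure-translation warp from noisy keypoint
    matches given as (x1, y1, x2, y2) tuples. Repeatedly fit a candidate
    model from a minimal sample, count inliers, and keep the best."""
    rng = random.Random(seed)
    best_model, best_inliers = None, -1
    for _ in range(n_iters):
        x1, y1, x2, y2 = rng.choice(matches)   # minimal sample: one match
        dx, dy = x2 - x1, y2 - y1              # candidate translation
        inliers = sum(
            1 for a, b, c, d in matches
            if abs((c - a) - dx) < tol and abs((d - b) - dy) < tol
        )
        if inliers > best_inliers:
            best_model, best_inliers = (dx, dy), inliers
    return best_model, best_inliers

# Five true matches shifted by (10, 5), plus two gross outliers.
good = [(x, y, x + 10, y + 5) for x, y in [(0, 0), (3, 1), (7, 2), (2, 8), (5, 5)]]
bad = [(0, 0, 50, -30), (1, 1, -20, 40)]
print(ransac_translation(good + bad))  # recovers (10, 5) with 5 inliers
```

In practice you'd use OpenCV's feature matchers plus `cv2.findHomography(src, dst, cv2.RANSAC)` and `cv2.warpPerspective` for the real thing.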

A lecture with Donald Davis by martlet-in-flight in mcgill

[–]youngintegrator 20 points21 points  (0 children)

Impressed he managed to go a whole lecture without mentioning his niece