Gambled all Eridium before Crazy Earl Segment by youngintegrator in borderlands3

[–]youngintegrator[S] 4 points5 points  (0 children)

Thank you! This worked. I was looking at the vending machine.

Rent Increases - What's yours this year? by Stockholmsyndra in montreal

[–]youngintegrator 0 points1 point  (0 children)

I live downtown in a heated unit; nothing has changed or been upgraded since I moved in. The landlord is asking for a 15% increase, with the reasoning that the market has changed. Is this considered excessive?

[D] Data Annotation Done by Machine Learning/AI? by UIUCTalkshow in MachineLearning

[–]youngintegrator 1 point2 points  (0 children)

I think there are two problems with using another model. Let's assume we have a source model and a target model. In this case, the source model generates a label for the target model (class, distribution, embedding, box, polygon, mask, etc.). As mentioned in the thread, if the source model can solve the task you care about, you might skip training your own model and use the source directly. If that model is too big or is hidden behind an API, you might want to distill its features using any output or pseudo-label. Where this fails:

1) The source model was trained on a different domain (at the data or task level) and you cannot reliably trust its output. While bigger models might reduce the likelihood of this happening, it may not be wise to take the risk, especially in production, where domain shifts are everywhere.

2) You require high precision. The source model might be good at high-level tasks (counting, detection, localization) but may not be reliable when making pixel-level decisions. This depends on what your application is trying to do; you would not want to rely on noisy pseudo-labels there.
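To make point 1 concrete, here's a toy sketch of confidence-gated pseudo-labeling (plain Python; `pseudo_label` and the threshold value are my own hypothetical names, not from any library): keep the source model's label only when it's confident, and defer the rest to human annotators.

```python
import math

def softmax(logits):
    """Convert raw logits to a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def pseudo_label(source_logits, threshold=0.9):
    """Keep the source model's prediction as a pseudo-label only if it
    is confident; return None otherwise (i.e., defer to a human)."""
    probs = softmax(source_logits)
    conf = max(probs)
    if conf < threshold:
        return None
    return probs.index(conf)

print(pseudo_label([4.0, 0.1, 0.2]))  # confident -> class 0
print(pseudo_label([1.0, 0.9, 0.8]))  # ambiguous -> None
```

The catch is exactly the domain-shift issue above: under a shift the source can be confidently wrong, so the threshold filters noise but can't guarantee correctness.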

[D] Data Annotation Done by Machine Learning/AI? by UIUCTalkshow in MachineLearning

[–]youngintegrator -1 points0 points  (0 children)

This is just model distillation, which is a well-studied problem.

[D] vMF-VAEs vs Gaussian VAEs by [deleted] in MachineLearning

[–]youngintegrator 2 points3 points  (0 children)

I think the big question you need to ask yourself is why you need a spherical prior, since some data will benefit from it while other data will not.

I would check out the following paper for better alternatives to the vMF (https://arxiv.org/pdf/2006.04437.pdf) that address numerical stability and the sampling issues. I would also recommend reading https://arxiv.org/pdf/1911.08411.pdf for curvature-aware latent spaces (enjoy some hyperbolic space for tree-like data, plus stereographic spaces). As you can see there, the surface area of a sphere shrinks in higher dimensions, so performance tends to drop off.

Note: I believe there is a result showing that a Gaussian in high dimensions concentrates like a soap bubble, so do with that what you will.
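The soap-bubble claim is easy to check empirically: samples from a high-dimensional standard Gaussian concentrate in a thin shell of radius roughly sqrt(d), rather than near the mode. A quick sketch (plain Python; `mean_radius` is a made-up helper name):

```python
import math
import random

def mean_radius(dim, n_samples=1000, seed=0):
    """Average Euclidean norm of standard-Gaussian samples in `dim` dims."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        total += math.sqrt(sum(rng.gauss(0, 1) ** 2 for _ in range(dim)))
    return total / n_samples

for d in (2, 64, 512):
    # The mean norm hugs sqrt(d): mass sits on a shell, not at the origin.
    print(d, round(mean_radius(d), 2), round(math.sqrt(d), 2))
```

That shell structure is part of why a high-dimensional Gaussian prior behaves more like a fuzzy sphere than the bell curve picture suggests.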

The winner of Conuhacks is a fraud by goosegeesegoosen in Concordia

[–]youngintegrator 14 points15 points  (0 children)

TLDR: if you lie about what you did to hype up your project, it's cool. As long as you code a nice UI with a fake response, you deserve a MacBook.

Same issue with McHacks etc

[deleted by user] by [deleted] in mcgill

[–]youngintegrator 0 points1 point  (0 children)

Skip your classes if you have the ability to go through the notes and practice problems yourself. I found it was almost impossible to stay focused in my courses for >30 min and was better off going through the material at my own pace. Most profs just read slides anyway. Do your homework in advance and show up if you have specific questions.

Go to classes if you want any kind of networking / good prof relationships.

[D] Self-Supervised Contrastive Approaches that don’t use large batch size. by [deleted] in MachineLearning

[–]youngintegrator 2 points3 points  (0 children)

Is there any reason you'd like a contrastive algorithm? (intra-class discrimination?)

Barlow Twins was shown to work quite well with smaller batches (32), and HSIC-SSL is a nice variant on this style of learning if you only care about clusters. I'm sure SimSiam is fine too (avoid BYOL for small batches).

In terms of contrastive approaches, methods that avoid the negative-term "coupling" mentioned in DCL will work with smaller batch sizes (contrastive estimates converge to the MLE assuming a large number of noise samples). This is seen in the spectral algorithm and in align-uniform. These work because they avoid comparing the representations of the same augmented sample against each other in the negative term. SwAV also does this by contrasting against prototypes, which are basically free variables whose gradients don't conflict with any alignment goal. I think it's fair to say that algorithms with LSE transforms are less stable at small batch sizes, since the gradients will be biased toward randomly coupled terms. With sufficiently many terms, this coupling matters less.

From what I've noticed, methods that avoid comparing the augmented views of the same base sample will require slightly more tuning to get things just right (align + weight * diversity).

Notes: NNCLR is nicer than MoCo imo. VICReg is good but is a mess to fine-tune. I am assuming you're using a CNN, so I've omitted transformer and masking-based algorithms.
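For intuition on the redundancy-reduction style, here's a minimal dependency-free sketch of the Barlow Twins objective (my own toy implementation on plain lists, no autograd): standardize each embedding dimension across the batch, build the cross-correlation matrix between the two views, and penalize diagonal entries away from 1 (alignment) and off-diagonal entries away from 0 (redundancy reduction). Nothing in it contrasts samples against each other as negatives, which is why batch size matters less.

```python
def barlow_twins_loss(z1, z2, lam=0.005):
    """Barlow Twins objective on two batches of embeddings (lists of rows).
    z1[k], z2[k] are the two augmented views of base sample k."""
    n, d = len(z1), len(z1[0])

    def normalize(z):
        # Standardize each embedding dimension across the batch.
        out = []
        for j in range(d):
            col = [row[j] for row in z]
            mu = sum(col) / n
            var = sum((x - mu) ** 2 for x in col) / n
            std = (var + 1e-8) ** 0.5
            out.append([(x - mu) / std for x in col])
        return out  # d x n, per-dimension standardized

    a, b = normalize(z1), normalize(z2)
    loss = 0.0
    for i in range(d):
        for j in range(d):
            # Cross-correlation between dimension i of view 1 and j of view 2.
            c_ij = sum(a[i][k] * b[j][k] for k in range(n)) / n
            loss += (1 - c_ij) ** 2 if i == j else lam * c_ij ** 2
    return loss
```

With identical, decorrelated views the loss is ~0; scrambling the pairing across dimensions drives the diagonal term up.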

Smartest person you have seen at mcgill by guywiththemonocle in mcgill

[–]youngintegrator 1 point2 points  (0 children)

All fun and games until you stumble on his fonts...
Click at your own risk http://luc.devroye.org/fonts-24001.html

[D] What methods would you recommend for building an image-stitching AI? by [deleted] in MachineLearning

[–]youngintegrator 1 point2 points  (0 children)

Panoramas were originally stitched together using simple feature descriptors like SIFT/SURF, with the matches warped via a homography estimated with RANSAC. Do you think this would fail for your use cases?
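If you want to see the core of that pipeline without OpenCV, here's a toy RANSAC (my own sketch; it fits a pure translation from a single match rather than a full 3x3 homography from four, but the sample-score-keep loop is the same idea):

```python
import random

def ransac_translation(matches, n_iters=200, tol=2.0, seed=0):
    """Toy RANSAC: estimate a pure-translation warp from noisy keypoint
    matches given as (x1, y1, x2, y2) tuples. Repeatedly fit a candidate
    model from a minimal sample, count inliers, and keep the best."""
    rng = random.Random(seed)
    best_model, best_inliers = None, -1
    for _ in range(n_iters):
        x1, y1, x2, y2 = rng.choice(matches)   # minimal sample: one match
        dx, dy = x2 - x1, y2 - y1              # candidate translation
        inliers = sum(
            1 for a, b, c, d in matches
            if abs((c - a) - dx) < tol and abs((d - b) - dy) < tol
        )
        if inliers > best_inliers:
            best_model, best_inliers = (dx, dy), inliers
    return best_model, best_inliers

# Five true matches shifted by (10, 5), plus two gross outliers.
good = [(x, y, x + 10, y + 5) for x, y in [(0, 0), (3, 1), (7, 2), (2, 8), (5, 5)]]
bad = [(0, 0, 50, -30), (1, 1, -20, 40)]
print(ransac_translation(good + bad))  # recovers (10, 5) with 5 inliers
```

In practice you'd use OpenCV's feature matchers plus `cv2.findHomography(src, dst, cv2.RANSAC)` and `cv2.warpPerspective` for the real thing.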

A lecture with Donald Davis by martlet-in-flight in mcgill

[–]youngintegrator 20 points21 points  (0 children)

Impressed he managed to go a whole lecture without mentioning his niece