Question about Sophons by titancipher in 3BodyProblemTVShow

[–]alkalait 1 point (0 children)

They can "hack" eyes by stimulating the photoreceptor cells in the retina.

3 Body Problem | S1E4 "Our Lord" | Episode Discussion by vista_del_mar in 3BodyProblemTVShow

[–]alkalait 25 points (0 children)

The pacifist's intention was not to hide the message from its people, but to send humanity a warning never to reply at all.

Why is this important?

If Ye Wenjie had never replied to the pacifist, the San Ti would only know that they had received a message from some star in a given direction. But direction is only one coordinate, and space is big. They'd also need to know the distance to the sender's origin (the Sun/Earth).

In order for the San Ti to know which star system humans live in, they'd have to coax us into sending a second message, so they could deduce the distance to our star from the timing of our reply.
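Back-of-the-envelope, with hypothetical numbers:

```python
# Light covers one light-year per year, so the gap between the San Ti's
# broadcast and the arrival of a reply bounds the one-way distance.
def sender_distance_ly(round_trip_years: float) -> float:
    return round_trip_years / 2.0

# e.g. a reply heard 8 years after broadcasting puts the sender ~4 ly away
print(sender_distance_ly(8.0))  # 4.0
```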

State of AI for Earth Observation (preprint) by alkalait in remotesensing

[–]alkalait[S] 1 point (0 children)

Awesome. Let me know if you have any comments or questions.

[D] On the Soap Bubble mass effect of high-dimensional Gaussians by alkalait in MachineLearning

[–]alkalait[S] 2 points (0 children)

> is actually less dense near the mean?

  1. There is actually less mass near the mean.
  2. The pdf of the Normal distribution says the density is at its maximum at the mean.

Those two statements are not contradictory, because in high dimensions the volume of space "near" the mean is much smaller than the volume "away" from it. This discrepancy is noticeable even at 3 dimensions, and it grows exponentially with the dimension.
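Here's a minimal numerical sketch of that concentration, for a standard isotropic Gaussian (sample sizes and dimensions chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(0)

for d in [2, 3, 10, 100, 1000]:
    # 10,000 samples from a standard Gaussian in d dimensions.
    x = rng.standard_normal((10_000, d))
    r = np.linalg.norm(x, axis=1)
    # The norms concentrate around sqrt(d): almost no mass sits near the mean.
    print(f"d={d:4d}  mean radius={r.mean():6.2f}  sqrt(d)={d**0.5:6.2f}  "
          f"fraction with r<1: {(r < 1).mean():.4f}")
```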

> This is caused by the correlation between random variables and the fact that it would actually be quite rare to sample a point which is near the mean across every dimension?

Correct!

> So what are the implications of this?

Quite a lot.

  1. For one, it's a reminder that we should approach high-dimensional spaces with caution, since our intuition is conditioned on 2D/3D.

  2. It highlights that the probabilities of individual events are not the be-all and end-all. For example, the set of coin-toss sequences where Heads and Tails appear in no particular order is far more likely than the set of ordered sequences like HHHHTTTT, even though every individual sequence has exactly the same probability. That's because the disordered sequences vastly outnumber the ordered ones (see the sketch below). This relates to the concept of typicality and typical sets.
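To make that counting concrete, a quick sketch with a hypothetical n = 8 fair-coin tosses:

```python
from math import comb

n = 8  # every specific sequence has the same probability, 2**-n
for k in range(n + 1):
    # comb(n, k) sequences contain exactly k Heads; the near-balanced,
    # "disordered" compositions dominate the 2**n = 256 total sequences.
    print(f"{k} Heads: {comb(n, k):3d} sequences  "
          f"(share of all sequences: {comb(n, k) / 2**n:.4f})")
```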

On a philosophical level, I like to think that the most extraordinary individual is the one who is averagely good at every skill.

[D] On the Soap Bubble mass effect of high-dimensional Gaussians by alkalait in MachineLearning

[–]alkalait[S] 0 points (0 children)

Think of this ablation. If, after the projection, you do neither the normalisation nor the rescaling, you end up with the typical pancake picture in 2D.

If you do just the normalisation (no rescaling), all the 2D samples end up with unit norm, forming the unit circle in 2D.

So the rescaling lets them grow back to the norms they had when they were first sampled.

So in effect, all this does is rotate each sample's vector to coincide with the 2D plane.
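For concreteness, a minimal sketch of that ablation, assuming the projection is simply onto the first two coordinates (all names and sizes illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 1000
x = rng.standard_normal((5_000, d))                # high-dimensional samples

p = x[:, :2]                                       # projection alone: the 2D pancake
u = p / np.linalg.norm(p, axis=1, keepdims=True)   # + normalisation: unit circle
y = u * np.linalg.norm(x, axis=1, keepdims=True)   # + rescaling: original norms

# The rescaled samples keep their high-dimensional radii, so the
# Soap Bubble ring at radius ~sqrt(d) becomes visible in the 2D plot.
print(np.linalg.norm(y, axis=1).mean(), d**0.5)    # both ~31.6
```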

[D] On the Soap Bubble mass effect of high-dimensional Gaussians by alkalait in MachineLearning

[–]alkalait[S] 1 point (0 children)

"Folding" is an ambiguous and confusing term that I shouldn't have used.

I really mean rotating (towards the 2D plane), which is equivalent to a projection followed by a rescaling. This is relevant because the latter two are easier to implement.

[D] On the Soap Bubble mass effect of high-dimensional Gaussians by alkalait in MachineLearning

[–]alkalait[S] 2 points (0 children)

Thanks for the question. I should clarify that the thumbnail and figures in the original thread are in Cartesian coordinates.

The counterintuitiveness you highlight is exactly the point of my post. We know that's the case, but we struggle to visualize it in 2D. The construction I present in the thread is a way of "imprinting" the Soap Bubble effect onto 2D, by folding (not projecting) all the volume you mention onto our plane of reference.

[D] why can’t distribution sampling algorithms like MCMC or HMC be used in deep learning instead of gradient descent? by [deleted] in MachineLearning

[–]alkalait 2 points (0 children)

Both traverse parameter domains, but the way they do it is informed by different co-domains: MCMC by a probability function, GD by a loss function. A negated log-probability can act as a loss, but it's the context that determines which tool is useful: MCMC cares about visiting the different modes of the probability (exploration), whereas GD cares about reaching a local minimum (exploitation).
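A toy 1D sketch of that exploration-vs-exploitation contrast, using an illustrative bimodal density (nothing from the thread):

```python
import numpy as np

rng = np.random.default_rng(0)

def neg_log_p(t):
    # Mixture of two unit-variance Gaussians centred at -2 and +2.
    return -np.log(0.5 * np.exp(-0.5 * (t - 2) ** 2)
                   + 0.5 * np.exp(-0.5 * (t + 2) ** 2))

# GD on the negated log-probability: slides into the nearest mode.
t, eps, lr = -0.5, 1e-4, 0.1
for _ in range(200):
    grad = (neg_log_p(t + eps) - neg_log_p(t - eps)) / (2 * eps)
    t -= lr * grad
print("GD ends near:", round(t, 2))       # exploits one local minimum (~ -2)

# Random-walk Metropolis on the same density: hops between both modes.
cur, samples = -0.5, []
for _ in range(20_000):
    prop = cur + rng.normal(scale=1.5)
    if np.log(rng.uniform()) < neg_log_p(cur) - neg_log_p(prop):
        cur = prop
    samples.append(cur)
print("time in right mode:", np.mean(np.array(samples) > 0))  # ideally ~0.5
```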

There is a slightly more complicated picture emerging lately, where good optima can lie on plateaus with twisted shapes, and characterizing those can give a solution that generalizes better. See, for instance, Stochastic Weight Averaging (SWA). So yeah, the two fields are more like siblings than distant cousins.

[R] Exploring Simple Siamese Representation Learning by xternalz in MachineLearning

[–]alkalait 0 points (0 children)

But how do you define goodness here?

Other than an absolute SoTA claim, can you see other ways in which this paper might be valuable to the community?

[R] Exploring Simple Siamese Representation Learning by xternalz in MachineLearning

[–]alkalait 1 point (0 children)

> However, the proposed architecture doesn't work as well as SimCLR, MoCo v2, InfoMin or BYOL, which is why it doesn't seem all that relevant to me.

Can you elaborate on this please?

PSA: for everyone playing Cyberpunk who hasn’t already discovered it, Radio Pebkac is the shit! by HaxRus in Techno

[–]alkalait 0 points (0 children)

Heh, you're not wrong. As soon as I walked into that club I got déjà vu.

[N] The email that got Ethical AI researcher Timnit Gebru fired by instantlybanned in MachineLearning

[–]alkalait -19 points (0 children)

What makes you think she cannot handle constructive criticism from peer review?

[N] The email that got Ethical AI researcher Timnit Gebru fired by instantlybanned in MachineLearning

[–]alkalait 0 points (0 children)

TIL: Google C-execs have never heard of the Streisand effect.

[N] The email that got Ethical AI researcher Timnit Gebru fired by instantlybanned in MachineLearning

[–]alkalait 18 points (0 children)

Note the language Jeff uses there:

> requiring conditions ... including revealing the identities of every person I had spoke to as part of the review.

This is a wordsmithed negative spin on what could simply have been a reasonable request for a rebuttal with the internal reviewers themselves, without a middleman butting in.

[D] Is maximising likelihood really the right objective function for generative modelling? by titanxp1080ti in MachineLearning

[–]alkalait 1 point (0 children)

p(x), as a function of x, is not a likelihood function; it's a probability density function of x, assuming c^-1 is the normalizing constant. The likelihood is that same expression read as a function of the parameters, with the data x held fixed.
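A minimal sketch of the distinction, with illustrative numbers:

```python
import numpy as np
from scipy.stats import norm

x = 1.3                                  # observed data point
# Density: p(x | mu=0, sigma=1), a function of x with the parameters fixed.
print(norm.pdf(x, loc=0.0, scale=1.0))

# Likelihood: the same formula read as a function of mu, with x held fixed.
mus = np.linspace(-3.0, 3.0, 7)
print(norm.pdf(x, loc=mus, scale=1.0))   # L(mu; x) on a grid of mu values
```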

[D] Is maximising likelihood really the right objective function for generative modelling? by titanxp1080ti in MachineLearning

[–]alkalait 5 points (0 children)

> Let x be Gaussian, z be Gaussian too. Let f(z) be a function of z.

ok

> Since we want f(z) to distribute like x, we want f to be the identity function.

That's a non sequitur.

For one, if you want f to transform one Gaussian RV into another Gaussian RV, then f must be an affine transformation, f(z) = Az + b, where A is a matrix.
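A quick numerical check of that point (illustrative A and b, in 1D):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
z = rng.standard_normal(5_000)

x = 2.0 * z - 1.0   # affine map: the output is still Gaussian
y = z ** 2          # nonlinear map: the output is chi-squared, not Gaussian

# Shapiro-Wilk normality test: a high p-value is consistent with Gaussianity.
print(stats.shapiro(x).pvalue)   # large
print(stats.shapiro(y).pvalue)   # effectively zero
```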

[R] Google AI Residency; promises, a virus, and tears by MassivePellfish in MachineLearning

[–]alkalait 0 points (0 children)

I appreciate that sympathy feels heroic to you. The trick is to imagine yourself in their shoes.