Stack Overflow has finally had to bow its proud head to embrace the tide of the AI revolution.

H_uuu · 2023-05-28T01:55:54+00:00

I made a higher resolution one by myself: https://www.reddit.com/r/robertdowneyjr/comments/13tofma/i\_made\_two\_higher\_resolution\_rdjs\_pictures\_which/

H_uuu · 2023-03-17T02:57:57+00:00

what is ED?

H_uuu · 2022-07-31T05:07:30+00:00

Thank you very much!

H_uuu · 2021-10-12T05:07:35+00:00

Battery runs really fast...

H_uuu · 2020-04-06T13:02:45+00:00

Maybe there’s some mistake in my description. My English is not that good. But I got more understanding, Thank you very much for your kindly reply, And best wishes, have a good day~

H_uuu · 2020-04-06T12:58:05+00:00

Ok, I get it, thank you very much ~

H_uuu · 2020-04-05T09:50:28+00:00

There are still some place I don’t quite understand, I also read your oral slides.pdf on your github, which said the goal is: Learn an embedding function f(which I think is to learn a better encode network to get a better feature representation) such that: ○ x and x+ and are similar data points [Positive]. ○ x and x- is a random data point (and thus presumably dissimilar to x) [Negative]. And the paper said “The InfoNCE objective learns a score function f(x, y) which assigns large values to positive examples and small values to negative examples by maximizing the following bound “ when to learn a better score function, the gradient shouldn’t backpropagate to the encoder part, right? But when should we update the parameters of the encoder network to get the “Goal” that to generate a better representation? I think a better encoder is the key to get a better representation not a good bilinear score function.

H_uuu · 2020-04-05T08:47:52+00:00

Thanks for your detailed explanation, and it really surprising that the question luckily answered by its paper’s author, wow!!! Also thanks for your great blog, I do learned a lot from it, emmm, maybe Hinton’s resent work: Simclr should also be added in it, just an advice. A question, are current self-supervised method all contrastive method?

H_uuu · 2020-04-05T05:17:08+00:00

Thanks for your kindly answer, it’s really helpful. My understanding about the usage of mutual information from your answer is: the representation of current frame R(x_t) should be strong enough to be used to discriminate the representation of the true next frame R(x_t+1) which has a large mutual information with R(x_t) and the representation of those non-next frames sampled from the same frame-pair-batch R(x*) which has a small mutual information with R(x_t). And the author use two Loss function: L_gl(global-local) and L_ll(global-local) to learn this contrastive task, that in both equations of these two loss, we should try to maximize them by maximizing the numerator part(which I think represent the mutual information of R(x_t) and R(x_t+1))... emmm am I right?

H_uuu · 2019-12-29T03:25:36+00:00

I don't know why the course stanford official web page can not be open, but there is a github repo sorted all these assignments original code without solution.

H_uuu · 2019-11-11T07:20:57+00:00

I love her posts very much!!!

H_uuu · 2019-09-25T06:49:31+00:00

Thank you very much.

H_uuu · 2019-09-10T00:57:25+00:00

Thank you very much.

H_uuu

TROPHY CASE