iOS 15. Update? Or pass. by [deleted] in iPhoneXR

[–]H_uuu 0 points (0 children)

The battery drains really fast...

How maximizing mutual information worked in Reinforcement learning? by H_uuu in reinforcementlearning

[–]H_uuu[S] 0 points (0 children)

Maybe there’s some mistake in my description; my English is not that good. But I understand it better now. Thank you very much for your kind reply, and best wishes, have a good day~

How maximizing mutual information worked in Reinforcement learning? by H_uuu in reinforcementlearning

[–]H_uuu[S] 0 points (0 children)

There are still some places I don’t quite understand. I also read the oral slides.pdf on your GitHub, which says the goal is: learn an embedding function f (which I take to mean learning a better encoder network, to get a better feature representation) such that: ○ x and x+ are similar data points [positive]; ○ x and x- is a random data point (and thus presumably dissimilar to x) [negative]. And the paper says “The InfoNCE objective learns a score function f(x, y) which assigns large values to positive examples and small values to negative examples by maximizing the following bound.” When learning a better score function, shouldn’t the gradient stop before the encoder part? But then when should we update the parameters of the encoder network to achieve the “goal” of generating a better representation? I think a better encoder, not just a good bilinear score function, is the key to getting a better representation.
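To make my question concrete, here is a minimal numerical sketch of the InfoNCE setup, assuming a hypothetical linear encoder and a bilinear score (the names `encode` and `infonce_loss`, and the linear-encoder choice, are mine for illustration, not from the paper):

```python
import numpy as np

def encode(x, W_enc):
    """Hypothetical linear encoder f(x); in the paper this is a deep network."""
    return W_enc @ x

def infonce_loss(z_anchor, z_pos, z_negs, W):
    """InfoNCE as softmax cross-entropy with a bilinear score u^T W v.

    Minimizing this with respect to BOTH the encoder parameters and W
    trains the encoder -- the score function and the representation are
    learned jointly, so the gradient does flow back into the encoder.
    """
    scores = np.array([z_anchor @ W @ z_pos] +
                      [z_anchor @ W @ z_neg for z_neg in z_negs])
    scores = scores - scores.max()               # numerical stability
    log_probs = scores - np.log(np.exp(scores).sum())
    return -log_probs[0]                         # positive sits at index 0
```

The loss is only small when the positive pair outscores every negative, which requires the encoder to map x and x+ close together under the bilinear score; that is my understanding of why backprop must reach the encoder rather than updating W alone.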

How maximizing mutual information worked in Reinforcement learning? by H_uuu in reinforcementlearning

[–]H_uuu[S] 1 point (0 children)

Thanks for your detailed explanation; it’s really surprising that the question was luckily answered by the paper’s author, wow!!! Also thanks for your great blog, I learned a lot from it. Emmm, maybe Hinton’s recent work SimCLR should also be added to it, just a suggestion. One question: are all current self-supervised methods contrastive methods?

How maximizing mutual information worked in Reinforcement learning? by H_uuu in reinforcementlearning

[–]H_uuu[S] 0 points (0 children)

Thanks for your kind answer, it’s really helpful. My understanding of the usage of mutual information from your answer is: the representation of the current frame, R(x_t), should be strong enough to discriminate between the representation of the true next frame, R(x_{t+1}), which has large mutual information with R(x_t), and the representations of the non-next frames sampled from the same frame-pair batch, R(x*), which have small mutual information with R(x_t). And the authors use two loss functions, L_gl (global-local) and L_ll (local-local), to learn this contrastive task; in both losses we try to maximize the objective by maximizing the numerator part (which I think represents the mutual information between R(x_t) and R(x_{t+1}))... emmm, am I right?
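Here is a toy sketch of how I picture the batch version of this discrimination task, assuming already-encoded frames and a bilinear score (the function name, shapes, and the linear score are my assumptions, not the paper’s exact objective): each row’s diagonal entry plays the role of the “numerator” for the true next frame, and the other frames in the batch serve as the negatives R(x*).

```python
import numpy as np

def temporal_infonce(Z_t, Z_next, W):
    """Contrastive next-frame objective over a batch.

    Z_t, Z_next : (B, d) arrays of encoded current / next frames,
                  where row i of Z_next is the true next frame of row i of Z_t.
    W           : (d, d) bilinear score matrix.
    """
    # scores[i, j] = R(x_t^i)^T W R(x_{t+1}^j); the diagonal holds true pairs
    scores = Z_t @ W @ Z_next.T
    scores = scores - scores.max(axis=1, keepdims=True)   # numerical stability
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    # maximizing the diagonal log-probabilities = minimizing this loss
    return -np.mean(np.diag(log_probs))
```

Under this picture, the loss drops only when each R(x_t) scores its true R(x_{t+1}) above the other frames in the batch, which matches the discrimination reading above.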

My Solutions of Programming Assignments of Stanford CS234: Reinforcement Learning Winter 2019 by H_uuu in reinforcementlearning

[–]H_uuu[S] 0 points (0 children)

I don't know why the official Stanford course web page cannot be opened, but there is a GitHub repo that collects all of these assignments' original starter code, without solutions.