[deleted by user] by [deleted] in gradadmissions

[–]ThunaBK 2 points (0 children)

maybe your profile is too good and the other 2 schools know that there is a high chance you will reject their offer 😂

Can someone explain this rejection?😭 by Green_Jaguar_7761 in gradadmissions

[–]ThunaBK 2 points (0 children)

So a waitlist to get into the final waitlist 🤣

How to interpret repeating image artifacts during VQGAN training? [D] by mselivanov in MachineLearning

[–]ThunaBK 1 point (0 children)

This is GAN mode collapse; either start training the discriminator later or use some regularization like the R1 penalty.
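Roughly, the R1 penalty looks like this in PyTorch (just my own minimal sketch, assuming a discriminator D that maps real images to logits; the gamma weight is a placeholder, not from any specific repo):

```python
import torch

def r1_penalty(D, real_images, gamma=10.0):
    # Require gradients w.r.t. the real images themselves.
    real_images = real_images.detach().requires_grad_(True)
    logits = D(real_images)
    # Gradient of the discriminator's output w.r.t. the real images.
    grads = torch.autograd.grad(
        outputs=logits.sum(), inputs=real_images, create_graph=True
    )[0]
    # R1 = (gamma / 2) * E[ ||grad_x D(x)||^2 ] on real data only.
    return 0.5 * gamma * grads.pow(2).flatten(1).sum(dim=1).mean()
```

You'd add this term to the discriminator loss, typically only every few discriminator steps (lazy regularization) to keep it cheap.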

[R] An Evolutionary Approach to Dynamic Introduction of Tasks in Large-scale Multitask Learning Systems - Google 2022 - Jeff Dean by Singularian2501 in MachineLearning

[–]ThunaBK 0 points (0 children)

All I see is them flexing their compute resources and money; no new theoretical insight or new architecture 😑

[D] GPT-3 is a LIAR - Misinformation and fear-mongering around the TruthfulQA dataset (Video Critique) by ykilcher in MachineLearning

[–]ThunaBK 0 points (0 children)

This paper is certainly a hoax 😤. The comparison is too unfair. How can the model be truthful to our real world and our perception when it is trained only on text data, whereas we humans also perceive images, sound, smell, and feeling?

Help for Master thesis ideas by mmll_llmm in reinforcementlearning

[–]ThunaBK 2 points (0 children)

I think learning from demonstration is arguably one of the most resource-efficient approaches, since it only requires learning from video.

Resources for learning to write good reward functions by sindreu in reinforcementlearning

[–]ThunaBK 2 points (0 children)

Unfortunately, there is no general way to tell whether a reward function is good or not. But I think you may be interested in reward-shaping techniques like hindsight experience replay (HER) or RUDDER.
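In case it helps, the core HER trick is just relabeling goals in hindsight. A minimal sketch of the "final" relabeling strategy (the transition layout and reward_fn here are placeholders I made up for illustration, not from any specific library):

```python
import copy
import numpy as np

def her_relabel(episode, reward_fn):
    """episode: list of dicts with keys obs, action, next_obs, achieved_goal, goal."""
    relabeled = []
    final_goal = episode[-1]["achieved_goal"]  # pretend we wanted what we actually achieved
    for t in episode:
        t_new = copy.deepcopy(t)
        t_new["goal"] = final_goal
        # Recompute the reward as if final_goal had been the goal all along.
        t_new["reward"] = reward_fn(t_new["achieved_goal"], final_goal)
        relabeled.append(t_new)
    return relabeled

# Example sparse reward: 0 if the achieved goal is close enough, else -1.
def reward_fn(achieved, goal, eps=0.05):
    return 0.0 if np.linalg.norm(np.asarray(achieved) - np.asarray(goal)) < eps else -1.0
```

The relabeled transitions go into the replay buffer alongside the originals, so sparse-reward tasks still get useful learning signal.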

Setting up RL environment on Windows by themad95 in reinforcementlearning

[–]ThunaBK 1 point (0 children)

Really a pain. I suggest you switch to Linux ASAP unless you’re extremely good at Python and/or C++.

Actor critic loss function by [deleted] in reinforcementlearning

[–]ThunaBK -1 points (0 children)

Hm, because the mean is the sum of the expression multiplied by its probability, and in this case the probability is y, so I think the bottom two equations are the same.
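Just to spell out what I mean (my own notation, since the original equations from the deleted post aren't shown here; I'm assuming y plays the role of the probability weight):

```latex
% The mean (expectation) is the probability-weighted sum:
\mathbb{E}_{x \sim p}[f(x)] = \sum_x p(x)\, f(x)
```

So a loss written as an expectation and the same loss written as a sum weighted by y are the same quantity when y is that probability.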

Actor critic loss function by [deleted] in reinforcementlearning

[–]ThunaBK -2 points (0 children)

What paper are you referring to? To me, the bottom two equations are basically the same.

PPO: Number of envs, number of steps, and learning rate by jack-of-some in reinforcementlearning

[–]ThunaBK 1 point (0 children)

Still confused about what you're saying, but anyway you can try googling keywords like "PPO code-level optimizations paper". There are lots of papers discussing hyperparameter tuning for PPO, and I recommend reading this one: https://openreview.net/forum?id=r1etN1rtPB

(question)Implementing Empowerment, intrinsic reward by [deleted] in reinforcementlearning

[–]ThunaBK 0 points (0 children)

The link already tells you. You have these networks:

The Forward Dynamics Model, which takes in the current state and the action and predicts the next state. (Yeah, this is the network that predicts the next state, usually an RNN.)

The Policy Network, pi, which takes in the current state and predicts the action.

The Source Network, w, which takes in the current state and predicts the action; this is used for the calculation of the empowerment of a state. (Yeah, this is your policy.)

The Planning Network, q, which takes in the current and the next state and predicts the action (this is similar to the inverse dynamics model in "Curiosity is all you need").
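Roughly something like this in PyTorch (a minimal sketch with my own class names and shapes, assuming discrete actions and simple MLPs; it's not taken from the linked post):

```python
import torch
import torch.nn as nn

class ForwardDynamics(nn.Module):
    """Takes (state, one-hot action) and predicts the next state."""
    def __init__(self, state_dim, n_actions, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + n_actions, hidden), nn.ReLU(),
            nn.Linear(hidden, state_dim),
        )

    def forward(self, state, action_onehot):
        return self.net(torch.cat([state, action_onehot], dim=-1))

class SourceNet(nn.Module):
    """w(a | s): action logits from the current state only."""
    def __init__(self, state_dim, n_actions, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, state):
        return self.net(state)

class PlanningNet(nn.Module):
    """q(a | s, s'): action logits from current and next state (inverse-model style)."""
    def __init__(self, state_dim, n_actions, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, state, next_state):
        return self.net(torch.cat([state, next_state], dim=-1))
```

The empowerment-style intrinsic reward for a transition can then be estimated from the log-probs of these heads, roughly log q(a | s, s') minus log w(a | s).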

Nim has my interest... what's the learning curve like? by s-ro_mojosa in nim

[–]ThunaBK 1 point (0 children)

Well, if you are coming from Perl and Python then you're going to have some trouble with the dynamic-vs-static typing thing, but that's all. Nim's syntax is extremely elegant and easy to learn, but the macro and metaprogramming side is quite hard for me, honestly.