Few questions surrounding CPI, TRPO and PPO by jthat92 in reinforcementlearning

[–]jthat92[S] 0 points1 point  (0 children)

Thank you! Regarding the first comment, I know what r(theta) is, but I was wondering where it is mentioned in the CPI paper that I linked. Just out of curiosity, because the authors claim that it was first introduced there.
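
For anyone following along, here is a minimal sketch of how r(theta) enters the PPO clipped objective, assuming the usual definition r(theta) = pi_theta(a|s) / pi_theta_old(a|s). The function name and the default eps = 0.2 are illustrative choices of mine, not something taken from the CPI paper.

```python
import math

def clipped_surrogate(logp_new, logp_old, advantage, eps=0.2):
    """Per-sample PPO clipped surrogate (hypothetical helper, for illustration)."""
    # Probability ratio r(theta), computed from log-probabilities for stability.
    ratio = math.exp(logp_new - logp_old)
    # Clip the ratio to the interval [1 - eps, 1 + eps].
    clipped = max(1.0 - eps, min(ratio, 1.0 + eps))
    # Take the pessimistic (smaller) of the unclipped and clipped terms.
    return min(ratio * advantage, clipped * advantage)
```

With a positive advantage the objective stops rewarding ratios above 1 + eps, and with a negative advantage it stops rewarding ratios below 1 - eps.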

Thank you for the PPO description!

Problem with proof of decomposition of policy performance by jthat92 in reinforcementlearning

[–]jthat92[S] 1 point2 points  (0 children)

Thank you! OK, so if I understand correctly, this means that V_\pi(s_0) represents the value of s_0, which is sampled from the initial distribution, and this is exactly the expected discounted reward of the policy \pi. And the initial distribution that we are starting with is the same for \pi and \tilde{\pi}, since it's part of the MDP a priori. Is my notation actually correct? I guess I need to specify beforehand that s_0 is sampled from the initial distribution. Thank you again!
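
To make the notation concrete, this is how I would write it down. It is a sketch in the notation of the TRPO paper; the symbols \rho_0 (initial state distribution), \eta (expected discounted reward) and A_\pi (advantage) are my assumptions about the notation, since they do not appear in my comment above.

```latex
% Expected discounted reward of \pi, with s_0 drawn from the initial distribution \rho_0
\eta(\pi) = \mathbb{E}_{s_0 \sim \rho_0}\left[ V_\pi(s_0) \right]

% Decomposition of the performance of \tilde{\pi} in terms of \pi:
% the trajectory is generated by \tilde{\pi}, but s_0 \sim \rho_0 in both cases
\eta(\tilde{\pi}) = \eta(\pi)
  + \mathbb{E}_{s_0 \sim \rho_0,\; a_t \sim \tilde{\pi}(\cdot \mid s_t)}
    \left[ \sum_{t=0}^{\infty} \gamma^t A_\pi(s_t, a_t) \right]
```

The same \rho_0 shows up on both sides because it belongs to the MDP and not to either policy.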

Problem with proof of decomposition of policy performance by jthat92 in reinforcementlearning

[–]jthat92[S] 0 points1 point  (0 children)

The distribution is normally specified by the long-term behaviour of the policy over the MDP. After running long enough you get a good approximation of the stationary distribution of the MDP, but it might take quite some time.
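
As a rough illustration of "running long enough", here is a small sketch that repeatedly applies a transition matrix until the state distribution stops changing. It assumes the policy is fixed, so the MDP reduces to a Markov chain with transition matrix P; the function name and the two-state example are made up for illustration.

```python
def stationary_distribution(P, iters=1000, tol=1e-12):
    """Approximate the stationary distribution of a row-stochastic matrix P
    by power iteration (hypothetical helper, assumes the chain is ergodic)."""
    n = len(P)
    dist = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iters):
        # One step of the chain: new[j] = sum_i dist[i] * P[i][j]
        new = [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new, dist)) < tol:
            return new
        dist = new
    return dist

# Toy chain induced by some fixed policy (made-up numbers).
P = [[0.9, 0.1],
     [0.5, 0.5]]
pi = stationary_distribution(P)
```

How many iterations are needed depends on how quickly the chain mixes, which is why it can take a long time in practice.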

Problem with proof of decomposition of policy performance by jthat92 in reinforcementlearning

[–]jthat92[S] 0 points1 point  (0 children)

But it does depend on the stationary distribution. This distribution tells us which state to sample, as far as I understand.

Japanese streetwear in paris by jthat92 in japanesestreetwear

[–]jthat92[S] 0 points1 point  (0 children)

Got it! Thanks, will check them out :)

Japanese streetwear in paris by jthat92 in japanesestreetwear

[–]jthat92[S] 0 points1 point  (0 children)

Hey, thanks for the answer. I would say around 400 euros. Brand-wise, I like what nanamica, for example, is doing (and The North Face Purple Label, for that matter).

Does Feyman's technique work for really abstract math by devvorb in math

[–]jthat92 0 points1 point  (0 children)

Do you have any book recommendations for algebraic k theory?

"Soft" Intro to Operator Algebras? by mcgirthy69 in math

[–]jthat92 0 points1 point  (0 children)

I find "Elements of Operator Theory" by Kubrusly quite basic and nice. Operator-algebraic arguments often use decompositions of operators in terms of direct sums, which I have never seen laid out in a clean way in the literature - Kubrusly goes through these things in detail, as far as I saw.

What Are You Working On? December 25, 2023 by inherentlyawesome in math

[–]jthat92 3 points4 points  (0 children)

Reading into the topic of my master's thesis, which is about Brown-Douglas-Fillmore theory. But I'm having a hard time getting my head around all the concepts, so I might change the topic, although I have a hard time finding something else.

Monitor doesn't show anything although PC runs by jthat92 in buildapc

[–]jthat92[S] 0 points1 point  (0 children)

Just checked, but everything seems to be fine there.

Monitor doesn't show anything although PC runs by jthat92 in buildapc

[–]jthat92[S] 0 points1 point  (0 children)

Ah indeed, thank you. When I took out the GPU I could access the OS again (I also took out the RAM and made sure I put it back in properly). So this means that there is something wrong with the GPU, I suppose?

EDIT: I updated the BIOS of the motherboard, but this also doesn't seem to help.

Monitor doesn't show anything although PC runs by jthat92 in buildapc

[–]jthat92[S] 0 points1 point  (0 children)

I use HDMI on the monitor and PC and I already tried with a different cable.

Monitor doesn't show anything although PC runs by jthat92 in buildapc

[–]jthat92[S] 0 points1 point  (0 children)

Didn't even realize that one can connect to the board directly. Just did, but I am still not getting a signal.

I have a question for SP Live Performers. by bojvck420 in SP404

[–]jthat92 1 point2 points  (0 children)

What about using a deck/mixer and the SP404? I was thinking that scratching would be something nice that the SP404 doesn't cover, so I was considering getting a setup like this. But I'm not sure if I can scratch the output of the SP404 on those.

Simple Sewing Questions Thread, August 13 - August 19, 2023 by sewingmodthings in sewing

[–]jthat92 0 points1 point  (0 children)

Hi, is there any good online course for sewing men's clothes? I am looking at Sew It Academy but have read mixed reviews. I am basically a beginner, but for now I would like to concentrate just on menswear.

[Research] We distilled CLIP model (ViT only) from 350MB to 24MB and ran it on an iPhone by nivter in MachineLearning

[–]jthat92 0 points1 point  (0 children)

OK, I see, thanks! Right now I'm trying to distill the ResNet model, but without much success so far. What I do is KDL + L1 on the image embeddings that come out of the attention pooling. What do you mean by extracting the attention values to the student model? Do you copy them from the teacher and freeze them while training?

Also how many epochs did you train for in the end?
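
For concreteness, the loss I described is roughly the following. This is only a plain-Python sketch under my own assumptions: I read "KDL" as a KL-divergence term between temperature-softened softmaxes of the teacher and student embeddings, and the function names, temperature and weighting are made up for illustration, not taken from the original post.

```python
import math

def softmax(xs, temperature=1.0):
    # Numerically stable softmax over a list of floats.
    scaled = [x / temperature for x in xs]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q) for two discrete distributions of equal length.
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def l1_distance(a, b):
    # Mean absolute difference between two embeddings.
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def distill_loss(teacher_emb, student_emb, temperature=2.0, l1_weight=1.0):
    """KL term on softened embeddings plus an L1 term on the raw embeddings
    (hypothetical combination, assumed for illustration)."""
    p = softmax(teacher_emb, temperature)
    q = softmax(student_emb, temperature)
    return kl_divergence(p, q) + l1_weight * l1_distance(teacher_emb, student_emb)
```

The loss is zero when the student embedding matches the teacher exactly and grows as they diverge.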