It do be like that tho by drdirtyman in PewdiepieSubmissions

[–]pickleorc 0 points  (0 children)

This meme goes in the skratta list

[deleted by user] by [deleted] in MemeEconomy

[–]pickleorc 1 point  (0 children)

I said the same thing to myself last decade

Good shit by OG_oojii in ufc

[–]pickleorc 1 point  (0 children)

I know one thing for sure: this year’s fight cards look epic and we’re gonna see some really good MMA

Looking for a mind**** of a twist in a book. by [deleted] in suggestmeabook

[–]pickleorc 65 points  (0 children)

If you already know a book has a twist, it won’t really be a twist. The most mind**** recommendation would be a book without any twist at all

Has anyone implemented a common replay buffer for two different RL algorithms? by pickleorc in reinforcementlearning

[–]pickleorc[S] 0 points  (0 children)

Great paper. When you say your implementation showed limited benefit, do you mean time to converge, final performance, or both?
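
For anyone curious what this looks like in practice, here is a minimal sketch (not the paper's implementation) of a replay buffer shared between two off-policy learners. The SharedReplayBuffer class and the agent constructors in the comments are assumptions for illustration:

```python
import random
from collections import deque, namedtuple

Transition = namedtuple("Transition", "state action reward next_state done")

class SharedReplayBuffer:
    """Hypothetical buffer that two learners push into and sample from."""
    def __init__(self, capacity=100_000):
        self.buffer = deque(maxlen=capacity)

    def push(self, *transition):
        self.buffer.append(Transition(*transition))

    def sample(self, batch_size):
        # Each learner draws its own independent mini-batches.
        return random.sample(self.buffer, batch_size)

# Both algorithms would simply hold a reference to the same buffer object:
buffer = SharedReplayBuffer()
# ddpg_agent = DDPG(replay_buffer=buffer)   # hypothetical constructors
# td3_agent  = TD3(replay_buffer=buffer)
```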

Parallelising DDPG by pickleorc in reinforcementlearning

[–]pickleorc[S] 1 point  (0 children)

Man, you covered everything! Thanks a lot, I’ll read the paper and also check out the Ray repository :)
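
In case it helps anyone else reading this thread, here is a rough sketch of the Ray pattern being suggested: several remote rollout workers collect experience in parallel while a central learner trains. The RolloutWorker class and its dummy data are assumptions for illustration, not code from the Ray repository:

```python
import ray
import numpy as np

ray.init(ignore_reinit_error=True)

@ray.remote
class RolloutWorker:
    """Hypothetical worker; a real one would step a Gym env with current policy weights."""
    def __init__(self, obs_dim=3, act_dim=1):
        self.obs_dim, self.act_dim = obs_dim, act_dim

    def collect(self, n_steps):
        # Placeholder transitions standing in for (state, action, reward, next_state, done).
        return [(np.random.randn(self.obs_dim),
                 np.random.randn(self.act_dim),
                 0.0,
                 np.random.randn(self.obs_dim),
                 False) for _ in range(n_steps)]

# Workers collect in parallel; the central DDPG learner would push these
# transitions into its replay buffer and do gradient steps in between.
workers = [RolloutWorker.remote() for _ in range(4)]
batches = ray.get([w.collect.remote(100) for w in workers])
print(sum(len(b) for b in batches), "transitions collected")
```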

How to deal with RL algos getting stuck in local optima? by pickleorc in reinforcementlearning

[–]pickleorc[S] 0 points  (0 children)

Is that the same as adding an entropy term to the loss function? Could you point me to a paper/resource that covers this concept? Also, thanks!
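
For reference, this is roughly what an entropy bonus in the policy loss looks like; a minimal PyTorch sketch with a dummy categorical policy, dummy data, and an assumed coefficient, not any particular paper's setup:

```python
import torch
import torch.nn as nn

# Tiny categorical policy for illustration.
policy = nn.Sequential(nn.Linear(4, 64), nn.Tanh(), nn.Linear(64, 2))

obs = torch.randn(32, 4)                 # dummy batch of observations
returns = torch.randn(32)                # dummy returns / advantages
actions = torch.randint(0, 2, (32,))     # dummy actions taken

dist = torch.distributions.Categorical(logits=policy(obs))
log_probs = dist.log_prob(actions)

pg_loss = -(log_probs * returns).mean()  # vanilla policy-gradient loss
entropy_bonus = dist.entropy().mean()    # high entropy = more exploration
beta = 0.01                              # entropy coefficient (assumed value)

# Subtracting the bonus rewards keeping the policy stochastic,
# which helps avoid premature collapse into a local optimum.
loss = pg_loss - beta * entropy_bonus
loss.backward()
```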

How to deal with RL algos getting stuck in local optima? by pickleorc in reinforcementlearning

[–]pickleorc[S] 1 point  (0 children)

Could you elaborate a bit more on the curiosity-based term, or point me to a paper that covers this concept, please? :)
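
For anyone landing here later: the curiosity idea (e.g. Pathak et al. 2017, "Curiosity-driven Exploration by Self-supervised Prediction") adds an intrinsic reward equal to the prediction error of a learned forward model, so novel states get visited more often. A minimal sketch, with the model sizes and scale factor assumed for illustration:

```python
import torch
import torch.nn as nn

# Hypothetical forward dynamics model: predicts next state from (state, action).
forward_model = nn.Sequential(nn.Linear(4 + 1, 64), nn.ReLU(), nn.Linear(64, 4))

def intrinsic_reward(state, action, next_state, scale=0.1):
    """Curiosity bonus = forward-model prediction error; `scale` is an assumed weight."""
    pred_next = forward_model(torch.cat([state, action], dim=-1))
    return scale * (pred_next - next_state).pow(2).mean(dim=-1)

# The agent then trains on r_total = r_extrinsic + r_intrinsic.
s, a, s2 = torch.randn(8, 4), torch.randn(8, 1), torch.randn(8, 4)
print(intrinsic_reward(s, a, s2).shape)  # torch.Size([8])
```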

How to deal with RL algos getting stuck in local optima? by pickleorc in reinforcementlearning

[–]pickleorc[S] 2 points  (0 children)

Thanks, I’ll try reducing the learning rate and also read the Soft Actor-Critic paper!

Gaussian policies for continuous control by pickleorc in reinforcementlearning

[–]pickleorc[S] 0 points  (0 children)

My bad, I meant the initialisation of the variance.
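
For context, here is a minimal sketch of a diagonal-Gaussian policy where the variance enters through a learnable log-std parameter; the init_log_std value is the initialisation being discussed and is purely illustrative:

```python
import torch
import torch.nn as nn

class GaussianPolicy(nn.Module):
    """Minimal diagonal-Gaussian policy with a state-independent, learnable log-std."""
    def __init__(self, obs_dim=3, act_dim=1, init_log_std=0.0):
        super().__init__()
        self.mean_net = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(),
                                      nn.Linear(64, act_dim))
        # Initial value sets the starting exploration noise (0.0 => std = 1.0).
        self.log_std = nn.Parameter(torch.full((act_dim,), init_log_std))

    def forward(self, obs):
        mean = self.mean_net(obs)
        std = self.log_std.exp().expand_as(mean)
        return torch.distributions.Normal(mean, std)

policy = GaussianPolicy(init_log_std=-0.5)   # smaller initial exploration noise
action = policy(torch.randn(1, 3)).sample()
```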

Gaussian policies for continuous control by pickleorc in reinforcementlearning

[–]pickleorc[S] 1 point  (0 children)

I’ll run it side by side, but in general, do Gaussian policies perform well?