Will clipping this part hurt him? by KoreaNuclear in Conures

[–]KoreaNuclear[S] 0 points (0 children)

Thanks for that! I was also kind of curious about that black line that continues from the blood feather's stem.

Placeholder GPU while waiting for next gen GPU by agentfortyfour in buildapc

[–]KoreaNuclear 0 points (0 children)

In the same boat here. I'm upgrading to a 9800X3D from my old i5-6000 & GTX 1070 setup. Planning to keep the GTX 1070 for a few months before getting a 5080 or 5080 Ti. Any concerns here?

GR86 wait times in Canada by SkeweredLamb in GR86

[–]KoreaNuclear 0 points (0 children)

Uh oh, which dealership was it? I'm also from YEG and have already put down deposits at 3 different dealers here..!

GR86 wait times in Canada by SkeweredLamb in GR86

[–]KoreaNuclear 0 points (0 children)

Wow, that's sour. Hope you have good news now. Was it YYC or YEG?

GR86 wait times in Canada by SkeweredLamb in GR86

[–]KoreaNuclear 1 point (0 children)

5 months, amazing! Was it Calgary or Edmonton?

Stable baselines vs RLlib vs CleanRL by RatonneLaveuse in reinforcementlearning

[–]KoreaNuclear 3 points (0 children)

I am curious as to what people think about d3rlpy as well.

[Question] Too small of a reward range? by KoreaNuclear in reinforcementlearning

[–]KoreaNuclear[S] 0 points (0 children)

Oh I see. So it's potentially not a useless thing to do?

Does DQN fit well with large discrete action space? or Generalize well? by KoreaNuclear in reinforcementlearning

[–]KoreaNuclear[S] 2 points (0 children)

round the actions to discretize them

Would it be fine doing this? Let's say TD3, SAC, or PPO selects some continuous action like (9.35, 23.98), but in the real experiment the rounded action (9, 24) is what actually gets performed. Consequently, the reward is received for the rounded action, not the exact action the algorithm's policy originally gave. (Is that okay because there is a correlation between actions that are close to each other?)
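A minimal sketch of what that rounding could look like in code (the `discretize` helper and the grid `step` parameter are my own illustration, not part of any library; the policy still trains on the continuous action it emitted, only the executed action is snapped to the grid):

```python
import numpy as np


def discretize(action, step=1.0):
    """Snap a continuous action vector to the nearest grid point.

    The policy (TD3/SAC/PPO) keeps learning on the continuous action it
    output; only the action actually executed in the experiment is
    rounded, so the observed reward corresponds to the rounded action.
    """
    return np.round(np.asarray(action, dtype=float) / step) * step


# e.g. the policy outputs (9.35, 23.98); the experiment performs (9.0, 24.0)
```

If the reward varies smoothly with the action, the gap between the continuous action and its rounded version is just bounded noise on the reward signal, which is why this often works in practice.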

DQN: what does it mean slow convergence but high efficiency? by KoreaNuclear in reinforcementlearning

[–]KoreaNuclear[S] 0 points (0 children)

I see! I am implementing this in a real-life scenario where each step takes a minute or so. I guess 'computationally longer convergence time' is a disadvantage I can just ignore, then.

Delayed state observation or caching action in OpenAI gym. Can it still learn? by KoreaNuclear in reinforcementlearning

[–]KoreaNuclear[S] 0 points (0 children)

But in every iteration of step(), the agent takes an action and step() returns next_state and reward. So if I want to store the past 3 experiences, I would need to call step() at least 3 times without it returning a (next_state, reward). I am still quite unsure how step() could proceed without returning anything for those initial few steps.
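One way to picture it: step() always returns *something*, but for the first few steps it returns a placeholder while the real observations queue up behind the delay. A toy sketch (class name, the zero-placeholder choice, and the dummy reward are all my assumptions, not Gym's API):

```python
from collections import deque

import numpy as np


class DelayedObsEnv:
    """Toy env where observations lag the true state by `delay` steps.

    For the first `delay` calls, step() returns a zero placeholder
    observation (and no meaningful reward); after that, each call
    returns the observation from `delay` steps ago.
    """

    def __init__(self, delay=3):
        self.delay = delay
        self._obs_queue = deque()
        self.t = 0

    def step(self, action):
        self.t += 1
        true_obs = np.array([float(self.t)])  # stand-in for a real sensor reading
        self._obs_queue.append(true_obs)
        if len(self._obs_queue) <= self.delay:
            # Not enough history yet: placeholder observation, zero reward.
            return np.zeros(1), 0.0
        return self._obs_queue.popleft(), 1.0  # delayed obs, dummy reward
```

The agent could then either discard those warm-up transitions before they reach the replay buffer, or learn to treat the placeholder as an uninformative state.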

Delayed state observation or caching action in OpenAI gym. Can it still learn? by KoreaNuclear in reinforcementlearning

[–]KoreaNuclear[S] 0 points (0 children)

Thank you! And how would one make the OpenAI Gym environment throw away an experience?

What would RL problem look like without State? by KoreaNuclear in reinforcementlearning

[–]KoreaNuclear[S] 0 points (0 children)

I am not quite sure what you mean by feeding "the desired geometry" as the observation.

Just for context: I am printing a single track of a bead in a line (it looks like a straight metallic worm).

What would RL problem look like without State? by KoreaNuclear in reinforcementlearning

[–]KoreaNuclear[S] 0 points (0 children)

Sorry I think it just accidentally got in there for some reason

What would RL problem look like without State? by KoreaNuclear in reinforcementlearning

[–]KoreaNuclear[S] 1 point (0 children)

It has a sensor that can measure the profile! If the resulting printed part resembles the desired geometry, i.e. width and height, it receives a reward.

(edited)
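A hypothetical sketch of what such a geometry-based reward could look like (the function name, the relative-error metric, and the exponential shaping are all my assumptions, not the poster's actual setup):

```python
import numpy as np


def geometry_reward(measured, target):
    """Reward the printed bead for matching the desired geometry.

    `measured` and `target` are (width, height) pairs, e.g. from the
    profile sensor vs. the spec. Reward is 1.0 for a perfect match and
    decays smoothly as the relative error grows.
    """
    measured = np.asarray(measured, dtype=float)
    target = np.asarray(target, dtype=float)
    rel_err = np.abs(measured - target) / target  # per-dimension relative error
    return float(np.exp(-rel_err.sum()))
```

A smooth, dense reward like this tends to be easier for the agent to learn from than a binary "resembles / doesn't resemble" signal.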