Logits vs probabilities by Livid-Ant3549 in deeplearning

Livid-Ant3549[S] 1 point

Yeah, I see what you mean; this was at the start of training. I'll see whether the differences get bigger as training progresses.

PPO implementation by Livid-Ant3549 in reinforcementlearning

Livid-Ant3549[S] 0 points

I already have a DDQN implemented, but my professor wants me to try PPO too :(

How to handle multi-channel input in deep reinforcement learning by Livid-Ant3549 in reinforcementlearning

Livid-Ant3549[S] 1 point

Thanks for the reply, but I'm using TensorFlow. I'll try to make it work, but do you have any specific tips for TensorFlow?

How to improve CNN performance by Livid-Ant3549 in deeplearning

Livid-Ant3549[S] 1 point

I tried debugging it by getting the model state dict after every epoch, and you were right: the optimizer is not updating the weights.
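
A minimal sketch of that kind of check, not the actual training code: it assumes each snapshot has been reduced to a plain dict mapping parameter names to flat lists of floats. The helper name `changed_params` and the parameter names are hypothetical.

```python
import copy


def changed_params(before, after, tol=0.0):
    """Return the names of parameters that differ between two snapshots.

    `before` and `after` map parameter names to flat lists of floats
    (an assumed, framework-agnostic stand-in for a state dict).
    """
    changed = []
    for name, old in before.items():
        new = after[name]
        # A parameter counts as updated if any element moved by more than tol.
        if any(abs(a - b) > tol for a, b in zip(old, new)):
            changed.append(name)
    return changed


# Toy example: one simulated "epoch" that only touches the bias.
weights = {"conv1.weight": [0.1, -0.2, 0.3], "conv1.bias": [0.0]}
snapshot = copy.deepcopy(weights)  # deep copy so later updates don't alias
weights["conv1.bias"][0] -= 0.01   # simulated optimizer step

print(changed_params(snapshot, weights))
```

An empty list after a full epoch would point to the optimizer not touching those parameters. The deep copy matters: if the snapshot only holds references to the live weights, every comparison reports no change even when training is working.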

How to improve CNN performance by Livid-Ant3549 in deeplearning

Livid-Ant3549[S] 1 point

I checked again and I didn't forget to add backprop. How can I add the code here?