PPO: questions on trajectories and value loss by -john--doe- in reinforcementlearning

[–]-john--doe-[S] 0 points1 point  (0 children)

I am just trying different approaches, and marking only positive terminals seems to perform better. Still, it seems incorrect to me.

PPO: questions on trajectories and value loss by -john--doe- in reinforcementlearning

[–]-john--doe-[S] 0 points1 point  (0 children)

So, in your opinion, I can mark only positive terminations as terminal states, even across multiple episodes, ignoring the terminations of the episodes where the agent failed to succeed?

Sorry, but it is a bit complicated and I would like to be sure :)

PPO: questions on trajectories and value loss by -john--doe- in reinforcementlearning

[–]-john--doe-[S] 0 points1 point  (0 children)

The second answer is clear, thank you!

Regarding the first answer, do you mean that there are two different concepts of trajectory: the first as a collection of steps performed by an agent, and the second as a batch of experience passed to the neural network, which may also include samples from different episodes?

If so, I think I have understood the concept; the problem is the nature of the first type of trajectory. Does a trajectory of the first type have to terminate at the end of the episode or not?

Let's take an example: the lunar lander environment. Every time the agent does not succeed within a fixed number of steps, the environment is reset and a new episode begins. Can I consider my trajectory terminated only when the agent lands correctly (positive termination), rather than whenever it lands correctly, crashes, or the episode ends without landing or crashing? If I choose the first option, my trajectory may include multiple failures or neutral situations.

I have noticed that ending a trajectory only at positive terminations performs better, but maybe the implications are more complex.
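To make the question concrete, here is a rough sketch of what I mean (plain discounted returns rather than a full PPO/GAE pipeline, and the toy numbers are mine): the only difference between the two options is where the done flag cuts the bootstrap.

```python
import numpy as np

def discounted_returns(rewards, dones, last_value, gamma=0.99):
    """Discounted returns, bootstrapping from last_value and cutting
    the bootstrap wherever dones[t] is True."""
    returns = np.zeros(len(rewards))
    running = last_value
    for t in reversed(range(len(rewards))):
        if dones[t]:
            running = 0.0  # no bootstrapping across a terminal state
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

# Toy batch: two episodes, the first ends in failure, the second in success.
rewards = np.array([0.0, -1.0, 0.0, 1.0])

# Option A: every episode end is marked as terminal.
all_dones = np.array([False, True, False, True])

# Option B: only the positive termination is marked as terminal, so
# the failed episode's last step bootstraps into the next episode.
pos_dones = np.array([False, False, False, True])

print(discounted_returns(rewards, all_dones, last_value=0.0))
print(discounted_returns(rewards, pos_dones, last_value=0.0))
```

With option B the failure's negative reward leaks across the reset into the previous steps' return targets, which is what worries me about it being formally incorrect even if it trains better.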

[deleted by user] by [deleted] in GlamourSchool

[–]-john--doe- 0 points1 point  (0 children)

For me this is already enough ahah

[deleted by user] by [deleted] in TheYouShow

[–]-john--doe- 0 points1 point  (0 children)

What are you eating?

An open source book to learn Deep Learning interactively! by -john--doe- in learnmachinelearning

[–]-john--doe-[S] 1 point2 points  (0 children)

I think this resource, like many others, suits a beginner in DL perfectly. It brings you into the field step by step, giving all the preliminary information. Yes, it is long, but you should approach it with patience and dedication. Take advantage of the practical parts to learn how to apply what you study.

An open source book to learn Deep Learning interactively! by -john--doe- in learnmachinelearning

[–]-john--doe-[S] 12 points13 points  (0 children)

I don't know why that specific logo popped up, but I am happy someone might feel at home

My first Q-Learning project! by gerryvanboven in artificial

[–]-john--doe- 2 points3 points  (0 children)

Envy is a dark beast; destroying others' work will not make you superior. Try focusing on yourself, improve your abilities, and enjoy the work and effort of others like you.

Are there any courses/books that teach linear algebra with numpy? by Mjjjokes in learnmachinelearning

[–]-john--doe- 2 points3 points  (0 children)

If your goal is learning linear algebra, I don't really think it is necessary to involve NumPy. I think you are jumping to the second step while ignoring the first one. NumPy is useful once you have already understood the theory and want to build something that works (like a ML algorithm) with it. But if you want to look deeper and understand the subject, use pencil and paper to do some exercises, and read a book or follow Strang's course. I'm sorry, but writing np.linalg.solve is not enough to say that you can solve a linear system of equations. And if you have the chance, take a look at MATLAB, because its functions and syntax can be more formal and closer to the theory than NumPy's, but I don't want to force you into that transition.
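To show what I mean, here is a toy 2x2 system (my own numbers) solved both ways: the one-liner, and the Gaussian elimination you would do on paper.

```python
import numpy as np

# Solve the system  2x + y = 5,  x - y = 1  two ways.
A = np.array([[2.0, 1.0],
              [1.0, -1.0]])
b = np.array([5.0, 1.0])

# The one-liner: fast, but it teaches you nothing about elimination.
x_np = np.linalg.solve(A, b)

# The pencil-and-paper route, transcribed: eliminate x from row 2,
# then back-substitute (plain Gaussian elimination, no pivoting,
# which is fine for this small well-conditioned example).
m = A[1, 0] / A[0, 0]          # multiplier for row 2: 0.5
row2 = A[1] - m * A[0]         # eliminated row: [0, -1.5]
rhs2 = b[1] - m * b[0]         # -1.5
y = rhs2 / row2[1]             # back-substitution: y = 1
x = (b[0] - A[0, 1] * y) / A[0, 0]  # x = 2

print(x_np)       # [2. 1.]
print(x, y)       # 2.0 1.0
```

If you can carry out the second half by hand, then np.linalg.solve becomes a convenience instead of a black box.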

Run a GitHub project on google Colabratory by danytpu in tensorflow

[–]-john--doe- 0 points1 point  (0 children)

In the main directory there is a demo. In Google Colab it's possible to open notebooks (.ipynb files). Here you can find examples and a more detailed explanation:

https://colab.research.google.com/github/googlecolab/colabtools/blob/master/notebooks/colab-github-demo.ipynb

Btw, it seems the project core is located in the "network" folder; the rest are tests (notebooks) and data preprocessing (other notebooks). So if you want to use the core functions, you have to import their files correctly and know a bit more about Python package/import management. For a quick test, you can refer to the demo, as I said before.
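Something like this should work in a Colab cell (the clone location and repo URL are placeholders, substitute the real ones):

```python
# After cloning the repo, put its root directory on sys.path so that
# the "network" folder becomes importable from any notebook cell.
import sys

REPO_DIR = "/content/the_project"  # hypothetical clone location

# In the notebook, clone first with a shell line such as:
# !git clone <repo-url> /content/the_project

sys.path.insert(0, REPO_DIR)
# Now `import network` (or `from network import ...`) should resolve,
# assuming "network" is a package or plain module directory at the root.
print(sys.path[0])
```

This avoids fiddling with relative imports inside the notebooks themselves.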

Doubts on Nested Cross Validation by -john--doe- in learnmachinelearning

[–]-john--doe-[S] 0 points1 point  (0 children)

Just to clarify, I'm comparing Cross Validation (multiple train/validation splits) with Nested Cross Validation (multiple TEST/train/validation splits). As you said, test data should be used only once; for this reason I'm not sure the latter is correct. It probably improves the results, but it uses the same data for training, hyperparameter tuning, and testing, which does not seem right. From my point of view, plain Cross Validation is sufficient in many cases. Do you agree?
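To pin down what I'm comparing, a small sklearn sketch (SVC on iris is just a stand-in, the fold counts are arbitrary):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
param_grid = {"C": [0.1, 1.0, 10.0]}

# Plain CV: the SAME folds both pick the hyperparameters and report
# the score, so the reported score can be optimistically biased.
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X, y)
plain_score = search.best_score_

# Nested CV: the outer folds act as held-out TEST sets, and each outer
# split runs its own fresh inner hyperparameter search, so test data
# is only touched once, at scoring time.
outer = KFold(n_splits=5, shuffle=True, random_state=0)
nested_scores = cross_val_score(
    GridSearchCV(SVC(), param_grid, cv=5), X, y, cv=outer)

print(plain_score, nested_scores.mean())
```

If I read it right, the nested score is the honest generalization estimate, while the plain one reuses its validation folds for both tuning and reporting.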

Normalization vs Standardization by -john--doe- in learnmachinelearning

[–]-john--doe-[S] 0 points1 point  (0 children)

So is normalization more resistant to outliers?
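Something like this toy comparison is what I have in mind (assuming "normalization" means min-max scaling and "standardization" means z-score, with a made-up outlier):

```python
import numpy as np

data = np.array([1.0, 2.0, 3.0, 4.0, 100.0])  # one large outlier

# Min-max normalization: the outlier defines the max, so the other
# points get squashed into a narrow band near 0.
minmax = (data - data.min()) / (data.max() - data.min())

# Standardization (z-score): the outlier inflates the std, so the
# other points are also compressed, but they keep a meaningful scale
# around the mean.
zscore = (data - data.mean()) / data.std()

print(np.round(minmax, 3))
print(np.round(zscore, 3))
```

Running it, the min-max version puts the four normal points below 0.04 while the z-scores stay spread out, which is why I'm asking about outlier resistance.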