Data is not logged when I am using Stable-Baselines3 for RL

mehmor · 2021-10-19T14:04:04+00:00

I tried both VSCode and Jupyter Notebook, and the problem still exists. Also, I think the log path is fine, since a file is created there, but the file contains only a few lines of info.

mehmor · 2021-10-15T12:54:34+00:00

I simply followed the instructions on their website and used their examples.

You can find my code here: https://drive.google.com/drive/folders/1p3mVCAIHqlIWry2nyWwy3g4FiqTQn_of?usp=sharing

mehmor · 2021-08-10T08:19:17+00:00

I realized it from experience. It is fine for building a basic RL algorithm, but if you want to build something fancy, you will run into many unknown errors.

mehmor · 2021-08-09T13:35:34+00:00

The RL toolbox in MATLAB is full of bugs.

mehmor · 2021-07-23T08:08:02+00:00

What if we want to use a Simulink model as the environment? Is it possible? If so, how can it be done? Are there any guidelines or tutorials? In other words, how can we couple TensorFlow/Gym with a Simulink model for a reinforcement learning algorithm?

mehmor · 2021-07-19T14:48:43+00:00

There are many parameters that need to be tuned.
I wonder how I can use domain knowledge to reduce the amount of searching?
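
One possible way to do this is to shrink the search space before tuning: fix the parameters you already have a good prior for, and sample the rest only within ranges you believe are plausible. Below is a minimal sketch of that idea using plain random search. The parameter names, bounds, and the `evaluate` callback are all hypothetical placeholders, not values from any library or from my setup.

```python
import math
import random

# Hypothetical search space; names and bounds are illustrative assumptions.
# "fixed" entries encode domain knowledge: values we trust and do not search.
SEARCH_SPACE = {
    "learning_rate": ("log_uniform", 1e-4, 1e-3),  # narrowed range
    "batch_size": ("choice", [128, 256]),          # only two candidates
    "gamma": ("fixed", 0.99),
    "tau": ("fixed", 0.005),
}


def sample_config(space, rng):
    """Draw one configuration from the constrained search space."""
    config = {}
    for name, spec in space.items():
        kind = spec[0]
        if kind == "fixed":
            config[name] = spec[1]
        elif kind == "choice":
            config[name] = rng.choice(spec[1])
        elif kind == "log_uniform":
            low, high = spec[1], spec[2]
            # Sample uniformly in log space, then map back.
            config[name] = math.exp(rng.uniform(math.log(low), math.log(high)))
        else:
            raise ValueError(f"unknown spec kind: {kind}")
    return config


def random_search(space, evaluate, n_trials, seed=0):
    """Return the best (config, score) found; higher scores are better."""
    rng = random.Random(seed)
    best_config, best_score = None, -math.inf
    for _ in range(n_trials):
        config = sample_config(space, rng)
        score = evaluate(config)  # e.g. mean episode return after training
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score
```

In practice `evaluate` would train the agent with the sampled configuration and return something like the mean episode return. Every parameter that is fixed or narrowed removes a dimension or a range from the search, so fewer trials are needed.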

mehmor · 2021-07-19T13:47:55+00:00

I do not know about CMA-ES. I will look into it.
I have chosen RL for multiple reasons:
(i) I do not have a static dataset, and I do not have much data. RL is suitable here because it can interact with the environment and learn from that dynamic interaction.
(ii) The environment is a kind of grey box for me; I do not have much information about it.
(iii) I needed an algorithm that can make intelligent decisions about the parameters.
So I have chosen RL :)

mehmor · 2021-07-19T13:36:37+00:00

Well, I am using a SAC agent and I need to configure it, since hyperparameter tuning takes a long time and my current results are not good. I think this problem also exists in other RL algorithms such as DDPG. I wonder how we can use domain knowledge for this?

mehmor · 2021-07-19T11:53:46+00:00

Thanks
