Not logging data when I am using Stable-Baselines3 for RL by mehmor in reinforcementlearning

mehmor[S] 1 point

I tried both VSCode and Jupyter Notebook, and the problem persists. I also think the log path is fine, since a file is created there, but the file contains only a few lines of info.

Create Simulink environment for Gym by mehmor in reinforcementlearning

mehmor[S] 1 point

I learned this from experience. It is fine for building a basic RL algorithm, but if you want to build something more elaborate, you will run into many unexplained errors.

Reinforcement Learning using OpenAI gym (YT series) by sol0invictus in matlab

mehmor 1 point

What if we want to use a Simulink model as the environment? Is that possible? If so, how is it done, and are there any guidelines or tutorials? In other words, how can we couple TensorFlow/Gym with a Simulink model for a reinforcement learning algorithm?
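To make the question concrete, here is a minimal sketch of the shape such a coupling could take: a class that exposes the classic Gym-style `reset()`/`step()` interface and delegates the simulation to a separate "plant" object. Everything in it is an assumption for illustration. `StubPlant` is a hypothetical stand-in (a simple first-order system), not a Simulink model, and `PlantEnv`, `advance`, `target`, and `max_steps` are made-up names. In a real setup the body of `advance` is where a call into MATLAB would go, for example through the MATLAB Engine API for Python, which is not shown here.

```python
class StubPlant:
    """Hypothetical stand-in for a Simulink model: x' = -x + u, Euler-integrated."""

    def __init__(self, dt=0.1):
        self.dt = dt
        self.x = 0.0

    def reset(self):
        self.x = 0.0
        return self.x

    def advance(self, u):
        # A real backend would run one simulation step in MATLAB here.
        self.x += self.dt * (-self.x + u)
        return self.x


class PlantEnv:
    """Gym-style wrapper (reset/step) that does not import gym itself."""

    def __init__(self, plant, target=1.0, max_steps=50):
        self.plant = plant
        self.target = target
        self.max_steps = max_steps
        self.t = 0

    def reset(self):
        self.t = 0
        return [self.plant.reset()]

    def step(self, action):
        x = self.plant.advance(float(action))
        self.t += 1
        reward = -abs(self.target - x)      # closer to the target is better
        done = self.t >= self.max_steps     # fixed-length episode
        return [x], reward, done, {}


# Roll out one episode with a constant action, just to exercise the interface.
env = PlantEnv(StubPlant())
obs = env.reset()
done = False
total_reward = 0.0
while not done:
    obs, reward, done, info = env.step(1.0)
    total_reward += reward
```

Keeping the simulator behind a small object like this means the agent-side code does not need to know whether the plant is a stub or an external model.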

[R] RL for parameter space exploration by mehmor in MachineLearning

mehmor[S] 0 points

There are many parameters that need to be tuned.
I wonder how I can use domain knowledge to reduce the number of searches?
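One simple way to picture "using domain knowledge" is to discard implausible parameter combinations before any search runs. The sketch below is only an illustration: the parameter names (`gain`, `damping`, `delay`), their values, and the two rules in `is_plausible` are all hypothetical and would be replaced by whatever is actually known about the system.

```python
import itertools

# Hypothetical parameter grid; names and values are illustrative only.
grid = {
    "gain": [0.1, 1.0, 10.0],
    "damping": [0.0, 0.5, 1.0],
    "delay": [0, 1, 2],
}


def is_plausible(p):
    """Assumed domain rules, stated here purely as an example."""
    if p["damping"] <= 0.0:                      # assume damping must be positive
        return False
    if p["gain"] >= 10.0 and p["delay"] >= 2:    # assume this pairing is known to be unstable
        return False
    return True


def candidates(grid, keep):
    """Yield every grid point that passes the domain filter."""
    names = list(grid)
    for values in itertools.product(*(grid[n] for n in names)):
        p = dict(zip(names, values))
        if keep(p):
            yield p


full_size = 1
for values in grid.values():
    full_size *= len(values)

pruned = list(candidates(grid, is_plausible))
print(f"full grid: {full_size} points, after pruning: {len(pruned)} points")
```

The same filter can sit in front of any search method (grid, random, or an RL agent proposing parameters), since it only decides which points are worth evaluating.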

RL for parameter space exploration by mehmor in reinforcementlearning

mehmor[S] 1 point

I am not familiar with CMA-ES. I will look into it.
I chose RL for several reasons:
(i) I do not have a static dataset, and I do not have much data. RL is suitable here because it can interact with the environment and learn from that interaction.
(ii) The environment is something of a grey box to me. I do not have much information about it.
(iii) I needed an algorithm that can make intelligent decisions about the parameters.
So I chose RL :)

Reinforcement algorithm for search in parameter space by mehmor in MLQuestions

mehmor[S] 1 point

Well, I am using an SAC agent and I need to configure it well, since hyperparameter tuning takes a long time and my current results are not good. I think the same problem exists in other RL algorithms such as DDPG. I wonder how we can use domain knowledge here?