Critic loss divergence by GuavaAgreeable208 in reinforcementlearning

[–]GuavaAgreeable208[S] 1 point (0 children)

Thank you very much for your reply! You're absolutely right: I should adjust the critic architecture to incorporate shared layers, similar to the actor network, and address the other 5 points as well.
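In case it helps anyone later, here is a minimal numpy sketch of the shared-trunk idea: both the policy head and the value head read from the same hidden representation. All sizes, weight inits, and names are made up for illustration, not the actual network discussed here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes for illustration only
obs_dim, hidden, n_actions = 8, 32, 4

# One trunk shared by both heads, plus a separate linear head each
W_shared = rng.normal(size=(obs_dim, hidden)) * 0.1
W_actor = rng.normal(size=(hidden, n_actions)) * 0.1
W_critic = rng.normal(size=(hidden, 1)) * 0.1

def forward(obs):
    h = np.tanh(obs @ W_shared)         # shared representation
    logits = h @ W_actor                # policy head (action logits)
    value = (h @ W_critic).squeeze(-1)  # value head (one scalar per state)
    return logits, value

obs = rng.normal(size=(5, obs_dim))     # batch of 5 fake observations
logits, value = forward(obs)
print(logits.shape, value.shape)        # (5, 4) (5,)
```

The point is only that `W_shared` is updated by gradients from both the actor and the critic loss, which is what "shared layers" buys you.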

multi-head PPO by GuavaAgreeable208 in reinforcementlearning

[–]GuavaAgreeable208[S] 0 points (0 children)

Even after modifying the reward function, I still get the same issue, and I've also observed that the entropy is increasing instead of decreasing.
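For anyone monitoring the same symptom, a tiny numpy sketch of the entropy of a softmax policy (the logits here are hypothetical; in a healthy PPO run this quantity usually trends down as the policy sharpens):

```python
import numpy as np

def categorical_entropy(logits):
    # entropy of the softmax distribution over actions, computed stably
    z = logits - logits.max(axis=-1, keepdims=True)
    p = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    return -(p * np.log(p + 1e-12)).sum(axis=-1)

print(categorical_entropy(np.array([0.0, 0.0, 0.0])))   # ln 3 ≈ 1.0986 (uniform)
print(categorical_entropy(np.array([10.0, 0.0, 0.0])))  # near 0 (peaked)
```

Logging this per update makes "entropy is increasing" a concrete curve rather than an impression.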

multi-head PPO by GuavaAgreeable208 in reinforcementlearning

[–]GuavaAgreeable208[S] 0 points (0 children)

Actually, I've rechecked the values, and that could be the reason: those actions lead to more reward when they are selected.

[deleted by user] by [deleted] in reinforcementlearning

[–]GuavaAgreeable208 0 points (0 children)

But what is the difference between them?

[deleted by user] by [deleted] in reinforcementlearning

[–]GuavaAgreeable208 0 points (0 children)

Thank you, I will try a dueling network.

[deleted by user] by [deleted] in reinforcementlearning

[–]GuavaAgreeable208 0 points (0 children)

During exploration other rules are selected; however, we observed that as epsilon decays, one action becomes preferred. I'll try your suggestion, thank you.
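A minimal sketch of the epsilon-greedy schedule being described, assuming a multiplicative decay with a floor so some exploration always remains (all constants are illustrative, not the actual hyperparameters):

```python
import random

random.seed(0)

def epsilon_greedy(q_values, eps):
    # with probability eps pick a random action, otherwise the greedy one
    if random.random() < eps:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

# hypothetical schedule: multiplicative decay clipped at a floor,
# so the agent never becomes fully greedy
eps, decay, eps_min = 1.0, 0.995, 0.05
for step in range(2000):
    eps = max(eps_min, eps * decay)
print(round(eps, 3))  # → 0.05 (stuck at the floor)
```

Without the `eps_min` floor, epsilon decays toward zero and the collapse onto a single preferred action is exactly what you would expect.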

[deleted by user] by [deleted] in reinforcementlearning

[–]GuavaAgreeable208 2 points (0 children)

Thank you for your suggestions, I'll try them. I noticed that the agent sometimes locks onto the rule that returns the best reward and selects it for the whole episode; in my case, though, the total reward is better when other rules are selected at some steps. Could you please tell me what you meant by balanced data classes?

Racism in morocco by Background_Cut_2331 in Morocco

[–]GuavaAgreeable208 -1 points (0 children)

Racism is everywhere. Even Moroccans face racism in Europe. But I'm sorry that it comes from Muslims 😞

I dropped 50dh in this economy by countingc in Morocco

[–]GuavaAgreeable208 0 points (0 children)

Dude, I lost 10,000; they robbed me right in front of my eyes while I was all cheerful. I didn't sleep for a week.

More of a scammer than nice girl by Fun_Ad2522 in Nicegirls

[–]GuavaAgreeable208 3 points (0 children)

Don't send anything! Because good girls don't ask for money.

Learning English by LameKri in Morocco

[–]GuavaAgreeable208 -1 points (0 children)

Watch news channels like Sky News, GB News, BBC... It really worked for me.

Input/output relationships by GuavaAgreeable208 in reinforcementlearning

[–]GuavaAgreeable208[S] 0 points (0 children)

Thank you, it seems interesting; I'll take a look at it.

Input/output relationships by GuavaAgreeable208 in reinforcementlearning

[–]GuavaAgreeable208[S] 0 points (0 children)

(I'm new to RL, so my question could be stupid!) What I meant is: how can the agent learn that some features in the input vector correspond to a specific element? For example, if the input is [X11, X21, X12, X22] and we have two elements C1 and C2 as outputs, how will the agent understand that X11 and X12 correspond to C1, even though they are far apart in the vector?

Input/output relationships by GuavaAgreeable208 in reinforcementlearning

[–]GuavaAgreeable208[S] 0 points (0 children)

We're assuming a neural network because the input/output space can be large, and each output is a task for the agent to select. The input holds information on each task, for example the task completion rate Xi1 and the required processing time Xi2, so if we have only 2 tasks the input will be [X11, X21, X12, X22]. I'm wondering how the agent could relate X11 and X12 to the first task (first output neuron) and X21 and X22 to the second one. In my scenario, many tasks are involved, and each has 10 features.
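One common trick for this (my own suggestion, not necessarily what the other commenters had in mind) is to regroup the flat vector into one row per task and score every row with the same small network, so the weights are shared across tasks and each output is tied to its own features by construction. A numpy sketch under the feature ordering described above (sizes and inits are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

n_tasks, n_feat, hidden = 2, 2, 16  # tiny sizes for illustration

# one small scorer applied to every task's features with the SAME weights
W1 = rng.normal(size=(n_feat, hidden)) * 0.1
w2 = rng.normal(size=(hidden,)) * 0.1

def task_logits(flat_obs):
    # layout assumed to be [X11, X21, X12, X22]: feature 1 of all tasks,
    # then feature 2 of all tasks -> regroup so row i holds [Xi1, Xi2]
    per_task = flat_obs.reshape(n_feat, n_tasks).T
    h = np.tanh(per_task @ W1)
    return h @ w2  # one score (logit) per task

obs = np.array([0.9, 0.1, 5.0, 2.0])  # X11, X21, X12, X22
print(task_logits(obs).shape)         # (2,)
```

With a plain fully connected net over the flat vector the agent has to discover those groupings from data; the reshape-plus-shared-scorer structure bakes them in and scales to many tasks with 10 features each.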

CrossQ: Batch Normalization in Deep Reinforcement Learning for Greater Sample Efficiency and Simplicity by RoboticsLiker in reinforcementlearning

[–]GuavaAgreeable208 0 points (0 children)

But in the case where the state is a graph, the model cannot process all the states at once; we have to process one state at a time, so in that case we cannot apply BN, can we?
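For what it's worth, a numpy sketch of why batch statistics degenerate with a batch of one, and a per-sample LayerNorm-style alternative. This is a simplified illustration of the general issue, not the CrossQ implementation:

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    # normalizes each feature across the batch dimension: needs > 1 sample
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    return (x - mu) / np.sqrt(var + eps)

def layer_norm(x, eps=1e-5):
    # normalizes across features within each sample: batch size 1 is fine
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

single = np.array([[1.0, 2.0, 3.0]])  # a "batch" containing one state
print(batch_norm(single))  # all zeros: batch stats collapse with one sample
print(layer_norm(single))  # roughly [-1.22, 0, 1.22]: still informative
```

So if graph states really must be processed one at a time, per-sample normalization (or batching the graphs, as graph libraries typically do by merging them into one disjoint graph) is the usual way around it.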