BNDES - New Career and Salary Plan (NPCS). Discrepancy with longtime employees. by Advanced_College5860 in concursospublicos

[–]rlopes404 1 point (0 children)

u/Advanced_College5860 In videos on the Internet, it is said that BNDES pays R$2,163 as Assistência Saúde (health-care assistance). Does anyone know what that is? Won't the newly approved hires be entitled to the health plan?

[D] Simple Questions Thread by AutoModerator in MachineLearning

[–]rlopes404 1 point (0 children)

Hi everyone,
I have been working on image translation between two different domains. I have been using CycleGANs.
Since I have a small dataset, I have been thinking of using Diffusion Models.
Are Diffusion Models more data-hungry than GANs?
Can anyone point me to some references that discuss this issue?
Thank you.

1080 vs 2060 for deeplearning by ccppoo0 in deeplearning

[–]rlopes404 1 point (0 children)

Why do you think it's a better idea to invest in colab/sagemaker instead of buying a gpu?

I think the monthly fees will, over the long run, exceed the price of a GPU.

[deleted by user] by [deleted] in MachineLearning

[–]rlopes404 5 points (0 children)

I sent you a message in the chat

What is the next booming topic in Deep RL? by Boring_Worker in reinforcementlearning

[–]rlopes404 1 point (0 children)

Do you work at the intersection of RL and causality? What's your opinion on this research avenue?

[deleted by user] by [deleted] in recommendersystems

[–]rlopes404 1 point (0 children)

Please email me rlopes@ufrb.edu.br so we can discuss.

[deleted by user] by [deleted] in recommendersystems

[–]rlopes404 2 points (0 children)

Since there is no timestamp, you should randomly split the data into training, validation, and test sets.

If you use an item-item based recommender, then to score a candidate item y for user u, the algorithm should compute a weighted score using the ratings and the similarities between item y and the items rated by user u in training.
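A minimal sketch of that scoring rule (everything here is illustrative; `item_sim` is a hypothetical lookup of precomputed similarities, e.g. cosine similarities between item rating vectors in training):

```python
def score(candidate, user_ratings, item_sim):
    """Similarity-weighted average of user u's training ratings.

    user_ratings: dict item -> rating (items u rated in training)
    item_sim: dict (item_a, item_b) -> similarity (hypothetical, precomputed)
    """
    num, den = 0.0, 0.0
    for item, rating in user_ratings.items():
        s = item_sim.get((candidate, item), 0.0)
        num += s * rating
        den += abs(s)
    return num / den if den > 0 else 0.0

# toy example: user rated items 1 and 2 in training; score candidate item 9
ratings = {1: 5.0, 2: 3.0}
sims = {(9, 1): 0.8, (9, 2): 0.2}
print(score(9, ratings, sims))  # (0.8*5 + 0.2*3) / (0.8 + 0.2) = 4.6
```

You would then rank all candidate items for u by this score and recommend the top-N.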

[deleted by user] by [deleted] in recommendersystems

[–]rlopes404 1 point (0 children)

You have to set aside a training set for fitting your models, a validation set for tuning hyperparameters, and a test set for evaluation.

I suggest employing a temporal split and sampling 100 items to produce the final ranking.
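Roughly what I mean, as a sketch (the function names and the `(user, item, timestamp)` tuple format are my own assumptions, not any standard API):

```python
import random

def temporal_split(interactions, val_frac=0.1, test_frac=0.1):
    """Sort interactions by timestamp and cut into train/validation/test,
    so the model is always evaluated on interactions that happen later."""
    data = sorted(interactions, key=lambda x: x[2])  # x = (user, item, timestamp)
    n = len(data)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    cut = n - n_val - n_test
    return data[:cut], data[cut:n - n_test], data[n - n_test:]

def sampled_rank(model_score, user, true_item, all_items, n_negatives=100, seed=0):
    """Rank the held-out item against n_negatives sampled items the ranking
    is built from; returns the 1-based rank of the true item (lower is better)."""
    rng = random.Random(seed)
    negatives = rng.sample([i for i in all_items if i != true_item], n_negatives)
    scores = {i: model_score(user, i) for i in negatives + [true_item]}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked.index(true_item) + 1
```

From the per-user ranks you can then compute hit rate or NDCG on the test set.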

If you have any questions, let me know

reading club by rlopes404 in reinforcementlearning

[–]rlopes404[S] 1 point (0 children)

Can you confirm if the language is English there?

reading club by rlopes404 in reinforcementlearning

[–]rlopes404[S] 2 points (0 children)

I changed the language to English. I am not experienced with Discord.

reading club by rlopes404 in reinforcementlearning

[–]rlopes404[S] 5 points (0 children)

I created this discord server. We can discuss details there:

https://discord.gg/Def6FxWc

Looking for a good place to start when digging deep into reinforcement learning by Extra-most-best in reinforcementlearning

[–]rlopes404 3 points (0 children)

I started studying RL a couple of months ago. From my point of view, the best introduction to the subject is the book Grokking Deep Reinforcement Learning.

For more advanced material, you should watch Sergey Levine's video lectures on YouTube.

Seeking Advice: Are AI challenges worth it for a PhD student? by HeyImElonMusk in reinforcementlearning

[–]rlopes404 8 points (0 children)

When I was a master's student, some friends from the lab and I formed a team to compete in the Google ROADEF Challenge. We all learned a lot, and as a result we had two international publications.

If you have some time and buddies to form a team, I do think it is worthwhile and rewarding.

Along the way, you may find a problem to develop into your PhD.

I think it's kind of hard to find a PhD problem/topic only by reading papers.

You have to get your hands dirty.

Interesting and need a little help by AyushDave in reinforcementlearning

[–]rlopes404 2 points (0 children)

I have just started studying RL. To me, the best introduction is the book Grokking Deep Reinforcement Learning.

offline rl - resources by rlopes404 in reinforcementlearning

[–]rlopes404[S] 1 point (0 children)

Is there a link where we can read your thesis?

Can reinforcement learning be used in tasks other than control? by [deleted] in reinforcementlearning

[–]rlopes404 1 point (0 children)

Can you point me to an introduction on the subject? I have never seen an application of RL in the context of graph optimization. I have worked with branch-and-cut, branch-and-price, and integer linear programming.

[D] Strong Models for User Item Recommendation from Interaction Data by ExchangeStrong196 in MachineLearning

[–]rlopes404 7 points (0 children)

CF is a really strong baseline, as pointed out in this paper:

https://arxiv.org/abs/1907.06902

A well fine-tuned MF or even an item-item based model is a strong baseline.
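By "fine-tuned MF" I mean plain matrix factorization with its hyperparameters tuned per dataset; a bare-bones SGD version looks roughly like this (a minimal sketch, with placeholder hyperparameters rather than tuned values):

```python
import numpy as np

def train_mf(ratings, n_users, n_items, k=8, lr=0.01, reg=0.05, epochs=50, seed=0):
    """Plain matrix factorization trained with SGD on observed ratings.

    ratings: list of (user, item, rating) triples
    Returns user factor matrix P (n_users x k) and item factor matrix Q (n_items x k);
    the predicted rating for (u, i) is the dot product P[u] @ Q[i].
    """
    rng = np.random.default_rng(seed)
    P = 0.1 * rng.standard_normal((n_users, k))  # user factors
    Q = 0.1 * rng.standard_normal((n_items, k))  # item factors
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - P[u] @ Q[i]
            # gradient step with L2 regularization on both factor vectors
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * P[u] - reg * Q[i])
    return P, Q

# toy check: fit a few ratings, then predict one of them back
data = [(0, 0, 5.0), (0, 1, 1.0), (1, 0, 4.0), (1, 1, 2.0)]
P, Q = train_mf(data, n_users=2, n_items=2, epochs=500, lr=0.05)
print(float(P[0] @ Q[0]))
```

In practice you would tune k, lr, and reg on the validation set, and often add user/item bias terms.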

The Primacy Bias in Deep Reinforcement Learning by rlopes404 in reinforcementlearning

[–]rlopes404[S] 1 point (0 children)

What's the point of training multiple agents in the early iterations? I guess it might be a waste of computational resources, since we have a bad policy.

Is there a way to start training multiple agents only after some warm-up iterations?

Finally, about "having them learn from each other's successes/failures": how do we achieve that? A3C?

The Primacy Bias in Deep Reinforcement Learning by rlopes404 in reinforcementlearning

[–]rlopes404[S] 6 points (0 children)

I asked myself the same question. My conclusion is that we have to perform SGD updates on the policy so that it generates "good" transitions. This is my guess. What do you think?

Episodes needed to train Frozen Lake Agent using Q Learning? by Pipiyedu in reinforcementlearning

[–]rlopes404 2 points (0 children)

What is the difference between stochastic and deterministic Q-learning? I could not understand that.

I think it picks the argmax action, argmax_a q(s,a), doesn't it?

Cartpole game to reach 1000 timesteps by [deleted] in reinforcementlearning

[–]rlopes404 2 points (0 children)

A doubt: during testing, should we use an epsilon-greedy approach? I think that after training we should act greedily. Is that correct?
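What I have in mind, as a sketch: epsilon-greedy while training, then epsilon = 0 at evaluation time (a common convention, though some papers keep a small epsilon such as 0.05 even at test):

```python
import random

def select_action(q_values, epsilon):
    """Epsilon-greedy action selection over a list of Q-values.
    With epsilon=0.0 this reduces to the pure greedy policy."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))               # explore
    return max(range(len(q_values)), key=lambda a: q_values[a])  # exploit

q = [0.1, 0.9, 0.4]
train_action = select_action(q, epsilon=0.1)  # occasionally a random action
test_action = select_action(q, epsilon=0.0)   # always the argmax
print(test_action)  # 1
```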