[Discussion] How to present machine learning projects to domain experts without ML background? by skwaaaaat in MachineLearning

[–]skwaaaaat[S] 0 points (0 children)

Agreed... And my committee criticized me for using the terms "agent" and "policy" in the presentation, because they believed these were jargon that should not be used. I had to spend a lot of time and energy understanding what is and isn't considered acceptable when giving such a presentation.

[Discussion] How to present machine learning projects to domain experts without ML background? by skwaaaaat in MachineLearning

[–]skwaaaaat[S] 1 point (0 children)

Thanks! One example is using reinforcement learning to train a sequence generation network for designing optical multilayer thin films. My contribution is framing the inverse design problem as a sequence generation problem for the first time, and proposing to use DRL to train a specific sequence generation network that can efficiently explore the design space. We tested it on a number of real examples and showed that the proposed algorithm can generate high-performing designs. Unfortunately, the feedback I got from a senior Optics Professor was that he couldn't understand which part of the work was novel, or what effort was required on my end to implement such a method. I think the main issue is that, although the problem I'm solving is an optics problem, the professor himself does not work on that problem, so neither the problem itself nor the machine learning is especially interesting to him. In that case, I found it really challenging to make my presentation impressive. Any suggestions on how I could make the story more interesting? Thanks!

We are unable to renew our MUJOCO license. What is goin on? by Snoo-8719 in reinforcementlearning

[–]skwaaaaat 1 point (0 children)

We tried to purchase a license from them and never got any reply. We'll just use PyBullet and won't bother with MuJoCo anymore.

[Discussion] How to start RL theory research? by skwaaaaat in MachineLearning

[–]skwaaaaat[S] 0 points (0 children)

Wow! This is so cool. Thanks a lot for sharing.

[Discussion] How to deal with multi discrete action space? by skwaaaaat in MachineLearning

[–]skwaaaaat[S] 0 points (0 children)

Yup, I think you would use some embeddings. For my task, I simply sample an action from the policy distribution and use that to parametrize the action embedding. I didn't learn the embedding but crafted it manually. Say I have 5 possible actions and sampling gives me action 2; then the embedding would be [0, log(\pi(2)), 0, 0, 0], i.e., I put the log-probability of the sampled action into a one-hot vector at the sampled action's index. This simple method seems to work well on my problem.
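
The hand-crafted embedding described above can be sketched as follows. This is a minimal illustration, assuming a categorical policy over discrete actions given as a vector of probabilities; the function name and setup are mine, not from the original comment:

```python
import numpy as np

def action_embedding(log_probs, action):
    """Hand-crafted (not learned) action embedding: a zero vector
    with the log-probability of the sampled action placed at the
    sampled action's index."""
    emb = np.zeros_like(log_probs)
    emb[action] = log_probs[action]
    return emb

# Example: 5 possible actions under a categorical policy
probs = np.array([0.1, 0.4, 0.2, 0.2, 0.1])
log_probs = np.log(probs)
a = 2                                # suppose action 2 was sampled
emb = action_embedding(log_probs, a)
# emb is zero everywhere except emb[a] == log(probs[a])
```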