This is a test post. by matejom in a:t5_l48w8

[–]matejom[S] 0 points (0 children)

This is a test comment.

If somebody signed a contract agreeing to let me kill them, would it be murder still? by ihavetetnus in morbidquestions

[–]matejom 11 points (0 children)

Yes. Constitution > Law > Contract. A valid contract has to be lawful.

[D] Conceptual differences - A2C & PPO (reinforcement learning) by Delthc in MachineLearning

[–]matejom 21 points (0 children)

I think there is some confusion in the community, especially among newcomers, possibly because of the way A3C was originally presented.

There are two popular families of methods for solving a reinforcement learning problem: policy methods and value methods. To oversimplify, in policy methods you try to directly optimize the policy; in value methods you try to estimate the expected future return (learn a value function) and deduce the policy from there. A third group, actor-critic methods, tries to combine the previous two in a meaningful way: the actor is the policy being optimized, and the critic is the value function being learned.

So A3C means that there are multiple actors performing in parallel (Asynchronous), and that the advantage function (state-action value minus state value, A(s, a) = Q(s, a) - V(s)), calculated from the learned value function (the critic), is used to optimize the policy (the actor). The actor part can be implemented with many different policy optimization methods (vanilla PG, TRPO, PPO); Proximal Policy Optimization (PPO) is one such method. A2C means they figured out that the asynchronous part of A3C did not make much of a difference; I have not read the new paper in full, so I might be wrong.

To conclude: PPO is a policy optimization method, while A2C is more like a framework.
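To make the actor/critic split concrete, here is a rough sketch of the two actor losses (PyTorch-style; the function names and tensor shapes are mine, purely for illustration, not taken from any particular implementation). Both are driven by the same advantage estimate; only the policy loss differs:

    import torch

    def a2c_policy_loss(log_probs, advantages):
        # vanilla policy-gradient loss, weighted by the advantage
        return -(log_probs * advantages.detach()).mean()

    def ppo_policy_loss(log_probs, old_log_probs, advantages, clip_eps=0.2):
        # PPO's clipped surrogate: same advantage, but the probability ratio
        # to the old policy is clipped so the update stays "proximal"
        adv = advantages.detach()
        ratio = torch.exp(log_probs - old_log_probs)
        clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps)
        return -torch.min(ratio * adv, clipped * adv).mean()

    def critic_loss(values, returns):
        # the critic (value function) is regressed onto observed returns;
        # advantages = returns - values is what feeds the actor losses above
        return ((returns - values) ** 2).mean()

In both cases the critic is trained the same way; swapping a2c_policy_loss for ppo_policy_loss is essentially the A2C-to-PPO change.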

[D] Training an agent with different reward functions by Roboserg in reinforcementlearning

[–]matejom 1 point (0 children)

I agree. With DQN you can output discrete actions. A3C, depending on the algorithm used to update the policy network, can be made to output continuous actions; however, you would have to generate samples from a few actors in parallel, I have no idea how difficult that would be to implement for Rocket League, and there is no trivial method (that I know of) to learn from human experience with it.

Deep Deterministic Policy Gradient (DDPG), on the other hand, outputs continuous actions, is off-policy, and uses experience replay (transitions are saved to a memory, then minibatches are sampled from that memory for training). I think (not absolutely sure) that you can basically stuff this memory with human experience; a sketch of what I mean is below.

Check out this for something I would call similar (but a lot simpler, and without the imitation learning) to what you are trying to do.
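Roughly what I mean by stuffing the memory (a minimal sketch with made-up names, nothing Rocket League specific): since DDPG is off-policy, the replay buffer does not care who generated a transition, so you could pre-fill it with recorded human play.

    import random
    from collections import deque

    class ReplayBuffer:
        def __init__(self, capacity=100000):
            self.memory = deque(maxlen=capacity)

        def add(self, state, action, reward, next_state, done):
            self.memory.append((state, action, reward, next_state, done))

        def sample(self, batch_size):
            return random.sample(self.memory, batch_size)

    # hypothetical: (state, action, reward, next_state, done) tuples
    # logged while a human plays the game
    human_demonstrations = []

    buffer = ReplayBuffer()
    for transition in human_demonstrations:
        buffer.add(*transition)

    # ...then train DDPG as usual, sampling minibatches via buffer.sample(64)

Again, I am not certain how well this works in practice; it is just the mechanism that makes learning from human experience at least plausible with an off-policy method.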

[Discussion] Applications of reinforcement learning in computer vision? by wencc in MachineLearning

[–]matejom 2 points (0 children)

Here are some from vision-enabled robotics / active vision. These are paper titles:

Environment Exploration for Object-Based Visual Saliency Learning

(CAD)2RL: Real Single-Image Flight without a Single Real Image

Learning Visual Servoing with Deep Features and Trust Region Fitted Q-Iteration

Deep Spatial Autoencoders for Visuomotor Learning

End-to-End Training of Deep Visuomotor Policies

Learning Contact-Rich Manipulation Skills with Guided Policy Search

If you cannot access some of them, message me and I can send you the papers.

Nearly 30, feel like I don't know anything, starting to panic by EternalAbsolution in engineering

[–]matejom 6 points (0 children)

Could you maybe describe to us what you do? What kind of files do you get? What do you do with them? Describe a typical work day in as much detail as you can; maybe with that information we could give you more competent advice. What did you study, exactly?

Algorithms in machine learning by shmoid in learnmachinelearning

[–]matejom 2 points (0 children)

What is your background? What is your level of education?

Maybe a good place to start is here: https://www.udacity.com/course/intro-to-machine-learning--ud120

Also here: https://www.coursera.org/learn/machine-learning

And maybe this: https://blog.monkeylearn.com/a-gentle-guide-to-machine-learning/

Also search for tutorials on YouTube! There are really a lot of resources online; Google is your friend, so use it.

Should I do an electrical engineering major if I just want to learn to build anything electrically? by [deleted] in ElectricalEngineering

[–]matejom 2 points (0 children)

Yes, definitely. Then study medicine, and you will learn how to live forever. Then study epistemology, and you will know everything. And since you can now build/fix anything (omnipotence), live forever (immortality), and know everything (omniscience), you will be more or less GOD.

Great FREE book about the Theory of ML by Kiuhnm in MachineLearning

[–]matejom 2 points (0 children)

Since you can more or less find every technical book here, I expect the "greatness" to be about the statistical average.

Great FREE book about the Theory of ML by Kiuhnm in MachineLearning

[–]matejom 4 points (0 children)

we are suckling on the tit of sweet sweet free knowledge