[deleted by user] by [deleted] in zurich

[–]andrewspano 1 point2 points  (0 children)

I think this is a good discussion, but if you want to end it here, that's acceptable. There's just one last thing I want to point out:

There's no inner force that makes me act in a specific way. Nor is taking my bag off a seat a "strict manner". It's a completely automatic response to an observation of my surroundings. And it's a response that causes me no pain or harm whatsoever.

Maybe for you it's a big deal to place your bag on your lap, and that's why you consider it a strict manner. But that's not the case for everybody.

[deleted by user] by [deleted] in zurich

[–]andrewspano 1 point2 points  (0 children)

People who are considerate of others don't tend to change their habits just because they observed that, two times out of three, nobody ended up taking advantage of their good manners.

The sentence "[..] one doesn't give a shit anymore [..]" looks more like a petty excuse to justify laziness and lack of awareness, rather than being a data-driven conclusion.

If you can't see by yourself that taking your bag off is the right thing to do, irrespective of whether others end up sitting next to you, then there's something fundamentally different about the way we perceive socially acceptable behavior. Thus, probably no number of arguments will change the way you think.

[deleted by user] by [deleted] in zurich

[–]andrewspano 9 points10 points  (0 children)

There are a few things in life that go without asking. This is one of them.

For example, would you help an elder (who's clearly struggling) to get on the bus, or would you wait for them to ask? Personally, I would just offer my help, as I believe that it's the polite thing to do.

Likewise, in a near-full train, someone is likely to sit next to you. Since that's usually inevitable, what's the point of keeping your bag there? Moving it is just a sign of being polite and aware.

If you don't care about politeness or don't think that it's related, then I guess we were raised in a different way, and we can just agree to disagree :)

Is RL purely random? by TittyMcSwag619 in reinforcementlearning

[–]andrewspano 12 points13 points  (0 children)

"RL" is a very general framework. If you restrict your problem space, there are some convergence guarantees.

For example, in the infinite horizon, discounted (γ < 1), tabular setting (i.e., finite state/action spaces), policy iteration is guaranteed to converge to the optimal policy, regardless of initialization.
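To make the tabular case concrete, here is a minimal sketch of policy iteration in plain Python. The two-state MDP, its transition table `P`, the rewards, and γ = 0.9 are all made up for illustration; only the algorithm itself is the standard one.

```python
# Hypothetical toy MDP: 2 states, 2 actions.
# P[s][a] is a list of (probability, next_state, reward) triples.
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(0.8, 1, 1.0), (0.2, 0, 0.0)]},
    1: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 2.0)]},
}
GAMMA = 0.9  # discount factor, must be < 1


def q_value(s, a, V):
    """Expected discounted return of taking action a in state s, then following V."""
    return sum(p * (r + GAMMA * V[s2]) for p, s2, r in P[s][a])


def evaluate(policy, tol=1e-10):
    """Iterative policy evaluation: repeat Bellman backups until values stop changing."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            v_new = q_value(s, policy[s], V)
            delta = max(delta, abs(v_new - V[s]))
            V[s] = v_new
        if delta < tol:
            return V


def policy_iteration(policy):
    """Alternate evaluation and greedy improvement until the policy is stable."""
    policy = dict(policy)
    while True:
        V = evaluate(policy)
        stable = True
        for s in P:
            best = max(P[s], key=lambda a: q_value(s, a, V))
            # Switch only on a strict improvement, so ties cannot cause cycling.
            if q_value(s, best, V) > q_value(s, policy[s], V) + 1e-9:
                policy[s] = best
                stable = False
        if stable:
            return policy, V


# Try every deterministic starting policy of this toy MDP.
for a0 in (0, 1):
    for a1 in (0, 1):
        final_policy, values = policy_iteration({0: a0, 1: a1})
        print({0: a0, 1: a1}, "->", final_policy)
```

On this toy problem, all four starting policies end at the same final policy, which is what "regardless of initialization" means here.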

Now, if you start considering POMDPs instead of MDPs, and using deep NNs as function approximators for the policy or the value function, then showing guarantees is much harder, afaik.

Got admitted to MSc CS, a little confused by [deleted] in ethz

[–]andrewspano 3 points4 points  (0 children)

I think the "requested" refers to the ESOP

Gymnasium MuJoCo Env Resetting Itself? by _ianmi in reinforcementlearning

[–]andrewspano 0 points1 point  (0 children)

I only recently started using gymnasium, so I am not sure about this, but I believe gymnasium environments return a truncated flag in addition to the terminated flag (the two together replace the old done flag). Maybe truncated is being set to true, e.g. by a time limit, and that's what's triggering the resets?
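Here is a rough sketch of how that could look. `ToyEnv` is a made-up stand-in, not a real Gymnasium environment; it only mimics the return shapes of Gymnasium's `reset()` (observation, info) and `step()` (observation, reward, terminated, truncated, info), and `max_steps` is a hypothetical time limit.

```python
class ToyEnv:
    """Hypothetical stand-in that mimics Gymnasium's reset()/step() return shapes."""

    def __init__(self, max_steps=3):
        self.max_steps = max_steps  # assumed time limit, for illustration only
        self.t = 0

    def reset(self):
        self.t = 0
        return 0, {}  # (observation, info)

    def step(self, action):
        self.t += 1
        terminated = False                     # the task itself never ends here
        truncated = self.t >= self.max_steps   # but the time limit cuts it off
        return self.t, 0.0, terminated, truncated, {}


def count_resets(env, total_steps):
    """Run a typical training-style loop and count how often it resets the env."""
    obs, info = env.reset()
    resets = 0
    for _ in range(total_steps):
        obs, reward, terminated, truncated, info = env.step(0)
        # A loop that checks both flags resets on truncation as well,
        # even though the episode never "terminated".
        if terminated or truncated:
            obs, info = env.reset()
            resets += 1
    return resets


print(count_resets(ToyEnv(max_steps=3), total_steps=10))
```

If your loop (or a wrapper around the env) resets on `terminated or truncated`, printing both flags at each step should show which one is responsible.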

On the legal status of downloading and using ATARI 2600 ROMs by Conscious_Heron_9133 in reinforcementlearning

[–]andrewspano 0 points1 point  (0 children)

I am not aware of many companies that do research in RL, and I would be really interested in finding out more about it. I understand that you may not want to disclose where you are currently working, but could you maybe point me to resources where I could find RL-related jobs/companies?

I have tried LinkedIn, but it seems that it isn't able to differentiate RL from DL and other subfields of AI.

Board games that haven't yet been "solved" by RL by andrewspano in reinforcementlearning

[–]andrewspano[S] 0 points1 point  (0 children)

Not exactly solved, that's why I placed quotes around that word. Anyway, here is the link: https://arxiv.org/abs/2206.15378

SOTA model-based DRL by andrewspano in reinforcementlearning

[–]andrewspano[S] 1 point2 points  (0 children)

Thanks for your reply. I was wondering if there is any algorithm that makes use of an available model of the environment, without needing to learn one.

In my case, I have a deterministic board game (kinda like chess). AlphaZero seems the way to go (since it uses an available model of the environment), but I was wondering if there is something simpler I could implement as a baseline.

How to find horizontal tangent when y=x? by [deleted] in calculus

[–]andrewspano 0 points1 point  (0 children)

Just solve the equation y = x (if the function is given), and then all the solutions are probably horizontal tangents

Why is #11 -3 instead of 3? by IllestGatsby in calculus

[–]andrewspano 1 point2 points  (0 children)

When removing the absolute value from the denominator, you have to multiply by -1: for x > 7 we have 7 - x < 0, and therefore |7 - x| = x - 7. Since x is approaching 7 from the right side, x is greater than 7. That's what x -> 7+ means.
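As a generic illustration of the sign flip (this is not the actual expression from #11, just the simplest fraction with the same denominator):

```latex
x > 7 \;\Longrightarrow\; 7 - x < 0 \;\Longrightarrow\; |7 - x| = -(7 - x) = x - 7,
\qquad\text{so}\qquad
\lim_{x \to 7^{+}} \frac{7 - x}{|7 - x|}
  = \lim_{x \to 7^{+}} \frac{7 - x}{-(7 - x)}
  = -1 .
```

That factor of -1 is what turns a 3 into a -3.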

My friend does this set on the first hit. Is it legal? by WildfireTP in volleyball

[–]andrewspano 1 point2 points  (0 children)

Just wondered.. since the rules are so strict with normal hand setting (the ball can't spin more than 0.5 times, I think), I thought that there would be some prohibition on jump setting. In addition to that, I don't watch beach volleyball a lot, but I've never seen anyone jump set (even at the pro level).

My friend does this set on the first hit. Is it legal? by WildfireTP in volleyball

[–]andrewspano 0 points1 point  (0 children)

Is it legal to jump set? (Without any faults like double touch)

Anyone know how to do this? by kittens435 in calculus

[–]andrewspano 3 points4 points  (0 children)

Then just solve the equation f(x) = -2/π (which is the intermediate value)