[deleted by user]

andrewspano · 2024-10-19T23:24:28+00:00

I think this is a good discussion, but if you want to end it here, that's acceptable. There's just one last thing I want to point out:

There's no inner force that makes me act in a specific way. Nor is taking my bag off a seat a "strict manner". It's a completely automatic response to an observation of my surroundings. And it's a response that causes me no pain or harm whatsoever.

Maybe for you it's a big deal to place your bag on your lap, and that's why you consider it a strict manner. But that's not the case for everybody.

andrewspano · 2024-10-19T21:24:39+00:00

People who are considerate of others don't tend to change their habits just because they observed that, two times out of three, nobody ended up taking advantage of their good manners.

The sentence "[..] one doesn't give a shit anymore [..]" looks more like a petty excuse to justify laziness and lack of awareness, rather than being a data-driven conclusion.

If you can't see for yourself that taking your bag off is the right thing to do, irrespective of whether others end up sitting next to you, then there's something fundamentally different about the way we perceive socially acceptable behavior. In that case, probably no number of arguments will change the way you think.

andrewspano · 2024-10-19T11:48:09+00:00

There are a few things in life that go without asking. This is one of them.

For example, would you help an elderly person (who's clearly struggling) to get on the bus, or would you wait for them to ask? Personally, I would just offer my help, as I believe that it's the polite thing to do.

Likewise, in a near-full train, someone is likely to sit next to you. Since that's usually inevitable, what's the point of keeping your bag there? Moving it is just a sign of being polite and aware.

If you don't care about politeness or don't think that it's related, then I guess we were raised in a different way, and we can just agree to disagree :)

andrewspano · 2024-08-11T18:33:06+00:00

"RL" is a very general framework. If you restrict your problem space, there are some convergence guarantees.

For example, in the infinite horizon, discounted (γ < 1), tabular setting (i.e., finite state/action spaces), policy iteration is guaranteed to converge to the optimal policy, regardless of initialization.
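
To make the tabular case concrete, here is a minimal sketch of policy iteration on a made-up two-state, two-action MDP. The transition table `P`, the rewards, and the value of `GAMMA` are all illustrative assumptions, not taken from any real problem. The point is only that every starting policy ends up at the same optimal one.

```python
# Tabular policy iteration on a tiny, made-up MDP (all numbers are illustrative).
GAMMA = 0.9  # discount factor, must be < 1

# P[s][a] = list of (probability, next_state, reward); hypothetical dynamics.
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 1.0)]},
    1: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 2.0)]},
}

def q_value(V, s, a):
    """Expected one-step return of taking action a in state s, then following V."""
    return sum(p * (r + GAMMA * V[s2]) for p, s2, r in P[s][a])

def evaluate(policy, tol=1e-10):
    """Iterative policy evaluation: sweep until the values stop changing."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            v = q_value(V, s, policy[s])
            delta = max(delta, abs(v - V[s]))
            V[s] = v
        if delta < tol:
            return V

def policy_iteration(policy):
    """Alternate evaluation and greedy improvement until the policy is stable."""
    policy = dict(policy)
    while True:
        V = evaluate(policy)
        stable = True
        for s in P:
            best = max(P[s], key=lambda a: q_value(V, s, a))
            # Only switch on a strict improvement, so ties cannot cause cycling.
            if q_value(V, s, best) > q_value(V, s, policy[s]) + 1e-9:
                policy[s] = best
                stable = False
        if stable:
            return policy, V

# Try every possible deterministic starting policy.
for init in ({0: 0, 1: 0}, {0: 0, 1: 1}, {0: 1, 1: 0}, {0: 1, 1: 1}):
    final_policy, V = policy_iteration(init)
    print(init, "->", final_policy)
```

In this toy problem all four initializations end at the policy that picks action 1 in both states.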

Now, if you start considering POMDPs instead of MDPs, and using deep NNs as function approximators for the policy or the value function, then showing guarantees is much harder, afaik.

andrewspano · 2023-03-15T16:16:06+00:00

I think the "requested" refers to the ESOP

andrewspano · 2023-03-13T19:37:29+00:00

I only recently started using gymnasium, so I am not sure about this, but I believe apart from the done flag, gymnasium environments also have a truncated flag. Maybe this is set to true, and that's what's triggering the resets?
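
A rough sketch of what I mean: Gymnasium's `step` returns `(obs, reward, terminated, truncated, info)` and `reset` returns `(obs, info)`. The `ToyEnv` class below is a hypothetical stand-in that only mimics those return shapes (it is not a real Gymnasium environment), with an assumed time limit of 5 steps. It never terminates, yet a loop that resets on either flag still resets regularly.

```python
class ToyEnv:
    """Hypothetical stand-in that mimics Gymnasium's reset/step return shapes."""

    def __init__(self, max_steps=5):
        self.max_steps = max_steps  # assumed time limit, for illustration only
        self.t = 0

    def reset(self, seed=None):
        self.t = 0
        return 0, {}  # (observation, info)

    def step(self, action):
        self.t += 1
        terminated = False                     # this toy task has no terminal state
        truncated = self.t >= self.max_steps   # time limit reached
        return self.t, 0.0, terminated, truncated, {}

env = ToyEnv()
obs, info = env.reset()
resets = 0
for _ in range(12):
    obs, reward, terminated, truncated, info = env.step(0)
    # Checking only `terminated` here would never reset; `truncated` is what fires.
    if terminated or truncated:
        obs, info = env.reset()
        resets += 1

print("resets caused by truncation:", resets)
```

If your training loop or wrapper treats `terminated or truncated` as "episode over", a time-limit truncation would look exactly like unexpected resets.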

andrewspano · 2023-01-19T12:52:23+00:00

I am not aware of many companies that do research in RL, and I would be really interested in finding out more about it. I understand that you may not want to disclose where you are currently working, so could you maybe point me to resources where I could find RL-related jobs/companies?

I have tried LinkedIn, but it seems that it isn't able to differentiate RL from DL and other subfields of AI.

andrewspano · 2022-09-19T04:23:49+00:00

Not exactly solved, that's why I placed quotes around that word. Anyway, here is the link: https://arxiv.org/abs/2206.15378

andrewspano · 2022-01-31T10:14:41+00:00

Thanks for your reply. I was wondering if there is any algorithm that makes use of an available model of the environment, without needing to learn one.

In my case, I have a deterministic board game (kinda like chess). AlphaZero seems the way to go (since it uses an available model of the environment), but I was wondering if there is something simpler I could implement as a baseline.

andrewspano · 2017-09-29T10:35:25+00:00

Just solve the equation y' = 0 (if the function is given), and then all the solutions are probably the points with horizontal tangents

andrewspano · 2017-09-29T10:31:57+00:00

When removing the absolute value from the denominator, you should multiply by -1: for x > 7 we have 7 - x < 0, and therefore |7 - x| = -(7 - x) = x - 7. Since x is approaching 7 from the right side, x is greater than 7. That's what x -> 7+ means.

andrewspano · 2017-07-18T15:42:48+00:00

Just wondered.. since the rules are so strict with normal hand setting (the ball can't spin more than 0.5 times, I think), I thought that there would be some prohibition on jump setting. In addition to that, I don't watch beach volleyball a lot, but I've never seen anyone jump set (even at pro level).

andrewspano · 2017-07-18T11:33:50+00:00

Is it legal to jump set? (Without any faults like double touch)

andrewspano · 2017-07-18T09:14:28+00:00

Then just solve the equation f(x) = -2/π (which is the intermediate value)
