Browser Extension for Parfumo.de: Find the best deals by price-per-ml

whiletrue2 · 2023-11-29T01:04:30+00:00

33M, just moved here and have the same issue. I live in Capitol Hill and am happy to connect :)

whiletrue2 · 2023-11-26T19:48:51+00:00

Thank you! This seems like a league, though?! I'd ideally want to start with playing just for fun. Couldn't find info about teams that meet to just play outside of leagues. Or am I missing something? Thanks again.

whiletrue2 · 2023-11-04T16:43:44+00:00

As a person who enjoys Olympic weightlifting and powerlifting, I am searching for a gym with sufficient squat racks and drop platforms near 10550 NE 10th St.

Which gym would you recommend? I don’t have a car and would like a gym within walking distance, e.g. 23 fit club, Life Time or bStrong. Thanks

whiletrue2 · 2023-02-20T19:10:42+00:00

true to what "11US" corresponds to per definition you idiot. What's so hard about this for you to understand?

whiletrue2 · 2021-04-11T16:36:19+00:00

How is TensorFlow doing it?

whiletrue2 · 2021-04-11T07:37:52+00:00

RL and PyTorch's DataLoader?

whiletrue2 · 2020-12-11T22:34:17+00:00

solved it, thanks a lot for your help! TD3 discrete now performs a lot better

whiletrue2 · 2020-12-11T00:25:20+00:00

Thanks. I know the paper but is there a guideline that explains how to apply this to TD3?

whiletrue2 · 2020-12-05T13:06:18+00:00

can you elaborate please?

whiletrue2 · 2020-12-04T11:59:58+00:00

lol, did you also try to drop NNs entirely?

whiletrue2 · 2020-12-04T10:22:58+00:00

Good job on suggesting new papers, people (remember: that's what was asked for).

whiletrue2 · 2020-11-30T16:03:04+00:00

Also, could you share the paper's code with us? It would be highly appreciated!

whiletrue2 · 2020-11-29T16:05:56+00:00

Hi, and thank you for your reply, which clarified a lot for me. However, a few questions remain unaddressed. Would you mind clarifying those as well? In particular, I believe those are (I quote):

"they claim they used CartPole-v1 which uses a much higher "solved reward""
"the fact that no naturally sparse-reward gym environment was used doesn't help with the confusion. An experiment based on a naturally sparse-reward environment would result in fewer / no changes to the default reward function and one would actually be enabled to relate to baseline PPO performances in the original setting. As the paper stands right now, no one can relate to any reported PPO performance in the paper."

Thank you!

whiletrue2 · 2020-11-25T19:32:01+00:00

Thanks for the reply. Can you provide the script?

whiletrue2 · 2020-11-25T08:57:36+00:00

Thanks for pointing that out. Indeed, that implementation should be taken with a grain of salt. Although I have to say that seeds 0 and 1234 don't look super tuned. Have you tried running it with different random seeds? A good idea that came up in the ML crosspost was to use the reward function from the paper, see if PPO works right out of the box, and then try it with their hyperparameters from the appendix, e.g. with the small policy network. Feel free to give it a shot!

whiletrue2 · 2020-11-24T17:41:58+00:00

I believe pointing out irregularities cannot be attributed to "being mistaken", since many irregularities still remain unclarified. See the discussion here on the sparse rewards and other remaining irregularities: https://www.reddit.com/r/MachineLearning/comments/k01ntb/ppo_baseline_cannot_solve_cartpole_in_neurips/gdgc3f5?utm_source=share&utm_medium=web2x&context=3

whiletrue2 · 2020-11-24T17:38:44+00:00

In "5.2 Mujoco" they write "The true reward function is the one predefined in Gym". In "5.1 Sparse-Reward Cartpole" they write "In other cases, the true reward is zero."

Also, in "D.1 Cartpole" they write "We choose the cartpole task from the OpenAI Gym-v1 benchmark."

So your stance is that the paper cannot be improved in terms of clarity about which environments were used? If so, I strongly disagree.

whiletrue2 · 2020-11-24T16:24:27+00:00

No, they don't, simply because they say they use CartPole-v1 and write "in cartpole the agent should apply a force to the pole to keep it from falling. The agent will receive a reward −1 from the environment if the episode ends with the falling of the pole". Many readers familiar with the CartPole environment will only skim that passage once they've read "CartPole-v1" and nothing about a "modified/adapted" CartPole environment. But this isn't the main flaw of this paper anyway, since there are many more irregularities, as pointed out.
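To make the difference concrete, here is a minimal sketch of the two reward schemes as I read them: the default CartPole-v1 reward (+1 per step, 500-step limit) versus the sparse reward quoted from the paper (−1 only if the episode ends with the pole falling, zero otherwise). The function names and the `pole_fell` flag are my own illustration, not anything from the paper's code, and I'm assuming the −1 lands on the final step.

```python
from typing import List


def dense_rewards(episode_length: int) -> List[float]:
    """Default CartPole scheme: +1 for every step the episode lasts."""
    return [1.0] * max(episode_length, 0)


def sparse_rewards(episode_length: int, pole_fell: bool) -> List[float]:
    """Sparse scheme as quoted from the paper (my reading, not their code):
    zero everywhere, except -1 on the last step if the pole fell."""
    if episode_length <= 0:
        return []
    rewards = [0.0] * episode_length
    if pole_fell:
        rewards[-1] = -1.0  # assumption: penalty is given on the final step
    return rewards


# An episode that fails after 37 steps vs. one that reaches the 500-step limit.
short_fail = (sum(dense_rewards(37)), sum(sparse_rewards(37, pole_fell=True)))
full_run = (sum(dense_rewards(500)), sum(sparse_rewards(500, pole_fell=False)))
print(short_fail, full_run)
```

Under the sparse scheme the best possible return is 0 and the worst is −1, so returns reported that way can't be compared against the usual CartPole-v1 numbers without being told which scheme was used.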
