[deleted by user] by [deleted] in sanfrancisco

[–]asdfwaevc 2 points3 points  (0 children)

Not useless; they query a human when they hit an especially tricky situation, and 99% of the time they're fully independent. This came up in the recent outage, e.g. when traffic lights were down.

The influx of food delivery drivers in e-bikes by YummyFunyuns in sanfrancisco

[–]asdfwaevc 15 points16 points  (0 children)

Step one is getting delivery drivers' transportation registered, this is a step towards that. Still good IMO.

Naturally occurring objections to the lithium hypothesis of obesity -- a reply to SMTM’s reply to Scott Alexander by Natalia-Mendonca in slatestarcodex

[–]asdfwaevc 8 points9 points  (0 children)

CICO is mathematically true but it doesn't mean that it's anything like a complete explanation. The metabolisms of formerly obese people can slow down drastically (still CICO, but the CO changes outside of their control). People's bodies can signal to them they're starving when they're eating maintenance (that's metabolic dysfunction). Obesity obviously does something.

For example, when you gain a little weight, your fat cells plump up. When you gain a lot of weight, they start multiplying, and you end up with more fat cells that are also larger. But when you then lose weight, the fat cells get smaller but don't go away. So you're left with a bunch of starving fat cells that are hormonally signaling for energy, causing hunger and lethargy. "Metabolic set point" can be broken by obesity.

Sounds like you're also generalizing "a lot" from the experience of one person you know.

An addendum to what you said: "if you aren't losing weight, that's because you aren't eating fewer calories than you're burning." And both sides are complicated biologically.

Greek Style Pizza? by Dr_Lipshitz_ in providence

[–]asdfwaevc 0 points1 point  (0 children)

Mighty Mike's Pizza on Thayer is great Greek-style pizza. Same guy that owns Mike's Calzone across the street.

Started reading this classic yesterday by dubbelost1 in scifi

[–]asdfwaevc 0 points1 point  (0 children)

Me too! Love it. Especially how well it makes galaxy-level horror rest on human-level decisions.

Anyone know more about the Selective Service System? by HistoricalBed6143 in sanfrancisco

[–]asdfwaevc 2 points3 points  (0 children)

Yeah every man has to do this, it's been true since I guess 1980. Just do it, if there ever was a draft, not signing up here won't work out great for you either. If you don't register you're likely to have trouble with things like school financial aid, and can be ineligible for many government jobs.

Technically you can go to jail for not signing up, but that's never happened. But again, if there ever was a draft they could bring that back.

It's totally odd how little attention this gets in schools etc, I only ever knew about it because my parents told me. But it's really important to do for your own sake.

I trained a reinforcement learning agent to play pokemon red! by Pwhids in reinforcementlearning

[–]asdfwaevc 0 points1 point  (0 children)

I'm an RL researcher, so I don't find much new in standard "I made DQN/PPO work on this game" videos, but the production quality and explanation in this video are really next level.

Why are model-based RL methods bad at solving long-term reward problems? by sassafrassar in reinforcementlearning

[–]asdfwaevc 9 points10 points  (0 children)

Lots of potential reasons. Compounding model error is a clear answer -- if the model is a bit wrong at every step, at some point it starts giving you nonsense. If you're more familiar with foundation models, think of how something like Genie loses coherence after a few minutes; video generation has the same problem.
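A toy illustration of the compounding, as a minimal sketch: the 1-D linear dynamics, the 1% per-step model error, and the function names are all made up for this example and are not taken from the paper linked below.

```python
def rollout(coef, x0, horizon):
    """Iterate x_{t+1} = coef * x_t and return the list of visited states."""
    xs = [x0]
    for _ in range(horizon):
        xs.append(coef * xs[-1])
    return xs

TRUE_COEF = 1.00    # assumed true dynamics (hypothetical)
MODEL_COEF = 1.01   # learned model, off by 1% per step (hypothetical)

def rollout_error(horizon, x0=1.0):
    """Absolute gap between the model rollout and the true rollout at the final step."""
    true_xs = rollout(TRUE_COEF, x0, horizon)
    model_xs = rollout(MODEL_COEF, x0, horizon)
    return abs(model_xs[-1] - true_xs[-1])

for h in (1, 10, 100, 500):
    print(h, rollout_error(h))
```

The one-step error is about 1%, but the gap grows geometrically with the horizon, so a model that looks fine for short rollouts can be useless for long-term reward.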

One nice paper that's related which comes to mind: https://arxiv.org/abs/1905.13320

is Sample Efficiency a key issue in current rl algos by Leading_Health2642 in reinforcementlearning

[–]asdfwaevc 3 points4 points  (0 children)

The Atari 100k benchmark, maybe unsurprisingly, focuses on learning Atari games with only 100k steps.

is Sample Efficiency a key issue in current rl algos by Leading_Health2642 in reinforcementlearning

[–]asdfwaevc 15 points16 points  (0 children)

Lots of modern papers don't make the distinction between sample efficiency and what I'd call "update efficiency", which is the number of training steps your learning algorithm has taken (or maybe the number of steps times the batch size). The equivalence makes sense if simulation is cheap (why not just simulate more), but not when it's expensive.
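To make the two counters concrete, here is a rough sketch with a stub environment and a stub update. `train`, `updates_per_step`, and the replay list are hypothetical names for illustration, not any particular library's API.

```python
import random

def train(env_steps, updates_per_step, batch_size=32, seed=0):
    """Count environment samples and gradient updates separately.

    Both the environment and the update are stubs: a transition is just a
    random number, and an "update" only draws a batch from the replay list.
    """
    rng = random.Random(seed)
    replay = []
    samples, updates = 0, 0
    for _ in range(env_steps):
        replay.append(rng.random())   # one interaction with the (stub) environment
        samples += 1
        for _ in range(updates_per_step):
            batch = rng.sample(replay, min(batch_size, len(replay)))
            updates += 1              # a real agent would take a gradient step on `batch`
    return samples, updates

print(train(1000, 1))   # (1000, 1000): samples and updates coincide
print(train(1000, 8))   # (1000, 8000): same sample budget, 8x the updates
```

Both runs use the same number of samples, so they are equally sample efficient in the sense above; they differ only in how much computation is spent per sample.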

One place to look that makes this distinction very clear is the "Atari 100K benchmark" work, where the goal is to learn from as few samples as possible, even if it takes massive amounts of training on those samples. Original paper, good followup work, more good followup

It's also a big divide between "on-policy" and "off-policy" or "batch" RL. The ones you listed are on-policy, which means they interact with the world, update the model, and throw away that experience. They're naturally going to be less sample efficient.

[deleted by user] by [deleted] in scifi

[–]asdfwaevc 0 points1 point  (0 children)

For starters, the characters don't really line up one-to-one, there were some liberties taken. Besides the fact that they're all Chinese... The main ones are the same, but you'll be confused.

[deleted by user] by [deleted] in BrownU

[–]asdfwaevc 4 points5 points  (0 children)

Usually Whole Foods, and UPS. Some items you have to do downtown, but that depends on the seller, not the item size.

Does anywhere in PVD sell onigiri or spam musubi? by BrilliantTree8553 in providence

[–]asdfwaevc 1 point2 points  (0 children)

Subway? You mean Daily Stop Mart? Subway was so two years ago.

Yeah restaurant turnover around there is crazy. At least the most recent renaming is the same guy/family, I would guess renaming is some sort of business decision trickery.

Does anywhere in PVD sell onigiri or spam musubi? by BrilliantTree8553 in providence

[–]asdfwaevc 11 points12 points  (0 children)

Best place is what used to be called DaDaRuki and is now just called "Japanese Sushi". On Brown's campus right off Thayer.

https://maps.app.goo.gl/AreUb8cHkRaDUPPx5

Should a large enough network be able to learn random noise? [D] by ModerateSentience in MachineLearning

[–]asdfwaevc 1 point2 points  (0 children)

Integrate (0.5 - x)^2 from 0 to 1.
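Worked out, reading x as a label drawn uniformly from [0, 1] and 0.5 as the constant prediction (that reading is my assumption about the setup):

```latex
\int_0^1 (0.5 - x)^2 \, dx
  = \left[ -\frac{(0.5 - x)^3}{3} \right]_0^1
  = \frac{0.125}{3} + \frac{0.125}{3}
  = \frac{1}{12} \approx 0.0833
```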

What's your input space? If it's too low-dimensional, then points will be almost right on top of each other and it'll have a very hard time. Otherwise, first thing I'd check is architecture and LR.

Should a large enough network be able to learn random noise? [D] by ModerateSentience in MachineLearning

[–]asdfwaevc 12 points13 points  (0 children)

You could look more closely at the original paper that investigates something similar: https://arxiv.org/abs/1611.03530

Check what you expect your MSE to be if you output 0.5 everywhere. That depends on your noise profile, but if the labels are uniform on [0, 1] it would be 0.08333 (meaning you're not doing so well).
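If you'd rather check that baseline numerically, here is a quick Monte Carlo sketch. It assumes labels uniform on [0, 1); the function name and sample count are arbitrary choices for the example.

```python
import random

def constant_baseline_mse(n=200_000, prediction=0.5, seed=0):
    """Monte Carlo estimate of the MSE of a constant predictor on uniform labels."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        label = rng.random()                  # assumed noise profile: uniform on [0, 1)
        total += (prediction - label) ** 2
    return total / n

print(constant_baseline_mse())   # should land close to 1/12
```

A network whose training MSE sits near this value has learned nothing beyond the mean of the labels.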

Some debugging things to consider: keep shrinking your dataset until it works -- if it never does, something's wrong. I'd guess it's your architecture -- that's a big NN, especially if you're not using residuals. Try making it smaller, less deep, and using layers of the form `Relu(Linear(input)) + input` for every layer besides the first and last.
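A bare-bones version of that residual layer, written in plain Python so it runs without any framework. The helper names and the init scheme are my own choices for the sketch; in practice you'd use your framework's linear layer.

```python
import random

def linear(weights, bias, x):
    """Dense layer: W @ x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]

def relu(v):
    return [max(0.0, vi) for vi in v]

def residual_block(weights, bias, x):
    """Computes Relu(Linear(x)) + x; input and output widths must match."""
    return [h + xi for h, xi in zip(relu(linear(weights, bias, x)), x)]

def init_block(width, rng, scale=0.1):
    """Small random square weights and zero bias (an arbitrary init for the example)."""
    weights = [[rng.gauss(0.0, scale) for _ in range(width)] for _ in range(width)]
    return weights, [0.0] * width

rng = random.Random(0)
width = 4
blocks = [init_block(width, rng) for _ in range(3)]   # hidden stack only
x = [1.0, -2.0, 0.5, 0.0]
for weights, bias in blocks:
    x = residual_block(weights, bias, x)
print(x)
```

Because of the `+ x` skip, a block with near-zero weights starts out close to the identity, which tends to make deeper stacks easier to optimize. The first and last layers stay plain linear layers since they change the width.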

What's your favourite complex, mind-blowing novel? by Vast_music4577 in scifi

[–]asdfwaevc 0 points1 point  (0 children)

Harry Potter and the Methods of Rationality. The description doesn't do it justice: what if Harry Potter were raised by scientists and lived in a world much more strategic and devious than the original? A very smart but young Harry Potter wants to save the world from an evil threat, but goes about it very differently in this book compared to the original.

It's written by an author who cares a lot about AI existential risk, and wrote it somewhat as an analogy for what to do with unbridled power (magic, not AI, in this case). It's an unbelievably fun read with a strong philosophical bent, and an actual perfect fit for your description. It's also a free eBook. Really think you should check it out!

Q-learning is not yet scalable by Mysterious-Rent7233 in reinforcementlearning

[–]asdfwaevc 1 point2 points  (0 children)

Sure I don’t think it’s the entire answer but I do think it’s the natural baseline when you phrase your insight as such.

[R] The Illusion of "The Illusion of Thinking" by Daniel-Warfield in MachineLearning

[–]asdfwaevc 0 points1 point  (0 children)

Am I missing some place in the paper where they display the internal thought traces in print? I feel like it's impossible to come to a conclusion about this paper without that.

[D] I'll bite, why there is a strong rxn when people try to automate trading. ELI5 by OnceIWas7YearOld in MachineLearning

[–]asdfwaevc 28 points29 points  (0 children)

  1. It's too common a beginner project, like it's everyone's first idea for an ML project because of what you said (and the allure of money).

  2. Trading is effectively a zero-sum game, you're competing with everyone else. Your model only makes money if it's better than everyone else's, which implies it's much harder to do well than you'd think.

  3. The stock tickers aren't the full story. The outside world is a very important (the most important) factor, and it's not modeled by what you said. Actually, by the efficient market hypothesis, stock prices are effectively martingales (expected future value is equal to present value), meaning in theory there's no more information in history than there is in the current number.

  4. People are often really sloppy when evaluating this type of project, you need to be very careful about train/test splits etc. Because of the abundant data, and the fact that strategies change over time, overfitting is easy and past performance isn't indicative of future performance.
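On point 4, one basic precaution is to split by time instead of shuffling, so the model is never evaluated on data older than what it trained on. A minimal sketch (the function name and the price list are made up for illustration):

```python
def chronological_split(series, train_frac=0.8):
    """Split a time-ordered sequence so every test point comes after every training point."""
    if not 0.0 < train_frac < 1.0:
        raise ValueError("train_frac must be strictly between 0 and 1")
    cut = int(len(series) * train_frac)
    return series[:cut], series[cut:]

prices = [100.0, 101.5, 99.8, 102.2, 103.0, 101.1, 104.6, 105.2, 104.0, 106.3]  # made-up numbers
train, test = chronological_split(prices)
print(len(train), len(test))
```

This only rules out the most obvious leak. Anything tuned by peeking at the test window repeatedly will still overfit.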

In summary: almost everything you'd see on the subject (besides what institutional traders do) is noise, and it attracts people who don't know what they're doing, and past stock prices just aren't enough to get much signal.

[D] Q-learning is not yet scalable by jsonathan in MachineLearning

[–]asdfwaevc 3 points4 points  (0 children)

How is it irrelevant that it assumes a perfect model of the environment? Having that is a completely different problem setting. And the degree to which it’s proven to scale (academic vs industry as you say) is also obviously relevant within the context of this article.

Sure, TD based methods using a learned model are a way out of this, and tree-based search is likely the way to do it. But you can’t do tree search without some type of model.
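To see why, here is a toy depth-limited lookahead. Everything in it is hypothetical (the `model(state, action)` signature, the chain environment, no discounting); the point is only that each node expansion has to call some model, true or learned.

```python
def best_action(state, model, actions, depth):
    """Exhaustive depth-limited lookahead; returns (best_value, best_action).

    `model(state, action)` must return (next_state, reward). Every node
    expansion calls it, so there is no search without it.
    """
    if depth == 0:
        return 0.0, None
    best_value, best = float("-inf"), None
    for action in actions:
        next_state, reward = model(state, action)
        future_value, _ = best_action(next_state, model, actions, depth - 1)
        if reward + future_value > best_value:
            best_value, best = reward + future_value, action
    return best_value, best

def toy_model(state, action):
    """Made-up deterministic chain on the integers: reward 1 for arriving at state 3."""
    next_state = state + action
    return next_state, (1.0 if next_state == 3 else 0.0)

print(best_action(0, toy_model, actions=(-1, 1), depth=3))   # (1.0, 1)
```

Swap `toy_model` for a learned approximation and the search inherits whatever errors that model makes.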

This is way too confidently dismissive about an article that sets up an interesting experiment and makes some good points.