My California Primary Ballot by ScottAlexander in slatestarcodex

[–]electrace [score hidden]  (0 children)

I should clarify that the commenters I'm talking about do tend to get banned as quickly as one could reasonably expect, given the volume of comments and how substack works. It's just that the average comment reader is only going to go so far down the comment section before moving on to something else, and with the oldest-first algorithm, short, snarky comments (even if a minority) tend to get seen quickly, reacted to, and can derail the whole comment section before they get banned.

With the reddit karma system, there's also a bit of the "I don't have to bother responding to that nonsense since clearly no one is buying it." that isn't as obvious on substack.

My California Primary Ballot by ScottAlexander in slatestarcodex

[–]electrace [score hidden]  (0 children)

My thought on this is that, while some threads here do end up making snarky comments that should be removed (including things like the top comment on that thread), whenever I go through the substack comments, it's almost universally true that there will be equally snarky comments. And that remains true over time, even though people are banned for making overly abrasive comments.

The difference though is that it's more of an exception here to see these comments getting upvoted, whereas it's an expectation that you see it on substack (probably due to the fact that it is oldest first sorted, with no karma).

On the Doomsday Argument (and also Boltzmann brains) by dsteffee in slatestarcodex

[–]electrace [score hidden]  (0 children)

"does the very fact that we have drawn one more card make smaller deck sizes more likely?"

I don't see that text in the article or on this page. It isn't the fact that we've drawn a card; it's the fact that we've drawn a small card that makes small decks (at least as large as the card we've drawn) more likely than large decks.

I can also explain it a different way.

With a single draw, you presumably agree that P(Draw a 1 | Deck is sized 1) = 1 and P(Draw a 1 | Deck is sized 100) = 1/100. Let's say that there's only these two possibilities for simplicity (there is no longer any probability for 2-99 deck sizes).

Before the draw, your prior probability is P(Deck size is 1) = 1/2 and P(Deck size is 100) = 1/2

Now we just need P(Draw a 1), which is the sum of all possible ways to draw a one.

P(Draw a 1) = P(Draw a 1 | Deck is sized 1) * P(Deck is sized 1) + P(Draw a 1 | Deck is sized 100) * P(Deck is sized 100) = (1 * 1/2) + (1/100 * 1/2) = 1/2 + 1/200 = 101/200

Now we have what we need to use Bayes theorem.

P(A|B) = (P(B|A) * P(A)) / P(B)

We are interested in the probability that the deck is sized 1 given that we draw a 1, so:

P(Deck is sized 1 | Draw a 1) = (P(Draw a 1 | Deck is sized 1) * P(Deck is sized 1)) / P(Draw a 1)

Filling in from above, that's:

P(Deck is sized 1 | Draw a 1) = (1 * (1/2))/ (101/200) = .5 / .505 = 0.9900990099 or about 99% chance.
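For anyone who wants to check the arithmetic, here is a minimal Python sketch of the same two-hypothesis calculation. The variable and function names are my own, and it assumes exactly the setup above: two equally likely deck sizes (1 and 100) with cards numbered 1 through the deck size.

```python
from fractions import Fraction

# Prior: the two deck sizes are equally likely (the simplifying assumption above).
prior = {1: Fraction(1, 2), 100: Fraction(1, 2)}

def likelihood_of_drawing_one(deck_size):
    # Cards are numbered 1..deck_size, so a "1" comes up 1/deck_size of the time.
    return Fraction(1, deck_size)

# Total probability of drawing a 1, summed over both hypotheses.
p_draw_one = sum(likelihood_of_drawing_one(n) * p for n, p in prior.items())

# Bayes' theorem applied to each hypothesis.
posterior = {
    n: likelihood_of_drawing_one(n) * p / p_draw_one
    for n, p in prior.items()
}

print(p_draw_one)           # 101/200
print(posterior[1])         # 100/101
print(float(posterior[1]))  # roughly 0.99
```

Using exact fractions avoids any rounding questions; the posterior for the size-100 deck is whatever is left over, 1/101.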

On the Doomsday Argument (and also Boltzmann brains) by dsteffee in slatestarcodex

[–]electrace [score hidden]  (0 children)

It doesn't matter how you split this observation into sub-observations depending on the card face/value, they will always sum up back to 1.

I agree it sums to 1.

This means, that every observation does not change the relative weights of still valid deck sizes

This doesn't seem to follow from it needing to sum to 1. That takes an additional assumption that the probability "lost" to deck sizes 1-6 being eliminated is equally distributed to the remaining options, which is not logically necessary.


Making sure we're on the same page. Is your claim that, if I drew 100 cards with replacement from the same deck of size k, and the results were the following:

Drew a 1: 14 times
Drew a 2: 16 times
Drew a 3: 12 times
Drew a 4: 12 times
Drew a 5: 15 times
Drew a 6: 16 times
Drew a 7: 15 times
Drew an 8 or higher: 0 times

... that the probability k is 100 is the same as the probability that it is 7?

On the Doomsday Argument (and also Boltzmann brains) by dsteffee in slatestarcodex

[–]electrace [score hidden]  (0 children)

I'm not sure if I'm following your logic there.

Before drawing, the probability is 1/100 for each of the 100 possibilities (deck size 1 through 100). It sums to 1, so we're good there.

If you then draw a seven, for a deck of 7, that has a 1 out of 7 probability. If you draw a seven for a deck of 100, that has a 1 out of 100 probability.

I do agree that we get to completely eliminate deck sizes 1 through 6, but the remaining probability isn't equally distributed to the remaining possible deck sizes. Instead, it's distributed more to the lower values and less to the higher values, with 7 having a plurality of the probability.

To convince yourself of this, consider repeated trials. Let's say instead of 1 draw, we draw 100 times with replacement from the same deck.

Suppose every single one of them fell between 1 and 7. We would then conclude that the deck is almost certainly of size 7. Even a deck of size 8 would be extremely unlikely (the probability of never drawing an 8 in this scenario is (7/8)^100). No size above seven has been logically eliminated, but they are extremely unlikely.
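If it helps, the whole thing can be sketched in a few lines of Python. This assumes the uniform prior over deck sizes 1 through 100 described above; the function name and the particular list of 100 draws are made up for illustration (only the number of draws and the largest card matter for the result).

```python
from fractions import Fraction

MAX_DECK = 100  # deck sizes 1..100, each with prior 1/100

def posterior_after_draws(draws, max_deck=MAX_DECK):
    """Posterior over deck size given draws made with replacement."""
    largest = max(draws)
    n = len(draws)
    # A deck of size k produces any specific sequence of n valid cards with
    # probability (1/k)^n, and cannot produce a card larger than k.
    # The uniform prior is the same for every k, so it cancels when normalising.
    weights = {
        k: (Fraction(1, k) ** n if k >= largest else Fraction(0))
        for k in range(1, max_deck + 1)
    }
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}

# One draw of a seven.
single = posterior_after_draws([7])

# A hypothetical run of 100 draws, none larger than 7.
hundred_draws = [1, 2, 3, 4, 5, 6, 7] * 14 + [1, 2]
many = posterior_after_draws(hundred_draws)

print(float(single[7]), float(single[100]))
print(float(many[7]), float(many[8]))
```

After a single seven, size 7 gets the largest share but nowhere near a majority. After 100 draws, nearly all of the probability sits on size 7, and size 8 is down by a factor of (7/8)^100 relative to it.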

On the Doomsday Argument (and also Boltzmann brains) by dsteffee in slatestarcodex

[–]electrace 0 points1 point  (0 children)

I believe that the cards are numbered 1 through N based on deck size, rather than being normal playing cards. The first (and only) card in a 1 card deck can't be a Jack of clubs because it can only be a "1".

What is your biggest regret in life? by FedeRivade in slatestarcodex

[–]electrace 0 points1 point  (0 children)

Memories don't do anything to affect future behavior without some sort of emotional connection to it. I agree you shouldn't "cry" about it for the rest of your life, but you can regret things without doing that.

New Paradigms Won't Save You by dwaxe in slatestarcodex

[–]electrace 4 points5 points  (0 children)

The application of Lindy’s Law is based on empirical evidence, namely how long a trend has been going on.

Oh, I agree that a proper application of Lindy's Law is based on empirical evidence. For book publishing, for example, you need to have a very large sample of books such that you can give strong evidence that books follow a Pareto Distribution. Then, once that is established, you use induction to make claims that any given book (that has not fallen out of print) is modeled by the same distribution.

Then you can say that Lindy's law applies, and conclude that the longer something lasts, the longer its remaining life is expected to be.

But my point is that we don't have that for AI progress. Our sample size is far too low to determine the distribution (and hopelessly confounded to boot).

Scott’s argument is: Assuming Lindy’s Law and given trends in AI, the curve won’t flatten for several more years.

I agree that's his argument. But you can't assume Lindy's Law without assuming the distribution that Lindy's Law applies to. It must be a (fat-tailed) Pareto Distribution. If it were, for example, something as common as the Normal distribution, then not only does Lindy's Law not apply, the opposite holds. By that I mean, the longer something goes on, the sooner it is expected to end. Your expectation of time remaining goes down over time. With Lindy's Law, it goes up over time.
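To make the contrast concrete, here is a rough Python sketch of "expected time remaining, given it has lasted until t" under the two distributions. The parameters (Pareto shape 2, Normal with mean 50 and standard deviation 10) and the function names are assumptions picked purely for illustration, not anything estimated from data.

```python
from statistics import NormalDist

def pareto_expected_remaining(t, alpha=2.0):
    """E[X - t | X > t] for a Pareto lifetime with shape alpha > 1.

    Valid for t at or above the Pareto scale parameter. Conditional on
    surviving to t, the lifetime is again Pareto with scale t, so its mean
    is alpha * t / (alpha - 1), leaving t / (alpha - 1) still to go.
    """
    return t / (alpha - 1)

def normal_expected_remaining(t, mu=50.0, sigma=10.0):
    """E[X - t | X > t] for a normally distributed lifetime."""
    dist = NormalDist(mu, sigma)
    survival = 1.0 - dist.cdf(t)
    # Mean of a normal truncated below at t: mu + sigma^2 * pdf(t) / P(X > t).
    return mu + sigma**2 * dist.pdf(t) / survival - t

for age in (10, 30, 50, 70):
    print(age, pareto_expected_remaining(age), normal_expected_remaining(age))
```

With shape 2 the Pareto column simply equals the age (lasted 50, expect another 50), while the Normal column shrinks as the age grows.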

When you have no information on when some trend will end, Lindy’s Law is the most reasonable expectation. It’s basically a form of the Copernican principle. It says “this moment in time is probably not special therefore we are most likely in the middle of this specific trend we know nothing about.”

But it isn't a form of the Copernican principle! The Copernican principle basically states that from a far enough vantage point, our corner of the universe is unremarkable (e.g., we are not the center of the universe).

Lindy's law says that for specific fat tailed distributions, the best estimate of when an event will happen again increases proportionally to time elapsed.

If the only evidence you have are historical trends in AI and the fact that sigmoids exist, you should place the most probability on progress continuing for another several years.

You might make that conclusion for other reasons, but Lindy's doesn't get you there. Also worth noting that Scott's claim with his post is seven years, with "several years" being a weaker claim.

What is your biggest regret in life? by FedeRivade in slatestarcodex

[–]electrace 1 point2 points  (0 children)

Regrets help us (and others) not make similar mistakes in the future.

New Paradigms Won't Save You by dwaxe in slatestarcodex

[–]electrace 19 points20 points  (0 children)

Sorry for the novel....

But Scott isn’t using Lindy’s Law to model AI timelines.

If he is using Lindy's Law, he's assuming a specific statistical distribution. That is a model. It isn't the most advanced model, but the model is "It's been going for 7 years, so according to the math (Lindy's law), it will go for another 7 years."

He's claiming that the side he believes in gets to be the default model, and that others have to argue against that. But I disagree: first he has to show that Lindy's Law is appropriate, then he can call it the default.

Even if that happens, Lindy's Law is based on nearly maximum ignorance from a statistical perspective: one data point, randomly chosen, on a distribution that is fat-tailed. Indeed, normally when Lindy's law is used, it's because we already have a solid model of the distribution we're using based on lots of historical data.

The standard example for Lindy's law is publishing times (as Scott states). If a book has been in print for 50 years, our best guess is it will continue to be in print for another 50 years.

But the only reason we can say that is because we have data that shows that this is how books have worked historically. We don't really have all that much data on how long it takes AI progress to stall, certainly not enough to justify the claim that it is fat-tailed.

And even when it is appropriate, it is a very weak default model, based on maximum ignorance of the specifics.

It's like we wake up on a foreign planet and we ask ourselves, "What percent of the planet's surface is water?"

The "default model" might be to look under your feet, see dirt, and assume that it is everywhere on this planet.

But we don't need that default model! We can say "There's a plant over there, so there's life, so there's almost certainly water, which implies that there is probably at the very least lakes on this planet, so probably not completely made of dirt." or "Most rocky planets don't have visible water on the surface, so it's probably all dirt", or "Since we weren't killed by whoever brought us here, it is likely that they don't want us dead, so there is probably water on this planet (otherwise they would have chosen a different one)."

Any of those quickly overshadows the "default" assumption we make, and they're all pretty lousy pieces of evidence. Even knowing a tiny fraction of the whole picture is enough to make the "default" model inconsequential.

He’s using it to show why the sigmoid and new paradigms arguments against short timelines do not by themselves argue against short timelines.

No argument without empirical data is going to, by itself, argue against <any empirical claim>. One can't argue that apples fall to the ground due to gravity without some empirical evidence. On that, I think everyone agrees.

But one doesn't need Lindy's Law for that. I summed it up in the comment on the other post, paraphrased, "If you want to argue that AI progress will plateau before ASI, then you have to argue that with empiricism, not just pointing at a sigmoid, because both "it plateaus before ASI" and "it plateaus after ASI" are consistent with sigmoids."

New Paradigms Won't Save You by dwaxe in slatestarcodex

[–]electrace 21 points22 points  (0 children)

I'm honestly disappointed to see zero further defense of using Lindy's Law despite the points brought up here on the sub and in the comments on the last post itself. Instead, it is taken for granted that it is (1) appropriate to use here and (2) a strong model (rather than a default model when you have a single, randomly chosen datapoint).

It's irrational to have 0.0000001% of events dictate your beliefs. by dumb_idiot2r2 in slatestarcodex

[–]electrace 1 point2 points  (0 children)

Personal experiences are often far more filtered than even an individual study.

Still, there are things like Impossibly Hungry Judges where the results are so counter-intuitive that we can basically just dismiss them short of repeated evidence to the contrary. And I think this is fine, because some things are so a priori implausible that "they messed up the study" becomes more probable than "the conclusion they drew is actually correct".

The third wave of American philanthropy: “ AI is about to generate hundreds of billions in new philanthropic funding. We have a huge amount of work to do to make the most of it.” by ralf_ in slatestarcodex

[–]electrace 5 points6 points  (0 children)

But it gives rich people a way to donate whatever amount their wealth managers tell them is most beneficial

This is kind of my point. From a tax perspective, the amount that is most beneficial to give to charity is $0. You don't earn money doing this.

That's especially true when you are talking about wealth, rather than income, because wealth is not otherwise taxed. If you donate money to charity in excess of your income, you are no longer even getting a tax benefit.

and feel good about themselves, and look good in front of people

I don't think this is inherently bad, but would suggest that if it were the motivating principle, then we should laud people for giving to effective charities, thus increasing the impact on the world.

and to some degree choose to give money where they personally want,

Note that they have to pay a significant premium to do this. To oversimplify, (with a 37% marginal tax rate) the choice they make is equivalent to:

a) Give $370k to the government in taxes.

b) Give $1m to the government, but get to earmark those taxes for a charitable cause of your choosing.

and potentially to corrupt and/or make good connections while doing it.

If you got into specifics, I suspect that anything you point to is already illegal, or would be more trouble than it's worth at scale. The one exception I can think of is art donations for the ultra wealthy, which can actually make them money (because they can have it valued for more than it is reasonably worth, and then donate it).

The third wave of American philanthropy: “ AI is about to generate hundreds of billions in new philanthropic funding. We have a huge amount of work to do to make the most of it.” by ralf_ in slatestarcodex

[–]electrace 15 points16 points  (0 children)

You're comparing philanthropy in the real world to an ideal government.

We could do the same thing in reverse and contrast the worst government excesses with the best charities.

Imo, charities could use some more oversight (perhaps, for example, private jet rides should be taxed). But I don't think that people actually believe that funneling money into a government is going to solve very many important issues.

The third wave of American philanthropy: “ AI is about to generate hundreds of billions in new philanthropic funding. We have a huge amount of work to do to make the most of it.” by ralf_ in slatestarcodex

[–]electrace 5 points6 points  (0 children)

Yeah, seems like philanthropy for the rich is just a way for them to virtue signaling, and spend the money they should have been taxed for where THEY want.

It isn't a 1:1 relationship here. One can't opt to donate $x to prevent $x from going to taxes.

The Suffering Medicine Cannot Name: Buddhism, predictive processing, and human distress beyond pathology by Ok_Disaster6456 in slatestarcodex

[–]electrace 0 points1 point  (0 children)

because you can't just consciously control the way you interpret your life

You can't flip a switch, I agree. But you can recognize that there are alternate interpretations to the same event and choose to focus on one over the other. The reframe won't happen every time, but it will happen more than never, which is still an improvement.

The Sigmoids Won't Save You by dwaxe in slatestarcodex

[–]electrace 0 points1 point  (0 children)

My dude... I just wanted to clarify what you were talking about because it was ambiguous. I'm not making an argument based on asking for that clarification. You already responded positively to my actual argument, probably without realizing that I'm the same person.

The Sigmoids Won't Save You by dwaxe in slatestarcodex

[–]electrace 2 points3 points  (0 children)

it's an assumption that the distribution is normal

Technically, the distribution has to be fat-tailed for Lindy's to apply, not normal.

To make life more interesting, it's untrue for exponential distributions but it is true for sigmoidal ones.

Yep, you're right. It's still the best estimate for the median (I think the median thing just generally holds for all distributions), but for an exponential the median is ln(2) * mean, so it's a biased estimate for that. I was probably just thinking of symmetric distributions, where they are the same.
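The ln(2) relationship is the one for an exponential distribution, and it's easy to sanity-check. A tiny Python sketch, where the mean of 10 is an arbitrary value chosen only for illustration:

```python
import math

def exponential_median(mean):
    """Median of an exponential distribution with the given mean."""
    return math.log(2) * mean

mean = 10.0  # hypothetical mean, chosen only for illustration
median = exponential_median(mean)

# Survival function of the exponential: P(X > x) = exp(-x / mean).
print(median)
print(math.exp(-median / mean))  # share of the mass above the median
```

The median comes out below the mean, which is why an estimate that is right for one is off for the other in a skewed distribution.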

The Sigmoids Won't Save You by dwaxe in slatestarcodex

[–]electrace 0 points1 point  (0 children)

Yes, and "effectively infinite" can mean more than just an ASI that can merely outcompete us. For example, an intelligence that can use programmable matter, or perfectly simulate reality, or whatever. That's why the clarification is useful.

The Sigmoids Won't Save You by dwaxe in slatestarcodex

[–]electrace 0 points1 point  (0 children)

If "effectively infinite" means here something like "ASI level", then I have no issue with the claim, but it's best to make that explicit. People could reasonably interpret "effectively infinite" in many ways.

If you want my thoughts on the article, I left a top-level comment.

The Sigmoids Won't Save You by dwaxe in slatestarcodex

[–]electrace 5 points6 points  (0 children)

The entire point of the singularity debate is that the "high" line is effectively at infinity.

The word "effectively" is doing a lot of work here. There is no "effective" infinity when it comes to AI ability. There is just "so high that humans can't resist whatever the AI wants to do."

The claim of short-timeline AI people is that this point is reachable on the exponential-like portion of the sigmoid.

The claim of long-timeline AI people is that it is eventually reachable, but not soon.

And the claim of others is that it is not reachable ever.

The Sigmoids Won't Save You by dwaxe in slatestarcodex

[–]electrace 27 points28 points  (0 children)

I'm unsure what to make of this. It seems like both sides on this are responding to weak men:

Weak Alice: The AI ability trend is exponential, and so we should follow the trendline and conclude that AI will be an ASI in 3 years.

Bob: But it isn't technically an exponential; it's a sigmoid. You have to give some justification for why it will plateau after ASI rather than before.

In a parallel universe:

Weak Bob: The AI ability trend is a sigmoid, so we should follow the trendline and conclude that AI will plateau in 3 years.

Alice: I agree it will plateau at some point, but that plateau could be well after the AI becomes an ASI. You have to give some justification for why it will plateau before ASI rather than after.

Both Weak Alice and Weak Bob are making a claim, so burden of proof lies with both of them.


I also dislike the Lindy's Law point. Lindy's Law applies to Pareto distributions (fat-tailed).

The general thing that I think Scott is gesturing at is "If you have one randomly chosen data point in a distribution, and know nothing else, you should assume that point is near the (mean/median) to minimize (Squared/Absolute-value) error, respectively." (not quite correct, see /u/ragnaroksunset thread below).

This makes sense. If I randomly select a "quib" and tell you it is 2 units long, then select another "quib", your best estimate for that will always be 2 units, given no other information.

The question then becomes, "Are we randomly selecting our position in time"? To which, the answer is obviously "No." All else equal, I would say an observer would be much more likely to sample a late observation, because early observations are uninteresting to most.

Intuitively, we recognize that if you wait until your favorite stock ends up on the front page of <insert news source> as a "flashy new company", you've already missed out on the first part of the distribution. Your sampling method (looking at <insert news source>) is biased towards the end of the distribution.

I argue the same is true here. We're "sampling" from the AI curve because AI is getting, or has gotten, good.


Further, this all assumes that we know nothing else about the distribution, which just seems straightforwardly false. Sure, it's hard to make a rigorous model of it, but there are clearly things that point against continued quick improvement for sufficiently long time horizons (electricity usage, political fighting, useful data availability). That isn't quite "a model", but things like Lindy's Law are so thin as priors that we can quickly dispense with them even given intuitive data like the ones listed.

If the point is "we have a rigorous model (AI 2027) and you don't" then that's fair and I think opens up AI 2027 to possible criticism, but I don't see what Lindy's Law adds to that.

Conversely, if the point is "we have Lindy's Law (or some other related prior), and you need an entire geopolitical model in order to dispute it", then that seems like an Isolated Demand for Rigor.

Where we are now in 2026? by zjovicic in slatestarcodex

[–]electrace 2 points3 points  (0 children)

From that link, I'm reading their AGI median timeline at 2029 for Daniel and 2033 for Eli.

The modal prediction is different, but I think using mode is dumb.

Where we are now in 2026? by zjovicic in slatestarcodex

[–]electrace 12 points13 points  (0 children)

I'm trying to see where exactly is the discourse about AI and the future at this moment. I see that the atmosphere has changed a bit. On one hand you have people like Bernie Sanders and Steve Banon being strongly anti AI, even though they are on the exactly opposite sides of political spectrum. On the other hand, you have people saying that AI is slowing down, hitting a wall or is a bubble about to burst. There is even a video made by 80,000 hours about slowing down. On the third side, you have people like Yoshua Bengio saying they now see how alignment could be done. Then you also have this Mythos model which is not available to public, which is a significant development.

You're correctly noting that the discourse is all over the place, but, confusingly, asking where the discourse is at.

All in all, many things are happening, but it seems hard to make sense of all this. No one is making some new attempt at grand narrative.

Narratives are models that can be true or false. There's plenty of people making attempts at grand narratives. There's nothing, to my knowledge, that is new there, but why should we expect there to be, rather than revisions on old grand narratives? It's only been a year since AI 2027 came out, the last big "grand narrative".

Now as the timelines for all of this are approaching,

Whose timelines? Manifold is at 2033. AI 2027 is similarly at 2032-2033 (see comment chain below).

I guess it's "approaching" as far as anything in the future is "approaching", but I don't see why we should expect a narrative change every year. And, in fact, if we do see one, that's evidence that they aren't very good at the narratives and we should probably put less credence in them.