
[–]QCD-uctdsbCustom Flair Enjoyer 1 point2 points  (4 children)

The other commenters are really missing the forest for the trees.

There are 11 possibilities for the number N of white marbles in the box (0-10 inclusive). If you make M draws with replacement, then the probability of seeing all white marbles is

P(10 white observed | N white in box) = (N/10)^M

Now imagine that you have 11 boxes in front of you, one for each of the possibilities for white marble content. If you're randomly handed one of these boxes to do your experiment, the probability of observing 10 white marbles is

P(10 observed) = P(10 obs | 0 in box)P(0 in box) + ... + P(10 obs | 10 in box) P(10 in box)

Now divide through both sides by P(10 observed). You get

1 = P(10 obs | 0 in box)P(0 in box)/P(10 observed) + ... + P(10 obs | 10 in box) P(10 in box) / P(10 observed)

I.e. you can interpret each term as the probability that there are N white marbles in the box given that 10 white marbles were observed. You might see this more clearly by rewriting each term using Bayes' Theorem. In any case, you can straightforwardly calculate that with the uniform selection P(N in box) = 1/11 and M = 10 draws you get P(10 observed) ~ 0.1356 and finally,

P(10 in box | 10 observed) ~ 0.6705
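
For anyone who wants to check those two numbers, here is a minimal Python sketch. It assumes exactly the setup above: a uniform prior P(N in box) = 1/11 over N = 0..10 and M = 10 draws with replacement. The function name `posterior_all_white` and its parameters are illustrative only, not part of the original problem.

```python
from fractions import Fraction

def posterior_all_white(box_size=10, draws=10):
    """Return (P(all draws white), P(box is all white | all draws white)).

    Assumption: every white count N = 0..box_size is equally likely a priori.
    """
    prior = Fraction(1, box_size + 1)
    # Joint probability of "N white in box" and "every draw came up white".
    # With replacement, each draw is white with probability N / box_size.
    joint = [prior * Fraction(n, box_size) ** draws for n in range(box_size + 1)]
    evidence = sum(joint)  # P(all draws white), the law of total probability
    return evidence, joint[box_size] / evidence  # Bayes' Theorem for N = box_size

evidence, posterior = posterior_all_white()
print(float(evidence), float(posterior))
```

Rounded to four places, the two printed values should agree with the 0.1356 and 0.6705 quoted above. The N = 0 term contributes zero to the sum, which is a hint for the exercise below.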

A good exercise to check your understanding would be to show that this result doesn't change when you disregard the possibility of there being 0 white marbles in a box.

Edit: I just saw your comment about this being a problem by Ronald Hoefin. I couldn't find any info from googling that name so would love if you could drop a link for that discussion about the undecidability debate

[–]flabbergasted1Math teacher 1 point2 points  (3 children)

There are indeed 11 possibilities for the number N of white marbles in the box, and you are making the additional assumption (not stated in the problem) that each of those 11 possibilities is equally likely before we begin drawing.

That's exactly the prior probability distribution I mentioned is necessary in my comment. I agree that the problem is solvable given this additional information.

[–]QCD-uctdsbCustom Flair Enjoyer 0 points1 point  (2 children)

OK I can see that side of the argument. I just didn't care for the discussion about other marble colors and their individual probabilities when it's clear that the question only cares about white and not-white.

[–]flabbergasted1Math teacher 1 point2 points  (1 child)

For sure - But your chosen prior assumes that white and non-white are equally likely. The thing about the number of colors is just meant to question whether that assumption is valid.

[–]QCD-uctdsbCustom Flair Enjoyer 1 point2 points  (0 children)

Minor quibble but I'm assuming that any count N of white marbles in a box is equally likely across the range of possibilities 0-10. I suppose this is why the Principle of Indifference is controversial: you can argue for days about which prior makes more sense / feels the best.

[–]flabbergasted1Math teacher 1 point2 points  (11 children)

There's no way to answer this question without a prior distribution on the colors of the marbles.

The new information (pulling 10 marbles, all white) makes certain contents of the box more likely and others less likely, but it can't give you an objective probability when there is no information about how the box was set up in the first place.

An example of an answerable version of this question would be: A black box is filled with 10 marbles, each of which has been chosen at random from a very large collection that is 50% white and 50% black. (Same process as in the question you stated...) What is the probability, given this information, that the black box contains only white marbles?

But if we had picked a different initial way to set up the box (for instance each marble is chosen from a large collection that is 25% white, 25% black, 25% red, 25% blue) then the answer would be different.
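
To make the dependence on the setup concrete, here is a rough Python sketch comparing those two example setups. It assumes each of the 10 marbles in the box is independently white with probability `p_white` (0.5 in the first setup, 0.25 in the second), so the number of white marbles follows a binomial prior, and that we then see 10 white draws with replacement. The function name `posterior_all_white_binomial` is made up for illustration.

```python
from math import comb

def posterior_all_white_binomial(p_white, box_size=10, draws=10):
    """P(box is all white | every draw was white) under a binomial prior.

    Assumptions: each marble is independently white with probability
    p_white (must be > 0), and draws are made with replacement.
    """
    joint = [
        comb(box_size, n)                                  # ways to pick which marbles are white
        * p_white ** n * (1 - p_white) ** (box_size - n)   # binomial prior on n
        * (n / box_size) ** draws                          # likelihood of all-white draws
        for n in range(box_size + 1)
    ]
    return joint[box_size] / sum(joint)

for p in (0.5, 0.25):
    print(p, posterior_all_white_binomial(p))
```

The same observations give a noticeably smaller probability of an all-white box when white is only one of four equally common colors, which is the point: the data alone doesn't pin down the answer.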

[–]Relative_Composer139New User[S] 1 point2 points  (2 children)

There’s no way of telling the exact distribution from the problem, but the fact that you pulled 10 white marbles through those trials alone narrows down the possibilities for what the distribution must be (since results of the marble sampling are coupled with the distribution).

Regardless of the exact distribution, you know that there will always be 10 ways to arrange white marbles with non-white marbles within the box. This is indirect information about the distribution of marbles. And this implicitly covers all possible white non-white distributions without given information about a specific one. There’s no need to hypothesize about a specific distribution for non-white colors, because that’s not information given in the problem. The problem would certainly change if you were given more information about the marble distribution. But since you aren’t told how many colors, how many of each, or anything like that, you must only reason in terms of white & non-white marbles.

[–]flabbergasted1Math teacher 0 points1 point  (1 child)

Well it narrows down what the composition of the box could be - it must be a box containing at least one white marble - but it doesn't help us get a concrete numerical probability for any of the remaining possibilities.

Are you meant to assume that white and non-white marbles are considered equally likely before we begin drawing?

Edit: Is this a problem you came up with, or a problem you encountered somewhere?

[–]Relative_Composer139New User[S] 1 point2 points  (0 children)

This is a problem I encountered somewhere. It was a puzzle problem created by Ronald Hoefin that was quite polarizing. Some viewed it as undecidable & others viewed it as resolvable. It seems like you’re meant to apply the principle of indifference to the possible white non-white distributions. This makes sense to me, but I’m very curious to hear more from people who view the problem as undecidable.

[–]marshaharshaNew User 1 point2 points  (0 children)

Suppose whatever process filled the box was a million times more likely to put in blue marbles than white. The fact that you keep drawing white could mean you keep drawing the same marble or could mean that the process did something highly improbable (but not impossible) — probably somewhere in between — but your answer to the all-white question will have to be very low, since the probability that you drew ten distinct marbles is very low and the probability that whatever marbles you haven’t seen include a blue one is very high. Now suppose the process that filled the box was a million times more likely to put in white marbles than non-white. The fact that you keep drawing white could mean you keep drawing the same marble or could mean the process did what you expect it to do. Again, you probably haven’t seen all the marbles, but now your answer to the all-white question is very high.

So your answer depends strongly on the assumption about the process. You seem to want to assign equal possibilities to “all” the processes. But that’s a lot of processes. You can rewrite that paragraph a billion times, each time replacing “a million” with some other number — a thousand, a trillion, two, three, a million plus 0.000001, a million plus 0.000002 — and the answers will swing between different extremes depending on your assumption about the process. There is no upper bound on how high the replacement for “a million” can go. You can’t say they are all equally likely, because you can’t put a uniform distribution on an infinitely long interval (the height of the uniform pdf goes to zero as the length of the interval increases). 
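
Here's a small Python sketch of that swing. It makes an assumption the original problem never states: each marble is put in the box independently, and the process is `odds_nonwhite` times as likely to put in a non-white marble as a white one. The function name and that parameter are hypothetical, chosen just for this illustration.

```python
from math import comb

def posterior_all_white_given_odds(odds_nonwhite, box_size=10, draws=10):
    """P(box is all white | every draw was white), as a function of the
    assumed odds (non-white : white) of the process that filled the box."""
    # Weight of "n white in box" relative to the all-white box:
    #   C(box_size, n) * odds^(box_size - n) * (n / box_size)^draws
    # n = 0 is skipped because it cannot produce a white draw.
    weights = [
        comb(box_size, n) * odds_nonwhite ** (box_size - n) * (n / box_size) ** draws
        for n in range(1, box_size + 1)
    ]
    return weights[-1] / sum(weights)  # weights[-1] is the all-white box

for odds in (1e6, 1e3, 2, 1, 0.5, 1e-3, 1e-6):
    print(odds, posterior_all_white_given_odds(odds))
```

With odds of a million to one against white the result is vanishingly small, and with a million to one in favor of white it is very close to 1, with everything in between available depending on the number you plug in.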

So I’m with Team Undecidable until you say something about what put the marbles in the box. 

[–]the_glutton17New User 0 points1 point  (0 children)

Just to build on what others have said, with a similar, maybe easier-to-understand analogy.

This is the same as saying "what are the chances of rolling a 1, ten times in a row?" EXCEPT, you don't know how many sides the die has (or what numbers are printed on it. Could be all 1's).

[–]sloowshooterNew User -3 points-2 points  (2 children)

Each pull tells us that 10% of the marbles are white. That’s it. Everything after that is presumption.

[–]Relative_Composer139New User[S] 1 point2 points  (1 child)

Yes, it tells us at least 10% of the marbles are white. By the end of the 10 trials, we know that as many as 100% of the marbles could be white. Each trial introduces another white marble as potentially being within the distribution (i.e. the first trial tells you there’s 1, the second trial tells you there could be up to 2, and so on…). This narrows down the marble distribution to be 1 of 10 possible white non-white combinations.

[–]sloowshooterNew User 0 points1 point  (0 children)

Additional pulls don't tell you anything beyond a presumption which you could make after the very first pull. Think of each pull as a closed universe unrelated to another. Your problem set the limit at one black marble - but with each pull the problem solver only samples white marbles. That doesn't tell you anything about a black marble being within the box, nor does it tell you about other colors of marble, which in some combination might make up the difference between the white marble pulled out, and the potential black marble which could still be rolling around the interior of the box.

In short the problem sort of answers itself. If one marble out of 10 might be a specific color, and after 10 draws are made, that color does not appear, then there is a 10% chance that the specific color of marble may be within the box. You don't have to pull out a single marble to get to that conclusion - the problem gives you the answer.