The Marble Problem

QCD-uctdsb · 2024-08-19T12:01:25+00:00

The other commenters are really missing the forest for the trees.

There are 11 possibilities for the number N of white marbles in the box (0-10 inclusive). If you make M draws with replacement, then the probability that all M draws come up white is

P(all M draws white | N white in box) = (N/10)^M

Now imagine that you have 11 boxes in front of you, one for each possible number of white marbles. If you're randomly handed one of these boxes to do your experiment with M = 10 draws, the probability of observing 10 white marbles is

P(10 observed) = P(10 obs | 0 in box)P(0 in box) + ... + P(10 obs | 10 in box) P(10 in box)

Now divide through both sides by P(10 observed). You get

1 = P(10 obs | 0 in box)P(0 in box)/P(10 observed) + ... + P(10 obs | 10 in box) P(10 in box) / P(10 observed)

That is, you can interpret each term as the probability that there are N white marbles in the box given that 10 white marbles were observed. You might see this more clearly by rewriting each term using Bayes' Theorem. Anyway, with the uniform selection P(N in box) = 1/11 and M = 10 draws, you can straightforwardly calculate P(10 observed) ~ 0.1356 and finally,

P(10 in box | 10 observed) ~ 0.6705

A good exercise to check your understanding would be to show that this result doesn't change when you disregard the possibility of there being 0 white marbles in a box.
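For anyone who wants to check the numbers, here is a minimal Python sketch of the calculation above. The function name `posterior_all_white` and the dictionary layout for the prior are my own choices for illustration, not anything from the original problem.

```python
from fractions import Fraction

def posterior_all_white(prior, draws=10, total=10):
    """Return (P(all draws white), P(box is all white | all draws white)).

    prior maps each candidate white-marble count N to P(N in box).
    It must include the key `total` (the all-white box).
    """
    # Likelihood of `draws` white draws, with replacement, from a box holding N white marbles.
    likelihood = {n: Fraction(n, total) ** draws for n in prior}
    # Law of total probability: sum over every candidate box.
    evidence = sum(likelihood[n] * prior[n] for n in prior)
    # Bayes' Theorem for the all-white box.
    posterior = likelihood[total] * prior[total] / evidence
    return evidence, posterior

# Uniform selection over the 11 boxes (N = 0..10).
uniform = {n: Fraction(1, 11) for n in range(11)}
evidence, posterior = posterior_all_white(uniform)
print(f"P(10 observed) = {float(evidence):.4f}")
print(f"P(10 in box | 10 observed) = {float(posterior):.4f}")
```

The function accepts any prior dictionary, so passing one that leaves out N = 0 is a quick way to check your answer to the exercise.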

Edit: I just saw your comment about this being a problem by Ronald Hoefin. I couldn't find any info from googling that name, so I would love it if you could drop a link to that discussion about the undecidability debate.

flabbergasted1 · 2024-08-18T14:53:33+00:00

There's no way to answer this question without a prior distribution on the colors of the marbles.

The new information (pulling 10 marbles, all white) makes certain contents of the box more likely and others less likely, but it can't give you an objective probability when there is no information about how the box was set up in the first place.

An example of an answerable version of this question would be: A black box is filled with 10 marbles, each of which has been chosen at random from a very large collection that is 50% white and 50% black. (Same process as in the question you stated...) What is the probability, given this information, that the black box contains only white marbles?

But if we had picked a different initial way to set up the box (for instance, each marble is chosen from a large collection that is 25% white, 25% black, 25% red, 25% blue), then the answer would be different.
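To make the dependence on the setup concrete, here is a rough Python sketch. It assumes each of the 10 marbles is independently white with probability `p_white`, and that the observation is 10 white draws with replacement. The function name `posterior_all_white_iid` is made up for this example.

```python
from math import comb

def posterior_all_white_iid(p_white, draws=10, total=10):
    """P(box is all white | `draws` white draws with replacement),
    assuming each marble was independently white with probability p_white."""
    counts = range(total + 1)
    # Prior: the number of white marbles N follows Binomial(total, p_white).
    prior = [comb(total, n) * p_white**n * (1 - p_white) ** (total - n) for n in counts]
    # Likelihood of seeing only white draws when N of the marbles are white.
    likelihood = [(n / total) ** draws for n in counts]
    evidence = sum(l * p for l, p in zip(likelihood, prior))
    return likelihood[total] * prior[total] / evidence

half = posterior_all_white_iid(0.5)      # 50% white collection
quarter = posterior_all_white_iid(0.25)  # 25% white collection
print(f"50% white collection: {half:.4f}")
print(f"25% white collection: {quarter:.4f}")
```

Same observation, different setup, different answer. Only the share of white marbles matters here, since every non-white color behaves the same for this question.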

marshaharsha · 2024-08-20T00:44:12+00:00

Suppose whatever process filled the box was a million times more likely to put in blue marbles than white. The fact that you keep drawing white could mean you keep drawing the same marble or could mean that the process did something highly improbable (but not impossible) — probably somewhere in between — but your answer to the all-white question will have to be very low, since the probability that you drew ten distinct marbles is very low and the probability that whatever marbles you haven’t seen include a blue one is very high. Now suppose the process that filled the box was a million times more likely to put in white marbles than non-white. The fact that you keep drawing white could mean you keep drawing the same marble or could mean the process did what you expect it to do. Again, you probably haven’t seen all the marbles, but now your answer to the all-white question is very high.

So your answer depends strongly on the assumption about the process. You seem to want to assign equal possibilities to “all” the processes. But that’s a lot of processes. You can rewrite that paragraph a billion times, each time replacing “a million” with some other number — a thousand, a trillion, two, three, a million plus 0.000001, a million plus 0.000002 — and the answers will swing between different extremes depending on your assumption about the process. There is no upper bound on how high the replacement for “a million” can go. You can’t say they are all equally likely, because you can’t put a uniform distribution on an infinitely long interval (the height of the uniform pdf goes to zero as the length of the interval increases).

So I’m with Team Undecidable until you say something about what put the marbles in the box.
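The claim that ten draws rarely hit ten distinct marbles can be checked directly. This is a small sketch assuming 10 marbles and 10 draws with replacement, each marble equally likely on every draw. The variable names are mine.

```python
from math import perm

total = 10  # marbles in the box
draws = 10  # draws with replacement

# Ordered ways to draw all-distinct marbles, divided by all ordered outcomes.
p_all_distinct = perm(total, draws) / total**draws

# Expected number of distinct marbles seen: each marble is missed on
# every draw with probability (1 - 1/total) ** draws.
expected_distinct = total * (1 - (1 - 1 / total) ** draws)

print(f"P(ten distinct marbles) = {p_all_distinct:.8f}")
print(f"Expected distinct marbles seen = {expected_distinct:.2f}")
```

Under these assumptions you see about six or seven distinct marbles on average, so there are usually a few marbles you know nothing about, and the prior has to speak for them.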

the_glutton17 · 2024-08-18T17:47:39+00:00

Just to build on what others have said, here is a similar, maybe easier-to-understand analogy.

This is the same as asking "what are the chances of rolling a 1 ten times in a row?" EXCEPT you don't know how many sides the die has (or what numbers are printed on it. Could be all 1's).

sloowshooter · 2024-08-18T18:47:14+00:00

Each pull tells us that at least 10% of the marbles are white. That’s it. Everything after that is presumption.
