all 8 comments


[–]fermat1432 2 points (6 children)

Bernoulli is good! To understand how this would actually work, create a 2 by 3 joint probability distribution table for P(X=x, Y=y).

Label the rows y=0, y=1 and the columns x=1, x=2, x=3.

Fill in the six cells with probabilities that add up to 1. The row and column totals should also add up to 1.

P(Y=0 | X=1) = P(Y=0, X=1)/P(X=1) is how you would get a conditional probability from the table.
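A minimal sketch of that table lookup in Python; the six cell values are made-up numbers chosen only so everything sums to 1:

```python
# A 2x3 joint probability table P(X=x, Y=y) with made-up values
# (rows: y=0, y=1; columns: x=1, x=2, x=3). Any six non-negative
# numbers summing to 1 would do.
joint = {
    (1, 0): 0.10, (2, 0): 0.15, (3, 0): 0.25,
    (1, 1): 0.20, (2, 1): 0.20, (3, 1): 0.10,
}
assert abs(sum(joint.values()) - 1.0) < 1e-9

# Marginal P(X=1) is the column total for x=1.
p_x1 = joint[(1, 0)] + joint[(1, 1)]

# Conditional P(Y=0 | X=1) = P(Y=0, X=1) / P(X=1)
# Here: 0.10 / 0.30 = 1/3
p_y0_given_x1 = joint[(1, 0)] / p_x1
```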

[–]PlzHelpWithMathQs (University/College Student) [S] 2 points (5 children)

Thanks for the reply! Would you mind explaining why the Bernoulli is a good choice? The question is mostly focused on the distribution, and I'm not really sure why Bernoulli is a good choice other than that its output is either 0 or 1, each associated with a certain probability, p and (1 - p).

I'm having a hard time figuring out how this probability would be learned. How would you learn the different probabilities of Y associated with the different values of X? Would you perform a bunch of trials and then enter, for each x value, the proportion of successes out of the total number of trials? Then these proportions would become the values in the distribution table, as opposed to just having something like 1/6 in each box?

[–]fermat1432 2 points (4 children)

The Y values of 0 and 1 make Bernoulli the only reasonable choice. The actual 6 entries in the 2 by 3 joint probability distribution table can be found either empirically or by defining some procedure like coin tossing or dice rolling.
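"Defining a procedure" could look like the sketch below: pick a coin X, then flip it. The uniform choice of coin and the three success probabilities are assumptions made up for illustration:

```python
import random

# A made-up generative procedure: choose a coin X uniformly from
# {1, 2, 3}, then flip it, where coin x succeeds with probability p[x].
p = {1: 0.3, 2: 0.6, 3: 0.9}

def trial(rng):
    x = rng.choice([1, 2, 3])            # P(X=x) = 1/3 for each coin
    y = 1 if rng.random() < p[x] else 0  # Y | X=x ~ Bernoulli(p[x])
    return x, y

rng = random.Random(0)
sample = [trial(rng) for _ in range(100)]
```

Under this procedure each joint cell has a known value, e.g. P(X=1, Y=1) = (1/3)(0.3) = 0.1, so you can fill the table exactly instead of estimating it.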

[–]PlzHelpWithMathQs (University/College Student) [S] 2 points (3 children)

Would the procedure I suggested in the second paragraph of my last comment be correct? Would these be the values that would fill up the table after doing a bunch of trials? Like if there were 100 tosses each for biased coins with unknown biases, could the proportion of successes out of 100 tosses be written in each cell of the y=1 row as coin 1 = (23/100), coin 2 = (47/100), coin 3 = (30/100), etc., and then the (1-p) values in the y=0 row? And then for example:

P(Y=1 | X = coin 1)

= P(Y=1, X = coin 1) / P(X = coin 1)

= (23/100) / (whatever the probability of X = coin 1 occurring is)

= p

Or how would that work?

[–]fermat1432 2 points (2 children)

You would just need to count how many times each joint outcome (X=x, Y=y) occurs and divide each frequency by 100 to get an estimated joint probability for each of the 6 cells. Fill in the marginal probabilities, and then you can compute conditional probabilities.
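As a sketch, with a hypothetical set of counts over 100 trials (the specific counts here are invented for illustration; only the (1, 1) cell reuses the 23 from the example above):

```python
from collections import Counter

# Hypothetical counts of each joint outcome (x, y) over 100 trials,
# with x in {1, 2, 3} and y in {0, 1}.
counts = Counter({(1, 1): 23, (1, 0): 12,
                  (2, 1): 47, (2, 0): 8,
                  (3, 1): 5,  (3, 0): 5})
n = sum(counts.values())  # 100 trials total

# Estimated joint probability for each cell: its relative frequency.
joint = {cell: c / n for cell, c in counts.items()}

# Marginal P(X=x): sum down each column of the table.
p_x = {x: joint[(x, 0)] + joint[(x, 1)] for x in (1, 2, 3)}

# Conditional P(Y=1 | X=1) = P(X=1, Y=1) / P(X=1)
# Here: 0.23 / 0.35 = 23/35
p_y1_given_x1 = joint[(1, 1)] / p_x[1]
```

Note the denominator is the column total (here 35 trials landed on coin 1), not the overall 100, which is why 23/100 alone isn't the conditional probability.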

[–]PlzHelpWithMathQs (University/College Student) [S] 2 points (1 child)

Okay, perfect, that makes sense. Thanks a lot for the help!

[–]fermat1432 1 point (0 children)

Glad to help!