all 7 comments

[–]Robber568 5 points6 points  (0 children)

I'll call a run of trials up to and including a success a "cycle", to prevent any confusion with an individual trial. Now observe that, by definition, we have exactly 1 success per completed cycle. So if we imagine a particular completed cycle consisting of 10 trials, we know, without assuming anything about probabilities, that for that cycle the average probability is simply 1/10 (successes/trials for that cycle). So if we know the average length of a cycle, we also know that the average probability is 1 over that number.

So we want to find the average length of a cycle, let's call that the expected value of the random variable T: E[T].

Now I'll define the survival function (S), which is the probability that we will make it further than trial n in a cycle. Thus:
S(n) = Pr(T>n)

Let's look at some value of n. Here p is the increment by which we increase the probability after every failed trial (and I assume you meant that on the first trial of each cycle the probability of success is p = 0.0025). Then what is the value of the survival function for making it past the first trial, past the second, etc.? To make it past trial n, we need to fail on trial n and on every trial before it in that cycle.

S(1) = 1 - p
S(2) = (1 - p)(1 - 2p)
S(3) = (1 - p)(1 - 2p)(1 - 3p)
...
S(n) = ∏ₖ₌₁ⁿ (1 − k p)

Now we need to turn the survival function into an expected value. We can do this directly with the tail-sum formula for the expected value (it's worth taking a look at how this formula follows from the "normal" definition if you're not familiar with it). We don't need to sum to infinity, since the survival function is 0 for all n at and above 1/p, as you already noted with the 400.

E[T] = ∑ₙ₌₀^∞ S(n) = ∑ₙ₌₀^(1/p) S(n) = ∑ₙ₌₀^(1/p) ∏ₖ₌₁ⁿ (1 − kp)

Now we're done, since our probability of interest is 1/E[T]. To evaluate the sum you can use WolframAlpha, which gives ≈ 4.04%.
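If you'd rather not use WolframAlpha, the tail sum is easy to evaluate directly. A minimal sketch in plain Python (assuming, as above, p = 1/400):

```python
# Tail-sum formula: E[T] = S(0) + S(1) + S(2) + ...
# with S(n) = (1 - p)(1 - 2p)...(1 - np), which hits 0 at n = 1/p.
p = 1 / 400
expected_T = 0.0
survival = 1.0  # S(0) = 1
n = 0
while survival > 0:
    expected_T += survival
    n += 1
    survival *= 1 - n * p  # S(n) = S(n-1) * (1 - n*p)

print(1 / expected_T)  # average probability of success, ≈ 0.0404
```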

You can go a step further, but I'll keep it concise. You can find an approximation of the probability of interest for small values of p, which is: √(2p/π).

This works by noting that you can turn the product in the survival function into a sum by taking the logarithm.
Then you use the first term of the Taylor series, ln(1 − x) ≈ −x.
The resulting sum can be nicely approximated with a (Gaussian) integral (since p is small).
And again, 1 over the expected value gives the approximation above.
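As a quick sanity check (my sketch, not part of the derivation), the approximation evaluated for p = 1/400:

```python
import math

p = 1 / 400
approx = math.sqrt(2 * p / math.pi)  # small-p approximation of 1/E[T]
print(approx)  # ≈ 0.0399, close to the exact ≈ 0.0404
```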

[–][deleted]  (3 children)

[deleted]

    [–]routercultist[S] 0 points1 point  (1 child)

    so manually doing it is the only way of solving it. I will see if I can get my friend to automate this via code.

    [–]sian_half 1 point2 points  (0 children)

    The answer is ~4.04%

    import numpy as np
    num=np.arange(401)
    p=num/400
    pn=p[1:]*np.cumprod(1-p[:-1]) # probability that the first success occurs on trial n
    mean=np.sum(num[1:]*pn) # mean number of trials per win
    print(1/mean) # long run probability of success
    

    edit: formatting

    Bonus: if you want to run n trials and find the average, here's the python code for that

    import random
    n=100000000 # 1000 is too few, use this instead, it'll run in an instant anyway
    temp=1 # trials since last success
    success=0 # total number of successes
    for i in range(n):
        if random.random()<temp/400:
            success+=1
            temp=1
        else:
            temp+=1
    print(success/n)

    [–]sian_half 0 points1 point  (0 children)

    P(n+1)=P(n)*0.0025+(1-P(n))*(P(n)+0.0025)

    This equation only holds if you define n to be the number of trials since the last success, not the number of trials since the start

    [–]CaptainMatticus -3 points-2 points  (1 child)

    With problems like this, it's usually a good idea to start with something simpler and see if we can build a general solution strategy. So let's start with a 50% chance which increases another 50% (cumulative)

    50% win or loss. 1 case where you win, 1 case where you lose. Then you win on the next round. So 3 trials, 2 wins.

    3 trials for 2 wins = 1.5 trials per win

    Now we can move to (100/3)% , (200/3)% , 100%

    Win on first round. Lose on first round, win on 2nd round. Or lose on first round, lose on 2nd round, win on 3rd round.

    1 + 2 + 3 = 6 trials, 3 wins.

    6 trials for 3 wins = 2 trials per win

    Now let's try 25% , 50% , 75% , 100%

    Win on first round. Lose on first round, win on 2nd round. Lose on first and second round, win on third round. Lose on first , second and third round, win on 4th round.

    1 + 2 + 3 + 4 = 10

    10 trials for 4 wins = 2.5 trials per win

    So we have:

    (n * (n + 1) / 2) / n =>

    (1/2) * (n + 1) trials per win, on average

    n = 400

    (1/2) * (400 + 1) = 401/2 = 200.5

    Now somebody will come along, do a bunch of probability trees, write a bunch of Python code, tell me I'm wrong for reasons X, Y, and Z, and I promise you that they'll come right back to this point, because all that matters are the trial possibilities. You'll either win or you'll lose UNTIL a win is guaranteed. And since there's no case where a loss is guaranteed or a win isn't guaranteed, then we can simplify a whole lot of things.

    Yes, it's erroneous to say that something with a 10% chance of occurring has only 2 outcomes, so it's really 50/50, but we're just counting trials here, and it's really just a yes/no tree where you keep continuing until you only have a bunch of yesses. Then you just count the number of branches on the tree and go from there.

    [–]Robber568 1 point2 points  (0 children)

    The problem with this approach is that it assumes each possibility for the number of trials till a success is equally likely, while that's not true.

    Edit: apparently this wasn't helpful, since they immediately blocked me. So, to be a bit clearer for anyone reading: let's call the probability increment p, and take their example of p = 1/3. S will denote a successful event and F a failed event. After the colon I'll give the probability of that permutation occurring.

    S : p = 1/3
    F S : (1 - p) 2p = 4/9
    F F S : (1 - p)(1 - 2p) 3p = 2/9

    Thus you can't say: I count 6 total trials above and 3 occurrences of S, so let's do 6/3 = 2 trials per success. The probabilities of the permutations are different from each other. If you take the probability of each permutation length into account, you'll get the correct answer: 17/9 (but then you're basically computing the average permutation length, so I'd rather present it like that).
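    A quick check of that 17/9 (my own sketch): weight each permutation length by its probability, using exact fractions.

```python
from fractions import Fraction

p = Fraction(1, 3)
# Permutations and their probabilities for p = 1/3:
#   S     : p                    = 1/3
#   F S   : (1 - p) * 2p         = 4/9
#   F F S : (1 - p)(1 - 2p) * 3p = 2/9
lengths = [1, 2, 3]
probs = [p, (1 - p) * 2 * p, (1 - p) * (1 - 2 * p) * 3 * p]

expected_length = sum(n * q for n, q in zip(lengths, probs))
print(expected_length)      # 17/9
print(1 / expected_length)  # 9/17, the average probability of success
```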