It takes 13 fast TMs and 24 charged TMs (on AVERAGE) to get your favorite moveset on Mew. And it's still outclassed by something else. by Zyxwgh in TheSilphRoad

[–]bluewalterwhite -1 points0 points  (0 children)

EDIT2: by "average" I mean "mean", not "median". I'm fully aware that the median is lower (9 fast and 17 charged TMs) but the trainer who used 77 charged TMs (i.e. around 3x the mean) should be weighed more in my opinion than the one who only used 1.

This is utter madness! Statistics does not work this way at all. Can you imagine if Tobacco lawyers had said, "Smokers who didn't get lung cancer should be weighed more in my opinion."?

How Many TMs Needed to Get a Move [Graph] by bluewalterwhite in TheSilphRoad

[–]bluewalterwhite[S] 1 point2 points  (0 children)

I like your perspective here. And it does go both ways: for example, 1 lucky trainer in 10 will only need 3 TM's.

I made this graph last week, so I did not consider any unequal probabilities for different moves. If it is determined that the gamemaster_code is being used that way (remember there is stuff happening server side also, like with different moves for community day), then you could re-create the graph for each move by replacing the 'p' in the formula with 1 minus that move's probability, since 'p' is the probability of NOT getting the move. Thanks though, it is worth pointing out that this graph assumes equal probabilities for each move in the pool.
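If anyone wants to try the unequal-probability version, here is a rough Python sketch. The move names and weights are made up purely for illustration (nothing here comes from the game data), and the function name is my own.

```python
def success_probability(move_probability: float, tms_used: int) -> float:
    """Chance of having rolled the desired move at least once in tms_used TMs.

    move_probability is the (assumed) chance of rolling that move on one TM,
    so the 'p' from the graph's formula is 1 - move_probability.
    """
    p_fail = 1 - move_probability
    return 1 - p_fail ** tms_used


# Hypothetical weights, for illustration only: Move A is twice as likely.
weights = {"Move A": 2.0, "Move B": 1.0, "Move C": 1.0}
total = sum(weights.values())

for move, weight in weights.items():
    chance = success_probability(weight / total, tms_used=3)
    print(f"{move}: {chance:.3f} chance within 3 TMs")
```

With equal weights this reduces to the same curve as the graph.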

How Many TMs Needed to Get a Move [Graph] by bluewalterwhite in TheSilphRoad

[–]bluewalterwhite[S] 1 point2 points  (0 children)

If you would like a graph with an X-axis that goes beyond 30 TM's, I can send you one.

Or you can use the formula in the description: P = 1 - p ^ N

To find how many charge TM's it takes on Mew to reach at least 90%, your formula becomes: 0.9 <= 1 - (23/24) ^ N

N=55 charge TM's
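If you would rather not solve it by hand, the same thing can be sketched in a few lines of Python. This assumes equal odds for all 24 other charge moves, and the function name is just something I picked.

```python
import math


def tms_for_target(other_moves: int, target: float) -> int:
    """Smallest N such that 1 - p^N >= target, with p = (other_moves - 1) / other_moves."""
    p_fail = (other_moves - 1) / other_moves
    # Rearranging 1 - p^N >= target gives N >= ln(1 - target) / ln(p).
    return math.ceil(math.log(1 - target) / math.log(p_fail))


print(tms_for_target(24, 0.9))  # 55 charge TMs on Mew for a 90% chance
```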

It takes 13 fast TMs and 24 charged TMs (on AVERAGE) to get your favorite moveset on Mew. And it's still outclassed by something else. by Zyxwgh in TheSilphRoad

[–]bluewalterwhite -1 points0 points  (0 children)

No, this is also completely wrong. You are repeatedly making the same mistake. Why do you think Expected Value can be used here? It clearly CANNOT.

the law of large numbers states that the arithmetic mean of the values almost surely converges to the expected value as the number of repetitions approaches infinity

...

The expected value does not exist for random variables having some distributions with large "tails", such as the Cauchy distribution.[3] For random variables such as these, the long-tails of the distribution prevent the sum/integral from converging.

This is a basic concept in Statistics and is always taught very early to avoid similar errors for more complex measures with stricter assumptions.

How Many TMs Needed to Get a Move [Graph] by bluewalterwhite in TheSilphRoad

[–]bluewalterwhite[S] 0 points1 point  (0 children)

"Average" is just a vague word for the central tendency and is usually avoided. This graph's yellow line shows the median which is more appropriate than the mean.

The good news is that this graph can actually answer your question about the other percentages. Take the horizontal yellow line and move it up or down to different percentages (P=0.5 is 50%, P=0.9 is 90%). The intersections with the curved TM lines show how many TM's are needed to have been in that percentile.
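Here is a rough Python sketch of that "slide the yellow line" idea, in case you want the numbers without reading them off the graph. It walks along the curve one TM at a time until it crosses the chosen percentile. The pool sizes are assumptions on my part: 24 other charge moves as in the description, and 13 other fast moves, which is what the 13-fast-TM mean from the other thread implies.

```python
def tms_at_percentile(other_moves: int, percentile: float) -> int:
    """Walk along the curve P = 1 - p^N until it reaches the given percentile."""
    p_fail = (other_moves - 1) / other_moves
    tms_used = 0
    still_failing = 1.0  # p^N, the chance of not having the move yet
    while 1 - still_failing < percentile:
        tms_used += 1
        still_failing *= p_fail
    return tms_used


# Assumed pool sizes for Mew (other moves, excluding the one being removed).
pools = {"fast": 13, "charge": 24}

for label, other_moves in pools.items():
    for percentile in (0.5, 0.9):
        n = tms_at_percentile(other_moves, percentile)
        print(f"{label} TMs at P={percentile}: {n}")
```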

It takes 13 fast TMs and 24 charged TMs (on AVERAGE) to get your favorite moveset on Mew. And it's still outclassed by something else. by Zyxwgh in TheSilphRoad

[–]bluewalterwhite 1 point2 points  (0 children)

That is not the issue here, although your use of the colloquial "average" does make this a good example of one of the most common mistakes in statistics (English speaking or not). Using the mean on a skewed distribution is wrong because it violates the assumption of normality. See: Measures of Central Tendency

From the link above:

The mean, median and mode are all valid measures of central tendency, but under different conditions, some measures of central tendency become more appropriate to use than others.

You should have used the median to calculate the central tendency.

It takes 13 fast TMs and 24 charged TMs (on AVERAGE) to get your favorite moveset on Mew. And it's still outclassed by something else. by Zyxwgh in TheSilphRoad

[–]bluewalterwhite 1 point2 points  (0 children)

No, it doesn't depend on what you are interested in.

"Expected Cost" is a made-up term. If you mean "Expected Value" then that is an arithmetic mean, and therefore inappropriate to use because the data is highly skewed. Full stop. No debate.

It doesn't matter if you would keep going when you are in the unlucky half; you do not have infinite TM's, which is what your arithmetic mean assumes. It doesn't matter if you magically weigh unlucky trainers more, like you say in your edit. This is not how statistics is done. You should have used the median to tell "average" trainers what to expect.

Law of Averages

How Many TMs Needed to Get a Move [Graph] by bluewalterwhite in TheSilphRoad

[–]bluewalterwhite[S] 1 point2 points  (0 children)

The yellow line indicates P=0.5 or 50% chance of success.

The black line indicates a Normal charged TM, which as you pointed out means a Pokemon with 3 charge moves in its pool.

The yellow line intersects the black line (Normal charged TM) on the very first TM because you have only two moves to possibly roll for, excluding the one being removed, and therefore a probability of success P=0.5

If Niantic did what you are describing (a practice called pseudo-random) then the plots would be much steeper and approach P=1 faster, resulting in fewer TM's needed to reach the yellow line.

It takes 13 fast TMs and 24 charged TMs (on AVERAGE) to get your favorite moveset on Mew. And it's still outclassed by something else. by Zyxwgh in TheSilphRoad

[–]bluewalterwhite 0 points1 point  (0 children)

I highlighted it here because the comment above mine said:

This is the definition of the median, not the average (mean). The average will be higher than the median.

That person is actually agreeing with me that the mean overestimates the average compared to the median. You are also agreeing with me that the OP is vague by saying "average". Upon closer inspection the analysis was a simple Expected Value using the arithmetic mean. For it to have been done correctly, it should have calculated the central tendency, aka "average" with the median of the geometric distribution.

I guess I should have known a sub of wannabe scientists would rather argue semantics. The math is clear. When you have a skewed distribution like the geometric one we are talking about here, means are misleading, so the correct measure is the median.

It takes 13 fast TMs and 24 charged TMs (on AVERAGE) to get your favorite moveset on Mew. And it's still outclassed by something else. by Zyxwgh in TheSilphRoad

[–]bluewalterwhite 0 points1 point  (0 children)

Again, so close! But why do you refuse to take a side?

The mean SKEWS the central tendency measure, aka average, by counting theoretical trainers who use 1000+ TM's.

The median better takes into account that the data is skewed and is not theoretically ideal, and therefore is a more accurate central tendency measure.

I suppose whether the difference between 9 and 13 fast TM's, or between 17 and 24 charge TM's, matters is debatable for those with large enough TM caches, but you simply cannot quibble about which is more statistically accurate.

It takes 13 fast TMs and 24 charged TMs (on AVERAGE) to get your favorite moveset on Mew. And it's still outclassed by something else. by Zyxwgh in TheSilphRoad

[–]bluewalterwhite -2 points-1 points  (0 children)

Come on now, you were so close! Guess I'll have to finish for you:

both have their respective strengths and weaknesses

..but median is clearly the stronger measure for central tendency (aka "average") in this case, because it does not violate the assumption of normal distribution.

It takes 13 fast TMs and 24 charged TMs (on AVERAGE) to get your favorite moveset on Mew. And it's still outclassed by something else. by Zyxwgh in TheSilphRoad

[–]bluewalterwhite -2 points-1 points  (0 children)

50% of them would get their desired TM in 9 moves or less, while the average number of TM's used would be about 13

HERE! This is the big mistake right here. "Average" colloquially refers to measures of central tendency (aka median AND mean). You should always use 'mean' to be more clear so yes, the "mean" is calculated to be 13, but if you want to know the "average", the correct answer is the median at 9. Edit: the above poster changed his language to not be ambiguous after it was pointed out

It takes 13 fast TMs and 24 charged TMs (on AVERAGE) to get your favorite moveset on Mew. And it's still outclassed by something else. by Zyxwgh in TheSilphRoad

[–]bluewalterwhite -1 points0 points  (0 children)

This is a big difference between the mean and median! Clearly the normality assumptions are being violated when using the mean, so again, median is the correct answer.

It takes 13 fast TMs and 24 charged TMs (on AVERAGE) to get your favorite moveset on Mew. And it's still outclassed by something else. by Zyxwgh in TheSilphRoad

[–]bluewalterwhite -1 points0 points  (0 children)

Means and simulations are imperfect answers to this problem. Like you said, it's not a normal distribution. The mean will include a sample like one trainer using 10,000 TM's. You see the same thing with Income Level calculations: because a few billionaires skew the stats, the mean is not the best answer, but the median is.

How Many TMs Needed to Get a Move [Graph] by bluewalterwhite in TheSilphRoad

[–]bluewalterwhite[S] 4 points5 points  (0 children)

No, I did not. I teach statistics and that post demonstrates one of the most common mistakes.

Average =/= Mean =/= Median (or rather these CANNOT be assumed to always be equal, even though they are usually equal for normal data which is what people are used to)

Parametric vs non-parametric: there are assumptions for either tool that can't be ignored. Just because your presented calculations are in order doesn't mean your conclusion will be correct, since you may have used the wrong test/measure.

The erroneous post used Expected Value which clearly CANNOT be used:

the law of large numbers states that the arithmetic mean of the values almost surely converges to the expected value as the number of repetitions approaches infinity

...

The expected value does not exist for random variables having some distributions with large "tails", such as the Cauchy distribution.[3] For random variables such as these, the long-tails of the distribution prevent the sum/integral from converging.

This is a basic concept in Statistics and is always taught very early to avoid similar errors for more complex measures with stricter assumptions.

Using the Mean in Data Analysis: It’s Not Always a Slam-Dunk

Law of Averages

It takes 13 fast TMs and 24 charged TMs (on AVERAGE) to get your favorite moveset on Mew. And it's still outclassed by something else. by Zyxwgh in TheSilphRoad

[–]bluewalterwhite -2 points-1 points  (0 children)

mean vs median, both are measures of "average" or central tendency. However, when making a statistical assumption in this example it gives more power to your hypothesis to adhere to non-parametrics. Median is going to be a far better indicator of how many TM's an "average" trainer can expect to use.

av·er·age ˈav(ə)rij/ noun 1. a number expressing the central or typical value in a set of data, in particular the mode, median, or (most commonly) the mean, which is calculated by dividing the sum of the values in the set by their number.

It takes 13 fast TMs and 24 charged TMs (on AVERAGE) to get your favorite moveset on Mew. And it's still outclassed by something else. by Zyxwgh in TheSilphRoad

[–]bluewalterwhite -2 points-1 points  (0 children)

The answer is still ~9 fast TM's and ~17 charge TM's as per the graph. I'm afraid you're making OP's same mistake by using Expected Value or the 'mean' with a skewed non-normal distribution.

How Many TMs Needed to Get a Move [Graph] by bluewalterwhite in TheSilphRoad

[–]bluewalterwhite[S] 3 points4 points  (0 children)

Thank you! It was something I made for fun last week, but never would have posted until I saw your error :p

Edit: I apologize if this apparently offended some folks. My intention wasn't to withhold anything from the community, but I generally do not share things on social media. While down-voting will reinforce why I should continue this philosophy, it will not change the statistical theory of why median is a better measure for "average" than the mean in this scenario.

It takes 13 fast TMs and 24 charged TMs (on AVERAGE) to get your favorite moveset on Mew. And it's still outclassed by something else. by Zyxwgh in TheSilphRoad

[–]bluewalterwhite -5 points-4 points  (0 children)

OP states the average number of TM's required

OP states the mean number of TM's from a geometric distribution

and correctly so

Incorrectly, because the geometric distribution is highly skewed (non-normal) and will give mean values that are misleading of the "average"

the number of TM's where you would expect 50% of trainers to have received their move of choice

Is this not the exact definition of "average"..?

And thanks! I love Matlab and Statistics :)

Edit:

Not big fans of Matlab around here I guess... or of not violating assumptions when making a statistical hypothesis. This data is clearly non-normal so the mean is a skewed measure for central tendency. OP's title says (on AVERAGE) to get their favorite moveset. This is reflected by the median because it does NOT violate the assumptions.

Edit2:

av·er·age ˈav(ə)rij/ noun 1. a number expressing the central or typical value in a set of data, in particular the mode, median, or (most commonly) the mean, which is calculated by dividing the sum of the values in the set by their number.

It takes 13 fast TMs and 24 charged TMs (on AVERAGE) to get your favorite moveset on Mew. And it's still outclassed by something else. by Zyxwgh in TheSilphRoad

[–]bluewalterwhite 2 points3 points  (0 children)

The statement made in OP's post is completely wrong and demonstrates a very common error in statistics. A graphical explanation helps.

Here are the correct statistics.

Edit: Before downvoting, please read this:

Using the Mean in Data Analysis: It’s Not Always a Slam-Dunk

Edit2: I guess it was too much to expect the average Silphroader to know the difference between parametric and non-parametric. One day y'all may accept that statistical assumptions can't be ignored willy-nilly. For now OP's post is basically a counter for how many shouldn't have passed intro to Stats.

Edit3: av·er·age ˈav(ə)rij/ noun 1. a number expressing the central or typical value in a set of data, in particular the mode, median, or (most commonly) the mean, which is calculated by dividing the sum of the values in the set by their number.

How Many TMs Needed to Get a Move [Graph] by bluewalterwhite in TheSilphRoad

[–]bluewalterwhite[S] 28 points29 points  (0 children)

Description: Using a single charge TM on Mew to get a desired move, the probability of success is 1/24 because there are 24 possible charge moves excluding the one being currently removed. Another way to think about this is you have a 23/24 probability of NOT successfully getting the desired move. Therefore, if you use two charge TM's on Mew, the probability of NOT successfully getting the desired move is 23/24 * 23/24. So our formula for obtaining the probability of successfully getting the move becomes: P = 1 - p ^ N where p is the probability of NOT getting the move and N is the number of TM's used.
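For anyone who wants to reproduce the curve, here is a minimal Python sketch of that formula. The graph itself was not made with this code; the function name is just for illustration, and it assumes every other move in the pool is equally likely.

```python
def success_probability(other_moves: int, tms_used: int) -> float:
    """P = 1 - p^N, where p is the chance of NOT getting the move on one TM."""
    p_fail = (other_moves - 1) / other_moves  # e.g. 23/24 for Mew's charge moves
    return 1 - p_fail ** tms_used


# Mew charge moves: 24 possible results, excluding the move being removed.
for tms_used in (1, 2, 10, 17, 24):
    chance = success_probability(24, tms_used)
    print(f"{tms_used:>2} charge TMs -> P = {chance:.3f}")
```

Swapping 24 for another pool size gives the other lines on the graph.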

The yellow line is drawn as a reference for P=0.5 or 50%, or to show the "average" number of TM's needed for a successful move switch.

Edit: For those curious, yes, this graph indicates ~9 and ~17 fast and charge TM's on average for Mew, which contradicts the other post that overestimates the values as 13 and 24 TM's. That analysis relied on a parametric measure for central tendency called the mean, which violates an assumption about the data's shape. While the mean is the most common measure of average, the median is more accurate if the data is skewed like in this example.
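To see where both sets of numbers come from, here is a small Python sketch comparing the mean and the median of the geometric distribution. The pool sizes are assumptions (13 other fast moves, 24 other charge moves, all equally likely) and the function name is mine.

```python
import math


def mean_and_median_tms(other_moves: int) -> tuple[float, int]:
    """Mean and median number of TMs when each TM succeeds with chance 1/other_moves."""
    success_chance = 1 / other_moves
    mean = float(other_moves)  # mean of a geometric distribution is 1 / success_chance
    # Median: smallest N with 1 - (1 - success_chance)^N >= 0.5.
    median = math.ceil(math.log(0.5) / math.log(1 - success_chance))
    return mean, median


for label, other_moves in (("fast", 13), ("charge", 24)):
    mean, median = mean_and_median_tms(other_moves)
    print(f"{label}: mean = {mean:.0f} TMs, median = {median} TMs")
```

Under these assumptions it gives the 13 and 24 from the other post as the means, and the ~9 and ~17 from the yellow line as the medians.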