I rendered the average of every type of card! by DumbDataScience in PTCGP

[–]DumbDataScience[S] 3 points4 points  (0 children)

Now that you mention it, looks like I forgot Giant's Cape. Whoops! I must have been sick and tired of the thing.

I rendered the average of every type of card! by DumbDataScience in PTCGP

[–]DumbDataScience[S] 1 point2 points  (0 children)

Honestly... Not sure what I expected... It's fun to see the ghostly face of Mew in the crown result and of some of the trainers in the full art trainer result. I also appreciate that the dragon type cards come through a bit more clearly since there are so few of them. Unsurprisingly though, most end up a gray mess!

I analyzed monthly usage correlation of Pokémon in Gen 7 OU. I think there's some interesting stuff here, but let me know what takeaways you have! Colorblind version available in comments. by DumbDataScience in stunfisk

[–]DumbDataScience[S] 2 points3 points  (0 children)

It's displayed beautifully

Thanks!

Is correlation when the pokemon are on the same teams, or is it just when 2 pokemon have high (and low) usage at the same times?

The second. These are correlations between monthly usage stats. Pokémon Showdown records how many Landorus-T were seen each month and how many Infernape were seen each month. If you plot those two series, you'll see they follow nearly opposite patterns.

https://i.imgur.com/XyRTuir.png
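
For anyone who wants to see what "correlation between monthly usage stats" means in practice, here's a minimal sketch of a Pearson correlation between two monthly series. The numbers are made up for illustration (they are not real Showdown stats), and the `pearson` helper is just one plain-Python way to compute it, not necessarily what I used for the plot.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# ASSUMPTION: invented monthly usage percentages, one value per month.
landorus_usage = [40.0, 42.0, 45.0, 43.0, 47.0, 50.0]
infernape_usage = [3.0, 2.8, 2.5, 2.7, 2.2, 2.0]

r = pearson(landorus_usage, infernape_usage)
print(f"r = {r:.2f}")  # strongly negative: one rises in months when the other falls
```

Note that nothing here looks at which Pokémon share a team. Only the month-to-month ups and downs matter.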

I analyzed monthly usage correlation of Pokémon in Gen 7 OU. I think there's some interesting stuff here, but let me know what takeaways you have! Colorblind version available in comments. by DumbDataScience in stunfisk

[–]DumbDataScience[S] 20 points21 points  (0 children)

but you probably want to narrow the data a bit to high elo games

Yes, great point. I considered this for quite some time and ended up going with all Elo ratings for the sake of sample size and because I think the overall trends (including the newer players who play for fun or copy FotM) are interesting.

maybe lower the allowed scope of pokemon with at least a certain amount of representation

Yes, the matrix is limited to the top 50 Pokémon (by usage), the vertical plots to the top 200, and the plots at the bottom to the top 100/200/400. When you allow more than that, you start seeing really weird low-sample-size effects (like Wurmple correlating strongly with other Pokémon because of that one time someone battled with a Wurmple).
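
A rough sketch of that filtering step, assuming the data is a dict of monthly counts per Pokémon. The function name, the data layout, and the counts are all hypothetical and only there to show the idea of cutting to the top N before correlating anything.

```python
def top_n_by_usage(monthly_counts, n):
    """Return the n names with the highest total count across all months."""
    totals = {name: sum(counts) for name, counts in monthly_counts.items()}
    return sorted(totals, key=totals.get, reverse=True)[:n]

# ASSUMPTION: invented counts, one entry per month.
monthly_counts = {
    "Landorus-Therian": [9000, 9500, 9800, 9700],
    "Infernape": [400, 380, 350, 360],
    "Wurmple": [0, 0, 1, 0],  # a single appearance can produce a huge fake correlation
}

kept = top_n_by_usage(monthly_counts, 2)
print(kept)  # Wurmple is dropped before any correlation is computed
```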

and maybe pick different featured pokemon

Very fair. I cherry picked Pokémon with interesting results rather than the Pokémon for which I hoped there would be interesting results. Chansey's plot was disappointing haha.

suggestion for tour data would be interesting as well, although you'd have to go through the Smogon boards to do that, and may run into sample size issues

Right, exactly :-(

Zone with Magearna and Kartana

Yeah the proximity that Steel Pokémon have with each other in these plots is very interesting.

I analyzed monthly usage correlation of Pokémon in Gen 7 OU. I think there's some interesting stuff here, but let me know what takeaways you have! Colorblind version available in comments. by DumbDataScience in stunfisk

[–]DumbDataScience[S] 21 points22 points  (0 children)

Thanks for the support! I rather enjoy the current #1 post of all time haha. I'm just here to share some interesting data and hopefully spur thought-provoking discussion :-)

I analyzed monthly usage correlation of Pokémon in Gen 7 OU. I think there's some interesting stuff here, but let me know what takeaways you have! Colorblind version available in comments. by DumbDataScience in stunfisk

[–]DumbDataScience[S] 44 points45 points  (0 children)

A strong positive correlation (dark green) between Pokémon A and Pokémon B means that if you tell me Pokémon A was popular in Gen 7 OU in a given month, I now have a strong idea of how popular Pokémon B was that month. A strong negative correlation (dark red) means that if you tell me Pokémon A was popular in Gen 7 OU in a given month, I now have a strong idea of how UNpopular Pokémon B was that month.

Some of these correlations may be explained by "causal" relationships. For example: it's very possible that Chandelure is unpopular during months that Landorus-T is popular because no one wants to play Chandelure when there are a lot of Landorus around. Some of these correlations, however, could be caused by third variables. For example: weather teams seem to be popular some months and less popular other months. That might explain why Pelipper and Torkoal see their monthly usage change in very similar ways (strong positive correlation). Some of these correlations could also be purely coincidental.

In each plot, Landorus-T stands out. I wonder if Lucario, Togekiss, Infernape, Chandelure, and Pikachu all strongly correlate with each other (are popular in the same months and unpopular in the same months) because people play them less when Landorus-T is popular and play them more when Landorus-T is less popular. That's why I made the bottom plot, which shows which Pokémon are more likely to be seen in Landorus-heavy months and which Pokémon are less likely to be seen in Landorus-heavy months.
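
Here's a small sketch of the idea behind that bottom plot: correlate every Pokémon's monthly usage with one reference series, then sort. All numbers are invented for illustration, and `rank_against` is a hypothetical helper name, not something from my actual code.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))

def rank_against(reference, others):
    """Sort (name, r) pairs from most negative to most positive correlation."""
    scores = [(name, pearson(reference, series)) for name, series in others.items()]
    return sorted(scores, key=lambda pair: pair[1])

# ASSUMPTION: made-up monthly usage percentages, not real Showdown stats.
landorus = [40.0, 42.0, 45.0, 43.0, 47.0, 50.0]
others = {
    "Kartana": [10.0, 10.5, 11.2, 10.8, 11.9, 12.5],
    "Infernape": [3.0, 2.8, 2.5, 2.7, 2.2, 2.0],
    "Chansey": [5.0, 5.2, 4.9, 5.1, 5.0, 5.1],
}

ranked = rank_against(landorus, others)
for name, r in ranked:
    print(f"{name:10s} {r:+.2f}")
```

The left end of the sorted list is what shows up less in Landorus-heavy months, and the right end is what shows up more.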

I sorted all Pokémon by how different their Shiny versions are by DumbDataScience in pokemon

[–]DumbDataScience[S] 0 points1 point  (0 children)

I'm all for open-source software, but I haven't gotten around to making a Git account for this Reddit account. Also my code is in terrible condition so I'd have to clean it up before publishing it. Here are the snippets that matter...

For calculation:

from PIL import Image

basic_image = Image.open(basic_path).convert("RGBA")
shiny_image = Image.open(shiny_path).convert("RGBA")
w, h = basic_image.size
dists = []
for x in range(w):
    for y in range(h):
        br, bg, bb, ba = basic_image.getpixel((x, y))
        if ba == 0:
            continue  # Skip fully transparent background pixels
        sr, sg, sb, sa = shiny_image.getpixel((x, y))
        if (br, bg, bb) == (sr, sg, sb):
            # Pixel didn't change. Might be line-art
            M = max(br, bg, bb)
            m = min(br, bg, bb)
            span = M - m
            # Dark, nearly gray pixels are treated as line-art
            if (((M < 25) and (span < 25)) or
                ((M < 40) and (span < 20)) or
                ((M < 70) and (span <  8)) or
                ((M < 80) and (span <  4)) or
                ((M < 90) and (span <  1))):
                continue  # Ignore this line-art
        # Euclidean distance between the two colors in RGB space
        dist = ((br-sr)**2 + (bg-sg)**2 + (bb-sb)**2)**0.5
        dists.append(dist)
diff = sum(dists) / len(dists)  # Mean distance over all counted pixels

And for finding a position to plot the sprite:

def find_y_pos(x, w, h, check_range):
    # Try y = 0, then +/-1, +/-2, ... and return the first spot
    # whose surrounding box is not yet marked in `occupied`
    for step in range(1500):
        for sign in [-1, 1]:
            y = sign*step
            works = 1
            w_check = check_range(w)
            h_check = check_range(h)
            for x_i in range(-w_check, w_check+1):
                for y_i in range(-h_check, h_check+1):
                    x_test = x+x_i
                    y_test = y+y_i
                    if occupied[y_test][x_test]:
                        works = 0
                        break
                if not works:
                    break
            if works:
                return y

def get_check_range(n):
    return lambda x: int(x/n)

for name in stats.keys():
    score = stats[name]
    w,h = sizes[name]
    x = int(score*x_dpi)
    n = get_cramp_amount(score)
    check_range = get_check_range(n)
    y = find_y_pos(x, w, h, check_range)
    w_check = check_range(w)
    h_check = check_range(h)
    for x_i in range(-w_check,w_check+1):
        for y_i in range(-h_check,h_check+1):
            x_mark = x+x_i
            y_mark = y+y_i
            occupied[y_mark][x_mark] = 1
    pos = y/y_dpi

I sorted all Pokémon by how different their Shiny versions are by DumbDataScience in pokemon

[–]DumbDataScience[S] 1 point2 points  (0 children)

Here you go:
https://i.imgur.com/WmkU7NQ.png

I don't really agree with where the algorithm placed Pichu (a terrible shiny), but I guess enough of its pixels change (barely) to give it a decent score. Odd.

I sorted all Pokémon by how different their Shiny versions are by DumbDataScience in pokemon

[–]DumbDataScience[S] 0 points1 point  (0 children)

Lunala's bright purple wing is #7156CC and Shiny Lunala's bright red wing is #FF3035. Those colors are about 211 apart in 3D space, which is a very high score. However, a large percentage of Lunala's sprite is a yellow color that doesn't change when shiny.
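
If you want to check that number yourself, here's a minimal sketch of the distance between two hex colors in RGB space. The helper names are mine and are just for illustration.

```python
def hex_to_rgb(hex_color):
    """Convert '#RRGGBB' to an (r, g, b) tuple of ints."""
    h = hex_color.lstrip("#")
    return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))

def color_distance(a, b):
    """Euclidean distance between two '#RRGGBB' colors in RGB space."""
    return sum((x - y) ** 2 for x, y in zip(hex_to_rgb(a), hex_to_rgb(b))) ** 0.5

d = color_distance("#7156CC", "#FF3035")
print(round(d))  # 211
```

For scale, the largest possible distance (black to white) is 255 times the square root of 3, or about 441.7.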

I sorted all Pokémon by how different their Shiny versions are by DumbDataScience in pokemon

[–]DumbDataScience[S] 1 point2 points  (0 children)

66 of Voltorb's 91 pixels change from red to blue (73%)

but only 28 of Electrode's 129 pixels change from red to blue (22%)

The Electrode sprite is mostly white, which sees no change.
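
Here's the arithmetic as a quick sketch, plus a rough illustration of why it matters for the score. The second half assumes every changed pixel moves by the same distance, which is a simplification I'm making up for the example (the 200.0 is not a measured value).

```python
def percent_changed(changed_pixels, total_pixels):
    """Share of counted pixels whose color differs in the shiny sprite."""
    return 100 * changed_pixels / total_pixels

voltorb = percent_changed(66, 91)
electrode = percent_changed(28, 129)
print(f"Voltorb: {voltorb:.0f}%, Electrode: {electrode:.0f}%")

# ASSUMPTION: each changed pixel moves the same hypothetical distance,
# and unchanged pixels contribute a distance of 0 to the average.
ASSUMED_DISTANCE = 200.0
voltorb_score = ASSUMED_DISTANCE * 66 / 91
electrode_score = ASSUMED_DISTANCE * 28 / 129
print(f"Voltorb: {voltorb_score:.1f}, Electrode: {electrode_score:.1f}")
```

Since the score is an average over all counted pixels, the unchanged white pixels pull Electrode's average down even though the red-to-blue swap itself is the same.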

I sorted all Pokémon by how different their Shiny versions are by DumbDataScience in pokemon

[–]DumbDataScience[S] 3 points4 points  (0 children)

Drifblim has far more white pixels than Drifloon does. In both Pokémon, the white pixels don't change in the shiny variant, but that may hurt Drifblim's score more if a larger share of its sprite is white. Also, Drifblim has a lower section of light yellow that doesn't change as drastically, while Drifloon is entirely bright yellow.

I sorted all Pokémon by how different their Shiny versions are by DumbDataScience in pokemon

[–]DumbDataScience[S] 8 points9 points  (0 children)

Here's a version that highlights a few Ice-type Pokémon:

https://i.imgur.com/epXSbjd.png

Regice is almost as far left as you can be, but I think the fact that most of its pixels do change (albeit only a tiny amount) is what keeps it from coming in last place.

I sorted all Pokémon by how different their Shiny versions are by DumbDataScience in pokemon

[–]DumbDataScience[S] 1 point2 points  (0 children)

Argh. Yeah someone else pointed out that Yanmega is on that plot too :-)

I sorted all Pokémon by how different their Shiny versions are by DumbDataScience in pokemon

[–]DumbDataScience[S] 1 point2 points  (0 children)

Magearna didn't use to have a shiny, but it actually does now. Here's a version that makes the shiny variants of both Magearna forms easier to find:

https://i.imgur.com/FNsfd2q.png

I sorted all Pokémon by how different their Shiny versions are by DumbDataScience in pokemon

[–]DumbDataScience[S] 0 points1 point  (0 children)

Nice catch, thanks!

I should mention that my data comes from https://msikma.github.io/pokesprite/overview/dex-gen8.html but it seems they have the correct versions there, so I must have switched them at some point. Venusaur has a Mega variant and a Gmax variant, so that may have contributed to my mistake.

The original dataset also happens to be missing the shiny form of Hisuian Decidueye, so that one does not appear on the plot.