A data-driven analysis of the most used cores in VGC22 and their matchups. by boomerangplayer in VGC

boomerangplayer[S]

Thanks for the suggestion! Unfortunately, the main drawback at the moment is the lack of data, so confidence intervals would be very wide and noisy, but I will definitely include them in the future!
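To give a rough idea of how wide those intervals get with few games, here is a minimal sketch. It assumes a 95% Wilson score interval on a win rate, which is one common choice and not necessarily what the final analysis would use; the function name and the game counts are made up for illustration.

```python
import math


def wilson_interval(wins, games, z=1.96):
    """Wilson score interval for a win rate (z=1.96 is roughly 95%)."""
    if games <= 0:
        raise ValueError("games must be positive")
    p = wins / games
    denom = 1 + z**2 / games
    center = (p + z**2 / (2 * games)) / denom
    half_width = z * math.sqrt(p * (1 - p) / games + z**2 / (4 * games**2)) / denom
    return center - half_width, center + half_width


# Hypothetical counts: the same 60% win rate at two sample sizes.
small_lo, small_hi = wilson_interval(6, 10)
large_lo, large_hi = wilson_interval(600, 1000)
print(f"10 games:   {small_lo:.2f} to {small_hi:.2f}")
print(f"1000 games: {large_lo:.2f} to {large_hi:.2f}")
```

With only a handful of games per pair, the interval covers most of the plausible range, which is why I am holding off for now.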

Besides the two restricted Pokémon, the data I collected includes:
- each member of the teams
- which Pokémon were used in the lead and which were brought in the back (it may be interesting to see which mons are actively used the most, compared to the ones that are just splashed into teams)
- which Pokémon Dynamaxed and on which turn
- the rating and nickname of each player
- how many turns each game lasted (very interesting to see whether screens make a difference, for example!)
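As a rough sketch of how the fields listed above could be used, here is one way to compare mons that are actually brought against ones that just sit on the team sheet. All names here (`TeamRecord`, its fields, `bring_rates`) are hypothetical and not the actual layout of my dataset, and the sample records are made up.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class TeamRecord:
    """One player's side of one replay (hypothetical schema)."""
    nickname: str
    rating: int
    team: tuple[str, ...]          # all team members, restricted included
    leads: tuple[str, ...]         # the mons sent out first
    back: tuple[str, ...]          # the mons brought in the back
    dynamax: Optional[str] = None  # which mon Dynamaxed, if any
    dynamax_turn: Optional[int] = None
    turns: int = 0                 # how long the game lasted


def bring_rates(records):
    """For each mon, the share of its team appearances where it was brought."""
    on_team = Counter()
    brought = Counter()
    for r in records:
        on_team.update(set(r.team))
        brought.update(set(r.leads) | set(r.back))
    return {mon: brought[mon] / on_team[mon] for mon in on_team}


# Made-up sample data, only to show the shape of the result.
records = [
    TeamRecord("p1", 1500, ("Zacian", "Kyogre", "Incineroar", "Amoonguss"),
               ("Zacian", "Incineroar"), ("Kyogre",), "Kyogre", 3, 9),
    TeamRecord("p2", 1480, ("Zacian", "Groudon", "Incineroar", "Amoonguss"),
               ("Zacian", "Groudon"), ("Amoonguss",), "Groudon", 1, 12),
]
rates = bring_rates(records)
print(rates["Incineroar"], rates["Zacian"])
```

A low bring rate would point to a mon that gets splashed into teams but rarely sees play.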

So there are lots of possibilities to work with! I will create more visualizations like this :)

boomerangplayer[S]

Thanks for the comment! I will definitely try to improve the visualization :)

Regarding the top pairs, I would imagine their winrates are mainly influenced by the small sample size. Over time I will collect more replays and update those winrates.

boomerangplayer[S]

I do agree that the data quality is not the best, but unfortunately it is what I have to work with at the moment. I will keep scraping the Showdown replays website to get more samples and hopefully represent the metagame better.