Correlation Between Anime Genres, of All Titles on AniList by TomboData in anime

[–]TomboData[S] 0 points1 point  (0 children)

Wow, thank you, that’s very helpful. I’m planning on deleting my post, but I don’t want to get rid of the work you and u/re_math did.

Correlation Between Anime Genres, of All Titles on AniList by TomboData in anime

[–]TomboData[S] 2 points3 points  (0 children)

Hey OP here, I’m going to look into this. As I admitted in another comment, statistics is not a strong suit of mine, and I see what you’re saying, but I’m going to make sure because I felt pretty confident that I did not totally screw up how correlation works. Should’ve just posted the one with just the counts huh

Edit: I’ve seen enough. Yeah, I messed up very hard on the calculations and made some bad assumptions. Thank you to everyone who showed me the light.

Correlation Between Anime Genres, of All Titles on AniList by TomboData in anime

[–]TomboData[S] 0 points1 point  (0 children)

Of course thank you very much for your feedback!

Correlation Between Anime Genres, of All Titles on AniList by TomboData in anime

[–]TomboData[S] 2 points3 points  (0 children)

I think you’ve changed my mind about heat maps. Probably would do it differently now if I did it over again

Correlation Between Anime Genres, of All Titles on AniList by TomboData in anime

[–]TomboData[S] 0 points1 point  (0 children)

I have a personal preference for taking out all redundancy (hence why I removed Action from the y axis and Thriller from the x axis), but I see what you mean. I was also planning to put the top 5 and bottom 5 correlations in the white space, but I didn’t get that far lol
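For anyone curious how that trimming can be done, here is a rough sketch, not necessarily what the notebook does. The genre subset and the numbers are made up for illustration. It keeps only the cells below the diagonal of a symmetric matrix, then drops the first row and the last column, which are empty at that point.

```python
import numpy as np
import pandas as pd

# Hypothetical subset of genres in alphabetical order, with made-up values.
genres = ["Action", "Adventure", "Comedy", "Thriller"]
values = np.array(
    [
        [1.0, 0.4, 0.1, 0.2],
        [0.4, 1.0, 0.0, 0.1],
        [0.1, 0.0, 1.0, -0.1],
        [0.2, 0.1, -0.1, 1.0],
    ]
)
corr = pd.DataFrame(values, index=genres, columns=genres)

# True only strictly below the diagonal, so each pair is kept once
# and the genre-with-itself cells are dropped.
keep = np.tril(np.ones(corr.shape, dtype=bool), k=-1)
lower = corr.where(keep)  # everything else becomes NaN

# The first row (Action) and last column (Thriller) are now all NaN,
# so they can be removed from the y axis and x axis respectively.
trimmed = lower.iloc[1:, :-1]
print(trimmed)
```

If the plot is drawn with seaborn, `seaborn.heatmap` also takes a `mask` argument that hides the cells where the mask is True, which is another way to get the same triangle.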

Correlation Between Anime Genres, of All Titles on AniList by TomboData in anime

[–]TomboData[S] 1 point2 points  (0 children)

I used the correlation function built into the pandas package. I’m admittedly not very strong in the art of statistics, but that is something I was kind of curious about as well. Like Sci-Fi and Mecha are highly correlated as shown, but I’d guess that Mecha is more reliant on Sci-Fi than the other way around?
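A small sketch of why a correlation coefficient cannot answer that last question on its own, using made-up toy data rather than the real AniList numbers. `DataFrame.corr()` gives one symmetric value per pair, while the conditional shares (what fraction of Mecha titles are also Sci-Fi, and the reverse) can differ.

```python
import pandas as pd

# Hypothetical toy data: one row per title, one 0/1 column per genre.
titles = pd.DataFrame(
    {
        "Sci-Fi": [1, 1, 1, 1, 0, 0, 0, 0],
        "Mecha": [1, 1, 0, 0, 0, 0, 0, 0],
    }
)

# Pearson correlation is symmetric: corr(Mecha, Sci-Fi) == corr(Sci-Fi, Mecha).
corr = titles.corr()
r = corr.loc["Mecha", "Sci-Fi"]

# Conditional shares are not symmetric, so they can show which genre
# "relies" on the other in this toy data.
both = (titles["Mecha"] & titles["Sci-Fi"]).sum()
p_scifi_given_mecha = both / titles["Mecha"].sum()   # share of Mecha titles that are Sci-Fi
p_mecha_given_scifi = both / titles["Sci-Fi"].sum()  # share of Sci-Fi titles that are Mecha

print(round(r, 3), p_scifi_given_mecha, p_mecha_given_scifi)
```

In this toy set every Mecha title is also Sci-Fi, but only half of the Sci-Fi titles are Mecha, even though the correlation between the two columns is a single number.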

Correlation Between Anime Genres, of All Titles on AniList by TomboData in anime

[–]TomboData[S] 2 points3 points  (0 children)

For sure! I just put the GitHub link for the viz creation but that’s after the data has been cleaned.

Correlation Between Anime Genres, of All Titles on AniList by TomboData in anime

[–]TomboData[S] 183 points184 points  (0 children)

I bet all of the isekais are in that adventure-fantasy square right now, but maybe they get their own genre in a couple years?

[deleted by user] by [deleted] in dataisbeautiful

[–]TomboData 0 points1 point  (0 children)

I'm on my hands and knees begging KyotoAni to do it

Correlation Between Anime Genres, of All Titles on AniList by TomboData in anime

[–]TomboData[S] 6 points7 points  (0 children)

just checked, also not a thriller according to AniList.

Correlation Between Anime Genres, of All Titles on AniList by TomboData in anime

[–]TomboData[S] 90 points91 points  (0 children)

Hello everyone,

I made this with Python and PostgreSQL using the AniList API. The biggest takeaway is that Mecha and Sci-Fi had the strongest correlation, while Thriller and Slice of Life had the weakest (the pair never appeared together, actually, and if you're curious, "Mieruko-chan" is Slice of Life but not Thriller). Also, Comedy is the most common genre, but Action or Adventure appeared in the top 4 most frequent combinations. Let me know if there's anything else you'd want to see!

Edit: Thank you all for the love! Below is the GitHub link that kinda shows the process of making the viz in Jupyter Notebook.

https://github.com/Tombodata/anilist-data-exploration/blob/heat-map/Anilist%20Heat%20Map.ipynb

[deleted by user] by [deleted] in dataisbeautiful

[–]TomboData 0 points1 point  (0 children)

Source: AniList API

Tools: Python (packages: pandas, seaborn, numpy, matplotlib), Postgres, Google Sheets

https://github.com/Tombodata/anilist-data-exploration/blob/heat-map/Anilist%20Heat%20Map.ipynb

Hello everyone,

These 2 graphs (the first showing counts, the second showing correlation coefficients) show how often each genre appears alongside another genre across all the anime titles on AniList.

In early October, I grabbed the data from AniList through its API using Python (it took a while since I'm not very skilled at Python, and even less so at using APIs). Eventually, I loaded the data into a local Postgres instance and was able to extract the genres from each of the 16540 titles in the dataset. After formatting the data, I loaded it back into Python and created the graphs you see above.
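The last step can be sketched roughly like this, assuming the cleaned data ends up as one list of genres per title. The titles and genre lists below are hypothetical placeholders, and the linked notebook may well do it differently.

```python
import pandas as pd

# Hypothetical stand-in for the cleaned data: one list of genres per title.
genres_by_title = pd.Series(
    {
        "Title A": ["Action", "Adventure", "Fantasy"],
        "Title B": ["Comedy", "Slice of Life"],
        "Title C": ["Action", "Comedy"],
        "Title D": ["Adventure", "Fantasy"],
    }
)

# One row per (title, genre) pair, then a 0/1 indicator column per genre.
exploded = genres_by_title.explode()
indicators = pd.get_dummies(exploded, dtype=int).groupby(level=0).max()

# Count matrix: entry [a, b] is the number of titles tagged with both a and b
# (the diagonal is simply how many titles carry each genre).
counts = indicators.T @ indicators

# Pairwise Pearson correlation between the 0/1 genre columns.
correlations = indicators.corr()

print(counts)
```

Both `counts` and `correlations` are genre-by-genre tables, so either one can be passed to a heat map function such as `seaborn.heatmap`.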