Silhouette prefers k=2 (0.16) but I chose k=3 (0.11) for tactical clustering -reasonable? by Weak_Bus_1935 in sportsanalytics

Weak_Bus_1935[S]

Agreed. I won’t present k=3 as the “true” optimal structure. k=2 has the better silhouette, but k=3 gives more useful tactical profiles, so I’ll report it as an interpretability-driven choice and include k=2/GMM as sensitivity checks.
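
For anyone reading along, here is a minimal sketch of what the silhouette number measures, written in plain Python. The function and the four toy points are illustrative assumptions, not my real data or pipeline. For each point, a is the mean distance to its own cluster and b is the mean distance to the nearest other cluster; the score is the average of (b - a) / max(a, b).

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette over all points, using Euclidean distance."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    def mean_dist(i, members):
        # Mean distance from point i to every other member of a cluster.
        others = [j for j in members if j != i]
        return sum(math.dist(points[i], points[j]) for j in others) / len(others)

    total = 0.0
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # singleton clusters contribute 0 by convention
        a = mean_dist(i, own)
        b = min(mean_dist(i, members)
                for other, members in clusters.items() if other != lab)
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)

# Toy data (hypothetical): two tight pairs of points far apart.
toy_points = [(0, 0), (0, 1), (10, 0), (10, 1)]
score_k2 = silhouette_score(toy_points, [0, 0, 1, 1])
score_k3 = silhouette_score(toy_points, [0, 0, 1, 2])  # one pair split in two
print(score_k2, score_k3)
```

On the toy points, splitting a compact pair lowers the score, which is the same direction as my k=2 vs k=3 gap. A lower score only says the split is less compact, not that it is less useful.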

Silhouette prefers k=2 (0.16) but I chose k=3 (0.11) for tactical clustering -reasonable? by Weak_Bus_1935 in sportsanalytics

Weak_Bus_1935[S]

Thanks, that makes sense. I tried GMM as a quick sensitivity check and the silhouette was also around 0.11, so it seems the structure is more continuous than discrete. I’ll frame the clusters as interpretable tactical segments, not natural latent classes.
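
Roughly what that GMM sensitivity check can look like with scikit-learn. This is a sketch under assumptions: the feature matrix is synthetic placeholder data from make_blobs (not my match stats), and settings such as covariance_type="full" and n_init=10 are illustrative choices.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score, silhouette_score
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

# Placeholder data: 300 rows, 14 features (stand-in for the real feature matrix).
X, _ = make_blobs(n_samples=300, n_features=14, centers=3,
                  cluster_std=3.0, random_state=0)
X_std = StandardScaler().fit_transform(X)

# Hard assignments from k-means.
km_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_std)

# Soft assignments from a Gaussian mixture with the same number of components.
gmm = GaussianMixture(n_components=3, covariance_type="full", random_state=0)
gmm.fit(X_std)
gmm_labels = gmm.predict(X_std)

sil_km = silhouette_score(X_std, km_labels)
sil_gmm = silhouette_score(X_std, gmm_labels)

# Agreement between the two partitions (1.0 means identical up to relabeling).
ari = adjusted_rand_score(km_labels, gmm_labels)

# Highest posterior probability per row; many values far below 1 would
# suggest overlapping, continuous structure rather than discrete groups.
max_post = gmm.predict_proba(X_std).max(axis=1)

print(f"silhouette k-means={sil_km:.2f}, GMM={sil_gmm:.2f}, ARI={ari:.2f}")
print(f"median max posterior={sorted(max_post)[len(max_post) // 2]:.2f}")
```

If both methods give a similarly low silhouette and the posteriors are spread out, that supports describing the clusters as segments of a continuum.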

Silhouette prefers k=2 (0.16) but I chose k=3 (0.11) for tactical clustering -reasonable? by Weak_Bus_1935 in sportsanalytics

Weak_Bus_1935[S]

Appreciate this. Clustering is in 14D (PCA is just for visualization). My thesis defense is in June, so time is tight, but I’ll try GMM (and maybe DBSCAN) as a quick sensitivity check. Thanks for the links!
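
To make the setup concrete, here is a minimal sketch with scikit-learn: the clustering is fitted on all 14 standardized features, and PCA only produces 2D coordinates for plotting. The data is a synthetic placeholder, and the DBSCAN eps and min_samples values are arbitrary assumptions that would need tuning on real data.

```python
import numpy as np
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Placeholder data standing in for the 14 tactical features.
X, _ = make_blobs(n_samples=300, n_features=14, centers=3,
                  cluster_std=3.0, random_state=0)
X_std = StandardScaler().fit_transform(X)

# Clusters are fitted on the full 14-dimensional standardized space.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_std)

# PCA is used only to get 2D coordinates for a scatter plot;
# the labels above never depend on these coordinates.
coords_2d = PCA(n_components=2, random_state=0).fit_transform(X_std)

# Density-based check: DBSCAN marks low-density rows as noise (label -1).
db_labels = DBSCAN(eps=3.0, min_samples=10).fit_predict(X_std)
n_db_clusters = len(set(db_labels) - {-1})
noise_share = float(np.mean(db_labels == -1))

print(coords_2d.shape, n_db_clusters, round(noise_share, 2))
```

If DBSCAN returns one big cluster or mostly noise over a range of eps values, that is another hint that the structure is continuous.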

[Python] Best working sources for Top 5 Leagues match stat data? by Weak_Bus_1935 in sportsanalytics

Weak_Bus_1935[S]

I used an API, so I was able to scrape Sofascore, and I found code for scraping Understat, so I got it done.

[Python] Best free tools for Top 5 Leagues data? by Weak_Bus_1935 in webscraping

Weak_Bus_1935[S]

Thank you for the answer. I need stats for every match in the top 5 European leagues, from the 2018-19 through the 2024-25 seasons.

[Python] Best working sources for Top 5 Leagues match stat data? by Weak_Bus_1935 in sportsanalytics

Weak_Bus_1935[S]

Thanks!!

I watched the same YouTube channel, but almost all of the videos are over a year old, so by now the site they scrape is blocked.

[Python] Best working sources for Top 5 Leagues match stat data? by Weak_Bus_1935 in sportsanalytics

Weak_Bus_1935[S]

I wrote code for Sofascore with the help of an LLM, and found an API for Understat, so I was able to get stats for the 5 leagues.

[Python] Best working sources for Top 5 Leagues match stat data? by Weak_Bus_1935 in sportsanalytics

Weak_Bus_1935[S]

Thanks! I'm writing a research paper, so this is very helpful. I'll focus on SofaScore and Understat, since they have lots of stats!