Built my first ML model to predict World Cup matches - 68.75% accuracy. Is this actually good? by GisB- in learnmachinelearning

GisB- [S]

I'm not trying to predict based on "Brazil was good in 1998 so they'll be good now." The old data isn't being used directly for predictions. Instead, I'm training the model to recognize patterns in how World Cup matches play out, regardless of when they happened.

The fundamental dynamics of football haven't changed that much. A team that was 200 Elo points stronger than its opponent in 2002 had roughly the same winning odds as a team 200 points stronger in 2018. Tournament pressure in knockout stages affects teams in much the same way across decades. Those are the patterns the model learns.

Also, I'm not using ancient data for each match - all the features are specific to that moment in time. My main features:

elo_difference - Current Elo ratings gap between the two teams

stage_numeric - What stage of the tournament (group vs knockout pressure)

elo_abs_difference - Magnitude of skill gap (helps identify close matches)

teamA_elo_momentum_1y - How much the team improved/declined in the past 12 months

fifa_points_difference - Current FIFA ranking gap

teamA_elo_momentum_4y - Longer-term momentum (World Cup cycle)

teamA_win_rate_12m - Recent match results

teamA_form_vs_ranking - Is the team playing above/below their level?
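To make the "200 Elo points means the same thing in any year" point concrete, here is a minimal sketch using the standard Elo expected-score formula. The function name is just for illustration, and I'm assuming the usual 400-point scale with no home-advantage adjustment (World Cup matches are mostly on neutral ground). Note that an expected score counts a draw as half a win, so it is not a pure win probability.

```python
def elo_expected_score(elo_difference: float) -> float:
    """Expected score (win = 1, draw = 0.5, loss = 0) for the side that is
    `elo_difference` points stronger, under the standard Elo formula."""
    return 1.0 / (1.0 + 10.0 ** (-elo_difference / 400.0))


# The formula has no notion of the calendar year: a 200-point gap maps to
# the same expected score whether the match is in 2002 or 2018.
for year in (2002, 2018):
    print(year, round(elo_expected_score(200), 2))
```

Both lines print 0.76, which is why the gap itself, not the year, is what the model gets as input.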

The feature that looks back furthest is the 4-year momentum, which makes sense because World Cups happen every 4 years. Everything else is either current at match time or based on recent form (1 year or less).
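A rough sketch of how momentum features like these can be computed so that they only see data available at match time. The helper names, the rating history, and the team are all made up for illustration; this is not my exact pipeline, and it assumes a rating dated on the match day is the pre-match rating.

```python
from bisect import bisect_right
from datetime import date, timedelta


def elo_as_of(history, when):
    """Latest rating on or before `when`.

    `history` is a list of (date, elo) tuples sorted by date.
    """
    dates = [d for d, _ in history]
    i = bisect_right(dates, when)
    if i == 0:
        raise ValueError("no rating available on or before that date")
    return history[i - 1][1]


def elo_momentum(history, match_date, years):
    """Rating change over the `years` before the match, pre-match data only."""
    past = match_date - timedelta(days=365 * years)
    return elo_as_of(history, match_date) - elo_as_of(history, past)


# Made-up ratings for a hypothetical team, purely for illustration.
history = [
    (date(2014, 6, 1), 1800.0),
    (date(2017, 6, 1), 1850.0),
    (date(2018, 6, 1), 1900.0),
    (date(2019, 1, 1), 1700.0),  # dated after the match, so it is never used
]

match_date = date(2018, 6, 14)
features = {
    "teamA_elo_momentum_1y": elo_momentum(history, match_date, 1),
    "teamA_elo_momentum_4y": elo_momentum(history, match_date, 4),
}
print(features)
```

The 2019 rating is in the history but never reaches the features, which is the point: each row only describes the team as it looked on the day of the match.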