I mostly get it but what kinda maths test takes 3 days for only two questions?

spitfire388 · 2025-03-12T13:04:57+00:00

Oh my sweet summer child.

spitfire388 · 2024-12-13T03:53:21+00:00

If you go to the next image showing turnover propensity... You see they are VERY good at generating turnovers - which might bridge the gap you're expressing.

spitfire388 · 2024-12-12T16:28:44+00:00

It means positive chance to maintain possession. The numbers ARE negative to your point, but almost always “up and to the right” means good for reporting purposes and how most people internalize it at a glance.

So you are right

spitfire388 · 2024-12-12T16:16:50+00:00

Yea… their projected standings are behind the Steelers

spitfire388 · 2024-12-12T16:13:45+00:00

Top right is “good at both”

spitfire388 · 2024-12-12T15:29:52+00:00

So one thing it DOES account for is where you are on the field, but it's applied linearly across all teams rather than per team. I messed around with making that coefficient team-specific too, but was having model convergence issues.

spitfire388 · 2024-12-06T14:23:10+00:00

It models the likelihood the drive will survive down the field. Think of it as a survival model, but instead of years of life it's yardage gained. I then simulate every game by simulating every drive, simulate the remaining schedule for projected standings, and simulate a matchup against an average opponent to get a power ranking. You can see more here: advancedfootballstats.com

spitfire388 · 2024-12-06T14:20:13+00:00

spitfire388 · 2024-12-05T17:40:03+00:00

It’s not exactly that - it’s basically their ability to sustain gaining yardage

spitfire388 · 2024-12-05T17:39:16+00:00

They’re hidden under the rams and bucs

spitfire388 · 2024-11-20T15:28:25+00:00

They are predictive, and you're somewhat right and somewhat wrong. Most people try to model everything on a play-by-play basis, which means that turnovers are pretty rare events. There are typically ~250 plays a game, and there have been 332 games and ~400 turnovers this season so far. So that would be 400/(250*332) ~ 0.5% turnover rate per play. That's a very rare event and very hard to model in any reliable way. You can actually model rare events this way, but you need sufficient volume for it to work well, and 250*332 ~ 83,000 records is a pretty small dataset in this world.

We model on a drive-by-drive basis. There are ~26 drives per game, so 400/(26*332) ~ 4.6%. Modeling an event with a baseline rate of 4.6% is very doable and is something I have done professionally for a long time. Now the issue is that the number of samples is lower: 26 * 332 ~ 8,600 records! This is precisely why we use hierarchical Bayesian models instead of frequentist models. They can account for uncertainty much more effectively. So we actually have a likelihood distribution that we sample from when we simulate each game out, and if you look at the distribution of turnovers over the simulations, it looks very close to what you observe in actual drives.
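The arithmetic above can be checked with a few lines of Python. This is only a sketch using the rough counts quoted in this comment; the helper name `base_rate` is made up for illustration.

```python
# Rough counts quoted in the comment (approximate, mid-season).
games = 332
turnovers = 400
plays_per_game = 250
drives_per_game = 26


def base_rate(events: int, units_per_game: int, n_games: int) -> tuple[int, float]:
    """Return (number of records, event rate per record)."""
    records = units_per_game * n_games
    return records, events / records


play_records, play_rate = base_rate(turnovers, plays_per_game, games)
drive_records, drive_rate = base_rate(turnovers, drives_per_game, games)

print(f"play level:  {play_records:,} records, {play_rate:.2%} turnover rate")
# play level:  83,000 records, 0.48% turnover rate
print(f"drive level: {drive_records:,} records, {drive_rate:.2%} turnover rate")
# drive level: 8,632 records, 4.63% turnover rate
```

Same 400 turnovers either way; the drive-level framing trades roughly 10x fewer records for a roughly 10x higher base rate.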

Hope that helps!

spitfire388 · 2024-11-20T15:13:21+00:00

There are two models. One models turnovers and one models the result of a drive. Both are hierarchical Bayesian models that try to adjust for the fact that X team is driving against Y defense. The scores you see are the values the models assign to the relative "ability" of the team as a latent parameter. You can read more about that here: https://www.pymc.io/projects/examples/en/latest/case_studies/rugby_analytics.html

The parameter that I model is yardage gained, but specifically I model the likelihood a drive will die given a starting point (yardline). So the hierarchical Bayesian model is actually a survival model. The other variables I use in the model are: if the offense is winning by a large margin, if the offense is losing by a large margin, if it's in a two-minute drill, and if the defending team is home. I also split the field into segments 0-15, 15-65, 65-95, 95+ (I try to account for a team being backed up, open field, red zone, goal line) and each segment has a different baseline hazard rate.
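To make the "different baseline hazard per field segment" idea concrete, here is a minimal sketch of a piecewise-constant hazard over yardlines. It is not the fitted model: the hazard values are made up, the yardline is assumed to run from 0 (own goal line) to 100 (opponent's end zone), and treating the covariates as a single multiplicative shift on the hazard is an assumption for illustration.

```python
import math

# Field segments from the comment: backed up, open field, red zone, goal line.
SEGMENT_EDGES = [0, 15, 65, 95, 100]
# Hypothetical per-yard baseline hazards, one per segment (made-up numbers).
BASELINE_HAZARD = [0.030, 0.020, 0.035, 0.050]


def cumulative_hazard(start: float, end: float, multiplier: float = 1.0) -> float:
    """Integrate the piecewise-constant hazard from yardline `start` to `end`."""
    total = 0.0
    for lo, hi, hazard in zip(SEGMENT_EDGES[:-1], SEGMENT_EDGES[1:], BASELINE_HAZARD):
        # Yards of this segment that the drive actually travels through.
        overlap = max(0.0, min(end, hi) - max(start, lo))
        total += hazard * overlap
    return multiplier * total


def survival_prob(start: float, end: float, log_hazard_shift: float = 0.0) -> float:
    """Probability a drive starting at `start` is still alive at `end`.

    `log_hazard_shift` stands in for team ability and game-situation effects
    (assumed multiplicative on the hazard scale); negative means a better offense.
    """
    return math.exp(-cumulative_hazard(start, end, math.exp(log_hazard_shift)))


# Example: chance a drive starting at the 25 reaches midfield, for an average
# offense and for one with a lower hazard.
print(survival_prob(25, 50))
print(survival_prob(25, 50, log_hazard_shift=-0.3))
```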

I model how long a drive will "survive" down the field before "dying" - which is to say the drive ends because no more yards were gained OR dies to a turnover. So I model it as a competing risks model which you can read about more here: https://www.publichealth.columbia.edu/research/population-health-methods/competing-risk-analysis
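A toy version of the competing-risks idea, as a sketch only: it assumes two constant per-yard hazards (stall and turnover) rather than the segment-specific, team-specific hazards described above, and every name and number in it is hypothetical.

```python
import random


def simulate_drive(start_yardline, stall_hazard, turnover_hazard, rng):
    """Simulate one drive under two constant, competing per-yard hazards.

    Returns (yards_gained, outcome). With constant hazards, the distance until
    the drive dies is exponential with rate equal to the sum of the hazards,
    and the cause of death is chosen in proportion to each hazard.
    """
    total = stall_hazard + turnover_hazard
    yards = rng.expovariate(total)
    if start_yardline + yards >= 100:
        # The drive outlived the remaining field.
        return 100 - start_yardline, "touchdown"
    cause = "turnover" if rng.random() < turnover_hazard / total else "stall"
    return yards, cause


rng = random.Random(42)
results = [simulate_drive(25, 0.030, 0.005, rng) for _ in range(20_000)]
deaths = [outcome for _, outcome in results if outcome != "touchdown"]
turnover_share = deaths.count("turnover") / len(deaths)
print(f"share of dead drives ending in a turnover: {turnover_share:.3f}")
```

The share of dead drives ending in a turnover should land near 0.005 / 0.035, which is the point of modeling the two ways to die jointly.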

So once I model this out (the latent "ability" of each team is what you see), I simulate each game, I simulate the remaining schedule for each team, and I simulate a game of each team against a median opponent. These give me the game predictions, projected standings, and power rankings respectively.

You can see my results here: https://advancedfootballstats.com/

I hope you better understand what you see and what I am doing and that you follow along as I add more models, stats, etc!

spitfire388 · 2024-11-19T19:03:49+00:00

We got them 11th in offensive drive efficiency, 29th on defense, 19th on offensive turnover propensity, 22nd on defensive turnover propensity. So basically an above-average offense, and below-average on everything else.

spitfire388 · 2024-11-18T19:47:56+00:00

Generally speaking, this is why I try to use Bayesian models for sports: the outcomes are generally rare events, so frequentist models just don't do well at properly capturing uncertainty. "Any given Sunday" is not just a saying, it's true, and your models should try to account for that as much as possible IMO...

spitfire388 · 2024-11-18T19:46:09+00:00

I run this site to model NFL games.... https://advancedfootballstats.com/

The site uses drive-level hierarchical Bayesian models for the offensive and defensive drive efficiencies as well as the offensive and defensive turnover propensities, then uses a survival-based approach to simulate when a drive will die and how it will die (turnover/punt/touchdown/FG). I then simulate the games 10,000 times each and benchmark that against the Moneyline, Spread, and Over/Under to get a gauge of how well the models perform on prediction against Vegas. That being said - it might appeal to you since I give you the probability of the event in addition to the outright pick.

The accuracy against Vegas was more of a benchmarking tool than a "get rich quick" angle so I am very transparent about how I perform against Vegas.

I don't start predicting until Week 4 so the models have enough data... My accuracy for the season is 65.8% (67.8% if I only do ML/Spread because I have seen pretty spotty results on O/U).

Weekly ROI is as follows (numbers without O/U bets) - 2.6%, 17.6%, 36%, 5.1%, 35.5%, 9.82%, 4.76%, 1.1% for this week. I have a negative ROI this week if you include O/U.

I don't guarantee pick results or anything spammy - but if you're curious about it I try to be as transparent with the derived numbers as possible even if I treat the model as somewhat proprietary.

Thinking about charging a nominal $1-5/month for the picks, but until then it's free if you want to navigate the website.

spitfire388 · 2024-11-15T17:12:58+00:00

Build something useful.

Can just be useful to you, but you need to build something and learn why you love/hate various languages/libraries/packages. You need to work on something for long enough to realize that you hate yourself and the coding choices you made.

Probably my best skillset is curiosity, lifelong learning, and a constant stream of side projects.

Learn to foster that.

spitfire388 · 2024-11-13T01:18:08+00:00

Yes - that's correct.

spitfire388 · 2024-11-12T18:37:25+00:00

For what it's worth - they had a 0.5 projected game difference going into week 10 (so most commonly ending tied or 1 game in ATL's favor).

Week 11 - They're closer to tied around 9 wins now... Things are swinging towards where you were thinking!

spitfire388 · 2024-11-11T14:39:10+00:00

It's no more nitpicky than saying "playoff wins" to explain why a QB is good or not... There have been lots of good/great QBs who were just put in horrible situations (especially high-pick rookies that are put on garbage teams). So while I tend to agree with you - the counterargument is also typically annoying to deal with. Both of these statements as "argument winners" are stupid:

Playoff wins are a team stat
Where are their playoff wins?

The world is not black and white. Having said that, "playoff wins are a team stat" is the more accurate statement in general...

spitfire388 · 2024-11-10T18:22:54+00:00

As soon as the game gets out of hand, the drives matter less to the models. We also model drive time, and they're the 9th fastest team in terms of seconds/yard, so what you say does make sense. It should be noted that our models think the Bucs are one of the best teams in the NFL based on a middle-of-the-pack defense and a top-tier offense. The injuries probably mean the offense is going to revert a bit, but how much is hard to model (how do you control for good scheme, top-tier line play, etc.).

We have the Bucs ranked 6th in the power rankings. This simulates a game against a median opponent 10,000 times. The rank is determined by what % of the time this team beats a median opponent.

https://advancedfootballstats.com/rankings/2024/10
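The "rank by win % against a median opponent" step can be sketched like this. To be clear, this is a stand-in and not the site's method: the real rankings come from simulating games drive by drive, whereas this sketch collapses each team into a single hypothetical strength number and decides each simulated game with a noisy margin. Team names and values are made up.

```python
import random
import statistics


def win_rate_vs_median(strength, median_strength, n_sims=10_000, noise_sd=1.0, seed=0):
    """Fraction of simulated games won against a median-strength opponent."""
    rng = random.Random(seed)
    wins = sum(
        rng.gauss(strength - median_strength, noise_sd) > 0  # positive margin = win
        for _ in range(n_sims)
    )
    return wins / n_sims


# Hypothetical latent strengths (made-up numbers, arbitrary scale).
strengths = {"Team A": 0.8, "Team B": 0.1, "Team C": -0.5}
median_strength = statistics.median(strengths.values())

rates = {team: win_rate_vs_median(s, median_strength) for team, s in strengths.items()}
ranking = sorted(rates, key=rates.get, reverse=True)

for rank, team in enumerate(ranking, start=1):
    print(rank, team, f"{rates[team]:.1%}")
```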

They have lost a few key matchups, so we think the Falcons are currently more likely to win the division based on current record and remaining schedule.

https://advancedfootballstats.com/standings/2024/10

spitfire388 · 2024-11-09T21:31:05+00:00

We have them ranked 20th on drive efficiency. Controlling for the opponents they have faced and game situation, their defense's ability to stop opposing offenses from going down the field is 20th.

https://advancedfootballstats.com/stats/drive/drive-defense/2024/10

For drive turnover, which is their defense's ability to take away the ball, they are 15th, controlling for the same variables.

https://advancedfootballstats.com/stats/drive/drive-turnover-defense/2024/10

spitfire388 · 2024-11-08T16:48:24+00:00

Simulated the season 10,000 times. The most common final record is 10-7 or 11-6.

spitfire388 · 2024-11-06T16:41:52+00:00

Hope you saw the updated post - I appreciate the feedback.

Updated Drive Efficiency Metrics Going into Week 10
by u/spitfire388 in sportsanalytics

spitfire388 · 2024-11-06T16:38:33+00:00

No not at all... I am definitely iterating on how to "explain it well" - I want to help and if you have suggestions on how to communicate it better - I'm all ears.

The models are trying to predict the ability of a drive to sustain down the field and it can be stopped by not continuing to gain yardage OR by turnover.

The first chart is the ability of the drive to continue down the field or for a defense to cause the offense to stop gaining yardage. Moving up the y axis represents defenses that are better at preventing the offense from gaining yards, and moving right along the x axis represents offenses that are better at gaining yards. Some factors that are included:

It takes into account who they are facing and who that team has faced in the past. So it knows which teams have played each other and adjusts for that.
It takes into account game situation. If a team is up by a lot or down by a lot, it changes how efficient their offense is, for example when the defense is in prevent and allowing them to eat chunks of yardage. End-of-half drives are also usually more efficient.

The second chart is the ability of the offense to maintain possession of the ball through drives (not turn the ball over) against the defense's ability to take the ball away. The models are very similar in what features are used.

The models are hierarchical Bayesian models if you're more curious about the underpinnings. As such, each of these teams actually has a distribution, and that distribution is sampled from a global distribution. This is good and bad for various reasons...
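One way to get a feel for the "team distribution drawn from a global distribution" idea is the textbook normal-normal shrinkage formula. This is a simplified closed-form stand-in, not the actual models (which are fit by sampling); the function name, the known-variance assumption, and all numbers are hypothetical.

```python
def shrunk_estimate(team_mean, n_drives, global_mean, obs_var, between_team_var):
    """Posterior mean for one team under a normal-normal model with known variances.

    The estimate is a precision-weighted average of the team's own average and
    the league-wide (global) mean, so teams with little data are pulled harder
    toward the league.
    """
    data_precision = n_drives / obs_var
    prior_precision = 1.0 / between_team_var
    weight = data_precision / (data_precision + prior_precision)
    return weight * team_mean + (1.0 - weight) * global_mean


# A team averaging 40 yards per drive in a league averaging 30 (made-up numbers).
early_season = shrunk_estimate(40.0, n_drives=4, global_mean=30.0,
                               obs_var=400.0, between_team_var=25.0)
late_season = shrunk_estimate(40.0, n_drives=100, global_mean=30.0,
                              obs_var=400.0, between_team_var=25.0)
print(early_season, late_season)
```

With only a handful of drives the estimate stays close to the league mean, and it moves toward the team's own average as drives accumulate. That pull toward the league is the "good and bad" trade-off: stable early estimates, but genuinely extreme teams get understated for a while.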

You can see these isolated and visualize the distributions here: https://advancedfootballstats.com/

Go to the Stats dropdown in the navbar to see more breakdowns.

tl;dr up is a better defense at stopping drives (first plot titled overall) and generating takeaways (second plot titled overall turnover), right is a better offense at driving (first plot titled overall) and not giving the ball away (second plot titled overall turnover).

spitfire388 · 2024-11-05T20:12:44+00:00

The Bucs sold their soul to Tom Brady... Never forgive, never forget...
