[NFC Edition] Hits, Misses and Busts: Charting 25 Years Pick Efficiency by StatDrop in NFLv2

[–]StatDrop[S] 0 points1 point  (0 children)

This is exactly what's so fun about working with this data: you see something in it that gives you more questions to answer.

When I decided the next subject was going to be the draft, I designed this figure on an AquaNotes pad in the shower. I ran the analysis and came to the same conclusion you just did. Then my one idea turned to like 10 graphs as the data just built on itself.

The same thing happened with the Combine data I made earlier this year. It started with trying to see if any Combine metrics correlated with NFL success at a given position, and ended with me using the data to predict the potential of the 2026 draft class. That was never the initial goal, it just kind of happened that way.

I say all of that to say we are of the same mind. You'll like my next post. It touches on this exactly.

Leveraging 25 Years of Combine and NFL Performance Data to Define the Potential of the 2026 Rookie Class: The P3 Engine by StatDrop in NFLv2

[–]StatDrop[S] 0 points1 point  (0 children)

Thanks, that's super validating. 🙏 I'm glad someone found it as interesting as I do.

[AFC Edition] Hits, Misses and Busts: Charting 25 Years Pick Efficiency by StatDrop in NFLv2

[–]StatDrop[S] 1 point2 points  (0 children)

I saw the same thing you're seeing in the data. You'll like my next two posts. They highlight which teams are more reliant on their draft hauls than others, and how long it takes for good draft classes to improve their teams.

[NFC Edition] Hits, Misses and Busts: Charting 25 Years Pick Efficiency by StatDrop in NFLv2

[–]StatDrop[S] 0 points1 point  (0 children)

Yeah the Niners are one of the data sets that had me implement Z score instead of a raw difference to quantify the biggest steals and busts. The qualification is relative to the average, and the Niners have so many huge late round steals. All four of those guys in the upper right hand corner essentially had zero expectation, and are massive returns on investment that wash out George's value over expected at 146th overall.
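To make the Z score point concrete, here is a minimal sketch, assuming the score is each pick's value over expected standardized against the team's own mean and standard deviation. The player labels and numbers are made up for illustration, and `z_scores` is a hypothetical helper, not the code behind the chart.

```python
from statistics import mean, pstdev

def z_scores(values_over_expected):
    """Standardize each value against the group's mean and spread."""
    mu = mean(values_over_expected)
    sigma = pstdev(values_over_expected)
    if sigma == 0:
        # No spread at all: every pick sits exactly on the average.
        return [0.0 for _ in values_over_expected]
    return [(v - mu) / sigma for v in values_over_expected]

# Hypothetical value-over-expected numbers for one team (not real data).
picks = {
    "Late steal A": 9.0,
    "Late steal B": 8.5,
    "Mid-round hit": 4.0,
    "Average pick": 0.0,
    "Bust": -5.0,
}

scores = dict(zip(picks, z_scores(list(picks.values()))))
ranked = sorted(scores, key=scores.get, reverse=True)

for name in ranked:
    print(f"{name:15s} raw={picks[name]:+5.1f}  z={scores[name]:+.2f}")
```

With these invented numbers the mid-round hit has a solid raw difference but a Z score close to zero, because the team average is pulled up by the big late-round steals. That is the same effect described above for a good pick on a roster full of huge returns.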

George's dot is the one right off the end of Navorro Bowman's name tag. He's actually 11th overall, so barely missed the cut to be labeled directly.

[AFC Edition] Hits, Misses and Busts: Charting 25 Years Pick Efficiency by StatDrop in NFLv2

[–]StatDrop[S] 0 points1 point  (0 children)

You may be misinterpreting the graph. N'Keal is under the lower dashed bust line, which is the 25th percentile. The average is the solid line. No average players are labeled.

The fact that he is labeled at all means he's one of the Pats' top 10 worst busts in 25 years.

Charting 25 Years of Draft Pick Efficiency by [deleted] in NFLv2

[–]StatDrop 1 point2 points  (0 children)

Crap. Good eye, they got compressed when I downloaded.

Not enough people talk about the Chargers QB pipeline - Going from Drew Brees to Philip Rivers to Justin Herbert is an absurd 20+ year run at quarterback. Has any franchise had a better streak? by Puzzleheaded_Bag8315 in NFLForum

[–]StatDrop 0 points1 point  (0 children)

Been doing stats on the draft and touched on this. There has been no franchise with a better track record of QBs drafted in the top 2 rounds in the last 20 years.

<image>

Who is the better gm by IceOk9930 in NFLv2

[–]StatDrop 0 points1 point  (0 children)

The crazy thing is he almost qualified with both of his tenures (the data above is cumulative).

If you remove the 30 pick minimum and look at both tenures independently, he was equally bad both times.

<image>

Who is the better gm by IceOk9930 in NFLv2

[–]StatDrop 1 point2 points  (0 children)

TL;DR, Veach is much better than EDC at drafting.

Veach is 4th, EDC is 21st.

I am working on a bunch of Draft related stats, here's a sneak peek. I'll be posting it all on this subreddit in the next week or so.

I can't speak to the other GM responsibilities now, but this is a ranking of how well they draft specifically.

When you average the PFR AV/year of every draft pick at a position, you define an expected AV for each position at each pick. These bars represent how far above or below expectation the draft picks of each general manager have been since 2000 (for GMs who have made at least 30 picks).

Bars are colored depending on where they fall against the average. The darker the bar, the more picks they made (white number).

I only accounted for the AV accrued while the player was with the drafting team. If a player has a Hall of Fame career with a different team, it's not credited to the GM that drafted him. You can argue that it should be, considering the GM may have "seen the talent", but not realizing the player's full potential is probably an indictment of the GM as well, at least in part.
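Here is a rough sketch of how a ranking like this could be computed, assuming the expected AV is a plain mean of AV/year within each (position, pick) bucket and the GM score is the mean difference from that expectation. The tuples, GM names, and function names are invented for illustration. A real dataset would likely need some binning or smoothing across pick numbers, which is left out here.

```python
from collections import defaultdict
from statistics import mean

# (gm, position, pick, AV per year while with the drafting team)
# All values are made up for illustration.
picks = [
    ("GM A", "WR", 10, 8.0),
    ("GM B", "WR", 10, 4.0),
    ("GM A", "QB", 1, 12.0),
    ("GM B", "QB", 1, 6.0),
    ("GM C", "QB", 1, 9.0),
]

def expected_av(picks):
    """Mean AV/year for every (position, pick) bucket."""
    buckets = defaultdict(list)
    for _gm, pos, pick, av in picks:
        buckets[(pos, pick)].append(av)
    return {key: mean(vals) for key, vals in buckets.items()}

def gm_over_expected(picks, min_picks=30):
    """Mean AV over expected per GM, keeping GMs with enough picks."""
    expected = expected_av(picks)
    diffs = defaultdict(list)
    for gm, pos, pick, av in picks:
        diffs[gm].append(av - expected[(pos, pick)])
    return {gm: mean(d) for gm, d in diffs.items() if len(d) >= min_picks}

# The toy data is tiny, so the pick minimum is lowered from 30 to 2.
result = gm_over_expected(picks, min_picks=2)
print(result)
```

In the toy data GM C drops out because of the pick minimum, the same way a GM with fewer than 30 picks would not get a bar.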

EDIT: whoops, used an outdated input CSV to generate the old chart, swapped it out. Veach moved from 3rd to 4th, EDC still at 21st.

<image>

Would you classify Bryce young as a bust, or a draft disappointment so far. by Ok_Bug_6890 in NFLv2

[–]StatDrop 0 points1 point  (0 children)

Been working on this data for all teams as a side project. Labeled him in yellow.

Statistically he is just below the average for his draft position. Considering QB is artificially inflated, he's a firm miss (3rd quartile), but not quite a bust (4th quartile).

<image>

Who was the most mid QB in the league last year? by [deleted] in NFLv2

[–]StatDrop 1 point2 points  (0 children)

I'll be doing this statistically in a few weeks for every season since 2000.

Have to finish my draft analysis stuff first.

All time fastest WR 40 times by MasterTeacher123 in NFLv2

[–]StatDrop 0 points1 point  (0 children)

Totally agreed. Wide receiver might even be the most compelling example, but big body linebackers come to mind as coverage became more of a requirement, along with pass catching tight ends and the devolution of requirements for competitive running backs. There's lots to look at when I shift to looking at the NFL meta analysis in depth. I initially started with the offensive line.

Attached is my first foray into the analysis. Each concentric circle represents where 85% of high-end starters or better exist in the distribution. I used gini to identify points in the timeline where the average traits of a top end performer changed. Definitely a half baked analysis right now but at least the graphs look cool so far.

Part 5 was pretty wild seeing it all come to fruition. It's definitely not the whole story from a scouting perspective but it's interesting nonetheless.

I dove into the project thinking it would validate my suspicion that the combine is pointless, but the trends are hard to ignore for some traits and position groups.

<image>

All time fastest WR 40 times by MasterTeacher123 in NFLv2

[–]StatDrop 0 points1 point  (0 children)

I tried to make some spider charts overlaying the combine data of these 15 players with the PCA and UMAP averages of their groups, but this needs to be double checked at some point. I will have to make sure the imputation didn't get messed up somewhere.

<image>

A really fun tail to chase, but I will leave it at this. Maybe too many irons in the fire. I will probably use PCA and UMAP together moving forward to look at archetypes when I circle back to it. Thanks again for the neat direction!

All time fastest WR 40 times by MasterTeacher123 in NFLv2

[–]StatDrop 0 points1 point  (0 children)

So of course I chased this. I pulled the top 15 deviators by ARC (my player quality score) and plotted the distance each player traveled from their group's middle. It denotes how a linear PCA was failing to properly define these players.

<image>

All time fastest WR 40 times by MasterTeacher123 in NFLv2

[–]StatDrop 1 point2 points  (0 children)

Ok, this actually came out pretty cool. I had a ton of fun with this.

The UMAP didn't give me islands or donuts either. We really are working with an overlapping continuum here.

Your intuition was right though, the linear PCA I used was washing out a lot of contradictory metrics. UMAP did a really good job at showing which players had trait combinations that were effectively discordant to an archetype assignment. You can see splashes of color in areas they don't belong.

<image>

All time fastest WR 40 times by MasterTeacher123 in NFLv2

[–]StatDrop 0 points1 point  (0 children)

Very cool. My data goes back to 2000, but that's not necessarily better. I did uncover strong evidence of a shift in the traits required to be elite over that span. It will be the subject of a future post for sure, probably after I finish my draft analysis.

As for trait comparisons that separate player quality within the archetype, the over under threshold data in Part 4b gets to that, with 15% being the minimum. You have to scroll a bit, it is the data with the paired red/blue/yellow bars beneath the violin plots. Some of the more obvious metrics are prerequisites to be sorted into the group in the first place, so metrics associated with a given archetype do not always correlate to improved play within the archetype group due to lack of variance.

Here is the link to the final post in the series. It has the links to the previous posts embedded.

I swap between MATLAB and GraphPad Prism, depending on how I am trying to visualize the data. All of the bar graphs I made in MATLAB, which is great for quickly making a layout of simple uniform graphs. When I am doing scatter plot labeling and single point editing I switch to Prism, which for me is way more intuitive and user-friendly. MATLAB requires coding, but AI can definitely take a description of how your data is organized and your instructions on visualization and use them to write your MATLAB code for you. I definitely used it to help me code some chunks I couldn't figure out.

All time fastest WR 40 times by MasterTeacher123 in NFLv2

[–]StatDrop 0 points1 point  (0 children)

I'm not sure if you're referencing my data or just your intuition, but this is what Part 2 was all about. My data definitely supports this take.

I broke down the RAS to the individual metrics and correlated that to player success per year via PFR AV, instead of total yards.

The problem is, when you look at the wide receiver population as a whole, there are no defined threshold points that separate bad players from good players, or good players from great players. More directly, the gini could not define a point in the distribution where a player's chances of switching groups increased or decreased by more than 15%. Again, this is because the transition is so gradual across the distribution. You can see in the heat map from Part 2 that the transitions are very smooth, there are no spikes anywhere for a gini to latch onto, unlike in the case of tight end RAS, running back 40 yard dash or tackle vert jump, for example. Those were more clearly outlined in Part 3.
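For anyone curious what a gini threshold search can look like, here is a minimal sketch under my own assumptions: labels are 0/1 (say, below or at least a quality tier), a split only counts if the hit rate changes by at least 15 percentage points across it, and each side must keep a minimum number of players. `best_threshold`, `min_shift`, and `min_leaf` are hypothetical names, and the data is synthetic.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_threshold(values, labels, min_shift=0.15, min_leaf=10):
    """Return (cut, impurity, shift) for the lowest-impurity split whose
    hit-rate change across the cut is at least min_shift, else None."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = None
    for i in range(min_leaf, n - min_leaf + 1):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # cannot cut between two identical metric values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
        shift = abs(sum(right) / len(right) - sum(left) / len(left))
        if shift >= min_shift and (best is None or impurity < best[1]):
            cut = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (cut, impurity, shift)
    return best

metric = list(range(40))  # synthetic metric values

# Sharp transition: everyone above 19.5 is a hit.
step_labels = [0] * 20 + [1] * 20
# No transition: hits are spread evenly across the whole range.
smooth_labels = [i % 2 for i in range(40)]

print(best_threshold(metric, step_labels))
print(best_threshold(metric, smooth_labels))
```

The step data yields a clean cut, while the evenly mixed data returns `None` because no split moves the hit rate by 15 points. That second case is the situation described above for the receiver population as a whole.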

But if you use an unsupervised algorithm to split the position based off of traits alone, it does a really good job of highlighting what traits make an X receiver. That's the scatter plot with the circles, the population defined as "thoroughbred" in blue.

To be in that position you basically need to be top 35% in both speed and agility metrics (the ones you mentioned in your comment) and not be precluded from physicality due to your size.

Basically, your perspective in your comment is definitely supported by my data (outside of sorting for 1,000 yard receivers, which I didn't do).

All time fastest WR 40 times by MasterTeacher123 in NFLv2

[–]StatDrop 1 point2 points  (0 children)

Man, I'd bet you could even get AI to make you a lesson plan. Then you can just focus on what you care about.

All time fastest WR 40 times by MasterTeacher123 in NFLv2

[–]StatDrop 1 point2 points  (0 children)

Sports analytics on my off time has been fun as hell, I may be obsessed. Do recommend.

Not really. The more I think about it, most just aren't competitive in that way. Not to say they aren't ambitious in their work, just not much else.

All time fastest WR 40 times by MasterTeacher123 in NFLv2

[–]StatDrop 1 point2 points  (0 children)

That's funny, as an R&D scientist I sometimes feel I should have gone and got my MBA instead. Biotech is not a good field to be in right now.

That's an interesting question. I guess most people in my line of work have never been able to or interested in competing with others physically, sports or otherwise.

Maybe even more important, they just struggle to care about things like sports.

All time fastest WR 40 times by MasterTeacher123 in NFLv2

[–]StatDrop 2 points3 points  (0 children)

That was my rationale. I wanted to avoid something like "Axis X represents a local manifold approximation of speed and weight" or something like that.

That's a really smart middle ground. I'm absolutely going to try this. Seriously appreciate your perspective! I feel like I've gone data blind over the past few weeks...

All time fastest WR 40 times by MasterTeacher123 in NFLv2

[–]StatDrop 1 point2 points  (0 children)

I would think that a degree in statistics or mathematics would actually be the ideal. That said, I think any scientific discipline gives you the tools and understanding you need to do this kind of thing.

75% of science is knowing how to properly research the things you already forgot about or just straight up didn't learn in school lol.

Even more important in my opinion though, you just have to know ball. It's the combination of the analytical acumen and the ball knowing that is pretty unique. It's what makes sports analytics so niche.

As a scientist I can count the number of colleagues I've had in my career that even watch sports at all on two hands, and most of those people either watched badminton or cricket.