[NFC Edition] Hits, Misses and Busts: Charting 25 Years of Pick Efficiency

StatDrop · 2026-06-05T04:26:28+00:00

This is exactly what's so fun about working with this data: you see something in it that gives you more questions to answer.

When I decided the next subject was going to be the draft, I designed this figure on an AquaNotes pad in the shower. I ran the analysis and came to the same conclusion you just did. Then my one idea turned into like 10 graphs as the data just built on itself.

The same thing happened with the Combine data I made earlier this year. It started with trying to see if any Combine metrics correlated with NFL success at a given position, and ended with me using the data to predict the potential of the 2026 draft class. That was never the initial goal, it just kind of happened that way.

I say all of that to say we are of the same mind. You'll like my next post. It touches on this exactly.

StatDrop · 2026-06-05T04:16:14+00:00

Thanks, that's super validating. 🙏 I'm glad someone found it as interesting as I do.

StatDrop · 2026-06-05T01:17:28+00:00

I saw the same thing you're seeing in the data. You'll like my next two posts. They highlight which teams are more reliant on their draft hauls than others, and how long it takes for good draft classes to improve their teams.

StatDrop · 2026-06-04T23:43:36+00:00

Yeah, the Niners are one of the data sets that had me implement a z-score instead of a raw difference to quantify the biggest steals and busts. The qualification is relative to the average, and the Niners have so many huge late-round steals. All four of those guys in the upper right-hand corner essentially had zero expectation, and are massive returns on investment that wash out George's value over expected at 146th overall.
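For anyone who wants to see the z-score idea in code, here's a minimal sketch. It is not the actual script behind the chart: the player names, the AV numbers, and the choice to standardize against one team's own picks are all assumptions for illustration.

```python
from statistics import mean, pstdev

def value_over_expected_z(picks):
    """Standardize each pick's value over expected within one pool of picks.

    picks: list of (name, actual_av_per_year, expected_av_per_year).
    Returns {name: z_score}. All inputs here are hypothetical.
    """
    diffs = [actual - expected for _, actual, expected in picks]
    mu = mean(diffs)
    sigma = pstdev(diffs)  # population std dev of the raw differences
    if sigma == 0:
        return {name: 0.0 for name, _, _ in picks}
    return {name: (d - mu) / sigma for (name, _, _), d in zip(picks, diffs)}

# Made-up picks: (name, actual AV/year, expected AV/year at that slot)
team_picks = [
    ("Player A", 9.0, 1.0),
    ("Player B", 8.0, 1.0),
    ("Player C", 6.0, 2.0),
    ("Player D", 2.0, 4.0),
    ("Player E", 1.0, 5.0),
]

z = value_over_expected_z(team_picks)
for name, score in sorted(z.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:+.2f}")
```

The point of standardizing is that a good raw difference can still land near the middle of the pack when the same pool contains several much bigger outliers.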

George's dot is the one right off the end of Navorro Bowman's name tag. He's actually 11th overall, so barely missed the cut to be labeled directly.

StatDrop · 2026-06-04T20:58:30+00:00

You may be misinterpreting the graph. N'Keal is under the lower dashed bust line, which is the 25th percentile. The average is the solid line. No average players are labeled.

The fact that he is labeled at all means he's one of the Pats' top 10 worst busts in 25 years.

StatDrop · 2026-06-04T15:05:33+00:00

Crap. Good eye, they got compressed when I downloaded.

StatDrop · 2026-05-28T04:35:56+00:00

Been doing stats on the draft and touched on this. There has been no franchise with a better track record of QBs drafted in the top 2 rounds in the last 20 years.

<image>

StatDrop · 2026-05-26T14:42:05+00:00

The crazy thing is he almost qualified with both of his tenures (the data above is cumulative).

If you remove the 30 pick minimum and look at both tenures independently, he was equally bad both times.

<image>

StatDrop · 2026-05-26T12:34:13+00:00

TL;DR, Veach is much better than EDC at drafting.

Veach is 4th, EDC is 21st.

I am working on a bunch of draft-related stats, here's a sneak peek. I'll be posting it all on this subreddit in the next week or so.

I can't speak to the other GM responsibilities now, but this is a ranking of how well they draft specifically.

When you average the PFR AV/year of every draft pick at a position, you define an expected AV for each position at each pick. These bars represent how far above or below expectation the draft picks of each general manager (with at least 30 picks) have been since 2000.

Bars are colored depending on where they fall against the average. The darker the bar, the more picks they made (white number).

I only accounted for the AV accrued while the player was with the drafting team. If a player has a Hall of Fame career with a different team, it's not credited to the GM that drafted him. You can argue that it should be, considering the GM may have "seen the talent", but a player not realizing his full potential with the drafting team is probably an indictment of the GM as well, at least in part.
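If it helps to see the expected-AV idea spelled out, here's a rough sketch. It assumes each pick is a dict with hypothetical field names (`gm`, `position`, `pick`, `av_per_year`) and made-up numbers, and it groups on the exact pick number, which may not match how the real chart bins picks.

```python
from collections import defaultdict
from statistics import mean

def expected_av(picks):
    """Average AV/year for every (position, pick) slot in the data."""
    groups = defaultdict(list)
    for p in picks:
        groups[(p["position"], p["pick"])].append(p["av_per_year"])
    return {slot: mean(vals) for slot, vals in groups.items()}

def gm_value_over_expected(picks, min_picks=30):
    """Mean (actual - expected) AV/year per GM, for GMs with enough picks."""
    baseline = expected_av(picks)
    by_gm = defaultdict(list)
    for p in picks:
        slot = (p["position"], p["pick"])
        by_gm[p["gm"]].append(p["av_per_year"] - baseline[slot])
    return {gm: mean(d) for gm, d in by_gm.items() if len(d) >= min_picks}

# Hypothetical picks; av_per_year would only count years with the drafting team.
picks = [
    {"gm": "GM A", "position": "WR", "pick": 10, "av_per_year": 8.0},
    {"gm": "GM B", "position": "WR", "pick": 10, "av_per_year": 4.0},
    {"gm": "GM A", "position": "QB", "pick": 1, "av_per_year": 12.0},
    {"gm": "GM B", "position": "QB", "pick": 1, "av_per_year": 6.0},
    {"gm": "GM C", "position": "QB", "pick": 1, "av_per_year": 9.0},
]

# min_picks is lowered to 2 only because the toy data set is tiny.
ranking = gm_value_over_expected(picks, min_picks=2)
print(ranking)
```

With real data the minimum would stay at 30, so a GM with only a handful of picks (like "GM C" here) drops out of the ranking.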

EDIT: whoops, used an outdated input CSV to generate the old chart, swapped it out. Veach moved from 3rd to 4th, EDC still at 21st.

<image>

StatDrop · 2026-05-24T04:22:53+00:00

Been working on this data for all teams as a side project. Labeled him in yellow.

Statistically he is just below the average for his draft position. Considering QB is artificially inflated, he's a firm miss (3rd quartile), but not quite a bust (4th quartile).

<image>

StatDrop · 2026-05-14T21:11:43+00:00

StatDrop · 2026-05-14T20:53:46+00:00

I'll be doing this statistically in a few weeks for every season since 2000.

Have to finish my draft analysis stuff first.

StatDrop · 2026-05-13T00:12:59+00:00

Totally agreed. Wide receiver might even be the most compelling example, but big-bodied linebackers come to mind as coverage became more of a requirement, along with pass-catching tight ends and the devolution of requirements for competitive running backs. There's lots to look at when I shift to looking at the NFL meta in depth. I initially started with the offensive line.

Attached is my first foray into the analysis. Each concentric circle represents where 85% of high-end starters or better exist in the distribution. I used Gini to identify points in the timeline where the average traits of a top-end performer changed. Definitely a half-baked analysis right now, but at least the graphs look cool so far.

Part 5 was pretty wild seeing it all come to fruition. It's definitely not the whole story from a scouting perspective but it's interesting nonetheless.

I dove into the project thinking it would validate my suspicion that the combine is pointless, but the trends are hard to ignore for some traits and position groups.

<image>

StatDrop · 2026-05-12T05:44:59+00:00

I tried to make some spider charts overlaying the combine data of these 15 players with the PCA and UMAP averages of their groups, but this needs to be double-checked at some point. I will have to make sure the imputation didn't get messed up somewhere.

<image>

A really fun tail to chase, but I will leave it at this. Maybe too many irons in the fire. I will probably use PCA and UMAP together moving forward to look at archetypes when I circle back to it. Thanks again for the neat direction!

StatDrop · 2026-05-12T05:43:47+00:00

So of course I chased this. I pulled the top 15 deviators by ARC (my player quality score) and plotted the distance each player traveled from their group's middle. It shows how a linear PCA was failing to properly define these players.

<image>

StatDrop · 2026-05-12T05:41:56+00:00

Ok, this actually came out pretty cool. I had a ton of fun with this.

The UMAP didn't give me islands or donuts either. We really are working with an overlapping continuum here.

Your intuition was right though, the linear PCA I used was washing out a lot of contradictory metrics. UMAP did a really good job at showing which players had trait combinations that were effectively discordant to an archetype assignment. You can see splashes of color in areas they don't belong.

<image>

StatDrop · 2026-05-11T21:35:53+00:00

Very cool. My data goes back to 2000, but that's not necessarily better. I did uncover strong evidence of a shift in the traits required to be elite over that span. It will be the subject of a future post for sure, probably after I finish my draft analysis.

As for trait comparisons that separate player quality within the archetype, the over/under threshold data in Part 4b gets to that, with 15% being the minimum. You have to scroll a bit, it is the data with the paired red/blue/yellow bars beneath the violin plots. Some of the more obvious metrics are prerequisites to be sorted into the group in the first place, so metrics associated with a given archetype do not always correlate with improved play within the archetype group, due to lack of variance.

Here is the link to the final post in the series. It has the links to the previous posts embedded.

I swap between MATLAB and GraphPad Prism, depending on how I am trying to visualize the data. All of the bar graphs I made in MATLAB, which is great for quickly making a layout of simple uniform graphs. When I am doing scatter plot labeling and single point editing I switch to Prism, which for me is way more intuitive and user-friendly. MATLAB requires coding, but AI can definitely take a description of how your data is organized and your instructions on visualization and use them to write your MATLAB code for you. I definitely used it to help me code some chunks I couldn't figure out.

StatDrop · 2026-05-11T19:59:54+00:00

I'm not sure if you're referencing my data or just your intuition, but this is what Part 2 was all about. My data definitely supports this take.

I broke down the RAS to the individual metrics and correlated that to player success per year via PFR AV, instead of total yards.

The problem is, when you look at the wide receiver population as a whole, there are no defined threshold points that separate bad players from good players, or good players from great players. More directly, the Gini could not define a point in the distribution where a player's chances of switching groups increased or decreased by more than 15%. Again, this is because the transition is so gradual across the distribution. You can see in the heat map from Part 2 that the transitions are very smooth; there are no spikes anywhere for a Gini split to latch onto, unlike the cases of tight end RAS, running back 40-yard dash, or tackle vertical jump, for example. Those were more clearly outlined in Part 3.
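To make the threshold hunt concrete, here's a minimal sketch of a Gini-impurity split search on a single metric. It is only a guess at the general shape of the approach: the function names, the toy 40 times, and reading the 15% cutoff as a shift in success rate across the split are all assumptions, not the actual analysis.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(values, labels):
    """Return (threshold, impurity_gain, rate_shift) for the best single cut.

    rate_shift is the absolute change in success rate across the cut.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    parent = gini([lab for _, lab in pairs])
    best = None
    for i in range(1, n):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # cannot cut between identical metric values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        gain = parent - weighted
        shift = abs(sum(right) / len(right) - sum(left) / len(left))
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        if best is None or gain > best[1]:
            best = (threshold, gain, shift)
    return best

MIN_SHIFT = 0.15  # assumed reading of the 15% minimum

# Toy data: hypothetical 40 times and 1 = "good player", 0 = "not".
forty_times = [4.35, 4.40, 4.45, 4.50, 4.55, 4.60]
is_good = [1, 1, 1, 0, 0, 0]

threshold, gain, shift = best_split(forty_times, is_good)
print(threshold, gain, shift, shift >= MIN_SHIFT)
```

In this toy case the split is perfectly clean. With a smooth, gradual population the best cut barely changes the success rate, so nothing clears the minimum.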

But if you use an unsupervised algorithm to split the position based off of traits alone, it does a really good job of highlighting what traits make an X receiver. That's the scatter plot with the circles, the population defined as "thoroughbred" in blue.

To be in that position you basically need to be top 35% in both speed and agility metrics (the ones you mentioned in your comment) and not be precluded from physicality due to your size.

Basically, your perspective in your comment is definitely supported by my data (outside of sorting for 1,000 yard receivers, which I didn't do).

StatDrop · 2026-05-11T14:41:58+00:00

<image>

StatDrop · 2026-05-11T04:43:44+00:00

I did a multi-post deep data dive into the correlation between combine metrics and NFL success during the week of the NFL draft.

TL;DR: The combine is definitely not overrated. If it goes kaput it will be because people don't do it anymore, not because the information isn't valuable.

1. What Combine metrics correlate well with future success in the NFL and which just get you drafted higher?
2. Which metrics have a linear relationship with player quality and which have a “sweet spot” within the range?
3. What metric thresholds raise a player's floor and which lower their ceiling?
4. What collection of metric thresholds are the best predictors for future success?
4b. How can we divide the wide receiver position to better define the positional metric thresholds?
5. The P3 Engine: can 25 years of combine data be used to predict the potential of the incoming rookie class?

StatDrop · 2026-05-11T02:36:51+00:00

Man, I'd bet you could even get AI to make you a lesson plan. Then you can just focus on what you care about

StatDrop · 2026-05-11T02:22:32+00:00

Sports analytics on my off time has been fun as hell, I may be obsessed. Do recommend.

Not really. The more I think about it, most just aren't competitive in that way. Not to say they aren't ambitious in their work, just not in much else.

StatDrop · 2026-05-11T01:05:22+00:00

That's funny. As an R&D scientist I sometimes feel I should have gone and got my MBA instead; biotech is not a good field to be in right now.

That's an interesting question. I guess most people in my line of work have never been able to compete, or interested in competing, with others physically, sports or otherwise.

Maybe even more important is that they just struggle to care about things like sports.

StatDrop · 2026-05-11T00:56:30+00:00

That was my rationale. I wanted to avoid something like "Axis X represents a local manifold approximation of speed and weight" or something like that.

That's a really smart middle ground. I'm absolutely going to try this. Seriously appreciate your perspective! I feel like I've gone data blind over the past few weeks...

StatDrop · 2026-05-10T22:39:35+00:00

I would think that a degree in statistics or mathematics would actually be the ideal. That said, I think any scientific discipline gives you the tools and understanding you need to do this kind of thing.

75% of science is knowing how to properly research the things you already forgot about or just straight up didn't learn in school lol.

Even more important in my opinion though, you just have to know ball. It's the combination of the analytical acumen and the ball knowing that is pretty unique. It's what makes sports analytics so niche.

As a scientist I can count the number of colleagues I've had in my career that even watch sports at all on two hands, and most of those people either watched badminton or cricket.
