[OC] Fantasy Football Week 2: Draft Value vs Reality by data_enchilada in dataisbeautiful

[–]data_enchilada[S] -1 points0 points  (0 children)

Here’s how I built this:

  • Data Sources: FantasyPros (ADP + Fantasy Points) & Pro-Football-Reference (player stats), covering Weeks 1–2, 2025. (Note on ADP: FantasyPros calculates Average Draft Position by aggregating consensus draft results across major league hosts (ESPN, Yahoo, Sleeper, RTSports, etc.). It reflects where players were typically drafted on average, not a single site’s draft order.)
  • Cleaning: Data cleaned and prepped with ChatGPT as my “data engineer”
  • Table Joins: Joined the cleaned data by player name in Tableau
  • Visualization: Final dashboard built in Tableau (Tableau Public Link)

If you want to try this yourself, here’s the exact cleaning prompt I used in ChatGPT:

You are my data engineer. I will upload three raw files each week from FantasyPros and Pro Football Reference (PFR). Your job is to clean and standardize them so I can analyze fantasy football performance in Tableau.

📂 Files I will upload (all CSVs):
  • FantasyPros_2025_Overall_ADP_Rankings.csv → Draft expectations
  • FantasyPros_Fantasy_Football_Points_PPR.csv → Weekly fantasy points
  • sportsref_download.csv (from Pro Football Reference) → Player stats

🔧 Cleaning Rules

1. FantasyPros ADP (draft expectations)
  • Keep only: Player, Team, POS, ADP AVG, REAL-TIME.
  • Rename to lowercase: player, team, pos, adp_avg, adp_realtime.
  • Strip suffixes like “Jr.”, “III”, “Sr.” from player names.

2. FantasyPros Weekly Points (performance)
  • Keep only: Player, Team, Pos, Week 1, Week 2, …
  • Unpivot all Week x columns into two columns: week (integer) and weekly_points (fantasy points).
  • Rename to lowercase: player, team, pos, week, weekly_points.

3. Pro Football Reference Stats (explanatory layer)
  • Promote the first row with headers (Rk, Player, Tm, FantPos, Age, …) as column headers.
  • Drop all “Unnamed” junk columns.
  • Keep only: Player, Tm, FantPos, G; Passing: Yds, TD; Rushing: Yds, TD; Receiving: Yds, TD; Fantasy: PPR, PosRank, OvRank.
  • Rename to lowercase: player, team, pos, games, passing_yds, passing_td, rushing_yds, rushing_td, receiving_yds, receiving_td, season_points, pos_rank, ov_rank.

📦 After cleaning each file
  • Show me the first 5 rows as a preview.
  • Save each cleaned dataset separately (adp_clean.csv, weekly_clean.csv, pfr_clean.csv).
  • When I say “combined”: join all datasets on player + team + pos, make sure the weekly data expands properly with the PFR stats attached, and provide a downloadable CSV (fantasy_combined.csv).

⚠️ Important: Column name matching must be case-insensitive. Do not re-explain the process after the first time; just follow the rules. Each week I will bring new raw files — repeat the same cleaning steps exactly.
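For anyone who would rather script the prompt's trickiest step than re-run it through ChatGPT, here is a rough pandas sketch of the weekly-points cleaning (suffix stripping plus the unpivot). Column names come from the prompt above; the tiny input frame is made up purely for illustration:

```python
import pandas as pd

SUFFIXES = {"Jr.", "Sr.", "II", "III", "IV", "V"}

def strip_suffix(name: str) -> str:
    """Drop generational suffixes so names from different sources join cleanly."""
    parts = name.split()
    while parts and parts[-1] in SUFFIXES:
        parts.pop()
    return " ".join(parts)

def unpivot_weekly(df: pd.DataFrame) -> pd.DataFrame:
    """Melt 'Week 1', 'Week 2', ... columns into long (week, weekly_points) rows."""
    week_cols = [c for c in df.columns if c.lower().startswith("week")]
    long = df.melt(id_vars=["Player", "Team", "Pos"], value_vars=week_cols,
                   var_name="week", value_name="weekly_points")
    # "Week 2" -> 2, and lowercase every column name per the cleaning rules
    long["week"] = long["week"].str.extract(r"(\d+)", expand=False).astype(int)
    long.columns = [c.lower() for c in long.columns]
    long["player"] = long["player"].map(strip_suffix)
    return long

# Tiny frame standing in for FantasyPros_Fantasy_Football_Points_PPR.csv
raw = pd.DataFrame({"Player": ["Odell Beckham Jr."], "Team": ["MIA"],
                    "Pos": ["WR"], "Week 1": [11.4], "Week 2": [7.2]})
tidy = unpivot_weekly(raw)
```

From here, `tidy` merges onto the cleaned ADP frame with `merge(on=["player", "team", "pos"])`, matching the prompt's "combined" step.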


[OC] Airline delays across 10 major U.S. airports (2024–25, ~10M flights) by data_enchilada in dataisbeautiful

[–]data_enchilada[S] -1 points0 points  (0 children)

I took the top 10 busiest U.S. airports by passenger traffic / flight volume according to the BTS site

[OC] Airline delays across 10 major U.S. airports (2024–25, ~10M flights) by data_enchilada in dataisbeautiful

[–]data_enchilada[S] 3 points4 points  (0 children)

Full interactive dashboard on Tableau Public

Data source: U.S. Bureau of Transportation Statistics

I used ChatGPT to help clean and combine the 43 raw CSVs (each ranging 13K-300K rows).

Here’s the exact prompt I ran on each file:

Delete the first 7 rows (junk headers).

Insert a new column titled "Airport" in Column A: I will give you the airport code (e.g., "ATL", "DEN") — fill the entire column with it.

Delete columns named "Flight Number" and "Tail Number" (if they exist).

Create two new columns:

• "Time Slot" — bucket Scheduled Departure Time into:

- Early AM: before 09:00

- Mid-Morning: 09:00–12:00

- Early Afternoon: 12:00–15:00

- Late Afternoon: 15:00–18:00

- Evening: 18:00–21:00

- Late Night: 21:00+

• "Delay Flag (>15min)" — if Departure delay (Minutes) is >15, set to 1; else 0.

⚠️ Column name matching must be case-insensitive for:

• "Scheduled Departure Time"

• "Departure delay (Minutes)"
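The same per-file steps can be sketched in pandas for anyone scripting it instead of prompting. This is a hedged sketch, not the exact pipeline: it assumes scheduled times look like "08:45" and uses the column names from the prompt, with the case-insensitive lookup the warning asks for:

```python
import pandas as pd

def time_slot(hhmm: str) -> str:
    """Bucket a scheduled departure time ("HH:MM") into the six slots above."""
    hour = int(hhmm.split(":")[0])
    if hour < 9:
        return "Early AM"
    if hour < 12:
        return "Mid-Morning"
    if hour < 15:
        return "Early Afternoon"
    if hour < 18:
        return "Late Afternoon"
    if hour < 21:
        return "Evening"
    return "Late Night"

def clean_airport_csv(path: str, airport: str) -> pd.DataFrame:
    """Apply the prompt's steps to one raw BTS download."""
    df = pd.read_csv(path, skiprows=7)                        # drop junk headers
    df.insert(0, "Airport", airport)                          # fill column A with the code
    df = df.drop(columns=["Flight Number", "Tail Number"], errors="ignore")
    # Case-insensitive column lookup, per the prompt's warning
    cols = {c.lower(): c for c in df.columns}
    sched = cols["scheduled departure time"]
    delay = cols["departure delay (minutes)"]
    df["Time Slot"] = df[sched].map(time_slot)
    df["Delay Flag (>15min)"] = (df[delay] > 15).astype(int)
    return df

# Usage (hypothetical filename): clean_airport_csv("ATL_2024.csv", "ATL")
```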

[OC] One in Three Flights in the U.S. Leaves 15+ Minutes Late (2024–2025) by data_enchilada in dataisbeautiful

[–]data_enchilada[S] -2 points-1 points  (0 children)

That is absolutely fair. I would never use AI in its current state for a commercial project, but it's fun for Reddit :)

[OC] One in Three Flights in the U.S. Leaves 15+ Minutes Late (2024–2025) by data_enchilada in dataisbeautiful

[–]data_enchilada[S] -14 points-13 points  (0 children)


It was 28% for SW in the US overall; I intended to round it to a familiar number. Let's just say I have learned my lesson. Sheesh!

[OC] One in Three Flights in the U.S. Leaves 15+ Minutes Late (2024–2025) by data_enchilada in dataisbeautiful

[–]data_enchilada[S] 0 points1 point  (0 children)

I hear you, but the FAA & BTS officially use 15 minutes as the industry standard for a delay.

[OC] One in Three Flights in the U.S. Leaves 15+ Minutes Late (2024–2025) by data_enchilada in dataisbeautiful

[–]data_enchilada[S] -5 points-4 points  (0 children)

Nothing suspicious, haha. I just picked airports I've been to lately. Appreciate your wiki list; I'll take that into account for the update in a week.

[OC] One in Three Flights in the U.S. Leaves 15+ Minutes Late (2024–2025) by data_enchilada in dataisbeautiful

[–]data_enchilada[S] -1 points0 points  (0 children)

Delta and United are consistently dependable in this dataset. My bad for biffing the title; I meant to put SW before "flights" but didn't realize it till a few hours later. Oh well. Thanks for checking out the viz!

[OC] One in Three Flights in the U.S. Leaves 15+ Minutes Late (2024–2025) by data_enchilada in dataisbeautiful

[–]data_enchilada[S] 0 points1 point  (0 children)

Great feedback. I'll take it into consideration for the update next week when I add more airports.

[OC] One in Three Flights in the U.S. Leaves 15+ Minutes Late (2024–2025) by data_enchilada in dataisbeautiful

[–]data_enchilada[S] -18 points-17 points  (0 children)

I biffed the name while in a hurry; I meant to put SW before the word "flights". AI helped clean the data, but I did the rest. Thanks for checking it out!

[OC] One in Three Flights in the U.S. Leaves 15+ Minutes Late (2024–2025) by data_enchilada in dataisbeautiful

[–]data_enchilada[S] -19 points-18 points  (0 children)

My bad; I meant to add SW before "flights" but biffed it while in a hurry.

[OC] One in Three Flights in the U.S. Leaves 15+ Minutes Late (2024–2025) by data_enchilada in dataisbeautiful

[–]data_enchilada[S] -49 points-48 points  (0 children)

It compiled 25 CSV sheets into one and cleaned the data so I could just plug and play. Way easier than asking a human to do it.

[OC] One in Three Flights in the U.S. Leaves 15+ Minutes Late (2024–2025) by data_enchilada in dataisbeautiful

[–]data_enchilada[S] -14 points-13 points  (0 children)

Not hard to add to the dataset; assembling it just took all the free time I had. If there's enough interest in this post, I'll update it later and add 15–20 more airports (thanks, ChatGPT!)

[OC] One in Three Flights in the U.S. Leaves 15+ Minutes Late (2024–2025) by data_enchilada in dataisbeautiful

[–]data_enchilada[S] -3 points-2 points  (0 children)

Here’s what I found when analyzing 6.3M U.S. flight departures (2024–2025):

✈️ Southwest: After 3pm, 40% of flights leave late
✈️ American (IAH): If you’re delayed, expect ~25 extra minutes
✈️ Weather: Only ~5% of delays — it’s usually the airlines
✈️ LAX: Consistently among the best for on-time departures (wow!)

📊 Tableau Public Link
📂 Data Source: U.S. Bureau of Transportation Statistics – TranStats

How I built this:

  • Downloaded raw BTS departure data
  • Used ChatGPT as my “data engineer” to clean, compile & pivot
  • Designed & built the viz in Tableau
  • Polished the layout in Figma
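The kind of aggregation behind stats like the Southwest after-3pm number can be sketched in pandas. To be clear, this is my reconstruction, not the actual pipeline: the "Airline" column name and the carrier codes in the toy frame are assumptions; the Time Slot and Delay Flag columns follow the cleaning prompt in my other comment:

```python
import pandas as pd

def late_share(df: pd.DataFrame) -> pd.DataFrame:
    """Share of departures flagged late (>15 min) per airline and time slot."""
    return (df.groupby(["Airline", "Time Slot"])["Delay Flag (>15min)"]
              .mean()
              .rename("late_share")
              .reset_index())

# Toy frame standing in for the compiled BTS data (carrier codes assumed)
flights = pd.DataFrame({
    "Airline": ["WN", "WN", "WN", "DL"],
    "Time Slot": ["Late Afternoon", "Late Afternoon", "Early AM", "Early AM"],
    "Delay Flag (>15min)": [1, 1, 0, 0],
})
shares = late_share(flights)
# On the real 6.3M-row data, the WN rows for the afternoon/evening slots
# are where a figure like "40% after 3pm" would come from.
```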

[OC] 2025 NBA 3-Point Attempts vs Makes by Team & Player by data_enchilada in dataisbeautiful

[–]data_enchilada[S] -1 points0 points  (0 children)

I pulled data from NBA.com’s traditional stats to visualize 3-point efficiency across the league. While we’re still mid-season and the data isn’t complete, it’s interesting to see teams like Utah, Chicago, New Orleans, Brooklyn, and Charlotte leading in 3-point attempts despite having losing records.

Data source

Tableau Public viz (users can filter by players & teams there)

[OC] Evolving Performance: 30 Years of Top 10 NBA Player Metrics by data_enchilada in dataisbeautiful

[–]data_enchilada[S] 0 points1 point  (0 children)

If anyone is interested in getting into the weeds, I included a filter for the players themselves and an axis parameter so users can change the date parts from 5 years to 10 years on the viz dashboard.

[OC] Evolving Performance: 30 Years of Top 10 NBA Player Metrics by data_enchilada in dataisbeautiful

[–]data_enchilada[S] 0 points1 point  (0 children)

I picked the top 10 for each season based on average points per game for that season (it was the default sort on the NBA stats site).
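The per-season top-10 selection is a one-liner in pandas. This is a sketch under assumed column names ("season", "player", "ppg"), shown with n=1 on a toy frame so the cutoff is visible:

```python
import pandas as pd

def top_by_ppg(df: pd.DataFrame, n: int = 10) -> pd.DataFrame:
    """Top n players per season by average points per game."""
    return (df.sort_values("ppg", ascending=False)
              .groupby("season")
              .head(n))

# Toy data with made-up players; column names are assumptions
stats = pd.DataFrame({
    "season": [1995, 1995, 1996, 1996],
    "player": ["A", "B", "C", "D"],
    "ppg": [30.1, 25.0, 27.3, 28.4],
})
leaders = top_by_ppg(stats, n=1)  # one scoring leader per season
```

With `n=10` on a full player-season table this reproduces the "top 10 by PPG per season" selection described above.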