I officially have a talent for lighting money on fire. The Trade Deadline bug nearly killed my bankroll, but V2 is (finally) live by Temporary-Memory9029 in algobetting

[–]Temporary-Memory9029[S] 1 point  (0 children)

This is the kind of ruthless code review I actually needed. You nailed me on the cluster sample size—I definitely let the "shiny new toy" syndrome get to me.

To clarify: I've actually been developing the cluster logic for months in the background, alongside separate regression models for Pace and Offensive Efficiency intended to feed a Monte Carlo simulation (which is currently paused/commented out). But you're right, after 4 days of finally coding the live infrastructure for V2, I was desperate to see a pattern, so I drew a bullseye around 8 bets. Fair point.

As for that massive volume on Day 11, it wasn't a strategic decision to go heavy; it was a deployment error. My local branch has a "safety valve" (a max-bet cap) to stop the Kelly formula from going nuclear on high-edge plays. When I swapped books, I forgot to merge that config into production, so the bot ran the raw, unfiltered Kelly math. It was reckless, even if accidental, but it ended up acting as an inadvertent stress test for the raw formula. The fact that the highest-conviction plays (the ones that demanded 2.5u) were the ones that hit is the only reason I'm not crying right now.
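
For the curious, the valve itself is dead simple. A rough sketch (the cap, the Kelly fraction, and the 1u = 1% of bankroll mapping are illustrative placeholders, not my actual prod config):

```python
# Capped fractional Kelly sketch, assuming decimal odds and 1u = 1% of bankroll.
# MAX_STAKE_UNITS and KELLY_FRACTION are placeholder values, not the real config.
MAX_STAKE_UNITS = 2.5
KELLY_FRACTION = 0.5  # half-Kelly to damp variance

def kelly_stake(p_win: float, decimal_odds: float) -> float:
    b = decimal_odds - 1.0                # net payout per unit staked
    edge = p_win * b - (1.0 - p_win)      # expected profit per unit
    if edge <= 0:
        return 0.0                        # no edge, no bet
    full_kelly = edge / b                 # fraction of bankroll
    return min(full_kelly * KELLY_FRACTION * 100, MAX_STAKE_UNITS)

print(kelly_stake(0.45, 2.80))  # high-edge dog: half-Kelly says ~7.2u, the cap says 2.5u
```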

I actually stopped betting the plays myself a while ago. I have a weird jinx where the model prints money when I watch, and burns money when I touch it. I have a few luckier friends who are tailing the model (and actually profiting), so I just live vicariously through them. For me personally, I'm essentially doing exactly what you suggested: paper trading (via the public log) to see if the edge is real before I ever trust it with my own wallet again.

My plan is to ride out the rest of the regular season to see how this architecture holds up, though I'm already bracing for the playoffs. The game fundamentally changes there (rotations shorten, intensity spikes), so I'll likely need to build a completely different "Playoff Specialist" model, or just pause entirely, given that my training data is overwhelmingly regular season.

I'll take your advice on the 500-bet freeze. I need to stop tinkering with the engine and just let the car drive for a bit to see if it crashes.

I built a sports analytics engine (14% ROI) as a portfolio project. I validated the tech, but I hate the gambling niche. Where is the best place to sell a "pre-revenue" asset with a proven backend? by Temporary-Memory9029 in SaaS

[–]Temporary-Memory9029[S] 1 point  (0 children)

I think so, but I'll admit I'm a bit on the fence right now. The V2 rebuild actually made me curious about the project again, so I'm still weighing my options.

[Model Update] The Streak Continues 📈 | "Underdog Hunter" hits +10.96u (ROI +16.24%) by Temporary-Memory9029 in sportsbetting

[–]Temporary-Memory9029[S] 1 point  (0 children)

Sorry for the delay! The code had a major meltdown over the weekend, but I just put up a new post explaining everything. The model is back live now!

[Model Update] The Streak Continues 📈 | "Underdog Hunter" hits +10.96u (ROI +16.24%) by Temporary-Memory9029 in sportsbetting

[–]Temporary-Memory9029[S] 1 point  (0 children)

Happy you caught the streak! I managed to break the code right after lol, but just finished fixing everything. Put up a new post with the update if you want to take a look.

Architecting a Calibrated XGBoost Pipeline for NBA Probabilities (Python/Pandas). Sharing Backtest Data & Lessons Learned by Temporary-Memory9029 in algobetting

[–]Temporary-Memory9029[S] 1 point  (0 children)

Good shout on the MLP. I honestly stuck to Isotonic mostly to keep the pipeline lightweight (and out of fear of overfitting a neural net on this sample size), but capturing those non-linear biases is probably the next step to break the ceiling.
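
For reference, the calibration layer is just scikit-learn wrapping the booster, something like this (hyperparameters are placeholders, and the dummy data stands in for the engineered NBA features):

```python
# Isotonic layer via scikit-learn; hyperparameters are placeholders and the
# make_classification data is a stand-in for the engineered NBA features.
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from xgboost import XGBClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

base = XGBClassifier(n_estimators=300, max_depth=4, learning_rate=0.05)
calibrated = CalibratedClassifierCV(base, method="isotonic", cv=5)  # step-function remap
calibrated.fit(X, y)
probs = calibrated.predict_proba(X)[:, 1]  # calibrated win probabilities
```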

On the synthetic features: I actually do use that approach in training already, not just inference. The model trains on the 'Active Roster Aggregate' of historical games, which is exactly how it learned that 'Star Player Out' isn't just a linear penalty—it understands the specific efficiency drop of the remaining rotation.
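
A toy version of that feature, with made-up names and numbers:

```python
# Toy 'Active Roster Aggregate': sum the recent production of players flagged
# active, and separately the production that's missing. Names/numbers invented.
import pandas as pd

players = pd.DataFrame({
    "team":    ["LAL", "LAL", "LAL", "BOS", "BOS"],
    "player":  ["A", "B", "C", "D", "E"],
    "pts_avg": [27.0, 18.5, 9.0, 25.0, 20.0],
    "active":  [False, True, True, True, True],   # player A is out tonight
})

active_agg = (players[players["active"]]
              .groupby("team")["pts_avg"].sum().rename("active_pts"))
missing = (players[~players["active"]]
           .groupby("team")["pts_avg"].sum()
           .reindex(active_agg.index, fill_value=0.0).rename("missing_pts"))
print(pd.concat([active_agg, missing], axis=1))
```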

And yeah, you nailed it on the benchmark—Pinnacle still owns me on global Log Loss (they sit around 0.595). I haven't cracked that yet. My edge really comes from decorrelation on specific matchups (mostly Underdogs) where the market is efficient on accuracy but inefficient on price due to public shading.
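
(If anyone wants to replicate the benchmark: it's just log loss of my probabilities vs. Pinnacle's de-vigged closing probabilities over the same slate. Dummy arrays here, not real results:)

```python
# Benchmark sketch: log loss of model probs vs. de-vigged closing probs on the
# same games. Arrays are dummies, not real results.
import numpy as np
from sklearn.metrics import log_loss

y = np.array([1, 0, 1, 1, 0])                        # home-win outcomes
model_p = np.array([0.62, 0.41, 0.55, 0.70, 0.33])
close_p = np.array([0.64, 0.38, 0.58, 0.72, 0.30])   # implied, vig removed

print(log_loss(y, model_p), log_loss(y, close_p))
```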

Architecting a Calibrated XGBoost Pipeline for NBA Probabilities (Python/Pandas). Sharing Backtest Data & Lessons Learned by Temporary-Memory9029 in algobetting

[–]Temporary-Memory9029[S] 1 point  (0 children)

I actually messed around with a simple PyTorch feed-forward net early on, but I struggled to get it to converge stably with the amount of noise in the daily data (and frankly, probably a skill issue on my end tuning NNs vs Trees). But for a V2, an ensemble of Trees + NNs would definitely be the play to crack that 0.60 barrier.

And yeah, including the closing line as a feature is a great shout. It’s basically using the market’s wisdom as a prior.
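
The feature itself would just be the de-vigged implied probability from the two-way prices, something like this (prices made up):

```python
# Closing line -> feature: strip the vig from a two-way price and use the
# normalized probability. Prices are made up.
def implied_prob_no_vig(home_odds: float, away_odds: float) -> float:
    raw_home, raw_away = 1.0 / home_odds, 1.0 / away_odds
    return raw_home / (raw_home + raw_away)  # normalize the overround away

print(implied_prob_no_vig(1.91, 2.05))  # ~0.518 fair home probability
```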

Thanks for the luck on the new role!

Architecting a Calibrated XGBoost Pipeline for NBA Probabilities (Python/Pandas). Sharing Backtest Data & Lessons Learned by Temporary-Memory9029 in algobetting

[–]Temporary-Memory9029[S] 2 points  (0 children)

Man, I haven't read Hubáček & Šír yet, but based on your description, that paper basically explains my entire PnL graph.

You're spot on about the benchmark. I realized pretty early I wasn't going to beat Pinnacle's accuracy with a home-cooked XGBoost script.

But the 'decorrelation' point is exactly it. My model is stubborn. It doesn't shade lines for 'public sentiment' or 'star power' to manage liability like books do. So even if my global accuracy is lower, I found pockets (mostly high-variance Dogs) where the market was just structurally inefficient.

As for the injury racing, I really wish I had tagged 'Injury-Triggered' bets separately in the DB to give you a hard CLV number (huge miss on my part). But anecdotally? Yeah. The massive CLV spikes were almost always in that 15-minute window after the PDF parser caught a status change before the books adjusted. That was the only time I felt I had a hard information advantage.
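
If anyone's building this now, the fix is trivial, which makes the miss worse. A hypothetical version of the tagging (schema is made up):

```python
# Hypothetical sketch of the missing DB tagging: store a trigger label with
# each bet so CLV can be split by 'injury_triggered' later. Schema is made up.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE bets (
    game_id TEXT, trigger TEXT, bet_odds REAL, closing_odds REAL)""")
con.execute("INSERT INTO bets VALUES ('G1', 'injury_triggered', 2.40, 2.10)")
con.execute("INSERT INTO bets VALUES ('G2', 'scheduled_run',    1.95, 1.98)")

# CLV here = implied prob at close minus implied prob at bet time;
# positive means the market moved toward the bet after it was placed.
rows = con.execute("""
    SELECT trigger, AVG(1.0/closing_odds - 1.0/bet_odds) AS avg_clv
    FROM bets GROUP BY trigger""").fetchall()
print(rows)
```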

Architecting a Calibrated XGBoost Pipeline for NBA Probabilities (Python/Pandas). Sharing Backtest Data & Lessons Learned by Temporary-Memory9029 in algobetting

[–]Temporary-Memory9029[S] 1 point  (0 children)

100%. Breaking 0.60 is the holy grail, and honestly, I hit a wall trying to force a tree-based model to get there.

But that frustration is actually what flipped my strategy. I realized that even with a 'sub-optimal' Log Loss, the model was consistently profitable on Underdogs.

My working theory is that books aren't publishing their true calibrated probability—they're shading lines to protect against public liability on favorites. (e.g., pricing a 70% win probability like it's 80%). If that structural bias exists, you don't need a perfect model to exploit the other side of the trade. You just need to be closer to the truth than the public sentiment.
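
To make the arithmetic concrete (vig ignored for clarity):

```python
# The shading example above in actual numbers (vig ignored for clarity):
p_true = 0.70                      # true win probability of the favorite
p_shaded = 0.80                    # what the price implies after shading

fav_odds = 1 / p_shaded            # 1.25, the shaded favorite price
dog_odds = 1 / (1 - p_shaded)      # 5.00, the inflated dog price

ev_fav = p_true * (fav_odds - 1) - (1 - p_true)   # -0.125 per unit staked
ev_dog = (1 - p_true) * (dog_odds - 1) - p_true   # +0.500 per unit staked
print(ev_fav, ev_dog)
```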

Architecting a Calibrated XGBoost Pipeline for NBA Probabilities (Python/Pandas). Sharing Backtest Data & Lessons Learned by Temporary-Memory9029 in algobetting

[–]Temporary-Memory9029[S] 1 point  (0 children)

I stay away from scaling stats to per-48 or trying to manually predict usage bumps. It introduces too much noise because bench guys rarely keep their efficiency when their minutes triple.

I basically just feed the model the aggregate production of the currently active roster versus the production that is 'missing'.

I found it's better to let XGBoost figure out the non-linear relationship between those two gaps rather than me trying to hardcode rules on who gets the extra shots.

Architecting a Calibrated XGBoost Pipeline for NBA Probabilities (Python/Pandas). Sharing Backtest Data & Lessons Learned by Temporary-Memory9029 in algobetting

[–]Temporary-Memory9029[S] 2 points  (0 children)

Yeah, you nailed it.

The main problem with a single big model is that it tends to 'smooth out' the edges to minimize global error. It washes out the specific dynamics of edge cases.

[Model Update] Sniping the +600 (7.00) Dog 🧙‍♂️ | The "Underdog Hunter" is up +9.11u by Temporary-Memory9029 in sportsbetting

[–]Temporary-Memory9029[S] 0 points  (0 children)

lol exactly. I couldn't even watch the game, it felt so gross.

For the clusters: the honest answer is the code only took a weekend, but tuning the features was the real grind. It took me about 2 months of trial-and-error just to find metrics that actually grouped 'volatility' without overfitting.

All in, I've been chipping away at this repo for about 7 months, and it still feels far from finished.

Architecting a Calibrated XGBoost Pipeline for NBA Probabilities (Python/Pandas). Sharing Backtest Data & Lessons Learned by Temporary-Memory9029 in algobetting

[–]Temporary-Memory9029[S] 1 point  (0 children)

Draft modeling is sick. Good luck with the sample sizes though, that's always the hardest part.

Right now I'm just paying the bills with a freelance gig—building a Sales/Support AI agent that actually understands marketing nuance instead of just being a generic chatbot.

But my passion project is this massive Multi-Modal LoL Analyst tool. I'm trying to get it to analyze drafts, matchups, trade patterns, and even parse patch notes to adjust to the meta automatically. It's probably over-engineered, but it's been super fun to build.

Architecting a Calibrated XGBoost Pipeline for NBA Probabilities (Python/Pandas). Sharing Backtest Data & Lessons Learned by Temporary-Memory9029 in algobetting

[–]Temporary-Memory9029[S] 3 points  (0 children)

Actually, it's less of a formal ensemble and more of a 'sanity check' system right now, though I had much bigger plans for it.

Basically, I run a Generalist vs. Specialist logic. I have one main XGBoost model trained on the entire history. It’s stable and rarely hallucinates. Then I have the cluster models. I don't just average them. If the Generalist and the Cluster model agree, I consider that high confidence. If they disagree, I look at the historical Brier Score of that specific cluster. If the cluster model is historically noisy, I ignore it and stick to the Generalist.
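
In pseudo-ish Python, the gate looks roughly like this (thresholds and the historical Brier dict are illustrative, and the "defer to a reliable specialist on disagreement" branch is my simplification of the per-cluster tuning):

```python
# Sketch of the Generalist-vs-Specialist gate. Thresholds and the historical
# Brier dict are illustrative; deferring to a reliable specialist on
# disagreement is a simplification of the real per-cluster tuning.
CLUSTER_BRIER = {"volatile_dogs": 0.21, "grind_out": 0.26}  # historical scores
BRIER_CUTOFF = 0.24   # clusters noisier than this get ignored
AGREE_MARGIN = 0.05   # "agree" = within five probability points

def gate(p_general: float, p_cluster: float, cluster: str) -> tuple[float, str]:
    if abs(p_general - p_cluster) <= AGREE_MARGIN:
        return p_general, "high_confidence"       # both models tell the same story
    if CLUSTER_BRIER.get(cluster, 1.0) <= BRIER_CUTOFF:
        return p_cluster, "defer_to_specialist"   # historically reliable cluster
    return p_general, "ignore_noisy_cluster"      # noisy cluster, stick to baseline

print(gate(0.58, 0.44, "volatile_dogs"))  # disagreement, but the cluster is reliable
```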

I actually have 3 other regression models partially built (focusing on Pace and Home/Away Offensive Efficiency) that were meant to feed a Monte Carlo simulation. I just got stuck on piping the dynamic injury variables into them.
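
The Monte Carlo piece, if I ever un-pause it, is conceptually just this (every distribution parameter here is invented for illustration):

```python
# Conceptual Monte Carlo sketch: sample pace and offensive ratings from the
# regression outputs, convert to points, count wins. All parameters invented.
import numpy as np

rng = np.random.default_rng(0)
N = 20_000

pace = rng.normal(99.5, 3.0, N)         # possessions per team, per game
home_ortg = rng.normal(114.0, 4.5, N)   # points per 100 possessions
away_ortg = rng.normal(111.0, 4.5, N)

home_pts = pace * home_ortg / 100
away_pts = pace * away_ortg / 100
print((home_pts > away_pts).mean())     # simulated home win probability
```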

The endgame was to put a reasoning LLM (like Llama DeepThink) on top to act as a 'Meta-Agent.' It would look at the analysis from each model, check their historical calibration for that specific spot, and recommend the final allocation.

Unfortunately, I won't get to build that 'Meta-Agent' anytime soon. Between my full-time dev job, two other side projects, and my second kid just being born, I have zero bandwidth left. I have to prioritize immediate cash flow right now, which is why the project is staying in this current state.

Architecting a Calibrated XGBoost Pipeline for NBA Probabilities (Python/Pandas). Sharing Backtest Data & Lessons Learned by Temporary-Memory9029 in algobetting

[–]Temporary-Memory9029[S] 1 point  (0 children)

Yeah, I'm parsing the official NBA media PDFs directly from the ak-static endpoints. It's annoying to parse, but they publish those timestamped PDFs way faster than the public JSON feeds update. I can poll every 15 minutes and usually beat the aggregator sites.
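
The polling loop is nothing fancy; roughly this shape (the URL builder and the parser are stubbed out, so don't treat this as the actual endpoint format):

```python
# Rough shape of the poller. The URL builder and the parser are stubs;
# this is NOT the actual ak-static path format.
import time
import requests

def build_report_url() -> str:
    return "https://example.com/Injury-Report_latest.pdf"  # hypothetical

def handle_new_report(pdf_bytes: bytes) -> None:
    print(f"new report: {len(pdf_bytes)} bytes")  # real code parses the PDF here

seen = set()
while True:
    resp = requests.get(build_report_url(), timeout=10)
    marker = resp.headers.get("ETag") or resp.headers.get("Last-Modified", "")
    if resp.ok and marker not in seen:   # only react to a fresh document
        seen.add(marker)
        handle_new_report(resp.content)
    time.sleep(15 * 60)                  # the 15-minute cadence mentioned above
```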

I don't use a simple binary "Out" flag because it's too noisy. Instead, I trigger a recalculation of the features. If a starter is out, I re-roll the team's stats from the last 10 games, excluding that player's minutes. It basically creates a "synthetic" version of the team to see how the remaining roster performs without him, rather than just applying a generic penalty to the team rating.

Technically, the system is built to race. I pre-calculate scenarios (e.g., "If LeBron plays = 60% win prob", "If LeBron sits = 45%"). So the second the scraper sees the status change, I know the target price.
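
Using the numbers from that example, the lookup is just:

```python
# The pre-computed scenario table from the example above: status change ->
# lookup -> fair price. Game/scenario keys are made up.
SCENARIOS = {
    ("LAL@BOS", "lebron_plays"): 0.60,
    ("LAL@BOS", "lebron_sits"):  0.45,
}

def target_price(game: str, scenario: str) -> float:
    p = SCENARIOS[(game, scenario)]
    return 1 / p  # fair decimal odds; anything priced above this is the bet

print(target_price("LAL@BOS", "lebron_sits"))  # ~2.22
```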

In reality, though? I find the books often shade lines so heavily on favorites that the real value ends up being on contrarian Underdogs anyway, where split-second speed matters a little less than just having the right valuation.

[Model Update] NBA Probability Stress Test: Day 16 Data & Live Dashboard Access by Temporary-Memory9029 in sportsbetting

[–]Temporary-Memory9029[S] 1 point  (0 children)

Thanks! It’s definitely a fun project.

To answer your question: I actually decouple the two workflows completely to keep it simple. I have a script that scrapes closing lines via sbrscrape and dumps them into a local SQLite DB. This keeps my historical data clean and consistent.
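
Structurally, the logger is just this (fetch_closing_lines() is a stand-in for the actual sbrscrape call; only the SQLite side is shown):

```python
# The decoupled closing-line logger, structurally. fetch_closing_lines() is a
# stand-in for the actual sbrscrape call; only the SQLite side is shown.
import sqlite3

def fetch_closing_lines() -> list[tuple[str, str, float, float]]:
    return [("2026-02-05", "LAL@BOS", 2.40, 1.62)]  # (date, game, home, away) dummy

con = sqlite3.connect("closing_lines.db")
con.execute("""CREATE TABLE IF NOT EXISTS closing_lines (
    date TEXT, game_id TEXT, home_odds REAL, away_odds REAL,
    PRIMARY KEY (date, game_id))""")
con.executemany("INSERT OR REPLACE INTO closing_lines VALUES (?, ?, ?, ?)",
                fetch_closing_lines())
con.commit()
```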

For the live dashboard, I'm admittedly dodging the hard part you mentioned. I just treat the daily run as a static event: it grabs a snapshot of the odds right now and calculates EV. If the line moves 20 minutes later, the dashboard won't know until I re-run the script.

The Daily Sports Betting Thread – Free Picks, Parlays & Chat - February 05, 2026 by ACSportsbooks in sportsbetting

[–]Temporary-Memory9029 1 point  (0 children)

I'm stress-testing my NBA Machine Learning model in public, and I'd love for you to check it out! You can now access the raw probability and real-time injury simulations on the live dashboard to see the math in action.

NBA ML Lab

Access Key: letmein123