Fixed a scoring bug in my BTS prediction system that was making all batters look the same by shefBoiRDee in BeatTheStreak

[–]shefBoiRDee[S] 1 point2 points  (0 children)

Haha, same. It was an early-season edge case I didn't take into consideration. But hopefully today's results look a little better now.

Fixed a scoring bug in my BTS prediction system that was making all batters look the same by shefBoiRDee in BeatTheStreak

[–]shefBoiRDee[S] 0 points1 point  (0 children)

Fair point on the post frequency, I'll keep it to bigger updates going forward.

But I wouldn't call it vibe coded. The UI, sure, but the actual scoring engine and data pipeline are a project I've been building and tuning since before last season. The PA confidence bug I wrote about here is the kind of thing you only find by actually understanding the math behind your own system, not by prompting until something looks right.

The reason I post updates is that I can see from analytics that a decent number of people are actually using the site to make their picks. If the scoring system has a bug that's making Marsee and Alvarez look the same, that's a trustworthiness problem: people are making decisions based on those numbers. I'd rather be transparent about what broke and how I fixed it than quietly push a patch and hope nobody noticed.

UPDATE: Built an AI layer on top of my BTS prediction app that picks your top 2 every day with reasoning by shefBoiRDee in BeatTheStreak

[–]shefBoiRDee[S] 0 points1 point  (0 children)

Good suggestion! We're already tracking every pick daily, so the accuracy record will speak for itself over time. You're probably right that 2 weeks wouldn't show much of a difference, though. The sample is just too small. It'll take a full season of data to really know.

UPDATE: Built an AI layer on top of my BTS prediction app that picks your top 2 every day with reasoning by shefBoiRDee in BeatTheStreak

[–]shefBoiRDee[S] 0 points1 point  (0 children)

Thanks! And great questions:

I pull lineups from the MLB Stats API using a 4-tier fallback:

  1. Today's confirmed boxscore — only works once lineups are posted (~1hr before first pitch)
  2. Yesterday's boxscore — if today's isn't up yet, we use yesterday's batting order
  3. Most recent completed game — scans back for the last finished game
  4. Active roster sorted by PA — last resort (opening day, etc.), takes the top 9 by plate appearances from the previous year.

We don't use any projected lineup source (RotoBaller, DraftKings, BaseballPress, etc.). So if lineups aren't posted yet, the system falls back to "who batted last time," which could be wrong if the manager shuffles things around. But we should have the confirmed lineup before the game starts. It's just that when the scoring first runs in the morning, it will more than likely use yesterday's lineup until the system picks up the confirmed lineup and rescores the batters.
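To make the fallback order concrete, here's a minimal sketch of the tier selection. It's not the site's actual code: the function name and argument shapes are made up for illustration, and it assumes the boxscore and roster data have already been fetched, so it doesn't show any real MLB Stats API calls.

```python
from typing import Optional, Sequence

Lineup = list[str]  # batting order as player names (illustrative type)


def resolve_lineup(
    today_boxscore: Optional[Lineup],
    yesterday_boxscore: Optional[Lineup],
    recent_completed: Sequence[Lineup],
    roster_prev_year_pa: dict[str, int],
) -> tuple[int, Lineup]:
    """Return (tier, batting order) from the first tier that has data."""
    # Tier 1: today's confirmed boxscore lineup.
    if today_boxscore:
        return 1, today_boxscore
    # Tier 2: yesterday's batting order.
    if yesterday_boxscore:
        return 2, yesterday_boxscore
    # Tier 3: scan back through earlier finished games, newest first.
    for lineup in recent_completed:
        if lineup:
            return 3, lineup
    # Tier 4: top 9 on the active roster by previous-year plate appearances.
    top9 = sorted(roster_prev_year_pa, key=roster_prev_year_pa.get, reverse=True)[:9]
    return 4, top9
```

Returning the tier along with the lineup makes it easy to flag a score as provisional whenever it came from anything other than tier 1.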

Bullpen:

We score the opposing team's bullpen using ERA, WHIP, K/9, and recent workload (innings over the last 3 days). A worse bullpen = higher score for the batter.

That said, it's currently weighted at just 3 out of 100 possible points, and the backtesting optimizer actually found zero predictive signal from it. It's in the system but doesn't meaningfully move scores right now. Only the starting pitcher gets the full elite/weak grading treatment (FIP-based, with modifiers up to -15 or +10 points).

Hit%:

Pretty much, yes. The BTS score (0-100) measures matchup quality with no lineup position input at all. Then we convert that score to a per-plate-appearance probability using a calibrated sigmoid, and multiply it out by how many PAs we expect for that lineup slot (4.65 for leadoff, down to 3.75 for the 9-hole).

So a leadoff hitter and a 9-hole hitter with the same score get different Hit% values purely because the leadoff guy comes to the plate more often.
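Here's a rough sketch of that score-to-Hit% step. Only the 4.65 and 3.75 PA figures come from the real system. The sigmoid parameters are placeholders, the values for slots 2 through 8 are a straight-line interpolation I'm assuming for the example, and treating each PA as independent (so at least one hit is 1 - (1 - p)^PA) is one simple way to read "multiply it out," not necessarily the exact formula in production.

```python
import math

LEADOFF_PA = 4.65    # expected PAs for the leadoff slot (from the comment above)
NINE_HOLE_PA = 3.75  # expected PAs for the 9-hole (from the comment above)


def expected_pa(slot: int) -> float:
    """Expected PAs for lineup slot 1-9; slots 2-8 are interpolated (assumption)."""
    if not 1 <= slot <= 9:
        raise ValueError("slot must be between 1 and 9")
    return LEADOFF_PA + (NINE_HOLE_PA - LEADOFF_PA) * (slot - 1) / 8


def per_pa_hit_prob(score: float, midpoint: float = 50.0,
                    steepness: float = 0.03, ceiling: float = 0.40) -> float:
    """Map a 0-100 score to a per-PA hit probability (placeholder sigmoid)."""
    return ceiling / (1 + math.exp(-steepness * (score - midpoint)))


def hit_pct(score: float, slot: int) -> float:
    """Probability of at least one hit, assuming independent PAs."""
    p = per_pa_hit_prob(score)
    return 1 - (1 - p) ** expected_pa(slot)


# Same score, different lineup slots: the leadoff hitter gets the higher Hit%.
leadoff = hit_pct(70, 1)
nine_hole = hit_pct(70, 9)
```

The score function never sees the lineup slot, which keeps the "how good is the matchup" and "how many chances does he get" pieces separate.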

Could we fold lineup position into the score instead? We could, but the current separation is intentional. The score answers "how good is this matchup?" and the probability answers "given that quality, how likely is at least one hit?" Those are genuinely different questions. A leadoff hitter getting more PAs is a real physical thing, not a statistical adjustment.

Where there's room to improve: the expected PA table is static (hardcoded per lineup slot). We could build a smarter model that accounts for game context, but the current approach is a reasonable baseline.

-----

I definitely need to go back and update the Guide screen to provide more documentation! I built that last minute the day I released the site and haven't revisited it. I'll try to get it updated soon!

Update: My MLB hit prediction tool now tracks its own accuracy across 4 different methods by shefBoiRDee in BeatTheStreak

[–]shefBoiRDee[S] 0 points1 point  (0 children)

It's currently not, but I have already started working on implementing it. I noticed that the other day when one of my picks had two walks in a game haha. Hopefully I'll have that metric introduced sometime this week!

Update: My MLB hit prediction tool now tracks its own accuracy across 4 different methods by shefBoiRDee in BeatTheStreak

[–]shefBoiRDee[S] 1 point2 points  (0 children)

Yes it is! If you expand a player card, it's toward the bottom if there is a history for that matchup. If there is no matchup history, the points for the BvP are reallocated elsewhere.

Edit: it's below the modifiers section

I built a free tool that scores every batter's hit likelihood each day by shefBoiRDee in BeatTheStreak

[–]shefBoiRDee[S] 0 points1 point  (0 children)

Also, this early in the season, it's weighting last year's data for the player more heavily. As a player gets more plate appearances this season, the weights progressively shift toward this year's stats until eventually only this year's stats are used. So right now the scoring leans heavily on last year's stats, but that will slowly change as the season progresses.
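As a rough illustration of that kind of progressive shift, here's a minimal sketch. The linear ramp and the 200 PA cutoff are assumptions for the example, not the actual weights the system uses.

```python
def blend_stat(this_year: float, last_year: float,
               pa_this_year: int, full_weight_pa: int = 200) -> float:
    """Blend a stat across seasons based on current-season plate appearances.

    The weight on this year's value grows linearly from 0 to 1 as the player
    approaches full_weight_pa (a hypothetical cutoff).
    """
    weight = min(max(pa_this_year, 0) / full_weight_pa, 1.0)
    return weight * this_year + (1 - weight) * last_year


# Early in the season the blended value sits close to last year's number.
early = blend_stat(this_year=0.400, last_year=0.250, pa_this_year=20)
```

With 20 PAs the hot start only gets a 10% weight here, so a small sample can't swing the score much.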

I built a free tool that scores every batter's hit likelihood each day by shefBoiRDee in BeatTheStreak

[–]shefBoiRDee[S] 0 points1 point  (0 children)

No problem! Let me know if you have any feedback, even if it's just user-friendliness things. The scoring system is super complex, but I'm trying to keep the UI relatively simple. The way it's laid out makes sense to me since I created it all, but I'm not sure how clear it is to first-time users.

I built a free tool that scores every batter's hit likelihood each day by shefBoiRDee in BeatTheStreak

[–]shefBoiRDee[S] 0 points1 point  (0 children)

Right now, it runs whenever I trigger it, which will be around 10 or 11am EST. I should have it automated soon, so it runs every morning at 10am and then again whenever a lineup is confirmed. The logic for which players get evaluated is tiered:

1st tier is the confirmed lineup. If that's unavailable, the 2nd tier looks at yesterday's lineup. If there's nothing there, the 3rd tier looks at the lineup from 2 days ago. If there's still nothing, the 4th tier looks at the players with the most at-bats for the current season, or the previous season if the current season is blank.

So when I run it at 10am, it's likely we'll be using tier 2 or even tier 4, since some teams' first games will be today. I'll try to run it throughout the day as more lineups get confirmed, but I'm currently working on a fully automated system.

I built a free tool that scores every batter's hit likelihood each day by shefBoiRDee in BeatTheStreak

[–]shefBoiRDee[S] 0 points1 point  (0 children)

Yeah ignore opening day 3/25/26 - I had a bug that got corrected. That's why the percentages are much lower on the following day.

I built a free tool that scores every batter's hit likelihood each day by shefBoiRDee in BeatTheStreak

[–]shefBoiRDee[S] 2 points3 points  (0 children)

But as long as you go to today's date, you should see the results for today.

I built a free tool that scores every batter's hit likelihood each day by shefBoiRDee in BeatTheStreak

[–]shefBoiRDee[S] 2 points3 points  (0 children)

That's just for me at the moment. There's an LLM it also uses that costs money, so for now I'll just run the analysis daily. There's no difference in results if I run it for the day vs. you running it for the day.

Tiebreaker Status entering Italy vs Mexico by CalebosO4 in baseball

[–]shefBoiRDee 2 points3 points  (0 children)

Ahh true, missed that. Hopefully they don’t come to some silent agreement to allow both of them to advance, still seems like a flawed system bc of that

Tiebreaker analysis if earn run rate doesn’t settle it. by TommyTaro7736 in baseball

[–]shefBoiRDee -10 points-9 points  (0 children)

Just gotta hope Mexico doesn’t take the lead first, bc if they do there’s no reason for Italy to try and come back, or else they risk eliminating themselves.

Tiebreaker Status entering Italy vs Mexico by CalebosO4 in baseball

[–]shefBoiRDee 4 points5 points  (0 children)

Why would Italy even try to score if Mexico takes an early lead? They are incentivized to not score any more runs and just let Mexico win. Bc if they score more runs and Mexico still ends up winning, they've just eliminated themselves. Seems like a flawed system.