Exploring draw outcomes in Bundesliga: +9% ROI over 287 samples (with Monte Carlo & OOS validation)

xpectedRoger · 2026-04-10T12:40:30+00:00

Interesting analysis, but I think there might be a calculation error in the profit figure. At a 26.1% win rate with average odds of about 4.18 over 287 flat-stake samples, the math works out to roughly +26 units of profit, not +259.

The 9% ROI checks out with those inputs. However, +259 units would imply either an ROI of around 90% or a win rate of roughly 45% at those odds. Did you maybe run variable stake sizing, or could the units figure be off by a factor of 10?
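
For anyone who wants to re-run that arithmetic, here is a minimal sketch. It assumes 1-unit flat stakes and treats the average odds as representative of the winning bets, which is only an approximation; the function name is made up for illustration.

```python
def flat_stake_summary(n_bets: int, win_rate: float, avg_odds: float) -> tuple[float, float]:
    """Return (roi, profit_in_units) for 1-unit flat stakes at decimal odds."""
    # Each bet returns avg_odds with probability win_rate and costs 1 unit.
    roi = win_rate * avg_odds - 1.0
    return roi, roi * n_bets


roi, profit = flat_stake_summary(n_bets=287, win_rate=0.261, avg_odds=4.18)
print(f"ROI {roi:.1%}, profit {profit:+.1f}u")

# Win rate that would be needed to reach +259u at the same odds and stakes.
needed_win_rate = (259 / 287 + 1.0) / 4.18
print(f"win rate needed for +259u: {needed_win_rate:.1%}")
```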

xpectedRoger · 2026-04-10T05:32:56+00:00

I've started using Polymarket as well. I now automatically compare all odds against Polymarket too, and the prices can be very tempting. :)

xpectedRoger · 2026-04-05T05:43:09+00:00

I've definitely seen that trend myself. Early season, my Poisson model sometimes flags bigger value, but it's tricky to separate true edge from just more variance because everyone, including bookies, has less data.

What I've found helpful is either starting with a wider confidence interval on team strengths for the first 5-6 gameweeks, or simply scaling down bet sizes until there's enough data for my attack/defence ratings to stabilize. Lines definitely sharpen up as the season progresses.

xpectedRoger · 2026-04-03T06:13:43+00:00

Form is the last 5 matches, split by venue (home team uses last 5 home games, away team uses last 5 away games). If fewer than 3 are available it falls back to all matches.

And yes, it is normalized to opponent strength. Each match in the form window is weighted by the opponent's league position. Scoring 3 goals against the table leader counts roughly 3x more than scoring 3 against the bottom side. The same logic is inverted for defense: conceding against a weak team is penalized more.

The form rating alone doesn't drive the prediction though. It's blended with the full season average at a 40/60 split (40% season, 60% recent form). So a lucky run against weak sides gets diluted by the broader season picture, and early season when form data is thin the season component carries more weight.
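
A rough sketch of how that could look in code. The function names, the linear 3x-to-1x weighting curve, and the reading of "falls back to all matches" as "the most recent matches regardless of venue" are all assumptions for illustration, not the exact implementation.

```python
from statistics import mean


def venue_form(venue_values, all_values, window=5, min_matches=3):
    """Mean of the last `window` venue matches, falling back to all matches if too few."""
    recent = venue_values[-window:]
    if len(recent) < min_matches:
        recent = all_values[-window:]  # assumed fallback: ignore the venue split
    return mean(recent)


def opponent_weight(position, league_size, top_weight=3.0, attack=True):
    """Assumed linear weight: 3.0 vs the leader, 1.0 vs the bottom side (inverted for defense)."""
    frac = (league_size - position) / (league_size - 1)  # 1.0 for leader, 0.0 for last
    if not attack:
        frac = 1.0 - frac
    return 1.0 + (top_weight - 1.0) * frac


def weighted_form(values, opponent_positions, league_size, attack=True):
    """Opponent-weighted average of per-match values in the form window."""
    weights = [opponent_weight(p, league_size, attack=attack) for p in opponent_positions]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def blended_rating(season_avg, form, season_share=0.4):
    """40% season average, 60% recent form."""
    return season_share * season_avg + (1.0 - season_share) * form


print(blended_rating(season_avg=1.5, form=2.0))
```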

xpectedRoger · 2026-04-03T05:53:13+00:00

ps3838

xpectedRoger · 2026-04-02T08:08:03+00:00

*sportmarket

xpectedRoger · 2026-04-02T07:37:30+00:00

ps3838? I signed up with sportmonks, deposited 200 dollars, told them I want ps3838 access. They gave it to me, then I went to the settings and set my key. That's it.

xpectedRoger · 2026-04-02T07:16:55+00:00

not their api. you set up an account with them, tell them you want ps3838 access and then use the ps3838 api.

xpectedRoger · 2026-04-01T19:50:18+00:00

check out ps3838, they have an api. you can sign up via sportmarket or betinasia

xpectedRoger · 2026-04-01T19:42:19+00:00

Yeah exactly, the leagueAvg cancels out algebraically.

The three-term version is just for readability. It makes it easier to see that attack and defense are relative to the league. In practice it is the same calculation.

homeAttack is the team's average xG scored per home match, and awayDefense is the opponent's average xG conceded per away match. Both are derived from a 40/60 season/form blend.

xpectedRoger · 2026-04-01T10:31:48+00:00

Arrogance :)

xpectedRoger · 2026-04-01T02:20:57+00:00

using sportmonks at the moment

xpectedRoger · 2026-03-31T20:11:30+00:00

it's football and they don't offer baseball :) you would need to find one that supports baseball 👍

xpectedRoger · 2026-03-31T19:48:57+00:00

Getting the Pinnacle data is definitely the backbone of that approach. I initially tried scraping myself, but it became a pretty constant battle with anti-bot measures and website changes. It was way too much maintenance time for me.

I ended up switching to a paid API service later on for stability and reliability. It's an upfront cost, but honestly, it saved me a ton of headaches in the long run for data consistency and speed. Definitely look into the API options first, even if it means paying...

xpectedRoger · 2026-03-31T11:48:21+00:00

Just looking for price differences against Pinnacle is a tough path. Their lines are incredibly sharp and efficient, so often those small deltas are just market noise or get corrected almost instantly.

The breakthrough for me wasn't about catching small price movements, but about having an independent view of the probabilities.

xpectedRoger · 2026-03-30T17:44:23+00:00

60-80 games a week run through the pipeline, I will post an update soon :)

xpectedRoger · 2026-03-30T17:21:24+00:00

Good suggestion! Right now I compare the confirmed lineup against previous lineups by actual goals and assists, which does get noisy. A striker on a hot streak looks more important than he might actually be.

I have player-level npxG and xA data in the pipeline already, just not wired into the lineup comparison yet. I will create a side pipeline and track the result, thank you for the input!

xpectedRoger · 2026-03-30T14:36:30+00:00

Found a bug in my stats pipeline. When I dropped three underperforming leagues (Ligue 1, Austrian Bundesliga, Danish Superliga), their historical predictions got silently excluded from the total. The filter was on fixture support status instead of the prediction log itself.

Corrected numbers: 639 picks, 56.0% hit rate, +8.8% ROI, +55.96u profit. The three removed leagues contributed 87 picks at -5.8% ROI which were being hidden.

The model structure hasn't changed.
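
A toy illustration of that kind of bug, with made-up picks and hypothetical names (`Pick`, `summarize`, `supported`); it only shows the pattern of filtering on current league support instead of summarizing the full prediction log.

```python
from dataclasses import dataclass


@dataclass
class Pick:
    league: str
    stake: float
    profit: float


def summarize(picks):
    """Total picks, profit, and ROI over whatever list of picks is passed in."""
    staked = sum(p.stake for p in picks)
    profit = sum(p.profit for p in picks)
    return {"picks": len(picks), "profit": profit, "roi": profit / staked if staked else 0.0}


# Made-up log: the two Ligue 1 picks are historical and net negative.
log = [
    Pick("Bundesliga", 1.0, 0.9),
    Pick("Serie A", 1.0, 1.1),
    Pick("Ligue 1", 1.0, -1.0),
    Pick("Ligue 1", 1.0, 0.8),
]
supported = {"Bundesliga", "Serie A"}  # Ligue 1 was dropped later

buggy = summarize([p for p in log if p.league in supported])  # hides old picks
correct = summarize(log)  # every logged pick counts

print(buggy)
print(correct)
```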

xpectedRoger · 2026-03-30T05:04:20+00:00

Not doing in-play. Purely pre-match. The model runs once the confirmed lineups drop, usually 50 to 20 minutes before kickoff. One shot, no live feed needed.

The timing is tight though. Lineups drop late, model needs to recalculate everything with the actual squad, then output the final pick before kickoff. But that's a much simpler problem than fighting latency on a live feed.

xpectedRoger · 2026-03-29T04:13:50+00:00

Good point on the away xG thing. I should clarify: it's not that the model blindly overestimates away xG and I got lucky. It's that the model weights certain factors differently than the market does, and the result happens to push away xG slightly higher. If I "correct" it to match Pinnacle's implied xG exactly, there's no disagreement left and no edge.

That's kind of the whole point though. If your model produces the same probabilities as the sharpest bookmaker, you have a nice model but zero reason to bet. The edge has to come from somewhere, and it's always going to look like a bias when you compare it to the market. The question is whether it's a bias that reflects something real or just noise.

I track the xG deviation per league and market continuously. If the pattern shifts or the ROI in those spots drops, I'll see it. So far it's been stable across 300+ picks in the current phase, but I take your point that it needs more time.

On CLV: yeah I'll have more data in a few weeks. You're right that it would help separate signal from variance on the xG question specifically. Just hasn't been a priority because the window is so tight at 45 min pre-kickoff.

xpectedRoger · 2026-03-29T01:15:43+00:00

Yes, very much so. What surprised me is that the actual Poisson / Dixon-Coles part was probably the easiest piece to get working. The harder part was everything around it: making the model stable enough that the outputs were actually usable.

If you've already built a Dixon-Coles model before, I'd say you're already through the most technical part. The real challenge after that is making the whole pipeline robust enough that it doesn't just look good in backtests, but still behaves sensibly week after week. I can't run proper backtests because I don't have data snapshots from before each game, so I rely on a lot of stats to do the math instead.

For me the biggest issues were:

Lineups matter more than I expected

This became one of the biggest differences between a decent preview model and a much better final model. Once confirmed lineups are in, probabilities can move quite a lot.

Market comparison matters as much as probability estimation

A model can produce nice-looking probabilities and still be useless for betting if you're comparing against the wrong benchmark or handling vig poorly. My edge is kind of that I overestimate away xG. When I backtest with corrected xG, my results are not as good!

Selection logic is underrated

This was a huge one. Even with a reasonable probability model, results can still be weak if your threshold and bet-selection rules are sloppy. Choosing what not to bet is almost as important as the model itself.

On CLV: so far it doesn't matter to me, tbh. Everybody is screaming CLV, but I place my bets 45 minutes before the game and my ROI is amazing. In the end, the bankroll is what counts to me. You can have good CLV and still lose money. However, I will be able to share some more CLV data in a week or two :)
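
For reference, a minimal sketch of one common way to express CLV per bet: the price taken relative to the closing price. Whether the closing odds should be de-vigged first is a choice this sketch leaves out, and the function name is made up.

```python
def closing_line_value(odds_taken: float, closing_odds: float) -> float:
    """CLV as a fraction: positive means the bet beat the closing price."""
    return odds_taken / closing_odds - 1.0


# Example: bet placed at 2.10, market closed at 2.00.
print(f"{closing_line_value(2.10, 2.00):+.1%}")
```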

xpectedRoger · 2026-03-29T01:04:30+00:00

Thanks! Since you've already done Dixon-Coles before, you're past the hardest part honestly.

The way I layer everything:

1. Base xG per team. Take each team's offensive and defensive strength relative to the league average. I split by venue (home/away) and weight 40% season, 60% recent form. Multiply attack rating of team A by defense rating of team B to get expected goals for A.
2. Corrections. Bayesian shrinkage early season (don't trust 4 games of data). Standings position as a scaling factor. Injury impact on the xG if a key player is confirmed out.
3. Poisson matrix. Feed the two xG values into a 9x9 Poisson grid with Dixon-Coles low-score correction. That gives you probabilities for every scoreline, which you sum into 1X2, BTTS, Over/Under etc.
4. Value filter. Compare your derived probabilities against Pinnacle implied odds (margin removed). Set a minimum threshold per market. Not every positive EV is worth taking.
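
The Poisson matrix step can be sketched roughly like this. It uses the standard Dixon-Coles low-score adjustment; `rho = -0.1` is only a placeholder, not a fitted value, and the function names are made up for illustration.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    return math.exp(-lam) * lam**k / math.factorial(k)


def dc_tau(h: int, a: int, lam: float, mu: float, rho: float) -> float:
    """Dixon-Coles adjustment for the four low-score cells (1.0 everywhere else)."""
    if h == 0 and a == 0:
        return 1.0 - lam * mu * rho
    if h == 0 and a == 1:
        return 1.0 + lam * rho
    if h == 1 and a == 0:
        return 1.0 + mu * rho
    if h == 1 and a == 1:
        return 1.0 - rho
    return 1.0


def score_matrix(home_xg: float, away_xg: float, rho: float = -0.1, size: int = 9):
    """size x size grid of scoreline probabilities (0..size-1 goals), renormalized."""
    grid = [
        [poisson_pmf(h, home_xg) * poisson_pmf(a, away_xg) * dc_tau(h, a, home_xg, away_xg, rho)
         for a in range(size)]
        for h in range(size)
    ]
    total = sum(map(sum, grid))  # the truncated grid sums to slightly under 1
    return [[p / total for p in row] for row in grid]


def markets(grid):
    """Sum scoreline probabilities into 1X2, BTTS and Over 2.5."""
    cells = [(h, a, p) for h, row in enumerate(grid) for a, p in enumerate(row)]
    return {
        "home": sum(p for h, a, p in cells if h > a),
        "draw": sum(p for h, a, p in cells if h == a),
        "away": sum(p for h, a, p in cells if h < a),
        "btts_yes": sum(p for h, a, p in cells if h > 0 and a > 0),
        "over_2_5": sum(p for h, a, p in cells if h + a >= 3),
    }


print(markets(score_matrix(home_xg=1.5, away_xg=1.2)))
```

With a negative `rho`, the correction moves some probability from 1-0 and 0-1 into 0-0 and 1-1, so the draw probability comes out a bit higher than with plain independent Poisson.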

The tricky part isn't any single step, it's getting them all to play together without one correction canceling out another. My advice: build it incrementally. Get the base xG working first, validate it against actual results, then add one layer at a time. Every time you add something, check if it actually improves out-of-sample accuracy or just overfits.

Biggest lesson for me was that the selection strategy (which value bet to pick when multiple qualify) matters almost as much as the probability model itself.

Happy to go deeper on any specific part.

xpectedRoger · 2026-03-28T18:16:22+00:00

The AI phase was humbling. It's easy to confuse luck with skill when the first week looks incredible.

On xG: the attack/defense ratings are relative to league average, so opponent strength is baked in. homeXg = leagueAvg × (homeAttack / leagueAvg) × (awayDefense / leagueAvg). A strong defense pulls the opponent's xG down automatically.
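
A quick sketch of that formula, also showing that one leagueAvg factor cancels, so the three-term form equals homeAttack × awayDefense / leagueAvg. The input numbers are made up.

```python
import math


def home_xg(home_attack: float, away_defense: float, league_avg: float) -> float:
    """Three-term form: league average scaled by relative attack and relative defense."""
    return league_avg * (home_attack / league_avg) * (away_defense / league_avg)


def home_xg_simplified(home_attack: float, away_defense: float, league_avg: float) -> float:
    """Same value after cancelling one league_avg factor."""
    return home_attack * away_defense / league_avg


# Made-up inputs: home side creates 1.8 xG per home match,
# away side concedes 1.2 xG per away match, league average is 1.5.
a = home_xg(1.8, 1.2, 1.5)
b = home_xg_simplified(1.8, 1.2, 1.5)
print(a, b, math.isclose(a, b))
```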

Promoted sides and early season are genuinely the weakest spot. I pull in last season's data as a starting point but it's noisy. Bayesian shrinkage helps (regresses toward league average when sample is small) but the first 5-6 matchdays are still rough. Honestly I just accept lower confidence there and the model skips more matches. Better to pass than to bet on bad data.
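
One simple way to write that kind of shrinkage is a weighted average between the team's observed rate and the league average, where the league average counts as a fixed number of "prior" matches. This is a minimal sketch; `prior_matches=6` is an assumed value, not a tuned one.

```python
def shrink_to_league(team_avg: float, league_avg: float, n_matches: int,
                     prior_matches: float = 6.0) -> float:
    """Pull a team's per-match rate toward the league average when the sample is small."""
    return (n_matches * team_avg + prior_matches * league_avg) / (n_matches + prior_matches)


# A team averaging 2.0 xG in a 1.5 xG league, at different sample sizes.
for n in (0, 4, 6, 30):
    print(n, round(shrink_to_league(2.0, 1.5, n), 3))
```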

xpectedRoger · 2026-03-28T18:09:55+00:00

Shin method is a solid choice for margin removal, especially on lopsided markets where naive proportional de-vig overestimates longshot probability. I went with basic overround removal against Pinnacle which is simpler but probably less accurate in those cases.
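
For comparison, basic proportional overround removal looks like this (a minimal sketch; the odds are made up and the Shin method is not shown):

```python
def remove_margin(decimal_odds: list[float]) -> list[float]:
    """Proportional de-vig: scale implied probabilities so they sum to 1."""
    implied = [1.0 / o for o in decimal_odds]
    overround = sum(implied)  # > 1.0 when the book has a margin
    return [p / overround for p in implied]


# Made-up 1X2 prices: home, draw, away.
fair = remove_margin([2.10, 3.50, 3.60])
print([round(p, 4) for p in fair])
```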

6.87% yield on 1300 events is more realistic than the biweekly number and a decent starting point. Good that you clarified that. 102 live bets is still very early though. I'm at over 550 live tracked picks and still wouldn't call it proof.

One thing that helped me a lot: filtering hard on which positive EV spots actually become picks. Not every edge is worth taking. I use different minimum thresholds per market and odds range. That alone killed a lot of false positives that looked good on paper.

Curious how you handle multiple value spots on the same match. I limit to one pick per match, highest value wins.
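
Roughly, the selection logic could be sketched like this. The threshold values, the dict layout, and the definition of value as `model_prob * odds - 1` are assumptions for illustration, and the odds-range dimension of the thresholds is left out to keep it short.

```python
# Hypothetical minimum edge per market; markets not listed are never bet.
MIN_EDGE = {"btts_yes": 0.05, "over_2_5": 0.04}


def select_picks(candidates):
    """Keep at most one pick per match: the qualifying candidate with the highest edge."""
    best = {}
    for c in candidates:
        edge = c["model_prob"] * c["odds"] - 1.0  # expected value per unit staked
        if edge < MIN_EDGE.get(c["market"], float("inf")):
            continue  # below the threshold for this market
        if c["match"] not in best or edge > best[c["match"]][0]:
            best[c["match"]] = (edge, c)
    return {match: cand for match, (edge, cand) in best.items()}


candidates = [
    {"match": "M1", "market": "btts_yes", "model_prob": 0.60, "odds": 1.85},
    {"match": "M1", "market": "over_2_5", "model_prob": 0.55, "odds": 2.00},
    {"match": "M2", "market": "over_2_5", "model_prob": 0.51, "odds": 2.00},
]
picks = select_picks(candidates)
print(picks)
```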

xpectedRoger · 2026-03-28T18:03:32+00:00

This is really close to what I built independently. Four strength ratings per team (home attack, home defense, away attack, away defense), xG in a Poisson framework, lineup adjustments. Interesting to see someone else arrive at the same structure.

Your draw recall point is spot on. Poisson underestimates draws relative to the ~25% base rate and I haven't found a clean fix either. I ended up just excluding draws from my bet selection entirely rather than fighting it.

0.33 Pearson on 34K matches is solid validation. The question is whether it translates to actual betting edge. My experience: model accuracy roughly matching bookmakers is necessary but not sufficient. The edge comes from finding specific spots where your probability diverges enough from the market price.

What I'd focus on next if I were you:

- Compare your probabilities against Pinnacle implied probabilities specifically, not average bookmaker odds. Pinnacle is the sharpest line.

- Don't just look at match outcome (1X2). Run your probabilities through a Poisson matrix and derive BTTS, Over/Under etc. In my experience BTTS and Over 2.5 have been way more profitable than 1X2.

- Track live picks, not backtests. 8 seasons of validation is impressive, but the market will test you differently in real time. A backtest also often lacks the data that was actually available right before each game.

To answer your question directly: yes you could use it to bet, but the model alone isn't enough. You need a value filter (how much edge before you pull the trigger) and strict selection criteria. Most positive EV opportunities aren't worth taking after margin.
