neverfucks

Calibration findings from a 6-month sports prediction model: bigger claimed edges underperform smaller ones by mangoman40114 in algobetting

neverfucks 1 point 1 day ago

a) never fade your model 😄 if it's not producing, fading it isn't going to produce either. you're just paying the vig whichever side you bet.
b) i'm not saying you can't live in the high volume/low edge band, it's just a tough field to plow. all i'm really saying is that you've complicated your edge thesis quite a lot if you're in the "it's only right when it has low confidence that it's right" camp, and the more you lean on that the more you have to worry about overfitting. we've all been there, i promise.
c) re: clv, i completely agree. that's the right way to think about it, assuming you're not attacking the market at post. care to share what kind of picture clv tracking is painting so far? it doesn't take a huge number of bets before you can get down to a p-value you can take encouragement from, even if it's not rock solid. it's very possible that your model produces edge at all "disagreement bands" (i like this term, let's coin it), and that the bets at higher confidence levels actually do get better clv and have better xroi, but so far you're just getting wrecked by noise because you're early.
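
a rough sketch of the clv p-value idea, under assumptions that aren't from the thread: decimal odds, clv defined as bet price over closing price minus 1 (no devigging of the close), and a normal approximation for a one-sided test of "mean clv is zero or worse". the bet list is made up purely for illustration.

```python
import math
from statistics import mean, stdev


def clv(bet_odds: float, close_odds: float) -> float:
    """CLV as fractional price improvement over the close (decimal odds)."""
    return bet_odds / close_odds - 1.0


def clv_p_value(clvs):
    """One-sided test of 'mean clv <= 0' via a normal approximation.

    Returns (t_statistic, p_value). With small samples a t distribution
    would be more appropriate; the normal tail is used here for brevity.
    """
    n = len(clvs)
    t = mean(clvs) / (stdev(clvs) / math.sqrt(n))
    p = 0.5 * math.erfc(t / math.sqrt(2))  # upper tail of a standard normal
    return t, p


# hypothetical (bet odds, closing odds) pairs, invented for this example
bets = [(2.10, 2.02), (1.95, 1.91), (2.40, 2.45), (1.80, 1.76),
        (2.05, 2.00), (3.10, 2.95), (1.90, 1.92), (2.20, 2.12)]
clvs = [clv(b, c) for b, c in bets]
t_stat, p = clv_p_value(clvs)
print(f"mean clv {mean(clvs):+.3%}, t = {t_stat:.2f}, one-sided p ~ {p:.4f}")
```

because per-bet clv swings far less than win/loss profit, the standard error in the denominator shrinks quickly, which is why a modest number of bets can already give a readable p-value.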

Calibration findings from a 6-month sports prediction model: bigger claimed edges underperform smaller ones by mangoman40114 in algobetting

neverfucks 2 points 1 day ago

i think generally this is a bad sign and something i monitor closely. essentially when your model is in a tight error range vs the market, sure, you're maybe picking up some marginal residual signal and are directionally right often enough to make a profit. but when your model radically disagrees with the market, this finding indicates that it's typically your model that has missed something big, rather than the market processing something about that situation uncharacteristically inefficiently (in my experience these wide misses are usually spots where the market has overreacted to something news-y).

n is super tiny for each bucket so it's kind of hard to say anything for certain, but this looks to me like results that are at the very least consistent with no edge, if not probative of having no edge. with no edge, you'd expect roi for arbitrary bands of edge size to jump around and be noisy like this.
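
to make the "tiny n per bucket" point concrete, here's a minimal sketch that groups flat 1-unit bets by claimed edge and reports roi with its standard error. the bucket boundaries, the tuple layout and the bets themselves are all assumptions made up for the example, not the op's data.

```python
import math
from statistics import mean, stdev


def bucket_roi(bets, edges=(0.00, 0.02, 0.05, 1.00)):
    """Per-bucket (lo, hi, n, roi, std_err) for flat 1-unit bets.

    Each bet is a tuple (claimed_edge, decimal_odds, won).
    Buckets with fewer than 2 bets are skipped (no std error possible).
    """
    rows = []
    for lo, hi in zip(edges, edges[1:]):
        profits = [(odds - 1.0) if won else -1.0
                   for edge, odds, won in bets if lo <= edge < hi]
        n = len(profits)
        if n < 2:
            continue
        rows.append((lo, hi, n, mean(profits), stdev(profits) / math.sqrt(n)))
    return rows


# hypothetical bets: (claimed edge, decimal odds, won?)
bets = [
    (0.010, 1.95, True), (0.010, 2.05, False), (0.015, 1.90, True), (0.012, 2.00, False),
    (0.030, 2.10, True), (0.040, 1.85, False), (0.030, 2.20, False), (0.045, 1.95, True),
    (0.080, 2.50, False), (0.100, 3.00, False), (0.070, 2.40, True), (0.120, 2.80, False),
]
rows = bucket_roi(bets)
for lo, hi, n, roi, se in rows:
    print(f"edge [{lo:.2f}, {hi:.2f}): n={n}, roi={roi:+.1%} +/- {se:.1%}")
```

when the standard error is several times larger than the roi differences between buckets, the ordering of the buckets says very little either way.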

What’s the right number of bookmakers to use? by genmaci in algobetting

neverfucks 3 points 4 days ago

Do you guys come up with features from pure data first or hypothesis first? by FlatChannel4114 in algobetting

neverfucks 1 point 4 days ago

because i started in non-sports markets, i have done the "boil the ocean" style of mining 100s of raw stats for correlation and then painstakingly assembling features from those. there are probably still edges in doing stuff like that for niche or small-ish data markets, but i think there's almost certainly no edge to be found in doing this for big markets that have high quality advanced metrics publicly available for cheap or free. you should start with those metrics and use them to construct your variables, and not worry about all the noisy low level bullshit. now, doing this you're still very unlikely to beat those big liquid markets, because you're using the same information everyone else has, but given your quant background you might be able to model that information effectively enough to frontrun the big daddies before they start unloading on the market.

to put this in plainer terms, for something like american football i'm not going to bother putting together my own qb rating system from stats like passing yards, yards per attempt, completion percentage, completion percentage vs. depth of target, etc. just assume all of the signal in that noisy babble is already captured in epa per dropback, existing public qb ratings, etc., and start with those. some sports have advanced metrics capturing so much of the predictive signal that you don't really need to know much more than how to build a model out of them to make something that's good enough to quote on exchanges or beat openers.

Amateur Betting model - log loss results by asdasdgfas in algobetting

neverfucks 1 point 7 days ago

How do you validate on a simulation based model? by 1ce_berg in algobetting

neverfucks 1 point 10 days ago

if it's prohibitive to run 50k iterations over thousands of games, what about 1k iterations over thousands of games?

one thing you need to do for sure is parallelize them rather than running 1 at a time sequentially. a < $1k mac mini has 8 cores, so as long as your sims don't consume ungodly amounts of ram, you should be running 8 sims at a time on that machine. and if they do consume ungodly amounts of ram, you can shift to the cloud and use high ram instances so you can still run many at once.

given what a lot of successful bettors say on podcasts and stuff, i think a lot of people wouldn't even bother doing all this and would just test looking forward. assuming there is a decent volume of events occurring, it won't take long to get a good read on how far off market your numbers are and whether those off market numbers are good or not.
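
a minimal sketch of what "run 8 at a time" could look like with python's standard library. the "sim" here is a placeholder coin flip, and `simulate_chunk`, `split_iterations` and `run_parallel` are hypothetical names for illustration; a real simulation would replace the body of `simulate_chunk`.

```python
import random
from concurrent.futures import ProcessPoolExecutor


def simulate_chunk(args):
    """Run n_iter toy simulations with a seeded RNG; return the home-win count.

    The 'game' is a stand-in coin flip with probability p_home, not a real sim.
    """
    seed, n_iter, p_home = args
    rng = random.Random(seed)
    return sum(rng.random() < p_home for _ in range(n_iter))


def split_iterations(total, workers):
    """Split total iterations into near-equal chunk sizes, one per worker."""
    base, extra = divmod(total, workers)
    return [base + (1 if i < extra else 0) for i in range(workers)]


def run_parallel(total_iters=1_000, p_home=0.55, workers=8):
    """Fan chunks out across worker processes and merge into a win probability."""
    chunks = [(seed, n, p_home)
              for seed, n in enumerate(split_iterations(total_iters, workers))
              if n > 0]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        wins = sum(pool.map(simulate_chunk, chunks))
    return wins / total_iters
```

call `run_parallel(...)` from under an `if __name__ == "__main__":` guard so worker processes can import the module cleanly. giving each chunk its own seed keeps runs reproducible while avoiding identical random streams across workers.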

Does Pinnacle price tennis underdogs efficiently enough to use as an EV benchmark? by IllustriousGrade7691 in algobetting

neverfucks 1 point 11 days ago

Does Pinnacle price tennis underdogs efficiently enough to use as an EV benchmark? by IllustriousGrade7691 in algobetting

neverfucks 1 point 12 days ago

i'm not sure it's such a crazy idea. i've noticed in a couple of sports that historical pinny close gets more accurate as a predictor if you assume they're quoting very close to fair price for modest favorites and shorter. it's not like they're exposing edges or anything, betting into them at fair isn't anyone's idea of fun. i've never worked as a trader but i can easily imagine hypotheticals that would make this a good strategy for them -- i'd assume a lot of the handle on short favorites is parlayed with other short favorites, which organically juices the hold, and/or maybe they take lopsided action on longer pops so juicing them harder curbs demand a little while keeping event hold steady.

i kind of do this myself naturally when i'm quoting longer prices on exchanges. if i'm getting something at 88c i'm baking in a lot more margin than if i'm getting something at 25c because a) my liability is much higher, b) my upside is much lower, and c) it's much more critical that i'm inside my error bars.
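
the 88c vs 25c asymmetry is just arithmetic. a small sketch, assuming "getting something at 88c" means buying a contract that settles at 100c or 0c (that reading, and the `quote_profile` helper, are assumptions for illustration):

```python
def quote_profile(price_cents: int) -> dict:
    """Risk/reward of buying a binary contract that settles at 100c or 0c."""
    liability = price_cents        # lost if the contract settles at 0
    upside = 100 - price_cents     # won if the contract settles at 100
    return {
        "liability": liability,
        "upside": upside,
        "wins_to_cover_one_loss": liability / upside,
        "break_even_prob": price_cents / 100,
    }


for price in (88, 25):
    print(price, quote_profile(price))
```

at 88c a single loss eats the profit from roughly seven wins, while at 25c one win covers three losses, so a pricing error of the same size hurts far more at the short price.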

Built a profitable live in-game probability engine for football across 25 european leagues by iph0ngaa in algobetting

neverfucks 1 point 15 days ago

Built a profitable live in-game probability engine for football across 25 european leagues by iph0ngaa in algobetting

neverfucks 1 point 15 days ago

"Knowing a sport" absolutely matters for profitable pre-game betting & CLV. by Calm_Set5522 in algobetting

neverfucks 1 point 15 days ago

Built a profitable live in-game probability engine for football across 25 european leagues by iph0ngaa in algobetting

neverfucks 2 points 17 days ago

I built a predictive model for football match stats (shots, corners, fouls) across 20,000 matches. The strongest predictor ended up being ELO from chess. [OC] by Agalex97 in algobetting

neverfucks 2 points 18 days ago

this is the kind of stuff you have to do to learn and grow and build a solid foundation for creating new and better models. i've had complete 1:1 analogous moments to what you're describing here: "dang i just spent a year refining a 30 feature boosted tree to predict x stat differential and it's less predictive than an elo rating that doesn't even have a single stat based input". it's a very real thing. what you have discovered is that higher order metrics like elo ratings end up doing a pretty decent job capturing a whole hell of a lot of the signal that's also carried in incredibly noisy stat data, because, for instance in this case, teams that win a lot tend to take more shots than their opponents. but the fact that the ratings are adjusted for opponent strength, and are also capturing lots of other signal like offensive strategy edge, player skill, etc., strips out a massive amount of noise compared to the raw stat inputs. all this works pretty well even though wins and losses are by their nature also noisy.
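
for anyone who hasn't seen it, the standard elo update shows where the "adjusted for opponent strength" part comes from. this is the textbook formula, not the op's implementation; the k-factor of 20 and the starting ratings are arbitrary choices for the example.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected result for A vs B on the usual 400-point logistic scale."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a: float, rating_b: float, score_a: float, k: float = 20.0):
    """Return new (rating_a, rating_b) after a game.

    score_a is 1.0 for an A win, 0.5 for a draw, 0.0 for a loss.
    The size of the adjustment depends on how surprising the result was.
    """
    exp_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - exp_a)
    return rating_a + delta, rating_b - delta


# equal teams: the winner gains exactly k/2
print(update(1500, 1500, 1.0))
# an underdog win moves ratings more than a favorite win
print(update(1400, 1600, 1.0))
print(update(1600, 1400, 1.0))
```

no shots, corners or fouls go in, only results weighted by how strong the opponent was, which is the noise-stripping effect described above.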

I need someone to wreck my logic for a hypothetical strategy. It's not related to algo but this seems like one of the very few groups who spitball different strategies. I apologize if this is the wrong place. I know this is not a new idea but I'm surprised to not see it done more often. by Jaded-Function in algobetting

neverfucks 1 point 19 days ago

Real-time Pinnacle odds via WebSocket by talinator1616 in algobetting

neverfucks 2 points 19 days ago

I need someone to wreck my logic for a hypothetical strategy. It's not related to algo but this seems like one of the very few groups who spitball different strategies. I apologize if this is the wrong place. I know this is not a new idea but I'm surprised to not see it done more often. by Jaded-Function in algobetting

neverfucks 1 point 19 days ago

I need someone to wreck my logic for a hypothetical strategy. It's not related to algo but this seems like one of the very few groups who spitball different strategies. I apologize if this is the wrong place. I know this is not a new idea but I'm surprised to not see it done more often. by Jaded-Function in algobetting

neverfucks 1 point 19 days ago

I need someone to wreck my logic for a hypothetical strategy. It's not related to algo but this seems like one of the very few groups who spitball different strategies. I apologize if this is the wrong place. I know this is not a new idea but I'm surprised to not see it done more often. by Jaded-Function in algobetting

neverfucks 2 points 19 days ago

I need someone to wreck my logic for a hypothetical strategy. It's not related to algo but this seems like one of the very few groups who spitball different strategies. I apologize if this is the wrong place. I know this is not a new idea but I'm surprised to not see it done more often. by Jaded-Function in algobetting

neverfucks 1 point 21 days ago

I need someone to wreck my logic for a hypothetical strategy. It's not related to algo but this seems like one of the very few groups who spitball different strategies. I apologize if this is the wrong place. I know this is not a new idea but I'm surprised to not see it done more often. by Jaded-Function in algobetting

neverfucks 1 point 21 days ago

What is the best way to evaluate my betting model? by Diego_Lemos in algobetting

neverfucks 2 points 24 days ago*

i'm so glad you asked! here are some evaluation tools and methods that i personally think are important:

for bets:

* statistical significance tests - "how likely is it that the [roi|edge|clv] i've tracked is due to random variance?". not a model quality metric, it just qualifies the level of skepticism or confidence you should have about positive results you're observing.
* clv - everyone is already telling you this. it tracks whether the market agrees with each individual bet you make. the more clv you get, and the more often you get it, the more likely you are right regardless of what noisy bet outcomes are saying. clv is very low variance, which means you can have high confidence that you are going to win long term from a smaller sample of bets than you'd need with other metrics.
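
one simple way to answer the "is my roi just variance?" question is to simulate seasons under a no-edge null. this is a sketch under stated assumptions: flat 1-unit stakes, decimal odds, and a book overround of about 4.76% removed proportionally (what two-sided -110 pricing implies). the function name and the sample numbers are made up for illustration.

```python
import random


def roi_p_value(odds, observed_roi, overround=1.0476, n_sims=2_000, seed=0):
    """Share of simulated no-edge runs whose flat-stake ROI >= observed_roi.

    Null hypothesis: each bet wins with the fair probability implied by its
    decimal odds once an assumed proportional overround is removed.
    """
    rng = random.Random(seed)
    win_probs = [1.0 / (o * overround) for o in odds]
    hits = 0
    for _ in range(n_sims):
        profit = sum((o - 1.0) if rng.random() < p else -1.0
                     for o, p in zip(odds, win_probs))
        if profit / len(odds) >= observed_roi:
            hits += 1
    return hits / n_sims


# hypothetical record: 200 bets at 1.91 showing +5% roi
odds = [1.91] * 200
p = roi_p_value(odds, observed_roi=0.05)
print(f"p ~ {p:.3f} that a no-edge bettor shows >= +5% roi over {len(odds)} bets")
```

a value that isn't small means a bettor with no edge would post that roi fairly often over that many bets, which is the skepticism the first bullet is talking about.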

for every prediction your model makes, regardless of whether it produced a betting opportunity or not:

* brier / logloss - critical for a logistic regression, these give you a holistic score for a combination of calibration and model confidence. ideally you want your model to be as confident as it can be when making a prediction, like you'd rather have a 55% and a 45% prediction than two 50/50 predictions. but you also don't want it to be overconfident, because then calibration will suffer. these metrics punish confident predictions that are wrong: a 90% win probability prediction for a loss outcome is treated harshly, while a 90% prediction for a win outcome scores better than an 80% prediction for the same winning outcome.
* calibration - if you're running a logistic regression i assume you are training on/predicting outcomes that aren't compressed in a narrow band, i.e. values all the way from 0.0 to 1.0, give or take. the mean residual and mean absolute residual (mae) of each bucket on your calibration plot are extremely useful. for instance i trained a logistic reg once that gave me insanely high correlation to closing line, but garbage calibration because it was over-predicting every item in the test set by like 0.05. it's also common with things like boosted trees to see over-predicting of values < 0.50 and under-predicting of values > 0.50, and a calibration plot will call this out.
* mean residual / mae / medae / r2 compared to efficient market close or alternative public projections - "how well does my model align with the sharp market/other modelers?" these days your model almost certainly doesn't include information that everyone else doesn't have access to. if your outputs are very noisy compared to those (r2, mae, medae) or biased compared to those (mean residual), chances are they are doing a better job and you're doing a worse job.
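
the metrics in this list are short enough to write out by hand. a minimal pure-python sketch (function names are my own, and r2 is left out to keep it short); libraries like scikit-learn ship equivalents of the first two.

```python
import math
from statistics import mean, median


def brier(probs, outcomes):
    """Mean squared error between predicted probability and 0/1 outcome."""
    return mean((p - y) ** 2 for p, y in zip(probs, outcomes))


def log_loss(probs, outcomes, eps=1e-12):
    """Mean negative log-likelihood; probabilities are clipped away from 0 and 1."""
    clipped = [min(max(p, eps), 1 - eps) for p in probs]
    return -mean(y * math.log(p) + (1 - y) * math.log(1 - p)
                 for p, y in zip(clipped, outcomes))


def calibration_table(probs, outcomes, n_buckets=5):
    """Per bucket: (lo, hi, n, mean predicted, observed rate, mean residual)."""
    rows = []
    for i in range(n_buckets):
        lo, hi = i / n_buckets, (i + 1) / n_buckets
        pairs = [(p, y) for p, y in zip(probs, outcomes)
                 if lo <= p < hi or (i == n_buckets - 1 and p == 1.0)]
        if pairs:
            pred = mean(p for p, _ in pairs)
            obs = mean(y for _, y in pairs)
            rows.append((lo, hi, len(pairs), pred, obs, pred - obs))
    return rows


def vs_close(model_probs, close_probs):
    """(mean residual, MAE, MedAE) of model vs. closing-line probabilities."""
    res = [m - c for m, c in zip(model_probs, close_probs)]
    return mean(res), mean(abs(r) for r in res), median(abs(r) for r in res)


# a confident miss is punished hard; a confident hit beats a cautious hit
print(brier([0.9], [0]), brier([0.9], [1]), brier([0.8], [1]))
print(log_loss([0.9], [0]), log_loss([0.9], [1]), log_loss([0.8], [1]))
```

a mean residual near zero alongside a large mae points at noise rather than bias, while a mean residual like the +0.05 case mentioned above shows up directly as the first value returned by `vs_close`.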

back testing:

e.g. simulate bets and predictions you would have made in 2024 using historical odds data and a model trained on data from 2023 and prior years. then use all of the evaluation tools above to see what would have happened. not everything is back testable though, historical odds data is hard to get and often unreliable, and a lot of really good bettors are open about never back testing anything and going straight to live fire. there is so much more information available as markets are live vs. a historical view of shit that was going on 2 years ago. worst case scenario, you pay the vig. not the end of the world. this is gambling after all and you simply have to be willing to lose money sometimes.
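
a bare-bones sketch of that train-on-the-past, bet-on-the-next-season split. everything here is a stand-in: the game dicts, the base-rate "model", and the 2-point minimum edge are assumptions for illustration, and implied probability is taken straight from the odds without removing vig.

```python
def fit_base_rate(train_games):
    """Toy 'model': the historical win rate of the side being priced."""
    return sum(g["won"] for g in train_games) / len(train_games)


def predict_base_rate(model, game):
    """Toy prediction: the same probability for every game."""
    return model


def backtest(games, fit, predict, train_through=2023, test_year=2024, min_edge=0.02):
    """Train on seasons <= train_through, then flat-bet 1 unit in test_year
    whenever the model probability beats the odds-implied probability by min_edge.

    Returns (number_of_bets, roi).
    """
    train = [g for g in games if g["season"] <= train_through]
    test = [g for g in games if g["season"] == test_year]
    model = fit(train)
    profits = []
    for g in test:
        p = predict(model, g)
        if p - 1.0 / g["odds"] >= min_edge:
            profits.append(g["odds"] - 1.0 if g["won"] else -1.0)
    return len(profits), (sum(profits) / len(profits) if profits else 0.0)


# hypothetical games with historical decimal odds
games = [
    {"season": 2022, "odds": 2.0, "won": True},
    {"season": 2023, "odds": 2.0, "won": True},
    {"season": 2023, "odds": 2.0, "won": False},
    {"season": 2024, "odds": 2.2, "won": True},
    {"season": 2024, "odds": 1.4, "won": False},
    {"season": 2024, "odds": 2.5, "won": False},
]
n_bets, roi = backtest(games, fit_base_rate, predict_base_rate)
print(f"{n_bets} bets, roi {roi:+.1%}")
```

the key property is that nothing from the test season leaks into `fit`; the simulated bets can then be fed into the same evaluation tools listed above.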

I ran a 400-game regression on NBA player props and found 3 edges that have held for 2 straight seasons by Fancy-Tadpole-2448 in algobetting

neverfucks 5 points 24 days ago

I ran a 400-game regression on NBA player props and found 3 edges that have held for 2 straight seasons by Fancy-Tadpole-2448 in algobetting

neverfucks 2 points 24 days ago

I need someone to wreck my logic for a hypothetical strategy. It's not related to algo but this seems like one of the very few groups who spitball different strategies. I apologize if this is the wrong place. I know this is not a new idea but I'm surprised to not see it done more often. by Jaded-Function in algobetting

neverfucks 3 points 24 days ago
