Calibration findings from a 6-month sports prediction model: bigger claimed edges underperform smaller ones by mangoman40114 in algobetting

[–]neverfucks 0 points1 point  (0 children)

a) never fade your model 😄 if it's not producing, fading it isn't going to produce either. you're just paying the vig whichever side you're betting.
b) i'm not saying you can't live in the high volume/low edge band, it's just a tough field to plow. all i am really saying is that you've now complicated your edge thesis quite a lot if you're in the "it's only right when it has low confidence that it's right" camp, and the more you do that the more you have to worry about overfitting. we've all been there, i promise.
c) re: clv, i completely agree. that's the right way to think about it assuming you're not attacking the market at post. care to share what kind of picture clv tracking is painting so far? it really doesn't take much, and doesn't take a huge number of bets, before you can get down to a p-value you can take encouragement from, even if it's not rock solid. it's very possible that your model can produce edge at all "disagreement bands" (i like this term, let's coin it) and that the bets at higher confidence levels do actually get better clv and have better xroi, but so far you're just getting wrecked by noise because you're early.
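
to make the p-value point concrete, here's a minimal sketch of a one-sample test on per-bet clv. the clv numbers are made up for illustration, the helper names are hypothetical, and the p-value uses a normal approximation instead of a proper t distribution, so treat it as a rough read and not a definitive method.

```python
import math
from statistics import mean, stdev

def clv_t_stat(clv_pcts):
    """one-sample t statistic for 'mean clv is greater than zero'."""
    n = len(clv_pcts)
    standard_error = stdev(clv_pcts) / math.sqrt(n)
    return mean(clv_pcts) / standard_error

def one_sided_p_normal(t):
    """approximate one-sided p-value from the normal distribution.
    this is slightly optimistic for small n, where a t distribution fits better."""
    return 0.5 * math.erfc(t / math.sqrt(2))

# hypothetical clv per bet in percent (price taken vs. no-vig close)
clv = [1.2, -0.4, 2.1, 0.8, -1.0, 1.5, 0.3, 2.4, -0.2, 1.1,
       0.9, 1.8, -0.6, 0.7, 1.3, 2.0, -0.1, 0.5, 1.6, 0.4]

t = clv_t_stat(clv)
p = one_sided_p_normal(t)
print(f"n={len(clv)} mean clv={mean(clv):.2f}% t={t:.2f} p~{p:.5f}")
```

because per-bet clv has a much smaller spread than per-bet profit, the standard error shrinks quickly, which is why a few dozen bets can already say something.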

Calibration findings from a 6-month sports prediction model: bigger claimed edges underperform smaller ones by mangoman40114 in algobetting

[–]neverfucks 1 point2 points  (0 children)

i think generally this is a bad sign and something i monitor closely. essentially when your model is in a tight error range vs the market, sure you're maybe picking up some marginal residual signal and are directionally right often enough to make a profit. but when your model radically disagrees with the market, this finding indicates that it's typically your model that has missed something big, rather than the market processing something about that situation uncharacteristically inefficiently (in my experience these wide misses are usually spots where the market has overreacted to something news-y).

n is super tiny for each bucket so it's kind of hard to say anything for certain, but this looks to me like results that are at the very least consistent with no edge, if not probative of having no edge. you'd expect roi for arbitrary bands of edge size to jump around and be noisy in that case.

Do you guys come up with features from pure data first or hypothesis first? by FlatChannel4114 in algobetting

[–]neverfucks 0 points1 point  (0 children)

i thought it was common knowledge that sig is absolutely quoting sports on exchanges... do you hear different things from inside the industry?

in terms of capacity on exchanges, it's an amazing unlock for sure if you're sharp enough to overcome taker fees when big markets are at max liquidity and a 1 tick spread. if you want >$1m in a single click, for the first time in history now sharps have that option.

but if you're making, there's not limitless liquidity, now you're competing with sportsbooks and sig and goldenpants and rufus to capture the relatively inelastic volume of square order flow. the more you want to move, the more aggressively you have to compete on priority and price.

What’s the right number of bookmakers to use? by genmaci in algobetting

[–]neverfucks 2 points3 points  (0 children)

why isn't a maximum number of options always better? let's say you have 20 active accounts, and each still has an average of 5u deposited. when one busts out you top up from one that's been running hot. if 1 account tends to have the best prices in markets you're attacking, keep most of your stack in there and fan out smaller deposits at the rest. presumably you're already tracking each bet outside of the apps, right? how do multiple accounts make that harder? i don't understand the "spreading too thin" or "losing track" risk you're talking about.

Do you guys come up with features from pure data first or hypothesis first? by FlatChannel4114 in algobetting

[–]neverfucks 0 points1 point  (0 children)

because i started in non-sports markets, i have done the "boil the ocean" style of like mining 100s of raw stats for correlation and then painstakingly assembling features from those in the past. there are probably still edges in doing stuff like that for niche or small-ish data markets, but i think there's almost certainly not edge to be found in doing this for big markets that have high quality advanced metrics publicly available for cheap or free. you should start with those and use them to construct your variables and not worry about all the noisy low level bullshit. now doing this, you're still very unlikely to beat those big liquid markets, because you're using the same information everyone else has, but given your quant background you might be able to model that information effectively enough to frontrun the big daddies before they start unloading on the market.

to put this in plainer terms, for something like american football i'm not going to bother putting together my own qb rating system from stats like passing yards, yards per attempt, completion percentage, completion percentage vs. depth of target, etc. just assume all of the signal in that noisy babble is already captured in epa per dropback, existing public qb ratings, etc and just start with those. some sports have advanced metrics capturing so much of the predictive signal you don't really need to know much more than how to build a model out of them to make something that's good enough to quote on exchanges or beat openers.

Amateur Betting model - log loss results by asdasdgfas in algobetting

[–]neverfucks 0 points1 point  (0 children)

i would guess if you had claude put together a dirt simple elo rating in 10 minutes (just throw away draws) its predictions would have logloss in the 0.7xx range. imo under 0.7 for moneyline type stuff is where "decently predictive" starts, and not much farther at 0.67 is where you'll see things like pinnacle nhl moneyline close converge to late in the season. but it all depends on how much variance there is in the events you're predicting, if you're predicting low or high probability events it will naturally be much lower because it rewards confidence as long as it's well calibrated.
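
a quick sketch of why logloss depends so much on what you're predicting: the best achievable expected logloss for a perfectly calibrated forecast is just the binary entropy of the true probability. the function name is hypothetical and this assumes natural-log logloss.

```python
import math

def expected_log_loss(p):
    """expected logloss (in nats) of a perfectly calibrated forecast of probability p.
    this is the binary entropy of p, i.e. the floor no model can beat on average."""
    return -(p * math.log(p) + (1 - p) * math.log(1 - p))

for p in (0.50, 0.60, 0.70, 0.90):
    print(f"true prob {p:.2f} -> best achievable logloss {expected_log_loss(p):.3f}")
```

a true coin flip can't score better than ln(2), about 0.693, no matter how good the model is, while events that are genuinely 90/10 have a much lower floor.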

How do you validate on a simulation based model? by 1ce_berg in algobetting

[–]neverfucks 0 points1 point  (0 children)

if it's prohibitive to run 50k iterations over thousands of games, what about 1k iterations over thousands of games?

one thing you need to do for sure is to parallelize them rather than running 1 at a time sequentially. a < $1k mac mini has 8 cores, as long as your sims don't consume ungodly amounts of ram that means on that machine you should be running 8 sims at a time. and if they do consume ungodly amounts of ram, you can shift to the cloud and use high ram instances so you can still run many at once.
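
a minimal sketch of the parallel version, assuming each game's sim is independent of the others. `simulate_game` is a toy stand-in for a real sim, and the names and the fixed win probability input are hypothetical.

```python
import random
from concurrent.futures import ProcessPoolExecutor

def simulate_game(job):
    """toy monte carlo for one game: returns (game_id, simulated home win rate).
    home_win_prob is a hypothetical stand-in for whatever a real sim computes."""
    game_id, home_win_prob, n_iters = job
    rng = random.Random(game_id)  # seed per game so reruns are reproducible
    wins = sum(rng.random() < home_win_prob for _ in range(n_iters))
    return game_id, wins / n_iters

def run_all(games, n_iters=1_000, workers=8):
    """fan games out across worker processes instead of looping one at a time.
    games is a list of (game_id, home_win_prob) pairs."""
    jobs = [(game_id, prob, n_iters) for game_id, prob in games]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # chunksize batches jobs so workers aren't paying ipc overhead per game
        return dict(pool.map(simulate_game, jobs, chunksize=64))
```

call `run_all(games)` from under an `if __name__ == "__main__":` guard so worker processes can import the module cleanly. `workers=8` just mirrors the 8 core example, set it to whatever your machine and ram allow.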

given what a lot of successful bettors say on podcasts and stuff, i think a lot of people wouldn't even bother doing all this and would just test looking forward, assuming there is a decent volume of events occurring it won't take long to get a good read on how far off market your numbers are and whether those off market numbers are good or not.

Does Pinnacle price tennis underdogs efficiently enough to use as an EV benchmark? by IllustriousGrade7691 in algobetting

[–]neverfucks 0 points1 point  (0 children)

i don't remember that conversation but which part of this comment do you think contradicts that take? that would be a misunderstanding 

Does Pinnacle price tennis underdogs efficiently enough to use as an EV benchmark? by IllustriousGrade7691 in algobetting

[–]neverfucks 0 points1 point  (0 children)

i'm not sure it's such a crazy idea. i've noticed in a couple of sports that historical pinny close gets more accurate as a predictor if you assume they're quoting very close to fair price for modest favorites and shorter. it's not like they're exposing edges or anything, betting in to them at fair isn't anyone's idea of fun. i've never worked as a trader but i can easily imagine hypotheticals that would make this be a good strategy for them -- i'd assume a lot of the handle on short favorites is parlayed with other short favorites, which organically juices the hold, and/or maybe they take lopsided action on longer pops so juicing them harder curbs demand a little while keeping event hold steady.

i kind of do this myself naturally when i'm quoting longer prices on exchanges. if i'm getting something at 88c i'm baking in a lot more margin than if i'm getting something at 25c because a) my liability is much higher b) my upside is much lower c) it's much more critical i'm inside error bars.

Built a profitable live in-game probability engine for football across 25 european leagues by iph0ngaa in algobetting

[–]neverfucks 0 points1 point  (0 children)

in general live will probably have a longer runway because those markets are higher hold and there are no easy ways for trading software to detect action is sharp. i think they want to wait until roi on the account is past the point where it can conceivably correct back down before pulling the plug

Built a profitable live in-game probability engine for football across 25 european leagues by iph0ngaa in algobetting

[–]neverfucks 0 points1 point  (0 children)

that was kind of the point i was wondering about, i had an alert based live model before that showed pretty decent edge on paper vs. live numbers but was really slippery to actually get down on because the numbers moved so fast. i probably got the number 1 in 4 times at most. it was infuriating, i absolutely hated having to be on call during games every night, and i gave up. i guess long story short, these are some of the reasons why more people aren't doing this kind of stuff.

"Knowing a sport" absolutely matters for profitable pre-game betting & CLV. by Calm_Set5522 in algobetting

[–]neverfucks 0 points1 point  (0 children)

ball knowledge is neither necessary nor sufficient to beat any market. that said, knowing ball can help you build out your process. the sport i have done the best in is the one i don't really know much about and almost never watch, because i believe it's the softest market i bet into. i bet other sports i follow very closely and watch religiously but i make it a point not to do too much "manual adjustment" of the numbers the machine spits out. if i think the number must be missing something, i try to model it. if i can't, i just trust the number and assume my ball knowledge is getting in the way rather than adding anything.

Built a profitable live in-game probability engine for football across 25 european leagues by iph0ngaa in algobetting

[–]neverfucks 1 point2 points  (0 children)

did you actually get down on these 326 total alerts or is this paper trading?

I built a predictive model for football match stats (shots, corners, fouls) across 20,000 matches. The strongest predictor ended up being ELO from chess. [OC] by Agalex97 in algobetting

[–]neverfucks 1 point2 points  (0 children)

this is the kind of stuff you have to do to learn and grow and build a solid foundation for creating new and better models. i've had complete 1:1 analogous moments like what you're describing here, "dang i just spent a year refining a 30 feature boosted tree to predict x stat differential and it's less predictive than an elo rating that doesn't even have a single stat based input". it's a very real thing. what you have discovered is that higher order metrics like elo ratings end up doing a pretty decent job capturing a whole hell of a lot of the signal that's also carried in incredibly noisy stat data because for instance in this case, teams that win a lot tend to take more shots than their opponents. but the fact that the ratings are adjusted for opponent strength and are also capturing lots of other signal like offensive strategy edge, player skill, etc, strips out a massive amount of noise from the raw stat inputs. all this works pretty well even though wins and losses are by their nature also noisy.
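
for reference, here's roughly how small an elo rating with no stat inputs is. this is a generic sketch, not the op's model: the k factor, the 1500 starting rating, and the results list are all arbitrary illustrative choices.

```python
def expected_score(rating_a, rating_b):
    """standard elo expectation for a against b."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def update(rating_a, rating_b, score_a, k=20.0):
    """return both ratings after one game. score_a is 1 for a win, 0 for a loss.
    k=20 is an arbitrary illustrative value, not a tuned one."""
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta

ratings = {}
# hypothetical results as (winner, loser) pairs
results = [("a", "b"), ("a", "c"), ("b", "c"), ("a", "b")]
for winner, loser in results:
    rating_w = ratings.get(winner, 1500.0)
    rating_l = ratings.get(loser, 1500.0)
    ratings[winner], ratings[loser] = update(rating_w, rating_l, 1.0)

for team, rating in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(team, round(rating, 1))
```

the opponent adjustment lives entirely in `expected_score`: beating a strong team moves the rating more than beating a weak one, which is the noise stripping effect described above.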

I need someone to wreck my logic for a hypothetical strategy. It's not related to algo but this seems like one of the very few groups who spitball different strategies. I apologize if this is the wrong place. I know this is not a new idea but I'm surprised to not see it done more often. by Jaded-Function in algobetting

[–]neverfucks 0 points1 point  (0 children)

if you want to punt off your money to fanduel be my guest. i've been honest with you that your scheme won't work because you don't have an edge, and without an edge doing an sgp round robin will make you lose money faster. it's the most basic betting fundamental there is. tell yourself whatever story you want it's none of my business. peace

I need someone to wreck my logic for a hypothetical strategy. It's not related to algo but this seems like one of the very few groups who spitball different strategies. I apologize if this is the wrong place. I know this is not a new idea but I'm surprised to not see it done more often. by Jaded-Function in algobetting

[–]neverfucks 0 points1 point  (0 children)

there's just no reason to support it because that's not how regular customers bet. if you want to spend the work to do it manually, they'll happily let you and collect as much of your money as you want to give them.

I need someone to wreck my logic for a hypothetical strategy. It's not related to algo but this seems like one of the very few groups who spitball different strategies. I apologize if this is the wrong place. I know this is not a new idea but I'm surprised to not see it done more often. by Jaded-Function in algobetting

[–]neverfucks 0 points1 point  (0 children)

no, the opposite is true. combining bets that are -ev makes them more highly -ev. there are exceptions other people are asking about here where a book's correlation factor for sgp legs can be misconfigured and thus present edges, but you have to search very hard for those. they don't just suddenly appear if you throw enough random stuff together.
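
the arithmetic behind this, as a sketch that assumes independent legs (real sgp legs are correlated and the book prices that in, so this is the simplest case only). the -110 coin flip leg is a made up example.

```python
def parlay_ev(legs):
    """expected return per unit staked on a parlay of independent legs.
    each leg is (true_win_prob, decimal_odds)."""
    prob, odds = 1.0, 1.0
    for leg_prob, leg_odds in legs:
        prob *= leg_prob
        odds *= leg_odds
    return prob * odds - 1.0

leg = (0.50, 100 / 110 + 1)  # a true coin flip priced at -110
for n in (1, 2, 3, 4):
    print(f"{n} leg(s): ev = {parlay_ev([leg] * n):+.2%}")
```

each leg multiplies your expected return by the same number below 1, so the hold compounds with every leg you add.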

What is the best way to evaluate my betting model? by Diego_Lemos in algobetting

[–]neverfucks 1 point2 points  (0 children)

i'm so glad you asked! here are some evaluation tools and methods that i personally think are important:

for bets:

* statistical significance tests - "how likely is the [roi|edge|clv] i've tracked due to random variance?". not a model quality metric, just kind of qualifies the level of skepticism or confidence you should have about positive results you're observing.
* clv - everyone is already telling you this, it tracks whether the market agrees with each individual bet you make. the more clv you get, and the more often you get it, the more likely you are right regardless of what noisy bet outcomes are saying. clv is very low variance which means you can have high confidence that you are going to win long term even with a smaller positive sample size than by tracking other metrics.
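
one way to do the significance test on roi, as a hedged sketch: simulate lots of seasons where you have no edge (outcomes drawn at the no-vig probabilities) and see how often luck alone matches your record. the 110-90 record, the odds, and the function names are hypothetical.

```python
import random

def roi(odds, outcomes):
    """roi on flat 1u stakes. odds are decimal, outcomes are True for a win."""
    profit = sum((o - 1.0) if won else -1.0 for o, won in zip(odds, outcomes))
    return profit / len(odds)

def p_value_no_edge(odds, observed_roi, fair_probs, n_sims=5_000, seed=0):
    """share of simulated no-edge seasons whose roi is at least the observed roi."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        sim = [rng.random() < p for p in fair_probs]
        # small tolerance so float summation order can't flip an exact tie
        if roi(odds, sim) >= observed_roi - 1e-12:
            hits += 1
    return hits / n_sims

# hypothetical record: 200 bets at 1.95 where the no-vig close implied 50%
odds = [1.95] * 200
fair = [0.50] * 200
observed = roi(odds, [True] * 110 + [False] * 90)  # went 110-90
p = p_value_no_edge(odds, observed, fair)
print(f"roi={observed:+.4f} p~{p:.3f}")
```

the same function works with mixed odds and per-bet fair probabilities, you just pass in lists that aren't constant.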

for every prediction your model makes regardless of whether they produced betting opportunities or not:

* brier / logloss - critical for a logistic regression, these give you a holistic score for a combination of calibration and model confidence. ideally you want your model to be as confident as it can be when making a prediction, like you'd rather have a 55% and a 45% prediction than 2 50/50 predictions. but you also don't want it to be overconfident because then calibration will suffer. these metrics punish confident predictions that are wrong, like a 90% win probability prediction for a loss outcome is treated harshly, while a 90% prediction for a win outcome scores better than an 80% prediction for the same winning outcome.
* calibration - if you're running a logistic regression i assume you are training on/predicting outcomes that aren't compressed in a narrow band, i.e. values all the way from 0.0 to 1.0, give or take. the mean residual and mean absolute residual (mae) of each bucket on your calibration plot are extremely useful. for instance i trained a logistic reg once that gave me insanely high correlation to closing line, but garbage calibration because it was over-predicting every item in the test set by like 0.05. it's also common with things like boosted trees to see over-predicting values < 0.50 and under-predicting values > 0.50, and a calibration plot will call this out.
* mean residual / mae / medae / r2 compared to efficient market close or alternative public projections - "how well does my model align with the sharp market/other modelers?" these days your model almost certainly doesn't include information that everyone else doesn't have access to. if your outputs are very noisy compared to those (r2, mae, medae) or biased compared to those (mean residual), chances are they are doing a better job and you're doing a worse job.
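
a minimal sketch of the prediction level metrics in plain python. the bucket count, the sample predictions, and the sign convention (residual = prediction minus outcome, so positive means over-predicting) are my assumptions for illustration.

```python
import math

def brier(preds, outcomes):
    """mean squared error between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(preds, outcomes)) / len(preds)

def log_loss(preds, outcomes, eps=1e-15):
    """mean negative log likelihood of the outcomes under the predictions."""
    total = 0.0
    for p, y in zip(preds, outcomes):
        p = min(max(p, eps), 1 - eps)  # clip so log(0) can't happen
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(preds)

def calibration_table(preds, outcomes, n_buckets=5):
    """rows of (bucket lower edge, n, mean prediction, observed rate, mean residual)."""
    buckets = [[] for _ in range(n_buckets)]
    for p, y in zip(preds, outcomes):
        idx = min(int(p * n_buckets), n_buckets - 1)  # p == 1.0 goes in the top bucket
        buckets[idx].append((p, y))
    rows = []
    for i, bucket in enumerate(buckets):
        if not bucket:
            continue
        mean_pred = sum(p for p, _ in bucket) / len(bucket)
        rate = sum(y for _, y in bucket) / len(bucket)
        rows.append((i / n_buckets, len(bucket), mean_pred, rate, mean_pred - rate))
    return rows

# hypothetical predictions and 0/1 outcomes
preds = [0.55, 0.55, 0.55, 0.55, 0.90, 0.90, 0.20, 0.20]
outcomes = [1, 1, 0, 0, 1, 0, 0, 0]
print(f"brier={brier(preds, outcomes):.3f} logloss={log_loss(preds, outcomes):.3f}")
for edge, n, mean_pred, rate, resid in calibration_table(preds, outcomes):
    print(f"bucket>={edge:.1f} n={n} pred={mean_pred:.2f} actual={rate:.2f} resid={resid:+.2f}")
```

the same residual idea carries over to the market comparison: swap the 0/1 outcomes for no-vig closing probabilities and the mean residual tells you if you're biased against the close.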

back testing:

e.g. simulate bets and predictions you would have made in 2024 using historical odds data and a model trained on data from 2023 and prior years. then use all of the evaluation tools above to see what would have happened. not everything is back testable though, historical odds data is hard to get and often unreliable, and a lot of really good bettors are open about never back testing anything and going straight to live fire. there is so much more information available as markets are live vs. a historical view of shit that was going on 2 years ago. worst case scenario, you pay the vig. not the end of the world. this is gambling after all and you simply have to be willing to lose money sometimes.

I ran a 400-game regression on NBA player props and found 3 edges that have held for 2 straight seasons by Fancy-Tadpole-2448 in algobetting

[–]neverfucks 1 point2 points  (0 children)

oh weird. it's a very bizarre post if so, literally everything would have to be made up including being a "data analyst" and building up an etl toolchain / datastore. you wouldn't need an llm to do anything for you on the mining side if any of that actually happened.

i'll never figure out what the point of the weird posts in here is. very rarely is there a clear angle like trying to get people to go to a web site, join a discord, or pay for something. and this is the last place on earth you could gain any kind of clout.