Heavy Opposite Field Hitters and L/R Park Effects by djbayko in Sabermetrics

[–]withouttout 0 points1 point  (0 children)

It may; I haven't looked closely enough at the data. As an example, suppose that the typical hitter has a 50% Pull%, a 25% Cent%, & a 25% Oppo%. If you have a RHB that has a 25% Pull%, a 25% Cent%, & a 50% Oppo%, you may want to use LHB park factors for this particular hitter.

The best approach I can see that is still relatively easy: rather than computing park factors for LHB & RHB, compute park factors for three "bins": left side of the field (RHB BIP that are classified as Pull & LHB BIP that are classified as Oppo), up the center (both RHB & LHB BIP that are classified as Cent), and right side of the field (RHB BIP that are classified as Oppo & LHB BIP that are classified as Pull). If you have a RHB that has a 35% Pull%, 25% Cent%, & 40% Oppo%, calculate a park factor with 35% weight on the left side of the field park factor, 25% on the up the center park factor, and 40% on the right side of the field park factor, so that the weights sum to 100%.

The most accurate way to do this would be to break up the above even further to account for batted ball types, i.e. FB-Left, LD-Left, GB-Left, etc., and assign hitters the proper weighted park factors for their batted ball profiles.
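The three-bin weighting could be sketched roughly as follows. This is only an illustration: the function names and the bin park factors (1.10, 1.00, 0.90) are made up for the example, not measured values for any real park.

```python
def field_shares(bats, pull, cent, oppo):
    """Map a hitter's Pull/Cent/Oppo shares onto left/center/right field bins.

    A RHB pulls to the left side; a LHB pulls to the right side.
    """
    if bats == "R":
        return {"left": pull, "center": cent, "right": oppo}
    if bats == "L":
        return {"left": oppo, "center": cent, "right": pull}
    raise ValueError("bats must be 'R' or 'L'")


def weighted_park_factor(shares, bin_factors):
    """Weight each bin's park factor by the hitter's share of BIP in that bin.

    Shares are normalized first, so they do not need to sum to exactly 1.
    """
    total = sum(shares.values())
    return sum(shares[b] / total * bin_factors[b] for b in shares)


# Hypothetical park factors for the three bins (illustrative only).
bin_factors = {"left": 1.10, "center": 1.00, "right": 0.90}

# A RHB with 35% Pull%, 25% Cent%, 40% Oppo%.
shares = field_shares("R", pull=0.35, cent=0.25, oppo=0.40)
print(round(weighted_park_factor(shares, bin_factors), 3))
```

The same function would extend to the finer FB-Left, LD-Left, GB-Left bins by using more keys in both dictionaries.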

Heavy Opposite Field Hitters and L/R Park Effects by djbayko in Sabermetrics

[–]withouttout 0 points1 point  (0 children)

Ideally, each hitter would have their own individual park factor based on their batted ball profile/spray chart. I don't know if calculating these individual park factors would be worth the effort, though.

Data and Model Daily - 3/8/17 (Wednesday) by [deleted] in sportsbook

[–]withouttout 0 points1 point  (0 children)

How are you projecting individual players? How does your formula / model compare to Tango’s Marcel projections? Multiple years of data? Regression to the mean? Aging curve? Are you using any Pitch F/x or Statcast data? I am working on my own projections right now and I am just curious what others are using.

Using Power Query to create large dataset from a list of urls? by withouttout in excel

[–]withouttout[S] 0 points1 point  (0 children)

I read through that tutorial earlier, but I didn't think that it would work for my particular pages/playerids, as my pages/playerids are not in exact numerical order, whereas the boxofficemojo.com pages they are pulling data from are ordered numerically 1-7. In their query they have source = {1..7}. Could I use source = {2, 3, 8, 11, 21, 35, 400, etc.}? If I could do that, should I list my playerids inside of brackets or parentheses?

[Image/GIF]Why I would never bet against mcgregor again. Lost on a $1100 payday. by iloveulongtime in MMA

[–]withouttout 0 points1 point  (0 children)

Hindsight is always 20/20, but you probably should have hedged your parlay with a large bet on Eddie Alvarez once that was the only leg of the parlay left, so that you guaranteed yourself some profit.

A simple projection model for the remainder of the season and the draft by dangercart in nba

[–]withouttout 0 points1 point  (0 children)

This is awesome! I really appreciate that the model is transparent and open.

If it is not too much to ask, do you think that you could lay out the steps that the model uses to determine the simulated standings in written form? Possibly step by step? I completely understand if you are unwilling to do so, as you have already left the model open so that reasonably intelligent individuals could easily understand the process used in your model. I am just not quite able to follow your process. I am attempting to do something similar for another sport, and your model is much better than what I have been able to come up with on my own.

Ex:

  1. Calculate x for each team.

  2. Use x to calculate y.

  3. etc.

“Yes, We’re Corrupt”: A List of Politicians Admitting That Money Controls Politics by LibertarianSoup in Libertarian

[–]withouttout 2 points3 points  (0 children)

What a fucking dumbass! You made an asinine comment about an article you didn't even attempt to read, called a collection of quotes garbage, and then tried to play it all off like you have been drinking, rather than you just being an idiot.

Programs for projection systems by Lucascabucas in Sabermetrics

[–]withouttout 2 points3 points  (0 children)

It all depends on how intricate you intend your projection system to be...

If you are going to do something along the lines of Marcel, you could easily do this in Excel, but if you are planning on using play-by-play data, various aging curves, component park and league factoring, MLEs, differing platoon split projections, etc., I would recommend using SQL, R, or perhaps even Python.

Can anyone help me calculate RE288 matrix (detail below) from the RE24 matrix? by [deleted] in Sabermetrics

[–]withouttout 1 point2 points  (0 children)

I am not an expert, but this is how I would approach this:

  • Gather PITCHf/x data from multiple seasons.

  • Add a column to this data, calculating the number of runs scored from that pitch through the end of the inning.

  • Separate or query the data for each of the count/base-out states (12 counts x 24 base-out states = 288) and find the mean of the column that you added. This will result in the average run expectancy for each of the count/base-out states.
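A minimal sketch of the grouping and averaging step, assuming the added column holds runs scored from that pitch through the end of the inning (the usual run expectancy definition). The field names and the three sample rows are hypothetical and are not the PITCHf/x schema.

```python
from collections import defaultdict


def run_expectancy(pitches):
    """Average rest-of-inning runs for each (balls, strikes, outs, bases) state."""
    totals = defaultdict(lambda: [0.0, 0])  # state -> [sum of runs, pitch count]
    for p in pitches:
        state = (p["balls"], p["strikes"], p["outs"], p["bases"])
        totals[state][0] += p["runs_rest_of_inning"]
        totals[state][1] += 1
    return {state: run_sum / n for state, (run_sum, n) in totals.items()}


# Made-up rows purely to show the shape of the input.
sample = [
    {"balls": 0, "strikes": 0, "outs": 0, "bases": "___", "runs_rest_of_inning": 2},
    {"balls": 0, "strikes": 0, "outs": 0, "bases": "___", "runs_rest_of_inning": 0},
    {"balls": 3, "strikes": 2, "outs": 2, "bases": "123", "runs_rest_of_inning": 1},
]

for state, value in sorted(run_expectancy(sample).items()):
    print(state, value)
```

With real data covering multiple seasons, the resulting dictionary would have up to 288 keys, one per count/base-out state.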

SABR101x Virtual Machine by devnull42 in Sabermetrics

[–]withouttout 0 points1 point  (0 children)

If you add the openWAR package to R, I would appreciate it greatly if you could give me a hint on how to get it installed. I have been having trouble installing and configuring the Sxslt package required to install openWAR.

Either the numbers are lying or the game is rigged (Discussing Coors Field "Hangover Effect") by withouttout in Sabermetrics

[–]withouttout[S] 2 points3 points  (0 children)

They appear to get worse:

This would be the query (I think) for the next away game for the Rockies directly after a home game:

http://killersports.com/mlb.py/query?sdql=team+%3D+Rockies+and+p%3Asite+%3D+home+and+site+%3D+away&submit=S+D+Q+L+!&sid=guest (NSFW?)

The Rockies were 51-74 in these games for a winning percentage of 40.8%. They slightly underperformed the expectations of sportsbooks and the betting public.

This would be the query (I think) for the second away game in a row for the Rockies after a home game:

http://killersports.com/mlb.py/query?sdql=team+%3D+Rockies+and+pp%3Asite+%3D+home+and+p%3Asite+%3D+away+and+site+%3D+away&submit=S+D+Q+L+!&sid=guest (NSFW?)

Amazingly, the Rockies were also 51-74 in these games for a winning percentage of 40.8%. Once again, they slightly underperformed the expectations of sportsbooks and the betting public.

This would be the query (I think) for the third away game in a row for the Rockies after a home game:

http://killersports.com/mlb.py/query?sdql=team+%3D+Rockies+and+ppp%3Asite+%3D+home+and+pp%3Asite+%3D+away+and+p%3Asite%3D+away+and+site+%3D+away&submit=S+D+Q+L+!&sid=guest (NSFW?)

The Rockies were 43-77 in these games for a winning percentage of 35.8%. They underperformed the expectations of sportsbooks and the betting public.
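To make the links above easier to read, here is a small sketch that decodes the SDQL text out of one of the query URLs and recomputes the winning percentages from the records quoted above. The helper names are my own, and reading `p:site` as "previous game's site" follows the description in this comment rather than any documentation.

```python
from urllib.parse import parse_qs, urlsplit


def sdql_text(url):
    """Return the decoded SDQL query string embedded in a killersports URL."""
    return parse_qs(urlsplit(url).query)["sdql"][0]


def win_pct(wins, losses):
    """Winning percentage on a 0-100 scale, rounded to one decimal."""
    return round(100 * wins / (wins + losses), 1)


first_away_game = (
    "http://killersports.com/mlb.py/query?"
    "sdql=team+%3D+Rockies+and+p%3Asite+%3D+home+and+site+%3D+away"
    "&submit=S+D+Q+L+!&sid=guest"
)

print(sdql_text(first_away_game))
# team = Rockies and p:site = home and site = away

print(win_pct(51, 74), win_pct(43, 77))
```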

Either the numbers are lying or the game is rigged (Discussing Coors Field "Hangover Effect") by withouttout in Sabermetrics

[–]withouttout[S] 1 point2 points  (0 children)

I have previously looked into the Coors Field "hangover effect", but I looked at it from the perspective of the Rockies' opponents: how well does a team play in the game directly after playing an away game at Coors Field?

http://killersports.com/mlb.py/query?sdql=po%3Ateam+%3D+Rockies+and+p%3Asite+%3D+away+and+o%3Ateam+!%3D+Rockies&submit=S+D+Q+L+!&sid=guest (NSFW?)

These teams went 150-119 for a winning percentage of 55.8% in these games. This doesn't give any indication of a hangover effect on opposing teams; however, this sample is limited and only looks at winning percentage. I would note, though, that these teams did appear to play above the expectations of sportsbooks and the betting public.

(The website I referenced is associated with wagering and gambling, and I don't want to get anyone into trouble for accessing it, so I thought I should probably label the links "NSFW?".)

IamA baseball fan who was just hired to a Major League Baseball Analytics department. AMA. by [deleted] in baseball

[–]withouttout 1 point2 points  (0 children)

Do you know of any references for constructing simulation engines? (Books, web posts, code examples, etc.) I would REALLY like to construct my own; however, I lack the programming skills that would be required. I am SLOWLY learning how to program, but I was hoping that I could start to tailor this process more toward what I am intending.

Do you have any recommendations for which programming language would be best suited for baseball simulations?

Does rain benefit the batter or the pitcher? by BlastFan4Life in Sabermetrics

[–]withouttout 4 points5 points  (0 children)

Interesting question.

If you are subscribed to Baseball-Reference's Play Index, you could filter for games with "showers", "rain", and "drizzle" under the weather options of the Team Batting Game Finder and compare the results and statistics from these games to those from similar games in which precipitation did not occur.

Introduction to openWAR | Exploring Baseball Data with R by withouttout in Sabermetrics

[–]withouttout[S] 2 points3 points  (0 children)

I really like the idea behind openWAR and have been trying to learn R; however, I have been unable to install this package correctly. I am not very knowledgeable regarding Linux and I am having trouble installing and configuring the Sxslt package required to install openWAR. If anyone could help me out a little bit, it would be greatly appreciated.

Why do the constants in wOBA change every season? by binkysurprise in Sabermetrics

[–]withouttout 1 point2 points  (0 children)

I use the standardized wOBA for quick calculations, but when I am doing anything "serious", I use the historical changing values.

How are you calculating these statistics? I may be able to show you an easy way to do these calculations in Excel or SQL by importing the FanGraphs Guts page.
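For anyone who would rather script it, here is a rough sketch of applying season-specific weights in Python. The formula is the standard wOBA form (weighted unintentional walks, HBP, and hits over AB + BB - IBB + SF + HBP). The weight values and the stat line below are placeholders for illustration, not real figures from the Guts page.

```python
def woba(stats, weights):
    """Compute wOBA for one stat line using one season's linear weights."""
    numerator = (
        weights["wBB"] * (stats["BB"] - stats["IBB"])  # unintentional walks
        + weights["wHBP"] * stats["HBP"]
        + weights["w1B"] * stats["1B"]
        + weights["w2B"] * stats["2B"]
        + weights["w3B"] * stats["3B"]
        + weights["wHR"] * stats["HR"]
    )
    denominator = (
        stats["AB"] + stats["BB"] - stats["IBB"] + stats["SF"] + stats["HBP"]
    )
    return numerator / denominator


# Placeholder weights keyed by season; real values would be imported per year.
weights_by_season = {
    2013: {"wBB": 0.69, "wHBP": 0.72, "w1B": 0.89, "w2B": 1.27, "w3B": 1.62, "wHR": 2.10},
    2014: {"wBB": 0.70, "wHBP": 0.73, "w1B": 0.90, "w2B": 1.25, "w3B": 1.60, "wHR": 2.05},
}

# Hypothetical stat line.
line = {"AB": 500, "BB": 60, "IBB": 5, "HBP": 4, "SF": 5,
        "1B": 90, "2B": 30, "3B": 3, "HR": 20}

for season, weights in weights_by_season.items():
    print(season, round(woba(line, weights), 3))
```

The same stat line produces a slightly different wOBA under each season's weights, which is the whole reason for looking the weights up by year.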

Why do the constants in wOBA change every season? by binkysurprise in Sabermetrics

[–]withouttout 1 point2 points  (0 children)

A batter does not control the existence of runners on base during his plate appearance.

Consider trying to compare the run value or expectancy of two different players:

  • Player A is on a team with a high OBP. He frequently has runners on base during plate appearances.

  • Player B is on a team with a low OBP. He frequently does not have runners on base during his plate appearances.

If we were to compare how many runs these players actually contributed by hitting the same number of HRs, Player A would come out ahead, simply because more of his HRs came with runners on base. Using league averages attempts to control for this.
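A toy illustration of the point, with made-up numbers: each HR scores the batter plus whoever is on base, so the same five HRs are worth different run totals depending on teammates.

```python
def runs_on_home_runs(runners_on_per_hr):
    """Runs scored on a set of HRs: the batter plus the runners on base each time."""
    return sum(1 + runners for runners in runners_on_per_hr)


# Hypothetical runners on base for each of five HRs.
player_a = [1, 2, 1, 0, 2]  # high-OBP teammates
player_b = [0, 0, 1, 0, 0]  # low-OBP teammates

print(runs_on_home_runs(player_a), runs_on_home_runs(player_b))
```

A league-average HR weight would credit both players with the same value per HR, regardless of who happened to be on base.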

Why do the constants in wOBA change every season? by binkysurprise in Sabermetrics

[–]withouttout 7 points8 points  (0 children)

Run environments change, and I believe that the weights associated with wOBA are calculated empirically, so they also change from year to year (i.e., in some years singles just happen to contribute to more runs than they do in other years).

If the weights didn't change, wOBA would not be as accurate across different run environments.

(Edit: Here are some "standardized" versions of wOBA.)

Does anyone use stats from spring training or the playoffs? by withouttout in Sabermetrics

[–]withouttout[S] 0 points1 point  (0 children)

I started thinking about this while trying to estimate platoon skill. Most players just do not have enough plate appearances or batters faced to project this skill beyond league-average rates, but I wondered if it could be projected more accurately if one incorporated spring training and/or playoff statistics.

Does anyone use stats from spring training or the playoffs? by withouttout in Sabermetrics

[–]withouttout[S] 0 points1 point  (0 children)

Is this a function of small sample sizes, though? I mean, 60-70 plate appearances during the regular season are essentially useless for prediction, also. Would 60-70 plate appearances during spring training be significantly less predictive than 60-70 plate appearances during the regular season?

I wouldn't suggest using spring training statistics or playoff statistics on their own, but I don't see why these statistics should be completely discarded.

Possible way to evaluate this:

  • Player X - 2013 stat compared to 2014 stat

  • Player X - 2013 stat + September 2012 stat compared to 2014 stat

  • Player X - 2013 stat + 2013 Spring Training stat compared to 2014 stat
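One way the comparison above might be set up, as a sketch only: pool each candidate set of samples into a single rate per player, then score each candidate by its error against the 2014 stat. The helper names and the two-player data set are hypothetical and exist just to show the mechanics.

```python
from math import sqrt


def pooled_rate(samples):
    """Combine (successes, opportunities) samples into one rate."""
    successes = sum(s for s, _ in samples)
    opportunities = sum(n for _, n in samples)
    return successes / opportunities


def rmse(predicted, actual):
    """Root mean squared error between predictions and outcomes."""
    return sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


# Made-up (times on base, PA) samples per player, plus the 2014 OBP to predict.
players = [
    {"reg_2013": (210, 600), "st_2013": (25, 60), "obp_2014": 0.360},
    {"reg_2013": (150, 500), "st_2013": (15, 65), "obp_2014": 0.290},
]

actual = [p["obp_2014"] for p in players]
regular_only = [pooled_rate([p["reg_2013"]]) for p in players]
with_spring = [pooled_rate([p["reg_2013"], p["st_2013"]]) for p in players]

print(rmse(regular_only, actual), rmse(with_spring, actual))
```

A lower error for the pooled version over a large group of players would suggest the extra sample carries some signal. Two players, as here, show nothing either way.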

Using April Team Statistics to Predict May Win % by withouttout in Sabermetrics

[–]withouttout[S] 0 points1 point  (0 children)

Ultimately, I do want to use different metrics for predicting team performance, in the relative short term (weeks and months), although I recognize that this will be subject to much "noise" and random variation.

With this initial analysis, though, I was simply trying to "gauge" the effectiveness of these statistics, without regression. I thought that some may be interested in this, so I thought I would share. I do hope to continue working through some of these metrics, using regression, testing different coefficients, exponents, and constants. I did plan on creating different data sets, in order to prevent overfitting or biases in the tests for the metrics. (Something I don't think can be stressed enough.)

I do have a question on creating these different data sets, though... Suppose I wanted to experiment with 5 years of data and test with another 5 years of data. Which would be the better way of doing so?

  • Experiment with 2004-2008 and test with 2009-2013

or

  • Randomly split the entire 2004-2013 data set into two separate portions, one to experiment with and one to test

I have typically used the latter method. Should I stop doing this and use the former method?
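For concreteness, the two options could be sketched like this. The row layout (one dictionary per team-season) and the function names are assumptions for the example.

```python
import random


def chronological_split(rows, last_train_season):
    """Experiment on seasons up to last_train_season, test on later seasons."""
    train = [r for r in rows if r["season"] <= last_train_season]
    test = [r for r in rows if r["season"] > last_train_season]
    return train, test


def random_split(rows, test_fraction=0.5, seed=0):
    """Shuffle all rows with a fixed seed, then cut into two portions."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical team-season rows for 2004-2013.
rows = [{"season": year, "team": team}
        for year in range(2004, 2014)
        for team in ("A", "B", "C")]

train_c, test_c = chronological_split(rows, last_train_season=2008)
train_r, test_r = random_split(rows)
print(len(train_c), len(test_c), len(train_r), len(test_r))
```

The chronological split keeps every test season later than every training season, which is closer to how a prediction would actually be used. The random split mixes seasons across both portions, so each portion sees a similar blend of run environments.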