Sunday, October 23, 2011

What Are “Sabermetrics”?

A decade ago, most fans would have looked at you as if you were from another planet if you asked that question. Now, most fans are at least aware that “Sabermetrics” means statistical analysis. More specifically, it is the use of countless (and sometimes confusing) math formulas to predict and explain athletic performance, which in turn leads to team success if done properly.


The new generation of General Managers has come to rely on this process, some more heavily than others. Players and even sports agents are now on board, using the formulas in reverse to bargain for more money and longer contracts.


Think about the central struggle in the book (and now the movie) “Moneyball”. It was whether traditional scouting methods were more valuable (accurate) than the “newfangled” approach that valued statistics and computer projections. Our very own Sandy Alderson, and later Paul DePodesta (working as an assistant to A’s General Manager Billy Beane), were heavily involved in the genesis of this approach. While the book also focused on identifying and capitalizing on undervalued assets, it inadvertently documented the early stages of the statistical movement.


It is clear that regardless of what side you fall on (traditional scout or “stat geek”), the use of “Sabermetrics” is not going away (I personally think a blend of statistical analysis, along with the input of savvy baseball scouts is the way to go). So, instead of being resistant to change, it benefits all baseball fans to have a basic understanding of the statistical process, so that you can understand what sportscasters, analysts and even bloggers are talking about from time to time.


Per Mack’s previous request, I am going to try to produce a weekly series of articles, each highlighting a statistical formula or two. In doing so, I will also try to explain the formula, what it measures and why it is relevant. I am in no way an expert, with just a college minor in statistics, but I will do my best. I encourage you to ask questions, contribute comments, or even introduce other statistical formulas that you would like to discuss.


Where to begin?


Part of me longs for the days when you could pick up a sports page and look at a traditional box score. It listed the basics, such as at bats, runs scored, hits and runs batted in, in the traditional format of AB - R - H - RBI. Now, the box scores are considerably more advanced, to say the least!


Some folks blame “fantasy baseball”, while others recognize it is a direct reflection of the aforementioned “statistical movement”.


I think it is important to understand the basics, before moving on to more advanced topics. Sort of like learning basic algebra, before moving on to “fun” classes like trigonometry and calculus. In that vein, my contributions will seem pretty simple at first, but will most likely get more complicated as we move along.


I think we should start with OPS and a newer derivative called OPS+. OPS stands for on base percentage plus slugging percentage. The plus sign added to the name means the number has been adjusted for specific ballpark factors and scaled against the league average. This allows for a more in depth comparison between different players plying their craft in different ballparks.


Looking a bit closer, it makes sense to define both on base percentage and slugging percentage, in order to understand OPS and OPS+.


On base percentage (OBP) is basically the number of times a player gets on base, divided by the total number of times a player could have gotten on base. You might think of it as the old batting average on steroids (OK, bad choice of words). Batting average is simply hits divided by at bats (walks, hit by pitches and sacrifices are left out, since they do not count as official at bats).


To calculate OBP, you add the number of hits, walks and hit by pitches and divide that number by the total number of at bats, walks, hit by pitches and sacrifice flies. So, just on the surface, you can see how much more complete that is compared to just batting average.


For example, Player A comes to the plate five times in one game. Let’s say he has one hit, draws two walks (we know it isn’t Jose Reyes then) and makes two outs. Subtract the walks and he was officially one for three on the day. Divide the one hit by three at bats and you have a batting average of .333 for that game.


That doesn’t really explain the overall impact of Player A’s day. Using on base percentage, you have to factor in the two walks, as well. So, you have one hit and two walks (three times on base) divided by the total number of official at bats (three), plus the two walks for a total of five. Three times on base divided by five chances equals an OBP of .600, or a much bigger impact on the team’s chances of scoring runs for the game.
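For anyone who would rather let a computer do the arithmetic, Player A’s day can be sketched in a few lines of Python. This is only a minimal illustration: the function names and arguments are made up for this post, and the guard against dividing by zero is my own assumption about how to handle a player with no chances.

```python
def batting_average(hits, at_bats):
    """Hits divided by official at bats."""
    return hits / at_bats if at_bats else 0.0


def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies):
    """(H + BB + HBP) / (AB + BB + HBP + SF)."""
    chances = at_bats + walks + hit_by_pitch + sac_flies
    return (hits + walks + hit_by_pitch) / chances if chances else 0.0


# Player A: one hit in three official at bats, plus two walks.
avg = batting_average(hits=1, at_bats=3)
obp = on_base_percentage(hits=1, walks=2, hit_by_pitch=0, at_bats=3, sac_flies=0)
print(f"AVG {avg:.3f}  OBP {obp:.3f}")  # AVG 0.333  OBP 0.600
```

The same two functions work for a full season; you just feed in season totals instead of one game.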


Let’s move on to Slugging Percentage (SLG), the second component of OPS. SLG, simply put, is the total number of bases a player earns on his hits, divided by the total number of official at bats. For this calculation, walks are not included in total bases, nor are they included in official at bats.


So, returning to Player A for one moment, we know that he had one hit in three at bats (minus the two walks........and if you are reading this Jose, walks won’t hurt you, my friend).


We know that the player had one hit, but what type of hit was it? A single is not the same thing as a home run, which is why this statistic was created. To properly calculate SLG, you need to know what type of hit the player had, not just if the player had a hit.


So, a single is one base, a double is two bases, and so on. If player A hit a double in three official trips to the plate, then you take two bases and divide that by three at bats. Your slugging percentage for the day would be .667 (which would be great over the course of a season).
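The total bases idea is easy to see in code. Here is a small Python sketch under the same caveats as before: the function name and its arguments are hypothetical, chosen just for this example.

```python
def slugging_percentage(singles, doubles, triples, home_runs, at_bats):
    """Total bases on hits divided by official at bats."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats if at_bats else 0.0


# Player A: one double in three official at bats (the two walks are ignored).
slg = slugging_percentage(singles=0, doubles=1, triples=0, home_runs=0, at_bats=3)
print(f"SLG {slg:.3f}")  # SLG 0.667
```

Swap the double for a home run and the same three at bats produce a SLG of 1.333, which shows why the type of hit matters so much.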


OBP and SLG are not difficult calculations, but you can see they reveal more about a player’s contributions than simply looking at a batting average. Adding the two together, as stated above, gives us OPS.


Player A had an OBP of .600 and a SLG of .667 in our example. The OPS then, would be 1.267, which would be fantastic for a season.
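Putting the pieces together, one way to sketch the whole calculation is to keep a player’s raw numbers in a single record and derive OBP, SLG and OPS from it. The `BattingLine` class below is a hypothetical container invented for this example, not something taken from any stats package.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """One player's raw counting stats (hypothetical container for illustration)."""
    at_bats: int
    singles: int = 0
    doubles: int = 0
    triples: int = 0
    home_runs: int = 0
    walks: int = 0
    hit_by_pitch: int = 0
    sac_flies: int = 0

    @property
    def hits(self) -> int:
        return self.singles + self.doubles + self.triples + self.home_runs

    @property
    def obp(self) -> float:
        # (H + BB + HBP) / (AB + BB + HBP + SF)
        chances = self.at_bats + self.walks + self.hit_by_pitch + self.sac_flies
        return (self.hits + self.walks + self.hit_by_pitch) / chances if chances else 0.0

    @property
    def slg(self) -> float:
        # Total bases on hits / official at bats
        total_bases = self.singles + 2 * self.doubles + 3 * self.triples + 4 * self.home_runs
        return total_bases / self.at_bats if self.at_bats else 0.0

    @property
    def ops(self) -> float:
        return self.obp + self.slg


# Player A: a double and two walks in five trips to the plate.
player_a = BattingLine(at_bats=3, doubles=1, walks=2)
print(f"OBP {player_a.obp:.3f} + SLG {player_a.slg:.3f} = OPS {player_a.ops:.3f}")
# OBP 0.600 + SLG 0.667 = OPS 1.267
```

Keeping the raw counts and computing the percentages on demand means you never have to worry about the three numbers drifting out of sync.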


In the modern era, a “good” OPS is usually anything over .800, but that depends on the player’s position and where he hits in the batting order. Middle infielders are not usually held to the same expectations as a first baseman or a corner outfielder, and your leadoff hitter is not looked at the same way as your cleanup hitter.


What about OPS+?


The basic principle is to take a player's OPS, adjust it for different ballpark factors and then put it on a percentage scale. When it comes to OPS+, 100 is the league average, 110 is 10 percent above league average, and 90 is 10 percent below league average.


The actual number of ballpark variables involved makes the calculation complicated to write out here, but the actual arithmetic is simple:


OPS+ = 100 X (OBP/lgOBP* + SLG/lgSLG* - 1)


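What the hell is that? Basically, you are comparing a player’s OBP and SLG to the league averages (lgOBP and lgSLG), with the asterisk indicating that those league figures have been adjusted for specific park factors. In other words, allowances have to be made for a place like Citi Field (harder to hit), when compared to the new Yankee Stadium (easier to hit).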


So, you would expect Mark Teixeira to have a lower OPS in Citi Field than the one he produced in Yankee Stadium.


As far as the math is concerned, you can locate the listed adjustments if you are interested in doing the math. My point is to show you the overall statistic (OPS+) and how it is generated, so when you see it, you know what they are referring to.
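For readers who do want to try the arithmetic, here is a rough Python sketch of the OPS+ formula shown above. It assumes you have already looked up the park-adjusted league OBP and SLG; the league figures used in the example (.320 and .400) are invented round numbers for illustration only, not real league data, and the function name is hypothetical.

```python
def ops_plus(obp, slg, lg_obp_adj, lg_slg_adj):
    """100 x (OBP / lgOBP* + SLG / lgSLG* - 1).

    lg_obp_adj and lg_slg_adj are assumed to be the park-adjusted
    league averages, which must be looked up separately.
    """
    return 100 * (obp / lg_obp_adj + slg / lg_slg_adj - 1)


# Invented league baseline, for illustration only.
LG_OBP = 0.320
LG_SLG = 0.400

# A hitter exactly at the league baseline lands on 100.
print(f"{ops_plus(0.320, 0.400, LG_OBP, LG_SLG):.0f}")  # 100

# A hitter 10 percent better in both OBP and SLG lands on 120.
print(f"{ops_plus(0.352, 0.440, LG_OBP, LG_SLG):.0f}")  # 120
```

Notice that being 10 percent better in both components adds up to 120, not 110, because the formula sums the two ratios before subtracting one.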


So, if a player has an OPS+ of 100, they are exactly league average, taking into account where they play. Anything over 100 is above average, and anything under 100 is below average.


For reference, Matt Kemp of the Dodgers posted an OPS of .986 for 2011, which is an excellent number. However, his OPS+ was a staggering 171 this year! That means he was 71 percent above the league average, even when you consider that he played half his games in Dodger Stadium.


In closing, I hope that the basic calculations of OPS and OPS+ shed some additional light on the topic of Sabermetrics. I also hope that you can now look at those two statistics and further understand what they are meant to measure. It is one of many ways to look at a player’s impact on the game, or to see if a prospective player is worth more or less than his reputation.


For fun, figure out Albert Pujols’ OPS for last night’s Game Three of the World Series!!


Boy is he getting expensive for 2012 and beyond.


I will pick a couple new statistics for next week and try to explain them, as well.


Enjoy the last few games of 2011.

