Getting Started
One of the goals of the FanGraphs Library is to create a static location where you can come to learn about or brush up on the advanced metrics we use on the site. This kind of resource is difficult to cultivate because so much of the great content that helps explain the numbers gets lost in the expansive oceans of daily blog posts. You’re no doubt familiar with this space as a depository for specific information on individual statistics, but we also have a blog section that walks you through a variety of concepts and ideas that don’t really fit well in a “What’s wOBA?” style page.
To that end, welcome to the Library’s new “Getting Started” page. Here you can find an explanation about the kinds of metrics we offer, but also a guide that walks you through the process of learning about sabermetrics. In the summer of 2014, I began a series of posts that hit on these major concepts, but I noticed how quickly that knowledge was fading from view. If you were new to the discussion at precisely the right time, you probably read it. If you were a month late, you missed it.
Now, if you click over to our Library, you can easily find this page, which catalogs important topics beyond basic questions like “is wRC+ park-adjusted?” It will be a constantly updated section, so if you think of something that belongs here, please let us know.
Why Sabermetrics | The Basics | What We Offer | Key Concepts & Terms
*****
Why Sabermetrics?
Sabermetrics is about trying to evaluate the sport more accurately. For decades, statistics like home runs, runs batted in, batting average, wins, and earned run average were all we had to determine which players were good, which were bad, and which were in between. But as gathering, collecting, and sharing information became easier, a group of baseball teams and analysts started to develop statistics that were slightly harder to track and disseminate, but ones that were a much better reflection of talent or performance.
The most obvious example of this is the difference between batting average and on-base percentage. A walk is a positive outcome for the batter, and while it isn’t as valuable as a single or a double, it is much better than making an out. Batting average completely ignores walks, meaning that it is failing to capture important information about the hitter. Beyond that, batting average and on-base percentage assume that each hit or time on base is equally valuable, when we know that extra base hits lead to more runs than singles and walks. So there needs to be a way to credit hitters for getting on base, but also for how much their particular way of reaching base is worth. Sabermetrics, at its heart, is about making sure we capture as much of that as possible.
That’s just an example about one or two statistics, but the goal is always better evaluation and using the proper tool for the question at hand. We have questions about the game and sabermetrics is about bringing all of the relevant data into the conversation to answer them.
The Basics
One of the most common questions we get from fans who are curious but not well-versed in sabermetrics is where they should begin. What are the first few statistics they should learn to better understand everything we offer? As you can probably tell, we have hundreds of statistics available on the site, but I believe that if you learn four statistics/concepts, you’ve picked up most of what you need to know to interpret most sabermetrics.
Weighted On-Base Average (wOBA)
wOBA is really the key to everything. If you understand and accept wOBA as a statistic, you’re ready for anything that we might throw at you. Essentially, wOBA is an offensive rate statistic that reads like OBP and tries to offer you a complete look at a player’s performance. It is superior to AVG/OBP/SLG and OPS for two key reasons. First, it includes everything from walks and HBP to home runs and sacrifice flies. Second, it weights those actions based on how much they contribute to run scoring on average.
Everyone knows that a single and a double are differently valuable. If you had to choose one or the other, the double is always as good or better than the single. So right away, batting average and on-base percentage fail to capture this important difference. Yet slugging percentage and OPS do not properly resolve this issue because they weight the actions based on a 1-2-3-4 system that is simply designed to count bases. A double is worth more than a single, but it is not worth precisely twice as much, which is how slugging percentage sees it. Granted, OPS is generally going to tell a similar story to wOBA, but OPS dramatically overvalues slugging relative to OBP. Enter wOBA.
wOBA = (0.689×uBB + 0.722×HBP + 0.892×1B + 1.283×2B + 1.635×3B + 2.135×HR) / (AB + BB – IBB + SF + HBP)
Above is the wOBA formula for the 2014 season, with the league average sitting at .310 due to the low run environment. We set the league average wOBA to equal the league average OBP to make it easier to read and then we use those weights in the formula to give different credit for different types of offensive actions. A single isn’t worth half of a double; it’s worth more like 70% of one. These numbers are based on linear weights, which is a fancy way of saying they’re based on the actual run value changes that singles, doubles, etc. have caused during the season.
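To make the formula concrete, here is a minimal Python sketch that applies the 2014 weights above to a stat line. The function name and the sample stat line are made up for illustration, and uBB is assumed to mean unintentional walks (BB minus IBB), which matches the denominator.

```python
# 2014 linear weights, copied from the formula above.
WEIGHTS_2014 = {
    "uBB": 0.689, "HBP": 0.722, "1B": 0.892,
    "2B": 1.283, "3B": 1.635, "HR": 2.135,
}


def woba(ab, bb, ibb, hbp, singles, doubles, triples, hr, sf, weights=WEIGHTS_2014):
    """Weighted on-base average for one stat line."""
    ubb = bb - ibb  # assumption: uBB is walks minus intentional walks
    numerator = (
        weights["uBB"] * ubb
        + weights["HBP"] * hbp
        + weights["1B"] * singles
        + weights["2B"] * doubles
        + weights["3B"] * triples
        + weights["HR"] * hr
    )
    denominator = ab + bb - ibb + sf + hbp
    return numerator / denominator


# Hypothetical hitter, not a real player's season.
example = woba(ab=550, bb=60, ibb=5, hbp=5, singles=100,
               doubles=30, triples=3, hr=20, sf=5)
print(f"wOBA: {example:.3f}")
```

Because the weights change a little every season, passing them in as a parameter keeps the function usable for other years.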
Players should get credit for the degree to which their actions lead to run scoring and wOBA offers a much more complete accounting of that than something like RBI, AVG, or OPS. To learn more about wOBA, check out our glossary entry and this post about wOBA as a gateway to sabermetrics.
Weighted Runs Created Plus (wRC+)
wRC+ is really just an adjusted form of wOBA. It takes the same inputs for the same reasons we just discussed, but it does two important things that matter a lot in sabermetrics: It adjusts for park effects and league average. To know wRC+, you have to understand that context has an influence on the numbers you produce. A home run in Colorado in 2000 is not as impressive as one in San Francisco in 2014. You know this intuitively, but we need to account for it in our measures of performance.
So wRC+ is essentially wOBA, but we adjust it for park to even out the way the different environments might affect a player’s raw production. If you play your home games in Colorado compared to San Francisco, your raw stat line is going to look a lot better as a hitter (and worse as a pitcher) in terms of hits and runs, but it won’t actually be more impressive because everyone hits better at Coors. Here we apply what we call a “park adjustment.”
wRC+ also scales to league average during that season so that we can more accurately compare players. In 2000, the average player had a .341 wOBA and in 2014 that number was .310. That’s a dramatic shift in the run environment due in part to drug testing and in part to the changing strike zone, but no matter the cause, it changes how we should think about certain numbers. In 2000, it was no big deal for a player to hit 30 HR, but in today’s game it’s extremely rare. Run scoring is down, so the best players and the average player are both worse in an absolute sense than they were a decade ago.
wRC+ helps us by taking the average offensive performance and setting it equal to 100 and then giving players credit based on how much better they are than league average during that year. So a player with a 120 wRC+ was 20 percent better than league average. This lets us compare across leagues and time more effectively. Basically, a .340 wOBA means something different depending on the context and wRC+ helps us see that. Context and environment are always important in sabermetrics.
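As a rough illustration of how the same wOBA can land on either side of 100, the sketch below follows the general shape of the wRC+ calculation as I understand it: convert wOBA to runs per plate appearance relative to league, apply a park adjustment, and scale so league average is 100. Only the .310 and .341 league wOBAs come from this page. The wOBA scale, league runs per PA, and park factor values are placeholder assumptions, and the denominator is simplified to league runs per PA.

```python
def wrc_plus(woba, lg_woba, woba_scale, lg_r_per_pa, park_factor=1.0):
    """Simplified wRC+ sketch: 100 is league average, park_factor > 1 is hitter-friendly."""
    # Runs above average per plate appearance implied by the wOBA gap.
    wraa_per_pa = (woba - lg_woba) / woba_scale
    # Park adjustment: take runs away in hitter-friendly parks, add them in pitcher-friendly ones.
    park_adj = lg_r_per_pa - park_factor * lg_r_per_pa
    return 100 * (wraa_per_pa + lg_r_per_pa + park_adj) / lg_r_per_pa


# Same .340 wOBA in two run environments (scale and R/PA values are illustrative guesses).
low_offense_year = wrc_plus(0.340, lg_woba=0.310, woba_scale=1.30, lg_r_per_pa=0.110)
high_offense_year = wrc_plus(0.340, lg_woba=0.341, woba_scale=1.15, lg_r_per_pa=0.130)
hitters_park = wrc_plus(0.340, lg_woba=0.310, woba_scale=1.30,
                        lg_r_per_pa=0.110, park_factor=1.10)

print(round(low_offense_year), round(high_offense_year), round(hitters_park))
```

The exact outputs depend entirely on the placeholder inputs, so read them for direction only: the same wOBA rates above average in the lower-scoring year, about average in the higher-scoring one, and lower still once a hitter-friendly park is factored in.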
To learn more, check out our glossary entry on wRC+ and this post about the importance of context.
Defense Independent Pitching Statistics (DIPS) and Batting Average on Balls in Play (BABIP)
DIPS and BABIP are central to our understanding of everything on the field. Hitters, pitchers, fielders, it doesn’t matter. wOBA and wRC+ are about proper measurement, but DIPS and BABIP are about figuring out who is responsible for what. How much of baseball is luck? How much can you control? How much do your teammates matter?
Both of these concepts are complicated, but they are important because they help us understand how much randomness is in play during a baseball game. For hitters, you can crush a baseball and it can find someone’s glove. We’ve always known that. But until relatively recently, we didn’t realize how long it took for that luck to even out. It can take 2-4 years before a player’s unlucky hits (or hits allowed) and unlucky outs even out. What happens once the ball is put into play is driven by far more than just the hitter and pitcher, even though they’re the ones who traditionally receive all of the credit.
On the pitching side, this means that we want to try to evaluate pitchers independent of their defense, which leads us to numbers like Fielding Independent Pitching (FIP) and other variants. Pitchers have much less control over their BABIP than hitters and while certain pitchers might be able to run unusually low BABIPs, for the most part they can’t do much beyond determining the number of balls that are put into play.
Hitters have a much larger range of possible true talent BABIPs, but BABIP is still a very important concept to know for them as well. Hitters can maintain .335 BABIPs, but sample size matters a lot when you’re looking at their stat line. A career .295 BABIP hitter might very easily put up a .350 BABIP for a month and a half, but that doesn’t mean he has changed his talent level to that degree. Putting the ball in play and doing so with authority is a very real skill, but you can’t use the outcome of a few PA to determine if that skill has changed.
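To get a feel for how easily a hot stretch can happen by chance, here is a small sketch that treats every ball in play as an independent trial with a fixed true-talent BABIP. That is a simplifying assumption, and the ball-in-play counts (100 for a month and a half, 450 for a full season) are rough guesses rather than figures from this page.

```python
from math import exp, lgamma, log


def binom_pmf(k, n, p):
    """Probability of exactly k hits in n balls in play (computed in log space)."""
    log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_coeff + k * log(p) + (n - k) * log(1 - p))


def prob_at_least(hits, balls_in_play, true_babip):
    """Chance of at least `hits` hits on `balls_in_play` balls in play."""
    return sum(binom_pmf(k, balls_in_play, true_babip)
               for k in range(hits, balls_in_play + 1))


# A true .295 BABIP hitter posting .350 or better:
short_sample = prob_at_least(35, 100, 0.295)    # 35/100 = .350 over ~6 weeks
full_season = prob_at_least(158, 450, 0.295)    # 158/450 is just above .350

print(f"~6 weeks: {short_sample:.3f}, full season: {full_season:.4f}")
```

Under these assumptions the short-sample hot streak is common enough to see regularly, while the same BABIP over a full season would be a much rarer event for a hitter with that true talent.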
DIPS and BABIP really help us understand the importance of evaluating process and results separately. You can do everything right in baseball and the outcome might still not be very good, which means when we want to know how talented a player is or how well they’ve done their job, we want to look at their process in addition to their results. For example, a batter might go 0-4 one day, but they might have smoked three hard line drives that the left fielder happened to run down. On the other hand, the next day they might go 3-4 thanks to four weak ground balls. In small samples, you can’t always tell much about the process from the results.
Here are a couple of posts to learn more about BABIP and FIP in addition to the links above.
Runs and Wins
Finally, learning to speak in the language of runs and wins is vital for understanding sabermetrics. It’s very easy, but as with any new language, it’s jarring at first. The goal of baseball is to score more runs than the other team over nine innings in 162 separate games. Wins are the currency of the season and runs are the currency of each game. You want to maximize the number of runs you score, minimize the runs the other team scores, and you want to do so all year long.
This means that when we compare teams and players, we want to do so by using the currency of the game. Oftentimes we speak about a player being X number of runs above or below average in a given area. A good defender might save 5 or 10 runs more than the average player at his position, or a great hitter might be worth 60 runs more than the average hitter that year. Their performances add runs to their teams’ totals and make it more likely that they will win.
In general, about 9-10 runs above replacement is equal to one win above replacement. The average full-time player produces about 20 runs for his team above replacement per year, or about two wins. By definition a completely average team should win 81 games during a season and score the same number of runs as they allow. Roughly speaking, adding ten runs to an average team’s run differential will usually make them an 82-80 win team.
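The rule of thumb above is simple enough to write down directly. This sketch assumes a flat 10 runs per win (the top of the 9-10 range quoted above); the function names are just for illustration.

```python
RUNS_PER_WIN = 10   # assumption: upper end of the 9-10 range
GAMES = 162


def runs_to_wins(runs):
    """Convert a run total (above average or above replacement) into wins."""
    return runs / RUNS_PER_WIN


def expected_wins(run_differential, games=GAMES):
    """Expected wins for a team, starting from a .500 baseline."""
    return games / 2 + runs_to_wins(run_differential)


print(runs_to_wins(20))     # an average full-time player: about two wins
print(expected_wins(10))    # an average team with +10 run differential
print(expected_wins(-50))   # a team outscored by 50 runs
```

This is a linear approximation and will drift for teams with extreme run differentials, but it captures the conversion used throughout this section.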
That’s the logic of runs and wins: we’re simply deconstructing team success down to the individual players and making sure players who perform well and play often are credited with these runs. Learn more about this terminology here.
What We Offer
FanGraphs provides all kinds of information to help you get up to speed and to make use of the statistics about which you’re now informed. The most obvious tool is our FanGraphs Library which has detailed posts explaining the available statistics and concepts we use every day on the site. If you don’t know how or why you should use a specific stat, this area of the site will help you figure it out.
Additionally, we have loads of customizable leaderboards and pages to help you get the most out of your inquiries. Want to compare the careers of five specific players during similarly aged seasons? You can do that in less than five minutes here. Want to export some data into Excel? No problem. These posts on how to use the leaderboards and player pages will show you how.
There are also a lot of other features on the site such as live scoreboards, win expectancy graphs, and in-game live statistics. We also provide standings (projected and real) in conjunction with depth charts and postseason odds that are updated daily. We have historical and current player and team statistics, and Ottoneu, which is a FanGraphs powered fantasy baseball game.
Finally, we have a great team of writers who offer insight and thoughtful analysis at least five days a week. We’re always adding new features and stats, so stay close if you’re looking for updates.
Key Concepts and Terms
Now that you have a basic grasp of sabermetrics and have a sense of what you can find on FanGraphs, it’s important to get a handle on some terms and concepts that we use a lot on the site.
Projection
The word “projection” shows up everywhere on FanGraphs because a lot of our conversations about baseball concern our estimates of future performance. We want to know how good a player will be in the future and to determine that, we need to use everything we know about him and similar players to project his future results.
Projections aren’t perfect estimates of the future, but they’re “best guesses” that serve as baselines for comparison. To learn more about projections, check out this post.
True Talent
Because baseball includes so much randomness, it’s common for luck to dramatically influence the outcome of plate appearances, games, and even seasons. If you do everything right, sometimes the ball still bounces in the wrong direction. Random variation takes over and the results we observe in any one iteration don’t line up with the true, underlying talent distribution.
To put it another way, if you flip a coin 20 times, on average a fair coin will land on heads 10 times and on tails 10 times. But in any one single set of 20, virtually any combination of heads and tails is possible. It wouldn’t be that odd to get a 15/5 split once in a while. Baseball is like this in that sometimes you get a funny result even if it’s a fair coin. True talent is how good a player or team actually is, but due to the nature of the game, sometimes good teams lose to bad teams, etc.
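The coin-flip example can be checked exactly with a few lines of Python. This is just the standard binomial calculation for a fair coin; nothing here is specific to baseball.

```python
from math import comb


def prob_heads(flips, heads):
    """Probability of exactly `heads` heads in `flips` tosses of a fair coin."""
    return comb(flips, heads) / 2 ** flips


even_split = prob_heads(20, 10)
# A 15/5 split in either direction (15 heads or 15 tails).
lopsided_split = prob_heads(20, 15) + prob_heads(20, 5)

print(f"exactly 10/10: {even_split:.3f}")
print(f"15/5 either way: {lopsided_split:.3f}")
```

Even the single most likely outcome, a perfect 10/10 split, happens in fewer than one out of five sets of 20 flips, and a 15/5 split shows up about 3 percent of the time. Watch enough sets of 20 and you will see a few lopsided ones from a perfectly fair coin.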
Regression (Toward the Mean)
Regression is discussed so often these days that it’s become something of a meaningless buzzword, but it’s a very important idea that you have to understand to master sabermetrics. As you read above, some of baseball is based on skill and some is based on luck. A player’s true talent level doesn’t change dramatically day to day or week to week, but their true talent level isn’t the only thing at work.
Due to luck, randomness, etc., players perform better or worse than their true talent level quite often when we’re dealing in relatively small samples. This means that until you have a very large sample size, you expect a player to regress to his “population’s” average to some degree. In other words, if a great hitter posts a .400 BABIP, you have a pretty good idea that he cannot possibly continue to post that high a BABIP and you expect it to look something like the average BABIP for hitters of his type going forward.
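One common way to put a number on "regress to some degree" is to blend the observed rate with the population average, weighting the observation by its sample size. The sketch below shows that idea. The `stabilizer` value, the .300 population mean, and the sample sizes are hypothetical numbers chosen for illustration, not estimates published on the site.

```python
def regress_to_mean(observed, sample_size, population_mean, stabilizer):
    """Blend an observed rate with the population mean.

    `stabilizer` is the sample size at which the observation and the
    population mean get equal weight (a hypothetical tuning constant).
    """
    total_weight = sample_size + stabilizer
    return (observed * sample_size + population_mean * stabilizer) / total_weight


# A .400 BABIP over 100 balls in play vs. over 2,000 balls in play.
small_sample = regress_to_mean(0.400, 100, population_mean=0.300, stabilizer=800)
large_sample = regress_to_mean(0.400, 2000, population_mean=0.300, stabilizer=800)

print(f"after 100 BIP: {small_sample:.3f}, after 2000 BIP: {large_sample:.3f}")
```

With a small sample the estimate stays close to the population average, and it only moves toward the observed number as the sample grows, which is the behavior described above.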
Replacement Level
Replacement level is sometimes hotly debated because people sometimes believe it was pulled out of thin air, but it’s actually grounded pretty heavily in fact. Replacement level, or a replacement player, is defined as a player or team of players who you could acquire for essentially nothing on the free agent market. In other words, replacement level is what you could get out of a minor league free agent/AAAA type player.
We calculate that a team of only those players would still win about 48 games per season, and given that there are 2,430 games to be won each year, there are about 1,000 wins above replacement (WAR) to be earned by MLB players. We like to compare players to replacement level because it offers us a common yard stick. “How much better is this player than some random player we could grab for nothing?” It’s a useful baseline and it’s mathematically pretty accurate. If you take the MLB performance of minor league free agents over a year or two, it quite often averages out to roughly 0 WAR.
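The "about 1,000 WAR" figure follows from simple arithmetic, sketched below. It assumes 30 teams playing 162 games each; the 48-win replacement baseline is the figure quoted above.

```python
TEAMS = 30
GAMES_PER_TEAM = 162
REPLACEMENT_TEAM_WINS = 48  # wins for a team made up entirely of replacement players

# Every game produces exactly one win, so total wins = total games played.
total_wins = TEAMS * GAMES_PER_TEAM // 2
replacement_wins = TEAMS * REPLACEMENT_TEAM_WINS
war_available = total_wins - replacement_wins

print(total_wins, replacement_wins, war_available)
```

The result lands just under 1,000, which is where the round number comes from.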
Defensive Metrics
Defensive metrics are also controversial because they’re new and they sometimes disagree with our eyes, and we like to trust our eyes. Essentially, these metrics try to determine how well a defender has performed by combining the difficulty of making each play (based on how often the average player makes the play) and the run value of each play (the average change in run expectancy for similar batted balls).
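The two ingredients described above, play difficulty and run value, can be combined in a simple plus/minus style calculation. This is a simplified sketch of the general idea, not the exact method behind any published defensive metric, and the plays and numbers are invented.

```python
def play_credit(made, avg_success_rate, run_value):
    """Runs credited to a fielder for one play.

    made:             True if the fielder converted the play into an out
    avg_success_rate: how often an average fielder makes this play (0 to 1)
    run_value:        average change in run expectancy between hit and out
    """
    if made:
        # Credit is larger for plays an average fielder rarely makes.
        return (1 - avg_success_rate) * run_value
    # Missing a play costs more when an average fielder usually makes it.
    return -avg_success_rate * run_value


# Hypothetical plays: (made?, average success rate, run value).
plays = [
    (True, 0.90, 0.75),   # routine play made: small credit
    (True, 0.30, 0.75),   # difficult play made: large credit
    (False, 0.60, 0.75),  # makeable play missed: debit
]

runs_saved = sum(play_credit(*play) for play in plays)
print(f"runs saved vs. average: {runs_saved:+.2f}")
```

An average fielder nets out to zero under this scheme, so the running total reads directly as runs saved above or below average.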
The inputs used to create these measurements are not perfect, but the overall structure and philosophy is very useful. We want to know how many runs a defender has saved, but we can’t always trust the tools we have to make the measurements. To learn more, click over to this detailed explanation.
Statistics and Terminology
Finally, it’s important to keep a level head about statistics and language more generally. Some statistics are better than others, without a doubt, but no single statistic is perfect. wOBA is always a better bet than batting average, but wOBA isn’t without its flaws. To properly analyze something, you need to bring as much information into the debate as possible.
You need to understand who is writing/speaking and who their audience is. If you read an article on FanGraphs that cites WAR but doesn’t mention some of the potential flaws of the statistic, it’s because the writer is assuming the reader is already aware of that information. We may list 3.5 WAR above 3.0 WAR in the leaderboards, but we know that those players are very difficult to distinguish with WAR because they are so close. However, if you aren’t familiar and don’t speak the language, you might think we are making a bold statement about which player is better. In general, the best advice is to use caution when making inferences because there is a lot of information that is implied but never verbalized.
[MORE TO COME]
Neil Weinberg is the Site Educator at FanGraphs and can be found writing enthusiastically about the Detroit Tigers at New English D. Follow and interact with him on Twitter @NeilWeinberg44.