Strike Zone a Marginal Component of Home Field Advantage

Home-vs

Author's Note: Special thanks to the commenters (Dan, Mike and Xeifrank) in my previous post who helped me get the answers I needed.

As promised in my previous post, I ran some numbers to double-check the claim by the authors of Scorecasting that a home field bias in the strike zone is a significant contributor to MLB home field advantage. A look at the heat maps above is inconclusive: the away team pitcher's heat map is actually the "hottest" (indicating an away-team bias for pitches thrown down the chute), but banding is wider for the home team, indicating a home team advantage on pitches.

Indeed, the home team undoubtedly has an advantage when it comes to the strike zone, but this advantage just isn't very big. This is true when we look at the home/away splits on their own as well as when we control for several other factors.

Home Team    Runs/150    Zone Adv.
Pitching     -0.400      2.43%
Batting      -0.281      1.97%
Spread       -0.119      0.45%

I assigned a run value to each "blown" call made during the 2008-2010 regular seasons, using Tom Tango's linear weights per ball-strike count. As noted in the table above, I find the home vs. away spread to be -0.119 runs, or about +/- 0.06 runs, per 150 called pitches (about what an average nine-inning game contains). This is exactly the figure Dan Turkenkopf offers in the comments of the previous post.
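For readers who want to see the bookkeeping, here is a minimal Python sketch of the idea: value each blown call as the gap between the count the batter should have reached and the count he actually reached, then scale to 150 called pitches. Everything in it is an assumption for illustration. The function names are hypothetical, the run values in `COUNT_RUN_VALUE` are made-up placeholders rather than Tango's published weights, and only the first few counts are covered.

```python
# Placeholder run values by (balls, strikes), from the batting side.
# ASSUMPTION: these numbers are invented for illustration only;
# they are NOT Tom Tango's published linear weights.
COUNT_RUN_VALUE = {
    (0, 0): 0.00,
    (1, 0): 0.04,
    (0, 1): -0.04,
    (2, 0): 0.10,
    (1, 1): -0.01,
    (0, 2): -0.10,
}


def blown_call_run_value(balls, strikes, called_strike, in_zone):
    """Run value (batting side) of one called pitch; 0.0 if the call was correct."""
    if called_strike == in_zone:
        return 0.0  # the umpire got it right, nothing to assign
    after_ball = COUNT_RUN_VALUE[(balls + 1, strikes)]
    after_strike = COUNT_RUN_VALUE[(balls, strikes + 1)]
    if called_strike:
        # Pitch was outside but called a strike: the offense loses the gap.
        return after_strike - after_ball
    # Pitch was in the zone but called a ball: the offense gains the gap.
    return after_ball - after_strike


def runs_per_150(pitches):
    """Net blown-call run value, scaled to 150 called pitches."""
    total = sum(blown_call_run_value(*p) for p in pitches)
    return 150 * total / len(pitches)


# Three first-pitch calls: one blown strike call, two correct calls.
sample = [(0, 0, True, False), (0, 0, True, True), (0, 0, False, False)]
print(round(runs_per_150(sample), 3))
```

Running the same tally separately for pitches thrown by the home team and pitches thrown at home-team batters, then subtracting, gives a spread of the kind shown in the table.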

Considering that the run environment over this period is about 9.1 runs per game (according to Baseball-Reference.com), and that the typical MLB home field effect is +/- 4%, home team bias in the strike zone accounts for only ~16% of the observed effect. This is pretty darn close to the findings of Mike Fast and Phil Birnbaum, also noted by Dan in the comments of the previous post (Phil takes on Scorecasting from a slightly different angle on his blog).
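The arithmetic behind that ~16% can be checked in a few lines. This is one reading of the calculation, not necessarily the exact steps used for the post: it assumes the +/- 4% is applied to the 9.1 runs per game, and that the -0.119 spread is split evenly into about 0.06 runs on each side. The variable names are hypothetical.

```python
# Figures quoted in the text
runs_per_game = 9.1        # run environment, 2008-2010
home_field_effect = 0.04   # typical home field effect, +/- 4%
zone_spread = -0.119       # home vs. away spread, runs per 150 called pitches

# ASSUMPTION: split the spread evenly, giving about +/- 0.06 runs per side
zone_runs = abs(zone_spread) / 2

# ASSUMPTION: express the home field effect in runs by applying 4% to 9.1
home_field_runs = home_field_effect * runs_per_game

share = zone_runs / home_field_runs
print(f"strike zone share of home field effect: {share:.1%}")
```

Using the full spread against the full 8% swing (0.119 divided by 0.08 x 9.1) gives the same share, since both numerator and denominator simply double.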

For those of you scoring at home, that advantage is equivalent to playing an entire season at home and barely winning one extra game.
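A quick sanity check on that season-long figure, as a sketch: it assumes a 162-game schedule and the common rule of thumb of roughly 10 runs per win, neither of which is stated in the post, and it takes the per-game edge as half of the -0.119 spread.

```python
GAMES_PER_SEASON = 162   # ASSUMPTION: standard MLB schedule
RUNS_PER_WIN = 10.0      # ASSUMPTION: rule-of-thumb conversion, not from the post

zone_runs_per_game = 0.119 / 2   # about 0.06 runs per game, from the table

extra_runs = zone_runs_per_game * GAMES_PER_SEASON
extra_wins = extra_runs / RUNS_PER_WIN
print(f"{extra_runs:.1f} extra runs, {extra_wins:.2f} extra wins")
```

Under those assumptions the edge adds up to a bit under ten runs over a full season, which is just shy of one win.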

Subjectively, this is not my idea of a big deal. It's certainly not enough for the data to go "berserk," as the author claims in this interview with Wired. But there's another objective measurement that indicates how unimportant home field advantage is regarding the strike zone.

As I've noted several times in this series, I base much of my work on a statistical model predicting which variables have an effect on called pitch bias. This model tells us not only which variables make an impact, and in what direction, but also how strong that impact is relative to the other variables in the model. So how strong a predictor of Zone Advantage is home field advantage?

Among 29 non-control, statistically significant variables, home field ranks dead last.

That's right, 29th out of 29. Relative to other variables, it's about 1/4 as important as pitcher-handedness, 1/5 as important as velocity, 1/8 as important as pitch type, 1/10 as important as the run expectancy of the base-out state, 1/57 as important as the run expectancy of the ball-strike state, and nearly 1/100 as important as the ratio of pitches thrown in or out of the legal zone over the course of an at bat.

Of course, at this point, we're beyond the scope of the book; this has nothing to do with home field advantage at the level of win-loss record. It does show, however, just how unimportant home field advantage is in terms of strike zone bias.

Rob Neyer wrote:

I still want to see the numbers. And it's not like nobody's ever studied this stuff before.

Mr. Neyer's right: it's not like nobody's done this before. Dan, Mike and Phil have crunched the numbers, and my findings here replicate and corroborate them. So there you have it.

PitchF/X data originate from Darrell Zimmerman's SQL-based PitchFX database; run expectancy data are from Tom Tango at Inside the Book.

Previous episodes in the Benefit of the Doubt series: