THE BOOK--Playing The Percentages In Baseball

Friday, May 04, 2012

WAR updated on Baseball-Reference

By .(JavaScript must be enabled to view this email address), 02:28 PM

I just saw the post, so I’ll read through it, and comment as appropriate.  Sean reached out to me on a couple of things, so I’m keen on seeing what the final product looks like.

UPDATE 1: I just read through all the descriptions on Sean’s site.  The explanations are tremendous, and I have no major objections.  I’ll give it a second read-through, and will just make some minor, sporadic comments. 

More updates below…


UPDATE 2:

I’m going to go through each of Sean’s pages of explanation, and pick out a couple of passages.  You can 99% presume that if I don’t talk about something, then I either agree with it, or I consider it to be a reasonable choice. 

It’s apparent that Sean took great care with the framework and calculations, and put serious thought into his choices.  With Sean at Baseball-Reference and David at Fangraphs as stewards of advanced metrics, both with fantastically designed websites, sabermetrics has never been in better shape to have its message put out in such an open and honest manner.

Anyway, I have several open tabs, and picking things out as I see them, here we go:

http://www.baseball-reference.com/about/war_explained_position.shtml

1. Just a note that I’m glad to see Sean holding to the WAR framework principle, that everything is about comparing to average, and then, you have the final step to compare to replacement level. 

As an aside: What this does is allow people who don’t like the idea of replacement level, or who have a different idea of replacement level, to simply substitute their own value for it.  This is why the presentation that Rally had at Baseball Projection, and that Fangraphs has at the bottom of its player pages, is so powerful.  It sticks to this idea.

2. Sean switched from “Theoretical Team” BaseRuns as the basis to just linear weights.  In the former, you would compute team BaseRuns with a player, and then compute it without the player, and the difference is his impact.  While undoubtedly a great way to do it, and possibly even the best way, it’s not the easiest thing to program, adapt, or explain.  Linear weights, which basically approximates this, is close enough, and is more flexible.  You can really go either way here. 

As an aside: Personally, I prefer ease and flexibility.

3. Taking out pitchers-as-hitters is always a good choice.  It is more work, but the payoff here is worth it.  It’s basically a minor point, but it shows great care.

As an aside: When I do my stuff, I not only always take out pitchers-as-hitters, but I also take out hitters-as-pitchers.  I just pretend all those things didn’t happen.  I don’t really care about how many pitchers Cliff Lee has struck out, as it pertains to his value as a pitcher.  I think of it as noise more than anything.

4. Sean shows the historical positional adjustments.  I’m not sure if this is Forman’s or Rally’s stuff (extended to 2012).  Since I myself change the values every few years by a couple of runs here or there, I’m going to always consider such positional adjustments as a work-in-progress.  The numbers in the chart are certainly justifiable for the recent years, and for the most part in prior years.  Like I said, we can talk about this for hours, and just take one step forward maybe.

5. Sean continues Rally’s choice of the .320 replacement level.  The result is you get 875 wins every year (Sean shows 785, but that’s a typo).  Fangraphs has 1100 wins, so .275 is their replacement level.  You can reasonably justify anything between .250 and .350.
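The link between a replacement-level winning percentage and the league-wide pool of wins above replacement can be sketched in a few lines. This is only an illustration: the function name is made up here, and the 30 teams and 162 games per team are assumptions about the modern schedule rather than anything stated on Sean’s pages.

```python
def league_war_pool(replacement_win_pct, teams=30, games_per_team=162):
    """Wins above replacement available league-wide in one season.

    Every team-game played at replacement level "expects" replacement_win_pct
    wins, while the league as a whole wins exactly half of its team-games.
    teams and games_per_team are assumed values for the modern schedule.
    """
    team_games = teams * games_per_team
    return team_games * (0.500 - replacement_win_pct)


if __name__ == "__main__":
    # .320 is the level quoted for Baseball-Reference, .275 the one implied for Fangraphs
    for level in (0.320, 0.275):
        print(level, round(league_war_pool(level), 1))
```

Under those assumptions, .320 lands right around the 875 wins quoted above, and .275 lands a little under 1100.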

6. He also splits it at 59/41 between nonpitchers and pitchers, and justifies it the same way I do: that’s how teams pay.  Now, I don’t know that we necessarily want to keep that split historically, and I know that Rally did not.  But it’s not much of a big deal whether you let the pitcher share move between 40% and 45% from year to year, or keep it fixed at something in between.

As an aside: I have it closer to 57/43, but, like I said, that’s no big deal. I’m not married to this number.

7. I’m not sure if those are Rally’s or Forman’s league adjustments.  I personally use 5 runs as the gap, and since Forman is using 4, I have no issues.  It’s cool to see it historically.

This took longer than I thought!  I’ll chime in on the other pages a bit later…


UPDATE 3:

http://www.baseball-reference.com/about/war_explained_pitch.shtml

8. Sean is maintaining the runs-based version of WAR, in stark contrast to the FIP-based version that Fangraphs has.  As I’ve said in the past, you can make reasonable justification for either of the two extremes, which is why I like it that they each have an extreme, so I can just take the midpoint and get on with my life.

The idea AGAINST runs-based version of WAR is as follows: A run allowed is presumed to be the responsibility of the pitcher, and adjustments are ONLY made based on the OVERALL presumed fielding performance of the team OVER THE SEASON.  So, you can have Cliff Lee and Doc Halladay each performing in front of, overall, league-average fielders.  But, if one guy has a .350 BABIP with men on base and the other guy has a .250 BABIP with men on base, and if we think that almost all of that is luck, then that luck will get absorbed SOLELY by the pitcher.

You can, after all, instead START with the idea that the runs are the responsibility of the team, subtract out the things we know the pitcher is responsible for, and assign the rest to the fielders.

It’s a question of how the luck is split.  Runs allowed at its core is equal to the performance of the pitcher + performance of the fielders + timing.  By subtracting out only the performance of the fielders (and, it’s not even their performance ON THOSE PLAYS, but a general overall performance), the pitcher absorbs the timing of all events.

The idea FOR runs-based version of WAR: over time, the timing is going to cancel out, so that if there really is a timing-based impact by the pitcher, then we’ll see it reflected in his runs allowed.  Given a long-enough career, if we can adjust out all the rest, the random variation will be limited, and we’ll be left with the pitcher’s actual performance.  In addition, whereas FIP focuses on HR, BB, HB, SO, a runs-based measure also includes all the little things like SB, PK, etc.

The idea that it doesn’t matter: given a long-enough career, a pitcher’s RA9 and FIP are going to converge anyway.  So, it’s much ado about nothing career-wise.  Season-wise, both sides have a strong case.  Basically, the shorter the time frame, the more FIP makes sense, and the longer the time frame the more RA9 makes sense.

As you can see, I spend more time explaining it than simply choosing the midpoint of the two.  Do yourself a favor, and split the difference and move on. 

9. Good adjustment for the “opposing lineup” (whether or not it includes the pitcher batting).  The natural continuation of that is to look at the ACTUAL opponents!  That’s a lot of work of course.  Just handling the pitcher though is 90% of the payoff for 1% of the work.

10. Decent team-adjustment for fielding.  If you want to make it “better”, split it by infield/outfield, and base it on the GB/FB tendency of the pitcher.  Weaver/Santana get a larger OF adjustment, while Felix/Doc get a larger IF adjustment.  But, it’s not going to matter much.

11. The SP / RP adjustment: if I understand it correctly, a pure starter has his replacement level at X, and a pure reliever is at .865X.  So, if X is 120% of league average, then a reliever’s level is 104% of league average.  That’s fairly decent.  I have it a bit wider.  In my case, it would be X and .833X.  So, if X is 125, then a reliever is at 104.  You can certainly justify it in a few ways, but it’s in the ballpark.

12. He uses the chaining explanation for using Leverage Index, so, that’s good.  Interestingly, he ALSO uses it for starting pitchers, which would be a bit harder to justify.  But, since SP are all so close to each other and to 1.00, it doesn’t matter much.  You do lose the chaining explanation, though, and it instead becomes a regression explanation.

13. Sean does a good job at explaining the difference between BR and Fangraphs, and why he went his way.  Like I said, you can make a convincing case either way.

UPDATE 4:

http://www.baseball-reference.com/about/war_explained_wraa.shtml

14. Alright, glad to see that Sean got that little thing I do with plate appearances (ignoring it in the rate calculation, but including it in the overall calculation).  It’s one of those little things that really doesn’t do much, but it shows attention to detail.

15. He went with the additive run adjustment, rather than the multiplicative one.  I think you can make a good case either way, but absent any strong reason, I’d just as well go with the easy additive one.

16. Good stuff on the SB/CS thing.  He uses the same calculation I use to figure out opportunities, and then scales the SB success rate relative to his SB frequency rate.

17. I skipped over the strikeout calculation (for now).

UPDATE 5:

http://www.baseball-reference.com/about/war_explained.shtml

18. I agree about figuring out the runs per win converter based on including the player in the run environment, and in the proportion to which he played in a game.  I’m pretty sure Fangraphs does the same.

19. Wonderful chart comparing Fangraphs and BR.com

UPDATE 6:

I must have closed one of the tabs, because he explained that the defense portion of WAR in the leaderboards is the fielding part plus the positional part.  So, when you look at SS and RF and 1B, the SS will float to the top, even if you have a guy with “negative” fielding.  It makes far more sense to include the positional adjustment to the fielding portion than not.

That’s all I got!!

As you can see, I really have no real objections to anything, which given my nature when it comes to this stuff, is a huge testament to Sean’s thoroughness.

#1    seank      (see all posts) 2012/05/04 (Fri) @ 22:47

It seems Sean was convinced:

I’ve changed the oWAR and dWAR formulations. oWAR is now called ndWAR for (no-defense WAR), but is the same otherwise. dWAR now contains the position component of WAR, so the Career Leaderboard is now dominated by SS, C and other great defensive players.
(Emphasis mine.)


#2    Tangotiger      (see all posts) 2012/05/04 (Fri) @ 22:53

Yes, I saw that, which is tremendous.  I gotta go to lunch, so I’ll update the main post in the afternoon.


#3    Tangotiger      (see all posts) 2012/05/05 (Sat) @ 00:01

I put my updates in the main blog, currently at UPDATE 2.


#4    dave smyth      (see all posts) 2012/05/05 (Sat) @ 00:23

Of the top 25 seasons in the new single season WAR since 1950, 20 are by pitchers…


#5    Sean Forman      (see all posts) 2012/05/05 (Sat) @ 00:25

Position and replacement is legacy from rally. I’d like to derive it myself at some point but haven’t yet.

Thanks for the feedback and help.


#6    Tangotiger      (see all posts) 2012/05/05 (Sat) @ 00:34

Main entry updated with UPDATE 3 and UPDATE 4.

***

Dave/4: interesting!  Can you compare to Fangraphs?


#7    Tangotiger      (see all posts) 2012/05/05 (Sat) @ 00:48

Last and final update as UPDATE 5.

Any future comments I will make in the comments section.

I’m interested in looking for bias, like Dave found.  If you want to look to see how relievers do, that’d be interesting too.


#8    Colin Wyers      (see all posts) 2012/05/05 (Sat) @ 01:09

Dave Cameron noted on Twitter that some of the notes about Fangraphs’ WAR were inaccurate, but didn’t specify which ones. I sent Sean a list in his format of our WARP; he says he’ll have it up next week.


#9    Sky      (see all posts) 2012/05/05 (Sat) @ 01:16

I have a nitpick on the dWAR and ndWAR in that each treats “defense” with a different definition. The first calls defense fielding + position, the latter just fielding. I’d suggest nfWAR for no-fielding WAR.


#10    Tangotiger      (see all posts) 2012/05/05 (Sat) @ 01:18

I sent an email to Sean/David with the 4 things that I thought were inaccurate about Fangraphs.  I may as well post it here (see below).  If Dave C. has other items, I’d like to see them too.  Note that Item #1 below has since been confirmed by David via a blog post at Fangraphs.

***

In your super-dandy chart here:
http://www.baseball-reference.com/about/war_explained.shtml

1. I think Fangraphs goes to multi-year park factors, and I think based on Patriot’s approach.

2. I think they calculate the runs per win for pitchers based on number of innings per game, so, if Felix averages 7 innings per game, and Mo averages 1 inning per game, then the run environment is 7/18ths Felix, 11/18th [league] average for Felix, and 1 v 17 for Mo.  That is, exactly like you. But, David can confirm what he actually did.

3. For the LI adjustment, my recommendation was to give out 1.0 for all SP, and then .5 + LI/2 for relief pitchers.  So, I think you might want to expand more on your chart about how the LI is done (i.e., actual LI or a “regressed” LI, and if different for SP/RP).  I don’t know exactly what Fangraphs does.

4. Fangraphs provides downloads of all the data.  On all the data pages, it has “Export Data”.  It’s not prepackaged like yours, but it is a one-click available download.


#11    Tangotiger      (see all posts) 2012/05/05 (Sat) @ 01:21

Sky: right.  Whenever I use the terms defense and fielding, I always refer to defense as something akin to overall run prevention, while fielding (and pitching) is a subset of defense.

So, Jeter is a bad fielder at SS, but good overall defense for the Yankees.


#12    Tangotiger      (see all posts) 2012/05/05 (Sat) @ 01:50

In light of Dave/4’s comment about the high WAR for pitchers (or low for nonpitchers), I took the top 25 in pitching WAR since 1993.

This is what I got:
9.8 WAR
241.5 IP
33.0 G
72.3 R
184.5 ERA+

From there we can figure things out.  The league average Runs allowed given 241.5 IP is 72.3 x 1.845 = 133

That means league RA9 is 4.955.  Our base run environment is therefore twice that, or 9.91.

Now, our pitchers are 133 - 72 = 61 runs better than average over 33 games, or 1.85 better than average.

That reduces our run environment down to 9.91-1.85=8.06

The runs per win converter is .75*(8.06+4) = 9.0

Since WAR is 9.8, that implies 9.8 x 9 = 88.2 RAR.

72.3 Runs allowed + 88.2 RAR = 160.5 as replacement level runs allowed

On 241.5 IP, that’s 5.98 as the replacement level.

5.98 replacement level / 4.955 as average level =
1.21

That’s perfectly fine.

So, things look fine at the pitcher level.

Someone want to look at it at the nonpitcher level?
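The chain of arithmetic in this comment can be sketched as one function. This is a rough reconstruction, not Sean’s code: the function and key names are made up, ERA+ is applied to runs allowed exactly as the comment does, and because nothing is rounded along the way the intermediate values differ from the hand calculation in the second decimal.

```python
def implied_replacement_ratio(war, ip, games, runs_allowed, era_plus):
    """Back out the replacement level implied by a pitcher's WAR line."""
    league_runs = runs_allowed * era_plus / 100.0   # league-average runs over the same IP
    league_ra9 = 9.0 * league_runs / ip
    base_env = 2.0 * league_ra9                     # both teams, league-average game
    saved_per_game = (league_runs - runs_allowed) / games
    pitcher_env = base_env - saved_per_game         # run environment with this pitcher in it
    rpw = 0.75 * (pitcher_env + 4.0)                # the runs-per-win converter used in the comment
    rar = war * rpw                                 # runs above replacement
    repl_ra9 = 9.0 * (runs_allowed + rar) / ip
    return {"league_ra9": league_ra9, "rpw": rpw,
            "repl_ra9": repl_ra9, "ratio": repl_ra9 / league_ra9}


if __name__ == "__main__":
    # Average line of the top 25 pitcher seasons quoted in the comment
    print(implied_replacement_ratio(war=9.8, ip=241.5, games=33,
                                    runs_allowed=72.3, era_plus=184.5))
```

With the averages above, the ratio comes out at roughly 1.21, matching the hand calculation.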


#13    Tangotiger      (see all posts) 2012/05/05 (Sat) @ 02:17

Interestingly, Fangraphs’ top 25 in pitching WAR in the same time frame averaged only 9.1 WAR.  This is even though Fangraphs has a lower replacement level.  We should have expected an extra 0.5 WAR because of the lower replacement level.  So, the 9.1 fWAR is equivalent to 8.6 bWAR.  So, that’s 1.2 WAR lower than bWAR.

However, because Fangraphs has less “luck” in its numbers (relying on FIP instead of RA9), that might explain the gap.


#14    Sky      (see all posts) 2012/05/05 (Sat) @ 02:47

I get 6.9 for BPro WARP…


#15    Tangotiger      (see all posts) 2012/05/05 (Sat) @ 03:05

41% of 875 wins is 358.75, and Sean has 358.5 wins for pitchers in 2011.  Rounding error aside, a perfect match.

The issue that David is referring to is almost certainly because of how fielding is being attached.  This is extremely clear in 2011, comparing Doc to Kershaw.

The two of them had virtually the same number of innings.  Kershaw was 28K ahead and 19BB ahead.  Doc gave up 5 fewer HR, but 29 more non-HR hits.

Runs allowed was 65 for one and 64 for the other.

All things considered, a wash really.

But Sean’s new version of WAR has Doc at 10.1 WAR and Kershaw at 6.3 WAR, a 3.8 win difference (or about 35 runs).

That gap of 35 runs is 1.35 runs per 9 innings.  That’s one heckuva park + fielders adjustment.  Not to say it’s wrong, but it deserves some in-depth explaining, I think.

So, I think Dave brings up a legitimate point here, and I’d like to see the calculation for Doc v Kershaw.


#16    Patriot      (see all posts) 2012/05/05 (Sat) @ 04:37

There’s a typo in their RPW formula:

2 * (RunsPerGameForBothTeams)^.215

Surely they meant .715.


#17    .(JavaScript must be enabled to view this email address)      (see all posts) 2012/05/05 (Sat) @ 04:46

Yes, it’s .715.  I double-checked the formulas.


#18    Sky      (see all posts) 2012/05/05 (Sat) @ 04:59

BPro’s 2011 WARP:

Halladay 3.8
Kershaw 6.0


#19    .(JavaScript must be enabled to view this email address)      (see all posts) 2012/05/05 (Sat) @ 05:14

http://tmp.bbref.com/2011_pitchers.html

Here is a detailed rundown of the calc for Halladay and Kershaw.  I’ve double-checked and it fits what I outlined.  Regarding 20 of the top 25 being pitchers: when you adjust the runs-to-wins conversion by what the pitcher does, it has a big impact on how runs get converted to wins.

Halladay has a 7.5 runs to win conversion, while Kemp is like 10.5 when you add in his positive offensive and negative defense performances.

Is that correct?  It unsettles me a bit, but most folks seem to feel that is the best approach.


#20    Matthew Cornwell      (see all posts) 2012/05/05 (Sat) @ 05:43

I’d like to see how Baseball Gauge’s new WAR matches up with this one.  And can we still call it “rWAR”?  Is it enough of a BR creation to call it something different? 😊


#21    Peter Jensen      (see all posts) 2012/05/05 (Sat) @ 07:18

Sean - I’ve looked at your WAR explained link, which is excellent by the way, but I don’t find any explanation as to why the replacement level should be so drastically different for Kershaw and Halladay.  That seems to be much of the difference in their end values.  Could you help me by explaining that calculation in more detail?


#22    Tangotiger      (see all posts) 2012/05/05 (Sat) @ 07:24

Sean,

Thanks for that fantastic presentation.  Basically, a few runs here, a few runs there, and in each case, they *all* broke Doc’s way.  He gets the runs for facing tougher opponents, more for pitching in tougher parks, more for having the bad fielders, even the leverage went his way, etc.  We’re up to like 28 runs, and then, he gets the benefit that each run leads to more wins because of his lower run environment.  And just like that, it’s +3.8 wins.

I mean it’s really really really hard to want to believe this, but, that’s the point of doing all these adjustments, to show all those little things that conspired against Doc from being even better than he was.


#23    Peter Jensen      (see all posts) 2012/05/05 (Sat) @ 07:34

I see now RAreplacement is cumulative of everything before plus the replacement.  I misread it as a replacement valuation alone.


#24    Tangotiger      (see all posts) 2012/05/05 (Sat) @ 07:35

The leverage thing by the way looks mighty excessive.

As I had mentioned somewhere, we can explain why we give a 1.5 for a reliever who has a 2.0 LI: we explain it with chaining.

But, we can’t use that same reasoning for starting pitchers.  So, that huge 1.11 LI he had in 2011 really has a noticeable impact.  (By the way, did you count that as 1.055, or did you keep it at 1.11?)

In any case, I think that it makes more sense not to get into the leverage aspect for starting pitchers. I’m not totally convinced of that, but I am mostly convinced.
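The leverage rule being discussed (full credit of 1.0 for starters, and a reliever’s LI regressed halfway toward 1.0) can be written down as a small sketch. The function name and the `regress_starters` switch are hypothetical; the switch only exists to show what happens if the same halfway regression is applied to a starter.

```python
def leverage_credit(li, is_starter, regress_starters=False):
    """Leverage multiplier under the chaining argument.

    Relievers get .5 + LI/2, i.e. their LI regressed halfway to 1.0.
    Starters get a flat 1.0 unless regress_starters is set, in which case
    they are treated the same way as relievers.
    """
    if is_starter and not regress_starters:
        return 1.0
    return 0.5 + li / 2.0


if __name__ == "__main__":
    print(leverage_credit(2.0, is_starter=False))                         # reliever with a 2.0 LI
    print(leverage_credit(1.11, is_starter=True))                         # starter, flat credit
    print(leverage_credit(1.11, is_starter=True, regress_starters=True))  # starter, regressed
```

A reliever with a 2.0 LI gets 1.5, and a starter’s 1.11 becomes 1.055 if it is regressed rather than set to 1.0.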


#25    KJOK      (see all posts) 2012/05/05 (Sat) @ 08:13

“Halladay has a 7.5 runs to win conversion, while Kemp is like 10.5 when you add in his positive offensive and negative defense performances.

Is that correct?  It unsettles me a bit, but most folks seem to feel that is the best approach.”

This is correct, BUT for older era pitchers like Walter Johnson, I suspect the pitcher/fielder split for pitchers is too high in these older eras?  Seems like it should be on a sliding scale like replacement level is on, with older seasons, that had higher BIP percentages, giving less ‘credit’ to the pitcher.


#26    Peter Jensen      (see all posts) 2012/05/05 (Sat) @ 08:34

For an alternative viewpoint, my DIRVA PLUS pitching metric has Kershaw as 7.5 runs better than Halladay in 2011.  The main reason is how Sean and I have adjusted for the defense playing behind each pitcher.  First, I don’t use all BIP as Sean does; I use all BIP - Errors for both the team defense and the pitcher defense.  My reasoning is that assigning 0 of the error run value to the pitcher is preferable.  Second, I calculate the delta run value of BIP events (minus errors) at both the pitcher and team level.  When Kershaw was pitching, the defense gave up .012 runs per BIP PA less than the Dodger team averaged for all pitchers.  When Halladay was pitching, the defense gave up .016 runs more per BIP PA than the Philadelphia team averaged for all pitchers.  This resulted in a 17.5 run advantage for Kershaw over Halladay on all BIP - error hit balls for my system.  Sean calculates a 9 run advantage for Halladay under his system.


#27    Tangotiger      (see all posts) 2012/05/05 (Sat) @ 08:39

Peter, if Kershaw’s defense gave up fewer runs than expected, and Doc’s defense gave up more runs than expected, are you attributing that to the pitcher, or to the fielders?

Because you say that Kershaw has an “advantage”, but doesn’t Kershaw actually have a BENEFIT of 17.5 runs from his fielders?  And so, you need to adjust that so that Kershaw drops by 17.5 relative to Doc?

Maybe it’s a question of terminology that I’m not following you.


#28    Peter Jensen      (see all posts) 2012/05/05 (Sat) @ 11:26

Tango - My assumption with DIRVA PLUS is that the defensive true talent skills of the other fielders are constant.  If they perform better when Kershaw is pitching then I am attributing the entire increase in performance to something that Kershaw is doing as a pitcher.  That is a simplistic solution, I know.  But it is a compromise between attributing the entire increase in performance to chance, as FIP does, and attributing all the BIP runs to the pitcher.  My compromise gives the fielders the majority of the BIP runs and the pitcher the difference from the team’s other defensive players’ average performance. 

The main conceptual difference between Sean’s metric and mine is that Sean starts by giving all the run differential between the actual runs and expected runs to the pitcher and then makes adjustments to that total by tweaking things like park factor, strength of opponents, and quality of the defense.  I start with nothing and add all the defense independent runs, and the run difference between the average defense and the actual defense behind the pitcher on balls in play minus errors, and none of the runs on errors, to arrive at my estimate of the pitcher’s contribution.


#29    .(JavaScript must be enabled to view this email address)      (see all posts) 2012/05/05 (Sat) @ 18:55

On the run environment and runs to win.  I wonder if we might not be better off regressing that somewhat (don’t have an opinion how much).  It would be acknowledgment that 1) we don’t know how much the pitcher and fielder affect things and 2) I’ve always assumed that extreme performances have a good deal of luck, so the player wasn’t necessarily altering his run environment due to his skill.

thoughts?


#30    Tangotiger      (see all posts) 2012/05/05 (Sat) @ 20:01

I was running some simulations, and I noticed that the runs per win flat-lines to something close to 8.5 runs per win.

Here are my results:

RS	RA	diff	win	winsOver500	RPG	RPW	Tango	Patriot
4.5	8.75	4.25	0.2029	0.2971	13.25	 14.3 	 12.9 	 12.7 
4.5	8.25	3.75	0.2254	0.2746	12.75	 13.7 	 12.6 	 12.3 
4.5	7.75	3.25	0.2508	0.2492	12.25	 13.0 	 12.2 	 12.0 
4.5	7.25	2.75	0.2792	0.2208	11.75	 12.5 	 11.8 	 11.6 
4.5	6.75	2.25	0.3109	0.1891	11.25	 11.9 	 11.4 	 11.3 
4.5	6.25	1.75	0.3461	0.1539	10.75	 11.4 	 11.1 	 10.9 
4.5	5.75	1.25	0.3851	0.1149	10.25	 10.9 	 10.7 	 10.6 
4.5	5.25	0.75	0.4281	0.0719	9.75	 10.4 	 10.3 	 10.2 
4.5	4.75	0.25	0.4750	0.0250	9.25	 10.0 	 9.9 	 9.8 
4.5	4.25	-0.25	0.5260	-0.0260	8.75	 9.6 	 9.6 	 9.4 
4.5	3.75	-0.75	0.5807	-0.0807	8.25	 9.3 	 9.2 	 9.0 
4.5	3.25	-1.25	0.6389	-0.1389	7.75	 9.0 	 8.8 	 8.6 
4.5	2.75	-1.75	0.6998	-0.1998	7.25	 8.8 	 8.4 	 8.2 
4.5	2.25	-2.25	0.7623	-0.2623	6.75	 8.6 	 8.1 	 7.8 
4.5	1.75	-2.75	0.8246	-0.3246	6.25	 8.5 	 7.7 	 7.4 
4.5	1.25	-3.25	0.8845	-0.3845	5.75	 8.5 	 7.3 	 7.0 
4.5	0.75	-3.75	0.9387	-0.4387	5.25	 8.5 	 6.9 	 6.5 
4.5	0.25	-4.25	0.9831	-0.4831	4.75	 8.8 	 6.6 	 6.1

diff is RA-RS
winsOver500 is .500 minus win
RPG is RS+RA

RPW is diff/winsover500

Tango is .75*RPG+3
Patriot is 2*RPG^.715

So, we see that when the gap is within about +/-1.50 runs per game, the Tango estimator does a decent job.

The Patriot estimator also does a similarly decent job.

In light of this, then maybe we need to have something a bit more robust than what we’ve been using.

Maybe someone can run a DiamondMind sim, and confirm say a 4.5 RPG against a 1.25 RPG.
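For anyone who wants to reproduce the last three columns of the table, here is a minimal sketch. The simulated win percentages are taken from the table as inputs (the sim itself is not reproduced), and the function names are made up for illustration.

```python
def actual_rpw(rs, ra, win_pct):
    """Runs per win implied by a simulated win%: run gap over win gap."""
    return (rs - ra) / (win_pct - 0.5)


def tango_rpw(rpg):
    """Linear estimator from the comment: .75 * RPG + 3."""
    return 0.75 * rpg + 3.0


def patriot_rpw(rpg, exponent=0.715):
    """Power estimator from the comment: 2 * RPG^.715."""
    return 2.0 * rpg ** exponent


if __name__ == "__main__":
    # (RS, RA, simulated win%) rows copied from the table above
    rows = [(4.5, 8.75, 0.2029), (4.5, 4.75, 0.4750), (4.5, 1.25, 0.8845)]
    for rs, ra, win in rows:
        rpg = rs + ra
        print(rs, ra, round(actual_rpw(rs, ra, win), 1),
              round(tango_rpw(rpg), 1), round(patriot_rpw(rpg), 1))
```

The sign of the run gap does not matter as long as the win gap uses the same convention, since only the ratio is used.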


#31    Tangotiger      (see all posts) 2012/05/05 (Sat) @ 20:15

I missed the obvious: 4.5 v 0 = 100%.  So, win diff = .500 and run diff = 4.5, so runs per win of 9.

So, it’s going to flatline a bit lower than 2 * R (where R is the higher of the two, RS or RA), and then slowly go back up to 2 * R as the other side approaches zero.

I’ll try out different run environments, and report back results.


#32    Kincaid      (see all posts) 2012/05/05 (Sat) @ 20:42

Is there a reason to use a runs per win converter for pitchers rather than just using PythagenPat to calculate wins more directly?  It seems like the former is just trying to approximate the latter (Patriot’s method is a linear approximation of PythagenPat for a .500 team, for example), and using PythagenPat itself is not really any more complicated.  All you have to do is convert the W% from PythagenPat to wins using IP.  For example, if you have a .780 pitcher and a .380 replacement level over 200 innings, you can just do:

WAR per game = .780 - .380 = .4
games = 200IP * 1G/9IP = 22.2 games
WAR = .4 * 22.2 = 8.9
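The same three steps as a small function, for anyone who wants to plug in other pitchers. The function name is made up, and the winning percentages are taken as given (however they were produced, PythagenPat or otherwise).

```python
def pitcher_war(pitcher_win_pct, replacement_win_pct, innings_pitched):
    """WAR from a per-game winning percentage gap, scaled by innings.

    One "game" is 9 innings pitched, as in the example above.
    """
    games = innings_pitched / 9.0
    return (pitcher_win_pct - replacement_win_pct) * games


if __name__ == "__main__":
    # The .780 pitcher against a .380 replacement level over 200 innings
    print(round(pitcher_war(0.780, 0.380, 200), 1))
```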


#33    Tangotiger      (see all posts) 2012/05/05 (Sat) @ 20:45

Ok, here’s another one:

RS    RA    diff    win    winsOver500    RPG     RPW      Tango      Patriot 
5.5    8.75    3.25    0.2760    0.2240     14.3      14.5      13.7      13.6 
5.5    8.25    2.75    0.3025    0.1975     13.8      13.9      13.3      13.2 
5.5    7.75    2.25    0.3317    0.1683     13.3      13.4      12.9      12.9 
5.5    7.25    1.75    0.3638    0.1362     12.8      12.8      12.6      12.5 
5.5    6.75    1.25    0.3989    0.1011     12.3      12.4      12.2      12.2 
5.5    6.25    0.75    0.4371    0.0629     11.8      11.9      11.8      11.8 
5.5    5.75    0.25    0.4783    0.0217     11.3      11.5      11.4      11.5 
5.5    5.25    -0.25    0.5225    -0.0225     10.8      11.1      11.1      11.1 
5.5    4.75    -0.75    0.5694    -0.0694     10.3      10.8      10.7      10.7 
5.5    4.25    -1.25    0.6189    -0.1189     9.8      10.5      10.3      10.3 
5.5    3.75    -1.75    0.6704    -0.1704     9.3      10.3      9.9      9.9 
5.5    3.25    -2.25    0.7232    -0.2232     8.8      10.1      9.6      9.6 
5.5    2.75    -2.75    0.7762    -0.2762     8.3      10.0      9.2      9.2 
5.5    2.25    -3.25    0.8284    -0.3284     7.8      9.9      8.8      8.8 
5.5    1.75    -3.75    0.8780    -0.3780     7.3      9.9      8.4      8.3 
5.5    1.25    -4.25    0.9230    -0.4230     6.8      10.0      8.1      7.9 
5.5    0.75    -4.75    0.9612    -0.4612     6.3      10.3      7.7      7.5 
5.5    0.25    -5.25    0.9900    -0.4900     5.8      10.7      7.3      7.1

This time, it bottoms out at 9.9.

And this one bottoms out at 6.9:

RS    RA    diff    win    winsOver500    RPG     RPW      Tango      Patriot 
3.5    8.75    5.25    0.1344    0.3656     12.3      14.4      12.2      12.2 
3.5    8.25    4.75    0.1518    0.3482     11.8      13.6      11.8      11.8 
3.5    7.75    4.25    0.1718    0.3282     11.3      12.9      11.4      11.5 
3.5    7.25    3.75    0.1947    0.3053     10.8      12.3      11.1      11.1 
3.5    6.75    3.25    0.2208    0.2792     10.3      11.6      10.7      10.7 
3.5    6.25    2.75    0.2507    0.2493     9.8      11.0      10.3      10.3 
3.5    5.75    2.25    0.2847    0.2153     9.3      10.5      9.9      9.9 
3.5    5.25    1.75    0.3232    0.1768     8.8      9.9      9.6      9.6 
3.5    4.75    1.25    0.3668    0.1332     8.3      9.4      9.2      9.2 
3.5    4.25    0.75    0.4157    0.0843     7.8      8.9      8.8      8.8 
3.5    3.75    0.25    0.4704    0.0296     7.3      8.4      8.4      8.3 
3.5    3.25    -0.25    0.5310    -0.0310     6.8      8.1      8.1      7.9 
3.5    2.75    -0.75    0.5975    -0.0975     6.3      7.7      7.7      7.5 
3.5    2.25    -1.25    0.6693    -0.1693     5.8      7.4      7.3      7.1 
3.5    1.75    -1.75    0.7453    -0.2453     5.3      7.1      6.9      6.6 
3.5    1.25    -2.25    0.8235    -0.3235     4.8      7.0      6.6      6.2 
3.5    0.75    -2.75    0.9003    -0.4003     4.3      6.9      6.2      5.7 
3.5    0.25    -3.25    0.9702    -0.4702     3.8      6.9      5.8      5.2

Ok, so, we have these minimums, keyed by the higher of the two (runs or runs allowed):

3.5: 6.87
4.5: 8.45
5.5: 9.90

Roughly speaking, the minimum is at 1.5 * (R+1).

And when one of runs or runs allowed is 0, it has to naturally be 2 * R.
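A quick check of the rule of thumb against the three minimums listed above. The observed values are copied from this comment; the function names are made up for illustration.

```python
# Observed minimum runs per win from the three sims, keyed by R
OBSERVED_MIN_RPW = {3.5: 6.87, 4.5: 8.45, 5.5: 9.90}


def approx_min_rpw(r):
    """Rule of thumb from the comment: the floor is roughly 1.5 * (R + 1)."""
    return 1.5 * (r + 1.0)


def rpw_at_zero(r):
    """When the other side scores 0, win% is 1.000, so RPW = R / .500 = 2 * R."""
    return 2.0 * r


if __name__ == "__main__":
    for r, observed in sorted(OBSERVED_MIN_RPW.items()):
        print(r, observed, approx_min_rpw(r), rpw_at_zero(r))
```

The rule of thumb runs a little below each observed minimum, by about 0.1 to 0.2 runs per win.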

***

I adjusted Patriot to match closer to Tango, for comparison purposes, to exponent of .721.


#34    Tangotiger      (see all posts) 2012/05/05 (Sat) @ 20:56

Kincaid brings up an excellent point that Patriot’s approximation is based on a run differential of zero.

If you need something better, then the full-blown PythagenPat is what you need.  Indeed, when I re-run my tests above, and compare to the PythagenPat formula, it works out beautifully, with at most a 0.2 implied Runs Per Win difference between actual and the estimate.  And that’s across some pretty wide range of teams.

For example, with a 5.5 v 1.25 team, the simmed win% is .923, and PythagenPat says .926.  In either case, that’s 10 runs per win.  (+4.25 runs per game, compared to +.423 wins or +.426 wins.)

So, I’d strongly recommend that anyone wanting to do this right, to use PythagenPat.

Great point Kincaid!


#35    Patriot      (see all posts) 2012/05/05 (Sat) @ 21:13

I agree with Kincaid—this problem, such as it is, is caused by using a linear RPW formula.  If you’re going to use one such formula, you have to center it at .500, which obviously will cause some distortions at extreme values.  If it didn’t, Pythagenpat wouldn’t have any advantage over linear W% estimators.

The actual Pythagenpat implied RPW values (that is, (R-RA)/(W% - .5)), using Tango’s exponent of .279, for Tango’s 5.5 R/G teams are:


RS      RA      RPW     Pyth RPW
5.50    8.75    14.5    14.4
5.50    8.25    13.9    13.8
5.50    7.75    13.4    13.3
5.50    7.25    12.8    12.8
5.50    6.75    12.4    12.3
5.50    6.25    11.9    11.9
5.50    5.75    11.5    11.5
5.50    5.25    11.1    11.1
5.50    4.75    10.8    10.8
5.50    4.25    10.5    10.5
5.50    3.75    10.3    10.2
5.50    3.25    10.1    10.1
5.50    2.75    10.0    9.9
5.50    2.25    9.9     9.9
5.50    1.75    9.9     9.9
5.50    1.25    10.0    10.0
5.50    0.75    10.3    10.2
5.50    0.25    10.7    10.6

Pythagenpat and the sim are in fundamental agreement about the nature of the relationship.  It’s much easier to just use full-blown Pythagenpat than trying to find a polynomial function that would better approximate RPW at extremes.
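The Pyth RPW column can be regenerated with a few lines of Python. This is a sketch, assuming the .279 exponent named above and the same (R-RA)/(W% - .5) definition; the helper names are made up for illustration.

```python
def pythagenpat_wpct(rs, ra, exponent=0.279):
    """PythagenPat win% from per-game runs scored and allowed."""
    x = (rs + ra) ** exponent
    return rs ** x / (rs ** x + ra ** x)


def implied_rpw(rs, ra, exponent=0.279):
    """Implied runs per win: run differential over wins above .500."""
    return (rs - ra) / (pythagenpat_wpct(rs, ra, exponent) - 0.5)


RS = 5.5
ra_values = [8.75 - 0.5 * i for i in range(18)]  # 8.75 down to 0.25
pyth_rpw = {ra: implied_rpw(RS, ra) for ra in ra_values}

for ra, rpw in pyth_rpw.items():
    print(f"{RS:.2f}  {ra:.2f}  {rpw:5.1f}")
```

The implied RPW bottoms out a little under 10 somewhere around 2 runs allowed and then climbs again as runs allowed approach zero, which is the shape a single linear RPW formula cannot follow.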


#36    Sean Forman      (see all posts) 2012/05/05 (Sat) @ 21:38

To bring this back to my issues 😊

If the league average team is at 4.13/game
And Halladay’s team is at 2.45/game, how do I calculate R/W?

If I’m reading #35 right, then my #’s are way, way too low, which explains the dominance of the pitchers on the leaderboards.


#37    Patriot      (see all posts) 2012/05/05 (Sat) @ 22:00

It’s a little clumsy, but:

x = (4.13+2.45)^.285 = 1.71

W% = 4.13^1.71/(4.13^1.71 + 2.45^1.71) = .709

RPW = (4.13-2.45)/(.709-.5) = 8.04
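The same three steps as a short Python sketch, carrying full precision instead of rounding at each step. The variable names are illustrative; the .285 exponent is the one used in this comment.

```python
RS, RA = 4.13, 2.45  # league-average runs scored vs. Halladay's runs allowed

x = (RS + RA) ** 0.285                 # PythagenPat exponent for this matchup
wpct = RS ** x / (RS ** x + RA ** x)   # Pythagorean win%
rpw = (RS - RA) / (wpct - 0.5)         # implied runs per win

print(round(x, 2), round(wpct, 3), round(rpw, 2))
```

Because nothing is rounded along the way, the implied RPW comes out a hair under the 8.04 shown above, which was computed from the rounded .709.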


#38    Tangotiger      (see all posts) 2012/05/05 (Sat) @ 22:03

We’re suggesting you do PythagenPat.

So, 4.13+2.45 = 6.58

Raise that to .28, and you get 1.694

Going back to Pythag:
(4.13/2.45) ^ 1.694 = 2.42

Win% = 2.42/(2.42+1) = .708

Therefore, he’s +.208 wins above average, which you multiply by his number of games (say it was 34), or +7.1 wins above average.

Add in the replacement level, and say you are at +9 wins or something.
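As a rough sketch of the wins-above-average step in Python: the .28 exponent comes from this comment, the function names are hypothetical, and the 34 games is the figure used in later comments (it is also what reproduces roughly +7.1).

```python
def pythagenpat_wpct(rs, ra, exponent=0.28):
    """PythagenPat win% from per-game runs scored and allowed."""
    x = (rs + ra) ** exponent
    return rs ** x / (rs ** x + ra ** x)


def wins_above_average(rs, ra, games, exponent=0.28):
    """Win% above .500 in the pitcher's games, scaled by games pitched."""
    return (pythagenpat_wpct(rs, ra, exponent) - 0.5) * games


wpct = pythagenpat_wpct(4.13, 2.45)
waa = wins_above_average(4.13, 2.45, games=34)
print(round(wpct, 3), round(waa, 1))
```

Working in win% first and multiplying by games at the end means no separate runs-per-win constant is needed for the pitcher.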


#39    .(JavaScript must be enabled to view this email address)      (see all posts) 2012/05/05 (Sat) @ 22:12

So for a position player he is going to have a lot more games, but way fewer wins above average per game. 

Also for a position player you would use his defense to reduce the opposition’s runs per game and then his offense, position, br, dp get added to his team’s runs per game?

So we could now report a winning percentage for the player which will now give us a rate stat?  I like that.  Halladay on the mound makes the Phillies a .708 team.  Kemp makes the Dodgers a .545 team.

How exactly would the replacement runs to wins work then?  A kludge would be to back out runs per win from the WAA and RAA you get above and then apply that to replacement runs, but I suspect there is a better way.


#40    .(JavaScript must be enabled to view this email address)      (see all posts) 2012/05/05 (Sat) @ 22:22

For Patriot’s RPW I get 9.66 WAR for Halladay, but I’m guessing that the replacement runs should not get the modified RPW, so instead would I just use RPG = 2 * the league avg to set the R/W for the replacement runs, since the replacement runs are for being a league average player?

So for Halladay it would be
54/8.04 + 21.4/9.05 = 6.7 + 2.36 = 9.06

Also for pitchers should replacement be multiplicative or additive based on playing time?  Right now it is multiplicative, but for batters it is obviously additive.  Seems to me that if Halladay and Kershaw pitched the same number of innings their replacement level addon should be the same for both of them since that is based on the average player compared to the replacement level player, though you might argue that Halladay would have thrown more innings if in the same context as Kershaw, for the reasons we outline above.


#41    Tangotiger      (see all posts) 2012/05/05 (Sat) @ 22:25

Actually, Doc is 2.45 RA9, but that’s not the 7/9th thing, right?  So, 7 parts 2.45 and 2 parts 4.13 gives us 2.82 for Doc-games.

That’s a .656 win% for Doc-games, or +5.3 wins above average. 

And then whatever the replacement is, which would be 7/9ths replacement and 2/9th average.  If it’s .410 and .500, for example, then replacement level is .430. 

So, +.07 x 34 = +2.4 wins to add, for a total of 7.7.

As a quick illustration anyway.
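The blending arithmetic in this comment can be sketched in Python as follows. The `blend` helper is hypothetical, and the .656 win% is taken as given from the comment rather than recomputed.

```python
def blend(own, other, share=7 / 9):
    """Weight the starter's rate by his share of the game, the rest at `other`."""
    return share * own + (1 - share) * other


GAMES = 34

doc_ra = blend(2.45, 4.13)        # runs allowed per game in Doc-games
repl_wpct = blend(0.410, 0.500)   # replacement-level win% in those same games
doc_wpct = 0.656                  # taken from the comment, not recomputed here

waa = (doc_wpct - 0.5) * GAMES        # wins above average
addon = (0.5 - repl_wpct) * GAMES     # average over replacement
war = waa + addon
print(round(doc_ra, 2), round(repl_wpct, 3), round(war, 1))
```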


#42    Tangotiger      (see all posts) 2012/05/05 (Sat) @ 23:19

By the way, great job by Dave in being able to spot that something was amiss!  It brings us back to those old days of Baseball Boards (RIP) from a decade ago.


#43    aweb      (see all posts) 2012/05/06 (Sun) @ 00:10

So hopefully I’m reading this right and pitchers are going to get knocked back towards position players in an update? It’s a pretty fundamental change otherwise…


#44    .(JavaScript must be enabled to view this email address)      (see all posts) 2012/05/06 (Sun) @ 01:09

If you look at my chart linked above to get Doc, I’m taking the following for his runs below avg per out.  I’m taking the total for the avg pitcher after all of the modifications.

119.2-65 = 54.2 runs vs. avg.
54.2/701 outs = .077 below avg per Out,
so he is .15447 - .077 =.077 for his innings he is pitching. 

I get 5.0 * .155 + 21.9 * .077 = 2.46 for the team.

Is the .410 above from the .320 overall corresponding to .410 on offense and .410 on defense?

Also the 7/9th and 2/9ths relates to the innings per game he pitches.  How would we handle that for hitters?  Still innings per game?  What about PH’s?

#43/As we are discussing it, the runs above rep will stay the same for everyone, but the runs to wins calculation would change.


#45    Tangotiger      (see all posts) 2012/05/06 (Sun) @ 02:46

Well, the .410 is actually for defense, so yes, .410 for defense and .410 for offense will give you .320 for team.

But you know, I’m going to end up complicating your process if we go down this road.  That’s because .410 for defense is fielding, SP, and RP.

Instead, go back to the way you were doing it, which for SP would be something like 1.20 times the league average.  You had 4.13 RPG for average, so repl is 1.20 times that, or 4.96.

Since Doc pitches 7 of 9 innings, then his replacement is 7/9th of 4.96 and you keep the 2/9th of 4.13 for 4.78. 

So, repl win% is based on RS = 4.13 and RA = 4.78, for a win% of .438.

Doc-games was .656, repl-games is .438, for a difference of .218 wins over repl, times 34 games (or whatever he pitched), for 7.4 WAR.

Something like that…
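Putting the pieces of this comment together, here is one possible end-to-end sketch in Python. It assumes a .28 exponent recomputed for each matchup, a 7/9 share of the game, and a replacement starter at 1.20 times league average; `starter_war` is a made-up name, and because the exponent is recomputed the result will not match the 7.4 above exactly.

```python
def pythagenpat_wpct(rs, ra, exponent=0.28):
    """PythagenPat win% from per-game runs scored and allowed."""
    x = (rs + ra) ** exponent
    return rs ** x / (rs ** x + ra ** x)


def starter_war(ra9, lg_rpg, games, share=7 / 9, repl_mult=1.20):
    """Return (pitcher-game RA, replacement-game RA, wins above replacement)."""
    # The starter covers `share` of the game; the rest is league average.
    pitcher_ra = share * ra9 + (1 - share) * lg_rpg
    repl_ra = share * repl_mult * lg_rpg + (1 - share) * lg_rpg
    # The offense is assumed league average in both cases.
    pitcher_wpct = pythagenpat_wpct(lg_rpg, pitcher_ra)
    repl_wpct = pythagenpat_wpct(lg_rpg, repl_ra)
    return pitcher_ra, repl_ra, (pitcher_wpct - repl_wpct) * games


doc_ra, repl_ra, war = starter_war(ra9=2.45, lg_rpg=4.13, games=34)
print(round(doc_ra, 2), round(repl_ra, 2), round(war, 1))
```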


#46    Tangotiger      (see all posts) 2012/05/06 (Sun) @ 03:02

For nonpitchers, you’d work it the same way, though honestly, you don’t have to.  The RPW converter is going to be so close to whatever PythagenPat says, it’s more trouble than it’s worth.

But, for edification, we can talk about it.  So, Kemp is say +40 runs above average and plays 160 games, so he’s +0.25 runs per game.

The replacement is say -20 runs per 600 PA and Kemp has say 660 PA, then replacement is -22 runs.  Which over 160 games is -.1375 runs per game.

If the base is 4.13, then Kemp-games is 4.38 v 4.13.  (Though I guess with his bad defense, it’s more like 4.48 v 4.23 or something.)

His replacement is 3.99 v 4.13.

Use pythag.

Something like that…
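For edification, the Kemp illustration can be run through PythagenPat in a few lines of Python. This is a sketch under the numbers assumed in the comment (+40 runs, 160 games, -20 runs per 600 PA for replacement, 660 PA) plus an assumed .28 exponent; the variable names are illustrative.

```python
def pythagenpat_wpct(rs, ra, exponent=0.28):
    """PythagenPat win% from per-game runs scored and allowed."""
    x = (rs + ra) ** exponent
    return rs ** x / (rs ** x + ra ** x)


GAMES = 160
LG_RPG = 4.13

kemp_runs = 40.0                  # runs above average, all credited to offense
repl_runs = -20.0 / 600 * 660     # replacement level over 660 PA, about -22

kemp_rpg = LG_RPG + kemp_runs / GAMES   # Kemp-games: 4.38 v 4.13
repl_rpg = LG_RPG + repl_runs / GAMES   # replacement-games: 3.99 v 4.13

kemp_wpct = pythagenpat_wpct(kemp_rpg, LG_RPG)
repl_wpct = pythagenpat_wpct(repl_rpg, LG_RPG)
war = (kemp_wpct - repl_wpct) * GAMES

# Runs-per-win converter implied by the PythagenPat result.
linear_rpw = (kemp_runs - repl_runs) / war
print(round(war, 1), round(linear_rpw, 1))
```

The implied converter comes out in the low 9s, close to what a standard linear runs-per-win figure would give, which is the point about it being more trouble than it's worth for nonpitchers.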


#47    Tangotiger      (see all posts) 2012/05/06 (Sun) @ 20:01

Man, Forman’s getting really nailed:

http://www.baseballthinkfactory.org/newsstand/discussion/war_upgraded_on_baseball_reference_com/

I understand the frustration at getting the run per win conversion wrong at the extreme end for pitching, but Sean will take care of this pretty quickly.

I’ll take the blame for that, because I had never tested it at the very extreme points like we’re seeing in single-season great pitching seasons.  But Dave stepped in extremely quickly to highlight the issue.

Some of the other objections are based on not accepting DRS results, which, well, either you buy it or you don’t.

If you don’t like FIP or UZR you won’t buy Fangraphs.  If you don’t like DRS then you won’t buy BaseballReference.  If you don’t like FRA then you won’t buy BPro.

The best you can do is to justify FIP, UZR, DRS, FRA, and let the user make the choice.


#48    Matthew Cornwell      (see all posts) 2012/05/06 (Sun) @ 20:48

Other than the runs per win stuff (which is getting worked out) and the LI issue for starting pitchers, I think the new implementation is great.  Most of the people from the BTF are just complaining about their favorite player not finishing where they are used to or where they want them to.  That or the dWAR/ndWAR classification issue, which really doesn’t change the value of the player at all.  Nice work, Sean!


#49    Patriot      (see all posts) 2012/05/06 (Sun) @ 21:39

What was surprising to me about that thread is that there was anyone who thought that dWAR as it was defined previously was worth looking at.

The BTF crowd also reacted negatively when ERA+ was briefly changed to Guy’s version.  It just goes to show that there is a very small group of people that actually cares about how these metrics are calculated.  For everyone else, even people inclined to buy into sabermetrics, they are simply a better set of numbers on the back of the baseball card, which should never change once they are established.


#50    Tangotiger      (see all posts) 2012/05/06 (Sun) @ 22:13

I liked what one of the commentators said:

The only people who should be put about by an upgrade in WAR formulas, as several posters above have noted with telegraphic snark, are those who had become addicted to looking at the single number in the WAR column and ending thought at that moment.

That’s a brilliant line.

The idea behind a WAR is that you first establish a framework, and then create your own implementation of that framework.

The WAR framework is what has been adopted by BR and Fangraphs and BPro and Baseball Gauge.  So, that’s great.

They each have their own implementation, their own thoughts on the matter. Now, the horrors, everyone has their own idea as to how to calculate each component.  This scares people.  And the commenter had it right, that you need to THINK.

If you buy into each of the components, THEN you must buy the results.

So, Leverage Index for SP is something that I personally will not buy into.  I just don’t see the point, and it’s certainly not the same point as LI for relief pitchers. 

Sean is trying something new there, so we’ll see how the rest of the saber-crowd reacts to that.

For Dewan v MGL, like I said earlier, and what most readers seem to suggest: split the difference.

Same deal with FIP v RA9: split the difference.


#51    Tangotiger      (see all posts) 2012/05/06 (Sun) @ 22:38

My favorite example was when someone told me that his buddy couldn’t believe how high up Reuschel was in WAR, and that guy decided to disprove it.

So, he went on his merry way, and… he couldn’t do it.  He didn’t have Reuschel quite as high, but he was still very high.

That’s the point here.  You can’t just sniff an answer wrong.  It may indicate it’s wrong, but then you have to go out and offer… gasp… an alternative.  You have to think.  That’s the point here…. to think up a solution if you can, or accept one that’s being offered.


#52    Tangotiger      (see all posts) 2012/05/06 (Sun) @ 23:58

Fangraphs for example has ERA- and FIP-, which is perfect.  It’s got the calculation in the right form (player divided by league).  And, it did a takeoff on the “+” style naming by using the “-”, which basically means “the lower the better”, as opposed to the “+” that means the higher the better.

I don’t think anyone complained about Pedro having an ERA- of 67 as opposed to an ERA+ of 150 (which both mean the same thing).
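A small Python sketch of how the two forms relate. It ignores park adjustments and any other refinements the sites apply, and the function names are made up for illustration.

```python
def era_minus(player_era, league_era):
    """Player divided by league, scaled to 100: lower is better."""
    return 100 * player_era / league_era


def era_plus(player_era, league_era):
    """League divided by player, scaled to 100: higher is better."""
    return 100 * league_era / player_era


def plus_from_minus(minus):
    """The two indexes are reciprocals of each other on the 100 scale."""
    return 10000 / minus


print(round(plus_from_minus(67), 1))
```

An ERA- of 67 converts to an ERA+ of about 149, which is the 150 above after rounding.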


#53    .(JavaScript must be enabled to view this email address)      (see all posts) 2012/05/10 (Thu) @ 04:21

OK, I’ve got this implemented and I’m rerunning everything.  Should be updated later tonight along with an explanation.  Thanks for everyone’s help.  This brings things much more back in line with people’s expectation.


#54    .(JavaScript must be enabled to view this email address)      (see all posts) 2012/05/10 (Thu) @ 10:15

I’ve completed the site rebuild, so these changes are now live.  Here is our methodology.

http://bbref.com/about/war_explained_runs_to_wins.shtml


#55    .(JavaScript must be enabled to view this email address)      (see all posts) 2012/05/10 (Thu) @ 10:16

Here is the blog post announcing the changes along with a list of other changes like converting over to gmLI and also applying LI just to relievers.


#56    NaOH      (see all posts) 2012/05/10 (Thu) @ 10:35

Here’s the post Sean referenced explaining the updated changes:

http://www.sports-reference.com/blog/2012/05/baseball-reference-com-war-explained-converting-runs-to-wins-baseball-reference-com


#57    .(JavaScript must be enabled to view this email address)      (see all posts) 2012/05/10 (Thu) @ 11:03

Thank you for posting it.  I’m a bit bleary eyed after four weeks of thinking about this non-stop.


#58    tangotiger      (see all posts) 2012/05/10 (Thu) @ 20:21

The new version, here’s what I get for the top 25 in WAR pitching, 1993-2011:

8.9 WAR
241.8 IP
32.9 G
72.6 R
182.8 ERA+

The new WAR compares favorably to the Fangraphs top 25 of 9.1 (which, given the different replacement level would be 8.6 using the BR.com repl level).

***

Dropping the leverage component for starting pitchers is the right choice in my opinion (at least, it’s more justifiable to not have it than to have it).


#59    Matthew Cornwell      (see all posts) 2012/05/11 (Fri) @ 00:31

I love the inclusion of WAA!  Any possible way we will see some career WAA leader-boards? 😊


#60    kds      (see all posts) 2012/05/11 (Fri) @ 00:54

I believe that the way GDP runs are calculated misses much of the ability (or lack thereof) to reduce double plays.  “GIDP opportunities are any infield ground balls with a runner on first, less than two outs and at least one out is recorded on the play.  The play must not be scored as a hit as well.”  There are many ways to avoid DPs: hit the ball in the air, take pitches (even a K is likely to be better than a GIDP, a BB much better), have a high BABIP, be a LHB, be fast, maybe others.  The method Sean is using only looks at the last 2 on my list, but the others make a significant difference in DP rate and that benefit is not captured by wOBA, wRAA.  I think that opportunities should be man on first less than 2 outs minus HBP and IBB.  Any way that a batter can lower his DP rate is useful and should be measured for WAR.


#61    Tangotiger      (see all posts) 2012/05/11 (Fri) @ 01:08

kds: I must have missed that, but you are 100% right.  The number of opportunities is counted BEFORE the last pitch, not during or after the last pitch.

A DP opp means that there’s a runner on 1B and less than 2 outs.

Now, perhaps there’s some other details that Sean is handling, by giving out a different run value for GB/FB OVERALL, so that he ends up in the same place anyway.

But, yeah, if you have a hitter that has 100% of his BIP as flyballs with a runner on 1B and less than 2 outs, then he should receive a HUGE boost for not grounding into a DP.  Similarly, a hitter that has 100% of his BIP as grounders, and gets say 30-35% of those as DP (the league average rate), he should receive a HUGE negative for grounding into as many DP as he did.

I’ll ask Sean to comment on this point.


#62    .(JavaScript must be enabled to view this email address)      (see all posts) 2012/05/11 (Fri) @ 01:32

Neil and I went back and forth on how to handle that and what was appropriate.  The way it is now is how Sean Smith had it implemented, but we did discuss doing something like what you are suggesting here.

The logic of the decision sounds weaker to me now as I type it here, but basically our thought was the negative impact of the DP’s were factored into the LWTS for an out, so a player is getting a bit of a DP hit for every out they make already. 

Our thought was that the DP situation was largely one of context and I think you can make a case that the other things we study are largely context free.  Now once you’ve grounded into what could be a DP situation, you can impact whether it is one out or a DP in a context-free way.  In this case sets of potential double plays is a context unto itself.  I’m tired and I’m not sure this is making sense, so this explanation will be largely unsatisfactory.  I’ll ping Rally about it and see what he says.

The ideal way to handle this would probably be to split GB and FB outs and go from there, though I wouldn’t be shocked if the increased number of baserunner advancements for GB’s offset much of the damage caused by GIDP’s.  Perhaps that is the reason.  GB’s and FB’s have the same run value since additional advances offset the GIDP’s, unless you are better or worse at avoiding DP’s on GB’s than the league is.

Since 2010


+------------+------------------+---------------+
| event_type | batted_ball_type | avg(re24_bat) |
+------------+------------------+---------------+
|          2 | F                |   -0.24609684 |
|          2 | G                |   -0.27180080 |
|          2 | L                |   -0.27795821 |
|          2 | P                |   -0.27675507 |
+------------+------------------+---------------+
4 rows in set (0.87 sec)

mysql> select event_type, IF(batted_ball_type='G','ground','air') as type, avg(re24_bat) from pbp where year_game >=2010 and event_type=2 group by type;
+------------+--------+---------------+
| event_type | type   | avg(re24_bat) |
+------------+--------+---------------+
|          2 | air    |   -0.25784706 |
|          2 | ground |   -0.27180080 |
+------------+--------+---------------+


since 1988

+------------+--------+---------------+
| event_type | type   | avg(re24_bat) |
+------------+--------+---------------+
|          2 | air    |   -0.27286721 |
|          2 | ground |   -0.28857360 |
+------------+--------+---------------+
2 rows in set (10.63 sec)

mysql> select event_type, batted_ball_type, avg(re24_bat) from pbp where year_game >=1988 and event_type=2 group by batted_ball_type;
+------------+------------------+---------------+
| event_type | batted_ball_type | avg(re24_bat) |
+------------+------------------+---------------+
|          2 | F                |   -0.25991736 |
|          2 | G                |   -0.28857360 |
|          2 | L                |   -0.29665157 |
|          2 | P                |   -0.29187248 |
+------------+------------------+---------------+


#63    Tangotiger      (see all posts) 2012/05/11 (Fri) @ 01:45

Right, it depends on what the basis is here first:
“by giving out a different run value for GB/FB OVERALL”

And that sounds like what Sean may be doing first.  And then the DP is over-and-above that.

So, a GB is expected to give a certain negative run value, say -.28 runs, and an airball is say -.26 runs.

However, it doesn’t work exactly like that.  You’d have to first give a run value of a GB in DP opps.  So, maybe the average GB in that case is, say, -.42 runs, and the average air ball is say -.25 runs.

THEN, you can look at GB only, and see if a player grounded into more DP’s than expected, given that you do have a GB.

Sean is doing the last paragraph, but it’s not clear if he’s doing the second-to-last paragraph as well.

The best way to test is to simply take the extreme case like I have, and show the theoretical results.  In many respects, it’s just like the “first to third on a single”, where you presume every runner will take it 30% of the time, then you simply compare the player to that baseline figure.  The key point is to figure out the right baseline figure, and in the case of a GB in DP opps, it’s probably going to be something like -.42 runs or something for any GB out in a DP opp.  Then you add in the extra -.46 runs for each DP over and above the league average rate.


#64    .(JavaScript must be enabled to view this email address)      (see all posts) 2012/05/11 (Fri) @ 01:59

I’m not doing any GB/FB discernment for hitters in the system currently.  I guess my point above was that GB and FB outs are across all situations worth almost exactly the same value under what by definition is a league average DP turn rate on ground ball outs in DP situations.  That is what (to me at least) the RE24 above shows.  Air Outs and Ground outs are within .015 runs of each other.

However, if you are Carl Crawford and are never doubled up, your ground balls are more valuable, so you should get credit for avoiding the one situation where GB’s are more negative than FB’s.  Or if you are Prince Fielder and always doubled up when you hit a ground ball, you should get a penalty for that.  Because on average a GO and an AO are the same value.

I’ve asked Rally for his input.  I’m not sure if that is his reasoning or not, though I think it holds up even if it wasn’t.


#65    Tangotiger      (see all posts) 2012/05/11 (Fri) @ 02:10

Except the difference is that a GB out is worth say -.5 runs and a FB out is worth say -.3 runs, with a runner on 1B and less than 2 outs.

Can you show us the delta RE24 in those two cases?

Therefore, if you limit your “opps” to only GB (or GB outs), then a DP is over-and-above that particular baseline.

This way, a player who never hits a GB ever will only have the -.3 run value of a flyball.

And a player who always hits a GB, but never a DP, will also have the same -.3 run value.

That is, those two have to match, that a FB-out-only hitter, and a DP-never hitter, must end up with the same run value on their outs, with a runner on 1B and less than 2 outs.  If your process can show that, then great, it likely ends up working.


#66    kds      (see all posts) 2012/05/11 (Fri) @ 14:48

Some specific examples using rough numbers.  Tango’s table of % likelihood of the 24 base/out states shows that the ones where a GDP is possible are essentially 20% of total PAs.  Apply this to a league/year and we see that GDPs happen in about 10% of opportunities.

If we apply 10% to Jim Rice’s 2060 opportunities, we would expect about 206 GDPs.  He actually had 315, so an excess of 109.  Using .44 runs as the average cost of the 2nd out, we get a number very close to the 46 runs debited to him in the WAR calculation. 

Barry Bonds had over 2340 opportunities, but only 165 GDPs.  We might expect roughly a 30 run benefit, but he is only getting 5 for WAR. 

Gene Tenace, a slow RHB had just under 1000 GDP opportunities.  His GDP total was 76, more than 20 fewer than expected.  His WAR Rdp is -3, when +10 looks more accurate.

Ichiro is very good at avoiding GDPs, 61 in over 1100 opportunities.  But because he hits GBs at a much higher % of PA than normal, Rdp (+46) thinks he has saved over 100 GDPs, when it seems more likely to be about half that.
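These rough estimates can be reproduced with a short Python sketch. The 10% rate and .44 runs come from this comment; the opportunity counts for Bonds, Tenace and Ichiro are round-number stand-ins for "over 2340", "just under 1000" and "over 1100", and the function name is made up.

```python
GDP_RATE = 0.10      # rough league GDP rate per opportunity (from the comment)
RUNS_PER_DP = 0.44   # rough average cost of the second out (from the comment)


def gdp_runs(opportunities, gdp):
    """Runs saved (positive) or cost (negative) versus the league GDP rate."""
    expected = GDP_RATE * opportunities
    return (expected - gdp) * RUNS_PER_DP


# (opportunities, actual GDP); all but Rice use rounded opportunity counts.
players = {
    "Rice": (2060, 315),
    "Bonds": (2340, 165),
    "Tenace": (1000, 76),
    "Ichiro": (1100, 61),
}

estimates = {name: gdp_runs(opps, gdp) for name, (opps, gdp) in players.items()}
for name, runs in estimates.items():
    print(f"{name:7s} {runs:+6.1f}")
```

On this per-opportunity basis, Rice comes out near -48, Bonds near +30, Tenace near +11 and Ichiro near +22, which is the comparison being drawn against the Rdp figures.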


#67    .(JavaScript must be enabled to view this email address)      (see all posts) 2012/05/11 (Fri) @ 23:13

Yes, but Bonds avoids DP’s, but his fly ball outs are much less valuable than the GB’s in non-GIDP situations.

Sean Smith confirmed with me that the reason he did it this way was that if you take all situations together DP sit. non DP, etc, ground ball outs and fly ball outs are remarkably similar in terms of run value. 

Ground balls (lots of base advancement and lots of DP’s)
Fly Balls (much less base advancement and very few DP’s)
Popups (no base advancement and no DPs)
Line Drive outs (little base advancement and a good number of DPs)

Now if we take your Rice example (let’s say that he hits many more GB’s than avg and that when considering GB’s his rate of GIDP in DP situations is lg avg not sure if facts bear that out). 

Yes, he has a lot of GIDP, but we are not crediting him for all of the cases where he scores a runner from third or moves a runner over on a GB, so we would just be docking him for his GIDP, which he’ll do a lot more than a fly ball hitter.  In fact, the value of the extra GB outs would almost exactly offset the cost of the DP’s since he is league average at beating out the backend of the DP.

So unless we are crediting his non-DP GB’s we are overpenalizing him for the DPs.

Now if he is worse or better than league avg at avoiding being the back end of a GIDP (in GBs in DP situations) we should credit or penalize him for that.

Now we could use RE24 to pick up all of the baserunner advancement a hitter adds, but I think that is skirting a bit too close to WPAesque measures, IMO.


#68    Tangotiger      (see all posts) 2012/05/11 (Fri) @ 23:42

Sean, can you confirm this:

... a FB-out-only hitter, and a DP-never hitter, must end up with the same run value on their outs, with a runner on 1B and less than 2 outs.  If your process can show that, then great, it likely ends up working.

Because we can only resolve this issue by seeing it in action, much like we saw Doc v Kershaw.  Otherwise, because of the many moving parts, it’s hard to really see what is going on.


#69    Tangotiger      (see all posts) 2012/05/12 (Sat) @ 00:10

I should point out that it won’t be EXACTLY the same, because some groundouts will move the runner from 1B to 2B, and get the batter out.


#70    .(JavaScript must be enabled to view this email address)      (see all posts) 2012/05/12 (Sat) @ 01:51

No, the system does not do what you are asking about. 

The reason is that you will then undervalue the GB hitter on all of the other situations in the game.  What we are saying is that on average the average GB-only hitter and the average FB-only hitter have the same run value on average for their outs.  The times this isn’t the case is when the GB hitter beats out more or fewer dp balls than the league average.  In that case they are either more or less valuable than the FB only player.

It appears that neither you nor KDS are wrestling with the meat of my argument, which is that on average across all situations a GB out (including any GIDP’s) and an FB out have almost exactly the same run expectation.


#71    Tangotiger      (see all posts) 2012/05/12 (Sat) @ 02:17

Let’s say that the run value of a GB out and a FB out is -.27 runs.  With men on base, it’s greater.  With bases empty, it’s fewer.  With 2 outs, it’s fewer, and with 0 outs, it’s more.  But, we’re going to go ahead and give as a baseline a blanket -.27 runs to all GB outs and FB outs.

Now, let’s say that we have man on 1B and 0 outs.  We know that a single out is worth -.38 runs and two outs is worth -.70 runs. 

We’d have a breakdown like this:
55% of time, FB hit, one out, worth -.38 runs
30% of time, GB hit, one out, worth -.38 runs
15% of time, GB hit, two outs, worth -.70 runs

Overall value of the out: -.43 runs

And we see that 33% of GB are DP.

Say that Ichiro hits 100% GB balls.  And let’s say that only 15% of those are turned into DP.  We have this:
85% of time, GB hit, one out, worth -.38 runs
15% of time, GB hit, two outs, worth -.70 runs

Overall value of the out: -.43 runs

What we care about is the % of DP as a function of ALL outs, and not a % of DP as a function of GB outs.

The way kds describes Sean’s process, Ichiro is going to be a net positive for DP, because the league average has 33% of DP per GB outs, even though Ichiro is at 15% of DP per GB outs.

If kds is correct, then I agree with him.  If Sean on the other hand tells me that in both instances, the GB-not-DP-machine Ichiro and the average player both get the same run value on outs, then I’m ok with whatever Sean is doing.

I mean, I can see it if Sean says that Ichiro gets an extra -.05 for hitting so many GB with a man on 1B and 0 outs, and +.05 for staying away from the DP GIVEN that he’s had so many GB, so that net/net it’s a wash, then fine.
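The two breakdowns above can be checked with a small Python sketch. The -.38 and -.70 run values and the percentages are the hypothetical ones from this comment; `out_profile` is an illustrative helper, not anyone's actual process.

```python
ONE_OUT, TWO_OUTS = -0.38, -0.70  # run values with a man on 1B and 0 outs


def out_profile(outcomes):
    """outcomes: list of (share of outs, batted-ball type, outs recorded).

    Returns (average run value per out event, DP per all outs, DP per GB out).
    """
    value = sum(
        share * (TWO_OUTS if outs == 2 else ONE_OUT)
        for share, _, outs in outcomes
    )
    gb_share = sum(share for share, kind, _ in outcomes if kind == "GB")
    dp_share = sum(share for share, _, outs in outcomes if outs == 2)
    dp_per_gb = dp_share / gb_share if gb_share else 0.0
    return value, dp_share, dp_per_gb


league = [(0.55, "FB", 1), (0.30, "GB", 1), (0.15, "GB", 2)]
ichiro = [(0.85, "GB", 1), (0.15, "GB", 2)]

lg_value, lg_dp_all, lg_dp_gb = out_profile(league)
ich_value, ich_dp_all, ich_dp_gb = out_profile(ichiro)
print(round(lg_value, 2), round(lg_dp_all, 2), round(lg_dp_gb, 2))
print(round(ich_value, 2), round(ich_dp_all, 2), round(ich_dp_gb, 2))
```

Both hitters cost the same per out and have the same DP rate per out; only the DP rate per GB out differs, and that is the rate that would make Ichiro look like a net positive.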


#72    Tangotiger      (see all posts) 2012/05/12 (Sat) @ 02:35

Hmmm… in the other base-out situations, a GB may be a net positive relative to a FB out, like for instance runner on 2B and less than 2 outs.

But, since we treat all those GB and FB outs the same….

So, I think I get where this is coming from.

I don’t know that I necessarily believe or accept it that ultimately it works.  But if we look at it holistically, it might work.

Anyway, if someone else wants to continue, feel free, but I’m spent on this!


#73    kds      (see all posts) 2012/05/13 (Sun) @ 13:52

If we can show that there is very little variation in the value of FB outs among different players, and also GB outs other than DPs, then I would think that the present system probably works well.  If there is significant non-random variation among players of the out values on FBs or GBs - DPs then we need a more detailed mechanism.


#74    Darren      (see all posts) 2012/08/08 (Wed) @ 20:13

Sorry to go back to something 2 months ago. Tango in #46 you said this in regards to Kemp’s R/W.

“If the base is 4.13, then Kemp-games is 4.38 v 4.13.  (Though I guess with his bad defense, it’s more like 4.48 v 4.23 or something.)”

Why does Kemp’s bad defence (say -10 runs) increase his own team’s Runs For?  I would see why it would increase the Runs Against.  Wouldn’t it be 4.38 vs 4.23?


#75    Kincaid      (see all posts) 2012/08/09 (Thu) @ 04:34

I think when Tango/#46 says you have Kemp at +.25 runs per game, that is his overall value (defense included), not just his offensive value.  So Tango’s aside is saying that Kemp’s .25 runs per game of total value really breaks down to adding .35 runs scored to the offense and adding .10 runs allowed to the defense.  His overall value is still .25 runs per game, you just don’t necessarily have to distribute all that value on the offensive side.  4.38 v 4.13 is a generic example of a +.25 player (with +.25 offense and +0 defense), but the actual run impact of a +.25 player can end up being distributed however between runs scored and allowed, as long as the differential stays at +.25.
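To see how little the split matters, here is a minimal Python sketch comparing the two ways of distributing the same +.25 differential. It assumes PythagenPat with a .28 exponent, and the function name is illustrative.

```python
def pythagenpat_wpct(rs, ra, exponent=0.28):
    """PythagenPat win% from per-game runs scored and allowed."""
    x = (rs + ra) ** exponent
    return rs ** x / (rs ** x + ra ** x)


# Same +.25 runs per game, distributed two different ways.
all_offense = pythagenpat_wpct(4.38, 4.13)   # +.25 offense, +0 defense
split_value = pythagenpat_wpct(4.48, 4.23)   # +.35 offense, -.10 defense

print(round(all_offense, 4), round(split_value, 4))
```

The two win percentages differ only slightly, because the second team plays in a marginally higher run environment, where each run buys a little less.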

