THE BOOK--Playing The Percentages In Baseball

Sunday, August 23, 2009

How can we tell if a pitcher is any good?

I was reading the thread on BBTF about David Cameron’s article comparing Smoltz and Jered Weaver (who is a very good pitcher - we think).  By the way, “we think” is my standard disclaimer on all qualitative comments on players.  That is because we really don’t know for sure how good or bad any player is - we can only make an educated guess with some level of certainty - and that level of certainty has a level of certainty as well.  But that is not the point of this post.

David basically argues that any pitcher, good or bad, can have a run of bad performance just like Smoltz did.  He used Jered Weaver as an example.  Of course he is right.  While any bad spate of performance suggests that a pitcher is bad, we have to add that bad spate to all of his previous performance and put in the appropriate weightings and age adjustments to make any sense of it.  The difference between Weaver and Smoltz is that Weaver’s bad spate was a blip, and thus not much of a factor, in a healthy 3 or 4 year career, whereas Smoltz’ bad spate is everything he has done since serious arm surgery, and he is a 42 year old pitcher, albeit a formerly great one.

Still, we really don’t know how much that bad - OK, terrible - spate of performance tells us about his future performance.  It is only 40 innings or so, not enough of a sample to have much meaning, even if the performance is spectacularly good or bad.  Some meaning, yes, but definitely not a lot.

So what do we have to work with in order to make an educated guess as to his, or any pitcher’s, future performance or true talent at the present time?  In the case of Smoltz, we have other information, or priors in a Bayesian sense.  The problem is that we are going to have a hard time quantifying those priors for obvious reasons.  One prior is that he was, before his most recent injuries and surgeries, a great, great pitcher, both as a reliever and starter.  We also know that he is old and has had major surgery.  Certainly those go into the equation, but we don’t really know what to do with them.
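
One common way to combine a small sample with a prior is a simple weighted average, where the prior counts as some number of "phantom" innings. The sketch below assumes that approach; the function name, the run rates, and the 200-inning prior weight are all made up for illustration and are not Smoltz' actual numbers or a recommended weighting.

```python
def regress_to_prior(observed_rate, sample_ip, prior_rate, prior_weight_ip):
    """Weighted average of an observed rate and a prior, weighted by innings."""
    total_ip = sample_ip + prior_weight_ip
    return (observed_rate * sample_ip + prior_rate * prior_weight_ip) / total_ip


# Hypothetical inputs: 8.00 runs per 9 in 40 IP, and a prior of 4.50 runs per 9
# that is treated as if it were worth 200 IP of evidence (an assumption).
estimate = regress_to_prior(observed_rate=8.00, sample_ip=40,
                            prior_rate=4.50, prior_weight_ip=200)
print(round(estimate, 2))
```

The hard part, as noted above, is not the arithmetic but choosing the prior and its weight for an old pitcher coming off major surgery.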

We also know more about his 40 IP performance than just his ERA, or even his OPS against, FIP, etc.  David pointed out that he still has a 91 mph fastball and an 85 mph slider and splitter, and he said that this is evidence that he is still capable of “getting major league hitters out,” which is a euphemism for not being a terrible pitcher - maybe better than replacement level if we had to quantify that term.

Someone on the BBTF thread, which was mostly silly, as many of their threads are, pointed out that there are hundreds of AA and AAA pitchers (and probably lower minor league, high school and college pitchers) who have 91 and 85 mph pitches, like Smoltz.  He is right.  Actually both are right.  The commenter is right that having a 91 mph fastball and an 85 mph slider and splitter is by no means 100% or even 90% indicative that a pitcher, even a HOF one, is capable of pitching effectively in the majors.  Dave is being reasonable in pointing out that we are not talking about a HOF pitcher who had surgery and now pitches at 87 or 88 like some pitchers.  The fact that he can throw 91 shows some promise and potential.

Now, a lot of people on the thread kept saying things like, “Yeah, but if you saw him pitch, you would see that despite his 91 mph fastball and 85 mph slider and splitter, he could not locate them and he was getting hammered, especially by lefties.”  They also said, “Yeah, he looked fine the first time through the order, but he got hammered after that.”

The problem with that argument is that ANY pitcher can look bad as far as his location goes and can get hammered in 1 or 5 or even 10 games, which was David’s original point in comparing Smoltz to Weaver.

Now, the people bringing up how he looked and how he got hammered and how his location was poor are making somewhat of a legitimate point.  There are two levels on which we can look at a pitcher’s performance.  One, the numbers only:  ERA, WHIP, K, BB, OPS against, whatever.  If those are bad in, say, 40 IP, we can say something like, “There is a 10% chance that this guy is a terrible pitcher,” not knowing anything else about him.  Two, we can also look at his location and whether he got hammered or not, independent of his numerical results.  If those things are bad or even worse than the numbers, we might say that there is a 12% or 15% chance that he is a terrible pitcher.  If those things are not so bad, there might be only a 3% chance that he is a terrible pitcher.  (You can see how getting hammered or not actually trumps the numbers to some extent, but by no means is getting hammered indicative of anything in the short term.)
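
The move from 10% to roughly 15% or 3% can be written as a Bayesian update in odds form. This is only a sketch: the likelihood ratios of 1.5 and 0.3 are invented so that the results land near the percentages in the paragraph above, and in practice nobody knows what those ratios are for "he looked bad" or "he looked fine."

```python
def update_probability(prior, likelihood_ratio):
    """Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


numbers_only = 0.10  # chance he is terrible, from the 40 IP stat line alone

# Hypothetical likelihood ratios for what the observers saw.
looked_bad = update_probability(numbers_only, 1.5)   # about 0.14
looked_fine = update_probability(numbers_only, 0.3)  # about 0.03
print(round(looked_bad, 3), round(looked_fine, 3))
```

Because a good pitcher can also look bad over a handful of games, the likelihood ratio for "looked bad" should sit fairly close to 1, which is why the update is small.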

So, for Smoltz or anyone else for whom we only have 40 IP of data, while the observation of location and “hammerness” is interesting and important to the evaluation and can add to or even trump the numbers, it has its limitations because a good pitcher can get hammered or have bad location in any given number of games or innings.  Look at Weaver in the time period that David describes in his article and look at Sabathia the first few weeks of this season (he got absolutely hammered I think). You can probably look at 3 or 4 or 5 game stretches of just about any pitcher and find them getting hammered and having poor location skills.
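
To get a rough feel for how often a decent pitcher has an ugly stretch, here is a back-of-the-envelope calculation. It assumes runs allowed over a stretch follow a Poisson distribution, which is a simplification of mine: real run scoring is clumpier than that, so this probably understates the chance of a bad stretch. The 4.00 runs per 9 talent level and the 27-run threshold are hypothetical.

```python
import math


def prob_at_least(runs, true_ra9, innings):
    """P(allowing at least `runs`) if runs were Poisson with mean true_ra9 * innings / 9."""
    mean = true_ra9 * innings / 9.0
    below = sum(math.exp(-mean) * mean ** k / math.factorial(k)
                for k in range(runs))
    return 1.0 - below


# Hypothetical: a true 4.00 RA/9 pitcher allowing 27+ runs in 40 IP (6.08 RA/9 or worse).
bad_stretch = prob_at_least(runs=27, true_ra9=4.00, innings=40)
print(round(bad_stretch, 3))
```

That is the chance for one particular 40 IP window. A full season contains many overlapping windows, so the chance of finding at least one such stretch somewhere is a good deal higher.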

So that brings us back to the title of this post. How the heck can we tell how good a pitcher is?  We know that pitchers are difficult to project statistically for various reasons.  I’ll also say - and people may disagree with me - that teams and their scouts and other evaluators do a horrible job at evaluating pitchers, other than the obvious ones.  Anecdotal evidence of that is the fact that teams routinely cut and sign what turn out to be terrible washed up pitchers.  If it were easy to scout a pitcher to see whether they were still effective or not, pitchers like Bruce Chen, Odalis Perez, Sidney Ponson, the bad Weaver, etc., would not bounce around from team to team, in and out of retirement, and from the majors to the minors and back again.  But that is also another story.

So what can we (anyone) do to evaluate pitchers in the long and short runs?  We can use the “numbers” but we will run into two problems:  One, for young pitchers we don’t have enough data to make these evaluations with much certainty.  Two, even when we have a lot of numbers, we find that pitching projections are not that reliable, depending on your definition of “not that” of course.  Personally, I put a lot of time and effort into pitching projections by the numbers and I get very frustrated each year when dozens of pitchers seem to belie their projections and many of them go on to do that for many more years, as if their true talent drastically changed from one year to the next (maybe it did and maybe it didn’t).  For example, all of a sudden Edwin Jackson and Jason Marquis are Cy Young and Rich Harden and Ervin Santana are Cy Espstein.  Of course, injuries play a large role in the uncertainty and difficulty in pitcher projections but they are an integral part of the game.

So what else can we do?  As I’ve said many times, I watch as many games as anyone, and I kind of specialize in pitching analysis.  I can tell you that almost ANY major league pitcher has either very good stuff or very good command or both.  However, on any given day, it is amazing how lousy or good a pitcher’s command can be and even how good or bad his stuff can look (especially when the command is there or not there).  For example, if guys with fairly average stuff happen to hit their locations for a game or two, they can look like Greg Maddux and absolutely dominate their opponents.  You would think they were great, great pitchers even if you recognized that they didn’t have great stuff, although I can tell you that command can go a long way in making it look like a pitcher has good or bad stuff.  Jason Marquis and Jeff Weaver are great examples of this in the 06 post-season. Both pitchers looked like Cy Young in the post-season, yet Jason Marquis, until this year, was a bad pitcher, and Jeff Weaver had already imploded before that post-season. 

On the other hand, a pitcher with great stuff can look like absolute crap when he is not in command of those pitches. And some days he can have great command and other days he will have lousy command. 

Now, you may be saying to yourself, “Well if a pitcher has great stuff he has more of a chance to be a good pitcher and if he doesn’t, he has less of a chance, so one of the first things we can do with a young or even an old pitcher is to evaluate his stuff. And the quality of a pitcher’s stuff should not fluctuate all that much, all the more reason to use it to evaluate that pitcher.”  There is some merit in that, and of course that is one thing that scouts do - to a fault I think.  Here are some of the problems with that.  The obvious one is that lots of pitchers have good or great stuff, but good pitching is more than that, as we all know.  So maybe you have solved 10% or 20% of the mystery by evaluating a pitcher’s stuff.  That still leaves a lot.

Perhaps more important, though: what constitutes good stuff?  A scout may drool at a 96 mph fastball, but as we know, there are 96 mph fastballs and 96 mph fastballs.  IOW, some end up being quality pitches and some don’t, even independent of command.  And command plays a large role in how effective a pitch is, not only when that pitch is thrown, but in the grand scale of pitching - the less command you have of your pitches, generally the more predictable you will be.  For example, if you cannot throw your off-speed pitch in the strike zone with any regularity, you will be forced to throw fastballs in fastball counts, and even that 96 mph fastball, especially if THAT is not commanded well, is going to get hammered in fastball counts.

So how do we tell whether a pitcher has good or great stuff?  Scouts will say, “Just watch him and see what he throws and look at the movement, velocity, and command.”  I say BS!  If it were even close to that easy, teams would know who was a good pitcher and who wasn’t, which they clearly don’t.

Why is that? For several reasons:  One, the eye cannot see the exact movement of a pitch.  Two, the eye cannot see the deceptive element of a pitch, which is important to its effectiveness.  Three, the effectiveness of a pitch is partially based on when it is thrown and the other pitches that are thrown and when, and all of that is complicated.  Plus, as I keep saying, command is such a critical part of the equation, and one, a scout cannot necessarily quantify command with any precision from watching a pitcher, and two, his command is going to fluctuate a lot from session to session and from game to game.

Now, of course a lot of those things that the scout cannot see, or cannot measure with any precision, can be gotten from the data - like the pitch f/x data.  And as I have said from its inception, we have not even scratched the surface as far as using it to evaluate pitchers and pitching in general.  But the problem is that this data, too, is subject to fluctuation and sample size error.  So we are back where we started when we just used numbers like ERA, ERC, OPS against, tRA, etc.  Sample size.

To conclude this already too long post, this is what we (teams or anyone that wants to evaluate pitchers) need to do with the pitch f/x data:  We start with velocity.  Smoltz has a 91 mph fastball.  OK, for 91 mph fastballs, what kind of movement is necessary for it to be effective?  Obviously we need to set arbitrary boundaries for effective versus not effective.  OK, given a certain movement, what kind of location and/or command does it need to be effective? 

Now we compare Smoltz’ fastball to those baselines.  For example, let’s say that we find that a 91 mph fastball is pretty good, but only if it moves at least X horizontally and Y vertically.  Does Smoltz’ meet that requirement?  No?  Then he is in trouble.  Let’s say that it does.  Now, we’ll look at all 91 mph fastballs with similar movement and we’ll see what kind of command and location is necessary for them to be effective.  Then we will compare that to Smoltz’.  We’ll do the same for his other pitches.  Then we’ll look at how often he throws the various pitches in the various counts and we’ll do a similar analysis.  I know, this gets really complicated, but I think it is doable to some extent.
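
The first step of that comparison, finding the league's results on pitches that resemble the one in question, could look something like the sketch below. Everything here is an assumption for illustration: the `Pitch` fields, the tolerance windows of 1 mph and 2 inches, and the four toy pitches are invented, and this is not the actual pitch f/x schema.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Pitch:
    velocity_mph: float
    horizontal_break_in: float
    vertical_break_in: float
    run_value: float  # linear-weights runs; negative favors the pitcher


def comparable_run_value(pitches, velocity, h_break, v_break,
                         velo_tol=1.0, break_tol=2.0):
    """Mean run value of pitches near the given velocity and movement, or None."""
    matches = [p.run_value for p in pitches
               if abs(p.velocity_mph - velocity) <= velo_tol
               and abs(p.horizontal_break_in - h_break) <= break_tol
               and abs(p.vertical_break_in - v_break) <= break_tol]
    return mean(matches) if matches else None


# Toy league sample (made-up values).
league = [
    Pitch(91.0, -6.0, 9.0, -0.02),
    Pitch(91.5, -5.0, 10.0, -0.01),
    Pitch(90.8, -1.0, 4.0, 0.03),   # similar velocity, much less movement
    Pitch(96.0, -6.0, 9.0, -0.03),  # similar movement, much more velocity
]
baseline = comparable_run_value(league, velocity=91.0, h_break=-6.0, v_break=9.0)
print(baseline)
```

With real data the same filter would be repeated by location and by count, and each bucket gets thin quickly, which is the sample size problem all over again.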

The one problem we are going to run into is the deception.  I firmly believe that a big part of any pitch’s effectiveness is its deceptive nature by virtue of the pitcher’s motion and release point, which may or may not be observable or measurable.  If not, we have to rely on the actual effectiveness of each pitch in each location.  For example, if the average 91 mph fastball with the same movement as Smoltz’ in a 2-2 count has a lwts value of zero runs, and Smoltz’ is negative (good for him), then we might infer that he has some deception going for him that is better than the average ML pitcher.  Of course, we have to control for how often he throws his other pitches.  For example, if the average pitcher throws that same 91 mph fastball 50% of the time in that count and Smoltz throws it 40%, he is probably going to get a better result on that pitch.
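
Here is a toy version of that 2-2 count comparison. It assumes each pitch is a `(count, pitch_type, run_value)` tuple and that the league sample has already been narrowed to pitchers whose fastball matches in velocity and movement; the pitch codes and run values are invented. Note that it only reports usage next to the run-value gap. It does not actually adjust one for the other, which would take a real model.

```python
def summarize(pitches, pitch_type, count):
    """Return (usage rate, mean run value) for pitch_type in a count, or None."""
    in_count = [p for p in pitches if p[0] == count]
    values = [p[2] for p in in_count if p[1] == pitch_type]
    if not values:
        return None
    return len(values) / len(in_count), sum(values) / len(values)


# Made-up data: (count, pitch type, lwts run value; negative favors the pitcher).
pitcher = [("2-2", "FF", -0.05), ("2-2", "FF", -0.03), ("2-2", "SL", 0.01),
           ("2-2", "SL", -0.02), ("2-2", "FS", 0.00)]
league = [("2-2", "FF", 0.02), ("2-2", "FF", -0.02),
          ("2-2", "SL", 0.01), ("2-2", "CH", -0.01)]

pitcher_usage, pitcher_value = summarize(pitcher, "FF", "2-2")
league_usage, league_value = summarize(league, "FF", "2-2")

residual = pitcher_value - league_value   # negative: better than comparable pitches
usage_gap = pitcher_usage - league_usage  # negative: thrown less often than average
print(round(residual, 3), round(usage_gap, 3))
```

A negative residual together with a negative usage gap is the ambiguous case described above: some of the edge may be deception, and some may just be that hitters see the pitch less often.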

So it is a complicated process to be able to evaluate pitchers, and I think we have a long, long way to go as compared to what we (as analysts) and teams (scouts) are doing right now, including a combination of the two (scouts and stats).  A long way to go.  And I think that there is the potential for great strides in the next 10 years or so, owing in part to the availability of data like pitch f/x.  I also think that when the breakthroughs come, they will likely come from the sabermetric community and that the baseball world - the teams - will lag by 5 years or so (some teams more)...
