Tangotiger Blog

Wednesday, July 07, 2021

Statcast Lab: Distance/Time Model to Taking/Holding Extra Base

By Tangotiger 03:35 PM

We have finally completed the Baserunning and Arm model of taking/holding the extra base. The model is intuitive and matches how a baseball fan processes a play. Let’s take the situation of a batter deciding whether to stretch a single into a double. The runner at contact has a certain number of seconds to reach second base. The fielder at contact needs a certain number of seconds to retrieve the ball and then, based on the distance of the throw, a certain number of seconds to get the ball to second base. The more time it takes the fielder to get the ball in, the higher the probability the runner will try for two (and succeed). The less time it takes the fielder to get the ball in, the lower the probability the runner will try for two (and if he tries, the more likely he will get thrown out).

So suppose the runner needs 8 seconds to get to second base for a particular play. On that same play, let’s say the defense needs 5 seconds to get to the ball, and another 0.75 seconds to release it, another 2 seconds for the ball to travel and another 0.5 seconds to apply the tag. That’s 8.25 seconds. We take 8.25 seconds of fielder time minus 8.0 seconds of runner time, and we have a delta of 0.25 seconds. That’s how much “breathing room” the runner has. Naturally, the runner in question won’t ALWAYS take exactly 8.0 seconds. Sometimes it may be 7.8 or 8.3. Or anything in-between. As for the defense, the fielder might get to the ball quicker than normal or slower than normal. His throw might not reach its peak, and maybe the throw is a bit offline so we need more tag time.
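The arithmetic in that example can be written out as a small sketch. The function and parameter names here are hypothetical, chosen only for illustration; they are not Statcast field names.

```python
# Hypothetical helpers: names are illustrative, not taken from Statcast.
def fielder_time(retrieve, release, flight, tag):
    """Total seconds from contact until the defense can apply the tag."""
    return retrieve + release + flight + tag


def breathing_room(fielder_seconds, runner_seconds):
    """Fielder time minus runner time.

    Positive: the runner has time to spare. Negative: the ball beats him.
    """
    return fielder_seconds - runner_seconds


# Numbers from the example in the text.
defense = fielder_time(retrieve=5.0, release=0.75, flight=2.0, tag=0.5)
delta = breathing_room(defense, runner_seconds=8.0)
print(defense, delta)  # 8.25 0.25
```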

What we have is therefore an S-curve type of probability (a sigmoid function). The more negative, the lower the probability. The more positive, the higher the probability. This is what it looks like for the batter trying for two.

Here we see that when the Fielder Time and the Runner Time match, the runner will try and be successful about one-third of the time. The more buffer time for the runner, the more often he will try and succeed. The blue line is actual data, while the dashed line is the model.
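As a rough sketch of what such a curve could look like, here is a logistic function, which is one common sigmoid. The logistic form and the slope value are assumptions made for illustration; the post does not give the fitted form or coefficients. The intercept is chosen only so that a delta of zero returns one-third, matching the text.

```python
import math

# Intercept set so that p(0) = 1/3, as described in the text: logit(1/3) = ln(0.5).
INTERCEPT = math.log((1 / 3) / (2 / 3))
# ASSUMPTION: hypothetical slope per second of delta; the real value is not published here.
SLOPE = 1.5


def p_take_and_succeed(delta_seconds, slope=SLOPE, intercept=INTERCEPT):
    """Probability the runner tries for the base and makes it, given the time delta.

    delta_seconds = fielder time minus runner time (positive favors the runner).
    """
    return 1.0 / (1.0 + math.exp(-(intercept + slope * delta_seconds)))


for delta in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(f"delta={delta:+.1f}s  p={p_take_and_succeed(delta):.3f}")
```

Because each type of play is said to have its own slope, a different `slope` argument per play category would be the natural way to reuse the same structure.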

We’ve identified six different baserunning / arm categories:

Batter going for two
Batter going for three
Runner going first to third on a single
Runner going first to home on a double
Runner going second to home on a single
Runner going third to home on a sac fly

Each type of play has its own model, though they all follow the same principle. The “slope” of the curve is unique to each kind of play, but the structure of the model is the same.

With each play having a probability, we can now compare each runner to the baseline and figure out how many extra bases they are taking, how many bases they are NOT taking, and how often they are being thrown out. We can combine all that and come up with leaderboards for runners. Here it is since 2016:
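One way to picture the comparison to baseline is to sum, over a runner's plays, the actual result minus the model's probability. This is a minimal sketch under that assumption; the `Play` structure, the outcome labels, and the sample numbers are all hypothetical, and the post does not say how bases and outs are weighted when combined.

```python
from dataclasses import dataclass


@dataclass
class Play:
    p_advance: float  # baseline probability of trying for and taking the extra base
    outcome: str      # hypothetical labels: "advance", "hold", or "out"


def bases_above_baseline(plays):
    """Extra bases actually taken minus the extra bases the baseline expects."""
    return sum((1.0 if p.outcome == "advance" else 0.0) - p.p_advance for p in plays)


def times_thrown_out(plays):
    """Count of plays where the runner tried and was thrown out."""
    return sum(1 for p in plays if p.outcome == "out")


# Made-up plays for one runner.
plays = [
    Play(0.80, "advance"),
    Play(0.30, "hold"),
    Play(0.50, "advance"),
    Play(0.25, "out"),
]
print(bases_above_baseline(plays), times_thrown_out(plays))
```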

Mookie Betts, Kevin Kiermaier, and Billy Hamilton are the best at taking the extra base.

We can also flip it to the other side and look at leaderboards from the OF Arm perspective, crediting the outfielder not only for throwing runners out, but also for holding them to their base. Here’s that leaderboard (more negative is good for the defense). Kevin Kiermaier is the leader, with Betts and Hamilton also making a strong showing.

You know all those things they say are “not in the boxscore”? They’re in the Statcast boxscore, and we’ll be showing the results and shining a spotlight on that “hidden game” of baseball, with Kiermaier its best representative, both as a runner and as a thrower. (We already know that Kiermaier is tremendous as a fielder.) Kiermaier is the kind of player that Statcast does its best to highlight. Eventually, this will make its way to Savant, along with a lot more breakdowns, so you can see it by each category and each season.

And what more can we do? Well, plenty. This for example is an Altuve play where he was thrown out by Arozarena. If you are behind the red line, that’s the nogo line. If you are ahead of the green line, that’s the go line. In this particular play, Altuve was thrown out. And we’d be able to show it, frame by frame, in video mode. We’re entering the top of the 5th.

• Baserunning • Fielding • Statcast

#1 kgfella 2021/07/15 (Thu) @ 08:07

There are a number of assumptions that I assume you are making that weren’t addressed, so let me throw them out to make it explicit.

1) For runners, you need to subset only to plays where the runner is at full effort on the play - to determine that this runner needs 8 seconds (on average) from home to second, you exclude the “easy stand-up” doubles.
2) Same idea for fielders - using only max effort plays and throws.

Now, given that you are using the runner’s own ability for determining the go/no-go line, the metric is really measuring baserunner intelligence more so than baserunner value, in the WAR sense (and same for the fielder). If we wanted to determine how much more value a player is adding from the ability to take extra bases, we should determine the time for the MLB average baserunner (probably split by hitting side) to go from home to second and, using the specific fielder on the play, determine the probability of an average runner making it to second.

On a given play, Billy Hamilton may have an 80% chance of making it to 2nd, whereas Miguel Cabrera may have a 15% chance. As I understand it, this metric is going to credit Hamilton with 20% added value of taking the extra base (assuming he goes for it and is successful). Miguel Cabrera would get 85% added value if he were to go and be successful. This is another assumption (not explicit in the article) about how you are tallying value that I am assuming. As I said above, this would be a good metric for measuring baserunner intelligence perhaps, but not actual value added. If Hamilton and Miggy both went for it and were successful, in a value metric they should both be credited with the same value added - they both made it to second base on an identical play; same batted ball, same fielder, same park (same batter handedness also let’s say).
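To make the distinction concrete, here is a small sketch of the two ways of tallying credit on a successful advance: against the runner's own probability versus against one shared baseline. The 80% and 15% figures come from the example above; the shared 50% baseline is a made-up number for illustration only.

```python
def credit(success, baseline_probability):
    """Credit for a play: actual result (1 or 0) minus the expected result."""
    return (1.0 if success else 0.0) - baseline_probability


# Runner-specific baselines, as in the example above.
hamilton_own = credit(True, 0.80)
cabrera_own = credit(True, 0.15)

# ASSUMPTION: a single league-average baseline of 50% for this play (hypothetical).
LEAGUE_AVERAGE_P = 0.50
hamilton_shared = credit(True, LEAGUE_AVERAGE_P)
cabrera_shared = credit(True, LEAGUE_AVERAGE_P)

print(hamilton_own, cabrera_own)        # different credit for the same result
print(hamilton_shared, cabrera_shared)  # identical credit for the same result
```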

Similarly for the fielder, we should not use his own running/throwing abilities to determine the go/no-go line for the runner, we should use the values of the average fielder at his position.

If I’ve misunderstood then my apologies. I wonder how much the leaderboard would change, it may not make that much of a difference.

#2 Tangotiger 2021/07/17 (Sat) @ 20:47

Let me address a few of these in separate posts.

I’m not taking the actual speed of each runner on each play. Based on where the runner hits the ball, we have a presumed amount of time for the runner to get to the base. So, it doesn’t matter how hard or not he tries, the probability remains the same.

At least for that runner’s baseline

#3 Tangotiger 2021/07/17 (Sat) @ 20:48

For the go/nogo, that’s a separate metric, and that uses the actual conditions on the ground. So if a batter like Rickey decides not to go all-out, then his nogo line will appear pretty quickly.

#4 Tangotiger 2021/07/17 (Sat) @ 20:50

Yes, we can have separate metrics based on whether we compare to the typical runner, or to the speed of that particular runner. That’s a good way to isolate that runner. My go-to example is Trea Turner, who should rank much, much higher, but he doesn’t. Why? Because he’s not going all out like Kevin Kiermaier.

Yadi Molina is another good example who has far more value than you’d expect of someone of his speed. Shows he’s really smart and aware. But even so, because of his limited speed, Trea Turner is a much more valuable runner than Molina.

#5 Tangotiger 2021/07/17 (Sat) @ 20:50

Let me know if this addresses all your issues, thank you

#6 kgfella 2021/07/20 (Tue) @ 17:28

Post #2 makes a good clarification, but the go/no-go line is based on the batter/runner, so I think my larger point still stands (the Hamilton vs Cabrera paragraph from my original post).

Trea Turner ranks below Kiermaier because he doesn’t choose to be as aggressive *given his own speed* (and therefore his own personal go/no-go cutoffs) as compared to Kiermaier. But is Kiermaier actually generating more value based on the bases he has taken? It seems the metric you have described answers the first question rather than the second. Which is perfectly fine as long as it’s understood. This is a measure of successful aggressiveness, perhaps we can call it.

The final way I will use to illustrate my point is a different hypothetical. Let’s say Kiermaier and Turner are runners on 100 identical balls in play, and each takes exactly the same number and kind of extra bases on these balls in play. I believe your metric is going to say Kiermaier was more valuable, because his personal go/no-go cutoffs were more difficult than Turner’s, because he doesn’t have Turner’s speed. In a WAR sense, their value was equal, we’d all agree. In the successful aggression sense (this metric), Kiermaier was superior.

Thanks for engaging with my question.

#7 Tangotiger 2021/07/21 (Wed) @ 11:03

I think I may have confused things.

The go/nogo is its own thing. We can measure things relative to a player’s own speed. I am not showing those results.

What I am showing is that, regardless of speed of the runner, what are the outcomes. If Molina and Turner both take the same number of bases and hold the same number of times and are thrown out the same number of times on identically hit and fielded balls, they will get the same value.

We can break down that value as:
Turner
+30 for his speed
-20 for his lack of awareness or hustle
===
+10

Molina
-20 for his speed
+30 for being aware or hustling
===
+10

In the end, from the perspective of winning games, they are identical.

They just got there in different ways.

#8 Tangotiger 2021/07/21 (Wed) @ 11:05

(Above is for illustration purposes only.)

#9 kgfella 2021/07/22 (Thu) @ 17:54

Thanks, Tango, I think now I am on the same page as you. Your metric is showing the +10 (illustration purposes example), which we could decompose into its components if we wanted to.

Good, the +10 should be a better estimate than the current estimates of baserunning runs, and the value breakdown is fun and intuitive. I like it 😊
