In my last post I mentioned the idea that the umpire's call is heavily influenced by the location of the previous pitch, which sets a sort of reference frame for his strike zone on the next pitch. Let's look at this idea in more detail today.
The pitch sequence effect is easiest to observe when we focus on a small subset of the strike zone near one corner, for two reasons. One, if we look at the entire strike zone, our model has to be far more complicated to accurately map the probability that a pitch is called a strike. Two, the influence of the last pitch differs depending on where the current pitch is located in the zone. Specifically, a down-and-away pitch will look more like a strike if the pitch before it was even further down-and-away, and less like a strike if the pitch before it was closer to up-and-in.
We will therefore drill down to what is probably the most important region of the strike zone for this study, the area down-and-away near the corner. Pitchers target this area more than any other, and hitters are less likely to swing at pitches here than in other parts of the strike zone, which makes the umpire's call most influential. We select only marginal pitches located within a 0.4-foot circle centered on a point 0.8 feet off the center of the plate and 0.2 feet above the theoretical bottom of the strike zone:
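As a rough sketch of this selection rule (not the code used for the study), the filter below keeps a pitch only if it falls inside the circle. It assumes that 0.4 feet is the circle's radius, that horizontal location has already been flipped for batter handedness so that positive values mean away from the hitter, and that each pitch carries its own strike-zone bottom. The names `px_away`, `pz`, and `sz_bot` are hypothetical.

```python
import math

# Assumptions for illustration only: 0.4 ft is treated as the radius, and
# px_away is horizontal location in feet from the center of the plate,
# with positive values pointing away from the hitter.
RADIUS_FT = 0.4
CENTER_AWAY_FT = 0.8          # 0.8 ft off the center of the plate
CENTER_ABOVE_BOTTOM_FT = 0.2  # 0.2 ft above the bottom of the zone


def in_low_away_circle(px_away, pz, sz_bot):
    """Return True if the pitch lies inside the low-and-away circle.

    px_away: horizontal location (ft), positive = away from the hitter
    pz:      height of the pitch at the plate (ft)
    sz_bot:  bottom of this hitter's strike zone (ft)
    """
    dx = px_away - CENTER_AWAY_FT
    dz = (pz - sz_bot) - CENTER_ABOVE_BOTTOM_FT
    return math.hypot(dx, dz) <= RADIUS_FT


pitches = [
    {"px_away": 0.9, "pz": 1.8, "sz_bot": 1.5},  # near the corner
    {"px_away": 0.0, "pz": 2.5, "sz_bot": 1.5},  # middle of the zone
]
selected = [p for p in pitches if in_low_away_circle(**p)]
```

Only the first of the two example pitches survives the filter; the second is in the middle of the zone and would not be a marginal call.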
With our collection of pitches defined, we can then estimate the probability a pitch will be called a strike, based on the count, the speed of the pitch, and the locations of the current and previous pitch. As in the last study, we exclude non-regulars (pitchers with fewer than 150 PA in other games in our sample, and hitters with fewer than 200 PA), as well as plate appearances with bunts, and the counts 0-0, 0-2, 1-2, and 2-2. We use a logistic regression for our strike probability model. As the reader may know, the coefficients of a logistic regression are log-odds, usually reported as odds ratios, which are somewhat difficult to interpret. For the reader's convenience, we also convert these odds ratios to changes in probability, assuming the probability the pitch is called a strike is near 50%.
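To make the conversion concrete, here is a minimal sketch of one way to turn an odds ratio into a change in probability for a pitch that starts at 50/50. This is an assumption about the method, not a reproduction of the study's code, and the function names are hypothetical.

```python
import math


def sigmoid(log_odds):
    """Map log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


def prob_change_at_even_odds(odds_ratio):
    """Change in strike probability for a pitch that is otherwise 50/50.

    A 50% baseline is 0 in log-odds, so the shifted log-odds is
    simply log(odds_ratio).
    """
    return sigmoid(math.log(odds_ratio)) - 0.5


# An odds ratio of 1 leaves the call at 50/50; an odds ratio of 3
# moves it from 50% to 75%, a change of +25 percentage points.
no_effect = prob_change_at_even_odds(1.0)
tripled_odds = prob_change_at_even_odds(3.0)
```

For baselines far from 50%, the same odds ratio produces a smaller change in probability, which is why the 50% assumption matters when reading the converted numbers.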
We use a third-degree polynomial to map the influence of each pitch's current x-y location on the called strike probability. Pitch location is, of course, by far the most important factor in whether the pitch is called a strike. There are better ways to model it, such as some kind of LOESS regression, but for this small area of the zone, the polynomial works fine. After accounting for the location of the pitch, we are left with the other, more interesting factors:
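For readers who want to see what the location terms might look like, here is a minimal sketch of a full cubic expansion of the pitch location. It assumes cross terms are included, which the description above does not specify, and the function name and term labels are hypothetical.

```python
def cubic_location_terms(x, z):
    """Return every monomial x^i * z^j with 1 <= i + j <= 3.

    That is 9 terms: x, z, x^2, x*z, z^2, x^3, x^2*z, x*z^2, z^3.
    The intercept (i = j = 0) is left to the regression itself.
    """
    return {
        f"x^{i}*z^{j}": (x ** i) * (z ** j)
        for i in range(4)
        for j in range(4)
        if 1 <= i + j <= 3
    }


terms = cubic_location_terms(0.8, 0.2)
```

These nine values would serve as the location columns in the regression, alongside the count, velocity, and previous-pitch variables.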
The current count heavily influences the umpire's decision, an effect that is well known in the baseball literature. Simply put, umpires are very hesitant to put either the pitcher or the hitter way down or way ahead in the count. Other things being equal, the least likely count for the pitcher to get a strike call is 0-1, while the most likely is 2-0. For pitches that would otherwise be 50/50 to be called a strike, as is shown in the middle column of the table, the pitcher is ~18.8% more likely to get the call when the count is 2-0 as compared to 0-1. Other counts fall in the middle of this spectrum, with pitcher's counts tilting marginal calls toward the hitter relative to average, and hitter's counts tilting them toward the pitcher.
The location of the previous pitch also plays a key role in determining the umpire's call. For every foot higher the last pitch was, the low-and-away pitch is 3.06% less likely to be called a strike, and for every foot inside the last pitch was, the low-and-away pitch is 5.58% less likely to be called a strike. As was the case with the up-and-in pitch, when the current pitch is relatively more towards the center of the strike zone compared to the last one, the pitcher is more likely to get the call, as the umpire's visual frame of reference has been altered.
Finally, while the velocity of the last pitch does not have a statistically significant impact, the velocity of the current pitch does: each additional MPH reduces the strike call probability by 1.46%. This makes sense in the context of pitches near the bottom of the strike zone. A faster pitch drops less due to gravity on its path to the plate, so a faster pitch that crosses the plate at a given height must have started out lower. Such a pitch will appear lower to the umpire, even if it crosses the plate at the same height.
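The gravity argument can be checked with back-of-the-envelope arithmetic. The sketch below deliberately ignores drag and spin, assumes a constant speed, and assumes the ball travels 55 feet from release to the plate. These are simplifications chosen for illustration, not measured values.

```python
G_FT_PER_S2 = 32.174        # standard gravity in ft/s^2
MPH_TO_FT_PER_S = 5280 / 3600
RELEASE_DISTANCE_FT = 55.0  # assumed release-to-plate distance


def gravity_drop_ft(speed_mph, distance_ft=RELEASE_DISTANCE_FT):
    """Vertical drop from gravity alone over the flight to the plate."""
    flight_time_s = distance_ft / (speed_mph * MPH_TO_FT_PER_S)
    return 0.5 * G_FT_PER_S2 * flight_time_s ** 2


drop_90 = gravity_drop_ft(90)      # about 2.8 ft
drop_95 = gravity_drop_ft(95)      # about 2.5 ft
extra_drop_ft = drop_90 - drop_95  # about 0.3 ft
```

Under these assumptions the 95 MPH pitch drops roughly 0.3 feet less than the 90 MPH pitch, so to reach the same height at the plate it has to travel on a lower, flatter path.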
We can show the impact of the last pitch on strike call probability through a heatmap of the probability pitches in our low-and-away circle were called a strike, based on the location of the last pitch:
There are two factors at work that drive the above heatmap, which, as a reminder, shows the location of the previous pitch, not the current one. The first is the umpire's natural tendency to put the pitcher (or hitter) back in the count. Basically, if the last pitch was a strike, the following count is more likely to be pitcher-favoring, which the umpire will try to correct by calling fewer strikes on marginal pitches in the subsequent count. If the last pitch was a ball, the opposite is true. This drives the sharp gradient at the edges of the plate.
The other effect is the reference frame set by the last pitch, which is perhaps best seen by comparing previous pitches just off the plate outside with previous pitches just off the plate inside. When the pitcher goes from off the plate outside to less off the plate outside, they are far more likely to get the call than when they go from off the plate inside to just off the plate outside, and the difference between the two areas of the heatmap reflects this.
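A heatmap of this kind can be built by binning the location of the previous pitch and computing the called-strike rate of the current pitch within each bin. The sketch below is one simple way to do that. The bin size, the tuple layout, and the function name are hypothetical, and the sample rows are made up purely to show the mechanics.

```python
import math
from collections import defaultdict

BIN_FT = 0.5  # assumed cell size in feet


def strike_rate_by_prev_location(rows, bin_ft=BIN_FT):
    """Called-strike rate of the current pitch per previous-pitch cell.

    rows: iterable of (prev_x_ft, prev_z_ft, called_strike) tuples,
          where called_strike is True or False for the current pitch.
    Returns {(col, row): strike_rate} keyed by integer cell indices.
    """
    counts = defaultdict(lambda: [0, 0])  # cell -> [strikes, pitches]
    for prev_x, prev_z, called_strike in rows:
        cell = (math.floor(prev_x / bin_ft), math.floor(prev_z / bin_ft))
        counts[cell][0] += int(called_strike)
        counts[cell][1] += 1
    return {cell: s / n for cell, (s, n) in counts.items()}


# Made-up rows, only to show the mechanics.
sample = [
    (1.1, 1.6, True),
    (1.2, 1.7, True),
    (1.3, 1.8, False),
    (-1.2, 3.1, False),
]
grid = strike_rate_by_prev_location(sample)
```

Each cell of the resulting grid can then be colored by its strike rate. With real data, cells containing only a handful of pitches would be noisy and may need to be masked or smoothed.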
Strategically, this tendency of the umpire to put the pitcher or hitter back in the at-bat has far-reaching impacts on the game. One, it tends to blunt the effectiveness of pitch sequencing. Going from up-and-in to down-and-away should be good for the pitcher, because it throws off the hitter's timing. However, the pitcher's gains are offset by the fact that they get marginal calls less often in this sequence.
Two, the umpire's tendency means throwing strikes is far less important than it would be if strikes were called the same way on every pitch. A wild pitcher who often falls behind in the count will be helped back into it by the umpire, while an accurate pitcher who tends to get ahead by throwing strike one will be penalized when they fail to get marginal calls. Both the reference-frame and count tendencies of the umpire work in this way, as the pitcher who misses off the plate outside is more likely to throw their next pitch around the outside part of the plate, other things being equal. As a result, a pitching strategy that favors velocity at the expense of control comes out ahead.
If automatic ball-strike calls are brought into the game, which I think is inevitable, less accurate pitchers are going to be hurt badly. In the current game they are able to rely on the umpire to prevent walks by giving them marginal calls when they fall behind, but with a machine calling the balls and strikes, these pitchers are actually going to have to throw a strike. The implications for baseball bettors are obvious - wild pitchers with high walk rates or poor first-pitch strike rates are going to suffer relative to control pitchers if the league goes to machines, and this is likely to be under-valued by the market early on. The effect on totals is less clear - obviously, wild pitchers will be over candidates early on, but the league-wide effect will depend on how the machines are set and calibrated.
Whether the high-velocity, max-effort, short-outing style favored in the modern game has reduced pitcher control is a matter of debate. Certainly, better mechanics and pitcher fitness levels have driven much of the velocity increase we have seen, and it is possible that modern pitchers have better control than those of, say, 20 years ago. But I do suspect that pitchers also throw with more effort than those of the past, as velocity (and spin) at the expense of location is analytically favored, which probably leads to reduced accuracy. If this is the case, and machines start calling games, we could see something of a roll-back to the earlier era, with pitchers not throwing quite as hard in an effort to throw more strikes. This would probably be positive for the game, as we would see fewer pitcher injuries, longer outings, more balls in play, and fewer pitching changes.