When I evaluate pitcher performance, one newer metric I look at is Called Strike + Whiff rate (CSW%), the percentage of pitches thrown that result in a called or swinging strike. What makes that outcome so valuable is that it is a pitch on which the pitcher outright wins his matchup against the hitter. It’s also more valuable than a foul ball in a two-strike situation, where a called or swinging strike ends the at-bat while a foul ball prolongs it. While this stat is not a true indication of a pitcher’s raw stuff, it does measure how well a pitcher can command it to get the hitter out.
The CSW statistic was created by former pitcher Nick Pollack in the 2018 season, three years after the Statcast era began in 2015. It’s a simple formula that takes a pitcher’s called strikes, swings and misses, and total pitches thrown to calculate the percentage of pitches that resulted in a called or swinging strike. I see it as a valuable metric to see if a pitcher has good stuff and is able to command it well against an opposing lineup.
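Written out, the formula described above looks like this. The symbol names are my own shorthand, not official notation:

```latex
\mathrm{CSW\%} = \frac{\text{Called Strikes} + \text{Whiffs}}{\text{Total Pitches}} \times 100
```

As a made-up example, a pitcher who throws 90 pitches and gets 15 called strikes and 12 whiffs would have a CSW% of (15 + 12) / 90 × 100 = 30.0%.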
It’s a metric you can look up in any Statcast Game Feed, under the Player Breakdown tab after you click into a game. Once you click that tab, you will see a breakdown of pitchers on both teams and this is what it will look like.
In the Player Breakdown tab, the third group of data has columns for Called Strikes (CS), CS+Whiffs, and CSW%. When looking at the potential quality of an outing, here’s how we can use CSW% to gauge the stuff the pitcher had going in.
|Quality of Stuff||CSW%|
For this exercise, I charted every pitcher who faced a minimum of 200 batters in the 2021 season on a spreadsheet, with their Called Strike + Whiff rate calculated from the number of called strikes, swings and misses generated, and total number of pitches. Across that group of 358 pitchers, the average CSW rate was 29.0% with a standard deviation of 2.67%. 354 of the 358 pitchers (98.9%) fell within three standard deviations of the average, and the four outliers were all pitchers who finished above 37.0%. Raisel Iglesias and Craig Kimbrel led baseball with a CSW% of 38.1%, while Adam Plutko finished last at 23.1%. The highest CSW% by a Diamondbacks pitcher was Madison Bumgarner’s 29.1%.
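For anyone who would rather script this than use a spreadsheet, here is a minimal sketch of the same calculation in Python. The pitcher names and pitch counts are made up for illustration, the function names are my own, and I’m assuming the sample standard deviation here, which may not match the spreadsheet setting.

```python
from statistics import mean, stdev


def csw_rate(called_strikes, whiffs, pitches):
    """CSW% = (called strikes + whiffs) / total pitches, as a percentage."""
    if pitches <= 0:
        raise ValueError("pitches must be positive")
    return 100.0 * (called_strikes + whiffs) / pitches


def summarize(rates, n_sd=3.0):
    """Return (mean, sample SD, outliers more than n_sd SDs from the mean)."""
    values = list(rates.values())
    avg = mean(values)
    sd = stdev(values)  # sample SD (assumption); pstdev gives the population SD
    outliers = {name: r for name, r in rates.items() if abs(r - avg) > n_sd * sd}
    return avg, sd, outliers


# Hypothetical (called strikes, whiffs, pitches) for illustration only,
# not real 2021 data.
counts = {
    "Pitcher A": (310, 270, 2000),
    "Pitcher B": (150, 180, 1000),
    "Pitcher C": (400, 290, 2500),
    "Pitcher D": (120, 110, 1000),
}
rates = {name: csw_rate(*c) for name, c in counts.items()}
avg, sd, outliers = summarize(rates)
print(f"mean={avg:.1f}% sd={sd:.2f} outliers={sorted(outliers)}")
```

With the real numbers from the article, the upper three-standard-deviation cutoff works out to 29.0 + 3 × 2.67 = 37.01%, which lines up with the four outliers all sitting above 37.0%.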
With a decently sized sample of pitchers from last season, we can take this Called Strike + Whiff metric and see if there is any correlation with run prevention. The most commonly cited metric for run prevention by pitchers is earned run average, or ERA, so we’ll see how it correlates with CSW%.
At first glance, this scatterplot of ERA vs. CSW% shows a higher CSW% trending toward a lower ERA when you fit a linear regression to the data. By the simple eye test, though, the dots are too scattered to call it a strong correlation. In fact, the R-squared value for this plot stands at just 0.26. Ideally you want your model to have a high R-squared value, somewhere in the range of 0.7 or higher, so we can say that CSW% isn’t particularly predictive of ERA in a linear regression model.
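If you want to reproduce the R-squared check outside of a spreadsheet, here is a rough sketch of a simple least-squares fit in plain Python. The `linear_fit` helper and the sample numbers are hypothetical and are not the 2021 dataset.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept, plus R-squared."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a one-variable linear fit, R-squared is the squared correlation.
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical CSW% and ERA pairs for illustration only.
csw = [25.0, 27.0, 29.0, 31.0, 33.0]
era = [5.1, 4.0, 4.4, 3.2, 3.6]
slope, intercept, r2 = linear_fit(csw, era)
print(f"slope={slope:.3f} intercept={intercept:.2f} R^2={r2:.2f}")
```

A negative slope here means ERA drops as CSW% rises, and the R-squared tells you how much of the spread in ERA the line accounts for.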
That should come as no surprise, as ERA is a context-neutral statistic and sometimes doesn’t paint the full picture of a pitcher’s true run prevention skill. It’s a good metric for comparing across a large group of pitchers, but it doesn’t account for factors such as the ballpark or the team the pitcher is on. For example, a sub-4.00 ERA pitcher for the Diamondbacks would be considered solid but not spectacular, whereas a similar ERA for a Rockies pitcher could be considered near-elite. That example aside, there are better metrics out there that add context to pitcher performance, and we’ll use them to see if we get different results. For those not interested in seeing that through, you can skip to the final paragraph then call me a nerd in the comments for wasting your time. For everyone else, this is where the fun begins.
To see if a more advanced statistic shows a stronger relationship, another metric I considered is weighted on-base average, or wOBA. If you’ve visited a Fangraphs or Statcast player page, you’ve seen this stat. It takes the outcome of every plate appearance, applies a linear weight to each outcome, then divides by the number of plate appearances that don’t result in an intentional walk. wOBA is scaled to look like on-base percentage, so a league-average wOBA sits close to a league-average OBP. In this exercise, we will be using the wOBA that pitchers allowed and seeing if there is a correlation between CSW% and wOBA.
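For reference, here is the general shape of the wOBA formula as I understand Fangraphs to publish it. The weights (the w terms) are recalculated every season, so I’ve left them as symbols rather than plugging in numbers:

```latex
\mathrm{wOBA} = \frac{w_{BB}\cdot uBB + w_{HBP}\cdot HBP + w_{1B}\cdot 1B + w_{2B}\cdot 2B + w_{3B}\cdot 3B + w_{HR}\cdot HR}{AB + BB - IBB + SF + HBP}
```

Here uBB is unintentional walks and IBB is intentional walks. The key idea is that each outcome is credited according to how much it is worth in runs, rather than treating every time on base the same.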
In this plot, you can see that the data points are closer together than with ERA. That indicates a stronger correlation than the previous chart, but the R-squared value is only 0.35, which is still not strong enough to make this linear model work. Once again, we’ve hit an impasse, this time with a metric that is only loosely tied to run prevention: ERA and wOBA allowed have an R-squared of 0.76 in the same data. Another possible route would have been to compare against a pitcher’s xwOBA against or xERA from Statcast. The issue I see with using those metrics in this exercise is that we want to compare against actual performance instead of estimated performance.
With the need to contextualize performance, I’ve decided to replace ERA with ERA-. You’ve likely seen ERA+ used as a stat when comparing pitchers in the series preview articles on the Snake Pit, but I prefer the context that ERA- brings. ERA- tells you how many percentage points better (under 100) or worse (over 100) the pitcher was than league average, whereas ERA+ tells you how much worse (over 100) or better (under 100) the league-average pitcher was than the pitcher. ERA- is basically the inverted percentage of ERA+ if you want to compare, so an ERA+ of 125 would be an ERA- of 80. What makes ERA- special is that it factors in the park factor of the pitcher’s home park and the league ERA, then spits out a percentage of how much better or worse the pitcher was vs. league average. It’s not a perfect stat, as it simplifies the park factor to the pitcher’s home park instead of a weighted average of the parks he pitched in.
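To make the ERA+ to ERA- conversion concrete, here is a quick sketch. It leans on the "basically inverted" relationship described above, so treat it as an approximation, since the two stats come from different sites with their own park adjustments. The function names are my own.

```python
def era_minus_from_era_plus(era_plus):
    """Approximate ERA- from ERA+ by inverting around 100."""
    if era_plus <= 0:
        raise ValueError("ERA+ must be positive")
    return 100.0 * 100.0 / era_plus


def era_plus_from_era_minus(era_minus):
    """Approximate ERA+ from ERA- using the same inversion."""
    if era_minus <= 0:
        raise ValueError("ERA- must be positive")
    return 100.0 * 100.0 / era_minus


print(era_minus_from_era_plus(125))  # 80.0
```

Read that way, an ERA- of 80 says the pitcher allowed earned runs at a rate 20 percent better than league average.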
Here’s how the data stacks up once again:
Once again, the plot shows only a very weak trend where a higher CSW% yields a better ERA-. The individual points on the chart are too scattered, producing an R-squared of only 0.25. Even with the more contextualized run prevention values in the dataset, we got similar results to the context-neutral ERA values. Given how weak the correlation is on both scatterplots, we can conclude that Called Strike + Whiff rate is not a great predictor of ERA or even park-adjusted ERA metrics.
So if this stat isn’t very predictive of run prevention, is there still a use for this metric? For those who want to look at surface numbers and not delve too deep, it may not be that interesting, but it has its value as a metric. If you’re interested in looking at this particular metric further, I recommend checking out this article by Alex Fast. Three years ago, he did similar and more thorough research on this topic. He found that CSW% has a stronger correlation with overall strikeout rate, for which I measured an R-squared of about 0.59 in the 2021 data, and with SIERA, another ERA predictor stat that’s much more complex than FIP or xFIP. For the sake of this article, we’ll stay away from that metric.
The main limitation of CSW% is that it is independent of quality of contact, so there are starts where a high CSW% may not translate to run prevention if the pitcher consistently allows loud contact. There are situations where a first-pitch out may be more valuable than a longer at-bat with more called or swinging strikes. Unless a pitcher has an elite strikeout rate, like Robbie Ray in the last four 162-game seasons, quality of contact matters just as much as a pitcher’s ability to get quality strikes past the hitter. There are examples of pitchers recording a high CSW% in a start yet giving up a lot of runs because they could not prevent hard contact. One such example is illustrated in Fast’s article linked above, although it was a sample size of 4 starts from Derek Holland in 2019. He also concluded that CSW rate stabilizes after 10 starts for starting pitchers, although there isn’t much on relievers. Understanding that limitation is important when citing this statistic to evaluate pitcher performances.