Is Joe Saunders A Volatile Pitcher?

The recent re-signing of left-hander Joe Saunders has certainly created some excitement, whether it be excitement over Joe's return, excitement over the inexpensive $6MM salary for a guy who threw 212 innings with a 3.69 ERA a year ago, or excitement over the fact that Arizona's payroll for 2012 - expected around $80MM - may be its highest since 2003. However, as with almost any move, the excitement is not universal. Amid the discussion yesterday, I came across this comment from SenSurround, which offered a valid concern on the matter of Saunders: he provides innings certainty, but does he provide quality certainty? After all, Saunders' ERA figures by season are as follows (2005 to 2011): 7.71 (2 GS), 4.71 (13 GS), 4.44 (18 GS), 3.41 (31 GS), 4.60 (31 GS), 4.47 (33 GS), and 3.69 (33 GS). Despite a career ERA of 4.16, Saunders' closest single-season ERA to that career-average figure is the 4.44 figure from 2007 - a season in which he made just 18 starts - 28 points off of his career average. His closest full-season ERA (i.e. more than 30 GS) to that mark is the 4.47 figure from 2010. This raises the question: just how volatile is Joe Saunders?

To tackle this question, the first goal is to break down the arbitrary endpoints of individual seasons and divide Saunders' career instead into a series of intervals that ignore the boundaries of seasons and give us more of an idea of the ERA found in a random sample of consecutive starts from Joe, rather than simply the samples of consecutive starts defined by his individual seasons.

For this, I took start-by-start data for Saunders over all 161 major-league starts he's made, and split them into several series of unique intervals using seven different lengths - 20 games, 25 games, 30 games, 35 games, 40 games, 45 games, and 50 games. In other words, the 20-game set will have the following intervals: starts 1-20 (interval 1), starts 21-40 (interval 2), starts 41-60 (interval 3), and so on. This should hopefully provide us with more data points around Joe's average career ERA of 4.16, and we'll simply observe how far they deviate from that average. Here's the data:

                                Int. 1  Int. 2  Int. 3  Int. 4  Int. 5  Int. 6  Int. 7  Int. 8  Final Int.*
20-GS ERA                       4.49    4.09    4.12    3.69    5.20    4.29    4.29    3.18    3.37
25-GS ERA                       4.38    3.82    3.83    5.23    4.54    3.40    -       -       3.20
30-GS ERA                       4.48    3.99    4.47    4.24    3.74    -       -       -       3.48
35-GS ERA                       4.38    3.59    4.78    4.44    -       -       -       -       3.66
40-GS ERA                       4.28    3.90    4.71    3.71    -       -       -       -       3.65
45-GS ERA                       4.08    4.53    4.46    -       -       -       -       -       3.88
50-GS ERA                       4.08    4.49    3.96    -       -       -       -       -       3.84
Individual Seasons (2005-2011)  7.71    4.71    4.44    3.41    4.60    4.47    3.69    -       -

*Note: The final interval overlaps with the previous interval, as 20, 25, 30, etc. don't divide 161 evenly.
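For readers who want to try the same split on a game log, here is a minimal Python sketch of the interval approach described above. The function names and the toy game log are my own illustration, not the spreadsheet behind the table, and innings are assumed to be stored as plain decimals (6.0, not the 6.1/6.2 box-score notation).

```python
def era(earned_runs, innings):
    """Earned run average: earned runs allowed per nine innings."""
    return 9.0 * earned_runs / innings


def split_into_intervals(starts, size):
    """Split a chronological list of starts into consecutive blocks of `size`.

    If the list does not divide evenly, the final block is simply the last
    `size` starts, so it overlaps the block before it.
    """
    full = len(starts) // size
    blocks = [starts[i * size:(i + 1) * size] for i in range(full)]
    if len(starts) % size and len(starts) >= size:
        blocks.append(starts[-size:])
    return blocks


def interval_eras(starts, size):
    """ERA of each block, pooling earned runs and innings within the block."""
    result = []
    for block in split_into_intervals(starts, size):
        earned = sum(er for er, _ in block)
        innings = sum(ip for _, ip in block)
        result.append(round(era(earned, innings), 2))
    return result


# Hypothetical toy data, NOT Saunders' actual game log:
# each start is (earned_runs, innings_pitched).
toy_starts = [(3, 6.0), (2, 7.0), (5, 5.0), (1, 6.0), (4, 6.0), (0, 7.0), (3, 5.0)]
print(interval_eras(toy_starts, 3))  # [5.0, 2.37, 3.5]
```

With seven toy starts and a block size of three, the third block is the last three starts and shares two of them with the second block, the same kind of overlap the footnote describes for 161 starts.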

What can we glean from this? Well, to start, it should be fairly obvious and logical that the later groups - the 35, 40, 45, and 50-game interval groups - are all bunched closer to Joe's career ERA than his individual season figures are. After all, each interval in these groups contains more games than any of Joe's individual seasons, and a larger sample tends to land closer to his career average.

However, what is shocking to me, and perhaps somewhat insightful, is that the earlier groups - in particular the 20-game interval group - are more closely bunched around Joe's career-average ERA despite each interval being composed of fewer starts than many of Joe's individual seasons. If you discount the final interval - which is not that egregious, considering that it only differs from the eighth interval by one start - then four of the eight intervals have an ERA strictly between 3.69 and 4.44, and another has an ERA of exactly 3.69. How is this significant? Well, Joe has never had a full single-season ERA inside this range, despite a career ERA of 4.16, which is within it. By shrinking the intervals of Joe's career, we find that the volatility of Joe's output seems to decrease. When looked at this way, it seems as if the individual-season endpoints are obscuring Joe's consistency.

However, this is far from a conclusive result - as evidenced by the lack of a standard deviation calculation. Further, it can't be ignored that in the 20-game interval group I pointed to, there's an outlier interval of 5.20 that is higher than any full-single-season ERA of Saunders' career, so it's not an indisputable indicator of Saunders' consistency. So I went a bit deeper into this process to hopefully get more conclusive data. Here's the process:

First, I've divided Saunders' 161-start career into 130 32-game intervals, with the first interval obviously being the first 32 starts of his career, the second interval being starts 2-33, etc. This should help smooth out Saunders' career and give us a better idea of the periods in which he truly deviated from his career averages for large periods of time, and where we're simply dealing with statistical noise. Then, I plotted these 130 different "intervals" on a graph, took a mean of the data, and calculated a standard deviation. Remember, the mean of this data is different than his career average ERA because the bookend starts aren't counted as much since we are only looking at periods of consecutive starts. This is, of course, something of an imperfection in this research, but I felt it was the best way to give us a meaningful number of data points to look at.
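A rough sketch of that trailing-interval calculation in Python follows. The game log is a made-up toy, not Saunders' starts, and the use of the sample standard deviation (`statistics.stdev`) rather than the population version is my assumption, since the method isn't specified beyond "standard deviation".

```python
from statistics import mean, stdev


def trailing_eras(starts, window=32):
    """ERA over every run of `window` consecutive starts.

    `starts` is a chronological list of (earned_runs, innings) pairs.
    A career of n starts yields n - window + 1 overlapping intervals.
    """
    eras = []
    for first in range(len(starts) - window + 1):
        block = starts[first:first + window]
        earned = sum(er for er, _ in block)
        innings = sum(ip for _, ip in block)
        eras.append(9.0 * earned / innings)
    return eras


# Hypothetical toy log, NOT Saunders' real starts.
toy_starts = [(3, 6.0), (2, 7.0), (5, 5.0), (1, 6.0), (4, 6.0), (0, 7.0)]
toy_eras = trailing_eras(toy_starts, window=3)

print(len(toy_eras))               # 6 - 3 + 1 = 4 intervals
print(round(mean(toy_eras), 3))    # mean of the trailing ERAs
print(round(stdev(toy_eras), 3))   # sample standard deviation (an assumption)
```

The same count formula gives 161 - 32 + 1 = 130 intervals for the full career. It also shows why the mean of the trailing figures drifts from the career ERA: the first and last starts each appear in only one window, while a mid-career start appears in 32 of them.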

However, ERA by itself is well-known to be misleading. Thus, I have also taken Saunders' basic peripheral rates - hits per nine innings, strikeouts per nine innings, walks per nine innings, and home runs per nine innings - over those same 32-start periods, and plotted them alongside his ERA. From here, I then checked a) how volatile the individual peripheral plots were, and b) their correlation to the ERA plot, to see how much the volatility in any of these plots may have influenced the fluctuations in Saunders' trailing 32-start ERA figures. From this, we can hopefully identify what is causing Saunders' ERA to fluctuate, and see if those fluctuations are something that D-backs fans should be worried about in 2012.

An additional bonus of doing this peripheral analysis is that it also should help us tackle the argument that individual-season endpoints have some value, in that there is a possibility that Saunders' skillset could have greatly improved or regressed during the off-season. If this is truly the case, it should come through in the peripheral analysis, and not just in the ERA analysis.

First, here is the graph for the 32-GS trailing ERA over the 130 intervals as defined above:

[Graph: Saunders' 32-GS trailing ERA over the 130 intervals]

The mean of this plot is 4.24217, and the standard deviation is 0.427. Under a bell curve scenario, Saunders' ERA over a random 32-start interval should be expected to fall within 4.24217 +/- 0.427, that is, within one standard deviation of the mean, approximately 68% of the time. In other words, there's a pretty wide range of expectations for Joe in any given 32-start interval, from as low as ~3.8 to as high as ~4.7. However, this tells us very little as to why Saunders is volatile, and simply saying "regardless of what environment you put him in, Saunders is going to be volatile" ignores the all-important context of Saunders' career.
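The one-standard-deviation band can be checked with a few lines of Python. The two inputs are the mean and standard deviation quoted above; treating the trailing ERAs as normally distributed is the "bell curve scenario" assumption, not something the data has been tested for here.

```python
from statistics import NormalDist

mean_era = 4.24217   # mean of the 130 trailing ERAs, as quoted above
sd_era = 0.427       # their standard deviation, as quoted above

low, high = mean_era - sd_era, mean_era + sd_era
print(round(low, 2), round(high, 2))   # 3.82 4.67

# Share of a normal distribution that lies within one standard
# deviation of its mean (the bell-curve assumption).
bell = NormalDist(mean_era, sd_era)
share = bell.cdf(high) - bell.cdf(low)
print(round(share, 3))                 # 0.683
```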

To build up some context, let's now take a look at Joe's individual peripheral rate plots over these same 130 intervals.

First, Hits Per Nine Innings:

[Graph: Saunders' 32-GS trailing H/9 over the 130 intervals]

Walks Per Nine Innings:

[Graph: Saunders' 32-GS trailing BB/9 over the 130 intervals]

For strikeouts, we have to do a little manipulation first. Strikeouts are inversely related to ERA - that is, as strikeouts go up, ERA should be expected to go down, and vice-versa - so we need to choose a base number and subtract Saunders' K/9 rate if we want to try to see patterns and similarities in the cycles of ERA and strikeouts. For this plot, I've chosen 10 as the base number, though 12 will be used later (it makes for a clearer graph later on). When running regressions later on to find correlations, we'll use simply K/9 - this is being done right now just for visualization purposes.

10 - (Strikeouts Per Nine Innings):

[Graph: 10 - K/9 for Saunders' 32-GS trailing intervals]

Home Runs Per Nine Innings:

[Graph: Saunders' 32-GS trailing HR/9 over the 130 intervals]

For the final graph, while it's certainly much more cluttered, I think it's probably most useful to look at all of the peripherals plotted together - including ERA - on one graph, to give us one single scale to look at:

[Graph: Saunders' trailing ERA and all peripherals plotted together]

Here's where the legitimate takeaways begin. Here are the major trends I see in this graph:

- Saunders' strikeout rate does not fluctuate in a way that one would expect given his ERA fluctuations.

- Saunders' home run rate is fairly steady, though there is a noticeable peak.

- Saunders' walk rate has been absurdly stable throughout his career, with the exception of a brief spike where it seems that Saunders' control deserted him for a brief period (the spike up from the 69th interval to the 79th interval) before steadily returning to normalcy.

- Saunders' hit rate fluctuates quite heavily, as we'd expect given BABIP's randomness.

Strikeout Rate

Let's investigate these trends, first looking at Joe's strikeout rate. The gods of FIP have had us believe that as strikeouts rise, performance should improve and ERA should fall. Even though Saunders' K/9 hasn't fluctuated much throughout his career, the slight movements of the 12 - K/9 plot seem to move against the ERA plot, suggesting that as Saunders' strikeout rate rises, so does his ERA. Indeed, when running a regression of the two plots, I found an r-squared value of 0.00206 with a positive slope. That's right, a positive correlation, though one of the weakest positive correlations I've ever seen. Given the weak value of the r-squared and the odd positive correlation, I think we can safely say that Saunders' slight fluctuations of strikeout rate aren't the reason for his ERA fluctuations. In this case, I think it's pretty safe to dismiss the possibility of causation.

With regard to the primary question at hand - is Saunders volatile - I think it's safe to say that Joe's strikeout rate, while not enthralling, has proven to be extremely stable throughout his career. Running a standard deviation of the K/9 rates over the 130 intervals bears this out, as we get a mere 0.33 result. With a standard deviation so small, Saunders' seemingly-wide ERA volatility appears to be almost completely unrelated to strikeout rate volatility, simply because Saunders doesn't have any real strikeout rate volatility, even before mentioning the almost non-existent statistical correlation.
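For anyone wanting to reproduce this kind of check, here is a minimal sketch of how an r-squared between two trailing series can be computed. For a simple one-variable regression, r-squared equals the square of the Pearson correlation r, and the sign of r (not r-squared, which can never be negative) gives the direction of the relationship. The function names and the toy series are hypothetical, not the actual 130-interval data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def r_squared(xs, ys):
    """Share of variance in ys explained by a straight-line fit on xs."""
    return pearson_r(xs, ys) ** 2


# Hypothetical trailing values, NOT Saunders' real numbers.
toy_k9 = [4.8, 5.0, 5.3, 5.1, 4.9]
toy_era = [4.1, 4.4, 4.0, 4.6, 4.2]

r = pearson_r(toy_k9, toy_era)
print("direction:", "positive" if r > 0 else "negative")
print("r-squared:", round(r_squared(toy_k9, toy_era), 3))
```

One caution when reading any r-squared from trailing data: neighboring intervals share 31 of their 32 starts, so the 130 points are far from independent observations.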

Home Run Rate

Moving on to home run rate, we do find slightly more informative results, but there isn't an obvious conclusion. The standard deviation of Saunders' home run rate in this period is 0.18, which is somewhat significant given the damaging effects of home runs, but the r-squared between home run rate and ERA is not particularly strong, at 0.30380. That's certainly a stronger correlation than in the strikeout rate case, but compared to the r-squared values we'll see in the BB/9 and H/9 categories, it's rather small. Given Joe's unusual penchant for keeping home run balls from driving in multiple runs, I'm more inclined to consider home runs a minimal factor in Saunders' career ERA fluctuations.

Walk Rate

Moving on to the walk rate category, we find the first of two truly significant correlations to ERA volatility. Despite a standard deviation of just 0.32 - actually 0.01 lower than the standard deviation of Saunders' K/9 rate - there was a staggeringly-high r-squared value of 0.66651 between Saunders' BB/9 and ERA over the 130-interval sample. Even though that standard deviation is similar to his K/9's, the graph does show us why the BB/9-ERA correlation is significant. For starters, Saunders' BB/9 and ERA plots move as you'd expect - when Joe's BB/9 rises, the graph typically shows some sort of increase in Saunders' ERA.

Above all, what stands out is the aforementioned spike period in which Saunders saw his trailing BB/9 rate climb from below 3 to as high as 3.82 within a span of just ten intervals. From April 22, 2010 to June 20, 2010, career starts 99 through 110, Saunders made 12 starts for the Angels and walked 35 in just 68 innings, a BB/9 of 4.63 in that span. The result, naturally, was an ERA of 5.29 over that stretch and a second significant jump period in Saunders' ERA plot. Whatever the reason for this jump in BB/9 - mechanical issue, change in approach, etc. - it was rectified quickly and Saunders' BB/9 and ERA marks regressed very quickly to their normal levels, with Saunders getting his trailing BB/9 rate back below 3 by his 116th career start, his first start after Arizona acquired him from the Angels.
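The 4.63 figure is just the walk total scaled to nine innings, which a tiny helper makes explicit. The function name is my own; the 35 walks and 68 innings are the totals quoted above, with the innings taken as an even 68.

```python
def per_nine(count, innings):
    """Scale a counting stat (walks, hits, strikeouts, ...) to nine innings."""
    return 9.0 * count / innings


# 35 walks in 68 innings, the totals quoted for career starts 99-110.
print(round(per_nine(35, 68), 2))  # 4.63
```

The same helper covers H/9, K/9, and HR/9 by swapping in the matching count.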

So, yes, we do see that there is a significant correlation between Saunders' walk rates and his ERA. This only makes sense given that Saunders, as a pitch-to-contact guy, can't afford to give up extra baserunners. However, with how steady Saunders has been at keeping his BB/9 low throughout his career, with the exception of a mere 12-start period of egregiously elevated walk rates, I do think the odds are good that he'll be able to keep posting low walk rates going forward, particularly while facing the watered-down lineups of the NL West.

Hit Rate

Now we come to the crux of the post. By simply observing the graphs above, it's somewhat frightening how closely Saunders' hit rate and ERA have followed each other throughout his career. In spite of the wide cyclic fluctuations of Saunders' hit rates over the 130 intervals - Saunders' trailing H/9 rates have a standard deviation of 0.58 over the intervals - there is still an r-squared of 0.70316, the highest correlation of any of Saunders' peripherals. Of course, this is somewhat intuitive - a contact pitcher like Saunders is going to allow a lot of balls in play, so how successful he can be will be largely dictated by how many hits he allows.

As the all-plots graph shows, Saunders' hit rates and strikeout rates have a peculiar correlation - as Saunders' strikeout rates dropped early in his career, his hit rates dipped as well, exactly the opposite of what you would expect. What does that mean? Well, it means that the simple fluctuations in BABIP that Saunders has experienced throughout his career have had a more significant impact on the hits he allows per nine innings than any fluctuations of strikeouts and total batted ball rates. In simpler terms, natural fluctuations of hit rate - which I'd be inclined to say are largely luck-related and cyclic - have been the more significant cause of Saunders' career fluctuations, rather than any significant fluctuations of Saunders' peripheral skills throughout the years.

So, of course, the question becomes one of hits. Saunders kept his hit rates low in 2011, but the supposed randomness of BABIP suggests that he won't be able to continue being so lucky. However, this brings us back to the all-important point of context that was brought up earlier. Is Saunders returning to a neutral defensive context? If Arizona sits Jason Kubel on days that Saunders pitches, then the answer is obviously no. With Gerardo Parra in left field, we can expect Arizona's defense to keep his expected BABIP relatively low, just like in 2011.

With the newly-acquired Kubel in left field, though, the left-handed Saunders - against whom right-handed hitters, who are more likely to hit the ball to left field, would be expected to fare better - could see his expected hit rate rise. Perhaps we could expect to see Parra used as Saunders' personal left fielder, in the way that some of us expect John McDonald to work as Trevor Cahill's personal shortstop. Doing so would give the D-backs a degree of control over Saunders' hit rates, which have been shown to be the clearest indicator of whether Saunders is successful or mediocre.

So, what are the basic points I'm trying to make here? To summarize:

- When adjusting the intervals to break through the boundaries of single-year numbers, Saunders isn't nearly as volatile as he seems by looking simply at his individual-season figures.

- However, it is true that Saunders does have a significant standard deviation on his career 32-start trailing ERA totals.

- Yet, Saunders' peripheral skills have experienced relatively small fluctuations throughout his career. Aside from a short spike - which admittedly had a tangible impact on his ERA - in his walk rate in 2010, Saunders' walk rate has been remarkably steady, and his strikeout and home run rates have experienced only mild fluctuations. While it's true that Saunders' ERA does fluctuate, it's typically not because of a serious change in his peripherals or because Saunders' skillset dramatically changes between seasons or from one interval to another.

- Saunders is volatile because BABIP is volatile. For someone who allows as much contact as Saunders does with his 5.0 career K/9, BABIP is going to have a large say in how successful Joe is.

- However, BABIP is not 100% luck, and if Arizona puts a good defensive alignment behind him on a consistent basis, Saunders could again thrive in Arizona's rotation in 2012.

So how do we answer the subject question? Is Joe Saunders a volatile pitcher? Unsatisfying as this answer will probably be, the answer is both yes and no. Do Saunders' basic skills fluctuate heavily from year to year? No. Does Saunders' skillset invite volatility because of the inherent volatility of hit rates for contact pitchers? Yes.

However, remember that we retained Saunders for a mere $6MM. That kind of price-tag isn't going to afford the type of strikeout pitcher that is needed to resist the haunting effects of hit volatility, and as contact pitchers go, Arizona did just retain one of the better ones around at a steep discount. If you add in the context of Arizona's stellar outfield defensive alignment - so long as Gerardo Parra replaces Jason Kubel on days Saunders takes the hill - it appears that Saunders is all the more likely to continue to thrive in the desert, giving the club even more security in the rotation as they chase a second straight NL West division title.

* For anybody wanting to see the spreadsheets and data, feel free to e-mail me at dstritt1@nd.edu, and I'll gladly supply them to you.