Why I’m concerned about the Diamondbacks pitching in 2020

Regression is a harsh mistress

San Francisco Giants v Arizona Diamondbacks Photo by Norm Hall/Getty Images

Some may have noticed that I’ve mentioned, more than a few times in the comment sections, that I’m not overly confident in the team in 2020 - specifically the pitching staff. There is a reason for this, so I figured I’d better share my thinking.

Simply put, in 2019 there was a pretty big gap between the various ERA estimators, such as FIP, xFIP, and SIERA, and the actual ERA results for most of the pitchers who are expected to get innings in 2020. The same can be said for the gap between wOBA and xwOBA.

For those unfamiliar or in need of a refresher, below is a quick overview of the alphabet soup in the previous paragraph. These are from the fangraphs.com Glossary. Follow the highlighted text, which links to the fuller explanation. If you don’t require the refresher, go ahead and skip down to the tables below.

FIP: A metric that estimates a pitcher’s run prevention independent of their defense. It is based on strikeouts, walks, and home runs, assuming average outcomes on balls in play. It is converted to an ERA scale for easy comparison.

The basis of this concept was Voros McCracken’s discovery, over 20 years ago, that pitchers have very limited control over what happens to balls hit into play. Originally referred to as DIPS, HERE is the article from 2001 when Voros first introduced the concept. It’s held up over time, although it’s been tweaked by others.

xFIP: Similar to FIP, except that it adjusts home runs to an average rate of HR per fly ball, adjusted for ballparks.
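For anyone who wants to see the arithmetic, here is a rough sketch of FIP and xFIP in Python. The 13/3/2 weights follow the commonly published FanGraphs formulation, which also counts hit batters alongside walks. The constant is a placeholder I picked for illustration (the real one changes every season), the league HR-per-fly-ball rate is an input you would have to look up, and this sketch leaves out the ballpark adjustment. The sample numbers are made up and do not belong to any pitcher in this article.

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding Independent Pitching on an ERA-like scale.

    `constant` is a placeholder; the real value varies by season.
    `ip` is innings pitched as a true decimal (183 1/3 innings -> 183.333).
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def xfip(fb, lg_hr_per_fb, bb, hbp, k, ip, constant=3.10):
    """Same as FIP, but actual home runs are swapped for the number
    expected from the pitcher's fly balls at a league-average HR/FB rate."""
    expected_hr = fb * lg_hr_per_fb
    return (13 * expected_hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical pitcher: 20 HR, 50 BB, 5 HBP, 180 K, 200 fly balls, 180 IP.
print(round(fip(hr=20, bb=50, hbp=5, k=180, ip=180.0), 2))
print(round(xfip(fb=200, lg_hr_per_fb=0.10, bb=50, hbp=5, k=180, ip=180.0), 2))
```

A pitcher who allowed fewer homers than his fly ball total would suggest ends up with an xFIP above his FIP, which is the situation described for a couple of the starters further down.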

SIERA: Another run estimator that builds off FIP and xFIP. However, it takes balls in play into account, and makes estimates based on groundball, flyball, and line drive rates and the average outcomes on those ball-in-play types.

In general, if you see LARGE gaps between ERA and these run estimators, it may be an indication that the pitcher is likely to regress towards the value of the run estimator. That can be for the good or the bad, depending on which direction the gap runs.

wOBA

xwOBA is available in the leaderboards section of Baseballsavant.mlb.com

It simply adjusts hitting results based on the quality of contact, i.e., exit velocity, launch angle, etc.

HERE is a link to the DBacks Pitcher’s page. Click on column heading “Diff” to see luckiest and unluckiest.

=======================================================================================

OK, on to the meat of this article. In the tables below, I have simply taken the average of the three run estimators (FIP, xFIP, and SIERA) and compared it to the ERA of each pitcher from last year who is still on the roster this year, to see who had the biggest gaps, positive or negative. The bigger the negative number, the further the pitcher’s ERA was below what should have been expected. A difference between .10 and .30 may or may not be all that significant. But when you start getting over a half-run difference, it starts to mean something. When the difference is over a full run... well, that’s usually just flat-out unsustainable.
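The comparison described above boils down to a couple of lines of arithmetic. Here is a minimal sketch in Python; the function names and labels are my own, and the treatment of gaps between .30 and .50 (which the rules of thumb above don't cover) is an assumption. The sample line is invented, not taken from the tables.

```python
from statistics import mean


def estimator_gap(era, fip, xfip, siera):
    """ERA minus the average of the three run estimators.

    Negative means the ERA came in lower than the estimators expected.
    """
    return era - mean([fip, xfip, siera])


def describe_gap(gap):
    """Rough reading of the gap size, in either direction."""
    size = abs(gap)
    if size > 1.0:
        return "usually unsustainable"
    if size > 0.5:
        return "starting to mean something"
    # Assumption: everything at or under half a run goes in the
    # "may or may not matter" bucket, including the .30 to .50 range.
    return "may or may not be significant"


# Hypothetical pitcher: 3.00 ERA against a 4.00 FIP, 4.20 xFIP, 4.40 SIERA.
gap = estimator_gap(era=3.00, fip=4.00, xfip=4.20, siera=4.40)
print(round(gap, 2), describe_gap(gap))
```

The sign convention matches the tables: a negative gap flags a pitcher whose results were better than his peripherals, and a positive gap flags one whose results were worse.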

What we see here is that almost every starting pitcher, and most of the relievers expected to be on the Opening Day roster, had ERAs well below what the estimators think they should have had. And for the most part, their expected wOBA was higher than their actual wOBA. (An xwOBA .020 or more higher than actual wOBA is quite significant; the scale is different than ERA.)

Some of this can be credited to excellent defense and shifting, which the Diamondbacks are better at than almost every other team in the league. But not all of it.

Starting Pitcher Comments:

Mike Leake: Be very afraid. His track record of beating his peripherals is less consistent than one might think. Some years he’s way over and some way under. His wOBA against, already high, was expected to be even higher still. With his continued velocity loss, he’ll be extremely challenged to keep his ERA under 4.50 going forward.

Alex Young: As good a feel-good story as he is, it will be necessary for him to miss a few more bats and see improvement in his peripherals (BB/K and HR rates) in order to maintain a sub-4.00 ERA.

Zac Gallen: As much as we enjoyed his low ERA while with the team, the peripherals did not really support it. In his case, improvement is seemingly likely, and his xwOBA isn’t much higher than actual, so a sub 4 ERA is still pretty likely. But without a lot of improvement to his peripherals, a sub-3.00 ERA two years in a row will be a stretch.

Luke Weaver: Similar to Gallen. It’s notable that Luke’s xFIP and SIERA are considerably higher than his FIP. Simply put, he’s expected to give up a higher rate of HR. We can see this in the sizable gap between his wOBA and xwOBA. And this is before we take into account innings limitations and injury recovery.

Taylor Clarke: I don’t think the team is counting on a lot of starts from Clarke in 2020, but I grouped him with the starters. His ERA was high enough as it was, but the estimators say it could have been a lot worse. At least his xwOBA was slightly lower than wOBA, but it’s not much consolation.

Jon Duplantier: Here again, while FIP was lower than ERA, xFIP, SIERA and xwOBA all indicate he was actually fortunate not to allow more runs than he did. He needs to improve his walk rate, and his command within the strike zone as well, to reduce hard contact, or an ERA in the mid 4’s or higher is likely.

Merrill Kelly: He beat his peripherals in 2019 over 183 IP, so that was good. But the odds still favor some regression.

Robbie Ray: He is the ONLY starter whose run estimators are lower than actual ERA. But in his case, the command and homer issues don’t feel like they are about to improve dramatically any time soon.

Relief Pitcher Comments:

Kevin Ginkel: As much as we all enjoyed his rookie success, I don’t think anyone expects him to repeat a 1.48 ERA. Regression to an ERA in the mid 3’s is pretty likely. That would still be good, but it’s not 1.50.

Yoan Lopez: Lopez had the highest hard hit rate of any pitcher in MLB last year, and his peripheral metrics paint a picture of a guy who was EXTREMELY LUCKY to have a sub-4.00 ERA. He is the poster boy for this article’s theme.

Junior Guerra: The newly acquired Guerra is another regression candidate, having beaten his run estimators by over a run. As I pointed out in the comments section of the signing thread, he also had a 42% inherited runners scored rate, 111th out of 129 relievers with over 20 IR.

Archie Bradley: The run estimators think Archibald should have given up more homers than he did. Hopefully the run estimators are wrong about him, as he’s going to be the closer and we don’t like closers giving up home runs.

Stefan Crichton: Here we have an interesting case as his run estimators are slightly lower than his ERA, but his batting against estimators are much higher than actual batting against. Coin toss.

Andrew Chafin: One of three relievers that was supposedly “unlucky” in 2019. His peripherals suggest his ERA should have been almost half a run lower. His wOBA looks largely deserved though.

Matt Andriese: If anyone is wondering why the team seems to have so much confidence in him going forward, this is probably why. They simply think he has pitched much better than his results, and these numbers support that view. YMMV.

Jimmie Sherfy: He was pretty unlucky in his small sample size last year, but even his expected ERA and xwOBA were worse than average. He’ll need to regain his velocity, and boost his K rate, and throw more strikes or he simply won’t make the team.

So there you have it. The only “positive” regression candidates of note are Ray, Chafin, and Andriese. Every other pitcher of note on this roster had ERA results better than what the run estimators say they should have been. We can argue about any one individual pitcher and probably come up with plenty of plausible reasons why these metrics are underestimating him. But all of them?