Probability and Zac Gallen's Scoreless Innings Streak

Earlier, I highlighted how Zac Gallen is doing something that no one has done in quite some time. In addition to everything I mentioned there, Jim brought up in the GDT that Gallen became just the third pitcher in history to post three consecutive starts with no runs allowed, no walks allowed, and at least seven strikeouts (joining Corbin Burnes and Clayton Kershaw). Also, although I have not been able to verify it, it seems extremely likely that Gallen's streak last year featured the most strikeouts of any scoreless innings streak, ever.

But how does the run scoring environment make Gallen's accomplishments look?

Let's start with last year's streak, as well as every other streak of at least 40 innings in a single season (list here). That list leaves out Gregg Olson's streak because it counts only streaks within a single season. (Josh Hader's streak of 40 consecutive scoreless appearances is Sir-Not-Appearing-in-this-Film because it did not last 40 innings.)

Twenty-five pitchers have had such a streak, but many of them can immediately be eliminated as coming from entirely different eras. Nothing against Jack Coombs, Doc White, Cy Young, the most interesting pitcher ever Rube Waddell, or the American League record holder Walter Johnson, but they were pitching in the dead-ball era and the segregation era. I also considered eliminating all of the streaks from the "Year of the Pitcher" but decided that they could stand, since I could relatively easily calculate the run scoring environment for the season. Doing so also showed me that the AL in 1972 was actually a worse run scoring environment than the AL in 1968. I was left with twelve streaks that did not come from the dead-ball or segregation eras. And here is where things got interesting.

There are two ways to calculate the basic probability. One would be to take the number of innings thrown by all teams over the course of a season and calculate the chances of x of them being consecutive scoreless innings. However, this would not account for things like pitcher usage or pitcher health. So, instead, I took the number of innings thrown that season by each pitcher with a streak and combined it with the probability of a run being scored in any given inning in that league and season. Two surprises appeared. First, the pitcher who managed a 40+ inning scoreless streak in the fewest total innings in a season was not Zac Gallen but Luis Tiant, who threw just 179 innings in 1972. Second, the run scoring environment in 1950 was extremely offense friendly.
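For anyone who wants to play with the numbers, here is a minimal Python sketch of one way to set up the per-pitcher calculation. It is not necessarily the method behind the figures in this post. It assumes every inning is independent and is scoreless with the same probability, it rounds innings to whole numbers, and the function name `streak_probability` and the 0.72 used in the usage comment are purely illustrative placeholders, not league data.

```python
def streak_probability(n_innings: int, streak_len: int, p_scoreless: float) -> float:
    """Chance of at least one run of `streak_len` consecutive scoreless innings
    somewhere in `n_innings` innings.

    Assumption (not from the post): innings are independent and each one is
    scoreless with the same probability `p_scoreless`.
    """
    # state[j] = probability that no full streak has happened yet and the
    # pitcher is currently on a scoreless run of exactly j innings.
    state = [0.0] * streak_len
    state[0] = 1.0
    hit = 0.0  # probability that a full streak has already been completed

    for _ in range(n_innings):
        new_state = [0.0] * streak_len
        # Allowing a run resets the current scoreless run to zero.
        new_state[0] = sum(state) * (1.0 - p_scoreless)
        # A scoreless inning extends the current run by one inning.
        for j in range(streak_len - 1):
            new_state[j + 1] = state[j] * p_scoreless
        # A scoreless inning from a run of streak_len - 1 completes the streak.
        hit += state[streak_len - 1] * p_scoreless
        state = new_state

    return hit


# Hypothetical usage: 184 innings, a 44-inning streak, and an assumed
# (made-up) 72% chance that any given inning is scoreless.
# odds = 1 / streak_probability(184, 44, 0.72)
```

Tracking the current run length inning by inning avoids double-counting overlapping windows, which a simple "number of windows times p to the power k" shortcut would do.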

This table shows the odds of an average pitcher in that league and season, pitching that number of innings, putting together a scoreless innings streak of that length.

Pitcher          Season  Streak (IP)  Season IP  1 in:
Sal Maglie       1950    45           206        3,493,891,900,420
Orel Hershiser   1988    59           267        2,886,599,453,361
Zac Gallen       2022    44.1         184        75,196,398,967
Zack Greinke     2015    45.2         222.2      50,054,381,755
R.A. Dickey      2012    44.2         233.2      46,226,667,382
Brandon Webb     2007    42           236.1      22,756,868,130
Don Drysdale     1968    58           239        15,599,207,624
Clayton Kershaw  2014    41.2         198.2      641,130,366
Bob Gibson       1968    47           304.2      57,379,266
Gaylord Perry    1967    40           293        41,551,871
Luis Tiant       1972    40           179        5,706,447
Luis Tiant       1968    41           258.1      3,920,406
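A note on reading the innings columns: in standard box-score notation, the digit after the decimal point counts outs, so 44.1 means 44 and one-third innings, not 44.1. A small sketch of the conversion, along with a helper for turning a probability into "1 in N" odds, might look like this. Both function names are hypothetical and are not part of how the table was built.

```python
from fractions import Fraction


def innings_to_fraction(ip: str) -> Fraction:
    """Convert box-score innings notation ("44.1") to an exact number of innings."""
    whole, _, outs_text = ip.partition(".")
    outs = int(outs_text) if outs_text else 0
    if outs not in (0, 1, 2):
        raise ValueError(f"not valid innings notation: {ip!r}")
    # Each out is one third of an inning.
    return Fraction(int(whole)) + Fraction(outs, 3)


def one_in(probability: float) -> str:
    """Format a probability as rounded "1 in N" odds with thousands separators."""
    return f"1 in {round(1 / probability):,}"
```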

Before doing this, I'm not sure I could have told you that Sal Maglie even had a scoreless streak of 45 innings. I recognized the name, but only from his starting the game with the "Shot Heard Round the World" and starting opposite Don Larsen in his perfect game. Had I been told that Maglie's streak was the most improbable, I certainly wouldn't have believed it. But desegregation played a key role in the improbability. Black players joined the National League more quickly than the American League, and those who joined were primarily hitters, not pitchers.

Also, only the best of the hitters were joining the leagues. Those factors probably led to the increase in offense in 1950. On top of that, because Maglie started the year in the bullpen, he pitched only 206 innings, the fourth fewest of any pitcher on this list. So the chances of pitching 45 consecutive scoreless innings out of 206 total innings in that run environment were 1 in almost 3.5 trillion. Hershiser's streak went 14 innings longer, but it came in a less offense-friendly environment, and he threw the third most innings of the pitchers on this list.

For the record, the probability of Zac's current streak using this method is 1 in 106,794,373, which would put it between Kershaw's and Gibson's streaks for improbability. I'd hate to think what the probability of having two such streaks in so few innings would be, but it would probably break my brain to calculate it.
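For the brave, here is a rough Python sketch of how the two-streak question could be approached. It is only an illustration under strong assumptions: innings are treated as independent with one fixed scoreless probability, innings are whole numbers, and all of the innings are treated as one continuous span, even though Gallen's two streaks came in different seasons with different run environments. The function name `prob_separate_streaks` is hypothetical.

```python
def prob_separate_streaks(n_innings: int, streak_len: int,
                          p_scoreless: float, wanted: int = 2) -> float:
    """Chance of at least `wanted` separate scoreless streaks of `streak_len`
    or more innings within `n_innings` innings.

    "Separate" means at least one run-scoring inning falls between streaks,
    so one very long streak is counted only once.
    Assumption: innings are independent with a fixed scoreless probability.
    """
    p, q = p_scoreless, 1.0 - p_scoreless
    # dist[c][j]: c streaks completed so far (c < wanted) and a current
    # scoreless run of j innings. j == streak_len is a special state meaning
    # "still extending a streak that has already been counted".
    dist = [[0.0] * (streak_len + 1) for _ in range(wanted)]
    dist[0][0] = 1.0
    done = 0.0  # probability that `wanted` streaks have been completed

    for _ in range(n_innings):
        new = [[0.0] * (streak_len + 1) for _ in range(wanted)]
        for c in range(wanted):
            for j in range(streak_len + 1):
                mass = dist[c][j]
                if mass == 0.0:
                    continue
                new[c][0] += mass * q              # a run scores: reset the run
                if j == streak_len:
                    new[c][streak_len] += mass * p  # counted streak keeps going
                elif j + 1 < streak_len:
                    new[c][j + 1] += mass * p       # run grows, not yet a streak
                elif c + 1 == wanted:
                    done += mass * p                # final required streak completed
                else:
                    new[c + 1][streak_len] += mass * p  # streak completed and counted
        dist = new

    return done
```

With `wanted=1` this reduces to the ordinary single-streak probability, which gives a quick sanity check before trusting it on the two-streak case.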

I'd be interested in what actual mathematicians could do with the probability of these streaks, but on the whole the results pass the smell test, based on the expected run scoring environments. The one exception is Maglie, and that was only because I had thought of 1950 as a low-scoring year, not one of the highest scoring seasons in terms of runs per inning.

Thoughts?