Mar. 19, 2012; Phoenix, AZ, USA; Arizona Diamondbacks outfielder Justin Upton throws his bat after striking out in the fifth inning against the Oakland Athletics during a spring training game at Phoenix Municipal Stadium. Mandatory Credit: Mark J. Rebilas-US PRESSWIRE
We face one particular question in nearly every analysis we make: do I accept that this is normal, or not? Of course, a million questions follow once that Pandora's box is opened, but it can often be boiled down to that simple question.
Normality is a hard thing to measure, however. In what ways do we mean normal? Depending on the situation, that might mean physical attributes, actions, or behaviors, or perhaps something else entirely. A much easier question to answer is "what is average?" Once we know the statistical average of something, it is much easier to determine whether what we're analyzing deviates significantly from the baseline.
None of what I've said above is particularly controversial. Yet the very act of picking a baseline and deciding what counts as deviation from it can cause great friction.
People naturally feel itchy around statistics, or even just the vague idea of human engineering. Part of it is that thinking statistically is not natural for humans, which leads to the uncomfortable feeling that a sabermetric approach will lead to some kind of Orwellian nightmare.
Let's not fall too deep into the despairs of literary critique, however. What is facing the Diamondbacks fandom is the threat of a dismal season due to regression to the mean. It's the concept driving virtually every argument that the D-backs will slide backwards from 2012, and it isn't particularly nice. It is also almost certainly true.
For the sake of putting us all on the same page, I'll briefly describe regression to the mean. It fleshes out the concept I described at the beginning of the piece: there is some known or estimable mean, and any large deviation from it is more likely than not to be followed by a return toward that mean. In other words, if Shaq is a 52.7% free throw shooter and he shoots 90% one night, what should we expect in the next game? Did he suddenly learn how to shoot free throws? No. He likely had an unusually good night, and we should fully expect the next game to land somewhere between 52.7% and 90%.
Figuring out exactly where between those two numbers it will land is the tricky part. Shaq could very well shoot worse than 52.7% the next night, and there is a small chance he shoots much closer to 90%. But, through the magic of statistics, we know that overwhelmingly the smart "bet" is to assume the number will be closer to the baseline. This only works, of course, if the initial mean of 52.7% actually is true. I think we can assume it is close to true for Shaq, given he played in roughly 1,200 games.
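To put rough numbers on that "bet," here is a minimal sketch that treats each free throw as an independent shot made at the 52.7% baseline. The 10 attempts per game is my own assumption for illustration, not a figure from Shaq's record, and real shooting nights are not perfectly independent, so take it as a back-of-the-envelope check rather than a real model.

```python
from math import comb

def binomial_pmf(makes, attempts, p):
    """Probability of exactly `makes` successes in `attempts` independent shots."""
    return comb(attempts, makes) * p**makes * (1 - p)**(attempts - makes)

def prob_makes_between(low, high, attempts, p):
    """Probability that the number of makes falls in [low, high], inclusive."""
    return sum(binomial_pmf(k, attempts, p) for k in range(low, high + 1))

TRUE_RATE = 0.527   # the baseline quoted in the article
HOT_RATE = 0.90     # the one-off hot night
ATTEMPTS = 10       # hypothetical attempts per game (assumption)

# How likely is a 90%-or-better night for a true 52.7% shooter?
hot_night = prob_makes_between(9, ATTEMPTS, ATTEMPTS, TRUE_RATE)

# The hot night tells us nothing new about the next game: the expected
# number of makes is still attempts * baseline.
expected_makes = ATTEMPTS * TRUE_RATE

# Chance that the next game's percentage lands closer to the baseline
# than to the 90% night.
closer_to_baseline = sum(
    binomial_pmf(k, ATTEMPTS, TRUE_RATE)
    for k in range(ATTEMPTS + 1)
    if abs(k / ATTEMPTS - TRUE_RATE) < abs(k / ATTEMPTS - HOT_RATE)
)

print(f"P(90%+ night): {hot_night:.3f}")
print(f"Expected makes next game: {expected_makes:.2f} of {ATTEMPTS}")
print(f"P(next game closer to baseline than to 90%): {closer_to_baseline:.3f}")
```

Under these assumptions the hot night is rare, and the next game is far more likely to sit near the baseline than near 90%, which is all regression to the mean is claiming.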
So to take this framework and apply it to the Diamondbacks, we would look at each player's performance metrics and ask whether he outperformed his baseline, or whether the 2012 number is closer to his real level. It's difficult to say, because it depends on the sample size for each player. If it's a rookie, or even a guy with only a couple years of experience, we can't really say he won't get better. But there comes a point when we can't expect a player to make a major jump in production. It's that assumption that is weighing down the projections for the Diamondbacks in 2012.
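One common way to make the sample-size point concrete is to blend a player's observed rate with a baseline, giving the observed rate more weight as the sample grows. The sketch below is only an illustration of that idea. The function name, the .255 baseline, and the 500-plate-appearance `stabilizer` are all hypothetical values I picked for the example, not numbers from any projection system or any actual Diamondback.

```python
def regress_to_mean(observed_rate, sample_size, baseline_rate, stabilizer):
    """Blend an observed rate with a baseline, weighted by sample size.

    `stabilizer` is a hypothetical constant: the sample size at which the
    observed rate and the baseline are trusted equally.
    """
    weight = sample_size / (sample_size + stabilizer)
    return weight * observed_rate + (1 - weight) * baseline_rate

BASELINE = 0.255    # hypothetical baseline average (assumption)
STABILIZER = 500    # hypothetical stabilizing constant (assumption)

# Two made-up players who both hit .300, over very different samples.
small_sample = regress_to_mean(0.300, 100, BASELINE, STABILIZER)
large_sample = regress_to_mean(0.300, 3000, BASELINE, STABILIZER)

print(f"Projection from 100 PA:  {small_sample:.3f}")
print(f"Projection from 3000 PA: {large_sample:.3f}")
```

The player with the short track record gets pulled most of the way back to the baseline, while the one with the long track record keeps most of his .300. That is why the same hot season means something different for a veteran than for a second-year player.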
Even a slight regression towards the mean doesn't imply awfulness. The Diamondbacks could have a very good year because other players have career years, or because the Giants are particularly bad, or because of some other unpredictable event we can't account for. Raising the question shouldn't be taken to mean that we don't support particular players, or that we're lousy fans. We're all trying to get to the same place.