What is Sabermetrics?

A brief introduction to the history and basics of Sabermetrics with a special nod to our Opening Day victory

San Diego Padres v Arizona Diamondbacks
Yasmany Tomas’s defense is often a highly contested subject
Photo by Christian Petersen/Getty Images

What is Sabermetrics?

The term is thrown around often, but what does it actually mean? In 1980, Bill James defined Sabermetrics as “the search for objective knowledge about baseball.” At the root of Sabermetrics is “Saber”, a reference to the Society for American Baseball Research (SABR).

The key word in all of this, however, is objective. The definition is straightforward: “a judgment that is not influenced by personal feelings or opinions when considering and representing facts”. The problem is that in most arguments, the participants are actually being more subjective - influenced by personal feelings or opinions - than objective, without ever realizing it. This is because humans are naturally biased and often don’t recognize when these biases are in play.

For example, look at our Opening Day victory over the Giants. Already, the Pit has had intense debates over the defense of Yasmany Tomas and Jake Lamb regarding two separate plays, and these debates have been almost entirely subjective. People questioned Tomas’s route and claimed it was a bad play. Some questioned Lamb’s defense and whether he played the ball correctly, while others chimed in that there was no way Lamb could have made an out even if he had played the ball cleanly. But these are all subjective claims - opinions from different people who viewed the plays in different ways. Is there a way to know who was right?

Unfortunately, the objective evidence we have for these two plays is pretty limited:

Regarding Tomas’s catch, Statcast’s Catch Probability metric (which I’ll cover in more detail in a later post) put the catch probability at only 21%, which means only the best players are going to make that catch. So we can objectively say that Tomas was unlikely to make the catch, no matter what you might have thought about how he played it. Unfortunately, we currently have very little objective data about Lamb’s play at third, so that argument will probably never be fully resolved.

And this is the problem with human observation - our biases naturally come into play. I didn’t do a comment count (sue me for being subjective here), but from glancing over the Gameday Thread (as well as the main recap), it seems like the majority of commenters labeled Tomas’s effort a bad play when, objectively, it was just a very tough catch. Objectively, it was a hard catch to make; subjectively, Tomas made a bad play. Does it make sense now how human observation and bias can influence how a play is viewed?

And, in a nutshell, this is what Sabermetrics is all about. Sabermetrics is after quantitative data - data that can be measured and written down with numbers. Baseball is loaded with quantitative data - hits, doubles, home runs, strikeouts, walks, catches, etc. are all forms of quantitative data. We have a play and we have results that we can write down and measure. There is no bias in saying “Goldschmidt went 2-for-4 with a walk.” That is literally what happened, no matter how one might perceive it.

Sabermetrics needs quantitative data because it’s impossible to do statistical analysis on data that can’t be measured and represented with numbers.
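To make that concrete, here is a minimal Python sketch of how the Goldschmidt line quoted above turns into numbers you can actually analyze. The `BattingLine` class and its field names are my own, made up for illustration. I am also assuming zero hit-by-pitches and zero sacrifice flies, since the line doesn’t mention any. The two formulas are the standard ones for batting average and on-base percentage.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """One hitter's counting stats for a game (hypothetical helper class)."""
    at_bats: int
    hits: int
    walks: int
    hit_by_pitch: int = 0  # assumption: not mentioned in the line, so 0
    sac_flies: int = 0     # assumption: not mentioned in the line, so 0

    def batting_average(self) -> float:
        # AVG = H / AB
        if self.at_bats == 0:
            return 0.0
        return self.hits / self.at_bats

    def on_base_percentage(self) -> float:
        # OBP = (H + BB + HBP) / (AB + BB + HBP + SF)
        denominator = (
            self.at_bats + self.walks + self.hit_by_pitch + self.sac_flies
        )
        if denominator == 0:
            return 0.0
        return (self.hits + self.walks + self.hit_by_pitch) / denominator


# "2 hits in 4 at-bats, plus a walk"
line = BattingLine(at_bats=4, hits=2, walks=1)
print(f"AVG: {line.batting_average():.3f}")     # AVG: 0.500
print(f"OBP: {line.on_base_percentage():.3f}")  # OBP: 0.600
```

Anyone who runs this on the same line gets the same two numbers, which is the whole point. There is no equivalent calculation for “Player Z is gritty.”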

What Sabermetrics doesn’t want is qualitative data - data that can’t actually be measured. This is often associated with the “traditional” view of baseball, as there are many overlaps. Examples of qualitative data are things like “Player X has a nice swing” or “Player Y is a great veteran presence” or “Player Z is gritty” (sorry, I had to). I know these examples are a bit over-the-top, but they are still rather commonplace in the game today. There are also more “technical” forms of data out there that are still qualitative in nature - e.g., bat speed, certain aspects of fielding, various forms of mechanics, etc. Qualitative data is still meaningful, but it is generally only useful in the hands of someone qualified, such as a scout. Sabermetricians are looking for ways to turn qualitative data into quantitative data (see: Statcast) because that allows the data to be properly analyzed. Until then, it is difficult to make unbiased judgments with qualitative data, especially when predicting future performance.

Qualitative data isn’t “bad”, but unless it can be properly quantified, it is subject to human biases, which makes it difficult to judge the data accurately and to predict the future.

So, this is essentially the “philosophy” of a Sabermetrician. We are looking to view the game as objectively and with as little bias as possible. There are a lot of different things you can do with statistical analysis - player valuation, finding the “true” talent of a player, and, most of all, trying to predict future performance. However, using the tools properly is the next step in Sabermetrics. Next week, I will begin with hitting stats and metrics. My goal is to break down the metrics, explain why and how each one is useful, and show how they can/should be used going forward.

Also, starting next week, I will be including some form of analysis as a separate part of the post. It typically won’t be related to the subject at hand, but I feel like it would be nice to still have ongoing analysis as a change of pace in the articles.