Fielding Metrics, Part 1: From the ground(ball) up

Defensive numbers are probably the last great unexplored frontier of baseball statistics. Great strides have been made in the analysis of hitting and pitching over the past few years - and also in the public acceptance of such numbers, to the point where OPS can now be found on the scoreboard at Chase. However, this is not the case with fielding, where errors are still regarded in many quarters as a reliable measurement of fielding ability.

While they certainly have their place, a simple example will show the terror of errors. Ten balls are hit at each of two players. Infielder X reaches all ten, but bobbles one, and gets charged with an error. Infielder Y, whose range is limited, reaches eight but fields them all safely. X has a .900 fielding percentage, while Y has a perfect 1.000. But X is the better fielder, in terms of the actual game outcome: he retired nine of the ten batters, while Y only got eight. Errors do a very poor job of taking into account a player's range, a crucial part of defense. This is especially true at first and third: if a shot gets past a player down the line, it's much more likely to go for extra bases than if it goes past the shortstop or second baseman. Getting to a ball, even if you sometimes then boot it, is preferable to not reaching it at all.
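To put numbers on that, here's a minimal Python sketch of the two infielders. It assumes the usual simplification that fielding percentage is plays made divided by plays made plus errors; the function names are just for illustration.

```python
def fielding_percentage(plays_made, errors):
    """Share of the balls a fielder reached that he handled cleanly."""
    return plays_made / (plays_made + errors)


def out_rate(plays_made, balls_hit):
    """Share of all balls hit his way that actually became outs."""
    return plays_made / balls_hit


BALLS_HIT = 10

# Infielder X reaches all ten but boots one; Y reaches only eight, cleanly.
fielders = {"X": (9, 1), "Y": (8, 0)}

for name, (plays_made, errors) in fielders.items():
    fpct = fielding_percentage(plays_made, errors)
    outs = out_rate(plays_made, BALLS_HIT)
    print(f"{name}: F% {fpct:.3f}, outs on {outs:.0%} of balls hit")
```

Y wins on fielding percentage, X wins on outs, and only the second one shows up on the scoreboard.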

Both errors and fielding percentage are very position-dependent too. The best F% among qualifying shortstops was Jimmy Rollins' .988, but that figure would have made him a below-average center fielder. The following chart shows the average F% at each spot on the diamond, based on all qualifying players during the 2008 season, and the number of errors expected there (Ex.E), based on somebody playing every inning of a 162-game season at that position [not applicable for pitchers, obviously]. It'll give you something against which to measure the raw numbers, and I've also listed the D-backs players, their F%, and their number of errors scaled to a 162-game season (Err.).

Pos   F% Ex.E  Player     F% Err.
=== ==== ====  ======== ==== ====
1B  .994  9.5  Jackson  .993 10.2
               Tracy    .993 11.2
2B  .985 12.7  Hudson   .982 14.5
SS  .976 17.4  Drew     .976 20.9
3B  .956 19.1  Reynolds .904 38.5
LF  .983  5.3  Jackson  .981  6.7
               Byrnes   .987  3.5
CF  .989  4.6  Young    .993  3.1 
RF  .987  4.4  Upton    .943 18.6
 P  .964  N/A
 C  .993  8.4  Snyder  1.000  0.0
               Montero  .989 14.4
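
For anyone wanting to try the scaling on other players, here's a rough Python sketch of how a raw error total could be stretched to a full season. It assumes a full season is 162 games of nine innings (1,458 innings, ignoring extras and shortened games), and the sample inputs are made up, not anyone's real totals. Treat it as one plausible method, not necessarily the exact calculation behind the chart.

```python
GAMES = 162
INNINGS_PER_GAME = 9  # assumption: regulation games only, no extra innings
FULL_SEASON_INNINGS = GAMES * INNINGS_PER_GAME  # 1458


def errors_per_full_season(errors, innings_played):
    """Scale a raw error total to every inning of a 162-game season."""
    if innings_played <= 0:
        raise ValueError("innings_played must be positive")
    return errors * FULL_SEASON_INNINGS / innings_played


# Hypothetical fielder: 10 errors in half a season's worth of innings.
sample = errors_per_full_season(errors=10, innings_played=729)
print(f"{sample:.1f} errors per 162 games")
```

The same scaling works for any counting stat, which is why part-timers can end up with alarming full-season numbers off a small sample.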

The problem with the 'error' is not just that it's subjective: after all, balls and strikes are subjective too. However, they are at least well-defined, unlike errors, which are remarkably nebulous. Everyone recognizes a home run or an RBI, but per the official rules, an error is basically decided by whether an out would have been made by the fielder with "ordinary effort." And how is that defined? Well, it's based on a fielder of average skill at the position in that league, "with due consideration given to the condition of the field and weather conditions." There is a huge amount of latitude available here.

The official scorer is selected by the home club, and one wonders whether that has an impact on his decision-making. It'd be an interesting exercise to analyze errors awarded and see whether there is a bias in these matters. If there were, one might imagine the visiting team being charged with fewer errors than expected - giving the home batters the benefit of the hits instead. It's harder to predict what would happen when the home team took the field: more errors would help the pitchers, but would make the defenders look bad.

The first steps beyond raw fielding percentage are Range Factor and Zone Rating, which make the important move from penalizing a player for making mistakes, to rewarding him for making plays. Range Factor is quite simple, and uses the same basic data as Errors. It's (Putouts + Assists) per nine innings played, i.e. 9 x (Putouts + Assists) / Innings, and while likely an improvement over F%, there are problems here. Most obviously, strikeouts use up innings without giving the fielders behind the pitcher any chance for a putout or assist. Fielders playing behind high-K pitching staffs [and last year, the Cubs had 32% more K's than the Cardinals] will have lower Range Factors. Similarly, outfielders on a ground-ball heavy team will get fewer putouts - as will anyone playing alongside Orlando 'the Pop-up Vacuum Cleaner' Hudson. :-)
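
Here's a small Python sketch of the calculation and of the strikeout problem. Two things are assumptions on my part: the per-nine-innings scaling (the scale on which a first baseman lands around 9), and the crude model in `plays_per_nine`, which pretends a fielder is involved in a fixed share of the outs that aren't strikeouts. All the inputs are hypothetical.

```python
OUTS_PER_NINE = 27


def range_factor(putouts, assists, innings, per_nine=True):
    """(Putouts + Assists) per inning, scaled to nine innings by default."""
    rate = (putouts + assists) / innings
    return 9 * rate if per_nine else rate


def plays_per_nine(share_of_fielded_outs, staff_k_per_nine):
    """Crude model: the fielder takes part in a fixed share of non-K outs."""
    fielded_outs = OUTS_PER_NINE - staff_k_per_nine
    return share_of_fielded_outs * fielded_outs


# Hypothetical shortstop: 200 putouts and 400 assists in 1200 innings.
print(range_factor(200, 400, 1200))

# Same fielder, same skill, different pitching staffs.
for k_rate in (6, 8):
    rf = plays_per_nine(0.15, k_rate)
    print(f"staff K/9 of {k_rate}: about {rf:.2f} plays per nine")
```

Nothing about the fielder changes between the two staffs, but his Range Factor drops purely because more batters never put the ball in play.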

Here are the Range Factor numbers for the Diamondbacks in 2008, along with the league average figures for their position (min. 200 innings at that position). I've also converted our fielders' stats to a percentage of league average, to make it easier to see how they compared:

Pos Player     RF   Lg.Avg.  Rating
=== ========  ====  =======  ======
1B. Jackson   8.86   9.34     94.9%
    Tracy     9.53           102.0%
2B. Hudson    4.81   4.83     99.6%
    Ojeda     5.10           105.6%
SS. Drew      3.95   4.43     89.2%
3B. Reynolds  2.25   2.60     86.5%
LF. Jackson   2.07   1.90    108.9%
    Byrnes    1.65            86.8%
CF. Young     2.58   2.61     98.9%
RF. Upton     1.89   2.12     89.2%
    Romero    1.63            76.9%
P.  Webb      2.86   1.73    165.3%
    Haren     1.21            69.9%
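
The Rating column is nothing fancier than the player's RF divided by the league average at his position. A quick Python sketch, using a handful of rows from the table above (the function name is just for illustration):

```python
def pct_of_league(rf, league_avg):
    """Player's Range Factor as a percentage of league average, to 1 decimal."""
    return round(100 * rf / league_avg, 1)


# (player, position, RF, league average RF) taken from the table above
rows = [
    ("Jackson", "1B", 8.86, 9.34),
    ("Drew", "SS", 3.95, 4.43),
    ("Reynolds", "3B", 2.25, 2.60),
    ("Webb", "P", 2.86, 1.73),
    ("Haren", "P", 1.21, 1.73),
]

for player, pos, rf, lg_avg in rows:
    print(f"{pos:<3}{player:<9}{pct_of_league(rf, lg_avg):>6.1f}%")
```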

The pitching stats particularly point out the impact that external factors, other than fielding ability, can have on these numbers. I don't think even Brandon's mother would claim he was twice as good with the glove as Dan Haren; the gulf in their numbers reflects, to a great extent, the far greater number of chances Webb received, thanks to his sinkerball pitching. Drew's shortcomings at short and Reynolds' at third are also pretty apparent.

Finally in this section, let's take a look at Zone Rating, which was invented by John Dewan, then CEO of Stats Inc. The field is divided into zones: for example, the shortstop's zone covers everywhere he has a better than 50% chance of catching a ball. [This article includes a link to the grid, and a description of which fielder covers which sections] Every ball hit into a player's zone is counted as a chance, as is every ball he catches outside his zone - for infielders, only ground-balls are counted, not line-drives or pop-ups. The Zone Rating is simply the percentage of chances converted into outs.
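
As described, the arithmetic can be sketched in a few lines of Python. This is my reading of the definition above, not Stats Inc.'s actual implementation, and the counts are hypothetical.

```python
def zone_rating(balls_in_zone, in_zone_outs, out_of_zone_outs):
    """Outs made divided by chances, where chances are every ball hit
    into the fielder's zone plus every play he makes outside it."""
    if in_zone_outs > balls_in_zone:
        raise ValueError("cannot convert more balls than were hit into the zone")
    chances = balls_in_zone + out_of_zone_outs
    outs = in_zone_outs + out_of_zone_outs
    return outs / chances


# Hypothetical fielder: 90 balls in his zone, 72 converted, plus 10 plays
# made outside the zone.
print(f"{zone_rating(90, 72, 10):.2f}")
```

Note that the ten out-of-zone plays go into both the top and bottom of the fraction, so each one counts exactly the same as a routine play made in the zone.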

This continued refinement brings us closer to rating a player on whether or not he "should" have caught the ball. The main weakness here is the way the system handles out-of-zone plays: rather than earning extra credit, they are treated no differently from a successful play made in the player's own zone. As a result, ZR tends to reward sure-handedness at the expense of range, in a similar way [albeit to a much smaller extent] to Fielding Percentage. On the other hand, Chris Dial has done very good work in converting ZR to runs saved, so we can convert the number to something comparable across positions. Here are the figures for Arizona, along with the runs allowed or saved compared to average. Note that this value doesn't take into account the number of innings played, so the last column is scaled, to give a value if they'd played every inning of every game.

Pos Player    ZR   Runs  R/162
=== ======== ===   ====  =====
1B. Jackson  .83    -3     -8
    Tracy    .81    -5    -13
2B. Hudson   .79    -5     -8
    Ojeda    .79    -1     -5
SS. Drew     .80   -11    -12
3B. Reynolds .75    -8     -9  
LF. Jackson  .87     2      5
    Byrnes   .91     4     12
CF. Young    .88     3      3
RF. Upton    .83    -6    -11
    Romero   .88     0      3

[A shout-out goes to the generous data source who rescued me when, literally in the middle of this piece, ESPN's pages stopped showing Zone Rating, and I wasn't able to find it anywhere else!]

It's particularly interesting to note how badly this metric makes the Diamondbacks defense look, especially around the infield, where every position was well below average. There was, by this scale, basically no difference between the performance of Jackson at first, Hudson at second and Reynolds at third - though Drew at shortstop was worse than any of them. Left field and, marginally, center were the only positions where our defense was above average; this deficiency might help explain why Webb's ERA increased somewhat this season.

At the time of writing, it looks like the only significant change in the defensive staff will be the replacement of Hudson with Felipe Lopez. While you might expect the loss of a multi-Gold Glover like O-Dawg to have a big impact, looking at Lopez's 2008 numbers in these three categories, it's not quite as clear-cut as you'd think. In Fielding Percentage, Hudson does have a significant lead (.985 to .974). However, in Range Factor, it's a good deal closer (Hudson is 4.81, Lopez 4.71), and Lopez had the better Zone Rating, in both Washington (.82) and St. Louis (.81), than Hudson (.79).

Here endeth part one. Next time [probably next week], I'll take a look at the more advanced measurements which are increasingly being used to measure defensive performance, and how they rank the Diamondbacks in 2008.