Don't FIP Me Off
I saw a comment about this at Fangraphs, and it amazed me how flawed a statistic FIP is. I didn't think it got enough attention there, and I'm not sure people here on the 'Pit realize how seriously flawed FIP may be, so I decided to write a post about it.
One of the biggest problems with FIP lies in the fact that it's based on innings pitched, as opposed to batters faced. Let's have a thought experiment:
Imagine two pitchers. Let's imagine them as genetically enhanced Dan Harens, because they never, ever walk anybody. The only possible outcomes when they face a batter are a hit, a strikeout, or some other out. Now let's imagine that one of these pitchers has a BABIP of .400, whereas the other has a BABIP of .100. The following table illustrates what happens to these pitchers for every ten batters they face:
| | Pitcher A | Pitcher B |
|---|---|---|
| BABIP | .400 | .100 |
| Batters Faced | 10 | 10 |
| Strikeout Percentage | 30% | 30% |
| Strikeouts | 3 | 3 |
| Hits | 4 | 1 |
| Outs Recorded | 6 | 9 |
| Innings Pitched | 2 | 3 |
| K/9 | 13.5 | 9 |
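The arithmetic behind the last row can be sketched in a few lines of Python. This is only an illustration using the table's own counts; the function names are hypothetical, and it assumes every batter either strikes out, gets a hit, or makes some other out.

```python
def k_per_9(strikeouts, outs):
    """Strikeouts per nine innings, where innings = outs / 3."""
    innings = outs / 3
    return 9 * strikeouts / innings


def k_pct(strikeouts, batters_faced):
    """Strikeouts as a share of batters faced."""
    return strikeouts / batters_faced


# Counts taken straight from the table: 10 batters faced, 3 strikeouts each.
pitchers = {
    "Pitcher A": {"bf": 10, "so": 3, "hits": 4},
    "Pitcher B": {"bf": 10, "so": 3, "hits": 1},
}

for name, p in pitchers.items():
    outs = p["bf"] - p["hits"]  # no walks, so every non-hit is an out
    print(name, "K/9:", k_per_9(p["so"], outs), "K%:", k_pct(p["so"], p["bf"]))
```

Both pitchers strike out the same share of batters, but the one who allows more hits needs more batters to get through an inning, so his K/9 comes out higher.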
Comments
BABIP
I think your numbers are a little off. If Pitcher A faces 10 batters, and strikes out 3, with a BABIP of .400, he would have 7 Balls in Play, and allow approximately 3 hits, and record 4 outs (plus the 3 Ks, for a total of 2.1 IP). Doesn’t change your point much, but it does move the numbers a little closer.
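A small Python sketch of that correction, applying BABIP only to balls in play rather than to all batters faced. The function name is hypothetical, and the hit totals are left as unrounded expected values instead of being rounded to whole hits as in the comment above.

```python
def expected_line(batters_faced, strikeouts, babip):
    """Expected hits, outs and K/9 when BABIP applies only to balls in play."""
    balls_in_play = batters_faced - strikeouts  # no walks or homers assumed
    hits = balls_in_play * babip
    outs = strikeouts + (balls_in_play - hits)
    innings = outs / 3
    return {"hits": hits, "outs": outs, "k_per_9": 9 * strikeouts / innings}


line_a = expected_line(10, 3, 0.400)  # high-BABIP pitcher
line_b = expected_line(10, 3, 0.100)  # low-BABIP pitcher
print(line_a)
print(line_b)
```

Under these assumptions the K/9 figures come out to roughly 11.25 and 8.71, a smaller gap than 13.5 versus 9, but the gap is still there.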
This is true and all
But consider that a staggering .300 point difference in BABIP created a relatively minuscule difference in FIP. 4.5 K/9 difference times the approximate K/9 multiplier of 2, divided by innings pitched is less than one run of FIP in this small ten inning example.
Over the course of a full season of 200 IP from a starter, a .300 point spread – a nearly-impossible spread to actually uncover over this inning sample – would create an artificial FIP difference of approximately .045… cutting the BABIP spread obviously cuts the margin further.
I’ve seen plenty of arguments for the shortcomings of FIP, but it’s hard to call FIP really all that flawed for this… just too many extremes needed to create a significant difference, even if you compare relief pitching seasons.
Founder and Chairman of the Hire A Manager's Assistant For Kirk Gibson Commission. A non-profit organization.
Founder and Chairman of the Hire A Body Double For David Hernandez's Right Arm Commission. A non-profit organization.
by Dan Strittmatter on Jul 6, 2011 1:18 PM EDT via mobile
Awwwh crap
I mixed up my inputs. Usually used just K’s when I would self-calculate. Nonetheless, the extremity of the BABIP spread (and Amit’s point) still stand.
by Dan Strittmatter on Jul 6, 2011 1:21 PM EDT via mobile
FIP doesn't "underrate" pitchers who pitch
in a pitcher’s park. It underrates their pitching performance, not the pitcher themself. Also, there’s more to pitching in a pitcher’s park than BABIP, since HRs are counted in FIP and aren’t completely controllable by the pitcher.
HEY, FRENCHY! STAR TREK OR STAR WARS?
by DbacksSkins on Jul 6, 2011 1:41 PM EDT via mobile
right
i guess what i’m trying to say is, if you see an ERA that’s lower than an FIP for a pitcher in a pitcher’s park
you shouldn’t expect as much regression as you are probably expecting.
i mean, probably nothing rocket-sciency that i showed in this post. just that maybe sometimes we overestimate the regression that FIP would indicate.
also just wanted to sort of show why, if you can look up strikeout percentages for pitchers, it’s better just to use that instead of FIP.
or perhaps more accurately
use strikeout percentage as opposed to K/9
But the problem here is
How people interpret it, not the stat itself.
Also, FIP isn’t supposed to show what a player’s stats should look like, or try to show that regression is on the way, it’s supposed to show what your numbers should look like in a neutral run environment with neutral defense.
And the problem with your example, as is with a lot of deviations in stats, is small sample size. No one’s ever going to post a .100 or .400 BABIP.
by CaptainCanuck on Jul 6, 2011 7:35 PM EDT via mobile
But this
is why God invented stats like tERA and siERA.
by DbacksSkins on Jul 6, 2011 1:56 PM EDT via mobile
Word.
by DbacksSkins on Jul 6, 2011 4:05 PM EDT via mobile
FIP
is a very useful statistic that has its limitations, like any other. It attempts to isolate only the pitcher’s contribution, but there’s no way to do that in a total vacuum, try as we might.
Facing the Diamondbacks will result in more strikeouts, which makes FIP go down. Facing the Yankees or Red Sox results in more walks, which makes FIP go up. Facing the Brewers results in more homers, which makes FIP go up.
The thing to take away here is that it’s not perfect — but it’s still useful. Just remember that it’ll not only “underrate” flyballers, but extreme groundballers like Brandon Webb as well, because FIP, by its very nature, takes into account ONLY the three true outcomes. Just keep that in mind when trying to extrapolate a pitcher’s future performance. Absolutely NO two pitchers have an identical batted ball profile, which is an assumption made when predicting future ERA using FIP.
by DbacksSkins on Jul 6, 2011 4:04 PM EDT via mobile
Disagree on the Webb/batted ball point (in theory, at least)
The supposed long-term stability of HR/FB rates means that groundballers get FIP points by having low HR-Rates. The only real issue is fly balls vs line drives, but that’s another issue altogether.
In my mind, this post is just another reason why regression is towards the mean/FIP, not to it.
by Dan Strittmatter on Jul 6, 2011 7:57 PM EDT via mobile
Yes
this is what the post is supposed to reinforce. FIP isn’t a magical number that says “your ERA is supposed to look like this”, which I think is how some people, both those who follow sabermetrics and those who hate sabermetrics, read it.
When you compare your ERA and FIP, the understanding should only be that regression will go towards the mean/FIP, not to it. Very well put.
FIP does not take into account ONLY the three true outcomes
the FIP formula looks a little like this:
(a*BB + b*HR – c*SO)/IP + d
where a, b, c and d are parameters that depend on league averages. So as you see, there are four variables, one of them being IP, or to put it another way, outs/3.
Basically, outs are a huge part of FIP, and that’s why a guy like Webb isn’t underrated by FIP. He gets a ton of outs with the grounders and that is reflected in FIP.
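A rough Python sketch of the formula above shows how much the IP term matters. The weights a = 3, b = 13, c = 2 and the constant d = 3.10 are commonly quoted values used here purely as assumptions; they are not stated anywhere in this thread, and the function name is hypothetical.

```python
def fip(bb, hr, so, ip, a=3.0, b=13.0, c=2.0, d=3.10):
    """(a*BB + b*HR - c*SO) / IP + d, with assumed default weights."""
    return (a * bb + b * hr - c * so) / ip + d


# Same events (0 BB, 0 HR, 3 SO), different innings pitched.
fewer_innings = fip(bb=0, hr=0, so=3, ip=2)
more_innings = fip(bb=0, hr=0, so=3, ip=3)
print(round(fewer_innings, 2), round(more_innings, 2))
```

With identical walks, homers and strikeouts, the pitcher who recorded fewer outs gets the lower FIP in this sketch, because the same strikeout credit is divided by fewer innings.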
:-(((((((((((((((((((
I haz a sad.
by Dan Strittmatter on Jul 7, 2011 11:42 AM EDT via mobile
