With an off-day today, figure it's as good a time as any to start previewing the competition in the NL West this season. For 2012, I'm attempting to be a bit more rigorous with regard to coming up with a projection system for the standings. I can't claim it's entirely original, being based on the work of renowned German sabermetrician Otto R.G. Hesswerk... :) But it kinda seems to cover the factors which should come into play over the coming season, though as we saw last year, sometimes things come out of left-field (literally, in Gerardo Parra's case), which can change the complexion of an entire season.
Overall explanation is after the jump, first target (the Padres) should be up tomorrow morning.
We start with how a team actually performed last year - we use the Pythagorean record as a base, because the actual W-L is more influenced by a team's record in one-run games, which evidence suggests varies almost randomly from year-to-year. We look at the players gained and lost from the team, and add or subtract their performance. Then, there's the question of aging - if you're an "old" team, odds are your performance the next year will be worse, while a young team will get better, simply through experience. Finally, there's an entirely subjective adjustment for other factors.
Base number of wins
The best measure of a team's performance is its actual performance, so we could take the 2011 record as a starting point. However, a significant factor is how a team does in one-run games. Last year, in the NL, the best records were the Brewers and D-backs, who were both six victories above .500 in those - the worst was the Padres, who went 20-31. However, history tells us winning one-run games is not a repeatable skill: witness the 2007 D-backs, who went 32-20 there, only to fall back to 22-23 the following year. Or the cellar-dwelling Padres, who were second-best in one-run games in 2010.
A better measure is the Pythagorean record, which uses the number of runs scored and allowed by a team to determine what their W-L record "should" have been. It's an idea pioneered by Bill James, and central is the concept that a team's run differential is a better predictor of future performance than their W-L record. For instance, at the All-Star Break last year, the Giants had a 52-40 record, but had only fractionally outscored their opponents, 332-322. And in the second half, they duly went 34-36. Obviously, it's not perfect - future performance is based on future runs scored and allowed, more than past performance - but it's closer.
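For anyone who wants to play along at home, here's a rough sketch of the Pythagorean calculation. It assumes the classic Bill James version with an exponent of 2; other versions use a slightly different exponent, so treat `exponent` as a knob rather than gospel. The function name `pythag_pct` is just made up for this example.

```python
def pythag_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and allowed.

    Assumption: exponent=2.0 is the classic form; it is not necessarily
    the exact exponent used by any particular stats site.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# The Giants at the 2011 All-Star Break: 332 runs scored, 322 allowed.
giants_first_half = pythag_pct(332, 322)
print(round(giants_first_half, 3))
```

That comes out a good way short of the .565 pace their actual 52-40 record represented, which is the whole point.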
I added a wrinkle here, in that I'm weighting a team's second-half performance more than their first half, as the former would seem a better predictor of 2012 performance than games in April 2011. Fortunately, you can find splits at the break on Baseball-Reference.com giving Pythag W% before and after a specific date, so the baseline for wins ends up as being:
54 * (1st-half Pythag + 2nd-half Pythag + 2nd half Pythag)
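As a quick sketch, the formula above looks like this in code. The multiplier of 54 is simply 162 games divided by the three "shares" (one for the first half, two for the second). The function and argument names are my own invention for illustration.

```python
def baseline_wins(first_half_pct, second_half_pct):
    """Baseline wins over 162 games, counting the second half twice.

    Both inputs are Pythagorean winning percentages (e.g. 0.515).
    """
    return 54 * (first_half_pct + second_half_pct + second_half_pct)


# A team that was .500 before the break and .550 after it.
print(baseline_wins(0.500, 0.550))
```

A team that played at the same level all year gets exactly what you'd expect: a .500 team on both sides of the break comes out at 81 wins.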
The above presumes that continuing players will perform as they did the previous year. Obviously, that's never going to happen, but it seems reasonable that some will be better, while others are worse, making the collective performance round about the same. However, no roster remains static over the winter: players leave as free agents or are traded away; similarly, new players arrive, through signings and deals with other teams. It seems essential to take into account these changes.
For the sake of simplicity, we again use 2011 performances as a predictor. If you sign a 3 WAR player and a 2 WAR player departs through free-agency, the net change is one win. As above, this is unlikely to be the case on an individual basis, but should hopefully prove reasonably accurate on a global level. Of course, there may be reasons why a player performs better or worse than expected, and there can be a knock-on effect of a signing. For example, the arrival of Jason Kubel is likely to reduce Gerardo Parra's value, simply because he'll probably be playing less in 2012. Trying to project such things is far too complex.
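The arithmetic here is about as simple as it gets: add up the 2011 WAR of everyone who arrived, and subtract the 2011 WAR of everyone who left. A minimal sketch, with hypothetical names and the 3 WAR / 2 WAR example from above:

```python
def roster_adjustment(war_gained, war_lost):
    """Net change in wins from off-season comings and goings.

    Each argument is a list of 2011 WAR values, one per player.
    Assumption: WAR simply adds up, with no allowance for playing-time
    knock-on effects.
    """
    return sum(war_gained) - sum(war_lost)


# Sign a 3 WAR player, lose a 2 WAR player: net gain of one win.
print(roster_adjustment([3.0], [2.0]))
```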
Young players get better; older ones get worse. The question of whereabouts the tipping point occurs, finding the "peak age" for hitters and pitchers, has been the subject of a lot more in-depth research than I have interest in carrying out. I'm quite happy to let the market decide this, by drawing the line at the median age of teams. Thankfully, BR.com comes to the rescue again, providing a weighted age for both pitchers and hitters for each team, taking into account at-bats, games, starts (for pitchers) and saves. This gives us a median for hitters of 28.6 years and for pitchers of 28.2 years.
How exactly this impacts future performance is truly a guessing game, and I make no claim that the numbers I came up with have any basis in "fact", but fortunately, this is just for fun and not part of my Ph.D. thesis. For every half-year below the median they were in 2011, I'm adding one win on to a team's numbers. For every half-year above, I take one win off. Simple, really.
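Here's one way the age rule could be written down. Two things in this sketch are assumptions on my part rather than anything set in stone: it scales smoothly (so a quarter-year gap is worth half a win, rather than rounding to whole half-years), and it works out the hitter and pitcher adjustments separately and adds them together. The medians are the 28.6 and 28.2 figures quoted above.

```python
HITTER_MEDIAN_AGE = 28.6
PITCHER_MEDIAN_AGE = 28.2


def age_adjustment(team_age, median_age):
    """One win added per half-year below the median, one removed per half-year above.

    Assumption: the adjustment scales linearly between half-year steps.
    """
    return (median_age - team_age) / 0.5


def team_age_adjustment(hitter_age, pitcher_age):
    """Assumption: hitter and pitcher adjustments are simply summed."""
    return (age_adjustment(hitter_age, HITTER_MEDIAN_AGE)
            + age_adjustment(pitcher_age, PITCHER_MEDIAN_AGE))


# A team whose hitters are a year younger than the median,
# and whose pitchers are half a year older.
print(round(team_age_adjustment(27.6, 28.7), 1))
```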
The fudge factor
This is basically where I go "Nah, that doesn't look right" and massage the numbers to make them seem sensible. Was a team lucky or unlucky in health? Any hot prospects likely to make an impact? I'd rather not use this, but if there's evidence to suggest strongly that the other numbers unfairly represent team performance, I am open to making an adjustment of some kind. Probably no more than plus or minus three games though.
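Putting the pieces together, the projection is just the sum of the four parts. The sketch below takes the baseline, the roster change and the age adjustment as plain numbers, and caps the fudge factor at three games either way. That hard cap is my reading of "probably no more than plus or minus three", so treat it as an assumption; all the names are made up for the example.

```python
def project_wins(baseline, roster_delta, age_delta, fudge=0.0, fudge_cap=3.0):
    """Projected wins = baseline + roster change + age adjustment + fudge.

    Assumption: the fudge factor is clamped to +/- fudge_cap games.
    """
    capped_fudge = max(-fudge_cap, min(fudge_cap, fudge))
    return baseline + roster_delta + age_delta + capped_fudge


# An 81-win baseline, one net win gained over the winter, two wins
# lost to age, and an over-enthusiastic fudge of +5 (capped to +3).
print(project_wins(81, 1, -2, fudge=5))
```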
I'm the first to admit there are plenty of flaws here. In addition to those already mentioned above, guessing the impact of prospects is very hard. A team's age last year may not reflect their age this year, depending on the arrival of older or younger players. This is all a bit of fun and the SnakePit accepts absolutely no responsibility for shirts lost in Vegas as a result of wagering on any predictions which may follow. I'm also open to incorporating other factors into this system, if you can come up with a way of expressing them as a number of wins.
Have at it...