Win early = win late
You will not be surprised to hear that teams who were bad in the first half of the season generally tend to be bad in the second half as well. And - you may want to sit down for this - teams that were good early, well, those will usually be good late too. But to find out exactly how strong the correlation is, I looked at the first-half and second-half win percentages for every team going back to when the Diamondbacks entered the major leagues in 1998. Through the end of 2015, that is 18 seasons of 30 teams, which gives me a total of 540 data points. The scatter chart below plots first-half W% against the same figure for the second half:
It's obviously not strict or absolute. There are teams that had poor first halves and rebounded to do well after the All-Star break. The greatest improvement during the time under study was by the 2001 Oakland A's, who were basically a .500 team in the first half, going 44-43, but went an insane 58-17 over the second half [and no: it was the following season where they won 20 games in a row]. Conversely, the biggest implosion came in the same season, and belongs to the Twins. After a 55-32 first half that saw them five games up in their division, they went 30-45 and finished six games back of the Indians.
The line shows the "best-fit" linear trend, and gives us a formula for a team's predicted second-half winning percentage. This is 0.626 times their first-half winning percentage, plus 0.187. For the Diamondbacks, at 38-52 (.422), this works out at a projected second-half W% of .451, which is 32.5 wins over the remaining 72 games. This would give us 70 or 71 wins at the end of the season, eight or nine down on last year's 79-83 record.
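For anyone who wants to check the arithmetic, here is a minimal sketch of that projection. The slope and intercept are the figures quoted above; the function name and the 162-game season length are my own assumptions for illustration, not part of the original analysis.

```python
# Best-fit coefficients quoted in the text; everything else is illustrative.
SLOPE = 0.626
INTERCEPT = 0.187


def project_second_half(wins, losses, season_games=162):
    """Return (projected second-half W%, projected second-half wins)."""
    first_half_pct = wins / (wins + losses)
    second_half_pct = SLOPE * first_half_pct + INTERCEPT
    games_left = season_games - (wins + losses)  # assumes a full 162-game season
    return second_half_pct, second_half_pct * games_left


pct, extra_wins = project_second_half(38, 52)
print(f"Projected second-half W%: {pct:.3f}")           # 0.451
print(f"Projected second-half wins: {extra_wins:.1f}")  # 32.5
print(f"Projected final wins: {38 + extra_wins:.1f}")   # 70.5
```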
64-73 win range for AZ
But let's also take a look at the expected spread of results. The chart below plots the difference between first-half and second-half winning percentages, for the same 540 team-seasons from 1998 through 2015.
Actually, only 539 points are shown. The 2001 A's season mentioned above is off the right-hand edge, at plus 267 points, and I chose to omit it in order to keep things looking nice and symmetrical. Still, the chart shows second-half performance for the bulk of teams will be fairly close to what they did in the first half: teams like those Oakland boys are very much the exception. The majority (52.2%) have a second-half win percentage less than 60 points different, in either direction, from the first-half figure. For the D-backs, this means a better than even chance of between 26 and 35 wins; this puts us between 64 and 73 at year's end.
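The win range follows directly from that 60-point band. As a rough sketch (the function name, the rounding to whole wins and the 162-game schedule are assumptions of mine):

```python
def win_range(wins, losses, band=0.060, season_games=162):
    """Final-win range if second-half W% lands within +/- band of first-half W%."""
    first_half_pct = wins / (wins + losses)
    games_left = season_games - (wins + losses)
    low = round((first_half_pct - band) * games_left)   # second-half wins, low end
    high = round((first_half_pct + band) * games_left)  # second-half wins, high end
    return wins + low, wins + high


print(win_range(38, 52))  # (64, 73)
```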
Regression > deadline moves
One of the things I thought might make a difference to second-half performance is trade deadline deals. Typically, these consist of the bad teams, those out of the running, trading their good players to the better sides, those that are still in playoff contention. If those deals were significant, you'd expect this to be reflected in bad teams getting worse in the second half, having lost useful pieces, while good teams improve for the opposite reason. So, I broke the change in W% down into groups by first-half win percentage.
First half W% | N | Change |
Below .400 | 52 | +.046 |
.400-.449 | 83 | +.037 |
.450-.499 | 114 | -.010 |
.500-.549 | 145 | +.005 |
.550-.599 | 97 | -.021 |
.600 or above | 49 | -.059 |
This is a surprise: it shows regression to the mean is a stronger force than swapping players at the deadline. Yes, bad teams will tend to be bad in the second half - just not as bad. The same, in the other direction, goes for good teams. The top 14 by first-half W% all posted a lower number in the second half, while 13 of the worst 14 improved. The sole exception was the 2003 Tigers, who had the worst first half of all at 25-67 (.272), yet still managed to get worse, going 18-52 for a second-half W% of .257. Arizona sit in the middle of the .400-.449 block at .422, so we'd expect a 37-point improvement in the second half, to .459. Over the remaining 72 games, that would be a record of 33-39, to finish at 71-91.
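The same projection can be sketched as a simple lookup against the table. The bucket changes are the ones listed above; the half-open bucket boundaries, the function names and the 162-game schedule are assumptions for the sake of the example.

```python
# (low, high, average change in W%) for each first-half bucket in the table.
# Treating each bucket as the half-open interval [low, high) is an assumption.
BUCKETS = [
    (0.000, 0.400, +0.046),
    (0.400, 0.450, +0.037),
    (0.450, 0.500, -0.010),
    (0.500, 0.550, +0.005),
    (0.550, 0.600, -0.021),
    (0.600, 1.001, -0.059),
]


def bucket_change(first_half_pct):
    """Look up the average second-half change for a first-half W%."""
    for low, high, change in BUCKETS:
        if low <= first_half_pct < high:
            return change
    raise ValueError("winning percentage out of range")


def project_record(wins, losses, season_games=162):
    """Project the final (wins, losses) using the bucket's average change."""
    played = wins + losses
    games_left = season_games - played
    second_half_pct = wins / played + bucket_change(wins / played)
    second_half_wins = round(second_half_pct * games_left)
    return wins + second_half_wins, losses + games_left - second_half_wins


print(project_record(38, 52))  # (71, 91)
```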
The closest parallels
Finally, let's focus more narrowly still, on the teams whose first-half record is closest to Arizona's 38-52. How did these outfits do in the second half? Since 1998, we find three teams with an identical 38-52 record. If we take those, plus the five on either side, that gives us thirteen who posted a first-half W% between .420 and .425 inclusive. Here are their details, along with their second-half records and the change in W%.
Team | Year | 1st W | 1st L | 1st W% | 2nd W | 2nd L | 2nd W% | Change |
KCR | 2009 | 37 | 51 | .420 | 28 | 46 | .378 | -.042 |
MON | 2001 | 37 | 51 | .420 | 31 | 43 | .419 | -.001 |
MIL | 2000 | 37 | 51 | .420 | 36 | 38 | .486 | +.066 |
SEA | 1998 | 37 | 51 | .420 | 39 | 34 | .534 | +.114 |
SFG | 2008 | 40 | 55 | .421 | 32 | 35 | .478 | +.057 |
MIL | 2015 | 38 | 52 | .422 | 30 | 42 | .417 | -.005 |
MIN | 2000 | 38 | 52 | .422 | 31 | 41 | .431 | +.009 |
WSN | 2006 | 38 | 52 | .422 | 33 | 39 | .458 | +.036 |
MIN | 2013 | 39 | 53 | .424 | 27 | 43 | .386 | -.038 |
MIN | 2012 | 36 | 49 | .424 | 30 | 47 | .390 | -.034 |
OAK | 2011 | 39 | 53 | .424 | 35 | 35 | .500 | +.076 |
SFG | 2005 | 37 | 50 | .425 | 38 | 37 | .507 | +.082 |
PHI | 2012 | 37 | 50 | .425 | 44 | 31 | .587 | +.162 |
Interestingly, the average improvement of these 13 teams, the ones closest in first-half performance to the 2016 Diamondbacks, is... 37 points. This would tend to confirm the above projection of a final W-L record for Arizona this year of 71-91. Five of the thirteen teams went downhill, with the worst being the 2009 Royals, who ended the year at 65-97. The best were the 2012 Phillies, who actually ended the year with a .500 record, so there is still hope the D-backs might end up improving on last year. But all told, a figure in the low seventies seems the most likely end-of-year win total, as we restart after the break.
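To double-check that 37-point average, here is a short sketch that recomputes the change for each of the thirteen teams from the win-loss records in the table. The variable names are mine, and working from unrounded percentages is an assumption; the table's Chg column appears to be built from the rounded figures, so individual rows can differ by a point.

```python
# (team, year, 1st-half W, 1st-half L, 2nd-half W, 2nd-half L), from the table.
COMPARABLES = [
    ("KCR", 2009, 37, 51, 28, 46),
    ("MON", 2001, 37, 51, 31, 43),
    ("MIL", 2000, 37, 51, 36, 38),
    ("SEA", 1998, 37, 51, 39, 34),
    ("SFG", 2008, 40, 55, 32, 35),
    ("MIL", 2015, 38, 52, 30, 42),
    ("MIN", 2000, 38, 52, 31, 41),
    ("WSN", 2006, 38, 52, 33, 39),
    ("MIN", 2013, 39, 53, 27, 43),
    ("MIN", 2012, 36, 49, 30, 47),
    ("OAK", 2011, 39, 53, 35, 35),
    ("SFG", 2005, 37, 50, 38, 37),
    ("PHI", 2012, 37, 50, 44, 31),
]

# Change in W% from first half to second half, using unrounded percentages.
changes = [
    w2 / (w2 + l2) - w1 / (w1 + l1)
    for _team, _year, w1, l1, w2, l2 in COMPARABLES
]

average_change = sum(changes) / len(changes)
decliners = sum(1 for change in changes if change < 0)

print(f"Average change: {average_change * 1000:+.0f} points")  # +37 points
print(f"Teams that got worse: {decliners} of {len(changes)}")  # 5 of 13
```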