Wednesday, September 22, 2010

Humidor!

It's even on ESPN! Rob Neyer, in his blog, is the latest high-profile baseball writer to pick up on the Rockies-might-be-cheaters idea (alternatively, depending on source, "might" could show up as "must").  A whisper and a rumor that began with Giants announcer Jon Miller has now expanded to such proportions that the great gods of Confirmation Bias, Hearsay, and Circumstantial Evidence have taken over.  While most rational people - like Neyer - do not believe the Rockies are cheating, the rampant cynicism of Internet goers everywhere, combined with their love of their own teams, has ignited a firestorm of anti-Rockies hate.

Circumstantial evidence leads the way, of course.  Fans point to Troy Tulowitzki's recent hot streak, the magical Septembers of 2007, 2009, and, hopefully, 2010, and a handful of impressive comebacks the Rockies have engineered, as if those things were proof of improper humidor usage.  In the process, of course, we're ignoring Miguel Olivo's collapse to below mediocrity, Carlos Gonzalez's recent slump, the whole of the 2008 season, the terrible starts the Rockies have had in other seasons (including this one), and the impressive comebacks other teams sometimes engineer against the Rockies.

My point here, however, is not to argue against circumstantial evidence with circumstantial evidence in the opposite direction.  My point is to ask a very specific question to which I hope to find a very specific answer.  The question is this: if the Rockies were cheating by using non-humidor balls during their own at-bats and humidor balls during their opponents' at-bats, what would we expect the result to be?  As far as I know, for all of the vehemence about this issue, no one has bothered to turn a statistical eye on the situation, and it seems to me that we should probably do so before we start slinging accusations around.

The first thing I want to pull out are historical Park Factors for Coors Field (from baseball-reference.com):

1995 - 128
1996 - 129
1997 - 113
1998 - 125
1999 - 125
2000 - 129
2001 - 121
2002 - 117
2003 - 110
2004 - 119
2005 - 110
2006 - 107
2007 - 109
2008 - 105
2009 - 113
2010 - 120

There's a lot of variability in there, of course, but there's a noticeable difference between what's going on before 2002 (when the humidor was put in) and after.  How big a difference?  Well, let's run some (very basic) statistics.

1995 - 2001 Average: 124.3
1995 - 2001 Standard Deviation: 5.7

2002-2010 Average:  112.2
2002-2010 Standard Deviation: 5.4
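
For anyone who wants to check the arithmetic, here is a minimal sketch in Python.  It assumes the park factors are exactly the ones listed above and that the standard deviation is the sample (n - 1) version, which is what reproduces the 5.7 and 5.4 figures; the function name summarize is just a label for this illustration.

```python
from statistics import mean, stdev

# Coors Field park factors as listed above (100 = neutral park).
park_factors = {
    1995: 128, 1996: 129, 1997: 113, 1998: 125, 1999: 125,
    2000: 129, 2001: 121, 2002: 117, 2003: 110, 2004: 119,
    2005: 110, 2006: 107, 2007: 109, 2008: 105, 2009: 113,
    2010: 120,
}

def summarize(first_year, last_year):
    """Return (mean, sample standard deviation) for a span of seasons."""
    values = [pf for year, pf in park_factors.items()
              if first_year <= year <= last_year]
    # stdev() divides by n - 1 (sample SD), an assumption about the
    # method used for the figures in the post.
    return mean(values), stdev(values)

pre_mean, pre_sd = summarize(1995, 2001)    # before the humidor
post_mean, post_sd = summarize(2002, 2010)  # humidor era

print(f"1995-2001: mean {pre_mean:.1f}, SD {pre_sd:.1f}")
print(f"2002-2010: mean {post_mean:.1f}, SD {post_sd:.1f}")
```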

The SDs here are only rough guides, because we have too few seasons in each group to estimate the spread reliably.  Nevertheless, it is not the case that every year is equal.  If we looked across the league, the same would be true; park effects are subject to variation just as a player's BABIP or home run totals are.  A full season may be a lot of data, but it's not enough to establish an answer for what the real park effect of a stadium is.  In the case of Coors, it seems as though there's a 10-15 percent range, for example, which is fairly substantial.

Regardless, there's a difference here.  From 1995-2001, Coors inflated scoring by an average of 24 percent.  From 2002-2010, that number has fallen to 12 percent.  In real terms, that means that 5 runs elsewhere were equal to about 6.2 runs at Coors in the pre-humidor era, while 5 runs elsewhere are worth about 5.6 at Coors now.

Now, if we look at Tom Tango's wonderful Run Expectancy Spreadsheet, we can figure out how much of a difference a half-inning of non-humidor balls might make.  The leftmost column in Tango's sheet is for "run environment," meaning we can calculate the expected runs per inning depending on the number of runs per game the environment in question yields.  The National League this season is very nearly a 4.40 run environment (as in, 4.4 runs per team per game), which means that the Run Expectancy each inning is 0.489.

Coors Field with humidor is probably close to a 4.95 environment, yielding a .55 Run Expectancy per inning.  Coors without humidor jumps up to a 5.45 run environment, which means a .606 RE.
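
The per-inning numbers quoted here all match a simple division of runs per game by nine innings.  The sketch below assumes that relationship (I am not claiming it is how Tango's spreadsheet computes the figure internally); the names run_environments and runs_per_inning are made up for this example.

```python
INNINGS_PER_GAME = 9  # assumption: runs spread evenly over a nine-inning game

# Run environments (runs per team per game) as quoted in the post.
run_environments = {
    "NL 2010": 4.40,
    "Coors, humidor": 4.95,
    "Coors, no humidor": 5.45,
}

def runs_per_inning(runs_per_game):
    """Expected runs in one half-inning for a given run environment."""
    return runs_per_game / INNINGS_PER_GAME

for label, rpg in run_environments.items():
    print(f"{label}: {runs_per_inning(rpg):.3f} runs per inning")
```

Under that assumption, the gap between humidor and non-humidor balls is about 0.056 runs per half-inning.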

The short version is that, if the Rockies are cheating late in games (which seems to be the conspiracy theory; I haven't heard it suggested that the Rockies are doing this all game), we'd expect them to see a jump from .55 runs per inning to .606 runs per inning once they switch out the balls.  Undoubtedly the sample size here is going to be too small to draw conclusions if we go straight to the data right now, so let's look at a little more theory.

The assumption in our Rockies Cheat theory is that Colorado switches out the balls when they're behind late in games.  Let's start by looking at a single inning - the ninth.  If the Rockies are down by 1 run in the bottom of the ninth, how much does switching out humidor balls for non-humidor balls help?  Well, the likelihood of scoring at least one run - and thus tying or winning the game - with humidor balls is about 29%.  The likelihood of the same with non-humidor balls is about 32%.  The likelihood of scoring two or more - and thus winning outright - with humidor balls is 14%, while the same with non-humidor balls is about 16%.

Let's think about this.  If the Rockies are down by a run in the ninth inning, the switch from humidor balls to non-humidor balls will get them at least a tie only 3 percentage points more often, and an outright win only 2 percentage points more often.  Over the course of the season, that comes out, roughly, to somewhere between zero and one games.  If the Rockies are switching humidor balls for non-humidor balls in the ninth inning, they're not getting that much benefit from it.

Of course, you could argue that they start switching in the seventh inning.  Suddenly that 3-point difference between 29% and 32% turns into... a 4-point difference (between roughly a 64% and a 69% chance, respectively, of scoring at least one run in at least one of the three innings).  Considering that, humidor balls or not, the other team continues to hit as well, the odds of winning a game down by a run in the seventh inning are not hugely different for the Rockies in either of those situations.
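
The three-inning figures follow from compounding the single-inning odds.  Here is a minimal sketch, assuming each inning is independent and carries the same chance of scoring (29% with humidor balls, 32% without, as quoted above); p_score_in_any is a hypothetical helper name.

```python
def p_score_in_any(p_single, innings):
    """Chance of scoring in at least one of `innings` independent innings."""
    # Complement rule: 1 minus the chance of being shut out every inning.
    return 1 - (1 - p_single) ** innings

humidor = p_score_in_any(0.29, 3)      # humidor balls, innings 7-9
non_humidor = p_score_in_any(0.32, 3)  # non-humidor balls, innings 7-9

print(f"humidor:     {humidor:.1%}")
print(f"non-humidor: {non_humidor:.1%}")
print(f"difference:  {(non_humidor - humidor) * 100:.1f} percentage points")
```

This gives roughly 64% versus 69%, a gap of a little over four percentage points, in line with the rough figures above.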

Upping the deficit here will do what you might expect: decrease the likelihood of a comeback and, in the process, decrease the difference between humidor and non-humidor balls.  You might argue that homers become more likely with non-humidor balls, but that's covered already in the change in the park factor.  And, what's more, the exact point here is that the increase in the odds of hitting a homer with non-humidor balls compared to humidor ones is probably so small in a given at-bat that it's negligible.

I know that this brief theoretical look at the humidor proves nothing, but I hope it does show that, in order for the Rockies to really benefit from putting different balls in play, they'd have to do it systematically and at almost all times over the course of the season.  Considering that the likelihood of getting caught doing that would skyrocket (compared to just doing it close and late in games), I don't see any reason for them to risk it.

I guess the point here is, if the Rockies are cheating by mixing in non-humidor balls in late innings, they should really stop.  Not only is MLB likely to find out in the long run (or the short run), they're not even benefiting that much from doing it.  Sure, those one or, let's be generous, two extra wins might mean the difference between watching the playoffs from the dugout and watching them from the living room couch, but that effect is far more subtle than what the hysterical cries of the masses (and Jon Miller) suggest.  The implication is and has been that the Rockies win so much at home principally because they're cheating in the late innings, and that's just patently absurd.  Even if they are cheating, they're getting so little from it that it would take 50 to 100 games to see even a one-game difference.

1 comment:

  1. You should probably send this article to the mlb and the Giants organization. "Fucking juiced ball bullshit."
