Tuesday, February 8, 2011

NBA League Size and Competitiveness, Part Two: The Data

"Errors using inadequate data are much less than those using no data at all." - Charles Babbage

That is the spirit with which this post will proceed.  The question we're trying to get at is whether or not a larger league in the NBA (or, really, in any sport) leads to a more "diluted" product.  I've reinterpreted this question to be the following: is the league more or less competitive when there are more teams?  As discussed in Part One, it's not easy to tell where the overall talent level of a league is because all the statistics players compile - all the games they play, the championships their teams win, and so on - are contextual.  The best player from the 1950s was still the best player from the 1950s, and looks great in retrospect, even if he wouldn't make an NBA roster today.

Therefore, I transformed "diluted" into "competitive," because it seems to me that the one stands for the other.  That is, when we think the league is diluted, what we're really saying is that the league is not competitive, that there are too many players who are not good enough to hang with the few good ones, and that the few good ones are causing a small set of teams to dominate.  In a non-diluted league - in a competitive league - "everyone has a good team," to use Bill Simmons's language from my last post.  The result: no one - or at least very few teams - gets to just trounce everyone else.

So today we're going to make a first pass at the data.  That first pass?  Looking, simply, at the standard deviation of winning percentage for each year in NBA (and ABA) history.  When was the league the most competitive (smallest standard deviation), when was it the least competitive (largest standard deviation), and is there any discernible trend as the league expands (does a larger league tend to be more or less competitive)?  Without further ado, here's a graph of our results.  X-axis is number of teams, Y-axis is standard deviation of winning percentage.

X is number of teams, Y is standard deviation of winning percentage
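For anyone who wants to check the measure at home, here is a minimal Python sketch of the calculation behind each point on the graph: the standard deviation of winning percentage for a single season.  The function name `win_pct_sd` and both sets of standings are made up for illustration, and I'm assuming the population standard deviation here (the sample version would give slightly larger numbers).

```python
from statistics import pstdev

def win_pct_sd(records):
    """Population standard deviation of winning percentage for one season.

    records: list of (wins, losses) tuples, one per team.
    """
    pcts = [wins / (wins + losses) for wins, losses in records]
    return pstdev(pcts)

# Made-up four-team leagues, for illustration only (not real standings).
balanced = [(42, 40), (41, 41), (41, 41), (40, 42)]
lopsided = [(70, 12), (50, 32), (32, 50), (12, 70)]

print(round(win_pct_sd(balanced), 3))  # tiny spread: a "competitive" season
print(round(win_pct_sd(lopsided), 3))  # big spread: an "uncompetitive" season
```

The smaller the number, the more tightly bunched the standings are, which is what I'm calling competitive in this post.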

As the graph shows, there is a slight upwards trend here, but it's pretty small.  The R value (R-squared is on the graph) is about 0.17, which is a weak correlation unless you're doing social sciences research.  A big reason why our correlation is so small is how spread out the standard deviations of winning percentage were when there were only eight to ten teams in the league.  The left-most data points are much, much more spread out than the right-most, with values ranging from barely over 0.05 all the way to almost 0.20.  What does this mean?  It means that, in the league's most competitive season, a mere 5% separated average teams from good teams, and almost everyone was within 10%.  Put in sports-fan-friendly terms, the best winning percentage in the league was about .600, while the worst was about .400.  That's baseball territory.  On the other hand, in the least competitive season, the separation led to a best team with an .800 winning percentage and a worst team with a .200 winning percentage.  Now, that's not precise (we'll get into the exact numbers shortly), but that's roughly what standard deviation tells you.
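Since R and R-squared are easy to mix up, here is a short Python sketch of how the two relate.  The helper `pearson_r` is my own illustrative function, and the (teams, standard deviation) pairs are invented placeholders rather than the actual season data.  Note that an R of 0.17 squares to roughly 0.03, so league size would explain only about 3% of the variation.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists.

    Assumes neither list is constant (otherwise this divides by zero).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical seasons: number of teams and SD of winning percentage.
teams = [8, 8, 9, 17, 22, 23, 27, 30]
sds = [0.06, 0.19, 0.12, 0.13, 0.10, 0.15, 0.16, 0.15]

r = pearson_r(teams, sds)
r_squared = r ** 2  # share of variance "explained" by a straight-line fit
print(round(r, 2), round(r_squared, 2))
```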

So, back when there were only eight professional teams, there was a huge variety from year to year.  Of course, with only eight teams we expect more variety in standard deviation, because, hey, fewer data points means each data point has more influence.  Thus, if one team wins 90% of its games one season in an eight-team league, that value is going to skew the overall standard deviation much more than if one team wins 90% of its games in a 30-team league.  And, indeed, the 95-96 Chicago Bulls (who went 72-10) did not make 95-96 anywhere close to one of the least competitive seasons in NBA history.  Had that happened in 1960, the story would have been different.  For example, one of the "least competitive" - by standard deviation of winning percentage - seasons in NBA history was 1952, when a 12-57 Philadelphia team joined a 16-54 Baltimore team to drag the whole league down.
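To put a rough number on that, here is a toy Python calculation.  It assumes one team wins 90% of its games while every other team is exactly equal, sharing the remaining wins so the league still averages .500.  That is obviously unrealistic, and `sd_with_one_outlier` is just an illustrative helper, but it isolates the effect of league size.

```python
from statistics import pstdev

def sd_with_one_outlier(n_teams, outlier_pct):
    """SD of winning percentage when one team stands out and the rest are equal."""
    # The other teams split the remaining wins evenly, so the league mean stays .500.
    rest_pct = (n_teams * 0.5 - outlier_pct) / (n_teams - 1)
    return pstdev([outlier_pct] + [rest_pct] * (n_teams - 1))

for n_teams in (8, 30):
    print(n_teams, round(sd_with_one_outlier(n_teams, 0.9), 3))
# 8 0.151
# 30 0.074
```

Under these assumptions the same .900 team produces about twice the league-wide standard deviation in an eight-team league as it does in a 30-team league.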

If we take only the seasons since the merger - that is, only seasons in which the league has had more than 20 teams - we get the following instead:


Now we have an R value of .46, which is getting much closer to significant.  Indeed, while random variation obviously plays a huge role - as do hard-to-quantify things like player skill and pre-NBA training, as well as injury management and so on - there seems to be little doubt that larger leagues are at least somewhat less competitive, according to standard deviation of winning percentage.  Consider that there have been only five seasons in which the SD of winning percentage was under 0.15 since the league went to 25 teams in 1988, whereas there were over 20 such seasons in the 40 years before then.

What this really shows, though, is not competitiveness or dilution, but talent distribution.  That is, regardless of the level of talent in the league at any given time, the more spread out that talent is, the lower the standard deviation of winning percentage will be.  The more concentrated, conversely, the higher the standard deviation of winning percentage will be.  Whether this is a measure of dilution is up for debate.  Also, while in a smaller league a larger standard deviation here might mean less competitiveness (one or maybe two dominant teams), in a 30 team league it might be exactly what we want (5 or 6 really good teams).

For example, this season the Miami Heat have Dwyane Wade, LeBron James, and Chris Bosh.  The Los Angeles Lakers have Lamar Odom, Kobe Bryant, and Pau Gasol.  The Boston Celtics have Kevin Garnett, Paul Pierce, Ray Allen, and Rajon Rondo.  Now, any of those ten players would be the best or second best player on most other teams.  Whether because of finances, smarts, collusion, or some combination of factors, we're currently watching a league where talent has conglomerated onto a small set of teams that routinely beat up on inferior opposition.  While I didn't run the (still changing) numbers from this season's NBA, so far there's a team with an .840 winning percentage (San Antonio), three teams above .700 (Dallas, Miami, Boston), and six more teams above .600 (Chicago, Atlanta, Orlando, Oklahoma City, Los Angeles, New Orleans).  On the other side of the coin, there's a .154 Cleveland team, plus five other teams below .300 (Sacramento, Minnesota, Washington, Toronto, and New Jersey).

Is this season's NBA competitive or not?  There are, at this point, ten legitimately good teams, any of whom - given the right breaks - could win the NBA Finals.  That sounds extremely competitive to me.  On the other hand, there are also at least six teams that are flat out awful, meaning that a large portion of each day's games are over before they start.  Is Cleveland really going to beat Miami?  Does Minnesota stand a chance against Oklahoma City?  Even though upsets happen, I doubt any circumstance would arise where a fan would feel like one of those inferior teams really deserved to beat one of the top ones.  They would need lots of lucky breaks.  And that, I think, is a mark of an uncompetitive league, when the bottom third of the league stands little to no chance against the top third.

But wait.  Does that mean the league is uncompetitive, or does it just mean that talent is distributed unevenly?  The latter is certainly true.  The former is more a question of taste and perspective.  We could say the same about the league being diluted.  When it comes down to it, if you make the league smaller, players who seemed great in a 30 team league will look good, and players who seemed good will look average.  Does that mean the league is better or worse?  Or does it mean that our perspective changes?

Consider a historical example.  When the ABA and the NBA merged for the 1976-1977 season, the NBA had one of its most competitive seasons ever, with a standard deviation of winning percentage under .100.  Bill Simmons says, in The Book of Basketball, that this was the one time the league actually under-expanded, as the 18 teams from the NBA and the 8 remaining viable teams from the ABA became 22 instead of 26.  That meant that a lot of ABA talent got redistributed to (mostly) bad NBA teams, meaning that talent was distributed about as evenly as ever in the history of the league.  The 53-29 Lakers led the NBA that season, with a 50-32 Denver and 50-32 Philadelphia on their heels.  Those were the only three teams that won 60% of their games or more.

So where does 1976-77 sit in terms of dilution, competitiveness, and distribution of talent?  Really, we can only answer the last.  Talent was widely distributed.  Was the league diluted?  Was it competitive?  That's a matter of opinion.  Because talent was widely distributed, it was certainly competitive in a broad sense, but many fans would rather see great teams (and, by extension, terrible teams) than good ones (and merely bad ones).  As far as dilution, that's a more complicated question still.

See, when the league goes from 8 teams to, say, 12 teams, we'll tend to think of it as diluted because players who weren't previously good enough now are.  Similarly, contraction seems to eliminate dilution, because suddenly all of those marginal players are gone.  But give it ten years after expansion or contraction, and we no longer feel that way, because it's a matter of perspective and perception.  If the NBA cut 10 teams this offseason, the result would definitely be a short-term feeling of "raising the level" of the league, and probably increased competitiveness in the sense of a smaller standard deviation of winning percentages.  However, after 10 seasons of the new, 20-team NBA, we'd get used to seeing guys who had previously been their team's #1 as role players on the "deeper" teams in the smaller league.  New draft picks who once would have been franchise guys for bad teams would suddenly never be at the top of the league.  These new would-be stars, however, would never be thought of as franchise players who turned into role players.  We'd just consider them role players.  Over time, the league would start to look a lot like it does now, only with fewer teams.

The same goes in the other direction.  A more diluted league is all well and good to talk about, but no one talks about how diluted NCAA Division I college basketball is, despite the fact that it adds new teams almost every year.  Sure, talent distribution is pretty extreme in the NCAA, but even so there are usually a good 20 or so teams that have a legitimate chance to win the Tourney every season (if you think this is an exaggeration, consider Butler), given the right breaks, and a favorable series of match-ups in March Madness.  Is the NCAA diluted?  Maybe, in some sense, but in another sense it's almost a crazy question to ask.

I would argue the same is true in the NBA.  Is the NBA diluted or not?  That's not really a good question, because being diluted is relative, and the stats players compile are relative, and even wins and losses are relative.  Bill Simmons believes the NBA is diluted because he grew up watching a league with a dozen teams in it.  I don't, because I grew up watching a league with 27-30 teams in it.  NCAA fans are used to the 300-plus Division I teams, so there's never really a discussion.

What we do have, however, are some interesting measures of talent distribution and, in some sense, competitiveness.  We've already seen the trend - as the league expands, there is both a tightening up of standard deviations of winning percentage (less variety from season to season), and a slight (very slight) upwards trend.  Next time, we'll dive a little deeper into that data and look at some of the outlier seasons.  Moreover, we haven't given up on the competitiveness question - when has the league been most competitive? With more teams or with fewer? - so we're going to tease out only the good teams and run the same analysis here on them (that is, how many good teams are there in a season, and how good are they).  Stay tuned.
