Sunday, November 30

How the Big 12 South race turned me against computer polls

As I sit down to write this, the ESPN/USA Coaches Poll, Harris Interactive Poll, and the now-meaningless AP Poll results have all been released. The margin in the human polls is small enough that it will only matter if the computers are deadlocked. Sagarin's is the only computer poll that has been released so far, although the results in Colley's and Massey's (which cancel each other out) are 99% certain.

I've been a longtime supporter of the inclusion of computer polls in the BCS formula, but the results surrounding the end of this season have caused me to rethink that. Before elaborating, let me talk about some of the reasons I have favored including these polls in the past.

The basic argument was always one of subjectivity vs. objectivity. Computer polls are objective; voters are not. The best example is the ridiculous amount of "respect" ND gets in the human polls anytime they look even remotely good, which the computers balance out. On the other side of the ledger, voters have common sense and creative rationality, which the computers do not.

Perhaps no season illustrates the potential positives of computer polls like 2004. USC and Oklahoma started out #1 and #2, and they stayed there for the entire season. (Changing the date of the first poll or getting rid of preseason polls would NOT have affected this.) Make no mistake, this was a result of two big-time media giants being ranked above a small-market school. Only the computers, all but one of which (Billingsley) started all 119 teams on equal footing, would have given Auburn a chance to play for the championship. When a I-A opponent (Bowling Green?) backed out at the last second, Auburn had to add The Citadel, a I-AA opponent, which absolutely tanked their computer ratings. In the end, it didn't make a difference, but with just a slightly tougher schedule the computers could have given a team with less of a big name a chance.

Likewise, in 2005, the pollsters had Irish fever, ranking Notre Dame ahead of Oregon despite the Ducks being 11-1 with only a loss to #1 USC. The Irish had two losses, one of which was to an unranked Michigan State team! However, the computer polls consistently had Oregon ahead, and in fact ranked Notre Dame 10th, which was probably about where they belonged. The Irish still got a BCS bid thanks to their special clauses with the BCS, but at least the computer polls gave us the correct results in the standings.

Indeed, it was during the 2005 season that I launched an in-depth personal investigation of the computer polls. I read as much as I could to determine how the polls that are used work. (Colley's, Massey's, and Sag's elo-chess are pretty straightforward; on the other hand, I have literally no idea how the other 3 work, and my understanding is that Billingsley's uses the final standings from the previous season as seed values!) I researched the changes to the computer polls mandated by the BCS. Then I created my own computer poll, within the limits of BCS regulations, so that I could test small changes in the systems myself. This was no joke of a poll: it tracked all 712 teams and was retrodictively as accurate as the polls used by the BCS (insignificantly more accurate, actually). This tentatively put to rest my suspicion of the computer polls, which had been caused by the numerous tweaks made by BCS non-mathematicians.
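For anyone unfamiliar with the term, "retrodictive accuracy" is just the share of already-played games in which the team the poll rates higher actually won. Here is a minimal sketch of that check. The function name, the toy ratings, and the choice to skip games between equally rated teams are all my own assumptions for illustration, not anything taken from the BCS polls.

```python
def retrodictive_accuracy(ratings, games):
    """Fraction of played games in which the higher-rated team won.

    ratings: dict mapping team name -> numeric rating
    games:   list of (winner, loser) tuples
    Games between equally rated teams are skipped (an assumption).
    """
    decided = 0
    correct = 0
    for winner, loser in games:
        if ratings[winner] == ratings[loser]:
            continue  # the ratings make no call on this game
        decided += 1
        if ratings[winner] > ratings[loser]:
            correct += 1
    return correct / decided if decided else 0.0


# Hypothetical three-team example: C beating B contradicts the ratings.
toy_ratings = {"A": 0.7, "B": 0.6, "C": 0.4}
toy_games = [("A", "B"), ("A", "C"), ("C", "B")]
toy_accuracy = retrodictive_accuracy(toy_ratings, toy_games)
print(round(toy_accuracy, 3))
```

Two of the three toy games agree with the ratings, so the score is two-thirds. A system can score very well on this measure and still lack any sense of which games ought to matter.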

But the comps still completely lack common sense or the ability to make any sort of deep analysis. Let's talk about this past weekend:

Last week, I boldly predicted a +1.0 average lead for Texas in the computer polls if all games played out according to odds. And indeed, Texas beat A&M, Texas Tech beat Baylor, and Oklahoma beat Oklahoma State. What didn't go according to odds was Kansas (which faced Texas and OU) upsetting Missouri (which faced just Texas) and Georgia Tech upsetting Georgia (the best win for both Alabama and Florida). The difference in Colley's is clear: you can actually plug in the games and see that if Georgia had beaten GT, Oklahoma would be #3 rather than #2.
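To show how a game between two other teams can flip the order of teams that weren't on the field, here is a small sketch of the basic published form of Colley's method: solve C r = b, where each diagonal entry of C is 2 plus games played, each off-diagonal entry is minus the number of meetings between the two teams, and b is 1 + (wins - losses)/2. Teams A through D and their schedule are hypothetical, and this is a toy illustration rather than a reproduction of this season's actual numbers.

```python
def colley_ratings(teams, games):
    """Basic Colley ratings for a list of (winner, loser) games."""
    idx = {t: i for i, t in enumerate(teams)}
    n = len(teams)
    # Start from C = 2*I and b = 1, then fold in each game.
    C = [[2.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    b = [1.0] * n
    for winner, loser in games:
        w, l = idx[winner], idx[loser]
        C[w][w] += 1.0
        C[l][l] += 1.0
        C[w][l] -= 1.0
        C[l][w] -= 1.0
        b[w] += 0.5
        b[l] -= 0.5
    # Solve C r = b by Gaussian elimination with partial pivoting.
    aug = [row + [rhs] for row, rhs in zip(C, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(aug[row][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][k] * solution[k] for k in range(i + 1, n))
        solution[i] = (aug[i][n] - tail) / aug[i][i]
    return dict(zip(teams, solution))


teams = ["A", "B", "C", "D"]
shared = [("A", "C"), ("B", "D")]  # A beat C, B beat D; A and B never meet

# Only the result of C vs. D differs between the two scenarios.
if_c_wins = colley_ratings(teams, shared + [("C", "D")])
if_d_wins = colley_ratings(teams, shared + [("D", "C")])

print("C beats D:", {t: round(r, 4) for t, r in if_c_wins.items()})
print("D beats C:", {t: round(r, 4) for t, r in if_d_wins.items()})
```

In this toy schedule A and B have identical 1-0 records and never play each other, yet A is rated above B when C beats D, and B is rated above A when D beats C. Neither contender did anything different; the order is decided entirely by a game they had no part in.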

Likewise, Texas's rating in Sagarin's elo-chess is negatively impacted by Missouri's loss. It's impossible to say for sure, but the margin is so close that the Mizzou loss likely swung it. Alabama's rating in Billingsley (which may allow Oklahoma to pass them) will be negatively impacted by Georgia's loss; by how much is again impossible to say.

The Big 12 South race could be determined by a game played in the Big 12 North and a game played between the SEC and ACC. For it to come down to this is just absurd. That's a side of the argument I'd never really considered - completely tangential games are determining the computer standings. For Texas to need Missouri to win because Missouri was Texas's toughest opponent not shared with OU is iffy but comprehensible. For Georgia Tech and Georgia, two teams outside the top ten, neither of whom faced any Big 12 teams all season, to even have a role in the final margin is just comical. Regardless of how the final computer polls play out over the next 12-24 hours, it's becoming clear that computer rankings should play no role in the BCS model. Until we finally get the playoff system that college football deserves, the voters picking #1 and #2 is the less-(though still horribly)-flawed option.