Thursday, September 21

Stat Time! (or, what I do when I'm bored)

So I was tooling around over on SportingNews.com and I ran across this guy talking about how good K-State's defense is. At this point, my initial reaction was somewhere close to, "WTF is he smoking?" - although I'm not the type to express that thought in so many words. So I decided to conduct a little stat experiment. I asked myself the following questions:

1 - how good is K-State's defense, anyway?
2 - how well did K-State do with respect to stopping the offenses they played?
3 - how good are the offenses they faced?

To answer those questions, I went over to cfbstats.com - which is pretty straightforward to figure out (awesome site, check it out) - and decided to check a couple of things. First, I checked K-State's defensive yardage numbers against their 1-A opponents this year. Then I checked the yardage numbers that their opponents had gone for (also only against 1-A competition). So I ended up with the following numbers:

K-State's defensive rushing YPG allowed
K-State's defensive passing YPG allowed
Opponents' offensive rushing YPG
Opponents' offensive passing YPG

So now I had a clear answer to Questions 1 and 3. What I didn't have was an answer to Question 2. That was found simply enough, though - I found the yardage stats for a specific game and compared them to K-State's averages (and their opponents' averages). I didn't know what to call it, but I ended up with something called the ratio of suppression. It looks like this:

RoSr(x) = rushing yards by opponent X / defensive rushing YPG

RoSp(x) = passing yards by opponent X / defensive passing YPG

RoSt(x) = total yards by opponent X / defensive total YPG

Obviously, RoSr is the rushing ratio of suppression, RoSp is the passing ratio of suppression, and RoSt is the total ratio of suppression. Don't be confused by the X - that really just means "for this opponent". (For example, K-State would have a specific RoSr, RoSp, and RoSt in their game against Marshall. In this case, X would be Marshall. Make sense?) You can also find the aggregate RoS's (aRoSr, aRoSp, aRoSt) with the following formulas:

aRoSr = defensive rushing YPG / average rushing YPG of all opponents

aRoSp = defensive passing YPG / average passing YPG of all opponents

aRoSt = defensive total YPG / average total YPG of all opponents

With me so far? I hope so. Here's a quick example: Miami has allowed an average of 48 rushing YPG. The average rushing YPG of all their opponents is 153.17 YPG. That means their aRoSr is:

aRoSr = 48 / 153.17 = 0.313 = 31.3%
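
If you'd rather let a computer do the division, here's a minimal Python sketch of the aggregate formula. The function name aggregate_ros is made up for illustration (it's not from cfbstats.com or anywhere else), and the numbers are just the Miami ones from above.

```python
def aggregate_ros(defense_ypg, opponents_avg_ypg):
    """Aggregate ratio of suppression.

    defense_ypg: yards per game the defense has allowed.
    opponents_avg_ypg: average yards per game its opponents normally gain.
    """
    if opponents_avg_ypg <= 0:
        raise ValueError("opponents' average YPG must be positive")
    return defense_ypg / opponents_avg_ypg


# Miami's rushing defense, using the numbers quoted above.
miami_arosr = aggregate_ros(48, 153.17)
print(f"aRoSr = {miami_arosr:.3f} = {miami_arosr:.1%}")
```

The same function works for aRoSp and aRoSt - just feed it passing or total yardage instead.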

Basically, that means that teams facing Miami have only run for about 31.3% of their "usual" output. In addition, Miami allowed 294 yards passing against Louisville, and their average passing D YPG is 234.5 yards. So the RoSp(Louisville) is:

RoSp(Louisville) = 294 / 234.5 = 1.254 = 125.4%

This is important to note! It's possible to have a ratio of suppression that's greater than 1. What that means is that the defense basically got taken out back.
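
Here's a rough Python sketch of the single-game version, including the "greater than 1" check. The names game_ros and describe are assumptions for illustration only, and the wording of the verdicts is mine.

```python
def game_ros(yards_allowed_in_game, defense_season_ypg):
    """Single-game ratio of suppression: yards allowed in one game
    divided by the defense's own season average of yards allowed."""
    if defense_season_ypg <= 0:
        raise ValueError("season YPG must be positive")
    return yards_allowed_in_game / defense_season_ypg


def describe(ros):
    """Turn a ratio into a plain-English verdict for the defense."""
    if ros > 1:
        return "worse than usual"
    if ros < 1:
        return "better than usual"
    return "exactly average"


# Miami's pass defense against Louisville, using the numbers above.
rosp_louisville = game_ros(294, 234.5)
print(f"RoSp(Louisville) = {rosp_louisville:.3f} ({describe(rosp_louisville)})")
```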

Of course, you can do this for offensive stats, too - we'll call that the Ratio of Achievement (I was going to use Ratio of Success, but, well, that'd be too confusing). Here's what those look like:

RoAr(x) = rushing yards against opponent X / offensive rushing YPG

RoAp(x) = passing yards against opponent X / offensive passing YPG

RoAt(x) = total yards against opponent X / offensive total YPG

aRoAr = offensive rushing YPG / average defense rushing YPG of all opponents

aRoAp = offensive passing YPG / average defense passing YPG of all opponents

aRoAt = offensive total YPG / average defense total YPG of all opponents

Basically, these are kind of the inverse of RoS. Quick example: take the Louisville passing numbers and apply them to RoAp(Miami). Louisville's passing YPG is 307, so:

RoAp(Miami) = 294 / 307 = 0.958 = 95.8%

So while Miami felt like they got torched, they pretty much held Louisville to what they normally do. In short, the RoS* / RoA* stats (the ones that talk about specific games) compare a team's performance to itself (did they struggle? did they do exceptionally well?) and the aRoS* / aRoA* stats (the cumulative ones) compare a team's performance to the teams it faced (are they beating up on most defenses, or can they be shut down?).
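
To see both sides of the same game at once, here's a small Python sketch that runs the Louisville-Miami passing numbers through both formulas. The helper name ratio and the variable names are hypothetical; the three numbers are the ones quoted above.

```python
def ratio(game_yards, season_ypg):
    """Single-game yards divided by a season yards-per-game average."""
    if season_ypg <= 0:
        raise ValueError("season YPG must be positive")
    return game_yards / season_ypg


game_passing_yards = 294            # Louisville's passing yards against Miami
miami_pass_defense_ypg = 234.5      # Miami's average passing yards allowed
louisville_pass_offense_ypg = 307   # Louisville's average passing yards gained

# Same 294 yards, two different baselines.
rosp_louisville = ratio(game_passing_yards, miami_pass_defense_ypg)
roap_miami = ratio(game_passing_yards, louisville_pass_offense_ypg)

print(f"RoSp(Louisville) = {rosp_louisville:.1%}")
print(f"RoAp(Miami)      = {roap_miami:.1%}")
```

The only thing that changes between the two is the denominator: the defense's average for RoS, the offense's average for RoA.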

There you have it. It looks like a lot of math, but the math itself is incredibly simple - it's just averaging and dividing at worst. Obviously there are a couple of things that aren't addressed here. It's not entirely clean, but it's functional, which is what I'm going for.

What's its functionality? Well, if a defense has only given up 60 YPG rushing, that sounds like it's an awesome defense, right? It'd make sense to think that. However, what if their opponents only rush for an average of 40 YPG? Obviously, those are some pretty bad rushing offenses. But if they went for 60 yards against that defense, well, they did better than they normally do. That's the purpose of these stats - to provide clear answers to those questions. In this case, the defense's aRoSr is 60 / 40 = 1.5, or 150% - they're allowing 150% of their opponents' normal rushing output. Let's say this defense faced, oh, West Virginia. Think WVU would be held to only 60 yards if that defense is allowing 150% of its opponents' normal rushing yards? I doubt it.
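
Here's that 60-vs-40 scenario as a quick Python sketch. Two loud caveats: the 250 YPG figure for the strong rushing offense is a made-up placeholder (not West Virginia's actual average), and multiplying aRoSr by an offense's average is only a naive back-of-the-envelope projection, not something these stats are guaranteed to predict.

```python
def aggregate_ros(defense_ypg, opponents_avg_ypg):
    """Yards allowed per game divided by opponents' average yards per game."""
    if opponents_avg_ypg <= 0:
        raise ValueError("opponents' average YPG must be positive")
    return defense_ypg / opponents_avg_ypg


def naive_projection(aros, offense_ypg):
    """Assumption: the defense allows the same share of any offense's average."""
    return aros * offense_ypg


arosr = aggregate_ros(60, 40)        # the defense from the example above
hypothetical_offense_ypg = 250       # placeholder value, NOT a real team's stat

projected = naive_projection(arosr, hypothetical_offense_ypg)
print(f"aRoSr = {arosr:.0%}, naive projected rushing yards = {projected:.0f}")
```

With those placeholder inputs the projection lands far above 60 yards, which is the whole point: 60 YPG allowed only looks good until you see who it was allowed against.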

For the record, there's no direct relationship between any of these stats and points given up. A 30-yard drive that begins at your 20 is worth the exact same as a 30-yard drive that begins at their 30 in this system. Obviously, that's not how it is in the real world, but I'm not concerned with point output in this case. Yardage and points just don't map onto each other - after all, a 79-yard drive can result in no points and a -10-yard drive can result in a FG. Don't use these formulas as a way to compare points and yards and you'll be okay. Hope that made sense.