Wednesday, July 30

The Passing Premium Part 1: Trends in Offensive Thought

Before I even start writing about this epic project, I want to give some major props to Chris Brown at Smart Football. About a month ago, the Chris at our blog sent me an article from that site about why Left Tackle is the second-most important position in the NFL, after Quarterback. The text of his email was "prepare to waste your entire day." That about sums up the Smart Football experience for the combination math geek slash football geek.

A concept I recently read about was the Passing Premium. In summary, the hypothesis is that an offense is run most efficiently when its average gain per pass attempt is equal to its average gain per rush attempt. "Gain" of course does not refer to mere yards per play - it's "gain" in the sense of furthering your objective, which is to win the game. That's still essentially yards per play, only turnover risk and a few other variables could be included. The so-called Passing Premium comes from the fact that passing plays have greater turnover risk than running plays, so to offset this the passing plays need to gain more yards per play than rushing.
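For the concretely minded, here is a minimal sketch of the raw quantity being discussed: yards per passing play minus yards per rush, before any adjustment for turnover risk. The function name and the season totals are made up for illustration; they are not real team data.

```python
def passing_premium(pass_yards, pass_plays, rush_yards, rush_attempts):
    """Raw premium: yards per passing play minus yards per rush.

    Ignores turnover risk and every other adjustment discussed in the post.
    """
    if pass_plays <= 0 or rush_attempts <= 0:
        raise ValueError("play counts must be positive")
    return pass_yards / pass_plays - rush_yards / rush_attempts


# Hypothetical season totals, for illustration only.
example = passing_premium(pass_yards=3300, pass_plays=550,
                          rush_yards=1800, rush_attempts=450)
print(round(example, 2))
```

A positive result means the passing game is out-gaining the running game per play, which is what the hypothesis says must happen to pay for the extra turnover risk.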

There are some attempts to quantify the number there. Brown uses only first and second down data, arguing that plays on third down are often not representative of your true offense. On 3rd and 1, for example, a 2 yard rush that everybody knows is coming is a great result, even if it lowers your rushing average. Conversely, 3rd and 10 is a passing situation in which the defense is likely to be stacked against the pass; they're probably fine allowing you to run a draw for 6 yards and still be forced to punt. I'd like to refine this further, but that's coming later.

Oh yeah, this may end up being a month(s?)-long project spanning several posts.

Anyway, my first investigation was looking at historical NFL data. Though the eventual goal is to devise college and pro models, the pro data is likely to have less random variance and fewer discrepancies caused by insurmountable talent differences between teams. NFL teams really all have similar talent levels; execution, scheme/strategy, and intangibles play huge factors in determining who is successful.

The first thought was simple: let's devise a basic linear regression model for the passing premium. Recall that the passing premium is a function mostly representing the increased turnover risk (interceptions obviously, but fumbles are actually slightly more likely to happen on passing plays than rushing plays) incurred when a pass is called. Unfortunately, the distribution of fumbles is difficult to quantify. Even if we looked at individual player data, it's tough to tell if a QB fumbled because he was sacked or because it was a QB sneak on the goal line, if a RB fumbled on a run or when he was used as a receiver, or even if a receiver fumbled after a catch or on a reverse. I went with the simplifying assumption - likely incorrect, but hopefully without too significant an impact - that fumbles are equally likely to occur on any play for a given offense-defense pairing. Right now I am also looking at data representing all plays; if possible, this will be revised in the future.

I wasn't terribly interested in the passing premium during the Hoover administration, although Pro Football Reference certainly makes data from that era available. So I looked at everything since the AFL-NFL merger in 1970.

It was around the moment I calculated the slope relating passing premium (yards per play pass minus yards per play rush) to the probability of a pass attempt being intercepted that I realized what a truly epic exercise into lightsabermetrics this was going to be. The slope from 1970-2007 was roughly -18. This means that as the probability of an interception increased, the premium on yards per play needed to offset the risk of passing decreased. If that doesn't make you say "WTF?" then we've lost you already; go back and reread the Smart Football article. But basically, in a risk-reward situation (passing), as risk increases you'd need reward to also increase to make the arrangement equally desirable. The data showed the opposite. The two possibilities were that pro football coaches are really dumb, or that there was a stronger underlying trend. I decided to test out the latter.
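For anyone who wants to see the mechanics, the slope of a line of best fit can be sketched as below. This is plain ordinary least squares written by hand, not necessarily the tool used for the numbers in this post. The data points are placeholders constructed to lie exactly on a line with slope -18; the intercept of 3.0 is an arbitrary assumption and the points are not real NFL seasons.

```python
def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Placeholder points on the line premium = 3.0 - 18 * int_rate.
# The intercept 3.0 is invented; only the -18 slope echoes the post.
int_rate = [0.03, 0.04, 0.05, 0.06]
premium = [3.0 - 18 * r for r in int_rate]

intercept, slope = ols_fit(int_rate, premium)
print(round(slope, 2), round(intercept, 2))
```

With real data, x would be each season's interceptions per pass attempt and y the season's passing premium.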

In regression, one thing that can flip the slope of a line of best fit is if you have separate populations that you're trying to model as one. The question I now had to answer was, has the basic theory of football undergone significant change since 1970? I don't mean in terms of specific schemes or personnel packages, but do teams operate under a completely different premise than they used to?
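Here is a toy illustration of that slope flip. The two "eras" are invented numbers, chosen only so that each one has a positive slope on its own while the pooled data slopes downward; nothing here is real NFL data.

```python
def ols_slope(x, y):
    """Slope of the least-squares line through the points (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Invented populations: x is interception rate, y is passing premium.
# Era A has high interception rates and low premiums,
# era B has low interception rates and high premiums.
era_a_x, era_a_y = [0.05, 0.06, 0.07], [1.0, 1.1, 1.2]
era_b_x, era_b_y = [0.02, 0.03, 0.04], [2.0, 2.1, 2.2]

slope_a = ols_slope(era_a_x, era_a_y)        # positive within era A
slope_b = ols_slope(era_b_x, era_b_y)        # positive within era B
slope_pooled = ols_slope(era_a_x + era_b_x,  # negative when lumped together
                         era_a_y + era_b_y)

print(slope_a > 0, slope_b > 0, slope_pooled < 0)
```

Within each group more risk goes with more reward, but because the groups sit in different corners of the chart, a single line through all six points tilts the wrong way.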

One surprise was how much the game hasn't changed. The average NFL team in 1947 scored 22.0 points per game. In 2007, that number was 21.7. 1965-1977 saw a general trend of fewer points per game, followed by a rebound up until 1983. Nothing since then really constitutes a trend.




Yards per game is consistently, and rather trendlessly, between 280 and 335.



You could argue for different groupings in the average offense's number of plays per game, but since 1970 it's all between 60 and 66 which isn't really a meaningful difference. The peak was 68.2 in 1949, and the trough was 59.9 in 1992.



It is therefore unsurprising that the yards per play remains clustered around 5, with some dip from 1965-77 and rebound from 1977-83.



Where we see our first trend is the number of turnovers a team commits per game.



Now that's some pretty data. Again, we're going to focus here on post-AFL-NFL merger years. (Besides, I'd attribute the decline in turnovers in the 50s and 60s to the rise of the athlete as a full-time profession.) From 1970-2007, teams averaged roughly 2.15 turnovers per game, with a range of 1.8 to 2.6. What immediately stood out to me was that from 1970-1987, the number was always between 2.3 and 2.6, whereas from 1988-2007, the number was always between 1.8 and 2.2. Statistically meaningful? With a p-value of 2 x 10^-16, I'd say so. The real excitement came from this discovery being made mere minutes after a conversation with Chris, in which he mentioned 1987 as the year the west coast offense became prominent in the NFL. So flag 1987/88 as one point in time in which offensive thinking changed in the NFL.
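One common way to compare two periods like this is Welch's two-sample t-test; that is offered here as an assumption about a reasonable approach, not a statement of which test produced the p-value above. The sketch below computes only the t statistic, using made-up values that merely fall inside the two ranges quoted (2.3 to 2.6 and 1.8 to 2.2). They are not the actual yearly averages.

```python
import math
import statistics


def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    if len(a) < 2 or len(b) < 2:
        raise ValueError("each sample needs at least 2 values")
    var_a = statistics.variance(a) / len(a)  # squared standard error of mean(a)
    var_b = statistics.variance(b) / len(b)  # squared standard error of mean(b)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(var_a + var_b)


# Placeholder turnovers-per-game values, NOT the real season averages.
early = [2.3, 2.4, 2.5, 2.6, 2.4, 2.5]   # stands in for 1970-1987
late = [1.8, 1.9, 2.0, 2.1, 2.2, 2.0]    # stands in for 1988-2007

t_stat = welch_t(early, late)
print(t_stat > 5)
```

When two groups do not overlap at all, the t statistic comes out large, and a large t statistic is what drives a tiny p-value.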

The decrease in turnovers per game indicates that teams became increasingly risk-averse. Was 1987 the year teams finally "figured out" that turnovers are strongly correlated with success? Amazingly, the adaptation to a less risk-prone offense was made with no drop in yards per play. Furthermore, it was made with a slight increase in the ratio of passing plays to running plays.



Instead, teams started throwing the ball shorter:



But completing a higher percentage of their throws:



Resulting in roughly equal yards per passing play:



However, the safer passes meant declining turnovers despite the fact that teams were throwing the ball more frequently. Now let's look back at the mini-trends in offensive production from 1965-77 and 1977-83.

The Society for American Baseball Research was founded in 1971. The Society's mission is to foster the research and dissemination of the history and record of baseball, while generating interest in the game. One of their more interesting methods, sabermetrics, is the analysis of baseball through objective evidence, especially baseball statistics.

Many of you may know that a lot of probability theory was developed through observations of gambling. Anytime there's a buck to be made, somebody's going to do it. My hypothesis is that sometime around a decade before the society was founded, there was a rapid growth in the number of people trying to apply statistical analysis to sports - either for gambling or to increase their team's success. Somebody applied it to football, and somebody discovered that turnovers were vitally important to a team's success.

It was no doubt known that passing plays carried more risk than running plays. Sometime in the late 60s, teams decided to stop throwing the ball, and the ratio of passes to rushes tanked from .92 in 1969 to .66 in 1977. Only it didn't work. Teams averaged a half-yard less per play. This led to a drop of 15 yards per game and, more importantly, a drop of 3.7 points per game from 1969-77. (The drop is 5.9 from 1965-77, but there is no indication that playcalling changed until 1969.) Going super-conservative on offense forced teams into more unmanageable 3rd down situations, in which the likelihood of a turnover is extremely high (hence, there's not much of a trend indicating a turnover drop during those periods) and the likelihood of converting is low (hence, drops in offensive points production). For really the only prolonged time in NFL history, the number of pass attempts per interception actually declined from 1969 to 1977.
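A quick arithmetic aside: a pass-to-rush ratio converts to a share of all plays as ratio / (1 + ratio). The helper below is a small sketch of that conversion; the function name is made up, and it assumes every play is counted as either a pass or a rush.

```python
def pass_share(pass_to_rush_ratio):
    """Convert passes-per-rush into passes as a fraction of all plays.

    Assumes every play is either a pass or a rush.
    """
    if pass_to_rush_ratio < 0:
        raise ValueError("ratio must be non-negative")
    return pass_to_rush_ratio / (1 + pass_to_rush_ratio)


share_1969 = pass_share(0.92)   # about 48% passes
share_1977 = pass_share(0.66)   # about 40% passes
print(round(share_1969 * 100), round(share_1977 * 100))
```

So a ratio of .66 means teams were passing on only about four plays in ten by 1977, down from nearly half of all plays in 1969.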



So around 1977, teams figured out that they couldn't score via 3 yards and a cloud of dust and that they weren't reducing turnovers enough to offset the drop in points scored. From 1977 to 1983, the playcalling mix spiked from 40% passes to 50% passes. Turnovers remained essentially constant (probably due to less predictability on offense as well as fewer unmanageable 3rd downs) and points per game rebounded to 21.8. Again indicating that teams were passing under more favorable conditions, the completion percentage increased by 5% from 1977 to 1983.

As yards per completion was relatively trendless over this period, it indicates that passing plays maintained the same philosophy but that teams were simply more successful with them under the new mode of offensive thought.

And from 1983-89, there aren't any real trends in yards per completion. What did happen during this time, though, was that the San Francisco 49ers won Super Bowls following the 1981, 84, and 88 seasons with Bill Walsh as head coach. Bill Walsh of course was the leading mastermind behind the west coast offense, a system he'd been developing for over a decade, but the three Super Bowls really brought it into the spotlight. Other teams quickly adopted his system or created their own variant with the same short passing philosophy.

Again looking specifically at post-merger data, the number of turnovers per game fell by about 0.005 per year from 1970-87. From 1988 to 2007, it fell by about 0.017 per year, more than three times as fast. Both periods show evidence of risk aversion. The strategy used the second time around was simply more successful.

So there you have it. Pre-merger football seems less risk-averse and we associate this time with the vertical passing game. Immediately following the merger, teams tried to minimize turnover risk by keeping their games on the ground. It didn't work, so in the late 70s they went back to their old ways. A decade later, the modern short passing philosophy, born of the "west coast offense" (a term now used to the point of becoming meaningless), found a way to reduce risk without neutering offensive productivity.

How does this relate to the problem of the negative correlation between passing premium and interceptions per attempt? By following an incorrect optimization strategy, teams from 1970-87 saw a strongly negative (-35.8) slope between passing premium and int/att. Specifically, their playcalling brought offensive strategy back to where it had been 20 years earlier, only now facing then-modern defensive schemes. Under such circumstances, offensive failures were almost guaranteed. Bill Walsh's approach gives us a positive slope of 6.2 between the passing premium and the int/att ratio for the years 1988-2007. This indicates a logical decision system.

From this, we can model the NFL passing premium as:

(yards per passing play) - (yards per rush) = 1.69 + 6.15*(interceptions per pass attempt)

This is a very rough model with sub-optimal data; further revisions on the formula to come.
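To make the formula concrete, here it is as a small function. The two coefficients come straight from the model above; the function name and the 3% interception rate used in the example are illustrative assumptions, not figures from the data.

```python
# Coefficients from the rough 1988-2007 model in this post.
INTERCEPT = 1.69   # yards per play
SLOPE = 6.15       # yards per play per unit of interception probability


def modeled_premium(int_per_attempt):
    """Modeled passing premium (yards per pass play minus yards per rush)."""
    if not 0 <= int_per_attempt <= 1:
        raise ValueError("interceptions per attempt must be between 0 and 1")
    return INTERCEPT + SLOPE * int_per_attempt


# Hypothetical offense intercepted on 3% of its pass attempts.
print(round(modeled_premium(0.03), 2))  # prints 1.87
```

Under this rough model, that hypothetical offense would want its passing plays to out-gain its rushes by a little under two yards per play.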

What's useful from this exercise isn't the (likely incorrect, likely to be revised) formula but rather that, for future NFL investigations, only data from 1988 through the present will be used, as that appears to indicate a similar grouping of thought in offensive decision theory.