What is aEPA?
Expected Points (EP) is a statistic that measures the value of field position, down, distance, and time by modeling the likelihood of the next score. Each possible next scoring event (a safety, field goal, or touchdown for either team, or no score before the end of the half) is weighted by its point value, and the results are summed to give an average value for possession of the ball in each play’s context. EP was pioneered on the Football Analytics Web by Brian Burke, now a statistician at ESPN. The current state of the art was described by Ron Yurko, Maksim Horowitz and Samuel Ventura in their paper nflWAR: A Reproducible Method for Offensive Player Evaluation in Football, and is implemented in their R package nflscrapR.
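The weighted sum described above can be sketched in a few lines of Python. This is only an illustration, not the nflscrapR model: the probabilities are made up, the names SCORE_VALUES and expected_points are hypothetical, and valuing a touchdown at 7 points (6 plus a typical extra point) is an assumption.

```python
# Point values from the perspective of the team with the ball.
# Assumption: a touchdown is worth 7 (6 plus a typical extra point).
SCORE_VALUES = {
    "touchdown": 7.0,
    "field_goal": 3.0,
    "safety": 2.0,
    "no_score": 0.0,        # the half ends before anyone scores
    "opp_safety": -2.0,
    "opp_field_goal": -3.0,
    "opp_touchdown": -7.0,
}


def expected_points(next_score_probs):
    """Weight each next-score outcome by its point value and sum."""
    total = sum(next_score_probs.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("next-score probabilities must sum to 1")
    return sum(SCORE_VALUES[event] * prob
               for event, prob in next_score_probs.items())


# Made-up probabilities for a single game situation (illustration only).
probs = {
    "touchdown": 0.35,
    "field_goal": 0.22,
    "safety": 0.003,
    "no_score": 0.05,
    "opp_safety": 0.007,
    "opp_field_goal": 0.12,
    "opp_touchdown": 0.25,
}

ep = expected_points(probs)
print(f"EP = {ep:.3f}")
```

In a real EP model the probabilities would come from a fitted model of down, distance, field position, and time rather than being typed in by hand.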
Expected Points Added (EPA) simply measures the change in EP from the beginning of a play to the end of the play. Subtract the EP before the play from the EP after it to get the value of that play, denominated in points. In general, good plays for the offense have positive EPA, while good plays for the defense have negative EPA.
EPA per play (EPA/P or EPAPP) is a straightforward derived stat: total EPA divided by the total number of plays. It is a measure of efficiency that is superior to yards per play because it discounts large gains that fail to improve the team’s situation, such as an 8-yard run on 3rd-and-20.
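A small Python sketch may make the difference between yards per play and EPA per play concrete. The Play class and every EP value below are hypothetical, chosen only so that the 8-yard run on 3rd-and-20 leaves the offense worse off than before the snap.

```python
from dataclasses import dataclass


@dataclass
class Play:
    description: str
    yards: int
    ep_before: float  # EP at the snap, from the offense's perspective
    ep_after: float   # EP after the play, same perspective

    @property
    def epa(self):
        # EPA is the change in EP across the play.
        return self.ep_after - self.ep_before


# Made-up EP values for illustration only.
plays = [
    Play("1st-and-10 pass, gain of 12", 12, 1.0, 1.8),
    Play("3rd-and-20 run, gain of 8", 8, 0.5, -0.5),
    Play("3rd-and-1 run, gain of 2", 2, 1.5, 2.5),
]

yards_per_play = sum(p.yards for p in plays) / len(plays)
epa_per_play = sum(p.epa for p in plays) / len(plays)

for p in plays:
    print(f"{p.description}: {p.yards:+d} yards, {p.epa:+.2f} EPA")
print(f"yards/play = {yards_per_play:.2f}, EPA/play = {epa_per_play:.2f}")
```

With these numbers the 8-yard run adds to yards per play but carries negative EPA, while the 2-yard conversion on 3rd-and-1 carries positive EPA.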
The primary limitation of EPA is that NFL schedules are imbalanced. Each season, 32 teams play 16 games against just 13 opponents. Not all EPA is created equal: every play’s EPA is the product of a particular offense contending with a particular defense. To understand the strength of teams, we need to account for their respective schedule strengths as well as the outcomes of games.
In his 1997 undergraduate honors thesis, Kenneth Massey applied linear algebra to the problem of schedule strength. Starting with the assumption that each team contributes equally to the result, it follows that a square matrix representing a league’s schedule, multiplied by a column vector representing each team’s strength, should equal a column vector of results. By solving this equation for the ratings vector, one can compare teams after accounting for their schedules.
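As a rough sketch of the linear system just described, the Python below builds a schedule matrix and a results vector from a toy four-team league and solves for ratings with NumPy. The teams and margins are invented, the function name massey_ratings is hypothetical, and forcing the ratings to sum to zero (by overwriting the last row) is one common way to make the otherwise singular system solvable; it is assumed here rather than taken from the text.

```python
import numpy as np


def massey_ratings(teams, games):
    """Solve M r = p for team ratings r.

    games is a list of (team_a, team_b, margin) tuples, where margin is
    team_a's result minus team_b's result.
    """
    index = {team: i for i, team in enumerate(teams)}
    n = len(teams)
    M = np.zeros((n, n))  # schedule matrix
    p = np.zeros(n)       # each team's cumulative margin
    for team_a, team_b, margin in games:
        i, j = index[team_a], index[team_b]
        M[i, i] += 1.0    # diagonal: games played
        M[j, j] += 1.0
        M[i, j] -= 1.0    # off-diagonal: minus games between the pair
        M[j, i] -= 1.0
        p[i] += margin
        p[j] -= margin
    # The rows of M sum to zero, so M is singular. Replace the last
    # equation with the constraint that the ratings sum to zero.
    M[-1, :] = 1.0
    p[-1] = 0.0
    return np.linalg.solve(M, p)


# Toy league with made-up margins; nobody plays a full round robin.
teams = ["A", "B", "C", "D"]
games = [("A", "B", 2.0), ("B", "C", 2.0), ("C", "D", 2.0), ("A", "D", 6.0)]

ratings = massey_ratings(teams, games)
for team, rating in zip(teams, ratings):
    print(f"{team}: {rating:+.2f}")
```

The ratings are relative: a rating difference between two teams is the margin the model would expect if they met, whether or not they actually played each other.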
By using EPA as the outcome in a Massey linear system, we can isolate the offensive and defensive performance of teams from opponent strength, starting field position, and penalties.
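One possible way to set this up, offered as a sketch rather than as the method actually used for aEPA, is to model each observed EPA per play as a league average plus an offense effect plus a defense effect, and fit all three by least squares. The function name adjusted_epa, the three-team data, and the choice to center each set of effects at zero are assumptions; the sketch also ignores starting field position and penalties.

```python
import numpy as np


def adjusted_epa(n_teams, observations):
    """Fit epa_pp ~ mean + offense[i] + defense[j] by least squares.

    observations is a list of (offense_id, defense_id, epa_per_play).
    Returns offense and defense effects, each centered at zero. A negative
    defense effect means that defense allows less EPA than average.
    """
    X = np.zeros((len(observations), 1 + 2 * n_teams))
    y = np.zeros(len(observations))
    for row, (off, dfn, epa_pp) in enumerate(observations):
        X[row, 0] = 1.0                  # league-average intercept
        X[row, 1 + off] = 1.0            # offense indicator
        X[row, 1 + n_teams + dfn] = 1.0  # defense indicator
        y[row] = epa_pp
    # The design is rank deficient (effects are only defined up to a
    # constant shift), so use lstsq and center the effects afterwards.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    offense = beta[1:1 + n_teams]
    defense = beta[1 + n_teams:]
    return offense - offense.mean(), defense - defense.mean()


# Made-up EPA/play for three teams, one game per offense/defense pairing.
observations = [
    (0, 1, 0.10), (0, 2, 0.15),
    (1, 0, -0.05), (1, 2, 0.05),
    (2, 0, -0.15), (2, 1, -0.10),
]

offense, defense = adjusted_epa(3, observations)
print("offense:", np.round(offense, 3))
print("defense:", np.round(defense, 3))
```

In this toy data, offense 0 averages 0.125 EPA per play in raw terms, but part of that comes from facing the weakest defense; the fitted offense effect is lower once the opponents are accounted for.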