Background
In a previous life, I was a Business Analyst for a nameless, McLean, VA-based financial services corporation that uses Samuel L. Jackson in their commercials. What would I say I did there? I used data and various modeling tools to inform business decisions.
It was...as sexy as it sounds. I craved something different, so I went back to grad school to make a career transition. What do I do now? I’m a Business Analyst for a nameless, Memphis, TN-based professional basketball franchise. The day-to-day responsibilities haven’t really changed, but the direct application goes over much better at dinner parties.
So why the Monday Mathematical? Because, selfishly, I enjoy math and its real-world applications, but people far better versed than I am have already covered the "advanced stats" ground on this and other blogs. Hopefully this turns into something just as interesting, but a bit less...inscrutable.
Installment 1: What does it really mean to say Jalen Hurd averages 5 yards per carry?
In the penultimate chapter of Will’s "10 Questions for 2016" series, he asked how Tennessee’s passing game could be more explosive. In the comments section, a friendly debate raged about the merits of explosiveness versus being able to consistently get 5 yards per carry with Jalen. That got my wheels turning on what Jalen’s and Alvin’s carries really look like.
I went back to every game from 2015 and pulled ESPN’s play-by-play data, then filtered it down to just the carries by the running backs (note: I may add Dobbs in a future week, but didn’t for now). After tallying the yards gained on each of their carries over the course of the season, here were the summary numbers.
Stat | Jalen Hurd | Alvin Kamara |
Total yards | 1288 | 698 |
Carries | 277 | 107 |
Yards per carry | 4.6 | 6.5 |
25th percentile (yards) | 1 | 2 |
Median / 50th percentile (yards) | 3 | 5 |
75th percentile (yards) | 6 | 7 |
Standard deviation (yards) | 6.4 | 9.1 |
The "average" carry is how we generally report it (total yards divided by total carries), and that finds Jalen at 4.6 ypc and Alvin at 6.5 ypc. But in gauging how any one individual carry is "likely" to perform, we should look at the distribution of carries, which is where the percentiles come in.
If you lined up every Jalen Hurd carry in order from least yards gained to most yards gained and then picked the one that was 1⁄4 of the way down the line, that’s the 25th percentile. Another way to interpret it is that roughly 1 out of every 4 of Jalen’s carries got 1 yard or less (the actual number was almost 29%), while roughly 1 out of every 4 of Alvin’s carries got 2 yards or less (his 25th percentile just barely crept up to 2 yards; 24% of his carries went for 1 yard or less).
My point in doing all this was to better define what an "average" carry looks like for these guys: the median. In exactly the middle of the distribution, you can expect half of the carries to go for more yards and half of the carries to go for fewer yards. For Jalen, that number was 3 yards; for Alvin, that number was 5 yards.
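For anyone who wants to try the "line them up and walk down the line" idea at home, here is a minimal sketch. The carries in it are made up for illustration (they are not the 2015 play-by-play numbers), and the `percentile` helper is my own simple nearest-rank version, not necessarily what a spreadsheet or stats package would use.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: sort the carries, then take the one
    that sits pct% of the way down the line."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))  # 1-based position
    return ordered[rank - 1]


# Hypothetical carries (yards gained), NOT the real 2015 data.
carries = [-2, 0, 1, 1, 2, 3, 3, 4, 5, 6, 8, 29]

mean = sum(carries) / len(carries)
print(f"mean:   {mean:.1f}")
print(f"25th:   {percentile(carries, 25)}")
print(f"median: {percentile(carries, 50)}")
print(f"75th:   {percentile(carries, 75)}")
```

In this toy set, the one 29-yard run drags the mean well above the median, which is the same reason a back's yards per carry can look rosier than his typical carry. Note that nearest-rank takes the lower of the two middle values when the number of carries is even; other tools may average them instead.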
While that’s good to know (and will probably show up in next week’s installment), it doesn’t do much to address our question about explosiveness. I posited in the comments of the aforementioned article that higher variance would imply more explosiveness and better offensive results. How do we use this play-by-play data to see that?
The standard deviation is a measure of spread (it’s the square root of the variance); the higher it is, the "noisier" the data. Alvin’s standard deviation in 2015 was higher. Great! So...what does that mean?
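To get a feel for it, here is a small sketch with two invented backs who have the same average but very different spreads. Both lists are hypothetical, and I'm using the population standard deviation (`statistics.pstdev`) as an assumption; the sample version (`statistics.stdev`) would give slightly larger numbers.

```python
import statistics

# Two hypothetical sets of carries with the same mean (NOT real data).
steady = [2, 3, 3, 4, 4, 5, 5, 6]
boom_or_bust = [-3, -1, 0, 1, 2, 4, 9, 20]

for name, carries in (("steady", steady), ("boom_or_bust", boom_or_bust)):
    avg = statistics.mean(carries)
    spread = statistics.pstdev(carries)  # population standard deviation
    print(f"{name:>12}: mean {avg:.1f}, std dev {spread:.1f}")
```

Same yards per carry, but the second back's carries land much farther from that average on any given play.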
[Chart: normal distributions fitted to Alvin’s and Jalen’s 2015 carries]
The above chart plots normal distributions using the averages and standard deviations of each runner’s 2015 carries, Jalen in orange, Alvin in black. I’m making some mathematical simplifications here (really long runs skew the data and mess up the fit of a normal distribution), but it works to prove my point. The x-axis is the distance of the carry; the y-axis is the percentage of carries we would expect to go that individual distance. As an example, Jalen would be expected to have a 4-yard carry about 6% of the time.
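If you'd like to reproduce a point on those curves, this is roughly how. It's a sketch under the same simplification as the chart: each runner is treated as a normal distribution built from the table's totals and standard deviations. The `normal_pdf` helper and the constant names are mine.

```python
import math


def normal_pdf(x, mean, sd):
    """Height of the normal curve at x, for a given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


# (mean, standard deviation) taken from the summary table above.
JALEN = (1288 / 277, 6.4)
ALVIN = (698 / 107, 9.1)

for yards in (0, 4, 8, 12):
    print(
        f"{yards:>2} yards: "
        f"Jalen {normal_pdf(yards, *JALEN):.1%}, "
        f"Alvin {normal_pdf(yards, *ALVIN):.1%}"
    )
```

Because the curve is only about one yard wide per whole-yard outcome, reading its height as "the percentage of carries going that distance" is a fair approximation, and the 4-yard value for Jalen comes out at about 6%.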
How do we apply this? If you want a steady gain, Normal Jalen is your guy. 61% of the time, he’s going to get you somewhere between 0 and 10 yards. Normal Alvin? Like the chipmunk that bears his name, he’s not as consistent. He’s only going to get you 0 to 10 yards 45% of the time.
The flip side though? If you want to get an "explosive" run (12+ yards), Normal Alvin is the one you should call. A full 29% of the time, he can get you 12 or more yards. Normal Jalen can’t even get half of that, with just 14% of his carries hitting the magic mark.
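Here is one way to get percentages like these out of the two normal curves. I'm assuming a half-yard "continuity correction" (treating a 10-yard carry as anything from 9.5 to 10.5 on the curve), since carries come in whole yards. That is my assumption, not a documented step, but it lands close to the figures quoted above. The function names are hypothetical helpers.

```python
import math


def normal_cdf(x, mean, sd):
    """Probability that a normal variable is at or below x."""
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))


def prob_between(lo, hi, mean, sd):
    """Chance of a whole-yard carry from lo to hi, with a half-yard correction."""
    return normal_cdf(hi + 0.5, mean, sd) - normal_cdf(lo - 0.5, mean, sd)


def prob_at_least(lo, mean, sd):
    """Chance of a whole-yard carry of lo yards or more, same correction."""
    return 1 - normal_cdf(lo - 0.5, mean, sd)


# (mean, standard deviation) from the 2015 summary table.
runners = {"Jalen": (1288 / 277, 6.4), "Alvin": (698 / 107, 9.1)}

for name, (mean, sd) in runners.items():
    steady = prob_between(0, 10, mean, sd)
    explosive = prob_at_least(12, mean, sd)
    print(f"Normal {name}: 0 to 10 yards {steady:.0%}, 12+ yards {explosive:.0%}")
```

Without the half-yard correction, the 0-to-10 numbers come out a few points lower, so the exact percentages depend on that choice; the ordering between the two backs does not.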
I can hear you all now, "Get to the point, volundore!" So here it is...math confirms that explosiveness is better (and that Kamara should get more carries)! All I had to do was take real data, make it theoretical, and then draw a conclusion about what that means in reality. You’re welcome :P
I’ve got a rough list of other topics that I’d like to explore in this space (probability trees, game theory, etc.), but I’m open to suggestions on the burning mathematical questions you have. Also, if it burns when you math, you should probably get that checked out.