Rocky Top Talk: An SB Nation Community


How a nuclear engineering Ph.D. candidate solves a problem like McFadden

The following post comes courtesy of RTT community member hooper. Do not mess with a guy who's this close to getting a Ph.D. in nuclear engineering. I'm bumping his conclusion to the top because frankly, it's the only part I understand:

If the current information provides any prediction about this game, it’s that Tennessee is predicted to beat Arkansas based on the average rush defense predictor. [The "rush defense predictor" is presumably somewhere below. Kudos to you if you can find it. -- ed.] If the Kentucky game is treated as an anomaly, UT is predicted to have about a 50-50 shot at beating Arkansas based on their average rush defense. That’s not exactly a revelation, but it does verify the stuff you wrote and gives some nice pretty numbers and pictures to play with.

Translation: I was right. Na-na-na-na-na-na. Oh, and Tennessee wins! Woo! Maybe. So whoa on woo.

Anyway, the meat is after the jump, but be warned, wicked math, charts, and graphs ahead. Make sure the safety goggles are snug before proceeding.


For the data given in "How do you solve a problem like McFadden, part II", the best fit of points is by yards:


Linear Fit
Points = -21.45121 + 0.3460472 Yards

Summary of Fit

RSquare                       0.567489
RSquare Adj                   0.495404
Root Mean Square Error        12.87058
Mean of Response              42
Observations (or Sum Wgts)    8

Analysis of Variance

Source      DF   Sum of Squares   Mean Square   F Ratio
Model        1        1304.0891       1304.09    7.8725
Error        6         993.9109        165.65   Prob > F
C. Total     7        2298.0000                  0.0309

Parameter Estimates

Term          Estimate     Std Error   t Ratio   Prob>|t|
Intercept    -21.45121     23.06764      -0.93     0.3883
Yards          0.3460472    0.123333      2.81     0.0309

Just look at RSquare and Prob > F (the big red text in the original output) and ignore the rest (the software package [I'm using] gives a lot more, but it’s easier to highlight than to edit further).  Interpretation:

  • R²:  Rush Yardage explains a little over half the variance in Arkansas’s points.
  • Prob > F:  A number below 0.05 is generally considered a sign that the model is statistically significant.  In other words, the model is useful.
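For anyone who wants to see where those headline numbers come from, here is a minimal sketch that rebuilds them from the sums of squares in the Analysis of Variance table. The variable names are mine, not the software's, and the p-value itself is left out because it needs an F distribution that the standard library doesn't provide.

```python
import math

# Values copied from the Analysis of Variance table above (all eight games).
ss_model = 1304.0891   # Sum of Squares, Model row
ss_error = 993.9109    # Sum of Squares, Error row
df_model = 1
df_error = 6
n_obs = 8

ss_total = ss_model + ss_error                # C. Total row
r_square = ss_model / ss_total                # share of variance explained
r_square_adj = 1 - (1 - r_square) * (n_obs - 1) / df_error
ms_error = ss_error / df_error                # Mean Square, Error row
f_ratio = (ss_model / df_model) / ms_error
rmse = math.sqrt(ms_error)                    # Root Mean Square Error

print(f"RSquare     = {r_square:.6f}")
print(f"RSquare Adj = {r_square_adj:.6f}")
print(f"F Ratio     = {f_ratio:.4f}")
print(f"RMSE        = {rmse:.5f}")
```

The results should agree with the Summary of Fit table to within rounding of the printed sums of squares.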

However, notice that the two leftmost points really stand apart.  Without them, the remaining points appear to trend very nicely.  Treating them as outliers, I’ll remove them:


Linear Fit
Points = -280.3811 + 1.6089548 Yards

Summary of Fit

RSquare                       0.585571
RSquare Adj                   0.481964
Root Mean Square Error        9.124053
Mean of Response              48.5
Observations (or Sum Wgts)    6

Analysis of Variance

Source      DF   Sum of Squares   Mean Square   F Ratio
Model        1        470.50665       470.507    5.6518
Error        4        332.99335        83.248   Prob > F
C. Total     5        803.50000                  0.0762

Parameter Estimates

Term          Estimate     Std Error   t Ratio   Prob>|t|
Intercept   -280.3811     138.3889       -2.03     0.1127
Yards          1.6089548    0.676782      2.38     0.0762

With the Alabama and Auburn results removed, the model is really no better at explaining things (look at RSquare: it gained almost nothing).  Not only that, the statistical significance is lower (Prob > F is higher, which is bad).  Besides, do you really believe a model that predicts 25 points if Arkansas plays a team that averages 190 rushing yards allowed on defense, but predicts 57 points if Arkansas plays a team that averages 210 rushing yards allowed?  Me neither.
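The 25-versus-57 comparison is just the fitted line evaluated at two yardage values. A quick sketch, using the coefficients from the two Linear Fit equations above (the function and variable names are made up for illustration):

```python
def predict_points(intercept, slope, yards):
    """Evaluate a fitted line: Points = intercept + slope * Yards."""
    return intercept + slope * yards

# Coefficients copied from the two "Linear Fit" equations above.
all_games = (-21.45121, 0.3460472)   # all eight games
trimmed = (-280.3811, 1.6089548)     # Alabama and Auburn removed

for yards in (190, 210):
    full = predict_points(*all_games, yards)
    cut = predict_points(*trimmed, yards)
    print(f"{yards} yards: all games -> {full:.1f} pts, trimmed -> {cut:.1f} pts")

# How much a 20-yard change in rush defense moves each prediction.
swing_full = predict_points(*all_games, 210) - predict_points(*all_games, 190)
swing_cut = predict_points(*trimmed, 210) - predict_points(*trimmed, 190)
print(f"20-yard swing: all games {swing_full:.1f} pts, trimmed {swing_cut:.1f} pts")
```

The trimmed model swings by about 32 points over those 20 yards, against about 7 points for the full model, which is the implausible steepness being pointed out.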

Now, for some real fun (well, a statistician would think so).

Whole Model Test

Model        -LogLikelihood   DF   ChiSquare   Prob>ChiSq
Difference        5.2925058    1    10.58501       0.0011
Full              5.82257e-8
Reduced           5.2925059

RSquare (U)                   1.0000
Observations (or Sum Wgts)    8

Converged by Objective

Parameter Estimates

Term                    Estimate      Std Error   ChiSquare   Prob>ChiSq
Intercept   Unstable    603.457952    149969.72        0.00       0.9968
Yards       Unstable     -3.0404375   749.93609        0.00       0.9968

For log odds of L/W

Ignore the data junk.

This is a logistic fit, where wins and losses are compared against yardage.  (Ignore the vertical stuff and read the left-right of the graph to simplify things.)  The nearly vertical blue line effectively says that if Arkansas plays a team that gives up an average of 200 or more rushing yards, they win; otherwise they lose.  Nifty, huh?  If you read the "Parameter Estimates" piece, you see the word "unstable" twice.  This unfortunately tells you that the model is not reliable.  So, while it sounds good, [it's] really not useful.
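To see why the blue line looks nearly vertical, here is a rough sketch that plugs the reported estimates into the standard logistic formula. It assumes the fit models the log odds of a loss (as the "For log odds of L/W" line suggests), and since both estimates are flagged "Unstable", the numbers are illustrative only. The function name is mine.

```python
import math

def prob_loss(yards, intercept=603.457952, slope=-3.0404375):
    """P(Arkansas loss) from log odds = intercept + slope * yards.

    Defaults are the (unstable) estimates from the table above.
    """
    log_odds = intercept + slope * yards
    # Numerically stable logistic function for large |log_odds|.
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)

for yards in (190, 195, 198, 199, 202, 210):
    print(f"{yards} rush yards allowed: P(loss) = {prob_loss(yards):.5f}")
```

Under these estimates the predicted probability of a loss goes from essentially 1 to essentially 0 within a handful of yards on either side of 200, which is the near-vertical line in the plot.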

The problem is the Kentucky game, where a really bad defense produced the same result as a really good defense.  Since there are so few data points, it’s enough to throw the whole thing off.  Removing Kentucky:

Whole Model Test

Model        -LogLikelihood   DF   ChiSquare   Prob>ChiSq
Difference        4.1878871    1    8.375774       0.0038
Full              3.56579e-8
Reduced           4.1878871

RSquare (U)                   1.0000
Observations (or Sum Wgts)    7

Converged by Objective

Parameter Estimates

Term                    Estimate      Std Error   ChiSquare   Prob>ChiSq
Intercept   Unstable     77.8017059   26623.368        0.00       0.9977
Yards       Unstable     -0.4704935   144.43734        0.00       0.9974

For log odds of L/W

Again, just ignore the data junk and watch the pretty blue line. Without Kentucky, the break point lies closer to about 167.  If that number is eerily frightening, it should be; your estimate of Tennessee’s run defense came out to almost exactly this result.  Again, the model has stability problems due to a lack of data points (it’s "underpowered" in stats lingo).  Still, it’s as useful as anything else will be for predicting this game.
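The break point can also be backed out of the parameter estimates rather than read off the graph: the log odds are zero (a 50-50 game) where intercept + slope × yards = 0. A small sketch, with the caveat that every estimate used here is flagged "Unstable" in the output, so treat the results as ballpark figures. The names are hypothetical.

```python
def crossover_yards(intercept, slope):
    """Yardage where the fitted log odds equal zero, i.e. P(loss) = 0.5."""
    return -intercept / slope

# (intercept, slope) copied from the two Parameter Estimates tables.
fits = {
    "all eight games": (603.457952, -3.0404375),
    "Kentucky removed": (77.8017059, -0.4704935),
}

for label, (intercept, slope) in fits.items():
    print(f"{label}: 50-50 point at about {crossover_yards(intercept, slope):.1f} yards")
```

This gives roughly 198 yards with all eight games and roughly 165 yards with Kentucky removed, in line with the "200" and "about 167" read from the plots.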

Summary? Summary:

If the current information provides any prediction about this game, it’s that Tennessee is predicted to beat Arkansas based on the average rush defense predictor. If the Kentucky game is treated as an anomaly, UT is predicted to have about a 50-50 shot at beating Arkansas based on their average rush defense. That’s not exactly a revelation, but it does verify the stuff you wrote and gives some nice pretty numbers and pictures to play with.

Poll: Thoughts?

  • Exactly! (1 vote)
  • You should have used a spline. (3 votes)
  • Huh? (1 vote)

5 votes | Poll has closed


Comments


one minor correction
I didn't intend to mislead about my educational status.  I am working on my Master's, not my PhD.

by Hooper on Nov 8, 2007 2:21 PM EST

You didn't mislead . . .
. . . I just mis-assumed.
Go Vols!

by Joel on Nov 8, 2007 2:36 PM EST
