Rocky Top Talk: An SB Nation Community


How a nuclear engineering Ph.D. candidate solves a problem like McFadden

The following post comes courtesy of RTT community member hooper. Do not mess with a guy who's this close to getting a Ph.D. in nuclear engineering. I'm bumping his conclusion to the top because frankly, it's the only part I understand:

If the current information provides any prediction about this game, it’s that Tennessee is predicted to beat Arkansas based on the average rush defense predictor. [The "rush defense predictor" is presumably somewhere below. Kudos to you if you can find it. -- ed.] If the Kentucky game is treated as an anomaly, UT is predicted to have about a 50-50 shot at beating Arkansas based on their average rush defense. That’s not exactly a revelation, but it does verify the stuff you wrote and gives some nice pretty numbers and pictures to play with.

Translation: I was right. Na-na-na-na-na-na. Oh, and Tennessee wins! Woo! Maybe. So whoa on woo.

Anyway, the meat is after the jump, but be warned, wicked math, charts, and graphs ahead. Make sure the safety goggles are snug before proceeding.


For the data given in "How do you solve a problem like McFadden, part II", the best fit models Arkansas’s points as a linear function of yards:


Linear Fit
Points = -21.45121 + 0.3460472 Yards

Summary of Fit

RSquare                       0.567489
RSquare Adj                   0.495404
Root Mean Square Error        12.87058
Mean of Response              42
Observations (or Sum Wgts)    8

Analysis of Variance

Source      DF    Sum of Squares    Mean Square    F Ratio
Model        1         1304.0891        1304.09     7.8725
Error        6          993.9109         165.65
C. Total     7         2298.0000

Prob > F    0.0309

Parameter Estimates

Term           Estimate    Std Error    t Ratio    Prob>|t|
Intercept     -21.45121     23.06764      -0.93      0.3883
Yards         0.3460472     0.123333       2.81      0.0309

Just look at the big red text and ignore the rest (the software package [I'm using] gives a lot more, but it’s easier to highlight than to edit further).  Interpretation:

  • R² (RSquare):  Rush Yardage explains a little over half the variance in Arkansas’s points.
  • Prob > F:  A number below 0.05 is generally considered a sign that the model is statistically significant.  In other words, the model is useful.
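For anyone who wants to check the highlighted numbers by hand, here is a minimal sketch that rebuilds the "Summary of Fit" values from the sums of squares in the first "Analysis of Variance" table. The variable names are mine, not the software's, and the formulas are the standard textbook ones for a one-predictor linear fit, so treat it as an illustration rather than a description of what the package does internally.

```python
import math

# Values copied from the first "Analysis of Variance" table (all eight games).
ss_model = 1304.0891   # sum of squares explained by the fit
ss_error = 993.9109    # residual (unexplained) sum of squares
n_obs = 8              # number of games in the sample
n_predictors = 1       # Yards is the only predictor

ss_total = ss_model + ss_error
df_error = n_obs - n_predictors - 1

# Share of the variance in points explained by the fit.
r_square = ss_model / ss_total

# Same idea, penalized for the number of predictors relative to sample size.
r_square_adj = 1 - (1 - r_square) * (n_obs - 1) / df_error

# Typical size of a residual, in points.
rmse = math.sqrt(ss_error / df_error)

# Mean square of the model divided by mean square of the error.
f_ratio = (ss_model / n_predictors) / (ss_error / df_error)

print(f"RSquare                = {r_square:.6f}")
print(f"RSquare Adj            = {r_square_adj:.6f}")
print(f"Root Mean Square Error = {rmse:.5f}")
print(f"F Ratio                = {f_ratio:.4f}")
```

The results should agree with the table to the precision shown there. Turning the F Ratio into Prob > F needs the F distribution, which the Python standard library does not provide, so that step is left out.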

However, notice that the two leftmost points really stand apart.  Without them, the remaining points appear to trend very nicely.  Treating them as outliers, I’ll remove them:


Linear Fit
Points = -280.3811 + 1.6089548 Yards

Summary of Fit

RSquare                       0.585571
RSquare Adj                   0.481964
Root Mean Square Error        9.124053
Mean of Response              48.5
Observations (or Sum Wgts)    6

Analysis of Variance

Source      DF    Sum of Squares    Mean Square    F Ratio
Model        1         470.50665        470.507     5.6518
Error        4         332.99335         83.248
C. Total     5         803.50000

Prob > F    0.0762

Parameter Estimates

Term           Estimate    Std Error    t Ratio    Prob>|t|
Intercept     -280.3811     138.3889      -2.03      0.1127
Yards         1.6089548     0.676782       2.38      0.0762

Removing the Alabama and Auburn results, the model is really no better at explaining things (look at RSquare: it gained almost nothing, and RSquare Adj actually dropped).  Not only that, the statistical significance is lower (Prob > F is higher, which is bad).  Besides, do you really believe a model that predicts 25 points if Arkansas plays a team that allows an average of 190 rushing yards, but predicts 57 points if Arkansas plays a team that allows an average of 210 rushing yards?  Me neither.
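The 25-versus-57 comparison comes straight from plugging yardage into the fitted line. A small sketch, using the intercepts and slopes from the two "Linear Fit" equations (the function and constant names are made up for illustration):

```python
def predict_points(intercept, slope, yards):
    """Predicted Arkansas points from a fitted line: intercept + slope * yards."""
    return intercept + slope * yards


# (intercept, slope) pairs copied from the two "Linear Fit" equations.
ALL_GAMES = (-21.45121, 0.3460472)          # all eight games
WITHOUT_OUTLIERS = (-280.3811, 1.6089548)   # Alabama and Auburn removed

for yards in (190, 210):
    full = predict_points(*ALL_GAMES, yards)
    trimmed = predict_points(*WITHOUT_OUTLIERS, yards)
    print(f"{yards} yards: all games {full:.1f} pts, without outliers {trimmed:.1f} pts")
```

A 20-yard difference moves the eight-game model by about 7 points and the trimmed model by about 32, which is why the trimmed fit is hard to take seriously.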

Now, for some real fun (well, a statistician would think so).

Whole Model Test

Model         -LogLikelihood    DF    ChiSquare    Prob>ChiSq
Difference         5.2925058     1     10.58501        0.0011
Full              5.82257e-8
Reduced            5.2925059

RSquare (U)                   1.0000
Observations (or Sum Wgts)    8

Converged by Objective

Parameter Estimates

Term                       Estimate    Std Error    ChiSquare    Prob>ChiSq
Intercept    Unstable    603.457952    149969.72         0.00        0.9968
Yards        Unstable    -3.0404375    749.93609         0.00        0.9968

For log odds of L/W

Ignore the data junk.

This is a logistic regression, where wins and losses are modeled against the rushing yards the opponent gives up on average.  (Ignore the vertical stuff and read the left-right of the graph to simplify things.)  The nearly vertical blue line effectively says that if Arkansas plays a team that gives up an average of 200 or more rushing yards, they win; otherwise they lose.  Nifty, huh?  If you read the "Parameter Estimates" piece, you see the word "Unstable" twice.  This unfortunately tells you that the model is not reliable.  So, while it sounds good, [it's] really not useful.
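To see where the "200 or more" cutoff comes from, here is a rough sketch that feeds the reported coefficients through the logistic function. It assumes the "For log odds of L/W" note means the fitted line gives the log odds of a loss relative to a win; the function name and the sample yardages are mine. Since both coefficients are flagged Unstable, the output is only as trustworthy as they are.

```python
import math

# Coefficients from the first logistic "Parameter Estimates" table.
# Both are flagged "Unstable" in the output, so this is illustrative only.
INTERCEPT = 603.457952
SLOPE = -3.0404375


def loss_probability(yards, intercept=INTERCEPT, slope=SLOPE):
    """P(Arkansas loses), assuming the line models the log odds of L over W."""
    log_odds = intercept + slope * yards
    # Numerically stable logistic function (avoids overflow for large |log_odds|).
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)


# Yardage at which the log odds are zero, i.e. a 50-50 game.
crossover = -INTERCEPT / SLOPE
print(f"50-50 point: {crossover:.1f} rushing yards allowed")

for yards in (190, 195, 200, 205):
    print(f"{yards} yards: P(loss) = {loss_probability(yards):.4f}")
```

With a slope this steep, the probability flips from nearly certain loss to nearly certain win within a few yards of the crossover, which is the nearly vertical blue line in numerical form.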

The problem is the Kentucky game, where a really bad defense produced the same result as a really good defense.  Since there are so few data points, it’s enough to throw the whole thing off.  Removing Kentucky:

Whole Model Test

Model         -LogLikelihood    DF    ChiSquare    Prob>ChiSq
Difference         4.1878871     1     8.375774        0.0038
Full              3.56579e-8
Reduced            4.1878871

RSquare (U)                   1.0000
Observations (or Sum Wgts)    7

Converged by Objective

Parameter Estimates

Term                       Estimate    Std Error    ChiSquare    Prob>ChiSq
Intercept    Unstable    77.8017059    26623.368         0.00        0.9977
Yards        Unstable    -0.4704935    144.43734         0.00        0.9974

For log odds of L/W

Again, just ignore the data junk and watch the pretty blue line. Without Kentucky, the break point sits at about 167.  If that number is eerily frightening, it should be; your estimate of Tennessee’s run defense came out to almost exactly this result.  Again, the model has stability problems due to a lack of data points (it’s "underpowered" in stats lingo).  Still, it’s as useful as anything else will be for predicting this game.
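The stability problem shows up in the numbers as well as in the "Unstable" flag. The sketch below runs a standard one-degree-of-freedom Wald test on the no-Kentucky coefficients: the estimate divided by its standard error, squared, compared against a chi-square distribution. I am assuming this is how the ChiSquare and Prob>ChiSq columns are produced; the software may do it differently, but the arithmetic lands on the printed values. The function and dictionary names are hypothetical.

```python
import math


def wald_test(estimate, std_error):
    """Return (chi_square, p_value) for a 1-degree-of-freedom Wald test."""
    chi_square = (estimate / std_error) ** 2
    # For a chi-square variable with 1 df, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value


# (estimate, standard error) pairs from the no-Kentucky "Parameter Estimates".
NO_KENTUCKY = {
    "Intercept": (77.8017059, 26623.368),
    "Yards": (-0.4704935, 144.43734),
}

for term, (estimate, std_error) in NO_KENTUCKY.items():
    chi_square, p_value = wald_test(estimate, std_error)
    print(f"{term}: ChiSquare = {chi_square:.2f}, Prob>ChiSq = {p_value:.4f}")

# Yardage where the fitted log odds cross zero (a 50-50 game).
intercept = NO_KENTUCKY["Intercept"][0]
slope = NO_KENTUCKY["Yards"][0]
crossover = -intercept / slope
print(f"50-50 point from the coefficients: {crossover:.1f} yards")
```

The standard errors are hundreds of times larger than the estimates, so the test statistic rounds to zero and the p-values sit near 1. The crossover computed from these coefficients comes out near 165, in the same neighborhood as the figure of about 167 read off the graph; with coefficients this shaky, a couple of yards either way is not worth arguing about.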

Summary? Summary:

If the current information provides any prediction about this game, it’s that Tennessee is predicted to beat Arkansas based on the average rush defense predictor. If the Kentucky game is treated as an anomaly, UT is predicted to have about a 50-50 shot at beating Arkansas based on their average rush defense. That’s not exactly a revelation, but it does verify the stuff you wrote and gives some nice pretty numbers and pictures to play with.

Poll
Thoughts?
Exactly!
1 vote
You should have used a spline.
3 votes
Huh?
1 vote

5 votes | Poll has closed


Comments

one minor correction
I didn't intend to mislead about my educational status.  I am working on my Master's, not my PhD.

by Hooper on Nov 8, 2007 2:21 PM EST

You didn't mislead . . .
. . . I just mis-assumed.
Go Vols!

by Joel on Nov 8, 2007 2:36 PM EST
