I chose the “Beerwings” data set. It records how much beer (in fluid ounces) and how many hot wings each person consumed, along with the consumer’s gender. I was able to apply a few different techniques that I learned in class on Tuesday.

library(resampledata)
## 
## Attaching package: 'resampledata'
## The following object is masked from 'package:datasets':
## 
##     Titanic
data("Beerwings")
attach(Beerwings)
plot(Beer, Hotwings, ylab="Wings consumed", xlab="Fluid Ounces of Beer consumed")
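Before modelling, it can help to glance at the raw data. The following is a minimal sketch, assuming the `resampledata` package is installed. It only uses the two column names that appear in the code above (`Beer` and `Hotwings`).

```r
library(resampledata)   # provides the Beerwings data set
data("Beerwings")

# Show the structure: one row per person, with each variable's type
str(Beerwings)

# Numeric summaries for the two variables plotted above
summary(Beerwings[, c("Beer", "Hotwings")])
```

With 30 rows, a simple regression leaves 28 residual degrees of freedom, which is worth keeping in mind when reading model output later.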

Next, I’d like to fit a simple linear regression, with beer as the response and wings as the predictor:

myinfo<- lm(Beer~Hotwings)
myinfo
## 
## Call:
## lm(formula = Beer ~ Hotwings)
## 
## Coefficients:
## (Intercept)     Hotwings  
##       3.040        1.941

Using the information above, I create my regression line:

plot(Hotwings, Beer, xlab="Wings consumed", ylab="Fluid Ounces of Beer consumed")
abline(3.040,1.941)
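Typing the rounded coefficients by hand works, but `abline()` can also take the fitted model directly. The sketch below is one possible alternative, not a correction. The name `fit` is my own, and passing `data = Beerwings` instead of calling `attach()` is a stylistic choice.

```r
library(resampledata)
data("Beerwings")

# Same model as above, but the columns are looked up via data=
fit <- lm(Beer ~ Hotwings, data = Beerwings)

# Scatterplot with wings on the x-axis, matching the model's predictor
plot(Beer ~ Hotwings, data = Beerwings,
     xlab = "Wings consumed", ylab = "Fluid Ounces of Beer consumed")

# abline() reads the intercept and slope from the model object,
# so no rounded values need to be copied by hand
abline(fit)
```

This keeps the line in sync with the model if the data or formula changes.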

The line looks good. Let’s find some residual information!

myresid<-myinfo$residuals
hist(myresid)

One thing to look for in a histogram of residuals is the general bell shape of a Normal distribution, roughly symmetric and centered near zero. It looks good.

QQ plots? Yeah, let’s do it.

qqnorm(myresid)
qqline(myresid)

We are looking to see whether any points in the QQ plot deviate far from the line. We don’t see any extreme departures, which is generally a good indicator of normality.
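A QQ plot is a visual check, so a numeric companion is sometimes useful. The sketch below runs a Shapiro-Wilk test on the residuals. This test was not part of the original analysis and is added only as an illustration. The names `fit` and `sw` are my own.

```r
library(resampledata)
data("Beerwings")

# Refit the same model so this block runs on its own
fit <- lm(Beer ~ Hotwings, data = Beerwings)

# Shapiro-Wilk test: the null hypothesis is that the residuals are Normal
sw <- shapiro.test(resid(fit))
sw
```

A large p-value would be consistent with the QQ plot. With only 30 observations the test has limited power, so it should supplement the plot rather than replace it.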

I’m going to check for heteroscedasticity now. Let’s dig in. We’ve already seen the scatterplot with the fitted line, but it didn’t tell me much, so I’ll plot the residuals against Hotwings instead.

plot(Hotwings,myresid, ylab="Residual")
abline(0,0)

Because the vertical spread of the points around the zero line seems fairly even across the range of Hotwings, we conclude that there is no clear sign of heteroscedasticity.
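Base R also has built-in diagnostic plots for `lm` objects that show the same idea from a slightly different angle. This is a sketch, with `fit` as my own name for the refitted model. Panel 1 plots residuals against fitted values, and panel 3 (scale-location) plots the square root of the absolute standardized residuals.

```r
library(resampledata)
data("Beerwings")

# Refit the same model so this block runs on its own
fit <- lm(Beer ~ Hotwings, data = Beerwings)

# Two diagnostic panels side by side
par(mfrow = c(1, 2))
plot(fit, which = 1)  # residuals vs fitted: look for a funnel shape
plot(fit, which = 3)  # scale-location: look for a trend in the spread
par(mfrow = c(1, 1))  # restore the single-panel layout
```

A roughly flat trend line in the scale-location panel would agree with the conclusion of even spread.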

Lastly, let’s pull some information the easy way to see what our MSE turned out to be.

summary(myinfo)
## 
## Call:
## lm(formula = Beer ~ Hotwings)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -18.566  -4.537  -0.122   3.671  17.789 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   3.0404     3.7235   0.817    0.421    
## Hotwings      1.9408     0.2903   6.686 2.95e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.479 on 28 degrees of freedom
## Multiple R-squared:  0.6148, Adjusted R-squared:  0.6011 
## F-statistic:  44.7 on 1 and 28 DF,  p-value: 2.953e-07

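Hmm, interesting indeed. The residual standard error is 7.479 on 28 degrees of freedom, so the MSE is about 7.479² ≈ 55.9. The slope for Hotwings is highly significant (p = 2.95e-07), and the model explains about 61% of the variation in beer consumption.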
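`summary()` prints the residual standard error rather than the MSE itself, so the MSE has to be derived from it. The sketch below assumes the usual regression definition, MSE = SSE / (n - 2). Some courses divide by n instead, so check which one applies. The names `fit`, `rse`, and `mse` are my own.

```r
library(resampledata)
data("Beerwings")

# Refit the same model so this block runs on its own
fit <- lm(Beer ~ Hotwings, data = Beerwings)

# Residual standard error, as printed by summary()
rse <- sigma(fit)

# MSE from first principles: sum of squared residuals over residual df
mse <- sum(resid(fit)^2) / df.residual(fit)

# The MSE equals the squared residual standard error
c(rse = rse, mse = mse, rse_squared = rse^2)
```

Computing the value both ways is a quick sanity check that the definition matches what `summary()` reports.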

I sure learned a lot and got to practice R. I can’t wait till next time.