To illustrate a regression of measured weight (weight) on reported weight (repwt) for men and women engaged in regular exercise, we use the Davis data set, which becomes available when the car package is loaded.

library(car)
data(Davis)
head(Davis)
##   sex weight height repwt repht
## 1   M     77    182    77   180
## 2   F     58    161    51   159
## 3   F     53    161    54   158
## 4   M     68    177    70   175
## 5   F     59    157    59   155
## 6   M     76    170    76   165

To model measured weight as a function of reported weight, we fit a simple linear regression with repwt as the predictor and weight as the response.

davis.mod <- lm(weight ~ repwt, data=Davis)

Before interpreting the fit, we plot weight against repwt to check for outliers. One point lies far from the rest, and labeling the most extreme point in the scatterplot identifies it as observation 12.

# car >= 3.0 takes id=list(n=1); older versions used id.n=1
scatterplot(weight ~ repwt, data=Davis, smooth=FALSE, id=list(n=1))

## 12 
## 12
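
As a numerical cross-check on the plot, one could look at the studentized residuals of the fitted model and inspect the flagged row directly. The sketch below is one possible way to do this, not the only one; the names `rs` and `worst` are introduced here for illustration, and `outlierTest()` is the Bonferroni outlier test provided by car.

```r
library(car)   # also makes the Davis data available

davis.mod <- lm(weight ~ repwt, data = Davis)

# Studentized residuals, named by the row labels of Davis
rs <- rstudent(davis.mod)

# Row label of the observation with the largest absolute residual
worst <- names(which.max(abs(rs)))
worst

# Look at the raw values recorded for that observation
Davis[worst, ]

# Bonferroni-adjusted test for the most extreme studentized residual
outlierTest(davis.mod)
```

Looking at the raw row helps in judging whether the point is a plausible measurement or a recording error before deciding to drop it.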

We refit the model with observation 12 excluded:

davis.mod2 <- update(davis.mod, subset=-12)
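
To see how much the single observation moves the fit, the two models can be placed side by side. This is a small sketch using `compareCoefs()` from car; it refits both models so that it runs on its own.

```r
library(car)

# Fit with all available observations, then without observation 12
davis.mod  <- lm(weight ~ repwt, data = Davis)
davis.mod2 <- update(davis.mod, subset = -12)

# Coefficients and standard errors of both fits in one table
compareCoefs(davis.mod, davis.mod2)
```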

Finally, we summarize the refitted model:

summary(davis.mod2)
## 
## Call:
## lm(formula = weight ~ repwt, data = Davis, subset = -12)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -7.5296 -1.1010 -0.1322  1.1287  6.3891 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  2.73380    0.81479   3.355 0.000967 ***
## repwt        0.95837    0.01214  78.926  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.254 on 180 degrees of freedom
##   (17 observations deleted due to missingness)
## Multiple R-squared:  0.9719, Adjusted R-squared:  0.9718 
## F-statistic:  6229 on 1 and 180 DF,  p-value: < 2.2e-16
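
If reported weight were an unbiased report of measured weight, we would expect an intercept of 0 and a slope of 1. As a possible follow-up, the sketch below computes confidence intervals for the coefficients and tests both conditions jointly with `linearHypothesis()` from car; the object `ci` is a name introduced here for illustration.

```r
library(car)

# Same model as davis.mod2, fitted directly so the block runs on its own
davis.mod2 <- lm(weight ~ repwt, data = Davis, subset = -12)

# 95% confidence intervals for the intercept and slope
ci <- confint(davis.mod2)
ci

# Joint test of intercept = 0 and slope = 1
linearHypothesis(davis.mod2, c("(Intercept) = 0", "repwt = 1"))
```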