Problem 1

6.6 C.

s{b1} = 0.3011 s{b2} = 0.673

The probability that, under repeated sampling with the same design, the procedures used will give intervals each containing the true

The 99% confidence interval for $\beta_1$ is (3.40955, 5.44045) and the 99% confidence interval for $\beta_2$ is (2.104317, 6.645683). This means we are 99% confident that both $\beta_k$ (where k=1,2) are in their respective intervals.

Both $\beta_1$ and $\beta_2$ are significant because their p-values are less than 0.01 in the summary of the model.

ch6 <-read.table("CH06PR05.txt")
head(ch6)

##   V1 V2 V3
## 1 64  4  2
## 2 73  4  4
## 3 61  4  2
## 4 76  4  4
## 5 72  6  2
## 6 80  6  4

names(ch6) <- c("Y", "X1", "X2")

fit <- lm(Y~X1+X2,data=ch6)
summary(fit)

## 
## Call:
## lm(formula = Y ~ X1 + X2, data = ch6)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -4.400 -1.762  0.025  1.587  4.200 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  37.6500     2.9961  12.566 1.20e-08 ***
## X1            4.4250     0.3011  14.695 1.78e-09 ***
## X2            4.3750     0.6733   6.498 2.01e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.693 on 13 degrees of freedom
## Multiple R-squared:  0.9521, Adjusted R-squared:  0.9447 
## F-statistic: 129.1 on 2 and 13 DF,  p-value: 2.658e-09

joint_fam <- 2 #number of joint betas we want.
alpha <- 0.01
joint_level <- alpha/joint_fam
bonf_int <- confint(fit, level = 1-joint_level)
bonf_int

##                0.25 %   99.75 %
## (Intercept) 27.545738 47.754262
## X1           3.409483  5.440517
## X2           2.104236  6.645764

B <- qt(0.9975,13)
B

## [1] 3.372468

4.425 + c(-1,1)*B*(0.3011)

## [1] 3.40955 5.44045

4.375 + c(-1,1)*B*(0.6733)

## [1] 2.104317 6.645683

6.8 A.

The interval estimate of E{Yh} is between 73.880 and 80.670.

$\hat{E_y}$ = 77.275

We are 99% confident that the value of E{Yh} lies in the interval (73.88111 80.66889)

new <- data.frame(X1=5,X2=4)  # new data
predict.lm(fit,new,interval="confidence",level=0.99)  # CI for main response

##      fit      lwr      upr
## 1 77.275 73.88111 80.66889

6.9 B.

The prediction interval for Xh1 = 5 and Xh2 = 4 is between (68.48077, 86.06923).

predict.lm(fit,new,interval="prediction",level=0.99)

##      fit      lwr      upr
## 1 77.275 68.48077 86.06923

Problem 2

6.18 c.

$\hat{Y} = 12.2006 − 0.1420X1+ 0.2820X2 + 0.6193X3 + 0.0000079X4$

ch18 <-read.table("CH06PR18.txt")
head(ch18)

##     V1 V2    V3   V4     V5
## 1 13.5  1  5.02 0.14 123000
## 2 12.0 14  8.19 0.27 104079
## 3 10.5 16  3.00 0.00  39998
## 4 15.0  4 10.70 0.05  57112
## 5 14.0 11  8.97 0.07  60000
## 6 10.5 15  9.45 0.24 101385

fit2 <- lm(V1 ~ V2 + V3 + V4 + V5, data =ch18)
summary(fit2)

## 
## Call:
## lm(formula = V1 ~ V2 + V3 + V4 + V5, data = ch18)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.1872 -0.5911 -0.0910  0.5579  2.9441 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.220e+01  5.780e-01  21.110  < 2e-16 ***
## V2          -1.420e-01  2.134e-02  -6.655 3.89e-09 ***
## V3           2.820e-01  6.317e-02   4.464 2.75e-05 ***
## V4           6.193e-01  1.087e+00   0.570     0.57    
## V5           7.924e-06  1.385e-06   5.722 1.98e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.137 on 76 degrees of freedom
## Multiple R-squared:  0.5847, Adjusted R-squared:  0.5629 
## F-statistic: 26.76 on 4 and 76 DF,  p-value: 7.272e-14

6.19

Part A.

Hypothesis Test:

$H_0: \beta_{1}= \beta_{2} = \beta_{3} = \beta_{4} = 0$

$H_A:$ Atleast one $\beta_{k} \neq 0$ where k = 1,2,3,4

Decision Rule:

If $F* > F$, then reject $H_0$.

Conclusion:

With an F* value = 26.74 > F value = 2.75, we reject $H_0$. Additionally, our p-value from the summary tab is 7.272e-14 which is less than $= 0.05%.

Thus, at least one $\beta_{k}$ = 0, where k=1,2,3,4 and there is enough evidence to conclude a linear relationship exists.

anova(fit2)

## Analysis of Variance Table
## 
## Response: V1
##           Df Sum Sq Mean Sq F value    Pr(>F)    
## V2         1 14.819  14.819 11.4649  0.001125 ** 
## V3         1 72.802  72.802 56.3262 9.699e-11 ***
## V4         1  8.381   8.381  6.4846  0.012904 *  
## V5         1 42.325  42.325 32.7464 1.976e-07 ***
## Residuals 76 98.231   1.293                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

summary(fit2)

## 
## Call:
## lm(formula = V1 ~ V2 + V3 + V4 + V5, data = ch18)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.1872 -0.5911 -0.0910  0.5579  2.9441 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.220e+01  5.780e-01  21.110  < 2e-16 ***
## V2          -1.420e-01  2.134e-02  -6.655 3.89e-09 ***
## V3           2.820e-01  6.317e-02   4.464 2.75e-05 ***
## V4           6.193e-01  1.087e+00   0.570     0.57    
## V5           7.924e-06  1.385e-06   5.722 1.98e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.137 on 76 degrees of freedom
## Multiple R-squared:  0.5847, Adjusted R-squared:  0.5629 
## F-statistic: 26.76 on 4 and 76 DF,  p-value: 7.272e-14

MSR <- (14.819+72.802+8.381+42.325)/4
MSR

## [1] 34.58175

MSE <- 1.293
MSE

## [1] 1.293

F_star <- MSR/MSE
F_star

## [1] 26.74536

a=0.05
n <- nrow(ch18)
p <- 5
F_test <- qf(0.95,p-1,n-p)
F_test

## [1] 2.492049

6.19

Part B.

From the summary of our model, s{b1} = 0.02134, s{b2 } = 0.06317, s{b3}= 1.08681, s{b4} = 0.00000138.

We are 95% confident that all our $\beta_k$ (where k=1,2,3,4) are in their respective intervals as listed below.

$\beta_1$ (-0.19659927, -0.08740073) $\beta_2$ (0.1203769, 0.4436231) $\beta_3$ (-2.161348, 3.399948) $\beta_4$ (4.369213e-06, 1.143079e-05)

joint_fam <- 4 #number of joint betas we want.
alpha <- 0.05
joint_level <- alpha/joint_fam
bonf_int <- confint(fit2, level = 1-joint_level)
bonf_int

##                   0.625 %      99.375 %
## (Intercept)  1.072186e+01  1.367931e+01
## V2          -1.966396e-01 -8.742769e-02
## V3           1.203875e-01  4.436456e-01
## V4          -2.161312e+00  3.399999e+00
## V5           4.381297e-06  1.146731e-05

summary(fit2)

## 
## Call:
## lm(formula = V1 ~ V2 + V3 + V4 + V5, data = ch18)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.1872 -0.5911 -0.0910  0.5579  2.9441 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.220e+01  5.780e-01  21.110  < 2e-16 ***
## V2          -1.420e-01  2.134e-02  -6.655 3.89e-09 ***
## V3           2.820e-01  6.317e-02   4.464 2.75e-05 ***
## V4           6.193e-01  1.087e+00   0.570     0.57    
## V5           7.924e-06  1.385e-06   5.722 1.98e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.137 on 76 degrees of freedom
## Multiple R-squared:  0.5847, Adjusted R-squared:  0.5629 
## F-statistic: 26.76 on 4 and 76 DF,  p-value: 7.272e-14

B <- qt(0.99375,n-p)
B

## [1] 2.558541

b1 <- -0.142 + c(-1,1)*B*(0.02134)
b1

## [1] -0.19659927 -0.08740073

b2 <- 0.282 + c(-1,1)*B*(0.06317)
b2

## [1] 0.1203769 0.4436231

b3 <- 0.6193 + c(-1,1)*B*(1.08681)
b3

## [1] -2.161348  3.399948

b4 <- 0.0000079 + c(-1,1)*B*(0.00000138)
b4

## [1] 4.369213e-06 1.143079e-05

Part C.

From our summary fit, we see that muliple R-Squared value is 0.5847.

We can also calculate it using SSR/SSTO = 0.584.

Our regression model accounts for 58.4% of the variation

SSR <-  14.819+72.802+8.381+42.325
SSR

## [1] 138.327

SSE <- 98.321
SSTO <- SSR+SSE
SSTO

## [1] 236.648

SSR/SSTO

## [1] 0.5845264

6.20

Working hotel, W = 3.4168 > B = 2.5585 which is the bonferroni method.

Thus, we use the Bonferroni Method method.

The most efficient procedure provides us the first family of estimates.

#1 Simultaneous CI of mean response
new <- data.frame(V2=c(5,6,14,12),V3=c(8.25,8.5,11.5,10.25),V4= c(0,0.23,0.11,0), V5=c(250000,270000,300000, 310000))
g=nrow(new)
alphaF=0.05
#Bonferroni CI
BCI=predict(fit2,new,se.fit = TRUE,
           interval = "confidence",level=1-alphaF/g)
BCI$fit ##USE THIS AS B < W BETTER TO USE

##        fit      lwr      upr
## 1 15.79813 15.08664 16.50962
## 2 16.02754 15.42391 16.63116
## 3 15.90072 15.33232 16.46913
## 4 15.84339 15.18040 16.50638

n <- nrow(ch18)
p <- 5
B <- qt(0.99375,n-p)
B

## [1] 2.558541

#Working-Hotelling CI 
n=nrow(ch18)
p=5
Yh_hat=predict(fit2,newdata=new)
W=sqrt(p*qf(1-alphaF,p,n-p))
W

## [1] 3.416811

WHCI=cbind(Yh_hat,Yh_hat-W*BCI$se.fit,Yh_hat+W*BCI$se.fit)
colnames(WHCI)=c("fitted","lower","upper")
WHCI

##     fitted    lower    upper
## 1 15.79813 14.84797 16.74829
## 2 16.02754 15.22142 16.83365
## 3 15.90072 15.14165 16.65980
## 4 15.84339 14.95799 16.72878

STAT 408 HW 8 V2

Kajal Chokshi

11/8/2018

Problem 1

6.6 C.

6.8 A.

6.9 B.

Problem 2

6.18 c.

6.19

Part A.

6.19

Part B.

Part C.

6.20