s{b1} = 0.3011 s{b2} = 0.673
The probability that, under repeated sampling with the same design, the procedures used will give intervals each containing the true
The 99% confidence interval for \(\beta_1\) is (3.40955, 5.44045) and the 99% confidence interval for \(\beta_2\) is (2.104317, 6.645683). This means we are 99% confident that both \(\beta_k\) (where k=1,2) are in their respective intervals.
Both \(\beta_1\) and \(\beta_2\) are significant because their p-values are less than 0.01 in the summary of the model.
ch6 <-read.table("CH06PR05.txt")
head(ch6)
## V1 V2 V3
## 1 64 4 2
## 2 73 4 4
## 3 61 4 2
## 4 76 4 4
## 5 72 6 2
## 6 80 6 4
names(ch6) <- c("Y", "X1", "X2")
fit <- lm(Y~X1+X2,data=ch6)
summary(fit)
##
## Call:
## lm(formula = Y ~ X1 + X2, data = ch6)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.400 -1.762 0.025 1.587 4.200
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 37.6500 2.9961 12.566 1.20e-08 ***
## X1 4.4250 0.3011 14.695 1.78e-09 ***
## X2 4.3750 0.6733 6.498 2.01e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.693 on 13 degrees of freedom
## Multiple R-squared: 0.9521, Adjusted R-squared: 0.9447
## F-statistic: 129.1 on 2 and 13 DF, p-value: 2.658e-09
joint_fam <- 2 #number of joint betas we want.
alpha <- 0.01
joint_level <- alpha/joint_fam
bonf_int <- confint(fit, level = 1-joint_level)
bonf_int
## 0.25 % 99.75 %
## (Intercept) 27.545738 47.754262
## X1 3.409483 5.440517
## X2 2.104236 6.645764
B <- qt(0.9975,13)
B
## [1] 3.372468
4.425 + c(-1,1)*B*(0.3011)
## [1] 3.40955 5.44045
4.375 + c(-1,1)*B*(0.6733)
## [1] 2.104317 6.645683
The interval estimate of E{Yh} is between 73.880 and 80.670.
\(\hat{E_y}\) = 77.275
We are 99% confident that the value of E{Yh} lies in the interval (73.88111 80.66889)
new <- data.frame(X1=5,X2=4) # new data
predict.lm(fit,new,interval="confidence",level=0.99) # CI for main response
## fit lwr upr
## 1 77.275 73.88111 80.66889
The prediction interval for Xh1 = 5 and Xh2 = 4 is between (68.48077, 86.06923).
predict.lm(fit,new,interval="prediction",level=0.99)
## fit lwr upr
## 1 77.275 68.48077 86.06923
\(\hat{Y} = 12.2006 − 0.1420X1+ 0.2820X2 + 0.6193X3 + 0.0000079X4\)
ch18 <-read.table("CH06PR18.txt")
head(ch18)
## V1 V2 V3 V4 V5
## 1 13.5 1 5.02 0.14 123000
## 2 12.0 14 8.19 0.27 104079
## 3 10.5 16 3.00 0.00 39998
## 4 15.0 4 10.70 0.05 57112
## 5 14.0 11 8.97 0.07 60000
## 6 10.5 15 9.45 0.24 101385
fit2 <- lm(V1 ~ V2 + V3 + V4 + V5, data =ch18)
summary(fit2)
##
## Call:
## lm(formula = V1 ~ V2 + V3 + V4 + V5, data = ch18)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.1872 -0.5911 -0.0910 0.5579 2.9441
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.220e+01 5.780e-01 21.110 < 2e-16 ***
## V2 -1.420e-01 2.134e-02 -6.655 3.89e-09 ***
## V3 2.820e-01 6.317e-02 4.464 2.75e-05 ***
## V4 6.193e-01 1.087e+00 0.570 0.57
## V5 7.924e-06 1.385e-06 5.722 1.98e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.137 on 76 degrees of freedom
## Multiple R-squared: 0.5847, Adjusted R-squared: 0.5629
## F-statistic: 26.76 on 4 and 76 DF, p-value: 7.272e-14
Hypothesis Test:
\(H_0: \beta_{1}= \beta_{2} = \beta_{3} = \beta_{4} = 0\)
\(H_A:\) Atleast one \(\beta_{k} \neq 0\) where k = 1,2,3,4
Decision Rule:
If \(F* > F\), then reject \(H_0\).
Conclusion:
With an F* value = 26.74 > F value = 2.75, we reject \(H_0\). Additionally, our p-value from the summary tab is 7.272e-14 which is less than $= 0.05%.
Thus, at least one \(\beta_{k}\) = 0, where k=1,2,3,4 and there is enough evidence to conclude a linear relationship exists.
anova(fit2)
## Analysis of Variance Table
##
## Response: V1
## Df Sum Sq Mean Sq F value Pr(>F)
## V2 1 14.819 14.819 11.4649 0.001125 **
## V3 1 72.802 72.802 56.3262 9.699e-11 ***
## V4 1 8.381 8.381 6.4846 0.012904 *
## V5 1 42.325 42.325 32.7464 1.976e-07 ***
## Residuals 76 98.231 1.293
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(fit2)
##
## Call:
## lm(formula = V1 ~ V2 + V3 + V4 + V5, data = ch18)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.1872 -0.5911 -0.0910 0.5579 2.9441
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.220e+01 5.780e-01 21.110 < 2e-16 ***
## V2 -1.420e-01 2.134e-02 -6.655 3.89e-09 ***
## V3 2.820e-01 6.317e-02 4.464 2.75e-05 ***
## V4 6.193e-01 1.087e+00 0.570 0.57
## V5 7.924e-06 1.385e-06 5.722 1.98e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.137 on 76 degrees of freedom
## Multiple R-squared: 0.5847, Adjusted R-squared: 0.5629
## F-statistic: 26.76 on 4 and 76 DF, p-value: 7.272e-14
MSR <- (14.819+72.802+8.381+42.325)/4
MSR
## [1] 34.58175
MSE <- 1.293
MSE
## [1] 1.293
F_star <- MSR/MSE
F_star
## [1] 26.74536
a=0.05
n <- nrow(ch18)
p <- 5
F_test <- qf(0.95,p-1,n-p)
F_test
## [1] 2.492049
From the summary of our model, s{b1} = 0.02134, s{b2 } = 0.06317, s{b3}= 1.08681, s{b4} = 0.00000138.
We are 95% confident that all our \(\beta_k\) (where k=1,2,3,4) are in their respective intervals as listed below.
\(\beta_1\) (-0.19659927, -0.08740073) \(\beta_2\) (0.1203769, 0.4436231) \(\beta_3\) (-2.161348, 3.399948) \(\beta_4\) (4.369213e-06, 1.143079e-05)
joint_fam <- 4 #number of joint betas we want.
alpha <- 0.05
joint_level <- alpha/joint_fam
bonf_int <- confint(fit2, level = 1-joint_level)
bonf_int
## 0.625 % 99.375 %
## (Intercept) 1.072186e+01 1.367931e+01
## V2 -1.966396e-01 -8.742769e-02
## V3 1.203875e-01 4.436456e-01
## V4 -2.161312e+00 3.399999e+00
## V5 4.381297e-06 1.146731e-05
summary(fit2)
##
## Call:
## lm(formula = V1 ~ V2 + V3 + V4 + V5, data = ch18)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.1872 -0.5911 -0.0910 0.5579 2.9441
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.220e+01 5.780e-01 21.110 < 2e-16 ***
## V2 -1.420e-01 2.134e-02 -6.655 3.89e-09 ***
## V3 2.820e-01 6.317e-02 4.464 2.75e-05 ***
## V4 6.193e-01 1.087e+00 0.570 0.57
## V5 7.924e-06 1.385e-06 5.722 1.98e-07 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.137 on 76 degrees of freedom
## Multiple R-squared: 0.5847, Adjusted R-squared: 0.5629
## F-statistic: 26.76 on 4 and 76 DF, p-value: 7.272e-14
B <- qt(0.99375,n-p)
B
## [1] 2.558541
b1 <- -0.142 + c(-1,1)*B*(0.02134)
b1
## [1] -0.19659927 -0.08740073
b2 <- 0.282 + c(-1,1)*B*(0.06317)
b2
## [1] 0.1203769 0.4436231
b3 <- 0.6193 + c(-1,1)*B*(1.08681)
b3
## [1] -2.161348 3.399948
b4 <- 0.0000079 + c(-1,1)*B*(0.00000138)
b4
## [1] 4.369213e-06 1.143079e-05
From our summary fit, we see that muliple R-Squared value is 0.5847.
We can also calculate it using SSR/SSTO = 0.584.
Our regression model accounts for 58.4% of the variation
SSR <- 14.819+72.802+8.381+42.325
SSR
## [1] 138.327
SSE <- 98.321
SSTO <- SSR+SSE
SSTO
## [1] 236.648
SSR/SSTO
## [1] 0.5845264
Working hotel, W = 3.4168 > B = 2.5585 which is the bonferroni method.
Thus, we use the Bonferroni Method method.
The most efficient procedure provides us the first family of estimates.
#1 Simultaneous CI of mean response
new <- data.frame(V2=c(5,6,14,12),V3=c(8.25,8.5,11.5,10.25),V4= c(0,0.23,0.11,0), V5=c(250000,270000,300000, 310000))
g=nrow(new)
alphaF=0.05
#Bonferroni CI
BCI=predict(fit2,new,se.fit = TRUE,
interval = "confidence",level=1-alphaF/g)
BCI$fit ##USE THIS AS B < W BETTER TO USE
## fit lwr upr
## 1 15.79813 15.08664 16.50962
## 2 16.02754 15.42391 16.63116
## 3 15.90072 15.33232 16.46913
## 4 15.84339 15.18040 16.50638
n <- nrow(ch18)
p <- 5
B <- qt(0.99375,n-p)
B
## [1] 2.558541
#Working-Hotelling CI
n=nrow(ch18)
p=5
Yh_hat=predict(fit2,newdata=new)
W=sqrt(p*qf(1-alphaF,p,n-p))
W
## [1] 3.416811
WHCI=cbind(Yh_hat,Yh_hat-W*BCI$se.fit,Yh_hat+W*BCI$se.fit)
colnames(WHCI)=c("fitted","lower","upper")
WHCI
## fitted lower upper
## 1 15.79813 14.84797 16.74829
## 2 16.02754 15.22142 16.83365
## 3 15.90072 15.14165 16.65980
## 4 15.84339 14.95799 16.72878