Week 3 Homework Part 1
Use the Brokerage Satisfaction Excel file to answer the following questions in R. Create an R Markdown file to answer the questions, and then “knit” your file to create an HTML document. Your HTML document should contain both textual explanations of your answers and all R code needed to support your work.
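Consider the model Overall = B1(Satisfaction_with_Speed_of_Execution) + B2(Satisfaction_with_Trade_Price) + B3, where B1, B2, and B3 are real numbers. In the data, these variables are the columns Overall, Speed, and Price.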
brokerage <- read.csv("C:/Users/raze1/OneDrive/Desktop/UIndy/MSDA 621/Homework/2/Brokerage Satisfaction.csv")
summary(brokerage)
## Brokerage Price Speed Overall
## Length:14 Min. :1.000 Min. :2.500 Min. :2.000
## Class :character 1st Qu.:2.425 1st Qu.:3.025 1st Qu.:2.700
## Mode :character Median :2.750 Median :3.200 Median :3.000
## Mean :2.707 Mean :3.257 Mean :3.029
## 3rd Qu.:3.075 3rd Qu.:3.650 3rd Qu.:3.350
## Max. :3.700 Max. :4.000 Max. :4.000
head(brokerage)
## Brokerage Price Speed Overall
## 1 Scottrade, Inc. 3.2 3.1 3.2
## 2 Charles Schwab 3.3 3.1 3.2
## 3 Fidelity Brokerage Services 3.1 3.3 4.0
## 4 TD Ameritrade 2.8 3.5 3.7
## 5 E*Trade Financial 2.9 3.2 3.0
## 6 (Not listed) 2.4 3.2 2.7
colnames(brokerage)
## [1] "Brokerage" "Price" "Speed" "Overall"
brokerage$Brokerage <- NULL
colnames(brokerage)
## [1] "Price" "Speed" "Overall"
brokerage_model <- lm(Overall~., data = brokerage)
summary(brokerage_model)
##
## Call:
## lm(formula = Overall ~ ., data = brokerage)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.58886 -0.13863 -0.09120 0.05781 0.64613
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.6633 0.8248 -0.804 0.438318
## Price 0.7746 0.1521 5.093 0.000348 ***
## Speed 0.4897 0.2016 2.429 0.033469 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3435 on 11 degrees of freedom
## Multiple R-squared: 0.7256, Adjusted R-squared: 0.6757
## F-statistic: 14.54 on 2 and 11 DF, p-value: 0.0008157
confint(brokerage_model, level = .8)
## 10 % 90 %
## (Intercept) -1.7879306 0.4612749
## Price 0.5672241 0.9819956
## Speed 0.2148115 0.7645252
Fill in the blanks for the following statements:
a) There is an 80% probability that the number B1 (the Speed coefficient) will fall between 0.2148115 and 0.7645252.
b) There is an 80% probability that the number B2 (the Price coefficient) will fall between 0.5672241 and 0.9819956.
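As a cross-check on these bounds, the 80% intervals can be rebuilt by hand from the rounded estimates and standard errors printed by summary(brokerage_model). This is only a sketch: the names est, se, and df_resid are illustrative and not part of the original work, and because the inputs are rounded the results should agree with confint() to about three decimals rather than exactly.

```r
# Reproduce the 80% intervals from the rounded values printed by summary().
# `est`, `se`, and `df_resid` are illustrative names, not from the original.
est <- c(Price = 0.7746, Speed = 0.4897)   # coefficient estimates
se  <- c(Price = 0.1521, Speed = 0.2016)   # standard errors
df_resid <- 11                             # residual degrees of freedom

# An 80% two-sided interval leaves 10% in each tail, so use the 0.90 quantile
# of the t distribution.
t_crit <- qt(0.90, df = df_resid)

# estimate -/+ t * standard error, one row per coefficient
ci_80 <- cbind(lower = est - t_crit * se, upper = est + t_crit * se)
print(round(ci_80, 4))
```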
Q2A = data.frame(Speed = c(3), Price = c(4))
Q2C = data.frame(Speed = c(2), Price = c(3))
predictA<-predict(brokerage_model, Q2A, type = "response")
predictC<-predict(brokerage_model, Q2C, type = "response")
predictA
## 1
## 3.904117
predictC
## 1
## 2.639838
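The two point predictions can be checked by plugging the inputs into the fitted equation directly. The sketch below uses the rounded coefficients from the summary output, so it should match predict() only to about three decimals; b and predict_by_hand are hypothetical names introduced for illustration.

```r
# Rounded coefficients copied from summary(brokerage_model).
b <- c(intercept = -0.6633, Price = 0.7746, Speed = 0.4897)

# Fitted equation: Overall = intercept + Price coef * Price + Speed coef * Speed
predict_by_hand <- function(price, speed) {
  unname(b["intercept"] + b["Price"] * price + b["Speed"] * speed)
}

pred_A <- predict_by_hand(price = 4, speed = 3)  # same inputs as Q2A
pred_C <- predict_by_hand(price = 3, speed = 2)  # same inputs as Q2C
print(c(A = pred_A, C = pred_C))
```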
predictA_prediction<-predict(brokerage_model, Q2A , interval = "prediction", level = .9, type = "response")
predictA_prediction
## fit lwr upr
## 1 3.904117 3.174452 4.633781
predictA_confidence<-predict(brokerage_model, Q2A, interval = "confidence", level = .9, type = "response")
predictA_confidence
## fit lwr upr
## 1 3.904117 3.514362 4.293871
predictC_prediction<-predict(brokerage_model, Q2C, interval = "prediction", level = .85, type = "response")
predictC_prediction
## fit lwr upr
## 1 2.639838 1.965909 3.313768
predictC_confidence<-predict(brokerage_model, Q2C, interval = "confidence", level = .85, type = "response")
predictC_confidence
## fit lwr upr
## 1 2.639838 2.225554 3.054123
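The prediction intervals are wider than the confidence intervals because a prediction interval adds the residual variance of a single new observation to the uncertainty in the fitted mean. As a rough illustration, the 90% prediction interval for case A can be recovered from the 90% confidence interval and the residual standard error reported earlier. The variable names below are illustrative, and the inputs are the rounded printed values, so agreement is approximate.

```r
# Values copied from the output above (case A, 90% level).
fit       <- 3.904117   # point prediction
ci_lower  <- 3.514362   # lower bound of the 90% confidence interval
sigma_hat <- 0.3435     # residual standard error of brokerage_model
df_resid  <- 11         # residual degrees of freedom

t_crit <- qt(0.95, df = df_resid)       # 5% in each tail for a 90% interval

# Standard error of the fitted mean, backed out of the confidence interval.
se_fit <- (fit - ci_lower) / t_crit

# A new observation adds its own noise: var(prediction) = sigma^2 + se_fit^2.
se_pred <- sqrt(sigma_hat^2 + se_fit^2)

pi_90 <- c(lwr = fit - t_crit * se_pred, upr = fit + t_crit * se_pred)
print(round(pi_90, 3))
```

The result should land close to the interval returned by predict(..., interval = "prediction", level = .9) for Q2A.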
lm(formula = Overall ~ Price + Speed, data = brokerage)
##
## Call:
## lm(formula = Overall ~ Price + Speed, data = brokerage)
##
## Coefficients:
## (Intercept) Price Speed
## -0.6633 0.7746 0.4897
brokerage_unit_normal = as.data.frame(apply(brokerage, 2, function(x){(x-mean(x))/sd(x)}))
brokerage_model_unit_normal <- lm(Overall ~ Price + Speed, data = brokerage_unit_normal)
summary(brokerage_model_unit_normal)
##
## Call:
## lm(formula = Overall ~ Price + Speed, data = brokerage_unit_normal)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.97638 -0.22987 -0.15121 0.09586 1.07134
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.115e-16 1.522e-01 0.000 1.000000
## Price 8.115e-01 1.593e-01 5.093 0.000348 ***
## Speed 3.870e-01 1.593e-01 2.429 0.033469 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.5695 on 11 degrees of freedom
## Multiple R-squared: 0.7256, Adjusted R-squared: 0.6757
## F-statistic: 14.54 on 2 and 11 DF, p-value: 0.0008157
lm(formula = Overall ~ Price + Speed, data = brokerage_unit_normal)
##
## Call:
## lm(formula = Overall ~ Price + Speed, data = brokerage_unit_normal)
##
## Coefficients:
## (Intercept) Price Speed
## 4.115e-16 8.115e-01 3.870e-01
summary(brokerage_model)
##
## Call:
## lm(formula = Overall ~ ., data = brokerage)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.58886 -0.13863 -0.09120 0.05781 0.64613
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.6633 0.8248 -0.804 0.438318
## Price 0.7746 0.1521 5.093 0.000348 ***
## Speed 0.4897 0.2016 2.429 0.033469 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3435 on 11 degrees of freedom
## Multiple R-squared: 0.7256, Adjusted R-squared: 0.6757
## F-statistic: 14.54 on 2 and 11 DF, p-value: 0.0008157
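In the standardized model, the coefficient for Price (0.8115) is larger than the coefficient for Speed (0.3870), so Satisfaction_with_Trade_Price is more influential than Satisfaction_with_Speed_of_Execution.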
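The comparison relies on standardized coefficients, which equal the raw coefficients rescaled by sd(predictor) / sd(response). The brokerage file is not reproduced in full here, so the sketch below demonstrates that identity on R's built-in mtcars data as a stand-in; the choice of mpg, wt, and hp is purely illustrative and has nothing to do with the homework data.

```r
# Stand-in data: built-in mtcars, NOT the brokerage data.
vars <- c("mpg", "wt", "hp")

# Fit on the raw scale.
raw_fit <- lm(mpg ~ wt + hp, data = mtcars)

# Fit on the unit-normal scale (each column centered and divided by its sd).
std_data <- as.data.frame(scale(mtcars[, vars]))
std_fit  <- lm(mpg ~ wt + hp, data = std_data)

# Convert raw slopes to standardized slopes: b_std = b_raw * sd(x) / sd(y).
sds <- sapply(mtcars[, vars], sd)
converted <- coef(raw_fit)[c("wt", "hp")] * sds[c("wt", "hp")] / sds["mpg"]

# The two rows should agree up to floating-point error.
print(rbind(standardized = coef(std_fit)[c("wt", "hp")],
            converted    = converted))
```

Because standardizing puts every predictor on the same scale, the magnitudes of the standardized slopes can be compared directly, which is what the conclusion above does for Price and Speed.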
Week 3 Homework Part 2
Use the data_RocketProp csv file to answer the following questions in R. Create an R Markdown file to answer the questions, and then “knit” your file to create an HTML document. Your HTML document should contain both textual explanations of your answers, as well as all R code needed to support your work.
rocket <- read.csv("C:/Users/raze1/OneDrive/Desktop/UIndy/MSDA 621/Homework/3/data_RocketProp.csv")
summary(rocket)
## y x
## Min. :1678 Min. : 2.000
## 1st Qu.:1783 1st Qu.: 7.125
## Median :2183 Median :12.750
## Mean :2131 Mean :13.363
## 3rd Qu.:2342 3rd Qu.:19.625
## Max. :2654 Max. :25.000
head(rocket)
## y x
## 1 2158.70 15.50
## 2 1678.15 23.75
## 3 2316.00 8.00
## 4 2061.30 17.00
## 5 2207.50 5.50
## 6 1708.30 19.00
rocket_model<-lm(y ~ x, data = rocket)
summary(rocket_model)
##
## Call:
## lm(formula = y ~ x, data = rocket)
##
## Residuals:
## Min 1Q Median 3Q Max
## -215.98 -50.68 28.74 66.61 106.76
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 2627.822 44.184 59.48 < 2e-16 ***
## x -37.154 2.889 -12.86 1.64e-10 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 96.11 on 18 degrees of freedom
## Multiple R-squared: 0.9018, Adjusted R-squared: 0.8964
## F-statistic: 165.4 on 1 and 18 DF, p-value: 1.643e-10
x=model.matrix(rocket_model)
x
## (Intercept) x
## 1 1 15.50
## 2 1 23.75
## 3 1 8.00
## 4 1 17.00
## 5 1 5.50
## 6 1 19.00
## 7 1 24.00
## 8 1 2.50
## 9 1 7.50
## 10 1 11.00
## 11 1 13.00
## 12 1 3.75
## 13 1 25.00
## 14 1 9.75
## 15 1 22.00
## 16 1 18.00
## 17 1 6.00
## 18 1 12.50
## 19 1 2.00
## 20 1 21.50
## attr(,"assign")
## [1] 0 1
mean(hatvalues(rocket_model))
## [1] 0.1
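The value 0.1 is not a coincidence. The hat values are the diagonal entries of the hat matrix H, whose trace equals the number of estimated coefficients p, so with n = 20 observations and p = 2 coefficients (intercept and slope) the average leverage is:

$$
\bar{h} = \frac{1}{n}\sum_{i=1}^{n} h_{ii} = \frac{\operatorname{tr}(H)}{n} = \frac{p}{n} = \frac{2}{20} = 0.1
$$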
hatvalues(rocket_model)
## 1 2 3 4 5 6 7
## 0.05412893 0.14750959 0.07598722 0.06195725 0.10586587 0.07872092 0.15225968
## 8 9 10 11 12 13 14
## 0.15663134 0.08105925 0.05504393 0.05011875 0.13350221 0.17238964 0.06179345
## 15 16 17 18 19 20
## 0.11742196 0.06943538 0.09898644 0.05067227 0.16667373 0.10984216
cbind(rocket, leverage = hatvalues(rocket_model))
## y x leverage
## 1 2158.70 15.50 0.05412893
## 2 1678.15 23.75 0.14750959
## 3 2316.00 8.00 0.07598722
## 4 2061.30 17.00 0.06195725
## 5 2207.50 5.50 0.10586587
## 6 1708.30 19.00 0.07872092
## 7 1784.70 24.00 0.15225968
## 8 2575.00 2.50 0.15663134
## 9 2357.90 7.50 0.08105925
## 10 2256.70 11.00 0.05504393
## 11 2165.20 13.00 0.05011875
## 12 2399.55 3.75 0.13350221
## 13 1779.80 25.00 0.17238964
## 14 2336.75 9.75 0.06179345
## 15 1765.30 22.00 0.11742196
## 16 2053.50 18.00 0.06943538
## 17 2414.40 6.00 0.09898644
## 18 2200.50 12.50 0.05067227
## 19 2654.20 2.00 0.16667373
## 20 1753.70 21.50 0.10984216
x_newA=c(1, 25.5)
t(x_newA)%*%solve(t(x)%*%x)%*%x_newA
## [,1]
## [1,] 0.1831324
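The leverage at x = 25.5 is 0.1831, which is below twice the average leverage (2 × 0.1 = 0.2), so by that rule this is not considered extrapolation. Note, however, that it exceeds the largest leverage among the observed points (0.1724, observation 13) and that x = 25.5 lies just beyond the observed maximum of 25, so a stricter comparison against the maximum observed leverage would flag it.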
x_newB=c(1, 15)
t(x_newB)%*%solve(t(x)%*%x)%*%x_newB
## [,1]
## [1,] 0.05242319
The leverage at x = 15 is 0.0524, well below twice the average leverage (2 × 0.1 = 0.2), so this is not considered extrapolation.
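For a simple linear regression the matrix product used above reduces to a closed form, h(x0) = 1/n + (x0 - mean(x))^2 / Sxx. The sketch below applies it to the x values listed in the model matrix and compares both new points against two possible cutoffs: twice the average leverage (the rule used here) and the largest observed leverage. The names x_obs, leverage_at, and cutoffs are illustrative and not part of the original work.

```r
# x values copied from the model matrix printed above.
x_obs <- c(15.50, 23.75, 8.00, 17.00, 5.50, 19.00, 24.00, 2.50, 7.50, 11.00,
           13.00, 3.75, 25.00, 9.75, 22.00, 18.00, 6.00, 12.50, 2.00, 21.50)
n    <- length(x_obs)
s_xx <- sum((x_obs - mean(x_obs))^2)   # corrected sum of squares of x

# Leverage of any point x0 in a one-predictor model with an intercept.
leverage_at <- function(x0) 1 / n + (x0 - mean(x_obs))^2 / s_xx

h_obs <- leverage_at(x_obs)                        # matches hatvalues()
h_new <- c(A = leverage_at(25.5), B = leverage_at(15))

cutoffs <- c(twice_mean = 2 * mean(h_obs), max_observed = max(h_obs))

print(round(h_new, 4))
print(round(cutoffs, 4))
# TRUE means the new point's leverage exceeds that cutoff.
print(outer(h_new, cutoffs, ">"))
```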
cooks.distance(rocket_model)
## 1 2 3 4 5 6
## 0.0373281981 0.0497291858 0.0010260760 0.0161482719 0.3343768993 0.2290842436
## 7 8 9 10 11 12
## 0.0270491200 0.0191323748 0.0003959877 0.0047094549 0.0012482345 0.0761514881
## 13 14 15 16 17 18
## 0.0889892211 0.0192517639 0.0166302585 0.0387158541 0.0005955991 0.0041888627
## 19 20
## 0.1317143774 0.0425721512
cbind(rocket, leverage = hatvalues(rocket_model), Cooks = cooks.distance(rocket_model))
## y x leverage Cooks
## 1 2158.70 15.50 0.05412893 0.0373281981
## 2 1678.15 23.75 0.14750959 0.0497291858
## 3 2316.00 8.00 0.07598722 0.0010260760
## 4 2061.30 17.00 0.06195725 0.0161482719
## 5 2207.50 5.50 0.10586587 0.3343768993
## 6 1708.30 19.00 0.07872092 0.2290842436
## 7 1784.70 24.00 0.15225968 0.0270491200
## 8 2575.00 2.50 0.15663134 0.0191323748
## 9 2357.90 7.50 0.08105925 0.0003959877
## 10 2256.70 11.00 0.05504393 0.0047094549
## 11 2165.20 13.00 0.05011875 0.0012482345
## 12 2399.55 3.75 0.13350221 0.0761514881
## 13 1779.80 25.00 0.17238964 0.0889892211
## 14 2336.75 9.75 0.06179345 0.0192517639
## 15 1765.30 22.00 0.11742196 0.0166302585
## 16 2053.50 18.00 0.06943538 0.0387158541
## 17 2414.40 6.00 0.09898644 0.0005955991
## 18 2200.50 12.50 0.05067227 0.0041888627
## 19 2654.20 2.00 0.16667373 0.1317143774
## 20 1753.70 21.50 0.10984216 0.0425721512
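Cook's distance combines each observation's residual with its leverage: D_i = e_i^2 / (p * s^2) * h_i / (1 - h_i)^2. The sketch below rebuilds the data from the table above, computes D_i by hand, and flags points above 4/n. That cutoff is one common rule of thumb (another is D_i > 1) and is an assumption here, not something the assignment specifies; rocket_demo and the other names are illustrative.

```r
# Data copied from the table above.
rocket_demo <- data.frame(
  y = c(2158.70, 1678.15, 2316.00, 2061.30, 2207.50, 1708.30, 1784.70,
        2575.00, 2357.90, 2256.70, 2165.20, 2399.55, 1779.80, 2336.75,
        1765.30, 2053.50, 2414.40, 2200.50, 2654.20, 1753.70),
  x = c(15.50, 23.75, 8.00, 17.00, 5.50, 19.00, 24.00, 2.50, 7.50, 11.00,
        13.00, 3.75, 25.00, 9.75, 22.00, 18.00, 6.00, 12.50, 2.00, 21.50)
)
fit <- lm(y ~ x, data = rocket_demo)

e  <- residuals(fit)                  # raw residuals
h  <- hatvalues(fit)                  # leverages
p  <- length(coef(fit))               # number of coefficients (2)
s2 <- sum(e^2) / df.residual(fit)     # residual variance estimate

# Cook's distance from its definition.
cooks_manual <- e^2 / (p * s2) * h / (1 - h)^2

# Should print TRUE: the manual values agree with cooks.distance().
print(isTRUE(all.equal(cooks_manual, cooks.distance(fit))))

# Flag observations above the 4/n rule of thumb (an assumed cutoff).
flagged <- which(cooks_manual > 4 / nrow(rocket_demo))
print(round(cooks_manual[flagged], 4))
```

With n = 20 the 4/n cutoff is 0.2, so in the table above only observations 5 and 6 exceed it, and no observation comes close to 1.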
plot(rocket_model)