library(readxl)
broker <- read_xlsx("C:/Users/justt/Desktop/School/621/Assignment/Homework 2/Brokerage Satisfaction.xlsx")
model1 <- lm(Overall_Satisfaction_with_Electronic_Trades ~ Satisfaction_with_Trade_Price + Satisfaction_with_Speed_of_Execution, data = broker)
summary(model1)
##
## Call:
## lm(formula = Overall_Satisfaction_with_Electronic_Trades ~ Satisfaction_with_Trade_Price +
## Satisfaction_with_Speed_of_Execution, data = broker)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.58886 -0.13863 -0.09120 0.05781 0.64613
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.6633 0.8248 -0.804 0.438318
## Satisfaction_with_Trade_Price 0.7746 0.1521 5.093 0.000348 ***
## Satisfaction_with_Speed_of_Execution 0.4897 0.2016 2.429 0.033469 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3435 on 11 degrees of freedom
## Multiple R-squared: 0.7256, Adjusted R-squared: 0.6757
## F-statistic: 14.54 on 2 and 11 DF, p-value: 0.0008157
Yes. The p-value for Satisfaction_with_Speed_of_Execution is below the .05 threshold, so there is statistically significant evidence of a relationship between Overall_Satisfaction_with_Electronic_Trades and Satisfaction_with_Speed_of_Execution, holding Satisfaction_with_Trade_Price fixed.
For part a, I used the p-value of 0.033469.
Yes. The p-value for Satisfaction_with_Trade_Price is also below the .05 threshold, so there is statistically significant evidence of a relationship between Overall_Satisfaction_with_Electronic_Trades and Satisfaction_with_Trade_Price, holding Satisfaction_with_Speed_of_Execution fixed. This p-value is much smaller than the one for the other variable, so the evidence of a relationship is stronger for Satisfaction_with_Trade_Price. The p-value by itself does not show that the effect is larger.
For part c, I used the p-value of 0.000348.
The model explains 72.56% of the variation in Overall_Satisfaction_with_Electronic_Trades (Multiple R-squared = 0.7256). This measures goodness of fit, not predictive accuracy.
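As a rough illustration of where the Multiple R-squared figure comes from, the sketch below recomputes it as one minus the ratio of residual variation to total variation. The data are re-typed from the data frame printed later in this document, and the short column names (`price`, `speed`, `overall`) and the object name `fit` are my own shorthand, not names from the Excel file.

```r
# Data re-typed from the printed data frame; column names are shorthand (assumption).
broker <- data.frame(
  price   = c(3.2, 3.3, 3.1, 2.8, 2.9, 2.4, 2.7, 2.4, 2.6, 2.3, 3.7, 2.5, 3.0, 1.0),
  speed   = c(3.1, 3.1, 3.3, 3.5, 3.2, 3.2, 3.8, 3.7, 2.6, 2.7, 3.9, 2.5, 3.0, 4.0),
  overall = c(3.2, 3.2, 4.0, 3.7, 3.0, 2.7, 2.7, 3.4, 2.7, 2.3, 4.0, 2.5, 3.0, 2.0)
)
fit <- lm(overall ~ price + speed, data = broker)

# Residual (unexplained) and total variation in the response
sse <- sum(residuals(fit)^2)
sst <- sum((broker$overall - mean(broker$overall))^2)

# Share of the variation explained by the model
r2 <- 1 - sse / sst
r2
```

The value should agree with `summary(fit)$r.squared`, which is the number reported as Multiple R-squared above.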
The calculated residuals are:
model1$residuals
## 1 2 3 4 5 6
## -0.133395541 -0.210856526 0.646131773 0.480581057 -0.149979422 -0.062674498
## 7 8 9 10 11 12
## -0.588858465 0.392491325 0.076204545 -0.140379336 -0.112435149 0.002632365
## 13 14
## -0.129506736 -0.069955392
The plots of the residuals, by variable, are:
plot(broker$Satisfaction_with_Trade_Price, model1$residuals, pch = 20)
abline(h = 0, col = "grey")
plot(broker$Satisfaction_with_Speed_of_Execution, model1$residuals, pch = 20)
abline(h = 0, col = "grey")
qqnorm(model1$residuals, main = "model1")
qqline(model1$residuals)
mean(hatvalues(model1))
## [1] 0.2142857
The average of the leverages of the data points is 0.2142857.
Twice the average leverage is:
mean(hatvalues(model1))*2
## [1] 0.4285714
The leverages of the individual data points are:
cbind(broker, leverage = hatvalues(model1))
## Brokerage Satisfaction_with_Trade_Price
## 1 Scottrade, Inc. 3.2
## 2 Charles Schwab 3.3
## 3 Fidelity Brokerage Services 3.1
## 4 TD Ameritrade 2.8
## 5 E*Trade Financial 2.9
## 6 (Not listed) 2.4
## 7 Vanguard Brokerage Services 2.7
## 8 USAA Brokerage Services 2.4
## 9 Thinkorswim 2.6
## 10 Wells Fargo Investments 2.3
## 11 Interactive Brokers 3.7
## 12 Zecco.com 2.5
## 13 Firstrade Securities 3.0
## 14 Banc of America Investment Services 1.0
## Satisfaction_with_Speed_of_Execution
## 1 3.1
## 2 3.1
## 3 3.3
## 4 3.5
## 5 3.2
## 6 3.2
## 7 3.8
## 8 3.7
## 9 2.6
## 10 2.7
## 11 3.9
## 12 2.5
## 13 3.0
## 14 4.0
## Overall_Satisfaction_with_Electronic_Trades leverage
## 1 3.2 0.12226809
## 2 3.2 0.14248379
## 3 4.0 0.10348052
## 4 3.7 0.09498002
## 5 3.0 0.07909281
## 6 2.7 0.09225511
## 7 2.7 0.17268548
## 8 3.4 0.14817354
## 9 2.7 0.22725402
## 10 2.3 0.22639250
## 11 4.0 0.45080022
## 12 2.5 0.28805237
## 13 3.0 0.10586879
## 14 2.0 0.74621276
Data point 11 (Interactive Brokers) and data point 14 (Banc of America Investment Services) have high leverage: their leverages, 0.45080022 and 0.74621276, both exceed twice the average leverage, 0.4285714.
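The twice-the-average rule used above can also be applied programmatically instead of reading the table by eye. This is a minimal sketch under the same assumptions as before: the data are re-typed from the printed data frame and the column names are shorthand. It relies on the fact that the leverages sum to the number of coefficients p, so their average is p/n (here 3/14).

```r
# Data re-typed from the printed data frame; column names are shorthand (assumption).
broker <- data.frame(
  price   = c(3.2, 3.3, 3.1, 2.8, 2.9, 2.4, 2.7, 2.4, 2.6, 2.3, 3.7, 2.5, 3.0, 1.0),
  speed   = c(3.1, 3.1, 3.3, 3.5, 3.2, 3.2, 3.8, 3.7, 2.6, 2.7, 3.9, 2.5, 3.0, 4.0),
  overall = c(3.2, 3.2, 4.0, 3.7, 3.0, 2.7, 2.7, 3.4, 2.7, 2.3, 4.0, 2.5, 3.0, 2.0)
)
fit <- lm(overall ~ price + speed, data = broker)

h <- hatvalues(fit)          # leverage of each observation
p <- length(coef(fit))       # number of coefficients, including the intercept
n <- nrow(broker)            # number of observations

cutoff  <- 2 * p / n         # same value as 2 * mean(h)
flagged <- which(h > cutoff) # row numbers of the high-leverage points
flagged
```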
cooks.distance(model1)
## 1 2 3 4 5 6
## 0.0079790547 0.0243407712 0.1518660850 0.0756708677 0.0059271795 0.0012425789
## 7 8 9 10 11 12
## 0.2471814307 0.0888808107 0.0062442356 0.0210623352 0.0533835032 0.0000111262
## 13 14
## 0.0062752299 0.1601935254
plot(model1)
None of the Cook's distances is larger than 1. The fourth diagnostic plot (Residuals vs Leverage) confirms that no point lies on or outside the Cook's distance contour of 1.
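To make the Cook's distance values less of a black box, the sketch below rebuilds them from the residuals and leverages using the standard formula D_i = e_i^2 h_i / (p s^2 (1 - h_i)^2), then checks for values above 1. As before, the data are re-typed from the printed data frame and the column and object names are my own shorthand.

```r
# Data re-typed from the printed data frame; column names are shorthand (assumption).
broker <- data.frame(
  price   = c(3.2, 3.3, 3.1, 2.8, 2.9, 2.4, 2.7, 2.4, 2.6, 2.3, 3.7, 2.5, 3.0, 1.0),
  speed   = c(3.1, 3.1, 3.3, 3.5, 3.2, 3.2, 3.8, 3.7, 2.6, 2.7, 3.9, 2.5, 3.0, 4.0),
  overall = c(3.2, 3.2, 4.0, 3.7, 3.0, 2.7, 2.7, 3.4, 2.7, 2.3, 4.0, 2.5, 3.0, 2.0)
)
fit <- lm(overall ~ price + speed, data = broker)

e  <- residuals(fit)               # raw residuals
h  <- hatvalues(fit)               # leverages
p  <- length(coef(fit))            # number of coefficients
s2 <- sum(e^2) / df.residual(fit)  # residual variance estimate

# Cook's distance combines residual size and leverage
d_manual <- e^2 * h / (p * s2 * (1 - h)^2)

any(d_manual > 1)                  # TRUE would indicate an influential point
```

The hand-computed values should match `cooks.distance(fit)`.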
X= model.matrix(model1)
X
## (Intercept) Satisfaction_with_Trade_Price
## 1 1 3.2
## 2 1 3.3
## 3 1 3.1
## 4 1 2.8
## 5 1 2.9
## 6 1 2.4
## 7 1 2.7
## 8 1 2.4
## 9 1 2.6
## 10 1 2.3
## 11 1 3.7
## 12 1 2.5
## 13 1 3.0
## 14 1 1.0
## Satisfaction_with_Speed_of_Execution
## 1 3.1
## 2 3.1
## 3 3.3
## 4 3.5
## 5 3.2
## 6 3.2
## 7 3.8
## 8 3.7
## 9 2.6
## 10 2.7
## 11 3.9
## 12 2.5
## 13 3.0
## 14 4.0
## attr(,"assign")
## [1] 0 1 2
x_new = c(1, 4, 2)
t(x_new)%*%solve(t(X)%*%X)%*%x_new
## [,1]
## [1,] 0.8323367
The leverage of the new point (Satisfaction_with_Trade_Price = 4, Satisfaction_with_Speed_of_Execution = 2) is 0.8323367.
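The matrix expression above can be wrapped in a small helper so the same calculation is reused for every new point. This is only a sketch: `leverage_at` is a hypothetical helper name, and the data are re-typed from the printed data frame with shorthand column names. Applying the helper to the rows of the design matrix should reproduce `hatvalues()`, which is a quick way to confirm the formula.

```r
# Data re-typed from the printed data frame; column names are shorthand (assumption).
broker <- data.frame(
  price   = c(3.2, 3.3, 3.1, 2.8, 2.9, 2.4, 2.7, 2.4, 2.6, 2.3, 3.7, 2.5, 3.0, 1.0),
  speed   = c(3.1, 3.1, 3.3, 3.5, 3.2, 3.2, 3.8, 3.7, 2.6, 2.7, 3.9, 2.5, 3.0, 4.0),
  overall = c(3.2, 3.2, 4.0, 3.7, 3.0, 2.7, 2.7, 3.4, 2.7, 2.3, 4.0, 2.5, 3.0, 2.0)
)
fit <- lm(overall ~ price + speed, data = broker)

X       <- model.matrix(fit)
XtX_inv <- solve(crossprod(X))   # (X'X)^{-1}, computed once

# Hypothetical helper: leverage x' (X'X)^{-1} x of a point x = c(1, price, speed)
leverage_at <- function(x) drop(t(x) %*% XtX_inv %*% x)

leverage_at(c(1, 4, 2))          # new point: trade price 4, speed 2

# Sanity check: the rows of X give back the leverages of the original data
h_rows <- apply(X, 1, leverage_at)
```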
x_newb = c(1, 5, 3)
t(x_newb)%*%solve(t(X)%*%X)%*%x_newb
## [,1]
## [1,] 1.08481
The leverage of the new point (Satisfaction_with_Trade_Price = 5, Satisfaction_with_Speed_of_Execution = 3) is 1.08481.
x_newc = c(1, 4, 3)
t(x_newc)%*%solve(t(X)%*%X)%*%x_newc
## [,1]
## [1,] 0.3992325
The leverage of the new point (Satisfaction_with_Trade_Price = 4, Satisfaction_with_Speed_of_Execution = 3) is 0.3992325.
x_newd = c(1, 3, 2)
t(x_newd)%*%solve(t(X)%*%X)%*%x_newd
## [,1]
## [1,] 0.6074396
The leverage of the new point (Satisfaction_with_Trade_Price = 3, Satisfaction_with_Speed_of_Execution = 2) is 0.6074396.
max(hatvalues(model1))
## [1] 0.7462128
The maximum leverage among the data points in the Brokerage Satisfaction Excel file is 0.7462128 (data point 14).
For the new point with Satisfaction_with_Trade_Price = 4 and Satisfaction_with_Speed_of_Execution = 2, the leverage is 0.8323367. This is larger than the maximum leverage of the original data set (0.7462128), so predicting at this point is an extrapolation.
For the new point with Satisfaction_with_Trade_Price = 5 and Satisfaction_with_Speed_of_Execution = 3, the leverage is 1.08481. This is larger than the maximum leverage of the original data set (0.7462128), so predicting at this point is an extrapolation.
For the new point with Satisfaction_with_Trade_Price = 4 and Satisfaction_with_Speed_of_Execution = 3, the leverage is 0.3992325. This is smaller than the maximum leverage of the original data set (0.7462128), so predicting at this point is not an extrapolation.
For the new point with Satisfaction_with_Trade_Price = 3 and Satisfaction_with_Speed_of_Execution = 2, the leverage is 0.6074396. This is smaller than the maximum leverage of the original data set (0.7462128), so predicting at this point is not an extrapolation.
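The four comparisons above follow one rule: a new point counts as an extrapolation when its leverage exceeds the largest leverage in the original data. The sketch below applies that rule to all four points at once. The data are re-typed from the printed data frame, the column names are shorthand, and the row labels simply reuse the vector names from the code above.

```r
# Data re-typed from the printed data frame; column names are shorthand (assumption).
broker <- data.frame(
  price   = c(3.2, 3.3, 3.1, 2.8, 2.9, 2.4, 2.7, 2.4, 2.6, 2.3, 3.7, 2.5, 3.0, 1.0),
  speed   = c(3.1, 3.1, 3.3, 3.5, 3.2, 3.2, 3.8, 3.7, 2.6, 2.7, 3.9, 2.5, 3.0, 4.0),
  overall = c(3.2, 3.2, 4.0, 3.7, 3.0, 2.7, 2.7, 3.4, 2.7, 2.3, 4.0, 2.5, 3.0, 2.0)
)
fit <- lm(overall ~ price + speed, data = broker)

X       <- model.matrix(fit)
XtX_inv <- solve(crossprod(X))

# New points as rows: intercept, trade price, speed of execution
new_points <- rbind(
  x_new  = c(1, 4, 2),
  x_newb = c(1, 5, 3),
  x_newc = c(1, 4, 3),
  x_newd = c(1, 3, 2)
)

# Row-wise x' (X'X)^{-1} x for every new point
h_new <- rowSums((new_points %*% XtX_inv) * new_points)
h_max <- max(hatvalues(fit))

data.frame(leverage = h_new, extrapolation = h_new > h_max)
```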