Set Working Directory
Read Files
heart = read.csv("heartbpchol.csv");
heart$BP_Status = as.factor(heart$BP_Status);
heart$Cholesterol = as.numeric(heart$Cholesterol)
bupa = read.csv("bupa.csv");
bupa$drinkgroup = as.factor(bupa$drinkgroup);
bupa$mcv = as.numeric(bupa$mcv);
bupa$alkphos = as.numeric(bupa$alkphos)
psych = read.csv("psych.csv");
psych$sex = as.factor(psych$sex);
psych$rank = as.factor(psych$rank);
psych$salary = as.numeric(psych$salary)
cars_new = read.csv("cars_new.csv");
cars_new$type = as.factor(cars_new$type);
cars_new$origin = as.factor(cars_new$origin);
cars_new$cylinders = as.factor(cars_new$cylinders);
cars_new$mpg_highway = as.numeric(cars_new$mpg_highway)
Exercise 1: Analysis of Variance
The heartbpchol.csv data set contains continuous cholesterol (Cholesterol) and blood pressure status (BP_Status) (category: High/ Normal/ Optimal) for alive patients. For the heartbpchol.csv data set, consider a one-way ANOVA model to identify differences between group cholesterol means. The normality assumption is reasonable, so you can proceed without testing normality.
Exercise 1.A
Perform a one-way ANOVA for Cholesterol with BP_Status as the categorical predictor. Comment on statistical significance of BP_Status, the amount of variation described by the model, and whether or not the equal variance assumption can be trusted.
Data Exploration - Check Balance
##
## High Normal Optimal
## 229 245 67
boxplot(Cholesterol ~ BP_Status, data=heart,
main="Distribution of Cholesterol by BP_Status",
xlab = "BP_Status",
ylab = "Cholesterol",
col = "Grey",
border = "slategray",
horizontal = FALSE
)
Observation: The design is unbalanced: the three BP_Status groups have different numbers of observations (229, 245, and 67).
Run One-Way ANOVA
## Df Sum Sq Mean Sq F value Pr(>F)
## BP_Status 2 25211 12605 6.671 0.00137 **
## Residuals 538 1016631 1890
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Conclusion: The p-value of 0.00137 is below the significance level of 0.05, meaning that we reject the null hypothesis. Therefore, BP_Status has a significant effect on Cholesterol levels (at least one group in BP_Status has a different mean of Cholesterol).
R-square (variation of response variable explained by BP_Status)
## [1] 0.02419833
Conclusion: 2.4% of the variation of Cholesterol can be explained by BP_Status.
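The R-square value above is the ratio of the between-group sum of squares to the total sum of squares. As a minimal check, the sketch below recomputes it from the rounded values printed in the ANOVA table. The variable names are illustrative and are not taken from the original script.

```r
# Sums of squares copied from the one-way ANOVA table above (rounded as printed).
ss_bp_status = 25211     # between-group (BP_Status) sum of squares
ss_residuals = 1016631   # within-group (residual) sum of squares

# R-square = explained sum of squares / total sum of squares.
r_squared = ss_bp_status / (ss_bp_status + ss_residuals)
print(round(r_squared, 4))
```

For a fitted aov object (call it fit, a hypothetical name), the same column of sums of squares is available as summary(fit)[[1]][["Sum Sq"]].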
Check Equal Variance Assumption
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 2 0.1825 0.8332
## 538
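Conclusion: The p-value of 0.8332 is above the significance level of 0.05, so we cannot reject the null hypothesis of equal variances. There is no evidence that the variance of Cholesterol differs across BP_Status groups, so the equal variance assumption can be trusted.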
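Levene's test with center = median amounts to a one-way ANOVA on the absolute deviations of each observation from its group median. The sketch below illustrates that idea in base R. The helper levene_median is hypothetical, and the built-in PlantGrowth data stands in for heartbpchol.csv, which is not reproduced here.

```r
# Median-centered Levene test written as an ANOVA on absolute deviations.
# levene_median is an illustrative helper, not part of the original analysis.
levene_median = function(y, g) {
  g = as.factor(g)
  abs_dev = abs(y - ave(y, g, FUN = median))  # |y - group median|
  tab = anova(lm(abs_dev ~ g))                # one-way ANOVA on the deviations
  c(F = tab[["F value"]][1], p = tab[["Pr(>F)"]][1])
}

# Stand-in data: plant weights in three treatment groups.
res_levene = levene_median(PlantGrowth$weight, PlantGrowth$group)
print(res_levene)
```

A large p-value here is read the same way as in the output above: no evidence against equal variances, which is weaker than proof that the variances are equal.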
Exercise 1.B
Comment on any significantly different cholesterol means as determined by the post-hoc test comparing all pairwise differences. Specifically explain what that tells us about differences in cholesterol levels across blood pressure status groups, like which group has the highest or lowest mean values of Cholesterol.
##
## Posthoc multiple comparisons of means: Scheffe Test
## 95% family-wise confidence level
##
## $BP_Status
## diff lwr.ci upr.ci pval
## Normal-High -11.543481 -21.35092 -1.736038 0.0159 *
## Optimal-High -18.646679 -33.46702 -3.826341 0.0089 **
## Optimal-Normal -7.103198 -21.81359 7.607194 0.4958
##
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Comments:
BP_Status pairs Normal-High and Optimal-High have significantly different mean values of Cholesterol (p-values below 0.05, so we reject the null hypothesis of equal means). The pair Optimal-Normal does not have a significantly different mean value of Cholesterol (p-value of 0.4958, so we do not reject the null). Simply put, there is no evidence that the Optimal and Normal means differ.
Specifically, the following effects of BP_Status on Cholesterol can be seen:
- Normal < High (the mean Cholesterol of High is greater than the mean Cholesterol of Normal, by about 11.5)
- Optimal < High (the mean Cholesterol of High is greater than the mean Cholesterol of Optimal, by about 18.6)
- Optimal vs. Normal: no significant difference (the confidence interval for the difference includes 0)
The High group therefore has the highest mean Cholesterol. Optimal has the lowest sample mean, but it is not significantly lower than Normal.
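The Scheffe p-value and interval for a single pair can be reproduced from quantities already printed above: the mean difference, the two group sizes, and the residual mean square. The sketch below does this for Normal-High, assuming the standard Scheffe formula for pairwise contrasts. All inputs are the rounded printed values, so small rounding differences are expected, and the variable names are illustrative.

```r
# Values copied from the outputs above (rounded as printed).
k = 3                          # number of BP_Status groups
n_high = 229                   # group sizes from the balance check
n_normal = 245
df_resid = 538                 # residual degrees of freedom
mse = 1016631 / df_resid       # residual mean square
diff_nh = -11.543481           # Normal - High mean difference

# Squared standard error of the difference between the two group means.
se2 = mse * (1 / n_normal + 1 / n_high)

# Scheffe: divide the squared t-statistic by (k - 1), refer to F(k - 1, df_resid).
f_scheffe = diff_nh^2 / se2 / (k - 1)
p_scheffe = pf(f_scheffe, k - 1, df_resid, lower.tail = FALSE)

# 95% family-wise interval for the same contrast.
half_width = sqrt((k - 1) * qf(0.95, k - 1, df_resid) * se2)
ci = diff_nh + c(-1, 1) * half_width
print(round(c(p = p_scheffe, lwr = ci[1], upr = ci[2]), 4))
```

The result should agree with the Normal-High row of the table (p-value 0.0159, interval from about -21.35 to about -1.74) up to rounding.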
Exercise 2: Analysis of Variance
For this problem use the bupa.csv data set. Check UCI Machine Learning Repository for more information (http://archive.ics.uci.edu/ml/datasets/Liver+Disorders). The mean corpuscular volume and alkaline phosphatase are blood tests thought to be sensitive to liver disorder related to excessive alcohol consumption. We assume that normality and independence assumptions are valid.
Exercise 2.A
Perform a one-way ANOVA for Mean Corpuscular Volume or mcv as a function of drinkgroup. Comment on significance of the drinkgroup, the amount of variation described by the model, and whether or not the equal variance assumption can be trusted.
Data Exploration - Check Balance
##
## 1 2 3 4 5
## 117 52 88 67 21
boxplot(mcv ~ drinkgroup, data=bupa,
main="Distribution of MCV by drinkgroup",
xlab = "drinkgroup",
ylab = "mcv",
col = "Grey",
border = "slategray",
horizontal = FALSE
)
Observation: The design is unbalanced: the five drinkgroup levels have different numbers of observations (from 21 to 117).
One-Way ANOVA
## Df Sum Sq Mean Sq F value Pr(>F)
## drinkgroup 4 733 183.29 10.26 7.43e-08 ***
## Residuals 340 6073 17.86
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Conclusion: The p-value of 7.43e-08 is below the significance level of 0.05, meaning that we reject the null hypothesis. Therefore, drinkgroup has a significant effect on mcv (at least one group in drinkgroup has a different mean of mcv).
R-square (variation of response variable explained by drinkgroup)
## [1] 0.1077214
Conclusion: 10.8% of the variation of mcv can be explained by drinkgroup.
Check Equal Variance Assumption
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 4 0.3053 0.8744
## 340
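Conclusion: The p-value of 0.8744 is above the significance level of 0.05, so we cannot reject the null hypothesis of equal variances. There is no evidence that the variance of mcv differs across drinkgroup levels, so the equal variance assumption can be trusted.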
Exercise 2.B
Perform a one-way ANOVA for alkphos as a function of drinkgroup. Comment on statistical significance of the drinkgroup, the amount of variation described by the model, and whether or not the equal variance assumption can be trusted.
Data Exploration - Check Balance
##
## 1 2 3 4 5
## 117 52 88 67 21
boxplot(alkphos ~ drinkgroup, data=bupa,
main="Distribution of alkphos by drinkgroup",
xlab = "drinkgroup",
ylab = "alkphos",
col = "Grey",
border = "slategray",
horizontal = FALSE
)
Observation: The design is unbalanced: the five drinkgroup levels have different numbers of observations (from 21 to 117).
One-Way ANOVA
aov.res_bupa_alkphos= aov(alkphos~drinkgroup, data=bupa)
summary(aov.res_bupa_alkphos) # ANOVA result
## Df Sum Sq Mean Sq F value Pr(>F)
## drinkgroup 4 4946 1236.4 3.792 0.00495 **
## Residuals 340 110858 326.1
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Conclusion: The p-value of 0.00495 is below the significance level of 0.05, meaning that we reject the null hypothesis. Therefore, drinkgroup has an effect on alkphos (at least one group in drinkgroup has a different mean of alkphos).
R-square (variation of response variable explained by drinkgroup)
## [1] 0.04270721
Conclusion: 4.3% of the variation of alkphos can be explained by drinkgroup.
Check Equal Variance Assumption
## Levene's Test for Homogeneity of Variance (center = median)
## Df F value Pr(>F)
## group 4 0.8089 0.5201
## 340
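Conclusion: The p-value of 0.5201 is above the significance level of 0.05, so we cannot reject the null hypothesis of equal variances. There is no evidence that the variance of alkphos differs across drinkgroup levels, so the equal variance assumption can be trusted.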
Exercise 2.C
Perform post-hoc tests for models in a) and b). Comment on any similarities or differences you observe from their results.
##
## Posthoc multiple comparisons of means: Scheffe Test
## 95% family-wise confidence level
##
## $drinkgroup
## diff lwr.ci upr.ci pval
## 2-1 1.241452991 -0.94020481 3.423111 0.5410
## 3-1 0.938131313 -0.90892674 2.785189 0.6495
## 4-1 3.744610282 1.73913894 5.750082 1.9e-06 ***
## 5-1 3.746031746 0.64379565 6.848268 0.0081 **
## 3-2 -0.303321678 -2.59291786 1.986275 0.9966
## 4-2 2.503157290 0.08395442 4.922360 0.0380 *
## 5-2 2.504578755 -0.87987039 5.889028 0.2646
## 4-3 2.806478969 0.68408993 4.928868 0.0025 **
## 5-3 2.807900433 -0.37116998 5.986971 0.1151
## 5-4 0.001421464 -3.27222796 3.275071 1.0000
##
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Posthoc multiple comparisons of means: Scheffe Test
## 95% family-wise confidence level
##
## $drinkgroup
## diff lwr.ci upr.ci pval
## 2-1 -2.645299 -11.9663647 6.675766 0.9419
## 3-1 -4.056138 -11.9476367 3.835360 0.6389
## 4-1 -1.148743 -9.7170578 7.419571 0.9965
## 5-1 12.572650 -0.6815582 25.826857 0.0734 .
## 3-2 -1.410839 -11.1930681 8.371390 0.9953
## 4-2 1.496556 -8.8394138 11.832525 0.9952
## 5-2 15.217949 0.7579944 29.677903 0.0329 *
## 4-3 2.907395 -6.1604467 11.975236 0.9117
## 5-3 16.628788 3.0463078 30.211268 0.0069 **
## 5-4 13.721393 -0.2651729 27.707959 0.0578 .
##
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Differences:
mcv
drinkgroup pairs 4-1, 5-1, 4-2, and 4-3 have significantly different mean values of mcv (p-values below 0.05, so we reject the null hypothesis of equal means). Pairs 2-1, 3-1, 3-2, 5-2, 5-3, and 5-4 do not have significantly different mean values of mcv (p-values above 0.05, so we do not reject the null). Simply put, there is no evidence that the means of these pairs differ.
Specifically, the following effects of drinkgroup on mcv can be seen:
- 4 > 1 (the mean mcv of 4 is greater than the mean mcv of 1)
- 5 > 1 (the mean mcv of 5 is greater than the mean mcv of 1)
- 4 > 2 (the mean mcv of 4 is greater than the mean mcv of 2)
- 4 > 3 (the mean mcv of 4 is greater than the mean mcv of 3)
- No significant difference for the drinkgroup pairs 2-1, 3-1, 3-2, 5-2, 5-3, and 5-4
alkphos
drinkgroup pairs 5-2 and 5-3 have significantly different mean values of alkphos (p-values below 0.05, so we reject the null hypothesis of equal means). Pairs 2-1, 3-1, 4-1, 5-1, 3-2, 4-2, 4-3, and 5-4 do not have significantly different mean values of alkphos (p-values above 0.05, so we do not reject the null). Simply put, there is no evidence that the means of these pairs differ.
Specifically, the following effects of drinkgroup on alkphos can be seen:
- 5 > 2 (the mean alkphos of 5 is greater than the mean alkphos of 2)
- 5 > 3 (the mean alkphos of 5 is greater than the mean alkphos of 3)
- No significant difference for the drinkgroup pairs 2-1, 3-1, 4-1, 5-1, 3-2, 4-2, 4-3, and 5-4
Similarities:
- Pairs 2-1, 3-1, 3-2, and 5-4 show no significant difference in means for either mcv or alkphos: their p-values are above the significance level in both post-hoc tests.
Exercise 3:
The psychology department at a hypothetical university has been accused of underpaying female faculty members. The data represent salary (in thousands of dollars) for all 22 professors in the department. This problem is from Maxwell and Delaney (2004).
Exercise 3.A
Fit a two-way ANOVA model including sex (F, M), rank (Assistant, Associate), and the interaction term. What do the Type 1 and Type 3 sums of squares tell us about significance of effects? Is the interaction between sex and rank significant? Also comment on the variation explained by the model.
Two-Way ANOVA (Type 1)
aov.psych1 = aov(salary ~ sex * rank, data = psych)
aov.psych_3 = aov(salary ~ rank * sex, data = psych)
summary(aov.psych1)
## Df Sum Sq Mean Sq F value Pr(>F)
## sex 1 155.15 155.15 17.007 0.000637 ***
## rank 1 169.82 169.82 18.616 0.000417 ***
## sex:rank 1 0.63 0.63 0.069 0.795101
## Residuals 18 164.21 9.12
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(aov.psych_3)
## Df Sum Sq Mean Sq F value Pr(>F)
## rank 1 252.22 252.22 27.647 5.33e-05 ***
## sex 1 72.76 72.76 7.975 0.0112 *
## rank:sex 1 0.63 0.63 0.069 0.7951
## Residuals 18 164.21 9.12
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Two-Way ANOVA (Type 3)
## Anova Table (Type III tests)
##
## Response: salary
## Sum Sq Df F value Pr(>F)
## (Intercept) 8140.2 1 892.2994 < 2e-16 ***
## sex 28.0 1 3.0711 0.09671 .
## rank 70.4 1 7.7189 0.01240 *
## sex:rank 0.6 1 0.0695 0.79510
## Residuals 164.2 18
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Type 1 ANOVA Test
With sex entered first, sex (p = 0.000637) and rank (p = 0.000417) both have p-values below the significance level of 0.05. With rank entered first, the same holds (rank: p = 5.33e-05, sex: p = 0.0112). Therefore, in either order we reject the null hypothesis for both sex and rank and conclude that both have a significant effect on salary. The interaction between sex and rank yields a p-value of 0.795, above the significance level of 0.05. We do not reject the null, so there is no evidence that the interaction has an effect on salary.
Type 3 ANOVA Test
Rank has a p-value (0.0124) below the significance level of 0.05. Therefore, we reject the null hypothesis for rank and conclude that it has a significant effect on salary. Sex (p = 0.0967) and the interaction between sex and rank (p = 0.795) both yield p-values above the significance level of 0.05. We do not reject the null for either, so in the model with interaction the Type 3 tests show no significant effect of sex or of the interaction on salary.
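The exercise also asks about the variation explained by the model. Type 1 sums of squares are sequential, so they add up to the model sum of squares, and R-square can be computed from the printed table. The sketch below uses the rounded values from the Type 1 table with sex entered first. The variable names are illustrative.

```r
# Sums of squares from the Type 1 table with sex entered first (rounded as printed).
ss_terms = c(sex = 155.15, rank = 169.82, `sex:rank` = 0.63)
ss_resid = 164.21

# R-square = model sum of squares / total sum of squares.
r2_interaction = sum(ss_terms) / (sum(ss_terms) + ss_resid)
print(round(r2_interaction, 3))
```

This gives an R-square of about 0.66 for the model with interaction. The Type 3 sums of squares should not be summed this way, because in an unbalanced design they generally do not add up to the model sum of squares.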
Exercise 3.B
Refit the model without the interaction term. Comment on the significance of effects and variation explained. Report and interpret the Type 1 and Type 3 tests of the main effects. Are the main effects of rank and sex significant?
Two-Way ANOVA (Type 1)
aov.psych2 = aov(salary ~ sex + rank, data = psych)
aov.psych4 = aov(salary ~ rank + sex, data = psych)
summary(aov.psych2)
## Df Sum Sq Mean Sq F value Pr(>F)
## sex 1 155.2 155.15 17.88 0.000454 ***
## rank 1 169.8 169.82 19.57 0.000291 ***
## Residuals 19 164.8 8.68
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(aov.psych4)
## Df Sum Sq Mean Sq F value Pr(>F)
## rank 1 252.22 252.22 29.071 3.34e-05 ***
## sex 1 72.76 72.76 8.386 0.00926 **
## Residuals 19 164.84 8.68
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Two-Way ANOVA (Type 3)
## Anova Table (Type III tests)
##
## Response: salary
## Sum Sq Df F value Pr(>F)
## (Intercept) 10227.6 1 1178.8469 < 2.2e-16 ***
## sex 72.8 1 8.3862 0.0092618 **
## rank 169.8 1 19.5743 0.0002912 ***
## Residuals 164.8 19
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Type 1 ANOVA Test
Both sex and rank have p-values below the significance level of 0.05 in either order of entry (sex first: p = 0.000454 and 0.000291; rank first: p = 3.34e-05 and 0.00926). Therefore, we reject the null hypothesis for both sex and rank and conclude that both have a significant effect on salary: mean salary differs between ranks (Assoc vs. Assist) and between sexes (Male vs. Female).
Type 3 ANOVA Test
Both sex (p = 0.00926) and rank (p = 0.000291) have p-values below the significance level of 0.05. Therefore, we reject the null hypothesis for both and conclude that each has a significant effect on salary after adjusting for the other: mean salary differs between ranks (Assoc vs. Assist) and between sexes (Male vs. Female). In this main effects model, the Type 3 result for each factor matches the Type 1 result obtained when that factor is entered last.
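The Type 1 and Type 3 tables differ because the design is unbalanced: a sequential (Type 1) sum of squares depends on the order of entry, while an adjusted sum of squares does not. The sketch below shows the same behavior in base R on the built-in mtcars data, which stands in for psych.csv. It uses drop1() to obtain the sum of squares of each term adjusted for the other, which is not necessarily how the Type 3 tables above were produced.

```r
# Stand-in unbalanced two-factor data (not the psych data).
d = transform(mtcars, cyl = factor(cyl), am = factor(am))

seq_cyl_first = anova(lm(mpg ~ cyl + am, data = d))   # am entered last
seq_am_first  = anova(lm(mpg ~ am + cyl, data = d))   # am entered first
adjusted      = drop1(lm(mpg ~ cyl + am, data = d), test = "F")

# Sum of squares attributed to am under each scheme.
ss_am = c(entered_first = seq_am_first["am", "Sum Sq"],
          entered_last  = seq_cyl_first["am", "Sum Sq"],
          adjusted      = adjusted["am", "Sum of Sq"])
print(ss_am)
```

The entered_last and adjusted values coincide, while entered_first differs. This mirrors the pattern in the salary tables, where the Type 3 row for sex equals the Type 1 row for sex when rank is entered first.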
Exercise 3.C
Obtain model diagnostics to validate your Normality assumptions.
Conclusion: Based on the Q-Q plot of the model residuals, the normality assumption is reasonable.
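The diagnostic code is not shown in this report. A minimal sketch of a residual Q-Q check in base R follows, using the built-in warpbreaks data as a stand-in for psych.csv. The object names are illustrative, and the Shapiro-Wilk test is an optional numeric complement that is not part of the original analysis.

```r
# Stand-in additive two-way model (not the salary model).
fit_demo = aov(breaks ~ wool + tension, data = warpbreaks)
res_demo = residuals(fit_demo)

# Q-Q plot of the residuals against the normal distribution, with reference line.
qqnorm(res_demo, main = "Normal Q-Q plot of residuals")
qqline(res_demo)

# Optional: Shapiro-Wilk test of normality on the same residuals.
sw = shapiro.test(res_demo)
print(sw$p.value)
```

Points that fall close to the reference line support the normality assumption. With only 22 observations, as in the salary data, the plot is usually more informative than a formal test.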
Exercise 3.D
Choose a final model based on your results from parts (a) and (b). Comment on any significant group differences through the post-hoc test. State the differences in salary across different main effect groups and interaction (if included) between them.
Decision
Based on the results from (a) and (b), the interaction between sex and rank is not significant. Therefore, the final model is the two-way ANOVA without interaction, assessed with the Type 3 tests so that we see the unique contribution of each categorical variable. Because every effect is adjusted for all other effects, we believe this model is best suited for our unbalanced dataset.
Post-Hoc Test
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = salary ~ sex + rank, data = psych)
##
## $sex
## diff lwr upr p adj
## M-F 5.333333 2.693648 7.973019 0.0004544
##
## $rank
## diff lwr upr p adj
## Assoc-Assist 5.377778 2.738092 8.017463 0.0004193
Conclusion:
sex
With a p-value below the significance level of 0.05, mean salary differs significantly between the M and F groups.
Specifically, the following effect of sex on salary can be seen:
- M > F (the mean salary of Male is greater than the mean salary of Female, by about 5.33 thousand dollars)
rank
With a p-value below the significance level of 0.05, mean salary differs significantly between the Assoc and Assist groups.
Specifically, the following effect of rank on salary can be seen:
- Assoc > Assist (the mean salary of Associate is greater than the mean salary of Assistant, by about 5.38 thousand dollars)
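The header of the post-hoc output matches the format printed by TukeyHSD() in base R. A minimal sketch of such a call follows, again on the built-in warpbreaks data as a stand-in for psych.csv. The object names are illustrative.

```r
# Stand-in additive two-way model (not the salary model).
fit_tukey = aov(breaks ~ wool + tension, data = warpbreaks)

# Tukey pairwise comparisons for every factor in the model.
tukey_res = TukeyHSD(fit_tukey, conf.level = 0.95)
print(tukey_res$tension)

# Pairs whose adjusted p-value is below 0.05.
p_adj = tukey_res$tension[, "p adj"]
sig_pairs = rownames(tukey_res$tension)[p_adj < 0.05]
print(sig_pairs)
```

Each row is read the same way as in the output above: the sign of diff gives the direction of the difference, and an interval from lwr to upr that excludes 0 corresponds to an adjusted p-value below 0.05.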
Exercise 4:
Use the cars_new.csv. See HW1 for detailed information of variables.
Exercise 4.A
Start with a three-way main effects ANOVA and choose the best main effects ANOVA model for mpg_highway as a function of cylinders, origin, and type for the cars in this set. Comment on which terms should be kept in a model for mpg_highway and why based on Type 3 SS. For the model with just predictors you decide to keep, comment on the significant effects in the model and comment on how much variation in highway fuel efficiency the model describes.
We will utilize the Backwards Elimination model selection process to determine the main effects that will be included in the model.
Three-Way ANOVA (Type 3) Full Model
aov.cars_new1 = aov(mpg_highway ~ cylinders + origin + type, data = cars_new)
Anova(aov.cars_new1, type = 3)
## Anova Table (Type III tests)
##
## Response: mpg_highway
## Sum Sq Df F value Pr(>F)
## (Intercept) 69548 1 6501.6715 < 2e-16 ***
## cylinders 1453 1 135.8499 < 2e-16 ***
## origin 1 1 0.0786 0.77948
## type 108 1 10.1018 0.00175 **
## Residuals 1883 176
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Type 3 ANOVA Test (Full Model)
Both cylinders and type have p-values below the significance level of 0.05. Therefore, we reject the null hypothesis for both and conclude that they have a significant effect on mpg_highway: mean mpg_highway differs between cylinders groups (4 vs. 6) and between type groups (Sedan vs. Sports). Because origin has a p-value (0.779) above the significance level, it does not meet our cutoff criterion and we do not reject the null for origin. Its effect is not significant, so we remove origin from the model, in line with the Backward Elimination model selection process.
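The elimination step described above can be written as a loop: fit the model, test each term adjusted for the others, drop the least significant term if its p-value exceeds the cutoff, and repeat. The sketch below is one way to do this in base R. The helper backward_eliminate is hypothetical, it uses drop1() for the adjusted F-tests, and the built-in mtcars data stands in for cars_new.csv.

```r
# Backward elimination on main effects; returns the names of the terms kept.
# backward_eliminate is an illustrative helper, not part of the original analysis.
backward_eliminate = function(response, predictors, data, alpha = 0.05) {
  while (length(predictors) > 0) {
    fit = lm(reformulate(predictors, response = response), data = data)
    tab = drop1(fit, test = "F")   # each term adjusted for the others
    p_values = setNames(tab[["Pr(>F)"]][-1], rownames(tab)[-1])  # skip "<none>" row
    if (max(p_values) <= alpha) break   # every remaining term is significant
    predictors = setdiff(predictors, names(which.max(p_values)))
  }
  predictors
}

# Stand-in data with three categorical predictors.
d_cars = transform(mtcars, cyl = factor(cyl), am = factor(am), vs = factor(vs))
kept_terms = backward_eliminate("mpg", c("cyl", "am", "vs"), d_cars)
print(kept_terms)
```

Only one term is removed per pass, because the p-values of the remaining terms change once a term is dropped.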
Two-Way ANOVA (Type 3) Reduced Model (cylinders and type)
aov.cars_new1_1 = aov(mpg_highway ~ cylinders + type, data = cars_new)
Anova(aov.cars_new1_1, type = 3)
## Anova Table (Type III tests)
##
## Response: mpg_highway
## Sum Sq Df F value Pr(>F)
## (Intercept) 88449 1 8311.96 < 2.2e-16 ***
## cylinders 1482 1 139.27 < 2.2e-16 ***
## type 116 1 10.88 0.001175 **
## Residuals 1883 177
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Type 3 ANOVA Test (Reduced Model)
Based on the results of this Type 3 ANOVA test, cylinders and type both have a p-value below the significance level of 0.05, meaning we reject the null. Both of these predictors have a significant effect on mpg_highway, so both are kept in the main effects model.
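The exercise also asks how much variation the chosen main effects model describes, but no R-square is printed for it. A rough value can be backed out from numbers reported elsewhere in this report, assuming the cylinders + type model and the cylinders * type model of Exercise 4.B were fit to the same observations and so share the same total sum of squares. The inputs are rounded, so the result is approximate, and the variable names are illustrative.

```r
# Values printed in this report (rounded as shown).
sse_interaction = 1800        # residual SS, cylinders * type model (Exercise 4.B)
r2_interaction  = 0.4813821   # R-square reported for that model (Exercise 4.B)
sse_main        = 1883        # residual SS, cylinders + type model above

# Total SS implied by the interaction model: SSE = (1 - R-square) * SST.
sst = sse_interaction / (1 - r2_interaction)

# R-square of the main effects model on the same total SS.
r2_main = 1 - sse_main / sst
print(round(r2_main, 2))
```

Under these assumptions, the main effects model explains roughly 46% of the variation in mpg_highway.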
Exercise 4.B
Starting with main effects chosen in part (a), find your best ANOVA model by adding in any additional interaction terms that will significantly improve the model. For your final model, comment on the significant effects and variation explained by the model.
Two-Way ANOVA with Interaction (Type 3)
## Anova Table (Type III tests)
##
## Response: mpg_highway
## Sum Sq Df F value Pr(>F)
## (Intercept) 85471 1 8358.838 < 2.2e-16 ***
## cylinders 1558 1 152.397 < 2.2e-16 ***
## type 198 1 19.392 1.844e-05 ***
## cylinders:type 84 1 8.201 0.004696 **
## Residuals 1800 176
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Conclusion:
The best model is the two-way ANOVA model (Type 3) with the categorical predictors cylinders and type and their interaction. All three terms have p-values below the significance level of 0.05. Therefore, we reject the null hypothesis for each and conclude that the individual predictors and their interaction have a significant effect on mpg_highway.
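The F-test that justifies keeping the interaction can be checked by hand from the table: the F value is the interaction mean square divided by the residual mean square. The sketch below uses the rounded values printed above, so it only approximately reproduces the reported F value of 8.201. The variable names are illustrative.

```r
# Values copied from the Type 3 table above (rounded as printed).
ss_interaction = 84      # cylinders:type sum of squares
df_interaction = 1
ss_residuals   = 1800    # residual sum of squares
df_residuals   = 176

# F = interaction mean square / residual mean square.
f_interaction = (ss_interaction / df_interaction) / (ss_residuals / df_residuals)
p_interaction = pf(f_interaction, df_interaction, df_residuals, lower.tail = FALSE)
print(round(c(F = f_interaction, p = p_interaction), 4))
```

The p-value stays well below 0.05, consistent with retaining the cylinders:type term.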
Variation Explained by the Model (Predictors = cylinders, type, and interaction)
## [1] 0.4813821
Observation: 48% of the variation of mpg_highway can be explained by the model (cylinders, type, and interaction).
Exercise 4.C
Comment on any significant group differences through the post-hoc test. What does this tell us about fuel efficiency differences across cylinders, origin, or type groups? See Hint in Exercise 3.
Post-Hoc Test
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = mpg_highway ~ cylinders * type, data = cars_new)
##
## $cylinders
## diff lwr upr p adj
## 6-4 -5.722662 -6.664343 -4.780981 0
##
## $type
## diff lwr upr p adj
## Sports-Sedan -2.817931 -4.470787 -1.165075 0.0009407
##
## $`cylinders:type`
## diff lwr upr p adj
## 6:Sedan-4:Sedan -6.1723315 -7.469178 -4.875485 0.0000000
## 4:Sports-4:Sedan -5.2275641 -8.306639 -2.148489 0.0001079
## 6:Sports-4:Sedan -6.6025641 -9.681639 -3.523489 0.0000006
## 4:Sports-6:Sedan 0.9447674 -2.120956 4.010491 0.8546517
## 6:Sports-6:Sedan -0.4302326 -3.495956 2.635491 0.9834567
## 6:Sports-4:Sports -1.3750000 -5.521993 2.771993 0.8253946
cylinders
With a p-value below the significance level of 0.05, mean mpg_highway differs significantly between the 6 and 4 cylinders groups.
Specifically, the following effect of cylinders on mpg_highway can be seen:
- 6 < 4 (the mean mpg_highway of 4 is greater than the mean mpg_highway of 6)
type
With a p-value below the significance level of 0.05, mean mpg_highway differs significantly between the Sports and Sedan type groups.
Specifically, the following effect of type on mpg_highway can be seen:
- Sports < Sedan (the mean mpg_highway of Sedan is greater than the mean mpg_highway of Sports)
cylinders and type Interaction
With p-values below the significance level of 0.05, the cylinders:type pairs 6:Sedan-4:Sedan, 4:Sports-4:Sedan, and 6:Sports-4:Sedan have significantly different mean mpg_highway. The remaining pairs (4:Sports-6:Sedan, 6:Sports-6:Sedan, and 6:Sports-4:Sports) do not differ significantly.
Specifically, the following interaction effects on mpg_highway can be seen:
- 6:Sedan < 4:Sedan (the mean mpg_highway of 4:Sedan is greater than the mean mpg_highway of 6:Sedan)
- 4:Sports < 4:Sedan (the mean mpg_highway of 4:Sedan is greater than the mean mpg_highway of 4:Sports)
- 6:Sports < 4:Sedan (the mean mpg_highway of 4:Sedan is greater than the mean mpg_highway of 6:Sports)
Conclusion:
In summary, the analysis above indicates the following:
- 4 Cylinder cars have a higher mpg_highway than 6 Cylinder cars, meaning they have better highway fuel efficiency
- Sedans have a higher mpg_highway than Sports car types, meaning they have better highway fuel efficiency
- 4 Cylinder Sedans have a higher mpg_highway than 6 Cylinder Sedans, meaning they have better highway fuel efficiency
- 4 Cylinder Sedans have a higher mpg_highway than 4 Cylinder Sports car types, meaning they have better highway fuel efficiency
- 4 Cylinder Sedans have a higher mpg_highway than 6 Cylinder Sports car types, meaning they have better highway fuel efficiency