Introduction

This report analyzes 2020 car models using the “Cars2020” dataset, obtained from https://www.lock5stat.com/datasets3e/Cars2020.csv. The dataset comprises 21 variables and 110 car models, with information on each car’s make, price, specifications, and performance. The primary objective of this report is to answer a series of research questions by analyzing this dataset.

Questions

  1. Is there a more significant correlation between LowPrice and weight, or between HighPrice and weight?

  2. Do Chevrolet cars have a higher average HighPrice than BMW cars?

  3. Does Acura or Audi have a more significant impact on 0 to 60 mph time?

  4. Do cars with 7 seats cost more overall?

  5. Does fuel capacity have a significant impact on a car’s city miles per gallon?

  6. Which size of vehicle has the most significant impact on braking?

  7. Does car wheelbase have a significant impact on 0 to 30mph time?

  8. Does car length have a significant impact on 0 to 60mph time?

  9. Does City MPG have a significant correlation to Highway MPG?

  10. Do sporty cars have a more significant impact on U-Turn than Sedans?

Dataset

Cars = read.csv("https://www.lock5stat.com/datasets3e/Cars2020.csv")
head(Cars)
##    Make Model   Type LowPrice HighPrice CityMPG HwyMPG Seating Drive Acc030
## 1 Acura   MDX    SUV     44.4     60.15      14     31       7   AWD    2.8
## 2 Acura   RLX  Sedan     54.9     61.00      15     36       5   AWD    2.7
## 3  Audi    A3  Sedan     33.3     43.00      18     40       5   AWD    3.2
## 4  Audi    A4 Sporty     37.4     45.70      18     40       5   AWD    2.7
## 5  Audi    A6  Sedan     54.9     73.90      17     39       5   AWD    2.8
## 6  Audi    A8  Wagon     83.8     83.80      20     27       5   AWD    2.4
##   Acc060 QtrMile Braking FuelCap Length Width Height Wheelbase UTurn Weight
## 1    6.8    15.3     135    19.5    196    77     67       111    40   4200
## 2    6.5    15.0     128    18.5    198    74     58       112    40   3930
## 3    8.3    16.4     124    13.2    175    70     56       104    37   3135
## 4    6.3    14.9     135    15.3    186    73     56       111    40   3630
## 5    6.8    15.3     129    19.3    195    74     57       115    38   4015
## 6    6.1    14.5     133    21.7    209    77     59       123    43   4810
##       Size
## 1 Midsized
## 2 Midsized
## 3    Small
## 4    Small
## 5 Midsized
## 6    Large

Analysis

Q1. Is there a more significant correlation between LowPrice and weight, or between HighPrice and weight?

cor(Cars$LowPrice, Cars$Weight)
## [1] 0.5771807
t.test(Cars$LowPrice, Cars$Weight)
## 
##  Welch Two Sample t-test
## 
## data:  Cars$LowPrice and Cars$Weight
## t = -46.886, df = 109.08, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -4002.345 -3677.697
## sample estimates:
##  mean of x  mean of y 
##   34.56955 3874.59091
cor(Cars$HighPrice, Cars$Weight)
## [1] 0.5628846
t.test(Cars$HighPrice, Cars$Weight)
## 
##  Welch Two Sample t-test
## 
## data:  Cars$HighPrice and Cars$Weight
## t = -46.665, df = 109.22, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -3985.435 -3660.691
## sample estimates:
##  mean of x  mean of y 
##   51.52764 3874.59091

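The correlation between LowPrice and weight (0.577) is slightly higher than the correlation between HighPrice and weight (0.563), but the two are very close. The t-tests above compare the mean of each price variable with the mean of weight, two quantities measured in different units, so their p-values say nothing about the correlations. The relevant tool is a correlation test (cor.test): with 110 cars, correlations of this size give t statistics of roughly 7 on 108 degrees of freedom, so both correlations are significantly different from zero.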
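
Whether a correlation is significantly different from zero can be checked from the correlation and the sample size alone, using t = r * sqrt(n - 2) / sqrt(1 - r^2). The sketch below uses only the two correlations printed above and n = 110; the helper name `cor_to_t` is made up for illustration.

```r
# Hypothetical helper: t statistic for testing H0: correlation = 0,
# given a sample correlation r and sample size n.
cor_to_t <- function(r, n) {
  r * sqrt(n - 2) / sqrt(1 - r^2)
}

n      <- 110          # number of car models in Cars2020
r_low  <- 0.5771807    # cor(LowPrice, Weight), as printed above
r_high <- 0.5628846    # cor(HighPrice, Weight), as printed above

t_low  <- cor_to_t(r_low, n)
t_high <- cor_to_t(r_high, n)

# Two-sided p-values from the t distribution with n - 2 degrees of freedom
p_low  <- 2 * pt(-abs(t_low),  df = n - 2)
p_high <- 2 * pt(-abs(t_high), df = n - 2)

print(c(t_low = t_low, t_high = t_high))
print(c(p_low = p_low, p_high = p_high))
```

On the full data frame, `cor.test(Cars$LowPrice, Cars$Weight)` and `cor.test(Cars$HighPrice, Cars$Weight)` should report the same statistics directly.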

Q2. Do Chevrolet cars have a higher average HighPrice than BMW cars?

# Subset the data for the two manufacturers to compare
highcost_chevrolet <- Cars$HighPrice[Cars$Make == "Chevrolet"]
highcost_bmw <- Cars$HighPrice[Cars$Make == "BMW"]


# Perform two-sample t-test
t_test_result <- t.test(highcost_chevrolet, highcost_bmw, alternative = "greater")

# Print the result
print(t_test_result)
## 
##  Welch Two Sample t-test
## 
## data:  highcost_chevrolet and highcost_bmw
## t = -1.0579, df = 6.5139, p-value = 0.8361
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  -70.75601       Inf
## sample estimates:
## mean of x mean of y 
##     58.57     83.74

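No. In this sample the mean HighPrice is 58.57 for Chevrolet and 83.74 for BMW, and the one-sided test of whether Chevrolet’s mean is greater gives p = 0.8361, so there is no evidence that Chevrolet cars have a higher average HighPrice. BMW’s sample mean is higher, but with so few cars per make (df ≈ 6.5) that difference is not statistically significant either: the p-value for the opposite one-sided alternative is 1 − 0.8361 ≈ 0.164.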
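
The p-value of a one-sided t-test depends on which direction the alternative points. As a sketch, the p-values for both directions and for the two-sided alternative can be recovered from the t statistic and degrees of freedom printed above; the variable names are illustrative.

```r
t_stat <- -1.0579   # t statistic printed above
df     <- 6.5139    # Welch degrees of freedom printed above

# H1: Chevrolet mean > BMW mean (the alternative used in the report)
p_greater <- pt(t_stat, df = df, lower.tail = FALSE)

# H1: Chevrolet mean < BMW mean (the opposite direction)
p_less <- pt(t_stat, df = df, lower.tail = TRUE)

# H1: the two means differ in either direction
p_two_sided <- 2 * min(p_greater, p_less)

print(round(c(greater = p_greater, less = p_less, two_sided = p_two_sided), 4))
```

The two one-sided p-values always sum to 1, so a large p-value in one direction does not by itself mean the opposite direction is significant.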

Q3. Does Acura or Audi have a more significant impact on 0 to 60 mph time?

t.test(Cars$Acc060 ~ Cars$Make == "Acura")
## 
##  Welch Two Sample t-test
## 
## data:  Cars$Acc060 by Cars$Make == "Acura"
## t = 5.6606, df = 4.0371, p-value = 0.004669
## alternative hypothesis: true difference in means between group FALSE and group TRUE is not equal to 0
## 95 percent confidence interval:
##  0.6168594 1.7961036
## sample estimates:
## mean in group FALSE  mean in group TRUE 
##            7.856481            6.650000
t.test(Cars$Acc060 ~ Cars$Make == "Audi")
## 
##  Welch Two Sample t-test
## 
## data:  Cars$Acc060 by Cars$Make == "Audi"
## t = 2.2823, df = 7.1082, p-value = 0.05588
## alternative hypothesis: true difference in means between group FALSE and group TRUE is not equal to 0
## 95 percent confidence interval:
##  -0.02900649  1.79439111
## sample estimates:
## mean in group FALSE  mean in group TRUE 
##            7.882692            7.000000

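Acura shows the more significant difference. Acura’s mean 0 to 60 mph time (6.65) differs significantly from the mean of all other makes (7.86, p = 0.0047), while Audi’s mean (7.00) does not differ significantly from other makes at the 0.05 level (p = 0.056). The Acura mean of 6.65 matches the two Acura rows in the head() output (6.8 and 6.5), which suggests that group contains only two cars, so this result should be read with caution.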

Q4. Do cars with 7 seats cost more overall?

# LowPrice for cars with 7 seats and for all other cars
seven_seats <- Cars$LowPrice[Cars$Seating == 7]
not_seven_seats <- Cars$LowPrice[Cars$Seating != 7]

# Perform two-sample t-test
t_test_result_1 <- t.test(seven_seats, not_seven_seats)


# Print the result
print(t_test_result_1)
## 
##  Welch Two Sample t-test
## 
## data:  seven_seats and not_seven_seats
## t = 1.001, df = 28.15, p-value = 0.3253
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -3.294154  9.594005
## sample estimates:
## mean of x mean of y 
##  37.31857  34.16865
# HighPrice for cars with 7 seats and for all other cars
seven_seats <- Cars$HighPrice[Cars$Seating == 7]
not_seven_seats <- Cars$HighPrice[Cars$Seating != 7]

# Perform two-sample t-test
t_test_result_1 <- t.test(seven_seats, not_seven_seats)
# Print the result
print(t_test_result_1)
## 
##  Welch Two Sample t-test
## 
## data:  seven_seats and not_seven_seats
## t = 0.22717, df = 28.151, p-value = 0.8219
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -9.22499 11.52698
## sample estimates:
## mean of x mean of y 
##  52.53214  51.38115

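Based on this data, seven-seat cars do not cost significantly more. Their sample means are slightly higher for both LowPrice (37.32 vs. 34.17) and HighPrice (52.53 vs. 51.38), but neither difference is statistically significant (p = 0.3253 and p = 0.8219), and both confidence intervals include zero.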

Q5. Does fuel capacity have a significant impact on a car’s city miles per gallon?

cor.test(Cars$FuelCap, Cars$CityMPG)
## 
##  Pearson's product-moment correlation
## 
## data:  Cars$FuelCap and Cars$CityMPG
## t = -15.772, df = 108, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.8840484 -0.7678422
## sample estimates:
##        cor 
## -0.8350299

Based on this data, fuel capacity has a significant association with a car’s city MPG (p < 2.2e-16). The correlation is strong and negative (r = -0.835): as fuel capacity goes up, city MPG goes down.
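
The 95 percent confidence interval that cor.test reports for a Pearson correlation is based on the Fisher z-transformation. As a sketch, it can be reproduced from the correlation and sample size alone; only the correlation and the degrees of freedom are taken from the output above, and the variable names are illustrative.

```r
r <- -0.8350299   # cor(FuelCap, CityMPG), as printed above
n <- 110          # df = 108 in the output above, so n = df + 2

z      <- atanh(r)           # Fisher z-transform of r
se     <- 1 / sqrt(n - 3)    # approximate standard error of z
z_crit <- qnorm(0.975)       # critical value for a 95% interval

# Build the interval on the z scale, then transform back to the r scale
ci <- tanh(c(z - z_crit * se, z + z_crit * se))
print(round(ci, 4))
```

Because the whole interval lies well below zero, it leads to the same conclusion as the p-value: the negative correlation is not plausibly due to chance.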

Q6. Which size of vehicle has the most significant impact on braking?

braking_small <- Cars$Braking[Cars$Size == "Small"]
braking_midsized <- Cars$Braking[Cars$Size == "Midsized"]
braking_large <- Cars$Braking[Cars$Size == "Large"]


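# One-sided Welch t-test: is mean braking for large cars less than for midsized cars?
t_test_result_1 <- t.test(braking_large, braking_midsized, alternative = "less")
# One-sided Welch t-test: is mean braking for small cars less than for midsized cars?
t_test_result_2 <- t.test(braking_small, braking_midsized, alternative = "less")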

# Print the result
print(t_test_result_1)
## 
##  Welch Two Sample t-test
## 
## data:  braking_large and braking_midsized
## t = 2.5607, df = 35.396, p-value = 0.9926
## alternative hypothesis: true difference in means is less than 0
## 95 percent confidence interval:
##      -Inf 6.782956
## sample estimates:
## mean of x mean of y 
##  135.4286  131.3415
print(t_test_result_2)
## 
##  Welch Two Sample t-test
## 
## data:  braking_small and braking_midsized
## t = -2.2101, df = 84.909, p-value = 0.01489
## alternative hypothesis: true difference in means is less than 0
## 95 percent confidence interval:
##        -Inf -0.7395089
## sample estimates:
## mean of x mean of y 
##  128.3542  131.3415

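Small cars have the lowest mean Braking value (128.35), followed by midsized cars (131.34) and large cars (135.43). Small cars are significantly lower than midsized cars (p = 0.0149). The test of whether large cars are lower than midsized cars gives p = 0.9926, so there is no evidence for that; the positive t statistic (2.56) points the other way, with large cars having the higher mean. Taken together, larger cars take longer to come to a stop.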
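
With three size groups, a one-way ANOVA followed by pairwise comparisons is a common alternative to separate two-sample t-tests, since it tests all groups at once and lets the pairwise p-values be adjusted for multiple comparisons. The sketch below uses a small made-up data frame (`toy`), not the Cars2020 values, purely to show the calls; on the real data, `Braking` and `Size` from `Cars` would take its place.

```r
# Made-up braking values for illustration only (NOT taken from Cars2020)
toy <- data.frame(
  Size    = factor(rep(c("Small", "Midsized", "Large"), each = 5),
                   levels = c("Small", "Midsized", "Large")),
  Braking = c(126, 129, 127, 130, 128,
              131, 133, 130, 132, 134,
              135, 137, 134, 138, 136)
)

# One-way ANOVA: do mean braking values differ across the three sizes?
fit     <- aov(Braking ~ Size, data = toy)
anova_p <- summary(fit)[[1]][["Pr(>F)"]][1]

# Pairwise Welch t-tests with Holm adjustment for multiple comparisons
pairs <- pairwise.t.test(toy$Braking, toy$Size,
                         p.adjust.method = "holm", pool.sd = FALSE)

print(anova_p)
print(pairs$p.value)
```

The ANOVA p-value answers whether any size differs from the others; the pairwise table then shows which specific pairs differ.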

Q7. Does car wheelbase have a significant impact on 0 to 30mph time?

cor.test(Cars$Wheelbase, Cars$Acc030)
## 
##  Pearson's product-moment correlation
## 
## data:  Cars$Wheelbase and Cars$Acc030
## t = -4.3873, df = 108, p-value = 2.68e-05
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.5370596 -0.2175284
## sample estimates:
##        cor 
## -0.3889287

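Based on this data, wheelbase has a significant association with a car’s 0 to 30mph time (p = 2.68e-05). The correlation is moderate and negative (r = -0.389): cars with a longer wheelbase tend to have shorter 0 to 30mph times.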

Q8. Does car length have a significant impact on 0 to 60mph time?

cor.test(Cars$Length, Cars$Acc060)
## 
##  Pearson's product-moment correlation
## 
## data:  Cars$Length and Cars$Acc060
## t = -5.7331, df = 108, p-value = 9.06e-08
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.6146840 -0.3252093
## sample estimates:
##        cor 
## -0.4830374

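Based on this data, car length has a significant association with a car’s 0 to 60mph time (p = 9.06e-08). The correlation is moderate and negative (r = -0.483): longer cars tend to have shorter 0 to 60mph times.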

Q9. Does City MPG have a significant correlation to Highway MPG?

cor.test(Cars$CityMPG, Cars$HwyMPG)
## 
##  Pearson's product-moment correlation
## 
## data:  Cars$CityMPG and Cars$HwyMPG
## t = 18.968, df = 108, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.8252730 0.9141287
## sample estimates:
##       cor 
## 0.8769963

Based on this data, city MPG has a significant, strong positive correlation with highway MPG (r = 0.877, p < 2.2e-16).

Q10. Do sporty cars have a more significant impact on U-Turn than Sedans?

uturn_sporty <- Cars$UTurn[Cars$Type == "Sporty"]
uturn_sedan <- Cars$UTurn[Cars$Type == "Sedan"]


# One-sided Welch t-test: is the mean U-turn for sporty cars greater than for sedans?
uturn_test <- t.test(uturn_sporty, uturn_sedan, alternative = "greater")

# Print the result
print(uturn_test)

The result of this test is not shown, so the data presented here do not establish whether sporty cars have a larger U-turn than sedans.

Summary

This section covers the key takeaways from each question, in the order they were asked.

1. Is there a more significant correlation between LowPrice and weight, or between HighPrice and weight?

The correlation of weight with LowPrice (0.577) is slightly higher than with HighPrice (0.563), but the difference is small. With 110 cars, both correlations are significantly different from zero. The t-tests run for this question compare means rather than correlations, so their p-values do not measure the significance of either correlation.

2. Do Chevrolet cars have a higher average HighPrice than BMW cars?

BMW has the higher sample mean HighPrice (83.74 vs. 58.57 for Chevrolet), and there is no evidence that Chevrolet is higher (p = 0.8361). With few cars per make, BMW’s higher mean is not statistically significant either.

3. Does Acura or Audi have a more significant impact on 0 to 60 mph time?

Acura’s mean 0 to 60 mph time differs significantly from that of other makes (p = 0.0047), while Audi’s does not at the 0.05 level (p = 0.056).

4. Do cars with 7 seats cost more overall?

Seven-seat cars have slightly higher sample means for both LowPrice and HighPrice, but neither difference is statistically significant (p = 0.3253 and p = 0.8219).

5. Does fuel capacity have a significant impact on a car’s city miles per gallon?

Fuel capacity has a significant, strong negative correlation with a car’s city MPG (r = -0.835, p < 2.2e-16). As fuel capacity goes up, city MPG goes down.

6. Which size of vehicle has the most significant impact on braking?

Small cars have the lowest mean Braking value and large cars the highest. Small cars are significantly lower than midsized cars (p = 0.0149), and there is no evidence that large cars are lower than midsized cars (p = 0.9926). Larger cars take longer to come to a stop.

7. Does car wheelbase have a significant impact on 0 to 30mph time?

Wheelbase has a significant, moderate negative correlation with a car’s 0 to 30mph time (r = -0.389, p = 2.68e-05).

8. Does car length have a significant impact on 0 to 60mph time?

Car length has a significant, moderate negative correlation with a car’s 0 to 60mph time (r = -0.483, p = 9.06e-08).

9. Does City MPG have a significant correlation to Highway MPG?

City MPG has a significant, strong positive correlation with highway MPG (r = 0.877, p < 2.2e-16).

10. Do sporty cars have a more significant impact on U-Turn than Sedans?

This question remains unanswered: no U-turn test output is available, so the data presented do not show whether sporty cars have a larger U-turn than sedans.

Conclusion

Based on the questions answered, some general conclusions can be made. Weight is moderately and positively correlated with price, while make (Chevrolet vs. BMW) and having seven seats did not show a statistically significant effect on price in this sample. Physical attributes such as wheelbase and car length are significantly, though moderately, correlated with acceleration times, and fuel capacity is strongly negatively correlated with city MPG.

This project shows how p-values can be used to evaluate the significance of relationships between variables. It is important to note that comparisons based on small groups, such as the cars of a single make, have less power to detect real differences and so may produce larger p-values. More generally, statistics can help answer questions about collected data.