Setup

# Load libraries
library(tidyverse)
library(knitr)

# Read data
super <- read.csv("2013Coupes.csv")
dat <- filter(super,Price < 100000)

Summary Statistics

# Find Mean, Median and Standard Deviation of data w/o supercar and assign it to a variable name
reg <- dat %>%
        summarise(Mean = mean(Price), Median = median(Price), Standard_Dev = sd(Price))

# Find Mean, Median and Standard Deviation of data w/ supercar and assign it to a variable name
wporche <- super %>%
        summarise(Mean = mean(Price), Median = median(Price), Standard_Dev = sd(Price))

# Combine both summaries into one table (row 1: without supercar, row 2: with supercar)
w2 <- rbind(reg,wporche)

#Display Summary Statistics
kable(w2)
Mean Median Standard_Dev
21145.8 20992 6193.661
155587.1 21807 445930.032

Observations

Row 1 holds the summary statistics for the data set without the supercar. Row 2 holds the summary statistics for the data set with the supercar. Three tests show how well the mean represents each data set: the skew test, the outlier test, and the deviation test.

Skew Test

Data sets are typically skewed left if mean < median < mode, and skewed right if mode < median < mean. In the data set without the supercar, the mean (21145.8) is slightly greater than the median (20992), suggesting the data is skewed slightly to the right. The same holds for the data set with the supercar, but there the mean (155587.1) is more than 7 times the median (21807), indicating the data is skewed far to the right.
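
The mean-versus-median comparison can also be condensed into a single number. The sketch below uses Pearson's second skewness coefficient, 3 * (mean - median) / sd, which is not part of the original analysis and is offered only as one possible way to quantify the comparison. The prices are typed in from the reference table at the end of this document, and the names `price_all`, `price_reg`, and `pearson_skew` are illustrative.

```r
# Prices copied from the reference table at the end of this document
price_all <- c(21807, 27795, 29145, 14403, 17209, 25732,
               13674, 13774, 27742, 20177, 1500000)
price_reg <- price_all[price_all < 100000]  # drop the supercar

# Pearson's second skewness coefficient (an assumed choice, not the author's method):
# positive when mean > median, negative when mean < median
pearson_skew <- function(x) 3 * (mean(x) - median(x)) / sd(x)

skew_reg <- pearson_skew(price_reg)
skew_all <- pearson_skew(price_all)
print(c(without_supercar = skew_reg, with_supercar = skew_all))
```

Both coefficients come out positive, and the one for the data set with the supercar is much larger, which agrees with the comparison of mean and median above.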

Outlier Test

Potential outliers can be identified in a data set by calculating the IQR, multiplying it by 1.5, and then adding the result to Q3 and subtracting it from Q1. Any value above the upper bound or below the lower bound is a potential outlier, so any rows returned by the test are potential outliers.
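
The rule can be wrapped in a small helper so that the same logic is applied to both data sets. This is a minimal base R sketch, assuming the prices from the reference table at the end of this document; the function name `iqr_outliers` and its argument `k` are hypothetical. It relies on the default quartile method of `quantile()`, the same default used by `IQR()`.

```r
# Prices copied from the reference table at the end of this document
price_all <- c(21807, 27795, 29145, 14403, 17209, 25732,
               13674, 13774, 27742, 20177, 1500000)

# Return the values of x that lie more than k * IQR below Q1 or above Q3
iqr_outliers <- function(x, k = 1.5) {
  q <- quantile(x, c(0.25, 0.75), names = FALSE)  # Q1 and Q3
  fence <- k * IQR(x)
  x[x < q[1] - fence | x > q[2] + fence]
}

out_reg <- iqr_outliers(price_all[price_all < 100000])  # without the supercar
out_all <- iqr_outliers(price_all)                      # with the supercar
print(out_all)
```

Called on the prices without the supercar, the helper returns an empty vector; called on all prices, it returns only the supercar's price.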

Outlier Test Without the Supercar
q3outlier_value <- quantile(dat$Price,.75) + IQR(dat$Price) * 1.5  
q1outlier_value <- quantile(dat$Price,.25) - IQR(dat$Price) * 1.5
outlier_test <- filter(dat,Price > q3outlier_value | Price < q1outlier_value)
kable(outlier_test)
Vehicle.Type Year Make Model Price MPG..city. MPG..highway. Horsepower Cylinders

The test returned no rows, so no potential outliers were identified using this methodology.

Supercar Outlier Test
sq3outlier_value <- quantile(super$Price,.75) + IQR(super$Price) * 1.5  
sq1outlier_value <- quantile(super$Price,.25) - IQR(super$Price) * 1.5
s_outlier_test <- filter(super,Price > sq3outlier_value | Price < sq1outlier_value)
kable(s_outlier_test)
Vehicle.Type Year Make Model Price MPG..city. MPG..highway. Horsepower Cylinders
Coupe 2013 Porsche 918 Spider 1500000 22 22 875 8

The test returned the supercar, identifying it as a potential outlier.

Deviation Test

Values in a data set that are two or more standard deviations from the mean are considered “far” from the mean. Any rows returned by the test are considered far from the mean using this methodology.
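
The same check can be written once and reused for both data sets. The sketch below is an illustration in base R, assuming the prices from the reference table at the end of this document; the helper name `far_from_mean` and its argument `k` are hypothetical. It computes the mean and standard deviation from the data and flags values that lie more than `k` standard deviations away.

```r
# Prices copied from the reference table at the end of this document
price_all <- c(21807, 27795, 29145, 14403, 17209, 25732,
               13674, 13774, 27742, 20177, 1500000)

# Return the values of x that lie more than k standard deviations from the mean
far_from_mean <- function(x, k = 2) {
  x[abs(x - mean(x)) > k * sd(x)]
}

far_reg <- far_from_mean(price_all[price_all < 100000])  # without the supercar
far_all <- far_from_mean(price_all)                      # with the supercar
print(far_all)
```

Because the helper recomputes the mean and standard deviation for whatever vector it receives, the bounds never have to be copied by hand from a summary table.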

Deviation Test Without the Supercar
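# Keep rows more than 2 standard deviations from the mean (computed from the data rather than typed in)
sdtest <- filter(dat,Price < mean(Price)-sd(Price)*2 | Price > mean(Price)+sd(Price)*2)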
kable(sdtest)
Vehicle.Type Year Make Model Price MPG..city. MPG..highway. Horsepower Cylinders

None of the data points are considered far from the mean.

Supercar Deviation Test
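# Keep rows more than 2 standard deviations from the mean (computed from the data rather than typed in)
sdtest2 <- filter(super,Price < mean(Price)-sd(Price)*2 | Price > mean(Price)+sd(Price)*2)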
kable(sdtest2)
Vehicle.Type Year Make Model Price MPG..city. MPG..highway. Horsepower Cylinders
Coupe 2013 Porsche 918 Spider 1500000 22 22 875 8

The supercar is far from the mean.

Conclusion

The data set without the supercar is skewed slightly to the right, contains no potential outliers, and has all values within two standard deviations of the mean. The data set with the supercar is skewed far to the right, and the supercar was flagged both as a potential outlier and as far from the mean. I was surprised that adding the supercar changed the standard deviation more than it changed the mean, and I found this to be a valuable exercise.
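
The relative size of the two changes can be checked directly. The sketch below is an addition, not part of the original analysis: it computes how many times larger the mean and the standard deviation become once the supercar is included. The prices are typed in from the reference table, and the variable names are illustrative.

```r
# Prices copied from the reference table at the end of this document
price_all <- c(21807, 27795, 29145, 14403, 17209, 25732,
               13674, 13774, 27742, 20177, 1500000)
price_reg <- price_all[price_all < 100000]  # drop the supercar

# Ratio of each statistic with the supercar to the same statistic without it
mean_ratio <- mean(price_all) / mean(price_reg)
sd_ratio   <- sd(price_all) / sd(price_reg)
print(round(c(mean = mean_ratio, sd = sd_ratio), 1))
```

With these prices the mean grows by a factor of about 7.4 and the standard deviation by a factor of about 72. The standard deviation reacts more strongly because it is built from squared distances to the mean, so a single very distant value dominates the sum.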

Reference

kable(super)
Vehicle.Type Year Make Model Price MPG..city. MPG..highway. Horsepower Cylinders
Coupe 2013 Jaguar XK 21807 16 24 385 8
Coupe 2013 Chevrolet Camero 27795 15 24 426 8
Coupe 2013 Ford Mustang 29145 15 26 420 8
Coupe 2013 Mercedes E550 14403 17 27 402 8
Coupe 2013 Audi S5 17209 18 28 333 6
Coupe 2013 BMW M3 25732 14 20 414 8
Coupe 2013 Mini Coupe 2D 13674 26 35 208 4
Coupe 2013 Dodge Challenger 13774 16 25 375 8
Coupe 2013 Cadillac CTS-V 27742 12 18 556 8
Coupe 2013 Nissan 370Z 20177 19 26 332 6
Coupe 2013 Porsche 918 Spider 1500000 22 22 875 8