Introduction to Project

For this project I extracted information from the VINs (vehicle identification numbers) in a used car dataset to help answer some business questions about the used car marketplace.

The questions I will try to answer are:

  1. Which COUNTRY makes cars that lose the most value as mileage increases?
  2. Which BRAND makes cars that lose the most value as mileage increases?

library(dplyr)
library(ggplot2)
library(ggthemes)

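# Read the listings and work with a random sample of 100,000 rows.
loc <- 'C:/Users/Owner/Desktop/true_car_listings.csv'
cardata_raw <- read.csv(loc)
cardata <- sample_n(cardata_raw, 100000)

# The first character of a VIN identifies where the car was built.
cardata$Vin <- as.character(cardata$Vin)
cardata$country <- substr(cardata$Vin, start = 1, stop = 1)
unique(cardata$country)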

# Recode the first VIN character into the country of manufacture.
for(i in seq_along(cardata$country)){
  if(cardata$country[i] %in% c('1', '4', '5')){
    cardata$country[i] <- 'USA'
  }
  else if(cardata$country[i]=='J'){
    cardata$country[i] <- 'Japan'
  }
  else if(cardata$country[i]=='K'){
    cardata$country[i] <- 'Korea'
  }
  else if(cardata$country[i]=='W'){  # 'W' is the VIN prefix for Germany; 'S' is assigned mainly to the United Kingdom
    cardata$country[i] <- 'Germany'
  }
  else if(cardata$country[i]=='L'){
    cardata$country[i] <- 'China'
  }
  else{
    cardata$country[i] <- 'DOP'  # placeholder for every other prefix, dropped below
  }
}
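
The same recoding can also be written without a loop. The following is only a minimal sketch: `wmi_lookup` and `decode_country` are hypothetical names introduced for illustration, the example VIN strings are made up, and the table uses 'W' for Germany, which is the standard prefix for that country.

```r
# Hypothetical lookup table: first VIN character -> country label.
wmi_lookup <- c('1' = 'USA', '4' = 'USA', '5' = 'USA',
                'J' = 'Japan', 'K' = 'Korea',
                'W' = 'Germany', 'L' = 'China')

# Return a country label for each VIN; unknown prefixes become 'DOP'.
decode_country <- function(vin) {
  first_char <- substr(as.character(vin), 1, 1)
  country <- unname(wmi_lookup[first_char])  # NA where the prefix is not in the table
  country[is.na(country)] <- 'DOP'
  country
}

# Made-up VIN-like strings, for illustration only.
decode_country(c('1ABCDEFGH', 'JABCDEFGH', 'WABCDEFGH', 'ZABCDEFGH'))
```

Indexing a named vector handles the whole column in one step, which is usually much faster than a row-by-row loop on 100,000 rows.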

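# Drop cars from other regions and listings with extreme mileage.
cardata <- cardata[cardata$country != 'DOP', ]
cardata <- cardata[cardata$Mileage <= 250000, ]

# Keep only the five brands of interest. `%in%` tests membership;
# `==` would recycle `carbrands` row by row and silently drop most matches.
carbrands <- c('Toyota','Honda','Ford','Mercedes-Benz', 'Chevrolet')
cardata_brand <- cardata[cardata$Make %in% carbrands, ]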
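
When a column is filtered against a vector of several values, `==` and `%in%` behave very differently. A small illustration with toy data (the `makes` and `wanted` vectors are made up for this example and are not taken from the listings file):

```r
# Toy data, not taken from the listings file.
makes  <- c('Honda', 'Toyota', 'Ford', 'Honda')
wanted <- c('Toyota', 'Honda')

# `==` compares position by position, recycling `wanted` as
# Toyota, Honda, Toyota, Honda, so only coincidental matches are TRUE.
makes == wanted
# FALSE FALSE FALSE TRUE

# `%in%` asks, for each element of `makes`, whether it appears anywhere in `wanted`.
makes %in% wanted
# TRUE TRUE FALSE TRUE
```

With `==`, three of the four rows are compared against the wrong brand, so most rows for the wanted brands would be discarded without any error.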

Plots

pl1 <- ggplot(cardata, aes(x=Mileage, y=Price))+
  ggtitle('Car depreciation by Manufacturing Country')+
  geom_smooth(aes(color=country))+
  ylim(0, 100000)
print(pl1)
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
## Warning: Removed 64 rows containing non-finite values (stat_smooth).

pl2 <- ggplot(cardata_brand, aes(x=Mileage, y=Price))+
  ggtitle('Car depreciation by Brand')+
  geom_smooth(aes(color=Make))+
  ylim(0, 100000)
print(pl2)
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'

Conclusion

Cars from Germany depreciate the fastest as mileage increases. Mercedes-Benz also depreciates the fastest of the five popular brands compared (Toyota, Honda, Ford, Mercedes-Benz, and Chevrolet). The data is consistent with the stereotype that Mercedes-Benz cars lose value quickly.
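
One way to put a number on "depreciates the fastest" would be to fit a straight line of price against mileage for each group and compare the slopes. The sketch below is not the method used for the plots above (those use smoothed curves), and a linear fit is a simplification; the `toy` data frame is synthetic and only shows the mechanics.

```r
# Synthetic listings, for illustration only: two groups with known slopes.
toy <- data.frame(
  country = rep(c('A', 'B'), each = 5),
  Mileage = rep(c(0, 25000, 50000, 75000, 100000), times = 2)
)
toy$Price <- ifelse(toy$country == 'A',
                    30000 - 0.10 * toy$Mileage,
                    40000 - 0.25 * toy$Mileage)

# Fit Price ~ Mileage separately for each group and keep the slope,
# i.e. the estimated change in price per additional mile.
slopes <- sapply(split(toy, toy$country), function(d) {
  unname(coef(lm(Price ~ Mileage, data = d))['Mileage'])
})
slopes

# The group with the most negative slope loses value fastest per mile.
names(which.min(slopes))
```

Applied to the real data, the same idea would use `cardata` split by `country`, or `cardata_brand` split by `Make`.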