2024-04-03

What is the Data?

The data set contains 97,712 observations of used cars from 1970 to 2024. Because most cars on the road are used rather than new, used-car data gives a picture of what the average car is like once it is no longer new. A small sample of the observations is shown below.

year  price  transmission  mileage  fuelType  tax  mpg   engineSize  Manufacturer
2017   7495  Manual          11630  Petrol    145  60.1  1.0         hyundi
2017  10989  Manual           9200  Petrol    145  58.9  1.0         volkswagen
2019  27990  Semi-Auto        1614  Diesel    145  49.6  2.0         BMW
2017  12495  Manual          30960  Diesel    150  62.8  2.0         skoda
2017   7999  Manual          19353  Petrol    125  54.3  1.2         ford

Avg mpg per Manufacturer per Year

The graph above compares manufacturer, year, and average miles per gallon for used cars from 1970 to 2024. Because the graph is 3D, several different comparisons can be made simply by rotating it. For example, viewing the y and z axes shows how average mpg has changed over time across all cars, while viewing the x and z axes compares manufacturers by their average mpg across all years.

Avg mpg Throughout the Years

The average miles per gallon over the years can be modeled with the following linear equation:
\[ \text{mpg} = \beta_0 + \beta_1 \times \text{year} + \epsilon \]
Where:
- \(\text{mpg}\) = miles per gallon
- \(\text{year}\) = year of production
- \(\beta_0\) and \(\beta_1\) = coefficients estimated from the data
- \(\epsilon\) = error term

*The same model can be applied to other data sets, as long as the coefficients are re-estimated from that data.
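
As a minimal sketch of how the coefficients in the model above might be estimated, R's built-in lm() can fit the line by ordinary least squares. The data frame toyMpg and its values are made up for illustration and are not taken from the used-car data; with the real data, the cars data frame would take its place.

```r
# Toy data, invented for illustration only (not the used-car data set).
toyMpg <- data.frame(
  year = c(2015, 2016, 2017, 2018, 2019),
  mpg  = c(50, 53, 53, 57, 58)
)

# Fit mpg = beta_0 + beta_1 * year + error by ordinary least squares.
fit <- lm(mpg ~ year, data = toyMpg)

beta0 <- unname(coef(fit)["(Intercept)"])  # estimated beta_0
beta1 <- unname(coef(fit)["year"])         # estimated beta_1

# Predicted mpg for a given year, using the fitted coefficients.
predicted2020 <- unname(predict(fit, newdata = data.frame(year = 2020)))
```

Here beta1 is the estimated change in mpg per additional year, so a positive value would indicate that mpg tends to rise over time.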

How The Graph Was Made (1/2)

Removing impossible values and setting the x, y, and z axes:

library(dplyr) #Provides %>%, filter(), group_by(), summarise()

averageMpg <- cars %>%
  #Remove impossible values
  filter(year >= 1970, year <= 2024, mpg > 0) %>%
  #Average mpg for each manufacturer and year
  group_by(Manufacturer, year) %>%
  summarise(avgMpg = mean(mpg, na.rm = TRUE), .groups = "drop")

myX <- averageMpg$Manufacturer #Setting x-axis
myY <- averageMpg$year #Setting y-axis
myZ <- averageMpg$avgMpg #Setting z-axis

How The Graph Was Made (2/2)

Creating and plotting the graph:

library(plotly) #Provides plot_ly() and layout()

#Set to variable to add additional lines or markers later (optional)
#Figure size is set in plot_ly(); margin is a layout() setting; scene holds the 3D axes
g <- plot_ly(x = myX, y = myY, z = myZ, type = "scatter3d",
             mode = "markers",
             color = averageMpg$Manufacturer,
             height = 800, width = 1200) %>%
      layout(margin = list(t = 50, b = 50, l = 50, r = 50),
             scene = list(
               xaxis = list(title = "Manufacturer"),
               yaxis = list(title = "Year"),
               zaxis = list(title = "Average mpg")))
#g #This is where the plot is displayed

Transmission Type per Year (1990 - 2024)

The graph above shows the number of cars with each transmission type increasing until 2020, after which all types drop sharply. This suggests that the relative popularity of each type has not changed much; what has changed is the total number of cars produced.
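
One way to separate total counts from relative popularity is to compare the number of cars per transmission type with each type's share within a year. The sketch below uses base R and a small made-up data frame, toyTransmission, whose values are invented for illustration; it is not the code used to produce the graph.

```r
# Toy data, invented for illustration only (not the used-car data set).
toyTransmission <- data.frame(
  year         = c(2018, 2018, 2018, 2018, 2019, 2019),
  transmission = c("Manual", "Manual", "Automatic", "Semi-Auto",
                   "Manual", "Automatic")
)

# Number of cars per year (rows) and transmission type (columns).
counts <- table(toyTransmission$year, toyTransmission$transmission)

# Share of each transmission type within a year; each row sums to 1.
shares <- prop.table(counts, margin = 1)
```

If the shares stay roughly constant from year to year while the counts rise and fall, the change is in the total number of cars rather than in the popularity of any one type.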

Car Price Per Year

The graph above shows the average price of a used car from 1990 to 2024. The average price appears to increase roughly exponentially over this period. (Note: the 2024 average is much lower than expected, most likely because the data for that year is incomplete.)
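
Whether a trend is close to exponential can be checked by fitting a straight line to the logarithm of the price, since exponential growth is linear on the log scale. The sketch below is an illustration under stated assumptions: the data frame toyPrice is made up, with an exact 10% increase per year, and does not come from the used-car data.

```r
# Toy yearly averages, invented for illustration: exactly 10% growth per year.
toyPrice <- data.frame(year = 2000:2005)
toyPrice$avgPrice <- 1000 * 1.1^(toyPrice$year - 2000)

# Exponential growth is linear on the log scale:
# log(avgPrice) = a + b * year, so the yearly growth rate is exp(b) - 1.
logFit <- lm(log(avgPrice) ~ year, data = toyPrice)
growthRate <- unname(exp(coef(logFit)["year"]) - 1)
```

With real data the points would not lie exactly on the line, and the residuals of logFit would show how well the exponential description holds.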

Car Depreciation Over Time

The depreciation (loss in value) of a car over time can be modeled using the following equation:
\[ P_t = P_0(1 - r)^t \]
Where:
- \(P_t\) = value of the car at time \(t\)
- \(P_0\) = initial purchase price
- \(r\) = rate of depreciation per year
- \(t\) = number of years after the initial purchase

*The rate of depreciation can be estimated from the data.
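
The depreciation equation can be sketched as a pair of small R functions. The function names depreciated_value and implied_rate, and the figures used (a 20000 purchase price and a 15% yearly rate), are hypothetical and chosen only for illustration.

```r
# Value of a car after t years, given initial price p0 and yearly rate r.
depreciated_value <- function(p0, r, t) {
  p0 * (1 - r)^t
}

# Yearly rate implied by an initial price p0 and a value pt after t years,
# obtained by solving P_t = P_0 * (1 - r)^t for r.
implied_rate <- function(p0, pt, t) {
  1 - (pt / p0)^(1 / t)
}

# Hypothetical figures: a 20000 car losing 15% of its value each year.
valueAfter3 <- depreciated_value(p0 = 20000, r = 0.15, t = 3)
rateBack    <- implied_rate(p0 = 20000, pt = valueAfter3, t = 3)
```

The second function shows one simple way a rate could be backed out from two observed prices; estimating it from many cars at once would need a fitted model instead.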