Objective
We are all aware of the environmental changes happening around the world, and many of us are trying to reduce greenhouse gas emissions. Yet one thing we use every day contributes significantly to those emissions: motor vehicles. With that in mind, I gathered data on the fuel efficiency of cars from 1984 to 2017 to assess the automotive industry’s efforts to reduce emissions.
The data reveal a disheartening picture. Progress in fuel efficiency has been slower than anticipated, and some vehicles remain far behind in their eco-friendly advancements. Until 1998, fuel efficiency improved very slowly. From 1998 to 2003 there was a clear improvement, but the numbers then fell back to almost where they had been in 1998. Since 2003, more fuel-efficient cars have come to market because the automotive industry invested in research and development of fuel-efficient technologies, which produced a wave of innovation. The market now offers many cleaner, more efficient vehicles, which reduces greenhouse gas emissions and helps fight climate change.
Several advancements have contributed to this improvement: hybrid vehicles, which combine electric motors with combustion engines; electric vehicles; improved aerodynamics and body shapes that lower the drag coefficient; lightweight but strong materials; and start-stop systems. The journey towards better fuel efficiency has been slower than expected, but recent advancements have brought a new era of cleaner and more efficient vehicles. With continued investment and innovation, we can hope to see more eco-friendly advancements in the future.
To validate the data visualization, we take the following steps.
We first plot the data gathered for the years 1984 to 2017.
References
Vehicle Fuel Economy Estimates, 1984-2017. (n.d.). Kaggle. https://www.kaggle.com/datasets/epa/fuel-economy
How to Color Scatter Plot Points in R? (2021, May 25). GeeksforGeeks. https://www.geeksforgeeks.org/how-to-color-scatter-plot-points-in-r/
Long, J. (JD), & Teetor, P. (n.d.). 10 Graphics. In R Cookbook, 2nd Edition. Retrieved June 7, 2023, from https://rc2e.com/graphics#recipe-id171
The following code was used to load and prepare the data.
# Clear any objects left in the workspace with rm().
rm(list=ls())
library(readr)
library(dplyr)
library(readxl)
#library(tidyverse)
library(ggplot2)
library(car)
library(matlib)
library(GGally)
library(stats)
#library(ggthemes)
setwd("D:/RMIT Term 2/Data Visualisation & Comm/Assignment 3/Another data")
original_data <- read.csv("database.csv")
# Look at the first 6 rows of the data with head().
head(original_data)
## Vehicle.ID Year Make Model Class
## 1 26587 1984 Alfa Romeo GT V6 2.5 Minicompact Cars
## 2 27705 1984 Alfa Romeo GT V6 2.5 Minicompact Cars
## 3 26561 1984 Alfa Romeo Spider Veloce 2000 Two Seaters
## 4 27681 1984 Alfa Romeo Spider Veloce 2000 Two Seaters
## 5 27550 1984 AM General DJ Po Vehicle 2WD Special Purpose Vehicle 2WD
## 6 28426 1984 AM General DJ Po Vehicle 2WD Special Purpose Vehicle 2WD
## Drive Transmission Transmission.Descriptor Engine.Index
## 1 Manual 5-Speed 9001
## 2 Manual 5-Speed 9005
## 3 Manual 5-Speed 9002
## 4 Manual 5-Speed 9006
## 5 2-Wheel Drive Automatic 3-Speed 1830
## 6 2-Wheel Drive Automatic 3-Speed 1880
## Engine.Descriptor Engine.Cylinders Engine.Displacement Turbocharger
## 1 (FFS) 6 2.5 NA
## 2 (FFS) CA model 6 2.5 NA
## 3 (FFS) 4 2.0 NA
## 4 (FFS) CA model 4 2.0 NA
## 5 (FFS) 4 2.5 NA
## 6 (FFS) CA model 4 2.5 NA
## Supercharger Fuel.Type Fuel.Type.1 Fuel.Type.2 City.MPG..FT1.
## 1 Regular Regular Gasoline 17
## 2 Regular Regular Gasoline 17
## 3 Regular Regular Gasoline 18
## 4 Regular Regular Gasoline 18
## 5 Regular Regular Gasoline 18
## 6 Regular Regular Gasoline 18
## Unrounded.City.MPG..FT1. City.MPG..FT2. Unrounded.City.MPG..FT2.
## 1 0 0 0
## 2 0 0 0
## 3 0 0 0
## 4 0 0 0
## 5 0 0 0
## 6 0 0 0
## City.Gasoline.Consumption..CD. City.Electricity.Consumption
## 1 0 0
## 2 0 0
## 3 0 0
## 4 0 0
## 5 0 0
## 6 0 0
## City.Utility.Factor Highway.MPG..FT1. Unrounded.Highway.MPG..FT1.
## 1 0 24 0
## 2 0 24 0
## 3 0 25 0
## 4 0 25 0
## 5 0 17 0
## 6 0 17 0
## Highway.MPG..FT2. Unrounded.Highway.MPG..FT2.
## 1 0 0
## 2 0 0
## 3 0 0
## 4 0 0
## 5 0 0
## 6 0 0
## Highway.Gasoline.Consumption..CD. Highway.Electricity.Consumption
## 1 0 0
## 2 0 0
## 3 0 0
## 4 0 0
## 5 0 0
## 6 0 0
## Highway.Utility.Factor Unadjusted.City.MPG..FT1. Unadjusted.Highway.MPG..FT1.
## 1 0 21 34
## 2 0 21 34
## 3 0 23 35
## 4 0 23 35
## 5 0 22 24
## 6 0 22 24
## Unadjusted.City.MPG..FT2. Unadjusted.Highway.MPG..FT2. Combined.MPG..FT1.
## 1 0 0 20
## 2 0 0 20
## 3 0 0 21
## 4 0 0 21
## 5 0 0 17
## 6 0 0 17
## Unrounded.Combined.MPG..FT1. Combined.MPG..FT2. Unrounded.Combined.MPG..FT2.
## 1 0 0 0
## 2 0 0 0
## 3 0 0 0
## 4 0 0 0
## 5 0 0 0
## 6 0 0 0
## Combined.Electricity.Consumption Combined.Gasoline.Consumption..CD.
## 1 0 0
## 2 0 0
## 3 0 0
## 4 0 0
## 5 0 0
## 6 0 0
## Combined.Utility.Factor Annual.Fuel.Cost..FT1. Annual.Fuel.Cost..FT2.
## 1 0 1750 0
## 2 0 1750 0
## 3 0 1650 0
## 4 0 1650 0
## 5 0 2050 0
## 6 0 2050 0
## Gas.Guzzler.Tax Save.or.Spend..5.Year. Annual.Consumption.in.Barrels..FT1.
## 1 -2000 16.48050
## 2 -2000 16.48050
## 3 -1500 15.69571
## 4 -1500 15.69571
## 5 -3500 19.38882
## 6 -3500 19.38882
## Annual.Consumption.in.Barrels..FT2. Tailpipe.CO2..FT1.
## 1 0 -1
## 2 0 -1
## 3 0 -1
## 4 0 -1
## 5 0 -1
## 6 0 -1
## Tailpipe.CO2.in.Grams.Mile..FT1. Tailpipe.CO2..FT2.
## 1 444.3500 -1
## 2 444.3500 -1
## 3 423.1905 -1
## 4 423.1905 -1
## 5 522.7647 -1
## 6 522.7647 -1
## Tailpipe.CO2.in.Grams.Mile..FT2. Fuel.Economy.Score GHG.Score
## 1 0 -1 -1
## 2 0 -1 -1
## 3 0 -1 -1
## 4 0 -1 -1
## 5 0 -1 -1
## 6 0 -1 -1
## GHG.Score..Alt.Fuel. My.MPG.Data X2D.Passenger.Volume X2D.Luggage.Volume
## 1 -1 N 74 7
## 2 -1 N 74 7
## 3 -1 N 0 0
## 4 -1 N 0 0
## 5 -1 N 0 0
## 6 -1 N 0 0
## X4D.Passenger.Volume X4D.Luggage.Volume Hatchback.Passenger.Volume
## 1 0 0 0
## 2 0 0 0
## 3 0 0 0
## 4 0 0 0
## 5 0 0 0
## 6 0 0 0
## Hatchback.Luggage.Volume Start.Stop.Technology Alternative.Fuel.Technology
## 1 0
## 2 0
## 3 0
## 4 0
## 5 0
## 6 0
## Electric.Motor Manufacturer.Code Gasoline.Electricity.Blended..CD.
## 1 False
## 2 False
## 3 False
## 4 False
## 5 False
## 6 False
## Vehicle.Charger Alternate.Charger Hours.to.Charge..120V.
## 1 0
## 2 0
## 3 0
## 4 0
## 5 0
## 6 0
## Hours.to.Charge..240V. Hours.to.Charge..AC.240V. Composite.City.MPG
## 1 0 0 0
## 2 0 0 0
## 3 0 0 0
## 4 0 0 0
## 5 0 0 0
## 6 0 0 0
## Composite.Highway.MPG Composite.Combined.MPG Range..FT1. City.Range..FT1.
## 1 0 0 0 0
## 2 0 0 0 0
## 3 0 0 0 0
## 4 0 0 0 0
## 5 0 0 0 0
## 6 0 0 0 0
## Highway.Range..FT1. Range..FT2. City.Range..FT2. Highway.Range..FT2.
## 1 0 0 0
## 2 0 0 0
## 3 0 0 0
## 4 0 0 0
## 5 0 0 0
## 6 0 0 0
# Remove the columns we do not need with subset().
subset_data <- subset(original_data, select = -c(Unrounded.City.MPG..FT1., City.MPG..FT2., Unrounded.City.MPG..FT2.,
City.Electricity.Consumption, City.Utility.Factor,
Unrounded.Highway.MPG..FT1., Highway.MPG..FT2., Unrounded.Highway.MPG..FT2., Highway.Gasoline.Consumption..CD.,
Highway.Electricity.Consumption, Highway.Utility.Factor,Unadjusted.City.MPG..FT2., Unadjusted.Highway.MPG..FT2.,
Unrounded.Combined.MPG..FT1., Combined.MPG..FT2., Unrounded.Combined.MPG..FT2., Combined.Electricity.Consumption,
Combined.Gasoline.Consumption..CD., Combined.Utility.Factor, Annual.Fuel.Cost..FT2., Gas.Guzzler.Tax,
Annual.Consumption.in.Barrels..FT2., Tailpipe.CO2.in.Grams.Mile..FT2., Alternative.Fuel.Technology, Electric.Motor, Manufacturer.Code, Gasoline.Electricity.Blended..CD.,
Vehicle.Charger, Alternate.Charger, Hours.to.Charge..120V., Hours.to.Charge..AC.240V., Composite.City.MPG,
Composite.Highway.MPG, Composite.Combined.MPG, Range..FT1., City.Range..FT1., Highway.Range..FT1., Range..FT2., City.Range..FT2., Highway.Range..FT2., City.Gasoline.Consumption..CD.))
#subset(original_data, select = -c(City.Gasoline.Consumption..CD.))
#Check the summary of your data by using summary() function
summary(subset_data)
## Vehicle.ID Year Make Model
## Min. : 1 Min. :1984 Length:38113 Length:38113
## 1st Qu.: 9529 1st Qu.:1991 Class :character Class :character
## Median :19058 Median :2001 Mode :character Mode :character
## Mean :19171 Mean :2000
## 3rd Qu.:28779 3rd Qu.:2009
## Max. :38542 Max. :2017
##
## Class Drive Transmission
## Length:38113 Length:38113 Length:38113
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
##
##
##
##
## Transmission.Descriptor Engine.Index Engine.Descriptor Engine.Cylinders
## Length:38113 Min. : 0 Length:38113 Min. : 2.000
## Class :character 1st Qu.: 0 Class :character 1st Qu.: 4.000
## Mode :character Median : 212 Mode :character Median : 6.000
## Mean : 8799 Mean : 5.737
## 3rd Qu.: 4451 3rd Qu.: 6.000
## Max. :69102 Max. :16.000
## NA's :136
## Engine.Displacement Turbocharger Supercharger Fuel.Type
## Min. :0.000 Mode:logical Length:38113 Length:38113
## 1st Qu.:2.200 TRUE:5239 Class :character Class :character
## Median :3.000 NA's:32874 Mode :character Mode :character
## Mean :3.318
## 3rd Qu.:4.300
## Max. :8.400
## NA's :134
## Fuel.Type.1 Fuel.Type.2 City.MPG..FT1. Highway.MPG..FT1.
## Length:38113 Length:38113 Min. : 6.00 Min. : 9.00
## Class :character Class :character 1st Qu.: 15.00 1st Qu.: 20.00
## Mode :character Mode :character Median : 17.00 Median : 24.00
## Mean : 17.98 Mean : 24.08
## 3rd Qu.: 20.00 3rd Qu.: 27.00
## Max. :150.00 Max. :122.00
##
## Unadjusted.City.MPG..FT1. Unadjusted.Highway.MPG..FT1. Combined.MPG..FT1.
## Min. : 0.00 Min. : 0.00 Min. : 7.00
## 1st Qu.: 18.00 1st Qu.: 27.12 1st Qu.: 17.00
## Median : 21.05 Median : 33.00 Median : 19.00
## Mean : 22.65 Mean : 33.68 Mean : 20.22
## 3rd Qu.: 25.20 3rd Qu.: 38.20 3rd Qu.: 23.00
## Max. :224.80 Max. :182.70 Max. :136.00
##
## Annual.Fuel.Cost..FT1. Save.or.Spend..5.Year.
## Min. : 500 Min. :-23500
## 1st Qu.:1600 1st Qu.: -5000
## Median :1950 Median : -3000
## Mean :1971 Mean : -3102
## 3rd Qu.:2350 3rd Qu.: -1250
## Max. :6050 Max. : 4250
##
## Annual.Consumption.in.Barrels..FT1. Tailpipe.CO2..FT1.
## Min. : 0.06 Min. : -1.00
## 1st Qu.:14.33 1st Qu.: -1.00
## Median :17.35 Median : -1.00
## Mean :17.52 Mean : 64.28
## 3rd Qu.:20.60 3rd Qu.: -1.00
## Max. :47.09 Max. :847.00
##
## Tailpipe.CO2.in.Grams.Mile..FT1. Tailpipe.CO2..FT2. Fuel.Economy.Score
## Min. : 0.0 Min. : -1.000 Min. :-1.000000
## 1st Qu.: 388.0 1st Qu.: -1.000 1st Qu.:-1.000000
## Median : 467.7 Median : -1.000 Median :-1.000000
## Mean : 472.8 Mean : 5.277 Mean : 0.005457
## 3rd Qu.: 555.4 3rd Qu.: -1.000 3rd Qu.:-1.000000
## Max. :1269.6 Max. :713.000 Max. :10.000000
##
## GHG.Score GHG.Score..Alt.Fuel. My.MPG.Data
## Min. :-1.000000 Min. :-1.0000 Length:38113
## 1st Qu.:-1.000000 1st Qu.:-1.0000 Class :character
## Median :-1.000000 Median :-1.0000 Mode :character
## Mean : 0.004802 Mean :-0.9268
## 3rd Qu.:-1.000000 3rd Qu.:-1.0000
## Max. :10.000000 Max. : 8.0000
##
## X2D.Passenger.Volume X2D.Luggage.Volume X4D.Passenger.Volume
## Min. : 0.00 Min. : 0.000 Min. : 0.00
## 1st Qu.: 0.00 1st Qu.: 0.000 1st Qu.: 0.00
## Median : 0.00 Median : 0.000 Median : 0.00
## Mean : 13.69 Mean : 1.847 Mean : 33.77
## 3rd Qu.: 0.00 3rd Qu.: 0.000 3rd Qu.: 91.00
## Max. :194.00 Max. :41.000 Max. :192.00
##
## X4D.Luggage.Volume Hatchback.Passenger.Volume Hatchback.Luggage.Volume
## Min. : 0.000 Min. : 0.00 Min. : 0.000
## 1st Qu.: 0.000 1st Qu.: 0.00 1st Qu.: 0.000
## Median : 0.000 Median : 0.00 Median : 0.000
## Mean : 6.154 Mean : 10.41 Mean : 2.028
## 3rd Qu.:13.000 3rd Qu.: 0.00 3rd Qu.: 0.000
## Max. :55.000 Max. :195.00 Max. :49.000
##
## Start.Stop.Technology Hours.to.Charge..240V.
## Length:38113 Min. : 0.00000
## Class :character 1st Qu.: 0.00000
## Mode :character Median : 0.00000
## Mean : 0.02763
## 3rd Qu.: 0.00000
## Max. :12.00000
##
#Check class of your data
class(subset_data)
## [1] "data.frame"
# Keep only the two columns used in the first plot, Year and City.MPG..FT1. (columns 2 and 18).
year_city <- subset_data[c("Year", "City.MPG..FT1.")]
Below is the line graph drawn with ggplot2.
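The plotting code is not shown in this section, so the following is only a minimal sketch of how a line graph of this kind could be drawn. It assumes the quantity of interest is the mean of City.MPG..FT1. per model year. The small year_city data frame here is a made-up stand-in for the real one, and the names yearly_city and Mean.City.MPG are introduced purely for illustration.

```r
# Toy stand-in for year_city (hypothetical values); the real data comes from database.csv.
year_city <- data.frame(
  Year = c(1984, 1984, 1985, 1985, 1986, 1986),
  City.MPG..FT1. = c(17, 19, 18, 20, 19, 23)
)

# Average city MPG for each model year.
yearly_city <- aggregate(City.MPG..FT1. ~ Year, data = year_city, FUN = mean)
names(yearly_city)[2] <- "Mean.City.MPG"
print(yearly_city)

# Draw the line graph only when ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  city_line <- ggplot(yearly_city, aes(x = Year, y = Mean.City.MPG)) +
    geom_line() +
    geom_point() +
    labs(x = "Model year", y = "Mean city MPG",
         title = "Average city fuel economy by model year")
  print(city_line)
}
```

Averaging per year first keeps the line readable. Plotting every vehicle directly with geom_line() would connect thousands of points within each year.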
The scatter plot shows that the main shift in the industry came between 1998 and 2003, when some cars improved their fuel efficiency significantly.
Next we break the analysis down and look at the fuel efficiency of cars from different brands and body types, such as SUVs (sport utility vehicles) and hatchbacks.
We start by picking one class of car and following its fuel efficiency over time. Here we pick subcompact cars.
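As a hedged illustration of this step, the sketch below filters a data frame down to one vehicle class and plots city MPG against model year. The class label "Subcompact Cars" is an assumption about how the Class column spells it, the vehicles data frame holds made-up values, and the names subcompact and subcompact_yearly are hypothetical.

```r
# Toy stand-in for subset_data (hypothetical values).
vehicles <- data.frame(
  Year = c(1984, 1984, 1984, 1985, 1985, 1985),
  Class = c("Subcompact Cars", "Subcompact Cars", "Two Seaters",
            "Subcompact Cars", "Minicompact Cars", "Subcompact Cars"),
  City.MPG..FT1. = c(20, 24, 18, 23, 17, 25)
)

# Keep only the rows for the chosen class (label assumed, check unique(vehicles$Class)).
subcompact <- subset(vehicles, Class == "Subcompact Cars")

# Yearly mean city MPG within that class.
subcompact_yearly <- aggregate(City.MPG..FT1. ~ Year, data = subcompact, FUN = mean)
print(subcompact_yearly)

# Scatter plot of individual vehicles; drawn only when ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  subcompact_plot <- ggplot(subcompact, aes(x = Year, y = City.MPG..FT1.)) +
    geom_point(alpha = 0.6) +
    labs(x = "Model year", y = "City MPG",
         title = "City fuel economy of subcompact cars")
  print(subcompact_plot)
}
```

With the real data, running unique(subset_data$Class) first would confirm the exact class labels before filtering.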
Next we select a specific brand, BMW.
The plot shows a wave-like pattern, which indicates little sustained improvement in fuel efficiency. The main change comes in 2011, when BMW starts to make a visible effort.
Next we select another brand, Audi.
Between 1994 and 2000, Audi invested enough in new technologies that the mileage of its cars improved drastically after 2000, which is why the plot shows a sharp upward trend.
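The per-brand views for BMW and Audi follow the same pattern, so one way to avoid repeating code is a small helper. The sketch below is not the original code: mean_mpg_by_year is a hypothetical function name, the Mean.MPG column is introduced for illustration, and the vehicles data frame contains made-up values.

```r
# Hypothetical helper: yearly mean of an MPG column for one make.
mean_mpg_by_year <- function(data, make, mpg_col = "City.MPG..FT1.") {
  rows <- data[data$Make == make, ]
  out <- aggregate(rows[[mpg_col]], by = list(Year = rows$Year), FUN = mean)
  names(out)[2] <- "Mean.MPG"
  out
}

# Toy stand-in for subset_data (hypothetical values).
vehicles <- data.frame(
  Year = c(1999, 1999, 2000, 2000, 1999, 2000),
  Make = c("BMW", "BMW", "BMW", "BMW", "Audi", "Audi"),
  City.MPG..FT1. = c(16, 18, 17, 19, 17, 20)
)

bmw_yearly <- mean_mpg_by_year(vehicles, "BMW")
audi_yearly <- mean_mpg_by_year(vehicles, "Audi")
print(bmw_yearly)
print(audi_yearly)
```

Each result could then be passed to ggplot() with geom_line() to draw the brand's trend. With the real data, the mpg_col argument could name a different MPG column instead.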
All of the code above looks at city mileage. Now we check what impact the technology has had on the highway mileage of Audi and BMW cars.
First we select BMW and check its highway mileage.
Then we select Audi and check its highway mileage.
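For the highway comparison, a single chart with one line per brand may be easier to read than two separate plots. The following is a sketch under that assumption and not the original code: it groups Highway.MPG..FT1. by both Year and Make. The vehicles data frame holds made-up values, and two_brands and highway_yearly are hypothetical names.

```r
# Toy stand-in for subset_data (hypothetical values).
vehicles <- data.frame(
  Year = c(1999, 1999, 2000, 2000, 1999, 2000, 2000),
  Make = c("BMW", "BMW", "BMW", "Audi", "Audi", "Audi", "Toyota"),
  Highway.MPG..FT1. = c(24, 26, 27, 29, 25, 27, 33)
)

# Keep only the two brands being compared.
two_brands <- subset(vehicles, Make %in% c("Audi", "BMW"))

# Mean highway MPG for every Year and Make combination.
highway_yearly <- aggregate(Highway.MPG..FT1. ~ Year + Make,
                            data = two_brands, FUN = mean)
print(highway_yearly)

# One line per brand; drawn only when ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  highway_plot <- ggplot(highway_yearly,
                         aes(x = Year, y = Highway.MPG..FT1., colour = Make)) +
    geom_line() +
    geom_point() +
    labs(x = "Model year", y = "Mean highway MPG", colour = "Brand",
         title = "Highway fuel economy: Audi vs BMW")
  print(highway_plot)
}
```

Mapping Make to colour puts both brands on the same axes, so differences in level and timing are directly comparable.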