This data set is based on 2016 Federal Transportation Authority energy consumption of various transportation agencies througout the USA.
library(tidyr)
## Warning: package 'tidyr' was built under R version 3.3.3
library(dplyr)
## Warning: package 'dplyr' was built under R version 3.3.3
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
data <- read.csv("Energy Consumption.csv")
head(data,30)
## ï..5.Digit.NTD.ID Legacy.NTD.ID
## 1 1 0001
## 2 1 0001
## 3 1 0001
## 4 1 0001
## 5 1 0001
## 6 1 0001
## 7 1 0001
## 8 2 0002
## 9 2 0002
## 10 2 0002
## 11 2 0002
## 12 3 0003
## 13 3 0003
## 14 3 0003
## 15 3 0003
## 16 5 0005
## 17 5 0005
## 18 6 0006
## 19 6 0006
## 20 6 0006
## 21 6 0006
## 22 7 0007
## 23 7 0007
## 24 7 0007
## 25 7 0007
## 26 7 0007
## 27 8 0008
## 28 8 0008
## 29 8 0008
## 30 8 0008
## Agency.Name Reporter.Type
## 1 King County Department of Transportation Full Reporter
## 2 King County Department of Transportation Full Reporter
## 3 King County Department of Transportation Full Reporter
## 4 King County Department of Transportation Full Reporter
## 5 King County Department of Transportation Full Reporter
## 6 King County Department of Transportation Full Reporter
## 7 King County Department of Transportation Full Reporter
## 8 Spokane Transit Authority Full Reporter
## 9 Spokane Transit Authority Full Reporter
## 10 Spokane Transit Authority Full Reporter
## 11 Spokane Transit Authority Full Reporter
## 12 Pierce County Transportation Benefit Area Authority Full Reporter
## 13 Pierce County Transportation Benefit Area Authority Full Reporter
## 14 Pierce County Transportation Benefit Area Authority Full Reporter
## 15 Pierce County Transportation Benefit Area Authority Full Reporter
## 16 Everett Transit Full Reporter
## 17 Everett Transit Full Reporter
## 18 Yakima Transit Full Reporter
## 19 Yakima Transit Full Reporter
## 20 Yakima Transit Full Reporter
## 21 Yakima Transit Full Reporter
## 22 Lane Transit District Full Reporter
## 23 Lane Transit District Full Reporter
## 24 Lane Transit District Full Reporter
## 25 Lane Transit District Full Reporter
## 26 Lane Transit District Full Reporter
## 27 Tri-County Metropolitan Transportation District of Oregon Full Reporter
## 28 Tri-County Metropolitan Transportation District of Oregon Full Reporter
## 29 Tri-County Metropolitan Transportation District of Oregon Full Reporter
## 30 Tri-County Metropolitan Transportation District of Oregon Full Reporter
## Reporting.Module Mode TOS Bio.Diesel Bunker.Fuel C.Natural.Gas
## 1 Urban VP DO NA
## 2 Urban DR PT 0 NA
## 3 Urban MB PT 0 NA
## 4 Urban FB DO 201,892 NA
## 5 Urban MB DO 0 NA
## 6 Urban SR DO NA
## 7 Urban TB DO NA
## 8 Urban VP DO NA
## 9 Urban DR DO 0 NA
## 10 Urban DR PT 0 NA
## 11 Urban MB DO 0 NA
## 12 Urban VP DO NA
## 13 Urban DR PT NA
## 14 Urban DR DO NA 25,162
## 15 Urban MB DO 0 NA 1,274,778
## 16 Urban DR DO 0 NA
## 17 Urban MB DO 0 NA
## 18 Urban DR PT NA
## 19 Urban VP DO 0 NA
## 20 Urban CB PT 0 NA
## 21 Urban MB DO 0 NA
## 22 Urban DR PT NA
## 23 Urban VP PT NA
## 24 Urban MB PT 9,764 NA
## 25 Urban MB DO 510,000 NA
## 26 Urban RB DO NA
## 27 Urban DR PT 748,873 NA
## 28 Urban LR DO NA
## 29 Urban MB DO 5,380,816 NA
## 30 Urban YR PT 106,645 NA
## Cooking.Oil Diesel.Fuel Electric.Battery Electric.Propulsion Ethanol
## 1 NA 52,825
## 2 NA 439,823
## 3 NA 33,976
## 4 NA 20,220
## 5 NA 10,151,916 226,958
## 6 NA 869,314
## 7 NA 19,655,645
## 8 NA
## 9 NA 111,563
## 10 NA 137,274
## 11 NA 1,133,980
## 12 NA
## 13 NA
## 14 NA
## 15 NA 183,096
## 16 NA 17,880
## 17 NA 307,976
## 18 NA
## 19 NA 1,168
## 20 NA 5,308
## 21 NA 169,082
## 22 NA
## 23 NA
## 24 NA 0
## 25 NA 0
## 26 NA 97,180
## 27 NA 0
## 28 NA 55,121,881
## 29 NA 0
## 30 NA 0
## Gasoline Hydrogen Kerosene Liquified.Nat.Gas Liquified.Petroleum.Gas
## 1 919,256 NA
## 2 527,723 NA 107,194
## 3 162,009 NA
## 4 NA
## 5 NA
## 6 NA
## 7 NA
## 8 69,869 NA
## 9 21,798 NA 2,726
## 10 20,583 NA
## 11 NA
## 12 336,298 NA
## 13 303,032 NA
## 14 89,011 NA
## 15 24,034 NA
## 16 104,663 NA
## 17 NA
## 18 45,779 NA
## 19 19,912 NA
## 20 2,427 NA
## 21 NA
## 22 169,473 NA
## 23 25,327 NA
## 24 224 NA
## 25 NA
## 26 NA
## 27 35,984 NA
## 28 NA
## 29 NA
## 30 NA
## Other.Fuel Other.Fuel.Description
## 1 NA NA
## 2 NA NA
## 3 NA NA
## 4 NA NA
## 5 NA NA
## 6 NA NA
## 7 NA NA
## 8 NA NA
## 9 NA NA
## 10 NA NA
## 11 NA NA
## 12 NA NA
## 13 NA NA
## 14 NA NA
## 15 NA NA
## 16 NA NA
## 17 NA NA
## 18 NA NA
## 19 NA NA
## 20 NA NA
## 21 NA NA
## 22 NA NA
## 23 NA NA
## 24 NA NA
## 25 NA NA
## 26 NA NA
## 27 NA NA
## 28 NA NA
## 29 NA NA
## 30 NA NA
This step took a while because I had a number of error messages when I was trying to add numbers together. The trick appeared to remove the commas from the numbers and use as.numeric(). I performed this for all the rows I wanted to use in the table.
Afterwards, I named a new variable energy where I used dplyr to pipe in steps, first by selecting the relevant variables, grouping by the agency name, and getting the summation of each of the fuel types. Since this first resulted in NAs all over the place, I substituted NA with 0. Normally a bad idea but I think it is ok with this data set.
data$Bio.Diesel <- as.numeric(gsub(",", "", data$Bio.Diesel))
data$C.Natural.Gas <- as.numeric(gsub(",", "", data$C.Natural.Gas))
data$Diesel.Fuel <- as.numeric(gsub(",", "", data$Diesel.Fuel))
data$Electric.Battery <- as.numeric(gsub(",", "", data$Electric.Battery))
data$Electric.Propulsion <- as.numeric(gsub(",", "", data$Electric.Propulsion))
data$Ethanol <- as.numeric(gsub(",", "", data$Ethanol))
data$Gasoline <- as.numeric(gsub(",", "", data$Gasoline))
data$Hydrogen <- as.numeric(gsub(",", "", data$Hydrogen))
data$Liquified.Nat.Gas <- as.numeric(gsub(",", "", data$Liquified.Nat.Gas))
data$Liquified.Petroleum.Gas <- as.numeric(gsub(",", "", data$Liquified.Petroleum.Gas))
energy <- data %>%
select(Agency.Name,
Mode,
TOS,
Bio.Diesel,
C.Natural.Gas,
Diesel.Fuel,
Electric.Battery,
Electric.Propulsion,
Ethanol,
Gasoline,
Hydrogen,
Liquified.Nat.Gas,
Liquified.Petroleum.Gas) %>%
group_by(Agency.Name) %>%
summarise(Diesel = sum(Diesel.Fuel),
CNG = sum(C.Natural.Gas),
Gasoline = sum(Gasoline),
Bio.Diesel = sum(Bio.Diesel),
All.Electric = sum(Electric.Battery),
Hybrid = sum(Electric.Propulsion),
Hydrogen = sum(Hydrogen),
LNG = sum(Liquified.Nat.Gas))
energy[is.na(energy)] <- 0
head(energy,20)
## # A tibble: 20 x 9
## Agency.Name Diesel CNG Gasoline
## <fctr> <dbl> <dbl> <dbl>
## 1 Academy Lines, Inc. 2158821 0 0
## 2 Access Services 0 290517 2076580
## 3 Adirondack Transit Lines, Inc, 459651 0 0
## 4 Alameda-Contra Costa Transit District 5235532 0 0
## 5 Alamo Area Council of Governments 1165 0 139087
## 6 Alaska Railroad Corporation 791966 0 0
## 7 Albany Transit System 0 0 0
## 8 Altamont Corridor Express 481901 0 0
## 9 Alternativa de Transporte Integrado -ATI 0 0 0
## 10 Altoona Metro Transit 0 0 0
## 11 Ames Transit Agency dba CyRide 367329 0 0
## 12 Anaheim Transportation Network 0 167320 0
## 13 Ann Arbor Area Transportation Authority 110052 0 0
## 14 Anne Arundel County 0 0 0
## 15 Antelope Valley Transit Authority 0 0 0
## 16 Arlington Transit - Arlington County 0 0 0
## 17 ART (Asheville Redefines Transit) 255798 0 0
## 18 Athens Transit System 260256 0 0
## 19 Audubon Area Community Services, Inc. 59 0 217436
## 20 Augusta Richmond County Transit Department 172661 0 0
## # ... with 5 more variables: Bio.Diesel <dbl>, All.Electric <dbl>,
## # Hybrid <dbl>, Hydrogen <dbl>, LNG <dbl>
Lastly, I wanted to see the total consumpution by fuel type across all agencies countrywide. The only problem is the unit of measurement. I could not find a key on the FTA website to determine if for example the consumption in diesel was in gallons, or the batter electric in kw/hrs. The largest number was the Hybrid, which common sense would tell you it cannot be the same unit of measurement as diesel.
Total.Consumption <- energy %>%
summarise(Diesel = sum(Diesel),
CNG = sum(CNG),
Gasoline = sum(Gasoline),
Bio.Diesel = sum(Bio.Diesel),
All.Electric = sum(All.Electric),
Hybrid = sum(Hybrid),
Hydrogen = sum(Hydrogen),
LNG = sum(LNG))
Total.Consumption
## # A tibble: 1 x 8
## Diesel CNG Gasoline Bio.Diesel All.Electric Hybrid Hydrogen
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 215177649 33629557 30570536 7873549 1608039 1041519833 0
## # ... with 1 more variables: LNG <dbl>
This tidying felt pretty clean and I felt the need to write it to csv.
write.csv(energy, "Transit_Energy.csv", row.names = FALSE)