Analysis of Global Energy Usage

Global Energy efficiency usage already challenges our lives. As the human population and economy grows, so does our demand for energy. Some energy from fossil fuels (like Coal, Peat and Oil Shale) unfortunately are not replaceable energy, yet currently 80% global energy comes from burning fossil fuels. An estimation for fossil fuel reserves is only 50-120 years. In addition, burning fossil fuels product more carbon dioxide, which contributes to global warming. NASA Earth observatory predicted the global warming for next century is at least 20 times faster. {link, April, 21, 2017} Since then we found the impact of shortage energy for our lives, we already spent a decade to look for renewable energy and better solution for energy conservation.

\(\textbf{Goals:}\)

  1. We propose to analyze the changes in global energy usage (replacement and replacement energy), per capital use and energy trading flows to better understand the progress for energy efficiency usage. we will study what actions need to be taken to ensure that economic stability can be maintained in the future.

  2. We will observe two largest population countries China and Indian energy consuming, since they are developing technology and economic in their highest speed. These two countries face more energy problems than other countries. In our research of data sets, we are going to discover what problem they are facing and what strategy they use.

  3. To discover which countries are energy suppliers / consumer, what relationship they bound together.

suppressMessages(suppressWarnings(library('tidyverse')))
suppressMessages(suppressWarnings(library('data.table')))
suppressMessages(suppressWarnings(library('dplyr')))
suppressMessages(suppressWarnings(library('ggplot2')))
suppressMessages(suppressWarnings(library('reshape2')))
suppressMessages(suppressWarnings(library('scales')))
suppressMessages(suppressWarnings(library('rmarkdown')))
suppressMessages(suppressWarnings(library('rvest')))
suppressMessages(suppressWarnings(library('stringr')))
suppressMessages(suppressWarnings(library('httr')))
suppressMessages(suppressWarnings(library('jsonlite')))

Methods:

Based on Energy-Questionnaire-Guidelines, all category can be divided into five energy products: Coal_Peat_Oil.Shale, Oil, Natural_Gas, Electricity_heat, Biofuels_Waste. Every subcategory for each product of all natural source are listed in the guidelines. We used join function to find the component of products by category from data set “all_energy_statistics” which can match the subcategory for every defined product from data set “fields”.

To get data set for import/export trade data for these five products, we use ‘grepl’ function to look for the words “import” and “export” in the column “commodity_transaction”. We also check if NA exists in after joined tables. Then we found “Electricity & heat” is not exist in import and export trade. Therefore the following, we only look for the other fours products transaction.

We fine the total electricity consumption for top ten countries and the 10-year growth in 2004-2014.From comparing how much changes. We also find 2014 consumption by households per capital in these countries. We use the data of population from Wiki website {link}. We calculate per capital consumption by using total_electricity (KW, million)/population of the country.

Read data tables in R and data wrangling

1.The five productes Coal_Peat_Oil.Shale, Oil, Nautal_Gas, Electricity_heat and Biofuels_Waste classify in Energy-Questionnaire-Guidelines.

#use fread function to read in large table
raw<-fread('all_energy_statistics.csv')

#eleminate low column 'footnote' for the survey
allEnergy<-raw[,-6] 

#reduce the duplicated discription in 'commodity_transaction' column
allEnergy$commodity_transaction<-gsub(".*-", "", allEnergy$commodity_transaction) 

#rename 'commodity_transaction' column as 'transaction' 
names(allEnergy)[names(allEnergy) == 'commodity_transaction'] <- 'transaction' 

#lowcase for the 'transaction' column
allEnergy$transaction <- tolower(allEnergy$transaction)

#convert dbl type to integer type for 'quantity'
allEnergy$quantity<-as.integer(allEnergy$quantity)
## Warning: NAs introduced by coercion to integer range
#select top 10 economic countries
subEnergy<- allEnergy[allEnergy$country_or_area %in% c("United States","China","Japan","Germany","United Kingdom","India","France","Brazil","Italy","Canada"), ]

fields<-fread('fields.csv')
head(fields)
##      Coal_Peat_Oil.Shale                      Oil
## 1:             hard_coal   conventional_crude_oil
## 2:            anthracite      natural_gas_liquids
## 3:           coking_coal additives_and_oxygenates
## 4: other_bituminius_coal       other_hydrocarbons
## 5:            brown_coal        aviation_gasoline
## 6:   sub_bituminous_coal           motor_gasoline
##                   Nautal Gas
## 1: natural_gas_including_lng
## 2:              gasworks_gas
## 3:         blast_furnace_gas
## 4:     other_recovered_gases
## 5:                          
## 6:                          
##                                             Electricity & heat
## 1: electricity_net_installed_capacity_of_electric_power_plants
## 2:                                                        heat
## 3:                               direct_use of_geothermal_heat
## 4:                            direct_use_of_solar_thermal_heat
## 5:                                                            
## 6:                                                            
##                       Biofuels & Waste
## 1:                            fuelwood
## 2:                            charcoal
## 3:                             bagasse
## 4:                       animal_ waste
## 5: other_vegetal_material_and_residues
## 6:                     municipal_waste
Coal_Peat_Oil.Shale<-fields[,1]
Oil<-fields[,2]
Nautal_Gas<-fields[,3]
Electricity_heat<-fields[,4]
Biofuels_Waste<-fields[,5]

CPO<-left_join(Coal_Peat_Oil.Shale,subEnergy, by = c("Coal_Peat_Oil.Shale"="category"))
OIL<-left_join(Oil,subEnergy,by = c("Oil"="category"))
NG<-left_join(Nautal_Gas, subEnergy,by = c("Nautal Gas"="category"))
EH<-left_join(Electricity_heat,subEnergy, by = c("Electricity & heat"="category"))
BW<-left_join(Biofuels_Waste,subEnergy, by = c("Biofuels & Waste"="category"))


impCPO<-CPO[grepl("imports",CPO[,3])==TRUE,]
impOIL<-OIL[grepl("imports",OIL[,3])==TRUE,]
impNG<-NG[grepl("imports",NG[,3])==TRUE,]
impBW<-BW[grepl("imports",BW[,3])==TRUE,]

expCPO<-CPO[grepl("exports",CPO[,3])==TRUE,]
expOIL<-OIL[grepl("exports",OIL[,3])==TRUE,]
expNG<-NG[grepl("exports",NG[,3])==TRUE,]
expBW<-BW[grepl("exports",BW[,3])==TRUE,]

With over 1,000,000 rows, sub-setting these data seems like a reasonable approach to analyzing them, as using the whole set would tax our computer resources. To get a good idea of where to subset, we can look at countries/areas and commodities and transactions.

Exploring the Data - Commodities, Transactions, and Catagories

The data is labeled by Countries, Commodity-Transactions, and Categories. This is a good place to start looking for ways to sub-set the data.

country <- energy[,1] %>% unique() %>% arrange(country_or_area)
## Warning: package 'bindrcpp' was built under R version 3.4.1
country
## # A tibble: 243 x 1
##        country_or_area
##                  <chr>
##  1         Afghanistan
##  2             Albania
##  3             Algeria
##  4      American Samoa
##  5             Andorra
##  6              Angola
##  7            Anguilla
##  8 Antarctic Fisheries
##  9 Antigua and Barbuda
## 10           Argentina
## # ... with 233 more rows
comm_trans <-energy[,2]  %>% unique() %>%arrange(commodity_transaction)
nrow(comm_trans)
## [1] 2336
head(comm_trans)
## # A tibble: 6 x 1
##                                    commodity_transaction
##                                                    <chr>
## 1                     Additives and Oxygenates - Exports
## 2                     Additives and Oxygenates - Imports
## 3                  Additives and Oxygenates - Production
## 4 Additives and Oxygenates - Receipts from other sources
## 5               Additives and Oxygenates - Stock changes
## 6         Additives and Oxygenates - Total energy supply
tail(comm_trans)
## # A tibble: 6 x 1
##                                                         commodity_transaction
##                                                                         <chr>
## 1      Vegetal waste - Transformation in CHP plants - main activity producers
## 2        Vegetal waste - Transformation in electricity plants - autoproducers
## 3 Vegetal waste - Transformation in electricity plants - main activity produc
## 4          Vegetal waste - Transformation in electricity, CHP and heat plants
## 5               Vegetal waste - Transformation in heat plants - autoproducers
## 6     Vegetal waste - Transformation in heat plants - main activity producers

We have 243 countries and areas and 2336 categories of commodities and transactions. The commodity-transactions are listed first by the commodity, the object being used, and then by the transactions, how that object is used. These are separated by a dash. We can use a regular expression to separate the commodities from the transactions.

#energy[,2] <- energy[,2] %>% tolower() These seemed to stall Rstudio. I look for another way to fix this.
#energy[,2] <- energy[,2] %>% str_replace("hrad","hard") #one of the cells had a spelling error
#tail(energy[,2])
comm_trans<- comm_trans %>% separate("commodity_transaction", c('commodity', 'transaction'), " - ")
## Warning: Too many values at 208 locations: 25, 68, 69, 71, 72, 74, 75, 123,
## 124, 125, 126, 168, 169, 170, 171, 209, 210, 211, 212, 215, ...
## Warning: Too few values at 28 locations: 772, 773, 774, 775, 776, 777, 778,
## 779, 780, 781, 782, 783, 784, 785, 786, 787, 788, 1039, 1040, 1041, ...
comm <- comm_trans[,1] %>% unique() 
comm # There are duplicates due to cases, spelling errors.
## # A tibble: 111 x 1
##                   commodity
##                       <chr>
##  1 Additives and Oxygenates
##  2             Animal waste
##  3             animal waste
##  4               Anthracite
##  5        Aviation gasoline
##  6                  Bagasse
##  7                Biodiesel
##  8                 Biogases
##  9              Biogasoline
## 10                  Bitumen
## # ... with 101 more rows
for(i in 1:nrow(comm[,1])){ # create a loop to get rid of case duplicates
  comm[i,1] <- comm[i,1] %>% tolower()
  comm[i,1] <- comm[i,1] %>% str_replace("hrad","hard") #one of the cells had a spelling error
}
comm <- comm %>% unique()
comm
## # A tibble: 97 x 1
##                   commodity
##                       <chr>
##  1 additives and oxygenates
##  2             animal waste
##  3               anthracite
##  4        aviation gasoline
##  5                  bagasse
##  6                biodiesel
##  7                 biogases
##  8              biogasoline
##  9                  bitumen
## 10             black liquor
## # ... with 87 more rows
transaction <- comm_trans[,2] %>% unique()
for(i in 1:nrow(transaction[,1])){ # create a loop to get rid of case duplicates
  transaction[i,1] <- transaction[i,1] %>% tolower()
}
transaction <- transaction %>% unique()
transaction
## # A tibble: 212 x 1
##                                    transaction
##                                          <chr>
##  1                                     exports
##  2                                     imports
##  3                                  production
##  4                 receipts from other sources
##  5                               stock changes
##  6                         total energy supply
##  7             transfers and recycled products
##  8                              transformation
##  9            transformation in oil refineries
## 10 consumption by commerce and public services
## # ... with 202 more rows
category <- energy[,7] %>% unique()
category
## # A tibble: 67 x 1
##                    category
##                       <chr>
##  1 additives_and_oxygenates
##  2             animal_waste
##  3               anthracite
##  4        aviation_gasoline
##  5                  bagasse
##  6                biodiesel
##  7                 biogases
##  8              biogasoline
##  9                  bitumen
## 10             black_liquor
## # ... with 57 more rows

We have now shortened our lists to 97 commodities, 212 transactions and 67 categories. These can be used with tidyr functions to create our subsets.

Subsetting Data

We can see that the above data is very large and will be difficult to deal with. Initially we ran some loops on the entire data set, but they took close to an hour to complete. This is not very practical, and sub setting the data into smaller parts may make these data more manageable.

According to www.investopedia.com 1, the top 10 world economies are:

  1. USA
  2. China
  3. Japan
  4. Germany
  5. UK
  6. India
  7. France
  8. Brazil
  9. Italy
  10. Canada

We will use these to break the data into more manageable blocks By Country:

#1 USA
usa <- energy %>% filter(country_or_area == "United States")
nrow(usa)
## [1] 18115
head(usa)
## # A tibble: 6 x 7
##   country_or_area              commodity_transaction  year
##             <chr>                              <chr> <int>
## 1   United States Additives and Oxygenates - Exports  2014
## 2   United States Additives and Oxygenates - Exports  2013
## 3   United States Additives and Oxygenates - Exports  2012
## 4   United States Additives and Oxygenates - Exports  2011
## 5   United States Additives and Oxygenates - Exports  2010
## 6   United States Additives and Oxygenates - Exports  2009
## # ... with 4 more variables: unit <chr>, quantity <dbl>,
## #   quantity_footnotes <int>, category <chr>
#2
china <- energy %>% filter(country_or_area == "China")
nrow(china)
## [1] 10603
head(china)
## # A tibble: 6 x 7
##   country_or_area                        commodity_transaction  year
##             <chr>                                        <chr> <int>
## 1           China Aviation gasoline - Consumption by transport  2014
## 2           China Aviation gasoline - Consumption by transport  2013
## 3           China Aviation gasoline - Consumption by transport  2012
## 4           China Aviation gasoline - Consumption by transport  2011
## 5           China Aviation gasoline - Consumption by transport  2010
## 6           China Aviation gasoline - Consumption by transport  2009
## # ... with 4 more variables: unit <chr>, quantity <dbl>,
## #   quantity_footnotes <int>, category <chr>
#3
japan <- energy %>% filter(country_or_area == "Japan")
nrow(japan)
## [1] 15497
head(japan)
## # A tibble: 6 x 7
##   country_or_area                                  commodity_transaction
##             <chr>                                                  <chr>
## 1           Japan Additives and Oxygenates - Receipts from other sources
## 2           Japan Additives and Oxygenates - Receipts from other sources
## 3           Japan Additives and Oxygenates - Receipts from other sources
## 4           Japan Additives and Oxygenates - Receipts from other sources
## 5           Japan Additives and Oxygenates - Receipts from other sources
## 6           Japan Additives and Oxygenates - Receipts from other sources
## # ... with 5 more variables: year <int>, unit <chr>, quantity <dbl>,
## #   quantity_footnotes <int>, category <chr>
#4
germany <- energy %>% filter(country_or_area == "Germany")
nrow(germany)
## [1] 18781
head(germany)
## # A tibble: 6 x 7
##   country_or_area                 commodity_transaction  year
##             <chr>                                 <chr> <int>
## 1         Germany Additives and Oxygenates - Production  2014
## 2         Germany Additives and Oxygenates - Production  2013
## 3         Germany Additives and Oxygenates - Production  2012
## 4         Germany Additives and Oxygenates - Production  2011
## 5         Germany Additives and Oxygenates - Production  2010
## 6         Germany Additives and Oxygenates - Production  2009
## # ... with 4 more variables: unit <chr>, quantity <dbl>,
## #   quantity_footnotes <int>, category <chr>
#5
uk <- energy %>% filter(country_or_area == "United Kingdom")
nrow(uk)
## [1] 14061
head(uk)
## # A tibble: 6 x 7
##   country_or_area                 commodity_transaction  year
##             <chr>                                 <chr> <int>
## 1  United Kingdom    Additives and Oxygenates - Exports  2014
## 2  United Kingdom    Additives and Oxygenates - Imports  2014
## 3  United Kingdom Additives and Oxygenates - Production  2014
## 4  United Kingdom Additives and Oxygenates - Production  2004
## 5  United Kingdom Additives and Oxygenates - Production  2003
## 6  United Kingdom Additives and Oxygenates - Production  2002
## # ... with 4 more variables: unit <chr>, quantity <dbl>,
## #   quantity_footnotes <int>, category <chr>
#6
india <- energy %>% filter(country_or_area == "India")
nrow(india)
## [1] 7022
head(india)
## # A tibble: 6 x 7
##   country_or_area
##             <chr>
## 1           India
## 2           India
## 3           India
## 4           India
## 5           India
## 6           India
## # ... with 6 more variables: commodity_transaction <chr>, year <int>,
## #   unit <chr>, quantity <dbl>, quantity_footnotes <int>, category <chr>
#7
france <- energy %>% filter(country_or_area == "France")
nrow(france)
## [1] 15637
head(france)
## # A tibble: 6 x 7
##   country_or_area              commodity_transaction  year
##             <chr>                              <chr> <int>
## 1          France Additives and Oxygenates - Exports  2014
## 2          France Additives and Oxygenates - Exports  2013
## 3          France Additives and Oxygenates - Exports  2012
## 4          France Additives and Oxygenates - Exports  2011
## 5          France Additives and Oxygenates - Exports  2010
## 6          France Additives and Oxygenates - Exports  2009
## # ... with 4 more variables: unit <chr>, quantity <dbl>,
## #   quantity_footnotes <int>, category <chr>
#8
brazil <- energy %>% filter(country_or_area == "Brazil")
nrow(brazil)
## [1] 9337
head(brazil)
## # A tibble: 6 x 7
##   country_or_area                        commodity_transaction  year
##             <chr>                                        <chr> <int>
## 1          Brazil Aviation gasoline - Consumption by transport  2014
## 2          Brazil Aviation gasoline - Consumption by transport  2013
## 3          Brazil Aviation gasoline - Consumption by transport  2012
## 4          Brazil Aviation gasoline - Consumption by transport  2011
## 5          Brazil Aviation gasoline - Consumption by transport  2010
## 6          Brazil Aviation gasoline - Consumption by transport  2009
## # ... with 4 more variables: unit <chr>, quantity <dbl>,
## #   quantity_footnotes <int>, category <chr>
#9
italy <- energy %>% filter(country_or_area == "Italy")
nrow(italy)
## [1] 14739
head(italy)
## # A tibble: 6 x 7
##   country_or_area              commodity_transaction  year
##             <chr>                              <chr> <int>
## 1           Italy Additives and Oxygenates - Exports  2014
## 2           Italy Additives and Oxygenates - Exports  2013
## 3           Italy Additives and Oxygenates - Exports  2012
## 4           Italy Additives and Oxygenates - Exports  2011
## 5           Italy Additives and Oxygenates - Exports  2010
## 6           Italy Additives and Oxygenates - Exports  2009
## # ... with 4 more variables: unit <chr>, quantity <dbl>,
## #   quantity_footnotes <int>, category <chr>
#10 
canada <- energy %>% filter(country_or_area == "Canada")
nrow(canada)
## [1] 13989
head(canada)
## # A tibble: 6 x 7
##   country_or_area              commodity_transaction  year
##             <chr>                              <chr> <int>
## 1          Canada Additives and Oxygenates - Imports  2014
## 2          Canada Additives and Oxygenates - Imports  2013
## 3          Canada Additives and Oxygenates - Imports  2012
## 4          Canada Additives and Oxygenates - Imports  2011
## 5          Canada Additives and Oxygenates - Imports  2010
## 6          Canada Additives and Oxygenates - Imports  2009
## # ... with 4 more variables: unit <chr>, quantity <dbl>,
## #   quantity_footnotes <int>, category <chr>

We will use these subsets to create further subsets as we perform our analysis.

Transforming Units

Units of measure are also of interest. Solid fuels will be measured by mass, and liquid and gasses by volume. Furthermore energy units can be measured in Kilowatt-hours, Power*time is Energy, or Joules, a direct measure of energy.

en_units <- energy[,4] %>% unique() %>% arrange(unit)
en_units
## # A tibble: 5 x 1
##                      unit
##                     <chr>
## 1  Cubic metres, thousand
## 2 Kilowatt-hours, million
## 3    Kilowatts,  thousand
## 4  Metric tons,  thousand
## 5              Terajoules

Kilowatt hours can be converted into Joules by the following conversion: \[ \frac{1 KW*hr}{1}*\frac{3600 s}{1 hr}*\frac{1000 J}{1 KW*s} = 3,600,000 J \\ 1 TJ = 1*10^{12} J \\ \frac{3.6x10^6 J}{1} * \frac{1 TJ}{1*10^{12}J} = 3.6*10^{-6} TJ \\ 1 TJ = \frac{1KW*hr}{3.6*10^{-6}} = \frac{1*10^6 KW*hr}{3.6} \] Note that KW*hr are measured in millions, such that \(1*10^6\space KW*hr = 3.6 TJ\), according to the above conversion. So to convert those measurements into TJ we need simply divide the Millions of \(KW*hr\) by 3.6 according to the above conversion.

# This seems to stall out my PC, so it is included, but not used.
#for(i in 1:nrow(energy)){
 # if(energy[i,4] == "Kilowatt-hours, million"){
  #  energy[i,5] = energy[i,5]/3.6
   # energy[i,4] = "Terajoules"  
#}
 # }
#This isn't useful given the above loop is too slow.
#en_units <- energy[,4] %>% unique() %>% arrange(unit)
#en_units

For any conversion of mass or volume, specific energy (Energy per mass) or Energy Density (Energy per Volume) must be factored into the measurement. The efficiency of the method of energy production must also be accounted for.

Types of Commodities

Non-renewable resources

Renewable resources

Part I: Electrical Power Production

We can use the categories: “total_electricity”, “nuclear_electricity”, “thermal_electricity”, “solar_electricity”, “tide_wave_and_ocean_electricity”, and “hydro” to analyze where the top 10 countries get their electricity.

USA

usa_total_elec <- usa %>% filter(category == "total_electricity") %>% select(commodity_transaction, year, quantity)
usa_total_elec <- usa_total_elec %>% spread(year, quantity, fill = 0) 
usa_total_elec[5,1] <- "Total"
usa_total_elec[5,-1] <- colSums(usa_total_elec[-5,-1])
usa_total_elec <- usa_total_elec %>% gather("year", "quantity", 2:26, convert=TRUE) %>% filter(commodity_transaction == "Total") 
usa_total_elec ## these numbers are less than the total thermal electrcity produced, we'll need to reevaluate.
## # A tibble: 25 x 3
##    commodity_transaction  year quantity
##                    <chr> <int>    <dbl>
##  1                 Total  1990  1103431
##  2                 Total  1991  1132813
##  3                 Total  1992  1125615
##  4                 Total  1993  1158603
##  5                 Total  1994  1201981
##  6                 Total  1995  1268160
##  7                 Total  1996  1296852
##  8                 Total  1997  1347588
##  9                 Total  1998  1378464
## 10                 Total  1999  1417272
## # ... with 15 more rows
usa_nuc_elec <- usa %>% filter(category == "nuclear_electricity") %>% select(year, quantity)
usa_nuc_elec
## # A tibble: 25 x 2
##     year quantity
##    <int>    <dbl>
##  1  2014   830584
##  2  2013   822004
##  3  2012   801129
##  4  2011   821405
##  5  2010   838931
##  6  2009   830210
##  7  2008   837804
##  8  2007   836634
##  9  2006   816195
## 10  2005   810726
## # ... with 15 more rows
usa_solar_elec <- usa %>% filter(category == "solar_electricity") %>% select(year, quantity)
usa_solar_elec
## # A tibble: 25 x 2
##     year quantity
##    <int>    <dbl>
##  1  2014    24603
##  2  2013    15872
##  3  2012    10145
##  4  2011     6153
##  5  2010     3934
##  6  2009     2514
##  7  2008     2091
##  8  2007     1673
##  9  2006     1287
## 10  2005     1120
## # ... with 15 more rows
usa_therm_elec <- usa %>% filter(category == "thermal_electricity") %>% select(year, quantity)
usa_therm_elec
## # A tibble: 25 x 2
##     year quantity
##    <int>    <dbl>
##  1  2014  2995556
##  2  2013  2985891
##  3  2012  3016908
##  4  2011  3037683
##  5  2010  3135578
##  6  2009  2965173
##  7  2008  3173013
##  8  2007  3183440
##  9  2006  3121205
## 10  2005  3149290
## # ... with 15 more rows
usa_hydro_elec <- usa %>% filter(category == "hydro") %>% select(year, quantity)
usa_hydro_elec
## # A tibble: 25 x 2
##     year quantity
##    <int>    <dbl>
##  1  2014   281527
##  2  2013   290113
##  3  2012   298287
##  4  2011   344679
##  5  2010   286333
##  6  2009   298410
##  7  2008   281995
##  8  2007   275545
##  9  2006   317689
## 10  2005   297926
## # ... with 15 more rows
usa_wave_elec <- usa %>% filter(category == "tide_wave_and_ocean_electricity") %>% select(year, quantity)
usa_wave_elec # No wave generation reported in the US.
## # A tibble: 0 x 2
## # ... with 2 variables: year <int>, quantity <dbl>
## We'll redifine total electricity by joining  table data frames.
usa_total_elec <- usa_therm_elec %>% left_join(usa_nuc_elec, by = "year")
#usa_total_elec <- usa_total_elec %>% left_join(usa_solar_elec, by = "year") #not included in EIA total
#usa_total_elec <- usa_total_elec %>% left_join(usa_hydro_elec, by = "year") #not included in EIA total
usa_total_elec <- usa_total_elec %>% mutate(total = rowSums(.[2:3]))
usa_total_elec
## # A tibble: 25 x 4
##     year quantity.x quantity.y   total
##    <int>      <dbl>      <dbl>   <dbl>
##  1  2014    2995556     830584 3826140
##  2  2013    2985891     822004 3807895
##  3  2012    3016908     801129 3818037
##  4  2011    3037683     821405 3859088
##  5  2010    3135578     838931 3974509
##  6  2009    2965173     830210 3795383
##  7  2008    3173013     837804 4010817
##  8  2007    3183440     836634 4020074
##  9  2006    3121205     816195 3937400
## 10  2005    3149290     810726 3960016
## # ... with 15 more rows
ggplot(NULL, aes(y=quantity,x=year))+ 
    geom_point(data=usa_total_elec, aes(y=total,x=year), color = "black")+
    geom_point(data=usa_nuc_elec, aes(y=quantity,x=year), color = "blue")+
    geom_point(data=usa_solar_elec, aes(y=quantity,x=year), color = "yellow3")+
    geom_point(data=usa_therm_elec, aes(y=quantity,x=year), color = "red") +
    geom_point(data=usa_hydro_elec, aes(y=quantity,x=year), color = "cyan") 

American electrical production has increased by about 36% over the past 25 years, largely by adding fossil fuel capacity. Electric production seems to be relatively steady since about 2005.

Note also that total electricity is somehow less than the amount of thermal electricity, when we would expect it to be greater than. This could be because the totals for thermal, nuclear, hydro, etc are based on how much resource goes into the process, not how much comes out, so thermodynamic efficiency is not taken into account. It could also be because the total has no category for at-home private use and that is a majority of power consumption, or the units for some of the categories are wrong, TJ instead of KW*hr, million or visa versa.

We will exclude the kaggle total from further analysis, and replace it by adding the subcategories.

Cross-Checking USA electrical production

The EIA has electrical production for the whole USA from 2001 to 2016. The units are in thousands of megawatt*hours.

\[ 1000 MW*hr = 1000*10^6 W*hr = 10^9 W*hr \\ 10^6 KW*hr = 10^6*1000 W*hr = 10^9 W*hr \] Thousands of Megawatt*hours are equivalent to millions of Kilowatts*hours. We do not need to convert these data to include with the original set.

The URL2 contains the API code so they are contained in a hidden line of code.

eia_usa <- USA %>%  fromJSON(simplifyDataFrame = TRUE) # Entire United States, the url is saved to 'USA'
eia_usa_df <- eia_usa$series$data %>% data.frame()
eia_usa_df$X1 <- eia_usa_df$X1 %>% as.character() %>% as.integer()
eia_usa_df$X2 <- eia_usa_df$X2 %>% as.character() %>% as.numeric()
eia_usa_df <- eia_usa_df %>% rename(year = X1, quantity = X2)
eia_usa_df
##    year quantity
## 1  2016  3918231
## 2  2015  3919294
## 3  2014  3937003
## 4  2013  3903715
## 5  2012  3890358
## 6  2011  3948186
## 7  2010  3972386
## 8  2009  3809837
## 9  2008  3974349
## 10 2007  4005343
## 11 2006  3908077
## 12 2005  3902192
## 13 2004  3808360
## 14 2003  3721159
## 15 2002  3698458
## 16 2001  3580053
ggplot(NULL, aes(y=quantity,x=year))+ 
    geom_point(data=eia_usa_df, aes(y=quantity,x=year), color = "black", shape = 4)+
    geom_point(data=usa_total_elec, aes(y=total,x=year), color = "black")+
    geom_point(data=usa_nuc_elec, aes(y=quantity,x=year), color = "blue")+
    geom_point(data=usa_solar_elec, aes(y=quantity,x=year), color = "yellow3")+
    geom_point(data=usa_therm_elec, aes(y=quantity,x=year), color = "red") +
    geom_point(data=usa_hydro_elec, aes(y=quantity,x=year), color = "cyan") 

usa_total_elec <- usa_total_elec[,-4] #drop the previous total
usa_total_elec <- usa_total_elec %>% left_join(usa_solar_elec, by = "year") #not included in EIA total
usa_total_elec <- usa_total_elec %>% left_join(usa_hydro_elec, by = "year") #not included in EIA total
usa_total_elec <- usa_total_elec %>% mutate(total = rowSums(.[2:5]))
usa_total_elec #This adds all catagories together.
## # A tibble: 25 x 6
##     year quantity.x quantity.y quantity.x.x quantity.y.y   total
##    <int>      <dbl>      <dbl>        <dbl>        <dbl>   <dbl>
##  1  2014    2995556     830584        24603       281527 4132270
##  2  2013    2985891     822004        15872       290113 4113880
##  3  2012    3016908     801129        10145       298287 4126469
##  4  2011    3037683     821405         6153       344679 4209920
##  5  2010    3135578     838931         3934       286333 4264776
##  6  2009    2965173     830210         2514       298410 4096307
##  7  2008    3173013     837804         2091       281995 4294903
##  8  2007    3183440     836634         1673       275545 4297292
##  9  2006    3121205     816195         1287       317689 4256376
## 10  2005    3149290     810726         1120       297926 4259062
## # ... with 15 more rows

We see that the data from EIA.gov trends with the data collected from kaggle.com, including the same dip in electrical production in the middle of the Great Recession of 2007-2012. The offsets between the EIA data in black “x” and the kaggle.com data in black dots could be due to different sampling techniques. I removed hydro electric and solar from the kaggle.com data set and the agreement between the two sets is almost exact. It appears the the EIA was only counting nuclear-thermal and fossil-fuel-thermal electrical generation.

China

china_nuc_elec <- china %>% filter(category == "nuclear_electricity")%>% select(year, quantity)
china_nuc_elec
## # A tibble: 23 x 2
##     year quantity
##    <int>    <dbl>
##  1  2014   132538
##  2  2013   111613
##  3  2012    97394
##  4  2011    86350
##  5  2010    73880
##  6  2009    70134
##  7  2008    68394
##  8  2007    62130
##  9  2006    54843
## 10  2005    53088
## # ... with 13 more rows
china_solar_elec <- china %>% filter(category == "solar_electricity")%>% select(year, quantity) 
china_solar_elec
## # A tibble: 2 x 2
##    year quantity
##   <int>    <dbl>
## 1  2014    15189
## 2  2013     5564
china_therm_elec <- china %>% filter(category == "thermal_electricity")%>% select(year, quantity)
china_therm_elec
## # A tibble: 25 x 2
##     year quantity
##    <int>    <dbl>
##  1  2014  4268649
##  2  2013  4247009
##  3  2012  3892814
##  4  2011  3833702
##  5  2010  3331928
##  6  2009  2982780
##  7  2008  2731949
##  8  2007  2722933
##  9  2006  2369603
## 10  2005  2047336
## # ... with 15 more rows
china_hydro_elec <- china %>% filter(category == "hydro")%>% select(year, quantity)
china_hydro_elec
## # A tibble: 25 x 2
##     year quantity
##    <int>    <dbl>
##  1  2014  1064337
##  2  2013   920291
##  3  2012   872107
##  4  2011   698945
##  5  2010   722172
##  6  2009   615640
##  7  2008   585187
##  8  2007   485264
##  9  2006   435786
## 10  2005   397017
## # ... with 15 more rows
china_wave_elec <- china %>% filter(category == "tide_wave_and_ocean_electricity")%>% select(year, quantity)
china_wave_elec # No wave generation reported in China.
## # A tibble: 0 x 2
## # ... with 2 variables: year <int>, quantity <dbl>
#Create a data frame for calculating total electrical output.
china_total_elec <- china_therm_elec %>% left_join(china_nuc_elec, by = "year")
china_total_elec <- china_total_elec %>% left_join(china_solar_elec, by = "year") 
china_total_elec <- china_total_elec %>% left_join(china_hydro_elec, by = "year")
china_total_elec[is.na(china_total_elec)] <- 0 #China had missing data.
china_total_elec <- china_total_elec %>% mutate(total = rowSums(.[2:5]))
china_total_elec
## # A tibble: 25 x 6
##     year quantity.x quantity.y quantity.x.x quantity.y.y   total
##    <int>      <dbl>      <dbl>        <dbl>        <dbl>   <dbl>
##  1  2014    4268649     132538        15189      1064337 5480713
##  2  2013    4247009     111613         5564       920291 5284477
##  3  2012    3892814      97394            0       872107 4862315
##  4  2011    3833702      86350            0       698945 4618997
##  5  2010    3331928      73880            0       722172 4127980
##  6  2009    2982780      70134            0       615640 3668554
##  7  2008    2731949      68394            0       585187 3385530
##  8  2007    2722933      62130            0       485264 3270327
##  9  2006    2369603      54843            0       435786 2860232
## 10  2005    2047336      53088            0       397017 2497441
## # ... with 15 more rows
ggplot(NULL, aes(y=quantity,x=year))+ 
    geom_point(data=china_total_elec, aes(y=total,x=year), color = "black")+
    geom_point(data=china_nuc_elec, aes(y=quantity,x=year), color = "blue")+
    geom_point(data=china_solar_elec, aes(y=quantity,x=year), color = "yellow3")+
    geom_point(data=china_therm_elec, aes(y=quantity,x=year), color = "red") +
    geom_point(data=china_hydro_elec, aes(y=quantity,x=year), color = "cyan") 

China’s economic explosion of the last 20 years has been largely powered by fossil fuels, with some hydro-electricity. Nuclear and Solar make up only a small fraction. There looks like a slight change in the trend of China’s electrical production during the Great Recession, however the trend is still increasing.

Japan

japan_nuc_elec <- japan %>% filter(category == "nuclear_electricity")%>% select(year, quantity)
japan_nuc_elec
## # A tibble: 25 x 2
##     year quantity
##    <int>    <dbl>
##  1  2014        0
##  2  2013     9303
##  3  2012    15939
##  4  2011   101761
##  5  2010   288230
##  6  2009   279750
##  7  2008   258128
##  8  2007   263832
##  9  2006   303426
## 10  2005   304755
## # ... with 15 more rows
japan_solar_elec <- japan %>% filter(category == "solar_electricity")%>% select(year, quantity)
japan_solar_elec
## # A tibble: 25 x 2
##     year quantity
##    <int>    <dbl>
##  1  2014    24506
##  2  2013    14279
##  3  2012     6963
##  4  2011     5160
##  5  2010     3800
##  6  2009     2758
##  7  2008     2251
##  8  2007     2015
##  9  2006     1794
## 10  2005     1493
## # ... with 15 more rows
japan_therm_elec <- japan %>% filter(category == "thermal_electricity")%>% select(year, quantity)
japan_therm_elec
## # A tibble: 25 x 2
##     year quantity
##    <int>    <dbl>
##  1  2014   921613
##  2  2013   950236
##  3  2012   950187
##  4  2011   876365
##  5  2010   758578
##  6  2009   702466
##  7  2008   732970
##  8  2007   779970
##  9  2006   696738
## 10  2005   702212
## # ... with 15 more rows
japan_hydro_elec <- japan %>% filter(category == "hydro")%>% select(year, quantity)
japan_hydro_elec
## # A tibble: 25 x 2
##     year quantity
##    <int>    <dbl>
##  1  2014    86942
##  2  2013    84923
##  3  2012    83645
##  4  2011    91709
##  5  2010    90682
##  6  2009    83832
##  7  2008    83504
##  8  2007    84234
##  9  2006    97340
## 10  2005    86350
## # ... with 15 more rows
japan_wave_elec <- japan %>% filter(category == "tide_wave_and_ocean_electricity")%>% select(year, quantity)
japan_wave_elec # No wave generation reported in Japan.
## # A tibble: 0 x 2
## # ... with 2 variables: year <int>, quantity <dbl>
#Create a data frame for calculating total electrical output.
japan_total_elec <- japan_therm_elec %>% left_join(japan_nuc_elec, by = "year")
japan_total_elec <- japan_total_elec %>% left_join(japan_solar_elec, by = "year") 
japan_total_elec <- japan_total_elec %>% left_join(japan_hydro_elec, by = "year")
japan_total_elec[is.na(japan_total_elec)] <- 0 #for had missing data.
japan_total_elec <- japan_total_elec %>% mutate(total = rowSums(.[2:5]))
japan_total_elec
## # A tibble: 25 x 6
##     year quantity.x quantity.y quantity.x.x quantity.y.y   total
##    <int>      <dbl>      <dbl>        <dbl>        <dbl>   <dbl>
##  1  2014     921613          0        24506        86942 1033061
##  2  2013     950236       9303        14279        84923 1058741
##  3  2012     950187      15939         6963        83645 1056734
##  4  2011     876365     101761         5160        91709 1074995
##  5  2010     758578     288230         3800        90682 1141290
##  6  2009     702466     279750         2758        83832 1068806
##  7  2008     732970     258128         2251        83504 1076853
##  8  2007     779970     263832         2015        84234 1130051
##  9  2006     696738     303426         1794        97340 1099298
## 10  2005     702212     304755         1493        86350 1094810
## # ... with 15 more rows
ggplot(NULL, aes(y=quantity,x=year))+
    geom_point(data=japan_total_elec, aes(y=total,x=year), color = "black")+
    geom_point(data=japan_nuc_elec, aes(y=quantity,x=year), color = "blue")+
    geom_point(data=japan_solar_elec, aes(y=quantity,x=year), color = "yellow3")+
    geom_point(data=japan_therm_elec, aes(y=quantity,x=year), color = "red") +
    geom_point(data=japan_hydro_elec, aes(y=quantity,x=year), color = "cyan") 

We can see that after the Fukishima nuclear accident (2011), Japan brought most of it’s nuclear power off-line and made up for the loss with fossil-fuels. Like the US, we also see a dip corresponding to the Great Recession.

Germany

germ_nuc_elec <- germany %>% filter(category == "nuclear_electricity")%>% select(year, quantity)
germ_nuc_elec
## # A tibble: 24 x 2
##     year quantity
##    <int>    <dbl>
##  1  2014    97129
##  2  2013    97290
##  3  2012    99460
##  4  2011   107971
##  5  2010   140556
##  6  2009   134932
##  7  2008   148495
##  8  2007   140534
##  9  2006   167269
## 10  2005   163055
## # ... with 14 more rows
germ_solar_elec <- germany %>% filter(category == "solar_electricity")%>% select(year, quantity)
germ_solar_elec
## # A tibble: 24 x 2
##     year quantity
##    <int>    <dbl>
##  1  2014    36056
##  2  2013    31010
##  3  2012    26380
##  4  2011    19599
##  5  2010    11729
##  6  2009     6584
##  7  2008     4420
##  8  2007     3075
##  9  2006     2220
## 10  2005     1282
## # ... with 14 more rows
germ_therm_elec <- germany %>% filter(category == "thermal_electricity")%>% select(year, quantity)
germ_therm_elec
## # A tibble: 24 x 2
##     year quantity
##    <int>    <dbl>
##  1  2014   409688
##  2  2013   428094
##  3  2012   423424
##  4  2011   410773
##  5  2010   413207
##  6  2009   388640
##  7  2008   418220
##  8  2007   426786
##  9  2006   408745
## 10  2005   401355
## # ... with 14 more rows
germ_hydro_elec <- germany %>% filter(category == "hydro")%>% select(year, quantity)
germ_hydro_elec
## # A tibble: 24 x 2
##     year quantity
##    <int>    <dbl>
##  1  2014    25444
##  2  2013    28782
##  3  2012    27849
##  4  2011    23511
##  5  2010    27353
##  6  2009    24682
##  7  2008    26469
##  8  2007    28084
##  9  2006    26768
## 10  2005    26417
## # ... with 14 more rows
germ_wave_elec <- germany %>% filter(category == "tide_wave_and_ocean_electricity")%>% select(year, quantity)
germ_wave_elec # No wave generation reported in Germany.
## # A tibble: 0 x 2
## # ... with 2 variables: year <int>, quantity <dbl>
#Create a data frame for calculating total electrical output.
germ_total_elec <- germ_therm_elec %>% left_join(germ_nuc_elec, by = "year")
germ_total_elec <- germ_total_elec %>% left_join(germ_solar_elec, by = "year") 
germ_total_elec <- germ_total_elec %>% left_join(germ_hydro_elec, by = "year")
germ_total_elec[is.na(germ_total_elec)] <- 0 #for had missing data.
germ_total_elec <- germ_total_elec %>% mutate(total = rowSums(.[2:5]))
germ_total_elec
## # A tibble: 24 x 6
##     year quantity.x quantity.y quantity.x.x quantity.y.y  total
##    <int>      <dbl>      <dbl>        <dbl>        <dbl>  <dbl>
##  1  2014     409688      97129        36056        25444 568317
##  2  2013     428094      97290        31010        28782 585176
##  3  2012     423424      99460        26380        27849 577113
##  4  2011     410773     107971        19599        23511 561854
##  5  2010     413207     140556        11729        27353 592845
##  6  2009     388640     134932         6584        24682 554838
##  7  2008     418220     148495         4420        26469 597604
##  8  2007     426786     140534         3075        28084 598479
##  9  2006     408745     167269         2220        26768 605002
## 10  2005     401355     163055         1282        26417 592109
## # ... with 14 more rows
ggplot(NULL, aes(y=quantity,x=year))+ 
    geom_point(data=germ_total_elec, aes(y=total,x=year), color = "black")+
    geom_point(data=germ_nuc_elec, aes(y=quantity,x=year), color = "blue")+
    geom_point(data=germ_solar_elec, aes(y=quantity,x=year), color = "yellow3")+
    geom_point(data=germ_therm_elec, aes(y=quantity,x=year), color = "red") +
    geom_point(data=germ_hydro_elec, aes(y=quantity,x=year), color = "cyan") 

Germany’s fossil fuel based electrical out has held more or less constant, it does appear that Germany is replacing nuclear based electricity with solar. Like the US, we also see a dip corresponding to the Great Recession.

UK

uk_nuc_elec <- uk %>% filter(category == "nuclear_electricity")%>% select(year, quantity)
uk_nuc_elec
## # A tibble: 25 x 2
##     year quantity
##    <int>    <dbl>
##  1  2014    63748
##  2  2013    70607
##  3  2012    70405
##  4  2011    68980
##  5  2010    62140
##  6  2009    69098
##  7  2008    52486
##  8  2007    63028
##  9  2006    75451
## 10  2005    81618
## # ... with 15 more rows
uk_solar_elec <- uk %>% filter(category == "solar_electricity")%>% select(year, quantity)
uk_solar_elec
## # A tibble: 16 x 2
##     year quantity
##    <int>    <dbl>
##  1  2014     4050
##  2  2013     1989
##  3  2012     1352
##  4  2011      244
##  5  2010       41
##  6  2009       20
##  7  2008       17
##  8  2007       14
##  9  2006       11
## 10  2005        8
## 11  2004        4
## 12  2003        3
## 13  2002        3
## 14  2001        2
## 15  2000        1
## 16  1999        1
uk_therm_elec <- uk %>% filter(category == "thermal_electricity")%>% select(year, quantity)
uk_therm_elec
## # A tibble: 25 x 2
##     year quantity
##    <int>    <dbl>
##  1  2014   230342
##  2  2013   250539
##  3  2012   263734
##  4  2011   273961
##  5  2010   302618
##  6  2009   289403
##  7  2008   320033
##  8  2007   319578
##  9  2006   309149
## 10  2005   305973
## # ... with 15 more rows
uk_hydro_elec <- uk %>% filter(category == "hydro")%>% select(year, quantity)
uk_hydro_elec
## # A tibble: 25 x 2
##     year quantity
##    <int>    <dbl>
##  1  2014     8768
##  2  2013     7606
##  3  2012     8252
##  4  2011     8585
##  5  2010     6715
##  6  2009     8915
##  7  2008     9234
##  8  2007     8936
##  9  2006     8446
## 10  2005     7852
## # ... with 15 more rows
uk_wave_elec <- uk %>% filter(category == "tide_wave_and_ocean_electricity")%>% select(year, quantity)
uk_wave_elec 
## # A tibble: 6 x 2
##    year quantity
##   <int>    <dbl>
## 1  2014        2
## 2  2013        6
## 3  2012        4
## 4  2011        1
## 5  2010        2
## 6  2009        1
#Create a data frame for calculating total electrical output.
uk_total_elec <- uk_therm_elec %>% left_join(uk_nuc_elec, by = "year")
uk_total_elec <- uk_total_elec %>% left_join(uk_solar_elec, by = "year") 
uk_total_elec <- uk_total_elec %>% left_join(uk_hydro_elec, by = "year")
uk_total_elec <- uk_total_elec %>% left_join(uk_wave_elec, by = "year")
uk_total_elec[is.na(uk_total_elec)] <- 0 #for missing data.
uk_total_elec <- uk_total_elec %>% mutate(total = rowSums(.[2:6]))
uk_total_elec
## # A tibble: 25 x 7
##     year quantity.x quantity.y quantity.x.x quantity.y.y quantity  total
##    <int>      <dbl>      <dbl>        <dbl>        <dbl>    <dbl>  <dbl>
##  1  2014     230342      63748         4050         8768        2 306910
##  2  2013     250539      70607         1989         7606        6 330747
##  3  2012     263734      70405         1352         8252        4 343747
##  4  2011     273961      68980          244         8585        1 351771
##  5  2010     302618      62140           41         6715        2 371516
##  6  2009     289403      69098           20         8915        1 367437
##  7  2008     320033      52486           17         9234        0 381770
##  8  2007     319578      63028           14         8936        0 391556
##  9  2006     309149      75451           11         8446        0 393057
## 10  2005     305973      81618            8         7852        0 395451
## # ... with 15 more rows
ggplot(NULL, aes(y=quantity,x=year))+ 
    geom_point(data=uk_total_elec, aes(y=total,x=year), color = "black")+
    geom_point(data=uk_nuc_elec, aes(y=quantity,x=year), color = "blue")+
    geom_point(data=uk_solar_elec, aes(y=quantity,x=year), color = "yellow3")+
    geom_point(data=uk_therm_elec, aes(y=quantity,x=year), color = "red") +
    geom_point(data=uk_hydro_elec, aes(y=quantity,x=year), color = "cyan") +
    geom_point(data=uk_wave_elec, aes(y=quantity,x=year), color = "violet")

Total UK electrical production seems to fluctuate sinusoidal since 1990 with a period of about 50 years, based on the graph. We seemed to have captured half a period over 25 years. Like the US, we also see a dip corresponding to the Great Recession. It looks like the UK has not recovered electrical production as of 2014.

India

india_nuc_elec <- india %>% filter(category == "nuclear_electricity")%>% select(year, quantity)
india_nuc_elec
## # A tibble: 25 x 2
##     year quantity
##    <int>    <dbl>
##  1  2014    36102
##  2  2013    34228
##  3  2012    32871
##  4  2011    32287
##  5  2010    26266
##  6  2009    18636
##  7  2008    14713
##  8  2007    16957
##  9  2006    18802
## 10  2005    17334
## # ... with 15 more rows
india_solar_elec <- india %>% filter(category == "solar_electricity")%>% select(year, quantity)
india_solar_elec
## # A tibble: 2 x 2
##    year quantity
##   <int>    <dbl>
## 1  2014     5020
## 2  2013     3433
india_therm_elec <- india %>% filter(category == "thermal_electricity")%>% select(year, quantity)
india_therm_elec
## # A tibble: 25 x 2
##     year quantity
##    <int>    <dbl>
##  1  2014  1111341
##  2  2013   980599
##  3  2012   860173
##  4  2011   813625
##  5  2010   795052
##  6  2009   755656
##  7  2008   700147
##  8  2007   664143
##  9  2006   610385
## 10  2005   572415
## # ... with 15 more rows
india_hydro_elec <- india %>% filter(category == "hydro")%>% select(year, quantity)
india_hydro_elec
## # A tibble: 25 x 2
##     year quantity
##    <int>    <dbl>
##  1  2014   129353
##  2  2013   141637
##  3  2012   113855
##  4  2011   103741
##  5  2010   114486
##  6  2009   106909
##  7  2008   114337
##  8  2007   120589
##  9  2006   113720
## 10  2005   101730
## # ... with 15 more rows
india_wave_elec <- india %>% filter(category == "tide_wave_and_ocean_electricity")%>% select(year, quantity)
india_wave_elec # No wave generation reported in India.
## # A tibble: 0 x 2
## # ... with 2 variables: year <int>, quantity <dbl>
#Create a data frame for calculating total electrical output.
india_total_elec <- india_therm_elec %>% left_join(india_nuc_elec, by = "year")
india_total_elec <- india_total_elec %>% left_join(india_solar_elec, by = "year") 
india_total_elec <- india_total_elec %>% left_join(india_hydro_elec, by = "year")
india_total_elec <- india_total_elec %>% left_join(india_wave_elec, by = "year")
india_total_elec[is.na(india_total_elec)] <- 0 #for missing data.
india_total_elec <- india_total_elec %>% mutate(total = rowSums(.[2:6]))
india_total_elec
## # A tibble: 25 x 7
##     year quantity.x quantity.y quantity.x.x quantity.y.y quantity   total
##    <int>      <dbl>      <dbl>        <dbl>        <dbl>    <dbl>   <dbl>
##  1  2014    1111341      36102         5020       129353        0 1281816
##  2  2013     980599      34228         3433       141637        0 1159897
##  3  2012     860173      32871            0       113855        0 1006899
##  4  2011     813625      32287            0       103741        0  949653
##  5  2010     795052      26266            0       114486        0  935804
##  6  2009     755656      18636            0       106909        0  881201
##  7  2008     700147      14713            0       114337        0  829197
##  8  2007     664143      16957            0       120589        0  801689
##  9  2006     610385      18802            0       113720        0  742907
## 10  2005     572415      17334            0       101730        0  691479
## # ... with 15 more rows
ggplot(NULL, aes(y=quantity,x=year))+ 
    geom_point(data=india_total_elec, aes(y=total,x=year), color = "black")+
    geom_point(data=india_nuc_elec, aes(y=quantity,x=year), color = "blue")+
    geom_point(data=india_solar_elec, aes(y=quantity,x=year), color = "yellow3")+
    geom_point(data=india_therm_elec, aes(y=quantity,x=year), color = "red") +
    geom_point(data=india_hydro_elec, aes(y=quantity,x=year), color = "cyan") 

Much like China, India has greatly increased it’s electricity output over the last 25 years by adding fossil fuel capacity. However, India does not appear to be as affected by the Great Recession as other countries. Given that approximately half of India’s landmass is in the tropics, it is surprising that India has not brought more Solar online.

France

fr_nuc_elec <- france %>% filter(category == "nuclear_electricity")%>% select(year, quantity)
fr_nuc_elec
## # A tibble: 25 x 2
##     year quantity
##    <int>    <dbl>
##  1  2014   436474
##  2  2013   423685
##  3  2012   425406
##  4  2011   442383
##  5  2010   428521
##  6  2009   409736
##  7  2008   439447
##  8  2007   439730
##  9  2006   450191
## 10  2005   451529
## # ... with 15 more rows
fr_solar_elec <- france %>% filter(category == "solar_electricity")%>% select(year, quantity)
fr_solar_elec
## # A tibble: 21 x 2
##     year quantity
##    <int>    <dbl>
##  1  2014     5909
##  2  2013     4735
##  3  2012     4016
##  4  2011     2078
##  5  2010      620
##  6  2009      174
##  7  2008       42
##  8  2007       27
##  9  2006       16
## 10  2005       13
## # ... with 11 more rows
fr_therm_elec <- france %>% filter(category == "thermal_electricity")%>% select(year, quantity)
fr_therm_elec
## # A tibble: 25 x 2
##     year quantity
##    <int>    <dbl>
##  1  2014    33484
##  2  2013    50877
##  3  2012    56722
##  4  2011    54594
##  5  2010    62010
##  6  2009    55396
##  7  2008    59500
##  8  2007    61874
##  9  2006    60279
## 10  2005    66747
## # ... with 15 more rows
fr_hydro_elec <- france %>% filter(category == "hydro")%>% select(year, quantity)
fr_hydro_elec
## # A tibble: 25 x 2
##     year quantity
##    <int>    <dbl>
##  1  2014    68626
##  2  2013    75867
##  3  2012    63594
##  4  2011    49865
##  5  2010    67525
##  6  2009    61969
##  7  2008    68368
##  8  2007    63260
##  9  2006    61742
## 10  2005    56332
## # ... with 15 more rows
fr_wave_elec <- france %>% filter(category == "tide_wave_and_ocean_electricity")%>% select(year, quantity)
fr_wave_elec 
## # A tibble: 25 x 2
##     year quantity
##    <int>    <dbl>
##  1  2014      481
##  2  2013      414
##  3  2012      458
##  4  2011      477
##  5  2010      476
##  6  2009      448
##  7  2008      465
##  8  2007      465
##  9  2006      464
## 10  2005      481
## # ... with 15 more rows
#Create a data frame for calculating total electrical output.
fr_total_elec <- fr_therm_elec %>% left_join(fr_nuc_elec, by = "year")
fr_total_elec <- fr_total_elec %>% left_join(fr_solar_elec, by = "year") 
fr_total_elec <- fr_total_elec %>% left_join(fr_hydro_elec, by = "year")
fr_total_elec <- fr_total_elec %>% left_join(fr_wave_elec, by = "year")
fr_total_elec[is.na(fr_total_elec)] <- 0 #for missing data.
fr_total_elec <- fr_total_elec %>% mutate(total = rowSums(.[2:6]))
fr_total_elec
## # A tibble: 25 x 7
##     year quantity.x quantity.y quantity.x.x quantity.y.y quantity  total
##    <int>      <dbl>      <dbl>        <dbl>        <dbl>    <dbl>  <dbl>
##  1  2014      33484     436474         5909        68626      481 544974
##  2  2013      50877     423685         4735        75867      414 555578
##  3  2012      56722     425406         4016        63594      458 550196
##  4  2011      54594     442383         2078        49865      477 549397
##  5  2010      62010     428521          620        67525      476 559152
##  6  2009      55396     409736          174        61969      448 527723
##  7  2008      59500     439447           42        68368      465 567822
##  8  2007      61874     439730           27        63260      465 565356
##  9  2006      60279     450191           16        61742      464 572692
## 10  2005      66747     451529           13        56332      481 575102
## # ... with 15 more rows
ggplot(NULL, aes(y=quantity,x=year))+ 
    geom_point(data=fr_total_elec, aes(y=total,x=year), color = "black")+
    geom_point(data=fr_nuc_elec, aes(y=quantity,x=year), color = "blue")+
    geom_point(data=fr_solar_elec, aes(y=quantity,x=year), color = "yellow3")+
    geom_point(data=fr_therm_elec, aes(y=quantity,x=year), color = "red") +
    geom_point(data=fr_hydro_elec, aes(y=quantity,x=year), color = "cyan") +
    geom_point(data=fr_wave_elec, aes(y=quantity,x=year), color = "violet")

France is the only country we examined with the majority of its power generated by nuclear. This might make it the only top world economy that is a net energy exporter than an importer. It is also the only major economy that is decreasing fossil fuel based electricity. The difference seems to be made up for by hydro and nuclear power. Like the US, we also see a dip corresponding to the Great Recession.

Brazil

braz_nuc_elec <- brazil %>% filter(category == "nuclear_electricity")%>% select(year, quantity)
braz_nuc_elec
## # A tibble: 25 x 2
##     year quantity
##    <int>    <dbl>
##  1  2014    15378
##  2  2013    15450
##  3  2012    16038
##  4  2011    15659
##  5  2010    14523
##  6  2009    12957
##  7  2008    13969
##  8  2007    12350
##  9  2006    13754
## 10  2005     9855
## # ... with 15 more rows
braz_solar_elec <- brazil %>% filter(category == "solar_electricity")%>% select(year, quantity)
braz_solar_elec
## # A tibble: 2 x 2
##    year quantity
##   <int>    <dbl>
## 1  2014       16
## 2  2013        7
braz_therm_elec <- brazil %>% filter(category == "thermal_electricity")%>% select(year, quantity)
braz_therm_elec
## # A tibble: 25 x 2
##     year quantity
##    <int>    <dbl>
##  1  2014   189497
##  2  2013   157810
##  3  2012   116068
##  4  2011    85062
##  5  2010    95809
##  6  2009    60975
##  7  2008    78758
##  8  2007    58084
##  9  2006    56777
## 10  2005    55626
## # ... with 15 more rows
braz_hydro_elec <- brazil %>% filter(category == "hydro")%>% select(year, quantity)
braz_hydro_elec
## # A tibble: 25 x 2
##     year quantity
##    <int>    <dbl>
##  1  2014   373439
##  2  2013   390992
##  3  2012   415342
##  4  2011   428333
##  5  2010   403289
##  6  2009   390988
##  7  2008   369556
##  8  2007   374015
##  9  2006   348805
## 10  2005   337457
## # ... with 15 more rows
braz_wave_elec <- brazil %>% filter(category == "tide_wave_and_ocean_electricity")%>% select(year, quantity)
braz_wave_elec # No wave generation reported.
## # A tibble: 0 x 2
## # ... with 2 variables: year <int>, quantity <dbl>
#Create a data frame for calculating total electrical output.
braz_total_elec <- braz_therm_elec %>% left_join(braz_nuc_elec, by = "year")
braz_total_elec <- braz_total_elec %>% left_join(braz_solar_elec, by = "year") 
braz_total_elec <- braz_total_elec %>% left_join(braz_hydro_elec, by = "year")
braz_total_elec <- braz_total_elec %>% left_join(braz_wave_elec, by = "year")
braz_total_elec[is.na(braz_total_elec)] <- 0 #for missing data.
braz_total_elec <- braz_total_elec %>% mutate(total = rowSums(.[2:6]))
braz_total_elec
## # A tibble: 25 x 7
##     year quantity.x quantity.y quantity.x.x quantity.y.y quantity  total
##    <int>      <dbl>      <dbl>        <dbl>        <dbl>    <dbl>  <dbl>
##  1  2014     189497      15378           16       373439        0 578330
##  2  2013     157810      15450            7       390992        0 564259
##  3  2012     116068      16038            0       415342        0 547448
##  4  2011      85062      15659            0       428333        0 529054
##  5  2010      95809      14523            0       403289        0 513621
##  6  2009      60975      12957            0       390988        0 464920
##  7  2008      78758      13969            0       369556        0 462283
##  8  2007      58084      12350            0       374015        0 444449
##  9  2006      56777      13754            0       348805        0 419336
## 10  2005      55626       9855            0       337457        0 402938
## # ... with 15 more rows
ggplot(NULL, aes(y=quantity,x=year))+ 
    geom_point(data=braz_total_elec, aes(y=total,x=year), color = "black")+
    geom_point(data=braz_nuc_elec, aes(y=quantity,x=year), color = "blue")+
    geom_point(data=braz_solar_elec, aes(y=quantity,x=year), color = "yellow3")+
    geom_point(data=braz_therm_elec, aes(y=quantity,x=year), color = "red") +
    geom_point(data=braz_hydro_elec, aes(y=quantity,x=year), color = "cyan") 

Brazil is the first country in our analysis that produces most of their electricity with renewable hydroelectric energy. Also, given Brazil’s low latitudes; the Northern part of the country intersects the Equator, it is surprising that Germany has brought more Solar online. Brazil gets much more direct sunlight. In terms of sustainability it looks like Brazil is replacing hydroelectric plants with fossil fuels. There may be short-term economic benefit to this, but in the long run this is bad for energy sustainability.

Italy

it_nuc_elec <- italy %>% filter(category == "nuclear_electricity")%>% select(year, quantity)
it_nuc_elec #Italy has no Nucleat capacity
## # A tibble: 0 x 2
## # ... with 2 variables: year <int>, quantity <dbl>
it_solar_elec <- italy %>% filter(category == "solar_electricity")%>% select(year, quantity)
it_solar_elec
## # A tibble: 25 x 2
##     year quantity
##    <int>    <dbl>
##  1  2014    22306
##  2  2013    21589
##  3  2012    18862
##  4  2011    10796
##  5  2010     1906
##  6  2009      676
##  7  2008      193
##  8  2007       38
##  9  2006       35
## 10  2005       31
## # ... with 15 more rows
it_therm_elec <- italy %>% filter(category == "thermal_electricity")%>% select(year, quantity)
it_therm_elec
## # A tibble: 25 x 2
##     year quantity
##    <int>    <dbl>
##  1  2014   175510
##  2  2013   192237
##  3  2012   216811
##  4  2011   227711
##  5  2010   230472
##  6  2009   226033
##  7  2008   260414
##  8  2007   264744
##  9  2006   261137
## 10  2005   251958
## # ... with 15 more rows
it_hydro_elec <- italy %>% filter(category == "hydro")%>% select(year, quantity)
it_hydro_elec
## # A tibble: 25 x 2
##     year quantity
##    <int>    <dbl>
##  1  2014    60256
##  2  2013    54672
##  3  2012    43854
##  4  2011    47757
##  5  2010    54406
##  6  2009    53443
##  7  2008    47227
##  8  2007    38482
##  9  2006    43425
## 10  2005    42927
## # ... with 15 more rows
it_wave_elec <- italy %>% filter(category == "tide_wave_and_ocean_electricity")%>% select(year, quantity)
it_wave_elec # No wave generation reported.
## # A tibble: 0 x 2
## # ... with 2 variables: year <int>, quantity <dbl>
#Create a data frame for calculating total electrical output.
it_total_elec <- it_therm_elec %>% left_join(it_nuc_elec, by = "year")
it_total_elec <- it_total_elec %>% left_join(it_solar_elec, by = "year") 
it_total_elec <- it_total_elec %>% left_join(it_hydro_elec, by = "year")
it_total_elec <- it_total_elec %>% left_join(it_wave_elec, by = "year")
it_total_elec[is.na(it_total_elec)] <- 0 #for missing data.
it_total_elec <- it_total_elec %>% mutate(total = rowSums(.[2:6]))
it_total_elec
## # A tibble: 25 x 7
##     year quantity.x quantity.y quantity.x.x quantity.y.y quantity  total
##    <int>      <dbl>      <dbl>        <dbl>        <dbl>    <dbl>  <dbl>
##  1  2014     175510          0        22306        60256        0 258072
##  2  2013     192237          0        21589        54672        0 268498
##  3  2012     216811          0        18862        43854        0 279527
##  4  2011     227711          0        10796        47757        0 286264
##  5  2010     230472          0         1906        54406        0 286784
##  6  2009     226033          0          676        53443        0 280152
##  7  2008     260414          0          193        47227        0 307834
##  8  2007     264744          0           38        38482        0 303264
##  9  2006     261137          0           35        43425        0 304597
## 10  2005     251958          0           31        42927        0 294916
## # ... with 15 more rows
ggplot(NULL, aes(y=quantity,x=year))+ 
    geom_point(data=it_total_elec, aes(y=total,x=year), color = "black")+
    geom_point(data=it_nuc_elec, aes(y=quantity,x=year), color = "blue")+
    geom_point(data=it_solar_elec, aes(y=quantity,x=year), color = "yellow3")+
    geom_point(data=it_therm_elec, aes(y=quantity,x=year), color = "red") +
    geom_point(data=it_hydro_elec, aes(y=quantity,x=year), color = "cyan") 

Italy seems to be the hardest hit by the Great Recession. Their energy production decreases in 2007 and never recovers, similar to the UK. Italy also is bringing the \(2^{nd}\) Solar online than other countries in the survey having 22,306 kw*hr, million in 2014 compared to Germany’s 36,056 kw*hr, million.

Canada

can_nuc_elec <- canada %>% filter(category == "nuclear_electricity")%>% select(year, quantity)
can_nuc_elec
## # A tibble: 25 x 2
##     year quantity
##    <int>    <dbl>
##  1  2014   107678
##  2  2013   103439
##  3  2012    94862
##  4  2011    93589
##  5  2010    90658
##  6  2009    90091
##  7  2008    96020
##  8  2007    93482
##  9  2006    97964
## 10  2005    92040
## # ... with 15 more rows
can_solar_elec <- canada %>% filter(category == "solar_electricity")%>% select(year, quantity)
can_solar_elec
## # A tibble: 23 x 2
##     year quantity
##    <int>    <dbl>
##  1  2014     1756
##  2  2013     1499
##  3  2012      319
##  4  2011      260
##  5  2010      117
##  6  2009        5
##  7  2008       35
##  8  2007       26
##  9  2006       21
## 10  2005       17
## # ... with 13 more rows
can_therm_elec <- canada %>% filter(category == "thermal_electricity")%>% select(year, quantity)
can_therm_elec
## # A tibble: 25 x 2
##     year quantity
##    <int>    <dbl>
##  1  2014   139402
##  2  2013   143406
##  3  2012   146063
##  4  2011   150042
##  5  2010   144963
##  6  2009   140817
##  7  2008   152054
##  8  2007   162288
##  9  2006   154030
## 10  2005   166022
## # ... with 15 more rows
can_hydro_elec <- canada %>% filter(category == "hydro")%>% select(year, quantity)
can_hydro_elec
## # A tibble: 25 x 2
##     year quantity
##    <int>    <dbl>
##  1  2014   382574
##  2  2013   391861
##  3  2012   380340
##  4  2011   375797
##  5  2010   351461
##  6  2009   368762
##  7  2008   377561
##  8  2007   367695
##  9  2006   352963
## 10  2005   362031
## # ... with 15 more rows
can_wave_elec <- canada %>% filter(category == "tide_wave_and_ocean_electricity")%>% select(year, quantity)
can_wave_elec # No wave generation reported.
## # A tibble: 25 x 2
##     year quantity
##    <int>    <dbl>
##  1  2014       16
##  2  2013       15
##  3  2012       27
##  4  2011       26
##  5  2010       28
##  6  2009       30
##  7  2008       15
##  8  2007       23
##  9  2006       19
## 10  2005       28
## # ... with 15 more rows
#Create a data frame for calculating total electrical output.
can_total_elec <- can_therm_elec %>% left_join(can_nuc_elec, by = "year")
can_total_elec <- can_total_elec %>% left_join(can_solar_elec, by = "year") 
can_total_elec <- can_total_elec %>% left_join(can_hydro_elec, by = "year")
can_total_elec <- can_total_elec %>% left_join(can_wave_elec, by = "year")
can_total_elec[is.na(can_total_elec)] <- 0 #for missing data.
can_total_elec <- can_total_elec %>% mutate(total = rowSums(.[2:6]))
can_total_elec
## # A tibble: 25 x 7
##     year quantity.x quantity.y quantity.x.x quantity.y.y quantity  total
##    <int>      <dbl>      <dbl>        <dbl>        <dbl>    <dbl>  <dbl>
##  1  2014     139402     107678         1756       382574       16 631426
##  2  2013     143406     103439         1499       391861       15 640220
##  3  2012     146063      94862          319       380340       27 621611
##  4  2011     150042      93589          260       375797       26 619714
##  5  2010     144963      90658          117       351461       28 587227
##  6  2009     140817      90091            5       368762       30 599705
##  7  2008     152054      96020           35       377561       15 625685
##  8  2007     162288      93482           26       367695       23 623514
##  9  2006     154030      97964           21       352963       19 604997
## 10  2005     166022      92040           17       362031       28 620138
## # ... with 15 more rows
ggplot(NULL, aes(y=quantity,x=year))+ 
    geom_point(data=can_total_elec, aes(y=total,x=year), color = "black")+
    geom_point(data=can_nuc_elec, aes(y=quantity,x=year), color = "blue")+
    geom_point(data=can_solar_elec, aes(y=quantity,x=year), color = "yellow3")+
    geom_point(data=can_therm_elec, aes(y=quantity,x=year), color = "red") +
    geom_point(data=can_hydro_elec, aes(y=quantity,x=year), color = "cyan") +
    geom_point(data=can_wave_elec, aes(y=quantity,x=year), color = "violet")

Like Brazil, Canada gets most of its electricity from Hydro-electric plants. Also, we do see the Great Recession dip. It is also interesting to see that Canada seems to swap nuclear power and fossil fuel power.

ggplot(can_total_elec, aes(y=quantity.y, x=quantity.x))+
  geom_point(color="steelblue")+
  geom_smooth(method = "lm")

cor.test(can_total_elec$quantity.x,can_total_elec$quantity.y, method = "pearson")
## 
##  Pearson's product-moment correlation
## 
## data:  can_total_elec$quantity.x and can_total_elec$quantity.y
## t = -1.844, df = 23, p-value = 0.0781
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.66037322  0.04223113
## sample estimates:
##        cor 
## -0.3588885

Upon further examination, Canada’s anti-correlation between seems to be marginal, as it is significant at the 90% confidence level, but not the 95% confidence level.

Energy Analysis Conclusions

We can look at power totals side by side:

countries<- c("USA", "China", "Japan", "Germany", "UK", "India", "France", "Brazil", "Italy", "Canada")
totals<- matrix(as.numeric(c(usa_total_elec[1,6],china_total_elec[1,6],japan_total_elec[1,6], germ_total_elec[1,6],uk_total_elec[1,7], india_total_elec[1,7], fr_total_elec[1,7], braz_total_elec[1,7],it_total_elec[1,7],can_total_elec[1,7]), ncol = 1))
barplot(t(totals),names.arg = countries, horiz = TRUE, cex.names= 0.75,las=1, xlab = "2014 Total Electical Production (10^6 kw*hr)")

usa_china = totals[1,1]+totals[2,1]
japan_thru_canada = sum(totals[3:10,1])
totals2 <- matrix(c(usa_china,japan_thru_canada), ncol = 1)
barplot(t(totals2), names.arg = c("USA and China", "Japan to Canada"), horiz = TRUE, cex.names = 0.5, las=1,xlab = "USA and China Compared to Total of Other Top 10 Economies (10^6 kw*hr)")

We can see that the USA and China combined produce much more electricity than other top 10 economies in 2014. We also found that only 3 countries use something other than fossil fuels as their primary power supply. These are Brazil (Hydro), Canada (Hydro), and France (Nuclear).

Growth

The growth of electrical production between 1990 and 2014 will be modeled by the ratio of electricity produced in 1990 to the amount of electricity produced in 2014.

ratio <- c(as.numeric(usa_total_elec[1,6]/usa_total_elec[25,6]),
            as.numeric(china_total_elec[1,6]/china_total_elec[25,6]),
            as.numeric(japan_total_elec[1,6]/japan_total_elec[25,6]),
            as.numeric(germ_total_elec[1,6]/germ_total_elec[24,6]),
            as.numeric(uk_total_elec[1,7]/uk_total_elec[25,7]),
            as.numeric(india_total_elec[1,7]/india_total_elec[25,7]),
            as.numeric(fr_total_elec[1,7]/fr_total_elec[25,7]),
            as.numeric(braz_total_elec[1,7]/braz_total_elec[25,7]),
            as.numeric(it_total_elec[1,7]/it_total_elec[25,7]),
            as.numeric(can_total_elec[1,7]/can_total_elec[25,7]))
growth <- data.frame(countries,ratio ,stringsAsFactors = FALSE)
growth <- growth %>%  arrange(desc(ratio))
barplot(growth$ratio, names.arg = growth$countries,xlab = "Ratio of Electric Energy Generation 2014:1990" ,horiz = TRUE, cex.names = 0.75 ,las=1)

China India, and Brazil have also shown the most growth in electrical energy production and merit detailed analysis. The UK is the only top 10 economy to show a loss of electrical production in the 25 year period.

Part II: Product Import & export Outlook:

1. Coal_Peat_Oil.Shale, Oil

Total Coal_Peat_Oil Net Trade (Export-Import):

Japan was the biggest buyer before 2011, with steady grew from 190,000 thousand metric tons in 1990, and had max import 250,000 in 2004, became constant after. China was exporting the product before 2008 and changed grew direction in a sudden. China became the largest buyer within four years, with max import above 300,000 2013. The following is India, third largest buyer showing strong grew in import after 2010. United States and Canada kept exporting all the time during 1990-2014. United States was the biggest seller in most of the time. Yet it reduced export amount from 1991, it was still the major supplier. After 2006, United States started to enlarge its export amount. Second suppler Canada kept 30,000-50,000 thousands metric tons per year, with steady export amount all the time. Germany, France, Italy and Brazil had constant small import each year.

impCPO.country<-impCPO %>% group_by(country_or_area,year) %>% summarise(totalImport=sum(quantity)) 
expCPO.country<-expCPO %>% group_by(country_or_area,year) %>% summarise(totalExport=sum(quantity)) 
netCPO<-merge(impCPO.country,expCPO.country, c("country_or_area","year"))
netCPO<-mutate(netCPO,netTransation=totalExport - totalImport)

ggplot(data=netCPO, aes(x=year, y=netTransation, group = country_or_area, colour = country_or_area)) +
    geom_line() +
    geom_point( size=1, shape=10, fill="white")+
    ggtitle("Total Coal_Peat_Oil.Shale net trade (Export-Import), Metric tons,thousand") +
    scale_y_continuous(labels = comma)

2. Oil

Total Oil Net Trade:

United States is the largest buyer, kept grew amount from 360,000 thousand metric tons in 2009 and reached above 600,000 in 2005, then reduced the amount each year, yet it kept its first place until to 2014. Japan is the second largest buyer from 1990-2008, with steady import all the time, and slightly reduced the amount after 2010. China now becomes the biggest buyer in 2014, surpassed United States in 2014, and Japan in 2008. China had almost zero import before 1996, and had kept strong grew after. China imported 300,000 thousand metric tons oil in 2014. India also showed upward trend in importing more oil after 2005, from 200,000. Germany, Italy, France had constant import at around 100,000 per year, but started to reduce the amount after 2006. Canada is the only one country kept exporting oil at all the time, with small grew every year, and tended to increase the amount after 2010. United Kingdom kept selling oil before 2005, and became buyer after.

impOIL.country<-impOIL %>% group_by(country_or_area,year) %>% summarise(totalImport=sum(quantity)) 
expOIL.country<-expOIL %>% group_by(country_or_area,year) %>% summarise(totalExport=sum(quantity)) 
netOIL<-merge(impOIL.country,expOIL.country, c("country_or_area","year"))
netOIL<-mutate(netOIL,netTransation=totalExport - totalImport)

ggplot(data=netOIL, aes(x=year, y=netTransation, group = country_or_area, colour = country_or_area)) +
    geom_line() +
    geom_point( size=1, shape=10, fill="white")+
    ggtitle("Total Oil net trade (Export-Import), Metric tons, thousand")+
    scale_y_continuous(labels = comma)

3.Nautal Gas

Total Nautal Gas Net Trade:

United States is the biggest buyer from 1992-2013 with max import over 4,000,000 Terajoules in 2007 then reduced to 1,200,000 in 2014 ranking at fourth. Germany is the second largest buyer in most of time, yet its constantly import brought to the first place in 2014. Right after Germany, Italy had smooth grew before 2008 and surpassed united states in 2010. France had pretty steady grew and ranked at fourth before 2013. China showed very strong grew after 2008 which ranked at third in 2014. Similar to China, United Kingdom had enlarged their import from 2005. Japan did not have data in natural gas net trade. Canada is not only the biggest supplier, also the only country export Natural gas.

impNG.country<-impNG %>% group_by(country_or_area,year) %>% summarise(totalImport=sum(quantity)) 
expNG.country<-expNG %>% group_by(country_or_area,year) %>% summarise(totalExport=sum(quantity)) 
netNG<-merge(impNG.country,expNG.country, c("country_or_area","year"))
netNG<-mutate(netNG,netTransation=totalExport - totalImport)

ggplot(data=netNG, aes(x=year, y=netTransation, group = country_or_area, colour = country_or_area)) +
    geom_line() +
    geom_point( size=1, shape=10, fill="white")+
    ggtitle("Total Natural Gas net trade (Export-Import), Terajoules")+
    scale_y_continuous(labels = comma)

4. Biofuels & Waste

Total Biofuels & Waste Net Trade:

There are three forms import sources in this fields, United Kingdom took a lot import started from 2009 from 15,000 Terajoules. It grew with almost vertical speed, imported 70,000 Terajoules in 2014. Italy started to import biofuels and waste energy from 1998, and went up to 8,000 thousand cubic meters in 2014. Japan was out of this business. Others also had import or export, but the quantity of the trades are too small to be concerned.

impBW.country<-impBW %>% group_by(unit,country_or_area,year) %>% summarise(totalImport=sum(quantity)) 
expBW.country<-expBW %>% group_by(unit,country_or_area,year) %>% summarise(totalExport=sum(quantity)) 
netBW<-merge(impBW.country,expBW.country, c("country_or_area","year","unit"))
netBW<-mutate(netBW,netTransation=totalExport - totalImport)

ggplot(data=netBW, aes(x=year, y=netTransation, group = country_or_area, colour = country_or_area)) +
    geom_line() +
    geom_point( size=1, shape=10, fill="white")+
    ggtitle("Total Biofuels & Waste net trade (Export-Import)")+
    facet_grid(. ~ unit) +theme(axis.text.x = element_text(angle = 90, hjust = 1))+
    scale_y_continuous(labels = comma)

Part III: Electricity Consumption by hourseholds:

2014 vs 2004 total electricity consumption for households

For 2014 the total electricity consumption for households’ data set, United States ranked at the first place with 1,416,977 million Kilowatt-hours and 10% grow in 10 years. United States almost double the total amount of the second country China, which had 717, 610 consumption of million Kilowatt-hours and two-time growth in past 10 years. The following is Japan which had 273,938 million Kilowatt-hours and constant consumption in 10 years. India ranked at fourth with 217,405 million Kilowatt-hours and 1.27 times growth in past 10 years. Canada is at fifth with 161,568 million Kilowatt-hours and 7% growth. France is at sixth with 149,426 million Kilowatt-hours and 4% growth. Brazil is at seventh with 132,302 million Kilowatt-hours and 68% the third fastest growth in past 10 years. The next three countries ranking in order are Germany, United Kingdom and Italy, with 129,600, 108,881 and 64,255 separately. These three countries reduced the total electricity consumption for households in past 10 years, with 8%, 12% and 4% dropped in usage.

electricity2004<-subEnergy %>% filter(year=="2004") %>% filter(category=="total_electricity") 
electricity2004<-electricity2004[grepl("households",electricity2004[,2])==TRUE,]
electricity2004<-electricity2004[order(electricity2004$quantity,decreasing = TRUE),]
electricity2004<-electricity2004[order(electricity2004$quantity,decreasing = TRUE),]

electricity2014<-subEnergy %>% filter(year=="2014") %>% filter(category=="total_electricity") 
electricity2014<-electricity2014[grepl("households",electricity2014[,2])==TRUE,]
electricity2014<-electricity2014[order(electricity2014$quantity,decreasing = TRUE),]
electricity2014<-electricity2014[order(electricity2014$quantity,decreasing = TRUE),]

e<-merge(electricity2014,electricity2004, c("country_or_area"))
e<-data.frame(e$country_or_area,e$quantity.x,e$quantity.y,e$unit.x,e$transaction.x)
colnames(e)<- c("country_or_area","2014_quantity","2004_quantity","unit","transation")

#add new column "growth rate"
e["2004-2014growth"] <-round((e$`2014_quantity`-e$`2004_quantity`)/e$`2004_quantity`,2)

e<-e[order(e$`2014_quantity`,decreasing = TRUE),]

e
##   country_or_area 2014_quantity 2004_quantity                    unit
## 3           China        717610        238449 Kilowatt-hours, million
## 8           Japan        273938        273923 Kilowatt-hours, million
## 6           India        217405         95660 Kilowatt-hours, million
## 2          Canada        161568        150978 Kilowatt-hours, million
## 4          France        149426        143380 Kilowatt-hours, million
## 1          Brazil        132302         78577 Kilowatt-hours, million
## 5         Germany        129600        140400 Kilowatt-hours, million
## 7           Italy         64255         66592 Kilowatt-hours, million
##                   transation 2004-2014growth
## 3  consumption by households            2.01
## 8  consumption by households            0.00
## 6  consumption by households            1.27
## 2  consumption by households            0.07
## 4  consumption by households            0.04
## 1  consumption by households            0.68
## 5  consumption by households           -0.08
## 7  consumption by households           -0.04
ggplot(data=e, aes(x=reorder(country_or_area, -`2004-2014growth`), y=`2004-2014growth`,fill=country_or_area)) +
  geom_bar(stat="identity")+
  geom_text(aes(label=`2004-2014growth`), vjust=-1, color="black", size=3)+  
  ggtitle("Electricity Consumption by hourseholds 2004-2014 growth rate") +
  xlab("country_or_area)") + ylab("2004-2014growth")

For the graph of Per_Capital_Use(Kilowatt-hours), Canada and United States are at first and second places with very close amount of usage per capital. Their usage are close double amount of France and Japan usage per capital. China and India are at ninth and tenth, only 11.7% and 3.7% of the usage per capital in Canada.

url<-("https://en.wikipedia.org/wiki/List_of_countries_and_dependencies_by_population")
temp <- url %>% 
  html %>%
  html_nodes("table")
## Warning: 'html' is deprecated.
## Use 'read_html' instead.
## See help("Deprecated")
population<-html_table(temp[1])

#formate 'population' table in r
population<-data.frame(population)

#rename 'Country (or dependent territory)' column as 'Country' 
colnames(population)[2] <- "Country"

#remove [note] from country names in 'Country' column
population$Country <-gsub("\\[.*","",population$Country)

#remove "," from 'Population' column string, then covert char to interger data type
population$Population<-gsub("[^0-9]", "", population$Population) 
population$Population<-as.integer(as.character(population$Population))

population10<- population[population$Country %in% c("United States","China","Japan","Germany","United Kingdom","India","France","Brazil","Italy","Canada"), ]

population10<-data.frame(population10$Country,population10$Population)
#rename all column names  
colnames(population10) <- c("country_or_area","Population")

p<-merge(electricity2014,population10, c("country_or_area"))

#add new column "Per_Capital_Use"
p["Per_Capital_KW_hr"] <-round(p$quantity/p$Population*1000000,2)

p<-p[order(p$quantity,decreasing = TRUE),]

ggplot(data=p, aes(x=reorder(country_or_area, Per_Capital_KW_hr), y=Per_Capital_KW_hr,fill=country_or_area)) +
  geom_bar(stat="identity")+
  geom_text(aes(label=Per_Capital_KW_hr), vjust=-1, color="black", size=3)+  
  ggtitle("Electricity Consumption by hourseholds per capital") +
  xlab("Per_Capital_Use(Kilowatt-hours)") + ylab("country")+
  coord_flip()

References