We will begin by loading the data obtained from kaggle.com1 into a table data frame.
suppressMessages(suppressWarnings(library('tidyverse')))
energy <- 'all_energy_statistics.csv' %>% read.csv(stringsAsFactors = FALSE) %>% tbl_df()
head(energy)
## # A tibble: 6 x 7
## country_or_area commodity_transaction year
## <chr> <chr> <int>
## 1 Austria Additives and Oxygenates - Exports 1996
## 2 Austria Additives and Oxygenates - Exports 1995
## 3 Belgium Additives and Oxygenates - Exports 2014
## 4 Belgium Additives and Oxygenates - Exports 2013
## 5 Belgium Additives and Oxygenates - Exports 2012
## 6 Belgium Additives and Oxygenates - Exports 2011
## # ... with 4 more variables: unit <chr>, quantity <dbl>,
## # quantity_footnotes <int>, category <chr>
nrow(energy)
## [1] 1048575
ncol(energy)
## [1] 7
With over 1,000,000 rows, sub-setting these data seems like a reasonable approach to analyzing them, as using the whole set would tax our computer resources. To get a good idea of where to subset, we can look at countries/areas and commodities and transactions.
The data is labeled by Countries, Commodity-Transactions, and Categories. This is a good place to start looking for ways to sub-set the data.
country <- energy[,1] %>% unique() %>% arrange(country_or_area)
## Warning: package 'bindrcpp' was built under R version 3.4.1
country
## # A tibble: 243 x 1
## country_or_area
## <chr>
## 1 Afghanistan
## 2 Albania
## 3 Algeria
## 4 American Samoa
## 5 Andorra
## 6 Angola
## 7 Anguilla
## 8 Antarctic Fisheries
## 9 Antigua and Barbuda
## 10 Argentina
## # ... with 233 more rows
comm_trans <-energy[,2] %>% unique() %>%arrange(commodity_transaction)
nrow(comm_trans)
## [1] 2336
head(comm_trans)
## # A tibble: 6 x 1
## commodity_transaction
## <chr>
## 1 Additives and Oxygenates - Exports
## 2 Additives and Oxygenates - Imports
## 3 Additives and Oxygenates - Production
## 4 Additives and Oxygenates - Receipts from other sources
## 5 Additives and Oxygenates - Stock changes
## 6 Additives and Oxygenates - Total energy supply
tail(comm_trans)
## # A tibble: 6 x 1
## commodity_transaction
## <chr>
## 1 Vegetal waste - Transformation in CHP plants - main activity producers
## 2 Vegetal waste - Transformation in electricity plants - autoproducers
## 3 Vegetal waste - Transformation in electricity plants - main activity produc
## 4 Vegetal waste - Transformation in electricity, CHP and heat plants
## 5 Vegetal waste - Transformation in heat plants - autoproducers
## 6 Vegetal waste - Transformation in heat plants - main activity producers
We have 243 countries and areas and 2336 categories of commodities and transactions. The commodity-transactions are listed first by the commodity, the object being used, and then by the transactions, how that object is used. These are separated by a dash. We can use a regular expression to separate the commodities from the transactions.
suppressMessages(suppressWarnings(library(stringr)))
#energy[,2] <- energy[,2] %>% tolower() These seemed to stall Rstudio. I look for another way to fix this.
#energy[,2] <- energy[,2] %>% str_replace("hrad","hard") #one of the cells had a spelling error
#tail(energy[,2])
comm_trans<- comm_trans %>% separate("commodity_transaction", c('commodity', 'transaction'), " - ")
## Warning: Too many values at 208 locations: 25, 68, 69, 71, 72, 74, 75, 123,
## 124, 125, 126, 168, 169, 170, 171, 209, 210, 211, 212, 215, ...
## Warning: Too few values at 28 locations: 772, 773, 774, 775, 776, 777, 778,
## 779, 780, 781, 782, 783, 784, 785, 786, 787, 788, 1039, 1040, 1041, ...
comm <- comm_trans[,1] %>% unique()
comm # There are duplicates due to cases, spelling errors.
## # A tibble: 111 x 1
## commodity
## <chr>
## 1 Additives and Oxygenates
## 2 Animal waste
## 3 animal waste
## 4 Anthracite
## 5 Aviation gasoline
## 6 Bagasse
## 7 Biodiesel
## 8 Biogases
## 9 Biogasoline
## 10 Bitumen
## # ... with 101 more rows
for(i in 1:nrow(comm[,1])){ # create a loop to get rid of case duplicates
comm[i,1] <- comm[i,1] %>% tolower()
comm[i,1] <- comm[i,1] %>% str_replace("hrad","hard") #one of the cells had a spelling error
}
comm <- comm %>% unique()
comm
## # A tibble: 97 x 1
## commodity
## <chr>
## 1 additives and oxygenates
## 2 animal waste
## 3 anthracite
## 4 aviation gasoline
## 5 bagasse
## 6 biodiesel
## 7 biogases
## 8 biogasoline
## 9 bitumen
## 10 black liquor
## # ... with 87 more rows
transaction <- comm_trans[,2] %>% unique()
for(i in 1:nrow(transaction[,1])){ # create a loop to get rid of case duplicates
transaction[i,1] <- transaction[i,1] %>% tolower()
}
transaction <- transaction %>% unique()
transaction
## # A tibble: 212 x 1
## transaction
## <chr>
## 1 exports
## 2 imports
## 3 production
## 4 receipts from other sources
## 5 stock changes
## 6 total energy supply
## 7 transfers and recycled products
## 8 transformation
## 9 transformation in oil refineries
## 10 consumption by commerce and public services
## # ... with 202 more rows
category <- energy[,7] %>% unique()
category
## # A tibble: 67 x 1
## category
## <chr>
## 1 additives_and_oxygenates
## 2 animal_waste
## 3 anthracite
## 4 aviation_gasoline
## 5 bagasse
## 6 biodiesel
## 7 biogases
## 8 biogasoline
## 9 bitumen
## 10 black_liquor
## # ... with 57 more rows
We have now shortened our lists to 97 commodities, 212 transactions and 67 catagories. These can be used with tidyr functions to create our subsets.
We can see that the above data is very large and will be difficult to deal with. Initially we ran some loops on the entire data set, but they took close to an hour to complete. This is not very practical, and sub setting the data into smaller parts may make these data more manageable.
According to www.investopedia.com 2, the top 10 world economies are:
We will use these to break the data into more manageable blocks By Country:
#1 USA
usa <- energy %>% filter(country_or_area == "United States")
nrow(usa)
## [1] 18115
head(usa)
## # A tibble: 6 x 7
## country_or_area commodity_transaction year
## <chr> <chr> <int>
## 1 United States Additives and Oxygenates - Exports 2014
## 2 United States Additives and Oxygenates - Exports 2013
## 3 United States Additives and Oxygenates - Exports 2012
## 4 United States Additives and Oxygenates - Exports 2011
## 5 United States Additives and Oxygenates - Exports 2010
## 6 United States Additives and Oxygenates - Exports 2009
## # ... with 4 more variables: unit <chr>, quantity <dbl>,
## # quantity_footnotes <int>, category <chr>
#2
china <- energy %>% filter(country_or_area == "China")
nrow(china)
## [1] 10603
head(china)
## # A tibble: 6 x 7
## country_or_area commodity_transaction year
## <chr> <chr> <int>
## 1 China Aviation gasoline - Consumption by transport 2014
## 2 China Aviation gasoline - Consumption by transport 2013
## 3 China Aviation gasoline - Consumption by transport 2012
## 4 China Aviation gasoline - Consumption by transport 2011
## 5 China Aviation gasoline - Consumption by transport 2010
## 6 China Aviation gasoline - Consumption by transport 2009
## # ... with 4 more variables: unit <chr>, quantity <dbl>,
## # quantity_footnotes <int>, category <chr>
#3
japan <- energy %>% filter(country_or_area == "Japan")
nrow(japan)
## [1] 15497
head(japan)
## # A tibble: 6 x 7
## country_or_area commodity_transaction
## <chr> <chr>
## 1 Japan Additives and Oxygenates - Receipts from other sources
## 2 Japan Additives and Oxygenates - Receipts from other sources
## 3 Japan Additives and Oxygenates - Receipts from other sources
## 4 Japan Additives and Oxygenates - Receipts from other sources
## 5 Japan Additives and Oxygenates - Receipts from other sources
## 6 Japan Additives and Oxygenates - Receipts from other sources
## # ... with 5 more variables: year <int>, unit <chr>, quantity <dbl>,
## # quantity_footnotes <int>, category <chr>
#4
germany <- energy %>% filter(country_or_area == "Germany")
nrow(germany)
## [1] 18781
head(germany)
## # A tibble: 6 x 7
## country_or_area commodity_transaction year
## <chr> <chr> <int>
## 1 Germany Additives and Oxygenates - Production 2014
## 2 Germany Additives and Oxygenates - Production 2013
## 3 Germany Additives and Oxygenates - Production 2012
## 4 Germany Additives and Oxygenates - Production 2011
## 5 Germany Additives and Oxygenates - Production 2010
## 6 Germany Additives and Oxygenates - Production 2009
## # ... with 4 more variables: unit <chr>, quantity <dbl>,
## # quantity_footnotes <int>, category <chr>
#5
uk <- energy %>% filter(country_or_area == "United Kingdom")
nrow(uk)
## [1] 14061
head(uk)
## # A tibble: 6 x 7
## country_or_area commodity_transaction year
## <chr> <chr> <int>
## 1 United Kingdom Additives and Oxygenates - Exports 2014
## 2 United Kingdom Additives and Oxygenates - Imports 2014
## 3 United Kingdom Additives and Oxygenates - Production 2014
## 4 United Kingdom Additives and Oxygenates - Production 2004
## 5 United Kingdom Additives and Oxygenates - Production 2003
## 6 United Kingdom Additives and Oxygenates - Production 2002
## # ... with 4 more variables: unit <chr>, quantity <dbl>,
## # quantity_footnotes <int>, category <chr>
#6
india <- energy %>% filter(country_or_area == "India")
nrow(india)
## [1] 7022
head(india)
## # A tibble: 6 x 7
## country_or_area
## <chr>
## 1 India
## 2 India
## 3 India
## 4 India
## 5 India
## 6 India
## # ... with 6 more variables: commodity_transaction <chr>, year <int>,
## # unit <chr>, quantity <dbl>, quantity_footnotes <int>, category <chr>
#7
france <- energy %>% filter(country_or_area == "France")
nrow(france)
## [1] 15637
head(france)
## # A tibble: 6 x 7
## country_or_area commodity_transaction year
## <chr> <chr> <int>
## 1 France Additives and Oxygenates - Exports 2014
## 2 France Additives and Oxygenates - Exports 2013
## 3 France Additives and Oxygenates - Exports 2012
## 4 France Additives and Oxygenates - Exports 2011
## 5 France Additives and Oxygenates - Exports 2010
## 6 France Additives and Oxygenates - Exports 2009
## # ... with 4 more variables: unit <chr>, quantity <dbl>,
## # quantity_footnotes <int>, category <chr>
#8
brazil <- energy %>% filter(country_or_area == "Brazil")
nrow(brazil)
## [1] 9337
head(brazil)
## # A tibble: 6 x 7
## country_or_area commodity_transaction year
## <chr> <chr> <int>
## 1 Brazil Aviation gasoline - Consumption by transport 2014
## 2 Brazil Aviation gasoline - Consumption by transport 2013
## 3 Brazil Aviation gasoline - Consumption by transport 2012
## 4 Brazil Aviation gasoline - Consumption by transport 2011
## 5 Brazil Aviation gasoline - Consumption by transport 2010
## 6 Brazil Aviation gasoline - Consumption by transport 2009
## # ... with 4 more variables: unit <chr>, quantity <dbl>,
## # quantity_footnotes <int>, category <chr>
#9
italy <- energy %>% filter(country_or_area == "Italy")
nrow(italy)
## [1] 14739
head(italy)
## # A tibble: 6 x 7
## country_or_area commodity_transaction year
## <chr> <chr> <int>
## 1 Italy Additives and Oxygenates - Exports 2014
## 2 Italy Additives and Oxygenates - Exports 2013
## 3 Italy Additives and Oxygenates - Exports 2012
## 4 Italy Additives and Oxygenates - Exports 2011
## 5 Italy Additives and Oxygenates - Exports 2010
## 6 Italy Additives and Oxygenates - Exports 2009
## # ... with 4 more variables: unit <chr>, quantity <dbl>,
## # quantity_footnotes <int>, category <chr>
#10
canada <- energy %>% filter(country_or_area == "Canada")
nrow(canada)
## [1] 13989
head(canada)
## # A tibble: 6 x 7
## country_or_area commodity_transaction year
## <chr> <chr> <int>
## 1 Canada Additives and Oxygenates - Imports 2014
## 2 Canada Additives and Oxygenates - Imports 2013
## 3 Canada Additives and Oxygenates - Imports 2012
## 4 Canada Additives and Oxygenates - Imports 2011
## 5 Canada Additives and Oxygenates - Imports 2010
## 6 Canada Additives and Oxygenates - Imports 2009
## # ... with 4 more variables: unit <chr>, quantity <dbl>,
## # quantity_footnotes <int>, category <chr>
We will use these subsets to create futher subsets as we proform our anaysis.
Units of measure are also of interest. Solid fuels will be measured by mass, and liquid and gasses by volume. Furthermore energy units can be measured in Kilowatt-hours, Power*time is Energy, or Joules, a direct measure of energy.
en_units <- energy[,4] %>% unique() %>% arrange(unit)
en_units
## # A tibble: 5 x 1
## unit
## <chr>
## 1 Cubic metres, thousand
## 2 Kilowatt-hours, million
## 3 Kilowatts, thousand
## 4 Metric tons, thousand
## 5 Terajoules
Kilowatt hours can be converted into Joules by the following conversion: \[ \frac{1 KW*hr}{1}*\frac{3600 s}{1 hr}*\frac{1000 J}{1 KW*s} = 3,600,000 J \\ 1 TJ = 1*10^{12} J \\ \frac{3.6x10^6 J}{1} * \frac{1 TJ}{1*10^{12}J} = 3.6*10^{-6} TJ \\ 1 TJ = \frac{1KW*hr}{3.6*10^{-6}} = \frac{1*10^6 KW*hr}{3.6} \]
Note that KW*hr are measured in millions, such that \(1*10^6\space KW*hr = 3.6 TJ\), according to the above conversion. So to convert those measurements into TJ we need simply divide the Millions of \(KW*hr\) by 3.6 according to the above conversion.
# This seems to stall out my PC, so it is included, but not used.
#for(i in 1:nrow(energy)){
# if(energy[i,4] == "Kilowatt-hours, million"){
# energy[i,5] = energy[i,5]/3.6
# energy[i,4] = "Terajoules"
#}
# }
#This isn't useful given the above loop is too slow.
#en_units <- energy[,4] %>% unique() %>% arrange(unit)
#en_units
For any conversion of mass or volume, specific energy (Energy per mass) or Energy Density (Energy per Volume) must be factored into the measurement. The efficiency of the method of energy production must also be accounted for.
We can use the categories: “total_electricity”, “nuclear_electricity”, “thermal_electricity”, “solar_electricity”, “tide_wave_and_ocean_electricity”, and “hydro” to analyze where the top 10 countries get their electricity.
suppressMessages(suppressWarnings(library(ggplot2)))
usa_total_elec <- usa %>% filter(category == "total_electricity") %>% select(commodity_transaction, year, quantity)
usa_total_elec <- usa_total_elec %>% spread(year, quantity, fill = 0)
usa_total_elec[5,1] <- "Total"
usa_total_elec[5,-1] <- colSums(usa_total_elec[-5,-1])
usa_total_elec <- usa_total_elec %>% gather("year", "quantity", 2:26, convert=TRUE) %>% filter(commodity_transaction == "Total")
usa_total_elec ## these numbers are less than the total thermal electrcity produced, we'll need to reevaluate.
## # A tibble: 25 x 3
## commodity_transaction year quantity
## <chr> <int> <dbl>
## 1 Total 1990 1103431
## 2 Total 1991 1132813
## 3 Total 1992 1125615
## 4 Total 1993 1158603
## 5 Total 1994 1201981
## 6 Total 1995 1268160
## 7 Total 1996 1296852
## 8 Total 1997 1347588
## 9 Total 1998 1378464
## 10 Total 1999 1417272
## # ... with 15 more rows
usa_nuc_elec <- usa %>% filter(category == "nuclear_electricity") %>% select(year, quantity)
usa_nuc_elec
## # A tibble: 25 x 2
## year quantity
## <int> <dbl>
## 1 2014 830584
## 2 2013 822004
## 3 2012 801129
## 4 2011 821405
## 5 2010 838931
## 6 2009 830210
## 7 2008 837804
## 8 2007 836634
## 9 2006 816195
## 10 2005 810726
## # ... with 15 more rows
usa_solar_elec <- usa %>% filter(category == "solar_electricity") %>% select(year, quantity)
usa_solar_elec
## # A tibble: 25 x 2
## year quantity
## <int> <dbl>
## 1 2014 24603
## 2 2013 15872
## 3 2012 10145
## 4 2011 6153
## 5 2010 3934
## 6 2009 2514
## 7 2008 2091
## 8 2007 1673
## 9 2006 1287
## 10 2005 1120
## # ... with 15 more rows
usa_therm_elec <- usa %>% filter(category == "thermal_electricity") %>% select(year, quantity)
usa_therm_elec
## # A tibble: 25 x 2
## year quantity
## <int> <dbl>
## 1 2014 2995556
## 2 2013 2985891
## 3 2012 3016908
## 4 2011 3037683
## 5 2010 3135578
## 6 2009 2965173
## 7 2008 3173013
## 8 2007 3183440
## 9 2006 3121205
## 10 2005 3149290
## # ... with 15 more rows
usa_hydro_elec <- usa %>% filter(category == "hydro") %>% select(year, quantity)
usa_hydro_elec
## # A tibble: 25 x 2
## year quantity
## <int> <dbl>
## 1 2014 281527
## 2 2013 290113
## 3 2012 298287
## 4 2011 344679
## 5 2010 286333
## 6 2009 298410
## 7 2008 281995
## 8 2007 275545
## 9 2006 317689
## 10 2005 297926
## # ... with 15 more rows
usa_wave_elec <- usa %>% filter(category == "tide_wave_and_ocean_electricity") %>% select(year, quantity)
usa_wave_elec # No wave generation reported in the US.
## # A tibble: 0 x 2
## # ... with 2 variables: year <int>, quantity <dbl>
## We'll redifine total electricity by joining table data frames.
usa_total_elec <- usa_therm_elec %>% left_join(usa_nuc_elec, by = "year")
#usa_total_elec <- usa_total_elec %>% left_join(usa_solar_elec, by = "year") #not included in EIA total
#usa_total_elec <- usa_total_elec %>% left_join(usa_hydro_elec, by = "year") #not included in EIA total
usa_total_elec <- usa_total_elec %>% mutate(total = rowSums(.[2:3]))
usa_total_elec
## # A tibble: 25 x 4
## year quantity.x quantity.y total
## <int> <dbl> <dbl> <dbl>
## 1 2014 2995556 830584 3826140
## 2 2013 2985891 822004 3807895
## 3 2012 3016908 801129 3818037
## 4 2011 3037683 821405 3859088
## 5 2010 3135578 838931 3974509
## 6 2009 2965173 830210 3795383
## 7 2008 3173013 837804 4010817
## 8 2007 3183440 836634 4020074
## 9 2006 3121205 816195 3937400
## 10 2005 3149290 810726 3960016
## # ... with 15 more rows
ggplot(NULL, aes(y=quantity,x=year))+
geom_point(data=usa_total_elec, aes(y=total,x=year), color = "black")+
geom_point(data=usa_nuc_elec, aes(y=quantity,x=year), color = "blue")+
geom_point(data=usa_solar_elec, aes(y=quantity,x=year), color = "yellow3")+
geom_point(data=usa_therm_elec, aes(y=quantity,x=year), color = "red") +
geom_point(data=usa_hydro_elec, aes(y=quantity,x=year), color = "cyan")
American electrical production has increased by about 36% over the past 25 years, largely by adding fossil fuel capacity. Electric production seems to be relatively steady since about 2005.
Note also that total electricity is somehow less than the amount of thermal electricity, when we would expect it to be greater than. This could be because the totals for thermal, nuclear, hydro, etc are based on how much resource goes into the process, not how much comes out, so thermodynamic efficiency is not taken into account. It could also be because the total has no category for at-home private use and that is a majority of power consumption, or the units for some of the categories are wrong, TJ instead of KW*hr, million or visa versa.
We will exclude the kaggle total from further analysis, and replace it by adding the subcategories.
The EIA has electrical production for the whole USA from 2001 to 2016. The units are in thousands of megawatt*hours.
\[ 1000 MW*hr = 1000*10^6 W*hr = 10^9 W*hr \\ 10^6 KW*hr = 10^6*1000 W*hr = 10^9 W*hr \] Thousands of Megawatt*hours are equivalent to millions of Kilowatts*hours. We do not need to convert these data to include with the original set.
The URL3 contains the API code so they are contained in a hidden line of code.
eia_usa <- USA %>% fromJSON(simplifyDataFrame = TRUE) # Entire United States, the url is saved to 'USA'
eia_usa_df <- eia_usa$series$data %>% data.frame()
eia_usa_df$X1 <- eia_usa_df$X1 %>% as.character() %>% as.integer()
eia_usa_df$X2 <- eia_usa_df$X2 %>% as.character() %>% as.numeric()
eia_usa_df <- eia_usa_df %>% rename(year = X1, quantity = X2)
eia_usa_df
## year quantity
## 1 2016 3918231
## 2 2015 3919294
## 3 2014 3937003
## 4 2013 3903715
## 5 2012 3890358
## 6 2011 3948186
## 7 2010 3972386
## 8 2009 3809837
## 9 2008 3974349
## 10 2007 4005343
## 11 2006 3908077
## 12 2005 3902192
## 13 2004 3808360
## 14 2003 3721159
## 15 2002 3698458
## 16 2001 3580053
ggplot(NULL, aes(y=quantity,x=year))+
geom_point(data=eia_usa_df, aes(y=quantity,x=year), color = "black", shape = 4)+
geom_point(data=usa_total_elec, aes(y=total,x=year), color = "black")+
geom_point(data=usa_nuc_elec, aes(y=quantity,x=year), color = "blue")+
geom_point(data=usa_solar_elec, aes(y=quantity,x=year), color = "yellow3")+
geom_point(data=usa_therm_elec, aes(y=quantity,x=year), color = "red") +
geom_point(data=usa_hydro_elec, aes(y=quantity,x=year), color = "cyan")
usa_total_elec <- usa_total_elec[,-4] #drop the previous total
usa_total_elec <- usa_total_elec %>% left_join(usa_solar_elec, by = "year") #not included in EIA total
usa_total_elec <- usa_total_elec %>% left_join(usa_hydro_elec, by = "year") #not included in EIA total
usa_total_elec <- usa_total_elec %>% mutate(total = rowSums(.[2:5]))
usa_total_elec #This adds all catagories together.
## # A tibble: 25 x 6
## year quantity.x quantity.y quantity.x.x quantity.y.y total
## <int> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2014 2995556 830584 24603 281527 4132270
## 2 2013 2985891 822004 15872 290113 4113880
## 3 2012 3016908 801129 10145 298287 4126469
## 4 2011 3037683 821405 6153 344679 4209920
## 5 2010 3135578 838931 3934 286333 4264776
## 6 2009 2965173 830210 2514 298410 4096307
## 7 2008 3173013 837804 2091 281995 4294903
## 8 2007 3183440 836634 1673 275545 4297292
## 9 2006 3121205 816195 1287 317689 4256376
## 10 2005 3149290 810726 1120 297926 4259062
## # ... with 15 more rows
We see that the data from EIA.gov trends with the data collected from kaggle.com, including the same dip in electrical production in the middle of the Great Recession of 2007-2012. The offsets between the EIA data in black “x” and the kaggle.com data in black dots could be due to different sampling techniques. I removed hydro electric and solar from the kaggle.com data set and the agreement between the two sets is almost exact. It appears the the EIA was only counting nuclear-thermal and fossil-fuel-thermal electrical generation.
china_nuc_elec <- china %>% filter(category == "nuclear_electricity")%>% select(year, quantity)
china_nuc_elec
## # A tibble: 23 x 2
## year quantity
## <int> <dbl>
## 1 2014 132538
## 2 2013 111613
## 3 2012 97394
## 4 2011 86350
## 5 2010 73880
## 6 2009 70134
## 7 2008 68394
## 8 2007 62130
## 9 2006 54843
## 10 2005 53088
## # ... with 13 more rows
china_solar_elec <- china %>% filter(category == "solar_electricity")%>% select(year, quantity)
china_solar_elec
## # A tibble: 2 x 2
## year quantity
## <int> <dbl>
## 1 2014 15189
## 2 2013 5564
china_therm_elec <- china %>% filter(category == "thermal_electricity")%>% select(year, quantity)
china_therm_elec
## # A tibble: 25 x 2
## year quantity
## <int> <dbl>
## 1 2014 4268649
## 2 2013 4247009
## 3 2012 3892814
## 4 2011 3833702
## 5 2010 3331928
## 6 2009 2982780
## 7 2008 2731949
## 8 2007 2722933
## 9 2006 2369603
## 10 2005 2047336
## # ... with 15 more rows
china_hydro_elec <- china %>% filter(category == "hydro")%>% select(year, quantity)
china_hydro_elec
## # A tibble: 25 x 2
## year quantity
## <int> <dbl>
## 1 2014 1064337
## 2 2013 920291
## 3 2012 872107
## 4 2011 698945
## 5 2010 722172
## 6 2009 615640
## 7 2008 585187
## 8 2007 485264
## 9 2006 435786
## 10 2005 397017
## # ... with 15 more rows
china_wave_elec <- china %>% filter(category == "tide_wave_and_ocean_electricity")%>% select(year, quantity)
china_wave_elec # No wave generation reported in China.
## # A tibble: 0 x 2
## # ... with 2 variables: year <int>, quantity <dbl>
#Create a data frame for calculating total electrical output.
china_total_elec <- china_therm_elec %>% left_join(china_nuc_elec, by = "year")
china_total_elec <- china_total_elec %>% left_join(china_solar_elec, by = "year")
china_total_elec <- china_total_elec %>% left_join(china_hydro_elec, by = "year")
china_total_elec[is.na(china_total_elec)] <- 0 #China had missing data.
china_total_elec <- china_total_elec %>% mutate(total = rowSums(.[2:5]))
china_total_elec
## # A tibble: 25 x 6
## year quantity.x quantity.y quantity.x.x quantity.y.y total
## <int> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2014 4268649 132538 15189 1064337 5480713
## 2 2013 4247009 111613 5564 920291 5284477
## 3 2012 3892814 97394 0 872107 4862315
## 4 2011 3833702 86350 0 698945 4618997
## 5 2010 3331928 73880 0 722172 4127980
## 6 2009 2982780 70134 0 615640 3668554
## 7 2008 2731949 68394 0 585187 3385530
## 8 2007 2722933 62130 0 485264 3270327
## 9 2006 2369603 54843 0 435786 2860232
## 10 2005 2047336 53088 0 397017 2497441
## # ... with 15 more rows
ggplot(NULL, aes(y=quantity,x=year))+
geom_point(data=china_total_elec, aes(y=total,x=year), color = "black")+
geom_point(data=china_nuc_elec, aes(y=quantity,x=year), color = "blue")+
geom_point(data=china_solar_elec, aes(y=quantity,x=year), color = "yellow3")+
geom_point(data=china_therm_elec, aes(y=quantity,x=year), color = "red") +
geom_point(data=china_hydro_elec, aes(y=quantity,x=year), color = "cyan")
China’s economic explosion of the last 20 years has been largely powered by fossil fuels, with some hydro-electricity. Nuclear and Solar make up only a small fraction. There looks like a slight change in the trend of China’s electrical production during the Great Recession, however the trend is still increasing.
japan_nuc_elec <- japan %>% filter(category == "nuclear_electricity")%>% select(year, quantity)
japan_nuc_elec
## # A tibble: 25 x 2
## year quantity
## <int> <dbl>
## 1 2014 0
## 2 2013 9303
## 3 2012 15939
## 4 2011 101761
## 5 2010 288230
## 6 2009 279750
## 7 2008 258128
## 8 2007 263832
## 9 2006 303426
## 10 2005 304755
## # ... with 15 more rows
japan_solar_elec <- japan %>% filter(category == "solar_electricity")%>% select(year, quantity)
japan_solar_elec
## # A tibble: 25 x 2
## year quantity
## <int> <dbl>
## 1 2014 24506
## 2 2013 14279
## 3 2012 6963
## 4 2011 5160
## 5 2010 3800
## 6 2009 2758
## 7 2008 2251
## 8 2007 2015
## 9 2006 1794
## 10 2005 1493
## # ... with 15 more rows
japan_therm_elec <- japan %>% filter(category == "thermal_electricity")%>% select(year, quantity)
japan_therm_elec
## # A tibble: 25 x 2
## year quantity
## <int> <dbl>
## 1 2014 921613
## 2 2013 950236
## 3 2012 950187
## 4 2011 876365
## 5 2010 758578
## 6 2009 702466
## 7 2008 732970
## 8 2007 779970
## 9 2006 696738
## 10 2005 702212
## # ... with 15 more rows
japan_hydro_elec <- japan %>% filter(category == "hydro")%>% select(year, quantity)
japan_hydro_elec
## # A tibble: 25 x 2
## year quantity
## <int> <dbl>
## 1 2014 86942
## 2 2013 84923
## 3 2012 83645
## 4 2011 91709
## 5 2010 90682
## 6 2009 83832
## 7 2008 83504
## 8 2007 84234
## 9 2006 97340
## 10 2005 86350
## # ... with 15 more rows
japan_wave_elec <- japan %>% filter(category == "tide_wave_and_ocean_electricity")%>% select(year, quantity)
japan_wave_elec # No wave generation reported in Japan.
## # A tibble: 0 x 2
## # ... with 2 variables: year <int>, quantity <dbl>
#Create a data frame for calculating total electrical output.
japan_total_elec <- japan_therm_elec %>% left_join(japan_nuc_elec, by = "year")
japan_total_elec <- japan_total_elec %>% left_join(japan_solar_elec, by = "year")
japan_total_elec <- japan_total_elec %>% left_join(japan_hydro_elec, by = "year")
japan_total_elec[is.na(japan_total_elec)] <- 0 #for had missing data.
japan_total_elec <- japan_total_elec %>% mutate(total = rowSums(.[2:5]))
japan_total_elec
## # A tibble: 25 x 6
## year quantity.x quantity.y quantity.x.x quantity.y.y total
## <int> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2014 921613 0 24506 86942 1033061
## 2 2013 950236 9303 14279 84923 1058741
## 3 2012 950187 15939 6963 83645 1056734
## 4 2011 876365 101761 5160 91709 1074995
## 5 2010 758578 288230 3800 90682 1141290
## 6 2009 702466 279750 2758 83832 1068806
## 7 2008 732970 258128 2251 83504 1076853
## 8 2007 779970 263832 2015 84234 1130051
## 9 2006 696738 303426 1794 97340 1099298
## 10 2005 702212 304755 1493 86350 1094810
## # ... with 15 more rows
ggplot(NULL, aes(y=quantity,x=year))+
geom_point(data=japan_total_elec, aes(y=total,x=year), color = "black")+
geom_point(data=japan_nuc_elec, aes(y=quantity,x=year), color = "blue")+
geom_point(data=japan_solar_elec, aes(y=quantity,x=year), color = "yellow3")+
geom_point(data=japan_therm_elec, aes(y=quantity,x=year), color = "red") +
geom_point(data=japan_hydro_elec, aes(y=quantity,x=year), color = "cyan")
We can see that after the Fukishima nuclear accident (2011), Japan brought most of it’s nuclear power off-line and made up for the loss with fossil-fuels. Like the US, we also see a dip corresponding to the Great Recession.
germ_nuc_elec <- germany %>% filter(category == "nuclear_electricity")%>% select(year, quantity)
germ_nuc_elec
## # A tibble: 24 x 2
## year quantity
## <int> <dbl>
## 1 2014 97129
## 2 2013 97290
## 3 2012 99460
## 4 2011 107971
## 5 2010 140556
## 6 2009 134932
## 7 2008 148495
## 8 2007 140534
## 9 2006 167269
## 10 2005 163055
## # ... with 14 more rows
germ_solar_elec <- germany %>% filter(category == "solar_electricity")%>% select(year, quantity)
germ_solar_elec
## # A tibble: 24 x 2
## year quantity
## <int> <dbl>
## 1 2014 36056
## 2 2013 31010
## 3 2012 26380
## 4 2011 19599
## 5 2010 11729
## 6 2009 6584
## 7 2008 4420
## 8 2007 3075
## 9 2006 2220
## 10 2005 1282
## # ... with 14 more rows
germ_therm_elec <- germany %>% filter(category == "thermal_electricity")%>% select(year, quantity)
germ_therm_elec
## # A tibble: 24 x 2
## year quantity
## <int> <dbl>
## 1 2014 409688
## 2 2013 428094
## 3 2012 423424
## 4 2011 410773
## 5 2010 413207
## 6 2009 388640
## 7 2008 418220
## 8 2007 426786
## 9 2006 408745
## 10 2005 401355
## # ... with 14 more rows
germ_hydro_elec <- germany %>% filter(category == "hydro")%>% select(year, quantity)
germ_hydro_elec
## # A tibble: 24 x 2
## year quantity
## <int> <dbl>
## 1 2014 25444
## 2 2013 28782
## 3 2012 27849
## 4 2011 23511
## 5 2010 27353
## 6 2009 24682
## 7 2008 26469
## 8 2007 28084
## 9 2006 26768
## 10 2005 26417
## # ... with 14 more rows
germ_wave_elec <- germany %>% filter(category == "tide_wave_and_ocean_electricity")%>% select(year, quantity)
germ_wave_elec # No wave generation reported in Germany.
## # A tibble: 0 x 2
## # ... with 2 variables: year <int>, quantity <dbl>
#Create a data frame for calculating total electrical output.
germ_total_elec <- germ_therm_elec %>% left_join(germ_nuc_elec, by = "year")
germ_total_elec <- germ_total_elec %>% left_join(germ_solar_elec, by = "year")
germ_total_elec <- germ_total_elec %>% left_join(germ_hydro_elec, by = "year")
germ_total_elec[is.na(germ_total_elec)] <- 0 #for had missing data.
germ_total_elec <- germ_total_elec %>% mutate(total = rowSums(.[2:5]))
germ_total_elec
## # A tibble: 24 x 6
## year quantity.x quantity.y quantity.x.x quantity.y.y total
## <int> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2014 409688 97129 36056 25444 568317
## 2 2013 428094 97290 31010 28782 585176
## 3 2012 423424 99460 26380 27849 577113
## 4 2011 410773 107971 19599 23511 561854
## 5 2010 413207 140556 11729 27353 592845
## 6 2009 388640 134932 6584 24682 554838
## 7 2008 418220 148495 4420 26469 597604
## 8 2007 426786 140534 3075 28084 598479
## 9 2006 408745 167269 2220 26768 605002
## 10 2005 401355 163055 1282 26417 592109
## # ... with 14 more rows
ggplot(NULL, aes(y=quantity,x=year))+
geom_point(data=germ_total_elec, aes(y=total,x=year), color = "black")+
geom_point(data=germ_nuc_elec, aes(y=quantity,x=year), color = "blue")+
geom_point(data=germ_solar_elec, aes(y=quantity,x=year), color = "yellow3")+
geom_point(data=germ_therm_elec, aes(y=quantity,x=year), color = "red") +
geom_point(data=germ_hydro_elec, aes(y=quantity,x=year), color = "cyan")
Germany’s fossil fuel based electrical out has held more or less constant, it does appear that Germany is replacing nuclear based electricity with solar. Like the US, we also see a dip corresponding to the Great Recession.
uk_nuc_elec <- uk %>% filter(category == "nuclear_electricity")%>% select(year, quantity)
uk_nuc_elec
## # A tibble: 25 x 2
## year quantity
## <int> <dbl>
## 1 2014 63748
## 2 2013 70607
## 3 2012 70405
## 4 2011 68980
## 5 2010 62140
## 6 2009 69098
## 7 2008 52486
## 8 2007 63028
## 9 2006 75451
## 10 2005 81618
## # ... with 15 more rows
uk_solar_elec <- uk %>% filter(category == "solar_electricity")%>% select(year, quantity)
uk_solar_elec
## # A tibble: 16 x 2
## year quantity
## <int> <dbl>
## 1 2014 4050
## 2 2013 1989
## 3 2012 1352
## 4 2011 244
## 5 2010 41
## 6 2009 20
## 7 2008 17
## 8 2007 14
## 9 2006 11
## 10 2005 8
## 11 2004 4
## 12 2003 3
## 13 2002 3
## 14 2001 2
## 15 2000 1
## 16 1999 1
uk_therm_elec <- uk %>% filter(category == "thermal_electricity")%>% select(year, quantity)
uk_therm_elec
## # A tibble: 25 x 2
## year quantity
## <int> <dbl>
## 1 2014 230342
## 2 2013 250539
## 3 2012 263734
## 4 2011 273961
## 5 2010 302618
## 6 2009 289403
## 7 2008 320033
## 8 2007 319578
## 9 2006 309149
## 10 2005 305973
## # ... with 15 more rows
uk_hydro_elec <- uk %>% filter(category == "hydro")%>% select(year, quantity)
uk_hydro_elec
## # A tibble: 25 x 2
## year quantity
## <int> <dbl>
## 1 2014 8768
## 2 2013 7606
## 3 2012 8252
## 4 2011 8585
## 5 2010 6715
## 6 2009 8915
## 7 2008 9234
## 8 2007 8936
## 9 2006 8446
## 10 2005 7852
## # ... with 15 more rows
uk_wave_elec <- uk %>% filter(category == "tide_wave_and_ocean_electricity")%>% select(year, quantity)
uk_wave_elec
## # A tibble: 6 x 2
## year quantity
## <int> <dbl>
## 1 2014 2
## 2 2013 6
## 3 2012 4
## 4 2011 1
## 5 2010 2
## 6 2009 1
#Create a data frame for calculating total electrical output.
uk_total_elec <- uk_therm_elec %>% left_join(uk_nuc_elec, by = "year")
uk_total_elec <- uk_total_elec %>% left_join(uk_solar_elec, by = "year")
uk_total_elec <- uk_total_elec %>% left_join(uk_hydro_elec, by = "year")
uk_total_elec <- uk_total_elec %>% left_join(uk_wave_elec, by = "year")
uk_total_elec[is.na(uk_total_elec)] <- 0 #for missing data.
uk_total_elec <- uk_total_elec %>% mutate(total = rowSums(.[2:6]))
uk_total_elec
## # A tibble: 25 x 7
## year quantity.x quantity.y quantity.x.x quantity.y.y quantity total
## <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2014 230342 63748 4050 8768 2 306910
## 2 2013 250539 70607 1989 7606 6 330747
## 3 2012 263734 70405 1352 8252 4 343747
## 4 2011 273961 68980 244 8585 1 351771
## 5 2010 302618 62140 41 6715 2 371516
## 6 2009 289403 69098 20 8915 1 367437
## 7 2008 320033 52486 17 9234 0 381770
## 8 2007 319578 63028 14 8936 0 391556
## 9 2006 309149 75451 11 8446 0 393057
## 10 2005 305973 81618 8 7852 0 395451
## # ... with 15 more rows
ggplot(NULL, aes(y=quantity,x=year))+
geom_point(data=uk_total_elec, aes(y=total,x=year), color = "black")+
geom_point(data=uk_nuc_elec, aes(y=quantity,x=year), color = "blue")+
geom_point(data=uk_solar_elec, aes(y=quantity,x=year), color = "yellow3")+
geom_point(data=uk_therm_elec, aes(y=quantity,x=year), color = "red") +
geom_point(data=uk_hydro_elec, aes(y=quantity,x=year), color = "cyan") +
geom_point(data=uk_wave_elec, aes(y=quantity,x=year), color = "violet")
Total UK electrical production seems to fluctuate sinusoidal since 1990 with a period of about 50 years, based on the graph. We seemed to have captured half a period over 25 years. Like the US, we also see a dip corresponding to the Great Recession.
india_nuc_elec <- india %>% filter(category == "nuclear_electricity")%>% select(year, quantity)
india_nuc_elec
## # A tibble: 25 x 2
## year quantity
## <int> <dbl>
## 1 2014 36102
## 2 2013 34228
## 3 2012 32871
## 4 2011 32287
## 5 2010 26266
## 6 2009 18636
## 7 2008 14713
## 8 2007 16957
## 9 2006 18802
## 10 2005 17334
## # ... with 15 more rows
india_solar_elec <- india %>% filter(category == "solar_electricity")%>% select(year, quantity)
india_solar_elec
## # A tibble: 2 x 2
## year quantity
## <int> <dbl>
## 1 2014 5020
## 2 2013 3433
india_therm_elec <- india %>% filter(category == "thermal_electricity")%>% select(year, quantity)
india_therm_elec
## # A tibble: 25 x 2
## year quantity
## <int> <dbl>
## 1 2014 1111341
## 2 2013 980599
## 3 2012 860173
## 4 2011 813625
## 5 2010 795052
## 6 2009 755656
## 7 2008 700147
## 8 2007 664143
## 9 2006 610385
## 10 2005 572415
## # ... with 15 more rows
india_hydro_elec <- india %>% filter(category == "hydro")%>% select(year, quantity)
india_hydro_elec
## # A tibble: 25 x 2
## year quantity
## <int> <dbl>
## 1 2014 129353
## 2 2013 141637
## 3 2012 113855
## 4 2011 103741
## 5 2010 114486
## 6 2009 106909
## 7 2008 114337
## 8 2007 120589
## 9 2006 113720
## 10 2005 101730
## # ... with 15 more rows
india_wave_elec <- india %>% filter(category == "tide_wave_and_ocean_electricity")%>% select(year, quantity)
india_wave_elec # No wave generation reported in India.
## # A tibble: 0 x 2
## # ... with 2 variables: year <int>, quantity <dbl>
#Create a data frame for calculating total electrical output.
india_total_elec <- india_therm_elec %>% left_join(india_nuc_elec, by = "year")
india_total_elec <- india_total_elec %>% left_join(india_solar_elec, by = "year")
india_total_elec <- india_total_elec %>% left_join(india_hydro_elec, by = "year")
india_total_elec <- india_total_elec %>% left_join(india_wave_elec, by = "year")
india_total_elec[is.na(india_total_elec)] <- 0 #for missing data.
india_total_elec <- india_total_elec %>% mutate(total = rowSums(.[2:6]))
india_total_elec
## # A tibble: 25 x 7
## year quantity.x quantity.y quantity.x.x quantity.y.y quantity total
## <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2014 1111341 36102 5020 129353 0 1281816
## 2 2013 980599 34228 3433 141637 0 1159897
## 3 2012 860173 32871 0 113855 0 1006899
## 4 2011 813625 32287 0 103741 0 949653
## 5 2010 795052 26266 0 114486 0 935804
## 6 2009 755656 18636 0 106909 0 881201
## 7 2008 700147 14713 0 114337 0 829197
## 8 2007 664143 16957 0 120589 0 801689
## 9 2006 610385 18802 0 113720 0 742907
## 10 2005 572415 17334 0 101730 0 691479
## # ... with 15 more rows
ggplot(NULL, aes(y=quantity,x=year))+
geom_point(data=india_total_elec, aes(y=total,x=year), color = "black")+
geom_point(data=india_nuc_elec, aes(y=quantity,x=year), color = "blue")+
geom_point(data=india_solar_elec, aes(y=quantity,x=year), color = "yellow3")+
geom_point(data=india_therm_elec, aes(y=quantity,x=year), color = "red") +
geom_point(data=india_hydro_elec, aes(y=quantity,x=year), color = "cyan")
Much like China, India has greatly increased it’s electricity output over the last 25 years by adding fossil fuel capacity. However, India does not appear to be as affected by the Great Recession as other countries. Given that approximately half of India’s landmass is in the tropics, it is surprising that India has not brought more Solar online.
fr_nuc_elec <- france %>% filter(category == "nuclear_electricity")%>% select(year, quantity)
fr_nuc_elec
## # A tibble: 25 x 2
## year quantity
## <int> <dbl>
## 1 2014 436474
## 2 2013 423685
## 3 2012 425406
## 4 2011 442383
## 5 2010 428521
## 6 2009 409736
## 7 2008 439447
## 8 2007 439730
## 9 2006 450191
## 10 2005 451529
## # ... with 15 more rows
fr_solar_elec <- france %>% filter(category == "solar_electricity")%>% select(year, quantity)
fr_solar_elec
## # A tibble: 21 x 2
## year quantity
## <int> <dbl>
## 1 2014 5909
## 2 2013 4735
## 3 2012 4016
## 4 2011 2078
## 5 2010 620
## 6 2009 174
## 7 2008 42
## 8 2007 27
## 9 2006 16
## 10 2005 13
## # ... with 11 more rows
fr_therm_elec <- france %>% filter(category == "thermal_electricity")%>% select(year, quantity)
fr_therm_elec
## # A tibble: 25 x 2
## year quantity
## <int> <dbl>
## 1 2014 33484
## 2 2013 50877
## 3 2012 56722
## 4 2011 54594
## 5 2010 62010
## 6 2009 55396
## 7 2008 59500
## 8 2007 61874
## 9 2006 60279
## 10 2005 66747
## # ... with 15 more rows
fr_hydro_elec <- france %>% filter(category == "hydro")%>% select(year, quantity)
fr_hydro_elec
## # A tibble: 25 x 2
## year quantity
## <int> <dbl>
## 1 2014 68626
## 2 2013 75867
## 3 2012 63594
## 4 2011 49865
## 5 2010 67525
## 6 2009 61969
## 7 2008 68368
## 8 2007 63260
## 9 2006 61742
## 10 2005 56332
## # ... with 15 more rows
fr_wave_elec <- france %>% filter(category == "tide_wave_and_ocean_electricity")%>% select(year, quantity)
fr_wave_elec
## # A tibble: 25 x 2
## year quantity
## <int> <dbl>
## 1 2014 481
## 2 2013 414
## 3 2012 458
## 4 2011 477
## 5 2010 476
## 6 2009 448
## 7 2008 465
## 8 2007 465
## 9 2006 464
## 10 2005 481
## # ... with 15 more rows
#Create a data frame for calculating total electrical output.
fr_total_elec <- fr_therm_elec %>% left_join(fr_nuc_elec, by = "year")
fr_total_elec <- fr_total_elec %>% left_join(fr_solar_elec, by = "year")
fr_total_elec <- fr_total_elec %>% left_join(fr_hydro_elec, by = "year")
fr_total_elec <- fr_total_elec %>% left_join(fr_wave_elec, by = "year")
fr_total_elec[is.na(fr_total_elec)] <- 0 #for missing data.
fr_total_elec <- fr_total_elec %>% mutate(total = rowSums(.[2:6]))
fr_total_elec
## # A tibble: 25 x 7
## year quantity.x quantity.y quantity.x.x quantity.y.y quantity total
## <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2014 33484 436474 5909 68626 481 544974
## 2 2013 50877 423685 4735 75867 414 555578
## 3 2012 56722 425406 4016 63594 458 550196
## 4 2011 54594 442383 2078 49865 477 549397
## 5 2010 62010 428521 620 67525 476 559152
## 6 2009 55396 409736 174 61969 448 527723
## 7 2008 59500 439447 42 68368 465 567822
## 8 2007 61874 439730 27 63260 465 565356
## 9 2006 60279 450191 16 61742 464 572692
## 10 2005 66747 451529 13 56332 481 575102
## # ... with 15 more rows
ggplot(NULL, aes(y=quantity,x=year))+
geom_point(data=fr_total_elec, aes(y=total,x=year), color = "black")+
geom_point(data=fr_nuc_elec, aes(y=quantity,x=year), color = "blue")+
geom_point(data=fr_solar_elec, aes(y=quantity,x=year), color = "yellow3")+
geom_point(data=fr_therm_elec, aes(y=quantity,x=year), color = "red") +
geom_point(data=fr_hydro_elec, aes(y=quantity,x=year), color = "cyan") +
geom_point(data=fr_wave_elec, aes(y=quantity,x=year), color = "violet")
France is the only country we examined with the majority of its power generated by nuclear. This might make it the only top world economy that is a net energy exporter than an importer. It is also the only major economy that is decreasing fossil fuel based electricity. The difference seems to be made up for by hydro and nuclear power. Like the US, we also see a dip corresponding to the Great Recession.
braz_nuc_elec <- brazil %>% filter(category == "nuclear_electricity")%>% select(year, quantity)
braz_nuc_elec
## # A tibble: 25 x 2
## year quantity
## <int> <dbl>
## 1 2014 15378
## 2 2013 15450
## 3 2012 16038
## 4 2011 15659
## 5 2010 14523
## 6 2009 12957
## 7 2008 13969
## 8 2007 12350
## 9 2006 13754
## 10 2005 9855
## # ... with 15 more rows
braz_solar_elec <- brazil %>% filter(category == "solar_electricity")%>% select(year, quantity)
braz_solar_elec
## # A tibble: 2 x 2
## year quantity
## <int> <dbl>
## 1 2014 16
## 2 2013 7
braz_therm_elec <- brazil %>% filter(category == "thermal_electricity")%>% select(year, quantity)
braz_therm_elec
## # A tibble: 25 x 2
## year quantity
## <int> <dbl>
## 1 2014 189497
## 2 2013 157810
## 3 2012 116068
## 4 2011 85062
## 5 2010 95809
## 6 2009 60975
## 7 2008 78758
## 8 2007 58084
## 9 2006 56777
## 10 2005 55626
## # ... with 15 more rows
braz_hydro_elec <- brazil %>% filter(category == "hydro")%>% select(year, quantity)
braz_hydro_elec
## # A tibble: 25 x 2
## year quantity
## <int> <dbl>
## 1 2014 373439
## 2 2013 390992
## 3 2012 415342
## 4 2011 428333
## 5 2010 403289
## 6 2009 390988
## 7 2008 369556
## 8 2007 374015
## 9 2006 348805
## 10 2005 337457
## # ... with 15 more rows
braz_wave_elec <- brazil %>% filter(category == "tide_wave_and_ocean_electricity")%>% select(year, quantity)
braz_wave_elec # No wave generation reported.
## # A tibble: 0 x 2
## # ... with 2 variables: year <int>, quantity <dbl>
#Create a data frame for calculating total electrical output.
braz_total_elec <- braz_therm_elec %>% left_join(braz_nuc_elec, by = "year")
braz_total_elec <- braz_total_elec %>% left_join(braz_solar_elec, by = "year")
braz_total_elec <- braz_total_elec %>% left_join(braz_hydro_elec, by = "year")
braz_total_elec <- braz_total_elec %>% left_join(braz_wave_elec, by = "year")
braz_total_elec[is.na(braz_total_elec)] <- 0 #for missing data.
braz_total_elec <- braz_total_elec %>% mutate(total = rowSums(.[2:6]))
braz_total_elec
## # A tibble: 25 x 7
## year quantity.x quantity.y quantity.x.x quantity.y.y quantity total
## <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2014 189497 15378 16 373439 0 578330
## 2 2013 157810 15450 7 390992 0 564259
## 3 2012 116068 16038 0 415342 0 547448
## 4 2011 85062 15659 0 428333 0 529054
## 5 2010 95809 14523 0 403289 0 513621
## 6 2009 60975 12957 0 390988 0 464920
## 7 2008 78758 13969 0 369556 0 462283
## 8 2007 58084 12350 0 374015 0 444449
## 9 2006 56777 13754 0 348805 0 419336
## 10 2005 55626 9855 0 337457 0 402938
## # ... with 15 more rows
ggplot(NULL, aes(y=quantity,x=year))+
geom_point(data=braz_total_elec, aes(y=total,x=year), color = "black")+
geom_point(data=braz_nuc_elec, aes(y=quantity,x=year), color = "blue")+
geom_point(data=braz_solar_elec, aes(y=quantity,x=year), color = "yellow3")+
geom_point(data=braz_therm_elec, aes(y=quantity,x=year), color = "red") +
geom_point(data=braz_hydro_elec, aes(y=quantity,x=year), color = "cyan")
Brazil is the first country in our analysis that produces most of their electricity with renewable hydroelectric energy. Also, given Brazil’s low latitudes; the Northern part of the country intersects the Equator, it is surprising that Germany has brought more Solar online. Brazil gets much more direct sunlight. In terms of sustainability it looks like Brazil is replacing hydroelectric plants with fossil fuels. There may be short-term economic benefit to this, but in the long run this is bad for energy sustainability.
it_nuc_elec <- italy %>% filter(category == "nuclear_electricity")%>% select(year, quantity)
it_nuc_elec #Italy has no Nucleat capacity
## # A tibble: 0 x 2
## # ... with 2 variables: year <int>, quantity <dbl>
it_solar_elec <- italy %>% filter(category == "solar_electricity")%>% select(year, quantity)
it_solar_elec
## # A tibble: 25 x 2
## year quantity
## <int> <dbl>
## 1 2014 22306
## 2 2013 21589
## 3 2012 18862
## 4 2011 10796
## 5 2010 1906
## 6 2009 676
## 7 2008 193
## 8 2007 38
## 9 2006 35
## 10 2005 31
## # ... with 15 more rows
it_therm_elec <- italy %>% filter(category == "thermal_electricity")%>% select(year, quantity)
it_therm_elec
## # A tibble: 25 x 2
## year quantity
## <int> <dbl>
## 1 2014 175510
## 2 2013 192237
## 3 2012 216811
## 4 2011 227711
## 5 2010 230472
## 6 2009 226033
## 7 2008 260414
## 8 2007 264744
## 9 2006 261137
## 10 2005 251958
## # ... with 15 more rows
it_hydro_elec <- italy %>% filter(category == "hydro")%>% select(year, quantity)
it_hydro_elec
## # A tibble: 25 x 2
## year quantity
## <int> <dbl>
## 1 2014 60256
## 2 2013 54672
## 3 2012 43854
## 4 2011 47757
## 5 2010 54406
## 6 2009 53443
## 7 2008 47227
## 8 2007 38482
## 9 2006 43425
## 10 2005 42927
## # ... with 15 more rows
it_wave_elec <- italy %>% filter(category == "tide_wave_and_ocean_electricity")%>% select(year, quantity)
it_wave_elec # No wave generation reported.
## # A tibble: 0 x 2
## # ... with 2 variables: year <int>, quantity <dbl>
#Create a data frame for calculating total electrical output.
it_total_elec <- it_therm_elec %>% left_join(it_nuc_elec, by = "year")
it_total_elec <- it_total_elec %>% left_join(it_solar_elec, by = "year")
it_total_elec <- it_total_elec %>% left_join(it_hydro_elec, by = "year")
it_total_elec <- it_total_elec %>% left_join(it_wave_elec, by = "year")
it_total_elec[is.na(it_total_elec)] <- 0 #for missing data.
it_total_elec <- it_total_elec %>% mutate(total = rowSums(.[2:6]))
it_total_elec
## # A tibble: 25 x 7
## year quantity.x quantity.y quantity.x.x quantity.y.y quantity total
## <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2014 175510 0 22306 60256 0 258072
## 2 2013 192237 0 21589 54672 0 268498
## 3 2012 216811 0 18862 43854 0 279527
## 4 2011 227711 0 10796 47757 0 286264
## 5 2010 230472 0 1906 54406 0 286784
## 6 2009 226033 0 676 53443 0 280152
## 7 2008 260414 0 193 47227 0 307834
## 8 2007 264744 0 38 38482 0 303264
## 9 2006 261137 0 35 43425 0 304597
## 10 2005 251958 0 31 42927 0 294916
## # ... with 15 more rows
ggplot(NULL, aes(y=quantity,x=year))+
geom_point(data=it_total_elec, aes(y=total,x=year), color = "black")+
geom_point(data=it_nuc_elec, aes(y=quantity,x=year), color = "blue")+
geom_point(data=it_solar_elec, aes(y=quantity,x=year), color = "yellow3")+
geom_point(data=it_therm_elec, aes(y=quantity,x=year), color = "red") +
geom_point(data=it_hydro_elec, aes(y=quantity,x=year), color = "cyan")
Italy seems to be the hardest hit by the Great Recession. Their energy production decreases in 2007 and never recovers. Italy also is bringing the \(2^{nd}\) Solar online than other countries in the survey having 22,306 kw*hr, million in 2014 compared to Germany’s 36,056 kw*hr, million.
can_nuc_elec <- canada %>% filter(category == "nuclear_electricity")%>% select(year, quantity)
can_nuc_elec
## # A tibble: 25 x 2
## year quantity
## <int> <dbl>
## 1 2014 107678
## 2 2013 103439
## 3 2012 94862
## 4 2011 93589
## 5 2010 90658
## 6 2009 90091
## 7 2008 96020
## 8 2007 93482
## 9 2006 97964
## 10 2005 92040
## # ... with 15 more rows
can_solar_elec <- canada %>% filter(category == "solar_electricity")%>% select(year, quantity)
can_solar_elec
## # A tibble: 23 x 2
## year quantity
## <int> <dbl>
## 1 2014 1756
## 2 2013 1499
## 3 2012 319
## 4 2011 260
## 5 2010 117
## 6 2009 5
## 7 2008 35
## 8 2007 26
## 9 2006 21
## 10 2005 17
## # ... with 13 more rows
can_therm_elec <- canada %>% filter(category == "thermal_electricity")%>% select(year, quantity)
can_therm_elec
## # A tibble: 25 x 2
## year quantity
## <int> <dbl>
## 1 2014 139402
## 2 2013 143406
## 3 2012 146063
## 4 2011 150042
## 5 2010 144963
## 6 2009 140817
## 7 2008 152054
## 8 2007 162288
## 9 2006 154030
## 10 2005 166022
## # ... with 15 more rows
can_hydro_elec <- canada %>% filter(category == "hydro")%>% select(year, quantity)
can_hydro_elec
## # A tibble: 25 x 2
## year quantity
## <int> <dbl>
## 1 2014 382574
## 2 2013 391861
## 3 2012 380340
## 4 2011 375797
## 5 2010 351461
## 6 2009 368762
## 7 2008 377561
## 8 2007 367695
## 9 2006 352963
## 10 2005 362031
## # ... with 15 more rows
can_wave_elec <- canada %>% filter(category == "tide_wave_and_ocean_electricity")%>% select(year, quantity)
can_wave_elec # No wave generation reported.
## # A tibble: 25 x 2
## year quantity
## <int> <dbl>
## 1 2014 16
## 2 2013 15
## 3 2012 27
## 4 2011 26
## 5 2010 28
## 6 2009 30
## 7 2008 15
## 8 2007 23
## 9 2006 19
## 10 2005 28
## # ... with 15 more rows
#Create a data frame for calculating total electrical output.
can_total_elec <- can_therm_elec %>% left_join(can_nuc_elec, by = "year")
can_total_elec <- can_total_elec %>% left_join(can_solar_elec, by = "year")
can_total_elec <- can_total_elec %>% left_join(can_hydro_elec, by = "year")
can_total_elec <- can_total_elec %>% left_join(can_wave_elec, by = "year")
can_total_elec[is.na(can_total_elec)] <- 0 #for missing data.
can_total_elec <- can_total_elec %>% mutate(total = rowSums(.[2:6]))
can_total_elec
## # A tibble: 25 x 7
## year quantity.x quantity.y quantity.x.x quantity.y.y quantity total
## <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2014 139402 107678 1756 382574 16 631426
## 2 2013 143406 103439 1499 391861 15 640220
## 3 2012 146063 94862 319 380340 27 621611
## 4 2011 150042 93589 260 375797 26 619714
## 5 2010 144963 90658 117 351461 28 587227
## 6 2009 140817 90091 5 368762 30 599705
## 7 2008 152054 96020 35 377561 15 625685
## 8 2007 162288 93482 26 367695 23 623514
## 9 2006 154030 97964 21 352963 19 604997
## 10 2005 166022 92040 17 362031 28 620138
## # ... with 15 more rows
ggplot(NULL, aes(y=quantity,x=year))+
geom_point(data=can_total_elec, aes(y=total,x=year), color = "black")+
geom_point(data=can_nuc_elec, aes(y=quantity,x=year), color = "blue")+
geom_point(data=can_solar_elec, aes(y=quantity,x=year), color = "yellow3")+
geom_point(data=can_therm_elec, aes(y=quantity,x=year), color = "red") +
geom_point(data=can_hydro_elec, aes(y=quantity,x=year), color = "cyan") +
geom_point(data=can_wave_elec, aes(y=quantity,x=year), color = "violet")
Like Brazil, Canada gets most of its electricity from Hydro-electric plants. Also, we do see the Great Recession dip. It is also interesting to see that Canada seems to swap nuclear power and fossil fuel power.
ggplot(can_total_elec, aes(y=quantity.y, x=quantity.x))+
geom_point(color="steelblue")+
geom_smooth(method = "lm")
cor.test(can_total_elec$quantity.x,can_total_elec$quantity.y, method = "pearson")
##
## Pearson's product-moment correlation
##
## data: can_total_elec$quantity.x and can_total_elec$quantity.y
## t = -1.844, df = 23, p-value = 0.0781
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.66037322 0.04223113
## sample estimates:
## cor
## -0.3588885
Upon further examination, Canada’s anti-correlation between seems to be marginal, as it is significant at the 90% confidence level, but not the 95% confidence level.
We can look at power totals side by side:
totals<- matrix(as.numeric(c(usa_total_elec[1,6],china_total_elec[1,6],japan_total_elec[1,6], germ_total_elec[1,6],uk_total_elec[1,7], india_total_elec[1,7], fr_total_elec[1,7], braz_total_elec[1,7],it_total_elec[1,7],can_total_elec[1,7]), ncol = 1))
barplot(t(totals),names.arg = c("USA", "China", "Japan", "Germany", "UK", "India", "France", "Brazil", "Italy", "Canada"), horiz = TRUE, cex.names= 0.75,las=1, xlab = "2014 Total Electical Production (10^6 kw*hr)")
usa_china = totals[1,1]+totals[2,1]
japan_thru_canada = sum(totals[3:10,1])
totals2 <- matrix(c(usa_china,japan_thru_canada), ncol = 1)
barplot(t(totals2), names.arg = c("USA and China", "Japan to Canada"), horiz = TRUE, cex.names = 0.5, las=1,xlab = "USA and China Compared to Total of Other Top 10 Economies (10^6 kw*hr)")
We can see that the USA and China combined produce much more electricity than other top 10 economies in 2014. We also found that only 3 countries use something other than fossil fuels as their primary power supply. These are Brazil (Hydro), Canada (Hydro), and France (Nuclear).
China and India have also shown the most growth in electrical energy production and merit detailed analysis.