Quiz: Gapminder Dataset (with Two Variables)

To put my data visualization and summarization skills to the test, I headed to the Gapminder World data website (https://www.gapminder.org/data/) and downloaded two datasets that caught my attention:

  1. Population density (per square km) that is provided by the UN Population Division, which can be found at https://docs.google.com/spreadsheets/d/1cCskPjXJQrSyDRO4J1FSvn4-fjI9INkuAmQDxLVfc08/pub?gid=0. For more information, go to https://esa.un.org/unpd/wpp/.

  2. Oil Consumption Total By Countries from 1965-2010 provided by BP, which can be found at https://docs.google.com/spreadsheets/d/1Vg99EP2is_XIJCmt0Il94vHRTo2RHJl5aQf22Mzq1uA/pub?gid=0. For more information, go to http://www.bp.com/en/global/corporate/energy-economics/statistical-review-of-world-energy.html.


Loading Data into R

To load my data into R in the desired format, I primarily used the tidyr and the dplyr packages. Because the population density dataset has data from 1950-2010 and the oil consumption dataset has data from 1965-2010, I decided to only explore the overlapping years. Therefore, I looked at only the years 1965-2010.

Lastly, I used an inner_join to combine the two datasets on matching country names and years. I wanted my final data frame to contain both the population density and the oil consumption of each country, formatted as follows: “country, year, density, oil_consumption”.

# The Gapminder website contains over 500 data sets with information about
# the world's population. 

# In your investigation, examine pairs of variables and create 2-5 plots that make
# use of the techniques from Lesson 4.

# ====================================================================
#Required Packages
suppressMessages(library("tidyr")) 
suppressMessages(library("dplyr")) 
suppressMessages(library("ggplot2"))
suppressMessages(library("gridExtra")) 

#------------------------------
#Loading Data Into R
# Dataset #1: Population Density (Per Sq. Km.) By Countries from 1950-2010
dens <- read.csv("population_density_per_sq_km.csv", header = TRUE)
names(dens)[1] <- "country"
names(dens) <- gsub("^X", "", names(dens)) #remove the leading X that read.csv adds in front of years

#Exclude years 1950 to 1964 to have same years as in the oil dataset
dens <- dens[,c("country", as.character(1965:2010))]

#collapsing multiple columns into 2 columns
dens2 <- gather(dens, "year", "density", 2:ncol(dens))
dens2$year <- as.factor(dens2$year)
#------------------------------
# Dataset #2: Oil Consumption Total (Tonnes Per Year) By Countries from 1965-2010
oil <- read.csv("oil_consumption_data.csv", header=TRUE)
oil <- oil[,-ncol(oil)] #drop the last column
names(oil)[1] <- "country"
names(oil) <- gsub("^X", "", names(oil)) #remove the leading X that read.csv adds in front of years
oil[oil == "-"] <- NA

#collapsing multiple columns into 2 columns
oil2 <- gather(oil, "year", "oil_consumption", 2:ncol(oil))
oil2$year <- as.factor(oil2$year)
#------------------------------
#Final Dataset
#innerJoin Data: Joins both population density and oil consumption 
#for countries that exist in both datasets

#Combining oil2 and dens2 using an inner join (drops countries that are missing from either dataset)
dat <- inner_join(dens2, oil2, by = c("country", "year") )

#Change oil_consumption column from character to numeric
is.numeric(dat$oil_consumption)
## [1] FALSE
dat$oil_consumption <- as.numeric(dat$oil_consumption)
is.numeric(dat$density) #already a numeric vector
## [1] TRUE
head(dat)
##      country year density oil_consumption
## 1    Algeria 1965   5.006         1289000
## 2  Argentina 1965   8.019        21952000
## 3  Australia 1965   1.467        16902000
## 4    Austria 1965  86.671         5534000
## 5 Azerbaijan 1965  52.823              NA
## 6 Bangladesh 1965 401.337              NA
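
The reshape-and-join steps above can be tried on a toy example. This is only a sketch under stated assumptions: the two small data frames are made up, and it uses pivot_longer (the newer tidyr interface, available from tidyr 1.0.0) rather than the gather call used in this post.

```r
library(tidyr)
library(dplyr)

# Made-up toy data in the same wide layout as the Gapminder files
dens_toy <- data.frame(country = c("A", "B", "C"),
                       X1965 = c(5.0, 8.0, 1.5),
                       X1966 = c(5.1, 8.2, 1.6))
oil_toy <- data.frame(country = c("A", "B", "D"),
                      X1965 = c("100", "-", "300"),
                      X1966 = c("110", "210", "310"),
                      stringsAsFactors = FALSE)

# One row per country-year; names_prefix strips the leading "X"
dens_long <- pivot_longer(dens_toy, cols = -country, names_to = "year",
                          names_prefix = "X", values_to = "density")
oil_long <- pivot_longer(oil_toy, cols = -country, names_to = "year",
                         names_prefix = "X", values_to = "oil_consumption") %>%
  # Turn the "-" placeholder into NA before converting to numeric
  mutate(oil_consumption = as.numeric(na_if(oil_consumption, "-")))

# Only countries present in both tables survive the inner join
toy <- inner_join(dens_long, oil_long, by = c("country", "year"))
toy
```

Countries "C" and "D" each appear in only one toy table, so they are absent from the joined result, while the "-" entry for country "B" is kept as a missing value.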

Mean and Median Line Graphs for Population Density & Oil Consumption

To get an overall idea of how the Population Density and Oil Consumption for the entire world have been changing, I created two new data frames that grouped my data by year and took the mean and median for that specific year. For example, I grouped all the oil consumption quantities from the year 1965 for all countries and took the mean and median, which became one row of the data frame dat.total_oil_consump_by_year. I repeated this for every year. The same was done for the population density, with the data being stored in dat.total_density_by_year.

Using these 2 new data frames, I created a line graph to depict a general trend as to how Oil Consumption and Population Density varied from 1965-2010.

Conditional Means and Medians of Population Density By Year
#---------------------------------------------------------------------------------------
#Conditional Means and medians of Population Density By Year
dat.total_density_by_year <- dat %>%
  group_by(year) %>%
  summarise(density_mean = mean(density, na.rm=TRUE),
            density_median = median(density, na.rm=TRUE)) %>%
  arrange(year)
head(dat.total_density_by_year)
## Source: local data frame [6 x 3]
## 
##     year density_mean density_median
##   (fctr)        (dbl)          (dbl)
## 1   1965     164.4497         40.908
## 2   1966     167.4409         41.549
## 3   1967     169.8910         42.317
## 4   1968     172.0117         42.746
## 5   1969     174.1104         43.157
## 6   1970     176.4172         43.543
p1 <- ggplot(aes(year,density_mean, group = 1), data=dat.total_density_by_year) +
  geom_line() +
  scale_x_discrete(breaks=seq(1965, 2010, 5)) +
  ggtitle("Average World Population Density vs. Year") +
  labs(x="Year", y="Population Density (Per Sq. Km.)") 

p2 <-ggplot(aes(year,density_median, group = 1), data=dat.total_density_by_year) +
  geom_line() +
  scale_x_discrete(breaks=seq(1965, 2010, 5)) +
  ggtitle("Median World Population Density vs. Year") +
  labs(x="Year", y="Population Density (Per Sq. Km.)")
grid.arrange(p1, p2)

There is an obvious upward trend in both the mean and the median population density across countries. In other words, the number of people living per square kilometer has been steadily increasing throughout the years. Nothing else stands out from these two plots, other than the fact that the mean is far larger than the median (about 164 versus 41 people per sq. km. in 1965), which suggests that a few very dense countries pull the mean upward.
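
To see why a mean can sit so far above a median, here is a small sketch with made-up densities (hypothetical values, not taken from the Gapminder data) in which two entries are far denser than the rest.

```r
# Made-up densities: most entries are sparse, two are extremely dense
toy_density <- c(3, 15, 40, 45, 90, 110, 3500, 6000)

toy_mean <- mean(toy_density)      # pulled upward by the two extreme values
toy_median <- median(toy_density)  # stays near the typical entry

# Dropping the two extreme values brings the mean close to the median
trimmed_mean <- mean(toy_density[toy_density < 1000])

c(mean = toy_mean, median = toy_median, trimmed_mean = trimmed_mean)
```

The median barely reacts to the two extreme values, which is why it is often the better summary of a "typical" country when the distribution is strongly right-skewed.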


Conditional Means and Medians of Oil Usage By Year
#Conditional Means and medians of Oil Usage By Year
dat.total_oil_consump_by_year <- dat %>%
  group_by(year) %>%
  summarise(oil_consump_mean = mean(oil_consumption, na.rm=TRUE),
            oil_consump_median = median(oil_consumption, na.rm=TRUE)) %>%
  arrange(year)
head(dat.total_oil_consump_by_year)
## Source: local data frame [6 x 3]
## 
##     year oil_consump_mean oil_consump_median
##   (fctr)            (dbl)              (dbl)
## 1   1965         25049804            5606000
## 2   1966         26918431            6068000
## 3   1967         28143038            6098000
## 4   1968         30031283            7005000
## 5   1969         32640811            7736000
## 6   1970         35362061            8582000
p1 <- ggplot(aes(year,oil_consump_mean, group = 1), data=dat.total_oil_consump_by_year) +
  geom_line() +
  scale_x_discrete(breaks=seq(1965, 2010, 5)) +
  ggtitle("Average World Oil Consumption vs. Year") +
  labs(x="Year", y="World Oil Consumption (Tonnes Per Year)")

p2 <- ggplot(aes(year,oil_consump_median, group = 1), 
             data=dat.total_oil_consump_by_year) +
  geom_line() +
  scale_x_discrete(breaks=seq(1965, 2010, 5)) +
  ggtitle("Median World Oil Consumption vs. Year") +
  labs(x="Year", y="World Oil Consumption (Tonnes Per Year)")
grid.arrange(p1, p2)

The line graphs of the mean and median world oil consumption are not as smooth as the population density line graphs. We see a significant drop in oil consumption in the 1980s, a period known as the 1980s oil glut: “a serious surplus of crude oil caused by falling demand following the 1970s energy crisis. The world price of oil, which had peaked in 1980 at over US$35 per barrel ($101 per barrel today), fell in 1986 from $27 to below $10 ($58 to $22 today)” (https://en.wikipedia.org/wiki/1980s_oil_glut). However, we begin to see an upward trend again after 1985.


Relationship Between Average World Population Density and Average World Oil Consumption

Because the mean and median population density increase monotonically with time, I can drop the “year” variable (the points along the x-axis will still be in chronological order) and plot the Average World Oil Consumption vs. Average World Population Density. The correlation between the two variables is r = 0.92, which is very high. This result makes intuitive sense, since a growing population plausibly consumes more oil. However, both series also trend upward over time, so the correlation alone does not establish that one causes the other.

#Left-joining the data frames dat.total_density_by_year & dat.total_oil_consump_by_year by their same year value
dat2 <- left_join(dat.total_density_by_year, 
                  dat.total_oil_consump_by_year, by="year")
head(dat2)
## Source: local data frame [6 x 5]
## 
##     year density_mean density_median oil_consump_mean oil_consump_median
##   (fctr)        (dbl)          (dbl)            (dbl)              (dbl)
## 1   1965     164.4497         40.908         25049804            5606000
## 2   1966     167.4409         41.549         26918431            6068000
## 3   1967     169.8910         42.317         28143038            6098000
## 4   1968     172.0117         42.746         30031283            7005000
## 5   1969     174.1104         43.157         32640811            7736000
## 6   1970     176.4172         43.543         35362061            8582000
with(dat2, cor.test(density_mean, oil_consump_mean))
## 
##  Pearson's product-moment correlation
## 
## data:  density_mean and oil_consump_mean
## t = 15.57, df = 44, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.8591469 0.9551861
## sample estimates:
##       cor 
## 0.9199912
ggplot(aes(density_mean, oil_consump_mean), data=dat2) +
  geom_point() +
  geom_smooth() +
  ggtitle("Mean World Oil Consump. vs. Mean World Popn Density") 
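
A high correlation between two yearly series can come largely from a shared upward trend. One rough way to check, sketched here on made-up numbers (the series below are hypothetical, not the Gapminder values), is to also correlate the year-to-year changes using diff().

```r
# Made-up yearly series: both rise over time, but large jumps in one
# coincide with small jumps in the other
series_a <- c(10, 12, 13, 15, 18, 19, 21, 24, 25, 27)
series_b <- c(100, 101, 105, 106, 108, 113, 114, 115, 120, 121)

# Correlation of the levels is dominated by the shared trend
cor_levels <- cor(series_a, series_b)

# Correlation of first differences removes most of that trend
cor_changes <- cor(diff(series_a), diff(series_b))

c(levels = cor_levels, changes = cor_changes)
```

In this toy case the levels are strongly positively correlated even though the changes are negatively correlated. The same check could be applied to dat2$density_mean and dat2$oil_consump_mean before reading much into the correlation of the levels.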


I created a similar plot to the one above, but this time for the median. This plot was not as smooth as the one for the mean; however, it told a similar story.

with(dat2, cor.test(density_median, oil_consump_median))
## 
##  Pearson's product-moment correlation
## 
## data:  density_median and oil_consump_median
## t = 12.719, df = 44, p-value = 2.451e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.8030647 0.9360272
## sample estimates:
##       cor 
## 0.8866579
ggplot(aes(density_median, oil_consump_median), data=dat2) +
  geom_point() +
  geom_smooth() +
  ggtitle("Median World Oil Consump. vs. Median World Popn Density") 


Population Density

Scatterplots of Every Country’s Population Density

Instead of looking only at the mean and median line graphs of the population density, I decided to explore the data further by creating a scatterplot of every country in the dataset. The scatterplot didn’t provide much additional information, except that three countries clearly had a much larger population density than the rest of the world. These three countries raise the mean well above the median.

#---------------------------------------------------------------------------------------
#Population Density Plots of All Countries by Year
ggplot(aes(year, density), data=dat) +
  geom_jitter(alpha=.2, position=position_jitter(h=0)) +
  scale_x_discrete(breaks=seq(1965, 2010, 5)) +
  coord_trans(y='sqrt') +
  ggtitle("All Countries: Population Density vs. Year") +
  labs(x="Year", y="Population Density (Per Sq. Km.)") +
  geom_line(aes(year,density_mean, group = 1, color="Mean"),
            data=dat.total_density_by_year) +
  geom_line(aes(year,density_median, group = 1, color="Median"),
            data=dat.total_density_by_year) +
  labs(color="Summary statistic") 


Scatterplots of Every Country’s Population Density, with Names

To learn more about the plot above, I passed the argument label=country inside the aes() wrapper and added a geom_text layer. This revealed the 3 countries with the highest population density: 1st “Hong Kong, China”, 2nd “Singapore”, and 3rd “Bangladesh”.

Singapore’s population density overtakes that of Hong Kong, China around the year 2005. It appears that this occurred because Singapore’s population density continued to increase steadily, while that of Hong Kong, China stayed roughly the same after the year 2000.

#Explore Popn Density By Country Names
ggplot(aes(year, density, label=country), data=dat) +
  geom_point(alpha=.2) +
  scale_x_discrete(breaks=seq(1965, 2010, 5)) +
  coord_trans(y='sqrt') +
  ggtitle("All Countries: Population Density vs. Year") +
  labs(x="Year", y="Population Density (Per Sq. Km.)") +
  geom_text(size = 2)
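
A crossover like the one described above can also be checked numerically instead of being read off the plot. The sketch below uses made-up densities for two hypothetical places, "P" and "Q"; the values are not the real figures for Hong Kong or Singapore.

```r
# Made-up densities for two hypothetical places
toy <- data.frame(
  country = rep(c("P", "Q"), each = 5),
  year    = rep(2002:2006, times = 2),
  density = c(6400, 6410, 6420, 6425, 6430,   # P: nearly flat
              6000, 6150, 6300, 6450, 6600)   # Q: steadily rising
)

# Put the two places side by side, one row per year
p <- toy[toy$country == "P", c("year", "density")]
q <- toy[toy$country == "Q", c("year", "density")]
both <- merge(p, q, by = "year", suffixes = c("_p", "_q"))

# First year in which Q's density exceeds P's
crossover_year <- min(both$year[both$density_q > both$density_p])
crossover_year
```

With the data frame used in this post, year is stored as a factor, so it would first need converting with as.integer(as.character(year)) before taking the minimum.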


Scatterplots of Every Country’s Population Density, with Names (Excluding Top 3 Countries)
#Zoom into densities below 125 per sq. km. (this drops the top 3 countries "Hong Kong, China", "Singapore", and "Bangladesh", along with any other observations at or above 125)
ggplot(aes(year, density, label=country), data=dat[dat$density < 125,]) +
  geom_point(alpha=.2) +
  scale_x_discrete(breaks=seq(1965, 2010, 5)) +
  coord_trans(y='sqrt') +
  ggtitle("All Countries: Population Density vs. Year") +
  labs(x="Year", y="Population Density (Per Sq. Km.)") +
  geom_text(size = 1.8)

After restricting the plot to densities below 125 per sq. km., which removes the 3 countries with the largest population density along with every other observation at or above that threshold, I was able to zoom much further into the graph. This revealed that the 3 countries with the lowest population density are (in ascending order) Australia, Iceland, and Canada.
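
Reading the lowest values off a crowded text plot is error-prone, so a ranking can serve as a cross-check. A minimal sketch, assuming a single year of made-up densities for hypothetical countries:

```r
# Made-up densities for one year (hypothetical countries and values)
toy <- data.frame(country = c("A", "B", "C", "D", "E"),
                  density = c(12.0, 2.1, 240.5, 3.4, 2.9))

# Sort ascending by density and keep the three sparsest entries
lowest3 <- head(toy[order(toy$density), ], 3)
lowest3
```

The same pattern could be applied to one year of dat, for example dat[dat$year == "2010", ], to confirm the ordering seen in the plot.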


Oil Consumption

Scatterplots of Every Country’s Oil Consumption

This scatterplot revealed that oil consumption varies far more across countries than population density does. There is one country that consumes far more oil than any other country in the world! As with population density, the mean world oil consumption is much larger than the median because of the few countries that consume oil in very large quantities.

#------------------------------
#Oil Consumption Plots of All Countries By Year
ggplot(aes(year, oil_consumption), data=dat) +
  geom_jitter(alpha=.2, position=position_jitter(h=0)) +
  scale_x_discrete(breaks=seq(1965, 2010, 5)) +
  coord_trans(y='sqrt') +
  ggtitle("All Countries: Oil Consumption vs. Year") +
  labs(x="Year", y="Oil Consumption (Tonnes Per Year)") +
  geom_line(aes(year,oil_consump_mean, group = 1, color="Mean"), 
            data=dat.total_oil_consump_by_year) +
  geom_line(aes(year,oil_consump_median, group = 1, color="Median"), 
            data=dat.total_oil_consump_by_year) +
  labs(color="Summary statistic")


Scatterplots of Every Country’s Oil Consumption, with Names

To learn more about the plot above, I changed the points to names. Unsurprisingly, the United States has been the country with the highest oil consumption in every year since 1965.

Japan appears to be the second largest oil consumer up until 1985. However, note that we do not have any data on Russia’s oil consumption before 1985, so this might distort our analysis.

Russia’s oil consumption drops significantly after 1990, which is when Japan retakes second place. It is not until 2002 that China surpasses Japan and takes second place.

#Explore By Country Names
ggplot(aes(year, oil_consumption, label=country), data=dat) +
  geom_point(alpha=.2) +
  scale_x_discrete(breaks=seq(1965, 2010, 5)) +
  coord_trans(y='sqrt') +
  ggtitle("All Countries: Oil Consumption vs. Year") +
  labs(x="Year", y="Oil Consumption (Tonnes Per Year)") +
  geom_text(size = 1.8)
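
The changes in second place described above could be checked with a per-year ranking rather than by eye. This is a sketch on made-up figures for hypothetical countries, using dplyr's min_rank within each year.

```r
library(dplyr)

# Made-up consumption figures (hypothetical countries, arbitrary units)
toy <- data.frame(
  country = rep(c("A", "B", "C"), times = 2),
  year    = rep(c(1990, 2005), each = 3),
  oil_consumption = c(900, 300, 100,
                      950, 250, 400)
)

# Rank countries within each year, largest consumer first
ranked <- toy %>%
  group_by(year) %>%
  mutate(rank = min_rank(desc(oil_consumption))) %>%
  ungroup()

# Which country holds second place in each year?
second_place <- ranked %>%
  filter(rank == 2) %>%
  arrange(year) %>%
  select(year, country)
second_place
```

Missing consumption values receive a rank of NA, so countries without data in a given year simply drop out of that year's ranking.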


Scatterplots of Every Country’s Oil Consumption, with Names (Excluding United States)

To extract more information, I subset the data twice. The first plot excludes the top oil consumer. However, it does not provide much information beyond what was already described above. For the second plot, I only included observations below 50,000,000 tonnes of oil per year. This second plot shows that the countries with the lowest oil consumption are Qatar, Iceland, Bangladesh, and Ecuador. The United Arab Emirates was the lowest oil consumer before the 1970s; however, its consumption rose quickly and surpassed that of a large number of countries.

#Zoom into Oil Consumption Less than 500,000,000 tonnes per year. (excluding top oil consumer "United States")
ggplot(aes(year, oil_consumption, label=country), 
       data=subset(dat, oil_consumption < 500000000)) + #subset() drops rows where oil_consumption is NA
  geom_point(alpha=.2) +
  scale_x_discrete(breaks=seq(1965, 2010, 5)) +
  coord_trans(y='sqrt') +
  ggtitle("All Countries: Oil Consumption vs. Year") +
  labs(x="Year", y="Oil Consumption (Tonnes Per Year)") +
  geom_text(size = 1.8)

#Zoom into Oil Consumption Less than 50,000,000 tonnes per year
ggplot(aes(year, oil_consumption, label=country), 
       data=subset(dat, oil_consumption < 50000000)) + #subset() drops rows where oil_consumption is NA
  geom_point(alpha=.2) +
  scale_x_discrete(breaks=seq(1965, 2010, 5)) +
  coord_trans(y='sqrt') +
  ggtitle("All Countries: Oil Consumption vs. Year") +
  labs(x="Year", y="Oil Consumption (Tonnes Per Year)") +
  geom_text(size = 1.8)
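
One detail worth knowing when subsetting on a column that contains missing values, as oil_consumption does: bracket indexing with a logical condition keeps a row of NAs wherever the comparison itself is NA, whereas subset() drops those rows. A small sketch on made-up data:

```r
# Made-up data with one missing consumption value
toy <- data.frame(country = c("A", "B", "C"),
                  oil_consumption = c(10, NA, 90),
                  stringsAsFactors = FALSE)

# Bracket indexing: the NA comparison yields a row filled with NA
with_brackets <- toy[toy$oil_consumption < 50, ]

# subset() treats an NA comparison as FALSE and drops that row
with_subset <- subset(toy, oil_consumption < 50)

nrow(with_brackets)
nrow(with_subset)
```

dplyr's filter() behaves like subset() in this respect. With bracket indexing, wrapping the condition in which() has the same effect.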


Analysis on Individual Countries

#------------------------------
#Let's Look at Specific Countries
table(dat$country)
## 
##              Algeria            Argentina            Australia 
##                   46                   46                   46 
##              Austria           Azerbaijan           Bangladesh 
##                   46                   46                   46 
##              Belarus               Brazil             Bulgaria 
##                   46                   46                   46 
##               Canada                Chile                China 
##                   46                   46                   46 
##             Colombia              Denmark              Ecuador 
##                   46                   46                   46 
##                Egypt              Finland               France 
##                   46                   46                   46 
##              Germany               Greece     Hong Kong, China 
##                   46                   46                   46 
##              Hungary              Iceland                India 
##                   46                   46                   46 
##            Indonesia                 Iran              Ireland 
##                   46                   46                   46 
##                Italy                Japan           Kazakhstan 
##                   46                   46                   46 
##               Kuwait            Lithuania             Malaysia 
##                   46                   46                   46 
##               Mexico          Netherlands          New Zealand 
##                   46                   46                   46 
##               Norway             Pakistan                 Peru 
##                   46                   46                   46 
##          Philippines               Poland             Portugal 
##                   46                   46                   46 
##                Qatar              Romania               Russia 
##                   46                   46                   46 
##         Saudi Arabia            Singapore      Slovak Republic 
##                   46                   46                   46 
##         South Africa                Spain               Sweden 
##                   46                   46                   46 
##          Switzerland               Taiwan             Thailand 
##                   46                   46                   46 
##               Turkey         Turkmenistan              Ukraine 
##                   46                   46                   46 
## United Arab Emirates       United Kingdom        United States 
##                   46                   46                   46 
##           Uzbekistan            Venezuela 
##                   46                   46
China <- dat[dat$country == "China",]
Russia <- dat[dat$country == "Russia",]
UAE <- dat[dat$country == "United Arab Emirates",]
UK <- dat[dat$country == "United Kingdom",]
USA <- dat[dat$country == "United States",]

#China
with(China, cor.test(density, oil_consumption))
## 
##  Pearson's product-moment correlation
## 
## data:  density and oil_consumption
## t = 12.8, df = 44, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.8050693 0.9367258
## sample estimates:
##       cor 
## 0.8878645
p1 <- ggplot(aes(density, oil_consumption), data=China) +
  geom_point() +
  geom_smooth() +
  ggtitle("China: Oil Consumption vs. Population Density") +
  labs(x="Population Density (Per Sq. Km.)", y="Oil Consumption (Tonnes Per Year)")
p2 <- ggplot(aes(year, density), data=China) +
  geom_point() +
  scale_x_discrete(breaks=seq(1965, 2010, 5)) +
  ggtitle("China: Population Density vs. Year") +
  labs(x="Year", y="Population Density (Per Sq. Km.)")
p3 <- ggplot(aes(year, oil_consumption), data=China) +
  geom_point() +
  scale_x_discrete(breaks=seq(1965, 2010, 5)) +
  ggtitle("China: Oil Consumption vs. Year") +
  labs(x="Year", y="Oil Consumption (Tonnes Per Year)")
grid.arrange(p1, p2, p3)

#Russia: Oil Consumption data not available from 1965-1984
with(Russia, cor.test(density, oil_consumption))
## 
##  Pearson's product-moment correlation
## 
## data:  density and oil_consumption
## t = 1.3758, df = 24, p-value = 0.1816
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.1306584  0.5953751
## sample estimates:
##       cor 
## 0.2703775
p1 <- ggplot(aes(density, oil_consumption), 
             data=Russia[!is.na(Russia$oil_consumption),]) +
  geom_point() +
  geom_smooth() +
  ggtitle("Russia: Oil Consumption vs. Population Density (1985-2010)") +
  labs(x="Population Density (Per Sq. Km.)", y="Oil Consumption (Tonnes Per Year)")
p2 <- ggplot(aes(year, density), data=Russia) +
  geom_point() +
  scale_x_discrete(breaks=seq(1965, 2010, 5)) +
  ggtitle("Russia: Population Density vs. Year") +
  labs(x="Year", y="Population Density (Per Sq. Km.)")
p3 <- ggplot(aes(year, oil_consumption), data=Russia[!is.na(Russia$oil_consumption),]) +
  geom_point() +
  scale_x_discrete(breaks=seq(1965, 2010, 5)) +
  ggtitle("Russia: Oil Consumption vs. Year (1985-2010)") +
  labs(x="Year", y="Oil Consumption (Tonnes Per Year)")
grid.arrange(p1, p2, p3)

#United Arab Emirates
with(UAE, cor.test(density, oil_consumption))
## 
##  Pearson's product-moment correlation
## 
## data:  density and oil_consumption
## t = 15.076, df = 41, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.8569927 0.9563831
## sample estimates:
##       cor 
## 0.9204258
p1 <- ggplot(aes(density, oil_consumption), data=UAE) +
  geom_point() +
  geom_smooth() +
  ggtitle("United Arab Emirates: Oil Consumption vs. Population Density") +
  labs(x="Population Density (Per Sq. Km.)", y="Oil Consumption (Tonnes Per Year)")
p2 <- ggplot(aes(year, density), data=UAE) +
  geom_point() +
  scale_x_discrete(breaks=seq(1965, 2010, 5)) +
  ggtitle("United Arab Emirates: Population Density vs. Year") +
  labs(x="Year", y="Population Density (Per Sq. Km.)")
p3 <- ggplot(aes(year, oil_consumption), data=UAE) +
  geom_point() +
  scale_x_discrete(breaks=seq(1965, 2010, 5)) +
  ggtitle("United Arab Emirates: Oil Consumption vs. Year") +
  labs(x="Year", y="Oil Consumption (Tonnes Per Year)")
grid.arrange(p1, p2, p3)

#UK
with(UK, cor.test(density, oil_consumption))
## 
##  Pearson's product-moment correlation
## 
## data:  density and oil_consumption
## t = -3.1418, df = 44, p-value = 0.003001
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.6389584 -0.1573108
## sample estimates:
##        cor 
## -0.4280606
p1 <- ggplot(aes(density, oil_consumption), data=UK) +
  geom_point() +
  geom_smooth() +
  ggtitle("United Kingdom: Oil Consumption vs. Population Density") +
  labs(x="Population Density (Per Sq. Km.)", y="Oil Consumption (Tonnes Per Year)")
p2 <- ggplot(aes(year, density), data=UK) +
  geom_point() +
  scale_x_discrete(breaks=seq(1965, 2010, 5)) +
  ggtitle("United Kingdom: Population Density vs. Year") +
  labs(x="Year", y="Population Density (Per Sq. Km.)")
p3 <- ggplot(aes(year, oil_consumption), data=UK) +
  geom_point() +
  scale_x_discrete(breaks=seq(1965, 2010, 5)) +
  ggtitle("United Kingdom: Oil Consumption vs. Year") +
  labs(x="Year", y="Oil Consumption (Tonnes Per Year)")
grid.arrange(p1, p2, p3)

#USA
with(USA, cor.test(density, oil_consumption))
## 
##  Pearson's product-moment correlation
## 
## data:  density and oil_consumption
## t = 8.0468, df = 44, p-value = 3.456e-10
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.6202683 0.8675836
## sample estimates:
##       cor 
## 0.7716254
p1 <- ggplot(aes(density, oil_consumption), data=USA) +
  geom_point() +
  geom_smooth() +
  ggtitle("USA: Oil Consumption vs. Population Density") +
  labs(x="Population Density (Per Sq. Km.)", y="Oil Consumption (Tonnes Per Year)")
p2 <- ggplot(aes(year, density), data=USA) +
  geom_point() +
  scale_x_discrete(breaks=seq(1965, 2010, 5)) +
  ggtitle("USA: Population Density vs. Year") +
  labs(x="Year", y="Population Density (Per Sq. Km.)")
p3 <- ggplot(aes(year, oil_consumption), data=USA) +
  geom_point() +
  scale_x_discrete(breaks=seq(1965, 2010, 5)) +
  ggtitle("USA: Oil Consumption vs. Year") +
  labs(x="Year", y="Oil Consumption (Tonnes Per Year)")
grid.arrange(p1, p2, p3)
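
The five per-country correlations above repeat the same call. As a sketch of a more compact alternative (on a made-up panel of hypothetical countries, not the real data), the correlation can be computed for every country in one pass with group_by and summarise.

```r
library(dplyr)

# Made-up panel (hypothetical countries and values)
toy <- data.frame(
  country = rep(c("A", "B"), each = 5),
  density = c(1, 2, 3, 4, 5,   10, 11, 12, 13, 14),
  oil_consumption = c(2, 4, 5, 8, 11,   50, 48, 45, 40, NA)
)

# One Pearson correlation per country; rows with a missing value are dropped
cor_by_country <- toy %>%
  group_by(country) %>%
  summarise(n_obs = sum(complete.cases(density, oil_consumption)),
            r = cor(density, oil_consumption, use = "complete.obs"))
cor_by_country
```

Reporting n_obs next to r makes it visible when a correlation rests on fewer years, as is the case for Russia and the United Arab Emirates in the output above. Unlike cor.test, this sketch does not return confidence intervals or p-values.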

#------------------------------