For this assignment, I will create two visualizations from the “nations” dataset. The first chart will show GDP in trillions of dollars for four countries from 1990 to 2015. The second chart will show GDP in trillions of dollars by World Bank region over the same period.
First, I imported the data using the readr package and stored it in the “nations” variable, so it is easier to call.
library("readr")
nations <- read_csv("nations.csv")
## Parsed with column specification:
## cols(
## iso2c = col_character(),
## iso3c = col_character(),
## country = col_character(),
## year = col_double(),
## gdp_percap = col_double(),
## population = col_double(),
## birth_rate = col_double(),
## neonat_mortal_rate = col_double(),
## region = col_character(),
## income = col_character()
## )
I will be using functions from the dplyr and ggplot2 packages. They can be installed with install.packages() if they are not already on the device.
library(dplyr)
## Warning: package 'dplyr' was built under R version 3.6.1
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(ggplot2)
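The install step mentioned above could be sketched as follows. The helper name install_if_missing is hypothetical (it is not part of the assignment or of any package); it only calls install.packages() for packages that requireNamespace() cannot find.

```r
# Hypothetical helper: install only the packages that are not yet available.
install_if_missing <- function(pkgs) {
  # requireNamespace() returns TRUE if the package can be loaded, FALSE otherwise
  is_available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  missing <- pkgs[!is_available]
  if (length(missing) > 0) {
    install.packages(missing)
  }
  # Return the names that had to be installed, without printing them
  invisible(missing)
}
```

For this assignment the call would be install_if_missing(c("readr", "dplyr", "ggplot2")), run once before the library() calls.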
This chart will show GDP in trillions of dollars for four countries from 1990 to 2015. The dataset has no total GDP column, so I created one using mutate from dplyr: I multiplied gdp_percap by population and divided by one trillion, naming the new variable gdp_trill. I then filtered to the four countries I was interested in: China, Germany, Japan, and the United States. I labelled the x-axis, y-axis, and title of the chart, set the theme and color palette, and then added both geom_point and geom_line to draw a scatter plot with a line connecting each country's points.
chart1 <- nations %>%
  mutate(gdp_trill = gdp_percap * population / 1e12) %>%
  filter(country %in% c("China", "Germany", "Japan", "United States")) %>%
  ggplot(aes(x = year, y = gdp_trill, color = country)) +
  xlab("year") +
  ylab("GDP ($ trillion)") +
  ggtitle("China's Rise to Become the Largest Economy") +
  scale_color_brewer(palette = "Set1") +
  theme_minimal(base_size = 12) +
  geom_point() +
  geom_line()
chart1
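To check the gdp_trill arithmetic by hand, here is a minimal sketch on a toy data frame. The country names and figures are made up for illustration and are not taken from nations.csv; only the mutate and filter steps mirror the chart code.

```r
library(dplyr)

# Hypothetical toy rows (invented values, not from nations.csv)
toy <- data.frame(
  country    = c("A", "B", "C"),
  gdp_percap = c(50000, 10000, 2000),
  population = c(300e6, 1.2e9, 50e6)
)

toy_gdp <- toy %>%
  # total GDP = GDP per capita * population, rescaled to trillions
  mutate(gdp_trill = gdp_percap * population / 1e12) %>%
  # keep only the countries of interest
  filter(country %in% c("A", "B"))

toy_gdp
```

Country A works out to 50000 * 300e6 / 1e12 = 15 trillion and country B to 12 trillion, while country C is dropped by the filter.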
This chart will show GDP in trillions of dollars by World Bank region from 1990 to 2015. For this chart, I grouped the data by region and year with group_by, created gdp_trill with mutate as before, and then used summarise to add up gdp_trill within each region and year, ignoring missing values. I filtered to the regions I was interested in: East Asia & Pacific, Europe & Central Asia, Latin America & Caribbean, Middle East & North Africa, North America, South Asia, and Sub-Saharan Africa. I labelled the x-axis, y-axis, and title of the chart, set the theme and color palette, and then used geom_area with the color white to draw a thin white line around each area.
chart2 <- nations %>%
  group_by(region, year) %>%
  mutate(gdp_trill = gdp_percap * population / 1e12) %>%
  summarise(sum = sum(gdp_trill, na.rm = TRUE)) %>%
  filter(region %in% c("East Asia & Pacific", "Europe & Central Asia",
                       "Latin America & Caribbean", "Middle East & North Africa",
                       "North America", "South Asia", "Sub-Saharan Africa")) %>%
  ggplot(aes(x = year, y = sum, fill = region)) +
  xlab("year") +
  ylab("GDP ($ trillion)") +
  ggtitle("GDP by World Bank Region") +
  scale_fill_brewer(palette = "Set2") +
  theme_minimal(base_size = 12) +
  geom_area(color = "white")
chart2
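The group_by and summarise step can be illustrated on a small example. This is a sketch with invented region names and values, not data from nations.csv; it shows how rows sharing a region and year collapse into one total and how na.rm = TRUE skips missing values.

```r
library(dplyr)

# Hypothetical toy rows (invented values, not from nations.csv)
toy <- data.frame(
  region    = c("North", "North", "South", "South", "South"),
  year      = c(1990, 1990, 1990, 1990, 1991),
  gdp_trill = c(1.0, 2.5, 0.5, NA, 0.75)
)

toy_totals <- toy %>%
  group_by(region, year) %>%
  # one row per region-year pair; the NA is ignored rather than propagated
  summarise(total = sum(gdp_trill, na.rm = TRUE)) %>%
  ungroup()

toy_totals
```

The result has three rows: North in 1990 totals 3.5, South in 1990 totals 0.5 (the NA is skipped), and South in 1991 totals 0.75. Without na.rm = TRUE, the South 1990 total would be NA.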