Nations HW

Loading Libraries

library(tidyverse)
library(highcharter)
library(RColorBrewer)

Set working directory

I set my working directory to the folder where I saved the downloaded dataset. I also used mutate to compute each country’s total GDP in trillions: GDP per capita times population, divided by one trillion.

setwd("/Users/mikea/Desktop/Data 110 ")
nations <- read_csv("nations.csv") %>%
  mutate(gdp_tn = gdp_percap * population / 1e12)
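
As a quick check of the arithmetic, here is a minimal sketch that applies the same mutate() to a small hand-made tibble. The name `toy` is hypothetical and not part of the homework data, and the two rows use the rounded 1990 values that appear in the printed output of my4, so the results are only approximate.

```r
library(dplyr)

# Hypothetical two-row tibble; values are rounded 1990 figures, for illustration only
toy <- tibble(
  country    = c("Belgium", "United States"),
  gdp_percap = c(19053, 23954),
  population = c(9967379, 249623000)
)

# Total GDP = GDP per capita * population; dividing by 1e12 expresses it in trillions
toy <- toy %>%
  mutate(gdp_tn = gdp_percap * population / 1e12)

toy
```

With these rounded inputs, Belgium comes out near 0.19 trillion and the United States near 5.98 trillion.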

Let’s take a look at all the countries and country codes.

#unique(nations$country)
#unique(nations$iso3c)
names(nations)
##  [1] "iso2c"              "iso3c"              "country"           
##  [4] "year"               "gdp_percap"         "population"        
##  [7] "birth_rate"         "neonat_mortal_rate" "region"            
## [10] "income"             "gdp_tn"

Let’s Choose My Four Countries!

The four countries I will be using are the USA, Belgium, Japan, and Germany. I used the filter function to select them by their iso3c codes and then arranged the rows by year.

my4 <- nations %>%
  filter(iso3c == "DEU" | iso3c == "BEL" | iso3c == "JPN" | iso3c == "USA") %>%
  arrange(year)
my4
## # A tibble: 100 × 11
##    iso2c iso3c country  year gdp_percap population birth_rate neonat_mortal_rate
##    <chr> <chr> <chr>   <dbl>      <dbl>      <dbl>      <dbl>              <dbl>
##  1 BE    BEL   Belgium  1990     19053.    9967379       12.4                4.6
##  2 DE    DEU   Germany  1990     19033.   79433029       11.4                3.4
##  3 JP    JPN   Japan    1990     19230.  123537000       10                  2.5
##  4 US    USA   United…  1990     23954.  249623000       16.7                5.8
##  5 BE    BEL   Belgium  1991     19974.   10004486       12.6                4.4
##  6 DE    DEU   Germany  1991     20521.   80013896       10.4                3.5
##  7 JP    JPN   Japan    1991     20467.  123921000        9.9                2.5
##  8 US    USA   United…  1991     24405.  252981000       16.2                5.6
##  9 BE    BEL   Belgium  1992     20658.   10045158       12.4                4.2
## 10 DE    DEU   Germany  1992     21230.   80624598       10                  3.5
## # ℹ 90 more rows
## # ℹ 3 more variables: region <chr>, income <chr>, gdp_tn <dbl>
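
The chain of == comparisons joined by | can also be written with %in%, which is easier to extend when the list of countries grows. The sketch below compares the two forms on a small made-up tibble; `toy` and `codes` are hypothetical names used only for this illustration.

```r
library(dplyr)

# Hypothetical mini version of the data, just to compare the two filters
toy <- tibble(
  iso3c = c("BEL", "DEU", "FRA", "JPN", "USA", "CAN"),
  year  = c(1991, 1990, 1990, 1990, 1991, 1990)
)

# The four country codes used in this homework
codes <- c("DEU", "BEL", "JPN", "USA")

# Original style: one comparison per country, joined with |
with_or <- toy %>%
  filter(iso3c == "DEU" | iso3c == "BEL" | iso3c == "JPN" | iso3c == "USA") %>%
  arrange(year)

# Alternative style: keep rows whose code appears in the vector
with_in <- toy %>%
  filter(iso3c %in% codes) %>%
  arrange(year)

# Both approaches should keep the same four rows in the same order
identical(with_or, with_in)
```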

Now let’s plot

This is an example where I used highcharter to guide myself through the homework and to experiment. My plots using ggplot2 come after this example.

highchart() %>%
 hc_add_series(data = my4, type = "line", hcaes(x = year, y = gdp_tn, group = country))

Now Let’s Plot With ggplot2!

In this plot I used my variable “my4”, which holds the data on my four countries. I then set the x-axis to year, the y-axis to GDP (in trillions), and the color to country. I used geom_point and geom_line to replicate the chart from highcharter.

ggplot(data = my4, aes(x = year, y = gdp_tn, group = country, color = country)) +
  geom_point() +
  geom_line() +
  scale_color_brewer(palette = "Set1") +
  labs(x = "Year", y = "GDP (trillions)", title = "GDP by Country") +
  theme_minimal() 

Grouping By Region

In this code, I created a new variable called grouped_data from my previous variable “my4”. I grouped the rows by region and year, and then summed GDP within each group, so each row is the combined GDP of my countries in that region for that year.

grouped_data <- my4 %>%
  group_by(region, year) %>%
  summarise(GDP = sum(gdp_tn, na.rm = TRUE))
## `summarise()` has grouped output by 'region'. You can override using the
## `.groups` argument.
grouped_data
## # A tibble: 75 × 3
## # Groups:   region [3]
##    region               year   GDP
##    <chr>               <dbl> <dbl>
##  1 East Asia & Pacific  1990  2.38
##  2 East Asia & Pacific  1991  2.54
##  3 East Asia & Pacific  1992  2.62
##  4 East Asia & Pacific  1993  2.68
##  5 East Asia & Pacific  1994  2.76
##  6 East Asia & Pacific  1995  2.88
##  7 East Asia & Pacific  1996  3.00
##  8 East Asia & Pacific  1997  3.10
##  9 East Asia & Pacific  1998  3.08
## 10 East Asia & Pacific  1999  3.12
## # ℹ 65 more rows
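
The message printed by summarise() points to the `.groups` argument. As a minimal sketch on made-up numbers (the tibble `toy` and its values are hypothetical), setting `.groups = "drop"` silences the message and returns an ungrouped tibble. The sketch also shows that with na.rm = TRUE a group containing only NA values sums to 0 rather than NA.

```r
library(dplyr)

# Hypothetical data: region A has two rows in 1990 and only a missing value in 1991
toy <- tibble(
  region = c("A", "A", "A", "B"),
  year   = c(1990, 1990, 1991, 1990),
  gdp_tn = c(1, 2, NA, 4)
)

# Sum GDP per region and year, then drop the grouping so the result is a plain tibble
toy_grouped <- toy %>%
  group_by(region, year) %>%
  summarise(GDP = sum(gdp_tn, na.rm = TRUE), .groups = "drop")

toy_grouped
```

Dropping the groups is optional here; ggplot2 handles the grouped tibble from the homework code as well.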

Second Plot Using ggplot2

In this plot, I used the grouped_data variable and set the x-axis to year, the y-axis to GDP, and the fill to region. I used geom_area to draw a stacked area chart and scale_fill_brewer to color the regions.

ggplot(data = grouped_data, aes(x = year, y = GDP, fill = region)) +
  geom_area(color = "white") +
  scale_fill_brewer(palette = "Set2") +
  labs(x = "Year", y = "GDP (trillions)", title = "GDP by Region") +
  theme_minimal()

Thank You!