library(tidyverse)
library(highcharter)
library(RColorBrewer)
I set my working directory to the folder where I downloaded the dataset. I also used mutate to add a gdp_tn column: each country's GDP in trillions, computed as GDP per capita times population, divided by one trillion.
setwd("/Users/mikea/Desktop/Data 110 ")
nations <- read_csv("nations.csv") %>%
  mutate(gdp_tn = gdp_percap * population / 1e12)
#unique(nations$country)
#unique(nations$iso3c)
names(nations)
##  [1] "iso2c"              "iso3c"              "country"
##  [4] "year"               "gdp_percap"         "population"
##  [7] "birth_rate"         "neonat_mortal_rate" "region"
## [10] "income"             "gdp_tn"
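As a quick check of the conversion, here is a minimal sketch that applies the same formula to two rows. The name `demo` is just for illustration, and the inputs are the rounded 1990 values printed in this document, so the results are approximate.

```r
library(dplyr)

# Hypothetical two-row table; values are the rounded 1990 figures
# printed in this document, not the full-precision data.
demo <- tibble(
  country    = c("Belgium", "Japan"),
  gdp_percap = c(19053, 19230),
  population = c(9967379, 123537000)
)

# GDP per capita times population gives total GDP;
# dividing by 1e12 expresses it in trillions.
demo <- demo %>%
  mutate(gdp_tn = gdp_percap * population / 1e12)

demo
```

Japan's value comes out near 2.38, which matches the 1990 East Asia & Pacific total later in this document, where Japan is the only one of my four countries in that region.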
The four countries I will be using are the USA, Belgium, Japan, and Germany. I used the filter function to select them by their iso3c codes and arranged the rows by year.
my4 <- nations %>%
  filter(iso3c == "DEU" | iso3c == "BEL" | iso3c == "JPN" | iso3c == "USA") %>%
  arrange(year)
my4
## # A tibble: 100 × 11
##    iso2c iso3c country  year gdp_percap population birth_rate neonat_mortal_rate
##    <chr> <chr> <chr>   <dbl>      <dbl>      <dbl>      <dbl>              <dbl>
##  1 BE    BEL   Belgium  1990     19053.    9967379       12.4                4.6
##  2 DE    DEU   Germany  1990     19033.   79433029       11.4                3.4
##  3 JP    JPN   Japan    1990     19230.  123537000       10                  2.5
##  4 US    USA   United…  1990     23954.  249623000       16.7                5.8
##  5 BE    BEL   Belgium  1991     19974.   10004486       12.6                4.4
##  6 DE    DEU   Germany  1991     20521.   80013896       10.4                3.5
##  7 JP    JPN   Japan    1991     20467.  123921000        9.9                2.5
##  8 US    USA   United…  1991     24405.  252981000       16.2                5.6
##  9 BE    BEL   Belgium  1992     20658.   10045158       12.4                4.2
## 10 DE    DEU   Germany  1992     21230.   80624598       10                  3.5
## # ℹ 90 more rows
## # ℹ 3 more variables: region <chr>, income <chr>, gdp_tn <dbl>
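The same selection can also be written with `%in%`, which is easier to extend when the list of countries grows. This is only a sketch on a made-up table (`toy` and `wanted` are hypothetical names), comparing the two forms of the filter.

```r
library(dplyr)

# Hypothetical toy table standing in for `nations`.
toy <- tibble(
  iso3c = c("BEL", "DEU", "FRA", "JPN", "USA", "BRA"),
  year  = c(1991, 1990, 1990, 1990, 1991, 1990)
)

wanted <- c("DEU", "BEL", "JPN", "USA")

# Chained comparisons joined with |, as in the chunk above.
with_or <- toy %>%
  filter(iso3c == "DEU" | iso3c == "BEL" | iso3c == "JPN" | iso3c == "USA") %>%
  arrange(year)

# Same rows selected with a single membership test.
with_in <- toy %>%
  filter(iso3c %in% wanted) %>%
  arrange(year)

identical(with_or, with_in)
```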
This is an example where I used highcharter to guide myself through the homework and to experiment. My plots using ggplot come after this example.
highchart() %>%
  hc_add_series(data = my4, type = "line", hcaes(x = year, y = gdp_tn, group = country))
In this plot, I used my variable “my4”, which holds the data on my four countries. I set the x-axis to year and the y-axis to GDP in trillions, and colored the lines by country. I used geom_point and geom_line to replicate the chart from highcharter.
ggplot(data = my4, aes(x = year, y = gdp_tn, group = country, color = country)) +
  geom_point() +
  geom_line() +
  scale_color_brewer(palette = "Set1") +
  labs(x = "Year", y = "GDP (trillions)", title = "GDP by Country") +
  theme_minimal()
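A side note on the group aesthetic: when color is mapped to a discrete variable such as country, ggplot2 already draws one line per country, so `group = country` is harmless but not required. The sketch below checks this on a small made-up table (`toy` is hypothetical, with GDP values rounded from the rows shown earlier) by counting the groups in the built line layer.

```r
library(ggplot2)

# Hypothetical toy data; gdp_tn values are rounded approximations.
toy <- data.frame(
  country = rep(c("Japan", "Belgium"), each = 2),
  year    = rep(c(1990, 1991), times = 2),
  gdp_tn  = c(2.38, 2.54, 0.19, 0.20)
)

# No explicit group aesthetic: only color is mapped to country.
p <- ggplot(toy, aes(x = year, y = gdp_tn, color = country)) +
  geom_point() +
  geom_line()

# Layer 2 is geom_line(); its computed data records the group of each row.
line_groups <- unique(layer_data(p, i = 2)$group)
length(line_groups)
```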
In this code, I created a new variable called grouped_data from my previous variable “my4”. I grouped by region and year, then summarized by summing GDP within each group. Because it is built from “my4”, each region's total covers only my four countries.
grouped_data <- my4 %>%
  group_by(region, year) %>%
  summarise(GDP = sum(gdp_tn, na.rm = TRUE))
## `summarise()` has grouped output by 'region'. You can override using the
## `.groups` argument.
grouped_data
## # A tibble: 75 × 3
## # Groups:   region [3]
##    region               year   GDP
##    <chr>               <dbl> <dbl>
##  1 East Asia & Pacific  1990  2.38
##  2 East Asia & Pacific  1991  2.54
##  3 East Asia & Pacific  1992  2.62
##  4 East Asia & Pacific  1993  2.68
##  5 East Asia & Pacific  1994  2.76
##  6 East Asia & Pacific  1995  2.88
##  7 East Asia & Pacific  1996  3.00
##  8 East Asia & Pacific  1997  3.10
##  9 East Asia & Pacific  1998  3.08
## 10 East Asia & Pacific  1999  3.12
## # ℹ 65 more rows
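The message from summarise() and the `# Groups: region [3]` line both mean the result is still grouped by region. If an ungrouped result is preferred, the `.groups` argument can be set explicitly. This is a minimal sketch on a made-up table (`toy` and `totals` are hypothetical names, with rounded values), not a change to the code above.

```r
library(dplyr)

# Hypothetical toy data with two countries in one region and one in another.
toy <- tibble(
  region = c("Europe & Central Asia", "Europe & Central Asia", "East Asia & Pacific"),
  year   = c(1990, 1990, 1990),
  gdp_tn = c(0.19, 1.51, 2.38)
)

# .groups = "drop" returns an ungrouped tibble and silences the message.
totals <- toy %>%
  group_by(region, year) %>%
  summarise(GDP = sum(gdp_tn, na.rm = TRUE), .groups = "drop")

is_grouped_df(totals)
```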
In this plot, I used the grouped_data variable with year on the x-axis, GDP on the y-axis, and fill by region. I used geom_area to draw a stacked area chart and scale_fill_brewer to color it.
ggplot(data = grouped_data, aes(x = year, y = GDP, fill = region)) +
  geom_area(color = "white") +
  scale_fill_brewer(palette = "Set2") +
  labs(x = "Year", y = "GDP (trillions)", title = "GDP by Region") +
  theme_minimal()
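Because geom_area stacks the regions by default, the top edge of the chart in any year is the sum of the regions' GDP for that year. The sketch below uses a made-up two-region table (`toy` and `stack_top` are hypothetical names, values rounded) to spell out the stacking and compute those yearly totals.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data; GDP values are rounded approximations in trillions.
toy <- tibble(
  region = rep(c("East Asia & Pacific", "North America"), each = 2),
  year   = rep(c(1990, 1991), times = 2),
  GDP    = c(2.38, 2.54, 5.98, 6.17)
)

# position = "stack" is the default for geom_area(); written out here
# only to make the stacking explicit.
p <- ggplot(toy, aes(x = year, y = GDP, fill = region)) +
  geom_area(position = "stack", color = "white")

# The height of the stacked top edge in each year is the sum over regions.
stack_top <- toy %>%
  group_by(year) %>%
  summarise(total = sum(GDP), .groups = "drop")

stack_top
```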