Nations Chart Assignment

Author

Nadia Omer

Loading the Dataset

library(tidyverse)

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.2.0     ✔ readr     2.1.6
✔ forcats   1.0.1     ✔ stringr   1.6.0
✔ ggplot2   4.0.2     ✔ tibble    3.3.1
✔ lubridate 1.9.5     ✔ tidyr     1.3.2
✔ purrr     1.2.1     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

# Point R at the course folder, then read the nations data
setwd("C:/Users/Administrator/OneDrive - montgomerycollege.edu/DATA 110")
nations <- read_csv("nations.csv")

Rows: 5275 Columns: 10
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (5): iso2c, iso3c, country, region, income
dbl (5): year, gdp_percap, population, birth_rate, neonat_mortal_rate

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
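
The column-specification message above can be avoided by declaring the column types up front. Here is a minimal sketch, assuming a small inline sample in place of nations.csv. The two data rows and the names `csv_text` and `toy` are made up for illustration; only the column names come from the output above.

```r
library(readr)

# Hypothetical two-row sample standing in for nations.csv
csv_text <- "iso2c,country,year,gdp_percap,population
AA,Country A,2000,1000,5000000
BB,Country B,2000,20000,1000000"

# Declare the types explicitly so read_csv() has nothing to guess or report
toy <- read_csv(
  I(csv_text),  # I() marks the string as literal data rather than a file path
  col_types = cols(
    iso2c = col_character(),
    country = col_character(),
    year = col_double(),
    gdp_percap = col_double(),
    population = col_double()
  )
)
```

With the real file, the same `col_types` argument would go in the `read_csv("nations.csv", ...)` call. Setting `show_col_types = FALSE` is the shorter option when guessed types are acceptable.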

Creating a New Variable

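# GDP in trillions of dollars: per-capita GDP times population, divided by 10^12
nations <- nations |>
  mutate(GDP = (gdp_percap * population) / 10^12)

# Keep only the rows where both inputs to GDP are present
nations_nona <- nations |>
  filter(!is.na(gdp_percap) & !is.na(population))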

head(nations_nona)

# A tibble: 6 × 11
  iso2c iso3c country   year gdp_percap population birth_rate neonat_mortal_rate
  <chr> <chr> <chr>    <dbl>      <dbl>      <dbl>      <dbl>              <dbl>
1 AE    ARE   United …  1991     73037.    1913190       24.6                7.9
2 AE    ARE   United …  1993     71960.    2127863       22.4                7.3
3 AE    ARE   United …  2001     83534.    3217865       15.8                5.5
4 AE    ARE   United …  1992     73154.    2019014       23.5                7.6
5 AE    ARE   United …  1994     74684.    2238281       21.3                6.9
6 AE    ARE   United …  2007     75427.    6010100       12.8                4.7
# ℹ 3 more variables: region <chr>, income <chr>, GDP <dbl>
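
To see what the GDP column holds, here is a small worked example. It is a sketch on made-up numbers (the `toy` tibble and its values are hypothetical, not rows from nations.csv). A country with a per-capita GDP of $50,000 and 2 million people has a total GDP of $100 billion, which is 0.1 on the trillion-dollar scale.

```r
library(dplyr)
library(tibble)

# Hypothetical values chosen so the arithmetic is easy to check by hand
toy <- tibble(
  country = c("A", "B", "C"),
  gdp_percap = c(50000, NA, 1000),
  population = c(2e6, 3e6, 1e9)
)

# Same formula as above: total GDP in trillions of dollars
toy <- toy |>
  mutate(GDP = (gdp_percap * population) / 10^12)

# A missing input makes GDP missing too, so those rows are filtered out
toy_nona <- toy |>
  filter(!is.na(gdp_percap) & !is.na(population))
```

Country B has no per-capita GDP, so its GDP is `NA` and it is dropped from `toy_nona`.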

Plot Base

cool_countries <- nations |>
  filter(country %in% c("United States", "Japan", "Sudan", "Ethiopia"))

plot_base <- ggplot(cool_countries, aes(x = year, y = GDP, color = country))
plot_base # I used what we learned this week to make a shared base for my plots
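
Printing plot_base on its own draws only the axes and an empty panel, because no geom has been added yet. The sketch below shows the same pattern on made-up data (the `toy` tibble and the `p_*` names are hypothetical). One base object can be reused with different layers.

```r
library(ggplot2)
library(tibble)

# Hypothetical data: two countries, three years each
toy <- tibble(
  year = rep(c(2000, 2005, 2010), times = 2),
  GDP = c(1, 2, 3, 0.5, 0.7, 0.9),
  country = rep(c("A", "B"), each = 3)
)

# Base: data plus aesthetic mappings, no layers
p_base <- ggplot(toy, aes(x = year, y = GDP, color = country))

# Each of these adds a layer to a copy; p_base itself stays unchanged
p_lines <- p_base + geom_line()
p_points <- p_base + geom_point()
```

Because `+` returns a new plot object, the base can be kept around and combined with whichever geoms a later plot needs.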

Plot 1

plot1 <- plot_base + 
  geom_point() +
  geom_line() +
  scale_color_brewer(palette = "Set1") +
  labs(x = "Year",
       y = "GDP ($ Trillion)",
       title = "GDP of Countries of Interest")
plot1

I chose these countries because each one has a personal connection to me: Sudan because it is where I am from, the United States because it is where I live, Japan because I studied Japanese for four years in high school, and Ethiopia because my best friend is from there. I was most interested in countries that have connected me to other people. The graph shows that, while Ethiopia and Sudan have GDPs similar to each other, both are far lower than those of Japan and the United States. This makes me curious to find out why.

Grouping Countries into Regions

# Total GDP per region and year; .groups = "drop" returns an ungrouped tibble
wb_region <- nations_nona |>
  group_by(region, year) |>
  summarise(GDP = sum(GDP, na.rm = TRUE), .groups = "drop")
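
The grouping step collapses the country rows into one row per region and year. A minimal sketch on made-up values (the `toy` tibble and its region names are hypothetical) shows the effect. Passing `.groups = "drop"` returns an ungrouped tibble, which also keeps `summarise()` from printing a note about regrouping.

```r
library(dplyr)
library(tibble)

# Hypothetical country-level rows: two regions, one missing GDP value
toy <- tibble(
  region = c("North", "North", "South", "South", "South"),
  year = c(2000, 2000, 2000, 2000, 2001),
  GDP = c(1.0, 2.5, 0.5, NA, 0.75)
)

# One row per region-year pair; na.rm = TRUE skips the missing value
toy_region <- toy |>
  group_by(region, year) |>
  summarise(GDP = sum(GDP, na.rm = TRUE), .groups = "drop")
```

The five input rows become three: North in 2000, South in 2000, and South in 2001.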

Plot 2

plot2 <- ggplot(wb_region, aes(year, GDP, fill = region)) +
  # linewidth replaces size for line widths as of ggplot2 3.4.0
  geom_area(color = "white", linewidth = 0.2) +
  labs(x = "Year",
       y = "GDP ($ Trillion)",
       title = "GDP by World Bank Region",
       fill = "Region")

plot2