Nations

Author

ZHageman

Importing Tidyverse

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(dplyr)
library(ggplot2)

Importing the Dataset

nations <- read_csv("nations.csv")
Rows: 5275 Columns: 10
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (5): iso2c, iso3c, country, region, income
dbl (5): year, gdp_percap, population, birth_rate, neonat_mortal_rate

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Mutating the Data

Here, I mutated the data to add each nation's total GDP in trillions of dollars: GDP per capita multiplied by population, divided by 1e12.

nations <- nations |>
  mutate(gdp = (gdp_percap * population) / 1e12)
nations
# A tibble: 5,275 × 11
   iso2c iso3c country  year gdp_percap population birth_rate neonat_mortal_rate
   <chr> <chr> <chr>   <dbl>      <dbl>      <dbl>      <dbl>              <dbl>
 1 AD    AND   Andorra  1996         NA      64291       10.9                2.8
 2 AD    AND   Andorra  1994         NA      62707       10.9                3.2
 3 AD    AND   Andorra  2003         NA      74783       10.3                2  
 4 AD    AND   Andorra  1990         NA      54511       11.9                4.3
 5 AD    AND   Andorra  2009         NA      85474        9.9                1.7
 6 AD    AND   Andorra  2011         NA      82326       NA                  1.6
 7 AD    AND   Andorra  2004         NA      78337       10.9                2  
 8 AD    AND   Andorra  2010         NA      84419        9.8                1.7
 9 AD    AND   Andorra  2001         NA      67770       11.8                2.1
10 AD    AND   Andorra  2002         NA      71046       11.2                2.1
# ℹ 5,265 more rows
# ℹ 3 more variables: region <chr>, income <chr>, gdp <dbl>
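
As a quick check of the arithmetic, here is a minimal sketch on a made-up two-row tibble. The name `toy` and every value in it are hypothetical and not taken from nations.csv. It shows that GDP per capita times population, divided by 1e12, gives total GDP in trillions, and that a missing gdp_percap carries through as NA (which is why the Andorra rows above have no GDP).

```r
library(dplyr)
library(tibble)

# Hypothetical values for illustration only; not from nations.csv
toy <- tibble(
  country    = c("A", "B"),
  gdp_percap = c(50000, NA),
  population = c(2e8, 64291)
)

# Same calculation as in the report: total GDP in trillions
toy <- toy |>
  mutate(gdp = (gdp_percap * population) / 1e12)

toy$gdp
```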

Plot 1

GDP Growth for Arab Nations

Filtering the nations

arabnations_filtered <- nations |>
  filter(country %in% c("Egypt, Arab Rep.", "Morocco", "Saudi Arabia", "West Bank and Gaza")) 
# this allows me to focus only on the countries I'm interested in
plot1 <- ggplot(arabnations_filtered, aes(x = year, y = gdp, color = country)) +
  geom_line() + 
  geom_point() + 
  scale_color_brewer(palette = "Set1") +
  labs(title = "Economic Growth Over Time: Arab Nations",
       x = "Year",
       y = "GDP in Trillions (USD)", 
       caption = "Source: Nations Dataset") + 
  theme_minimal() + 
  theme(axis.text.x = element_text(angle = 20))
plot1
Warning: Removed 4 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 4 rows containing missing values or values outside the scale range
(`geom_point()`).
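
These warnings most likely come from rows where gdp is NA, for example because gdp_percap was missing for that country and year. A minimal sketch, on hypothetical data (the `toy` tibble is invented for illustration), of counting those rows and dropping them before plotting so that ggplot2 has nothing to remove:

```r
library(dplyr)
library(tibble)

# Hypothetical data for illustration; the real values come from nations.csv
toy <- tibble(
  country = c("A", "A", "B", "B"),
  year    = c(1990, 1991, 1990, 1991),
  gdp     = c(0.10, NA, 0.25, 0.30)
)

# Count the rows that would be dropped from the plot
n_missing <- sum(is.na(toy$gdp))

# Keep only rows that have a GDP value
toy_complete <- toy |>
  filter(!is.na(gdp))
```

Dropping the rows does not change what the plot shows; it only makes the removal explicit.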

Including the United States

arabnations_filtered <- nations |>
  filter(country %in% c("United States", "Egypt, Arab Rep.", "Morocco", "Saudi Arabia", "West Bank and Gaza")) 
# same as the code above, but adding the United States for perspective
plot1.2 <- ggplot(arabnations_filtered, aes(x = year, y = gdp, color = country)) +
  geom_line() + 
  geom_point() + 
  scale_color_brewer(palette = "Set1") +
  labs(title = "GDP Growth over Time: Arab Nations Compared to USA",
       x = "Year",
       y = "GDP in Trillions (USD)", 
       caption = "Source: Nations Dataset") + 
  theme_minimal() + 
  theme(axis.text.x = element_text(angle = 20))
plot1.2
Warning: Removed 4 rows containing missing values or values outside the scale range
(`geom_line()`).
Warning: Removed 4 rows containing missing values or values outside the scale range
(`geom_point()`).

Plot 2

nations_summarized <- nations |>
  group_by(region, year) |> summarize(GDP = sum(gdp, na.rm = TRUE))
`summarise()` has grouped output by 'region'. You can override using the
`.groups` argument.
plot2 <- ggplot(nations_summarized, aes(x = year, y = GDP, fill = region)) +
  geom_area(color = "white") +
  scale_fill_brewer(palette = "Set2") +
  labs(title = "GDP by Region Over Time",
       x = "Year",
       y = "GDP in Trillions (USD)") +
  theme_classic()
plot2
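
The `summarise()` message above appears because the result is still grouped by region after summarizing. A small sketch, on hypothetical numbers, of passing `.groups = "drop"` so the result comes back ungrouped and the message is not printed:

```r
library(dplyr)
library(tibble)

# Hypothetical data for illustration; not from nations.csv
toy <- tibble(
  region = c("X", "X", "Y", "Y"),
  year   = c(1990, 1990, 1990, 1990),
  gdp    = c(1, 2, NA, 4)
)

# Sum GDP per region and year, ignoring missing values,
# and return an ungrouped tibble
toy_summarized <- toy |>
  group_by(region, year) |>
  summarize(GDP = sum(gdp, na.rm = TRUE), .groups = "drop")
```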

Citations

Sources I used for inspiration to complete this code:

Some of my peers' earlier code from the flights exercises, for the dplyr mutations

https://stackoverflow.com/questions/48489219/why-does-summarize-drop-a-group

The Stack Overflow question above was also used.

I also looked back at previous classes to see how to set the x-axis text at an angle.

I was also getting a warning when filtering, which said:

"Warning: There was 1 warning in `filter()`. ℹ In argument: `country == c("Egypt, Arab Rep.", "Morocco", "Saudi Arabia", "West Bank and Gaza")`. Caused by warning in `country == c("Egypt, Arab Rep.", "Morocco", "Saudi Arabia", "West Bank and Gaza")`: ! longer object length is not a multiple of shorter object length"

I asked ChatGPT, and it said to use %in% rather than ==, so I used that.
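
The reason is that `==` compares element by element and recycles the shorter vector, so it does not test membership, while `%in%` does. A minimal sketch with hypothetical `country` and `wanted` vectors (both invented for illustration):

```r
# Hypothetical vectors for illustration
country <- c("Morocco", "Egypt, Arab Rep.", "Morocco",
             "Saudi Arabia", "Morocco", "France")
wanted  <- c("Egypt, Arab Rep.", "Morocco")

# == recycles `wanted` along `country`, pairing each country with
# whichever element of `wanted` happens to line up with it
by_equals <- country == wanted

# %in% asks, for each country, whether it appears anywhere in `wanted`
by_in <- country %in% wanted
```

Here `by_equals` is FALSE everywhere even though four of the six entries are wanted countries, and R gives no warning because 6 is a multiple of 2. The warning quoted above only appears when the longer length is not a multiple of the shorter one, so `==` can also return wrong results silently.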

I also did some general Googling, but I looked at many sites and don't have all of the links saved. I will make sure to save all links next time.