Nations Dataset

Author

Hannnah Le

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(plotly)

Attaching package: 'plotly'
The following object is masked from 'package:ggplot2':

    last_plot
The following object is masked from 'package:stats':

    filter
The following object is masked from 'package:graphics':

    layout
library(ggfortify)
library(dplyr)
library(ggplot2)
library(RColorBrewer)
setwd("~/Desktop/Data 110")
nations <- read_csv("nations.csv")
Rows: 5275 Columns: 10
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (5): iso2c, iso3c, country, region, income
dbl (5): year, gdp_percap, population, birth_rate, neonat_mortal_rate

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

GDP of each country in trillions of dollars (GDP per capita × population ÷ 10^12)

nations <- nations %>%
  mutate(gdp_trillions = gdp_percap * population / 10^12)
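
To make the arithmetic concrete, here is a minimal sketch on a hypothetical two-row data frame. The country labels and values are invented for illustration and do not come from nations.csv; only the formula matches the code above.

```r
library(dplyr)

# Hypothetical toy rows (assumed values, not taken from nations.csv)
toy <- data.frame(
  country    = c("A", "B"),
  gdp_percap = c(50000, 10000),  # dollars per person
  population = c(3e8, 1e9)       # number of people
)

# Total GDP = GDP per capita * population; dividing by 10^12 converts
# dollars to trillions of dollars
toy <- toy %>%
  mutate(gdp_trillions = gdp_percap * population / 10^12)

print(toy$gdp_trillions)
```

If either gdp_percap or population is NA for a row, gdp_trillions is NA for that row as well, which is why the later sum uses na.rm = TRUE.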

Reference for the mutate() and group_by() code:

https://youtu.be/GM3u9POqhjU?si=nweIt1H1YfVO0qTs

Filter to the four countries of interest

filtered_data <- nations %>%
  filter(country %in% c("China", "Germany", "Japan", "United States"))
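
Because %in% matches names exactly, a misspelled or differently formatted country name is dropped without any message. A possible check, sketched on a hypothetical toy data frame (the names wanted, filtered_toy, and not_found are illustrative and not part of the original analysis):

```r
library(dplyr)

# Hypothetical stand-in for the nations data (assumed rows)
toy <- data.frame(
  country = c("China", "Germany", "Japan", "United States", "France"),
  year    = 2014
)

wanted <- c("China", "Germany", "Japan", "United States")

filtered_toy <- toy %>%
  filter(country %in% wanted)

# Requested names that matched no rows; an empty result means every
# name was found
not_found <- setdiff(wanted, unique(filtered_toy$country))
print(not_found)
```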

Plot 1: GDP of the four countries over time

ggplot(filtered_data, aes(x = year, y = gdp_trillions, color = country)) +
  geom_point() +
  geom_line() +
  scale_color_brewer(palette = "Set1") +
  labs(title = "China's Rise to Become the Largest Economy", 
       x = "Year", 
       y = "GDP ($ trillion)") +
  theme_minimal()

Total GDP by region and year

region_data <- nations %>%
  group_by(region, year) %>%
  # .groups = "drop" returns an ungrouped result and silences the grouping message
  summarise(sum_GDP = sum(gdp_trillions, na.rm = TRUE), .groups = "drop")
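
A small sketch of how the grouped sum behaves, using a hypothetical toy data frame (the regions and values are invented). It assumes `.groups = "drop"` is the desired behavior, so the result comes back ungrouped.

```r
library(dplyr)

# Hypothetical toy rows (assumed values); one GDP value is missing
toy <- data.frame(
  region        = c("X", "X", "Y", "Y"),
  year          = c(2000, 2000, 2000, 2000),
  gdp_trillions = c(1, 2, NA, 4)
)

toy_sum <- toy %>%
  group_by(region, year) %>%
  # na.rm = TRUE skips the missing value instead of returning NA for region Y
  summarise(sum_GDP = sum(gdp_trillions, na.rm = TRUE), .groups = "drop")

print(toy_sum)
```

Note that a group whose values are all NA sums to 0 with na.rm = TRUE, so missing data can appear in the plot as zero GDP.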

Plot 2: GDP by region over time

ggplot(region_data, aes(x = year, y = sum_GDP, fill = region)) +
  geom_area(color = "white", linewidth = 0.1) +  # Thin white line around each area
  scale_fill_brewer(palette = "Set2") +
  labs(title = "GDP by World Bank Region",
       x = "Year",
       y = "GDP ($ trillion)") +
  theme_minimal()