Nations Dataset Charts Assignment

Author

Balemlay

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(RColorBrewer)
setwd("C:/Users/ebale/OneDrive/Desktop/DATA110")
nations <- read_csv("nations.csv")
Rows: 5275 Columns: 10
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (5): iso2c, iso3c, country, region, income
dbl (5): year, gdp_percap, population, birth_rate, neonat_mortal_rate

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Cleaning the data and calculating GDP in trillions of dollars

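clean <- nations |>
  # Drop rows with missing values in the columns used below.
  # (na.omit() ignores extra arguments and drops rows with an NA in any column.)
  drop_na(country, gdp_percap, population, region) |>
  # Total GDP = GDP per capita * population, expressed in trillions of dollars
  mutate(gdp_trillion = (gdp_percap * population) / 10^12)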
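The cleaning and GDP calculation can be checked on a small example. The sketch below uses a hypothetical toy tibble (the values are made up, not taken from nations.csv) to show how `na.omit()` on a whole data frame differs from `tidyr::drop_na()` with named columns, and how per-capita GDP times population converts to trillions.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data for illustration only (not from nations.csv)
toy <- tibble(
  country    = c("A", "B", "C"),
  gdp_percap = c(50000, 5000, NA),
  population = c(2e8, 1e9, 5e7),
  birth_rate = c(12, NA, 20)
)

# na.omit() drops every row that has an NA in any column
all_cols <- na.omit(toy)

# drop_na() with column names only checks those columns
some_cols <- toy |>
  drop_na(gdp_percap, population) |>
  mutate(gdp_trillion = (gdp_percap * population) / 10^12)

nrow(all_cols)          # rows left when any NA removes a row
nrow(some_cols)         # rows left when only the named columns are checked
some_cols$gdp_trillion  # GDP in trillions for the remaining rows
```

In this toy case country B survives `drop_na()` even though its `birth_rate` is missing, because that column is not needed for the GDP calculation.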

Filtering data for four countries

filtering <- clean |>
  filter(country %in% c("China", "Germany", "Japan", "United States"))
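Chained `==` comparisons joined with `|` and a single `%in%` test should select the same rows. A minimal sketch on a hypothetical toy tibble (the rows and the `wanted` vector name are illustrative assumptions):

```r
library(dplyr)

# Hypothetical toy data for illustration only
toy <- tibble(
  country = c("China", "France", "Japan", "Brazil"),
  year    = c(2000, 2000, 2001, 2001)
)

wanted <- c("China", "Germany", "Japan", "United States")

# One comparison per country, joined with |
with_or <- toy |>
  filter(country == "China" | country == "Germany" |
           country == "Japan" | country == "United States")

# A single membership test against a vector of names
with_in <- toy |>
  filter(country %in% wanted)

identical(with_or$country, with_in$country)
```

The `%in%` form is usually easier to extend, since adding a country only means adding a name to the vector.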

Plot 1

p1 <- ggplot(filtering, aes(x = year, y = gdp_trillion, color = country)) +
  geom_point() +
  geom_line() +
  scale_color_brewer(palette = "Set1") + ## use the Set1 colorbrewer palette
  labs(title = "China's Rise to Become the Largest Economy",
       x = "Year",
       y = "GDP ($ trillion)") +
  theme_minimal(base_size = 12) +
  theme(plot.title = element_text(hjust = 0.5))  # Centers the title

p1

Grouping and summarising the dataset

gdp_area1 <- clean |>
  group_by(region, year) |>
  # Total GDP per region and year; .groups = "drop" returns an ungrouped tibble
  summarise(GDP = sum(gdp_trillion, na.rm = TRUE), .groups = "drop")
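The grouping step can be sketched on toy data to show what the `.groups` argument of `summarise()` controls. The tibble below is hypothetical; `"drop_last"` is the behaviour dplyr applies when `.groups` is left unset for a summary with one row per group.

```r
library(dplyr)

# Hypothetical toy data for illustration only
toy <- tibble(
  region       = c("East", "East", "East", "West"),
  year         = c(2000, 2000, 2001, 2000),
  gdp_trillion = c(1, 2, 4, 3)
)

# Result stays grouped by region (the last grouping variable is dropped)
still_grouped <- toy |>
  group_by(region, year) |>
  summarise(GDP = sum(gdp_trillion), .groups = "drop_last")

# Result is fully ungrouped
ungrouped <- toy |>
  group_by(region, year) |>
  summarise(GDP = sum(gdp_trillion), .groups = "drop")

is_grouped_df(still_grouped)
is_grouped_df(ungrouped)
ungrouped$GDP  # one total per region-year combination
```

Either result plots the same way; dropping the groups mainly avoids surprises if the summary is later passed to other dplyr verbs.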

Plot 2

P2 <- ggplot(gdp_area1, aes(x = year, y = GDP, fill = region)) +
  geom_area(color = "white", linewidth = 0.2) +  # Add a thin white line around the areas (linewidth replaces size as of ggplot2 3.4.0)
  scale_fill_brewer(palette = "Set2") +          # Use the Set2 palette to color regions
  labs(title = "GDP by World Bank Region",
       x = "Year",
       y = "GDP ($ trillion)") +
  theme_minimal(base_size = 12)
P2