Here, I load all of the libraries I’ll need to complete this assignment.

library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.4.4     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.0
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
# ggplot2 and dplyr are already attached by library(tidyverse) above
library(RColorBrewer)

I set my working directory and had R read the “nations” CSV file. After loading the dataset, I looked through it and explored all of the variables before moving on.

setwd("/Users/aashkanavale/Desktop/Montgomery College/MC Spring '24/DATA110/data sets")
nations <- read_csv("nations.csv")
## Rows: 5275 Columns: 10
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (5): iso2c, iso3c, country, region, income
## dbl (5): year, gdp_percap, population, birth_rate, neonat_mortal_rate
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Here, I used mutate() to add a new column, GDP, to the dataset. GDP stands for Gross Domestic Product; I calculated it for each country by multiplying GDP per capita by population and dividing by 10^12, so the values are in trillions of dollars. Then I used filter() to keep the four countries I wanted in the chart. I chose the four biggest countries by population: India, China, the United States, and Indonesia, in that order.

nations_ds <- nations |>
  # Refer to columns by bare name so mutate() uses the piped data, not a separate copy
  mutate(GDP = (gdp_percap * population) / 10^12) |>
  filter(country %in% c("India", "China", "United States", "Indonesia"))
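
To check that the calculation does what I expect, here is a minimal sketch of the same mutate-then-filter steps on a tiny made-up table. The data frame `toy`, its country names, and its numbers are all hypothetical and are not taken from nations.csv.

```r
library(dplyr)

# Hypothetical toy data (made-up values, not from nations.csv)
toy <- data.frame(
  country    = c("A", "B", "C"),
  gdp_percap = c(1000, 20000, 50000),
  population = c(1e9, 5e7, 3e8)
)

toy_gdp <- toy |>
  # GDP per capita times population, rescaled to trillions
  mutate(GDP = (gdp_percap * population) / 10^12) |>
  # Keep only the rows for the countries of interest
  filter(country %in% c("A", "C"))

toy_gdp
```

Because the columns are written as bare names, mutate() always works on whatever data was piped in, which keeps the calculation correct even if a filter is moved ahead of it.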

This is where I actually create the chart. At first I left out the “color = country” piece and my graph looked wrong** because the points from all four countries were joined into a single line. As soon as I added it, each country got its own line and color and the graph came together.

nationschart <- ggplot(nations_ds, aes(x = year, y = GDP, color = country)) +
  geom_point() +
  geom_line() +
  scale_color_brewer(palette = "Set1") +
  labs(title = "Top 4 Biggest Countries' (by population) \n GDP Over the Years",
       x = "Year",
       y = "GDP (in $trillions)",
       color = "Country")
nationschart

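I ran mutate() again on the full dataset because the first time, the GDP column was created in the same pipeline as the filter, so nations_ds only covered the regions of the four selected countries (3 of the 7). I then grouped by region and year and summed GDP within each group, using na.rm = TRUE so that missing values are skipped.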

nationschart2 <- nations |>
  # Bare column names refer to the piped data
  mutate(GDP = (gdp_percap * population) / 10^12) |>
  group_by(region, year) |>
  summarize(GDP = sum(GDP, na.rm = TRUE))
## `summarise()` has grouped output by 'region'. You can override using the
## `.groups` argument.
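
The message above is only a notice that the result is still grouped by region. As a small sketch of the same group-and-sum step, the example below uses a made-up data frame (`toy`, with hypothetical regions and values) to show how na.rm = TRUE handles a missing value and how the `.groups` argument can return an ungrouped result.

```r
library(dplyr)

# Hypothetical toy data; the NA stands in for a country with no GDP figure
toy <- data.frame(
  region = c("East", "East", "East", "West", "West"),
  year   = c(2000, 2000, 2001, 2000, 2001),
  GDP    = c(1.5, NA, 2.0, 0.5, 1.0)
)

toy_totals <- toy |>
  group_by(region, year) |>
  # na.rm = TRUE skips the NA instead of making the whole group NA;
  # .groups = "drop" returns ungrouped output and silences the message
  summarize(GDP = sum(GDP, na.rm = TRUE), .groups = "drop")

toy_totals
```

Without na.rm = TRUE, the East/2000 total would come back as NA, which would leave a gap in the area chart.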

This part was also pretty simple. I piped the summarized data into ggplot and the area chart came out exactly as I needed.

nationschart2 |>
  ggplot(aes(x = year, y = GDP, fill = region)) +
  geom_area() +
  scale_fill_brewer(palette = "Set2") +
  labs(title = "GDP by World Bank Region",
       x = "Year",
       y = "GDP (in $trillions)",
       fill = "Region")

** This is what my chart looked like at first, without the “color = country” piece. It had me stuck for a little while.

nationschart <- ggplot(nations_ds, aes(x = year, y = GDP)) +
  geom_point() +
  geom_line() +
  scale_color_brewer(palette = "Set1") +
  labs(title = "Top 4 Biggest Countries' (by population) \n GDP Over the Years",
       x = "Year",
       y = "GDP (in $trillions)",
       color = "Country")
nationschart
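
To see why the chart changes, here is a rough sketch on made-up data (the data frame `toy` and its values are hypothetical). It builds the plot both ways and uses ggplot2's layer_data() to count how many separate groups geom_line() ends up drawing.

```r
library(ggplot2)

# Hypothetical toy data: two countries, three years each
toy <- data.frame(
  year    = rep(2000:2002, times = 2),
  country = rep(c("A", "B"), each = 3),
  GDP     = c(1, 2, 3, 5, 6, 7)
)

# No color mapping: all six rows belong to one group, so one zigzag line
p_single <- ggplot(toy, aes(x = year, y = GDP)) +
  geom_line()

# color = country also splits the data into groups: one line per country
p_by_country <- ggplot(toy, aes(x = year, y = GDP, color = country)) +
  geom_line()

# Count the distinct groups ggplot2 computed for each plot
n_groups_single  <- length(unique(layer_data(p_single)$group))
n_groups_country <- length(unique(layer_data(p_by_country)$group))
```

Mapping a discrete variable such as country to color is what tells ggplot2 to treat each country as its own series. Using aes(group = country) would separate the lines in the same way, just without the colors.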