HW Week 6

Author

Emilio Sanchez San Martin

## Loading Data

nations <- read.csv("~/Downloads/Data 101 and Data 110 class/Data 110/Data Sets/nations.csv")
library(tidyverse)

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.4.4     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.0
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

p1) For both charts, you will first need to create a new variable in the data, using mutate from dplyr, giving the GDP of each country in trillions of dollars, by multiplying gdp_percap by population and dividing by a trillion.

p1 <- nations |>
  mutate(gdp = gdp_percap * population / 10^12)
head(p1)

  iso2c iso3c country year gdp_percap population birth_rate neonat_mortal_rate
1    AD   AND Andorra 1996         NA      64291       10.9                2.8
2    AD   AND Andorra 1994         NA      62707       10.9                3.2
3    AD   AND Andorra 2003         NA      74783       10.3                2.0
4    AD   AND Andorra 1990         NA      54511       11.9                4.3
5    AD   AND Andorra 2009         NA      85474        9.9                1.7
6    AD   AND Andorra 2011         NA      82326         NA                1.6
                 region      income gdp
1 Europe & Central Asia High income  NA
2 Europe & Central Asia High income  NA
3 Europe & Central Asia High income  NA
4 Europe & Central Asia High income  NA
5 Europe & Central Asia High income  NA
6 Europe & Central Asia High income  NA
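
The NA values in the gdp column above come from the missing gdp_percap values for Andorra, since any arithmetic involving NA returns NA. Here is a minimal sketch of the same calculation on made-up numbers; the `toy` data frame and its values are hypothetical and are not taken from nations.csv.

```r
library(dplyr)

# Hypothetical toy data: one country with a missing GDP per capita, one without.
toy <- tibble(
  country    = c("A", "B"),
  gdp_percap = c(NA, 3000),
  population = c(64291, 8e6)
)

# Same formula as p1: GDP per capita times population, divided by a trillion.
toy_gdp <- toy |>
  mutate(gdp = gdp_percap * population / 10^12)

toy_gdp$gdp
```

Country A keeps an NA because its gdp_percap is missing, while country B gets 3000 * 8,000,000 / 10^12 = 0.024 trillion dollars.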

p2) For the first chart, you will need to filter the data with dplyr for the four desired countries. When making the chart with ggplot2 you will need to add both geom_point and geom_line layers, and use the Set1 ColorBrewer palette using: scale_color_brewer(palette = "Set1").

p2 <- p1 |>
  filter(country %in% c("Chile", "Bolivia", "Dominican Republic", "El Salvador")) # I chose these countries because they are all Spanish-speaking; I'm Chilean and Bolivian, and I know so many people from the Dominican Republic and El Salvador, haha!
head(p2)

  iso2c iso3c country year gdp_percap population birth_rate neonat_mortal_rate
1    BO   BOL Bolivia 1996   3119.282    7717445     32.873               35.6
2    BO   BOL Bolivia 1993   2726.411    7273824     34.126               39.0
3    BO   BOL Bolivia 2002   3647.572    8653343     29.796               28.4
4    BO   BOL Bolivia 2008   4986.849    9599916     26.510               24.4
5    BO   BOL Bolivia 2009   5108.847    9758799     25.991               23.4
6    BO   BOL Bolivia 2001   3570.055    8496378     30.356               29.0
                     region              income        gdp
1 Latin America & Caribbean Lower middle income 0.02407288
2 Latin America & Caribbean Lower middle income 0.01983143
3 Latin America & Caribbean Lower middle income 0.03156369
4 Latin America & Caribbean Lower middle income 0.04787333
5 Latin America & Caribbean Lower middle income 0.04985621
6 Latin America & Caribbean Lower middle income 0.03033254
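
The filter above uses %in% rather than == to match several countries at once. Here is a small sketch of why that matters, using a hypothetical `toy` data frame and `wanted` vector that are not part of the assignment data.

```r
library(dplyr)

# Hypothetical toy data: four rows, two countries.
toy <- tibble(
  country = c("Chile", "Bolivia", "Chile", "Bolivia"),
  year    = c(1990, 1990, 1991, 1991)
)
wanted <- c("Bolivia", "Chile")

# `==` recycles `wanted` down the column and compares position by position,
# so with this row order every comparison is FALSE and no rows survive.
rows_eq <- toy |> filter(country == wanted)

# `%in%` tests each row against the whole set, so all four rows are kept.
rows_in <- toy |> filter(country %in% wanted)

nrow(rows_eq)
nrow(rows_in)
```

With ==, whether a row is kept depends on where it happens to sit in the data, which is why %in% is the safer choice for matching a set of values.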

p3) For the second chart, using dplyr you will need to group_by region and year, and then summarize on your mutated value for gdp using summarise(GDP = sum(gdp, na.rm = TRUE)). (There will be null values, or NAs, in this data, so you will need to use na.rm = TRUE).

p3 <- p1 |>
  group_by(region, year) |>
  summarise(GDP = sum(gdp, na.rm = TRUE))

`summarise()` has grouped output by 'region'. You can override using the
`.groups` argument.

head(p3)

# A tibble: 6 × 3
# Groups:   region [1]
  region               year   GDP
  <chr>               <int> <dbl>
1 East Asia & Pacific  1990  5.52
2 East Asia & Pacific  1991  6.03
3 East Asia & Pacific  1992  6.50
4 East Asia & Pacific  1993  7.04
5 East Asia & Pacific  1994  7.64
6 East Asia & Pacific  1995  8.29
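
The message printed after summarise() says the result is still grouped by region. Here is a minimal sketch, on hypothetical toy data, of how the `.groups = "drop"` argument returns an ungrouped result without that message, and of what na.rm = TRUE changes.

```r
library(dplyr)

# Hypothetical toy data: two regions in one year, with one missing gdp value.
toy <- tibble(
  region = c("X", "X", "Y", "Y"),
  year   = c(1990, 1990, 1990, 1990),
  gdp    = c(1, NA, 2, 3)
)

# na.rm = TRUE skips the NA; .groups = "drop" removes the leftover grouping.
with_na_rm <- toy |>
  group_by(region, year) |>
  summarise(GDP = sum(gdp, na.rm = TRUE), .groups = "drop")

# Without na.rm = TRUE, any group containing an NA sums to NA.
without_na_rm <- toy |>
  group_by(region, year) |>
  summarise(GDP = sum(gdp), .groups = "drop")

with_na_rm
without_na_rm
```

Region X sums to 1 when the NA is removed and to NA when it is not. The values are the same with or without `.groups = "drop"`; only the grouping attached to the result changes.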

Draw both charts with ggplot2. Each region’s area will be generated by the command geom_area() in ggplot2, and you will need to use the Set2 ColorBrewer palette using scale_fill_brewer(palette = "Set2").

p2 |>
  ggplot(aes(x = year, y = gdp, color = country)) +
  geom_point() +
  geom_line() +
  labs(
    title = "Chile's Rise to Become the Largest Economy",
    subtitle = "By: Emilio S",
    x = "Year",
    y = "GDP ($ Trillions)"
  ) +
  scale_color_brewer(palette = "Set1") +
  theme_minimal()

p3 |>
  ggplot(aes(x = year, y = GDP, fill = region)) +
  geom_area(color = "white", linewidth = 0.4) +
  labs(
    title = "GDP by World Bank Region",
    subtitle = "By: Emilio S",
    x = "Year",
    y = "GDP ($ Trillions)"
  ) +
  scale_fill_brewer(palette = "Set2") +
  theme_minimal()