── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.2.0 ✔ readr 2.2.0
✔ forcats 1.0.1 ✔ stringr 1.6.0
✔ ggplot2 4.0.2 ✔ tibble 3.3.1
✔ lubridate 1.9.5 ✔ tidyr 1.3.2
✔ purrr 1.2.1
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(dplyr)
library(ggplot2)
Read the data
setwd("~/Downloads/First data 110 assignment_files")
nations <- read_csv("nations.csv")
Rows: 5275 Columns: 10
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (5): iso2c, iso3c, country, region, income
dbl (5): year, gdp_percap, population, birth_rate, neonat_mortal_rate
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
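The message above suggests declaring the column types or passing `show_col_types = FALSE`. A minimal sketch of both options follows, assuming readr 2.x (which accepts literal CSV text wrapped in `I()`). The two rows and the country names "A" and "B" are made up for illustration and are not taken from nations.csv.

```r
library(readr)

# Made-up CSV text standing in for a file on disk (not the real nations.csv)
csv_text <- I("country,year,gdp_percap\nA,1996,NA\nB,1996,25000\n")

# Declaring the column types up front makes the import explicit and
# suppresses the column-specification message
toy <- read_csv(
  csv_text,
  col_types = cols(
    country = col_character(),
    year = col_double(),
    gdp_percap = col_double()
  )
)

# Alternative: let readr guess the types but keep the console quiet
toy_quiet <- read_csv(csv_text, show_col_types = FALSE)

print(toy)
```

Spelling out `col_types` also guards against a column being guessed differently if the file changes later.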
View the data
head(nations)
# A tibble: 6 × 10
iso2c iso3c country year gdp_percap population birth_rate neonat_mortal_rate
<chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 AD AND Andorra 1996 NA 64291 10.9 2.8
2 AD AND Andorra 1994 NA 62707 10.9 3.2
3 AD AND Andorra 2003 NA 74783 10.3 2
4 AD AND Andorra 1990 NA 54511 11.9 4.3
5 AD AND Andorra 2009 NA 85474 9.9 1.7
6 AD AND Andorra 2011 NA 82326 NA 1.6
# ℹ 2 more variables: region <chr>, income <chr>
GDP variable
nations <- nations |>
  mutate(gdp = gdp_percap * population / 10^12)
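To see what this calculation does, here is a small worked sketch on made-up numbers (the tibble `toy` and its values are hypothetical, not rows from the dataset). GDP per capita times population gives total GDP in dollars, and dividing by 10^12 converts it to trillions. A missing `gdp_percap`, as in the Andorra rows shown earlier, carries through as `NA`.

```r
library(dplyr)

# Hypothetical values chosen so the arithmetic is easy to follow
toy <- tibble(
  country = c("A", "B", "C"),
  gdp_percap = c(50000, 2500, NA),
  population = c(40e6, 1.2e9, 80000)
)

toy <- toy |>
  mutate(gdp = gdp_percap * population / 10^12)

# A: 50000 * 40e6  = 2e12 dollars -> 2 trillion
# B: 2500  * 1.2e9 = 3e12 dollars -> 3 trillion
# C: gdp_percap is NA, so gdp is NA as well
print(toy)
```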
Plot 1: Dot-and-line chart of GDP for selected countries
p1 <- nations |>
  filter(country %in% c("Cote d'Ivoire", "Canada", "Spain", "Thailand")) |>
  ggplot(aes(x = year, y = gdp, color = country)) +
  geom_point() +
  geom_line() +
  # No `labels` argument here: ggplot2 sorts the countries alphabetically, so
  # positional labels given in another order would swap Canada and Cote d'Ivoire.
  scale_color_brewer(palette = "Dark2", name = "country") +
  labs(title = "GDP of Some Countries", caption = "Nation Dataset", x = "Year", y = "GDP ($ trillion)") +
  theme_minimal()
p1
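One detail worth checking in the legend: for a character column, ggplot2 orders the categories alphabetically, and an unnamed `labels` vector passed to a discrete scale is matched to them by position. The sketch below shows that order for the four countries used here. The name `legend_order` is introduced only for this illustration.

```r
# The four countries in the order they were typed
countries <- c("Cote d'Ivoire", "Canada", "Spain", "Thailand")

# Converting to a factor sorts the unique values, which is the same order
# a discrete colour scale uses for a character column
legend_order <- levels(factor(countries))
print(legend_order)
```

Because "Canada" sorts before "Cote d'Ivoire", any positional labels need to follow this sorted order. Leaving `labels` unset lets the scale use the data values themselves.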
Plot 2: Area chart of GDP by World Bank region
p2 <- nations |>
  group_by(region, year) |>
  summarise(gdp = sum(gdp, na.rm = TRUE)) |>
  ggplot(aes(x = year, y = gdp, fill = region)) +
  geom_area() +
  scale_fill_brewer(palette = "Set1", name = "region") +
  labs(title = "GDP by World Bank Region", caption = "Nation Dataset", x = "Year", y = "GDP ($ trillion)") +
  theme_minimal()
`summarise()` has regrouped the output.
ℹ Summaries were computed grouped by region and year.
ℹ Output is grouped by region.
ℹ Use `summarise(.groups = "drop_last")` to silence this message.
ℹ Use `summarise(.by = c(region, year))` for per-operation grouping
(`?dplyr::dplyr_by`) instead.
p2
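The aggregation behind this chart can be sketched on a tiny made-up table (the regions "X" and "Y" and all values are hypothetical). `sum(gdp, na.rm = TRUE)` skips missing values within each region-year group, and passing `.groups = "drop"` is one way to return an ungrouped result and avoid the regrouping message.

```r
library(dplyr)

# Hypothetical rows: two countries in region X for 2000 (one with missing GDP),
# one row for X in 2001, and one row for region Y in 2000
toy <- tibble(
  region = c("X", "X", "X", "Y"),
  year = c(2000, 2000, 2001, 2000),
  gdp = c(1, NA, 2, 3)
)

by_region <- toy |>
  group_by(region, year) |>
  summarise(gdp = sum(gdp, na.rm = TRUE), .groups = "drop")

# One row per region-year combination; the NA is ignored in the X/2000 sum
print(by_region)
```

Without `na.rm = TRUE`, a single country with missing GDP would turn the whole region-year total into `NA`.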
Brief Essay
For the countries, I chose one from each of four continents. I chose Côte d’Ivoire to represent Africa because it is my country. I selected Canada to represent the Americas because I have a lot of friends there. For Europe, I chose Spain because it is a country that I really like. Finally, I chose Thailand to represent Asia because it is one of the countries I would like to visit.