library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.2 ✔ readr 2.1.4
## ✔ forcats 1.0.0 ✔ stringr 1.5.0
## ✔ ggplot2 3.4.3 ✔ tibble 3.2.1
## ✔ lubridate 1.9.2 ✔ tidyr 1.3.0
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(ggfortify)
library(plotly)
##
## Attaching package: 'plotly'
##
## The following object is masked from 'package:ggplot2':
##
## last_plot
##
## The following object is masked from 'package:stats':
##
## filter
##
## The following object is masked from 'package:graphics':
##
## layout
# loading data set
nations <- read.csv("C:/Users/tianm/Documents/DATA110/data/nations.csv")
# Add a gdp column: total GDP in trillions of dollars (GDP per capita x population / 1e12)
nations_GDP <- nations |>
  mutate(gdp = gdp_percap * population / 1e12)
# Rank countries by their largest single-year GDP and keep the top four names
top_countries <- nations_GDP |>
  arrange(desc(gdp)) |>
  distinct(country) |>
  pull(country) |>
  head(4)
top_countries
## [1] "China" "United States" "India" "Japan"
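The `gdp` column is GDP per capita multiplied by population, rescaled from dollars to trillions of dollars, and the top four are the countries whose largest single-year value ranks highest. A minimal sketch of that arithmetic and ranking on a toy table follows; the country names and figures are hypothetical and are not taken from nations.csv.

```r
library(dplyr)

# Hypothetical toy data, NOT values from nations.csv
toy <- tibble(
  country    = c("A", "A", "B", "B", "C"),
  year       = c(1990, 2014, 1990, 2014, 2014),
  gdp_percap = c(1000, 5000, 20000, 40000, 2000),
  population = c(1e9, 1.2e9, 2.5e8, 3e8, 1e7)
)

# Dollars per person x people = dollars; dividing by 1e12 gives trillions
toy_gdp <- toy |>
  mutate(gdp = gdp_percap * population / 1e12)

# Sort all country-year rows by GDP, then keep each country's first appearance
toy_top <- toy_gdp |>
  arrange(desc(gdp)) |>
  distinct(country) |>
  pull(country) |>
  head(2)

toy_top
```

Because the rows are sorted before duplicates are dropped, each country is ranked by its best year rather than by an average or by the latest year. Ranking on a single year would need a `filter(year == ...)` step first.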
# Keep the four countries with the highest GDP and plot GDP by year (points and lines), 1990 - 2014
nations_top <- nations_GDP |>
  filter(country %in% top_countries)
plot_top <- nations_top |>
  ggplot(aes(x = year, y = gdp, color = country)) +
  geom_point() +
  geom_line() +
  scale_color_brewer(palette = "Set1") +
  theme_bw() +
  theme(panel.border = element_blank(),
        legend.title = element_blank(),
        legend.key = element_rect(fill = "white"),
        plot.title = element_text(hjust = 0.5)) +
  labs(x = "Year",
       y = "GDP ($ Trillions)",
       title = "Top 4 Countries with the highest GDPs from 1990 - 2014")
plot_top
China: China’s GDP started at just above zero (in trillions of dollars) in 1990 and rose gradually until around 2000. Growth then steepened, and by 2014 China had overtaken every other country, with a GDP of almost $15 trillion.
India: India’s GDP grew slowly but steadily from 1990, when it started slightly higher than China’s. By 2014 it was approximately $5 trillion, above Japan but below the US.
Japan: Japan’s GDP rose steadily from 1990, when it was marginally higher than India’s. After 2010 it appears to level off and dip slightly, ending 2014 at just under $5 trillion.
United States: With a GDP of just under $5 trillion, the US was the largest of the four economies in 1990. Its growth was slower than China’s but stable and continuous. By 2014 the US ranked second, behind China, at just over $10 trillion.
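The figures above are read off the chart by eye. One way to check them is to tabulate each country's first and last observed year. The sketch below assumes a data frame shaped like `nations_top` (columns `country`, `year`, `gdp`); it uses a small hypothetical stand-in with made-up values so that it runs on its own.

```r
library(dplyr)

# Hypothetical stand-in for nations_top; swap in the real data frame
toy_top <- tibble(
  country = rep(c("A", "B"), each = 3),
  year    = rep(c(1990, 2000, 2014), times = 2),
  gdp     = c(1, 3, 15, 5, 8, 11)
)

endpoints <- toy_top |>
  group_by(country) |>
  summarise(
    gdp_start = gdp[which.min(year)],  # GDP in the earliest year
    gdp_end   = gdp[which.max(year)],  # GDP in the latest year
    .groups   = "drop"
  ) |>
  mutate(change = gdp_end - gdp_start) |>
  arrange(desc(gdp_end))

endpoints
```

Sorting by `gdp_end` gives the final-year ranking directly, and `change` shows which country grew the most over the period.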
# Creating an area plot of gdp by year and regions of the world
nations_region <- nations_GDP |>
  group_by(region, year) |>
  summarise(GDP = sum(gdp, na.rm = TRUE), .groups = "drop")
plot_region <- ggplot(nations_region, aes(x = year, y = GDP, fill = region)) +
  geom_area(color = "white", linewidth = 0.5) +
  scale_fill_brewer(palette = "Set2") +
  theme_minimal() +
  labs(title = "GDP trend by region, 1990 - 2014",
       x = "Year",
       y = "GDP ($ Trillions)")
plot_region
With the largest GDP throughout the period and notable growth, East Asia & Pacific dominates the chart. Europe & Central Asia ranks second in GDP, followed by North America in later years. South Asia and Sub-Saharan Africa have much lower GDPs than the top regions, but both continue to rise.
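In a stacked area chart the order of the bands can be hard to judge, so the ranking described above is worth confirming numerically. The sketch below ranks regions within each year and computes each region's share of the yearly total. It assumes a table shaped like `nations_region` (columns `region`, `year`, `GDP`); the region labels and values here are hypothetical.

```r
library(dplyr)

# Hypothetical stand-in for nations_region; swap in the real data frame
toy_region <- tibble(
  region = rep(c("R1", "R2", "R3"), times = 2),
  year   = rep(c(1990, 2014), each = 3),
  GDP    = c(4, 6, 1, 20, 15, 3)
)

region_rank <- toy_region |>
  group_by(year) |>
  mutate(
    gdp_rank = min_rank(desc(GDP)),  # 1 = largest GDP in that year
    share    = GDP / sum(GDP)        # fraction of that year's total
  ) |>
  ungroup() |>
  arrange(year, gdp_rank)

region_rank
```

Filtering the result to `gdp_rank <= 3` for the first and last years would show whether the top regions changed places over the period.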