Using the given data set “Nations”, we are going to plot charts using ggplot2.
First, we load our libraries and read in our dataset.
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.5.1 ✔ tibble 3.2.1
✔ lubridate 1.9.4 ✔ tidyr 1.3.1
✔ purrr 1.0.4
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(plotly)
Attaching package: 'plotly'
The following object is masked from 'package:ggplot2':
last_plot
The following object is masked from 'package:stats':
filter
The following object is masked from 'package:graphics':
layout
library(ggfortify)
library(dplyr)
library(GGally)
Registered S3 method overwritten by 'GGally':
method from
+.gg ggplot2
Rows: 5275 Columns: 10
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (5): iso2c, iso3c, country, region, income
dbl (5): year, gdp_percap, population, birth_rate, neonat_mortal_rate
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
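The column specification above is the message that readr's read_csv() prints, but the loading call itself is not shown. With the real file it would look something like `Nations <- read_csv("nations.csv")`, where the file name is an assumption. The sketch below reproduces the step on a temporary stand-in file so that it runs anywhere; the `region` and `income` values are placeholders, not real data.

```r
library(readr)

# Stand-in for the real file: a temporary CSV with the same ten column names.
# The first eight values in each row are copied from the head(Nations) output;
# the region and income values are placeholders (assumptions).
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "iso2c,iso3c,country,year,gdp_percap,population,birth_rate,neonat_mortal_rate,region,income",
  "AD,AND,Andorra,1996,NA,64291,10.9,2.8,placeholder region,placeholder income",
  "AD,AND,Andorra,1994,NA,62707,10.9,3.2,placeholder region,placeholder income"
), csv_path)

# show_col_types = FALSE quiets the column-specification message.
# gdp_percap is forced to double because it is all NA in this tiny sample.
Nations_sample <- read_csv(
  csv_path,
  col_types = cols(gdp_percap = col_double()),
  show_col_types = FALSE
)

dim(Nations_sample)
```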
Looking in my environment, I see my dataset and know I have 5,275 observations and 10 variables to work with.
Let's look at the first few lines.
head(Nations)
# A tibble: 6 × 10
iso2c iso3c country year gdp_percap population birth_rate neonat_mortal_rate
<chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 AD AND Andorra 1996 NA 64291 10.9 2.8
2 AD AND Andorra 1994 NA 62707 10.9 3.2
3 AD AND Andorra 2003 NA 74783 10.3 2
4 AD AND Andorra 1990 NA 54511 11.9 4.3
5 AD AND Andorra 2009 NA 85474 9.9 1.7
6 AD AND Andorra 2011 NA 82326 NA 1.6
# ℹ 2 more variables: region <chr>, income <chr>
Now we want to create a new variable, “GDP”, by taking the variable “gdp_percap”, multiplying it by “population”, then dividing by one trillion.
# A tibble: 6 × 11
iso2c iso3c country year gdp_percap population birth_rate neonat_mortal_rate
<chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 AD AND Andorra 1996 NA 64291 10.9 2.8
2 AD AND Andorra 1994 NA 62707 10.9 3.2
3 AD AND Andorra 2003 NA 74783 10.3 2
4 AD AND Andorra 1990 NA 54511 11.9 4.3
5 AD AND Andorra 2009 NA 85474 9.9 1.7
6 AD AND Andorra 2011 NA 82326 NA 1.6
# ℹ 3 more variables: region <chr>, income <chr>, GDP <dbl>
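The GDP calculation described above can be sketched with dplyr's mutate(). The toy rows and the names `toy` and `toy_gdp` are hypothetical, chosen only to make the arithmetic easy to follow; they are not taken from the Nations data.

```r
library(dplyr)

# Toy rows with made-up values (not from the Nations data)
toy <- tibble(
  country    = c("A", "B", "C"),
  gdp_percap = c(50000, 10000, NA),
  population = c(300e6, 1e9, 64291)
)

# GDP in trillions of dollars: per-capita GDP times population, divided by 1e12
toy_gdp <- toy %>%
  mutate(GDP = gdp_percap * population / 1e12)

toy_gdp
```

A missing `gdp_percap` propagates to a missing `GDP`, which would explain why the Andorra rows shown above cannot have a GDP value.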
Next, we create a new data frame, filtering for our four countries, which include China, Germany, and the United States.
# A tibble: 100 × 3
GDP year country
<dbl> <dbl> <chr>
1 1.47 1992 China
2 6.59 2005 China
3 3.68 2000 China
4 1.26 1991 China
5 16.6 2013 China
6 3.32 1999 China
7 18.1 2014 China
8 5.07 2003 China
9 5.73 2004 China
10 1.71 1993 China
# ℹ 90 more rows
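A minimal sketch of this filtering step, assuming filter() with `%in%` followed by select() to keep the three columns shown above. The toy values are illustrative, and the vector lists only the three countries the text names; the fourth would be added in the same way.

```r
library(dplyr)

# Toy rows with illustrative values (not the real Nations data)
toy <- tibble(
  country = c("China", "Germany", "United States", "Andorra"),
  year    = c(2014, 2014, 2014, 2011),
  GDP     = c(18.1, 3.5, 17.0, NA)
)

# Countries to keep; only the three named in the text are listed here
countries_of_interest <- c("China", "Germany", "United States")

toy_df <- toy %>%
  filter(country %in% countries_of_interest) %>%  # keep matching rows only
  select(GDP, year, country)                      # same column order as the output above

toy_df
```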
Now, we plot out the first chart using geom_point and geom_line.
Plot_1 <- ggplot(Nations_df, aes(x = year, y = GDP, color = country)) +
  labs(title = "China's Rise to Become the Largest Economy",
       x = "year",
       y = "GDP ($ trillion)") +
  theme_minimal(base_size = 12) +
  geom_line() +
  geom_point() +
  scale_color_brewer(palette = "Set1")
Plot_1
Next, as per the instructions, we use the group_by function provided by the dplyr library to summarise GDP by region and year.
# A tibble: 175 × 3
region year GDP
<chr> <dbl> <dbl>
1 East Asia & Pacific 1990 5.52
2 East Asia & Pacific 1991 6.03
3 East Asia & Pacific 1992 6.50
4 East Asia & Pacific 1993 7.04
5 East Asia & Pacific 1994 7.64
6 East Asia & Pacific 1995 8.29
7 East Asia & Pacific 1996 8.96
8 East Asia & Pacific 1997 9.55
9 East Asia & Pacific 1998 9.60
10 East Asia & Pacific 1999 10.1
# ℹ 165 more rows
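The grouping step can be sketched as group_by() followed by summarise(). Summing with `na.rm = TRUE` is an assumption here, since the original call is not shown, and the toy values are made up.

```r
library(dplyr)

# Toy rows with made-up values; one GDP is missing on purpose
toy <- tibble(
  region = c("East Asia & Pacific", "East Asia & Pacific",
             "Europe & Central Asia", "Europe & Central Asia"),
  year   = c(1990, 1990, 1990, 1990),
  GDP    = c(2, 3.5, NA, 1)
)

# One row per region-year pair, with GDP summed across countries.
# na.rm = TRUE drops missing values instead of returning NA for the group.
toy_regions <- toy %>%
  group_by(region, year) %>%
  summarise(GDP = sum(GDP, na.rm = TRUE), .groups = "drop")

toy_regions
```

Without `na.rm = TRUE`, a single country with a missing GDP would make the whole region's total `NA` for that year.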
We plot the second chart, making it interactive using ggplotly.
Plot_2 <- ggplot(Regions_df, aes(x = year, y = GDP, fill = region)) +
  geom_area(color = "white", size = 0.2) +
  labs(title = "Global GDP Growth by World Bank Region",
       x = "Year",
       y = "Total GDP ($ Trillion)") +
  scale_fill_brewer(palette = "Set2") +
  theme_minimal(base_size = 12)
Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
ℹ Please use `linewidth` instead.
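The warning can be avoided by passing `linewidth` instead of `size` to geom_area(), as the message suggests. The sketch below does that on a toy data frame and then wraps the result in ggplotly() to make it interactive. The names `toy_regions`, `toy_plot`, and `toy_widget` and all values are hypothetical.

```r
library(ggplot2)
library(plotly)

# Toy data with made-up values (not the real regional totals)
toy_regions <- data.frame(
  region = rep(c("Region A", "Region B"), each = 3),
  year   = rep(c(1990, 1991, 1992), times = 2),
  GDP    = c(1, 2, 3, 2, 2.5, 3)
)

# linewidth replaces the deprecated size argument for line outlines (ggplot2 >= 3.4.0)
toy_plot <- ggplot(toy_regions, aes(x = year, y = GDP, fill = region)) +
  geom_area(color = "white", linewidth = 0.2) +
  scale_fill_brewer(palette = "Set2") +
  theme_minimal(base_size = 12)

# Convert the static ggplot into an interactive plotly widget
toy_widget <- ggplotly(toy_plot)
toy_widget
```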