Nations Dataset Charts Assignments
Visualizing Data with the Nations Dataset
Charts based on the World Bank’s GDP dataset were created using the following procedures.
Chart 1
Load required packages
The required packages, ‘tidyverse’ and ‘RColorBrewer’, have been loaded.
# load required packages
library(tidyverse)
library(RColorBrewer)
Load and process nations dataset
The dataset has been loaded using the ‘read_csv’ command.
# load the dataset
nations <- read_csv("~/Documents/1 - College/DATA 110/2 - Dataset/Nation/nations.csv")
A new variable has been created in the dataset using the ‘mutate’ function from ‘dplyr’. This variable represents the GDP of each country in trillions of dollars, calculated by multiplying GDP per capita by the population and dividing by one trillion.
# create a new dataset with a new column showing GDP in trillions of dollars
nations_nv <- mutate(nations, gdp_tn = gdp_percap * population / 1e12)
The dataset has been checked using the head function, which displays the first six rows.
# check the dataset
head(nations_nv)
# A tibble: 6 × 11
iso2c iso3c country year gdp_percap population birth_rate neonat_mortal_rate
<chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 AD AND Andorra 1996 NA 64291 10.9 2.8
2 AD AND Andorra 1994 NA 62707 10.9 3.2
3 AD AND Andorra 2003 NA 74783 10.3 2
4 AD AND Andorra 1990 NA 54511 11.9 4.3
5 AD AND Andorra 2009 NA 85474 9.9 1.7
6 AD AND Andorra 2011 NA 82326 NA 1.6
# ℹ 3 more variables: region <chr>, income <chr>, gdp_tn <dbl>
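The gdp_tn calculation can be sanity-checked on a tiny hand-made table. The sketch below is illustrative only: `toy_nations` and its values are made up, not taken from nations.csv. It also shows that a missing gdp_percap, as in the Andorra rows above, propagates to an NA in gdp_tn.

```r
library(dplyr)

# Hypothetical two-row table; the values are invented for illustration
toy_nations <- data.frame(
  country    = c("Country A", "Country B"),
  gdp_percap = c(10000, NA),        # dollars per person; NA mimics a missing value
  population = c(200000000, 64291)
)

# Same formula as in the report: GDP per capita times population, in trillions
toy_nv <- mutate(toy_nations, gdp_tn = gdp_percap * population / 1e12)

# Country A: 10000 * 200000000 / 1e12 = 2 trillion; Country B stays NA
print(toy_nv$gdp_tn)
```

Rows with an NA in gdp_tn are the reason `na.rm = TRUE` matters when GDP is later summed across countries.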
Prepare the dataset
A new dataset has been created by selecting four Southeast Asian nations: Malaysia, Indonesia, Thailand, and Singapore, to analyze their economic progress over a span of 24 years.
# filter the data for the desired four nations
ASEAN4 <- nations_nv %>%
  filter(iso3c == "MYS" | iso3c == "IDN" | iso3c == "THA" | iso3c == "SGP") %>%
  arrange(year)
I used the summary function to check for NA values in the dataset; none were found.
# summarize the dataset to check for NA values
summary(ASEAN4)
iso2c iso3c country year
Length:100 Length:100 Length:100 Min. :1990
Class :character Class :character Class :character 1st Qu.:1996
Mode :character Mode :character Mode :character Median :2002
Mean :2002
3rd Qu.:2008
Max. :2014
gdp_percap population birth_rate neonat_mortal_rate
Min. : 2894 Min. : 3047132 Min. : 9.30 Min. : 1.10
1st Qu.: 6911 1st Qu.: 15025754 1st Qu.:12.75 1st Qu.: 4.00
Median :11626 Median : 43242410 Median :17.09 Median : 7.75
Mean :19733 Mean : 77316923 Mean :17.57 Mean :10.29
3rd Qu.:23874 3rd Qu.: 96153690 3rd Qu.:21.49 3rd Qu.:16.45
Max. :83689 Max. :254454778 Max. :28.22 Max. :30.30
region income gdp_tn
Length:100 Length:100 Min. :0.06755
Class :character Class :character 1st Qu.:0.26028
Mode :character Mode :character Median :0.44776
Mean :0.63286
3rd Qu.:0.82798
Max. :2.68531
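As a complement to reading the summary output, missing values can be counted per column directly. The following is a minimal sketch on a made-up table (`toy`, `wanted`, and all values are hypothetical, not from nations.csv); it also shows `%in%` as a more compact alternative to chaining `==` comparisons with `|`.

```r
library(dplyr)

# Hypothetical data: four ASEAN codes plus one extra row with a missing value
toy <- data.frame(
  iso3c      = c("MYS", "IDN", "THA", "SGP", "AND"),
  year       = c(1990, 1990, 1990, 1990, 1990),
  gdp_percap = c(6000, 2900, 4300, 34000, NA)   # invented numbers
)

# Keep only the rows whose code is in the wanted set
wanted <- c("MYS", "IDN", "THA", "SGP")
toy_asean4 <- toy %>%
  filter(iso3c %in% wanted) %>%
  arrange(year)

# Count NA values in each column; all zeros means nothing is missing
na_counts <- colSums(is.na(toy_asean4))
print(na_counts)
```

Running `colSums(is.na(ASEAN4))` on the real dataset would give the same kind of per-column count.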
Make the chart
I created Chart-1 using ggplot with the geom_line and geom_point functions and scale_color_brewer(palette = “Set1”). The x-axis represents years, and the y-axis represents GDP in trillions of dollars, illustrating the growth of the selected ASEAN economies.
# create the chart using ggplot with geom_line and geom_point
ASEAN4_chart <- ggplot(ASEAN4, aes(x = year, y = gdp_tn, color = country)) +
  # line thickness; `linewidth` replaces `size` for lines, deprecated in ggplot2 3.4.0
  geom_line(linewidth = 0.5) +
  geom_point(alpha = 0.9, size = 3, pch = 18) + # use a different point shape via the pch code
  scale_color_brewer(name = "Selected Nations", palette = "Set1") +
  ylim(0, 3) +
  labs(
    x = "Year",
    y = "GDP ($ trillion)",
    title = "ASEAN's Economic Growth: A Rising Tide in Selected Nations",
    caption = "Source: World Bank",
    color = "Selected Nations"
  ) +
  theme_minimal(base_size = 9) +
  theme(
    legend.position = "right",
    legend.title = element_text(size = 10),
    legend.text = element_text(size = 8),
    plot.title = element_text(hjust = 0.6, face = "bold", margin = margin(b = 10, t = 10)),
    plot.caption = element_text(face = "italic")
  )
# print the chart
print(ASEAN4_chart)
Chart 2
A new dataset has been created by grouping the data by year and region and summing GDP (in trillions of dollars) within each group, ignoring missing values.
# create a new data set
regions <- nations_nv %>%
  group_by(year, region) %>%
  summarize(gdp_tn = sum(gdp_tn, na.rm = TRUE)) %>%
  arrange(year, region)
`summarise()` has grouped output by 'year'. You can override using the `.groups` argument.
I used the head function to display the first ten rows of the dataset.
head(regions, 10)
# A tibble: 10 × 3
# Groups: year [2]
year region gdp_tn
<dbl> <chr> <dbl>
1 1990 East Asia & Pacific 5.52
2 1990 Europe & Central Asia 9.36
3 1990 Latin America & Caribbean 2.40
4 1990 Middle East & North Africa 1.66
5 1990 North America 6.54
6 1990 South Asia 1.35
7 1990 Sub-Saharan Africa 0.787
8 1991 East Asia & Pacific 6.03
9 1991 Europe & Central Asia 9.71
10 1991 Latin America & Caribbean 2.55
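The "# Groups: year" line in the output above reflects that summarize drops only the last grouping variable, which is also what the `.groups` message refers to. A minimal sketch of one way to return an ungrouped result, using a made-up table (`toy` and its values are hypothetical):

```r
library(dplyr)

# Hypothetical country-level rows; one NA mimics a missing GDP value
toy <- data.frame(
  year   = c(1990, 1990, 1990, 1991, 1991, 1991),
  region = c("A", "A", "B", "A", "B", "B"),
  gdp_tn = c(1.0, NA, 2.0, 1.5, 2.5, 0.5)
)

toy_regions <- toy %>%
  group_by(year, region) %>%
  # .groups = "drop" returns an ungrouped result and silences the message
  summarize(gdp_tn = sum(gdp_tn, na.rm = TRUE), .groups = "drop") %>%
  arrange(year, region)

print(toy_regions)
print(is_grouped_df(toy_regions))  # FALSE: no grouping remains
```

Whether to drop the grouping is a design choice; for plotting with geom_area either form should work, but an ungrouped table avoids surprises in later dplyr steps.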
I created Chart-2 using ‘ggplot’ with geom_area and scale_fill_brewer(palette = “Set2”). The x-axis represents years, and the y-axis represents GDP in trillions of dollars, illustrating the growth of regions worldwide.
# create the area chart
Region_chart <- ggplot(regions, aes(x = year, y = gdp_tn, fill = region)) +
  geom_area() +
  scale_fill_brewer(name = "Regions", palette = "Set2") +
  labs(
    x = "Year",
    y = "GDP (trillion USD)",
    title = "Trends in Regional GDP (1990 - 2014)",
    caption = "Source: World Bank"
  ) +
  scale_x_continuous(
    breaks = seq(min(regions$year), max(regions$year), by = 4),
    limits = c(min(regions$year), 2015)
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(face = "bold", size = 12, hjust = 0.5, margin = margin(t = 10, b = 10)),
    axis.title.x = element_text(size = 9),
    axis.title.y = element_text(size = 9),
    plot.caption = element_text(face = "italic") # makes the caption italic
  )
# print the chart
print(Region_chart)
Conclusion
From 1990 to 2014, Chart-1 shows that Indonesia recorded the largest increase in total GDP among the four selected ASEAN countries, followed by Thailand, Malaysia, and Singapore. Because the chart plots total GDP (GDP per capita multiplied by population), Singapore’s lower position largely reflects its much smaller population rather than weaker economic performance.
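Regarding regional GDP growth, as depicted in Chart-2, the East Asia and Pacific region appears to have led globally over the period. The chart stacks the regions in legend order (East Asia and Pacific, Europe and Central Asia, Latin America and the Caribbean, the Middle East and North Africa, North America, South Asia, and Sub-Saharan Africa), which is alphabetical and should not be read as a ranking: in 1990, for example, Europe and Central Asia (9.36 trillion dollars) and North America (6.54 trillion dollars) both had a larger GDP than East Asia and Pacific (5.52 trillion dollars). These findings underscore the varied economic trajectories and regional dynamics within the global economy during this period.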
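A ranking of regions by growth can be checked numerically rather than read off the chart. The sketch below is only an illustration: it uses the six 1990 and 1991 values shown in the head(regions, 10) output, so it compares a single year of change, and the names `regions_sample` and `growth` are hypothetical. Applying the same steps to the full `regions` table would compare 1990 with 2014.

```r
library(dplyr)

# Values copied from the head(regions, 10) output (trillions of dollars)
regions_sample <- data.frame(
  year   = c(1990, 1990, 1990, 1991, 1991, 1991),
  region = rep(c("East Asia & Pacific", "Europe & Central Asia",
                 "Latin America & Caribbean"), times = 2),
  gdp_tn = c(5.52, 9.36, 2.40, 6.03, 9.71, 2.55)
)

# For each region, compare GDP in the first and last available year
growth <- regions_sample %>%
  group_by(region) %>%
  summarize(
    gdp_start  = gdp_tn[which.min(year)],
    gdp_end    = gdp_tn[which.max(year)],
    abs_change = gdp_end - gdp_start,            # change in trillions of dollars
    pct_change = 100 * abs_change / gdp_start,   # change relative to the start
    .groups = "drop"
  ) %>%
  arrange(desc(abs_change))                      # largest absolute increase first

print(growth)
```

Absolute and percentage change can rank regions differently, so it helps to state which one a conclusion relies on.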