library(tidyverse)
setwd("C:/Users/olait/OneDrive/Desktop/data 110")
nations <- read_csv("nations.csv")
The code above loads the required library and reads the data from the working directory.
Next, make the column headers lowercase and replace spaces with underscores.
names(nations)<-gsub(" ","_",tolower(names(nations)))
head(nations)
# A tibble: 6 × 10
iso2c iso3c country year gdp_percap population birth_rate neonat_mortal_rate
<chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 AD AND Andorra 1996 NA 64291 10.9 2.8
2 AD AND Andorra 1994 NA 62707 10.9 3.2
3 AD AND Andorra 2003 NA 74783 10.3 2
4 AD AND Andorra 1990 NA 54511 11.9 4.3
5 AD AND Andorra 2009 NA 85474 9.9 1.7
6 AD AND Andorra 2011 NA 82326 NA 1.6
# ℹ 2 more variables: region <chr>, income <chr>
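The header clean-up used above can be tried on a few made-up column names. This is only a minimal sketch: the names in raw_names are hypothetical and are not taken from nations.csv.

```r
# Hypothetical raw headers, for illustration only
raw_names <- c("Country Name", "GDP Percap", "Birth Rate")

# tolower() lowercases each name; gsub() then replaces every space with "_"
clean_names <- gsub(" ", "_", tolower(raw_names))
print(clean_names)
```

Calling tolower() first means mixed-case headers end up in one consistent style before the spaces are replaced.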
Check the structure of the data.
str(nations)
spc_tbl_ [5,275 × 10] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
$ iso2c : chr [1:5275] "AD" "AD" "AD" "AD" ...
$ iso3c : chr [1:5275] "AND" "AND" "AND" "AND" ...
$ country : chr [1:5275] "Andorra" "Andorra" "Andorra" "Andorra" ...
$ year : num [1:5275] 1996 1994 2003 1990 2009 ...
$ gdp_percap : num [1:5275] NA NA NA NA NA NA NA NA NA NA ...
$ population : num [1:5275] 64291 62707 74783 54511 85474 ...
$ birth_rate : num [1:5275] 10.9 10.9 10.3 11.9 9.9 NA 10.9 9.8 11.8 11.2 ...
$ neonat_mortal_rate: num [1:5275] 2.8 3.2 2 4.3 1.7 1.6 2 1.7 2.1 2.1 ...
$ region : chr [1:5275] "Europe & Central Asia" "Europe & Central Asia" "Europe & Central Asia" "Europe & Central Asia" ...
$ income : chr [1:5275] "High income" "High income" "High income" "High income" ...
- attr(*, "spec")=
.. cols(
.. iso2c = col_character(),
.. iso3c = col_character(),
.. country = col_character(),
.. year = col_double(),
.. gdp_percap = col_double(),
.. population = col_double(),
.. birth_rate = col_double(),
.. neonat_mortal_rate = col_double(),
.. region = col_character(),
.. income = col_character()
.. )
- attr(*, "problems")=<externalptr>
Summary of the nations data
summary(nations)
iso2c iso3c country year
Length:5275 Length:5275 Length:5275 Min. :1990
Class :character Class :character Class :character 1st Qu.:1996
Mode :character Mode :character Mode :character Median :2002
Mean :2002
3rd Qu.:2008
Max. :2014
gdp_percap population birth_rate neonat_mortal_rate
Min. : 239.7 Min. :9.004e+03 Min. : 6.90 Min. : 0.70
1st Qu.: 2263.6 1st Qu.:7.175e+05 1st Qu.:13.40 1st Qu.: 6.70
Median : 6563.2 Median :5.303e+06 Median :21.60 Median :15.00
Mean : 12788.8 Mean :2.958e+07 Mean :24.16 Mean :19.40
3rd Qu.: 17195.0 3rd Qu.:1.757e+07 3rd Qu.:33.88 3rd Qu.:29.48
Max. :141968.1 Max. :1.364e+09 Max. :55.12 Max. :73.10
NA's :766 NA's :14 NA's :295 NA's :525
region income
Length:5275 Length:5275
Class :character Class :character
Mode :character Mode :character
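The summary reports missing values (NA's) in several numeric columns. A quick way to count them per column is sketched below on a toy data frame; the values are made up for illustration, and in practice nations would be passed instead of toy.

```r
# Toy data frame with made-up values, for illustration only
toy <- data.frame(
  gdp_percap = c(1000, NA, 2500, NA),
  population = c(10, 20, NA, 40)
)

# is.na() gives a logical matrix; colSums() counts the TRUE values per column
na_counts <- colSums(is.na(toy))
print(na_counts)
```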
Use the mutate function to create a new variable, gdp, that holds each country's total GDP in trillions of dollars.
nations1 <- nations |>
  mutate(gdp = (gdp_percap * population) / 1e12)
Select four nations from the BRICS bloc (Brazil, Russia, India, and China) and compare them with Nigeria.
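As a rough check of the total-GDP calculation in the mutate step above, the same arithmetic can be done by hand for one row. The sketch below uses the Brazil 2002 values as they appear in the printed output further down; gdp_percap is shown rounded there, so the result is approximate.

```r
# Values copied from the Brazil 2002 row of the printed output;
# gdp_percap is displayed rounded, so this is only an approximation
gdp_percap <- 9468
population <- 181045592

# Same formula as in mutate(): per-capita GDP times population, in trillions
gdp_trillions <- (gdp_percap * population) / 1e12
print(round(gdp_trillions, 2))
```

The result is about 1.71, that is, roughly 1.71 trillion dollars for Brazil in 2002.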
In the 1960s there were suggestions that certain under-developed countries would become somewhat developed. Since the creation of the BRICS bloc, my goal has been to see how Nigeria competes with the selected countries.
While population plays a significant role for China and India, the populations of Brazil and Nigeria are somewhat comparable, hence the comparison.
selected_countries <- c("Brazil", "Russia", "India", "China", "Nigeria")
Filter the data frame for the selected countries.
selected_nations <- nations1 |>
  filter(country %in% selected_countries)
print(selected_nations)
# A tibble: 100 × 11
iso2c iso3c country year gdp_percap population birth_rate neonat_mortal_rate
<chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 BR BRA Brazil 2002 9468. 181045592 20.0 13.9
2 BR BRA Brazil 1995 8029. 162755054 22.0 20
3 BR BRA Brazil 1996 8227. 165303155 21.8 19.3
4 BR BRA Brazil 1993 7221. 157812220 22.6 21.5
5 BR BRA Brazil 1994 7649. 160260508 22.2 20.7
6 BR BRA Brazil 2001 9182. 178419396 20.5 14.9
7 BR BRA Brazil 1998 8507. 170516482 21.4 17.8
8 BR BRA Brazil 2007 12390. 192784521 16.8 10.4
9 BR BRA Brazil 2004 10325. 186116363 18.7 12.2
10 BR BRA Brazil 1990 6622. 150393143 24.3 24.3
# ℹ 90 more rows
# ℹ 3 more variables: region <chr>, income <chr>, gdp <dbl>
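The printed tibble has 100 rows. If each country has one row per year from 1990 to 2014 (25 rows), that corresponds to four countries rather than five, which suggests that one of the names in selected_countries may not match the spelling used in the data. A hedged sketch of how to check this follows. The available vector is an assumed stand-in for unique(nations$country); the spelling "Russian Federation" is an assumption based on common World Bank naming and was not read from nations.csv.

```r
selected_countries <- c("Brazil", "Russia", "India", "China", "Nigeria")

# Assumed stand-in for unique(nations$country); not read from nations.csv
available <- c("Brazil", "Russian Federation", "India", "China", "Nigeria")

# setdiff() returns the requested names that are absent from the data
missing_names <- setdiff(selected_countries, available)
print(missing_names)
```

If the check returns a name, replacing it in selected_countries with the spelling used in the data would bring that country into the filter result.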
For the first chart, I filter the data with dplyr to the five desired countries, then draw the chart with ggplot2 using both geom_point and geom_line layers and the Set1 ColorBrewer palette via scale_color_brewer(palette = "Set1").
install.packages("RColorBrewer")
library(RColorBrewer)
Create plot1.
plot1 <- ggplot(selected_nations, aes(x = population / 1e6, y = gdp, color = country)) + # Population converted to millions to match the axis label
  geom_point(size = 2) + # Add points
  geom_line() + # Add lines
  scale_color_brewer(palette = "Set1") + # Apply Set1 color palette
  theme_minimal() + # Use a minimal theme for better aesthetics
  labs(title = "GDP vs Population for Selected Nations",
       x = "Population (millions)",
       y = "GDP (trillions USD)")
print(plot1)
The plot above clearly demonstrates China’s significant dominance, with Brazil and Nigeria trailing far behind. Although Nigeria is not part of the BRICS nations, China’s economic prowess within the bloc diminishes its competitiveness in terms of policies and trade relations, particularly with the US. China’s influence tends to overshadow decisions favoring the rest of the bloc, including Nigeria and other African countries.
For the second chart, use dplyr to group_by region and year, and then summarise the mutated gdp value using summarise(GDP = sum(gdp, na.rm = TRUE)). The data contain missing values (NAs), so na.rm = TRUE is needed. Each region’s area is drawn with geom_area(), and the chart uses the Set2 ColorBrewer palette via scale_fill_brewer(palette = "Set2").
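The effect of na.rm = TRUE in a grouped sum can be seen on a small example. The sketch below uses a toy tibble with made-up values (the region labels "A" and "B" are hypothetical) and compares the grouped sum with and without na.rm.

```r
library(dplyr)

# Toy data with made-up values, for illustration only
toy <- tibble(
  region = c("A", "A", "B", "B"),
  year   = c(2000, 2000, 2000, 2000),
  gdp    = c(1.5, NA, 2.0, 3.0)
)

# Without na.rm = TRUE, any NA in a group makes that group's sum NA
with_na <- toy |>
  group_by(region, year) |>
  summarise(GDP = sum(gdp), .groups = "drop")

# With na.rm = TRUE, NAs are dropped before summing
without_na <- toy |>
  group_by(region, year) |>
  summarise(GDP = sum(gdp, na.rm = TRUE), .groups = "drop")

print(with_na)
print(without_na)
```

In the first result, region "A" has an NA total because one of its rows is missing; in the second, the missing row is skipped and the remaining value is summed.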
For plot2, use dplyr to group_by region and year.
nations_summary <- nations1 |>
  group_by(region, year) |>
  summarize(gdp = sum(gdp, na.rm = TRUE), .groups = "drop")
Plot2
plot2 <- ggplot(nations_summary, aes(x = year, y = gdp,
                                     fill = region, colour = region)) +
  geom_area(size = 0.1) + # Stacked areas, one per region
  scale_fill_brewer(palette = "Set2") + # Apply Set2 color palette
  scale_color_manual(values = rep("white", length(unique(nations_summary$region)))) + # Thin white outlines between areas
  theme_minimal() + # Use a minimal theme for better aesthetics
  labs(title = "Total GDP by Region and Year",
       x = "Year",
       y = "Total GDP (trillions USD)")
# Display the plot
print(plot2)
Plot2 clearly displays each region’s total GDP and how it has changed over the 25 years from 1990 to 2014. One significant takeaway from the plot above is the dominance of the East Asia and Pacific region compared to the others. Another region that shows a significantly high total GDP is Latin America and the Caribbean, in comparison to North America.
While GDP is a useful index for assessing wealth regionally, it inadequately portrays the economic status of wealthy nations. This can be observed when comparing migration patterns, as many individuals from regions with high GDP tend to migrate towards North American and European nations. This migration trend underscores that the high total GDP of these regions is influenced by the sheer number of countries and population within them.
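The point that a region's total GDP depends on how many countries and people it contains can be illustrated by putting total GDP next to a population-weighted GDP per capita. The sketch below is only an illustration: the region labels and all numbers are invented, and the column name gdp_percap_region is hypothetical.

```r
library(dplyr)

# Toy data with invented values, for illustration only
toy <- tibble(
  region     = c("Many countries", "Many countries", "Many countries", "Few countries"),
  gdp_percap = c(5000, 6000, 4000, 40000),
  population = c(2e8, 3e8, 5e8, 1e8)
)

regional <- toy |>
  group_by(region) |>
  summarise(
    total_gdp_trillions = sum(gdp_percap * population) / 1e12,
    total_population    = sum(population),
    .groups = "drop"
  ) |>
  # Population-weighted GDP per capita for the region as a whole
  mutate(gdp_percap_region = total_gdp_trillions * 1e12 / total_population)

print(regional)
```

In this toy case the region with more countries and people has the larger total GDP but a much lower GDP per capita, which is the distinction the paragraph above draws.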