Introduction

This analysis explores how countries are distributed across income groups within each region, using the country metadata in WDICountry.csv.

It aims to answer one question: “What is the distribution of countries across income groups in different regions?”

1. Import and Explore Data

# Load necessary libraries
library(readr)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(ggplot2)

# Import the data set
wdicountry <- read_csv("Desktop/R Programming/WDICountry.csv")
## Rows: 265 Columns: 31
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (25): Country Code, Short Name, Table Name, Long Name, 2-alpha code, Cur...
## dbl  (3): National accounts reference year, Latest industrial data, Latest t...
## lgl  (3): Alternative conversion factor, PPP survey year, Latest water withd...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# Explore the data set
summary(wdicountry)
##  Country Code        Short Name         Table Name         Long Name        
##  Length:265         Length:265         Length:265         Length:265        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##  2-alpha code       Currency Unit      Special Notes         Region         
##  Length:265         Length:265         Length:265         Length:265        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##  Income Group        WB-2 code         National accounts base year
##  Length:265         Length:265         Length:265                 
##  Class :character   Class :character   Class :character           
##  Mode  :character   Mode  :character   Mode  :character           
##                                                                   
##                                                                   
##                                                                   
##                                                                   
##  National accounts reference year SNA price valuation Lending category  
##  Min.   :2000                     Length:265          Length:265        
##  1st Qu.:2012                     Class :character    Class :character  
##  Median :2015                     Mode  :character    Mode  :character  
##  Mean   :2013                                                           
##  3rd Qu.:2015                                                           
##  Max.   :2022                                                           
##  NA's   :191                                                            
##  Other groups       System of National Accounts Alternative conversion factor
##  Length:265         Length:265                  Mode:logical                 
##  Class :character   Class :character            NA's:265                     
##  Mode  :character   Mode  :character                                         
##                                                                              
##                                                                              
##                                                                              
##                                                                              
##  PPP survey year Balance of Payments Manual in use
##  Mode:logical    Length:265                       
##  NA's:265        Class :character                 
##                  Mode  :character                 
##                                                   
##                                                   
##                                                   
##                                                   
##  External debt Reporting status System of trade   
##  Length:265                     Length:265        
##  Class :character               Class :character  
##  Mode  :character               Mode  :character  
##                                                   
##                                                   
##                                                   
##                                                   
##  Government Accounting concept IMF data dissemination standard
##  Length:265                    Length:265                     
##  Class :character              Class :character               
##  Mode  :character              Mode  :character               
##                                                               
##                                                               
##                                                               
##                                                               
##  Latest population census Latest household survey
##  Length:265               Length:265             
##  Class :character         Class :character       
##  Mode  :character         Mode  :character       
##                                                  
##                                                  
##                                                  
##                                                  
##  Source of most recent Income and expenditure data Vital registration complete
##  Length:265                                        Length:265                 
##  Class :character                                  Class :character           
##  Mode  :character                                  Mode  :character           
##                                                                               
##                                                                               
##                                                                               
##                                                                               
##  Latest agricultural census Latest industrial data Latest trade data
##  Length:265                 Min.   :1973           Min.   :1995     
##  Class :character           1st Qu.:2002           1st Qu.:2017     
##  Mode  :character           Median :2012           Median :2018     
##                             Mean   :2007           Mean   :2017     
##                             3rd Qu.:2013           3rd Qu.:2018     
##                             Max.   :2014           Max.   :2018     
##                             NA's   :118            NA's   :74       
##  Latest water withdrawal data
##  Mode:logical                
##  NA's:265                    
##                              
##                              
##                              
##                              
## 
str(wdicountry)
## spc_tbl_ [265 × 31] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##  $ Country Code                                     : chr [1:265] "ABW" "AFE" "AFG" "AFW" ...
##  $ Short Name                                       : chr [1:265] "Aruba" "Africa Eastern and Southern" "Afghanistan" "Africa Western and Central" ...
##  $ Table Name                                       : chr [1:265] "Aruba" "Africa Eastern and Southern" "Afghanistan" "Africa Western and Central" ...
##  $ Long Name                                        : chr [1:265] "Aruba" "Africa Eastern and Southern" "Islamic State of Afghanistan" "Africa Western and Central" ...
##  $ 2-alpha code                                     : chr [1:265] "AW" "ZH" "AF" "ZI" ...
##  $ Currency Unit                                    : chr [1:265] "Aruban florin" NA "Afghan afghani" NA ...
##  $ Special Notes                                    : chr [1:265] NA "26 countries, stretching from the Red Sea in the North to the Cape of Good Hope in the South (https://www.world"| __truncated__ "The reporting period for national accounts data is designated as either calendar year basis (CY) or fiscal year"| __truncated__ "22 countries, stretching from the westernmost point of Africa, across the equator, and partly along the Atlanti"| __truncated__ ...
##  $ Region                                           : chr [1:265] "Latin America & Caribbean" NA "South Asia" NA ...
##  $ Income Group                                     : chr [1:265] "High income" NA "Low income" NA ...
##  $ WB-2 code                                        : chr [1:265] "AW" "ZH" "AF" "ZI" ...
##  $ National accounts base year                      : chr [1:265] "2013" NA "2016" NA ...
##  $ National accounts reference year                 : num [1:265] NA NA NA NA NA 2010 NA NA NA NA ...
##  $ SNA price valuation                              : chr [1:265] "Value added at basic prices (VAB)" NA "Value added at basic prices (VAB)" NA ...
##  $ Lending category                                 : chr [1:265] NA NA "IDA" NA ...
##  $ Other groups                                     : chr [1:265] NA NA "HIPC" NA ...
##  $ System of National Accounts                      : chr [1:265] "Country uses the 2008 System of National Accounts methodology" NA "Country uses the 1993 System of National Accounts methodology" NA ...
##  $ Alternative conversion factor                    : logi [1:265] NA NA NA NA NA NA ...
##  $ PPP survey year                                  : logi [1:265] NA NA NA NA NA NA ...
##  $ Balance of Payments Manual in use                : chr [1:265] "BPM6" NA "BPM6" NA ...
##  $ External debt Reporting status                   : chr [1:265] NA NA "Estimate" NA ...
##  $ System of trade                                  : chr [1:265] "General trade system" NA "General trade system" NA ...
##  $ Government Accounting concept                    : chr [1:265] NA NA "Consolidated central government" NA ...
##  $ IMF data dissemination standard                  : chr [1:265] "Enhanced General Data Dissemination System (e-GDDS)" NA "Enhanced General Data Dissemination System (e-GDDS)" NA ...
##  $ Latest population census                         : chr [1:265] "2020 (expected)" NA "1979" NA ...
##  $ Latest household survey                          : chr [1:265] NA NA "Demographic and Health Survey, 2015" NA ...
##  $ Source of most recent Income and expenditure data: chr [1:265] NA NA "Integrated household survey (IHS), 2016/17" NA ...
##  $ Vital registration complete                      : chr [1:265] "Yes" NA NA NA ...
##  $ Latest agricultural census                       : chr [1:265] NA NA NA NA ...
##  $ Latest industrial data                           : num [1:265] NA NA NA NA NA ...
##  $ Latest trade data                                : num [1:265] 2018 NA 2018 NA 2018 ...
##  $ Latest water withdrawal data                     : logi [1:265] NA NA NA NA NA NA ...
##  - attr(*, "spec")=
##   .. cols(
##   ..   `Country Code` = col_character(),
##   ..   `Short Name` = col_character(),
##   ..   `Table Name` = col_character(),
##   ..   `Long Name` = col_character(),
##   ..   `2-alpha code` = col_character(),
##   ..   `Currency Unit` = col_character(),
##   ..   `Special Notes` = col_character(),
##   ..   Region = col_character(),
##   ..   `Income Group` = col_character(),
##   ..   `WB-2 code` = col_character(),
##   ..   `National accounts base year` = col_character(),
##   ..   `National accounts reference year` = col_double(),
##   ..   `SNA price valuation` = col_character(),
##   ..   `Lending category` = col_character(),
##   ..   `Other groups` = col_character(),
##   ..   `System of National Accounts` = col_character(),
##   ..   `Alternative conversion factor` = col_logical(),
##   ..   `PPP survey year` = col_logical(),
##   ..   `Balance of Payments Manual in use` = col_character(),
##   ..   `External debt Reporting status` = col_character(),
##   ..   `System of trade` = col_character(),
##   ..   `Government Accounting concept` = col_character(),
##   ..   `IMF data dissemination standard` = col_character(),
##   ..   `Latest population census` = col_character(),
##   ..   `Latest household survey` = col_character(),
##   ..   `Source of most recent Income and expenditure data` = col_character(),
##   ..   `Vital registration complete` = col_character(),
##   ..   `Latest agricultural census` = col_character(),
##   ..   `Latest industrial data` = col_double(),
##   ..   `Latest trade data` = col_double(),
##   ..   `Latest water withdrawal data` = col_logical()
##   .. )
##  - attr(*, "problems")=<externalptr>
head(wdicountry)
## # A tibble: 6 × 31
##   `Country Code` `Short Name`            `Table Name` `Long Name` `2-alpha code`
##   <chr>          <chr>                   <chr>        <chr>       <chr>         
## 1 ABW            Aruba                   Aruba        Aruba       AW            
## 2 AFE            Africa Eastern and Sou… Africa East… Africa Eas… ZH            
## 3 AFG            Afghanistan             Afghanistan  Islamic St… AF            
## 4 AFW            Africa Western and Cen… Africa West… Africa Wes… ZI            
## 5 AGO            Angola                  Angola       People's R… AO            
## 6 ALB            Albania                 Albania      Republic o… AL            
## # ℹ 26 more variables: `Currency Unit` <chr>, `Special Notes` <chr>,
## #   Region <chr>, `Income Group` <chr>, `WB-2 code` <chr>,
## #   `National accounts base year` <chr>,
## #   `National accounts reference year` <dbl>, `SNA price valuation` <chr>,
## #   `Lending category` <chr>, `Other groups` <chr>,
## #   `System of National Accounts` <chr>, `Alternative conversion factor` <lgl>,
## #   `PPP survey year` <lgl>, `Balance of Payments Manual in use` <chr>, …

2. Prepare Data

# Display all column names in the dataset
colnames(wdicountry)
##  [1] "Country Code"                                     
##  [2] "Short Name"                                       
##  [3] "Table Name"                                       
##  [4] "Long Name"                                        
##  [5] "2-alpha code"                                     
##  [6] "Currency Unit"                                    
##  [7] "Special Notes"                                    
##  [8] "Region"                                           
##  [9] "Income Group"                                     
## [10] "WB-2 code"                                        
## [11] "National accounts base year"                      
## [12] "National accounts reference year"                 
## [13] "SNA price valuation"                              
## [14] "Lending category"                                 
## [15] "Other groups"                                     
## [16] "System of National Accounts"                      
## [17] "Alternative conversion factor"                    
## [18] "PPP survey year"                                  
## [19] "Balance of Payments Manual in use"                
## [20] "External debt Reporting status"                   
## [21] "System of trade"                                  
## [22] "Government Accounting concept"                    
## [23] "IMF data dissemination standard"                  
## [24] "Latest population census"                         
## [25] "Latest household survey"                          
## [26] "Source of most recent Income and expenditure data"
## [27] "Vital registration complete"                      
## [28] "Latest agricultural census"                       
## [29] "Latest industrial data"                           
## [30] "Latest trade data"                                
## [31] "Latest water withdrawal data"
# Keep only rows that have both a Region and an Income Group. This drops
# aggregate entries such as "Africa Eastern and Southern", whose Region is NA.
wdicountry_clean <- wdicountry %>%
  filter(!is.na(Region), !is.na(`Income Group`))
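
Before relying on the filtered data, it can help to check how many rows the filter removes. The following is a minimal sketch on a small toy tibble, not the real file: the four rows copy the first values shown in the str() output above, and the names toy, toy_clean, na_counts and rows_dropped are hypothetical.

```r
library(dplyr)

# Toy data mimicking the first four rows of WDICountry.csv (see str() output):
# the aggregates "AFE" and "AFW" have no Region or Income Group.
toy <- tibble(
  `Country Code` = c("ABW", "AFE", "AFG", "AFW"),
  Region         = c("Latin America & Caribbean", NA, "South Asia", NA),
  `Income Group` = c("High income", NA, "Low income", NA)
)

# Count missing values in the two columns used by the filter
na_counts <- toy %>%
  summarise(
    missing_region = sum(is.na(Region)),
    missing_income = sum(is.na(`Income Group`))
  )

# Apply the same filter as in the analysis and compare row counts
toy_clean <- toy %>%
  filter(!is.na(Region), !is.na(`Income Group`))

rows_dropped <- nrow(toy) - nrow(toy_clean)

print(na_counts)
print(rows_dropped)
```

Running the same two checks on wdicountry and wdicountry_clean would show how many of the 265 rows are aggregates or otherwise unclassified.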

3. Summarize and Group Data

# Summarize the number of countries by region and income group
income_by_region <- wdicountry_clean %>%
  group_by(Region, `Income Group`) %>%
  summarize(count = n(), .groups = "drop")  # One ungrouped row per Region / Income Group pair
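
The same table can probably be produced more compactly with dplyr's count(), which groups, tallies, and returns an ungrouped tibble in one step. This is a sketch on made-up toy rows; toy_clean and income_by_region_toy are hypothetical names, and the rows are illustrative only.

```r
library(dplyr)

# Toy stand-in for wdicountry_clean (illustrative rows, not real data)
toy_clean <- tibble(
  Region         = c("South Asia", "South Asia", "Latin America & Caribbean"),
  `Income Group` = c("Low income", "Low income", "High income")
)

# count() is shorthand for group_by() + summarize(n = n()) + ungroup();
# name = "count" keeps the column name used in the rest of the analysis
income_by_region_toy <- toy_clean %>%
  count(Region, `Income Group`, name = "count")

print(income_by_region_toy)
```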

4. Visualize Data

# Create a bar plot for the distribution of countries by income group and region
ggplot(income_by_region, aes(x = `Income Group`, y = count, fill = Region)) +
  geom_col(position = "dodge") +  # geom_col() plots pre-computed counts directly
  labs(title = "Distribution of Countries by Income Group and Region",
       x = "Income Group",
       y = "Number of Countries") +
  theme_minimal()
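
By default ggplot2 orders a character x-axis alphabetically, so "High income" would appear before "Low income". One way to get a low-to-high ordering is to convert Income Group to a factor with explicit levels. The sketch below assumes the four labels are "Low income", "Lower middle income", "Upper middle income" and "High income"; only the first and last appear in the output above, so the real labels should be checked with unique(wdicountry_clean$`Income Group`). The counts are made up for illustration.

```r
library(dplyr)
library(ggplot2)

# Assumed label set, ordered from lowest to highest income
income_levels <- c("Low income", "Lower middle income",
                   "Upper middle income", "High income")

# Toy summary table with made-up counts (not real results)
toy_counts <- tibble(
  Region         = c("Region A", "Region A", "Region B"),
  `Income Group` = c("Low income", "Lower middle income", "High income"),
  count          = c(2, 3, 4)
)

# Turn the income group into an ordered-by-level factor
toy_counts <- toy_counts %>%
  mutate(`Income Group` = factor(`Income Group`, levels = income_levels))

# drop = FALSE keeps income groups with no rows visible on the axis
p <- ggplot(toy_counts, aes(x = `Income Group`, y = count, fill = Region)) +
  geom_col(position = "dodge") +
  scale_x_discrete(drop = FALSE) +
  labs(x = "Income Group", y = "Number of Countries") +
  theme_minimal()
```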

Conclusion

The analysis shows that some regions, such as Sub-Saharan Africa, have a higher concentration of countries in the lower income groups, while regions like Europe & Central Asia have a higher proportion of high-income countries. Income levels are thus unevenly distributed across regions.
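
Because regions contain different numbers of countries, statements about proportions are easier to verify from within-region shares than from raw counts. The following sketch shows one way to compute them. It uses hypothetical regions and made-up counts, and the names toy_counts and shares are illustrative.

```r
library(dplyr)

# Toy summary table with made-up counts for two hypothetical regions
toy_counts <- tibble(
  Region         = c("Region A", "Region A", "Region B", "Region B"),
  `Income Group` = c("Low income", "High income", "Low income", "High income"),
  count          = c(3, 1, 1, 3)
)

# Share of each income group within its region (shares sum to 1 per region)
shares <- toy_counts %>%
  group_by(Region) %>%
  mutate(share = count / sum(count)) %>%
  ungroup()

print(shares)
```

Applied to income_by_region, the same group_by() and mutate() steps would give the share of each income group within every region.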