The task and summary of findings

This analysis uses a World Bank data set to examine how income levels are distributed across the world's countries and regions. It has two main findings: 1) more countries are classified as high income than as any other income group; 2) Europe & Central Asia contains more high income countries than any other region (38 of the 83 high income countries in the data).

The World Bank defines income levels by using gross national income (GNI) per capita. The groups are: low income, $1,135 or less; lower middle income, $1,136 to $4,465; upper middle income, $4,466 to $13,845; and high income, $13,846 or more.
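As a rough illustration of how these thresholds partition countries, the sketch below maps GNI per capita values to the four groups with base R's cut(). The helper name classify_income and the example values are hypothetical and are not part of the original analysis; the sketch also assumes GNI per capita is given in whole US dollars, as in the thresholds quoted above.

```r
# Hypothetical helper (not part of the original analysis): map GNI per capita
# in whole US dollars to the World Bank income groups quoted above.
classify_income <- function(gni_per_capita) {
  cut(
    gni_per_capita,
    breaks = c(-Inf, 1135, 4465, 13845, Inf),
    labels = c("Low income", "Lower middle income",
               "Upper middle income", "High income"),
    right = TRUE  # intervals are (lower, upper], so 1135 is still "Low income"
  )
}

# Made-up example values that sit on or next to the thresholds
example_gni <- c(800, 1135, 1136, 4466, 13845, 13846)
income_levels <- as.character(classify_income(example_gni))
print(income_levels)
```

Because the intervals are closed on the right, a value exactly equal to a threshold such as 1,135 or 13,845 stays in the lower of the two adjacent groups, which matches the "or less" wording of the definition.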

Finding #1: More countries are in the high income group than in any other group.

# The code below uses dplyr's summarize(), count(), and group_by() functions to prepare the data for the analysis

library(readr)
CLASS <- read_csv("CLASS.csv")

library(dplyr)

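# Keep only rows that have an income group assigned
df <- CLASS[!is.na(CLASS$`Income group`), ]

# Count the countries in each income group, largest group first
df_by_income <- df %>%
  group_by(`Income group`) %>%
  summarize(total_country_count = n()) %>%
  arrange(desc(total_country_count))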

library(ggplot2)
ggplot(data=df_by_income) + 
  geom_col(mapping = aes(`Income group`, total_country_count), fill="purple",col="purple" ) +
  labs(x="Income group", y="Number of countries", title="Number of countries by income group", caption = "Data source: World Bank")

Finding #2: The Europe & Central Asia region contains more high income countries than any other region.

library(dplyr)

df_by_region = df %>% 
              group_by(`Income group`, Region) %>%
              summarize(total_country_count = n()) %>%
              arrange(`Income group`)


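# count() adds a column n; it is 1 in every row because each
# income group / region pair already appears exactly once in df_by_region
show_region = tibble(df_by_region %>% count(`Income group`, Region, total_country_count))

# Keep only the high income rows
show_high = show_region[show_region$`Income group`=="High income",]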

library(ggplot2)
ggplot(show_high) + 
  geom_col(mapping = aes(Region, total_country_count), fill="purple",col="purple") +
  labs(x="Region", y="Number of countries", title="Number of high income countries by region", caption = "Data source: World Bank")

show_high
## # A tibble: 6 × 4
##   `Income group` Region                     total_country_count     n
##   <chr>          <chr>                                    <int> <int>
## 1 High income    East Asia & Pacific                         15     1
## 2 High income    Europe & Central Asia                       38     1
## 3 High income    Latin America & Caribbean                   18     1
## 4 High income    Middle East & North Africa                   8     1
## 5 High income    North America                                3     1
## 6 High income    Sub-Saharan Africa                           1     1
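
To put the counts above in proportion, the following sketch converts them into each region's percentage share of all high income countries. The counts are copied by hand from the show_high output; the variable names are illustrative and not part of the original analysis.

```r
# Counts copied from the show_high output above
high_income_counts <- c(
  "East Asia & Pacific"        = 15,
  "Europe & Central Asia"      = 38,
  "Latin America & Caribbean"  = 18,
  "Middle East & North Africa" = 8,
  "North America"              = 3,
  "Sub-Saharan Africa"         = 1
)

# Total number of high income countries across all regions
total_high_income <- sum(high_income_counts)

# Each region's share of that total, in percent, rounded to one decimal place
region_share <- round(100 * high_income_counts / total_high_income, 1)
print(sort(region_share, decreasing = TRUE))
```

Under these counts, Europe & Central Asia holds the largest share of high income countries, but that share is below half of the total, so the region leads the others without containing a majority.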

Data source: https://datacatalog.worldbank.org/search/dataset/0037712/World-Development-Indicators