Introduction

Infant mortality is the death of an infant before their first birthday. Leading causes include congenital malformations, infections, and sudden infant death syndrome (SIDS); infanticide, abuse, abandonment, and neglect can also contribute. Infant mortality is measured by the infant mortality rate (sometimes called the infant death rate): the number of infants who die before age one in a given year divided by the number of live births in that year. The rate is reported per 1,000 live births so that countries of different sizes can compare their rates.
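To make the definition concrete, here is a minimal sketch of the calculation in base R. The function name mortality_rate() and the counts passed to it are hypothetical and chosen only for illustration; they are not real statistics.

```r
# Deaths per `per` live births. `mortality_rate` is a hypothetical helper name.
mortality_rate <- function(deaths, live_births, per = 1000) {
  stopifnot(all(live_births > 0))
  per * deaths / live_births
}

# Made-up counts: 25 deaths among 5,000 live births.
example_rate <- mortality_rate(deaths = 25, live_births = 5000)
example_rate
```

Scaling to a common base of 1,000 live births is what lets a small country and a large one be compared on the same footing.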


Content Overview

We will use this project to explain how to create a global map and present accurate information on it. The data used for this code-through were extracted from the World Bank using the WDI R package. For those who are not familiar with it, WDI is a package that uses the World Bank's API to pull series from the World Development Indicators data bank into RStudio. This project focuses on creating a map of the 2020 under-5 mortality rate (per 1,000 live births) for all the countries of the world. Note that this indicator, which the World Bank codes as SH.DYN.MORT, counts deaths before age five, so it is broader than the under-one infant mortality rate defined above.


Why You Should Care

This topic is valuable because child mortality is an important marker of the overall health of our global community. It helps us understand where we stand in terms of access to healthcare, not just in our own countries but worldwide.

Learning Objectives

Through this code-through, students will learn the following:

  1. How to extract data from the World Bank using the WDI R package.
  2. How to create a map and use it to show important information from around the world.
  3. How to use ggplot2 to create a global map.
  4. How to beautify and annotate a map in ggplot2.


Body Title

Over the last fifty years, there has been a significant decrease in the infant mortality rate. The chance of a newborn baby dying before age five has decreased; however, some countries, especially developing ones, continue to suffer from high child mortality rates. In this project, we will only look at data from 2020. We will create a map that shows the countries where the mortality rate is high and where it is low.

Further Exposition

According to the World Health Organization, “Globally 2.4 million children died in the first month of life in 2019. There are approximately 6 700 newborn deaths every day, amounting to 47% of all child deaths under the age of 5 years, up from 40% in 1990.”

The World Health Organization further states that the world has made substantial progress in child survival since 1990. Globally, neonatal deaths declined from 5.0 million in 1990 to 2.4 million in 2019. What did child mortality look like in 2020? We will examine it using the global map.

Here in the United States, according to the Centers for Disease Control and Prevention (CDC), the infant mortality rate in 2020 was 5.4 deaths per 1,000 live births. Almost 20,000 infants died in the United States in 2020. The five leading causes of infant death in 2020 were:

  1. Birth defects.
  2. Preterm birth and low birth weight.
  3. Sudden infant death syndrome.
  4. Injuries (e.g., suffocation).
  5. Maternal pregnancy complications.

Basic Example

This basic example shows how to use the WDI package to extract data from the World Bank data bank. First, we load the WDI package and then make sure that our arguments are in the order specified in the R documentation. Recall that I have also loaded the other packages at the beginning of this document.

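# Download the under-5 mortality indicator for every country, 2015-2021.
# extra = TRUE adds metadata columns such as region and income group.
library(WDI)

world_infanct_mortality_raw <- WDI(country = "all", indicator = "SH.DYN.MORT", extra = TRUE, start = 2015, end = 2021)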
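Before downloading, it can help to confirm that an indicator code means what we think it does. The sketch below uses WDIsearch() from the WDI package, which by default searches a list of indicators bundled with the package (so it may lag behind the live World Bank catalogue). The search string "under-5" is just one plausible choice.

```r
library(WDI)

# Search the bundled indicator list for names that mention "under-5".
hits <- WDIsearch(string = "under-5", field = "name")
head(hits)

# Check that the code used in this project appears among the matches.
"SH.DYN.MORT" %in% hits[, "indicator"]
```

Reading the `name` column next to each code is a quick guard against mixing up related series, such as infant and under-5 mortality.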

After extracting data from the World Bank, we have to clean it before we can analyze it, because World Bank data comes in its raw form. I renamed a confusing variable name, filtered out the regional aggregates and the years I did not need (the map shows 2020 only, so each country should contribute a single row), and selected the variables I was interested in analyzing.

# Rename the indicator, keep real countries in 2020 only, and keep four columns
world_infanct_mortality_clean <- world_infanct_mortality_raw %>%
  rename(child_mortality_rate = SH.DYN.MORT) %>%
  filter(region != "Aggregates", year == 2020) %>%
  select(country, child_mortality_rate, year, iso3c)
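A quick sanity check after cleaning is to confirm that each country appears exactly once, since duplicate rows would be drawn on top of each other on the map. The sketch below runs the same kind of pipeline on a tiny stand-in data frame; toy_raw and every value in it are made up for illustration and only mimic the shape of a WDI download.

```r
library(dplyr)

# Hypothetical stand-in for the WDI download; all values are invented.
toy_raw <- data.frame(
  country     = c("Country A", "Country A", "Country B", "Country B", "World"),
  iso3c       = c("AAA", "AAA", "BBB", "BBB", "WLD"),
  year        = c(2019, 2020, 2019, 2020, 2020),
  SH.DYN.MORT = c(10.5, 10.0, 42.0, 40.0, 37.0),
  region      = c("Region X", "Region X", "Region Y", "Region Y", "Aggregates")
)

toy_clean <- toy_raw %>%
  rename(child_mortality_rate = SH.DYN.MORT) %>%
  filter(region != "Aggregates", year == 2020) %>%
  select(country, child_mortality_rate, year, iso3c)

# One row per country code means the data is safe to join to the shapes.
rows_per_country <- count(toy_clean, iso3c)
all(rows_per_country$n == 1)
```

The same count() check can be run on the real cleaned data before joining it to the country shapes.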


Advanced Examples

In this segment, I import another data set containing the shapes of the world's countries. It stores each country's boundary as longitude and latitude coordinates, which is what we need to be able to create a global map.

# Read the Natural Earth country boundaries (read_sf() comes from the sf package)
world_shapes <- read_sf("data/ne_110m_admin_0_countries/ne_110m_admin_0_countries.shp")

Next, I used the left_join() function to merge the two data sets on their three-letter ISO country codes, and I dropped Antarctica (ATA) from the shapes. After the cleaning and the merge, the data is ready to be visualized.

infant_mortality_world_map <- world_shapes %>%
  left_join(world_infanct_mortality_clean, by = c("ISO_A3" = "iso3c")) %>%
  filter(ISO_A3 != "ATA")
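Because left_join() keeps every shape and fills in NA where no rate matches, countries whose codes do not line up end up blank on the map without any warning. One way to spot them is anti_join(), sketched below on hypothetical toy tables. Some releases of the Natural Earth file reportedly carry a placeholder ISO_A3 of "-99" for a few countries; whether that applies to the file used here is an assumption worth checking rather than a given.

```r
library(dplyr)

# Hypothetical shapes and rates; names, codes, and values are made up.
toy_shapes <- data.frame(
  NAME   = c("Country A", "Country B", "Country C"),
  ISO_A3 = c("AAA", "BBB", "-99")
)
toy_rates <- data.frame(
  iso3c                = c("AAA", "BBB", "CCC"),
  child_mortality_rate = c(10, 40, 25)
)

# Shapes that find no matching rate would be drawn in the NA colour.
unmatched <- anti_join(toy_shapes, toy_rates, by = c("ISO_A3" = "iso3c"))
unmatched
```

Running the same anti_join() on world_shapes and the cleaned World Bank data lists the countries that would need their codes repaired before mapping.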


Here we need the ggplot2 package we loaded at the beginning of this R Markdown file, which lets us draw our data as a map. ggplot2 is an open-source data visualization package for the statistical programming language R. Created by Hadley Wickham in 2005, it implements Leland Wilkinson’s Grammar of Graphics, a general scheme for data visualization that breaks graphs into semantic components such as scales and layers. ggplot2 lets us choose which variable to visualize and which colors to use.

# Fill each country by its mortality rate and draw the map in the Robinson projection
map1 <- ggplot() +
  geom_sf(data = infant_mortality_world_map,
          aes(fill = child_mortality_rate),
          size = 0.25) +
  coord_sf(crs = st_crs("ESRI:54030")) +
  scale_fill_gradient(name = "Deaths per\n1,000 live births", low = "#ADD8E6", high = "#C70039", na.value = "white") +
  theme_void()
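Readers who do not have the Natural Earth shapefile at hand can try the same geom_sf() pattern on the North Carolina SIDS data set that ships with the sf package. This is only a sketch for practice: the column sids_rate is computed here for illustration, from the SID74 (SIDS deaths) and BIR74 (births) columns in that bundled file.

```r
library(sf)
library(ggplot2)

# County boundaries bundled with sf, including birth and SIDS death counts.
nc <- read_sf(system.file("shape/nc.shp", package = "sf"))

# SIDS deaths per 1,000 births, computed for illustration.
nc$sids_rate <- 1000 * nc$SID74 / nc$BIR74

# Same layers as the world map: filled polygons, a two-colour gradient, no axes.
nc_map <- ggplot() +
  geom_sf(data = nc, aes(fill = sids_rate)) +
  scale_fill_gradient(name = "SIDS deaths\nper 1,000 births",
                      low = "#ADD8E6", high = "#C70039", na.value = "white") +
  theme_void()

nc_map
```

The structure is the same as the world map; only the data, the fill variable, and the legend title change.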


Finally, it is worth adding text and a theme to make the map look polished and professional. We add a title, a subtitle, and a caption. We also change the fonts and make the title bold, which makes the graph more aesthetically pleasing.

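# Add a title, subtitle, and caption, then style the text.
# Assigning the result back to map1 ensures the styled map is the one printed below.
map1 <- map1 +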
  theme(legend.position = "right") +
  labs(title = "2020 Under-5 Mortality Rate (per 1,000 live births)",
       subtitle = "Annual Report by the World Bank",
       caption = "Source: The World Bank") +
  theme(plot.title = element_text(size = 20,
                                  face = "bold",
                                  family = "serif", 
                                  color = "black",
                                  hjust = 0.5,
                                  lineheight = 1.2),
        plot.subtitle = element_text(size = 15,
                                     family = "serif",
                                     color = "black",
                                     hjust = 0.5),
        plot.caption = element_text(size = 15,
                                    family = "serif",
                                    hjust = 0.5))

map1

From the map above, we can see the global pattern in child mortality. The rate is low in high-income developed countries in North America, Europe, and Australia. On the other hand, it is high in low-income countries, mainly in Africa and Asia. Wealthy countries have better access to adequate healthcare, education, and infrastructure, creating an environment in which infants can thrive. In contrast, low-income countries tend to have little or no access to quality healthcare, along with lower-quality education and infrastructure, which makes it much harder for infants to thrive.
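The pattern read off the map can also be checked numerically. The sketch below assumes the raw download contains `income` and `region` columns, which WDI adds when extra = TRUE is set. The helper summarise_by_income() is a hypothetical name, and the toy table holds made-up values purely to show the shape of the result.

```r
library(dplyr)

# Hypothetical helper: median under-5 mortality per income group for one year.
summarise_by_income <- function(raw, target_year = 2020) {
  raw %>%
    filter(region != "Aggregates", year == target_year) %>%
    group_by(income) %>%
    summarise(median_rate = median(SH.DYN.MORT, na.rm = TRUE),
              n_countries = sum(!is.na(SH.DYN.MORT)),
              .groups = "drop") %>%
    arrange(desc(median_rate))
}

# Made-up rows that mimic the columns of the raw download.
toy_raw <- data.frame(
  region      = c("Region X", "Region X", "Region Y", "Region Y"),
  year        = 2020,
  income      = c("High income", "High income", "Low income", "Low income"),
  SH.DYN.MORT = c(4, 6, 60, 80)
)

income_summary <- summarise_by_income(toy_raw)
income_summary
```

Applied to world_infanct_mortality_raw, a table like this would put numbers behind the contrast between income groups that the map shows visually.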

Further Resources

Learn more about [WDI package, map making technique, World Bank dataset] with the following:




Works Cited

This code through references and cites the following sources: