Source and topic of the data

https://www.unicef.org/

https://www.kaggle.com/datasets/ruchi798/malnutrition-across-the-globe?select=country-wise-average.csv

TOPIC: MALNUTRITION ACROSS THE GLOBE

Let me introduce my topic and dataset!

The global prevalence of malnutrition is staggering. According to the newspaper “Canada and the World”, more than 2.3 billion people worldwide suffer from malnutrition. No country is spared: one person in three is affected, whether through stunting in children or overweight in adults. Many people think of malnutrition only as a lack of nutritious food, but it takes several forms: undernutrition (wasting, stunting and underweight), vitamin or mineral deficiencies, overweight, obesity and diet-related non-communicable diseases. Today 1.9 billion adults are overweight or obese, while 462 million are underweight. Among children under 5, 52 million are wasted, 17 million are severely wasted and 155 million are stunted, while 41 million are overweight or obese. Malnutrition plays a role in approximately 45% of deaths of children under the age of 5. These deaths occur mainly in low- and middle-income countries, where rates of childhood overweight and obesity are rising at the same time. The economic, social, medical and developmental consequences of the global burden of malnutrition are severe and persistent for individuals and their families, communities and countries. For this reason I chose this topic, which is certainly a subject of societal debate but above all a “poison” for the development of certain countries, such as my home country, Gabon.

This dataset was taken from an article posted by UNICEF, an international organization that works to prevent malnutrition. It includes 2 categorical variables and 6 quantitative variables. Throughout this project, I will explore quantitative variables such as “Underweight” and “Stunting” and the categorical variable “Income Classification” to see which countries are most affected by malnutrition. Then, using a Tableau Public map, I will identify the regions or continents most affected by malnutrition, and especially by child wasting.

For more details, the Kaggle link above gives access to the full dataset (the source).

Load the libraries and import the dataset

library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.4.0      ✔ purrr   0.3.5 
## ✔ tibble  3.1.8      ✔ dplyr   1.0.10
## ✔ tidyr   1.2.1      ✔ stringr 1.4.1 
## ✔ readr   2.1.3      ✔ forcats 0.5.2 
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
library(readr)
library(ggplot2)
library(plotly)
## 
## Attaching package: 'plotly'
## 
## The following object is masked from 'package:ggplot2':
## 
##     last_plot
## 
## The following object is masked from 'package:stats':
## 
##     filter
## 
## The following object is masked from 'package:graphics':
## 
##     layout
library(dplyr)
country_wise_average <- read_csv("C:/Users/claud/Downloads/country-wise-average.csv")
## Rows: 152 Columns: 8
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): Country
## dbl (7): Income Classification, Severe Wasting, Wasting, Overweight, Stuntin...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(country_wise_average)
## # A tibble: 6 × 8
##   Country     Income Classific…¹ Sever…² Wasting Overw…³ Stunt…⁴ Under…⁵ U5 Po…⁶
##   <chr>                    <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
## 1 AFGHANISTAN                  0    3.03   10.4     5.12    47.8   30.4    4919.
## 2 ALBANIA                      2    4.08    7.76   20.8     24.2    7.7     233.
## 3 ALGERIA                      2    2.73    5.94   12.8     19.6    7.34   3565.
## 4 ANGOLA                       1    2.4     6.93    2.55    42.6   23.6    3980.
## 5 ARGENTINA                    2    0.2     2.15   11.1     10.0    2.6    3614.
## 6 ARMENIA                      2    1.6     3.94   13.6     16.1    3.48    204.
## # … with abbreviated variable names ¹​`Income Classification`,
## #   ²​`Severe Wasting`, ³​Overweight, ⁴​Stunting, ⁵​Underweight,
## #   ⁶​`U5 Population ('000s)`
#Keep a copy of the imported dataset under a shorter name
malnutrition <- country_wise_average
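#Create copies of two columns under names without spaces
# Note: format() returns character strings, so both copies are text rather than
# numbers, and a missing value becomes the string "NA". The format = "%Y"
# argument is a date format and does not appear to affect numeric input.
malnutrition$SevereWasting <- format(malnutrition$'Severe Wasting', format = "%Y")
malnutrition$IncomeClassification <- format(malnutrition$'Income Classification', format = "%Y")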
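If the goal is only to get column names without spaces, a rename keeps the values numeric. The sketch below is one possible approach, not the method used in this report. It uses two rows copied (rounded) from the `head()` output above, and `sample_rows` and `renamed` are illustrative names.

```r
library(dplyr)
library(tibble)

# Two rows copied (rounded) from the head() output above
sample_rows <- tibble(
  Country = c("AFGHANISTAN", "ALBANIA"),
  `Income Classification` = c(0, 2),
  `Severe Wasting` = c(3.03, 4.08)
)

# rename(new = old) changes only the column names; the values stay numeric
renamed <- sample_rows %>%
  rename(
    IncomeClassification = `Income Classification`,
    SevereWasting = `Severe Wasting`
  )

names(renamed)
sapply(renamed, class)
```

With numeric columns, `is.na()` and `mean()` keep working as expected on the renamed variables.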

Clean the dataset

Check for NA values in each column

Before any analysis, it is important to inspect and summarize the data to find missing values (NA), because they affect summaries and plots and are not always obvious when looking at the first rows.

na.malnutrition <- which(colSums(is.na(malnutrition)) >0)
sort(colSums(sapply(malnutrition[na.malnutrition], is.na)),decreasing = TRUE)
## Severe Wasting     Overweight        Wasting    Underweight       Stunting 
##             12              3              2              2              1
paste('Number of columns with missing values:', length(na.malnutrition))
## [1] "Number of columns with missing values: 5"
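The same per-column count can also be written as a single dplyr summary. This is a minimal sketch on a hypothetical toy table: the countries "A", "B", "C" and the positions of the NA values are invented for illustration and are not taken from the dataset.

```r
library(dplyr)
library(tibble)

# Hypothetical toy table: the NA positions are invented for illustration
toy <- tibble(
  Country = c("A", "B", "C"),
  Wasting = c(10.4, NA, 5.94),
  Stunting = c(47.8, 24.2, NA),
  Underweight = c(30.4, 7.7, 7.34)
)

# One row holding the number of NA values in every column
na_counts <- toy %>%
  summarise(across(everything(), ~ sum(is.na(.x))))

# Names of the columns that contain at least one NA
cols_with_na <- names(na_counts)[unlist(na_counts) > 0]

na_counts
cols_with_na
```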

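Remove unnecessary values

Remove NA values from the dataset

# Keep only the rows with no NA in the listed columns.
# Note: SevereWasting is the character copy made with format(), so its missing
# values are the text "NA" and is.na() does not catch them; the NAs in the
# original `Severe Wasting` column therefore remain (see the output below).
na.malnutrition1 <- malnutrition %>%
  filter(!is.na(Overweight) & !is.na(Wasting) & !is.na(Underweight) & !is.na(Stunting) & !is.na(SevereWasting))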

which(is.na(na.malnutrition1), arr.ind=TRUE)
##      row col
## [1,]   9   3
## [2,]  28   3
## [3,]  33   3
## [4,]  68   3
## [5,]  78   3
## [6,]  83   3
## [7,] 104   3
## [8,] 144   3
str(na.malnutrition1)
## spc_tbl_ [148 × 10] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
##  $ Country              : chr [1:148] "AFGHANISTAN" "ALBANIA" "ALGERIA" "ANGOLA" ...
##  $ Income Classification: num [1:148] 0 2 2 1 2 2 3 2 3 1 ...
##  $ Severe Wasting       : num [1:148] 3.03 4.08 2.73 2.4 0.2 ...
##  $ Wasting              : num [1:148] 10.35 7.76 5.94 6.93 2.15 ...
##  $ Overweight           : num [1:148] 5.12 20.8 12.83 2.55 11.12 ...
##  $ Stunting             : num [1:148] 47.8 24.2 19.6 42.6 10 ...
##  $ Underweight          : num [1:148] 30.38 7.7 7.34 23.6 2.6 ...
##  $ U5 Population ('000s): num [1:148] 4919 233 3565 3980 3614 ...
##  $ SevereWasting        : chr [1:148] " 3.0333333" " 4.0750000" " 2.7333333" " 2.4000000" ...
##  $ IncomeClassification : chr [1:148] "0" "2" "2" "1" ...
##  - attr(*, "spec")=
##   .. cols(
##   ..   Country = col_character(),
##   ..   `Income Classification` = col_double(),
##   ..   `Severe Wasting` = col_double(),
##   ..   Wasting = col_double(),
##   ..   Overweight = col_double(),
##   ..   Stunting = col_double(),
##   ..   Underweight = col_double(),
##   ..   `U5 Population ('000s)` = col_double()
##   .. )
##  - attr(*, "problems")=<externalptr>
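The 8 missing values still reported in column 3 (`Severe Wasting`) are consistent with how `format()` treats NA: the character copy holds the text "NA", which `is.na()` does not flag. The sketch below illustrates this on a hypothetical three-row table (the rows are invented) and shows `tidyr::drop_na()` applied to the numeric column as one possible alternative.

```r
library(tidyr)
library(tibble)

# Hypothetical toy table with one missing value
toy <- tibble(
  Country = c("A", "B", "C"),
  `Severe Wasting` = c(3.03, NA, 2.73)
)

# format() returns character strings, so the gap becomes the text "NA"
as_text <- format(toy$`Severe Wasting`)
is.na(as_text)                # no element is flagged as missing
is.na(toy$`Severe Wasting`)   # the numeric column still flags the gap

# drop_na() on the numeric column removes the incomplete row
complete_rows <- drop_na(toy, `Severe Wasting`)
nrow(complete_rows)
```

Whether rows with a missing `Severe Wasting` value should be dropped at all depends on which variables the later plots use.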

Summarize and view the updated dataset

summary(na.malnutrition1)
##    Country          Income Classification Severe Wasting      Wasting      
##  Length:148         Min.   :0.000         Min.   : 0.000   Min.   : 0.000  
##  Class :character   1st Qu.:1.000         1st Qu.: 0.900   1st Qu.: 3.288  
##  Mode  :character   Median :1.000         Median : 1.873   Median : 5.770  
##                     Mean   :1.405         Mean   : 2.169   Mean   : 6.635  
##                     3rd Qu.:2.000         3rd Qu.: 2.823   3rd Qu.: 8.775  
##                     Max.   :3.000         Max.   :11.400   Max.   :23.650  
##                                           NA's   :8                        
##    Overweight         Stunting      Underweight     U5 Population ('000s)
##  Min.   : 0.9625   Min.   : 1.00   Min.   : 0.100   Min.   :     1.0     
##  1st Qu.: 3.8025   1st Qu.:13.94   1st Qu.: 4.315   1st Qu.:   241.8     
##  Median : 6.2750   Median :25.56   Median :10.380   Median :   981.2     
##  Mean   : 7.1800   Mean   :26.11   Mean   :13.576   Mean   :  4122.1     
##  3rd Qu.: 9.0700   3rd Qu.:36.86   3rd Qu.:19.712   3rd Qu.:  3145.6     
##  Max.   :26.5000   Max.   :57.60   Max.   :46.267   Max.   :123014.5     
##                                                                          
##  SevereWasting      IncomeClassification
##  Length:148         Length:148          
##  Class :character   Class :character    
##  Mode  :character   Mode  :character    
##                                         
##                                         
##                                         
## 

Create one data visualization with this dataset

Step1: Filter the dataset

The dataset has several variables, and we are not going to use them all, so we start by filtering out the columns that are not needed.

na.malnutrition2 <- select(na.malnutrition1, -`SevereWasting`)
str(na.malnutrition2)
## tibble [148 × 9] (S3: tbl_df/tbl/data.frame)
##  $ Country              : chr [1:148] "AFGHANISTAN" "ALBANIA" "ALGERIA" "ANGOLA" ...
##  $ Income Classification: num [1:148] 0 2 2 1 2 2 3 2 3 1 ...
##  $ Severe Wasting       : num [1:148] 3.03 4.08 2.73 2.4 0.2 ...
##  $ Wasting              : num [1:148] 10.35 7.76 5.94 6.93 2.15 ...
##  $ Overweight           : num [1:148] 5.12 20.8 12.83 2.55 11.12 ...
##  $ Stunting             : num [1:148] 47.8 24.2 19.6 42.6 10 ...
##  $ Underweight          : num [1:148] 30.38 7.7 7.34 23.6 2.6 ...
##  $ U5 Population ('000s): num [1:148] 4919 233 3565 3980 3614 ...
##  $ IncomeClassification : chr [1:148] "0" "2" "2" "1" ...

Plot all variables– distribution

This plot presents an overview of the variables available for our visualization.

#Plot all variables-- distribution 
# Keep the numeric columns and reshape them to long format
# (one row per variable/value pair) so they can be faceted
DATAoverview <- na.malnutrition2 %>%
  as_tibble() %>%
  select_if(is.numeric) %>%
  gather(key = "variable", value = "value")
#Boxplot
ggplot(DATAoverview, aes(value)) +
  geom_boxplot(fill = "darkorchid1") +
  facet_wrap(~variable, scales = "free")
## Warning: Removed 8 rows containing non-finite values (`stat_boxplot()`).

#Density plot
ggplot(DATAoverview, aes(value)) +
  geom_density(fill = "slateblue") +
  facet_wrap(~variable, scales = "free")
## Warning: Removed 8 rows containing non-finite values (`stat_density()`).
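`gather()` and `select_if()` still work, but tidyr and dplyr document them as superseded by `pivot_longer()` and `select(where(...))`. The following is a minimal sketch of the same reshaping with the newer verbs, using three rows copied (rounded) from the `head()` output. The names `sample_rows`, `long` and `p` are illustrative.

```r
library(dplyr)
library(tidyr)
library(tibble)
library(ggplot2)

# Three rows copied (rounded) from the head() output
sample_rows <- tibble(
  Country = c("AFGHANISTAN", "ALBANIA", "ALGERIA"),
  Wasting = c(10.4, 7.76, 5.94),
  Stunting = c(47.8, 24.2, 19.6),
  Underweight = c(30.4, 7.7, 7.34)
)

# Keep numeric columns, then stack them into variable/value pairs
long <- sample_rows %>%
  select(where(is.numeric)) %>%
  pivot_longer(everything(), names_to = "variable", values_to = "value")

# One boxplot panel per variable, each with its own axis range
p <- ggplot(long, aes(x = value)) +
  geom_boxplot(fill = "darkorchid1") +
  facet_wrap(~variable, scales = "free")
p
```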

Plot World’s Economies Classification

UNICEF classifies the world’s economies into four income groups: low income (coded 0), lower middle income (coded 1), upper middle income (coded 2) and high income (coded 3).

na.malnutrition3 <- ggplot(na.malnutrition2, aes(x = IncomeClassification, fill = IncomeClassification)) +
  geom_bar() +
  scale_fill_brewer(palette = "PuBu") +
  labs(title = "World's Economies Classification") +
  theme_minimal() +
  xlab("Income Groups") +
  ylab("Count")
na.malnutrition3

We can see that, on a global scale, upper-middle-income economies (coded 2) are the most numerous.
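Because the income groups are stored as the codes 0 to 3, a labelled factor can make tables and plot legends easier to read. The sketch below assumes the code-to-label mapping described above and uses the six codes shown in the `head()` output. `codes` and `income_group` are illustrative names.

```r
# Income codes of the six countries shown in the head() output
codes <- c(0, 2, 2, 1, 2, 2)

# Map each code to its label; levels with no country are kept with a count of 0
income_group <- factor(
  codes,
  levels = 0:3,
  labels = c("Low income", "Lower middle income",
             "Upper middle income", "High income")
)

table(income_group)
```

Using such a factor as the `x` and `fill` aesthetic would label the bars by group name instead of by code.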

Plot 3

In this plot, we use a histogram to show how Underweight varies with Income Classification.

ggplot(na.malnutrition2)+
  geom_histogram(mapping=aes(x=Underweight, color=IncomeClassification),bins=15,
                 fill='white')+
 labs(title = "Histogram of underweight population by Income classification") + 
  xlab("underweight") +
  ylab("Count") +
  theme_minimal()

We can see that low- and middle-income countries are the most affected by malnutrition.

Plot 4

The histogram below shows how the percentage of underweight children (children suffering from malnutrition with a weight deficiency) is distributed across countries in 2016.

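# The x axis carries the underweight percentage; the y axis counts countries per bin
ggplot(na.malnutrition2) +
  geom_histogram(aes(x = Underweight), colour = "blue", fill = "blue") +
  labs(
    title = "Distribution of values for the variable 'Underweight'", 
    x = 'Underweight (% of children below age 5)', 
    y = 'Number of countries'
  )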
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

The average Underweight value is:

Calculate the average of Underweight

na.malnutrition2 %>%
  summarize(avg.weight = mean(Underweight))
## # A tibble: 1 × 1
##   avg.weight
##        <dbl>
## 1       13.6
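The overall mean can be broken down by income group to support the comparison made earlier. This is a minimal sketch using the six rows from the `head()` output (rounded values). `sample_rows` and `avg_by_group` are illustrative names, and `na.rm = TRUE` is added as a precaution for columns that contain missing values.

```r
library(dplyr)
library(tibble)

# Six rows copied (rounded) from the head() output
sample_rows <- tibble(
  Country = c("AFGHANISTAN", "ALBANIA", "ALGERIA", "ANGOLA", "ARGENTINA", "ARMENIA"),
  IncomeClassification = c("0", "2", "2", "1", "2", "2"),
  Underweight = c(30.4, 7.7, 7.34, 23.6, 2.6, 3.48)
)

# Mean underweight percentage and number of countries per income group
avg_by_group <- sample_rows %>%
  group_by(IncomeClassification) %>%
  summarize(
    avg_underweight = mean(Underweight, na.rm = TRUE),
    n_countries = n()
  )

avg_by_group
```

With only six countries the group means are not representative; the same pipeline applied to `na.malnutrition2` would use all 148 rows.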

Plot 5: Graphical representation of the variable ‘Wasting’, where values are more than 5

na.malnutrition2 %>%
  filter(Wasting > 5) %>%
  ggplot() +
  # fill is a fixed colour, so it is set outside aes()
  geom_density(aes(x = Wasting), fill = "coral3") +
  labs(
    title = "Density of wasting values above 5.00", 
    x = 'Wasting', 
    y = 'Density'
  )

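Plot 7: Graphical representation of values for the variable ‘Severe wasting’

# Note: x and y below are simulated with rnorm(); this block does not read
# the `Severe Wasting` column of the dataset.
set.seed(3)
x <- 1:20
y <- x + rnorm(20, mean = 0, sd = 10)
plot(x, y, pch = 19, col = "black")
# Add the fitted least-squares line in red
abline(lm(y ~ x), col = "red", lwd = 3)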
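The block above draws simulated points. A comparable scatter plot with a fitted line could be built from the dataset itself, for example `Severe Wasting` against `Wasting`. The sketch below is an assumption about what such a plot might look like, not the author's original figure. It uses the six rows from the `head()` output (rounded values), and `sample_rows`, `fit` and `p` are illustrative names.

```r
library(ggplot2)
library(tibble)

# Six rows copied (rounded) from the head() output
sample_rows <- tibble(
  Wasting = c(10.4, 7.76, 5.94, 6.93, 2.15, 3.94),
  `Severe Wasting` = c(3.03, 4.08, 2.73, 2.4, 0.2, 1.6)
)

# Least-squares fit of severe wasting on wasting
fit <- lm(`Severe Wasting` ~ Wasting, data = sample_rows)
coef(fit)

# Scatter plot with the fitted line drawn by geom_smooth()
p <- ggplot(sample_rows, aes(x = Wasting, y = `Severe Wasting`)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE, colour = "red")
p
```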

Plot 8: Graphical representation of the relationship between ‘Underweight’ and ‘Stunting’

Malnutrition6 <- ggplot(na.malnutrition2, aes(x=Underweight, y=Stunting, col=Country))+
  geom_point(alpha=0.5)+
   xlab("Underweight")+
  ylab("Stunting")+
  labs(title = "Underweight vs Stunting") +
  theme_light() 

Malnutrition6  <- ggplotly(Malnutrition6)
Malnutrition6

Burundi ranks as the country most affected by stunting followed by Guatemala, Timor-Leste and Bangladesh.
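Rankings like this one can be read off the interactive plot, or computed directly. The sketch below defines a hypothetical helper, `top_countries()` (not part of the original analysis), based on `dplyr::slice_max()`, and applies it to the six rows from the `head()` output, so its result covers only those six countries.

```r
library(dplyr)
library(tibble)

# Six rows copied (rounded) from the head() output
sample_rows <- tibble(
  Country = c("AFGHANISTAN", "ALBANIA", "ALGERIA", "ANGOLA", "ARGENTINA", "ARMENIA"),
  Wasting = c(10.4, 7.76, 5.94, 6.93, 2.15, 3.94),
  Stunting = c(47.8, 24.2, 19.6, 42.6, 10.0, 16.1)
)

# Hypothetical helper: the n countries with the highest value of `column`
top_countries <- function(data, column, n = 3) {
  data %>%
    slice_max(order_by = {{ column }}, n = n) %>%
    select(Country, {{ column }})
}

top_stunting <- top_countries(sample_rows, Stunting, n = 2)
top_stunting
```

The same helper can be called with `Wasting` or any other numeric column.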

Plot 9: Graphical representation of the relationship between ‘Wasting’ and ‘Stunting’

Malnutrition7 <- ggplot(na.malnutrition2, aes(x=Stunting, y=Wasting, col=Country))+
  geom_point(alpha=0.5)+
   xlab("Stunting")+
  ylab("Wasting")+
  labs(title = "Stunting vs wasting") +
  theme_light() 

Malnutrition7  <- ggplotly(Malnutrition7)
Malnutrition7

South Sudan ranks as the country most affected by wasting, followed by India, Sri Lanka and Djibouti.

Quick comment

According to the results of our visualization, the African continent, which has the majority of underdeveloped and developing countries, is the most affected by this scourge. Burundi leads the African countries affected by severe wasting in 2016. According to global data published by UNICEF, Burundi ranks fourth worldwide, behind India, Indonesia and Pakistan. The other countries are Bangladesh, DR Congo, Ethiopia, the Philippines, Niger and South Africa. Burundi had exactly 482,590 cases of wasted children. UNICEF reports that about one in five deaths among children is linked to wasting, adding that wasting is caused by a lack of nutritious foods.
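Case counts of this kind can be approximated from the dataset by combining a prevalence column with the under-5 population. The sketch below assumes that `Wasting` is a percentage of children under 5 and that `U5 Population ('000s)` is expressed in thousands, as its name suggests. It uses two rows from the `head()` output (rounded), so the result is a rough estimate and is not meant to reproduce the UNICEF figures quoted above.

```r
library(dplyr)
library(tibble)

# Two rows copied (rounded) from the head() output
sample_rows <- tibble(
  Country = c("AFGHANISTAN", "ANGOLA"),
  Wasting = c(10.4, 6.93),
  `U5 Population ('000s)` = c(4919, 3980)
)

# children = population in thousands * 1000 * (percentage / 100)
estimates <- sample_rows %>%
  mutate(wasted_children = `U5 Population ('000s)` * 1000 * Wasting / 100) %>%
  arrange(desc(wasted_children))

estimates
```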

Background Research Summary

https://www.afro.who.int/news/whos-africa-nutrition-report-highlights-increase-malnutrition-africa

Ensuring better nutrition for populations is a major challenge to be met in order to achieve the Sustainable Development Goals. In Africa, malnutrition is rife and kills millions of people, mainly children, every year. Malnutrition is a term that covers several factors; it is not just about a lack of food. Poor nutrition leaves a child with consequences that last a lifetime. When it is chronic, it is responsible for the delays in both physical and intellectual growth observed in some children. The consequences of nutritional insufficiency in children are often irreversible, and chronic diseases can develop later when malnourished children grow up.

In Africa, the problem is critical. Many children die before the age of 5, and the main cause of these deaths is severe acute malnutrition. It is estimated that 12% of children under the age of 5 die due to a lack of, or insufficient, breastfeeding. After declining for about a decade, hunger appears to be coming back and affects a large part of the world’s population: 11% according to the most recent UN report on food security. According to the same report, the rate of malnutrition in Africa continues to rise. Africa alone has 257 million people affected by malnutrition, including 237 million in the sub-Saharan region against 20 million in the north of the continent. This is an increase of 34.5 million people compared to 2015. According to some researchers, Africa is at risk of not achieving the second Sustainable Development Goal, which is to eradicate hunger.

But what are the key drivers of malnutrition in Africa? Disease, hunger and poverty are the main factors. Poor living conditions, lack of education, precarious livelihoods, and lack of access to health care and to healthy, nutritious food are all conditions conducive to malnutrition in Africa.

According to some experts, eradicating hunger does not necessarily guarantee better nutrition for populations. Rather, it invites us to think about how to ensure not only access to sufficient quantities of food but also a healthy diet rich in essential nutrients. In its report on nutrition and food security in Africa published in early 2017, UNICEF considers that efforts remain insufficient. In one year, from 2015 to 2016, the number of people affected by malnutrition on the continent rose from 200 million to 224 million. This increase in stunting is also accompanied by a drastic increase in the rates of obesity and overweight, mainly in Southern Africa.

Now, what do I suggest to solve this problem? In my opinion, new policies are needed to help the ever-increasing number of people dying of malnutrition in Africa. They must focus on distributing food to the most deprived people on a weekly basis, controlling the quality of the drinks and food consumed by the population, promoting fresh fruits and vegetables, and strengthening the agricultural sector.

This visualization means a lot to me because I come from a country that is strongly affected by malnutrition, and the phenomenon has remained alarming over the years. It draws my attention to the idea of creating an organization to help vulnerable people around the world. Nevertheless, I remain convinced that this dataset does not contain enough information for my analysis. I would have liked more categorical variables, such as gender, age group and year, in order to explore the issues more fully.