Food Production Around the World

Intro

As the world population continues to grow, so does the concern of feeding everyone on the planet. While there is technically enough food to feed everyone, global hunger has yet to be eradicated.

One challenge is finding the correct diet. While science has found that certain diets are healthier than others, growing relative wealth in India and China has led to an increase in meat production (Godfray et al., 2010). Though vegetarian diets are more energy efficient, exploring how meat-based diets can continue sustainably without a complete switch to vegetarian diets is important not only for human and planetary health, but also for maintaining living conditions (Godfray et al., 2010). Understanding the role that global diets, and changes in those diets, play in food supply is important to ensuring that the planet can sustainably feed the growing population.

Another factor that could help food security, especially in lower-income countries, is encouraging personal gardens. These gardens help ensure that even if individuals cannot earn enough money for the family, there is still some access to food, which is especially important in low-income countries where affordable food is not always guaranteed (Rijanta, 2020). Encouraging these gardens in higher-income countries could also reduce reliance on meat and encourage greener communities.

Finally, it is also important to understand that the current global food supply chain generates a lot of waste. Beyond consumers not completely using products and producers using their resources inefficiently, a lot of food waste is generated “along the way,” and the reliance on plastics in food production has contributed to a global crisis of microplastics affecting the environment. There is hypothetically enough food to support the current population of Earth, but due to supply chain waste, many people still die of hunger (Dhiman & Mukherjee, 2021). Minimizing this waste, as well as creating more sustainable disposal, is extremely important, not only to supporting life on Earth, but also to the health of the planet. Minimizing the waste requires an understanding of global food production.

This project explores food production by region and by income group of various countries to gain a better understanding of global food production.

The data explored in this project is food (specifically meat) production around the world. The dataset from the World Bank includes how much each country produced over the past sixty years, where data is available, as well as population size. Region and income designations were added from World Bank data to help further understand global supply chains and how food production happens on a global scale.

The definitions of region and income group come from the World Bank. Income group is divided into four classes: high, upper middle, lower middle, and low. There are seven regions: North America, Latin America and the Caribbean, Europe and Central Asia, Middle East and North Africa, Sub-Saharan Africa, East Asia and the Pacific, and South Asia. Finally, production will primarily be explored as Production per capita (kg).
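To make the per capita measure concrete, here is a minimal sketch of the arithmetic, assuming that Production per capita (kg) is simply production in tonnes divided by population and converted to kilograms. The two input values are copied from the first Afghanistan row (1961) in the data preview printed later in this report; the variable names are made up for illustration.

```r
# Values copied from the 1961 Afghanistan row in the data preview.
production_t <- 129420    # Production (t)
population   <- 9169406   # Population

# Tonnes per person, then converted to kilograms (1 t = 1000 kg).
# Assumption: this is how the per capita columns were derived.
tonnes_per_capita <- production_t / population
kg_per_capita     <- tonnes_per_capita * 1000

round(tonnes_per_capita, 4)  # 0.0141, matching production__tonnes__per_capita
round(kg_per_capita, 2)      # 14.11
```

The tonnes-per-capita result agrees with the value shown for that row in the dataset, which supports the assumed definition.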

Load the libraries

library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.5     v purrr   0.3.4
## v tibble  3.1.6     v dplyr   1.0.7
## v tidyr   1.1.4     v stringr 1.4.0
## v readr   2.1.1     v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(gridExtra)
## Warning: package 'gridExtra' was built under R version 4.1.3
## 
## Attaching package: 'gridExtra'
## The following object is masked from 'package:dplyr':
## 
##     combine
library(ggplot2)

Load the data

setwd("C:/Users/kirae/OneDrive/Desktop/school files/math 217/final project")
foodprod <- read_csv("global-food.csv")
## Rows: 11648 Columns: 40
## -- Column specification --------------------------------------------------------
## Delimiter: ","
## chr  (2): Product, Country
## dbl (31): Year, Population, Production (t), production__tonnes__per_capita, ...
## lgl  (7): Yield (t/ha), Yield (kg/animal), Land Use (ha), area_harvested__ha...
## 
## i Use `spec()` to retrieve the full column specification for this data.
## i Specify the column types or set `show_col_types = FALSE` to quiet this message.
class <- read_csv("worldbankSESclasses.csv")
## Rows: 265 Columns: 6
## -- Column specification --------------------------------------------------------
## Delimiter: ","
## chr (6): Country, Code, Region, Income group, Lending category, Other (EMU o...
## 
## i Use `spec()` to retrieve the full column specification for this data.
## i Specify the column types or set `show_col_types = FALSE` to quiet this message.
fooddf <- foodprod %>% 
  left_join(class, by = "Country")
fooddf
## # A tibble: 11,648 x 45
##    Product     Country      Year Population `Production (t)` production__tonnes~
##    <chr>       <chr>       <dbl>      <dbl>            <dbl>               <dbl>
##  1 Meat, Total Afghanistan  1961    9169406           129420              0.0141
##  2 Meat, Total Afghanistan  1962    9351442           132206              0.0141
##  3 Meat, Total Afghanistan  1963    9543200           138971              0.0146
##  4 Meat, Total Afghanistan  1964    9744772           143830              0.0148
##  5 Meat, Total Afghanistan  1965    9956318           150195              0.0151
##  6 Meat, Total Afghanistan  1966   10174840           175210              0.0172
##  7 Meat, Total Afghanistan  1967   10399936           184232              0.0177
##  8 Meat, Total Afghanistan  1968   10637064           201632              0.0190
##  9 Meat, Total Afghanistan  1969   10893772           202232              0.0186
## 10 Meat, Total Afghanistan  1970   11173654           189120              0.0169
## # ... with 11,638 more rows, and 39 more variables:
## #   `Production per capita (kg)` <dbl>, `Yield (t/ha)` <lgl>,
## #   `Yield (kg/animal)` <lgl>, `Land Use (ha)` <lgl>,
## #   area_harvested__ha__per_capita <lgl>, `Land Use per capita (m²)` <lgl>,
## #   `Producing or slaughtered animals` <lgl>,
## #   `Producing or slaughtered animals per capita` <lgl>, `Imports (t)` <dbl>,
## #   imports__tonnes__per_capita <dbl>, `Imports per capita (kg)` <dbl>, ...
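Because the join above matches on country name, any name that is spelled differently in the two files (or any aggregate row such as a world total) would end up with NA for Region and Income group. The following is a small sketch of how that could be checked. The two data frames are hypothetical stand-ins for foodprod and class, and base R merge() is used in place of left_join() so the example runs without extra packages.

```r
# Hypothetical stand-ins for foodprod and class; names and values are illustrative.
food_toy <- data.frame(
  Country = c("Canada", "Canada", "Egypt", "World"),
  Year    = c(2018, 2019, 2019, 2019)
)
class_toy <- data.frame(
  Country = c("Canada", "Egypt, Arab Rep."),
  Region  = c("North America", "Middle East & North Africa")
)

# Country names in the production data with no match in the classification table.
unmatched <- setdiff(unique(food_toy$Country), class_toy$Country)
unmatched

# Base R equivalent of a left join: keep every production row.
joined <- merge(food_toy, class_toy, by = "Country", all.x = TRUE)

# Rows that received no Region because their country name did not match.
sum(is.na(joined$Region))
```

Running the same setdiff() on the real foodprod and class tables would show whether unmatched names contribute to the missing values reported later in the analysis; that has not been verified here.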

Explore the data

Overall Production

prod2019 <- fooddf %>%
  filter(Year == 2019)  # Year is numeric, so compare against a number
hist(prod2019$`Production per capita (kg)`, xlab = "Production per Capita (kg)", main = "Production per capita (kg) in 2019", col = "pink")

Production per capita in general is heavily right-skewed.
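A right skew can also be checked numerically rather than only by eye. The sketch below uses made-up values (not taken from the dataset) and a simple moment-based skewness formula written out by hand; a mean well above the median and a positive skewness both point to a long right tail.

```r
# Hypothetical per capita values (kg); not taken from the dataset.
x <- c(2, 4, 5, 7, 9, 12, 15, 40, 110)

# In a right-skewed sample the mean is pulled above the median.
mean(x)
median(x)

# Moment-based skewness: positive values indicate a right skew.
skew <- mean((x - mean(x))^3) / mean((x - mean(x))^2)^1.5
skew
```

For the real data the same two checks could be applied to the 2019 Production per capita (kg) column, with na.rm = TRUE added if that column contains missing values.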

Create dataframes for Income Groups

# One data frame per World Bank income group, restricted to 2019
lowinc <- fooddf %>%
  filter(`Income group` == "Low income", Year == 2019)
lowmidinc <- fooddf %>%
  filter(`Income group` == "Lower middle income", Year == 2019)
highmidinc <- fooddf %>%
  filter(`Income group` == "Upper middle income", Year == 2019)
highinc <- fooddf %>%
  filter(`Income group` == "High income", Year == 2019)

Create Histograms for Income Groups

par(mfrow = c(2, 2))
# Shared x-axis limits so the four panels are directly comparable
xlimits <- range(highinc$`Production per capita (kg)`, na.rm = TRUE)
hist(lowinc$`Production per capita (kg)`, xlab = "Production per Capita (kg)", main = "Low Income", col = "#6c00ff", xlim = xlimits)
hist(lowmidinc$`Production per capita (kg)`, xlab = "Production per Capita (kg)", main = "Lower Middle Income", col = "#9547ff", xlim = xlimits)
hist(highmidinc$`Production per capita (kg)`, xlab = "Production per Capita (kg)", main = "Upper Middle Income", col = "#bc8bff", xlim = xlimits)
hist(highinc$`Production per capita (kg)`, xlab = "Production per Capita (kg)", main = "High Income", col = "#e8d7ff", xlim = xlimits)

There appears to be a trend of production per capita growing in higher income countries.

Side By Side Boxplots of Income Group

ggplot(fooddf, aes(x = `Income group`, y = `Production per capita (kg)`)) +
  geom_boxplot(fill = "#008cff", outlier.color = "#1e2663") +
  coord_flip()
## Warning: Removed 67 rows containing non-finite values (stat_boxplot).

The high income group has a higher median production per capita and much greater variation than any of the other income levels. The low income group has the lowest median production.
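Since the income groups are stored as text, plots and summaries list them alphabetically (High, Low, Lower middle, Upper middle) rather than from lowest to highest income. One possible way to put them in their natural order, and to compute the medians that a boxplot displays, is sketched below with a made-up data frame; the column names and values are illustrative only.

```r
# Hypothetical mini data set; values are illustrative only.
toy <- data.frame(
  income = c("High income", "Low income", "Upper middle income",
             "Lower middle income", "High income", "Low income"),
  kg     = c(95, 8, 40, 20, 120, 12)
)

# Order the levels from lowest to highest income instead of alphabetically.
income_levels <- c("Low income", "Lower middle income",
                   "Upper middle income", "High income")
toy$income <- factor(toy$income, levels = income_levels)

# Median production per capita within each group, reported in the new order.
group_medians <- tapply(toy$kg, toy$income, median)
group_medians
```

Applying the same factor() call to the `Income group` column before plotting would make the boxplots appear in income order.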

Create Dataframes for Regions

SA <- fooddf %>%
  filter(Region == "South Asia")
EU_CA <- fooddf %>%
  filter(Region == "Europe & Central Asia")
ME_NA <- fooddf %>%
  filter(Region == "Middle East & North Africa")
SSAf <- fooddf %>%
  filter(Region == "Sub-Saharan Africa")
LA_Car <- fooddf %>%
  filter(Region == "Latin America & Caribbean")
EA_Pac <- fooddf %>%
  filter(Region == "East Asia & Pacific")
NoAm <- fooddf %>%
  filter(Region == "North America")

Histograms of Regions

histSA <- ggplot(data = SA, aes(x = `Production per capita (kg)`)) +
  geom_histogram(binwidth = 1, colour = "#fec876", fill = "#fec876") +
  ggtitle("South Asia")
histEU_CA <- ggplot(data = EU_CA, aes(x = `Production per capita (kg)`)) +
  geom_histogram(binwidth = 15, colour = "#00a4fc", fill = "#00a4fc") +
  ggtitle("Europe & Central Asia")
histME_NA <- ggplot(data = ME_NA, aes(x = `Production per capita (kg)`)) +
  geom_histogram(binwidth = 5, colour = "#922c40", fill = "#922c40") + 
  ggtitle("Middle East & \nNorth Africa")
histSSAf <- ggplot(data = SSAf, aes(x = `Production per capita (kg)`)) +
  geom_histogram(binwidth = 5, colour = "grey", fill = "grey")+
  ggtitle("Sub-Saharan Africa")
histLA_Car <- ggplot(data = LA_Car, aes(x = `Production per capita (kg)`)) +
  geom_histogram(binwidth = 10, colour = "#dd76fe", fill = "#dd76fe")+
  ggtitle("Latin America & \nCaribbean")
histEA_Pac <- ggplot(data = EA_Pac, aes(x = `Production per capita (kg)`)) +
  geom_histogram(binwidth = 15, colour = "#fedd00", fill = "#fedd00")+
  ggtitle("East Asia & Pacific")
histNoAm <- ggplot(data = NoAm, aes(x = `Production per capita (kg)`)) +
  geom_histogram(binwidth = 5, colour = "#03C03C", fill = "#03C03C")+
  ggtitle("North America")
grid.arrange(histSA, histEU_CA, histME_NA, histSSAf, histLA_Car, histEA_Pac, histNoAm, nrow = 3)
## Warning: Removed 53 rows containing non-finite values (stat_bin).

The majority of the regions are right-skewed, with North America as the exception.
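The seven separate histograms could also be produced from a single data frame with facet_wrap(), which avoids building one data frame and one plot object per region. This is only a sketch on simulated data: the toy data frame and its kg column are hypothetical, and a single number of bins is used for every panel, whereas the plots above use a different binwidth per region.

```r
library(ggplot2)

# Hypothetical stand-in for fooddf with only the two columns needed here.
set.seed(1)
toy <- data.frame(
  Region = rep(c("South Asia", "North America", "Sub-Saharan Africa"), each = 50),
  kg     = c(rexp(50, 1 / 5), rnorm(50, 100, 15), rexp(50, 1 / 10))
)

# One panel per region; free x scales let each region use its own range.
p <- ggplot(toy, aes(x = kg)) +
  geom_histogram(bins = 20, fill = "#00a4fc") +
  facet_wrap(~ Region, scales = "free_x") +
  labs(x = "Production per capita (kg)")
p
```

The trade-off is less control over each panel (colour and binwidth are shared) in exchange for much shorter code.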

Side By Side Boxplots by Region

ggplot(fooddf, aes(x = Region, y = `Production per capita (kg)`)) +
  geom_boxplot(fill = "#ffc2c2", outlier.color = "#b83636") +
  coord_flip() 
## Warning: Removed 67 rows containing non-finite values (stat_boxplot).

Production varies across regions, with Europe and Central Asia as well as East Asia and Pacific showing the greatest spread in Production per capita (kg). The highest production appears to be in North America, followed by Europe and Central Asia.

North America Distribution

Side by Side Boxplots

ggplot(NoAm, aes(x=Country, y=`Production per capita (kg)`)) +
  geom_boxplot(fill = "#ff9ee9") +
  coord_flip()
## Warning: Removed 53 rows containing non-finite values (stat_boxplot).

Canada and the United States have similar distributions, with the United States' median slightly higher. Bermuda, which is included in North America, does not have any production data.

Explore the distribution of the United States and Canada

Data frames for the US and Canada

US <- fooddf %>%
  filter(Country == "United States")
Can <- fooddf %>%
  filter(Country == "Canada")

Histograms for the US and Canada

histUS <- ggplot(data = US, aes(x = `Production per capita (kg)`)) +
  geom_histogram(binwidth = 5, colour = "#556eff", fill = "#556eff")
histCan <- ggplot(data = Can, aes(x = `Production per capita (kg)`)) +
  geom_histogram(binwidth = 10, colour = "#ff7272", fill = "#ff7272")
grid.arrange(histUS, histCan, nrow = 1)

Both are distributed bimodally.

Multiple Linear Regression

Income Group, Region, and the Intersection of Income Group and Region

mfit4 <- lm(`Production per capita (kg)` ~ `Income group` + Region + `Income group`*Region, data = fooddf)
summary(mfit4)
## 
## Call:
## lm(formula = `Production per capita (kg)` ~ `Income group` + 
##     Region + `Income group` * Region, data = fooddf)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -85.820 -16.453  -4.159   6.503 312.716 
## 
## Coefficients: (8 not defined because of singularities)
##                                                                    Estimate
## (Intercept)                                                         90.1073
## `Income group`Low income                                            -2.3759
## `Income group`Lower middle income                                  -59.1733
## `Income group`Upper middle income                                  -66.1420
## RegionEurope & Central Asia                                          0.8091
## RegionLatin America & Caribbean                                    -37.4252
## RegionMiddle East & North Africa                                   -61.2229
## RegionNorth America                                                 23.8977
## RegionSouth Asia                                                   -20.6065
## RegionSub-Saharan Africa                                           -74.9420
## `Income group`Low income:RegionEurope & Central Asia                     NA
## `Income group`Lower middle income:RegionEurope & Central Asia       -3.8154
## `Income group`Upper middle income:RegionEurope & Central Asia       10.8198
## `Income group`Low income:RegionLatin America & Caribbean                 NA
## `Income group`Lower middle income:RegionLatin America & Caribbean   32.8784
## `Income group`Upper middle income:RegionLatin America & Caribbean   52.5939
## `Income group`Low income:RegionMiddle East & North Africa                NA
## `Income group`Lower middle income:RegionMiddle East & North Africa  47.6726
## `Income group`Upper middle income:RegionMiddle East & North Africa  58.8798
## `Income group`Low income:RegionNorth America                             NA
## `Income group`Lower middle income:RegionNorth America                    NA
## `Income group`Upper middle income:RegionNorth America                    NA
## `Income group`Low income:RegionSouth Asia                          -52.2140
## `Income group`Lower middle income:RegionSouth Asia                  -2.4762
## `Income group`Upper middle income:RegionSouth Asia                       NA
## `Income group`Low income:RegionSub-Saharan Africa                        NA
## `Income group`Lower middle income:RegionSub-Saharan Africa          58.1298
## `Income group`Upper middle income:RegionSub-Saharan Africa          81.7171
##                                                                    Std. Error
## (Intercept)                                                            1.9818
## `Income group`Low income                                               5.3855
## `Income group`Lower middle income                                      2.5351
## `Income group`Upper middle income                                      2.9171
## RegionEurope & Central Asia                                            2.2747
## RegionLatin America & Caribbean                                        2.9171
## RegionMiddle East & North Africa                                       2.7136
## RegionNorth America                                                    4.2039
## RegionSouth Asia                                                       5.6634
## RegionSub-Saharan Africa                                               5.6053
## `Income group`Low income:RegionEurope & Central Asia                       NA
## `Income group`Lower middle income:RegionEurope & Central Asia          5.1945
## `Income group`Upper middle income:RegionEurope & Central Asia          3.5916
## `Income group`Low income:RegionLatin America & Caribbean                   NA
## `Income group`Lower middle income:RegionLatin America & Caribbean      3.9485
## `Income group`Upper middle income:RegionLatin America & Caribbean      3.8352
## `Income group`Low income:RegionMiddle East & North Africa                  NA
## `Income group`Lower middle income:RegionMiddle East & North Africa     4.0910
## `Income group`Upper middle income:RegionMiddle East & North Africa     4.3380
## `Income group`Low income:RegionNorth America                               NA
## `Income group`Lower middle income:RegionNorth America                      NA
## `Income group`Upper middle income:RegionNorth America                      NA
## `Income group`Low income:RegionSouth Asia                              9.6175
## `Income group`Lower middle income:RegionSouth Asia                     6.2574
## `Income group`Upper middle income:RegionSouth Asia                         NA
## `Income group`Low income:RegionSub-Saharan Africa                          NA
## `Income group`Lower middle income:RegionSub-Saharan Africa             5.9901
## `Income group`Upper middle income:RegionSub-Saharan Africa             6.3705
##                                                                    t value
## (Intercept)                                                         45.468
## `Income group`Low income                                            -0.441
## `Income group`Lower middle income                                  -23.342
## `Income group`Upper middle income                                  -22.674
## RegionEurope & Central Asia                                          0.356
## RegionLatin America & Caribbean                                    -12.830
## RegionMiddle East & North Africa                                   -22.561
## RegionNorth America                                                  5.685
## RegionSouth Asia                                                    -3.639
## RegionSub-Saharan Africa                                           -13.370
## `Income group`Low income:RegionEurope & Central Asia                    NA
## `Income group`Lower middle income:RegionEurope & Central Asia       -0.735
## `Income group`Upper middle income:RegionEurope & Central Asia        3.013
## `Income group`Low income:RegionLatin America & Caribbean                NA
## `Income group`Lower middle income:RegionLatin America & Caribbean    8.327
## `Income group`Upper middle income:RegionLatin America & Caribbean   13.714
## `Income group`Low income:RegionMiddle East & North Africa               NA
## `Income group`Lower middle income:RegionMiddle East & North Africa  11.653
## `Income group`Upper middle income:RegionMiddle East & North Africa  13.573
## `Income group`Low income:RegionNorth America                            NA
## `Income group`Lower middle income:RegionNorth America                   NA
## `Income group`Upper middle income:RegionNorth America                   NA
## `Income group`Low income:RegionSouth Asia                           -5.429
## `Income group`Lower middle income:RegionSouth Asia                  -0.396
## `Income group`Upper middle income:RegionSouth Asia                      NA
## `Income group`Low income:RegionSub-Saharan Africa                       NA
## `Income group`Lower middle income:RegionSub-Saharan Africa           9.704
## `Income group`Upper middle income:RegionSub-Saharan Africa          12.827
##                                                                    Pr(>|t|)    
## (Intercept)                                                         < 2e-16 ***
## `Income group`Low income                                           0.659102    
## `Income group`Lower middle income                                   < 2e-16 ***
## `Income group`Upper middle income                                   < 2e-16 ***
## RegionEurope & Central Asia                                        0.722084    
## RegionLatin America & Caribbean                                     < 2e-16 ***
## RegionMiddle East & North Africa                                    < 2e-16 ***
## RegionNorth America                                                1.35e-08 ***
## RegionSouth Asia                                                   0.000276 ***
## RegionSub-Saharan Africa                                            < 2e-16 ***
## `Income group`Low income:RegionEurope & Central Asia                     NA    
## `Income group`Lower middle income:RegionEurope & Central Asia      0.462662    
## `Income group`Upper middle income:RegionEurope & Central Asia      0.002598 ** 
## `Income group`Low income:RegionLatin America & Caribbean                 NA    
## `Income group`Lower middle income:RegionLatin America & Caribbean   < 2e-16 ***
## `Income group`Upper middle income:RegionLatin America & Caribbean   < 2e-16 ***
## `Income group`Low income:RegionMiddle East & North Africa                NA    
## `Income group`Lower middle income:RegionMiddle East & North Africa  < 2e-16 ***
## `Income group`Upper middle income:RegionMiddle East & North Africa  < 2e-16 ***
## `Income group`Low income:RegionNorth America                             NA    
## `Income group`Lower middle income:RegionNorth America                    NA    
## `Income group`Upper middle income:RegionNorth America                    NA    
## `Income group`Low income:RegionSouth Asia                          5.81e-08 ***
## `Income group`Lower middle income:RegionSouth Asia                 0.692314    
## `Income group`Upper middle income:RegionSouth Asia                       NA    
## `Income group`Low income:RegionSub-Saharan Africa                        NA    
## `Income group`Lower middle income:RegionSub-Saharan Africa          < 2e-16 ***
## `Income group`Upper middle income:RegionSub-Saharan Africa          < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 40.27 on 8854 degrees of freedom
##   (2774 observations deleted due to missingness)
## Multiple R-squared:  0.343,  Adjusted R-squared:  0.3416 
## F-statistic: 243.3 on 19 and 8854 DF,  p-value: < 2.2e-16

Model:

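Production per capita (kg) = 90.1073 + B_IncomeGroup + B_Region + B_IncomeGroup:Region

Here 90.1073 is the intercept, the estimated production per capita for the reference levels (High income, East Asia & Pacific). Each B term is the coefficient from the table above that matches the country's income group, its region, and its income group by region combination; a term is zero when the country is at the reference level.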
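As a worked example of reading the coefficient table, the sketch below adds up the pieces for an upper middle income country in Latin America & Caribbean. The four numbers are copied from the summary(mfit4) output above; the variable names are made up for illustration. It assumes R's default treatment contrasts, under which the levels missing from the table (High income, East Asia & Pacific) are the reference levels.

```r
# Coefficients copied from the summary(mfit4) output above.
intercept       <- 90.1073   # reference: High income, East Asia & Pacific
b_upper_middle  <- -66.1420  # `Income group`Upper middle income
b_latin_america <- -37.4252  # RegionLatin America & Caribbean
b_interaction   <-  52.5939  # Upper middle income:RegionLatin America & Caribbean

# Estimated production per capita (kg) for this income group and region.
fitted_kg <- intercept + b_upper_middle + b_latin_america + b_interaction
fitted_kg  # 39.134
```

On the fitted model itself, predict() with a one-row data frame holding the same income group and region should return the same estimate.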

Plots

par(mfrow = c(2,2))
plot(mfit4)

Conclusion

Region and income group both play a role in countries' production. Variation can be seen across income groups and across regions; however, it was also important to consider the interaction between region and income group. Many regions include countries of varying income status, and only by looking at both together is there a fuller understanding of how they influence production.

The model has an adjusted R-squared of 0.3416, meaning that about 34% of the variation in production per capita is explained by this model. While region, income group, and their interaction provide some insight into production, other factors are clearly involved. The overall F-test is significant (p < 2.2e-16), and most of the estimated coefficients for income group, region, and their interaction have p-values below 0.01. Not every term is significant, however: the coefficients for Low income (p = 0.659) and Europe & Central Asia (p = 0.722), along with two of the interaction terms, are not distinguishable from zero, and 8 interaction coefficients could not be estimated because of singularities.
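Individual coefficient p-values do not directly answer whether the interaction as a whole improves the model. One common way to check that is to compare the model with and without the interaction using anova(). The sketch below does this on simulated data with a built-in interaction; the data frame, factor levels, and effect sizes are all hypothetical and are not results from this project.

```r
set.seed(42)

# Hypothetical data: two factors with a built-in interaction effect.
toy <- expand.grid(
  income = c("High", "Low"),
  region = c("A", "B"),
  rep    = 1:30
)
toy$kg <- 50 +
  ifelse(toy$income == "Low", -20, 0) +
  ifelse(toy$region == "B", -10, 0) +
  ifelse(toy$income == "Low" & toy$region == "B", 15, 0) +
  rnorm(nrow(toy), sd = 5)

# Main effects only versus main effects plus interaction.
fit_additive    <- lm(kg ~ income + region, data = toy)
fit_interaction <- lm(kg ~ income * region, data = toy)

# F-test for whether the interaction terms jointly improve the fit.
comparison <- anova(fit_additive, fit_interaction)
comparison
```

A small p-value in the second row of the comparison indicates that the interaction terms, taken together, explain variation that the main effects alone do not.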

However, it is also important to acknowledge that the diagnostic plots show problems. The residual plots show a pattern rather than random scatter, and the Q-Q plot shows right skew, so the usual linear regression assumptions are not fully met.
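One remedy that is often tried for right-skewed residuals is to model the logarithm of the response. The sketch below illustrates the idea on simulated data only; the group labels, effect sizes, and the hand-written skewness helper are hypothetical. Whether a log transform suits the actual production data, which may include zeros, has not been tested in this project.

```r
set.seed(7)

# Hypothetical right-skewed response: group effects act multiplicatively.
group <- factor(rep(c("g1", "g2", "g3"), each = 200))
group_means <- c(g1 = 2, g2 = 3, g3 = 4)
kg <- exp(group_means[as.character(group)] + rnorm(600, sd = 0.6))

# Moment-based skewness helper (illustrative, not from any package).
skewness <- function(x) mean((x - mean(x))^3) / mean((x - mean(x))^2)^1.5

# Same predictor, raw response versus log response.
fit_raw <- lm(kg ~ group)
fit_log <- lm(log(kg) ~ group)

# Residual skewness: close to zero is what the Q-Q plot is looking for.
skew_raw <- skewness(residuals(fit_raw))
skew_log <- skewness(residuals(fit_log))
c(raw = skew_raw, log = skew_log)
```

In this simulation the residuals of the log model are far less skewed than those of the raw model, which is the pattern one would hope to see if the transformation helps.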

This exploration showed that region and income group have some impact on production, and that there is variance within them. High-income countries generally see greater production per capita than lower-income countries. Regionally, production is greatest in North America and in Europe and Central Asia, which include most of the world powers of the past century.

Further studies might explore other factors to find a more accurate model. While most of the terms included in this model were significant, details on governments and political divisions as well as climate and environmental concerns might further explain the variation in production. The size of the country could also play a role.

While this project was limited by time and space, an exploration of all these variables over time might also generate a better understanding of how production has changed. Exploring why the distributions for North America are bimodal would also be interesting and could provide further insight.

Bibliography

Godfray, H. C. J., Beddington, J. R., Crute, I. R., Haddad, L., Lawrence, D., Muir, J. F., Pretty, J., Robinson, S., Thomas, S. M., & Toulmin, C. (2010). Food security: The challenge of feeding 9 billion people. Science, 327(5967), 812–818. https://doi.org/10.1126/science.1185383

Rijanta, R. (2020). The prospects & challenges of local foods production in rural Java, Indonesia: the case of Kulonprogo Regency. Journal of Studies & Research in Human Geography, 14(2), 321–335. https://doi-org.montgomerycollege.idm.oclc.org/10.5719/hgeo.2019.141.9

Dhiman, S., & Mukherjee, G. (2021). Present scenario and future scope of food waste to biofuel production. Journal of Food Process Engineering, 44(2), 1–21. https://doi-org.montgomerycollege.idm.oclc.org/10.1111/jfpe.13594