As the world population continues to grow, so does the concern of feeding everyone on the planet. While there is technically enough food to feed everyone, global hunger has yet to be eradicated.
One challenge is finding the correct diet. While science has found that certain diets are healthier than others, growing relative wealth in India and China has led to an increase in meat production (Godfray et al., 2010). Though vegetarian diets are more energy efficient, exploring how meat-based diets can continue sustainably without a complete switch to vegetarian diets is important not only for human and planetary health, but also for maintaining living conditions (Godfray et al., 2010). Understanding the role that global diets and dietary change play in food supply is important to ensuring that the planet can sustainably feed the growing population.
Another factor that could help food security, especially in lower-income countries, is encouraging personal gardens. These gardens help ensure that even if individuals cannot earn enough money for the family, there is still some access to food, which is especially important in low-income countries where affordable food is not always guaranteed (Rijanta, 2020). Encouraging these gardens in higher-income countries could also reduce reliance on meat and foster greener communities.
Finally, it is also important to understand that the current global food supply chain generates a lot of waste. Not only is there waste from consumers not completely using products and from producers' inefficient use of resources, there is also a lot of food waste generated “along the way,” and the reliance on plastics in food production has resulted in a global crisis of microplastics affecting the environment. There is hypothetically enough food to support the current life on Earth, but due to supply chain waste, many people are still dying of hunger (Dhiman & Mukherjee, 2021). Minimizing this waste and creating more sustainable disposal are extremely important, not only for supporting life on Earth but also for the health of the planet. Minimizing the waste also requires an understanding of global food production.
This project will explore food production by region and by country income group to gain a better understanding of global food production.
The data explored in this project are food (specifically meat) production figures from around the world. The dataset from the World Bank includes how much each country produced over the past sixty years, where data is available, as well as population size. Region and income designations from the World Bank were added to help further understand global supply chains and how food production happens on a global scale.
The definitions of region and income group were derived from the World Bank. Income group is divided into four classes: high, upper middle, lower middle, and low. There are seven regions: North America, Latin America and the Caribbean, Europe and Central Asia, Middle East and North Africa, Sub-Saharan Africa, East Asia and the Pacific, and South Asia. Finally, production will primarily be measured as production per capita (kg).
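As a quick check on the units, production per capita in kilograms is total production in tonnes divided by population, multiplied by 1,000. The sketch below uses the 1961 Afghanistan values that appear in the data preview further down; the variable names are illustrative and are not columns in the dataset.

```r
# Values taken from the first row of the data preview (Afghanistan, 1961)
production_t <- 129420   # total meat production, tonnes
population   <- 9169406  # number of people

# Tonnes per person, then converted to kilograms (1 t = 1000 kg)
per_capita_t  <- production_t / population
per_capita_kg <- per_capita_t * 1000

round(per_capita_t, 4)   # 0.0141, matching the tonnes-per-capita column
round(per_capita_kg, 1)  # 14.1
```

Working in kilograms rather than tonnes keeps the per-person values on a scale that is easier to read in plots and model output.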
library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.5 v purrr 0.3.4
## v tibble 3.1.6 v dplyr 1.0.7
## v tidyr 1.1.4 v stringr 1.4.0
## v readr 2.1.1 v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(gridExtra)
## Warning: package 'gridExtra' was built under R version 4.1.3
##
## Attaching package: 'gridExtra'
## The following object is masked from 'package:dplyr':
##
## combine
library(ggplot2)
setwd("C:/Users/kirae/OneDrive/Desktop/school files/math 217/final project")
foodprod <- read_csv("global-food.csv")
## Rows: 11648 Columns: 40
## -- Column specification --------------------------------------------------------
## Delimiter: ","
## chr (2): Product, Country
## dbl (31): Year, Population, Production (t), production__tonnes__per_capita, ...
## lgl (7): Yield (t/ha), Yield (kg/animal), Land Use (ha), area_harvested__ha...
##
## i Use `spec()` to retrieve the full column specification for this data.
## i Specify the column types or set `show_col_types = FALSE` to quiet this message.
class <- read_csv("worldbankSESclasses.csv")
## Rows: 265 Columns: 6
## -- Column specification --------------------------------------------------------
## Delimiter: ","
## chr (6): Country, Code, Region, Income group, Lending category, Other (EMU o...
##
## i Use `spec()` to retrieve the full column specification for this data.
## i Specify the column types or set `show_col_types = FALSE` to quiet this message.
fooddf <- foodprod %>%
  left_join(class, by = "Country")
fooddf
## # A tibble: 11,648 x 45
## Product Country Year Population `Production (t)` production__tonnes~
## <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Meat, Total Afghanistan 1961 9169406 129420 0.0141
## 2 Meat, Total Afghanistan 1962 9351442 132206 0.0141
## 3 Meat, Total Afghanistan 1963 9543200 138971 0.0146
## 4 Meat, Total Afghanistan 1964 9744772 143830 0.0148
## 5 Meat, Total Afghanistan 1965 9956318 150195 0.0151
## 6 Meat, Total Afghanistan 1966 10174840 175210 0.0172
## 7 Meat, Total Afghanistan 1967 10399936 184232 0.0177
## 8 Meat, Total Afghanistan 1968 10637064 201632 0.0190
## 9 Meat, Total Afghanistan 1969 10893772 202232 0.0186
## 10 Meat, Total Afghanistan 1970 11173654 189120 0.0169
## # ... with 11,638 more rows, and 39 more variables:
## # `Production per capita (kg)` <dbl>, `Yield (t/ha)` <lgl>,
## # `Yield (kg/animal)` <lgl>, `Land Use (ha)` <lgl>,
## # area_harvested__ha__per_capita <lgl>, `Land Use per capita (m²)` <lgl>,
## # `Producing or slaughtered animals` <lgl>,
## # `Producing or slaughtered animals per capita` <lgl>, `Imports (t)` <dbl>,
## # imports__tonnes__per_capita <dbl>, `Imports per capita (kg)` <dbl>, ...
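A left join keeps every production row, but any country whose name is spelled differently in the two files ends up with a missing region and income group. That is one possible source of the missing values reported later in the model summary. The sketch below shows a way to check for this, using small made-up tables in place of foodprod and class; the name mismatch shown is hypothetical and is not taken from the real files.

```r
# Toy stand-ins for the two tables; the real ones are foodprod and class
foodprod_toy <- data.frame(
  Country = c("Afghanistan", "Canada", "Czechia"),
  Year    = c(2019, 2019, 2019)
)
class_toy <- data.frame(
  Country = c("Afghanistan", "Canada", "Czech Republic"),
  Region  = c("South Asia", "North America", "Europe & Central Asia")
)

# Country names present in the production table but absent from the lookup
unmatched <- setdiff(unique(foodprod_toy$Country), class_toy$Country)
unmatched  # "Czechia"

# Base R equivalent of a left join: all.x = TRUE keeps every production row
joined <- merge(foodprod_toy, class_toy, by = "Country", all.x = TRUE)

# Rows that did not find a match carry NA for Region
sum(is.na(joined$Region))  # 1
```

Running the same setdiff() call on the real Country columns would list any names that need to be reconciled before the join.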
prod2019 <- fooddf %>%
  filter(Year == 2019)  # Year is numeric, so compare against a number
hist(prod2019$`Production per capita (kg)`, xlab = "Production per Capita (kg)", main = "Production per capita (kg) in 2019", col = "pink")
Production per capita overall is heavily right-skewed.
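One simple numeric check that supports what the histogram shows is to compare the mean with the median: in a right-skewed distribution the long upper tail pulls the mean above the median. The values below are made up for illustration and are not taken from the dataset.

```r
# Hypothetical per-capita values (kg) with a long right tail
x <- c(5, 8, 10, 12, 15, 20, 30, 60, 150)

mean(x)    # pulled upward by the few large values
median(x)  # stays close to the bulk of the data

# A mean well above the median is consistent with right skew
mean(x) > median(x)  # TRUE
```

On the real data, the same comparison could be run on the production per capita column of prod2019, passing na.rm = TRUE to both functions because the column contains missing values.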
lowinc <- fooddf %>%
  filter(`Income group` == "Low income", Year == 2019)
lowmidinc <- fooddf %>%
  filter(`Income group` == "Lower middle income", Year == 2019)
highmidinc <- fooddf %>%
  filter(`Income group` == "Upper middle income", Year == 2019)
highinc <- fooddf %>%
  filter(`Income group` == "High income", Year == 2019)
par(mfrow = c(2, 2))
# Shared x-axis limits taken from the high income group; na.rm guards against missing values
xlimits <- range(highinc$`Production per capita (kg)`, na.rm = TRUE)
hist(lowinc$`Production per capita (kg)`, xlab = "Production per Capita (kg)", main = "Low Income", col = "#6c00ff", xlim = xlimits)
hist(lowmidinc$`Production per capita (kg)`, xlab = "Production per Capita (kg)", main = "Lower Middle Income", col = "#9547ff", xlim = xlimits)
hist(highmidinc$`Production per capita (kg)`, xlab = "Production per Capita (kg)", main = "Upper Middle Income", col = "#bc8bff", xlim = xlimits)
hist(highinc$`Production per capita (kg)`, xlab = "Production per Capita (kg)", main = "High Income", col = "#e8d7ff", xlim = xlimits)
Production per capita appears to increase with income group.
ggplot(fooddf, aes(x = `Income group`, y = `Production per capita (kg)`)) +
  geom_boxplot(fill = "#008cff", outlier.color = "#1e2663") +
  coord_flip()
## Warning: Removed 67 rows containing non-finite values (stat_boxplot).
The high income group has a higher median production and much greater variation than any of the other income levels. The low income group has the lowest median production.
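A boxplot draws the median as its centre line, so comparing group centres numerically means computing the median (and, if desired, the mean) for each group. The sketch below does this with tapply() on a small made-up table; the column names income and kg and all values are illustrative only.

```r
# Toy data standing in for fooddf; values are made up for illustration
toy <- data.frame(
  income = c("High income", "High income", "High income",
             "Low income", "Low income", "Low income"),
  kg     = c(40, 90, 200, 5, 10, 12)
)

# Median and mean per income group; tapply returns a named vector
med <- tapply(toy$kg, toy$income, median)
avg <- tapply(toy$kg, toy$income, mean)

med  # High income 90, Low income 10
avg  # High income 110, Low income 9
```

With skewed data such as this, the mean and median can differ noticeably within a group, which is why it helps to state which one is being compared.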
SA <- fooddf %>%
filter(Region == "South Asia")
EU_CA <- fooddf %>%
filter(Region == "Europe & Central Asia")
ME_NA <- fooddf %>%
filter(Region == "Middle East & North Africa")
SSAf <- fooddf %>%
filter(Region == "Sub-Saharan Africa")
LA_Car <- fooddf %>%
filter(Region == "Latin America & Caribbean")
EA_Pac <- fooddf %>%
filter(Region == "East Asia & Pacific")
NoAm <- fooddf %>%
filter(Region == "North America")
histSA <- ggplot(data = SA, aes(x = `Production per capita (kg)`)) +
geom_histogram(binwidth = 1, colour = "#fec876", fill = "#fec876") +
ggtitle("South Asia")
histEU_CA <- ggplot(data = EU_CA, aes(x = `Production per capita (kg)`)) +
geom_histogram(binwidth = 15, colour = "#00a4fc", fill = "#00a4fc") +
ggtitle("Europe & Central Asia")
histME_NA <- ggplot(data = ME_NA, aes(x = `Production per capita (kg)`)) +
geom_histogram(binwidth = 5, colour = "#922c40", fill = "#922c40") +
ggtitle("Middle East & \nNorth Africa")
histSSAf <- ggplot(data = SSAf, aes(x = `Production per capita (kg)`)) +
geom_histogram(binwidth = 5, colour = "grey", fill = "grey")+
ggtitle("Sub-Saharan Africa")
histLA_Car <- ggplot(data = LA_Car, aes(x = `Production per capita (kg)`)) +
geom_histogram(binwidth = 10, colour = "#dd76fe", fill = "#dd76fe")+
ggtitle("Latin America & \nCaribbean")
histEA_Pac <- ggplot(data = EA_Pac, aes(x = `Production per capita (kg)`)) +
geom_histogram(binwidth = 15, colour = "#fedd00", fill = "#fedd00")+
ggtitle("East Asia & Pacific")
histNoAm <- ggplot(data = NoAm, aes(x = `Production per capita (kg)`)) +
geom_histogram(binwidth = 5, colour = "#03C03C", fill = "#03C03C")+
ggtitle("North America")
grid.arrange(histSA, histEU_CA, histME_NA, histSSAf, histLA_Car, histEA_Pac, histNoAm, nrow = 3)
## Warning: Removed 53 rows containing non-finite values (stat_bin).
Most of the regions show a single, right-skewed peak; North America is the exception.
ggplot(fooddf, aes(x = Region, y = `Production per capita (kg)`)) +
  geom_boxplot(fill = "#ffc2c2", outlier.color = "#b83636") +
  coord_flip()
## Warning: Removed 67 rows containing non-finite values (stat_boxplot).
Production varies across regions, with Europe and Central Asia and East Asia and the Pacific showing the greatest spread in production per capita (kg). The greatest production appears to be in North America, followed by Europe and Central Asia.
ggplot(NoAm, aes(x=Country, y=`Production per capita (kg)`)) +
geom_boxplot(fill = "#ff9ee9") +
coord_flip()
## Warning: Removed 53 rows containing non-finite values (stat_boxplot).
Canada and the United States have similar distributions, with the United States median slightly higher. Bermuda, which is included in North America, does not have any production data.
Data frames for the US and Canada
US <- fooddf %>%
filter(Country == "United States")
Can <- fooddf %>%
filter(Country == "Canada")
Histograms for the US and Canada
histUS <- ggplot(data = US, aes(x = `Production per capita (kg)`)) +
geom_histogram(binwidth = 5, colour = "#556eff", fill = "#556eff")
histCan <- ggplot(data = Can, aes(x = `Production per capita (kg)`)) +
geom_histogram(binwidth = 10, colour = "#ff7272", fill = "#ff7272")
grid.arrange(histUS, histCan, nrow = 1)
Both are distributed bimodally.
mfit4 <- lm(`Production per capita (kg)` ~ `Income group` + Region + `Income group`*Region, data = fooddf)
summary(mfit4)
##
## Call:
## lm(formula = `Production per capita (kg)` ~ `Income group` +
## Region + `Income group` * Region, data = fooddf)
##
## Residuals:
## Min 1Q Median 3Q Max
## -85.820 -16.453 -4.159 6.503 312.716
##
## Coefficients: (8 not defined because of singularities)
## Estimate
## (Intercept) 90.1073
## `Income group`Low income -2.3759
## `Income group`Lower middle income -59.1733
## `Income group`Upper middle income -66.1420
## RegionEurope & Central Asia 0.8091
## RegionLatin America & Caribbean -37.4252
## RegionMiddle East & North Africa -61.2229
## RegionNorth America 23.8977
## RegionSouth Asia -20.6065
## RegionSub-Saharan Africa -74.9420
## `Income group`Low income:RegionEurope & Central Asia NA
## `Income group`Lower middle income:RegionEurope & Central Asia -3.8154
## `Income group`Upper middle income:RegionEurope & Central Asia 10.8198
## `Income group`Low income:RegionLatin America & Caribbean NA
## `Income group`Lower middle income:RegionLatin America & Caribbean 32.8784
## `Income group`Upper middle income:RegionLatin America & Caribbean 52.5939
## `Income group`Low income:RegionMiddle East & North Africa NA
## `Income group`Lower middle income:RegionMiddle East & North Africa 47.6726
## `Income group`Upper middle income:RegionMiddle East & North Africa 58.8798
## `Income group`Low income:RegionNorth America NA
## `Income group`Lower middle income:RegionNorth America NA
## `Income group`Upper middle income:RegionNorth America NA
## `Income group`Low income:RegionSouth Asia -52.2140
## `Income group`Lower middle income:RegionSouth Asia -2.4762
## `Income group`Upper middle income:RegionSouth Asia NA
## `Income group`Low income:RegionSub-Saharan Africa NA
## `Income group`Lower middle income:RegionSub-Saharan Africa 58.1298
## `Income group`Upper middle income:RegionSub-Saharan Africa 81.7171
## Std. Error
## (Intercept) 1.9818
## `Income group`Low income 5.3855
## `Income group`Lower middle income 2.5351
## `Income group`Upper middle income 2.9171
## RegionEurope & Central Asia 2.2747
## RegionLatin America & Caribbean 2.9171
## RegionMiddle East & North Africa 2.7136
## RegionNorth America 4.2039
## RegionSouth Asia 5.6634
## RegionSub-Saharan Africa 5.6053
## `Income group`Low income:RegionEurope & Central Asia NA
## `Income group`Lower middle income:RegionEurope & Central Asia 5.1945
## `Income group`Upper middle income:RegionEurope & Central Asia 3.5916
## `Income group`Low income:RegionLatin America & Caribbean NA
## `Income group`Lower middle income:RegionLatin America & Caribbean 3.9485
## `Income group`Upper middle income:RegionLatin America & Caribbean 3.8352
## `Income group`Low income:RegionMiddle East & North Africa NA
## `Income group`Lower middle income:RegionMiddle East & North Africa 4.0910
## `Income group`Upper middle income:RegionMiddle East & North Africa 4.3380
## `Income group`Low income:RegionNorth America NA
## `Income group`Lower middle income:RegionNorth America NA
## `Income group`Upper middle income:RegionNorth America NA
## `Income group`Low income:RegionSouth Asia 9.6175
## `Income group`Lower middle income:RegionSouth Asia 6.2574
## `Income group`Upper middle income:RegionSouth Asia NA
## `Income group`Low income:RegionSub-Saharan Africa NA
## `Income group`Lower middle income:RegionSub-Saharan Africa 5.9901
## `Income group`Upper middle income:RegionSub-Saharan Africa 6.3705
## t value
## (Intercept) 45.468
## `Income group`Low income -0.441
## `Income group`Lower middle income -23.342
## `Income group`Upper middle income -22.674
## RegionEurope & Central Asia 0.356
## RegionLatin America & Caribbean -12.830
## RegionMiddle East & North Africa -22.561
## RegionNorth America 5.685
## RegionSouth Asia -3.639
## RegionSub-Saharan Africa -13.370
## `Income group`Low income:RegionEurope & Central Asia NA
## `Income group`Lower middle income:RegionEurope & Central Asia -0.735
## `Income group`Upper middle income:RegionEurope & Central Asia 3.013
## `Income group`Low income:RegionLatin America & Caribbean NA
## `Income group`Lower middle income:RegionLatin America & Caribbean 8.327
## `Income group`Upper middle income:RegionLatin America & Caribbean 13.714
## `Income group`Low income:RegionMiddle East & North Africa NA
## `Income group`Lower middle income:RegionMiddle East & North Africa 11.653
## `Income group`Upper middle income:RegionMiddle East & North Africa 13.573
## `Income group`Low income:RegionNorth America NA
## `Income group`Lower middle income:RegionNorth America NA
## `Income group`Upper middle income:RegionNorth America NA
## `Income group`Low income:RegionSouth Asia -5.429
## `Income group`Lower middle income:RegionSouth Asia -0.396
## `Income group`Upper middle income:RegionSouth Asia NA
## `Income group`Low income:RegionSub-Saharan Africa NA
## `Income group`Lower middle income:RegionSub-Saharan Africa 9.704
## `Income group`Upper middle income:RegionSub-Saharan Africa 12.827
## Pr(>|t|)
## (Intercept) < 2e-16 ***
## `Income group`Low income 0.659102
## `Income group`Lower middle income < 2e-16 ***
## `Income group`Upper middle income < 2e-16 ***
## RegionEurope & Central Asia 0.722084
## RegionLatin America & Caribbean < 2e-16 ***
## RegionMiddle East & North Africa < 2e-16 ***
## RegionNorth America 1.35e-08 ***
## RegionSouth Asia 0.000276 ***
## RegionSub-Saharan Africa < 2e-16 ***
## `Income group`Low income:RegionEurope & Central Asia NA
## `Income group`Lower middle income:RegionEurope & Central Asia 0.462662
## `Income group`Upper middle income:RegionEurope & Central Asia 0.002598 **
## `Income group`Low income:RegionLatin America & Caribbean NA
## `Income group`Lower middle income:RegionLatin America & Caribbean < 2e-16 ***
## `Income group`Upper middle income:RegionLatin America & Caribbean < 2e-16 ***
## `Income group`Low income:RegionMiddle East & North Africa NA
## `Income group`Lower middle income:RegionMiddle East & North Africa < 2e-16 ***
## `Income group`Upper middle income:RegionMiddle East & North Africa < 2e-16 ***
## `Income group`Low income:RegionNorth America NA
## `Income group`Lower middle income:RegionNorth America NA
## `Income group`Upper middle income:RegionNorth America NA
## `Income group`Low income:RegionSouth Asia 5.81e-08 ***
## `Income group`Lower middle income:RegionSouth Asia 0.692314
## `Income group`Upper middle income:RegionSouth Asia NA
## `Income group`Low income:RegionSub-Saharan Africa NA
## `Income group`Lower middle income:RegionSub-Saharan Africa < 2e-16 ***
## `Income group`Upper middle income:RegionSub-Saharan Africa < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 40.27 on 8854 degrees of freedom
## (2774 observations deleted due to missingness)
## Multiple R-squared: 0.343, Adjusted R-squared: 0.3416
## F-statistic: 243.3 on 19 and 8854 DF, p-value: < 2.2e-16
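Production per capita (kg) = 90.1073 + B_IncomeGroup + B_Region + B_IncomeGroup:Region, where 90.1073 is the intercept (the estimate for the reference levels, High income in East Asia & Pacific) and each B term is the coefficient from the table above for the country's income group, its region, and their interaction.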
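To make the fitted equation concrete, the prediction for one group can be worked out by adding the relevant coefficients. The sketch below does this for an upper middle income country in Latin America & Caribbean, using estimates copied from the summary output above; the variable names are illustrative.

```r
# Coefficients copied from the summary(mfit4) output
intercept       <- 90.1073   # reference levels: High income, East Asia & Pacific
b_upper_middle  <- -66.1420  # Income group: Upper middle income
b_latin_america <- -37.4252  # Region: Latin America & Caribbean
b_interaction   <- 52.5939   # Upper middle income x Latin America & Caribbean

# Predicted production per capita (kg) for this income group and region
pred <- intercept + b_upper_middle + b_latin_america + b_interaction
pred  # 39.134
```

For a group at a reference level, the corresponding terms simply drop out. A high income country in North America, for example, is predicted at 90.1073 + 23.8977 = 114.005 kg.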
par(mfrow = c(2,2))
plot(mfit4)
Both region and income group play a role in countries' production. Production varies across income groups and across regions, but it was also important to consider the interaction of the two. Many regions include countries of varying income status, and only by looking at both together is there a fuller understanding of how they influence production.
The model had an adjusted R-squared of 0.3416, meaning that about 34% of the variation in production per capita is explained by this model. While region, income group, and their interaction give some insight into production, other factors are clearly involved. The overall F-test was highly significant (p < 2.2e-16), and most individual coefficients were significant at the 0.001 level. However, a few were not (for example, Low income, p = 0.659, and Europe & Central Asia, p = 0.722), and eight interaction terms could not be estimated because of singularities.
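The summary output notes that 8 coefficients were not defined because of singularities. With two categorical predictors and their interaction, this typically arises when some income group and region combinations have no observations, so the model has fewer distinct groups than coefficients. A cross-tabulation is one way to look for such gaps; the sketch below uses a small made-up table (the combinations shown are illustrative, not the real data).

```r
# Toy data: there are no low income rows for North America
toy <- data.frame(
  income = c("High income", "High income", "Low income", "Low income"),
  region = c("North America", "South Asia", "South Asia", "South Asia")
)

# Cross-tabulate the two factors; zeros mark combinations with no data
counts <- table(toy$income, toy$region)
counts

# Number of empty income-by-region cells
sum(counts == 0)  # 1
```

Applying table() to the Income group and Region columns of fooddf would show which combinations are empty in the real data. Which specific coefficients R reports as NA depends on the order of the terms, so the NA rows do not map one-to-one onto the empty cells.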
However, it is also important to acknowledge that the diagnostic plots suggest the model assumptions are not well met. The residual plots show a pattern rather than random scatter, and the Q-Q plot shows right skew.
This exploration showed that region and income group have some impact on production, and that there is variation within each. High-income countries generally see greater production than lower-income countries. By region, production per capita is greatest in North America and in Europe and Central Asia, which include most of the world powers of the past century.
Further studies might explore other factors to find a more accurate model. While the model as a whole was significant, details on governments and political divisions, as well as climate and environmental concerns, might further explain the variation in production. The size of the country could also play a role.
While this project was limited by time and space, an exploration of all these variables over time might give a better understanding of how production has changed. Exploring why North America is bimodally distributed would also be interesting and might provide further insight.
Godfray, H. C. J., Beddington, J. R., Crute, I. R., Haddad, L., Lawrence, D., Muir, J. F., Pretty, J., Robinson, S., Thomas, S. M., & Toulmin, C. (2010). Food security: The challenge of feeding 9 billion people. Science, 327(5967), 812–818. https://doi.org/10.1126/science.1185383
Rijanta, R. (2020). The prospects & challenges of local foods production in rural Java, Indonesia: the case of Kulonprogo Regency. Journal of Studies & Research in Human Geography, 14(2), 321–335. https://doi.org/10.5719/hgeo.2019.141.9
Dhiman, S., & Mukherjee, G. (2021). Present scenario and future scope of food waste to biofuel production. Journal of Food Process Engineering, 44(2), 1–21. https://doi.org/10.1111/jfpe.13594