Alexander Hinton, Elijah Oswald & Isabella Mondello
The relationship between GDP per capita and the number of children per woman (the total fertility rate) provides valuable insights into how economic development affects population dynamics. The group finds this topic compelling because the distribution of resources changes as countries experience economic growth, which tends to bring better healthcare, education and employment opportunities (Brooks, 2023). Research suggests that these changes often correlate with a lower number of children per woman. However, the magnitude of this relationship varies significantly across regions and time periods, reflecting a complex interplay between economic factors and cultural or social norms. Key periods of interest for this research are the two World Wars, which were marked by global economic turmoil, significant shifts in workforce participation, and disruptions to family life that likely influenced fertility rates in different ways across nations.
This research question is both novel and essential because it goes beyond a simplistic understanding of how economic growth affects fertility. Instead, it seeks to explore how different historical and regional contexts shape this relationship. Economic performance, in this sense, has profound implications for demographic trends, which in turn affect policy decisions related to social services, resource allocation, and sustainable development. For instance, countries with rapidly declining birth rates may face challenges related to ageing populations, while those with higher fertility rates may struggle to provide sufficient services and infrastructure (North, 2023). Understanding how economic factors influence reproductive behaviours can provide critical insights for long-term economic and social planning, particularly in a world increasingly focused on sustainability.
Our research group, two-thirds of whom are women, brings a personal perspective on and interest in this topic. The diverse backgrounds within the group allow us to draw on a variety of experiences and cultural understandings of how economic and social factors shape women’s reproductive choices. This personal connection enriches our approach to investigating how differences in economic status influence population growth patterns. By examining these trends, we aim to contribute to broader discussions on global population dynamics and the economic policies needed to address future challenges in an increasingly interconnected world.
library(dplyr) # For data wrangling
library(ggplot2) # For data visualization
library(readr) # For reading CSV files
library(tidyr) # For data transformation
library(knitr) # For creating tables
# Read the two Gapminder CSV exports (one row per country, one column per year)
gdp_pcap <- read.csv("gdp_pcap.csv")
children_per_woman_total_fertility <- read.csv("children_per_woman_total_fertility.csv")
# read.csv() prefixes numeric column names with "X"; strip it so the names are plain years
gdp <- gdp_pcap %>%
  rename_with(~ ifelse(.x == "country", .x, sub("^X", "", .x)), .cols = everything())
fertility <- children_per_woman_total_fertility %>%
  rename_with(~ ifelse(.x == "country", .x, sub("^X", "", .x)), .cols = everything())
# Convert every year column to numeric; cells that cannot be parsed become NA.
# The fertility values were already numeric, so this step leaves them unchanged.
fertility_clean <- fertility %>%
  mutate(across(-country, ~ ifelse(grepl("µ", .), as.numeric(sub("µ", "", .)) * 1e-6, as.numeric(.))))
gdp_clean <- gdp %>%
  mutate(across(-country, ~ ifelse(grepl("µ", .), as.numeric(sub("µ", "", .)) * 1e-6, as.numeric(.))))
## Warning: There were 197 warnings in `mutate()`.
## The first warning was:
## ℹ In argument: `across(...)`.
## Caused by warning in `ifelse()`:
## ! NAs introduced by coercion
## ℹ Run `dplyr::last_dplyr_warnings()` to see the 196 remaining warnings.
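The coercion warnings above mean that some GDP cells could not be read as numbers and were set to NA. One possible cause, which is an assumption here and is not confirmed by the output, is that large values are stored with a magnitude suffix such as "12.3k". A minimal base R sketch of a parser for that case follows; `parse_suffixed_number` is a hypothetical helper and the suffix set (k, M, B) is assumed rather than taken from the data.

```r
# Hypothetical helper: convert strings such as "850", "12.3k" or "1.2M" to numbers.
# The suffixes k, M and B are an assumed convention, not confirmed for this dataset.
parse_suffixed_number <- function(x) {
  x <- trimws(as.character(x))
  scale <- c(k = 1e3, M = 1e6, B = 1e9)

  # Look at the final character to decide whether a suffix is present
  last_char  <- substr(x, nchar(x), nchar(x))
  has_suffix <- !is.na(x) & last_char %in% names(scale)

  # Split each value into its numeric part and its multiplier
  digits     <- ifelse(has_suffix, substr(x, 1, nchar(x) - 1), x)
  multiplier <- ifelse(has_suffix, scale[last_char], 1)

  # Anything still unparseable becomes NA, as in the original cleaning step
  suppressWarnings(as.numeric(digits)) * multiplier
}

example_values <- parse_suffixed_number(c("850", "12.3k", "1.2M", NA, "abc"))
print(example_values)
```

If the raw file does use such suffixes, a helper like this could be applied inside `mutate(across(-country, ...))`. It is worth inspecting the unparsed cells first, since the real cause of the warnings may be different.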
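# Reshape both tables from wide (one column per year) to long (one row per country-year).
# names_transform stores the year as an integer so later filters compare numbers, not strings.
gdp <- gdp_clean %>%
  pivot_longer(
    cols = -country,                            # Exclude the country column from being gathered
    names_to = "year",                          # Name of the new column for the years
    names_transform = list(year = as.integer),
    values_to = "gdp"                           # Name of the new column for the GDP values
  )
fertility <- fertility_clean %>%
  pivot_longer(
    cols = -country, names_to = "year",
    names_transform = list(year = as.integer),
    values_to = "children"
  )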
gdp_and_fertility <- gdp %>%
left_join(fertility, by = c("country", "year"))
View(gdp_and_fertility) #viewing the final merged data set
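Because the plots and regressions silently drop rows where either variable is missing, it can help to count those rows straight after the merge. The sketch below uses a small made-up data frame, `merged_toy`, as a stand-in for `gdp_and_fertility`; the values are illustrative only.

```r
# Toy stand-in for the merged table (values are illustrative, not real data)
merged_toy <- data.frame(
  country  = c("A", "A", "B", "B"),
  year     = c(2005L, 2006L, 2005L, 2006L),
  gdp      = c(1200, NA, 8000, 8500),
  children = c(5.1, 5.0, NA, 1.9)
)

analysis_columns <- merged_toy[, c("gdp", "children")]

# Number of missing values in each analysis variable
missing_by_column <- colSums(is.na(analysis_columns))

# Rows usable by lm() or a scatter plot: both variables must be present
n_complete <- sum(complete.cases(analysis_columns))
n_dropped  <- nrow(merged_toy) - n_complete

print(missing_by_column)
print(c(complete = n_complete, dropped = n_dropped))
```

Running the same three calculations on the real merged table would show how many country-year rows are available to each later model.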
# Data from 2005 onwards
ModernTrends <- gdp_and_fertility %>% filter(year >= 2005)
# A histogram shows one variable only: the y-axis is the count of country-year observations
ggplot(ModernTrends, aes(x = children)) +
  geom_histogram(binwidth = 1, fill = "pink", color = "black") +
  labs(
    title = "Distribution of Children per Woman Since 2005",
    x = "Number of Children per Woman",
    y = "Number of Country-Year Observations"
  )
The ‘ModernTrends’ dataset, which has 18,720 observations of 4 variables, was used to create the histogram above. The graph shows the distribution of the number of children per woman from 2005 onward, and this distribution is unimodal. Because a histogram displays a single variable, it cannot by itself show how fertility varies with GDP per capita; that relationship is examined in the scatter plots and regressions that follow.
The highest peak occurs at two (2) children per woman. Fertility at this level is typically associated with higher GDP per capita. Many of these observations come from middle-income countries, which suggests that even moderate economic growth is associated with reduced fertility as access to healthcare, education, and family planning improves.
The second-highest frequency is one (1) child per woman. This is consistent with trends in economically developed countries, where higher incomes and urbanisation often lead to delayed childbearing and smaller family sizes. These countries may have a higher quality of life, advanced healthcare systems and economic structures that support lower fertility rates.
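To read the peaks off a histogram numerically, the bin counts can be tabulated directly. The sketch below uses a small made-up vector, `children_toy`, in place of `ModernTrends$children`. It assumes ggplot2's default bin alignment for `binwidth = 1`, under which each bar is centred on a whole number, so the bar labelled 2 covers values from roughly 1.5 to 2.5.

```r
# Toy fertility values (illustrative only); NA values are ignored by table()
children_toy <- c(1.2, 1.8, 2.1, 2.4, 2.6, 4.9, 5.2, NA)

# Bin edges at 0.5, 1.5, ..., 8.5 so that each bin is centred on an integer
bin_edges <- seq(0.5, 8.5, by = 1)
bins <- cut(children_toy, breaks = bin_edges, labels = 1:8)

# Count of observations per bin, then the label of the tallest bin
bin_counts <- table(bins)
modal_bin  <- names(bin_counts)[which.max(bin_counts)]

print(bin_counts)
print(modal_bin)
```

A table like this makes it explicit that a bar at "2" includes non-integer fertility rates, which matters when describing countries as having "two children per woman".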
ModernTrends %>%
  ggplot(aes(x = children, y = gdp)) + geom_point() + geom_smooth() +
  scale_y_log10() +
  labs(x = "Number of Children per Woman", y = "GDP per capita ($)",
       title = "GDP per Capita and Children per Woman Since 2005") +
  theme(plot.title = element_text(size = 11.5, hjust = 0.5))
## `geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'
## Warning: Removed 13540 rows containing non-finite outside the scale range
## (`stat_smooth()`).
## Warning: Removed 13540 rows containing missing values or values outside the scale range
## (`geom_point()`).
The scatter plot is also based on the ‘ModernTrends’ dataset and shows a clear negative relationship between the number of children per woman and GDP per capita: as GDP per capita increases, the number of children per woman tends to decrease. The trend line slopes strongly downward, highlighting that wealthier countries generally have lower fertility rates. There is some variability, however, particularly at lower GDP levels, where fertility remains high.
The graph shows a dense cluster of data points on the left side, where lower fertility rates are associated with higher GDP per capita. This pattern reflects the trend seen in economically developed countries, where access to healthcare, education and family planning results in smaller families. Moving to the right, countries have larger numbers of children per woman and lower GDP per capita. The scatter plot therefore illustrates an inverse correlation, reinforcing the idea that as countries become wealthier, they tend to have fewer children per woman. However, the spread of points may also reflect external factors such as social policies or cultural practices.
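The strength of the inverse relationship seen in a scatter plot can be summarised with a correlation coefficient. The sketch below is one way to do this and is not part of the original analysis. It uses a made-up data frame, `toy`, and takes `log10()` of GDP to mirror the logarithmic y-axis of the plot.

```r
# Toy country-year rows (illustrative values only)
toy <- data.frame(
  children = c(1.5, 2.0, 3.0, 4.5, 6.0, NA),
  gdp      = c(40000, 20000, 8000, 3000, 1000, 500)
)

# Pearson correlation on the log scale used by the plot; incomplete rows are dropped
pearson_log <- cor(toy$children, log10(toy$gdp), use = "complete.obs")

# Spearman rank correlation is unaffected by the log transformation
spearman <- cor(toy$children, toy$gdp, method = "spearman", use = "complete.obs")

print(c(pearson_log = pearson_log, spearman = spearman))
```

A value close to -1 indicates a tight inverse relationship, while a value nearer 0 would reflect the wider spread of points described above.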
World_War_1<- gdp_and_fertility %>% filter(year >= 1914 & year <= 1918) #World War 1
World_War_2 <- gdp_and_fertility %>% filter(year >= 1939 & year <= 1945) #World War 2
World_War_1 %>%
  ggplot(aes(x = children, y = gdp)) + geom_point() + geom_smooth() +
  scale_y_log10() +
  labs(x = "Number of Children per Woman", y = "GDP per capita ($)",
       title = "GDP per Capita and Children per Woman During World War One") +
  theme_bw() + theme(plot.title = element_text(size = 11.5, hjust = 0.5))
## `geom_smooth()` using method = 'loess' and formula = 'y ~ x'
## Warning: Removed 5 rows containing non-finite outside the scale range
## (`stat_smooth()`).
## Warning: Removed 5 rows containing missing values or values outside the scale range
## (`geom_point()`).
The scatter plot above illustrates the relationship between the number of children born per woman and GDP per capita during World War I.
The downward trend of the fitted line suggests that as GDP per capita increases, the number of children per woman tends to decrease. This indicates that wealthier societies, on average, had fewer children during this period. Notably, the highest concentration of data points is found between 6 and 7 children per woman. This concentration suggests that many countries experienced similar fertility rates, reflecting broader social and economic conditions at the time. These findings suggest that during World War I, GDP per capita was closely associated with fertility rates, which may in part reflect lower levels of education among families during the period.
The ‘ModernTrends’ scatter plot demonstrates a clear negative relationship between the number of children born per woman and GDP per capita, indicating that wealthier societies generally have lower fertility rates. The World War I analysis reflects the same trend: as GDP per capita increases, fertility rates decrease, suggesting that economic conditions are closely linked to family size in both historical and contemporary contexts.
World_War_2 %>%
  ggplot(aes(x = children, y = gdp)) + geom_point() + geom_smooth() +
  scale_y_log10() +
  labs(x = "Number of Children per Woman", y = "GDP per capita ($)",
       title = "GDP per Capita and Children per Woman During World War Two") +
  theme_bw() + theme(plot.title = element_text(size = 11.5, hjust = 0.5))
## `geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'
## Warning: Removed 69 rows containing non-finite outside the scale range
## (`stat_smooth()`).
## Warning: Removed 69 rows containing missing values or values outside the scale range
## (`geom_point()`).
The scatter plot above shows the relationship between the number of children born per woman and GDP per capita during World War II.
As in the World War I results, the scatter plot reveals a clear inverse relationship between GDP per capita and the number of children per woman. As GDP per capita increases, the number of children born per woman tends to decrease. This trend may indicate that, as countries experienced economic shifts due to wartime production and resource allocation, families began to favour smaller sizes. The war effort often necessitated changes in societal roles, with women entering the workforce and prioritising economic stability, which likely contributed to informed reproductive choices and reduced family size. Access to family planning and healthcare may have improved as well, further influencing this demographic shift.
There are notably more data points around the 2-3 children per woman range than in World War I, but the majority remain around the 6-7 children per woman range. The higher concentration of data points in this range underscores that, despite the overarching downward trend, many families continued to have larger numbers of children. This can be attributed to the historical context of the time, in which high infant mortality rates and agrarian economies encouraged larger families as a safeguard against loss. The war may have reinforced these factors, as uncertainty about the future may have driven families to maintain traditional values regarding childbearing. Overall, these findings illustrate the complex interplay between economic conditions, wartime pressures, and societal norms in shaping demographic trends during a pivotal period in history.
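Statements about where the points are concentrated in each war period can be checked by computing the share of observations in the 6-7 range for each period. The sketch below uses a made-up data frame, `toy`, rather than `gdp_and_fertility`; the period boundaries match the filters used earlier, while the fertility values are illustrative only.

```r
# Toy country-year rows (illustrative values only)
toy <- data.frame(
  year     = c(1913L, 1914L, 1916L, 1918L, 1939L, 1942L, 1945L, 1950L),
  children = c(6.1, 6.4, 6.6, 6.2, 2.8, 6.5, 3.0, 5.0)
)

# Label each row with its war period; years outside both wars stay NA
toy$period <- ifelse(toy$year >= 1914 & toy$year <= 1918, "WWI",
              ifelse(toy$year >= 1939 & toy$year <= 1945, "WWII", NA))

# Flag observations with between 6 and 7 children per woman
in_range <- toy$children >= 6 & toy$children < 7

# Share of flagged observations within each period (NA periods are dropped)
share_in_range <- tapply(in_range, toy$period, mean)

print(share_in_range)
```

Applied to the real data, a comparison like this would put a number on how much the 6-7 cluster shrank between the two wars.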
gdp_fertility_reg<- gdp_and_fertility %>%
lm(formula = gdp~children) #running the regression
summary(gdp_fertility_reg) #summarising the output
##
## Call:
## lm(formula = gdp ~ children, data = .)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4698.6 -973.2 -355.0 611.5 9257.8
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 7448.121 29.562 252.0 <2e-16 ***
## children -927.655 5.231 -177.3 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1647 on 41773 degrees of freedom
## (16920 observations deleted due to missingness)
## Multiple R-squared: 0.4295, Adjusted R-squared: 0.4295
## F-statistic: 3.145e+04 on 1 and 41773 DF, p-value: < 2.2e-16
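The model shows a significant negative relationship between the “children” variable and GDP per capita, meaning that as the number of children per woman increases, GDP per capita tends to decrease. The intercept is estimated at 7448.12, representing the expected GDP per capita when the “children” variable is zero. No country has a fertility rate of zero, so the intercept should not be interpreted literally.
The coefficient for “children” is -927.66, indicating that for every one-unit increase in “children,” GDP per capita decreases by approximately 927.66 on average. This negative association is highly significant, as seen in the large negative t-value of -177.3 and the very low p-value (< 2e-16), strongly suggesting that this result is not due to random chance. The multiple R-squared of 0.4295 indicates that fertility accounts for about 43% of the variation in GDP per capita in this model.
Both the intercept and the “children” coefficient are significant at a level well below 5%, reinforcing that the “children” variable has a statistically strong association with GDP per capita. As this is a simple regression on observational data, it describes an association rather than a causal effect.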
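To make the coefficients concrete, the fitted line can be evaluated by hand at a few fertility levels. The sketch below copies the intercept and slope from the `summary()` output above; `predict_gdp` is a hypothetical helper introduced only for illustration.

```r
# Coefficients copied from the summary() output of gdp_fertility_reg
intercept <- 7448.121
slope     <- -927.655

# Hypothetical helper: fitted GDP per capita for a given number of children per woman
predict_gdp <- function(children) intercept + slope * children

# Fitted values at 2, 4 and 6 children per woman
fitted_values <- predict_gdp(c(2, 4, 6))
print(fitted_values)   # 5592.811 3737.501 1882.191

# Fertility level at which the fitted line reaches zero GDP per capita
zero_crossing <- -intercept / slope
print(zero_crossing)
```

The fitted line reaches zero at just over 8 children per woman and is negative beyond that point. This is a limitation of fitting a straight line to GDP per capita on its original scale.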
#running a regression only for the recent age of the internet
ModernTrends_reg <- ModernTrends %>%
lm(formula = gdp~children)
summary(ModernTrends_reg)
##
## Call:
## lm(formula = gdp ~ children, data = .)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4910.7 -1722.3 -256.9 1602.7 6764.5
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 8203.15 88.16 93.05 <2e-16 ***
## children -1124.63 26.68 -42.15 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2208 on 5178 degrees of freedom
## (13540 observations deleted due to missingness)
## Multiple R-squared: 0.2555, Adjusted R-squared: 0.2553
## F-statistic: 1777 on 1 and 5178 DF, p-value: < 2.2e-16
This model focuses on data from the recent age of the internet, specifically from 2005 onward. It shows a significant negative relationship between the “children” variable and GDP per capita, indicating that as the number of children per woman increases, GDP per capita tends to decrease within this time frame.
The intercept is estimated at 8203.15, representing the expected GDP per capita when the “children” variable is zero. As no country has a fertility rate of zero, this value should not be interpreted literally. The coefficient for “children” is -1124.63, suggesting that for every one-unit increase in “children,” GDP per capita decreases by approximately 1124.63 on average. This negative association is highly significant, with a t-value of -42.15 and an extremely low p-value (< 2e-16), indicating it is very unlikely that this result is due to random chance. Both the intercept and the “children” coefficient are statistically significant well below the 5% threshold. The multiple R-squared of 0.2555 is lower than in the full-period model, however, so fertility accounts for about a quarter of the variation in GDP per capita during the modern internet age.
anova(ModernTrends_reg)
## Analysis of Variance Table
##
## Response: gdp
## Df Sum Sq Mean Sq F value Pr(>F)
## children 1 8.6584e+09 8658423825 1776.8 < 2.2e-16 ***
## Residuals 5178 2.5233e+10 4873161
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The Analysis of Variance (ANOVA) table focuses on the relationship between the “children” variable and GDP per capita within the regression model. It shows how much of the variability in GDP per capita can be attributed to changes in the number of children per woman.
The degrees of freedom for “children” is 1, indicating that it is a single predictor variable. The residual degrees of freedom is 5178, which is the 5,180 complete observations minus the two estimated parameters (the intercept and the slope). The sum of squares for “children” is 8.6584e+09, the amount of variability in GDP per capita that is explained by this predictor. In contrast, the residual sum of squares is 2.5233e+10, the variability in GDP per capita that remains unexplained after considering the effect of “children.”
The mean square for “children” is 8658423825, calculated by dividing the sum of squares by its degrees of freedom. The mean square for the residuals is 4873161, indicating the average variability attributable to other factors not included in the model. The F-statistic is 1776.8, the ratio of these two mean squares, suggesting that “children” significantly improves the model’s explanatory power relative to the residual variability. This is further supported by a p-value of less than 2.2e-16, indicating that the relationship between “children” and GDP per capita is highly significant.
Overall, the ANOVA table reinforces the finding that the “children” variable is significantly associated with GDP per capita, although it explains only about a quarter of the variation in this period.
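The quantities in the ANOVA table are linked by simple arithmetic, which can be verified from the printed values. The sketch below copies the rounded numbers from the `anova()` and `summary()` output above, so the results agree with the reported statistics only up to rounding.

```r
# Values copied from the anova() output for ModernTrends_reg
ss_children <- 8.6584e9     # sum of squares explained by "children"
ss_residual <- 2.5233e10    # residual (unexplained) sum of squares
ms_children <- 8658423825   # mean square for "children" (1 degree of freedom)
ms_residual <- 4873161      # residual mean square (5178 degrees of freedom)

# F-statistic: explained mean square divided by residual mean square
f_stat <- ms_children / ms_residual

# R-squared: explained share of the total sum of squares
r_squared <- ss_children / (ss_children + ss_residual)

# With a single predictor, the squared t-value of the slope equals F
t_value <- -42.15
t_squared <- t_value^2

print(c(f_stat = f_stat, r_squared = r_squared, t_squared = t_squared))
```

The recomputed values match the reported F-statistic of 1776.8 and multiple R-squared of 0.2555 up to rounding. This confirms that the ANOVA table and the regression summary describe the same model.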
The analysis applies various data transformation and visualisation techniques to explore the relationship between GDP per capita and the number of children per woman over time, focusing on key historical and modern trends. Initially, the data is cleaned and reshaped to ensure compatibility between the GDP and fertility data from GapMinder for each country by year. The data is then filtered to examine specific periods, including trends from 2005 onward and the periods of World War I and World War II. This segmentation allows for comparisons across different eras, which highlight the impacts of socio-economic changes on fertility and economic performance.
Visualising this data in histograms and scatter plots allowed us to explore the correlation between GDP per capita and fertility rates. The methodology also incorporates regression analysis, which allowed us to quantify the relationship between the two variables. These statistical models led us to conclude that there is a statistically significant negative correlation between fertility rates and GDP per capita across the periods and models examined. The findings reveal that wealthier societies generally exhibit lower fertility rates, likely due to improved access to education, healthcare and family planning. The regression and ANOVA analyses further confirmed that this relationship is statistically significant. Hence, it can be concluded that economic growth and wealth distribution are closely tied to family size and societal demographics.
Brooks, R. (2023). What is the relationship between education and the economy? [online] North Wales Management School - Wrexham University. Available at: https://online.wrexham.ac.uk/what-is-the-relationship-between-education-and-the-economy/.
North, M. (2023). How 4 countries are addressing their ageing populations. [online] World Economic Forum. Available at: https://www.weforum.org/stories/2023/09/life-expectancy-countries-ageing-populations/ [Accessed 3 Nov. 2024].