The gross domestic product per capita, or GDP per capita, is a measure of a country’s economic output that accounts for its number of people. It divides the country’s gross domestic product by its total population. GDP per capita is an important indicator of economic performance and a useful unit to make cross-country comparisons of average living standards and economic wellbeing.
Life expectancy at birth is defined as how long, on average, a newborn can expect to live, if current death rates do not change. Life expectancy at birth is one of the most frequently used health status indicators.
Large inequalities in life expectancy exist between countries. These differences in health across the world are clearly visible in the following illustration:
Do citizens of wealthier countries live longer?
To answer this, we investigate the relationship between GDP per capita and the life expectancy of citizens in developed countries.
The question that we want to investigate is:
“Is there any relationship between a country’s GDP per Capita and Life Expectancy?”
To answer this, we:
A detailed description of the datasets considered for data preprocessing, their sources, and their variables is as follows:
library(readxl) # provides read_excel()
library(dplyr) # provides %>% and summarise()
newdata <- read_excel("gdp_lifeexp_data.xlsx")
head(newdata)
## # A tibble: 6 x 4
## LOCATION YEAR GDP LIFEEXP
## <chr> <dbl> <dbl> <dbl>
## 1 AUS 2000 28249. 79.3
## 2 AUS 2001 29475. 79.7
## 3 AUS 2002 30741. 80
## 4 AUS 2003 32245. 80.3
## 5 AUS 2004 33857. 80.6
## 6 AUS 2005 35571. 80.9
summary1 <- newdata %>% summarise(Min = min(GDP, na.rm = TRUE), Q1 = quantile(GDP, probs = .25, na.rm = TRUE),
                                  Median = median(GDP, na.rm = TRUE), Q3 = quantile(GDP, probs = .75, na.rm = TRUE),
                                  Max = max(GDP, na.rm = TRUE), Mean = mean(GDP, na.rm = TRUE),
                                  SD = sd(GDP, na.rm = TRUE), n = n(),
                                  Missing = sum(is.na(GDP)))
knitr::kable(summary1, caption = "Summary Statistics for GDP per Capita")
| Min | Q1 | Median | Q3 | Max | Mean | SD | n | Missing |
|---|---|---|---|---|---|---|---|---|
| 6776.995 | 23859.79 | 31877.07 | 41360.25 | 116622.2 | 33850.01 | 15703.31 | 700 | 0 |
summary2 <- newdata %>% summarise(Min = min(LIFEEXP, na.rm = TRUE), Q1 = quantile(LIFEEXP, probs = .25, na.rm = TRUE),
                                  Median = median(LIFEEXP, na.rm = TRUE), Q3 = quantile(LIFEEXP, probs = .75, na.rm = TRUE),
                                  Max = max(LIFEEXP, na.rm = TRUE), Mean = mean(LIFEEXP, na.rm = TRUE),
                                  SD = sd(LIFEEXP, na.rm = TRUE), n = n(),
                                  Missing = sum(is.na(LIFEEXP)))
knitr::kable(summary2, caption = "Summary Statistics for Life Expectancy")
| Min | Q1 | Median | Q3 | Max | Mean | SD | n | Missing |
|---|---|---|---|---|---|---|---|---|
| 70.1 | 77.1 | 79.7 | 81.4 | 84.2 | 78.95329 | 3.092587 | 700 | 0 |
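The two summary blocks above repeat the same nine statistics for different columns. A minimal sketch of a reusable helper is given here, assuming base R only; the function name `summarise_var` and the toy data frame are hypothetical and are not part of the original analysis.

```r
# Hypothetical helper: the same nine statistics as the summary tables, for any numeric vector.
summarise_var <- function(x) {
  data.frame(
    Min     = min(x, na.rm = TRUE),
    Q1      = unname(quantile(x, probs = 0.25, na.rm = TRUE)),
    Median  = median(x, na.rm = TRUE),
    Q3      = unname(quantile(x, probs = 0.75, na.rm = TRUE)),
    Max     = max(x, na.rm = TRUE),
    Mean    = mean(x, na.rm = TRUE),
    SD      = sd(x, na.rm = TRUE),
    n       = length(x),        # counts every row, including missing values
    Missing = sum(is.na(x))
  )
}

# Toy data standing in for newdata (made-up rows, one deliberately missing GDP value)
toy <- data.frame(
  GDP     = c(28249, 29475, 30741, 32245, NA),
  LIFEEXP = c(79.3, 79.7, 80.0, 80.3, 80.6)
)

gdp_summary     <- summarise_var(toy$GDP)
lifeexp_summary <- summarise_var(toy$LIFEEXP)
print(gdp_summary)
print(lifeexp_summary)
```

With the real data, `summarise_var(newdata$GDP)` and `summarise_var(newdata$LIFEEXP)` would be expected to reproduce the two tables, and either result can be passed to `knitr::kable()` as before.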
library(ggplot2) # provides ggplot() and geom_histogram()
g1 <- ggplot(newdata, aes(GDP)) + geom_histogram(bins = 40, color = "yellow3", fill = "yellow3") + ylab('Frequency') + ggtitle('GDP Per Capita')
g2 <- ggplot(newdata, aes(LIFEEXP)) + geom_histogram(bins = 40, color = "brown", fill = "brown") + ylab('Frequency') + ggtitle('Life Expectancy')
cowplot::plot_grid(g1, g2, labels = "AUTO")
par(mfrow = c(1, 2)) # two boxplots side by side
boxplot(newdata$GDP, main = "Boxplot of GDP Per Capita", col = 'yellow3', notch = TRUE)
boxplot(newdata$LIFEEXP, main = "Boxplot of Life Expectancy", col = 'brown', notch = TRUE)
## [1] 69147 72018 78211 84575 68141 77891 83852 86592 82269 85579
## [11] 91814 91527 95246 100934 103788 110250 112702 116622 69358
## [1] 70.1 70.5 70.6 70.6
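The two vectors printed above appear to be the values flagged as outliers in the GDP and life expectancy boxplots, although the code that produced them is not shown. One common way to obtain such values is `boxplot.stats()`, which applies the same rule as the boxplot whiskers (points more than 1.5 times the hinge spread beyond the hinges). The sketch below uses a made-up vector `x` and is an assumption about the approach, not the code used in this report.

```r
# Made-up sample with one obviously extreme value
x <- c(10, 12, 13, 14, 15, 16, 18, 60)

# boxplot.stats() returns the five-number summary in $stats and the flagged points in $out
bs <- boxplot.stats(x)

# Rebuild the whisker fences by hand from the hinges to show the rule being applied
lower_hinge <- bs$stats[2]
upper_hinge <- bs$stats[4]
spread      <- upper_hinge - lower_hinge
lower_fence <- lower_hinge - 1.5 * spread
upper_fence <- upper_hinge + 1.5 * spread

outliers <- bs$out
manual   <- x[x < lower_fence | x > upper_fence]

print(outliers)
print(c(lower_fence, upper_fence))
```

For the report's data the analogous calls would be `boxplot.stats(newdata$GDP)$out` and `boxplot.stats(newdata$LIFEEXP)$out`.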
plot(LIFEEXP ~ GDP, data = newdata, main="Scatter of Life Expectancy and GDP",
ylab="Life Expectancy", xlab="GDP Per Capita", col=c('brown', 'yellow3'))
abline(lm(LIFEEXP ~ GDP, data = newdata))
Correlation Analysis
Hypothesis Generation
Ho : There is no correlation between GDP per Capita of an OECD country and its Life Expectancy.
Ha : There is a significant correlation between GDP per Capita of an OECD country and its Life Expectancy.
Mathematically,
Ho : r = 0
Ha : r ≠ 0
A Pearson’s correlation was calculated to measure the strength of the linear relationship between the GDP per capita of an OECD country and its life expectancy. The positive correlation was statistically significant, r = .67, p < .001, 95% CI [.628, .710].
There is therefore statistically significant evidence to reject Ho.
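The significance of a Pearson correlation is usually assessed with a t statistic on n - 2 degrees of freedom, t = r * sqrt(n - 2) / sqrt(1 - r^2). The sketch below applies that standard formula to the rounded values reported above (r = .67, n = 700); the variable names are illustrative only, and the result is approximate because r is rounded.

```r
# Reported values (r is rounded to two decimals)
r <- 0.67
n <- 700

# t statistic and two-sided p-value for Ho: r = 0
df      <- n - 2
t_stat  <- r * sqrt(df) / sqrt(1 - r^2)
p_value <- 2 * pt(-abs(t_stat), df = df)

print(t_stat)
print(p_value < 0.001)
```

In practice `cor.test(newdata$LIFEEXP, newdata$GDP)` performs the same test on the unrounded data in a single call.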
library(Hmisc) # provides rcorr()
# Creating a correlation matrix
corr <- as.matrix(dplyr::select(newdata, LIFEEXP, GDP))
rcorr(corr, type = "pearson")
## LIFEEXP GDP
## LIFEEXP 1.00 0.67
## GDP 0.67 1.00
##
## n= 700
##
##
## P
## LIFEEXP GDP
## LIFEEXP 0
## GDP 0
library(psychometric) # provides CIr()
r <- cor(newdata$LIFEEXP, newdata$GDP)
CIr(r = r, n = 700, level = .95) %>% round(3)
## [1] 0.628 0.710
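Confidence intervals for a correlation are commonly built with the Fisher z-transformation: transform r with atanh, add and subtract a normal critical value times 1 / sqrt(n - 3), and transform back with tanh. The sketch below assumes that this is the method behind the interval above. The function name `fisher_ci` is hypothetical, and the unrounded r is recovered from the R-squared of 0.4502 reported for the regression model, since R-squared equals r squared in simple linear regression.

```r
# Hypothetical helper: Fisher z-based confidence interval for a Pearson correlation
fisher_ci <- function(r, n, level = 0.95) {
  z    <- atanh(r)                       # Fisher z-transformation
  se   <- 1 / sqrt(n - 3)                # standard error on the z scale
  crit <- qnorm(1 - (1 - level) / 2)     # e.g. about 1.96 for a 95% interval
  tanh(c(lower = z - crit * se, upper = z + crit * se))  # back to the r scale
}

r_unrounded <- sqrt(0.4502)              # positive root, as the reported slope is positive
ci <- fisher_ci(r_unrounded, n = 700)
print(round(ci, 3))
```

The result should agree with the reported interval of [0.628, 0.710] up to rounding of the inputs.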
As the data demonstrated evidence of a positive linear relationship, a linear regression model was fitted to predict the dependent variable, life expectancy, using measures of GDP per capita. Other non-linear trends were ruled out.
Linear Regression Model
Hypothesis Generation
Ho: The data do not fit the linear regression model
Ha: The data fit the linear regression model
Mathematically,
Ho : β = 0
Ha : β ≠ 0
where β is the slope of the regression line.
gdplifeexpmodel <- lm(LIFEEXP ~ GDP, data = newdata) # fitting the linear regression model using the lm() function
gdplifeexpmodel %>% summary()
##
## Call:
## lm(formula = LIFEEXP ~ GDP, data = newdata)
##
## Residuals:
## Min 1Q Median 3Q Max
## -7.4908 -1.4302 0.3848 1.6505 4.4314
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 7.448e+01 2.062e-01 361.17 <2e-16 ***
## GDP 1.321e-04 5.527e-06 23.91 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.295 on 698 degrees of freedom
## Multiple R-squared: 0.4502, Adjusted R-squared: 0.4494
## F-statistic: 571.6 on 1 and 698 DF, p-value: < 2.2e-16
gdplifeexpmodel %>% confint() # calculating 95% CI
## 2.5 % 97.5 %
## (Intercept) 7.407545e+01 7.488522e+01
## GDP 1.212884e-04 1.429922e-04
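The fitted coefficients can be read as a prediction rule: predicted life expectancy = 74.48 + 0.0001321 × GDP per capita. The sketch below plugs in the rounded estimates from the summary output to show a prediction and the effect of a 10,000-unit rise in GDP per capita. The function name and the example GDP value of 30,000 are illustrative assumptions, and results are approximate because the coefficients are rounded.

```r
# Rounded estimates copied from the model summary above
intercept <- 74.48
slope     <- 1.321e-04
slope_se  <- 5.527e-06

# Hypothetical helper: predicted life expectancy for a given GDP per capita
predict_lifeexp <- function(gdp) intercept + slope * gdp

pred_30k     <- predict_lifeexp(30000)   # prediction at an illustrative GDP per capita
gain_per_10k <- slope * 10000            # years of life expectancy per 10,000 units of GDP per capita
t_slope      <- slope / slope_se         # reproduces the slope's t value up to rounding

print(pred_30k)
print(gain_per_10k)
print(t_slope)
```

With the fitted model object, `predict(gdplifeexpmodel, newdata = data.frame(GDP = 30000))` gives the same prediction without rounding.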
par(mfrow=c(1,4))
plot(gdplifeexpmodel)
Major Findings:
Statistics by Jim. c2020. Jim Frost. Consulted 11/10/20. Retrieved from: https://statisticsbyjim.com/basics/remove-outliers/
Applied Analytics Course website. c2016. James Baglin. Module 9. Simple Linear Regression and Correlation. Updated 13/07/20. Consulted 11/10/20. Retrieved from: https://astral-theory-157510.appspot.com/secured/MATH1324_Module_09.html
Our World in Data. Global Change Data Lab. Max Roser, Esteban Ortiz-Ospina and Hannah Ritchie (2013). “Life Expectancy”. Published online at OurWorldInData.org. Retrieved from: https://ourworldindata.org/life-expectancy
Euromonitor International. c2020. Retrieved from: https://blog.euromonitor.com/economic-growth-and-life-expectancy-do-wealthier-countries-live-longer/