2023-05-03
We analyzed the Gapminder dataset, which includes information on life expectancy, GDP per capita, and population for countries across the world from 1952 to 2007.
The dataset contains 1704 observations, recorded at 5-year intervals, and 6 variables: country, year, life expectancy, GDP per capita, population, and continent.
Our goal was to explore patterns and relationships in the data using visualizations and clustering.
We used k-means clustering to group countries based on their life expectancy and GDP per capita, each averaged over the years from 1952 to 2007.
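As a rough illustration of how the per-country features could be prepared before clustering, the sketch below averages each country's values across years and then z-scores both columns. The rows are made-up placeholder values, not real Gapminder figures. The column names `lifeExp` and `gdpPercap` and the standardization step are assumptions: the write-up does not say whether the variables were scaled, but scaling is a common choice here because GDP per capita spans a much larger numeric range than life expectancy and would otherwise dominate the distance calculation.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Illustrative rows only (made-up values, not real Gapminder figures).
# Column names are assumed: country, year, lifeExp, gdpPercap.
rows = [
    {"country": "A", "year": 1952, "lifeExp": 40.0, "gdpPercap": 800.0},
    {"country": "A", "year": 1957, "lifeExp": 44.0, "gdpPercap": 1000.0},
    {"country": "B", "year": 1952, "lifeExp": 60.0, "gdpPercap": 5000.0},
    {"country": "B", "year": 1957, "lifeExp": 64.0, "gdpPercap": 7000.0},
    {"country": "C", "year": 1952, "lifeExp": 70.0, "gdpPercap": 12000.0},
    {"country": "C", "year": 1957, "lifeExp": 72.0, "gdpPercap": 14000.0},
]


def country_averages(rows):
    """Return {country: (mean lifeExp, mean gdpPercap)} across all years."""
    grouped = defaultdict(list)
    for r in rows:
        grouped[r["country"]].append((r["lifeExp"], r["gdpPercap"]))
    return {
        country: (mean(v[0] for v in vals), mean(v[1] for v in vals))
        for country, vals in grouped.items()
    }


def standardize(features):
    """Z-score each column (assumed step) so both variables weigh equally."""
    names = list(features)
    columns = list(zip(*(features[n] for n in names)))
    stats = [(mean(col), pstdev(col)) for col in columns]
    return {
        n: tuple((x - m) / s if s else 0.0 for x, (m, s) in zip(features[n], stats))
        for n in names
    }


averages = country_averages(rows)   # one (lifeExp, gdpPercap) pair per country
scaled = standardize(averages)      # the two-column input a k-means step would use
```

Each country ends up as a single point in two dimensions, which is the form that k-means expects.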
We chose these variables because we were interested in how a country’s economic development relates to life expectancy.
We used the elbow method to determine that three clusters were appropriate; the results are shown below.
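The elbow method can be sketched as follows: run k-means for several values of k, record the within-cluster sum of squares (WCSS) for each, and look for the k beyond which the WCSS stops dropping sharply. This is a minimal standard-library sketch, not the code used in the analysis. The points are made-up values standing in for standardized (life expectancy, GDP per capita) pairs, and the deterministic farthest-first initialization is a simplification chosen here so that the output is reproducible.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_first(points, k):
    """Deterministic init (a simplification): start at the first point,
    then repeatedly add the point farthest from the chosen centroids."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, iters=100):
    """Plain Lloyd's algorithm; returns (centroids, labels, wcss)."""
    centroids = farthest_first(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members))
                )
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    wcss = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, wcss


# Made-up, already-standardized points forming three loose groups.
points = [
    (-1.2, -1.0), (-1.0, -1.1), (-1.1, -0.9),
    (0.0, 0.1), (0.1, -0.1), (-0.1, 0.0),
    (1.5, 1.2), (1.4, 1.4), (1.6, 1.3),
]

# WCSS for k = 1..5; the "elbow" is where the curve flattens out.
wcss_by_k = {k: kmeans(points, k)[2] for k in range(1, 6)}
for k, w in wcss_by_k.items():
    print(f"k={k}  WCSS={w:.3f}")
```

On data like this, the WCSS falls steeply up to k = 3 and only marginally afterwards, which is the pattern the elbow method looks for. On real data the bend is often less pronounced, so the choice of k remains a judgment call.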
Our analysis suggests that there is a strong positive relationship between economic development and life expectancy.
We found that developed countries tend to have higher life expectancy and GDP per capita than developing countries.
By identifying the characteristics of each cluster, our analysis could provide insights for policymakers and public health officials to improve health outcomes in less developed countries.
For future work, the Gapminder dataset could be joined by year and country with other data sources to add features such as education level, healthcare spending, and environmental factors. This would provide a more comprehensive understanding of the factors that influence life expectancy and economic development across the world.
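A join on country and year could look roughly like the sketch below. Everything in it is hypothetical: the second table, its `eduYears` column, and all values are placeholders, since no additional data source has been chosen. A left join is used so that every Gapminder row is kept and missing matches are made explicit.

```python
# Made-up rows; "education" and its "eduYears" column are hypothetical.
gapminder = [
    {"country": "A", "year": 2002, "lifeExp": 70.1, "gdpPercap": 9000.0},
    {"country": "A", "year": 2007, "lifeExp": 71.3, "gdpPercap": 9800.0},
    {"country": "B", "year": 2007, "lifeExp": 55.0, "gdpPercap": 1200.0},
]
education = [
    {"country": "A", "year": 2007, "eduYears": 10.5},
    {"country": "B", "year": 2007, "eduYears": 5.2},
]


def left_join(left, right, keys=("country", "year")):
    """Keep every left row; add right-side columns where the keys match."""
    index = {tuple(r[k] for k in keys): r for r in right}
    extra_cols = {col for r in right for col in r} - set(keys)
    joined = []
    for row in left:
        match = index.get(tuple(row[k] for k in keys), {})
        merged_row = dict(row)
        for col in extra_cols:
            merged_row[col] = match.get(col)  # None marks a missing match
        joined.append(merged_row)
    return joined


merged = left_join(gapminder, education)
```

In practice, country names often differ between sources, so matching on a standardized country code rather than the name would likely be more reliable.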