2025-11-02

The Dataset

-This project uses the world_population dataset from Kaggle.

-The dataset contains population data for 234 countries and territories from 1970 to 2022.

-The dataset has 17 columns, including the name of the Country or Territory, the Continent it belongs to, the population for various years, Density, Area, and Growth Rate.

Brief Overview

We explore the data through several visualizations to better understand and analyze it. The layout is as follows:

-World maps: Geographical maps with countries colored by intensity of Density and Growth Rate.

-Scatter plot: Shows the interactions between Continent, Density, Growth Rate, and Area using a 3D plot.

-Line Chart: A simple line chart showing the populations of Continents over time.

-Box Plot: Shows the distribution of Growth Rates in each continent.

-Statistical Analysis: A descriptive summary of the mean, median, and standard deviation of Growth Rate by continent.

Ggplot World Maps

This is a series of three maps that show population density and growth rate across the world. The density map uses a logarithmic scale because Density spans a very large range with many outliers. There are two maps for growth rate, one including Ukraine and one excluding it. Ukraine's population decline is so steep that it stretches the color scale and makes the differences between the other countries hard to see.
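The two ideas described here, a log-scaled density fill and dropping the Ukraine outlier before plotting growth rate, can be sketched roughly as follows. This is only an illustration: the toy data frame and all of its values are placeholders rather than figures from the dataset, the country outlines come from ggplot2's map_data() (which needs the maps package), and the report's actual mapping code may differ.

```r
library(ggplot2)

# Placeholder values only; these are NOT taken from the dataset.
toy <- data.frame(
  region      = c("France", "Poland", "Ukraine", "Nigeria"),
  Density     = c(10, 100, 1000, 10000),
  Growth.Rate = c(0.001, 0.02, -0.09, 0.025)
)

# Log-transform density so a few very dense places do not dominate the colors.
toy$log_density <- log10(toy$Density)

# Second growth-rate map: drop the outlier so the color scale is not stretched.
toy_no_ukraine <- subset(toy, region != "Ukraine")

# Draw the density map only if the maps package (used by map_data) is available.
if (requireNamespace("maps", quietly = TRUE)) {
  world  <- map_data("world")
  map_df <- merge(world, toy, by = "region", all.x = TRUE)
  # merge() reorders rows, so restore the polygon drawing order.
  map_df <- map_df[order(map_df$group, map_df$order), ]

  p <- ggplot(map_df, aes(long, lat, group = group, fill = log_density)) +
    geom_polygon(colour = "grey70") +
    scale_fill_viridis_c(na.value = "grey90", name = "log10(Density)") +
    coord_quickmap() +
    theme_void()
}
```

Country names in the dataset would need to match the region names used by map_data() for the join to work, so some renaming is likely required in practice.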

Density World Map

Growth Rate World Map

Growth Rate World Map (minus Ukraine)

World Map Analysis

A few interesting things stand out in the maps. Population density appears to be highest overall in Asian countries such as India and China. Ukraine has a very steep population decline due to the war, and the bordering European countries have high growth rates due to the arrival of refugees. Finally, the growth rate is highest in African countries. Based on the maps, poorer countries and countries with more immigration tend to have higher population growth.

Plotly 3D plot

The following is a 3D scatter plot of Density, Growth Rate, and Area. Density and Area are plotted on logarithmic axes. The points are colored by continent.
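For reference, a plot of this kind can be sketched with plotly's plot_ly() roughly as follows. The toy data frame is made up for illustration, the column names are assumed to match those used elsewhere in this report, and the plotting step only runs if plotly is installed.

```r
# Made-up rows for illustration only; not values from the dataset.
toy <- data.frame(
  Continent   = c("Asia", "Africa", "Europe", "Oceania"),
  Density     = c(10, 100, 1000, 10),
  Area        = c(1e6, 1e5, 1e4, 1e2),
  Growth.Rate = c(0.01, 0.02, 0.00, 0.01)
)

# Log axes for the two heavily skewed variables; Growth Rate stays linear.
scene <- list(
  xaxis = list(title = "Density", type = "log"),
  yaxis = list(title = "Growth Rate"),
  zaxis = list(title = "Area", type = "log")
)

if (requireNamespace("plotly", quietly = TRUE)) {
  fig <- plotly::plot_ly(
    toy,
    x = ~Density, y = ~Growth.Rate, z = ~Area,
    color = ~Continent,
    type = "scatter3d", mode = "markers"
  )
  fig <- plotly::layout(fig, scene = scene)
}
```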

3D plot Analysis

There are quite a few interesting patterns in this graph. There is a clear inverse relationship between Density and Area, which makes sense mathematically because Density is population divided by Area. Higher Area also seems to correlate with higher Growth Rate, which is interesting and might need further study. Growth Rate seems to peak at a Density of around 100/km², though this might just be noise. Surprisingly, the continents form noticeable clusters or bands: Asia and Africa form bands of high population and high growth rate, respectively. Oceania also clusters around low Area and Density, which makes sense given its many islands.
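The inverse relationship between Density and Area follows from the definition Density = Population / Area. On log axes this becomes log10(Density) = log10(Population) - log10(Area), so for a fixed population the points fall on a straight line with slope -1. A small check with made-up numbers (not dataset values):

```r
# Hold population fixed and vary area over several orders of magnitude.
population <- 1e6
area       <- 10^(1:6)
density    <- population / area

# Fit a straight line on the log-log scale.
fit       <- lm(log10(density) ~ log10(area))
slope     <- unname(coef(fit)[2])  # approximately -1
intercept <- unname(coef(fit)[1])  # approximately log10(population) = 6
```

In the real data population varies between countries as well, so the points scatter around this trend rather than lying exactly on it.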

Plotly Line Graph

A line chart shows the overall population growth of each continent in a clear and understandable way. Since 1970, Asia and Africa have grown the most, while Europe has grown very little.
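One way to get from per-country year columns to continent totals over time is to reshape the data to long format and sum by continent and year. The sketch below assumes column names of the form X1970.Population (what read.csv() would typically produce from a header like "1970 Population"); both the names and the toy values are assumptions, not taken from the report's code.

```r
library(dplyr)
library(tidyr)

# Toy wide-format data with made-up values.
toy <- data.frame(
  Country          = c("A", "B", "C"),
  Continent        = c("Asia", "Asia", "Europe"),
  X1970.Population = c(100, 200, 50),
  X2022.Population = c(300, 400, 60)
)

continent_totals <- toy %>%
  # One row per country and year instead of one column per year.
  pivot_longer(
    cols      = ends_with(".Population"),
    names_to  = "Year",
    values_to = "Population"
  ) %>%
  # Keep only the digits of the column name, e.g. "X1970.Population" -> 1970.
  mutate(Year = as.integer(gsub("\\D", "", Year))) %>%
  group_by(Continent, Year) %>%
  summarise(Population = sum(Population), .groups = "drop")

# One line per continent, drawn only if plotly is installed.
if (requireNamespace("plotly", quietly = TRUE)) {
  fig <- plotly::plot_ly(
    continent_totals,
    x = ~Year, y = ~Population, color = ~Continent,
    type = "scatter", mode = "lines+markers"
  )
}
```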

Ggplot Boxplot Code

I used a box plot to compare the distribution of growth rates across continents. The following code generates the box plot:

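library(ggplot2)
library(scales)  # provides percent_format()

# Assumes world_pop is already loaded. percent_format() multiplies by 100,
# so Growth.Rate is expected as a fraction (0.02 = 2%).
ggplot(world_pop, aes(x = Continent, y = Growth.Rate, fill = Continent)) +
  # Semi-transparent boxes, with outliers drawn as solid red points
  geom_boxplot(alpha = 0.7, outlier.colour = "red", outlier.shape = 16) +
  # Label the y axis as percentages with two decimal places
  scale_y_continuous(labels = percent_format(accuracy = 0.01)) +
  scale_fill_brewer(palette = "Set2") +
  labs(
    title = "Distribution of Population Growth Rates by Continent",
    x = "Continent",
    y = "Growth Rate (%)",
    fill = "Continent"
  ) +
  theme_minimal(base_size = 13) +
  theme(
    # Bold, centered title and angled x labels so continent names do not overlap
    plot.title = element_text(face = "bold", hjust = 0.5),
    axis.text.x = element_text(angle = 45, hjust = 1)
  )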

Ggplot Growth Rate by Continent

From the plot, Africa seems to have the highest growth rate of all the continents, while Europe and North America seem to have the lowest. Asia's growth rate is also not especially high despite its large population.

Statistical analysis

The summary statistics show that Africa has a clearly higher growth rate than the other continents, with a mean of 2.12% compared with less than 1% everywhere else. Asia, South America, and Oceania have moderate mean growth rates (0.74% to 0.94%). North America and Europe have the lowest (0.42% and 0.23%). On top of this, Europe has by far the highest standard deviation (1.92%), showing that its countries span a wide range of growth rates. Note that this is a descriptive comparison only; no significance test was run.

## # A tibble: 6 × 4
##   Continent     Mean_Growth Median_Growth SD_Growth
##   <chr>         <chr>       <chr>         <chr>    
## 1 Africa        2.12%       2.31%         0.80%    
## 2 Asia          0.94%       0.81%         0.92%    
## 3 Europe        0.23%       0.15%         1.92%    
## 4 North America 0.42%       0.41%         0.60%    
## 5 Oceania       0.74%       0.79%         1.01%    
## 6 South America 0.80%       0.63%         0.58%
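A table of this shape can be produced with a grouped summary. The sketch below is one plausible way to do it and is not necessarily the code behind the output above: the <chr> column types suggest the values were formatted as strings, and scales::percent() is assumed here for that step. The toy data is made up and Growth.Rate is assumed to be stored as a fraction.

```r
library(dplyr)
library(scales)

# Made-up growth rates as fractions (0.01 = 1%); not values from the dataset.
toy <- data.frame(
  Continent   = c("Africa", "Africa", "Europe", "Europe"),
  Growth.Rate = c(0.01, 0.03, -0.01, 0.01)
)

growth_summary <- toy %>%
  group_by(Continent) %>%
  summarise(
    # percent() returns character strings, which is why the columns are <chr>.
    Mean_Growth   = percent(mean(Growth.Rate), accuracy = 0.01),
    Median_Growth = percent(median(Growth.Rate), accuracy = 0.01),
    SD_Growth     = percent(sd(Growth.Rate), accuracy = 0.01)
  )
```

Formatting inside summarise() is convenient for display, but keeping the numeric values in a separate table makes later calculations easier.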

Conclusion and Thanks for your time!

Based on the above graphs and plots, it seems clear that Africa has the highest growth rate overall. Area seems to have a positive correlation with growth rate. Europe seems to have a wide range of growth rates, though this could just be a result of Ukraine's unusually low growth rate. It was also very interesting to see on the world maps how Ukraine had such a low growth rate while the countries to its west had higher growth rates. This pattern is consistent with the movement of refugees, which is really cool.