This project uses data from the 2022 Human Freedom Index
(HFI), an annual report that evaluates and ranks countries’
levels of human freedom based on personal and economic aspects. The
variables in the dataset include the name of the country, the year of
the observation (the data span 2000 to 2020), the region, and the
overall human freedom score (hf), along with the overall rank
and quartile. The remaining variables fall into two main
categories, personal freedom (pf) and economic freedom
(ef). Each of these two categories has several
subcategories, and each subcategory has its own subcategories. The
data used in the HFI were collected from various international
organizations, including the World Bank, the International Monetary
Fund, and the United Nations, among other reputable sources (Vásquez et
al.). The Cato Institute and the Fraser Institute, the
organizations that developed and compiled the HFI, then analyze the data
and assign scores for each aspect of freedom to each country based on
expert assessments and quantitative indicators. These scores are
standardized to a common scale from 0 to 10 for
consistency. With regard to prior cleaning for this project, I did not
have to clean the variable names, as they were already formatted consistently
(all lowercase and no spaces); the only cleaning involved was
excluding missing values during calculations. I chose this topic because
many parts of the world are currently suffering from the fallout of
international conflict, and the information in the Human Freedom Index
may shed light on the factors contributing to these conflicts.
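As a minimal sketch of the missing-value handling described above, the following uses a made-up vector (not values from the HFI file) to show how `na.rm = TRUE` drops `NA` entries before a summary is computed:

```r
# Hypothetical scores with one missing value (not real HFI data)
scores <- c(7.2, NA, 6.8, 8.1)

# Without na.rm, a single NA makes the whole mean NA
mean(scores)

# With na.rm = TRUE, the NA is dropped before averaging
mean_no_na <- mean(scores, na.rm = TRUE)
mean_no_na

# Count how many values were missing
n_missing <- sum(is.na(scores))
n_missing
```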
How has overall human freedom changed globally from 2000 to 2020?
Are personal or economic factors more influential on a country’s overall level of human freedom?
How do scores for various aspects of personal freedom compare across different regions in 2020?
Throughout history, diverse perspectives on the true essence of freedom have emerged. Philosophers such as Plato and Hobbes argued for a structured society with strict rules, believing that this was crucial to prevent chaos and maintain safety. In their view, limiting individual freedoms was a necessary sacrifice for the greater good of societal order (Vásquez et al.). Figures like Lao Tzu and John Locke offered a contrasting perspective. They proposed that genuine freedom means allowing individuals to make their own decisions without unwarranted interference. In their view, true freedom transcends a mere absence of regulations; it involves granting people the autonomy to lead their lives on their own terms. This ongoing debate has significantly influenced our modern understanding of freedom, emphasizing that it is a dynamic concept that goes beyond simply not having rules.
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.3 ✔ readr 2.1.4
## ✔ forcats 1.0.0 ✔ stringr 1.5.0
## ✔ ggplot2 3.4.3 ✔ tibble 3.2.1
## ✔ lubridate 1.9.2 ✔ tidyr 1.3.0
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(plotly)
##
## Attaching package: 'plotly'
##
## The following object is masked from 'package:ggplot2':
##
## last_plot
##
## The following object is masked from 'package:stats':
##
## filter
##
## The following object is masked from 'package:graphics':
##
## layout
hfi <- read_csv("hfi_cc_2022.csv")
## Rows: 3465 Columns: 141
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (4): countries, region, ef_government_tax_income_data, ef_government_t...
## dbl (137): year, hf_score, hf_rank, hf_quartile, pf_rol_procedural, pf_rol_c...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
q1 <- hfi |> group_by(region, year) |> filter(!is.na(hf_score)) |> summarise(mean_freedom = mean(hf_score, na.rm = TRUE))
## `summarise()` has grouped output by 'region'. You can override using the
## `.groups` argument.
q1
## # A tibble: 208 × 3
## # Groups: region [10]
## region year mean_freedom
## <chr> <dbl> <dbl>
## 1 Caucasus & Central Asia 2000 7.15
## 2 Caucasus & Central Asia 2003 7.39
## 3 Caucasus & Central Asia 2004 7.12
## 4 Caucasus & Central Asia 2005 6.87
## 5 Caucasus & Central Asia 2006 6.83
## 6 Caucasus & Central Asia 2007 6.81
## 7 Caucasus & Central Asia 2008 6.75
## 8 Caucasus & Central Asia 2009 6.63
## 9 Caucasus & Central Asia 2010 6.54
## 10 Caucasus & Central Asia 2011 6.54
## # ℹ 198 more rows
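The `summarise()` message above appears because the result is still grouped by `region`. A minimal sketch of one way to silence it, assuming an ungrouped result is acceptable for plotting, is to pass `.groups = "drop"`. The data frame `toy` below is hypothetical and its values are made up:

```r
library(dplyr)

# Hypothetical toy data standing in for the HFI file (values are made up)
toy <- data.frame(
  region   = c("A", "A", "A", "B", "B", "B"),
  year     = c(2019, 2019, 2020, 2019, 2020, 2020),
  hf_score = c(7.0, 8.0, NA, 5.0, 6.0, 4.0)
)

toy_means <- toy |>
  # Drop rows with a missing score first, so na.rm is not needed later
  filter(!is.na(hf_score)) |>
  group_by(region, year) |>
  # .groups = "drop" returns an ungrouped tibble and silences the message
  summarise(mean_freedom = mean(hf_score), .groups = "drop")

toy_means
```

Because the only score for region A in 2020 is missing, that region-year combination does not appear in the result at all, which is the same reason the real table can have fewer rows than regions times years.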
# Customizing aesthetics, scales, labels, and themes for the plot
viz1 <- q1 |> ggplot(aes(x = year, y = mean_freedom, color = region, text = paste("Region: ", region, "\nYear: ", year, "\nMean Human Freedom Score: ", mean_freedom))) +
geom_line(aes(group = region), linewidth = 0.8) +
scale_color_manual(values = c("red2", "#FF7F00", "#FFD700", "#4DAF4A", "blue", "#984EA3", "pink", "#000000", "gray70", "skyblue")) +
theme_minimal() +
labs(
x = "Year",
y = "Mean Human Freedom Score",
color = "Region",
title = "Global Trends in Human Freedom (2000-2020)"
) +
theme(plot.caption = element_text(hjust = 1))
# Converting the ggplot object to a plotly object for interactivity
viz1 <- ggplotly(viz1) |>
# Adding annotations to the plot for the caption and resizing the margins
layout(
annotations = list(
text = "Source: The Human Freedom Index 2022",
showarrow = FALSE,
xref = "paper",
yref = "paper",
x = 1.55,
y = -0.19,
font = list(size = 12)),
margin = list(l = 60, r = 50, b = 75, t = 60)
) |>
# Adding a trace to the plotly object for custom hover text
add_trace(
data = q1,
x = ~year,
y = ~mean_freedom,
type = "scatter",
mode = "lines",
line = list(width = 0.8),
hoverinfo = "text",
text = ~paste("Region: ", region, "\nYear: ", year, "\nMean Human Freedom Score: ", mean_freedom),
showlegend = FALSE
)
viz1
## Warning: 'scatter' objects don't have these attributes: 'colour'
## Valid attributes include:
## 'cliponaxis', 'connectgaps', 'customdata', 'customdatasrc', 'dx', 'dy', 'error_x', 'error_y', 'fill', 'fillcolor', 'fillpattern', 'groupnorm', 'hoverinfo', 'hoverinfosrc', 'hoverlabel', 'hoveron', 'hovertemplate', 'hovertemplatesrc', 'hovertext', 'hovertextsrc', 'ids', 'idssrc', 'legendgroup', 'legendgrouptitle', 'legendrank', 'line', 'marker', 'meta', 'metasrc', 'mode', 'name', 'opacity', 'orientation', 'selected', 'selectedpoints', 'showlegend', 'stackgaps', 'stackgroup', 'stream', 'text', 'textfont', 'textposition', 'textpositionsrc', 'textsrc', 'texttemplate', 'texttemplatesrc', 'transforms', 'type', 'uid', 'uirevision', 'unselected', 'visible', 'x', 'x0', 'xaxis', 'xcalendar', 'xhoverformat', 'xperiod', 'xperiod0', 'xperiodalignment', 'xsrc', 'y', 'y0', 'yaxis', 'ycalendar', 'yhoverformat', 'yperiod', 'yperiod0', 'yperiodalignment', 'ysrc', 'key', 'set', 'frame', 'transforms', '_isNestedKey', '_isSimpleKey', '_isGraticule', '_bbox'
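The warning above comes from the extra trace added for the hover text. A possible alternative, sketched here on hypothetical toy data rather than the HFI file, is to keep the `text` aesthetic in the ggplot and ask `ggplotly()` to use it through its `tooltip` argument, so no second trace is needed. The sketch assumes ggplot2 3.4.0 or later (for `linewidth`) and has not been checked against the full dataset:

```r
library(ggplot2)
library(plotly)

# Hypothetical toy data (made-up values, not from the HFI file)
toy <- data.frame(
  region       = rep(c("A", "B"), each = 3),
  year         = rep(2018:2020, times = 2),
  mean_freedom = c(7.1, 7.0, 6.8, 5.2, 5.3, 5.0)
)

p <- ggplot(
  toy,
  aes(
    x = year, y = mean_freedom, color = region, group = region,
    # `text` is picked up by ggplotly() for hover labels;
    # plotly hover text uses <br> for line breaks
    text = paste0("Region: ", region, "<br>Year: ", year,
                  "<br>Mean score: ", round(mean_freedom, 2))
  )
) +
  geom_line(linewidth = 0.8) +
  theme_minimal()

# tooltip = "text" shows only the custom text on hover
p_interactive <- ggplotly(p, tooltip = "text")
p_interactive
```

Setting `group = region` explicitly keeps each region drawn as one line even though the `text` aesthetic differs on every row.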
Some key takeaways from this plot are that North America and Western Europe have consistently ranked in the top two for average human freedom scores, while the Middle East & North Africa (MENA) region has remained at the bottom. The MENA region and the Caucasus & Central Asia (CCA) region also appear to have experienced sharp drops in their freedom scores around 2004. This could be due to the rise in terrorism in the MENA region following the 2003 Iraq War, which may have lowered freedom scores, especially in the Security and Safety category, in the MENA region as well as in the geographically adjacent CCA region (“Middle East”). Additionally, all regions appear to have seen drops between 2019 and 2020, during the outbreak of COVID-19, which suggests the pandemic may have contributed to a worldwide decline in human freedom.
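The drops read off the plot could also be checked numerically. The following is a minimal sketch on hypothetical toy data (made-up values) that computes each region's change from the previous available year with `dplyr::lag()`:

```r
library(dplyr)

# Hypothetical regional means (made-up values, not from the HFI file)
toy <- data.frame(
  region       = rep(c("A", "B"), each = 3),
  year         = rep(2018:2020, times = 2),
  mean_freedom = c(7.1, 7.0, 6.8, 5.2, 5.3, 5.0)
)

toy_change <- toy |>
  arrange(region, year) |>
  group_by(region) |>
  # Change relative to the previous row within the region; if a year is
  # missing, this compares against the last available year instead
  mutate(change = mean_freedom - dplyr::lag(mean_freedom)) |>
  ungroup()

# Negative values indicate a decline between 2019 and 2020
toy_change |> filter(year == 2020)
```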
# Use `geom_point()` and `geom_smooth(method = "lm")` to show the points with the regression lines
pf_corr <- hfi |> ggplot(aes(x = pf_score, y = hf_score)) +
geom_point() +
geom_smooth(method = "lm", se = FALSE, color = "blue") +
labs(
x = "Personal Freedom Score",
y = "Human Freedom Score",
title = "Correlation Between Personal and Overall Human Freedom",
caption = "Source: The Human Freedom Index 2022"
) +
theme_minimal() +
theme(plot.caption = element_text(hjust = 1))
ef_corr <- hfi |> ggplot(aes(x = ef_score, y = hf_score)) +
geom_point() +
geom_smooth(method = "lm", se = FALSE, color = "red") +
labs(
x = "Economic Freedom Score",
y = "Human Freedom Score",
title = "Correlation Between Economic and Overall Human Freedom",
caption = "Source: The Human Freedom Index 2022"
) +
theme_minimal() +
theme(plot.caption = element_text(hjust = 1))
## load `gridExtra` package and use `grid.arrange` to display both plots in the same output
library(gridExtra)
##
## Attaching package: 'gridExtra'
## The following object is masked from 'package:dplyr':
##
## combine
grid.arrange(pf_corr, ef_corr)
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 382 rows containing non-finite values (`stat_smooth()`).
## Warning: Removed 382 rows containing missing values (`geom_point()`).
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 382 rows containing non-finite values (`stat_smooth()`).
## Removed 382 rows containing missing values (`geom_point()`).
Looking at these scatterplots, it appears that both personal and economic freedom scores have strong positive correlations with overall freedom. Thus, the level of freedom experienced by the citizens of a country is associated with both social and economic factors. However, because the two plots look very similar in the slope of their regression lines, the following summary statistics for the regression models better illustrate which aspect of freedom is more influential.
cor(hfi$pf_score, hfi$hf_score, use = "complete.obs")
## [1] 0.9675184
summary(lm(hf_score ~ pf_score, data = hfi))
##
## Call:
## lm(formula = hf_score ~ pf_score, data = hfi)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.3650 -0.1950 0.0371 0.2004 0.8874
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.566832 0.026815 58.43 <2e-16 ***
## pf_score 0.755423 0.003556 212.44 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.3114 on 3081 degrees of freedom
## (382 observations deleted due to missingness)
## Multiple R-squared: 0.9361, Adjusted R-squared: 0.9361
## F-statistic: 4.513e+04 on 1 and 3081 DF, p-value: < 2.2e-16
The correlation analysis between personal freedom and human freedom reveals a very strong positive correlation of 0.97, indicating the two variables are closely related. The model’s R-squared value of 0.94 means about 94% of the variance in human freedom scores can be explained by variation in personal freedom scores. This suggests that personal freedom has a large influence on a country’s overall human freedom.
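For a regression with a single predictor, the R-squared is simply the square of the Pearson correlation, which is why the two numbers above tell the same story. A short check, first on the reported correlation and then on hypothetical simulated data (the values of `x` and `y` are made up):

```r
# Reported correlation from the output above
r_pf <- 0.9675184
r_pf^2  # about 0.936, in line with the reported Multiple R-squared of 0.9361

# The same identity on hypothetical simulated data
set.seed(1)
x <- runif(50, 0, 10)
y <- 1.5 + 0.75 * x + rnorm(50, sd = 0.3)

r  <- cor(x, y)
r2 <- summary(lm(y ~ x))$r.squared

# Squared correlation and R-squared agree up to floating-point error
all.equal(r^2, r2)
```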
cor(hfi$ef_score, hfi$hf_score, use = "complete.obs")
## [1] 0.8283995
summary(lm(hf_score ~ ef_score, data = hfi))
##
## Call:
## lm(formula = hf_score ~ ef_score, data = hfi)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.5498 -0.4251 0.1568 0.4978 1.9800
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.13183 0.08624 1.529 0.126
## ef_score 1.02921 0.01254 82.090 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.69 on 3081 degrees of freedom
## (382 observations deleted due to missingness)
## Multiple R-squared: 0.6862, Adjusted R-squared: 0.6861
## F-statistic: 6739 on 1 and 3081 DF, p-value: < 2.2e-16
In this model, the correlation coefficient is strong at 0.83, demonstrating a positive association between economic freedom and human freedom. The linear regression model is also highly significant (p-value: < 2.2e-16). However, both the correlation coefficient and the R-squared value, which stands at 0.69, are lower than in the personal freedom model. This suggests that economic factors, while influential, explain a smaller proportion of the variability in overall human freedom scores.
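Two separate one-predictor models cannot separate the overlapping influence of personal and economic freedom when the two are themselves correlated. One way to compare them side by side, sketched here on simulated data, is to standardize all variables and fit both predictors in one model. Everything below is hypothetical: the variables `pf`, `ef`, and `hf` are simulated, and the equal-weight construction of `hf` is an assumption made only for the toy example:

```r
set.seed(42)
n <- 200

# Hypothetical simulated scores (not HFI data)
pf <- runif(n, 3, 10)                 # toy personal freedom
ef <- 0.4 * pf + runif(n, 2, 5)       # toy economic freedom, correlated with pf
hf <- 0.5 * pf + 0.5 * ef + rnorm(n, sd = 0.1)  # toy overall score (assumed blend)

toy <- data.frame(hf, pf, ef)

# scale() centers each column and divides by its standard deviation,
# so the slopes are expressed in comparable standard-deviation units
toy_std <- as.data.frame(scale(toy))

fit <- lm(hf ~ pf + ef, data = toy_std)
coef(fit)
```

With standardized variables the intercept is essentially zero, and the larger slope indicates the predictor more strongly associated with the outcome once the other is held fixed.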
q3 <- hfi |> filter(year == 2020) |> group_by(region) |> summarise(
## After many attempts to get `plotly` to preserve my formatting for the legend items and hover information, I decided to format the names here while creating the data frame for the plot instead of leaving them in their cleaned form.
"Rule of Law" = mean(pf_rol, na.rm = TRUE),
"Security and Safety" = mean(pf_ss, na.rm = TRUE),
"Freedom to Travel" = mean(pf_movement, na.rm = TRUE),
"Freedom of Religion" = mean(pf_religion, na.rm = TRUE),
"Freedom to Assemble Peacefully" = mean(pf_assembly, na.rm = TRUE),
"Freedom of Expression" = mean(pf_expression, na.rm = TRUE),
"Freedom to Choose Relationships" = mean(pf_identity, na.rm = TRUE)
) |>
# Use `pivot_longer()` to place all the subcategories of personal freedom into a single categorical column, with their corresponding mean scores in a separate column. Format the variable names as they should appear on the plot.
pivot_longer(
cols = c("Rule of Law", "Security and Safety", "Freedom to Travel", "Freedom of Religion", "Freedom to Assemble Peacefully", "Freedom of Expression", "Freedom to Choose Relationships"),
names_to = "Component",
values_to = "Mean Score"
) |>
# Rename `region` so the column name matches the label shown on the plot
rename(Region = region) |>
select(Region, Component, `Mean Score`)
q3
## # A tibble: 70 × 3
## Region Component `Mean Score`
## <chr> <chr> <dbl>
## 1 Caucasus & Central Asia Rule of Law 4.87
## 2 Caucasus & Central Asia Security and Safety 8.11
## 3 Caucasus & Central Asia Freedom to Travel 6.71
## 4 Caucasus & Central Asia Freedom of Religion 5.83
## 5 Caucasus & Central Asia Freedom to Assemble Peacefully 5.58
## 6 Caucasus & Central Asia Freedom of Expression 4.71
## 7 Caucasus & Central Asia Freedom to Choose Relationships 8.96
## 8 East Asia Rule of Law 6.53
## 9 East Asia Security and Safety 9.31
## 10 East Asia Freedom to Travel 6.64
## # ℹ 60 more rows
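As an alternative to typing each mean by hand, the same long table could be built with `across()` and a lookup vector of display labels. This is a minimal sketch on hypothetical toy data; the lookup vector `component_labels` and the values are made up, and only two of the seven components are shown:

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data using two of the HFI column names (values are made up)
toy <- data.frame(
  region = c("A", "A", "B", "B"),
  pf_rol = c(4, 6, 7, NA),
  pf_ss  = c(8, 9, 9, 10)
)

# Hypothetical lookup from column names to the labels shown on the plot
component_labels <- c(pf_rol = "Rule of Law", pf_ss = "Security and Safety")

toy_long <- toy |>
  group_by(region) |>
  # Average every listed column in one step, ignoring missing values
  summarise(across(all_of(names(component_labels)), \(x) mean(x, na.rm = TRUE))) |>
  pivot_longer(-region, names_to = "Component", values_to = "Mean Score") |>
  # Swap column names for display labels and fix their order
  mutate(Component = factor(unname(component_labels[Component]),
                            levels = unname(component_labels))) |>
  rename(Region = region)

toy_long
```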
# This is optional, but I changed the order of the levels of the components so it would match their order of reference in the Human Freedom Index report
q3$Component <- factor(q3$Component, levels = c("Rule of Law", "Security and Safety", "Freedom to Travel", "Freedom of Religion", "Freedom to Assemble Peacefully", "Freedom of Expression", "Freedom to Choose Relationships"))
viz2 <- q3 |> ggplot(aes(x = Region, y = `Mean Score`, fill = Component)) +
## Use `geom_col()` and `coord_flip()` to create the horizontal bar chart, and set the position to `position_dodge()` to place the bars next to each other. Adjust the width of the bars and their spacing if needed.
geom_col(position = position_dodge(width = 0.82), width = 0.7) +
coord_flip() +
theme_minimal() +
labs(
x = "Region",
y = "Mean Score",
title = "Regional Averages of Personal Freedom Factors (2020)",
fill = "Component of Personal Freedom",
caption = "Source: The Human Freedom Index 2022"
) +
scale_fill_manual(
values = c("Rule of Law" = "red",
"Security and Safety" = "orange2",
"Freedom to Travel" = "yellow2",
"Freedom of Religion" = "green4",
"Freedom to Assemble Peacefully" = "skyblue2",
"Freedom of Expression" = "purple",
"Freedom to Choose Relationships" = "black")
)
# Convert the ggplot object to a plotly object for interactivity
viz2 <- ggplotly(viz2)
# Add annotations to the plot for the caption and resize the margins; adjust the position of caption
viz2 <- viz2 |> layout(
annotations = list(
text = "Source: The Human Freedom Index 2022",
showarrow = FALSE,
xref = "paper",
yref = "paper",
x = 2.3,
y = -0.19,
font = list(size = 11)),
margin = list(l = 80, r = 50, b = 75, t = 60),
yaxis = list(title = list(standoff = 10)),
title = list(
text = "Regional Averages of Personal Freedom Factors (2020)",
x = 0.5
)
)
viz2
This horizontal bar chart shows mean scores for each
subcategory/component of personal freedom across global regions. Using
plotly’s isolation feature to compare each region’s scores
one component at a time, the regional pattern matches the
first visualization: Western Europe and North America consistently
rank among the top scorers in each category, while the Middle East &
North Africa scores the lowest. Comparing the distribution of mean scores
across the regions, it seems that the freedom to choose relationships,
the freedom of religion, and the security and safety component of
freedom mostly scored high throughout the regions, while the rule
of law and freedom to travel categories tended to receive the lowest
scores.
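The comparisons made by eye above could also be tabulated. A minimal sketch, using a hypothetical long data frame shaped like `q3` (the regions and values are made up), finds the highest- and lowest-scoring region for each component:

```r
library(dplyr)

# Hypothetical long table with the same column names as q3 (values are made up)
toy_long <- data.frame(
  Region       = rep(c("A", "B", "C"), times = 2),
  Component    = rep(c("Rule of Law", "Security and Safety"), each = 3),
  `Mean Score` = c(4.9, 6.5, 7.8, 8.1, 9.3, 9.0),
  check.names  = FALSE
)

extremes <- toy_long |>
  group_by(Component) |>
  summarise(
    highest = Region[which.max(`Mean Score`)],  # region with the top mean
    lowest  = Region[which.min(`Mean Score`)],  # region with the bottom mean
    .groups = "drop"
  )

extremes
```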
This exploration of the Human Freedom Index (HFI) from 2000 to 2020 yielded several insights into the global dynamics of freedom. The line chart revealed consistently high rankings for North America and Western Europe, contrasting with the persistently low scores in the Middle East & North Africa (MENA) region. I inferred that the substantial drop in scores of the MENA and Caucasus & Central Asia (CCA) regions around 2004 aligned with the aftermath of the 2003 Iraq War, indicating a potential impact of geopolitical events on freedom. The global decline in scores during the COVID-19 pandemic suggested its widespread influence on overall human freedom. The scatterplots and linear regression models showed strong positive correlations between each main category of freedom (personal and economic) and overall human freedom, with personal freedom emerging as the stronger contributor. The horizontal grouped bar chart further reinforced regional trends and revealed consistently high scores in components like freedom of religion and security, while highlighting lower scores in the rule of law and freedom to travel. For further exploration, I would like to run a multiple regression analysis to identify which of the 83 distinct indicators of personal and economic freedom contribute most to overall human freedom scores. Such findings would help provide a more detailed analysis of global freedom dynamics and could improve the precision of future policy recommendations and interventions.
“Middle East - Modern History.” Council on Foreign Relations, World101, world101.cfr.org/rotw/middle-east/modern-history.
Vásquez, Ian, et al. The Human Freedom Index 2022. Cato Institute and Fraser Institute, 2022, www.fraserinstitute.org/sites/default/files/human-freedom-index-2022.pdf.