For my first IS460 Module Assignment, I explored a data set about Economic Freedom from the Fraser Institute. This data set caught my attention as a Finance (and Data Science) major because I have not taken many economics courses, nor have I learned much about Economic Freedom. Now, reading this, you may be asking yourself, “What does data on Economic Freedom contain?” This is the exact question that drew me to the data set. The variables describing Economic Freedom in this data set are divided into five areas: 1) governmental financials and taxes, 2) executive, judicial, and legislative grades, 3) fiscal and currency conditions, 4) capital transfers, tariffs, and trade information, and 5) financial and business market conditions. Together, these five areas cover more than 80 variables, six decades, and 165 countries.
Before you continue reading my analysis of this Economic Freedom data set, feel free to explore the data for yourself at this link if you are curious: https://efotw.org/economic-freedom/dataset?geozone=world&page=dataset&min-year=2&max-year=0&filter=0
This data set…(talk about it)
The variables I will be focusing on are:
Year, Country, EFW Rank, World Bank Region, World Bank Income Classification…
# Loading the libraries this section needs
library(data.table) # fread()
library(dplyr)      # select(), filter(), count(), %>%
library(ggplot2)    # ggplot() and friends
library(scales)     # comma()
# Setting up my working directory
setwd("C:/Users/raymo/OneDrive/Documents/IS460")
# Calling in the data
efw <- fread("efotw-2025-master-index-data-for-researchers-iso-efw-index.csv")
names(efw)[5] <- "EFW_Rank" # Renaming a column; https://r-lang.com/names-in-r/
efw_lean <- efw %>%
  select(Year, Countries, EFW_Rank) %>% # Keeps only the columns needed for the visualization
  filter(Year > 1999) %>%               # Filters out years before 2000
  data.frame()
efw_top10 <- efw_lean[efw_lean$EFW_Rank %in% 1:10, ]  # Keeps only rows with a top 10 ranking
efw_top10_count <- dplyr::count(efw_top10, Countries) # New data frame with each country's number of top 10 rankings (column n)
max_y <- plyr::round_any(max(efw_top10_count$n), 4, ceiling) # Rounds the largest count up to a multiple of 4 for the axis limit
ggplot(efw_top10_count, aes(x = reorder(Countries, n), # Identifies the data set and variables in the graph
                            y = n,
                            fill = n)) +
  geom_col() +     # Draws bars whose heights are the counts (same as geom_bar(stat="identity"))
  coord_flip() +   # Flips the axes so the bars run horizontally
  theme_minimal() +
  labs(title = "Country Count of EFW Top 10 Rankings (since 2000)", # Titles for the chart and axes
       x = "Countries",
       y = "Count") +
  theme(plot.title = element_text(hjust = 0.5)) + # Centers the chart title
  scale_y_continuous(labels = comma,              # Sets the count axis to run from 0 to max_y in steps of 4
                     breaks = seq(0, max_y, by=4),
                     limits = c(0, max_y))
#-------------------------------------------------------------------------------
This visualization was the first idea I had when exploring this data set, as I wanted to quickly see which countries have been recognized for having high levels of economic freedom. I was curious to see how the country rankings have changed in recent years, especially since the turn of the 21st century, so I made a subset data frame that excludes any data from before 2000.
My findings are as follows: six countries (the United States, Switzerland, Singapore, New Zealand, Hong Kong SAR China, and Australia) have held a top ten ranking in every year since 2000. The United Kingdom, Ireland, Denmark, and Canada have all received at least a decade’s worth of top ten rankings, while Mauritius and Germany have both received more than five years of top ten rankings. Rounding out the list of the 19 nations with top ten rankings are the following: Finland, Estonia, Luxembourg, Iceland, and the Netherlands. Each of these nations has fewer than four years of top ten rankings.
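A quick way to double-check the "every year" claim is to compare each country's number of top ten appearances with the number of years covered. The snippet below is only a minimal sketch of that idea: the `toy` data frame and its country names "A", "B", and "C" are made up for illustration and are not taken from the Fraser Institute file.

```r
library(dplyr)

# Hypothetical toy data (invented for illustration, not real EFW rankings)
toy <- data.frame(
  Year      = c(2000, 2000, 2001, 2001, 2002, 2002),
  Countries = c("A", "B", "A", "C", "A", "B"),
  EFW_Rank  = c(1, 2, 1, 2, 1, 2)
)

n_years <- n_distinct(toy$Year) # Number of years covered by the data

always_top <- toy %>%
  filter(EFW_Rank %in% 1:10) %>% # Keeps only top 10 rows
  distinct(Year, Countries) %>%  # Guards against a country appearing twice in one year
  dplyr::count(Countries) %>%    # Top 10 appearances per country (column n)
  filter(n == n_years)           # Keeps countries that appear in every year

always_top$Countries
```

With the real data, the same pattern could be applied to efw_lean in place of `toy`, assuming its columns keep the names used earlier.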
viz_years <- c("2003", "2013", "2023") # Selecting three years for visualization
efw_income_classes <- efw %>% # Reducing data based on...
  select("Year", "Countries", "World Bank Region", # ... these variables...
         "World Bank Current Income Classification, 1990-Present") %>%
  filter(Year %in% viz_years) %>% # ... and these years...
  data.frame() # ... into a new data frame
efw_income_classes$Year <- as.factor(efw_income_classes$Year) # Factoring the 'Year' variable
names(efw_income_classes)[3] <- "WB_Region" # Renaming for ease
names(efw_income_classes)[4] <- "WB_Income_Classification" # Renaming for ease
yearcount <- data.frame(dplyr::count(efw_income_classes, Year, # Getting yearly counts of each income class (column n)
                                     WB_Income_Classification))
yearcount <- yearcount[2:13,] # Keeps rows 2 through 13, dropping the rows (not columns) that are not needed
income_levels <- c('High Income', 'Upper-Middle Income', # Ordering of the income levels, highest to lowest
                   'Lower-Middle Income', 'Low Income')
yearcount$WB_Income_Classification <- factor(yearcount$WB_Income_Classification, # Factoring income level variable in that order
                                             levels = income_levels)
# Visualization Code:
library(plotly) # plot_ly(), layout(), add_trace()
plot_ly(hole=0.7) %>%
  layout(title="Countries by Income Classifications (2003, 2013, 2023)") %>%
  add_trace(data = yearcount[yearcount$Year==2023,], # Setting up outermost ring (2023)
            labels = ~WB_Income_Classification,
            values = ~n,
            type="pie",
            textposition="inside",
            hovertemplate="Year: 2023<br>Classification:%{label}<br>Percent:%{percent}<br>Country Count: %{value}<extra></extra>") %>%
  add_trace(data = yearcount[yearcount$Year==2013,], # Setting up middle ring (2013)
            labels = ~WB_Income_Classification,
            values = ~n,
            type="pie",
            textposition="inside",
            hovertemplate="Year: 2013<br>Classification:%{label}<br>Percent:%{percent}<br>Country Count: %{value}<extra></extra>",
            domain=list(
              x=c(0.17, 0.83), # Shrinking the ring so it sits inside the 2023 ring
              y=c(0.17, 0.83))) %>%
  add_trace(data = yearcount[yearcount$Year==2003,], # Setting up innermost ring (2003)
            labels = ~WB_Income_Classification,
            values = ~n,
            type="pie",
            textposition="inside",
            hovertemplate="Year: 2003<br>Classification:%{label}<br>Percent:%{percent}<br>Country Count: %{value}<extra></extra>",
            domain=list(
              x=c(0.29, 0.71), # Shrinking the ring so it sits inside the 2013 ring
              y=c(0.29, 0.71)))
The visualization above contains three pie charts, each shaped as a circular ring that shows the proportions of countries in each income classification. The innermost ring represents 2003; the middle ring represents 2013; the outermost ring represents 2023. The proportion of countries classified as ‘High Income’ has increased from 22.7% in 2003 to 35.6% in 2023. In contrast, the proportion of countries classified as ‘Low Income’ has decreased from 32.5% in 2003 to 12.9% in 2023. Interestingly, the two middle classifications show opposite trends. The proportion of ‘Upper-Middle Income’ countries increased from 2003 (17.2%) to 2013 (26.7%), then decreased slightly to 25.2% in 2023. ‘Lower-Middle Income’ followed the reverse pattern: 27.6% of countries fell into this category in 2003, the proportion decreased to 23% in 2013, and it climbed back up to 26.4% in 2023.
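The percentages quoted above are the ones plotly shows in the hover text. If I wanted them in a table instead, one possible approach is to divide each count by its year's total. This is a minimal sketch under assumptions: the `toy_counts` values are invented for illustration and are not the real country counts.

```r
library(dplyr)

# Hypothetical counts (invented for illustration, not the real EFW data)
toy_counts <- data.frame(
  Year = c(2003, 2003, 2023, 2023),
  WB_Income_Classification = c("High Income", "Low Income",
                               "High Income", "Low Income"),
  n = c(10, 30, 25, 15)
)

toy_shares <- toy_counts %>%
  group_by(Year) %>%                           # Works within each year separately
  mutate(pct = round(100 * n / sum(n), 1)) %>% # Share of that year's countries, in percent
  ungroup()

toy_shares
```

The same group_by() and mutate() steps should carry over to yearcount, assuming it keeps the Year, WB_Income_Classification, and n columns built earlier.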
Coming soon…
Coming soon…
Coming soon…
As seen in the visualizations above, my takeaways from my analysis of this data set are…
The libraries used for this project are as follows: plyr, dplyr, tidyr, scales, plotly, ggplot2, ggrepel, cowplot, ggthemes, DescTools, lubridate, data.table, htmlwidgets, and RColorBrewer.