In this document I contrast country scores in the Heritage Foundation Index of Economic Freedom, focusing on the U.S. versus the Nordic countries Denmark, Finland, Norway, and Sweden, with New Zealand included as an example of a top-ranking country. (I don’t include Iceland because of its very small population.)
This document includes the code to produce the graph heading my post on social democracy published as part of my “Seven Answers” series of blog posts.
I use the tidyverse set of functions for general data manipulation and the knitr package to create a formatted table.
library("tidyverse")
library("knitr")
I use the following CSV file in this analysis:
ief-scores-1995-2018.csv contains the full list of scores for all countries for the years 1995 through 2018. For more information on how I created this file see the “References” section.
Each row of the file contains the following 15 variables (“chr” indicates a character string, “int” an integer value, and “num” a numeric value with decimal point):
- name (chr). The country name (e.g., “United States”) associated with the scores.
- index_year (int). The year in which the index was published (e.g., 2018). This is usually the year after the year of data from which the component scores are calculated.
- overall_score (num). The average of the 12 component score variables for a given index year and country.
- property_rights (num). This and the following component scores run from 0 to 100, and are rounded to 1 digit past the decimal point (thus 3 significant digits).
- government_integrity (num).
- judicial_effectiveness (num).
- tax_burden (num).
- government_spending (num).
- fiscal_health (num).
- business_freedom (num).
- labor_freedom (num).
- monetary_freedom (num).
- trade_freedom (num).
- investment_freedom (num).
- financial_freedom (num).

I begin by reading the CSV file ief-scores-1995-2018.csv into the table ief. The file does not need any further cleaning or revision before being used.
ief <- read_csv("ief-scores-1995-2018.csv")
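Since the file is used without further cleaning, it can be reassuring to declare the expected column types and confirm that the overall score really is the average of the components. The following is only a sketch under stated assumptions: the two-row file, its two component columns, and the names toy_csv, toy, and scores_consistent are all made up for illustration, and the real file has 12 components rather than 2.

```r
library(readr)

# Hypothetical two-row extract written to a temporary file; the values are
# invented for illustration and are not real index scores.
toy_csv <- tempfile(fileext = ".csv")
writeLines(c(
  "name,index_year,overall_score,property_rights,tax_burden",
  "Country A,2018,70.0,80.0,60.0",
  "Country B,2018,55.5,50.5,60.5"
), toy_csv)

# Declare the column types up front so that unexpected values are reported
# as parsing problems instead of silently changing a column's type.
toy <- read_csv(
  toy_csv,
  col_types = cols(
    name = col_character(),
    index_year = col_integer(),
    .default = col_double()
  )
)

# With only two components in this toy file, the overall score should match
# their average to within the rounding of 1 digit past the decimal point.
component_means <- rowMeans(toy[c("property_rights", "tax_burden")])
scores_consistent <- all(abs(toy$overall_score - component_means) < 0.05 + 1e-9)
```

The same check on the real table would average all 12 component columns instead of the two used here.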
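I do one main plot and three other analyses just for fun:
- A graph of the overall scores by year for the countries of interest (the main plot).
- A stacked bar chart showing how the component scores contribute to the 2018 overall score for each country.
- An alternative 2018 ranking of countries computed without two of the components (government_spending and tax_burden).
- A repeat of the main graph using an alternative overall score (again omitting government_spending and tax_burden).

For the main graph I want only the overall scores, so I create a second table ief_overall with that data for all years and countries.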
ief_overall <- ief %>%
  select(name, index_year, overall_score)
I plot the data by year, showing a separate line for each country of interest. I include all years from 1995 on. (The IEF methodology changed over the years, so scores from earlier years aren’t strictly comparable with scores from later years. However, I’m just looking for overall trends.)
To color the graph lines I use a color palette designed to be more visible for people with color blindness. The palette is set so as to match up with the (alphabetical) order of countries in the legend, so that I can highlight the U.S. score in black and the New Zealand score in gray.
palette <- c("#E69F00", "#56B4E9", "#999999",
             "#009E73", "#CC79A7", "#000000")
countries <- c("Denmark", "Finland", "New Zealand",
               "Norway", "Sweden", "United States")
ief_overall %>%
  filter(index_year >= 1995) %>%
  filter(name %in% countries) %>%
  ggplot(mapping=aes(x=index_year, y=overall_score,
                     group=name, color=name)) +
  geom_line(size=0.8) +
  scale_color_manual(values=palette) +
  coord_cartesian(ylim=c(60, 90)) +
  scale_x_continuous(breaks=seq(1995,2020,5)) +
  labs(x="Year", y="Economic Freedom Score", color="Country") +
  theme_bw()
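The plot above relies on the palette being in the same order as the alphabetized country names. One possible alternative, sketched here with made-up scores, is to attach the country names to the colors so that the mapping is explicit. The names country_colors, toy, and p are my own, and the toy scores are not real index values.

```r
library(ggplot2)

palette <- c("#E69F00", "#56B4E9", "#999999",
             "#009E73", "#CC79A7", "#000000")
countries <- c("Denmark", "Finland", "New Zealand",
               "Norway", "Sweden", "United States")

# Attach country names to the colors so the mapping no longer depends on
# the alphabetical ordering of the legend.
country_colors <- setNames(palette, countries)

# Made-up scores for two countries, purely to show the scale in use.
toy <- data.frame(
  name = rep(c("New Zealand", "United States"), each = 2),
  index_year = rep(c(2017L, 2018L), times = 2),
  overall_score = c(80, 81, 75, 76)
)

# scale_color_manual() matches a named vector to the data by name, so the
# U.S. line stays black even though only two countries are plotted.
p <- ggplot(toy, aes(x = index_year, y = overall_score,
                     group = name, color = name)) +
  geom_line() +
  scale_color_manual(values = country_colors) +
  theme_bw()
```

With a named vector, dropping or adding a country does not shift the colors of the others.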
To investigate the extent to which the various components of the overall score differ from country to country, I create a table ief_subscores containing the scores for the various components that go into the overall scores for 2018, using the following procedure:
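1. Keep only the rows for 2018.
2. Keep only the rows for the countries of interest.
3. Remove the index_year and overall_score columns, which we don’t need.
4. Gather the 12 component columns into a single column score with a new column component to hold the name of the particular component of the overall score.
5. Divide each score by 12, so that the component values for a country add up to its overall score.

ief_subscores <- ief %>%
  filter(index_year == 2018) %>%
  filter(name %in% countries) %>%
  select(-index_year, -overall_score) %>%
  gather(component, score, -name) %>%
  mutate(score=score/12)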
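To see why dividing by the number of components makes the stacked values add up to the overall score, here is a small sketch with made-up numbers and three components instead of twelve. It uses pivot_longer(), which tidyr 1.0.0 and later recommend in place of gather(); the names toy, n_components, toy_subscores, and toy_totals are hypothetical.

```r
library(dplyr)
library(tidyr)

# Made-up scores with three components instead of twelve.
toy <- tibble(
  name = c("Country A", "Country B"),
  property_rights = c(90, 60),
  tax_burden = c(30, 90),
  trade_freedom = c(90, 30)
)
n_components <- 3

# Reshape to one row per country and component, then scale each score by
# the number of components.
toy_subscores <- toy %>%
  pivot_longer(-name, names_to = "component", values_to = "score") %>%
  mutate(score = score / n_components)

# Summing the scaled components per country recovers the average score.
toy_totals <- toy_subscores %>%
  group_by(name) %>%
  summarize(total = sum(score))
```

Here Country A totals 70 and Country B totals 60, which are the plain averages of their three component scores.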
Finally, I create a stacked bar chart showing how the component scores add together to produce the overall scores for each country.
I use an alternate palette for the component fills because the default palette makes it difficult to see how much each component affects the overall result. (The colors repeat because there are more components than colors in the palette, but since the position of the components in the stacks matches the position of the corresponding component names in the legend, it’s still reasonably clear which component is which.)
component_palette <- c("#999999", "#E69F00", "#56B4E9", "#009E73",
                       "#F0E442", "#0072B2", "#D55E00", "#CC79A7",
                       "#999999", "#E69F00", "#56B4E9", "#009E73")
ief_subscores %>%
  ggplot(mapping=aes(x=name, y=score, fill=component)) +
  geom_bar(stat='identity') +
  scale_fill_manual(values = component_palette) +
  labs(x="Country",
       y="Components of Economic Freedom Score",
       fill="Component") +
  theme_bw()
Recall that the thinner the bar for a component, the lower the component score and the more the country is being penalized with respect to that component.
By looking at the component scores I see that some Nordic countries are penalized for their relatively high taxes and levels of government spending. How do the rankings change if we remove these components?
To find out, I first create a table of countries ranked in order of the values of overall_score in 2018. (This corresponds to the official rankings as published on the Heritage Foundation site.) I do this as follows:
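1. Keep only the rows for 2018.
2. Select the name and overall_score columns.
3. Sort by overall_score (highest to lowest).
4. Add a ranking column holding each country’s position in the sorted table.

rankings_2018 <- ief %>%
  filter(index_year == 2018) %>%
  select(name, overall_score) %>%
  arrange(desc(overall_score)) %>%
  mutate(ranking = row_number())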
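One detail worth knowing about this approach is how tied scores are handled. The sketch below uses four rows whose scores are taken from the 2018 comparison table shown later in this document, where Denmark and Taiwan both score 76.6. It contrasts row_number() with dplyr’s min_rank(); the names tied, tied_ranked, position, and shared_rank are my own.

```r
library(dplyr)

# Four rows with scores from the 2018 comparison table; Denmark and
# Taiwan are tied at 76.6.
tied <- tibble(
  name = c("Denmark", "Taiwan", "Luxembourg", "Sweden"),
  overall_score = c(76.6, 76.6, 76.4, 76.3)
)

tied_ranked <- tied %>%
  arrange(desc(overall_score)) %>%
  mutate(
    # row_number() gives every row a distinct position, so tied countries
    # are separated by their order in the table.
    position = row_number(),
    # min_rank() gives tied countries the same rank and skips the next one.
    shared_rank = min_rank(desc(overall_score))
  )
```

With row_number() the two tied countries get positions 1 and 2 in this toy table, while min_rank() gives both of them rank 1 and Luxembourg rank 3.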
Next I compute an alternative overall score using only ten components, with government_spending and tax_burden removed, and then create an alternative ranking from that. I do this as follows:
1. Keep only the rows for 2018.
2. Remove the government_spending and tax_burden columns.
3. Recompute the overall_score variable as the average of the last 10 columns, rounded to 1 digit past the decimal place. (I ignore missing values so that they don’t affect the average.)
4. Select the name and overall_score columns.
5. Sort by overall_score (highest to lowest).
6. Add a ranking column as before.

alt_rankings_2018 <- ief %>%
  filter(index_year == 2018) %>%
  select(-government_spending, -tax_burden) %>%
  mutate(overall_score = round(rowMeans(.[4:13], na.rm=TRUE), 1)) %>%
  select(name, overall_score) %>%
  arrange(desc(overall_score)) %>%
  mutate(ranking = row_number())
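The code above picks the component columns by position (columns 4 through 13), which works because of the known column order. As a hedged alternative, the sketch below selects the columns by name instead, using made-up data with four components and one deliberately missing value. The names toy, dropped, kept, and toy_alt are hypothetical.

```r
library(dplyr)

# Made-up scores with four components; one value is missing on purpose.
toy <- tibble(
  name = c("Country A", "Country B"),
  property_rights = c(80, 70),
  tax_burden = c(40, 90),
  government_spending = c(20, 90),
  trade_freedom = c(90, NA)
)

# Name the columns to leave out, then average whatever components remain.
dropped <- c("tax_burden", "government_spending")
kept <- setdiff(names(toy), c("name", dropped))

# na.rm = TRUE averages only the components that have values, so the
# missing trade_freedom score for Country B does not produce an NA.
toy_alt <- toy %>%
  mutate(alt_overall_score = round(rowMeans(.[kept], na.rm = TRUE), 1))
```

Country A averages its two remaining components (80 and 90) to get 85, while Country B’s score is 70 because only property_rights is available.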
Then for convenience of comparison I create a table that has both sets of rankings side by side. I do this as follows:
1. Rename the name and overall_score columns in the alternative rankings, because I want to keep them separate from the columns of the same name in the original rankings.
2. Join the two tables using ranking as the common variable.
3. Sort by ranking value.
4. Copy ranking into a new alt_ranking column and put the columns in display order.

comparison_rankings_2018 <- alt_rankings_2018 %>%
  rename(alt_name = name, alt_overall_score = overall_score) %>%
  inner_join(rankings_2018, by = "ranking") %>%
  arrange(ranking) %>%
  mutate(alt_ranking = ranking) %>%
  select(ranking, name, overall_score,
         alt_ranking, alt_name, alt_overall_score)
Finally, I display a table of the two sets of rankings compared; for brevity I show only the first 30 countries, which in the original rankings includes all the Nordic countries.
comparison_rankings_2018 %>%
  filter(ranking <= 30) %>%
  rename(Rank=ranking,
         Country=name,
         Score=overall_score,
         "Rank (Alternative Ranking)"=alt_ranking,
         "Country (Alternative Ranking)"=alt_name,
         "Score (Alternative Ranking)"=alt_overall_score) %>%
  kable()
| Rank | Country | Score | Rank (Alternative Ranking) | Country (Alternative Ranking) | Score (Alternative Ranking) |
|---|---|---|---|---|---|
| 1 | Hong Kong | 90.2 | 1 | Hong Kong | 90.0 |
| 2 | Singapore | 88.8 | 2 | New Zealand | 89.1 |
| 3 | New Zealand | 84.2 | 3 | Singapore | 88.4 |
| 4 | Switzerland | 81.7 | 4 | Denmark | 86.8 |
| 5 | Australia | 80.9 | 5 | Liechtenstein | 85.0 |
| 6 | Ireland | 80.4 | 6 | Sweden | 84.8 |
| 7 | Estonia | 78.8 | 7 | Australia | 84.6 |
| 8 | United Kingdom | 78.0 | 8 | Switzerland | 84.5 |
| 9 | Canada | 77.7 | 9 | United Kingdom | 82.6 |
| 10 | United Arab Emirates | 77.6 | 10 | Netherlands | 82.3 |
| 11 | Iceland | 77.0 | 11 | Finland | 82.1 |
| 12 | Denmark | 76.6 | 12 | Ireland | 81.9 |
| 13 | Taiwan | 76.6 | 13 | Estonia | 81.2 |
| 14 | Luxembourg | 76.4 | 14 | Iceland | 80.7 |
| 15 | Sweden | 76.3 | 15 | Norway | 80.6 |
| 16 | Georgia | 76.2 | 16 | Canada | 80.3 |
| 17 | Netherlands | 76.2 | 17 | Luxembourg | 80.3 |
| 18 | United States | 75.7 | 18 | Austria | 79.2 |
| 19 | Lithuania | 75.3 | 19 | Germany | 78.8 |
| 20 | Chile | 75.2 | 20 | United States | 78.7 |
| 21 | Mauritius | 75.1 | 21 | United Arab Emirates | 76.2 |
| 22 | Malaysia | 74.5 | 22 | Czech Republic | 76.0 |
| 23 | Norway | 74.3 | 23 | Belgium | 75.4 |
| 24 | Czech Republic | 74.2 | 24 | Georgia | 75.4 |
| 25 | Germany | 74.2 | 25 | Israel | 75.4 |
| 26 | Finland | 74.1 | 26 | Lithuania | 75.3 |
| 27 | South Korea | 73.8 | 27 | Taiwan | 75.3 |
| 28 | Latvia | 73.6 | 28 | Japan | 74.6 |
| 29 | Qatar | 72.6 | 29 | South Korea | 74.4 |
| 30 | Japan | 72.3 | 30 | Chile | 74.3 |
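Note the dramatic improvements in rankings for the Nordic countries:
- Denmark rises from position 12 to position 4.
- Sweden rises from position 15 to position 6.
- Finland rises from position 26 to position 11.
- Norway rises from position 23 to position 15.

However, the United States remains almost unchanged, dropping from position 18 to position 20.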
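The side-by-side table is joined on ranking, so each row pairs two different countries. To read off how far a given country moved, one option is to join on the country name instead. This is only a sketch: the positions are copied from the comparison table above for the five countries of interest that appear in it, and the names original, alternative, rank_changes, and places_gained are my own.

```r
library(dplyr)

# Positions copied from the comparison table above.
original <- tibble(
  name = c("Denmark", "Sweden", "United States", "Norway", "Finland"),
  ranking = c(12L, 15L, 18L, 23L, 26L)
)
alternative <- tibble(
  name = c("Denmark", "Sweden", "Finland", "Norway", "United States"),
  alt_ranking = c(4L, 6L, 11L, 15L, 20L)
)

# Join on country name so each row describes one country, then compute
# how many places it moved (positive means it moved up).
rank_changes <- original %>%
  inner_join(alternative, by = "name") %>%
  mutate(places_gained = ranking - alt_ranking) %>%
  arrange(desc(places_gained))
```

In this subset Finland gains the most places (15), and the United States is the only country that loses ground (2 places).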
What would the comparison graph above look like if we ignored the spending and taxation components of the overall score? Here I repeat the original graph using an alternative score. I do this as follows:
1. Keep only the rows for the countries of interest.
2. Remove the government_spending and tax_burden variables from the data set.
3. Compute alt_overall_score as we did above, taking the average of the remaining 10 variables. (If there are any missing values we ignore them and just take the average of the variables that do have values.)
4. Select name, index_year, and alt_overall_score.

The countries and palette variables are carried over from the original plot.
ief %>%
  filter(name %in% countries) %>%
  select(-government_spending, -tax_burden) %>%
  mutate(alt_overall_score = round(rowMeans(.[4:13], na.rm=TRUE), 1)) %>%
  select(name, index_year, alt_overall_score) %>%
  ggplot(mapping=aes(x=index_year, y=alt_overall_score,
                     group=name, color=name)) +
  geom_line(size=0.8) +
  scale_color_manual(values=palette) +
  coord_cartesian(ylim=c(60, 95)) +
  scale_x_continuous(breaks=seq(1995,2020,5)) +
  labs(x="Year", y="Alternate Economic Freedom Score", color="Country") +
  theme_bw()
In this alternative perspective Denmark was significantly more economically free than New Zealand for most of the last 10-15 years, losing ground only recently. (In fact, I had to increase the upper limit of the graph from 90 to 95 to avoid cutting off Denmark’s score.) Sweden has been comparable to or better than the United States in economic freedom since the early 2000s, and both Finland and Norway have caught up to and surpassed the U.S. in recent years.
Unfortunately the Heritage Foundation does not provide an easy way to download all scores at once for all years. I worked around this using the following procedure to create the CSV file ief-scores-1995-2018.csv:
I used the following R environment in doing the analysis above:
sessionInfo()
## R version 3.4.4 (2018-03-15)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 16.04.4 LTS
##
## Matrix products: default
## BLAS: /usr/lib/libblas/libblas.so.3.6.0
## LAPACK: /usr/lib/lapack/liblapack.so.3.6.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] bindrcpp_0.2 knitr_1.17 dplyr_0.7.2 purrr_0.2.3
## [5] readr_1.1.1 tidyr_0.7.0 tibble_1.3.3 ggplot2_2.2.1
## [9] tidyverse_1.1.1
##
## loaded via a namespace (and not attached):
## [1] Rcpp_0.12.12 highr_0.6 cellranger_1.1.0 compiler_3.4.4
## [5] plyr_1.8.4 bindr_0.1 forcats_0.2.0 tools_3.4.4
## [9] digest_0.6.12 lubridate_1.6.0 jsonlite_1.5 evaluate_0.10.1
## [13] nlme_3.1-131.1 gtable_0.2.0 lattice_0.20-35 pkgconfig_2.0.1
## [17] rlang_0.1.2 psych_1.7.5 yaml_2.1.14 parallel_3.4.4
## [21] haven_1.1.0 xml2_1.1.1 httr_1.3.0 stringr_1.2.0
## [25] hms_0.3 tidyselect_0.1.1 rprojroot_1.2 grid_3.4.4
## [29] glue_1.1.1 R6_2.2.2 readxl_1.0.0 foreign_0.8-69
## [33] rmarkdown_1.6 modelr_0.1.1 reshape2_1.4.2 magrittr_1.5
## [37] backports_1.1.0 scales_0.4.1 htmltools_0.3.6 rvest_0.3.2
## [41] assertthat_0.2.0 mnormt_1.5-5 colorspace_1.3-2 labeling_0.3
## [45] stringi_1.1.5 lazyeval_0.2.0 munsell_0.4.3 broom_0.4.2
You can find the source code for this analysis and others at my Seven Answers public GitLab repository. This document and its source code are available for unrestricted use, distribution, and modification under the terms of the Creative Commons CC0 1.0 Universal (CC0 1.0) Public Domain Dedication. Stated more simply, you’re free to do whatever you’d like with it.