Erasmus Teaching Mobility

Introduction to the Course: “Statistical Research Methods and Data Exploration”

Dear Students,

Welcome to the world of statistical research methods and data exploration! In this course, we embark on a journey through the realms of statistical analysis, drawing inspiration from the intricacies of the modern data landscape and its profound impact on global trends and economic insights.

Our guiding light for this academic odyssey is the lens of data, insightfully showcased in the article titled “What is happening to productivity in the World? Productivity Gini”. This article serves as a beacon, illuminating the significance of productivity trends and their implications across nations.

As the author of that article and your guide throughout this course, I aim to take you on a compelling exploration through the multifaceted world of statistics and data analysis. We’ll delve into diverse datasets and scenarios, unraveling the mysteries concealed within the numbers, and uncovering the untold stories that data can narrate.

Our expedition commences with an emphasis on the methods behind statistical research. We will dissect the core concepts of statistical analysis, honing our skills in exploring datasets, interpreting distributions, formulating hypotheses, and using specialized criteria for analysis. These fundamental tools will be our compass, guiding us through the terrain of data interpretation and understanding.

Our itinerary doesn’t halt solely at statistical theory. Rather, it extends its boundaries to real-world applications and hands-on experiences. We’ll unravel the predictive powers of data and statistical models, exploring scenarios such as the Spaceship Titanic collision—a Kaggle competition setting—where predictive modeling and research planning take center stage.

Throughout our journey, we aim to equip you with a versatile set of tools—skills in data acquisition, exploration, hypothesis testing, predictive modeling, and research planning. These skills are not only vital in the world of academia but are also indispensable in the practical realms of industry, governance, and global economics.

So, fasten your seatbelts, prepare your analytical compasses, and get ready to navigate the enthralling landscapes of statistical research and data exploration. This course is a platform for us to delve into the captivating world of numbers, patterns, and stories that data conceals.

Together, let us embark on this voyage of discovery and enlightenment, where statistical research methods and data exploration lead us to a deeper understanding of our ever-evolving world.

Welcome aboard!

Accessing the Available Data

Exploring World Development Indicators: A Gateway to Productivity Insights

In our quest to decipher the enigmatic landscape of productivity trends across the globe, our first port of call is an exploration into the realm of World Development Indicators (WDI). These indicators, encapsulated within the World Bank’s vast repository, are our stepping stones to understanding the economic pulse of nations and their productivity dynamics.

In this segment, we embark on an immersive expedition through the WDI—a gateway to a trove of economic data, metrics, and developmental insights. Our primary objective is to empower you with the fundamental skills needed to access, dissect, and glean valuable insights from this rich tapestry of information.

To initiate this journey, we will engage in a simple yet profound demonstration, showcasing how to access and navigate the WDI database using accessible tools, primarily the WDI package in R. Through hands-on exploration, we will highlight the seamless process of fetching pertinent economic indicators, such as GDP, population metrics, and other crucial developmental measures encapsulated within the WDI.

As we dive into the WDI dataset, we aim to illustrate the steps involved in extracting, summarizing, and visualizing key indicators. This immersive experience will equip you with the rudimentary skills to harness the potential of publicly available datasets, particularly the WDI, to unravel trends, patterns, and insights regarding global productivity.

This session acts as your compass, guiding you through the initial steps in your journey toward deciphering the complex tapestry of economic data. By the end of this segment, you will emerge with the prowess to navigate the WDI terrain, laying the groundwork for deeper explorations into the productivity narratives prevalent across the globe.

Join us on this inaugural step as we set sail to explore the WDI, laying the foundation for a deeper understanding of the global economic landscape and its productivity dynamics.

Understanding Packages in R: Your Tools for Data Exploration

In the realm of R, packages act as your trusty tools and resources, offering a multitude of functions and capabilities that augment the core functionalities of the programming language. They are instrumental in extending R’s abilities to encompass diverse data manipulation, visualization, and statistical analysis.

What Are R Packages?

R packages are collections of functions, datasets, and other supplementary materials bundled together to serve a specific purpose. Each package serves as a specialized toolkit, tailored to fulfill distinct analytical or computational needs. These packages are the gears in the machinery of R, enabling you to perform various tasks efficiently and effectively.

Why Do Packages Matter?

As you venture into the realm of statistical research and data exploration, understanding and utilizing packages become integral. They equip you with an arsenal of functions, enabling you to perform tasks ranging from data acquisition and cleaning to statistical analysis and visualization. Using packages saves time and effort by providing pre-written code for complex operations.

How to Access and Use Packages?

In R, accessing and using packages is a straightforward process. The key steps involve installation, loading, and utilization. The process begins by installing a package, which is a one-time operation. Once installed, the package needs to be loaded into your R session to access its functions and datasets. After loading, you can use the functions provided by the package to perform various tasks, adding powerful tools to your analytical toolkit.

Introduction to Package Usage

Throughout our course, we will introduce and utilize specific packages, such as the WDI package for World Bank data retrieval and analysis. We will guide you through the steps of installing, loading, and employing these packages, ensuring you are equipped to access and harness the functionalities they offer.

In your journey of statistical exploration, packages will be your trusted companions, enriching your experience and broadening the horizons of your data analysis endeavors.

Quick Overview of the WDI Package in R

The WDI package in R is a powerful tool designed to access the World Bank’s extensive World Development Indicators (WDI) repository. It serves as a specialized toolkit, simplifying the retrieval of key economic and social data.

Key Features:

  • Enables retrieval of diverse indicators such as GDP, population metrics, and education statistics.

  • Accesses time-series data, allowing trend analysis over time.

  • Provides an intuitive interface to navigate and retrieve specific indicators for countries or regions.

Importance:

  • Essential for understanding global economic trends and productivity insights.

  • Simplifies complex data retrieval, enabling analysis of critical economic indicators across nations.

Throughout our course, we’ll explore the WDI package, guiding you through its installation, usage, and the extraction of essential indicators. This tool will be pivotal in our journey towards understanding productivity dynamics in various countries.
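As a first taste of the retrieval step described above, here is a minimal sketch rather than the exact code used later in the course. The indicator codes are the World Bank codes for GDP in current US$ and total population. The names `indicators` and `fetch_wdi` are made up for this illustration, and the 2003 to 2022 window is an assumption borrowed from the years compared in the plots further on.

```r
# Named vector: the names become the column names of the result.
indicators <- c(gdp = "NY.GDP.MKTP.CD", population = "SP.POP.TOTL")

# Hypothetical wrapper around WDI::WDI(); it needs the WDI package and an
# internet connection when it is actually called.
fetch_wdi <- function(indicators, start = 2003, end = 2022) {
  WDI::WDI(
    country   = "all",       # every country and aggregate the World Bank reports
    indicator = indicators,
    start     = start,
    end       = end,
    extra     = TRUE         # also return region, income group, and similar metadata
  )
}

# Usage (not run here because it downloads data):
# wdi_raw <- fetch_wdi(indicators)
# head(wdi_raw)
```

The function is only defined here, not called, so the download happens when you decide to run it.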

Quick Overview of the dplyr Package in R

The dplyr package in R is a fundamental toolkit designed for efficient and intuitive data manipulation and transformation. It stands as a versatile set of functions essential for working with data frames, enabling streamlined operations for data analysis and cleaning.

Key Features:

  • Provides a collection of functions for common data manipulation tasks: filtering, selecting, arranging, summarizing, and mutating data.

  • Enhances data frame operations, allowing for seamless data manipulation using a consistent and easy-to-understand syntax.

  • Offers a cohesive set of verbs designed for intuitive and efficient data transformation and analysis.

Importance:

  • Simplifies complex data manipulations, making tasks such as subsetting, summarizing, and arranging data frames more straightforward and coherent.

  • Crucial for data cleaning, transformation, and preparation, enabling efficient and effective data analysis workflows.

Throughout our course, we’ll delve into the capabilities of the dplyr package, showcasing its functions and guiding you through its usage. This package will serve as a cornerstone in your journey toward mastering essential data manipulation skills and fostering efficient data analysis practices.
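To make the verbs listed above concrete, the following sketch applies them to a tiny data frame. The values in `toy` are invented purely for illustration and are not real WDI figures. The `.by` argument assumes dplyr 1.1.0 or newer.

```r
library(dplyr)

# Toy data, invented for illustration only.
toy <- tibble(
  country = c("A", "A", "B", "B", "C", "C"),
  year    = c(2003, 2022, 2003, 2022, 2003, 2022),
  gdp     = c(10, 20, 30, 25, 5, 15),
  pop     = c(2, 4, 10, 10, 1, 2)
)

summary_2022 <- toy %>%
  filter(year == 2022) %>%                 # keep the rows for one year
  mutate(gdp_per_capita = gdp / pop) %>%   # add a derived column
  select(country, gdp_per_capita) %>%      # keep only the columns we need
  arrange(desc(gdp_per_capita))            # sort, largest value first

# Collapse to one row per country.
mean_gdp <- toy %>%
  summarise(mean_gdp = mean(gdp), .by = country)

summary_2022
mean_gdp
```

Each verb takes a data frame and returns a data frame, which is what lets the steps chain together with the pipe.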

Downloading and Using Packages in R

To download and install a package (e.g., “dplyr”), use the install.packages() function, for example install.packages("dplyr"). Installation is required only once per package on a given system.

To use an installed package’s functions, load it into your current R session with the library() function, for example library(dplyr). Loading makes the package’s functions available for that session only, so each time you start a new R session you need to load the packages you plan to use again.

Throughout our course, we’ll emphasize installing the necessary packages once and loading them with library() at the start of each session, so that you have access to the full suite of tools for your data analysis and statistical exploration.
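The install-once, load-every-session routine can be sketched as follows. The helper `missing_packages` and the vector `pkgs` are names made up for this illustration. The helper only reports which packages are absent, so nothing is installed until you choose to.

```r
# Packages used in this part of the course (adjust as needed).
pkgs <- c("dplyr", "WDI")

# Hypothetical helper: return the packages in `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# One-time step (not run here because it downloads from CRAN):
# install.packages(missing_packages(pkgs))

# Every new session:
# library(dplyr)
# library(WDI)
```

Checking first avoids re-downloading packages that are already on your system.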

Subregion

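library(plotly)
library(readxl)
# Read the "Data" sheet of the Penn World Table workbook (local path; adjust to your machine).
pwt <- read_excel("C:/Users/hutku/Downloads/pwt1001.xlsx", sheet = "Data")
# With no `by` argument, left_join() matches on every column the two tables share.
dffk <- dff %>% left_join(pwt)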

Note: The world map code is adapted from Statistics Guides with Dr Paul Christiansen.

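# Attach the mapping and animation packages in one step.
libraries <- c(
    "tidyverse", "sf", "rnaturalearth",
    "wbstats", "gganimate", "classInt"
)
invisible(lapply(libraries, library, character.only = TRUE))
# Plain longitude/latitude coordinates on the WGS84 datum.
crs <- "+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"
# Country polygons from Natural Earth, converted to sf and transformed to `crs`.
world_sf <- ne_countries(
    type = "countries", scale = "small"
) %>%
    sf::st_as_sf() %>%
    sf::st_transform(crs)
# Drop Antarctica from the map.
world_sf_no_antartica <- world_sf %>%
    dplyr::filter(region_un != "Antarctica")
# Attach the productivity data to the polygons via ISO 2-letter country codes.
dffk <- dplyr::left_join(
    world_sf_no_antartica, dffk,
    by = c("iso_a2" = "iso2c")
)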
countries_Melanesia <- dffk %>%
  filter(year ==2022, subregion == "Melanesia") %>%
  pull(country)
countries_SouthernEurope <- dffk %>%
  filter(year ==2022, subregion == "Southern Europe") %>%
  pull(country)
countries_NorthernAfrica <- dffk %>%
  filter(year ==2022) %>%
  filter(subregion == "Northern Africa" ) %>%
  pull(country)
countries_NA <- dffk %>%
  filter(year ==2022) %>%
  filter(is.na(subregion)) %>%
  pull(country)
countries_MiddleAfrica <- dffk %>%
  filter(year ==2022) %>%
  filter(subregion == "Middle Africa" ) %>%
  pull(country)
countries_SouthAmerica <- dffk %>%
  filter(year ==2022) %>%
  filter(subregion == "South America" ) %>%
  pull(country)
countries_WesternAsia <- dffk %>%
  filter(year ==2022) %>%
  filter(subregion == "Western Asia" ) %>%
  pull(country)
countries_AustraliaandNewZealand <- dffk %>%
  filter(year ==2022) %>%
  filter(subregion == "Australia and New Zealand" ) %>%
  pull(country)
countries_WesternEurope <- dffk %>%
  filter(year ==2022) %>%
  filter(subregion == "Western Europe" ) %>%
  pull(country)
countries_Caribbean <- dffk %>%
  filter(year ==2022) %>%
  filter(subregion == "Caribbean" ) %>%
  pull(country)
countries_SouthernAsia <- dffk %>%
  filter(year ==2022) %>%
  filter(subregion == "Southern Asia" ) %>%
  pull(country)
countries_EasternEurope <- dffk %>%
  filter(year ==2022) %>%
  filter(subregion == "Eastern Europe" ) %>%
  pull(country)
countries_CentralAmerica <- dffk %>%
  filter(year ==2022) %>%
  filter(subregion == "Central America" ) %>%
  pull(country)
countries_WesternAfrica <- dffk %>%
  filter(year ==2022) %>%
  filter(subregion == "Western Africa" ) %>%
  pull(country)
countries_SouthernAfrica <- dffk %>%
  filter(year ==2022) %>%
  filter(subregion == "Southern Africa" ) %>%
  pull(country)
countries_SouthEasternAsia <- dffk %>%
  filter(year ==2022) %>%
  filter(subregion == "South-Eastern Asia" ) %>%
  pull(country)
countries_EasternAfrica <- dffk %>%
  filter(year ==2022) %>%
  filter(subregion == "Eastern Africa" ) %>%
  pull(country)
countries_NorthernAmerica <- dffk %>%
  filter(year ==2022) %>%
  filter(subregion == "Northern America" ) %>%
  pull(country)
countries_EasternAsia <- dffk %>%
  filter(year ==2022) %>%
  filter(subregion == "Eastern Asia" ) %>%
  pull(country)
countries_NorthernEurope <- dffk %>%
  filter(year ==2022) %>%
  filter(subregion == "Northern Europe" ) %>%
  pull(country)
countries_CentralAsia <- dffk %>%
  filter(year ==2022) %>%
  filter(subregion == "Central Asia" ) %>%
  pull(country)
phi <- dffk %>% filter(country %in% countries_WesternAsia) %>%
ggplot(aes(x = year,
           y = verim,
           col = country)) +
  geom_line(show.legend = FALSE) +
  facet_wrap(~country, scales = "free") +
  transition_reveal(year)  +
  labs(title = "Year: {frame_along}")
animate(plot = phi,
        nframes = 30)
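The filter-and-pull pattern repeated above for each subregion could be wrapped in a small helper. This is a sketch under assumptions: `countries_in` is a hypothetical name, and `toy` is an invented stand-in for `dffk` that only mimics its `country`, `year`, and `subregion` columns.

```r
library(dplyr)

# Hypothetical helper: countries of one subregion observed in a given year.
countries_in <- function(data, subregion_name, year_value = 2022) {
  data %>%
    filter(.data$year == year_value, .data$subregion == subregion_name) %>%
    pull(country)
}

# Toy stand-in for dffk, invented for illustration only.
toy <- tibble(
  country   = c("Fiji", "Fiji", "Italy", "Spain"),
  year      = c(2003, 2022, 2022, 2022),
  subregion = c("Melanesia", "Melanesia", "Southern Europe", "Southern Europe")
)

countries_in(toy, "Southern Europe")
```

With the real data, a call such as countries_in(dffk, "Western Asia") would be expected to return the same vector as countries_WesternAsia.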

Western Asia

df_WA <- dff %>% 
  filter(
    country %in% countries_WesternAsia,
    year %in% c(2003, 2022),
  ) %>%
  mutate(year = factor(year)) %>% 
  select(country, year, verim)
df_WA <- as_tibble(df_WA)  %>% 
  arrange(country, year) %>% 
  mutate(
    change_verim = diff(verim), 
    order_dumbbells = if_else(change_verim < 0, -1, 1) * verim[2],
    .by = country
  )  %>% 
  mutate(country = fct_reorder(country, order_dumbbells))
df_WA %>% 
  ggplot(aes(x = verim, y = country)) +
  geom_path(
    aes(color = (change_verim < 0)),
    linewidth = 1,
    arrow = arrow(length = unit(0.3, 'cm'), type = 'closed')
  ) +
  geom_vline(xintercept=1, linetype='dotted', col = 'red')+
  annotate("text", x = 1, y = "Armenia", label = "Self-Sufficiency Threshold", angle = 90) +
  labs(
    title = 'Western Asia',
    x = 'Verim (2003 - 2022)', 
    y = element_blank(),
    fill = 'Year'
  ) +
  theme(
    panel.grid.major.y = element_blank(),
    panel.grid.minor.x = element_blank(),
    legend.position = 'none'
  )
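The data preparation behind this dumbbell chart is repeated for every subregion below, so it can be sketched once as a function. The name `prep_dumbbell` and the `toy` values are hypothetical. One deliberate difference from the code above: the sketch keeps only countries observed in both years, because `diff()` on a single observation returns an empty vector, which `mutate()` would typically reject.

```r
library(dplyr)
library(forcats)

# Hypothetical helper: two rows per country, plus the change and a sort key.
prep_dumbbell <- function(data, countries, years = c(2003, 2022)) {
  data %>%
    as_tibble() %>%
    filter(country %in% countries, year %in% years, !is.na(verim)) %>%
    select(country, year, verim) %>%
    filter(n() == 2, .by = country) %>%    # keep countries seen in both years
    arrange(country, year) %>%
    mutate(
      change_verim = diff(verim),          # later year minus earlier year
      order_dumbbells = if_else(change_verim < 0, -1, 1) * verim[2],
      .by = country
    ) %>%
    mutate(
      year = factor(year),
      country = fct_reorder(country, order_dumbbells)
    )
}

# Toy data, invented for illustration; country "C" lacks a 2003 value.
toy <- tibble(
  country = c("A", "A", "B", "B", "C"),
  year    = c(2003, 2022, 2003, 2022, 2022),
  verim   = c(0.5, 0.8, 1.2, 0.9, 1.1)
)

toy_db <- prep_dumbbell(toy, c("A", "B", "C"))
toy_db
```

With the course data, prep_dumbbell(dff, countries_WesternAsia) would stand in for the two assignment steps at the top of this section.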

Southern Asia

df_SA <- dff %>% 
  filter(
    country %in% countries_SouthernAsia,
    year %in% c(2003, 2022),
  ) %>%
  mutate(year = factor(year)) %>% 
  select(country, year, verim)
df_SA <- as_tibble(df_SA)  %>% 
  arrange(country, year) %>% 
  mutate(
    change_verim = diff(verim), 
    order_dumbbells = if_else(change_verim < 0, -1, 1) * verim[2],
    .by = country
  )  %>% 
  mutate(country = fct_reorder(country, order_dumbbells))
df_SA %>% 
  ggplot(aes(x = verim, y = country)) +
  geom_path(
    aes(color = (change_verim < 0)),
    linewidth = 1,
    arrow = arrow(length = unit(0.3, 'cm'), type = 'closed')
  ) +
  geom_vline(xintercept=1, linetype='dotted', col = 'red')+
  annotate("text", x = 1, y = 3, label = "Self-Sufficiency Threshold", angle = 90) +
  labs(
    title = 'Southern Asia',
    x = 'Verim (2003 - 2022)', 
    y = element_blank(),
    fill = 'Year'
  ) +
  theme(
    panel.grid.major.y = element_blank(),
    panel.grid.minor.x = element_blank(),
    legend.position = 'none'
  )

phi <- dffk %>% filter(country %in% countries_SouthernAsia) %>%
ggplot(aes(x = year,
           y = verim,
           col = country)) +
  geom_line(show.legend = FALSE) +
  facet_wrap(~country, scales = "free") +
  transition_reveal(year)  +
  labs(title = "Year: {frame_along}")
animate(plot = phi,
        nframes = 30)
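The animated line chart also recurs for each subregion with only the country vector changing. A possible wrapper is sketched below. `build_reveal_plot` is a hypothetical name and `toy` is invented data. The function only builds the animation object; rendering is left to `animate()` as in the code above.

```r
library(dplyr)
library(ggplot2)
library(gganimate)

# Hypothetical helper: one revealed line per country, one facet per country.
build_reveal_plot <- function(data, countries) {
  data %>%
    filter(country %in% countries) %>%
    ggplot(aes(x = year, y = verim, colour = country)) +
    geom_line(show.legend = FALSE) +
    facet_wrap(~country, scales = "free") +
    transition_reveal(year) +
    labs(title = "Year: {frame_along}")
}

# Toy data, invented for illustration only.
toy <- tibble(
  country = rep(c("A", "B"), each = 3),
  year    = rep(c(2003, 2012, 2022), times = 2),
  verim   = c(0.5, 0.6, 0.8, 1.2, 1.0, 0.9)
)

phi_toy <- build_reveal_plot(toy, c("A", "B"))
# animate(plot = phi_toy, nframes = 30)   # rendering step, not run here
```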

South-Eastern Asia

df_SEA <- dff %>% 
  filter(
    country %in% countries_SouthEasternAsia,
    year %in% c(2003, 2022),
  ) %>%
  mutate(year = factor(year)) %>% 
  select(country, year, verim)
df_SEA <- as_tibble(df_SEA)  %>% 
  arrange(country, year) %>% 
  mutate(
    change_verim = diff(verim), 
    order_dumbbells = if_else(change_verim < 0, -1, 1) * verim[2],
    .by = country
  )  %>% 
  mutate(country = fct_reorder(country, order_dumbbells))
df_SEA %>% 
  ggplot(aes(x = verim, y = country)) +
  geom_path(
    aes(color = (change_verim < 0)),
    linewidth = 1,
    arrow = arrow(length = unit(0.3, 'cm'), type = 'closed')
  ) +
  geom_vline(xintercept=1, linetype='dotted', col = 'red')+
  annotate("text", x = 1, y = 3, label = "Self-Sufficiency Threshold", angle = 90) +
  labs(
    title = 'South-Eastern Asia',
    x = 'Verim (2003 - 2022)', 
    y = element_blank(),
    fill = 'Year'
  ) +
  theme(
    panel.grid.major.y = element_blank(),
    panel.grid.minor.x = element_blank(),
    legend.position = 'none'
  )

phi <- dffk %>% filter(country %in% countries_SouthEasternAsia) %>%
ggplot(aes(x = year,
           y = verim,
           col = country)) +
  geom_line(show.legend = FALSE) +
  facet_wrap(~country, scales = "free") +
  transition_reveal(year)  +
  labs(title = "Year: {frame_along}")
animate(plot = phi,
        nframes = 30)

Central Asia

df_CA <- dff %>% 
  filter(
    country %in% countries_CentralAsia,
    year %in% c(2003, 2022),
  ) %>%
  mutate(year = factor(year)) %>% 
  select(country, year, verim)
df_CA <- as_tibble(df_CA)  %>% 
  arrange(country, year) %>% 
  mutate(
    change_verim = diff(verim), 
    order_dumbbells = if_else(change_verim < 0, -1, 1) * verim[2],
    .by = country
  )  %>% 
  mutate(country = fct_reorder(country, order_dumbbells))
df_CA %>% 
  ggplot(aes(x = verim, y = country)) +
  geom_path(
    aes(color = (change_verim < 0)),
    linewidth = 1,
    arrow = arrow(length = unit(0.3, 'cm'), type = 'closed')
  ) +
  geom_vline(xintercept=1, linetype='dotted', col = 'red')+
  annotate("text", x = 1, y = 3, label = "Self-Sufficiency Threshold", angle = 90) +
  labs(
    title = 'Central Asia',
    x = 'Verim (2003 - 2022)', 
    y = element_blank(),
    fill = 'Year'
  ) +
  theme(
    panel.grid.major.y = element_blank(),
    panel.grid.minor.x = element_blank(),
    legend.position = 'none'
  )

phi <- dffk %>% filter(country %in% countries_CentralAsia) %>%
ggplot(aes(x = year,
           y = verim,
           col = country)) +
  geom_line(show.legend = FALSE) +
  facet_wrap(~country, scales = "free") +
  transition_reveal(year)  +
  labs(title = "Year: {frame_along}")
animate(plot = phi,
        nframes = 30)

Eastern Asia

df_EA <- dff %>% 
  filter(
    country %in% countries_EasternAsia,
    year %in% c(2003, 2022),
  ) %>%
  mutate(year = factor(year)) %>% 
  select(country, year, verim)
df_EA <- as_tibble(df_EA)  %>% 
  arrange(country, year) %>% 
  mutate(
    change_verim = diff(verim), 
    order_dumbbells = if_else(change_verim < 0, -1, 1) * verim[2],
    .by = country
  )  %>% 
  mutate(country = fct_reorder(country, order_dumbbells))
df_EA %>% 
  ggplot(aes(x = verim, y = country)) +
  geom_path(
    aes(color = (change_verim < 0)),
    linewidth = 1,
    arrow = arrow(length = unit(0.3, 'cm'), type = 'closed')
  ) +
  geom_vline(xintercept=1, linetype='dotted', col = 'red')+
  annotate("text", x = 1, y = 2.5, label = "Self-Sufficiency Threshold", angle = 90) +
  labs(
    title = 'Eastern Asia',
    x = 'Verim (2003 - 2022)', 
    y = element_blank(),
    fill = 'Year'
  ) +
  theme(
    panel.grid.major.y = element_blank(),
    panel.grid.minor.x = element_blank(),
    legend.position = 'none'
  )

phi <- dffk %>% filter(country %in% countries_EasternAsia) %>%
ggplot(aes(x = year,
           y = verim,
           col = country)) +
  geom_line(show.legend = FALSE) +
  facet_wrap(~country, scales = "free") +
  transition_reveal(year)  +
  labs(title = "Year: {frame_along}")
animate(plot = phi,
        nframes = 30)

Melanesia

df_MAL <- dff %>% 
  filter(
    country %in% countries_Melanesia,
    year %in% c(2003, 2022),
  ) %>%
  mutate(year = factor(year)) %>% 
  select(country, year, verim)
df_MAL <- as_tibble(df_MAL)  %>% 
  arrange(country, year) %>% 
  mutate(
    change_verim = diff(verim), 
    order_dumbbells = if_else(change_verim < 0, -1, 1) * verim[2],
    .by = country
  )  %>% 
  mutate(country = fct_reorder(country, order_dumbbells))
df_MAL %>% 
  ggplot(aes(x = verim, y = country)) +
  geom_path(
    aes(color = (change_verim < 0)),
    linewidth = 1,
    arrow = arrow(length = unit(0.3, 'cm'), type = 'closed')
  ) +
  geom_vline(xintercept=1, linetype='dotted', col = 'red')+
  annotate("text", x = 1, y = 3, label = "Self-Sufficiency Threshold", angle = 90) +
  labs(
    title = 'Melanesia',
    x = 'Verim (2003 - 2022)', 
    y = element_blank(),
    fill = 'Year'
  ) +
  theme(
    panel.grid.major.y = element_blank(),
    panel.grid.minor.x = element_blank(),
    legend.position = 'none'
  )

phi <- dffk %>% filter(country %in% countries_Melanesia) %>%
ggplot(aes(x = year,
           y = verim,
           col = country)) +
  geom_line(show.legend = FALSE) +
  facet_wrap(~country, scales = "free") +
  transition_reveal(year)  +
  labs(title = "Year: {frame_along}")
animate(plot = phi,
        nframes = 30)

Southern Europe

df_SE <- dff %>% 
  filter(
    country %in% countries_SouthernEurope,
    year %in% c(2003, 2022),
  ) %>%
  mutate(year = factor(year)) %>% 
  select(country, year, verim)
df_SE <- as_tibble(df_SE)  %>% 
  arrange(country, year) %>% 
  mutate(
    change_verim = diff(verim), 
    order_dumbbells = if_else(change_verim < 0, -1, 1) * verim[2],
    .by = country
  )  %>% 
  mutate(country = fct_reorder(country, order_dumbbells))
df_SE %>% 
  ggplot(aes(x = verim, y = country)) +
  geom_path(
    aes(color = (change_verim < 0)),
    linewidth = 1,
    arrow = arrow(length = unit(0.3, 'cm'), type = 'closed')
  ) +
  geom_vline(xintercept=1, linetype='dotted', col = 'red')+
  annotate("text", x = 1, y = 6, label = "Self-Sufficiency Threshold", angle = 90) +
  labs(
    title = 'Southern Europe',
    x = 'Verim (2003 - 2022)', 
    y = element_blank(),
    fill = 'Year'
  ) +
  theme(
    panel.grid.major.y = element_blank(),
    panel.grid.minor.x = element_blank(),
    legend.position = 'none'
  )

phi <- dffk %>% filter(country %in% countries_SouthernEurope) %>%
ggplot(aes(x = year,
           y = verim,
           col = country)) +
  geom_line(show.legend = FALSE) +
  facet_wrap(~country, scales = "free") +
  transition_reveal(year)  +
  labs(title = "Year: {frame_along}")
animate(plot = phi,
        nframes = 30)

Western Europe

df_WE <- dff %>% 
  filter(
    country %in% countries_WesternEurope,
    year %in% c(2003, 2022),
  ) %>%
  mutate(year = factor(year)) %>% 
  select(country, year, verim)
df_WE <- as_tibble(df_WE)  %>% 
  arrange(country, year) %>% 
  mutate(
    change_verim = diff(verim), 
    order_dumbbells = if_else(change_verim < 0, -1, 1) * verim[2],
    .by = country
  )  %>% 
  mutate(country = fct_reorder(country, order_dumbbells))
df_WE %>% 
  ggplot(aes(x = verim, y = country)) +
  geom_path(
    aes(color = (change_verim < 0)),
    linewidth = 1,
    arrow = arrow(length = unit(0.3, 'cm'), type = 'closed')
  ) +
  geom_vline(xintercept=1, linetype='dotted', col = 'red')+
  annotate("text", x = 1, y = 4, label = "Self-Sufficiency Threshold", angle = 90) +
  labs(
    title = 'Western Europe',
    x = 'Verim (2003 - 2022)', 
    y = element_blank(),
    fill = 'Year'
  ) +
  theme(
    panel.grid.major.y = element_blank(),
    panel.grid.minor.x = element_blank(),
    legend.position = 'none'
  )

phi <- dffk %>% filter(country %in% countries_WesternEurope) %>%
ggplot(aes(x = year,
           y = verim,
           col = country)) +
  geom_line(show.legend = FALSE) +
  facet_wrap(~country, scales = "free") +
  transition_reveal(year)  +
  labs(title = "Year: {frame_along}")
animate(plot = phi,
        nframes = 30)

Eastern Europe

df_EE <- dff %>% 
  filter(
    country %in% countries_EasternEurope,
    year %in% c(2003, 2022),
  ) %>%
  mutate(year = factor(year)) %>% 
  select(country, year, verim)
df_EE <- as_tibble(df_EE)  %>% 
  arrange(country, year) %>% 
  mutate(
    change_verim = diff(verim), 
    order_dumbbells = if_else(change_verim < 0, -1, 1) * verim[2],
    .by = country
  )  %>% 
  mutate(country = fct_reorder(country, order_dumbbells))
df_EE %>% 
  ggplot(aes(x = verim, y = country)) +
  geom_path(
    aes(color = (change_verim < 0)),
    linewidth = 1,
    arrow = arrow(length = unit(0.3, 'cm'), type = 'closed')
  ) +
  geom_vline(xintercept=1, linetype='dotted', col = 'red')+
  annotate("text", x = 1, y = 5, label = "Self-Sufficiency Threshold", angle = 90) +
  labs(
    title = 'Eastern Europe',
    x = 'Verim (2003 - 2022)', 
    y = element_blank(),
    fill = 'Year'
  ) +
  theme(
    panel.grid.major.y = element_blank(),
    panel.grid.minor.x = element_blank(),
    legend.position = 'none'
  )

phi <- dffk %>% filter(country %in% countries_EasternEurope) %>%
ggplot(aes(x = year,
           y = verim,
           col = country)) +
  geom_line(show.legend = FALSE) +
  facet_wrap(~country, scales = "free") +
  transition_reveal(year)  +
  labs(title = "Year: {frame_along}")
animate(plot = phi,
        nframes = 30)

Northern Europe

df_NE <- dff %>% 
  filter(
    country %in% countries_NorthernEurope,
    year %in% c(2003, 2022),
  ) %>%
  mutate(year = factor(year)) %>% 
  select(country, year, verim)
df_NE <- as_tibble(df_NE)  %>% 
  arrange(country, year) %>% 
  mutate(
    change_verim = diff(verim), 
    order_dumbbells = if_else(change_verim < 0, -1, 1) * verim[2],
    .by = country
  )  %>% 
  mutate(country = fct_reorder(country, order_dumbbells))
df_NE %>% 
  ggplot(aes(x = verim, y = country)) +
  geom_path(
    aes(color = (change_verim < 0)),
    linewidth = 1,
    arrow = arrow(length = unit(0.3, 'cm'), type = 'closed')
  ) +
  geom_vline(xintercept=1, linetype='dotted', col = 'red')+
  annotate("text", x = 1, y = 5, label = "Self-Sufficiency Threshold", angle = 90) +
  labs(
    title = 'Northern Europe',
    x = 'Verim (2003 - 2022)', 
    y = element_blank(),
    fill = 'Year'
  ) +
  theme(
    panel.grid.major.y = element_blank(),
    panel.grid.minor.x = element_blank(),
    legend.position = 'none'
  )

phi <- dffk %>% filter(country %in% countries_NorthernEurope) %>%
ggplot(aes(x = year,
           y = verim,
           col = country)) +
  geom_line(show.legend = FALSE) +
  facet_wrap(~country, scales = "free") +
  transition_reveal(year)  +
  labs(title = "Year: {frame_along}")
animate(plot = phi,
        nframes = 30)

Northern Africa

df_NA <- dff %>% 
  filter(
    country %in% countries_NorthernAfrica,
    year %in% c(2003, 2022),
  ) %>%
  mutate(year = factor(year)) %>% 
  select(country, year, verim)
df_NA <- as_tibble(df_NA)  %>% 
  arrange(country, year) %>% 
  mutate(
    change_verim = diff(verim), 
    order_dumbbells = if_else(change_verim < 0, -1, 1) * verim[2],
    .by = country
  )  %>% 
  mutate(country = fct_reorder(country, order_dumbbells))
df_NA %>% 
  ggplot(aes(x = verim, y = country)) +
  geom_path(
    aes(color = (change_verim < 0)),
    linewidth = 1,
    arrow = arrow(length = unit(0.3, 'cm'), type = 'closed')
  ) +
  geom_vline(xintercept=1, linetype='dotted', col = 'red')+
  annotate("text", x = 1, y = 3, label = "Self-Sufficiency Threshold", angle = 90) +
  labs(
    title = 'Northern Africa',
    x = 'Verim (2003 - 2022)', 
    y = element_blank(),
    fill = 'Year'
  ) +
  theme(
    panel.grid.major.y = element_blank(),
    panel.grid.minor.x = element_blank(),
    legend.position = 'none'
  )

phi <- dffk %>% filter(country %in% countries_NorthernAfrica) %>%
ggplot(aes(x = year,
           y = verim,
           col = country)) +
  geom_line(show.legend = FALSE) +
  facet_wrap(~country, scales = "free") +
  transition_reveal(year)  +
  labs(title = "Year: {frame_along}")
animate(plot = phi,
        nframes = 30)

Middle Africa

df_MA <- dff %>% 
  filter(
    country %in% countries_MiddleAfrica,
    year %in% c(2003, 2022),
  ) %>%
  mutate(year = factor(year)) %>% 
  select(country, year, verim)
df_MA <- as_tibble(df_MA)  %>% 
  arrange(country, year) %>% 
  mutate(
    change_verim = diff(verim), 
    order_dumbbells = if_else(change_verim < 0, -1, 1) * verim[2],
    .by = country
  )  %>% 
  mutate(country = fct_reorder(country, order_dumbbells))
df_MA %>% 
  ggplot(aes(x = verim, y = country)) +
  geom_path(
    aes(color = (change_verim < 0)),
    linewidth = 1,
    arrow = arrow(length = unit(0.3, 'cm'), type = 'closed')
  ) +
  geom_vline(xintercept=1, linetype='dotted', col = 'red')+
  annotate("text", x = 1, y = 5, label = "Self-Sufficiency Threshold", angle = 90) +
  labs(
    title = 'Middle Africa',
    x = 'Verim (2003 - 2022)', 
    y = element_blank(),
    fill = 'Year'
  ) +
  theme(
    panel.grid.major.y = element_blank(),
    panel.grid.minor.x = element_blank(),
    legend.position = 'none'
  )

phi <- dffk %>% filter(country %in% countries_MiddleAfrica) %>%
ggplot(aes(x = year,
           y = verim,
           col = country)) +
  geom_line(show.legend = FALSE) +
  facet_wrap(~country, scales = "free") +
  transition_reveal(year)  +
  labs(title = "Year: {frame_along}")
animate(plot = phi,
        nframes = 30)

Western Africa

df_WA <- dff %>% 
  filter(
    country %in% countries_WesternAfrica,
    year %in% c(2003, 2022),
  ) %>%
  mutate(year = factor(year)) %>% 
  select(country, year, verim)
df_WA <- as_tibble(df_WA)  %>% 
  arrange(country, year) %>% 
  mutate(
    change_verim = diff(verim), 
    order_dumbbells = if_else(change_verim < 0, -1, 1) * verim[2],
    .by = country
  )  %>% 
  mutate(country = fct_reorder(country, order_dumbbells))
df_WA %>% 
  ggplot(aes(x = verim, y = country)) +
  geom_path(
    aes(color = (change_verim < 0)),
    linewidth = 1,
    arrow = arrow(length = unit(0.3, 'cm'), type = 'closed')
  ) +
  geom_vline(xintercept=1, linetype='dotted', col = 'red')+
  annotate("text", x = 1, y = 5, label = "Self-Sufficiency Threshold", angle = 90) +
  labs(
    title = 'Western Africa',
    x = 'Verim (2003 - 2022)', 
    y = element_blank(),
    fill = 'Year'
  ) +
  theme(
    panel.grid.major.y = element_blank(),
    panel.grid.minor.x = element_blank(),
    legend.position = 'none'
  )

phi <- dffk %>% filter(country %in% countries_WesternAfrica) %>%
ggplot(aes(x = year,
           y = verim,
           col = country)) +
  geom_line(show.legend = FALSE) +
  facet_wrap(~country, scales = "free") +
  transition_reveal(year)  +
  labs(title = "Year: {frame_along}")
animate(plot = phi,
        nframes = 30)

Southern Africa

df_SA <- dff %>% 
  filter(
    country %in% countries_SouthernAfrica,
    year %in% c(2003, 2022),
  ) %>%
  mutate(year = factor(year)) %>% 
  select(country, year, verim)
df_SA <- as_tibble(df_SA)  %>% 
  arrange(country, year) %>% 
  mutate(
    change_verim = diff(verim), 
    order_dumbbells = if_else(change_verim < 0, -1, 1) * verim[2],
    .by = country
  )  %>% 
  mutate(country = fct_reorder(country, order_dumbbells))
df_SA %>% 
  ggplot(aes(x = verim, y = country)) +
  geom_path(
    aes(color = (change_verim < 0)),
    linewidth = 1,
    arrow = arrow(length = unit(0.3, 'cm'), type = 'closed')
  ) +
  geom_vline(xintercept=1, linetype='dotted', col = 'red')+
  annotate("text", x = 1, y = 3, label = "Self-Sufficiency Threshold", angle = 90) +
  labs(
    title = 'Southern Africa',
    x = 'Verim (2003 - 2022)', 
    y = element_blank(),
    fill = 'Year'
  ) +
  theme(
    panel.grid.major.y = element_blank(),
    panel.grid.minor.x = element_blank(),
    legend.position = 'none'
  )

phi <- dffk %>% filter(country %in% countries_SouthernAfrica) %>%
ggplot(aes(x = year,
           y = verim,
           col = country)) +
  geom_line(show.legend = FALSE) +
  facet_wrap(~country, scales = "free") +
  transition_reveal(year)  +
  labs(title = "Year: {frame_along}")
animate(plot = phi,
        nframes = 30)

Eastern Africa

df_EA <- dff %>% 
  filter(
    country %in% countries_EasternAfrica,
    year %in% c(2003, 2022),
  ) %>%
  mutate(year = factor(year)) %>% 
  select(country, year, verim)
df_EA <- as_tibble(df_EA)  %>% 
  arrange(country, year) %>% 
  mutate(
    change_verim = diff(verim), 
    order_dumbbells = if_else(change_verim < 0, -1, 1) * verim[2],
    .by = country
  )  %>% 
  mutate(country = fct_reorder(country, order_dumbbells))
df_EA %>% 
  ggplot(aes(x = verim, y = country)) +
  geom_path(
    aes(color = (change_verim < 0)),
    linewidth = 1,
    arrow = arrow(length = unit(0.3, 'cm'), type = 'closed')
  ) +
  geom_vline(xintercept=1, linetype='dotted', col = 'red')+
  annotate("text", x = 1, y = 5, label = "Self-Sufficiency Threshold", angle=90) +
  labs(
    title = 'Eastern Africa',
    x = 'Verim (2003 - 2022)', 
    y = element_blank(),
    fill = 'Year'
  ) +
  theme(
    panel.grid.major.y = element_blank(),
    panel.grid.minor.x = element_blank(),
    legend.position = 'none'
  )

phi <- dffk %>% filter(country %in% countries_EasternAfrica) %>%
ggplot(aes(x = year,
           y = verim,
           col = country)) +
  geom_line(show.legend = FALSE) +
  facet_wrap(~country, scales = "free") +
  transition_reveal(year)  +
  labs(title = "Year: {frame_along}")
animate(plot = phi,
        nframes = 30)

South America

df_SA <- dff %>% 
  filter(
    country %in% countries_SouthAmerica,
    year %in% c(2003, 2022),
  ) %>%
  mutate(year = factor(year)) %>% 
  select(country, year, verim)
df_SA <- as_tibble(df_SA)  %>% 
  arrange(country, year) %>% 
  mutate(
    change_verim = diff(verim), 
    order_dumbbells = if_else(change_verim < 0, -1, 1) * verim[2],
    .by = country
  )  %>% 
  mutate(country = fct_reorder(country, order_dumbbells))
df_SA %>% 
  ggplot(aes(x = verim, y = country)) +
  geom_path(
    aes(color = (change_verim < 0)),
    linewidth = 1,
    arrow = arrow(length = unit(0.3, 'cm'), type = 'closed')
  ) +
  geom_vline(xintercept=1, linetype='dotted', col = 'red')+
  annotate("text", x = 1, y = 5, label = "Self-Sufficiency Threshold", angle=90) +
  labs(
    title = 'South America',
    x = 'Verim (2003 - 2022)', 
    y = element_blank(),
    fill = 'Year'
  ) +
  theme(
    panel.grid.major.y = element_blank(),
    panel.grid.minor.x = element_blank(),
    legend.position = 'none'
  )

phi <- dffk %>% filter(country %in% countries_SouthAmerica) %>%
ggplot(aes(x = year,
           y = verim,
           col = country)) +
  geom_line(show.legend = FALSE) +
  facet_wrap(~country, scales = "free") +
  transition_reveal(year)  +
  labs(title = "Year: {frame_along}")
animate(plot = phi,
        nframes = 30)

Central America

df_CA <- dff %>% 
  filter(
    country %in% countries_CentralAmerica,
    year %in% c(2003, 2022),
  ) %>%
  mutate(year = factor(year)) %>% 
  select(country, year, verim)
df_CA <- as_tibble(df_CA)  %>% 
  arrange(country, year) %>% 
  mutate(
    change_verim = diff(verim), 
    order_dumbbells = if_else(change_verim < 0, -1, 1) * verim[2],
    .by = country
  )  %>% 
  mutate(country = fct_reorder(country, order_dumbbells))
df_CA %>% 
  ggplot(aes(x = verim, y = country)) +
  geom_path(
    aes(color = (change_verim < 0)),
    linewidth = 1,
    arrow = arrow(length = unit(0.3, 'cm'), type = 'closed')
  ) +
  geom_vline(xintercept=1, linetype='dotted', col = 'red')+
  annotate("text", x = 1, y = 5, label = "Self-Sufficiency Threshold", angle=90) +
  labs(
    title = 'Central America',
    x = 'Verim (2003 - 2022)', 
    y = element_blank(),
    fill = 'Year'
  ) +
  theme(
    panel.grid.major.y = element_blank(),
    panel.grid.minor.x = element_blank(),
    legend.position = 'none'
  )

phi <- dffk %>% filter(country %in% countries_CentralAmerica) %>%
ggplot(aes(x = year,
           y = verim,
           col = country)) +
  geom_line(show.legend = FALSE) +
  facet_wrap(~country, scales = "free") +
  transition_reveal(year)  +
  labs(title = "Year: {frame_along}")
animate(plot = phi,
        nframes = 30)

Northern America

df_NA <- dff %>% 
  filter(
    country %in% c(countries_NorthernAmerica,countries_AustraliaandNewZealand),
    year %in% c(2003, 2022),
  ) %>%
  mutate(year = factor(year)) %>% 
  select(country, year, verim)
df_NA <- as_tibble(df_NA)  %>% 
  arrange(country, year) %>% 
  mutate(
    change_verim = diff(verim), 
    order_dumbbells = if_else(change_verim < 0, -1, 1) * verim[2],
    .by = country
  )  %>% 
  mutate(country = fct_reorder(country, order_dumbbells))
df_NA %>% 
  ggplot(aes(x = verim, y = country)) +
  geom_path(
    aes(color = (change_verim < 0)),
    linewidth = 1,
    arrow = arrow(length = unit(0.3, 'cm'), type = 'closed')
  ) +
  geom_vline(xintercept=1, linetype='dotted', col = 'red')+
  annotate("text", x = 1, y = 3, label = "Self-Sufficiency Threshold", angle=90) +
  labs(
    title = 'Northern America, Australia & New Zealand',
    x = 'Verim (2003 - 2022)', 
    y = element_blank(),
    fill = 'Year'
  ) +
  theme(
    panel.grid.major.y = element_blank(),
    panel.grid.minor.x = element_blank(),
    legend.position = 'none'
  )

phi <- dffk %>% filter(country %in% c(countries_NorthernAmerica,countries_AustraliaandNewZealand)) %>%
ggplot(aes(x = year,
           y = verim,
           col = country)) +
  geom_line(show.legend = FALSE) +
  facet_wrap(~country, scales = "free") +
  transition_reveal(year)  +
  labs(title = "Year: {frame_along}")
animate(plot = phi,
        nframes = 30)

Caribbean

df_CAR <- dff %>% 
  filter(
    country %in% countries_Caribbean,
    year %in% c(2003, 2022),
  ) %>%
  mutate(year = factor(year)) %>% 
  select(country, year, verim)
df_CAR <- as_tibble(df_CAR)  %>% 
  arrange(country, year) %>% 
  mutate(
    change_verim = diff(verim), 
    order_dumbbells = if_else(change_verim < 0, -1, 1) * verim[2],
    .by = country
  )  %>% 
  mutate(country = fct_reorder(country, order_dumbbells))
df_CAR %>% 
  ggplot(aes(x = verim, y = country)) +
  geom_path(
    aes(color = (change_verim < 0)),
    linewidth = 1,
    arrow = arrow(length = unit(0.3, 'cm'), type = 'closed')
  ) +
  geom_vline(xintercept=1, linetype='dotted', col = 'red')+
  annotate("text", x = 1, y = 3, label = "Self-Sufficiency Threshold", angle=90) +
  labs(
    title = 'Caribbean',
    x = 'Verim (2003 - 2022)', 
    y = element_blank(),
    fill = 'Year'
  ) +
  theme(
    panel.grid.major.y = element_blank(),
    panel.grid.minor.x = element_blank(),
    legend.position = 'none'
  )

phi <- dffk %>% filter(country %in% countries_Caribbean) %>%
ggplot(aes(x = year,
           y = verim,
           col = country)) +
  geom_line(show.legend = FALSE) +
  facet_wrap(~country, scales = "free") +
  transition_reveal(year)  +
  labs(title = "Year: {frame_along}")
animate(plot = phi,
        nframes = 30)

Some Other Countries

df_OC <- dff %>% 
  filter(
    country %in% countries_NA,
    year %in% c(2003, 2022),
  ) %>%
  mutate(year = factor(year)) %>% 
  select(country, year, verim)
df_OC <- as_tibble(df_OC)  %>% 
  arrange(country, year) %>% 
  mutate(
    change_verim = diff(verim), 
    order_dumbbells = if_else(change_verim < 0, -1, 1) * verim[2],
    .by = country
  )  %>% 
  mutate(country = fct_reorder(country, order_dumbbells))
df_OC %>% 
  ggplot(aes(x = verim, y = country)) +
  geom_path(
    aes(color = (change_verim < 0)),
    linewidth = 1,
    arrow = arrow(length = unit(0.3, 'cm'), type = 'closed')
  ) +
  geom_vline(xintercept=1, linetype='dotted', col = 'red')+
  annotate("text", x = 1, y = 12, label = "Self-Sufficiency Threshold", angle=90) +
  labs(
    title = 'Other Countries',
    x = 'Verim (2003 - 2022)', 
    y = element_blank(),
    fill = 'Year'
  ) +
  theme(
    panel.grid.major.y = element_blank(),
    panel.grid.minor.x = element_blank(),
    legend.position = 'none'
  )

phi <- dffk %>% filter(country %in% countr) %>%
ggplot(aes(x = year,
           y = verim,
           col = country)) +
  geom_line(show.legend = FALSE) +
  facet_wrap(~country, scales = "free") +
  transition_reveal(year)  +
  labs(title = "Year: {frame_along}")
animate(plot = phi,
        nframes = 30)

Note: the world map code is adapted from Statistics Guides with Dr Paul Christiansen.

# Robinson
robinson_crs <- "+proj=robin +lon_0=0 +x_0=0 +y_0=0 +ellps=WGS84 +datum=WGS84 +units=m +no_defs"
dffk_robinson <- dffk %>%
    sf::st_transform(robinson_crs)
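vmin <- min(dffk$verim, na.rm = TRUE)
vmax <- max(dffk$verim, na.rm = TRUE)
# Fisher class boundaries, rounded to one decimal; drop the two outer
# boundaries and close the vector with the true maximum
brk <- round(
    classIntervals(dffk$verim, n = 77, style = "fisher")$brks,
    1
) %>%
    head(-1) %>%
    tail(-1) %>%
    append(vmax)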
breaks <- c(vmin, brk)
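
How the break vector is assembled can be checked on a small example. The sketch below replaces the `classIntervals()` result with a short vector of hypothetical boundaries (the numbers are invented), so it runs in base R without the map data. It shows the rounded inner boundaries being kept while the first and last are swapped for the unrounded minimum and maximum.

```r
# Hypothetical class boundaries, standing in for classIntervals(...)$brks
brks_raw <- c(0.23, 0.51, 1.04, 1.76, 3.12)

# In the course code these come from dffk$verim; here they are the toy extremes
vmin <- min(brks_raw)
vmax <- max(brks_raw)

# Round to one decimal, then drop the first and the last boundary
inner <- tail(head(round(brks_raw, 1), -1), -1)

# Re-attach the unrounded extremes so that limits = c(vmin, vmax) covers all data
breaks <- c(vmin, inner, vmax)
breaks
```

The result has the same length as the original boundary vector, so with `n = 77` classes the code above should yield 78 break values, the same count as the colours generated next.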
# Number of colors needed
num_colors <- 78

# Generate a color palette
new_cols <- rainbow(num_colors)

# Reverse the order if needed
new_cols <- rev(new_cols)

# Print or use the new color palette
print(new_cols)
##  [1] "#FF0014" "#FF0027" "#FF003B" "#FF004E" "#FF0062" "#FF0076" "#FF0089"
##  [8] "#FF009D" "#FF00B1" "#FF00C4" "#FF00D8" "#FF00EB" "#FF00FF" "#EB00FF"
## [15] "#D800FF" "#C400FF" "#B100FF" "#9D00FF" "#8900FF" "#7600FF" "#6200FF"
## [22] "#4E00FF" "#3B00FF" "#2700FF" "#1400FF" "#0000FF" "#0014FF" "#0027FF"
## [29] "#003BFF" "#004EFF" "#0062FF" "#0076FF" "#0089FF" "#009DFF" "#00B1FF"
## [36] "#00C4FF" "#00D8FF" "#00EBFF" "#00FFFF" "#00FFEB" "#00FFD8" "#00FFC4"
## [43] "#00FFB1" "#00FF9D" "#00FF89" "#00FF76" "#00FF62" "#00FF4E" "#00FF3B"
## [50] "#00FF27" "#00FF14" "#00FF00" "#14FF00" "#27FF00" "#3BFF00" "#4EFF00"
## [57] "#62FF00" "#76FF00" "#89FF00" "#9DFF00" "#B1FF00" "#C4FF00" "#D8FF00"
## [64] "#EBFF00" "#FFFF00" "#FFEB00" "#FFD800" "#FFC400" "#FFB100" "#FF9D00"
## [71] "#FF8900" "#FF7600" "#FF6200" "#FF4E00" "#FF3B00" "#FF2700" "#FF1400"
## [78] "#FF0000"
cols <- rev(new_cols)
animated_map <- function() {
    world_map <- ggplot(
        data = dffk,
        aes(fill = verim)
    ) +
        geom_sf(color = "white", linewidth = 0.05) +
        scale_fill_gradientn(
            name = "",
            colours = cols,
            breaks = breaks,
            labels = round(breaks, 1),
            limits = c(vmin, vmax),
            na.value = "grey70"
        ) +
        coord_sf(crs = robinson_crs) +
        guides(fill = guide_legend(
            direction = "vertical",
            keyheight = unit(1, units = "mm"),
            keywidth = unit(1, units = "mm"),
            title.position = "top",
            title.hjust = .5,
            label.hjust = .5,
            nrow = 6,
            byrow = TRUE,
            reverse = FALSE,
            label.position = "right"
        )) +
        theme_minimal() +
        theme(
            axis.line = element_blank(),
            axis.text.x = element_blank(),
            axis.text.y = element_blank(),
            axis.ticks = element_blank(),
            axis.title.x = element_blank(),
            axis.title.y = element_blank(),
            legend.position = c(.5, -.015),
            legend.text = element_text(size = 5, color = "grey10"),
            panel.grid.major = element_line(color = "white", linewidth = .2),
            panel.grid.minor = element_blank(),
            plot.title = element_text(
                face = "bold", size = 20,
                color = "grey10", hjust = .5, vjust = -3
            ),
            plot.subtitle = element_text(
                size = 40, color = "#c43c4e",
                hjust = .5, vjust = -1
            ),
            plot.caption = element_text(
                size = 8, color = "grey10",
                hjust = .5, vjust = -10
            ),
            plot.margin = unit(c(t = -4, r = -4, b = -4, l = -4), "lines"),
            plot.background = element_rect(fill = "white", color = NA),
            panel.background = element_rect(fill = "white", color = NA),
            legend.background = element_rect(fill = "white", color = NA),
            panel.border = element_blank()
        ) +
        labs(
            x = "",
            y = "",
            title = "Verim",
            subtitle = "Year: {as.integer(closest_state)}",
            caption = ""
        )

    return(world_map)
}


world_map <- animated_map()
print(world_map)

timelapse_world_map <- world_map +
    transition_states(year) +
    enter_fade() +
    exit_fade() +
    ease_aes("quadratic-in-out", interval = .2)
animated_world <- gganimate::animate(
    timelapse_world_map,
    nframes = 120,
    duration = 22,
    start_pause = 3,
    end_pause = 30,
    height = 6,
    width = 7.15,
    res = 300,
    units = "in",
    fps = 15,
    renderer = gifski_renderer(loop = T)
)
animated_world

library(showtext)
library(ggtext)
library(ggrepel)
data <- dffk %>% filter(year==2019 & !is.na(hc) )
hcverim <- data %>%
  ggplot(aes(x= hc, y=verim)) + 
  geom_point() +
  geom_text(data= data, aes(y = verim + .25, label=iso_a2, colour = region),
            size = 2) +
  geom_hline(yintercept=1, linetype='dotted', col = 'black') +
  geom_vline(xintercept=2.5, linetype='dotted', col = 'black') +
  geom_smooth(data=subset(data,verim>1 & hc>2.5),
               method=lm,se=FALSE) +
  labs(title = "Verim vs Human Capital (2019)",
       subtitle = NULL,
       tag = NULL, 
       x = "Human Capital",
       y= "Verim",
       color = NULL) +
  theme(
    axis.title.x = element_markdown(),
    axis.title.y = element_markdown(),
    axis.ticks = element_blank(),
    axis.line = element_line(),
    panel.background = element_rect(fill="#FFFFFF")
  ) 

Source for the regression equation label code: https://groups.google.com/forum/#!topic/ggplot2/1TgH-kG5XMA

lm_eqn <- function(df){
    m <- lm(verim ~ hc, df);
    eq <- substitute(verim == a + b %.% hc*","~r^2~"="~r2, 
         list(a = format(unname(coef(m)[1]), digits = 2),
              b = format(unname(coef(m)[2]), digits = 2),
             r2 = format(summary(m)$r.squared, digits = 3)))
    as.character(as.expression(eq));
}
# annotate() draws each label once; geom_text() with constant x/y would
# redraw the same label for every row of the data
hcverim +
  annotate("text", x = 4, y = 3, label = lm_eqn(data %>% filter(hc > 2.5, verim > 1)), size = 4, parse = TRUE) +
  annotate("text", x = 1.2, y = 8, label = "A", size = 6) +
  annotate("text", x = 1.2, y = 0, label = "B", size = 6) +
  annotate("text", x = 4.2, y = 8, label = "C", size = 6) +
  annotate("text", x = 4.2, y = 0, label = "D", size = 6)
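
The letters A to D mark the four regions cut out by the two dotted reference lines (verim = 1 and human capital = 2.5). As a rough sketch, the same split can be written as a classification rule. The mapping of letters to regions is inferred from where the labels are placed in the plot, the treatment of values exactly on a threshold is an assumption, and the four countries below are made up.

```r
# Hypothetical values for four made-up countries (not from the course data)
toy <- data.frame(
  iso_a2 = c("AA", "BB", "CC", "DD"),
  hc     = c(1.8, 1.6, 3.4, 3.1),
  verim  = c(1.6, 0.7, 2.2, 0.8)
)

# Thresholds taken from the dotted reference lines in the plot
hc_cut    <- 2.5
verim_cut <- 1

# A/B: human capital at or below the cut; C/D: above it
# A/C: verim above the cut; B/D: at or below it
toy$quadrant <- with(toy, ifelse(
  hc <= hc_cut,
  ifelse(verim > verim_cut, "A", "B"),
  ifelse(verim > verim_cut, "C", "D")
))

# Number of countries per region
table(toy$quadrant)
```

Under this reading, the regression line in the plot is fitted only to region C, the countries above both thresholds.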

govverim <- data %>%
  ggplot(aes(x= csh_g, y=verim)) + 
  geom_point() +
  geom_text(data= data, aes(y = verim + .25, label=iso_a2, colour = region),
            size = 2) +
  geom_hline(yintercept=1, linetype='dotted', col = 'black')  +
  geom_smooth(data=subset(data,verim>1),method=lm,se=FALSE) +
  labs(title = "Verim vs Government spending share",
       subtitle = NULL,
       tag = NULL, 
       x = "Government spending share",
       y= "Verim",
       color = NULL) +
  theme(
    axis.title.x = element_markdown(),
    axis.title.y = element_markdown(),
    axis.ticks = element_blank(),
    axis.line = element_line(),
    panel.background = element_rect(fill="#FFFFFF")
  ) 
govverim

lm_eqn2 <- function(df){
    m <- lm(verim ~ csh_g, df);
    eq <- substitute(verim == a + b %.% csh_g*","~r^2~"="~r2, 
         list(a = format(unname(coef(m)[1]), digits = 2),
              b = format(unname(coef(m)[2]), digits = 2),
             r2 = format(summary(m)$r.squared, digits = 3)))
    as.character(as.expression(eq));
}
# annotate() places the equation once instead of once per data row
govverim + annotate("text", x = 0.09, y = 2.5, label = lm_eqn2(data %>% filter(verim > 1)), size = 3, parse = TRUE)
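
The helpers `lm_eqn()` and `lm_eqn2()` both pull three numbers out of a fitted model: the intercept, the slope and R squared. The sketch below shows that extraction on its own, using a small made-up data frame (the values are hypothetical) and a plain text label in place of the plotmath expression.

```r
# Hypothetical data: five invented observations above both thresholds
toy <- data.frame(
  hc    = c(2.6, 2.9, 3.2, 3.5, 3.8),
  verim = c(1.1, 1.4, 1.5, 1.9, 2.1)
)

# Simple linear regression of verim on human capital
m <- lm(verim ~ hc, data = toy)

# The three quantities that the helper functions format
intercept <- unname(coef(m)[1])
slope     <- unname(coef(m)[2])
r2        <- summary(m)$r.squared

# Plain-text version of the label, with the same rounding as lm_eqn()
label <- sprintf(
  "verim = %s + %s * hc, r2 = %s",
  format(intercept, digits = 2),
  format(slope, digits = 2),
  format(r2, digits = 3)
)
label
```

Because the fit uses only the filtered subset, the printed equation describes that subset and should not be read as a relationship for all countries in the plot.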