0) Overview

During COVID-19, governments rarely implemented a single border policy in isolation. Instead, they introduced structured policy packages (e.g., closing air travel + visa bans + specific exceptions).
This project applies association rule learning (Apriori + ECLAT) to discover which border policy measures tend to co-occur.

Transaction logic (core idea):
- Transaction = one policy event: (Country + Start Date + Policy ID)
- Item = a policy attribute active in that event (e.g., AIR, VISA_BAN, CITIZEN_EXCEP, plus TYPE_* and SUBTYPE_*)

Outputs we aim to deliver:
- frequent policy bundles (frequent itemsets),
- strong associations between policy measures (rules),
- visualizations of rule structure,
- interpretation in an institutional/governance context.

1) Dataset

We use the COVID Border Accountability Project (COBAP) policy list dataset (downloaded from Harvard Dataverse by the student).
The dataset provides event-level border restriction records including:

policy classification: POLICY_TYPE, POLICY_SUBTYPE
channels: AIR, LAND, SEA
restriction categories: VISA_BAN, HISTORY_BAN, CITIZEN, REFUGEE
exceptions: CITIZEN_EXCEP, COUNTRY_EXCEP, WORK_EXCEP
plus identifiers and timing: COUNTRY_NAME, START_DATE, ID

This dataset is well-suited for association rules because each policy event naturally represents a set of co-implemented measures.

2) Setup

We load packages used for:
- data manipulation (tidyverse, lubridate)
- association rule mining (arules)
- rule visualization (arulesViz)

knitr::opts_chunk$set(echo = TRUE)

pkgs <- c("tidyverse", "lubridate", "arules", "arulesViz")
to_install <- pkgs[!sapply(pkgs, requireNamespace, quietly = TRUE)]
if(length(to_install) > 0) install.packages(to_install)

library(tidyverse)
library(lubridate)
library(arules)
library(arulesViz)

2.1) Load Data

df <- read.csv("policy_list.csv", stringsAsFactors = FALSE)

dim(df)

names(df)

head(df, 3)
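
Before building transactions, it can help to confirm that the loaded file actually contains the columns the later steps rely on. The following is a minimal sketch, assuming the column names listed in the Dataset section; `check_columns` and the toy data frame are hypothetical names introduced here for illustration and are not part of COBAP or of any package.

```r
# Hypothetical helper: report which required columns are missing from a data frame.
check_columns <- function(data, required) {
  missing_cols <- setdiff(required, names(data))
  if (length(missing_cols) > 0) {
    warning("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  invisible(missing_cols)
}

# Column names as listed in the Dataset section.
required_cols <- c(
  "ID", "COUNTRY_NAME", "START_DATE", "POLICY_TYPE", "POLICY_SUBTYPE",
  "AIR", "LAND", "SEA", "VISA_BAN", "HISTORY_BAN", "CITIZEN", "REFUGEE",
  "CITIZEN_EXCEP", "COUNTRY_EXCEP", "WORK_EXCEP"
)

# Toy stand-in for the real file (values invented for illustration only).
toy <- data.frame(ID = 1, COUNTRY_NAME = "A", START_DATE = "2020-03-15", AIR = 1)
missing_in_toy <- suppressWarnings(check_columns(toy, required_cols))
print(missing_in_toy)
```

On the real data, the same check would be `check_columns(df, required_cols)` right after `read.csv()`.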

3) Data Preparation

We create a TRANS identifier for each policy event and convert policy attributes into an item list.

The original COBAP dataset is provided in a “wide” format, where each row corresponds to a single border policy event and each column represents a specific policy attribute (e.g. air travel restrictions, visa bans, citizen exceptions).

However, association rule algorithms such as Apriori and ECLAT require the data to be in a transaction format, where:
- each transaction is a set of items,
- each item represents a characteristic that is present in that transaction.

In this project, we define:

A transaction as one border policy event, uniquely identified by the combination of
(Country, Start Date, Policy ID).
An item as a policy feature that is active in that event:
- policy type and policy subtype,
- travel channels affected (AIR, LAND, SEA),
- restriction categories (VISA_BAN, HISTORY_BAN, CITIZEN, REFUGEE),
- exception structures (CITIZEN_EXCEP, COUNTRY_EXCEP, WORK_EXCEP).

This transformation allows us to interpret each policy event as a “policy package” consisting of several coordinated measures.
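
To make the wide-to-transaction idea concrete, here is a small base-R sketch on two invented policy events. The rows, IDs, and values are made up for illustration and are not COBAP records. Each event becomes the set of flag columns equal to 1, plus its prefixed policy type.

```r
# Two invented events in wide format (not real COBAP rows).
toy_wide <- data.frame(
  TRANS = c("A_2020-03-15_1", "B_2020-03-20_2"),
  POLICY_TYPE = c("COMPLETE", "PARTIAL"),
  AIR = c(1, 1),
  LAND = c(1, 0),
  VISA_BAN = c(0, 1),
  stringsAsFactors = FALSE
)
flag_cols <- c("AIR", "LAND", "VISA_BAN")

# For each row, collect the prefixed type and the names of the flags set to 1.
toy_items <- lapply(seq_len(nrow(toy_wide)), function(i) {
  active <- flag_cols[unlist(toy_wide[i, flag_cols]) == 1]
  c(paste0("TYPE_", toy_wide$POLICY_TYPE[i]), active)
})
names(toy_items) <- toy_wide$TRANS
print(toy_items)
```

The first event yields the item set {TYPE_COMPLETE, AIR, LAND}; flags equal to 0 simply do not appear as items.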

item_cols <- c(
  "POLICY_TYPE", "POLICY_SUBTYPE",
  "AIR", "LAND", "SEA",
  "VISA_BAN", "CITIZEN", "HISTORY_BAN", "REFUGEE",
  "CITIZEN_EXCEP", "COUNTRY_EXCEP", "WORK_EXCEP"
)

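# Build one transaction ID per policy event and prefix the categorical values
# so they are recognizable as items (e.g. TYPE_COMPLETE).
df2 <- df %>%
  select(ID, COUNTRY_NAME, START_DATE, all_of(item_cols)) %>%
  mutate(
    TRANS = paste(COUNTRY_NAME, START_DATE, ID, sep = "_"),
    POLICY_TYPE = paste0("TYPE_", POLICY_TYPE),
    POLICY_SUBTYPE = paste0("SUBTYPE_", POLICY_SUBTYPE)
  )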

binary_cols <- c(
  "AIR", "LAND", "SEA",
  "VISA_BAN", "CITIZEN", "HISTORY_BAN", "REFUGEE",
  "CITIZEN_EXCEP", "COUNTRY_EXCEP", "WORK_EXCEP"
)

# binary items (keep only variables where val == 1)

long_bin <- df2 %>%
  select(TRANS, all_of(binary_cols)) %>%
  pivot_longer(-TRANS, names_to = "var", values_to = "val") %>%
  filter(val == 1) %>%
  mutate(ITEM = var) %>%
  select(TRANS, ITEM)

# categorical items (policy type and subtype become items)

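long_cat <- df2 %>%
  select(TRANS, POLICY_TYPE, POLICY_SUBTYPE) %>%
  pivot_longer(-TRANS, names_to = "var", values_to = "val") %>%
  # paste0() turns a missing value into the literal string "TYPE_NA" / "SUBTYPE_NA",
  # so those strings are dropped along with true NAs and SUBTYPE_NONE
  filter(!is.na(val), !val %in% c("TYPE_NA", "SUBTYPE_NA", "SUBTYPE_NONE")) %>%
  mutate(ITEM = val) %>%
  select(TRANS, ITEM)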

trans_single <- bind_rows(long_cat, long_bin) %>% distinct()

head(trans_single, 15)

4) Reading the transactions and inspecting

The arules package expects a transactions object. We save the single-format table to CSV and read it back as transactions.

write.csv(trans_single, "trans_single.csv", row.names = FALSE)

cobap_trans <- read.transactions(
  "trans_single.csv",
  format = "single",
  cols = c(1, 2),
  header = TRUE,
  sep = ","
)

cobap_trans
summary(cobap_trans)
inspect(cobap_trans[1:5])
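
The CSV round-trip is not strictly required. As a sketch of an alternative, a named list of item vectors can be coerced directly with `as(..., "transactions")`. The toy table below is invented and only stands in for `trans_single`.

```r
library(arules)

# Invented stand-in for trans_single (one row per transaction-item pair).
toy_single <- data.frame(
  TRANS = c("t1", "t1", "t2", "t2", "t2", "t3"),
  ITEM  = c("AIR", "LAND", "AIR", "VISA_BAN", "CITIZEN_EXCEP", "SEA"),
  stringsAsFactors = FALSE
)

# split() groups items by transaction ID; the resulting list is coerced to transactions.
toy_trans <- as(split(toy_single$ITEM, toy_single$TRANS), "transactions")
summary(toy_trans)
```

On the real data the equivalent call would be `as(split(trans_single$ITEM, trans_single$TRANS), "transactions")`, which avoids writing an intermediate file.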

5) Single- and two-dimension frequency inspection

We first explore: which items are most frequent, and how “long” transactions are (how many items per policy event).

# most frequent items

itemFrequencyPlot(cobap_trans, topN = 20, type = "absolute", main = "Top 20 items (absolute frequency)")
itemFrequencyPlot(cobap_trans, topN = 20, type = "relative", main = "Top 20 items (relative frequency)")

# show top items as a table

sort(itemFrequency(cobap_trans), decreasing = TRUE) %>% head(20)
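
Transaction length (the number of items per policy event) can be inspected with `size()` from arules. A minimal sketch on invented transactions that stand in for `cobap_trans`:

```r
library(arules)

# Invented transactions, for illustration only.
toy_trans <- as(list(
  c("AIR", "LAND", "SEA"),
  c("AIR", "VISA_BAN"),
  c("AIR", "LAND", "CITIZEN_EXCEP", "WORK_EXCEP")
), "transactions")

# size() returns the number of items in each transaction.
lens <- size(toy_trans)
table(lens)     # how many transactions have each length
summary(lens)   # min / median / mean / max length
```

For the project data, `hist(size(cobap_trans))` would give the corresponding distribution of package sizes.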

Association rules: measures (support, confidence, lift)

Association rules have the form: {LHS} → {RHS}

Key measures:
- Support: how often the whole rule occurs in the dataset (share of transactions containing both LHS and RHS).
- Confidence: how often RHS occurs given LHS (conditional probability of RHS given LHS).
- Lift: confidence divided by the overall share of transactions containing RHS, i.e. strength compared to statistical independence; lift > 1 means RHS is more likely when LHS is present than in the data overall.

In policy terms, high-lift rules suggest that certain measures form coherent policy packages.
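
The three measures can be worked through by hand. The sketch below uses five invented transactions and base R only; `support_of` is a hypothetical helper written for this illustration, not an arules function.

```r
# Five invented transactions, for illustration only.
toy <- list(
  c("AIR", "LAND", "CITIZEN_EXCEP"),
  c("AIR", "LAND"),
  c("AIR", "CITIZEN_EXCEP"),
  c("LAND"),
  c("AIR", "LAND", "CITIZEN_EXCEP")
)

# Share of transactions that contain every item in `items`.
support_of <- function(items, transactions) {
  mean(vapply(transactions, function(t) all(items %in% t), logical(1)))
}

lhs <- c("AIR", "LAND")
rhs <- "CITIZEN_EXCEP"

supp <- support_of(c(lhs, rhs), toy)   # share containing LHS and RHS together
conf <- supp / support_of(lhs, toy)    # share of LHS transactions that also contain RHS
lift <- conf / support_of(rhs, toy)    # confidence relative to the overall share of RHS

round(c(support = supp, confidence = conf, lift = lift), 3)
```

Here the rule {AIR, LAND} → {CITIZEN_EXCEP} holds in 2 of 5 transactions (support 0.4) and in 2 of the 3 transactions containing the LHS (confidence 0.667). Since CITIZEN_EXCEP appears in 3 of 5 transactions overall, lift is 0.667 / 0.6 ≈ 1.111.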

6) Apriori (mining rules)

We use Apriori with:
- minimum support = 0.02
- minimum confidence = 0.60
- minimum rule length = 2

This produced a manageable and interpretable rule set.

rules <- apriori( cobap_trans, parameter = list(supp = 0.02, conf = 0.60, minlen = 2) )

rules
summary(rules)
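
How "manageable" a rule set is depends on the thresholds. One way to see this, sketched here on invented transactions rather than the COBAP data, is to rerun `apriori()` over a few minimum-support values and count the rules each time. The support grid below is arbitrary and chosen only for the toy example.

```r
library(arules)

# Invented transactions, for illustration only.
toy_trans <- as(list(
  c("AIR", "LAND", "CITIZEN_EXCEP"),
  c("AIR", "LAND"),
  c("AIR", "CITIZEN_EXCEP"),
  c("LAND"),
  c("AIR", "LAND", "CITIZEN_EXCEP")
), "transactions")

# Count rules at each minimum support, holding confidence and minlen fixed.
supports <- c(0.2, 0.4, 0.6)
rule_counts <- sapply(supports, function(s) {
  length(apriori(
    toy_trans,
    parameter = list(supp = s, conf = 0.60, minlen = 2),
    control = list(verbose = FALSE)
  ))
})
data.frame(min_support = supports, n_rules = rule_counts)
```

With confidence held fixed, raising the minimum support can only keep or shrink the rule set, so the counts are non-increasing down the table.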

7) Remove redundant rules and view strongest rules

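Redundant rules repeat information already carried by a simpler rule: a rule is redundant when a more general rule (same RHS, fewer LHS items) has the same or higher confidence. We remove them and sort the remaining rules by lift.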

rules_nored <- rules[!is.redundant(rules)]
summary(rules_nored)

rules_lift <- sort(rules_nored, by = "lift", decreasing = TRUE)
inspect(head(rules_lift, 20))
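
The redundancy criterion can be illustrated without arules. The sketch below applies the "more general rule with the same or higher confidence" test to three invented rules; `dominates` is a hypothetical helper, and this is a simplified illustration of the idea rather than the implementation behind `is.redundant()`.

```r
# Invented rules: each has an LHS item set, an RHS item, and a confidence.
toy_rules <- list(
  list(lhs = c("AIR"),         rhs = "CITIZEN_EXCEP", conf = 0.80),
  list(lhs = c("AIR", "LAND"), rhs = "CITIZEN_EXCEP", conf = 0.75),
  list(lhs = c("AIR", "SEA"),  rhs = "CITIZEN_EXCEP", conf = 0.90)
)

# TRUE if `general` has the same RHS, a strictly smaller LHS contained in the
# LHS of `specific`, and a confidence at least as high.
dominates <- function(general, specific) {
  general$rhs == specific$rhs &&
    length(general$lhs) < length(specific$lhs) &&
    all(general$lhs %in% specific$lhs) &&
    general$conf >= specific$conf
}

# A rule is flagged when any other rule dominates it.
is_redundant <- vapply(toy_rules, function(r) {
  any(vapply(toy_rules, function(g) dominates(g, r), logical(1)))
}, logical(1))
is_redundant
```

Only the second rule is flagged: adding LAND to {AIR} lowers confidence from 0.80 to 0.75, so the longer rule adds nothing. The third rule is kept because adding SEA raises confidence to 0.90.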

8) What implies TYPE_COMPLETE?

rules_complete <- apriori(
  cobap_trans,
  parameter = list(supp = 0.01, conf = 0.40, minlen = 2),
  appearance = list(default = "lhs", rhs = "TYPE_COMPLETE"),
  control = list(verbose = FALSE)
)

summary(rules_complete)
inspect(head(sort(rules_complete, by = "lift", decreasing = TRUE), 15))
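
When writing up targeted rules like these, it can be convenient to turn each rule's measures into a sentence. A minimal sketch follows; `describe_rule` is a hypothetical helper and the numbers passed to it are invented, not results from the COBAP data.

```r
# Hypothetical helper: format one rule's measures as a readable sentence.
describe_rule <- function(lhs, rhs, support, confidence, lift) {
  sprintf(
    "Rule {%s} -> {%s}: support %.1f%%, confidence %.1f%%, lift %.2f.",
    paste(lhs, collapse = ", "), paste(rhs, collapse = ", "),
    100 * support, 100 * confidence, lift
  )
}

# Invented numbers, for illustration only.
example_sentence <- describe_rule(c("AIR", "LAND"), "TYPE_COMPLETE", 0.12, 0.70, 1.80)
print(example_sentence)
```

For mined rules, the inputs could be taken from `quality(rules_complete)` for the measures and from `labels(lhs(rules_complete))` for the item labels.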

9) Visualizing rules

We visualize the rule structure with arulesViz (top rules by lift).

top_rules <- head(rules_lift, 100)

plot(top_rules, measure = c("support", "lift"), shading = "confidence")
plot(top_rules, method = "matrix", measure = "lift")
plot(top_rules, method = "grouped")

plot(head(rules_lift, 30), method = "graph", control = list(type = "items"))

10) ECLAT: frequent itemsets and rule induction

ECLAT finds frequent itemsets directly. Then we can induce rules from those itemsets. This provides a second algorithmic perspective consistent with the course material.

freq_sets <- eclat(cobap_trans, parameter = list(supp = 0.02, maxlen = 5))
summary(freq_sets)
inspect(head(sort(freq_sets, by = "support", decreasing = TRUE), 15))

eclat_rules <- ruleInduction(freq_sets, cobap_trans, confidence = 0.60)
summary(eclat_rules)
inspect(head(sort(eclat_rules, by = "lift", decreasing = TRUE), 15))
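
ECLAT works on a vertical layout: for each item it keeps the list of transaction IDs containing that item (a tid-list) and obtains the support of an itemset by intersecting those lists. The base-R sketch below shows only this core idea on invented transactions. It is not the arules implementation and does no candidate search.

```r
# Invented transactions keyed by ID, for illustration only.
toy <- list(
  t1 = c("AIR", "LAND", "CITIZEN_EXCEP"),
  t2 = c("AIR", "LAND"),
  t3 = c("AIR", "CITIZEN_EXCEP"),
  t4 = c("LAND"),
  t5 = c("AIR", "LAND", "CITIZEN_EXCEP")
)

# Vertical layout: for each item, the IDs of the transactions that contain it.
items <- sort(unique(unlist(toy)))
tidlists <- lapply(setNames(items, items), function(it) {
  names(toy)[vapply(toy, function(t) it %in% t, logical(1))]
})

# Support of an itemset = size of the intersection of its items' tid-lists,
# divided by the number of transactions.
itemset <- c("AIR", "LAND")
common <- Reduce(intersect, tidlists[itemset])
itemset_support <- length(common) / length(toy)
itemset_support
```

AIR occurs in t1, t2, t3, t5 and LAND in t1, t2, t4, t5, so the intersection is t1, t2, t5 and the support of {AIR, LAND} is 3/5 = 0.6.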

Interpretation of results (write-up)

From the strongest rules (by lift and confidence), we observe that border closures were typically designed as packages: closure measures (air/land/sea), plus restriction types (visa/history/citizen), plus controlled exception structures (citizens, workers, specific countries).

Example interpretation template:
- Rule: {A, B} → {C}
- Support: x%
- Confidence: y%
- Lift: z
- Meaning: When A and B are present, C appears with high probability; lift > 1 implies the association is stronger than chance.

Limitations

Association rules reveal co-occurrence patterns, not causality. They help identify which policy measures tend to be implemented together, but not the political or epidemiological reasons behind them. Additionally, results may reflect reporting differences and variation in how policies are coded across countries.

Conclusion

This project demonstrates how association rule learning can uncover structured patterns in real-world governance data. Using COBAP policy events as transactions, Apriori and ECLAT identify frequent border policy bundles and strong associations, supporting the conclusion that COVID border restrictions were deployed as coordinated policy packages rather than independent actions.

COBAP: Association Rules on COVID Border Policies