During the COVID-19 pandemic, governments rarely implemented a single border policy in isolation. Instead, they introduced policy packages, combining several measures such as travel channel closures, visa bans, and structured exceptions. The main goal of this project is to discover hidden patterns in how governments combine different border control policies during global crises.
This project applies association rule learning to uncover:
- which policy measures tend to co-occur,
- which combinations form stable institutional patterns,
- how border governance was structured during crisis management.
We interpret:
- each policy event as a transaction (e.g., Country + Start Date + Policy ID),
- each active policy feature as an item (e.g., AIR, VISA_BAN, CITIZEN_EXCEP, plus TYPE_* and SUBTYPE_*).
Outputs we aim to deliver:
- frequent policy bundles (frequent itemsets),
- strong associations between policy measures (rules),
- visualizations of rule structure,
- interpretation in an institutional/governance context.
This allows us to treat border policies similarly to market-basket data, where the “basket” is a country’s policy package.
We use the COVID Border Accountability Project (COBAP) policy list dataset from Harvard Dataverse.
The dataset is available here: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi%3A10.7910%2FDVN%2FU6DJAC
This dataset records how governments restricted cross-border mobility during COVID-19:
- Entry bans
- Visa suspensions
- Quarantine requirements
- Testing requirements
- Flight bans
- Border closures
- Exceptions for citizens, diplomats, essential workers, etc.
It is already almost “transactional” by nature: each row corresponds to one border policy event issued by a country.
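Relevant variables include: ID, COUNTRY_NAME and START_DATE (which together identify a policy event), POLICY_TYPE and POLICY_SUBTYPE (categorical classifications), and the binary indicators AIR, LAND, SEA, VISA_BAN, CITIZEN, HISTORY_BAN, REFUGEE, CITIZEN_EXCEP, COUNTRY_EXCEP and WORK_EXCEP.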
How it fits association rule mining: in market basket analysis, a transaction is a shopping basket and the items are products.
Here, a transaction is a single policy event (one row of the dataset) and the items are the policy features active in that event.
So one row becomes something like {TYPE_COMPLETE, SUBTYPE_WORK_EXCEP, CITIZEN_EXCEP, COUNTRY_EXCEP, WORK_EXCEP}, which is structurally identical to {Milk, Bread, Butter}. Apriori and ECLAT therefore apply without any artificial restructuring.
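The mapping from a dataset row to a basket can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the `event` data frame is a made-up single row with a subset of the real columns, and `flags` and `basket` are hypothetical helper names, not objects from the analysis.

```r
# Toy policy event; values are illustrative, not taken from the dataset.
event <- data.frame(
  POLICY_TYPE = "COMPLETE",
  AIR = 0L, LAND = 0L,
  CITIZEN_EXCEP = 1L, WORK_EXCEP = 1L,
  stringsAsFactors = FALSE
)

# Binary indicators contribute their column name when set to 1.
flags  <- c("AIR", "LAND", "CITIZEN_EXCEP", "WORK_EXCEP")
active <- flags[unlist(event[1, flags]) == 1]

# The categorical label becomes a prefixed item.
basket <- c(paste0("TYPE_", event$POLICY_TYPE), active)
print(basket)
```

Indicators equal to 0 contribute nothing, which is why the resulting baskets stay short.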
We load:
- tidyverse and lubridate for data manipulation,
- arules for Apriori and ECLAT,
- arulesViz for rule visualization.
knitr::opts_chunk$set(echo = TRUE)
pkgs <- c("tidyverse", "lubridate", "arules", "arulesViz")
to_install <- pkgs[!sapply(pkgs, requireNamespace, quietly = TRUE)]
if(length(to_install) > 0) install.packages(to_install)
library(tidyverse)
library(lubridate)
library(arules)
library(arulesViz)
We load the COBAP policy list and inspect its structure.
df <- read.csv("policy_list.csv", stringsAsFactors = FALSE)
dim(df)
## [1] 1762 44
names(df)
## [1] "ID" "COUNTRY_NAME" "ISO3"
## [4] "ISO2" "POLICY_TYPE" "POLICY_SUBTYPE"
## [7] "START_DATE" "END_DATE" "AIR"
## [10] "AIR_TYPE" "TARGETS_AIR" "LAND"
## [13] "LAND_TYPE" "TARGETS_LAND" "SEA"
## [16] "SEA_TYPE" "TARGETS_SEA" "CITIZEN"
## [19] "CITIZEN_LIST" "HISTORY_BAN" "HISTORY_BAN_LIST"
## [22] "REFUGEE" "REFUGEE_LIST" "VISA_BAN"
## [25] "VISA_BAN_TYPE" "VISA_BAN_LIST" "CITIZEN_EXCEP"
## [28] "CITIZEN_EXCEP_LIST" "COUNTRY_EXCEP" "COUNTRY_EXCEP_LIST"
## [31] "WORK_EXCEP" "SOURCE_QUALITY" "SOURCE_TYPE"
## [34] "INTERNAL_GOVT_SOURCE" "AIRLINE_SOURCE" "INSURANCE_SOURCE"
## [37] "GOVT_SOCIAL_MED_SOURCE" "EXT_GOVT_SOURCE" "INTERNAL_MEDIA_SOURCE"
## [40] "EXT_MEDIA_SOURCE" "OTHER_SOURCE" "END_SOURCE"
## [43] "COMMENTS" "OLD_ID"
glimpse(df)
## Rows: 1,762
## Columns: 44
## $ ID <chr> "CY36", "AI04", "AR03", "AW04", "BJ02", "BQ01",…
## $ COUNTRY_NAME <chr> "Cyprus", "Anguilla", "Argentina", "Aruba", "Be…
## $ ISO3 <chr> "CYP", "AIA", "ARG", "ABW", "BEN", "BES", "BLZ"…
## $ ISO2 <chr> "CY", "AI", "AR", "AW", "BJ", "BQ", "BZ", "CC",…
## $ POLICY_TYPE <chr> "PARTIAL", "COMPLETE", "COMPLETE", "COMPLETE", …
## $ POLICY_SUBTYPE <chr> "HISTORY_BAN", "ESSENTIAL_ONLY", "ESSENTIAL_ONL…
## $ START_DATE <chr> "06_21_21", "04_22_21", "03_27_20", "03_26_20",…
## $ END_DATE <chr> "06_28_21", "05_25_21", "04_26_20", "06_15_20",…
## $ AIR <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
## $ AIR_TYPE <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ TARGETS_AIR <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ LAND <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
## $ LAND_TYPE <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ TARGETS_LAND <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ SEA <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
## $ SEA_TYPE <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ TARGETS_SEA <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ CITIZEN <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
## $ CITIZEN_LIST <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ HISTORY_BAN <int> 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
## $ HISTORY_BAN_LIST <chr> "Spain, Netherlands, Andorra, Monaco, Vatican C…
## $ REFUGEE <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
## $ REFUGEE_LIST <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ VISA_BAN <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
## $ VISA_BAN_TYPE <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ VISA_BAN_LIST <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ CITIZEN_EXCEP <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
## $ CITIZEN_EXCEP_LIST <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ COUNTRY_EXCEP <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
## $ COUNTRY_EXCEP_LIST <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ WORK_EXCEP <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
## $ SOURCE_QUALITY <chr> "Very Sure", "Less Sure", "Less Sure", "Very Su…
## $ SOURCE_TYPE <chr> "1", "6", "6,8", "1", "1", "1", "1", "1", "3", …
## $ INTERNAL_GOVT_SOURCE <chr> "https://web.archive.org/web/20210725121754/htt…
## $ AIRLINE_SOURCE <chr> "", "", "", "", "", "", "", "", "", "https://we…
## $ INSURANCE_SOURCE <chr> "", "", "", "", "", "", "", "", "https://web.ar…
## $ GOVT_SOCIAL_MED_SOURCE <chr> "", "", "", "", "", "", "", "", "", "", "", "",…
## $ EXT_GOVT_SOURCE <chr> "", "", "", "", "", "", "", "", "", "", "https:…
## $ INTERNAL_MEDIA_SOURCE <chr> "", "https://web.archive.org/web/20210507195435…
## $ EXT_MEDIA_SOURCE <chr> "", "", "", "", "", "", "", "", "", "", "", "",…
## $ OTHER_SOURCE <chr> "", "", "https://perma.cc/HBD5-UVZY,", "", "", …
## $ END_SOURCE <chr> "", "", "", "", "", " https://web.archive.org/w…
## $ COMMENTS <chr> "", "", "Original policy started on 3/27 and wa…
## $ OLD_ID <chr> "EB170", "ET5", "117901", "MG12", "2098", "EB13…
nrow(df)
## [1] 1762
head(df, 3)
## ID COUNTRY_NAME ISO3 ISO2 POLICY_TYPE POLICY_SUBTYPE START_DATE END_DATE
## 1 CY36 Cyprus CYP CY PARTIAL HISTORY_BAN 06_21_21 06_28_21
## 2 AI04 Anguilla AIA AI COMPLETE ESSENTIAL_ONLY 04_22_21 05_25_21
## 3 AR03 Argentina ARG AR COMPLETE ESSENTIAL_ONLY 03_27_20 04_26_20
## AIR AIR_TYPE TARGETS_AIR LAND LAND_TYPE TARGETS_LAND SEA SEA_TYPE TARGETS_SEA
## 1 0 <NA> <NA> 0 <NA> <NA> 0 <NA> <NA>
## 2 0 <NA> <NA> 0 <NA> <NA> 0 <NA> <NA>
## 3 0 <NA> <NA> 0 <NA> <NA> 0 <NA> <NA>
## CITIZEN CITIZEN_LIST HISTORY_BAN
## 1 0 <NA> 1
## 2 0 <NA> 0
## 3 0 <NA> 0
## HISTORY_BAN_LIST
## 1 Spain, Netherlands, Andorra, Monaco, Vatican City, San Marino, Switzerland, Liechtenstein, Rwanda, Russia, United Arab Emirates, Ukraine, Jordan, Lebanon, Egypt, Belarus, Qatar, Serbia, Thailand, Armenia, Georgia, Bahrain, Kuwait, Canada, Taiwan, Albania, North Macedonia
## 2 <NA>
## 3 <NA>
## REFUGEE REFUGEE_LIST VISA_BAN VISA_BAN_TYPE VISA_BAN_LIST CITIZEN_EXCEP
## 1 0 <NA> 0 <NA> <NA> 0
## 2 0 <NA> 0 <NA> <NA> 0
## 3 0 <NA> 0 <NA> <NA> 0
## CITIZEN_EXCEP_LIST COUNTRY_EXCEP COUNTRY_EXCEP_LIST WORK_EXCEP SOURCE_QUALITY
## 1 <NA> 0 <NA> 0 Very Sure
## 2 <NA> 0 <NA> 0 Less Sure
## 3 <NA> 0 <NA> 0 Less Sure
## SOURCE_TYPE
## 1 1
## 2 6
## 3 6,8
## INTERNAL_GOVT_SOURCE
## 1 https://web.archive.org/web/20210725121754/https://www.pio.gov.cy/coronavirus/uploads/21062021_Epidemiological%20risk%20assessmentEN.pdf
## 2
## 3
## AIRLINE_SOURCE INSURANCE_SOURCE GOVT_SOCIAL_MED_SOURCE EXT_GOVT_SOURCE
## 1
## 2
## 3
## INTERNAL_MEDIA_SOURCE
## 1
## 2 https://web.archive.org/web/20210507195435/https://www.travelagentcentral.com/caribbean/anguilla-to-reopen-may-25-after-brief-closure
## 3 https://perma.cc/2XKK-FM4S
## EXT_MEDIA_SOURCE OTHER_SOURCE END_SOURCE
## 1
## 2
## 3 https://perma.cc/HBD5-UVZY,
## COMMENTS
## 1
## 2
## 3 Original policy started on 3/27 and was extended twice, sources reflect this
## OLD_ID
## 1 EB170
## 2 ET5
## 3 117901
This confirms that the dataset is event-level and contains both categorical policy classifications and binary policy indicators.
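START_DATE and END_DATE are stored as character strings such as "06_21_21". The sample rows suggest a month_day_year layout (the comment on the Argentina row mentions a 3/27 start for "03_27_20"). A minimal sketch of parsing them with base R, assuming that layout holds for every row; the two toy vectors stand in for the real columns:

```r
# Toy values copied from the first three rows shown above.
start_raw <- c("06_21_21", "04_22_21", "03_27_20")
end_raw   <- c("06_28_21", "05_25_21", "04_26_20")

# Assumed layout: two-digit month, day, year separated by underscores.
start_date <- as.Date(start_raw, format = "%m_%d_%y")
end_date   <- as.Date(end_raw,   format = "%m_%d_%y")

# Policy duration in days.
duration_days <- as.numeric(end_date - start_date)
print(data.frame(start_date, end_date, duration_days))
```

Rows that do not follow the assumed layout would come back as NA, so counting NAs after parsing is a quick way to test the assumption.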
We create a TRANS identifier for each policy event and convert policy attributes into an item list.
The original COBAP dataset is provided in a “wide” format, where each row corresponds to a single border policy event and each column represents a specific policy attribute (e.g. air travel restrictions, visa bans, citizen exceptions).
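However, association rule algorithms such as Apriori and ECLAT require the data to be in a transaction format, where:
- each transaction is a set of items,
- each item represents a characteristic that is present in that transaction.
In this project, we define:
- a transaction as one policy event, identified by TRANS (COUNTRY_NAME, START_DATE and ID joined with underscores),
- an item as either a binary policy indicator equal to 1 (e.g. AIR, VISA_BAN) or a prefixed category label (TYPE_* from POLICY_TYPE, SUBTYPE_* from POLICY_SUBTYPE).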
This transformation allows us to interpret each policy event as a “policy package” consisting of several coordinated measures.
item_cols <- c(
"POLICY_TYPE", "POLICY_SUBTYPE",
"AIR", "LAND", "SEA",
"VISA_BAN", "CITIZEN", "HISTORY_BAN", "REFUGEE",
"CITIZEN_EXCEP", "COUNTRY_EXCEP", "WORK_EXCEP"
)
df2 <- df %>%
select(ID, COUNTRY_NAME, START_DATE, all_of(item_cols)) %>%
mutate(
TRANS = paste(COUNTRY_NAME, START_DATE, ID, sep = "_"),
POLICY_TYPE = paste0("TYPE_", POLICY_TYPE),
POLICY_SUBTYPE = paste0("SUBTYPE_", POLICY_SUBTYPE)
)
binary_cols <- c(
"AIR","LAND","SEA",
"VISA_BAN","CITIZEN","HISTORY_BAN","REFUGEE",
"CITIZEN_EXCEP","COUNTRY_EXCEP","WORK_EXCEP"
)
# binary items (keep only variables where val == 1)
long_bin <- df2 %>%
select(TRANS, all_of(binary_cols)) %>%
pivot_longer(-TRANS, names_to="var", values_to="val") %>%
filter(val == 1) %>%
mutate(ITEM = var) %>%
select(TRANS, ITEM)
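# categorical items (policy type and subtype become items)
# note: paste0() above turns a missing label into the string "TYPE_NA" / "SUBTYPE_NA",
# so the !is.na() check below cannot drop it; only SUBTYPE_NONE is filtered out here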
long_cat <- df2 %>%
select(TRANS, POLICY_TYPE, POLICY_SUBTYPE) %>%
pivot_longer(cols=-TRANS, names_to="var", values_to="val") %>%
filter(!is.na(val), val != "SUBTYPE_NONE") %>%
mutate(ITEM = val) %>%
select(TRANS, ITEM)
trans_single <- bind_rows(long_cat, long_bin) %>% distinct()
head(trans_single, 15)
## # A tibble: 15 × 2
## TRANS ITEM
## <chr> <chr>
## 1 Cyprus_06_21_21_CY36 TYPE_PARTIAL
## 2 Cyprus_06_21_21_CY36 SUBTYPE_HISTORY_BAN
## 3 Anguilla_04_22_21_AI04 TYPE_COMPLETE
## 4 Anguilla_04_22_21_AI04 SUBTYPE_ESSENTIAL_ONLY
## 5 Argentina_03_27_20_AR03 TYPE_COMPLETE
## 6 Argentina_03_27_20_AR03 SUBTYPE_ESSENTIAL_ONLY
## 7 Aruba_03_26_20_AW04 TYPE_COMPLETE
## 8 Aruba_03_26_20_AW04 SUBTYPE_ESSENTIAL_ONLY
## 9 Benin_03_17_20_BJ02 TYPE_COMPLETE
## 10 Benin_03_17_20_BJ02 SUBTYPE_ESSENTIAL_ONLY
## 11 Bonaire, Sint Eustatius and Saba_03_15_20_BQ01 TYPE_COMPLETE
## 12 Bonaire, Sint Eustatius and Saba_03_15_20_BQ01 SUBTYPE_ESSENTIAL_ONLY
## 13 Belize_04_05_20_BZ10 TYPE_COMPLETE
## 14 Belize_04_05_20_BZ10 SUBTYPE_ESSENTIAL_ONLY
## 15 Cocos (Keeling) Islands_04_17_20_CC33 TYPE_COMPLETE
Each row now represents one (transaction, item) pair. This is the “single” format, one of the input layouts that read.transactions() in the arules package accepts.
# write the (transaction, item) pairs to disk so read.transactions() can parse them
write.csv(trans_single, "trans_single.csv", row.names = FALSE)
cobap_trans <- read.transactions(
"trans_single.csv",
format = "single",
cols = c(1,2),
header = TRUE,
sep = ","
)
summary(cobap_trans)
## transactions as itemMatrix in sparse format with
## 1762 rows (elements/itemsets/transactions) and
## 22 columns (items) and a density of 0.1510164
##
## most frequent items:
## TYPE_PARTIAL SUBTYPE_BORDER_CLOSURE AIR
## 1333 828 689
## TYPE_COMPLETE CITIZEN_EXCEP (Other)
## 422 372 2210
##
## element (itemset/transaction) length distribution:
## sizes
## 1 2 3 4 5
## 7 36 1238 344 137
##
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.000 3.000 3.000 3.322 4.000 5.000
##
## includes extended item information - examples:
## labels
## 1 AIR
## 2 CITIZEN
## 3 CITIZEN_EXCEP
##
## includes extended transaction information - examples:
## transactionID
## 1 Afghanistan_02_23_20_AF02
## 2 Afghanistan_03_01_20_AF04
## 3 Afghanistan_03_31_20_AF01
inspect(cobap_trans[1:5])
## items transactionID
## [1] {LAND,
## SUBTYPE_BORDER_CLOSURE,
## TYPE_PARTIAL} Afghanistan_02_23_20_AF02
## [2] {LAND,
## SUBTYPE_BORDER_CLOSURE,
## TYPE_PARTIAL} Afghanistan_03_01_20_AF04
## [3] {AIR,
## SUBTYPE_BORDER_CLOSURE,
## TYPE_PARTIAL} Afghanistan_03_31_20_AF01
## [4] {LAND,
## SUBTYPE_BORDER_CLOSURE,
## TYPE_PARTIAL} Afghanistan_04_29_21_AF03
## [5] {CITIZEN_EXCEP,
## COUNTRY_EXCEP,
## SUBTYPE_WORK_EXCEP,
## TYPE_COMPLETE,
## WORK_EXCEP} Åland Islands_01_11_21_AX12
We now have a sparse transaction matrix where:
- rows = policy events,
- columns = policy features,
- values = presence/absence.
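The presence/absence structure can be made concrete with a base R sketch on toy data. The `pairs` table below is invented and only mimics the layout of trans_single. As far as we know, arules can also coerce a list such as `item_list` directly with `as(item_list, "transactions")`, which would avoid the CSV round trip, but that route is not used in this report.

```r
# Toy (transaction, item) pairs in the same long layout as trans_single.
pairs <- data.frame(
  TRANS = c("A_01", "A_01", "A_01", "B_02", "B_02"),
  ITEM  = c("TYPE_PARTIAL", "SUBTYPE_BORDER_CLOSURE", "AIR",
            "TYPE_COMPLETE", "CITIZEN_EXCEP"),
  stringsAsFactors = FALSE
)

# One item vector per transaction.
item_list <- split(pairs$ITEM, pairs$TRANS)

# Presence/absence matrix: rows are transactions, columns are items.
incidence <- unclass(table(pairs$TRANS, pairs$ITEM)) > 0
print(incidence)

# Density = share of cells that are TRUE.
density <- mean(incidence)
print(density)
```

The same definition explains the density in the summary above: a mean of 3.322 items per transaction across 22 possible items gives roughly 0.151.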
Before mining rules, we explore:
1. Which policy measures (items) are most frequent?
2. How dominant are different measures across policy packages?
# most frequent items
itemFrequencyPlot(cobap_trans, topN = 20, type = "absolute",
main = "Top 20 items (absolute frequency)")
itemFrequencyPlot(cobap_trans, topN = 20, type = "relative",
main = "Top 20 items (relative frequency)")
sort(itemFrequency(cobap_trans), decreasing=TRUE) %>% head(20)
## TYPE_PARTIAL SUBTYPE_BORDER_CLOSURE AIR
## 0.756526674 0.469920545 0.391032917
## TYPE_COMPLETE CITIZEN_EXCEP HISTORY_BAN
## 0.239500568 0.211123723 0.153234960
## LAND SUBTYPE_HISTORY_BAN SEA
## 0.142451759 0.139046538 0.129398411
## CITIZEN SUBTYPE_CITIZENSHIP_BAN SUBTYPE_CITIZEN_EXCEP
## 0.110102157 0.110102157 0.100454030
## COUNTRY_EXCEP SUBTYPE_WORK_EXCEP WORK_EXCEP
## 0.077752554 0.073779796 0.073779796
## SUBTYPE_SPECIFIC_COUNTRY SUBTYPE_VISA_BAN VISA_BAN
## 0.044835414 0.035754824 0.035754824
## SUBTYPE_ESSENTIAL_ONLY TYPE_NOPOLICYIMPLEMENTED
## 0.020431328 0.003972758
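The plots show that:
- closure classifications dominate the dataset: TYPE_PARTIAL appears in about 76% of policy events and SUBTYPE_BORDER_CLOSURE in about 47%,
- AIR restrictions (39%) appear far more often than LAND (14%) or SEA (13%),
- exception items are common, led by CITIZEN_EXCEP (21%).
Item frequencies alone do not show which measures occur together; whether border policies were implemented as structured bundles is what the rule mining below examines.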
Each rule has the form: {LHS} → {RHS}
Key measures:
- Support: how often the whole rule occurs in the dataset (share of transactions containing both LHS and RHS).
- Confidence: how often RHS occurs when LHS is present (conditional probability of RHS given LHS).
- Lift: confidence divided by the overall share of transactions containing RHS, i.e. strength compared to what independence would predict (lift > 1 indicates a positive association).
In policy terms, high-lift rules suggest that certain measures form coherent policy packages.
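As a worked example of these three measures, the sketch below recomputes them by hand for the rule {AIR, LAND} → {SEA}. The rule count (81) is reported in the rule listing of this report; the LHS and RHS counts are back-calculated from the reported coverage and item frequency, so treat them as derived values rather than figures read directly from the data.

```r
n         <- 1762  # transactions in cobap_trans
count_all <- 81    # events containing AIR, LAND and SEA (reported rule count)
count_lhs <- 120   # events containing AIR and LAND (coverage 0.0681 * 1762)
count_rhs <- 228   # events containing SEA (item frequency 0.1294 * 1762)

support    <- count_all / n                  # share with LHS and RHS together
confidence <- count_all / count_lhs          # P(RHS | LHS)
lift       <- confidence / (count_rhs / n)   # confidence relative to P(RHS)

print(round(c(support = support, confidence = confidence, lift = lift), 4))
```

The results should agree with the values arules reports for that rule (support 0.0460, confidence 0.675, lift 5.216).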
We use Apriori with:
- minimum support = 0.02,
- minimum confidence = 0.60,
- minimum rule length = 2.
This produces a manageable and interpretable rule set.
rules <- apriori(
cobap_trans,
parameter = list(supp = 0.02, conf = 0.60, minlen = 2)
)
## Apriori
##
## Parameter specification:
## confidence minval smax arem aval originalSupport maxtime support minlen
## 0.6 0.1 1 none FALSE TRUE 5 0.02 2
## maxlen target ext
## 10 rules TRUE
##
## Algorithmic control:
## filter tree heap memopt load sort verbose
## 0.1 TRUE TRUE FALSE TRUE 2 TRUE
##
## Absolute minimum support count: 35
##
## set item appearances ...[0 item(s)] done [0.00s].
## set transactions ...[22 item(s), 1762 transaction(s)] done [0.00s].
## sorting and recoding items ... [19 item(s)] done [0.00s].
## creating transaction tree ... done [0.00s].
## checking subsets of size 1 2 3 4 5 done [0.00s].
## writing ... [128 rule(s)] done [0.00s].
## creating S4 object ... done [0.00s].
rules
## set of 128 rules
summary(rules)
## set of 128 rules
##
## rule length distribution (lhs + rhs):sizes
## 2 3 4 5
## 38 52 30 8
##
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2.000 2.000 3.000 3.062 4.000 5.000
##
## summary of quality measures:
## support confidence coverage lift
## Min. :0.02043 Min. :0.6212 Min. :0.02043 Min. : 1.322
## 1st Qu.:0.04200 1st Qu.:0.9655 1st Qu.:0.04413 1st Qu.: 2.128
## Median :0.06867 Median :1.0000 Median :0.06867 Median : 4.175
## Mean :0.09434 Mean :0.9679 Mean :0.09992 Mean : 5.743
## 3rd Qu.:0.11862 3rd Qu.:1.0000 3rd Qu.:0.11862 3rd Qu.: 5.544
## Max. :0.46992 Max. :1.0000 Max. :0.75653 Max. :27.968
## count
## Min. : 36.0
## 1st Qu.: 74.0
## Median :121.0
## Mean :166.2
## 3rd Qu.:209.0
## Max. :828.0
##
## mining info:
## data ntransactions support confidence
## cobap_trans 1762 0.02 0.6
## call
## apriori(data = cobap_trans, parameter = list(supp = 0.02, conf = 0.6, minlen = 2))
These thresholds balance:
- interpretability,
- statistical relevance,
- manageable rule volume.
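To see what a support threshold means in absolute terms, it helps to translate it into a number of policy events. A small sketch, assuming an itemset must satisfy count / n >= support; the grid of thresholds is illustrative, and only 0.01 and 0.02 are used in this report:

```r
n <- 1762                                  # transactions in cobap_trans
supp_grid <- c(0.01, 0.02, 0.05, 0.10)     # candidate support thresholds

# Smallest whole number of events that reaches each threshold.
min_count <- ceiling(supp_grid * n)
print(data.frame(support = supp_grid, min_count = min_count))
```

For supp = 0.02 this gives 36 events (0.02 × 1762 = 35.24), which is consistent with the smallest rule count of 36 in the summary, while the Apriori log prints the truncated value 35.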
Redundant rules can repeat the same information. We remove them and sort by lift.
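The idea behind the redundancy filter can be sketched in base R. This is a simplified reading of the criterion documented for is.redundant() (a rule is redundant if a more general rule with the same RHS is at least as confident), not a reimplementation of arules. The toy table uses confidences copied from outputs in this report; `rules_df` and `is_redundant` are hypothetical names.

```r
# Toy rule table; confidences are copied from outputs in this report.
rules_df <- data.frame(
  lhs  = c("COUNTRY_EXCEP", "WORK_EXCEP", "COUNTRY_EXCEP,WORK_EXCEP",
           "WORK_EXCEP", "CITIZEN_EXCEP,WORK_EXCEP"),
  rhs  = c("CITIZEN_EXCEP", "CITIZEN_EXCEP", "CITIZEN_EXCEP",
           "TYPE_COMPLETE", "TYPE_COMPLETE"),
  conf = c(0.9489, 0.9308, 0.9655, 1, 1),
  stringsAsFactors = FALSE
)
lhs_sets <- strsplit(rules_df$lhs, ",", fixed = TRUE)

# Rule i is redundant if some rule j has the same RHS, an LHS that is a
# proper subset of rule i's LHS, and a confidence at least as high.
is_redundant <- sapply(seq_len(nrow(rules_df)), function(i) {
  any(sapply(seq_len(nrow(rules_df)), function(j) {
    rules_df$rhs[j] == rules_df$rhs[i] &&
      length(lhs_sets[[j]]) < length(lhs_sets[[i]]) &&
      all(lhs_sets[[j]] %in% lhs_sets[[i]]) &&
      rules_df$conf[j] >= rules_df$conf[i]
  }))
})
print(cbind(rules_df, is_redundant))
```

In this toy set only {CITIZEN_EXCEP, WORK_EXCEP} → {TYPE_COMPLETE} is flagged, because {WORK_EXCEP} → {TYPE_COMPLETE} already reaches confidence 1. {COUNTRY_EXCEP, WORK_EXCEP} → {CITIZEN_EXCEP} is kept, since adding the second condition raises confidence above both single-item rules.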
rules_nored <- rules[!is.redundant(rules)]
summary(rules_nored)
## set of 41 rules
##
## rule length distribution (lhs + rhs):sizes
## 2 3
## 38 3
##
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2.000 2.000 2.000 2.073 2.000 3.000
##
## summary of quality measures:
## support confidence coverage lift
## Min. :0.02043 Min. :0.6212 Min. :0.02043 Min. : 1.322
## 1st Qu.:0.04597 1st Qu.:0.9655 1st Qu.:0.06810 1st Qu.: 1.322
## Median :0.10045 Median :1.0000 Median :0.10045 Median : 4.175
## Mean :0.13165 Mean :0.9637 Mean :0.14288 Mean : 5.433
## 3rd Qu.:0.13905 3rd Qu.:1.0000 3rd Qu.:0.14245 3rd Qu.: 4.737
## Max. :0.46992 Max. :1.0000 Max. :0.75653 Max. :27.968
## count
## Min. : 36
## 1st Qu.: 81
## Median :177
## Mean :232
## 3rd Qu.:245
## Max. :828
##
## mining info:
## data ntransactions support confidence
## cobap_trans 1762 0.02 0.6
## call
## apriori(data = cobap_trans, parameter = list(supp = 0.02, conf = 0.6, minlen = 2))
rules_lift <- sort(rules_nored, by = "lift", decreasing = TRUE)
inspect(head(rules_lift, 20))
## lhs rhs
## [1] {VISA_BAN} => {SUBTYPE_VISA_BAN}
## [2] {SUBTYPE_VISA_BAN} => {VISA_BAN}
## [3] {SUBTYPE_WORK_EXCEP} => {WORK_EXCEP}
## [4] {WORK_EXCEP} => {SUBTYPE_WORK_EXCEP}
## [5] {SUBTYPE_SPECIFIC_COUNTRY} => {COUNTRY_EXCEP}
## [6] {SUBTYPE_CITIZENSHIP_BAN} => {CITIZEN}
## [7] {CITIZEN} => {SUBTYPE_CITIZENSHIP_BAN}
## [8] {SUBTYPE_HISTORY_BAN} => {HISTORY_BAN}
## [9] {HISTORY_BAN} => {SUBTYPE_HISTORY_BAN}
## [10] {AIR, LAND} => {SEA}
## [11] {SUBTYPE_CITIZEN_EXCEP} => {CITIZEN_EXCEP}
## [12] {COUNTRY_EXCEP, SUBTYPE_WORK_EXCEP} => {CITIZEN_EXCEP}
## [13] {COUNTRY_EXCEP, WORK_EXCEP} => {CITIZEN_EXCEP}
## [14] {COUNTRY_EXCEP} => {CITIZEN_EXCEP}
## [15] {SUBTYPE_SPECIFIC_COUNTRY} => {CITIZEN_EXCEP}
## [16] {SUBTYPE_WORK_EXCEP} => {CITIZEN_EXCEP}
## [17] {WORK_EXCEP} => {CITIZEN_EXCEP}
## [18] {SUBTYPE_ESSENTIAL_ONLY} => {TYPE_COMPLETE}
## [19] {SUBTYPE_SPECIFIC_COUNTRY} => {TYPE_COMPLETE}
## [20] {SUBTYPE_CITIZEN_EXCEP} => {TYPE_COMPLETE}
## support confidence coverage lift count
## [1] 0.03575482 1.0000000 0.03575482 27.968254 63
## [2] 0.03575482 1.0000000 0.03575482 27.968254 63
## [3] 0.07377980 1.0000000 0.07377980 13.553846 130
## [4] 0.07377980 1.0000000 0.07377980 13.553846 130
## [5] 0.04483541 1.0000000 0.04483541 12.861314 79
## [6] 0.11010216 1.0000000 0.11010216 9.082474 194
## [7] 0.11010216 1.0000000 0.11010216 9.082474 194
## [8] 0.13904654 1.0000000 0.13904654 6.525926 245
## [9] 0.13904654 0.9074074 0.15323496 6.525926 245
## [10] 0.04597049 0.6750000 0.06810443 5.216447 81
## [11] 0.10045403 1.0000000 0.10045403 4.736559 177
## [12] 0.03178207 0.9655172 0.03291714 4.573230 56
## [13] 0.03178207 0.9655172 0.03291714 4.573230 56
## [14] 0.07377980 0.9489051 0.07775255 4.494545 130
## [15] 0.04199773 0.9367089 0.04483541 4.436777 74
## [16] 0.06867196 0.9307692 0.07377980 4.408644 121
## [17] 0.06867196 0.9307692 0.07377980 4.408644 121
## [18] 0.02043133 1.0000000 0.02043133 4.175355 36
## [19] 0.04483541 1.0000000 0.04483541 4.175355 79
## [20] 0.10045403 1.0000000 0.10045403 4.175355 177
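The strongest rules (highest lift) describe combinations that occur far more often than independence would predict. Several of the top rules, however, pair a binary indicator with its own subtype label (e.g. VISA_BAN with SUBTYPE_VISA_BAN, WORK_EXCEP with SUBTYPE_WORK_EXCEP) and hold in both directions with confidence 1, so they largely reflect how policies are coded rather than substantive bundling. Rules such as {AIR, LAND} → {SEA} or {COUNTRY_EXCEP} → {CITIZEN_EXCEP} are more informative about how measures were actually combined.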
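Rules that merely link an indicator to the subtype label encoding the same fact can be screened out before interpretation. The sketch below shows one way to do this in base R on a toy table of single-item rules. The `same_fact` lookup is a hypothetical mapping chosen from the confidence-1 pairs visible in the listing above, not something provided by the dataset or by arules.

```r
# Hypothetical lookup of subtype labels and the indicator encoding the same fact.
same_fact <- c(
  SUBTYPE_VISA_BAN        = "VISA_BAN",
  SUBTYPE_WORK_EXCEP      = "WORK_EXCEP",
  SUBTYPE_CITIZENSHIP_BAN = "CITIZEN",
  SUBTYPE_HISTORY_BAN     = "HISTORY_BAN"
)

# Toy single-item rules copied from the lift ranking.
top <- data.frame(
  lhs = c("VISA_BAN", "SUBTYPE_WORK_EXCEP", "COUNTRY_EXCEP", "WORK_EXCEP"),
  rhs = c("SUBTYPE_VISA_BAN", "WORK_EXCEP", "CITIZEN_EXCEP", "CITIZEN_EXCEP"),
  stringsAsFactors = FALSE
)

# A rule counts as a coding artifact if its two sides form a known pair.
is_pair <- function(a, b) {
  (a %in% names(same_fact) & unname(same_fact[a]) == b) |
    (b %in% names(same_fact) & unname(same_fact[b]) == a)
}
artifact <- unname(is_pair(top$lhs, top$rhs))

# Keep only rules that relate two different policy features.
print(top[!artifact, ])
```

With an arules rule set, the same idea could be applied after converting the rules to a data frame; the lookup would need extending for multi-item left-hand sides.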
rules_complete <- apriori(
cobap_trans,
parameter = list(supp = 0.01, conf = 0.40, minlen = 2),
appearance = list(default = "lhs", rhs = "TYPE_COMPLETE"),
control = list(verbose = FALSE)
)
summary(rules_complete)
## set of 22 rules
##
## rule length distribution (lhs + rhs):sizes
## 2 3 4 5
## 7 9 5 1
##
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2.00 2.00 3.00 3.00 3.75 5.00
##
## summary of quality measures:
## support confidence coverage lift
## Min. :0.02043 Min. :1 Min. :0.02043 Min. :4.175
## 1st Qu.:0.03292 1st Qu.:1 1st Qu.:0.03292 1st Qu.:4.175
## Median :0.05675 Median :1 Median :0.05675 Median :4.175
## Mean :0.06269 Mean :1 Mean :0.06269 Mean :4.175
## 3rd Qu.:0.07378 3rd Qu.:1 3rd Qu.:0.07378 3rd Qu.:4.175
## Max. :0.21112 Max. :1 Max. :0.21112 Max. :4.175
## count
## Min. : 36.0
## 1st Qu.: 58.0
## Median :100.0
## Mean :110.5
## 3rd Qu.:130.0
## Max. :372.0
##
## mining info:
## data ntransactions support confidence
## cobap_trans 1762 0.01 0.4
## call
## apriori(data = cobap_trans, parameter = list(supp = 0.01, conf = 0.4, minlen = 2), appearance = list(default = "lhs", rhs = "TYPE_COMPLETE"), control = list(verbose = FALSE))
inspect(head(sort(rules_complete, by = "lift", decreasing = TRUE), 15))
## lhs rhs support confidence coverage lift count
## [1] {SUBTYPE_ESSENTIAL_ONLY} => {TYPE_COMPLETE} 0.02043133 1 0.02043133 4.175355 36
## [2] {SUBTYPE_SPECIFIC_COUNTRY} => {TYPE_COMPLETE} 0.04483541 1 0.04483541 4.175355 79
## [3] {SUBTYPE_CITIZEN_EXCEP} => {TYPE_COMPLETE} 0.10045403 1 0.10045403 4.175355 177
## [4] {SUBTYPE_WORK_EXCEP} => {TYPE_COMPLETE} 0.07377980 1 0.07377980 4.175355 130
## [5] {WORK_EXCEP} => {TYPE_COMPLETE} 0.07377980 1 0.07377980 4.175355 130
## [6] {COUNTRY_EXCEP} => {TYPE_COMPLETE} 0.07775255 1 0.07775255 4.175355 137
## [7] {CITIZEN_EXCEP} => {TYPE_COMPLETE} 0.21112372 1 0.21112372 4.175355 372
## [8] {COUNTRY_EXCEP,
## SUBTYPE_SPECIFIC_COUNTRY} => {TYPE_COMPLETE} 0.04483541 1 0.04483541 4.175355 79
## [9] {CITIZEN_EXCEP,
## SUBTYPE_SPECIFIC_COUNTRY} => {TYPE_COMPLETE} 0.04199773 1 0.04199773 4.175355 74
## [10] {CITIZEN_EXCEP,
## SUBTYPE_CITIZEN_EXCEP} => {TYPE_COMPLETE} 0.10045403 1 0.10045403 4.175355 177
## [11] {SUBTYPE_WORK_EXCEP,
## WORK_EXCEP} => {TYPE_COMPLETE} 0.07377980 1 0.07377980 4.175355 130
## [12] {COUNTRY_EXCEP,
## SUBTYPE_WORK_EXCEP} => {TYPE_COMPLETE} 0.03291714 1 0.03291714 4.175355 58
## [13] {CITIZEN_EXCEP,
## SUBTYPE_WORK_EXCEP} => {TYPE_COMPLETE} 0.06867196 1 0.06867196 4.175355 121
## [14] {COUNTRY_EXCEP,
## WORK_EXCEP} => {TYPE_COMPLETE} 0.03291714 1 0.03291714 4.175355 58
## [15] {CITIZEN_EXCEP,
## WORK_EXCEP} => {TYPE_COMPLETE} 0.06867196 1 0.06867196 4.175355 121
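Because TYPE_COMPLETE is fixed as the right-hand side, these rules show which measures imply a complete border closure. Every rule has confidence 1: in this dataset the exception items (CITIZEN_EXCEP, COUNTRY_EXCEP, WORK_EXCEP) and their subtypes occur only in complete closures. The shared lift of 4.175 is simply 1 divided by the support of TYPE_COMPLETE (0.2395). Read the other way round, the rules describe the “standard policy package” for maximum restriction: a complete closure combined with structured exceptions.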
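The constant lift in this table follows directly from the definitions and can be checked by hand. The counts below are taken from the outputs in this report (422 TYPE_COMPLETE events, 372 CITIZEN_EXCEP events, all of which are complete closures); the variable names are ours.

```r
n              <- 1762
count_complete <- 422   # TYPE_COMPLETE events
count_cit_exc  <- 372   # CITIZEN_EXCEP events, all within complete closures

# Rule {CITIZEN_EXCEP} -> {TYPE_COMPLETE}: confidence 1, so lift = 1 / P(RHS).
conf_forward <- count_cit_exc / count_cit_exc
lift_forward <- conf_forward / (count_complete / n)

# Reverse direction {TYPE_COMPLETE} -> {CITIZEN_EXCEP}.
conf_reverse <- count_cit_exc / count_complete

print(c(lift_forward = lift_forward, conf_reverse = conf_reverse))
```

The reverse confidence (about 0.88) is the more informative figure for describing complete closures: roughly 88% of them carry a citizen exception.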
We visualize the rule structure with arulesViz (top rules by lift).
top_rules <- head(rules_lift, 100)
plot(top_rules, measure = c("support","lift"), shading = "confidence")
## To reduce overplotting, jitter is added! Use jitter = 0 to prevent jitter.
plot(top_rules, method = "matrix", measure = "lift")
## Itemsets in Antecedent (LHS)
## [1] "{VISA_BAN}" "{SUBTYPE_VISA_BAN}"
## [3] "{SUBTYPE_WORK_EXCEP}" "{WORK_EXCEP}"
## [5] "{SUBTYPE_SPECIFIC_COUNTRY}" "{AIR,LAND}"
## [7] "{SUBTYPE_CITIZENSHIP_BAN}" "{CITIZEN}"
## [9] "{COUNTRY_EXCEP,SUBTYPE_WORK_EXCEP}" "{COUNTRY_EXCEP,WORK_EXCEP}"
## [11] "{SUBTYPE_CITIZEN_EXCEP}" "{COUNTRY_EXCEP}"
## [13] "{SUBTYPE_ESSENTIAL_ONLY}" "{CITIZEN_EXCEP}"
## [15] "{TYPE_COMPLETE}" "{SUBTYPE_HISTORY_BAN}"
## [17] "{HISTORY_BAN}" "{SEA}"
## [19] "{LAND}" "{AIR}"
## [21] "{SUBTYPE_BORDER_CLOSURE}" "{TYPE_PARTIAL}"
## Itemsets in Consequent (RHS)
## [1] "{TYPE_PARTIAL}" "{SUBTYPE_BORDER_CLOSURE}"
## [3] "{AIR}" "{TYPE_COMPLETE}"
## [5] "{CITIZEN_EXCEP}" "{SEA}"
## [7] "{SUBTYPE_HISTORY_BAN}" "{HISTORY_BAN}"
## [9] "{SUBTYPE_CITIZENSHIP_BAN}" "{CITIZEN}"
## [11] "{COUNTRY_EXCEP}" "{SUBTYPE_WORK_EXCEP}"
## [13] "{WORK_EXCEP}" "{VISA_BAN}"
## [15] "{SUBTYPE_VISA_BAN}"
plot(top_rules, method = "grouped")
# the graph method in this arulesViz version has no `type` control parameter,
# so the call relies on the default controls
plot(head(rules_lift, 30), method = "graph")
Visualizations reveal:
- clusters of strongly connected policy measures,
- dominance of closure + exception structures,
- institutional consistency in policy design.
ECLAT finds frequent itemsets directly. Then we can induce rules from those itemsets. This provides a second algorithmic perspective consistent with the course material.
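The core idea behind ECLAT is a vertical data layout: each item keeps the list of transaction ids in which it occurs, and the support of an itemset is obtained by intersecting those lists. The base R sketch below illustrates only that step on invented toy transactions; it does not reproduce the search strategy or the optimizations of arules::eclat().

```r
# Toy transactions (illustrative only).
baskets <- list(
  t1 = c("AIR", "TYPE_PARTIAL"),
  t2 = c("AIR", "LAND", "TYPE_PARTIAL"),
  t3 = c("CITIZEN_EXCEP", "TYPE_COMPLETE"),
  t4 = c("AIR", "LAND", "SEA", "TYPE_PARTIAL")
)

# Vertical layout: for each item, the ids of the transactions containing it.
all_items <- unlist(baskets, use.names = FALSE)
all_tids  <- rep(names(baskets), lengths(baskets))
tidlists  <- split(all_tids, all_items)

# Support of an itemset = size of the intersection of its items' tid-lists.
itemset_support <- function(items) {
  tids <- Reduce(intersect, tidlists[items])
  length(tids) / length(baskets)
}

print(itemset_support(c("AIR", "TYPE_PARTIAL")))
print(itemset_support(c("AIR", "LAND")))
```

Because supports come from intersections rather than repeated passes over the transactions, the frequent itemsets found are the same as with Apriori under the same threshold; only the way they are counted differs.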
freq_sets <- eclat(cobap_trans, parameter = list(supp = 0.02, maxlen = 5))
## Eclat
##
## parameter specification:
## tidLists support minlen maxlen target ext
## FALSE 0.02 1 5 frequent itemsets TRUE
##
## algorithmic control:
## sparse sort verbose
## 7 -2 TRUE
##
## Absolute minimum support count: 35
##
## create itemset ...
## set transactions ...[22 item(s), 1762 transaction(s)] done [0.00s].
## sorting and recoding items ... [19 item(s)] done [0.00s].
## creating bit matrix ... [19 row(s), 1762 column(s)] done [0.00s].
## writing ... [94 set(s)] done [0.00s].
## Creating S4 object ... done [0.00s].
summary(freq_sets)
## set of 94 itemsets
##
## most frequent items:
## TYPE_PARTIAL TYPE_COMPLETE CITIZEN_EXCEP COUNTRY_EXCEP AIR
## 25 23 22 20 16
## (Other)
## 118
##
## element (itemset/transaction) length distribution:sizes
## 1 2 3 4 5
## 19 35 27 11 2
##
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 1.000 2.000 2.000 2.383 3.000 5.000
##
## summary of quality measures:
## support count
## Min. :0.02043 Min. : 36.00
## 1st Qu.:0.04271 1st Qu.: 75.25
## Median :0.07378 Median : 130.00
## Mean :0.10876 Mean : 191.63
## 3rd Qu.:0.12670 3rd Qu.: 223.25
## Max. :0.75653 Max. :1333.00
##
## includes transaction ID lists: FALSE
##
## mining info:
## data ntransactions support
## cobap_trans 1762 0.02
## call
## eclat(data = cobap_trans, parameter = list(supp = 0.02, maxlen = 5))
inspect(head(sort(freq_sets, by = "support", decreasing = TRUE), 15))
## items support count
## [1] {TYPE_PARTIAL} 0.7565267 1333
## [2] {SUBTYPE_BORDER_CLOSURE, TYPE_PARTIAL} 0.4699205 828
## [3] {SUBTYPE_BORDER_CLOSURE} 0.4699205 828
## [4] {AIR, SUBTYPE_BORDER_CLOSURE, TYPE_PARTIAL} 0.3910329 689
## [5] {AIR, TYPE_PARTIAL} 0.3910329 689
## [6] {AIR, SUBTYPE_BORDER_CLOSURE} 0.3910329 689
## [7] {AIR} 0.3910329 689
## [8] {TYPE_COMPLETE} 0.2395006 422
## [9] {CITIZEN_EXCEP, TYPE_COMPLETE} 0.2111237 372
## [10] {CITIZEN_EXCEP} 0.2111237 372
## [11] {HISTORY_BAN, TYPE_PARTIAL} 0.1532350 270
## [12] {HISTORY_BAN} 0.1532350 270
## [13] {LAND, SUBTYPE_BORDER_CLOSURE, TYPE_PARTIAL} 0.1424518 251
## [14] {LAND, TYPE_PARTIAL} 0.1424518 251
## [15] {LAND, SUBTYPE_BORDER_CLOSURE} 0.1424518 251
eclat_rules <- ruleInduction(freq_sets, cobap_trans, confidence = 0.60)
summary(eclat_rules)
## set of 128 rules
##
## rule length distribution (lhs + rhs):sizes
## 2 3 4 5
## 38 52 30 8
##
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 2.000 2.000 3.000 3.062 4.000 5.000
##
## summary of quality measures:
## support confidence lift itemset
## Min. :0.02043 Min. :0.6212 Min. : 1.322 Min. : 1.00
## 1st Qu.:0.04200 1st Qu.:0.9655 1st Qu.: 2.128 1st Qu.:18.00
## Median :0.06867 Median :1.0000 Median : 4.175 Median :37.00
## Mean :0.09434 Mean :0.9679 Mean : 5.743 Mean :36.72
## 3rd Qu.:0.11862 3rd Qu.:1.0000 3rd Qu.: 5.544 3rd Qu.:54.25
## Max. :0.46992 Max. :1.0000 Max. :27.968 Max. :75.00
##
## mining info:
## data ntransactions support
## cobap_trans 1762 0.02
## call
## eclat(data = cobap_trans, parameter = list(supp = 0.02, maxlen = 5))
## confidence
## 0.6
inspect(head(sort(eclat_rules, by = "lift", decreasing = TRUE), 15))
## lhs rhs support confidence lift itemset
## [1] {TYPE_PARTIAL,
## VISA_BAN} => {SUBTYPE_VISA_BAN} 0.03575482 1 27.96825 2
## [2] {SUBTYPE_VISA_BAN,
## TYPE_PARTIAL} => {VISA_BAN} 0.03575482 1 27.96825 2
## [3] {VISA_BAN} => {SUBTYPE_VISA_BAN} 0.03575482 1 27.96825 4
## [4] {SUBTYPE_VISA_BAN} => {VISA_BAN} 0.03575482 1 27.96825 4
## [5] {CITIZEN_EXCEP,
## COUNTRY_EXCEP,
## TYPE_COMPLETE,
## WORK_EXCEP} => {SUBTYPE_WORK_EXCEP} 0.03178207 1 13.55385 16
## [6] {CITIZEN_EXCEP,
## COUNTRY_EXCEP,
## SUBTYPE_WORK_EXCEP,
## TYPE_COMPLETE} => {WORK_EXCEP} 0.03178207 1 13.55385 16
## [7] {COUNTRY_EXCEP,
## TYPE_COMPLETE,
## WORK_EXCEP} => {SUBTYPE_WORK_EXCEP} 0.03291714 1 13.55385 17
## [8] {COUNTRY_EXCEP,
## SUBTYPE_WORK_EXCEP,
## TYPE_COMPLETE} => {WORK_EXCEP} 0.03291714 1 13.55385 17
## [9] {CITIZEN_EXCEP,
## COUNTRY_EXCEP,
## WORK_EXCEP} => {SUBTYPE_WORK_EXCEP} 0.03178207 1 13.55385 18
## [10] {CITIZEN_EXCEP,
## COUNTRY_EXCEP,
## SUBTYPE_WORK_EXCEP} => {WORK_EXCEP} 0.03178207 1 13.55385 18
## [11] {CITIZEN_EXCEP,
## TYPE_COMPLETE,
## WORK_EXCEP} => {SUBTYPE_WORK_EXCEP} 0.06867196 1 13.55385 19
## [12] {CITIZEN_EXCEP,
## SUBTYPE_WORK_EXCEP,
## TYPE_COMPLETE} => {WORK_EXCEP} 0.06867196 1 13.55385 19
## [13] {TYPE_COMPLETE,
## WORK_EXCEP} => {SUBTYPE_WORK_EXCEP} 0.07377980 1 13.55385 20
## [14] {SUBTYPE_WORK_EXCEP,
## TYPE_COMPLETE} => {WORK_EXCEP} 0.07377980 1 13.55385 20
## [15] {CITIZEN_EXCEP,
## WORK_EXCEP} => {SUBTYPE_WORK_EXCEP} 0.06867196 1 13.55385 21
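Rules induced from the ECLAT itemsets match the Apriori results: both runs yield 128 rules with the same length distribution and identical support, confidence, and lift summaries. Since both algorithms search the same itemsets exhaustively under the same thresholds, this agreement is a consistency check on the pipeline rather than independent evidence of robustness.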
From the strongest rules (by lift and confidence), we observe that border closures were typically designed as packages: closure measures (air/land/sea), plus restriction types (visa/history/citizen), plus controlled exception structures (citizens, workers, specific countries).
Example interpretation template:
- Rule: {A, B} → {C}
- Support: x%
- Confidence: y%
- Lift: z
- Meaning: when A and B are present, C appears with high probability; lift > 1 implies the association is stronger than chance.
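The template can be filled in mechanically for any rule. A minimal sketch follows; `describe_rule` is a hypothetical helper, and the numbers passed to it are those reported earlier for {AIR, LAND} → {SEA}.

```r
# Turn one rule's quality measures into a readable one-line summary.
describe_rule <- function(lhs, rhs, support, confidence, lift) {
  sprintf(
    "{%s} -> {%s}: support %.1f%%, confidence %.1f%%, lift %.2f",
    paste(lhs, collapse = ", "), rhs,
    100 * support, 100 * confidence, lift
  )
}

msg <- describe_rule(c("AIR", "LAND"), "SEA", 0.04597049, 0.675, 5.216447)
print(msg)
```

Read in policy terms: when a policy restricts both air and land travel, it also restricts sea travel about two times in three, roughly five times the baseline rate for SEA.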
Association rules reveal co-occurrence patterns, not causality. They help identify which policy measures tend to be implemented together, but not the political or epidemiological reasons behind them. Additionally, results may reflect reporting differences and variation in how policies are coded across countries.
This project demonstrates how association rule learning can uncover structured patterns in real-world governance data. Using COBAP policy events as transactions, Apriori and ECLAT identify frequent border policy bundles and strong associations, supporting the conclusion that COVID border restrictions were deployed as coordinated policy packages rather than independent actions.