This report presents multiple modeling approaches for the assistive technology indicators dataset from Kenya. It covers data cleaning, exploratory analysis, and diverse modeling techniques ranging from classification to clustering.
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.1 ✔ stringr 1.6.0
## ✔ ggplot2 4.0.0 ✔ tibble 3.3.0
## ✔ lubridate 1.9.4 ✔ tidyr 1.3.1
## ✔ purrr 1.2.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
data <- read.csv("assistive_technology_indicators_ken.csv", skip = 1, stringsAsFactors = FALSE)
data <- data[-1, ] # Drops the first row unconditionally; confirm it is an annotation row, not a data record
str(data)
## 'data.frame': 73 obs. of 17 variables:
## $ X.indicator.code : chr "ASSISTIVETECH_Q09" "ASSISTIVETECH_Q05" "ASSISTIVETECH_Q09" "ASSISTIVETECH_Q10" ...
## $ X.indicator.name : chr "Existence of regulations/standards/guidelines/protocols on assistive technology" "Availability of government/registered services for assistive technology" "Existence of regulations/standards/guidelines/protocols on assistive technology" "Existence of initiatives/collaborations on assistive technology" ...
## $ X.indicator.url : chr "https://www.who.int/data/gho/data/indicators/indicator-details/GHO/existence-of-national-regulations-standards-"| __truncated__ "https://www.who.int/data/gho/data/indicators/indicator-details/GHO/availability-of-government-or-registered-ser"| __truncated__ "https://www.who.int/data/gho/data/indicators/indicator-details/GHO/existence-of-national-regulations-standards-"| __truncated__ "https://www.who.int/data/gho/data/indicators/indicator-details/GHO/promotion-facilitation-support-or-investment"| __truncated__ ...
## $ X.date.year : int 2021 2021 2021 2021 2021 2021 2021 2021 2021 2021 ...
## $ X.date.year.start : int 2021 2021 2021 2021 2021 2021 2021 2021 2021 2021 ...
## $ X.date.year.end : int 2021 2021 2021 2021 2021 2021 2021 2021 2021 2021 ...
## $ X.region.code : chr "AFR" "AFR" "AFR" "AFR" ...
## $ X.region.name : chr "Africa" "Africa" "Africa" "Africa" ...
## $ X.country.code : chr "KEN" "KEN" "KEN" "KEN" ...
## $ X.country.name : chr "Kenya" "Kenya" "Kenya" "Kenya" ...
## $ X.dimension.type : chr "ASSISTIVETECHSUBQUESTION" "ASSISTIVETECHSUBQUESTION" "ASSISTIVETECHSUBQUESTION" "ASSISTIVETECHSUBQUESTION" ...
## $ X.dimension.code : chr "ASSISTIVETECHSUBQUESTION_ASSISTIVETECH_Q09_07" "ASSISTIVETECHSUBQUESTION_ASSISTIVETECH_Q05_06" "ASSISTIVETECHSUBQUESTION_ASSISTIVETECH_Q09_04" "ASSISTIVETECHSUBQUESTION_ASSISTIVETECH_Q10_02" ...
## $ X.dimension.name : chr "Inclusion of barrier-free/accessible environments in emergency preparedness and response programmes" "Services related to Vision" "Qualifications of assistive products providers" "Product development" ...
## $ X.indicator.value.num : logi NA NA NA NA NA NA ...
## $ X.indicator.value : chr "Yes" "No coverage" "Yes" "Yes" ...
## $ X.indicator.value.low : logi NA NA NA NA NA NA ...
## $ X.indicator.value.high: logi NA NA NA NA NA NA ...
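The `X.` prefix on every column name is consistent with `read.csv()` sanitising header cells that begin with `#`. The sketch below assumes the header row kept after `skip = 1` holds HXL-style tags such as `#indicator+code`; this is an assumption, since the raw file is not shown in this report.

```r
# Assumed header cells; the actual CSV contents are not shown in this report.
hxl_tags <- c("#indicator+code", "#date+year+start", "#indicator+value+num")

# read.csv() passes header cells through make.names() when check.names = TRUE
# (the default): "X" is prepended to names with an invalid first character,
# and each invalid character is replaced by ".".
sanitized <- make.names(hxl_tags)
print(sanitized)
```

If the assumption holds, passing `check.names = FALSE` to `read.csv()` would keep the original tags, at the cost of needing backticks to reference the columns.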
summary(data)
## X.indicator.code X.indicator.name X.indicator.url X.date.year
## Length:73 Length:73 Length:73 Min. :2021
## Class :character Class :character Class :character 1st Qu.:2021
## Mode :character Mode :character Mode :character Median :2021
## Mean :2021
## 3rd Qu.:2021
## Max. :2021
## X.date.year.start X.date.year.end X.region.code X.region.name
## Min. :2021 Min. :2021 Length:73 Length:73
## 1st Qu.:2021 1st Qu.:2021 Class :character Class :character
## Median :2021 Median :2021 Mode :character Mode :character
## Mean :2021 Mean :2021
## 3rd Qu.:2021 3rd Qu.:2021
## Max. :2021 Max. :2021
## X.country.code X.country.name X.dimension.type X.dimension.code
## Length:73 Length:73 Length:73 Length:73
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
## X.dimension.name X.indicator.value.num X.indicator.value
## Length:73 Mode:logical Length:73
## Class :character NA's:73 Class :character
## Mode :character Mode :character
##
##
##
## X.indicator.value.low X.indicator.value.high
## Mode:logical Mode:logical
## NA's:73 NA's:73
##
##
##
##
head(data, 5)
## X.indicator.code
## 2 ASSISTIVETECH_Q09
## 3 ASSISTIVETECH_Q05
## 4 ASSISTIVETECH_Q09
## 5 ASSISTIVETECH_Q10
## 6 ASSISTIVETECH_Q08
## X.indicator.name
## 2 Existence of regulations/standards/guidelines/protocols on assistive technology
## 3 Availability of government/registered services for assistive technology
## 4 Existence of regulations/standards/guidelines/protocols on assistive technology
## 5 Existence of initiatives/collaborations on assistive technology
## 6 Existence of financial mechanisms for assistive technology
## X.indicator.url
## 2 https://www.who.int/data/gho/data/indicators/indicator-details/GHO/existence-of-national-regulations-standards-guidelines-or-protocols-on-assistive-technology
## 3 https://www.who.int/data/gho/data/indicators/indicator-details/GHO/availability-of-government-or-registered-services-providing-assistive-technology
## 4 https://www.who.int/data/gho/data/indicators/indicator-details/GHO/existence-of-national-regulations-standards-guidelines-or-protocols-on-assistive-technology
## 5 https://www.who.int/data/gho/data/indicators/indicator-details/GHO/promotion-facilitation-support-or-investment-in-assistive-technology-related-areas-at-national-level
## 6 https://www.who.int/data/gho/data/indicators/indicator-details/GHO/measures-in-place-at-national-level-to-fully-or-partly-cover-users-costs-for-accessing-assistive-technology
## X.date.year X.date.year.start X.date.year.end X.region.code X.region.name
## 2 2021 2021 2021 AFR Africa
## 3 2021 2021 2021 AFR Africa
## 4 2021 2021 2021 AFR Africa
## 5 2021 2021 2021 AFR Africa
## 6 2021 2021 2021 AFR Africa
## X.country.code X.country.name X.dimension.type
## 2 KEN Kenya ASSISTIVETECHSUBQUESTION
## 3 KEN Kenya ASSISTIVETECHSUBQUESTION
## 4 KEN Kenya ASSISTIVETECHSUBQUESTION
## 5 KEN Kenya ASSISTIVETECHSUBQUESTION
## 6 KEN Kenya ASSISTIVETECHSUBQUESTION
## X.dimension.code
## 2 ASSISTIVETECHSUBQUESTION_ASSISTIVETECH_Q09_07
## 3 ASSISTIVETECHSUBQUESTION_ASSISTIVETECH_Q05_06
## 4 ASSISTIVETECHSUBQUESTION_ASSISTIVETECH_Q09_04
## 5 ASSISTIVETECHSUBQUESTION_ASSISTIVETECH_Q10_02
## 6 ASSISTIVETECHSUBQUESTION_ASSISTIVETECH_SUMMARY
## X.dimension.name
## 2 Inclusion of barrier-free/accessible environments in emergency preparedness and response programmes
## 3 Services related to Vision
## 4 Qualifications of assistive products providers
## 5 Product development
## 6 Summary
## X.indicator.value.num X.indicator.value X.indicator.value.low
## 2 NA Yes NA
## 3 NA No coverage NA
## 4 NA Yes NA
## 5 NA Yes NA
## 6 NA Yes NA
## X.indicator.value.high
## 2 NA
## 3 NA
## 4 NA
## 5 NA
## 6 NA
print(names(data))
## [1] "X.indicator.code" "X.indicator.name" "X.indicator.url"
## [4] "X.date.year" "X.date.year.start" "X.date.year.end"
## [7] "X.region.code" "X.region.name" "X.country.code"
## [10] "X.country.name" "X.dimension.type" "X.dimension.code"
## [13] "X.dimension.name" "X.indicator.value.num" "X.indicator.value"
## [16] "X.indicator.value.low" "X.indicator.value.high"
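The code that follows works on an object called `clean_data`, whose construction is not shown in this report. A minimal sketch of one way it could be derived is given here; the column mapping is an assumption inferred from the four names that `names(clean_data)` prints later (`indicator_code`, `indicator_name`, `dimension`, `value`), and `raw` and `clean_data_sketch` are hypothetical names.

```r
# Toy stand-in for `data`; the two rows are copied from the head(data) output.
raw <- data.frame(
  X.indicator.code  = c("ASSISTIVETECH_Q09", "ASSISTIVETECH_Q05"),
  X.indicator.name  = c(
    "Existence of regulations/standards/guidelines/protocols on assistive technology",
    "Availability of government/registered services for assistive technology"
  ),
  X.dimension.name  = c("Qualifications of assistive products providers",
                        "Services related to Vision"),
  X.indicator.value = c("Yes", "No coverage"),
  stringsAsFactors = FALSE
)

# Hypothetical mapping from the raw columns to the four analysis columns.
clean_data_sketch <- data.frame(
  indicator_code = raw$X.indicator.code,
  indicator_name = raw$X.indicator.name,
  dimension      = raw$X.dimension.name,
  value          = trimws(raw$X.indicator.value),
  stringsAsFactors = FALSE
)

# Drop rows without a recorded response.
keep <- !is.na(clean_data_sketch$value) & clean_data_sketch$value != ""
clean_data_sketch <- clean_data_sketch[keep, ]
print(names(clean_data_sketch))
```

The all-`NA` columns (`X.indicator.value.num`, `.low`, `.high`) and the constant year, region, and country columns carry no information for a single-country, single-year extract, which is presumably why they are absent from `clean_data`.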
library(ggplot2)
ggplot(clean_data, aes(x = indicator_code, fill = value)) +
  geom_bar(position = "stack") +
  theme_minimal() +
  labs(title = "Values by Indicator Code", x = "Indicator Code", y = "Count") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
prop_tab <- clean_data %>%
  count(value) %>%
  mutate(proportion = n / sum(n))
ggplot(prop_tab, aes(x = value, y = proportion, fill = value)) +
  geom_col() +
  theme_minimal() +
  labs(title = "Proportion of Response Values", y = "Proportion")
library(nnet)
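multi_data <- clean_data %>%
  mutate(outcome = factor(value),
         indicator_group = factor(indicator_code))
# Fit only when there are enough rows and more than two outcome classes
if (nrow(multi_data) > 15 && length(unique(multi_data$outcome)) > 2) {
  multinom_model <- multinom(outcome ~ indicator_group, data = multi_data, trace = FALSE)
  # Evaluated but not auto-printed inside if {}; wrap in print() to display it
  summary(multinom_model)
  multi_pred <- predict(multinom_model, multi_data)
  # In-sample accuracy: the model is scored on the same rows it was fitted to
  mean(multi_pred == multi_data$outcome)
}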
## Warning in sqrt(diag(vc)): NaNs produced
## [1] 0.890411
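The 0.890411 figure is easier to interpret next to two simple baselines. The sketch below is arithmetic on the counts reported in the summary table at the end of this report, not a re-run of the model; the vector names are hypothetical.

```r
# Per-indicator counts copied from the summary table (Q01 to Q10).
n_total <- c(Q01 = 8, Q02 = 7, Q03 = 8, Q04 = 7, Q05 = 7,
             Q06 = 7, Q07 = 6, Q08 = 6, Q09 = 8, Q10 = 9)
yes     <- c(6, 6, 7, 5, 0, 6, 5, 6, 8, 9)
nocover <- c(0, 0, 0, 0, 7, 0, 0, 0, 0, 0)
not_av  <- c(2, 0, 1, 2, 0, 0, 0, 0, 0, 0)

# Baseline 1: always predict the overall majority class ("Yes").
majority_acc <- sum(yes) / sum(n_total)

# Baseline 2: predict each indicator's most frequent response.
# Rows with any other response value (one each in Q02, Q06, Q07) never form
# the largest category, so they do not change the per-indicator maximum.
modal_count   <- pmax(yes, nocover, not_av)
groupwise_acc <- sum(modal_count) / sum(n_total)

print(round(c(majority = majority_acc, groupwise = groupwise_acc), 6))
```

The first baseline is 58/73 (about 0.795). The second is 65/73, which rounds to the same 0.890411 reported above. With indicator code as the only predictor, the reported accuracy is therefore what a per-indicator modal-class rule achieves on these counts, and it is measured in-sample.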
library(rpart.plot)
## Loading required package: rpart
tree_data <- clean_data %>%
  group_by(indicator_code) %>%
  summarise(
    yes = sum(value == "Yes"),
    nocover = sum(value == "No coverage"),
    na = sum(value == "Information not available"),
    total = n(),
    yes_prop = yes / total
  ) %>%
  mutate(primary_outcome = case_when(
    yes_prop > 0.6 ~ "Mostly Yes",
    yes_prop < 0.4 ~ "Mostly No",
    TRUE ~ "Mixed"
  ))
tree_model <- rpart(primary_outcome ~ yes + nocover + na + yes_prop,
                    data = tree_data, method = "class")
rpart.plot(tree_model, main = "Decision Tree for Indicator Outcomes")
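Two properties of this setup are worth checking before reading the plot. First, `primary_outcome` is computed directly from `yes_prop`, which is also a predictor, so any split the tree finds only restates the labelling rule. Second, with one row per indicator, `rpart()`'s default `minsplit = 20` means no split is attempted at all. The sketch below illustrates the second point on toy data whose `yes_prop` values are taken from the `yes_pct` column of the summary table; `toy`, `default_tree`, and `relaxed_tree` are hypothetical names.

```r
library(rpart)

# Toy data: yes_prop taken from the summary table, labels from the 0.6 / 0.4 rule.
toy <- data.frame(
  yes_prop = c(1, 1, 1, 0.875, 0.8571, 0.8571, 0.8333, 0.75, 0.7143, 0)
)
toy$primary_outcome <- factor(ifelse(toy$yes_prop > 0.6, "Mostly Yes",
                              ifelse(toy$yes_prop < 0.4, "Mostly No", "Mixed")))

# Default control (minsplit = 20): a 10-row node is never split.
# xval = 0 skips cross-validation, which is not meaningful on 10 rows.
default_tree <- rpart(primary_outcome ~ yes_prop, data = toy, method = "class",
                      control = rpart.control(xval = 0))

# Relaxed control: allow splits on very small nodes.
relaxed_tree <- rpart(primary_outcome ~ yes_prop, data = toy, method = "class",
                      control = rpart.control(minsplit = 2, minbucket = 1, xval = 0))

# Number of nodes in each tree (1 means root only).
print(c(default = nrow(default_tree$frame), relaxed = nrow(relaxed_tree$frame)))
```

Relaxing `minsplit` makes the tree split, but on this data the split is simply a threshold on `yes_prop`, so it should be read as a description of the labelling rule and not as a predictive finding.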
library(randomForest)
## randomForest 4.7-1.2
## Type rfNews() to see new features/changes/bug fixes.
##
## Attaching package: 'randomForest'
## The following object is masked from 'package:dplyr':
##
## combine
## The following object is masked from 'package:ggplot2':
##
## margin
set.seed(123)
rf_model <- randomForest(as.factor(primary_outcome) ~ yes + nocover + na + yes_prop,
                         data = tree_data)
rf_model
##
## Call:
## randomForest(formula = as.factor(primary_outcome) ~ yes + nocover + na + yes_prop, data = tree_data)
## Type of random forest: classification
## Number of trees: 500
## No. of variables tried at each split: 2
##
## OOB estimate of error rate: 0%
## Confusion matrix:
## Mostly No Mostly Yes class.error
## Mostly No 0 0 NaN
## Mostly Yes 0 9 0
varImpPlot(rf_model, main = "Variable Importance")
library(dplyr)
library(tidyr)
print(names(clean_data)) # Check your column names!
## [1] "indicator_code" "indicator_name" "dimension" "value"
# Encode responses numerically; any value not listed explicitly falls through to 0
cluster_matrix <- clean_data %>%
  dplyr::mutate(value_numeric = case_when(
    value == "Yes" ~ 1,
    value == "Total coverage" ~ 1,
    value == "No coverage" ~ 0,
    value == "Information not available" ~ 0.5,
    TRUE ~ 0
  )) %>%
  dplyr::select(indicator_code, dimension, value_numeric) %>%
  # One row per indicator, one column per dimension; missing combinations are filled with 0
  tidyr::pivot_wider(names_from = dimension, values_from = value_numeric, values_fill = 0)
cluster_features <- as.matrix(cluster_matrix %>% dplyr::select(-indicator_code))
set.seed(123)
kmeans_model <- kmeans(cluster_features, centers = min(3, nrow(cluster_features) - 1), nstart = 25)
print(kmeans_model)
## K-means clustering with 3 clusters of sizes 1, 1, 8
##
## Cluster means:
## Inclusion of barrier-free/accessible environments in emergency preparedness and response programmes
## 1 0
## 2 1
## 3 0
## Services related to Vision Qualifications of assistive products providers
## 1 0 0
## 2 0 1
## 3 0 0
## Product development Summary People with difficulties in Cognition
## 1 1 1.000 0.000
## 2 0 1.000 0.000
## 3 0 0.875 0.125
## Voluntary private insurance schemes Human resources for Hearing
## 1 0.000 0.000
## 2 0.000 0.000
## 3 0.125 0.125
## Services related to Hearing Compulsory private insurance schemes
## 1 0 0.000
## 2 0 0.000
## 3 0 0.125
## Human resources for Self care A separate legislation on assistive technology
## 1 0.000 0.000
## 2 0.000 0.000
## 3 0.125 0.125
## Defense (or similar) Service delivery capacity
## 1 0.0000 1
## 2 0.0000 0
## 3 0.0625 0
## Legislation on labour (or similar)
## 1 0.000
## 2 0.000
## 3 0.125
## Legislation on social services (or similar)
## 1 0.000
## 2 0.000
## 3 0.125
## In a separate budget for assistive technology
## 1 0.000
## 2 0.000
## 3 0.125
## Information to users and their families People with difficulties in Mobility
## 1 1 0.000
## 2 0 0.000
## 3 0 0.125
## People with difficulties in Communication Education (or similar)
## 1 0.000 0.000
## 2 0.000 0.000
## 3 0.125 0.125
## In the budget for labour (or similar) Public (insurance) schemes
## 1 0.000 0.000
## 2 0.000 0.000
## 3 0.125 0.125
## Training related to Cognition People with difficulties in Hearing
## 1 0.000 0.000
## 2 0.000 0.000
## 3 0.125 0.125
## Training related to Mobility Safety of assistive products
## 1 0.000 0
## 2 0.000 1
## 3 0.125 0
## Inclusion of assistive products in emergency preparedness and response programmes
## 1 0
## 2 1
## 3 0
## In other budgets Collection of data on population-based needs for products
## 1 0.0000 1
## 2 0.0000 0
## 3 0.0625 0
## Participation of users in planning and monitoring services
## 1 1
## 2 0
## 3 0
## Human resources for Vision Human resources for Cognition
## 1 0.000 0.000
## 2 0.000 0.000
## 3 0.125 0.125
## International collaboration on manufacturing, procurement or supply of products
## 1 1
## 2 0
## 3 0
## Human resources for Communication
## 1 0.000
## 2 0.000
## 3 0.125
## A list of safe and effective assistive products that are subsidized or provided free to people who are eligible
## 1 0.000
## 2 0.000
## 3 0.125
## In the budget for defense (or similar) Other measures
## 1 0.000 0.000
## 2 0.000 0.000
## 3 0.125 0.125
## Services related to Communication Training related to Communication
## 1 0 0.000
## 2 0 0.000
## 3 0 0.125
## Delivery of assistive technology services
## 1 0
## 2 1
## 3 0
## Barrier-free/accessible environments Training related to Hearing
## 1 0 0.000
## 2 1 0.000
## 3 0 0.125
## Services related to Self care In the budget for health services (or similar)
## 1 0 0.000
## 2 0 0.000
## 3 0 0.125
## Health (or similar) Product procurement Other legislations
## 1 0.000 1 0.0000
## 2 0.000 0 0.0000
## 3 0.125 0 0.0625
## In the budget for education (or similar)
## 1 0.000
## 2 0.000
## 3 0.125
## In the budget for social services (or similar)
## 1 0.000
## 2 0.000
## 3 0.125
## People with difficulties in Self-care Procurement of assistive products
## 1 0.000 0
## 2 0.000 1
## 3 0.125 0
## Product affordability Legislation on defense (or similar) Labour (or similar)
## 1 1 0.0000 0.000
## 2 0 0.0000 0.000
## 3 0 0.0625 0.125
## Others Services related to Mobility
## 1 0.0000 0
## 2 0.0000 0
## 3 0.0625 0
## Legislation on health services (or similar) Social (or similar)
## 1 0.000 0.000
## 2 0.000 0.000
## 3 0.125 0.125
## People with difficulties in Vision Legislation on education (or similar)
## 1 0.000 0.000
## 2 0.000 0.000
## 3 0.125 0.125
## Human resources for Mobility Training related to Self-care
## 1 0.000 0.000
## 2 0.000 0.000
## 3 0.125 0.125
## Services related to Cognition
## 1 0
## 2 0
## 3 0
##
## Clustering vector:
## [1] 2 3 1 3 3 3 3 3 3 3
##
## Within cluster sum of squares by cluster:
## [1] 0.00000 0.00000 34.34375
## (between_SS / total_SS = 29.7 %)
##
## Available components:
##
## [1] "cluster" "centers" "totss" "withinss" "tot.withinss"
## [6] "betweenss" "size" "iter" "ifault"
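The cluster sizes (1, 1, 8) and the low between-cluster share (29.7 %) are partly a consequence of how the feature matrix is built: most dimensions belong to a single indicator, and `values_fill = 0` gives every other indicator a 0 in that column, the same code used for "No coverage". A minimal sketch of the effect on toy data follows; the indicator and dimension labels are hypothetical.

```r
library(tidyr)

# Toy long-format data: dim_2 is only asked under indicator Q_A.
toy_long <- data.frame(
  indicator_code = c("Q_A", "Q_A", "Q_B"),
  dimension      = c("dim_1", "dim_2", "dim_1"),
  value_numeric  = c(1, 0, 1)
)

# As in the report: absent indicator/dimension pairs become 0.
filled_zero <- pivot_wider(toy_long, names_from = dimension,
                           values_from = value_numeric, values_fill = 0)

# Alternative: leave absent pairs as NA so they stay distinguishable.
filled_na <- pivot_wider(toy_long, names_from = dimension,
                         values_from = value_numeric)

print(filled_zero)
print(filled_na)
```

In `filled_zero`, Q_A (answered "No coverage" on `dim_2`) and Q_B (never asked about `dim_2`) look identical in that column. `kmeans()` does not accept `NA`, so keeping the distinction would require a different representation, for example clustering on per-indicator response proportions.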
library(FactoMineR)
library(factoextra)
## Welcome! Want to learn more? See two factoextra-related books at https://goo.gl/ve3WBa
# Remove columns of zero variance before PCA
zero_var_cols <- apply(cluster_features, 2, function(x) length(unique(x)) == 1)
# drop = FALSE keeps a matrix even if only one column survives
cluster_features_clean <- cluster_features[, !zero_var_cols, drop = FALSE]
if (ncol(cluster_features_clean) > 1) {
  pca <- prcomp(cluster_features_clean, scale. = TRUE)
  fviz_pca_ind(pca, geom.ind = "point", col.ind = as.factor(kmeans_model$cluster),
               palette = "Set2", addEllipses = TRUE, legend.title = "Cluster")
} else {
  cat("No suitable variable columns for PCA!\n")
}
## Ignoring unknown labels:
## • linetype : "Cluster"
## Too few points to calculate an ellipse
## Too few points to calculate an ellipse
library(MASS)
##
## Attaching package: 'MASS'
## The following object is masked from 'package:dplyr':
##
## select
ordinal_data <- clean_data %>%
  filter(value %in% c("No coverage", "Yes", "Total coverage")) %>%
  mutate(outcome_ordinal = factor(value,
                                  levels = c("No coverage", "Yes", "Total coverage"),
                                  ordered = TRUE))
ordinal_model <- polr(outcome_ordinal ~ indicator_code, data = ordinal_data, Hess = TRUE)
summary(ordinal_model)
## Call:
## polr(formula = outcome_ordinal ~ indicator_code, data = ordinal_data,
## Hess = TRUE)
##
## Coefficients:
## Value Std. Error t value
## indicator_codeASSISTIVETECH_Q02 8.831 7.072e+01 1.249e-01
## indicator_codeASSISTIVETECH_Q03 -3.079 1.058e+01 -2.910e-01
## indicator_codeASSISTIVETECH_Q04 -3.250 6.010e+00 -5.408e-01
## indicator_codeASSISTIVETECH_Q05 -53.132 3.460e-06 -1.536e+07
## indicator_codeASSISTIVETECH_Q06 8.831 7.072e+01 1.249e-01
## indicator_codeASSISTIVETECH_Q07 9.013 7.072e+01 1.274e-01
## indicator_codeASSISTIVETECH_Q08 -3.207 7.695e+00 -4.167e-01
## indicator_codeASSISTIVETECH_Q09 -2.889 1.551e+01 -1.863e-01
## indicator_codeASSISTIVETECH_Q10 -2.654 2.433e+01 -1.091e-01
##
## Intercepts:
## Value Std. Error t value
## No coverage|Yes -2.574170e+01 2.341670e+01 -1.099300e+00
## Yes|Total coverage 1.062280e+01 7.071250e+01 1.502000e-01
##
## Residual Deviance: 16.89037
## AIC: 38.89037
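The coefficient of -53.132 for ASSISTIVETECH_Q05 with a standard error of 3.460e-06, together with several standard errors near 70, is the pattern usually produced when a predictor level has only one observed outcome (complete or quasi-complete separation); the summary table shows Q05 with "No coverage" in all seven rows. A minimal sketch of a pre-fit check on toy data follows; the data and names are hypothetical.

```r
# Toy data: every Q_B row has the same outcome, Q_A is mixed.
toy <- data.frame(
  indicator_code = c(rep("Q_A", 4), rep("Q_B", 3)),
  value = c("Yes", "Yes", "No coverage", "Yes",
            "No coverage", "No coverage", "No coverage"),
  stringsAsFactors = FALSE
)

# Cross-tabulate predictor level by outcome.
tab <- table(toy$indicator_code, toy$value)
print(tab)

# Levels whose rows fall into a single outcome category.
separated <- rownames(tab)[rowSums(tab > 0) == 1]
print(separated)
```

Levels flagged this way cannot be given a finite maximum-likelihood coefficient, so the estimates and t values for them in a `polr()` summary are best treated as uninformative.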
summary_stats <- clean_data %>%
  group_by(indicator_code, indicator_name) %>%
  summarise(
    n = n(),
    yes_count = sum(value == "Yes"),
    nocover_count = sum(value == "No coverage"),
    na_count = sum(value == "Information not available"),
    yes_pct = round(100 * yes_count / n, 2),
    .groups = "drop" # return an ungrouped tibble and silence the grouping message
  ) %>%
  arrange(desc(yes_pct))
knitr::kable(head(summary_stats, 10))
| indicator_code | indicator_name | n | yes_count | nocover_count | na_count | yes_pct |
|---|---|---|---|---|---|---|
| ASSISTIVETECH_Q08 | Existence of financial mechanisms for assistive technology | 6 | 6 | 0 | 0 | 100.00 |
| ASSISTIVETECH_Q09 | Existence of regulations/standards/guidelines/protocols on assistive technology | 8 | 8 | 0 | 0 | 100.00 |
| ASSISTIVETECH_Q10 | Existence of initiatives/collaborations on assistive technology | 9 | 9 | 0 | 0 | 100.00 |
| ASSISTIVETECH_Q03 | Availability of public funding for assistive technology | 8 | 7 | 0 | 1 | 87.50 |
| ASSISTIVETECH_Q02 | Coverage of people in need of assistive technology by legislation | 7 | 6 | 0 | 0 | 85.71 |
| ASSISTIVETECH_Q06 | Workforce availability for assistive technology | 7 | 6 | 0 | 0 | 85.71 |
| ASSISTIVETECH_Q07 | Availability of education/training for assistive technology | 6 | 5 | 0 | 0 | 83.33 |
| ASSISTIVETECH_Q01 | Existence of legislation on access to assistive technology | 8 | 6 | 0 | 2 | 75.00 |
| ASSISTIVETECH_Q04 | Existence of responsible ministry/authority on assistive technology | 7 | 5 | 0 | 2 | 71.43 |
| ASSISTIVETECH_Q05 | Availability of government/registered services for assistive technology | 7 | 0 | 7 | 0 | 0.00 |
This work explored and modeled Kenya’s assistive technology indicators dataset to inform humanitarian planning and policy. The analysis involved thorough data cleaning and preparation, focusing on transforming qualitative survey responses into formats suitable for statistical modeling.
Exploratory data analysis revealed patterns in the distribution of indicator responses, showing sector strengths, coverage gaps, and unknowns across dimensions such as education, legislation, funding, and service availability. Several modeling approaches were implemented: binary and multinomial logistic regression, decision trees, random forests, clustering, and ordinal regression. The modeling offered insights into the determinants of adequate assistive technology provision and potential predictors of systemic barriers.
Visualizations highlighted the distribution and coverage of “Yes”, “No coverage”, and “Information not available” responses, supporting rapid diagnostic interpretation. Predictive models enabled the identification of indicator groups most likely associated with positive outcomes or response variability, while clustering grouped indicators with similar patterns, offering a basis for further programmatic targeting.
All code and steps were delivered reproducibly in both script and RMarkdown formats, ensuring transparency, repeatability, and adaptability for further analysis, new years of data, or expansion to other countries.