Source: VectorStock

1 Introduction

This report presents a comprehensive analysis of infectious disease indicators in Kenya, with particular focus on cholera trends and comparisons across multiple diseases including measles, poliomyelitis, and meningitis.

2 Dataset description

The dataset covers infectious disease indicators in Kenya over multiple years, including reported cases and deaths for various diseases such as cholera, measles, polio, tetanus, diphtheria, and others. It provides yearly data on specific health metrics like number of cases, deaths, and case fatality rates for these diseases, enabling analysis of disease trends and impacts over time.

2.1 Dataset size

The dataset “kenya-infectious-disease-indicators.csv” has 3 columns (Metric, Year, and Value) and 384 rows of data, capturing various infectious disease indicators reported over multiple years in Kenya.

2.2 Installing packages and loading libraries

Before we begin our analysis, we need to load the necessary R packages. These libraries provide functions for data manipulation, visualization, and reading CSV files.

library(knitr)
library(tidyverse)
library(kableExtra)

2.3 Loading Data

Now we’ll import the Kenya infectious disease dataset into R. This dataset contains yearly records of various disease metrics collected over multiple decades.

data <- read_csv("C:/Users/user/Documents/un-report/data/practice-datasets/kenya-infectious-disease-indicators.csv")

3 Data Exploration

Let’s begin by examining the structure of our dataset. We’ll create a frequency table to see which disease metrics are most commonly reported and understand the overall composition of our data.

frequency_table <- data %>%
  count(Metric) %>%
  arrange(desc(n))

kable(frequency_table, caption = "<span style='color:blue; font-weight:bold;'> Table 1: Frequency of Reported Disease Metrics in Kenya </span>", escape = FALSE)

Table 1: Frequency of Reported Disease Metrics in Kenya

Metric                                                   n
Measles - number of reported cases                      46
Poliomyelitis - number of reported cases                39
Cholera case fatality rate                              38
Number of reported cases of cholera                     38
Number of reported deaths from cholera                  38
Total tetanus - number of reported cases                36
Neonatal tetanus - number of reported cases             28
Pertussis - number of reported cases                    28
Yellow fever - number of reported cases                 27
Diphtheria - number of reported cases                   24
Total rubella - number of reported cases                18
Number of suspected meningitis deaths reported          13
Mumps - number of reported cases                         7
Congenital Rubella Syndrome - number of reported cases   2
Japanese encephalitis - number of reported cases         2

4 Data Processing

To prepare our data for analysis, we need to clean and filter it appropriately. This involves converting the Year column to a numeric type, filtering specific disease metrics, and organizing the data into separate datasets for each disease of interest.

4.1 Converting Year column to numeric format

data$Year <- as.numeric(data$Year)
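
The conversion above can be sanity-checked before moving on. The sketch below uses a hypothetical vector of year strings (not values from the dataset) to show how as.numeric() turns unparseable entries into NA, and how those failures could be counted. If read_csv() has already parsed Year as numeric, the check simply reports zero.

```r
# Hypothetical year values; the last one is deliberately malformed
year_raw <- c("1971", "1983", "2016", "n/a")

# as.numeric() returns NA (with a warning) for entries it cannot parse
year_num <- suppressWarnings(as.numeric(year_raw))

# Count entries that were present but failed to convert
n_failed <- sum(is.na(year_num) & !is.na(year_raw))
print(n_failed)
```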

4.2 Filtering different disease metrics

measles_cases <- filter(data, Metric == "Measles - number of reported cases")
polio_cases <- filter(data, Metric == "Poliomyelitis - number of reported cases")
cholera_cfr <- filter(data, Metric == "Cholera case fatality rate")
cholera_cases <- filter(data, Metric == "Number of reported cases of cholera")
cholera_deaths <- filter(data, Metric == "Number of reported deaths from cholera")
tetanus_total <- filter(data, Metric == "Total tetanus - number of reported cases")
tetanus_neonatal <- filter(data, Metric == "Neonatal tetanus - number of reported cases")
pertussis_cases <- filter(data, Metric == "Pertussis - number of reported cases")
yellow_fever_cases <- filter(data, Metric == "Yellow fever - number of reported cases")
diphtheria_cases <- filter(data, Metric == "Diphtheria - number of reported cases")
rubella_total <- filter(data, Metric == "Total rubella - number of reported cases")
meningitis_deaths <- filter(data, Metric == "Number of suspected meningitis deaths reported")
mumps_cases <- filter(data, Metric == "Mumps - number of reported cases")
congenital_rubella <- filter(data, Metric == "Congenital Rubella Syndrome - number of reported cases")
japanese_encephalitis <- filter(data, Metric == "Japanese encephalitis - number of reported cases")
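
As an alternative to writing one filter() call per metric, the same separation could be sketched with base R's split(), which returns a named list holding one data frame per metric. The toy data frame below is a stand-in with the same three columns; its values are illustrative only and are not taken from the Kenya dataset.

```r
# Toy stand-in for `data` (illustrative values, NOT from the dataset)
toy <- data.frame(
  Metric = c("Measles - number of reported cases",
             "Measles - number of reported cases",
             "Cholera case fatality rate"),
  Year   = c(1980, 1981, 1980),
  Value  = c(10, 20, 1.5)
)

# One data frame per metric, collected in a named list
by_metric <- split(toy, toy$Metric)

# Look up a single metric by its label
measles_toy <- by_metric[["Measles - number of reported cases"]]
print(names(by_metric))
print(nrow(measles_toy))
```

Separate named objects, as used in this report, are easier to read in later code; a list is more convenient when the same step has to be applied to every metric.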

5 Research Questions

In this section, we address three key research questions about infectious diseases in Kenya. Each question is explored through detailed statistical analysis and data visualization to reveal important public health trends and patterns.

5.1 Question 1: How Have Reported Cholera Cases and Deaths Varied Over Time in Kenya?

Cholera is a major public health concern in Kenya, causing periodic outbreaks with significant morbidity and mortality. In this section, we examine the historical trends in both cholera cases and deaths to understand the disease burden and identify patterns in outbreak cycles.

5.1.1 Summary Statistics

Here we present the key statistical highlights from the cholera data. First, we’ll calculate the total number of cholera cases and deaths recorded in the dataset to understand the overall magnitude of the cholera burden in Kenya.

5.1.2 Total number of cholera cases

total_cholera_cases <- sum(cholera_cases$Value, na.rm = TRUE)
print(paste("Total Cholera Cases:", total_cholera_cases))
## [1] "Total Cholera Cases: 118214"

5.1.3 Total number of cholera deaths

total_cholera_deaths <- sum(cholera_deaths$Value, na.rm = TRUE)
print(paste("Total Cholera Deaths:", total_cholera_deaths))
## [1] "Total Cholera Deaths: 3873"
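
The two totals can be combined into an overall case fatality rate (deaths divided by cases). The sketch below hard-codes the printed totals so that it runs on its own. Note that this pooled figure is an assumption about how one might summarize the data; it is not the same as averaging the yearly values of the "Cholera case fatality rate" metric.

```r
# Totals as printed in the output above
total_cases  <- 118214
total_deaths <- 3873

# Pooled case fatality rate over all reported years, in percent
overall_cfr <- total_deaths / total_cases * 100
print(round(overall_cfr, 2))
```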

5.1.4 Analysis

Based on the calculated totals, we can now interpret what these numbers tell us about cholera’s impact in Kenya. A total of 118,214 cholera cases and 3,873 cholera deaths were reported over the available years, which corresponds to an overall case fatality rate of about 3.3% (3,873 / 118,214). Totals of this size mark cholera as a significant and persistent public health challenge in Kenya. On their own, however, they do not show how the burden was distributed across years; a year-by-year view is needed to identify epidemic peaks and quieter periods, which would be consistent with the known cyclical pattern of cholera outbreaks in the country.

5.1.6 Variables

  • Year (numeric)
  • Reported cholera cases (numeric)
  • Reported cholera deaths (numeric)

5.1.7 Aesthetic Choices

  • x-axis: Year
  • y-axis: Count (cases or deaths)
  • Color: Metric (cases vs deaths)

5.1.8 Geometry

  1. geom_line() for trends over time
  2. geom_point() to show exact data points

5.1.9 Rationale

Line plots clearly show temporal trends and peaks. Different colors allow comparison between cases and deaths.
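
A minimal sketch of the plot described above, assuming the cases and deaths series are stacked into one long data frame with a Metric column. The data frame `cholera_long` and all of its values are hypothetical placeholders, not figures from the Kenya dataset.

```r
library(ggplot2)

# Illustrative values only, NOT taken from the Kenya dataset
cholera_long <- data.frame(
  Year   = rep(c(2000, 2001, 2002, 2003), times = 2),
  Metric = rep(c("Cases", "Deaths"), each = 4),
  Value  = c(120, 900, 300, 50, 5, 40, 12, 2)
)

# Lines show the trend, points mark each yearly observation,
# and color separates cases from deaths
p_cholera <- ggplot(cholera_long, aes(x = Year, y = Value, color = Metric)) +
  geom_line() +
  geom_point() +
  labs(x = "Year", y = "Count", color = "Metric") +
  theme_minimal()

print(p_cholera)
```

With the real data, the two filtered data frames could be combined with bind_rows() in the same way as in the multi-disease comparison later in this report.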

5.3 Question 3: How Do the Incidences of Different Infectious Diseases Compare Across Years in Kenya?

Comparing multiple diseases helps us understand the relative public health burden and the effectiveness of different intervention strategies. In this section, we examine measles, poliomyelitis, and meningitis to see how their patterns differ over time.

5.3.1 Overview

Before diving into the visualization, let’s examine the summary statistics for each disease to understand their individual characteristics and trends.

The three series compare across years as follows. Note that the measles and poliomyelitis series count reported cases, while the meningitis series counts suspected deaths:

5.3.2 Measles

  • Years Reported: 46
  • Peak Year: 1983
  • Peak Cases: 285,681
  • Average Annual Cases: ~41,989
  • Trend: Measles had the highest incidence among the three diseases, with massive outbreaks in the 1980s. The number of cases declined significantly after the 1990s, likely due to expanded vaccination programs.

5.3.3 Poliomyelitis

  • Years Reported: 39
  • Peak Year: 1988
  • Peak Cases: 1,688
  • Average Annual Cases: ~180
  • Trend: Polio cases were moderate in the 1980s and 1990s, with a clear decline toward eradication after 2000. Most recent years show zero reported cases, reflecting successful immunization efforts.

5.3.4 Meningitis (Suspected Deaths)

  • Years Reported: 13
  • Peak Year: 1977
  • Peak Deaths: 196
  • Average Annual Deaths: ~65
  • Trend: Meningitis data is sparse and irregular, but peaks in the late 1970s suggest episodic outbreaks. Reporting may have been inconsistent or limited to severe cases.
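
The per-disease figures listed in sections 5.3.2 to 5.3.4 (years reported, peak year, peak value, annual average) could be derived with a grouped summary along the following lines. This is a sketch under assumptions: the tibble `toy` and its values are illustrative and not taken from the dataset, and n() equals "Years Reported" only if each row represents one year.

```r
library(dplyr)

# Illustrative values only, NOT taken from the Kenya dataset
toy <- tibble(
  Disease = c("A", "A", "A", "B", "B"),
  Year    = c(1980, 1981, 1982, 1980, 1981),
  Value   = c(10, 50, 30, 4, 2)
)

disease_summary <- toy %>%
  group_by(Disease) %>%
  summarise(
    years_reported = n(),                      # one row per reported year
    peak_year      = Year[which.max(Value)],   # year of the largest value
    peak_value     = max(Value, na.rm = TRUE),
    mean_value     = mean(Value, na.rm = TRUE)
  )

print(disease_summary)
```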

5.3.5 Summary

Having reviewed each disease individually, we can now synthesize these findings to understand the broader patterns in infectious disease control in Kenya. Measles was the most widespread and consistently reported disease. Polio showed moderate incidence with successful control over time. Meningitis had fewer data points but notable peaks in specific years.

Visualization 3: Comparison of Multiple Diseases

To compare these three diseases visually, we’ll create a multi-line graph with each disease represented by a different color. This allows for direct comparison of their temporal patterns and relative magnitudes.

5.3.6 Variables

  • Year (continuous)
  • Number of Cases/Deaths (continuous)
  • Disease type (categorical)

5.3.7 Aesthetic Choices

  • x-axis: Year
  • y-axis: Number of Cases/Deaths
  • Color: Disease type

5.3.8 Geometry

  • geom_line() with linewidth = 1
  • geom_point() for data points

# Adding disease labels
measles_cases$Disease <- "Measles"
polio_cases$Disease <- "Poliomyelitis"
meningitis_deaths$Disease <- "Meningitis (Deaths)"

# Combining data
combined_data <- bind_rows(measles_cases, polio_cases, meningitis_deaths)
combined_data$Year <- as.numeric(combined_data$Year)

# Plotting disease comparison
# (linewidth replaces the deprecated size argument for lines in ggplot2 >= 3.4.0)
ggplot(combined_data, aes(x = Year, y = Value, color = Disease)) +
  geom_line(linewidth = 1) +
  geom_point() +
  labs(title = "Comparison of Measles, Poliomyelitis, and Meningitis in Kenya",
       x = "Year",
       y = "Number of Cases / Deaths",
       color = "Disease") +
  theme_minimal()
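
Because the measles counts peak in the hundreds of thousands while the polio and meningitis series stay in the hundreds or low thousands, a shared y-axis can flatten the smaller series. One possible remedy, sketched here with hypothetical values that are not from the dataset, is to give each disease its own panel with an independent y-axis via facet_wrap().

```r
library(ggplot2)

# Illustrative values only, NOT taken from the Kenya dataset
toy <- data.frame(
  Year    = rep(c(1980, 1985, 1990), times = 3),
  Disease = rep(c("Measles", "Poliomyelitis", "Meningitis (Deaths)"), each = 3),
  Value   = c(200000, 90000, 40000, 900, 1500, 300, 60, 20, 10)
)

# One panel per disease; scales = "free_y" lets each panel use its own range
p_facets <- ggplot(toy, aes(x = Year, y = Value, color = Disease)) +
  geom_line() +
  geom_point() +
  facet_wrap(~ Disease, ncol = 1, scales = "free_y") +
  labs(x = "Year", y = "Number of Cases / Deaths") +
  theme_minimal() +
  theme(legend.position = "none")

print(p_facets)
```

The trade-off is that free scales make the shape of each trend easier to see but hide the difference in magnitude, so the axis labels need to be read panel by panel.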

Additional Detailed Visualizations

To provide a more comprehensive view of the data, we present additional focused visualizations that examine specific aspects of cholera trends in greater detail.

Cholera Cases Trend (1971-2016)

This detailed visualization focuses exclusively on cholera case numbers, using contrasting colors (blue line with red points) to make individual data points stand out and help identify specific outbreak years.

# Filter cholera cases
cholera_cases_filtered <- filter(data, Metric == "Number of reported cases of cholera")
cholera_cases_filtered$Year <- as.numeric(cholera_cases_filtered$Year)

# Plot cholera cases
ggplot(cholera_cases_filtered, aes(x = Year, y = Value)) +
  geom_line(color = "blue", linewidth = 1) +
  geom_point(color = "red") +
  labs(title = "Trend of Reported Cholera Cases in Kenya (1971–2016)",
       x = "Year",
       y = "Number of Cases") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))

5.3.9 Cholera Deaths Trend (1971-2016)

Similarly, we create a focused visualization of cholera deaths using red coloring throughout to emphasize the mortality aspect and make patterns in the yearly death counts clearly visible.

# Keep reported cholera deaths from 1971 to 2016
cholera_deaths_filtered <- data %>%
  filter(Metric == "Number of reported deaths from cholera") %>%
  mutate(Year = as.numeric(Year)) %>%  # convert before the numeric comparison
  filter(Year >= 1971, Year <= 2016)

# Plot cholera deaths
ggplot(cholera_deaths_filtered, aes(x = Year, y = Value)) +
  geom_line(color = "red") +
  geom_point(color = "red") +
  labs(title = "Reported Cholera Deaths in Kenya (1971-2016)",
       x = "Year",
       y = "Number of Deaths") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5))

5.3.10 Overall Conclusions

Having completed our comprehensive analysis of infectious disease data in Kenya, we now synthesize the key findings and their implications for public health policy and practice.

This analysis examined infectious disease trends in Kenya, with particular attention to cholera patterns between 1971 and 2016. The key findings include:

Cholera remains a significant public health challenge in Kenya, with over 118,000 cases and nearly 4,000 deaths recorded over the study period.

Case fatality rates have generally improved over time, declining from peaks of over 15% in the early 1970s to lower rates in recent years, though occasional spikes indicate ongoing challenges.

5.3.11 Different diseases show distinct patterns:

  • Measles had the highest burden but declined significantly after vaccination programs.
  • Polio has been effectively controlled through immunization.
  • Meningitis shows episodic outbreaks with sparse data.

Public health interventions appear effective, as evidenced by declining trends in most disease indicators, particularly for vaccine-preventable diseases.

These visualizations reveal important trends in disease incidence, mortality, and case fatality rates that can inform future public health interventions and resource allocation in Kenya.

Session Information

For reproducibility purposes, we include the R session information showing the versions of R and all packages used in this analysis.
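
A minimal way to capture this information, assuming base R's sessionInfo() is what the report intended (the variable name `session_details` is illustrative):

```r
# Record the R version, platform, and attached packages for this run
session_details <- sessionInfo()
print(session_details)
```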