Introduction

Dengue fever, also known as breakbone fever, is a tropical disease spread by the bite of an infected mosquito, primarily Aedes aegypti. The disease is caused by the dengue virus, which belongs to the flavivirus genus and consists of four distinct serotypes. Because immunity acquired from infection with one serotype does not protect against the others, a person can be infected multiple times over their lifetime. The disease is endemic in the Caribbean, Central America, and South America, with cases periodically occurring in Florida.

 

Figure 1. The chain of infection for dengue fever. Retrieved from https://www.cdc.gov/dengue/transmission

 

Florida routinely experiences two types of dengue cases. The first are travel-related cases, in which a person becomes infected in another country and exhibits symptoms after returning home. The second are locally acquired cases, in which a person with no known history of international travel is infected by a mosquito in Florida. Although local transmission of dengue is rare, its spread in Florida is a major public health concern because the vector, Ae. aegypti, is widespread in major population centers with frequent international travel to and from dengue-endemic countries.

This report wrangles and visualizes data acquired from the Florida Department of Health to illustrate that:

  • Miami-Dade County has experienced the greatest number of both travel-related and locally acquired cases.
  • The greatest number of cases are reported in the months of July, August, September, and October.
  • Overall, travel-related dengue cases have risen dramatically in the past few years and require targeted intervention strategies to prevent local transmission from rising.

Data Preparation

This section loads the libraries needed for data preparation and analysis. The glimpse() function, which is available through the tidyverse, is used to quickly assess the structure of the dengue dataset and to identify any required tidying and data transformations.

# load library
library(tidyverse)
library(here)        # file pathways
library(skimr)       # data exploring
library(janitor)     # data cleaning
library(kableExtra)  # table formatting
library(plotly)      # interactive plots
library(hrbrthemes)  # plot themes
# read in dengue data
dengue <- read_csv(here('data', 'dengueCasesFL.csv'))

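The dengue dataset is cleaned by converting the column names to lower case and replacing the spaces within them with underscores. The columns are then reordered so that year, month, and county come first, and the rows are sorted by year and then month in ascending order, oldest first. Finally, year and month are converted to factors for plotting.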

# clean dengue dataset
cleandengue <- dengue %>% 
  # clean col names
  clean_names() %>%
  # rename serotype and case fields
  rename(serotype = serotype_detected,
         cases = number_of_cases,
         type = case_type) %>%
  # reorder columns by year, month, county, then the rest of the column
  select(year,
         month,
         county,
         everything())  %>% # get rest of columns
  # arrange by year, then month (month is still text here, so it sorts alphabetically)
  arrange(year,month) %>%
  # transform year and month as factor for plots
  mutate(year = factor(year),
         month = factor(month, levels = month.name))

# glimpse
glimpse(cleandengue)
## Rows: 1,750
## Columns: 6
## $ year     <fct> 2009, 2009, 2009, 2009, 2009, 2009, 2009, 2009, 2009, 2009, 2…
## $ month    <fct> April, April, April, August, August, August, August, December…
## $ county   <chr> "Orange", "Sarasota", "Hillsborough", "Orange", "Miami-Dade",…
## $ serotype <chr> "Unknown", "Unknown", "Unknown", "Unknown", "Unknown", "DENV-…
## $ cases    <dbl> 1, 1, 1, 1, 1, 1, 6, 2, 1, 1, 1, 3, 1, 1, 2, 1, 2, 1, 1, 1, 1…
## $ type     <chr> "travel", "travel", "travel", "travel", "travel", "local", "l…
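
The glimpse() output above lists April and August ahead of the other months of 2009, which suggests the rows were sorted while month was still a character column. The following is a minimal sketch, using a few made-up month values rather than the dengue data, of how sorting differs between month names stored as text and month names stored as a factor whose levels follow the built-in month.name vector:

```r
# Hypothetical month values, used only to illustrate sort order
months_chr <- c("March", "January", "August", "April")

# Character vectors sort alphabetically
sorted_chr <- sort(months_chr)

# Factors sort by level order; month.name is base R's January-to-December vector
months_fct <- factor(months_chr, levels = month.name)
sorted_fct <- as.character(sort(months_fct))

print(sorted_chr)
print(sorted_fct)
```

If calendar order within each year were needed in the data frame itself, one option would be to run the mutate() step before arrange() in the pipeline above. The plots are unaffected either way, since they rely on the factor levels rather than the row order.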

The cleaned dataset contains the following columns:

  • year (factor): the year when the case occurred
  • month (factor): the month when the case occurred
  • county (text): the county of residence for the dengue case
  • serotype (text): the serotype/variant of the dengue case
  • cases (double): the number of cases
  • type (text): whether the case was travel-related or locally acquired

Analysis

First, the 10 counties with the greatest number of dengue cases are identified. Based on the summarized data in the table below, Miami-Dade County reports the greatest number of dengue cases in Florida, with 2,347 total cases reported from 2009 to 2024.

# summarize cases by county
county <- cleandengue %>% 
  # group by county
  group_by(county) %>% 
  # sum cases
  summarize(Cases = sum(cases)) %>%
  # rename county for table
  rename(County = county) %>%
  arrange(-Cases) %>%
  head(10) %>%
  kable(format = "html", booktabs = TRUE) %>% 
  kable_styling(font_size = 18)
# display county data table
county
County        Cases
Miami-Dade     2347
Broward         384
Hillsborough    245
Palm Beach      225
Monroe          186
Orange          175
Lee             102
Osceola          65
Collier          50
Duval            48

Although a case summary is useful, how often cases are reported each month is another indicator of dengue prevalence. Because the dataset tabulates the number of cases by month for each county, a jittered point plot shows not only which counties experience the largest case counts but also how frequently cases have occurred. Months with no confirmed cases are absent from the dataset, so the density of points for a county reflects how often it reported dengue. A greater density of points indicates more frequent reporting, whereas sparse or absent points indicate less frequent reporting. In addition, the plot is faceted by case type to compare the patterns of travel-related and locally acquired cases.

One important caveat when interpreting the plot is that disparities in testing and reporting among local public health agencies likely cause the data to under-represent the actual extent of dengue incidence in Florida.

cleandengue %>%
  # order counties by their largest single reported case count (.fun = max)
  mutate(county = fct_reorder(county, cases, .fun = max, .desc = FALSE)) %>%
  # plot re-factored cases by county and color by year
  ggplot(aes(x=county,y=cases,color=year)) +
  # jittered points
  geom_jitter() +
  # flip coords on plot
  coord_flip() +
  # set theme
  theme_minimal() +
  # facet wrap by type
  facet_wrap(~type) +
  # set main title
  labs(title='Monthly Dengue Cases') +
  # adjust plot elements
  theme(
    # center title 
    plot.title = element_text(hjust = 0.5)) +
  # set y label
  ylab('Cases') +
  # set x label
  xlab('County')
Figure 2. Travel-related versus locally acquired dengue cases in Florida from 2009 to 2024. Miami-Dade County has reported by far the greatest number of both types of dengue cases.

  # save plot for upload
  ggsave(here("outputs", "dengueFacet.png"))
## Saving 7 x 7 in image
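
The jittered plot relies on the fact that months with no confirmed cases are simply missing from the dataset. For other summaries, such as monthly averages, those gaps may need to become explicit zeros. The following is a minimal sketch of one way to do that with tidyr::complete(), which is not used in the original analysis; the toy data frame and its values are invented for illustration:

```r
library(dplyr)
library(tidyr)

# Hypothetical monthly counts for one county; February has no row at all
toy <- tibble(
  county = "Example",
  year   = 2020,
  month  = factor(c("January", "March"), levels = month.name[1:3]),
  cases  = c(2, 1)
)

# complete() adds a row for every missing county/year/month combination.
# For a factor it expands over all levels, so February is added here,
# and fill sets cases to 0 in the rows that were added.
toy_filled <- toy %>%
  complete(county, year, month, fill = list(cases = 0))

print(toy_filled)
```

Whether a missing month truly means zero cases is an assumption that depends on how the source data were reported, so filled zeros should be interpreted with the same caution as the testing and reporting disparities noted earlier.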

Next, the months with the greatest number of reported dengue cases are identified. Florida is a tourism destination for both domestic and international travelers. Given that Aedes aegypti is most active in the summer months, understanding seasonal trends in travel-related cases is vital to preventing the local transmission and spread of dengue. The second visualization is a stacked bar plot that displays, for each calendar month, the total number of imported dengue cases versus locally acquired cases reported from 2009 to 2024.

cleandengue %>%
  # group by month and type
  group_by(month, type) %>%
  # summarize cases, then drop the grouping
  summarise(sumCases = sum(cases), .groups = "drop") %>%
  # create stacked bar plot for monthly cases by type
  ggplot(aes(x = month,
             y = sumCases,
             fill = type)) +
    # bar plot (geom_col() is equivalent to geom_bar(stat = "identity"))
    geom_col() +
    # set title
    labs(title = 'Monthly Dengue Cases') +
    # set x label
    xlab('Month') +
    # set y label
    ylab('Total Cases') +
    theme_minimal() +
    # adjust theme text settings
    theme(plot.title = element_text(hjust = 0.5),  # center plot title
          # x-axis text
          axis.text.x = element_text(size = 11,    # text size
                                     angle = 30,   # angle
                                     vjust = 0.8,  # vertical justification
                                     hjust = 0.8)) + # horizontal justification
    # set bar fill colors based on case type
    scale_fill_manual(values = c("local" = "#EF6F6C", "travel" = "#DDAE7E"), name = "Case Type")
Figure 3. Total monthly dengue cases by type. Travel-related cases are the largest source of dengue cases in Florida, while a small subset result from local transmission.
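
The monthly totals behind Figure 3 can also be ranked in a table, which makes it easier to state which months lead and what share of cases they account for. The following is a minimal sketch using dplyr::count() with a weight; the toy data frame mimics the columns of cleandengue, but its values are invented and the names month_totals, total_cases, and share are hypothetical:

```r
library(dplyr)

# Hypothetical rows with the same column names as cleandengue (values are invented)
toy <- tibble(
  month = factor(c("July", "July", "August", "May"), levels = month.name),
  type  = c("travel", "local", "travel", "travel"),
  cases = c(4, 1, 3, 2)
)

month_totals <- toy %>%
  # sum cases per month (wt = cases) and sort from largest to smallest
  count(month, wt = cases, name = "total_cases", sort = TRUE) %>%
  # share of all cases that each month accounts for
  mutate(share = total_cases / sum(total_cases))

print(month_totals)
```

Applied to cleandengue, the same pipeline would give a ranked list of months that can be compared directly against the bar heights in the plot.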

Summarizing cases by month shows that most cases occurred in July, August, September, and October. However, the yearly case count is another relevant factor for understanding the disease’s prevalence in recent years. The next block summarizes cases by year and type and displays the yearly totals in an interactive plot.

# plot yearly time series 
ts_dengue <- cleandengue %>%
  # group by year and type
  group_by(year, type) %>%
  # sum cases
  summarise(cases = sum(cases)) %>%
  # plot; set fill based on type
  ggplot(aes(x=year,y=cases, fill=type))+
  # bar plot
  geom_bar(stat="identity") +
  # set main title
  labs(title='Yearly Dengue Cases') +
  # set y axis label
  ylab("Number of Cases") +
  # set x axis label
  xlab("Year") +
  # set theme
  theme_ipsum() + 
  # set bar colors based on type
  scale_fill_manual(values = c("local" = "#69b3a2", "travel" = "#DDAE7E"), name = "Case Type")

# convert plot to interactive plot with ggplotly()
ts_dengue <- ggplotly(ts_dengue)
# display plot
ts_dengue

Figure 4. Total reported dengue cases each year. The number of cases has greatly increased since 2022, following a record low in 2021 that was likely a result of increased travel restrictions.
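
The yearly trend shown in Figure 4 can be quantified with a year-over-year percent change and the share of cases that were locally acquired. The following is a minimal sketch using dplyr::lag(); the toy values are invented, the names yearly, total, local_share, and pct_change are hypothetical, and year is kept numeric here even though cleandengue stores it as a factor:

```r
library(dplyr)

# Hypothetical yearly counts by case type (values are invented)
toy <- tibble(
  year  = c(2021, 2021, 2022, 2022),
  type  = c("travel", "local", "travel", "local"),
  cases = c(8, 2, 15, 5)
)

yearly <- toy %>%
  group_by(year) %>%
  summarise(
    total       = sum(cases),                               # all cases in the year
    local_share = sum(cases[type == "local"]) / sum(cases), # fraction locally acquired
    .groups     = "drop"
  ) %>%
  arrange(year) %>%
  # percent change relative to the previous year (NA for the first year)
  mutate(pct_change = 100 * (total - lag(total)) / lag(total))

print(yearly)
```

With the real data, year would first need converting from a factor, for example with as.integer(as.character(year)), before sorting chronologically.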

Conclusion

This analysis resulted in three key findings:

  • Both travel-related and locally acquired dengue cases were most numerous in Miami-Dade County.
  • Dengue cases were most often reported during the months of July, August, September, and October.
  • Yearly dengue cases have starkly increased from 2022 to 2024, with local transmission occurring more regularly.