Influenza vaccine rates

Background

Influenza is a common infectious disease that affects many people. Most people with influenza recover quickly, but elderly people and those with chronic medical conditions are at higher risk of complications and even death. Every year, millions of people get the flu. The good news is that the seasonal flu vaccine can lower the risk of getting the flu by about half. Getting the yearly flu vaccine is the best way to protect yourself from the flu.

Objective

In this project, I am intending to create a dashboard of Influenza vaccine rates on different counturies and an interactive graphic that shows how the Influenza vaccine rate changes over time by countury.

Data

Influenza vaccination rate refers to the number of people aged 65 and older who have received an annual influenza vaccination, divided by the total number of people over 65 years of age. This indicator is measured as a percentage of the population aged 65 and older who have received an annual influenza vaccine. The data come from administrative sources or surveys, depending on the country.

OECD Data https://data.oecd.org/healthcare/influenza-vaccination-rates.htm

Technologies

  • Shiny app in R using plotly for the interactive dropdown options
  • ggplot2 for plots and graphs
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────────────────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 3.2.1     ✔ purrr   0.3.2
## ✔ tibble  2.1.3     ✔ dplyr   0.8.3
## ✔ tidyr   1.0.0     ✔ stringr 1.4.0
## ✔ readr   1.3.1     ✔ forcats 0.5.0
## ── Conflicts ────────────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
library(ggplot2)
library(shiny)
## Warning: package 'shiny' was built under R version 3.6.2
library(extrafont)
## Registering fonts with R
library(gganimate)
## Warning: package 'gganimate' was built under R version 3.6.2
library(animation)
library(grid)

Dataset

dataset <- read.csv("https://raw.githubusercontent.com/ekhahm/DATA608/master/DP_LIVE_18102020230217983.csv")
head(dataset)
##   LOCATION INDICATOR SUBJECT  MEASURE FREQUENCY TIME Value Flag.Codes
## 1      AUS FLUVACCIN     TOT PC_POP65         A 2000  74.0           
## 2      AUS FLUVACCIN     TOT PC_POP65         A 2001  78.0           
## 3      AUS FLUVACCIN     TOT PC_POP65         A 2002  76.9           
## 4      AUS FLUVACCIN     TOT PC_POP65         A 2003  76.9           
## 5      AUS FLUVACCIN     TOT PC_POP65         A 2004  79.1           
## 6      AUS FLUVACCIN     TOT PC_POP65         A 2006  77.5

Average percentage grouped by location(countury)

data_loc <- dataset %>%
          group_by(LOCATION) %>%
          summarise(mean = mean(Value), n= n())
data_loc
## # A tibble: 36 x 3
##    LOCATION  mean     n
##    <fct>    <dbl> <int>
##  1 AUS       76.7     7
##  2 AUT       28.2     2
##  3 BEL       60.8     5
##  4 CAN       63.9    15
##  5 CHE       55.2     9
##  6 CHL       72.8    19
##  7 CZE       19.0    11
##  8 DEU       49.3    13
##  9 DNK       40.6    17
## 10 ESP       60.9    20
## # … with 26 more rows

Average percentage grouped by year

data_year <- dataset %>%
          group_by(TIME) %>%
          summarise(mean = mean(Value), n= n())
data_year
## # A tibble: 20 x 3
##     TIME  mean     n
##    <int> <dbl> <int>
##  1  2000  55.1    12
##  2  2001  54.3    17
##  3  2002  50.2    19
##  4  2003  55.3    23
##  5  2004  56.0    23
##  6  2005  55.1    24
##  7  2006  53.0    25
##  8  2007  52.7    26
##  9  2008  51.3    28
## 10  2009  51.0    30
## 11  2010  44.1    29
## 12  2011  44.8    26
## 13  2012  44.3    29
## 14  2013  45.5    28
## 15  2014  43.2    32
## 16  2015  43.4    28
## 17  2016  42.8    29
## 18  2017  44.7    28
## 19  2018  46.7    27
## 20  2019  51.0     7

Visualization

dataset$LOCATION <- factor(dataset$LOCATION , levels=rev(levels(dataset$LOCATION)))

Heatmap

Heatmap is able to give an instant overview of entire data.

p <- ggplot(dataset,aes(x=TIME,y=LOCATION,fill=Value))+
  #add border white colour of line thickness 0.25
  geom_tile(colour="white",size=0.25)+
  #remove x and y axis labels
  labs(x="",y="")+
  #remove extra space
  scale_y_discrete(expand=c(0,0))+
  #define new breaks on x-axis
  scale_x_discrete(expand=c(0,0),
                   breaks=c("2000","2005","2010","2015","2020"))+
  #set a base size for all fonts
  theme_grey(base_size=8)+
  #theme options
  theme(
    #bold font for legend text
    legend.text=element_text(face="bold"),
    #set thickness of axis ticks
    axis.ticks=element_line(size=0.4),
    #remove plot background
    plot.background=element_blank(),
    #remove plot border
    panel.border=element_blank())
p

Barplot

Bar plot is created to show an average Influenza vaccination(%) by country and re-ordered from highest to lowest.

ggplot(data=data_loc, aes(x=reorder(LOCATION, mean), y=mean)) +
  geom_bar(stat="identity", fill="steelblue")+
  theme_minimal()+
  ggtitle("Average Influenza vaccination(%) by Location")+
  xlab("Country") + ylab("Average vaiccination")+
  geom_text(aes(label=round(mean,1)), vjust= 0.5, size=2.5)+
  coord_flip()

Line graph

This line graph shows total average of Influenza vaccination rate for all country.

ggplot(data=data_year, aes(x=TIME, y=mean)) +
  geom_line(linetype = "dashed")+
  ggtitle("Average Influenza vaccination(%) over time")+
  ylim(0, 100)+
  geom_point()

Shiny App

Using Shiny App, I take the datase and use dpylr to filter the data out by country and create line graph changing over year. Created Shiny App with dataset is below:
https://ehahm.shinyapps.io/Data608Final/