1 Honor Code

Honor Pledge: I have recreated my group submission using the tools I have installed on my own computer.

2 Part 1

  • Get the latest data from https://covid19.who.int/table.

    • The file should likely be named “WHO COVID-19 global table data November XXth 2021 at XXXXX.csv”

    • Don’t use the data from the previous group assignments. It is not the most recent data.

  • Choose three WHO regions of interest. Note that we have 6 main regions: Africa, Americas, Eastern Mediterranean, Europe, South-East Asia, and Western Pacific.

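  • Create a subset including 3 countries per WHO region. You can choose any three countries within each WHO region to compare the mortality rate (mutate(rate = `Deaths - cumulative total` / `Cases - cumulative total`)). Note the backticks: column names that contain spaces must be wrapped in backticks, not double quotes, or R will try to divide two strings. You will have 9 countries (3 countries * 3 WHO regions).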

  • Using navbarPage(), create a shiny dashboard that contains 3 tabs where each tab has a barplot of mortality rates with error bars from three countries in the selected region. For example,

ui <- navbarPage(title = "Mortality Rate", 
      tabPanel("Africa", ...), 
      tabPanel("Americas", ...), 
      tabPanel("Eastern Mediterranean", ...))
  • Create a separate tabPanel() for each level of the WHO region variable.

  • Under each tab, create a bar plot from three countries in the selected region, with a controller for including/excluding error bars. The default plot has no error bars.

  • Use a different control type under each tab. Check the available control types in the Shiny Widget Gallery.

  • Apply a theme from the shinythemes package. See the options at https://rstudio.github.io/shinythemes/.

  • Tips for renderPlot():

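    output$<id> <- renderPlot({
      # Base bar plot without error bars (the default view)
      p1 <- ...your ggplot...

      # Add error bars only when the control is switched on
      if (input$<id>) { # modify this line
        p1 <- p1 + geom_errorbar(aes(ymin = rate - 1.96 * SE, ymax = rate + 1.96 * SE), width = .2)
      }

      p1  # the last value is the plot that gets rendered
    })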
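
The tips above use `rate` and `SE` without saying how `SE` is obtained. The following is a minimal sketch in base R, not the required solution. It assumes a binomial standard error, sqrt(rate * (1 - rate) / cases), which the assignment does not specify, and it assumes the WHO export uses the column names quoted in the mutate() call plus a country column called "Name". The helper names `add_mortality_rate` and `subset_countries` are hypothetical.

```r
# Hypothetical helper: keep only the chosen countries.
# Assumption: the country column is called "Name" in the WHO export.
subset_countries <- function(df, countries, name_col = "Name") {
  df[df[[name_col]] %in% countries, , drop = FALSE]
}

# Hypothetical helper: add `rate` and `SE` columns to a WHO table subset.
# Assumes the CSV was read with check.names = FALSE so that column names
# keep their spaces, e.g. "Deaths - cumulative total".
add_mortality_rate <- function(df,
                               deaths_col = "Deaths - cumulative total",
                               cases_col  = "Cases - cumulative total") {
  deaths <- df[[deaths_col]]
  cases  <- df[[cases_col]]
  df$rate <- deaths / cases
  # Assumption: binomial standard error of a proportion.
  df$SE <- sqrt(df$rate * (1 - df$rate) / cases)
  df
}
```

With `rate` and `SE` in place, the geom_errorbar() call in the tips draws approximate 95% intervals (rate ± 1.96 × SE) under that binomial assumption.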

3 Part 2

3.1 Reading in the Data

##   Year   Month      State               Location
## 1 1998 January California             Restaurant
## 2 1998 January California                       
## 3 1998 January California             Restaurant
## 4 1998 January California             Restaurant
## 5 1998 January California Private Home/Residence
## 6 1998 January California             Restaurant
##                                Food Ingredient             Species
## 1                                                                 
## 2                           Custard                               
## 3                                                                 
## 4                         Fish, Ahi                Scombroid toxin
## 5 Lasagna, Unspecified; Eggs, Other            Salmonella enterica
## 6                                                  Shigella boydii
##   Serotype.Genotype    Status Illnesses Hospitalizations Fatalities
## 1                                    20                0          0
## 2                                   112                0          0
## 3                                    35                0          0
## 4                   Confirmed         4                0          0
## 5       Enteritidis Confirmed        26                3          0
## 6                   Confirmed        25                3          0

3.2 Data Description

This dataset can be found on Kaggle using this link. It was motivated by the question of whether foodborne disease outbreaks are increasing or decreasing. It can also help answer questions such as which contaminant has been responsible for the most illnesses, hospitalizations, and deaths, and which food preparation locations pose the greatest risk of foodborne illness.

This dataset covers foodborne disease outbreaks reported to the CDC from 1998 through 2015. There are 14 columns: Year, Month, State, Location, Food, Ingredient, Species, Serotype.Genotype, Status, Illnesses, Hospitalizations, Fatalities, state, and region. The column names make clear what each one represents. There are 19119 observations in this dataset. The dataset has 4 numeric variables, one chr/string variable, and 9 factor variables.
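
Counts like these can be checked directly against the data frame. This is a minimal sketch, assuming the data frame is called `outbreaks` (a hypothetical name, since the original code chunk is not shown); `column_type_counts` is likewise a hypothetical helper.

```r
# Hypothetical helper: tally columns by their first class
# (e.g. "numeric", "integer", "character", "factor").
column_type_counts <- function(df) {
  table(vapply(df, function(col) class(col)[1], character(1)))
}

# Usage, assuming a data frame named `outbreaks`:
# dim(outbreaks)                 # number of rows and columns
# column_type_counts(outbreaks)  # number of columns per type
```

Note that R reports whole-number columns as "integer" rather than "numeric", so the tally may split what the description groups together as numeric variables.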

3.3 Map

The map shows that states like Florida and California have the most foodborne disease outbreaks in total: they appear in light blue, while the rest of the map is filled with darker blues. There is no clear regional pattern in where outbreak counts are higher or lower. It is interesting that California and Florida are the only states with an obviously high total count of locations with foodborne disease outbreaks.
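
The per-state totals behind a map like this can be tabulated before plotting, which makes it easy to confirm which states stand out. This is a minimal sketch, not the code used for the map above. It assumes one row per reported outbreak, a `State` column as in the printed data, and a hypothetical data frame name `outbreaks`; `outbreaks_by_state` is a hypothetical helper.

```r
# Hypothetical helper: number of outbreak records per state, largest first.
# Assumption: each row of `df` is one reported outbreak.
outbreaks_by_state <- function(df, state_col = "State") {
  counts <- as.data.frame(table(df[[state_col]]), stringsAsFactors = FALSE)
  names(counts) <- c("state", "n_outbreaks")
  counts[order(-counts$n_outbreaks), , drop = FALSE]
}

# Usage, assuming a data frame named `outbreaks`:
# head(outbreaks_by_state(outbreaks))  # states with the most records
```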