In this assignment, we are given mortality data for all 50 states and the District of Columbia. We will use the R shiny package. We have already created an account at https://www.shinyapps.io/ and configured its name, token, and secret.
As a researcher, you frequently compare mortality rates from particular causes across different States. You need a visualization that will let you see (for 2010 only) the crude mortality rate, across all States, from one cause (for example, Neoplasms, which are effectively cancers). Create a visualization that allows you to rank States by crude mortality for each cause of death.
# load libraries
library(shiny)
## Warning: package 'shiny' was built under R version 3.6.2
library(rsconnect)
## Warning: package 'rsconnect' was built under R version 3.6.2
##
## Attaching package: 'rsconnect'
## The following object is masked from 'package:shiny':
##
## serverInfo
library(ggplot2)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(plotly)
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
## last_plot
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:graphics':
##
## layout
# authorization of the account
# (placeholders shown; substitute the name, token and secret from the shinyapps.io dashboard)
rsconnect::setAccountInfo(name='<ACCOUNT_NAME>', token='<TOKEN>', secret='<SECRET>')
We can follow the shiny tutorial series provided by RStudio to create our shiny app (reference: https://shiny.rstudio.com/tutorial/written-tutorial/lesson1/).
A shiny app has three components:
*user interface object (ui)
*server function (server)
*call to the shinyApp function
The ui object sets up the layout and appearance of the app, the server function contains the logic that turns user inputs into outputs, and the call to shinyApp combines ui and server into a running app.
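As a minimal sketch of these three components, the skeleton below wires a ui object and a server function together with shinyApp. The input id `cause`, the output id `summary`, and the two choices are placeholders for illustration only and are not taken from the final app.

```r
library(shiny)

# 1. user interface object: layout plus input/output placeholders
ui <- fluidPage(
  titlePanel("Skeleton app"),
  sidebarLayout(
    sidebarPanel(
      # 'cause' is a hypothetical input id used only in this sketch
      selectInput("cause", "Cause", choices = c("A", "B"), selected = "A")
    ),
    mainPanel(textOutput("summary"))
  )
)

# 2. server function: turns inputs into outputs
server <- function(input, output, session) {
  output$summary <- renderText({
    paste("Selected cause:", input$cause)
  })
}

# 3. call to shinyApp: combines ui and server into an app object
app <- shinyApp(ui = ui, server = server)

# launch only in an interactive session so that sourcing the script does not block
if (interactive()) runApp(app)
```

Assigning the result of shinyApp to a variable, rather than leaving it as the last expression, keeps the script from starting the app when it is run non-interactively.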
# load the dataset in R and review
df <- read.csv("https://raw.githubusercontent.com/charleyferrari/CUNY_DATA_608/master/module3/data/cleaned-cdc-mortality-1999-2010-2.csv")
head(df)
## ICD.Chapter State Year Deaths Population
## 1 Certain infectious and parasitic diseases AL 1999 1092 4430141
## 2 Certain infectious and parasitic diseases AL 2000 1188 4447100
## 3 Certain infectious and parasitic diseases AL 2001 1211 4467634
## 4 Certain infectious and parasitic diseases AL 2002 1215 4480089
## 5 Certain infectious and parasitic diseases AL 2003 1350 4503491
## 6 Certain infectious and parasitic diseases AL 2004 1251 4530729
## Crude.Rate
## 1 24.6
## 2 26.7
## 3 27.1
## 4 27.1
## 5 30.0
## 6 27.6
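The Crude.Rate values printed above are consistent with deaths per 100,000 population. This is an inference from the numbers rather than a documented definition of the column; a quick check, with the first three Alabama rows retyped by hand, might look like this:

```r
# first three rows of head(df), copied by hand from the output above
deaths     <- c(1092, 1188, 1211)
population <- c(4430141, 4447100, 4467634)
reported   <- c(24.6, 26.7, 27.1)

# crude rate = deaths per 100,000 population, rounded to one decimal
crude_rate <- round(deaths / population * 1e5, 1)

print(crude_rate)
# TRUE when the recomputed rates match the reported column
print(isTRUE(all.equal(crude_rate, reported)))
```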
# keep only the 2010 rows for one particular cause, then sort by crude rate
infectious_diseases <- filter(df,
  Year == 2010 & ICD.Chapter == 'Certain infectious and parasitic diseases')
infectious_diseases <- infectious_diseases %>% arrange(Crude.Rate)
head(infectious_diseases)
## ICD.Chapter State Year Deaths Population
## 1 Certain infectious and parasitic diseases UT 2010 274 2763885
## 2 Certain infectious and parasitic diseases ID 2010 174 1567582
## 3 Certain infectious and parasitic diseases ND 2010 83 672591
## 4 Certain infectious and parasitic diseases AK 2010 88 710231
## 5 Certain infectious and parasitic diseases MN 2010 678 5303925
## 6 Certain infectious and parasitic diseases NE 2010 245 1826341
## Crude.Rate
## 1 9.9
## 2 11.1
## 3 12.3
## 4 12.4
## 5 12.8
## 6 13.4
# build the plot with ggplot first to confirm it gives the expected result for one cause
ggplot(infectious_diseases, aes(x=reorder(State, -Crude.Rate), y=Crude.Rate))+
  geom_col(fill="tomato3")+
  coord_flip()+
  geom_text(aes(label=Crude.Rate),
            size=3,
            hjust=-0.5,
            color="black")+
  xlab("State")+
  ylab("Crude Rate")+
  theme(axis.text.x = element_text(angle=65, vjust=0.6))
Below, the ui sets up the header title, a side panel where the user selects the cause (ICD.Chapter in the dataset) with a default value, and the main panel where the bar plot for the selected cause is displayed. In the server function, we filter the data to the selected cause and the year 2010 and draw the bar plot with ggplot.
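The filtering and ranking step that the server performs can be tried outside shiny first. The sketch below uses a hypothetical helper, `rank_states`, and a small made-up data frame with the same column names as the dataset; it illustrates the idea and is not the code used in the app.

```r
# toy data with the same column names as the mortality dataset (values are made up)
toy <- data.frame(
  ICD.Chapter = c("Neoplasms", "Neoplasms", "Neoplasms", "Neoplasms", "Other"),
  State       = c("AA", "BB", "CC", "AA", "AA"),
  Year        = c(2010, 2010, 2010, 2009, 2010),
  Crude.Rate  = c(180.5, 210.2, 150.0, 175.0, 20.0),
  stringsAsFactors = FALSE
)

# hypothetical helper: keep one cause and one year, highest crude rate first
rank_states <- function(data, cause, year = 2010) {
  rows <- data[data$ICD.Chapter == cause & data$Year == year, ]
  rows <- rows[order(-rows$Crude.Rate), ]
  rows$Rank <- seq_len(nrow(rows))
  rows[, c("Rank", "State", "Crude.Rate")]
}

ranked <- rank_states(toy, "Neoplasms")
print(ranked)
```

Inside the app, the same logic would sit in a reactive expression, with `input$Cause` passed as `cause`.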
# define UI layout for the shiny app
ui <- fluidPage(
  headerPanel('State Mortality Rate by Cause'),
  sidebarPanel(
    selectInput('Cause', 'Cause', unique(df$ICD.Chapter),
                selected='Certain infectious and parasitic diseases') # default cause
  ),
  mainPanel(
    htmlOutput(outputId = 'selection'),
    plotOutput('plot1', height="auto"),
    h6("The Crude Mortality Rate across all States")
  )
)
# define server logic for data and input
server <- function(input, output, session) {
  # rows for the selected cause in 2010
  selectedData <- reactive({
    df %>% filter(ICD.Chapter == input$Cause & Year == 2010)
  })
  output$selection <- renderText({
    paste('<b> Crude rate for: </b>', input$Cause)
  })
  output$plot1 <- renderPlot({
    ggplot(selectedData(), aes(x=reorder(State, -Crude.Rate), y=Crude.Rate)) +
      geom_col(fill = "#ffa600") +
      coord_flip() +
      geom_text(aes(label=Crude.Rate),
                size=3,
                hjust=-0.2,
                color="#003f5c") +
      xlab("State") +
      ylab("Crude Rate") +
      theme(panel.background = element_blank())
  }, height = function() {
    # keep the plot square: its height follows the current plot width
    session$clientData$output_plot1_width
  })
}
shinyApp(ui = ui, server = server)
## PhantomJS not found. You can install it with webshot::install_phantomjs(). If it is installed, please make sure the phantomjs executable can be found via the PATH variable.
Often you are asked whether particular States are improving their mortality rates (per cause) faster than, or slower than, the national average. Create a visualization that lets your clients see this for themselves for one cause of death at a time. Keep in mind that the national average should be weighted by the national population.
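To see why the weighting matters, here is a toy sketch with made-up numbers for two hypothetical states. A plain mean of the state rates treats a small state and a large state equally, whereas dividing total deaths by total population gives the population-weighted national rate.

```r
# made-up figures for two hypothetical states (not from the dataset)
toy <- data.frame(
  State      = c("Small", "Large"),
  Deaths     = c(50, 2000),
  Population = c(100000, 10000000)
)
toy$Crude.Rate <- toy$Deaths / toy$Population * 1e5  # 50 and 20 per 100,000

# unweighted: plain mean of the two state rates
unweighted <- mean(toy$Crude.Rate)

# weighted: total deaths over total population
weighted <- sum(toy$Deaths) / sum(toy$Population) * 1e5

# equivalent form: mean of the state rates weighted by population
weighted_alt <- weighted.mean(toy$Crude.Rate, toy$Population)

print(c(unweighted = unweighted, weighted = weighted, weighted_alt = weighted_alt))
```

In this toy case the unweighted mean is 35 per 100,000, while the weighted rate is about 20.3, much closer to the large state's rate.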
# create a dataframe with the national mortality rate per cause and year,
# weighted by population: total deaths / total population * 100,000
# (group before aggregating; otherwise sum() runs over the whole table
# and every row receives the same rate)
nation <- df %>%
  group_by(ICD.Chapter, Year) %>%
  summarise(Crude.Rate = round((sum(Deaths) / sum(Population)) * 10^5, 1)) %>%
  ungroup() %>%
  mutate(State = "National") %>%
  select(ICD.Chapter, Year, Crude.Rate, State)
head(nation)
# create the dataframe with each state's mortality rate
state <- df %>%
  select(ICD.Chapter, Year, Crude.Rate, State)
head(state)
## ICD.Chapter Year Crude.Rate State
## 1 Certain infectious and parasitic diseases 1999 24.6 AL
## 2 Certain infectious and parasitic diseases 2000 26.7 AL
## 3 Certain infectious and parasitic diseases 2001 27.1 AL
## 4 Certain infectious and parasitic diseases 2002 27.1 AL
## 5 Certain infectious and parasitic diseases 2003 30.0 AL
## 6 Certain infectious and parasitic diseases 2004 27.6 AL
# combining them into one dataframe
df2 <- union_all(state, nation)
## Warning in bind_rows_(x, .id): binding factor and character vector,
## coercing into character vector
## Warning in bind_rows_(x, .id): binding character and factor vector,
## coercing into character vector
head(df2)
## ICD.Chapter Year Crude.Rate State
## 1 Certain infectious and parasitic diseases 1999 24.6 AL
## 2 Certain infectious and parasitic diseases 2000 26.7 AL
## 3 Certain infectious and parasitic diseases 2001 27.1 AL
## 4 Certain infectious and parasitic diseases 2002 27.1 AL
## 5 Certain infectious and parasitic diseases 2003 30.0 AL
## 6 Certain infectious and parasitic diseases 2004 27.6 AL
# create the layout with ui: the user selects a State and a Cause
ui <- fluidPage(
  headerPanel('State Mortality Rates'),
  sidebarPanel(
    selectInput('State', 'State', unique(df2$State), selected='NY'),
    selectInput('ICD.Chapter', 'Cause', unique(df2$ICD.Chapter),
                selected='Certain infectious and parasitic diseases')
  ),
  mainPanel(
    plotlyOutput('plot1')
  )
)
# server: filter by the selected state and cause, then plot with plotly
server <- function(input, output, session) {
  # national series for the selected cause
  nationalData <- reactive({
    nation %>%
      filter(ICD.Chapter == input$ICD.Chapter)
  })
  # series for the selected state and cause
  statedata <- reactive({
    df2 %>%
      filter(State == input$State, ICD.Chapter == input$ICD.Chapter)
  })
  # state and national rows together, one line per series
  combined <- reactive({
    merge(x = nationalData(), y = statedata(), all = TRUE)
  })
  output$plot1 <- renderPlotly({
    plot_ly(combined(), x = ~Year, y = ~Crude.Rate, color = ~State,
            type = 'scatter', mode = 'lines')
  })
}
shinyApp(ui = ui, server = server)