HOMEWORK #9: Interactive Graphic

My goal was to make an interactive world map featuring the cause of death data set from Kaggle.

-The code below is some wrangling to get the cause of death data on a world map.

-I then used plotly to show the data as hover information on the map.

-Then I decided to make a Shiny app using this data and the world map. My objective was to have the user choose the year and the specific cause of death, and then a choropleth of the world would appear showing how many people died in that particular year as a result of the disease/condition. The purpose of such an app would be to show global trends in causes of death. For example, we can see that deaths as a result of nutritional deficiencies increased between 1990 and 2019:

[Figures: Deaths by nutritional deficiencies, 1990; Deaths by nutritional deficiencies, 2019]

This Shiny app was successful in my local environment. However, when I published this app including the data from all years (1990-2019) joined with the world map data, which produced a dataset of almost 3 million rows, I ran into an “out of memory” error.

After some troubleshooting, I was still not able to get the app to function correctly. I was tempted to upgrade my account to get more than 1 GB of memory; however, I suspect the issue is more likely my inexperience with Shiny and my inefficient code.
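
One way to keep memory down might be to filter the cause of death data to the selected year before joining it to the map polygons, so only one year's worth of rows exists at a time. The sketch below uses small made-up tables in place of `world_map` and `cod`; `map_for_year` is a hypothetical helper name, not something from the app.

```r
library(dplyr)

# Toy stand-ins for world_map and cod (values are made up for illustration)
toy_map = tibble(
  region = rep(c("A", "B"), each = 3),
  long   = c(0, 1, 1, 2, 3, 3),
  lat    = c(0, 0, 1, 0, 0, 1),
  group  = rep(c(1, 2), each = 3)
)
toy_cod = tibble(
  region     = rep(c("A", "B"), times = 2),
  Year       = rep(c(2000, 2001), each = 2),
  Meningitis = c(10, 20, 11, 21)
)

# Hypothetical helper: filter to one year first, then join to the polygons.
# Starting from the map keeps the polygon rows in their original order.
map_for_year = function(map_df, cod_df, yr) {
  one_year = cod_df %>% filter(Year == yr)
  map_df %>% left_join(one_year, by = "region")
}

one_year = map_for_year(toy_map, toy_cod, 2000)
```

Assuming one row per region and year, the result has as many rows as the map data itself instead of one copy of the polygons per year.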

I was able to publish a pared-down version of the app, which prompts the user to select a cause of death of interest and displays a world choropleth of that data for the year 2000.

Cause of Death Shiny App
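
For readers who have not used Shiny, a minimal sketch of this kind of app is shown here. It is not the published app's code: the toy tables stand in for the world map and the year-2000 data, and the input and output ids (`cause`, `map`) are names chosen for illustration.

```r
library(shiny)
library(dplyr)
library(ggplot2)

# Toy stand-ins for the map polygons and the year-2000 data (made-up values)
toy_map = tibble(
  region = rep(c("A", "B"), each = 3),
  long   = c(0, 1, 1, 2, 3, 3),
  lat    = c(0, 0, 1, 0, 0, 1),
  group  = rep(c(1, 2), each = 3)
)
toy_cod = tibble(
  region     = c("A", "B"),
  Meningitis = c(10, 20),
  Malaria    = c(5, 0)
)

# Every column except region is offered as a selectable cause
causes = setdiff(names(toy_cod), "region")

ui = fluidPage(
  selectInput("cause", "Cause of death:", choices = causes),
  plotOutput("map")
)

server = function(input, output) {
  output$map = renderPlot({
    toy_map %>%
      left_join(toy_cod, by = "region") %>%
      # .data[[...]] looks up the fill column by the name the user picked
      ggplot(aes(long, lat, group = group, fill = .data[[input$cause]])) +
      geom_polygon(color = "black") +
      labs(fill = input$cause)
  })
}

# Builds the app object without launching it; runApp(app) would start it
app = shinyApp(ui, server)
```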

In order to make this app more polished as a portfolio piece, I will:

-Transform the data to show the number of deaths per 100k in each country, as opposed to a total death count (in the current dataset, China appears red quite often due to population size).

-Include data for each year in a way that doesn’t prevent the Shiny App from functioning as a web app.

-Give the user an option of how they would like data graphically represented (choropleth, bar graph, scatterplot)
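
The per-100k transformation in the first item comes down to dividing each death count by the population and multiplying by 100,000. A rough sketch with made-up numbers, assuming the population column is named `2000 Population` as in the joined data (the output column name `meningitis_per_100k` is only illustrative):

```r
library(dplyr)

# Toy stand-in for the joined data; the numbers are invented for illustration
toy = tibble(
  region            = c("A", "B"),
  Meningitis        = c(500, 50),
  `2000 Population` = c(1e7, 5e5)
)

# deaths per 100k = deaths / population * 100,000
per_100k = toy %>%
  mutate(meningitis_per_100k = Meningitis / `2000 Population` * 1e5)
```

In this toy example region A has ten times the deaths of region B but half the rate, which is the distortion the raw counts hide.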

This journey into the world of Shiny has been challenging but extremely rewarding.

DATA WRANGLING:

  1. Loading cause of death dataset and world population dataset (from Kaggle)
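#loading packages used throughout (map_data() also needs the maps package installed, and coord_map() needs mapproj)
library(tidyverse)
library(plotly)
#reading in dataset 
cod = read_csv("cause_of_deaths.csv")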
## Rows: 6120 Columns: 34
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (2): Country/Territory, Code
## dbl (32): Year, Meningitis, Alzheimer's Disease and Other Dementias, Parkins...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#world population data (only the 2000 column is kept below)
pop = read_csv("world_population.csv")
## Rows: 234 Columns: 17
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (4): CCA3, Country/Territory, Capital, Continent
## dbl (13): Rank, 2022 Population, 2020 Population, 2015 Population, 2010 Popu...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
#renaming to make region column the same 
pop = pop %>% 
  rename(region = "Country/Territory") %>% select(region, `2000 Population`)

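#anchored pattern so only an exact match is renamed (note: the Republic of the Congo and the Democratic Republic of the Congo are different countries, so check which one the data intends)
pop$region = gsub("^Republic of the Congo$", "Democratic Republic of the Congo", pop$region)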

#rename country column to 'region' (for joining later to map data)
cod = cod %>% 
  rename(region = "Country/Territory")

#making sure "United States" exists
cod %>% 
  filter(region == "United States")
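#checking how Congo is named
cod %>% 
  filter(region == "Congo")
#changing Congo to full name; the pattern is anchored so names that merely contain "Congo" are left untouched
cod$region = gsub("^Congo$", "Democratic Republic of the Congo", cod$region)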
cod
cod2000 = cod %>% filter(Year == 2000)
cod2000
  2. Loading world map data
world_map = map_data("world")

#testing world map
world_map%>% 
  ggplot(aes(map_id = region)) +
  geom_map(map = world_map)+
  expand_limits(x = world_map$long, y = world_map$lat)

#checking names of countries 
world_map %>% 
  filter(region == "United States")
#renaming USA to United States
world_map$region = gsub("USA", "United States", world_map$region)

world_map %>% 
  filter(region == "Democratic Republic of the Congo")
  3. Joining cause of death dataset with world map data (on col region)
#only 1 year: 2000
ds = world_map %>% left_join(cod2000)
## Joining, by = "region"
ds = ds %>% left_join(pop)
## Joining, by = "region"
ds
#now cause of death data can be mapped with geom_polygon in ggplot 

#joining data set with all the years...this is the BIG ONE, but I'd like the Shiny app to let the user pick the year. I will omit population data because I don't have populations for each year in the original data set. 


#ds5 = cod %>% inner_join(world_map)
#ds5
#ds5 has almost 3 million rows...my joining every year and every country with all the map data is just too big. I've deployed a shiny app with it, but it crashes with an "out of memory" error. Seems like there would be a cleaner way to handle this task. 

cod_allyears = world_map %>% left_join(cod)
## Joining, by = "region"
cod_allyears
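
Instead of checking country names one at a time before a join like this, an `anti_join` can list every region in the cause of death data that has no match in the map data. This is only a sketch with toy tables standing in for `cod` and `world_map`:

```r
library(dplyr)

# Toy stand-ins; the real tables have many more rows and columns
toy_cod = tibble(
  region     = c("United States", "Congo", "France"),
  Meningitis = c(1, 2, 3)
)
toy_map = tibble(
  region = c("USA", "Democratic Republic of the Congo", "France")
)

# Regions present in the death data but missing from the map data
unmatched = toy_cod %>%
  anti_join(toy_map, by = "region") %>%
  distinct(region)
```

Here `unmatched` holds "United States" and "Congo", the two names that would need renaming before the join.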
  4. Testing dataset, filtered to one year, fill as one disease
ggplot(cod_allyears %>% filter(Year == 2000), aes(x = long, y =  lat, fill = Meningitis)) + 
    geom_polygon(aes(group = group), color = "coral1")  +
    expand_limits(x =cod_allyears$long, y = cod_allyears$lat) + 
    coord_map("mercator", xlim = c(-180,180)) + 
    theme(
      axis.title.x = element_blank(), 
      axis.title.y = element_blank(),
      axis.text.x = element_blank(), 
      axis.text.y = element_blank(),
      axis.line.y.left = element_blank(), 
      axis.line.x.bottom = element_blank(), 
      axis.ticks = element_blank())

  5. Mapping data with ggplotly
plot = ds %>%
  ggplot(aes(long, lat)) + 
  geom_polygon(aes(group = group, fill = region), color = "black") + 
  expand_limits(x =ds$long, y = ds$lat) + 
  coord_map("mercator", xlim = c(-180,180)) + 
  theme(legend.position = "none")

ggplotly(plot)
  6. Adding labels for hoverinfo:
this.year = 2019

plot = cod_allyears %>%
  filter(Year == this.year) %>% 
  ggplot(aes(x = long, y = lat, group = group,  
             label = `Parkinson's Disease`, 
             label1 = `Alzheimer's Disease and Other Dementias`,
             label2 = `Meningitis`,
             label3 = `Nutritional Deficiencies`, 
             label4 = `Interpersonal Violence`, 
             label5 = `Drug Use Disorders`, 
             label6 = `Lower Respiratory Infections`, 
             label7 = `Self-harm`, 
             label8 = `Environmental Heat and Cold Exposure`, 
             label9 = `Diabetes Mellitus`, 
             label10 = `Protein-Energy Malnutrition`,
             label11 = `Cirrhosis and Other Chronic Liver Diseases`, 
             label12 = `Acute Hepatitis`, 
             label13 = `Malaria`, 
             label14 = `Maternal Disorders`, 
             label15 = `Tuberculosis`, 
             label16 = `Neonatal Disorders`, 
             label17 = Neoplasms, 
             label18 = `Chronic Kidney Disease`, 
             label19 = `Road Injuries`, 
             label20 = `Digestive Diseases`, 
             label22 = `Drowning`, 
             label23 = `HIV/AIDS`, 
             label24 = `Cardiovascular Diseases`, 
             label25 = `Alcohol Use Disorders`, 
             label26 = `Diarrheal Diseases`, 
             label27 = `Conflict and Terrorism`, 
             label28 = Poisonings, 
             label29 = `Chronic Respiratory Diseases`, 
             label30 = `Fire, Heat, and Hot Substances`
             )) + 
  geom_polygon(aes(fill = region ), color = "black") + 
  expand_limits(x =cod_allyears$long, y = cod_allyears$lat) + 
  coord_map("mercator", xlim = c(-180,180)) + 
  theme(legend.position = "none")



ggplotly(plot)