Instructions

Write this homework acting as if I don’t know what I asked you. For example, don’t simply list question numbers for the headings. If you gave this document to someone else who didn’t know the assignment, they should be able to understand what you did by reading the headings, code, and accompanying text.

Look to my HW1 and RMarkdown Organization examples for how to write good headings and organize your assignment.

This HW is worth 10 total points.

This assignment is pretty straightforward but offers a lot of room for creativity. I look forward to seeing what you’ll create!

  1. Change the author and date fields in the header above to your name and the date.

  2. Make sure to load any packages you may need right at the start. Do NOT include the learnr package, ever, unless you are writing an interactive Tutorial (which you won’t do in this) - this will cause problems.

  3. Ensure that no chunks have the include = FALSE or echo = FALSE option, as I want to be able to see all your code and output.

  4. Brief but descriptive headings and document organization (answers under headings, text near relevant code, brief explanatory text as indicated below, etc.) (1 pt)

  5. Create a map. It must have at least two groups/regions. Those groups/regions should be colored according to some value of a third variable. In other words, make a map that displays some interesting differences across a geographic area.

    You can choose any geographic area you want and can find map data for in R. Examples might include U.S. states; different countries in Europe, Asia, Africa, or the Americas; and Chinese provinces. (Because I suspect many of you might be interested in mapping within China: For some reason, at least for me, using map_data("china") and the ggplot() methods we discussed produces an incorrect map with odd, piecemeal provincial borders. I do not know why. If you have the same problem, as a solution I recommend using Guangchuang Yu’s chinamap package. You install it using remotes::install_github("GuangchuangYu/chinamap"), then create the map data frame using the get_map_china() function. From there you should be able to use the techniques we learned to generate a provincial map. Here are some examples of maps he made with that package. Note he uses a slightly different technique, geom_map(), but geom_polygon() seems to work just fine, too. Let me know if you’re having difficulty here, though!)

    The variable you map can be anything that varies across the geographic area you’re mapping. It cannot be from a data frame included in Tutorial 7.1, but otherwise you are free to choose whatever interests you! Note this part of the assignment does require you to find some data on your own (though it can be a data frame we’ve worked with in the past), import it, and link it to the map data. Linking may be a bit tricky - remember the region names will have to match exactly, including in capitalization. Use functions like str_to_lower() to help with that.

  6. Create an interactive plot using plot_ly().

    You may either update a former static plot you made (for example, in your QTM 150 Final Project) to be interactive, or create an entirely new interactive plot. You may not simply alter any plot used in any prior Tutorial.

    You should give the plot a brief but descriptive title, and informative axis titles. You don’t explicitly know how to do this yet, but part of the challenge is to research it on your own (via Google or other means) and find out! It’s not too difficult, I promise.

    Describe in narrative text what your code is doing. Once again I want more detail than usual - you can describe any preparatory code briefly (about 1 sentence or comment per code block), but for the plotly code itself I want you to describe it essentially line by line. Basically, prove to me you can explain, in your own words, what your code is doing and how it’s working to create the interactive plot. (4.5 pts)

To submit this assignment:

Knit and submit as an HTML file just this one time (so I can see the interactivity of your plotly graphs).

-----BEGIN ANSWER BELOW-----

Creating Maps and Interactive Plots using plot_ly()


Manipulating weedprices Dataset



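# Load the packages used throughout this document
library(tidyverse)     # dplyr, ggplot2, etc. (map_data() also needs maps installed)
library(plotly)        # interactive plots
library(nycflights13)  # assumed source of the flights data used later

# fileEncoding = "UTF-8-BOM" drops the byte-order mark that otherwise
# garbles the first column name into "ï..State"
weed_prices <- 
  read.csv("C:/Users/Sahithi Gangaram/Desktop/QTM151/weedprices.csv",
           fileEncoding = "UTF-8-BOM") 

weedprices2 <- weed_prices %>% 
    rename(state = State)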
# Strip the dollar signs so the price columns can be treated as numbers
weedprices2$HighQ <- as.numeric(gsub("\\$", "", weedprices2$HighQ))
weedprices2$MedQ <- as.numeric(gsub("\\$", "", weedprices2$MedQ))
weedprices2$LowQ <- as.numeric(gsub("\\$", "", weedprices2$LowQ))

weedprices3 <- weedprices2 %>% 
    as_tibble() %>% 
    group_by(state) %>% 
    summarize(avg_HighQ = mean(HighQ))

western <- c("montana", "wyoming", "colorado", "new mexico", "idaho", "utah",
             "arizona", "washington", "oregon", "nevada", "california", "alaska",
             "hawaii")
midwestern <- c("ohio", "michigan", "indiana", "illinois", "wisconsin", 
                "minnesota", "iowa", "missouri", "north dakota", "south dakota",
                "nebraska",  "kansas")
southern <- c("maryland", "delaware", "west virginia", "district of columbia", 
              "virginia", "kentucky", "north carolina", "south carolina", 
              "tennessee", "georgia", "florida", "mississippi", "alabama", 
              "arkansas", "louisiana", "oklahoma", "texas")
northeastern <- c("pennsylvania", "new jersey", "new york", "connecticut", 
                  "massachusetts", "rhode island", "vermont", "new hampshire", 
                  "maine")
 
# Used dplyr function mutate to make all state values lowercase 
weedprices4 <- weedprices3 %>%
      mutate(state = tolower(state))


us_state <- map_data("state")

weed_prices_withRegion <- us_state %>% as_tibble() %>% 
  mutate(subregion = case_when(region %in% western ~ "western",
                            region %in% midwestern ~ "midwestern",
                            region %in% southern ~ "southern",
                            region %in% northeastern ~ "northeastern"))

us_state_weed <- weed_prices_withRegion %>%
  as_tibble() %>% 
  left_join(weedprices4, by = c("region" = "state"), copy = TRUE)

In the above code chunk, I first renamed the state column and removed the dollar signs from the price columns so they could be treated as numbers. Then I averaged the high-quality weed prices within each state. Next, I assigned each state in the map data to a subregion using case_when(). Finally, I linked the average prices to the map data with left_join(), matching on the lowercase state names.
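Because the join only links rows whose state names match exactly, it can help to check for mismatches before drawing the map. The sketch below is one way to do that, not part of the original analysis: `map_regions` and `prices` are hypothetical stand-ins for the map data and the price summary, and the price values are invented. It uses dplyr's anti_join() to list names that appear in one table but not in the other.

```r
library(dplyr)

# Hypothetical stand-in for the region names found in the map data
map_regions <- tibble(region = c("georgia", "florida", "district of columbia"))

# Hypothetical stand-in for the price summary (invented values)
prices <- tibble(state = c("Georgia", "Florida", "Washington DC"),
                 avg_HighQ = c(320, 300, 350)) %>%
  mutate(state = tolower(state))

# Map regions with no matching price row (these would be drawn with an NA fill)
unmatched_map <- anti_join(map_regions, prices, by = c("region" = "state"))

# Price rows with no matching map region (a left join from the map drops these)
unmatched_prices <- anti_join(prices, map_regions, by = c("state" = "region"))

unmatched_map
unmatched_prices
```

If both results are empty, every region has a price and every price has a region. Any rows that do show up point to names that need recoding before the join.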


Creating the Map


ggplot(us_state_weed, aes(x = long, y = lat)) + # Define the data set and x, y axes
  geom_polygon(aes(group = group,
                   # fill = the color gradient representing each state's average price
                   fill = avg_HighQ,
                   # color = the outline color representing each state's subregion
                   color = subregion)) +
  # Define the fill legend labels, color scheme, and direction of colors
  scale_fill_distiller(type = "div", palette = "BrBG", labels = scales::label_dollar(), 
                       direction = -1) + 
  # Define the outline color scheme
  scale_color_brewer(palette = "Set1") +
  # Fix the aspect ratio of the coordinate plane
  coord_fixed(1.3) + 
  # Define the labels for the title, axes, and legends
  # (the \n starts a new line in the title without embedding stray spaces)
  labs(title = "Average High-Quality Marijuana Prices Across U.S. States,\nGrouped by Subregion",
       x = "Longitude",
       y = "Latitude",
       fill = "Price",
       color = "Subregion") + 
  # Center the title and set its font size
  theme(plot.title = element_text(hjust = 0.5, size = 13),
        # Place the legends on the left side of the plot
        legend.position = "left") 

Creating an Interactive Plot using plot_ly()

flights %>%
  # First, I counted the number of flights (one row per flight) for each
  # combination of carrier and origin airport.
  # These counts let me draw a bar chart that resembles a histogram
  count(carrier, origin) %>% 
  # Key the data by origin airport so that selecting a bar highlights the bars
  # from the same airport. This step comes before plot_ly() because
  # highlight_key() works on the data, not on a finished plot
  highlight_key(~ origin) %>% 
  # Define the x and y axes, the third variable used to color and stack the
  # bars, and the trace type
  plot_ly(x = ~carrier, y = ~n, color = ~origin, type = 'bar') %>% 
  # Here I set the axis titles and the bar style (stacked)
  layout(yaxis = list(title = 'Count'), xaxis = list(title = 'Carrier'), barmode = 'stack')
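The assignment also asks for a descriptive plot title. A minimal sketch of how one could be set is shown here, assuming the title goes through the `title` argument of layout() alongside the axis titles. `toy_counts` is a hypothetical data frame standing in for the output of count(carrier, origin), with invented counts, and the title wording is only an illustration.

```r
library(plotly)

# Hypothetical counts standing in for count(carrier, origin); values are invented
toy_counts <- data.frame(
  carrier = rep(c("AA", "UA"), each = 3),
  origin  = rep(c("EWR", "JFK", "LGA"), times = 2),
  n       = c(10, 20, 30, 40, 50, 60)
)

p <- plot_ly(toy_counts, x = ~carrier, y = ~n, color = ~origin, type = "bar") %>%
  # title sets the plot title; xaxis and yaxis each take a list of axis options
  layout(title = "Number of Flights by Carrier and Origin Airport",
         xaxis = list(title = "Carrier"),
         yaxis = list(title = "Number of Flights"),
         barmode = "stack")

p
```

Printing `p` renders the widget in the knitted HTML file. Hovering over a bar segment shows the carrier, the count, and the origin airport.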