Small Brand Visits on Friday

In general, we see the same trends as with the big-name brands. The normal distribution over weekdays appears far stronger. Unlike with the top 15 big brands, the top three industry types lead the rest by a significant margin. It is no surprise that recreational activities increase in popularity, as shown by the increased number of visits to bars and restaurants. It is also worth noting that several industries, such as airport operations and non-alcoholic beverage bars, maintain their positions.

---
title: "Data Translation Project Submission"
author: "Emily Tennyson"
output: 
  html_document:
    toc: yes
    toc_float: true
    toc_depth: 4
    code_download: TRUE
---

<style>
body {
text-align: justify}
</style>


```{r setup, warning = FALSE, message = FALSE, include = FALSE}
#rm(list=ls(all=TRUE))

knitr::opts_chunk$set(echo =FALSE, warning = FALSE, message = FALSE)

library("tidyverse")
library("vtable")

library("dbplyr")
# library("readxl")
library("lubridate")
library("magrittr")
library("paletteer")
library("ggeasy")
library("ggpubr")
library("gghighlight")
library("directlabels")
library("numform")
library("janitor")
library("pmdplyr")
library("ggrepel")
library("rvest")

library("GGally")
library("ggplot2")
library("seasonal")
library("forecast")

library(tigris)
library(remotes)
#remotes::install_github('yutannihilation/ggsflabel')
library("ggsflabel")

library(gridExtra)
library(reshape2)
library(grid)
library(gtable)
library("kableExtra")

library("gganimate")
library("png")
library("gifski")

library(ggalt)

```

```{r}
daily_visits <- readRDS('king_dailyvisits.Rdata')
origin_visits <- readRDS('king_originvisits_andmap.Rdata')
ov <- readRDS('king_originvisits.Rdata')
neighborhoods <- readRDS('neighborhood_names.Rdata')

```

```{r}
brand_visits <- daily_visits %>%
  group_by(brands) %>%
  summarise(
    avg_visits = round(mean(visits_by_day), digits = 0),
    med_visits = round(median(visits_by_day), digits = 0),
    max_visits = round(max(visits_by_day), digits = 0),
    min_visits = round(min(visits_by_day), digits = 0),
    total = round(sum(visits_by_day), digits = 0)
  )


day_visits <- daily_visits %>%
  group_by(date) %>%
  summarise(
    avg_visits = round(mean(visits_by_day), digits = 0),
    med_visits = round(median(visits_by_day), digits = 0),
    max_visits = round(max(visits_by_day), digits = 0),
    min_visits = round(min(visits_by_day), digits = 0),
    total = round(sum(visits_by_day), digits = 0)
  ) %>%   
  mutate(weekday = weekdays(date)) 

weekday_visits <- daily_visits %>%
  group_by(weekdays(date)) %>%
  summarise(
    avg_visits = round(mean(visits_by_day), digits = 0),
    med_visits = round(median(visits_by_day), digits = 0),
    max_visits = round(max(visits_by_day), digits = 0),
    min_visits = round(min(visits_by_day), digits = 0),
    total = round(sum(visits_by_day), digits = 0)
  ) %>% rename(
    "week_days" = "weekdays(date)"
  )

weekily_visits <- daily_visits %>%
  group_by(week(date)) %>%
  summarise(
    avg_visits = round(mean(visits_by_day), digits = 0),
    med_visits = round(median(visits_by_day), digits = 0),
    max_visits = round(max(visits_by_day), digits = 0),
    min_visits = round(min(visits_by_day), digits = 0),
    total = round(sum(visits_by_day), digits = 0)
  ) %>% rename(
    "week" = "week(date)"
  )

weekday_brand_visits <- daily_visits %>%
  group_by(weekdays(date), brands) %>%
  summarise(
    avg_visits = round(mean(visits_by_day), digits = 0),
    med_visits = round(median(visits_by_day), digits = 0),
    max_visits = round(max(visits_by_day), digits = 0),
    min_visits = round(min(visits_by_day), digits = 0),
    total = round(sum(visits_by_day), digits = 0)
  ) %>% rename(
    "week_days" = "weekdays(date)"
  )

weekly_brand_visits <- daily_visits %>%
  group_by(week(date), brands) %>%
  summarise(
    avg_visits = round(mean(visits_by_day), digits = 0),
    med_visits = round(median(visits_by_day), digits = 0),
    max_visits = round(max(visits_by_day), digits = 0),
    min_visits = round(min(visits_by_day), digits = 0),
    total = round(sum(visits_by_day), digits = 0)
  ) %>% rename(
    "week" = "week(date)"
  )

code_visits <- daily_visits %>%
  group_by(naics_code, naics_title) %>%
  summarise(
    avg_visits = round(mean(visits_by_day), digits = 0),
    med_visits = round(median(visits_by_day), digits = 0),
    max_visits = round(max(visits_by_day), digits = 0),
    min_visits = round(min(visits_by_day), digits = 0),
    total = round(sum(visits_by_day), digits = 0)
  )

weekday_code_visits <- daily_visits %>%
  group_by(naics_code, naics_title, weekdays(date)) %>%
  summarise(
    avg_visits = round(mean(visits_by_day), digits = 0),
    med_visits = round(median(visits_by_day), digits = 0),
    max_visits = round(max(visits_by_day), digits = 0),
    min_visits = round(min(visits_by_day), digits = 0),
    total = round(sum(visits_by_day), digits = 0)
  ) %>% rename(
    "week_days" = "weekdays(date)"
  )

weekly_code_visits <- daily_visits %>%
  group_by(naics_code, naics_title, week(date)) %>%
  summarise(
    avg_visits = round(mean(visits_by_day), digits = 0),
    med_visits = round(median(visits_by_day), digits = 0),
    max_visits = round(max(visits_by_day), digits = 0),
    min_visits = round(min(visits_by_day), digits = 0),
    total = round(sum(visits_by_day), digits = 0)
  ) %>% rename(
    "week" = "week(date)"
  )


neighborhood_visits <- daily_visits %>%
  right_join(right_join(ov, neighborhoods)) %>%
  filter(brands != "") %>%
  group_by(GEOID) %>%
  mutate(date_day = day(date)) %>%
  group_by(NEIGHBORHOOD_DISTRICT_NAME, brands, naics_code, date, date_day) %>%
  summarise(total = sum(visits_by_day))

sd_visits <- brand_visits %>%
  filter(brands != "") %>%
  summarize(mean = mean(total),
            sd_1 = sd(total),
            sd_2 = 2*sd(total),
            sd_3 = 3*sd(total))

sd_brands <- brand_visits%>%
  arrange(-total) %>%
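  # Keep brands at least one SD above the mean monthly total
  # (previously hard-coded as 1342.789 + 11913.36, assumed to come from sd_visits)
  filter(total >= (sd_visits$mean + sd_visits$sd_1)) %>%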
  select(brands)
```

# Macro Analysis of the Month of July

## Visit Distribution Over the Month

In general, location visits tend to dip on weekend days (Saturdays and Sundays), and the highest number of visits tends to occur on Fridays. Over the month, the daily average ranges from at least 84 to as many as 155 visits per location. Notably, there is a peak on a non-Friday, July 15, which could indicate an important event after July 4th, such as a big holiday or sale day.
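
As a rough illustration of how a peak such as July 15 could be picked out, the sketch below builds a toy series of daily averages and flags non-Friday days that sit more than one standard deviation above the monthly mean. The visit numbers are made up for the example and are not taken from the King County data; the names `toy_days`, `z`, and `peaks`, and the one-SD cutoff, are assumptions for this sketch.

```r
# Toy daily averages for July 2020 (hypothetical values, for illustration only)
toy_days <- data.frame(
  date = seq(as.Date("2020-07-01"), as.Date("2020-07-31"), by = "day")
)

# Locale-independent weekday index: 0 = Sunday, 5 = Friday, 6 = Saturday
toy_days$wday <- as.POSIXlt(toy_days$date)$wday

toy_days$avg_visits <- 110                                    # ordinary weekday
toy_days$avg_visits[toy_days$wday == 5] <- 140                # Friday bump
toy_days$avg_visits[toy_days$wday %in% c(0, 6)] <- 90         # weekend dip
toy_days$avg_visits[toy_days$date == as.Date("2020-07-15")] <- 150  # one-off peak

# Standardize each day against the monthly mean and standard deviation
toy_days$z <- (toy_days$avg_visits - mean(toy_days$avg_visits)) /
  sd(toy_days$avg_visits)

# Non-Friday days more than one SD above the mean (assumed cutoff)
peaks <- toy_days[toy_days$wday != 5 & toy_days$z > 1, c("date", "avg_visits")]
print(peaks)
```

With the real data, `day_visits` could stand in for `toy_days`, since it already carries `date` and `avg_visits`.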

```{r fig.height=7, fig.width=12,fig.align="center"}
ggplot(day_visits, aes(x = date)) +
  geom_ribbon(aes(ymin = avg_visits - 3*sd(avg_visits), ymax = avg_visits + 3*sd(avg_visits)), fill = "purple3", alpha = .25) +
  geom_line(aes(y = (avg_visits - 3*sd(avg_visits))), size = .5, color = "purple2") +
  geom_line(aes(y = (avg_visits + 3*sd(avg_visits))), size = .5, color = "purple2") +
  
  geom_ribbon(aes(ymin = avg_visits - 2*sd(avg_visits), ymax = avg_visits + 2*sd(avg_visits)), fill = "blue3", alpha = .35) +
  geom_line(aes(y = (avg_visits - 2*sd(avg_visits))), size = .5, color = "blue2") +
  geom_line(aes(y = (avg_visits + 2*sd(avg_visits))), size = .5, color = "blue2") +
  
  geom_ribbon(aes(ymin = avg_visits - sd(avg_visits), ymax = avg_visits + sd(avg_visits)), fill = "red3", alpha = .70) +
  geom_line(aes(y = (avg_visits - 1*sd(avg_visits))), size = .5, color = "red1") +
  geom_line(aes(y = (avg_visits + 1*sd(avg_visits))), size = .5, color = "red1") +
  
  scale_x_date(breaks = as.Date(c("2020-07-03", "2020-07-05", "2020-07-10", "2020-07-12", "2020-07-17", "2020-07-19", 
                                  "2020-07-24", "2020-07-26", "2020-07-31")),
               date_labels = "%b %d \n %a",
               expand = expansion(c(.02, .20))) +
  
  geom_line(aes(y = avg_visits), size = 1) +
  
  geom_text(aes(y = (119 + 3*round(sd(day_visits$avg_visits), digits = 2)),
              x = as.Date("2020-07-31"),
              label = paste((119 + round(3 * sd(day_visits$avg_visits), digits = 2)), ", +3rd SD"), 
              hjust = -0.05, vjust = 3), color = "plum3", size = 5) + 
  
  geom_text(aes(y = (119 - 3*round(sd(day_visits$avg_visits), digits = 2)),
              x = as.Date("2020-07-31"),
              label = paste((119 - round(3 * sd(day_visits$avg_visits), digits = 2)), ", -3rd SD"), 
              hjust = -0.05, vjust = -1), color = "plum3", size = 5) +
  
  geom_text(aes(y = (119 + 2*round(sd(day_visits$avg_visits), digits = 2)),
              x = as.Date("2020-07-31"),
              label = paste((119 + round(2 * sd(day_visits$avg_visits), digits = 2)), ", +2nd SD"), 
              hjust = -0.05, vjust = 2.75), color = "slateblue1", size = 5) + 
  
  geom_text(aes(y = (119 - 2*round(sd(day_visits$avg_visits), digits = 2)),
              x = as.Date("2020-07-31"),
              label = paste((119 - round(2 * sd(day_visits$avg_visits), digits = 2)), ", -2nd SD"), 
              hjust = -0.05, vjust = -1.5), color = "slateblue1", size = 5) +
  
  geom_text(aes(y = (119 + 1*round(sd(day_visits$avg_visits), digits = 2)),
              x = as.Date("2020-07-31"),
              label = paste((119 + round(1 * sd(day_visits$avg_visits), digits = 2)), ", +1st SD"), 
              hjust = -0.05, vjust = 2.25), color = "red4", size = 5) + 
  
  geom_text(aes(y = (119 - 1*round(sd(day_visits$avg_visits), digits = 2)),
              x = as.Date("2020-07-31"),
              label = paste((119 - round(1 * sd(day_visits$avg_visits), digits = 2)), ", -1st SD"), 
              hjust = -0.05, vjust = -1.), color = "red4", size = 5) +
  
  geom_text(aes(y = (119 - 0*round(sd(day_visits$avg_visits), digits = 2)),
              x = as.Date("2020-07-31"),
              label = paste((119 - round(0 * sd(day_visits$avg_visits), digits = 2)), ", Avg. Visits"), 
              hjust = -0.05, vjust = .5), color = "black", size = 5) +
  
  geom_vline(aes(xintercept = as.Date(c("2020-07-05"))), color = "gray65") +
  geom_vline(aes(xintercept = as.Date(c("2020-07-12"))), color = "gray65") +
  geom_vline(aes(xintercept = as.Date(c("2020-07-19"))), color = "gray65") +
  geom_vline(aes(xintercept = as.Date(c("2020-07-26"))), color = "gray65") +
  
  geom_vline(aes(xintercept = as.Date(c("2020-07-03"))), color = "gray65", linetype = "dashed") +
  geom_vline(aes(xintercept = as.Date(c("2020-07-10"))), color = "gray65", linetype = "dashed") +
  geom_vline(aes(xintercept = as.Date(c("2020-07-17"))), color = "gray65", linetype = "dashed") +
  geom_vline(aes(xintercept = as.Date(c("2020-07-24"))), color = "gray65", linetype = "dashed") +
  geom_vline(aes(xintercept = as.Date(c("2020-07-31"))), color = "gray65", linetype = "dashed") +
  
  ggtitle("Avg. Number of Visits Over July") +
  
  theme(line = element_line(color = "black"),
        axis.text.x = element_text(size = 16),
        axis.title.x = element_blank(),
        axis.ticks = element_blank(),
        axis.title.y = element_blank(),
        axis.text.y = element_text(size = 16),
        legend.position = "none",
        panel.background = element_blank(),
        panel.grid.major.x = element_blank(), 
        panel.grid.minor.x = element_blank(),
        panel.border = element_rect(colour = "black", fill = NA),
        plot.title = element_text(hjust = 0.5, size = 35, face = "bold"))
```

## Neighborhood Visits in the Month of July 

In general, there is relatively little fluctuation in the number of visits by neighborhood. There is a fairly consistent flow of individuals visiting the top big-name locations and neighborhoods. Each brand's number of visits per neighborhood is consistent; however, the number of visits to each brand varies daily.
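
One way to put a number on "little fluctuation" is the coefficient of variation (standard deviation divided by mean) of daily totals within each neighborhood. The sketch below is a minimal example on made-up data; the data frame `toy`, the helper `cv`, and the neighborhood labels are hypothetical and not part of the analysis above.

```r
# Hypothetical daily totals for two neighborhoods (illustrative values only)
toy <- data.frame(
  neighborhood = rep(c("A", "B"), each = 4),
  date         = rep(as.Date("2020-07-01") + 0:3, times = 2),
  total        = c(100, 102, 98, 100,   # A: steady
                   40, 120, 60, 180)    # B: volatile
)

# Coefficient of variation: spread relative to the average level
cv <- function(x) sd(x) / mean(x)

# One row per neighborhood; a smaller cv means a steadier flow of visits
fluctuation <- aggregate(total ~ neighborhood, data = toy, FUN = cv)
names(fluctuation)[2] <- "cv"
print(fluctuation)
```

Both toy neighborhoods average 100 visits per day, so the difference in `cv` reflects only how much the daily totals move around.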

```{r fig.width=20, fig.height=16, fig.align="center"}
animate((neighborhood_visits %>%
    arrange(-total) %>%
    filter(brands %in% sd_brands$brands) %>%
      ggplot(aes(y = NEIGHBORHOOD_DISTRICT_NAME, x = brands, size = total, color= as.factor(naics_code), frame = date)) +
      geom_point(aes(alpha = total), show.legend = TRUE) +
      
      scale_size(breaks = c(-1:20), range = c(1,20)) +
  
      
      theme(line = element_line(color = "black"),
          axis.text.x = element_text(size = 16, angle = 90, vjust = 0.5, hjust=1),
          axis.ticks = element_blank(),
          axis.text.y = element_text(size = 16),
          axis.title = element_blank(),
          legend.position = "none",
          panel.background = element_blank(),
          panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(),
          panel.border = element_rect(colour = "black", fill = NA),
          plot.title = element_text(size = 20, face = "bold", hjust = 0.5)) +
      
      transition_states(date,
                        transition_length = 8,
                        state_length = 4) +
      ease_aes('elastic-in-out') +
    
    labs(title = 'Neighborhood Visits by Brands: {closest_state}')),
    height = 800, width = 1000,
    #nframes = 10,
    fps=7)
   
```

```{r}
animate( daily_visits %>%
    right_join(right_join(ov, neighborhoods)) %>%
    filter(brands != "") %>%
    group_by(GEOID) %>%
    mutate(totals = sum(visits_by_day),
           date =format(date, format = "%B %d")) %>%
    group_by(NEIGHBORHOOD_DISTRICT_NAME, brands, date) %>%
    summarise(total = sum(visits_by_day)) %>%
    arrange(-total) %>%
    filter(brands %in% sd_brands$brands) %>%
  
    ggplot(aes(y = total, x = NEIGHBORHOOD_DISTRICT_NAME, frame = date)) +
    geom_jitter(aes(alpha = total, size = total, color = as.factor(brands)), show.legend = TRUE) +
      
      #scale_color_gradient2(low = "blue", high = "red", mid = "purple", midpoint = 0, breaks = c(-10:20)) +
      scale_size(breaks = c(1:20), range = c(1,20)) +
    
      theme(line = element_line(color = "black"),
            axis.text.x = element_text(size = 14, angle = 90, vjust = 0.5, hjust=1),
            axis.ticks = element_blank(),
            axis.text.y = element_text(size = 14),
            axis.title = element_blank(),
            legend.position = "none",
            panel.background = element_blank(),
            panel.grid.major = element_blank(), 
            panel.grid.minor = element_blank(),
            panel.border = element_rect(colour = "black", fill = NA),
            plot.title = element_text(size = 20, face = "bold", hjust = 0.5)) +
  
      transition_states(date,
                        transition_length = 8,
                        state_length = 4) +
        
      shadow_wake(wake_length = 0.1, alpha = FALSE) +
      
      labs(title = 'Neighborhood Visits by Brands: {closest_state}'),
    height = 800, width = 1000,
    #nframes = 10,
    fps=7)
```


## Average Weekly Visit Analysis 

Visits to big-name brands tend to follow a roughly normal distribution across the days of the week, peaking on Friday. Over the month of July we see around 667,000 visits through cellphone data alone. Interestingly, the distribution looks more uniform when looking at average visits per day than at total visits per day of the week.
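
One possible reason totals and averages tell different stories is that not every weekday occurs the same number of times in a month. In July 2020, Wednesdays, Thursdays, and Fridays each fall on five dates while the other days fall on four, so their totals are inflated even if visits per day were identical. The sketch below shows this with a hypothetical constant of 100 visits per day; `visits_per_day` and the other names are assumptions for illustration.

```r
# Every calendar day in July 2020
july <- seq(as.Date("2020-07-01"), as.Date("2020-07-31"), by = "day")

# Locale-independent weekday index: 0 = Sunday ... 6 = Saturday
day_index <- as.POSIXlt(july)$wday
day_names <- c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat")

# How many times each weekday occurs in the month
occurrences <- table(factor(day_index, levels = 0:6, labels = day_names))

# Hypothetical: the same number of visits on every single day
visits_per_day <- 100

weekday_summary <- data.frame(
  week_day    = names(occurrences),
  occurrences = as.vector(occurrences),
  total       = as.vector(occurrences) * visits_per_day
)

# Totals differ by weekday, but the per-day average is flat
weekday_summary$avg_per_day <- weekday_summary$total / weekday_summary$occurrences
print(weekday_summary)
```

This only accounts for the calendar effect; any remaining gap between weekdays in the real averages would reflect actual differences in visiting behavior.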

```{r fig.width=14, fig.height=6, fig.align="center"}
gridExtra::grid.arrange(
      (ggplot(weekday_visits, aes(y = total, 
                               x = factor(weekday_visits$week_days, levels = as.character(wday(c(2:7,1), label=TRUE, abbr=FALSE))), 
                               fill = week_days, color = week_days)) +
         geom_col(stat="identity", position = "identity", show.legend = FALSE)  +
         
         scale_x_discrete(name = NULL,
                         labels = c("Mon", "Tues", "Wed", "Thurs", "Fri", "Sat", "Sun")) +
         
         scale_y_continuous(labels = scales::comma, 
                           name = "Total Visits During July",
                           expand = expansion(mult = c(0, .05))) +
         
         geom_text(position = position_dodge(width= .8), 
                  aes(label = paste(scales::comma(total)), size = 14, fontface = "bold", vjust = 1.5),
                  color = "white"
                  ) +
         
         geom_text(position = position_dodge(width= .8), 
                  aes(label = paste("Visits"), vjust = 3),
                  color = "white"
                  ) +
         
         ggtitle("Total Visits During July") +
         
         theme(line = element_line(color = "black"),
              axis.text.x = element_text(size = 16),
              axis.title.x = element_text(size = 20, face = "bold"),
              axis.ticks = element_blank(),
              axis.title.y = element_blank(),
              axis.text.y = element_blank(),
              legend.position = "none",
              panel.background = element_blank(),
              panel.grid.major.x = element_blank(), 
              panel.grid.minor = element_blank(),
              panel.border = element_rect(colour = "black", fill = NA),
              plot.title = element_text(hjust = 0.5, size = 16, face = "bold"))),
      
      (ggplot(weekday_visits, aes(y = avg_visits, 
                               x = factor(weekday_visits$week_days, levels = as.character(wday(c(2:7,1), label=TRUE, abbr=FALSE))), 
                               fill = week_days, color = week_days)) +
         geom_col(stat="identity", position = "identity", show.legend = FALSE) +
      
        scale_x_discrete(name = NULL,
                         labels = c("Mon", "Tues", "Wed", "Thurs", "Fri", "Sat", "Sun")) +
        scale_y_continuous(labels = scales::comma, 
                           name = "Avg. Visits During July",
                           expand = expansion(mult = c(0, .05))) +
        
        geom_text(position = position_dodge(width= .8), 
                  aes(label = paste(scales::comma(avg_visits)), size = 14, fontface = "bold", vjust = 1.5),
                  color = "white"
                  ) +
        
        geom_text(position = position_dodge(width= .8), 
                  aes(label = paste("Visits \n per Day"), vjust = 1.75),
                  color = "white"
                  ) +
        
        ggtitle("Average Visits During July") +
        
        theme(line = element_line(color = "black"),
              axis.text.x = element_text(size = 16),
              axis.title.x = element_text(size = 20, face = "bold"),
              axis.ticks = element_blank(),
              axis.title.y = element_blank(),
              axis.text.y = element_blank(),
              legend.position = "none",
              panel.background = element_blank(),
              panel.grid.major.x = element_blank(), 
              panel.grid.minor = element_blank(),
              panel.border = element_rect(colour = "black", fill = NA),
              plot.title = element_text(hjust = 0.5, size = 16, face = "bold"))),
      ncol = 2,
      top =   textGrob(expression(bold("Large Brand Visit Distribution Over Weekdays in July")), 
                   gp = gpar(fontsize=35,font=8), 
                   hjust = .5)
      )
```

# Friday Visits Breakdown

## Big Name Brands and NAICS Code Breakdown

In general, the top brand leads the other brands by a large margin. Looking at the NAICS codes, the top brand does not belong to the top NAICS industry patronized on Fridays. Moreover, the top two big-name brands switch places when ranked by their industry types.

Brand visits tend to be larger than industry visits in neighborhood locations.

```{r fig.width=14, fig.height=10, fig.align="center"}
weekday_brand_visits_15 <- subset(weekday_brand_visits, week_days == "Friday" & brands != "") %>%
  arrange(- total) %>%
  head(15) %>%
  mutate(rank = row_number())


weekday_code_visits_15 <- subset(weekday_code_visits, week_days == "Friday" & naics_title != "") %>%
  arrange(- total) %>%
  head(15) %>%
  mutate(rank = row_number()) 

gridExtra::grid.arrange(
  (weekday_brand_visits_15 %>%
    ggplot(aes(y = reorder(brands, total), x = total)) +
    geom_segment(aes(x = total, xend = rank, yend = brands), size = 2, color = "darkgray") +
    
    geom_point(size = 8, color="tomato") +
     
     geom_point(data = weekday_brand_visits_15 %>%
                  filter(brands == "UW Medicine"),
                size = 10, color = "purple") +
     
     geom_point(data = weekday_brand_visits_15 %>%
                  filter(brands == "Starbucks"),
                size = 10, color = "dark green") +
  
    scale_y_discrete(name = NULL) +
    scale_x_continuous(name = "Brand Type Visits",
                       position = "top",
                       limits = c(0, 40000),
                       expand = c(0, 0)) +
    
    geom_text(aes(label = paste(total, "visits"), hjust = -.25), size = 5) +
    
    theme(line = element_line(color = "black"),
          axis.text.x = element_blank(),
          axis.ticks = element_blank(),
          axis.title.x = element_text(size = 20, face = "bold"),
          axis.text.y = element_text(size = 16),
          legend.position = "none",
          panel.background = element_blank(),
          panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(),
          panel.border = element_rect(colour = "black", fill = NA))),
  
  (weekday_code_visits_15 %>%
    ggplot(aes(y = reorder(naics_title, total), x = -total)) +
    geom_segment(aes(x = 0 - total, xend = -rank, yend = naics_title), size = 2, color = "darkgray") +
  
    geom_point(size = 8, color="tomato") +
     
     geom_point(data = weekday_code_visits_15 %>%
                  filter(naics_code == 622110),
                size = 10, color = "purple") +
     
     geom_point(data = weekday_code_visits_15 %>%
                  filter(naics_code == 722515),
                size = 10, color = "dark green") +
    
    scale_y_discrete(name = NULL,
                     labels = c("Drinking Places \n (Alcoholic Beverages)",
                                "Hardware Stores",
                                "Hotels (except Casino Hotels) \n and Motels",
                                "Department Stores",
                                "All Other General \n Merchandise Stores",
                                "Fitness and Recreational \n Sports Centers",
                                "Other Airport Operations",
                                "Supermarkets and Other \n Grocery (except Convenience) Stores",
                                "Gasoline Stations with \n Convenience Stores",
                                "General Medical and \n Surgical Hospitals",
                                "Snack and Nonalcoholic \n Beverage Bars",
                                "Limited-Service Restaurants",
                                "Nature Parks and Other \n Similar Institutions",
                                "Full-Service Restaurants",
                                "Lessors of Nonresidential \n Buildings (except Miniwarehouses)"),
                     position = "right")  +
    
     
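     # x is already negated in aes(), so the continuous scale below (negative limits)
     # mirrors the chart; a preceding scale_x_reverse() would simply be replaced by it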
     
     scale_x_continuous(name = "Industry Type Visits",
                       position = "top",
                       limits = c(-76910 - 65000, 0),
                       expand = c(0, 0)) +

    geom_text(aes(label = paste(total, "visits"), hjust = 1.15), size = 5) +
    
    theme(line = element_line(color = "black"),
          axis.text.x = element_blank(),
          axis.ticks = element_blank(),
          axis.title.x = element_text(size = 20, face = "bold"),
          axis.text.y = element_text(size = 16),
          legend.position = "none",
          panel.background = element_blank(),
          panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(),
          panel.border = element_rect(colour = "black", fill = NA))),
  ncol = 2,
  widths = c(.95, 1),
  top =   textGrob(expression(bold("Top 15 Locations and Industries Patronized on Friday")), 
                   gp = gpar(fontsize=35,font=8), 
                   hjust = .5)
)
```

```{r message = FALSE, include = FALSE}
ov <- readRDS('king_originvisits.Rdata')

# get Washington map data
mapdata <- block_groups('WA','033', cb = TRUE)
neighborhoods <- readRDS('neighborhood_names.Rdata')
```

```{r}
combo_totals <- daily_visits %>%
  right_join(right_join(ov, neighborhoods)) %>%
  mutate(
    week_day = weekdays(date)
  ) %>%
  filter(brands %in% weekday_brand_visits_15$brands)


combo_totals <- combo_totals %>%
  left_join(combo_totals %>%
    filter(week_day == "Monday") %>%
    group_by(brands, naics_code, NEIGHBORHOOD_DISTRICT_NAME) %>%
    summarize(
     Monday_Total = sum(visits_by_day)
    ))%>%
  left_join(combo_totals %>%
    filter(week_day == "Tuesday") %>%
    group_by(brands, naics_code, NEIGHBORHOOD_DISTRICT_NAME) %>%
    summarize(
     Tuesday_Total = sum(visits_by_day)
    ))%>%
  left_join(combo_totals %>%
    filter(week_day == "Wednesday") %>%
    group_by(brands, naics_code, NEIGHBORHOOD_DISTRICT_NAME) %>%
    summarize(
     Wednesday_Total = sum(visits_by_day)
    ))%>%
  left_join(combo_totals %>%
    filter(week_day == "Thursday") %>%
    group_by(brands, naics_code, NEIGHBORHOOD_DISTRICT_NAME) %>%
    summarize(
     Thursday_Total = sum(visits_by_day)
    ))%>%
  left_join(combo_totals %>%
    filter(week_day == "Friday") %>%
    group_by(brands, naics_code, NEIGHBORHOOD_DISTRICT_NAME) %>%
    summarize(
     Friday_Total = sum(visits_by_day)
    ))%>%
  left_join(combo_totals %>%
    filter(week_day == "Saturday") %>%
    group_by(brands, naics_code, NEIGHBORHOOD_DISTRICT_NAME) %>%
    summarize(
     Saturday_Total = sum(visits_by_day)
    ))%>%
  left_join(combo_totals %>%
    filter(week_day == "Sunday") %>%
    group_by(brands, naics_code, NEIGHBORHOOD_DISTRICT_NAME) %>%
    summarize(
     Sunday_Total = sum(visits_by_day)
    ))

combo_totals_2 <- daily_visits %>%
  right_join(right_join(ov, neighborhoods)) %>%
  mutate(
    week_day = weekdays(date)
  ) %>%
  filter(naics_code %in% weekday_code_visits_15$naics_code)


combo_totals_2 <- combo_totals_2 %>%
  left_join(combo_totals_2 %>%
    filter(week_day == "Monday") %>%
    group_by(brands, naics_code, NEIGHBORHOOD_DISTRICT_NAME) %>%
    summarize(
     Monday_Total = sum(visits_by_day)
    ))%>%
  left_join(combo_totals_2 %>%
    filter(week_day == "Tuesday") %>%
    group_by(brands, naics_code, NEIGHBORHOOD_DISTRICT_NAME) %>%
    summarize(
     Tuesday_Total = sum(visits_by_day)
    ))%>%
  left_join(combo_totals_2 %>%
    filter(week_day == "Wednesday") %>%
    group_by(brands, naics_code, NEIGHBORHOOD_DISTRICT_NAME) %>%
    summarize(
     Wednesday_Total = sum(visits_by_day)
    ))%>%
  left_join(combo_totals_2 %>%
    filter(week_day == "Thursday") %>%
    group_by(brands, naics_code, NEIGHBORHOOD_DISTRICT_NAME) %>%
    summarize(
     Thursday_Total = sum(visits_by_day)
    ))%>%
  left_join(combo_totals_2 %>%
    filter(week_day == "Friday") %>%
    group_by(brands, naics_code, NEIGHBORHOOD_DISTRICT_NAME) %>%
    summarize(
     Friday_Total = sum(visits_by_day)
    ))%>%
  left_join(combo_totals_2 %>%
    filter(week_day == "Saturday") %>%
    group_by(brands, naics_code, NEIGHBORHOOD_DISTRICT_NAME) %>%
    summarize(
     Saturday_Total = sum(visits_by_day)
    ))%>%
  left_join(combo_totals_2 %>%
    filter(week_day == "Sunday") %>%
    group_by(brands, naics_code, NEIGHBORHOOD_DISTRICT_NAME) %>%
    summarize(
     Sunday_Total = sum(visits_by_day)
    ))
```

```{r}
geo_totals_data <- combo_totals %>%
  filter(brands %in% weekday_brand_visits_15$brands & week_day == "Friday") %>%
  group_by(GEOID) %>%
  group_by(NEIGHBORHOOD_DISTRICT_NAME) %>%
  mutate(Friday_Visits = signif(sum(Friday_Total), digits = 0),
         weekday_label = 
           case_when(
             row_number() == 1 ~ paste0(NEIGHBORHOOD_DISTRICT_NAME, "\n ", scales::comma(Friday_Total), " visits"),
             TRUE ~ NA_character_
           ))


geo_map <- geo_join(mapdata, geo_totals_data, by = 'GEOID', how = 'inner')

geo_totals_data_2 <- combo_totals_2 %>%
  filter(naics_code %in% weekday_code_visits_15$naics_code & week_day == "Friday") %>%
  group_by(GEOID) %>%
  group_by(NEIGHBORHOOD_DISTRICT_NAME) %>%
  mutate(Friday_Visits = signif(sum(Friday_Total), digits = 0),
         weekday_label = 
           case_when(
             row_number() == 1 ~ paste0(NEIGHBORHOOD_DISTRICT_NAME, "\n ", scales::comma(Friday_Total), " visits"),
             TRUE ~ NA_character_
           ))

geo_map_2 <- geo_join(mapdata, geo_totals_data_2, by = 'GEOID', how = 'inner')

```


```{r fig.height=16, fig.width = 12, fig.align="center"}
gridExtra::grid.arrange(
  (ggplot(geo_map) +
    geom_sf(aes(fill = NEIGHBORHOOD_DISTRICT_NAME,
                alpha = Friday_Total)) +
    geom_sf_label_repel(aes(label = weekday_label, 
                            color = NEIGHBORHOOD_DISTRICT_NAME,
                            hjust = 0.5, vjust = 0.5), size = 3.5) +
     ggtitle("Seattle Area Big Name Brand Visits") +
    
    theme(line = element_line(color = "black"),
          axis.text.x = element_blank(),
          axis.ticks = element_blank(),
          axis.text.y = element_blank(),
          axis.title = element_blank(),
          legend.position = "none",
          panel.background = element_blank(),
          panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(),
          panel.border = element_rect(colour = "black", fill = NA),
          plot.title = element_text(hjust = 0.5, size = 16, face = "bold")
          )),

  (ggplot(geo_map_2) +
    geom_sf(aes(fill = NEIGHBORHOOD_DISTRICT_NAME,
                alpha = Friday_Total)) +
    geom_sf_label_repel(aes(label = weekday_label, 
                            color = NEIGHBORHOOD_DISTRICT_NAME,
                            hjust = 0.5, vjust = 0.5), size = 3.5) +
     
     ggtitle("Seattle Area Big Name NAICS Code Visits") +
    
    theme(line = element_line(color = "black"),
          axis.text.x = element_blank(),
          axis.ticks = element_blank(),
          axis.text.y = element_blank(),
          axis.title = element_blank(),
          legend.position = "none",
          panel.background = element_blank(),
          panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(),
          panel.border = element_rect(colour = "black", fill = NA),
          plot.title = element_text(hjust = 0.5, size = 16, face = "bold"))),
  ncol = 2,
  top =   textGrob(expression(bold("Neighborhood Visits to Top 15 Large Brands in July")), 
                   gp = gpar(fontsize=35,font=8), 
                   hjust = .5)
)
  
```

```{r}
# Weekday summary of visits to locations with no brand listed (small brands)
blank_weekday_visits <- subset(daily_visits, brands == "") %>%
  # Name the grouping column here instead of renaming it afterwards
  group_by(week_days = weekdays(date)) %>%
  summarise(
    avg_visits = round(mean(visits_by_day), digits = 0),
    med_visits = round(median(visits_by_day), digits = 0),
    max_visits = round(max(visits_by_day), digits = 0),
    min_visits = round(min(visits_by_day), digits = 0),
    total = round(sum(visits_by_day), digits = 0)
  )

# Same summary, broken out by NAICS industry title within each weekday
blank_weekday_title_visits <- subset(daily_visits, brands == "") %>%
  # Name the grouping column here instead of renaming it afterwards
  group_by(week_days = weekdays(date), naics_title) %>%
  summarise(
    avg_visits = round(mean(visits_by_day), digits = 0),
    med_visits = round(median(visits_by_day), digits = 0),
    max_visits = round(max(visits_by_day), digits = 0),
    min_visits = round(min(visits_by_day), digits = 0),
    total = round(sum(visits_by_day), digits = 0)
  )
```


## Small Brand Visits on Friday

In general, we see the same trends as with the big name brands, although the roughly bell-shaped distribution of visits across the week is far more pronounced. Unlike the top 15 big brands, the top three industry types lead the rest by a significant margin. It is no surprise that recreational activities increase in popularity, as shown by the increased number of visits to bars and restaurants. It is also worth noting that several industries maintain their positions, such as airport operations and snack and nonalcoholic beverage bars.

```{r fig.width=14, fig.height=6, fig.align="center"}
gridExtra::grid.arrange(
  (ggplot(blank_weekday_visits, aes(y = total,
               # Order the bars Monday through Sunday; refer to the column directly inside aes()
               x = factor(week_days, levels = as.character(wday(c(2:7,1), label=TRUE, abbr=FALSE))), 
                               fill = week_days, color = week_days)) +
    # geom_col() always uses stat = "identity", so the argument is not needed
    geom_col(position = "identity", show.legend = FALSE)  +
    
    scale_x_discrete(name = NULL,
                     labels = c("Mon", "Tues", "Wed", "Thurs", "Fri", "Sat", "Sun")) +
    scale_y_continuous(labels = scales::comma, name = "Total Visits During July",
                     expand = expansion(mult = c(0, .05))) +
    
    geom_text(position = position_dodge(width= .8), 
                aes(label = paste(scales::comma(total), "\n Visits"), vjust = 1.25),
                color = "white"
                ) +
     
     ggtitle("Total Visits During July") +
     
    theme(line = element_line(color = "black"),
              axis.text.x = element_text(size = 16),
              axis.title.x = element_text(size = 20, face = "bold"),
              axis.ticks = element_blank(),
              axis.title.y = element_blank(),
              axis.text.y = element_blank(),
              legend.position = "none",
              panel.background = element_blank(),
              panel.grid.major.x = element_blank(), 
              panel.grid.minor = element_blank(),
              panel.border = element_rect(colour = "black", fill = NA),
              plot.title = element_text(hjust = 0.5, size = 16, face = "bold"))),
  
  (ggplot(blank_weekday_visits, aes(y = avg_visits,
               # Order the bars Monday through Sunday; refer to the column directly inside aes()
               x = factor(week_days, levels = as.character(wday(c(2:7,1), label=TRUE, abbr=FALSE))), 
                               fill = week_days, color = week_days)) +
    # geom_col() always uses stat = "identity", so the argument is not needed
    geom_col(position = "identity", show.legend = FALSE)  +
    
    scale_x_discrete(name = NULL,
                         labels = c("Mon", "Tues", "Wed", "Thurs", "Fri", "Sat", "Sun")) +
    scale_y_continuous(labels = scales::comma, 
                       name = "Average Visits per Day During July",
                       expand = expansion(mult = c(0, .05))) +
    
    geom_text(position = position_dodge(width= .8), 
                aes(label = paste(scales::comma(avg_visits), "\n Visits \n per Day"), vjust = 1.25),
                color = "white"
                ) +
     
     ggtitle("Average Visits During July") +
     
    theme(line = element_line(color = "black"),
              axis.text.x = element_text(size = 16),
              axis.title.x = element_text(size = 20, face = "bold.italic"),
              axis.ticks = element_blank(),
              axis.title.y = element_blank(),
              axis.text.y = element_blank(),
              legend.position = "none",
              panel.background = element_blank(),
              panel.grid.major.x = element_blank(), 
              panel.grid.minor = element_blank(),
              panel.border = element_rect(colour = "black", fill = NA),
              plot.title = element_text(hjust = 0.5, size = 16, face = "bold"))),
  ncol = 2,
  top =   textGrob(expression(bold("Small Brand Visit Distribution Over Weekdays in July")), 
                   gp = gpar(fontsize=35,font=8), 
                   hjust = .5)
  )
```


```{r  fig.width=19, fig.height=12, fig.align="center"}
gridExtra::grid.arrange(
  (subset(blank_weekday_title_visits, week_days == "Wednesday" & naics_title != "") %>%
    arrange(- total) %>%
    head(15) %>%
    mutate(rank = row_number()) %>%
    ggplot(aes(y = reorder(naics_title, total), x = total)) +
    geom_segment(aes(x = total, xend = rank, yend = naics_title), size = 2, color = "darkgray") +
    
    geom_point(size = 8, color="tomato") +
     
      geom_point(data = blank_weekday_title_visits %>%
                   filter(naics_title == "Drinking Places (Alcoholic Beverages)" & week_days == "Wednesday"),
                 size = 14, color = "deepskyblue", shape = "diamond") +

     
     geom_point(data = blank_weekday_title_visits %>%
                   filter(naics_title == "Full-Service Restaurants" & week_days == "Wednesday"),
                 size = 14, color = "purple", shape = "diamond") +
     
     # Gray out the industries that hold their rank, using this panel's day (Wednesday) rather than Friday
     geom_point(data = blank_weekday_title_visits %>%
                   filter(naics_title %in% c("Other Airport Operations",
                                             "Fitness and Recreational Sports Centers",
                                             "Supermarkets and Other Grocery (except Convenience) Stores",
                                             "Snack and Nonalcoholic Beverage Bars") &
                            week_days == "Wednesday"),
                size = 10, color = "darkgray") +
    
    scale_y_discrete(name = NULL,
                     labels = rev(c("Lessors of\n Nonresidential Buildings \n (except Miniwarehouses)",
                                "Nature Parks and Other \n Similar Institutions",
                                "Full-Service Restaurants",
                                "Other Airport Operations",
                                "Snack and Nonalcoholic \n Beverage Bars",
                                "Fitness and Recreational \n Sports Centers",
                                "Elementary and\n Secondary Schools",
                                "Limited-Service\n Restaurants",
                                "General Medical and \n Surgical Hospitals",
                                "Golf Courses\n and Country Clubs",
                                "Drinking Places \n (Alcoholic Beverages)",
                                "Supermarkets and \n Other Grocery Stores\n (except Convenience)",
                                "Offices of Dentists",
                                "Religious Organizations",
                                "Offices of Physicians \n (except Mental \nHealth Specialists)"))) +
    
    scale_x_continuous(name = "Wednesday Visits",
                       position = "top",
                       limits = c(0, 72490 + 50000),
                       expand = c(0, 0)) +
    
    geom_text(aes(label = paste(total, "visits"), hjust = -.25), size = 6) +
    
    theme(line = element_line(color = "black"),
          axis.text.x = element_blank(),
          axis.ticks = element_blank(),
          axis.title.x = element_text(size = 20, face = "bold"),
          axis.text.y = element_text(size = 16),
          legend.position = "none",
          panel.background = element_blank(),
          panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(),
          panel.border = element_rect(colour = "black", fill = NA))),
  
  (subset(blank_weekday_title_visits, week_days == "Thursday" & naics_title != "") %>%
    arrange(- total) %>%
    head(15) %>%
    mutate(rank = row_number()) %>%
    ggplot(aes(y = reorder(naics_title, total), x = total)) +
    geom_segment(aes(x = total, xend = rank, yend = naics_title), size = 2, color = "darkgray") +
    
    geom_point(size = 8, color="tomato") +
     
     geom_point(data = blank_weekday_title_visits %>%
                   filter(naics_title == "Drinking Places (Alcoholic Beverages)" & week_days == "Thursday"),
                 size = 14, color = "deepskyblue", shape = "diamond") +
     
     geom_point(data = blank_weekday_title_visits %>%
                   filter(naics_title == "Full-Service Restaurants" & week_days == "Thursday"),
                 size = 14, color = "purple", shape = "diamond") +
    
     # Gray out the industries that hold their rank, using this panel's day (Thursday) rather than Friday
     geom_point(data = blank_weekday_title_visits %>%
                   filter(naics_title %in% c("Other Airport Operations",
                                             "Fitness and Recreational Sports Centers",
                                             "Supermarkets and Other Grocery (except Convenience) Stores",
                                             "Snack and Nonalcoholic Beverage Bars") &
                            week_days == "Thursday"),
                size = 10, color = "darkgray") +
     
    scale_y_discrete(name = NULL,
                     labels = rev(c("Lessors of \n Nonresidential Buildings \n (except Miniwarehouses)",
                                "Nature Parks and Other \n Similar Institutions",
                                "Full-Service Restaurants",
                                "Other Airport Operations",
                                "Snack and Nonalcoholic \n Beverage Bars",
                                "Fitness and Recreational \n Sports Centers",
                                "Elementary and\n Secondary Schools",
                                "Limited-Service\n Restaurants",
                                "General Medical and \n Surgical Hospitals",
                                "Drinking Places \n (Alcoholic Beverages)",
                                "Golf Courses\n and Country Clubs",
                                "Supermarkets and \n Other Grocery Stores\n (except Convenience)",
                                "Offices of Dentists",
                                "Religious Organizations",
                                "Offices of Physicians \n (except Mental\nHealth Specialists)"))) +
    
    scale_x_continuous(name = "Thursday Visits",
                       position = "top",
                       limits = c(0, 72490 + 50000),
                       expand = c(0, 0)) +
    
    geom_text(aes(label = paste(total, "visits"), hjust = -.25), size = 6) +
    
    theme(line = element_line(color = "black"),
          axis.text.x = element_blank(),
          axis.ticks = element_blank(),
          axis.title.x = element_text(size = 20, face = "bold"),
          axis.text.y = element_text(size = 16),
          legend.position = "none",
          panel.background = element_blank(),
          panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(),
          panel.border = element_rect(colour = "black", fill = NA))),
  
  
  (subset(blank_weekday_title_visits, week_days == "Friday" & naics_title != "") %>%
    arrange(- total) %>%
    head(15) %>%
    mutate(rank = row_number()) %>%
    ggplot(aes(y = reorder(naics_title, total), x = total)) +
    geom_segment(aes(x = total, xend = rank, yend = naics_title), size = 2, color = "darkgray") +
    
    geom_point(size = 8, color="tomato") +
     
     geom_point(data = blank_weekday_title_visits %>%
                   filter(naics_title == "Drinking Places (Alcoholic Beverages)" & week_days == "Friday"),
                 size = 14, color = "deepskyblue", shape = "diamond") +
     
     geom_point(data = blank_weekday_title_visits %>%
                   filter(naics_title == "Full-Service Restaurants" & week_days == "Friday"),
                 size = 14, color = "purple", shape = "diamond") +
     
     # Gray out the industries that hold their rank across the three days
     geom_point(data = blank_weekday_title_visits %>%
                   filter(naics_title %in% c("Other Airport Operations",
                                             "Fitness and Recreational Sports Centers",
                                             "Supermarkets and Other Grocery (except Convenience) Stores",
                                             "Snack and Nonalcoholic Beverage Bars") &
                            week_days == "Friday"),
                size = 10, color = "darkgray") +
    
    scale_y_discrete(name = NULL,
                     labels = c("Offices of Dentists",
                                "Used Merchandise Stores",
                                "Religious Organizations",
                                "Supermarkets and \n Other Grocery Stores\n (except Convenience)",
                                "General Medical and \n Surgical Hospitals",
                                "Golf Courses and \n Country Clubs",
                                "Elementary and \n Secondary Schools",
                                "Drinking Places \n (Alcoholic Beverages)",
                                "Limited-Service\n Restaurants",
                                "Fitness and Recreational \n Sports Centers",
                                "Snack and Nonalcoholic \n Beverage Bars",
                                "Other Airport Operations",
                                "Nature Parks and Other \n Similar Institutions",
                                "Full-Service Restaurants",
                                "Lessors of \n Nonresidential Buildings \n (except Miniwarehouses)")) +
    
    scale_x_continuous(name = "Friday Visits",
                       position = "top",
                       limits = c(0, 72490 + 65000),
                       expand = c(0, 0)) +
    
    geom_text(aes(label = paste(total, "visits"), hjust = -.25), size = 6) +
    
    theme(line = element_line(color = "black"),
          axis.text.x = element_blank(),
          axis.ticks = element_blank(),
          axis.title.x = element_text(size = 20, face = "bold"),
          axis.text.y = element_text(size = 16),
          legend.position = "none",
          panel.background = element_blank(),
          panel.grid.major = element_blank(), 
          panel.grid.minor = element_blank(),
          panel.border = element_rect(colour = "black", fill = NA))),
  ncol = 3,
  top =   textGrob(expression(bold("Top 15 Industries Patronized Midweek")), 
                   gp = gpar(fontsize=30,font=8), 
                   hjust = .5)
)
```

