First, we scrape the social determinant scores that U.S. News reports for every county in New York State, and then subset them to the five counties (boroughs) of New York City.

| | County | Population Health | Equity | Education | Economy | Housing | Food & Nutrition | Environment | Public Safety | Community Vitality | Infrastructure |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 4 | Bronx County | 45.7 | 39.8 | 52.0 | 44.1 | 25.0 | 58.6 | 50.0 | 79.6 | 15.9 | 47.0 |
| 25 | Kings County | 56.2 | 26.7 | 62.0 | 61.2 | 27.8 | 67.8 | 43.0 | 77.3 | 23.1 | 47.1 |
| 32 | New York County | 81.0 | 17.5 | 78.8 | 75.5 | 50.8 | 83.9 | 51.1 | 75.9 | 35.9 | 62.4 |
| 42 | Queens County | 65.0 | 31.8 | 55.7 | 72.6 | 26.5 | 70.9 | 39.9 | 78.4 | 18.2 | 47.1 |
| 44 | Richmond County | 61.0 | 45.1 | 59.2 | 70.9 | 35.6 | 61.7 | 51.5 | 85.4 | 43.0 | 46.1 |

We retrieved the street addresses of CVS retail locations in NYC manually from the CVS store locator because we were unable to scrape its pages. We then geocoded every location from its street address with the ggmap package. The geocoding code is commented out because it requires an API key.

| | Address | Borough | Geo.Address | Id | Lat | Lon | Target |
|---|---|---|---|---|---|---|---|
| 1 | 282 East 149th Street Bronx, NY 10451 | bronx | 282 e 149th st, bronx, ny 10451, usa | 2438 | 40.81692 | -73.92225 | 0 |
| 2 | 224 East 161st Street Bronx, NY 10451 | bronx | 224 e 161st st, bronx, ny 10451, usa | 2430 | 40.82438 | -73.91972 | 0 |
| 3 | 3775 East Tremont Avenue Bronx, NY 10465 | bronx | 3775 e tremont ave, bronx, ny 10465, usa | 3096 | 40.82557 | -73.82054 | 0 |
| 4 | 1688 Westchester Avenue Bronx, NY 10472 | bronx | 1688 westchester ave, bronx, ny 10472, usa | 8940 | 40.83019 | -73.87108 | 0 |
| 5 | 50-56 East 167th Street Bronx, NY 10452 | bronx | 50 e 167th st, bronx, ny 10452, usa | 8969 | 40.83532 | -73.92076 | 0 |
| 6 | 3681 Bruckner Blvd, Corner Of Bruckner Boulevard Bronx, NY 10461 | bronx | 3681 bruckner blvd, bronx, ny 10461, usa | 2697 | 40.85244 | -73.82763 | 0 |

We assign each CVS location the social determinant scores of the borough (county) in which it is located. The following is a sample:

| | Address | Borough | Geo.Address | Id | Lat | Lon | Target | County | Population Health | Equity | Education | Economy | Housing | Food & Nutrition | Environment | Public Safety | Community Vitality | Infrastructure |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 127 | 40 W 225th St Bronx, NY 10463 | Bronx County | 40 w 225th st, bronx, ny 10463, usa | 16921 | 40.87266 | -73.90778 | 1 | Bronx County | 45.7 | 39.8 | 52.0 | 44.1 | 25.0 | 58.6 | 50.0 | 79.6 | 15.9 | 47.0 |
| 134 | 112 W 34th Street Btwn. 6th And 7th Ave New York, NY 10120 | New York County | 112 west 34th street, 18th floor, new york, ny 10120, united states | 17726 | 40.74981 | -73.98900 | 1 | New York County | 81.0 | 17.5 | 78.8 | 75.5 | 50.8 | 83.9 | 51.1 | 75.9 | 35.9 | 62.4 |
| 53 | 81 Eight Avenue Btwn. 14th And 15th St. New York, NY 10011 | New York County | 81 8th ave, new york, ny 10011, usa | 1618 | 40.74012 | -74.00280 | 0 | New York County | 81.0 | 17.5 | 78.8 | 75.5 | 50.8 | 83.9 | 51.1 | 75.9 | 35.9 | 62.4 |
| 52 | 300 Park Avenue South Btwn. 22nd And 23rd St. New York, NY 10010 | New York County | 300 park ave s, new york, ny 10010, usa | 2962 | 40.73983 | -73.98695 | 0 | New York County | 81.0 | 17.5 | 78.8 | 75.5 | 50.8 | 83.9 | 51.1 | 75.9 | 35.9 | 62.4 |
| 104 | 54-06 31st Avenue Woodside, NY 11377 | Queens County | 54-6 31st ave, woodside, ny 11377, usa | 1761 | 40.75680 | -73.90704 | 0 | Queens County | 65.0 | 31.8 | 55.7 | 72.6 | 26.5 | 70.9 | 39.9 | 78.4 | 18.2 | 47.1 |
| 72 | 1396 Second Avenue Btwn. 72nd And 73rd St. New York, NY 10021 | New York County | 1396 2nd ave, new york, ny 10021, usa | 2144 | 40.76894 | -73.95797 | 0 | New York County | 81.0 | 17.5 | 78.8 | 75.5 | 50.8 | 83.9 | 51.1 | 75.9 | 35.9 | 62.4 |

Additionally, we find the lowest-scoring component for each location to see which services that store most needs to offer in order to improve the health of its surrounding community.

| | Address | Borough | Geo.Address | Id | Lat | Lon | Target | County | Population Health | Equity | Education | Economy | Housing | Food & Nutrition | Environment | Public Safety | Community Vitality | Infrastructure | MostInNeed |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 109 | 4055 Hylan Blvd Staten Island, NY 10308 | Richmond County | 4055 hylan blvd, staten island, ny 10308, usa | 6057 | 40.54092 | -74.14760 | 0 | Richmond County | 61.0 | 45.1 | 59.2 | 70.9 | 35.6 | 61.7 | 51.5 | 85.4 | 43.0 | 46.1 | Housing |
| 77 | 1622 Third Avenue Btwn. 91st And 92nd St. New York, NY 10128 | New York County | 1622 3rd ave, new york, ny 10128, usa | 2036 | 40.78227 | -73.95183 | 0 | New York County | 81.0 | 17.5 | 78.8 | 75.5 | 50.8 | 83.9 | 51.1 | 75.9 | 35.9 | 62.4 | Equity |
| 115 | 2465 Richmond Avenue Staten Island, NY 10314 | Richmond County | 2465 richmond ave, staten island, ny 10314, usa | 6054 | 40.58762 | -74.16634 | 0 | Richmond County | 61.0 | 45.1 | 59.2 | 70.9 | 35.6 | 61.7 | 51.5 | 85.4 | 43.0 | 46.1 | Housing |
| 51 | 275 Third Avenue, Corner Of 22 Street New York, NY 10010 | New York County | 275 3rd ave, new york, ny 10001, usa | 2556 | 40.73788 | -73.98350 | 0 | New York County | 81.0 | 17.5 | 78.8 | 75.5 | 50.8 | 83.9 | 51.1 | 75.9 | 35.9 | 62.4 | Equity |
| 27 | 1346 Pennsylvania Avenue Brooklyn, NY 11239 | Kings County | 1346 granville payne ave, brooklyn, ny 11239, usa | 2426 | 40.64707 | -73.88238 | 0 | Kings County | 56.2 | 26.7 | 62.0 | 61.2 | 27.8 | 67.8 | 43.0 | 77.3 | 23.1 | 47.1 | Community Vitality |
| 103 | 89-11 Northern Boulevard Jackson Heights, NY 11372 | Queens County | 89 11 northern blvd, jackson heights, ny 11372, usa | 1113 | 40.75659 | -73.87834 | 0 | Queens County | 65.0 | 31.8 | 55.7 | 72.6 | 26.5 | 70.9 | 39.9 | 78.4 | 18.2 | 47.1 | Community Vitality |

We also transform our social determinants data from wide to tall for some analyses.

| | County | Component | Score |
|---|---|---|---|
| 1 | Bronx County | Population Health | 45.7 |
| 2 | Kings County | Population Health | 56.2 |
| 3 | New York County | Population Health | 81.0 |
| 4 | Queens County | Population Health | 65.0 |
| 5 | Richmond County | Population Health | 61.0 |
| 6 | Bronx County | Equity | 39.8 |

Next, we map our CVS locations in order to better understand how they are distributed throughout NYC. Most locations are in Manhattan, and there appear to be very few in the Bronx in relation to how large the borough is.
By plotting the total cumulative scores of the social determinant components by borough, we see that the Bronx has the lowest overall score and is in most need of resources that contribute to the overall health of its population. It is followed by Kings County (Brooklyn), Queens, Richmond County (Staten Island), and finally New York County (Manhattan), which is least in need.
We would also like to see how these social determinant factors relate to one another. A correlation network of the components shows which factors, such as Education and Housing, are highly correlated. Highly correlated factors should be addressed together as a bundle at locations where any one of them is in need. We can therefore assign each CVS store additional factors to address, based on how strongly they correlate with the component most in need at that location.
Community Vitality is the lowest score in the Bronx and in Brooklyn, and this is most strongly correlated with Housing, the Economy and the Environment.
In order to have the most substantial impact on the health of NYC’s population, CVS should invest in addressing the social determinants of health in its design of HealthHubs in the Bronx and in Brooklyn. The component most in need of improvement in both the Bronx and Brooklyn is Community Vitality, which consists of community stability and social capital.
CVS locations in the Bronx and Brooklyn should focus on providing resources that will lead to improvements in Community Vitality in order to make these communities healthier. Community Vitality is most strongly correlated with Housing, the Economy and the Environment, so those factors should also be bundled and addressed at these CVS locations. Housing looks at housing affordability, housing capacity, and housing quality. The economy considers employment, income and opportunity. The environment depends on air and water, the natural environment, and natural hazards. CVS could potentially host workshops and help form community groups in order to educate members on these topics and ultimately guide them on their journeys towards leading healthier lives.
The following is a random sample of NYC store locations, each with the component that CVS should focus its resources on when redesigning that store as a HealthHub. ‘MostInNeed’ denotes the social determinant component in greatest need at each location.

| | Address | MostInNeed |
|---|---|---|
| 69 | 800 10th Avenue Btwn. 53rd And 54th St. New York, NY 10019 | Equity |
| 87 | 97-01 Liberty Avenue Ozone Park, NY 11417 | Community Vitality |
| 9 | 732 Allerton Avenue Bronx, NY 10467 | Community Vitality |
| 121 | 1571 Forest Avenue Staten Island, NY 10302 | Housing |
| 61 | 420 Fifth Ave Btwn 37th And 38th St. New York, NY 10018 | Equity |
| 63 | 757 3rd Ave Ground Floor Btwn. 47th And 48th St New York, NY 10017 | Equity |

---
title: "CVS HealthHUBs and Social Determinants of Health in New York City"
author: "Omar Pineda Jr."
date: "5/15/2019"
output:
  flexdashboard::flex_dashboard:
    orientation: rows
    source_code: embed
    theme: journal
---
Sidebar {.sidebar}
-------------------------------------
In November 2018, CVS Health announced that it had completed its acquisition of health insurer Aetna, a move intended to transform the health care consumer experience in the United States. CVS will attempt to transform some of its 10,000 retail locations into HealthHubs (neighborhood health care destinations) as the industry increasingly looks to address the Social Determinants of Health (SDOH). The Social Determinants of Health consider the influence of factors such as education, the environment and the local economy on the overall wellbeing of community members.
This project looks at how CVS should purpose its store locations throughout NYC's boroughs based on the scores of different social determinant components. Where is there more need and opportunity, and which component should a given CVS HealthHub prioritize to improve the health of its surrounding community?
I have used the following data sources:
CVS Locations: https://www.cvs.com/store-locator/cvs-pharmacy-locations/New-York
Social Determinants Scores for New York: https://www.usnews.com/news/healthiest-communities/new-york
Methodology for Social Determinant Scores: https://www.usnews.com/news/healthiest-communities/articles/methodology
Row {.tabset .tabset-fade}
-------------------------------------
### CVS HealthHUB

### Data Acquisition
First, we scrape the social determinant scores that U.S. News reports for every county in New York State, and then subset them to the five counties (boroughs) of New York City.
```{r nyLoad}
library(rvest)
# Scrape every HTML table on the U.S. News "Healthiest Communities" page for New York
ny <- read_html("https://www.usnews.com/news/healthiest-communities/new-york")
tables <- html_nodes(ny, css = "table")
# The county scores were the 65th table on the page; this index is tied to the page layout
scores <- html_table(tables[[65]])
# Remove empty row #1 and keep every other column (the columns in between are empty)
scores <- scores[2:63, seq(1, ncol(scores), by = 2)]
# Subset the data to the five counties that make up New York City
nyc_counties <- c("New York County", "Bronx County", "Kings County", "Queens County", "Richmond County")
nyc <- subset(scores, County %in% nyc_counties)
nyc
```
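Selecting the table by its position (`[[65]]`) ties the code to the page layout at the time of scraping. As a minimal sketch of an alternative, the helper below picks the first table that has a `County` column. `find_county_table` and the toy tables are hypothetical illustrations, not part of the original analysis; in practice the list could come from applying `html_table()` to each scraped node.

```r
# Hypothetical helper: return the first data frame in a list that has a "County" column
find_county_table <- function(tables) {
  has_county <- vapply(tables, function(t) "County" %in% names(t), logical(1))
  if (!any(has_county)) stop("No table with a 'County' column was found")
  tables[[which(has_county)[1]]]
}

# Toy stand-ins for scraped tables; only the Bronx value comes from this report
toy_tables <- list(
  data.frame(Metric = c("a", "b"), Value = c(1, 2)),
  data.frame(County = c("Bronx County", "Albany County"), Equity = c(39.8, NA))
)

county_table <- find_county_table(toy_tables)

# Keep only the five NYC counties
nyc_counties <- c("New York County", "Bronx County", "Kings County",
                  "Queens County", "Richmond County")
nyc_only <- county_table[county_table$County %in% nyc_counties, ]
nyc_only
```

The helper fails loudly when no matching table exists, which makes a change in the page layout easier to notice than a silently wrong index.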
We retrieved the street addresses of CVS retail locations in NYC manually from the CVS store locator because we were unable to scrape its pages. We then geocoded every location from its street address with the ggmap package. The geocoding code is commented out because it requires an API key.
```{r CVSLocations}
# Code adapted from http://www.storybench.org/geocode-csv-addresses-r/
# library(ggmap)
# register_google(key = "xxx") # personal API key removed
# getOption("ggmap")
# Initialize the data frame (addresses are read as character so that geocode() accepts them)
# ds <- read.csv("CVSstreetAddresses.csv", stringsAsFactors = FALSE)
# Loop through the addresses to get the latitude and longitude of each address and add them to the
# ds data frame in new columns lat and lon
# for (i in 1:nrow(ds)) {
#   print("Working...")
#   result <- geocode(ds$Address[i], output = "latlon", source = "google")
#   ds$lon[i] <- as.numeric(result[1])
#   ds$lat[i] <- as.numeric(result[2])
# }
# write.csv(ds, "csv.csv", row.names = FALSE)
# Load the prepared CVS locations file from GitHub
cvs <- read.csv("https://raw.githubusercontent.com/omarp120/DATA607FinalProject/master/cvs.csv")
head(cvs)
```
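The commented-out loop can be illustrated without an API key. The sketch below follows the same pattern but swaps the Google lookup for a hypothetical `lookup_coords` stub that returns coordinates already listed in the CVS data; with a registered key, `ggmap::geocode()` would take its place.

```r
# Hypothetical stand-in for a geocoding service: returns c(lon, lat) for a known
# address, or NA values when the address is not in the lookup table
lookup_coords <- function(address) {
  known <- list(
    "282 East 149th Street Bronx, NY 10451" = c(-73.92225, 40.81692),
    "224 East 161st Street Bronx, NY 10451" = c(-73.91972, 40.82438)
  )
  if (address %in% names(known)) known[[address]] else c(NA_real_, NA_real_)
}

ds <- data.frame(
  Address = c("282 East 149th Street Bronx, NY 10451",
              "224 East 161st Street Bronx, NY 10451"),
  stringsAsFactors = FALSE
)

# Pre-allocate the new columns, then fill them one address at a time
ds$lon <- NA_real_
ds$lat <- NA_real_
for (i in seq_len(nrow(ds))) {
  result <- lookup_coords(ds$Address[i])
  ds$lon[i] <- result[1]
  ds$lat[i] <- result[2]
}
ds
```

Using `seq_len(nrow(ds))` rather than `1:nrow(ds)` keeps the loop from running when the data frame has no rows.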
### Data Transformations
We assign each CVS location the social determinant scores of the borough (county) in which it is located. The following is a sample:
```{r transformation}
library(tidyr)
library(sqldf)
library(dplyr)
#Convert Boroughs in the CVS location file to the names of the corresponding counties in the social determinants file
cvs$Borough <- as.character(cvs$Borough)
cvs$Borough[cvs$Borough == "bronx"] <- "Bronx County"
cvs$Borough[cvs$Borough == "brooklyn"] <- "Kings County"
cvs$Borough[cvs$Borough == "manhattan"] <- "New York County"
cvs$Borough[cvs$Borough == "queens"] <- "Queens County"
cvs$Borough[cvs$Borough == "staten island"] <- "Richmond County"
#Join the social determinants data to the CVS locations data
query <-
'SELECT *
FROM cvs
LEFT JOIN nyc
ON cvs.borough = nyc.County
'
cvs2 <- sqldf(query)
head(cvs2[sample(nrow(cvs2), 10),])
```
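The recoding and the join can also be expressed without SQL. The sketch below uses a named lookup vector and base R's `merge()` on a three-store toy subset with two of the ten score columns. It is an illustrative equivalent under those assumptions, not the code that produced the sample above; `stores`, `scores` and `borough_to_county` are names chosen for the example.

```r
# Toy subset of store locations, with boroughs as they appear in the CVS file
stores <- data.frame(
  Address = c("282 East 149th Street Bronx, NY 10451",
              "54-06 31st Avenue Woodside, NY 11377",
              "4055 Hylan Blvd Staten Island, NY 10308"),
  Borough = c("bronx", "queens", "staten island"),
  stringsAsFactors = FALSE
)

# County-level scores (two of the ten components)
scores <- data.frame(
  County = c("Bronx County", "Kings County", "New York County",
             "Queens County", "Richmond County"),
  Housing = c(25.0, 27.8, 50.8, 26.5, 35.6),
  Equity = c(39.8, 26.7, 17.5, 31.8, 45.1),
  stringsAsFactors = FALSE
)

# Named vector mapping borough names to county names
borough_to_county <- c(
  "bronx" = "Bronx County", "brooklyn" = "Kings County",
  "manhattan" = "New York County", "queens" = "Queens County",
  "staten island" = "Richmond County"
)
stores$County <- unname(borough_to_county[stores$Borough])

# Left join: keep every store and attach the scores of its county
joined <- merge(stores, scores, by = "County", all.x = TRUE)
joined
```

A lookup vector keeps the borough-to-county mapping in one place, and `all.x = TRUE` mirrors the `LEFT JOIN` by retaining stores even if a county had no scores.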
Additionally, we find the lowest-scoring component for each location to see which services that store most needs to offer in order to improve the health of its surrounding community.
```{r InNeed}
# Find the social determinant component with the lowest score (the greatest need) for each CVS location
cvs3 <- cvs2
score_cols <- 9:18 # the ten score columns, Population Health through Infrastructure
cvs3$MostInNeed <- apply(cvs3[, score_cols], 1, function(x) names(x)[which.min(x)])
# Show a random sample of six locations
cvs3[sample(nrow(cvs3), 6), ]
```
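As a small worked example of the lowest-score rule, the sketch below applies `which.min()` to the county scores for three of the ten components. These three columns happen to contain every county's lowest score, so the result matches the full calculation; the data frame is rebuilt by hand here purely for illustration.

```r
# County scores for three of the ten components (values from the scraped table)
scores <- data.frame(
  County = c("Bronx County", "Kings County", "New York County",
             "Queens County", "Richmond County"),
  Equity = c(39.8, 26.7, 17.5, 31.8, 45.1),
  Housing = c(25.0, 27.8, 50.8, 26.5, 35.6),
  `Community Vitality` = c(15.9, 23.1, 35.9, 18.2, 43.0),
  check.names = FALSE,
  stringsAsFactors = FALSE
)
score_cols <- c("Equity", "Housing", "Community Vitality")

# For each row, which.min() gives the position of the lowest score;
# indexing the column names with it returns the component name
scores$MostInNeed <- score_cols[apply(scores[, score_cols], 1, which.min)]
scores[, c("County", "MostInNeed")]
```

Community Vitality comes out lowest for the Bronx, Kings and Queens counties, Equity for New York County, and Housing for Richmond County. If two components tied, `which.min()` would return the first one.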
We also transform our social determinants data from wide to tall for some analyses.
```{r wide2Tall}
# Transform the data from wide (one column per component) to tall (one row per county and component)
nycTall <- gather(nyc, "Component", "Score", `Population Health`:Infrastructure)
head(nycTall)
```
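To make the reshaping concrete, here is a minimal base R sketch of the same wide-to-tall transformation on two counties and two components. It is an illustrative equivalent rather than the `gather()` call used above, and `wide`, `components` and `tall` are names chosen for the example.

```r
# Wide format: one row per county, one column per component
wide <- data.frame(
  County = c("Bronx County", "Kings County"),
  `Population Health` = c(45.7, 56.2),
  Equity = c(39.8, 26.7),
  check.names = FALSE,
  stringsAsFactors = FALSE
)
components <- setdiff(names(wide), "County")

# Tall format: stack the component columns, one row per county and component
tall <- data.frame(
  County = rep(wide$County, times = length(components)),
  Component = rep(components, each = nrow(wide)),
  Score = unlist(wide[components], use.names = FALSE),
  stringsAsFactors = FALSE
)
tall
```

The rows come out grouped by component, the same ordering as in the `nycTall` output.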
### CVS Locations in New York City
Next, we map our CVS locations in order to better understand how they are distributed throughout NYC. Most locations are in Manhattan, and there appear to be very few in the Bronx in relation to how large the borough is.
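```{r map}
library(leaflet)
# Plot every CVS location and cluster nearby markers; leaflet infers the coordinates from the Lat and Lon columns
cvs %>%
  leaflet() %>%
  addTiles() %>%
  addMarkers(clusterOptions = markerClusterOptions()) %>%
  addProviderTiles(providers$CartoDB.Positron)
```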
### Analysis
By plotting the total cumulative scores of the social determinant components by borough, we see that the Bronx has the lowest overall score and is in most need of resources that contribute to the overall health of its population. It is followed by Kings County (Brooklyn), Queens, Richmond County (Staten Island), and finally New York County (Manhattan), which is least in need.
We would also like to see how these social determinant factors relate to one another. A correlation network of the components shows which factors, such as Education and Housing, are highly correlated. Highly correlated factors should be addressed together as a bundle at locations where any one of them is in need. We can therefore assign each CVS store additional factors to address, based on how strongly they correlate with the component most in need at that location.
Community Vitality is the lowest score in the Bronx and in Brooklyn, and this is most strongly correlated with Housing, the Economy and the Environment.
### Total Social Determinant Scores by Borough
```{r analysis}
library(ggplot2)
# Stack each borough's component scores into a single bar whose height is the total score
ggplot(nycTall, aes(x = County, y = Score)) +
  geom_bar(stat = "identity", aes(color = Component), fill = "antiquewhite2") +
  xlab("Borough") +
  ylab("Total Score") +
  theme_bw() +
  theme(panel.grid.major = element_blank(), panel.border = element_blank()) +
  ggtitle("Total Social Determinant Scores")
```
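The bar heights can be checked numerically. The sketch below rebuilds the five-county score table by hand (values from the scraped data) and sums each row; the matrix layout and the names `scores` and `totals` are choices made for this illustration.

```r
# Scores for the five NYC counties: one row per county, one column per component
scores <- matrix(
  c(45.7, 39.8, 52.0, 44.1, 25.0, 58.6, 50.0, 79.6, 15.9, 47.0,
    56.2, 26.7, 62.0, 61.2, 27.8, 67.8, 43.0, 77.3, 23.1, 47.1,
    81.0, 17.5, 78.8, 75.5, 50.8, 83.9, 51.1, 75.9, 35.9, 62.4,
    65.0, 31.8, 55.7, 72.6, 26.5, 70.9, 39.9, 78.4, 18.2, 47.1,
    61.0, 45.1, 59.2, 70.9, 35.6, 61.7, 51.5, 85.4, 43.0, 46.1),
  nrow = 5, byrow = TRUE,
  dimnames = list(
    c("Bronx County", "Kings County", "New York County",
      "Queens County", "Richmond County"),
    c("Population Health", "Equity", "Education", "Economy", "Housing",
      "Food & Nutrition", "Environment", "Public Safety",
      "Community Vitality", "Infrastructure")
  )
)

# Total score per county, sorted from greatest need (lowest total) to least
totals <- sort(rowSums(scores))
totals
```

The totals are 457.7 for the Bronx, 492.2 for Kings, 506.1 for Queens, 559.5 for Richmond and 612.8 for New York County, which is the ordering described in the analysis.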
### Correlation Network of Social Determinants of Health
```{r corNetwork}
library(corrr)
# Isolate the ten component scores that we want to correlate with one another
sd_scores <- nyc[, 2:11]
cor_mat <- correlate(sd_scores)
# cor_mat
# Draw the network without a minimum-correlation cutoff so that every pair is shown
network_plot(cor_mat, min_cor = 0.0)
```
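To read off which components track a given component most closely, the correlations can also be ranked directly. The sketch below uses base R's `cor()` on four of the ten components, typed in by hand from the scraped scores; it is an illustration of the lookup, not the corrr code above, and with only five counties the coefficients should be treated as rough.

```r
# Scores for the five NYC counties (Bronx, Kings, New York, Queens, Richmond)
scores <- data.frame(
  `Community Vitality` = c(15.9, 23.1, 35.9, 18.2, 43.0),
  Housing = c(25.0, 27.8, 50.8, 26.5, 35.6),
  Economy = c(44.1, 61.2, 75.5, 72.6, 70.9),
  Environment = c(50.0, 43.0, 51.1, 39.9, 51.5),
  check.names = FALSE
)

# Pearson correlations between every pair of components
cor_mat <- cor(scores)

# Correlations with Community Vitality, excluding its correlation with itself,
# ordered from strongest to weakest by absolute value
cv <- cor_mat["Community Vitality", colnames(cor_mat) != "Community Vitality"]
cv_ranked <- cv[order(abs(cv), decreasing = TRUE)]
cv_ranked
```

The same lookup could be repeated for Equity or Housing to choose the bundle for stores where those components are most in need.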
### Conclusion
In order to have the most substantial impact on the health of NYC's population, CVS should invest in addressing the social determinants of health in its design of HealthHubs in the Bronx and in Brooklyn. The component most in need of improvement in both the Bronx and Brooklyn is Community Vitality, which consists of community stability and social capital.
CVS locations in the Bronx and Brooklyn should focus on providing resources that will lead to improvements in Community Vitality in order to make these communities healthier. Community Vitality is most strongly correlated with Housing, the Economy and the Environment, so those factors should also be bundled and addressed at these CVS locations. Housing looks at housing affordability, housing capacity, and housing quality. The economy considers employment, income and opportunity. The environment depends on air and water, the natural environment, and natural hazards. CVS could potentially host workshops and help form community groups in order to educate members on these topics and ultimately guide them on their journeys towards leading healthier lives.
The following is a random sample of NYC store locations, each with the component that CVS should focus its resources on when redesigning that store as a HealthHub. 'MostInNeed' denotes the social determinant component in greatest need at each location.
```{r sample}
# Show a random sample of stores with the component each should prioritize
cvs3[sample(nrow(cvs3), 6), c("Address", "MostInNeed")]
```