Submission by: Abhilasha Kumar, Nadya Paramputri, Sharp Harry
Research Objective: To investigate whether the presence of STEM jobs affects sentiment toward, and infrastructure investment in, about 250 Atlanta neighborhoods.
Study Background
-Why STEM jobs?
STEM workers have been acknowledged as economic drivers at the local and federal levels [1]. They account for much of the country's innovation, generating ideas and technologies that create jobs and raise the living standards of U.S. households [2].
These individuals earn 29 percent more than their non-STEM counterparts (Langdon et al., 2011).
Five local service jobs (carpenters, taxi drivers, teachers, nurses, and others) are created for every STEM worker hired in a city with a large STEM workforce (Moretti, 2011).
This population has been growing for the past 40 years (Watson, 2017).
-Why Atlanta?
Atlanta has developed a deep-rooted ecosystem with a tech-savvy workforce, fostered by its proximity to tech-focused schools and by Georgia Tech's decision in the 1990s to build Tech Square, which connects students to internships and research opportunities.
The number of tech job postings in Atlanta surpassed those in Chicago, Austin, and San Francisco (Burning Glass, 2022).
The rise of remote work has accelerated the migration of Silicon Valley STEM workers to states with lower living costs.
H1 (Neighborhood Sentiment): We hypothesize that the higher the STEM population in an NPU (Neighborhood Planning Unit), the lower the neighborhood sentiment. Variables: number of STEM workers in the neighborhood; average household income and housing value.
H2 (Economic Mobility Index): We hypothesize that the higher the STEM population in an NPU, the higher the economic mobility index. Variables: number of STEM workers in the neighborhood; average household income and housing value.
For this project, we downloaded Tweets that contain the names of Atlanta neighborhoods, applied sentiment analysis to them, and mapped and plotted the sentiment associated with each neighborhood. Specifically, we performed the following steps:
library(rtweet)
library(tidyverse)
library(sf)
library(sentiment.ai)
library(SentimentAnalysis)
library(ggplot2)
library(here)
library(tmap)
library(Hmisc)
library(ff)
Read the data into the current R environment.
# Read neighborhood shapefile
nb_shp <- st_read("D:/Georgia Tech/Spec topic_/major ass_5/Atlanta_Neighborhoods")
## Reading layer `Atlanta_Neighborhoods' from data source
## `D:\Georgia Tech\Spec topic_\major ass_5\Atlanta_Neighborhoods'
## using driver `ESRI Shapefile'
## Simple feature collection with 248 features and 20 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -84.55085 ymin: 33.64799 xmax: -84.28962 ymax: 33.88687
## Geodetic CRS: WGS 84
init_sentiment.ai(envname = "r-sentiment-ai", method = "conda") # feel free to change these arguments if you need to.
Prepare to use the Twitter API by passing your credentials to the create_token() function.
# whatever name that was assigned to the created app
appname <- "UrbanAnalytics_tutorial"
# create token named "twitter_token"
# the keys used should be replaced by your own keys obtained by creating the app
twitter_token <- create_token(
  app = appname,
  consumer_key = Sys.getenv("twitter_key"),
  consumer_secret = Sys.getenv("twitter_key_secret"),
  access_token = Sys.getenv("twitter_access_token"),
  access_secret = Sys.getenv("twitter_access_token_secret"))
Step 5: Define a function that downloads the Tweets, cleans them, and applies sentiment analysis to them.
# Extract neighborhood names from nb_shp's NAME column and store it in nb_names object.
nb_names <- nb_shp$NAME
# Define a search function
get_twt <- function(term){
  # Wrap the term in quotes so the search matches the exact phrase
  term_mod <- paste0("\"", term, "\"")
  out <- search_tweets(q = term_mod,
                       n = 1000,
                       lang = "en",
                       geocode = "33.76,-84.41,50mi",
                       retryonratelimit = TRUE,
                       include_rts = FALSE)
  out <- out %>%
    select(created_at, id, id_str, full_text, geo, coordinates, place, text)
  # Basic cleaning: strip URLs, HTML entities, @ signs, and blank lines
  replace_reg <- "http[s]?://[A-Za-z\\d/\\.]+|&amp;|&lt;|&gt;"
  out <- out %>%
    mutate(text = str_replace_all(text, replace_reg, ""),
           text = gsub("@", "", text),
           text = gsub("\n\n", "", text))
  # Sentiment analysis; also add a column for the neighborhood name
  if (nrow(out) > 0){
    out <- out %>%
      mutate(sentiment_ai = sentiment_score(text),
             sentiment_an = analyzeSentiment(text)$SentimentQDAP,
             nb = term)
    print(paste0("Search term:", term))
  }
  return(out)
}
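The cleaning step inside the function can be tried on its own. The following is a minimal base-R sketch, not the project's code: the two example Tweets are hypothetical, and the pattern assumes that ampersands and angle brackets arrive HTML-escaped as `&amp;`, `&lt;`, and `&gt;`.

```r
# Hypothetical example Tweets (not taken from the collected data)
tweets <- c("Love @Midtown today! https://t.co/abc123",
            "Traffic in Buckhead &lt;3\n\n")

# URLs plus HTML-escaped &, <, > (assumed form of the entities)
url_or_entity <- "http[s]?://[A-Za-z0-9/.]+|&amp;|&lt;|&gt;"

clean <- gsub(url_or_entity, "", tweets)        # drop URLs and entities
clean <- gsub("@", "", clean, fixed = TRUE)     # drop the @ sign, keep the handle text
clean <- gsub("\n\n", "", clean, fixed = TRUE)  # drop blank-line breaks
clean <- trimws(clean)                          # trim leftover edge whitespace

print(clean)
```

Note that replacing a blank-line break with an empty string joins the words on either side of it; replacing it with a single space may be preferable when Tweets contain text after the break.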
Step 6: Apply the function to Tweets.
twt <- readRDS("twt_raw.rds")
# Apply the function to get Tweets
# twt <- map(nb_names, ~get_twt(.x))
Two sets of Tweets were collected. The first chunk below cleans and filters the first set, and the second chunk cleans and filters the second set. The process follows these steps:
- Drop empty elements from the list twt. These are neighborhoods with no Tweets referring to them. A logical vector that is FALSE where the corresponding element of twt has no Tweets, and TRUE otherwise, does the filtering.
- The coordinates column is a list-column. Unnest it with unnest() so that lat, long, and type (the column names inside coordinates) become separate columns.
- Calculate the average sentiment score for each neighborhood: group_by() the nb column and summarize() to compute the means. Also add a column n that holds the number of rows in each group, using n().
- Join the cleaned Tweet data back to the neighborhood shapefile, using the neighborhood name as the join key.
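The steps listed above can be sketched on toy data. This is a base-R illustration chosen so that it runs without extra packages; it swaps `aggregate()` and `merge()` in for `group_by()`/`summarize()` and the spatial join, and all names and values (`twt_toy`, `nb_toy`, the scores) are hypothetical.

```r
# Toy stand-in for `twt`: one data frame per neighborhood, one of them empty
twt_toy <- list(
  data.frame(nb = "Bolton", sentiment_ai = c(0.2, -0.4)),
  data.frame(nb = character(0), sentiment_ai = numeric(0)),
  data.frame(nb = "Lenox", sentiment_ai = 0.5)
)

# Logical vector: TRUE where the element holds at least one Tweet
has_rows <- vapply(twt_toy, nrow, integer(1)) > 0
twt_kept <- do.call(rbind, twt_toy[has_rows])

# Mean sentiment and Tweet count per neighborhood
nb_mean <- aggregate(sentiment_ai ~ nb, data = twt_kept, FUN = mean)
nb_n <- aggregate(sentiment_ai ~ nb, data = twt_kept, FUN = length)
names(nb_n)[2] <- "n"
nb_summary <- merge(nb_mean, nb_n, by = "nb")

# Join to a toy neighborhood table, using the name as the key
nb_toy <- data.frame(NAME = c("Bolton", "Lenox", "Paces"))
joined <- merge(nb_toy, nb_summary, by.x = "NAME", by.y = "nb")
print(joined)
```

Because `merge()` defaults to an inner join, neighborhoods without Tweets ("Paces" here) drop out of the result, which is consistent with the merged table in this report having fewer rows than the shapefile.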
Step 7(a): Cleaning and filtering the first set of Tweets
library("data.table")
library(dplyr)
# plyr is attached after dplyr, so it masks dplyr verbs such as summarise() and mutate();
# dplyr functions are therefore called with the dplyr:: prefix below
library(plyr)
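# Keep only the neighborhoods that returned at least one Tweet
twts <- twt[sapply(twt, nrow) > 0]
# Stack the per-neighborhood data frames into one table
twts <- rbindlist(twts, fill = FALSE, idcol = NULL)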
typeof(twts)
## [1] "list"
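# Split the coordinates list-column into separate columns
twts_unnest <- unnest(twts, cols = c("coordinates"))
# Mean sentiment and Tweet count per neighborhood
twts_clean <- twts_unnest %>% group_by(nb) %>%
  dplyr::summarise(sentiment_ai = mean(sentiment_ai),
                   sentiment_an = mean(sentiment_an),
                   n = n())
# Rename the key to match the shapefile, then join on the neighborhood name
names(twts_clean)[names(twts_clean) == 'nb'] <- 'NAME'
twt_poly <- merge(x = nb_shp, y = twts_clean, by = 'NAME')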
Step 7(b): Cleaning and filtering the previous two weeks of Twitter data
bw_rds <- readRDS("D:/Georgia Tech/Spec topic_/Project_proposal/twt_nb_2022-11-13.rds")
# Stack the per-neighborhood data frames into one table
twt_bw <- bw_rds[1:248] %>% do.call("rbind", .)
# Mean sentiment and Tweet count per neighborhood
twts_clean_bw <- twt_bw %>% group_by(nb) %>%
  dplyr::summarise(sentiment_ai = mean(sentiment_ai),
                   sentiment_an = mean(sentiment_an),
                   n = n())
# Rename the key to match the shapefile, then join on the neighborhood name
names(twts_clean_bw)[names(twts_clean_bw) == 'nb'] <- 'NAME'
twt_poly_bw <- merge(x = nb_shp, y = twts_clean_bw, by = 'NAME')
Step 7(c): Merging both sets of Tweets into one data frame
merged_final_twt_2 <- join(twt_poly %>% as.data.frame(),twt_poly_bw %>% as.data.frame(), by = "ACRES")
total_twts <-rbind(twt_poly, twt_poly_bw)
tibble(total_twts)
## # A tibble: 111 × 24
## NAME OBJEC…¹ LOCALID GEOTYPE FULLF…² LEGAL…³ EFFEC…⁴ ENDDATE SRCREF ACRES
## <chr> <int> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <dbl>
## 1 Adams P… 62 <NA> Neighb… <NA> <NA> <NA> <NA> <NA> 629.
## 2 Atlanti… 231 <NA> Neighb… <NA> <NA> <NA> <NA> <NA> 163.
## 3 Ben Hill 71 <NA> Neighb… <NA> <NA> <NA> <NA> <NA> 685.
## 4 Bolton 75 <NA> Neighb… <NA> <NA> <NA> <NA> <NA> 965.
## 5 Brandon 17 <NA> Neighb… <NA> <NA> <NA> <NA> <NA> 410.
## 6 Brookha… 225 <NA> Neighb… <NA> <NA> <NA> <NA> <NA> 637.
## 7 Brookwo… 207 <NA> Neighb… <NA> <NA> <NA> <NA> <NA> 101.
## 8 Buckhea… 32 <NA> Neighb… <NA> <NA> <NA> <NA> <NA> 127.
## 9 Cabbage… 48 <NA> Neighb… <NA> <NA> <NA> <NA> <NA> 112.
## 10 Campbel… 60 <NA> Neighb… <NA> <NA> <NA> <NA> <NA> 283.
## # … with 101 more rows, 14 more variables: SQMILES <dbl>, OLDNAME <chr>,
## # NPU <chr>, CREATED_US <chr>, CREATED_DA <date>, LAST_EDITE <chr>,
## # LAST_EDI_1 <date>, GLOBALID <chr>, SHAPEAREA <dbl>, SHAPELEN <dbl>,
## # sentiment_ai <dbl>, sentiment_an <dbl>, n <int>,
## # geometry <MULTIPOLYGON [°]>, and abbreviated variable names ¹OBJECTID,
## # ²FULLFIPS, ³LEGALAREA, ⁴EFFECTDATE
# saveRDS() returns NULL, so its result is not assigned
saveRDS(total_twts, file = "merged_twts_1.rds")
Step 8. Analysis
Now that we have collected Tweets, calculated sentiment scores, and merged them back to the original shapefile, we can map them to see their spatial distribution and draw plots to see the relationships between variables.
Step 8(a): First, we draw two interactive choropleth maps, one colored by sentiment score and the other by the number of Tweets, and display them side by side with the tmap_arrange() function.
tmap_mode("view")
## tmap mode set to interactive viewing
a <- tm_basemap("OpenStreetMap")+tm_shape(total_twts) +
tm_polygons(col = "sentiment_ai", style = "quantile")
a
## Variable(s) "sentiment_ai" contains positive and negative values, so midpoint is set to 0. Set midpoint = NA to show the full spectrum of the color palette.
b <- tm_basemap("OpenStreetMap")+ tm_shape(total_twts) +
tm_polygons(col = "n", style="quantile")
tmap_arrange(a,b, sync = TRUE)
## Variable(s) "sentiment_ai" contains positive and negative values, so midpoint is set to 0. Set midpoint = NA to show the full spectrum of the color palette.
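Both maps use style = "quantile", which bins values so that each color class holds roughly the same number of neighborhoods. The following is a minimal base-R sketch of that idea on hypothetical Tweet counts; the choice of five classes is an assumption, and the exact break points tmap computes may differ in detail.

```r
# Hypothetical Tweet counts per neighborhood
n_tweets <- c(1, 1, 2, 2, 4, 10, 12, 32, 52, 80)

# Five quantile classes: break points at the 0th, 20th, ..., 100th percentiles
breaks <- quantile(n_tweets, probs = seq(0, 1, by = 0.2))

# Assign each value to a class; include.lowest keeps the minimum in the first class
classes <- cut(n_tweets, breaks = breaks, include.lowest = TRUE)

# Number of neighborhoods falling in each class
print(table(classes))
```

With a skewed variable such as Tweet counts, quantile classes keep a few very active neighborhoods from dominating the color scale, at the cost of unequal class widths.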
Step 8(b): Computing the correlation between the number of Tweets for each neighborhood and its sentiment score, using either the cor.test() function or the ggpubr::stat_cor() function.
library(ggpubr)
## Warning: package 'ggpubr' was built under R version 4.2.2
##
## Attaching package: 'ggpubr'
## The following object is masked from 'package:plyr':
##
## mutate
# Scatter plot with a fitted regression line; the correlation label is added by stat_cor() below
twt_cor <- ggscatter(total_twts, x = "n", y = "sentiment_ai", add = "reg.line", add.params = list(color = "blue", fill = "lightgray"))
twt_cor + stat_cor(p.accuracy = 0.001, r.accuracy = 0.01)
## `geom_smooth()` using formula = 'y ~ x'
twt_cor
## `geom_smooth()` using formula = 'y ~ x'
cor_map <- twt_cor + stat_cor(method = "pearson")
twt_cor <- cor.test(total_twts$n,total_twts$sentiment_ai)
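The object returned by cor.test() is a list, so the coefficient and p-value can be pulled out and reported directly. A minimal sketch follows, using hypothetical per-neighborhood values rather than the project's data.

```r
# Hypothetical per-neighborhood values: Tweet counts and mean sentiment
n_tweets  <- c(2, 52, 4, 32, 10, 12, 1, 3)
sentiment <- c(-0.6, -0.2, 0.3, -0.1, -0.3, 0.0, 0.6, 0.5)

# Pearson correlation test (two-sided by default)
res <- cor.test(n_tweets, sentiment, method = "pearson")

r_value <- unname(res$estimate)  # Pearson's r
p_value <- res$p.value           # p-value of the test
cat(sprintf("r = %.2f, p = %.3f\n", r_value, p_value))
```

Neighborhoods with only one or two Tweets have noisy mean sentiment, so it may be worth repeating the test after filtering on a minimum Tweet count.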
Step 9: Convert the neighborhood shapefile from polygons to points.
Step 9(a): Find the centroid of each neighborhood polygon so that the neighborhoods can be overlaid on the NPU boundaries.
st_centroid(nb_shp)
## Warning in st_centroid.sf(nb_shp): st_centroid assumes attributes are constant
## over geometries of x
## Simple feature collection with 248 features and 20 fields
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: -84.54262 ymin: 33.65449 xmax: -84.30083 ymax: 33.87584
## Geodetic CRS: WGS 84
## First 10 features:
## OBJECTID LOCALID NAME GEOTYPE FULLFIPS LEGALAREA
## 1 7 <NA> Peachtree Heights East Neighborhood <NA> <NA>
## 2 8 <NA> Mt. Gilead Woods Neighborhood <NA> <NA>
## 3 9 <NA> Meadowbrook Forest Neighborhood <NA> <NA>
## 4 10 <NA> Niskey Cove Neighborhood <NA> <NA>
## 5 11 <NA> Oakcliff Neighborhood <NA> <NA>
## 6 12 <NA> Just Us Neighborhood <NA> <NA>
## 7 13 <NA> Bush Mountain Neighborhood <NA> <NA>
## 8 14 <NA> Briar Glen Neighborhood <NA> <NA>
## 9 15 <NA> Fairburn Neighborhood <NA> <NA>
## 10 16 <NA> Ben Hill Terrace Neighborhood <NA> <NA>
## EFFECTDATE ENDDATE SRCREF ACRES SQMILES OLDNAME NPU
## 1 <NA> <NA> <NA> 133.22 0.21 Peachtree Heights East B
## 2 <NA> <NA> <NA> 35.59 0.06 Mt. Gilead Woods P
## 3 <NA> <NA> <NA> 70.85 0.11 Meadowbrook Forest P
## 4 <NA> <NA> <NA> 52.50 0.08 Niskey Cove P
## 5 <NA> <NA> <NA> 66.96 0.10 Oakcliff H
## 6 <NA> <NA> <NA> 17.69 0.03 Just Us T
## 7 <NA> <NA> <NA> 49.80 0.08 Bush Mountain S
## 8 <NA> <NA> <NA> 66.55 0.10 Briar Glen P
## 9 <NA> <NA> <NA> 114.84 0.18 Fairburn Avenue P
## 10 <NA> <NA> <NA> 212.19 0.33 Ben Hill Terrace P
## CREATED_US CREATED_DA LAST_EDITE LAST_EDI_1
## 1 <NA> <NA> GIS 2022-05-24
## 2 <NA> <NA> GIS 2022-05-24
## 3 <NA> <NA> GIS 2022-05-24
## 4 <NA> <NA> GIS 2022-05-24
## 5 <NA> <NA> GIS 2022-05-24
## 6 <NA> <NA> GIS 2022-05-24
## 7 <NA> <NA> GIS 2022-05-24
## 8 <NA> <NA> GIS 2022-05-24
## 9 <NA> <NA> GIS 2022-05-24
## 10 <NA> <NA> GIS 2022-05-24
## GLOBALID SHAPEAREA SHAPELEN
## 1 {7040B465-59F1-4D32-BB1A-6340CCAB5471} 5803162.1 9738.439
## 2 {0B7F1854-18A6-42CD-BDA4-CF6119EFD4D5} 1550308.9 5341.891
## 3 {6FE32BA0-9C9E-496F-BF73-9DBB21427BAB} 3086026.6 7538.819
## 4 {C72B1769-93B6-42BA-AB9C-C7EEC486B46C} 2286925.9 8814.793
## 5 {B4963201-5EE9-4D7C-A2EE-E8CCDA782740} 2916579.1 6926.384
## 6 {1C57E7B0-9F92-477D-9DF9-98CE4C268F94} 770378.3 3760.127
## 7 {50462634-1A5E-4D72-9509-06DBB9EEA0FA} 2169448.6 7795.086
## 8 {114318E1-DE48-43AA-853B-1708CAB5607F} 2898718.3 8633.698
## 9 {670CCF24-29C4-4455-B6B0-22F130488D42} 5002626.5 10200.742
## 10 {BB2396B6-8004-4C3F-94BE-E9DAD516A31E} 9243017.2 12950.567
## geometry
## 1 POINT (-84.38295 33.82574)
## 2 POINT (-84.50368 33.69995)
## 3 POINT (-84.50308 33.69259)
## 4 POINT (-84.52892 33.70871)
## 5 POINT (-84.49806 33.76169)
## 6 POINT (-84.42486 33.75258)
## 7 POINT (-84.43204 33.72751)
## 8 POINT (-84.5033 33.69695)
## 9 POINT (-84.52342 33.69249)
## 10 POINT (-84.52177 33.7)
Step 9(b): Read the NPU shapefile and spatially join it with the neighborhood shapefile.
NPU_shape <- read_sf("D:/Georgia Tech/Spec topic_/Project_proposal/City_of_Atlanta_Neighborhood_Statistical_Areas/City_of_Atlanta_Neighborhood_Statistical_Areas/City_of_Atlanta_Neighborhood_Statistical_Areas.shp")
NPU_sf <- st_as_sf(NPU_shape)
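# Join on neighborhood centroids so that each neighborhood matches only the area containing its center
sf_joined <- st_join(NPU_sf, st_centroid(nb_shp), join = st_intersects)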
tmap_mode("view")
## tmap mode set to interactive viewing
a <- tm_basemap("OpenStreetMap") + tm_shape(sf_joined) + tm_polygons(col = "NPU.x", style = "pretty")
a
Step 10: Read the NPU-level data, with all variable values, from a CSV file into a data frame.
all_data <- read.csv("NPU_Neighborhood_EconMobility_Pop_Race_MedHHInc_MedHouse_TotalJob_STEMJob (1).csv")
all_data
## OBJECTID NPU NEIGHBORHOOD Economic.Mobility.Index
## 1 37 A Margaret Mitchel 63
## 2 3 B Peachtree Heights West 63
## 3 38 C Fernleaf 61
## 4 50 D Bolton 58
## 5 83 E Ansley Park 64
## 6 77 F Piedmont Heights 59
## 7 65 G Atlanta Industrial Park 40
## 8 NA G West Highlands 41
## 9 15 I Beecher Hills 42
## 10 24 J Center Hill 39
## 11 52 K Hunter Hills 40
## 12 10 L Vine City 45
## 13 12 M Castleberry Hill 58
## 14 102 N Cabbagetown 60
## 15 92 O East Lake 51
## 16 NA P Ben Hill 45
## 17 55 Q Midwest Cascade 49
## 18 66 R Campbellton Road 33
## 19 19 S Bush Mountain 42
## 20 22 T Ashview Heights 40
## 21 46 V Capitol Gateway 49
## 22 100 W Grant Park, Oakland 59
## 23 18 X Capitol View 45
## 24 97 Y Chosewood Park 41
## 25 47 Z Lakewood 36
## People.Based.Index Place.Based.Index Economic.System.Index
## 1 71 62 52
## 2 69 59 60
## 3 65 59 56
## 4 64 58 55
## 5 66 69 58
## 6 60 61 60
## 7 38 46 54
## 8 36 46 42
## 9 41 43 43
## 10 31 44 43
## 11 37 43 41
## 12 45 46 46
## 13 49 58 77
## 14 66 64 55
## 15 61 44 49
## 16 45 48 45
## 17 46 49 50
## 18 26 33 40
## 19 42 48 40
## 20 43 42 33
## 21 46 54 44
## 22 62 56 60
## 23 45 47 44
## 24 43 40 43
## 25 38 36 33
## Education.System.Index pop white black asian other hispanic medhhinc
## 1 69 4061 85.9 5.7 4.4 1.3 2.7 299991
## 2 65 4874 77.0 14.5 2.9 2.0 3.6 116250
## 3 62 2662 74.1 9.7 1.6 1.2 13.3 116108
## 4 56 5314 43.1 27.1 2.4 2.2 25.2 105331
## 5 65 3350 86.6 8.0 1.5 0.8 3.1 109269
## 6 57 2834 65.9 20.3 4.4 3.1 6.3 114292
## 7 61 2083 3.6 92.8 0.6 1.1 2.0 35038
## 8 41 3628 3.9 91.6 0.6 1.5 2.3 39589
## 9 42 2881 1.3 95.3 0.1 1.5 1.8 45212
## 10 37 2730 1.4 95.8 0.2 1.2 1.4 32051
## 11 40 3836 1.5 96.2 0.2 1.0 1.0 41994
## 12 45 2818 2.2 92.7 0.3 2.2 2.6 35244
## 13 48 14560 32.2 53.4 6.8 2.8 4.8 73780
## 14 56 3750 59.1 31.0 2.3 2.8 4.8 118100
## 15 51 4046 27.2 67.7 0.9 1.8 2.4 85981
## 16 43 3826 1.2 94.6 0.5 2.0 1.7 59371
## 17 51 1898 1.5 96.3 0.9 0.9 0.4 96093
## 18 34 6721 1.9 95.7 0.0 1.1 1.2 27689
## 19 41 3672 1.7 95.8 0.2 1.1 1.1 40136
## 20 4 2072 1.6 95.5 0.0 1.6 1.3 388803
## 21 51 2874 12.0 81.0 1.8 2.2 3.0 31516
## 22 60 6827 62.4 27.9 2.2 2.5 5.0 112075
## 23 44 2648 12.8 82.6 0.4 1.8 2.4 34123
## 24 36 3995 21.9 55.0 0.5 1.5 21.1 33252
## 25 37 3135 2.6 84.3 1.0 0.8 11.4 36305
## medhousevalue totaljob stemjob
## 1 974443 4799 56.80
## 2 606599 26278 55.40
## 3 676221 8540 33.30
## 4 397923 17993 28.60
## 5 467388 97905 42.00
## 6 646887 47137 35.24
## 7 208684 4370 16.80
## 8 100539 3319 22.90
## 9 168759 2647 18.30
## 10 105324 2581 16.20
## 11 104444 2541 28.20
## 12 236130 3002 20.60
## 13 374185 154829 40.80
## 14 567659 9411 12.90
## 15 421026 4161 15.70
## 16 192889 5133 12.80
## 17 325564 37 9.70
## 18 195071 2897 8.20
## 19 94993 904 24.90
## 20 282464 7852 20.50
## 21 231128 3286 16.30
## 22 407383 7018 14.40
## 23 142781 20542 48.00
## 24 179753 1791 48.80
## 25 143342 5707 29.40
Step 10: Combining the sentiment scores with the NPU-level data.
print(total_twts)
## Simple feature collection with 111 features and 23 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -84.53565 ymin: 33.65559 xmax: -84.28962 ymax: 33.88687
## Geodetic CRS: WGS 84
## First 10 features:
## NAME OBJECTID LOCALID GEOTYPE FULLFIPS LEGALAREA EFFECTDATE
## 1 Adams Park 62 <NA> Neighborhood <NA> <NA> <NA>
## 2 Atlantic Station 231 <NA> Neighborhood <NA> <NA> <NA>
## 3 Ben Hill 71 <NA> Neighborhood <NA> <NA> <NA>
## 4 Bolton 75 <NA> Neighborhood <NA> <NA> <NA>
## 5 Brandon 17 <NA> Neighborhood <NA> <NA> <NA>
## 6 Brookhaven 225 <NA> Neighborhood <NA> <NA> <NA>
## 7 Brookwood 207 <NA> Neighborhood <NA> <NA> <NA>
## 8 Buckhead Village 32 <NA> Neighborhood <NA> <NA> <NA>
## 9 Cabbagetown 48 <NA> Neighborhood <NA> <NA> <NA>
## 10 Campbellton Road 60 <NA> Neighborhood <NA> <NA> <NA>
## ENDDATE SRCREF ACRES SQMILES OLDNAME NPU CREATED_US CREATED_DA
## 1 <NA> <NA> 628.53 0.98 Adams Park R <NA> <NA>
## 2 <NA> <NA> 163.06 0.25 Home Park E <NA> <NA>
## 3 <NA> <NA> 685.22 1.07 Ben Hill P <NA> <NA>
## 4 <NA> <NA> 964.68 1.51 Bolton D <NA> <NA>
## 5 <NA> <NA> 409.85 0.64 Brandon C <NA> <NA>
## 6 <NA> <NA> 636.92 1.00 Brookhaven B <NA> <NA>
## 7 <NA> <NA> 101.17 0.16 Brookwood E <NA> <NA>
## 8 <NA> <NA> 127.21 0.20 Buckhead Village B <NA> <NA>
## 9 <NA> <NA> 112.17 0.18 Cabbage Town N <NA> <NA>
## 10 <NA> <NA> 282.91 0.44 Campbellton Road R <NA> <NA>
## LAST_EDITE LAST_EDI_1 GLOBALID SHAPEAREA
## 1 GIS 2022-05-24 {806632F1-D3FC-4DD2-978A-42B625F8C601} 27378543
## 2 GIS 2022-05-24 {0E3C70B8-FB96-4598-9392-35B0449DF8FB} 7103044
## 3 GIS 2022-05-24 {BE4276A0-2F34-4E15-8DDA-F8059ACC9E22} 29848192
## 4 GIS 2022-05-24 {68C7BA9F-7608-4A79-B59E-65E2C4C00FE6} 42021278
## 5 GIS 2022-05-24 {F29E1386-9FEC-4907-AFAC-FBD45A723D0A} 17853189
## 6 GIS 2022-05-24 {CDC5A9FB-29F8-4F51-97DA-ED8D9DB1482E} 27744187
## 7 GIS 2022-05-24 {7160E578-A16B-4F40-A51F-A692B717629C} 4406751
## 8 GIS 2022-05-24 {ED28CC83-BB5C-4EC2-8DA5-F47473E1FAB0} 5541158
## 9 GIS 2022-05-24 {267DA5D6-6421-4870-AA02-137C41930051} 4886223
## 10 GIS 2022-05-24 {7F9C6923-F17A-4C70-9F40-49235294CB69} 12323573
## SHAPELEN sentiment_ai sentiment_an n geometry
## 1 21028.365 -0.67136337 0.020000000 2 MULTIPOLYGON (((-84.45195 3...
## 2 13535.866 -0.22949672 0.020506518 52 MULTIPOLYGON (((-84.39357 3...
## 3 38492.440 0.28917164 0.100000000 2 MULTIPOLYGON (((-84.52858 3...
## 4 36336.392 -0.58723179 0.005681818 4 MULTIPOLYGON (((-84.45799 3...
## 5 22601.822 -0.03029922 0.004557292 32 MULTIPOLYGON (((-84.41975 3...
## 6 25663.856 -0.32019438 0.031845238 10 MULTIPOLYGON (((-84.34826 3...
## 7 11375.268 -0.06301082 0.041058859 12 MULTIPOLYGON (((-84.39306 3...
## 8 11010.963 0.66483378 0.157894737 1 MULTIPOLYGON (((-84.37131 3...
## 9 9042.749 0.50428414 0.250000000 1 MULTIPOLYGON (((-84.36264 3...
## 10 15863.560 0.25729829 0.000000000 2 MULTIPOLYGON (((-84.4667 33...
all_datum <- merge(total_twts, all_data, by = "NPU", all= TRUE)
tibble(all_datum)
## # A tibble: 117 × 41
## NPU NAME OBJEC…¹ LOCALID GEOTYPE FULLF…² LEGAL…³ EFFEC…⁴ ENDDATE SRCREF
## <chr> <chr> <int> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 A Paces 221 <NA> Neighb… <NA> <NA> <NA> <NA> <NA>
## 2 A Margare… 98 <NA> Neighb… <NA> <NA> <NA> <NA> <NA>
## 3 A Kingswo… 96 <NA> Neighb… <NA> <NA> <NA> <NA> <NA>
## 4 A Paces 221 <NA> Neighb… <NA> <NA> <NA> <NA> <NA>
## 5 A Chastai… 240 <NA> Neighb… <NA> <NA> <NA> <NA> <NA>
## 6 A Margare… 98 <NA> Neighb… <NA> <NA> <NA> <NA> <NA>
## 7 B North B… 215 <NA> Neighb… <NA> <NA> <NA> <NA> <NA>
## 8 B Buckhea… 32 <NA> Neighb… <NA> <NA> <NA> <NA> <NA>
## 9 B Brookha… 225 <NA> Neighb… <NA> <NA> <NA> <NA> <NA>
## 10 B Lenox 89 <NA> Neighb… <NA> <NA> <NA> <NA> <NA>
## # … with 107 more rows, 31 more variables: ACRES <dbl>, SQMILES <dbl>,
## # OLDNAME <chr>, CREATED_US <chr>, CREATED_DA <date>, LAST_EDITE <chr>,
## # LAST_EDI_1 <date>, GLOBALID <chr>, SHAPEAREA <dbl>, SHAPELEN <dbl>,
## # sentiment_ai <dbl>, sentiment_an <dbl>, n <int>, OBJECTID.y <int>,
## # NEIGHBORHOOD <chr>, Economic.Mobility.Index <int>,
## # People.Based.Index <int>, Place.Based.Index <int>,
## # Economic.System.Index <int>, Education.System.Index <int>, pop <int>, …
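Merging neighborhood-level rows with NPU-level rows repeats each NPU's values once per neighborhood, and all = TRUE also keeps rows that have no match on the other side. The following is a minimal base-R sketch of that behavior, followed by a per-NPU mean restricted to numeric columns (character columns such as names have no mean). The sentiment values and the `nb`/`npu` tables are hypothetical; the stemjob values are taken from the table printed earlier.

```r
# Hypothetical neighborhood-level rows (two neighborhoods share NPU "A")
nb <- data.frame(NAME = c("Paces", "Chastain Park", "Lenox", "Bolton"),
                 NPU = c("A", "A", "B", "D"),
                 sentiment_ai = c(0.1, 0.3, -0.3, 0.4))

# NPU-level rows; NPU "E" has no matching neighborhood in `nb`
npu <- data.frame(NPU = c("A", "B", "D", "E"),
                  stemjob = c(56.8, 55.4, 28.6, 42.0))

# all = TRUE keeps unmatched rows from both sides (a full outer join)
both <- merge(nb, npu, by = "NPU", all = TRUE)

# Average only the numeric columns within each NPU
num_cols <- names(both)[vapply(both, is.numeric, logical(1))]
npu_mean <- aggregate(both[num_cols], by = list(NPU = both$NPU),
                      FUN = mean, na.rm = TRUE)
print(npu_mean)
```

Selecting numeric columns before averaging avoids the "argument is not numeric or logical" warnings that mean() raises on text columns.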
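# vars(1:40) also selects character columns (NAME, LOCALID, GEOTYPE, ...);
# mean() returns NA for those, which produces the warnings below
twt_clean_1 <- all_datum %>% group_by(NPU) %>% summarise_at(vars(1:40), mean)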
## Warning in mean.default(NAME): argument is not numeric or logical: returning NA
## Warning in mean.default(LOCALID): argument is not numeric or logical: returning
## NA
## Warning in mean.default(GEOTYPE): argument is not numeric or logical: returning
## NA
## Warning in mean.default(FULLFIPS): argument is not numeric or logical: returning
## NA
## Warning in mean.default(LEGALAREA): argument is not numeric or logical:
## returning NA
## Warning in mean.default(EFFECTDATE): argument is not numeric or logical:
## returning NA
## Warning in mean.default(ENDDATE): argument is not numeric or logical: returning
## NA
## Warning in mean.default(SRCREF): argument is not numeric or logical: returning
## NA
## Warning in mean.default(OLDNAME): argument is not numeric or logical: returning
## NA
## Warning in mean.default(CREATED_US): argument is not numeric or logical:
## returning NA
## Warning in mean.default(LAST_EDITE): argument is not numeric or logical:
## returning NA
## Warning in mean.default(GLOBALID): argument is not numeric or logical: returning
## NA
## Warning in mean.default(NEIGHBORHOOD): argument is not numeric or logical:
## returning NA
## Warning in mean.default(geometry): argument is not numeric or logical: returning
## NA
twt_clean_1
## Simple feature collection with 25 features and 40 fields (with 2 geometries empty)
## Geometry type: GEOMETRY
## Dimension: XY
## Bounding box: xmin: -84.53565 ymin: 33.65559 xmax: -84.28962 ymax: 33.88687
## Geodetic CRS: WGS 84
## # A tibble: 25 × 41
## NPU NAME OBJECTID.x LOCALID GEOTYPE FULLF…¹ LEGAL…² EFFEC…³ ENDDATE SRCREF
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 A NA 162. NA NA NA NA NA NA NA
## 2 B NA 141. NA NA NA NA NA NA NA
## 3 C NA 68.8 NA NA NA NA NA NA NA
## 4 D NA 74.2 NA NA NA NA NA NA NA
## 5 E NA 195. NA NA NA NA NA NA NA
## 6 F NA 1424. NA NA NA NA NA NA NA
## 7 G NA 154. NA NA NA NA NA NA NA
## 8 H NA 122. NA NA NA NA NA NA NA
## 9 I NA 161. NA NA NA NA NA NA NA
## 10 J NA 130 NA NA NA NA NA NA NA
## # … with 15 more rows, 31 more variables: ACRES <dbl>, SQMILES <dbl>,
## # OLDNAME <dbl>, CREATED_US <dbl>, CREATED_DA <date>, LAST_EDITE <dbl>,
## # LAST_EDI_1 <date>, GLOBALID <dbl>, SHAPEAREA <dbl>, SHAPELEN <dbl>,
## # sentiment_ai <dbl>, sentiment_an <dbl>, n <dbl>, OBJECTID.y <dbl>,
## # NEIGHBORHOOD <dbl>, Economic.Mobility.Index <dbl>,
## # People.Based.Index <dbl>, Place.Based.Index <dbl>,
## # Economic.System.Index <dbl>, Education.System.Index <dbl>, pop <dbl>, …
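The `mean.default()` warnings and the all-NA columns in the table above appear to come from averaging character, date, or geometry columns within each NPU. The summarising code is not shown in this part of the report, so the following is only a minimal sketch of one way to avoid the problem: restrict the mean to numeric columns with `across(where(is.numeric), ...)`. The data frame `toy` and its values are hypothetical stand-ins, not the project data.

```r
library(dplyr)

# Hypothetical stand-in for the joined tweet/NPU table (values are made up).
toy <- tibble(
  NPU = c("A", "A", "B"),
  NEIGHBORHOOD = c("x", "y", "z"),   # character column: mean() would warn here
  sentiment_ai = c(0.2, -0.4, 0.5),
  pop = c(4000, 4200, 4800)
)

# Average only the numeric columns within each NPU.
# Non-numeric columns such as NEIGHBORHOOD are dropped instead of becoming NA.
npu_means <- toy %>%
  group_by(NPU) %>%
  summarise(across(where(is.numeric), ~ mean(.x, na.rm = TRUE)), .groups = "drop")

npu_means
```

With this approach the result contains only the grouping column and the numeric averages, so no warnings are raised and no placeholder NA columns need to be removed afterwards.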
twt_clean_ <- twt_clean_1 %>%
select("NPU", "NAME", "sentiment_ai", "n",
"Economic.Mobility.Index", "People.Based.Index", "Economic.System.Index", "Education.System.Index",
"pop", "white", "black", "asian", "other", "hispanic",
"medhhinc", "medhousevalue", "totaljob", "stemjob")
twt_clean_
## Simple feature collection with 25 features and 18 fields (with 2 geometries empty)
## Geometry type: GEOMETRY
## Dimension: XY
## Bounding box: xmin: -84.53565 ymin: 33.65559 xmax: -84.28962 ymax: 33.88687
## Geodetic CRS: WGS 84
## # A tibble: 25 × 19
## NPU NAME sentime…¹ n Econo…² Peopl…³ Econo…⁴ Educa…⁵ pop white black
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 A NA -0.123 3.83 63 71 52 69 4061 85.9 5.7
## 2 B NA 0.244 7.12 63 69 60 65 4874 77 14.5
## 3 C NA 0.0395 13.2 61 65 56 62 2662 74.1 9.7
## 4 D NA -0.276 4.6 58 64 55 56 5314 43.1 27.1
## 5 E NA 0.226 26.2 64 66 58 65 3350 86.6 8
## 6 F NA 0.437 3.33 59 60 60 57 2834 65.9 20.3
## 7 G NA -0.149 29.5 40.5 37 48 51 2856. 3.75 92.2
## 8 H NA 0.443 7 NA NA NA NA NA NA NA
## 9 I NA 0.0267 1 42 41 43 42 2881 1.3 95.3
## 10 J NA 0.525 2 39 31 43 37 2730 1.4 95.8
## # … with 15 more rows, 8 more variables: asian <dbl>, other <dbl>,
## # hispanic <dbl>, medhhinc <dbl>, medhousevalue <dbl>, totaljob <dbl>,
## # stemjob <dbl>, geometry <GEOMETRY [°]>, and abbreviated variable names
## # ¹sentiment_ai, ²Economic.Mobility.Index, ³People.Based.Index,
## # ⁴Economic.System.Index, ⁵Education.System.Index
Step 11: Data Analysis and Visualization of all the tweets and sentiment_ai values.
ggplot(data = twt_clean_, mapping = aes(x=n, y=sentiment_ai)) +
geom_point() +
geom_smooth(method = "lm",se = FALSE) +
labs(
x = "Count_Tweets",
y = "Avg_Sentiment_Score",
title = "Tweet patterns in different NPUs in Atlanta"
)
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 2 rows containing non-finite values (`stat_smooth()`).
## Warning: Removed 2 rows containing missing values (`geom_point()`).
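The straight line drawn by `geom_smooth(method = "lm")` is an ordinary least-squares fit of `sentiment_ai` on `n`, and the same fit can be inspected numerically with `lm()`. The sketch below is an illustration only: it uses just the ten NPUs (A to J) whose `n` and `sentiment_ai` values are printed in the table above, so its coefficients will not match a fit on all 25 NPUs. The object names `npu_sub` and `fit` are our own.

```r
# Subset of the printed table: NPUs A-J only (the full data has 25 NPUs).
npu_sub <- data.frame(
  NPU = LETTERS[1:10],
  n = c(3.83, 7.12, 13.2, 4.6, 26.2, 3.33, 29.5, 7, 1, 2),
  sentiment_ai = c(-0.123, 0.244, 0.0395, -0.276, 0.226,
                   0.437, -0.149, 0.443, 0.0267, 0.525)
)

# Same model that geom_smooth(method = "lm") fits: sentiment as a linear function of tweet count.
fit <- lm(sentiment_ai ~ n, data = npu_sub)

# Intercept, slope, standard errors and R-squared of the fitted line.
summary(fit)
```

Looking at the slope and its standard error gives a quick sense of whether the trend in the scatter plot is distinguishable from a flat line.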
Step 12: Correlation of the mean sentiment score with the other variables
spw <- ggplot(data = twt_clean_, mapping = aes(x=Economic.Mobility.Index, y=sentiment_ai)) +
geom_point() +
geom_smooth(method = "lm",se = FALSE) +
stat_cor(method = "pearson", label.x = 40, label.y = 0.65)
Description: This scatter plot shows a negative correlation between the two variables, Economic.Mobility.Index and sentiment_ai.
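The coefficient that `stat_cor()` prints on the plot can also be checked directly with base R's `cor.test()`. The sketch below is an illustration under a loud assumption: it uses only the ten NPUs (A to J) printed in the table earlier in this section, so the estimate will differ from the one computed on all 25 NPUs. The object names `npu_sub` and `res` are our own.

```r
# Subset of the printed table: NPUs A-J only (NPU H has no Economic.Mobility.Index).
npu_sub <- data.frame(
  NPU = LETTERS[1:10],
  Economic.Mobility.Index = c(63, 63, 61, 58, 64, 59, 40.5, NA, 42, 39),
  sentiment_ai = c(-0.123, 0.244, 0.0395, -0.276, 0.226,
                   0.437, -0.149, 0.443, 0.0267, 0.525)
)

# Pearson correlation test; cor.test() drops incomplete pairs before computing.
res <- cor.test(npu_sub$Economic.Mobility.Index, npu_sub$sentiment_ai,
                method = "pearson")

res$estimate   # correlation coefficient r
res$p.value    # p-value for H0: r = 0
```

Reporting the p-value alongside the coefficient helps to judge whether a visually negative slope reflects more than noise in a sample of this size.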
library(ggplot2)
library(tmap)
library(cowplot)
## Warning: package 'cowplot' was built under R version 4.2.2
##
## Attaching package: 'cowplot'
## The following object is masked from 'package:ggpubr':
##
## get_legend
library(ggplotify)
## Warning: package 'ggplotify' was built under R version 4.2.2
# stat_cor() expects the method name in lowercase: "pearson", "kendall", or "spearman"
spw1 <- ggplot(data = twt_clean_, mapping = aes(x=Education.System.Index, y=sentiment_ai)) +
geom_point(alpha= 0.4, size = 3) +
geom_smooth(method = "lm",se = FALSE, color= "red") +
theme_bw()+ stat_cor(method = "kendall", label.x = 30, label.y = 0.55)
spw1
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 3 rows containing non-finite values (`stat_smooth()`).
## Warning: Removed 3 rows containing non-finite values (`stat_cor()`).
## Warning: Removed 3 rows containing missing values (`geom_point()`).
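The correlation method name is case-sensitive: the accepted values are the lowercase strings "pearson", "kendall", and "spearman". Base R's `cor()` validates its `method` argument in the same way, so the behaviour can be reproduced without any plotting. The vectors `x` and `y` below are made-up illustration values.

```r
# Made-up illustration data.
x <- c(1, 2, 3, 4, 5)
y <- c(2, 1, 4, 3, 5)

# Lowercase method name: accepted, returns Kendall's tau.
tau <- cor(x, y, method = "kendall")

# Capitalised method name: rejected by argument matching.
# tryCatch() captures the error message instead of stopping the script.
bad <- tryCatch(
  cor(x, y, method = "Kendall"),
  error = function(e) conditionMessage(e)
)

tau
bad
```

Because a failed correlation layer only produces a warning inside a plot, the label is silently missing from the figure; checking the method string this way makes the problem visible immediately.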
spw2 <- ggplot(data = twt_clean_, mapping = aes(x=stemjob, y=sentiment_ai)) +
geom_point(alpha= 0.4, size = 3) +
geom_smooth(method = "lm",se = FALSE, color= "red")+
theme_bw()+ stat_cor(method = "pearson", label.x = 30, label.y = 0.55)
spw2
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 3 rows containing non-finite values (`stat_smooth()`).
## Warning: Removed 3 rows containing non-finite values (`stat_cor()`).
## Warning: Removed 3 rows containing missing values (`geom_point()`).
spw3 <- ggplot(data = twt_clean_, mapping = aes(x=People.Based.Index, y=sentiment_ai)) +
geom_point(alpha= 0.4, size = 3) +
geom_smooth(method = "lm",se = FALSE, color= "red") +
theme_bw()+ stat_cor(method = "pearson", label.x = 30, label.y = 0.55)
spw3
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 3 rows containing non-finite values (`stat_smooth()`).
## Warning: Removed 3 rows containing non-finite values (`stat_cor()`).
## Warning: Removed 3 rows containing missing values (`geom_point()`).
spw4 <- ggplot(data = twt_clean_, mapping = aes(x=white, y=sentiment_ai)) +
geom_point(alpha= 0.4, size = 3) +
geom_smooth(method = "lm",se = FALSE, color= "red") +
theme_bw()+ stat_cor(method = "pearson", label.x = 30, label.y = 0.55)
spw4
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 3 rows containing non-finite values (`stat_smooth()`).
## Warning: Removed 3 rows containing non-finite values (`stat_cor()`).
## Warning: Removed 3 rows containing missing values (`geom_point()`).
tmap_mode()
## current tmap mode is "view"
plot_grid(spw1, spw2, spw3, spw4 )
spw5 <- ggplot(data = twt_clean_, mapping = aes(x = black, y = sentiment_ai)) +
  geom_point(alpha = 0.4, size = 3) +
  geom_smooth(method = "lm", se = FALSE, color = "red") +
  theme_bw() +
  stat_cor(method = "pearson", label.x = 30, label.y = 0.55)
spw5
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 3 rows containing non-finite values (`stat_smooth()`).
## Warning: Removed 3 rows containing non-finite values (`stat_cor()`).
## Warning: Removed 3 rows containing missing values (`geom_point()`).
spw6 <- ggplot(data = twt_clean_, mapping = aes(x = asian, y = sentiment_ai)) +
  geom_point(alpha = 0.4, size = 3) +
  geom_smooth(method = "lm", se = FALSE, color = "red") +
  theme_bw() +
  stat_cor(method = "pearson", label.x = 2, label.y = 0.55)
spw6
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 3 rows containing non-finite values (`stat_smooth()`).
## Warning: Removed 3 rows containing non-finite values (`stat_cor()`).
## Warning: Removed 3 rows containing missing values (`geom_point()`).
spw7 <- ggplot(data = twt_clean_, mapping = aes(x = other, y = sentiment_ai)) +
  geom_point(alpha = 0.4, size = 3) +
  geom_smooth(method = "lm", se = FALSE, color = "red") +
  theme_bw() +
  stat_cor(method = "pearson", label.x = 2, label.y = 0.55)
spw7
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 3 rows containing non-finite values (`stat_smooth()`).
## Warning: Removed 3 rows containing non-finite values (`stat_cor()`).
## Warning: Removed 3 rows containing missing values (`geom_point()`).
spw8 <- ggplot(data = twt_clean_, mapping = aes(x = hispanic, y = sentiment_ai)) +
  geom_point(alpha = 0.4, size = 3) +
  geom_smooth(method = "lm", se = FALSE, color = "red") +
  theme_bw() +
  stat_cor(method = "pearson", label.x = 2, label.y = 0.55)
spw8
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 3 rows containing non-finite values (`stat_smooth()`).
## Warning: Removed 3 rows containing non-finite values (`stat_cor()`).
## Warning: Removed 3 rows containing missing values (`geom_point()`).
plot_grid(spw5, spw6, spw7, spw8)
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 3 rows containing non-finite values (`stat_smooth()`).
## Warning: Removed 3 rows containing non-finite values (`stat_cor()`).
## Warning: Removed 3 rows containing missing values (`geom_point()`).
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 3 rows containing non-finite values (`stat_smooth()`).
## Warning: Removed 3 rows containing non-finite values (`stat_cor()`).
## Warning: Removed 3 rows containing missing values (`geom_point()`).
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 3 rows containing non-finite values (`stat_smooth()`).
## Warning: Removed 3 rows containing non-finite values (`stat_cor()`).
## Warning: Removed 3 rows containing missing values (`geom_point()`).
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 3 rows containing non-finite values (`stat_smooth()`).
## Warning: Removed 3 rows containing non-finite values (`stat_cor()`).
## Warning: Removed 3 rows containing missing values (`geom_point()`).
spw9 <- ggplot(data = twt_clean_, mapping = aes(x = medhhinc, y = sentiment_ai)) +
  geom_point(alpha = 0.4, size = 3) +
  geom_smooth(method = "lm", se = FALSE, color = "red") +
  theme_bw() +
  stat_cor(method = "pearson", label.x = 75000, label.y = 0.55)
spw10 <- ggplot(data = twt_clean_, mapping = aes(x = medhousevalue, y = sentiment_ai)) +
  geom_point(alpha = 0.4, size = 3) +
  geom_smooth(method = "lm", se = FALSE, color = "red") +
  theme_bw() +
  stat_cor(method = "pearson", label.x = 300000, label.y = 0.55)
plot_grid(spw9, spw10)
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 3 rows containing non-finite values (`stat_smooth()`).
## Warning: Removed 3 rows containing non-finite values (`stat_cor()`).
## Warning: Removed 3 rows containing missing values (`geom_point()`).
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 3 rows containing non-finite values (`stat_smooth()`).
## Warning: Removed 3 rows containing non-finite values (`stat_cor()`).
## Warning: Removed 3 rows containing missing values (`geom_point()`).
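The scatter plots above all follow the same template: sentiment on the y-axis, one predictor on the x-axis, a linear fit, and a Pearson correlation label. A minimal sketch of a helper that captures this template is shown below. It assumes `ggplot2` and `ggpubr` are installed; the function name `sentiment_scatter()` and the toy data frame are hypothetical and not part of the original analysis. Dropping incomplete rows before plotting is one way to avoid the repeated "Removed 3 rows" warnings.

```r
library(ggplot2)
library(ggpubr)  # provides stat_cor()

# Hypothetical helper: scatter plot of sentiment against one predictor,
# with a linear fit and the Pearson correlation printed on the panel.
sentiment_scatter <- function(df, x_var, y_var = "sentiment_ai",
                              label_x = NULL, label_y = NULL) {
  # Drop incomplete rows up front so ggplot2 does not warn about them.
  keep <- !is.na(df[[x_var]]) & !is.na(df[[y_var]])
  ggplot(df[keep, , drop = FALSE],
         aes(x = .data[[x_var]], y = .data[[y_var]])) +
    geom_point(alpha = 0.4, size = 3) +
    geom_smooth(method = "lm", formula = y ~ x, se = FALSE, color = "red") +
    theme_bw() +
    stat_cor(method = "pearson", label.x = label_x, label.y = label_y)
}

# Toy data standing in for twt_clean_ (values are made up).
set.seed(42)
toy <- data.frame(
  medhhinc = c(runif(22, 30000, 150000), NA, NA, NA),
  sentiment_ai = c(runif(22, -0.2, 0.8), NA, NA, NA)
)
p <- sentiment_scatter(toy, "medhhinc", label_x = 75000, label_y = 0.55)
```

With a helper like this, each panel becomes a single call, and the label positions remain the only per-plot settings.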
library(MASS)
## Warning: package 'MASS' was built under R version 4.2.2
##
## Attaching package: 'MASS'
## The following object is masked from 'package:dplyr':
##
## select
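# Note: NPU is a factor with one level per observation used in the fit, so
# including it leaves no residual degrees of freedom and the remaining
# predictors cannot be estimated (see the summary output).
lm_model <- lm(
  sentiment_ai ~ NPU + Economic.Mobility.Index + People.Based.Index +
    Education.System.Index + pop + white + black + asian + other +
    hispanic + medhhinc + medhousevalue + stemjob,
  data = twt_clean_
)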
summary(lm_model)
##
## Call:
## lm(formula = sentiment_ai ~ NPU + Economic.Mobility.Index + People.Based.Index +
## Education.System.Index + pop + white + black + asian + other +
## hispanic + medhhinc + medhousevalue + stemjob, data = twt_clean_)
##
## Residuals:
## ALL 22 residuals are 0: no residual degrees of freedom!
##
## Coefficients: (12 not defined because of singularities)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.1234 NaN NaN NaN
## NPUB 0.3672 NaN NaN NaN
## NPUC 0.1629 NaN NaN NaN
## NPUD -0.1529 NaN NaN NaN
## NPUE 0.3497 NaN NaN NaN
## NPUF 0.5600 NaN NaN NaN
## NPUG -0.0258 NaN NaN NaN
## NPUI 0.1501 NaN NaN NaN
## NPUJ 0.6481 NaN NaN NaN
## NPUK 0.8700 NaN NaN NaN
## NPUL 0.1206 NaN NaN NaN
## NPUM 0.3951 NaN NaN NaN
## NPUN 0.6389 NaN NaN NaN
## NPUO 0.5216 NaN NaN NaN
## NPUP 0.3858 NaN NaN NaN
## NPUR 0.3724 NaN NaN NaN
## NPUS 0.3152 NaN NaN NaN
## NPUT 0.6116 NaN NaN NaN
## NPUV 0.3482 NaN NaN NaN
## NPUW 0.3942 NaN NaN NaN
## NPUY 0.4967 NaN NaN NaN
## NPUZ 0.5994 NaN NaN NaN
## Economic.Mobility.Index NA NA NA NA
## People.Based.Index NA NA NA NA
## Education.System.Index NA NA NA NA
## pop NA NA NA NA
## white NA NA NA NA
## black NA NA NA NA
## asian NA NA NA NA
## other NA NA NA NA
## hispanic NA NA NA NA
## medhhinc NA NA NA NA
## medhousevalue NA NA NA NA
## stemjob NA NA NA NA
##
## Residual standard error: NaN on 0 degrees of freedom
## (3 observations deleted due to missingness)
## Multiple R-squared: 1, Adjusted R-squared: NaN
## F-statistic: NaN on 21 and 0 DF, p-value: NA
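The summary above reports zero residual degrees of freedom: the intercept plus 21 NPU dummy variables already use up all 22 observations, so every other predictor is dropped as singular. The sketch below reproduces this behaviour on made-up data and shows that leaving the NPU factor out restores residual degrees of freedom. The data frame `toy` and its values are assumptions for illustration only, and the reduced formula is one possible choice rather than the authors' model.

```r
# Toy data: 22 rows, one NPU label per row (values are made up).
set.seed(1)
n_obs <- 22
toy <- data.frame(
  NPU = factor(LETTERS[seq_len(n_obs)]),
  stemjob = runif(n_obs, 0, 100),
  medhhinc = runif(n_obs, 30000, 150000)
)
toy$sentiment_ai <- 0.2 + 0.002 * toy$stemjob + rnorm(n_obs, sd = 0.1)

# One dummy per NPU fits every point exactly and aliases the other predictors.
saturated <- lm(sentiment_ai ~ NPU + stemjob + medhhinc, data = toy)

# Without the NPU factor, the numeric predictors can be estimated.
reduced <- lm(sentiment_ai ~ stemjob + medhhinc, data = toy)

df.residual(saturated)        # 0
sum(is.na(coef(saturated)))   # 2 (stemjob and medhhinc are not defined)
df.residual(reduced)          # 19
```

With only 22 usable observations, even the reduced model supports just a handful of predictors, so the full predictor list would still need pruning.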
# Note: aes(x = a + b + ...) plots sentiment against the arithmetic sum of
# these columns; it does not visualize the multiple regression fitted above.
lm_model_1 <- ggplot(
  data = twt_clean_,
  mapping = aes(
    x = n + Economic.Mobility.Index + People.Based.Index +
      Economic.System.Index + Education.System.Index + pop + white + black +
      asian + other + hispanic + medhhinc + medhousevalue + stemjob,
    y = sentiment_ai
  )
) +
  geom_point(alpha = 0.4, size = 3) +
  geom_smooth(method = "lm", color = "red") +
  theme_bw() +
  stat_cor(method = "pearson", label.x = 500000, label.y = 0.55)
lm_model_1
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 3 rows containing non-finite values (`stat_smooth()`).
## Warning: Removed 3 rows containing non-finite values (`stat_cor()`).
## Warning: Removed 3 rows containing missing values (`geom_point()`).
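# predict() returns values only for the rows lm() kept, so match them back to
# twt_clean_ by row name rather than padding NAs onto the end of the vector.
pred_lm_ <- unname(predict(lm_model)[rownames(twt_clean_)])
plot_data <- data.frame(Predicted_value_sent = pred_lm_,
                        Observed_value_sent = twt_clean_$sentiment_ai,
                        twt_clean_$NPU)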
ggplot(plot_data, aes(x = Predicted_value_sent, y = Observed_value_sent)) +
  geom_point(alpha = 0.4, size = 3) +
  geom_abline(intercept = 0, slope = 1, color = "green")
## Warning: Removed 4 rows containing missing values (`geom_point()`).
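When `lm()` drops rows with missing values, `predict()` returns fewer values than the data has rows, and the predictions have to be lined up with the original rows before they are compared with the observed values. One option, sketched below on made-up data, is to fit with `na.action = na.exclude`, which makes `predict()` return `NA` in the positions of the dropped rows. The data frame `toy` is hypothetical and stands in for `twt_clean_`.

```r
# Toy data with two missing responses (values are made up).
set.seed(7)
toy <- data.frame(x = 1:10)
toy$y <- 0.5 * toy$x + rnorm(10, sd = 0.2)
toy$y[c(3, 7)] <- NA

# na.exclude drops incomplete rows for fitting but remembers their positions,
# so predict() and residuals() return one value per original row.
fit <- lm(y ~ x, data = toy, na.action = na.exclude)
toy$predicted <- predict(fit)

which(is.na(toy$predicted))  # rows 3 and 7, matching the missing responses
```

Because the predictions keep the original row order, they can be added to the data as a column directly, without a separate merge step.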
Add the predicted sentiment values as a column alongside the tweet data.
names(plot_data)[names(plot_data) == 'twt_clean_.NPU'] <- 'NPU'
twt_clean_2 <- merge(twt_clean_, plot_data, by = "NPU")
a_1 <- tm_basemap("OpenStreetMap") +
  tm_shape(twt_clean_2) +
  tm_polygons(col = "Predicted_value_sent", style = "pretty")
a_1
## Warning: The shape twt_clean_2 contains empty units.
## Variable(s) "Predicted_value_sent" contains positive and negative values, so midpoint is set to 0. Set midpoint = NA to show the full spectrum of the color palette.
b_1 <- tm_basemap("OpenStreetMap") +
  tm_shape(total_twts) +
  tm_polygons(col = "sentiment_ai", style = "quantile")
b_1
## Variable(s) "sentiment_ai" contains positive and negative values, so midpoint is set to 0. Set midpoint = NA to show the full spectrum of the color palette.
tmap_arrange(a_1, b_1)
## Warning: The shape twt_clean_2 contains empty units.
## Variable(s) "Predicted_value_sent" contains positive and negative values, so midpoint is set to 0. Set midpoint = NA to show the full spectrum of the color palette.
## Variable(s) "sentiment_ai" contains positive and negative values, so midpoint is set to 0. Set midpoint = NA to show the full spectrum of the color palette.