0.1 28 Sept 2016 suppress printing of all the code
0.2 30 Sept 2016 use metro area populations from wikipedia
0.5 1 Oct 2016 add ChoroplethRmaps to reduce map clutter and add date span
[1] "Using direct authentication"
Get city geo data from maps::us.cities
Select number of cities
n.cities <- 40
The top cities are:
rank | population | metro | metro_match | name | country.etc | pop | lat | long | city_name |
---|---|---|---|---|---|---|---|---|---|
1 | 20182305 | New York | new york | New York NY | NY | 8124427 | 40.67 | -73.94 | New York |
2 | 13340068 | Los Angeles | los angeles | Los Angeles CA | CA | 3911500 | 34.11 | -118.41 | Los Angeles |
3 | 9551031 | Chicago | chicago | Chicago IL | IL | 2830144 | 41.84 | -87.68 | Chicago |
4 | 7102796 | Dallas | dallas | Dallas TX | TX | 1216543 | 32.79 | -96.77 | Dallas |
5 | 6656947 | Houston | houston | Houston TX | TX | 2043005 | 29.77 | -95.39 | Houston |
6 | 6097684 | Washington | washington | WASHINGTON DC | DC | 548359 | 38.91 | -77.02 | WASHINGTON |
7 | 6069875 | Philadelphia | philadelphia | Philadelphia PA | PA | 1439814 | 40.01 | -75.13 | Philadelphia |
8 | 6012331 | Miami | miami | Miami FL | FL | 386740 | 25.78 | -80.21 | Miami |
9 | 5710795 | Atlanta | atlanta | Atlanta GA | GA | 424096 | 33.76 | -84.42 | Atlanta |
10 | 4774321 | Boston | boston | Boston MA | MA | 567759 | 42.34 | -71.02 | Boston |
Data are collected for the top 40 U.S. cities by metro-area population, from New York NY to Nashville TN.
Together, these 40 metro areas have a total population of 174.3 million people.
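The matching step behind the table above can be sketched roughly as follows. This is a minimal illustration in base R, assuming the metro populations and the city geo data are joined on a lower-cased city name (as the metro_match and city_name columns suggest). The data frames metros and cities are hypothetical stand-ins built from the first three table rows, and n.cities is set to 2 only to keep the example small.

```r
# Metro-area populations (first three rows of the table above).
metros <- data.frame(
  metro = c("New York", "Los Angeles", "Chicago"),
  population = c(20182305, 13340068, 9551031),
  stringsAsFactors = FALSE
)

# City geo data in the same shape as the table's name/lat/long columns.
cities <- data.frame(
  name = c("Chicago IL", "New York NY", "Los Angeles CA"),
  lat = c(41.84, 40.67, 34.11),
  long = c(-87.68, -73.94, -118.41),
  stringsAsFactors = FALSE
)

# Build a lower-case join key on both sides.
metros$metro_match <- tolower(metros$metro)
cities$city_name <- sub(" [A-Z]{2}$", "", cities$name)  # drop the state suffix
cities$metro_match <- tolower(cities$city_name)

# Join, rank by metro population, and keep the top n.cities rows.
n.cities <- 2  # small value for illustration only
joined <- merge(metros, cities, by = "metro_match")
joined <- joined[order(-joined$population), ]
top <- head(joined, n.cities)
top$rank <- seq_len(nrow(top))

# Combined population of the selected metro areas, in millions.
total.millions <- sum(top$population) / 1e6
```

Joining on a normalized key rather than the raw name avoids mismatches such as "WASHINGTON DC" versus "Washington".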
## set up search terms
searchString.x <- "#AI" # search term
n.x <- 3000 # number of tweets
radius <- "30mi" # radius around selected geo-location
days.ago <- 10 # how many days before today the search window starts
duration.days <- 7 # length of the search window in days
since.date <- (Sys.Date() - days.ago) %>% as.character # calculated starting date
until.date <- (Sys.Date() - days.ago + duration.days) %>% as.character # calculated ending date
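To make the window arithmetic concrete, here is the same calculation with a fixed reference date in place of Sys.Date(). The reference date is an arbitrary assumption chosen for illustration, and base R calls are used instead of the pipe.

```r
ref.date <- as.Date("2016-10-01")  # assumed stand-in for Sys.Date()
days.ago <- 10
duration.days <- 7

# Window starts days.ago before the reference date and runs for duration.days.
since.date <- as.character(ref.date - days.ago)
until.date <- as.character(ref.date - days.ago + duration.days)

since.date  # "2016-09-21"
until.date  # "2016-09-28"
```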
Use the twitteR::searchTwitter command.
[1] "Rate limited .... blocking for a minute and retrying up to 119 times ..."
[1] "Rate limited .... blocking for a minute and retrying up to 118 times ..."
[1] "Rate limited .... blocking for a minute and retrying up to 117 times ..."
[1] "Rate limited .... blocking for a minute and retrying up to 116 times ..."
[1] "Rate limited .... blocking for a minute and retrying up to 115 times ..."
[1] "Rate limited .... blocking for a minute and retrying up to 114 times ..."
[1] "Rate limited .... blocking for a minute and retrying up to 113 times ..."
[1] "Rate limited .... blocking for a minute and retrying up to 112 times ..."
[1] "Rate limited .... blocking for a minute and retrying up to 111 times ..."
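A minimal sketch of how the per-city search might be issued, assuming the settings defined earlier and one row of the city table. The helper names build_geocode and search_city are hypothetical. The searchTwitter arguments used (n, since, until, geocode, retryOnRateLimit) belong to the twitteR package, and the value 120 for retryOnRateLimit is an assumption that would be consistent with the "retrying up to 119 times" messages above. The search itself needs Twitter credentials, so only the geocode helper is exercised here.

```r
# Hypothetical helper: format "lat,long,radius" for the geocode argument.
build_geocode <- function(lat, long, radius) {
  paste(lat, long, radius, sep = ",")
}

# Hypothetical wrapper around twitteR::searchTwitter for one city.
# Not run here: it needs the twitteR package and an authenticated session.
search_city <- function(lat, long, search.string, n, radius, since, until) {
  twitteR::searchTwitter(
    search.string,
    n = n,
    since = since,
    until = until,
    geocode = build_geocode(lat, long, radius),
    retryOnRateLimit = 120  # assumed value
  )
}

# Geocode string for New York, using lat/long from the city table.
ny.geocode <- build_geocode(40.67, -73.94, "30mi")
```

Calling search_city once per row of the city table would yield one set of tweets per metro area, which can then be counted to give n.tweets.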
#map.plot + ## use this to underlay with google map
ggplot() + ## use this to underlay with simple border outlines
  geom_polygon(data = state.map %>% filter(region != "alaska" & region != "hawaii"),
               aes(x = long, y = lat, group = group), fill = "#DAC143", color = "#36180D") +
  geom_point(aes(x = lon, y = lat, fill = tweet.flux, size = n.tweets),
             data = analyzed_df, pch = 21, color = "#33333399") +
  ggtitle(paste0(searchString.x, " tweets from ", since.date, " until ", until.date)) +
  scale_fill_gradient(low = "#92A0CD", high = "#B32F2A", space = "Lab",
                      na.value = "grey50", guide = "colourbar") +
  theme_bw()
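Here are the top few cities by tweet flux, measured in “twipermipeds” (tweets per million people per day; for example, San Jose's 1464 tweets over 7 days across 1,976,836 people gives 105.80).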
name | tweet.flux | n.tweets | population |
---|---|---|---|
San Jose CA | 105.80 | 1464 | 1976836 |
San Francisco CA | 67.50 | 2200 | 4656132 |
Seattle WA | 47.22 | 1234 | 3733580 |
Boston MA | 39.74 | 1328 | 4774321 |
Orlando FL | 28.19 | 471 | 2387138 |
Austin TX | 21.42 | 300 | 2000860 |
Nashville TN | 18.97 | 243 | 1830345 |
Denver CO | 18.93 | 373 | 2814330 |
Los Angeles CA | 17.20 | 1606 | 13340068 |
New York NY | 16.70 | 2360 | 20182305 |
San Diego CA | 14.59 | 337 | 3299521 |
WASHINGTON DC | 12.25 | 523 | 6097684 |
Phoenix AZ | 10.59 | 339 | 4574531 |
Chicago IL | 7.64 | 511 | 9551031 |
Portland OR | 7.47 | 125 | 2389228 |
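The flux values in the table are consistent with tweets per million people per day over the 7-day window. A minimal sketch of that calculation, using three rows from the table; the data frame name counts is hypothetical, and the formula is inferred from the numbers rather than taken from the original code.

```r
duration.days <- 7

# Tweet counts and metro populations copied from the table above.
counts <- data.frame(
  name = c("San Jose CA", "San Francisco CA", "Chicago IL"),
  n.tweets = c(1464, 2200, 511),
  population = c(1976836, 4656132, 9551031),
  stringsAsFactors = FALSE
)

# Tweets per million people per day, rounded as in the table.
counts$tweet.flux <- round(
  counts$n.tweets / (counts$population / 1e6) / duration.days,
  2
)
```

Normalizing by both population and window length makes cities of different sizes, and runs with different durations, directly comparable.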
Here are the top few cities sorted by raw tweet count, where the largest metro areas lead. Note that some cities, like Chicago, have a large number of tweets (511) but a low flux (7.64) because of their larger population.
name | tweet.flux | n.tweets | population |
---|---|---|---|
New York NY | 16.70 | 2360 | 20182305 |
San Francisco CA | 67.50 | 2200 | 4656132 |
Los Angeles CA | 17.20 | 1606 | 13340068 |
San Jose CA | 105.80 | 1464 | 1976836 |
Boston MA | 39.74 | 1328 | 4774321 |
Seattle WA | 47.22 | 1234 | 3733580 |
WASHINGTON DC | 12.25 | 523 | 6097684 |
Chicago IL | 7.64 | 511 | 9551031 |
Orlando FL | 28.19 | 471 | 2387138 |
Denver CO | 18.93 | 373 | 2814330 |
Phoenix AZ | 10.59 | 339 | 4574531 |
San Diego CA | 14.59 | 337 | 3299521 |
Austin TX | 21.42 | 300 | 2000860 |
Atlanta GA | 6.28 | 251 | 5710795 |
Nashville TN | 18.97 | 243 | 1830345 |
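The difference between the two orderings can be reproduced with a small sketch. It assumes a data frame shaped like the tables above (the name results is hypothetical) and uses four rows copied from them.

```r
# Four rows copied from the tables above.
results <- data.frame(
  name = c("New York NY", "San Francisco CA", "San Jose CA", "Chicago IL"),
  tweet.flux = c(16.70, 67.50, 105.80, 7.64),
  n.tweets = c(2360, 2200, 1464, 511),
  stringsAsFactors = FALSE
)

# Descending order by raw tweet count and by population-normalized flux.
by.count <- results$name[order(-results$n.tweets)]
by.flux <- results$name[order(-results$tweet.flux)]

by.count
by.flux
```

New York leads by raw count but drops behind San Jose and San Francisco once the counts are normalized by population.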