The following tutorial is a simplified version of a previous geo-mapping tutorial (http://rpubs.com/cosmopolitanvan/224741). Consult the older version for ways to remove errors in coordinate data.

You can use this tutorial to map Twitter users with coordinates obtained from the Google Maps Geocoding API.

Disclaimer: this tutorial uses part of the code from http://lucaspuente.github.io/notes/2016/04/05/Mapping-Twitter-Followers.

The required R libraries

require(twitteR)
require(data.table)
require(RJSONIO)
require(leaflet)

A quick demo of the location information in a Twitter bio

#Get the location of @UMassPoll. To run the following code, you must first set up Twitter API. 
user<-getUser("UMassPoll")
user$location
[1] "Amherst, MA"

We will map the tweets mentioning @realdonaldtrump. For this task, we use a pre-collected dataset (tweets_mentioning_dt.csv). We will load the CSV file and name it tweets.

tweets <- read.csv("tweets_mentioning_dt.csv")

Create two new columns: one for storing the location disclosed in a user’s Twitter bio, and the other for storing the URL of the user’s profile image.

tweets$user_location_on_twitter_bio <- NA
tweets$profile_image <- NA

We will now loop over each tweet, finding who sent it and the location information from the user’s Twitter bio. Let’s first grab the location info for the first 200 tweets (that is why we put 1:200). You can of course change the number to cover all tweets (the tweets dataframe has 500 tweets, so you can put 1:500).

Because we will use the Twitter API, make sure you run the Twitter API authorization code before trying out the code below.

for (user in unique(as.character(tweets$screenName[1:200]))) { # look up each user only once, even if they sent several tweets
  print(c("finding the profile for:", user))
  Sys.sleep(3) # pause between requests to reduce the chance of a Twitter API rate limit error
  try({
    profile <- getUser(user) # one API call per user instead of two
    tweets$user_location_on_twitter_bio[tweets$screenName == user] <- profile$location
    tweets$profile_image[tweets$screenName == user] <- profile$profileImageUrl
  })
}

While running the previous code, you have likely noticed some errors returned from the Twitter API (such as Not Found (HTTP 404)). When a lookup fails, the code ignores the error and proceeds to the next user. There are many reasons for errors from the Twitter API. One common reason is that the Twitter account has been suspended or deleted, so there is no location information to be found for that user.
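The skip-on-error behavior can be tried out without calling the Twitter API at all. In the minimal sketch below, lookup_location is a hypothetical stand-in for getUser(user)$location (it is not part of twitteR), and the account names and locations are made up for illustration. tryCatch records NA for the failed lookup and the loop carries on.

```r
# Hypothetical stand-in for getUser(user)$location; NOT a twitteR function.
# It raises an error for unknown accounts to imitate a "Not Found (HTTP 404)" reply.
lookup_location <- function(user) {
  known <- c(UMassPoll = "Amherst, MA", some_user = "Boston, MA")
  if (!user %in% names(known)) stop("Not Found (HTTP 404)")
  known[[user]]
}

users <- c("UMassPoll", "deleted_account", "some_user")
locations <- rep(NA_character_, length(users))

for (i in seq_along(users)) {
  # On error, store NA for this user and move on to the next one.
  locations[i] <- tryCatch(
    lookup_location(users[i]),
    error = function(e) {
      message("skipping ", users[i], ": ", conditionMessage(e))
      NA_character_
    }
  )
}

print(locations)
```

Compared with a bare try(), tryCatch makes the fallback value explicit, so a failed lookup can never leave an old value behind.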

The location information is stored in the column tweets$user_location_on_twitter_bio. We can match the cities and states in that column with coordinates through the Google Maps Geocoding API. To do that, obtain an API key (https://developers.google.com/maps/documentation/geocoding/get-api-key). At the time of writing, standard users of the API are limited to 2,500 requests per day.
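For orientation, a geocoding request is a URL that carries the address and your key. The sketch below builds such a URL by hand with base R. build_geocode_url is a helper name made up for this illustration and "YOUR_API_KEY" is a placeholder; the tutorial itself relies on the geocode function published by Lucas Puente rather than on this helper.

```r
# Hypothetical helper (not part of any package): build a Geocoding API request URL.
build_geocode_url <- function(address, api_key) {
  paste0(
    "https://maps.googleapis.com/maps/api/geocode/json",
    # reserved = TRUE also encodes characters such as "," so the address is a safe query value
    "?address=", URLencode(address, reserved = TRUE),
    "&key=", api_key
  )
}

url <- build_geocode_url("Amherst, MA", "YOUR_API_KEY")
print(url)
```

Encoding the address matters because Twitter bio locations are free text and often contain spaces, commas, or other characters that are not valid in a URL.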

#create a function for getting coordinates from the Google Maps Geocoding API. We use the code published by Lucas Puente (http://lucaspuente.github.io/notes/2016/04/05/Mapping-Twitter-Followers)
tweets$lat <-NA
tweets$lng <-NA

source("https://raw.githubusercontent.com/LucasPuente/geocoding/master/geocode_helpers.R")
source("https://raw.githubusercontent.com/LucasPuente/geocoding/master/modified_geocode.R")
geocode_apply <- function(x) {
  geocode(x, source = "google", output = "all", api_key = "xxx") # replace "xxx" with your own API key
}

We will create a separate dataframe ONLY for tweets containing geo-info. This dataframe is named tweets_withgeo. We will then loop over the user_location_on_twitter_bio column in tweets_withgeo and find corresponding coordinates for each location.

tweets_withgeo <- tweets[tweets$user_location_on_twitter_bio != "" & !is.na(tweets$user_location_on_twitter_bio),]
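To see what this filter does, here is a minimal sketch on a made-up data frame (toy and its values are illustrative, not taken from the tutorial's dataset). Both conditions are needed: an empty string means the user left the location blank, and NA means no profile was retrieved.

```r
# Made-up example data: two usable locations, one empty string, one NA.
toy <- data.frame(
  screenName = c("a", "b", "c", "d"),
  user_location_on_twitter_bio = c("Amherst, MA", "", NA, "Boston, MA"),
  stringsAsFactors = FALSE
)

# NA != "" evaluates to NA, and indexing rows with NA would produce an all-NA row,
# so the is.na() check is combined with the empty-string check.
has_location <- !is.na(toy$user_location_on_twitter_bio) &
  toy$user_location_on_twitter_bio != ""

toy_withgeo <- toy[has_location, ]
print(toy_withgeo)
```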

for (name in unique(head(tweets_withgeo$user_location_on_twitter_bio, 149))) { # geocode the locations found in the first 149 rows via the Google Maps Geocoding API
  rowid <- which(tweets_withgeo$user_location_on_twitter_bio == name) # every row that shares this location
  print(paste0("getting the coordinates for: ", name, ", rowid is: ", toString(rowid)))
  Sys.sleep(1)
  geodata <- try(geocode_apply(name)) # on failure, geodata holds a "try-error" object rather than the previous result

  # keep the result only when the request succeeded and returned exactly one match
  if (!inherits(geodata, "try-error") && isTRUE(geodata$status == "OK") && length(geodata$results) == 1) {
    print(c("the lat is:", geodata$results[[1]]$geometry$location[[1]]))
    print(c("the lng is:", geodata$results[[1]]$geometry$location[[2]]))
    tweets_withgeo$lat[rowid] <- geodata$results[[1]]$geometry$location[[1]]
    tweets_withgeo$lng[rowid] <- geodata$results[[1]]$geometry$location[[2]]
  } else {
    print("skipping")
  }
}
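
The condition inside the loop can be tried out on a hand-built list. The structure below only imitates the fields the loop reads (status, results, geometry$location); it is an assumption based on how the loop indexes the response, not a captured API reply. extract_coords is a hypothetical helper and the coordinate values are rough, made-up numbers.

```r
# Hand-built stand-ins for a geocoding response (assumed structure, made-up values).
fake_ok <- list(
  status = "OK",
  results = list(
    list(geometry = list(location = list(lat = 42.37, lng = -72.52)))
  )
)
fake_empty <- list(status = "ZERO_RESULTS", results = list())

# Hypothetical helper: return coordinates only for a successful, unambiguous match.
extract_coords <- function(geodata) {
  if (identical(geodata$status, "OK") && length(geodata$results) == 1) {
    loc <- geodata$results[[1]]$geometry$location
    c(lat = loc[[1]], lng = loc[[2]])
  } else {
    c(lat = NA_real_, lng = NA_real_)
  }
}

print(extract_coords(fake_ok))
print(extract_coords(fake_empty))
```

Requiring exactly one result is a conservative choice: a bio location that matches several places is skipped rather than mapped to a guess.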

#create a separate dataframe called tweets_withgeo_show. This dataframe contains only rows with complete coordinates.
tweets_withgeo_show <- tweets_withgeo[!is.na(tweets_withgeo$lat), c("lat", "lng", "user_location_on_twitter_bio", "profile_image", "text")]
#draw one circle marker per row; clicking a marker shows the location text from the user's bio.
map1 <- leaflet(data = tweets_withgeo_show) %>%
  addProviderTiles("CartoDB.Positron") %>%
  setView(lng = -98.35, lat = 39.50, zoom = 4) %>%
  addCircleMarkers(
    lng = ~lng, lat = ~lat,
    popup = ~ as.character(user_location_on_twitter_bio),
    stroke = FALSE, fillOpacity = 0.5
  )
map1

Try a different style? Show users’ profile images on the map

usericon <- makeIcon(
  iconUrl = tweets_withgeo_show$profile_image,
  iconWidth = 15, iconHeight = 15
)
map2 <- leaflet(data = tweets_withgeo_show) %>%
  addTiles() %>%
  setView(lng = -98.35, lat = 39.50, zoom = 4) %>%
  addMarkers(
    lng = ~lng, lat = ~lat,
    popup = ~ as.character(user_location_on_twitter_bio),
    icon = usericon
  ) %>%
  addProviderTiles(providers$Esri.NatGeoWorldMap)
map2

This tutorial is developed for COMM497DB Fall 2017, taught at UMass-Amherst.

If you find this tutorial helpful and would like to use it in your projects, please acknowledge the source:

Xu, Weiai W. (2017). How to Detect Sentiments from Donald Trump’s Tweets?. Amherst, MA: http://curiositybits.com

