---
title: "Geo-Mapping Twitter Users"
output:
  html_notebook: default
  html_document: default
---
In this tutorial, I will introduce R code to map Twitter users with coordinates obtained from the Google Maps Geocoding API. This tutorial builds on Lucas Puente's post: http://lucaspuente.github.io/notes/2016/04/05/Mapping-Twitter-Followers.

Let's load the necessary R libraries.
```{r warning = FALSE, results="hide"}
library(twitteR)     # Twitter API client
library(data.table)  # rbindlist()
library(RJSONIO)     # JSON parsing
library(leaflet)     # interactive maps
```

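Here is a quick example of the location information in a Twitter profile.
```{r}
# Get the location of @UMassPoll. To run this code, you must first authenticate
# with the Twitter API (for example, with twitteR's setup_twitter_oauth()).
user <- getUser("UMassPoll")
user$location  # "Amherst, MA"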
```

Now we know where @UMassPoll is located. But a more interesting question is: where are @UMassPoll's followers? To answer it, we first create an R function called get_followers. The function downloads follower profiles from the Twitter API, drops users whose location is blank, and replaces any "%" in a location with a space, since that special character can interfere with the geocoding request later. Because the Twitter API is rate limited, retryOnRateLimit is set to 180, which lets the request retry up to 180 times when the limit is hit.
```{r, echo=TRUE}
# Create a function called get_followers. We can reuse this function for other accounts.
get_followers <- function(username){
  user <- getUser(username)
  print(c("downloading the followers of:", username))
  # getFollowers() returns a list of user objects; retry up to 180 times if rate limited.
  followers <- user$getFollowers(retryOnRateLimit = 180)
  followers_df <- rbindlist(lapply(followers, as.data.frame))
  # Drop users with a blank location, then replace "%" with a space.
  followers_df <- subset(followers_df, location != "")
  followers_df$location <- gsub("%", " ", followers_df$location)
  return(followers_df)
}
```
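
To see what the two cleaning steps inside get_followers do without calling the Twitter API, here is a minimal sketch on a made-up table. The data frame `toy_followers` and its contents are assumptions for illustration only.

```r
# Toy follower table; the names and locations are made up for illustration.
toy_followers <- data.frame(
  screenName = c("user_a", "user_b", "user_c"),
  location   = c("Amherst, MA", "", "100% Boston"),
  stringsAsFactors = FALSE
)

# Same two steps as in get_followers(): drop blank locations, then replace "%".
toy_clean <- subset(toy_followers, location != "")
toy_clean$location <- gsub("%", " ", toy_clean$location)

toy_clean$location
```

The user with a blank location is dropped, and the "%" in the last entry becomes a space.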

We can now apply the function to download a list of @UMassPoll's followers. 
```{r results="hide"}
followers_df <- get_followers("UMassPoll")
```

The location information is stored in the column named location. We can match these cities and states to coordinates through the Google Maps Geocoding API. To do that, obtain an API key (https://developers.google.com/maps/documentation/geocoding/get-api-key). At the time of writing, standard Google Maps API users are limited to 2,500 geocoding requests per day.
```{r, echo=TRUE}
# Create a function for getting coordinates from the Google Maps Geocoding API.
# We use the code published by Lucas Puente
# (http://lucaspuente.github.io/notes/2016/04/05/Mapping-Twitter-Followers).
source("https://raw.githubusercontent.com/LucasPuente/geocoding/master/geocode_helpers.R")
source("https://raw.githubusercontent.com/LucasPuente/geocoding/master/modified_geocode.R")

geocode_apply <- function(x){
  # Replace "xxx" with your own API key.
  geocode(x, source = "google", output = "all", api_key = "xxx")
}
```
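
Rather than typing the key into the script, it may be safer to read it from an environment variable. The sketch below also shows roughly what a request to the Geocoding API's JSON endpoint looks like. The variable name `GOOGLE_MAPS_API_KEY` and the helper `build_geocode_url` are assumptions made for this example, and this is not necessarily how the sourced geocode function builds its requests.

```r
# Read the key from an environment variable instead of hard-coding it.
# GOOGLE_MAPS_API_KEY is a hypothetical variable name chosen for this example.
api_key <- Sys.getenv("GOOGLE_MAPS_API_KEY", unset = "xxx")

# Hypothetical helper: build the request URL for one address.
build_geocode_url <- function(address, key) {
  paste0(
    "https://maps.googleapis.com/maps/api/geocode/json",
    "?address=", URLencode(address, reserved = TRUE),
    "&key=", key
  )
}

# The comma and the space are percent-encoded so the address is safe in a URL.
build_geocode_url("Amherst, MA", "xxx")
```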

Let's apply geocode_apply to each follower's location to get coordinates. 
```{r, message=FALSE}
geocode_results <- sapply(followers_df$location, geocode_apply, simplify = FALSE)
```
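
Many followers report the same location, so with a daily request limit it can pay to geocode each distinct location only once. The following is a minimal sketch of that idea. `fake_geocode` is a hypothetical stand-in for geocode_apply that counts calls instead of contacting the API, and the locations are made up.

```r
# Hypothetical stand-in for geocode_apply(); it counts calls instead of contacting an API.
calls <- 0
fake_geocode <- function(x) {
  calls <<- calls + 1
  list(status = "OK", query = x)
}

locations <- c("Amherst, MA", "Boston, MA", "Amherst, MA", "Boston, MA", "Amherst, MA")

# Geocode each distinct location once, then map the results back to every row by name.
unique_locations <- unique(locations)
unique_results <- sapply(unique_locations, fake_geocode, simplify = FALSE)
all_results <- unique_results[locations]

calls                # number of lookups actually made
length(all_results)  # one result per original row
```

Here five rows need only two lookups. In the tutorial's code, the same pattern would apply to `unique(followers_df$location)`.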

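Use the following code to clean the geocoding results: keep only the queries that succeeded and returned exactly one match, then extract the formatted address and the coordinates. 
```{r}
# Keep responses whose status is "OK".
condition_a <- sapply(geocode_results, function(x) x["status"]=="OK")
geocode_results <- geocode_results[condition_a]
# Keep responses that contain exactly one result (an unambiguous match).
condition_b <- lapply(geocode_results, lapply, length)
condition_b2 <- sapply(condition_b, function(x) x["results"]=="1")
geocode_results <- geocode_results[condition_b2]
source("https://raw.githubusercontent.com/LucasPuente/geocoding/master/cleaning_geocoded_results.R")
# Flatten each response and keep the address and the lat/lng pair.
results_b <- lapply(geocode_results, as.data.frame)
results_c <- lapply(results_b, function(x) subset(x, select=c("results.formatted_address",
                                                              "results.geometry.location")))
# Row 1 of results.geometry.location holds the latitude, row 2 the longitude.
results_d <- lapply(results_c, function(x) data.frame(Location=x[1,"results.formatted_address"],
                                                      lat=x[1,"results.geometry.location"],
                                                      lng=x[2,"results.geometry.location"]))
# Combine everything into a single table.
results_e <- rbindlist(results_d)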
```
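
The filtering logic above can be hard to follow without seeing the shape of a response. The sketch below applies the same two conditions to mock responses and pulls the coordinates out by name, which is an alternative to the as.data.frame route used above, not the original method. The mock list only imitates the fields used here (status, results, formatted_address, geometry$location), and its values are illustrative.

```r
# Mock responses shaped like the parts of a geocoding reply used here.
mock_results <- list(
  "Amherst, MA" = list(
    status = "OK",
    results = list(list(
      formatted_address = "Amherst, MA, USA",
      geometry = list(location = c(lat = 42.37, lng = -72.52))
    ))
  ),
  "somewhere" = list(status = "ZERO_RESULTS", results = list()),
  "Springfield" = list(
    status = "OK",
    results = list(
      list(formatted_address = "Springfield, IL, USA",
           geometry = list(location = c(lat = 39.78, lng = -89.65))),
      list(formatted_address = "Springfield, MA, USA",
           geometry = list(location = c(lat = 42.10, lng = -72.59)))
    )
  )
)

# Keep replies that succeeded and returned exactly one match.
keep <- vapply(
  mock_results,
  function(x) identical(x$status, "OK") && length(x$results) == 1,
  logical(1)
)

# Pull the address and coordinates out of each remaining reply.
rows <- lapply(mock_results[keep], function(x) {
  hit <- x$results[[1]]
  data.frame(
    Location = hit$formatted_address,
    lat = hit$geometry$location[["lat"]],
    lng = hit$geometry$location[["lng"]]
  )
})
mock_clean <- do.call(rbind, rows)
mock_clean
```

Only the first entry survives: the second query found nothing, and the third is ambiguous because it matched two places.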

Now we have a data frame of Twitter followers with coordinates matching the self-reported locations in their Twitter bios. Let's use the data to create an interactive map.
```{r}
# Center the view on the contiguous United States and add a marker per follower.
map1 <- leaflet(data = results_e) %>% 
  addTiles() %>%
  setView(lng = -98.35, lat = 39.50, zoom = 4) %>% 
  addMarkers(lng = ~lng, lat = ~lat, popup = ~ as.character(Location)) %>% 
  addProviderTiles("CartoDB.Positron") %>%
  addCircleMarkers(
    lng = ~lng, lat = ~lat,
    stroke = FALSE, fillOpacity = 0.5
  ) 
map1
```
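
If markers are missing or appear in unexpected places, a quick sanity check on the coordinates can help. This is a minimal sketch on made-up points, where `toy_points` stands in for a table like results_e: it keeps only rows whose latitude and longitude are present and within the valid ranges.

```r
# Toy coordinates; the second and third rows are deliberately invalid.
toy_points <- data.frame(
  Location = c("Amherst, MA, USA", "bad latitude", "missing longitude"),
  lat = c(42.37, 142.37, 40.00),
  lng = c(-72.52, -72.52, NA)
)

# A point is plottable when both values are present and within range.
valid <- !is.na(toy_points$lat) & !is.na(toy_points$lng) &
  abs(toy_points$lat) <= 90 & abs(toy_points$lng) <= 180

toy_points[valid, ]
```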

Try a different style.
```{r}
map1 %>% addProviderTiles("Stamen.Toner")
```

Or use the default OpenStreetMap tiles.
```{r}
map1 %>% addTiles() 
```

Next, we will map Twitter users who have tweeted a given hashtag. To begin with, we download each user's profile from the Twitter API, since the profile contains the self-reported location that we will geocode. After running the following code, you will have a data frame named user_info that contains the Twitter user profiles.

```{r}
# Load the tweets you want to visualize. 
tweets <- read.csv("hashtagtweets.csv", stringsAsFactors = FALSE)

# Mine the profiles of the first 50 users in the data. 
users <- head(tweets$screenName, 50)
# Start from an empty result so that the first user is not added twice.
user_info <- NULL
for (user in users){
  message("mining the profile for: ", user)
  # Because of the Twitter API rate limit, pause for 5 seconds before each request. 
  Sys.sleep(5) 
  a <- as.data.frame(getUser(user))
  user_info <- rbind(user_info, a)
}
```
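
A loop like this stops at the first account that cannot be retrieved, for example one that has been deleted or suspended. One way to keep going is to wrap each lookup in tryCatch, sketched below. `fetch_profile` is a hypothetical stand-in for getUser that fails for one made-up account, so the example runs without Twitter credentials.

```r
# Hypothetical stand-in for getUser(); it fails for one account on purpose.
fetch_profile <- function(name) {
  if (name == "deleted_user") stop("user not found")
  data.frame(screenName = name, location = "Amherst, MA", stringsAsFactors = FALSE)
}

names_to_fetch <- c("user_a", "deleted_user", "user_b")

# Collect one data frame per user; a failed lookup yields NULL instead of stopping the loop.
profiles <- lapply(names_to_fetch, function(name) {
  tryCatch(fetch_profile(name), error = function(e) NULL)
})

# rbind() skips the NULL entries.
profile_df <- do.call(rbind, profiles)
profile_df$screenName
```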

Earlier, we created a function called geocode_apply. We can apply it to the location column of our data frame named user_info.
```{r, message=FALSE, warning=FALSE}
geocode_results<-sapply(user_info$location, geocode_apply, simplify = F)
```

We can reuse the same code to clean the coordinate data.
```{r, message=FALSE, warning=FALSE}
condition_a <- sapply(geocode_results, function(x) x["status"]=="OK")
geocode_results<-geocode_results[condition_a]
condition_b <- lapply(geocode_results, lapply, length)
condition_b2<-sapply(condition_b, function(x) x["results"]=="1")
geocode_results<-geocode_results[condition_b2]
source("https://raw.githubusercontent.com/LucasPuente/geocoding/master/cleaning_geocoded_results.R")
results_b<-lapply(geocode_results, as.data.frame)
results_c<-lapply(results_b,function(x) subset(x, select=c("results.formatted_address",
                                                        "results.geometry.location")))
results_d<-lapply(results_c,function(x) data.frame(Location=x[1,"results.formatted_address"],
                                                  lat=x[1,"results.geometry.location"],
                                                lng=x[2,"results.geometry.location"]))
results_e<-rbindlist(results_d)
```

Now, visualize the Twitter users.
```{r}
# Center the view on the contiguous United States and add a marker per user.
map2 <- leaflet(data = results_e) %>% 
  addTiles() %>%
  setView(lng = -98.35, lat = 39.50, zoom = 4) %>% 
  addMarkers(lng = ~lng, lat = ~lat, popup = ~ as.character(Location)) %>% 
  addProviderTiles("CartoDB.Positron") %>%
  addCircleMarkers(
    lng = ~lng, lat = ~lat,
    stroke = FALSE, fillOpacity = 0.5
  ) 
map2
```
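
When many users share a city, their markers stack on top of each other. One option is to count users per location first, so that each place is drawn once. The sketch below does this with base R's aggregate on made-up data. `toy_users` stands in for a table like results_e, and the count column `n` is an assumption of this example.

```r
# Toy geocoded users; several share the same location.
toy_users <- data.frame(
  Location = c("Amherst, MA, USA", "Boston, MA, USA", "Amherst, MA, USA", "Amherst, MA, USA"),
  lat = c(42.37, 42.36, 42.37, 42.37),
  lng = c(-72.52, -71.06, -72.52, -72.52)
)
toy_users$n <- 1

# One row per location with the number of users found there.
location_counts <- aggregate(n ~ Location + lat + lng, data = toy_users, FUN = sum)
location_counts
```

The resulting `n` column could then drive the marker size, for example through the radius argument of addCircleMarkers.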

