A few days ago I was inspired by this post to create a Twitter bot. As promised, the process was simple, and in no time I had a bot that periodically tweeted out the surf conditions in El Porto, California. Fun! But also sort of boring and sterile. That got me thinking: you know what’s more fun than a bot pinging out a status four times a day? An interactive bot that replies to DMs. Thanks to the DM functions in the twitteR package, this is easy to set up.

First, create the initial bot - the code below sets up a simplified version of the bot I made (as with everything I do, there’s probably a more elegant/efficient way of coding).

#devtools::install_github("geoffjentry/twitteR")
library(twitteR)
library(rvest)
library(stringr)
library(dplyr)

# setup authentication (assumes authentication keys kept local)
api_key <- Sys.getenv("twitter_api_key")
api_secret <- Sys.getenv("twitter_api_secret")
access_token <- Sys.getenv("twitter_access_token")
access_token_secret <- Sys.getenv("twitter_access_token_secret")
options(httr_oauth_cache = TRUE)
setup_twitter_oauth(api_key, api_secret, access_token, access_token_secret)

# scrape data 
current_url <- "http://magicseaweed.com/El-Porto-Beach-Surf-Report/2677/"

current <- read_html(current_url) %>%
  html_nodes(".msw-fc-current-v0 .row") %>%
  html_text()  %>% str_split("        ")
current <- unlist(current)

# clean up empty cells and split phrases for easy use
cond <- current
cond <- cond[!str_detect(cond, "Wind Swell")]
cond <- cond[!str_detect(cond, "Secondary Swell")]
cond[cond==""] <- NA
cond <- cond[complete.cases(cond)]
cond <- str_trim(cond, "both") %>%
  str_split(" ")

waves <- cond[[1]]
wind <- cond[[2]]
swell <- cond[[3]]
weather <- cond[[4]]

current_surf <- str_c(strftime(Sys.time(),"%I:%M %p"),":"," Waves ", waves, " with a ", swell[2], " of ", 
                      swell[6], " at ", swell[8], ". ", wind[3], " wind at ", wind[1], ". ", weather[1], 
                      " and ", weather[3], " Water temp: ", weather[5], weather[6], ". #elporto #surf")
current_surf

# tweet status update
tweet(current_surf)

The above code provides a status update that might be useful to someone thinking of going surfing.
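
To make the clean-up step easier to follow, here is a minimal sketch of it run on made-up input. The `current` vector and the `clean_cells()` helper are hypothetical and only for illustration; the real scraped text will look different. The sketch uses base R equivalents (`grepl`, `trimws`, `strsplit`) of the stringr calls so it runs without extra packages.

```r
# Hypothetical scraped cells; the real page text will differ.
current <- c("  2-3 ft ", "", "Wind Swell 1ft at 5s", "5 mph WSW wind",
             "Secondary Swell 1ft at 12s", "", "Primary Swell of 3ft at 14s",
             "Sunny 68 f")

# clean_cells() is an illustrative helper, not part of the bot itself.
clean_cells <- function(cells) {
  # drop the rows the bot does not report
  cells <- cells[!grepl("Wind Swell|Secondary Swell", cells)]
  # trim whitespace, then drop cells that are empty after trimming
  cells <- trimws(cells)
  cells <- cells[nzchar(cells)]
  # split each remaining phrase into individual words
  strsplit(cells, " ", fixed = TRUE)
}

cond <- clean_cells(current)
length(cond)   # number of phrases that survived the clean-up
cond[[1]]      # words of the first phrase
```

Each element of `cond` is a character vector of words, which is why the bot can pick out pieces such as `wind[3]` or `swell[6]` by position.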

But the status updates are only sent out at specified times of day. What if you want to know if conditions have changed? Sure, you could look online, but that defeats the purpose of this bot, so we’ll ignore that possibility. Instead, you could DM the surf bot and have it send back a response.

The first thing to do is create a file that stores the tweetID. This is a time-based ID unique to each tweet, and it will act as our gatekeeper. Assuming you’re authenticated with read, write and Direct Message authority, the code below will pull the list of DMs for your bot. Note that by default the dmGet() function only pulls 25 DMs, drawn from the bot’s entire DM history, so if there are more than 25 DMs you need to grab more (for example by raising the n argument). Assuming you’ve grabbed all of the DMs, the first one in the list is the most recent, and its tweetID is the initial benchmark we’ll use. We only need this code the first time; after that it can be commented out.

# genesis for data ID
library(readr)  # provides write_tsv(), which the first script does not load
dms <- dmGet()
x <- dms[[1]]
id1 <- data.frame(id1 = as.numeric(x$getId()))
write_tsv(id1, "path/to/id1")
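
The gatekeeper idea can be sketched without touching Twitter at all. The example below is only an illustration under stated assumptions: the IDs are made up, the file is a temp file, and `save_id()`, `load_id()` and `count_new()` are hypothetical helpers written in base R rather than the readr calls used in the bot.

```r
# Sketch of the "gatekeeper" file, using a temp file and made-up IDs.
id_file <- tempfile(fileext = ".tsv")

# Hypothetical helper: write the last-seen ID to a one-column TSV file.
save_id <- function(id, path) {
  # format() keeps the ID out of scientific notation when written as text
  write.table(data.frame(id1 = format(id, scientific = FALSE)), path,
              sep = "\t", row.names = FALSE, quote = FALSE)
}

# Hypothetical helper: read the stored ID back as a number.
load_id <- function(path) {
  as.numeric(read.delim(path, colClasses = "character")$id1)
}

# Hypothetical helper: how many incoming IDs are newer than the benchmark?
count_new <- function(ids, last_id) sum(ids > last_id)

save_id(1005, id_file)
last_id <- load_id(id_file)

incoming <- c(1009, 1007, 1005, 1001)  # newest first, as described above
count_new(incoming, last_id)
```

Because the list is newest first, the count of new IDs is also the number of leading entries worth inspecting, which is how the second script decides which DMs to look at.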

Now, we’ll create a new R script. If the first script was surf_bot.R, then this is surf_bot2.R.

# surf_bot 2 - reply to DMs directed at @ElPortoSurf
library(twitteR)
library(dplyr)
library(readr)
library(stringr)
library(rvest)

# setup authentication
api_key <- Sys.getenv("twitter_api_key")
api_secret <- Sys.getenv("twitter_api_secret")
access_token <- Sys.getenv("twitter_access_token")
access_token_secret <- Sys.getenv("twitter_access_token_secret")
options(httr_oauth_cache = TRUE)
setup_twitter_oauth(api_key, api_secret, access_token, access_token_secret)

# read in current tweetID of last DM responded to
curr_id <- as.numeric(read_table("path/to/id1"))

# establish search words worth replying to
## this is spelled out specifically in the bot account's profile:
## "DM with phrase 'current conditions' for updates"
## or can be done creatively
words <- c('current', 'conditions') 

# gather dms and pull out unique tweetIDs
dms <- dmGet(sinceID = curr_id)
test_id <- sapply(dms, function(x) as.vector(as.numeric(x$getId())))

# test if any DMs are new compared to benchmark
new <- length(test_id[test_id > curr_id])

if (new == 0) {
  NULL
}
if (new > 0) {
  # pull out screen_name of sender and tweet text
  test_name <- sapply(dms, function(x) as.vector(x$getSenderSN()))[c(seq(new))]
  test_text <- sapply(dms, function(x) as.vector(x$getText()))[c(seq(new))]
  
  # test to see if DM matches our search phrase, keep those names that do
  final_name <- as.vector(test_name[which(str_detect(str_to_lower(test_text), 
                                                     str_to_lower(paste(words, collapse = ' '))) == TRUE)])
  
  # scraping current conditions (exact same as above)
  if (length(final_name) > 0) {
    current_url <- "http://magicseaweed.com/El-Porto-Beach-Surf-Report/2677/"

    current <- read_html(current_url) %>%
      html_nodes(".msw-fc-current-v0 .row") %>%
      html_text()  %>% str_split("        ")
    current <- unlist(current)
    
    # break into lists for easy use (drop Wind Swell and Secondary Swell)
    cond <- current
    cond <- cond[!str_detect(cond, "Wind Swell")]
    cond <- cond[!str_detect(cond, "Secondary Swell")]
    cond[cond==""] <- NA
    cond <- cond[complete.cases(cond)]
    cond <- str_trim(cond, "both") %>%
      str_split(" ")
    
    waves <- cond[[1]]
    wind <- cond[[2]]
    swell <- cond[[3]]
    weather <- cond[[4]]
    
    current_surf <- str_c(strftime(Sys.time(),"%I:%M %p"),":"," Waves ", waves, " with a ", swell[2], " of ", 
                          swell[6], " at ", swell[8], ". ", wind[3], " wind at ", wind[1], ". ", weather[1], 
                          " and ", weather[3], " Water temp: ", weather[5], weather[6], ". #elporto #surf")
    current_surf
    
    # make sure not sending multiple DMs to same account
    final_name <- unique(final_name)
   
    # send DM response to each account
    for (i in seq_along(final_name)) {
      dmSend(current_surf, final_name[i])
    }
  }
  else if (length(final_name) == 0) {
    NULL
  }
}

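# write over the tweetID file with updated tweetID
# (only when new DMs arrived; otherwise test_id is empty and the file would be clobbered)
if (new > 0) {
  tmp_id <- data.frame(id1 = test_id[1])
  write_tsv(tmp_id, "path/to/id1")
}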
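
The phrase-matching step is the part most worth testing offline. Here is a minimal sketch of it on made-up DMs; `match_senders()` and the sample names and texts are hypothetical, and base R (`grepl`, `tolower`) stands in for the stringr calls in the script.

```r
# Hypothetical helper: return the unique senders whose DM contains the phrase.
match_senders <- function(senders, texts, words) {
  # build the lower-case search phrase, e.g. "current conditions"
  phrase <- tolower(paste(words, collapse = " "))
  # literal, case-insensitive match against each DM text
  hits <- grepl(phrase, tolower(texts), fixed = TRUE)
  # each account should get at most one reply
  unique(senders[hits])
}

# Made-up DMs for illustration only.
senders <- c("alice", "bob", "alice", "carol")
texts <- c("Current conditions please", "hello bot",
           "current conditions?", "conditions current")

match_senders(senders, texts, c("current", "conditions"))
```

Note that the words are pasted into a single phrase, so a DM has to contain "current conditions" in that order; a message with the two words reversed is not treated as a match.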

And that’s it. Now just schedule your computer to run the file at a regular interval (on a Mac, use launchd), and you have a simple, interactive Twitter bot.

Replication code here