Unit4_RTutorial Week

Introduction

In Unit 4, you will explore text and social network analysis by working through the walkthroughs from the Data Science in Education Using R book. In Unit 3, we examined the foundations of the collaborative data-intensive improvement (CDI) model and discussed text mining methods in education from a theoretical perspective. Now you will be able to practice text mining in Walkthrough 5: Text Analysis with Twitter Data. After you complete Walkthrough 5, you will move on to the social network analysis (SNA) part. In Walkthrough 6, you will explore interactions among Twitter users using SNA. To complete this week’s tutorial, please read through everything carefully and complete all “Your Turn” parts. Each “Your Turn” is worth 1 point, and rendering and publishing your work (e.g., RPubs, QuartoPubs, GitHub) is worth 1 point.

Walkthrough 5: Text Analysis With Social Media Data

Topics Emphasized

  • Tidying data

  • Transforming data

  • Visualizing data

Functions Introduced

  • sample_n()

  • set.seed()

  • tidytext::unnest_tokens()

  • tidytext::get_sentiments()

  • dplyr::inner_join()

Vocabulary

  • RDS files

  • text analysis

  • stop words

  • tokenize

Chapter Overview

In this chapter, we focus on analyzing textual data from Twitter. We focus on this particular data source because we think it is relevant to a number of educational topics and questions, including how newcomers learn to visualize data. In addition, Twitter data is complex: it includes not only information about who posted a tweet and when (along with a great deal of additional information; see Michael W. Kearney et al., 2022), but also the text of the tweet. This makes it especially well-suited for exploring the uses of text analysis, which is broadly part of a group of techniques involving the analysis of text as data, known as Natural Language Processing (often abbreviated NLP) (Hirschberg & Manning, 2015).

Background

When we think about data science in education, our minds tend to go to data stored in spreadsheets. But what can we learn about the student experience from text data? Take a moment to mentally review all the moments in your work day when you generated or consumed text data. In education, we’re surrounded by it. We write our lessons in word processor documents, our students submit assignments online, and members of the school community express themselves on public social media platforms. The text we generate can be an authentic reflection of reality in schools, so how might we learn from it?

Even the most basic text analysis techniques will expand your data science toolkit. For example, you can use text analysis to count the number of key words that appear in open-ended survey responses. You can also analyze word patterns in student responses or message board posts.
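As a rough illustration of the key-word counting idea, here is a minimal sketch in base R. The survey responses and the list of key words are invented for this example and are not part of the walkthrough data.

```r
# Hypothetical open-ended survey responses (invented for illustration)
responses <- c(
  "The homework was helpful, but the feedback came late.",
  "More feedback on homework, please!",
  "I liked the group projects."
)

# Hypothetical key words we want to look for
key_words <- c("homework", "feedback", "projects")

# Lowercase the text, then split it into words on anything that is not a letter
words <- unlist(strsplit(tolower(responses), "[^a-z]+"))
words <- words[words != ""]

# Count how often each key word appears across all responses
key_word_counts <- table(factor(words[words %in% key_words], levels = key_words))
key_word_counts
```

Splitting on non-letters is a deliberately simple choice; it ignores things like contractions and hyphenated words, which a dedicated tokenizer would handle more carefully.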

Analyzing a collection of text is different from analyzing large numerical datasets because words don’t have agreed-upon values the way numbers do. The number 2 will always be more than 1 and less than 3. The word “fantastic,” on the other hand, can express many different degrees of enthusiasm depending on interpretation and context.

Using text analysis can help to broadly estimate what is happening in the text. When paired with observations, interviews, and close review of the text, this approach can help education staff learn from text data. In this chapter, we’ll learn how to count the frequency of words in a dataset and associate those words with common feelings like positivity or joy.

We’ll show these techniques using a dataset of tweets. We encourage you to complete the walkthrough, then reflect on how the skills learned can be applied to other texts, like word processing documents or websites.

Data Source

It’s useful to learn text analysis techniques from datasets that are available for download. Take a moment to do an online search for “download tweet dataset” and note the abundance of Twitter datasets available. Since there are so many, it’s useful to narrow the tweets to only those that help you answer your analytic questions. Hashtags are pieces of text within a tweet that act as a way to categorize content. Here’s an example:

RT @CKVanPay: I’m trying to recreate some Stata code in R, anyone have a good resource for what certain functions in Stata are doing? #RStats #Stata

Twitter recognizes any word that starts with a “#” as a hashtag. The hashtags “#RStats” and “#Stata” make this tweet conveniently searchable. If Twitter users search for “#RStats”, Twitter returns all the tweets containing that hashtag.
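To make the hashtag idea concrete, the sketch below pulls the hashtags out of the example tweet with a regular expression in base R. The pattern `#\\w+` is only a rough approximation chosen for this illustration; it is an assumption, not Twitter’s actual rule for what counts as a hashtag.

```r
# The example tweet from above, stored as a character string
tweet <- "RT @CKVanPay: I’m trying to recreate some Stata code in R, anyone have a good resource for what certain functions in Stata are doing? #RStats #Stata"

# Pull out every run of word characters that follows a "#"
hashtags <- regmatches(tweet, gregexpr("#\\w+", tweet, perl = TRUE))[[1]]
hashtags

# Check whether the tweet carries a given hashtag, ignoring capitalization
has_rstats <- grepl("#rstats\\b", tweet, ignore.case = TRUE, perl = TRUE)
has_rstats
```

The same `grepl()` call applied to a whole column of tweet text would return one TRUE or FALSE per tweet, which is one way to narrow a dataset to a hashtag of interest.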

In this example, we’ll be analyzing a dataset of tweets that have the hashtag #tidytuesday (https://twitter.com/hashtag/tidytuesday). #tidytuesday is a community sparked by the work of one of the Data Science in Education Using R co-authors, Jesse Mostipak, who created the (related) #r4ds community from which #tidytuesday was created. #tidytuesday is a weekly data visualization challenge. A great place to see examples from past #tidytuesday challenges is an interactive Shiny application (https://github.com/nsgrantham/tidytuesdayrocks).

The #tidytuesday hashtag (search Twitter for the hashtag, or see the results here: http://bit.ly/tidytuesday-search) returns tweets about the weekly TidyTuesday practice, where folks learning R create and tweet data visualizations they made while learning to use tidyverse R packages.

Methods

In this walkthrough, we’ll be learning how to count words in a text dataset. We’ll also use a technique called sentiment analysis to count and visualize the appearance of words that have a positive association. Lastly, we’ll learn how to get more context by selecting random rows of tweets for closer reading.
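The three steps described above can be previewed on a toy example before working with the real tweets. This is a minimal sketch, assuming the {dplyr}, {tibble}, and {tidytext} packages are installed; the tweets and the two-word lexicon are invented stand-ins, not the walkthrough’s data or the published lexicon it relies on.

```r
library(dplyr)
library(tidytext)

# Hypothetical tweets, invented for illustration (not from tt_tweets)
toy_tweets <- tibble::tibble(
  status_id = c("1", "2", "3"),
  text = c(
    "My first plot for the challenge, I love this community",
    "Love the new plot theme and the colors",
    "A great dataset this week"
  )
)

# Tokenize: one row per word (unnest_tokens() lowercases by default)
tokens <- toy_tweets %>%
  unnest_tokens(word, text)

# Drop common stop words, then count the remaining words
word_counts <- tokens %>%
  anti_join(stop_words, by = "word") %>%
  count(word, sort = TRUE)

# A tiny stand-in lexicon (hypothetical); keep only words it labels as positive
toy_lexicon <- tibble::tibble(word = c("love", "great"), sentiment = "positive")
positive_counts <- tokens %>%
  inner_join(toy_lexicon, by = "word") %>%
  count(word, sort = TRUE)

# Draw a reproducible random sample of tweets for closer reading
set.seed(2020)
tweets_to_read <- sample_n(toy_tweets, 2)
tweets_to_read
```

Calling `set.seed()` before `sample_n()` means the same rows are drawn each time the code runs, which makes the closer reading reproducible for anyone rendering the document.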

Load Packages

For this analysis, we’ll be using the {tidyverse}, {here}, and {dataedu} packages. We will also use the {tidytext} package for working with textual data (Robinson & Silge, 2022). As it has not been used previously in the book, you may need to install the {tidytext} package first (along with the other packages, if you haven’t installed them yet). For instructions on and an overview of installing packages, see the Packages section of the Foundational Skills chapter.
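If you are unsure which of these packages are already installed, a small check like the one below may help. It is only a sketch: `find_missing_packages()` is a hypothetical helper written for this tutorial, not a function from any package, and the install line is left commented out so that nothing is installed by accident.

```r
# Hypothetical helper: return the names of packages that are not installed
find_missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Packages from CRAN used in this walkthrough
needed <- c("tidyverse", "here", "tidytext")
missing_pkgs <- find_missing_packages(needed)
missing_pkgs

# Uncomment to install whatever is missing from CRAN
# if (length(missing_pkgs) > 0) install.packages(missing_pkgs)
```

Note that `requireNamespace()` loads a package’s namespace when it is found, but it does not attach the package, so you still need the `library()` calls that follow.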

Let’s load our packages before moving on to import the data:

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.3     ✔ readr     2.1.4
✔ forcats   1.0.0     ✔ stringr   1.5.0
✔ ggplot2   3.4.3     ✔ tibble    3.2.1
✔ lubridate 1.9.2     ✔ tidyr     1.3.0
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(here)
Warning: package 'here' was built under R version 4.3.2
here() starts at C:/Users/AndrzejKmiecik/OneDrive - Core4ce/Documents/NIU/ETR537/Week 12
library(tidytext)

We will load the {dataedu} package as well. Before loading it, however, we need to install it. In your previous assignment, you practiced installing a package from the CRAN repository. Yet that is not the only way to install a package. Sometimes developers create new packages and share them through GitHub. This is the case for the {dataedu} package. Check the code below and run it.

# install remotes
install.packages("remotes", repos = "http://cran.us.r-project.org")

# install the dataedu package
remotes::install_github("data-edu/dataedu")

# load the package
library(dataedu)

Import Data

We’ve included the raw dataset of TidyTuesday tweets in the {dataedu} package. You can see the dataset by typing tt_tweets. Let’s start by assigning the name raw_tweets to this dataset:

raw_tweets <- dataedu::tt_tweets
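Tweet datasets often contain list-columns (for example, several hashtags per tweet), and `str()` prints the nested elements of such columns at length. The sketch below shows one way to get a more compact overview; it uses a small invented data frame rather than raw_tweets so that it runs on its own.

```r
# Hypothetical stand-in for a tweet dataset with a list-column
toy <- data.frame(
  screen_name = c("user_a", "user_b", "user_c"),
  favorite_count = c(9L, 1L, 0L),
  stringsAsFactors = FALSE
)
toy$hashtags <- list(
  c("TidyTuesday", "rstats"),
  "TidyTuesday",
  c("rstats", "dataviz", "r4ds")
)

# Dimensions and column names give a quick first impression
dim(toy)
names(toy)

# Full structure: every element of the list-column is printed
full_view <- capture.output(str(toy))

# Compact structure: max.level = 1 stops before the nested list elements
compact_view <- capture.output(str(toy, max.level = 1))
cat(compact_view, sep = "\n")
```

The same `max.level` argument can be passed when calling `str()` on a larger dataset if the full output is too long to read comfortably.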

Your Turn:

Let’s return to our raw_tweets dataset. Run str(raw_tweets) and notice the number of variables in this dataset. It’s good practice to use functions like glimpse(), head(), or str() to look at the data type of each variable.

str(raw_tweets)
tibble [4,418 × 90] (S3: tbl_df/tbl/data.frame)
 $ user_id                : chr [1:4418] "1159211379963772928" "1073323083170238466" "1073323083170238466" "1073323083170238466" ...
 $ status_id              : chr [1:4418] "1163154266065735680" "1163247504542130176" "1145043578479108097" "1116864894144528384" ...
 $ created_at             : POSIXct[1:4418], format: "2019-08-18 18:22:42" "2019-08-19 00:33:11" ...
 $ screen_name            : chr [1:4418] "MKumarYYC" "cizzart" "cizzart" "cizzart" ...
 $ text                   : chr [1:4418] "First #TidyTuesday submission! Roman emperors and their rise to power in different eras. Interested to hear abo"| __truncated__ "El saqueo de #Macri \nEl #MERVAL colapsó el lunes 12/08 y fue la segunda mas grande en la historia desde 1950 s"| __truncated__ "Proyeccion dinamica de población por SEXO para la provincia de #Misiones para los años 2010-2040 elaborado a pa"| __truncated__ "#Argentina Número de SMS enviados en el periodo 2013-2018. Los numeros no dejan dudas: el declive del SMS es in"| __truncated__ ...
 $ source                 : chr [1:4418] "Twitter Web App" "Twitter Web App" "Twitter Web Client" "Twitter Web Client" ...
 $ display_text_width     : num [1:4418] 178 280 196 214 262 227 260 271 117 268 ...
 $ reply_to_status_id     : chr [1:4418] NA NA NA NA ...
 $ reply_to_user_id       : chr [1:4418] NA NA NA NA ...
 $ reply_to_screen_name   : chr [1:4418] NA NA NA NA ...
 $ is_quote               : logi [1:4418] FALSE FALSE FALSE FALSE FALSE FALSE ...
 $ is_retweet             : logi [1:4418] FALSE FALSE FALSE FALSE FALSE FALSE ...
 $ favorite_count         : int [1:4418] 9 1 1 1 0 1 1 1 6 5 ...
 $ retweet_count          : int [1:4418] 3 1 0 1 1 1 1 0 3 2 ...
 $ quote_count            : int [1:4418] NA NA NA NA NA NA NA NA NA NA ...
 $ reply_count            : int [1:4418] NA NA NA NA NA NA NA NA NA NA ...
 $ hashtags               :List of 4418
  ..$ : chr [1:3] "TidyTuesday" "rstats" "DataScience"
  ..$ : chr [1:6] "Macri" "MERVAL" "Bloomberg" "rstats" ...
  ..$ : chr [1:6] "Misiones" "rstats" "dataviz" "TidyTuesday" ...
  ..$ : chr [1:6] "Argentina" "dataviz" "rstats" "TidyTuesday" ...
  ..$ : chr [1:6] "Misiones" "tidytuesday" "rstats" "Posadas" ...
  ..$ : chr [1:5] "Misiones" "rstats" "TidyTuesday" "dataviz" ...
  ..$ : chr [1:6] "Posadas" "rstats" "dataviz" "TidyTuesday" ...
  ..$ : chr [1:6] "dataviz" "Misiones" "Argentina" "posadas" ...
  ..$ : chr [1:4] "Posadas" "Misiones" "rstats" "TidyTuesday"
  ..$ : chr [1:8] "Posadas" "Misiones" "rstats" "dataviz" ...
  ..$ : chr [1:3] "Misiones" "rstats" "TidyTuesday"
  ..$ : chr [1:4] "Misiones" "Posadas" "TidyTuesday" "dataviz"
  ..$ : chr [1:7] "Elecciones2019" "MisionerismoPuro" "MisionesVota" "rstats" ...
  ..$ : chr [1:6] "Misiones" "género" "rstats" "TidyTuesday" ...
  ..$ : chr [1:6] "Elecciones2019" "Misiones" "TidyTuesday" "dataviz" ...
  ..$ : chr [1:6] "Misiones" "Posadas" "dataviz" "TidyTuesday" ...
  ..$ : chr [1:7] "Elecciones2019" "Misiones" "CierreDeListas" "Posadas" ...
  ..$ : chr [1:7] "Argentina" "TidyTuesday" "r4ds" "economia" ...
  ..$ : chr [1:5] "Misiones" "rstats" "Posadas" "r4ds" ...
  ..$ : chr [1:6] "Misiones" "voto" "rstats" "dataviz" ...
  ..$ : chr [1:9] "Candelaria" "elecciones" "politicos" "Misiones" ...
  ..$ : chr [1:6] "Posadas" "Misiones" "rstats" "TidyTuesday" ...
  ..$ : chr [1:6] "Misiones" "rstats" "TidyTuesday" "dataviz" ...
  ..$ : chr [1:6] "Argentina" "dataviz" "TidyTuesday" "Posadas" ...
  ..$ : chr [1:6] "rstats" "Posadas" "r4ds" "TidyTuesday" ...
  ..$ : chr [1:7] "pobreza" "Argentina" "rstats" "r4ds" ...
  ..$ : chr [1:7] "Misiones" "Posadas" "HacemosCiudad" "SigamosHaciendoJuntos" ...
  ..$ : chr [1:6] "MERVAL" "Macri" "rstats" "TidyTuesday" ...
  ..$ : chr [1:7] "Elecciones2019" "Misiones" "rstats" "TidyTuesday" ...
  ..$ : chr [1:6] "Misiones" "rstats" "dataviz" "TidyTuesday" ...
  ..$ : chr [1:6] "Argentina" "Misiones" "Posadas" "rstats" ...
  ..$ : chr [1:6] "Argentina" "rstats" "r4ds" "tidyverse" ...
  ..$ : chr [1:6] "Argentina" "dataviz" "rstats" "TidyTuesday" ...
  ..$ : chr [1:6] "Misiones" "TidyTuesday" "dataviz" "Posadas" ...
  ..$ : chr [1:4] "EleccionesArgentina" "EscrutinioDefinitivo" "rstats" "TidyTuesday"
  ..$ : chr [1:3] "rstats" "TidyTuesday" "ggplot"
  ..$ : chr "tidytuesday"
  ..$ : chr "TidyTuesday"
  ..$ : chr "TidyTuesday"
  ..$ : chr "TidyTuesday"
  ..$ : chr [1:4] "TidyTuesday" "rstats" "r4ds" "tidyverse"
  ..$ : chr [1:3] "TidyTuesday" "rstats" "ggplot2"
  ..$ : chr [1:2] "TidyTuesday" "rstats"
  ..$ : chr "tidytuesday"
  ..$ : chr "TidyTuesday"
  ..$ : chr "TidyTuesday"
  ..$ : chr [1:3] "TidyTuesday" "rstats" "dataviz"
  ..$ : chr "TidyTuesday"
  ..$ : chr [1:3] "TidyTuesday" "Rstats" "dataviz"
  ..$ : chr [1:3] "Tidytuesday" "DataViz" "rstats"
  ..$ : chr [1:3] "TidyTuesday" "RStats" "dataviz"
  ..$ : chr [1:4] "TidyTuesday" "rstats" "ggplot" "dataviz"
  ..$ : chr "tidytuesday"
  ..$ : chr [1:3] "TidyTuesday" "Rstat" "dataviz"
  ..$ : chr [1:3] "TidyTuesday" "RStats" "ggplot2"
  ..$ : chr [1:3] "TidyTuesday" "dataviz" "Rstats"
  ..$ : chr "Tidytuesday"
  ..$ : chr [1:3] "TidyTuesday" "rstats" "dataviz"
  ..$ : chr [1:3] "tidyTuesday" "rstats" "DataViz"
  ..$ : chr [1:3] "TidyTuesday" "rstats" "dataviz"
  ..$ : chr [1:3] "TidyTuesday" "Rstats" "DataViz"
  ..$ : chr "TidyTuesday"
  ..$ : chr [1:3] "TidyTuesday" "rstats" "dataviz"
  ..$ : chr [1:2] "TidyTuesday" "rstats"
  ..$ : chr [1:5] "decor" "housedecor" "decluttering" "TidyTuesday" ...
  ..$ : chr [1:4] "tidytuesday" "rstats" "RFeedbackFriday" "functionfriday"
  ..$ : chr [1:2] "rstats" "tidytuesday"
  ..$ : chr "tidytuesday"
  ..$ : chr "TidyTuesday"
  ..$ : chr "tidytuesday"
  ..$ : chr "TidyTuesday"
  ..$ : chr [1:2] "rstudioconf" "tidytuesday"
  ..$ : chr [1:3] "rstats" "TidyTuesday" "wine"
  ..$ : chr "TidyTuesday"
  ..$ : chr "TidyTuesday"
  ..$ : chr [1:3] "TidyTuesday" "rstats" "r4ds"
  ..$ : chr [1:3] "TidyTuesday" "rstats" "r4ds"
  ..$ : chr [1:3] "TidyTuesday" "rstats" "r4ds"
  ..$ : chr [1:3] "TidyTuesday" "rstats" "r4ds"
  ..$ : chr [1:3] "TidyTuesday" "rstats" "r4ds"
  ..$ : chr "TidyTuesday"
  ..$ : chr [1:3] "TidyTuesday" "R4DS" "rstats"
  ..$ : chr [1:3] "TidyTuesday" "rstats" "r4ds"
  ..$ : chr "TidyTuesday"
  ..$ : chr [1:4] "TidyTuesday" "dataviz" "r4ds" "rstats"
  ..$ : chr [1:3] "TidyTuesday" "rstats" "r4ds"
  ..$ : chr [1:3] "rstats" "R4DS" "TidyTuesday"
  ..$ : chr [1:3] "TidyTuesday" "rstats" "r4ds"
  ..$ : chr [1:3] "TidyTuesday" "r4ds" "rstats"
  ..$ : chr [1:2] "TidyTuesday" "rstats"
  ..$ : chr "TidyTuesday"
  ..$ : chr "TidyTuesday"
  ..$ : chr [1:3] "TidyTuesday" "rstats" "r4ds"
  ..$ : chr [1:3] "TidyTuesday" "rstats" "r4ds"
  ..$ : chr [1:3] "TidyTuesday" "rstats" "r4ds"
  ..$ : chr [1:3] "TidyTuesday" "r4ds" "rstats"
  ..$ : chr [1:3] "TidyTuesday" "rstats" "r4ds"
  ..$ : chr "TidyTuesday"
  ..$ : chr [1:3] "TidyTuesday" "rstats" "r4ds"
  .. [list output truncated]
 $ symbols                :List of 4418
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  .. [list output truncated]
 $ urls_url               :List of 4418
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "elecciones2019.misiones.gov.ar"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "blog.davidmasp.com/post/2018-05-2…"
  ..$ : chr "davidmasp.netlify.com/post/roman-emp…"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "twitter.com/R4DScommunity/…"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "bit.ly/2v5KWXr"
  ..$ : chr NA
  ..$ : chr "johaniefournier.com/2019/02/20/tid…"
  ..$ : chr "wp.me/p9DZye-tZ"
  ..$ : chr "bit.ly/2OsbdIu"
  ..$ : chr "bit.ly/2JPraqo"
  ..$ : chr "bit.ly/2IrEkf2"
  ..$ : chr "johaniefournier.com/2019/02/14/tid…"
  ..$ : chr "bit.ly/2SzG4az"
  ..$ : chr "wp.me/p9DZye-r9"
  ..$ : chr "bit.ly/2G0Ivf5"
  ..$ : chr "wp.me/p9DZye-qt"
  ..$ : chr "bit.ly/2F7q73M"
  ..$ : chr "bit.ly/2TNeo3p"
  ..$ : chr "bit.ly/2L14RSv"
  ..$ : chr "bit.ly/2XJqI6I"
  ..$ : chr "wp.me/p9DZye-se"
  ..$ : chr "bit.ly/2SnfwtQ"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "twitter.com/beeonaposy/sta…"
  ..$ : chr "twitter.com/thomas_mock/st…"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "rfordatasci.com"
  ..$ : chr "twitter.com/thomas_mock/st…"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "bit.ly/2TMiRyW"
  ..$ : chr NA
  ..$ : chr "github.com/jkaupp/tidytue…"
  ..$ : chr "bit.ly/2TMiRyW"
  ..$ : chr "bit.ly/2TMiRyW"
  ..$ : chr "bit.ly/2TMiRyW"
  ..$ : chr "github.com/jkaupp/tidytue…"
  ..$ : chr "bit.ly/2TMiRyW"
  ..$ : chr NA
  ..$ : chr "bit.ly/2TMiRyW"
  ..$ : chr "github.com/jkaupp/tidytue…"
  ..$ : chr [1:2] "jakekaupp.com/post/tidytuesd…" "github.com/jkaupp/tidytue…"
  ..$ : chr "bit.ly/2TMiRyW"
  ..$ : chr "bit.ly/2TMiRyW"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "github.com/jkaupp/tidyweek"
  ..$ : chr "bit.ly/2TMiRyW"
  ..$ : chr "bit.ly/2TMiRyW"
  ..$ : chr "bit.ly/2TMiRyW"
  ..$ : chr "bit.ly/2TMiRyW"
  ..$ : chr [1:2] "science.sciencemag.org/content/347/62…" "bit.ly/2TMiRyW"
  ..$ : chr NA
  ..$ : chr "bit.ly/2TMiRyW"
  .. [list output truncated]
 $ urls_t.co              :List of 4418
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "https://t.co/ReGRfNKKc7"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "https://t.co/h1Y4MsrQcj"
  ..$ : chr "https://t.co/IHUAXJbVZR"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "https://t.co/OrXxYO5OhI"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "https://t.co/FD6hI1S2sW"
  ..$ : chr NA
  ..$ : chr "https://t.co/DwveXbobZM"
  ..$ : chr "https://t.co/BC612rVkjt"
  ..$ : chr "https://t.co/oOaDoDFFtR"
  ..$ : chr "https://t.co/aA2DQSDFks"
  ..$ : chr "https://t.co/QEe1QI2PRB"
  ..$ : chr "https://t.co/PsEotOunt9"
  ..$ : chr "https://t.co/fsqFss1p4k"
  ..$ : chr "https://t.co/uEvmghu1cb"
  ..$ : chr "https://t.co/GtAkJ84BNJ"
  ..$ : chr "https://t.co/FUXjNmsIet"
  ..$ : chr "https://t.co/GMDO4AqFff"
  ..$ : chr "https://t.co/Ar45TLNBx9"
  ..$ : chr "https://t.co/3MOiNiEGfR"
  ..$ : chr "https://t.co/wRRS3tqzgF"
  ..$ : chr "https://t.co/gDQ7MMjuqc"
  ..$ : chr "https://t.co/W3CTyrh61J"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "https://t.co/xLPBe2Zbu8"
  ..$ : chr "https://t.co/WFcuU3PW1K"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "https://t.co/FX7E61yx4h"
  ..$ : chr "https://t.co/qLiEUf52lo"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "https://t.co/1BhlhPnYyC"
  ..$ : chr NA
  ..$ : chr "https://t.co/kuJdBQG4pn"
  ..$ : chr "https://t.co/1BhlhPnYyC"
  ..$ : chr "https://t.co/1BhlhPnYyC"
  ..$ : chr "https://t.co/1BhlhPnYyC"
  ..$ : chr "https://t.co/kuJdBQXFgV"
  ..$ : chr "https://t.co/1BhlhPnYyC"
  ..$ : chr NA
  ..$ : chr "https://t.co/1BhlhPnYyC"
  ..$ : chr "https://t.co/kuJdBQG4pn"
  ..$ : chr [1:2] "https://t.co/CgRT1WkxUA" "https://t.co/kuJdBQG4pn"
  ..$ : chr "https://t.co/1BhlhPnYyC"
  ..$ : chr "https://t.co/1BhlhPnYyC"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "https://t.co/qS64cQ3oDJ"
  ..$ : chr "https://t.co/1BhlhPnYyC"
  ..$ : chr "https://t.co/1BhlhPnYyC"
  ..$ : chr "https://t.co/1BhlhPnYyC"
  ..$ : chr "https://t.co/1BhlhPnYyC"
  ..$ : chr [1:2] "https://t.co/N0il99wBqy" "https://t.co/1BhlhPnYyC"
  ..$ : chr NA
  ..$ : chr "https://t.co/1BhlhPnYyC"
  .. [list output truncated]
 $ urls_expanded_url      :List of 4418
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "http://www.elecciones2019.misiones.gov.ar"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "http://blog.davidmasp.com/post/2018-05-24-tidytuesday_8/"
  ..$ : chr "https://davidmasp.netlify.com/post/roman-emperors-r-dataisbeautiful/"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "https://twitter.com/R4DScommunity/status/991010304984272896"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "https://bit.ly/2v5KWXr"
  ..$ : chr NA
  ..$ : chr "http://johaniefournier.com/2019/02/20/tidytuesday-2019w8-doctorants-des-universites-americaines/"
  ..$ : chr "https://wp.me/p9DZye-tZ"
  ..$ : chr "https://bit.ly/2OsbdIu"
  ..$ : chr "https://bit.ly/2JPraqo"
  ..$ : chr "https://bit.ly/2IrEkf2"
  ..$ : chr "http://johaniefournier.com/2019/02/14/tidytuesday-2019w7-usda-depenses-federales-en-recherche-et-developpement/"
  ..$ : chr "https://bit.ly/2SzG4az"
  ..$ : chr "https://wp.me/p9DZye-r9"
  ..$ : chr "https://bit.ly/2G0Ivf5"
  ..$ : chr "https://wp.me/p9DZye-qt"
  ..$ : chr "https://bit.ly/2F7q73M"
  ..$ : chr "https://bit.ly/2TNeo3p"
  ..$ : chr "https://bit.ly/2L14RSv"
  ..$ : chr "https://bit.ly/2XJqI6I"
  ..$ : chr "https://wp.me/p9DZye-se"
  ..$ : chr "https://bit.ly/2SnfwtQ"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "https://twitter.com/beeonaposy/status/1055493736703148032"
  ..$ : chr "https://twitter.com/thomas_mock/status/1153473649359425542"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "http://rfordatasci.com"
  ..$ : chr "https://twitter.com/thomas_mock/status/1133017353581670400"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "http://bit.ly/2TMiRyW"
  ..$ : chr NA
  ..$ : chr "https://github.com/jkaupp/tidytuesdays"
  ..$ : chr "http://bit.ly/2TMiRyW"
  ..$ : chr "http://bit.ly/2TMiRyW"
  ..$ : chr "http://bit.ly/2TMiRyW"
  ..$ : chr "https://github.com/jkaupp/tidytuesdays"
  ..$ : chr "http://bit.ly/2TMiRyW"
  ..$ : chr NA
  ..$ : chr "http://bit.ly/2TMiRyW"
  ..$ : chr "https://github.com/jkaupp/tidytuesdays"
  ..$ : chr [1:2] "http://www.jakekaupp.com/post/tidytuesday-the-sad-story/" "https://github.com/jkaupp/tidytuesdays"
  ..$ : chr "http://bit.ly/2TMiRyW"
  ..$ : chr "http://bit.ly/2TMiRyW"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "https://github.com/jkaupp/tidyweek"
  ..$ : chr "http://bit.ly/2TMiRyW"
  ..$ : chr "http://bit.ly/2TMiRyW"
  ..$ : chr "http://bit.ly/2TMiRyW"
  ..$ : chr "http://bit.ly/2TMiRyW"
  ..$ : chr [1:2] "https://science.sciencemag.org/content/347/6223/768" "http://bit.ly/2TMiRyW"
  ..$ : chr NA
  ..$ : chr "http://bit.ly/2TMiRyW"
  .. [list output truncated]
 $ media_url              :List of 4418
  ..$ : chr "http://pbs.twimg.com/media/ECRao8OU8AAFVux.png"
  ..$ : chr "http://pbs.twimg.com/tweet_video_thumb/ECSvaTVWkAA6W3n.jpg"
  ..$ : chr "http://pbs.twimg.com/tweet_video_thumb/D-QCUiiXYAIh09j.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D3_lVfKWkAAcQtl.png"
  ..$ : chr "http://pbs.twimg.com/media/D5VDymMXsAAPizR.png"
  ..$ : chr "http://pbs.twimg.com/media/D4KmRTUWsAARl7Y.png"
  ..$ : chr "http://pbs.twimg.com/media/D5sxWV1W0AAQCQ6.png"
  ..$ : chr "http://pbs.twimg.com/media/D9SThyzXUAA0_Pc.png"
  ..$ : chr "http://pbs.twimg.com/tweet_video_thumb/D7eNp-YXYAAgN5F.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D5b3pMaWkAAUaqi.png"
  ..$ : chr "http://pbs.twimg.com/media/D7I48beW4AA9wHS.png"
  ..$ : chr "http://pbs.twimg.com/media/D7Ts-G-WwAERKYr.png"
  ..$ : chr "http://pbs.twimg.com/media/D8G8N0FXYAADddX.png"
  ..$ : chr "http://pbs.twimg.com/media/D5FetuYWAAEluPy.png"
  ..$ : chr "http://pbs.twimg.com/tweet_video_thumb/D77ISuHX4AMJxsk.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D8L7EUZXsAAILIQ.png"
  ..$ : chr "http://pbs.twimg.com/media/D9uAPKwWsAEuQ08.png"
  ..$ : chr "http://pbs.twimg.com/tweet_video_thumb/D69uPzGW0AcPYVT.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D6yQrObWsAYtNEY.png"
  ..$ : chr "http://pbs.twimg.com/media/D5NqjzgXsAUoBpw.png"
  ..$ : chr "http://pbs.twimg.com/media/D5ki_zMXoAEcojf.png"
  ..$ : chr "http://pbs.twimg.com/media/D5fPkhiWAAEA4TH.png"
  ..$ : chr "http://pbs.twimg.com/media/D-g7fYkWwAEhUag.png"
  ..$ : chr "http://pbs.twimg.com/media/D_SCajcWwAInlwI.png"
  ..$ : chr "http://pbs.twimg.com/media/D5-5eEYW4AEHv3E.jpg"
  ..$ : chr "http://pbs.twimg.com/tweet_video_thumb/D69KvGqXsAUc-3-.jpg"
  ..$ : chr "http://pbs.twimg.com/tweet_video_thumb/D78g09fXsAADZuo.jpg"
  ..$ : chr "http://pbs.twimg.com/media/ECRIeZLXYAEjY3c.png"
  ..$ : chr "http://pbs.twimg.com/tweet_video_thumb/D76lAcxXoAAojXX.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D6Q78deWAAA5X7e.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D3xGh4VWsAo7P3D.png"
  ..$ : chr "http://pbs.twimg.com/media/EAN76m-X4AA6mPm.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D4fN3NEUIAAt9sQ.png"
  ..$ : chr "http://pbs.twimg.com/tweet_video_thumb/D8UJfkyU8AEqgSW.jpg"
  ..$ : chr "http://pbs.twimg.com/tweet_video_thumb/ECAohZvWwAI_qCa.jpg"
  ..$ : chr "http://pbs.twimg.com/media/ECFUs9YXsAI6WFU.png"
  ..$ : chr "http://pbs.twimg.com/media/DeWyA5AXcAAbIZW.jpg"
  ..$ : chr "http://pbs.twimg.com/media/EB74e9aX4AIbRyr.jpg"
  ..$ : chr "http://pbs.twimg.com/tweet_video_thumb/DwWVhcDW0AEoY53.jpg"
  ..$ : chr "http://pbs.twimg.com/media/DxdV4MjXcAA5zRp.jpg"
  ..$ : chr "http://pbs.twimg.com/media/DcD-EHQXcAA2lW2.png"
  ..$ : chr "http://pbs.twimg.com/media/EAyPKmKXkAAO1Go.png"
  ..$ : chr NA
  ..$ : chr "http://pbs.twimg.com/tweet_video_thumb/EAPLq07WsAAI3-x.jpg"
  ..$ : chr NA
  ..$ : chr "http://pbs.twimg.com/media/EB338g0W4AAmXZW.png"
  ..$ : chr "http://pbs.twimg.com/media/D36rQDxWAAIWo_D.png"
  ..$ : chr NA
  ..$ : chr "http://pbs.twimg.com/media/Dzz4nvGXQAc3hGU.jpg"
  ..$ : chr "http://pbs.twimg.com/media/EB5PhN1WwAATVTk.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D2oKf0oXcAMjBK5.png"
  ..$ : chr "http://pbs.twimg.com/media/D_x29PsWkAAFJJb.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D4ZZA8WWAAAZ7_w.jpg"
  ..$ : chr "http://pbs.twimg.com/media/DzVBxOUVAAEXe6Y.png"
  ..$ : chr "http://pbs.twimg.com/media/DyxAvsMUwAAk4pa.jpg"
  ..$ : chr "http://pbs.twimg.com/media/EA7it89XoAEKk4N.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D3WuhI2W0AECN3j.png"
  ..$ : chr "http://pbs.twimg.com/media/EAc8hUdXkAALHJN.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D1gIDTqX4AAHiLe.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D2Eau7DWsAADzQm.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D44dglnWwAAmCUA.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D_TpYgHWwAASkeV.jpg"
  ..$ : chr "http://pbs.twimg.com/media/EBaXVc3XYAMxoeS.jpg"
  ..$ : chr "http://pbs.twimg.com/media/DzF65vEVAAE9D2Q.jpg"
  ..$ : chr "http://pbs.twimg.com/media/EBznSNyXYAAoOQg.jpg"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "http://pbs.twimg.com/media/D6u7QKEXsAIVbVv.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D3-EC_3XoAAtMGq.jpg"
  ..$ : chr "http://pbs.twimg.com/media/DdgWok6XUAA0oSI.jpg"
  ..$ : chr "http://pbs.twimg.com/media/EB8IPOdXYAAjOd7.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D3U5DtIX4AM5Pi2.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D6EpFtSW4AQ-xsE.jpg"
  ..$ : chr "http://pbs.twimg.com/media/DdKeNkIV0AAnNYi.jpg"
  ..$ : chr "http://pbs.twimg.com/media/EByUw4RXYAIF2qK.jpg"
  ..$ : chr NA
  ..$ : chr "http://pbs.twimg.com/media/DxAEVpaVAAAsuQW.jpg"
  ..$ : chr "http://pbs.twimg.com/media/DgDxVxBU8AEPcRW.jpg"
  ..$ : chr "http://pbs.twimg.com/media/DbF4BHoU0AAUKgq.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D08daVgXQAAYPBk.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D7wvWgqXYAEXM6n.jpg"
  ..$ : chr "http://pbs.twimg.com/media/Dz9YRqAXcAIGbUV.jpg"
  ..$ : chr "http://pbs.twimg.com/media/DgE0TaHXkAEo6nQ.jpg"
  ..$ : chr "http://pbs.twimg.com/media/DZ4oICtW0AAudAz.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D4UbiD2W0AcJDNJ.png"
  ..$ : chr "http://pbs.twimg.com/media/D-CHWxiXkAU3V8-.jpg"
  ..$ : chr "http://pbs.twimg.com/media/DxrxGaGX0AUIp9T.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D_xs1XgXoAAMc62.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D7MsF_VXYAA4fQJ.jpg"
  ..$ : chr NA
  ..$ : chr "http://pbs.twimg.com/media/D84JZHPXoAAdDUh.jpg"
  .. [list output truncated]
 $ media_t.co             :List of 4418
  ..$ : chr "https://t.co/1LTWIUBXwA"
  ..$ : chr "https://t.co/BhqvRIRWpl"
  ..$ : chr "https://t.co/MZrleaJUGg"
  ..$ : chr "https://t.co/uO5GnfDHlV"
  ..$ : chr "https://t.co/8zuoGS7U2N"
  ..$ : chr "https://t.co/yl7e5AnbWJ"
  ..$ : chr "https://t.co/5Cfzxjq4w4"
  ..$ : chr "https://t.co/yZkWgO7csw"
  ..$ : chr "https://t.co/nN0zWmG0Lx"
  ..$ : chr "https://t.co/5jM3FpZDie"
  ..$ : chr "https://t.co/J5R75p7Dxw"
  ..$ : chr "https://t.co/POsaubL7MI"
  ..$ : chr "https://t.co/U8FYIJw94K"
  ..$ : chr "https://t.co/zSqLlFB8s2"
  ..$ : chr "https://t.co/ICSKdNxCV8"
  ..$ : chr "https://t.co/eJ0h1eN2gC"
  ..$ : chr "https://t.co/iBxxYr9SOt"
  ..$ : chr "https://t.co/n1eNJu4cqV"
  ..$ : chr "https://t.co/yKdufhJTYK"
  ..$ : chr "https://t.co/R5TJt3UjZy"
  ..$ : chr "https://t.co/tCegQ6eAnc"
  ..$ : chr "https://t.co/62Xvule47Y"
  ..$ : chr "https://t.co/6pXN27Y4h2"
  ..$ : chr "https://t.co/rwR2EFXV7L"
  ..$ : chr "https://t.co/bKkuNyGS6N"
  ..$ : chr "https://t.co/vqICJrmaza"
  ..$ : chr "https://t.co/F2iTmtXWd2"
  ..$ : chr "https://t.co/8Aka4jfRsN"
  ..$ : chr "https://t.co/zPdYu7VN5b"
  ..$ : chr "https://t.co/hSN91kPk58"
  ..$ : chr "https://t.co/LG6TpHtA1E"
  ..$ : chr "https://t.co/aAlodFghxL"
  ..$ : chr "https://t.co/w4UCbzAqdU"
  ..$ : chr "https://t.co/dmYyS3T89w"
  ..$ : chr "https://t.co/DMGMSnCpEv"
  ..$ : chr "https://t.co/IIxjBPXctJ"
  ..$ : chr "https://t.co/AFO3uPFDCA"
  ..$ : chr "https://t.co/KY5FUq5dxV"
  ..$ : chr "https://t.co/0k432OqZv2"
  ..$ : chr "https://t.co/hXIYemBZhj"
  ..$ : chr "https://t.co/JB0GXmND2t"
  ..$ : chr "https://t.co/BtGU00e2DL"
  ..$ : chr NA
  ..$ : chr "https://t.co/PyCOoBWT8r"
  ..$ : chr NA
  ..$ : chr "https://t.co/oeMs0CPToY"
  ..$ : chr "https://t.co/NjT5iK41jS"
  ..$ : chr NA
  ..$ : chr "https://t.co/PpfALYngGa"
  ..$ : chr "https://t.co/1n6ou8aIz2"
  ..$ : chr "https://t.co/BMADARBcWf"
  ..$ : chr "https://t.co/uR0bV0XjEu"
  ..$ : chr "https://t.co/ez75XY6hA8"
  ..$ : chr "https://t.co/FukbDKPwXf"
  ..$ : chr "https://t.co/W1mnkB84Jg"
  ..$ : chr "https://t.co/WkMRsHYjA1"
  ..$ : chr "https://t.co/3WHhFWCjcj"
  ..$ : chr "https://t.co/pKaECW4gzs"
  ..$ : chr "https://t.co/bNAdHKiHMm"
  ..$ : chr "https://t.co/aWZbJpVn4p"
  ..$ : chr "https://t.co/B6yLiJ7xKo"
  ..$ : chr "https://t.co/zL7SpXpu9t"
  ..$ : chr "https://t.co/hB4Kg9RiTq"
  ..$ : chr "https://t.co/DGAFPxshYo"
  ..$ : chr "https://t.co/Sgu57cmeFg"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "https://t.co/UeNyhR12D9"
  ..$ : chr "https://t.co/t7kKsdILkv"
  ..$ : chr "https://t.co/vUyEqbFGdV"
  ..$ : chr "https://t.co/tjeInoq1AO"
  ..$ : chr "https://t.co/zUlJXI2Epo"
  ..$ : chr "https://t.co/KzowDvGtBx"
  ..$ : chr "https://t.co/86Hbx2guLj"
  ..$ : chr "https://t.co/sYnmKuD99k"
  ..$ : chr NA
  ..$ : chr "https://t.co/QykAPBgKE3"
  ..$ : chr "https://t.co/xrv7UjC19V"
  ..$ : chr "https://t.co/779kyDsrkk"
  ..$ : chr "https://t.co/Pn0WjojXcW"
  ..$ : chr "https://t.co/InaI2UJ8ul"
  ..$ : chr "https://t.co/m7aiRJYXhn"
  ..$ : chr "https://t.co/hcuJkOfmFB"
  ..$ : chr "https://t.co/j34DA2DzN6"
  ..$ : chr "https://t.co/QgEAqF6UkG"
  ..$ : chr "https://t.co/PKdBmEFhCI"
  ..$ : chr "https://t.co/Wz5xbZjWID"
  ..$ : chr "https://t.co/hXKkVxHaby"
  ..$ : chr "https://t.co/KJ4PcjgFaw"
  ..$ : chr NA
  ..$ : chr "https://t.co/4qmqY59wXc"
  .. [list output truncated]
 $ media_expanded_url     :List of 4418
  ..$ : chr "https://twitter.com/MKumarYYC/status/1163154266065735680/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1163247504542130176/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1145043578479108097/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1116864894144528384/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1122882491302465536/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1117638749414932483/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1124553123111079937/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1140702108649426945/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1132529994255745024/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1123358546874261504/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1131031073507467264/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1131793343304929280/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1135397698427904005/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1121785566432059393/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1134565758418259968/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1135751707424710656/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1142649391133679617/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1130245472633929728/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1129440599273279490/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1122359740556828673/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1123971749165568003/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1123596838538547201/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1146232742109229059/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1149688272513392640/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1125829312895172609/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1130206810349162498/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1134664174674612225/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1163134302575570945/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1134528497525936128/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1127094197562019841/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1115846484270952450/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1153902330531725314/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1119091254682210305/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1136326837544595459/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1161973312051974145/photo/1"
  ..$ : chr "https://twitter.com/dgwinfred/status/1162303313280417792/photo/1"
  ..$ : chr "https://twitter.com/davidmasp/status/1001412196247666688/photo/1"
  ..$ : chr "https://twitter.com/davidmasp/status/1161638973808287746/photo/1"
  ..$ : chr "https://twitter.com/davidmasp/status/1082436223791259659/photo/1"
  ..$ : chr "https://twitter.com/davidmasp/status/1087432261069430785/photo/1"
  ..$ : chr "https://twitter.com/davidmasp/status/991073965899644928/photo/1"
  ..$ : chr "https://twitter.com/datawookie/status/1156456637588283392/photo/1"
  ..$ : chr NA
  ..$ : chr "https://twitter.com/datawookie/status/1153990122255343617/photo/1"
  ..$ : chr NA
  ..$ : chr "https://twitter.com/jvaghela4/status/1161356902498013185/photo/1"
  ..$ : chr "https://twitter.com/FournierJohanie/status/1116518351147360257/photo/1"
  ..$ : chr NA
  ..$ : chr "https://twitter.com/FournierJohanie/status/1098025772554612738/photo/1"
  ..$ : chr "https://twitter.com/FournierJohanie/status/1161454327296339968/photo/1"
  ..$ : chr "https://twitter.com/FournierJohanie/status/1110711892086001665/photo/1"
  ..$ : chr "https://twitter.com/FournierJohanie/status/1151926405162291200/photo/1"
  ..$ : chr "https://twitter.com/FournierJohanie/status/1118679777232158722/photo/1"
  ..$ : chr "https://twitter.com/FournierJohanie/status/1095854400004853765/photo/1"
  ..$ : chr "https://twitter.com/FournierJohanie/status/1093320787770200065/photo/1"
  ..$ : chr "https://twitter.com/FournierJohanie/status/1157111441419395074/photo/1"
  ..$ : chr "https://twitter.com/FournierJohanie/status/1113988718338215937/photo/1"
  ..$ : chr "https://twitter.com/FournierJohanie/status/1154958378764046336/photo/1"
  ..$ : chr "https://twitter.com/FournierJohanie/status/1105642831413239808/photo/1"
  ..$ : chr "https://twitter.com/FournierJohanie/status/1108196618464047105/photo/1"
  ..$ : chr "https://twitter.com/FournierJohanie/status/1120865858111385600/photo/1"
  ..$ : chr "https://twitter.com/FournierJohanie/status/1149800610004426752/photo/1"
  ..$ : chr "https://twitter.com/FournierJohanie/status/1159280417813532673/photo/1"
  ..$ : chr "https://twitter.com/FournierJohanie/status/1094791392797380608/photo/1"
  ..$ : chr "https://twitter.com/debrafranke7/status/1161057113147396096/photo/1"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "https://twitter.com/jakekaupp/status/1129202271789748234/photo/1"
  ..$ : chr "https://twitter.com/jakekaupp/status/1116756586729439236/photo/1"
  ..$ : chr "https://twitter.com/jakekaupp/status/997572245261307905/photo/1"
  ..$ : chr "https://twitter.com/jakekaupp/status/1161656332686102529/photo/1"
  ..$ : chr "https://twitter.com/jakekaupp/status/1113859250885996546/photo/1"
  ..$ : chr "https://twitter.com/jakekaupp/status/1126226854984257536/photo/1"
  ..$ : chr "https://twitter.com/jakekaupp/status/996032676812374016/photo/1"
  ..$ : chr "https://twitter.com/jakekaupp/status/1160966530538061824/photo/1"
  ..$ : chr NA
  ..$ : chr "https://twitter.com/jakekaupp/status/1085372365062787073/photo/1"
  ..$ : chr "https://twitter.com/jakekaupp/status/1009072751708196866/photo/1"
  ..$ : chr "https://twitter.com/jakekaupp/status/986702200603840512/photo/1"
  ..$ : chr "https://twitter.com/jakekaupp/status/1103132855356608512/photo/1"
  ..$ : chr "https://twitter.com/jakekaupp/status/1133833304271200257/photo/1"
  ..$ : chr "https://twitter.com/jakekaupp/status/1098694015669731328/photo/1"
  ..$ : chr "https://twitter.com/jakekaupp/status/1009145699517255683/photo/1"
  ..$ : chr "https://twitter.com/jakekaupp/status/981265942226309121/photo/1"
  ..$ : chr "https://twitter.com/jakekaupp/status/1118330818768797696/photo/1"
  ..$ : chr "https://twitter.com/jakekaupp/status/1144063378543038465/photo/1"
  ..$ : chr "https://twitter.com/jakekaupp/status/1088447419778588672/photo/1"
  ..$ : chr "https://twitter.com/jakekaupp/status/1151915429109149696/photo/1"
  ..$ : chr "https://twitter.com/jakekaupp/status/1131296911598735361/photo/1"
  ..$ : chr NA
  ..$ : chr "https://twitter.com/jakekaupp/status/1138861812978520064/photo/1"
  .. [list output truncated]
 $ media_type             :List of 4418
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr NA
  ..$ : chr "photo"
  ..$ : chr NA
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr NA
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr NA
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr "photo"
  ..$ : chr NA
  ..$ : chr "photo"
  .. [list output truncated]
 $ ext_media_url          :List of 4418
  ..$ : chr "http://pbs.twimg.com/media/ECRao8OU8AAFVux.png"
  ..$ : chr "http://pbs.twimg.com/tweet_video_thumb/ECSvaTVWkAA6W3n.jpg"
  ..$ : chr "http://pbs.twimg.com/tweet_video_thumb/D-QCUiiXYAIh09j.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D3_lVfKWkAAcQtl.png"
  ..$ : chr "http://pbs.twimg.com/media/D5VDymMXsAAPizR.png"
  ..$ : chr "http://pbs.twimg.com/media/D4KmRTUWsAARl7Y.png"
  ..$ : chr "http://pbs.twimg.com/media/D5sxWV1W0AAQCQ6.png"
  ..$ : chr "http://pbs.twimg.com/media/D9SThyzXUAA0_Pc.png"
  ..$ : chr "http://pbs.twimg.com/tweet_video_thumb/D7eNp-YXYAAgN5F.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D5b3pMaWkAAUaqi.png"
  ..$ : chr [1:2] "http://pbs.twimg.com/media/D7I48beW4AA9wHS.png" "http://pbs.twimg.com/media/D7I6Gc7WwAEVR5w.png"
  ..$ : chr "http://pbs.twimg.com/media/D7Ts-G-WwAERKYr.png"
  ..$ : chr "http://pbs.twimg.com/media/D8G8N0FXYAADddX.png"
  ..$ : chr "http://pbs.twimg.com/media/D5FetuYWAAEluPy.png"
  ..$ : chr "http://pbs.twimg.com/tweet_video_thumb/D77ISuHX4AMJxsk.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D8L7EUZXsAAILIQ.png"
  ..$ : chr "http://pbs.twimg.com/media/D9uAPKwWsAEuQ08.png"
  ..$ : chr "http://pbs.twimg.com/tweet_video_thumb/D69uPzGW0AcPYVT.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D6yQrObWsAYtNEY.png"
  ..$ : chr "http://pbs.twimg.com/media/D5NqjzgXsAUoBpw.png"
  ..$ : chr "http://pbs.twimg.com/media/D5ki_zMXoAEcojf.png"
  ..$ : chr "http://pbs.twimg.com/media/D5fPkhiWAAEA4TH.png"
  ..$ : chr "http://pbs.twimg.com/media/D-g7fYkWwAEhUag.png"
  ..$ : chr "http://pbs.twimg.com/media/D_SCajcWwAInlwI.png"
  ..$ : chr [1:2] "http://pbs.twimg.com/media/D5-5eEYW4AEHv3E.jpg" "http://pbs.twimg.com/media/D5-9u0OXkAMt4SX.png"
  ..$ : chr "http://pbs.twimg.com/tweet_video_thumb/D69KvGqXsAUc-3-.jpg"
  ..$ : chr "http://pbs.twimg.com/tweet_video_thumb/D78g09fXsAADZuo.jpg"
  ..$ : chr "http://pbs.twimg.com/media/ECRIeZLXYAEjY3c.png"
  ..$ : chr "http://pbs.twimg.com/tweet_video_thumb/D76lAcxXoAAojXX.jpg"
  ..$ : chr [1:2] "http://pbs.twimg.com/media/D6Q78deWAAA5X7e.jpg" "http://pbs.twimg.com/media/D6Q7-zcX4AAt78h.png"
  ..$ : chr [1:2] "http://pbs.twimg.com/media/D3xGh4VWsAo7P3D.png" "http://pbs.twimg.com/media/D3xGkA8WAAAFHEB.png"
  ..$ : chr "http://pbs.twimg.com/media/EAN76m-X4AA6mPm.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D4fN3NEUIAAt9sQ.png"
  ..$ : chr "http://pbs.twimg.com/tweet_video_thumb/D8UJfkyU8AEqgSW.jpg"
  ..$ : chr "http://pbs.twimg.com/tweet_video_thumb/ECAohZvWwAI_qCa.jpg"
  ..$ : chr "http://pbs.twimg.com/media/ECFUs9YXsAI6WFU.png"
  ..$ : chr "http://pbs.twimg.com/media/DeWyA5AXcAAbIZW.jpg"
  ..$ : chr "http://pbs.twimg.com/media/EB74e9aX4AIbRyr.jpg"
  ..$ : chr "http://pbs.twimg.com/tweet_video_thumb/DwWVhcDW0AEoY53.jpg"
  ..$ : chr [1:2] "http://pbs.twimg.com/media/DxdV4MjXcAA5zRp.jpg" "http://pbs.twimg.com/media/DxdV4MWWkAAwHPv.jpg"
  ..$ : chr [1:2] "http://pbs.twimg.com/media/DcD-EHQXcAA2lW2.png" "http://pbs.twimg.com/media/DcD-HCYWkAE0IsD.jpg"
  ..$ : chr "http://pbs.twimg.com/media/EAyPKmKXkAAO1Go.png"
  ..$ : chr NA
  ..$ : chr "http://pbs.twimg.com/tweet_video_thumb/EAPLq07WsAAI3-x.jpg"
  ..$ : chr NA
  ..$ : chr [1:2] "http://pbs.twimg.com/media/EB338g0W4AAmXZW.png" "http://pbs.twimg.com/media/EB338g2WkAYPe50.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D36rQDxWAAIWo_D.png"
  ..$ : chr NA
  ..$ : chr "http://pbs.twimg.com/media/Dzz4nvGXQAc3hGU.jpg"
  ..$ : chr "http://pbs.twimg.com/media/EB5PhN1WwAATVTk.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D2oKf0oXcAMjBK5.png"
  ..$ : chr "http://pbs.twimg.com/media/D_x29PsWkAAFJJb.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D4ZZA8WWAAAZ7_w.jpg"
  ..$ : chr "http://pbs.twimg.com/media/DzVBxOUVAAEXe6Y.png"
  ..$ : chr [1:2] "http://pbs.twimg.com/media/DyxAvsMUwAAk4pa.jpg" "http://pbs.twimg.com/media/DyxBCiLUcAEFnsP.jpg"
  ..$ : chr "http://pbs.twimg.com/media/EA7it89XoAEKk4N.jpg"
  ..$ : chr [1:2] "http://pbs.twimg.com/media/D3WuhI2W0AECN3j.png" "http://pbs.twimg.com/media/D3WuhI1WkAEDL1m.png"
  ..$ : chr "http://pbs.twimg.com/media/EAc8hUdXkAALHJN.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D1gIDTqX4AAHiLe.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D2Eau7DWsAADzQm.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D44dglnWwAAmCUA.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D_TpYgHWwAASkeV.jpg"
  ..$ : chr "http://pbs.twimg.com/media/EBaXVc3XYAMxoeS.jpg"
  ..$ : chr [1:2] "http://pbs.twimg.com/media/DzF65vEVAAE9D2Q.jpg" "http://pbs.twimg.com/media/DzF65tbUwAEIPce.jpg"
  ..$ : chr "http://pbs.twimg.com/media/EBznSNyXYAAoOQg.jpg"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "http://pbs.twimg.com/media/D6u7QKEXsAIVbVv.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D3-EC_3XoAAtMGq.jpg"
  ..$ : chr "http://pbs.twimg.com/media/DdgWok6XUAA0oSI.jpg"
  ..$ : chr "http://pbs.twimg.com/media/EB8IPOdXYAAjOd7.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D3U5DtIX4AM5Pi2.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D6EpFtSW4AQ-xsE.jpg"
  ..$ : chr [1:2] "http://pbs.twimg.com/media/DdKeNkIV0AAnNYi.jpg" "http://pbs.twimg.com/media/DdKeO2pV4AAIuY1.jpg"
  ..$ : chr "http://pbs.twimg.com/media/EByUw4RXYAIF2qK.jpg"
  ..$ : chr NA
  ..$ : chr "http://pbs.twimg.com/media/DxAEVpaVAAAsuQW.jpg"
  ..$ : chr "http://pbs.twimg.com/media/DgDxVxBU8AEPcRW.jpg"
  ..$ : chr "http://pbs.twimg.com/media/DbF4BHoU0AAUKgq.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D08daVgXQAAYPBk.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D7wvWgqXYAEXM6n.jpg"
  ..$ : chr "http://pbs.twimg.com/media/Dz9YRqAXcAIGbUV.jpg"
  ..$ : chr "http://pbs.twimg.com/media/DgE0TaHXkAEo6nQ.jpg"
  ..$ : chr "http://pbs.twimg.com/media/DZ4oICtW0AAudAz.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D4UbiD2W0AcJDNJ.png"
  ..$ : chr "http://pbs.twimg.com/media/D-CHWxiXkAU3V8-.jpg"
  ..$ : chr "http://pbs.twimg.com/media/DxrxGaGX0AUIp9T.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D_xs1XgXoAAMc62.jpg"
  ..$ : chr "http://pbs.twimg.com/media/D7MsF_VXYAA4fQJ.jpg"
  ..$ : chr NA
  ..$ : chr "http://pbs.twimg.com/media/D84JZHPXoAAdDUh.jpg"
  .. [list output truncated]
 $ ext_media_t.co         :List of 4418
  ..$ : chr "https://t.co/1LTWIUBXwA"
  ..$ : chr "https://t.co/BhqvRIRWpl"
  ..$ : chr "https://t.co/MZrleaJUGg"
  ..$ : chr "https://t.co/uO5GnfDHlV"
  ..$ : chr "https://t.co/8zuoGS7U2N"
  ..$ : chr "https://t.co/yl7e5AnbWJ"
  ..$ : chr "https://t.co/5Cfzxjq4w4"
  ..$ : chr "https://t.co/yZkWgO7csw"
  ..$ : chr "https://t.co/nN0zWmG0Lx"
  ..$ : chr "https://t.co/5jM3FpZDie"
  ..$ : chr [1:2] "https://t.co/J5R75p7Dxw" "https://t.co/J5R75p7Dxw"
  ..$ : chr "https://t.co/POsaubL7MI"
  ..$ : chr "https://t.co/U8FYIJw94K"
  ..$ : chr "https://t.co/zSqLlFB8s2"
  ..$ : chr "https://t.co/ICSKdNxCV8"
  ..$ : chr "https://t.co/eJ0h1eN2gC"
  ..$ : chr "https://t.co/iBxxYr9SOt"
  ..$ : chr "https://t.co/n1eNJu4cqV"
  ..$ : chr "https://t.co/yKdufhJTYK"
  ..$ : chr "https://t.co/R5TJt3UjZy"
  ..$ : chr "https://t.co/tCegQ6eAnc"
  ..$ : chr "https://t.co/62Xvule47Y"
  ..$ : chr "https://t.co/6pXN27Y4h2"
  ..$ : chr "https://t.co/rwR2EFXV7L"
  ..$ : chr [1:2] "https://t.co/bKkuNyGS6N" "https://t.co/bKkuNyGS6N"
  ..$ : chr "https://t.co/vqICJrmaza"
  ..$ : chr "https://t.co/F2iTmtXWd2"
  ..$ : chr "https://t.co/8Aka4jfRsN"
  ..$ : chr "https://t.co/zPdYu7VN5b"
  ..$ : chr [1:2] "https://t.co/hSN91kPk58" "https://t.co/hSN91kPk58"
  ..$ : chr [1:2] "https://t.co/LG6TpHtA1E" "https://t.co/LG6TpHtA1E"
  ..$ : chr "https://t.co/aAlodFghxL"
  ..$ : chr "https://t.co/w4UCbzAqdU"
  ..$ : chr "https://t.co/dmYyS3T89w"
  ..$ : chr "https://t.co/DMGMSnCpEv"
  ..$ : chr "https://t.co/IIxjBPXctJ"
  ..$ : chr "https://t.co/AFO3uPFDCA"
  ..$ : chr "https://t.co/KY5FUq5dxV"
  ..$ : chr "https://t.co/0k432OqZv2"
  ..$ : chr [1:2] "https://t.co/hXIYemBZhj" "https://t.co/hXIYemBZhj"
  ..$ : chr [1:2] "https://t.co/JB0GXmND2t" "https://t.co/JB0GXmND2t"
  ..$ : chr "https://t.co/BtGU00e2DL"
  ..$ : chr NA
  ..$ : chr "https://t.co/PyCOoBWT8r"
  ..$ : chr NA
  ..$ : chr [1:2] "https://t.co/oeMs0CPToY" "https://t.co/oeMs0CPToY"
  ..$ : chr "https://t.co/NjT5iK41jS"
  ..$ : chr NA
  ..$ : chr "https://t.co/PpfALYngGa"
  ..$ : chr "https://t.co/1n6ou8aIz2"
  ..$ : chr "https://t.co/BMADARBcWf"
  ..$ : chr "https://t.co/uR0bV0XjEu"
  ..$ : chr "https://t.co/ez75XY6hA8"
  ..$ : chr "https://t.co/FukbDKPwXf"
  ..$ : chr [1:2] "https://t.co/W1mnkB84Jg" "https://t.co/W1mnkB84Jg"
  ..$ : chr "https://t.co/WkMRsHYjA1"
  ..$ : chr [1:2] "https://t.co/3WHhFWCjcj" "https://t.co/3WHhFWCjcj"
  ..$ : chr "https://t.co/pKaECW4gzs"
  ..$ : chr "https://t.co/bNAdHKiHMm"
  ..$ : chr "https://t.co/aWZbJpVn4p"
  ..$ : chr "https://t.co/B6yLiJ7xKo"
  ..$ : chr "https://t.co/zL7SpXpu9t"
  ..$ : chr "https://t.co/hB4Kg9RiTq"
  ..$ : chr [1:2] "https://t.co/DGAFPxshYo" "https://t.co/DGAFPxshYo"
  ..$ : chr "https://t.co/Sgu57cmeFg"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "https://t.co/UeNyhR12D9"
  ..$ : chr "https://t.co/t7kKsdILkv"
  ..$ : chr "https://t.co/vUyEqbFGdV"
  ..$ : chr "https://t.co/tjeInoq1AO"
  ..$ : chr "https://t.co/zUlJXI2Epo"
  ..$ : chr "https://t.co/KzowDvGtBx"
  ..$ : chr [1:2] "https://t.co/86Hbx2guLj" "https://t.co/86Hbx2guLj"
  ..$ : chr "https://t.co/sYnmKuD99k"
  ..$ : chr NA
  ..$ : chr "https://t.co/QykAPBgKE3"
  ..$ : chr "https://t.co/xrv7UjC19V"
  ..$ : chr "https://t.co/779kyDsrkk"
  ..$ : chr "https://t.co/Pn0WjojXcW"
  ..$ : chr "https://t.co/InaI2UJ8ul"
  ..$ : chr "https://t.co/m7aiRJYXhn"
  ..$ : chr "https://t.co/hcuJkOfmFB"
  ..$ : chr "https://t.co/j34DA2DzN6"
  ..$ : chr "https://t.co/QgEAqF6UkG"
  ..$ : chr "https://t.co/PKdBmEFhCI"
  ..$ : chr "https://t.co/Wz5xbZjWID"
  ..$ : chr "https://t.co/hXKkVxHaby"
  ..$ : chr "https://t.co/KJ4PcjgFaw"
  ..$ : chr NA
  ..$ : chr "https://t.co/4qmqY59wXc"
  .. [list output truncated]
 $ ext_media_expanded_url :List of 4418
  ..$ : chr "https://twitter.com/MKumarYYC/status/1163154266065735680/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1163247504542130176/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1145043578479108097/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1116864894144528384/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1122882491302465536/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1117638749414932483/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1124553123111079937/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1140702108649426945/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1132529994255745024/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1123358546874261504/photo/1"
  ..$ : chr [1:2] "https://twitter.com/cizzart/status/1131031073507467264/photo/1" "https://twitter.com/cizzart/status/1131031073507467264/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1131793343304929280/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1135397698427904005/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1121785566432059393/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1134565758418259968/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1135751707424710656/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1142649391133679617/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1130245472633929728/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1129440599273279490/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1122359740556828673/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1123971749165568003/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1123596838538547201/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1146232742109229059/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1149688272513392640/photo/1"
  ..$ : chr [1:2] "https://twitter.com/cizzart/status/1125829312895172609/photo/1" "https://twitter.com/cizzart/status/1125829312895172609/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1130206810349162498/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1134664174674612225/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1163134302575570945/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1134528497525936128/photo/1"
  ..$ : chr [1:2] "https://twitter.com/cizzart/status/1127094197562019841/photo/1" "https://twitter.com/cizzart/status/1127094197562019841/photo/1"
  ..$ : chr [1:2] "https://twitter.com/cizzart/status/1115846484270952450/photo/1" "https://twitter.com/cizzart/status/1115846484270952450/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1153902330531725314/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1119091254682210305/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1136326837544595459/photo/1"
  ..$ : chr "https://twitter.com/cizzart/status/1161973312051974145/photo/1"
  ..$ : chr "https://twitter.com/dgwinfred/status/1162303313280417792/photo/1"
  ..$ : chr "https://twitter.com/davidmasp/status/1001412196247666688/photo/1"
  ..$ : chr "https://twitter.com/davidmasp/status/1161638973808287746/photo/1"
  ..$ : chr "https://twitter.com/davidmasp/status/1082436223791259659/photo/1"
  ..$ : chr [1:2] "https://twitter.com/davidmasp/status/1087432261069430785/photo/1" "https://twitter.com/davidmasp/status/1087432261069430785/photo/1"
  ..$ : chr [1:2] "https://twitter.com/davidmasp/status/991073965899644928/photo/1" "https://twitter.com/davidmasp/status/991073965899644928/photo/1"
  ..$ : chr "https://twitter.com/datawookie/status/1156456637588283392/photo/1"
  ..$ : chr NA
  ..$ : chr "https://twitter.com/datawookie/status/1153990122255343617/photo/1"
  ..$ : chr NA
  ..$ : chr [1:2] "https://twitter.com/jvaghela4/status/1161356902498013185/photo/1" "https://twitter.com/jvaghela4/status/1161356902498013185/photo/1"
  ..$ : chr "https://twitter.com/FournierJohanie/status/1116518351147360257/photo/1"
  ..$ : chr NA
  ..$ : chr "https://twitter.com/FournierJohanie/status/1098025772554612738/photo/1"
  ..$ : chr "https://twitter.com/FournierJohanie/status/1161454327296339968/photo/1"
  ..$ : chr "https://twitter.com/FournierJohanie/status/1110711892086001665/photo/1"
  ..$ : chr "https://twitter.com/FournierJohanie/status/1151926405162291200/photo/1"
  ..$ : chr "https://twitter.com/FournierJohanie/status/1118679777232158722/photo/1"
  ..$ : chr "https://twitter.com/FournierJohanie/status/1095854400004853765/photo/1"
  ..$ : chr [1:2] "https://twitter.com/FournierJohanie/status/1093320787770200065/photo/1" "https://twitter.com/FournierJohanie/status/1093320787770200065/photo/1"
  ..$ : chr "https://twitter.com/FournierJohanie/status/1157111441419395074/photo/1"
  ..$ : chr [1:2] "https://twitter.com/FournierJohanie/status/1113988718338215937/photo/1" "https://twitter.com/FournierJohanie/status/1113988718338215937/photo/1"
  ..$ : chr "https://twitter.com/FournierJohanie/status/1154958378764046336/photo/1"
  ..$ : chr "https://twitter.com/FournierJohanie/status/1105642831413239808/photo/1"
  ..$ : chr "https://twitter.com/FournierJohanie/status/1108196618464047105/photo/1"
  ..$ : chr "https://twitter.com/FournierJohanie/status/1120865858111385600/photo/1"
  ..$ : chr "https://twitter.com/FournierJohanie/status/1149800610004426752/photo/1"
  ..$ : chr "https://twitter.com/FournierJohanie/status/1159280417813532673/photo/1"
  ..$ : chr [1:2] "https://twitter.com/FournierJohanie/status/1094791392797380608/photo/1" "https://twitter.com/FournierJohanie/status/1094791392797380608/photo/1"
  ..$ : chr "https://twitter.com/debrafranke7/status/1161057113147396096/photo/1"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "https://twitter.com/jakekaupp/status/1129202271789748234/photo/1"
  ..$ : chr "https://twitter.com/jakekaupp/status/1116756586729439236/photo/1"
  ..$ : chr "https://twitter.com/jakekaupp/status/997572245261307905/photo/1"
  ..$ : chr "https://twitter.com/jakekaupp/status/1161656332686102529/photo/1"
  ..$ : chr "https://twitter.com/jakekaupp/status/1113859250885996546/photo/1"
  ..$ : chr "https://twitter.com/jakekaupp/status/1126226854984257536/photo/1"
  ..$ : chr [1:2] "https://twitter.com/jakekaupp/status/996032676812374016/photo/1" "https://twitter.com/jakekaupp/status/996032676812374016/photo/1"
  ..$ : chr "https://twitter.com/jakekaupp/status/1160966530538061824/photo/1"
  ..$ : chr NA
  ..$ : chr "https://twitter.com/jakekaupp/status/1085372365062787073/photo/1"
  ..$ : chr "https://twitter.com/jakekaupp/status/1009072751708196866/photo/1"
  ..$ : chr "https://twitter.com/jakekaupp/status/986702200603840512/photo/1"
  ..$ : chr "https://twitter.com/jakekaupp/status/1103132855356608512/photo/1"
  ..$ : chr "https://twitter.com/jakekaupp/status/1133833304271200257/photo/1"
  ..$ : chr "https://twitter.com/jakekaupp/status/1098694015669731328/photo/1"
  ..$ : chr "https://twitter.com/jakekaupp/status/1009145699517255683/photo/1"
  ..$ : chr "https://twitter.com/jakekaupp/status/981265942226309121/photo/1"
  ..$ : chr "https://twitter.com/jakekaupp/status/1118330818768797696/photo/1"
  ..$ : chr "https://twitter.com/jakekaupp/status/1144063378543038465/photo/1"
  ..$ : chr "https://twitter.com/jakekaupp/status/1088447419778588672/photo/1"
  ..$ : chr "https://twitter.com/jakekaupp/status/1151915429109149696/photo/1"
  ..$ : chr "https://twitter.com/jakekaupp/status/1131296911598735361/photo/1"
  ..$ : chr NA
  ..$ : chr "https://twitter.com/jakekaupp/status/1138861812978520064/photo/1"
  .. [list output truncated]
 $ ext_media_type         : chr [1:4418] NA NA NA NA ...
 $ mentions_user_id       :List of 4418
  ..$ : chr NA
  ..$ : chr "2792557771"
  ..$ : chr "4471336875"
  ..$ : chr "2249583145"
  ..$ : chr [1:2] "856530775" "1550707568"
  ..$ : chr "4471336875"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr [1:2] "856530775" "1550707568"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "2775928246"
  ..$ : chr NA
  ..$ : chr "884910628674174976"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "1105263752553869312"
  ..$ : chr NA
  ..$ : chr [1:2] "1550707568" "856530775"
  ..$ : chr "4471336875"
  ..$ : chr "235261861"
  ..$ : chr "4471336875"
  ..$ : chr NA
  ..$ : chr "1105263752553869312"
  ..$ : chr [1:2] "3092381638" "46245868"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "611597719"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr [1:3] "3230388598" "267256091" "933014576647548928"
  ..$ : chr [1:2] "1241814552" "936191454"
  ..$ : chr NA
  ..$ : chr "16902857"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "1241814552"
  ..$ : chr [1:3] "4481853575" "379406678" "3092381638"
  ..$ : chr "970554068899790848"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "1241814552"
  ..$ : chr "2949131574"
  ..$ : chr [1:2] "983470194982088704" "880766885544902656"
  ..$ : chr "983470194982088704"
  ..$ : chr NA
  ..$ : chr "543981616"
  ..$ : chr [1:5] "1719561181" "983470194982088704" "744431959" "2440258777" ...
  ..$ : chr [1:5] "24228154" "46245868" "1241814552" "235261861" ...
  ..$ : chr [1:2] "46245868" "983470194982088704"
  ..$ : chr NA
  ..$ : chr [1:2] "752982253576318976" "3092381638"
  ..$ : chr [1:2] "2440258777" "983470194982088704"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "2931914597"
  ..$ : chr NA
  ..$ : chr "1929862952"
  ..$ : chr "934434517"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "1241814552"
  ..$ : chr NA
  ..$ : chr "5988062"
  ..$ : chr [1:2] "396746547" "29306310"
  ..$ : chr NA
  ..$ : chr "983470194982088704"
  ..$ : chr NA
  ..$ : chr [1:2] "495425905" "69133574"
  ..$ : chr NA
  .. [list output truncated]
 $ mentions_screen_name   :List of 4418
  ..$ : chr NA
  ..$ : chr "eldestapeweb"
  ..$ : chr "INDECArgentina"
  ..$ : chr "ENACOMArgentina"
  ..$ : chr [1:2] "tribunalelecmns" "CamaraElectoral"
  ..$ : chr "INDECArgentina"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr [1:2] "tribunalelecmns" "CamaraElectoral"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "AgroMnes"
  ..$ : chr NA
  ..$ : chr "Renovadoresok"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "R4DS_es"
  ..$ : chr NA
  ..$ : chr [1:2] "CamaraElectoral" "tribunalelecmns"
  ..$ : chr "INDECArgentina"
  ..$ : chr "rstudio"
  ..$ : chr "INDECArgentina"
  ..$ : chr NA
  ..$ : chr "R4DS_es"
  ..$ : chr [1:2] "CedScherer" "drob"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "thomasp85"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr [1:3] "dataandme" "_inundata" "mybinderteam"
  ..$ : chr [1:2] "thomas_mock" "benalexkeen"
  ..$ : chr NA
  ..$ : chr "kernpanik"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "thomas_mock"
  ..$ : chr [1:3] "LarissaKostiw" "tanya_shapiro" "CedScherer"
  ..$ : chr "DaveBloom11"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "thomas_mock"
  ..$ : chr "MarieKondo"
  ..$ : chr [1:2] "R4DScommunity" "UseR2019_Conf"
  ..$ : chr "R4DScommunity"
  ..$ : chr NA
  ..$ : chr "dgkeyes"
  ..$ : chr [1:5] "Kuprinasha" "R4DScommunity" "Giants_Collide" "kierisi" ...
  ..$ : chr [1:5] "hspter" "drob" "thomas_mock" "rstudio" ...
  ..$ : chr [1:2] "drob" "R4DScommunity"
  ..$ : chr NA
  ..$ : chr [1:2] "Physacourses" "CedScherer"
  ..$ : chr [1:2] "kierisi" "R4DScommunity"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "lenkiefer"
  ..$ : chr NA
  ..$ : chr "spren9er"
  ..$ : chr "chucc900"
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr NA
  ..$ : chr "thomas_mock"
  ..$ : chr NA
  ..$ : chr "TheEconomist"
  ..$ : chr [1:2] "grssnbchr" "angelozehr"
  ..$ : chr NA
  ..$ : chr "R4DScommunity"
  ..$ : chr NA
  ..$ : chr [1:2] "CieraReports" "hadleywickham"
  ..$ : chr NA
  .. [list output truncated]
 $ lang                   : chr [1:4418] "en" "es" "es" "es" ...
 $ quoted_status_id       : chr [1:4418] NA NA NA NA ...
 $ quoted_text            : chr [1:4418] NA NA NA NA ...
 $ quoted_created_at      : POSIXct[1:4418], format: NA NA ...
 $ quoted_source          : chr [1:4418] NA NA NA NA ...
 $ quoted_favorite_count  : int [1:4418] NA NA NA NA NA NA NA NA NA NA ...
 $ quoted_retweet_count   : int [1:4418] NA NA NA NA NA NA NA NA NA NA ...
 $ quoted_user_id         : chr [1:4418] NA NA NA NA ...
 $ quoted_screen_name     : chr [1:4418] NA NA NA NA ...
 $ quoted_name            : chr [1:4418] NA NA NA NA ...
 $ quoted_followers_count : int [1:4418] NA NA NA NA NA NA NA NA NA NA ...
 $ quoted_friends_count   : int [1:4418] NA NA NA NA NA NA NA NA NA NA ...
 $ quoted_statuses_count  : int [1:4418] NA NA NA NA NA NA NA NA NA NA ...
 $ quoted_location        : chr [1:4418] NA NA NA NA ...
 $ quoted_description     : chr [1:4418] NA NA NA NA ...
 $ quoted_verified        : logi [1:4418] NA NA NA NA NA NA ...
 $ retweet_status_id      : chr [1:4418] NA NA NA NA ...
 $ retweet_text           : chr [1:4418] NA NA NA NA ...
 $ retweet_created_at     : POSIXct[1:4418], format: NA NA ...
 $ retweet_source         : chr [1:4418] NA NA NA NA ...
 $ retweet_favorite_count : int [1:4418] NA NA NA NA NA NA NA NA NA NA ...
 $ retweet_retweet_count  : int [1:4418] NA NA NA NA NA NA NA NA NA NA ...
 $ retweet_user_id        : chr [1:4418] NA NA NA NA ...
 $ retweet_screen_name    : chr [1:4418] NA NA NA NA ...
 $ retweet_name           : chr [1:4418] NA NA NA NA ...
 $ retweet_followers_count: int [1:4418] NA NA NA NA NA NA NA NA NA NA ...
 $ retweet_friends_count  : int [1:4418] NA NA NA NA NA NA NA NA NA NA ...
 $ retweet_statuses_count : int [1:4418] NA NA NA NA NA NA NA NA NA NA ...
 $ retweet_location       : chr [1:4418] NA NA NA NA ...
 $ retweet_description    : chr [1:4418] NA NA NA NA ...
 $ retweet_verified       : logi [1:4418] NA NA NA NA NA NA ...
 $ place_url              : chr [1:4418] NA NA NA NA ...
 $ place_name             : chr [1:4418] NA NA NA NA ...
 $ place_full_name        : chr [1:4418] NA NA NA NA ...
 $ place_type             : chr [1:4418] NA NA NA NA ...
 $ country                : chr [1:4418] NA NA NA NA ...
 $ country_code           : chr [1:4418] NA NA NA NA ...
 $ geo_coords             :List of 4418
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  .. [list output truncated]
 $ coords_coords          :List of 4418
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  ..$ : num [1:2] NA NA
  .. [list output truncated]
 $ bbox_coords            :List of 4418
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  ..$ : num [1:8] NA NA NA NA NA NA NA NA
  .. [list output truncated]
 $ status_url             : chr [1:4418] "https://twitter.com/MKumarYYC/status/1163154266065735680" "https://twitter.com/cizzart/status/1163247504542130176" "https://twitter.com/cizzart/status/1145043578479108097" "https://twitter.com/cizzart/status/1116864894144528384" ...
 $ name                   : chr [1:4418] "Mehul Kumar" "César" "César" "César" ...
 $ location               : chr [1:4418] "Calgary, AB" "Posadas, Argentina" "Posadas, Argentina" "Posadas, Argentina" ...
 $ description            : chr [1:4418] "Dabbling in data devices | #rstats/#datascience| #internationaldevelopment| mountain biker | 🍁 MSc. candidate |"| __truncated__ "Ciencia de Datos y Estadistica #rstats/ @rstudio #DataScience #Statistics. (R for Political Data Science/ cesar"| __truncated__ "Ciencia de Datos y Estadistica #rstats/ @rstudio #DataScience #Statistics. (R for Political Data Science/ cesar"| __truncated__ "Ciencia de Datos y Estadistica #rstats/ @rstudio #DataScience #Statistics. (R for Political Data Science/ cesar"| __truncated__ ...
 $ url                    : chr [1:4418] NA NA NA NA ...
 $ protected              : logi [1:4418] FALSE FALSE FALSE FALSE FALSE FALSE ...
 $ followers_count        : int [1:4418] 10 57 57 57 57 57 57 57 57 57 ...
 $ friends_count          : int [1:4418] 21 120 120 120 120 120 120 120 120 120 ...
 $ listed_count           : int [1:4418] 0 0 0 0 0 0 0 0 0 0 ...
 $ statuses_count         : int [1:4418] 18 152 152 152 152 152 152 152 152 152 ...
 $ favourites_count       : int [1:4418] 180 977 977 977 977 977 977 977 977 977 ...
 $ account_created_at     : POSIXct[1:4418], format: "2019-08-07 21:15:04" "2018-12-13 21:05:39" ...
 $ verified               : logi [1:4418] FALSE FALSE FALSE FALSE FALSE FALSE ...
 $ profile_url            : chr [1:4418] NA NA NA NA ...
 $ profile_expanded_url   : chr [1:4418] NA NA NA NA ...
 $ account_lang           : logi [1:4418] NA NA NA NA NA NA ...
 $ profile_banner_url     : chr [1:4418] "https://pbs.twimg.com/profile_banners/1159211379963772928/1566153017" "https://pbs.twimg.com/profile_banners/1073323083170238466/1551337674" "https://pbs.twimg.com/profile_banners/1073323083170238466/1551337674" "https://pbs.twimg.com/profile_banners/1073323083170238466/1551337674" ...
 $ profile_background_url : chr [1:4418] NA "http://abs.twimg.com/images/themes/theme1/bg.png" "http://abs.twimg.com/images/themes/theme1/bg.png" "http://abs.twimg.com/images/themes/theme1/bg.png" ...
 $ profile_image_url      : chr [1:4418] "http://pbs.twimg.com/profile_images/1163156171651338240/lswwW8wN_normal.jpg" "http://pbs.twimg.com/profile_images/1132000357301805059/lHr81Hig_normal.png" "http://pbs.twimg.com/profile_images/1132000357301805059/lHr81Hig_normal.png" "http://pbs.twimg.com/profile_images/1132000357301805059/lHr81Hig_normal.png" ...
head(raw_tweets)
# A tibble: 6 × 90
  user_id             status_id     created_at          screen_name text  source
  <chr>               <chr>         <dttm>              <chr>       <chr> <chr> 
1 1159211379963772928 116315426606… 2019-08-18 18:22:42 MKumarYYC   "Fir… Twitt…
2 1073323083170238466 116324750454… 2019-08-19 00:33:11 cizzart     "El … Twitt…
3 1073323083170238466 114504357847… 2019-06-29 18:57:17 cizzart     "Pro… Twitt…
4 1073323083170238466 111686489414… 2019-04-13 00:45:15 cizzart     "#Ar… Twitt…
5 1073323083170238466 112288249130… 2019-04-29 15:17:02 cizzart     "Pes… Twitt…
6 1073323083170238466 111763874941… 2019-04-15 04:00:17 cizzart     "Dat… Twitt…
# ℹ 84 more variables: display_text_width <dbl>, reply_to_status_id <chr>,
#   reply_to_user_id <chr>, reply_to_screen_name <chr>, is_quote <lgl>,
#   is_retweet <lgl>, favorite_count <int>, retweet_count <int>,
#   quote_count <int>, reply_count <int>, hashtags <list>, symbols <list>,
#   urls_url <list>, urls_t.co <list>, urls_expanded_url <list>,
#   media_url <list>, media_t.co <list>, media_expanded_url <list>,
#   media_type <list>, ext_media_url <list>, ext_media_t.co <list>, …

What did you realize about this data?

  • From what I can see, there are 90 variables. The majority of the media types appear to be pictures, and I assume the tweets marked NA contain only text, although I do not like that this is not classified explicitly. I like the favorite count and retweet count, which could show me popular tweets or trends. Because each tweet has a timestamp, I could also look at trends over time. I might also be able to do network analysis, because the dataset includes retweets and mentions. Finally, I noticed some Spanish tweets. I could filter those out or (more advanced) run them through Google Translate, but for this course I will just filter them out, since there are not many.

For this walkthrough, we won’t need all 90 variables so let’s clean the dataset and keep only the ones we want.

Process Data

In this section we’ll select the columns we need for our analysis and we’ll transform the dataset so each row represents a word. After that, our dataset will be ready for exploring.

Your Turn:

Complete the code below.

First, let’s use filter() to keep only the English-language tweets and select() to pick the two columns we’ll need: status_id and text.

status_id will help us associate interesting words with a particular tweet and text will give us the text from that tweet.

We’ll also change status_id to the character data type since it’s meant to label tweets and doesn’t actually represent a numerical value.

tweets <-
  raw_tweets |>
  #filter for English tweets
  filter(lang == "en") |>
  select(status_id, text) |>
  # Convert the ID field to the character data type
  mutate(status_id = as.character(status_id))
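
One practical reason for storing IDs as text, offered here as an aside rather than something the walkthrough states: tweet IDs are 19-digit numbers, and R’s double type can represent whole numbers exactly only up to 2^53. The base R sketch below uses an ID copied from the output above; largest_exact and id_chr are illustrative names, not objects from the walkthrough.

```r
# Doubles can represent whole numbers exactly only up to 2^53
largest_exact <- 2^53

# Adding 1 beyond that limit is silently lost
precision_lost <- (largest_exact + 1) == largest_exact
precision_lost  # TRUE

# A tweet ID from the dataset, kept as text so no digits can change
id_chr <- "1163154266065735680"
nchar(id_chr)   # 19 digits

# As a number, this ID is far beyond the exactly representable range
id_too_large <- as.numeric(id_chr) > largest_exact
id_too_large    # TRUE
```

Keeping status_id as character means two different tweets can never collapse to the same rounded number.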

Now the dataset has a column to identify each tweet and a column that shows the text that users tweeted. But each row has the entire tweet in the text variable, which makes it hard to analyze. If we kept our dataset like this, we’d need to use functions on each row to do something like count the number of times the word “good” appears. We can count words more efficiently if each row represents a single word. Splitting the text in each row into one word per row is called “tokenizing.” In their book Text Mining With R, Silge & Robinson (2017) describe tokens this way:

A token is a meaningful unit of text, such as a word, that we are interested in using for analysis, and tokenization is the process of splitting text into tokens. This one-token-per-row structure is in contrast to the ways text is often stored in current analyses, perhaps as strings or in a document-term matrix.

Let’s use unnest_tokens() from the {tidytext} package to take our dataset of tweets and transform it into a dataset of words.

tokens <- 
  tweets %>%
  unnest_tokens(output = word, input = text)

tokens 
# A tibble: 131,232 × 2
   status_id           word       
   <chr>               <chr>      
 1 1163154266065735680 first      
 2 1163154266065735680 tidytuesday
 3 1163154266065735680 submission 
 4 1163154266065735680 roman      
 5 1163154266065735680 emperors   
 6 1163154266065735680 and        
 7 1163154266065735680 their      
 8 1163154266065735680 rise       
 9 1163154266065735680 to         
10 1163154266065735680 power      
# ℹ 131,222 more rows
# Number of observations before tokenization
nrow(tweets)
[1] 4258
# Number of observations after tokenization
nrow(tokens)
[1] 131232

We use output = word to tell unnest_tokens() that we want our column of tokens to be called word. We use input = text to tell unnest_tokens() to tokenize the tweets in the text column of our tweets dataset. The result is a new dataset where each row has a single word in the word column and, in the status_id column, the ID of the tweet that the word appears in.
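
To see the one-token-per-row idea on a small scale, here is a minimal sketch that tokenizes two made-up tweets. The toy_tweets data and its text are invented for illustration; only unnest_tokens() and its output and input arguments come from the walkthrough. By default, unnest_tokens() also lowercases the words and drops punctuation.

```r
library(tibble)
library(tidytext)

# Two invented tweets, one per row
toy_tweets <- tibble(
  status_id = c("1", "2"),
  text = c("First TidyTuesday submission!", "Learning R is fun")
)

# One row per word; status_id is repeated for every word in a tweet
toy_tokens <- unnest_tokens(toy_tweets, output = word, input = text)

toy_tokens
nrow(toy_tweets)  # 2 tweets
nrow(toy_tokens)  # 7 words
```

The two tweets become seven rows, which mirrors how 4,258 tweets became 131,232 rows above.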

Your Turn:

What happened after you tokenized the data? How did the number of observations change?

  • Each word is now its own observation. Before tokenization, each row in the dataset represented one tweet. After tokenization, each row represents one word from a tweet, so the number of rows now depends on the total number of words across all tweets. The dataset went from 4,258 observations to 131,232. I could use this to see which words are used most and do some analysis on that.

Removing Stop Words

We’re almost ready to start analyzing the dataset! There’s one more step we’ll take: removing common words that don’t help us learn about what people are tweeting about. Words like “the” or “a” are in a category of words called “stop words”. Stop words serve a function in verbal communication, but don’t tell us much on their own. As a result, they clutter our dataset of useful words and make it harder to manage the volume of words we want to analyze. The {tidytext} package includes a dataset called stop_words that we’ll use to remove rows containing stop words. We’ll use anti_join() on our tokens dataset and the stop_words dataset to keep only rows that have words not appearing in the stop_words dataset.

Run the following code:

data(stop_words)

tokens <-
  tokens %>%
  anti_join(stop_words, by = "word")

Why does this work? Let’s look closer. inner_join() matches the observations in one dataset to another by a specified common variable. Any rows that don’t have a match get dropped from the resulting dataset. anti_join() does the same thing as inner_join() except it drops matching rows and keeps the rows that don’t match. This is convenient for our analysis because we want to remove rows from tokens that contain words in the stop_words dataset. When we call anti_join(), we’re left with rows that don’t match words in the stop_words dataset. These remaining words are the ones we’ll be analyzing.

One final note before we start counting words: Remember when we first tokenized our dataset and we passed unnest_tokens() the argument output = word? We conveniently chose word as our column name because it matches the column name word in the stop_words dataset. This makes our call to anti_join() simpler because anti_join() knows to look for the column named word in each dataset.
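
The contrast between the two joins is easier to see on a tiny example. The sketch below uses made-up data (toy_tokens and toy_stop are illustrative stand-ins for tokens and stop_words) and runs both joins on the shared word column.

```r
library(dplyr)
library(tibble)

# Invented tokens from two tweets
toy_tokens <- tibble(
  status_id = c("1", "1", "1", "2"),
  word = c("the", "data", "is", "fun")
)

# Invented stand-in for the stop_words dataset
toy_stop <- tibble(word = c("the", "is"))

# inner_join() keeps only the rows that match a stop word
matched <- inner_join(toy_tokens, toy_stop, by = "word")

# anti_join() keeps only the rows that do NOT match a stop word
kept <- anti_join(toy_tokens, toy_stop, by = "word")

matched$word  # "the" "is"
kept$word     # "data" "fun"
```

Together, the two results account for every row of toy_tokens: each row either matches a stop word or it does not.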

Analysis: Counting Words

Now it’s time to start exploring our newly cleaned dataset of tweets. Computing the frequency of each word and seeing which words showed up the most often is a good start. We can pipe tokens to the count() function to do this:

tokens |>
    count(word, sort = TRUE) 
# A tibble: 15,334 × 2
   word            n
   <chr>       <int>
 1 t.co         5432
 2 https        5406
 3 tidytuesday  4316
 4 rstats       1748
 5 data         1105
 6 code          988
 7 week          868
 8 r4ds          675
 9 dataviz       607
10 time          494
# ℹ 15,324 more rows

We pass count() the argument sort = TRUE to sort the n variable from the highest value to the lowest value. This makes it easy to see the most frequently occurring words at the top. Not surprisingly, “tidytuesday” was the third most frequent word in this dataset.
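
The two most frequent tokens, “t.co” and “https”, are pieces of shortened links (t.co is Twitter’s link-shortening domain) rather than words people chose to write. One possible way to drop them before counting, which is not a step in the walkthrough, is to filter them out with %in%. The sketch below uses invented data; toy_tokens and url_fragments are illustrative names.

```r
library(dplyr)
library(tibble)

# Invented tokens, including link fragments
toy_tokens <- tibble(
  word = c("https", "t.co", "tidytuesday", "data", "https", "t.co", "data")
)

# Tokens we treat as link debris (an assumption for this sketch)
url_fragments <- c("https", "t.co")

toy_counts <-
  toy_tokens |>
  # keep only rows whose word is not a link fragment
  filter(!word %in% url_fragments) |>
  count(word, sort = TRUE)

toy_counts
```

The same filter() line could be applied to tokens before calling count(), if you decide that link fragments are noise for your question.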

We may want to explore further by showing the frequency of words as a percent of the whole dataset. Calculating percentages like this is useful in a lot of education scenarios because it helps us make comparisons across different sized groups. For example, you may want to calculate what percentage of students in each classroom receive special education services.

In our tweets dataset, we’ll be calculating the count of each word as a percentage of all words. We can do that by using mutate() to add a column called percent. percent will divide n by sum(n), which is the total number of words. Finally, we’ll multiply the result by 100.

tokens |>
  count(word, sort = TRUE) |>
  # n as a percent of total words
  mutate(percent = n / sum(n) * 100)
# A tibble: 15,334 × 3
   word            n percent
   <chr>       <int>   <dbl>
 1 t.co         5432   7.39 
 2 https        5406   7.36 
 3 tidytuesday  4316   5.87 
 4 rstats       1748   2.38 
 5 data         1105   1.50 
 6 code          988   1.34 
 7 week          868   1.18 
 8 r4ds          675   0.919
 9 dataviz       607   0.826
10 time          494   0.672
# ℹ 15,324 more rows
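
The arithmetic behind the percent column can be checked by hand on a small example. In this sketch the counts are invented so that the total is 10, which makes each percentage easy to verify; toy_counts is an illustrative name.

```r
library(dplyr)
library(tibble)

# Invented word counts that sum to 10
toy_counts <- tibble(
  word = c("data", "code", "week"),
  n = c(5, 3, 2)
)

toy_percent <-
  toy_counts |>
  # each count divided by the total count, times 100
  mutate(percent = n / sum(n) * 100)

toy_percent$percent  # 50 30 20
```

Because every n is divided by the same total, the percent column always sums to 100.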

Your Turn:

What did you notice in the word-percentage table? Did anything interesting stand out?

  • I noticed some top words like t.co. I do not know its significance, so I assume it is noise. The next most frequent word is https, which tells me there are a lot of links to websites. Next is tidytuesday, which is to be expected given the #tidytuesday hashtag, so I would like to filter it out. The rest may be usable for analysis, depending on what I am looking for.

Analysis: Sentiment Analysis

Now that we have a sense of the most frequently appearing words, it’s time to explore some questions in our tweets dataset. Let’s imagine that we’re education consultants trying to learn about the community surrounding the TidyTuesday data visualization ritual. We know from the first part of our analysis that the token “dataviz” (a short name for data visualization) appeared frequently relative to other words, so maybe we can explore that further. A good start would be to see how the appearance of that token in a tweet is associated with other positive words.

We’ll need to use a technique called sentiment analysis to get at the “positivity” of words in these tweets. Sentiment analysis tries to evaluate words for their emotional association. If we analyze words by the emotions they convey, we can start to explore patterns in large text datasets like our tokens data.

Earlier we used anti_join() to remove stop words in our dataset. We’re going to do something similar here to reduce our tokens dataset to only words that have a positive association. We’ll use a dataset called the NRC Word-Emotion Association Lexicon to help us identify words with a positive association. This dataset was published in a work called Crowdsourcing a Word-Emotion Association Lexicon (Mohammad & Turney, 2013).

We need to install a package called {textdata} to make sure we have the NRC Word-Emotion Association Lexicon dataset available to us. Note that you only need to have this package installed. You do not need to load it with the library(textdata) command.

If you don’t already have it, let’s install {textdata}:

install.packages("textdata")
library(textdata) # optional: get_sentiments() only needs the package to be installed

get_sentiments("nrc")
# A tibble: 13,872 × 2
   word        sentiment
   <chr>       <chr>    
 1 abacus      trust    
 2 abandon     fear     
 3 abandon     negative 
 4 abandon     sadness  
 5 abandoned   anger    
 6 abandoned   fear     
 7 abandoned   negative 
 8 abandoned   sadness  
 9 abandonment anger    
10 abandonment fear     
# ℹ 13,862 more rows

This returns a dataset with two columns. The first is word and contains a list of words. The second is the sentiment column, which contains an emotion associated with each word. This dataset is similar to the stop_words dataset. Note that this dataset also uses the column name word, which will again make it easy for us to match this dataset to our tokens dataset.

Count Positive Words

Let’s begin working on reducing our tokens dataset down to only words that the NRC dataset associates with positivity. We’ll start by creating a new dataset, nrc_pos, which contains the NRC words that have the positive sentiment. Then we’ll match that new dataset to tokens using the word column that is common to both datasets. Finally, we’ll use count() to total up the appearances of each positive word.

# Only positive in the NRC dataset
nrc_pos <-
  get_sentiments("nrc") %>%
  filter(sentiment == "positive")

# Match to tokens
pos_tokens_count <-
  tokens %>%
  inner_join(nrc_pos, by = "word") %>%
  # Total appearance of positive words
  count(word, sort = TRUE) 

pos_tokens_count
# A tibble: 642 × 2
   word          n
   <chr>     <int>
 1 fun         173
 2 top         162
 3 learn       131
 4 found       128
 5 love        113
 6 community   110
 7 learning     97
 8 happy        95
 9 share        90
10 inspired     85
# ℹ 632 more rows
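
The same filter, join, and count pattern can be tried without downloading the NRC lexicon. The sketch below swaps in a tiny made-up lexicon: toy_lexicon, its words, and its sentiment labels are invented for illustration and are not taken from the NRC data.

```r
library(dplyr)
library(tibble)

# Invented stand-in for get_sentiments("nrc"); a word can have several sentiments
toy_lexicon <- tibble(
  word = c("fun", "love", "abandon", "love"),
  sentiment = c("positive", "positive", "negative", "joy")
)

# Invented tokens from three tweets
toy_tokens <- tibble(
  status_id = c("1", "1", "2", "2", "3"),
  word = c("fun", "data", "love", "fun", "abandon")
)

# Step 1: keep only the lexicon rows labelled positive
toy_pos <- toy_lexicon |> filter(sentiment == "positive")

# Step 2: keep tokens that match a positive word, then count them
toy_pos_count <-
  toy_tokens |>
  inner_join(toy_pos, by = "word") |>
  count(word, sort = TRUE)

toy_pos_count
```

Filtering the lexicon to one sentiment before joining matters: a word such as “love” appears under more than one sentiment here, and joining on the unfiltered lexicon would count it once per sentiment.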

Your Turn:

After you ran the previous code chunk and observed pos_tokens_count, what did this output tell you?

  • This code is designed to filter the tokens dataset to only include words that are marked as “positive” in the NRC Word-Emotion Association Lexicon, and then count the frequency of each of these positive words. In pos_tokens_count I noticed ‘fun’, ‘top’, and ‘learn’ are the top 3 positive terms.

We can visualize these words nicely by using {ggplot2} to show the positive words in a bar chart. There are 642 words total, which is hard to convey in a compact chart. We’ll solve that problem by filtering our dataset to only words that appear 75 times or more.

pos_tokens_count |>
  # only words that appear 75 times or more
  filter(n >= 75) |>
  ggplot(aes(x = reorder(word, -n), y = n)) +
  geom_bar(stat = "identity") +
  labs(
    title = "Count of Words Associated with Positivity",
    subtitle = "Tweets with the hashtag #tidytuesday",
    caption = "Data: Twitter and NRC",
    x = "",
    y = "Count"
  ) 

“Dataviz” and Other Positive Words

Earlier in the analysis we learned that “dataviz” was among the most frequently occurring words in this dataset. We can continue our exploration of TidyTuesday tweets by seeing how many tweets with “dataviz” also had at least one positive word from the NRC dataset. Looking at this might give us some clues about how people in the TidyTuesday learning community view dataviz as a tool.

There are a few steps to this part of the analysis, so let’s review our strategy. We’ll need to use the status_id field in the tweets dataset to filter the tweets that have the word dataviz in them. Then we need to use the status_id field in this new bunch of dataviz tweets to identify the tweets that include at least one positive word.

How do we know which status_id values contain the word “dataviz” and which ones contain a positive word? Recall that our tokens dataset only has one word per row, which makes it easy to use functions like filter() and inner_join() to make two new datasets: one of status_id values that have “dataviz” in the word column and one of status_id values that have a positive word in the word column.

We’ll explore the combinations of “dataviz” and any positive words in our tweets dataset using these three ingredients: our tweets dataset, a vector of status_ids for tweets that have “dataviz” in them, and a vector of status_ids for tweets that have positive words in them. Now that we have our strategy, let’s write some code and see how it works.

First, we’ll make a vector of status_ids for tweets that have “dataviz” in them. This will be used later to identify tweets that contain “dataviz” in the text. We’ll use filter() on our tokens dataset to keep only the rows that have “dataviz” in the word column. Let’s name that new dataset dv_tokens.

dv_tokens <-
  tokens %>%
  filter(word == "dataviz")

dv_tokens
# A tibble: 607 × 2
   status_id           word   
   <chr>               <chr>  
 1 1116518351147360257 dataviz
 2 1098025772554612738 dataviz
 3 1161454327296339968 dataviz
 4 1110711892086001665 dataviz
 5 1151926405162291200 dataviz
 6 1095854400004853765 dataviz
 7 1157111441419395074 dataviz
 8 1154958378764046336 dataviz
 9 1105642831413239808 dataviz
10 1108196618464047105 dataviz
# ℹ 597 more rows

The result is a dataset that has status_ids in one column and the word “dataviz” in the other column. We can use $ to extract a vector of status_ids for tweets that have “dataviz” in the text. This vector has hundreds of values, so we’ll use head() to view just the first six.

# Extract status_id
head(dv_tokens$status_id)
[1] "1116518351147360257" "1098025772554612738" "1161454327296339968"
[4] "1110711892086001665" "1151926405162291200" "1095854400004853765"

Now let’s do this again, but this time we’ll make a vector of status_ids for tweets that have positive words in them. This will be used later to identify tweets that contain a positive word in the text. We’ll use filter() on our tokens dataset to keep only the rows that have any of the positive words in the word column. If you’ve been running all the code up to this point in the walkthrough, you’ll notice that you already have a dataset of positive words called nrc_pos, which can be turned into a vector of positive words by typing nrc_pos$word. We can use the %in% operator in our call to filter() to find only words that are in this vector of positive words. Let’s name this new dataset pos_tokens.

pos_tokens <- 
  tokens %>%
  filter(word %in% nrc_pos$word)

pos_tokens
# A tibble: 4,885 × 2
   status_id           word      
   <chr>               <chr>     
 1 1163154266065735680 throne    
 2 1001412196247666688 honey     
 3 1001412196247666688 production
 4 1001412196247666688 increase  
 5 1001412196247666688 production
 6 1161638973808287746 found     
 7 991073965899644928  community 
 8 991073965899644928  community 
 9 991073965899644928  trend     
10 991073965899644928  population
# ℹ 4,875 more rows

The result is a dataset that has status_ids in one column and a positive word from tokens in the other column. We’ll again use $ to extract a vector of status_ids for these tweets.

# Extract status_id
head(pos_tokens$status_id) 
[1] "1163154266065735680" "1001412196247666688" "1001412196247666688"
[4] "1001412196247666688" "1001412196247666688" "1161638973808287746"

That’s a lot of status_ids, many of which are duplicates. Let’s try and make the vector of status_ids a little shorter. We can use distinct() to get a data frame of status_ids, where each status_id only appears once:

pos_tokens <-
  pos_tokens %>% 
  distinct(status_id)

Note that distinct() drops all variables except for status_id. For good measure, let’s use distinct() on our dv_tokens data frame too:

dv_tokens <-
  dv_tokens %>% 
  distinct(status_id)
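
As a small illustration of what distinct() does here, the sketch below uses a made-up tokens data frame (the status_id values and words are hypothetical, not taken from the TidyTuesday data). It also shows the .keep_all = TRUE argument of distinct(), which keeps the other columns in case you still need them.

```r
library(dplyr)

# Hypothetical tokens: tweet "a" contributes two rows, tweet "b" one
toy_tokens <- tibble::tibble(
  status_id = c("a", "a", "b"),
  word      = c("great", "love", "great")
)

# One row per status_id; the word column is dropped
toy_ids <- toy_tokens %>%
  distinct(status_id)

# One row per status_id, keeping the first row seen for each status_id
toy_first <- toy_tokens %>%
  distinct(status_id, .keep_all = TRUE)

toy_ids
toy_first
```

In the walkthrough only the status_id values are needed afterwards, so dropping the word column is harmless.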

Now we have a data frame of status_id for tweets containing “dataviz” and another for tweets containing a positive word. Let’s use these to transform our tweets dataset. First we’ll filter tweets for rows that have the “dataviz” status_id. Then we’ll create a new column called positive that will tell us if the status_id is from our vector of positive word status_ids. We’ll name this filtered dataset dv_pos.

dv_pos <-
  tweets %>%
  # Only tweets that have the dataviz status_id
  filter(status_id %in% dv_tokens$status_id) %>%
  # Is the status_id among the status_ids of tweets with a positive word?
  mutate(positive = if_else(status_id %in% pos_tokens$status_id, 1, 0))

Let’s take a moment to dissect how we use if_else() to create our positive column. We gave if_else() three arguments:

  • status_id %in% pos_tokens$status_id: a logical statement

  • 1: the value of positive if the logical statement is true

  • 0: the value of positive if the logical statement is false

So our new positive column will take the value 1 if the status_id was in our pos_tokens dataset and the value 0 if the status_id was not in our pos_tokens dataset. Practically speaking, positive is 1 if the tweet has a positive word and 0 if it does not have a positive word.
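
The same filter-then-flag pattern can be tried on a tiny example. This is only a sketch: the tweets and the two ID vectors below are hypothetical stand-ins for tweets, dv_tokens$status_id, and pos_tokens$status_id.

```r
library(dplyr)

# Hypothetical tweets and status_ids, for illustration only
toy_tweets <- tibble::tibble(
  status_id = c("a", "b", "c", "d"),
  text      = c("love this dataviz", "a dataviz", "no keyword here", "great dataviz")
)
toy_dv_ids  <- c("a", "b", "d")  # stand-in for dv_tokens$status_id
toy_pos_ids <- c("a", "d")       # stand-in for pos_tokens$status_id

toy_dv_pos <- toy_tweets %>%
  # keep only the tweets whose status_id is in the "dataviz" vector
  filter(status_id %in% toy_dv_ids) %>%
  # 1 if the status_id is also in the positive vector, 0 otherwise
  mutate(positive = if_else(status_id %in% toy_pos_ids, 1, 0))

toy_dv_pos
```

Tweet "c" is removed by filter(), and of the three remaining tweets only "b" gets a 0 in the positive column.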

And finally, let’s see what percent of tweets that had “dataviz” in them also had at least one positive word:

dv_pos %>%
  count(positive) %>%
  mutate(perc = n / sum(n)) 
# A tibble: 2 × 3
  positive     n  perc
     <dbl> <int> <dbl>
1        0   272 0.450
2        1   333 0.550
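
To see where the perc column comes from, the calculation can be reproduced by hand from the two counts in the table above. Because positive is coded as 0 and 1, the mean of that column gives the share of 1s directly. The vector below is rebuilt from the printed counts for illustration; it is not the dv_pos object itself.

```r
# Counts from the table above: 272 tweets without and 333 with a positive word
n <- c(272, 333)

# Each count divided by the total number of "dataviz" tweets
perc <- n / sum(n)
round(perc, 3)

# Equivalent shortcut: the mean of a 0/1 column is the share of 1s
positive <- rep(c(0, 1), times = n)
mean(positive)
```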

Your Turn:

The table above shows what share of the tweets containing “dataviz” also had at least one positive word. Please interpret the table and describe what you discovered at the end of this analysis.

Your Response Here: In this analysis of tweets containing the term “dataviz” from the TidyTuesday dataset, I discovered a predominantly positive sentiment towards data visualization. The output shows that out of the total tweets analyzed, approximately 55.04% (333 tweets) contained at least one positive word, as identified by the NRC Word-Emotion Association Lexicon. This indicates that more than half of the tweets about “dataviz” are associated with positive sentiments. On the other hand, around 44.96% (272 tweets) did not include positive words, which could suggest a neutral stance or simply an absence of explicitly positive language, rather than a negative sentiment. This finding suggests that the TidyTuesday community generally views “dataviz” in a positive light, reflecting enthusiasm, appreciation, or positive engagement with data visualization. However, it’s important to consider the limitations of sentiment analysis, as it may not capture all nuances, especially in tweets where context is key. The presence of positive words does not always equate to an overall positive sentiment, but in this case, it does indicate a favorable trend towards “dataviz” within the TidyTuesday community.

Since the point of exploratory data analysis is to explore and develop questions, let’s continue to do that. In this last section we’ll review a random selection of tweets for context.

Taking A Close Read of Randomly Selected Tweets

Let’s review where we are so far as we work to learn more about the TidyTuesday learning community through tweets. So far we’ve counted frequently used words and estimated the number of tweets with positive associations. This dataset is large, so we need to zoom out and find ways to summarize the data. But it’s also useful to explore by zooming in and reading some of the tweets. Reading tweets helps us to build intuition and context about how users talk about TidyTuesday in general. Even though this doesn’t lead to quantitative findings, it helps us to learn more about the content we’re studying and analyzing. Instead of reading all 4418 tweets, let’s write some code to randomly select tweets to review.

First, let’s make a dataset of tweets that had positive words from the NRC dataset. Remember earlier when we made a dataset of tweets that had “dataviz” and a column that had a value of 1 for containing positive words and 0 for not containing positive words? Let’s reuse that technique, but instead of applying it to a dataset of tweets containing “dataviz”, let’s use it on our dataset of all tweets.

pos_tweets <-
  tweets %>%
  mutate(positive = if_else(status_id %in% pos_tokens$status_id, 1, 0)) %>%
  filter(positive == 1)

Again, we’re using if_else() to make a new column called positive that takes its value based on whether status_id %in% pos_tokens$status_id is true or not.

We can use slice() to help us pick the rows. When we pass slice() a row number, it returns that row from the dataset. For example, we can select the 1st and 3rd row of our tweets dataset this way:

tweets %>% 
  slice(1, 3)
# A tibble: 2 × 2
  status_id           text                                                      
  <chr>               <chr>                                                     
1 1163154266065735680 "First #TidyTuesday submission! Roman emperors and their …
2 1001412196247666688 "My #tidytuesday submission for week 8. Honey production …

Randomly selecting rows from a dataset is a great technique to have in your toolkit. Random selection helps us avoid some of the biases we all have when we pick rows to review ourselves.

Here’s one way to do that using base R:

sample(x = 1:10, size = 5)
[1] 1 8 9 6 7

Passing sample() a vector of numbers and the size of the sample you want returns a random selection from the vector. Try changing the value of x and size to see how this works.

{dplyr} has a version of this called sample_n() that we can use to randomly select rows in our tweets dataset. Using sample_n() looks like this:

set.seed(2020)

pos_tweets %>% 
  sample_n(., size = 10)
# A tibble: 10 × 3
   status_id           text                                             positive
   <chr>               <chr>                                               <dbl>
 1 1133817727615930369 "@allison_horst @thomas_mock I wonder if there …        1
 2 1146039959725518848 "⁣⁣⁣⁣⁣⁣Follow @House_of_Honda⁣⁣⁣⁣\n⁣⁣⁣\n⁣⁣⁣⁣\n\U0001f530 This FB…        1
 3 988608513663528960  "#TidyTuesday\nI know there must be far easier …        1
 4 1117623421012205568 "#TidyTuesday #rstats Week 2019-04-09: 50 years…        1
 5 1029834250063945728 "Following my previous tweet, I took a deep div…        1
 6 1141050762337882112 "1960: the year owls with extreme-length ears r…        1
 7 1118668150780907520 "A simple #TidyTuesday submission this week. A …        1
 8 993149495193100288  "@TheresaWege Great visualization Theresa! Than…        1
 9 1117692475135520768 "#TidyTuesday inspired by @sil_aarts https://t.…        1
10 1143570115864211456 "Had another go at the #tidytuesday bird count …        1

That returned ten randomly selected tweets that we can now read through and discuss. Let’s look a little closer at how we did that. We used sample_n(), which returns randomly selected rows from our pos_tweets dataset. We also specified that size = 10, which means we want sample_n() to give us 10 randomly selected rows. A few lines before that, we used set.seed(2020). This helps us ensure that, while sample_n() picks 10 rows at random, our readers can run this code and get the same result we did. Using set.seed(2020) at the top of your code makes sample_n() pick the same ten rows every time. Try changing 2020 to another number and notice how sample_n() picks a different set of ten rows, but repeatedly picks those rows until you change the argument in set.seed().
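
The effect of set.seed() can be checked directly. The sketch below uses a hypothetical 20-row data frame rather than the tweets; it draws the same sample twice with the same seed and compares the results. It also shows slice_sample(), the newer {dplyr} verb that supersedes sample_n().

```r
library(dplyr)

# Hypothetical data frame of 20 rows, for illustration only
toy <- tibble::tibble(id = 1:20)

set.seed(2020)
first_draw <- toy %>% sample_n(size = 5)

set.seed(2020)
second_draw <- toy %>% sample_n(size = 5)

# Same seed, same rows
identical(first_draw$id, second_draw$id)

# slice_sample() does the same job in current versions of dplyr
set.seed(2020)
third_draw <- toy %>% slice_sample(n = 5)
third_draw
```

Resetting the seed before each draw is what makes the two samples match. Without the second set.seed() call, the second draw would generally differ from the first.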

Summary of Walkthrough 5

The purpose of this walkthrough is to share code with you so you can practice some basic text analysis techniques. Now it’s time to make your learning more meaningful by adapting this code to text-based files you regularly see at work. Try reading in some of these and doing a similar analysis:

  • News articles

  • Procedure manuals

  • Open ended responses in surveys

There are also advanced text analysis techniques to explore. Consider trying topic modeling (https://www.tidytextmining.com/topicmodeling.html) or finding correlations between terms (https://www.tidytextmining.com/ngrams.html), both described in (Silge & Robinson, 2017).

Finally, if you feel like there is more to analyze when it comes to this particular hashtag, we agree! We use this dataset further in the next chapter on social network analysis. Moreover, if you want to collect your own Twitter data, head to Appendix B to read about and consider some potential strategies.

Walkthrough 6: Exploring Relationships Using Social Network Analysis With Social Media Data

Topics Emphasized

  • Transforming data

  • Visualizing data

Functions Introduced

  • rtweet::search_tweets()

  • randomNames::randomNames()

  • tidyr::unnest()

  • tidygraph::as_tbl_graph()

  • ggraph::ggraph()

Vocabulary

  • Application Programming Interface (API)

  • edgelist

  • edge

  • influence model

  • regex

  • selection model

  • social network analysis

  • sociogram

  • vertex

Chapter Overview

In the previous walkthrough, we focused on using text analysis to understand the content of tweets. In this one, we focus on the interactions between #tidytuesday participants using social network analysis techniques.

While social network analysis is increasingly common, it remains challenging to carry out. For one, cleaning and tidying the data can be even more challenging than for most other data sources, as data for social network analysis (or network data) often includes variables about both individuals (such as information about students or teachers) and their relationships (whether they have a relationship at all, for example, or how strong or of what type their relationship is). This chapter is designed to take you from not having carried out social network analysis through visualizing network data.

Background

There are a few reasons to be interested in social media. For example, if you work in a school district, you may want to know who is interacting with the content you share. If you are a researcher, you may want to investigate what teachers, administrators, and others do through state-based hashtags (e.g., Joshua M. Rosenberg et al. (2016)). Social media-based data also provides new contexts for learning to take place, like in professional learning networks (Trust et al., 2016).

In the past, if a teacher wanted advice about how to plan a unit or to design a lesson, they would turn to a trusted peer in their building or district (Spillane et al., 2012). Today they are as likely to turn to someone in a social media network. Social media interactions like the ones tagged with the #tidytuesday hashtag are increasingly common in education. Using data science tools to learn from these interactions is valuable for improving the student experience.

Packages

In this chapter, we access data using the {rtweet} package (Michael W. Kearney, 2016). Through {rtweet} and a Twitter account, it is easy to access data from Twitter. We will load the {tidyverse} and {rtweet} packages to get started.

We will also load other packages that we will be using in this analysis, including two packages related to social network analysis (Pedersen, 2022a, 2022b) as well as one that will help us to use not-anonymized names in a savvy way (Betebenner, 2021). As always, if you have not installed any of these packages before (which may particularly be the case for the {rtweet}, {randomNames}, {tidygraph}, and {ggraph} packages, which we have not yet used in the book), do so using the install.packages() function. More on installing packages is included in the Packages section of the Foundational Skills chapter.

Let’s load the packages with the following calls to the library() function:

library(rtweet)

Attaching package: 'rtweet'
The following object is masked from 'package:purrr':

    flatten
library(dataedu)
library(randomNames)
Warning: package 'randomNames' was built under R version 4.3.2
library(tidygraph)
Warning: package 'tidygraph' was built under R version 4.3.2

Attaching package: 'tidygraph'
The following object is masked from 'package:stats':

    filter
library(ggraph)
Warning: package 'ggraph' was built under R version 4.3.2

Data Sources and Import

With {rtweet}, the search_tweets() function searches recent tweets that include a hashtag such as #rstats. When you run a search, you will be prompted to authenticate your access via Twitter.

You can change how many tweets are collected through the n argument of the search_tweets() function; for example, n = 500 collects the most recent 500 tweets. In this tutorial, we skip the live search and load the #tidytuesday tweets that are included in the {dataedu} package:

tt_tweets <- dataedu::tt_tweets

View Data

Your Turn

View the tt_tweets data with nrow() or glimpse() functions.

nrow(tt_tweets)
[1] 4418
library(dplyr)
glimpse(tt_tweets)
Rows: 4,418
Columns: 90
$ user_id                 <chr> "1159211379963772928", "1073323083170238466", …
$ status_id               <chr> "1163154266065735680", "1163247504542130176", …
$ created_at              <dttm> 2019-08-18 18:22:42, 2019-08-19 00:33:11, 201…
$ screen_name             <chr> "MKumarYYC", "cizzart", "cizzart", "cizzart", …
$ text                    <chr> "First #TidyTuesday submission! Roman emperors…
$ source                  <chr> "Twitter Web App", "Twitter Web App", "Twitter…
$ display_text_width      <dbl> 178, 280, 196, 214, 262, 227, 260, 271, 117, 2…
$ reply_to_status_id      <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ reply_to_user_id        <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ reply_to_screen_name    <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ is_quote                <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALS…
$ is_retweet              <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALS…
$ favorite_count          <int> 9, 1, 1, 1, 0, 1, 1, 1, 6, 5, 1, 4, 1, 1, 2, 1…
$ retweet_count           <int> 3, 1, 0, 1, 1, 1, 1, 0, 3, 2, 0, 2, 1, 1, 0, 1…
$ quote_count             <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ reply_count             <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ hashtags                <list> <"TidyTuesday", "rstats", "DataScience">, <"M…
$ symbols                 <list> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
$ urls_url                <list> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
$ urls_t.co               <list> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
$ urls_expanded_url       <list> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
$ media_url               <list> "http://pbs.twimg.com/media/ECRao8OU8AAFVux.p…
$ media_t.co              <list> "https://t.co/1LTWIUBXwA", "https://t.co/Bhqv…
$ media_expanded_url      <list> "https://twitter.com/MKumarYYC/status/1163154…
$ media_type              <list> "photo", "photo", "photo", "photo", "photo", …
$ ext_media_url           <list> "http://pbs.twimg.com/media/ECRao8OU8AAFVux.p…
$ ext_media_t.co          <list> "https://t.co/1LTWIUBXwA", "https://t.co/Bhqv…
$ ext_media_expanded_url  <list> "https://twitter.com/MKumarYYC/status/1163154…
$ ext_media_type          <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ mentions_user_id        <list> NA, "2792557771", "4471336875", "2249583145",…
$ mentions_screen_name    <list> NA, "eldestapeweb", "INDECArgentina", "ENACOM…
$ lang                    <chr> "en", "es", "es", "es", "es", "es", "es", "es"…
$ quoted_status_id        <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ quoted_text             <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ quoted_created_at       <dttm> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
$ quoted_source           <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ quoted_favorite_count   <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ quoted_retweet_count    <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ quoted_user_id          <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ quoted_screen_name      <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ quoted_name             <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ quoted_followers_count  <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ quoted_friends_count    <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ quoted_statuses_count   <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ quoted_location         <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ quoted_description      <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ quoted_verified         <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ retweet_status_id       <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ retweet_text            <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ retweet_created_at      <dttm> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
$ retweet_source          <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ retweet_favorite_count  <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ retweet_retweet_count   <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ retweet_user_id         <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ retweet_screen_name     <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ retweet_name            <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ retweet_followers_count <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ retweet_friends_count   <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ retweet_statuses_count  <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ retweet_location        <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ retweet_description     <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ retweet_verified        <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ place_url               <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ place_name              <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ place_full_name         <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ place_type              <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ country                 <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ country_code            <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ geo_coords              <list> <NA, NA>, <NA, NA>, <NA, NA>, <NA, NA>, <NA, …
$ coords_coords           <list> <NA, NA>, <NA, NA>, <NA, NA>, <NA, NA>, <NA, …
$ bbox_coords             <list> <NA, NA, NA, NA, NA, NA, NA, NA>, <NA, NA, NA…
$ status_url              <chr> "https://twitter.com/MKumarYYC/status/11631542…
$ name                    <chr> "Mehul Kumar", "César", "César", "César", "Cés…
$ location                <chr> "Calgary, AB", "Posadas, Argentina", "Posadas,…
$ description             <chr> "Dabbling in data devices | #rstats/#datascien…
$ url                     <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ protected               <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALS…
$ followers_count         <int> 10, 57, 57, 57, 57, 57, 57, 57, 57, 57, 57, 57…
$ friends_count           <int> 21, 120, 120, 120, 120, 120, 120, 120, 120, 12…
$ listed_count            <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ statuses_count          <int> 18, 152, 152, 152, 152, 152, 152, 152, 152, 15…
$ favourites_count        <int> 180, 977, 977, 977, 977, 977, 977, 977, 977, 9…
$ account_created_at      <dttm> 2019-08-07 21:15:04, 2018-12-13 21:05:39, 201…
$ verified                <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALS…
$ profile_url             <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ profile_expanded_url    <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ account_lang            <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ profile_banner_url      <chr> "https://pbs.twimg.com/profile_banners/1159211…
$ profile_background_url  <chr> NA, "http://abs.twimg.com/images/themes/theme1…
$ profile_image_url       <chr> "http://pbs.twimg.com/profile_images/116315617…

Methods: Process Data

Network data requires some processing before it can be used in subsequent analyses. The network dataset needs a way to identify each participant’s role in the interaction. We need to answer questions like:

  • Did someone reach out to another for help?

  • Was someone contacted by another for help?

We can process the data by creating an edgelist. An edgelist is a dataset where each row is a unique interaction between two parties. Each row (which represents a single relationship) in the edgelist is referred to as an edge. We note that one challenge facing data scientists beginning to use network analysis is the different terms that are used for similar (or the same!) aspects of analyses: Edges are sometimes referred to as ties or relations, but these generally refer to the same thing, though they may be used in different contexts.

An edgelist has two columns: the sender (sometimes called the “nominator”) column identifies who is initiating the interaction and the receiver (sometimes called the “nominee”) column identifies who is receiving the interaction.

In an edgelist, the sender column might identify someone who nominates another (the receiver) as someone they go to for help. The sender might also identify someone who interacts with the receiver in other ways, like “liking” or “mentioning” their tweets. In the following steps, we will work to create an edgelist from the data from #tidytuesday on Twitter.
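
As a minimal sketch of the structure described above, an edgelist can be written as a two-column data frame. The names below are made up for illustration and do not come from the #tidytuesday data.

```r
# Hypothetical edgelist: each row is one interaction (an edge)
toy_edgelist <- tibble::tibble(
  sender   = c("Ana", "Ana", "Ben", "Cara"),
  receiver = c("Ben", "Cara", "Cara", "Ana")
)

toy_edgelist
```

Here Ana appears twice as a sender because she initiated two interactions, and each of those interactions is its own row.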

Extracting Mentions

Let’s extract the mentions. There is a lot going on in the code below; let’s break it down line-by-line, starting with mutate():

  • mutate(all_mentions = str_extract_all(text, regex)): this line uses a regex, or regular expression, to identify all of the usernames in the tweet (note: the regex comes from this Stack Overflow page: https://stackoverflow.com/questions/18164839/get-twitter-username-with-regex-in-r)

  • unnest(all_mentions): this line uses a {tidyr} function, unnest(), to move every mention to its own line, while keeping all of the other information the same (see more about unnest() here: https://tidyr.tidyverse.org/reference/unnest.html).

Now let’s use these functions to extract the mentions from the dataset. Here’s how all the code looks in action:

regex <- "@([A-Za-z]+[A-Za-z0-9_]+)(?![A-Za-z0-9_]*\\.)"

tt_tweets <-
  tt_tweets %>%
  # Use regular expression to identify all the usernames in a tweet
  mutate(all_mentions = str_extract_all(text, regex)) %>%
  unnest(all_mentions)

Let’s put these into their own data frame, called mentions.

mentions <-
  tt_tweets %>%
  mutate(all_mentions = str_trim(all_mentions)) %>%
  select(sender = screen_name, all_mentions)
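
To see what these steps do on a small scale, the sketch below applies the same regular expression to three made-up tweets (the screen names and text are hypothetical). It assumes the {dplyr}, {stringr}, and {tidyr} packages are installed.

```r
library(dplyr)
library(stringr)
library(tidyr)

# Same regular expression as in the walkthrough
regex <- "@([A-Za-z]+[A-Za-z0-9_]+)(?![A-Za-z0-9_]*\\.)"

# Hypothetical tweets, for illustration only
toy_tweets <- tibble::tibble(
  screen_name = c("user_one", "user_two", "user_three"),
  text = c(
    "Thanks @alice_r and @bob42 for the tips!",
    "No mentions in this one",
    "Nice plot, @alice_r"
  )
)

toy_mentions <- toy_tweets %>%
  # all_mentions is a list column: one character vector per tweet
  mutate(all_mentions = str_extract_all(text, regex)) %>%
  # one row per mention; tweets without any mention drop out by default
  unnest(all_mentions) %>%
  select(sender = screen_name, all_mentions)

toy_mentions
```

The first tweet contributes two rows and the second contributes none, which is why the mentions data frame can end up with a different number of rows than the tweets it came from.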

Putting the Edgelist Together

Recall that an edgelist is a data structure that has columns for the “sender” and “receiver” of interactions. Someone “sends” the mention to someone who is mentioned, who can be considered to “receive” it. To make the edgelist, we’ll need to clean it up a little by removing the “@” symbol. Let’s look at our data as it is now.

mentions
# A tibble: 2,447 × 2
   sender  all_mentions    
   <chr>   <chr>           
 1 cizzart @eldestapeweb   
 2 cizzart @INDECArgentina 
 3 cizzart @ENACOMArgentina
 4 cizzart @tribunalelecmns
 5 cizzart @CamaraElectoral
 6 cizzart @INDECArgentina 
 7 cizzart @tribunalelecmns
 8 cizzart @CamaraElectoral
 9 cizzart @AgroMnes       
10 cizzart @AgroindustriaAR
# ℹ 2,437 more rows

Let’s remove that “@” symbol from the columns we created and save the results to a new tibble, edgelist.

edgelist <- 
  mentions %>% 
  # remove "@" from all_mentions column
  mutate(all_mentions = str_sub(all_mentions, start = 2)) %>% 
  # rename all_mentions to receiver
  select(sender, receiver = all_mentions)
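
The effect of str_sub() can be checked on a couple of made-up mentions. The sketch also shows str_remove() with the pattern "^@" as an alternative that only strips the first character when it really is an “@”; this is a suggestion, not what the walkthrough uses.

```r
library(stringr)

# Hypothetical mentions, for illustration only
toy <- c("@alice_r", "@bob42")

# Drop the first character, as in the walkthrough
with_sub <- str_sub(toy, start = 2)

# Alternative: remove a leading "@" only if it is present
with_remove <- str_remove(toy, "^@")

with_sub
with_remove
```

Both approaches give the same result here because every extracted mention starts with “@”.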

Analysis and Results

Now that we have our edgelist, let’s plot the network. We’ll use the {tidygraph} and {ggraph} packages to visualize the data. We note that network visualizations are often referred to as sociograms, or a representation of the relationships between individuals in a network. We use this term and the term network visualization interchangeably in this chapter.

Plotting the Network

Large networks like this one can be hard to work with because of their size. We can get around that problem by only including some individuals. Let’s explore how many interactions each individual in the network sent by using count():

interactions_sent <- edgelist %>% 
  # this counts how many times each sender appears in the data frame, effectively counting how many interactions each individual sent 
  count(sender) %>% 
  # arranges the data frame in descending order of the number of interactions sent
  arrange(desc(n))

interactions_sent
# A tibble: 618 × 2
   sender            n
   <chr>         <int>
 1 thomas_mock     347
 2 R4DScommunity    78
 3 WireMonkey       52
 4 CedScherer       41
 5 allison_horst    37
 6 mjhendrickson    34
 7 kigtembu         27
 8 WeAreRLadies     25
 9 PBecciu          23
10 sil_aarts        23
# ℹ 608 more rows

Your Turn:

What are your first impressions of the “interactions_sent” data frame?

Your Response: Analyzing the interactions_sent DataFrame from the #tidytuesday Twitter community reveals several interesting aspects about user engagement and interaction patterns. At the top, user thomas_mock shows a crazy high level of activity with 347 interactions, indicating an important role in the community, possibly as a key influencer or central figure. Following are users like R4DScommunity and WireMonkey, with 78 and 52 interactions respectively, suggesting their significant contribution and engagement in the community.

The data shows a pronounced drop-off in interaction counts after the top few users, which is a common characteristic in social networks where a handful of participants tend to be much more active than the rest. This pattern suggests that the community’s discussions and dynamics might be driven by a few influential individuals. Users with moderate interaction counts, such as CedScherer, allison_horst, and mjhendrickson, might represent community leaders or active content creators, playing important roles in disseminating information, sparking discussions, or providing support.

The presence of a broad range of users with varying levels of interaction indicates a diverse and healthy community where multiple voices contribute to the discourse. The top users by interaction count likely act as influencers within the community, significantly impacting the topics discussed, the direction of conversations, and the overall community sentiment.

Let’s focus on only those individuals who sent more than one interaction:

interactions_sent <- 
  interactions_sent %>% 
  filter(n > 1)

That leaves us with only 349 individuals, which will be much easier to work with.

We now need to filter the edgelist to only include these 349 individuals. The following code uses the filter() function combined with the %in% operator to do this:

edgelist <- edgelist %>% 
  # the first of the two lines below filters to include only senders in the interactions_sent data frame
  # the second line does the same, for receivers
  filter(sender %in% interactions_sent$sender,
         receiver %in% interactions_sent$sender)
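
The following sketch repeats the count-then-filter steps on a made-up edgelist (all names are hypothetical). It shows that an edge is kept only when both its sender and its receiver are among the frequent senders, which may be why the graph built from the filtered edgelist can contain fewer individuals than the list of frequent senders.

```r
library(dplyr)

# Hypothetical edgelist, for illustration only
toy_edgelist <- tibble::tibble(
  sender   = c("Ana", "Ana", "Ben", "Ben", "Cara"),
  receiver = c("Ben", "Cara", "Ana", "Dev", "Ana")
)

# Senders with more than one interaction: Ana and Ben
toy_senders <- toy_edgelist %>%
  count(sender) %>%
  filter(n > 1)

# Keep only edges where both ends are in toy_senders
toy_filtered <- toy_edgelist %>%
  filter(sender %in% toy_senders$sender,
         receiver %in% toy_senders$sender)

toy_filtered
```

Only the two edges between Ana and Ben remain. The edges involving Cara or Dev are dropped because neither of them sent more than one interaction.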

We’ll use the as_tbl_graph() function, which identifies the first column as the “sender” and the second as the “receiver.” Let’s look at the object it creates:

g <- 
  as_tbl_graph(edgelist)

g
# A tbl_graph: 267 nodes and 975 edges
#
# A directed multigraph with 7 components
#
# A tibble: 267 × 1
  name           
  <chr>          
1 dgwinfred      
2 datawookie     
3 jvaghela4      
4 FournierJohanie
5 JonTheGeek     
6 jakekaupp      
# ℹ 261 more rows
#
# A tibble: 975 × 2
   from    to
  <int> <int>
1     1    32
2     1    36
3     2   120
# ℹ 972 more rows

We can see that the network now has 267 individuals, all of whom sent more than one interaction. The individuals in a network are often referred to as nodes; this terminology is also used in the {ggraph} functions for plotting the individuals in a network. We note that nodes are sometimes referred to as vertices or actors; like the different names for edges, these generally mean the same thing.
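
To get a feel for the object that as_tbl_graph() returns, it can help to build one from a very small edgelist and pull out its two tables. The sketch below uses hypothetical names and assumes the {dplyr} and {tidygraph} packages are installed.

```r
library(dplyr)
library(tidygraph)

# Hypothetical edgelist, for illustration only
toy_edgelist <- data.frame(
  sender   = c("Ana", "Ana", "Ben", "Cara"),
  receiver = c("Ben", "Cara", "Cara", "Ana")
)

toy_g <- as_tbl_graph(toy_edgelist)

# The graph holds a node table and an edge table;
# activate() chooses which one the next verbs operate on
toy_nodes <- toy_g %>% activate(nodes) %>% as_tibble()
toy_edges <- toy_g %>% activate(edges) %>% as_tibble()

toy_nodes
toy_edges
```

The node table has one row per individual, and the edge table refers to individuals by their row position in the node table, as in the from and to columns printed above.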

Next, we’ll use the ggraph() function:

g %>%
  # we chose the kk layout as it created a graph which was easy-to-interpret, but others are available; see ?ggraph
  ggraph(layout = "kk") +
  # this adds the points to the graph
  geom_node_point() +
  # this adds the links, or the edges; alpha = .2 makes it so that the lines are partially transparent
  geom_edge_link(alpha = .2) +
  # this last line of code adds a ggplot2 theme suitable for network graphs
  theme_graph()
Warning: Using the `size` aesthetic in this geom was deprecated in ggplot2 3.4.0.
ℹ Please use `linewidth` in the `default_aes` field and elsewhere instead.

Finally, let’s size the points based on a measure of centrality. A common way to do this is to measure how influential an individual may be based on the interactions observed.

g %>% 
  # this calculates the centrality of each individual using the built-in centrality_authority() function
  mutate(centrality = centrality_authority()) %>% 
  ggraph(layout = "kk") + 
  geom_node_point(aes(size = centrality, color = centrality)) +
  # this line colors the points based upon their centrality
  scale_color_continuous(guide = 'legend') + 
  geom_edge_link(alpha = .2) 
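
Centrality scores can also be inspected as a table rather than only through point sizes. The sketch below computes centrality_authority() on a hypothetical four-person network and, as an added comparison that is not part of the walkthrough, the in-degree from centrality_degree(mode = "in"), which simply counts how many interactions each individual received.

```r
library(dplyr)
library(tidygraph)

# Hypothetical edgelist: three people mention Cara, one mentions Ben
toy_edgelist <- data.frame(
  sender   = c("Ana", "Ben", "Dev", "Ana"),
  receiver = c("Cara", "Cara", "Cara", "Ben")
)

toy_centrality <- as_tbl_graph(toy_edgelist) %>%
  activate(nodes) %>%
  mutate(
    authority = centrality_authority(),
    in_degree = centrality_degree(mode = "in")
  ) %>%
  # turn the node table into a regular tibble and sort it
  as_tibble() %>%
  arrange(desc(in_degree))

toy_centrality
```

In this toy network Cara receives the most interactions, so she sits at the top of the table. Sorting the node table this way is one option for naming the individuals behind the largest points in a sociogram.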

Your Turn

What does this social network graph tell us about the measure of centrality?

  • The social network graph of the #tidytuesday Twitter community, with nodes sized according to a centrality measure, offers a revealing glimpse into the community’s dynamics and structure. Centrality is a crucial metric in network analysis, indicating the importance or influence of individuals within the network. The presence of a highly central node, shown with a centrality of 1.00 slightly to the right of the graph’s center, suggests that this individual is exceptionally influential. This could be due to their active engagement in initiating conversations, responding to others, or being frequently mentioned, and it means they may play an important role in disseminating information, connecting subgroups, or driving community discussions.

    Outliers in the graph are also telling. These individuals, with fewer connections, might represent newer members, less active participants, or those engaged in more niche topics within the broader #tidytuesday context. Their peripheral positions offer insights into the community’s inclusivity and the diversity of engagement levels.

    The overall structure of the graph, including the central node and outliers, reveals the community’s organizational pattern. A graph with a few nodes having significantly higher centrality scores suggests a community with a few very influential members, indicating a centralized structure.

    This graph also sheds light on the flow of information and influence within the community. Individuals with high centrality are likely major contributors to the spread of information, ideas, and trends, often seen as authorities or thought leaders. For those seeking to engage with the community, understanding centrality may help identify key individuals for connection, potentially offering more visibility and broader reach within the community.

Conclusion

In this chapter, we used social media data from the #tidytuesday hashtag to prepare and visualize social network data. Sociograms are a useful visualization tool to reveal who is interacting with whom and, in some cases, to suggest why. In our applications of data science, we have found that the individuals (such as teachers or students) who are represented in a network often like to see what the network (and the relationships in it) looks like. It can be compelling to think about why networks are the way they are, and how changes could be made to, for example, foster more connections between individuals who have few opportunities to interact. In this way, social network analysis can be useful to the data scientist in education because it provides a technique to communicate with other educational stakeholders in a compelling way.

Social network analysis is a broad (and growing) domain, and this chapter was intended to present some of its foundations. Fortunately for R users, many recent developments are implemented first in R (e.g., the {amen} package). If you are interested in some of the additional steps that you can take to model and analyze network data, consider Appendix C, which covers two types of models (for selection and influence processes).