Assignment 5: Colbert v Fallon

Option 1

Introduction

This document explores tweets mentioning Jimmy Fallon and The Tonight Show on NBC versus Stephen Colbert and The Late Show on CBS. There are four data sets in this analysis, each containing the 200 most recent tweets as of 10pm on April 20, 2020. The data sets are tweets mentioning the following handles: @StephenAtHome (Stephen Colbert’s personal account), @ColbertLateShow (Stephen Colbert’s show account), @JimmyFallon (Jimmy Fallon’s personal account), and @FallonTonight (Jimmy Fallon’s show account). I’ve chosen these two late night shows and hosts for the assignment because I’ve found myself with HOURS more free time thanks to the quarantine - hours which I have filled with way too much late night television after my new roommates (aka parents) have gone to bed.

Packages

Several packages will be critical for our analysis.

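Tidyverse: The tidyverse is a collection of R packages that share a common design philosophy, grammar, and data structures, which makes it easier to move between the steps of a data science workflow.

Dplyr: dplyr is the tidyverse package for data manipulation. Its verbs (such as filter, group_by, and summarise) are comparable to SQL operations and help users manipulate data sets easily.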

## -- Attaching packages ----------------------------------------------------------------- tidyverse 1.3.0 --
## v ggplot2 3.2.1     v purrr   0.3.3
## v tibble  2.1.3     v dplyr   0.8.3
## v tidyr   1.0.2     v stringr 1.4.0
## v readr   1.3.1     v forcats 0.4.0
## -- Conflicts -------------------------------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()

Import Data

## Parsed with column specification:
## cols(
##   .default = col_character(),
##   created_at = col_datetime(format = ""),
##   display_text_width = col_double(),
##   is_quote = col_logical(),
##   is_retweet = col_logical(),
##   favorite_count = col_double(),
##   retweet_count = col_double(),
##   quote_count = col_logical(),
##   reply_count = col_logical(),
##   symbols = col_logical(),
##   ext_media_type = col_logical(),
##   quoted_created_at = col_datetime(format = ""),
##   quoted_favorite_count = col_double(),
##   quoted_retweet_count = col_double(),
##   quoted_followers_count = col_double(),
##   quoted_friends_count = col_double(),
##   quoted_statuses_count = col_double(),
##   quoted_verified = col_logical(),
##   retweet_status_id = col_logical(),
##   retweet_text = col_logical(),
##   retweet_created_at = col_logical()
##   # ... with 21 more columns
## )
## See spec(...) for full column specifications.
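The import code itself is not shown in this report, so the following is only a sketch of how one of the files might be read. The file contents, the temporary file, and the choice of columns are assumptions made for illustration. Passing `col_types` explicitly silences the "Parsed with column specification" message and keeps readr from guessing a column type from the first rows (in the output above, for example, `quote_count` and `reply_count` were guessed as logical).

```r
library(readr)

# Hypothetical stand-in for one exported file; the real file names
# and full column set are not shown in this report.
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "screen_name,text,favorite_count,retweet_count,is_retweet",
  "user_a,Great monologue tonight,3,1,FALSE",
  "user_b,Watching the show now,0,0,FALSE"
), tmp)

# Declare the column types up front instead of letting readr guess them.
tweets <- read_csv(
  tmp,
  col_types = cols(
    .default       = col_character(),
    favorite_count = col_double(),
    retweet_count  = col_double(),
    is_retweet     = col_logical()
  )
)
```

The same call could be repeated for each of the four data sets, changing only the file path.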

Question 1

Question: When mentioned on Twitter, which of the four accounts gets the highest average number of favorites? Fan engagement via favorites is a key measure of the connection between a show and/or host and its viewers, and is indicative of their relative success.

Data: I pulled the most recent 200 tweets mentioning each of the shows or personal accounts. I will find the average favorites for each.

Analysis:

## # A tibble: 1 x 1
##   Avg_Fav_Colbert_Show
##                  <dbl>
## 1                 7.79
## # A tibble: 1 x 1
##   Avg_Fav_Colbert_Pers
##                  <dbl>
## 1                 1.52
## # A tibble: 1 x 1
##   Avg_Fav_Fallon_Show
##                 <dbl>
## 1                1.28
## # A tibble: 1 x 1
##   Avg_Fav_Fallon_Pers
##                 <dbl>
## 1               0.255

Results and Interpretation:

Average favorites for tweets mentioning Colbert’s show account: 7.79
Average favorites for tweets mentioning Colbert’s personal account: 1.52
Average favorites for tweets mentioning Fallon’s show account: 1.28
Average favorites for tweets mentioning Fallon’s personal account: 0.26

One of the tweets from Colbert’s show account got 519 favorites, an outlier that on its own contributes roughly 2.6 of the 7.79 average (519 favorites spread over 200 tweets). However, on both show and personal accounts, tweets associated with Colbert are receiving more favorites. This could be an indication that Colbert is resonating better with fans or engaging with them online more effectively.
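A minimal sketch of the averaging step, assuming a tibble with a `favorite_count` column like the one in the imported data. The toy values and the object names (`mentions`, `fav_summary`) are invented for illustration and are not the real tweets. The median is included alongside the mean because a single heavily favorited tweet, like the 519-favorite one noted above, can pull the mean up considerably.

```r
library(dplyr)

# Toy data standing in for one data set; values are invented.
mentions <- tibble(
  screen_name    = c("a", "b", "c", "d", "e"),
  favorite_count = c(0, 1, 2, 1, 96)
)

# Mean, median, and maximum favorites in a single summary row.
fav_summary <- mentions %>%
  summarise(
    avg_fav    = mean(favorite_count),
    median_fav = median(favorite_count),
    max_fav    = max(favorite_count)
  )

print(fav_summary)
```

In this toy example one tweet with 96 favorites lifts the mean to 20 while the median stays at 1, which is the kind of gap worth checking before comparing averages across accounts.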

Question 2

Question: Which usernames are sending the most tweets mentioning each show’s account? By understanding the most engaged users, we can make some assumptions about each show’s key target markets.

Data: I pulled the most recent 200 tweets mentioning each of the shows’ accounts. I will group by the usernames to see how frequently top accounts are tweeting.

Analysis:

## # A tibble: 157 x 2
##    screen_name         n
##    <chr>           <int>
##  1 colbertlateshow     8
##  2 wlj511              7
##  3 al_cidi             4
##  4 Chiefsnakeeyes      4
##  5 jim_witte_01        4
##  6 michajfs            4
##  7 browneyedsugar      3
##  8 DianaMaumee         3
##  9 Fever_Mark          3
## 10 Moonwake            3
## # ... with 147 more rows
## # A tibble: 170 x 2
##    screen_name         n
##    <chr>           <int>
##  1 ChrisRX18          14
##  2 DianeShamp          3
##  3 RyanBartholomee     3
##  4 1infinitecosmos     2
##  5 AlexJ_May           2
##  6 caringdanielle      2
##  7 CodyHillmanKing     2
##  8 DoReMiJackie        2
##  9 JohnEllena          2
## 10 MINMEI3013          2
## # ... with 160 more rows

Results and Interpretation:

The Late Show with Stephen Colbert:
1. @colbertlateshow, 8 tweets: They are their own top tweeter
2. @wlj511, 7 tweets: Wayne Johnson is a lovely old man who likes to keep the show updated on his life
3. @al_cidi, 4 tweets: This is a really hectic account and I don’t really understand what it is

The Tonight Show Starring Jimmy Fallon:
1. @ChrisRX18, 14 tweets: Christine is a lovely, middle-aged woman who loves watching the Oscars and Democratic debates
2. @DianeShamp, 3 tweets: Diane is another middle-aged woman who is a big fan of the Democratic Party
3. @RyanBartholomee, 3 tweets: Ryan is a father of 3 gals and religiously tweets his every reaction to Jimmy Fallon

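From just this small glance at the most engaged individuals, Colbert viewers (and probably CBS viewers as a whole) appear to skew older, while Fallon appears to appeal to middle-aged Democrats. With only three accounts per show, this is an impression rather than a firm conclusion.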
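The grouping step can be sketched as follows, assuming a `screen_name` column as in the output above. The toy rows and the object names (`mentions`, `top_users`) are invented for illustration. `count()` with `sort = TRUE` is dplyr shorthand for grouping by a column, tallying the rows, and ordering by the tally.

```r
library(dplyr)

# Toy data standing in for one show's mentions; values are invented.
mentions <- tibble(
  screen_name = c("user_a", "user_b", "user_a", "user_c", "user_a", "user_b"),
  text        = paste("tweet", 1:6)
)

# Number of tweets per username, most active first.
top_users <- mentions %>%
  count(screen_name, sort = TRUE)

print(top_users)
```

Piping the result into `head(3)` would keep only the three most active usernames, matching the lists above.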

Question 3

Question: Which host’s personal account has the most mentions with text containing “Trump”? Comedians, and specifically late night hosts, have walked the line between being political and staying neutral. By examining trends connecting Trump with the late night hosts, we can start to understand how viewers are engaging with their political strategy (or lack thereof).

Data: Using the most recent tweets mentioning each host’s personal account, I will count the number of rows containing the word “Trump” in the tweet’s text.

Analysis:

## [1] 26
## [1] 3

Results and Interpretation:

Stephen Colbert: 26 tweets
Jimmy Fallon: 3 tweets

This seems in line with each host’s strategy and political activism. Stephen Colbert does not shy away from criticizing Trump’s policies or asserting his opinions. It seems that fans are more engaged with him politically than with Jimmy Fallon, who sparingly references Trump.
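One way the count might be computed is sketched below, assuming the tweet body lives in a `text` column. The toy tweets and object names are invented for illustration. Whether the original match was case-sensitive is not stated in this report, so both variants are shown.

```r
library(dplyr)

# Toy data standing in for one host's mentions; values are invented.
mentions <- tibble(
  text = c(
    "Loved the Trump segment last night",
    "Great musical guest!",
    "trump jokes again?",
    "More Trump commentary please"
  )
)

# Case-sensitive: rows whose text contains the exact string "Trump".
trump_count <- mentions %>%
  filter(grepl("Trump", text, fixed = TRUE)) %>%
  nrow()

# Case-insensitive alternative, which also catches "trump" and "TRUMP".
trump_count_any_case <- sum(grepl("trump", mentions$text, ignore.case = TRUE))

print(trump_count)
print(trump_count_any_case)
```

The two counts differ here because one toy tweet uses a lowercase "trump", so the choice of matching rule is worth stating alongside the result.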