library("dslabs")
## Warning: package 'dslabs' was built under R version 4.2.3
data(package="dslabs")
data("trump_tweets")
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.0 ✔ readr 2.1.4
## ✔ forcats 1.0.0 ✔ stringr 1.5.0
## ✔ ggplot2 3.4.1 ✔ tibble 3.1.8
## ✔ lubridate 1.9.2 ✔ tidyr 1.3.0
## ✔ purrr 1.0.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(treemap)
library(ggplot2)
I filtered the data down to the tweets that were liked at least 250,000 times, which narrowed the list from 20,761 tweets to a more manageable 16. Then I created a treemap.
# Keep only the tweets with at least 250,000 likes
most_liked_trump_tweets <- trump_tweets %>%
  filter(favorite_count >= 250000)
# Box size shows likes; box color shows retweets
treemap(most_liked_trump_tweets, index = "text", vSize = "favorite_count",
        vColor = "retweet_count", type = "manual", palette = "Oranges",
        title = "Trump Most Liked Tweets", border.col = "orange",
        fontcolor.labels = "black", title.legend = "Number of Retweets")
For this homework, I wanted to do something funny, so I chose the Trump Tweets dataset. I started by installing the dslabs package and then loaded the trump_tweets dataset. I wasn’t surprised when the long list of tweets appeared. I wanted to make a treemap because I think it’s the best way to display the visual I envisioned. To do that, though, I knew I needed to narrow the more than 20 thousand tweets down to maybe the top 20. So I looked at the list and sorted the tweets from highest to lowest by number of favorites. Then I filtered the data to show only the tweets with 250,000 likes or more, which narrowed it down to 16 tweets. Of course, “Make America Great Again” was going to be up there.
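The sort-then-filter step described above can be sketched roughly as follows. This is only an illustration: `top_liked()` is a hypothetical helper name, not something from dslabs or from the original analysis, and the toy data frame stands in for `trump_tweets` so the snippet runs on its own.

```r
library(dplyr)

# Hypothetical helper (not part of dslabs): keep tweets with at least
# `min_likes` favorites and sort them from most liked to least liked.
top_liked <- function(tweets, min_likes = 250000) {
  tweets %>%
    filter(favorite_count >= min_likes) %>%
    arrange(desc(favorite_count))
}

# Toy stand-in for trump_tweets, using the same column names
toy_tweets <- data.frame(
  text = c("a", "b", "c"),
  favorite_count = c(100, 300000, 250000),
  retweet_count = c(5, 90000, 60000)
)

top_liked(toy_tweets)
```

With dslabs loaded, `top_liked(trump_tweets)` should give the same kind of shortlist that the treemap is built from.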
So I created the treemap using the number of likes (favorite_count) for the size of each box and the number of retweets (retweet_count) for its color. The bigger the box, the more liked the tweet! The treemap also suggests some correlation between how much a tweet was liked and how many times it was retweeted. #FraudNewsCNN and TODAY WE MAKE AMERICA GREAT AGAIN! got a lot of likes and retweets! How riveting… (insert sarcasm)
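If I wanted to put a number on that likes-versus-retweets relationship instead of eyeballing the colors, something like this would be a rough check. The helper name `likes_retweets_cor()` is made up for illustration, and using Spearman's rank correlation is my assumption, not part of the original analysis; with only 16 tweets the result would be a loose indication at best.

```r
# Hypothetical helper: rank correlation between likes and retweets.
# Spearman is an assumed choice here, since it only looks at ordering.
likes_retweets_cor <- function(tweets) {
  cor(tweets$favorite_count, tweets$retweet_count, method = "spearman")
}

# Toy stand-in data with the same column names as trump_tweets
toy_counts <- data.frame(
  favorite_count = c(10, 20, 30, 40),
  retweet_count = c(1, 3, 2, 5)
)

likes_retweets_cor(toy_counts)
```

On the real data this would be called as `likes_retweets_cor(most_liked_trump_tweets)`; values near 1 would back up what the treemap hints at.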
Oh, and if you’re wondering what https://t.co/WYUnHjjUjg in the #FraudNewsCNN tweet is, I took the liberty of finding out. It’s a GIF of Trump tackling someone whose face was replaced with the CNN logo.