Thanksgiving Pies

Thanksgiving is without a doubt the MOST gluttonous holiday of the year (though one may argue that Halloween is a close contender). And sure, the turkey takes center stage, accompanied by sides of mashed potatoes, stuffing, cranberry sauce, and so on. By the time dinner is finished, the top button of everyone’s pants is either on the verge of popping or already undone, leaving no room for dessert. However, to some, dessert is where Thanksgiving dinner truly shines, with its quintessential pies. Our analysis today takes a look at the pies in the race to become America’s most favorite Thanksgiving Pie (title pending).

Methodology

We pulled tweets that reference #Thanksgiving, requesting up to 1,000 per pie (the API returned 707 in total; see the Appendix), and focused our research on four pies in particular: Apple Pie, Blueberry Pie, Pumpkin Pie, and Pecan Pie.

Working from the tweets we were able to collect from the Twitter API, we began by measuring the initial response based on the sentiment of the tweets themselves. Although Blueberry Pie has a particularly high share of “positive” mentions, it is worth noting that the actual number of Blueberry Pie sentiment mentions was very low (coming in at 4, from just 2 tweets) - see Table 1.2 and Appendix Table 3.2. And just as people are joyous to see a Blueberry Pie come out on Thanksgiving Day, they are just as likely to be overcome with sadness at the very sight of it.
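To make the sentiment step concrete, here is a minimal sketch of how words get matched to sentiments. The mini-lexicon and the word list below are made up for illustration (they are not the real NRC lexicon or real tweets). The point is only that a single word can carry more than one sentiment label, which is how a couple of tweets can turn into several sentiment mentions.

```r
library(dplyr)

# Hypothetical mini-lexicon standing in for NRC: one row per word/sentiment pair
toy_lexicon <- tibble(
  word      = c("delicious", "delicious", "burnt", "excited"),
  sentiment = c("joy", "positive", "negative", "anticipation")
)

# Hypothetical words left over after tokenizing and removing stop words
toy_words <- tibble(word = c("pumpkin", "pie", "delicious", "excited", "burnt"))

# Keep only words found in the lexicon, then count mentions per sentiment
toy_sentiments <- toy_words %>%
  inner_join(toy_lexicon, by = "word") %>%
  count(sentiment) %>%
  mutate(share = n / sum(n))

print(toy_sentiments)
```

Here "pumpkin" and "pie" drop out because they are not in the toy lexicon, while "delicious" counts once under joy and once under positive.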

Findings

Pumpkin Pie had a strong showing in the total number of positive mentions, the lowest share of negative mentions among the pies with more than a handful of tweets, and the highest counts on the positive sentiments (including anticipation, joy, and trust). However, when we looked at the distribution of sentiment across all its tweets, we noticed that Pumpkin Pie fell behind Apple Pie in its share of positive mentions (33.7% versus 37.3%) and was essentially tied with Apple Pie on anticipation (13.2% versus 13.3%).
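The difference between counts and shares is easy to check by hand. The sketch below copies the positive and total counts for Apple and Pumpkin Pie from Appendix Tables 3.1 and 3.3; the data frame and column names are just illustrative.

```r
# Positive-mention counts and total sentiment mentions,
# copied from Appendix Tables 3.1 (apple) and 3.3 (pumpkin)
pie_counts <- data.frame(
  pie      = c("apple", "pumpkin"),
  positive = c(104, 462),
  total    = c(279, 1372)
)

# Share of each pie's sentiment mentions that are "positive", as a percentage
pie_counts$positive_share <- round(100 * pie_counts$positive / pie_counts$total, 1)

print(pie_counts)
```

Pumpkin Pie has more than four times as many positive mentions as Apple Pie, yet a smaller share of its mentions are positive.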

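#load the packages used throughout the analysis
library(dplyr)
library(stringr)
library(tidytext)
library(ggplot2)
library(knitr)
#create nrc to allow for sentiment analysis (get_sentiments() needs the textdata package for nrc)

nrc <- get_sentiments("nrc") %>%
  select(word, sentiment)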
#compile sentiment word analysis per pie data frame

reg <- "([^A-Za-z\\d#@']|'(?![A-Za-z\\d#@]))"

#drop quoted tweets, strip links and &amp;, split into words, remove stop words
tweet_words <- function(df) {
  df %>%
    filter(!str_detect(text, '^"')) %>%
    mutate(text = str_replace_all(text, "https://t.co/[A-Za-z\\d]+|&amp;", "")) %>%
    unnest_tokens(word, text, token = "regex", pattern = reg) %>%
    filter(!word %in% stop_words$word, str_detect(word, "[a-z]"))
}

apple_words <- tweet_words(apple_df)
blueberry_words <- tweet_words(blueberry_df)
pumpkin_words <- tweet_words(pumpkin_df)
pecan_words <- tweet_words(pecan_df)


apple_sentiments <- apple_words %>% inner_join(nrc, by = "word")
blueberry_sentiments <- blueberry_words %>% inner_join(nrc, by = "word")
pumpkin_sentiments <- pumpkin_words %>% inner_join(nrc, by = "word")
pecan_sentiments <- pecan_words %>% inner_join(nrc, by = "word")
#create an identifying pie column to bind into master data frame

apple_sentiments$pie <- "apple"
blueberry_sentiments$pie <- "blueberry"
pumpkin_sentiments$pie <- "pumpkin"
pecan_sentiments$pie <- "pecan"

master_sentiments <- rbind(apple_sentiments, blueberry_sentiments, pumpkin_sentiments, pecan_sentiments)
Table 1.1 - Number of Pie Mentions by Sentiment
#plot number of twitter mentions by pie by sentiment

thanksgiving_df <- master_sentiments %>% 
  group_by(pie, sentiment) %>% 
  summarize(n = n()) %>%
  mutate(frequency = n/sum(n))

ggplot(thanksgiving_df, aes(x = sentiment, y = n, fill = pie)) + 
  geom_bar(stat = "identity", position = "dodge") +
  xlab("Sentiment") +
  ylab("n") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))

Table 1.2 - Share of Pie Mentions by Sentiment
#plot share of twitter mentions by pie by sentiment

thanksgiving_df <- master_sentiments %>% 
  group_by(pie, sentiment) %>% 
  summarize(n = n()) %>%
  mutate(frequency = n/sum(n))

ggplot(thanksgiving_df, aes(x = sentiment, y = frequency, fill = pie)) + 
  geom_bar(stat = "identity", position = "dodge") +
  xlab("Sentiment") +
  ylab("Percent of tweets") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))

Positive Consumption Journey

From our initial analysis, it was clear that the top contenders were Apple Pie and Pumpkin Pie. Therefore, to settle the score, we mapped out a positive consumption journey (from the build-up, to the pie reveal, to post-consumption) as a way of more clearly defining a winner.

We began by observing the anticipation people felt at the very idea that either Apple or Pumpkin Pie would be presented at the table this year, followed by the surprise at its arrival, the joy or elation that their wishes were met, the trust that it was going to taste as good as they imagined, and the positive experience at the conclusion. To see that positive consumption journey visually, please refer to Table 1.3 below.

Table 1.3 - The Positive Consumption Journey of Apple and Pumpkin Pie by Share of Mentions
#plot consumption journey for Apple & Pumpkin Pies

thanksgiving_ap_df <- master_sentiments %>% 
  group_by(pie, sentiment) %>% 
  summarize(n = n()) %>%
  filter(pie %in% c("apple", "pumpkin")) %>%
  filter(sentiment %in% c("anticipation", "surprise", "joy", "trust", "positive")) %>%
  mutate(frequency = n/sum(n))

thanksgiving_ap_df$pos_sentiment <- factor(thanksgiving_ap_df$sentiment, levels = c("anticipation", "surprise", "joy", "trust", "positive"))
ggplot(thanksgiving_ap_df, aes(x = pos_sentiment, y = frequency, fill = pie)) + 
  geom_bar(stat = "identity", position = "dodge") +
  xlab("Positive Sentiment") +
  ylab("Percent of tweets") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))

Pie-thagorean Algorithm

However, to calculate the true winner more scientifically, we applied the Pie-thagorean algorithm (a derivative of the Pythagorean Ranking algorithm) to determine which pie is most likely to impress and satisfy crowds at the Thanksgiving table this year. The score is calculated as:

(# of twitter mentions)^0.23 * (# of affinity mentions)^2.34 / (# of total mentions)^2.34

“Affinity mentions” are the sum of the anticipation, joy, positive, surprise, and trust mentions. Note that the code below uses the affinity-mention count for the (# of twitter mentions) term as well. After running the calculations, the results are as follows …

Table 2.1 - Pie-thagorean Results
#calculate the pie-thagorean score for Apple & Pumpkin Pies

affinity <- c("anticipation", "joy", "positive", "surprise", "trust")

#per pie: N = affinity mentions, total = all sentiment mentions
#(a single grouped summarise avoids binding two frames that both carry a pie column)
pie_thagorean_df <- thanksgiving_df %>%
  filter(pie %in% c("apple", "pumpkin")) %>%
  group_by(pie) %>%
  summarise(N = sum(n[sentiment %in% affinity]), total = sum(n)) %>%
  mutate(loyalty_measure = round((N^0.23) * ((N^2.34) / (total^2.34)), digits = 1))
colnames(pie_thagorean_df) <- c("pie", "affinity sentiments", "total sentiments", "pie-thagorean score")

kable(pie_thagorean_df)
|pie     | affinity sentiments| total sentiments| pie-thagorean score|
|:-------|-------------------:|----------------:|-------------------:|
|apple   |                 228|              279|                 2.2|
|pumpkin |                1182|             1372|                 3.6|
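
As a quick sanity check, the scores in Table 2.1 can be reproduced directly from the affinity and total counts. This is a minimal sketch that follows the calculation as the analysis code performs it (affinity count raised to 0.23, times the affinity-to-total ratio raised to 2.34); the function name `pie_thagorean_score` is made up for this example.

```r
# Pie-thagorean score as computed in the analysis code:
# affinity^0.23 * (affinity^2.34 / total^2.34)
pie_thagorean_score <- function(affinity, total) {
  (affinity^0.23) * ((affinity^2.34) / (total^2.34))
}

# Affinity and total sentiment counts taken from Table 2.1
scores <- c(
  apple   = pie_thagorean_score(228, 279),
  pumpkin = pie_thagorean_score(1182, 1372)
)

round(scores, 1)
#   apple pumpkin
#     2.2     3.6
```

Because the ratio term is similar for both pies (roughly 0.82 for Apple and 0.86 for Pumpkin), most of the gap in the final score comes from Pumpkin Pie's much larger volume of affinity mentions.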

Conclusion

There’s no denying that Pumpkin is Pumpking. With its significantly larger volume of mentions during this time period, and with scientific calculations affirming the strong affinity people have for its presence, Pumpkin Pie has been crowned America’s most favorite Thanksgiving Pie for 2017.

Appendix

Apple Sentiments

Table 3.1 - Raw Ranking of Apple Pie Mentions by Sentiment
apple_app <- apple_sentiments %>% group_by(sentiment) %>% summarize(n = n())
apple_app <- apple_app %>%
  mutate(pct_total = (round((n/sum(n)*100), digits=1)))
kable(apple_app)
|sentiment    |   n| pct_total|
|:------------|---:|---------:|
|anger        |  14|       5.0|
|anticipation |  37|      13.3|
|disgust      |   2|       0.7|
|fear         |   3|       1.1|
|joy          |  51|      18.3|
|negative     |  24|       8.6|
|positive     | 104|      37.3|
|sadness      |   8|       2.9|
|surprise     |  11|       3.9|
|trust        |  25|       9.0|

Blueberry Sentiments

Table 3.2 - Raw Ranking of Blueberry Pie Mentions by Sentiment
blueberry_app <- blueberry_sentiments %>% group_by(sentiment) %>% summarize(n = n())
blueberry_app <- blueberry_app %>%
  mutate(pct_total = (round((n/sum(n)*100), digits=1)))
kable(blueberry_app)
|sentiment |  n| pct_total|
|:---------|--:|---------:|
|joy       |  1|        25|
|positive  |  2|        50|
|sadness   |  1|        25|

Pumpkin Sentiments

Table 3.3 - Raw Ranking of Pumpkin Pie Mentions by Sentiment
pumpkin_app <- pumpkin_sentiments %>% group_by(sentiment) %>% summarize(n = n())
pumpkin_app <- pumpkin_app %>%
  mutate(pct_total = (round((n/sum(n)*100), digits=1)))
kable(pumpkin_app)
|sentiment    |   n| pct_total|
|:------------|---:|---------:|
|anger        |  25|       1.8|
|anticipation | 181|      13.2|
|disgust      |  17|       1.2|
|fear         |  15|       1.1|
|joy          | 303|      22.1|
|negative     |  97|       7.1|
|positive     | 462|      33.7|
|sadness      |  36|       2.6|
|surprise     |  51|       3.7|
|trust        | 185|      13.5|

Pecan Sentiments

Table 3.4 - Raw Ranking of Pecan Pie Mentions by Sentiment
pecan_app <- pecan_sentiments %>% group_by(sentiment) %>% summarize(n = n())
pecan_app <- pecan_app %>%
  mutate(pct_total = (round((n/sum(n)*100), digits=1)))
kable(pecan_app)
|sentiment    |   n| pct_total|
|:------------|---:|---------:|
|anger        |   7|       1.6|
|anticipation |  77|      17.6|
|disgust      |   7|       1.6|
|fear         |  12|       2.7|
|joy          |  71|      16.2|
|negative     |  55|      12.6|
|positive     | 121|      27.6|
|sadness      |  14|       3.2|
|surprise     |  15|       3.4|
|trust        |  59|      13.5|

Twitter #Thanksgiving Pie Calls

library(twitteR)
#assumes the session has already been authenticated with setup_twitter_oauth()

num_tweets <- 1000

apple <- searchTwitter('#thanksgiving#applepie', n = num_tweets)
## Warning in doRppAPICall("search/tweets", n, params = params,
## retryOnRateLimit = retryOnRateLimit, : 1000 tweets were requested but the
## API can only return 113
blueberry <- searchTwitter('#thanksgiving#blueberrypie', n = num_tweets)
## Warning in doRppAPICall("search/tweets", n, params = params,
## retryOnRateLimit = retryOnRateLimit, : 1000 tweets were requested but the
## API can only return 2
pumpkin <- searchTwitter('#thanksgiving#pumpkinpie', n = num_tweets)
## Warning in doRppAPICall("search/tweets", n, params = params,
## retryOnRateLimit = retryOnRateLimit, : 1000 tweets were requested but the
## API can only return 468
pecan <- searchTwitter('#thanksgiving#pecanpie', n = num_tweets)
## Warning in doRppAPICall("search/tweets", n, params = params,
## retryOnRateLimit = retryOnRateLimit, : 1000 tweets were requested but the
## API can only return 124
apple_df <- twListToDF(apple)
blueberry_df <- twListToDF(blueberry)
pumpkin_df <- twListToDF(pumpkin)
pecan_df <- twListToDF(pecan)
