Thanksgiving Pies

Thanksgiving is without a doubt the MOST gluttonous holiday of the year (though one may argue that Halloween is a close contender). And sure, the turkey takes center stage with the accompanying sides of mashed potatoes, stuffing, cranberry sauce, etc. By the time dinner is finished, the top button of everyone’s pants is either on the verge of popping or already undone, leaving no room for dessert. However, to some, dessert is where Thanksgiving dinner truly shines with its quintessential pies. Our analysis today takes a look at the pies in the race to become America’s favorite Thanksgiving Pie (title pending).

Methodology

We requested up to 1,000 tweets per pie that reference #Thanksgiving (the API returned 707 in total; see the Appendix). Our research focused on four pies in particular: Apple Pie, Blueberry Pie, Pumpkin Pie, and Pecan Pie.

From the tweets we were able to collect from the Twitter API, we began by measuring initial response based on the sentiment of the tweets themselves. Although Blueberry Pie has a particularly high share of “positive” mentions, it is worth noting that the actual number of Blueberry Pie sentiment mentions was very low (coming in at 4) - see Appendix Table 3.2. And just as joyous as people are to see a Blueberry Pie come out on Thanksgiving Day, they are just as likely to be overcome with sadness at the very sight of it.
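To make the count-versus-share distinction concrete, here is a small base R sketch that recomputes the positive shares for Blueberry and Pumpkin Pie from the counts in Appendix Tables 3.2 and 3.3. The helper name share_of and the hard-coded vectors are ours, for illustration only; the analysis itself derives these figures from the tweet data.

```r
# Sentiment counts copied from Appendix Tables 3.2 and 3.3
blueberry_n <- c(joy = 1, positive = 2, sadness = 1)
pumpkin_n <- c(anger = 25, anticipation = 181, disgust = 17, fear = 15,
               joy = 303, negative = 97, positive = 462, sadness = 36,
               surprise = 51, trust = 185)

# Hypothetical helper: percent share of each sentiment within one pie
share_of <- function(n) round(n / sum(n) * 100, digits = 1)

blueberry_share <- share_of(blueberry_n)
pumpkin_share <- share_of(pumpkin_n)

sum(blueberry_n)             # 4 sentiment mentions in total
sum(pumpkin_n)               # 1372 sentiment mentions in total
blueberry_share["positive"]  # 50
pumpkin_share["positive"]    # 33.7
```

A 50% share built on 4 mentions can swing wildly with a single extra tweet, which is why Blueberry Pie's high share should not be read as a lead.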

Findings

Pumpkin Pie had a strong showing in the total number of positive mentions, the lowest share of negative response among the pies with meaningful volume, and the highest counts on the positive sentiments (including anticipation, joy, and trust). However, as we looked at the dispersion of sentiment across all its tweets, we noticed that Pumpkin Pie fell behind Apple Pie in its share of positive mentions (33.7% versus 37.3%) and was all but tied with it on anticipation (13.2% versus 13.3%).

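#load the packages the analysis relies on
library(tidyverse)
library(tidytext)
library(knitr)

#create nrc to allow for sentiment analysis
#(get_sentiments() supersedes filtering sentiments by lexicon; it needs the textdata package)
nrc <- get_sentiments("nrc")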

#compile sentiment word analysis per pie data frame

reg <- "([^A-Za-z\\d#@']|'(?![A-Za-z\\d#@]))"

apple_words <- apple_df %>%
  filter(!str_detect(text, '^"')) %>%
  mutate(text = str_replace_all(text, "https://t.co/[A-Za-z\\d]+|&amp;", "")) %>%
  unnest_tokens(word, text, token = "regex", pattern = reg) %>%
  filter(!word %in% stop_words$word, str_detect(word, "[a-z]"))

blueberry_words <- blueberry_df %>%
  filter(!str_detect(text, '^"')) %>%
  mutate(text = str_replace_all(text, "https://t.co/[A-Za-z\\d]+|&amp;", "")) %>%
  unnest_tokens(word, text, token = "regex", pattern = reg) %>%
  filter(!word %in% stop_words$word, str_detect(word, "[a-z]"))

pumpkin_words <- pumpkin_df %>%
  filter(!str_detect(text, '^"')) %>%
  mutate(text = str_replace_all(text, "https://t.co/[A-Za-z\\d]+|&amp;", "")) %>%
  unnest_tokens(word, text, token = "regex", pattern = reg) %>%
  filter(!word %in% stop_words$word, str_detect(word, "[a-z]"))

pecan_words <- pecan_df %>%
  filter(!str_detect(text, '^"')) %>%
  mutate(text = str_replace_all(text, "https://t.co/[A-Za-z\\d]+|&amp;", "")) %>%
  unnest_tokens(word, text, token = "regex", pattern = reg) %>%
  filter(!word %in% stop_words$word, str_detect(word, "[a-z]"))
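
To see what the tokenizing pattern reg does, here is a minimal base R sketch that applies it with strsplit() to a made-up tweet. The sample text and the tokens variable are illustrative only, and strsplit() plus tolower() stands in for unnest_tokens(), which lower-cases text by default.

```r
# Same pattern as reg above: split on anything that is not a letter, digit,
# #, @ or apostrophe, and on apostrophes not followed by one of those
reg <- "([^A-Za-z\\d#@']|'(?![A-Za-z\\d#@]))"

# Made-up tweet for illustration
tweet <- "Can't wait for #PumpkinPie at dinner!"

# perl = TRUE is needed for the lookahead (?!...)
tokens <- strsplit(tolower(tweet), reg, perl = TRUE)[[1]]
tokens <- tokens[nzchar(tokens)]  # drop empty strings left by adjacent separators
tokens
# "can't" "wait" "for" "#pumpkinpie" "at" "dinner"
```

The pattern keeps contractions and hashtags intact as single tokens, so a hashtag such as #pumpkinpie is not broken into separate words before the stop-word filter runs.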


apple_sentiments <- apple_words %>% inner_join(nrc, by = "word")
blueberry_sentiments <- blueberry_words %>% inner_join(nrc, by = "word")
pumpkin_sentiments <- pumpkin_words %>% inner_join(nrc, by = "word")
pecan_sentiments <- pecan_words %>% inner_join(nrc, by = "word")

#create an identifying pie column to bind into master data frame

apple_sentiments$pie <- "apple"
blueberry_sentiments$pie <- "blueberry"
pumpkin_sentiments$pie <- "pumpkin"
pecan_sentiments$pie <- "pecan"

master_sentiments <- rbind(apple_sentiments, blueberry_sentiments, pumpkin_sentiments, pecan_sentiments)

Table 1.1 - Number of Pie Mentions by Sentiment

#plot number of twitter mentions by pie by sentiment

thanksgiving_df <- master_sentiments %>% 
  group_by(pie, sentiment) %>% 
  summarize(n = n()) %>%
  mutate(frequency = n/sum(n))

ggplot(thanksgiving_df, aes(x = sentiment, y = n, fill = pie)) + 
  geom_bar(stat = "identity", position = "dodge") +
  xlab("Sentiment") +
  ylab("n") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))

Table 1.2 - Share of Pie Mentions by Sentiment

#plot share of twitter mentions by pie by sentiment

thanksgiving_df <- master_sentiments %>% 
  group_by(pie, sentiment) %>% 
  summarize(n = n()) %>%
  mutate(frequency = n/sum(n))

ggplot(thanksgiving_df, aes(x = sentiment, y = frequency, fill = pie)) + 
  geom_bar(stat = "identity", position = "dodge") +
  xlab("Sentiment") +
  ylab("Percent of tweets") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))

Positive Consumption Journey

From our initial analysis it was clear that the top contenders were Apple and Pumpkin Pie. Therefore, to settle the score, we mapped out a positive consumption journey (from the build-up to the pie reveal to post-consumption) as a way to define a winner more clearly.

We began by observing the anticipation people felt at the very idea that either Apple or Pumpkin Pie would be presented at the table this year, followed by the surprise at the reveal, the joy or elation that their wishes were met, the trust that it was going to taste as good as they imagined, and the positive experience at the conclusion. To see that positive consumption journey visually, please refer below to Table 1.3.

Table 1.3 - The Positive Consumption Journey of Apple and Pumpkin Pie by Share of Mentions

#plot consumption journey for Apple & Pumpkin Pies

thanksgiving_ap_df <- master_sentiments %>% 
  group_by(pie, sentiment) %>% 
  summarize(n = n()) %>%
  filter(pie == "apple" | pie == "pumpkin") %>%
  filter(sentiment == "anticipation" | sentiment == "surprise" | sentiment == "joy" | sentiment == "trust" | sentiment == "positive") %>%
  mutate(frequency = n/sum(n))

thanksgiving_ap_df$pos_sentiment <- factor(thanksgiving_ap_df$sentiment, levels = c("anticipation", "surprise", "joy", "trust", "positive"))
ggplot(thanksgiving_ap_df, aes(x = pos_sentiment, y = frequency, fill = pie)) + 
  geom_bar(stat = "identity", position = "dodge") +
  xlab("Positive Sentiment") +
  ylab("Percent of tweets") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1))
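
The shares plotted in Table 1.3 can also be rebuilt by hand. The sketch below copies the five journey counts from Appendix Tables 3.1 and 3.3 and divides each by its pie's total across those five sentiments, which is what the mutate() step above does after filtering. The object names journey, apple_n, pumpkin_n, and journey_share are ours, for illustration only.

```r
# Positive-journey counts copied from Appendix Tables 3.1 (apple) and 3.3 (pumpkin)
journey <- c("anticipation", "surprise", "joy", "trust", "positive")
apple_n <- c(anticipation = 37, surprise = 11, joy = 51, trust = 25, positive = 104)
pumpkin_n <- c(anticipation = 181, surprise = 51, joy = 303, trust = 185, positive = 462)

# Share of each stage among the five positive sentiments, per pie
journey_share <- rbind(
  apple = apple_n[journey] / sum(apple_n),
  pumpkin = pumpkin_n[journey] / sum(pumpkin_n)
)

# Shares as percentages, one row per pie
round(journey_share * 100, digits = 1)
```

On these counts Apple Pie holds the larger share at the anticipation, surprise, and positive stages, while Pumpkin Pie holds the larger share on joy and trust.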

Pie-thagorean Algorithm

However, to calculate the true winner more scientifically, we applied the Pie-thagorean algorithm (a derivative of the Pythagorean Ranking algorithm) to determine which pie is most likely to impress and satisfy crowds at the Thanksgiving table this year. The score is calculated as:

(# of twitter mentions)^0.23 * (# of affinity mentions)^2.34 / (# of total mentions)^2.34

Here, “affinity mentions” is the sum of the anticipation, joy, positive, surprise, and trust counts. Note that the code behind Table 2.1 also uses the affinity count for the first term (# of twitter mentions). After running the calculations, the results are as follows …

Table 2.1 - Pie-thagorean Results

#calculate the pie-thagorean score for Apple & Pumpkin Pies

#affinity sentiments counted toward the score
affinity <- c("anticipation", "joy", "positive", "surprise", "trust")

#sum affinity and total sentiment counts per pie, then score each pie
#(a single summarise avoids the duplicate pie column that bind_cols() would create)
pie_thagorean_df <- thanksgiving_df %>%
  filter(pie == "apple" | pie == "pumpkin") %>%
  group_by(pie) %>%
  summarise(N = sum(n[sentiment %in% affinity]), total = sum(n)) %>%
  mutate(loyalty_measure = (round((N^0.23)*((N^2.34)/(total^2.34)), digits=1)))
colnames(pie_thagorean_df) <- c("pie", "affinity sentiments", "total sentiments", "pie-thagorean score")

kable(pie_thagorean_df)

pie	affinity sentiments	total sentiments	pie-thagorean score
apple	228	279	2.2
pumpkin	1182	1372	3.6
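
As a check on Table 2.1, the scores can be recomputed directly from the two counts in each row. The function name pie_thagorean is ours, for illustration; it mirrors the expression in the code above, which uses the affinity count for the first term.

```r
# Hypothetical helper mirroring the expression in the analysis code:
# the first term uses the affinity count, as that code does
pie_thagorean <- function(affinity, total) {
  (affinity^0.23) * (affinity^2.34) / (total^2.34)
}

# Counts copied from Table 2.1
apple_score <- pie_thagorean(affinity = 228, total = 279)
pumpkin_score <- pie_thagorean(affinity = 1182, total = 1372)

round(c(apple = apple_score, pumpkin = pumpkin_score), digits = 1)
# apple 2.2, pumpkin 3.6, matching Table 2.1
```

Because the volume term is raised to only 0.23, it grows slowly: Pumpkin Pie's roughly fivefold lead in affinity mentions lifts that term from about 3.5 to about 5.1. The rest of the gap comes from the affinity ratios (about 0.82 for apple and 0.86 for pumpkin) raised to 2.34.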

Conclusion

There’s no denying that Pumpkin is Pumpking. With a significantly larger volume of mentions during this time period, and with scientific calculations affirming the strong affinity people have for its presence, Pumpkin Pie has been crowned America’s favorite Thanksgiving Pie for 2017.

Appendix

Apple Sentiments

Table 3.1 - Raw Ranking of Apple Pie Mentions by Sentiment

apple_app <- apple_sentiments %>% group_by(sentiment) %>% summarize(n = n())
apple_app <- apple_app %>%
  mutate(pct_total = (round((n/sum(n)*100), digits=1)))
kable(apple_app)

sentiment	n	pct_total
anger	14	5.0
anticipation	37	13.3
disgust	2	0.7
fear	3	1.1
joy	51	18.3
negative	24	8.6
positive	104	37.3
sadness	8	2.9
surprise	11	3.9
trust	25	9.0

Blueberry Sentiments

Table 3.2 - Raw Ranking of Blueberry Pie Mentions by Sentiment

blueberry_app <- blueberry_sentiments %>% group_by(sentiment) %>% summarize(n = n())
blueberry_app <- blueberry_app %>%
  mutate(pct_total = (round((n/sum(n)*100), digits=1)))
kable(blueberry_app)

sentiment	n	pct_total
joy	1	25
positive	2	50
sadness	1	25

Pumpkin Sentiments

Table 3.3 - Raw Ranking of Pumpkin Pie Mentions by Sentiment

pumpkin_app <- pumpkin_sentiments %>% group_by(sentiment) %>% summarize(n = n())
pumpkin_app <- pumpkin_app %>%
  mutate(pct_total = (round((n/sum(n)*100), digits=1)))
kable(pumpkin_app)

sentiment	n	pct_total
anger	25	1.8
anticipation	181	13.2
disgust	17	1.2
fear	15	1.1
joy	303	22.1
negative	97	7.1
positive	462	33.7
sadness	36	2.6
surprise	51	3.7
trust	185	13.5

Pecan Sentiments

Table 3.4 - Raw Ranking of Pecan Pie Mentions by Sentiment

pecan_app <- pecan_sentiments %>% group_by(sentiment) %>% summarize(n = n())
pecan_app <- pecan_app %>%
  mutate(pct_total = (round((n/sum(n)*100), digits=1)))
kable(pecan_app)

sentiment	n	pct_total
anger	7	1.6
anticipation	77	17.6
disgust	7	1.6
fear	12	2.7
joy	71	16.2
negative	55	12.6
positive	121	27.6
sadness	14	3.2
surprise	15	3.4
trust	59	13.5

Twitter #Thanksgiving Pie Calls

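#twitteR provides searchTwitter() and twListToDF();
#this assumes setup_twitter_oauth() has already been called with valid credentials
library(twitteR)

num_tweets <- 1000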

apple <- searchTwitter('#thanksgiving#applepie', n = num_tweets)

## Warning in doRppAPICall("search/tweets", n, params = params,
## retryOnRateLimit = retryOnRateLimit, : 1000 tweets were requested but the
## API can only return 113

blueberry <- searchTwitter('#thanksgiving#blueberrypie', n = num_tweets)

## Warning in doRppAPICall("search/tweets", n, params = params,
## retryOnRateLimit = retryOnRateLimit, : 1000 tweets were requested but the
## API can only return 2

pumpkin <- searchTwitter('#thanksgiving#pumpkinpie', n = num_tweets)

## Warning in doRppAPICall("search/tweets", n, params = params,
## retryOnRateLimit = retryOnRateLimit, : 1000 tweets were requested but the
## API can only return 468

pecan <- searchTwitter('#thanksgiving#pecanpie', n = num_tweets)

## Warning in doRppAPICall("search/tweets", n, params = params,
## retryOnRateLimit = retryOnRateLimit, : 1000 tweets were requested but the
## API can only return 124

apple_df <- twListToDF(apple)
blueberry_df <- twListToDF(blueberry)
pumpkin_df <- twListToDF(pumpkin)
pecan_df <- twListToDF(pecan)


Sweet as Pies

Pete Wiernusz

11/19/2017