Advanced Data Analytics in DM

Social media campaign analysis

In this report we will be looking at the Social Media Campaign Data Set.

library(tidyverse)
library(kableExtra)
social <- read_csv("social_media_campaign_data.csv")

This bar chart compares the campaigns by their total engagements.

eng_by_campaign <- social %>% 
  group_by(Campaign) %>%
  summarise(total_engagements = sum(Engagements, na.rm = TRUE))

ggplot(eng_by_campaign, aes(x = reorder(Campaign, total_engagements), ##highest to lowest##
                            y = total_engagements)) +
  geom_col(fill = "mediumpurple4") +
  geom_text(aes(label = scales::comma(total_engagements)),
            hjust = +1.1, ##right-aligns the label so it sits just inside the end of the bar
            color = "white",
            fontface = "bold") +
  coord_flip() + ##makes it horizontal##
  labs(title = "Total Engagements by Campaign",
       x = "Campaign name",
       y = "Total Engagements")

The earlier campaigns have higher overall engagement than the later ones, which is to be expected. Holiday 2025 outperformed the other 2025 campaigns, so whatever worked in that campaign is worth identifying and repeating.

This bar chart shows which platforms bring the most engagement.

social %>%
  group_by(Platform) %>%
  summarise(total_engagements = sum(Engagements, na.rm = TRUE)) %>%
  ggplot(aes(x = reorder(Platform, total_engagements), y = total_engagements)) +
  geom_col(fill = "plum") +
  geom_text(aes(label = scales::comma(total_engagements)),
            hjust = +1.1, ##right-aligns the label so it sits just inside the end of the bar
            color = "white",
            fontface = "bold") +
  coord_flip() +
  labs(title = "Total Engagements by Platform", x = "Platform", y = "Total Engagements")

Engagement is highest on X. Instagram, LinkedIn and TikTok have similar totals, which suggests that the company performs fairly consistently across platforms, with both management and consumers showing an overall preference for X.

This line graph shows when engagement is highest during the day and therefore when the company should post.

social %>%
  group_by(`Post Hour`) %>%
  summarise(avg_engagements = mean(Engagements, na.rm = TRUE)) %>%
  ggplot(aes(x = `Post Hour`, y = avg_engagements)) +
  geom_area(fill = "thistle", alpha = 0.7) + ##transparency
  geom_line(color = "orchid") +
  labs(title = "Average Engagements by Hour of Posting",
       x = "Hour of Day",
       y = "Average Engagements")

Consumers engage with the content most between 11am and 12pm, then around 8pm, and again at 11pm. The company should post at those times to reach as many consumers as possible. Early morning, at around 6am, is also a good time to post.

This table shows how efficient the company’s marketing efforts are in terms of spend and the return it generates.

roi_table <- social %>%
  group_by(Campaign) %>%
  summarise(
    total_spend = sum(Spend, na.rm = TRUE),
    conversion_value = sum(`Conversion Value`, na.rm = TRUE),
    ROI = round(conversion_value / total_spend, 2)
  ) %>%
  arrange(desc(ROI))
  
knitr::kable(roi_table, 
             digits = 2,
             caption = "Return on Investment (ROI) by Campaign",
             col.names = c("Campaign", "Total Spend (€)", "Conversion Value (€)", "ROI"),
             align = c("l", "r", "r", "r")) %>%
  kable_styling(full_width = FALSE) %>%
  row_spec(0, bold = TRUE, color = "mediumpurple4", background = "thistle3") %>%
  row_spec(1:10, background = "lavender") %>%
  column_spec(1, bold = TRUE, color = "mediumpurple4")

Return on Investment (ROI) by Campaign

| Campaign            | Total Spend (€) | Conversion Value (€) |   ROI |
|:--------------------|----------------:|---------------------:|------:|
| Back to School 2024 |         3701.70 |             248982.6 | 67.26 |
| Holiday 2025        |         3025.48 |             193330.1 | 63.90 |
| Spring 2025         |         2419.43 |             150261.5 | 62.11 |
| Fall 2025           |         2782.16 |             169007.7 | 60.75 |
| Summer 2024         |         3447.93 |             203331.7 | 58.97 |
| Holiday 2024        |         4755.51 |             251464.9 | 52.88 |
| Fall 2024           |         4110.95 |             215409.4 | 52.40 |
| Summer 2025         |         3233.85 |             149250.8 | 46.15 |
| Back to School 2025 |         3066.13 |             133503.5 | 43.54 |
| Spring 2024         |         3476.50 |             147737.7 | 42.50 |

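Total conversion value tends to rise with spend: the three highest-spending campaigns (Holiday 2024, Fall 2024 and Back to School 2024) also have the three highest conversion values. ROI, calculated here as conversion value per euro spent, does not follow the same pattern. Holiday 2024 has the largest spend but only the sixth-highest ROI (52.88), while Spring 2025 has the smallest spend and the third-highest ROI (62.11). Every campaign returned many times its spend, which suggests the marketing budget has been used sensibly. The best investment so far has been the Back to School 2024 campaign, with an ROI of 67.26.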
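One way to check how spend relates to return is to recompute the ratio from the figures in the ROI table and look at rank correlations. The sketch below uses base R only and copies the rounded table values by hand; the names `roi_check`, `return_ratio` and `net_roi` are made up for this example, and the "net" formula is the common textbook definition of ROI rather than the one used in the report.

```r
# Spend and conversion value copied from the ROI table (rounded values).
roi_check <- data.frame(
  campaign = c("Back to School 2024", "Holiday 2025", "Spring 2025",
               "Fall 2025", "Summer 2024", "Holiday 2024", "Fall 2024",
               "Summer 2025", "Back to School 2025", "Spring 2024"),
  spend = c(3701.70, 3025.48, 2419.43, 2782.16, 3447.93,
            4755.51, 4110.95, 3233.85, 3066.13, 3476.50),
  value = c(248982.6, 193330.1, 150261.5, 169007.7, 203331.7,
            251464.9, 215409.4, 149250.8, 133503.5, 147737.7)
)

# The report's "ROI": conversion value per euro spent.
roi_check$return_ratio <- round(roi_check$value / roi_check$spend, 2)

# Assumed alternative: net ROI, i.e. (value - spend) / spend.
roi_check$net_roi <- round((roi_check$value - roi_check$spend) / roi_check$spend, 2)

# Spearman rank correlations: does more spend go with more value, or more ROI?
rho_value <- cor(roi_check$spend, roi_check$value, method = "spearman")
rho_ratio <- cor(roi_check$spend, roi_check$return_ratio, method = "spearman")

roi_check
c(spend_vs_value = rho_value, spend_vs_ratio = rho_ratio)
```

On these ten rows the correlation with conversion value comes out positive and the correlation with the ratio slightly negative. With so few campaigns this is only a rough indication, not a statistical test.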

This heatmap shows average engagement for each combination of platform and content type.

sum_ct_pf <- social %>%
  group_by(Platform, `Post Type`) %>%
  summarise(
    avg_engagements = mean(Engagements, na.rm = TRUE),
    avg_emojis     = mean(`Emoji Count`, na.rm = TRUE),
    avg_hashtags   = mean(`Hashtag Count`, na.rm = TRUE),
    avg_words      = mean(`Word Count`, na.rm = TRUE),
    .groups = "drop") ##to give output without grouping

ggplot(sum_ct_pf, aes(x = Platform, y = `Post Type`, fill = avg_engagements)) +
  geom_tile() +
  geom_text(aes(label = paste0(":) ", round(avg_emojis, 1),
                               "\n# ", round(avg_hashtags, 1),
                               "\nW ", round(avg_words, 0))),
            size = 3) +
  labs(title = "Performance by Content Type × Platform",
       subtitle = "Fill: Average Engagements; Text: Avg Emojis / Hashtags / Words",
       x = "Platform", y = "Content Type", fill = "Avg Eng") +
  scale_fill_gradient(low = "lavender", high = "orchid")

Engagement appears to be highest for carousels and stories on X, stories on TikTok, reels on LinkedIn, and video and story formats on Instagram. The text in each tile shows the average emoji, hashtag and word counts of the posts, which can inform future content creation. These numbers are similar across all platforms and content types and do not seem to explain the differences in engagement. Video content is, surprisingly, not performing as well as would be expected from a format that is so popular across social media. Story posts perform well on all platforms and are a reliable choice for consistency.

This box plot shows how much is typically spent on each post type, along with the outliers.

ggplot(social, aes(x = `Post Type`, y = Spend)) +
  geom_boxplot(fill = "plum3", ##same fill for every post type, so no legend is needed
               outlier.color = "orchid",
               outlier.alpha = 0.5) + ##sets the colour and transparency of the outliers
  scale_y_continuous(labels = scales::dollar) + ##adds $ signs
  labs(
    title = "Spend Distribution by Post Type",
    x = "Post Type",
    y = "Spend (USD)")

The median spend is generally under $50. Most posts receive similar, modest budgets, with several higher-spend outliers in each category. The Video and Story outliers suggest that these are the post types that most often require extra spending, while Image and Carousel posts are more consistent. Overall, spending is consistent and low-cost, with the extra budget going to video-based content, presumably because of its engagement potential and strategic importance.
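The outliers in a box plot are not chosen by eye. By default, `geom_boxplot()` draws as separate points any values lying more than 1.5 times the interquartile range beyond the quartiles. Below is a minimal base-R sketch of that rule; the `toy_spend` vector is made up for illustration and is not taken from the data set.

```r
# Illustrative values only, NOT from the campaign data set.
toy_spend <- c(10, 20, 25, 30, 35, 40, 45, 50, 120, 200)

# Quartiles and interquartile range.
q1  <- unname(quantile(toy_spend, 0.25))
q3  <- unname(quantile(toy_spend, 0.75))
iqr <- q3 - q1

# Fences: anything outside them is plotted as an outlier point.
lower_fence <- q1 - 1.5 * iqr
upper_fence <- q3 + 1.5 * iqr

outliers <- toy_spend[toy_spend < lower_fence | toy_spend > upper_fence]
outliers
```

Applied to the real `Spend` column within each post type, the same rule would list the posts behind the outlier points.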

This scatter plot shows the relationship between spending and conversions.

ggplot(social, aes(x = Spend, y = Conversions, color = Platform)) +
  geom_point(alpha = 0.7) +
  geom_smooth(method = "lm", se = FALSE) + ##adds a straight (linear) trend line per platform, without the grey confidence band
  scale_x_continuous(labels = scales::dollar) + ##x-axis in $##
  scale_color_manual(values = c(
    "Instagram" = "plum", 
    "LinkedIn"= "mediumslateblue",
    "TikTok" = "maroon2",
    "X" = "mediumpurple4")) +
  labs(title = "Relationship Between Spend and Conversions")

This graph shows a positive relationship between spending and the conversions it generates: the more that is spent, the more conversions follow. Spending appears to be most efficient on X, which also has the highest engagement. LinkedIn is also doing well, but its results vary much more. Instagram and TikTok receive the least spend in this plot, sometimes none at all, and their conversions are correspondingly low.
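The trend lines can also be read as numbers: the slope of each line is the number of extra conversions associated with one extra dollar of spend. The sketch below fits a separate linear model per platform in base R. The `toy` data frame is invented purely for illustration (only the column names are reused from the data set), so the slopes say nothing about the real campaigns.

```r
# Made-up illustrative data, NOT taken from the campaign data set.
toy <- data.frame(
  Platform    = rep(c("X", "TikTok"), each = 4),
  Spend       = c(10, 20, 30, 40, 10, 20, 30, 40),
  Conversions = c(9, 17, 25, 33, 6, 10, 14, 18)
)

# Fit Conversions ~ Spend separately for each platform and keep the slope.
slopes <- sapply(split(toy, toy$Platform), function(d) {
  unname(coef(lm(Conversions ~ Spend, data = d))["Spend"])
})

slopes
```

Replacing `toy` with `social` would give one slope per platform, which is a more direct way of comparing spending efficiency than judging the steepness of the lines by eye.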

This table compares the platforms on average spend, impressions, engagements, clicks, conversions and CTR.

platform_summary <- social %>%
  group_by(Platform) %>%
  summarise(
    avg_spend = mean(Spend, na.rm = TRUE),
    avg_impressions = mean(Impressions, na.rm = TRUE),
    avg_engagements = mean(Engagements, na.rm = TRUE),
    avg_clicks = mean(Clicks, na.rm = TRUE),
    avg_conversions = mean(Conversions, na.rm = TRUE),
    avg_CTR = mean(CTR, na.rm = TRUE)
  ) %>%
  arrange(desc(avg_spend))

knitr::kable(platform_summary,
             digits = 2,
             caption = "Average Performance Metrics by Platform",
             col.names = c("Platform", "avg Spend ($)", "avg Impressions", "avg Engagements", "avg Clicks", "avg Conversions", "avg CTR"),
             align = c("l", "r", "r", "r", "r", "r", "r")) %>%
  kable_styling(full_width = FALSE) %>%
  row_spec(0, bold = TRUE, color = "thistle", background = "mediumpurple4") %>%
  row_spec(1:4, background = "lavender") %>%
  column_spec(1, bold = TRUE, color = "mediumpurple4")
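
Average Performance Metrics by Platform

| Platform  | avg Spend ($) | avg Impressions | avg Engagements | avg Clicks | avg Conversions | avg CTR |
|:----------|--------------:|----------------:|----------------:|-----------:|----------------:|--------:|
| X         |         39.09 |         7837.96 |          429.11 |     214.56 |           27.45 |    0.03 |
| TikTok    |         37.28 |         7442.52 |          390.21 |     192.83 |           23.31 |    0.03 |
| LinkedIn  |         34.11 |         7703.42 |          410.94 |     205.36 |           25.30 |    0.03 |
| Instagram |         33.44 |         7479.50 |          392.00 |     188.19 |           24.21 |    0.03 |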

X again leads in every category except CTR, which is 0.03 on all four platforms. LinkedIn is close behind, whereas TikTok receives more spend than LinkedIn but generates less. This suggests either shifting investment towards LinkedIn or investing in TikTok more effectively.
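A rough way to put a number on "spent on more and generating less" is to divide average spend by average conversions for each platform. The sketch below uses base R and the rounded averages from the platform table; `spend_per_conversion` is a name made up for this example, and a ratio of two averages is only an approximation of the true per-post cost per conversion.

```r
# Averages copied from the platform table (rounded values).
platforms <- data.frame(
  platform        = c("X", "TikTok", "LinkedIn", "Instagram"),
  avg_spend       = c(39.09, 37.28, 34.11, 33.44),
  avg_conversions = c(27.45, 23.31, 25.30, 24.21)
)

# Approximate dollars spent per conversion (lower is better).
platforms$spend_per_conversion <-
  round(platforms$avg_spend / platforms$avg_conversions, 2)

# Sort from cheapest to most expensive conversions.
platforms[order(platforms$spend_per_conversion), ]
```

On these averages TikTok has the highest spend per conversion, which is consistent with the reading above.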

Conclusion

The company is doing well at a basic level of marketing, with moderate engagement and conversions. Back to School 2024 delivered the highest ROI and Holiday 2024 had the most engagement. Spending is minimal but appropriate, and the company should keep focusing on X and LinkedIn. TikTok should receive the least investment, as it brings the fewest conversions. Carousels and stories on X and reels on LinkedIn are the most popular content types, and the ideal posting times are between 11am and 12pm, around 8pm, and at 11pm.

As if analytics wasn’t fun enough…

Here are some memes I created depicting my coding experience :D

knitr::include_graphics("meme2.jpg")

knitr::include_graphics("meme1.jpg")