Introduction

This project traces the full digital marketing funnel, from impressions to clicks to conversions, across channels, audiences, and time. Through eight data-driven visualizations, we explore which channels deliver the highest ROI, which audiences respond best, and whether higher spend truly goes with better results, turning raw campaign numbers into a clear, actionable story for marketers.


Data Import & Preparation

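# Packages used throughout the report
library(tidyverse)   # readr, dplyr, tidyr, ggplot2, forcats, stringr
library(lubridate)   # floor_date()
library(scales)      # percent_format(), dollar_format(), label_comma()
library(plotly)      # ggplotly()
library(ggridges)    # geom_density_ridges()

# Load the Marketing Campaign Performance dataset
# Source: https://www.kaggle.com/datasets/manishabhatt22/marketing-campaign-performance-dataset
# Download the CSV and place it in your working directory before knitting

dat <- read_csv("marketing_campaign_performance.csv")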

# Clean column names
names(dat) <- make.names(names(dat))

# Parse the date, then derive the month and duration-group columns
dat <- dat %>%
  mutate(
    Date         = as.Date(Date, format = "%Y-%m-%d"),
    Month        = floor_date(Date, unit = "month"),
    Duration_Group = if_else(Duration < 14, "Short (< 14 days)", "Long (≥ 14 days)")
  )

glimpse(dat)
## Rows: 10,000
## Columns: 17
## $ Campaign_ID      <chr> "C00001", "C00002", "C00003", "C00004", "C00005", "C0…
## $ Company          <chr> "EpsilonAds", "ThetaRetail", "ThetaRetail", "BetaBran…
## $ Campaign_Type    <chr> "Email", "Influencer", "Search", "Email", "Email", "S…
## $ Target_Audience  <chr> "Men 18-24", "Men 45+", "Men 35-44", "Women 25-34", "…
## $ Duration         <dbl> 18, 40, 37, 57, 13, 59, 26, 38, 35, 26, 38, 17, 20, 2…
## $ Channel_Used     <chr> "Email", "Influencer", "Google Ads", "Email", "Email"…
## $ Conversion_Rate  <dbl> 0.0436, 0.0429, 0.0560, 0.0505, 0.0274, 0.0525, 0.033…
## $ Acquisition_Cost <dbl> 14.50, 50.82, 60.40, 14.59, 36.14, 16.31, 8.36, 38.01…
## $ ROI              <dbl> 3.377, 4.101, 4.395, 3.602, 4.882, 5.469, 3.152, 3.67…
## $ Clicks           <dbl> 4949, 11471, 8089, 3458, 2764, 868, 1607, 3116, 4146,…
## $ Impressions      <dbl> 148679, 402483, 239664, 168870, 95825, 15342, 47340, …
## $ Engagement_Score <dbl> 52.8, 84.0, 46.2, 45.6, 64.1, 40.1, 74.3, 87.8, 61.9,…
## $ Customer_Segment <chr> "Budget Shoppers", "Fashionistas", "Tech Enthusiasts"…
## $ Date             <date> 2022-08-17, 2022-10-12, 2022-05-08, 2022-08-27, 2023…
## $ CTR              <dbl> 0.03329, 0.02850, 0.03375, 0.02048, 0.02885, 0.05660,…
## $ Month            <date> 2022-08-01, 2022-10-01, 2022-05-01, 2022-08-01, 2023…
## $ Duration_Group   <chr> "Long (≥ 14 days)", "Long (≥ 14 days)", "Long (≥ 14 d…
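
The derived columns can be sanity-checked on a few rows. The sketch below rebuilds `Month` and `Duration_Group` for the first four `Duration` and `Date` values printed by `glimpse()`, then checks the 14-day boundary on its own. The object names `check` and `boundary` are hypothetical, and the sketch assumes dplyr and lubridate are installed.

```r
library(dplyr)
library(lubridate)

# First four Duration/Date pairs as printed by glimpse(); `check` is a hypothetical name
check <- tibble(
  Duration = c(18, 40, 37, 57),
  Date     = as.Date(c("2022-08-17", "2022-10-12", "2022-05-08", "2022-08-27"))
) %>%
  mutate(
    Month          = floor_date(Date, unit = "month"),
    Duration_Group = if_else(Duration < 14, "Short (< 14 days)", "Long (≥ 14 days)")
  )

check

# Boundary behaviour: 13 days is "Short", exactly 14 days is "Long"
boundary <- if_else(c(13, 14) < 14, "Short (< 14 days)", "Long (≥ 14 days)")
boundary
```

All four rows land in the long group and their months match the `Month` values in the output above, which is what the preparation step is expected to produce.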

Figure 1 — Horizontal Bar Chart: Average ROI by Marketing Channel

fig1_dat <- dat %>%
  group_by(Channel_Used) %>%
  summarise(avg_ROI = mean(ROI, na.rm = TRUE)) %>%
  arrange(desc(avg_ROI)) %>%
  mutate(Channel_Used = fct_reorder(Channel_Used, avg_ROI))

overall_avg <- mean(dat$ROI, na.rm = TRUE)

ggplot(fig1_dat, aes(x = avg_ROI, y = Channel_Used, fill = avg_ROI)) +
  geom_col(width = 0.6, show.legend = FALSE) +
  geom_vline(xintercept = overall_avg, linetype = "dashed", color = "gray40", linewidth = 0.7) +
  geom_text(aes(label = round(avg_ROI, 2)), hjust = -0.15, size = 3.5, color = "gray20") +
  annotate("text", x = overall_avg + 0.01, y = 0.6,
           label = paste0("Overall avg: ", round(overall_avg, 2)),
           size = 3, color = "gray40", hjust = 0) +
  scale_fill_gradient(low = "#a8d5e2", high = "#1a6985") +
  scale_x_continuous(expand = expansion(mult = c(0, 0.15))) +
  labs(
    title    = "Average ROI by Marketing Channel",
    subtitle = "Channels sorted by return on investment; dashed line marks the portfolio average",
    x        = "Average ROI",
    y        = NULL,
    caption  = "Source: Kaggle — Marketing Campaign Performance Dataset"
  ) +
  theme_minimal(base_size = 13) +
  theme(
    plot.title    = element_text(face = "bold"),
    plot.subtitle = element_text(color = "gray40", size = 10),
    panel.grid.major.y = element_blank(),
    panel.grid.minor   = element_blank()
  )

Caption: Not all channels are created equal. This chart ranks marketing channels by average return on investment (ROI), with a dashed reference line marking the average across all campaigns. Channels whose bars extend past the line outperform that average and are candidates for a larger share of the budget.
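
The comparison against the dashed line can also be made explicit in a table. The following is a minimal sketch on made-up numbers, not output from the dataset: `toy`, `channel_flags`, and the channel labels are hypothetical. It flags each channel whose mean ROI exceeds the campaign-level average, computed the same way as `overall_avg` in the figure code.

```r
library(dplyr)

# Hypothetical toy data: three channels with made-up ROI values
toy <- tibble(
  Channel_Used = c("A", "A", "B", "B", "B", "C"),
  ROI          = c(5, 7, 2, 3, 4, 6)
)

# Campaign-level average, analogous to mean(dat$ROI) in the figure code
overall_avg <- mean(toy$ROI)

channel_flags <- toy %>%
  group_by(Channel_Used) %>%
  summarise(avg_ROI = mean(ROI), n_campaigns = n(), .groups = "drop") %>%
  mutate(above_overall = avg_ROI > overall_avg) %>%
  arrange(desc(avg_ROI))

channel_flags
```

In this toy case the campaign-level average is 4.5, while the unweighted mean of the three channel means is 5, because channel B contributes more campaigns. The reference line is therefore weighted by campaign count, which is worth keeping in mind when channels differ a lot in size.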


Figure 2 — Interactive Line Plot: CTR Trend Over Time by Channel

fig2_dat <- dat %>%
  group_by(Channel_Used, Month) %>%
  summarise(avg_CTR = mean(CTR, na.rm = TRUE), .groups = "drop")

p2 <- ggplot(fig2_dat, aes(x = Month, y = avg_CTR,
                            color = Channel_Used, group = Channel_Used,
                            text = paste0("Channel: ", Channel_Used,
                                          "<br>Month: ", format(Month, "%b %Y"),
                                          "<br>Avg CTR: ", round(avg_CTR, 4)))) +
  geom_line(linewidth = 0.9) +
  geom_point(size = 1.8) +
  scale_color_brewer(palette = "Set2") +
  scale_y_continuous(labels = percent_format(accuracy = 0.01)) +
  labs(
    title   = "Average Click-Through Rate Over Time by Channel",
    x       = NULL,
    y       = "Average CTR",
    color   = "Channel",
    caption = "Source: Kaggle — Marketing Campaign Performance Dataset"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    plot.title = element_text(face = "bold"),
    legend.position = "bottom"
  )

ggplotly(p2, tooltip = "text") %>%
  layout(legend = list(orientation = "h", y = -0.2))

Caption: Click-through rates tell a story over time. This interactive multi-line plot tracks the monthly average CTR for each marketing channel. Hover over any point to see exact values, and click entries in the legend to hide or isolate individual channels when looking for seasonal patterns or diverging trends.


Figure 3 — Scatter Plot: Acquisition Cost vs. Conversion Rate

set.seed(42)  # arbitrary seed so the random sample is the same on every knit

fig3_dat <- dat %>%
  filter(!is.na(Acquisition_Cost), !is.na(Conversion_Rate)) %>%
  slice_sample(n = 3000)   # sample for legibility; keeps all rows if there are fewer than 3,000

ggplot(fig3_dat, aes(x = Acquisition_Cost, y = Conversion_Rate, color = Campaign_Type)) +
  geom_point(alpha = 0.4, size = 1.8) +
  geom_smooth(method = "lm", se = FALSE, linewidth = 0.9) +
  scale_color_brewer(palette = "Dark2") +
  scale_y_continuous(labels = percent_format(accuracy = 1)) +
  scale_x_continuous(labels = dollar_format()) +
  labs(
    title    = "Acquisition Cost vs. Conversion Rate by Campaign Type",
    subtitle = "Each point is one campaign (random sample of up to 3,000); lines show linear trends per campaign type",
    x        = "Acquisition Cost (USD per customer)",
    y        = "Conversion Rate",
    color    = "Campaign Type",
    caption  = "Source: Kaggle — Marketing Campaign Performance Dataset"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    plot.title    = element_text(face = "bold"),
    plot.subtitle = element_text(color = "gray40", size = 10),
    legend.position = "bottom"
  )

Caption: Does spending more guarantee better results? Each point represents one campaign from a random sample of up to 3,000, colored by campaign type and positioned by acquisition cost (x-axis) and conversion rate (y-axis). The trend lines show whether acquisition cost and conversion rate move together, or whether efficiency varies by campaign type.
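
The trend lines can be backed by numbers. Below is a minimal sketch, assuming made-up and exactly linear toy values (`toy` and `trend_summary` are hypothetical names), that computes one least-squares slope and one Pearson correlation per campaign type. These are the quantities that each `geom_smooth(method = "lm")` line summarises visually.

```r
library(dplyr)

# Hypothetical toy data: two campaign types with made-up, exactly linear values
toy <- tibble(
  Campaign_Type    = rep(c("X", "Y"), each = 4),
  Acquisition_Cost = c(10, 20, 30, 40, 10, 20, 30, 40),
  Conversion_Rate  = c(0.02, 0.03, 0.04, 0.05, 0.06, 0.05, 0.04, 0.03)
)

# For a simple linear regression the least-squares slope is cov(x, y) / var(x)
trend_summary <- toy %>%
  group_by(Campaign_Type) %>%
  summarise(
    slope = cov(Acquisition_Cost, Conversion_Rate) / var(Acquisition_Cost),
    r     = cor(Acquisition_Cost, Conversion_Rate),
    .groups = "drop"
  )

trend_summary
```

A positive slope (type X here) means conversion rate rises with acquisition cost; a negative one (type Y) means it falls. On real data the correlation will usually be far from ±1, and a slope near zero suggests that cost and conversion rate are largely unrelated for that campaign type.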


Figure 4 — Faceted Bar Chart: CTR by Campaign Type and Target Audience

fig4_dat <- dat %>%
  group_by(Campaign_Type, Target_Audience) %>%
  summarise(avg_CTR = mean(CTR, na.rm = TRUE), .groups = "drop")

ggplot(fig4_dat, aes(x = fct_reorder(Target_Audience, avg_CTR),
                     y = avg_CTR, fill = Campaign_Type)) +
  geom_col(show.legend = FALSE, width = 0.7) +
  coord_flip() +
  facet_wrap(~ Campaign_Type, scales = "free_x") +
  scale_fill_brewer(palette = "Set2") +
  scale_y_continuous(labels = percent_format(accuracy = 0.01)) +
  labs(
    title    = "Average CTR by Campaign Type and Target Audience",
    subtitle = "Faceted by campaign type; audiences share one ordering, by median CTR across campaign types",
    x        = NULL,
    y        = "Average CTR",
    caption  = "Source: Kaggle — Marketing Campaign Performance Dataset"
  ) +
  theme_minimal(base_size = 11) +
  theme(
    plot.title    = element_text(face = "bold"),
    plot.subtitle = element_text(color = "gray40", size = 9),
    strip.text    = element_text(face = "bold"),
    panel.grid.major.y = element_blank()
  )

Caption: The right message to the wrong audience falls flat. This faceted chart breaks down average CTR by target audience segment within each campaign type, highlighting which demographic groups are most responsive and where targeting strategy can be sharpened.
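
Because `fct_reorder()` is applied to the whole summary table, every panel in Figure 4 shares a single audience ordering. If a separate ordering per panel is wanted, one common workaround is to build a combined key per audience and campaign type and strip the suffix again when labelling the axis. The sketch below shows only that data step on made-up values; `toy`, `panel_key`, `strip_suffix`, and the `"___"` separator are all hypothetical choices.

```r
library(dplyr)
library(forcats)

# Hypothetical toy data: the best audience differs between the two campaign types
toy <- tibble(
  Campaign_Type   = c("Email", "Email", "Search", "Search"),
  Target_Audience = c("Men 18-24", "Women 25-34", "Men 18-24", "Women 25-34"),
  avg_CTR         = c(0.02, 0.04, 0.05, 0.03)
)

# One factor level per (audience, campaign type) pair, ordered by CTR,
# so that each facet can end up with its own ordering
ordered_toy <- toy %>%
  mutate(
    panel_key = paste(Target_Audience, Campaign_Type, sep = "___"),
    panel_key = fct_reorder(panel_key, avg_CTR)
  )

levels(ordered_toy$panel_key)

# Helper for the axis labels: drop the "___<campaign type>" suffix again
strip_suffix <- function(x) sub("___.*$", "", x)
strip_suffix(levels(ordered_toy$panel_key))
```

In the plot, `panel_key` would replace the reordered `Target_Audience` on the x aesthetic, with `strip_suffix` passed as `labels` to `scale_x_discrete()` and the audience axis left free so each panel shows only its own keys.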


Figure 5 — Box Plot: Distribution of ROI by Campaign Type

fig5_dat <- dat %>%
  filter(!is.na(ROI), !is.na(Campaign_Type))

ggplot(fig5_dat, aes(x = fct_reorder(Campaign_Type, ROI, median),
                     y = ROI, fill = Campaign_Type)) +
  geom_boxplot(outlier.shape = NA, alpha = 0.7, width = 0.5) +
  geom_jitter(width = 0.15, alpha = 0.08, size = 0.8, color = "gray30") +
  scale_fill_brewer(palette = "Pastel1") +
  coord_flip() +
  labs(
    title    = "Distribution of ROI by Campaign Type",
    subtitle = "Box shows median and IQR; jittered points show individual campaigns",
    x        = NULL,
    y        = "Return on Investment (ROI)",
    caption  = "Source: Kaggle — Marketing Campaign Performance Dataset"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    plot.title    = element_text(face = "bold"),
    plot.subtitle = element_text(color = "gray40", size = 10),
    legend.position = "none",
    panel.grid.major.y = element_blank()
  )

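Caption: Averages can be deceiving. This box plot shows the median ROI for each campaign type together with its spread. The jittered points show every individual campaign, including extreme values, while the box plot's own outlier markers are suppressed so those campaigns are not drawn twice. Together they show which campaign types are reliably strong performers and which carry higher variance and financial risk.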
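
The quantities that each box encodes can be tabulated directly. The following sketch uses made-up values (`toy`, `roi_spread`, and the two type labels are hypothetical) to show two campaign types with the same median ROI but very different spread.

```r
library(dplyr)

# Hypothetical toy data: same median ROI, very different spread
toy <- tibble(
  Campaign_Type = rep(c("Steady", "Volatile"), each = 5),
  ROI           = c(3.8, 3.9, 4.0, 4.1, 4.2,
                    1.0, 2.0, 4.0, 6.0, 7.0)
)

# Median, quartiles, and interquartile range per campaign type
roi_spread <- toy %>%
  group_by(Campaign_Type) %>%
  summarise(
    median_ROI = median(ROI),
    q25        = quantile(ROI, 0.25, names = FALSE),
    q75        = quantile(ROI, 0.75, names = FALSE),
    iqr        = IQR(ROI),
    .groups    = "drop"
  )

roi_spread
```

Both toy types have a median of 4, but the interquartile range is 0.2 for the first and 4 for the second. A bar chart of averages would hide exactly this difference.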


Figure 6 — Heatmap: Engagement Score by Channel and Customer Segment

fig6_dat <- dat %>%
  group_by(Channel_Used, Customer_Segment) %>%
  summarise(avg_engagement = mean(Engagement_Score, na.rm = TRUE), .groups = "drop")

ggplot(fig6_dat, aes(x = Customer_Segment, y = Channel_Used, fill = avg_engagement)) +
  geom_tile(color = "white", linewidth = 0.5) +
  geom_text(aes(label = round(avg_engagement, 1)), size = 3.2, color = "white", fontface = "bold") +
  scale_fill_gradient(low = "#c6dbef", high = "#084594") +
  labs(
    title    = "Average Engagement Score by Channel and Customer Segment",
    subtitle = "Darker cells indicate higher engagement; values shown in each cell",
    x        = "Customer Segment",
    y        = "Channel",
    fill     = "Avg Engagement",
    caption  = "Source: Kaggle — Marketing Campaign Performance Dataset"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    plot.title    = element_text(face = "bold"),
    plot.subtitle = element_text(color = "gray40", size = 10),
    axis.text.x   = element_text(angle = 30, hjust = 1),
    panel.grid    = element_blank()
  )

Caption: Where attention meets audience. Each cell of this heatmap represents the average engagement score at the intersection of a marketing channel and a customer segment. Darker cells identify the combinations that resonate most — a practical guide for precision targeting.
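
The strongest cell in each column of the heatmap can also be read off programmatically. This is a minimal sketch on made-up scores, with hypothetical names `toy` and `top_cells` and placeholder segment labels. It keeps the channel with the highest average engagement for each customer segment.

```r
library(dplyr)

# Hypothetical toy data in the same shape as the heatmap summary
toy <- tibble(
  Channel_Used     = rep(c("Email", "Social"), times = 2),
  Customer_Segment = rep(c("Seg 1", "Seg 2"), each = 2),
  avg_engagement   = c(55, 62, 71, 48)
)

# Highest-engagement channel per segment; with_ties = FALSE keeps one row per segment
top_cells <- toy %>%
  group_by(Customer_Segment) %>%
  slice_max(avg_engagement, n = 1, with_ties = FALSE) %>%
  ungroup()

top_cells
```

Applied to the real summary table, the same step would list one channel per segment, which may be easier to act on than scanning cell shades.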


Figure 7 — Ridge Plot: Distribution of Impressions by Channel

fig7_dat <- dat %>%
  filter(!is.na(Impressions), Impressions > 0)

ggplot(fig7_dat, aes(x = Impressions, y = Channel_Used, fill = Channel_Used)) +
  geom_density_ridges(alpha = 0.7, scale = 1.2, rel_min_height = 0.01) +
  scale_x_log10(labels = label_comma()) +
  scale_fill_brewer(palette = "Set2") +
  labs(
    title    = "Distribution of Impressions by Marketing Channel",
    subtitle = "Log scale on x-axis; each ridge shows the density of campaign impressions per channel",
    x        = "Impressions (log scale)",
    y        = NULL,
    caption  = "Source: Kaggle — Marketing Campaign Performance Dataset"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    plot.title    = element_text(face = "bold"),
    plot.subtitle = element_text(color = "gray40", size = 10),
    legend.position = "none"
  )

Caption: Reach is not evenly distributed. This ridge plot shows how impressions are spread across campaigns within each channel, plotted on a log scale to account for skewness. Peaks and tails reveal whether channels deliver broad, consistent reach or operate in high-low extremes.


Figure 8 — Dumbbell Chart: CTR for Short vs. Long Campaigns by Campaign Type

fig8_dat <- dat %>%
  filter(!is.na(CTR), !is.na(Duration_Group), !is.na(Campaign_Type)) %>%
  group_by(Campaign_Type, Duration_Group) %>%
  summarise(avg_CTR = mean(CTR, na.rm = TRUE), .groups = "drop") %>%
  mutate(Duration_Group = if_else(str_detect(Duration_Group, "Short"), "Short", "Long")) %>%
  pivot_wider(names_from = Duration_Group,
              values_from = avg_CTR) %>%
  filter(!is.na(Short), !is.na(Long)) %>%
  mutate(Campaign_Type = fct_reorder(Campaign_Type, Long))

ggplot(fig8_dat) +
  geom_segment(aes(x = Short, xend = Long,
                   y = Campaign_Type, yend = Campaign_Type),
               color = "gray70", linewidth = 1.2) +
  geom_point(aes(x = Short, y = Campaign_Type),
             color = "#e07b39", size = 4) +
  geom_point(aes(x = Long, y = Campaign_Type),
             color = "#1a6985", size = 4) +
  scale_x_continuous(labels = percent_format(accuracy = 0.01)) +
  annotate("text", x = max(fig8_dat$Long, na.rm = TRUE),
           y = 0.5, label = "● Long (≥14 days)",
           color = "#1a6985", size = 3.2, hjust = 1) +
  annotate("text", x = min(fig8_dat$Short, na.rm = TRUE),
           y = 0.5, label = "● Short (<14 days)",
           color = "#e07b39", size = 3.2, hjust = 0) +
  labs(
    title    = "CTR for Short vs. Long Campaigns by Campaign Type",
    subtitle = "Orange dot = short campaigns (<14 days); Blue dot = long campaigns (≥14 days)",
    x        = "Average CTR",
    y        = NULL,
    caption  = "Source: Kaggle — Marketing Campaign Performance Dataset"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    plot.title    = element_text(face = "bold"),
    plot.subtitle = element_text(color = "gray40", size = 10),
    panel.grid.major.y = element_blank(),
    panel.grid.minor   = element_blank()
  )

Caption: Is patience rewarded in marketing? Each dumbbell connects the average CTR of short campaigns (under 14 days) to that of long campaigns (14 days or more) for the same campaign type. The direction and size of the gap address a practical question: are longer campaigns associated with better click-through performance?
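
The gap that each dumbbell draws can be expressed as a number. The sketch below uses made-up CTR values (`toy` and `ctr_gap` are hypothetical names) in the same long shape as the grouped summary in the figure code. It reports the difference in percentage points along with its direction.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: average CTR per campaign type and duration group
toy <- tibble(
  Campaign_Type  = c("Email", "Email", "Search", "Search"),
  Duration_Group = c("Short", "Long", "Short", "Long"),
  avg_CTR        = c(0.030, 0.034, 0.041, 0.038)
)

ctr_gap <- toy %>%
  pivot_wider(names_from = Duration_Group, values_from = avg_CTR) %>%
  mutate(
    gap_pp    = (Long - Short) * 100,   # gap in percentage points
    direction = case_when(
      Long > Short ~ "Long higher",
      Long < Short ~ "Short higher",
      TRUE         ~ "No difference"
    )
  )

ctr_gap
```

In the toy data the long campaigns lead by about 0.4 percentage points for one type and trail by about 0.3 for the other. A difference in group averages like this describes an association only; it does not by itself show that extending a campaign would change its CTR.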


Summary

This project examined digital marketing campaign performance across five channels and multiple campaign types using eight visualizations.

Together, these visualizations tell a complete story of the digital marketing funnel — from broad reach (impressions) to engagement, conversion, and ultimately return on investment.


Report published to RPubs. All figures created in R using ggplot2, plotly, and ggridges. Dataset: Kaggle — Marketing Campaign Performance Dataset.