Marketing Campaign Performance Analysis using Data Visualization
This project focuses on analyzing a large-scale marketing campaign dataset to understand how different campaign strategies influence customer engagement, conversion rate, impressions, clicks, and return on investment (ROI).
Marketing Campaign Dataset
Open marketing campaign dataset provided for academic analysis.
Link : https://www.kaggle.com/datasets/manishabhatt22/marketing-campaign-performance-dataset
| Variable | Description |
|---|---|
| Campaign_Type | Type of marketing campaign |
| Channel_Used | Marketing channel used |
| Conversion_Rate | Percentage of successful conversions |
| ROI | Return on Investment |
| Clicks | Number of clicks |
| Impressions | Number of impressions |
| Engagement_Score | Customer engagement level |
| Target_Audience | Audience category |
| Location | Geographic location |
| Date | Campaign date |
The project aims to answer the following questions:
The report contains at least eight figures with more than three visualization types.
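library(tidyverse) # dplyr, ggplot2, readr; lubridate is attached too from tidyverse 2.0.0 on
library(gganimate) # transition_reveal(), animate(), gifski_renderer()
# Import the marketing dataset
Campaign_data <- read_csv("marketing_sampled_data.csv")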
## Rows: 10000 Columns: 16
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (10): Company, Campaign_Type, Target_Audience, Duration, Channel_Used, A...
## dbl (6): Campaign_ID, Conversion_Rate, ROI, Clicks, Impressions, Engagement...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# Clean Acquisition_Cost: remove "$" and "," then convert to numeric
Campaign_data <- Campaign_data %>%
mutate(Acquisition_Cost = as.numeric(gsub("[$,]", "", Acquisition_Cost)))
# Ensure the result is a tibble (read_csv() already returns one)
Campaign_data <- as_tibble(Campaign_data)
# Preview the data
Campaign_data
## # A tibble: 10,000 × 16
## Campaign_ID Company Campaign_Type Target_Audience Duration Channel_Used
## <dbl> <chr> <chr> <chr> <chr> <chr>
## 1 182735 Innovate Ind… Display Men 18-24 30 days YouTube
## 2 188942 TechCorp Influencer All Ages 45 days Google Ads
## 3 134058 TechCorp Display Women 35-44 30 days Email
## 4 124022 TechCorp Social Media Men 18-24 60 days Google Ads
## 5 160997 Innovate Ind… Display Men 18-24 30 days Instagram
## 6 103065 DataTech Sol… Email All Ages 60 days Google Ads
## 7 124507 TechCorp Search Men 18-24 15 days Email
## 8 199365 Alpha Innova… Email Women 35-44 30 days Email
## 9 193627 TechCorp Email Men 25-34 15 days YouTube
## 10 45404 NexGen Syste… Social Media Men 25-34 30 days YouTube
## # ℹ 9,990 more rows
## # ℹ 10 more variables: Conversion_Rate <dbl>, Acquisition_Cost <dbl>,
## # ROI <dbl>, Location <chr>, Language <chr>, Clicks <dbl>, Impressions <dbl>,
## # Engagement_Score <dbl>, Customer_Segment <chr>, Date <chr>
The bar chart compares campaign performance. The x-axis shows the five campaign types (Influencer, Social Media, Display, Search, and Email), and the y-axis shows the average ROI for each type, so the chart highlights which campaign type delivers the highest return on investment. The data comes from a two-column tibble containing Campaign_Type and avg_ROI. Key insight: average ROI ranges only from 4.97 (Email) to 5.02 (Influencer and Social Media), so no single campaign type dramatically outperforms the others, and diversifying across types may be a sound approach.
fig_dat1 <- Campaign_data %>%
select(Campaign_Type, ROI) %>%
group_by(Campaign_Type) %>%
summarise(avg_ROI = round(mean(ROI, na.rm = TRUE), 2), .groups = "drop") %>%
arrange(desc(avg_ROI))
fig_dat1
## # A tibble: 5 × 2
## Campaign_Type avg_ROI
## <chr> <dbl>
## 1 Influencer 5.02
## 2 Social Media 5.02
## 3 Search 5
## 4 Display 4.98
## 5 Email 4.97
ggplot(fig_dat1, aes(x = reorder(Campaign_Type, -avg_ROI),
y = avg_ROI,
fill = Campaign_Type)) +
geom_col(width = 0.6, show.legend = FALSE) +
geom_text(aes(label = avg_ROI), vjust = -0.5, size = 4) +
scale_fill_brewer(palette = "Set2") +
labs(
title = "Average ROI by Campaign Type",
subtitle = "Based on 10,000 marketing campaigns",
x = "Campaign Type",
y = "Average ROI"
) +
theme_minimal(base_size = 13) +
theme(
plot.title = element_text(face = "bold", hjust = 0.5),
plot.subtitle = element_text(hjust = 0.5)
)
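To put a number on how close the campaign types are, the gap between the best and worst averages can be computed directly. The sketch below re-enters the five averages printed in the summary table for this figure; the names `roi_by_type`, `roi_spread`, and `roi_spread_pct` are illustrative and not part of the original analysis.

```r
library(tibble)

# Values copied from the printed fig_dat1 summary
roi_by_type <- tribble(
  ~Campaign_Type, ~avg_ROI,
  "Influencer",   5.02,
  "Social Media", 5.02,
  "Search",       5.00,
  "Display",      4.98,
  "Email",        4.97
)

# Absolute gap between the best and worst campaign type
roi_spread <- max(roi_by_type$avg_ROI) - min(roi_by_type$avg_ROI)

# The same gap expressed as a percentage of the mean of the five averages
roi_spread_pct <- 100 * roi_spread / mean(roi_by_type$avg_ROI)

round(roi_spread, 2)
round(roi_spread_pct, 1)
```

The gap works out to about 0.05 ROI points, roughly 1% of the mean, which is the basis for reading the bars as nearly level.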
The animated line chart is built with gganimate. The x-axis represents the month and year, and the y-axis shows the monthly average ROI. Each line corresponds to one campaign type, allowing comparisons across Display, Influencer, Social Media, Email, and Search. The animation progressively reveals each month, showing how ROI evolves over time for all campaign types simultaneously. The data comes from a three-column tibble containing Month, Campaign_Type, and avg_ROI. Key insight: the animation adds a temporal dimension. Instead of seeing all the data at once, the viewer watches ROI unfold month by month, which makes seasonal patterns and performance shifts easier to spot.
fig_dat2 <- Campaign_data %>%
select(Date, Campaign_Type, ROI) %>%
mutate(Month = floor_date(as.Date(Date, format = "%m/%d/%Y"), "month")) %>%
group_by(Month, Campaign_Type) %>%
summarise(avg_ROI = round(mean(ROI, na.rm = TRUE), 2), .groups = "drop") %>%
arrange(Month)
fig_dat2
## # A tibble: 60 × 3
## Month Campaign_Type avg_ROI
## <date> <chr> <dbl>
## 1 2021-01-01 Display 5.05
## 2 2021-01-01 Email 5.06
## 3 2021-01-01 Influencer 4.93
## 4 2021-01-01 Search 5.2
## 5 2021-01-01 Social Media 5.34
## 6 2021-02-01 Display 4.95
## 7 2021-02-01 Email 4.95
## 8 2021-02-01 Influencer 4.97
## 9 2021-02-01 Search 4.84
## 10 2021-02-01 Social Media 5
## # ℹ 50 more rows
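The monthly grouping depends on every Date string matching the `%m/%d/%Y` layout, because `as.Date()` returns `NA` without an error for strings that do not match. Below is a minimal sketch of that behaviour with made-up date strings; `raw_dates`, `n_failed`, and `month_start` are hypothetical names, and none of the values come from the dataset.

```r
library(lubridate)

# Hypothetical inputs: three strings in month/day/year layout, one in ISO layout
raw_dates <- c("01/15/2021", "01/31/2021", "02/01/2021", "2021-03-05")

# Same parsing call as in the pipeline; the non-matching string becomes NA
parsed <- as.Date(raw_dates, format = "%m/%d/%Y")
n_failed <- sum(is.na(parsed))

# Bucket the successfully parsed dates by the first day of their month
month_start <- floor_date(parsed[!is.na(parsed)], "month")

n_failed
month_start
```

Running the same `sum(is.na(...))` check on the real Date column before grouping would show whether any campaigns end up in an `NA` month.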
p2 <- ggplot(fig_dat2, aes(x = Month,
y = avg_ROI,
color = Campaign_Type,
group = Campaign_Type)) +
geom_line(linewidth = 1) +
geom_point(size = 3) +
geom_text(aes(label = Campaign_Type),
hjust = -0.1,
size = 3,
fontface = "bold") +
scale_color_brewer(palette = "Set2") +
scale_x_date(date_labels = "%b %Y",
date_breaks = "2 months",
expand = expansion(mult = c(0.05, 0.2))) +
scale_y_continuous(limits = c(3.5, 6.5)) +
labs(
title = "Monthly Average ROI Trends by Campaign Type",
subtitle = "Month: {frame_along}",
x = "Month",
y = "Average ROI",
color = "Campaign Type",
caption = "Source: Marketing Campaign Dataset"
) +
theme_minimal(base_size = 13) +
theme(
plot.title = element_text(face = "bold", hjust = 0.5),
plot.subtitle = element_text(hjust = 0.5, color = "grey40"),
axis.text.x = element_text(angle = 45, hjust = 1),
legend.position = "bottom"
) +
transition_reveal(Month) +
ease_aes("linear")
# Save to same folder as the .Rmd file
gif_path <- file.path(getwd(), "viz_2_animated.gif")
animate(
p2,
nframes = 100,
fps = 10,
width = 800, # device width in pixels
height = 500, # device height in pixels
units = "px", # passed to the graphics device so width and height are read as pixels
res = 96, # device resolution in pixels per inch
renderer = gifski_renderer(gif_path)
)
The scatter plot shows whether more clicks translate into better conversion rates. The x-axis represents the number of clicks and the y-axis represents the conversion rate (%). Each point represents a single marketing campaign, colored by campaign type (Display, Email, Influencer, Search, or Social Media), and a linear trend line is fitted for each type. The data comes from a three-column tibble containing Clicks, Conversion_Rate, and Campaign_Type. Key insight: more clicks do not necessarily translate into better conversion rates, and differences across campaign types are subtle.
fig_dat3 <- Campaign_data %>%
select(Clicks, Conversion_Rate, Campaign_Type)
fig_dat3
## # A tibble: 10,000 × 3
## Clicks Conversion_Rate Campaign_Type
## <dbl> <dbl> <chr>
## 1 121 0.07 Display
## 2 797 0.05 Influencer
## 3 426 0.15 Display
## 4 870 0.14 Social Media
## 5 169 0.13 Display
## 6 155 0.13 Email
## 7 304 0.05 Search
## 8 732 0.02 Email
## 9 592 0.11 Email
## 10 483 0.13 Social Media
## # ℹ 9,990 more rows
ggplot(fig_dat3, aes(x = Clicks, y = Conversion_Rate, color = Campaign_Type)) +
geom_jitter(alpha = 0.3, size = 1.5, height = 0.005) +
geom_smooth(method = "lm", se = FALSE, linewidth = 0.8) +
scale_color_brewer(palette = "Set2") +
scale_y_continuous(labels = scales::percent_format(accuracy = 1)) +
labs(
title = "Clicks vs Conversion Rate by Campaign Type",
subtitle = "Each point represents one campaign",
x = "Clicks",
y = "Conversion Rate (%)",
color = "Campaign Type"
) +
theme_minimal(base_size = 13) +
theme(
plot.title = element_text(face = "bold", hjust = 0.5),
plot.subtitle = element_text(hjust = 0.5),
legend.position = "bottom"
)
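The visual impression of flat trend lines can be checked numerically with a per-type correlation between clicks and conversion rate. The sketch below uses a small hypothetical data set (`toy`) built so that one type has no linear trend and the other a perfect one; in the report, the same summary would be applied to `fig_dat3`.

```r
library(dplyr)
library(tibble)

# Hypothetical toy data standing in for fig_dat3
toy <- tibble(
  Campaign_Type = rep(c("Display", "Email"), each = 5),
  Clicks = rep(c(100, 200, 300, 400, 500), times = 2),
  Conversion_Rate = c(0.05, 0.10, 0.15, 0.10, 0.05,  # rises then falls: no linear trend
                      0.02, 0.04, 0.06, 0.08, 0.10)  # rises steadily: perfect linear trend
)

# Pearson correlation between clicks and conversion rate within each campaign type
click_conv_cor <- toy %>%
  group_by(Campaign_Type) %>%
  summarise(r = cor(Clicks, Conversion_Rate), n = n(), .groups = "drop")

click_conv_cor
```

Correlations near zero for every campaign type would support the reading that click volume alone says little about conversion.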
The histogram shows the distribution of engagement scores across all campaigns. The x-axis shows the engagement score (1-10) and the y-axis shows the number of campaigns in each bin. The data comes from a single-column tibble containing Engagement_Score. Key insight: engagement scores are spread fairly evenly across campaigns, with no particular score range disproportionately common.
fig_dat4 <- Campaign_data %>%
select(Engagement_Score)
fig_dat4
## # A tibble: 10,000 × 1
## Engagement_Score
## <dbl>
## 1 2
## 2 5
## 3 5
## 4 1
## 5 10
## 6 6
## 7 1
## 8 8
## 9 9
## 10 8
## # ℹ 9,990 more rows
ggplot(fig_dat4, aes(x = Engagement_Score)) +
geom_histogram(binwidth = 1, fill = "#66C2A5", color = "white", boundary = 0.5) +
scale_x_continuous(breaks = 1:10) +
labs(
title = "Distribution of Engagement Scores",
subtitle = "Based on 10,000 marketing campaigns",
x = "Engagement Score (1–10)",
y = "Number of Campaigns"
) +
theme_minimal(base_size = 13) +
theme(
plot.title = element_text(face = "bold", hjust = 0.5),
plot.subtitle = element_text(hjust = 0.5)
)
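How even the distribution is can be expressed as each score's share of campaigns compared with the 10% that a perfectly flat distribution over ten scores would give. A minimal sketch with hypothetical scores follows; `toy_scores` is invented for illustration, and the real check would use `fig_dat4`.

```r
library(dplyr)
library(tibble)

# Hypothetical scores: 100 campaigns, slightly uneven across the ten score values
toy_scores <- rep(1:10, times = c(11, 9, 10, 10, 10, 10, 10, 10, 9, 11))

# Count campaigns per score and convert the counts to percentage shares
score_counts <- tibble(Engagement_Score = toy_scores) %>%
  count(Engagement_Score, name = "campaigns") %>%
  mutate(share_pct = 100 * campaigns / sum(campaigns))

# Largest deviation (in percentage points) from a perfectly even share
max_dev <- max(abs(score_counts$share_pct - 100 / nrow(score_counts)))

score_counts
max_dev
```

A small `max_dev` on the real data would back up the claim that no score range dominates.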
The boxplot shows how ROI varies across marketing channels. The x-axis represents the channel used (YouTube, Website, Email, Instagram, Google Ads, Facebook), ordered by median ROI, and the y-axis represents ROI values. Each box displays the median ROI, the interquartile range (IQR), and any outliers, and a white diamond inside each box marks the mean ROI. The data comes from a two-column tibble containing Channel_Used and ROI. Key insight: ROI is not uniform across channels. Some platforms yield steadier returns, while others are more unpredictable.
fig_dat5 <- Campaign_data %>%
select(Channel_Used, ROI)
fig_dat5
## # A tibble: 10,000 × 2
## Channel_Used ROI
## <chr> <dbl>
## 1 YouTube 5.82
## 2 Google Ads 7.37
## 3 Email 3.28
## 4 Google Ads 3.19
## 5 Instagram 6.55
## 6 Google Ads 2.81
## 7 Email 6.54
## 8 Email 7.18
## 9 YouTube 3.48
## 10 YouTube 5.15
## # ℹ 9,990 more rows
ggplot(fig_dat5, aes(x = reorder(Channel_Used, ROI, FUN = median),
y = ROI,
fill = Channel_Used)) +
geom_boxplot(outlier.shape = 21, outlier.size = 1.5,
outlier.alpha = 0.5, show.legend = FALSE) +
stat_summary(fun = mean, geom = "point", shape = 23,
size = 3, fill = "white") +
scale_fill_brewer(palette = "Set2") +
labs(
title = "Distribution of ROI Across Marketing Channels",
subtitle = "White diamond indicates the mean; box shows median and IQR",
x = "Channel Used",
y = "ROI"
) +
theme_minimal(base_size = 13) +
theme(
plot.title = element_text(face = "bold", hjust = 0.5),
plot.subtitle = element_text(hjust = 0.5),
axis.text.x = element_text(angle = 15, hjust = 1)
)
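The quantities drawn in each box can also be tabulated, which makes it easier to say which channels are "steadier". The sketch below computes median, mean, and IQR per channel on hypothetical values (`toy_roi`); applied to `fig_dat5`, the same summary would give the numbers behind the plot.

```r
library(dplyr)
library(tibble)

# Hypothetical ROI values for two channels with the same centre but different spread
toy_roi <- tibble(
  Channel_Used = rep(c("Email", "YouTube"), each = 5),
  ROI = c(4.8, 4.9, 5.0, 5.1, 5.2,  # tight spread
          2.0, 3.5, 5.0, 6.5, 8.0)  # wide spread
)

# Median, mean, and interquartile range per channel, steadiest channel first
channel_stats <- toy_roi %>%
  group_by(Channel_Used) %>%
  summarise(
    median_ROI = median(ROI),
    mean_ROI = mean(ROI),
    iqr_ROI = IQR(ROI),
    .groups = "drop"
  ) %>%
  arrange(iqr_ROI)

channel_stats
```

Sorting by IQR puts the channel with the narrowest middle 50% of ROI values at the top, which is one reasonable definition of "steadier".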
The heatmap shows how average ROI varies across campaign types and target audiences. The x-axis represents the campaign types (Display, Email, Influencer, Search, Social Media) and the y-axis represents the target audience groups (e.g., Men 18-24, Women 25-34). The fill color of each cell indicates the average ROI for that combination. The data comes from a three-column tibble containing Campaign_Type, Target_Audience, and avg_ROI. Key insight: ROI is not uniform across audiences and campaign types, so tailoring campaigns to the right audience can yield measurable improvements.
fig_dat6 <- Campaign_data %>%
select(Campaign_Type, Target_Audience, ROI) %>%
group_by(Campaign_Type, Target_Audience) %>%
summarise(avg_ROI = round(mean(ROI, na.rm = TRUE), 2), .groups = "drop")
fig_dat6
## # A tibble: 25 × 3
## Campaign_Type Target_Audience avg_ROI
## <chr> <chr> <dbl>
## 1 Display All Ages 5.02
## 2 Display Men 18-24 4.93
## 3 Display Men 25-34 5
## 4 Display Women 25-34 5.06
## 5 Display Women 35-44 4.92
## 6 Email All Ages 4.97
## 7 Email Men 18-24 4.91
## 8 Email Men 25-34 5
## 9 Email Women 25-34 5.01
## 10 Email Women 35-44 4.98
## # ℹ 15 more rows
ggplot(fig_dat6, aes(x = Campaign_Type,
y = Target_Audience,
fill = avg_ROI)) +
geom_tile(color = "white", linewidth = 0.8) +
geom_text(aes(label = avg_ROI), size = 3.5, color = "white", fontface = "bold") +
scale_fill_gradient(low = "#a8d5b5",
high = "#1a6b3c",
name = "Avg ROI") +
labs(
title = "Heatmap of Average ROI by\nCampaign Type and Target Audience",
subtitle = "Darker green = higher average ROI",
x = "Campaign Type",
y = "Target Audience"
) +
theme_minimal(base_size = 13) +
theme(
plot.title = element_text(face = "bold", hjust = 0.5),
plot.subtitle = element_text(hjust = 0.5),
axis.text.x = element_text(angle = 20, hjust = 1),
panel.grid = element_blank()
)
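The same grid can be read as a plain table by reshaping the long summary into one row per audience and one column per campaign type. The sketch below uses only the ten rows shown in the printed summary (Display and Email); `roi_long` and `roi_grid` are illustrative names, and the remaining fifteen rows would fill in the other three columns.

```r
library(tibble)
library(tidyr)

# First ten rows of the printed fig_dat6 summary (Display and Email only)
roi_long <- tribble(
  ~Campaign_Type, ~Target_Audience, ~avg_ROI,
  "Display", "All Ages",    5.02,
  "Display", "Men 18-24",   4.93,
  "Display", "Men 25-34",   5.00,
  "Display", "Women 25-34", 5.06,
  "Display", "Women 35-44", 4.92,
  "Email",   "All Ages",    4.97,
  "Email",   "Men 18-24",   4.91,
  "Email",   "Men 25-34",   5.00,
  "Email",   "Women 25-34", 5.01,
  "Email",   "Women 35-44", 4.98
)

# One row per audience, one column per campaign type, mirroring the heatmap layout
roi_grid <- pivot_wider(roi_long, names_from = Campaign_Type, values_from = avg_ROI)

roi_grid
```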
The pie chart shows how marketing campaigns are distributed across customer segments. Each slice represents one segment, sized by the number of campaigns directed at that group, and percentage labels on the slices make the proportions clear at a glance. The data comes from a four-column tibble containing Customer_Segment, count, percentage, and label. Key insight: campaigns are allocated almost evenly, with each segment receiving between 19.4% and 20.6% of the total, so the campaign mix reaches diverse customer groups rather than concentrating resources on one.
fig_dat7 <- Campaign_data %>%
select(Customer_Segment) %>%
group_by(Customer_Segment) %>%
summarise(count = n(), .groups = "drop") %>%
mutate(percentage = round(count / sum(count) * 100, 1),
label = paste0(Customer_Segment, "\n", percentage, "%"))
fig_dat7
## # A tibble: 5 × 4
## Customer_Segment count percentage label
## <chr> <int> <dbl> <chr>
## 1 Fashionistas 2001 20 "Fashionistas\n20%"
## 2 Foodies 2039 20.4 "Foodies\n20.4%"
## 3 Health & Wellness 1962 19.6 "Health & Wellness\n19.6%"
## 4 Outdoor Adventurers 1942 19.4 "Outdoor Adventurers\n19.4%"
## 5 Tech Enthusiasts 2056 20.6 "Tech Enthusiasts\n20.6%"
ggplot(fig_dat7, aes(x = "", y = count, fill = Customer_Segment)) +
geom_col(width = 1, color = "white", linewidth = 0.8) +
coord_polar(theta = "y", start = 0) +
geom_text(aes(label = paste0(percentage, "%")),
position = position_stack(vjust = 0.5),
size = 3.8,
fontface = "bold",
color = "white") +
scale_fill_brewer(palette = "Set2") +
labs(
title = "Distribution of Campaigns\nby Customer Segment",
subtitle = "Proportion of total campaigns targeting each segment",
fill = "Customer Segment",
x = NULL,
y = NULL
) +
theme_void(base_size = 13) +
theme(
plot.title = element_text(face = "bold", hjust = 0.5),
plot.subtitle = element_text(hjust = 0.5, margin = margin(b = 10)),
legend.position = "right"
)
The faceted bar chart compares campaign performance across geographic markets. Each panel represents one location (Chicago, Houston, Los Angeles, Miami, or New York). Within each panel, bars show the average ROI for each campaign type (Influencer, Social Media, Search, Display, Email). This layout makes it easy to compare how campaign types perform both within a single city and across cities. The data comes from a three-column tibble containing Location, Campaign_Type, and avg_ROI. Key insight: while ROI values are close across campaign types, regional differences matter. Miami shows the strongest returns, while Houston and Los Angeles show more variation, which can guide location-specific marketing strategies.
fig_dat8 <- Campaign_data %>%
select(Location, Campaign_Type, ROI) %>%
group_by(Location, Campaign_Type) %>%
summarise(avg_ROI = round(mean(ROI, na.rm = TRUE), 2), .groups = "drop")
fig_dat8
## # A tibble: 25 × 3
## Location Campaign_Type avg_ROI
## <chr> <chr> <dbl>
## 1 Chicago Display 5.1
## 2 Chicago Email 4.97
## 3 Chicago Influencer 5.07
## 4 Chicago Search 5.03
## 5 Chicago Social Media 4.99
## 6 Houston Display 4.85
## 7 Houston Email 5
## 8 Houston Influencer 4.96
## 9 Houston Search 4.94
## 10 Houston Social Media 5.04
## # ℹ 15 more rows
ggplot(fig_dat8, aes(x = reorder(Campaign_Type, avg_ROI),
y = avg_ROI,
fill = Campaign_Type)) +
geom_col(width = 0.7, show.legend = FALSE) +
geom_text(aes(label = avg_ROI),
hjust = -0.1,
size = 2.8,
fontface = "bold") +
scale_fill_brewer(palette = "Set2") +
scale_y_continuous(expand = expansion(mult = c(0, 0.15))) +
coord_flip() +
facet_wrap(~ Location, ncol = 3) +
labs(
title = "Average ROI by Campaign Type Across Locations",
subtitle = "Each panel represents a geographic market",
x = "Campaign Type",
y = "Average ROI"
) +
theme_minimal(base_size = 11) +
theme(
plot.title = element_text(face = "bold", hjust = 0.5),
plot.subtitle = element_text(hjust = 0.5),
strip.text = element_text(face = "bold", size = 10),
strip.background = element_rect(fill = "#f0f0f0", color = NA),
panel.spacing = unit(1, "lines")
)
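Variation within a location can be summarised as the gap between its best and worst campaign type. The sketch below does this for the ten rows shown in the printed summary (Chicago and Houston only); `roi_by_city` and `city_spread` are illustrative names, and the other three locations would need the full `fig_dat8` tibble.

```r
library(dplyr)
library(tibble)

# First ten rows of the printed fig_dat8 summary (Chicago and Houston only)
roi_by_city <- tribble(
  ~Location, ~Campaign_Type, ~avg_ROI,
  "Chicago", "Display",      5.10,
  "Chicago", "Email",        4.97,
  "Chicago", "Influencer",   5.07,
  "Chicago", "Search",       5.03,
  "Chicago", "Social Media", 4.99,
  "Houston", "Display",      4.85,
  "Houston", "Email",        5.00,
  "Houston", "Influencer",   4.96,
  "Houston", "Search",       4.94,
  "Houston", "Social Media", 5.04
)

# Lowest and highest average ROI per location, and the gap between them
city_spread <- roi_by_city %>%
  group_by(Location) %>%
  summarise(
    min_ROI = min(avg_ROI),
    max_ROI = max(avg_ROI),
    spread = max_ROI - min_ROI,
    .groups = "drop"
  )

city_spread
```

For these two cities the gap is about 0.13 in Chicago and 0.19 in Houston, which is consistent with Houston showing more variation across campaign types.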