The goal of this task is to conduct an Exploratory Data Analysis (EDA) on marketing data provided by ######. The data contains weekly information about 3 different online marketing campaigns in one market.
These libraries are used throughout the code. Libraries specific to a single block of code are loaded with that code.
library(ggplot2) # Used to Create Visualization
library(dplyr) # Used to Manipulate and Clean Data
library(reshape2) # Used to Manipulate Data for Plotting
Exploring the data to find the data type of each column.
mydata <- read.csv(file="marketing_campaigns.csv", header=TRUE, sep=";")
str(mydata)
## 'data.frame': 91 obs. of 5 variables:
## $ Week : int 1 2 3 4 5 6 7 8 9 10 ...
## $ Campaign: chr "Aldebaran" "Aldebaran" "Aldebaran" "Aldebaran" ...
## $ Visits : int 27 64 80 93 120 130 146 173 170 218 ...
## $ Revenue : num 2.27 10.82 7.13 11.09 14.28 ...
## $ Cost : num 3.76 15.32 10.75 16.91 21.45 ...
The dataset has 5 columns: 4 are numeric, and “Campaign” is read as a character (chr) column that can be treated as a factor. The dataset has no NA’s or missing values.
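The str() output above shows the column types but does not itself show the absence of missing values. A minimal sketch of how both points could be checked is given below. It assumes a small, made-up data frame named toy with the same column layout (the values are illustrative only and one NA is planted on purpose); on the real data the same calls would be applied to mydata.

```r
# Hypothetical toy data with the same column layout as the str() output;
# the values are made up for illustration and one Revenue value is set to NA.
toy <- data.frame(
  Week     = c(1L, 2L, 1L, 2L),
  Campaign = c("Aldebaran", "Aldebaran", "Bartledan", "Bartledan"),
  Visits   = c(27L, 64L, 30L, 35L),
  Revenue  = c(2.27, 10.82, NA, 6.10),
  Cost     = c(3.76, 15.32, 5.00, 7.20),
  stringsAsFactors = FALSE
)

# Count missing values per column; all zeros would confirm "no NA's".
na_counts <- colSums(is.na(toy))
print(na_counts)

# Report the class of each column; Campaign stays character unless converted.
col_classes <- sapply(toy, class)
print(col_classes)

# Convert Campaign to a factor explicitly if a factor is wanted.
toy$Campaign <- factor(toy$Campaign)
print(levels(toy$Campaign))
```

On the real data, colSums(is.na(mydata)) returning zero for every column would support the statement that nothing is missing.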
Getting an initial intuition of the number of visits generated by the different campaigns over the 30-week period.
ggplot(mydata, aes(x = Week, y = Visits, col = Campaign)) +
geom_line() +
scale_x_continuous(breaks = c(0,5,10,15,20,25,30)) +
ggtitle("Visits vs Weeks") +
theme(plot.title = element_text(hjust = 0.5))
It can be observed from the graph above that there has been an impressive increase in the number of visits for the Aldebaran campaign.
There has been a slight increase in the number of visits for the Bartledan campaign.
There has not been much change in the number of visits for the Cottington campaign.
Introducing additional variables for better visualizations.
Let us assume that
Net_Income = Revenue - Cost
mydata2 <- mydata %>% mutate(Net_Income = Revenue-Cost)
head(mydata2)
## Week Campaign Visits Revenue Cost Net_Income
## 1 1 Aldebaran 27 2.269511 3.763627 -1.494116
## 2 2 Aldebaran 64 10.820403 15.322613 -4.502210
## 3 3 Aldebaran 80 7.132998 10.753533 -3.620535
## 4 4 Aldebaran 93 11.085813 16.906191 -5.820379
## 5 5 Aldebaran 120 14.282481 21.446570 -7.164089
## 6 6 Aldebaran 130 19.430408 27.218662 -7.788254
Gaining insight into the costs incurred and the revenue generated by the different campaigns.
mdata3 <- melt(mydata2, id=c("Week","Campaign"))
library(directlabels) # Library for adding names to lines on plots
ggplot(mdata3 , aes(x = Week, y = value, col = variable)) + geom_line() +
scale_x_continuous(breaks = c(0,5,10,15,20,25,30)) +
geom_dl(aes(label = variable) , method = list(dl.combine("last.points"), cex = 0.7)) +
facet_grid(Campaign~ .) + ggtitle("Timeline of Various Labels") +
theme(plot.title = element_text(hjust = 0.5))
Aldebaran campaign
Aldebaran is the only campaign whose revenue exceeds its cost by the end of the period, giving it a positive net income. It has also seen steep growth in visitor traffic.
However, not many of these visitors have translated into business, as the overall revenue of this campaign is almost the same as that of the other campaigns.
Bartledan campaign
Although cost is greater than revenue and the number of visitors shows only modest growth for the Bartledan campaign, it has generated higher revenue than Aldebaran. It seems that a very high percentage of visitors translate into business and revenue in this region.
Cottington campaign
Cottington is the only campaign that has seen no growth in the number of visitors and a sharp decline in revenue in the last few weeks.
The Cottington region initially had a positive net income, which turned negative after week 21.
Other observations
The revenue generated is directly related to costs.
An important metric here could be revenue generated per number of visitors.
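The observation that revenue follows cost could be quantified with a per-campaign correlation. The sketch below is an illustration only: the toy data frame and its values are invented, not taken from marketing_campaigns.csv, and Pearson correlation is just one possible choice of measure.

```r
# Made-up values for two campaigns; NOT taken from marketing_campaigns.csv.
toy <- data.frame(
  Campaign = rep(c("Aldebaran", "Bartledan"), each = 5),
  Revenue  = c(2, 4, 6, 8, 10,   5, 4, 6, 5, 7),
  Cost     = c(3, 6, 9, 12, 15,  1, 2, 3, 4, 5)
)

# Pearson correlation between Revenue and Cost, computed within each campaign.
rev_cost_cor <- sapply(
  split(toy, toy$Campaign),
  function(d) cor(d$Revenue, d$Cost)
)
print(round(rev_cost_cor, 2))
```

A value close to 1 for a campaign would support the claim that its revenue moves with its cost. On the real data, mydata2 would take the place of toy.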
Introducing the metric “RPV = Revenue / Visits” and visualizing it.
We visualize two graphs here. The first shows the revenue per visit and the second shows the net income.
mydatarpv <- mydata %>% mutate(RPV = Revenue/Visits)
ggplot(mydatarpv, aes(x = Week, y = RPV, col = Campaign)) + geom_line() +
scale_x_continuous(breaks = c(0,5,10,15,20,25,30))+
ggtitle("Revenue per visit Vs Weeks for 3 regions") +
theme(plot.title = element_text(hjust = 0.5))
All regions seem to show steady growth in revenue per visit, and the revenue per visit shows the following pattern:
Cottington > Bartledan > Aldebaran
The net income from the campaigns could be an important decision-making parameter:
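The ordering read off the plot could also be checked numerically by averaging RPV per campaign. The following is a minimal sketch on invented numbers (the toy data frame is hypothetical and was chosen so that the ordering matches the one stated above). Averaging the weekly RPV is an assumption; total revenue divided by total visits would be an equally reasonable summary.

```r
# Made-up values; NOT taken from marketing_campaigns.csv.
toy <- data.frame(
  Campaign = rep(c("Aldebaran", "Bartledan", "Cottington"), each = 2),
  Visits   = c(100, 200, 50, 50, 20, 20),
  Revenue  = c(10, 30, 20, 30, 30, 50)
)

# Revenue per visit for every weekly row.
toy$RPV <- toy$Revenue / toy$Visits

# Mean weekly RPV per campaign, sorted from highest to lowest.
mean_rpv <- aggregate(RPV ~ Campaign, data = toy, FUN = mean)
ranked <- mean_rpv[order(-mean_rpv$RPV), ]
print(ranked)
```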
mydata21 <- mydata2[ c(1:2,6) ]
ggplot(data=mydata21, aes(x = Week, y=Net_Income, col=Campaign))+ geom_line() +
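stat_smooth(inherit.aes = TRUE, se = FALSE, span = 0.8, show.legend = TRUE) +
# Week is numeric, so pad the axis with a continuous scale to leave room for the line labels
scale_colour_discrete(guide = 'none') + scale_x_continuous(expand = c(0, 1)) +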
geom_dl(aes(label = Campaign),
method = list(dl.combine("first.points","last.points"), cex = 0.8))+
ggtitle("Development of Net Income in different markets ")+
theme(plot.title = element_text(hjust = 0.5))
The net income shows the following patterns:
Aldebaran >>> Cottington > Bartledan
A worrying fact is that Cottington has gone from a very high positive net income to a negative net income.
The Aldebaran campaign seems to inversely affect the Cottington campaign. It is possible that some business has been transferred from Cottington to Aldebaran. We explore this further.
Bartledan has never generated a positive net income.
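To make the timing of these changes explicit, one could locate the first week in which each campaign's net income changes sign. The sketch below uses a hypothetical helper, first_sign_change, and invented numbers. It is one possible way to find such a crossover, not the method used to produce the plots above.

```r
# Made-up net income series; NOT taken from marketing_campaigns.csv.
toy <- data.frame(
  Week       = rep(1:5, times = 2),
  Campaign   = rep(c("Aldebaran", "Cottington"), each = 5),
  Net_Income = c(-3, -2, -1, 1, 2,   4, 3, 1, -1, -2)
)

# Hypothetical helper: first week whose net income has a different sign
# than the week before, or NA if the sign never changes.
# A net income of exactly zero is treated as a sign of its own.
first_sign_change <- function(d) {
  d <- d[order(d$Week), ]
  flips <- which(diff(sign(d$Net_Income)) != 0)
  if (length(flips) == 0) return(NA_integer_)
  d$Week[flips[1] + 1]
}

crossover <- sapply(split(toy, toy$Campaign), first_sign_change)
print(crossover)
```

If two campaigns change sign in the same week in opposite directions, that is consistent with, but does not prove, business moving from one campaign to the other.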
Let us look at the current overall spending and net income across the 3 markets:
ggplot(data = mydata2, aes(x = Week, y = Cost, fill = Campaign)) +
geom_bar(stat = "identity") +
ggtitle("Overall Weekly Spendings across all markets ")+
theme(plot.title = element_text(hjust = 0.5))
ggplot(data=mydata2, aes(x = Week, y = Net_Income,fill = Campaign)) +
geom_bar(stat = "identity") +
ggtitle("Overall Weekly Earnings across all markets ")+
theme(plot.title = element_text(hjust = 0.5))
From the bar plots above, we observe:
The Aldebaran campaign has had the least spending, but its net income has slowly risen to the positive side.
The Cottington campaign has had very high spending and a decrease in net income.
The Cottington campaign reached a negative net income after week 21, and Aldebaran reached a positive net income from the same week. Both campaigns might be addressing the same customer group.
The Bartledan campaign has seen the largest losses although there has been a steady increase in spending.
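The comparisons read off the stacked bars could be backed by per-campaign totals. A minimal sketch follows, again on invented numbers (the toy data frame is hypothetical); on the real data, mydata2 already contains the Net_Income column.

```r
# Made-up values; NOT taken from marketing_campaigns.csv.
toy <- data.frame(
  Campaign = rep(c("Aldebaran", "Bartledan", "Cottington"), each = 2),
  Revenue  = c(5, 9, 10, 12, 40, 30),
  Cost     = c(6, 7, 14, 18, 35, 38)
)
toy$Net_Income <- toy$Revenue - toy$Cost

# Total spending and total net income per campaign.
totals <- aggregate(cbind(Cost, Net_Income) ~ Campaign, data = toy, FUN = sum)
print(totals)
```

Sorting totals by Cost or Net_Income gives a direct ranking of the campaigns by spending and by earnings.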
From the earlier visualizations, we observed an overall increase in revenue per visit for all three regions, with the Cottington region providing very high overall revenue.
This high revenue is not justified, as costs have been high and the net income from this campaign has been in sharp decline.
Cottington is also the only region where the number of visitors has not grown.
It is possible that other campaigns target the same group, so the budget for this campaign may be overspent.
The Bartledan campaign has seen growth in the number of visitors, but these visitors do not translate into as much business for the company as in the Cottington campaign. The current spending might only need a slight increase in this market to keep the number of visitors growing. There has been a sharp rise in the deficit between revenue and cost over the last 3 weeks.
The Aldebaran campaign has shown significant potential: although its revenue per visit is low, it shows potential for growth and for producing more revenue. With low spending, the overall business brought to the company is impressive. A larger marketing budget could really help this campaign.