This lecture introduces one-way ANOVA using a simulated business dataset in R.

The Problem:

A company wants to evaluate the effectiveness of different marketing channels (Social Media, Email, Search Engine) on average monthly sales. They need to know if there’s a significant difference in sales performance between these channels to better allocate their advertising budget.

Step 1: Set Up and Data Generation

We will start by generating a mock dataset in R to simulate our business problem. Fixing the random seed ensures the example is fully reproducible.

# Create a reproducible dataset for our analysis
set.seed(42) # For reproducibility

# Sales data for different marketing channels
sales_social_media <- rnorm(50, mean = 55000, sd = 12000)
sales_email <- rnorm(50, mean = 68000, sd = 10000)
sales_search_engine <- rnorm(50, mean = 66000, sd = 11000)

# Sales data for a second grouping variable, store location, simulated with nearly equal means
sales_downtown <- rnorm(50, mean = 60000, sd = 1100)
sales_suburb <- rnorm(50, mean = 59500, sd = 1300)
sales_mall <- rnorm(50, mean = 60000, sd = 1200)

# Combine into a data frame
marketing_data <- data.frame(
  Sales = c(sales_social_media, sales_email, sales_search_engine),
  Channel = factor(c(rep("Social Media", 50), rep("Email", 50), rep("Search Engine", 50))))
  
# Combine the location sales into a second data frame
location_data <- data.frame(
  Sales = c(sales_downtown, sales_suburb, sales_mall),
  Location = factor(c(rep("Downtown", 50), rep("Suburb", 50), rep("Mall", 50))))

# Take a look at the data structure
str(marketing_data)
'data.frame':   150 obs. of  2 variables:
 $ Sales  : num  71452 48224 59358 62594 59851 ...
 $ Channel: Factor w/ 3 levels "Email","Search Engine",..: 3 3 3 3 3 3 3 3 3 3 ...
summary(marketing_data)
     Sales                Channel  
 Min.   :23123   Email        :50  
 1st Qu.:54983   Search Engine:50  
 Median :62750   Social Media :50  
 Mean   :62638                     
 3rd Qu.:71324                     
 Max.   :95721                     

Step 2: Exploratory Data Analysis (EDA)

Before we run any formal tests, it’s always good to visually inspect our data. Boxplots are a great way to do this for categorical predictors.

# Load a plotting library (install it first only if it is missing)
if (!requireNamespace("ggplot2", quietly = TRUE)) install.packages("ggplot2")
library(ggplot2)

# Create a boxplot of Sales by Channel
ggplot(marketing_data, aes(x = Channel, y = Sales, fill = Channel)) +
  geom_boxplot() +
  labs(title = "Boxplot of Sales by Marketing Channel",
       y = "Monthly Sales ($)", x = "Marketing Channel") +
  theme_minimal()
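
A numeric summary can complement the boxplot. The sketch below rebuilds marketing_data with the same seed and call order as Step 1 so that it runs on its own, then tabulates the sample size, mean, and standard deviation of each channel. The name channel_summary is introduced here for illustration only and is not part of the original analysis.

```r
# Recreate the channel data as in Step 1 (same seed, same order of rnorm() calls)
set.seed(42)
sales_social_media <- rnorm(50, mean = 55000, sd = 12000)
sales_email <- rnorm(50, mean = 68000, sd = 10000)
sales_search_engine <- rnorm(50, mean = 66000, sd = 11000)
marketing_data <- data.frame(
  Sales = c(sales_social_media, sales_email, sales_search_engine),
  Channel = factor(c(rep("Social Media", 50), rep("Email", 50), rep("Search Engine", 50))))

# Per-channel sample size, mean, and standard deviation
group_n <- tapply(marketing_data$Sales, marketing_data$Channel, length)
group_mean <- tapply(marketing_data$Sales, marketing_data$Channel, mean)
group_sd <- tapply(marketing_data$Sales, marketing_data$Channel, sd)

# channel_summary is a hypothetical name used only in this sketch
channel_summary <- cbind(n = group_n, mean = group_mean, sd = group_sd)
round(channel_summary)
```

Comparing the group means against the group standard deviations gives a rough sense of whether the gaps between channels are large relative to the spread within each channel, which is the comparison the F test formalizes.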

Step 3: One-Way ANOVA (Significant Result)

Hypotheses:

\(H_0\): The mean monthly sales are equal across all marketing channels.

\(H_a\): At least one marketing channel’s mean sales differs from the others.

We’ll use the aov() function to perform the analysis of variance.

# Run the one-way ANOVA
anova_model_1 <- aov(Sales ~ Channel, data = marketing_data)

# View the ANOVA table
summary(anova_model_1)
             Df    Sum Sq   Mean Sq F value   Pr(>F)    
Channel       2 5.425e+09 2.713e+09   21.38 7.06e-09 ***
Residuals   147 1.865e+10 1.269e+08                     
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Interpretation:

The Pr(>F) column is the p-value of the F test. If it’s less than our significance level (e.g., 0.05), we reject the null hypothesis. Here F = 21.38 on 2 and 147 degrees of freedom and the p-value is 7.06e-09, far below 0.05, so we reject \(H_0\).
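
To see where the F value and p-value come from, the following sketch recomputes them by hand from the between-group and within-group sums of squares and compares the result with aov(). It rebuilds marketing_data as in Step 1 so that it runs on its own. The names ss_between, ss_within, f_manual, and p_manual are illustrative and do not appear in the original analysis.

```r
# Recreate the channel data as in Step 1 (same seed, same order of rnorm() calls)
set.seed(42)
sales_social_media <- rnorm(50, mean = 55000, sd = 12000)
sales_email <- rnorm(50, mean = 68000, sd = 10000)
sales_search_engine <- rnorm(50, mean = 66000, sd = 11000)
marketing_data <- data.frame(
  Sales = c(sales_social_media, sales_email, sales_search_engine),
  Channel = factor(c(rep("Social Media", 50), rep("Email", 50), rep("Search Engine", 50))))

# Ingredients: grand mean, group means, and group sizes
grand_mean <- mean(marketing_data$Sales)
group_means <- tapply(marketing_data$Sales, marketing_data$Channel, mean)
group_sizes <- tapply(marketing_data$Sales, marketing_data$Channel, length)

# Between-group variation: how far each group mean sits from the grand mean
ss_between <- sum(group_sizes * (group_means - grand_mean)^2)

# Within-group variation: how far each observation sits from its own group mean
# (ave() returns the group mean for every row)
ss_within <- sum((marketing_data$Sales - ave(marketing_data$Sales, marketing_data$Channel))^2)

# Degrees of freedom: k - 1 between groups, N - k within groups
df_between <- nlevels(marketing_data$Channel) - 1
df_within <- nrow(marketing_data) - nlevels(marketing_data$Channel)

# F is the ratio of the two mean squares; the p-value is the upper tail of the F distribution
f_manual <- (ss_between / df_between) / (ss_within / df_within)
p_manual <- pf(f_manual, df_between, df_within, lower.tail = FALSE)

# Compare with the values reported by aov()
aov_table <- summary(aov(Sales ~ Channel, data = marketing_data))[[1]]
c(f_manual = f_manual, f_aov = aov_table[["F value"]][1])
c(p_manual = p_manual, p_aov = aov_table[["Pr(>F)"]][1])
```

The manual and aov() values should agree, which shows that the ANOVA table is the ratio of between-group to within-group variability, each scaled by its degrees of freedom.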

Business Context:

A significant p-value means we have strong evidence that the average sales are not the same across all marketing channels. The marketing team now has data to support that at least one channel performs differently from the others, although the ANOVA alone does not say which.

Step 4: Post-Hoc Test

Our ANOVA was significant, but it only tells us that at least one channel differs, not which ones. A post-hoc test, like Tukey’s Honestly Significant Difference (HSD), compares every pair of channels while controlling the family-wise error rate.

# Perform Tukey's HSD test
tukey_test <- TukeyHSD(anova_model_1)
tukey_test
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = Sales ~ Channel, data = marketing_data)

$Channel
                                 diff       lwr        upr     p adj
Search Engine-Email         -4670.776 -10004.46   662.9052 0.0989231
Social Media-Email         -14435.076 -19768.76 -9101.3942 0.0000000
Social Media-Search Engine  -9764.299 -15097.98 -4430.6181 0.0000797

Interpretation:

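The output shows, for each pair of channels, the estimated difference in mean sales (diff), the 95% confidence interval for that difference (lwr, upr), and the adjusted p-value for the comparison (p adj).

If p adj is less than 0.05, the difference is significant. Here “Social Media-Email” (diff = -14435, p adj < 0.0001) and “Social Media-Search Engine” (diff = -9764, p adj = 0.00008) are significant. “Search Engine-Email” (p adj = 0.099) is not, and its confidence interval includes zero.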
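
The 0.05 rule can also be applied programmatically rather than by reading the table. The sketch below rebuilds the data and model so that it runs on its own, converts the TukeyHSD() result for Channel into a data frame, and flags the pairs whose adjusted p-value falls below 0.05. The names tukey_table and significant are illustrative additions, not part of the original analysis.

```r
# Recreate the channel data as in Step 1 (same seed, same order of rnorm() calls)
set.seed(42)
sales_social_media <- rnorm(50, mean = 55000, sd = 12000)
sales_email <- rnorm(50, mean = 68000, sd = 10000)
sales_search_engine <- rnorm(50, mean = 66000, sd = 11000)
marketing_data <- data.frame(
  Sales = c(sales_social_media, sales_email, sales_search_engine),
  Channel = factor(c(rep("Social Media", 50), rep("Email", 50), rep("Search Engine", 50))))

# Fit the ANOVA and run Tukey's HSD on it
anova_model_1 <- aov(Sales ~ Channel, data = marketing_data)
tukey_test <- TukeyHSD(anova_model_1)

# The result is a list with one matrix per factor; rows are pairs of levels
tukey_table <- as.data.frame(tukey_test$Channel)

# Flag pairs whose adjusted p-value is below the 0.05 significance level
tukey_table$significant <- tukey_table[["p adj"]] < 0.05
tukey_table

# Names of the pairs that differ significantly
rownames(tukey_table)[tukey_table$significant]
```

A row is flagged exactly when its confidence interval (lwr, upr) excludes zero, so the significant column and the intervals tell the same story.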

Business Context:

This test helps us get specific. We can conclude that Email and Search Engine each produce significantly higher mean sales than Social Media, while Email and Search Engine cannot be distinguished from each other. This information allows the company to make targeted decisions, such as shifting budget from the weakest channel toward the stronger ones.

Step 5: One-Way ANOVA (Non-Significant Result)

Now, let’s run an ANOVA on the Location variable in location_data.

# Run the one-way ANOVA for Location
anova_model_location <- aov(Sales ~ Location, data = location_data)

# View the ANOVA table
summary(anova_model_location)
             Df    Sum Sq Mean Sq F value Pr(>F)  
Location      2   7037549 3518774   2.513 0.0845 .
Residuals   147 205794102 1399960                 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Interpretation:

The p-value for Location is 0.0845, which is greater than 0.05, so we fail to reject the null hypothesis of equal mean sales across locations.
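
The same decision can be made in code by pulling the p-value out of the ANOVA table and comparing it with the significance level. The sketch below repeats all six rnorm() calls from Step 1 so the random-number stream matches, and it labels the locations Downtown, Suburb, and Mall after the variable names sales_downtown, sales_suburb, and sales_mall. The names alpha, p_value, and decision are illustrative.

```r
# Recreate the Step 1 draws in the same order so the location data match
set.seed(42)
sales_social_media <- rnorm(50, mean = 55000, sd = 12000)
sales_email <- rnorm(50, mean = 68000, sd = 10000)
sales_search_engine <- rnorm(50, mean = 66000, sd = 11000)
sales_downtown <- rnorm(50, mean = 60000, sd = 1100)
sales_suburb <- rnorm(50, mean = 59500, sd = 1300)
sales_mall <- rnorm(50, mean = 60000, sd = 1200)

location_data <- data.frame(
  Sales = c(sales_downtown, sales_suburb, sales_mall),
  Location = factor(c(rep("Downtown", 50), rep("Suburb", 50), rep("Mall", 50))))

# Fit the model and extract the p-value for Location from the ANOVA table
anova_model_location <- aov(Sales ~ Location, data = location_data)
p_value <- summary(anova_model_location)[[1]][["Pr(>F)"]][1]

# Compare with the chosen significance level
alpha <- 0.05
decision <- if (p_value < alpha) "reject H0" else "fail to reject H0"
cat(sprintf("p = %.4f -> %s at alpha = %.2f\n", p_value, decision, alpha))
```

Choosing alpha before looking at the p-value keeps the decision rule honest. A result such as this one, which is below 0.10 but above 0.05, would flip if the threshold were chosen after the fact.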

Business Context:

This non-significant result is also valuable! It tells the company that there is no statistical evidence at the 5% level that mean sales differ by store location. Failing to reject the null hypothesis is not proof that the locations perform identically, but these data give no reason to reallocate resources between locations based on sales performance.
