E-News Express — Business Statistics Project

Problem Statement

E-News Express wants to test whether a new landing page improves user engagement and conversions compared to the old page.

Objectives

  1. To perform univariate analysis
  2. To perform bivariate analysis
  3. To provide insight and business recommendations
# Read the data
data <- read.csv("C:\\Users\\njugu\\Downloads\\New folder\\abtest.csv")

# View first few rows
head(data)
##   user_id     group landing_page time_spent_on_the_page converted
## 1  546592   control          old                   3.48        no
## 2  546468 treatment          new                   7.13       yes
## 3  546462 treatment          new                   4.40        no
## 4  546567   control          old                   3.02        no
## 5  546459 treatment          new                   4.75       yes
## 6  546558   control          old                   5.28       yes
##   language_preferred
## 1            Spanish
## 2            English
## 3            Spanish
## 4             French
## 5            Spanish
## 6            English
# View last few rows
tail(data)
##     user_id     group landing_page time_spent_on_the_page converted
## 95   546550   control          old                   3.05        no
## 96   546446 treatment          new                   5.15        no
## 97   546544   control          old                   6.52       yes
## 98   546472 treatment          new                   7.07       yes
## 99   546481 treatment          new                   6.20       yes
## 100  546483 treatment          new                   5.86       yes
##     language_preferred
## 95             English
## 96             Spanish
## 97             English
## 98             Spanish
## 99             Spanish
## 100            English
dim(data)
## [1] 100   6
str(data)
## 'data.frame':    100 obs. of  6 variables:
##  $ user_id               : int  546592 546468 546462 546567 546459 546558 546448 546581 546461 546548 ...
##  $ group                 : chr  "control" "treatment" "treatment" "control" ...
##  $ landing_page          : chr  "old" "new" "new" "old" ...
##  $ time_spent_on_the_page: num  3.48 7.13 4.4 3.02 4.75 ...
##  $ converted             : chr  "no" "yes" "no" "no" ...
##  $ language_preferred    : chr  "Spanish" "English" "Spanish" "French" ...
summary(data)
##     user_id          group           landing_page       time_spent_on_the_page
##  Min.   :546443   Length:100         Length:100         Min.   : 0.190        
##  1st Qu.:546468   Class :character   Class :character   1st Qu.: 3.880        
##  Median :546493   Mode  :character   Mode  :character   Median : 5.415        
##  Mean   :546517                                         Mean   : 5.378        
##  3rd Qu.:546567                                         3rd Qu.: 7.022        
##  Max.   :546592                                         Max.   :10.710        
##   converted         language_preferred
##  Length:100         Length:100        
##  Class :character   Class :character  
##  Mode  :character   Mode  :character  
##                                       
##                                       
## 
colSums(is.na(data))
##                user_id                  group           landing_page 
##                      0                      0                      0 
## time_spent_on_the_page              converted     language_preferred 
##                      0                      0                      0
sum(duplicated(data))
## [1] 0
# Categorical variables
barplot(table(data$landing_page),
        main = "Distribution of Landing Page",
        col = "lightblue")

barplot(table(data$language_preferred),
        main = "Distribution of Preferred Language",
        col = c("skyblue", "lightgreen", "orange"))

# Numerical variable: time spent on the page
hist(data$time_spent_on_the_page,
     main = "Histogram of Time Spent on Page",
     xlab = "Time (minutes)",
     col = "lightgreen",
     breaks = 10)

boxplot(data$time_spent_on_the_page,
        main = "Boxplot of Time Spent on Page",
        col = "lightblue")

Bivariate Analysis & Hypothesis Testing

Do users spend more time on the new landing page than the old landing page?

H_0 (Null): Mean time spent on the new page ≤ mean time on the old page

H_1 (Alternative): Mean time spent on the new page > mean time on the old page

alpha <- 0.05

new_page <- data$time_spent_on_the_page[data$landing_page == "new"]
old_page <- data$time_spent_on_the_page[data$landing_page == "old"]

boxplot(time_spent_on_the_page ~ landing_page,
        data = data,
        main = "Time Spent on Page by Landing Page",
        col = c("orange", "lightgreen"))

t.test(new_page, old_page, alternative = "greater")
## 
##  Welch Two Sample t-test
## 
## data:  new_page and old_page
## t = 3.7868, df = 87.975, p-value = 0.0001392
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  0.9485536       Inf
## sample estimates:
## mean of x mean of y 
##    6.2232    4.5324
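
The decision at the 5% level can be made explicit by comparing the reported p-value with alpha. The sketch below is illustrative only: `decide()` is a hypothetical helper that is not part of the original analysis, and the p-value and sample means are copied from the Welch t-test output above rather than recomputed.

```r
# Hypothetical helper (not in the original analysis): turn a p-value and a
# significance level into a reject / fail-to-reject statement.
decide <- function(p_value, alpha = 0.05) {
  if (p_value < alpha) "Reject H_0" else "Fail to reject H_0"
}

alpha <- 0.05
p_time <- 0.0001392           # copied from the Welch t-test output above
mean_diff <- 6.2232 - 4.5324  # difference of the reported sample means (minutes)

decision_time <- decide(p_time, alpha)
print(decision_time)          # "Reject H_0"
print(mean_diff)              # about 1.69 minutes longer on the new page
```

When the fitted test object is available, its `p.value` element could be passed to the helper instead of a copied number.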

Is the conversion rate higher for the new page?

H_0: Conversion rate (new) ≤ Conversion rate (old)

H_1: Conversion rate (new) > Conversion rate (old)

conversion_table <- table(data$landing_page, data$converted)
conversion_table
##      
##       no yes
##   new 17  33
##   old 29  21
# prop.test() treats the first column as successes, so put "yes" first
prop.test(conversion_table[, c("yes", "no")], alternative = "greater")
## 
##  2-sample test for equality of proportions with continuity correction
## 
## data:  conversion_table[, c("yes", "no")]
## X-squared = 4.8712, df = 1, p-value = 0.01365
## alternative hypothesis: greater
## 95 percent confidence interval:
##  0.06086519 1.00000000
## sample estimates:
## prop 1 prop 2 
##   0.66   0.42
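
When `prop.test()` is given a two-column table, it treats the first column as successes and the second as failures, so the column order of `conversion_table` (`no`, `yes`) matters. One way to sidestep that ambiguity, sketched below, is to pass the number of conversions and the group sizes explicitly. The counts are taken from `conversion_table`; the variable names are illustrative assumptions.

```r
# Counts taken from conversion_table: conversions ("yes") and users per page.
converted_yes <- c(new = 33, old = 21)
users         <- c(new = 50, old = 50)

# One-sided test of H_1: conversion rate (new) > conversion rate (old).
res <- prop.test(x = converted_yes, n = users, alternative = "greater")

print(res$estimate)        # sample conversion rates: 0.66 (new) and 0.42 (old)
print(res$p.value < 0.05)  # TRUE: H_0 is rejected at the 5% level
```

Passing `x` and `n` directly keeps the meaning of "success" visible in the call itself, which makes the direction of a one-sided alternative easier to check.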

Is conversion dependent on preferred language?

H_0: Conversion and language are independent

H_1: Conversion and language are not independent
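
A chi-square test of independence for these hypotheses could be run as sketched below. This is a minimal sketch, not the original analysis code: it assumes the column names shown by `str(data)`, and `test_independence()` is a hypothetical wrapper. To stay self-contained it is demonstrated on only the 12 rows printed by `head()` and `tail()`, so its output is not the result for the full data set.

```r
# Hypothetical wrapper: chi-square test of independence between preferred
# language and conversion status for a data frame with these two columns.
test_independence <- function(df) {
  tab <- table(df$language_preferred, df$converted)
  chisq.test(tab)
}

# Demonstration on the 12 rows printed by head() and tail() only.
# With so few rows the expected counts are small, so chisq.test() warns that
# the approximation may be incorrect; the warning is suppressed here.
demo <- data.frame(
  language_preferred = c("Spanish", "English", "Spanish", "French",
                         "Spanish", "English", "English", "Spanish",
                         "English", "Spanish", "Spanish", "English"),
  converted = c("no", "yes", "no", "no", "yes", "yes",
                "no", "no", "yes", "yes", "yes", "yes")
)

res_lang <- suppressWarnings(test_independence(demo))
print(res_lang$parameter)  # degrees of freedom: (3 - 1) * (2 - 1) = 2
```

On the full data, `test_independence(data)` would return a p-value to compare against alpha in the same way as the other tests.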

Is time spent on the new page the same across different languages?

H_0: Mean time spent is equal across languages

H_1: At least one language has a different mean time

new_page_data <- data[data$landing_page == "new", ]

anova_result <- aov(time_spent_on_the_page ~ language_preferred,
                    data = new_page_data)
summary(anova_result)
##                    Df Sum Sq Mean Sq F value Pr(>F)
## language_preferred  2   5.68   2.838   0.854  0.432
## Residuals          47 156.10   3.321
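
The ANOVA table can be cross-checked by hand: the F value is the ratio of the two mean squares, and the p-value is the upper tail of the F distribution with 2 and 47 degrees of freedom. The sketch below redoes that arithmetic from the rounded numbers printed above, so small rounding differences are expected.

```r
# Values copied (rounded) from the ANOVA table above.
ms_language <- 2.838   # mean square for language_preferred
ms_residual <- 3.321   # residual mean square
df1 <- 2               # degrees of freedom for language_preferred
df2 <- 47              # residual degrees of freedom

f_value <- ms_language / ms_residual
p_value <- pf(f_value, df1, df2, lower.tail = FALSE)

print(round(f_value, 3))  # about 0.855, matching the table up to rounding
print(round(p_value, 3))  # about 0.432
print(p_value < 0.05)     # FALSE: fail to reject H_0 at the 5% level
```

Since the p-value is well above 0.05, the data give no evidence that mean time on the new page differs by preferred language.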

Conclusion & Business Recommendations

Conclusion

This study evaluated whether the new landing page introduced by E-News Express improves user engagement and increases subscriber conversions. The conclusions below are based on statistical tests of the A/B experiment data at a 5% significance level.

The analysis revealed that users spend significantly more time on the new landing page than on the existing one: a mean of 6.22 minutes versus 4.53 minutes (Welch t-test, t = 3.79, p = 0.0001392). This indicates that the redesigned layout and content are more engaging and better at retaining user attention.

Further, the conversion rate for the new landing page (33 of 50 users, 66%) was found to be significantly higher than that of the old landing page (21 of 50 users, 42%). This is consistent with higher engagement on the new page supporting users’ decisions to subscribe, and it indicates that the new design is more effective in achieving the company’s primary business objective.

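The chi-square test examining the relationship between conversion status and preferred language showed that conversion is dependent on language preference. This implies that users’ likelihood of subscribing varies based on the language in which the content is presented.

The one-way ANOVA on time spent on the new page found no significant difference in mean time across preferred languages (F = 0.854, p = 0.432), so engagement with the new page appears similar for English, French, and Spanish users.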

Business Recommendations

Based on the findings of this analysis, the following recommendations are made to E-News Express management:

  1. Adopt the new landing page permanently. Since the new landing page significantly improves both user engagement and conversion rates, it should replace the existing landing page as the primary interface for new visitors.
  2. Optimize content based on language preference. Time spent on the new page did not differ significantly across languages (ANOVA, p = 0.432), so the focus should be on conversion: the company should analyze which language versions convert best and improve content quality, layout, and recommendations for underperforming language groups.
  3. Use personalized content strategies. Tailoring recommended articles and layout to user language preference and behavior can further enhance engagement and increase subscription likelihood.