DEPENDENT T-TEST & WILCOXON SIGNED RANK

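Used to test whether there is a difference between the Before (PreCampaignSales) and After (PostCampaignSales) scores. The dependent t-test compares the means of the paired scores; the Wilcoxon signed rank test is the nonparametric alternative used when the difference scores are not normally distributed.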

NULL HYPOTHESIS (H0)

There is no difference between PreCampaignSales and PostCampaignSales.

ALTERNATE HYPOTHESIS (H1)

There is a difference between PreCampaignSales and PostCampaignSales.

#install.packages("readxl")
library(readxl)
dataset <- read_excel("C:\\Users\\navya\\Downloads\\A6R4.xlsx")

CALCULATE THE DIFFERENCE SCORES

Before <- dataset$PreCampaignSales
After <- dataset$PostCampaignSales

Differences <- After - Before

HISTOGRAM

hist(Differences,
     main = "Histogram of Difference Scores",
     xlab = "Value",
     ylab = "Frequency",
     col = "blue",
     border = "black",
     breaks = 20)

# DIRECTIONS: Answer the questions below directly in your code.

# QUESTION 1: Is the histogram symmetrical, positively skewed, or negatively skewed?
# ANSWER: The histogram appears negatively skewed.

# QUESTION 2: Did the histogram look too flat, too tall, or did it have a proper bell curve?
# ANSWER: The histogram has a moderately high peak, so it looks too tall.
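The visual answers above can be cross-checked numerically. The following is a minimal sketch, assuming moment-based definitions of skewness and excess kurtosis; shape_stats and the toy vector are hypothetical names introduced for illustration and are not part of the original analysis.

```r
# Moment-based skewness and excess kurtosis using base R only.
# shape_stats() is a hypothetical helper, not a function from any package.
shape_stats <- function(x) {
  x <- x[!is.na(x)]            # drop missing values
  centered <- x - mean(x)
  m2 <- mean(centered^2)       # second central moment
  m3 <- mean(centered^3)       # third central moment
  m4 <- mean(centered^4)       # fourth central moment
  c(skewness = m3 / m2^1.5,
    excess_kurtosis = m4 / m2^2 - 3)
}

toy <- c(1, 2, 3, 4, 5)        # hypothetical, perfectly symmetric data
toy_shape <- shape_stats(toy)
toy_shape
```

Applied to Differences, a negative skewness would agree with a longer left tail, and a positive excess kurtosis would agree with a histogram that looks too tall; values near zero would suggest a shape closer to a bell curve.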

SHAPIRO-WILK TEST

shapiro.test(Differences)
## 
##  Shapiro-Wilk normality test
## 
## data:  Differences
## W = 0.94747, p-value = 0.01186
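The Shapiro-Wilk p-value is what decides which test to run. The sketch below spells out that decision rule using the p-value copied from the output above; the 0.05 cutoff and the variable names are assumptions made for illustration.

```r
# Decision rule sketch: normal differences -> dependent t-test,
# non-normal differences -> Wilcoxon signed rank test.
p_shapiro <- 0.01186   # p-value copied from the Shapiro-Wilk output above
alpha <- 0.05          # assumed significance level

differences_look_normal <- p_shapiro >= alpha
chosen_test <- if (differences_look_normal) {
  "dependent t-test"
} else {
  "Wilcoxon signed rank test"
}
chosen_test
```

Because 0.01186 is below 0.05, the difference scores are treated as non-normal, which is consistent with the Wilcoxon signed rank test being run later in this script.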

BOXPLOT

boxplot(Differences,
        main = "Distribution of Score Differences (After - Before)",
        ylab = "Difference in Scores",
        col = "blue",
        border = "darkblue")

# DIRECTIONS:

# QUESTION 1: How many dots are in your boxplot?
# A) No dots.
# B) One or two dots. 
# C) Many dots.
# ANSWER: B

# QUESTION 2: Where are the dots in your boxplot?
# A) There are no dots.
# B) Very close to the whiskers (lines of the boxplot).
# C) Far from the whiskers (lines of the boxplot).
# ANSWER: B

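# QUESTION 3: Based on the dots and their location, is the data normal?
# ANSWER: Only one dot falls outside the whiskers, and it is close to them, so the boxplot alone suggests the data are approximately normal. The Shapiro-Wilk test above (p-value = 0.01186) still indicates a departure from normality.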
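The dots in a boxplot can also be counted in code rather than by eye. This is a small sketch using boxplot.stats() from base R (grDevices) on a hypothetical toy vector; the same call on Differences would list the points drawn as dots.

```r
# boxplot.stats() returns the values drawn as dots in its $out element
# (points beyond 1.5 times the hinge spread from the box).
toy_scores <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 50)   # hypothetical data with one extreme value

toy_outliers <- boxplot.stats(toy_scores)$out     # values plotted as dots
n_dots <- length(toy_outliers)                    # how many dots the boxplot shows

toy_outliers
n_dots
```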

DESCRIPTIVE STATISTICS

DESCRIPTIVES FOR BEFORE SCORES

mean(Before, na.rm = TRUE)
## [1] 25154.53
median(Before, na.rm = TRUE)
## [1] 24624
sd(Before, na.rm = TRUE)
## [1] 12184.4
length(Before)
## [1] 60

DESCRIPTIVES FOR AFTER SCORES

mean(After, na.rm = TRUE)
## [1] 26873.45
median(After, na.rm = TRUE)
## [1] 25086
sd(After, na.rm = TRUE)
## [1] 14434.37
length(After)
## [1] 60

WILCOXON SIGNED RANK TEST

wilcox.test(Before, After, paired = TRUE)
## 
##  Wilcoxon signed rank test with continuity correction
## 
## data:  Before and After
## V = 640, p-value = 0.0433
## alternative hypothesis: true location shift is not equal to 0
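To make the V statistic less of a black box, the sketch below recomputes it by hand on a small hypothetical data set. With paired = TRUE, wilcox.test() works on the differences x - y (here Before - After), drops zero differences, ranks the absolute differences, and sums the ranks of the positive ones. The toy vectors are invented for illustration only.

```r
# Hypothetical paired data (no ties, no zero differences).
toy_before <- c(10, 12, 9, 15, 11, 14)
toy_after  <- c(12, 11, 13, 20, 17, 21)

d <- toy_before - toy_after        # same order as wilcox.test(x, y): x - y
d <- d[d != 0]                     # zero differences are dropped
ranks <- rank(abs(d))              # rank the absolute differences
v_manual <- sum(ranks[d > 0])      # V = sum of ranks of positive differences

v_builtin <- unname(
  wilcox.test(toy_before, toy_after, paired = TRUE)$statistic
)

c(manual = v_manual, builtin = v_builtin)
```

Since V counts the ranks where Before exceeds After, a V that is small relative to its maximum of n(n + 1)/2 points toward After being larger.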

DETERMINE STATISTICAL SIGNIFICANCE

EFFECT SIZE FOR WILCOXON SIGNED RANK TEST

#install.packages("rstatix")
library(rstatix)
## 
## Attaching package: 'rstatix'
## The following object is masked from 'package:stats':
## 
##     filter

CALCULATE RANK BISERIAL CORRELATION (EFFECT SIZE)

df_long <- data.frame(
  id = rep(1:length(Before), 2),
  time = rep(c("Before", "After"), each = length(Before)),
  scores = c(Before, After)
  )
wilcox_effsize(df_long, scores ~ time, paired = TRUE)
## # A tibble: 1 × 7
##   .y.    group1 group2 effsize    n1    n2 magnitude
## * <chr>  <chr>  <chr>    <dbl> <int> <int> <ord>    
## 1 scores After  Before   0.261    60    60 small
# QUESTION
# Q1) What is the size of the effect?
# Answer: The effect size (r) is 0.261, which rstatix classifies as small.
#  
# Q2) Which group had the higher average score?
# Answer: The after-campaign scores (PostCampaignSales) were higher (median = 25086.00, mean = 26873.45) than the before-campaign scores (PreCampaignSales; median = 24624.00, mean = 25154.53).
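As a rough sanity check on the effect size, the rstatix documentation describes r as the Z statistic divided by the square root of the sample size. The sketch below approximates Z from the two-sided p-value reported by wilcox.test above; this is an assumption made for a back-of-the-envelope check, since rstatix obtains Z from the coin package and the two values need not match exactly.

```r
# Approximate r = |Z| / sqrt(N) from the reported two-sided p-value.
p_value <- 0.0433      # copied from the wilcox.test output above
n_pairs <- 60          # number of paired observations

z_approx <- qnorm(p_value / 2)               # negative z for the lower tail
r_approx <- abs(z_approx) / sqrt(n_pairs)    # approximate effect size

round(r_approx, 2)
```

The result lands close to the 0.261 reported by wilcox_effsize, which supports reading the effect as small.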


# SUMMARY OF RESULTS
# A Wilcoxon signed rank test was conducted to compare sales before and after the marketing campaign among 60 stores. Median sales were higher after the campaign (median = 25086.00) than before the campaign (median = 24624.00), V = 640, p = 0.0433. These results indicate that sales increased significantly after the marketing campaign (p < 0.05). The effect size was r = 0.261, which is a small effect.