Part 1: Experiment Design

Title: Consumer Pseudo-Showrooming and Omni-Channel Product Placement Strategies

Abstract: Recent advances in information technologies (IT) have powered the merger of online and offline retail channels into one single platform. Modern consumers frequently switch between online and offline channels when they navigate through various stages of the decision journey, motivating multichannel sellers to develop omni-channel strategies that optimize their overall profit. This study examines consumers’ cross-channel search behavior of “pseudo-showrooming,” or the consumer behavior of inspecting one product at a seller’s physical store before buying a related but different product at the same seller’s online store, and investigates how such consumer behavior allows a multichannel seller to achieve better coordination between its online and offline arms through optimal product placement strategies.

Participants in the study were grouped into the following categories:

Where_bought: Where they ended up purchasing an item: bought at the store, bought online.
Who_bought: Whether they bought from the same or a different retailer.

Each participant was then measured on:

Money: how much money they spent in dollars on the product.
Time: how much time (in minutes) they spent looking at the product online.

What would be one possible null hypothesis based on this study?

There is no difference in the mean amount of money spent between consumers who engage in pseudo-showrooming and those who buy the product directly in-store.

What would be one possible alternative hypothesis based on this study?

There is a difference in the mean amount of money spent between consumers who engage in pseudo-showrooming and those who buy the product directly in-store.

Who are they sampling in this study?

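The participants in this study are consumers who purchased a product, either in-store or online and from the same or a different retailer, including those who engaged in “pseudo-showrooming” behaviors.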

Who is the intended population in this study?

The intended population in this study is all consumers.

Give an example of type 1 error based on this study (do not just define, explain in context how it might have happened here).

Here, a Type 1 error would be concluding that consumers who inspected a product in-store and then bought a related product online spent a significantly different amount than those who bought in-store, when in reality there is no difference.

Give an example of type 2 error based on this study (do not just define, explain in context how it might have happened here).

A Type 2 error would occur if the study failed to detect a difference in money spent between consumers who inspect a product in-store and buy a related but different product online and those who buy in-store when, in fact, a difference exists.
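To make the Type 1 error concrete, here is a minimal simulation sketch. It assumes a world where the null hypothesis is true (both groups spend the same on average) and counts how often a t-test still reports a significant difference. The mean, standard deviation, and group size are hypothetical values chosen for illustration, not taken from the study data.

```r
# Simulate the Type 1 error rate when the null hypothesis is true.
# All numbers here (mean 25, SD 3, n = 50 per group) are hypothetical.
set.seed(42)

n_sims <- 2000
n_per_group <- 50

p_values <- replicate(n_sims, {
  # Both groups come from the same population, so any "significant"
  # difference found by the test is a false positive.
  showroom <- rnorm(n_per_group, mean = 25, sd = 3)
  in_store <- rnorm(n_per_group, mean = 25, sd = 3)
  t.test(showroom, in_store)$p.value
})

# Proportion of simulated studies that wrongly reject the null at alpha = .05
type1_rate <- mean(p_values < 0.05)
type1_rate
```

The proportion should land close to the chosen significance level of .05, which is the rate at which a Type 1 error is expected when no real difference exists.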

Part 2: Use the 04_data.csv to complete this portion.

library(rio)
Data <- import("/Users/bengalycisse/Downloads/Resume & Career/Classes/Harrisburg U/ANLY 500/04_data.csv")

For each IV list the levels (next to a, b):
1. Where bought: online, in-store
2. Who bought: Same retailer, Different retailer
What are the conditions in this experiment?

Online, same retailer; online, different retailer; in-store, same retailer; in-store, different retailer.
For each condition list the means, standard deviations, and standard error for the conditions for time and money spent. Please note that means you should have several sets of M, SD, and SE. Be sure you name the sets of means, sd, and se different things so you can use them later.

library(dplyr)

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

# Calculate means, standard deviations, and standard errors for money and time for each condition
condition_means <- Data %>%
  group_by(where_bought, who_bought) %>%
  summarise(
    mean_money = mean(money),
    mean_time = mean(time),
    sd_money = sd(money),
    sd_time = sd(time),
    se_money = sd_money / sqrt(n()),
    se_time = sd_time / sqrt(n()),
  )

## `summarise()` has grouped output by 'where_bought'. You can override using the
## `.groups` argument.

print(condition_means)

## # A tibble: 4 × 8
## # Groups:   where_bought [2]
##   where_bought who_bought mean_money mean_time sd_money sd_time se_money se_time
##   <chr>        <chr>           <dbl>     <dbl>    <dbl>   <dbl>    <dbl>   <dbl>
## 1 online       different        25.4      10.0     2.81    2.75    0.397   0.389
## 2 online       same             25.1      10.5     3.26    3.03    0.461   0.428
## 3 store        different        34.7      15.3     3.05    2.66    0.432   0.376
## 4 store        same             34.9      20.1     2.74    2.91    0.387   0.412

Which condition appears to have the best model fit using the mean as the model (i.e. smallest error) for time?

The in-store, different-retailer condition appears to have the best model fit: it has the smallest standard deviation (2.66) and standard error (0.376) for time.
What are the df for each condition?

# Calculate degrees of freedom (n - 1) for each condition
df_conditions <- Data %>%
  group_by(where_bought, who_bought) %>%
  summarise(df = n() - 1)

## `summarise()` has grouped output by 'where_bought'. You can override using the
## `.groups` argument.

df_conditions

What is the confidence interval (95%) for the means?

conf_interval <- condition_means %>%
  mutate(
    ci_low_money = mean_money - qt(0.975, 49) * se_money,
    ci_high_money = mean_money + qt(0.975, 49) * se_money,
    ci_low_time = mean_time - qt(0.975, 49) * se_time,
    ci_high_time = mean_time + qt(0.975, 49) * se_time
  )
## Values are listed in row order: online/different, online/same, store/different, store/same

## money
# Lower CI: 24.6 24.2 33.8 34.1
# Upper CI: 26.2 26.1 35.6 35.7

## time
# Lower CI:  9.27  9.66 14.58 19.27
# Upper CI: 10.80 11.40 16.10 20.90
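The interval code above hard-codes 49 degrees of freedom. As a hedged alternative, the sketch below derives the degrees of freedom from the sample itself. The helper name `ci_mean` and the example values are hypothetical and are not part of the assignment or of 04_data.csv.

```r
# Hypothetical helper: confidence interval for a mean, using the
# sample's own degrees of freedom (n - 1) instead of a fixed value.
ci_mean <- function(x, level = 0.95) {
  n <- length(x)
  se <- sd(x) / sqrt(n)                          # standard error of the mean
  t_crit <- qt(1 - (1 - level) / 2, df = n - 1)  # two-sided critical t value
  c(lower = mean(x) - t_crit * se, upper = mean(x) + t_crit * se)
}

# Made-up values for illustration only (not from 04_data.csv)
example_money <- c(24.1, 26.3, 25.0, 27.2, 23.8, 25.6)
ci_mean(example_money)
```

The result should agree with `t.test(example_money)$conf.int`, which is a convenient cross-check.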

Use the MOTE library to calculate the effect size for the difference between money spent for the following comparisons (that means you’ll have to do this twice):

library(MOTE)

##Store versus online when bought at the same retailer
store_same <- Data %>%
  filter(where_bought == "store", who_bought == "same")%>%
  select(money) %>%
  unlist()
online_same <- Data %>%
  filter(where_bought == "online", who_bought == "same")%>%
  select(money) %>%
  unlist()
# Calculate means and standard deviations
mean_store_same <- mean(store_same)
mean_online_same <- mean(online_same)

sd_store_same <- sd(store_same)
sd_online_same <- sd(online_same)

# Calculate effect size for money spent for "store" versus "online" when bought at the same retailer
# d.ind.t() expects the group sample sizes (n1, n2), not the degrees of freedom
effect_same_retailer <- d.ind.t(
  mean_store_same, mean_online_same,
  sd_store_same, sd_online_same,
  length(store_same), length(online_same),
  a = 0.05
)
effect_same_retailer$d

## [1] 3.240354

##Store versus online when bought at a different retailer
store_different <- Data %>%
  filter(where_bought == "store", who_bought == "different")%>%
  select(money) %>%
  unlist()
online_different <- Data %>%
  filter(where_bought == "online", who_bought == "different")%>%
  select(money) %>%
  unlist()
# Calculate means and standard deviations
mean_store_different <- mean(store_different)
mean_online_different <- mean(online_different)

sd_store_different <- sd(store_different)
sd_online_different <- sd(online_different)

# Calculate effect size for money spent for "store" versus "online" when bought at a different retailer
# d.ind.t() expects the group sample sizes (n1, n2), not the degrees of freedom
effect_different_retailer <- d.ind.t(
  mean_store_different, mean_online_different,
  sd_store_different, sd_online_different,
  length(store_different), length(online_different),
  a = 0.05
)
effect_different_retailer$d

## [1] 3.180373
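As a rough cross-check on the same-retailer comparison, the sketch below computes Cohen's d by hand, assuming the usual pooled-standard-deviation definition. It uses the rounded means and standard deviations from the summary table, and it assumes 50 participants per group (implied by the 49 degrees of freedom used for the confidence intervals), so the result will differ slightly from the unrounded MOTE output.

```r
# Rounded summary statistics copied from the condition_means table
m_store  <- 34.9; sd_store  <- 2.74  # store, same retailer
m_online <- 25.1; sd_online <- 3.26  # online, same retailer

# Assumed group sizes (49 degrees of freedom + 1)
n_store  <- 50
n_online <- 50

# Pooled standard deviation across the two independent groups
pooled_sd <- sqrt(((n_store - 1) * sd_store^2 + (n_online - 1) * sd_online^2) /
                    (n_store + n_online - 2))

# Cohen's d: mean difference in pooled standard deviation units
d_by_hand <- (m_store - m_online) / pooled_sd
d_by_hand
```

With equal group sizes the pooled standard deviation does not depend on n, so the value of d here is driven entirely by the means and standard deviations.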

What can you determine about the effect size in the experiment - is it small, medium or large?

Both effect sizes (d ≈ 3.24 and d ≈ 3.18) are well above 0.8, the conventional cutoff for a large effect, so the effect size in this experiment is very large.

How many people did we need in the study for each comparison?

library(pwr)

# Effect sizes from the calculations
effect_size_same_retailer <- effect_same_retailer$d
effect_size_different_retailer <- effect_different_retailer$d

# Significance level (α)
alpha <- 0.05

# Power (1 - β)
power <- 0.80

##Store versus online when bought at the same retailer
sample_size_same_retailer <- pwr.t.test(d = effect_size_same_retailer, sig.level = alpha, power = power, type = "two.sample")$n

##Store versus online when bought at a different retailer
sample_size_different_retailer <- pwr.t.test(d = effect_size_different_retailer, sig.level = alpha, power = power, type = "two.sample")$n

sample_size_same_retailer # 2.852122

## [1] 2.852122

sample_size_different_retailer # 2.901572

## [1] 2.901572
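`pwr.t.test()` reports the required number of observations per group as a fraction. Since participants come in whole numbers, the values can be rounded up, as in this short sketch; the two inputs are copied from the output above, and the variable names are illustrative.

```r
# Required n per group as reported by pwr.t.test (copied from the output above)
n_same      <- 2.852122  # store vs. online, same retailer
n_different <- 2.901572  # store vs. online, different retailer

# Round up to whole participants per group
n_needed <- ceiling(c(same = n_same, different = n_different))
n_needed
```

Rounded up, each comparison needs about 3 participants per group, which reflects how large the estimated effect sizes are.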

Introduction to Data Analytics 2

Bengaly Cisse

2024-09-05
