Sensory Marketing & Product Innovation: Exercise 2 - Making Choices

R analysis script presenting the solutions for the exercise in Sensory Marketing regarding Biswas et al. (2014). In one of the questions, students are asked to estimate the appropriate sample size for a replication of Study 1b’s findings (under $\chi^{2}$-test, $\alpha$ = 5%, Power = 0.8).

The purpose of this script is not solely to answer the exercise question. Studying it should also help students become familiar with some aspects of working in R.

If this grabs your attention

If this exercise grabs your attention, please check out our master study programs at the Otto-von-Guericke-University in Magdeburg (Germany) by clicking on the logo!

1 Loading packages

Beware!

R is a case-sensitive language. Thus, ‘data’ will not be interpreted in the same way as ‘Data’.
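A minimal illustration of this point. The object names score and Score are hypothetical and chosen only to show that R keeps differently capitalized names apart.

```r
# Hypothetical object names, used only to illustrate case sensitivity
score <- 10
Score <- 20

# R stores these as two separate objects with different values
score
Score

# Comparing them confirms that they are not the same object
identical(score, Score)
```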

In R most functionality is provided by additional packages.
Most of the packages are well-documented, see: https://cran.r-project.org/

The code chunk below first checks whether the package pacman (Rinker & Kurkiewicz, 2018) is already installed on your machine. If it is not, R installs it.
Alternatively, you can do this manually by executing install.packages("pacman") and then library(pacman).
The second line then loads the package pacman.
The third line uses the function p_load() from the pacman package to install (if necessary) and load all packages that we provide as arguments (e.g., pwr (Champely, 2020), which provides functions for statistical power calculations).

if (!"pacman" %in% rownames(installed.packages())) install.packages("pacman")

library(pacman)

pacman::p_load(tidyverse, pwr, compute.es)
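For readers who prefer to avoid an extra package, the install-if-missing logic can be sketched in base R. The helper name load_or_install is hypothetical and not part of pacman. It only approximates what p_load() does for a single package.

```r
# Hypothetical helper: load a package, installing it first only if it is missing
load_or_install <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # character.only = TRUE lets library() accept the package name as a string
  library(pkg, character.only = TRUE)
  invisible(TRUE)
}

# Demonstrated with a package that ships with R, so nothing is downloaded
load_or_install("stats")
```

To cover the packages used in this script, the helper could be called once per name, for example inside a for loop over c("tidyverse", "pwr", "compute.es").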

Expand to learn more about calling functions

In all code chunks throughout this script, you can receive additional help on each function used by clicking on its name (or via right-click and then opening in a new browser tab). Alternatively, when coding, we can see which arguments a function accepts by placing the cursor on the function’s name and pressing ‘F1’.
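The same information is also available from the console. The short sketch below uses sd() from base R purely as an example function; any other function name could be substituted.

```r
# Names of the arguments that sd() accepts
names(formals(sd))

# Full signature, including default values
args(sd)

# Open the help page; only attempted in an interactive session
if (interactive()) help("sqrt")
```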

Here is the R session info which gives you information on my machine, all loaded packages, and their version:

sessionInfo()

R version 4.3.1 (2023-06-16 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 11 x64 (build 22621)

Matrix products: default


locale:
[1] LC_COLLATE=German_Germany.utf8  LC_CTYPE=German_Germany.utf8   
[3] LC_MONETARY=German_Germany.utf8 LC_NUMERIC=C                   
[5] LC_TIME=German_Germany.utf8    

time zone: Europe/Berlin
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] compute.es_0.2-5 pwr_1.3-0        htmltools_0.5.6  quarto_1.3      
 [5] ggpubr_0.6.0     lubridate_1.9.3  forcats_1.0.0    stringr_1.5.0   
 [9] dplyr_1.1.3      purrr_1.0.2      readr_2.1.4      tidyr_1.3.0     
[13] tibble_3.2.1     ggplot2_3.4.3    tidyverse_2.0.0  labelled_2.12.0 
[17] knitr_1.44       kableExtra_1.3.4 haven_2.5.3      pacman_0.5.1    

loaded via a namespace (and not attached):
 [1] gtable_0.3.4      xfun_0.40         processx_3.8.2    rstatix_0.7.2    
 [5] tzdb_0.4.0        vctrs_0.6.3       tools_4.3.1       ps_1.7.5         
 [9] generics_0.1.3    fansi_1.0.4       pkgconfig_2.0.3   webshot_0.5.5    
[13] lifecycle_1.0.3   compiler_4.3.1    munsell_0.5.0     carData_3.0-5    
[17] yaml_2.3.7        pillar_1.9.0      later_1.3.1       car_3.1-2        
[21] abind_1.4-5       tidyselect_1.2.0  rvest_1.0.3       digest_0.6.33    
[25] stringi_1.7.12    fastmap_1.1.1     grid_4.3.1        colorspace_2.1-0 
[29] cli_3.6.1         magrittr_2.0.3    utf8_1.2.3        broom_1.0.5      
[33] withr_2.5.1       scales_1.2.1      backports_1.4.1   timechange_0.2.0 
[37] rmarkdown_2.25    httr_1.4.7        ggsignif_0.6.4    hms_1.1.3        
[41] evaluate_0.22     viridisLite_0.4.2 rlang_1.1.1       Rcpp_1.0.11      
[45] glue_1.6.2        xml2_1.3.5        svglite_2.1.1     rstudioapi_0.15.0
[49] jsonlite_1.8.7    R6_2.5.1          systemfonts_1.0.4

2 Finding an answer to question #5

How many subjects have to be recruited to replicate the overall effect of Study 1b (Assumptions: $\alpha$ = 5%, Power = 0.8, Chi²-test, effect size = sample estimate)?

2.1 Calculate effect size Cohen’s $\omega$

Information the article provides:

“Fifty-one restaurant patrons ($M_{age}$ = 38 years, 26% female) participated in the experiment in exchange for a free drink and complimentary chocolates.”
“Consumers chose the chocolate sampled sequentially last to a greater extent when the sensory cues were dissimilar (vs. similar) and chose the first chocolate to a greater extent when the sensory cues were similar (vs. dissimilar) ($\chi^{2}$ = 7.07, p < .01).”

From this, we can extract:

sample size n=51
the $\chi^{2}$ value which is 7.07

Unfortunately, the authors did not report the degrees of freedom (df), which is poor reporting practice. However, based on the information provided, we understand that their analysis is based on a 2x2 contingency table:

                  First chocolate chosen   Last chocolate chosen
Similar cues      a                        b
Dissimilar cues   c                        d

Furthermore, we know from lectures in statistics that the df in a $\chi^{2}$-test equals (number of rows - 1) * (number of columns - 1). Thus, df = (2-1)*(2-1) = 1.
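The df rule can be checked in R on any 2x2 table. The cell counts below are placeholders for illustration only and are NOT the data of Biswas et al. (2014), who do not report the individual cells in the quoted passage.

```r
# Placeholder counts for illustration only; NOT the article's data
choice_table <- matrix(
  c(18, 12,
    11, 19),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    cues   = c("Similar", "Dissimilar"),
    choice = c("First", "Last")
  )
)

# df = (number of rows - 1) * (number of columns - 1)
df_manual <- (nrow(choice_table) - 1) * (ncol(choice_table) - 1)
df_manual

# chisq.test() stores the same df in its 'parameter' element
df_test <- unname(chisq.test(choice_table, correct = FALSE)$parameter)
df_test
```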

This is all we need to calculate the observed effect size Cohen’s $\omega$ (Cohen, 1988). Remember from the exercise: Cohen’s $\omega$ = $\sqrt{\frac{\chi^{2}}{n}}$. Therefore, we simply need to solve $\sqrt{\frac{7.07}{51}}$.

We achieve this by using the next code chunk. Within this code chunk, we use the sqrt() function (place the cursor in the function and press ‘F1’ to see help), which calculates the square root of a number. We feed this function with the values of $\chi^{2}(1)$ = 7.07, and the total sample size n = 51. We assign the results to an object that we call ‘Cohen_omega’.

In the next line, we call the object and simultaneously round the results to 4 digits. This is obtained by using the round() function.

Cohen_omega <- sqrt(7.07/51)

round(Cohen_omega, digits = 4)

[1] 0.3723

We can see that the calculated effect size of 0.3723 matches the one we have seen in the exercise slides. Common classifications for Cohen’s $\omega$ are: [0.1, 0.3) - small effect, [0.3, 0.5) - medium effect, and [0.5, +$\infty$) - large effect. Our estimate therefore counts as a medium effect.
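The classification can be turned into a small lookup. The helper name classify_w and the label ‘below small’ for values under 0.1 are assumptions made for this sketch; the thresholds are the ones listed above.

```r
# Hypothetical helper that labels an effect size using the thresholds above
classify_w <- function(w) {
  cut(
    w,
    breaks = c(0, 0.1, 0.3, 0.5, Inf),
    labels = c("below small", "small", "medium", "large"),
    right  = FALSE  # intervals are closed on the left, e.g. 0.3 counts as medium
  )
}

# Observed effect size from Study 1b
classify_w(sqrt(7.07 / 51))

# The boundary values fall into the higher category
classify_w(c(0.1, 0.3, 0.5))
```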

2.2 Conduct a priori power analysis to search for minimum n

In the question, we are asked for a statistical power of 80% (a commonly used threshold in social sciences).

To estimate a minimum sample size n for the replication, we apply the pwr.chisq.test() function from the pwr package (Champely, 2020). This function has 5 parameters (see help by pressing ‘F1’):

w = Effect size Cohen’s $\omega$
N = Total number of observations
df = Degrees of freedom
sig.level = Significance level (Type I error probability $\alpha$)
power = Power of test (1 minus Type II error probability)

The function, furthermore, expects us to set exactly one of these arguments to NULL. By doing so, we tell the function to use the remaining 4 values to solve for the fifth. In our case, we are searching for ‘N’; therefore, we set ‘N=NULL’.

We assign the results of our power analysis to a new object named ‘results’. Then we call for its content.

results <- pwr.chisq.test(w=Cohen_omega, df=1 ,N=NULL, sig.level=0.05, power = 0.8) 

results

 Chi squared power calculation 

          w = 0.3723271
          N = 56.61837
         df = 1
  sig.level = 0.05
      power = 0.8

NOTE: N is the number of observations

Because participants can only be recruited in whole numbers, we round N = 56.62 up to 57. This sample size matches the one we have seen in G*Power 3 (Faul, Erdfelder, Lang, & Buchner, 2007). Keep in mind: if an original study reports a medium effect size, a replication with a power of 80% often needs a sample size comparable to that of the original study. However, studies reporting only small effects usually need many more participants for a successful replication.
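As a cross-check, the minimum N can also be approximated in base R. The sketch assumes that the power of the $\chi^{2}$-test follows from the noncentral $\chi^{2}$ distribution with noncentrality parameter N * $\omega^{2}$. The helper name chisq_power and the search interval are illustrative choices and not part of the pwr package.

```r
# Power of a chi-squared test for effect size w, n observations, and df
# degrees of freedom, based on the noncentral chi-squared distribution
chisq_power <- function(n, w, df = 1, alpha = 0.05) {
  critical_value <- qchisq(1 - alpha, df = df)
  pchisq(critical_value, df = df, ncp = n * w^2, lower.tail = FALSE)
}

w_observed <- sqrt(7.07 / 51)

# Search for the n at which power equals 0.8 (the interval is an assumption)
n_exact <- uniroot(
  function(n) chisq_power(n, w = w_observed) - 0.8,
  interval = c(2, 1e5),
  tol = 1e-8
)$root

n_exact           # fractional number of observations
ceiling(n_exact)  # round up, because participants come in whole numbers
```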

2.3 Visualize results

As a last step, we can visualize the relationship between the expected statistical power and different sample sizes.

For this purpose, we apply the plot() function with the ‘results’ object and a descriptive label for the x-axis as arguments.

plot(results, xlab="sample size")

From this plot, we can alternatively extract the same information for sample size planning.
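The numbers behind such a power curve can also be tabulated directly. This is a minimal base-R sketch that assumes the noncentral $\chi^{2}$ formulation of power; the grid of sample sizes from 20 to 100 is an arbitrary choice for illustration.

```r
w_observed  <- sqrt(7.07 / 51)
critical    <- qchisq(0.95, df = 1)  # critical value at alpha = 0.05
sample_size <- seq(20, 100, by = 10)

# Power for each sample size on the coarse grid
power <- pchisq(critical, df = 1, ncp = sample_size * w_observed^2,
                lower.tail = FALSE)
data.frame(sample_size, power = round(power, 3))

# Smallest whole-number n on a fine grid that reaches the 80% target
n_grid <- 20:100
p_grid <- pchisq(critical, df = 1, ncp = n_grid * w_observed^2,
                 lower.tail = FALSE)
min(n_grid[p_grid >= 0.8])
```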

3 Additional reading (not relevant for the exam)

In the above-discussed example, we were planning a replication of the original findings. In doing so, we have made certain assumptions:

We assume that the observed effect size Cohen’s $\omega$ = 0.3723 is the true effect size. Strictly speaking, this is wrong. Our calculated value is only a sample estimate of the true effect size in the underlying population. Under the assumption that the true effect size is $\omega$ = 0.3723, we calculated that we need at least n=57 participants for a replication of the original findings with a power of 80%. This means that we will have a chance of approximately 80% of finding the effect to be significant at $\alpha$=0.05. Because the true effect might be smaller, however, this calculation may be overly optimistic. Therefore, other researchers prefer to focus on the confidence intervals for the estimate of effect size measures (Thompson, 2002).

Apart from Cohen’s $\omega$, there are plenty of other effect sizes ‘out in the market’. They can all be converted into each other (Borenstein, Hedges, Higgins, & Rothstein, 2009; Fern & Monroe, 1996; Lenhard & Lenhard, 2017; Volker, 2006). In our case (Chi² with df=1), for example, Cohen’s $\omega$ is equivalent to the correlation effect size r, which for a 2x2 table is the phi coefficient (Lenhard & Lenhard, 2017).
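The equivalence for a 2x2 table can be verified numerically. The cell counts below are placeholders for illustration only and are NOT the article’s data. The sketch compares $\sqrt{\chi^{2}/n}$ with the absolute Pearson correlation of the two 0/1-coded variables.

```r
# Placeholder 2x2 counts for illustration only; NOT the article's data
counts <- matrix(c(20, 10,
                   10, 20), nrow = 2, byrow = TRUE)
n <- sum(counts)

# Effect size from the uncorrected chi-squared statistic
chi_sq <- unname(chisq.test(counts, correct = FALSE)$statistic)
w <- sqrt(chi_sq / n)

# Expand the table into one row per participant with 0/1 codes
cell_sizes <- c(counts[1, 1], counts[1, 2], counts[2, 1], counts[2, 2])
cues   <- rep(c(0, 0, 1, 1), times = cell_sizes)
choice <- rep(c(0, 1, 0, 1), times = cell_sizes)

# Pearson correlation of the two binary variables (the phi coefficient)
r <- cor(cues, choice)

c(w = w, abs_r = abs(r))
```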

One R package that facilitates the calculation of confidence intervals for effect sizes is the compute.es package (Re, 2013). This package supports the use of r as an effect size measure.

We can apply the chies() function to obtain confidence intervals for our observed effect size. We use the function twice. The first time we assign the results to an object named ‘CI_effect’. The second time we do not assign the results to an object, which is equivalent to printing the results to the screen.

In the function call we feed the function with 5 arguments:

chi.sq = our $\chi^{2}$ value which is 7.07
n = our sample size of 51
level = the level of confidence for the confidence intervals to be calculated
dig = the number of decimal digits to which the results are rounded
verbose = whether detailed results are printed (FALSE in the first call, TRUE in the second)

CI_effect <- chies(chi.sq = 7.07, n =51, level = 95, dig = 3, verbose = FALSE)

chies(chi.sq = 7.07, n =51, level = 95, dig = 3, verbose = TRUE)

Mean Differences ES: 
 
 d [ 95 %CI] = 0.794 [ 0.197 , 1.392 ] 
  var(d) = 0.093 
  p-value(d) = 0.012 
  U3(d) = 78.653 % 
  CLES(d) = 71.286 % 
  Cliff's Delta = 0.426 
 
 g [ 95 %CI] = 0.782 [ 0.194 , 1.37 ] 
  var(g) = 0.09 
  p-value(g) = 0.012 
  U3(g) = 78.296 % 
  CLES(g) = 70.991 % 
 
 Correlation ES: 
 
 r [ 95 %CI] = 0.372 [ 0.108 , 0.588 ] 
  var(r) = 0.015 
  p-value(r) = 0.009 
 
 z [ 95 %CI] = 0.391 [ 0.108 , 0.674 ] 
  var(z) = 0.021 
  p-value(z) = 0.009 
 
 Odds Ratio ES: 
 
 OR [ 95 %CI] = 4.225 [ 1.43 , 12.483 ] 
  p-value(OR) = 0.012 
 
 Log OR [ 95 %CI] = 1.441 [ 0.358 , 2.524 ] 
  var(lOR) = 0.306 
  p-value(Log OR) = 0.012 
 
 Other: 
 
 NNT = 3.556 
 Total N = 51

In the output of the chies() function, we see that the relevant effect size r is additionally accompanied by a significance test for the effect size itself. This is an alternative to the classical $\chi^{2}$ test. A significant result here means that the effect size obtained differs significantly from 0. Put differently, the effect is assumed to exist in the underlying population.

We now recognize that the effect size in the underlying population is likely to be located somewhere in the interval ranging from 0.108 to 0.588. Put differently, in the worst case we have to plan for a replication of an effect that is, indeed, only of the size of $\omega$ = 0.108.
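The interval for r can be reproduced by hand with the Fisher z-transformation. This is a sketch under the assumption of a standard error of $1/\sqrt{n-3}$ on the z scale. The resulting limits agree with the printed output to three decimals, which suggests, but does not prove, that the package uses a comparable approach.

```r
n <- 51
r <- sqrt(7.07 / n)

# Fisher z-transformation and its approximate standard error
z    <- atanh(r)
se_z <- 1 / sqrt(n - 3)

# 95% interval on the z scale, transformed back to the r scale
z_limits <- z + c(-1, 1) * qnorm(0.975) * se_z
r_limits <- tanh(z_limits)

round(c(lower = r_limits[1], upper = r_limits[2]), 3)
```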

We can briefly calculate how this will change our assumptions about the minimum sample size. We, again, use the pwr.chisq.test() function (see above). This time, we pass ‘CI_effect$l.r’ as the argument for the effect size parameter w.

R objects such as ‘CI_effect’ can comprise different elements. We can access these elements by using the ‘$’ notation. In the code chunk below, the element ‘l.r’ stores the lower limit of the 95% confidence interval of the effect size r (= Cohen’s $\omega$ in our case).
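The ‘$’ notation works the same way on any named list. The object effect_summary below is hypothetical and only mimics how a result object stores named elements; its values are copied from the printed output for r.

```r
# Hypothetical list that mimics a result object with named elements
effect_summary <- list(r = 0.372, l.r = 0.108, u.r = 0.588)

# List all element names
names(effect_summary)

# Three equivalent ways to extract the same element
effect_summary$l.r
effect_summary[["l.r"]]
effect_summary[[2]]
```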

results_worst_case <- pwr.chisq.test(w=CI_effect$l.r, df=1 ,N=NULL, sig.level=0.05, power = 0.8) 

results_worst_case

 Chi squared power calculation 

          w = 0.108
          N = 672.9133
         df = 1
  sig.level = 0.05
      power = 0.8

NOTE: N is the number of observations

We can see that the more conservative sample size is 673, roughly 13 times the original sample of 51, which makes a replication of the original findings almost impossible in practice.

This is a general problem in consumer research.

Most universities (especially the smaller European schools) do not have the necessary budget to replicate findings published in the top-tier journals, which in turn decreases trust in the findings’ usefulness.

References

Biswas, D., Labrecque, L. I., Lehmann, D. R., & Markos, E. (2014). Making Choices While Smelling, Tasting, and Listening: The Role of Sensory (Dis)similarity When Sequentially Sampling Products. Journal of Marketing, 78(1), 112–126. doi: 10.1509/jm.12.0325

Borenstein, M., Hedges, L. V., Higgins, J. P. T., & Rothstein, H. R. (2009). Introduction to meta-analysis. doi: 10.1002/9780470743386

Champely, S. (2020). pwr: Basic functions for power analysis. Retrieved from https://CRAN.R-project.org/package=pwr

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Lawrence Erlbaum.

Faul, F., Erdfelder, E., Lang, A.-G., & Buchner, A. (2007). G*Power 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behavior Research Methods, 39(2), 175–191. doi: 10.3758/bf03193146

Fern, E. F., & Monroe, K. B. (1996). Effect-Size Estimates: Issues and Problems in Interpretation. Journal of Consumer Research, 23(2), 89. doi: 10.1086/209469

Lenhard, W., & Lenhard, A. (2017). Computation of Effect Sizes. doi: 10.13140/RG.2.2.17823.92329

Re, A. C. D. (2013). compute.es: Compute effect sizes. Retrieved from https://cran.r-project.org/package=compute.es

Rinker, T. W., & Kurkiewicz, D. (2018). pacman: Package management for R. Retrieved from http://github.com/trinker/pacman

Thompson, B. (2002). What Future Quantitative Social Science Research Could Look Like: Confidence Intervals for Effect Sizes. Educational Researcher, 31(3), 25–32. doi: 10.3102/0013189x031003025

Volker, M. A. (2006). Reporting effect size estimates in school psychology research. Psychology in the Schools, 43(6), 653–672. doi: 10.1002/pits.20176

Citation

For attribution, please cite this work as:

Prof. Dr. Lichters, M. (2023, October 6). Sensory Marketing & Product Innovation: Exercise 2 - Making Choices. Retrieved from https://rpubs.com/M_Lichters/smpiexercise2
