Preference Data Analysis

knitr::opts_chunk$set(echo = TRUE)

First, we load the data. Usually, we would have to design our experiment and plan how to collect the data. In this case, the data has already been collected.

preference <- read.csv("preference.csv")
preference

##    preference primed
## 1         1.8      0
## 2         0.1      0
## 3         4.0      0
## 4         2.1      0
## 5         2.4      0
## 6         3.4      0
## 7         1.7      0
## 8         2.2      0
## 9         1.9      0
## 10        1.9      0
## 11        0.1      0
## 12        3.3      0
## 13        2.1      0
## 14        2.0      0
## 15        1.4      0
## 16        1.6      0
## 17        2.3      0
## 18        1.8      0
## 19        3.2      0
## 20        0.8      0
## 21        1.7      1
## 22        1.7      1
## 23        4.2      1
## 24        3.0      1
## 25        2.9      1
## 26        3.0      1
## 27        4.0      1
## 28        4.1      1
## 29        2.9      1
## 30        2.9      1
## 31        1.2      1
## 32        4.0      1
## 33        3.0      1
## 34        3.9      1
## 35        3.1      1
## 36        2.5      1
## 37        3.2      1
## 38        4.1      1
## 39        3.9      1
## 40        1.1      1
## 41        1.9      1
## 42        3.1      1

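The file has two columns and 43 lines. The first line is reserved for the column labels, so there are 42 observations. The columns are labelled “preference” and “primed.” Each row is one consumer: “preference” holds that consumer's preference rating, and “primed” records whether the consumer was primed (1) or not primed (0). Rows 1 to 20 are non-primed consumers and rows 21 to 42 are primed consumers.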
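The row and group counts can be checked directly rather than read off the printout. The sketch below rebuilds the data frame from the values printed above so that it runs without preference.csv; the vector names not_primed and primed are illustrative and are not part of the original analysis.

```r
# Values copied from the printed data above (rows 1-20, then rows 21-42).
not_primed <- c(1.8, 0.1, 4.0, 2.1, 2.4, 3.4, 1.7, 2.2, 1.9, 1.9,
                0.1, 3.3, 2.1, 2.0, 1.4, 1.6, 2.3, 1.8, 3.2, 0.8)
primed <- c(1.7, 1.7, 4.2, 3.0, 2.9, 3.0, 4.0, 4.1, 2.9, 2.9, 1.2,
            4.0, 3.0, 3.9, 3.1, 2.5, 3.2, 4.1, 3.9, 1.1, 1.9, 3.1)

# Same layout as the CSV: one rating column and one 0/1 group column.
preference <- data.frame(
  preference = c(not_primed, primed),
  primed = rep(c(0, 1), times = c(length(not_primed), length(primed)))
)

dim(preference)            # observations and columns (the header is not counted)
table(preference$primed)   # number of observations in each group
```

The two groups are not the same size, which is worth knowing before choosing a test.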

Step 3: Identify the purpose of the study

The purpose of the study is to determine whether “priming” consumers affects product preference. The study was prompted because some researchers believe that customers who are “primed” have “thought about the image earlier in an unrelated context, process visual information easier, and might have different preferences from non-primed customers.”

Step 4: Visualize data

library(ggplot2)
ggplot(data=preference, mapping=aes(x=preference, y=primed)) + geom_point()

preference$catprim[preference$primed=="0"]<-"not primed"
preference$catprim[preference$primed=="1"] <-"primed"
ggplot()+geom_boxplot(data=preference, aes(x=catprim, y=preference))

Step 5: Interpret the plot

The box plot shows that consumers who were primed with the associated positive word tended to give the shampoo higher preference ratings. In the primed group, 13 of the 22 participants, a majority, gave a rating of three or above. In the non-primed group, only 4 of the 20 participants, one fifth of the group, gave such ratings.
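Counts like these are easy to misread from a plot, so it can help to compute them. This is a minimal sketch that assumes the ratings printed earlier and uses illustrative variable names; the threshold of 3 is the one used in the interpretation above.

```r
# Values copied from the printed data above (rows 1-20, then rows 21-42).
not_primed <- c(1.8, 0.1, 4.0, 2.1, 2.4, 3.4, 1.7, 2.2, 1.9, 1.9,
                0.1, 3.3, 2.1, 2.0, 1.4, 1.6, 2.3, 1.8, 3.2, 0.8)
primed <- c(1.7, 1.7, 4.2, 3.0, 2.9, 3.0, 4.0, 4.1, 2.9, 2.9, 1.2,
            4.0, 3.0, 3.9, 3.1, 2.5, 3.2, 4.1, 3.9, 1.1, 1.9, 3.1)

ratings <- c(not_primed, primed)
group <- rep(c("not primed", "primed"),
             times = c(length(not_primed), length(primed)))

# Number of ratings of 3 or above in each group.
high <- tapply(ratings >= 3, group, sum)
# The same counts expressed as a share of each group.
share <- tapply(ratings >= 3, group, mean)
# Group medians, which the box plot draws as the line inside each box.
medians <- tapply(ratings, group, median)

print(high)
print(round(share, 2))
print(medians)
```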

Step 6: Formulate the null hypothesis

The null hypothesis for this data set is: there is no difference in customer preference between primed and non-primed consumers.

Step 7: Formulate the alternative hypothesis

The alternative hypothesis for this data set is: there is a difference in customer preference between primed and non-primed consumers.

Step 8: Decide on type of test

We will be using a t-test.

Step 9: Choose one sample or two

We will be using a two-sample test for this data: one sample of primed consumers and one of non-primed consumers.

Step 10: Check assumptions of the test

Since the t-test assumes that the data in each group are approximately normal, we will use a qq-plot to check the normality of the data.

# primed is numeric, so convert it to a factor to get one qq series per group
ggplot(data=preference) + geom_qq(mapping=aes(sample=preference, color=as.factor(primed)))
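A qq-plot is a visual check. As an optional numerical complement that is not part of the original analysis, a Shapiro-Wilk test can be run on each group with base R's shapiro.test(). The sketch below assumes the ratings printed earlier; a large p-value means only that the test found no evidence against normality, not that the data are normal.

```r
# Values copied from the printed data above (rows 1-20, then rows 21-42).
not_primed <- c(1.8, 0.1, 4.0, 2.1, 2.4, 3.4, 1.7, 2.2, 1.9, 1.9,
                0.1, 3.3, 2.1, 2.0, 1.4, 1.6, 2.3, 1.8, 3.2, 0.8)
primed <- c(1.7, 1.7, 4.2, 3.0, 2.9, 3.0, 4.0, 4.1, 2.9, 2.9, 1.2,
            4.0, 3.0, 3.9, 3.1, 2.5, 3.2, 4.1, 3.9, 1.1, 1.9, 3.1)

# Run the Shapiro-Wilk normality test separately on each group
# and keep only the p-values.
groups <- list("not primed" = not_primed, "primed" = primed)
shapiro_p <- sapply(groups, function(x) shapiro.test(x)$p.value)

print(round(shapiro_p, 3))
```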

Step 11: Decide on level of significance

Our level of significance will be 0.05, the conventional threshold.

Step 12: Perform the test

t.test(formula=preference~as.factor(primed), data=preference)

## 
##  Welch Two Sample t-test
## 
## data:  preference by as.factor(primed)
## t = -3.2072, df = 39.282, p-value = 0.002666
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -1.577912 -0.357543
## sample estimates:
## mean in group 0 mean in group 1 
##        2.005000        2.972727
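To see where the t, df, and p-value figures come from, the Welch statistic can be rebuilt by hand from the group means, variances, and sizes. This is a sketch using the ratings printed earlier, with illustrative variable names; it should agree with the t.test() output above up to rounding.

```r
# Values copied from the printed data above (rows 1-20, then rows 21-42).
not_primed <- c(1.8, 0.1, 4.0, 2.1, 2.4, 3.4, 1.7, 2.2, 1.9, 1.9,
                0.1, 3.3, 2.1, 2.0, 1.4, 1.6, 2.3, 1.8, 3.2, 0.8)
primed <- c(1.7, 1.7, 4.2, 3.0, 2.9, 3.0, 4.0, 4.1, 2.9, 2.9, 1.2,
            4.0, 3.0, 3.9, 3.1, 2.5, 3.2, 4.1, 3.9, 1.1, 1.9, 3.1)

n0 <- length(not_primed)
n1 <- length(primed)

# Squared standard error contributed by each group.
v0 <- var(not_primed) / n0
v1 <- var(primed) / n1

# Welch t statistic: difference in means over the combined standard error.
t_stat <- (mean(not_primed) - mean(primed)) / sqrt(v0 + v1)

# Welch-Satterthwaite approximation to the degrees of freedom.
df <- (v0 + v1)^2 / (v0^2 / (n0 - 1) + v1^2 / (n1 - 1))

# Two-sided p-value from the t distribution.
p_value <- 2 * pt(-abs(t_stat), df)

print(round(c(t = t_stat, df = df, p = p_value), 4))
```

The degrees of freedom are not a whole number because the Welch test does not assume equal variances in the two groups.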

Step 13: Interpret the p-value

Since the p-value (0.002666) is below 0.05, we reject the null hypothesis, which states that there is no difference in preference between primed and non-primed consumers.

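Step 14: Interpret the confidence interval

The 95 percent confidence interval for the difference in means (non-primed minus primed) runs from about -1.58 to -0.36 and does not contain zero. This is consistent with the Step 13 result: a difference of zero between the group means is not plausible at the 0.05 level.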
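The interval can also be pulled out of the test object and checked against zero in code. A minimal sketch, assuming the ratings printed earlier; it passes the two groups as vectors, which for this data is equivalent to the formula call used in Step 12.

```r
# Values copied from the printed data above (rows 1-20, then rows 21-42).
not_primed <- c(1.8, 0.1, 4.0, 2.1, 2.4, 3.4, 1.7, 2.2, 1.9, 1.9,
                0.1, 3.3, 2.1, 2.0, 1.4, 1.6, 2.3, 1.8, 3.2, 0.8)
primed <- c(1.7, 1.7, 4.2, 3.0, 2.9, 3.0, 4.0, 4.1, 2.9, 2.9, 1.2,
            4.0, 3.0, 3.9, 3.1, 2.5, 3.2, 4.1, 3.9, 1.1, 1.9, 3.1)

# t.test() runs the Welch test by default (var.equal = FALSE).
tt <- t.test(not_primed, primed)

# 95 percent confidence interval for mean(not_primed) - mean(primed).
ci <- tt$conf.int

# TRUE only if zero lies between the lower and upper bounds.
contains_zero <- ci[1] <= 0 && ci[2] >= 0

print(ci)
print(contains_zero)
```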

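Step 15: Interpret the sample estimates

The sample mean preference is 2.005 for the non-primed group (group 0) and about 2.973 for the primed group (group 1), a difference of roughly 0.97 in favor of primed consumers. Together with the test result, this answers our original question: is there a difference in preference for a product between primed and non-primed consumers? The data indicate that there is, with primed consumers rating the product higher on average.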

Step 16: State your conclusion

At the 0.05 significance level, we have enough evidence to conclude that primed consumers rated the product higher on average than non-primed consumers. We came to this conclusion through an analysis of a qq-plot, a Welch two-sample t-test, and two additional plots (a scatter plot and a box plot).

Alexis Portnoy

12/1/2017