STEP ONE: DESIGN THE EXPERIMENT

STEP TWO: LOAD DATA

library(ggplot2)
preference <- read.csv("preference.csv")
preference
##    preference primed
## 1         1.8      0
## 2         0.1      0
## 3         4.0      0
## 4         2.1      0
## 5         2.4      0
## 6         3.4      0
## 7         1.7      0
## 8         2.2      0
## 9         1.9      0
## 10        1.9      0
## 11        0.1      0
## 12        3.3      0
## 13        2.1      0
## 14        2.0      0
## 15        1.4      0
## 16        1.6      0
## 17        2.3      0
## 18        1.8      0
## 19        3.2      0
## 20        0.8      0
## 21        1.7      1
## 22        1.7      1
## 23        4.2      1
## 24        3.0      1
## 25        2.9      1
## 26        3.0      1
## 27        4.0      1
## 28        4.1      1
## 29        2.9      1
## 30        2.9      1
## 31        1.2      1
## 32        4.0      1
## 33        3.0      1
## 34        3.9      1
## 35        3.1      1
## 36        2.5      1
## 37        3.2      1
## 38        4.1      1
## 39        3.9      1
## 40        1.1      1
## 41        1.9      1
## 42        3.1      1

STEP THREE: DESCRIBE DATA

This data set has 42 rows and two columns. Each row is one subject. The preference column records the subject’s feelings towards the pet on the label, and the primed column indicates whether the subject was primed (1) or not (0). There are 20 un-primed subjects and 22 primed subjects.
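The description above can be checked directly in R. This is a minimal sketch; it assumes the data frame is typed in from the listing in STEP TWO (so it runs without preference.csv), and the name group_sizes is only illustrative.

```r
# Data typed in from the listing in STEP TWO so the sketch runs without preference.csv
preference <- data.frame(
  preference = c(1.8, 0.1, 4.0, 2.1, 2.4, 3.4, 1.7, 2.2, 1.9, 1.9,
                 0.1, 3.3, 2.1, 2.0, 1.4, 1.6, 2.3, 1.8, 3.2, 0.8,
                 1.7, 1.7, 4.2, 3.0, 2.9, 3.0, 4.0, 4.1, 2.9, 2.9, 1.2,
                 4.0, 3.0, 3.9, 3.1, 2.5, 3.2, 4.1, 3.9, 1.1, 1.9, 3.1),
  primed = rep(c(0, 1), times = c(20, 22))
)

dim(preference)   # number of rows, then number of columns
str(preference)   # column names and types

# Number of subjects in each group (0 = un-primed, 1 = primed)
group_sizes <- table(preference$primed)
print(group_sizes)
```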

STEP FOUR: IDENTIFY THE PURPOSE OF THE STUDY

The purpose of this study is to determine whether the animal on the label of a bottle makes a person more likely to purchase that drink, and whether priming people toward a certain image affects the choices they make based on the label.

STEP FIVE: VISUALIZE DATA

For this plot, the x variable (primed) is categorical and the y variable (preference) is quantitative.

library(ggplot2)
ggplot(data=preference, mapping=aes(x=as.factor(primed), y=preference)) + geom_point()
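As a numeric complement to the plot, the centre and spread of each group can be summarised with base R. This is a sketch under the assumption that the data frame is rebuilt from the listing in STEP TWO; the names group_mean, group_sd and group_range are illustrative.

```r
# Data typed in from the listing in STEP TWO so the sketch runs without preference.csv
preference <- data.frame(
  preference = c(1.8, 0.1, 4.0, 2.1, 2.4, 3.4, 1.7, 2.2, 1.9, 1.9,
                 0.1, 3.3, 2.1, 2.0, 1.4, 1.6, 2.3, 1.8, 3.2, 0.8,
                 1.7, 1.7, 4.2, 3.0, 2.9, 3.0, 4.0, 4.1, 2.9, 2.9, 1.2,
                 4.0, 3.0, 3.9, 3.1, 2.5, 3.2, 4.1, 3.9, 1.1, 1.9, 3.1),
  primed = rep(c(0, 1), times = c(20, 22))
)

# Mean, standard deviation and range of preference within each group
group_mean  <- tapply(preference$preference, preference$primed, mean)
group_sd    <- tapply(preference$preference, preference$primed, sd)
group_range <- tapply(preference$preference, preference$primed, range)

print(round(group_mean, 3))
print(round(group_sd, 3))
print(group_range)
```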

STEP SIX: INTERPRET THE PLOT

Preference values tend to be higher for subjects who have been primed than for those who haven’t. The primed group is fairly concentrated, while the data for the un-primed group has more spread. The un-primed group is concentrated around 1.5-2.5, while the primed group is concentrated around 3-4.

STEP SEVEN: FORMULATE THE NULL HYPOTHESIS

The null hypothesis is that the mean preference is the same in the primed and un-primed populations, so that it does not matter whether the person has been primed or not.

STEP EIGHT: IDENTIFY THE ALTERNATIVE HYPOTHESIS

The alternative hypothesis is that the mean preference of the primed population is larger than that of the un-primed population.
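In symbols, the two hypotheses can be written as below. The notation is an assumption made for illustration: \mu_0 stands for the mean preference of the un-primed population and \mu_1 for the mean preference of the primed population.

```latex
H_0:\ \mu_1 = \mu_0
\qquad \text{versus} \qquad
H_A:\ \mu_1 > \mu_0
```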

STEP NINE: DECIDE ON TYPE OF TEST

For this data, we will use a t-test because we are comparing means of a quantitative variable (preference).

STEP TEN: CHOOSE ONE SAMPLE OR TWO

Two sample - the primed population and the un-primed population.

STEP ELEVEN: CHECK ASSUMPTIONS OF THE TEST

For the t-test, the main assumption is that the data lie close enough to a Normal (bell shaped) distribution. How close does it have to be? It depends on the sample size: the greater the sample size, the more robust the t-test is to non-Normality. Even for small sample sizes (10 or 11) it is fairly robust, so unless there is strong skewness or there are substantial outliers we will be OK.

The best way of judging this is with a qq-plot.

ggplot(data=preference) + geom_qq(mapping=aes(sample=preference, color=as.factor(primed)))
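A numeric check can sit alongside the qq-plot. The sketch below runs a Shapiro-Wilk test on each group; this is a supplementary technique swapped in for illustration, not the method this report relies on, and it assumes the data frame is rebuilt from the listing in STEP TWO. The name shapiro_p is illustrative.

```r
# Data typed in from the listing in STEP TWO so the sketch runs without preference.csv
preference <- data.frame(
  preference = c(1.8, 0.1, 4.0, 2.1, 2.4, 3.4, 1.7, 2.2, 1.9, 1.9,
                 0.1, 3.3, 2.1, 2.0, 1.4, 1.6, 2.3, 1.8, 3.2, 0.8,
                 1.7, 1.7, 4.2, 3.0, 2.9, 3.0, 4.0, 4.1, 2.9, 2.9, 1.2,
                 4.0, 3.0, 3.9, 3.1, 2.5, 3.2, 4.1, 3.9, 1.1, 1.9, 3.1),
  primed = rep(c(0, 1), times = c(20, 22))
)

# Split the preference values by group, then run a Shapiro-Wilk test on each
by_group  <- split(preference$preference, preference$primed)
shapiro_p <- sapply(by_group, function(x) shapiro.test(x)$p.value)

# A small p-value would suggest a departure from Normality in that group
print(round(shapiro_p, 3))
```

With groups of about 20 subjects, such a test has limited power, so it is best read together with the qq-plot rather than in place of it.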

STEP TWELVE: DECIDE ON A LEVEL OF SIGNIFICANCE OF THE TEST

We use the conventional level of significance, 0.05.

STEP THIRTEEN: PERFORM THE TEST

t.test(formula=preference~primed, data=preference)
## 
##  Welch Two Sample t-test
## 
## data:  preference by primed
## t = -3.2072, df = 39.282, p-value = 0.002666
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -1.577912 -0.357543
## sample estimates:
## mean in group 0 mean in group 1 
##        2.005000        2.972727
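To show where the t statistic, degrees of freedom and p-value in this output come from, the Welch calculation can be reproduced by hand. This is a minimal sketch, assuming the data frame is rebuilt from the listing in STEP TWO; the names x0, x1, se, t_stat, welch_df and p_value are illustrative.

```r
# Data typed in from the listing in STEP TWO so the sketch runs without preference.csv
preference <- data.frame(
  preference = c(1.8, 0.1, 4.0, 2.1, 2.4, 3.4, 1.7, 2.2, 1.9, 1.9,
                 0.1, 3.3, 2.1, 2.0, 1.4, 1.6, 2.3, 1.8, 3.2, 0.8,
                 1.7, 1.7, 4.2, 3.0, 2.9, 3.0, 4.0, 4.1, 2.9, 2.9, 1.2,
                 4.0, 3.0, 3.9, 3.1, 2.5, 3.2, 4.1, 3.9, 1.1, 1.9, 3.1),
  primed = rep(c(0, 1), times = c(20, 22))
)

x0 <- preference$preference[preference$primed == 0]  # un-primed group
x1 <- preference$preference[preference$primed == 1]  # primed group

# Each group's contribution to the squared standard error: s^2 / n
v0 <- var(x0) / length(x0)
v1 <- var(x1) / length(x1)
se <- sqrt(v0 + v1)

# t statistic for the difference (un-primed minus primed)
t_stat <- (mean(x0) - mean(x1)) / se

# Welch-Satterthwaite approximation to the degrees of freedom
welch_df <- (v0 + v1)^2 / (v0^2 / (length(x0) - 1) + v1^2 / (length(x1) - 1))

# Two-sided p-value
p_value <- 2 * pt(-abs(t_stat), welch_df)

print(c(t = t_stat, df = welch_df, p = p_value))
```

These values should agree with the t, df and p-value lines in the t.test output above.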

STEP FOURTEEN: INTERPRET THE P-VALUE

Since the p-value (.002666) is less than the level of significance (.05), we reject the null hypothesis that the means are equal.
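STEP EIGHT states a directional alternative, while the output in STEP THIRTEEN reports a two-sided test. A hedged sketch of the one-sided version follows; it assumes the data frame is rebuilt from the listing in STEP TWO, and the names two_sided and one_sided are illustrative.

```r
# Data typed in from the listing in STEP TWO so the sketch runs without preference.csv
preference <- data.frame(
  preference = c(1.8, 0.1, 4.0, 2.1, 2.4, 3.4, 1.7, 2.2, 1.9, 1.9,
                 0.1, 3.3, 2.1, 2.0, 1.4, 1.6, 2.3, 1.8, 3.2, 0.8,
                 1.7, 1.7, 4.2, 3.0, 2.9, 3.0, 4.0, 4.1, 2.9, 2.9, 1.2,
                 4.0, 3.0, 3.9, 3.1, 2.5, 3.2, 4.1, 3.9, 1.1, 1.9, 3.1),
  primed = rep(c(0, 1), times = c(20, 22))
)

# Default two-sided Welch test, as in STEP THIRTEEN
two_sided <- t.test(preference ~ primed, data = preference)

# Groups are ordered 0 then 1, so the difference is un-primed minus primed.
# alternative = "less" tests whether that difference is below 0,
# i.e. whether the primed mean is larger.
one_sided <- t.test(preference ~ primed, data = preference, alternative = "less")

print(two_sided$p.value)
print(one_sided$p.value)
```

Because the observed difference is in the direction of the alternative, the one-sided p-value is half the two-sided one, so the decision to reject the null hypothesis at 0.05 does not change.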

STEP FIFTEEN: INTERPRET THE CONFIDENCE INTERVAL

The 95% confidence interval for the difference in means (un-primed minus primed) runs from -1.577912 to -0.357543. The confidence interval is the range of plausible values for the difference in means. Zero is not in this interval, so 0 is not a plausible value for the difference in means, and it is not plausible that the means are the same. The result of STEP 15 is consistent with the result of STEP 14.
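The interval itself can be rebuilt as the observed difference plus or minus a t quantile times the standard error. This is a sketch under the assumption that the data frame is rebuilt from the listing in STEP TWO; the names diff_means, se, welch_df and ci are illustrative.

```r
# Data typed in from the listing in STEP TWO so the sketch runs without preference.csv
preference <- data.frame(
  preference = c(1.8, 0.1, 4.0, 2.1, 2.4, 3.4, 1.7, 2.2, 1.9, 1.9,
                 0.1, 3.3, 2.1, 2.0, 1.4, 1.6, 2.3, 1.8, 3.2, 0.8,
                 1.7, 1.7, 4.2, 3.0, 2.9, 3.0, 4.0, 4.1, 2.9, 2.9, 1.2,
                 4.0, 3.0, 3.9, 3.1, 2.5, 3.2, 4.1, 3.9, 1.1, 1.9, 3.1),
  primed = rep(c(0, 1), times = c(20, 22))
)

x0 <- preference$preference[preference$primed == 0]  # un-primed group
x1 <- preference$preference[preference$primed == 1]  # primed group

# Observed difference in means (un-primed minus primed) and its standard error
diff_means <- mean(x0) - mean(x1)
v0 <- var(x0) / length(x0)
v1 <- var(x1) / length(x1)
se <- sqrt(v0 + v1)

# Welch-Satterthwaite degrees of freedom
welch_df <- (v0 + v1)^2 / (v0^2 / (length(x0) - 1) + v1^2 / (length(x1) - 1))

# 95% interval: difference +/- t quantile * standard error
ci <- diff_means + c(-1, 1) * qt(0.975, welch_df) * se
print(ci)

# The same interval as reported by t.test
print(t.test(preference ~ primed, data = preference)$conf.int)
```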

STEP SIXTEEN: INTERPRET THE SAMPLE ESTIMATES

We have concluded that the means are not equal, but we really want to know: which group has the higher preference, primed or un-primed? Knowing that the means are unequal, we can answer this question by looking at the sample estimates: primed subjects had a higher mean preference (2.972727) than un-primed subjects (2.005).

STEP SEVENTEEN: STATE YOUR CONCLUSION

We have concluded from the evidence that primed subjects have a higher preference for the products with an animal on them than un-primed subjects do. Businesses could use this as evidence that recognizable labels may make consumers more likely to buy their product.