library(ggplot2)

Step 1: Design the Experiment

Normally this would be the first step in a project; however, this experiment has already been designed for us. We will decide what tests to use and what hypotheses to test. Often you make these decisions before you collect data. However, because we are handed a data set, we put these decisions off until after we have seen the data.

Step 2: Load the Data

preference <- read.csv("preferencedata.csv")
preference
##    preference primed
## 1         1.8      0
## 2         0.1      0
## 3         4.0      0
## 4         2.1      0
## 5         2.4      0
## 6         3.4      0
## 7         1.7      0
## 8         2.2      0
## 9         1.9      0
## 10        1.9      0
## 11        0.1      0
## 12        3.3      0
## 13        2.1      0
## 14        2.0      0
## 15        1.4      0
## 16        1.6      0
## 17        2.3      0
## 18        1.8      0
## 19        3.2      0
## 20        0.8      0
## 21        1.7      1
## 22        1.7      1
## 23        4.2      1
## 24        3.0      1
## 25        2.9      1
## 26        3.0      1
## 27        4.0      1
## 28        4.1      1
## 29        2.9      1
## 30        2.9      1
## 31        1.2      1
## 32        4.0      1
## 33        3.0      1
## 34        3.9      1
## 35        3.1      1
## 36        2.5      1
## 37        3.2      1
## 38        4.1      1
## 39        3.9      1
## 40        1.1      1
## 41        1.9      1
## 42        3.1      1

Step 3: Describe the Data

According to traditional brand research, successful logos are ones that are highly relevant to the product they represent. However, a market research firm recently reported that nearly 20% of all table wine brands introduced in the last three years featured an animal on the label. Since animals have little to do with the product, we would like to know why marketers are using this tactic. Some researchers have hypothesized that consumers who are “primed,” that is, who have thought about the image earlier in an unrelated context, process visual information more easily and might have different preferences from non-primed consumers.

To investigate this, a research team randomly assigned participants to either a primed condition (denoted 1 in the data set) or non-primed condition (denoted 0). Each participant was asked to indicate their attitude toward a product on a continuous rating scale ranging from 0 (dislike very much) to 6 (like very much). The product was a bottle of pet shampoo with a picture of a collie on the label. Prior to giving their score, however, participants were asked to do a word find where four of the words were common across groups (pet, grooming, bottle, label) and four were either related to the image (dog, collie, puppy, woof for the primed group) or image conflicting (cat, feline, kitten, meow for the non-primed group). Responses from 42 individuals (20 in the non-primed group and 22 in the primed group) were recorded to one decimal place.
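As a quick check on this description, the group sizes and basic summaries can be computed directly. The sketch below is not part of the original analysis; it rebuilds the data frame from the values printed in Step 2 so that it runs without the CSV file, and the helper names (`not_primed`, `primed_group`, `group_n`, `group_mean`, `group_sd`) are chosen here for illustration.

```r
# Ratings copied from the listing in Step 2 (rows 1-20 and rows 21-42).
not_primed <- c(1.8, 0.1, 4.0, 2.1, 2.4, 3.4, 1.7, 2.2, 1.9, 1.9,
                0.1, 3.3, 2.1, 2.0, 1.4, 1.6, 2.3, 1.8, 3.2, 0.8)
primed_group <- c(1.7, 1.7, 4.2, 3.0, 2.9, 3.0, 4.0, 4.1, 2.9, 2.9, 1.2,
                  4.0, 3.0, 3.9, 3.1, 2.5, 3.2, 4.1, 3.9, 1.1, 1.9, 3.1)
preference <- data.frame(
  preference = c(not_primed, primed_group),
  primed = rep(c(0, 1), times = c(length(not_primed), length(primed_group)))
)

# Number of participants, mean rating, and standard deviation in each group.
group_n <- table(preference$primed)
group_mean <- tapply(preference$preference, preference$primed, mean)
group_sd <- tapply(preference$preference, preference$primed, sd)

data.frame(
  primed = names(group_mean),
  n = as.vector(group_n),
  mean = as.vector(group_mean),
  sd = as.vector(group_sd)
)
```

The counts should agree with the number of rows in the listing, and every rating should fall within the 0 to 6 scale.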

Step 4: Purpose

The purpose of this study is to determine whether priming a subject with words related to the image on a product's label has an effect on their preference for that product.

Step 5: Visualize the Data

ggplot(data=preference, mapping=aes(x=as.factor(primed) , y=preference)) + geom_point()

Step 6: Interpret the Plot

Looking at this plot, we can see that the primed subjects may tend to give higher preference ratings than the non-primed subjects.
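With only one decimal place recorded, several points in a plain scatter plot sit on top of each other. One possible alternative, sketched here as a suggestion rather than as part of the original analysis, overlays jittered points on a box plot so that the group medians and overlapping ratings are easier to see. The axis labels and the object name `p` are choices made for this sketch; the data are re-entered from the Step 2 listing so the block runs on its own.

```r
library(ggplot2)

# Ratings copied from the listing in Step 2 (rows 1-20 and rows 21-42).
not_primed <- c(1.8, 0.1, 4.0, 2.1, 2.4, 3.4, 1.7, 2.2, 1.9, 1.9,
                0.1, 3.3, 2.1, 2.0, 1.4, 1.6, 2.3, 1.8, 3.2, 0.8)
primed_group <- c(1.7, 1.7, 4.2, 3.0, 2.9, 3.0, 4.0, 4.1, 2.9, 2.9, 1.2,
                  4.0, 3.0, 3.9, 3.1, 2.5, 3.2, 4.1, 3.9, 1.1, 1.9, 3.1)
preference <- data.frame(
  preference = c(not_primed, primed_group),
  primed = rep(c(0, 1), times = c(length(not_primed), length(primed_group)))
)

# Box plot per group, with the raw ratings jittered sideways only,
# so the vertical positions still show the recorded values.
p <- ggplot(data = preference,
            mapping = aes(x = as.factor(primed), y = preference)) +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter(width = 0.1, height = 0) +
  labs(x = "Primed (0 = no, 1 = yes)", y = "Preference rating (0 to 6)")
p
```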

Step 7: Formulate the Null Hypothesis

First we identify the two samples drawn from the population of all people:

  1. People who were primed
  2. People who were not primed

The null hypothesis for this experiment would be:

“The null hypothesis is that the population mean for the “primed” group is equal to the population mean of the “not primed” group. Typically this is expressed as: the difference in means is zero.”

Step 8: Identify Alternative Hypothesis

This is the statement that the population means are different:

“The population mean for the “primed” group is not equal to the population mean of the “not primed” group. This is typically expressed as: the difference in means is not zero.”
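The two hypotheses can also be written in symbols. The notation is an assumption introduced here, not taken from the original: $\mu_0$ stands for the population mean rating of the “not primed” group and $\mu_1$ for that of the “primed” group.

```latex
\[
H_0 : \mu_1 - \mu_0 = 0
\qquad \text{versus} \qquad
H_A : \mu_1 - \mu_0 \neq 0
\]
```

Because the alternative is “not equal” rather than “greater than,” this is a two-sided test.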

Step 9: Decide on type of Test

We can choose between two different tests: a t-test and a proportions test. A t-test is for testing hypotheses about population means of a quantitative variable, and a proportions test is for when the variable is categorical, such as “yes” or “no”. The correct choice for this experiment is a t-test, because the preference rating is a quantitative variable and our hypotheses are about group means.

Step 10: Choose one Sample or Two

For this experiment we will use a two-sample test because we have two distinct samples:

  1. People primed
  2. People not primed

Step 11: Check Assumptions of the Test

ggplot(data=preference) + geom_qq(mapping=aes(sample=preference, color=as.factor(primed)))

Above we decided that a t-test is the best choice for this example. For the t-test, the main assumption is that the data in each group lie close enough to a Normal (bell-shaped) distribution. If the data are Normal, the points on the Q-Q plot will lie close to a straight line. This graph shows that the points for each group are probably close enough to a line to treat the data as approximately Normal.
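Reading a Q-Q plot is a judgment call, so a formal check can be a useful supplement. The sketch below runs the Shapiro-Wilk test (`shapiro.test()` from base R's stats package) on each group. This step is an addition, not part of the original analysis, and the name `sw_p` is chosen for illustration. A small p-value would suggest a departure from Normality, although with about 20 observations per group the test has limited power.

```r
# Ratings copied from the listing in Step 2 (rows 1-20 and rows 21-42).
not_primed <- c(1.8, 0.1, 4.0, 2.1, 2.4, 3.4, 1.7, 2.2, 1.9, 1.9,
                0.1, 3.3, 2.1, 2.0, 1.4, 1.6, 2.3, 1.8, 3.2, 0.8)
primed_group <- c(1.7, 1.7, 4.2, 3.0, 2.9, 3.0, 4.0, 4.1, 2.9, 2.9, 1.2,
                  4.0, 3.0, 3.9, 3.1, 2.5, 3.2, 4.1, 3.9, 1.1, 1.9, 3.1)
preference <- data.frame(
  preference = c(not_primed, primed_group),
  primed = rep(c(0, 1), times = c(length(not_primed), length(primed_group)))
)

# Shapiro-Wilk p-value for each group (null hypothesis: the data are Normal).
sw_p <- tapply(preference$preference, preference$primed,
               function(x) shapiro.test(x)$p.value)
sw_p
```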

Step 12: Decide on Level of Significance

It is commonly accepted among statisticians that a level of 0.05 is an appropriate level of significance. This is the level we will use for this experiment.

Step 13: Perform the Test

t.test(formula=preference~as.factor(primed), data=preference)
## 
##  Welch Two Sample t-test
## 
## data:  preference by as.factor(primed)
## t = -3.2072, df = 39.282, p-value = 0.002666
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -1.577912 -0.357543
## sample estimates:
## mean in group 0 mean in group 1 
##        2.005000        2.972727

Step 14: Interpret the P-value

Since the p-value is less than the level of significance (0.002666 < 0.05), we REJECT the null hypothesis that the means are equal.
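To see where the numbers in the test output come from, the Welch statistic can be computed by hand. This is a sketch for illustration, with variable names (`x0`, `x1`, `v0`, `v1`, `se`, `t_stat`, `df`, `p_value`) chosen here. It uses the standard Welch formulas: the standard error combines the two sample variances, and the degrees of freedom come from the Welch-Satterthwaite approximation. The results should reproduce the t, df, and p-value printed in Step 13.

```r
# Ratings copied from the listing in Step 2.
x0 <- c(1.8, 0.1, 4.0, 2.1, 2.4, 3.4, 1.7, 2.2, 1.9, 1.9,
        0.1, 3.3, 2.1, 2.0, 1.4, 1.6, 2.3, 1.8, 3.2, 0.8)       # not primed
x1 <- c(1.7, 1.7, 4.2, 3.0, 2.9, 3.0, 4.0, 4.1, 2.9, 2.9, 1.2,
        4.0, 3.0, 3.9, 3.1, 2.5, 3.2, 4.1, 3.9, 1.1, 1.9, 3.1)  # primed

# Variance of each sample mean.
v0 <- var(x0) / length(x0)
v1 <- var(x1) / length(x1)

# Standard error of the difference and the t statistic
# (group 0 minus group 1, the same order t.test() uses).
se <- sqrt(v0 + v1)
t_stat <- (mean(x0) - mean(x1)) / se

# Welch-Satterthwaite degrees of freedom.
df <- (v0 + v1)^2 / (v0^2 / (length(x0) - 1) + v1^2 / (length(x1) - 1))

# Two-sided p-value.
p_value <- 2 * pt(-abs(t_stat), df)

c(t = t_stat, df = df, p.value = p_value)
```

The statistic is negative only because the primed mean is subtracted from the non-primed mean; the two-sided p-value does not depend on that ordering.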

Step 15: Interpret the Confidence Interval

The confidence interval is the range of plausible values for the difference in means. In this experiment the confidence interval is between -1.577912 and -0.357543, which does not include zero. Zero is not a plausible value for the difference in means, so it is not plausible that the means are the same. The result of Step 15 is consistent with the result of Step 14: we ACCEPT the alternative hypothesis.
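The interval itself is the observed difference in means plus or minus a margin of error. The sketch below, with names chosen for illustration (`x0`, `x1`, `se`, `df`, `margin`, `ci`), rebuilds it from the same Welch standard error and degrees of freedom that the test uses, and should reproduce the 95 percent interval printed in Step 13.

```r
# Ratings copied from the listing in Step 2.
x0 <- c(1.8, 0.1, 4.0, 2.1, 2.4, 3.4, 1.7, 2.2, 1.9, 1.9,
        0.1, 3.3, 2.1, 2.0, 1.4, 1.6, 2.3, 1.8, 3.2, 0.8)       # not primed
x1 <- c(1.7, 1.7, 4.2, 3.0, 2.9, 3.0, 4.0, 4.1, 2.9, 2.9, 1.2,
        4.0, 3.0, 3.9, 3.1, 2.5, 3.2, 4.1, 3.9, 1.1, 1.9, 3.1)  # primed

# Welch standard error and degrees of freedom.
v0 <- var(x0) / length(x0)
v1 <- var(x1) / length(x1)
se <- sqrt(v0 + v1)
df <- (v0 + v1)^2 / (v0^2 / (length(x0) - 1) + v1^2 / (length(x1) - 1))

# 95% interval: difference in means +/- critical t value times standard error.
diff_means <- mean(x0) - mean(x1)
margin <- qt(0.975, df) * se
ci <- c(lower = diff_means - margin, upper = diff_means + margin)
ci
```

Both endpoints are negative because the difference is taken as group 0 minus group 1; an interval entirely below zero corresponds to a higher mean in the primed group.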

Step 16: Interpret the sample Estimates

The mean of the primed group (group 1) is higher than the mean of the not primed group (2.972727 > 2.005000). This shows us that, on average, the primed group rated the product more favorably.

Step 17: Conclusion

Using all the data we have collected, we can conclude that there is enough evidence to show that people who are primed before viewing a product tend to prefer that product more. Advertising companies may be able to use this to reach a broader consumer base. This may be why some wine companies have put animals on their wine labels: animals are something that people have been primed to by society.