library(ggplot2)
Normally this would be the first step in a project; however, this experiment has already been designed for us. We still need to decide what tests to use and what hypotheses to test. Ideally you make these decisions before you collect data. However, because we were handed a data set, we put these decisions off until after we have seen the data.
preference <- read.csv("preferencedata.csv")
preference
## preference primed
## 1 1.8 0
## 2 0.1 0
## 3 4.0 0
## 4 2.1 0
## 5 2.4 0
## 6 3.4 0
## 7 1.7 0
## 8 2.2 0
## 9 1.9 0
## 10 1.9 0
## 11 0.1 0
## 12 3.3 0
## 13 2.1 0
## 14 2.0 0
## 15 1.4 0
## 16 1.6 0
## 17 2.3 0
## 18 1.8 0
## 19 3.2 0
## 20 0.8 0
## 21 1.7 1
## 22 1.7 1
## 23 4.2 1
## 24 3.0 1
## 25 2.9 1
## 26 3.0 1
## 27 4.0 1
## 28 4.1 1
## 29 2.9 1
## 30 2.9 1
## 31 1.2 1
## 32 4.0 1
## 33 3.0 1
## 34 3.9 1
## 35 3.1 1
## 36 2.5 1
## 37 3.2 1
## 38 4.1 1
## 39 3.9 1
## 40 1.1 1
## 41 1.9 1
## 42 3.1 1
According to traditional brand research, successful logos are ones that are highly relevant to the product they represent. However, a market research firm recently reported that nearly 20% of all table wine brands introduced in the last three years featured an animal on the label. Since animals have little to do with the product, we would like to know why marketers are using this tactic. Some researchers have hypothesized that consumers who are “primed” (that is, who have thought about the image earlier in an unrelated context) process visual information more easily and might have different preferences from non-primed consumers.
To investigate this, a research team randomly assigned participants to either a primed condition (denoted 1 in the data set) or a non-primed condition (denoted 0). Each participant was asked to indicate their attitude toward a product on a continuous rating scale ranging from 0 (dislike very much) to 6 (like very much). The product was a bottle of pet shampoo with a picture of a collie on the label. Prior to giving their score, however, participants were asked to do a word find in which four of the words were common across groups (pet, grooming, bottle, label) and four were either related to the image (dog, collie, puppy, woof for the primed group) or conflicting with the image (cat, feline, kitten, meow for the non-primed group). Responses from 42 individuals (20 in the non-primed group and 22 in the primed group) were recorded to one decimal place.
The purpose of this study is to determine whether priming a subject with words related to the image on a label has an effect on their preference for the product carrying that label.
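As a quick check, the group sizes can be tabulated directly from the data. The sketch below re-enters the ratings from the data listing so that it runs without `preferencedata.csv`; the name `n_by_group` is introduced here for illustration only.

```r
# Ratings copied from the data listing (non-primed rows first, then primed).
preference <- data.frame(
  preference = c(
    1.8, 0.1, 4.0, 2.1, 2.4, 3.4, 1.7, 2.2, 1.9, 1.9,
    0.1, 3.3, 2.1, 2.0, 1.4, 1.6, 2.3, 1.8, 3.2, 0.8,
    1.7, 1.7, 4.2, 3.0, 2.9, 3.0, 4.0, 4.1, 2.9, 2.9,
    1.2, 4.0, 3.0, 3.9, 3.1, 2.5, 3.2, 4.1, 3.9, 1.1,
    1.9, 3.1
  ),
  primed = rep(c(0, 1), times = c(20, 22))
)

# Count participants per condition (hypothetical helper name).
n_by_group <- table(preference$primed)
n_by_group        # 20 in group 0, 22 in group 1
sum(n_by_group)   # 42 participants in total
```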
ggplot(data=preference, mapping=aes(x=as.factor(primed) , y=preference)) + geom_point()
This plot suggests that the test subjects who were primed may tend to give higher preference ratings than those who were not.
First we identify the samples from the population of all people: the participants in the primed group and the participants in the non-primed group.
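A numeric summary can back up the visual impression from the plot. The following is a minimal sketch, assuming the ratings are re-entered from the data listing rather than read from the CSV; `group_mean` and `group_sd` are names chosen here for illustration.

```r
# Ratings copied from the data listing (non-primed rows first, then primed).
preference <- data.frame(
  preference = c(
    1.8, 0.1, 4.0, 2.1, 2.4, 3.4, 1.7, 2.2, 1.9, 1.9,
    0.1, 3.3, 2.1, 2.0, 1.4, 1.6, 2.3, 1.8, 3.2, 0.8,
    1.7, 1.7, 4.2, 3.0, 2.9, 3.0, 4.0, 4.1, 2.9, 2.9,
    1.2, 4.0, 3.0, 3.9, 3.1, 2.5, 3.2, 4.1, 3.9, 1.1,
    1.9, 3.1
  ),
  primed = rep(c(0, 1), times = c(20, 22))
)

# Mean and standard deviation of the rating within each condition.
group_mean <- tapply(preference$preference, preference$primed, mean)
group_sd   <- tapply(preference$preference, preference$primed, sd)

round(group_mean, 3)  # 2.005 for group 0, 2.973 for group 1
round(group_sd, 3)
```

The sample means differ by roughly one point on the 0 to 6 scale; whether that difference is larger than chance alone would produce is what the hypothesis test addresses.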
The null hypothesis for this experiment would be:
“The population mean for the ‘primed’ group is equal to the population mean of the ‘not primed’ group. Typically this is expressed as: the difference in means is zero.”
The alternative hypothesis is the statement that the population means are different:
“The population mean for the ‘primed’ group is not equal to the population mean of the ‘not primed’ group. This is typically expressed as: the difference in means is not zero.”
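The two hypotheses can also be written in symbols. The notation here is an assumption made for illustration: \mu_0 stands for the population mean rating of the non-primed group and \mu_1 for that of the primed group, with the difference taken as group 0 minus group 1.

```latex
H_0:\ \mu_{0} - \mu_{1} = 0
\qquad\text{versus}\qquad
H_a:\ \mu_{0} - \mu_{1} \neq 0
```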
We can choose between two different tests: a t-test and a proportions test. A t-test is for testing hypotheses about population means of a quantitative variable, and a proportions test is for categorical variables, such as “yes” or “no” responses. The correct choice for this experiment is a t-test, because the preference rating is a quantitative variable.
For this experiment we will use a two-sample test because we have two distinct samples:
ggplot(data=preference) + geom_qq(mapping=aes(sample=preference, color=as.factor(primed)))
Above we decided a t-test would be the best choice for this example. For the t-test, the main assumption is that the data in each group lie close enough to a Normal (bell-shaped) distribution. If the data are Normal, the points on a normal quantile-quantile plot will lie close to a straight line. This graph shows that the points for each group are close enough to a straight line to treat the data as approximately Normal.
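Judging straightness by eye is easier with a reference line. The sketch below is one possible variant of the plot, not the original analysis: it adds `geom_qq_line()` from ggplot2 and re-enters the ratings from the data listing so it runs on its own. The name `qq_plot` is hypothetical.

```r
library(ggplot2)

# Ratings copied from the data listing (non-primed rows first, then primed).
preference <- data.frame(
  preference = c(
    1.8, 0.1, 4.0, 2.1, 2.4, 3.4, 1.7, 2.2, 1.9, 1.9,
    0.1, 3.3, 2.1, 2.0, 1.4, 1.6, 2.3, 1.8, 3.2, 0.8,
    1.7, 1.7, 4.2, 3.0, 2.9, 3.0, 4.0, 4.1, 2.9, 2.9,
    1.2, 4.0, 3.0, 3.9, 3.1, 2.5, 3.2, 4.1, 3.9, 1.1,
    1.9, 3.1
  ),
  primed = rep(c(0, 1), times = c(20, 22))
)

# Normal Q-Q points plus one reference line per group.
qq_plot <- ggplot(data = preference,
                  mapping = aes(sample = preference, color = as.factor(primed))) +
  geom_qq() +
  geom_qq_line()
qq_plot
```

Points that stay near their group's line support the Normality assumption; strong curvature or isolated points far from the line would be a reason for caution.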
It is commonly accepted among statisticians that a level of 0.05 is an appropriate level of significance. This is the level we will use for this experiment.
t.test(formula=preference~as.factor(primed), data=preference)
##
## Welch Two Sample t-test
##
## data: preference by as.factor(primed)
## t = -3.2072, df = 39.282, p-value = 0.002666
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -1.577912 -0.357543
## sample estimates:
## mean in group 0 mean in group 1
## 2.005000 2.972727
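To see where the numbers in this output come from, the Welch statistic can be computed by hand. This is a minimal sketch under the assumption that the ratings are re-entered from the data listing; the variable names (`v0`, `v1`, `se_diff`, `welch_df`) are chosen here for illustration and are not part of `t.test()`.

```r
# Ratings copied from the data listing (non-primed rows first, then primed).
preference <- data.frame(
  preference = c(
    1.8, 0.1, 4.0, 2.1, 2.4, 3.4, 1.7, 2.2, 1.9, 1.9,
    0.1, 3.3, 2.1, 2.0, 1.4, 1.6, 2.3, 1.8, 3.2, 0.8,
    1.7, 1.7, 4.2, 3.0, 2.9, 3.0, 4.0, 4.1, 2.9, 2.9,
    1.2, 4.0, 3.0, 3.9, 3.1, 2.5, 3.2, 4.1, 3.9, 1.1,
    1.9, 3.1
  ),
  primed = rep(c(0, 1), times = c(20, 22))
)
x0 <- preference$preference[preference$primed == 0]
x1 <- preference$preference[preference$primed == 1]

# Each group's contribution to the squared standard error of the difference.
v0 <- var(x0) / length(x0)
v1 <- var(x1) / length(x1)

# Test statistic: difference in sample means (group 0 minus group 1) over its standard error.
se_diff <- sqrt(v0 + v1)
t_stat  <- (mean(x0) - mean(x1)) / se_diff

# Welch-Satterthwaite approximation to the degrees of freedom.
welch_df <- (v0 + v1)^2 / (v0^2 / (length(x0) - 1) + v1^2 / (length(x1) - 1))

# Two-sided p-value from the t distribution.
p_value <- 2 * pt(-abs(t_stat), df = welch_df)

# These should agree with the t, df, and p-value lines of the t.test() output.
round(c(t = t_stat, df = welch_df, p = p_value), 4)
```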
Since the p-value is less than the level of significance (0.002666 < 0.05), we REJECT the null hypothesis that the means are equal.
The confidence interval is the range of plausible values for the difference in means. In this experiment the confidence interval runs from -1.577912 to -0.357543, which does not include zero. Zero is not a plausible value for the difference in means, so it is not plausible that the means are the same. The confidence interval is consistent with the result of the hypothesis test, so we ACCEPT the alternative hypothesis.
The mean of the primed group (group 1) is higher than the mean of the non-primed group (2.972727 > 2.005000). This shows us that, on average, the primed group rated the product more favorably.
Using all the data we have collected, we conclude that there is enough evidence to show that people who are primed prior to viewing a product tend to rate that product more favorably. Advertising companies could use this to reach a broader consumer base. This may be why some wine companies have put animals on their wine labels: animals are something that people have already been primed to by society.
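The interval itself is the difference in sample means plus or minus a margin of error. The following sketch recomputes it by hand, assuming the ratings are re-entered from the data listing; all variable names are illustrative.

```r
# Ratings copied from the data listing (non-primed rows first, then primed).
preference <- data.frame(
  preference = c(
    1.8, 0.1, 4.0, 2.1, 2.4, 3.4, 1.7, 2.2, 1.9, 1.9,
    0.1, 3.3, 2.1, 2.0, 1.4, 1.6, 2.3, 1.8, 3.2, 0.8,
    1.7, 1.7, 4.2, 3.0, 2.9, 3.0, 4.0, 4.1, 2.9, 2.9,
    1.2, 4.0, 3.0, 3.9, 3.1, 2.5, 3.2, 4.1, 3.9, 1.1,
    1.9, 3.1
  ),
  primed = rep(c(0, 1), times = c(20, 22))
)
x0 <- preference$preference[preference$primed == 0]
x1 <- preference$preference[preference$primed == 1]

# Standard error of the difference and Welch degrees of freedom.
v0 <- var(x0) / length(x0)
v1 <- var(x1) / length(x1)
se_diff  <- sqrt(v0 + v1)
welch_df <- (v0 + v1)^2 / (v0^2 / (length(x0) - 1) + v1^2 / (length(x1) - 1))

# 95% interval: point estimate plus or minus critical value times standard error.
diff_means <- mean(x0) - mean(x1)
margin     <- qt(0.975, df = welch_df) * se_diff
ci <- c(lower = diff_means - margin, upper = diff_means + margin)
ci

# Is zero inside the interval?
contains_zero <- ci[["lower"]] <= 0 && 0 <= ci[["upper"]]
contains_zero  # FALSE: both endpoints are negative
```

Because the difference is taken as group 0 minus group 1, a wholly negative interval means the primed group's mean is plausibly higher by somewhere between about 0.36 and 1.58 points.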