Description of the dataset

This dataset was collected as part of the course “The Impact of Social Environment on Health”, taught by Marijn Stok at the University of Konstanz. The students were divided into groups of five. Each group investigated a research question and designed an experiment. The research question behind this dataset was how norms influence eating behavior. There is evidence that norms communicated in an injunctive form cause reactance and may thus increase unhealthy food choices (Stok et al., 2014). A descriptive norm, by contrast, might promote healthy eating behavior.

We formulated two norms on two posters: “Eat healthier! How? Nutrient experts recommend making fruits and vegetables to a bigger part of your diet: a healthy diet should include at least 3 portions of fruits and vegetables per day” (injunctive) vs. “Join other students and eat healthier! How? Make fruits and vegetables be a bigger part of your diet: 67% of students at the University of Konstanz eat at least 3 portions of fruits and vegetables per day” (descriptive). Each participant saw one of the two norms. We wanted to see whether participants in the descriptive condition would choose the healthy snack (mandarin) more often, and whether participants in the injunctive condition would choose the unhealthy snack (Lebkuchen) more often.

Additional distractor posters were designed and shown to participants in both conditions. These posters stated that healthy eating helps to protect the environment, to save money, or to protect health. Research has shown that people who are given a list of reasons for a behavior, with descriptive and injunctive norms among them, usually underestimate the impact of norms on their behavior and rank them as least influential. At the same time, norms appear to influence actual behavior more than the other information does.
People tend not to recognize the guiding impact of norms on their behavior (Nolan et al., 2008). We wanted to see whether participants would rank the distractor posters as more influential than the norm posters. We also asked participants about their intention to eat more healthily, to see whether norms can help to overcome the intention-behavior gap. If no behavioral difference were found between the conditions, we could still see whether norms influence the intention to eat more healthily.

The data were collected at the university. The investigators approached students in the foyer and asked them to fill out a questionnaire about eating habits on a tablet. A snack (mandarin or cookie) was offered as a reward. Participants first answered some general questions. They were then asked to rank four posters (the norm poster, the health poster, the save-money poster and the environment poster) according to the impact each poster made on them. To make sure that participants took a closer look at the norm poster, they were asked to rate this poster in more detail (color, message, font, general impression). Participants were led to believe that this poster had been selected at random. After completing the questionnaire, participants could choose a snack.

There are 50 columns and 68 rows in the dataset.

Names of columns:
Lfdn: actual numbers of the participants who were finally included in the dataset
Lastpage: the ID of the condition
Duration: how long it took to fill out the questionnaire (seconds)
tn_number: participant number, from 1 to 68 in this dataset
agree: informed consent

General questions about health and eating:
health: self-reported health
eating: self-reported healthy eating
importance: how important healthy eating is to the participant
fruit: portions of fruit consumed per week
veg: portions of vegetables consumed per week
diet: “Are you on a diet?” (y/n)
vegetarian: diet (y/n)
vegan: diet (y/n)
religion: diet (exclusion of products) for religious reasons
allergies: diet (y/n)
allergies_kind: type of allergy
others: any other diet
others_kind: type of other diet

Ranking of posters:
Poster_condition: poster with either the descriptive or the injunctive norm
Poster_environment: poster stating that healthy eating helps to protect the environment
Poster_money: poster with the money statement
Poster_health: poster with the statement about protecting health
Back: rating of poster background
Color: rating of poster color
Message: rating of poster message
Text: rating of text quality
Font: rating of text font
Composition: rating of poster composition
Overall: overall impression

Measurement of intention:
Want: “I want to eat healthy”
Intend: “I intend to eat healthy”
Plan: “I plan to eat healthy”
Will: “I will eat healthy”
Fruit_intention: “I intend to eat more fruit”
Veg_intention: “I intend to eat more vegetables”
Check: estimate of how healthily the participant eats at the moment

Demographic variables:
Gender, Nationality, Age, Occupation, Height, Weight

Reward: which reward the participant chose (mandarin or Lebkuchen)
Session_id
Ats: code of questionnaire
Datetime: date and time
Date_of_last_access
Condition: condition (injunctive or descriptive)
Questions:
library("memisc", lib.loc="~/R/win-library/3.2")
## Warning: package 'memisc' was built under R version 3.2.3
## Loading required package: lattice
## Loading required package: MASS
##
## Attaching package: 'memisc'
##
## The following objects are masked from 'package:stats':
##
## contr.sum, contr.treatment, contrasts
##
## The following object is masked from 'package:base':
##
## as.array
library("rmarkdown", lib.loc="~/R/win-library/3.2")
library("yarrr", lib.loc="~/R/win-library/3.2")
library("RColorBrewer", lib.loc="~/R/win-library/3.2")
my.data <- as.data.set(spss.system.file('C:/Users/Elena/Downloads/Dataset_group lebkuchen_2015_12_22.sav'))
mean(my.data$poster_condition[my.data$condition == "injunctive"])
## [1] 2.058824
mean(my.data$poster_health[my.data$condition == "injunctive"])
## [1] 2.764706
mean(my.data$poster_environment[my.data$condition == "injunctive"])
## [1] 2.323529
mean(my.data$poster_money[my.data$condition == "injunctive"])
## [1] 2.852941
mean(my.data$poster_condition[my.data$condition == "descriptive"])
## [1] 2.235294
mean(my.data$poster_money[my.data$condition == "descriptive"])
## [1] 2.970588
mean(my.data$poster_health[my.data$condition == "descriptive"])
## [1] 2.705882
mean(my.data$poster_environment[my.data$condition == "descriptive"])
## [1] 2.088235
sd(my.data$poster_condition)
## [1] 1.096326
median(my.data$poster_condition)
## [1] 2
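The eight `mean()` calls above can also be written as a single call. A minimal sketch, assuming a plain data frame with the same column names; the `toy` data below are made up for illustration and are not the study data:

```r
# Made-up rankings from 1 to 4; NOT the study data
toy <- data.frame(
  condition          = rep(c("descriptive", "injunctive"), each = 3),
  poster_condition   = c(1, 2, 3, 2, 2, 4),
  poster_health      = c(2, 3, 4, 3, 1, 2),
  poster_environment = c(3, 1, 1, 1, 3, 3),
  poster_money       = c(4, 4, 2, 4, 4, 1)
)

# One row per condition, one column of means per poster
poster.means <- aggregate(
  cbind(poster_condition, poster_health,
        poster_environment, poster_money) ~ condition,
  data = toy, FUN = mean
)
poster.means
```

The formula interface keeps the grouping variable and the summarized columns in one place, so adding another poster column only needs one more name inside `cbind()`.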
library(survival)
lapply(my.data[,c("poster_condition", "poster_money", "poster_health", "poster_environment")], function(x) t.test(x ~ my.data$condition, var.equal = TRUE))
## $poster_condition
##
## Two Sample t-test
##
## data: x by my.data$condition
## t = 0.66088, df = 66, p-value = 0.511
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.3566579 0.7095991
## sample estimates:
## mean in group descriptive mean in group injunctive
## 2.235294 2.058824
##
##
## $poster_money
##
## Two Sample t-test
##
## data: x by my.data$condition
## t = 0.49556, df = 66, p-value = 0.6218
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.3563402 0.5916343
## sample estimates:
## mean in group descriptive mean in group injunctive
## 2.970588 2.852941
##
##
## $poster_health
##
## Two Sample t-test
##
## data: x by my.data$condition
## t = -0.21349, df = 66, p-value = 0.8316
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.6089300 0.4912829
## sample estimates:
## mean in group descriptive mean in group injunctive
## 2.705882 2.764706
##
##
## $poster_environment
##
## Two Sample t-test
##
## data: x by my.data$condition
## t = -0.88021, df = 66, p-value = 0.3819
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.7690103 0.2984221
## sample estimates:
## mean in group descriptive mean in group injunctive
## 2.088235 2.323529
with(my.data, t.test(poster_condition ~ condition))
##
## Welch Two Sample t-test
##
## data: poster_condition by condition
## t = 0.66088, df = 64.491, p-value = 0.511
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.3568908 0.7098319
## sample estimates:
## mean in group descriptive mean in group injunctive
## 2.235294 2.058824
t.test.poster<-with(my.data, t.test(poster_condition ~ condition))
apa(t.test.poster)
## [1] "mean difference = -0.18, t(64.49) = 0.66, p = 0.51 (2-tailed)"
with(my.data, boxplot(poster_condition ~ condition,
ylab = "Ranking",
xlab = "Condition",
main = "Ranking of posters according to condition",
col="paleturquoise2"))
with(my.data, aggregate(poster_condition ~ condition, FUN = median))
## condition poster_condition
## 1 1 2
## 2 2 2
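# Recode values of a vector: each old.values[i] is replaced by new.values[i].
# If 'others' is given, every value not listed in old.values is set to 'others'.
recode.v <- function(original.vector,
                     old.values,
                     new.values,
                     others = NULL) {
  if (is.null(others)) {
    new.vector <- original.vector
  } else {
    new.vector <- rep(others, length(original.vector))
  }
  for (i in seq_along(old.values)) {
    # Compare against the original values so that earlier replacements
    # are not recoded a second time
    change.log <- original.vector == old.values[i] &
      !is.na(original.vector)
    new.vector[change.log] <- new.values[i]
  }
  return(new.vector)
}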
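For comparison, the same kind of recoding can be done with a named lookup vector in base R. This is an alternative sketch, not the function used in this analysis; the answer labels are the ones used for the intention items, and `answers` is a made-up example vector:

```r
# Made-up example answers; NA stands for a missing answer
answers <- c("very much", "neutral", "much", NA, "very much")

# Named vector: names are the old labels, values are the new codes
lookup <- c("very much" = 2, "much" = 1, "neutral" = 0)

# Indexing by name returns the code for each answer; NA stays NA
scores <- unname(lookup[answers])
scores
```

Because the codes are stored as numbers in `lookup`, the result is already numeric and needs no `as.numeric()` step afterwards.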
# Recode each intention item to a numeric score (answers not listed in old.values become NA in as.numeric())
my.data$want <- as.character(my.data$want)
my.data$want <- recode.v(original.vector = my.data$want,
old.values = c("very much", "much", "neutral"),
new.values = c(2, 1, 0)
)
my.data$want <-as.numeric(my.data$want)
my.data$intend <- as.character(my.data$intend)
my.data$intend <- recode.v(original.vector = my.data$intend,
old.values = c("very much", "much", "neutral"),
new.values = c(2, 1, 0)
)
my.data$intend <-as.numeric(my.data$intend)
my.data$plan <- as.character(my.data$plan)
my.data$plan <- recode.v(original.vector = my.data$plan,
old.values = c("very much", "much", "neutral"),
new.values = c(2, 1, 0)
)
my.data$plan <-as.numeric(my.data$plan)
## Warning: NAs introduced by coercion
my.data$will <- as.character(my.data$will)
my.data$will <- recode.v(original.vector = my.data$will,
old.values = c("very likely", "likely", "neutral"),
new.values = c(2, 1, 0)
)
my.data$will <-as.numeric(my.data$will)
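# create new column for total intention
# Note: "/" binds tighter than "+", so only plan is divided by 4 here;
# a mean of the four items would be (want + will + intend + plan) / 4
my.data$intention.total <- my.data$want + my.data$will + my.data$intend + my.data$plan/4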
with(my.data, t.test(intention.total ~ condition))
##
## Welch Two Sample t-test
##
## data: intention.total by condition
## t = -0.28968, df = 63.931, p-value = 0.773
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -0.9253752 0.6910002
## sample estimates:
## mean in group descriptive mean in group injunctive
## 3.757812 3.875000
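In the expression for `intention.total` above, `/` binds tighter than `+`, so only `plan` is divided by 4. If a per-participant mean of the four items is wanted, parentheses or `rowMeans()` give it. A minimal sketch on made-up scores (the `items` data frame is hypothetical, not the study data):

```r
# Made-up item scores on the 0-2 coding used above
items <- data.frame(
  want   = c(2, 1, 0),
  will   = c(2, 1, 1),
  intend = c(1, 1, 0),
  plan   = c(2, NA, 0)
)

# As written in the analysis: only plan is divided by 4
as.written <- items$want + items$will + items$intend + items$plan / 4

# Mean of the four items; a row with a missing item stays NA
item.mean <- rowMeans(items[, c("want", "will", "intend", "plan")])

cbind(as.written, item.mean)
```

With the mean, the composite stays on the same 0 to 2 scale as the single items, which makes group differences easier to read.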
my.data$reward <- recode.v(original.vector = my.data$reward,
old.values = c("Landwirten", "lebkuch", "leiblichen", "man darin", "mandarine","mit ging", "nichts" ),
new.values = c("lebkuchen", "lebkuchen", "lebkuchen", "mandarin", "mandarin", "mandarin", "nothing")
)
my.data$reward[my.data$reward == "a"] <- NA
table(my.data$reward)
##
## lebkuchen mandarin nothing
## 17 38 12
# Chi-squared test of independence between reward choice and condition
library(MASS)
tbl <- table(my.data$reward, my.data$condition)
tbl
##
## descriptive injunctive
## lebkuchen 7 10
## mandarin 19 19
## nothing 8 4
chisq.test(tbl)
##
## Pearson's Chi-squared test
##
## data: tbl
## X-squared = 1.8482, df = 2, p-value = 0.3969
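The test can be reproduced from the printed counts alone. A minimal sketch that rebuilds the table by hand and also looks at the expected counts, which the chi-squared approximation assumes are not too small (the object names `tbl.counts` and `res` are only for illustration):

```r
# Counts copied from the contingency table printed above
tbl.counts <- matrix(
  c(7, 10,
    19, 19,
    8, 4),
  nrow = 3, byrow = TRUE,
  dimnames = list(reward = c("lebkuchen", "mandarin", "nothing"),
                  condition = c("descriptive", "injunctive"))
)

res <- chisq.test(tbl.counts)
res$statistic  # should match the X-squared value printed above
res$expected   # expected counts under independence
```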
# Recode the columns to numeric, then correlate intention with reported consumption
my.data$fruit_intention<-as.character(my.data$fruit_intention)
my.data$fruit_intention<-as.numeric(my.data$fruit_intention)
## Warning: NAs introduced by coercion
my.data$fruit<-as.numeric(my.data$fruit)
## Warning in .nextMethod(x = x, mode = mode): NAs introduced by coercion
cor.fruit<-with(my.data, cor.test(fruit_intention, fruit))
apa(cor.fruit)
## [1] "r = 0.46, t(65) = 4.17, p < 0.01 (2-tailed)"
my.data$veg_intention<-as.numeric(my.data$veg_intention)
my.data$veg<-as.numeric(my.data$veg)
cor.veg<-with(my.data, cor.test(veg_intention, veg))
apa(cor.veg)
## [1] "r = 0.95, t(66) = 24.7, p < 0.01 (2-tailed)"
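One thing to watch when converting columns: for an ordinary R factor, `as.numeric()` returns the internal level codes, not the numbers shown in the labels, which is why `as.character()` is applied first to `fruit_intention` above. A minimal sketch with a made-up base R factor (columns imported with memisc may behave differently):

```r
# Made-up factor whose labels look like numbers
portions <- factor(c("3", "10", "7"))

codes  <- as.numeric(portions)                # internal level codes
values <- as.numeric(as.character(portions))  # numbers in the labels

codes
values
```

If the two results differ for a column, correlations computed on the level codes do not describe the reported portions.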
# Can intention to eat healthier be explained through importance of the topic "healthy eating" for a person?
regression<-lm(intention.total ~ importance,
data = my.data)
summary(regression)
##
## Call:
## lm(formula = intention.total ~ importance, data = my.data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.5000 -0.5000 -0.5000 0.7216 2.7500
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.7500 0.3645 13.031 < 2e-16 ***
## importanceimportant -1.0000 0.4385 -2.280 0.025978 *
## importanceneutral -2.1364 0.5815 -3.674 0.000495 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.503 on 63 degrees of freedom
## (2 observations deleted due to missingness)
## Multiple R-squared: 0.1784, Adjusted R-squared: 0.1523
## F-statistic: 6.84 on 2 and 63 DF, p-value: 0.00205
#importance is a significant predictor for intention
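Because `importance` is a categorical variable, `lm()` reports one coefficient per non-reference level: the intercept is the mean of the reference level, and each coefficient is the difference between that level's mean and the reference mean. A minimal sketch on made-up data (the level name `very important` for the reference group is an assumption, since it does not appear in the output above):

```r
# Made-up data; NOT the study data
toy <- data.frame(
  importance = factor(rep(c("very important", "important", "neutral"), each = 3),
                      levels = c("very important", "important", "neutral")),
  intention  = c(5, 4, 6, 4, 3, 5, 2, 3, 1)
)

fit <- lm(intention ~ importance, data = toy)
coef(fit)

# Group means for comparison with the coefficients
group.means <- tapply(toy$intention, toy$importance, mean)
group.means
```

Read this way, the negative coefficients in the summary above mean that the "important" and "neutral" groups have lower mean intention scores than the reference group.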
plot(x = my.data$importance,
     y = my.data$intention.total,
     pch = 16,
     col = "blue",
     xlab = "Importance",
     ylab = "Intention",
     main = "Distribution"
)
points(my.data$intention.total,
       pch = 16,
       col = "orange"
)
abline(a = 0,
b = 1,
lwd = 2,
lty = 2)
# again recode a column to numeric
my.data$weight<-as.numeric(my.data$weight)
#now histogram:
hist(my.data$weight,
xlim = c(30,150),
ylim = c(0, 50),
xlab = "Weight",
ylab = "Persons",
main = "Weight",
cex.main = .7,
col = "chartreuse"
)
text(mean(my.data$weight), 38,
labels = paste("Mean\n", round(mean(my.data$weight), 2), sep = ""),
adj = 0,
pos = 4
)
abline(v = mean(my.data$weight), lty = 2)
text(median(my.data$weight), 38,
labels = paste("Median\n", round(median(my.data$weight), 2), sep = ""),
adj = 0,
pos = 1
)
abline(v = median(my.data$weight), lty = 2)
# First, clean the data and map the different spellings of each course to one label
my.data$student <- recode.v(original.vector = my.data$student,
old.values = c("Biologe, French", "Business education", "ecenomics", "Germanistik", "jura","Law", "Lehramt", "lehramt"),
new.values = c("biology", "economics", "economics", "german literature", "law", "law", "teacher", "teacher")
)
my.data$student <- recode.v(original.vector = my.data$student,
old.values = c("econimics", "educational sciences", "Linguistics", "Math,English", "Physik","Sprachwissenschaft", "wiwi", "Wirtschaftswissenschaften"),
new.values = c("economics", "teacher", "linguistics", "math", "physic", "linguistics", "economics", "economics")
)
my.data$student <- recode.v(original.vector = my.data$student,
old.values = c("politics and public Administration", "Politics and Public Administration", "politics and puplic administration", "politics public administration", "politisch and public administration","psychologie", "Psychologie"),
new.values = c("politics", "politics", "politics", "politics", "politics", "psychology", "psychology")
)
my.data$student <- recode.v(original.vector = my.data$student,
old.values = c("politicalscience", "Psychology", "school", "sportwissenschaft", "wirtschaftspadagogik","philosophy, german", "sport"),
new.values = c("politics", "psychology", "teacher", "sports", "economics", "philosophy", "sports")
)
my.data$student[my.data$student == "-99"] <- NA
table(my.data$student)
##
## biology chemistry economics
## 2 2 10
## french german literature Information Engineering
## 1 2 1
## law linguistics math
## 8 3 1
## philosophy physic politics
## 1 1 8
## psychology sociology sports
## 10 2 2
## teacher
## 4
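Many of the variants recoded above differ only in capitalization or spacing. Normalizing case and whitespace first shortens the list of labels that have to be mapped by hand. A minimal sketch; the `courses` vector is made up, based on a few of the spellings listed above:

```r
# Made-up input with mixed case and stray whitespace
courses <- c("Psychologie", "psychologie", " Law", "jura", "Lehramt", "lehramt")

# Lower-case and trim surrounding whitespace before mapping
clean <- trimws(tolower(courses))

# Map the remaining variants to one label each
lookup <- c(psychologie = "psychology", law = "law",
            jura = "law", lehramt = "teacher")
labels <- unname(lookup[clean])
table(labels)
```

Any spelling that is missing from `lookup` comes back as `NA`, so unmapped variants are easy to spot.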
# now I create a new column with arts = 0, science = 1
my.data$study.type <- NA
my.data$study.type[my.data$student%in% c("biology", "chemistry","Information Engineering ", "math", "physic", "psychology", "economics") ] <- 1
my.data$study.type[my.data$student%in% c("french", "german literature","law", "linguistics", "philosophy", "sociology", "teacher", "sports") ] <- 0
# now I write a function
study.section <- function(x) {
if(x == 0) {output <- "arts"}
if(x == 1) {output <- "science"}
return(output)
}
study.section(1)
## [1] "science"
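`study.section()` handles one value at a time and stops with an error for any input other than 0 or 1, including `NA`. A vectorized sketch that can be applied to a whole column (the name `study.section.v` is hypothetical):

```r
study.section.v <- function(x) {
  # 0 -> "arts", 1 -> "science"; unmatched values, including NA, become NA
  labels <- c("arts", "science")
  labels[match(x, c(0, 1))]
}

study.section.v(c(1, 0, NA, 1))
```

Under this sketch, `study.section.v(my.data$study.type)` would label every row at once and leave rows without a study type as `NA`.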
my.data$invalid.answers <- NA
for(row.i in 1:nrow(my.data)) {
data.temp <- my.data[row.i,]
n.na <- sum(is.na(data.temp)) - 1
my.data$invalid.answers[row.i] <- n.na
}
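The loop above counts missing values row by row. The same count can be computed in one vectorized call. A minimal sketch on a made-up data frame; it omits the `- 1` adjustment used above:

```r
# Made-up data; NOT the study data
toy <- data.frame(
  age    = c(21, NA, 25),
  weight = c(60, NA, NA),
  reward = c("mandarin", "lebkuchen", NA)
)

# Number of missing values in each row
n.missing <- rowSums(is.na(toy))
n.missing
```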