William Villano: Homework #4

Question 1

Before running any analyses, assume based on past research that the effect of beverage group on anxiety has an effect size of d = .4. Given your sample size of 96 and assuming equal n per group, use the pwr.t.test function in the pwr package in R to determine your power to detect an effect of this size.

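library(pwr)
# n is the sample size per group: 96 participants split evenly gives n = 48
pwr_obj <- pwr.t.test(n = 96 / 2, d = .4, sig.level = .05, type = "two.sample", 
           alternative = "two.sided")

Answer: With 96 participants split evenly across the two groups (n = 48 per group), power to detect an effect of this size (d = 0.4) is approximately 0.49. Note that the n argument of pwr.t.test is the per-group sample size; n = 96 would correspond to 192 participants in total and a power of 0.7873581.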
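
As a cross-check that does not depend on the pwr package, the power calculation can be sketched in base R from the noncentral t distribution. This is a minimal sketch that assumes 96 is the total sample size (48 per group); the variable names are illustrative, and the result should agree closely with pwr.t.test for the same per-group n.

```r
# Two-sided power for a two-sample t-test from the noncentral t distribution.
# Base R only; all variable names are illustrative.
n_total   <- 96
n_per_grp <- n_total / 2            # equal n per group
d         <- 0.4                    # assumed effect size
alpha     <- 0.05

df     <- 2 * n_per_grp - 2         # residual degrees of freedom (94)
ncp    <- d * sqrt(n_per_grp / 2)   # noncentrality parameter
t_crit <- qt(1 - alpha / 2, df)     # two-sided critical value

# Probability of landing in either rejection region when the effect is d
power <- pt(t_crit, df, ncp = ncp, lower.tail = FALSE) +
  pt(-t_crit, df, ncp = ncp)
print(power)
```

Changing n_total in this sketch shows how quickly power rises with sample size for a fixed d.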

Question 2

First test to see if the two beverage groups already differ on their anxiety about receiving a shock before receiving their respective beverages (at baseline). Fit a linear model to test this question. Report the corresponding t-statistic, df and p-value and describe the result of the model in words.

# load data
Data <- read.csv("C:/Users/wvillano/Downloads/HW4_Data.dat", sep = "\t")

# fit linear model
anxBase.lm <- lm(AnxBase ~ 1 + BG, data = Data)

Answer: Individuals in the alcoholic and non-alcoholic experimental groups do not differ significantly in their levels of anxiety at baseline (t(94) = 0.00, p = 1).
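
The t-statistic, df, and p-value reported above can be pulled directly from the fitted model rather than read off the printed summary. The sketch below uses simulated data as a stand-in for HW4_Data.dat (the data frame sim and its values are hypothetical) and also shows that the test of the BG coefficient matches a pooled-variance two-sample t-test.

```r
# Simulated stand-in for the real data (hypothetical values, not HW4_Data.dat)
set.seed(1)
sim <- data.frame(BG = rep(c(0, 1), each = 48))
sim$Anx <- 1 + 0.2 * sim$BG + rnorm(96)

fit   <- lm(Anx ~ 1 + BG, data = sim)
coefs <- summary(fit)$coefficients   # columns: Estimate, Std. Error, t value, Pr(>|t|)

t_lm  <- coefs["BG", "t value"]
p_lm  <- coefs["BG", "Pr(>|t|)"]
df_lm <- df.residual(fit)            # n - 2

# Pooled-variance t-test on the same data; group 1 minus group 0
# has the same sign as the BG slope
tt <- t.test(sim$Anx[sim$BG == 1], sim$Anx[sim$BG == 0], var.equal = TRUE)

print(c(t_lm = t_lm, t_ttest = unname(tt$statistic), df = df_lm, p = p_lm))
```

With a single two-level predictor, the linear model and the classic t-test are the same test, which is why both report 94 degrees of freedom for 96 participants.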

Question 3

From the output of that same linear model that you just ran, interpret the “intercept” or b0 coefficient. What does it mean in this sample? What does its corresponding p value mean?

summary(anxBase.lm)

## 
## Call:
## lm(formula = AnxBase ~ 1 + BG, data = Data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.04167 -0.04167 -0.04167 -0.04167  0.95833 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.042e+00  9.396e-02   11.09   <2e-16 ***
## BG          -6.799e-17  1.329e-01    0.00        1    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.651 on 94 degrees of freedom
## Multiple R-squared:  3.76e-30,   Adjusted R-squared:  -0.01064 
## F-statistic: 3.535e-28 on 1 and 94 DF,  p-value: 1

Answer: Given the dummy coding of experimental groups in these data (0 = non-alcoholic group), the intercept for this linear model represents the predicted level of anxiety at baseline for individuals in the non-alcoholic experimental group. For these data, the predicted level of anxiety at baseline for individuals in the non-alcoholic group is 1.042, and the p-value (p < 2e-16) indicates that this parameter estimate is significantly different from zero.
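
The interpretation of b0 under dummy coding can be checked numerically. The following sketch uses simulated data (the data frame sim is hypothetical, not the homework data) to show that the intercept equals the mean of the group coded 0 and the BG slope equals the difference between the two group means.

```r
# Simulated stand-in data with a 0/1 dummy-coded group (hypothetical values)
set.seed(2)
sim <- data.frame(BG = rep(c(0, 1), each = 48))
sim$Anx <- 1 + rnorm(96, sd = 0.65)

fit <- lm(Anx ~ 1 + BG, data = sim)
b0  <- unname(coef(fit)["(Intercept)"])
b1  <- unname(coef(fit)["BG"])

mean0 <- mean(sim$Anx[sim$BG == 0])   # mean of the group coded 0
mean1 <- mean(sim$Anx[sim$BG == 1])   # mean of the group coded 1

print(c(b0 = b0, mean_group0 = mean0))
print(c(b1 = b1, mean_difference = mean1 - mean0))
```

Because b0 is simply the mean of the reference group, its p-value tests whether that mean differs from zero, which is rarely the question of substantive interest.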

Question 4

Fit a linear model predicting anxiety (not baseline anxiety) from beverage group. Test if beverage group significantly predicts anxiety (report t-statistic, df, and p-value) and provide a 95% confidence interval for the parameter estimate. Describe the effect of beverage group on anxiety in a sentence.

# fit linear model
anxTest.lm <- lm(AnxTest ~ 1 + BG, data = Data)

# compute 95% confidence interval
anxTest.lm.ci <- confint(anxTest.lm, level = .95)

print(anxTest.lm.ci[2,])

##      2.5 %     97.5 % 
## -1.4538324 -0.4211676

Answer: Beverage group significantly predicted anxiety following beverage consumption (B = -0.9375, 95% CI [-1.45, -0.42], t(94) = -3.605, p = 0.000502). In terms of model parameter estimates, the predicted level of anxiety (AnxTest) for participants in the non-alcoholic beverage group was 3.8125, which was significantly different from zero (t(94) = 20.733, p < 2e-16). Predicted anxiety was 0.9375 points lower for participants in the alcoholic beverage group, a difference that was significantly different from zero at α = 0.05. Consistent with this, the 95% confidence interval for the parameter estimate does not contain 0.
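
The interval returned by confint can be reproduced by hand as estimate ± critical t × standard error. The sketch below plugs in the BG estimate and standard error as printed by summary(anxTest.lm); because the standard error is rounded in that output, the result should match confint only to about three decimal places.

```r
# Values copied from the printed model summary (rounded as displayed)
b  <- -0.9375   # BG estimate
se <- 0.2600    # standard error of the BG estimate
df <- 94        # residual degrees of freedom

t_crit <- qt(0.975, df)               # critical value for a 95% interval
ci     <- b + c(-1, 1) * t_crit * se  # lower and upper bounds
print(round(ci, 3))
```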

Question 5

Report PRE along with its interpretation in a sentence to describe the effect of beverage group.

# method 1:  R^2
summary(anxTest.lm)

## 
## Call:
## lm(formula = AnxTest ~ 1 + BG, data = Data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.8750 -0.8750  0.1875  1.1250  3.1250 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   3.8125     0.1839  20.733  < 2e-16 ***
## BG           -0.9375     0.2600  -3.605 0.000502 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.274 on 94 degrees of freedom
## Multiple R-squared:  0.1215, Adjusted R-squared:  0.1121 
## F-statistic:    13 on 1 and 94 DF,  p-value: 0.0005017

# method 2: model comparison
model_C <- lm(AnxTest ~ 1, data = Data)
model_A <- lm(AnxTest ~ 1 + BG, data = Data)

# compute PRE
PRE <- (sum(residuals(model_C)^2) - sum(residuals(model_A)^2)) / sum(residuals(model_C)^2)

Answer: Here, PRE = R² = 0.1215. Adding beverage group (i.e., whether or not a participant drank alcohol) as a predictor reduces the sum of squared errors by 12.15% relative to a compact model that predicts the same mean anxiety score for everyone. Equivalently, beverage group accounts for 12.15% of the variance in anxiety scores following beverage consumption.
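
For a model comparison that adds a single parameter, PRE can also be recovered from the t-statistic and residual degrees of freedom as t² / (t² + df). The sketch below applies that identity to the values printed in the model summary; the variable names are illustrative.

```r
# Values copied from the printed model summary
t_bg <- -3.605   # t-statistic for BG
df   <- 94       # residual degrees of freedom of the augmented model

# PRE for a one-parameter model comparison
PRE_from_t <- t_bg^2 / (t_bg^2 + df)

# F for the same comparison, computed from PRE (1 added parameter)
F_from_PRE <- (PRE_from_t / 1) / ((1 - PRE_from_t) / df)

print(c(PRE = PRE_from_t, F = F_from_PRE))
```

The F value obtained this way equals t², which is why the summary reports F = 13 alongside t = -3.605.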

Question 6

Write a concise summary of the results (a few sentences). Explain the hypothesis you tested, the statistical results of your test, and the practical interpretation of the result.

Answer: In sum, we first tested whether participants in the non-alcoholic and alcoholic beverage groups differed in their levels of anxiety at baseline. We failed to reject the null hypothesis for this test, that the difference between the groups' anxiety levels at baseline was zero (t(94) = 0.00, p = 1). In our sample, the estimated group difference at baseline was essentially zero, so there is no evidence that the groups differed in anxiety before beverage consumption. Next, we tested whether the groups differed in their levels of anxiety following beverage consumption. We rejected the null hypothesis for this test, that the difference in anxiety levels between groups following beverage consumption was zero (t(94) = -3.605, p = 0.000502). Participants in the non-alcoholic and alcoholic beverage groups differed significantly in their anxiety levels following beverage consumption: predicted anxiety was almost one point lower for participants who consumed an alcoholic beverage (B = -0.9375, 95% CI [-1.45, -0.42]).

Question 7

The idea that the mean is the most efficient estimator for a given sample relies on what core assumption?

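Answer: This relies on the assumption that errors are normally distributed (in addition to being independent and identically distributed). When errors come from a heavier-tailed distribution, other estimators such as the median can be more efficient than the mean.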
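
The role of the error distribution in this efficiency claim can be illustrated with a small simulation. The sketch below is an illustration under assumed settings (25 observations per sample, 5000 replications, and a t distribution with 3 degrees of freedom as the heavy-tailed case); it compares the sampling variance of the mean and the median under each error distribution.

```r
# Compare the sampling variance of the mean and the median under two
# error distributions. All settings here are illustrative assumptions.
set.seed(3)
n_obs  <- 25     # observations per simulated sample
n_reps <- 5000   # number of simulated samples

sampling_var <- function(rdist) {
  # Each column holds the mean and median of one simulated sample
  draws <- replicate(n_reps, {
    x <- rdist(n_obs)
    c(mean = mean(x), median = median(x))
  })
  apply(draws, 1, var)   # variance of each estimator across samples
}

normal_errors <- sampling_var(function(n) rnorm(n))
heavy_errors  <- sampling_var(function(n) rt(n, df = 3))

print(rbind(normal = normal_errors, heavy_tailed = heavy_errors))
```

A smaller sampling variance means a more efficient estimator, so the row in which the mean wins and the row in which the median wins show how the answer depends on the shape of the errors.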

Question 8

In a simple model that predicts the mean, the mean squared error is also known as what?