In class, we went through an R example concerning tree growth and tree tubes.
Since the data isn’t available, I will walk through the R code and comment on the results.
tube_linear <- lm(growth_yr1 ~ tubes, data = treetubes_yr1)
            Estimate Std. Error t value  Pr(>|t|)
(Intercept)  0.10585   0.005665  18.685 9.931e-56
tubes       -0.04013   0.041848  -0.959 3.382e-01
R squared = 0.002414
Residual standard error = 0.1097
library(lme4)  # provides lmer()
tube_multi1 <- lmer(growth_yr1 ~ tubes + (1|transect), data = treetubes_yr1)
            Estimate Std. Error t value
(Intercept)  0.10636    0.01329   8.005
tubes       -0.04065    0.05165  -0.787

Groups   Name        Variance Std.Dev.
transect (Intercept) 0.00084  0.029
Residual             0.01155  0.107
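The first model is a standard linear model, which treats every observation as independent. The second is a mixed-effects model with a random intercept for each transect, so it takes into account the correlation among observations from the same transect.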
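To put a rough number on that correlation, the two variance components in the tube_multi1 output can be combined into an intraclass correlation. This is only a sketch: it assumes the usual random-intercept interpretation, and the variable names (var_transect, var_residual, icc) are mine, not from the class example.

```r
# Variance components copied from the tube_multi1 output
var_transect <- 0.00084  # between-transect variance
var_residual <- 0.01155  # within-transect (residual) variance

# Intraclass correlation: share of total variance that sits between transects
icc <- var_transect / (var_transect + var_residual)
round(icc, 3)  # about 0.068
```

Under these assumptions, two observations from the same transect have a correlation of roughly 0.07. That is small, but it is enough to change the standard errors.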
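As you can see, the standard error for the ‘tubes’ coefficient increases from 0.0418 to 0.0517 when the correlation is accounted for. This means the original model (tube_linear) is overconfident: it overstates how precisely the effect of tubes on tree growth is estimated. In both models the estimate is small relative to its standard error (|t| < 1), so neither gives evidence that tubes affect first-year growth.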
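Since the original data isn’t available, the same pattern can be reproduced on simulated data. The sketch below is not the class dataset: the data frame sim, the sample sizes, the effect sizes, and especially the choice to assign tubes to whole transects are all assumptions made for illustration. It also assumes the lme4 package is installed.

```r
library(lme4)  # provides lmer(); assumed to be installed

set.seed(42)
n_transects <- 40  # hypothetical number of transects
n_trees     <- 10  # hypothetical number of trees per transect

sim <- data.frame(
  transect = factor(rep(seq_len(n_transects), each = n_trees)),
  # Assumption: tubes are assigned to whole transects (every other transect)
  tubes    = rep(rep(c(0, 1), length.out = n_transects), each = n_trees)
)

# Each transect gets its own shift, which makes trees within a transect correlated
transect_effect <- rnorm(n_transects, mean = 0, sd = 0.05)
sim$growth <- 0.10 +
  transect_effect[as.integer(sim$transect)] +
  rnorm(nrow(sim), mean = 0, sd = 0.10)  # true tubes effect is zero here

fit_lm   <- lm(growth ~ tubes, data = sim)
fit_lmer <- lmer(growth ~ tubes + (1 | transect), data = sim)

# Standard error of the tubes coefficient under each model
se_lm   <- coef(summary(fit_lm))["tubes", "Std. Error"]
se_lmer <- coef(summary(fit_lmer))["tubes", "Std. Error"]
c(lm = se_lm, lmer = se_lmer)
```

With a transect-level predictor like this one, the mixed model should report the larger standard error, because it recognizes that trees in the same transect do not each contribute a fully independent piece of information.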
The last thing I want to note is the conceptual exercise our team was asked to work through.
These are the instructions:
Examples with correlated data. For each of the following studies:
- Identify the most basic observational units
- Identify the grouping units (could be multiple levels of grouping)
- State the response(s) measured and variable type (normal, binary, Poisson, etc.)
- Write a sentence describing the within-group correlation
- Identify fixed and random effects
Nurse stress study.
Four wards were randomly selected at each of 25 hospitals and randomly assigned to offer a stress reduction program for nurses on the ward or to serve as a control. At the conclusion of the study period, a random sample of 10 nurses from each ward completed a test to measure job-related stress. Factors assumed to be related include nurse experience, age, hospital size and type of ward.
The basic observational units are the nurses.
The grouping units are the wards and the hospitals, with wards nested within hospitals.
The response is the score on the job-related stress test, which is probably a normal variable.
There is within-group correlation because nurses in the same hospital might experience similar levels of stress. For example, a hospital in a large city might on average be more stressful than a hospital in a smaller city. There is also correlation within each ward: it is fair to assume that nurses working in the same ward of a hospital have similar stress levels.
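Lastly, the fixed effects are the treatment indicator (whether or not a nurse’s ward offered the stress reduction program) together with the covariates: nurse experience, age, hospital size and type of ward. These are fixed effects even though the researchers do not control them, because we want to estimate their specific coefficients. The random effects are the hospitals and the wards within hospitals. They are samples from larger populations of hospitals and wards, and their random intercepts are what capture the within-group correlation.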
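One common way to write this design in the same lmer syntax as the tree tube model is to enter the program indicator and the covariates as fixed effects, and hospital and ward as nested random intercepts. The sketch below uses simulated data only. The data frame nurses, all column names, the effect sizes, and the choice of two program wards per hospital are hypothetical, and only one covariate (experience) is included to keep it short. It assumes lme4 is installed.

```r
library(lme4)  # provides lmer(); assumed to be installed

set.seed(7)
# One row per ward: 4 wards in each of 25 hospitals
wards <- expand.grid(ward = 1:4, hospital = 1:25)
# Assumption: within each hospital, two wards are randomly given the program
wards$program     <- unlist(lapply(1:25, function(h) sample(c(0, 0, 1, 1))))
wards$ward_effect <- rnorm(nrow(wards), mean = 0, sd = 2)
hosp_effect       <- rnorm(25, mean = 0, sd = 3)

# Ten nurses sampled from each ward
nurses <- wards[rep(seq_len(nrow(wards)), each = 10), ]
nurses$experience <- runif(nrow(nurses), min = 0, max = 30)
nurses$stress <- 50 - 4 * nurses$program - 0.2 * nurses$experience +
  hosp_effect[nurses$hospital] + nurses$ward_effect +
  rnorm(nrow(nurses), mean = 0, sd = 5)

nurses$hospital <- factor(nurses$hospital)
nurses$ward     <- factor(nurses$ward)

# Fixed effects: program and experience
# Random intercepts: hospital, and ward nested within hospital
fit <- lmer(stress ~ program + experience + (1 | hospital/ward), data = nurses)
summary(fit)
```

Age, hospital size and type of ward would be added to the fixed part of the formula in the same way as experience. The (1 | hospital/ward) term is what tells the model that nurses are grouped in wards and wards are grouped in hospitals.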