SubjectID | W1 | W2 | W3 | W4 | W5 |
---|---|---|---|---|---|
100 | 27 | 27 | NA | NA | NA |
103 | NA | 37 | 40 | NA | 43 |
106 | 38 | 36 | NA | NA | NA |
107 | 24 | 25 | 33 | NA | 24 |
122 | 27 | 43 | NA | 37 | NA |
125 | 46 | NA | 45 | NA | NA |
130 | 18 | NA | NA | NA | NA |
135 | 48 | NA | NA | NA | NA |
137 | 28 | 23 | NA | NA | NA |
139 | NA | 37 | 33 | NA | 26 |
143 | 39 | 44 | 40 | NA | 35 |
160 | 38 | 38 | NA | NA | 24 |
172 | 19 | 33 | 31 | NA | 16 |

SubjectID | workshop | score |
---|---|---|
100 | W1 | 27 |
103 | W1 | NA |
106 | W1 | 38 |
107 | W1 | 24 |
122 | W1 | 27 |
125 | W1 | 46 |
130 | W1 | 18 |
135 | W1 | 48 |
137 | W1 | 28 |
139 | W1 | NA |
143 | W1 | 39 |
160 | W1 | 38 |
172 | W1 | 19 |
100 | W2 | 27 |
103 | W2 | 37 |
106 | W2 | 36 |
107 | W2 | 25 |
122 | W2 | 43 |
125 | W2 | NA |
130 | W2 | NA |
135 | W2 | NA |
137 | W2 | 23 |
139 | W2 | 37 |
143 | W2 | 44 |
160 | W2 | 38 |
172 | W2 | 33 |
100 | W3 | NA |
103 | W3 | 40 |
106 | W3 | NA |
107 | W3 | 33 |
122 | W3 | NA |
125 | W3 | 45 |
130 | W3 | NA |
135 | W3 | NA |
137 | W3 | NA |
139 | W3 | 33 |
143 | W3 | 40 |
160 | W3 | NA |
172 | W3 | 31 |
100 | W4 | NA |
103 | W4 | NA |
106 | W4 | NA |
107 | W4 | NA |
122 | W4 | 37 |
125 | W4 | NA |
130 | W4 | NA |
135 | W4 | NA |
137 | W4 | NA |
139 | W4 | NA |
143 | W4 | NA |
160 | W4 | NA |
172 | W4 | NA |
100 | W5 | NA |
103 | W5 | 43 |
106 | W5 | NA |
107 | W5 | 24 |
122 | W5 | NA |
125 | W5 | NA |
130 | W5 | NA |
135 | W5 | NA |
137 | W5 | NA |
139 | W5 | 26 |
143 | W5 | 35 |
160 | W5 | 24 |
172 | W5 | 16 |
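
The long table above holds the same scores as the wide one, with one row per subject and workshop. A minimal sketch of how such a reshape might be done, assuming the tidyr package is available. The data frame name `wide` is hypothetical, and only the first three subjects are typed in to keep the example short.

```r
library(tidyr)

# First three subjects of the wide table; `wide` is a name assumed for illustration
wide <- data.frame(
  SubjectID = c("100", "103", "106"),
  W1 = c(27, NA, 38),
  W2 = c(27, 37, 36),
  W3 = c(NA, 40, NA),
  W4 = c(NA_real_, NA_real_, NA_real_),
  W5 = c(NA, 43, NA),
  stringsAsFactors = FALSE
)

# Stack W1..W5 into a workshop/score pair of columns,
# giving one row per subject and workshop
long <- pivot_longer(wide, cols = W1:W5,
                     names_to = "workshop", values_to = "score")

# pivot_longer() returns rows grouped by subject; sort by workshop
# so the row order matches the long table above
long <- long[order(long$workshop, long$SubjectID), ]
print(long)
```

Missing scores are kept as `NA` rows rather than dropped, which is what the long table shows as well.
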

SubjectID | missing |
---|---|
100 | 3 |
103 | 2 |
106 | 3 |
107 | 1 |
122 | 2 |
125 | 3 |
130 | 4 |
135 | 4 |
137 | 3 |
139 | 2 |
143 | 1 |
160 | 2 |
172 | 1 |
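
The `missing` column above is the number of workshops without a score for each subject. A small base R sketch of that count, again using a hypothetical data frame `wide` that holds only the first three subjects:

```r
# First three subjects of the wide table; `wide` is a name assumed for illustration
wide <- data.frame(
  SubjectID = c("100", "103", "106"),
  W1 = c(27, NA, 38),
  W2 = c(27, 37, 36),
  W3 = c(NA, 40, NA),
  W4 = c(NA_real_, NA_real_, NA_real_),
  W5 = c(NA, 43, NA),
  stringsAsFactors = FALSE
)

# is.na() flags each missing score; rowSums() adds the flags across W1..W5
missing <- data.frame(
  SubjectID = wide$SubjectID,
  missing = rowSums(is.na(wide[, c("W1", "W2", "W3", "W4", "W5")]))
)
print(missing)
```

For these three subjects the counts are 3, 2 and 3, which agrees with the table.
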

p1 + geom_line() +
  stat_smooth(aes(group = SubjectID)) +
  stat_summary(aes(group = SubjectID), geom = "point", fun.y = mean, shape = 17, size = 3)

p1 + geom_line() +
  stat_smooth(aes(group = SubjectID), method = "lm", se = FALSE) +
  stat_summary(aes(group = SubjectID), geom = "point", fun.y = mean, shape = 17, size = 3)
Now we might be interested in estimating the overall trend in the data. One option is to add a line using locally weighted regression (loess) to "smooth" over the variability and give a sense of the overall, or average, trend. It takes just one short line of code, and the smooth is calculated automatically.
Note again that we use group = 1, so the smooth is calculated once over all subjects rather than separately for each SubjectID.
## geom_smooth: method="auto" and size of largest group is <1000, so using loess. Use 'method = x' to change the smoothing method.
p1 + geom_line() +
  stat_smooth(aes(group = 1)) +
  stat_summary(aes(group = 1), geom = "point", fun.y = mean, shape = 17, size = 3)
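
`p1` is not defined in this section. The sketch below shows one way it might be set up, followed by the overall smooth described above. It assumes ggplot2 and a long data frame with a numeric `time` column, which is a hypothetical column holding the workshop number. Only four subjects are typed in. ggplot2 3.3.0 and later spell the summary argument `fun` rather than `fun.y`, and the sketch uses the newer name.

```r
library(ggplot2)

# Scores for subjects 103, 107, 143 and 172 from the long table; `time` is an
# assumed numeric version of the workshop label (W1 = 1, ..., W5 = 5)
long <- data.frame(
  SubjectID = rep(c("103", "107", "143", "172"), times = 5),
  time = rep(1:5, each = 4),
  score = c(NA, 24, 39, 19,
            37, 25, 44, 33,
            40, 33, 40, 31,
            NA, NA, NA, NA,
            43, 24, 35, 16)
)

# Assumed definition of p1: one line per subject
p1 <- ggplot(long, aes(x = time, y = score, group = SubjectID))

# group = 1 overrides the per-subject grouping, so a single smooth and a single
# set of means are computed over all subjects
p_overall <- p1 +
  geom_line() +
  stat_smooth(aes(group = 1), method = "loess", formula = y ~ x) +
  stat_summary(aes(group = 1), geom = "point", fun = mean, shape = 17, size = 3)
```

Printing `p_overall` draws the plot. Rows with missing scores are dropped with a warning, and loess may warn as well when there are so few distinct time points.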
It looks as though an intervention, or something similar, happened between workshops 1 and 4, so we wish to fit a piecewise linear model rather than an overall smooth. We can do this by creating a dummy variable (pre/post intervention) and its interaction with time. The only change is a slightly more complex formula.
The default is y ~ x. I(x > 2) creates a dummy (TRUE/FALSE) variable that is TRUE when x (time in this case) is greater than 2. The asterisk in the formula asks for the main effects and the interaction between x and the dummy variable derived from x.
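
To see what the `x * I(x > 2)` formula in the plotting code expands to, the design matrix can be inspected directly. A small sketch, assuming time is coded 1 to 5:

```r
# Assumed coding of time: workshops W1..W5 as 1..5
x <- 1:5

# x * I(x > 2) is shorthand for x + I(x > 2) + x:I(x > 2)
X <- model.matrix(~ x * I(x > 2))

# Columns: intercept, x, the TRUE/FALSE dummy, and the x-by-dummy interaction
print(colnames(X))
print(X)
```

For x = 1 and 2 the dummy and interaction columns are 0. From x = 3 onward the dummy is 1 and the interaction column equals x, so the later segment gets its own shift in intercept and its own change in slope.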
Now we can see that the trend line jumps at the breakpoint and that the slope is allowed to change. The change looks drastic, which suggests there is an interaction between time and the intervention. (If the change appeared minimal, that would suggest there is no interaction between our hypothetical intervention and time.)
p1 + geom_line() +
  stat_smooth(aes(group = 1), method = "lm", formula = y ~ x * I(x > 2), se = FALSE) +
  stat_summary(aes(group = 1), fun.y = mean, geom = "point", shape = 17, size = 3)
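
The same piecewise model can also be fitted outside the plot with `lm()`, which exposes the jump and the slope change as coefficients. This is a sketch under the assumption that time is coded 1 to 5. The names `scores`, `long` and `fit` are introduced here for illustration. Like the smoother in the plot, it treats all observations as independent and ignores the repeated measures within subject.

```r
# Scores from the wide table: one row per subject, columns W1..W5
scores <- matrix(c(
  27, 27, NA, NA, NA,
  NA, 37, 40, NA, 43,
  38, 36, NA, NA, NA,
  24, 25, 33, NA, 24,
  27, 43, NA, 37, NA,
  46, NA, 45, NA, NA,
  18, NA, NA, NA, NA,
  48, NA, NA, NA, NA,
  28, 23, NA, NA, NA,
  NA, 37, 33, NA, 26,
  39, 44, 40, NA, 35,
  38, 38, NA, NA, 24,
  19, 33, 31, NA, 16
), ncol = 5, byrow = TRUE)

# Long form: `time` is the column index (assumed coding W1 = 1, ..., W5 = 5)
long <- data.frame(time = as.vector(col(scores)), score = as.vector(scores))

# Same formula as in the stat_smooth() call; lm() drops rows with NA scores by default
fit <- lm(score ~ time * I(time > 2), data = long)
print(coef(fit))
```

The `I(time > 2)TRUE` coefficient is the shift in intercept for the later segment, and the `time:I(time > 2)TRUE` coefficient is the change in slope.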
Preparing the data
Test code for extracting Complete Cases
## [1] TRUE
## # A tibble: 8 x 3
## SubjectID W1 W2
## <chr> <dbl> <dbl>
## 1 100 27 27
## 2 106 38 36
## 3 107 24 25
## 4 122 27 43
## 5 137 28 23
## 6 143 39 44
## 7 160 38 38
## 8 172 19 33
SubjectID | W1 | W2 |
---|---|---|
100 | 27 | 27 |
106 | 38 | 36 |
107 | 24 | 25 |
122 | 27 | 43 |
137 | 28 | 23 |
143 | 39 | 44 |
160 | 38 | 38 |
172 | 19 | 33 |
## Classes 'tbl_df', 'tbl' and 'data.frame': 8 obs. of 3 variables:
## $ SubjectID: chr "100" "106" "107" "122" ...
## $ W1 : num 27 38 24 27 28 39 38 19
## $ W2 : num 27 36 25 43 23 44 38 33
## # A tibble: 4 x 3
## SubjectID W1 W3
## <chr> <dbl> <dbl>
## 1 107 24 33
## 2 125 46 45
## 3 143 39 40
## 4 172 19 31
## SubjectID W1 W3
## Length:4 Min. :19.00 Min. :31.00
## Class :character 1st Qu.:22.75 1st Qu.:32.50
## Mode :character Median :31.50 Median :36.50
## Mean :32.00 Mean :37.25
## 3rd Qu.:40.75 3rd Qu.:41.25
## Max. :46.00 Max. :45.00
## # A tibble: 1 x 3
## SubjectID W1 W4
## <chr> <dbl> <dbl>
## 1 122 27 37
## # A tibble: 4 x 3
## SubjectID W1 W5
## <chr> <dbl> <dbl>
## 1 107 24 24
## 2 143 39 35
## 3 160 38 24
## 4 172 19 16
## SubjectID W1 W5
## Length:4 Min. :19.00 Min. :16.00
## Class :character 1st Qu.:22.75 1st Qu.:22.00
## Mode :character Median :31.00 Median :24.00
## Mean :30.00 Mean :24.75
## 3rd Qu.:38.25 3rd Qu.:26.75
## Max. :39.00 Max. :35.00
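
The code that produced the output above is not shown. A minimal base R sketch of one way to extract complete cases for W1 paired with a later workshop, using a hypothetical helper `complete_pair()` and a data frame `wide` typed in from the wide table:

```r
# The wide table, typed in by hand; `wide` is a name assumed for illustration
wide <- data.frame(
  SubjectID = c("100", "103", "106", "107", "122", "125", "130",
                "135", "137", "139", "143", "160", "172"),
  W1 = c(27, NA, 38, 24, 27, 46, 18, 48, 28, NA, 39, 38, 19),
  W2 = c(27, 37, 36, 25, 43, NA, NA, NA, 23, 37, 44, 38, 33),
  W3 = c(NA, 40, NA, 33, NA, 45, NA, NA, NA, 33, 40, NA, 31),
  W4 = c(NA, NA, NA, NA, 37, NA, NA, NA, NA, NA, NA, NA, NA),
  W5 = c(NA, 43, NA, 24, NA, NA, NA, NA, NA, 26, 35, 24, 16),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep SubjectID, W1 and one later workshop, then drop
# every row where either score is missing
complete_pair <- function(data, later) {
  pair <- data[, c("SubjectID", "W1", later)]
  pair[complete.cases(pair), ]
}

w1_w3 <- complete_pair(wide, "W3")
print(w1_w3)
print(summary(w1_w3))

# Number of subjects with both scores, for each later workshop
counts <- sapply(c("W2", "W3", "W4", "W5"),
                 function(w) nrow(complete_pair(wide, w)))
print(counts)
```

The counts come out as 8, 4, 1 and 4 for W2 to W5, matching the row counts of the tibbles printed above.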