Do some reports really get ~700 words of feedback in text annotations?
Maximum number of words in text annotations in a single report, for each of the 11 sets of reports:
## [1] 423 392 558 296 524 431 353 696 625 709 450
For the BIOM2011Sem2Report1 report with 709 words in text annotations, how many annotations are there and how many words per annotation?
## [1] 84 1 1 2 11 11 2 2 22 71 39 5 1 37 6 1 7
## [18] 2 1 21 1 3 5 6 5 5 4 29 6 4 25 6 18 53
## [35] 4 6 16 186
## [1] 709
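The two questions above reduce to a few vector summaries. A minimal sketch, assuming the per-annotation word counts printed above are held in a plain numeric vector (the name `words_per_annot` is hypothetical and not taken from the original analysis):

```r
# Words per text annotation for the report, copied from the output above.
words_per_annot <- c(84, 1, 1, 2, 11, 11, 2, 2, 22, 71, 39, 5, 1, 37, 6, 1, 7,
                     2, 1, 21, 1, 3, 5, 6, 5, 5, 4, 29, 6, 4, 25, 6, 18, 53,
                     4, 6, 16, 186)

n_annot     <- length(words_per_annot)     # number of text annotations
total_words <- sum(words_per_annot)        # total words across all annotations
longest     <- which.max(words_per_annot)  # position of the longest annotation

cat(n_annot, "annotations,", total_words, "words in total;",
    "annotation", longest, "has", words_per_annot[longest], "words\n")
```

This gives 38 annotations totalling 709 words, with the 186-word annotation in the last position.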
What does a text annotation with 186 words look like?
## [1] NA
How many audio annotations were used in this report?
## integer(0)
for number of annotations or amount of feedback per report for each feedback modality
MANOVA to see if there are differences between the 11 projects in any of the following dependent variables:
audio.num,txt.num,audio.total.words,txt.total.words
## Df Pillai approx F num Df den Df Pr(>F)
## as.factor(project) 10 0.71374 62.246 40 11464 < 2.2e-16 ***
## Residuals 2866
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Response audio.num :
## Df Sum Sq Mean Sq F value Pr(>F)
## as.factor(project) 10 16149 1614.87 103.17 < 2.2e-16 ***
## Residuals 2866 44859 15.65
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Response txt.num :
## Df Sum Sq Mean Sq F value Pr(>F)
## as.factor(project) 10 7404 740.37 19.445 < 2.2e-16 ***
## Residuals 2866 109120 38.07
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Response audio.total.words :
## Df Sum Sq Mean Sq F value Pr(>F)
## as.factor(project) 10 309464963 30946496 175.55 < 2.2e-16 ***
## Residuals 2866 505240297 176288
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Response txt.total.words :
## Df Sum Sq Mean Sq F value Pr(>F)
## as.factor(project) 10 3283044 328304 50.578 < 2.2e-16 ***
## Residuals 2866 18603385 6491
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## 3107 observations deleted due to missingness
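Output of this shape is what `manova()` followed by `summary()` and `summary.aov()` produces. A minimal sketch on simulated stand-in data (the data frame `sim`, its three projects and all values are assumptions for illustration, not the report data):

```r
# Simulated stand-in data: 3 hypothetical projects, 4 dependent variables.
set.seed(42)
n_per_project <- 50
n_total <- 3 * n_per_project
sim <- data.frame(
  project           = rep(c("P1", "P2", "P3"), each = n_per_project),
  audio.num         = rpois(n_total, lambda = rep(c(4, 6, 8), each = n_per_project)),
  txt.num           = rpois(n_total, lambda = 5),
  audio.total.words = rnorm(n_total, mean = rep(c(300, 450, 600), each = n_per_project), sd = 100),
  txt.total.words   = rnorm(n_total, mean = 100, sd = 30)
)

# One-way MANOVA: the four responses are bound into a matrix on the left-hand side.
fit <- manova(cbind(audio.num, txt.num, audio.total.words, txt.total.words) ~
                as.factor(project), data = sim)

multivariate <- summary(fit, test = "Pillai")  # overall test using Pillai's trace
univariate   <- summary.aov(fit)               # one ANOVA table per response

print(multivariate)
print(univariate)
```

Rows with a missing value in any response are dropped by default, which is what the "observations deleted due to missingness" line reports.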
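Three-factor MANOVA (all predictors are entered as factors and there is no continuous covariate, so this is not strictly a MANCOVA) to see if there are differences due to course, semester or report number in any of the following dependent variables: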
audio.num,txt.num,audio.total.words,txt.total.words
## Df Pillai approx F
## as.factor(course) 1 0.37272 425.29
## as.factor(sem) 1 0.09086 71.53
## as.factor(report) 3 0.20476 52.47
## as.factor(course):as.factor(sem) 1 0.03641 27.04
## as.factor(course):as.factor(report) 1 0.02993 22.09
## as.factor(sem):as.factor(report) 2 0.02658 9.64
## as.factor(course):as.factor(sem):as.factor(report) 1 0.00402 2.89
## Residuals 2866
## num Df den Df Pr(>F)
## as.factor(course) 4 2863 < 2.2e-16
## as.factor(sem) 4 2863 < 2.2e-16
## as.factor(report) 12 8595 < 2.2e-16
## as.factor(course):as.factor(sem) 4 2863 < 2.2e-16
## as.factor(course):as.factor(report) 4 2863 < 2.2e-16
## as.factor(sem):as.factor(report) 8 5728 2.268e-13
## as.factor(course):as.factor(sem):as.factor(report) 4 2863 0.02122
## Residuals
##
## as.factor(course) ***
## as.factor(sem) ***
## as.factor(report) ***
## as.factor(course):as.factor(sem) ***
## as.factor(course):as.factor(report) ***
## as.factor(sem):as.factor(report) ***
## as.factor(course):as.factor(sem):as.factor(report) *
## Residuals
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Response audio.num :
## Df Sum Sq Mean Sq
## as.factor(course) 1 11600 11599.9
## as.factor(sem) 1 2911 2911.2
## as.factor(report) 3 850 283.2
## as.factor(course):as.factor(sem) 1 80 79.9
## as.factor(course):as.factor(report) 1 372 372.1
## as.factor(sem):as.factor(report) 2 333 166.6
## as.factor(course):as.factor(sem):as.factor(report) 1 3 2.7
## Residuals 2866 44859 15.7
## F value Pr(>F)
## as.factor(course) 741.1028 < 2.2e-16 ***
## as.factor(sem) 185.9946 < 2.2e-16 ***
## as.factor(report) 18.0947 1.246e-11 ***
## as.factor(course):as.factor(sem) 5.1025 0.02397 *
## as.factor(course):as.factor(report) 23.7698 1.145e-06 ***
## as.factor(sem):as.factor(report) 10.6443 2.479e-05 ***
## as.factor(course):as.factor(sem):as.factor(report) 0.1749 0.67580
## Residuals
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Response txt.num :
## Df Sum Sq Mean Sq
## as.factor(course) 1 221 220.90
## as.factor(sem) 1 970 970.24
## as.factor(report) 3 4252 1417.46
## as.factor(course):as.factor(sem) 1 866 865.55
## as.factor(course):as.factor(report) 1 111 110.81
## as.factor(sem):as.factor(report) 2 941 470.70
## as.factor(course):as.factor(sem):as.factor(report) 1 42 42.40
## Residuals 2866 109120 38.07
## F value Pr(>F)
## as.factor(course) 5.8019 0.01607 *
## as.factor(sem) 25.4831 4.742e-07 ***
## as.factor(report) 37.2290 < 2.2e-16 ***
## as.factor(course):as.factor(sem) 22.7333 1.954e-06 ***
## as.factor(course):as.factor(report) 2.9105 0.08811 .
## as.factor(sem):as.factor(report) 12.3629 4.507e-06 ***
## as.factor(course):as.factor(sem):as.factor(report) 1.1137 0.29136
## Residuals
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Response audio.total.words :
## Df Sum Sq
## as.factor(course) 1 271470027
## as.factor(sem) 1 16711453
## as.factor(report) 3 11409507
## as.factor(course):as.factor(sem) 1 1265677
## as.factor(course):as.factor(report) 1 1379600
## as.factor(sem):as.factor(report) 2 6862255
## as.factor(course):as.factor(sem):as.factor(report) 1 366443
## Residuals 2866 505240297
## Mean Sq F value
## as.factor(course) 271470027 1539.9268
## as.factor(sem) 16711453 94.7965
## as.factor(report) 3803169 21.5737
## as.factor(course):as.factor(sem) 1265677 7.1796
## as.factor(course):as.factor(report) 1379600 7.8258
## as.factor(sem):as.factor(report) 3431127 19.4632
## as.factor(course):as.factor(sem):as.factor(report) 366443 2.0787
## Residuals 176288
## Pr(>F)
## as.factor(course) < 2.2e-16 ***
## as.factor(sem) < 2.2e-16 ***
## as.factor(report) 8.159e-14 ***
## as.factor(course):as.factor(sem) 0.007416 **
## as.factor(course):as.factor(report) 0.005185 **
## as.factor(sem):as.factor(report) 4.019e-09 ***
## as.factor(course):as.factor(sem):as.factor(report) 0.149480
## Residuals
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Response txt.total.words :
## Df Sum Sq Mean Sq
## as.factor(course) 1 334577 334577
## as.factor(sem) 1 3201 3201
## as.factor(report) 3 2670123 890041
## as.factor(course):as.factor(sem) 1 4926 4926
## as.factor(course):as.factor(report) 1 46487 46487
## as.factor(sem):as.factor(report) 2 208703 104351
## as.factor(course):as.factor(sem):as.factor(report) 1 15028 15028
## Residuals 2866 18603385 6491
## F value Pr(>F)
## as.factor(course) 51.5443 8.88e-13 ***
## as.factor(sem) 0.4932 0.48257
## as.factor(report) 137.1179 < 2.2e-16 ***
## as.factor(course):as.factor(sem) 0.7588 0.38376
## as.factor(course):as.factor(report) 7.1616 0.00749 **
## as.factor(sem):as.factor(report) 16.0762 1.14e-07 ***
## as.factor(course):as.factor(sem):as.factor(report) 2.3152 0.12823
## Residuals
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## 3107 observations deleted due to missingness
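The "approx F", "num Df" and "den Df" columns can be checked against the Pillai values by hand. The sketch below uses the standard F approximation for Pillai's trace, with p the number of dependent variables, q the degrees of freedom of the term and v the residual degrees of freedom. The function name `pillai_approx_F` is hypothetical; the inputs are copied from the tables above.

```r
# Standard F approximation for Pillai's trace V.
#   p: number of dependent variables, q: Df of the term, v: residual Df
pillai_approx_F <- function(V, p, q, v) {
  s <- min(p, q)
  m <- (abs(p - q) - 1) / 2
  n <- (v - p - 1) / 2
  num_df <- s * (2 * m + s + 1)
  den_df <- s * (2 * n + s + 1)
  c(approx_F = (den_df / num_df) * V / (s - V),
    num_df   = num_df,
    den_df   = den_df)
}

# Terms from the tables above: 4 dependent variables, 2866 residual Df.
course  <- pillai_approx_F(0.37272, p = 4, q = 1,  v = 2866)
report  <- pillai_approx_F(0.20476, p = 4, q = 3,  v = 2866)
project <- pillai_approx_F(0.71374, p = 4, q = 10, v = 2866)

print(round(rbind(course, report, project), 2))
```

To rounding, these reproduce the reported values: 425.29 on (4, 2863) for course, 52.47 on (12, 8595) for report, and 62.25 on (40, 11464) for the 11-project model.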
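Pairwise comparisons (Welch t-tests) to see if there are differences between semesters for any of the following dependent variables:
audio.num,txt.num,audio.total.words,txt.total.words
Note on reading the output: the list labels appear to be offset from the results. Judging by the magnitudes of the group means, the four tests printed are, in order, audio.num, txt.num, audio.total.words and txt.total.words, and the NULL entries are empty list slots rather than missing tests. The same applies to the other t-test listings of this form.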
## $audio.num
## NULL
##
## $txt.num
##
## Welch Two Sample t-test
##
## data: df[, j] by df[, 10]
## t = 18.7758, df = 5671.162, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 2.330869 2.874347
## sample estimates:
## mean in group Sem1 mean in group Sem2
## 6.174953 3.572345
##
##
## $audio.total.words
##
## Welch Two Sample t-test
##
## data: df[, j] by df[, 10]
## t = -10.2021, df = 5877.782, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -2.208358 -1.496463
## sample estimates:
## mean in group Sem1 mean in group Sem2
## 5.91630 7.76871
##
##
## $txt.total.words
## NULL
##
## $<NA>
##
## Welch Two Sample t-test
##
## data: df[, j] by df[, 10]
## t = 12.6192, df = 4631.418, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 146.7064 200.6740
## sample estimates:
## mean in group Sem1 mean in group Sem2
## 572.9726 399.2824
##
##
## $<NA>
##
## Welch Two Sample t-test
##
## data: df[, j] by df[, 10]
## t = -4.0553, df = 4215.772, p-value = 5.097e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -17.434578 -6.070955
## sample estimates:
## mean in group Sem1 mean in group Sem2
## 99.06391 110.81667
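One way to keep each result attached to the right label is to loop over variable names rather than column positions. A minimal sketch on simulated stand-in data (the data frame `sim` and its values are assumptions for illustration, and this is not the code that produced the output above):

```r
# Simulated stand-in data with the same four dependent variables.
set.seed(1)
n <- 120
sim <- data.frame(
  sem               = factor(rep(c("Sem1", "Sem2"), each = n / 2)),
  audio.num         = rpois(n, 6),
  txt.num           = rpois(n, 6),
  audio.total.words = rnorm(n, 500, 150),
  txt.total.words   = rnorm(n, 100, 40)
)
dvs <- c("audio.num", "txt.num", "audio.total.words", "txt.total.words")

# Loop over names so every list element carries its own variable name.
tests <- lapply(setNames(dvs, dvs), function(dv) {
  t.test(reformulate("sem", response = dv), data = sim)  # Welch test by default
})

# Compact summary: one row per dependent variable.
summary_tab <- t(sapply(tests, function(tt) {
  c(t = unname(tt$statistic), df = unname(tt$parameter), p = tt$p.value)
}))
print(summary_tab)
```

Because the list is built from `setNames(dvs, dvs)`, there are no NULL or `<NA>` slots, and `tests$txt.num` is guaranteed to be the test for txt.num.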
As above, but with “BIOL1040Sem1Report 1” removed:
## $audio.num
## NULL
##
## $txt.num
##
## Welch Two Sample t-test
##
## data: df2[, j] by df2[, 10]
## t = 17.6256, df = 4163.686, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 2.476521 3.096408
## sample estimates:
## mean in group Sem1 mean in group Sem2
## 6.358809 3.572345
##
##
## $audio.total.words
##
## Welch Two Sample t-test
##
## data: df2[, j] by df2[, 10]
## t = -12.6402, df = 5231.012, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -2.807823 -2.053813
## sample estimates:
## mean in group Sem1 mean in group Sem2
## 5.337892 7.768710
##
##
## $txt.total.words
## NULL
##
## $<NA>
##
## Welch Two Sample t-test
##
## data: df2[, j] by df2[, 10]
## t = 12.4684, df = 3745.657, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 161.8319 222.2226
## sample estimates:
## mean in group Sem1 mean in group Sem2
## 591.3097 399.2824
##
##
## $<NA>
##
## Welch Two Sample t-test
##
## data: df2[, j] by df2[, 10]
## t = -10.5543, df = 3513.092, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -36.96661 -25.38390
## sample estimates:
## mean in group Sem1 mean in group Sem2
## 79.64142 110.81667
Pairwise comparisons (t-test) to see if there are differences between courses for any of the following dependent variables:
audio.num,txt.num,audio.total.words,txt.total.words
## $audio.num
## NULL
##
## $txt.num
##
## Welch Two Sample t-test
##
## data: df[, j] by df[, 9]
## t = -16.6575, df = 632.147, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -8.155243 -6.435203
## sample estimates:
## mean in group BIOL1040 mean in group BIOM2011
## 4.207224 11.502447
##
##
## $audio.total.words
##
## Welch Two Sample t-test
##
## data: df[, j] by df[, 9]
## t = 2.9169, df = 669.317, p-value = 0.003654
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 0.4081978 2.0896374
## sample estimates:
## mean in group BIOL1040 mean in group BIOM2011
## 6.912865 5.663948
##
##
## $txt.total.words
## NULL
##
## $<NA>
##
## Welch Two Sample t-test
##
## data: df[, j] by df[, 9]
## t = -19.0374, df = 618.693, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -766.0988 -622.8240
## sample estimates:
## mean in group BIOL1040 mean in group BIOM2011
## 414.276 1108.737
##
##
## $<NA>
##
## Welch Two Sample t-test
##
## data: df[, j] by df[, 9]
## t = 5.7629, df = 392.521, p-value = 1.674e-08
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 26.30222 53.54064
## sample estimates:
## mean in group BIOL1040 mean in group BIOM2011
## 108.16865 68.24722
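As above, but with “BIOL1040Sem1Report 1” removed. Note that the output below is grouped by semester (df2[, 10], Sem1 vs Sem2) rather than by course, and is identical to the semester comparison on df2 shown earlier, so the course comparison with this report removed does not actually appear here.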
## $audio.num
## NULL
##
## $txt.num
##
## Welch Two Sample t-test
##
## data: df2[, j] by df2[, 10]
## t = 17.6256, df = 4163.686, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 2.476521 3.096408
## sample estimates:
## mean in group Sem1 mean in group Sem2
## 6.358809 3.572345
##
##
## $audio.total.words
##
## Welch Two Sample t-test
##
## data: df2[, j] by df2[, 10]
## t = -12.6402, df = 5231.012, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -2.807823 -2.053813
## sample estimates:
## mean in group Sem1 mean in group Sem2
## 5.337892 7.768710
##
##
## $txt.total.words
## NULL
##
## $<NA>
##
## Welch Two Sample t-test
##
## data: df2[, j] by df2[, 10]
## t = 12.4684, df = 3745.657, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 161.8319 222.2226
## sample estimates:
## mean in group Sem1 mean in group Sem2
## 591.3097 399.2824
##
##
## $<NA>
##
## Welch Two Sample t-test
##
## data: df2[, j] by df2[, 10]
## t = -10.5543, df = 3513.092, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -36.96661 -25.38390
## sample estimates:
## mean in group Sem1 mean in group Sem2
## 79.64142 110.81667
Multiple comparisons (one-way ANOVAs with Tukey post hoc tests) to see if there are differences between reports for any of the following dependent variables:
audio.num,txt.num,audio.total.words,txt.total.words
## Df Sum Sq Mean Sq F value Pr(>F)
## df[, 11] 3 11837 3946 133.2 <2e-16 ***
## Residuals 5980 177197 30
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = df[, j] ~ df[, 11])
##
## $`df[, 11]`
## diff lwr upr p adj
## Report 2-Report 1 -1.336542462 -1.7918556 -0.8812293 0
## Report 3-Report 1 -3.360336989 -3.8399845 -2.8806895 0
## Report 4-Report 1 -3.359089305 -3.9858455 -2.7323331 0
## Report 3-Report 2 -2.023794526 -2.5058398 -1.5417493 0
## Report 4-Report 2 -2.022546843 -2.6511399 -1.3939538 0
## Report 4-Report 3 0.001247684 -0.6451894 0.6476847 1
##
## Df Sum Sq Mean Sq F value Pr(>F)
## df[, 11] 3 3536 1178.7 23.88 2.34e-15 ***
## Residuals 5980 295160 49.4
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = df[, j] ~ df[, 11])
##
## $`df[, 11]`
## diff lwr upr p adj
## Report 2-Report 1 -0.5060252 -1.0936643 0.0816139 0.1198130
## Report 3-Report 1 -0.8579640 -1.4770097 -0.2389183 0.0021040
## Report 4-Report 1 -2.6245273 -3.4334353 -1.8156193 0.0000000
## Report 3-Report 2 -0.3519388 -0.9740791 0.2702015 0.4658540
## Report 4-Report 2 -2.1185020 -2.9297807 -1.3072234 0.0000000
## Report 4-Report 3 -1.7665632 -2.6008719 -0.9322546 0.0000003
##
## Df Sum Sq Mean Sq F value Pr(>F)
## df[, 11] 3 7.331e+07 24436492 103.6 <2e-16 ***
## Residuals 4630 1.092e+09 235786
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 1350 observations deleted due to missingness
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = df[, j] ~ df[, 11])
##
## $`df[, 11]`
## diff lwr upr p adj
## Report 2-Report 1 -104.29107 -148.8731 -59.70909 0.0000001
## Report 3-Report 1 -281.07153 -331.5004 -230.64269 0.0000000
## Report 4-Report 1 -337.96479 -400.9629 -274.96668 0.0000000
## Report 3-Report 2 -176.78046 -228.0557 -125.50519 0.0000000
## Report 4-Report 2 -233.67372 -297.3514 -169.99603 0.0000000
## Report 4-Report 3 -56.89326 -124.7929 11.00639 0.1366598
##
## Df Sum Sq Mean Sq F value Pr(>F)
## df[, 11] 3 2011969 670656 79.07 <2e-16 ***
## Residuals 4222 35809754 8482
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 1758 observations deleted due to missingness
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = df[, j] ~ df[, 11])
##
## $`df[, 11]`
## diff lwr upr p adj
## Report 2-Report 1 -30.27932 -39.41023 -21.148414 0.0000000
## Report 3-Report 1 -44.69396 -54.29758 -35.090342 0.0000000
## Report 4-Report 1 -66.14620 -79.16715 -53.125252 0.0000000
## Report 3-Report 2 -14.41464 -24.05372 -4.775559 0.0007108
## Report 4-Report 2 -35.86688 -48.91401 -22.819754 0.0000000
## Report 4-Report 3 -21.45224 -34.83445 -8.070032 0.0002259
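Each ANOVA and Tukey table above has the form produced by `aov()` followed by `TukeyHSD()`. A minimal sketch on simulated stand-in data (the data frame `sim` and its values are assumptions for illustration, not the report data):

```r
# Simulated stand-in data: one count variable across four reports.
set.seed(7)
n_per   <- 40
reports <- paste("Report", 1:4)
sim <- data.frame(
  report    = factor(rep(reports, each = n_per), levels = reports),
  audio.num = rpois(4 * n_per, lambda = rep(c(8, 7, 5, 5), each = n_per))
)

fit <- aov(audio.num ~ report, data = sim)  # one-way ANOVA across the four reports
tuk <- TukeyHSD(fit, conf.level = 0.95)     # all pairwise differences, family-wise 95%

print(summary(fit))
print(tuk)
```

With four report levels there are six pairwise rows, each giving the difference in means, its family-wise confidence limits and an adjusted p-value. Using a named column in the formula (here `report`) also labels the Tukey table by name rather than as `df[, 11]`.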
The above analysis repeated with reports categorised as “notfinal” or “final” (Welch t-tests):
## $audio.num
## NULL
##
## $txt.num
##
## Welch Two Sample t-test
##
## data: df[, j] by df[, 12]
## t = -16.4174, df = 4889.537, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -2.492470 -1.960702
## sample estimates:
## mean in group final mean in group notfinal
## 3.411861 5.638447
##
##
## $audio.total.words
##
## Welch Two Sample t-test
##
## data: df[, j] by df[, 12]
## t = -6.2026, df = 4091.043, p-value = 6.102e-10
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -1.5157914 -0.7876997
## sample estimates:
## mean in group final mean in group notfinal
## 5.986942 7.138688
##
##
## $txt.total.words
## NULL
##
## $<NA>
##
## Welch Two Sample t-test
##
## data: df[, j] by df[, 12]
## t = -7.129, df = 2112.991, p-value = 1.382e-12
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -153.86257 -87.47406
## sample estimates:
## mean in group final mean in group notfinal
## 414.9412 535.6095
##
##
## $<NA>
##
## Welch Two Sample t-test
##
## data: df[, j] by df[, 12]
## t = -15.8376, df = 3374.778, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -48.09512 -37.49877
## sample estimates:
## mean in group final mean in group notfinal
## 74.83228 117.62923
Mean words per annotation, by annotation type:
##          Recording  Text
## Overall      78.78 10.90
## BIOL1040     74.17 11.26
## BIOM2011     93.56  7.08
SEM of words per annotation, by annotation type:
##          Recording  Text
## Overall       0.01  0.01
## BIOL1040      0.01  0.01
## BIOM2011      0.01  0.02
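In mean +/- SEM format (these SEM values look implausibly small: the Welch confidence intervals reported further down imply a standard error of about 0.43 words for the overall difference in means, so the SEMs may have been computed with n rather than sqrt(n) in the denominator):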
There were 78.78 +/- 0.01 words per audio annotation overall.
There were 74.17 +/- 0.01 words per audio annotation in BIOL1040.
There were 93.56 +/- 0.01 words per audio annotation in BIOM2011.
There were 10.90 +/- 0.01 words per text annotation overall.
There were 11.26 +/- 0.01 words per text annotation in BIOL1040.
There were 7.08 +/- 0.02 words per text annotation in BIOM2011.
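For reference, the standard error of the mean is the sample standard deviation divided by the square root of the number of observations. A minimal sketch, where the helper `sem()` and the vector `words` are illustrative assumptions rather than the original code or data:

```r
# Standard error of the mean: sample SD divided by sqrt(n), ignoring NAs.
sem <- function(x) {
  x <- x[!is.na(x)]
  sd(x) / sqrt(length(x))
}

# Illustrative word counts only (not data from the reports).
words <- c(1, 2, 3, 4, 5, NA)

cat(sprintf("There were %.2f +/- %.2f words per annotation.\n",
            mean(words, na.rm = TRUE), sem(words)))
```

For these five illustrative values the mean is 3 and the SEM is sqrt(0.5), about 0.71.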
So yes: there are significantly more words per audio annotation than per text annotation overall, in BIOL1040 and in BIOM2011 (p < 2.2e-16 in each case).
The three Welch t-tests below are shown in that order: overall, BIOL1040, BIOM2011.
##
## Welch Two Sample t-test
##
## data: value by AnnotType
## t = 157.4274, df = 30941.15, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 67.03474 68.72501
## sample estimates:
## mean in group Recording mean in group Text
## 78.78475 10.90488
##
## Welch Two Sample t-test
##
## data: value by AnnotType
## t = 156.0541, df = 23891.61, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 62.12150 63.70186
## sample estimates:
## mean in group Recording mean in group Text
## 74.17457 11.26289
##
## Welch Two Sample t-test
##
## data: value by AnnotType
## t = 68.8026, df = 7283.872, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 84.02254 88.95082
## sample estimates:
## mean in group Recording mean in group Text
## 93.563006 7.076325
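The size and precision of each difference can be recovered from the printed statistics alone, since t is the difference in means divided by its standard error. A short worked check for the first (overall) test, with all numbers copied from its output and hypothetical variable names:

```r
# Values copied from the first (overall) Welch test above.
mean_recording <- 78.78475
mean_text      <- 10.90488
t_stat         <- 157.4274
welch_df       <- 30941.15

diff    <- mean_recording - mean_text   # difference in means
se_diff <- diff / t_stat                # t = diff / SE, so SE = diff / t
ci      <- diff + c(-1, 1) * qt(0.975, welch_df) * se_diff

cat(sprintf("difference = %.2f, SE = %.3f, 95%% CI = [%.2f, %.2f]\n",
            diff, se_diff, ci[1], ci[2]))
```

This gives a difference of about 67.88 words with a standard error of about 0.431, and reproduces the reported interval of 67.03 to 68.73.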