
Methods

Student cohort

During the period of this study, there were 733 and 972 students enrolled in the level 1 course, and 251 and 92 students enrolled in the level 2 course, in Semesters 1 and 2 of 2013, respectively. There were slightly more female (55%) than male (45%) students; the vast majority were 17-21 years old (89%) and domestic students (91%), with high- to mid-range university entry scores (84%).

Just over half (53%) of the students taking these courses were enrolled in a three-year Bachelor of Science (BSc), a four-year dual degree combining the BSc with another degree, or the Bachelor of Biomedical Science (a four-year, research-focused program). A further 21% were enrolled in one of four sports science degrees, and 12% were enrolled in the combined BSc/Medical program (in which students undertake an accelerated two-year BSc program and then enter a four-year, graduate-entry Medical program). The remaining students were enrolled in a wide range of single and dual degrees (e.g. Arts, Engineering), specialty degrees and applied science degrees.

Data collection and analysis

A total of 5960 reports submitted and marked through FACS in Semesters 1 and 2, 2013, were used for this study. The modality, length (in time or number of words) and position of each feedback annotation provided on every report were recorded, and the time taken to mark each report was collated. Clickstream data logs were collected as students accessed their marked documents, providing information on opening dates and durations, and on student interaction with the feedback annotations. Data on student academic performance for each report were also collated. Throughout this study, quantitative analyses were performed using R 3.1.1 (R Core Team, Vienna, Austria). Results are expressed as mean and standard error of the mean (SEM), and differences were considered significant at p<0.05.
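As a concrete illustration of the style of analysis described above, the following is a minimal, self-contained R sketch of the mean+/-SEM summaries and significance testing. The data frame, column names and simulated values are hypothetical stand-ins for the FACS annotation records, not the actual study data.

    # Illustrative sketch only: simulated annotation data standing in for FACS records
    set.seed(1)
    annotations <- data.frame(
      modality = rep(c("audio", "text"), each = 100),
      words    = c(rnorm(100, mean = 79, sd = 15),   # estimated words per audio annotation
                   rnorm(100, mean = 11, sd = 4))    # words per typed annotation
    )

    sem <- function(x) sd(x) / sqrt(length(x))       # standard error of the mean

    # Mean +/- SEM word count per modality
    aggregate(words ~ modality, data = annotations,
              FUN = function(x) c(mean = mean(x), sem = sem(x)))

    # Two-sample comparison between modalities; differences considered significant at p < 0.05
    t.test(words ~ modality, data = annotations)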

Results and Discussion

The increasing availability of online technologies that allow the provision of multimodal feedback annotations, either generalised or situated, has certainly increased the variety and flexibility of feedback delivery options (refs). However, it has also increased the variability of feedback provision and changed the way in which students interact with feedback, leading to potentially greater variability in student outcomes (refs). The primary aims of this study were to understand how the different modalities of feedback available through online technologies affected feedback provision, both across successive student assessment tasks and within large cohorts with multiple markers, and to examine the way students interacted with the feedback provided. Analyses of FACS data from laboratory reports submitted for assessment (Table 1) in two biomedical science courses at level 1 (n = 1705 students) and level 2 (n = 343), in Semesters 1 and 2, 2013, show that there are significant differences in the ways in which markers use the different modalities of feedback (Figures 2-4). In addition, the data demonstrate that there are substantial differences in the way students interact with their marked reports, and the feedback within them, across the semesters (Figures 5-8).

Table 1: Number of assignments processed through FACS

Course     Semester     Report 0   Report 1   Report 2   Report 3
Level 1    Semester 1        682        674        671        667
Level 1    Semester 2          0        897        870        854
Level 2    Semester 1          0        239        241          0
Level 2    Semester 2          0         85         80          0

NB Report 4 for the level 1 course in Semester 2 was administered by another academic department, which did not participate in the FACS trial.

Feedback Provision

A randomly selected sample of 160 audio annotations was transcribed and their word counts determined. Talking speed was relatively stable across audio annotations, averaging 164+/-6 words per minute, which is consistent with previous reports of 450-500 words per three-minute audio feedback comment (Brearley and Cullen 2012). To allow comparison between the word lengths of typed and audio annotations, the length (in minutes) of each audio annotation was multiplied by 164; on this basis, audio annotations contained on average significantly more words (78.8+/-0.4 words) than typed annotations (10.9+/-0.1 words; p<0.001) (Figure 3). On the first years' reports, markers provided on average significantly more typed annotations (6.9+/-0.1) than audio annotations (4.2+/-0.1; p<0.001; Figure 2). This was reversed on the second years' reports, where markers provided significantly more audio annotations (11.6+/-0.4) than typed annotations (5.6+/-0.4; p<0.001; Figure 2). However, for both first (p<0.001) and second (p<0.001) years' reports, the total amount of feedback provided in audio annotations, in terms of word length, was significantly greater than that provided in typed annotations (Figure 4).
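As a concrete illustration of the conversion described above, the short R sketch below estimates the word count of each audio annotation from its recorded duration using the 164 words-per-minute calibration; the durations shown are hypothetical values, not study data.

    # Illustrative only: convert audio annotation durations to estimated word counts
    words_per_minute <- 164                  # calibration from the transcribed sample
    audio_minutes <- c(0.40, 0.52, 0.31)     # hypothetical annotation lengths (minutes)
    estimated_words <- audio_minutes * words_per_minute
    round(estimated_words)                   # approximate words per audio annotation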

When comparing between year levels, markers provided significantly more audio annotations on second years' work, on average, than on first years' reports (p<0.001; Figure 2), and these annotations were also significantly longer (94.2+/-1.3 words; p<0.001) than those provided to first year students (74+/-0.4 words). In contrast, markers provided more, and significantly longer (11.3+/-0.1 words), typed annotations on first year students' reports than on second years' reports, which received fewer, shorter typed annotations (6.9+/-0.2 words; p<0.001). Despite this, the total amount of feedback provided on second years' work (1126+/-37 words) was significantly greater than that on first years' reports (388+/-4 words; p<0.001; Figure 4A & B), with audio annotations making up a significantly larger proportion of the total feedback provided on the second years' reports. Reports from second year students also took longer to mark (28.4+/-1 minutes) than first years' reports (13.9+/-0.2 minutes; p<0.001). This may be related to differences in task design between first and second year, with the second years' reports being of greater average length (1909+/-150 words) than the first years' reports (1269+/-99 words), and/or may reflect that the higher expectations of scientific reasoning for second year students require more detailed explanation by markers (Zimbardi et al. 2013).

Figure 2: Number of audio and text annotations for the level 1 course (Report 0: n=681, Report 1: n=1559, Report 2: n=1535, Report 3: n=1516) and for the level 2 course (Report 1: n=299, Report 2: n=295). Shown are the A) mean+/-SEM number of audio annotations per report across two semesters of the level 1 and level 2 courses; B) mean+/-SEM number of text annotations per report across two semesters of the level 1 and level 2 courses.

With the exception of the formative task (Report 0), markers tended to provide more audio annotations on the earliest reports in each year and semester, with the amount of audio feedback provided declining on subsequent reports across each semester (Figure 2A; p<0.001). Interestingly, despite this decline, the average length of each in situ audio annotation was relatively consistent, averaging 28.8+/-0.2 seconds per annotation across semesters and year levels. At an average talking speed of 164 words/minute, this suggests that markers tend to give feedback in approximately 80-word 'sound bites', effectively a short paragraph of information. This contrasts with studies in which a single audio annotation is given, as these tend to be considerably longer (Ribchester, France, and Wakefield 2008; Gould and Day 2013), but may equate to a similar amount of audio feedback being provided overall on equivalent tasks (Ice et al. 2010; Lunt and Curran 2010).

While audio annotations were relatively consistent in length, the typed annotations were not. The first year students in semester 1 received longer typed annotations on their early reports, with the longest typed comments appearing on the formative task (15.4+/-0.2 words) and declining thereafter (Report 1: 12+/-0.2 words; Report 2: 11+/-0.2 words; Report 3: 8.9+/-0.2 words; p<0.001; Figure 3). In semester 2, where there was no formative task, the typed annotations declined in a similar way (Report 1: 12.2+/-0.2 words; Report 2: 10.4+/-0.1 words; Report 3: 8.8+/-0.1 words; p<0.001; Figure 3). Collectively, the amount of typed feedback provided across the first year reports was similar in each semester (Semester 1: 10.7+/-0.1 words; Semester 2: 10.5+/-0.1 words; p=0.157). For the second year reports, both the number (Report 1: 7.1+/-0.7 annotations; Report 2: 4+/-0.4 annotations; p<0.001) and the length (Report 1: 7+/-0.2 words; Report 2: 6.7+/-0.3 words; p<0.001) of typed annotations declined across the two reports.

The in situ feedback made possible by this marking tool was well utilised by markers, with annotations, whether typed or audio, being placed primarily in situ, and very few reports receiving 'summary' annotations on either the first (9%) or last (0.01%) page of the report text. This meant that each feedback annotation was placed near the specific portion of student work to which it referred. Previous studies investigating the impact of audio annotations have highlighted that students find the separation of audio comments from the relevant section of their work confusing, for example when a single overall audio comment is provided, or when audio and text files are provided separately (Ribchester, France, and Wakefield 2008; Rodway-Dyer, Dunne, and Newcombe 2009). In this regard, the advent of completely online assessment submission and marking systems, which allow in situ embedding of feedback regardless of modality, represents a notable improvement in feedback provision.

Figure 3: Number of words provided in each audio and text annotation for the level 1 course (Report 0: n=681, Report 1: n=1559, Report 2: n=1535, Report 3: n=1516) and for the level 2 course (Report 1: n=299, Report 2: n=295). Shown are the A) mean+/-SEM number of words in each audio annotation across two semesters and two year levels; B) mean+/-SEM number of words in each text annotation across two semesters and two year levels. NB The y-axis scale for A is 10x the y-axis scale for B.

Figure 4: Total amount of feedback provided as audio and text annotations for the level 1 course (Report 0: n=681, Report 1: n=1559, Report 2: n=1535, Report 3: n=1516) and for the level 2 course (Report 1: n=299, Report 2: n=295). Shown are the A) mean+/-SEM total number of words in audio annotations per report across two semesters and two year levels; B) mean+/-SEM total number of words in text annotations per report across two semesters and two year levels. NB The y-axis scale for A is 10x the y-axis scale for B.

Feedback use

Proportion of students viewing report feedback
The vast majority of the first year students opened their marked reports (92%), although the proportion who did so tended to decline slightly across the semester, with the final report being opened by fewer students (83%; Figure 5). Surprisingly, the reverse was true for the second year students. In comparison to the first year students, fewer second year students opened their reports at all (85%), with only 63% opening their first report and 74% opening their second report. Notably, this pattern was consistent in each semester (Figure 5). It is difficult to identify the cause of this difference between first and second year students; it may reflect a lower engagement with assessment by second year students (Loughlin et al. 2013), or the additional opportunities second year students had during class to gain verbal feedback on their first report may have meant that they were less reliant on the feedback provided on the report documents.

Figure 5: The proportion of students who looked at their feedback, expressed as a proportion of the students who received feedback in each cohort. Number of students in each cohort: Level 1 Semester 1: n=682; Level 1 Semester 2: n=897; Level 2 Semester 1: n=239; Level 2 Semester 2: n=85.

Feedback viewing duration and pause times
In terms of the duration for which students had their reports open, in each semester and year level the duration declined markedly, from averages of 3-7 hours for the non-final reports down to 29-83 minutes for the final report (Figure 6).

When the patterns of open duration across the cohorts were examined in more detail (Figure 7), it was apparent that subsets of students interacted with their marked reports for distinct periods of time. For the non-final reports in each semester, the majority of students were divided approximately equally between those who had their report open for between one minute and an hour (41%) and those who had it open for more than an hour (43%), with smaller subsets who opened their reports for less than one minute (6%) or did not open their report at all (10%). It is possible that these durations represent students engaging in different categories of behaviour (Warnakulasooriya et al. 2007): students who work through the feedback thoroughly (moderate users); students who work through the feedback thoroughly and also use it directly to inform their subsequent report writing (long users); and a small tail of students who glance through their reports very quickly (short users), perhaps primarily to check their grade. This categorisation is supported by the timing and duration of openings across the assessment period. In the first year cohorts, there was a cluster of openings of marked reports shortly after their release and another cluster in the 48 hours prior to the due date for the next report, with the latter group having longer open durations (data not shown). It is also supported by the pattern of open durations for the final reports from each semester, which had much shorter open durations on average (Figure 6) and far greater proportions of students falling into the shorter open-duration categories of less than one minute (7%) or between one minute and one hour (57%; Figure 7B & D). The duration and timing of students' interactions with feedback on the non-final tasks suggest that students perceive these tasks to be sufficiently similar to one another for the feedback to be useful for subsequent tasks (Boud and Molloy 2013), but also suggest that one of the key drivers of student interaction with feedback is the immediacy of its use on similar assessment tasks.
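A minimal R sketch of this open-duration categorisation is given below, using the thresholds described in the text (unopened; less than one minute; one minute to one hour; more than one hour). The vector of open durations is hypothetical; in the study these values were derived from the FACS clickstream logs.

    # Illustrative only: bin report open durations into the categories used in the text
    open_minutes <- c(NA, 0.5, 12, 250, 45, NA, 90)   # hypothetical durations; NA = never opened

    binned <- cut(open_minutes,
                  breaks = c(0, 1, 60, Inf),
                  labels = c("short", "medium", "long"))
    open_category <- factor(ifelse(is.na(open_minutes), "unopened", as.character(binned)),
                            levels = c("unopened", "short", "medium", "long"))

    table(open_category)   # counts of students per open-duration category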


Figure 6: The duration for which students had their report feedback open, for the level 1 course (Report 0: n=682, Report 1: n=1571, Report 2: n=1541, Report 3: n=1521) and for the level 2 course (Report 1: n=324, Report 2: n=321). Data are expressed as the mean+/-SEM time for which each report was open, in hours.

Figure 7: The number of students who had their feedback open for various durations following the Level 1 Semester 1 course A) formative report (Report 0: n=682) and B) final report (Report 3: n=667), and the Level 2 Semester 1 course C) first report (Report 1: n=239) and D) final report (Report 2: n=241). Data are presented as frequency histograms showing the number of students across the log-transformed open durations.

Furthermore, the analytics used in this study go beyond the simple duration of openings, with the "clickstream" data providing the first insights into the temporal patterns of student interactions with feedback. These clicks represent student interactions with the report document, such as selecting a position within it, opening an audio annotation, or scrolling. Students interacted with their non-final reports to a significantly greater degree than with their final reports for both the level 1 course (Report 1: 105+/-2 clicks per report; Report 3: 38+/-1 clicks per report; p<0.001) and the level 2 course (Report 1: 202+/-10 clicks per report; Report 2: 67+/-4 clicks per report; p<0.001). In addition, the pauses between clicks may provide a useful lens for understanding how students interact with their feedback. Minuscule pauses between scrolling clicks might indicate students navigating to specific parts of their marked report, while short pauses between clicks are likely to represent students reading the feedback or their submission. Moderate pauses might indicate students working outside their feedback (for example, on their next report), and very long pauses may indicate students leaving their feedback open in the background. On average, students paused between clicks for 2.64+/-0.06 minutes for non-final reports and 1.18+/-0.1 minutes for final reports (Figure 8). The vast majority of these pauses (83%) fell between 1 second and 3 minutes, with very few pauses exceeding one hour (4%; Figure 9). This suggests that, despite the often considerable duration for which students had their reports open, they spent much of this time actively interacting with the report, rather than simply leaving it open for extended periods. This pattern of behaviour was also very consistent across all non-final reports at both year levels, suggesting that it represents the "normal" pattern by which students interact with in situ feedback.
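As a sketch of how inter-click pauses can be derived from clickstream logs of this kind, the following R snippet computes pause durations from a sequence of click timestamps and the proportion falling between 1 second and 3 minutes. The timestamps are hypothetical, and the log format shown is an assumption rather than the actual FACS schema.

    # Illustrative only: derive pauses between clicks from click timestamps
    clicks <- as.POSIXct(c("2013-04-02 10:00:05", "2013-04-02 10:00:40",
                           "2013-04-02 10:03:10", "2013-04-02 10:45:00"))

    pauses_min <- as.numeric(diff(clicks), units = "mins")   # pause before each subsequent click

    mean(pauses_min)                                         # average pause, in minutes
    mean(pauses_min >= 1/60 & pauses_min <= 3)               # proportion of pauses between 1 s and 3 min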

Figure 8: Duration for which students paused between clicks when interacting with feedback for the level 1 course (Report 0: n=682, Report 1: n=1571, Report 2: n=1541, Report 3: n=1521) and for the level 2 course (Report 1: n=324, Report 2: n=321). Data are expressed as the mean+/-SEM time that students paused between clicks for each report, in minutes.

Figure 9: Frequency histogram of the duration students pause between clicks when interacting with feedback for the A) non-final reports (n=3686 reports) and B) final reports (n=1509 reports). Data are expressed as the proportion of pauses students took between clicks, ranging from milliseconds to hours (x scale is log transformed to improve clarity).

Academic Performance

Both first and second year students showed steady improvement in their performance on each report within each semester (Figure 10), with the average achievement on each subsequent task being significantly higher than on the preceding task (Level 1 Semester 1: p<0.001; Semester 2: p<0.001; Level 2: p<0.001). It should be noted that performance on the first task in the second year course was significantly lower than that on the final report in the first year course (p<0.001), potentially reflecting the increase in assessment complexity and expectations across the year levels (Colthorpe et al. 2015). Interestingly, the average performance of the first year students on the summative tasks was comparable between semesters (p=0.15), but their performance on the formative task in semester 1 was lower than on any summative task, regardless of semester (p<0.001; Figure 10). This suggests that students in semester 2 are not disadvantaged by the absence of the formative task; the experience they have gained from completing a semester of study prior to commencing this course may have been an effective substitute for any learning gains from the formative task.

Figure 10: Overall final mark for reports in the level 1 course (Report 0: n=682, Report 1: n=1571, Report 2: n=1541, Report 3: n=1521) and for the level 2 course (Report 1: n=324, Report 2: n=321). Data are expressed as the mean+/-SEM mark that students achieved for each report, as a percentage.

When student performance was compared to the extent to which they viewed their feedback, a number of differences were apparent, both in the average mark received on any given report and in the changes between reports. In both first and second year students, all categories of students demonstrated significant, sequential improvements in average marks as they progressed between reports, with two exceptions (Figure 11). Students who opened their feedback for only a short duration showed no significant improvement in performance between reports, and those first year students who did not open their feedback improved between the first and second summative reports but did not improve further (Figure 11). Generally, students who had their feedback open for medium or long durations achieved significantly higher marks than students who opened their feedback for short durations or did not open it at all, with the long-duration group also out-performing the medium-duration group on later reports (Figure 11). While the overall improvement in performance across subsequent tasks cannot be attributed entirely to feedback, as the experience of completing one task is expected to contribute to a subsequent, similar task (ref), the relationship between the duration for which reports were open and academic achievement does suggest that prolonged use of feedback is beneficial. Students who spend longer with their feedback open are more likely to show greater improvements in achievement on subsequent tasks than those who never open their feedback, or open it only briefly.
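To illustrate the kind of group comparison underlying Figure 11, the following R sketch compares simulated report marks across the four open-duration categories using an ANOVA with Tukey-adjusted pairwise comparisons. This is one plausible approach rather than the exact tests used in the study, and the marks and group sizes are simulated for illustration only.

    # Illustrative only: compare marks across open-duration categories
    set.seed(2)
    marks <- data.frame(
      category = factor(rep(c("unopened", "short", "medium", "long"), each = 30),
                        levels = c("unopened", "short", "medium", "long")),
      mark     = c(rnorm(30, 62, 8), rnorm(30, 63, 8), rnorm(30, 68, 8), rnorm(30, 72, 8))
    )

    fit <- aov(mark ~ category, data = marks)
    summary(fit)      # overall effect of open-duration category on marks
    TukeyHSD(fit)     # adjusted pairwise comparisons between categories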


Figure 11: Final marks for reports in the level 1, semester 1 course, with students categorised by the duration for which they opened their Report 0 feedback: unopened (blue bars; n = 57), short (green bars; <1 minute; n = 26), medium (yellow bars; >1 minute and <1 hour; n = 264) and long (orange bars; >1 hour; n = 320) open durations. Data are expressed as the mean+/-SEM mark that students achieved for each report, as a percentage.

We also investigated whether there were relationships between students' academic performance, their interactions with the feedback, and the amount of feedback provided by markers. Pioneering work in learning analytics, using social network analysis to investigate the interactions amongst students and staff in discussion boards, indicates that academics interact more with high-achieving students than with low-achieving students (Dawson 2010; Dawson, Tan, and McWilliam 2011). In contrast, our analyses suggest that markers provide more feedback to students who perform poorly on their first report (Figure X). However, this pattern breaks down for subsequent reports, where students who perform very poorly on Report 2 receive the least feedback (Figure X).