The Effect of a Randomised Flipped Classroom Intervention on Mathematics Performance

Introduction

We present results from a randomised controlled trial in the teaching of a first-year mathematics course. The course was an elementary mathematics course for business students, taught in the second semester of 2013. Our experiment consisted of randomly forming two groups of students at the beginning of the course. One group (n=132) received traditional lectures, i.e. business as usual, while the other group (n=90) received instruction according to the flipped classroom principle. Because the groups were formed by randomization, marked differences between the groups after the intervention are attributable to the difference in received instruction.

In this report we focus on performance in mathematics. In addition to performance, it is also of interest to consider attitudes toward mathematics as an important outcome of mathematics instruction. The impact of the intervention (flipped classroom) on attitude, such as enjoyment of mathematics, value of mathematics in society and self-confidence in mathematics, will be presented in a related report.

Pre- and post-tests of achievement were administered at the beginning and end of the instruction period, which lasted ten weeks with three hours of instruction each week. Four weeks after the post-test the students were evaluated in a final, formal exam. While the pre- and post-tests were low-stakes, with no impact on the grade, the exam was high-stakes and fully determined the final grade.

The pre-test, post-test and exam were three different multiple-choice tests. The pre-test was meant to indicate the entry-level in mathematical skills for each student. The post-test was an exam-like test covering all the central learning objectives of the course.

Background and description of the intervention

Instruction with the flipped classroom concentrates on solving exercises, discussion and group work. The transfer of knowledge, which is the main task of traditional lectures, is relegated to a series of online videos that the student watches outside class in preparation for class time. In class the student is then free to work on mathematical exercises and get feedback from the instructor and fellow students.

The flipped classroom had previously been implemented in an elementary statistics course in the first semester of 2013, with 70 students. The students reported higher satisfaction with the course and a more positive attitude towards statistics after completing the course with the flipped classroom, compared to previous years and to other students simultaneously taking the same course with traditional lectures (and different instructors). However, the switch to a flipped classroom did not alter the achievement of the students. That is, abandoning lectures neither improved nor worsened the performance of the students on the final exam, compared to previous years with the same instructor, and compared to other classes taught at the same time with traditional teaching.

We suspected that this lack of effect on achievement might be due to a sub-optimal use of class time in the flipped classroom environment. In the statistics course each student was allowed to work at his or her own pace, i.e. asynchronously. This meant that each student was responsible for his or her own progress, and that students meeting in class time to work on problems were not necessarily at the same point in the material, so they could not automatically discuss and work together.

Hence, for the second-semester mathematics course investigated in the present study, we abandoned the asynchronous model and released the videos in weekly modules, each to be worked on during the following week, rather than all at once. We also implemented team-based learning. Heterogeneous teams of 5-7 students were formed and maintained throughout the semester. For each instruction unit (a 3-hour session once a week) we prepared exercise sheets to be worked on during the unit. The exercises related to the video content that was assigned one week in advance, and to previous material the students had already worked with. The students were asked to first work on the problems individually, for about 70 minutes. Then the students grouped with their team, discussed the exercises and together decided upon a common team answer to each exercise. After 70 minutes of team work the students reported their team answers, and the instructor went through all exercises. The scores of each team were collected and made available on the online learning platform. The team scores did not have any impact on the final grade. Hence one condition for teams to work optimally, namely that the team work is incorporated in the grading, was not implemented in the present study.

The intervention period ran over ten weeks, with a 3-hour session each week for both the traditional lectures and the flipped classroom. The instructor was the same for both groups, and the groups were given instruction on the same days. Hence, 30 hours of instruction were given under quite similar external conditions for both groups. One difference, however, was that while the lectures were given in an auditorium, the flipped sessions took place in a space with a common area, with tables and chairs, and also with cells where the students could work in groups and use whiteboards.

Design


In the first instruction session the students were randomly assigned to the treatment/intervention group and the control group (traditional). Due to space restrictions, about 100 students were assigned to the flipped group. The group that received lectures was larger, with about 130 students. The students were informed about the background and purpose of the experiment, and students from the flipped group were allowed to apply for a transfer to the control group. Reasons for this could be practical (e.g. timetable) or skepticism toward a new instruction method. A total of five students were allowed to switch from flipped to lectures. Although this compromises the purity of our randomised controlled trial, our conclusions are not invalidated by this minor deviation from randomization.
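The random assignment described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the procedure actually used in the study: the student identifiers, the total of 230 students, the fixed capacity of 100 for the flipped group and the seed are all hypothetical, and the later transfer of five students is not modelled.

```python
import random


def assign_groups(student_ids, flipped_capacity, seed=None):
    """Randomly split students into a flipped (treatment) and a lecture (control) group.

    flipped_capacity is the number of seats in the flipped group (an assumption
    standing in for the space restriction mentioned in the text).
    """
    rng = random.Random(seed)      # a seeded generator makes the split reproducible
    shuffled = list(student_ids)   # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    flipped = sorted(shuffled[:flipped_capacity])
    lectures = sorted(shuffled[flipped_capacity:])
    return flipped, lectures


# Hypothetical roster of 230 students (IDs are made up for illustration).
students = [f"S{i:03d}" for i in range(1, 231)]
flipped, lectures = assign_groups(students, flipped_capacity=100, seed=2013)
print(len(flipped), len(lectures))  # 100 130
```

Shuffling the full roster and cutting it at the capacity gives every student the same chance of ending up in the flipped group, which is the property that justifies comparing the groups afterwards.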

Three different multiple-choice performance tests were administered on three occasions. The pre-test consisted of 16 questions and measured the entry-level skills of the students. It was administered on the first day of the course. In total, 132 students from the control group and 90 students from the treatment (flipped) group completed the test. At the end of the treatment period (10 weeks) all students were invited to complete the post-test, which consisted of 20 questions designed to measure the learning outcomes of the course. The post-test was designed to imitate a typical final exam. It was checked by an external lecturer to ensure that it covered the course content. The post-test was low-stakes, i.e. it was not mandatory and did not affect the grading of students. The grade each student received was based on the final exam, which consisted of 20 questions and was created by an external authority.

Performance Results for Treatment and Control Groups

The following figures sum up the basic facts. The first figure depicts boxplots of the percentage of correct answers for the treatment and control groups on each of the three tests. The standardized score is the percentage of correct answers on the test. As we can see, the groups perform similarly on the pre-test. This was expected, since we used randomisation to form the groups and since the pre-test was given before the treatment started.

On the post-test, however, the treatment group (flipped) performed better overall than the control group (traditional, shown in red). Importantly, four weeks after the treatment stopped, the treatment group still performed markedly better than the control group on the final exam. For instance, the failure rate was 16% for the treatment group and 22% for the control group.

[Figure: boxplots of the percentage of correct answers for the treatment and control groups on the pre-test, post-test and exam]

A boxplot is a graphical representation of five key values: minimum, lower quartile, median (thick black line), upper quartile and maximum. We can also report the longitudinal development of performance scores in terms of mean values and confidence bands:

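[Figure: mean performance scores with confidence bands for the treatment and control groups across the pre-test, post-test and exam]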
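The two kinds of summary shown in the figures can be computed along the following lines. This is a minimal sketch with made-up scores, not the study's data. Two choices here are assumptions rather than anything stated in the report: the quartile convention (`method="inclusive"` in Python's `statistics.quantiles`; plotting software may use a slightly different rule) and the 95% interval based on the normal approximation (z = 1.96).

```python
import math
import statistics


def five_number_summary(scores):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    q1, median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    return min(scores), q1, median, q3, max(scores)


def mean_with_ci(scores, z=1.96):
    """Return (mean, lower, upper) for an approximate 95% confidence interval.

    Uses the normal approximation: mean +/- z * standard error of the mean.
    """
    mean = statistics.fmean(scores)
    sem = statistics.stdev(scores) / math.sqrt(len(scores))
    return mean, mean - z * sem, mean + z * sem


# Made-up percentage scores for one group on one test (illustration only).
scores = [35, 40, 50, 55, 60, 65, 70, 75, 80, 90, 95]

print(five_number_summary(scores))  # (35, 52.5, 65.0, 77.5, 95)
print(mean_with_ci(scores))
```

Applying `mean_with_ci` to each group on each of the three tests would give the points and band limits of a plot like the second figure.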

Summary

The flipped intervention in this randomised controlled trial had a marked effect on mathematics performance. We have presented performance scores for three successive tests: pre-test, post-test and exam. As expected, due to randomization, the treatment and control groups performed similarly on the pre-test. On the post-intervention test, however, the treatment group performed significantly better than the control group. In statistical terms, the null hypothesis of equal means in the treatment and control populations was rejected at all conventional levels of significance, with test statistic \( t=4.2 \) yielding a p-value \( <0.0001 \). We can conclude that this increased performance is caused by some element(s) of the flipped instruction style. Based on our previous experience with the flipped classroom in a statistics course, where we found no effect, we speculate that the main element responsible for the improved performance is the implementation of active learning elements such as team-based learning. Finally, we found that the increased performance for the treatment group was still present at the exam, four weeks after the treatment was terminated. As expected, the gap between the groups narrowed during these four weeks, but the difference was still significant, with a test statistic of \( t=2.7 \) and a p-value \( <0.01 \).
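For readers who want to see how a test of equal means works, here is a minimal sketch. The report does not say which variant of the t-test was used, so Welch's unequal-variance form is an assumption, and the scores below are made up. The second helper converts a t statistic to an approximate two-sided p-value using the normal distribution, which is a reasonable approximation only because the groups are large (about 90 and 130 students). It is applied to the two statistics reported above.

```python
import math
from statistics import NormalDist, fmean, variance


def welch_t(a, b):
    """Welch's two-sample t statistic and its approximate degrees of freedom."""
    va_n = variance(a) / len(a)   # squared standard error of mean(a)
    vb_n = variance(b) / len(b)   # squared standard error of mean(b)
    t = (fmean(a) - fmean(b)) / math.sqrt(va_n + vb_n)
    df = (va_n + vb_n) ** 2 / (va_n ** 2 / (len(a) - 1) + vb_n ** 2 / (len(b) - 1))
    return t, df


def approx_two_sided_p(t):
    """Two-sided p-value under the normal approximation to the t distribution."""
    return 2 * NormalDist().cdf(-abs(t))


# Made-up percentage scores (illustration only, not the study's data).
treatment = [55, 60, 65, 70, 75, 80, 85, 90]
control = [40, 50, 55, 60, 65, 70, 75, 80]
t, df = welch_t(treatment, control)
print(f"t = {t:.2f}, df = {df:.1f}")

# Check the reported statistics against the stated p-value bounds.
for reported_t, bound in [(4.2, 0.0001), (2.7, 0.01)]:
    p = approx_two_sided_p(reported_t)
    print(reported_t, p < bound)
```

Under this approximation both reported statistics fall below their stated bounds, consistent with the p-values given in the summary.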