In your first Access lab, we’ll be conducting an experiment on memory in class.
After the lab, we will be writing our first lab report. A lab report has many different sections, which you’ll learn more about in class. One of these sections is our results section.
In the results section of a lab report, we need to describe to the reader what the results of our experiment were.
In your lab report, you’ll be asked to summarise your findings as well as describe any patterns in your data for the reader.
In a lab report, we summarise our data by calculating descriptive statistics - numbers that help to describe our data (for example, the average memory score for a particular group).
Your pre-lab task is to work through these instructions. The skills you’ll learn in this homework are the same skills we’ll use in the second week of the labs, when we analyse the data we collect in the lab.
The first thing we need to do is download the folder containing the data we’ll be practising with in this homework task. Download the folder from Moodle and save it on your computer.
Next, we need to set the working directory in R. This is an instruction to tell R where the folder with the data we want to use is saved on our computers.
To set the working directory follow these steps:
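As a rough sketch, the working directory can be set by typing the base R command setwd(). The folder path below is hypothetical (it is an assumption, not the real location on your computer), so replace it with wherever you saved the folder from Moodle.

```r
# Hypothetical location of the downloaded folder; change this to match your computer.
data_folder <- "~/Documents/Access_practice"

# Only change the working directory if the folder actually exists,
# so this code does not stop with an error if the path is wrong.
if (dir.exists(data_folder)) {
  setwd(data_folder)
}

# Show the folder R is currently working in.
getwd()

# List the files R can see in that folder; the data file should appear here.
list.files()
```

If the data file does not appear in the list, R is looking in the wrong folder and the path needs to be corrected.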
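Next, we need to load a package, which is a collection of functions that someone else has written for us to use. We load a package using the R command library() with the name of the package between the parentheses ().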
So, in this case (where our package is called “tidyverse”) we give R the command library(tidyverse)
library(tidyverse)
Now R knows where to find our data (in our working directory) and has loaded in a package (tidyverse) with all the functions we need. Next, we need to load in our data set. To do this we type in the command dat <- read_csv('Access_practice_data.csv')
dat <- read_csv('Access_practice_data.csv')
The command read_csv tells R we want to read in a ‘csv’ file (csv is a file type, just like .doc or .docx). We have to make sure the file name is typed in R exactly as it is saved on our computer. The arrow <- assigns the data we are reading in to a new name, dat.
For this practice session we’ll be looking at some made-up data. In this hypothetical example, we have 10 ‘day’ class Access students and 10 ‘evening’ class Access students, and we want to see whether there are differences in their final exam grades.
Let’s start by looking at our data by typing View(dat)
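To see read_csv and the arrow <- working together without needing the Moodle file, here is a small sketch. The temporary file, the name example_dat and the three rows of grades are all made up for illustration; they are not part of the practice data.

```r
library(tidyverse)

# Create a temporary file name ending in .csv (illustration only)
tmp_file <- tempfile(fileext = ".csv")

# Write three made-up rows to that file so we have something to read in
write_csv(
  tibble(id = 1:3,
         Class = c("Day", "Evening", "Day"),
         ExamGrade = c(18, 14, 20)),
  tmp_file
)

# Read the file back in and store it under the new name example_dat
example_dat <- read_csv(tmp_file)

# Typing the name on its own prints the data we stored
example_dat
```

The pattern is the same as in the command above: whatever is on the right of the arrow is stored under the name on the left.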
View(dat)
A new screen will appear showing us our data set. Our data has 20 rows (1 row for each participant) and 3 columns. The first column id gives the participant number. The second column Class tells us if our participant was in the Day or Evening class, and our third column ExamGrade gives us the exam grade (out of 22) for that participant.
Descriptive statistics are numbers that help to describe our data (for example, the average memory score for a particular group). In this exercise, we want to calculate the average grade for each group of participants, depending on whether they are in the Day or Evening class.
The average can also be referred to as the mean. The mean score is calculated by adding all the scores of one group together, and then dividing the total by the number of participants. Want to learn more about calculating means? Have a look at this website.
To calculate the mean exam grade for the day and evening classes separately, we first need to tell R we want to create 2 groups of participants. We do this by using the command group_by() and telling R how we want to group our data.
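The number of rows and columns can also be checked by typing commands. The sketch below uses a small made-up data set called example_dat (an assumption for illustration, with the same three column names as the practice data) rather than the real file.

```r
# Made-up data with the same column names as the practice data (illustration only)
example_dat <- data.frame(
  id = 1:4,
  Class = c("Day", "Day", "Evening", "Evening"),
  ExamGrade = c(19, 17, 15, 13)
)

# Number of rows (one per participant)
nrow(example_dat)

# Number of columns
ncol(example_dat)

# Names of the columns
names(example_dat)

# Show the first few rows of the data
head(example_dat)
```

Running the same commands on dat should report 20 rows and 3 columns.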
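To make the calculation concrete, here is a short sketch that works out a mean step by step and then checks it against R’s built-in mean() function. The five scores are made up for illustration.

```r
# Five made-up exam scores (illustration only)
scores <- c(18, 20, 16, 22, 14)

# Step 1: add all the scores together (18 + 20 + 16 + 22 + 14 = 90)
total <- sum(scores)

# Step 2: count how many scores there are (5)
n <- length(scores)

# Step 3: divide the total by the number of scores (90 / 5 = 18)
mean_by_hand <- total / n
mean_by_hand

# R's built-in function gives the same answer
mean(scores)
```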
dat_grouped <- group_by(dat, Class)
The command group_by(dat, Class) can be read as “group our data (called dat) by Class”. We have now made 2 groups - one for the day class and one for the evening class.
Now we can calculate the mean (i.e. average) exam score for each group by using the summarise() command.
dat_mean <- summarise(dat_grouped, mean=mean(ExamGrade))
The command summarise(dat_grouped, mean=mean(ExamGrade)) can be read as “summarise our data (called dat_grouped) by calculating the mean exam grade”. Because we have previously grouped our data, R knows to calculate the mean exam grade for each group.
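To check that grouping really does give one mean per group, here is a sketch using four made-up students whose means are easy to work out by hand. The names example_dat, example_grouped and example_mean are made up for illustration.

```r
library(tidyverse)

# Made-up grades for four hypothetical students (not the practice data)
example_dat <- tibble(
  id = 1:4,
  Class = c("Day", "Day", "Evening", "Evening"),
  ExamGrade = c(19, 17, 15, 13)
)

# Group the data by Class, exactly as we did with dat
example_grouped <- group_by(example_dat, Class)

# Calculate the mean exam grade for each group
example_mean <- summarise(example_grouped, mean = mean(ExamGrade))

# Day: (19 + 17) / 2 = 18, Evening: (15 + 13) / 2 = 14
example_mean
```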
We can check our data by typing View(dat_mean)
View(dat_mean)
We should now see a table that gives us the mean value for the day class (18.5) and the evening class (14.6).
The next type of descriptive statistic we want to calculate is known as the standard deviation. The standard deviation measures the amount of spread in our data (i.e. how close participants’ scores are to the mean). The standard deviation is a more difficult concept than the mean, so have a look at this website (https://www.mathsisfun.com/data/standard-deviation.html) for more information about how the standard deviation is calculated.
It is important that you understand why the standard deviation matters. Have a look at this website (https://www.dummies.com/education/math/statistics/why-standard-deviation-is-an-important-statistic/) for more information.
We calculate the standard deviation by using the function sd() inside summarise()
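To make the calculation concrete, here is a sketch that works out a standard deviation step by step and compares it with R’s built-in sd() function. The scores are made up for illustration. Note that sd() divides by the number of scores minus 1 (the “sample” standard deviation), so the sketch does the same.

```r
# Eight made-up scores (illustration only)
scores <- c(2, 4, 4, 4, 5, 5, 7, 9)

# Step 1: calculate the mean (40 / 8 = 5)
score_mean <- mean(scores)

# Step 2: work out how far each score is from the mean
deviations <- scores - score_mean

# Step 3: square each distance and add them all together (this gives 32)
sum_of_squares <- sum(deviations^2)

# Step 4: divide by the number of scores minus 1, then take the square root
sd_by_hand <- sqrt(sum_of_squares / (length(scores) - 1))
sd_by_hand

# R's built-in function gives the same answer
sd(scores)
```

The further the scores are from the mean, the bigger the squared distances become, and so the bigger the standard deviation.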
dat_sd <- summarise(dat_grouped, sd=sd(ExamGrade))
The command summarise(dat_grouped, sd=sd(ExamGrade)) can be read as “summarise our data (called dat_grouped) by calculating the standard deviation of the exam grade”. Because we have previously grouped our data, R knows to calculate the standard deviation of the exam grade for each group.
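As an optional extra, summarise() can calculate more than one statistic in a single command if we separate them with a comma. The sketch below shows this with six made-up students; the names example_dat, example_grouped and example_summary are made up for illustration.

```r
library(tidyverse)

# Made-up grades for six hypothetical students (not the practice data)
example_dat <- tibble(
  id = 1:6,
  Class = c("Day", "Day", "Day", "Evening", "Evening", "Evening"),
  ExamGrade = c(20, 18, 16, 16, 14, 12)
)

# Group the data by Class
example_grouped <- group_by(example_dat, Class)

# Calculate the mean and the standard deviation for each group in one go
example_summary <- summarise(example_grouped,
                             mean = mean(ExamGrade),
                             sd = sd(ExamGrade))

# One row per class, with a mean column and an sd column
example_summary
```

This gives one table containing both statistics, which can be handy when writing up a results section.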
We can check our data by typing View(dat_sd)
View(dat_sd)
Often in psychology we want to make a visual representation of our data. In this case, we want to make one of the simplest types of visual representation - a bar graph.
There are 2 stages to making our bar graph. First, we need to tell R what values we want to graph (our group means).
Our group means are stored in dat_mean. We are only interested in the column that includes our group means, which is the column called ‘mean’. To tell R we want to look at a particular column within a table we use the dollar sign $. So dat_mean$mean tells R we want to make a bar chart using only the ‘mean’ column from the dataset ‘dat_mean’.
Next, we need to give our barplot() a title using the argument main="title", as well as an x-axis label (horizontal, along the bottom of the graph) using xlab="x-label" and a y-axis label (vertical, along the side of the graph) using ylab="y-label".
The last argument names.arg=c() allows us to name our two bars. Try running the code below and check your graph looks the same.
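To see what the dollar sign does on its own, here is a small sketch. The table example_means and the values in it are made up for illustration.

```r
# A small made-up table of group means (illustration only)
example_means <- data.frame(
  Class = c("Day", "Evening"),
  mean = c(18, 14)
)

# The whole table
example_means

# Only the 'mean' column: just the two numbers, without the class names
example_means$mean

# Only the 'Class' column
example_means$Class
```

Because the dollar sign hands barplot() only the numbers, the class names are not carried along with them.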
barplot(dat_mean$mean, main="Average grade in Day and Evening class Students", xlab="Class", ylab = "Grade", names.arg=c("Day", "Evening"))
Our y-axis (along the side of the graph) is too short. So let’s add 1 extra argument ylim=c(ymin, ymax) to tell R where the y-axis should start and end, as in the code below:
barplot(dat_mean$mean, main="Average grade in Day and Evening class Students", xlab="Class", ylab = "Grade", names.arg=c("Day", "Evening"), ylim=c(0, 20))