Preliminaries:

Loading packages:

library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.3     v purrr   0.3.4
## v tibble  3.1.2     v dplyr   1.0.6
## v tidyr   1.1.3     v stringr 1.4.0
## v readr   1.4.0     v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(dplyr)
library(tibble)
library(psych)
## 
## Attaching package: 'psych'
## The following objects are masked from 'package:ggplot2':
## 
##     %+%, alpha
library(ggplot2)
library(plotrix)
## 
## Attaching package: 'plotrix'
## The following object is masked from 'package:psych':
## 
##     rescale

Loading data

cleandata <- read.csv("cleandata.csv")

Goals

This week our goal is to finish all our plots! So far we have finished all of the tables and have made a start on figure 4.

Therefore the overall goals are:

  • to finish figure 4
  • start/finish figure 3
  • start/finish figure 5

Once we finish those, we will have completed our replication plan and will just need to clean up our data and possibly find more efficient and neater ways to lay out our code.

Figure 4

Here is a refresher of what figure 4 looks like:

The issue with figure 4 last week was calculating the right values for the error bars.

After learning that the error bars actually show standard error, we found a helpful package called plotrix, which includes a function, std.error, that calculates the standard error for you.
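To make sure I understand what is being calculated, here is a minimal sketch of the usual standard error formula (standard deviation divided by the square root of the sample size). The helper name se_manual and the scores are made up for illustration, and I am assuming std.error follows this usual definition.

```r
# Hypothetical helper written for illustration; plotrix::std.error is the packaged equivalent
se_manual <- function(x) {
  x <- x[!is.na(x)]          # drop missing values first
  sd(x) / sqrt(length(x))    # standard deviation divided by the square root of n
}

scores <- c(0.12, 0.18, -0.51, -0.22, 0.14)  # made-up values, not from cleandata
se_manual(scores)
```

The key point is that the formula needs the individual observations, not just their mean.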

The plot shows the change in implicit bias at two time points (immediate and one-week delay) for each condition: cued and uncued.

Firstly, I created a tibble based off the data calculated for table 3. This time we exclude SD as this is not needed.

table3.1 <- tibble(
  statistics = c("mean1", "mean2"),
  Baseline = c(0.52, 0.60),
  Prenap = c(0.21, 0.30),
  Postnap = c(0.31, 0.25),
  Week = c(0.40, 0.40)
)

print(table3.1)
## # A tibble: 2 x 5
##   statistics Baseline Prenap Postnap  Week
##   <chr>         <dbl>  <dbl>   <dbl> <dbl>
## 1 mean1          0.52   0.21    0.31   0.4
## 2 mean2          0.6    0.3     0.25   0.4

I then mutated this data to create new variables for the immediate and week-delay time points, telling mutate how to calculate the values for each new column.

The std.error function is then used to calculate the standard error.

As shown in the output below, the SEs are:

  • immediate: 0.075
  • week: 0.045

table3.2 <- table3.1 %>% 
  mutate(week = Week - Prenap) %>% 
  mutate(immediate = Postnap- Prenap)

print(table3.2)
## # A tibble: 2 x 7
##   statistics Baseline Prenap Postnap  Week  week immediate
##   <chr>         <dbl>  <dbl>   <dbl> <dbl> <dbl>     <dbl>
## 1 mean1          0.52   0.21    0.31   0.4  0.19      0.1 
## 2 mean2          0.6    0.3     0.25   0.4  0.1      -0.05
std.error(table3.2)
## Warning in var(if (is.vector(x) || is.factor(x)) x else as.double(x), na.rm =
## na.rm): NAs introduced by coercion
## statistics   Baseline     Prenap    Postnap       Week       week  immediate 
##         NA      0.040      0.045      0.030      0.000      0.045      0.075

Creating a data frame for the figure

We then use all the collated information to create a data frame.

time1 <- c(rep("immediate",2),rep("week",2))
condition <-rep(c("cued","uncued"),2)
bias_change <- c(0.10, -0.05, 0.19, 0.10)
se <- c(0.075, 0.075, 0.045, 0.045)
data1 = data.frame(time1, condition, bias_change, se)

head(data1)
##       time1 condition bias_change    se
## 1 immediate      cued        0.10 0.075
## 2 immediate    uncued       -0.05 0.075
## 3      week      cued        0.19 0.045
## 4      week    uncued        0.10 0.045

Creating the plot

#plot
ggplot(data = data1, aes(
  x = time1,
  y = bias_change,
  fill = condition
)) +
  geom_bar(position = "dodge", stat = "identity", alpha=0.7) +
  geom_errorbar(aes(
    x= time1,
    ymin=bias_change-se,
    ymax=bias_change+se), 
    width=0.4, colour="grey", alpha= 0.9, position = position_dodge(0.9) ) +
  ylim(-0.2, 0.4) #where the y axis cuts off

It looks great but I think my values are still slightly off! Perhaps I applied the std.error function to the wrong type of data set? I’m not too sure where else I could apply this function to get the right error bars.

It’s tricky because on plots they don’t provide the exact limits of the error bars in numbers!

New method

I think the values are off because the standard error was calculated from the small table of condition means that I built separately, rather than from the participants' actual data.
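A quick sketch of why this matters, using the rounded means I entered in table3.1: with only two rows (one per condition), the standard error of a column works out to half the gap between the two means, which says nothing about how much participants vary. The object name means_only is just for illustration.

```r
# The two bias-change means per time point (row 1 = cued, row 2 = uncued),
# rebuilt from the rounded table 3 values
means_only <- data.frame(
  immediate = c(0.31 - 0.21, 0.25 - 0.30),
  week      = c(0.40 - 0.21, 0.40 - 0.30)
)

# Standard error across the two rows: sd / sqrt(n), with n = 2
se_two_rows <- sapply(means_only, function(x) sd(x) / sqrt(length(x)))
se_two_rows   # about 0.075 and 0.045, the values used for the first plot

# The same numbers, computed as half the gap between the two means
half_gap <- sapply(means_only, function(x) abs(diff(x)) / 2)
half_gap
```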

I tried a different method where I calculated the differences from the means and then tried to calculate the standard error, but the std.error function only came up with NA.

I then tried a method Julia discovered, where you calculate the standard error not from the means but from the full participant-level data set.

Selecting relevant variables

biaslevelsbycondition <- cleandata %>% 
  select(preIATcued, postIATcued, weekIATcued, preIATuncued, postIATuncued, weekIATuncued)

Mutate new columns

  • we create new variables that calculate the bias change for each participant
  • we then select only these variables, as this is the data we care about

biaschange <- biaslevelsbycondition %>% 
  mutate(immediatecued = postIATcued - preIATcued,
         immediateuncued = postIATuncued - preIATuncued,
         weekcued = weekIATcued - preIATcued,
         weekuncued = weekIATuncued - preIATuncued) %>% 
  select(immediatecued, immediateuncued, weekcued, weekuncued)

print(biaschange)
##    immediatecued immediateuncued    weekcued  weekuncued
## 1     0.12285724      0.25266550 -0.35527924  0.46815278
## 2     0.17844119     -0.39671291  0.59254353 -0.35055488
## 3    -0.51335388      0.30253356 -0.11217557  0.33197054
## 4    -0.21665750      1.89179922  0.95274783  1.14422385
## 5     0.13743091     -0.05132536 -0.32326556  0.37122917
## 6     0.84154653      0.01112392  0.41280881  0.86926351
## 7     0.65038119      0.71644219 -0.10239429 -0.48768220
## 8    -0.35864801      0.20623826 -0.84187543 -0.09632180
## 9    -0.83472353     -1.02913100  0.50672882 -0.65725296
## 10   -0.26523012     -0.78392404 -0.64736111 -0.48847160
## 11   -0.75408705     -0.30895573 -0.84633170 -0.10043463
## 12   -0.66885971      0.38734455  0.27775450  0.41514453
## 13    0.38734455     -0.37923293 -0.01512578  0.40248008
## 14   -0.02219890     -0.48931093 -0.27032684  0.18263630
## 15    1.06073941      0.28904094  0.72443999  0.37203804
## 16    1.31520850      0.06274791  1.93357079  0.72151897
## 17    0.21402818     -0.04269108  0.17602418  0.45106677
## 18   -0.32124822     -0.17573081 -0.02155470  0.19152527
## 19    0.34347364     -0.74241428  0.30278679  0.16319058
## 20    0.42292672      0.22720184  0.88422012 -0.28413048
## 21    0.45457335      0.43590559  0.59326907  0.56417303
## 22    0.25036047     -0.25107766 -0.26438277  0.20746287
## 23   -0.51281515     -0.41976944 -0.94395533 -1.00674734
## 24    0.64048165      0.03596020  1.54843702  0.08574250
## 25   -0.45685123     -0.97588376  0.15987627  0.16606509
## 26    0.43246721     -0.69468327  0.39009895  0.06119897
## 27    0.77170531     -0.13117862  0.51892659 -0.86071096
## 28    0.19732707      0.40017362  0.56270451  0.49227230
## 29   -0.03019308     -0.19573685 -0.11287244 -0.64154214
## 30    0.05255875      0.52459641  0.06005950  0.26276231
## 31   -0.54491511     -0.34708404  0.12103977  0.03916864

Calculating the means

biaschangemean <- biaschange %>% 
  summarise(across(contains("cued"), list(mean = mean)))

print(biaschangemean)
##   immediatecued_mean immediateuncued_mean weekcued_mean weekuncued_mean
## 1         0.09593775          -0.05390545     0.1890689      0.09643346

Finding standard error

This time I don’t use the means data set.

biaschangeerror <- std.error(biaschange)

print(biaschangeerror)
##   immediatecued immediateuncued        weekcued      weekuncued 
##      0.09759788      0.10297893      0.11593440      0.09008655

It worked! Time to apply this to the plot.

Final plot

Creating a data frame

time1 <- c(rep("immediate",2),rep("week",2))
condition <-rep(c("cued","uncued"),2)
bias_change <- c(0.09593775, -0.05390545, 0.1890689, 0.09643346)
se <- c(0.09759788, 0.10297893, 0.11593440, 0.09008655)
data1 = data.frame(time1, condition, bias_change, se)

head(data1)
##       time1 condition bias_change         se
## 1 immediate      cued  0.09593775 0.09759788
## 2 immediate    uncued -0.05390545 0.10297893
## 3      week      cued  0.18906890 0.11593440
## 4      week    uncued  0.09643346 0.09008655
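Rather than retyping the means and standard errors by hand, the same data frame could probably be assembled straight from the calculated values. This is only a sketch: biaschange_demo is a made-up stand-in for biaschange (random numbers, not our data), and se_manual is an illustrative helper rather than the plotrix function.

```r
# Hypothetical stand-in for biaschange: 31 rows of made-up change scores
set.seed(123)
biaschange_demo <- data.frame(
  immediatecued   = rnorm(31),
  immediateuncued = rnorm(31),
  weekcued        = rnorm(31),
  weekuncued      = rnorm(31)
)

# Illustrative helper: standard error of one column
se_manual <- function(x) sd(x) / sqrt(length(x))

# The columns are ordered immediate/cued, immediate/uncued, week/cued, week/uncued,
# so the labels below are written to line up with that order
data1_demo <- data.frame(
  time1       = rep(c("immediate", "week"), each = 2),
  condition   = rep(c("cued", "uncued"), times = 2),
  bias_change = unname(sapply(biaschange_demo, mean)),
  se          = unname(sapply(biaschange_demo, se_manual))
)

print(data1_demo)
```

This avoids copy-and-paste slips, but it does rely on the columns staying in that order.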

The plot

#plot
ggplot(data = data1, aes(
  x = time1,
  y = bias_change,
  fill = condition
)) +
  geom_bar(position = "dodge", stat = "identity", alpha=0.7) +
  geom_errorbar(aes(
    x= time1,
    ymin=bias_change-se,
    ymax=bias_change+se), 
    width=0.4, colour="grey", alpha= 0.9, position = position_dodge(0.9) ) +
  ylim(-0.2, 0.4) #where the y axis cuts off

Figure 3

Figure three displays the average bias levels over each time point for both cued and uncued conditions. This is basically a graphical display of the data from table three.

figure 3

table 3

Trying to make the data set and basic plot

Using the data I replicated from table 3, I attempted to create a data frame, using the same method as I did for figure 4.

To create a data frame I need to insert values or labels into different groups, i.e. time points, conditions and values. Then I indicated to R that I wanted to make a data frame including those variables.

time <- c(rep("Baseline", 2), rep("Prenap", 2), rep("Postnap", 2), rep("1-week", 2))
condition <-  rep(c("cued", "uncued"), 2)
bias_av <- c(0.52, 0.60, 0.21, 0.30, 0.31, 0.25, 0.40, 0.40)
data2 = data.frame(time, condition, bias_av)

head(data2)
##       time condition bias_av
## 1 Baseline      cued    0.52
## 2 Baseline    uncued    0.60
## 3   Prenap      cued    0.21
## 4   Prenap    uncued    0.30
## 5  Postnap      cued    0.31
## 6  Postnap    uncued    0.25

Plot attempt 1

I then attempted to plot the data, assuming that the graph is a line graph.

ggplot(data = data2, aes(
  x = time, 
  y = bias_av)) +
  geom_line()

Hm, doesn’t look quite right, in fact it’s very off. Perhaps I have the values entered wrong, or the layout of a dataset is different for line plots.

Second attempt at data set where I changed the order of the values I entered in bias_av

time <- c(rep("Baseline", 4), rep("Prenap", 4), rep("Postnap", 4), rep("1-week", 4))
condition <-  rep(c("cued", "uncued"), 2)
bias_av <- c(0.52, 0.21, 0.31, 0.40, 0.60, 0.30, 0.25, 0.40)
data2 = data.frame(time, condition, bias_av)

head(data2)
##       time condition bias_av
## 1 Baseline      cued    0.52
## 2 Baseline    uncued    0.21
## 3 Baseline      cued    0.31
## 4 Baseline    uncued    0.40
## 5   Prenap      cued    0.60
## 6   Prenap    uncued    0.30

Plot attempt 2:

  • this time I tried to see whether dodging the position of the lines would make a difference; perhaps it would help show the two conditions separately?

ggplot(data = data2, aes(
  x = time, 
  y = bias_av,
  fill= condition)) +
  geom_line(position= "dodge", stat = "identity")
## Warning: Width not defined. Set with `position_dodge(width = ?)`

Still not quite there!

This time I resorted to our trusty friend Google and remade the data frame, following the layout from a guide I found.

In this data frame, I repeated the conditions so each time point could correspond with each condition.

data3 <- data.frame(
  condition = factor(c("cued", "cued", "uncued", "uncued")),
  time = factor(c("Baseline", "Prenap", "Postnap", "1-week")),
  levels = c("cued", "uncued"),
  bias_av = c(0.52, 0.21, 0.31, 0.40, 0.60, 0.30, 0.25, 0.40)
)

head(data3)
##   condition     time levels bias_av
## 1      cued Baseline   cued    0.52
## 2      cued   Prenap uncued    0.21
## 3    uncued  Postnap   cued    0.31
## 4    uncued   1-week uncued    0.40
## 5      cued Baseline   cued    0.60
## 6      cued   Prenap uncued    0.30

Plot attempt 3

ggplot(data = data3, aes(
  x = time,
  y = bias_av,
  group = condition)) +
  geom_line(position = "dodge")
## Warning: Width not defined. Set with `position_dodge(width = ?)`

We now have lines! I noticed that my time points were not in the order I entered them. Instead, they were in alphabetical order, which is how R orders factor levels by default.

Now back to the data frame, making sure the conditions and time points line up. This time I duplicated the time points so that each time point could be paired with each condition.

data4 <- data.frame(
  condition = factor(c("cued", "cued", "cued", "cued", "uncued", "uncued", "uncued", "uncued")),
  time = factor(c("Baseline", "Prenap", "Postnap", "1-week", "Baseline", "Prenap", "Postnap", "1-week")),
  levels = c("Baseline", "Prenap", "Postnap", "1-week"),
  bias_av = c(0.52, 0.21, 0.31, 0.40, 0.60, 0.30, 0.25, 0.40)
)

head(data4)
##   condition     time   levels bias_av
## 1      cued Baseline Baseline    0.52
## 2      cued   Prenap   Prenap    0.21
## 3      cued  Postnap  Postnap    0.31
## 4      cued   1-week   1-week    0.40
## 5    uncued Baseline Baseline    0.60
## 6    uncued   Prenap   Prenap    0.30

Plot attempt 4

In this attempt I also added colour to the lines. I also googled how to reorder my time points by indicating it in the aesthetics section which worked!

ggplot(data = data4, aes(
  x = factor(time, levels = c("Baseline", "Prenap", "Postnap", "1-week")), #googled how to reorder x variables ! 
  y = bias_av,
  colour = condition,
  group = condition)) +
  geom_line()

The graph is so close! Just need to add error bars.
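As a side note, an alternative to reordering inside aes() might be to set the order once, when the data frame is built. The sketch below is an assumption about how I could tidy this later, not what the plots above use; time_levels and data_long are illustrative names.

```r
# Desired left-to-right order of the time points
time_levels <- c("Baseline", "Prenap", "Postnap", "1-week")

# One row per condition x time point, with the order stored in the factor itself
data_long <- data.frame(
  condition = rep(c("cued", "uncued"), each = 4),
  time      = factor(rep(time_levels, times = 2), levels = time_levels),
  bias_av   = c(0.52, 0.21, 0.31, 0.40, 0.60, 0.30, 0.25, 0.40)
)

levels(data_long$time)  # follows time_levels rather than alphabetical order
```

With the levels stored in the column, plotting x = time should keep this order without any extra work in the plot code.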

Error bars and calculating SE

Following last week’s learning log I learnt that for error bars, researchers tend to use SE and NOT SD.

Attempt 1 - describe function

Firstly, I tried using the describe function from the psych package.

table3.2 <- table3.1 %>% 
  mutate(week = Week - Prenap) %>% 
  mutate(immediate = Postnap- Prenap)

describe(table3.2)

Then I used the values from the describe function to add an SE variable to my data frame:

data4 <- data.frame(
  condition = factor(c("cued", "cued", "cued", "cued", "uncued", "uncued", "uncued", "uncued")),
  time = factor(c("Baseline", "Prenap", "Postnap", "1-week", "Baseline", "Prenap", "Postnap", "1-week")),
  levels = c("Baseline", "Prenap", "Postnap", "1-week"),
  bias_av = c(0.52, 0.21, 0.31, 0.40, 0.60, 0.30, 0.25, 0.40),
  se = c(0.04, 0.04, 0.03, 0.03)
)

head(data4)
##   condition     time   levels bias_av   se
## 1      cued Baseline Baseline    0.52 0.04
## 2      cued   Prenap   Prenap    0.21 0.04
## 3      cued  Postnap  Postnap    0.31 0.03
## 4      cued   1-week   1-week    0.40 0.03
## 5    uncued Baseline Baseline    0.60 0.04
## 6    uncued   Prenap   Prenap    0.30 0.04

Using the same principles for error bars in my last learning log, I added error bars to figure 3!

ggplot(data = data4, aes(
  x = factor(time, levels = c("Baseline", "Prenap", "Postnap", "1-week")), 
  y = bias_av,
  colour = condition,
  group = condition)) +
  geom_line() +
  # x is inherited from the main aes(), so only ymin and ymax are needed here
  geom_errorbar(aes(
    ymin = bias_av - se,
    ymax = bias_av + se),
    width = 0.1, colour = "black", alpha = 0.9)

It looks great, but the SE values look slightly off, so perhaps I calculated them wrong. You can tell from the 1-week delay time point, where the SE is the same for both conditions. In the original figure, the SEs are not the same!
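One likely reason the SEs match across conditions: I only supplied four SE values for eight rows, and data.frame repeats a shorter vector to fill the remaining rows. A small sketch of that behaviour (recycle_demo is an illustrative name):

```r
# Four SE values supplied for eight rows: data.frame() repeats them to fill the column
recycle_demo <- data.frame(
  condition = rep(c("cued", "uncued"), each = 4),
  se        = c(0.04, 0.04, 0.03, 0.03)
)

print(recycle_demo)

# The uncued rows simply reuse the cued values
identical(recycle_demo$se[1:4], recycle_demo$se[5:8])
```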

Attempt 2 - std.error

This time I tried a new method using std.error which automatically calculates se!

Load the package

library(plotrix)

Select relevant variables and calculate SE

table3 <- cleandata %>% 
  select(baseIATcued, baseIATuncued, preIATcued, preIATuncued, postIATcued, postIATuncued, weekIATcued, weekIATuncued) #select relevant variables 


setable3 <- std.error(table3)
print(setable3)
##   baseIATcued baseIATuncued    preIATcued  preIATuncued   postIATcued 
##    0.06522755    0.08030357    0.09232149    0.07937978    0.07984374 
## postIATuncued   weekIATcued weekIATuncued 
##    0.08578681    0.06954221    0.08388760

I then just copied and pasted these values into the data frame:

data4 <- data.frame(
  condition = factor(c("cued", "cued", "cued", "cued", "uncued", "uncued", "uncued", "uncued")),
  time = factor(c("Baseline", "Prenap", "Postnap", "1-week", "Baseline", "Prenap", "Postnap", "1-week")),
  levels = c("Baseline", "Prenap", "Postnap", "1-week"),
  bias_av = c(0.52, 0.21, 0.31, 0.40, 0.60, 0.30, 0.25, 0.40),
  se = c(0.06522755, 0.09232149, 0.07984374, 0.06954221, 0.08030357, 0.07937978, 0.08578681, 0.08388760)
)

head(data4)
##   condition     time   levels bias_av         se
## 1      cued Baseline Baseline    0.52 0.06522755
## 2      cued   Prenap   Prenap    0.21 0.09232149
## 3      cued  Postnap  Postnap    0.31 0.07984374
## 4      cued   1-week   1-week    0.40 0.06954221
## 5    uncued Baseline Baseline    0.60 0.08030357
## 6    uncued   Prenap   Prenap    0.30 0.07937978
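Copying and pasting works, but it is easy to put a value in the wrong row. A possible alternative is to look each SE up by its column name. In this sketch, se_named holds the values printed by std.error above, while prefix and data_se are names I made up for illustration.

```r
# SE values as printed by std.error(table3), kept with their column names
se_named <- c(
  baseIATcued = 0.06522755, baseIATuncued = 0.08030357,
  preIATcued  = 0.09232149, preIATuncued  = 0.07937978,
  postIATcued = 0.07984374, postIATuncued = 0.08578681,
  weekIATcued = 0.06954221, weekIATuncued = 0.08388760
)

# How each plot label maps onto the start of the column names
prefix <- c("Baseline" = "base", "Prenap" = "pre", "Postnap" = "post", "1-week" = "week")

data_se <- data.frame(
  condition = rep(c("cued", "uncued"), each = 4),
  time      = rep(names(prefix), times = 2),
  stringsAsFactors = FALSE
)

# Build the matching column name for each row, e.g. "preIATuncued", and look up its SE
key <- paste0(prefix[as.character(data_se$time)], "IAT", data_se$condition)
data_se$se <- unname(se_named[key])

print(data_se)
```

The se column should come out in the same order as the values I pasted in by hand.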

Now to test to see if it worked with the plot!

ggplot(data = data4, aes(
  x = factor(time, levels = c("Baseline", "Prenap", "Postnap", "1-week")), #googled how to reorder x variables ! 
  y = bias_av,
  colour = condition,
  group = condition)) +
  geom_line() +
  # x is inherited from the main aes(), so only ymin and ymax are needed here
  geom_errorbar(aes(
    ymin = bias_av - se,
    ymax = bias_av + se),
    width = 0.1, colour = "grey", alpha = 0.9)

Amazing! Looks like the original. The next steps are just to tidy up the plot, maybe play with colours, and edit the titles and labels.

I’ll note that a challenge with error bars is that there are no defined values given to us from the paper. Therefore we are just hoping/guessing that our error bars are correct!

Figure 5

Our last plot!

Funnily enough, as a group we all agreed we somehow picked the hardest plot to start with (figure 4). However, starting with that plot gave us a lot of insight and knowledge to smash out the other two plots.

I will give props to Kath who completed the figure first to share with everyone but I thought I would still have a go myself.

Figure 5 displays the association between bias change and sleep duration.

A variable for the x axis already exists in the data provided (SWSxREM), therefore, we only need to calculate the values for the y axis: differential bias change.

According to the paper, differential bias change is calculated using the following formula:

(baselinecued - delayedcued) - (baselineuncued - delayeduncued)

All of which are variables that exist in the data! (under different names)

Therefore, to calculate differential bias change, I need to mutate the data to create a new variable.

Below, I am selecting the relevant variables, and mutating them to calculate differential bias change for each participant.

differential <- cleandata %>%
  select(ParticipantID, baseIATcued, weekIATcued, baseIATuncued, weekIATuncued, SWSxREM) %>% 
  mutate(cued_differential = baseIATcued - weekIATcued,
         uncued_differential = baseIATuncued - weekIATuncued,
         diff_bias_change = cued_differential - uncued_differential)

print(differential)
##    ParticipantID baseIATcued weekIATcued baseIATuncued weekIATuncued SWSxREM
## 1            ub6  0.57544182  0.20377367    0.60953653    0.68277422     276
## 2            ub7  0.09911241  0.45873715    0.64396538   -0.01070460       0
## 3            ub8  0.20577365  0.39859469    1.52435622    0.71187286     408
## 4            ub9  0.35314196  0.92341592    0.13108478    0.20212832     408
## 5           ub11  0.57200207 -0.01869151    0.04879409    0.13071184      32
## 6           ub13  0.31025514  0.56073473    0.90121486    1.11629844     648
## 7           ub14  0.23241080 -0.06857532    1.50094682   -0.27687601     333
## 8           ub15  0.67870908  0.25359928    0.61393136    0.03100248      36
## 9           ub18  1.08814254  0.77816500    0.52245709   -0.31888702     552
## 10          ub24  0.55318776  0.08084087    0.28256540    0.03038869     275
## 11          ub25  0.91751740  0.28109140    0.41293602    0.58451878      70
## 12          ub27  0.68424700  0.51216459    0.44334149    0.98690632     496
## 13          ub28  1.08844176  0.55663600    0.77617187    0.93836953       0
## 14          ub29  0.94430311  0.21879167    0.76616645    0.40828505     476
## 15          ub31 -0.15051656  0.64307684    1.20427060    0.88612653     363
## 16          ub32 -0.01583277  0.90497541    0.89984149    0.99438842       0
## 17          ub34  0.53679315  0.57458515    0.26553538    0.00799229     198
## 18          ub35  0.89884164 -0.12045279    0.94869928    1.19622739       0
## 19          ub36  0.63866504  0.64107173    0.36672662    0.62092800     216
## 20          ub38 -0.23209723  0.13055599   -0.20514602   -0.07578923     240
## 21          ub40  0.35954000  0.30356801    0.51371306    0.50089370     450
## 22          ub41  0.26726994 -0.24029660    0.35867739    0.02753136     684
## 23          ub42  0.84482587 -0.37601343    0.63582486   -0.55788997     836
## 24          ub43  0.63319883  0.99021393    0.56643958    0.38961081     506
## 25          ub44  0.43954561  0.62261116    0.93824426    0.98645892       0
## 26          ub45  0.73144877  0.16658792    1.08670619    0.86611864     418
## 27          ub46 -0.07735156  0.05022121    0.08362815   -0.10227978     480
## 28          ub47  1.08893601  1.22970000   -0.18741515    0.16271922      50
## 29          ub48  0.79863715  0.89204287    1.32229673    0.62636312       0
## 30          ub49  0.44411896  0.16209957    0.33019024    0.33564519     336
## 31          ub50  0.53631389  0.68478850    0.15458906    0.28350526     224
##    cued_differential uncued_differential diff_bias_change
## 1        0.371668148         -0.07323769       0.44490584
## 2       -0.359624732          0.65466998      -1.01429471
## 3       -0.192821041          0.81248335      -1.00530440
## 4       -0.570273967         -0.07104354      -0.49923043
## 5        0.590693579         -0.08191775       0.67261133
## 6       -0.250479588         -0.21508358      -0.03539601
## 7        0.300986111          1.77782283      -1.47683672
## 8        0.425109806          0.58292888      -0.15781907
## 9        0.309977544          0.84134411      -0.53136657
## 10       0.472346888          0.25217670       0.22017019
## 11       0.636426002         -0.17158276       0.80800876
## 12       0.172082405         -0.54356483       0.71564723
## 13       0.531805761         -0.16219766       0.69400342
## 14       0.725511445          0.35788139       0.36763005
## 15      -0.793593406          0.31814407      -1.11173748
## 16      -0.920808186         -0.09454693      -0.82626125
## 17      -0.037791997          0.25754309      -0.29533509
## 18       1.019294435         -0.24752811       1.26682255
## 19      -0.002406687         -0.25420138       0.25179469
## 20      -0.362653222         -0.12935678      -0.23329644
## 21       0.055971988          0.01281936       0.04315263
## 22       0.507566544          0.33114603       0.17642051
## 23       1.220839300          1.19371483       0.02712447
## 24      -0.357015106          0.17682877      -0.53384387
## 25      -0.183065550         -0.04821467      -0.13485088
## 26       0.564860845          0.22058754       0.34427330
## 27      -0.127572774          0.18590793      -0.31348070
## 28      -0.140763994         -0.35013437       0.20937037
## 29      -0.093405722          0.69593361      -0.78933933
## 30       0.282019388         -0.00545494       0.28747433
## 31      -0.148474613         -0.12891620      -0.01955842

Now I need to plot this data using ggplot.

I will use geom_point to plot the data.

ggplot(data = differential, aes(
  x = SWSxREM,
  y = diff_bias_change
)) +
  geom_point()

Looks good so far, now need to add a regression line using geom_smooth.

ggplot(data = differential, aes(
  x = SWSxREM,
  y = diff_bias_change
)) +
  geom_point() +
  geom_smooth()
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'

Hm, looks good but we want a straight regression line and no confidence interval area.

By putting method = lm in the brackets, where lm stands for linear model, I can make the regression line straight. I use se = F to remove the shading: se controls whether the confidence band around the line is drawn, and F is shorthand for FALSE.

ggplot(data = differential, aes(
  x = SWSxREM,
  y = diff_bias_change
)) +
  geom_point() +
  geom_smooth(method = lm, se = F)
## `geom_smooth()` using formula 'y ~ x'
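To see what the straight line represents, here is a rough sketch that fits the same kind of linear model directly with lm. It uses only the first six participants from the output above, so the numbers are for illustration and are not the real result; demo6 and fit are names I made up.

```r
# First six participants only (copied from the printed output above)
demo6 <- data.frame(
  SWSxREM          = c(276, 0, 408, 408, 32, 648),
  diff_bias_change = c(0.44490584, -1.01429471, -1.00530440,
                       -0.49923043, 0.67261133, -0.03539601)
)

# The same y ~ x formula that geom_smooth reports using
fit <- lm(diff_bias_change ~ SWSxREM, data = demo6)

coef(fit)  # intercept and slope of the fitted straight line
```

Running this on the full differential data frame instead of demo6 should give the intercept and slope of the line in the plot.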

For some reason, the plot is not starting at 0 for the x-axis like the original. I can try to use xlim to fix this.

ggplot(data = differential, aes(
  x = SWSxREM,
  y = diff_bias_change
)) +
  geom_point() +
  geom_smooth(method = lm, se = F) +
  xlim(0, 1000)
## `geom_smooth()` using formula 'y ~ x'

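xlim didn’t work: it sets the limits, but ggplot still adds a little padding on either side of a continuous axis by default. Kath found a solution on Google using the scale_x_continuous function, where expand = c(0,0) removes that padding.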

Along with this, I added a title, and theme.

ggplot(data = differential, aes(
  x = SWSxREM,
  y = diff_bias_change
))+
  geom_point()+
  geom_smooth(method = lm, 
              se = F)+ 
  scale_x_continuous(expand = c(0,0),limits = c(0,1000))+ 
  scale_y_continuous(expand = c(0,0),limits = c(-2,1.5))+
  labs(title = "Fig 5. No association between minutes in SWS x minutes in REM and differential bias change", 
       x = "SWS x REM sleep duration (min)",
       y = "Differential bias change")+
  theme_bw()
## `geom_smooth()` using formula 'y ~ x'

Figure 5 looks great!

Next steps

The next steps are mainly just to clean up my data and get a better understanding of all the functions and steps I used to get to the tables and plots for the report!

This includes creating final copies of my data and adding things like headings to plots. Perhaps I could even play with colours and types of lines.