#Global options
knitr::opts_chunk$set(echo = TRUE, 
                      message = FALSE, 
                      warning = FALSE, 
                      include = TRUE, # keep each chunk and its output in the report
                      cache = FALSE)

knitr::opts_knit$set(progress = TRUE, verbose = TRUE)

# auto format (kable)
options(kableExtra.auto_format = FALSE)

#get file
load(file="C:/Users/luisf/Dropbox/Puc-Rio/Projeto - Memoria da dor 3/Base R - memory of pain 2018.RData")

Note: The first sections include all the processing I conducted on the original data frames. All code was hidden from this markdown report, but you can see it by clicking “CODE” in the top right-hand corner. All commands were left intact to document what was done with the datasets. Therefore, if you want to load the two original files (one in SPSS format and the other in CSV format) and arrive at the same “final”, processed dataset (ds), you are welcome to do so. Otherwise, you can load the R file from my OSF (https://osf.io/rwuz9/ , processing and analysis, importable data, r files) and go straight to section 3 (“Results”), around line 250.
Thank you.
Last update: March 22, 2021
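
For readers taking the shortcut described in the note, here is a minimal sketch of loading a saved workspace into its own environment, so that objects already in the session are not overwritten. The file used here is a temporary stand-in created on the spot, and `ds_demo` is a hypothetical object; the actual file name on OSF is not assumed.

```r
# Stand-in for a downloaded workspace: a temporary .RData file (hypothetical content)
demo_path <- tempfile(fileext = ".RData")
ds_demo <- data.frame(email = c("a@example.com", "b@example.com"),
                      pain_t1 = c(7, 5))
save(ds_demo, file = demo_path)
rm(ds_demo)

# Load into a separate environment so nothing in the global workspace is replaced
loaded_env <- new.env()
loaded_names <- load(demo_path, envir = loaded_env)

loaded_names              # names of the objects stored in the file
str(loaded_env$ds_demo)   # inspect before copying anything into the session
```

`load()` returns the names of the restored objects, which makes it easy to confirm what a shared file contains before using it.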

1 Data processing

T1 (SPSS dataset). Manual entries

ds_t1 <- haven::read_sav("C:/Users/luisf/Dropbox/Puc-Rio/Projeto - Memoria da dor 3/Maratonistas 2018 com emails LUIS.sav")

Google dataset (online entries)

ds_t2 <- readxl::read_excel("C:/Users/luisf/Dropbox/Puc-Rio/Projeto - Memoria da dor 3/Avaliação de atletas que participaram da Maratona do Rio de Janeiro (Junho de 2018) (respostas).xlsx")

Backups

backup_ds_t1 <- ds_t1
backup_ds_t2 <- ds_t2

Dataset cleaning

ds_t1 <- clean_names(ds_t1)
ds_t2 <- clean_names(ds_t2)

# Names (English) T1
ds_t1 <- ds_t1 %>% 
  rename(sex = sexo,
         age = idade,
         run_alone = companhia,
         athlete = atleta,
         music = musica,
         first_marathon = primeiramaratona,
         pain_t1 = dor_t1,
         distressed = perturbado,
         productive = produtivo,
         strong = forte,
         enthusiastic = entusiasmado,
         inspired = inspirado,
         afraid = medroso,
         vigorous = vigoroso,
         stimulated = estimulado,
         humiliated = humilhado,
         determined = determinado,
         irritable = irritado,
         powerfull = poderoso,
         dynamic = dinamico,
         scared = assustado,
         nervous = nervoso,
         interested = interessado)
         
# Names (English) T2
ds_t2 <- ds_t2 %>% 
  rename(pain_t2 = lembrando_da_sua_corrida_quanto_foi_a_dor_que_voce_sentiu_lembre_se_que_nao_e_para_lembrar_o_que_voce_marcou_na_epoca_mas_o_quanto_voce_acha_que_foi_sua_dor_depois_da_corrida) %>% 
  rename(word_memory = no_dia_da_corrida_falamos_para_voce_algumas_palavras_e_voce_as_repetiu_depois_marque_aquelas_que_voce_lembra_se_nao_lembrar_nao_marque_nada) %>% 
  rename(age_cat = sua_idade_e) %>% 
  rename(sex = seu_sexo_e) %>% 
  rename(email = seu_e_mail_e) %>% 
  rename(suggestions = se_quiser_dar_dicas_ou_sugestoes_escreva_abaixo)

Sex and factors

#sex t1
ds_t1 <- ds_t1 %>% 
  mutate(sex = factor(
    case_when(sex == "1" ~ "female",
              sex == "2" ~ "male"),levels = c("female", "male"))) 

#sex t2
ds_t2 <- ds_t2 %>% 
  mutate(sex = factor(
    case_when(sex == "Feminino" ~ "female",
              sex == "Masculino" ~ "male"),levels = c("female", "male")))

Coding errors are frequent in datasets. The code below checks for inconsistencies between the e-mails in the two datasets by listing every e-mail present at the second time point (t2, online collection) but missing from the merged dataset (t1 + t2). If a participant provided an e-mail at t2 but it does not appear in the merged dataset, an error was made during the manual data-entry phase of this research.

Eight participants were incorrectly coded, but only seven could be fixed manually. The e-mail mro.scm@gmail.com was present at t2, but we could not find a matching entry in the raw data. The list below reports the changes.

1 - milenamusta@yahoo.com.br TO milenamurta@yahoo.com.br
2 - silvana1souza@gmail.com TO silvana1sousa@gmail.com
3 - emiliofdo@uol.com.br TO emilio_figueiredo@hotmail.com
4 - kaylariany@hotmail.com TO keylariany@hotmail.com
5 - visiomariobr@yahoo.com.br TO visionariobr@yahoo.com
6 - claudiodominguescoelho@id.uff. TO claudiodominguescoelho@id.uff.br
7 - personalrafasimplicio@gmail.co TO personalrafasimplicio@gmail.com

After these changes, the dataset was loaded again and all steps were carried out.
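
The check described above can be sketched in base R as follows. The data frames and addresses are toy stand-ins, and the lookup vector `corrections` is a hypothetical way of recording the fixes; the report does not show how the original corrections were applied.

```r
# Toy stand-ins for the two collections (hypothetical addresses)
t1 <- data.frame(email = c("ana@example.com", "bruno@exmaple.com", "carla@example.com"),
                 pain_t1 = c(7, 4, 9), stringsAsFactors = FALSE)
t2 <- data.frame(email = c("ana@example.com", "bruno@example.com"),
                 pain_t2 = c(6, 5), stringsAsFactors = FALSE)

# E-mails answered online (t2) with no match among the manual entries (t1)
unmatched <- setdiff(tolower(t2$email), tolower(t1$email))
unmatched

# Record each manual fix as an explicit old -> new pair, apply it, then re-check
corrections <- c("bruno@exmaple.com" = "bruno@example.com")
hit <- t1$email %in% names(corrections)
t1$email[hit] <- unname(corrections[t1$email[hit]])

unmatched_after <- setdiff(tolower(t2$email), tolower(t1$email))
unmatched_after   # empty once every fix has been applied
```

Keeping the fixes in a named vector leaves an auditable record of every change, in the same spirit as the list above.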

2 Merging datasets

#lower case
ds_t1 <- ds_t1 %>% 
  mutate_all(., tolower)

#lower-case all t2 values and drop duplicated e-mails
ds_t2 <- ds_t2 %>% 
  mutate_all(., tolower) %>% #lower case
  distinct(email, .keep_all= TRUE) #remove duplicates from t2

#merge datasets
ds <- left_join(ds_t1, ds_t2, by = "email") #should have 108
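
The "should have 108" expectation can be turned into an explicit check. The sketch below uses toy data and base R (`duplicated()` and `merge(all.x = TRUE)`) in place of the `distinct()` and `left_join()` calls used in this report, so it illustrates the idea rather than reproducing the actual code.

```r
# Toy stand-ins: three manual entries, three online answers with one repeat
t1 <- data.frame(email = c("a@example.com", "b@example.com", "c@example.com"),
                 pain_t1 = c(7, 4, 9))
t2 <- data.frame(email = c("a@example.com", "a@example.com", "b@example.com"),
                 pain_t2 = c(6, 6, 5))

# Drop repeated submissions, keeping the first one per e-mail
t2_unique <- t2[!duplicated(t2$email), ]

# Base-R left join: keep every t1 row, add t2 columns where the e-mail matches
merged <- merge(t1, t2_unique, by = "email", all.x = TRUE)

# A left join on a unique key must not change the number of t1 rows
stopifnot(nrow(merged) == nrow(t1))

# Participants with a follow-up answer
n_followup <- sum(!is.na(merged$pain_t2))
n_followup
```

If the duplicates were not removed first, the join would repeat the matching t1 rows and the row-count check would fail.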

Remove all SPSS attributes

ds[] <- lapply(ds, function(x) { attributes(x) <- NULL; x })
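
A small illustration of what this attribute stripping does, using a toy data frame (the column names and the label text are hypothetical). Note that it removes every attribute, so factor levels are lost along with SPSS labels.

```r
x <- data.frame(sexo = c(1, 2, 1))
attr(x$sexo, "label") <- "Sexo do participante"   # mimics an SPSS variable label
x$grupo <- factor(c("a", "b", "a"))

# Same pattern as above: drop all attributes, column by column
x[] <- lapply(x, function(col) { attributes(col) <- NULL; col })

attributes(x$sexo)   # NULL: the label is gone
x$grupo              # integer codes: the factor levels are gone as well
names(x)             # column names survive because `x[] <-` keeps the data frame
```

This is why the factors are rebuilt explicitly in the next step.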

Adjusting all levels and codings

ds %>% count(sex.x)

## # A tibble: 2 x 2
##   sex.x      n
## * <chr>  <int>
## 1 female    45
## 2 male      63

ds %>% count(athlete)

## # A tibble: 3 x 2
##   athlete     n
## *   <int> <int>
## 1       1    19
## 2       2    76
## 3      NA    13

ds <- ds %>% 
  mutate(sex = factor(sex.x, levels = c("female", "male")),
         age = as.numeric(age),
         pain_t1 = as.numeric(pain_t1),
         pain_t2 = as.numeric(pain_t2)) %>% 
  mutate_at(vars(run_alone,
                 athlete,
                 music,
                 first_marathon), ~factor(case_when(
                   . == "1" ~ "yes",
                   . == "2" ~ "no",
                  TRUE ~ NA_character_),
                   levels = c("no", "yes")))

ds %>% count(sex)
ds %>% count(athlete)
ds %>% count(music)
ds %>% count(first_marathon)
ds %>% count(run_alone)
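
The yes/no recoding above can also be written with `factor()` alone. The sketch below is a base-R equivalent on a toy vector (the value "9" is a hypothetical stray code), not the code used in this report; any value that is neither "1" nor "2" becomes missing, as with the `TRUE ~ NA_character_` branch.

```r
raw <- c("1", "2", "2", NA, "9")

# levels = raw codes in the desired order, labels = their meaning
recoded <- factor(raw, levels = c("2", "1"), labels = c("no", "yes"))

table(recoded, useNA = "ifany")
```

Putting "no" first makes it the reference level in any later model.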

!ok

Create a summative score for the PANAS

First, convert all items to numeric

ds <- ds %>% 
  mutate_at(vars(distressed:interested), ~as.numeric(.))

Checking all codes

ds %>% select(distressed:interested) %>% summarytools::descr()

Positive affect

ds <- ds %>% 
  mutate(sum_positive = rowSums(select(., productive, strong, enthusiastic, inspired, vigorous, stimulated, determined, powerfull, dynamic, interested), na.rm=T)) #should have 10 variables

Negative affect

ds <- ds %>% 
  mutate(sum_negative = rowSums(select(., distressed, apreensivo, amedrontado, incomodado, angustiado, afraid, humiliated, irritable, scared, nervous), na.rm=T)) #should have 10 variables

Compute an Overall score (positive - negative)

ds <- ds %>% 
  mutate(panas_overall = (sum_positive-sum_negative))
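
One consequence of `rowSums(..., na.rm = TRUE)` is worth seeing on toy data: a row with no answered items scores 0 rather than missing, which is one possible reason for the minimum of 0 in the score ranges reported later. The guarded version at the end is an assumption for illustration, not the rule used in this report.

```r
items <- data.frame(strong   = c(5, 4, NA),
                    inspired = c(4, NA, NA),
                    dynamic  = c(3, 5, NA))

sum_keep_na <- rowSums(items)                 # NA whenever any item is missing
sum_drop_na <- rowSums(items, na.rm = TRUE)   # missing items contribute 0

# Possible alternative: only score rows with at least one answered item
n_answered  <- rowSums(!is.na(items))
sum_guarded <- ifelse(n_answered == 0, NA, sum_drop_na)

data.frame(sum_keep_na, sum_drop_na, sum_guarded)
```

The third participant gets 0 under `na.rm = TRUE` and `NA` under the guarded version.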

Create a grouping for the PANAS

ds <- ds %>% 
  mutate(panas_group = case_when(
                              panas_overall <= 0 ~ "negative or neutral",
                              panas_overall > 0 ~ "positive"))
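
A quick check of the cut-off on toy values, written with base `ifelse()` instead of `case_when()`: a score of exactly 0 falls in the "negative or neutral" group, and a missing score stays missing.

```r
panas_overall <- c(-4, 0, 12, NA)

panas_group <- ifelse(panas_overall <= 0, "negative or neutral", "positive")
panas_group
```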

If you are here, loaded the original datasets (one from SPSS and one from CSV), and ran all the previous code, your dataset (ds) should be identical to mine. However, you can also simply load the R file and everything will work as well. The following chunks report the published results and the rationale behind them.

Double checked on March 7, 2021

! done!

3 Results

3.1 Participants

summarytools::view(summarytools::dfSummary(ds))
#DataExplorer::create_report(ds)
#library(ExPanDaR)
#ExPanD(ds)

ds %>% filter(!is.na(pain_t2)) %>% nrow()

## [1] 56

#sex 1 female 2 male
#run alone 1 yes 2 no
ds%>% 
  select(sex, age,run_alone, athlete, music, first_marathon, sum_positive, sum_negative, pain_t1, pain_t2) %>%
  arsenal::tableby(sex ~ .,., control = arsenal::tableby.control(cat.stats=c("count"))) %>% 
  summary()

## 
## 
## |                            |    1 (N=45)     |    2 (N=63)     |  Total (N=108)  | p value|
## |:---------------------------|:---------------:|:---------------:|:---------------:|-------:|
## |**age**                     |                 |                 |                 |   0.333|
## |&nbsp;&nbsp;&nbsp;N-Miss    |       16        |       16        |       32        |        |
## |&nbsp;&nbsp;&nbsp;Mean (SD) | 39.172 (6.934)  | 41.213 (9.851)  | 40.434 (8.858)  |        |
## |&nbsp;&nbsp;&nbsp;Range     | 27.000 - 53.000 | 24.000 - 69.000 | 24.000 - 69.000 |        |
## |**run_alone**               |                 |                 |                 |   0.848|
## |&nbsp;&nbsp;&nbsp;N-Miss    |        6        |       12        |       18        |        |
## |&nbsp;&nbsp;&nbsp;Mean (SD) |  1.667 (0.478)  |  1.647 (0.483)  |  1.656 (0.478)  |        |
## |&nbsp;&nbsp;&nbsp;Range     |  1.000 - 2.000  |  1.000 - 2.000  |  1.000 - 2.000  |        |
## |**athlete**                 |                 |                 |                 |   0.683|
## |&nbsp;&nbsp;&nbsp;N-Miss    |        4        |        9        |       13        |        |
## |&nbsp;&nbsp;&nbsp;Mean (SD) |  1.780 (0.419)  |  1.815 (0.392)  |  1.800 (0.402)  |        |
## |&nbsp;&nbsp;&nbsp;Range     |  1.000 - 2.000  |  1.000 - 2.000  |  1.000 - 2.000  |        |
## |**music**                   |                 |                 |                 |   0.434|
## |&nbsp;&nbsp;&nbsp;N-Miss    |       12        |       24        |       36        |        |
## |&nbsp;&nbsp;&nbsp;Mean (SD) |  1.424 (0.502)  |  1.333 (0.478)  |  1.375 (0.488)  |        |
## |&nbsp;&nbsp;&nbsp;Range     |  1.000 - 2.000  |  1.000 - 2.000  |  1.000 - 2.000  |        |
## |**first_marathon**          |                 |                 |                 |   0.029|
## |&nbsp;&nbsp;&nbsp;N-Miss    |        1        |        3        |        4        |        |
## |&nbsp;&nbsp;&nbsp;Mean (SD) |  1.455 (0.504)  |  1.250 (0.437)  |  1.337 (0.475)  |        |
## |&nbsp;&nbsp;&nbsp;Range     |  1.000 - 2.000  |  1.000 - 2.000  |  1.000 - 2.000  |        |
## |**sum_positive**            |                 |                 |                 |   0.222|
## |&nbsp;&nbsp;&nbsp;Mean (SD) | 39.511 (10.001) | 37.079 (10.258) | 38.093 (10.177) |        |
## |&nbsp;&nbsp;&nbsp;Range     | 0.000 - 50.000  | 0.000 - 50.000  | 0.000 - 50.000  |        |
## |**sum_negative**            |                 |                 |                 |   0.463|
## |&nbsp;&nbsp;&nbsp;Mean (SD) | 11.400 (4.064)  | 11.968 (3.873)  | 11.731 (3.945)  |        |
## |&nbsp;&nbsp;&nbsp;Range     | 0.000 - 31.000  | 0.000 - 28.000  | 0.000 - 31.000  |        |
## |**pain_t1**                 |                 |                 |                 |   0.894|
## |&nbsp;&nbsp;&nbsp;Mean (SD) |  6.600 (2.973)  |  6.675 (2.773)  |  6.644 (2.844)  |        |
## |&nbsp;&nbsp;&nbsp;Range     | 0.000 - 10.000  | 0.000 - 10.000  | 0.000 - 10.000  |        |
## |**pain_t2**                 |                 |                 |                 |   0.205|
## |&nbsp;&nbsp;&nbsp;N-Miss    |       21        |       31        |       52        |        |
## |&nbsp;&nbsp;&nbsp;Mean (SD) |  6.292 (2.629)  |  5.406 (2.500)  |  5.786 (2.571)  |        |
## |&nbsp;&nbsp;&nbsp;Range     | 0.000 - 10.000  | 0.000 - 10.000  | 0.000 - 10.000  |        |

Check difference in proportions

gmodels::CrossTable(ds$sex)

## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  108 
## 
##  
##           |         1 |         2 | 
##           |-----------|-----------|
##           |        45 |        63 | 
##           |     0.417 |     0.583 | 
##           |-----------|-----------|
## 
## 
## 
##

chisq.test(table(ds$sex))

## 
##  Chi-squared test for given probabilities
## 
## data:  table(ds$sex)
## X-squared = 3, df = 1, p-value = 0.08326
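
The statistic above can be reproduced by hand from the two counts. With equal expected proportions, each group is expected to have 54 of the 108 participants:

```r
observed <- c(female = 45, male = 63)
expected <- sum(observed) * c(0.5, 0.5)        # 54 and 54

# Sum of (O - E)^2 / E over the two cells
x2 <- sum((observed - expected)^2 / expected)
p  <- pchisq(x2, df = length(observed) - 1, lower.tail = FALSE)
c(X2 = x2, p = p)

# Same test through the built-in function
res <- chisq.test(observed)
```

Each cell contributes 81 / 54 = 1.5, which gives the reported X-squared of 3.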

Check difference in age

t.test(age ~ sex, var.equal = T, data = ds)

## 
##  Two Sample t-test
## 
## data:  age by sex
## t = -0.97516, df = 74, p-value = 0.3327
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -6.209423  2.128719
## sample estimates:
## mean in group 1 mean in group 2 
##        39.17241        41.21277
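
As a consistency check, the pooled-variance t statistic can be recomputed from the descriptive table alone. The means and standard deviations are the rounded values printed there, and the group sizes are the group totals minus the missing ages, so small rounding differences from the output above are expected.

```r
# Rounded summary statistics for age, taken from the descriptive table
m1 <- 39.172; s1 <- 6.934; n1 <- 45 - 16   # sex group 1
m2 <- 41.213; s2 <- 9.851; n2 <- 63 - 16   # sex group 2

df <- n1 + n2 - 2

# Pooled standard deviation (equal-variance assumption, as in var.equal = TRUE)
sp <- sqrt(((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / df)

t_stat <- (m1 - m2) / (sp * sqrt(1 / n1 + 1 / n2))
p_val  <- 2 * pt(-abs(t_stat), df)

c(t = t_stat, df = df, p = p_val)
```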

Check differences in run alone

gmodels::CrossTable(ds$sex, ds$run_alone, chisq = T)

## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## | Chi-square contribution |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  90 
## 
##  
##              | ds$run_alone 
##       ds$sex |         1 |         2 | Row Total | 
## -------------|-----------|-----------|-----------|
##            1 |        13 |        26 |        39 | 
##              |     0.014 |     0.007 |           | 
##              |     0.333 |     0.667 |     0.433 | 
##              |     0.419 |     0.441 |           | 
##              |     0.144 |     0.289 |           | 
## -------------|-----------|-----------|-----------|
##            2 |        18 |        33 |        51 | 
##              |     0.011 |     0.006 |           | 
##              |     0.353 |     0.647 |     0.567 | 
##              |     0.581 |     0.559 |           | 
##              |     0.200 |     0.367 |           | 
## -------------|-----------|-----------|-----------|
## Column Total |        31 |        59 |        90 | 
##              |     0.344 |     0.656 |           | 
## -------------|-----------|-----------|-----------|
## 
##  
## Statistics for All Table Factors
## 
## 
## Pearson's Chi-squared test 
## ------------------------------------------------------------
## Chi^2 =  0.03762905     d.f. =  1     p =  0.8461899 
## 
## Pearson's Chi-squared test with Yates' continuity correction 
## ------------------------------------------------------------
## Chi^2 =  2.151774e-30     d.f. =  1     p =  1 
## 
##
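
The corrected statistic of about 2e-30 looks odd but follows from how the continuity correction is applied. To my understanding, `stats::chisq.test()` subtracts the smaller of 0.5 and the smallest |observed - expected| from each deviation; here every deviation is about 0.43, so the corrected deviations are all zero up to floating-point error. A sketch that rebuilds both statistics from the counts in the table:

```r
tab <- matrix(c(13, 18, 26, 33), nrow = 2,
              dimnames = list(sex = c("1", "2"), run_alone = c("1", "2")))

# Expected counts under independence
expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
deviation <- abs(tab - expected)
max(deviation)                      # about 0.43, below the usual 0.5 correction

# Correction capped at the smallest deviation
yates <- min(0.5, deviation)

x2_plain <- sum(deviation^2 / expected)
x2_yates <- sum((deviation - yates)^2 / expected)
c(plain = x2_plain, yates = x2_yates)

# Built-in result for comparison
x2_builtin <- unname(chisq.test(tab, correct = TRUE)$statistic)
```

In practice the corrected p value of 1 simply says that the table is as close to independence as the correction can distinguish.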

Check differences in athlete status

gmodels::CrossTable(ds$sex, ds$athlete, chisq = T)

## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## | Chi-square contribution |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  95 
## 
##  
##              | ds$athlete 
##       ds$sex |         1 |         2 | Row Total | 
## -------------|-----------|-----------|-----------|
##            1 |         9 |        32 |        41 | 
##              |     0.078 |     0.020 |           | 
##              |     0.220 |     0.780 |     0.432 | 
##              |     0.474 |     0.421 |           | 
##              |     0.095 |     0.337 |           | 
## -------------|-----------|-----------|-----------|
##            2 |        10 |        44 |        54 | 
##              |     0.059 |     0.015 |           | 
##              |     0.185 |     0.815 |     0.568 | 
##              |     0.526 |     0.579 |           | 
##              |     0.105 |     0.463 |           | 
## -------------|-----------|-----------|-----------|
## Column Total |        19 |        76 |        95 | 
##              |     0.200 |     0.800 |           | 
## -------------|-----------|-----------|-----------|
## 
##  
## Statistics for All Table Factors
## 
## 
## Pearson's Chi-squared test 
## ------------------------------------------------------------
## Chi^2 =  0.171635     d.f. =  1     p =  0.6786628 
## 
## Pearson's Chi-squared test with Yates' continuity correction 
## ------------------------------------------------------------
## Chi^2 =  0.02413618     d.f. =  1     p =  0.8765389 
## 
##

Check differences in first marathon

gmodels::CrossTable(ds$sex, ds$first_marathon, chisq = T)

## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## | Chi-square contribution |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  104 
## 
##  
##              | ds$first_marathon 
##       ds$sex |         1 |         2 | Row Total | 
## -------------|-----------|-----------|-----------|
##            1 |        24 |        20 |        44 | 
##              |     0.924 |     1.821 |           | 
##              |     0.545 |     0.455 |     0.423 | 
##              |     0.348 |     0.571 |           | 
##              |     0.231 |     0.192 |           | 
## -------------|-----------|-----------|-----------|
##            2 |        45 |        15 |        60 | 
##              |     0.677 |     1.335 |           | 
##              |     0.750 |     0.250 |     0.577 | 
##              |     0.652 |     0.429 |           | 
##              |     0.433 |     0.144 |           | 
## -------------|-----------|-----------|-----------|
## Column Total |        69 |        35 |       104 | 
##              |     0.663 |     0.337 |           | 
## -------------|-----------|-----------|-----------|
## 
##  
## Statistics for All Table Factors
## 
## 
## Pearson's Chi-squared test 
## ------------------------------------------------------------
## Chi^2 =  4.756635     d.f. =  1     p =  0.02918556 
## 
## Pearson's Chi-squared test with Yates' continuity correction 
## ------------------------------------------------------------
## Chi^2 =  3.88465     d.f. =  1     p =  0.04872941 
## 
##
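
This is the only comparison near the 0.05 threshold, so it is worth seeing how much the continuity correction moves the result. A sketch that rebuilds the table from the printed counts and runs the test both ways:

```r
tab <- matrix(c(24, 45, 20, 15), nrow = 2,
              dimnames = list(sex = c("1", "2"), first_marathon = c("1", "2")))

plain <- chisq.test(tab, correct = FALSE)   # Pearson chi-squared
yates <- chisq.test(tab, correct = TRUE)    # with Yates' continuity correction

rbind(plain = c(X2 = unname(plain$statistic), p = plain$p.value),
      yates = c(X2 = unname(yates$statistic), p = yates$p.value))
```

Both versions fall below 0.05, but the corrected p value is close to the threshold, so the difference should be read with some caution.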

Check differences in listening to music

gmodels::CrossTable(ds$sex, ds$music, chisq = T)

## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## | Chi-square contribution |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  72 
## 
##  
##              | ds$music 
##       ds$sex |         1 |         2 | Row Total | 
## -------------|-----------|-----------|-----------|
##            1 |        19 |        14 |        33 | 
##              |     0.128 |     0.213 |           | 
##              |     0.576 |     0.424 |     0.458 | 
##              |     0.422 |     0.519 |           | 
##              |     0.264 |     0.194 |           | 
## -------------|-----------|-----------|-----------|
##            2 |        26 |        13 |        39 | 
##              |     0.108 |     0.181 |           | 
##              |     0.667 |     0.333 |     0.542 | 
##              |     0.578 |     0.481 |           | 
##              |     0.361 |     0.181 |           | 
## -------------|-----------|-----------|-----------|
## Column Total |        45 |        27 |        72 | 
##              |     0.625 |     0.375 |           | 
## -------------|-----------|-----------|-----------|
## 
##  
## Statistics for All Table Factors
## 
## 
## Pearson's Chi-squared test 
## ------------------------------------------------------------
## Chi^2 =  0.630303     d.f. =  1     p =  0.4272442 
## 
## Pearson's Chi-squared test with Yates' continuity correction 
## ------------------------------------------------------------
## Chi^2 =  0.3020979     d.f. =  1     p =  0.5825702 
## 
##

3.2 Participants at t2

#ds_t2 %>% count(sex)

ds %>% 
  filter(!is.na(pain_t2)) %>%  
  select(sex, age,run_alone, athlete, music, first_marathon, sum_positive, sum_negative, pain_t1, pain_t2) %>%
  arsenal::tableby(sex ~ .,., control = arsenal::tableby.control(cat.stats=c("count"))) %>% 
  summary()

## 
## 
## |                            |    1 (N=24)     |    2 (N=32)     |  Total (N=56)   | p value|
## |:---------------------------|:---------------:|:---------------:|:---------------:|-------:|
## |**age**                     |                 |                 |                 |   0.927|
## |&nbsp;&nbsp;&nbsp;N-Miss    |        7        |        7        |       14        |        |
## |&nbsp;&nbsp;&nbsp;Mean (SD) | 40.647 (8.108)  | 40.920 (10.140) | 40.810 (9.266)  |        |
## |&nbsp;&nbsp;&nbsp;Range     | 27.000 - 53.000 | 24.000 - 63.000 | 24.000 - 63.000 |        |
## |**run_alone**               |                 |                 |                 |   0.739|
## |&nbsp;&nbsp;&nbsp;N-Miss    |        1        |        7        |        8        |        |
## |&nbsp;&nbsp;&nbsp;Mean (SD) |  1.609 (0.499)  |  1.560 (0.507)  |  1.583 (0.498)  |        |
## |&nbsp;&nbsp;&nbsp;Range     |  1.000 - 2.000  |  1.000 - 2.000  |  1.000 - 2.000  |        |
## |**athlete**                 |                 |                 |                 |   0.583|
## |&nbsp;&nbsp;&nbsp;N-Miss    |        0        |        5        |        5        |        |
## |&nbsp;&nbsp;&nbsp;Mean (SD) |  1.750 (0.442)  |  1.815 (0.396)  |  1.784 (0.415)  |        |
## |&nbsp;&nbsp;&nbsp;Range     |  1.000 - 2.000  |  1.000 - 2.000  |  1.000 - 2.000  |        |
## |**music**                   |                 |                 |                 |   1.000|
## |&nbsp;&nbsp;&nbsp;N-Miss    |        4        |       12        |       16        |        |
## |&nbsp;&nbsp;&nbsp;Mean (SD) |  1.350 (0.489)  |  1.350 (0.489)  |  1.350 (0.483)  |        |
## |&nbsp;&nbsp;&nbsp;Range     |  1.000 - 2.000  |  1.000 - 2.000  |  1.000 - 2.000  |        |
## |**first_marathon**          |                 |                 |                 |   0.166|
## |&nbsp;&nbsp;&nbsp;N-Miss    |        0        |        3        |        3        |        |
## |&nbsp;&nbsp;&nbsp;Mean (SD) |  1.500 (0.511)  |  1.310 (0.471)  |  1.396 (0.494)  |        |
## |&nbsp;&nbsp;&nbsp;Range     |  1.000 - 2.000  |  1.000 - 2.000  |  1.000 - 2.000  |        |
## |**sum_positive**            |                 |                 |                 |   0.174|
## |&nbsp;&nbsp;&nbsp;Mean (SD) | 41.417 (7.885)  | 37.719 (11.229) | 39.304 (10.023) |        |
## |&nbsp;&nbsp;&nbsp;Range     | 22.000 - 50.000 | 0.000 - 50.000  | 0.000 - 50.000  |        |
## |**sum_negative**            |                 |                 |                 |   0.352|
## |&nbsp;&nbsp;&nbsp;Mean (SD) | 11.875 (2.692)  | 11.031 (3.729)  | 11.393 (3.323)  |        |
## |&nbsp;&nbsp;&nbsp;Range     | 10.000 - 21.000 | 0.000 - 17.000  | 0.000 - 21.000  |        |
## |**pain_t1**                 |                 |                 |                 |   0.482|
## |&nbsp;&nbsp;&nbsp;Mean (SD) |  7.167 (2.496)  |  6.703 (2.365)  |  6.902 (2.411)  |        |
## |&nbsp;&nbsp;&nbsp;Range     | 0.000 - 10.000  | 1.000 - 10.000  | 0.000 - 10.000  |        |
## |**pain_t2**                 |                 |                 |                 |   0.205|
## |&nbsp;&nbsp;&nbsp;Mean (SD) |  6.292 (2.629)  |  5.406 (2.500)  |  5.786 (2.571)  |        |
## |&nbsp;&nbsp;&nbsp;Range     | 0.000 - 10.000  | 0.000 - 10.000  | 0.000 - 10.000  |        |

Check difference in proportions

ds %>% 
  filter(!is.na(pain_t2)) %>% 
  {gmodels::CrossTable(.$sex.x)}

## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  56 
## 
##  
##           |    female |      male | 
##           |-----------|-----------|
##           |        24 |        32 | 
##           |     0.429 |     0.571 | 
##           |-----------|-----------|
## 
## 
## 
##

ds %>% 
  filter(!is.na(pain_t2)) %>% 
  {chisq.test(table(.$sex))}

## 
##  Chi-squared test for given probabilities
## 
## data:  table(.$sex)
## X-squared = 1.1429, df = 1, p-value = 0.285

Check difference in age

ds %>% 
  filter(!is.na(pain_t2)) %>% 
  {t.test(age ~ sex, var.equal = T, data = .)}

## 
##  Two Sample t-test
## 
## data:  age by sex
## t = -0.092558, df = 40, p-value = 0.9267
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -6.232813  5.686930
## sample estimates:
## mean in group 1 mean in group 2 
##        40.64706        40.92000

Check differences in run alone

ds %>% 
  filter(!is.na(pain_t2)) %>% 
  {gmodels::CrossTable(.$sex, .$run_alone, chisq = T)}

## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## | Chi-square contribution |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  48 
## 
##  
##              | .$run_alone 
##        .$sex |         1 |         2 | Row Total | 
## -------------|-----------|-----------|-----------|
##            1 |         9 |        14 |        23 | 
##              |     0.036 |     0.025 |           | 
##              |     0.391 |     0.609 |     0.479 | 
##              |     0.450 |     0.500 |           | 
##              |     0.188 |     0.292 |           | 
## -------------|-----------|-----------|-----------|
##            2 |        11 |        14 |        25 | 
##              |     0.033 |     0.023 |           | 
##              |     0.440 |     0.560 |     0.521 | 
##              |     0.550 |     0.500 |           | 
##              |     0.229 |     0.292 |           | 
## -------------|-----------|-----------|-----------|
## Column Total |        20 |        28 |        48 | 
##              |     0.417 |     0.583 |           | 
## -------------|-----------|-----------|-----------|
## 
##  
## Statistics for All Table Factors
## 
## 
## Pearson's Chi-squared test 
## ------------------------------------------------------------
## Chi^2 =  0.1168696     d.f. =  1     p =  0.7324548 
## 
## Pearson's Chi-squared test with Yates' continuity correction 
## ------------------------------------------------------------
## Chi^2 =  0.002385093     d.f. =  1     p =  0.9610489 
## 
##

Check differences in athlete status

ds %>% 
  filter(!is.na(pain_t2)) %>% 
  {gmodels::CrossTable(.$sex, .$athlete, chisq = T)}

## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## | Chi-square contribution |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  51 
## 
##  
##              | .$athlete 
##        .$sex |         1 |         2 | Row Total | 
## -------------|-----------|-----------|-----------|
##            1 |         6 |        18 |        24 | 
##              |     0.131 |     0.036 |           | 
##              |     0.250 |     0.750 |     0.471 | 
##              |     0.545 |     0.450 |           | 
##              |     0.118 |     0.353 |           | 
## -------------|-----------|-----------|-----------|
##            2 |         5 |        22 |        27 | 
##              |     0.116 |     0.032 |           | 
##              |     0.185 |     0.815 |     0.529 | 
##              |     0.455 |     0.550 |           | 
##              |     0.098 |     0.431 |           | 
## -------------|-----------|-----------|-----------|
## Column Total |        11 |        40 |        51 | 
##              |     0.216 |     0.784 |           | 
## -------------|-----------|-----------|-----------|
## 
##  
## Statistics for All Table Factors
## 
## 
## Pearson's Chi-squared test 
## ------------------------------------------------------------
## Chi^2 =  0.3155303     d.f. =  1     p =  0.5743062 
## 
## Pearson's Chi-squared test with Yates' continuity correction 
## ------------------------------------------------------------
## Chi^2 =  0.04869792     d.f. =  1     p =  0.8253447 
## 
##

Check differences in listening to music

ds %>% 
  filter(!is.na(pain_t2)) %>% 
  {gmodels::CrossTable(.$sex, .$music, chisq = T)}

## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## | Chi-square contribution |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  40 
## 
##  
##              | .$music 
##        .$sex |         1 |         2 | Row Total | 
## -------------|-----------|-----------|-----------|
##            1 |        13 |         7 |        20 | 
##              |     0.000 |     0.000 |           | 
##              |     0.650 |     0.350 |     0.500 | 
##              |     0.500 |     0.500 |           | 
##              |     0.325 |     0.175 |           | 
## -------------|-----------|-----------|-----------|
##            2 |        13 |         7 |        20 | 
##              |     0.000 |     0.000 |           | 
##              |     0.650 |     0.350 |     0.500 | 
##              |     0.500 |     0.500 |           | 
##              |     0.325 |     0.175 |           | 
## -------------|-----------|-----------|-----------|
## Column Total |        26 |        14 |        40 | 
##              |     0.650 |     0.350 |           | 
## -------------|-----------|-----------|-----------|
## 
##  
## Statistics for All Table Factors
## 
## 
## Pearson's Chi-squared test 
## ------------------------------------------------------------
## Chi^2 =  0     d.f. =  1     p =  1 
## 
## Pearson's Chi-squared test 
## ------------------------------------------------------------
## Chi^2 =  0     d.f. =  1     p =  1 
## 
##

Check differences in first marathon

ds %>% 
  filter(!is.na(pain_t2)) %>% 
  {gmodels::CrossTable(.$sex, .$first_marathon, chisq = T)}

## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## | Chi-square contribution |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  53 
## 
##  
##              | .$first_marathon 
##        .$sex |         1 |         2 | Row Total | 
## -------------|-----------|-----------|-----------|
##            1 |        12 |        12 |        24 | 
##              |     0.428 |     0.652 |           | 
##              |     0.500 |     0.500 |     0.453 | 
##              |     0.375 |     0.571 |           | 
##              |     0.226 |     0.226 |           | 
## -------------|-----------|-----------|-----------|
##            2 |        20 |         9 |        29 | 
##              |     0.354 |     0.540 |           | 
##              |     0.690 |     0.310 |     0.547 | 
##              |     0.625 |     0.429 |           | 
##              |     0.377 |     0.170 |           | 
## -------------|-----------|-----------|-----------|
## Column Total |        32 |        21 |        53 | 
##              |     0.604 |     0.396 |           | 
## -------------|-----------|-----------|-----------|
## 
##  
## Statistics for All Table Factors
## 
## 
## Pearson's Chi-squared test 
## ------------------------------------------------------------
## Chi^2 =  1.974446     d.f. =  1     p =  0.1599768 
## 
## Pearson's Chi-squared test with Yates' continuity correction 
## ------------------------------------------------------------
## Chi^2 =  1.261253     d.f. =  1     p =  0.261414 
## 
##

3.3 Reliability of the PANAS

#positive
ds %>% 
  select(productive, strong, enthusiastic, inspired, vigorous, stimulated, determined, powerfull, dynamic, interested) %>% 
  psych::alpha(.)

## 
## Reliability analysis   
## Call: psych::alpha(x = .)
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean   sd median_r
##       0.88      0.88     0.9      0.43 7.6 0.017    4 0.77     0.41
## 
##  lower alpha upper     95% confidence boundaries
## 0.84 0.88 0.91 
## 
##  Reliability if an item is dropped:
##              raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## productive        0.87      0.88    0.89      0.44 7.2    0.019 0.013  0.42
## strong            0.86      0.87    0.88      0.42 6.5    0.020 0.011  0.40
## enthusiastic      0.86      0.87    0.87      0.42 6.6    0.020 0.008  0.40
## inspired          0.86      0.87    0.88      0.42 6.4    0.020 0.010  0.40
## vigorous          0.87      0.87    0.89      0.43 6.9    0.019 0.014  0.40
## stimulated        0.86      0.87    0.88      0.42 6.5    0.020 0.012  0.40
## determined        0.87      0.88    0.89      0.44 7.1    0.019 0.013  0.42
## powerfull         0.87      0.88    0.89      0.44 7.1    0.019 0.013  0.43
## dynamic           0.87      0.88    0.89      0.45 7.3    0.018 0.012  0.44
## interested        0.87      0.88    0.89      0.45 7.3    0.019 0.013  0.44
## 
##  Item statistics 
##                n raw.r std.r r.cor r.drop mean   sd
## productive   105  0.65  0.65  0.59   0.55  3.5 1.18
## strong       104  0.77  0.76  0.75   0.68  3.9 1.24
## enthusiastic 102  0.74  0.76  0.76   0.68  4.2 0.97
## inspired     101  0.78  0.79  0.78   0.72  4.2 0.96
## vigorous     105  0.71  0.69  0.65   0.60  3.5 1.45
## stimulated   102  0.77  0.78  0.76   0.72  4.2 0.86
## determined   105  0.65  0.65  0.59   0.56  4.4 0.97
## powerfull    104  0.65  0.66  0.61   0.57  4.2 1.04
## dynamic      101  0.63  0.62  0.57   0.52  3.8 1.21
## interested   104  0.65  0.63  0.57   0.54  3.9 1.09
## 
## Non missing response frequency for each item
##                 1    2    3    4    5 miss
## productive   0.07 0.15 0.24 0.33 0.21 0.03
## strong       0.07 0.09 0.13 0.27 0.44 0.04
## enthusiastic 0.03 0.04 0.08 0.37 0.48 0.06
## inspired     0.01 0.07 0.10 0.32 0.50 0.06
## vigorous     0.15 0.11 0.12 0.27 0.34 0.03
## stimulated   0.01 0.05 0.07 0.43 0.44 0.06
## determined   0.05 0.01 0.04 0.33 0.57 0.03
## powerfull    0.02 0.08 0.12 0.28 0.51 0.04
## dynamic      0.06 0.12 0.16 0.33 0.34 0.06
## interested   0.05 0.07 0.16 0.40 0.32 0.04

#negative
ds %>% 
  select(distressed, apreensivo, amedrontado, incomodado, angustiado, afraid, humiliated, irritable, scared, nervous) %>% 
  psych::alpha()

## 
## Reliability analysis   
## Call: psych::alpha(x = .)
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean   sd median_r
##       0.76      0.78    0.82      0.26 3.5 0.033  1.2 0.36     0.22
## 
##  lower alpha upper     95% confidence boundaries
## 0.7 0.76 0.83 
## 
##  Reliability if an item is dropped:
##             raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## distressed       0.76      0.77    0.82      0.27 3.3    0.033 0.036  0.20
## apreensivo       0.74      0.75    0.80      0.25 3.0    0.037 0.030  0.22
## amedrontado      0.72      0.73    0.77      0.24 2.8    0.040 0.024  0.20
## incomodado       0.78      0.78    0.83      0.28 3.6    0.031 0.034  0.25
## angustiado       0.75      0.76    0.80      0.26 3.1    0.035 0.037  0.20
## afraid           0.74      0.75    0.79      0.25 3.1    0.037 0.026  0.22
## humiliated       0.77      0.78    0.82      0.28 3.5    0.034 0.029  0.22
## irritable        0.75      0.76    0.79      0.26 3.2    0.034 0.032  0.22
## scared           0.72      0.74    0.78      0.24 2.8    0.039 0.024  0.20
## nervous          0.72      0.73    0.77      0.23 2.7    0.039 0.031  0.19
## 
##  Item statistics 
##               n raw.r std.r r.cor r.drop mean   sd
## distressed  105  0.50  0.49  0.39   0.32  1.3 0.70
## apreensivo  105  0.65  0.60  0.55   0.49  1.4 0.85
## amedrontado 104  0.76  0.71  0.72   0.63  1.2 0.62
## incomodado  104  0.46  0.40  0.27   0.24  1.4 0.77
## angustiado  105  0.55  0.58  0.51   0.42  1.1 0.53
## afraid      103  0.59  0.60  0.56   0.50  1.2 0.56
## humiliated  104  0.36  0.42  0.32   0.25  1.0 0.24
## irritable   104  0.52  0.55  0.51   0.37  1.2 0.55
## scared      105  0.69  0.68  0.68   0.60  1.2 0.62
## nervous     105  0.71  0.72  0.71   0.61  1.2 0.56
## 
## Non missing response frequency for each item
##                1    2    3    4    5 miss
## distressed  0.80 0.12 0.05 0.03 0.00 0.03
## apreensivo  0.72 0.16 0.07 0.04 0.01 0.03
## amedrontado 0.91 0.06 0.01 0.00 0.02 0.04
## incomodado  0.73 0.20 0.03 0.03 0.01 0.04
## angustiado  0.91 0.05 0.02 0.02 0.00 0.03
## afraid      0.89 0.09 0.00 0.01 0.01 0.05
## humiliated  0.97 0.02 0.01 0.00 0.00 0.04
## irritable   0.88 0.09 0.02 0.02 0.00 0.04
## scared      0.92 0.04 0.01 0.02 0.01 0.03
## nervous     0.89 0.08 0.03 0.00 0.01 0.03
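
As a rough cross-check on what `raw_alpha` measures, the sketch below computes Cronbach's alpha in base R from the item variances and the variance of the sum score. The data are simulated, and `cronbach_alpha` is a hypothetical helper rather than part of `psych`. It keeps complete cases only, so it need not match `psych::alpha()` on data with missing responses.

```r
# Hypothetical helper: Cronbach's alpha from a data frame of items
cronbach_alpha <- function(items) {
  items <- na.omit(as.data.frame(items))   # complete cases only (assumption)
  k <- ncol(items)                         # number of items
  item_var  <- sapply(items, var)          # variance of each item
  total_var <- var(rowSums(items))         # variance of the sum score
  (k / (k - 1)) * (1 - sum(item_var) / total_var)
}

# Simulated example: 5 items that share one common factor
set.seed(1)
common    <- rnorm(200)
sim_items <- as.data.frame(replicate(5, common + rnorm(200)))
alpha_sim <- cronbach_alpha(sim_items)
alpha_sim
```

Alpha rises when the items covary strongly relative to their individual variances, which is why the positive-affect items (average_r = 0.43) reach a higher value than the negative-affect items (average_r = 0.26).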

ds %>% 
  select(distressed:interested) %>% #get items
  psych::alpha(., check.keys = T)

## 
## Reliability analysis   
## Call: psych::alpha(x = ., check.keys = T)
## 
##   raw_alpha std.alpha G6(smc) average_r S/N   ase mean   sd median_r
##       0.87      0.87    0.92      0.25 6.8 0.017  4.4 0.48     0.23
## 
##  lower alpha upper     95% confidence boundaries
## 0.83 0.87 0.9 
## 
##  Reliability if an item is dropped:
##              raw_alpha std.alpha G6(smc) average_r S/N alpha se var.r med.r
## distressed-       0.86      0.87    0.92      0.26 6.6    0.017 0.030  0.23
## productive        0.86      0.86    0.91      0.25 6.4    0.018 0.029  0.23
## apreensivo-       0.87      0.87    0.92      0.26 6.8    0.016 0.027  0.25
## amedrontado-      0.86      0.87    0.91      0.26 6.6    0.017 0.026  0.24
## incomodado-       0.87      0.88    0.92      0.27 7.0    0.017 0.028  0.26
## angustiado-       0.87      0.87    0.91      0.26 6.6    0.017 0.030  0.23
## strong            0.85      0.86    0.91      0.25 6.2    0.019 0.026  0.22
## enthusiastic      0.85      0.86    0.91      0.24 6.1    0.019 0.026  0.21
## inspired          0.85      0.86    0.91      0.24 6.2    0.019 0.025  0.23
## afraid-           0.87      0.87    0.92      0.26 6.7    0.017 0.027  0.23
## vigorous          0.86      0.87    0.91      0.25 6.5    0.018 0.027  0.23
## stimulated        0.85      0.86    0.91      0.24 6.0    0.019 0.027  0.20
## humiliated-       0.87      0.87    0.92      0.26 6.8    0.017 0.029  0.24
## determined        0.86      0.87    0.91      0.26 6.5    0.018 0.027  0.22
## irritable-        0.86      0.87    0.91      0.25 6.5    0.017 0.029  0.23
## powerfull         0.86      0.87    0.91      0.26 6.5    0.018 0.027  0.23
## dynamic           0.86      0.87    0.91      0.25 6.4    0.018 0.029  0.23
## scared-           0.87      0.87    0.91      0.26 6.6    0.017 0.026  0.23
## nervous-          0.86      0.86    0.91      0.25 6.3    0.018 0.029  0.23
## interested        0.86      0.87    0.92      0.25 6.5    0.018 0.029  0.23
## 
##  Item statistics 
##                n raw.r std.r r.cor r.drop mean   sd
## distressed-  105  0.46  0.49  0.44   0.39  4.7 0.70
## productive   105  0.63  0.58  0.56   0.55  3.5 1.18
## apreensivo-  105  0.31  0.38  0.35   0.23  4.6 0.85
## amedrontado- 104  0.45  0.50  0.49   0.37  4.8 0.62
## incomodado-  104  0.28  0.30  0.24   0.18  4.6 0.77
## angustiado-  105  0.42  0.49  0.46   0.36  4.9 0.53
## strong       104  0.72  0.67  0.66   0.65  3.9 1.24
## enthusiastic 102  0.75  0.73  0.73   0.70  4.2 0.97
## inspired     101  0.74  0.69  0.70   0.69  4.2 0.96
## afraid-      103  0.37  0.45  0.42   0.33  4.8 0.56
## vigorous     105  0.62  0.54  0.51   0.51  3.5 1.45
## stimulated   102  0.77  0.76  0.76   0.73  4.2 0.86
## humiliated-  104  0.36  0.40  0.36   0.31  5.0 0.24
## determined   105  0.58  0.53  0.50   0.49  4.4 0.97
## irritable-   104  0.48  0.53  0.51   0.42  4.8 0.55
## powerfull    104  0.59  0.53  0.51   0.51  4.2 1.04
## dynamic      101  0.61  0.57  0.55   0.53  3.8 1.21
## scared-      105  0.40  0.49  0.48   0.36  4.8 0.62
## nervous-     105  0.56  0.63  0.62   0.51  4.8 0.56
## interested   104  0.60  0.54  0.51   0.52  3.9 1.09
## 
## Non missing response frequency for each item
##                 1    2    3    4    5 miss
## distressed   0.80 0.12 0.05 0.03 0.00 0.03
## productive   0.07 0.15 0.24 0.33 0.21 0.03
## apreensivo   0.72 0.16 0.07 0.04 0.01 0.03
## amedrontado  0.91 0.06 0.01 0.00 0.02 0.04
## incomodado   0.73 0.20 0.03 0.03 0.01 0.04
## angustiado   0.91 0.05 0.02 0.02 0.00 0.03
## strong       0.07 0.09 0.13 0.27 0.44 0.04
## enthusiastic 0.03 0.04 0.08 0.37 0.48 0.06
## inspired     0.01 0.07 0.10 0.32 0.50 0.06
## afraid       0.89 0.09 0.00 0.01 0.01 0.05
## vigorous     0.15 0.11 0.12 0.27 0.34 0.03
## stimulated   0.01 0.05 0.07 0.43 0.44 0.06
## humiliated   0.97 0.02 0.01 0.00 0.00 0.04
## determined   0.05 0.01 0.04 0.33 0.57 0.03
## irritable    0.88 0.09 0.02 0.02 0.00 0.04
## powerfull    0.02 0.08 0.12 0.28 0.51 0.04
## dynamic      0.06 0.12 0.16 0.33 0.34 0.06
## scared       0.92 0.04 0.01 0.02 0.01 0.03
## nervous      0.89 0.08 0.03 0.00 0.01 0.03
## interested   0.05 0.07 0.16 0.40 0.32 0.04
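
In the combined run, `check.keys = T` reverse-keyed the items marked with `-`. The sketch below shows the reversal on made-up responses, assuming the items are scored 1 to 5 as the response-frequency tables suggest. `reverse_key` is a hypothetical helper, not a `psych` function.

```r
# Reverse-key a Likert item: x -> (min + max) - x
reverse_key <- function(x, min_score = 1, max_score = 5) {
  (min_score + max_score) - x
}

responses <- c(1, 1, 2, 1, 3, NA, 1)   # made-up responses to a negative item
reversed  <- reverse_key(responses)
reversed
mean(responses, na.rm = TRUE)
mean(reversed,  na.rm = TRUE)          # equals 6 minus the original mean
```

This is consistent with the item statistics above, where the mean of `distressed` moves from 1.3 to 4.7 once the item is reversed.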

3.4 H1: Mean (Pain) at T1 > Mean (Pain) at T2

Pain at T1: M = 6.64, SD = 2.84. Pain at T2: M = 5.79, SD = 2.57.

ds %>% 
  select(pain_t1, pain_t2) %>% descr()

## Descriptive Statistics  
## ds  
## N: 108  
## 
##                     pain_t1   pain_t2
## ----------------- --------- ---------
##              Mean      6.64      5.79
##           Std.Dev      2.84      2.57
##               Min      0.00      0.00
##                Q1      5.00      4.00
##            Median      7.00      6.00
##                Q3      9.00      8.00
##               Max     10.00     10.00
##               MAD      2.97      2.97
##               IQR      4.00      4.00
##                CV      0.43      0.44
##          Skewness     -0.79     -0.55
##       SE.Skewness      0.23      0.32
##          Kurtosis     -0.23     -0.26
##           N.Valid    108.00     56.00
##         Pct.Valid    100.00     51.85

ds %>% 
  filter(!is.na(pain_t2)) %>% 
  select(id, email,pain_t1, pain_t2)%>% 
  pivot_longer(cols = -c(id,email),
               names_to = "Time",
               values_to = "Result")%>% 
  mutate(Time = if_else(Time == "pain_t1", "First measurement","Second measurement")) %>%
  ggboxplot(., "Time", "Result") +
  stat_compare_means(method = "t.test", label = "p.signif", paired = TRUE, var.equal = TRUE)

In the manuscript, this is Table 2.

t.test(ds$pain_t1, ds$pain_t2, 
       alternative = "greater",
       paired = T)

## 
##  Paired t-test
## 
## data:  ds$pain_t1 and ds$pain_t2
## t = 3.4123, df = 55, p-value = 0.0006075
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  0.5688612       Inf
## sample estimates:
## mean of the differences 
##                1.116071
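
The test above reports df = 55 although N = 108, because only complete T1/T2 pairs enter a paired t-test. The sketch below, on made-up ratings, shows that the paired test is a one-sample test on the within-person differences and can be reproduced by hand.

```r
# Made-up pain ratings (0-10) for six runners; NA marks a missing T2 response
t1 <- c(8, 6, 7, 9, 5, 7)
t2 <- c(6, 6, 5, 8, NA, 6)

paired  <- t.test(t1, t2, alternative = "greater", paired = TRUE)
one_smp <- t.test(t1 - t2, mu = 0, alternative = "greater")

# Manual computation on the complete pairs
d <- na.omit(t1 - t2)
t_manual <- mean(d) / (sd(d) / sqrt(length(d)))

c(paired     = unname(paired$statistic),
  one_sample = unname(one_smp$statistic),
  manual     = t_manual)
```

The degrees of freedom are the number of complete pairs minus one, which is 4 in this toy example and 55 in the analysis above.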

ds %>% 
  filter(!is.na(pain_t2)) %>% 
  {  effsize::cohen.d(.$pain_t1, .$pain_t2, paired = TRUE)}

## 
## Cohen's d
## 
## d estimate: 0.4473575 (small)
## 95 percent confidence interval:
##     lower     upper 
## 0.1748532 0.7198618
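
For orientation, one common paired effect size is d_z, the mean of the differences divided by their standard deviation, which equals t divided by the square root of the number of pairs. The sketch below uses made-up data. `effsize::cohen.d(paired = TRUE)` may standardise differently depending on the package version, so d_z is not claimed to reproduce the estimate above.

```r
t1 <- c(8, 6, 7, 9, 5, 7, 4, 8)   # made-up ratings
t2 <- c(6, 6, 5, 8, 5, 6, 5, 6)

diffs <- t1 - t2
d_z   <- mean(diffs) / sd(diffs)                         # standardised mean difference
t_val <- unname(t.test(t1, t2, paired = TRUE)$statistic)

c(d_z = d_z, t_over_sqrt_n = t_val / sqrt(length(diffs)))
```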

Graph

ds %>% 
  filter(!is.na(pain_t2)) %>% 
  select(id,pain_t1, pain_t2) %>% 
  pivot_longer(-id) %>% 
  mutate(name = as.factor(if_else(name == "pain_t1", "T1","T2"))) %>% 
  ggplot(., aes(x=name, y=value, group=1)) +
  stat_summary(fun = mean, geom = "line") +
  stat_summary(fun.data = mean_se, geom = "errorbar", width=0.2) +
  theme_bw() +
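  coord_cartesian(ylim = c(5, 8)) # zoom only; ylim(6, 9) would drop ratings outside 6-9 before the means are computed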

#https://stackoverflow.com/questions/10357768/plotting-lines-and-the-group-aesthetic-in-ggplot2
#https://kohske.wordpress.com/2010/12/27/faq-geom_line-doesnt-draw-lines/

ds %>% 
  select(sum_positive, sum_negative) %>% descr()

## Descriptive Statistics  
## ds  
## N: 108  
## 
##                     sum_negative   sum_positive
## ----------------- -------------- --------------
##              Mean          11.73          38.09
##           Std.Dev           3.94          10.18
##               Min           0.00           0.00
##                Q1          10.00          34.00
##            Median          11.00          40.00
##                Q3          13.00          45.00
##               Max          31.00          50.00
##               MAD           1.48           8.90
##               IQR           3.00          11.00
##                CV           0.34           0.27
##          Skewness           1.38          -1.61
##       SE.Skewness           0.23           0.23
##          Kurtosis           7.91           3.45
##           N.Valid         108.00         108.00
##         Pct.Valid         100.00         100.00

To check whether the participants who answered at T2 differ from those who dropped out after T1, the two groups are compared on positive affect, negative affect, age, and sex.

ds %>% 
  mutate(dropout = if_else(is.na(pain_t2),"y","n")) %>% 
  arrange(desc(dropout)) %>% 
  {t.test(sum_positive ~ dropout, var.equal = T, data = .)}

## 
##  Two Sample t-test
## 
## data:  sum_positive by dropout
## t = 1.2873, df = 106, p-value = 0.2008
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -1.35854  6.38876
## sample estimates:
## mean in group n mean in group y 
##        39.30357        36.78846

ds %>% 
  mutate(dropout = if_else(is.na(pain_t2),"y","n")) %>% 
  {t.test(sum_negative ~ dropout, var.equal = T, data = .)}

## 
##  Two Sample t-test
## 
## data:  sum_negative by dropout
## t = -0.9251, df = 106, p-value = 0.357
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -2.2105510  0.8039576
## sample estimates:
## mean in group n mean in group y 
##        11.39286        12.09615

ds %>% 
  mutate(dropout = if_else(is.na(pain_t2),"y","n")) %>% 
  {t.test(age ~ dropout, var.equal = T, data = .)}

## 
##  Two Sample t-test
## 
## data:  age by dropout
## t = 0.40825, df = 74, p-value = 0.6843
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -3.255624  4.933495
## sample estimates:
## mean in group n mean in group y 
##        40.80952        39.97059

ds %>% 
  mutate(dropout = if_else(is.na(pain_t2),"y","n")) %>% 
  {gmodels::CrossTable(.$sex, .$dropout,chisq = T)}

## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## | Chi-square contribution |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  108 
## 
##  
##              | .$dropout 
##        .$sex |         n |         y | Row Total | 
## -------------|-----------|-----------|-----------|
##            1 |        24 |        21 |        45 | 
##              |     0.019 |     0.021 |           | 
##              |     0.533 |     0.467 |     0.417 | 
##              |     0.429 |     0.404 |           | 
##              |     0.222 |     0.194 |           | 
## -------------|-----------|-----------|-----------|
##            2 |        32 |        31 |        63 | 
##              |     0.014 |     0.015 |           | 
##              |     0.508 |     0.492 |     0.583 | 
##              |     0.571 |     0.596 |           | 
##              |     0.296 |     0.287 |           | 
## -------------|-----------|-----------|-----------|
## Column Total |        56 |        52 |       108 | 
##              |     0.519 |     0.481 |           | 
## -------------|-----------|-----------|-----------|
## 
##  
## Statistics for All Table Factors
## 
## 
## Pearson's Chi-squared test 
## ------------------------------------------------------------
## Chi^2 =  0.0678179     d.f. =  1     p =  0.7945408 
## 
## Pearson's Chi-squared test with Yates' continuity correction 
## ------------------------------------------------------------
## Chi^2 =  0.004238619     d.f. =  1     p =  0.9480907 
## 
##

ds %>% 
  mutate(dropout = if_else(is.na(pain_t2),"y","n")) %>% 
  {chisq.test(table(.$sex, .$dropout))}

## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  table(.$sex, .$dropout)
## X-squared = 0.0042386, df = 1, p-value = 0.9481
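
Both chi-squared statistics above can be recomputed from the four cell counts. The sketch below copies the counts from the `CrossTable` output and derives the plain Pearson statistic and the Yates-corrected one in base R. The object names are illustrative.

```r
# Counts copied from the CrossTable output above
# (rows: sex 1/2; columns: dropout n/y)
tab <- matrix(c(24, 32, 21, 31), nrow = 2,
              dimnames = list(sex = c("1", "2"), dropout = c("n", "y")))

# Counts expected if sex and dropout were independent
expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)

chi_plain <- sum((tab - expected)^2 / expected)              # Pearson
chi_yates <- sum((abs(tab - expected) - 0.5)^2 / expected)   # Yates' correction

c(pearson = chi_plain, yates = chi_yates)
chisq.test(tab, correct = FALSE)$statistic   # agrees with chi_plain
chisq.test(tab)$statistic                    # agrees with chi_yates
```

`chisq.test()` applies the continuity correction to 2 x 2 tables by default, which is why its result matches the second `CrossTable` statistic rather than the first.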

3.5 H2

Plot

ds %>% 
  filter(!is.na(pain_t2)) %>%
  select(id,pain_t1,pain_t2,panas_group) %>%  #select target variables
  pivot_longer(cols = -c(id,panas_group),
               names_to = "Time",
               values_to = "Result") %>% #to long format
  mutate(panas_group = fct_recode(factor(panas_group), "Negative" = "0", "Positive" = "1"),
         Time = fct_recode(factor(Time), "First measurement" = "pain_t1", "Second measurement" = "pain_t2")) %>% #transform panas group and time into factors and then re-arrange its names
  ggplot(., aes(x = Time, y = Result, color = id, group = id, linetype = panas_group)) +
  geom_line() + geom_point() + theme_bw() + scale_linetype(name = "PANAS group")

Linear mixed model

In the manuscript, this is Table 3.

#https://web.stanford.edu/class/psych252/section/Mixed_models_tutorial.html
ds %>% #filter(!is.na(pain_t2)) %>% 
  mutate(id = as.factor(id)) %>% 
  select(id,pain_t1,pain_t2,panas_group) %>%  #select target variables
  pivot_longer(cols = -c(id,panas_group),
               names_to = "Time",
               values_to = "Result") %>% 
  lmer(Result  ~ Time * panas_group + (1 | id),.) %>% 
  summary()

## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: Result ~ Time * panas_group + (1 | id)
##    Data: .
## 
## REML criterion at convergence: 771.2
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.0549 -0.5027  0.1038  0.5516  1.5132 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  id       (Intercept) 4.589    2.142   
##  Residual             3.228    1.797   
## Number of obs: 164, groups:  id, 108
## 
## Fixed effects:
##                                 Estimate Std. Error      df t value Pr(>|t|)
## (Intercept)                        8.600      1.250 127.792   6.878 2.42e-10
## Timepain_t2                       -2.248      1.682  64.026  -1.337    0.186
## panas_grouppositive               -2.051      1.280 127.792  -1.602    0.112
## Timepain_t2:panas_grouppositive    1.294      1.713  63.876   0.755    0.453
##                                    
## (Intercept)                     ***
## Timepain_t2                        
## panas_grouppositive                
## Timepain_t2:panas_grouppositive    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) Tmpn_2 pns_gr
## Timepain_t2 -0.307              
## pns_grppstv -0.977  0.300       
## Tmpn_t2:pn_  0.301 -0.981 -0.309
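
The model reports 164 observations from 108 runners, that is, 108 T1 ratings plus 56 T2 ratings. The base-R sketch below uses simulated data (not the study data) to show how the long format keeps every non-missing rating, whereas listwise deletion on the wide table would keep only the runners with both ratings.

```r
set.seed(2018)
n <- 108
wide <- data.frame(
  id      = factor(seq_len(n)),
  pain_t1 = sample(0:10, n, replace = TRUE),   # simulated, not the study data
  pain_t2 = sample(0:10, n, replace = TRUE)
)
wide$pain_t2[sample(n, 52)] <- NA              # 52 runners without a T2 rating

# Rough base-R counterpart of pivot_longer(): one row per runner and time point
long <- data.frame(
  id     = rep(wide$id, times = 2),
  Time   = rep(c("pain_t1", "pain_t2"), each = n),
  Result = c(wide$pain_t1, wide$pain_t2)
)
long_complete <- long[!is.na(long$Result), ]

nrow(long_complete)          # rows a mixed model could use
sum(complete.cases(wide))    # runners kept under listwise deletion
```

This is the practical difference from the repeated-measures ANOVA, which needs both ratings for every runner it includes.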

(In case someone wants to check a repeated-measures ANOVA)

ez_outcome <- ds %>% 
    filter(!is.na(pain_t2)) %>%  #repeated-measures ANOVA does not work with empty cells
    select(id,pain_t1,pain_t2,panas_group) %>%  #select target variables
    pivot_longer(cols = -c(id,panas_group), #use these variables as index
                 names_to = "Time",
                 values_to = "Result") %>% 
  ez::ezANOVA(
    data = ., #data
    dv = Result, #result is the target 
    wid = id, #each subject is within this dataset
    within = Time, #time is repeated
    between = .(panas_group), #each participant was assigned to a specific condition
    type = 3, #contrast
    detailed = TRUE,
    return_aov = TRUE)

ez_outcome$ANOVA

##             Effect DFn DFd         SSn      SSd          F            p p<.05
## 1      (Intercept)   1  54 684.2864583 516.6042 71.5276243 1.814927e-11     *
## 2      panas_group   1  54   1.7864583 516.6042  0.1867363 6.673677e-01      
## 3             Time   1  54   9.1674107 163.9375  3.0196885 8.795969e-02      
## 4 panas_group:Time   1  54   0.8102679 163.9375  0.2668972 6.075302e-01      
##           ges
## 1 0.501371891
## 2 0.002618181
## 3 0.013291707
## 4 0.001189206
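
The `ges` column is generalized eta squared. The values in this table are consistent with dividing each effect's sum of squares by that sum plus both error sums of squares, as the sketch below checks using numbers copied from the output. This is an illustration for this particular design, not a general statement of how `ez` handles every design.

```r
# Sums of squares copied from ez_outcome$ANOVA above
ssn <- c(intercept   = 684.2864583,
         panas_group = 1.7864583,
         time        = 9.1674107,
         interaction = 0.8102679)
ssd_between <- 516.6042   # error term for the between-subject effects
ssd_within  <- 163.9375   # error term for the within-subject effects

# Effect SS over effect SS plus all error SS
ges <- ssn / (ssn + ssd_between + ssd_within)
round(ges, 6)
```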

4 Exploratory analyses

(Babel 1) I would run a regression analysis with pain intensity experienced, positive and negative affect as predictors, and recalled pain as a dependent variable.

lm(pain_t2 ~ pain_t1 + sum_positive + sum_negative, data = ds) %>% 
  olsrr::ols_regress()

##                         Model Summary                          
## --------------------------------------------------------------
## R                       0.531       RMSE                2.239 
## R-Squared               0.282       Coef. Var          38.705 
## Adj. R-Squared          0.241       MSE                 5.015 
## Pred R-Squared          0.136       MAE                 1.804 
## --------------------------------------------------------------
##  RMSE: Root Mean Square Error 
##  MSE: Mean Square Error 
##  MAE: Mean Absolute Error 
## 
##                               ANOVA                                
## ------------------------------------------------------------------
##                Sum of                                             
##               Squares        DF    Mean Square      F        Sig. 
## ------------------------------------------------------------------
## Regression    102.662         3         34.221    6.824     6e-04 
## Residual      260.767        52          5.015                    
## Total         363.429        55                                   
## ------------------------------------------------------------------
## 
##                                   Parameter Estimates                                    
## ----------------------------------------------------------------------------------------
##        model      Beta    Std. Error    Std. Beta      t        Sig      lower    upper 
## ----------------------------------------------------------------------------------------
##  (Intercept)     1.939         1.701                  1.140    0.260    -1.475    5.353 
##      pain_t1     0.543         0.126        0.509     4.298    0.000     0.289    0.796 
## sum_positive    -0.023         0.033       -0.089    -0.703    0.485    -0.088    0.043 
## sum_negative     0.088         0.098        0.114     0.900    0.372    -0.108    0.284 
## ----------------------------------------------------------------------------------------
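
The `Std. Beta` column rescales each slope to standard-deviation units. The sketch below, on simulated data with illustrative variable names, uses the usual definition (raw slope times the predictor's SD divided by the outcome's SD) and confirms it against a refit on z-scored variables. It is assumed, not verified here, that `olsrr` follows the same definition.

```r
set.seed(42)
n <- 60
sim <- data.frame(pain_t1      = rnorm(n, 6.6, 2.8),   # simulated predictors
                  sum_positive = rnorm(n, 38, 10))
sim$pain_t2 <- 2 + 0.5 * sim$pain_t1 - 0.02 * sim$sum_positive + rnorm(n, 0, 2)

fit_raw <- lm(pain_t2 ~ pain_t1 + sum_positive, data = sim)

# Standardized slope = raw slope * sd(predictor) / sd(outcome)
b        <- coef(fit_raw)[-1]
std_beta <- b * sapply(sim[names(b)], sd) / sd(sim$pain_t2)

# Same values from refitting on z-scored variables
fit_z <- lm(pain_t2 ~ pain_t1 + sum_positive, data = as.data.frame(scale(sim)))

rbind(formula = std_beta, refit = coef(fit_z)[-1])
```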

All predictors

ds %>% 
  select(pain_t2 , pain_t1 , sum_positive , sum_negative , sex , age , run_alone , athlete , first_marathon) %>% 
  str()

## tibble [108 x 9] (S3: tbl_df/tbl/data.frame)
##  $ pain_t2       : num [1:108] NA 8 7 NA 3 NA 4 NA 3 3 ...
##  $ pain_t1       : num [1:108] 10 6 7 9 6 8 5 8 7 7 ...
##  $ sum_positive  : num [1:108] 36 34 40 16 37 32 42 31 32 0 ...
##  $ sum_negative  : num [1:108] 11 14 11 20 10 13 11 14 10 0 ...
##  $ sex           : int [1:108] 2 2 2 2 2 2 2 2 2 2 ...
##  $ age           : num [1:108] 69 63 58 56 55 54 51 51 50 50 ...
##  $ run_alone     : int [1:108] 2 2 2 2 2 2 2 NA NA NA ...
##  $ athlete       : int [1:108] 2 2 2 2 2 1 2 NA 2 NA ...
##  $ first_marathon: int [1:108] 1 1 1 1 1 2 2 1 1 NA ...

library(glmulti)
fit <- glmulti(pain_t2 ~ pain_t1 + sum_positive + sum_negative + sex + age + run_alone + athlete + first_marathon, 
          data = ds,
          #xr = c("sex", "run_alone", "athlete", "first_marathon"),
          level = 1,               # No interaction considered
          method = "g",            # Genetic algorithm ("h" would be exhaustive)
          crit = "aicc",           # AICc (small-sample corrected AIC) as criterion
          confsetsize = 5,         # Keep 5 best models
          plotty = F, 
          report = F,  # No plot or interim reports
          fitfunction = "lm",      # lm function
          marginality = T) #<-- Don't leave out the main effect

## TASK: Genetic algorithm in the candidate set.
## Initialization...
## Algorithm started...
## Improvements in best and average IC have been below the specified goals.
## Algorithm is declared to have converged.
## Completed.

summary(fit)

## $name
## [1] "glmulti.analysis"
## 
## $method
## [1] "g"
## 
## $fitting
## [1] "lm"
## 
## $crit
## [1] "aicc"
## 
## $level
## [1] 1
## 
## $marginality
## [1] TRUE
## 
## $confsetsize
## [1] 5
## 
## $bestic
## [1] 161.9397
## 
## $icvalues
## [1] 161.9397 162.2118 163.8248 163.9958 164.0557
## 
## $bestmodel
## [1] "pain_t2 ~ 1 + pain_t1 + sum_positive + sum_negative + age + run_alone + "
## [2] "    first_marathon"                                                      
## 
## $modelweights
## [1] 0.3370044 0.2941417 0.1313106 0.1205499 0.1169934
## 
## $generations
## [1] 180
## 
## $elapsed
## [1] 0.008809018
## 
## $includeobjects
## [1] TRUE

summary(fit)$bestmodel

## [1] "pain_t2 ~ 1 + pain_t1 + sum_positive + sum_negative + age + run_alone + "
## [2] "    first_marathon"

#Check models
top <- weightable(fit)
top <- top[top$aicc <= min(top$aicc) + 2,]
top

##                                                                                                    model
## 1                 pain_t2 ~ 1 + pain_t1 + sum_positive + sum_negative + age + run_alone + first_marathon
## 2       pain_t2 ~ 1 + pain_t1 + sum_positive + sum_negative + age + run_alone + athlete + first_marathon
## 3 pain_t2 ~ 1 + pain_t1 + sum_positive + sum_negative + sex + age + run_alone + athlete + first_marathon
##       aicc   weights
## 1 161.9397 0.3370044
## 2 162.2118 0.2941417
## 3 163.8248 0.1313106
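
The `weights` column follows from the AICc values alone. The sketch below copies the five values from `summary(fit)$icvalues` and recomputes the Akaike weights and the two-unit cutoff used to build `top`. The object names are illustrative.

```r
# AICc values copied from summary(fit)$icvalues above
aicc <- c(161.9397, 162.2118, 163.8248, 163.9958, 164.0557)

delta   <- aicc - min(aicc)          # distance from the best model
rel_lik <- exp(-delta / 2)           # relative likelihood of each model
weights <- rel_lik / sum(rel_lik)    # normalised over the 5 retained models

round(data.frame(aicc, delta, weights), 4)
sum(delta <= 2)                      # models within 2 AICc units of the best
```

Because the weights are normalised over the retained set, they depend on `confsetsize`. Keeping more models would lower every weight.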

summary(fit@objects[[1]])

## 
## Call:
## fitfunc(formula = as.formula(x), data = data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.2376 -1.1728 -0.1367  1.1670  3.9465 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    -2.40962    4.61538  -0.522   0.6057    
## pain_t1         0.71263    0.15161   4.700 6.29e-05 ***
## sum_positive   -0.13932    0.06417  -2.171   0.0385 *  
## sum_negative    0.41147    0.17854   2.305   0.0288 *  
## age            -0.01700    0.03600  -0.472   0.6404    
## run_alone       1.47975    0.76035   1.946   0.0617 .  
## first_marathon  1.81113    0.83754   2.162   0.0393 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.01 on 28 degrees of freedom
##   (73 observations deleted due to missingness)
## Multiple R-squared:  0.588,  Adjusted R-squared:  0.4997 
## F-statistic:  6.66 on 6 and 28 DF,  p-value: 0.000185

#http://www.metafor-project.org/doku.php/tips:model_selection_with_glmulti_and_mumin

5 Presenting

lm(pain_t2 ~ factor(run_alone) + factor(first_marathon) + pain_t1 + sum_positive + sum_negative + age, ds) %>% 
  olsrr::ols_regress()

##                         Model Summary                          
## --------------------------------------------------------------
## R                       0.767       RMSE                2.010 
## R-Squared               0.588       Coef. Var          36.271 
## Adj. R-Squared          0.500       MSE                 4.042 
## Pred R-Squared          0.366       MAE                 1.498 
## --------------------------------------------------------------
##  RMSE: Root Mean Square Error 
##  MSE: Mean Square Error 
##  MAE: Mean Absolute Error 
## 
##                               ANOVA                                
## ------------------------------------------------------------------
##                Sum of                                             
##               Squares        DF    Mean Square      F        Sig. 
## ------------------------------------------------------------------
## Regression    161.511         6         26.918     6.66     2e-04 
## Residual      113.175        28          4.042                    
## Total         274.686        34                                   
## ------------------------------------------------------------------
## 
##                                         Parameter Estimates                                          
## ----------------------------------------------------------------------------------------------------
##                   model      Beta    Std. Error    Std. Beta      t        Sig      lower     upper 
## ----------------------------------------------------------------------------------------------------
##             (Intercept)     0.881         4.317                  0.204    0.840    -7.963     9.725 
##      factor(run_alone)2     1.480         0.760        0.255     1.946    0.062    -0.078     3.037 
## factor(first_marathon)2     1.811         0.838        0.307     2.162    0.039     0.096     3.527 
##                 pain_t1     0.713         0.152        0.620     4.700    0.000     0.402     1.023 
##            sum_positive    -0.139         0.064       -0.289    -2.171    0.039    -0.271    -0.008 
##            sum_negative     0.411         0.179        0.316     2.305    0.029     0.046     0.777 
##                     age    -0.017         0.036       -0.058    -0.472    0.640    -0.091     0.057 
## ----------------------------------------------------------------------------------------------------
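
The slopes here match the best `glmulti` model, but the intercept differs (0.881 versus -2.410). The sketch below reconciles the two from the reported coefficients. It assumes `run_alone` and `first_marathon` take only the values 1 and 2, as the `str()` output suggests, so that `factor()` makes level 1 the reference.

```r
# Coefficients copied from summary(fit@objects[[1]]) above (numeric 1/2 coding)
b0_numeric       <- -2.40962
b_run_alone      <-  1.47975
b_first_marathon <-  1.81113

# With factor(), the reference level is code 1, so the intercept absorbs
# one unit of each slope from the numeric coding
b0_factor <- b0_numeric + 1 * b_run_alone + 1 * b_first_marathon
round(b0_factor, 3)
```

Under the numeric coding the intercept refers to the impossible value 0 on both variables. Under the factor coding it refers to a runner in category 1 of each.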

Done (on March 22, 2021)

Project - Memory of pain (2018)

Luis Anunciacao

August 11, 2020
