Association between sleep quality and pain in older adults

Introduction
- Adjusting Variables for the Analysis
Analysis
- Association between pain and sleepquality
- Association between sleepquality and pain
  - Ordinal Regression
  - Logistic Regression

Introduction

In this manuscript, we’re going to conduct an analysis into the ELSI Dataset. This is a modified version for usage in the Principles and Practice of Clinical Research program of Harvard T.H. Chan School of Public Health.

The aim here is to replicate the analysis done in a previous paper and see if the results also apply to the sample that was included in the ELSI Brazil.

First order of business is to download the dataset we’re going to be analyzing, which was kindly provided by PPCR staff.

## Loading the necessary libraries
library(tidyverse)
library(haven)
library(broom)
library(ggplot2)
library(gtsummary)

## Url provided by PPCR staff
url <- 'https://ryver-prd-files.s3.us-west-2.amazonaws.com/tnt45405/T9gKrWnaRr5K0yT/FINAL%20-%20ELSI_students.dta'

## Using the read_dta function from haven to read the .dta file created in STATA
PPCRdata <- read_dta(url)

## Removing the url from memory as it is no longer useful
rm(url)

## Transforming to tibble in order to increase readability
PPCRdata <- as_tibble(PPCRdata)
PPCRdata

## # A tibble: 6,974 × 121
##         id2   psu calibrated_weight stratum region    area    hometype staircase
##       <dbl> <dbl>             <dbl>   <dbl> <dbl+lbl> <dbl+l> <dbl+lb> <dbl+lbl>
##  1 20200001    21             3111.      30 1 [North] 1 [Urb… 1 [Hous… 0 [No]   
##  2 20200002    21             1555.      30 1 [North] 1 [Urb… 1 [Hous… 0 [No]   
##  3 20200003    21             1147.      30 1 [North] 1 [Urb… 1 [Hous… 0 [No]   
##  4 20200004    21             1523.      30 1 [North] 1 [Urb… 1 [Hous… 0 [No]   
##  5 20200007    21             2563.      30 1 [North] 1 [Urb… 1 [Hous… 0 [No]   
##  6 20200009    21             5127.      30 1 [North] 1 [Urb… 1 [Hous… 0 [No]   
##  7 20200012    21              918.      30 1 [North] 1 [Urb… 1 [Hous… 1 [Yes]  
##  8 20200013    21              757.      30 1 [North] 1 [Urb… 1 [Hous… 1 [Yes]  
##  9 20200014    21             1027.      30 1 [North] 1 [Urb… 1 [Hous… 0 [No]   
## 10 20200015    21             1005.      30 1 [North] 1 [Urb… 1 [Hous… 0 [No]   
## # ℹ 6,964 more rows
## # ℹ 113 more variables: sex <dbl+lbl>, age <dbl+lbl>, marital <dbl+lbl>,
## #   race <dbl+lbl>, children <dbl+lbl>, grandchildren <dbl+lbl>,
## #   siblings <dbl+lbl>, education <dbl+lbl>, dis_med <dbl+lbl>,
## #   dis_social <dbl+lbl>, dis_work <dbl+lbl>, dis_family <dbl+lbl>,
## #   dis_place <dbl+lbl>, work_12 <dbl+lbl>, workability <dbl+lbl>,
## #   retired <dbl+lbl>, reasonretire <dbl+lbl>, incomescore <dbl+lbl>, …

PPCR Staff has described the variables included in this dataset with the following: Variables in modified dataset

Adjusting Variables for the Analysis

Let’s now look at the some of the individual variables to understand what they mean

unique(PPCRdata$pain)

## <labelled<double>[2]>: Do you feel pain that bothers you frequently?
## [1] 1 0
## 
## Labels:
##  value                     label
##      0                        No
##      1                       Yes
##      8            Does not apply
##      9 Didn’t know/didn’t answer

unique(PPCRdata$p_intensity)

## <labelled<double>[4]>: How intense is this pain most of the time?
## [1]  2 NA  3  1
## 
## Labels:
##  value                     label
##      1                 Soft/weak
##      2                  Moderate
##      3            Intense/strong
##      8            Does not apply
##      9 Didn’t know/didn’t answer

unique(PPCRdata$sleepquality)

## <labelled<double>[6]>: How would you evaluate the quality of your sleep?
## [1]  4  2  3  1  5 NA
## 
## Labels:
##  value                     label
##      1                 Very good
##      2                      Good
##      3                      Fair
##      4                       Bad
##      5                  Very bad
##      9 Didn’t know/didn’t answer

unique(PPCRdata$sleepproblems)

## <labelled<double>[3]>: Has a doctor ever told you that you have sleep problems?
## [1]  0  1 NA
## 
## Labels:
##  value                     label
##      0                        No
##      1                       Yes
##      8            Does not apply
##      9 Didn’t know/didn’t answer

We can then see that pain and sleepproblems are binary variables, while p_intensity and sleepquality are ordinal variables, although they aren’t coded in this manner yet.

Now let’s transform p_intensity and sleepquality into categorical variables and pain and sleepproblems into dichotomous variables, as they were read as numerical variables with labels.

## Using the as_factor function from haven to transform the variables 
## with the labels as names for the levels in the factors
PPCRdata$sleepquality <- as_factor(PPCRdata$sleepquality)
PPCRdata$p_intensity <- as_factor(PPCRdata$p_intensity)
PPCRdata$pain <- as_factor(PPCRdata$pain)
PPCRdata$sleepproblems <- as_factor(PPCRdata$sleepproblems)


levels(PPCRdata$sleepquality)

## [1] "Very good"                 "Good"                     
## [3] "Fair"                      "Bad"                      
## [5] "Very bad"                  "Didn’t know/didn’t answer"

## Reversing the order of factors, as initially the order was wrong in sleepquality
PPCRdata$sleepquality <- fct_rev(PPCRdata$sleepquality)
levels(PPCRdata$sleepquality)

## [1] "Didn’t know/didn’t answer" "Very bad"                 
## [3] "Bad"                       "Fair"                     
## [5] "Good"                      "Very good"

## doing the same with genhealth as it will be used for a sensitivity analysis
# adjusting the genhealth variable
PPCRdata %>% mutate(genhealth = fct_rev(as_factor(genhealth))) -> PPCRdata

Other variables that we’re going to use in our model that haven’t been coded properly as well include: depression, alcohol, fracture, cancer, diabetes, dementia, arthritis, heartdisease and lungdisease.

## Showing that the variables are coded as numerical variables
unique(PPCRdata$depression)

## <labelled<double>[3]>: Has a doctor ever told you that you have depression?
## [1]  1  0 NA
## 
## Labels:
##  value                     label
##      0                        No
##      1                       Yes
##      9 Didn’t know/didn’t answer

## Commented the other variables for brevity, but they were all numerical
## unique(PPCRdata$sex)
## unique(PPCRdata$alcohol)
## unique(PPCRdata$fracture)
## unique(PPCRdata$cancer)
## unique(PPCRdata$diabetes)
## unique(PPCRdata$dementia)
## unique(PPCRdata$arthritis)
## unique(PPCRdata$heartdisease)
## unique(PPCRdata$lungdisease)

## Using the as_factor function from haven to transform the variables 
## with the labels as names for the levels in the factors
PPCRdata$depression <- as_factor(PPCRdata$depression)
PPCRdata$sex <- as_factor(PPCRdata$sex)
PPCRdata$alcohol <- as_factor(PPCRdata$alcohol)
PPCRdata$fracture <- as_factor(PPCRdata$fracture)
PPCRdata$cancer <- as_factor(PPCRdata$cancer)
PPCRdata$diabetes <- as_factor(PPCRdata$diabetes)
PPCRdata$dementia <- as_factor(PPCRdata$dementia)
PPCRdata$arthritis <- as_factor(PPCRdata$arthritis)
PPCRdata$heartdisease <- as_factor(PPCRdata$heartdisease)
PPCRdata$lungdisease <- as_factor(PPCRdata$lungdisease)

## Converting age to numeric

PPCRdata$age <- as.numeric(PPCRdata$age)

Let’s now create a table summarizing all the information we have collected.

## Table 1
## divided by pain

## selecting all the variables that will be used
PPCRdata %>% dplyr::select(age,bmi,sex,pain,p_intensity,sleepquality,
                           sleepproblems,depression,alcohol,fracture,cancer,
                           diabetes,dementia,arthritis,heartdisease,
                           lungdisease) %>%
        
        ## recoding the variables to not have levels that aren't used, so
        ## that they don't clutter the table
        mutate(pain = recode(droplevels(pain), No = "No Pain", Yes = "Pain"),
               p_intensity = droplevels(p_intensity),
               sleepquality = droplevels(sleepquality),
               depression = droplevels(depression),
               alcohol = droplevels(alcohol),
               diabetes = droplevels(diabetes),
               dementia = droplevels(dementia),
               arthritis = droplevels(arthritis),
               cancer = droplevels(cancer),
               sleepproblems = droplevels(sleepproblems)) %>%
        
        ## determining how each type of variable will be presented
        ## continous as mean and sd and categorical as absolute relative
        ## frequencies
        tbl_summary(statistic = list(all_continuous() ~ "{mean} ({sd})",
                                    all_categorical() ~ "{n} ({p}%)"),
                    ## not showing NA's in each variable
                    missing = "no",
                    ## dividing the observations by pain
                    by = pain,
                    ## presenting the variables better
                    label = list(p_intensity ~ "Pain Intensity",
                                 sleepquality ~ "Sleep Quality",
                                 depression ~ "Depression",
                                 alcohol ~ "Alcohol Consumption",
                                 diabetes ~ "Diabetes Mellitus",
                                 dementia ~ "Dementia",
                                 arthritis ~ "Arthritis",
                                 cancer ~ "Cancer",
                                 sleepproblems ~ "Sleep Problems",
                                 age ~ "Age",
                                 bmi ~ "Body Mass Index",
                                 fracture ~ "Fracture of Hip or Wrist",
                                 heartdisease ~ "Heart Disease",
                                 lungdisease ~ "Lung Disease")) %>%
                ## title of the variable column
                modify_header(label = "**Variable**") %>%
                ## title of the table
                modify_caption("Subjects baseline characteristics") %>%
                bold_labels() %>%
                ## since we're not showing NA's in each variable, showing the
                ## N for each specific variable
                add_n() %>% 
                ## adding a column for overall observations, not divided by pain
                add_overall(last = TRUE) -> table1

## checking if table has been saved already
if (!file.exists("C:/Users/Pedro Henrique/Documents/PPCR/Data Analysis Project/DataProjectPPCR/tables/table1.docx")){
        ## if it hasn't, saving the table for publication
        table1 %>% 
                as_gt() %>% 
                gt::gtsave(filename = "table1.docx", path = "C:/Users/Pedro Henrique/Documents/PPCR/Data Analysis Project/DataProjectPPCR/tables")
}
## printing the table
table1

Subjects baseline characteristics
Variable	N	No Pain, N = 4,391¹	Pain, N = 2,583¹	Overall, N = 6,974¹
Age	6,974	66 (9)	65 (9)	66 (9)
Body Mass Index	6,850	27.5 (4.9)	28.6 (5.7)	27.9 (5.2)
Sex	6,974
Female		2,462 (56%)	1,766 (68%)	4,228 (61%)
Male		1,929 (44%)	817 (32%)	2,746 (39%)
Pain Intensity	2,583
Soft/weak		0 (NA%)	318 (12%)	318 (12%)
Moderate		0 (NA%)	1,348 (52%)	1,348 (52%)
Intense/strong		0 (NA%)	917 (36%)	917 (36%)
Sleep Quality	6,968
Very bad		70 (1.6%)	162 (6.3%)	232 (3.3%)
Bad		352 (8.0%)	536 (21%)	888 (13%)
Fair		907 (21%)	724 (28%)	1,631 (23%)
Good		2,387 (54%)	932 (36%)	3,319 (48%)
Very good		673 (15%)	225 (8.7%)	898 (13%)
Sleep Problems	6,941	293 (6.7%)	495 (19%)	788 (11%)
Depression	6,963	422 (9.6%)	477 (19%)	899 (13%)
Alcohol Consumption	6,961
Never		3,268 (75%)	2,059 (80%)	5,327 (77%)
Less than once a month		484 (11%)	197 (7.6%)	681 (9.8%)
Once a month or more		631 (14%)	322 (12%)	953 (14%)
Fracture of Hip or Wrist	6,974	42 (1.0%)	47 (1.8%)	89 (1.3%)
Cancer	6,970	175 (4.0%)	117 (4.5%)	292 (4.2%)
Diabetes Mellitus	6,943	695 (16%)	563 (22%)	1,258 (18%)
Dementia	6,969	47 (1.1%)	69 (2.7%)	116 (1.7%)
Arthritis	6,937	546 (12%)	922 (36%)	1,468 (21%)
Heart Disease	6,968	260 (5.9%)	292 (11%)	552 (7.9%)
Lung Disease	6,968	159 (3.6%)	233 (9.0%)	392 (5.6%)
¹ Mean (SD); n (%)

It’s important to note that the pain intensity variable doesn’t have any observations for those that do not have pain in the first place, as should be expected.

At this point we have coded all the variables correctly, so we can proceed with the analysis.

Analysis

Association between pain and sleepquality

Unadjusted Analysis

This analysis checks for the association between the pain variable and the sleepquality variable, without adjustments. As pain is a binary variable, we’re going to run a logistic regression.

## using the generalized linear model function 
## with the family = "binomial" to create a logistical
## regression

model1 <- glm (data = PPCRdata, formula = pain ~ sleepquality,
               family = "binomial")
summary(model1)

## 
## Call:
## glm(formula = pain ~ sleepquality, family = "binomial", data = PPCRdata)
## 
## Coefficients:
##                       Estimate Std. Error z value Pr(>|z|)    
## (Intercept)             0.8391     0.1430   5.866 4.45e-09 ***
## sleepqualityBad        -0.4186     0.1586  -2.639  0.00832 ** 
## sleepqualityFair       -1.0645     0.1515  -7.028 2.10e-12 ***
## sleepqualityGood       -1.7796     0.1482 -12.011  < 2e-16 ***
## sleepqualityVery good  -1.9347     0.1624 -11.910  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 9184.1  on 6967  degrees of freedom
## Residual deviance: 8669.3  on 6963  degrees of freedom
##   (6 observations deleted due to missingness)
## AIC: 8679.3
## 
## Number of Fisher Scoring iterations: 4

## coef(model1) extracts the coefficients from the model
## confint(model1) calculates confidence intervals (95%)
## cbind then creates a table by joining the two by column
## finally, we exponentiate all the results using the base e
## in order to transform logit units into regular units

exp(cbind(OR = coef(model1),confint(model1)))

## Waiting for profiling to be done...

##                              OR     2.5 %    97.5 %
## (Intercept)           2.3142857 1.7567890 3.0806419
## sleepqualityBad       0.6579686 0.4799679 0.8945713
## sleepqualityFair      0.3449168 0.2550187 0.4621682
## sleepqualityGood      0.1687122 0.1255215 0.2245582
## sleepqualityVery good 0.1444610 0.1045639 0.1978048

## another way, which is simpler and does the same, is to use
## the broom::tidy function

model1Tidy <- tidy(model1, conf.int = TRUE, exponentiate = TRUE)
model1Tidy %>% rename(Odds.Ratio = estimate) -> model1Tidy
model1Tidy

## # A tibble: 5 × 7
##   term                Odds.Ratio std.error statistic  p.value conf.low conf.high
##   <chr>                    <dbl>     <dbl>     <dbl>    <dbl>    <dbl>     <dbl>
## 1 (Intercept)              2.31      0.143      5.87 4.45e- 9    1.76      3.08 
## 2 sleepqualityBad          0.658     0.159     -2.64 8.32e- 3    0.480     0.895
## 3 sleepqualityFair         0.345     0.151     -7.03 2.10e-12    0.255     0.462
## 4 sleepqualityGood         0.169     0.148    -12.0  3.10e-33    0.126     0.225
## 5 sleepqualityVery g…      0.144     0.162    -11.9  1.05e-32    0.105     0.198

## as we will be doing this extensively for each model, creating a function
## in order to not repeat the code every time
tbl_pub <- function(model, path, file, labels){
        model %>% 
                        tbl_regression(exponentiate = TRUE, conf.int = TRUE, 
                                       label = labels) %>% 
                        bold_p() %>%
                        bold_labels() %>%
                        italicize_levels() %>%
                        modify_header(label = "**Variable**") -> tblx
        ## checking if the model has been saved already
        if (!file.exists(paste(path,file,sep = ""))){
                ## if it hasn't, saving the model for publication
                tblx %>%
                        as_gt() %>% 
                        gt::gtsave(filename = file, path = path)   
        }
        tblx
}

## creating a label list
labelList <- list(sleepquality ~ "Sleep Quality",
                  pain ~ "Pain",
                  age ~ "Age",
                  bmi ~ "Body Mass Index",
                  depression ~ "Depression",
                  alcohol ~ "Alcohol",
                  fracture ~ "Fracture",
                  cancer ~ "Cancer",
                  diabetes ~ "Diabetes",
                  dementia ~ "Dementia",
                  arthritis ~ "Arthritis",
                  heartdisease ~ "Heart Disease",
                  lungdisease ~ "Lung Disease",
                  sex ~ "Sex")

## creating a path variable

tablePath <- "C:/Users/Pedro Henrique/Documents/PPCR/Data Analysis Project/DataProjectPPCR/tables/"

## creating and saving the publication table
tbl_pub (model1, tablePath, "model1.docx", labelList[1]) -> tbl1


## let's plot these coefficients and their CI's to increase
## readability

## setting up two label arrays for reuse later

labels1 <- c("sleepqualityVery good" = "Very Good", "sleepqualityGood" = "Good",
             "sleepqualityFair" = "Fair", "sleepqualityBad" = "Bad")

labels2 <-c("sleepqualityVery good" = "Sleep Quality: Very Good",
            "sleepqualityGood" = "Sleep Quality: Good",
            "sleepqualityFair" = "Sleep Quality: Fair",
            "sleepqualityBad" = "Sleep Quality: Bad",
            "sleepqualityVery Bad" = "Reference - Sleep Quality: Very Bad",
            "alcoholOnce a month or more" = "Alcohol Use: Once a month of more",
            "alcoholLess than once a month" = "Alcohol Use: Less than once a month",
            "alcoholNever" = "Reference - Alcohol Use: Never",
            "depressionYes" = "Depression",
            "alcoholYes" = "Alcohol Consumption",
            "diabetesYes" = "Diabetes Mellitus",
            "dementiaYes" = "Dementia",
            "arthritisYes" = "Arthritis",
            "cancerYes" = "Cancer",
            "age" = "Age",
            "bmi" = "Body Mass Index",
            "fractureYes" = "Fracture of Hip or Wrist",
            "heartdiseaseYes" = "Heart Disease",
            "lungdiseaseYes" = "Lung Disease",
            "painYes" = "Pain",
            "sexMale" = "Male")

## creating a plot

model1Tidy %>% ggplot (aes(Odds.Ratio, term, xmin = conf.low,
                           xmax = conf.high, height = 0)) +
                ## inserting a point for the odds ratio
                geom_point() +
                ## drawing a line at the 1 to improve readability
                geom_vline(xintercept = 1, lty = 4) +
                ## drawing an horizontal line to represent the 
                ## confidence interval
                geom_errorbarh() +
                ## removing the intercept term and changing labels
                scale_y_discrete(
                        limits = setdiff(model1Tidy$term, "(Intercept)"),
                        labels = labels1) +
                labs(x = "Odds Ratio", y = "Sleep Quality",
                     title = "Presence of Pain", 
                subtitle = "Unadjusted odds ratio, in comparison to \"Very Bad\" Sleep Quality",
                caption = paste("N =", nobs(model1))) +
                ## adjusting the ticks on the x axis and its range
                scale_x_continuous (breaks = seq(0,1.2,0.2),
                                    limits = c(0,1.2)) -> plot1
## checking if plot has been saved already
if (!file.exists("C:/Users/Pedro Henrique/Documents/PPCR/Data Analysis Project/DataProjectPPCR/plots/plot1.png")){
        ## if it hasn't, saving the plot for publication
        ggsave("plot1.png", plot1, 
               path = "C:/Users/Pedro Henrique/Documents/PPCR/Data Analysis Project/DataProjectPPCR/plots", height = 3) 
}

plot1

This analysis allows us to determine that, when using a reference level of sleepquality that is very bad, the odds of experiencing pain when sleepquality increases are significantly lower (the odds ratio of the comparison of each other level with the very bad level of sleep quality can be seen in the table and plot above).

Sensitivity Analysis

In this next model, we’re going to consider sleepquality as a continuous variable.

## doing the transformation and saving on another variable to not lose the
## categorical information (names for each level)
PPCRdata %>% mutate(sleepqualitynum = 
        case_when(PPCRdata$sleepquality == 'Very bad' ~ 1,
                  PPCRdata$sleepquality == 'Bad' ~ 2,
                  PPCRdata$sleepquality == 'Fair' ~ 3,
                  PPCRdata$sleepquality == 'Good' ~ 4,
                  PPCRdata$sleepquality == 'Very good' ~ 5)) -> PPCRdata

Now that the variable is transformed to numeric, let’s run the model again.

glm (data = PPCRdata, formula = pain ~ sleepqualitynum, 
     family = "binomial") -> model2
model2 %>% tidy(conf.int = TRUE, exponentiate = TRUE) %>% 
        rename(Odds.Ratio = estimate) -> model2Tidy
model2Tidy

## # A tibble: 2 × 7
##   term            Odds.Ratio std.error statistic   p.value conf.low conf.high
##   <chr>                <dbl>     <dbl>     <dbl>     <dbl>    <dbl>     <dbl>
## 1 (Intercept)          4.33     0.0963      15.2 3.29e- 52    3.59      5.23 
## 2 sleepqualitynum      0.564    0.0269     -21.3 2.07e-100    0.535     0.594

model2Tidy %>% ggplot (aes(Odds.Ratio, term, xmin = conf.low,
                           xmax = conf.high, height = 0)) +
                ## inserting a point for the odds ratio
                geom_point() +
                ## drawing a line at the 1 to improve readability
                geom_vline(xintercept = 1, lty = 4) +
                ## drawing an horizontal line to represent the 
                ## confidence interval
                geom_errorbarh() +
                scale_y_discrete(
                        limits = setdiff(model2Tidy$term, "(Intercept)"))

In order to determine which model is best, we’re going to look at the AIC values. It looks like changing the variable to continues increases the AIC, which tells us that the model is behaving worse than the categorical. Let’s keep it categorical, then.

Adjusted Analysis

This analysis checks for the association between the pain variable and the sleepquality variable, adjusting the model for bmi, comorbidities and depression, as was done in the original article. As pain is a binary variable, we’re going to run a logistic regression.

model3 <- glm(data = PPCRdata, formula = pain ~ sleepquality + age + bmi + 
                      depression + alcohol + fracture + cancer + diabetes + 
                      dementia + arthritis + heartdisease + lungdisease,
                family = "binomial")
summary(model3)

## 
## Call:
## glm(formula = pain ~ sleepquality + age + bmi + depression + 
##     alcohol + fracture + cancer + diabetes + dementia + arthritis + 
##     heartdisease + lungdisease, family = "binomial", data = PPCRdata)
## 
## Coefficients:
##                                Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                    0.178116   0.306158   0.582 0.560716    
## sleepqualityBad               -0.358261   0.172220  -2.080 0.037502 *  
## sleepqualityFair              -0.884865   0.165214  -5.356 8.51e-08 ***
## sleepqualityGood              -1.502021   0.161661  -9.291  < 2e-16 ***
## sleepqualityVery good         -1.671866   0.176887  -9.452  < 2e-16 ***
## age                           -0.011630   0.003067  -3.792 0.000150 ***
## bmi                            0.028117   0.005353   5.253 1.50e-07 ***
## depressionYes                  0.309084   0.082688   3.738 0.000186 ***
## alcoholLess than once a month -0.349505   0.097373  -3.589 0.000332 ***
## alcoholOnce a month or more   -0.062550   0.082267  -0.760 0.447056    
## fractureYes                    0.445831   0.243776   1.829 0.067421 .  
## cancerYes                     -0.019709   0.136656  -0.144 0.885322    
## diabetesYes                    0.194192   0.071420   2.719 0.006548 ** 
## dementiaYes                    0.264841   0.216723   1.222 0.221699    
## arthritisYes                   1.228566   0.066383  18.507  < 2e-16 ***
## heartdiseaseYes                0.455709   0.101336   4.497 6.89e-06 ***
## lungdiseaseYes                 0.818769   0.116491   7.029 2.09e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 8872.2  on 6752  degrees of freedom
## Residual deviance: 7803.9  on 6736  degrees of freedom
##   (221 observations deleted due to missingness)
## AIC: 7837.9
## 
## Number of Fisher Scoring iterations: 4

## creating and saving publication-ready table
tbl_pub (model3, tablePath, "model3.docx", labelList[-c(2,14)]) -> tbl3

tbl_merge(list(tbl1,tbl3),
          tab_spanner = c("**Unadjusted**", "**Adjusted**")) -> merged1

## The merging columns (variable name, variable label, row type, and label column)
## are not unique and the merge may fail or result in a malformed table. If you
## previously 'tbl_stack'ed your tables, then 'tbl_merge'ing before you 'tbl_stack'
## may resolve the issue.

if(!file.exists(paste(tablePath,"merged1.docx", sep=""))){
        merged1 %>% as_gt() %>% gt::gtsave(filename = "merged1.docx", path = tablePath)
}

Let’s adjust the results and plot the odds ratios.

model3Tidy <- tidy(model3, conf.int = TRUE, exponentiate = TRUE)
model3Tidy %>% rename(Odds.Ratio = estimate) -> model3Tidy
model3Tidy

## # A tibble: 17 × 7
##    term               Odds.Ratio std.error statistic  p.value conf.low conf.high
##    <chr>                   <dbl>     <dbl>     <dbl>    <dbl>    <dbl>     <dbl>
##  1 (Intercept)             1.19    0.306       0.582 5.61e- 1    0.657     2.18 
##  2 sleepqualityBad         0.699   0.172      -2.08  3.75e- 2    0.497     0.976
##  3 sleepqualityFair        0.413   0.165      -5.36  8.51e- 8    0.297     0.568
##  4 sleepqualityGood        0.223   0.162      -9.29  1.53e-20    0.161     0.305
##  5 sleepqualityVery …      0.188   0.177      -9.45  3.34e-21    0.132     0.265
##  6 age                     0.988   0.00307    -3.79  1.50e- 4    0.982     0.994
##  7 bmi                     1.03    0.00535     5.25  1.50e- 7    1.02      1.04 
##  8 depressionYes           1.36    0.0827      3.74  1.86e- 4    1.16      1.60 
##  9 alcoholLess than …      0.705   0.0974     -3.59  3.32e- 4    0.582     0.852
## 10 alcoholOnce a mon…      0.939   0.0823     -0.760 4.47e- 1    0.799     1.10 
## 11 fractureYes             1.56    0.244       1.83  6.74e- 2    0.969     2.53 
## 12 cancerYes               0.980   0.137      -0.144 8.85e- 1    0.748     1.28 
## 13 diabetesYes             1.21    0.0714      2.72  6.55e- 3    1.06      1.40 
## 14 dementiaYes             1.30    0.217       1.22  2.22e- 1    0.853     2.00 
## 15 arthritisYes            3.42    0.0664     18.5   1.81e-76    3.00      3.89 
## 16 heartdiseaseYes         1.58    0.101       4.50  6.89e- 6    1.29      1.92 
## 17 lungdiseaseYes          2.27    0.116       7.03  2.09e-12    1.81      2.85

## adding the reference terms to the data frame to allow for a better plot
model3Tidy %>% add_row(term = "sleepqualityVery Bad", Odds.Ratio = -99, std.error = 0, statistic = 0, p.value = 0, conf.low = 0, conf.high = 0, .before = 2) %>%
        add_row(term = "alcoholNever", Odds.Ratio = -99, std.error = 0, statistic = 0, p.value = 0, conf.low = 0, conf.high = 0, .before = 10) %>%
        ## filtering the intercept
        filter (term !="(Intercept)") %>%
        ## saving the result so that we can refer to it in the next call
        {. ->> model3Tidy}%>%
        ## adding a variable to determine whether the observation
        ## is a reference or not. Initially adding everything as FALSE
        add_column(.after = "conf.high",
                   reference = rep(FALSE, length.out = nrow(model3Tidy))) %>%
        ## now the appropriate ones to TRUE
        dplyr::mutate(reference = case_when
               (term %in% c("Very Bad", "Never") ~ TRUE,
                                     TRUE ~ FALSE)) %>%
        ## create a Variable column
        dplyr::mutate(.after= "term",
                      Variable = case_when(grepl("sleep", term) ~ "Sleep Quality",
                                           grepl("age", term) ~ "Age",
                                           grepl("bmi", term) ~ "Body Mass Index",
                                           grepl("depression",term) ~ "Depression",
                                           grepl("alcohol",term) ~ "Alcohol",
                                           grepl("fracture", term) ~ "Fracture",
                                           grepl("cancer",term) ~ "Cancer",
                                           grepl("diabetes",term) ~ "Diabetes",
                                           grepl("dementia", term) ~ "Dementia",
                                           grepl("arthritis",term) ~ "Arthritis",
                                           grepl("heart",term) ~ "Heart Disease",
                                           grepl("lung",term) ~ "Lung Disease")) %>%
        ## transform the variables to factor
        dplyr::mutate(Variable = as_factor(Variable)) %>%
        dplyr::mutate(term = as_factor(term)) -> model3Tidy


model3Tidy %>% ggplot (aes(x = Odds.Ratio, y = term, xmin = conf.low,
                           xmax = conf.high, height = 0, alpha = !reference)) +
                ## inserting a point for the odds ratio
                geom_point(aes(alpha = !reference)) +
                ## drawing a line at the 1 to improve readability
                geom_vline(xintercept = 1, lty = 4) +
                ## drawing an horizontal line to represent the 
                ## confidence interval
                geom_errorbarh() +
                scale_y_discrete(labels = labels2) +
                labs(x = "Odds Ratio", y = "",
                     title = "Presence of Pain", 
                subtitle = "Adjusted Odds Ratios",
                caption = paste("N =", nobs(model3))) +
                facet_grid(Variable ~ ., scales = "free_y", space = "free_y") +
                scale_alpha_identity() +
                # hide facet labels
                theme(strip.background = element_blank(),
                strip.text = element_blank()) +
                # remove spacing between facets
                theme(panel.spacing = unit(0, "pt")) +
                scale_x_continuous(breaks = seq(0,3, by = 0.5),
                                   limits = (c(0,2.9))) -> plot3

## checking if plot has been saved already
if (!file.exists("C:/Users/Pedro Henrique/Documents/PPCR/Data Analysis Project/DataProjectPPCR/plots/plot3.png")){
        ## if it hasn't, saving the plot for publication
        ggsave("plot3.png", plot3, 
               path = "C:/Users/Pedro Henrique/Documents/PPCR/Data Analysis Project/DataProjectPPCR/plots") 
}

plot3

If we refer to the analysis of the unadjusted model, the results are the same for the sleepquality variable. In summary, increasing levels of sleep quality are decrease the odds of pain being present.

Sensitivity Analysis

We can also test how the adjusted model behaves with sleep quality as a continuous variable.

model4 <- glm(data = PPCRdata, formula = pain ~ sleepqualitynum + age + bmi + 
                      depression + alcohol + fracture + cancer + diabetes + 
                      dementia + arthritis + heartdisease + lungdisease,
                family = "binomial")
summary(model4)

## 
## Call:
## glm(formula = pain ~ sleepqualitynum + age + bmi + depression + 
##     alcohol + fracture + cancer + diabetes + dementia + arthritis + 
##     heartdisease + lungdisease, family = "binomial", data = PPCRdata)
## 
## Coefficients:
##                                Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                    0.734425   0.282044   2.604 0.009216 ** 
## sleepqualitynum               -0.489627   0.029134 -16.806  < 2e-16 ***
## age                           -0.011787   0.003062  -3.849 0.000119 ***
## bmi                            0.028120   0.005343   5.263 1.42e-07 ***
## depressionYes                  0.304587   0.082189   3.706 0.000211 ***
## alcoholLess than once a month -0.357506   0.097191  -3.678 0.000235 ***
## alcoholOnce a month or more   -0.063880   0.082100  -0.778 0.436524    
## fractureYes                    0.448646   0.242928   1.847 0.064772 .  
## cancerYes                     -0.015393   0.136500  -0.113 0.910215    
## diabetesYes                    0.198541   0.071297   2.785 0.005358 ** 
## dementiaYes                    0.276703   0.216056   1.281 0.200299    
## arthritisYes                   1.234864   0.066287  18.629  < 2e-16 ***
## heartdiseaseYes                0.461404   0.101158   4.561 5.09e-06 ***
## lungdiseaseYes                 0.825247   0.116301   7.096 1.29e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 8872.2  on 6752  degrees of freedom
## Residual deviance: 7818.1  on 6739  degrees of freedom
##   (221 observations deleted due to missingness)
## AIC: 7846.1
## 
## Number of Fisher Scoring iterations: 4

model4Tidy <- tidy(model4, conf.int = TRUE, exponentiate = TRUE)
model4Tidy %>% rename(Odds.Ratio = estimate) -> model4Tidy
model4Tidy

## # A tibble: 14 × 7
##    term               Odds.Ratio std.error statistic  p.value conf.low conf.high
##    <chr>                   <dbl>     <dbl>     <dbl>    <dbl>    <dbl>     <dbl>
##  1 (Intercept)             2.08    0.282       2.60  9.22e- 3    1.20      3.63 
##  2 sleepqualitynum         0.613   0.0291    -16.8   2.21e-63    0.579     0.649
##  3 age                     0.988   0.00306    -3.85  1.19e- 4    0.982     0.994
##  4 bmi                     1.03    0.00534     5.26  1.42e- 7    1.02      1.04 
##  5 depressionYes           1.36    0.0822      3.71  2.11e- 4    1.15      1.59 
##  6 alcoholLess than …      0.699   0.0972     -3.68  2.35e- 4    0.577     0.845
##  7 alcoholOnce a mon…      0.938   0.0821     -0.778 4.37e- 1    0.798     1.10 
##  8 fractureYes             1.57    0.243       1.85  6.48e- 2    0.973     2.53 
##  9 cancerYes               0.985   0.136      -0.113 9.10e- 1    0.752     1.28 
## 10 diabetesYes             1.22    0.0713      2.78  5.36e- 3    1.06      1.40 
## 11 dementiaYes             1.32    0.216       1.28  2.00e- 1    0.864     2.02 
## 12 arthritisYes            3.44    0.0663     18.6   1.87e-77    3.02      3.92 
## 13 heartdiseaseYes         1.59    0.101       4.56  5.09e- 6    1.30      1.93 
## 14 lungdiseaseYes          2.28    0.116       7.10  1.29e-12    1.82      2.87

model4Tidy %>% ggplot (aes(Odds.Ratio, term, xmin = conf.low,
                           xmax = conf.high, height = 0)) +
                ## inserting a point for the odds ratio
                geom_point() +
                ## drawing a line at the 1 to improve readability
                geom_vline(xintercept = 1, lty = 4) +
                ## drawing an horizontal line to represent the 
                ## confidence interval
                geom_errorbarh() +
                scale_y_discrete(
                        limits = setdiff(model4Tidy$term, "(Intercept)"))

The results point towards increases in the continuous variable also being significantly associated with a decrease in the odds of pain being present. Looking at the AIC, it still seems as if the model with sleepquality coded as categorical behaves better.

Testing the assumptions of the model

At this point, I haven’t yet learned how to properly test for the assumptions of the logistic regression model. On a later opportunity, I’ll try to remedy this situation. For now, I’ll include some commands that were listed as useful in the R you ready for R? e-book.

library(blorr)
blr_model_fit_stats(model3)

##                               Model Fit Statistics                                
## ---------------------------------------------------------------------------------
## Log-Lik Intercept Only:     -4596.949    Log-Lik Full Model:            -3901.953 
## Deviance(6736):              7803.906    LR(16):                         1389.993 
##                                          Prob > LR:                         0.000 
## MCFadden's R2                   0.151    McFadden's Adj R2:                 0.147 
## ML (Cox-Snell) R2:              0.181    Cragg-Uhler(Nagelkerke) R2:        0.247 
## McKelvey & Zavoina's R2:        0.194    Efron's R2:                        0.154 
## Count R2:                       0.701    Adj Count R2:                      0.182 
## BIC:                         7953.807    AIC:                            7837.906 
## ---------------------------------------------------------------------------------

blr_test_hosmer_lemeshow(model3)

##            Partition for the Hosmer & Lemeshow Test            
## --------------------------------------------------------------
##                         def = 1                 def = 0        
## Group    Total    Observed    Expected    Observed    Expected 
## --------------------------------------------------------------
##   1       676       107        107.90       569        568.10  
##   2       675       107        128.45       568        546.55  
##   3       675       134        142.86       541        532.14  
##   4       675       162        160.29       513        514.71  
##   5       676       210        188.13       466        487.87  
##   6       675       215        226.35       460        448.65  
##   7       675       313        272.65       362        402.35  
##   8       675       323        327.53       352        347.47  
##   9       675       389        398.34       286        276.66  
##  10       676       513        520.51       163        155.49  
## --------------------------------------------------------------
## 
##      Goodness of Fit Test      
## ------------------------------
## Chi-Square    DF    Pr > ChiSq 
## ------------------------------
##  20.6763      8       0.0081   
## ------------------------------

blr_roc_curve(blr_gains_table(model3))

car::vif(model3)

##                  GVIF Df GVIF^(1/(2*Df))
## sleepquality 1.044858  4        1.005500
## age          1.080334  1        1.039391
## bmi          1.037185  1        1.018423
## depression   1.047143  1        1.023300
## alcohol      1.044415  2        1.010923
## fracture     1.006378  1        1.003184
## cancer       1.008435  1        1.004208
## diabetes     1.033202  1        1.016465
## dementia     1.012373  1        1.006168
## arthritis    1.026991  1        1.013406
## heartdisease 1.021015  1        1.010453
## lungdisease  1.004299  1        1.002147

Association between sleepquality and pain

Ordinal Regression

Adjusted Analysis

In order to do this analysis I’ll try to follow the instructions from the UCLA OARC - Ordinal Logistic Regression | R Data Analysis Example.

As the original article by Montelhão et al tested for bidirectional associations, let’s try to replicate that as well. In this case, the dependent variable is ordinal, therefore we will run an ordinal logistical regression.

## creating an ordinal version of sleepquality

PPCRdata %>% mutate (sleepqualityord = factor(PPCRdata$sleepquality,
                                              ordered = TRUE)) -> PPCRdata

## loading the library required to run the model
library(MASS)

## running the model
model5 <- polr (data = PPCRdata, formula = sleepqualityord ~ pain + depression + 
                       alcohol + fracture + cancer + diabetes + dementia +
                       arthritis + heartdisease + lungdisease, Hess = TRUE)
summary(model5)

## Call:
## polr(formula = sleepqualityord ~ pain + depression + alcohol + 
##     fracture + cancer + diabetes + dementia + arthritis + heartdisease + 
##     lungdisease, data = PPCRdata, Hess = TRUE)
## 
## Coefficients:
##                                  Value Std. Error  t value
## painYes                       -0.85302    0.05024 -16.9801
## depressionYes                 -0.82700    0.06790 -12.1801
## alcoholLess than once a month  0.07362    0.07643   0.9633
## alcoholOnce a month or more    0.06820    0.06710   1.0165
## fractureYes                   -0.36430    0.20135  -1.8093
## cancerYes                     -0.01869    0.11499  -0.1625
## diabetesYes                   -0.11152    0.05947  -1.8752
## dementiaYes                   -0.90439    0.17339  -5.2160
## arthritisYes                  -0.30678    0.05771  -5.3158
## heartdiseaseYes               -0.28727    0.08436  -3.4051
## lungdiseaseYes                -0.14181    0.09885  -1.4347
## 
## Intercepts:
##                Value    Std. Error t value 
## Very bad|Bad    -4.1364   0.0780   -53.0304
## Bad|Fair        -2.3132   0.0473   -48.9539
## Fair|Good       -0.9812   0.0383   -25.6396
## Good|Very good   1.5042   0.0430    34.9433
## 
## Residual Deviance: 17569.05 
## AIC: 17599.05 
## (98 observations deleted due to missingness)

## tidy(model5, exponentiate = TRUE, p.values = TRUE, conf.int = TRUE)

The tidy function is not working for the polr model, even though it should. I tried to debug it, but couldn’t. I will then describe a more manual approach, as was described in the UCLA Office of Advanced Research Computing (OARC) page for Ordinal Regression.

## getting the coefficients
ctable <- coef(summary(model5))

## getting the respective p-values from the normal distribution
p <- pnorm(abs(ctable[, "t value"]), lower.tail = FALSE) * 2

## creating a table with the respective p-values added
ctable <- cbind(ctable, "p value" = p)
ctable

##                                     Value Std. Error     t value       p value
## painYes                       -0.85301696 0.05023641 -16.9800548  1.153784e-64
## depressionYes                 -0.82699798 0.06789738 -12.1801156  3.967194e-34
## alcoholLess than once a month  0.07362053 0.07642528   0.9633007  3.353966e-01
## alcoholOnce a month or more    0.06820388 0.06709866   1.0164715  3.094049e-01
## fractureYes                   -0.36430180 0.20135407  -1.8092596  7.041068e-02
## cancerYes                     -0.01869110 0.11498945  -0.1625462  8.708757e-01
## diabetesYes                   -0.11151556 0.05946946  -1.8751736  6.076884e-02
## dementiaYes                   -0.90439261 0.17338723  -5.2160278  1.828008e-07
## arthritisYes                  -0.30678023 0.05771145  -5.3157605  1.062128e-07
## heartdiseaseYes               -0.28726601 0.08436256  -3.4051363  6.613103e-04
## lungdiseaseYes                -0.14181420 0.09884818  -1.4346668  1.513821e-01
## Very bad|Bad                  -4.13641304 0.07800080 -53.0303894  0.000000e+00
## Bad|Fair                      -2.31320370 0.04725268 -48.9539177  0.000000e+00
## Fair|Good                     -0.98124293 0.03827055 -25.6396324 5.518018e-145
## Good|Very good                 1.50417355 0.04304618  34.9432561 1.639373e-267

## creating confidence intervals
ci<-confint.default(model5)
colnames(ci) <- c("conf.low", "conf.high")

## exponentiating the log-odds and the confidence intervals
## in order to create the odds-ratios and the respective CI's
model5OddsRatio <- exp(cbind(OR = coef(model5), ci))
## converting rownames to columns, as this will be necessary to build the plot
model5OddsRatio <- as_tibble(rownames_to_column(as.data.frame(model5OddsRatio)))
model5OddsRatio %>% rename(term = rowname) -> model5OddsRatio

Let’s build a plot using these values.

model5OddsRatio %>% ggplot (aes(OR, term, xmin = conf.low,
                           xmax = conf.high, height = 0)) +
                geom_point() +
                geom_vline(xintercept = 1, lty = 4) +
                geom_errorbarh()

The plot allows us to visually determine that the presence of pain decreases the odds of sleepquality being higher. Unfortunately, the model has a high AIC.

Unadjusted Analysis

In order to be able to compare the adjusted analysis, let’s proceed with one that is unadjusted as well.

## this code is equivalent to the one used for the adjusted analysis, just 
## with no covariates added to the model
model6 <- polr (data = PPCRdata, formula = sleepqualityord ~ pain, Hess = TRUE)

## Warning in polr(data = PPCRdata, formula = sleepqualityord ~ pain, Hess =
## TRUE): design appears to be rank-deficient, so dropping some coefs

summary(model6)

## Call:
## polr(formula = sleepqualityord ~ pain, data = PPCRdata, Hess = TRUE)
## 
## Coefficients:
##          Value Std. Error t value
## painYes -1.023    0.04753  -21.53
## 
## Intercepts:
##                Value    Std. Error t value 
## Very bad|Bad    -3.8629   0.0717   -53.8797
## Bad|Fair        -2.1040   0.0402   -52.2757
## Fair|Good       -0.8107   0.0313   -25.9252
## Good|Very good   1.6254   0.0379    42.8842
## 
## Residual Deviance: 18102.95 
## AIC: 18112.95 
## (6 observations deleted due to missingness)

## getting the coefficients
ctable2 <- coef(summary(model6))

## getting the respective p-values from the normal distribution
p <- pnorm(abs(ctable2[, "t value"]), lower.tail = FALSE) * 2

## creating a table with the respective p-values added
ctable2 <- cbind(ctable2, "p value" = p)

## creating confidence intervals
ci<-confint.default(model6)
colnames(ci) <- c("conf.low", "conf.high")

## exponentiating the log-odds and the confidence intervals
## in order to create the odds-ratios and the respective CI's
model6OddsRatio <- exp(cbind(OR = coef(model6), ci))
## converting rownames to columns, as this will be necessary to build the plot
model6OddsRatio <- as_tibble(rownames_to_column(as.data.frame(model6OddsRatio)))
model6OddsRatio %>% rename(term = rowname) -> model6OddsRatio
model6OddsRatio

## # A tibble: 1 × 4
##   term       OR conf.low conf.high
##   <chr>   <dbl>    <dbl>     <dbl>
## 1 painYes 0.359    0.327     0.395

## building a plot
model6OddsRatio %>% ggplot (aes(OR, term, xmin = conf.low,
                           xmax = conf.high, height = 0)) +
                geom_point() +
                geom_vline(xintercept = 1, lty = 4) +
                geom_errorbarh()

We need to check what the warning means, but the presence of pain reduces the odds of increased sleep quality in this model. The AIC is huge, as should be expected of a model that only has one predictor and that predictor is a dichotomous variable.

Testing for the Proportional Odds Assumption

This is something to consider further on. At this point, I haven’t yet figured out exactly how to do it, but I’ll try to follow the instructions from the Office of Advanced Research Computing on another day.

## Loading the library that is needed to conduct the visual analysis of the
## proportional odds assumption

library (Hmisc)

## 
## Attaching package: 'Hmisc'

## The following objects are masked from 'package:dplyr':
## 
##     src, summarize

## The following objects are masked from 'package:base':
## 
##     format.pval, units

sf <- function(y) {
  c('Y>=1' = qlogis(mean(y >= 1)),
    'Y>=2' = qlogis(mean(y >= 2)),
    'Y>=3' = qlogis(mean(y >= 3)),
    'Y>=4' = qlogis(mean(y >= 4)),
    'Y>=5' = qlogis(mean(y >= 5)),
    'Y>=6' = qlogis(mean(y >= 6)))
}

(s <- with(PPCRdata, summary(as.numeric(sleepqualityord) ~ pain + depression + 
                       alcohol + fracture + cancer + diabetes + dementia +
                       arthritis + heartdisease + lungdisease, fun=sf)))

## as.numeric(sleepqualityord)      N= 6968 , 6 Missing 
## 
## +------------+-------------------------+----+----+--------+---------+-----------+---------+----+
## |            |                         |   N|Y>=1|    Y>=2|     Y>=3|       Y>=4|     Y>=5|Y>=6|
## +------------+-------------------------+----+----+--------+---------+-----------+---------+----+
## |        pain|                       No|4389| Inf|4.122284|2.2407601| 0.83398814|-1.708658|-Inf|
## |            |                      Yes|2579| Inf|2.702686|0.9913397|-0.20623388|-2.347771|-Inf|
## |            |           Does not apply|   0|    |        |         |           |         |    |
## |            |Didn’t know/didn’t answer|   0|    |        |         |           |         |    |
## +------------+-------------------------+----+----+--------+---------+-----------+---------+----+
## |  depression|                       No|6058| Inf|3.626631|1.8714213| 0.54034089|-1.803168|-Inf|
## |            |                      Yes| 899| Inf|2.411318|0.6418539|-0.29807178|-3.066889|-Inf|
## |            |                  Missing|  11| Inf|2.302585|1.5040774| 0.18232156|-2.302585|-Inf|
## +------------+-------------------------+----+----+--------+---------+-----------+---------+----+
## |     alcohol|                    Never|5323| Inf|3.363997|1.5908421| 0.37907588|-1.937990|-Inf|
## |            |   Less than once a month| 680| Inf|4.019382|1.9459101| 0.61903921|-1.973077|-Inf|
## |            |     Once a month or more| 952| Inf|3.075775|1.8265024| 0.57408782|-1.741207|-Inf|
## |            |                  Missing|  13| Inf|     Inf|1.7047481|-0.15415068|-1.203973|-Inf|
## +------------+-------------------------+----+----+--------+---------+-----------+---------+----+
## |    fracture|                       No|6879| Inf|3.382288|1.6654277| 0.43438220|-1.905168|-Inf|
## |            |                      Yes|  89| Inf|2.627081|0.8850382|-0.11247798|-2.460809|-Inf|
## +------------+-------------------------+----+----+--------+---------+-----------+---------+----+
## |      cancer|                       No|6672| Inf|3.411938|1.6677603| 0.43352306|-1.914758|-Inf|
## |            |                      Yes| 292| Inf|2.665033|1.3523928| 0.28968008|-1.811881|-Inf|
## |            |                  Missing|   4| Inf|     Inf|1.0986123| 0.00000000|     -Inf|-Inf|
## +------------+-------------------------+----+----+--------+---------+-----------+---------+----+
## |    diabetes|                       No|5682| Inf|3.397204|1.7253825| 0.48294991|-1.888043|-Inf|
## |            |                      Yes|1256| Inf|3.225520|1.3576787| 0.20131409|-2.020781|-Inf|
## |            |                  Missing|  30| Inf|     Inf|1.8718022|-0.40546511|-1.871802|-Inf|
## +------------+-------------------------+----+----+--------+---------+-----------+---------+----+
## |    dementia|                       No|6848| Inf|3.414865|1.6819751| 0.45021038|-1.896113|-Inf|
## |            |                      Yes| 115| Inf|2.060023|0.4784902|-0.91021169|-3.323236|-Inf|
## |            |                  Missing|   5| Inf|1.386294|0.4054651|-0.40546511|     -Inf|-Inf|
## +------------+-------------------------+----+----+--------+---------+-----------+---------+----+
## |   arthritis|                       No|5465| Inf|3.609603|1.8539727| 0.58442952|-1.835344|-Inf|
## |            |                      Yes|1466| Inf|2.813170|1.0931627|-0.10651259|-2.232586|-Inf|
## |            |                  Missing|  37| Inf|1.856298|0.7339692|-0.49643689|-2.110213|-Inf|
## +------------+-------------------------+----+----+--------+---------+-----------+---------+----+
## |heartdisease|                       No|6412| Inf|3.451517|1.7161281| 0.47010500|-1.894850|-Inf|
## |            |                      Yes| 551| Inf|2.690759|1.0484225|-0.05445992|-2.102100|-Inf|
## |            |                  Missing|   5| Inf|     Inf|      Inf| 0.40546511|     -Inf|-Inf|
## +------------+-------------------------+----+----+--------+---------+-----------+---------+----+
## | lungdisease|                       No|6571| Inf|3.420717|1.6849207| 0.45543727|-1.901268|-Inf|
## |            |                      Yes| 392| Inf|2.730029|1.1962508|-0.03061464|-2.068013|-Inf|
## |            |                  Missing|   5| Inf|     Inf|1.3862944| 0.40546511|     -Inf|-Inf|
## +------------+-------------------------+----+----+--------+---------+-----------+---------+----+
## |     Overall|                         |6968| Inf|3.368484|1.6527710| 0.42715949|-1.910944|-Inf|
## +------------+-------------------------+----+----+--------+---------+-----------+---------+----+

s[, 6] <- s[, 6] - s[, 5]
s[, 5] <- s[, 5] - s[, 4]
s[, 4] <- s[, 4] - s[, 3]
s[, 3] <- s[, 3] - s[, 3]

s

## as.numeric(sleepqualityord)      N= 6968 , 6 Missing 
## 
## +------------+-------------------------+----+----+----+----------+----------+---------+----+
## |            |                         |   N|Y>=1|Y>=2|      Y>=3|      Y>=4|     Y>=5|Y>=6|
## +------------+-------------------------+----+----+----+----------+----------+---------+----+
## |        pain|                       No|4389| Inf|   0|-1.8815238|-1.4067720|-2.542646|-Inf|
## |            |                      Yes|2579| Inf|   0|-1.7113463|-1.1975736|-2.141537|-Inf|
## |            |           Does not apply|   0|    |    |          |          |         |    |
## |            |Didn’t know/didn’t answer|   0|    |    |          |          |         |    |
## +------------+-------------------------+----+----+----+----------+----------+---------+----+
## |  depression|                       No|6058| Inf|   0|-1.7552100|-1.3310804|-2.343509|-Inf|
## |            |                      Yes| 899| Inf|   0|-1.7694644|-0.9399257|-2.768818|-Inf|
## |            |                  Missing|  11| Inf|   0|-0.7985077|-1.3217558|-2.484907|-Inf|
## +------------+-------------------------+----+----+----+----------+----------+---------+----+
## |     alcohol|                    Never|5323| Inf|   0|-1.7731550|-1.2117662|-2.317066|-Inf|
## |            |   Less than once a month| 680| Inf|   0|-2.0734714|-1.3268709|-2.592116|-Inf|
## |            |     Once a month or more| 952| Inf|   0|-1.2492726|-1.2524146|-2.315295|-Inf|
## |            |                  Missing|  13| Inf|    |      -Inf|-1.8588988|-1.049822|-Inf|
## +------------+-------------------------+----+----+----+----------+----------+---------+----+
## |    fracture|                       No|6879| Inf|   0|-1.7168605|-1.2310455|-2.339551|-Inf|
## |            |                      Yes|  89| Inf|   0|-1.7420430|-0.9975162|-2.348331|-Inf|
## +------------+-------------------------+----+----+----+----------+----------+---------+----+
## |      cancer|                       No|6672| Inf|   0|-1.7441774|-1.2342372|-2.348281|-Inf|
## |            |                      Yes| 292| Inf|   0|-1.3126400|-1.0627127|-2.101561|-Inf|
## |            |                  Missing|   4| Inf|    |      -Inf|-1.0986123|     -Inf|-Inf|
## +------------+-------------------------+----+----+----+----------+----------+---------+----+
## |    diabetes|                       No|5682| Inf|   0|-1.6718214|-1.2424326|-2.370993|-Inf|
## |            |                      Yes|1256| Inf|   0|-1.8678417|-1.1563646|-2.222095|-Inf|
## |            |                  Missing|  30| Inf|    |      -Inf|-2.2772673|-1.466337|-Inf|
## +------------+-------------------------+----+----+----+----------+----------+---------+----+
## |    dementia|                       No|6848| Inf|   0|-1.7328899|-1.2317647|-2.346323|-Inf|
## |            |                      Yes| 115| Inf|   0|-1.5815332|-1.3887019|-2.413024|-Inf|
## |            |                  Missing|   5| Inf|   0|-0.9808293|-0.8109302|     -Inf|-Inf|
## +------------+-------------------------+----+----+----+----------+----------+---------+----+
## |   arthritis|                       No|5465| Inf|   0|-1.7556306|-1.2695431|-2.419773|-Inf|
## |            |                      Yes|1466| Inf|   0|-1.7200070|-1.1996753|-2.126073|-Inf|
## |            |                  Missing|  37| Inf|   0|-1.1223288|-1.2304061|-1.613776|-Inf|
## +------------+-------------------------+----+----+----+----------+----------+---------+----+
## |heartdisease|                       No|6412| Inf|   0|-1.7353891|-1.2460231|-2.364955|-Inf|
## |            |                      Yes| 551| Inf|   0|-1.6423362|-1.1028825|-2.047640|-Inf|
## |            |                  Missing|   5| Inf|    |          |      -Inf|     -Inf|-Inf|
## +------------+-------------------------+----+----+----+----------+----------+---------+----+
## | lungdisease|                       No|6571| Inf|   0|-1.7357965|-1.2294834|-2.356706|-Inf|
## |            |                      Yes| 392| Inf|   0|-1.5337783|-1.2268654|-2.037398|-Inf|
## |            |                  Missing|   5| Inf|    |      -Inf|-0.9808293|     -Inf|-Inf|
## +------------+-------------------------+----+----+----+----------+----------+---------+----+
## |     Overall|                         |6968| Inf|   0|-1.7157131|-1.2256115|-2.338103|-Inf|
## +------------+-------------------------+----+----+----+----------+----------+---------+----+

plot(s, which=1:6, xlab='logit', main=' ', xlim=c(-2.8,0))

It doesn’t look like the proportional odds assumption holds.

Logistic Regression

In order to conduct a logistic regression using sleep quality as a dependent variable, we need to first convert it to a dichotomous variable. Let’s consider that “Very Bad” and “Bad” are failures and “Fair”, “Good” and “Very Good” are successes.

PPCRdata %>% mutate (sleepqualitydich = case_when(
                        sleepquality %in% c("Very bad", "Bad") ~ 0,
                        sleepquality %in% c("Fair", "Good", "Very good") ~ 1
                                                )
                     ) -> PPCRdata

head(PPCRdata$sleepqualitydich)

## [1] 0 1 1 1 1 1

summary(PPCRdata$sleepqualitydich)

##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##  0.0000  1.0000  1.0000  0.8393  1.0000  1.0000       6

Unadjusted Analysis

Let’s proceed with the unadjusted analysis with sleepquality as a dichotomous variable.

model7 <- glm(sleepqualitydich ~ pain, "binomial", data = PPCRdata)
summary(model7)

## 
## Call:
## glm(formula = sleepqualitydich ~ pain, family = "binomial", data = PPCRdata)
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  2.24076    0.05120   43.77   <2e-16 ***
## painYes     -1.24942    0.06772  -18.45   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 6144.2  on 6967  degrees of freedom
## Residual deviance: 5790.3  on 6966  degrees of freedom
##   (6 observations deleted due to missingness)
## AIC: 5794.3
## 
## Number of Fisher Scoring iterations: 4

## creating and saving table for publication
tbl_pub (model7, tablePath, "model7.docx", labelList[2]) -> tbl7

model7Tidy <- tidy(model7, conf.int = TRUE, exponentiate = TRUE)
model7Tidy %>% rename(Odds.Ratio = estimate) -> model7Tidy
model7Tidy

## # A tibble: 2 × 7
##   term        Odds.Ratio std.error statistic  p.value conf.low conf.high
##   <chr>            <dbl>     <dbl>     <dbl>    <dbl>    <dbl>     <dbl>
## 1 (Intercept)      9.40     0.0512      43.8 0           8.51     10.4  
## 2 painYes          0.287    0.0677     -18.5 5.12e-76    0.251     0.327

model7Tidy %>% ggplot (aes(Odds.Ratio, term, xmin = conf.low,
                           xmax = conf.high, height = 0)) +
                ## inserting a point for the odds ratio
                geom_point() +
                ## drawing a line at the 1 to improve readability
                geom_vline(xintercept = 1, lty = 4) +
                ## drawing an horizontal line to represent the 
                ## confidence interval
                geom_errorbarh() +
                scale_y_discrete(
                        limits = setdiff(model7Tidy$term, "(Intercept)"))

The presence of pain is associated with decreased odds of a success in sleep quality that is “Fair” or better.

Adjusted Analysis

model8 <- glm(sleepqualitydich ~ pain + age + bmi + depression + alcohol +
                      fracture + cancer + diabetes + dementia + arthritis +
                      heartdisease + lungdisease, "binomial", data = PPCRdata)
summary(model8)

## 
## Call:
## glm(formula = sleepqualitydich ~ pain + age + bmi + depression + 
##     alcohol + fracture + cancer + diabetes + dementia + arthritis + 
##     heartdisease + lungdisease, family = "binomial", data = PPCRdata)
## 
## Coefficients:
##                                Estimate Std. Error z value Pr(>|z|)    
## (Intercept)                    1.757899   0.336922   5.218 1.81e-07 ***
## painYes                       -1.044460   0.073726 -14.167  < 2e-16 ***
## age                            0.002957   0.003879   0.762 0.445873    
## bmi                            0.019638   0.006735   2.916 0.003546 ** 
## depressionYes                 -1.016393   0.086296 -11.778  < 2e-16 ***
## alcoholLess than once a month  0.166445   0.128151   1.299 0.194006    
## alcoholOnce a month or more    0.102168   0.108679   0.940 0.347173    
## fractureYes                   -0.521306   0.255911  -2.037 0.041644 *  
## cancerYes                     -0.095977   0.163413  -0.587 0.556984    
## diabetesYes                   -0.221428   0.086834  -2.550 0.010772 *  
## dementiaYes                   -0.721659   0.216245  -3.337 0.000846 ***
## arthritisYes                  -0.332851   0.080613  -4.129 3.64e-05 ***
## heartdiseaseYes               -0.384079   0.113976  -3.370 0.000752 ***
## lungdiseaseYes                -0.137382   0.133866  -1.026 0.304769    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 5929.9  on 6752  degrees of freedom
## Residual deviance: 5372.7  on 6739  degrees of freedom
##   (221 observations deleted due to missingness)
## AIC: 5400.7
## 
## Number of Fisher Scoring iterations: 5

## creating and saving table for publication
tbl_pub (model8, tablePath, "model8.docx", labelList[-c(1,14)]) -> tbl8

## creating merged table for adjusted and unadjusted analysis together
tbl_merge(list(tbl7,tbl8),
          tab_spanner = c("**Unadjusted**", "**Adjusted**")) -> merged2

if(!file.exists(paste(tablePath,"merged2.docx", sep=""))){
        merged2 %>% as_gt() %>% gt::gtsave(filename = "merged2.docx", path = tablePath)
}

## creating a table that has the four models together
tbl_merge(list(merged1,merged2)) %>%
        as_gt() %>% gt::tab_spanner(label = gt::md("**Unadjusted**"),
                                    columns = matches("_1_1"),
                                    replace = TRUE,
                                    level = 1,
                                    id = "unadjusted1") %>%
        gt::tab_spanner(label = gt::md("**Adjusted**"),
                        columns = matches("_2_1"),
                        replace = TRUE,
                        level = 1,
                        id = "adjusted1") %>%
        gt::tab_spanner(label = gt::md("**Unadjusted**"),
                                    columns = matches("_1_2"),
                                    replace = TRUE,
                                    level = 1,
                                    id = "unadjusted2") %>%
        gt::tab_spanner(label = gt::md("**Adjusted**"),
                        columns = matches("_2_2"),
                        replace = TRUE,
                        level = 1,
                        id = "adjusted2") %>%
        gt::tab_spanner(label = gt::md("**Outcome: Pain**"),
                        spanners = c("unadjusted1","adjusted1"),
                        level = 2) %>%
        gt::tab_spanner(label = gt::md("**Outcome: Sleep Quality**"),
                        spanners = c("unadjusted2", "adjusted2"),
                        level = 2) -> merged3

## The merging columns (variable name, variable label, row type, and label column)
## are not unique and the merge may fail or result in a malformed table. If you
## previously 'tbl_stack'ed your tables, then 'tbl_merge'ing before you 'tbl_stack'
## may resolve the issue.

if(!file.exists(paste(tablePath,"merged3.docx", sep=""))){
        merged3 %>% gt::gtsave(filename = "merged3.docx", path = tablePath)
}

merged3

	Outcome: Pain						Outcome: Sleep Quality
Variable	Unadjusted			Adjusted			Unadjusted			Adjusted
Variable	OR¹	95% CI¹	p-value	OR¹	95% CI¹	p-value	OR¹	95% CI¹	p-value	OR¹	95% CI¹	p-value
Sleep Quality
Very bad	—	—		—	—
Very bad	—	—		—	—
Very bad	—	—		—	—
Very bad	—	—		—	—
Bad	0.66	0.48, 0.89	0.008	0.70	0.50, 0.98	0.038
Fair	0.34	0.26, 0.46	<0.001	0.41	0.30, 0.57	<0.001
Good	0.17	0.13, 0.22	<0.001	0.22	0.16, 0.30	<0.001
Very good	0.14	0.10, 0.20	<0.001	0.19	0.13, 0.26	<0.001
Age				0.99	0.98, 0.99	<0.001				1.00	1.00, 1.01	0.4
Body Mass Index				1.03	1.02, 1.04	<0.001				1.02	1.01, 1.03	0.004
Depression
No				—	—					—	—
Yes				1.36	1.16, 1.60	<0.001				0.36	0.31, 0.43	<0.001
Alcohol
Never				—	—					—	—
Less than once a month				0.71	0.58, 0.85	<0.001				1.18	0.92, 1.53	0.2
Once a month or more				0.94	0.80, 1.10	0.4				1.11	0.90, 1.38	0.3
Fracture
No				—	—					—	—
Yes				1.56	0.97, 2.53	0.067				0.59	0.36, 1.0	0.042
Cancer
No				—	—					—	—
Yes				0.98	0.75, 1.28	0.9				0.91	0.66, 1.26	0.6
Diabetes
No				—	—					—	—
Yes				1.21	1.06, 1.40	0.007				0.80	0.68, 0.95	0.011
Dementia
No				—	—					—	—
Yes				1.30	0.85, 2.00	0.2				0.49	0.32, 0.75	<0.001
Arthritis
No				—	—					—	—
Yes				3.42	3.00, 3.89	<0.001				0.72	0.61, 0.84	<0.001
Heart Disease
No				—	—					—	—
Yes				1.58	1.29, 1.92	<0.001				0.68	0.55, 0.85	<0.001
Lung Disease
No				—	—					—	—
Yes				2.27	1.81, 2.85	<0.001				0.87	0.67, 1.14	0.3
Pain
No							—	—		—	—
Yes							0.29	0.25, 0.33	<0.001	0.35	0.30, 0.41	<0.001
¹ OR = Odds Ratio, CI = Confidence Interval

model8Tidy <- tidy(model8, conf.int = TRUE, exponentiate = TRUE)
model8Tidy %>% rename(Odds.Ratio = estimate) -> model8Tidy
model8Tidy

## # A tibble: 14 × 7
##    term               Odds.Ratio std.error statistic  p.value conf.low conf.high
##    <chr>                   <dbl>     <dbl>     <dbl>    <dbl>    <dbl>     <dbl>
##  1 (Intercept)             5.80    0.337       5.22  1.81e- 7    3.00     11.2  
##  2 painYes                 0.352   0.0737    -14.2   1.47e-45    0.304     0.406
##  3 age                     1.00    0.00388     0.762 4.46e- 1    0.995     1.01 
##  4 bmi                     1.02    0.00673     2.92  3.55e- 3    1.01      1.03 
##  5 depressionYes           0.362   0.0863    -11.8   5.06e-32    0.306     0.429
##  6 alcoholLess than …      1.18    0.128       1.30  1.94e- 1    0.923     1.53 
##  7 alcoholOnce a mon…      1.11    0.109       0.940 3.47e- 1    0.898     1.38 
##  8 fractureYes             0.594   0.256      -2.04  4.16e- 2    0.364     0.995
##  9 cancerYes               0.908   0.163      -0.587 5.57e- 1    0.664     1.26 
## 10 diabetesYes             0.801   0.0868     -2.55  1.08e- 2    0.677     0.951
## 11 dementiaYes             0.486   0.216      -3.34  8.46e- 4    0.319     0.747
## 12 arthritisYes            0.717   0.0806     -4.13  3.64e- 5    0.613     0.840
## 13 heartdiseaseYes         0.681   0.114      -3.37  7.52e- 4    0.546     0.854
## 14 lungdiseaseYes          0.872   0.134      -1.03  3.05e- 1    0.673     1.14

## adding the reference terms to the data frame to allow for a better plot
model8Tidy %>% 
        add_row(term = "alcoholNever", Odds.Ratio = -99, std.error = 0, statistic = 0, p.value = 0, conf.low = 0, conf.high = 0, .before = 10) %>%
        ## filtering the intercept
        filter (term !="(Intercept)") %>%
        ## saving the result so that we can refer to it in the next call
        {. ->> model8Tidy}%>%
        ## adding a variable to determine whether the observation
        ## is a reference or not. Initially adding everything as FALSE
        add_column(.after = "conf.high",
                   reference = rep(FALSE, length.out = nrow(model8Tidy))) %>%
        ## now the appropriate ones to TRUE
        dplyr::mutate(reference = case_when
               (term %in% c("Never") ~ TRUE,
                                     TRUE ~ FALSE)) %>%
        ## create a Variable column
        dplyr::mutate(.after= "term",
                      Variable = case_when(grepl("pain", term) ~ "Pain",
                                           grepl("age", term) ~ "Age",
                                           grepl("bmi", term) ~ "Body Mass Index",
                                           grepl("depression",term) ~ "Depression",
                                           grepl("alcohol",term) ~ "Alcohol",
                                           grepl("fracture", term) ~ "Fracture",
                                           grepl("cancer",term) ~ "Cancer",
                                           grepl("diabetes",term) ~ "Diabetes",
                                           grepl("dementia", term) ~ "Dementia",
                                           grepl("arthritis",term) ~ "Arthritis",
                                           grepl("heart",term) ~ "Heart Disease",
                                           grepl("lung",term) ~ "Lung Disease")) %>%
        ## transform the variables to factor
        dplyr::mutate(Variable = as_factor(Variable)) %>%
        dplyr::mutate(term = as_factor(term)) -> model8Tidy


model8Tidy %>% ggplot (aes(x = Odds.Ratio, y = term, xmin = conf.low,
                           xmax = conf.high, height = 0, alpha = !reference)) +
                ## inserting a point for the odds ratio
                geom_point(aes(alpha = !reference)) +
                ## drawing a line at the 1 to improve readability
                geom_vline(xintercept = 1, lty = 4) +
                ## drawing an horizontal line to represent the 
                ## confidence interval
                geom_errorbarh() +
                scale_y_discrete(labels = labels2) +
                labs(x = "Odds Ratio", y = "",
                     title = "Sleep Quality being Fair or Better", 
                subtitle = "Adjusted Odds Ratios",
                caption = paste("N =", nobs(model8))) +
                facet_grid(Variable ~ ., scales = "free_y", space = "free_y") +
                scale_alpha_identity() +
                # hide facet labels
                theme(strip.background = element_blank(),
                strip.text = element_blank()) +
                # remove spacing between facets
                theme(panel.spacing = unit(0, "pt")) +
                scale_x_continuous(breaks = seq(0,1.8, by = 0.2),
                                   limits = (c(0.3,1.53))) -> plot8

## checking if plot has been saved already
if (!file.exists("C:/Users/Pedro Henrique/Documents/PPCR/Data Analysis Project/DataProjectPPCR/plots/plot8.png")){
        ## if it hasn't, saving the plot for publication
        ggsave("plot8.png", plot8, 
               path = "C:/Users/Pedro Henrique/Documents/PPCR/Data Analysis Project/DataProjectPPCR/plots") 
}

plot8

Testing the assumptions

Sensitivity Analysis

Inclusion of Sex as a Covariate

The analysis done by Morelhão et al. didn’t include sex as a covariate. We can analyze with it included to see its effect.

model9 <- glm(sleepqualitydich ~ pain + age + bmi + depression + alcohol +
         fracture + cancer + diabetes + dementia + arthritis +
         heartdisease + lungdisease + sex, "binomial", data = PPCRdata)

model9 %>% tidy(conf.int = TRUE, exponentiate = TRUE) %>% 
        rename(Odds.Ratio = estimate) -> model9Tidy

model9Tidy

## # A tibble: 15 × 7
##    term               Odds.Ratio std.error statistic  p.value conf.low conf.high
##    <chr>                   <dbl>     <dbl>     <dbl>    <dbl>    <dbl>     <dbl>
##  1 (Intercept)             4.76    0.340      4.60   4.29e- 6    2.45      9.27 
##  2 painYes                 0.358   0.0739   -13.9    7.31e-44    0.310     0.414
##  3 age                     1.00    0.00388    0.700  4.84e- 1    0.995     1.01 
##  4 bmi                     1.02    0.00677    3.35   7.94e- 4    1.01      1.04 
##  5 depressionYes           0.382   0.0871   -11.1    2.08e-28    0.322     0.453
##  6 alcoholLess than …      1.11    0.129      0.809  4.18e- 1    0.866     1.44 
##  7 alcoholOnce a mon…      0.995   0.112     -0.0417 9.67e- 1    0.802     1.24 
##  8 fractureYes             0.632   0.256     -1.79   7.29e- 2    0.387     1.06 
##  9 cancerYes               0.905   0.164     -0.608  5.43e- 1    0.662     1.26 
## 10 diabetesYes             0.804   0.0870    -2.50   1.23e- 2    0.679     0.955
## 11 dementiaYes             0.466   0.217     -3.52   4.38e- 4    0.306     0.718
## 12 arthritisYes            0.746   0.0812    -3.61   3.11e- 4    0.637     0.876
## 13 heartdiseaseYes         0.657   0.115     -3.67   2.46e- 4    0.526     0.824
## 14 lungdiseaseYes          0.879   0.134     -0.963  3.36e- 1    0.678     1.15 
## 15 sexMale                 1.42    0.0805     4.34   1.42e- 5    1.21      1.66

## creating and saving table for publication
tbl_pub (model9, tablePath, "model9.docx", labelList[-1])

Variable	OR¹	95% CI¹	p-value
Pain
No	—	—
Yes	0.36	0.31, 0.41	<0.001
Age	1.00	1.00, 1.01	0.5
Body Mass Index	1.02	1.01, 1.04	<0.001
Depression
No	—	—
Yes	0.38	0.32, 0.45	<0.001
Alcohol
Never	—	—
Less than once a month	1.11	0.87, 1.44	0.4
Once a month or more	1.00	0.80, 1.24	>0.9
Fracture
No	—	—
Yes	0.63	0.39, 1.06	0.073
Cancer
No	—	—
Yes	0.91	0.66, 1.26	0.5
Diabetes
No	—	—
Yes	0.80	0.68, 0.96	0.012
Dementia
No	—	—
Yes	0.47	0.31, 0.72	<0.001
Arthritis
No	—	—
Yes	0.75	0.64, 0.88	<0.001
Heart Disease
No	—	—
Yes	0.66	0.53, 0.82	<0.001
Lung Disease
No	—	—
Yes	0.88	0.68, 1.15	0.3
Sex
Female	—	—
Male	1.42	1.21, 1.66	<0.001
¹ OR = Odds Ratio, CI = Confidence Interval

## adding the reference terms to the data frame to allow for a better plot
model9Tidy %>% 
        add_row(term = "alcoholNever", Odds.Ratio = 0, std.error = 0, statistic = 0, p.value = 0, conf.low = 0, conf.high = 0, .before = 10) %>%
        ## filtering the intercept
        filter (term !="(Intercept)") %>%
        ## saving the result so that we can refer to it in the next call
        {. ->> model9Tidy}%>%
        ## adding a variable to determine whether the observation
        ## is a reference or not. Initially adding everything as FALSE
        add_column(.after = "conf.high",
                   reference = rep(FALSE, length.out = nrow(model9Tidy))) %>%
        ## now the appropriate ones to TRUE
        dplyr::mutate(reference = case_when
               (term %in% c("alcoholNever") ~ TRUE,
                                     TRUE ~ FALSE)) %>%
        ## create a Variable column
        dplyr::mutate(.after= "term",
                      Variable = case_when(grepl("pain", term) ~ "Pain",
                                           grepl("age", term) ~ "Age",
                                           grepl("bmi", term) ~ "Body Mass Index",
                                           grepl("depression",term) ~ "Depression",
                                           grepl("alcohol",term) ~ "Alcohol",
                                           grepl("fracture", term) ~ "Fracture",
                                           grepl("cancer",term) ~ "Cancer",
                                           grepl("diabetes",term) ~ "Diabetes",
                                           grepl("dementia", term) ~ "Dementia",
                                           grepl("arthritis",term) ~ "Arthritis",
                                           grepl("heart",term) ~ "Heart Disease",
                                           grepl("lung",term) ~ "Lung Disease",
                                           grepl("sex",term) ~ "Sex")) %>%
        ## transform the variables to factor
        dplyr::mutate(Variable = as_factor(Variable)) %>%
        dplyr::mutate(term = as_factor(term)) -> model9Tidy

model9Tidy %>% ggplot (aes(x = Odds.Ratio, y = term, xmin = conf.low,
                           xmax = conf.high, height = 0, alpha = !reference)) +
                ## inserting a point for the odds ratio
                geom_point(aes(alpha = !reference)) +
                ## drawing a line at the 1 to improve readability
                geom_vline(xintercept = 1, lty = 4) +
                ## drawing an horizontal line to represent the 
                ## confidence interval
                geom_errorbarh() +
                scale_y_discrete(labels = labels2) +
                labs(x = "Odds Ratio", y = "",
                     title = "Sleep Quality being Fair or Better", 
                subtitle = "Adjusted Odds Ratios",
                caption = paste("N =", nobs(model3))) +
                facet_grid(Variable ~ ., scales = "free_y", space = "free_y") +
                scale_alpha_identity() +
                # hide facet labels
                theme(strip.background = element_blank(),
                strip.text = element_blank()) +
                # remove spacing between facets
                theme(panel.spacing = unit(0, "pt")) +
                scale_x_continuous(breaks = seq(0,1.8, by = 0.2),
                                   limits = (c(0.3,1.67))) -> plot9

## checking if plot has been saved already
if (!file.exists("C:/Users/Pedro Henrique/Documents/PPCR/Data Analysis Project/DataProjectPPCR/plots/plot9.png")){
        ## if it hasn't, saving the plot for publication
        ggsave("plot9.png", plot9, 
               path = "C:/Users/Pedro Henrique/Documents/PPCR/Data Analysis Project/DataProjectPPCR/plots") 
}

plot9

The results show us that being male is associated with increased odds of having fair or better sleep quality.

Replacing Individual Comorbidities with a Self-Reported General Health Assessment

We can also substitute the usage of variables to determine presence of comorbidities for a general health assessment reported by the patient himself, to see if it changes the analysis.

# modeling
model10 <- glm(sleepqualitydich ~ pain + genhealth + age + bmi + sex, "binomial", data = PPCRdata)

model10 %>% tidy(conf.int = TRUE, exponentiate = TRUE) %>% 
        rename(Odds.Ratio = estimate) -> model10Tidy

tbl_pub (model10, tablePath, "model10.docx",
         append(labelList[-c(1,5,6,7,8,9,10,11,12,13)], genhealth ~ "General Health Assessment"))

Variable	OR¹	95% CI¹	p-value
Pain
No	—	—
Yes	0.45	0.39, 0.52	<0.001
General Health Assessment
Very bad	—	—
Very bad	—	—
Bad	1.48	1.07, 2.03	0.016
Fair	2.85	2.11, 3.84	<0.001
Good	6.33	4.58, 8.74	<0.001
Very good	8.42	5.21, 14.0	<0.001
Excellent	9.44	4.87, 20.2	<0.001
Age	1.01	1.00, 1.02	0.017
Body Mass Index	1.02	1.01, 1.03	0.004
Sex
Female	—	—
Male	1.67	1.44, 1.95	<0.001
¹ OR = Odds Ratio, CI = Confidence Interval

While the direction of the effect was the same, when considering the subjects perception of their own general health status instead of comorbidity level, the presence of pain has a lower effect size in the association of decreased odds of having sleep quality that is fair or better.

Evaluation of dose-effect relationship between pain and sleep quality

We can also take a look at the subset of patients that have pain to see if sleep quality is worse with increasing levels of pain.

## unadjusted
model11 <- glm(sleepqualitydich ~ p_intensity,
               data = filter(PPCRdata, pain == "Yes"), family = "binomial")

model11 %>% tidy(conf.int = TRUE, exponentiate = TRUE) %>% 
        rename(Odds.Ratio = estimate) -> model11Tidy

tbl_pub(model11, tablePath, "model11.docx",
        list(p_intensity ~ "Pain Intensity")) -> tbl11

## adjusted
model12 <- glm(sleepqualitydich ~ p_intensity + age + bmi + sex + depression + alcohol +
         fracture + cancer + diabetes + dementia + arthritis +
         heartdisease + lungdisease,
    data = filter(PPCRdata, pain == "Yes"),
    family = "binomial")

model12 %>% tidy(conf.int = TRUE, exponentiate = TRUE) %>% 
        rename(Odds.Ratio = estimate) -> model12Tidy

tbl_pub (model12, tablePath, "model12.docx",
         append(labelList[-c(1,2)], p_intensity ~ "Pain Intensity")) -> tbl12

tbl_merge(list(tbl11,tbl12),
          tab_spanner = c("**Unadjusted**", "**Adjusted**")) -> merged4

if(!file.exists(paste(tablePath,"merged4.docx", sep=""))){
        merged4 %>% as_gt() %>% gt::gtsave(filename = "merged4.docx", path = tablePath)
}

merged4

Variable	Unadjusted			Adjusted
Variable	OR¹	95% CI¹	p-value	OR¹	95% CI¹	p-value
Pain Intensity
Soft/weak	—	—		—	—
Moderate	0.60	0.43, 0.83	0.003	0.67	0.47, 0.94	0.022
Intense/strong	0.28	0.20, 0.39	<0.001	0.35	0.24, 0.49	<0.001
Age				1.00	0.99, 1.01	>0.9
Body Mass Index				1.02	1.01, 1.04	0.005
Sex
Female				—	—
Male				1.27	1.02, 1.58	0.036
Depression
No				—	—
Yes				0.40	0.32, 0.51	<0.001
Alcohol
Never				—	—
Less than once a month				1.23	0.85, 1.82	0.3
Once a month or more				1.06	0.78, 1.43	0.7
Fracture
No				—	—
Yes				0.50	0.27, 0.93	0.028
Cancer
No				—	—
Yes				0.88	0.58, 1.38	0.6
Diabetes
No				—	—
Yes				0.95	0.75, 1.19	0.6
Dementia
No				—	—
Yes				0.47	0.27, 0.82	0.007
Arthritis
No				—	—
Yes				0.84	0.69, 1.02	0.079
Heart Disease
No				—	—
Yes				0.67	0.51, 0.89	0.005
Lung Disease
No				—	—
Yes				0.81	0.60, 1.11	0.2
¹ OR = Odds Ratio, CI = Confidence Interval

Increasing levels of pain intensity are also associated with decreased odds of having fair or better sleep quality.

Association between sleep quality and pain in older adults

Pedro Henrique Brant

2023-12-22

Introduction

Adjusting Variables for the Analysis

Analysis

Association between pain and sleepquality

Unadjusted Analysis

Sensitivity Analysis

Adjusted Analysis

Sensitivity Analysis

Testing the assumptions of the model

Association between sleepquality and pain

Ordinal Regression

Adjusted Analysis

Unadjusted Analysis

Testing for the Proportional Odds Assumption

Logistic Regression

Unadjusted Analysis

Adjusted Analysis

Testing the assumptions

Sensitivity Analysis

Inclusion of Sex as a Covariate

Replacing Individual Comorbidities with a Self-Reported General Health Assessment

Evaluation of dose-effect relationship between pain and sleep quality