BST02: Using R for Statistics in Medical Research

Part E: Markdown


SLIDES MARKDOWN

What is Markdown

  • R Markdown is a format for writing reproducible, dynamic reports with R
  • Use it to embed R code and results into slideshows, PDFs, HTML documents, Word files and more

Getting started with Markdown

Writing part:

  • Headers: #
  • Emphasis:
      - 1 asterisk/underscore -> italic (*word* or _word_)
      - 2 asterisks/underscores -> bold (**word** or __word__)
  • Bullets: *, -, +
  • Links are possible by specifying: [link](website) (no spaces)
  • Images: ![Rsymbol](Rsymbol.png)
    • Specify width with: {width=10%}
  • Horizontal rules: ---
  • Use a backslash (\) to escape these special characters
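
The writing rules above can be tried out from R itself. The following is a minimal sketch, not part of the original slides: it writes a few lines of Markdown to a temporary file and prints them back. The sample text, the file name example.md and the link target are placeholders chosen for illustration.

```r
# Hypothetical sample text; each line illustrates one rule from the list above
md_lines <- c(
  "# A first-level header",
  "",
  "Some *italic* text and some **bold** text.",
  "",
  "* first bullet",
  "* second bullet",
  "",
  "[link text](https://www.r-project.org)",
  "",
  "![Rsymbol](Rsymbol.png){width=10%}",
  "",
  "---",
  "",
  "A literal asterisk: \\*not italic\\*"
)

# Write the Markdown to a temporary file and show its contents
md_file <- file.path(tempdir(), "example.md")
writeLines(md_lines, md_file)
cat(readLines(md_file), sep = "\n")
```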

R-part

  • eval: If FALSE, knitr will not run the code in the code chunk.
  • include: If FALSE, knitr will run the chunk but not include the chunk in the final document.
  • echo: If FALSE, knitr will not display the code in the code chunk above its results in the final document.
  • results:
      - If ‘hide’, knitr will not display the code’s results in the final document.
      - If ‘hold’, knitr will delay displaying all output until the end of the chunk.
      - If ‘asis’, knitr will pass through results without reformatting them (useful if results return raw HTML, etc.)
  • error: If FALSE, knitr will stop knitting when the code generates an error; if TRUE, the error message is displayed in the document and knitting continues.
  • message: If FALSE, knitr will not display any messages.
  • warning: If FALSE, knitr will not display any warning messages.
  • cache: If TRUE, knitr will cache the results to reuse in future knits. Knitr will reuse the results until the code chunk is altered.
  • cache.comments: If FALSE, knitr will not rerun the chunk if only a code comment has changed.
  • fig.cap: A character string to be used as a figure caption in LaTeX.
  • fig.height, fig.width: The height and width to use in R for plots created by the chunk (in inches).
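
Chunk options can be set per chunk in the chunk header, or once for the whole document. As a minimal sketch (the particular values are illustrative assumptions, not settings taken from the course material), knitr's opts_chunk$set() defines document-wide defaults:

```r
library(knitr)

# Document-wide defaults; individual chunks can still override them
opts_chunk$set(
  echo = FALSE,      # hide the code
  message = FALSE,   # hide messages
  warning = FALSE,   # hide warnings
  fig.width = 6,     # figure width in inches
  fig.height = 4     # figure height in inches
)

# Inspect a single option
opts_chunk$get("echo")
```

A single chunk can then deviate from these defaults through its header, for example {r density, echo=TRUE, fig.cap="Density plot"}, where the label density and the caption are hypothetical.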

Reference: link

A simplified version

Because sometimes you:
- Want to create PDF / HTML / Word files from R
- Do not want to make a .Rmd file

With spin() from the knitr package we can do this!
- spin() syntax:
  - When you want a line to be interpreted as markdown prepend it with #'
  - When you want to start a new code chunk (with different options) use #+
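
The two rules above can be sketched with knitr::spin(). This is an illustrative example under assumptions: the script content and the chunk label summary-chunk are made up, and the built-in cars data set stands in for real data. With knit = FALSE the script is only converted to R Markdown text and not executed.

```r
library(knitr)

# A hypothetical plain R script, held here as a character vector
script <- c(
  "#' # Speed summary",
  "#' This line becomes *Markdown* text.",
  "#+ summary-chunk, echo=FALSE",
  "summary(cars$speed)"
)

# knit = FALSE only converts the script to R Markdown text, it does not run it
rmd <- spin(text = script, knit = FALSE)
cat(rmd)
```

For a script saved on disk, passing its file name to spin() instead of the text argument converts the file and, by default, knits it as well.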


DEMO CREATING REPORT

Introduction

The aim of this study is to investigate whether serum bilirubin is associated with sex, age, treatment and standardised blood clotting time.

Methods

Statistical Analysis

  • We use the pbc data set from the survival package.

  • Continuous variables are presented as mean and standard deviation, and categorical variables as counts and percentages. Differences in serum bilirubin, age and standardised blood clotting time between males and females were analyzed with unpaired Student’s t-tests. To investigate the association of serum bilirubin with sex, age, treatment and standardised blood clotting time, a linear regression was performed. A p-value < 0.05 was considered statistically significant, and no correction for multiple testing was performed.

library(lattice)
library(knitr)
library(survival)
library(effects)
library(arsenal)
R.Version()$version.string

[1] "R version 3.5.3 (2019-03-11)"

packageVersion("lattice")

[1] ‘0.20.38’

packageVersion("knitr")

[1] ‘1.26’

packageVersion("survival")

[1] ‘3.1.7’

packageVersion("effects")

[1] ‘4.1.4’

packageVersion("arsenal")

[1] ‘3.4.0’

Results

Table 1 presents descriptive statistics with the results of the t-test analysis.

Table 1: Descriptive statistics with test

               m (N=44)           f (N=374)          Total (N=418)      p value
bili                                                                    0.573
   Mean (SD)   2.866 (2.319)      3.263 (4.591)      3.221 (4.408)
   Range       0.600 - 9.500      0.300 - 28.000     0.300 - 28.000
age                                                                     < 0.001
   Mean (SD)   55.711 (10.978)    50.157 (10.241)    50.742 (10.447)
   Range       33.476 - 78.439    26.278 - 76.709    26.278 - 78.439
protime                                                                 0.151
   N-Miss      0                  2                  2
   Mean (SD)   10.941 (0.931)     10.707 (1.031)     10.732 (1.022)
   Range       9.700 - 14.100     9.000 - 18.000     9.000 - 18.000
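
A table with this layout can be produced with tableby() from the arsenal package, which the report loads. The call below is a plausible sketch, not necessarily the one used for Table 1. By default tableby() tests numeric variables with a one-way ANOVA, which for two groups is equivalent to the equal-variance t-test.

```r
library(survival)
library(arsenal)

# Descriptive statistics of bili, age and protime, split by sex
tab1 <- tableby(sex ~ bili + age + protime, data = pbc)

# text = TRUE prints a plain-text version instead of Markdown markup
summary(tab1, text = TRUE)
```

Inside an R Markdown chunk, the chunk option results='asis' is typically needed so that the summary is rendered as a formatted table.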

Figure 1 shows the density plots of serum bilirubin per gender.

Figure 1: Density plots of serum bilirubin per gender
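
A figure of this kind can be drawn with densityplot() from the lattice package, which the report loads. This is a hedged sketch: the exact call behind Figure 1 is not shown in the report, and the axis label and the choice to overlay both groups in one panel are assumptions.

```r
library(survival)
library(lattice)

# One density curve per sex, overlaid in a single panel
dens <- densityplot(
  ~ bili,
  groups = sex,
  data = pbc,
  plot.points = FALSE,   # suppress the rug of individual observations
  auto.key = TRUE,       # add a legend for the two groups
  xlab = "Serum bilirubin (mg/dl)"
)
print(dens)
```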


Regression analysis

The results of the regression analysis are presented in Table 2. Because we expect the effects of sex, treatment and standardised blood clotting time on serum bilirubin to differ with age, we fitted two models: one with main effects only, and one that adds interaction terms between age and each of the other variables.
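
The two models can be sketched with lm(). The formulas below are inferred from the coefficient names reported in the tables and are an assumption, as is the recoding of trt: in pbc the treatment is coded 1/2, so it is turned into a labelled factor here to obtain a trtplacebo coefficient. The names dat and fm1 are illustrative; fm2 matches the name used in the effect plots.

```r
library(survival)

dat <- pbc
# Assumption: recode trt from 1/2 to a labelled factor
dat$trt <- factor(dat$trt, levels = 1:2, labels = c("D-penicil", "placebo"))

# Main effects only
fm1 <- lm(bili ~ sex + age + trt + protime, data = dat)

# Interactions of age with every other covariate
fm2 <- lm(bili ~ age * (sex + trt + protime), data = dat)

names(coef(fm1))
names(coef(fm2))
```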

Linear regression without interactions
Table 2: The coefficients, standard errors and p-values

                Estimate   Std. Error   Pr(>|t|)
(Intercept)     -15.2051       2.8636     0.0000
sexf              0.8703       0.7692     0.2587
age              -0.0056       0.0238     0.8134
trtplacebo        0.4856       0.4865     0.3189
protime           1.6533       0.2455     0.0000

Linear regression with interactions

Table 2: The coefficients, standard errors and p-values

                Estimate   Std. Error   Pr(>|t|)
(Intercept)     -48.0844      15.2169     0.0017
age               0.6078       0.2796     0.0305
sexf              2.6118       3.8551     0.4986
trtplacebo        1.4057       2.3571     0.5514
protime           4.5506       1.3375     0.0008
age:sexf         -0.0332       0.0684     0.6274
age:trtplacebo   -0.0188       0.0462     0.6843
age:protime      -0.0538       0.0245     0.0286
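
Coefficient tables with these three columns can be extracted from a fitted model and passed to kable() from knitr. The sketch below refits the main-effects model so that it runs on its own; the treatment recoding and the object names are assumptions, as in any reconstruction of the report's hidden code.

```r
library(survival)
library(knitr)

dat <- pbc
# Assumption: recode trt from 1/2 to a labelled factor
dat$trt <- factor(dat$trt, levels = 1:2, labels = c("D-penicil", "placebo"))
fm1 <- lm(bili ~ sex + age + trt + protime, data = dat)

# Keep the estimate, standard error and p-value; drop the t statistic
coef_tab <- coef(summary(fm1))[, c("Estimate", "Std. Error", "Pr(>|t|)")]

kable(coef_tab, digits = 4,
      caption = "The coefficients, standard errors and p-values")
```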

Effect plots

The effect plots are based on the model with interaction terms.

Serum bilirubin with age

plot(effect("age", fm2))

Serum bilirubin with treatment

plot(effect("trt", fm2))

Serum bilirubin with standardised blood clotting time

plot(effect("protime", fm2))

Serum bilirubin with standardised blood clotting time and sex

plot(effect(c("protime", "sex"), fm2))

Conclusions

We may conclude that:

  • A strong clinical association was found between serum bilirubin and standardised blood clotting time
  • A weak association was found between sex and serum bilirubin
    • The same holds for age and treatment

Strengths and weaknesses of our study are:

  1. Small sample size

rsconnect::deployApp()