What is this?

This is material for a lightning talk at the Partners R User Group meeting on 2018-07-19.

Introduction

tableone is an R package that helps create “Table 1”, the summary of patient baseline characteristics in the format often seen in biomedical journals.

Load packages

library(tidyverse)
library(tableone)

Load data

We load the pbc (primary biliary cirrhosis) dataset from Mayo Clinic.

data(pbc, package = "survival")
pbc <- as_tibble(pbc)  # as_data_frame() is deprecated in favor of as_tibble()
pbc
## # A tibble: 418 x 20
##       id  time status   trt   age sex   ascites hepato spiders edema  bili  chol albumin copper alk.phos   ast
##    <int> <int>  <int> <int> <dbl> <fct>   <int>  <int>   <int> <dbl> <dbl> <int>   <dbl>  <int>    <dbl> <dbl>
##  1     1   400      2     1  58.8 f           1      1       1   1    14.5   261    2.6     156    1718  138. 
##  2     2  4500      0     1  56.4 f           0      1       1   0     1.1   302    4.14     54    7395. 114. 
##  3     3  1012      2     1  70.1 m           0      0       0   0.5   1.4   176    3.48    210     516   96.1
##  4     4  1925      2     1  54.7 f           0      1       1   0.5   1.8   244    2.54     64    6122.  60.6
##  5     5  1504      1     2  38.1 f           0      1       1   0     3.4   279    3.53    143     671  113. 
##  6     6  2503      2     2  66.3 f           0      1       0   0     0.8   248    3.98     50     944   93  
##  7     7  1832      0     2  55.5 f           0      1       0   0     1     322    4.09     52     824   60.4
##  8     8  2466      2     2  53.1 f           0      0       0   0     0.3   280    4        52    4651.  28.4
##  9     9  2400      2     1  42.5 f           0      0       1   0     3.2   562    3.08     79    2276  144. 
## 10    10    51      2     2  70.6 f           1      0       1   1    12.6   200    2.74    140     918  147. 
## # ... with 408 more rows, and 4 more variables: trig <int>, platelet <int>, protime <dbl>, stage <int>

Overall tables

Calling CreateTableOne() with only the data argument summarizes every variable in the dataset.

CreateTableOne(data = pbc)
##                       
##                        Overall          
##   n                        418          
##   id (mean (sd))        209.50 (120.81) 
##   time (mean (sd))     1917.78 (1104.67)
##   status (mean (sd))      0.83 (0.96)   
##   trt (mean (sd))         1.49 (0.50)   
##   age (mean (sd))        50.74 (10.45)  
##   sex = f (%)              374 (89.5)   
##   ascites (mean (sd))     0.08 (0.27)   
##   hepato (mean (sd))      0.51 (0.50)   
##   spiders (mean (sd))     0.29 (0.45)   
##   edema (mean (sd))       0.10 (0.25)   
##   bili (mean (sd))        3.22 (4.41)   
##   chol (mean (sd))      369.51 (231.94) 
##   albumin (mean (sd))     3.50 (0.42)   
##   copper (mean (sd))     97.65 (85.61)  
##   alk.phos (mean (sd)) 1982.66 (2140.39)
##   ast (mean (sd))       122.56 (56.70)  
##   trig (mean (sd))      124.70 (65.15)  
##   platelet (mean (sd))  257.02 (98.33)  
##   protime (mean (sd))    10.73 (1.02)   
##   stage (mean (sd))       3.02 (0.88)

Some variables are not appropriate as patient baseline characteristics, so let’s specify variables via the vars argument. Here we remove patient ID and outcome variables (time and status).

dput(names(pbc))
## c("id", "time", "status", "trt", "age", "sex", "ascites", "hepato", 
## "spiders", "edema", "bili", "chol", "albumin", "copper", "alk.phos", 
## "ast", "trig", "platelet", "protime", "stage")
vars <- c("trt", "age", "sex", "ascites", "hepato",
          "spiders", "edema", "bili", "chol", "albumin", "copper", "alk.phos",
          "ast", "trig", "platelet", "protime", "stage")
CreateTableOne(vars = vars, data = pbc)
##                       
##                        Overall          
##   n                        418          
##   trt (mean (sd))         1.49 (0.50)   
##   age (mean (sd))        50.74 (10.45)  
##   sex = f (%)              374 (89.5)   
##   ascites (mean (sd))     0.08 (0.27)   
##   hepato (mean (sd))      0.51 (0.50)   
##   spiders (mean (sd))     0.29 (0.45)   
##   edema (mean (sd))       0.10 (0.25)   
##   bili (mean (sd))        3.22 (4.41)   
##   chol (mean (sd))      369.51 (231.94) 
##   albumin (mean (sd))     3.50 (0.42)   
##   copper (mean (sd))     97.65 (85.61)  
##   alk.phos (mean (sd)) 1982.66 (2140.39)
##   ast (mean (sd))       122.56 (56.70)  
##   trig (mean (sd))      124.70 (65.15)  
##   platelet (mean (sd))  257.02 (98.33)  
##   protime (mean (sd))    10.73 (1.02)   
##   stage (mean (sd))       3.02 (0.88)
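
Instead of copying the dput() output and deleting names by hand, the same vector can be derived with setdiff(). This is a minimal base R sketch; the names are taken from the dput() output above, and `all_names` and `exclude` are illustrative names that are not in the original. With the real data, `all_names` would simply be `names(pbc)`.

```r
# Column names as printed by dput(names(pbc)) above
all_names <- c("id", "time", "status", "trt", "age", "sex", "ascites", "hepato",
               "spiders", "edema", "bili", "chol", "albumin", "copper", "alk.phos",
               "ast", "trig", "platelet", "protime", "stage")

# Patient ID and outcome variables are not baseline characteristics
exclude <- c("id", "time", "status")

# setdiff() keeps the order of its first argument
vars <- setdiff(all_names, exclude)
vars
```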

See ?pbc to better understand the dataset.

pbc                  package:survival                  R Documentation

Mayo Clinic Primary Biliary Cirrhosis Data

Description:

     This data is from the Mayo Clinic trial in primary biliary
     cirrhosis (PBC) of the liver conducted between 1974 and 1984.  A
     total of 424 PBC patients, referred to Mayo Clinic during that
     ten-year interval, met eligibility criteria for the randomized
     placebo controlled trial of the drug D-penicillamine.  The first
     312 cases in the data set participated in the randomized trial and
     contain largely complete data.  The additional 112 cases did not
     participate in the clinical trial, but consented to have basic
     measurements recorded and to be followed for survival.  Six of
     those cases were lost to follow-up shortly after diagnosis, so the
     data here are on an additional 106 cases as well as the 312
     randomized participants.

     A nearly identical data set found in appendix D of Fleming and
     Harrington; this version has fewer missing values.

Usage:

     pbc

Format:

       age:       in years
       albumin:   serum albumin (g/dl)
       alk.phos:  alkaline phosphotase (U/liter)
       ascites:   presence of ascites
       ast:       aspartate aminotransferase, once called SGOT (U/ml)
       bili:      serum bilirunbin (mg/dl)
       chol:      serum cholesterol (mg/dl)
       copper:    urine copper (ug/day)
       edema:     0 no edema, 0.5 untreated or successfully treated
                  1 edema despite diuretic therapy
       hepato:    presence of hepatomegaly or enlarged liver
       id:        case number
       platelet:  platelet count
       protime:   standardised blood clotting time
       sex:       m/f
       spiders:   blood vessel malformations in the skin
       stage:     histologic stage of disease (needs biopsy)
       status:    status at endpoint, 0/1/2 for censored, transplant, dead
       time:      number of days between registration and the earlier of death,
                  transplantion, or study analysis in July, 1986
       trt:       1/2/NA for D-penicillmain, placebo, not randomised
       trig:      triglycerides (mg/dl)

Source:

     T Therneau and P Grambsch (2000), _Modeling Survival Data:
     Extending the Cox Model_, Springer-Verlag, New York.  ISBN:
     0-387-98784-3.

The help page shows that some variables are numerically coded categorical variables (ascites, edema, hepato, stage, trt). Here we convert these to factors so that they are summarized as counts and percentages rather than means. For binary variables, make the second level the one you want to show the percentage for.

pbc <- pbc %>%
    mutate(ascites = factor(ascites, levels = c(0,1), labels = c("Absent","Present")),
           edema = factor(edema, levels = c(0, 0.5, 1), labels = c("No edema","Untreated or successfully treated","edema despite diuretic therapy")),
           hepato = factor(hepato, levels = c(0,1), labels = c("Absent","Present")),
           stage = factor(stage),
           trt = factor(trt, levels = c(1,2), labels = c("D-penicillmain", "Placebo")))

Now these variables are summarized as counts and percentages.

CreateTableOne(vars = vars, data = pbc)
##                                       
##                                        Overall          
##   n                                        418          
##   trt = Placebo (%)                        154 (49.4)   
##   age (mean (sd))                        50.74 (10.45)  
##   sex = f (%)                              374 (89.5)   
##   ascites = Present (%)                     24 ( 7.7)   
##   hepato = Present (%)                     160 (51.3)   
##   spiders (mean (sd))                     0.29 (0.45)   
##   edema (%)                                             
##      No edema                              354 (84.7)   
##      Untreated or successfully treated      44 (10.5)   
##      edema despite diuretic therapy         20 ( 4.8)   
##   bili (mean (sd))                        3.22 (4.41)   
##   chol (mean (sd))                      369.51 (231.94) 
##   albumin (mean (sd))                     3.50 (0.42)   
##   copper (mean (sd))                     97.65 (85.61)  
##   alk.phos (mean (sd))                 1982.66 (2140.39)
##   ast (mean (sd))                       122.56 (56.70)  
##   trig (mean (sd))                      124.70 (65.15)  
##   platelet (mean (sd))                  257.02 (98.33)  
##   protime (mean (sd))                    10.73 (1.02)   
##   stage (%)                                             
##      1                                      21 ( 5.1)   
##      2                                      92 (22.3)   
##      3                                     155 (37.6)   
##      4                                     144 (35.0)
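
For a two-level factor, the table reports the second level (for example “Present”). The base R sketch below uses made-up values to show how the level order controls this; `x`, `f` and `f_rev` are illustrative names. Note that spiders is also coded 0/1 but was not converted above, which is why it still appears as mean (sd); it could be converted in the same way.

```r
# Toy 0/1 indicator (made-up values, not from pbc)
x <- c(0, 1, 1, 0, 0)

# levels = c(0, 1) makes "Present" the second level,
# which is the level reported for a two-level factor
f <- factor(x, levels = c(0, 1), labels = c("Absent", "Present"))
levels(f)

# To report "Absent" instead, put it second
f_rev <- factor(f, levels = c("Present", "Absent"))
levels(f_rev)

# Percentage of observations in the second level of f
pct_second <- 100 * mean(f == levels(f)[2])
pct_second
```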

Show the percentage of missing values for each variable with the missing option of the print method.

print(CreateTableOne(vars = vars, data = pbc), missing = TRUE)
##                                       
##                                        Overall           Missing
##   n                                        418                  
##   trt = Placebo (%)                        154 (49.4)    25.4   
##   age (mean (sd))                        50.74 (10.45)    0.0   
##   sex = f (%)                              374 (89.5)     0.0   
##   ascites = Present (%)                     24 ( 7.7)    25.4   
##   hepato = Present (%)                     160 (51.3)    25.4   
##   spiders (mean (sd))                     0.29 (0.45)    25.4   
##   edema (%)                                               0.0   
##      No edema                              354 (84.7)           
##      Untreated or successfully treated      44 (10.5)           
##      edema despite diuretic therapy         20 ( 4.8)           
##   bili (mean (sd))                        3.22 (4.41)     0.0   
##   chol (mean (sd))                      369.51 (231.94)  32.1   
##   albumin (mean (sd))                     3.50 (0.42)     0.0   
##   copper (mean (sd))                     97.65 (85.61)   25.8   
##   alk.phos (mean (sd))                 1982.66 (2140.39) 25.4   
##   ast (mean (sd))                       122.56 (56.70)   25.4   
##   trig (mean (sd))                      124.70 (65.15)   32.5   
##   platelet (mean (sd))                  257.02 (98.33)    2.6   
##   protime (mean (sd))                    10.73 (1.02)     0.5   
##   stage (%)                                               1.4   
##      1                                      21 ( 5.1)           
##      2                                      92 (22.3)           
##      3                                     155 (37.6)           
##      4                                     144 (35.0)
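
The Missing column is the percentage of missing values per variable. For example, the 106 non-randomized patients have no treatment assignment, which gives the 25.4 shown for trt. The sketch below repeats that arithmetic and applies the same idea to a made-up data frame `toy` (an illustrative name, not part of the original).

```r
# 106 of 418 patients were not randomized, so trt is NA for them
trt_missing_pct <- round(100 * 106 / 418, 1)
trt_missing_pct

# Same idea for every column of a data frame (toy data, not pbc)
toy <- data.frame(a = c(1, NA, 3, 4),
                  b = c("x", "y", NA, NA))
miss_pct <- round(100 * colMeans(is.na(toy)), 1)
miss_pct
```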

Group-stratified tables

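trt is the treatment assignment variable, so we stratify the table by it. Patients without a treatment assignment (the non-randomized cases) do not appear in either stratum. P-values are added using default test functions (chisq.test() for categorical variables and oneway.test() for continuous variables).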

vars <- setdiff(vars, "trt")
CreateTableOne(vars = vars, strata = "trt", data = pbc)
##                                       Stratified by trt
##                                        D-penicillmain    Placebo           p      test
##   n                                        158               154                      
##   age (mean (sd))                        51.42 (11.01)     48.58 (9.96)     0.018     
##   sex = f (%)                              137 (86.7)        139 (90.3)     0.421     
##   ascites = Present (%)                     14 ( 8.9)         10 ( 6.5)     0.567     
##   hepato = Present (%)                      73 (46.2)         87 (56.5)     0.088     
##   spiders (mean (sd))                     0.28 (0.45)       0.29 (0.46)     0.886     
##   edema (%)                                                                 0.877     
##      No edema                              132 (83.5)        131 (85.1)               
##      Untreated or successfully treated      16 (10.1)         13 ( 8.4)               
##      edema despite diuretic therapy         10 ( 6.3)         10 ( 6.5)               
##   bili (mean (sd))                        2.87 (3.63)       3.65 (5.28)     0.131     
##   chol (mean (sd))                      365.01 (209.54)   373.88 (252.48)   0.748     
##   albumin (mean (sd))                     3.52 (0.44)       3.52 (0.40)     0.874     
##   copper (mean (sd))                     97.64 (90.59)     97.65 (80.49)    0.999     
##   alk.phos (mean (sd))                 2021.30 (2183.44) 1943.01 (2101.69)  0.747     
##   ast (mean (sd))                       120.21 (54.52)    124.97 (58.93)    0.460     
##   trig (mean (sd))                      124.14 (71.54)    125.25 (58.52)    0.886     
##   platelet (mean (sd))                  258.75 (100.32)   265.20 (90.73)    0.555     
##   protime (mean (sd))                    10.65 (0.85)      10.80 (1.14)     0.197     
##   stage (%)                                                                 0.201     
##      1                                      12 ( 7.6)          4 ( 2.6)               
##      2                                      35 (22.2)         32 (20.8)               
##      3                                      56 (35.4)         64 (41.6)               
##      4                                      55 (34.8)         54 (35.1)

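Some continuous variables, like most biomarkers, are quite skewed. Median [IQR] may be a better summary for these; list them in the nonnormal argument of the print method. As the test column indicates (nonnorm), the p-values for these variables come from a different function, a rank-based test (the Wilcoxon test in this two-group case).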

print(CreateTableOne(vars = vars, strata = "trt", data = pbc), nonnormal = c("bili","chol"))
##                                       Stratified by trt
##                                        D-penicillmain           Placebo                  p      test   
##   n                                        158                      154                                
##   age (mean (sd))                        51.42 (11.01)            48.58 (9.96)            0.018        
##   sex = f (%)                              137 (86.7)               139 (90.3)            0.421        
##   ascites = Present (%)                     14 ( 8.9)                10 ( 6.5)            0.567        
##   hepato = Present (%)                      73 (46.2)                87 (56.5)            0.088        
##   spiders (mean (sd))                     0.28 (0.45)              0.29 (0.46)            0.886        
##   edema (%)                                                                               0.877        
##      No edema                              132 (83.5)               131 (85.1)                         
##      Untreated or successfully treated      16 (10.1)                13 ( 8.4)                         
##      edema despite diuretic therapy         10 ( 6.3)                10 ( 6.5)                         
##   bili (median [IQR])                     1.40 [0.80, 3.20]        1.30 [0.72, 3.60]      0.842 nonnorm
##   chol (median [IQR])                   315.50 [247.75, 417.00]  303.50 [254.25, 377.00]  0.544 nonnorm
##   albumin (mean (sd))                     3.52 (0.44)              3.52 (0.40)            0.874        
##   copper (mean (sd))                     97.64 (90.59)            97.65 (80.49)           0.999        
##   alk.phos (mean (sd))                 2021.30 (2183.44)        1943.01 (2101.69)         0.747        
##   ast (mean (sd))                       120.21 (54.52)           124.97 (58.93)           0.460        
##   trig (mean (sd))                      124.14 (71.54)           125.25 (58.52)           0.886        
##   platelet (mean (sd))                  258.75 (100.32)          265.20 (90.73)           0.555        
##   protime (mean (sd))                    10.65 (0.85)             10.80 (1.14)            0.197        
##   stage (%)                                                                               0.201        
##      1                                      12 ( 7.6)                 4 ( 2.6)                         
##      2                                      35 (22.2)                32 (20.8)                         
##      3                                      56 (35.4)                64 (41.6)                         
##      4                                      55 (34.8)                54 (35.1)
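
As far as I know, tableone’s default nonparametric test is kruskal.test(), which for two groups agrees with the Wilcoxon rank-sum test computed with the normal approximation and no continuity correction. The base R sketch below illustrates that equivalence with made-up values; `value` and `group` are illustrative names.

```r
# Made-up skewed values for two groups (not from pbc)
value <- c(0.8, 1.4, 3.2, 0.5, 12.6, 1.1, 0.7, 1.3, 3.6, 2.3, 0.6, 14.5)
group <- factor(rep(c("A", "B"), each = 6))

# Kruskal-Wallis rank sum test
p_kw <- kruskal.test(value ~ group)$p.value

# Wilcoxon rank-sum test using the normal approximation
# without continuity correction
p_wx <- wilcox.test(value ~ group, exact = FALSE, correct = FALSE)$p.value

c(kruskal = p_kw, wilcoxon = p_wx)
```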

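In propensity score analysis, standardized mean differences (SMDs) are often preferred over p-values for assessing covariate balance. Use the smd argument of the print method to show them; test = FALSE hides the p-values.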

print(CreateTableOne(vars = vars, strata = "trt", data = pbc), nonnormal = c("bili","chol"), smd = TRUE, test = FALSE)
##                                       Stratified by trt
##                                        D-penicillmain           Placebo                  SMD   
##   n                                        158                      154                        
##   age (mean (sd))                        51.42 (11.01)            48.58 (9.96)            0.270
##   sex = f (%)                              137 (86.7)               139 (90.3)            0.111
##   ascites = Present (%)                     14 ( 8.9)                10 ( 6.5)            0.089
##   hepato = Present (%)                      73 (46.2)                87 (56.5)            0.207
##   spiders (mean (sd))                     0.28 (0.45)              0.29 (0.46)            0.016
##   edema (%)                                                                               0.058
##      No edema                              132 (83.5)               131 (85.1)                 
##      Untreated or successfully treated      16 (10.1)                13 ( 8.4)                 
##      edema despite diuretic therapy         10 ( 6.3)                10 ( 6.5)                 
##   bili (median [IQR])                     1.40 [0.80, 3.20]        1.30 [0.72, 3.60]      0.171
##   chol (median [IQR])                   315.50 [247.75, 417.00]  303.50 [254.25, 377.00]  0.038
##   albumin (mean (sd))                     3.52 (0.44)              3.52 (0.40)            0.018
##   copper (mean (sd))                     97.64 (90.59)            97.65 (80.49)          <0.001
##   alk.phos (mean (sd))                 2021.30 (2183.44)        1943.01 (2101.69)         0.037
##   ast (mean (sd))                       120.21 (54.52)           124.97 (58.93)           0.084
##   trig (mean (sd))                      124.14 (71.54)           125.25 (58.52)           0.017
##   platelet (mean (sd))                  258.75 (100.32)          265.20 (90.73)           0.067
##   protime (mean (sd))                    10.65 (0.85)             10.80 (1.14)            0.146
##   stage (%)                                                                               0.246
##      1                                      12 ( 7.6)                 4 ( 2.6)                 
##      2                                      35 (22.2)                32 (20.8)                 
##      3                                      56 (35.4)                64 (41.6)                 
##      4                                      55 (34.8)                54 (35.1)
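
For a continuous variable, the SMD is commonly defined as the absolute difference in group means divided by the square root of the average of the two group variances. The sketch below checks this against the age row using the rounded means and SDs printed above, so the result only approximately matches 0.270. `smd_continuous` is a hypothetical helper, not a tableone function.

```r
# SMD for a continuous variable from group means and SDs
smd_continuous <- function(mean1, sd1, mean2, sd2) {
  abs(mean1 - mean2) / sqrt((sd1^2 + sd2^2) / 2)
}

# age: D-penicillmain 51.42 (11.01) vs Placebo 48.58 (9.96)
smd_age <- smd_continuous(51.42, 11.01, 48.58, 9.96)
round(smd_age, 2)
```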

Variable labels

Variable names are typically short and not appropriate for the final version of the table. Use the labelled package to assign variable labels.

var_label_list <- list(age = "Age in years",
                       sex = "Female",
                       ascites = "Ascites",
                       hepato = "Hepatomegaly",
                       spiders = "Spider angioma",
                       edema = "Edema",
                       bili = "Serum bilirunbin, mg/dl",
                       chol = "Serum cholesterol, mg/dl",
                       copper = "Urine copper ug/day",
                       stage = "Histologic stage of disease",
                       trig = "Triglycerides, mg/dl",
                       albumin = "Serum albumin, g/dl",
                       alk.phos = "Alkaline phosphotase, U/liter",
                       ast = "Aspartate aminotransferase, U/ml",
                       platelet = "Platelet count",
                       protime = "Prothrombin time in seconds")
labelled::var_label(pbc) <- var_label_list
labelled::var_label(pbc)
## $id
## NULL
## 
## $time
## NULL
## 
## $status
## NULL
## 
## $trt
## NULL
## 
## $age
## [1] "Age in years"
## 
## $sex
## [1] "Female"
## 
## $ascites
## [1] "Ascites"
## 
## $hepato
## [1] "Hepatomegaly"
## 
## $spiders
## [1] "Spider angioma"
## 
## $edema
## [1] "Edema"
## 
## $bili
## [1] "Serum bilirunbin, mg/dl"
## 
## $chol
## [1] "Serum cholesterol, mg/dl"
## 
## $albumin
## [1] "Serum albumin, g/dl"
## 
## $copper
## [1] "Urine copper ug/day"
## 
## $alk.phos
## [1] "Alkaline phosphotase, U/liter"
## 
## $ast
## [1] "Aspartate aminotransferase, U/ml"
## 
## $trig
## [1] "Triglycerides, mg/dl"
## 
## $platelet
## [1] "Platelet count"
## 
## $protime
## [1] "Prothrombin time in seconds"
## 
## $stage
## [1] "Histologic stage of disease"

Let’s see the table with variable labels.

print(CreateTableOne(vars = vars, strata = "trt", data = pbc), nonnormal = c("bili","chol"), smd = TRUE, test = FALSE, varLabels = TRUE)
##                                               Stratified by trt
##                                                D-penicillmain           Placebo                  SMD   
##   n                                                158                      154                        
##   Age in years (mean (sd))                       51.42 (11.01)            48.58 (9.96)            0.270
##   Female = f (%)                                   137 (86.7)               139 (90.3)            0.111
##   Ascites = Present (%)                             14 ( 8.9)                10 ( 6.5)            0.089
##   Hepatomegaly = Present (%)                        73 (46.2)                87 (56.5)            0.207
##   Spider angioma (mean (sd))                      0.28 (0.45)              0.29 (0.46)            0.016
##   Edema (%)                                                                                       0.058
##      No edema                                      132 (83.5)               131 (85.1)                 
##      Untreated or successfully treated              16 (10.1)                13 ( 8.4)                 
##      edema despite diuretic therapy                 10 ( 6.3)                10 ( 6.5)                 
##   Serum bilirunbin, mg/dl (median [IQR])          1.40 [0.80, 3.20]        1.30 [0.72, 3.60]      0.171
##   Serum cholesterol, mg/dl (median [IQR])       315.50 [247.75, 417.00]  303.50 [254.25, 377.00]  0.038
##   Serum albumin, g/dl (mean (sd))                 3.52 (0.44)              3.52 (0.40)            0.018
##   Urine copper ug/day (mean (sd))                97.64 (90.59)            97.65 (80.49)          <0.001
##   Alkaline phosphotase, U/liter (mean (sd))    2021.30 (2183.44)        1943.01 (2101.69)         0.037
##   Aspartate aminotransferase, U/ml (mean (sd))  120.21 (54.52)           124.97 (58.93)           0.084
##   Triglycerides, mg/dl (mean (sd))              124.14 (71.54)           125.25 (58.52)           0.017
##   Platelet count (mean (sd))                    258.75 (100.32)          265.20 (90.73)           0.067
##   Prothrombin time in seconds (mean (sd))        10.65 (0.85)             10.80 (1.14)            0.146
##   Histologic stage of disease (%)                                                                 0.246
##      1                                              12 ( 7.6)                 4 ( 2.6)                 
##      2                                              35 (22.2)                32 (20.8)                 
##      3                                              56 (35.4)                64 (41.6)                 
##      4                                              55 (34.8)                54 (35.1)

Once the binary categories look right, we can suppress the level indication (such as “= f”) with the dropEqual option.

print(CreateTableOne(vars = vars, strata = "trt", data = pbc), nonnormal = c("bili","chol"), smd = TRUE, test = FALSE, varLabels = TRUE, dropEqual = TRUE)
##                                               Stratified by trt
##                                                D-penicillmain           Placebo                  SMD   
##   n                                                158                      154                        
##   Age in years (mean (sd))                       51.42 (11.01)            48.58 (9.96)            0.270
##   Female (%)                                       137 (86.7)               139 (90.3)            0.111
##   Ascites (%)                                       14 ( 8.9)                10 ( 6.5)            0.089
##   Hepatomegaly (%)                                  73 (46.2)                87 (56.5)            0.207
##   Spider angioma (mean (sd))                      0.28 (0.45)              0.29 (0.46)            0.016
##   Edema (%)                                                                                       0.058
##      No edema                                      132 (83.5)               131 (85.1)                 
##      Untreated or successfully treated              16 (10.1)                13 ( 8.4)                 
##      edema despite diuretic therapy                 10 ( 6.3)                10 ( 6.5)                 
##   Serum bilirunbin, mg/dl (median [IQR])          1.40 [0.80, 3.20]        1.30 [0.72, 3.60]      0.171
##   Serum cholesterol, mg/dl (median [IQR])       315.50 [247.75, 417.00]  303.50 [254.25, 377.00]  0.038
##   Serum albumin, g/dl (mean (sd))                 3.52 (0.44)              3.52 (0.40)            0.018
##   Urine copper ug/day (mean (sd))                97.64 (90.59)            97.65 (80.49)          <0.001
##   Alkaline phosphotase, U/liter (mean (sd))    2021.30 (2183.44)        1943.01 (2101.69)         0.037
##   Aspartate aminotransferase, U/ml (mean (sd))  120.21 (54.52)           124.97 (58.93)           0.084
##   Triglycerides, mg/dl (mean (sd))              124.14 (71.54)           125.25 (58.52)           0.017
##   Platelet count (mean (sd))                    258.75 (100.32)          265.20 (90.73)           0.067
##   Prothrombin time in seconds (mean (sd))        10.65 (0.85)             10.80 (1.14)            0.146
##   Histologic stage of disease (%)                                                                 0.246
##      1                                              12 ( 7.6)                 4 ( 2.6)                 
##      2                                              35 (22.2)                32 (20.8)                 
##      3                                              56 (35.4)                64 (41.6)                 
##      4                                              55 (34.8)                54 (35.1)

Markdown format

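It’s only in the GitHub version currently. The kableone function can be called instead of the print function. Note that the helper defined below passes its extra arguments to knitr::kable() rather than to print(), so the table that follows uses the default print settings (p-values shown, no variable labels).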

kableone <- function(x, ...) {
  capture.output(x <- print(x))
  knitr::kable(x, ...)
}
kableone(CreateTableOne(vars = vars, strata = "trt", data = pbc), nonnormal = c("bili","chol"), smd = TRUE, test = FALSE, varLabels = TRUE, dropEqual = TRUE, noSpaces = TRUE, printToggle = FALSE)
|                                  |D-penicillmain    |Placebo           |p     |test |
|:---------------------------------|:-----------------|:-----------------|:-----|:----|
|n                                 |158               |154               |      |     |
|age (mean (sd))                   |51.42 (11.01)     |48.58 (9.96)      |0.018 |     |
|sex = f (%)                       |137 (86.7)        |139 (90.3)        |0.421 |     |
|ascites = Present (%)             |14 ( 8.9)         |10 ( 6.5)         |0.567 |     |
|hepato = Present (%)              |73 (46.2)         |87 (56.5)         |0.088 |     |
|spiders (mean (sd))               |0.28 (0.45)       |0.29 (0.46)       |0.886 |     |
|edema (%)                         |                  |                  |0.877 |     |
|No edema                          |132 (83.5)        |131 (85.1)        |      |     |
|Untreated or successfully treated |16 (10.1)         |13 ( 8.4)         |      |     |
|edema despite diuretic therapy    |10 ( 6.3)         |10 ( 6.5)         |      |     |
|bili (mean (sd))                  |2.87 (3.63)       |3.65 (5.28)       |0.131 |     |
|chol (mean (sd))                  |365.01 (209.54)   |373.88 (252.48)   |0.748 |     |
|albumin (mean (sd))               |3.52 (0.44)       |3.52 (0.40)       |0.874 |     |
|copper (mean (sd))                |97.64 (90.59)     |97.65 (80.49)     |0.999 |     |
|alk.phos (mean (sd))              |2021.30 (2183.44) |1943.01 (2101.69) |0.747 |     |
|ast (mean (sd))                   |120.21 (54.52)    |124.97 (58.93)    |0.460 |     |
|trig (mean (sd))                  |124.14 (71.54)    |125.25 (58.52)    |0.886 |     |
|platelet (mean (sd))              |258.75 (100.32)   |265.20 (90.73)    |0.555 |     |
|protime (mean (sd))               |10.65 (0.85)      |10.80 (1.14)      |0.197 |     |
|stage (%)                         |                  |                  |0.201 |     |
|1                                 |12 ( 7.6)         |4 ( 2.6)          |      |     |
|2                                 |35 (22.2)         |32 (20.8)         |      |     |
|3                                 |56 (35.4)         |64 (41.6)         |      |     |
|4                                 |55 (34.8)         |54 (35.1)         |      |     |
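
A possible variant of the helper, as a sketch only, forwards the extra arguments to print() so that options such as nonnormal, smd and varLabels take effect. The name `kableone_fwd` and the `render` parameter are hypothetical (the latter only makes the function usable without knitr), and the sketch assumes that print() invisibly returns the formatted matrix.

```r
kableone_fwd <- function(x, ..., render = knitr::kable) {
  # Forward the formatting options to print() and keep the returned matrix;
  # capture.output() swallows anything printed to the console
  utils::capture.output(mat <- print(x, ...))
  # Render the character matrix (a Markdown table when render is knitr::kable)
  render(mat)
}

# Usage (assumes tableone and knitr are installed and the objects above exist):
# kableone_fwd(CreateTableOne(vars = vars, strata = "trt", data = pbc),
#              nonnormal = c("bili", "chol"), smd = TRUE, test = FALSE,
#              varLabels = TRUE, dropEqual = TRUE, noSpaces = TRUE)
```

Extra arguments are deliberately not passed on to the renderer, so print options cannot be misread as table-rendering options.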

Export to a CSV file

The print method invisibly returns a matrix object, which we can export to a file. In the console, the output is aligned with padding spaces, but we don’t need them when exporting. The noSpaces option removes them. If assigning the matrix is all you need, you can turn off printing with the printToggle option.

tab1mat <- print(CreateTableOne(vars = vars, strata = "trt", data = pbc), nonnormal = c("bili","chol"), smd = TRUE, test = FALSE, varLabels = TRUE, dropEqual = TRUE, noSpaces = TRUE, printToggle = FALSE)

Now this is just a matrix of text.

tab1mat
##                                               Stratified by trt
##                                                D-penicillmain            Placebo                   SMD     
##   n                                            "158"                     "154"                     ""      
##   Age in years (mean (sd))                     "51.42 (11.01)"           "48.58 (9.96)"            "0.270" 
##   Female (%)                                   "137 (86.7)"              "139 (90.3)"              "0.111" 
##   Ascites (%)                                  "14 (8.9)"                "10 (6.5)"                "0.089" 
##   Hepatomegaly (%)                             "73 (46.2)"               "87 (56.5)"               "0.207" 
##   Spider angioma (mean (sd))                   "0.28 (0.45)"             "0.29 (0.46)"             "0.016" 
##   Edema (%)                                    ""                        ""                        "0.058" 
##      No edema                                  "132 (83.5)"              "131 (85.1)"              ""      
##      Untreated or successfully treated         "16 (10.1)"               "13 (8.4)"                ""      
##      edema despite diuretic therapy            "10 (6.3)"                "10 (6.5)"                ""      
##   Serum bilirunbin, mg/dl (median [IQR])       "1.40 [0.80, 3.20]"       "1.30 [0.72, 3.60]"       "0.171" 
##   Serum cholesterol, mg/dl (median [IQR])      "315.50 [247.75, 417.00]" "303.50 [254.25, 377.00]" "0.038" 
##   Serum albumin, g/dl (mean (sd))              "3.52 (0.44)"             "3.52 (0.40)"             "0.018" 
##   Urine copper ug/day (mean (sd))              "97.64 (90.59)"           "97.65 (80.49)"           "<0.001"
##   Alkaline phosphotase, U/liter (mean (sd))    "2021.30 (2183.44)"       "1943.01 (2101.69)"       "0.037" 
##   Aspartate aminotransferase, U/ml (mean (sd)) "120.21 (54.52)"          "124.97 (58.93)"          "0.084" 
##   Triglycerides, mg/dl (mean (sd))             "124.14 (71.54)"          "125.25 (58.52)"          "0.017" 
##   Platelet count (mean (sd))                   "258.75 (100.32)"         "265.20 (90.73)"          "0.067" 
##   Prothrombin time in seconds (mean (sd))      "10.65 (0.85)"            "10.80 (1.14)"            "0.146" 
##   Histologic stage of disease (%)              ""                        ""                        "0.246" 
##      1                                         "12 (7.6)"                "4 (2.6)"                 ""      
##      2                                         "35 (22.2)"               "32 (20.8)"               ""      
##      3                                         "56 (35.4)"               "64 (41.6)"               ""      
##      4                                         "55 (34.8)"               "54 (35.1)"               ""

You can write to a CSV file easily.

write.csv(tab1mat, file = "./tab1.csv")
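
To confirm what ends up in the file, the CSV can be read back. The sketch below uses a two-row excerpt of the values shown above as a stand-in for tab1mat; `toy_mat`, `csv_path` and `back` are illustrative names, and a temporary file is used instead of ./tab1.csv.

```r
# Small character matrix standing in for tab1mat (two rows from the table above)
toy_mat <- matrix(c("158", "51.42 (11.01)",
                    "154", "48.58 (9.96)",
                    "",    "0.270"),
                  nrow = 2,
                  dimnames = list(c("n", "Age in years (mean (sd))"),
                                  c("D-penicillmain", "Placebo", "SMD")))

csv_path <- tempfile(fileext = ".csv")
write.csv(toy_mat, file = csv_path)

# Read everything back as text so that "158" stays a string,
# and keep the column names unchanged
back <- read.csv(csv_path, row.names = 1, check.names = FALSE,
                 colClasses = "character", na.strings = character(0))
back
```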

Export to an Excel file

This is not part of the package yet. Download the helper file from my gist. The functions depend on the openxlsx package.

library(openxlsx)
source("./tableone_helper_functions.R")

We can export to an .xlsx file with some useful default formatting.

write_tableone_mat_to_xlsx(tab1mat, file = "./tab1.xlsx", font_size = 10)