This is material for a lightning talk at the Partners R User Group meeting on 2018-07-19.
tableone is an R package that helps create “Table 1”, the summary of patient baseline characteristics in the format often seen in biomedical journals.
library(tidyverse)
library(tableone)
We load the pbc (primary biliary cirrhosis) dataset from Mayo Clinic.
data(pbc, package = "survival")
pbc <- as_tibble(pbc)
pbc
## # A tibble: 418 x 20
## id time status trt age sex ascites hepato spiders edema bili chol albumin copper alk.phos ast
## <int> <int> <int> <int> <dbl> <fct> <int> <int> <int> <dbl> <dbl> <int> <dbl> <int> <dbl> <dbl>
## 1 1 400 2 1 58.8 f 1 1 1 1 14.5 261 2.6 156 1718 138.
## 2 2 4500 0 1 56.4 f 0 1 1 0 1.1 302 4.14 54 7395. 114.
## 3 3 1012 2 1 70.1 m 0 0 0 0.5 1.4 176 3.48 210 516 96.1
## 4 4 1925 2 1 54.7 f 0 1 1 0.5 1.8 244 2.54 64 6122. 60.6
## 5 5 1504 1 2 38.1 f 0 1 1 0 3.4 279 3.53 143 671 113.
## 6 6 2503 2 2 66.3 f 0 1 0 0 0.8 248 3.98 50 944 93
## 7 7 1832 0 2 55.5 f 0 1 0 0 1 322 4.09 52 824 60.4
## 8 8 2466 2 2 53.1 f 0 0 0 0 0.3 280 4 52 4651. 28.4
## 9 9 2400 2 1 42.5 f 0 0 1 0 3.2 562 3.08 79 2276 144.
## 10 10 51 2 2 70.6 f 1 0 1 1 12.6 200 2.74 140 918 147.
## # ... with 408 more rows, and 4 more variables: trig <int>, platelet <int>, protime <dbl>, stage <int>
Invocation of CreateTableOne() with just the data argument shows all variables.
CreateTableOne(data = pbc)
##
## Overall
## n 418
## id (mean (sd)) 209.50 (120.81)
## time (mean (sd)) 1917.78 (1104.67)
## status (mean (sd)) 0.83 (0.96)
## trt (mean (sd)) 1.49 (0.50)
## age (mean (sd)) 50.74 (10.45)
## sex = f (%) 374 (89.5)
## ascites (mean (sd)) 0.08 (0.27)
## hepato (mean (sd)) 0.51 (0.50)
## spiders (mean (sd)) 0.29 (0.45)
## edema (mean (sd)) 0.10 (0.25)
## bili (mean (sd)) 3.22 (4.41)
## chol (mean (sd)) 369.51 (231.94)
## albumin (mean (sd)) 3.50 (0.42)
## copper (mean (sd)) 97.65 (85.61)
## alk.phos (mean (sd)) 1982.66 (2140.39)
## ast (mean (sd)) 122.56 (56.70)
## trig (mean (sd)) 124.70 (65.15)
## platelet (mean (sd)) 257.02 (98.33)
## protime (mean (sd)) 10.73 (1.02)
## stage (mean (sd)) 3.02 (0.88)
Some variables are not appropriate as patient baseline characteristics, so let’s specify variables via the vars argument. Here we remove patient ID and outcome variables (time and status).
dput(names(pbc))
## c("id", "time", "status", "trt", "age", "sex", "ascites", "hepato",
## "spiders", "edema", "bili", "chol", "albumin", "copper", "alk.phos",
## "ast", "trig", "platelet", "protime", "stage")
vars <- c("trt", "age", "sex", "ascites", "hepato",
"spiders", "edema", "bili", "chol", "albumin", "copper", "alk.phos",
"ast", "trig", "platelet", "protime", "stage")
CreateTableOne(vars = vars, data = pbc)
##
## Overall
## n 418
## trt (mean (sd)) 1.49 (0.50)
## age (mean (sd)) 50.74 (10.45)
## sex = f (%) 374 (89.5)
## ascites (mean (sd)) 0.08 (0.27)
## hepato (mean (sd)) 0.51 (0.50)
## spiders (mean (sd)) 0.29 (0.45)
## edema (mean (sd)) 0.10 (0.25)
## bili (mean (sd)) 3.22 (4.41)
## chol (mean (sd)) 369.51 (231.94)
## albumin (mean (sd)) 3.50 (0.42)
## copper (mean (sd)) 97.65 (85.61)
## alk.phos (mean (sd)) 1982.66 (2140.39)
## ast (mean (sd)) 122.56 (56.70)
## trig (mean (sd)) 124.70 (65.15)
## platelet (mean (sd)) 257.02 (98.33)
## protime (mean (sd)) 10.73 (1.02)
## stage (mean (sd)) 3.02 (0.88)
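Instead of retyping the names, the same vector can be derived by dropping the unwanted columns with setdiff(). This is a minimal sketch; the column names are copied from the dput() output above so that it runs on its own, and all_names and vars_alt are names made up for illustration.

```r
# Column names as printed by dput(names(pbc)) above
all_names <- c("id", "time", "status", "trt", "age", "sex", "ascites", "hepato",
               "spiders", "edema", "bili", "chol", "albumin", "copper", "alk.phos",
               "ast", "trig", "platelet", "protime", "stage")

# Drop the patient ID and the outcome variables; the order of the rest is kept
vars_alt <- setdiff(all_names, c("id", "time", "status"))
vars_alt
```

With the real data, all_names would simply be names(pbc).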
See ?pbc to better understand the dataset.
pbc package:survival R Documentation
Mayo Clinic Primary Biliary Cirrhosis Data
Description:
This data is from the Mayo Clinic trial in primary biliary
cirrhosis (PBC) of the liver conducted between 1974 and 1984. A
total of 424 PBC patients, referred to Mayo Clinic during that
ten-year interval, met eligibility criteria for the randomized
placebo controlled trial of the drug D-penicillamine. The first
312 cases in the data set participated in the randomized trial and
contain largely complete data. The additional 112 cases did not
participate in the clinical trial, but consented to have basic
measurements recorded and to be followed for survival. Six of
those cases were lost to follow-up shortly after diagnosis, so the
data here are on an additional 106 cases as well as the 312
randomized participants.
A nearly identical data set found in appendix D of Fleming and
Harrington; this version has fewer missing values.
Usage:
pbc
Format:
age: in years
albumin: serum albumin (g/dl)
alk.phos: alkaline phosphotase (U/liter)
ascites: presence of ascites
ast: aspartate aminotransferase, once called SGOT (U/ml)
bili: serum bilirunbin (mg/dl)
chol: serum cholesterol (mg/dl)
copper: urine copper (ug/day)
edema: 0 no edema, 0.5 untreated or successfully treated
1 edema despite diuretic therapy
hepato: presence of hepatomegaly or enlarged liver
id: case number
platelet: platelet count
protime: standardised blood clotting time
sex: m/f
spiders: blood vessel malformations in the skin
stage: histologic stage of disease (needs biopsy)
status: status at endpoint, 0/1/2 for censored, transplant, dead
time: number of days between registration and the earlier of death,
transplantion, or study analysis in July, 1986
trt: 1/2/NA for D-penicillmain, placebo, not randomised
trig: triglycerides (mg/dl)
Source:
T Therneau and P Grambsch (2000), _Modeling Survival Data:
Extending the Cox Model_, Springer-Verlag, New York. ISBN:
0-387-98784-3.
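We can see that some variables are numerically coded categorical variables (ascites, edema, hepato, stage, trt). Here we convert these to factors so that they are summarized as counts and percentages. For binary variables, make the second level the one you want to show the percentage for. Note that spiders is also a 0/1 indicator; it is left numeric here, so it continues to be summarized as mean (sd).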
pbc <- pbc %>%
    mutate(ascites = factor(ascites, levels = c(0, 1), labels = c("Absent", "Present")),
           edema = factor(edema, levels = c(0, 0.5, 1),
                          labels = c("No edema", "Untreated or successfully treated", "edema despite diuretic therapy")),
           hepato = factor(hepato, levels = c(0, 1), labels = c("Absent", "Present")),
           stage = factor(stage),
           trt = factor(trt, levels = c(1, 2), labels = c("D-penicillmain", "Placebo")))
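To see how the level order decides which category is reported for a binary variable, here is a small sketch on a toy 0/1 vector (x, f and f_rev are hypothetical names, not part of the pbc data).

```r
# Toy 0/1 indicator standing in for a column such as ascites
x <- c(0, 1, 0, 0, 1, NA)

# "Present" is the second level, so it is the one whose percentage is reported
f <- factor(x, levels = c(0, 1), labels = c("Absent", "Present"))
levels(f)[2]

# To report the other category instead, reorder the levels so it comes second
f_rev <- factor(f, levels = c("Present", "Absent"))
levels(f_rev)[2]

# NA stays NA (values not listed in `levels` would also become NA),
# so it is worth checking the counts after converting
table(f, useNA = "ifany")
```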
Now these variables are handled better.
CreateTableOne(vars = vars, data = pbc)
##
## Overall
## n 418
## trt = Placebo (%) 154 (49.4)
## age (mean (sd)) 50.74 (10.45)
## sex = f (%) 374 (89.5)
## ascites = Present (%) 24 ( 7.7)
## hepato = Present (%) 160 (51.3)
## spiders (mean (sd)) 0.29 (0.45)
## edema (%)
## No edema 354 (84.7)
## Untreated or successfully treated 44 (10.5)
## edema despite diuretic therapy 20 ( 4.8)
## bili (mean (sd)) 3.22 (4.41)
## chol (mean (sd)) 369.51 (231.94)
## albumin (mean (sd)) 3.50 (0.42)
## copper (mean (sd)) 97.65 (85.61)
## alk.phos (mean (sd)) 1982.66 (2140.39)
## ast (mean (sd)) 122.56 (56.70)
## trig (mean (sd)) 124.70 (65.15)
## platelet (mean (sd)) 257.02 (98.33)
## protime (mean (sd)) 10.73 (1.02)
## stage (%)
## 1 21 ( 5.1)
## 2 92 (22.3)
## 3 155 (37.6)
## 4 144 (35.0)
Show missing proportions with the missing option to the print method.
print(CreateTableOne(vars = vars, data = pbc), missing = TRUE)
##
## Overall Missing
## n 418
## trt = Placebo (%) 154 (49.4) 25.4
## age (mean (sd)) 50.74 (10.45) 0.0
## sex = f (%) 374 (89.5) 0.0
## ascites = Present (%) 24 ( 7.7) 25.4
## hepato = Present (%) 160 (51.3) 25.4
## spiders (mean (sd)) 0.29 (0.45) 25.4
## edema (%) 0.0
## No edema 354 (84.7)
## Untreated or successfully treated 44 (10.5)
## edema despite diuretic therapy 20 ( 4.8)
## bili (mean (sd)) 3.22 (4.41) 0.0
## chol (mean (sd)) 369.51 (231.94) 32.1
## albumin (mean (sd)) 3.50 (0.42) 0.0
## copper (mean (sd)) 97.65 (85.61) 25.8
## alk.phos (mean (sd)) 1982.66 (2140.39) 25.4
## ast (mean (sd)) 122.56 (56.70) 25.4
## trig (mean (sd)) 124.70 (65.15) 32.5
## platelet (mean (sd)) 257.02 (98.33) 2.6
## protime (mean (sd)) 10.73 (1.02) 0.5
## stage (%) 1.4
## 1 21 ( 5.1)
## 2 92 (22.3)
## 3 155 (37.6)
## 4 144 (35.0)
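The Missing column is the percentage of rows that are NA for each variable. The sketch below does the same calculation in base R on a toy data frame (toy and pct_missing are hypothetical names); it illustrates the idea and is not the package's internal code.

```r
# Toy data with some missing values
toy <- data.frame(
  age  = c(58.8, 56.4, 70.1, 54.7),
  chol = c(261, NA, 176, NA),
  trt  = c(1, 2, NA, 1)
)

# Percentage of missing values per column
pct_missing <- round(100 * colMeans(is.na(toy)), 1)
pct_missing

# pbc has 418 rows, of which 106 were not randomized and so have trt = NA
# (see ?pbc); this reproduces the 25.4 shown for trt above
round(100 * 106 / 418, 1)
```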
trt is the treatment assignment variable, so we should stratify the table by it. P-values are added using reasonable default tests.
vars <- setdiff(vars, "trt")
CreateTableOne(vars = vars, strata = "trt", data = pbc)
## Stratified by trt
## D-penicillmain Placebo p test
## n 158 154
## age (mean (sd)) 51.42 (11.01) 48.58 (9.96) 0.018
## sex = f (%) 137 (86.7) 139 (90.3) 0.421
## ascites = Present (%) 14 ( 8.9) 10 ( 6.5) 0.567
## hepato = Present (%) 73 (46.2) 87 (56.5) 0.088
## spiders (mean (sd)) 0.28 (0.45) 0.29 (0.46) 0.886
## edema (%) 0.877
## No edema 132 (83.5) 131 (85.1)
## Untreated or successfully treated 16 (10.1) 13 ( 8.4)
## edema despite diuretic therapy 10 ( 6.3) 10 ( 6.5)
## bili (mean (sd)) 2.87 (3.63) 3.65 (5.28) 0.131
## chol (mean (sd)) 365.01 (209.54) 373.88 (252.48) 0.748
## albumin (mean (sd)) 3.52 (0.44) 3.52 (0.40) 0.874
## copper (mean (sd)) 97.64 (90.59) 97.65 (80.49) 0.999
## alk.phos (mean (sd)) 2021.30 (2183.44) 1943.01 (2101.69) 0.747
## ast (mean (sd)) 120.21 (54.52) 124.97 (58.93) 0.460
## trig (mean (sd)) 124.14 (71.54) 125.25 (58.52) 0.886
## platelet (mean (sd)) 258.75 (100.32) 265.20 (90.73) 0.555
## protime (mean (sd)) 10.65 (0.85) 10.80 (1.14) 0.197
## stage (%) 0.201
## 1 12 ( 7.6) 4 ( 2.6)
## 2 35 (22.2) 32 (20.8)
## 3 56 (35.4) 64 (41.6)
## 4 55 (34.8) 54 (35.1)
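As far as I understand the package defaults, continuous variables are compared with oneway.test() assuming equal variances and categorical variables with chisq.test() with continuity correction; please treat that mapping as an assumption and check the CreateTableOne() help page. The sketch below runs both tests on simulated data (toy and its columns are made up for illustration).

```r
set.seed(1)
toy <- data.frame(
  trt = factor(rep(c("Drug", "Placebo"), each = 50)),
  age = c(rnorm(50, mean = 51, sd = 11), rnorm(50, mean = 49, sd = 10)),
  sex = factor(sample(c("m", "f"), 100, replace = TRUE, prob = c(0.3, 0.7)),
               levels = c("m", "f"))
)

# Continuous variable: one-way ANOVA, which for two groups gives the same
# p-value as the equal-variance two-sample t-test
p_age <- oneway.test(age ~ trt, data = toy, var.equal = TRUE)$p.value
p_t   <- t.test(age ~ trt, data = toy, var.equal = TRUE)$p.value

# Categorical variable: chi-squared test on the cross-tabulation
p_sex <- chisq.test(table(toy$sex, toy$trt), correct = TRUE)$p.value

c(age = p_age, age_ttest = p_t, sex = p_sex)
```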
Some continuous variables are quite skewed, as most biomarkers are. Median [IQR] may be the preferred format for these. As the test column indicates (nonnorm), the p-values for these variables come from a different function, a rank-based test that is equivalent to the Wilcoxon rank-sum test when there are two groups.
print(CreateTableOne(vars = vars, strata = "trt", data = pbc), nonnormal = c("bili","chol"))
## Stratified by trt
## D-penicillmain Placebo p test
## n 158 154
## age (mean (sd)) 51.42 (11.01) 48.58 (9.96) 0.018
## sex = f (%) 137 (86.7) 139 (90.3) 0.421
## ascites = Present (%) 14 ( 8.9) 10 ( 6.5) 0.567
## hepato = Present (%) 73 (46.2) 87 (56.5) 0.088
## spiders (mean (sd)) 0.28 (0.45) 0.29 (0.46) 0.886
## edema (%) 0.877
## No edema 132 (83.5) 131 (85.1)
## Untreated or successfully treated 16 (10.1) 13 ( 8.4)
## edema despite diuretic therapy 10 ( 6.3) 10 ( 6.5)
## bili (median [IQR]) 1.40 [0.80, 3.20] 1.30 [0.72, 3.60] 0.842 nonnorm
## chol (median [IQR]) 315.50 [247.75, 417.00] 303.50 [254.25, 377.00] 0.544 nonnorm
## albumin (mean (sd)) 3.52 (0.44) 3.52 (0.40) 0.874
## copper (mean (sd)) 97.64 (90.59) 97.65 (80.49) 0.999
## alk.phos (mean (sd)) 2021.30 (2183.44) 1943.01 (2101.69) 0.747
## ast (mean (sd)) 120.21 (54.52) 124.97 (58.93) 0.460
## trig (mean (sd)) 124.14 (71.54) 125.25 (58.52) 0.886
## platelet (mean (sd)) 258.75 (100.32) 265.20 (90.73) 0.555
## protime (mean (sd)) 10.65 (0.85) 10.80 (1.14) 0.197
## stage (%) 0.201
## 1 12 ( 7.6) 4 ( 2.6)
## 2 35 (22.2) 32 (20.8)
## 3 56 (35.4) 64 (41.6)
## 4 55 (34.8) 54 (35.1)
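The sketch below shows what the nonnormal summary and test amount to, on simulated right-skewed data (toy is a made-up data frame). I am assuming the package's default nonparametric test is kruskal.test(); for two groups it agrees with the Wilcoxon rank-sum test using the normal approximation without continuity correction.

```r
set.seed(2)
toy <- data.frame(
  trt  = factor(rep(c("Drug", "Placebo"), each = 60)),
  bili = rlnorm(120, meanlog = 0.4, sdlog = 1)  # right-skewed, like many biomarkers
)

# 25th percentile, median and 75th percentile per group
# (rows are the quantiles, columns are the groups)
med_iqr <- sapply(split(toy$bili, toy$trt), quantile, probs = c(0.25, 0.5, 0.75))
med_iqr

# Rank-based comparison of the two groups
p_kw <- kruskal.test(bili ~ trt, data = toy)$p.value
p_wx <- wilcox.test(bili ~ trt, data = toy, exact = FALSE, correct = FALSE)$p.value
c(kruskal = p_kw, wilcoxon = p_wx)
```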
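In propensity score analyses, standardized mean differences (SMDs) are often preferred over p-values for assessing covariate balance. Use the smd argument of the print method to show them; test = FALSE hides the p-values.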
print(CreateTableOne(vars = vars, strata = "trt", data = pbc), nonnormal = c("bili","chol"), smd = TRUE, test = FALSE)
## Stratified by trt
## D-penicillmain Placebo SMD
## n 158 154
## age (mean (sd)) 51.42 (11.01) 48.58 (9.96) 0.270
## sex = f (%) 137 (86.7) 139 (90.3) 0.111
## ascites = Present (%) 14 ( 8.9) 10 ( 6.5) 0.089
## hepato = Present (%) 73 (46.2) 87 (56.5) 0.207
## spiders (mean (sd)) 0.28 (0.45) 0.29 (0.46) 0.016
## edema (%) 0.058
## No edema 132 (83.5) 131 (85.1)
## Untreated or successfully treated 16 (10.1) 13 ( 8.4)
## edema despite diuretic therapy 10 ( 6.3) 10 ( 6.5)
## bili (median [IQR]) 1.40 [0.80, 3.20] 1.30 [0.72, 3.60] 0.171
## chol (median [IQR]) 315.50 [247.75, 417.00] 303.50 [254.25, 377.00] 0.038
## albumin (mean (sd)) 3.52 (0.44) 3.52 (0.40) 0.018
## copper (mean (sd)) 97.64 (90.59) 97.65 (80.49) <0.001
## alk.phos (mean (sd)) 2021.30 (2183.44) 1943.01 (2101.69) 0.037
## ast (mean (sd)) 120.21 (54.52) 124.97 (58.93) 0.084
## trig (mean (sd)) 124.14 (71.54) 125.25 (58.52) 0.017
## platelet (mean (sd)) 258.75 (100.32) 265.20 (90.73) 0.067
## protime (mean (sd)) 10.65 (0.85) 10.80 (1.14) 0.146
## stage (%) 0.246
## 1 12 ( 7.6) 4 ( 2.6)
## 2 35 (22.2) 32 (20.8)
## 3 56 (35.4) 64 (41.6)
## 4 55 (34.8) 54 (35.1)
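For intuition, the usual two-group SMD formulas can be applied by hand to the summary statistics printed above. This is a sketch with hypothetical helper names (smd_continuous, smd_binary), not the package's own code, and because it starts from rounded means and SDs the results only agree with the table to about two decimals.

```r
# SMD for a continuous variable from group means and standard deviations
smd_continuous <- function(m1, s1, m2, s2) {
  abs(m1 - m2) / sqrt((s1^2 + s2^2) / 2)
}

# SMD for a binary variable from group proportions
smd_binary <- function(p1, p2) {
  abs(p1 - p2) / sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / 2)
}

# Age: 51.42 (11.01) vs 48.58 (9.96) in the table above
smd_age <- smd_continuous(51.42, 11.01, 48.58, 9.96)

# Female: 137/158 vs 139/154 in the table above
smd_female <- smd_binary(137 / 158, 139 / 154)

round(c(age = smd_age, female = smd_female), 2)
```

Applying smd_continuous() to the means and SDs of bili from the earlier stratified table (2.87 (3.63) vs 3.65 (5.28)) gives about 0.17, which suggests the SMD is still based on means and SDs even when the median [IQR] is displayed.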
Variable names are typically short and not appropriate for the final version of the table. Use the labelled package to assign variable labels.
var_label_list <- list(age = "Age in years",
sex = "Female",
ascites = "Ascites",
hepato = "Hepatomegaly",
spiders = "Spider angioma",
edema = "Edema",
bili = "Serum bilirunbin, mg/dl",
chol = "Serum cholesterol, mg/dl",
copper = "Urine copper ug/day",
stage = "Histologic stage of disease",
trig = "Triglycerides, mg/dl",
albumin = "Serum albumin, g/dl",
alk.phos = "Alkaline phosphotase, U/liter",
ast = "Aspartate aminotransferase, U/ml",
platelet = "Platelet count",
protime = "Prothrombin time in seconds")
labelled::var_label(pbc) <- var_label_list
labelled::var_label(pbc)
## $id
## NULL
##
## $time
## NULL
##
## $status
## NULL
##
## $trt
## NULL
##
## $age
## [1] "Age in years"
##
## $sex
## [1] "Female"
##
## $ascites
## [1] "Ascites"
##
## $hepato
## [1] "Hepatomegaly"
##
## $spiders
## [1] "Spider angioma"
##
## $edema
## [1] "Edema"
##
## $bili
## [1] "Serum bilirunbin, mg/dl"
##
## $chol
## [1] "Serum cholesterol, mg/dl"
##
## $albumin
## [1] "Serum albumin, g/dl"
##
## $copper
## [1] "Urine copper ug/day"
##
## $alk.phos
## [1] "Alkaline phosphotase, U/liter"
##
## $ast
## [1] "Aspartate aminotransferase, U/ml"
##
## $trig
## [1] "Triglycerides, mg/dl"
##
## $platelet
## [1] "Platelet count"
##
## $protime
## [1] "Prothrombin time in seconds"
##
## $stage
## [1] "Histologic stage of disease"
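As far as I understand, a variable label is simply a "label" attribute attached to each column, which is what labelled::var_label() reads and writes. The base R sketch below shows the same idea on a toy data frame (toy and labels_list are hypothetical names).

```r
toy <- data.frame(age = c(58.8, 56.4), bili = c(14.5, 1.1), id = 1:2)

# Attach a label to selected columns via the "label" attribute
attr(toy$age, "label")  <- "Age in years"
attr(toy$bili, "label") <- "Serum bilirubin, mg/dl"

# Collect the labels into a named list; columns without a label give NULL
labels_list <- lapply(toy, attr, which = "label")
labels_list
```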
Let’s see the table with variable labels.
print(CreateTableOne(vars = vars, strata = "trt", data = pbc), nonnormal = c("bili","chol"), smd = TRUE, test = FALSE, varLabels = TRUE)
## Stratified by trt
## D-penicillmain Placebo SMD
## n 158 154
## Age in years (mean (sd)) 51.42 (11.01) 48.58 (9.96) 0.270
## Female = f (%) 137 (86.7) 139 (90.3) 0.111
## Ascites = Present (%) 14 ( 8.9) 10 ( 6.5) 0.089
## Hepatomegaly = Present (%) 73 (46.2) 87 (56.5) 0.207
## Spider angioma (mean (sd)) 0.28 (0.45) 0.29 (0.46) 0.016
## Edema (%) 0.058
## No edema 132 (83.5) 131 (85.1)
## Untreated or successfully treated 16 (10.1) 13 ( 8.4)
## edema despite diuretic therapy 10 ( 6.3) 10 ( 6.5)
## Serum bilirunbin, mg/dl (median [IQR]) 1.40 [0.80, 3.20] 1.30 [0.72, 3.60] 0.171
## Serum cholesterol, mg/dl (median [IQR]) 315.50 [247.75, 417.00] 303.50 [254.25, 377.00] 0.038
## Serum albumin, g/dl (mean (sd)) 3.52 (0.44) 3.52 (0.40) 0.018
## Urine copper ug/day (mean (sd)) 97.64 (90.59) 97.65 (80.49) <0.001
## Alkaline phosphotase, U/liter (mean (sd)) 2021.30 (2183.44) 1943.01 (2101.69) 0.037
## Aspartate aminotransferase, U/ml (mean (sd)) 120.21 (54.52) 124.97 (58.93) 0.084
## Triglycerides, mg/dl (mean (sd)) 124.14 (71.54) 125.25 (58.52) 0.017
## Platelet count (mean (sd)) 258.75 (100.32) 265.20 (90.73) 0.067
## Prothrombin time in seconds (mean (sd)) 10.65 (0.85) 10.80 (1.14) 0.146
## Histologic stage of disease (%) 0.246
## 1 12 ( 7.6) 4 ( 2.6)
## 2 35 (22.2) 32 (20.8)
## 3 56 (35.4) 64 (41.6)
## 4 55 (34.8) 54 (35.1)
Once the binary categories look right, we can suppress the level indication (the “= f” and “= Present” suffixes) with the dropEqual option.
print(CreateTableOne(vars = vars, strata = "trt", data = pbc), nonnormal = c("bili","chol"), smd = TRUE, test = FALSE, varLabels = TRUE, dropEqual = TRUE)
## Stratified by trt
## D-penicillmain Placebo SMD
## n 158 154
## Age in years (mean (sd)) 51.42 (11.01) 48.58 (9.96) 0.270
## Female (%) 137 (86.7) 139 (90.3) 0.111
## Ascites (%) 14 ( 8.9) 10 ( 6.5) 0.089
## Hepatomegaly (%) 73 (46.2) 87 (56.5) 0.207
## Spider angioma (mean (sd)) 0.28 (0.45) 0.29 (0.46) 0.016
## Edema (%) 0.058
## No edema 132 (83.5) 131 (85.1)
## Untreated or successfully treated 16 (10.1) 13 ( 8.4)
## edema despite diuretic therapy 10 ( 6.3) 10 ( 6.5)
## Serum bilirunbin, mg/dl (median [IQR]) 1.40 [0.80, 3.20] 1.30 [0.72, 3.60] 0.171
## Serum cholesterol, mg/dl (median [IQR]) 315.50 [247.75, 417.00] 303.50 [254.25, 377.00] 0.038
## Serum albumin, g/dl (mean (sd)) 3.52 (0.44) 3.52 (0.40) 0.018
## Urine copper ug/day (mean (sd)) 97.64 (90.59) 97.65 (80.49) <0.001
## Alkaline phosphotase, U/liter (mean (sd)) 2021.30 (2183.44) 1943.01 (2101.69) 0.037
## Aspartate aminotransferase, U/ml (mean (sd)) 120.21 (54.52) 124.97 (58.93) 0.084
## Triglycerides, mg/dl (mean (sd)) 124.14 (71.54) 125.25 (58.52) 0.017
## Platelet count (mean (sd)) 258.75 (100.32) 265.20 (90.73) 0.067
## Prothrombin time in seconds (mean (sd)) 10.65 (0.85) 10.80 (1.14) 0.146
## Histologic stage of disease (%) 0.246
## 1 12 ( 7.6) 4 ( 2.6)
## 2 35 (22.2) 32 (20.8)
## 3 56 (35.4) 64 (41.6)
## 4 55 (34.8) 54 (35.1)
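The kableone function is currently only in the GitHub version. It can be called instead of the print method to produce a table via knitr::kable(). Note that the minimal definition shown here passes ... only to knitr::kable(), not to print(), so the print options in the call below are not applied and the resulting table shows the defaults (p-values and unlabelled variable names).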
kableone <- function(x, ...) {
capture.output(x <- print(x))
knitr::kable(x, ...)
}
kableone(CreateTableOne(vars = vars, strata = "trt", data = pbc), nonnormal = c("bili","chol"), smd = TRUE, test = FALSE, varLabels = TRUE, dropEqual = TRUE, noSpaces = TRUE, printToggle = FALSE)
| | D-penicillmain | Placebo | p | test |
|---|---|---|---|---|
| n | 158 | 154 | | |
| age (mean (sd)) | 51.42 (11.01) | 48.58 (9.96) | 0.018 | |
| sex = f (%) | 137 (86.7) | 139 (90.3) | 0.421 | |
| ascites = Present (%) | 14 ( 8.9) | 10 ( 6.5) | 0.567 | |
| hepato = Present (%) | 73 (46.2) | 87 (56.5) | 0.088 | |
| spiders (mean (sd)) | 0.28 (0.45) | 0.29 (0.46) | 0.886 | |
| edema (%) | | | 0.877 | |
| No edema | 132 (83.5) | 131 (85.1) | | |
| Untreated or successfully treated | 16 (10.1) | 13 ( 8.4) | | |
| edema despite diuretic therapy | 10 ( 6.3) | 10 ( 6.5) | | |
| bili (mean (sd)) | 2.87 (3.63) | 3.65 (5.28) | 0.131 | |
| chol (mean (sd)) | 365.01 (209.54) | 373.88 (252.48) | 0.748 | |
| albumin (mean (sd)) | 3.52 (0.44) | 3.52 (0.40) | 0.874 | |
| copper (mean (sd)) | 97.64 (90.59) | 97.65 (80.49) | 0.999 | |
| alk.phos (mean (sd)) | 2021.30 (2183.44) | 1943.01 (2101.69) | 0.747 | |
| ast (mean (sd)) | 120.21 (54.52) | 124.97 (58.93) | 0.460 | |
| trig (mean (sd)) | 124.14 (71.54) | 125.25 (58.52) | 0.886 | |
| platelet (mean (sd)) | 258.75 (100.32) | 265.20 (90.73) | 0.555 | |
| protime (mean (sd)) | 10.65 (0.85) | 10.80 (1.14) | 0.197 | |
| stage (%) | | | 0.201 | |
| 1 | 12 ( 7.6) | 4 ( 2.6) | | |
| 2 | 35 (22.2) | 32 (20.8) | | |
| 3 | 56 (35.4) | 64 (41.6) | | |
| 4 | 55 (34.8) | 54 (35.1) | | |
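The table above still shows p-values and raw variable names because the helper calls print(x) without the extra arguments. A variant that forwards them might look like the sketch below. kableone_fwd is a hypothetical name, and toy_table with its print method is a stand-in for a TableOne object so that the sketch runs without tableone installed (knitr is required).

```r
# Hypothetical variant of the helper: forward ... to print() so that print
# options such as smd or varLabels take effect
kableone_fwd <- function(x, ...) {
  # capture.output() swallows the console printing; the matrix is kept in `mat`
  utils::capture.output(mat <- print(x, ...))
  knitr::kable(mat)
}

# Toy stand-in for a TableOne object: its print method returns a character
# matrix invisibly, and the smd option adds a column
toy <- structure(list(), class = "toy_table")
print.toy_table <- function(x, smd = FALSE, ...) {
  mat <- matrix(c("158", "154"), nrow = 1,
                dimnames = list("n", c("D-penicillmain", "Placebo")))
  if (smd) mat <- cbind(mat, SMD = "")
  print(mat, quote = FALSE)
  invisible(mat)
}

out <- kableone_fwd(toy, smd = TRUE)
out
```

In this sketch nothing is passed on to knitr::kable(), which keeps print options and kable options from being mixed up; a fuller version would need a way to separate the two.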
The print method invisibly returns a matrix object, which we can export to a file. In the console, the columns are aligned by padding with spaces, but we don’t need the padding when exporting. The noSpaces option removes it. If assigning the matrix is all you need, you can turn off printing with the printToggle option.
tab1mat <- print(CreateTableOne(vars = vars, strata = "trt", data = pbc), nonnormal = c("bili","chol"), smd = TRUE, test = FALSE, varLabels = TRUE, dropEqual = TRUE, noSpaces = TRUE, printToggle = FALSE)
Now this is just a matrix of text.
tab1mat
## Stratified by trt
## D-penicillmain Placebo SMD
## n "158" "154" ""
## Age in years (mean (sd)) "51.42 (11.01)" "48.58 (9.96)" "0.270"
## Female (%) "137 (86.7)" "139 (90.3)" "0.111"
## Ascites (%) "14 (8.9)" "10 (6.5)" "0.089"
## Hepatomegaly (%) "73 (46.2)" "87 (56.5)" "0.207"
## Spider angioma (mean (sd)) "0.28 (0.45)" "0.29 (0.46)" "0.016"
## Edema (%) "" "" "0.058"
## No edema "132 (83.5)" "131 (85.1)" ""
## Untreated or successfully treated "16 (10.1)" "13 (8.4)" ""
## edema despite diuretic therapy "10 (6.3)" "10 (6.5)" ""
## Serum bilirunbin, mg/dl (median [IQR]) "1.40 [0.80, 3.20]" "1.30 [0.72, 3.60]" "0.171"
## Serum cholesterol, mg/dl (median [IQR]) "315.50 [247.75, 417.00]" "303.50 [254.25, 377.00]" "0.038"
## Serum albumin, g/dl (mean (sd)) "3.52 (0.44)" "3.52 (0.40)" "0.018"
## Urine copper ug/day (mean (sd)) "97.64 (90.59)" "97.65 (80.49)" "<0.001"
## Alkaline phosphotase, U/liter (mean (sd)) "2021.30 (2183.44)" "1943.01 (2101.69)" "0.037"
## Aspartate aminotransferase, U/ml (mean (sd)) "120.21 (54.52)" "124.97 (58.93)" "0.084"
## Triglycerides, mg/dl (mean (sd)) "124.14 (71.54)" "125.25 (58.52)" "0.017"
## Platelet count (mean (sd)) "258.75 (100.32)" "265.20 (90.73)" "0.067"
## Prothrombin time in seconds (mean (sd)) "10.65 (0.85)" "10.80 (1.14)" "0.146"
## Histologic stage of disease (%) "" "" "0.246"
## 1 "12 (7.6)" "4 (2.6)" ""
## 2 "35 (22.2)" "32 (20.8)" ""
## 3 "56 (35.4)" "64 (41.6)" ""
## 4 "55 (34.8)" "54 (35.1)" ""
You can write to a CSV file easily.
write.csv(tab1mat, file = "./tab1.csv")
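A quick way to check what ends up in the file is to write a matrix and read it back. The sketch below uses a small toy character matrix in place of tab1mat and a temporary file (toy_mat, csv_file and back are hypothetical names).

```r
# Toy character matrix shaped like the exported table
toy_mat <- matrix(c("158", "154",
                    "51.42 (11.01)", "48.58 (9.96)"),
                  nrow = 2, byrow = TRUE,
                  dimnames = list(c("n", "Age in years (mean (sd))"),
                                  c("D-penicillmain", "Placebo")))

csv_file <- tempfile(fileext = ".csv")
write.csv(toy_mat, file = csv_file)

# Row names are written as the first column; check.names = FALSE keeps the
# hyphen in the column name, and colClasses keeps every cell as text
back <- read.csv(csv_file, row.names = 1, check.names = FALSE,
                 colClasses = "character")
back
```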
Export to an Excel file is not part of the package yet. Download the helper file from my gist. The functions depend on the openxlsx package.
library(openxlsx)
source("./tableone_helper_functions.R")
We can export to a .xlsx file with some useful formats by default.
write_tableone_mat_to_xlsx(tab1mat, file = "./tab1.xlsx", font_size = 10)