TOne - Flexible ‘Table 1’ for Groupwise Description

In the context of scientific publications, “Table 1” stands for a table with characteristics of the study population. Typically it consists of descriptive statistics for the variables used, such as mean/standard deviation for continuous variables and proportions for categorical variables. Often, a comparison is made between several groups within the framework of the scientific question. Creating such a table by hand can be very time-consuming, so there is a need for a flexible function that helps to solve the task.

DescTools::TOne() is designed to address three degrees of freedom: structure, logic, and format.

Structure

Regarding row/column structure, TOne is simple. The rows contain the names of the variables and their statistical descriptions; for the columns, a grouping variable can be specified. If necessary, a separate column describing the entire population, without differentiating between groups, can be added. The last column contains the results of an optional statistical test for group differences. The size n of the individual groups can be added as the first row.
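As a minimal sketch of these structural options, the following call compares the default layout with a reduced one. The argument total is the one used in this document's own pizza example; add.length is assumed here, based on the DescTools documentation, to switch the group-size row on and off, and should be checked against the installed version.

```r
library(DescTools)

# Two numeric variables from mtcars, grouped by number of cylinders
vars <- mtcars[, c("mpg", "wt")]

# Default layout: group-size row "n" first, a "total" column,
# one column per group and a final column with the test result
t_full <- TOne(x = vars, grp = mtcars$cyl)

# Reduced layout: no "total" column and no group-size row
# (assumption: add.length controls the "n" row)
t_slim <- TOne(x = vars, grp = mtcars$cyl,
               total = FALSE, add.length = FALSE)

t_full
t_slim
```

Comparing dim(t_full) with dim(t_slim) is a quick way to confirm what each switch removes.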

Logic

The “logic” determines which statistic is used to describe the variables and which test is used to check for group differences. By default, numerical variables are described by the mean and the standard deviation, and qualitative variables by absolute and relative frequencies. However, other functions (such as median/IQR) can be freely defined. Group differences in numerical variables are tested with the Kruskal-Wallis test, those in qualitative variables with a Chi-square test or, in the case of dichotomous variables, with a Fisher exact test. If this choice does not meet the specific needs, other tests can be configured without restrictions for all three cases. For dichotomous variables we might want to choose the level to be reported (e.g. should the percentage for men or for women be output?). There’s an argument to determine whether, in general, the first or the second level is chosen. For the sake of clarity the reported level is included in the variable name.
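The following is a hedged sketch of how the description and the tests might be swapped out, using the FUN and TEST arguments that also appear in this document's pizza example. The helper name med_iqr and the object name tests are made up for illustration; all three test cases are spelled out explicitly because it is not assumed here that a partial list would be completed with defaults.

```r
library(DescTools)

# Describe numeric variables as "median (IQR)" instead of "mean (sd)"
# (med_iqr is a hypothetical helper, not part of DescTools)
med_iqr <- function(x) {
  sprintf("%.1f (%.1f)",
          median(x, na.rm = TRUE),
          IQR(x, na.rm = TRUE))
}

# Use a one-way ANOVA for numeric variables; the tests for categorical
# and dichotomous variables are restated so that all three cases are defined
tests <- list(
  num  = list(fun = function(x, g) summary(aov(x ~ factor(g)))[[1]][1, "Pr(>F)"],
              lbl = "ANOVA"),
  cat  = list(fun = function(x, g) chisq.test(table(x, g))$p.value,
              lbl = "Chi-Square test"),
  dich = list(fun = function(x, g) fisher.test(table(x, g))$p.value,
              lbl = "Fisher exact test")
)

TOne(x = mtcars[, c("mpg", "hp", "wt")], grp = mtcars$cyl,
     FUN = med_iqr, TEST = tests)
```

Each test function receives the variable x and the grouping g and only has to return something that can be read as a p-value, so any test with that shape can be plugged in.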

Format

Great importance was attached to the free definition of the number formats. The representation of counts, numerical values, percentages and p-values of the statistical tests can be freely defined using either R’s various formatting functions or DescTools’s integrating function Format().
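A small sketch of a custom format definition, built with as.fmt() in the same way as in this document's pizza example. The object name my_fmt and the particular digits and separators are illustrative choices only; all four entries are supplied because it is not assumed that missing ones would fall back to defaults.

```r
library(DescTools)

# One format template per kind of number that appears in the table
# (my_fmt is a hypothetical name chosen for this sketch)
my_fmt <- list(
  abs  = as.fmt(digits = 0, big.mark = "'"),  # counts
  num  = as.fmt(digits = 2),                  # means and standard deviations
  per  = as.fmt(fmt = "%", digits = 0),       # percentages without decimals
  pval = as.fmt(fmt = "*", na.form = "   ")   # p-values as significance stars
)

# The dichotomous variable am may trigger a chi-squared approximation warning
TOne(x = mtcars[, c("mpg", "am")], grp = mtcars$cyl, fmt = my_fmt)
```

Keeping the templates in one list makes it easy to reuse the same number formats across several tables.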

Result

The function returns a character matrix, which can easily be subset or combined with other matrices.

library(DescTools)
TOne(x = mtcars[,-2], grp = mtcars$cyl)
## Warning in chisq.test(table(x, g)): Chi-squared approximation may be incorrect

## Warning in chisq.test(table(x, g)): Chi-squared approximation may be incorrect
## var               total             4                 6                 8                                  
## n                 32                11 (34.4%)        7 (21.9%)         14 (43.8%)                         
## mpg               20.091 (6.027)    26.664 (4.510)    19.743 (1.454)    15.100 (2.560)    *** ¹            
## disp              230.722 (123.939) 105.136 (26.872)  183.314 (41.562)  353.100 (67.771)  *** ¹            
## hp                146.688 (68.563)  82.636 (20.935)   122.286 (24.260)  209.214 (50.977)  *** ¹            
## drat              3.597 (0.535)     4.071 (0.365)     3.586 (0.476)     3.229 (0.372)     *** ¹            
## wt                3.217 (0.978)     2.286 (0.570)     3.117 (0.356)     3.999 (0.759)     *** ¹            
## qsec              17.849 (1.787)    19.137 (1.682)    17.977 (1.707)    16.772 (1.196)    **  ¹            
## vs (= 1)          14 (43.8%)        10 (90.9%)        4 (57.1%)         0 (0.0%)          *** ³            
## am (= 1)          13 (40.6%)        8 (72.7%)         3 (42.9%)         2 (14.3%)         *   ³            
## gear              3.688 (0.738)     4.091 (0.539)     3.857 (0.690)     3.286 (0.726)     **  ¹            
## carb              2.812 (1.615)     1.545 (0.522)     3.429 (1.813)     3.500 (1.557)     **  ¹            
## ---
## ¹) Kruskal-Wallis test, ²) Fisher exact test, ³) Chi-Square test
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
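Because the result is a character matrix, ordinary matrix operations apply. The sketch below subsets one table and stacks two tables built with the same grouping variable. The object names are illustrative, and it is assumed that attributes of the original objects, such as the footnote legend, are not carried over to the combined matrix.

```r
library(DescTools)

grp <- mtcars$cyl
t_a <- TOne(x = mtcars[, c("mpg", "hp")],  grp = grp)
t_b <- TOne(x = mtcars[, c("wt", "qsec")], grp = grp)

# Subset like any matrix: keep only the first two rows
top <- t_a[1:2, ]

# Combine two tables; the first row of t_b (the group sizes "n")
# would be a duplicate and is dropped
both <- rbind(t_a, t_b[-1, ])

# Print as an ordinary character matrix, without quotes
print(unclass(both), quote = FALSE)
```

Stacking only makes sense when both tables share the same columns, that is, the same grouping variable and the same setting for the total column.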

Now we select the variables to be described using the d.pizza dataset (package DescTools). The numeric statistics are changed to the form “mean / sd (median)”, with a different number of digits for each part. The total column is suppressed, the test for numeric variables is changed to ANOVA, and the formats are set for all types of values (abs = integers, num = numeric values, per = percentages, pval = p-values).

library(DescTools) 
# save the result object for later output
(t1 <- TOne(x    = d.pizza[,c("temperature", "driver", "rabate")], 
    grp   = d.pizza$area, 
    align = " ", 
    total = FALSE,
    
    FUN = function(x) gettextf("%s / %s (%s)",
            Format(mean(x, na.rm = TRUE), digits = 1),
            Format(sd(x, na.rm = TRUE), digits = 3),
            Format(median(x, na.rm = TRUE), digits = 1)),
       
    TEST  = list(
       num  = list(fun = function(x, g){summary(aov(x ~ g))[[1]][1, "Pr(>F)"]},
                   lbl = "ANOVA"),
       cat  = list(fun = function(x, g){chisq.test(table(x, g))$p.value},
                   lbl = "Chi-Square test"),
       dich = list(fun = function(x, g){fisher.test(table(x, g))$p.value},
                   lbl = "Fisher exact test")),
    
    fmt = list(abs  = as.fmt(big.mark = " ", digits=0), 
               num  = as.fmt(big.mark = " ", digits=1), 
               per  = as.fmt(fmt = "%", digits=1), 
               pval = as.fmt(fmt = "*", na.form = "   ")) 
))
## var                  Brent                Camden               Westminster                              
## n                     474 (39.5%)          344 (28.7%)          381 (31.8%)                             
## temperature          51.1 / 8.734 (53.4)  47.4 / 10.111 (50.3) 44.3 / 9.836 (45.9)  *** ¹               
## driver                                                                              *** ³               
##   Butcher              72 (15.2%)            1 (0.3%)            22 (5.8%)                              
##   Carpenter            29 (6.1%)            19 (5.6%)           221 (58.2%)                             
##   Carter              177 (37.4%)           47 (13.8%)            5 (1.3%)                              
##   Farmer               19 (4.0%)            87 (25.5%)           11 (2.9%)                              
##   Hunter              128 (27.1%)            4 (1.2%)            24 (6.3%)                              
##   Miller                6 (1.3%)            41 (12.0%)           77 (20.3%)                             
##   Taylor               42 (8.9%)           142 (41.6%)           20 (5.3%)                              
## rabate (= TRUE)       235 (50.3%)          172 (50.3%)          184 (48.7%)             ³               
## ---
## ¹) ANOVA, ²) Fisher exact test, ³) Chi-Square test
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# alternative output with kable()
knitr::kable(t1, align="lrrr", caption = "'Table 1' presented by kable")

‘Table 1’ presented by kable
var                            Brent                Camden          Westminster
n                        474 (39.5%)           344 (28.7%)          381 (31.8%)
temperature      51.1 / 8.734 (53.4)  47.4 / 10.111 (50.3)  44.3 / 9.836 (45.9)  *** ¹
driver                                                                           *** ³
  Butcher                 72 (15.2%)              1 (0.3%)            22 (5.8%)
  Carpenter                29 (6.1%)             19 (5.6%)          221 (58.2%)
  Carter                 177 (37.4%)            47 (13.8%)             5 (1.3%)
  Farmer                   19 (4.0%)            87 (25.5%)            11 (2.9%)
  Hunter                 128 (27.1%)              4 (1.2%)            24 (6.3%)
  Miller                    6 (1.3%)            41 (12.0%)           77 (20.3%)
  Taylor                   42 (8.9%)           142 (41.6%)            20 (5.3%)
rabate (= TRUE)          235 (50.3%)           172 (50.3%)          184 (48.7%)      ³

As a “Windows only” option, the result can be transferred directly to Word with the function DescTools::ToWrd(). Both the font and the alignment of the columns can be freely defined for the Word table via function arguments. Fine-tuned adaptations remain possible afterwards, as all format options in MS Word are available through RDCOMClient.

# use e.g. the following 
# wrd <- GetNewWrd()
# t1 <- TOne(x=, ...)
# ToWrd(t1, wrd=wrd, font=list(name="Arial narrow", size=8), 
#       main = "Pizza table")