broom::tidy.survfit()There is a new version of the survival package arriving to CRAN in
the next couple of weeks. The new version changed the way censored data
are stored internally for multi-state models (the most common one being
competing risks). I think the censored data is now a matrix, where
previously it was a vector (though, I haven’t looked into the details).
Below is an example of broom::tidy() now giving incorrect
results for competing risks estimates using the dev version of the
{survival} package. I don’t think CRAN checks reverse deps for Suggested
packages, so the broom team may not be aware of the coming change in a
very common package.
First get the most recent version of broom.
pak::pak("tidymodels/broom")
packageVersion("broom")
## [1] '1.0.5.9000'
Set temp dirs for installation for the current CRAN release of survival and the dev version from GitHub.
path <- fs::path_temp()
path_cran <- fs::path(path, "cran")
path_dev <- fs::path(path, "dev")
fs::dir_create(path_cran)
fs::dir_create(path_dev)
Install both versions of packages
install.packages("survival", lib = path_cran)
##
## The downloaded binary packages are in
## /var/folders/6f/gdjf_vxj2wl3jhmxdkd1hd_w0000gn/T//Rtmprl9cre/downloaded_packages
pak::pak("therneau/survival", lib = path_dev)
# print CRAN and DEV versions of the package
desc::desc_get_version(fs::path(path_cran, "survival", "DESCRIPTION"))
## [1] '3.5.8'
desc::desc_get_version(fs::path(path_dev, "survival", "DESCRIPTION"))
## [1] '3.6.1'
Simulate a competing risks data set
set.seed(1234)
n <- 100
df <-
dplyr::tibble(
death_cr =
sample.int(n = 3, size = n, replace = TRUE) |>
factor(labels = c("censor", "cancer death", "other cause death")),
ttdeath = rgamma(n = n, shape = 1)
)
Illustrate the differences in the results of
broom::tidy() from the prod to dev versions of the survival
package.
First begin with the version of survival on CRAN.
# print broom pkg version
packageVersion("broom")
## [1] '1.0.5.9000'
# now create a tidy data frame of the results from CRAN and DEV and compare
library("survival", lib.loc = path_cran)
tidy_cran <-
survfit(Surv(ttdeath, death_cr) ~ 1, data = df) |>
broom::tidy()
detach("package:survival", unload=TRUE)
Now, we will perform the same calculation using the dev version of the package.
library("survival", lib.loc = path_dev)
tidy_dev <-
survfit(Surv(ttdeath, death_cr) ~ 1, data = df) |>
broom::tidy()
Print censoring results that do not match
dplyr::full_join(
tidy_cran |> dplyr::select(state, time, n.censor),
tidy_dev |> dplyr::select(state, time, n.censor),
by = c("state", "time"),
suffix = c(".cran", ".dev")
) |>
dplyr::filter(n.censor.cran != n.censor.dev)
## # A tibble: 58 × 4
## state time n.censor.cran n.censor.dev
## <chr> <dbl> <int> <dbl>
## 1 cancer death 0.130 1 0
## 2 cancer death 0.153 1 0
## 3 cancer death 0.192 1 0
## 4 cancer death 0.192 1 0
## 5 cancer death 0.202 1 0
## 6 cancer death 0.211 1 0
## 7 cancer death 0.269 1 0
## 8 cancer death 0.288 1 0
## 9 cancer death 0.292 1 0
## 10 cancer death 0.343 1 0
## # ℹ 48 more rows