knitr::opts_chunk$set(
echo = TRUE,
message = FALSE,
warning = FALSE,
fig.width = 10,
fig.height = 6,
collapse = TRUE,
comment = "#>"
)
library(dplyr)
library(tidyr)
library(gt)
library(zoo)
library(openxlsx)
library(tibble)
library(ggplot2)
Complete PK/PD SAP workflow following CDISC standards for regulatory submissions.
High-level workflow summary
Use R to implement the SAP logic for PK and PD by creating subject-level ADaM datasets (e.g., ADPPK, ADPPD) that compute PK parameters and PD responses as specified in the SAP.
Use {gt} (and optionally {gtreg}) to create formatted clinical tables that match regulatory expectations and export them to RTF/HTML/PDF for inclusion in IND/NDA/BLA/MAA submissions.
Maintain SDTM mapping specs and ADaM specs as structured metadata (typically Excel), then use {defineR} to convert these specs into SDTM/ADaM metadata spreadsheets and define.xml files with schema validation for submission packages.
PK/PD stands for pharmacokinetics and pharmacodynamics, two closely linked areas of pharmacology that describe how drugs behave in the body and what effects they produce.
Pharmacokinetics is the study of what the body does to a drug over time. It focuses on the ADME processes: absorption (how the drug enters the bloodstream), distribution (how it spreads through tissues), metabolism (how it is chemically changed), and excretion (how it is removed).
Pharmacodynamics is the study of what the drug does to the body. It describes the relationship between drug concentration at the target (such as a receptor) and the resulting biological or clinical effect, including both desired therapeutic effects and side effects.
PK/PD (or PKPD) analysis links drug concentrations over time (PK) to their observed effects (PD) to understand exposure–response relationships. These combined models are widely used in drug development to select doses, design regimens, and predict onset, magnitude, and duration of drug effects in different populations.
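To make the exposure-response idea concrete, here is a minimal sketch of the widely used Emax model, E = E0 + Emax * C / (EC50 + C). The function name `emax_effect` and all parameter values are hypothetical illustrations; they are not taken from the SAP or estimated from the study data.

```r
# Hypothetical Emax model: predicted effect as a function of concentration.
# e0, emax and ec50 are illustrative values, not estimates from this study.
emax_effect <- function(conc, e0, emax, ec50) {
  e0 + emax * conc / (ec50 + conc)
}

conc_grid <- c(0, 1, 2, 4, 8)
pred <- emax_effect(conc_grid, e0 = 10, emax = 12, ec50 = 2)
pred
#> [1] 10.0 14.0 16.0 18.0 19.6
```

The predicted effect starts at `e0` when the concentration is zero and approaches `e0 + emax` as the concentration grows, which is the saturating shape that exposure-response analyses typically look for.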
SDTM stands for Study Data Tabulation Model, a CDISC standard that defines a consistent structure for organizing clinical trial data submitted to regulatory authorities like the FDA.
Purpose: SDTM standardizes how raw clinical trial data are formatted into datasets (called domains) for tabulation, making it easier for regulators to review safety, efficacy, and other study results. The FDA has required it for NDA/BLA/ANDA submissions since 2016 (for studies starting after December 17, 2016), and Japan's PMDA also requires it; confirm current expectations with other agencies such as the MHRA (UK).
Key Components:
Domains: Predefined datasets grouped by data type (e.g., DM for demographics, AE for adverse events, PC for PK concentrations).
Variables: Standardized names and structures across domains (e.g., STUDYID, USUBJID, --TESTCD).
Define.xml: Metadata file describing dataset structure, origins, and controlled terminology for regulatory review.
Related Standards:
SDTM works with ADaM (analysis datasets) and CDASH (data collection).
The SEND variant applies to nonclinical (animal) data.
ADaM stands for Analysis Data Model, a CDISC standard that defines the structure, content, and metadata for analysis-ready datasets derived from SDTM data to support statistical analyses and regulatory review.
Purpose: ADaM enables efficient generation, replication, and review of clinical trial statistical analyses, including tables, listings, and figures (TLFs), while ensuring traceability back to SDTM tabulation data. It provides flexibility for custom analyses beyond raw data tabulation, making it suitable for efficacy, safety, PK/PD, and other endpoints in NDA/BLA submissions.
Key Components:
ADSL: Subject-level analysis dataset with one record per subject, containing demographics, treatment arms, disposition, and key dates.
BDS: Basic Data Structure for repeated measures or parameters (e.g., ADRS for responder data, ADTTE for time-to-event), with variables like PARAMCD, AVAL, ARM.
OCCDS/ADAE: Occurrence data for events like adverse events.
Metadata: Define.xml and analysis results metadata documenting derivations and parameters.
Relation to SDTM: ADaM datasets are created from SDTM domains, adding analysis variables (e.g., baseline flags, imputations) while maintaining traceability via variables like SRCDOM, SRCVAR, and SRCSEQ.
path2File <- "/home/rmlinux/DataBridge Statistical Consulting/CLAUDE/dBase/"
pk_raw <- read.csv(paste0(path2File,"pk_raw.csv"), header=TRUE, stringsAsFactors = FALSE)
glimpse(pk_raw)
#> Rows: 20
#> Columns: 6
#> $ STUDYID <chr> "STUDY1", "STUDY1", "STUDY1", "STUDY1", "STUDY1", "STUDY1", "S…
#> $ USUBJID <chr> "SUBJ001", "SUBJ001", "SUBJ001", "SUBJ001", "SUBJ001", "SUBJ00…
#> $ ARM <chr> "Drug A", "Drug A", "Drug A", "Drug A", "Drug A", "Drug A", "D…
#> $ VISIT <chr> "Visit 1", "Visit 1", "Visit 1", "Visit 1", "Visit 1", "Visit …
#> $ TIME <int> 0, 1, 2, 4, 8, 0, 1, 2, 4, 8, 0, 1, 2, 4, 8, 0, 1, 2, 4, 8
#> $ CONC <dbl> 0.0, 2.5, 4.2, 3.8, 1.9, 0.0, 3.1, 5.0, 4.1, 2.0, 0.0, 1.8, 3.…
pd_raw <- read.csv(paste0(path2File,"pd_raw.csv"), header=TRUE, stringsAsFactors = FALSE)
glimpse(pd_raw)
#> Rows: 16
#> Columns: 7
#> $ STUDYID <chr> "STUDY1", "STUDY1", "STUDY1", "STUDY1", "STUDY1", "STUDY1", "…
#> $ USUBJID <chr> "SUBJ001", "SUBJ001", "SUBJ001", "SUBJ001", "SUBJ002", "SUBJ0…
#> $ ARM <chr> "Drug A", "Drug A", "Drug A", "Drug A", "Drug A", "Drug A", "…
#> $ VISIT <chr> "Visit 1", "Visit 1", "Visit 1", "Visit 1", "Visit 1", "Visit…
#> $ TIME <int> 0, 2, 4, 8, 0, 2, 4, 8, 0, 2, 4, 8, 0, 2, 4, 8
#> $ EFFICACY <int> 10, 15, 18, 16, 12, 17, 20, 19, 11, 13, 14, 13, 9, 12, 13, 12
#> $ BIOMARK <dbl> 1.2, 1.5, 1.7, 1.6, 1.3, 1.6, 1.9, 1.8, 1.1, 1.3, 1.4, 1.3, 1…
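Before deriving parameters, it can help to confirm that the raw input has the expected shape. The sketch below is one possible set of checks, not part of the original workflow. `pk_demo` is a hypothetical stand-in for `pk_raw`, built from the SUBJ001 values shown in the `glimpse()` output, and the rules themselves (required columns, one record per subject, visit and time, non-negative concentrations) are assumptions.

```r
library(dplyr)

# Hypothetical stand-in for pk_raw: SUBJ001 records as shown by glimpse().
pk_demo <- tibble::tibble(
  STUDYID = "STUDY1",
  USUBJID = "SUBJ001",
  ARM     = "Drug A",
  VISIT   = "Visit 1",
  TIME    = c(0L, 1L, 2L, 4L, 8L),
  CONC    = c(0.0, 2.5, 4.2, 3.8, 1.9)
)

# Assumed structural rule: all six columns must be present.
required_cols <- c("STUDYID", "USUBJID", "ARM", "VISIT", "TIME", "CONC")
missing_cols  <- setdiff(required_cols, names(pk_demo))

# Records sharing subject, visit and time would distort the per-subject AUC.
dup_records <- pk_demo |>
  count(USUBJID, VISIT, TIME, name = "n_records") |>
  filter(n_records > 1)

# Stop early if any assumed rule is violated.
stopifnot(
  length(missing_cols) == 0,
  nrow(dup_records) == 0,
  all(pk_demo$CONC >= 0, na.rm = TRUE)
)
```

The same checks could be pointed at `pk_raw` and `pd_raw` directly once the files are read.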
pk_raw |>
ggplot(aes(x = TIME, y = CONC, color = USUBJID, linetype = ARM)) +
geom_point(size = 2) + geom_line() +
facet_wrap(~ARM) +
labs(title = "PK Concentration-Time Profiles", y = "Concentration (ng/mL)") +
theme_minimal()
# 3. PK Parameters (SAP)
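pk_params <- pk_raw |>
  group_by(STUDYID, USUBJID, ARM) |>
  arrange(TIME, .by_group = TRUE) |>
  summarize(
    CMAX = max(CONC, na.rm = TRUE),
    TMAX = TIME[which.max(CONC)],  # first time at which the maximum occurs
    # Linear trapezoidal AUC(0-tlast): interval width times mean of adjacent concentrations
    AUC = {
      if (length(CONC) > 1) {
        dt <- diff(TIME)
        cavg <- zoo::rollmean(CONC, 2)
        sum(dt * cavg, na.rm = TRUE)
      } else NA_real_  # AUC is undefined for a single sample
    },
    .groups = "drop"
  ) |>
  mutate(AUC = round(AUC, 2))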
pk_params
#> # A tibble: 4 × 6
#> STUDYID USUBJID ARM CMAX TMAX AUC
#> <chr> <chr> <chr> <dbl> <int> <dbl>
#> 1 STUDY1 SUBJ001 Drug A 4.2 2 24
#> 2 STUDY1 SUBJ002 Drug A 5 2 26.9
#> 3 STUDY1 SUBJ003 Drug B 3 2 16.8
#> 4 STUDY1 SUBJ004 Drug B 3.5 2 19.2
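As a cross-check on the AUC column, the linear trapezoidal rule can also be written as a small base R helper. This is a sketch; the function name `auc_linear` is hypothetical. Unlike the `na.rm = TRUE` approach used in the summary, it drops incomplete time/concentration pairs before integrating, so a missing sample is bridged by its neighbours instead of removing the adjoining intervals.

```r
# Linear trapezoidal AUC from the first to the last observed time point.
# Pairs with a missing time or concentration are dropped before integration.
auc_linear <- function(time, conc) {
  keep <- !is.na(time) & !is.na(conc)
  time <- time[keep]
  conc <- conc[keep]
  if (length(conc) < 2) return(NA_real_)  # undefined for fewer than two samples
  ord  <- order(time)
  time <- time[ord]
  conc <- conc[ord]
  sum(diff(time) * (head(conc, -1) + tail(conc, -1)) / 2)
}

# SUBJ001 values as shown in the glimpse() output of pk_raw
auc_subj001 <- auc_linear(c(0, 1, 2, 4, 8), c(0.0, 2.5, 4.2, 3.8, 1.9))
auc_subj001
#> [1] 24
```

The result agrees with the AUC reported for SUBJ001 in `pk_params`.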
pd_summ <- pd_raw |>
group_by(STUDYID, USUBJID, ARM, VISIT) |>
summarize(
MEAN_EFF = mean(EFFICACY, na.rm = TRUE),
MEAN_BIOM = mean(BIOMARK, na.rm = TRUE),
.groups = "drop"
) |>
mutate(across(c(MEAN_EFF, MEAN_BIOM), ~round(.x, 2)))
pd_summ
#> # A tibble: 4 × 6
#> STUDYID USUBJID ARM VISIT MEAN_EFF MEAN_BIOM
#> <chr> <chr> <chr> <chr> <dbl> <dbl>
#> 1 STUDY1 SUBJ001 Drug A Visit 1 14.8 1.5
#> 2 STUDY1 SUBJ002 Drug A Visit 1 17 1.65
#> 3 STUDY1 SUBJ003 Drug B Visit 1 12.8 1.27
#> 4 STUDY1 SUBJ004 Drug B Visit 1 11.5 1.18
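The workflow summary mentions an ADPPD dataset, and a BDS-style layout for the PD data could look like the sketch below. It is an illustration under stated assumptions: `pd_demo` is a hypothetical stand-in for `pd_raw` holding the SUBJ001 values shown by `glimpse()`, the TIME 0 record is assumed to be baseline, and the variable choices (ABLFL, BASE, CHG) follow common ADaM usage, not a confirmed SAP specification.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for pd_raw: SUBJ001 records as shown by glimpse().
pd_demo <- tibble::tibble(
  STUDYID  = "STUDY1",
  USUBJID  = "SUBJ001",
  ARM      = "Drug A",
  VISIT    = "Visit 1",
  TIME     = c(0L, 2L, 4L, 8L),
  EFFICACY = c(10L, 15L, 18L, 16L),
  BIOMARK  = c(1.2, 1.5, 1.7, 1.6)
)

adppd_demo <- pd_demo |>
  # One record per subject, parameter and time point (BDS layout)
  pivot_longer(
    cols      = c(EFFICACY, BIOMARK),
    names_to  = "PARAMCD",
    values_to = "AVAL"
  ) |>
  group_by(STUDYID, USUBJID, PARAMCD) |>
  arrange(TIME, .by_group = TRUE) |>
  mutate(
    ABLFL = if_else(TIME == 0, "Y", NA_character_),  # assumption: TIME 0 is baseline
    BASE  = AVAL[TIME == 0][1],                      # baseline value per parameter
    CHG   = AVAL - BASE                              # change from baseline
  ) |>
  ungroup()

adppd_demo
```

Summaries such as `pd_summ` could then be computed on `AVAL` or `CHG` by `PARAMCD`, which keeps the PD table code parallel to the PK table code.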
adppk <- pk_params |>
  pivot_longer(
    cols = c(CMAX, TMAX, AUC),
    names_to = "PARAMCD",
    values_to = "AVAL"
  ) |>
  mutate(
    PARAM = case_when(
      PARAMCD == "CMAX" ~ "Maximum Observed Concentration",
      PARAMCD == "TMAX" ~ "Time of Cmax",
      PARAMCD == "AUC" ~ "AUC(0-tlast)",
      TRUE ~ NA_character_
    ),
    AVAL = round(AVAL, 3)
  ) |>
  select(STUDYID, USUBJID, ARM, PARAMCD, PARAM, AVAL)
adppk
#> # A tibble: 12 × 6
#> STUDYID USUBJID ARM PARAMCD PARAM AVAL
#> <chr> <chr> <chr> <chr> <chr> <dbl>
#> 1 STUDY1 SUBJ001 Drug A CMAX Maximum Observed Concentration 4.2
#> 2 STUDY1 SUBJ001 Drug A TMAX Time of Cmax 2
#> 3 STUDY1 SUBJ001 Drug A AUC AUC(0-tlast) 24
#> 4 STUDY1 SUBJ002 Drug A CMAX Maximum Observed Concentration 5
#> 5 STUDY1 SUBJ002 Drug A TMAX Time of Cmax 2
#> 6 STUDY1 SUBJ002 Drug A AUC AUC(0-tlast) 26.9
#> 7 STUDY1 SUBJ003 Drug B CMAX Maximum Observed Concentration 3
#> 8 STUDY1 SUBJ003 Drug B TMAX Time of Cmax 2
#> 9 STUDY1 SUBJ003 Drug B AUC AUC(0-tlast) 16.8
#> 10 STUDY1 SUBJ004 Drug B CMAX Maximum Observed Concentration 3.5
#> 11 STUDY1 SUBJ004 Drug B TMAX Time of Cmax 2
#> 12 STUDY1 SUBJ004 Drug B AUC AUC(0-tlast) 19.2
tbl_pk <- adppk |>
group_by(PARAM, ARM) |>
summarize(
N = n(),
MEAN = mean(AVAL, na.rm = TRUE),
SD = sd(AVAL, na.rm = TRUE),
CV = (SD/MEAN)*100,
MIN = min(AVAL, na.rm = TRUE),
MEDIAN = median(AVAL, na.rm = TRUE),
MAX = max(AVAL, na.rm = TRUE),
.groups = "drop"
) |>
mutate(across(c(MEAN, SD, MIN, MEDIAN, MAX, CV), ~round(.x, 2)))
pk_table_final <- tbl_pk |>
gt(rowname_col = "PARAM") |>
tab_header(
title = "Table 14.1.1.1 PK Parameters by Treatment",
subtitle = "PK Analysis Set (N=4)"
) |>
cols_label(
ARM = "Treatment", N = "n", MEAN = "Mean", SD = "SD",
CV = "CV%", MIN = "Min", MEDIAN = "Median", MAX = "Max"
) |>
fmt_number(c(MEAN, SD, MIN, MEDIAN, MAX), decimals = 2) |>
fmt_percent(CV, decimals = 1, scale_values = FALSE) |> # CV is already in percent, so do not rescale
tab_source_note("Linear trapezoidal AUC(0-tlast) per SAP.")
pk_table_final
Table 14.1.1.1 PK Parameters by Treatment
PK Analysis Set (N=4)
| | Treatment | n | Mean | SD | CV% | Min | Median | Max |
|---|---|---|---|---|---|---|---|---|
| AUC(0-tlast) | Drug A | 2 | 25.45 | 2.05 | 8.1% | 24.00 | 25.45 | 26.90 |
| AUC(0-tlast) | Drug B | 2 | 18.02 | 1.73 | 9.6% | 16.80 | 18.02 | 19.25 |
| Maximum Observed Concentration | Drug A | 2 | 4.60 | 0.57 | 12.3% | 4.20 | 4.60 | 5.00 |
| Maximum Observed Concentration | Drug B | 2 | 3.25 | 0.35 | 10.9% | 3.00 | 3.25 | 3.50 |
| Time of Cmax | Drug A | 2 | 2.00 | 0.00 | 0.0% | 2.00 | 2.00 | 2.00 |
| Time of Cmax | Drug B | 2 | 2.00 | 0.00 | 0.0% | 2.00 | 2.00 | 2.00 |
Linear trapezoidal AUC(0-tlast) per SAP.
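A quick hand check of the CV% column is worthwhile, because `CV` is computed as `(SD/MEAN)*100` and is therefore already on a percent scale, while `gt::fmt_percent()` multiplies values by 100 unless `scale_values = FALSE` is set. The sketch below recomputes the CV for AUC in the Drug A arm from the rounded per-subject values shown in `pk_params`; the object names are hypothetical.

```r
# Rounded per-subject AUC values for Drug A, taken from the pk_params output
auc_drug_a <- c(24.0, 26.9)

# Coefficient of variation in percent: SD relative to the mean
cv_pct <- sd(auc_drug_a) / mean(auc_drug_a) * 100
round(cv_pct, 2)
#> [1] 8.06
```

A displayed CV for this cell should therefore be close to 8%; a value two orders of magnitude larger indicates that the percentage was scaled twice.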
sdtm_pc_spec <- tribble(
~SDTM_Var, ~Source_Var, ~Derivation,
"STUDYID", "STUDYID", "Copy as-is",
"DOMAIN", NA, "'PC'",
"USUBJID", "USUBJID", "Copy as-is",
"PCSEQ", NA, "Row number by USUBJID",
"PCTESTCD", NA, "'DRUGCONC'",
"PCORRES", "CONC", "Reported concentration",
"PCSTRESN", "CONC", "Numeric concentration",
"PCSTRESU", NA, "'NG/ML'"
)
sdtm_pc_spec |> gt() |> tab_header("SDTM PC Mapping")
SDTM PC Mapping
| SDTM_Var | Source_Var | Derivation |
|---|---|---|
| STUDYID | STUDYID | Copy as-is |
| DOMAIN | NA | 'PC' |
| USUBJID | USUBJID | Copy as-is |
| PCSEQ | NA | Row number by USUBJID |
| PCTESTCD | NA | 'DRUGCONC' |
| PCORRES | CONC | Reported concentration |
| PCSTRESN | CONC | Numeric concentration |
| PCSTRESU | NA | 'NG/ML' |
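The mapping above is only a specification. A minimal sketch of applying it to build a PC-like dataset could look as follows. `pk_demo` is a hypothetical stand-in for `pk_raw` (SUBJ001 values from the `glimpse()` output), and only the eight variables listed in the spec are derived; a submission-ready PC domain needs further variables that this document does not specify.

```r
library(dplyr)

# Hypothetical stand-in for pk_raw: SUBJ001 records as shown by glimpse().
pk_demo <- tibble::tibble(
  STUDYID = "STUDY1",
  USUBJID = "SUBJ001",
  TIME    = c(0L, 1L, 2L, 4L, 8L),
  CONC    = c(0.0, 2.5, 4.2, 3.8, 1.9)
)

pc_demo <- pk_demo |>
  group_by(USUBJID) |>
  arrange(TIME, .by_group = TRUE) |>
  mutate(PCSEQ = row_number()) |>     # "Row number by USUBJID"
  ungroup() |>
  transmute(
    STUDYID,                          # copy as-is
    DOMAIN   = "PC",
    USUBJID,                          # copy as-is
    PCSEQ,
    PCTESTCD = "DRUGCONC",
    PCORRES  = as.character(CONC),    # result as reported (character)
    PCSTRESN = CONC,                  # numeric result
    PCSTRESU = "NG/ML"                # unit string exactly as given in the spec
  )

pc_demo
```

Keeping the derivations in the same order as the spec rows makes it easier to review the code against the mapping table line by line.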
adam_meta <- tribble(
~Dataset, ~Variable, ~Label, ~Type, ~Origin,
"ADPPK", "STUDYID", "Study ID", "Char", "SDTM.PC",
"ADPPK", "USUBJID", "Subject ID", "Char", "SDTM.PC",
"ADPPK", "ARM", "Treatment Arm", "Char", "SDTM.DM",
"ADPPK", "PARAMCD", "Param Code", "Char", "Derived",
"ADPPK", "AVAL", "Analysis Value", "Num", "Derived"
)
adam_meta
#> # A tibble: 5 × 5
#> Dataset Variable Label Type Origin
#> <chr> <chr> <chr> <chr> <chr>
#> 1 ADPPK STUDYID Study ID Char SDTM.PC
#> 2 ADPPK USUBJID Subject ID Char SDTM.PC
#> 3 ADPPK ARM Treatment Arm Char SDTM.DM
#> 4 ADPPK PARAMCD Param Code Char Derived
#> 5 ADPPK AVAL Analysis Value Num Derived
gtsave(pk_table_final, "pk_table.rtf")
write.xlsx(list(PC=sdtm_pc_spec, ADPPK=adam_meta), paste0(path2File,"metadata.xlsx"))
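After exporting, it may be worth reading the workbook back to confirm that the sheets and columns were written as intended. The sketch below shows one way to do that with {openxlsx}; `spec_demo` and the temporary output path are hypothetical stand-ins so that the example does not depend on `path2File`.

```r
library(openxlsx)

# Hypothetical two-row excerpt of the SDTM PC mapping spec
spec_demo <- data.frame(
  SDTM_Var   = c("STUDYID", "USUBJID"),
  Source_Var = c("STUDYID", "USUBJID"),
  Derivation = c("Copy as-is", "Copy as-is")
)

out_file <- file.path(tempdir(), "metadata_demo.xlsx")  # hypothetical output location
write.xlsx(list(PC = spec_demo), out_file)              # list names become sheet names

# Read the workbook back and compare it with what was written
sheets  <- getSheetNames(out_file)
pc_back <- read.xlsx(out_file, sheet = "PC")

stopifnot(
  "PC" %in% sheets,
  identical(names(pc_back), names(spec_demo)),
  nrow(pc_back) == nrow(spec_demo)
)
```

Building paths with `file.path()` avoids depending on a trailing slash in the directory string.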