ADam_spec_4_1_2

Author

Phanikumar

Introduction to ADaM

The Clinical Data Interchange Standards Consortium (CDISC) Analysis Data Model Implementation Guide (ADaMIG) specifies standard ADaM dataset structures and variables, including naming conventions, and standard solutions to implementation issues [1]. The ADaM standard is designed to support efficient generation, replication, and review of analysis results, particularly for submission to regulatory agencies like the US Food and Drug Administration (FDA) [1, 2].

The ADaMIG describes two standard data structures:

  • Subject-Level Analysis Dataset (ADSL) [3]
  • Basic Data Structure (BDS) [3]

This document focuses on how treatment variables are handled within these structures, illustrated by the examples in Section 4.1 of the ADaMIG.

Treatment Variables in ADSL

The ADSL dataset is a cornerstone of ADaM, containing one record per subject [3]. It includes crucial subject-level variables necessary for analysis and subject characterization [3].

ADSL includes variables such as:

  • USUBJID: Unique Subject Identifier.
  • ARM: Assigned Arm.
  • ACTARM: Actual Arm.
  • TRTxxP: Planned Treatment for Period xx, where “xx” is a zero-padded 2-digit integer representing the period [4].
  • TRTxxA: Actual Treatment for Period xx.
  • TRTSDT: Date of First Exposure to Treatment.
  • TRTEDT: Date of Last Exposure to Treatment.
  • Period-specific dates: such as TRxxSDT (Period xx Start Date) and TRxxEDT (Period xx End Date).

ADSL captures planned and actual treatment information at the subject level across different study designs, such as parallel or crossover designs. The ADSL dataset and its metadata are required in a CDISC-based submission [5].
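The period-specific naming convention described above can be sketched in a few lines of Python. This is only an illustration of the zero-padded "xx" pattern; the function name `period_vars` and the dictionary keys are hypothetical and not part of the ADaM standard.

```python
def period_vars(period: int) -> dict:
    """Return the ADSL period-level variable names for one treatment period."""
    if not 1 <= period <= 99:
        raise ValueError("period must be between 1 and 99")
    xx = f"{period:02d}"  # zero-padded 2-digit period index, e.g. 1 -> "01"
    return {
        "planned": f"TRT{xx}P",    # Planned Treatment for Period xx
        "actual": f"TRT{xx}A",     # Actual Treatment for Period xx
        "start_date": f"TR{xx}SDT",  # Period xx Start Date
        "end_date": f"TR{xx}EDT",    # Period xx End Date
    }


print(period_vars(1))
print(period_vars(12)["planned"])
```

For a 2-period crossover study, calling `period_vars(1)` and `period_vars(2)` would give the variable names used in the examples that follow.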

Note

ADSL Key Characteristics for Treatment:

  • Subject-level data (1 row per subject) [3].
  • Contains planned (TRTxxP) and actual (TRTxxA) treatment for each period (xx).
  • Includes subject-level start/end dates of treatment exposure (TRTSDT, TRTEDT) and period dates (TRxxSDT, TRxxEDT).
  • Used to define subject cohorts based on treatment assignment or exposure.
Example 1

Table 4.1.1 illustrates the treatment variables for 3 subjects in a parallel design study (1 treatment period). Note that the third subject was randomized to active treatment yet received placebo instead. TR01SDT and TR01EDT are not required in trial designs that do not involve multiple treatment periods.

flowchart LR
  A[USUBJID] --> B(ARM)-->C(TRT01P) -->D(TRT01A)

Table 4.1.1 Randomized Parallel Design – ADSL Dataset

| Row | USUBJID | ARM | ACTARM | TRT01P | TRT01A | TRTSDT | TRTEDT |
|---|---|---|---|---|---|---|---|
| 1 | 1001 | Drug X 5 mg | Drug X 5 mg | Drug X 5 mg | Drug X 5 mg | 23OCT2007 | 17DEC2007 |
| 2 | 1002 | PBO | PBO | PBO | PBO | 19JUL2006 | 20SEP2007 |
| 3 | 1003 | Drug X 5 mg | PBO | Drug X 5 mg | PBO | 01NOV2007 | 20NOV2007 |

Interactive way of presenting Data

| Subject ID | Assigned Arm | Actual Arm | Planned Treatment Period 01 | Actual Treatment Period 01 | Treatment Start Date | Treatment End Date |
|---|---|---|---|---|---|---|
| 1001 | Drug X 5 mg | Drug X 5 mg | Drug X 5 mg | Drug X 5 mg | 2023-01-15 | 2023-03-15 |
| 1002 | Placebo | Placebo | Placebo | Placebo | 2023-01-15 | 2023-03-15 |
| 1003 | Drug X 5 mg | Placebo | Drug X 5 mg | Placebo | 2023-01-20 | 2023-03-20 |
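As a rough sketch, the planned-versus-actual comparison in Example 1 could be checked programmatically. The rows below copy the subject IDs and period 01 treatments from the table above; the function name `treatment_deviations` is hypothetical.

```python
# ADSL-like rows for the three subjects in Example 1 (period 01 only).
adsl = [
    {"USUBJID": "1001", "TRT01P": "Drug X 5 mg", "TRT01A": "Drug X 5 mg"},
    {"USUBJID": "1002", "TRT01P": "Placebo", "TRT01A": "Placebo"},
    {"USUBJID": "1003", "TRT01P": "Drug X 5 mg", "TRT01A": "Placebo"},
]


def treatment_deviations(rows, period=1):
    """Return USUBJIDs whose actual treatment differs from the planned one."""
    planned = f"TRT{period:02d}P"
    actual = f"TRT{period:02d}A"
    return [r["USUBJID"] for r in rows if r[planned] != r[actual]]


print(treatment_deviations(adsl))
```

Only subject 1003 is flagged, which matches the note that this subject was randomized to active treatment but received placebo.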

Treatment Variables in BDS

The Basic Data Structure (BDS) contains one or more records per subject, per analysis parameter, per analysis timepoint (timepoint is conditional). It holds the actual data being analyzed (AVAL, AVALC) and describes it (PARAM). A BDS dataset requires at least one treatment variable.

This treatment variable can be:

  1. A subject-level variable carried over from ADSL (e.g., TRTxxP) [7].
  2. A record-level variable, typically TRTP (Planned Treatment for the record) or TRTA (Actual Treatment for the record) [7].

Record-level treatment variables like TRTP are useful when the relevant treatment for a particular analysis record (e.g., an assessment at a specific timepoint) might depend on the treatment the subject was receiving at that specific time or period, rather than their overall randomized treatment from the start of the study [7].
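One possible way to populate a record-level TRTP is to look up the period whose date range contains the analysis date of the record. The sketch below assumes this approach; the period dates for the subject, the function name `record_trtp`, and the use of ADT as the lookup date are illustrative assumptions, not rules taken from the ADaMIG.

```python
from datetime import date

# Hypothetical ADSL row for a 2-period crossover subject (dates are made up).
adsl_row = {
    "USUBJID": "1001",
    "TRT01P": "Placebo", "TR01SDT": date(2023, 1, 15), "TR01EDT": date(2023, 2, 14),
    "TRT02P": "Drug X", "TR02SDT": date(2023, 2, 15), "TR02EDT": date(2023, 3, 15),
}


def record_trtp(adsl_row, adt, n_periods=2):
    """Return the planned treatment of the period that contains date `adt`.

    Returns None when `adt` falls outside every period.
    """
    for period in range(1, n_periods + 1):
        xx = f"{period:02d}"
        start = adsl_row.get(f"TR{xx}SDT")
        end = adsl_row.get(f"TR{xx}EDT")
        if start is not None and end is not None and start <= adt <= end:
            return adsl_row[f"TRT{xx}P"]
    return None


print(record_trtp(adsl_row, date(2023, 1, 20)))  # assessment during period 01
print(record_trtp(adsl_row, date(2023, 3, 1)))   # assessment during period 02
```

A real study would define in its analysis plan how records outside any period (for example, screening or follow-up assessments) are assigned; here they simply get no treatment.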

[Figure: conceptual structure of a BDS record, highlighting the inclusion of treatment variables]


Example 2

Table 4.1.2 illustrates the treatment variables for 3 subjects in a 2-period crossover design. TRTSDT and TRTEDT are not displayed; TRTSDT = TR01SDT, and TRTEDT is the maximum of TR01EDT and TR02EDT, because some subjects may have discontinued before receiving TRT02P. Note that subjects 1002 and 1003 (rows 2 and 3) were each exposed to placebo in both trial periods.

| Subject ID | Planned Treatment SEQ | Planned Treatment01 | Planned Treatment02 | Actual Treatment SEQ | Actual Treatment01 | Actual Treatment02 | Planned Treatment Start Date | Planned Treatment End Date | Actual Treatment Start Date | Actual Treatment End Date |
|---|---|---|---|---|---|---|---|---|---|---|
| 1001 | Placebo Drug X | Placebo | Drug X | Placebo Drug X | Placebo | Drug X 5 mg | 2023-01-15 | 2023-03-15 | 2023-01-15 | 2023-03-15 |
| 1002 | Placebo Drug X | Placebo | Drug X | Placebo-Placebo | Placebo | Placebo | 2023-01-15 | 2023-03-15 | 2023-01-15 | 2023-03-15 |
| 1003 | Drug X Placebo | Drug X | Placebo | Placebo-Placebo | Placebo | Placebo | 2023-01-20 | 2023-03-20 | 2023-01-20 | 2023-03-20 |

Example 3

Table 4.1.3 illustrates the treatment variables for 3 subjects in a 3-period crossover design. TRTSDT and TRTEDT are not displayed; TRTSDT = TR01SDT, and TRTEDT is the maximum of TR01EDT, TR02EDT, and TR03EDT, because some subjects may have discontinued before receiving TRT03P. In this trial, all subjects received the planned treatment at each period, so the TRTxxA variables are not needed.

Table 4.1.3 Three-Period Crossover Design – ADSL Dataset
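The relationship between the overall and period-level dates stated in Examples 2 and 3 (TRTSDT = TR01SDT, TRTEDT = the latest period end date) can be sketched as follows. The function name `derive_trt_dates` and the sample dates are hypothetical; a missing period end date is assumed to mean the subject never entered that period.

```python
from datetime import date


def derive_trt_dates(adsl_row, n_periods):
    """Derive TRTSDT and TRTEDT from the period-level dates of one subject."""
    end_dates = [adsl_row.get(f"TR{p:02d}EDT") for p in range(1, n_periods + 1)]
    # Drop periods the subject never reached (assumed to be stored as None).
    end_dates = [d for d in end_dates if d is not None]
    return {
        "TRTSDT": adsl_row.get("TR01SDT"),
        "TRTEDT": max(end_dates) if end_dates else None,
    }


# Hypothetical subject who discontinued before period 03.
subject = {
    "TR01SDT": date(2023, 1, 15), "TR01EDT": date(2023, 2, 14),
    "TR02SDT": date(2023, 2, 15), "TR02EDT": date(2023, 3, 10),
    "TR03SDT": None, "TR03EDT": None,
}
print(derive_trt_dates(subject, n_periods=3))
```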


Section 4.2 of the ADaM Implementation Guide, “Creation of Derived Columns Versus Creation of Derived Rows,” sets out specific rules for how derived data should be added to the ADaM Basic Data Structure (BDS) [1]. These rules are crucial for ensuring that the BDS dataset is analysis-focused, has a predictable structure, and avoids inappropriate “horizontalization” (adding columns when rows are needed) [1].
Note

The rules govern how data derived from values already present within the ADaM dataset (specifically, functions of analysis values like AVAL or BASE) should be incorporated.

Data directly copied or derived from other source datasets (like SDTM or ADSL) are not subject to these specific rules for columns vs. rows derived from analysis values [3].

Rule 1: New Column

  • Add as a new column when the new value is a parameter-invariant function of AVAL, and optionally BASE, computed from values on the same row.
  • “Parameter-invariant” means the formula for the calculation is the same regardless of the PARAM on the row.
  • The function must not involve transforming BASE.
  • Examples include CHG (Change from Baseline = AVAL - BASE) and R2BASE (Ratio to Baseline = AVAL / BASE).
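A minimal sketch of Rule 1, using the two formulas given above (CHG = AVAL - BASE, R2BASE = AVAL / BASE). The same function is applied to every row regardless of PARAM, which is what makes it parameter-invariant. The function name `add_change_columns`, the sample values, and the handling of missing or zero BASE are assumptions for illustration.

```python
def add_change_columns(record):
    """Return a copy of a BDS row with CHG and R2BASE added as new columns."""
    out = dict(record)
    aval, base = record.get("AVAL"), record.get("BASE")
    if aval is None or base is None:
        out["CHG"] = None
        out["R2BASE"] = None
    else:
        out["CHG"] = aval - base
        # Assumed convention: leave the ratio missing when BASE is zero.
        out["R2BASE"] = aval / base if base != 0 else None
    return out


rows = [
    {"USUBJID": "1001", "PARAM": "Weight (kg)", "AVAL": 99, "BASE": 100},
    {"USUBJID": "1001", "PARAM": "Systolic BP (mmHg)", "AVAL": 130, "BASE": 120},
]
for row in rows:
    print(add_change_columns(row))
```

No rows are added: the dataset keeps its shape and only gains columns.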

Rule 2: New Rows

  • Add as a new parameter (new rows) when you need a transformation of AVAL that does not meet the conditions of Rule 1 (e.g., calculating the logarithm of a value).
  • The transformed value becomes the new AVAL in the new parameter.
  • This creates a new set of rows for the transformed parameter.
  • Example: “Log10(Weight (kg))” as a new PARAM.
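The Log10 example can be sketched as follows: each source row yields one additional row whose PARAM names the transformed parameter and whose AVAL holds the transformed value. The PARAMCD values ("WEIGHT", "L10WT") and the function name are hypothetical.

```python
import math


def log10_parameter(records, source_paramcd, new_param, new_paramcd):
    """Append rows for a log10-transformed parameter; originals are kept."""
    new_rows = []
    for r in records:
        aval = r.get("AVAL")
        # log10 is only defined for positive values; other rows are skipped.
        if r["PARAMCD"] == source_paramcd and aval is not None and aval > 0:
            row = dict(r)
            row.update(PARAM=new_param, PARAMCD=new_paramcd, AVAL=math.log10(aval))
            new_rows.append(row)
    return list(records) + new_rows


bds = [
    {"USUBJID": "1001", "PARAM": "Weight (kg)", "PARAMCD": "WEIGHT",
     "AVISIT": "Week 2", "AVAL": 100.0},
]
for row in log10_parameter(bds, "WEIGHT", "Log10(Weight (kg))", "L10WT"):
    print(row)
```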

Rule 3: Add as a New Row (within the same parameter)

  • This rule is used for derivations involving one or more rows within the same analysis parameter for the purpose of creating a specific analysis timepoint.
  • Examples include creating rows for imputed values (like LOCF or WOCF) or derived conceptual timepoints (like Endpoint, Post-Baseline Minimum/Maximum/Average).
  • These new rows should have a unique AVISIT value (or AVISITN), and the DTYPE (Derivation Type) variable should be populated to indicate how the value was derived.
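As an illustration of Rule 3, the sketch below adds an “Endpoint” row carrying the last observed post-baseline value forward (LOCF). It assumes baseline is stored with AVISITN = 0 and uses a made-up AVISITN of 9999 for the derived timepoint; the function name and sample data are hypothetical.

```python
def add_endpoint_row(records):
    """Add a derived Endpoint row (LOCF) for one subject and one parameter."""
    # Candidate rows: post-baseline (assumed AVISITN > 0) with a non-missing AVAL.
    post = [r for r in records if r["AVISITN"] > 0 and r["AVAL"] is not None]
    if not post:
        return list(records)
    last = max(post, key=lambda r: r["AVISITN"])
    endpoint = dict(last)
    # The new row stays in the same PARAM but gets its own AVISIT and a DTYPE.
    endpoint.update(AVISIT="Endpoint", AVISITN=9999, DTYPE="LOCF")
    return list(records) + [endpoint]


weight = [
    {"PARAM": "Weight (kg)", "AVISIT": "Baseline", "AVISITN": 0, "AVAL": 100, "DTYPE": None},
    {"PARAM": "Weight (kg)", "AVISIT": "Week 2", "AVISITN": 2, "AVAL": 98, "DTYPE": None},
    {"PARAM": "Weight (kg)", "AVISIT": "Week 4", "AVISITN": 4, "AVAL": 97, "DTYPE": None},
    {"PARAM": "Weight (kg)", "AVISIT": "Week 8", "AVISITN": 8, "AVAL": None, "DTYPE": None},
]
for row in add_endpoint_row(weight):
    print(row)
```

Observed rows keep an empty DTYPE, so the derived row can be told apart from collected data.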

Rule 4: Add as a New Parameter (New Rows)

  • This rule applies to functions calculated from multiple rows within the same parameter that do not represent a single analysis timepoint (as in Rule 3).
  • An example is the Cumulative Area Under the Curve (AUC) for a parameter, which is calculated using multiple measurements of that parameter over time. This calculation itself becomes a new parameter.
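A sketch of the cumulative AUC example. The trapezoidal rule and the use of ADY (analysis relative day) as the time axis are assumptions made for this illustration; the source does not specify how the AUC is computed. The PARAM and PARAMCD values are hypothetical.

```python
def cumulative_auc_rows(records, new_param, new_paramcd):
    """Build rows for a cumulative-AUC parameter from one subject's rows
    of a single source parameter (trapezoidal rule over ADY)."""
    ordered = sorted(records, key=lambda r: r["ADY"])
    new_rows, total = [], 0.0
    for prev, curr in zip(ordered, ordered[1:]):
        # Area of the trapezoid between two consecutive measurements.
        total += (prev["AVAL"] + curr["AVAL"]) / 2 * (curr["ADY"] - prev["ADY"])
        row = dict(curr)
        row.update(PARAM=new_param, PARAMCD=new_paramcd, AVAL=total)
        new_rows.append(row)
    return new_rows


conc = [
    {"USUBJID": "1001", "PARAM": "Score", "PARAMCD": "SCORE", "ADY": 1, "AVAL": 10},
    {"USUBJID": "1001", "PARAM": "Score", "PARAMCD": "SCORE", "ADY": 8, "AVAL": 20},
    {"USUBJID": "1001", "PARAM": "Score", "PARAMCD": "SCORE", "ADY": 15, "AVAL": 10},
]
for row in cumulative_auc_rows(conc, "Cumulative AUC of Score", "SCOREAUC"):
    print(row["ADY"], row["AVAL"])
```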

Rule 5: Add as a New Parameter (New Rows)

  • This rule applies when a derived value is a function of more than one parameter. The derivation results in a new parameter with its own set of rows.
  • Examples include ratios of two different laboratory parameters (e.g., Total Cholesterol:HDL-C ratio) or compound criteria based on assessments from different parameters or multiple rows.
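The Total Cholesterol:HDL-C ratio can be sketched as a new parameter built by pairing rows of the two source parameters. Matching on USUBJID and AVISIT is an assumption of this sketch, and the PARAMCD values ("CHOL", "HDL", "CHOLHDL") are hypothetical.

```python
def ratio_parameter(records, num_cd, den_cd, new_param, new_paramcd):
    """Build rows for a ratio parameter from two source parameters."""
    # Index denominator values by subject and analysis visit.
    den = {(r["USUBJID"], r["AVISIT"]): r["AVAL"]
           for r in records if r["PARAMCD"] == den_cd}
    new_rows = []
    for r in records:
        if r["PARAMCD"] != num_cd:
            continue
        d = den.get((r["USUBJID"], r["AVISIT"]))
        if r["AVAL"] is None or not d:  # skip missing or zero denominators
            continue
        row = dict(r)
        row.update(PARAM=new_param, PARAMCD=new_paramcd, AVAL=r["AVAL"] / d)
        new_rows.append(row)
    return new_rows


labs = [
    {"USUBJID": "1001", "PARAMCD": "CHOL", "PARAM": "Total Cholesterol",
     "AVISIT": "Week 4", "AVAL": 200},
    {"USUBJID": "1001", "PARAMCD": "HDL", "PARAM": "HDL-C",
     "AVISIT": "Week 4", "AVAL": 50},
]
print(ratio_parameter(labs, "CHOL", "HDL", "Total Cholesterol:HDL-C Ratio", "CHOLHDL"))
```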

Rule 6: Add a New Set of Rows

  • This rule applies when there is more than one definition of baseline for a parameter.
  • For each additional definition beyond the primary one, a separate set of rows must be created in the dataset.
  • The BASETYPE variable is required in such cases to identify which baseline definition applies to the BASE value on each row. This allows comparisons (like shift tables or change from baseline) to be performed against different baselines within the same dataset structure.
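A sketch of Rule 6: the same post-baseline rows are repeated once per baseline definition, and BASETYPE records which definition the BASE (and any change from baseline) on each row refers to. The BASETYPE labels, baseline values, and function name are made up for this illustration.

```python
def rows_for_baseline(records, basetype, base_value):
    """Return one set of rows tied to a single baseline definition."""
    out = []
    for r in records:
        row = dict(r)
        row.update(BASETYPE=basetype, BASE=base_value, CHG=r["AVAL"] - base_value)
        out.append(row)
    return out


post_baseline = [
    {"USUBJID": "1001", "PARAM": "Weight (kg)", "AVISIT": "Week 2", "AVAL": 98},
    {"USUBJID": "1001", "PARAM": "Weight (kg)", "AVISIT": "Week 4", "AVAL": 97},
]

# Two hypothetical baseline definitions, each producing its own set of rows.
bds = (rows_for_baseline(post_baseline, "SCREENING", 100)
       + rows_for_baseline(post_baseline, "RUN-IN", 99))
for row in bds:
    print(row["BASETYPE"], row["AVISIT"], row["BASE"], row["CHG"])
```

Filtering on BASETYPE then selects the rows for one baseline definition at analysis time.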

These rules work together to provide a standardized approach to structuring derived analysis data within the ADaM BDS.

flowchart TD
A[Derived Data for ADaM BDS] --> B{"Is it a **parameter-invariant** function of **AVAL** and **BASE** on the **same row** that does not transform BASE? (Rule 1) [2, 4, 5]"}
B -- YES --> C["**Add as New Column** (e.g., CHG, R2BASE) [2, 4-8]"]
B -- NO --> D{"What kind of derivation is it? [2]"}
D --> E{"Transformation of **AVAL** not meeting Rule 1 conditions? (Rule 2) [2, 9]"}
E -- YES --> F["**Add as New Parameter** (New Rows). AVAL holds the transformed value. [2, 9] (e.g., Log10(Weight))"]
D --> G{"Function of 1+ rows **within** the same parameter for creating an **analysis timepoint**? (Rule 3) [2, 10]"}
G -- YES --> H["**Add as New Row** (within the same parameter). Identified by unique AVISIT/AVISITN. Requires DTYPE (e.g., LOCF, AVERAGE). [2, 10-16] (e.g., Imputed values, Derived Endpoint)"]
D --> I{"Function of **multiple** rows **within** a parameter that is not an analysis timepoint? (Rule 4) [2]"}
I -- YES --> J["**Add as New Parameter** (New Rows) [2, 17] (e.g., Cumulative AUC derived from multiple timepoints)"]
D --> K{"Function of **more than one** parameter? (Rule 5) [2, 18]"}
K -- YES --> L["**Add as New Parameter** (New Rows) [2, 18, 19] (e.g., Ratio of two parameters, Compound Criterion)"]
D --> M{"More than one **definition of baseline**? (Rule 6) [2, 20]"}
M -- YES --> N["**Add a New Set of Rows** for Each Additional Baseline Definition. Requires BASETYPE. [2, 20-23]"]

C & F & H & J & L & N --> O[Dataset Structure Defined]

Quiz: ADaM Standards (Sections 4.1 - 4.3)

Instructions: Choose the best answer for each question.

  1. Which of the following best describes the purpose of the Analysis Data Model (ADaM)?
    1. To define how data is collected in a clinical trial.
    2. To provide a standard structure for analysis datasets derived from clinical trial data.
    3. To outline the format for submitting data to regulatory authorities.
    4. To standardize the electronic transfer of clinical trial data.
  2. In ADaM, what is the primary role of an Analysis Dataset?
    1. To store raw data collected directly from the study.
    2. To provide data organized and structured for statistical analysis.
    3. To define the data collection process.
    4. To facilitate data exchange between different systems.
  3. Which section of the ADaM Implementation Guide (IG) provides specifications for the structure of analysis datasets?
    1. Section 3.
    2. Section 4.
    3. Section 5.
    4. Section 6.
  4. What does the acronym “BDS” stand for in ADaM?
    1. Basic Data Structure
    2. Biomedical Data Specification
    3. Basic Dataset
    4. Body Data Standard
  5. In the context of ADaM, what is a “derived variable”?
    1. A variable collected directly from the study subject.
    2. A variable created through calculations or transformations of other variables.
    3. A variable that identifies the study subject.
    4. A variable used only for data management purposes.
  6. Which ADaM structure is commonly used to represent time-to-event data, such as Progression-Free Survival (PFS)?
    1. BDS (Basic Data Structure)
    2. Occurrence Data Structure (OCCDS)
    3. Time to Event (TTE)
    4. Subject-Level Analysis Dataset (ADSL)
  7. According to ADaM, what is the purpose of the ADSL dataset?
    1. To hold all subject-level data.
    2. To contain one record per subject with key demographic and study information.
    3. To store adverse event information.
    4. To represent time-to-event endpoints.
  8. Which of the following is NOT a key principle of ADaM?
    1. Traceability
    2. Reusability
    3. Flexibility
    4. Data Collection
  9. In ADaM, how are analysis populations typically defined?
    1. Using SDTM variables.
    2. Through protocol specifications and represented in ADaM datasets.
    3. Based on data collection forms.
    4. By the data management team after study completion.
  10. Which of the following is a key consideration when creating derived variables in ADaM?
    1. Using only raw data values.
    2. Ensuring the derivation logic is clearly documented and traceable.
    3. Avoiding complex calculations.
    4. Storing the derivations in a separate system.

Quiz: Technical Questions on ADaM Standards (Sections 4.1 - 4.3)

Instructions: Choose the best answer for each question.

  1. In ADaM, which of the following best describes the relationship between SDTM and ADaM datasets?
    1. ADaM datasets are a direct copy of SDTM datasets.
    2. ADaM datasets are derived from SDTM datasets to support specific analyses.
    3. SDTM datasets are derived from ADaM datasets for submission purposes.
    4. SDTM and ADaM are independent of each other.
  2. When creating a BDS dataset, which variable is typically used to represent the parameter being analyzed (e.g., change from baseline in tumor size)?
    1. STUDYID
    2. USUBJID
    3. PARAM
    4. AVAL
  3. In ADaM, what is the significance of the “ANL01FL” variable?
    1. It indicates the analysis method used.
    2. It flags records to be included in a specific analysis.
    3. It represents the analysis start date.
    4. It defines the analysis population.
  4. Which of the following is a key consideration when deriving a new record in an ADaM dataset?
    1. Duplicating all variables from the source record.
    2. Retaining only the variables necessary for the intended analysis.
    3. Omitting traceability information to simplify the dataset.
    4. Using non-standard variable names for derived values.
  5. Which ADaM dataset structure is most appropriate for representing adverse event data in a format suitable for analysis?
    1. BDS (Basic Data Structure)
    2. OCCDS (Occurrence Data Structure)
    3. ADSL (Subject-Level Analysis Dataset)
    4. TTE (Time-to-Event)
  6. In ADaM, how is baseline data typically handled to support change-from-baseline analyses?
    1. Baseline values are stored in a separate dataset.
    2. Baseline values are included as additional records in the BDS dataset.
    3. Baseline values are incorporated into the ADSL dataset.
    4. Baseline values are not required in ADaM.
  7. For a time-to-event analysis using ADaM, which variable would typically represent the time from a starting point to the event of interest (e.g., time to progression)?
    1. EVNT
    2. ADT
    3. AVAL
    4. CNSR
  8. In ADaM, what is the purpose of a “population flag”?
    1. To identify subjects who completed the study.
    2. To indicate subjects who experienced a specific adverse event.
    3. To define the subset of subjects included in a particular analysis.
    4. To flag subjects who received a specific treatment.
  9. Which of the following is a recommended approach for handling missing data in ADaM analysis datasets?
    1. Imputing missing values without documenting the method.
    2. Excluding subjects with any missing data.
    3. Using a consistent and well-documented methodology for handling missing values.
    4. Replacing missing values with a default value (e.g., 0) without justification.

  10. When creating an ADaM dataset, what level of traceability is expected?
    1. Traceability to the raw data only.
    2. Traceability to the SDTM datasets and the derivation logic.
    3. Traceability to the analysis results.
    4. No traceability is required.