Creating a BDS Time-to-Event ADaM

Introduction

This article describes creating a BDS time-to-event ADaM.

The main part in programming a time-to-event dataset is the definition of the events and censoring times. {admiral} supports single events like death or composite events like disease progression or death. More than one source dataset can be used for the definition of the event and censoring times.

Note: All examples assume CDISC SDTM and/or ADaM format as input unless otherwise specified.

Required Packages

The examples in this vignette require the following packages.

library(admiral)
library(dplyr, warn.conflicts = FALSE)
library(lubridate) # provides days(), used in the examples below
library(pharmaversesdtm)

Programming Workflow

Read in Data

To start, all datasets needed for the creation of the time-to-event dataset should be read into the environment. This will be a company-specific process.

For example purposes, the ADSL dataset (which is included in {admiral}) and the SDTM datasets from {pharmaversesdtm} are used.

ae <- pharmaversesdtm::ae
adsl <- admiral::admiral_adsl

ae <- convert_blanks_to_na(ae)

The following code creates a minimally viable ADAE dataset to be used throughout the following examples.

adae <- ae %>%
  left_join(adsl, by = c("STUDYID", "USUBJID")) %>%
  derive_vars_dt(
    new_vars_prefix = "AST",
    dtc = AESTDTC,
    highest_imputation = "M"
  ) %>%
  derive_vars_dt(
    new_vars_prefix = "AEN",
    dtc = AEENDTC,
    highest_imputation = "M",
    date_imputation = "last"
  ) %>%
  mutate(TRTEMFL = if_else(ASTDT >= TRTSDT &
    AENDT <= TRTEDT + days(30), "Y", NA_character_))

Derive Parameters (CNSR, ADT, STARTDT)

To derive the parameter-dependent variables such as CNSR, ADT, STARTDT, EVNTDESC, SRCDOM, PARAMCD, … the derive_param_tte() function can be used. It adds one parameter to the input dataset with one observation per subject. Usually it is called several times, once per parameter.

For each subject it is determined whether an event occurred. If so, the analysis date ADT is set to the earliest event date. If no event occurred, the analysis date is set to the latest censoring date.
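The selection rule can be traced by hand on a toy example. The following sketch uses base R only and hypothetical data (the `events` and `censorings` data frames are illustrative assumptions). It is not how derive_param_tte() is implemented, only an illustration of the rule described above, with a single censoring code of 1.

```r
# Hypothetical candidate event dates (one row per potential event)
events <- data.frame(
  USUBJID = c("01", "01", "03"),
  ADT = as.Date(c("2021-06-12", "2021-05-05", "2021-08-21")),
  stringsAsFactors = FALSE
)

# Hypothetical candidate censoring dates (one row per potential censoring)
censorings <- data.frame(
  USUBJID = c("01", "04", "04", "05"),
  ADT = as.Date(c("2021-03-04", "2021-02-13", "2021-05-15", "2021-04-01")),
  stringsAsFactors = FALSE
)

# Subjects with an event: keep the earliest event date, CNSR = 0
events <- events[order(events$USUBJID, events$ADT), ]
first_event <- events[!duplicated(events$USUBJID), ]
first_event$CNSR <- 0

# Subjects without an event: keep the latest censoring date, CNSR = 1
censorings <- censorings[order(censorings$USUBJID, censorings$ADT), ]
last_censor <- censorings[!duplicated(censorings$USUBJID, fromLast = TRUE), ]
last_censor <- last_censor[!last_censor$USUBJID %in% first_event$USUBJID, ]
last_censor$CNSR <- 1

# One record per subject
tte <- rbind(first_event, last_censor)
tte <- tte[order(tte$USUBJID), ]
print(tte)
```

Subject 01 has both events and a censoring candidate; the event wins and the earlier of the two event dates is kept. Subject 04 has no event, so the later of the two censoring dates is used.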

The events and censorings are defined by event_source() and censor_source() objects respectively. Each object defines

  • which observations (filter parameter) of a source dataset (dataset_name parameter) are potential events or censorings,
  • the value of the CNSR variable (censor parameter), and
  • which variable provides the date (date parameter).

The date can be provided as date (--DT variable) or datetime (--DTM variable).

CDISC strongly recommends CNSR = 0 for events and positive integers for censorings. {admiral} enforces this recommendation. Therefore the censor parameter is available for censor_source() only. It is defaulted to 1.

The dataset_name parameter expects a character value which is used as an identifier. The actual data used for the derivation of the parameter is provided via the source_datasets parameter of derive_param_tte(). It expects a named list of datasets. The names correspond to the identifiers specified for the dataset_name parameter. This allows events and censorings to be defined independently of the data.

Pre-Defined Time-to-Event Source Objects

The table below shows all pre-defined tte_source objects which should cover the most common use cases.

object | dataset_name | filter | date | censor | set_values_to
ae_gr3_event | adae | TRTEMFL == "Y" & ATOXGR == "3" | ASTDT | 0 | EVNTDESC: "GRADE 3 ADVERSE EVENT"; SRCDOM: "ADAE"; SRCVAR: "ASTDT"; SRCSEQ: AESEQ
ae_wd_event | adae | TRTEMFL == "Y" & AEACN == "DRUG WITHDRAWN" | ASTDT | 0 | EVNTDESC: "ADVERSE EVENT LEADING TO DRUG WITHDRAWAL"; SRCDOM: "ADAE"; SRCVAR: "ASTDT"; SRCSEQ: AESEQ
ae_gr35_event | adae | TRTEMFL == "Y" & ATOXGR %in% c("3", "4", "5") | ASTDT | 0 | EVNTDESC: "GRADE 3-5 ADVERSE EVENT"; SRCDOM: "ADAE"; SRCVAR: "ASTDT"; SRCSEQ: AESEQ
lastalive_censor | adsl | NULL | LSTALVDT | 1 | EVNTDESC: "ALIVE"; SRCDOM: "ADSL"; SRCVAR: "LSTALVDT"
ae_gr1_event | adae | TRTEMFL == "Y" & ATOXGR == "1" | ASTDT | 0 | EVNTDESC: "GRADE 1 ADVERSE EVENT"; SRCDOM: "ADAE"; SRCVAR: "ASTDT"; SRCSEQ: AESEQ
ae_ser_event | adae | TRTEMFL == "Y" & AESER == "Y" | ASTDT | 0 | EVNTDESC: "SERIOUS ADVERSE EVENT"; SRCDOM: "ADAE"; SRCVAR: "ASTDT"; SRCSEQ: AESEQ
ae_gr2_event | adae | TRTEMFL == "Y" & ATOXGR == "2" | ASTDT | 0 | EVNTDESC: "GRADE 2 ADVERSE EVENT"; SRCDOM: "ADAE"; SRCVAR: "ASTDT"; SRCSEQ: AESEQ
ae_event | adae | TRTEMFL == "Y" | ASTDT | 0 | EVNTDESC: "ADVERSE EVENT"; SRCDOM: "ADAE"; SRCVAR: "ASTDT"; SRCSEQ: AESEQ
ae_gr4_event | adae | TRTEMFL == "Y" & ATOXGR == "4" | ASTDT | 0 | EVNTDESC: "GRADE 4 ADVERSE EVENT"; SRCDOM: "ADAE"; SRCVAR: "ASTDT"; SRCSEQ: AESEQ
ae_gr5_event | adae | TRTEMFL == "Y" & ATOXGR == "5" | ASTDT | 0 | EVNTDESC: "GRADE 5 ADVERSE EVENT"; SRCDOM: "ADAE"; SRCVAR: "ASTDT"; SRCSEQ: AESEQ
ae_sev_event | adae | TRTEMFL == "Y" & AESEV == "SEVERE" | ASTDT | 0 | EVNTDESC: "SEVERE ADVERSE EVENT"; SRCDOM: "ADAE"; SRCVAR: "ASTDT"; SRCSEQ: AESEQ
death_event | adsl | DTHFL == "Y" | DTHDT | 0 | EVNTDESC: "DEATH"; SRCDOM: "ADSL"; SRCVAR: "DTHDT"

These pre-defined objects can be passed directly to derive_param_tte() to create a new time-to-event parameter.

adtte <- derive_param_tte(
  dataset_adsl = adsl,
  start_date = TRTSDT,
  event_conditions = list(ae_ser_event),
  censor_conditions = list(lastalive_censor),
  source_datasets = list(adsl = adsl, adae = adae),
  set_values_to = exprs(PARAMCD = "TTAESER", PARAM = "Time to First Serious AE")
)
USUBJID PARAMCD PARAM STARTDT ADT CNSR
01-701-1015 TTAESER Time to First Serious AE 2014-01-02 2014-07-02 1
01-701-1023 TTAESER Time to First Serious AE 2012-08-05 2012-09-02 1
01-701-1028 TTAESER Time to First Serious AE 2013-07-19 2014-01-14 1
01-701-1033 TTAESER Time to First Serious AE 2014-03-18 2014-04-14 1
01-701-1034 TTAESER Time to First Serious AE 2014-07-01 2014-12-30 1
01-701-1047 TTAESER Time to First Serious AE 2013-02-12 2013-04-07 1
01-701-1097 TTAESER Time to First Serious AE 2014-01-01 2014-07-09 1
01-701-1111 TTAESER Time to First Serious AE 2012-09-07 2012-09-17 1
01-701-1115 TTAESER Time to First Serious AE 2012-11-30 2013-01-23 1
01-701-1118 TTAESER Time to First Serious AE 2014-03-12 2014-09-09 1

Single Event

For example, the overall survival time could be defined from treatment start to death. Patients alive or lost to follow-up would be censored at the last known alive date. The following call defines a death event based on ADSL variables.

death <- event_source(
  dataset_name = "adsl",
  filter = DTHFL == "Y",
  date = DTHDT
)

A corresponding censoring based on the last known alive date can be defined by the following call.

lstalv <- censor_source(
  dataset_name = "adsl",
  date = LSTALVDT
)

The definitions can be passed to derive_param_tte() to create a new time-to-event parameter.

adtte <- derive_param_tte(
  dataset_adsl = adsl,
  source_datasets = list(adsl = adsl),
  start_date = TRTSDT,
  event_conditions = list(death),
  censor_conditions = list(lstalv),
  set_values_to = exprs(PARAMCD = "OS", PARAM = "Overall Survival")
)
USUBJID PARAMCD PARAM STARTDT ADT CNSR
01-701-1015 OS Overall Survival 2014-01-02 2014-07-02 1
01-701-1023 OS Overall Survival 2012-08-05 2012-09-02 1
01-701-1028 OS Overall Survival 2013-07-19 2014-01-14 1
01-701-1033 OS Overall Survival 2014-03-18 2014-04-14 1
01-701-1034 OS Overall Survival 2014-07-01 2014-12-30 1
01-701-1047 OS Overall Survival 2013-02-12 2013-04-07 1
01-701-1097 OS Overall Survival 2014-01-01 2014-07-09 1
01-701-1111 OS Overall Survival 2012-09-07 2012-09-17 1
01-701-1115 OS Overall Survival 2012-11-30 2013-01-23 1
01-701-1118 OS Overall Survival 2014-03-12 2014-09-09 1

Note that in practice for efficacy parameters you might use randomization date as the time to event origin date.
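As a minimal sketch of this variation, the start_date argument can point to a randomization date instead of TRTSDT. The dataset `adsl_rand` and its variable RANDDT are hypothetical and created here only for illustration; the call assumes an installed {admiral} version whose derive_param_tte() accepts the same arguments as the calls shown in this article.

```r
library(admiral)

# Hypothetical ADSL with a randomization date (RANDDT) next to TRTSDT
adsl_rand <- data.frame(
  STUDYID = "AB42",
  USUBJID = c("01", "02"),
  RANDDT = as.Date(c("2021-01-01", "2021-02-01")),
  TRTSDT = as.Date(c("2021-01-04", "2021-02-03")),
  DTHFL = c("Y", "N"),
  DTHDT = as.Date(c("2021-06-12", NA)),
  LSTALVDT = as.Date(c("2021-06-12", "2021-09-30")),
  stringsAsFactors = FALSE
)

# Same event and censoring definitions as for OS, but the origin is RANDDT
os_rand <- derive_param_tte(
  dataset_adsl = adsl_rand,
  source_datasets = list(adsl = adsl_rand),
  start_date = RANDDT,
  event_conditions = list(death_event),
  censor_conditions = list(lastalive_censor),
  set_values_to = exprs(PARAMCD = "OS", PARAM = "Overall Survival")
)

print(os_rand[, c("USUBJID", "PARAMCD", "STARTDT", "ADT", "CNSR")])
```

Only the origin changes: STARTDT is taken from RANDDT, while the event and censoring dates are selected exactly as before.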

Add Additional Information for Events and Censoring (EVNTDESC, SRCVAR, …)

To add information such as the event or censoring description (EVNTDESC) or the source variable (SRCVAR), the set_values_to parameter can be specified in the event/censoring definition.

# define death event #
death <- event_source(
  dataset_name = "adsl",
  filter = DTHFL == "Y",
  date = DTHDT,
  set_values_to = exprs(
    EVNTDESC = "DEATH",
    SRCDOM = "ADSL",
    SRCVAR = "DTHDT"
  )
)

# define censoring at last known alive date #
lstalv <- censor_source(
  dataset_name = "adsl",
  date = LSTALVDT,
  set_values_to = exprs(
    EVNTDESC = "LAST KNOWN ALIVE DATE",
    SRCDOM = "ADSL",
    SRCVAR = "LSTALVDT"
  )
)

# derive time-to-event parameter #
adtte <- derive_param_tte(
  dataset_adsl = adsl,
  source_datasets = list(adsl = adsl),
  event_conditions = list(death),
  censor_conditions = list(lstalv),
  set_values_to = exprs(PARAMCD = "OS", PARAM = "Overall Survival")
)
USUBJID EVNTDESC SRCDOM SRCVAR CNSR ADT
01-701-1015 LAST KNOWN ALIVE DATE ADSL LSTALVDT 1 2014-07-02
01-701-1023 LAST KNOWN ALIVE DATE ADSL LSTALVDT 1 2012-09-02
01-701-1028 LAST KNOWN ALIVE DATE ADSL LSTALVDT 1 2014-01-14
01-701-1033 LAST KNOWN ALIVE DATE ADSL LSTALVDT 1 2014-04-14
01-701-1034 LAST KNOWN ALIVE DATE ADSL LSTALVDT 1 2014-12-30
01-701-1047 LAST KNOWN ALIVE DATE ADSL LSTALVDT 1 2013-04-07
01-701-1097 LAST KNOWN ALIVE DATE ADSL LSTALVDT 1 2014-07-09
01-701-1111 LAST KNOWN ALIVE DATE ADSL LSTALVDT 1 2012-09-17
01-701-1115 LAST KNOWN ALIVE DATE ADSL LSTALVDT 1 2013-01-23
01-701-1118 LAST KNOWN ALIVE DATE ADSL LSTALVDT 1 2014-09-09

Handling Subjects Without Assessment

If a subject has no event and no record meeting the censoring rule, the subject will not be included in the output dataset. To obtain a record for such a subject in the output dataset, another censor_source() object should be created to specify how those patients will be censored. In the example below, the start censoring is defined so that subjects without data in adrs are censored at the start date.

The ADaM IG requires that a computed date must be accompanied by imputation flags. Thus, if the function detects a --DTF and/or --TMF variable corresponding to start_date then STARTDTF and STARTTMF are set automatically to the values of these variables. If a date variable from one of the event or censoring source datasets is imputed, the imputation flag can be specified for the set_values_to parameter in event_source() or censor_source() (see definition of the start censoring below).

As the CDISC pilot does not contain an RS dataset, the following example for progression free survival uses manually created datasets.

View(adsl)
USUBJID DTHFL DTHDT TRTSDT TRTSDTF
01 Y 2021-06-12 2021-01-01 M
02 N NA 2021-02-03 NA
03 Y 2021-08-21 2021-08-10 NA
04 N NA 2021-02-03 NA
05 N NA 2021-04-01 D
View(adrs)
USUBJID AVALC ADT ASEQ PARAMCD PARAM
01 SD 2021-01-03 1 OVR Overall Response
01 PR 2021-03-04 2 OVR Overall Response
01 PD 2021-05-05 3 OVR Overall Response
02 PD 2021-02-03 1 OVR Overall Response
04 SD 2021-02-13 1 OVR Overall Response
04 PR 2021-04-14 2 OVR Overall Response
04 CR 2021-05-15 3 OVR Overall Response

An event for progression free survival occurs if

  • progression of disease is observed or
  • the subject dies.

Therefore two event_source() objects are defined:

  • pd for progression of disease and
  • death for death.

Some subjects may experience both events. In this case the first one is selected by derive_param_tte().

# progressive disease event #
pd <- event_source(
  dataset_name = "adrs",
  filter = AVALC == "PD",
  date = ADT,
  set_values_to = exprs(
    EVNTDESC = "PD",
    SRCDOM = "ADRS",
    SRCVAR = "ADT",
    SRCSEQ = ASEQ
  )
)

# death event #
death <- event_source(
  dataset_name = "adsl",
  filter = DTHFL == "Y",
  date = DTHDT,
  set_values_to = exprs(
    EVNTDESC = "DEATH",
    SRCDOM = "ADSL",
    SRCVAR = "DTHDT"
  )
)

Subjects without an event must be censored at the last tumor assessment. For the censoring, the lastvisit object is defined as all tumor assessments. Please note that it is not necessary to select the last one or to exclude assessments which resulted in progression of disease. This is handled within derive_param_tte().

# last tumor assessment censoring (CNSR = 1 by default) #
lastvisit <- censor_source(
  dataset_name = "adrs",
  date = ADT,
  set_values_to = exprs(
    EVNTDESC = "LAST TUMOR ASSESSMENT",
    SRCDOM = "ADRS",
    SRCVAR = "ADT"
  )
)

Patients without a tumor assessment should be censored at the start date. Therefore the start object is defined with the treatment start date as censoring date. It is not necessary to exclude patients with a tumor assessment in the definition of start because derive_param_tte() selects the last date across all censor_source() objects as censoring date.

# start date censoring (for patients without tumor assessment) (CNSR = 2) #
start <- censor_source(
  dataset_name = "adsl",
  date = TRTSDT,
  censor = 2,
  set_values_to = exprs(
    EVNTDESC = "TREATMENT START",
    SRCDOM = "ADSL",
    SRCVAR = "TRTSDT",
    ADTF = TRTSDTF
  )
)

# derive time-to-event parameter #
adtte <- derive_param_tte(
  dataset_adsl = adsl,
  source_datasets = list(adsl = adsl, adrs = adrs),
  start_date = TRTSDT,
  event_conditions = list(pd, death),
  censor_conditions = list(lastvisit, start),
  set_values_to = exprs(PARAMCD = "PFS", PARAM = "Progression Free Survival")
)
USUBJID PARAMCD STARTDT ADT ADTF CNSR
01 PFS 2021-01-01 2021-05-05 NA 0
02 PFS 2021-02-03 2021-02-03 NA 0
03 PFS 2021-08-10 2021-08-21 NA 0
04 PFS 2021-02-03 2021-05-15 NA 1
05 PFS 2021-04-01 2021-04-01 D 2
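The censoring records for subjects 04 and 05 in the output above can be reproduced by hand. The following base R sketch stacks the censoring candidates from both sources and keeps the latest date per subject. It is an illustration of the rule only, not the {admiral} implementation, and the data frame names are made up for this example.

```r
# Censoring candidates taken from the manually created example data above
# (subjects 04 and 05 only, since they have neither a PD nor a death event)
tumor_assessments <- data.frame(
  USUBJID = c("04", "04", "04"),
  ADT = as.Date(c("2021-02-13", "2021-04-14", "2021-05-15")),
  CNSR = 1,
  stringsAsFactors = FALSE
)
treatment_start <- data.frame(
  USUBJID = c("04", "05"),
  ADT = as.Date(c("2021-02-03", "2021-04-01")),
  CNSR = 2,
  stringsAsFactors = FALSE
)

# Stack all candidates and keep the latest date per subject
candidates <- rbind(tumor_assessments, treatment_start)
candidates <- candidates[order(candidates$USUBJID, candidates$ADT), ]
censoring <- candidates[!duplicated(candidates$USUBJID, fromLast = TRUE), ]
print(censoring)
```

Subject 04 has tumor assessments after treatment start, so the last assessment is selected with CNSR = 1. Subject 05 has no assessment, so the only candidate is the treatment start date with CNSR = 2.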

Deriving a Series of Time-to-Event Parameters

If several similar time-to-event parameters need to be derived, the call_derivation() function is useful.

In the following example parameters for time to first AE, time to first serious AE, and time to first related AE are derived. The censoring is the same for all three. Only the definition of the event differs.

# define censoring #
observation_end <- censor_source(
  dataset_name = "adsl",
  date = pmin(TRTEDT + days(30), EOSDT),
  censor = 1,
  set_values_to = exprs(
    EVNTDESC = "END OF TREATMENT",
    SRCDOM = "ADSL",
    SRCVAR = "TRTEDT"
  )
)

# define time to first AE #
tt_ae <- event_source(
  dataset_name = "ae",
  date = ASTDT,
  set_values_to = exprs(
    EVNTDESC = "ADVERSE EVENT",
    SRCDOM = "AE",
    SRCVAR = "AESTDTC"
  )
)

# define time to first serious AE #
tt_ser_ae <- event_source(
  dataset_name = "ae",
  filter = AESER == "Y",
  date = ASTDT,
  set_values_to = exprs(
    EVNTDESC = "SERIOUS ADVERSE EVENT",
    SRCDOM = "AE",
    SRCVAR = "AESTDTC"
  )
)

# define time to first related AE #
tt_rel_ae <- event_source(
  dataset_name = "ae",
  filter = AEREL %in% c("PROBABLE", "POSSIBLE", "REMOTE"),
  date = ASTDT,
  set_values_to = exprs(
    EVNTDESC = "RELATED ADVERSE EVENT",
    SRCDOM = "AE",
    SRCVAR = "AESTDTC"
  )
)

# derive all three time to event parameters #
adaette <- call_derivation(
  derivation = derive_param_tte,
  variable_params = list(
    params(
      event_conditions = list(tt_ae),
      set_values_to = exprs(PARAMCD = "TTAE")
    ),
    params(
      event_conditions = list(tt_ser_ae),
      set_values_to = exprs(PARAMCD = "TTSERAE")
    ),
    params(
      event_conditions = list(tt_rel_ae),
      set_values_to = exprs(PARAMCD = "TTRELAE")
    )
  ),
  dataset_adsl = adsl,
  source_datasets = list(
    adsl = adsl,
    ae = filter(adae, TRTEMFL == "Y")
  ),
  censor_conditions = list(observation_end)
)
USUBJID PARAMCD STARTDT ADT CNSR EVNTDESC SRCDOM SRCVAR
01-701-1015 TTAE 2014-01-02 2014-01-09 0 ADVERSE EVENT AE AESTDTC
01-701-1015 TTRELAE 2014-01-02 2014-01-09 0 RELATED ADVERSE EVENT AE AESTDTC
01-701-1015 TTSERAE 2014-01-02 2014-07-02 1 END OF TREATMENT ADSL TRTEDT
01-701-1023 TTAE 2012-08-05 2012-08-07 0 ADVERSE EVENT AE AESTDTC
01-701-1023 TTRELAE 2012-08-05 2012-08-07 0 RELATED ADVERSE EVENT AE AESTDTC
01-701-1023 TTSERAE 2012-08-05 2012-09-02 1 END OF TREATMENT ADSL TRTEDT
01-701-1028 TTAE 2013-07-19 2014-01-14 1 END OF TREATMENT ADSL TRTEDT
01-701-1028 TTRELAE 2013-07-19 2014-01-14 1 END OF TREATMENT ADSL TRTEDT
01-701-1028 TTSERAE 2013-07-19 2014-01-14 1 END OF TREATMENT ADSL TRTEDT
01-701-1033 TTAE 2014-03-18 2014-04-14 1 END OF TREATMENT ADSL TRTEDT

Deriving Time-to-Event Parameters Using By Groups

If time-to-event parameters need to be derived for each by group of a source dataset, the by_vars parameter can be specified. Then a time-to-event parameter is derived for each by group.

Please note that CDISC requires separate parameters (PARAMCD, PARAM) for the by groups. Therefore the variables specified for the by_vars parameter are not included in the output dataset. The PARAMCD variable should be specified for the set_values_to parameter using an expression on the right-hand side which results in a unique value for each by group. If the values of the by variables should be included in the output dataset, they can be stored in PARCATn variables.
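One way to build such an expression is to number the distinct values of the by variable. The following base R sketch shows how an expression of the form `paste0("TTAE", as.numeric(as.factor(AEDECOD)))` maps terms to codes; the vector of preferred terms is a hypothetical input used only for illustration.

```r
# Preferred terms as they might appear in a source dataset (hypothetical values)
AEDECOD <- c("Flu", "Cough", "Flu")

# as.factor() sorts the distinct terms alphabetically; as.numeric() returns
# the position of each term in that sorted list
paramcd <- paste0("TTAE", as.numeric(as.factor(AEDECOD)))
param <- paste("Time to First", AEDECOD, "Adverse Event")

print(unique(data.frame(AEDECOD, paramcd, param)))
```

Because the numbering depends on the set of terms present in the data, the code assigned to a given term may change when new terms appear. If stable codes are needed, a fixed lookup table is a possible alternative.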

In the following example a time-to-event parameter for each preferred term in the AE dataset is derived.

View(adsl)
USUBJID TRTSDT EOSDT STUDYID
01 2020-12-06 2021-03-06 AB42
02 2021-01-16 2021-02-03 AB42
View(ae)
USUBJID AESTDTC AESEQ AEDECOD STUDYID AESTDT
01 2021-01-03T10:56 1 Flu AB42 2021-01-03
01 2021-03-04 2 Cough AB42 2021-03-04
01 2021 3 Flu AB42 2021-01-01

# define time to first adverse event #
ttae <- event_source(
  dataset_name = "ae",
  date = AESTDT,
  set_values_to = exprs(
    EVNTDESC = "AE",
    SRCDOM = "AE",
    SRCVAR = "AESTDTC",
    SRCSEQ = AESEQ
  )
)

# define censoring at end of study #
eos <- censor_source(
  dataset_name = "adsl",
  date = EOSDT,
  set_values_to = exprs(
    EVNTDESC = "END OF STUDY",
    SRCDOM = "ADSL",
    SRCVAR = "EOSDT"
  )
)

# derive time-to-event parameter #
adtte <- derive_param_tte(
  dataset_adsl = adsl,
  by_vars = exprs(AEDECOD),
  start_date = TRTSDT,
  event_conditions = list(ttae),
  censor_conditions = list(eos),
  source_datasets = list(adsl = adsl, ae = ae),
  set_values_to = exprs(
    PARAMCD = paste0("TTAE", as.numeric(as.factor(AEDECOD))),
    PARAM = paste("Time to First", AEDECOD, "Adverse Event"),
    PARCAT1 = "TTAE",
    PARCAT2 = AEDECOD
  )
)
USUBJID STARTDT PARAMCD PARAM ADT CNSR SRCSEQ
01 2020-12-06 TTAE1 Time to First Cough Adverse Event 2021-03-04 0 2
01 2020-12-06 TTAE2 Time to First Flu Adverse Event 2021-01-01 0 3
02 2021-01-16 TTAE1 Time to First Cough Adverse Event 2021-02-03 1 NA
02 2021-01-16 TTAE2 Time to First Flu Adverse Event 2021-02-03 1 NA

Derive Analysis Value (AVAL)

The analysis value (AVAL) can be derived by calling derive_vars_duration().

This example derives the time to event in days. Other units can be requested by specifying the out_unit parameter.

adtte <- derive_vars_duration(
  adtte,
  new_var = AVAL,
  start_date = STARTDT,
  end_date = ADT
)
STUDYID USUBJID EVNTDESC SRCDOM SRCVAR CNSR ADT STARTDT PARAMCD PARAM AVAL
CDISCPILOT01 01-701-1015 LAST KNOWN ALIVE DATE ADSL LSTALVDT 1 2014-07-02 2014-01-02 OS Overall Survival 182
CDISCPILOT01 01-701-1023 LAST KNOWN ALIVE DATE ADSL LSTALVDT 1 2012-09-02 2012-08-05 OS Overall Survival 29
CDISCPILOT01 01-701-1028 LAST KNOWN ALIVE DATE ADSL LSTALVDT 1 2014-01-14 2013-07-19 OS Overall Survival 180
CDISCPILOT01 01-701-1033 LAST KNOWN ALIVE DATE ADSL LSTALVDT 1 2014-04-14 2014-03-18 OS Overall Survival 28
CDISCPILOT01 01-701-1034 LAST KNOWN ALIVE DATE ADSL LSTALVDT 1 2014-12-30 2014-07-01 OS Overall Survival 183
CDISCPILOT01 01-701-1047 LAST KNOWN ALIVE DATE ADSL LSTALVDT 1 2013-04-07 2013-02-12 OS Overall Survival 55
CDISCPILOT01 01-701-1097 LAST KNOWN ALIVE DATE ADSL LSTALVDT 1 2014-07-09 2014-01-01 OS Overall Survival 190
CDISCPILOT01 01-701-1111 LAST KNOWN ALIVE DATE ADSL LSTALVDT 1 2012-09-17 2012-09-07 OS Overall Survival 11
CDISCPILOT01 01-701-1115 LAST KNOWN ALIVE DATE ADSL LSTALVDT 1 2013-01-23 2012-11-30 OS Overall Survival 55
CDISCPILOT01 01-701-1118 LAST KNOWN ALIVE DATE ADSL LSTALVDT 1 2014-09-09 2014-03-12 OS Overall Survival 182
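The AVAL values in the output above are consistent with counting both the start day and the end day, that is, ADT minus STARTDT plus one. The following base R sketch recomputes the first four values by hand as a plausibility check; it does not call {admiral}.

```r
# Start and analysis dates copied from the first four rows of the output above
STARTDT <- as.Date(c("2014-01-02", "2012-08-05", "2013-07-19", "2014-03-18"))
ADT <- as.Date(c("2014-07-02", "2012-09-02", "2014-01-14", "2014-04-14"))

# Duration in days, counting both the start day and the end day
AVAL <- as.numeric(ADT - STARTDT) + 1
print(AVAL)
```

With this convention, a subject whose analysis date equals the start date has AVAL = 1 rather than 0.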

Derive Analysis Sequence Number (ASEQ)

The {admiral} function derive_var_obs_number() can be used to derive ASEQ:

adtte <- derive_var_obs_number(
  adtte,
  by_vars = exprs(STUDYID, USUBJID),
  order = exprs(PARAMCD),
  check_type = "error"
)
STUDYID USUBJID EVNTDESC SRCDOM SRCVAR CNSR ADT STARTDT PARAMCD PARAM AVAL ASEQ
CDISCPILOT01 01-701-1015 LAST KNOWN ALIVE DATE ADSL LSTALVDT 1 2014-07-02 2014-01-02 OS Overall Survival 182 1
CDISCPILOT01 01-701-1023 LAST KNOWN ALIVE DATE ADSL LSTALVDT 1 2012-09-02 2012-08-05 OS Overall Survival 29 1
CDISCPILOT01 01-701-1028 LAST KNOWN ALIVE DATE ADSL LSTALVDT 1 2014-01-14 2013-07-19 OS Overall Survival 180 1
CDISCPILOT01 01-701-1033 LAST KNOWN ALIVE DATE ADSL LSTALVDT 1 2014-04-14 2014-03-18 OS Overall Survival 28 1
CDISCPILOT01 01-701-1034 LAST KNOWN ALIVE DATE ADSL LSTALVDT 1 2014-12-30 2014-07-01 OS Overall Survival 183 1
CDISCPILOT01 01-701-1047 LAST KNOWN ALIVE DATE ADSL LSTALVDT 1 2013-04-07 2013-02-12 OS Overall Survival 55 1
CDISCPILOT01 01-701-1097 LAST KNOWN ALIVE DATE ADSL LSTALVDT 1 2014-07-09 2014-01-01 OS Overall Survival 190 1
CDISCPILOT01 01-701-1111 LAST KNOWN ALIVE DATE ADSL LSTALVDT 1 2012-09-17 2012-09-07 OS Overall Survival 11 1
CDISCPILOT01 01-701-1115 LAST KNOWN ALIVE DATE ADSL LSTALVDT 1 2013-01-23 2012-11-30 OS Overall Survival 55 1
CDISCPILOT01 01-701-1118 LAST KNOWN ALIVE DATE ADSL LSTALVDT 1 2014-09-09 2014-03-12 OS Overall Survival 182 1
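In the output above every subject has a single parameter, so ASEQ is always 1. The numbering becomes visible once a subject has several parameters. The following base R sketch, using a hypothetical dataset `adtte_toy`, mimics the effect of sorting by PARAMCD within subject and numbering the records; it is an illustration and not the {admiral} implementation.

```r
# Hypothetical ADTTE with two parameters per subject, in arbitrary order
adtte_toy <- data.frame(
  STUDYID = "AB42",
  USUBJID = c("02", "01", "01", "02"),
  PARAMCD = c("PFS", "PFS", "OS", "OS"),
  stringsAsFactors = FALSE
)

# Sort by study, subject, and parameter code
sort_idx <- order(adtte_toy$STUDYID, adtte_toy$USUBJID, adtte_toy$PARAMCD)
adtte_toy <- adtte_toy[sort_idx, ]

# Number the records within each subject
adtte_toy$ASEQ <- ave(seq_len(nrow(adtte_toy)), adtte_toy$USUBJID, FUN = seq_along)
print(adtte_toy)
```

Each subject gets ASEQ = 1 for OS and ASEQ = 2 for PFS, because "OS" sorts before "PFS".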

Add ADSL Variables

Variables from ADSL which are required for time-to-event analyses, e.g., treatment variables or covariates, can be added using derive_vars_merged().

adtte <- derive_vars_merged(
  adtte,
  dataset_add = adsl,
  new_vars = exprs(ARMCD, ARM, ACTARMCD, ACTARM, AGE, SEX),
  by_vars = exprs(STUDYID, USUBJID)
)
USUBJID PARAMCD CNSR AVAL ARMCD AGE SEX
01-701-1015 OS 1 182 Pbo 63 F
01-701-1023 OS 1 29 Pbo 64 M
01-701-1028 OS 1 180 Xan_Hi 71 M
01-701-1033 OS 1 28 Xan_Lo 74 M
01-701-1034 OS 1 183 Xan_Hi 77 F
01-701-1047 OS 1 55 Pbo 85 F
01-701-1097 OS 1 190 Xan_Lo 68 M
01-701-1111 OS 1 11 Xan_Lo 81 F
01-701-1115 OS 1 55 Xan_Lo 84 M
01-701-1118 OS 1 182 Pbo 52 M

Add Labels and Attributes

Adding labels and attributes for SAS transport files is supported by the following packages:

  • metacore: establish a common foundation for the use of metadata within an R session.

  • metatools: enable the use of metacore objects. Metatools can be used to build datasets or enhance columns in existing datasets as well as checking datasets against the metadata.

  • xportr: functionality to associate all metadata information with a local R data frame, perform dataset-level validation checks, and convert it into a transport v5 file (xpt).

NOTE: All these packages are in the experimental phase, but the vision is to have them associated with an End to End pipeline under the umbrella of the pharmaverse. An example of applying metadata and performing associated checks can be found at the pharmaverse E2E example.