---
title: "Creating ADSL"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Creating ADSL}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

# Introduction

This article describes creating an `ADSL` ADaM specific to Vaccines. Examples
are currently presented and tested using the `DM` and `EX` SDTM domains.
However, other domains could be used.

**Note:** *All examples assume CDISC SDTM and/or ADaM format as input unless
otherwise specified.*

# Programming Flow

* [Read in Data](#readdata)
* [Derive Period, Subperiod, and Phase Variables (e.g. `APxxSDT`, `APxxEDT`, ...)](#periodvars)
* [Derive Treatment Variables (`TRT0xP`, `TRT0xA`)](#treatmentvar)
* [Derive/Impute Numeric Treatment Date/Time and Duration (`TRTSDTM`, `TRTEDTM`, `TRTDURD`)](#trtdatetime)
* [Population Flags (e.g. `SAFFL`)](#popflag)
* [Derive Vaccination Date Variables](#vax_date)
* [Create Period Variables (Study Specific)](#period)
* [Derive Other Variables](#other)
* [Add Labels and Attributes](#attributes)

## Read in Data {#readdata}

To start, all data frames needed for the creation of `ADSL` should be read into
the environment. This will be a company specific process. The data frames
needed here are `DM` and `EX`.

```{r, message=FALSE, warning=FALSE}
library(admiral)
library(admiralvaccine)
library(pharmaversesdtm)
library(dplyr, warn.conflicts = FALSE)
library(lubridate)
library(stringr)
library(admiraldev)

data("dm_vaccine")
data("ex_vaccine")

# Blank strings in SDTM character variables are converted to NA
dm <- convert_blanks_to_na(dm_vaccine)
ex <- convert_blanks_to_na(ex_vaccine)
```

The `DM` domain is used as the basis for `ADSL`:

```{r eval=TRUE}
adsl <- dm %>%
  select(-DOMAIN)
```

```{r, eval=TRUE, echo=FALSE}
dataset_vignette(
  adsl,
  display_vars = exprs(USUBJID, RFSTDTC, COUNTRY, AGE, SEX, RACE, ETHNIC, ARM, ACTARM)
)
```

## Derive Period, Subperiod, and Phase Variables (e.g. `APxxSDT`, `APxxEDT`, ...) {#periodvars}

The `{admiral}` core package has separate functions to handle period variables
since these variables are study specific.

See the ["Visit and Period Variables"
vignette](https://pharmaverse.github.io/admiral/articles/visits_periods.html)
for more information.

If the variables are not derived based on a period reference dataset, they may
be derived at a later point of the flow. For example, phases like "Treatment
Phase" and "Follow up" could be derived based on treatment start and end date.

## Derive Treatment Variables (`TRT0xP`, `TRT0xA`) {#treatmentvar}

The mapping of the treatment variables is left to the ADaM programmer. An
example mapping for a study without periods may be:

```{r eval=TRUE}
# Continue from the `adsl` created above so that `DOMAIN` stays dropped
adsl <- adsl %>%
  mutate(
    # Planned treatments are taken from fixed positions of ARM
    TRT01P = substring(ARM, 1, 9),
    TRT02P = substring(ARM, 11, 100)
  ) %>%
  # Actual treatments are taken from the exposure record of each vaccination
  derive_vars_merged(
    dataset_add = ex,
    filter_add = EXLNKGRP == "VACCINATION 1",
    new_vars = exprs(TRT01A = EXTRT),
    by_vars = get_admiral_option("subject_keys")
  ) %>%
  derive_vars_merged(
    dataset_add = ex,
    filter_add = EXLNKGRP == "VACCINATION 2",
    new_vars = exprs(TRT02A = EXTRT),
    by_vars = get_admiral_option("subject_keys")
  )
```

## Derive/Impute Numeric Treatment Date/Time and Duration (`TRTSDTM`, `TRTEDTM`, `TRTDURD`) {#trtdatetime}

The function `derive_vars_merged()` can be used to derive the treatment start
and end date/times using the `ex` domain. A pre-processing step for `ex` is
required to convert the variables `EXSTDTC` and `EXENDTC` to datetime variables
and impute missing date or time components. Conversion and imputation are done
by `derive_vars_dtm()`.

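To make the conversion step concrete, here is a minimal sketch on a hypothetical toy data frame. The name `ex_toy` and its values are made up for illustration and are not part of `ex_vaccine`. It assumes `{admiral}` is installed and shows how a date-only `--DTC` value becomes a datetime, with the missing time set to the first or the last possible time.

```r
library(admiral)
library(dplyr, warn.conflicts = FALSE)
library(tibble)

# Hypothetical toy exposure records (illustration only)
ex_toy <- tribble(
  ~USUBJID, ~EXSTDTC,           ~EXENDTC,
  "01",     "2021-03-01",       "2021-03-01",
  "02",     "2021-03-02T09:30", "2021-03-02T09:45"
)

ex_toy_dtm <- ex_toy %>%
  # Missing time parts of the start are set to the first possible time
  derive_vars_dtm(
    dtc = EXSTDTC,
    new_vars_prefix = "EXST"
  ) %>%
  # Missing time parts of the end are set to the last possible time
  derive_vars_dtm(
    dtc = EXENDTC,
    new_vars_prefix = "EXEN",
    time_imputation = "last"
  )

# Subject "01" has no time collected, so the start becomes 00:00:00 and the
# end becomes 23:59:59; the imputation flags are stored in EXSTTMF and EXENTMF
ex_toy_dtm %>%
  select(USUBJID, EXSTDTM, EXSTTMF, EXENDTM, EXENTMF)
```

The date itself is never changed in this sketch; only missing time parts are filled in.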
Example calls:

```{r eval=TRUE}
# Impute missing start times to the first and missing end times to the last
# possible time; dates are not imputed
ex_ext <- ex %>%
  derive_vars_dtm(
    dtc = EXSTDTC,
    new_vars_prefix = "EXST"
  ) %>%
  derive_vars_dtm(
    dtc = EXENDTC,
    new_vars_prefix = "EXEN",
    time_imputation = "last"
  )

adsl <- adsl %>%
  # Treatment start: first qualifying exposure record per subject
  derive_vars_merged(
    dataset_add = ex_ext,
    filter_add = (EXDOSE > 0 |
      (EXDOSE == 0 &
        str_detect(EXTRT, "VACCINE"))) &
      !is.na(EXSTDTM),
    new_vars = exprs(TRTSDTM = EXSTDTM),
    order = exprs(EXSTDTM, EXSEQ),
    mode = "first",
    by_vars = get_admiral_option("subject_keys")
  ) %>%
  # Treatment end: last qualifying exposure record per subject
  derive_vars_merged(
    dataset_add = ex_ext,
    filter_add = (EXDOSE > 0 |
      (EXDOSE == 0 &
        str_detect(EXTRT, "VACCINE"))) &
      !is.na(EXENDTM),
    new_vars = exprs(TRTEDTM = EXENDTM),
    order = exprs(EXENDTM, EXSEQ),
    mode = "last",
    by_vars = get_admiral_option("subject_keys")
  )
```

This call returns the original data frame with the columns `TRTSDTM` and
`TRTEDTM` added. The time imputation flags created by `derive_vars_dtm()`
(`EXSTTMF`, `EXENTMF`) are not carried over here; add them to `new_vars` (e.g.
`TRTSTMF = EXSTTMF`) if they are needed. Exposure observations with an
incomplete date are ignored, as are zero-dose records whose `EXTRT` does not
contain "VACCINE". Missing time parts are imputed as first or last for start
and end date respectively.

The datetime variables returned can be converted to dates using the
`derive_vars_dtm_to_dt()` function.

```{r eval=TRUE}
adsl <- adsl %>%
  derive_vars_dtm_to_dt(source_vars = exprs(TRTSDTM, TRTEDTM))
```

Now that `TRTSDT` and `TRTEDT` are derived, the function `derive_var_trtdurd()`
can be used to calculate the treatment duration (`TRTDURD`).

```{r eval=TRUE}
adsl <- adsl %>%
  derive_var_trtdurd()
```

```{r, eval=TRUE, echo=FALSE}
dataset_vignette(
  adsl,
  display_vars = exprs(USUBJID, RFSTDTC, TRTSDTM, TRTSDT, TRTEDTM, TRTEDT, TRTDURD)
)
```

## Population Flags (e.g. `SAFFL`) {#popflag}

Since population flags are mainly company or study specific, no dedicated
functions are provided, but in most cases they can easily be derived using
`derive_var_merged_exist_flag()`.

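Conceptually, such a flag marks every subject who has at least one record in another dataset that fulfils a condition. The following is a minimal sketch of that idea in plain `{dplyr}`. The toy data frames `adsl_toy` and `ex_toy` are hypothetical, and the code illustrates the concept only; it is not the package's implementation.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical toy data (illustration only)
adsl_toy <- tibble(USUBJID = c("01", "02", "03"))
ex_toy <- tibble(
  USUBJID = c("01", "01", "02"),
  EXDOSE = c(0.5, 0.5, 0)
)

# Subjects with at least one exposure record with a positive dose
dosed <- ex_toy %>%
  filter(EXDOSE > 0) %>%
  distinct(USUBJID)

# Flag those subjects; all other subjects are left missing in this sketch
adsl_toy <- adsl_toy %>%
  mutate(SAFFL = if_else(USUBJID %in% dosed$USUBJID, "Y", NA_character_))
```

Which value subjects without a qualifying record should receive (missing or `"N"`) is a study-level decision.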
An example implementation using `derive_var_merged_exist_flag()` could be:

```{r eval=TRUE}
adsl <- derive_var_merged_exist_flag(
  dataset = adsl,
  dataset_add = ex,
  by_vars = exprs(STUDYID, USUBJID),
  new_var = SAFFL,
  condition = (EXDOSE > 0 | (EXDOSE == 0 & str_detect(EXTRT, "VACCINE")))
) %>%
  # Placeholder: every subject is flagged as per protocol in this example
  mutate(
    PPROTFL = "Y"
  )
```

```{r, eval=TRUE, echo=FALSE}
dataset_vignette(
  adsl,
  display_vars = exprs(USUBJID, TRTSDT, ARM, ACTARM, SAFFL, PPROTFL)
)
```

## Derive Vaccination Date Variables {#vax_date}

In this step, we create the vaccination date variables from the `EX` domain.
The function `derive_vars_vaxdt()` adds the variables `VAX01DT`, `VAX02DT`, ...
to the `adsl` dataset, one per vaccination.

If there are multiple vaccinations for a visit per subject, a warning is issued
and only the first observation, based on the variable order specified in the
`order` argument, is kept. In this case, the user needs to select the `by_vars`
appropriately.

```{r eval=TRUE}
adsl <- derive_vars_vaxdt(
  dataset = ex,
  dataset_adsl = adsl,
  by_vars = exprs(USUBJID, VISITNUM),
  order = exprs(USUBJID, VISITNUM, VISIT, EXSTDTC)
)
```

```{r, eval=TRUE, echo=FALSE}
dataset_vignette(
  adsl,
  display_vars = exprs(USUBJID, VAX01DT, VAX02DT)
)
```

This call returns the `adsl` dataset with the columns `VAX01DT` and `VAX02DT`
added.

## Create Period Variables (Study Specific) {#period}

In this step, we create the period variables. These are study specific, so
users can change the logic to fit their study requirements.

```{r eval=TRUE}
adsl <- adsl %>%
  mutate(
    # Period 1 starts at the first vaccination and ends the day before the
    # second vaccination, or at RFPENDTC if there is no second vaccination
    AP01SDT = VAX01DT,
    AP01EDT = if_else(!is.na(VAX02DT), VAX02DT - 1, as.Date(RFPENDTC)),
    # Period 2 exists only for subjects with a second vaccination
    AP02SDT = if_else(!is.na(VAX02DT), VAX02DT, NA_Date_),
    AP02EDT = if_else(!is.na(AP02SDT), as.Date(RFPENDTC), NA_Date_)
  )
```

```{r, eval=TRUE, echo=FALSE}
dataset_vignette(
  adsl,
  display_vars = exprs(USUBJID, AP01SDT, AP01EDT, AP02SDT, AP02EDT)
)
```

This call returns the input dataset with the columns `AP01SDT`, `AP01EDT`,
`AP02SDT`, and `AP02EDT` added.

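The period rules can be checked on a small hypothetical example. The sketch below uses invented subjects and dates (`adsl_toy` is not part of any package dataset) and applies the same logic: period 1 ends the day before the second vaccination, or at `RFPENDTC` when there is no second vaccination, and period 2 exists only for subjects with a second vaccination.

```r
library(dplyr, warn.conflicts = FALSE)
library(lubridate)

# Hypothetical toy subjects: "01" has two vaccinations, "02" only one
adsl_toy <- tibble(
  USUBJID = c("01", "02"),
  VAX01DT = as.Date(c("2021-03-01", "2021-03-05")),
  VAX02DT = as.Date(c("2021-03-29", NA)),
  RFPENDTC = c("2021-06-30", "2021-04-15")
) %>%
  mutate(
    AP01SDT = VAX01DT,
    AP01EDT = if_else(!is.na(VAX02DT), VAX02DT - 1, as.Date(RFPENDTC)),
    # Missing when there is no second vaccination
    AP02SDT = VAX02DT,
    AP02EDT = if_else(!is.na(AP02SDT), as.Date(RFPENDTC), NA_Date_)
  )

adsl_toy %>%
  select(USUBJID, AP01SDT, AP01EDT, AP02SDT, AP02EDT)
```

For subject "01", period 1 runs from 2021-03-01 to 2021-03-28 and period 2 from 2021-03-29 to 2021-06-30. Subject "02" has a single period ending at 2021-04-15, and both period 2 variables are missing.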
## Derive Other Variables {#other}

Users can add study-specific code to cover the needs of their analysis.
The following functions are helpful for many ADSL derivations:

- `derive_vars_merged()` - Merge Variables from a Dataset to the Input Dataset
- `derive_var_merged_exist_flag()` - Merge an Existence Flag
- `derive_var_merged_summary()` - Merge a Summary Variable

See also [Generic Functions](https://pharmaverse.github.io/admiral/articles/generic.html).

## Add Labels and Attributes {#attributes}

Adding labels and attributes for SAS transport files is supported by the
following packages:

- [metacore](https://atorus-research.github.io/metacore/): establish a common
foundation for the use of metadata within an R session.

- [metatools](https://pharmaverse.github.io/metatools/): enable the use of
metacore objects. Metatools can be used to build datasets or enhance columns in
existing datasets as well as checking datasets against the metadata.

- [xportr](https://atorus-research.github.io/xportr/): functionality to
associate all metadata information to a local R data frame, perform data set
level validation checks and convert into a [transport v5
file(xpt)](https://documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/movefile/n1xbwdre0giahfn11c99yjkpi2yb.htm).

NOTE: All these packages are in the experimental phase, but the vision is to
have them associated with an End to End pipeline under the umbrella of the
[pharmaverse](https://github.com/pharmaverse). An example of applying metadata
and performing associated checks can be found at the [pharmaverse E2E
example](https://pharmaverse.github.io/examples/adam/adsl.html).

# Example Script

ADaM | Sample Code
---- | --------------
ADSL | [ad_adsl.R](https://github.com/pharmaverse/admiralvaccine/blob/main/inst/templates/ad_adsl.R){target="_blank"}