This article describes creating questionnaire ADaMs. Although
questionnaire data is collected in a single SDTM dataset (QS), it usually
does not make sense to create a single ADQS dataset for all questionnaire
analyses. For example, a univariate analysis of scores by visit requires
different variables than a time-to-event analysis. Therefore this vignette
does not provide a programming workflow for a complete dataset, but
provides examples for deriving common types of questionnaire parameters.
At the moment, {admiral} does not provide functions or metadata for
specific questionnaires, nor functionality for handling the large number
of questionnaires and related parameters, e.g. a metadata structure for
storing parameter definitions and functions for reading such metadata. We
plan to provide this in future releases.
Note: All examples assume CDISC SDTM and/or ADaM format as input unless otherwise specified.
The examples in this vignette require the following packages: admiral, dplyr, tidyr, tibble, stringr, and lubridate.
In this vignette we use the example data from the CDISC ADaM Supplements (Generalized Anxiety Disorder 7-Item Version 2 (GAD-7), Geriatric Depression Scale Short Form (GDS-SF)) [1]:
STUDYID | USUBJID | QSSEQ | QSTESTCD | QSTEST | VISIT | VISITNUM | QSDTC | QSORRES | QSCAT | QSSTRESN |
---|---|---|---|---|---|---|---|---|---|---|
STUDYX | P0001 | 1 | GAD0201 | GAD02-Feeling Nervous Anxious or On Edge | VISIT 1 | 1 | 2012-11-16 | More than half the days | GAD-7 V2 | 2 |
STUDYX | P0001 | 2 | GAD0202 | GAD02-Not Able to Stop/Control Worrying | VISIT 1 | 1 | 2012-11-16 | More than half the days | GAD-7 V2 | 2 |
STUDYX | P0001 | 3 | GAD0203 | GAD02-Worrying Too Much About Things | VISIT 1 | 1 | 2012-11-16 | More than half the days | GAD-7 V2 | 2 |
STUDYX | P0001 | 4 | GAD0204 | GAD02-Trouble Relaxing | VISIT 1 | 1 | 2012-11-16 | More than half the days | GAD-7 V2 | 2 |
STUDYX | P0001 | 5 | GAD0205 | GAD02-Being Restless Hard to Sit Still | VISIT 1 | 1 | 2012-11-16 | Nearly every day | GAD-7 V2 | 3 |
STUDYX | P0001 | 6 | GAD0206 | GAD02-Becoming Easily Annoyed/Irritable | VISIT 1 | 1 | 2012-11-16 | Nearly every day | GAD-7 V2 | 3 |
STUDYX | P0001 | 7 | GAD0207 | GAD02-Feel Afraid/Something Awful Happen | VISIT 1 | 1 | 2012-11-16 | Several days | GAD-7 V2 | 1 |
STUDYX | P0001 | 8 | GAD0208 | GAD02-Total Score | VISIT 1 | 1 | 2012-11-16 | 15 | GAD-7 V2 | 15 |
STUDYX | P0001 | 9 | GAD0201 | GAD02-Feeling Nervous Anxious or On Edge | UNSCHEDULED VISIT 5.01 | 501 | 2013-04-15 | More than half the days | GAD-7 V2 | 2 |
STUDYX | P0001 | 10 | GAD0202 | GAD02-Not Able to Stop/Control Worrying | UNSCHEDULED VISIT 5.01 | 501 | 2013-04-15 | More than half the days | GAD-7 V2 | 2 |
adsl <- tribble(
  ~STUDYID, ~USUBJID, ~SITEID, ~ITTFL, ~TRTSDT, ~DTHCAUS,
  "STUDYX", "P0001", 13L, "Y", lubridate::ymd("2012-11-16"), NA_character_,
  "STUDYX", "P0002", 11L, "Y", lubridate::ymd("2012-11-16"), "PROGRESSIVE DISEASE"
)
STUDYID | USUBJID | SITEID | ITTFL | TRTSDT | DTHCAUS |
---|---|---|---|---|---|
STUDYX | P0001 | 13 | Y | 2012-11-16 | NA |
STUDYX | P0002 | 11 | Y | 2012-11-16 | PROGRESSIVE DISEASE |
The original items, i.e. the answers to the questionnaire questions, can be handled in the same way as in a BDS finding ADaM. For example:
adqs <- qs %>%
  # Add ADSL variables
  derive_vars_merged(
    dataset_add = adsl,
    new_vars = exprs(TRTSDT, DTHCAUS),
    by_vars = exprs(STUDYID, USUBJID)
  ) %>%
  # Add analysis parameter variables
  mutate(
    PARAMCD = QSTESTCD,
    PARAM = QSTEST,
    PARCAT1 = QSCAT,
    AVALC = QSORRES,
    AVAL = QSSTRESN
  ) %>%
  # Add timing variables
  derive_vars_dt(new_vars_prefix = "A", dtc = QSDTC) %>%
  derive_vars_dy(reference_date = TRTSDT, source_vars = exprs(ADT)) %>%
  mutate(
    AVISIT = if_else(ADT <= TRTSDT, "BASELINE", VISIT),
    AVISITN = if_else(ADT <= TRTSDT, 0, VISITNUM)
  )
USUBJID | PARAMCD | PARAM | PARCAT1 | AVALC | AVAL | ADY | AVISIT |
---|---|---|---|---|---|---|---|
P0001 | GAD0201 | GAD02-Feeling Nervous Anxious or On Edge | GAD-7 V2 | More than half the days | 2 | 1 | BASELINE |
P0001 | GAD0202 | GAD02-Not Able to Stop/Control Worrying | GAD-7 V2 | More than half the days | 2 | 1 | BASELINE |
P0001 | GAD0203 | GAD02-Worrying Too Much About Things | GAD-7 V2 | More than half the days | 2 | 1 | BASELINE |
P0001 | GAD0204 | GAD02-Trouble Relaxing | GAD-7 V2 | More than half the days | 2 | 1 | BASELINE |
P0001 | GAD0205 | GAD02-Being Restless Hard to Sit Still | GAD-7 V2 | Nearly every day | 3 | 1 | BASELINE |
P0001 | GAD0206 | GAD02-Becoming Easily Annoyed/Irritable | GAD-7 V2 | Nearly every day | 3 | 1 | BASELINE |
P0001 | GAD0207 | GAD02-Feel Afraid/Something Awful Happen | GAD-7 V2 | Several days | 1 | 1 | BASELINE |
P0001 | GAD0208 | GAD02-Total Score | GAD-7 V2 | 15 | 15 | 1 | BASELINE |
P0001 | GAD0201 | GAD02-Feeling Nervous Anxious or On Edge | GAD-7 V2 | More than half the days | 2 | 151 | UNSCHEDULED VISIT 5.01 |
P0001 | GAD0202 | GAD02-Not Able to Stop/Control Worrying | GAD-7 V2 | More than half the days | 2 | 151 | UNSCHEDULED VISIT 5.01 |
We handle unscheduled visits as normal visits. For deriving visits
based on time-windows, see Visit and Period Variables. For flagging
values to be used for analysis, see derive_var_extreme_flag().
Please note that in the example data, the numeric values of the
answers are mapped in SDTM (QSSTRESN) such that they can be used for
deriving scores. Depending on the question, QSORRES == "YES" is mapped to
QSSTRESN = 0 or QSSTRESN = 1. If the QSSTRESN values are not ready to be
used for deriving scores and require transformation, it is recommended
that QSSTRESN is kept in the ADaM dataset for traceability, and the
transformed value is stored in AVAL, since that’s what will be used for
the score calculation.
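As an illustration of keeping QSSTRESN for traceability while storing the transformed value in AVAL, the following base R sketch reverse-scores one item. The item codes and the choice of which item to reverse are assumptions made for this example only; they are not taken from the GAD-7 or GDS-SF definitions.

```r
# Hypothetical items; QSSTRESN is the numeric value as mapped in SDTM
qs_items <- data.frame(
  QSTESTCD = c("ITEM01", "ITEM02", "ITEM03"),
  QSSTRESN = c(0, 1, 1)
)

# Assumption: ITEM02 is scored in the opposite direction on a 0/1 scale
reverse_items <- c("ITEM02")

# Keep QSSTRESN unchanged and store the value used for scoring in AVAL
qs_items$AVAL <- ifelse(
  qs_items$QSTESTCD %in% reverse_items,
  1 - qs_items$QSSTRESN,
  qs_items$QSSTRESN
)
```

Both columns remain in the data, so the collected value and the value used for scoring can be compared record by record.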
Scales and scores are often derived as the sum or the average across
a subset of the items. For the GAD-7 questionnaire, the total score is
derived as the sum. The derive_summary_records() function with sum() can
be used to derive it as a new parameter. For selecting the parameters to
be summarized, regular expressions like the one in the example below may
be helpful. In the example we derive a separate ADaM dataset for each
questionnaire. Depending on the analysis needs, it is also possible for
an ADaM to contain more than one questionnaire or all questionnaires.
adgad7 <- adqs %>%
  # Select records to keep in the GAD-7 ADaM
  filter(PARCAT1 == "GAD-7 V2") %>%
  derive_summary_records(
    dataset = .,
    dataset_add = .,
    by_vars = exprs(STUDYID, USUBJID, AVISIT, ADT, ADY, TRTSDT, DTHCAUS),
    # Select records contributing to total score
    filter_add = str_detect(PARAMCD, "GAD020[1-7]"),
    set_values_to = exprs(
      AVAL = sum(AVAL, na.rm = TRUE),
      PARAMCD = "GAD02TS",
      PARAM = "GAD02-Total Score - Analysis"
    )
  )
USUBJID | PARAMCD | PARAM | AVAL | ADY | AVISIT |
---|---|---|---|---|---|
P0001 | GAD0201 | GAD02-Feeling Nervous Anxious or On Edge | 2 | 1 | BASELINE |
P0001 | GAD0202 | GAD02-Not Able to Stop/Control Worrying | 2 | 1 | BASELINE |
P0001 | GAD0203 | GAD02-Worrying Too Much About Things | 2 | 1 | BASELINE |
P0001 | GAD0204 | GAD02-Trouble Relaxing | 2 | 1 | BASELINE |
P0001 | GAD0205 | GAD02-Being Restless Hard to Sit Still | 3 | 1 | BASELINE |
P0001 | GAD0206 | GAD02-Becoming Easily Annoyed/Irritable | 3 | 1 | BASELINE |
P0001 | GAD0207 | GAD02-Feel Afraid/Something Awful Happen | 1 | 1 | BASELINE |
P0001 | GAD0208 | GAD02-Total Score | 15 | 1 | BASELINE |
P0001 | GAD02TS | GAD02-Total Score - Analysis | 15 | 1 | BASELINE |
P0001 | GAD0201 | GAD02-Feeling Nervous Anxious or On Edge | 2 | 151 | UNSCHEDULED VISIT 5.01 |
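The selection and summation step can be checked in isolation. The sketch below uses base R only, with the baseline values shown above; grepl() stands in for str_detect(), and the all-missing visit at the end is a made-up case.

```r
paramcd <- c(sprintf("GAD02%02d", 1:8), "GAD02TS")
aval <- c(2, 2, 2, 2, 3, 3, 1, 15, NA)

# The pattern matches the seven items but neither the collected total
# (GAD0208) nor the derived total (GAD02TS)
is_item <- grepl("GAD020[1-7]", paramcd)

# Sum of the item values
total <- sum(aval[is_item], na.rm = TRUE)

# Made-up case: with na.rm = TRUE, a visit where all items are missing
# sums to 0 rather than NA
all_missing_total <- sum(rep(NA_real_, 7), na.rm = TRUE)
```

Here total agrees with the collected total score (GAD0208) of 15. If a score of 0 is not wanted for a visit without any answers, the number of non-missing items has to be checked separately.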
For the GDS-SF questionnaire, the total score is defined as the
average of the item values transformed to the range [0, 15] and rounded
up to the next integer. If more than five items are missing, the total
score is considered missing. This parameter can be derived by
compute_scale() and derive_summary_records():
adgdssf <- adqs %>%
  # Select records to keep in the GDS-SF ADaM
  filter(PARCAT1 == "GDS SHORT FORM") %>%
  derive_summary_records(
    dataset = .,
    dataset_add = .,
    by_vars = exprs(STUDYID, USUBJID, AVISIT, ADT, ADY, TRTSDT, DTHCAUS),
    # Select records contributing to total score
    filter_add = str_detect(PARAMCD, "GDS02[01][0-9]"),
    set_values_to = exprs(
      AVAL = compute_scale(
        AVAL,
        source_range = c(0, 1),
        target_range = c(0, 15),
        min_n = 10
      ) %>%
        ceiling(),
      PARAMCD = "GDS02TS",
      PARAM = "GDS02- Total Score - Analysis"
    )
  )
USUBJID | PARAMCD | PARAM | AVAL | ADY | AVISIT |
---|---|---|---|---|---|
P0001 | GDS0201 | GDS02-Satisfied With Life | 0 | 1 | BASELINE |
P0001 | GDS0202 | GDS02-Dropped Activities and Interests | 1 | 1 | BASELINE |
P0001 | GDS0203 | GDS02-Life Is Empty | 0 | 1 | BASELINE |
P0001 | GDS0204 | GDS02-Bored Often | 1 | 1 | BASELINE |
P0001 | GDS0205 | GDS02-Good Spirits Most of Time | 1 | 1 | BASELINE |
P0001 | GDS0206 | GDS02-Afraid of Something Bad Happening | 1 | 1 | BASELINE |
P0001 | GDS0207 | GDS02-Feel Happy Most of Time | 1 | 1 | BASELINE |
P0001 | GDS0208 | GDS02-Often Feel Helpless | 1 | 1 | BASELINE |
P0001 | GDS0209 | GDS02-Prefer to Stay Home | 0 | 1 | BASELINE |
P0001 | GDS0210 | GDS02-Memory Problems | 1 | 1 | BASELINE |
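To make the arithmetic behind the total score explicit, here is a base R sketch of the calculation as described above: average the non-missing items, map the average linearly from the source range to the target range, and return NA if too few items are available. rescale_mean() is a hypothetical helper written for this illustration, not an {admiral} function, and the item values are made up.

```r
rescale_mean <- function(x, source_range, target_range, min_n) {
  # Too few non-missing items: the score is missing
  if (sum(!is.na(x)) < min_n) {
    return(NA_real_)
  }
  avg <- mean(x, na.rm = TRUE)
  # Linear map from source_range to target_range
  target_range[1] +
    (avg - source_range[1]) / diff(source_range) * diff(target_range)
}

# 15 made-up item values, two of them missing
items <- c(0, 1, 0, 1, 1, 1, 1, 1, 0, 1, NA, 1, 0, NA, 1)
score <- ceiling(rescale_mean(items, c(0, 1), c(0, 15), min_n = 10))

# Six missing items leave only nine values, fewer than min_n
items_sparse <- c(rep(NA, 6), rep(1, 9))
score_sparse <- ceiling(
  rescale_mean(items_sparse, c(0, 1), c(0, 15), min_n = 10)
)
```

The first call averages 9/13, scales it to about 10.38, and rounds up to 11. The second call returns NA because more than five items are missing.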
After deriving the scores by visit, the baseline and change from baseline variables can be derived:
adgdssf <- adgdssf %>%
  # Flag baseline records (last before treatment start)
  restrict_derivation(
    derivation = derive_var_extreme_flag,
    args = params(
      by_vars = exprs(STUDYID, USUBJID, PARAMCD),
      order = exprs(ADT),
      new_var = ABLFL,
      mode = "last"
    ),
    filter = !is.na(AVAL) & ADT <= TRTSDT
  ) %>%
  # Derive baseline and change from baseline variables
  derive_var_base(
    by_vars = exprs(STUDYID, USUBJID, PARAMCD),
    source_var = AVAL,
    new_var = BASE
  ) %>%
  # Calculate CHG for post-baseline records
  # The decision on how to populate pre-baseline and baseline values of CHG is left to producer choice
  restrict_derivation(
    derivation = derive_var_chg,
    filter = AVISITN > 0
  ) %>%
  # Calculate PCHG for post-baseline records
  # The decision on how to populate pre-baseline and baseline values of PCHG is left to producer choice
  restrict_derivation(
    derivation = derive_var_pchg,
    filter = AVISITN > 0
  ) %>%
  # Derive sequence number
  derive_var_obs_number(
    by_vars = exprs(STUDYID, USUBJID),
    order = exprs(PARAMCD, ADT),
    check_type = "error"
  )
USUBJID | PARAMCD | PARAM | AVISIT | AVAL | BASE | CHG | PCHG |
---|---|---|---|---|---|---|---|
P0001 | GDS0201 | GDS02-Satisfied With Life | BASELINE | 0 | 0 | NA | NA |
P0001 | GDS0201 | GDS02-Satisfied With Life | VISIT 2 | 0 | 0 | 0 | NA |
P0001 | GDS0201 | GDS02-Satisfied With Life | UNSCHEDULED 2.01 | 0 | 0 | 0 | NA |
P0001 | GDS0201 | GDS02-Satisfied With Life | VISIT 3 | NA | 0 | NA | NA |
P0001 | GDS0201 | GDS02-Satisfied With Life | VISIT 4 | 0 | 0 | 0 | NA |
P0001 | GDS0202 | GDS02-Dropped Activities and Interests | BASELINE | 1 | 1 | NA | NA |
P0001 | GDS0202 | GDS02-Dropped Activities and Interests | VISIT 2 | 0 | 1 | -1 | -100 |
P0001 | GDS0202 | GDS02-Dropped Activities and Interests | UNSCHEDULED 2.01 | 0 | 1 | -1 | -100 |
P0001 | GDS0202 | GDS02-Dropped Activities and Interests | VISIT 3 | NA | 1 | NA | NA |
P0001 | GDS0202 | GDS02-Dropped Activities and Interests | VISIT 4 | 0 | 1 | -1 | -100 |
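The values in the table are consistent with the usual definitions: CHG is AVAL minus BASE, and PCHG is CHG divided by the absolute value of BASE, multiplied by 100. The base R sketch below reproduces the post-baseline GDS0202 rows and shows why PCHG is missing for GDS0201, where BASE is 0. pct_change() is a hypothetical helper for illustrating the arithmetic; it is not the {admiral} implementation.

```r
pct_change <- function(aval, base) {
  chg <- aval - base
  # Percent change is undefined when the baseline value is 0
  ifelse(base == 0, NA_real_, 100 * chg / abs(base))
}

# Post-baseline values as shown in the table above
aval_gds0202 <- c(0, 0, NA, 0)
chg_gds0202 <- aval_gds0202 - 1
pchg_gds0202 <- pct_change(aval_gds0202, base = rep(1, 4))

# GDS0201 has BASE = 0, so PCHG is missing although CHG can be computed
aval_gds0201 <- c(0, 0, NA, 0)
chg_gds0201 <- aval_gds0201 - 0
pchg_gds0201 <- pct_change(aval_gds0201, base = rep(0, 4))
```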
As time-to-event parameters require specific variables like CNSR,
STARTDT, and EVNTDESC, it makes sense to create a separate time-to-event
dataset for them. However, it might be useful to create flags or
categorization variables in ADQS. For example:
# Create AVALCATx lookup table
avalcat_lookup <- exprs(
  ~PARAMCD, ~condition, ~AVALCAT1, ~AVALCAT1N,
  "GDS02TS", AVAL <= 5, "Normal", 0L,
  "GDS02TS", AVAL <= 10 & AVAL > 5, "Possible Depression", 1L,
  "GDS02TS", AVAL > 10, "Likely Depression", 2L
)
# Create CHGCAT1 lookup table
chgcat_lookup <- exprs(
  ~condition, ~CHGCAT1,
  AVALCAT1N > BASECA1N, "WORSENED",
  AVALCAT1N == BASECA1N, "NO CHANGE",
  AVALCAT1N < BASECA1N, "IMPROVED"
)
adgdssf <- adgdssf %>%
  derive_vars_cat(
    definition = avalcat_lookup,
    by_vars = exprs(PARAMCD)
  ) %>%
  derive_var_base(
    by_vars = exprs(STUDYID, USUBJID, PARAMCD),
    source_var = AVALCAT1,
    new_var = BASECAT1
  ) %>%
  derive_var_base(
    by_vars = exprs(STUDYID, USUBJID, PARAMCD),
    source_var = AVALCAT1N,
    new_var = BASECA1N
  ) %>%
  derive_vars_cat(
    definition = chgcat_lookup
  )
USUBJID | PARAMCD | PARAM | AVISIT | AVAL | AVALCAT1 | CHGCAT1 |
---|---|---|---|---|---|---|
P0001 | GDS02TS | GDS02- Total Score - Analysis | BASELINE | 10 | Possible Depression | NO CHANGE |
P0001 | GDS02TS | GDS02- Total Score - Analysis | VISIT 2 | 8 | Possible Depression | NO CHANGE |
P0001 | GDS02TS | GDS02- Total Score - Analysis | UNSCHEDULED 2.01 | 8 | Possible Depression | NO CHANGE |
P0001 | GDS02TS | GDS02- Total Score - Analysis | VISIT 3 | 7 | Possible Depression | NO CHANGE |
P0001 | GDS02TS | GDS02- Total Score - Analysis | VISIT 4 | 3 | Normal | IMPROVED |
P0001 | GDS0215 | GDS02-Most People Better Off Than You | BASELINE | 0 | NA | NA |
P0001 | GDS0215 | GDS02-Most People Better Off Than You | VISIT 2 | 0 | NA | NA |
P0001 | GDS0215 | GDS02-Most People Better Off Than You | UNSCHEDULED 2.01 | 0 | NA | NA |
P0001 | GDS0215 | GDS02-Most People Better Off Than You | VISIT 3 | 0 | NA | NA |
P0001 | GDS0215 | GDS02-Most People Better Off Than You | VISIT 4 | 0 | NA | NA |
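The cut-offs and the comparison with baseline can be reproduced with base R for the GDS02TS values shown above. This is only a sketch of the categorization logic; cut() and nested ifelse() stand in for the lookup tables.

```r
# GDS02TS values by visit as shown in the table (first value is baseline)
aval <- c(10, 8, 8, 7, 3)

# Intervals (-Inf, 5], (5, 10], (10, Inf) correspond to the three categories
labels <- c("Normal", "Possible Depression", "Likely Depression")
avalcat1n <- as.integer(cut(aval, breaks = c(-Inf, 5, 10, Inf))) - 1L
avalcat1 <- labels[avalcat1n + 1L]

# Compare each category with the baseline category
baseca1n <- avalcat1n[1]
chgcat1 <- ifelse(
  avalcat1n > baseca1n, "WORSENED",
  ifelse(avalcat1n == baseca1n, "NO CHANGE", "IMPROVED")
)
```

A score of exactly 10 falls into "Possible Depression" because the upper bound of each interval is inclusive, matching the condition AVAL <= 10 & AVAL > 5.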
Then a time to deterioration parameter can be derived by:
# Define event
deterioration_event <- event_source(
  dataset_name = "adqs",
  filter = PARAMCD == "GDS02TS" & CHGCAT1 == "WORSENED",
  date = ADT,
  set_values_to = exprs(
    EVNTDESC = "DEPRESSION WORSENED",
    SRCDOM = "ADQS",
    SRCVAR = "ADT",
    SRCSEQ = ASEQ
  )
)
# Define censoring at last assessment
last_valid_assessment <- censor_source(
  dataset_name = "adqs",
  filter = PARAMCD == "GDS02TS" & !is.na(CHGCAT1),
  date = ADT,
  set_values_to = exprs(
    EVNTDESC = "LAST ASSESSMENT",
    SRCDOM = "ADQS",
    SRCVAR = "ADT",
    SRCSEQ = ASEQ
  )
)
# Define censoring at treatment start (for subjects without assessment)
start <- censor_source(
  dataset_name = "adsl",
  date = TRTSDT,
  set_values_to = exprs(
    EVNTDESC = "TREATMENT START",
    SRCDOM = "ADSL",
    SRCVAR = "TRTSDT"
  )
)
adgdstte <- derive_param_tte(
  dataset_adsl = adsl,
  source_datasets = list(adsl = adsl, adqs = adgdssf),
  start_date = TRTSDT,
  event_conditions = list(deterioration_event),
  censor_conditions = list(last_valid_assessment, start),
  set_values_to = exprs(
    PARAMCD = "TTDEPR",
    PARAM = "Time to depression"
  )
) %>%
  derive_vars_duration(
    new_var = AVAL,
    start_date = STARTDT,
    end_date = ADT
  )
USUBJID | PARAMCD | PARAM | AVAL | CNSR | EVNTDESC | SRCDOM | SRCVAR |
---|---|---|---|---|---|---|---|
P0001 | TTDEPR | Time to depression | 90 | 1 | LAST ASSESSMENT | ADQS | ADT |
P0002 | TTDEPR | Time to depression | 30 | 0 | DEPRESSION WORSENED | ADQS | ADT |
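The selection behind these two records can be sketched in base R. The sketch assumes that the first event date is used if a subject has an event and the latest censoring date otherwise, and that the duration counts both the start and the end day, which is consistent with AVAL = 90 for a last assessment at ADY = 90. The assessment dates for P0001 are a subset of the study days shown earlier; those for P0002 are made up. tte_one_subject() is a hypothetical helper, not part of {admiral}.

```r
trtsdt <- as.Date("2012-11-16")

assessments <- data.frame(
  USUBJID = c("P0001", "P0001", "P0001", "P0002", "P0002"),
  ADT = trtsdt + c(0, 29, 89, 0, 29),
  CHGCAT1 = c("NO CHANGE", "NO CHANGE", "IMPROVED", "NO CHANGE", "WORSENED")
)

tte_one_subject <- function(d) {
  event_dates <- d$ADT[d$CHGCAT1 == "WORSENED"]
  has_event <- length(event_dates) > 0
  # First event if there is one, otherwise the last assessment (censored)
  adt <- if (has_event) min(event_dates) else max(d$ADT)
  data.frame(
    USUBJID = d$USUBJID[1],
    ADT = adt,
    CNSR = if (has_event) 0L else 1L,
    # Duration in days, counting both start and end day
    AVAL = as.numeric(adt - trtsdt) + 1
  )
}

tte <- do.call(
  rbind,
  lapply(split(assessments, assessments$USUBJID), tte_one_subject)
)
```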
The derivation of confirmed/definitive deterioration/improvement
parameters is very similar to that of the unconfirmed deterioration
parameters, except that the event is not based on CHGCATy but on a
confirmation flag variable. This confirmation flag can be derived by
derive_var_joined_exist_flag(). For example, flagging deteriorations
that are confirmed by a second assessment at least seven days later:
adgdssf <- adgdssf %>%
  derive_var_joined_exist_flag(
    dataset_add = adgdssf,
    by_vars = exprs(USUBJID, PARAMCD),
    order = exprs(ADT),
    new_var = CDETFL,
    join_vars = exprs(CHGCAT1, ADY),
    join_type = "after",
    filter_join = CHGCAT1 == "WORSENED" &
      CHGCAT1.join == "WORSENED" &
      ADY.join >= ADY + 7
  )
USUBJID | PARAMCD | PARAM | ADY | CHGCAT1 | CDETFL |
---|---|---|---|---|---|
P0001 | GDS02TS | GDS02- Total Score - Analysis | 1 | NO CHANGE | NA |
P0001 | GDS02TS | GDS02- Total Score - Analysis | 30 | NO CHANGE | NA |
P0001 | GDS02TS | GDS02- Total Score - Analysis | 43 | NO CHANGE | NA |
P0001 | GDS02TS | GDS02- Total Score - Analysis | 58 | NO CHANGE | NA |
P0001 | GDS02TS | GDS02- Total Score - Analysis | 90 | IMPROVED | NA |
P0001 | GDS0215 | GDS02-Most People Better Off Than You | 1 | NA | NA |
P0001 | GDS0215 | GDS02-Most People Better Off Than You | 30 | NA | NA |
P0001 | GDS0215 | GDS02-Most People Better Off Than You | 43 | NA | NA |
P0001 | GDS0215 | GDS02-Most People Better Off Than You | 58 | NA | NA |
P0001 | GDS0215 | GDS02-Most People Better Off Than You | 90 | NA | NA |
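Since the example subject never worsens, no record is flagged above. To show the confirmation rule on data where it applies, the base R sketch below evaluates the same condition on a made-up series for one subject and parameter, assumed to be sorted by date. It illustrates the rule only and does not reproduce how derive_var_joined_exist_flag() works internally.

```r
# Made-up assessments for one subject and parameter, sorted by study day
ady <- c(1, 30, 35, 58)
chgcat1 <- c("NO CHANGE", "WORSENED", "WORSENED", "WORSENED")

cdetfl <- vapply(seq_along(ady), function(i) {
  later <- seq_along(ady) > i
  # Confirmed if a later record is also worsened and at least 7 days apart
  confirmed <- chgcat1[i] == "WORSENED" &&
    any(chgcat1[later] == "WORSENED" & ady[later] >= ady[i] + 7)
  if (confirmed) "Y" else NA_character_
}, character(1))
```

The deterioration at day 30 is not confirmed by day 35 (only five days later) but by day 58. The record at day 58 has no later assessment and therefore remains unflagged.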
For flagging deteriorations at two consecutive assessments or
considering death due to progression at the last assessment as
confirmation, the tmp_obs_nr_var argument is helpful:
# Flagging deterioration at two consecutive assessments
adgdssf <- adgdssf %>%
  derive_var_joined_exist_flag(
    dataset_add = adgdssf,
    by_vars = exprs(USUBJID, PARAMCD),
    order = exprs(ADT),
    new_var = CONDETFL,
    join_vars = exprs(CHGCAT1),
    join_type = "after",
    tmp_obs_nr_var = tmp_obs_nr,
    filter_join = CHGCAT1 == "WORSENED" &
      CHGCAT1.join == "WORSENED" &
      tmp_obs_nr.join == tmp_obs_nr + 1
  ) %>%
  # Flagging deterioration confirmed by
  # - a second deterioration at least 7 days later or
  # - deterioration at the last assessment and death due to progression
  derive_var_joined_exist_flag(
    .,
    dataset_add = .,
    by_vars = exprs(USUBJID, PARAMCD),
    order = exprs(ADT),
    new_var = CDTDTHFL,
    join_vars = exprs(CHGCAT1, ADY),
    join_type = "all",
    tmp_obs_nr_var = tmp_obs_nr,
    filter_join = CHGCAT1 == "WORSENED" & (
      CHGCAT1.join == "WORSENED" & ADY.join >= ADY + 7 |
        tmp_obs_nr == max(tmp_obs_nr.join) & DTHCAUS == "PROGRESSIVE DISEASE")
  )
USUBJID | PARAMCD | PARAM | ADY | CHGCAT1 | CONDETFL | CDTDTHFL |
---|---|---|---|---|---|---|
P0001 | GDS02TS | GDS02- Total Score - Analysis | 1 | NO CHANGE | NA | NA |
P0001 | GDS02TS | GDS02- Total Score - Analysis | 30 | NO CHANGE | NA | NA |
P0001 | GDS02TS | GDS02- Total Score - Analysis | 43 | NO CHANGE | NA | NA |
P0001 | GDS02TS | GDS02- Total Score - Analysis | 58 | NO CHANGE | NA | NA |
P0001 | GDS02TS | GDS02- Total Score - Analysis | 90 | IMPROVED | NA | NA |
P0001 | GDS0215 | GDS02-Most People Better Off Than You | 1 | NA | NA | NA |
P0001 | GDS0215 | GDS02-Most People Better Off Than You | 30 | NA | NA | NA |
P0001 | GDS0215 | GDS02-Most People Better Off Than You | 43 | NA | NA | NA |
P0001 | GDS0215 | GDS02-Most People Better Off Than You | 58 | NA | NA | NA |
P0001 | GDS0215 | GDS02-Most People Better Off Than You | 90 | NA | NA | NA |
For definitive deterioration (deterioration at all following
assessments), summary functions like all() can be used in the filter
condition:
adgdssf <- adgdssf %>%
  derive_var_joined_exist_flag(
    dataset_add = adgdssf,
    by_vars = exprs(USUBJID, PARAMCD),
    order = exprs(ADT),
    new_var = DEFDETFL,
    join_vars = exprs(CHGCAT1),
    join_type = "after",
    filter_join = CHGCAT1 == "WORSENED" & all(CHGCAT1.join == "WORSENED")
  )
USUBJID | PARAMCD | PARAM | ADY | CHGCAT1 | DEFDETFL |
---|---|---|---|---|---|
P0001 | GDS02TS | GDS02- Total Score - Analysis | 1 | NO CHANGE | NA |
P0001 | GDS02TS | GDS02- Total Score - Analysis | 30 | NO CHANGE | NA |
P0001 | GDS02TS | GDS02- Total Score - Analysis | 43 | NO CHANGE | NA |
P0001 | GDS02TS | GDS02- Total Score - Analysis | 58 | NO CHANGE | NA |
P0001 | GDS02TS | GDS02- Total Score - Analysis | 90 | IMPROVED | NA |
P0001 | GDS0215 | GDS02-Most People Better Off Than You | 1 | NA | NA |
P0001 | GDS0215 | GDS02-Most People Better Off Than You | 30 | NA | NA |
P0001 | GDS0215 | GDS02-Most People Better Off Than You | 43 | NA | NA |
P0001 | GDS0215 | GDS02-Most People Better Off Than You | 58 | NA | NA |
P0001 | GDS0215 | GDS02-Most People Better Off Than You | 90 | NA | NA |
The time-to-event parameter can be derived in the same way as for the unconfirmed parameters (see Time to Deterioration/Improvement).
This class of parameters can be used when the worst answer of a set of yes/no answers should be selected. For example, if yes/no answers for “No sleep”, “Waking up more than three times”, “More than 30 minutes to fall asleep” are collected, a parameter for the worst sleeping problems could be derived. In the example, “no sleeping problems” is assumed if all questions were answered with “no”.
adsp <- adqs %>%
  filter(PARCAT1 == "SLEEPING PROBLEMS") %>%
  derive_extreme_event(
    by_vars = exprs(USUBJID, AVISIT),
    tmp_event_nr_var = event_nr,
    order = exprs(event_nr, ADY, QSSEQ),
    mode = "first",
    events = list(
      event(
        condition = PARAMCD == "SP0101" & AVALC == "YES",
        set_values_to = exprs(
          AVALC = "No sleep",
          AVAL = 1
        )
      ),
      event(
        condition = PARAMCD == "SP0102" & AVALC == "YES",
        set_values_to = exprs(
          AVALC = "Waking up more than three times",
          AVAL = 2
        )
      ),
      event(
        condition = PARAMCD == "SP0103" & AVALC == "YES",
        set_values_to = exprs(
          AVALC = "More than 30 mins to fall asleep",
          AVAL = 3
        )
      ),
      event(
        condition = all(AVALC == "NO"),
        set_values_to = exprs(
          AVALC = "No sleeping problems",
          AVAL = 4
        )
      ),
      event(
        condition = TRUE,
        set_values_to = exprs(
          AVALC = "Missing",
          AVAL = 99
        )
      )
    ),
    set_values_to = exprs(
      PARAMCD = "SP01WSP",
      PARAM = "Worst Sleeping Problems"
    )
  )
USUBJID | PARAMCD | PARAM | AVISIT | AVALC |
---|---|---|---|---|
P0001 | SP0101 | SP01-No sleep at all | BASELINE | NO |
P0001 | SP0102 | SP01-Waking up more than three times | BASELINE | NO |
P0001 | SP0103 | SP01-More than 30 mins to fall asleep | BASELINE | YES |
P0001 | SP01WSP | Worst Sleeping Problems | BASELINE | More than 30 mins to fall asleep |
P0001 | SP0101 | SP01-No sleep at all | VISIT 2 | NO |
P0001 | SP0102 | SP01-Waking up more than three times | VISIT 2 | NO |
P0001 | SP0103 | SP01-More than 30 mins to fall asleep | VISIT 2 | NO |
P0001 | SP01WSP | Worst Sleeping Problems | VISIT 2 | No sleeping problems |
P0001 | SP0101 | SP01-No sleep at all | VISIT 4 | NO |
P0001 | SP0102 | SP01-Waking up more than three times | VISIT 4 | YES |
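The order of the events defines the priority: the first condition that is fulfilled determines the value. For a single subject and visit, this can be sketched in base R as a chain of checks. worst_sleeping_problem() is a hypothetical helper for illustration, and the incomplete visit is a made-up case.

```r
worst_sleeping_problem <- function(answers) {
  # Checks are ordered from worst to least severe; the first match wins
  if (isTRUE(answers[["SP0101"]] == "YES")) {
    "No sleep"
  } else if (isTRUE(answers[["SP0102"]] == "YES")) {
    "Waking up more than three times"
  } else if (isTRUE(answers[["SP0103"]] == "YES")) {
    "More than 30 mins to fall asleep"
  } else if (isTRUE(all(answers == "NO"))) {
    "No sleeping problems"
  } else {
    # Reached if no problem was reported but at least one answer is missing
    "Missing"
  }
}

baseline <- c(SP0101 = "NO", SP0102 = "NO", SP0103 = "YES")
visit2 <- c(SP0101 = "NO", SP0102 = "NO", SP0103 = "NO")
incomplete <- c(SP0101 = "NO", SP0102 = NA, SP0103 = "NO")

worst_baseline <- worst_sleeping_problem(baseline)
worst_visit2 <- worst_sleeping_problem(visit2)
worst_incomplete <- worst_sleeping_problem(incomplete)
```

The results for baseline and visit 2 match the SP01WSP records in the table above.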
Parameters for completion, like “at least 90% of the questions were
answered”, can be derived by derive_summary_records().
adgdssf <- adgdssf %>%
  derive_summary_records(
    dataset_add = adgdssf,
    filter_add = str_detect(PARAMCD, "GDS02[01][0-9]"),
    by_vars = exprs(USUBJID, AVISIT),
    set_values_to = exprs(
      AVAL = sum(!is.na(AVAL)) / 15 >= 0.9,
      PARAMCD = "COMPL90P",
      PARAM = "Completed at least 90% of questions?",
      AVALC = if_else(AVAL == 1, "YES", "NO")
    )
  )
USUBJID | PARAMCD | PARAM | AVISIT | AVALC |
---|---|---|---|---|
P0001 | COMPL90P | Completed at least 90% of questions? | BASELINE | YES |
P0001 | COMPL90P | Completed at least 90% of questions? | UNSCHEDULED 2.01 | YES |
P0001 | COMPL90P | Completed at least 90% of questions? | VISIT 2 | YES |
P0001 | COMPL90P | Completed at least 90% of questions? | VISIT 3 | NO |
P0001 | COMPL90P | Completed at least 90% of questions? | VISIT 4 | YES |
P0001 | GDS0201 | GDS02-Satisfied With Life | BASELINE | YES |
P0001 | GDS0201 | GDS02-Satisfied With Life | VISIT 2 | YES |
P0001 | GDS0201 | GDS02-Satisfied With Life | UNSCHEDULED 2.01 | YES |
P0001 | GDS0201 | GDS02-Satisfied With Life | VISIT 3 | NA |
P0001 | GDS0201 | GDS02-Satisfied With Life | VISIT 4 | YES |
Please note that the denominator may depend on the answers of some of the questions. For example, a given questionnaire might direct someone to go from question #4 directly to question #8 based on their response to question #4, because questions #5, #6 and #7 would not apply in that case.
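The effect of the denominator can be seen in a small base R sketch. The answers and the skip rule (questions 5 to 7 do not apply to this subject) are made up for illustration.

```r
# Made-up answers to 15 questions; NA means not answered
answers <- rep("YES", 15)
answers[5:7] <- NA

# Fixed denominator: 12 of 15 questions answered
prop_fixed <- sum(!is.na(answers)) / length(answers)
compl90_fixed <- prop_fixed >= 0.9

# Hypothetical skip logic: questions 5 to 7 do not apply, so they are
# excluded from both numerator and denominator
applicable <- !(seq_along(answers) %in% 5:7)
prop_skip <- sum(!is.na(answers[applicable])) / sum(applicable)
compl90_skip <- prop_skip >= 0.9
```

With the fixed denominator the questionnaire counts as incomplete (80%), whereas it is complete (100%) once the questions that do not apply are excluded.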
If missed visits need to be taken into account, the expected records
can be added to the input dataset by calling derive_expected_records():
# Create dataset with expected visits and parameters (GDS0201 - GDS0215)
parm_visit_ref <- crossing(
  tribble(
    ~AVISIT, ~AVISITN,
    "BASELINE", 0,
    "VISIT 2", 2,
    "VISIT 3", 3,
    "VISIT 4", 4,
    "VISIT 5", 5
  ),
  tibble(PARAMCD = sprintf("GDS02%02d", seq(1, 15)))
)
adgdssf <- adgdssf %>%
  derive_expected_records(
    dataset_ref = parm_visit_ref,
    by_vars = exprs(USUBJID),
    set_values_to = exprs(
      filled_in = 1
    )
  ) %>%
  derive_summary_records(
    dataset = .,
    dataset_add = .,
    filter_add = str_detect(PARAMCD, "GDS02[01][0-9]"),
    by_vars = exprs(USUBJID, AVISIT),
    set_values_to = exprs(
      AVAL = all(!is.na(AVAL)),
      PARAMCD = "COMPLALL",
      PARAM = "Completed all questions?",
      AVALC = if_else(AVAL == 1, "YES", "NO")
    )
  ) %>%
  filter(is.na(filled_in)) %>%
  select(-filled_in)
USUBJID | PARAMCD | PARAM | AVISIT | AVALC |
---|---|---|---|---|
P0001 | COMPL90P | Completed at least 90% of questions? | BASELINE | YES |
P0001 | COMPL90P | Completed at least 90% of questions? | UNSCHEDULED 2.01 | YES |
P0001 | COMPL90P | Completed at least 90% of questions? | VISIT 2 | YES |
P0001 | COMPL90P | Completed at least 90% of questions? | VISIT 3 | NO |
P0001 | COMPL90P | Completed at least 90% of questions? | VISIT 4 | YES |
P0001 | COMPLALL | Completed all questions? | BASELINE | YES |
P0001 | COMPLALL | Completed all questions? | UNSCHEDULED 2.01 | YES |
P0001 | COMPLALL | Completed all questions? | VISIT 2 | YES |
P0001 | COMPLALL | Completed all questions? | VISIT 3 | NO |
P0001 | COMPLALL | Completed all questions? | VISIT 4 | YES |
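The reason for adding expected records is that a missed visit has no records at all, so a by-visit summary would not produce a completion record for it. The base R sketch below illustrates the idea with a reference grid and a left join; the observed data, in which VISIT 5 is missed entirely, is made up, and expand.grid() and merge() stand in for crossing() and derive_expected_records().

```r
visits <- c("BASELINE", "VISIT 2", "VISIT 3", "VISIT 4", "VISIT 5")
paramcds <- sprintf("GDS02%02d", 1:15)

# All expected visit/parameter combinations (5 x 15 = 75 rows)
expected <- expand.grid(
  AVISIT = visits,
  PARAMCD = paramcds,
  stringsAsFactors = FALSE
)

# Made-up observed data: VISIT 5 was missed entirely
observed <- expected[expected$AVISIT != "VISIT 5", ]
observed$AVAL <- 1

# The left join keeps all expected rows; AVAL is NA where nothing was observed
filled <- merge(expected, observed, by = c("AVISIT", "PARAMCD"), all.x = TRUE)

# Completion by visit now also covers the missed visit
complete_by_visit <- tapply(!is.na(filled$AVAL), filled$AVISIT, all)
```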
[1] The example QS data (example_qs) is included in the admiral package.