This vignette shows the general purpose and syntax of the
tern.rbmi
R package. The tern.rbmi
provides an
interface for Reference Based Multiple Imputation (rbmi
)
within the tern framework. For details of the rbmi
package,
please see Reference Based
Multiple Imputation (rbmi). The basic usage of rbmi
core functions is described in the quickstart
vignette:
tern.rbmi
The rbmi
package consists of 4 core functions (plus
several helper functions) which are typically called in sequence:
draws()
- fits the imputation models and stores their
parametersimpute()
- creates multiple imputed datasetsanalyse()
- analyses each of the multiple imputed
datasetspool()
- combines the analysis results across imputed
datasets into a single statisticWe use a publicly available example dataset from an antidepressant
clinical trial of an active drug versus placebo. The relevant endpoint
is the Hamilton 17-item depression rating scale (HAMD17
)
which was assessed at baseline and at weeks 1, 2, 4, and 6. Study drug
discontinuation occurred in 24% of subjects from the active drug and 26%
of subjects from placebo. All data after study drug discontinuation are
missing and there is a single additional intermittent missing
observation.
library(tern.rbmi)
#> Loading required package: rbmi
#> Loading required package: tern
#> Loading required package: rtables
#> Loading required package: formatters
#>
#> Attaching package: 'formatters'
#> The following object is masked from 'package:base':
#>
#> %||%
#> Loading required package: magrittr
#>
#> Attaching package: 'rtables'
#> The following object is masked from 'package:utils':
#>
#> str
#> Registered S3 method overwritten by 'tern':
#> method from
#> tidy.glm broom
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
data <- antidepressant_data
levels(data$THERAPY) <- c("PLACEBO", "DRUG") # This is important! The order defines the computation order later
missing_var <- "CHANGE"
vars <- list(
id = "PATIENT",
visit = "VISIT",
expand_vars = c("BASVAL", "THERAPY"),
group = "THERAPY"
)
covariates <- list(
draws = c("BASVAL*VISIT", "THERAPY*VISIT"),
analyse = c("BASVAL")
)
data <- data %>%
dplyr::select(PATIENT, THERAPY, VISIT, BASVAL, THERAPY, CHANGE) %>%
dplyr::mutate(dplyr::across(.cols = vars$id, ~ as.factor(.x))) %>%
dplyr::arrange(dplyr::across(.cols = c(vars$id, vars$visit)))
# Use expand_locf to add rows corresponding to visits with missing outcomes to the dataset
data_full <- do.call(
expand_locf,
args = list(
data = data,
vars = c(vars$expand_vars, vars$group),
group = vars$id,
order = c(vars$id, vars$visit)
) %>%
append(lapply(data[c(vars$id, vars$visit)], levels))
)
data_full <- data_full %>%
dplyr::group_by(dplyr::across(vars$id)) %>%
dplyr::mutate(!!vars$group := Filter(Negate(is.na), .data[[vars$group]])[1])
# there are duplicates - use first value
data_full <- data_full %>%
dplyr::group_by(dplyr::across(c(vars$id, vars$group, vars$visit))) %>%
dplyr::slice(1) %>%
dplyr::ungroup()
# need to have a single ID column
data_full <- data_full %>%
tidyr::unite("TMP_ID", dplyr::all_of(vars$id), sep = "_#_", remove = FALSE) %>%
dplyr::mutate(TMP_ID = as.factor(TMP_ID))
Set the imputation strategy to MAR for each patient with at least one missing observation.
The rbmi::draws()
function fits the imputation models
and stores the corresponding parameter estimates or Bayesian posterior
parameter draws. The three main inputs to the rbmi::draws()
function are:
Define the names of key variables in our dataset and the covariates
included in the imputation model using rbmi::set_vars()
.
Note that the covariates argument can also include interaction
terms.
debug_mode <- FALSE
draws_vars <- rbmi::set_vars(
outcome = missing_var,
visit = vars$visit,
group = vars$group,
covariates = covariates$draws
)
draws_vars$subjid <- "TMP_ID"
Define which imputation method to use, then create samples for the
imputation parameters by running the draws()
function.
draws_method <- rbmi::method_bayes()
draws_obj <- rbmi::draws(
data = data_full,
data_ice = data_ice,
vars = draws_vars,
method = draws_method
)
#>
#> SAMPLING FOR MODEL 'rbmi_mmrm' NOW (CHAIN 1).
#> Chain 1:
#> Chain 1: Gradient evaluation took 0.000281 seconds
#> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 2.81 seconds.
#> Chain 1: Adjust your expectations accordingly!
#> Chain 1:
#> Chain 1:
#> Chain 1: Iteration: 1 / 1200 [ 0%] (Warmup)
#> Chain 1: Iteration: 120 / 1200 [ 10%] (Warmup)
#> Chain 1: Iteration: 201 / 1200 [ 16%] (Sampling)
#> Chain 1: Iteration: 320 / 1200 [ 26%] (Sampling)
#> Chain 1: Iteration: 440 / 1200 [ 36%] (Sampling)
#> Chain 1: Iteration: 560 / 1200 [ 46%] (Sampling)
#> Chain 1: Iteration: 680 / 1200 [ 56%] (Sampling)
#> Chain 1: Iteration: 800 / 1200 [ 66%] (Sampling)
#> Chain 1: Iteration: 920 / 1200 [ 76%] (Sampling)
#> Chain 1: Iteration: 1040 / 1200 [ 86%] (Sampling)
#> Chain 1: Iteration: 1160 / 1200 [ 96%] (Sampling)
#> Chain 1: Iteration: 1200 / 1200 [100%] (Sampling)
#> Chain 1:
#> Chain 1: Elapsed Time: 0.53 seconds (Warm-up)
#> Chain 1: 1.983 seconds (Sampling)
#> Chain 1: 2.513 seconds (Total)
#> Chain 1:
#> Warning in fit_mcmc(designmat = model_df_scaled[, -1, drop = FALSE], outcome = model_df_scaled[, : The largest R-hat is 1.17, indicating chains have not mixed.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#r-hat
The next step is to use the parameters from the imputation model to
generate the imputed datasets. This is done via the
rbmi::impute()
function. The function only has two key
inputs: the imputation model output from rbmi::draws()
and
the reference groups relevant to reference-based imputation methods. Its
usage is thus:
The next step is to run the analysis model on each imputed dataset.
This is done by defining an analysis function and then calling
rbmi::analyse()
to apply this function to each imputed
dataset.
# Define analysis model
analyse_fun <- ancova
ref_levels <- levels(impute_obj$data$group[[1]])
names(ref_levels) <- c("ref", "alt")
analyse_obj <- rbmi::analyse(
imputations = impute_obj,
fun = analyse_fun,
vars = rbmi::set_vars(
subjid = "TMP_ID",
outcome = missing_var,
visit = vars$visit,
group = vars$group,
covariates = covariates$analyse
)
)
The rbmi::pool()
function can be used to summarize the
analysis results across multiple imputed datasets to provide an overall
statistic with a standard error, confidence intervals and a p-value for
the hypothesis test of the null hypothesis that the effect is equal to
0.
Finally create output with rtables
and tern
packages
library(broom)
df <- tidy(pool_obj)
df
#> group est se_est lower_cl_est upper_cl_est est_contr se_contr
#> 1 ref -1.615820 0.4862316 -2.575771 -0.6558685 NA NA
#> 2 alt -1.707626 0.4749573 -2.645319 -0.7699335 -0.09180645 0.6826279
#> 3 ref -4.255971 0.6629528 -5.565757 -2.9461847 NA NA
#> 4 alt -2.818716 0.6529116 -4.109004 -1.5284283 1.43725440 0.9415421
#> 5 ref -6.395916 0.7031927 -7.785629 -5.0062024 NA NA
#> 6 alt -4.226663 0.6976878 -5.606310 -2.8470159 2.16925278 1.0038227
#> 7 ref -7.678549 0.7694951 -9.199518 -6.1575798 NA NA
#> 8 alt -4.857284 0.7642806 -6.368945 -3.3456232 2.82126435 1.0908398
#> lower_cl_contr upper_cl_contr p_value relative_reduc visit conf_level
#> 1 NA NA NA NA 4 0.95
#> 2 -1.4394968 1.255884 0.89317724 0.05681725 4 0.95
#> 3 NA NA NA NA 5 0.95
#> 4 -0.4236434 3.298152 0.12906320 -0.33770308 5 0.95
#> 5 NA NA NA NA 6 0.95
#> 6 0.1841498 4.154356 0.03244574 -0.33916220 6 0.95
#> 7 NA NA NA NA 7 0.95
#> 8 0.6643306 4.978198 0.01073682 -0.36742156 7 0.95
Final product, reshape rbmi
final results to nicely
formatted rtable
object.
basic_table() %>%
split_cols_by("group", ref_group = levels(df$group)[1]) %>%
split_rows_by("visit", split_label = "Visit", label_pos = "topleft") %>%
summarize_rbmi() %>%
build_table(df)
#> Visit ref alt
#> —————————————————————————————————————————————————————————————————————————
#> 4
#> Adjusted Mean (SE) -1.616 (0.486) -1.708 (0.475)
#> 95% CI (-2.576, -0.656) (-2.645, -0.770)
#> Difference in Adjusted Means (SE) -0.092 (0.683)
#> 95% CI (-1.439, 1.256)
#> Relative Reduction (%) 5.7%
#> p-value (RBMI) 0.8932
#> 5
#> Adjusted Mean (SE) -4.256 (0.663) -2.819 (0.653)
#> 95% CI (-5.566, -2.946) (-4.109, -1.528)
#> Difference in Adjusted Means (SE) 1.437 (0.942)
#> 95% CI (-0.424, 3.298)
#> Relative Reduction (%) -33.8%
#> p-value (RBMI) 0.1291
#> 6
#> Adjusted Mean (SE) -6.396 (0.703) -4.227 (0.698)
#> 95% CI (-7.786, -5.006) (-5.606, -2.847)
#> Difference in Adjusted Means (SE) 2.169 (1.004)
#> 95% CI (0.184, 4.154)
#> Relative Reduction (%) -33.9%
#> p-value (RBMI) 0.0324
#> 7
#> Adjusted Mean (SE) -7.679 (0.769) -4.857 (0.764)
#> 95% CI (-9.200, -6.158) (-6.369, -3.346)
#> Difference in Adjusted Means (SE) 2.821 (1.091)
#> 95% CI (0.664, 4.978)
#> Relative Reduction (%) -36.7%
#> p-value (RBMI) 0.0107