Title: | Create Common TLGs Used in Clinical Trials |
---|---|
Description: | Tables, Listings, and Graphs (TLG) library for common outputs used in clinical trials. |
Authors: | Joe Zhu [aut, cre], Daniel Sabanés Bové [aut], Jana Stoilova [aut], Davide Garolini [aut], Emily de la Rua [aut], Abinaya Yogasekaram [aut], Heng Wang [aut], Francois Collin [aut], Adrian Waddell [aut], Pawel Rucki [aut], Chendi Liao [aut], Jennifer Li [aut], F. Hoffmann-La Roche AG [cph, fnd] |
Maintainer: | Joe Zhu <[email protected]> |
License: | Apache License 2.0 |
Version: | 0.9.6.9016 |
Built: | 2024-11-20 09:26:39 UTC |
Source: | https://github.com/insightsengineering/tern |
Package to create tables, listings and graphs to analyze clinical trials data.
Maintainer: Joe Zhu
Authors:
Daniel Sabanés Bové
Jana Stoilova
Davide Garolini
Emily de la Rua
Abinaya Yogasekaram
Heng Wang
Francois Collin
Adrian Waddell
Pawel Rucki
Chendi Liao
Jennifer Li
Other contributors:
F. Hoffmann-La Roche AG [copyright holder, funder]
Useful links:
Report bugs at https://github.com/insightsengineering/tern/issues
Wrapper function for rtables::add_combo_levels() which configures settings for the risk difference column to be added to an rtables object. To add a risk difference column to a table, this function should be used as split_fun in calls to rtables::split_cols_by(), followed by setting argument riskdiff to TRUE in all following analyze function calls.
add_riskdiff(
  arm_x,
  arm_y,
  col_label = paste0(
    "Risk Difference (%) (95% CI)",
    if (length(arm_y) > 1) paste0("\n", arm_x, " vs. ", arm_y)
  ),
  pct = TRUE
)
arm_x |
( |
arm_y |
( |
col_label |
( |
pct |
( |
A closure suitable for use as a split function (split_fun) within rtables::split_cols_by() when creating a table layout.
See stat_propdiff_ci() for details on risk difference calculation.
adae <- tern_ex_adae
adae$AESEV <- factor(adae$AESEV)

lyt <- basic_table() %>%
  split_cols_by("ARMCD", split_fun = add_riskdiff(arm_x = "ARM A", arm_y = c("ARM B", "ARM C"))) %>%
  count_occurrences_by_grade(
    var = "AESEV",
    riskdiff = TRUE
  )

tbl <- build_table(lyt, df = adae)
tbl
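To make the content of a "Risk Difference (%) (95% CI)" cell more concrete, the following is a minimal base R sketch. It assumes an unadjusted Wald interval and a difference taken as arm x minus arm y; the helper name risk_diff_wald() and its arguments are hypothetical, and the method and sign convention actually used by stat_propdiff_ci() may differ.

```r
# Hypothetical helper (not part of tern): unadjusted risk difference with a
# Wald confidence interval, optionally expressed in percent.
risk_diff_wald <- function(events_x, n_x, events_y, n_y,
                           conf_level = 0.95, pct = TRUE) {
  p_x <- events_x / n_x
  p_y <- events_y / n_y
  diff <- p_x - p_y # assumed direction: arm x minus arm y
  se <- sqrt(p_x * (1 - p_x) / n_x + p_y * (1 - p_y) / n_y)
  z <- stats::qnorm(1 - (1 - conf_level) / 2)
  out <- c(diff = diff, lower = diff - z * se, upper = diff + z * se)
  if (pct) out * 100 else out
}

# 30 of 100 patients with an event in arm x, 20 of 100 in arm y.
rd <- risk_diff_wald(events_x = 30, n_x = 100, events_y = 20, n_y = 100)
round(rd, 1)
```

With pct = FALSE the same values are returned as proportions rather than percentages.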
This works analogously to rtables::add_colcounts() but on the rows. This function is a wrapper for rtables::summarize_row_groups().
add_rowcounts(lyt, alt_counts = FALSE)
lyt |
( |
alt_counts |
( |
A modified layout where the latest row split labels now have the row-wise total counts (i.e. without column-based subsetting) attached in parentheses.
Row count values are contained in these row count rows but are not displayed so that they are not considered zero rows by default when pruning.
basic_table() %>%
  split_cols_by("ARM") %>%
  add_colcounts() %>%
  split_rows_by("RACE", split_fun = drop_split_levels) %>%
  add_rowcounts() %>%
  analyze("AGE", afun = list_wrap_x(summary), format = "xx.xx") %>%
  build_table(DM)
aesi_label(aesi, scope = NULL)
aesi |
( |
scope |
( |
A string with the standard label for the AE basket.
adae <- tern_ex_adae

# Standardized query label includes scope.
aesi_label(adae$SMQ01NAM, scope = adae$SMQ01SC)

# Customized query label.
aesi_label(adae$CQ01NAM)
These functions are wrappers of rtables::analyze_colvars() which apply corresponding tern statistics functions to add an analysis to a given table layout. In particular, these functions were designed to have the analysis methods split into different columns.
analyze_vars_in_cols(): fundamental tabulation of analysis methods onto columns. In other words, the analysis methods are defined in the column space, i.e. they become column labels. By changing the variable vector, the list of functions can be applied on different variables, with the caveat of having the same number of statistical functions.
tabulate_rsp_subgroups(): similarly to analyze_vars_in_cols, this function combines analyze_colvars and summarize_row_groups in a compact way to produce standard tables that show analysis methods as columns.
tabulate_survival_subgroups(): this function is very similar to the above, but it is used for other tables.
analyze_patients_exposure_in_cols(): based only on analyze_colvars. It needs summarize_patients_exposure_in_cols() to leverage nesting of label rows analysis with rtables::summarize_row_groups().
summarize_coxreg(): generally based on rtables::summarize_row_groups(), it behaves similarly to the tabulate_* functions described above as it is designed to provide specific standard tables that may contain nested structure with a combination of summarize_row_groups() and rtables::analyze_colvars().
summarize_functions for functions which are wrappers for rtables::summarize_row_groups().
analyze_functions for functions which are wrappers for rtables::analyze().
These functions are wrappers of rtables::analyze() which apply corresponding tern statistics functions to add an analysis to a given table layout:
summarize_colvars(): although this function uses rtables::analyze_colvars(), it applies the analysis methods as different rows for one or more variables that are split into different columns. In comparison, analyze_colvars_functions leverage analyze_colvars to have the context split in rows and the analysis methods in columns.
analyze_colvars_functions for functions that are wrappers for rtables::analyze_colvars().
summarize_functions for functions which are wrappers for rtables::summarize_row_groups().
The analyze function analyze_vars() creates a layout element to summarize one or more variables, using the S3 generic function s_summary() to calculate a list of summary statistics. A list of all available statistics for numeric variables can be viewed by running get_stats("analyze_vars_numeric") and for non-numeric variables by running get_stats("analyze_vars_counts"). Use the .stats parameter to specify the statistics to include in your output summary table.
analyze_vars(
  lyt,
  vars,
  var_labels = vars,
  na_str = default_na_str(),
  nested = TRUE,
  ...,
  na.rm = TRUE,
  show_labels = "default",
  table_names = vars,
  section_div = NA_character_,
  .stats = c("n", "mean_sd", "median", "range", "count_fraction"),
  .formats = NULL,
  .labels = NULL,
  .indent_mods = NULL
)

s_summary(x, na.rm = TRUE, denom, .N_row, .N_col, .var, ...)

## S3 method for class 'numeric'
s_summary(
  x,
  na.rm = TRUE,
  denom,
  .N_row,
  .N_col,
  .var,
  control = control_analyze_vars(),
  ...
)

## S3 method for class 'factor'
s_summary(
  x,
  na.rm = TRUE,
  denom = c("n", "N_col", "N_row"),
  .N_row,
  .N_col,
  ...
)

## S3 method for class 'character'
s_summary(
  x,
  na.rm = TRUE,
  denom = c("n", "N_col", "N_row"),
  .N_row,
  .N_col,
  .var,
  verbose = TRUE,
  ...
)

## S3 method for class 'logical'
s_summary(
  x,
  na.rm = TRUE,
  denom = c("n", "N_col", "N_row"),
  .N_row,
  .N_col,
  ...
)

a_summary(
  x,
  .N_col,
  .N_row,
  .var = NULL,
  .df_row = NULL,
  .ref_group = NULL,
  .in_ref_col = FALSE,
  compare = FALSE,
  .stats = NULL,
  .formats = NULL,
  .labels = NULL,
  .indent_mods = NULL,
  na.rm = TRUE,
  na_str = default_na_str(),
  ...
)
lyt |
( |
vars |
( |
var_labels |
( |
na_str |
( |
nested |
( |
... |
arguments passed to |
na.rm |
( |
show_labels |
( |
table_names |
( |
section_div |
( |
.stats |
( Options for numeric variables are: Options for non-numeric variables are: |
.formats |
(named |
.labels |
(named |
.indent_mods |
(named |
x |
( |
denom |
(
|
.N_row |
( |
.N_col |
( |
.var |
( |
control |
(
|
verbose |
( |
.df_row |
( |
.ref_group |
( |
.in_ref_col |
( |
compare |
( |
Automatic digit formatting: The number of digits to display can be automatically determined from the analyzed variable(s) (vars) for certain statistics by setting the statistic format to "auto" in .formats. This utilizes the format_auto() formatting function. Note that only data for the current row & variable (for all columns) will be considered (.df_row[[.var]], see rtables::additional_fun_params) and not the whole dataset.
analyze_vars() returns a layout object suitable for passing to further layouting functions, or to rtables::build_table(). Adding this function to an rtable layout will add formatted rows containing the statistics from s_summary() to the table layout.
s_summary() returns different statistics depending on the class of x.
If x is of class numeric, returns a list with the following named numeric items:
n: The length() of x.
sum: The sum() of x.
mean: The mean() of x.
sd: The stats::sd() of x.
se: The standard error of the mean of x, i.e.: (sd(x) / sqrt(length(x))).
mean_sd: The mean() and stats::sd() of x.
mean_se: The mean() of x and its standard error (see above).
mean_ci: The CI for the mean of x (from stat_mean_ci()).
mean_sei: The SE interval for the mean of x, i.e.: (mean() -/+ stats::sd() / sqrt()).
mean_sdi: The SD interval for the mean of x, i.e.: (mean() -/+ stats::sd()).
mean_pval: The two-sided p-value of the mean of x (from stat_mean_pval()).
median: The stats::median() of x.
mad: The median absolute deviation of x, i.e.: (stats::median() of xc, where xc = x - stats::median()).
median_ci: The CI for the median of x (from stat_median_ci()).
quantiles: Two sample quantiles of x (from stats::quantile()).
iqr: The stats::IQR() of x.
range: The range_noinf() of x.
min: The min() of x.
max: The max() of x.
median_range: The median() and range_noinf() of x.
cv: The coefficient of variation of x, i.e.: (stats::sd() / mean() * 100).
geom_mean: The geometric mean of x, i.e.: (exp(mean(log(x)))).
geom_cv: The geometric coefficient of variation of x, i.e.: (sqrt(exp(sd(log(x)) ^ 2) - 1) * 100).
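Several of the numeric statistics listed above are defined by short formulas. The following base R sketch evaluates those formulas directly on a small vector. It illustrates the definitions only, does not call s_summary(), and the object name stats_sketch is made up for this example.

```r
# Small positive sample so that log-based statistics are defined.
x <- c(2, 4, 4, 5, 10)

n <- length(x)
m <- mean(x)
s <- stats::sd(x)

# Formulas as given in the list above (illustration only, not tern code).
stats_sketch <- list(
  se = s / sqrt(n),                                   # standard error of the mean
  mean_sei = c(m - s / sqrt(n), m + s / sqrt(n)),     # mean -/+ SE
  mean_sdi = c(m - s, m + s),                         # mean -/+ SD
  cv = s / m * 100,                                   # coefficient of variation (%)
  geom_mean = exp(mean(log(x))),                      # geometric mean
  geom_cv = sqrt(exp(stats::sd(log(x))^2) - 1) * 100  # geometric CV (%)
)
str(stats_sketch)
```

Note that geom_mean and geom_cv take log(x), so they are only meaningful for strictly positive data.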
If x is of class factor or converted from character, returns a list with named numeric items:
n: The length() of x.
count: A list with the number of cases for each level of the factor x.
count_fraction: Similar to count but also includes the proportion of cases for each level of the factor x relative to the denominator, or NA if the denominator is zero.
If x is of class logical, returns a list with named numeric items:
n: The length() of x (possibly after removing NAs).
count: Count of TRUE in x.
count_fraction: Count and proportion of TRUE in x relative to the denominator, or NA if the denominator is zero. Note that NAs in x are never counted and never lead to NA here.
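The count_fraction behavior described above for logical vectors can be sketched in base R as follows. The helper count_fraction_lgl() and its denom_n argument are hypothetical names used only for illustration. The sketch follows the documented rules (NA values are not counted, a zero denominator gives NA) and is not tern's implementation.

```r
# Hypothetical helper (not part of tern): count and fraction of TRUE values.
# `denom_n` stands in for the chosen denominator (non-missing n by default,
# or a row/column total supplied by the caller).
count_fraction_lgl <- function(x, denom_n = sum(!is.na(x))) {
  count <- sum(x, na.rm = TRUE) # NA values are never counted
  fraction <- if (denom_n == 0) NA_real_ else count / denom_n
  c(count = count, fraction = fraction)
}

x <- c(TRUE, FALSE, TRUE, TRUE, NA)
count_fraction_lgl(x)               # denominator: number of non-missing values
count_fraction_lgl(x, denom_n = 10) # denominator: e.g. a row or column total
count_fraction_lgl(logical(0))      # zero denominator: fraction is NA
```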
a_summary() returns the corresponding list with formatted rtables::CellValue().
analyze_vars(): Layout-creating function which can take statistics function arguments and additional format arguments. This function is a wrapper for rtables::analyze().
s_summary(): S3 generic function to produce a variable summary.
s_summary(numeric): Method for numeric class.
s_summary(factor): Method for factor class.
s_summary(character): Method for character class. This makes an automatic conversion to factor (with a warning) and then forwards to the method for factors.
s_summary(logical): Method for logical class.
a_summary(): Formatted analysis function which is used as afun in analyze_vars() and compare_vars() and as cfun in summarize_colvars().
If x is an empty vector, NA is returned. This is the expected behavior, so that rcell content can be returned in rtables when the intersection of a column and a row delimits an empty data selection.
When the mean function is applied to an empty vector, NA will be returned instead of NaN, the latter being standard behavior in R.
If x is an empty factor, a list is still returned for counts with one element per factor level. If there are no levels in x, the function fails.
If factor variables contain NA, these NA values are excluded by default. To include NA values set na.rm = FALSE and missing values will be displayed as an NA level. Alternatively, an explicit factor level can be defined for NA values during pre-processing via df_explicit_na(). The default na_level ("<Missing>") will also be excluded when na.rm is set to TRUE.
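The difference between dropping missing values and giving them an explicit level can be illustrated in base R. The helper explicit_missing() below is a hypothetical stand-in written for this sketch. It is not df_explicit_na(), whose behavior may differ in detail.

```r
x <- factor(c("Female", NA, "Male", "Female"))

# Missing values are simply not counted.
table(x, useNA = "no")

# Hypothetical helper (not part of tern): turn NA into an explicit level.
explicit_missing <- function(f, na_level = "<Missing>") {
  f <- addNA(f)                           # NA becomes the last factor level
  levels(f)[is.na(levels(f))] <- na_level # rename that level
  f
}

x_explicit <- explicit_missing(x)
table(x_explicit)
```

Once the missing values carry their own level, they appear as a regular category in any tabulation of the factor.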
Automatic conversion of character to factor does not guarantee that the table can be generated correctly. In particular, for sparse tables this is likely to fail. It is therefore better to always pre-process the dataset such that factors are manually created from character variables before passing the dataset to rtables::build_table().
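A minimal base R sketch of such pre-processing is shown below. The data frame and the level set c("F", "M", "U") are invented for illustration; in practice the levels would come from the study's controlled terminology.

```r
# Toy dataset with character columns (invented for this example).
df <- data.frame(
  ARM = c("A", "B", "A"),
  SEX = c("F", "M", "F"),
  AGE = c(34, 51, 47),
  stringsAsFactors = FALSE
)

# Convert every character column to a factor before building a table.
chr_cols <- vapply(df, is.character, logical(1))
df[chr_cols] <- lapply(df[chr_cols], factor)

# Where the full set of categories is known, state the levels explicitly so
# that categories absent from the data are still represented.
df$SEX <- factor(df$SEX, levels = c("F", "M", "U"))

str(df)
```

Stating levels explicitly is what protects sparse tables: a category with no observations still exists as a level and can be shown with a zero count.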
To use for comparison (with additional p-value statistic), parameter compare must be set to TRUE.
Ensure that either all NA values are converted to an explicit NA level or all NA values are left as is.
## Fabricated dataset.
dta_test <- data.frame(
  USUBJID = rep(1:6, each = 3),
  PARAMCD = rep("lab", 6 * 3),
  AVISIT = rep(paste0("V", 1:3), 6),
  ARM = rep(LETTERS[1:3], rep(6, 3)),
  AVAL = c(9:1, rep(NA, 9))
)

# `analyze_vars()` in `rtables` pipelines
## Default output within a `rtables` pipeline.
l <- basic_table() %>%
  split_cols_by(var = "ARM") %>%
  split_rows_by(var = "AVISIT") %>%
  analyze_vars(vars = "AVAL")

build_table(l, df = dta_test)

## Select and format statistics output.
l <- basic_table() %>%
  split_cols_by(var = "ARM") %>%
  split_rows_by(var = "AVISIT") %>%
  analyze_vars(
    vars = "AVAL",
    .stats = c("n", "mean_sd", "quantiles"),
    .formats = c("mean_sd" = "xx.x, xx.x"),
    .labels = c(n = "n", mean_sd = "Mean, SD", quantiles = c("Q1 - Q3"))
  )

build_table(l, df = dta_test)

## Use arguments interpreted by `s_summary`.
l <- basic_table() %>%
  split_cols_by(var = "ARM") %>%
  split_rows_by(var = "AVISIT") %>%
  analyze_vars(vars = "AVAL", na.rm = FALSE)

build_table(l, df = dta_test)

## Handle `NA` levels first when summarizing factors.
dta_test$AVISIT <- NA_character_
dta_test <- df_explicit_na(dta_test)
l <- basic_table() %>%
  split_cols_by(var = "ARM") %>%
  analyze_vars(vars = "AVISIT", na.rm = FALSE)

build_table(l, df = dta_test)

# auto format
dt <- data.frame("VAR" = c(0.001, 0.2, 0.0011000, 3, 4))
basic_table() %>%
  analyze_vars(
    vars = "VAR",
    .stats = c("n", "mean", "mean_sd", "range"),
    .formats = c("mean_sd" = "auto", "range" = "auto")
  ) %>%
  build_table(dt)

# `s_summary.numeric`

## Basic usage: empty numeric returns NA-filled items.
s_summary(numeric())

## Management of NA values.
x <- c(NA_real_, 1)
s_summary(x, na.rm = TRUE)
s_summary(x, na.rm = FALSE)

x <- c(NA_real_, 1, 2)
s_summary(x, stats = NULL)

## Benefits in `rtables` constructions:
dta_test <- data.frame(
  Group = rep(LETTERS[1:3], each = 2),
  sub_group = rep(letters[1:2], each = 3),
  x = 1:6
)

## The summary obtained with `rtables`:
basic_table() %>%
  split_cols_by(var = "Group") %>%
  split_rows_by(var = "sub_group") %>%
  analyze(vars = "x", afun = s_summary) %>%
  build_table(df = dta_test)

## By comparison with `lapply`:
X <- split(dta_test, f = with(dta_test, interaction(Group, sub_group)))
lapply(X, function(x) s_summary(x$x))

# `s_summary.factor`

## Basic usage:
s_summary(factor(c("a", "a", "b", "c", "a")))

# Empty factor returns zero-filled items.
s_summary(factor(levels = c("a", "b", "c")))

## Management of NA values.
x <- factor(c(NA, "Female"))
x <- explicit_na(x)
s_summary(x, na.rm = TRUE)
s_summary(x, na.rm = FALSE)

## Different denominators.
x <- factor(c("a", "a", "b", "c", "a"))
s_summary(x, denom = "N_row", .N_row = 10L)
s_summary(x, denom = "N_col", .N_col = 20L)

# `s_summary.character`

## Basic usage:
s_summary(c("a", "a", "b", "c", "a"), .var = "x", verbose = FALSE)
s_summary(c("a", "a", "b", "c", "a", ""), .var = "x", na.rm = FALSE, verbose = FALSE)

# `s_summary.logical`

## Basic usage:
s_summary(c(TRUE, FALSE, TRUE, TRUE))

# Empty logical vector returns zero-filled items.
s_summary(as.logical(c()))

## Management of NA values.
x <- c(NA, TRUE, FALSE)
s_summary(x, na.rm = TRUE)
s_summary(x, na.rm = FALSE)

## Different denominators.
x <- c(TRUE, FALSE, TRUE, TRUE)
s_summary(x, denom = "N_row", .N_row = 10L)
s_summary(x, denom = "N_col", .N_col = 20L)

a_summary(factor(c("a", "a", "b", "c", "a")), .N_row = 10, .N_col = 10)
a_summary(
  factor(c("a", "a", "b", "c", "a")),
  .ref_group = factor(c("a", "a", "b", "c")), compare = TRUE
)

a_summary(c("A", "B", "A", "C"), .var = "x", .N_col = 10, .N_row = 10, verbose = FALSE)
a_summary(
  c("A", "B", "A", "C"),
  .ref_group = c("B", "A", "C"), .var = "x", compare = TRUE, verbose = FALSE
)

a_summary(c(TRUE, FALSE, FALSE, TRUE, TRUE), .N_row = 10, .N_col = 10)
a_summary(
  c(TRUE, FALSE, FALSE, TRUE, TRUE),
  .ref_group = c(TRUE, FALSE), .in_ref_col = TRUE, compare = TRUE
)

a_summary(rnorm(10), .N_col = 10, .N_row = 20, .var = "bla")
a_summary(rnorm(10, 5, 1), .ref_group = rnorm(20, -5, 1), .var = "bla", compare = TRUE)
The layout-creating function analyze_vars_in_cols() creates a layout element to generate a column-wise analysis table.
This function sets the analysis methods as column labels and is a wrapper for rtables::analyze_colvars(). It was designed principally for PK tables.
analyze_vars_in_cols(
  lyt,
  vars,
  ...,
  .stats = c("n", "mean", "sd", "se", "cv", "geom_cv"),
  .labels = c(n = "n", mean = "Mean", sd = "SD", se = "SE", cv = "CV (%)", geom_cv =
    "CV % Geometric Mean"),
  row_labels = NULL,
  do_summarize_row_groups = FALSE,
  split_col_vars = TRUE,
  imp_rule = NULL,
  avalcat_var = "AVALCAT1",
  cache = FALSE,
  .indent_mods = NULL,
  na_str = default_na_str(),
  nested = TRUE,
  .formats = NULL,
  .aligns = NULL
)
lyt |
( |
vars |
( |
... |
additional arguments for the lower level functions. |
.stats |
( |
.labels |
(named |
row_labels |
( |
do_summarize_row_groups |
( |
split_col_vars |
( |
imp_rule |
( |
avalcat_var |
( |
cache |
( |
.indent_mods |
(named |
na_str |
( |
nested |
( |
.formats |
(named |
.aligns |
( |
A layout object suitable for passing to further layouting functions, or to rtables::build_table(). Adding this function to an rtable layout will summarize the given variables, arrange the output in columns, and add it to the table layout.
This is an experimental implementation of rtables::summarize_row_groups() and rtables::analyze_colvars() that may be subject to changes as rtables extends its support to more complex analysis pipelines in the column space. We encourage users to read the examples carefully and file issues for different use cases.
In this function, labelstr behaves atypically. If labelstr = NULL (the default), row labels are assigned automatically as the split values if do_summarize_row_groups = FALSE (the default), and as the group label if do_summarize_row_groups = TRUE.
See also: analyze_vars(), rtables::analyze_colvars().
library(dplyr)

# Data preparation
adpp <- tern_ex_adpp %>% h_pkparam_sort()

lyt <- basic_table() %>%
  split_rows_by(var = "STRATA1", label_pos = "topleft") %>%
  split_rows_by(
    var = "SEX",
    label_pos = "topleft",
    child_labels = "hidden"
  ) %>% # Removes duplicated labels
  analyze_vars_in_cols(vars = "AGE")
result <- build_table(lyt = lyt, df = adpp)
result

# By selecting just some statistics and ad-hoc labels
lyt <- basic_table() %>%
  split_rows_by(var = "ARM", label_pos = "topleft") %>%
  split_rows_by(
    var = "SEX",
    label_pos = "topleft",
    child_labels = "hidden",
    split_fun = drop_split_levels
  ) %>%
  analyze_vars_in_cols(
    vars = "AGE",
    .stats = c("n", "cv", "geom_mean"),
    .labels = c(
      n = "aN",
      cv = "aCV",
      geom_mean = "aGeomMean"
    )
  )
result <- build_table(lyt = lyt, df = adpp)
result

# Changing row labels
lyt <- basic_table() %>%
  analyze_vars_in_cols(
    vars = "AGE",
    row_labels = "some custom label"
  )
result <- build_table(lyt, df = adpp)
result

# Pharmacokinetic parameters
lyt <- basic_table() %>%
  split_rows_by(
    var = "TLG_DISPLAY",
    split_label = "PK Parameter",
    label_pos = "topleft",
    child_labels = "hidden"
  ) %>%
  analyze_vars_in_cols(
    vars = "AVAL"
  )
result <- build_table(lyt, df = adpp)
result

# Multiple calls (summarize label and analyze underneath)
lyt <- basic_table() %>%
  split_rows_by(
    var = "TLG_DISPLAY",
    split_label = "PK Parameter",
    label_pos = "topleft"
  ) %>%
  analyze_vars_in_cols(
    vars = "AVAL",
    do_summarize_row_groups = TRUE # does a summarize level
  ) %>%
  split_rows_by("SEX",
    child_labels = "hidden",
    label_pos = "topleft"
  ) %>%
  analyze_vars_in_cols(
    vars = "AVAL",
    split_col_vars = FALSE # avoids re-splitting the columns
  )
result <- build_table(lyt, df = adpp)
result
Helper layout-creating function to append the variable labels of a given variables vector from a given dataset in the top left corner. If a variable label is not found then the variable name itself is used instead. Multiple variable labels are concatenated with slashes.
append_varlabels(lyt, df, vars, indent = 0L)
lyt |
( |
df |
( |
vars |
( |
indent |
( |
A modified layout with the new variable label(s) added to the top-left material.
This is not an optimal implementation, since the data set itself is used during layout creation. Once a more mature rtables implementation is available, this will be improved or will no longer be necessary.
lyt <- basic_table() %>%
  split_cols_by("ARM") %>%
  add_colcounts() %>%
  split_rows_by("SEX") %>%
  append_varlabels(DM, "SEX") %>%
  analyze("AGE", afun = mean) %>%
  append_varlabels(DM, "AGE", indent = 1)
build_table(lyt, DM)

lyt <- basic_table() %>%
  split_cols_by("ARM") %>%
  split_rows_by("SEX") %>%
  analyze("AGE", afun = mean) %>%
  append_varlabels(DM, c("SEX", "AGE"))
build_table(lyt, DM)
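The label lookup described for append_varlabels() (use the variable label if present, fall back to the variable name, join several labels with slashes) can be sketched in base R. The helper resolve_varlabels() is hypothetical, and both the "label" attribute convention and the " / " separator are assumptions of this sketch rather than confirmed details of tern.

```r
# Hypothetical helper (not part of tern): resolve labels for `vars` in `df`.
resolve_varlabels <- function(df, vars, sep = " / ") {
  labels <- vapply(vars, function(v) {
    lab <- attr(df[[v]], "label") # assumed storage of the variable label
    if (is.null(lab)) v else lab  # fall back to the variable name
  }, character(1))
  paste(labels, collapse = sep)
}

df <- data.frame(SEX = c("F", "M"), AGE = c(30, 40))
attr(df$SEX, "label") <- "Sex"

# SEX has a label, AGE does not, so AGE falls back to its name.
resolve_varlabels(df, c("SEX", "AGE"))
```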
Arrange grobs as a new grob with n * m (rows * cols) layout.
arrange_grobs(
  ...,
  grobs = list(...),
  ncol = NULL,
  nrow = NULL,
  padding_ht = grid::unit(2, "line"),
  padding_wt = grid::unit(2, "line"),
  vp = NULL,
  gp = NULL,
  name = NULL
)
... |
grobs. |
grobs |
( |
ncol |
( |
nrow |
( |
padding_ht |
( |
padding_wt |
( |
vp |
( |
gp |
( |
name |
( |
A grob.
library(grid)

num <- lapply(1:9, textGrob)
grid::grid.newpage()
grid.draw(arrange_grobs(grobs = num, ncol = 2))

showViewport()

g1 <- circleGrob(gp = gpar(col = "blue"))
g2 <- circleGrob(gp = gpar(col = "red"))
g3 <- textGrob("TEST TEXT")
grid::grid.newpage()
grid.draw(arrange_grobs(g1, g2, g3, nrow = 2))

showViewport()

grid::grid.newpage()
grid.draw(arrange_grobs(g1, g2, g3, ncol = 3))

grid::grid.newpage()
grid::pushViewport(grid::viewport(layout = grid::grid.layout(1, 2)))
vp1 <- grid::viewport(layout.pos.row = 1, layout.pos.col = 2)
grid.draw(arrange_grobs(g1, g2, g3, ncol = 2, vp = vp1))

showViewport()
rtable
This is a new generic function to convert objects to rtable tables.
as.rtable(x, ...)

## S3 method for class 'data.frame'
as.rtable(x, format = "xx.xx", ...)
x |
( |
... |
additional arguments for methods. |
format |
( |
An rtables table object. Note that the concrete class will depend on the method used.
as.rtable(data.frame): Method for converting a data.frame that contains numeric columns to rtable.
x <- data.frame(
  a = 1:10,
  b = rnorm(10)
)
as.rtable(x)
CombinationFunction
CombinationFunction is an S4 class which extends standard functions. These are special functions that can be combined and negated with the logical operators.
## S4 method for signature 'CombinationFunction,CombinationFunction'
e1 & e2

## S4 method for signature 'CombinationFunction,CombinationFunction'
e1 | e2

## S4 method for signature 'CombinationFunction'
!x
e1 |
( |
e2 |
( |
x |
( |
A logical value indicating whether the left hand side of the equation equals the right hand side.
e1 & e2
: Logical "AND" combination of CombinationFunction
functions.
The resulting object is of the same class, and evaluates the two argument functions. The result
is then the "AND" of the two individual results.
e1 | e2
: Logical "OR" combination of CombinationFunction
functions.
The resulting object is of the same class, and evaluates the two argument functions. The result
is then the "OR" of the two individual results.
`!`(CombinationFunction)
: Logical negation of CombinationFunction
functions.
The resulting object is of the same class, and evaluates the original function. The result
is then the opposite of this result.
higher <- function(a) {
  force(a)
  CombinationFunction(
    function(x) {
      x > a
    }
  )
}

lower <- function(b) {
  force(b)
  CombinationFunction(
    function(x) {
      x < b
    }
  )
}

c1 <- higher(5)
c2 <- lower(10)
c3 <- higher(5) & lower(10)
c3(7)
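The "OR" and negation methods described above can be exercised in the same way as the "AND" example. The following is a minimal sketch that reuses the `higher()` and `lower()` helpers from that example; the names `outside` and `not_high` are illustrative only, and the sketch assumes that calling the combined object returns a plain logical value, as in `c3(7)`.

```r
library(tern)

# Same helpers as in the example above, written compactly.
higher <- function(a) {
  force(a)
  CombinationFunction(function(x) x > a)
}
lower <- function(b) {
  force(b)
  CombinationFunction(function(x) x < b)
}

# "OR": satisfied when x is below 5 or above 10.
outside <- lower(5) | higher(10)

# Negation: satisfied when x is not greater than 5.
not_high <- !higher(5)

outside(7)
outside(12)
not_high(3)
```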
Simplifies the estimation of column counts, especially when group combination is required.
combine_counts(fct, groups_list = NULL)
fct |
( |
groups_list |
(named |
A vector
of column counts.
ref <- c("A: Drug X", "B: Placebo")
groups <- combine_groups(fct = DM$ARM, ref = ref)

col_counts <- combine_counts(
  fct = DM$ARM,
  groups_list = groups
)

basic_table() %>%
  split_cols_by_groups("ARM", groups) %>%
  add_colcounts() %>%
  analyze_vars("AGE") %>%
  build_table(DM, col_counts = col_counts)

ref <- "A: Drug X"
groups <- combine_groups(fct = DM$ARM, ref = ref)

col_counts <- combine_counts(
  fct = DM$ARM,
  groups_list = groups
)

basic_table() %>%
  split_cols_by_groups("ARM", groups) %>%
  add_colcounts() %>%
  analyze_vars("AGE") %>%
  build_table(DM, col_counts = col_counts)
Facilitate the re-combination of groups divided as reference and treatment groups; it helps in arranging groups of
columns in the rtables
framework and teal modules.
combine_groups(fct, ref = NULL, collapse = "/")
fct |
( |
ref |
( |
collapse |
( |
A list
with first item ref
(reference) and second item trt
(treatment).
groups <- combine_groups(
  fct = DM$ARM,
  ref = c("B: Placebo")
)

basic_table() %>%
  split_cols_by_groups("ARM", groups) %>%
  add_colcounts() %>%
  analyze_vars("AGE") %>%
  build_table(DM)
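To make the idea behind group combination and the matching column counts more concrete, here is a rough base R sketch. It is not the tern implementation: the `arm` vector is made-up data, and the element names `ref` and `trt` are an assumption chosen for readability (the names returned by `combine_groups()` may differ).

```r
# Hypothetical arm assignments; not taken from any tern dataset.
arm <- factor(c(
  "A: Drug X", "B: Placebo", "C: Combination", "B: Placebo", "A: Drug X"
))
ref <- "B: Placebo"

# Reference group first, all remaining levels pooled as the treatment group.
groups <- list(
  ref = ref,
  trt = setdiff(levels(arm), ref)
)

# One count per combined group, i.e. per resulting column.
col_counts <- vapply(groups, function(lv) sum(arm %in% lv), integer(1))
col_counts
```

Counting per combined group rather than per original level is what keeps the column header counts consistent with the pooled columns.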
Combine specified old factor levels into a single new level.
combine_levels(x, levels, new_level = paste(levels, collapse = "/"))
x |
( |
levels |
( |
new_level |
( |
A factor
with the new levels.
x <- factor(letters[1:5], levels = letters[5:1])
combine_levels(x, levels = c("a", "b"))

combine_levels(x, c("e", "b"))
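For readers who want to see what collapsing levels amounts to, the following base R sketch reproduces the effect of the first call above by relabelling levels. It is an illustration under the assumption that the new level takes the position of the first old level it replaces; it does not claim to be how `combine_levels()` is implemented.

```r
x <- factor(letters[1:5], levels = letters[5:1])

lvls <- levels(x)
# Give both old levels the same new label.
lvls[lvls %in% c("a", "b")] <- "a/b"
# Assigning duplicated labels merges the corresponding levels.
levels(x) <- lvls

levels(x)
as.character(x)
```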
Element-wise combination of two vectors.
combine_vectors(x, y)
x |
( |
y |
( |
A list
where each element combines corresponding elements of x
and y
.
combine_vectors(1:3, 4:6)
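The shape of the result described above (one list element per position, holding the pair of values) can be sketched in base R with `Map()`. This is only an analogy for the return structure, not the function's implementation.

```r
x <- 1:3
y <- 4:6

# Pair up the i-th element of x with the i-th element of y.
pairs <- Map(c, x, y)
pairs
```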
The analyze function compare_vars()
creates a layout element to summarize and compare one or more variables, using
the S3 generic function s_summary()
to calculate a list of summary statistics. A list of all available statistics
for numeric variables can be viewed by running get_stats("analyze_vars_numeric", add_pval = TRUE)
and for
non-numeric variables by running get_stats("analyze_vars_counts", add_pval = TRUE)
. Use the .stats
parameter to
specify the statistics to include in your output summary table.
Prior to using this function in your table layout you must use rtables::split_cols_by()
to create a column
split on the variable to be used in comparisons, and specify a reference group via the ref_group
parameter.
Comparisons can be performed for each group (column) against the specified reference group by including the p-value
statistic.
compare_vars(
  lyt,
  vars,
  var_labels = vars,
  na_str = default_na_str(),
  nested = TRUE,
  ...,
  na.rm = TRUE,
  show_labels = "default",
  table_names = vars,
  section_div = NA_character_,
  .stats = c("n", "mean_sd", "count_fraction", "pval"),
  .formats = NULL,
  .labels = NULL,
  .indent_mods = NULL
)

s_compare(x, .ref_group, .in_ref_col, ...)

## S3 method for class 'numeric'
s_compare(x, .ref_group, .in_ref_col, ...)

## S3 method for class 'factor'
s_compare(x, .ref_group, .in_ref_col, denom = "n", na.rm = TRUE, ...)

## S3 method for class 'character'
s_compare(
  x,
  .ref_group,
  .in_ref_col,
  denom = "n",
  na.rm = TRUE,
  .var,
  verbose = TRUE,
  ...
)

## S3 method for class 'logical'
s_compare(x, .ref_group, .in_ref_col, na.rm = TRUE, denom = "n", ...)
lyt |
( |
vars |
( |
var_labels |
( |
na_str |
( |
nested |
( |
... |
arguments passed to |
na.rm |
( |
show_labels |
( |
table_names |
( |
section_div |
( |
.stats |
( Options for numeric variables are: Options for non-numeric variables are: |
.formats |
(named |
.labels |
(named |
.indent_mods |
(named |
x |
( |
.ref_group |
( |
.in_ref_col |
( |
denom |
( |
.var |
( |
verbose |
( |
compare_vars()
returns a layout object suitable for passing to further layouting functions,
or to rtables::build_table()
. Adding this function to an rtable
layout will add formatted rows containing
the statistics from s_compare()
to the table layout.
s_compare()
returns output of s_summary()
and comparisons versus the reference group in the form of p-values.
compare_vars()
: Layout-creating function which can take statistics function arguments
and additional format arguments. This function is a wrapper for rtables::analyze()
.
s_compare()
: S3 generic function to produce a comparison summary.
s_compare(numeric)
: Method for numeric
class. This uses the standard t-test
to calculate the p-value.
s_compare(factor)
: Method for factor
class. This uses the chi-squared test
to calculate the p-value.
s_compare(character)
: Method for character
class. This makes an automatic
conversion to factor
(with a warning) and then forwards to the method for factors.
s_compare(logical)
: Method for logical
class. A chi-squared test
is used. If missing values are not removed, then they are counted as FALSE
.
For factor variables, denom can only be n: the purpose is to compare proportions between columns, so a row-based proportion would not make sense. Proportions based on N_col would be difficult because counts are used for the chi-squared test statistic; missing values should therefore be accounted for as explicit factor levels.
If factor variables contain NA
, these NA
values are excluded by default. To include NA
values
set na.rm = FALSE
and missing values will be displayed as an NA
level. Alternatively, an explicit
factor level can be defined for NA
values during pre-processing via df_explicit_na()
- the
default na_level
("<Missing>"
) will also be excluded when na.rm
is set to TRUE
.
For character variables, automatic conversion to factor does not guarantee that the table will be generated correctly. In particular, it is very likely to fail for sparse tables. It is therefore always better to convert character variables to factors manually during pre-processing.
For compare_vars()
, the column split must define a reference group via ref_group
so that the comparison
is well defined.
s_summary()
which is used internally to compute a summary within s_compare()
, and a_summary()
which is used (with compare = TRUE
) as the analysis function for compare_vars()
.
# `compare_vars()` in `rtables` pipelines

## Default output within a `rtables` pipeline.
lyt <- basic_table() %>%
  split_cols_by("ARMCD", ref_group = "ARM B") %>%
  compare_vars(c("AGE", "SEX"))

build_table(lyt, tern_ex_adsl)

## Select and format statistics output.
lyt <- basic_table() %>%
  split_cols_by("ARMCD", ref_group = "ARM C") %>%
  compare_vars(
    vars = "AGE",
    .stats = c("mean_sd", "pval"),
    .formats = c(mean_sd = "xx.x, xx.x"),
    .labels = c(mean_sd = "Mean, SD")
  )

build_table(lyt, df = tern_ex_adsl)

# `s_compare.numeric`

## Usual case where both this and the reference group vector have more than 1 value.
s_compare(rnorm(10, 5, 1), .ref_group = rnorm(5, -5, 1), .in_ref_col = FALSE)

## If one group has not more than 1 value, then p-value is not calculated.
s_compare(rnorm(10, 5, 1), .ref_group = 1, .in_ref_col = FALSE)

## Empty numeric does not fail, it returns NA-filled items and no p-value.
s_compare(numeric(), .ref_group = numeric(), .in_ref_col = FALSE)

# `s_compare.factor`

## Basic usage:
x <- factor(c("a", "a", "b", "c", "a"))
y <- factor(c("a", "b", "c"))
s_compare(x = x, .ref_group = y, .in_ref_col = FALSE)

## Management of NA values.
x <- explicit_na(factor(c("a", "a", "b", "c", "a", NA, NA)))
y <- explicit_na(factor(c("a", "b", "c", NA)))
s_compare(x = x, .ref_group = y, .in_ref_col = FALSE, na.rm = TRUE)
s_compare(x = x, .ref_group = y, .in_ref_col = FALSE, na.rm = FALSE)

# `s_compare.character`

## Basic usage:
x <- c("a", "a", "b", "c", "a")
y <- c("a", "b", "c")
s_compare(x, .ref_group = y, .in_ref_col = FALSE, .var = "x", verbose = FALSE)

## Note that missing values handling can make a large difference:
x <- c("a", "a", "b", "c", "a", NA)
y <- c("a", "b", "c", rep(NA, 20))
s_compare(x,
  .ref_group = y, .in_ref_col = FALSE,
  .var = "x", verbose = FALSE
)
s_compare(x,
  .ref_group = y, .in_ref_col = FALSE, .var = "x",
  na.rm = FALSE, verbose = FALSE
)

# `s_compare.logical`

## Basic usage:
x <- c(TRUE, FALSE, TRUE, TRUE)
y <- c(FALSE, FALSE, TRUE)
s_compare(x, .ref_group = y, .in_ref_col = FALSE)

## Management of NA values.
x <- c(NA, TRUE, FALSE)
y <- c(NA, NA, NA, NA, FALSE)
s_compare(x, .ref_group = y, .in_ref_col = FALSE, na.rm = TRUE)
s_compare(x, .ref_group = y, .in_ref_col = FALSE, na.rm = FALSE)
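The method descriptions above name two tests: a t-test for numeric variables and a chi-squared test for factors. The sketch below shows what such comparisons against a reference group look like in base R with simulated data. The specific options used here (the default Welch variant of `t.test()` and `correct = FALSE` for `chisq.test()`) are assumptions for illustration and may not match the settings used inside `s_compare()`.

```r
set.seed(1)

# Numeric variable: current column versus reference column.
x_num <- rnorm(10, mean = 5)
ref_num <- rnorm(8, mean = 4)
p_num <- stats::t.test(x_num, ref_num)$p.value

# Factor variable: compare level counts between the two columns.
x_fct <- factor(rep(c("a", "b"), times = c(30, 20)), levels = c("a", "b"))
ref_fct <- factor(rep(c("a", "b"), times = c(15, 35)), levels = c("a", "b"))
tab <- rbind(current = table(x_fct), reference = table(ref_fct))
p_fct <- stats::chisq.test(tab, correct = FALSE)$p.value

p_num
p_fct
```

Because the chi-squared test works on counts per level, a level that holds missing values changes the contingency table, which is why the handling of `NA` matters in the factor and character methods.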
Sets a list of parameters for summaries of descriptive statistics. Typically used internally to specify
details for s_summary()
. This function family is mainly used by analyze_vars()
.
control_analyze_vars( conf_level = 0.95, quantiles = c(0.25, 0.75), quantile_type = 2, test_mean = 0 )
conf_level |
( |
quantiles |
( |
quantile_type |
( |
test_mean |
( |
A list of components with the same names as the arguments.
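As a short illustration of the return value described above, the following sketch overrides two of the defaults and inspects the result. Only arguments shown in the usage are used; the variable name `ctrl` is arbitrary.

```r
library(tern)

# Request 90% confidence intervals and the 10th/90th percentiles.
ctrl <- control_analyze_vars(conf_level = 0.9, quantiles = c(0.1, 0.9))

names(ctrl)
ctrl$conf_level
ctrl$quantiles
```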
Auxiliary functions for controlling arguments for formatting the annotation tables that can be added to plots
generated via g_km()
.
control_surv_med_annot(x = 0.8, y = 0.85, w = 0.32, h = 0.16, fill = TRUE)

control_coxph_annot(
  x = 0.29,
  y = 0.51,
  w = 0.4,
  h = 0.125,
  fill = TRUE,
  ref_lbls = FALSE
)
x |
( |
y |
( |
w |
( |
h |
( |
fill |
( |
ref_lbls |
( |
A list of components with the same names as the arguments.
control_surv_med_annot()
: Control function for formatting the median survival time annotation table. This annotation
table can be added in g_km()
by setting annot_surv_med=TRUE
, and can be configured using the
control_surv_med_annot()
function by setting it as the control_annot_surv_med
argument.
control_coxph_annot()
: Control function for formatting the Cox-PH annotation table. This annotation table can be
added in g_km()
by setting annot_coxph=TRUE
, and can be configured using the control_coxph_annot()
function
by setting it as the control_annot_coxph
argument.
control_surv_med_annot()

control_coxph_annot()
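A sketch of customizing the annotation settings follows. The positions chosen are arbitrary, and the commented `g_km()` call only indicates where the lists would be passed according to the description above; its remaining arguments are deliberately left out.

```r
library(tern)

# Move the median survival table and enable reference labels for Cox-PH.
med_annot <- control_surv_med_annot(x = 0.7, y = 0.9)
cox_annot <- control_coxph_annot(ref_lbls = TRUE)

med_annot
cox_annot

# Hypothetical use in a plot call (other g_km() arguments omitted):
# g_km(..., annot_surv_med = TRUE, control_annot_surv_med = med_annot,
#      annot_coxph = TRUE, control_annot_coxph = cox_annot)
```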
This is an auxiliary function for controlling arguments for Cox-PH model, typically used internally to specify
details of Cox-PH model for s_coxph_pairwise()
. conf_level
refers to Hazard Ratio estimation.
control_coxph( pval_method = c("log-rank", "wald", "likelihood"), ties = c("efron", "breslow", "exact"), conf_level = 0.95 )
pval_method |
( |
ties |
( |
conf_level |
( |
A list of components with the same names as the arguments.
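The usage lists several choices for `pval_method` and `ties`. The sketch below selects non-default options explicitly. It also assumes that, when an argument is left at its default, the first listed choice is used, which is the usual `match.arg()` convention in R; treat that as an assumption rather than documented behavior.

```r
library(tern)

# Explicit, non-default choices.
ctrl <- control_coxph(pval_method = "wald", ties = "breslow", conf_level = 0.9)
ctrl

# Defaults: assumed to resolve to the first listed choice of each argument.
ctrl_default <- control_coxph()
ctrl_default$pval_method
```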
Sets a list of parameters for Cox regression fit. Used internally.
control_coxreg( pval_method = c("wald", "likelihood"), ties = c("exact", "efron", "breslow"), conf_level = 0.95, interaction = FALSE )
pval_method |
( |
ties |
( |
conf_level |
( |
interaction |
( |
A list
of items with names corresponding to the arguments.
fit_coxreg_univar()
and fit_coxreg_multivar()
.
control_coxreg()
This is an auxiliary function for controlling arguments for the incidence rate, used
internally to specify details in s_incidence_rate()
.
control_incidence_rate( conf_level = 0.95, conf_type = c("normal", "normal_log", "exact", "byar"), input_time_unit = c("year", "day", "week", "month"), num_pt_year = 100 )
conf_level |
( |
conf_type |
( |
input_time_unit |
( |
num_pt_year |
( |
A list of components with the same names as the arguments.
control_incidence_rate(0.9, "exact", "month", 100)
g_lineplot()
Default values for variables
parameter in g_lineplot
function.
A variable's default value can be overwritten for any variable.
control_lineplot_vars( x = "AVISIT", y = "AVAL", group_var = "ARM", facet_var = NA, paramcd = "PARAMCD", y_unit = "AVALU", subject_var = "USUBJID" )
x |
( |
y |
( |
group_var |
( |
facet_var |
( |
paramcd |
( |
y_unit |
( |
subject_var |
( |
A named character vector of variable names.
control_lineplot_vars()

control_lineplot_vars(group_var = NA)
This is an auxiliary function for controlling arguments for logistic regression models.
conf_level
refers to the confidence level used for the Odds Ratio CIs.
control_logistic(response_definition = "response", conf_level = 0.95)
response_definition |
( |
conf_level |
( |
A list of components with the same names as the arguments.
# Standard options.
control_logistic()

# Modify confidence level.
control_logistic(conf_level = 0.9)

# Use a different response definition.
control_logistic(response_definition = "I(response %in% c('CR', 'PR'))")
Sets a list of parameters to use when generating a risk (proportion) difference column. Used as input to the
riskdiff
parameter of tabulate_rsp_subgroups()
and tabulate_survival_subgroups()
.
control_riskdiff( arm_x = NULL, arm_y = NULL, format = "xx.x (xx.x - xx.x)", col_label = "Risk Difference (%) (95% CI)", pct = TRUE )
arm_x |
( |
arm_y |
( |
format |
( |
col_label |
( |
pct |
( |
A list
of items with names corresponding to the arguments.
add_riskdiff()
, tabulate_rsp_subgroups()
, and tabulate_survival_subgroups()
.
control_riskdiff()

control_riskdiff(arm_x = "ARM A", arm_y = "ARM B")
This is an auxiliary function for controlling arguments for STEP calculations.
control_step( biomarker = NULL, use_percentile = TRUE, bandwidth, degree = 0L, num_points = 39L )
biomarker |
( |
use_percentile |
( |
bandwidth |
( |
degree |
( |
num_points |
( |
A list of components with the same names as the arguments, except biomarker
which is
just used to calculate the bandwidth
in case that actual biomarker windows are requested.
# Provide biomarker values and request actual values to be used,
# so that bandwidth is chosen from range.
control_step(biomarker = 1:10, use_percentile = FALSE)

# Use a global model with quadratic biomarker interaction term.
control_step(bandwidth = NULL, degree = 2)

# Reduce number of points to be used.
control_step(num_points = 10)
survfit
models for survival time
This is an auxiliary function for controlling arguments for survfit
model, typically used internally to specify
details of survfit
model for s_surv_time()
. conf_level
refers to survival time estimation.
control_surv_time( conf_level = 0.95, conf_type = c("plain", "log", "log-log"), quantiles = c(0.25, 0.75) )
conf_level |
( |
conf_type |
( |
quantiles |
( |
A list of components with the same names as the arguments.
survfit
models for patients' survival rate at time points
This is an auxiliary function for controlling arguments for survfit
model, typically used internally to specify
details of survfit
model for s_surv_timepoint()
. conf_level
refers to patient risk estimation at a time point.
control_surv_timepoint( conf_level = 0.95, conf_type = c("plain", "log", "log-log") )
conf_level |
( |
conf_type |
( |
A list of components with the same names as the arguments.
The analyze function count_occurrences()
creates a layout element to calculate occurrence counts for patients.
This function analyzes the variable(s) supplied to vars
and returns a table of occurrence counts for
each unique value (or level) of the variable(s). This variable (or variables) must be
non-numeric. The id
variable is used to indicate unique subject identifiers (defaults to USUBJID
).
If there are multiple occurrences of the same value recorded for a patient, the value is only counted once.
The summarize function summarize_occurrences()
performs the same function as count_occurrences()
except it
creates content rows, not data rows, to summarize the current table row/column context and operates on the level of
the latest row split or the root of the table if no row splits have occurred.
count_occurrences(
  lyt,
  vars,
  id = "USUBJID",
  drop = TRUE,
  var_labels = vars,
  show_labels = "hidden",
  riskdiff = FALSE,
  na_str = default_na_str(),
  nested = TRUE,
  ...,
  table_names = vars,
  .stats = "count_fraction_fixed_dp",
  .formats = NULL,
  .labels = NULL,
  .indent_mods = NULL
)

summarize_occurrences(
  lyt,
  var,
  id = "USUBJID",
  drop = TRUE,
  riskdiff = FALSE,
  na_str = default_na_str(),
  ...,
  .stats = "count_fraction_fixed_dp",
  .formats = NULL,
  .indent_mods = NULL,
  .labels = NULL
)

s_count_occurrences(
  df,
  denom = c("N_col", "n", "N_row"),
  .N_col,
  .N_row,
  .df_row,
  drop = TRUE,
  .var = "MHDECOD",
  id = "USUBJID"
)

a_count_occurrences(
  df,
  labelstr = "",
  id = "USUBJID",
  denom = c("N_col", "n", "N_row"),
  drop = TRUE,
  .N_col,
  .N_row,
  .var = NULL,
  .df_row = NULL,
  .stats = NULL,
  .formats = NULL,
  .labels = NULL,
  .indent_mods = NULL,
  na_str = default_na_str()
)
lyt |
( |
vars |
( |
id |
( |
drop |
( |
var_labels |
( |
show_labels |
( |
riskdiff |
( |
na_str |
( |
nested |
( |
... |
additional arguments for the lower level functions. |
table_names |
( |
.stats |
( Options are: |
.formats |
(named |
.labels |
(named |
.indent_mods |
(named |
df |
( |
denom |
(
|
.N_col |
( |
.N_row |
( |
.df_row |
( |
.var , var
|
( |
labelstr |
( |
count_occurrences()
returns a layout object suitable for passing to further layouting functions,
or to rtables::build_table()
. Adding this function to an rtable
layout will add formatted rows containing
the statistics from s_count_occurrences()
to the table layout.
summarize_occurrences()
returns a layout object suitable for passing to further layouting functions,
or to rtables::build_table()
. Adding this function to an rtable
layout will add formatted content rows
containing the statistics from s_count_occurrences()
to the table layout.
s_count_occurrences()
returns a list with:
count
: list of counts with one element per occurrence.
count_fraction
: list of counts and fractions with one element per occurrence.
fraction
: list of numerators and denominators with one element per occurrence.
a_count_occurrences()
returns the corresponding list with formatted rtables::CellValue()
.
count_occurrences()
: Layout-creating function which can take statistics function arguments
and additional format arguments. This function is a wrapper for rtables::analyze()
.
summarize_occurrences()
: Layout-creating function which can take content function arguments
and additional format arguments. This function is a wrapper for rtables::summarize_row_groups()
.
s_count_occurrences()
: Statistics function which counts number of patients that report an
occurrence.
a_count_occurrences()
: Formatted analysis function which is used as afun
in count_occurrences()
.
By default, occurrences which don't appear in a given row split are dropped from the table and
the occurrences in the table are sorted alphabetically per row split. Therefore, the corresponding layout
needs to use split_fun = drop_split_levels
in the split_rows_by
calls. Use drop = FALSE
if you would
like to show all occurrences.
library(dplyr)

df <- data.frame(
  USUBJID = as.character(c(
    1, 1, 2, 4, 4, 4,
    6, 6, 6, 7, 7, 8
  )),
  MHDECOD = c(
    "MH1", "MH2", "MH1", "MH1", "MH1", "MH3",
    "MH2", "MH2", "MH3", "MH1", "MH2", "MH4"
  ),
  ARM = rep(c("A", "B"), each = 6),
  SEX = c("F", "F", "M", "M", "M", "M", "F", "F", "F", "M", "M", "F")
)
df_adsl <- df %>%
  select(USUBJID, ARM) %>%
  unique()

# Create table layout
lyt <- basic_table() %>%
  split_cols_by("ARM") %>%
  add_colcounts() %>%
  count_occurrences(vars = "MHDECOD", .stats = c("count_fraction"))

# Apply table layout to data and produce `rtable` object
tbl <- lyt %>%
  build_table(df, alt_counts_df = df_adsl) %>%
  prune_table()

tbl

# Layout creating function with custom format.
basic_table() %>%
  add_colcounts() %>%
  split_rows_by("SEX", child_labels = "visible") %>%
  summarize_occurrences(
    var = "MHDECOD",
    .formats = c("count_fraction" = "xx.xx (xx.xx%)")
  ) %>%
  build_table(df, alt_counts_df = df_adsl)

# Count unique occurrences per subject.
s_count_occurrences(
  df,
  .N_col = 4L,
  .N_row = 4L,
  .df_row = df,
  .var = "MHDECOD",
  id = "USUBJID"
)

a_count_occurrences(
  df,
  .N_col = 4L,
  .df_row = df,
  .var = "MHDECOD",
  id = "USUBJID"
)
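The rule that a value is counted only once per patient can be checked by hand. The base R sketch below uses the same subject and term values as the example data above and removes duplicated subject/term pairs before counting. It illustrates the counting rule only and is not the implementation of `s_count_occurrences()`.

```r
df <- data.frame(
  USUBJID = as.character(c(1, 1, 2, 4, 4, 4, 6, 6, 6, 7, 7, 8)),
  MHDECOD = c(
    "MH1", "MH2", "MH1", "MH1", "MH1", "MH3",
    "MH2", "MH2", "MH3", "MH1", "MH2", "MH4"
  )
)

# Keep one row per subject and term, so repeated records are not double counted.
per_patient <- unique(df[, c("USUBJID", "MHDECOD")])

# Number of distinct patients reporting each term.
counts <- table(per_patient$MHDECOD)
counts
```

Subject 4, for example, has two `MH1` records but contributes only once to the `MH1` count.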
The analyze function count_occurrences_by_grade()
creates a layout element to calculate occurrence counts by grade.
This function analyzes primary analysis variable var
which indicates toxicity grades. The id
variable
is used to indicate unique subject identifiers (defaults to USUBJID
). The user can also supply a list of
custom groups of grades to analyze via the grade_groups
parameter. The remove_single
argument will
remove single grades from the analysis so that only grade groups are analyzed.
If there are multiple grades recorded for one patient, only the highest grade level is counted.
The summarize function summarize_occurrences_by_grade()
performs the same function as
count_occurrences_by_grade()
except it creates content rows, not data rows, to summarize the current table
row/column context and operates on the level of the latest row split or the root of the table if no row splits have
occurred.
count_occurrences_by_grade(
  lyt,
  var,
  id = "USUBJID",
  grade_groups = list(),
  remove_single = TRUE,
  only_grade_groups = FALSE,
  var_labels = var,
  show_labels = "default",
  riskdiff = FALSE,
  na_str = default_na_str(),
  nested = TRUE,
  ...,
  table_names = var,
  .stats = "count_fraction",
  .formats = list(count_fraction = format_count_fraction_fixed_dp),
  .indent_mods = NULL,
  .labels = NULL
)

summarize_occurrences_by_grade(
  lyt,
  var,
  id = "USUBJID",
  grade_groups = list(),
  remove_single = TRUE,
  only_grade_groups = FALSE,
  riskdiff = FALSE,
  na_str = default_na_str(),
  ...,
  .stats = "count_fraction",
  .formats = list(count_fraction = format_count_fraction_fixed_dp),
  .indent_mods = NULL,
  .labels = NULL
)

s_count_occurrences_by_grade(
  df,
  .var,
  .N_row,
  .N_col,
  id = "USUBJID",
  grade_groups = list(),
  remove_single = TRUE,
  only_grade_groups = FALSE,
  denom = c("N_col", "n", "N_row"),
  labelstr = ""
)

a_count_occurrences_by_grade(
  df,
  labelstr = "",
  id = "USUBJID",
  grade_groups = list(),
  remove_single = TRUE,
  only_grade_groups = FALSE,
  denom = c("N_col", "n", "N_row"),
  .N_col,
  .N_row,
  .df_row,
  .var = NULL,
  .stats = NULL,
  .formats = NULL,
  .labels = NULL,
  .indent_mods = NULL,
  na_str = default_na_str()
)
lyt |
( |
id |
( |
grade_groups |
(named |
remove_single |
( |
only_grade_groups |
( |
var_labels |
( |
show_labels |
( |
riskdiff |
( |
na_str |
( |
nested |
( |
... |
additional arguments for the lower level functions. |
table_names |
( |
.stats |
( Options are: |
.formats |
(named |
.indent_mods |
(named |
.labels |
(named |
df |
( |
.var , var
|
( |
.N_row |
( |
.N_col |
( |
denom |
(
|
labelstr |
( |
.df_row |
( |
count_occurrences_by_grade()
returns a layout object suitable for passing to further layouting functions,
or to rtables::build_table()
. Adding this function to an rtable
layout will add formatted rows containing
the statistics from s_count_occurrences_by_grade()
to the table layout.
summarize_occurrences_by_grade()
returns a layout object suitable for passing to further layouting functions,
or to rtables::build_table()
. Adding this function to an rtable
layout will add formatted content rows
containing the statistics from s_count_occurrences_by_grade()
to the table layout.
s_count_occurrences_by_grade()
returns a list of counts and fractions with one element per grade level or
grade level grouping.
a_count_occurrences_by_grade()
returns the corresponding list with formatted rtables::CellValue()
.
count_occurrences_by_grade()
: Layout-creating function which can take statistics function
arguments and additional format arguments. This function is a wrapper for rtables::analyze()
.
summarize_occurrences_by_grade()
: Layout-creating function which can take content function arguments
and additional format arguments. This function is a wrapper for rtables::summarize_row_groups()
.
s_count_occurrences_by_grade()
: Statistics function which counts the
number of patients by highest grade.
a_count_occurrences_by_grade()
: Formatted analysis function which is used as afun
in count_occurrences_by_grade()
.
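To make "counts the number of patients by highest grade" concrete, the following base R sketch reproduces the idea on the small data set used in the examples. It is only an illustration under stated assumptions, not the implementation used by s_count_occurrences_by_grade(): the denominator `n_col` and the two grade groupings are chosen here for demonstration.

```r
# Toy adverse event data: patient "1" has two records (grades 1 and 3).
df <- data.frame(
  USUBJID = as.character(c(1:6, 1)),
  AETOXGR = factor(c(1, 2, 3, 4, 1, 2, 3), levels = 1:5)
)

# Worst (highest) grade per patient. The factor levels are "1".."5",
# so the integer level codes coincide with the grades.
worst <- tapply(as.integer(df$AETOXGR), df$USUBJID, max)
worst <- factor(worst, levels = 1:5)

# Each patient is counted once, under their highest grade.
counts <- table(worst)

# Assumed denominator: number of patients in the column (hypothetical value).
n_col <- 6
fractions <- counts / n_col

# Grade groupings sum the per-grade counts of their members.
grade_1_2 <- sum(counts[c("1", "2")])
grade_3_5 <- sum(counts[c("3", "4", "5")])
```

Because patient "1" is counted only under grade 3, the per-grade counts add up to the number of patients rather than the number of records.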
Relevant helper function h_append_grade_groups()
.
library(dplyr)

df <- data.frame(
  USUBJID = as.character(c(1:6, 1)),
  ARM = factor(c("A", "A", "A", "B", "B", "B", "A"), levels = c("A", "B")),
  AETOXGR = factor(c(1, 2, 3, 4, 1, 2, 3), levels = c(1:5)),
  AESEV = factor(
    x = c("MILD", "MODERATE", "SEVERE", "MILD", "MILD", "MODERATE", "SEVERE"),
    levels = c("MILD", "MODERATE", "SEVERE")
  ),
  stringsAsFactors = FALSE
)

df_adsl <- df %>%
  select(USUBJID, ARM) %>%
  unique()

# Layout creating function with custom format.
basic_table() %>%
  split_cols_by("ARM") %>%
  add_colcounts() %>%
  count_occurrences_by_grade(
    var = "AESEV",
    .formats = c("count_fraction" = "xx.xx (xx.xx%)")
  ) %>%
  build_table(df, alt_counts_df = df_adsl)

# Define additional grade groupings.
grade_groups <- list(
  "-Any-" = c("1", "2", "3", "4", "5"),
  "Grade 1-2" = c("1", "2"),
  "Grade 3-5" = c("3", "4", "5")
)

basic_table() %>%
  split_cols_by("ARM") %>%
  add_colcounts() %>%
  count_occurrences_by_grade(
    var = "AETOXGR",
    grade_groups = grade_groups,
    only_grade_groups = TRUE
  ) %>%
  build_table(df, alt_counts_df = df_adsl)

# Layout creating function with custom format.
basic_table() %>%
  add_colcounts() %>%
  split_rows_by("ARM", child_labels = "visible", nested = TRUE) %>%
  summarize_occurrences_by_grade(
    var = "AESEV",
    .formats = c("count_fraction" = "xx.xx (xx.xx%)")
  ) %>%
  build_table(df, alt_counts_df = df_adsl)

basic_table() %>%
  add_colcounts() %>%
  split_rows_by("ARM", child_labels = "visible", nested = TRUE) %>%
  summarize_occurrences_by_grade(
    var = "AETOXGR",
    grade_groups = grade_groups
  ) %>%
  build_table(df, alt_counts_df = df_adsl)

s_count_occurrences_by_grade(
  df,
  .N_col = 10L,
  .var = "AETOXGR",
  id = "USUBJID",
  grade_groups = list("ANY" = levels(df$AETOXGR))
)

a_count_occurrences_by_grade(
  df,
  .N_col = 10L,
  .N_row = 10L,
  .var = "AETOXGR",
  id = "USUBJID",
  grade_groups = list("ANY" = levels(df$AETOXGR))
)
The analyze function count_patients_with_event()
creates a layout element to calculate patient counts for a
user-specified set of events.
This function analyzes the primary analysis variable vars
, which indicates unique subject identifiers. Events
are defined by the user as a named vector via the filters
argument, where each name corresponds to a
variable and each value is the value (or values) that the variable takes for the event.
If there are multiple records with the same event recorded for a patient, only one occurrence is counted.
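The filtering rule described above can be sketched in base R as follows. This is a minimal illustration, not the package's implementation; the data frame `adae` and its values are made up for the example, and each filter is assumed to hold a single value.

```r
# Hypothetical adverse event records; subject "1" has two fatal records.
adae <- data.frame(
  SUBJID  = c("1", "1", "2", "3", "3", "4"),
  TRTEMFL = c("Y", "Y", "Y", "N", "Y", "Y"),
  AEOUT   = c("FATAL", "FATAL", "RECOVERED", "FATAL", "RECOVERED", "FATAL")
)

# Named vector: names are variables, values are the required values.
filters <- c("TRTEMFL" = "Y", "AEOUT" = "FATAL")

# A record qualifies only if it satisfies every filter.
keep <- Reduce(
  `&`,
  lapply(names(filters), function(v) adae[[v]] == filters[[v]])
)

# Count unique subjects, so repeated records for a patient count once.
n_with_event <- length(unique(adae$SUBJID[keep]))
```

Three records satisfy both filters here, but they belong to only two subjects, which is why the count is taken over unique identifiers.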
count_patients_with_event(
  lyt,
  vars,
  filters,
  riskdiff = FALSE,
  na_str = default_na_str(),
  nested = TRUE,
  ...,
  table_names = vars,
  .stats = "count_fraction",
  .formats = list(count_fraction = format_count_fraction_fixed_dp),
  .labels = NULL,
  .indent_mods = NULL
)

s_count_patients_with_event(
  df,
  .var,
  filters,
  .N_col,
  .N_row,
  denom = c("n", "N_col", "N_row")
)

a_count_patients_with_event(
  df,
  labelstr = "",
  filters,
  denom = c("n", "N_col", "N_row"),
  .N_col,
  .N_row,
  .df_row,
  .var = NULL,
  .stats = NULL,
  .formats = NULL,
  .labels = NULL,
  .indent_mods = NULL,
  na_str = default_na_str()
)
lyt |
( |
vars |
( |
filters |
( |
riskdiff |
( |
na_str |
( |
nested |
( |
... |
additional arguments for the lower level functions. |
table_names |
( |
.stats |
( Options are: |
.formats |
(named |
.labels |
(named |
.indent_mods |
(named |
df |
( |
.var |
( |
.N_col |
( |
.N_row |
( |
denom |
(
|
labelstr |
( |
.df_row |
( |
count_patients_with_event()
returns a layout object suitable for passing to further layouting functions,
or to rtables::build_table()
. Adding this function to an rtable
layout will add formatted rows containing
the statistics from s_count_patients_with_event()
to the table layout.
s_count_patients_with_event()
returns the count and fraction of unique identifiers with the defined event.
a_count_patients_with_event()
returns the corresponding list with formatted rtables::CellValue()
.
count_patients_with_event()
: Layout-creating function which can take statistics function
arguments and additional format arguments. This function is a wrapper for rtables::analyze()
.
s_count_patients_with_event()
: Statistics function which counts the number of patients for which
the defined event has occurred.
a_count_patients_with_event()
: Formatted analysis function which is used as afun
in count_patients_with_event()
.
lyt <- basic_table() %>%
  split_cols_by("ARM") %>%
  add_colcounts() %>%
  count_values(
    "STUDYID",
    values = "AB12345",
    .stats = "count",
    .labels = c(count = "Total AEs")
  ) %>%
  count_patients_with_event(
    "SUBJID",
    filters = c("TRTEMFL" = "Y"),
    .labels = c(count_fraction = "Total number of patients with at least one adverse event"),
    table_names = "tbl_all"
  ) %>%
  count_patients_with_event(
    "SUBJID",
    filters = c("TRTEMFL" = "Y", "AEOUT" = "FATAL"),
    .labels = c(count_fraction = "Total number of patients with fatal AEs"),
    table_names = "tbl_fatal"
  ) %>%
  count_patients_with_event(
    "SUBJID",
    filters = c("TRTEMFL" = "Y", "AEOUT" = "FATAL", "AEREL" = "Y"),
    .labels = c(count_fraction = "Total number of patients with related fatal AEs"),
    .indent_mods = c(count_fraction = 2L),
    table_names = "tbl_rel_fatal"
  )

build_table(lyt, tern_ex_adae, alt_counts_df = tern_ex_adsl)

s_count_patients_with_event(
  tern_ex_adae,
  .var = "SUBJID",
  filters = c("TRTEMFL" = "Y")
)

s_count_patients_with_event(
  tern_ex_adae,
  .var = "SUBJID",
  filters = c("TRTEMFL" = "Y", "AEOUT" = "FATAL")
)

s_count_patients_with_event(
  tern_ex_adae,
  .var = "SUBJID",
  filters = c("TRTEMFL" = "Y", "AEOUT" = "FATAL"),
  denom = "N_col",
  .N_col = 456
)

a_count_patients_with_event(
  tern_ex_adae,
  .var = "SUBJID",
  filters = c("TRTEMFL" = "Y"),
  .N_col = 100,
  .N_row = 100
)
The analyze function count_patients_with_flags()
creates a layout element to calculate counts of patients for
which user-specified flags are present.
This function analyzes the primary analysis variable var
, which indicates unique subject identifiers. The flag
variables to analyze are specified by the user via the flag_variables
argument, and must take either the value
TRUE
(flag present) or FALSE
(flag absent) for each record.
If there are multiple records with the same flag present for a patient, only one occurrence is counted.
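As a rough base R sketch of this counting rule (an illustration only, with made-up data and flag names, and not the code used inside s_count_patients_with_flags()), each flag is counted over unique subjects:

```r
# Hypothetical records with two logical flag columns.
adae <- data.frame(
  SUBJID = c("1", "1", "2", "3"),
  fl1    = c(TRUE, TRUE, FALSE, TRUE),
  fl2    = c(FALSE, FALSE, FALSE, TRUE)
)

flag_variables <- c("fl1", "fl2")

# For each flag, count the unique subjects with at least one TRUE record.
flag_counts <- vapply(
  flag_variables,
  function(f) length(unique(adae$SUBJID[adae[[f]]])),
  integer(1)
)
```

Subject "1" has two records with fl1 present but contributes only once to the fl1 count.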
count_patients_with_flags(
  lyt,
  var,
  flag_variables,
  flag_labels = NULL,
  var_labels = var,
  show_labels = "hidden",
  riskdiff = FALSE,
  na_str = default_na_str(),
  nested = TRUE,
  ...,
  table_names = paste0("tbl_flags_", var),
  .stats = "count_fraction",
  .formats = list(count_fraction = format_count_fraction_fixed_dp),
  .indent_mods = NULL,
  .labels = NULL
)

s_count_patients_with_flags(
  df,
  .var,
  flag_variables,
  flag_labels = NULL,
  .N_col,
  .N_row,
  denom = c("n", "N_col", "N_row")
)

a_count_patients_with_flags(
  df,
  labelstr = "",
  flag_variables,
  flag_labels = NULL,
  denom = c("n", "N_col", "N_row"),
  .N_col,
  .N_row,
  .df_row,
  .var = NULL,
  .stats = NULL,
  .formats = NULL,
  .labels = NULL,
  .indent_mods = NULL,
  na_str = default_na_str()
)
lyt |
( |
var |
( |
flag_variables |
( |
flag_labels |
( |
var_labels |
( |
show_labels |
( |
riskdiff |
( |
na_str |
( |
nested |
( |
... |
additional arguments for the lower level functions. |
table_names |
( |
.stats |
( Options are: |
.formats |
(named |
.indent_mods |
(named |
.labels |
(named |
df |
( |
.var |
( |
.N_col |
( |
.N_row |
( |
denom |
(
|
labelstr |
( |
.df_row |
( |
count_patients_with_flags()
returns a layout object suitable for passing to further layouting functions,
or to rtables::build_table()
. Adding this function to an rtable
layout will add formatted rows containing
the statistics from s_count_patients_with_flags()
to the table layout.
s_count_patients_with_flags()
returns the count and the fraction of unique identifiers with each particular
flag as a list of statistics n
, count
, count_fraction
, and n_blq
, with one element per flag.
a_count_patients_with_flags()
returns the corresponding list with formatted rtables::CellValue()
.
count_patients_with_flags()
: Layout-creating function which can take statistics function
arguments and additional format arguments. This function is a wrapper for rtables::analyze()
.
s_count_patients_with_flags()
: Statistics function which counts the number of patients for which
a particular flag variable is TRUE
.
a_count_patients_with_flags()
: Formatted analysis function which is used as afun
in count_patients_with_flags()
.
If flag_labels
is not specified, variable labels will be extracted from df
. If variables are not
labeled, variable names will be used instead. Alternatively, a named vector
can be supplied to
flag_variables
such that within each name-value pair the name corresponds to the variable name and the value is
the label to use for this variable.
# Add labelled flag variables to analysis dataset.
adae <- tern_ex_adae %>%
  dplyr::mutate(
    fl1 = TRUE %>% with_label("Total AEs"),
    fl2 = (TRTEMFL == "Y") %>%
      with_label("Total number of patients with at least one adverse event"),
    fl3 = (TRTEMFL == "Y" & AEOUT == "FATAL") %>%
      with_label("Total number of patients with fatal AEs"),
    fl4 = (TRTEMFL == "Y" & AEOUT == "FATAL" & AEREL == "Y") %>%
      with_label("Total number of patients with related fatal AEs")
  )

lyt <- basic_table() %>%
  split_cols_by("ARM") %>%
  add_colcounts() %>%
  count_patients_with_flags(
    "SUBJID",
    flag_variables = c("fl1", "fl2", "fl3", "fl4"),
    denom = "N_col"
  )

build_table(lyt, adae, alt_counts_df = tern_ex_adsl)

# `s_count_patients_with_flags()`
s_count_patients_with_flags(
  adae,
  "SUBJID",
  flag_variables = c("fl1", "fl2", "fl3", "fl4"),
  denom = "N_col",
  .N_col = 1000
)

a_count_patients_with_flags(
  adae,
  .N_col = 10L,
  .N_row = 10L,
  .var = "USUBJID",
  flag_variables = c("fl1", "fl2", "fl3", "fl4")
)
The analyze function count_values()
creates a layout element to calculate counts of specific values within a
variable of interest.
This function analyzes one or more variables of interest supplied as a vector to vars
. Values to
count for variable(s) in vars
can be given as a vector via the values
argument. One row of
counts will be generated for each variable.
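The counting idea can be sketched in base R as shown here. The helper `count_in_values()` is a hypothetical name introduced for illustration, and its treatment of missing values (dropped before counting only when `na.rm = TRUE`, with the number of remaining elements used as denominator) is an assumption about the behavior rather than a copy of s_count_values().

```r
# Hypothetical helper: count how many elements of x are among `values`.
count_in_values <- function(x, values, na.rm = TRUE) {
  # Optionally drop missing values before counting.
  if (na.rm) x <- x[!is.na(x)]
  is_in <- x %in% values
  list(
    n = length(is_in),                     # number of elements considered
    count = sum(is_in),                    # number of matches
    fraction = sum(is_in) / length(is_in)  # matches relative to n
  )
}

x <- c("a", "b", "a", NA, NA)
res_keep_na <- count_in_values(x, values = "b", na.rm = FALSE)
res_drop_na <- count_in_values(x, values = "b", na.rm = TRUE)
```

Keeping the missing values enlarges the denominator (5 instead of 3) while the count of matches stays at 1, so the choice of `na.rm` changes the reported fraction.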
count_values(
  lyt,
  vars,
  values,
  na_str = default_na_str(),
  nested = TRUE,
  ...,
  table_names = vars,
  .stats = "count_fraction",
  .formats = NULL,
  .labels = c(count_fraction = paste(values, collapse = ", ")),
  .indent_mods = NULL
)

s_count_values(
  x,
  values,
  na.rm = TRUE,
  .N_col,
  .N_row,
  denom = c("n", "N_col", "N_row")
)

## S3 method for class 'character'
s_count_values(x, values = "Y", na.rm = TRUE, ...)

## S3 method for class 'factor'
s_count_values(x, values = "Y", ...)

## S3 method for class 'logical'
s_count_values(x, values = TRUE, ...)

a_count_values(
  x,
  values,
  na.rm = TRUE,
  .N_col,
  .N_row,
  denom = c("n", "N_col", "N_row")
)
lyt |
( |
vars |
( |
values |
( |
na_str |
( |
nested |
( |
... |
additional arguments for the lower level functions. |
table_names |
( |
.stats |
( Options are: |
.formats |
(named |
.labels |
(named |
.indent_mods |
(named |
x |
( |
na.rm |
( |
.N_col |
( |
.N_row |
( |
denom |
(
|
count_values()
returns a layout object suitable for passing to further layouting functions,
or to rtables::build_table()
. Adding this function to an rtable
layout will add formatted rows containing
the statistics from s_count_values()
to the table layout.
s_count_values()
returns output of s_summary()
for specified values of a non-numeric variable.
a_count_values()
returns the corresponding list with formatted rtables::CellValue()
.
count_values()
: Layout-creating function which can take statistics function arguments
and additional format arguments. This function is a wrapper for rtables::analyze()
.
s_count_values()
: S3 generic function to count values.
s_count_values(character)
: Method for character
class.
s_count_values(factor)
: Method for factor
class. This makes an automatic
conversion to character
and then forwards to the method for characters.
s_count_values(logical)
: Method for logical
class.
a_count_values()
: Formatted analysis function which is used as afun
in count_values()
.
For factor
variables, s_count_values
checks whether values
are all included in the levels of x
and fails otherwise.
For count_values()
, variable labels are shown when there is more than one element in vars
,
otherwise they are hidden.
# `count_values`
basic_table() %>%
  count_values("Species", values = "setosa") %>%
  build_table(iris)

# `s_count_values.character`
s_count_values(x = c("a", "b", "a"), values = "a")
s_count_values(x = c("a", "b", "a", NA, NA), values = "b", na.rm = FALSE)

# `s_count_values.factor`
s_count_values(x = factor(c("a", "b", "a")), values = "a")

# `s_count_values.logical`
s_count_values(x = c(TRUE, FALSE, TRUE))

# `a_count_values`
a_count_values(x = factor(c("a", "b", "a")), values = "a", .N_col = 10, .N_row = 10)
Fits a Cox regression model and estimates hazard ratio to describe the effect size in a survival analysis.
summarize_coxreg(
  lyt,
  variables,
  control = control_coxreg(),
  at = list(),
  multivar = FALSE,
  common_var = "STUDYID",
  .stats = c("n", "hr", "ci", "pval", "pval_inter"),
  .formats = c(n = "xx", hr = "xx.xx", ci = "(xx.xx, xx.xx)", pval = "x.xxxx | (<0.0001)", pval_inter = "x.xxxx | (<0.0001)"),
  varlabels = NULL,
  .indent_mods = NULL,
  na_str = "",
  .section_div = NA_character_
)

s_coxreg(model_df, .stats, .which_vars = "all", .var_nms = NULL)

a_coxreg(
  df,
  labelstr,
  eff = FALSE,
  var_main = FALSE,
  multivar = FALSE,
  variables,
  at = list(),
  control = control_coxreg(),
  .spl_context,
  .stats,
  .formats,
  .indent_mods = NULL,
  na_str = "",
  cache_env = NULL
)
lyt |
( |
variables |
(named |
control |
( |
at |
( |
multivar |
( |
common_var |
( |
.stats |
(
|
.formats |
(named |
varlabels |
( |
.indent_mods |
(named |
na_str |
( |
.section_div |
( |
model_df |
( |
.which_vars |
( |
.var_nms |
( |
df |
( |
labelstr |
( |
eff |
( |
var_main |
( |
.spl_context |
( |
cache_env |
( |
The Cox model is the most commonly used method to estimate the magnitude of the effect in survival analysis. It assumes proportional hazards: the ratio of the hazards between groups (e.g., two arms) is constant over time. This ratio is referred to as the "hazard ratio" (HR) and is one of the most commonly reported metrics to describe the effect size in survival analysis (NEST Team, 2020).
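For orientation, the hazard ratio can be obtained directly from a fitted Cox model with the survival package. The sketch below is not what summarize_coxreg() runs internally; it fits a plain survival::coxph() model on the same survival::bladder subset used in the examples, with the treatment variable rx as the only term.

```r
library(survival)

# Same subset of the bladder data as in the examples (first four recurrences).
bl <- bladder[bladder$enum < 5, ]

# Cox proportional hazards model with treatment as the only covariate.
fit <- coxph(Surv(stop, event) ~ factor(rx), data = bl)

# The hazard ratio is the exponentiated coefficient (rx = 2 versus rx = 1).
hr <- exp(coef(fit))

# Confidence limits for the hazard ratio (95% by default).
ci <- exp(confint(fit))
```

A hazard ratio below 1 would indicate a lower hazard in the second treatment group than in the reference group, under the proportional hazards assumption.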
summarize_coxreg()
returns a layout object suitable for passing to further layouting functions,
or to rtables::build_table()
. Adding this function to an rtable
layout will add a Cox regression table
containing the chosen statistics to the table layout.
s_coxreg()
returns the selected statistic from the Cox regression model for the selected variable(s).
a_coxreg()
returns formatted rtables::CellValue()
.
summarize_coxreg()
: Layout-creating function which creates a Cox regression summary table
layout. This function is a wrapper for the rtables
layouting functions rtables::analyze_colvars()
and rtables::summarize_row_groups()
.
s_coxreg()
: Statistics function that transforms results tabulated
from fit_coxreg_univar()
or fit_coxreg_multivar()
into a list.
a_coxreg()
: Analysis function which is used as afun
in rtables::analyze()
and cfun
in rtables::summarize_row_groups()
within summarize_coxreg()
.
fit_coxreg for relevant fitting functions, h_cox_regression for relevant helper functions, and tidy_coxreg for custom tidy methods.
fit_coxreg_univar()
and fit_coxreg_multivar()
which also take the variables
, data
,
at
(univariate only), and control
arguments but return unformatted univariate and multivariate
Cox regression models, respectively.
library(survival)

# Testing dataset [survival::bladder].
set.seed(1, kind = "Mersenne-Twister")
dta_bladder <- with(
  data = bladder[bladder$enum < 5, ],
  tibble::tibble(
    TIME = stop,
    STATUS = event,
    ARM = as.factor(rx),
    COVAR1 = as.factor(enum) %>% formatters::with_label("A Covariate Label"),
    COVAR2 = factor(
      sample(as.factor(enum)),
      levels = 1:4, labels = c("F", "F", "M", "M")
    ) %>% formatters::with_label("Sex (F/M)")
  )
)
dta_bladder$AGE <- sample(20:60, size = nrow(dta_bladder), replace = TRUE)
dta_bladder$STUDYID <- factor("X")

u1_variables <- list(
  time = "TIME", event = "STATUS", arm = "ARM", covariates = c("COVAR1", "COVAR2")
)

u2_variables <- list(time = "TIME", event = "STATUS", covariates = c("COVAR1", "COVAR2"))

m1_variables <- list(
  time = "TIME", event = "STATUS", arm = "ARM", covariates = c("COVAR1", "COVAR2")
)

m2_variables <- list(time = "TIME", event = "STATUS", covariates = c("COVAR1", "COVAR2"))

# summarize_coxreg

result_univar <- basic_table() %>%
  summarize_coxreg(variables = u1_variables) %>%
  build_table(dta_bladder)
result_univar

result_univar_covs <- basic_table() %>%
  summarize_coxreg(
    variables = u2_variables,
  ) %>%
  build_table(dta_bladder)
result_univar_covs

result_multivar <- basic_table() %>%
  summarize_coxreg(
    variables = m1_variables,
    multivar = TRUE,
  ) %>%
  build_table(dta_bladder)
result_multivar

result_multivar_covs <- basic_table() %>%
  summarize_coxreg(
    variables = m2_variables,
    multivar = TRUE,
    varlabels = c("Covariate 1", "Covariate 2") # custom labels
  ) %>%
  build_table(dta_bladder)
result_multivar_covs

# s_coxreg

# Univariate
univar_model <- fit_coxreg_univar(variables = u1_variables, data = dta_bladder)
df1 <- broom::tidy(univar_model)

s_coxreg(model_df = df1, .stats = "hr")

# Univariate with interactions
univar_model_inter <- fit_coxreg_univar(
  variables = u1_variables, control = control_coxreg(interaction = TRUE), data = dta_bladder
)
df1_inter <- broom::tidy(univar_model_inter)

s_coxreg(model_df = df1_inter, .stats = "hr", .which_vars = "inter", .var_nms = "COVAR1")

# Univariate without treatment arm - only "COVAR2" covariate effects
univar_covs_model <- fit_coxreg_univar(variables = u2_variables, data = dta_bladder)
df1_covs <- broom::tidy(univar_covs_model)

s_coxreg(model_df = df1_covs, .stats = "hr", .var_nms = c("COVAR2", "Sex (F/M)"))

# Multivariate.
multivar_model <- fit_coxreg_multivar(variables = m1_variables, data = dta_bladder)
df2 <- broom::tidy(multivar_model)

s_coxreg(model_df = df2, .stats = "pval", .which_vars = "var_main", .var_nms = "COVAR1")
s_coxreg(
  model_df = df2, .stats = "pval", .which_vars = "multi_lvl",
  .var_nms = c("COVAR1", "A Covariate Label")
)

# Multivariate without treatment arm - only "COVAR1" main effect
multivar_covs_model <- fit_coxreg_multivar(variables = m2_variables, data = dta_bladder)
df2_covs <- broom::tidy(multivar_covs_model)

s_coxreg(model_df = df2_covs, .stats = "hr")

a_coxreg(
  df = dta_bladder,
  labelstr = "Label 1",
  variables = u1_variables,
  .spl_context = list(value = "COVAR1"),
  .stats = "n",
  .formats = "xx"
)

a_coxreg(
  df = dta_bladder,
  labelstr = "",
  variables = u1_variables,
  .spl_context = list(value = "COVAR2"),
  .stats = "pval",
  .formats = "xx.xxxx"
)
Test and estimate the effect of a treatment in interaction with a covariate. The effect is estimated as the HR of the tested treatment for a given level of the covariate, in comparison to the treatment control.
h_coxreg_inter_effect(x, effect, covar, mod, label, control, ...)

## S3 method for class 'numeric'
h_coxreg_inter_effect(x, effect, covar, mod, label, control, at, ...)

## S3 method for class 'factor'
h_coxreg_inter_effect(x, effect, covar, mod, label, control, data, ...)

## S3 method for class 'character'
h_coxreg_inter_effect(x, effect, covar, mod, label, control, data, ...)

h_coxreg_extract_interaction(effect, covar, mod, data, at, control)

h_coxreg_inter_estimations(
  variable,
  given,
  lvl_var,
  lvl_given,
  mod,
  conf_level = 0.95
)
x |
( |
effect |
( |
covar |
( |
mod |
( |
label |
( |
control |
( |
... |
see methods. |
at |
( |
data |
( |
variable , given
|
( |
lvl_var , lvl_given
|
( |
conf_level |
( |
Consider a Cox regression investigating the effect of Arm (A, B, C; reference A) and Sex (F, M; reference F), with the model abbreviated as y ~ Arm + Sex + Arm:Sex. The Cox regression estimates the coefficients, along with a variance-covariance matrix, for:
b1 (arm b), b2 (arm c)
b3 (sex m)
b4 (arm b: sex m), b5 (arm c: sex m)
The hazard ratio for arm C/sex M is estimated relative to arm A/sex M as exp(b2 + b3 + b5) / exp(b3) = exp(b2 + b5). The coefficient for this comparison is therefore b2 + b5, and its standard error is obtained as $\sqrt{\mathrm{Var}(b_2) + \mathrm{Var}(b_5) + 2\,\mathrm{Cov}(b_2, b_5)}$.
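The calculation above can be reproduced by hand from a fitted model. The following is a minimal sketch, not the package's implementation. It assumes the `survival::bladder` data used in the examples of this section, where `armcd` (two levels) plays the role of Arm and `covar1` (four levels) plays the role of Sex. The coefficient names `armcd2` and `armcd2:covar12` follow from R's default treatment contrasts.

```r
library(survival)

# Same subset of survival::bladder as in the examples of this section.
dta <- with(
  bladder[bladder$enum < 5, ],
  data.frame(time = stop, status = event, armcd = factor(rx), covar1 = factor(enum))
)
mod <- coxph(Surv(time, status) ~ armcd * covar1, data = dta)

b <- coef(mod) # estimated coefficients
v <- vcov(mod) # variance-covariance matrix of the coefficients

main <- "armcd2"          # main effect of arm 2 vs. arm 1
inter <- "armcd2:covar12" # interaction of arm 2 with covar1 level 2

# Log hazard ratio of arm 2 vs. arm 1, given covar1 = 2, and its standard error.
coef_hat <- unname(b[main] + b[inter])
coef_se <- sqrt(v[main, main] + v[inter, inter] + 2 * v[main, inter])

# Hazard ratio with a 95% Wald confidence interval.
q <- qnorm(0.975)
est <- c(
  hr = exp(coef_hat),
  lcl = exp(coef_hat - q * coef_se),
  ucl = exp(coef_hat + q * coef_se)
)
est
```

The result should correspond to one row of the matrices returned by `h_coxreg_inter_estimations()` for the same model, which makes it a convenient cross-check.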
h_coxreg_inter_effect()
returns a data.frame
of covariate interaction effects consisting of the following
variables: effect
, term
, term_label
, level
, n
, hr
, lcl
, ucl
, pval
, and pval_inter
.
h_coxreg_extract_interaction()
returns the result of an interaction test and the estimated values. If
no interaction, h_coxreg_univar_extract()
is applied instead.
h_coxreg_inter_estimations()
returns a list of matrices (one per level of variable) with rows corresponding
to the combinations of variable
and given
, with columns:
coef_hat
: Estimation of the coefficient.
coef_se
: Standard error of the estimation.
hr
: Hazard ratio.
lcl, ucl
: Lower/upper confidence limit of the hazard ratio.
h_coxreg_inter_effect()
: S3 generic helper function to determine interaction effect.
h_coxreg_inter_effect(numeric)
: Method for numeric
class. Estimates the interaction with a numeric
covariate.
h_coxreg_inter_effect(factor)
: Method for factor
class. Estimate the interaction with a factor
covariate.
h_coxreg_inter_effect(character)
: Method for character
class. Estimate the interaction with a character
covariate.
This makes an automatic conversion to factor
and then forwards to the method for factors.
h_coxreg_extract_interaction()
: A higher level function to get
the results of the interaction test and the estimated values.
h_coxreg_inter_estimations()
: Hazard ratio estimation in interactions.
Automatic conversion of character to factor does not guarantee results can be generated correctly. It is
therefore better to always pre-process the dataset such that factors are manually created from character
variables before passing the dataset to rtables::build_table()
.
library(survival) set.seed(1, kind = "Mersenne-Twister") # Testing dataset [survival::bladder]. dta_bladder <- with( data = bladder[bladder$enum < 5, ], data.frame( time = stop, status = event, armcd = as.factor(rx), covar1 = as.factor(enum), covar2 = factor( sample(as.factor(enum)), levels = 1:4, labels = c("F", "F", "M", "M") ) ) ) labels <- c("armcd" = "ARM", "covar1" = "A Covariate Label", "covar2" = "Sex (F/M)") formatters::var_labels(dta_bladder)[names(labels)] <- labels dta_bladder$age <- sample(20:60, size = nrow(dta_bladder), replace = TRUE) plot( survfit(Surv(time, status) ~ armcd + covar1, data = dta_bladder), lty = 2:4, xlab = "Months", col = c("blue1", "blue2", "blue3", "blue4", "red1", "red2", "red3", "red4") ) mod <- coxph(Surv(time, status) ~ armcd * covar1, data = dta_bladder) h_coxreg_extract_interaction( mod = mod, effect = "armcd", covar = "covar1", data = dta_bladder, control = control_coxreg() ) mod <- coxph(Surv(time, status) ~ armcd * covar1, data = dta_bladder) result <- h_coxreg_inter_estimations( variable = "armcd", given = "covar1", lvl_var = levels(dta_bladder$armcd), lvl_given = levels(dta_bladder$covar1), mod = mod, conf_level = .95 ) result
This cuts a numeric vector into sample quantile bins.
cut_quantile_bins( x, probs = c(0.25, 0.5, 0.75), labels = NULL, type = 7, ordered = TRUE )
x |
( |
probs |
( |
labels |
( |
type |
( |
ordered |
( |
A factor
variable with appropriately-labeled bins as levels.
Intervals are closed on the right side. That is, the first bin is the interval
[-Inf, q1]
where q1
is the first quantile, the second bin is then (q1, q2]
, etc.,
and the last bin is (qn, +Inf]
where qn
is the last quantile.
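The binning rule can be illustrated with base R alone. This is a rough sketch of the idea using `quantile()` and `cut()`, not the code behind `cut_quantile_bins()`. In particular, the automatic bin labels produced by `cut()` are an assumption and will differ from the package's labels.

```r
x <- cars$speed

# Sample quantiles that serve as inner bin boundaries.
q <- quantile(x, probs = c(0.25, 0.5, 0.75), type = 7)

# Right-closed intervals: (-Inf, q1], (q1, q2], (q2, q3], (q3, Inf].
bins <- cut(
  x,
  breaks = c(-Inf, q, Inf),
  right = TRUE,
  ordered_result = TRUE
)

table(bins)
```

Because the intervals are closed on the right, an observation exactly equal to a quantile falls into the lower of the two adjacent bins.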
# Default is to cut into quartile bins.
cut_quantile_bins(cars$speed)
# Use custom quantiles.
cut_quantile_bins(cars$speed, probs = c(0.1, 0.2, 0.6, 0.88))
# Use custom labels.
cut_quantile_bins(cars$speed, labels = paste0("Q", 1:4))
# NAs are preserved in result factor.
ozone_binned <- cut_quantile_bins(airquality$Ozone)
which(is.na(ozone_binned))
# So you might want to make these explicit.
explicit_na(ozone_binned)
s_count_abnormal_by_baseline()
Description function that produces the labels for s_count_abnormal_by_baseline()
.
d_count_abnormal_by_baseline(abnormal)
abnormal |
( |
Abnormal category labels for s_count_abnormal_by_baseline()
.
d_count_abnormal_by_baseline("LOW")
This is a helper function that describes the analysis in s_count_cumulative()
.
d_count_cumulative(threshold, lower_tail = TRUE, include_eq = TRUE)
threshold |
( |
lower_tail |
( |
include_eq |
( |
Labels for s_count_cumulative()
.
s_count_missed_doses()
d_count_missed_doses(thresholds)
thresholds |
( |
d_count_missed_doses()
returns a named character
vector with the labels.
Describe the oncology response in a standard way.
d_onco_rsp_label(x)
x |
( |
Response labels.
d_onco_rsp_label(
  c("CR", "PR", "SD", "NON CR/PD", "PD", "NE", "Missing", "<Missing>", "NE/Missing")
)
# Adding some values not considered in d_onco_rsp_label
d_onco_rsp_label(
  c("CR", "PR", "hello", "hi")
)
d_pkparam()
A data.frame
of PK parameters.
pk_reference_dataset <- d_pkparam()
This is a helper function that describes the analysis in s_proportion()
.
d_proportion(conf_level, method, long = FALSE)
conf_level |
( |
method |
( |
long |
( |
String describing the analysis.
This is an auxiliary function that describes the analysis in
s_proportion_diff()
.
d_proportion_diff(conf_level, method, long = FALSE)
conf_level |
( |
method |
( |
long |
( |
A string
describing the analysis.
Internal function to check variables included in tabulate_rsp_subgroups()
and create column labels.
d_rsp_subgroups_colvars(vars, conf_level = NULL, method = NULL)
vars |
( |
conf_level |
( |
method |
( |
A list
of variables to tabulate and their labels.
Internal function to check variables included in tabulate_survival_subgroups()
and create column labels.
d_survival_subgroups_colvars(vars, conf_level, method, time_unit = NULL)
vars |
(
|
conf_level |
( |
method |
( |
time_unit |
( |
A list
of variables and their labels to tabulate.
At least one of n_tot
and n_tot_events
must be provided in vars
.
This is an auxiliary function that describes the analysis in s_test_proportion_diff
.
d_test_proportion_diff(method)
method |
( |
A string
describing the test from which the p-value is derived.
Conversion of days to months
day2month(x)
x |
( |
A numeric
vector with the time in months.
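As an illustration of the conversion, the sketch below divides by the average month length of 365.25 / 12 = 30.4375 days. Both the helper name `days_to_months()` and the conversion factor are assumptions made for this example; check the package source if the exact factor matters for your analysis.

```r
# Hypothetical stand-in for illustration; not the package function.
# Assumes an average month length of 365.25 / 12 = 30.4375 days.
days_to_months <- function(x, days_per_month = 365.25 / 12) {
  stopifnot(is.numeric(x))
  x / days_per_month
}

days_to_months(c(403, 248, 30, 86))
```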
x <- c(403, 248, 30, 86)
day2month(x)
This function is useful to label grid grobs (also ggplot2
, and lattice
plots)
with title, footnote, and page numbers.
decorate_grob( grob, titles, footnotes, page = "", width_titles = grid::unit(1, "npc"), width_footnotes = grid::unit(1, "npc"), border = TRUE, padding = grid::unit(rep(1, 4), "lines"), margins = grid::unit(c(1, 0, 1, 0), "lines"), outer_margins = grid::unit(c(2, 1.5, 3, 1.5), "cm"), gp_titles = grid::gpar(), gp_footnotes = grid::gpar(fontsize = 8), name = NULL, gp = grid::gpar(), vp = NULL )
grob |
( |
titles |
( |
footnotes |
( |
page |
( |
width_titles |
( |
width_footnotes |
( |
border |
( |
padding |
( |
margins |
( |
outer_margins |
( |
gp_titles |
( |
gp_footnotes |
( |
name |
a character identifier for the grob. Used to find the grob on the display list and/or as a child of another grob. |
gp |
A |
vp |
a |
The titles and footnotes will be ragged, i.e. each title will be wrapped individually.
A grid grob (gTree
).
library(grid) titles <- c( "Edgar Anderson's Iris Data", paste( "This famous (Fisher's or Anderson's) iris data set gives the measurements", "in centimeters of the variables sepal length and width and petal length", "and width, respectively, for 50 flowers from each of 3 species of iris." ) ) footnotes <- c( "The species are Iris setosa, versicolor, and virginica.", paste( "iris is a data frame with 150 cases (rows) and 5 variables (columns) named", "Sepal.Length, Sepal.Width, Petal.Length, Petal.Width, and Species." ) ) ## empty plot grid.newpage() grid.draw( decorate_grob( NULL, titles = titles, footnotes = footnotes, page = "Page 4 of 10" ) ) # grid p <- gTree( children = gList( rectGrob(), xaxisGrob(), yaxisGrob(), textGrob("Sepal.Length", y = unit(-4, "lines")), textGrob("Petal.Length", x = unit(-3.5, "lines"), rot = 90), pointsGrob(iris$Sepal.Length, iris$Petal.Length, gp = gpar(col = iris$Species), pch = 16) ), vp = vpStack(plotViewport(), dataViewport(xData = iris$Sepal.Length, yData = iris$Petal.Length)) ) grid.newpage() grid.draw(p) grid.newpage() grid.draw( decorate_grob( grob = p, titles = titles, footnotes = footnotes, page = "Page 6 of 129" ) ) ## with ggplot2 library(ggplot2) p_gg <- ggplot2::ggplot(iris, aes(Sepal.Length, Sepal.Width, col = Species)) + ggplot2::geom_point() p_gg p <- ggplotGrob(p_gg) grid.newpage() grid.draw( decorate_grob( grob = p, titles = titles, footnotes = footnotes, page = "Page 6 of 129" ) ) ## with lattice library(lattice) xyplot(Sepal.Length ~ Petal.Length, data = iris, col = iris$Species) p <- grid.grab() grid.newpage() grid.draw( decorate_grob( grob = p, titles = titles, footnotes = footnotes, page = "Page 6 of 129" ) ) # with gridExtra - no borders library(gridExtra) grid.newpage() grid.draw( decorate_grob( tableGrob( head(mtcars) ), titles = "title", footnotes = "footnote", border = FALSE ) )
grob
s and add page numbering
Note that this uses the decorate_grob_factory()
function.
decorate_grob_set(grobs, ...)
grobs |
( |
... |
arguments passed on to |
A decorated grob.
library(ggplot2) library(grid) g <- with(data = iris, { list( ggplot2::ggplotGrob( ggplot2::ggplot(mapping = aes(Sepal.Length, Sepal.Width, col = Species)) + ggplot2::geom_point() ), ggplot2::ggplotGrob( ggplot2::ggplot(mapping = aes(Sepal.Length, Petal.Length, col = Species)) + ggplot2::geom_point() ), ggplot2::ggplotGrob( ggplot2::ggplot(mapping = aes(Sepal.Length, Petal.Width, col = Species)) + ggplot2::geom_point() ), ggplot2::ggplotGrob( ggplot2::ggplot(mapping = aes(Sepal.Width, Petal.Length, col = Species)) + ggplot2::geom_point() ), ggplot2::ggplotGrob( ggplot2::ggplot(mapping = aes(Sepal.Width, Petal.Width, col = Species)) + ggplot2::geom_point() ), ggplot2::ggplotGrob( ggplot2::ggplot(mapping = aes(Petal.Length, Petal.Width, col = Species)) + ggplot2::geom_point() ) ) }) lg <- decorate_grob_set(grobs = g, titles = "Hello\nOne\nTwo\nThree", footnotes = "") draw_grob(lg[[1]]) draw_grob(lg[[2]]) draw_grob(lg[[6]])
NA
values
The default string used to represent NA
values. This value is used as the default
value for the na_str
argument throughout the tern
package, and printed in place
of NA
values in output tables. If not specified for each tern
function by the user
via the na_str
argument, or in the R environment options via set_default_na_str()
,
then NA
is used.
default_na_str()
set_default_na_str(na_str)
na_str |
( |
default_na_str
returns the current value if an R environment option has been set
for "tern_default_na_str"
, or NA_character_
otherwise.
set_default_na_str
has no return value.
default_na_str()
: Accessor for default NA
value replacement string.
set_default_na_str()
: Setter for default NA
value replacement string. Sets the
option "tern_default_na_str"
within the R environment.
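The accessor/setter pair described above follows the usual pattern for option-backed defaults in R. The sketch below shows that pattern with base R only. The function names `get_na_str_sketch()` and `set_na_str_sketch()` are hypothetical stand-ins, and only the option name `"tern_default_na_str"` is taken from the text.

```r
# Hypothetical stand-ins that mimic the documented behaviour.
get_na_str_sketch <- function() {
  # Fall back to NA_character_ when the option has not been set.
  getOption("tern_default_na_str", default = NA_character_)
}

set_na_str_sketch <- function(na_str) {
  stopifnot(is.character(na_str), length(na_str) == 1)
  options("tern_default_na_str" = na_str)
  invisible(NULL)
}

old <- getOption("tern_default_na_str") # remember the current setting
set_na_str_sketch("<Missing>")
get_na_str_sketch()
options("tern_default_na_str" = old)    # restore the previous setting
```

Storing the default in an option means it can be changed once per session instead of passing `na_str` to every call.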
# Default settings
default_na_str()
getOption("tern_default_na_str")
# Set custom value
set_default_na_str("<Missing>")
# Settings after value has been set
default_na_str()
getOption("tern_default_na_str")
Utility functions to get valid statistic methods for different method groups
(.stats
) and their associated formats (.formats
), labels (.labels
), and indent modifiers
(.indent_mods
). This utility is used across tern
, but some of its working principles can be
seen in analyze_vars()
. See notes to understand why this is experimental.
get_stats( method_groups = "analyze_vars_numeric", stats_in = NULL, add_pval = FALSE ) get_formats_from_stats(stats, formats_in = NULL) get_labels_from_stats(stats, labels_in = NULL, row_nms = NULL) get_indents_from_stats(stats, indents_in = NULL, row_nms = NULL) tern_default_stats tern_default_formats tern_default_labels summary_formats(type = "numeric", include_pval = FALSE) summary_labels(type = "numeric", include_pval = FALSE)
method_groups |
( |
stats_in |
( |
add_pval |
( |
stats |
( |
formats_in |
(named |
labels_in |
(named |
row_nms |
( |
indents_in |
(named |
type |
( |
include_pval |
( |
tern_default_stats
is a named list of available statistics, with each element
named for their corresponding statistical method group.
tern_default_formats
is a named vector of available default formats, with each element
named for their corresponding statistic.
tern_default_labels
is a named character
vector of available default labels, with each element
named for their corresponding statistic.
Current choices for type
are counts
and numeric
for analyze_vars()
and affect get_stats()
.
The summary_*
quick-get functions for labels and formats use get_stats
together with get_labels_from_stats
or
get_formats_from_stats
, respectively, to retrieve the relevant information.
get_stats()
returns a character
vector of statistical methods.
get_formats_from_stats()
returns a named vector of formats (if present in either
tern_default_formats
or formats_in
, otherwise NULL
). Values can be taken from
formatters::list_valid_format_labels()
or a custom function (e.g. formatting_functions).
get_labels_from_stats()
returns a named character
vector of labels (if present in either
tern_default_labels
or labels_in
, otherwise NULL
).
get_indents_from_stats()
returns a single indent modifier value to apply to all rows
or a named numeric vector of indent modifiers (if present, otherwise NULL
).
summary_formats()
returns a named vector
of default statistic formats for the given data type.
summary_labels()
returns a named vector
of default statistic labels for the given data type.
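The lookup rule described for `get_formats_from_stats()` (use a custom value when one is supplied, else the package default, else `NULL`) can be sketched in a few lines of base R. The helper `lookup_formats()` and the default values below are hypothetical and only illustrate the precedence; they are not taken from `tern_default_formats`.

```r
# Hypothetical helper: custom entries take precedence over defaults,
# and statistics found in neither place stay NULL.
lookup_formats <- function(stats, defaults, formats_in = NULL) {
  out <- setNames(vector("list", length(stats)), stats)
  for (s in stats) {
    if (s %in% names(formats_in)) {
      out[[s]] <- formats_in[[s]]
    } else if (s %in% names(defaults)) {
      out[[s]] <- defaults[[s]]
    }
  }
  out
}

# Made-up defaults, for illustration only.
demo_defaults <- c(n = "xx.", mean = "xx.x")

lookup_formats(
  stats = c("n", "mean", "unknown_stat"),
  defaults = demo_defaults,
  formats_in = c(mean = "xx.xx")
)
```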
get_stats()
: Get statistics available for a given method
group (analyze function). To check available defaults see tern::tern_default_stats
list.
get_formats_from_stats()
: Get formats corresponding to a list of statistics.
To check available defaults see tern::tern_default_formats
list.
get_labels_from_stats()
: Get labels corresponding to a list of statistics.
To check for available defaults see tern::tern_default_labels
list. If not available there,
the statistics name will be used as label.
get_indents_from_stats()
: Format indent modifiers for a given vector/list of statistics.
It defaults to 0L for all values.
tern_default_stats
: Named list of available statistics by method group for tern
.
tern_default_formats
: Named vector of default formats for tern
.
tern_default_labels
: Named character
vector of default labels for tern
.
summary_formats()
:
Quick function to retrieve default formats for summary statistics:
analyze_vars()
and analyze_vars_in_cols()
principally.
summary_labels()
:
Quick function to retrieve default labels for summary statistics.
Returns labels of descriptive statistics which are understood by rtables
. Similar to summary_formats
.
These defaults are experimental because we use the names of functions to retrieve the default statistics. This should be generalized in groups of methods according to more reasonable groupings.
Formats in tern
and rtables
can be functions that take in the table cell value and
return a string. This is well documented in vignette("custom_appearance", package = "rtables")
.
# analyze_vars is numeric num_stats <- get_stats("analyze_vars_numeric") # also the default # Other type cnt_stats <- get_stats("analyze_vars_counts") # Weirdly taking the pval from count_occurrences only_pval <- get_stats("count_occurrences", add_pval = TRUE, stats_in = "pval") # All count_occurrences all_cnt_occ <- get_stats("count_occurrences") # Multiple get_stats(c("count_occurrences", "analyze_vars_counts")) # Defaults formats get_formats_from_stats(num_stats) get_formats_from_stats(cnt_stats) get_formats_from_stats(only_pval) get_formats_from_stats(all_cnt_occ) # Addition of customs get_formats_from_stats(all_cnt_occ, formats_in = c("fraction" = c("xx"))) get_formats_from_stats(all_cnt_occ, formats_in = list("fraction" = c("xx.xx", "xx"))) # Defaults labels get_labels_from_stats(num_stats) get_labels_from_stats(cnt_stats) get_labels_from_stats(only_pval) get_labels_from_stats(all_cnt_occ) # Addition of customs get_labels_from_stats(all_cnt_occ, labels_in = c("fraction" = "Fraction")) get_labels_from_stats(all_cnt_occ, labels_in = list("fraction" = c("Some more fractions"))) get_indents_from_stats(all_cnt_occ, indents_in = 3L) get_indents_from_stats(all_cnt_occ, indents_in = list(count = 2L, count_fraction = 5L)) get_indents_from_stats( all_cnt_occ, indents_in = list(a = 2L, count.a = 1L, count.b = 5L), row_nms = c("a", "b") ) summary_formats() summary_formats(type = "counts", include_pval = TRUE) summary_labels() summary_labels(type = "counts", include_pval = TRUE)
This is a helper function to encode missing entries across groups of categorical variables in a data frame.
df_explicit_na( data, omit_columns = NULL, char_as_factor = TRUE, logical_as_factor = FALSE, na_level = "<Missing>" )
data |
( |
omit_columns |
( |
char_as_factor |
( |
logical_as_factor |
( |
na_level |
( |
Missing entries are those with NA
or empty strings and will
be replaced with a specified value. If factor variables include missing
values, the missing value will be inserted as the last level.
Similarly, in case character or logical variables should be converted to factors
with the char_as_factor
or logical_as_factor
options, the missing values will
be set as the last level.
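The "missing value as last level" behaviour can be shown on a single vector with base R. This is a simplified sketch of the idea and not the code used by `df_explicit_na()`; the variable names are chosen for the example.

```r
na_level <- "<Missing>"

# Factor with missing values: append the missing level after the existing levels.
v <- factor(c("A", NA, NA, NA), levels = c("Z", "A"))
v_explicit <- factor(
  ifelse(is.na(v), na_level, as.character(v)),
  levels = c(levels(v), na_level)
)
levels(v_explicit)

# Character vector: NA and empty strings both count as missing.
y <- c("G", "H", NA, "")
y[is.na(y) | y == ""] <- na_level
y_explicit <- factor(y, levels = c(setdiff(sort(unique(y)), na_level), na_level))
levels(y_explicit)
```

Keeping the missing level last means it sorts after the observed categories when the factor is tabulated.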
A data.frame
with the chosen modifications applied.
sas_na()
and explicit_na()
for other missing data helper functions.
my_data <- data.frame(
  u = c(TRUE, FALSE, NA, TRUE),
  v = factor(c("A", NA, NA, NA), levels = c("Z", "A")),
  w = c("A", "B", NA, "C"),
  x = c("D", "E", "F", NA),
  y = c("G", "H", "I", ""),
  z = c(1, 2, 3, 4),
  stringsAsFactors = FALSE
)

# Example 1
# Encode missing values in all character or factor columns.
df_explicit_na(my_data)
# Also convert logical columns to factor columns.
df_explicit_na(my_data, logical_as_factor = TRUE)
# Encode missing values in a subset of columns.
df_explicit_na(my_data, omit_columns = c("x", "y"))

# Example 2
# Here we purposefully convert all `M` values to `NA` in the `SEX` variable.
# After running `df_explicit_na` the `NA` values are encoded as `<Missing>` but they are not
# included when generating `rtables`.
adsl <- tern_ex_adsl
adsl$SEX[adsl$SEX == "M"] <- NA
adsl <- df_explicit_na(adsl)

# If you want the `NA` values to be displayed in the table use the `na_level` argument.
adsl <- tern_ex_adsl
adsl$SEX[adsl$SEX == "M"] <- NA
adsl <- df_explicit_na(adsl, na_level = "Missing Values")

# Example 3
# Numeric variables that have missing values are not altered. This means that any `NA` value in
# a numeric variable will not be included in the summary statistics, nor will they be included
# in the denominator value for calculating the percent values.
adsl <- tern_ex_adsl
adsl$AGE[adsl$AGE < 30] <- NA
adsl <- df_explicit_na(adsl)
grob
Draw grob on device page.
draw_grob(grob, newpage = TRUE, vp = NULL)
grob |
( |
newpage |
( |
vp |
( |
A grob
.
library(dplyr) library(grid) rect <- rectGrob(width = grid::unit(0.5, "npc"), height = grid::unit(0.5, "npc")) rect %>% draw_grob(vp = grid::viewport(angle = 45)) num <- lapply(1:10, textGrob) num %>% arrange_grobs(grobs = .) %>% draw_grob() showViewport()
The analyze & summarize function estimate_multinomial_response()
creates a layout element to estimate the
proportion and proportion confidence interval for each level of a factor variable. The primary analysis variable,
var
, should be a factor variable, the values of which will be used as labels within the output table.
estimate_multinomial_response( lyt, var, na_str = default_na_str(), nested = TRUE, ..., show_labels = "hidden", table_names = var, .stats = "prop_ci", .formats = NULL, .labels = NULL, .indent_mods = NULL ) s_length_proportion(x, .N_col, ...) a_length_proportion(x, .N_col, ...)
lyt |
( |
var |
( |
na_str |
( |
nested |
( |
... |
additional arguments for the lower level functions. |
show_labels |
( |
table_names |
( |
.stats |
( Options are: |
.formats |
(named |
.labels |
(named |
.indent_mods |
(named |
x |
( |
.N_col |
( |
estimate_multinomial_response()
returns a layout object suitable for passing to further layouting functions,
or to rtables::build_table()
. Adding this function to an rtable
layout will add formatted rows containing
the statistics from s_length_proportion()
to the table layout.
s_length_proportion()
returns statistics from s_proportion()
.
a_length_proportion()
returns the corresponding list with formatted rtables::CellValue()
.
estimate_multinomial_response()
: Layout-creating function which can take statistics function arguments
and additional format arguments. This function is a wrapper for rtables::analyze()
and
rtables::summarize_row_groups()
.
s_length_proportion()
: Statistics function which feeds the length of x
as number
of successes, and .N_col
as total number of successes and failures into s_proportion()
.
a_length_proportion()
: Formatted analysis function which is used as afun
in estimate_multinomial_response()
.
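To make the "length of x as successes, .N_col as total" idea concrete, the sketch below builds the implied response vector and computes a proportion with a continuity-corrected Wald interval in base R. The interval formula is the textbook one and is an assumption here. Which interval `s_proportion()` actually reports depends on its `method` argument.

```r
x <- rep("CR", 10) # records falling into this response category
n_col <- 100       # column total (role of .N_col)

# Implied responder indicator: length(x) successes out of n_col subjects.
rsp <- rep(c(TRUE, FALSE), times = c(length(x), n_col - length(x)))

p_hat <- mean(rsp)

# Textbook Wald interval with continuity correction (assumed for illustration).
z <- qnorm(0.975)
half_width <- z * sqrt(p_hat * (1 - p_hat) / n_col) + 1 / (2 * n_col)
ci <- c(
  lower = max(0, p_hat - half_width),
  upper = min(1, p_hat + half_width)
)

p_hat
ci
```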
Relevant description function d_onco_rsp_label()
.
library(dplyr) # Use of the layout creating function. dta_test <- data.frame( USUBJID = paste0("S", 1:12), ARM = factor(rep(LETTERS[1:3], each = 4)), AVAL = c(A = c(1, 1, 1, 1), B = c(0, 0, 1, 1), C = c(0, 0, 0, 0)) ) %>% mutate( AVALC = factor(AVAL, levels = c(0, 1), labels = c("Complete Response (CR)", "Partial Response (PR)") ) ) lyt <- basic_table() %>% split_cols_by("ARM") %>% estimate_multinomial_response(var = "AVALC") tbl <- build_table(lyt, dta_test) tbl s_length_proportion(rep("CR", 10), .N_col = 100) s_length_proportion(factor(character(0)), .N_col = 100) a_length_proportion(rep("CR", 10), .N_col = 100) a_length_proportion(factor(character(0)), .N_col = 100)
The analyze function estimate_proportion() creates a layout element to estimate the proportion of responders within a studied population. The primary analysis variable, vars, indicates whether a response has occurred for each record. See the method parameter for the methods available for constructing the confidence interval of the proportion. Additionally, a stratification variable can be supplied via the strata element of the variables argument.
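To give a feel for how the choice of confidence interval method changes the result, the base R sketch below computes three well-known intervals for the same counts. The counts are made up, and the mapping to the method names "wilson", "wilsonc" and "clopper-pearson" is an assumption based on the standard definitions of these intervals, not a statement about how the package computes them.

```r
n_rsp <- 30       # made-up number of responders
n <- 100          # made-up number of patients
conf_level <- 0.95

# Wilson score interval (no continuity correction).
wilson <- prop.test(n_rsp, n, conf.level = conf_level, correct = FALSE)$conf.int
# Wilson score interval with continuity correction.
wilsonc <- prop.test(n_rsp, n, conf.level = conf_level, correct = TRUE)$conf.int
# Clopper-Pearson (exact binomial) interval.
cp <- binom.test(n_rsp, n, conf.level = conf_level)$conf.int

ci_table <- rbind(
  wilson = as.numeric(wilson),
  wilsonc = as.numeric(wilsonc),
  "clopper-pearson" = as.numeric(cp)
)
colnames(ci_table) <- c("lower", "upper")
ci_table
```

The continuity-corrected interval is at least as wide as the uncorrected one, which is worth keeping in mind when comparing tables produced with different method settings.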
    estimate_proportion(
      lyt,
      vars,
      conf_level = 0.95,
      method = c("waldcc", "wald", "clopper-pearson", "wilson", "wilsonc",
        "strat_wilson", "strat_wilsonc", "agresti-coull", "jeffreys"),
      weights = NULL,
      max_iterations = 50,
      variables = list(strata = NULL),
      long = FALSE,
      na_str = default_na_str(),
      nested = TRUE,
      ...,
      show_labels = "hidden",
      table_names = vars,
      .stats = NULL,
      .formats = NULL,
      .labels = NULL,
      .indent_mods = NULL
    )

    s_proportion(
      df,
      .var,
      conf_level = 0.95,
      method = c("waldcc", "wald", "clopper-pearson", "wilson", "wilsonc",
        "strat_wilson", "strat_wilsonc", "agresti-coull", "jeffreys"),
      weights = NULL,
      max_iterations = 50,
      variables = list(strata = NULL),
      long = FALSE
    )

    a_proportion(
      df,
      .var,
      conf_level = 0.95,
      method = c("waldcc", "wald", "clopper-pearson", "wilson", "wilsonc",
        "strat_wilson", "strat_wilsonc", "agresti-coull", "jeffreys"),
      weights = NULL,
      max_iterations = 50,
      variables = list(strata = NULL),
      long = FALSE
    )
lyt |
( |
vars |
( |
conf_level |
( |
method |
( |
weights |
( |
max_iterations |
( |
variables |
(named |
long |
( |
na_str |
( |
nested |
( |
... |
additional arguments for the lower level functions. |
show_labels |
( |
table_names |
( |
.stats |
( Options are: |
.formats |
(named |
.labels |
(named |
.indent_mods |
(named |
df |
( |
.var |
( |
estimate_proportion() returns a layout object suitable for passing to further layouting functions, or to rtables::build_table(). Adding this function to an rtable layout will add formatted rows containing the statistics from s_proportion() to the table layout.
s_proportion() returns statistics n_prop (n and proportion) and prop_ci (proportion CI) for a given variable.
a_proportion() returns the corresponding list with formatted rtables::CellValue().
estimate_proportion(): Layout-creating function which can take statistics function arguments and additional format arguments. This function is a wrapper for rtables::analyze().
s_proportion(): Statistics function estimating a proportion along with its confidence interval.
a_proportion(): Formatted analysis function which is used as afun in estimate_proportion().
    dta_test <- data.frame(
      USUBJID = paste0("S", 1:12),
      ARM = rep(LETTERS[1:3], each = 4),
      AVAL = rep(LETTERS[1:3], each = 4)
    )

    basic_table() %>%
      split_cols_by("ARM") %>%
      estimate_proportion(vars = "AVAL") %>%
      build_table(df = dta_test)

    # Case with only a response vector (coded 0/1).
    rsp_v <- c(1, 0, 1, 0, 1, 1, 0, 0)
    s_proportion(rsp_v)

    # Example for Stratified Wilson CI
    nex <- 100 # Number of example rows
    dta <- data.frame(
      "rsp" = sample(c(TRUE, FALSE), nex, TRUE),
      "grp" = sample(c("A", "B"), nex, TRUE),
      "f1" = sample(c("a1", "a2"), nex, TRUE),
      "f2" = sample(c("x", "y", "z"), nex, TRUE),
      stringsAsFactors = TRUE
    )

    s_proportion(
      df = dta,
      .var = "rsp",
      variables = list(strata = c("f1", "f2")),
      conf_level = 0.90,
      method = "strat_wilson"
    )
Simulated CDISC data for examples
tern_ex_adsl tern_ex_adae tern_ex_adlb tern_ex_adpp tern_ex_adrs tern_ex_adtte
rds (data.frame)
tern_ex_adsl: ADSL data. An object of class tbl_df (inherits from tbl, data.frame) with 200 rows and 21 columns.
tern_ex_adae: ADAE data. An object of class tbl_df (inherits from tbl, data.frame) with 541 rows and 42 columns.
tern_ex_adlb: ADLB data. An object of class tbl_df (inherits from tbl, data.frame) with 4200 rows and 50 columns.
tern_ex_adpp: ADPP data. An object of class tbl_df (inherits from tbl, data.frame) with 522 rows and 25 columns.
tern_ex_adrs: ADRS data. An object of class tbl_df (inherits from tbl, data.frame) with 1600 rows and 29 columns.
tern_ex_adtte: ADTTE data. An object of class tbl_df (inherits from tbl, data.frame) with 1000 rows and 28 columns.
Substitute missing data with a string or factor level.
explicit_na(x, label = "<Missing>")
x |
( |
label |
( |
x with any NA values substituted by label.
    explicit_na(c(NA, "a", "b"))
    is.na(explicit_na(c(NA, "a", "b")))

    explicit_na(factor(c(NA, "a", "b")))
    is.na(explicit_na(factor(c(NA, "a", "b"))))

    explicit_na(sas_na(c("a", "")))
Prepares estimates for number of responses, patients and overall response rate, as well as odds ratio estimates, confidence intervals and p-values, for multiple biomarkers across population subgroups in a single data frame. variables corresponds to the names of variables found in data, passed as a named list, and requires elements rsp and biomarkers (vector of continuous biomarker variables) and optionally covariates, subgroups and strata. groups_lists optionally specifies groupings for subgroups variables.
extract_rsp_biomarkers( variables, data, groups_lists = list(), control = control_logistic(), label_all = "All Patients" )
variables |
(named |
data |
( |
groups_lists |
(named |
control |
(named |
label_all |
( |
A data.frame with columns biomarker, biomarker_label, n_tot, n_rsp, prop, or, lcl, ucl, conf_level, pval, pval_label, subgroup, var, var_label, and row_type.
You can also specify a continuous variable in rsp and then use the response_definition control to convert that internally to a logical variable reflecting binary response.
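The idea behind response_definition can be sketched in base R. The mechanism shown here, replacing the literal word response in the definition string with the name of the response variable and then evaluating the expression, is an assumption made for illustration; the data values are made up, and the definition string is the one used in this function's example.

```r
response_definition <- "I(response > 750)"
rsp_var <- "EOSDY"

# Assumption: the placeholder "response" is textually replaced by the
# name of the variable given in `rsp`.
lhs <- sub("response", rsp_var, response_definition, fixed = TRUE)

toy <- data.frame(EOSDY = c(600, 751, 900)) # made-up values
binary_rsp <- as.logical(eval(parse(text = lhs), envir = toy))

lhs
binary_rsp
```

Under this reading, any expression that yields a logical vector when evaluated on the data could serve as a response definition.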
See also h_logistic_mult_cont_df(), which is used internally.
    library(dplyr)
    library(forcats)

    adrs <- tern_ex_adrs
    adrs_labels <- formatters::var_labels(adrs)

    adrs_f <- adrs %>%
      filter(PARAMCD == "BESRSPI") %>%
      mutate(rsp = AVALC == "CR")

    # Typical analysis of two continuous biomarkers `BMRKR1` and `AGE`,
    # in logistic regression models with one covariate `SEX`. The subgroups
    # are defined by the levels of `BMRKR2`.
    df <- extract_rsp_biomarkers(
      variables = list(
        rsp = "rsp",
        biomarkers = c("BMRKR1", "AGE"),
        covariates = "SEX",
        subgroups = "BMRKR2"
      ),
      data = adrs_f
    )
    df

    # Here we group the levels of `BMRKR2` manually, and we add a stratification
    # variable `STRATA1`. We also here use a continuous variable `EOSDY`
    # which is then binarized internally (response is defined as this variable
    # being larger than 750).
    df_grouped <- extract_rsp_biomarkers(
      variables = list(
        rsp = "EOSDY",
        biomarkers = c("BMRKR1", "AGE"),
        covariates = "SEX",
        subgroups = "BMRKR2",
        strata = "STRATA1"
      ),
      data = adrs_f,
      groups_lists = list(
        BMRKR2 = list(
          "low" = "LOW",
          "low/medium" = c("LOW", "MEDIUM"),
          "low/medium/high" = c("LOW", "MEDIUM", "HIGH")
        )
      ),
      control = control_logistic(
        response_definition = "I(response > 750)"
      )
    )
    df_grouped
Prepares response rates and odds ratios for population subgroups in data frames. Simple wrapper for h_odds_ratio_subgroups_df() and h_proportion_subgroups_df(). Result is a list of two data.frames: prop and or. variables corresponds to the names of variables found in data, passed as a named list, and requires elements rsp, arm and optionally subgroups and strata. groups_lists optionally specifies groupings for subgroups variables.
extract_rsp_subgroups( variables, data, groups_lists = list(), conf_level = 0.95, method = NULL, label_all = "All Patients" )
variables |
(named |
data |
( |
groups_lists |
(named |
conf_level |
( |
method |
( |
label_all |
( |
A named list of two elements:
prop: A data.frame containing columns arm, n, n_rsp, prop, subgroup, var, var_label, and row_type.
or: A data.frame containing columns arm, n_tot, or, lcl, ucl, conf_level, subgroup, var, var_label, and row_type.
Prepares estimates for number of events, patients and median survival times, as well as hazard ratio estimates, confidence intervals and p-values, for multiple biomarkers across population subgroups in a single data frame. variables corresponds to the names of variables found in data, passed as a named list, and requires elements tte, is_event, biomarkers (vector of continuous biomarker variables), and optionally subgroups and strata. groups_lists optionally specifies groupings for subgroups variables.
extract_survival_biomarkers( variables, data, groups_lists = list(), control = control_coxreg(), label_all = "All Patients" )
variables |
(named |
data |
( |
groups_lists |
(named |
control |
( |
label_all |
( |
A data.frame with columns biomarker, biomarker_label, n_tot, n_tot_events, median, hr, lcl, ucl, conf_level, pval, pval_label, subgroup, var, var_label, and row_type.
See also h_coxreg_mult_cont_df(), which is used internally, and tabulate_survival_biomarkers().
Prepares estimates of median survival times and treatment hazard ratios for population subgroups in data frames. Simple wrapper for h_survtime_subgroups_df() and h_coxph_subgroups_df(). Result is a list of two data.frames: survtime and hr. variables corresponds to the names of variables found in data, passed as a named list, and requires elements tte, is_event, arm and optionally subgroups and strata. groups_lists optionally specifies groupings for subgroups variables.
extract_survival_subgroups( variables, data, groups_lists = list(), control = control_coxph(), label_all = "All Patients" )
variables |
(named |
data |
( |
groups_lists |
(named |
control |
(
|
label_all |
( |
A named list of two elements:
survtime: A data.frame containing columns arm, n, n_events, median, subgroup, var, var_label, and row_type.
hr: A data.frame containing columns arm, n_tot, n_tot_events, hr, lcl, ucl, conf_level, pval, pval_label, subgroup, var, var_label, and row_type.
rtables formatting functions that handle extreme values.
    h_get_format_threshold(digits = 2L)

    h_format_threshold(x, digits = 2L)
digits |
( |
x |
( |
Each input is formatted to the specified number of digits. If the value is non-zero but below the lower threshold, the formatted lower threshold is returned with a "<" prefix (e.g. "<0.01" when digits is 2). If the value is above the upper threshold, the formatted upper threshold is returned with a ">" prefix (e.g. ">999.99" when digits is 2). If the value is zero, "0.00" is returned.
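The thresholding rule can be written out as a small base R function. This is a sketch under assumptions: the helper name format_threshold_sketch is hypothetical, and the thresholds (10^-digits for the lower bound, 1000 minus that amount for the upper bound) are inferred from the "<0.01" and ">999.99" examples for 2 digits rather than taken from the package source.

```r
# Hypothetical re-implementation of the rule for a single non-negative value.
format_threshold_sketch <- function(x, digits = 2L) {
  if (is.na(x)) {
    return(NA_character_)
  }
  low <- 10^(-digits)  # e.g. 0.01 for digits = 2 (inferred)
  high <- 1000 - low   # e.g. 999.99 for digits = 2 (inferred)
  if (x > 0 && x < low) {
    paste0("<", formatC(low, format = "f", digits = digits))
  } else if (x > high) {
    paste0(">", formatC(high, format = "f", digits = digits))
  } else {
    formatC(x, format = "f", digits = digits)
  }
}

format_threshold_sketch(0.001) # "<0.01"
format_threshold_sketch(1000)  # ">999.99"
format_threshold_sketch(0)     # "0.00"
```

Values inside the two thresholds are simply rounded to the requested number of digits.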
h_get_format_threshold() returns a list of 2 elements: threshold, with low and high thresholds, and format_string, with thresholds formatted as strings.
h_format_threshold() returns the given value, or if the value is not within the digit threshold the relation of the given value to the digit threshold, as a formatted string.
h_get_format_threshold(): Internal helper function to calculate the threshold and create formatted strings used in Formatting Functions. Returns a list with elements threshold and format_string.
h_format_threshold(): Internal helper function to apply a threshold format to a value. Creates a formatted string to be used in Formatting Functions.
Other formatting functions: format_auto(), format_count_fraction(), format_count_fraction_fixed_dp(), format_count_fraction_lt10(), format_extreme_values(), format_extreme_values_ci(), format_fraction(), format_fraction_fixed_dp(), format_fraction_threshold(), format_sigfig(), format_xx(), formatting_functions
    h_get_format_threshold(2L)

    h_format_threshold(0.001)
    h_format_threshold(1000)
f_conf_level(conf_level)
conf_level |
( |
A string.
f_pval(test_mean)
test_mean |
( |
A string.
This collapses levels and only keeps those new group levels, in the order provided. The returned factor has levels in the order given, with the possible missing level last (this will only be included if there are missing values).
fct_collapse_only(.f, ..., .na_level = "<Missing>")
.f |
( |
... |
(named |
.na_level |
( |
A modified factor with collapsed levels. Values and levels which are not included in the given character vector input will be set to the missing level .na_level.
Any existing NAs in the input vector will not be replaced by the missing level. If needed, explicit_na() can be called separately on the result.
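The behaviour described above (keep only the new groups, send everything else to the missing level, leave existing NAs alone) can be mimicked in base R. The helper collapse_only_sketch is hypothetical and written only to make the rules concrete; the package itself relies on forcats.

```r
# Hypothetical base R sketch of the collapsing rules.
collapse_only_sketch <- function(f, ..., na_level = "<Missing>") {
  groups <- list(...)
  x <- as.character(f)
  out <- rep(na_level, length(x)) # default: not part of any new group
  out[is.na(x)] <- NA             # existing NAs stay NA
  for (nm in names(groups)) {
    out[x %in% groups[[nm]]] <- nm
  }
  lvls <- names(groups)           # new levels, in the order provided
  if (any(out == na_level, na.rm = TRUE)) {
    lvls <- c(lvls, na_level)     # missing level last, only if needed
  }
  factor(out, levels = lvls)
}

res <- collapse_only_sketch(
  factor(c("a", "b", "c", "d")),
  TRT = "b",
  CTRL = c("c", "d")
)
res
levels(res)
```

Here the value "a" belongs to no new group, so it ends up in the missing level, which is appended after TRT and CTRL.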
See also forcats::fct_collapse() and forcats::fct_relevel(), which are used internally.
fct_collapse_only(factor(c("a", "b", "c", "d")), TRT = "b", CTRL = c("c", "d"))
This discards the observations as well as the levels specified from a factor.
fct_discard(x, discard)
x |
( |
discard |
( |
A modified factor with observations as well as levels from discard dropped.
fct_discard(factor(c("a", "b", "c")), "c")
This inserts explicit missing values in a factor based on a condition. Additionally, existing NA values will be explicitly converted to the given na_level.
fct_explicit_na_if(x, condition, na_level = "<Missing>")
x |
( |
condition |
( |
na_level |
( |
A modified factor with inserted and existing NA converted to na_level.
See also forcats::fct_na_value_to_level(), which is used internally.
fct_explicit_na_if(factor(c("a", "b", NA)), c(TRUE, FALSE, FALSE))
Fitting functions for univariate and multivariate Cox regression models.
    fit_coxreg_univar(variables, data, at = list(), control = control_coxreg())

    fit_coxreg_multivar(variables, data, control = control_coxreg())
variables |
(named |
data |
( |
at |
( |
control |
( |
fit_coxreg_univar() returns a coxreg.univar class object which is a named list with 5 elements:
mod: Cox regression models fitted by survival::coxph().
data: The original data frame input.
control: The original control input.
vars: The variables used in the model.
at: Value of the covariate at which the effect should be estimated.
fit_coxreg_multivar() returns a coxreg.multivar class object which is a named list with 4 elements:
mod: Cox regression model fitted by survival::coxph().
data: The original data frame input.
control: The original control input.
vars: The variables used in the model.
fit_coxreg_univar(): Fit a series of univariate Cox regression models given the inputs.
fit_coxreg_multivar(): Fit a multivariate Cox regression model.
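To illustrate what a series of univariate models can look like next to a single multivariate model, the sketch below calls survival::coxph() directly on the bladder data. It assumes that each univariate model contains the arm plus one candidate covariate; the choice of size and number as covariates is made up for the illustration, and this is not the package's own fitting code.

```r
library(survival)

dta <- bladder[bladder$enum < 5, ]
dta$rx <- factor(dta$rx) # treatment arm, first level is the reference
candidates <- c("size", "number")

# Univariate series: one model per candidate covariate, each with the arm.
univar_fits <- lapply(candidates, function(cv) {
  coxph(as.formula(paste("Surv(stop, event) ~ rx +", cv)), data = dta)
})
names(univar_fits) <- candidates

# Multivariate: a single model with the arm and all covariates together.
multivar_fit <- coxph(Surv(stop, event) ~ rx + size + number, data = dta)

# Treatment hazard ratio from each univariate model.
vapply(univar_fits, function(f) unname(exp(coef(f))["rx2"]), numeric(1))
```

The univariate series gives one treatment effect estimate per covariate adjustment, whereas the multivariate model gives a single estimate adjusted for all covariates at once.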
When using fit_coxreg_univar() there should be two study arms.
See also h_cox_regression for relevant helper functions, and cox_regression.
    library(survival)

    set.seed(1, kind = "Mersenne-Twister")

    # Testing dataset [survival::bladder].
    dta_bladder <- with(
      data = bladder[bladder$enum < 5, ],
      data.frame(
        time = stop,
        status = event,
        armcd = as.factor(rx),
        covar1 = as.factor(enum),
        covar2 = factor(
          sample(as.factor(enum)),
          levels = 1:4, labels = c("F", "F", "M", "M")
        )
      )
    )
    labels <- c("armcd" = "ARM", "covar1" = "A Covariate Label", "covar2" = "Sex (F/M)")
    formatters::var_labels(dta_bladder)[names(labels)] <- labels
    dta_bladder$age <- sample(20:60, size = nrow(dta_bladder), replace = TRUE)

    plot(
      survfit(Surv(time, status) ~ armcd + covar1, data = dta_bladder),
      lty = 2:4,
      xlab = "Months",
      col = c("blue1", "blue2", "blue3", "blue4", "red1", "red2", "red3", "red4")
    )

    # fit_coxreg_univar

    ## Cox regression: arm + 1 covariate.
    mod1 <- fit_coxreg_univar(
      variables = list(
        time = "time", event = "status", arm = "armcd",
        covariates = "covar1"
      ),
      data = dta_bladder,
      control = control_coxreg(conf_level = 0.91)
    )

    ## Cox regression: arm + 1 covariate + interaction, 2 candidate covariates.
    mod2 <- fit_coxreg_univar(
      variables = list(
        time = "time", event = "status", arm = "armcd",
        covariates = c("covar1", "covar2")
      ),
      data = dta_bladder,
      control = control_coxreg(conf_level = 0.91, interaction = TRUE)
    )

    ## Cox regression: arm + 1 covariate, stratified analysis.
    mod3 <- fit_coxreg_univar(
      variables = list(
        time = "time", event = "status", arm = "armcd", strata = "covar2",
        covariates = c("covar1")
      ),
      data = dta_bladder,
      control = control_coxreg(conf_level = 0.91)
    )

    ## Cox regression: no arm, only covariates.
    mod4 <- fit_coxreg_univar(
      variables = list(
        time = "time", event = "status",
        covariates = c("covar1", "covar2")
      ),
      data = dta_bladder
    )

    # fit_coxreg_multivar

    ## Cox regression: multivariate Cox regression.
    multivar_model <- fit_coxreg_multivar(
      variables = list(
        time = "time", event = "status", arm = "armcd",
        covariates = c("covar1", "covar2")
      ),
      data = dta_bladder
    )

    # Example without treatment arm.
    multivar_covs_model <- fit_coxreg_multivar(
      variables = list(
        time = "time", event = "status",
        covariates = c("covar1", "covar2")
      ),
      data = dta_bladder
    )
Fit a (conditional) logistic regression model.
fit_logistic( data, variables = list(response = "Response", arm = "ARMCD", covariates = NULL, interaction = NULL, strata = NULL), response_definition = "response" )
data |
( |
variables |
(named |
response_definition |
( |
A fitted logistic regression model.
The variables list needs to include the following elements:
arm: Treatment arm variable name.
response: The response variable name. Usually this is a 0/1 variable.
covariates: This is either NULL (no covariates) or a character vector of covariate variable names.
interaction: This is either NULL (no interaction) or a string of a single covariate variable name already included in covariates. Then the interaction with the treatment arm is included in the model.
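One way to picture how the elements of variables translate into a model is to build the corresponding formula by hand. The helper below is hypothetical and only sketches the unstratified case with stats::reformulate(); how the package actually assembles its formula, and how strata enter it, is not shown here.

```r
# Hypothetical helper: turn a `variables` list into a model formula.
logistic_formula_sketch <- function(variables) {
  rhs <- c(variables$arm, variables$covariates)
  if (!is.null(variables$interaction)) {
    # The interaction covariate must already be one of the covariates.
    stopifnot(variables$interaction %in% variables$covariates)
    rhs <- c(rhs, paste0(variables$arm, ":", variables$interaction))
  }
  reformulate(rhs, response = variables$response)
}

f_main <- logistic_formula_sketch(list(
  response = "Response", arm = "ARMCD", covariates = c("AGE", "RACE")
))
f_int <- logistic_formula_sketch(list(
  response = "Response", arm = "ARMCD", covariates = c("AGE", "RACE"),
  interaction = "AGE"
))
f_main
f_int
```

A formula built this way could be passed to stats::glm() with family = binomial() to fit an ordinary (unconditional) logistic regression.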
    library(dplyr)

    adrs_f <- tern_ex_adrs %>%
      filter(PARAMCD == "BESRSPI") %>%
      filter(RACE %in% c("ASIAN", "WHITE", "BLACK OR AFRICAN AMERICAN")) %>%
      mutate(
        Response = case_when(AVALC %in% c("PR", "CR") ~ 1, TRUE ~ 0),
        RACE = factor(RACE),
        SEX = factor(SEX)
      )
    formatters::var_labels(adrs_f) <- c(formatters::var_labels(tern_ex_adrs), Response = "Response")

    mod1 <- fit_logistic(
      data = adrs_f,
      variables = list(
        response = "Response",
        arm = "ARMCD",
        covariates = c("AGE", "RACE")
      )
    )

    mod2 <- fit_logistic(
      data = adrs_f,
      variables = list(
        response = "Response",
        arm = "ARMCD",
        covariates = c("AGE", "RACE"),
        interaction = "AGE"
      )
    )
This fits the Subgroup Treatment Effect Pattern (STEP) logistic regression models for a binary (response) outcome. The treatment arm variable must have exactly 2 levels, where the first one is taken as reference and the estimated odds ratios are for the comparison of the second level vs. the first one.
The (conditional) logistic regression model which is fit is:
response ~ arm * poly(biomarker, degree) + covariates + strata(strata)
where degree is specified by control_step().
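The shape of this model can be tried out on simulated data with stats::glm(). The sketch below is only an illustration under assumptions: the data are simulated, there are no strata (so an ordinary rather than conditional logistic regression is fit), degree is set to 1, and the whole data set is used instead of the overlapping biomarker subgroups that STEP works with.

```r
set.seed(123)
n <- 200
dat <- data.frame(
  arm = factor(rep(c("ref", "trt"), each = n / 2)), # "ref" is the reference
  biomarker = runif(n),
  age = rnorm(n, mean = 50, sd = 10)
)
# Simulated truth: the treatment effect grows with the biomarker.
dat$response <- rbinom(
  n, 1, plogis(-0.5 + 1.5 * (dat$arm == "trt") * dat$biomarker)
)

fit <- glm(
  response ~ arm * poly(biomarker, 1) + age,
  family = binomial(),
  data = dat
)

# Odds ratio (second arm level vs. first) at a given biomarker value.
or_at <- function(b) {
  newd <- data.frame(
    arm = factor(c("ref", "trt"), levels = levels(dat$arm)),
    biomarker = b,
    age = 50
  )
  lp <- predict(fit, newdata = newd, type = "link")
  unname(exp(lp[2] - lp[1]))
}

or_at(0.2)
or_at(0.8)
```

Because of the arm by biomarker interaction, the estimated odds ratio is a function of the biomarker value, which is the quantity a STEP analysis tracks across the biomarker range.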
fit_rsp_step(variables, data, control = c(control_step(), control_logistic()))
variables |
(named |
data |
( |
control |
(named |
A matrix of class step. The first part of the columns describes the subgroup intervals used for the biomarker variable, including the centers of the intervals and their bounds. The second part of the columns contains the estimates for the treatment arm comparison.
For the default degree 0 the biomarker variable is not included in the model.
See also control_step() and control_logistic() for the available customization options.
    # Testing dataset with just two treatment arms.
    library(survival)
    library(dplyr)

    adrs_f <- tern_ex_adrs %>%
      filter(
        PARAMCD == "BESRSPI",
        ARM %in% c("B: Placebo", "A: Drug X")
      ) %>%
      mutate(
        # Reorder levels of ARM to have Placebo as reference arm for Odds Ratio calculations.
        ARM = droplevels(forcats::fct_relevel(ARM, "B: Placebo")),
        RSP = case_when(AVALC %in% c("PR", "CR") ~ 1, TRUE ~ 0),
        SEX = factor(SEX)
      )

    variables <- list(
      arm = "ARM",
      biomarker = "BMRKR1",
      covariates = "AGE",
      response = "RSP"
    )

    # Fit default STEP models: Here a constant treatment effect is estimated in each subgroup.
    # We use a large enough bandwidth to avoid too small subgroups and linear separation in those.
    step_matrix <- fit_rsp_step(
      variables = variables,
      data = adrs_f,
      control = c(control_logistic(), control_step(bandwidth = 0.9))
    )
    dim(step_matrix)
    head(step_matrix)

    # Specify different polynomial degree for the biomarker interaction to use more flexible local
    # models. Or specify different logistic regression options, including confidence level.
    step_matrix2 <- fit_rsp_step(
      variables = variables,
      data = adrs_f,
      control = c(control_logistic(conf_level = 0.9), control_step(bandwidth = NULL, degree = 1))
    )

    # Use a global constant model. This is helpful as a reference for the subgroup models.
    step_matrix3 <- fit_rsp_step(
      variables = variables,
      data = adrs_f,
      control = c(control_logistic(), control_step(bandwidth = NULL, num_points = 2L))
    )

    # It is also possible to use strata, i.e. use conditional logistic regression models.
    variables2 <- list(
      arm = "ARM",
      biomarker = "BMRKR1",
      covariates = "AGE",
      response = "RSP",
      strata = c("STRATA1", "STRATA2")
    )

    step_matrix4 <- fit_rsp_step(
      variables = variables2,
      data = adrs_f,
      control = c(control_logistic(), control_step(bandwidth = NULL))
    )
This fits the subgroup treatment effect pattern (STEP) models for a survival outcome. The treatment arm variable must have exactly 2 levels, where the first one is taken as reference and the estimated hazard ratios are for the comparison of the second level vs. the first one.
The model which is fit is:
Surv(time, event) ~ arm * poly(biomarker, degree) + covariates + strata(strata)
where degree is specified by control_step().
fit_survival_step( variables, data, control = c(control_step(), control_coxph()) )
variables |
(named |
data |
( |
control |
(named |
A matrix of class step. The first part of the columns describes the subgroup intervals used for the biomarker variable, including the centers of the intervals and their bounds. The second part of the columns contains the estimates for the treatment arm comparison.
For the default degree 0 the biomarker variable is not included in the model.
See also control_step() and control_coxph() for the available customization options.
    # Testing dataset with just two treatment arms.
    library(dplyr)

    adtte_f <- tern_ex_adtte %>%
      filter(
        PARAMCD == "OS",
        ARM %in% c("B: Placebo", "A: Drug X")
      ) %>%
      mutate(
        # Reorder levels of ARM to display reference arm before treatment arm.
        ARM = droplevels(forcats::fct_relevel(ARM, "B: Placebo")),
        is_event = CNSR == 0
      )
    labels <- c("ARM" = "Treatment Arm", "is_event" = "Event Flag")
    formatters::var_labels(adtte_f)[names(labels)] <- labels

    variables <- list(
      arm = "ARM",
      biomarker = "BMRKR1",
      covariates = c("AGE", "BMRKR2"),
      event = "is_event",
      time = "AVAL"
    )

    # Fit default STEP models: Here a constant treatment effect is estimated in each subgroup.
    step_matrix <- fit_survival_step(
      variables = variables,
      data = adtte_f
    )
    dim(step_matrix)
    head(step_matrix)

    # Specify different polynomial degree for the biomarker interaction to use more flexible local
    # models. Or specify different Cox regression options.
    step_matrix2 <- fit_survival_step(
      variables = variables,
      data = adtte_f,
      control = c(control_coxph(conf_level = 0.9), control_step(degree = 2))
    )

    # Use a global model with cubic interaction and only 5 points.
    step_matrix3 <- fit_survival_step(
      variables = variables,
      data = adtte_f,
      control = c(control_coxph(), control_step(bandwidth = NULL, degree = 3, num_points = 5L))
    )
# Testing dataset with just two treatment arms. library(dplyr) adtte_f <- tern_ex_adtte %>% filter( PARAMCD == "OS", ARM %in% c("B: Placebo", "A: Drug X") ) %>% mutate( # Reorder levels of ARM to display reference arm before treatment arm. ARM = droplevels(forcats::fct_relevel(ARM, "B: Placebo")), is_event = CNSR == 0 ) labels <- c("ARM" = "Treatment Arm", "is_event" = "Event Flag") formatters::var_labels(adtte_f)[names(labels)] <- labels variables <- list( arm = "ARM", biomarker = "BMRKR1", covariates = c("AGE", "BMRKR2"), event = "is_event", time = "AVAL" ) # Fit default STEP models: Here a constant treatment effect is estimated in each subgroup. step_matrix <- fit_survival_step( variables = variables, data = adtte_f ) dim(step_matrix) head(step_matrix) # Specify different polynomial degree for the biomarker interaction to use more flexible local # models. Or specify different Cox regression options. step_matrix2 <- fit_survival_step( variables = variables, data = adtte_f, control = c(control_coxph(conf_level = 0.9), control_step(degree = 2)) ) # Use a global model with cubic interaction and only 5 points. step_matrix3 <- fit_survival_step( variables = variables, data = adtte_f, control = c(control_coxph(), control_step(bandwidth = NULL, degree = 3, num_points = 5L)) )
forest_viewport( tbl, width_row_names = NULL, width_columns = NULL, width_forest = grid::unit(1, "null"), gap_column = grid::unit(1, "lines"), gap_header = grid::unit(1, "lines"), mat_form = NULL )
tbl |
( |
width_row_names |
( |
width_columns |
( |
width_forest |
( |
gap_column |
( |
gap_header |
( |
mat_form |
( |
A viewport tree.
library(grid)

tbl <- rtable(
  header = rheader(
    rrow("", "E", rcell("CI", colspan = 2)),
    rrow("", "A", "B", "C")
  ),
  rrow("row 1", 1, 0.8, 1.1),
  rrow("row 2", 1.4, 0.8, 1.6),
  rrow("row 3", 1.2, 0.8, 1.2)
)

v <- forest_viewport(tbl)

grid::grid.newpage()
showViewport(v)
Formatting function for the majority of default methods used in analyze_vars().
For non-derived values, the significant digits of the data are used (e.g. range), while derived values have one more digit (measures of location and dispersion like mean, standard deviation).
This function can be called internally with "auto", for example .formats = c("mean" = "auto"). See details for how this works with the inner function.
format_auto(dt_var, x_stat)
dt_var |
( |
x_stat |
( |
The internal function is needed to work with the rtables default structure for format functions, i.e. function(x, ...), where x is the result of the statistical evaluation. It can be more than one element (e.g. for .stats = "mean_sd").
A string that rtables prints in a table cell.
Other formatting functions: extreme_format, format_count_fraction(), format_count_fraction_fixed_dp(), format_count_fraction_lt10(), format_extreme_values(), format_extreme_values_ci(), format_fraction(), format_fraction_fixed_dp(), format_fraction_threshold(), format_sigfig(), format_xx(), formatting_functions
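The closure structure described above, an outer function that inspects the data and returns an inner function(x, ...), can be sketched in base R. This is a simplified illustration, not the code behind format_auto(): the helper names count_decimals() and auto_format_sketch(), the derived flag and the " - " separator are all assumptions made for the example.

```r
# Hypothetical helper: number of decimal places of each value in `v`.
count_decimals <- function(v) {
  vapply(v, function(z) {
    s <- format(z, scientific = FALSE, drop0trailing = TRUE)
    if (grepl(".", s, fixed = TRUE)) nchar(sub("^[^.]*\\.", "", s)) else 0L
  }, integer(1))
}

# Hypothetical outer function: derives the precision from the data and
# returns an inner formatting function with signature function(x, ...).
auto_format_sketch <- function(dt_var, derived = TRUE) {
  digits <- max(count_decimals(dt_var)) + if (derived) 1L else 0L
  function(x, ...) {
    paste(formatC(x, format = "f", digits = digits), collapse = " - ")
  }
}

dt <- c(0.5, 1.25)
auto_format_sketch(dt, derived = FALSE)(range(dt)) # data precision: 2 decimals
auto_format_sketch(dt, derived = TRUE)(mean(dt))   # one more decimal: 3
```

The point of the two-step design is that the precision is fixed once from the raw data, while the inner function only ever sees the statistic to be printed.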
x_todo <- c(0.001, 0.2, 0.0011000, 3, 4)
res <- c(mean(x_todo[1:3]), sd(x_todo[1:3]))

# x is the result coming into the formatting function -> res!!
format_auto(dt_var = x_todo, x_stat = "mean_sd")(x = res)
format_auto(x_todo, "range")(x = range(x_todo))
no_sc_x <- c(0.0000001, 1)
format_auto(no_sc_x, "range")(x = no_sc_x)
Formats a count together with fraction with special consideration when count is 0.
format_count_fraction(x, ...)
x |
( |
... |
not used. Required for |
A string in the format count (fraction %). If count is 0, the format is 0.
Other formatting functions: extreme_format, format_auto(), format_count_fraction_fixed_dp(), format_count_fraction_lt10(), format_extreme_values(), format_extreme_values_ci(), format_fraction(), format_fraction_fixed_dp(), format_fraction_threshold(), format_sigfig(), format_xx(), formatting_functions
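The rule stated above (print count and percentage, but only "0" when the count is zero) can be sketched in base R. The function name count_fraction_sketch() and the rounding to one decimal place are assumptions made for this illustration; the sketch does not claim to reproduce the package's exact rounding.

```r
# Hypothetical re-implementation of the documented rule.
# `x` holds the count in x[1] and the fraction (between 0 and 1) in x[2].
count_fraction_sketch <- function(x, ...) {
  stopifnot(is.numeric(x), length(x) == 2)
  if (x[1] == 0) {
    return("0")
  }
  paste0(x[1], " (", round(x[2] * 100, 1), "%)")
}

count_fraction_sketch(c(2, 0.6667))
count_fraction_sketch(c(0, 0))
```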
format_count_fraction(x = c(2, 0.6667))
format_count_fraction(x = c(0, 0))
Formats a count together with fraction with special consideration when count is 0.
format_count_fraction_fixed_dp(x, ...)
x |
( |
... |
not used. Required for |
A string in the format count (fraction %). If count is 0, the format is 0.
Other formatting functions: extreme_format, format_auto(), format_count_fraction(), format_count_fraction_lt10(), format_extreme_values(), format_extreme_values_ci(), format_fraction(), format_fraction_fixed_dp(), format_fraction_threshold(), format_sigfig(), format_xx(), formatting_functions
format_count_fraction_fixed_dp(x = c(2, 0.6667))
format_count_fraction_fixed_dp(x = c(2, 0.5))
format_count_fraction_fixed_dp(x = c(0, 0))
Formats a count together with fraction with special consideration when count is less than 10.
format_count_fraction_lt10(x, ...)
x |
( |
... |
not used. Required for |
A string in the format count (fraction %). If count is less than 10, only count is printed.
Other formatting functions: extreme_format, format_auto(), format_count_fraction(), format_count_fraction_fixed_dp(), format_extreme_values(), format_extreme_values_ci(), format_fraction(), format_fraction_fixed_dp(), format_fraction_threshold(), format_sigfig(), format_xx(), formatting_functions
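A minimal base R sketch of the "less than 10" rule described above may help. The name count_fraction_lt10_sketch() and the one-decimal rounding are assumptions made for the example and are not taken from the package source.

```r
# Hypothetical re-implementation: counts below 10 are printed without a percentage.
count_fraction_lt10_sketch <- function(x, ...) {
  stopifnot(is.numeric(x), length(x) == 2)
  if (x[1] < 10) {
    return(as.character(x[1]))
  }
  paste0(x[1], " (", round(x[2] * 100, 1), "%)")
}

count_fraction_lt10_sketch(c(275, 0.9673))
count_fraction_lt10_sketch(c(9, 1))
```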
format_count_fraction_lt10(x = c(275, 0.9673))
format_count_fraction_lt10(x = c(2, 0.6667))
format_count_fraction_lt10(x = c(9, 1))
Create a formatting function for a single extreme value.
format_extreme_values(digits = 2L)
digits |
( |
An rtables formatting function that uses threshold digits to return a formatted extreme value.
Other formatting functions: extreme_format, format_auto(), format_count_fraction(), format_count_fraction_fixed_dp(), format_count_fraction_lt10(), format_extreme_values_ci(), format_fraction(), format_fraction_fixed_dp(), format_fraction_threshold(), format_sigfig(), format_xx(), formatting_functions
format_fun <- format_extreme_values(2L)
format_fun(x = 0.127)
format_fun(x = Inf)
format_fun(x = 0)
format_fun(x = 0.009)
Formatting function for extreme values that are part of a confidence interval. Values are formatted as e.g. "(xx.xx, xx.xx)" if the number of digits is 2.
format_extreme_values_ci(digits = 2L)
digits |
( |
An rtables formatting function that uses threshold digits to return a formatted confidence interval with extreme values.
Other formatting functions: extreme_format, format_auto(), format_count_fraction(), format_count_fraction_fixed_dp(), format_count_fraction_lt10(), format_extreme_values(), format_fraction(), format_fraction_fixed_dp(), format_fraction_threshold(), format_sigfig(), format_xx(), formatting_functions
format_fun <- format_extreme_values_ci(2L)
format_fun(x = c(0.127, Inf))
format_fun(x = c(0, 0.009))
Formats a fraction together with ratio in percent.
format_fraction(x, ...)
x |
(named |
... |
not used. Required for |
A string in the format num / denom (ratio %). If num is 0, the format is num / denom.
Other formatting functions: extreme_format, format_auto(), format_count_fraction(), format_count_fraction_fixed_dp(), format_count_fraction_lt10(), format_extreme_values(), format_extreme_values_ci(), format_fraction_fixed_dp(), format_fraction_threshold(), format_sigfig(), format_xx(), formatting_functions
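The num / denom rule described above can be sketched in base R as follows. The name fraction_sketch(), the absence of spaces around the slash and the one-decimal rounding are assumptions made for this illustration.

```r
# Hypothetical re-implementation: `x` is a named vector with elements
# `num` and `denom`; the percentage is omitted when `num` is 0.
fraction_sketch <- function(x, ...) {
  stopifnot(all(c("num", "denom") %in% names(x)))
  num <- x[["num"]]
  denom <- x[["denom"]]
  if (num == 0) {
    return(paste0(num, "/", denom))
  }
  paste0(num, "/", denom, " (", round(num / denom * 100, 1), "%)")
}

fraction_sketch(c(num = 2L, denom = 3L))
fraction_sketch(c(num = 0L, denom = 3L))
```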
format_fraction(x = c(num = 2L, denom = 3L))
format_fraction(x = c(num = 0L, denom = 3L))
Formats a fraction together with ratio in percent with fixed single decimal place. Includes trailing zero in case of whole number percentages to always keep one decimal place.
format_fraction_fixed_dp(x, ...)
x |
(named |
... |
not used. Required for |
A string in the format num / denom (ratio %). If num is 0, the format is num / denom.
Other formatting functions: extreme_format, format_auto(), format_count_fraction(), format_count_fraction_fixed_dp(), format_count_fraction_lt10(), format_extreme_values(), format_extreme_values_ci(), format_fraction(), format_fraction_threshold(), format_sigfig(), format_xx(), formatting_functions
format_fraction_fixed_dp(x = c(num = 1L, denom = 2L))
format_fraction_fixed_dp(x = c(num = 1L, denom = 4L))
format_fraction_fixed_dp(x = c(num = 0L, denom = 3L))
Formats a fraction when the second element of the input x is the fraction. It applies a lower threshold; for positive fractions below the threshold, the output only states that the fraction is smaller than the threshold.
format_fraction_threshold(threshold)
threshold |
( |
An rtables formatting function that takes numeric input x where the second element is the fraction that is formatted. If the fraction is above or equal to the threshold, then it is displayed in percentage. If it is positive but below the threshold, it returns, e.g. "<1" if the threshold is 0.01. If it is zero, then just "0" is returned.
Other formatting functions: extreme_format, format_auto(), format_count_fraction(), format_count_fraction_fixed_dp(), format_count_fraction_lt10(), format_extreme_values(), format_extreme_values_ci(), format_fraction(), format_fraction_fixed_dp(), format_sigfig(), format_xx(), formatting_functions
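The three cases described above (zero, positive but below the threshold, at or above the threshold) map naturally onto a closure. The following base R sketch is an illustration only; the name fraction_threshold_sketch() and the rounding to whole percentages are assumptions.

```r
# Hypothetical re-implementation of the documented threshold rule.
fraction_threshold_sketch <- function(threshold) {
  stopifnot(is.numeric(threshold), threshold > 0, threshold < 1)
  function(x, ...) {
    fraction <- x[2]
    if (fraction == 0) {
      "0"
    } else if (fraction < threshold) {
      # Only state that the percentage is below the threshold.
      paste0("<", round(threshold * 100))
    } else {
      as.character(round(fraction * 100))
    }
  }
}

fmt <- fraction_threshold_sketch(0.05)
fmt(c(20, 0.1))
fmt(c(2, 0.01))
fmt(c(0, 0))
```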
format_fun <- format_fraction_threshold(0.05)
format_fun(x = c(20, 0.1))
format_fun(x = c(2, 0.01))
format_fun(x = c(0, 0))
Format numeric values to print with a specified number of significant figures.
format_sigfig(sigfig, format = "xx", num_fmt = "fg")
sigfig |
( |
format |
( |
num_fmt |
( |
An rtables formatting function.
Other formatting functions: extreme_format, format_auto(), format_count_fraction(), format_count_fraction_fixed_dp(), format_count_fraction_lt10(), format_extreme_values(), format_extreme_values_ci(), format_fraction(), format_fraction_fixed_dp(), format_fraction_threshold(), format_xx(), formatting_functions
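Formatting to a number of significant figures can be sketched with base R's signif() and formatC(). This is an assumption about one reasonable way to do it, not a statement of how format_sigfig() is implemented; sigfig_sketch() is a hypothetical name, and in this sketch trailing zeros are not padded.

```r
# Hypothetical closure: round to `sigfig` significant figures, then print
# using formatC()'s "fg" format, which interprets `digits` as significant digits.
sigfig_sketch <- function(sigfig) {
  stopifnot(is.numeric(sigfig), sigfig >= 1)
  function(x, ...) {
    formatC(signif(x, digits = sigfig), digits = sigfig, format = "fg")
  }
}

fmt_3 <- sigfig_sketch(3)
fmt_3(1.658)
```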
fmt_3sf <- format_sigfig(3)
fmt_3sf(1.658)
fmt_3sf(1e1)

fmt_5sf <- format_sigfig(5)
fmt_5sf(0.57)
fmt_5sf(0.000025645)
Translate a string where x and dots are interpreted as number place holders, and others as formatting elements.
format_xx(str)
str |
( |
An rtables formatting function.
Other formatting functions: extreme_format, format_auto(), format_count_fraction(), format_count_fraction_fixed_dp(), format_count_fraction_lt10(), format_extreme_values(), format_extreme_values_ci(), format_fraction(), format_fraction_fixed_dp(), format_fraction_threshold(), format_sigfig(), formatting_functions
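One way to picture the translation of "xx" placeholders is as a rewrite into a sprintf() template. The base R sketch below works under that assumption; format_xx_sketch() is a hypothetical name, and the sketch does not claim to match the package's handling of every placeholder pattern.

```r
# Hypothetical translation of "xx", "xx." and "xx.x..." placeholders into
# sprintf() conversions with the matching number of decimals.
format_xx_sketch <- function(str) {
  # Escape literal percent signs so sprintf() prints them verbatim.
  template <- gsub("%", "%%", str, fixed = TRUE)
  pattern <- "xx\\.x+|xx\\.|xx"
  hits <- regmatches(template, gregexpr(pattern, template))[[1]]
  stopifnot(length(hits) > 0)
  decimals <- vapply(hits, function(h) {
    if (grepl(".", h, fixed = TRUE)) nchar(sub("^xx\\.", "", h)) else 0L
  }, integer(1))
  regmatches(template, gregexpr(pattern, template)) <- list(paste0("%.", decimals, "f"))
  function(x, ...) {
    do.call(sprintf, c(list(template), as.list(x)))
  }
}

z <- format_xx_sketch("xx (xx.x)")
z(c(1.658, 0.5761))

z <- format_xx_sketch("xx.x, incl. xx.x% NE")
z(c(1.658, 0.5761))
```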
test <- list(c(1.658, 0.5761), c(1e1, 785.6))

z <- format_xx("xx (xx.x)")
sapply(test, z)

z <- format_xx("xx.x - xx.x")
sapply(test, z)

z <- format_xx("xx.x, incl. xx.x% NE")
sapply(test, z)
See below for the list of formatting functions created in tern to work with rtables.
Other available formats can be listed via formatters::list_valid_format_labels(). Additional custom formats can be created via the formatters::sprintf_format() function.
Other formatting functions: extreme_format, format_auto(), format_count_fraction(), format_count_fraction_fixed_dp(), format_count_fraction_lt10(), format_extreme_values(), format_extreme_values_ci(), format_fraction(), format_fraction_fixed_dp(), format_fraction_threshold(), format_sigfig(), format_xx()
Graphing function that produces a Bland-Altman plot.
g_bland_altman(x, y, conf_level = 0.95)
x |
( |
y |
( |
conf_level |
( |
A ggplot Bland-Altman plot.
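The quantities shown in a Bland-Altman plot can be computed in a few lines of base R. The sketch below is illustrative: bland_altman_sketch() is a hypothetical name, and using a normal quantile for the limits of agreement is an assumption rather than a documented property of g_bland_altman().

```r
# Hypothetical helper: pairwise averages, differences and limits of agreement.
bland_altman_sketch <- function(x, y, conf_level = 0.95) {
  stopifnot(is.numeric(x), is.numeric(y), length(x) == length(y))
  difference <- x - y
  # Assumed multiplier: two-sided normal quantile for `conf_level`.
  z <- stats::qnorm(1 - (1 - conf_level) / 2)
  list(
    average = (x + y) / 2,        # x axis of the plot
    difference = difference,      # y axis of the plot
    mean_difference = mean(difference),
    lower_limit = mean(difference) - z * stats::sd(difference),
    upper_limit = mean(difference) + z * stats::sd(difference)
  )
}

res <- bland_altman_sketch(x = seq(1, 60, 5), y = seq(5, 50, 4), conf_level = 0.9)
res$mean_difference
c(res$lower_limit, res$upper_limit)
```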
x <- seq(1, 60, 5)
y <- seq(5, 50, 4)

g_bland_altman(x = x, y = y, conf_level = 0.9)
Create a forest plot from an rtable.
g_forest( tbl, col_x = attr(tbl, "col_x"), col_ci = attr(tbl, "col_ci"), vline = 1, forest_header = attr(tbl, "forest_header"), xlim = c(0.1, 10), logx = TRUE, x_at = c(0.1, 1, 10), width_row_names = lifecycle::deprecated(), width_columns = NULL, width_forest = lifecycle::deprecated(), lbl_col_padding = 0, rel_width_forest = 0.25, font_size = 12, col_symbol_size = attr(tbl, "col_symbol_size"), col = getOption("ggplot2.discrete.colour")[1], ggtheme = NULL, as_list = FALSE, gp = lifecycle::deprecated(), draw = lifecycle::deprecated(), newpage = lifecycle::deprecated() )
tbl |
( |
col_x |
( |
col_ci |
( |
vline |
( |
forest_header |
( |
xlim |
( |
logx |
( |
x_at |
( |
width_row_names |
|
width_columns |
( |
width_forest |
|
lbl_col_padding |
( |
rel_width_forest |
( |
font_size |
( |
col_symbol_size |
( |
col |
( |
ggtheme |
( |
as_list |
( |
gp |
|
draw |
|
newpage |
|
Given an rtables::rtable() object with at least one column with a single value and one column with 2 values, converts the table to a ggplot2::ggplot() object and generates an accompanying forest plot. The table and forest plot are printed side-by-side.
A ggplot forest plot and table.
library(dplyr)
library(forcats)

adrs <- tern_ex_adrs
n_records <- 20
adrs_labels <- formatters::var_labels(adrs, fill = TRUE)
adrs <- adrs %>%
  filter(PARAMCD == "BESRSPI") %>%
  filter(ARM %in% c("A: Drug X", "B: Placebo")) %>%
  slice(seq_len(n_records)) %>%
  droplevels() %>%
  mutate(
    # Reorder levels of factor to make the placebo group the reference arm.
    ARM = fct_relevel(ARM, "B: Placebo"),
    rsp = AVALC == "CR"
  )
formatters::var_labels(adrs) <- c(adrs_labels, "Response")
df <- extract_rsp_subgroups(
  variables = list(rsp = "rsp", arm = "ARM", subgroups = c("SEX", "STRATA2")),
  data = adrs
)

# Full commonly used response table.
tbl <- basic_table() %>%
  tabulate_rsp_subgroups(df)
g_forest(tbl)

# Odds ratio only table.
tbl_or <- basic_table() %>%
  tabulate_rsp_subgroups(df, vars = c("n_tot", "or", "ci"))
g_forest(
  tbl_or,
  forest_header = c("Comparison\nBetter", "Treatment\nBetter")
)

# Survival forest plot example.
adtte <- tern_ex_adtte

# Save variable labels before data processing steps.
adtte_labels <- formatters::var_labels(adtte, fill = TRUE)
adtte_f <- adtte %>%
  filter(
    PARAMCD == "OS",
    ARM %in% c("B: Placebo", "A: Drug X"),
    SEX %in% c("M", "F")
  ) %>%
  mutate(
    # Reorder levels of ARM to display reference arm before treatment arm.
    ARM = droplevels(fct_relevel(ARM, "B: Placebo")),
    SEX = droplevels(SEX),
    AVALU = as.character(AVALU),
    is_event = CNSR == 0
  )
labels <- list(
  "ARM" = adtte_labels["ARM"],
  "SEX" = adtte_labels["SEX"],
  "AVALU" = adtte_labels["AVALU"],
  "is_event" = "Event Flag"
)
formatters::var_labels(adtte_f)[names(labels)] <- as.character(labels)
df <- extract_survival_subgroups(
  variables = list(
    tte = "AVAL",
    is_event = "is_event",
    arm = "ARM",
    subgroups = c("SEX", "BMRKR2")
  ),
  data = adtte_f
)
table_hr <- basic_table() %>%
  tabulate_survival_subgroups(df, time_unit = adtte_f$AVALU[1])
g_forest(table_hr)

# Works with any `rtable`.
tbl <- rtable(
  header = c("E", "CI", "N"),
  rrow("", 1, c(.8, 1.2), 200),
  rrow("", 1.2, c(1.1, 1.4), 50)
)
g_forest(
  tbl = tbl,
  col_x = 1,
  col_ci = 2,
  xlim = c(0.5, 2),
  x_at = c(0.5, 1, 2),
  col_symbol_size = 3
)

tbl <- rtable(
  header = rheader(
    rrow("", rcell("A", colspan = 2)),
    rrow("", "c1", "c2")
  ),
  rrow("row 1", 1, c(.8, 1.2)),
  rrow("row 2", 1.2, c(1.1, 1.4))
)
g_forest(
  tbl = tbl,
  col_x = 1,
  col_ci = 2,
  xlim = c(0.5, 2),
  x_at = c(0.5, 1, 2),
  vline = 1,
  forest_header = c("Hello", "World")
)
Renders line plot(s) displaying trends in patients' parameter values over time. Patients' individual baseline values can be added to the plot(s) as a reference.
g_ipp( df, xvar, yvar, xlab, ylab, id_var = "USUBJID", title = "Individual Patient Plots", subtitle = "", caption = NULL, add_baseline_hline = FALSE, yvar_baseline = "BASE", ggtheme = nestcolor::theme_nest(), plotting_choices = c("all_in_one", "split_by_max_obs", "separate_by_obs"), max_obs_per_plot = 4, col = NULL )
df |
( |
xvar |
( |
yvar |
( |
xlab |
( |
ylab |
( |
id_var |
( |
title |
( |
subtitle |
( |
caption |
( |
add_baseline_hline |
( |
yvar_baseline |
( |
ggtheme |
( |
plotting_choices |
( |
max_obs_per_plot |
( |
col |
( |
A ggplot object or a list of ggplot objects.
g_ipp(): Plotting function for individual patient plots which, depending on user preference, renders a single graphic or compiles a list of graphics that show trends in individuals' parameter values over time.
Relevant helper function h_g_ipp().
library(dplyr)

# Select a small sample of data to plot.
adlb <- tern_ex_adlb %>%
  filter(PARAMCD == "ALT", !(AVISIT %in% c("SCREENING", "BASELINE"))) %>%
  slice(1:36)

plot_list <- g_ipp(
  df = adlb,
  xvar = "AVISIT",
  yvar = "AVAL",
  xlab = "Visit",
  ylab = "SGOT/ALT (U/L)",
  title = "Individual Patient Plots",
  add_baseline_hline = TRUE,
  plotting_choices = "split_by_max_obs",
  max_obs_per_plot = 5
)
plot_list
From a survival model, a graphic is rendered along with tabulated annotations, including the number of patients at risk at given times and the median survival per group.
g_km( df, variables, control_surv = control_surv_timepoint(), col = NULL, lty = NULL, lwd = 0.5, censor_show = TRUE, pch = 3, size = 2, max_time = NULL, xticks = NULL, xlab = "Days", yval = c("Survival", "Failure"), ylab = paste(yval, "Probability"), ylim = NULL, title = NULL, footnotes = NULL, font_size = 10, ci_ribbon = FALSE, annot_at_risk = TRUE, annot_at_risk_title = TRUE, annot_surv_med = TRUE, annot_coxph = FALSE, annot_stats = NULL, annot_stats_vlines = FALSE, control_coxph_pw = control_coxph(), ref_group_coxph = NULL, control_annot_surv_med = control_surv_med_annot(), control_annot_coxph = control_coxph_annot(), legend_pos = NULL, rel_height_plot = 0.75, ggtheme = NULL, as_list = FALSE, draw = lifecycle::deprecated(), newpage = lifecycle::deprecated(), gp = lifecycle::deprecated(), vp = lifecycle::deprecated(), name = lifecycle::deprecated(), annot_coxph_ref_lbls = lifecycle::deprecated(), position_coxph = lifecycle::deprecated(), position_surv_med = lifecycle::deprecated(), width_annots = lifecycle::deprecated() )
df |
( |
variables |
(named
|
control_surv |
(
|
col |
( |
lty |
( |
lwd |
( |
censor_show |
( |
pch |
( |
size |
( |
max_time |
( |
xticks |
( |
xlab |
( |
yval |
( |
ylab |
( |
ylim |
( |
title |
( |
footnotes |
( |
font_size |
( |
ci_ribbon |
( |
annot_at_risk |
( |
annot_at_risk_title |
( |
annot_surv_med |
( |
annot_coxph |
( |
annot_stats |
( |
annot_stats_vlines |
( |
control_coxph_pw |
(
|
ref_group_coxph |
( |
control_annot_surv_med |
( |
control_annot_coxph |
( |
legend_pos |
( |
rel_height_plot |
( |
ggtheme |
( |
as_list |
( |
draw |
|
newpage |
|
gp |
|
vp |
|
name |
|
annot_coxph_ref_lbls |
Please use the |
position_coxph |
Please use the |
position_surv_med |
Please use the |
width_annots |
Please use the |
A ggplot Kaplan-Meier plot and (optionally) summary table.
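The two annotation tables mentioned in the description (patients at risk at given times and median survival per group) correspond to quantities that can be read off a Kaplan-Meier fit. The sketch below uses the survival package and its lung dataset to show where such numbers come from; the time points and object names are assumptions made for the example, and this is not the code used inside g_km().

```r
library(survival)

# Illustrative data: `lung` from the survival package, grouped by `sex`.
dat <- survival::lung
fit <- survfit(Surv(time, status) ~ sex, data = dat)

# Median survival time per group, as shown in a median annotation table.
median_by_group <- summary(fit)$table[, "median"]
median_by_group

# Number of patients at risk at chosen time points (hypothetical tick marks).
at_risk <- summary(fit, times = c(0, 250, 500))
risk_table <- data.frame(
  strata = at_risk$strata,
  time = at_risk$time,
  n_risk = at_risk$n.risk
)
risk_table
```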
library(dplyr)

df <- tern_ex_adtte %>%
  filter(PARAMCD == "OS") %>%
  mutate(is_event = CNSR == 0)
variables <- list(tte = "AVAL", is_event = "is_event", arm = "ARMCD")

# Basic examples
g_km(df = df, variables = variables)
g_km(df = df, variables = variables, yval = "Failure")

# Examples with customization parameters applied
g_km(
  df = df,
  variables = variables,
  control_surv = control_surv_timepoint(conf_level = 0.9),
  col = c("grey25", "grey50", "grey75"),
  annot_at_risk_title = FALSE,
  lty = 1:3,
  font_size = 8
)
g_km(
  df = df,
  variables = variables,
  annot_stats = c("min", "median"),
  annot_stats_vlines = TRUE,
  max_time = 3000,
  ggtheme = ggplot2::theme_minimal()
)

# Example with pairwise Cox-PH analysis annotation table, adjusted annotation tables
g_km(
  df = df,
  variables = variables,
  annot_coxph = TRUE,
  # Full argument name as listed in the usage; the shorter `control_coxph`
  # only works through partial argument matching.
  control_coxph_pw = control_coxph(pval_method = "wald", ties = "exact", conf_level = 0.99),
  control_annot_coxph = control_coxph_annot(x = 0.26, w = 0.35),
  control_annot_surv_med = control_surv_med_annot(x = 0.8, y = 0.9, w = 0.35)
)
Line plot with optional table.
g_lineplot( df, alt_counts_df = NULL, variables = control_lineplot_vars(), mid = "mean", interval = "mean_ci", whiskers = c("mean_ci_lwr", "mean_ci_upr"), table = NULL, sfun = s_summary, ..., mid_type = "pl", mid_point_size = 2, position = ggplot2::position_dodge(width = 0.4), legend_title = NULL, legend_position = "bottom", ggtheme = nestcolor::theme_nest(), xticks = NULL, xlim = NULL, ylim = NULL, x_lab = obj_label(df[[variables[["x"]]]]), y_lab = NULL, y_lab_add_paramcd = TRUE, y_lab_add_unit = TRUE, title = "Plot of Mean and 95% Confidence Limits by Visit", subtitle = "", subtitle_add_paramcd = TRUE, subtitle_add_unit = TRUE, caption = NULL, table_format = NULL, table_labels = NULL, table_font_size = 3, errorbar_width = 0.45, newpage = lifecycle::deprecated(), col = NULL, linetype = NULL, rel_height_plot = 0.5, as_list = FALSE )
df |
( |
alt_counts_df |
( |
variables |
(named
|
mid |
( |
interval |
( |
whiskers |
( |
table |
( |
sfun |
( |
... |
optional arguments to |
mid_type |
( |
mid_point_size |
( |
position |
( |
legend_title |
( |
legend_position |
( |
ggtheme |
( |
xticks |
( |
xlim |
( |
ylim |
( |
x_lab |
( |
y_lab |
( |
y_lab_add_paramcd |
( |
y_lab_add_unit |
( |
title |
( |
subtitle |
( |
subtitle_add_paramcd |
( |
subtitle_add_unit |
( |
caption |
( |
table_format |
(named |
table_labels |
(named |
table_font_size |
( |
errorbar_width |
( |
newpage |
|
col |
( |
linetype |
( |
rel_height_plot |
( |
as_list |
( |
A ggplot line plot (and statistics table if applicable).
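The default mid point and whiskers ("mean", "mean_ci_lwr", "mean_ci_upr") are per-visit summary statistics. A base R sketch of such a summary is shown below; the helper mean_ci_sketch(), the toy data and the use of a t-based interval are assumptions made for illustration and are not taken from s_summary().

```r
# Hypothetical helper: mean with a t-based confidence interval.
mean_ci_sketch <- function(x, conf_level = 0.95) {
  x <- x[!is.na(x)]
  n <- length(x)
  half_width <- stats::qt(1 - (1 - conf_level) / 2, df = n - 1) * stats::sd(x) / sqrt(n)
  c(
    mean = mean(x),
    mean_ci_lwr = mean(x) - half_width,
    mean_ci_upr = mean(x) + half_width
  )
}

# Toy data: two visits with four observations each.
visits <- data.frame(
  AVISIT = rep(c("WEEK 1", "WEEK 2"), each = 4),
  AVAL = c(10, 12, 11, 13, 14, 15, 13, 16)
)

# One row per visit: the values a line plot would draw as point and whiskers.
per_visit <- t(sapply(split(visits$AVAL, visits$AVISIT), mean_ci_sketch))
per_visit
```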
adsl <- tern_ex_adsl
adlb <- tern_ex_adlb %>% dplyr::filter(ANL01FL == "Y", PARAMCD == "ALT", AVISIT != "SCREENING")
adlb$AVISIT <- droplevels(adlb$AVISIT)
adlb <- dplyr::mutate(adlb, AVISIT = forcats::fct_reorder(AVISIT, AVISITN, min))

# Mean with CI
g_lineplot(adlb, adsl, subtitle = "Laboratory Test:")

# Mean with CI, no stratification with group_var
g_lineplot(adlb, variables = control_lineplot_vars(group_var = NA))

# Mean, upper whisker of CI, no group_var(strata) counts N
g_lineplot(
  adlb,
  whiskers = "mean_ci_upr",
  title = "Plot of Mean and Upper 95% Confidence Limit by Visit"
)

# Median with CI
g_lineplot(
  adlb,
  adsl,
  mid = "median",
  interval = "median_ci",
  whiskers = c("median_ci_lwr", "median_ci_upr"),
  title = "Plot of Median and 95% Confidence Limits by Visit"
)

# Mean, +/- SD
g_lineplot(adlb, adsl,
  interval = "mean_sdi",
  whiskers = c("mean_sdi_lwr", "mean_sdi_upr"),
  title = "Plot of Mean +/- SD by Visit"
)

# Mean with CI plot with stats table
g_lineplot(adlb, adsl, table = c("n", "mean", "mean_ci"))

# Mean with CI, table and customized confidence level
g_lineplot(
  adlb,
  adsl,
  table = c("n", "mean", "mean_ci"),
  control = control_analyze_vars(conf_level = 0.80),
  title = "Plot of Mean and 80% Confidence Limits by Visit"
)

# Mean with CI, table, filtered data
adlb_f <- dplyr::filter(adlb, ARMCD != "ARM A" | AVISIT == "BASELINE")
g_lineplot(adlb_f, table = c("n", "mean"))
Based on the STEP results, creates a ggplot graph showing the estimated HR or OR along the continuous biomarker value subgroups.
g_step( df, use_percentile = "Percentile Center" %in% names(df), est = list(col = "blue", lty = 1), ci_ribbon = list(fill = getOption("ggplot2.discrete.colour")[1], alpha = 0.5), col = getOption("ggplot2.discrete.colour") )
df |
( |
use_percentile |
( |
est |
(named |
ci_ribbon |
(named |
col |
( |
A ggplot STEP graph.
Custom tidy method tidy.step().
library(survival)
lung$sex <- factor(lung$sex)

# Survival example.
vars <- list(
  time = "time",
  event = "status",
  arm = "sex",
  biomarker = "age"
)

step_matrix <- fit_survival_step(
  variables = vars,
  data = lung,
  control = c(control_coxph(), control_step(num_points = 10, degree = 2))
)
step_data <- broom::tidy(step_matrix)

# Default plot.
g_step(step_data)

# Add the reference 1 horizontal line.
library(ggplot2)
g_step(step_data) +
  ggplot2::geom_hline(ggplot2::aes(yintercept = 1), linetype = 2)

# Use actual values instead of percentiles, different color for estimate and no CI,
# use log scale for y axis.
g_step(
  step_data,
  use_percentile = FALSE,
  est = list(col = "blue", lty = 1),
  ci_ribbon = NULL
) + scale_y_log10()

# Adding another curve based on additional column.
step_data$extra <- exp(step_data$`Percentile Center`)
g_step(step_data) +
  ggplot2::geom_line(ggplot2::aes(y = extra), linetype = 2, color = "green")

# Response example.
vars <- list(
  response = "status",
  arm = "sex",
  biomarker = "age"
)

step_matrix <- fit_rsp_step(
  variables = vars,
  data = lung,
  control = c(
    control_logistic(response_definition = "I(response == 2)"),
    control_step()
  )
)
step_data <- broom::tidy(step_matrix)
g_step(step_data)
This basic waterfall plot visualizes a quantity height ordered by value with some markup.
g_waterfall( height, id, col_var = NULL, col = getOption("ggplot2.discrete.colour"), xlab = NULL, ylab = NULL, col_legend_title = NULL, title = NULL )
height |
( |
id |
( |
col_var |
( |
col |
( |
xlab |
( |
ylab |
( |
col_legend_title |
( |
title |
( |
A ggplot waterfall plot.
library(dplyr)

g_waterfall(height = c(3, 5, -1), id = letters[1:3])

g_waterfall(
  height = c(3, 5, -1),
  id = letters[1:3],
  col_var = letters[1:3]
)

adsl_f <- tern_ex_adsl %>%
  select(USUBJID, STUDYID, ARM, ARMCD, SEX)

adrs_f <- tern_ex_adrs %>%
  filter(PARAMCD == "OVRINV") %>%
  mutate(pchg = rnorm(n(), 10, 50))

adrs_f <- head(adrs_f, 30)
adrs_f <- adrs_f[!duplicated(adrs_f$USUBJID), ]
head(adrs_f)

g_waterfall(
  height = adrs_f$pchg,
  id = adrs_f$USUBJID,
  col_var = adrs_f$AVALC
)

g_waterfall(
  height = adrs_f$pchg,
  id = paste("asdfdsfdsfsd", adrs_f$USUBJID),
  col_var = adrs_f$SEX
)

g_waterfall(
  height = adrs_f$pchg,
  id = paste("asdfdsfdsfsd", adrs_f$USUBJID),
  xlab = "ID",
  ylab = "Percentage Change",
  title = "Waterfall plot"
)
This produces loess smoothed estimates of y with Student confidence intervals.
get_smooths(df, x, y, groups = NULL, level = 0.95)
df |
( |
x |
( |
y |
( |
groups |
( |
level |
( |
A data.frame with original x, smoothed y, ylow, and yhigh, and optional groups variables formatted as factor type.
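The idea of a loess fit with Student confidence limits can be sketched in base R. This is only an illustration under stated assumptions, not the get_smooths() implementation: the helper name smooth_with_ci() is hypothetical, grouping is left out, and the limits are built from the pointwise standard errors returned by stats::predict() on a loess fit.

```r
# Hypothetical helper (not part of tern): loess fit plus Student-t limits.
smooth_with_ci <- function(df, x, y, level = 0.95) {
  fit <- stats::loess(stats::as.formula(paste(y, "~", x)), data = df)
  pred <- stats::predict(fit, newdata = df, se = TRUE)
  # Two-sided critical value using the residual degrees of freedom of the fit.
  crit <- stats::qt((1 + level) / 2, df = pred$df)
  data.frame(
    x = df[[x]],
    y = pred$fit,
    ylow = pred$fit - crit * pred$se.fit,
    yhigh = pred$fit + crit * pred$se.fit
  )
}

# Built-in `cars` data set, used here purely as toy input.
res <- smooth_with_ci(cars, x = "speed", y = "dist")
head(res)
```

The output columns mirror the names listed above (x, y, ylow, yhigh) so the sketch can be compared against the documented return value.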
This converts a list of group levels into a data frame format which is expected by rtables::add_combo_levels().
groups_list_to_df(groups_list)
groups_list |
(named |
A tibble in the required format.
grade_groups <- list(
  "Any Grade (%)" = c("1", "2", "3", "4", "5"),
  "Grade 3-4 (%)" = c("3", "4"),
  "Grade 5 (%)" = "5"
)
groups_list_to_df(grade_groups)
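To make the conversion more concrete, here is a rough base R sketch of turning a named list of levels into one row per group with a list column. The function name groups_to_df_sketch() is hypothetical, the column names (valname, label, levelcombo, exargs) are assumed to follow the layout that rtables::add_combo_levels() documents for its combination data frame, and the rule for deriving valname is a guess; the real function returns a tibble and may differ in detail.

```r
# Hypothetical sketch, not the tern implementation.
groups_to_df_sketch <- function(groups_list) {
  out <- data.frame(
    # Assumed naming rule: strip everything that is not alphanumeric.
    valname = gsub("[^[:alnum:]]", "", names(groups_list)),
    label = names(groups_list),
    stringsAsFactors = FALSE
  )
  # List columns: the levels pooled into each group, and empty extra arguments.
  out$levelcombo <- unname(groups_list)
  out$exargs <- replicate(length(groups_list), list(), simplify = FALSE)
  out
}

groups_list <- list(
  "Any Grade (%)" = c("1", "2", "3", "4", "5"),
  "Grade 3-4 (%)" = c("3", "4"),
  "Grade 5 (%)" = "5"
)
combo_df <- groups_to_df_sketch(groups_list)
combo_df$label
combo_df$levelcombo[[2]]
```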
count_abnormal_by_worst_grade()
Helper function to prepare an ADLB data frame to be used as input in count_abnormal_by_worst_grade(). The following pre-processing steps are applied:
1. adlb is filtered on variable avisit to only include post-baseline visits.
2. adlb is filtered on variables worst_flag_low and worst_flag_high so that only worst grades (in either direction) are included.
3. From the standard lab grade variable atoxgr, the following two variables are derived and added to adlb:
   - A grade direction variable (e.g. GRADE_DIR). The variable takes value "HIGH" when atoxgr > 0, "LOW" when atoxgr < 0, and "ZERO" otherwise.
   - A toxicity grade variable (e.g. GRADE_ANL) where all negative values from atoxgr are replaced by their absolute values.
4. Unused factor levels are dropped from adlb via droplevels().
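The derivation of the two grade variables can be sketched on a toy vector. This is a minimal base R illustration of the rules described above, not the package code; the input values and the factor level order are assumptions made for the example.

```r
# Toy stand-in for the `atoxgr` column; values are illustrative only.
atoxgr <- factor(c("-2", "0", "3", "-1", "4"))
grade_num <- as.numeric(as.character(atoxgr))

# Direction: "HIGH" for positive grades, "LOW" for negative grades, "ZERO" otherwise.
grade_dir <- factor(
  ifelse(grade_num > 0, "HIGH", ifelse(grade_num < 0, "LOW", "ZERO")),
  levels = c("LOW", "ZERO", "HIGH")
)

# Analysis grade: negative grades replaced by their absolute values.
grade_anl <- factor(abs(grade_num))

data.frame(ATOXGR = atoxgr, GRADE_DIR = grade_dir, GRADE_ANL = grade_anl)
```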
h_adlb_abnormal_by_worst_grade( adlb, atoxgr = "ATOXGR", avisit = "AVISIT", worst_flag_low = "WGRLOFL", worst_flag_high = "WGRHIFL" )
adlb |
( |
atoxgr |
( |
avisit |
( |
worst_flag_low |
( |
worst_flag_high |
( |
h_adlb_abnormal_by_worst_grade() returns the adlb data frame with two new variables: GRADE_DIR and GRADE_ANL.
h_adlb_abnormal_by_worst_grade(tern_ex_adlb) %>%
  dplyr::select(ATOXGR, GRADE_DIR, GRADE_ANL) %>%
  head(10)
Helper function to prepare a df for generating the patient count shift table.
h_adlb_worsen( adlb, worst_flag_low = NULL, worst_flag_high = NULL, direction_var )
adlb |
( |
worst_flag_low |
(named |
worst_flag_high |
(named |
direction_var |
(
|
h_adlb_worsen() returns the adlb data.frame containing only the worst labs specified according to worst_flag_low or worst_flag_high for the direction specified according to direction_var. For instance, for a lab that is needed for the low direction only, only records flagged by worst_flag_low are selected. For a lab that is needed for both low and high directions, the worst low records are selected for the low direction, and the worst high records are selected for the high direction.
abnormal_by_worst_grade_worsen
library(dplyr)

# The direction variable, GRADDR, is based on metadata
adlb <- tern_ex_adlb %>%
  mutate(
    GRADDR = case_when(
      PARAMCD == "ALT" ~ "B",
      PARAMCD == "CRP" ~ "L",
      PARAMCD == "IGA" ~ "H"
    )
  ) %>%
  filter(SAFFL == "Y" & ONTRTFL == "Y" & GRADDR != "")

df <- h_adlb_worsen(
  adlb,
  worst_flag_low = c("WGRLOFL" = "Y"),
  worst_flag_high = c("WGRHIFL" = "Y"),
  direction_var = "GRADDR"
)
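The selection rule described for h_adlb_worsen() can be illustrated on a tiny made-up data frame. The sketch below uses base R only; the data values and the extra column name direction are hypothetical, and it is meant as a reading aid rather than the function's actual code.

```r
# Toy lab records; the flag and direction codes follow the example above.
labs <- data.frame(
  PARAMCD = c("ALT", "ALT", "CRP", "CRP", "IGA", "IGA"),
  GRADDR  = c("B", "B", "L", "L", "H", "H"),
  WGRLOFL = c("Y", "", "Y", "", "Y", ""),
  WGRHIFL = c("", "Y", "", "Y", "", "Y"),
  stringsAsFactors = FALSE
)

# Low direction: labs needed for "L" or "B", keeping only worst-low records.
low <- labs[labs$GRADDR %in% c("L", "B") & labs$WGRLOFL == "Y", ]
low$direction <- "Low"

# High direction: labs needed for "H" or "B", keeping only worst-high records.
high <- labs[labs$GRADDR %in% c("H", "B") & labs$WGRHIFL == "Y", ]
high$direction <- "High"

worst <- rbind(low, high)
worst
```

In this toy input, ALT (direction "B") contributes one record per direction, while CRP and IGA each contribute a single record.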
Helper function that merges ADSL and ADLB datasets so that missing lab test records are inserted in the output dataset. Remember that na_level must match the needed pre-processing done with df_explicit_na() to have the desired output.
h_adsl_adlb_merge_using_worst_flag( adsl, adlb, worst_flag = c(WGRHIFL = "Y"), by_visit = FALSE, no_fillin_visits = c("SCREENING", "BASELINE") )
adsl |
( |
adlb |
( |
worst_flag |
(named |
by_visit |
( |
no_fillin_visits |
(named |
In the result data, missing records will be created for the following situations:
- Patients who are present in adsl but have no lab data in adlb (both baseline and post-baseline).
- Patients who do not have any post-baseline lab values.
- Patients without any post-baseline values flagged as the worst.
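How such fill-in records can arise is sketched below with base R. This is a simplified illustration under assumptions: the toy data, the use of expand.grid() plus merge(), and the "<Missing>" placeholder are choices made for the example and are not taken from the function's source.

```r
# Toy subject-level and lab data (values are made up).
adsl <- data.frame(USUBJID = c("P1", "P2", "P3"), stringsAsFactors = FALSE)
adlb <- data.frame(
  USUBJID = c("P1", "P1", "P2"),
  PARAMCD = c("ALT", "CRP", "ALT"),
  WGRHIFL = c("Y", "Y", ""),
  ATOXGR  = c("2", "1", "0"),
  stringsAsFactors = FALSE
)

# Keep only records flagged as worst.
worst <- adlb[adlb$WGRHIFL == "Y", c("USUBJID", "PARAMCD", "ATOXGR")]

# Every subject crossed with every lab parameter, then left-joined to the
# worst records so that absent combinations appear as missing rows.
full_grid <- expand.grid(
  USUBJID = adsl$USUBJID,
  PARAMCD = unique(adlb$PARAMCD),
  stringsAsFactors = FALSE
)
merged <- merge(full_grid, worst, by = c("USUBJID", "PARAMCD"), all.x = TRUE)
merged$ATOXGR[is.na(merged$ATOXGR)] <- "<Missing>"
merged
```

Here P3 has no lab data at all and P2 has no record flagged as worst, so both end up with placeholder rows.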
df containing variables shared between adlb and adsl along with variables PARAM, PARAMCD, ATOXGR, and BTOXGR relevant for analysis. Optionally, AVISIT and AVISITN are included when by_visit = TRUE and no_fillin_visits = c("SCREENING", "BASELINE").
# `h_adsl_adlb_merge_using_worst_flag`
adlb_out <- h_adsl_adlb_merge_using_worst_flag(
  tern_ex_adsl,
  tern_ex_adlb,
  worst_flag = c("WGRHIFL" = "Y")
)

# `h_adsl_adlb_merge_using_worst_flag` by visit example
adlb_out_by_visit <- h_adsl_adlb_merge_using_worst_flag(
  tern_ex_adsl,
  tern_ex_adlb,
  worst_flag = c("WGRLOVFL" = "Y"),
  by_visit = TRUE
)
h_ancova(.var, .df_row, variables, interaction_item = NULL)
.var |
( |
.df_row |
( |
variables |
(named
|
interaction_item |
( |
The summary of a linear model.
h_ancova( .var = "Sepal.Length", .df_row = iris, variables = list(arm = "Species", covariates = c("Petal.Length * Petal.Width", "Sepal.Width")) )
s_count_occurrences_by_grade()
Helper function for s_count_occurrences_by_grade() to insert grade groupings into a list with individual grade frequencies. The order of the final result follows the order of grade_groups. The elements under the any-grade group (if any), i.e. the grade group equal to refs, will be moved to the end. Grade group names must be unique.
h_append_grade_groups( grade_groups, refs, remove_single = TRUE, only_grade_groups = FALSE )
grade_groups |
(named |
refs |
(named |
remove_single |
( |
only_grade_groups |
( |
Formatted list of grade groupings.
h_append_grade_groups(
  list(
    "Any Grade" = as.character(1:5),
    "Grade 1-2" = c("1", "2"),
    "Grade 3-4" = c("3", "4")
  ),
  list("1" = 10, "2" = 20, "3" = 30, "4" = 40, "5" = 50)
)

h_append_grade_groups(
  list(
    "Any Grade" = as.character(5:1),
    "Grade A" = "5",
    "Grade B" = c("4", "3")
  ),
  list("1" = 10, "2" = 20, "3" = 30, "4" = 40, "5" = 50)
)

h_append_grade_groups(
  list(
    "Any Grade" = as.character(1:5),
    "Grade 1-2" = c("1", "2"),
    "Grade 3-4" = c("3", "4")
  ),
  list("1" = 10, "2" = 5, "3" = 0)
)
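As a rough reading aid, the core of a grade grouping can be thought of as adding up the individual grade frequencies that belong to each group. The base R sketch below assumes exactly that and nothing more: it ignores the ordering rules, remove_single, and only_grade_groups, and the variable names are hypothetical.

```r
refs <- list("1" = 10, "2" = 20, "3" = 30, "4" = 40, "5" = 50)
grade_groups <- list(
  "Any Grade" = as.character(1:5),
  "Grade 1-2" = c("1", "2"),
  "Grade 3-4" = c("3", "4")
)

# Sum the frequencies of the grades in each group; grades absent from `refs`
# are skipped (an assumption made for this sketch).
group_totals <- lapply(grade_groups, function(grades) {
  sum(unlist(refs[intersect(grades, names(refs))]))
})
group_totals
```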
Helper function to extract column indices from a VTableTree for a given vector of column names.
h_col_indices(table_tree, col_names)
table_tree |
( |
col_names |
( |
A vector of column indices.
s_count_cumulative()
Helper function to calculate count and fraction of x values in the lower or upper tail given a threshold.
h_count_cumulative( x, threshold, lower_tail = TRUE, include_eq = TRUE, na.rm = TRUE, .N_col )
x |
( |
threshold |
( |
lower_tail |
( |
include_eq |
( |
na.rm |
( |
.N_col |
( |
A named vector with items:
count: the count of values less than, less than or equal to, greater than, or greater than or equal to a user-specified threshold.
fraction: the fraction of the count.
set.seed(1, kind = "Mersenne-Twister")
x <- c(sample(1:10, 10), NA)
.N_col <- length(x)

h_count_cumulative(x, 5, .N_col = .N_col)
h_count_cumulative(x, 5, lower_tail = FALSE, include_eq = FALSE, na.rm = FALSE, .N_col = .N_col)
h_count_cumulative(x, 0, lower_tail = FALSE, .N_col = .N_col)
h_count_cumulative(x, 100, lower_tail = FALSE, .N_col = .N_col)
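The tail selection controlled by lower_tail and include_eq can be sketched in a few lines of base R. The function count_cumulative_sketch() and its n_col argument are hypothetical stand-ins; the handling of a zero denominator is an assumption of this example and may not match h_count_cumulative().

```r
# Hypothetical sketch of a cumulative count relative to a threshold.
count_cumulative_sketch <- function(x, threshold, lower_tail = TRUE,
                                    include_eq = TRUE, na.rm = TRUE, n_col) {
  keep <- if (lower_tail && include_eq) {
    x <= threshold
  } else if (lower_tail) {
    x < threshold
  } else if (include_eq) {
    x >= threshold
  } else {
    x > threshold
  }
  count <- sum(keep, na.rm = na.rm)
  # Assumption: report a fraction of 0 when the denominator is 0.
  c(count = count, fraction = if (n_col == 0) 0 else count / n_col)
}

x <- c(1:10, NA)
res_le <- count_cumulative_sketch(x, 5, n_col = length(x))
res_gt <- count_cumulative_sketch(x, 5, lower_tail = FALSE, include_eq = FALSE, n_col = length(x))
res_le
res_gt
```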
Helper functions used in fit_coxreg_univar() and fit_coxreg_multivar().
h_coxreg_univar_formulas(variables, interaction = FALSE)
h_coxreg_multivar_formula(variables)
h_coxreg_univar_extract(effect, covar, data, mod, control = control_coxreg())
h_coxreg_multivar_extract(var, data, mod, control = control_coxreg())
variables |
(named |
interaction |
( |
effect |
( |
covar |
( |
data |
( |
mod |
( |
control |
( |
var |
( |
h_coxreg_univar_formulas() returns a character vector coercible into formulas (e.g. stats::as.formula()).
h_coxreg_multivar_formula() returns a string coercible into a formula (e.g. stats::as.formula()).
h_coxreg_univar_extract() returns a data.frame with variables effect, term, term_label, level, n, hr, lcl, ucl, and pval.
h_coxreg_multivar_extract() returns a data.frame with variables pval, hr, lcl, ucl, level, n, term, and term_label.
h_coxreg_univar_formulas(): Helper for Cox regression formula. Creates a list of formulas. It is used internally by fit_coxreg_univar() for the comparison of univariate Cox regression models.
h_coxreg_multivar_formula(): Helper for multivariate Cox regression formula. Creates a formula string. It is used internally by fit_coxreg_multivar() for the comparison of multivariate Cox regression models. Interactions will not be included in the multivariate Cox regression model.
h_coxreg_univar_extract(): Utility function to help tabulate the result of a univariate Cox regression model.
h_coxreg_multivar_extract(): Tabulation of multivariate Cox regressions. Utility function to help tabulate the result of a multivariate Cox regression model for a treatment/covariate variable.
# `h_coxreg_univar_formulas`

## Simple formulas.
h_coxreg_univar_formulas(
  variables = list(
    time = "time", event = "status", arm = "armcd", covariates = c("X", "y")
  )
)

## Addition of an optional strata.
h_coxreg_univar_formulas(
  variables = list(
    time = "time", event = "status", arm = "armcd", covariates = c("X", "y"),
    strata = "SITE"
  )
)

## Inclusion of the interaction term.
h_coxreg_univar_formulas(
  variables = list(
    time = "time", event = "status", arm = "armcd", covariates = c("X", "y"),
    strata = "SITE"
  ),
  interaction = TRUE
)

## Only covariates fitted in separate models.
h_coxreg_univar_formulas(
  variables = list(
    time = "time", event = "status", covariates = c("X", "y")
  )
)

# `h_coxreg_multivar_formula`

h_coxreg_multivar_formula(
  variables = list(
    time = "AVAL", event = "event", arm = "ARMCD", covariates = c("RACE", "AGE")
  )
)

# Addition of an optional strata.
h_coxreg_multivar_formula(
  variables = list(
    time = "AVAL", event = "event", arm = "ARMCD", covariates = c("RACE", "AGE"),
    strata = "SITE"
  )
)

# Example without treatment arm.
h_coxreg_multivar_formula(
  variables = list(
    time = "AVAL", event = "event", covariates = c("RACE", "AGE"),
    strata = "SITE"
  )
)

library(survival)

dta_simple <- data.frame(
  time = c(5, 5, 10, 10, 5, 5, 10, 10),
  status = c(0, 0, 1, 0, 0, 1, 1, 1),
  armcd = factor(LETTERS[c(1, 1, 1, 1, 2, 2, 2, 2)], levels = c("A", "B")),
  var1 = c(45, 55, 65, 75, 55, 65, 85, 75),
  var2 = c("F", "M", "F", "M", "F", "M", "F", "U")
)
mod <- coxph(Surv(time, status) ~ armcd + var1, data = dta_simple)
result <- h_coxreg_univar_extract(
  effect = "armcd", covar = "armcd", mod = mod, data = dta_simple
)
result

mod <- coxph(Surv(time, status) ~ armcd + var1, data = dta_simple)
result <- h_coxreg_multivar_extract(
  var = "var1", mod = mod, data = dta_simple
)
result
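For readers who want to see what "a character vector coercible into formulas" can look like, here is a small base R sketch. The helper univar_formulas_sketch() is hypothetical, and the exact strings and element names produced by h_coxreg_univar_formulas() may differ; strata and interaction terms are left out.

```r
# Hypothetical sketch: one reference model plus one model per covariate.
univar_formulas_sketch <- function(time, event, arm, covariates) {
  lhs <- paste0("survival::Surv(", time, ", ", event, ")")
  ref <- paste(lhs, "~", arm)
  per_covariate <- paste(lhs, "~", arm, "+", covariates)
  stats::setNames(c(ref, per_covariate), c("ref", covariates))
}

forms <- univar_formulas_sketch("time", "status", "armcd", c("X", "y"))
forms

# Each string can be turned into a formula object; no model is fitted here.
formula_list <- lapply(forms, stats::as.formula)
```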
Convert the survival fit data into a data frame designed for plotting within g_km.
This starts from the broom::tidy() result, and then:
1. Post-processes the strata column into a factor.
2. Extends each stratum by an additional first row with time 0 and probability 1 so that downstream plot lines start at those coordinates.
3. Adds a censor column.
4. Filters the rows before max_time.
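The steps above can be sketched on a hand-made data frame shaped like a broom::tidy() result. Everything below is an assumption-laden illustration in base R: the toy values, the definition of censor (the estimate at censoring times, NA elsewhere), and the inclusive max_time cut-off are choices for this example and are not confirmed against h_data_plot().

```r
# Toy data with a subset of the columns that a tidied survfit object carries.
tidy_km <- data.frame(
  time = c(5, 10, 4, 12),
  n.censor = c(0, 1, 1, 0),
  estimate = c(0.8, 0.8, 1.0, 0.5),
  strata = c("A", "A", "B", "B"),
  stringsAsFactors = FALSE
)

# Step 1: strata as a factor.
tidy_km$strata <- factor(tidy_km$strata)

# Step 2: one extra origin row (time 0, probability 1) per stratum.
origin <- data.frame(
  time = 0, n.censor = 0, estimate = 1,
  strata = factor(levels(tidy_km$strata), levels = levels(tidy_km$strata))
)
plot_data <- rbind(origin, tidy_km)
plot_data <- plot_data[order(plot_data$strata, plot_data$time), ]

# Step 3: censor column (assumed definition, see lead-in).
plot_data$censor <- ifelse(plot_data$n.censor > 0, plot_data$estimate, NA)

# Step 4: keep rows up to a hypothetical max_time.
max_time <- 10
plot_data <- plot_data[plot_data$time <= max_time, ]
plot_data
```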
h_data_plot(fit_km, armval = "All", max_time = NULL)
fit_km |
( |
armval |
( |
max_time |
( |
A tibble with columns time, n.risk, n.event, n.censor, estimate, std.error, conf.high, conf.low, strata, and censor.
library(dplyr)
library(survival)

# Test with multiple arms
tern_ex_adtte %>%
  filter(PARAMCD == "OS") %>%
  survfit(formula = Surv(AVAL, 1 - CNSR) ~ ARMCD, data = .) %>%
  h_data_plot()

# Test with single arm
tern_ex_adtte %>%
  filter(PARAMCD == "OS", ARMCD == "ARM B") %>%
  survfit(formula = Surv(AVAL, 1 - CNSR) ~ ARMCD, data = .) %>%
  h_data_plot(armval = "ARM B")
ggplot decomposition
The elements composing the ggplot are extracted and organized in a list.
h_decompose_gg(gg)
gg |
( |
A named list with elements:
panel: The panel.
yaxis: The y-axis.
xaxis: The x-axis.
xlab: The x-axis label.
ylab: The y-axis label.
guide: The legend.
library(dplyr)
library(survival)
library(grid)

fit_km <- tern_ex_adtte %>%
  filter(PARAMCD == "OS") %>%
  survfit(formula = Surv(AVAL, 1 - CNSR) ~ ARMCD, data = .)
data_plot <- h_data_plot(fit_km = fit_km)
xticks <- h_xticks(data = data_plot)
gg <- h_ggkm(
  data = data_plot,
  yval = "Survival",
  censor_show = TRUE,
  xticks = xticks, xlab = "Days", ylab = "Survival Probability",
  title = "tt",
  footnotes = "ff"
)

g_el <- h_decompose_gg(gg)
grid::grid.newpage()
grid.rect(gp = grid::gpar(lty = 1, col = "red", fill = "gray85", lwd = 5))
grid::grid.draw(g_el$panel)

grid::grid.newpage()
grid.rect(gp = grid::gpar(lty = 1, col = "royalblue", fill = "gray85", lwd = 5))
grid::grid.draw(with(g_el, cbind(ylab, yaxis)))
g_lineplot table
h_format_row(x, format, labels = NULL)
x |
(named |
format |
(named |
labels |
(named |
A single row data.frame object.
mean_ci <- c(48, 51)
x <- list(mean = 50, mean_ci = mean_ci)
format <- c(mean = "xx.x", mean_ci = "(xx.xx, xx.xx)")
labels <- c(mean = "My Mean")
h_format_row(x, format, labels)

attr(mean_ci, "label") <- "Mean 95% CI"
x <- list(mean = 50, mean_ci = mean_ci)
h_format_row(x, format, labels)
Function that generates a simple line plot displaying parameter trends over time.
h_g_ipp( df, xvar, yvar, xlab, ylab, id_var, title = "Individual Patient Plots", subtitle = "", caption = NULL, add_baseline_hline = FALSE, yvar_baseline = "BASE", ggtheme = nestcolor::theme_nest(), col = NULL )
df |
( |
xvar |
( |
yvar |
( |
xlab |
( |
ylab |
( |
id_var |
( |
title |
( |
subtitle |
( |
caption |
( |
add_baseline_hline |
( |
yvar_baseline |
( |
ggtheme |
( |
col |
( |
A ggplot line plot.
g_ipp() which uses this function.
library(dplyr)

# Select a small sample of data to plot.
adlb <- tern_ex_adlb %>%
  filter(PARAMCD == "ALT", !(AVISIT %in% c("SCREENING", "BASELINE"))) %>%
  slice(1:36)

p <- h_g_ipp(
  df = adlb,
  xvar = "AVISIT",
  yvar = "AVAL",
  xlab = "Visit",
  id_var = "USUBJID",
  ylab = "SGOT/ALT (U/L)",
  add_baseline_hline = TRUE
)
p
Draw the Kaplan-Meier plot using ggplot2.
h_ggkm( data, xticks = NULL, yval = "Survival", censor_show, xlab, ylab, ylim = NULL, title, footnotes = NULL, max_time = NULL, lwd = 1, lty = NULL, pch = 3, size = 2, col = NULL, ci_ribbon = FALSE, ggtheme = nestcolor::theme_nest() )
data |
( |
xticks |
( |
yval |
( |
censor_show |
( |
xlab |
( |
ylab |
( |
ylim |
( |
title |
( |
footnotes |
( |
max_time |
( |
lwd |
( |
lty |
( |
pch |
( |
size |
( |
col |
( |
ci_ribbon |
( |
ggtheme |
( |
A ggplot object.
library(dplyr)
library(survival)

fit_km <- tern_ex_adtte %>%
  filter(PARAMCD == "OS") %>%
  survfit(formula = Surv(AVAL, 1 - CNSR) ~ ARMCD, data = .)
data_plot <- h_data_plot(fit_km = fit_km)
xticks <- h_xticks(data = data_plot)
gg <- h_ggkm(
  data = data_plot,
  censor_show = TRUE,
  xticks = xticks,
  xlab = "Days",
  yval = "Survival",
  ylab = "Survival Probability",
  title = "Survival"
)
gg
Grob of rtable output from h_tbl_coxph_pairwise()
h_grob_coxph( ..., x = 0, y = 0, width = grid::unit(0.4, "npc"), ttheme = gridExtra::ttheme_default(padding = grid::unit(c(1, 0.5), "lines"), core = list(bg_params = list(fill = c("grey95", "grey90"), alpha = 0.5))) )
... |
arguments to pass to |
x |
( |
y |
( |
width |
( |
ttheme |
( |
A grob of a table containing statistics HR, XX% CI (XX taken from control_coxph_pw), and p-value (log-rank).
library(dplyr)
library(survival)
library(grid)

grid::grid.newpage()
grid.rect(gp = grid::gpar(lty = 1, col = "pink", fill = "gray85", lwd = 1))
data <- tern_ex_adtte %>%
  filter(PARAMCD == "OS") %>%
  mutate(is_event = CNSR == 0)
tbl_grob <- h_grob_coxph(
  df = data,
  variables = list(tte = "AVAL", is_event = "is_event", arm = "ARMCD"),
  control_coxph_pw = control_coxph(conf_level = 0.9), x = 0.5, y = 0.5
)
grid::grid.draw(tbl_grob)
The survival fit is transformed into a grob containing a table with groups in rows characterized by N, median, and 95% confidence interval.
h_grob_median_surv( fit_km, armval = "All", x = 0.9, y = 0.9, width = grid::unit(0.3, "npc"), ttheme = gridExtra::ttheme_default() )
fit_km |
( |
armval |
( |
x |
( |
y |
( |
width |
( |
ttheme |
( |
A grob of a table containing statistics N, Median, and XX% CI (XX taken from fit_km).
library(dplyr)
library(survival)
library(grid)

grid::grid.newpage()
grid.rect(gp = grid::gpar(lty = 1, col = "pink", fill = "gray85", lwd = 1))
tern_ex_adtte %>%
  filter(PARAMCD == "OS") %>%
  survfit(formula = Surv(AVAL, 1 - CNSR) ~ ARMCD, data = .) %>%
  h_grob_median_surv() %>%
  grid::grid.draw()
Two graphical objects are obtained, one corresponding to row labeling and the second to the table of numbers of patients at risk. If title = TRUE, a third object corresponding to the table title is also obtained.
h_grob_tbl_at_risk(data, annot_tbl, xlim, title = TRUE)
data |
( |
annot_tbl |
( |
xlim |
( |
title |
( |
A named list of two gTree objects if title = FALSE: at_risk and label, or three gTree objects if title = TRUE: at_risk, label, and title.
library(dplyr)
library(survival)
library(grid)

fit_km <- tern_ex_adtte %>%
  filter(PARAMCD == "OS") %>%
  survfit(formula = Surv(AVAL, 1 - CNSR) ~ ARMCD, data = .)

data_plot <- h_data_plot(fit_km = fit_km)

xticks <- h_xticks(data = data_plot)

gg <- h_ggkm(
  data = data_plot,
  censor_show = TRUE,
  xticks = xticks, xlab = "Days", ylab = "Survival Probability",
  title = "tt", footnotes = "ff", yval = "Survival"
)

# The annotation table reports the patient at risk for a given strata and
# times (`xticks`).
annot_tbl <- summary(fit_km, times = xticks)
if (is.null(fit_km$strata)) {
  annot_tbl <- with(annot_tbl, data.frame(n.risk = n.risk, time = time, strata = "All"))
} else {
  strata_lst <- strsplit(sub("=", "equals", levels(annot_tbl$strata)), "equals")
  levels(annot_tbl$strata) <- matrix(unlist(strata_lst), ncol = 2, byrow = TRUE)[, 2]
  annot_tbl <- data.frame(
    n.risk = annot_tbl$n.risk,
    time = annot_tbl$time,
    strata = annot_tbl$strata
  )
}

# The annotation table is transformed into a grob.
tbl <- h_grob_tbl_at_risk(data = data_plot, annot_tbl = annot_tbl, xlim = max(xticks))

# For the representation, the layout is estimated for which the decomposition
# of the graphic element is necessary.
g_el <- h_decompose_gg(gg)
lyt <- h_km_layout(data = data_plot, g_el = g_el, title = "t", footnotes = "f")

grid::grid.newpage()
pushViewport(viewport(layout = lyt, height = .95, width = .95))
grid.rect(gp = grid::gpar(lty = 1, col = "purple", fill = "gray85", lwd = 1))
pushViewport(viewport(layout.pos.row = 3:4, layout.pos.col = 2))
grid.rect(gp = grid::gpar(lty = 1, col = "orange", fill = "gray85", lwd = 1))
grid::grid.draw(tbl$at_risk)
popViewport()
pushViewport(viewport(layout.pos.row = 3:4, layout.pos.col = 1))
grid.rect(gp = grid::gpar(lty = 1, col = "green3", fill = "gray85", lwd = 1))
grid::grid.draw(tbl$label)
Build the y-axis annotation from a decomposed ggplot.
h_grob_y_annot(ylab, yaxis)
ylab |
( |
yaxis |
( |
A gTree object containing the y-axis annotation from a ggplot.
library(dplyr)
library(survival)
library(grid)

fit_km <- tern_ex_adtte %>%
  filter(PARAMCD == "OS") %>%
  survfit(formula = Surv(AVAL, 1 - CNSR) ~ ARMCD, data = .)

data_plot <- h_data_plot(fit_km = fit_km)
xticks <- h_xticks(data = data_plot)

gg <- h_ggkm(
  data = data_plot,
  censor_show = TRUE,
  xticks = xticks,
  xlab = "Days",
  ylab = "Survival Probability",
  title = "title",
  footnotes = "footnotes",
  yval = "Survival"
)

g_el <- h_decompose_gg(gg)

grid::grid.newpage()
pvp <- grid::plotViewport(margins = c(5, 4, 2, 20))
pushViewport(pvp)
grid::grid.draw(h_grob_y_annot(ylab = g_el$ylab, yaxis = g_el$yaxis))
grid.rect(gp = grid::gpar(lty = 1, col = "gray35", fill = NA))
Prepares a (5 rows) x (2 cols) layout for the Kaplan-Meier curve.
h_km_layout(
  data,
  g_el,
  title,
  footnotes,
  annot_at_risk = TRUE,
  annot_at_risk_title = TRUE
)
data |
( |
g_el |
( |
title |
( |
footnotes |
( |
annot_at_risk |
( |
annot_at_risk_title |
( |
The layout corresponds to a grid of two columns and five rows of unequal dimensions. Most of the dimensions are fixed; only the curve is flexible and takes up the remaining free space.
The left column holds the annotation of the ggplot (y-axis) and the names of the strata for the patients at risk tabulation. The main constraint is the column width, which must be large enough to fit the strata names.
The right column receives the ggplot, the legend, the x-axis and the patients at risk table.
A grid layout.
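The idea of one flexible cell surrounded by fixed ones can be illustrated with grid::grid.layout() alone. This is a minimal sketch and not the h_km_layout() implementation: all dimensions below are made-up assumptions, and only the 5 x 2 shape and the single flexible row for the curve are taken from the description above.

```r
library(grid)

# Hypothetical dimensions: every row is fixed in "lines" except one row in
# "null" units, which absorbs whatever free space remains (the curve).
row_heights <- unit(
  c(2, 1, 3, 4, 2),
  c("lines", "null", "lines", "lines", "lines")
)

# Left column: fixed width, assumed wide enough for the strata names.
# Right column: flexible.
col_widths <- unit(c(6, 1), c("lines", "null"))

lyt_sketch <- grid.layout(
  nrow = 5,
  ncol = 2,
  heights = row_heights,
  widths = col_widths
)
```

Calling grid.show.layout(lyt_sketch) draws the resulting cells, which makes it easy to see which regions stretch when the device is resized.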
library(dplyr)
library(survival)
library(grid)

fit_km <- tern_ex_adtte %>%
  filter(PARAMCD == "OS") %>%
  survfit(formula = Surv(AVAL, 1 - CNSR) ~ ARMCD, data = .)

data_plot <- h_data_plot(fit_km = fit_km)
xticks <- h_xticks(data = data_plot)

gg <- h_ggkm(
  data = data_plot,
  censor_show = TRUE,
  xticks = xticks,
  xlab = "Days",
  ylab = "Survival Probability",
  title = "tt",
  footnotes = "ff",
  yval = "Survival"
)

g_el <- h_decompose_gg(gg)
lyt <- h_km_layout(data = data_plot, g_el = g_el, title = "t", footnotes = "f")
grid.show.layout(lyt)
Helper functions used in calculations for logistic regression.
h_get_interaction_vars(fit_glm)

h_interaction_coef_name(
  interaction_vars,
  first_var_with_level,
  second_var_with_level
)

h_or_cat_interaction(
  odds_ratio_var,
  interaction_var,
  fit_glm,
  conf_level = 0.95
)

h_or_cont_interaction(
  odds_ratio_var,
  interaction_var,
  fit_glm,
  at = NULL,
  conf_level = 0.95
)

h_or_interaction(
  odds_ratio_var,
  interaction_var,
  fit_glm,
  at = NULL,
  conf_level = 0.95
)

h_simple_term_labels(terms, table)

h_interaction_term_labels(terms1, terms2, table, any = FALSE)

h_glm_simple_term_extract(x, fit_glm)

h_glm_interaction_extract(x, fit_glm)

h_glm_inter_term_extract(odds_ratio_var, interaction_var, fit_glm, ...)

h_logistic_simple_terms(x, fit_glm, conf_level = 0.95)

h_logistic_inter_terms(x, fit_glm, conf_level = 0.95, at = NULL)
fit_glm |
( |
interaction_vars |
( |
first_var_with_level |
( |
second_var_with_level |
( |
odds_ratio_var |
( |
interaction_var |
( |
conf_level |
( |
at |
( |
terms |
( |
table |
( |
terms1 |
( |
terms2 |
( |
any |
( |
x |
( |
... |
additional arguments for the lower level functions. |
Vector of names of interaction variables.
Name of coefficient.
Odds ratio.
Odds ratio.
Odds ratio.
Term labels containing numbers of patients.
Term labels containing numbers of patients.
Tabulated main effect results from a logistic regression model.
Tabulated interaction term results from a logistic regression model.
A data.frame of tabulated interaction term results from a logistic regression model.
Tabulated statistics for the given variable(s) from the logistic regression model.
Tabulated statistics for the given variable(s) from the logistic regression model.
h_get_interaction_vars(): Helper function to extract interaction variable names from a fitted model assuming only one interaction term.
h_interaction_coef_name(): Helper function to get the right coefficient name from the interaction variable names and the given levels. The main value here is that the order of first and second variable is checked in the interaction_vars input.
h_or_cat_interaction(): Helper function to calculate the odds ratio estimates for the case when both the odds ratio and the interaction variable are categorical.
h_or_cont_interaction(): Helper function to calculate the odds ratio estimates for the case when either the odds ratio or the interaction variable is continuous.
h_or_interaction(): Helper function to calculate the odds ratio estimates in case of an interaction. This is a wrapper for h_or_cont_interaction() and h_or_cat_interaction().
h_simple_term_labels(): Helper function to construct term labels from simple terms and the table of numbers of patients.
h_interaction_term_labels(): Helper function to construct term labels from interaction terms and the table of numbers of patients.
h_glm_simple_term_extract(): Helper function to tabulate the main effect results of a (conditional) logistic regression model.
h_glm_interaction_extract(): Helper function to tabulate the interaction term results of a logistic regression model.
h_glm_inter_term_extract(): Helper function to tabulate the interaction results of a logistic regression model. This is essentially a wrapper for h_or_interaction() and h_glm_simple_term_extract() which puts the results in the right data frame format.
h_logistic_simple_terms(): Helper function to tabulate the results including odds ratios and confidence intervals of simple terms.
h_logistic_inter_terms(): Helper function to tabulate the results including odds ratios and confidence intervals of interaction terms.
We don't provide a function for the case when both variables are continuous because this does not arise in this table, as the treatment arm variable will always be involved and categorical.
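When a categorical arm interacts with a continuous covariate, the arm odds ratio is no longer a single number: it depends on the covariate value, which is the role of the at argument. The following base R sketch illustrates the arithmetic on simulated data. The helper name or_arm_at_age(), the simulated variables and the Wald-type interval are assumptions made for illustration and are not taken from the tern source.

```r
set.seed(42)
n <- 400
dat <- data.frame(
  arm = factor(sample(c("A", "B"), n, replace = TRUE)),
  age = round(runif(n, 30, 70))
)
lin_pred <- -2 + 0.8 * (dat$arm == "B") + 0.03 * dat$age -
  0.01 * (dat$arm == "B") * dat$age
dat$rsp <- rbinom(n, size = 1, prob = plogis(lin_pred))

fit <- glm(rsp ~ arm * age, data = dat, family = binomial())

# Hypothetical helper: odds ratio of arm B vs. arm A at a given age.
or_arm_at_age <- function(fit, at, conf_level = 0.95) {
  beta <- coef(fit)
  v <- vcov(fit)
  main <- "armB"
  inter <- "armB:age"
  # Log odds ratio at `at`: main effect plus interaction slope times `at`.
  est <- beta[[main]] + at * beta[[inter]]
  # Variance of a linear combination of two coefficients.
  se <- sqrt(v[main, main] + at^2 * v[inter, inter] + 2 * at * v[main, inter])
  q <- qnorm((1 + conf_level) / 2)
  c(or = exp(est), lcl = exp(est - q * se), ucl = exp(est + q * se))
}

or_at_50 <- or_arm_at_age(fit, at = 50)
```

Evaluating the helper at several ages shows how the estimated odds ratio drifts with the covariate, which is why a single summary value is not reported for interaction models.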
library(dplyr)
library(broom)

adrs_f <- tern_ex_adrs %>%
  filter(PARAMCD == "BESRSPI") %>%
  filter(RACE %in% c("ASIAN", "WHITE", "BLACK OR AFRICAN AMERICAN")) %>%
  mutate(
    Response = case_when(AVALC %in% c("PR", "CR") ~ 1, TRUE ~ 0),
    RACE = factor(RACE),
    SEX = factor(SEX)
  )
formatters::var_labels(adrs_f) <- c(formatters::var_labels(tern_ex_adrs), Response = "Response")

mod1 <- fit_logistic(
  data = adrs_f,
  variables = list(
    response = "Response",
    arm = "ARMCD",
    covariates = c("AGE", "RACE")
  )
)
mod2 <- fit_logistic(
  data = adrs_f,
  variables = list(
    response = "Response",
    arm = "ARMCD",
    covariates = c("AGE", "RACE"),
    interaction = "AGE"
  )
)

h_glm_simple_term_extract("AGE", mod1)
h_glm_simple_term_extract("ARMCD", mod1)

h_glm_interaction_extract("ARMCD:AGE", mod2)

h_glm_inter_term_extract("AGE", "ARMCD", mod2)

h_logistic_simple_terms("AGE", mod1)

h_logistic_inter_terms(c("RACE", "AGE", "ARMCD", "AGE:ARMCD"), mod2)
trim_levels_to_map()
Helper function to create a map data frame from the input dataset, which can be used as an argument in the trim_levels_to_map split function. The map is constructed differently depending on the method.
h_map_for_count_abnormal(
  df,
  variables = list(anl = "ANRIND", split_rows = c("PARAM"), range_low = "ANRLO", range_high = "ANRHI"),
  abnormal = list(low = c("LOW", "LOW LOW"), high = c("HIGH", "HIGH HIGH")),
  method = c("default", "range"),
  na_str = "<Missing>"
)
df |
( |
variables |
(named |
abnormal |
(named |
method |
( |
na_str |
( |
A map data.frame.
If method is "default", the returned map will only have the abnormal directions that are observed in df, and records with all normal values will be excluded to avoid errors when creating the layout. If method is "range", the returned map will be based on the rule that at least one observation has a low range > 0 for the low direction and at least one observation has a non-missing high range for the high direction.
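To make the "default" rule more concrete, the sketch below builds a comparable map by hand in base R: it keeps only the parameter and direction combinations that actually occur as abnormal values. This is an illustration under simplifying assumptions (a single split variable, toy data, no handling of missing values) and may differ from what h_map_for_count_abnormal() returns, for example in how normal levels are represented.

```r
# Toy data: ALT is only ever LOW, CRP only ever HIGH, IGA is always NORMAL.
df_toy <- data.frame(
  PARAM = c("ALT", "ALT", "CRP", "CRP", "IGA", "IGA"),
  ANRIND = c("LOW", "NORMAL", "HIGH", "NORMAL", "NORMAL", "NORMAL"),
  stringsAsFactors = FALSE
)
abnormal <- list(low = "LOW", high = "HIGH")

# Keep records with an abnormal indicator, then reduce to unique combinations.
is_abnormal <- df_toy$ANRIND %in% unlist(abnormal)
map_sketch <- unique(df_toy[is_abnormal, c("PARAM", "ANRIND")])
rownames(map_sketch) <- NULL
```

In this toy case the parameter with only normal records (IGA) does not appear in map_sketch, mirroring the exclusion described above.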
adlb <- df_explicit_na(tern_ex_adlb)

h_map_for_count_abnormal(
  df = adlb,
  variables = list(anl = "ANRIND", split_rows = c("LBCAT", "PARAM")),
  abnormal = list(low = c("LOW"), high = c("HIGH")),
  method = "default",
  na_str = "<Missing>"
)

df <- data.frame(
  USUBJID = c(rep("1", 4), rep("2", 4), rep("3", 4)),
  AVISIT = c(
    rep("WEEK 1", 2),
    rep("WEEK 2", 2),
    rep("WEEK 1", 2),
    rep("WEEK 2", 2),
    rep("WEEK 1", 2),
    rep("WEEK 2", 2)
  ),
  PARAM = rep(c("ALT", "CPR"), 6),
  ANRIND = c(
    "NORMAL", "NORMAL", "LOW",
    "HIGH", "LOW", "LOW", "HIGH", "HIGH", rep("NORMAL", 4)
  ),
  ANRLO = rep(5, 12),
  ANRHI = rep(20, 12)
)
df$ANRIND <- factor(df$ANRIND, levels = c("LOW", "HIGH", "NORMAL"))

h_map_for_count_abnormal(
  df = df,
  variables = list(
    anl = "ANRIND",
    split_rows = c("PARAM"),
    range_low = "ANRLO",
    range_high = "ANRHI"
  ),
  abnormal = list(low = c("LOW"), high = c("HIGH")),
  method = "range",
  na_str = "<Missing>"
)
Functions to calculate odds ratios in estimate_odds_ratio().
or_glm(data, conf_level)

or_clogit(data, conf_level, method = "exact")
data |
( |
conf_level |
( |
method |
( |
A named list of elements or_ci and n_tot.
or_glm(): Estimates the odds ratio based on stats::glm(). Note that there must be exactly 2 groups in data as specified by the grp variable.
or_clogit(): Estimates the odds ratio based on survival::clogit(). This is done for the whole data set including all groups, since the results are not the same as when doing pairwise comparisons between the groups.
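For two groups, the unstratified estimate can be reproduced with a plain logistic regression. The sketch below is an assumption about the general approach rather than a copy of or_glm(): it uses a Wald-type interval from stats::confint.default(), and the package may compute its interval differently.

```r
# Same two-group layout as in the examples: column `rsp` (0/1) and `grp`.
data_2grp <- data.frame(
  rsp = c(1, 1, 0, 1, 0, 0, 1, 1),
  grp = factor(letters[c(1, 1, 1, 2, 2, 2, 1, 2)])
)

fit <- glm(rsp ~ grp, data = data_2grp, family = binomial(link = "logit"))

# Odds ratio of group "b" relative to the reference group "a".
or_est <- exp(coef(fit)[["grpb"]])

# Wald-type confidence interval on the odds ratio scale.
or_ci <- exp(confint.default(fit, level = 0.95)["grpb", ])
```

With a single binary covariate the model is saturated, so or_est coincides with the cross-product ratio of the 2 x 2 table: group "a" has 3 responders out of 4 and group "b" has 2 out of 4, giving (2/2) / (3/1) = 1/3.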
# Data with 2 groups.
data <- data.frame(
  rsp = as.logical(c(1, 1, 0, 1, 0, 0, 1, 1)),
  grp = letters[c(1, 1, 1, 2, 2, 2, 1, 2)],
  strata = letters[c(1, 2, 1, 2, 2, 2, 1, 2)],
  stringsAsFactors = TRUE
)

# Odds ratio based on glm.
or_glm(data, conf_level = 0.95)

# Data with 3 groups.
data <- data.frame(
  rsp = as.logical(c(1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0)),
  grp = letters[c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3)],
  strata = LETTERS[c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2)],
  stringsAsFactors = TRUE
)

# Odds ratio based on stratified estimation by conditional logistic regression.
or_clogit(data, conf_level = 0.95)
Sort pharmacokinetic data by PARAM variable
h_pkparam_sort(pk_data, key_var = "PARAMCD")
pk_data |
( |
key_var |
( |
A pharmacokinetic data.frame sorted by a PARAM variable.
library(dplyr)

adpp <- tern_ex_adpp %>% mutate(PKPARAM = factor(paste0(PARAM, " (", AVALU, ")")))
pk_ordered_data <- h_pkparam_sort(adpp)
For each arm level, the predicted mean rate is calculated using the fitted model object, with newdata set to the result of stats::model.frame, a reconstructed data or the original data, depending on the object formula (coming from the fit). The confidence interval is derived using the conf_level parameter.
h_ppmeans(obj, .df_row, arm, conf_level)
obj |
( |
.df_row |
( |
arm |
( |
conf_level |
( |
h_ppmeans() returns the estimated means.
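The per-arm prediction step can be sketched in base R as follows. This is only an illustration of the idea of predicting with every subject assigned to one arm level in turn. The simulated data, the Poisson model and the omission of the confidence interval are assumptions, and h_ppmeans() itself may proceed differently.

```r
set.seed(1)
dat <- data.frame(
  arm = factor(rep(c("A", "B"), each = 50)),
  age = rnorm(100, mean = 50, sd = 10)
)
dat$count <- rpois(
  100,
  lambda = exp(0.2 + 0.5 * (dat$arm == "B") + 0.01 * (dat$age - 50))
)

fit <- glm(count ~ arm + age, family = poisson(), data = dat)

# For each arm level: set every subject to that arm, predict on the response
# scale, and average the predicted rates.
pred_means <- sapply(levels(dat$arm), function(lev) {
  newdata <- dat
  newdata$arm <- factor(lev, levels = levels(dat$arm))
  mean(predict(fit, newdata = newdata, type = "response"))
})
```

The result is a named numeric vector with one predicted mean rate per arm level.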
prop_diff_wald(rsp, grp, conf_level = 0.95, correct = FALSE)

prop_diff_ha(rsp, grp, conf_level)

prop_diff_nc(rsp, grp, conf_level, correct = FALSE)

prop_diff_cmh(rsp, grp, strata, conf_level = 0.95)

prop_diff_strat_nc(
  rsp,
  grp,
  strata,
  weights_method = c("cmh", "wilson_h"),
  conf_level = 0.95,
  correct = FALSE
)
rsp |
( |
grp |
( |
conf_level |
( |
correct |
( |
strata |
( |
weights_method |
( |
A named list of elements diff (proportion difference) and diff_ci (proportion difference confidence interval).
prop_diff_wald(): The Wald interval follows the usual textbook definition for a single proportion confidence interval using the normal approximation. It is possible to include a continuity correction for Wald's interval.
prop_diff_ha(): Anderson-Hauck confidence interval.
prop_diff_nc(): Newcombe confidence interval. It is based on the Wilson score confidence interval for a single binomial proportion.
prop_diff_cmh(): Calculates the weighted difference. This is defined as the difference in response rates between the experimental treatment group and the control treatment group, adjusted for stratification factors by applying Cochran-Mantel-Haenszel (CMH) weights. For the CMH chi-squared test, use stats::mantelhaen.test().
prop_diff_strat_nc(): Calculates the stratified Newcombe confidence interval and difference in response rates between the experimental treatment group and the control treatment group, adjusted for stratification factors. This implementation follows closely the one proposed by Yan and Su (2010). Weights can be estimated from the heuristic proposed in prop_strat_wilson() or from CMH-derived weights (see prop_diff_cmh()).
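As a rough guide to what the Wald and Newcombe intervals compute, the following base R sketch derives both for a difference of two proportions without continuity correction. The counts are made up, and the code follows the textbook formulas rather than the package source, so details such as rounding or the direction of the difference are assumptions.

```r
# Hypothetical counts: 30/40 responders in group 1, 20/40 in group 2.
x1 <- 30
n1 <- 40
x2 <- 20
n2 <- 40
conf_level <- 0.95

p1 <- x1 / n1
p2 <- x2 / n2
d <- p1 - p2
z <- qnorm((1 + conf_level) / 2)

# Wald interval: normal approximation with the unpooled standard error.
se <- sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
wald_ci <- c(d - z * se, d + z * se)

# Newcombe interval: combine the Wilson score limits of each proportion.
w1 <- prop.test(x1, n1, conf.level = conf_level, correct = FALSE)$conf.int
w2 <- prop.test(x2, n2, conf.level = conf_level, correct = FALSE)$conf.int
newcombe_ci <- c(
  d - sqrt((p1 - w1[1])^2 + (w2[2] - p2)^2),
  d + sqrt((w1[2] - p1)^2 + (p2 - w2[1])^2)
)
```

Unlike the Wald interval, the Newcombe interval is generally not symmetric around the point estimate, because the Wilson limits it is built from are not symmetric around each sample proportion.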
Yan X, Su XG (2010). “Stratified Wilson and Newcombe Confidence Intervals for Multiple Binomial Proportions.” Stat. Biopharm. Res., 2(3), 329–335.
prop_diff() for implementation of these helper functions.
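The CMH-weighted difference itself is a short calculation once per-stratum counts are available. The sketch below uses the standard CMH weights n1 * n2 / (n1 + n2) on made-up counts. It shows only the point estimate, the confidence interval is left out, and the column names are hypothetical.

```r
# Hypothetical per-stratum counts for treatment (trt) and control (ctl).
strata_tab <- data.frame(
  stratum = c("s1", "s2", "s3"),
  x_trt = c(12, 20, 9),
  n_trt = c(20, 30, 15),
  x_ctl = c(8, 15, 6),
  n_ctl = c(20, 30, 15)
)

p_trt <- strata_tab$x_trt / strata_tab$n_trt
p_ctl <- strata_tab$x_ctl / strata_tab$n_ctl

# CMH weights, normalized to sum to 1.
w <- strata_tab$n_trt * strata_tab$n_ctl / (strata_tab$n_trt + strata_tab$n_ctl)
w <- w / sum(w)

# Weighted difference in response rates (treatment minus control).
diff_cmh <- sum(w * (p_trt - p_ctl))
```

Larger and more balanced strata receive more weight, so a small stratum with an extreme difference has limited influence on the overall estimate.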
# Wald confidence interval
set.seed(2)
rsp <- sample(c(TRUE, FALSE), replace = TRUE, size = 20)
grp <- factor(c(rep("A", 10), rep("B", 10)))

prop_diff_wald(rsp = rsp, grp = grp, conf_level = 0.95, correct = FALSE)

# Anderson-Hauck confidence interval
## "Mid" case: 3/4 respond in group A, 1/2 respond in group B.
rsp <- c(TRUE, FALSE, FALSE, TRUE, TRUE, TRUE)
grp <- factor(c("A", "B", "A", "B", "A", "A"), levels = c("B", "A"))

prop_diff_ha(rsp = rsp, grp = grp, conf_level = 0.90)

## Edge case: Same proportion of response in A and B.
rsp <- c(TRUE, FALSE, TRUE, FALSE)
grp <- factor(c("A", "A", "B", "B"), levels = c("A", "B"))

prop_diff_ha(rsp = rsp, grp = grp, conf_level = 0.6)

# Newcombe confidence interval
set.seed(1)
rsp <- c(
  sample(c(TRUE, FALSE), size = 40, prob = c(3 / 4, 1 / 4), replace = TRUE),
  sample(c(TRUE, FALSE), size = 40, prob = c(1 / 2, 1 / 2), replace = TRUE)
)
grp <- factor(rep(c("A", "B"), each = 40), levels = c("B", "A"))
table(rsp, grp)

prop_diff_nc(rsp = rsp, grp = grp, conf_level = 0.9)

# Cochran-Mantel-Haenszel confidence interval
set.seed(2)
rsp <- sample(c(TRUE, FALSE), 100, TRUE)
grp <- sample(c("Placebo", "Treatment"), 100, TRUE)
grp <- factor(grp, levels = c("Placebo", "Treatment"))
strata_data <- data.frame(
  "f1" = sample(c("a", "b"), 100, TRUE),
  "f2" = sample(c("x", "y", "z"), 100, TRUE),
  stringsAsFactors = TRUE
)

prop_diff_cmh(
  rsp = rsp, grp = grp, strata = interaction(strata_data),
  conf_level = 0.90
)

# Stratified Newcombe confidence interval
set.seed(2)
data_set <- data.frame(
  "rsp" = sample(c(TRUE, FALSE), 100, TRUE),
  "f1" = sample(c("a", "b"), 100, TRUE),
  "f2" = sample(c("x", "y", "z"), 100, TRUE),
  "grp" = sample(c("Placebo", "Treatment"), 100, TRUE),
  stringsAsFactors = TRUE
)

prop_diff_strat_nc(
  rsp = data_set$rsp, grp = data_set$grp, strata = interaction(data_set[2:3]),
  weights_method = "cmh",
  conf_level = 0.90
)

prop_diff_strat_nc(
  rsp = data_set$rsp, grp = data_set$grp, strata = interaction(data_set[2:3]),
  weights_method = "wilson_h",
  conf_level = 0.90
)
Functions to calculate different proportion confidence intervals for use in estimate_proportion().
prop_wilson(rsp, conf_level, correct = FALSE)

prop_strat_wilson(
  rsp,
  strata,
  weights = NULL,
  conf_level = 0.95,
  max_iterations = NULL,
  correct = FALSE
)

prop_clopper_pearson(rsp, conf_level)

prop_wald(rsp, conf_level, correct = FALSE)

prop_agresti_coull(rsp, conf_level)

prop_jeffreys(rsp, conf_level)
rsp |
( |
conf_level |
( |
correct |
( |
strata |
( |
weights |
( |
max_iterations |
( |
Confidence interval of a proportion.
prop_wilson(): Calculates the Wilson interval by calling stats::prop.test(). Also referred to as Wilson score interval.
prop_strat_wilson(): Calculates the stratified Wilson confidence interval for unequal proportions as described in Yan and Su (2010).
prop_clopper_pearson(): Calculates the Clopper-Pearson interval by calling stats::binom.test(). Also referred to as the exact method.
prop_wald(): Calculates the Wald interval by following the usual textbook definition for a single proportion confidence interval using the normal approximation.
prop_agresti_coull(): Calculates the Agresti-Coull interval. Constructed (for 95% CI) by adding two successes and two failures to the data and then using the Wald formula to construct a CI.
prop_jeffreys(): Calculates the Jeffreys interval, an equal-tailed interval based on the non-informative Jeffreys prior for a binomial proportion.
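The unstratified intervals above can all be written in a few lines of base R, which may help when checking results by hand. The sketch below uses 5 responders out of 10 and follows the textbook definitions. It ignores the continuity correction and any special handling of boundary cases (0 or n responders) that the package functions may apply.

```r
x <- 5
n <- 10
conf_level <- 0.95
alpha <- 1 - conf_level
z <- qnorm(1 - alpha / 2)
p_hat <- x / n

# Wilson score interval via prop.test() without continuity correction.
wilson_ci <- as.numeric(
  prop.test(x, n, conf.level = conf_level, correct = FALSE)$conf.int
)

# Clopper-Pearson ("exact") interval via binom.test().
cp_ci <- as.numeric(binom.test(x, n, conf.level = conf_level)$conf.int)

# Wald interval: normal approximation around the sample proportion.
wald_ci <- p_hat + c(-1, 1) * z * sqrt(p_hat * (1 - p_hat) / n)

# Agresti-Coull: add z^2 / 2 successes and z^2 / 2 failures (about 2 each
# for a 95% interval), then apply the Wald formula.
n_tilde <- n + z^2
p_tilde <- (x + z^2 / 2) / n_tilde
ac_ci <- p_tilde + c(-1, 1) * z * sqrt(p_tilde * (1 - p_tilde) / n_tilde)

# Jeffreys: equal-tailed quantiles of the Beta(x + 0.5, n - x + 0.5) posterior.
jeffreys_ci <- qbeta(c(alpha / 2, 1 - alpha / 2), x + 0.5, n - x + 0.5)
```

For this small sample the Clopper-Pearson interval is the widest of the five, which is the usual price of its guaranteed coverage.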
Yan X, Su XG (2010). “Stratified Wilson and Newcombe Confidence Intervals for Multiple Binomial Proportions.” Stat. Biopharm. Res., 2(3), 329–335.
estimate_proportion, descriptive function d_proportion(), and helper functions strata_normal_quantile() and update_weights_strat_wilson().
rsp <- c(
  TRUE, TRUE, TRUE, TRUE, TRUE,
  FALSE, FALSE, FALSE, FALSE, FALSE
)
prop_wilson(rsp, conf_level = 0.9)

# Stratified Wilson confidence interval with unequal probabilities
set.seed(1)
rsp <- sample(c(TRUE, FALSE), 100, TRUE)
strata_data <- data.frame(
  "f1" = sample(c("a", "b"), 100, TRUE),
  "f2" = sample(c("x", "y", "z"), 100, TRUE),
  stringsAsFactors = TRUE
)
strata <- interaction(strata_data)
n_strata <- ncol(table(rsp, strata)) # Number of strata

prop_strat_wilson(
  rsp = rsp, strata = strata,
  conf_level = 0.90
)

# Not automatic setting of weights
prop_strat_wilson(
  rsp = rsp, strata = strata,
  weights = rep(1 / n_strata, n_strata),
  conf_level = 0.90
)

prop_clopper_pearson(rsp, conf_level = .95)

prop_wald(rsp, conf_level = 0.95)
prop_wald(rsp, conf_level = 0.95, correct = TRUE)

prop_agresti_coull(rsp, conf_level = 0.95)

prop_jeffreys(rsp, conf_level = 0.95)
Helper functions which are documented here separately to not confuse the user when reading about the user-facing functions.
h_rsp_to_logistic_variables(variables, biomarker)

h_logistic_mult_cont_df(variables, data, control = control_logistic())

h_tab_rsp_one_biomarker(df, vars, na_str = default_na_str(), .indent_mods = 0L)
variables |
(named |
biomarker |
( |
data |
( |
control |
(named |
df |
( |
vars |
(
|
na_str |
( |
.indent_mods |
(named |
h_rsp_to_logistic_variables() returns a named list of elements response, arm, covariates, and strata.
h_logistic_mult_cont_df() returns a data.frame containing estimates and statistics for the selected biomarkers.
h_tab_rsp_one_biomarker() returns an rtables table object with the given statistics arranged in columns.
h_rsp_to_logistic_variables(): Helps with converting the "response" function variable list to the "logistic regression" variable list. The reason is that currently there is an inconsistency between the variable names accepted by extract_rsp_subgroups() and fit_logistic().
h_logistic_mult_cont_df(): Prepares estimates for number of responses, patients and overall response rate, as well as odds ratio estimates, confidence intervals and p-values, for multiple biomarkers in a given single data set. variables corresponds to names of variables found in data, passed as a named list and requires elements rsp and biomarkers (vector of continuous biomarker variables) and optionally covariates and strata.
h_tab_rsp_one_biomarker(): Prepares a single sub-table given a df_sub containing the results for a single biomarker.
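The variable-list conversion amounts to renaming list elements. The sketch below shows one plausible mapping, based only on the documented input (variables, biomarker) and output element names (response, arm, covariates, strata). The function name rsp_to_logistic_sketch() is hypothetical, and placing the biomarker in the arm slot is an assumption rather than something confirmed by the text.

```r
# Hypothetical re-implementation of the name mapping, for illustration only.
rsp_to_logistic_sketch <- function(variables, biomarker) {
  list(
    response = variables$rsp,
    arm = biomarker, # assumed: the biomarker takes the place of the arm
    covariates = variables$covariates,
    strata = variables$strata
  )
}

res <- rsp_to_logistic_sketch(
  variables = list(rsp = "RSP", covariates = c("A", "B"), strata = "D"),
  biomarker = "AGE"
)
```

The returned list keeps all four element names even when an optional element such as strata is missing from the input, in which case its value is NULL.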
library(dplyr)
library(forcats)

adrs <- tern_ex_adrs
adrs_labels <- formatters::var_labels(adrs)

adrs_f <- adrs %>%
  filter(PARAMCD == "BESRSPI") %>%
  mutate(rsp = AVALC == "CR")
formatters::var_labels(adrs_f) <- c(adrs_labels, "Response")

# This is how the variable list is converted internally.
h_rsp_to_logistic_variables(
  variables = list(
    rsp = "RSP",
    covariates = c("A", "B"),
    strata = "D"
  ),
  biomarker = "AGE"
)

# For a single population, estimate separately the effects
# of two biomarkers.
df <- h_logistic_mult_cont_df(
  variables = list(
    rsp = "rsp",
    biomarkers = c("BMRKR1", "AGE"),
    covariates = "SEX"
  ),
  data = adrs_f
)
df

# If the data set is empty, the corresponding rows with missing values are still returned.
h_logistic_mult_cont_df(
  variables = list(
    rsp = "rsp",
    biomarkers = c("BMRKR1", "AGE"),
    covariates = "SEX",
    strata = "STRATA1"
  ),
  data = adrs_f[NULL, ]
)

# Starting from above `df`, zoom in on one biomarker and add required columns.
df1 <- df[1, ]
df1$subgroup <- "All patients"
df1$row_type <- "content"
df1$var <- "ALL"
df1$var_label <- "All patients"

h_tab_rsp_one_biomarker(
  df1,
  vars = c("n_tot", "n_rsp", "prop", "or", "ci", "pval")
)
Helper functions that tabulate in a data frame statistics such as response rate and odds ratio for population subgroups.
h_proportion_df(rsp, arm)

h_proportion_subgroups_df(
  variables,
  data,
  groups_lists = list(),
  label_all = "All Patients"
)

h_odds_ratio_df(rsp, arm, strata_data = NULL, conf_level = 0.95, method = NULL)

h_odds_ratio_subgroups_df(
  variables,
  data,
  groups_lists = list(),
  conf_level = 0.95,
  method = NULL,
  label_all = "All Patients"
)
rsp |
( |
arm |
( |
variables |
(named |
data |
( |
groups_lists |
(named |
label_all |
( |
strata_data |
( |
conf_level |
( |
method |
( |
Main functionality is to prepare data for use in a layout-creating function.
h_proportion_df() returns a data.frame with columns arm, n, n_rsp, and prop.
h_proportion_subgroups_df() returns a data.frame with columns arm, n, n_rsp, prop, subgroup, var, var_label, and row_type.
h_odds_ratio_df() returns a data.frame with columns arm, n_tot, or, lcl, ucl, conf_level, and optionally pval and pval_label.
h_odds_ratio_subgroups_df() returns a data.frame with columns arm, n_tot, or, lcl, ucl, conf_level, subgroup, var, var_label, and row_type.
h_proportion_df(): Helper to prepare a data frame of binary responses by arm.
h_proportion_subgroups_df(): Summarizes the proportion of binary responses by arm and across subgroups in a data frame. variables corresponds to the names of variables found in data, passed as a named list, and requires elements rsp and arm, and optionally subgroups. groups_lists optionally specifies groupings for subgroups variables.
h_odds_ratio_df(): Helper to prepare a data frame with estimates of the odds ratio between a treatment and a control arm.
h_odds_ratio_subgroups_df(): Summarizes estimates of the odds ratio between a treatment and a control arm across subgroups in a data frame. variables corresponds to the names of variables found in data, passed as a named list, and requires elements rsp and arm, and optionally subgroups and strata. groups_lists optionally specifies groupings for subgroups variables.
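To make the shape of the h_proportion_df() result concrete, the following base R sketch tabulates responders by arm into the columns arm, n, n_rsp, and prop. The function name proportion_by_arm_sketch is hypothetical and is not part of tern; this illustrates the idea only and is not the package's implementation.

```r
# Hypothetical helper for illustration; not part of tern.
proportion_by_arm_sketch <- function(rsp, arm) {
  stopifnot(is.logical(rsp), is.factor(arm), length(rsp) == length(arm))
  n <- as.vector(table(arm))                 # patients per arm level
  n_rsp <- as.vector(tapply(rsp, arm, sum))  # responders per arm level
  n_rsp[is.na(n_rsp)] <- 0                   # empty arm levels have no responders
  data.frame(
    arm = factor(levels(arm), levels = levels(arm)),
    n = n,
    n_rsp = n_rsp,
    prop = ifelse(n > 0, n_rsp / n, NA_real_)
  )
}

res <- proportion_by_arm_sketch(
  rsp = c(TRUE, FALSE, FALSE),
  arm = factor(c("A", "A", "B"), levels = c("A", "B"))
)
print(res)
```

Arm A has one responder among two patients and arm B has none among one, so the proportions are 0.5 and 0.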
library(dplyr)
library(forcats)

adrs <- tern_ex_adrs
adrs_labels <- formatters::var_labels(adrs)

adrs_f <- adrs %>%
  filter(PARAMCD == "BESRSPI") %>%
  filter(ARM %in% c("A: Drug X", "B: Placebo")) %>%
  droplevels() %>%
  mutate(
    # Reorder levels of factor to make the placebo group the reference arm.
    ARM = fct_relevel(ARM, "B: Placebo"),
    rsp = AVALC == "CR"
  )
formatters::var_labels(adrs_f) <- c(adrs_labels, "Response")

h_proportion_df(
  c(TRUE, FALSE, FALSE),
  arm = factor(c("A", "A", "B"), levels = c("A", "B"))
)

h_proportion_subgroups_df(
  variables = list(rsp = "rsp", arm = "ARM", subgroups = c("SEX", "BMRKR2")),
  data = adrs_f
)

# Define groupings for BMRKR2 levels.
h_proportion_subgroups_df(
  variables = list(rsp = "rsp", arm = "ARM", subgroups = c("SEX", "BMRKR2")),
  data = adrs_f,
  groups_lists = list(
    BMRKR2 = list(
      "low" = "LOW",
      "low/medium" = c("LOW", "MEDIUM"),
      "low/medium/high" = c("LOW", "MEDIUM", "HIGH")
    )
  )
)

# Unstratified analysis.
h_odds_ratio_df(
  c(TRUE, FALSE, FALSE, TRUE),
  arm = factor(c("A", "A", "B", "B"), levels = c("A", "B"))
)

# Include p-value.
h_odds_ratio_df(adrs_f$rsp, adrs_f$ARM, method = "chisq")

# Stratified analysis.
h_odds_ratio_df(
  rsp = adrs_f$rsp,
  arm = adrs_f$ARM,
  strata_data = adrs_f[, c("STRATA1", "STRATA2")],
  method = "cmh"
)

# Unstratified analysis.
h_odds_ratio_subgroups_df(
  variables = list(rsp = "rsp", arm = "ARM", subgroups = c("SEX", "BMRKR2")),
  data = adrs_f
)

# Stratified analysis.
h_odds_ratio_subgroups_df(
  variables = list(
    rsp = "rsp",
    arm = "ARM",
    subgroups = c("SEX", "BMRKR2"),
    strata = c("STRATA1", "STRATA2")
  ),
  data = adrs_f
)

# Define groupings of BMRKR2 levels.
h_odds_ratio_subgroups_df(
  variables = list(
    rsp = "rsp",
    arm = "ARM",
    subgroups = c("SEX", "BMRKR2")
  ),
  data = adrs_f,
  groups_lists = list(
    BMRKR2 = list(
      "low" = "LOW",
      "low/medium" = c("LOW", "MEDIUM"),
      "low/medium/high" = c("LOW", "MEDIUM", "HIGH")
    )
  )
)
Split a data frame into a non-nested list of subsets.
h_split_by_subgroups(data, subgroups, groups_lists = list())
data |
( |
subgroups |
( |
groups_lists |
(named |
Main functionality is to prepare data for use in forest plot layouts.
A list with subset data (df) and metadata about the subset (df_labels).
df <- data.frame(
  x = c(1:5),
  y = factor(c("A", "B", "A", "B", "A"), levels = c("A", "B", "C")),
  z = factor(c("C", "C", "D", "D", "D"), levels = c("D", "C"))
)
formatters::var_labels(df) <- paste("label for", names(df))

h_split_by_subgroups(
  data = df,
  subgroups = c("y", "z")
)

h_split_by_subgroups(
  data = df,
  subgroups = c("y", "z"),
  groups_lists = list(
    y = list("AB" = c("A", "B"), "C" = "C")
  )
)
Divides the data in the vector param into the groups defined by f, based on the specified values. This is useful in rtables layers for distributing parameters such as .stats or .formats into lists whose items each correspond to a specific analysis function.
h_split_param(param, value, f)
param |
( |
value |
( |
f |
( |
A named list with the same element names as f, each containing the elements specified in .stats.
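The splitting logic described above can be sketched in a few lines of base R. The function split_param_sketch is a hypothetical re-implementation written for illustration and may differ from the actual source of h_split_param(): for each group in f it keeps the entries of param whose value belongs to that group, and returns NULL for groups with no match.

```r
# Hypothetical re-implementation for illustration; not tern's source code.
split_param_sketch <- function(param, value, f) {
  lapply(f, function(group) {
    selected <- param[value %in% group]
    # Groups without any matching parameter are returned as NULL.
    if (length(selected) == 0) NULL else selected
  })
}

f <- list(
  surv = c("pt_at_risk", "event_free_rate"),
  surv_diff = c("rate_diff", "rate_diff_ci")
)

stats <- c("pt_at_risk", "rate_diff")
res_stats <- split_param_sketch(stats, stats, f)

formats <- c(pt_at_risk = "xx", event_free_rate = "xxx")
res_formats <- split_param_sketch(formats, names(formats), f)

str(res_stats)
str(res_formats)
```

Passing names(formats) as value is what allows a named vector of formats to be routed by statistic name while keeping the format strings as the content.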
f <- list(
  surv = c("pt_at_risk", "event_free_rate", "rate_se", "rate_ci"),
  surv_diff = c("rate_diff", "rate_diff_ci", "ztest_pval")
)

.stats <- c("pt_at_risk", "rate_diff")
h_split_param(.stats, .stats, f = f)

# $surv
# [1] "pt_at_risk"
#
# $surv_diff
# [1] "rate_diff"

.formats <- c("pt_at_risk" = "xx", "event_free_rate" = "xxx")
h_split_param(.formats, names(.formats), f = f)

# $surv
# pt_at_risk event_free_rate
# "xx" "xxx"
#
# $surv_diff
# NULL
Helper function to create a new SMQ variable in ADAE that consists of all adverse events belonging to selected Standardized/Customized queries. The new dataset will only contain records of the adverse events belonging to any of the selected baskets. Remember that na_str must match the pre-processing done with df_explicit_na() to produce the desired output.
h_stack_by_baskets(
  df,
  baskets = grep("^(SMQ|CQ).+NAM$", names(df), value = TRUE),
  smq_varlabel = "Standardized MedDRA Query",
  keys = c("STUDYID", "USUBJID", "ASTDTM", "AEDECOD", "AESEQ"),
  aag_summary = NULL,
  na_str = "<Missing>"
)
df |
( |
baskets |
( |
smq_varlabel |
( |
keys |
( |
aag_summary |
( |
na_str |
( |
A data.frame with variables in keys taken from df and new variable SMQ containing records belonging to the baskets selected via the baskets argument.
adae <- tern_ex_adae[1:20, ] %>% df_explicit_na()
h_stack_by_baskets(df = adae)

aag <- data.frame(
  NAMVAR = c("CQ01NAM", "CQ02NAM", "SMQ01NAM", "SMQ02NAM"),
  REFNAME = c(
    "D.2.1.5.3/A.1.1.1.1 aesi", "X.9.9.9.9/Y.8.8.8.8 aesi",
    "C.1.1.1.3/B.2.2.3.1 aesi", "C.1.1.1.3/B.3.3.3.3 aesi"
  ),
  SCOPE = c("", "", "BROAD", "BROAD"),
  stringsAsFactors = FALSE
)

basket_name <- character(nrow(aag))
cq_pos <- grep("^(CQ).+NAM$", aag$NAMVAR)
smq_pos <- grep("^(SMQ).+NAM$", aag$NAMVAR)
basket_name[cq_pos] <- aag$REFNAME[cq_pos]
basket_name[smq_pos] <- paste0(
  aag$REFNAME[smq_pos], "(", aag$SCOPE[smq_pos], ")"
)

aag_summary <- data.frame(
  basket = aag$NAMVAR,
  basket_name = basket_name,
  stringsAsFactors = TRUE
)

result <- h_stack_by_baskets(df = adae, aag_summary = aag_summary)
all(levels(aag_summary$basket_name) %in% levels(result$SMQ))

h_stack_by_baskets(
  df = adae,
  aag_summary = NULL,
  keys = c("STUDYID", "USUBJID", "AEDECOD", "ARM"),
  baskets = "SMQ01NAM"
)
Helper functions that are used internally for the STEP calculations.
h_step_window(x, control = control_step())

h_step_trt_effect(data, model, variables, x)

h_step_survival_formula(variables, control = control_step())

h_step_survival_est(
  formula,
  data,
  variables,
  x,
  subset = rep(TRUE, nrow(data)),
  control = control_coxph()
)

h_step_rsp_formula(variables, control = c(control_step(), control_logistic()))

h_step_rsp_est(
  formula,
  data,
  variables,
  x,
  subset = rep(TRUE, nrow(data)),
  control = control_logistic()
)
x |
( |
control |
(named |
data |
( |
model |
( |
variables |
(named |
formula |
( |
subset |
( |
h_step_window() returns a list containing the window-selection matrix sel and the interval information matrix interval.
h_step_trt_effect() returns a vector with elements est and se.
h_step_survival_formula() returns a model formula.
h_step_survival_est() returns a matrix of number of observations n, events, log hazard ratio estimates loghr, standard error se, and Wald confidence interval bounds ci_lower and ci_upper. One row is included for each biomarker value in x.
h_step_rsp_formula() returns a model formula.
h_step_rsp_est() returns a matrix of number of observations n, log odds ratio estimates logor, standard error se, and Wald confidence interval bounds ci_lower and ci_upper. One row is included for each biomarker value in x.
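As a reminder of how Wald confidence bounds relate to an estimate and its standard error on the linear predictor scale, here is a small base R sketch. The function wald_ci_sketch is hypothetical and only illustrates the standard formula (estimate plus or minus a normal quantile times the standard error); it is not taken from the tern source.

```r
# Hypothetical helper for illustration; not part of tern.
wald_ci_sketch <- function(est, se, conf_level = 0.95) {
  q <- stats::qnorm((1 + conf_level) / 2)  # two-sided normal quantile
  cbind(
    est = est,
    se = se,
    ci_lower = est - q * se,
    ci_upper = est + q * se
  )
}

# Two made-up log hazard ratio estimates with their standard errors.
res <- wald_ci_sketch(est = c(-0.5, 0.2), se = c(0.25, 0.1))
print(res)

# Exponentiating moves the estimates and bounds to the ratio scale.
print(exp(res[, c("est", "ci_lower", "ci_upper")]))
```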
h_step_window(): Creates the windows for STEP, based on the control settings provided.
h_step_trt_effect(): Calculates the treatment effect estimate on the linear predictor scale and the corresponding standard error from a STEP model fitted on data given the variables specification, for a single biomarker value x. This works for both coxph and glm models, i.e. for calculating log hazard ratio or log odds ratio estimates.
h_step_survival_formula(): Builds the model formula used in survival STEP calculations.
h_step_survival_est(): Estimates the model with formula built based on variables in data for a given subset and control parameters for the Cox regression.
h_step_rsp_formula(): Builds the model formula used in response STEP calculations.
h_step_rsp_est(): Estimates the model with formula built based on variables in data for a given subset and control parameters for the logistic regression.
Helper functions which are documented here separately to not confuse the user when reading about the user-facing functions.
h_surv_to_coxreg_variables(variables, biomarker)

h_coxreg_mult_cont_df(variables, data, control = control_coxreg())

h_tab_surv_one_biomarker(
  df,
  vars,
  time_unit,
  na_str = default_na_str(),
  .indent_mods = 0L,
  ...
)
variables |
(named |
biomarker |
( |
data |
( |
control |
( |
df |
( |
vars |
(
|
time_unit |
( |
na_str |
( |
.indent_mods |
(named |
... |
additional arguments for the lower level functions. |
h_surv_to_coxreg_variables() returns a named list of elements time, event, arm, covariates, and strata.
h_coxreg_mult_cont_df() returns a data.frame containing estimates and statistics for the selected biomarkers.
h_tab_surv_one_biomarker() returns an rtables table object with the given statistics arranged in columns.
h_surv_to_coxreg_variables(): Helps with converting the "survival" function variable list to the "Cox regression" variable list. The reason is that currently there is an inconsistency between the variable names accepted by extract_survival_subgroups() and fit_coxreg_multivar().
h_coxreg_mult_cont_df(): Prepares estimates for number of events, patients and median survival times, as well as hazard ratio estimates, confidence intervals and p-values, for multiple biomarkers in a given single data set. variables corresponds to names of variables found in data, passed as a named list, and requires elements tte, is_event, biomarkers (vector of continuous biomarker variables) and optionally subgroups and strata.
h_tab_surv_one_biomarker(): Prepares a single sub-table given a df_sub containing the results for a single biomarker.
library(dplyr)
library(forcats)

adtte <- tern_ex_adtte

# Save variable labels before data processing steps.
adtte_labels <- formatters::var_labels(adtte, fill = FALSE)

adtte_f <- adtte %>%
  filter(PARAMCD == "OS") %>%
  mutate(
    AVALU = as.character(AVALU),
    is_event = CNSR == 0
  )
labels <- c("AVALU" = adtte_labels[["AVALU"]], "is_event" = "Event Flag")
formatters::var_labels(adtte_f)[names(labels)] <- labels

# This is how the variable list is converted internally.
h_surv_to_coxreg_variables(
  variables = list(
    tte = "AVAL",
    is_event = "EVNT",
    covariates = c("A", "B"),
    strata = "D"
  ),
  biomarker = "AGE"
)

# For a single population, estimate separately the effects
# of two biomarkers.
df <- h_coxreg_mult_cont_df(
  variables = list(
    tte = "AVAL",
    is_event = "is_event",
    biomarkers = c("BMRKR1", "AGE"),
    covariates = "SEX",
    strata = c("STRATA1", "STRATA2")
  ),
  data = adtte_f
)
df

# If the data set is empty, the corresponding rows with missing values are still returned.
h_coxreg_mult_cont_df(
  variables = list(
    tte = "AVAL",
    is_event = "is_event",
    biomarkers = c("BMRKR1", "AGE"),
    covariates = "REGION1",
    strata = c("STRATA1", "STRATA2")
  ),
  data = adtte_f[NULL, ]
)

# Starting from above `df`, zoom in on one biomarker and add required columns.
df1 <- df[1, ]
df1$subgroup <- "All patients"
df1$row_type <- "content"
df1$var <- "ALL"
df1$var_label <- "All patients"
h_tab_surv_one_biomarker(
  df1,
  vars = c("n_tot", "n_tot_events", "median", "hr", "ci", "pval"),
  time_unit = "days"
)
Helper functions that tabulate in a data frame statistics such as median survival time and hazard ratio for population subgroups.
h_survtime_df(tte, is_event, arm)

h_survtime_subgroups_df(
  variables,
  data,
  groups_lists = list(),
  label_all = "All Patients"
)

h_coxph_df(tte, is_event, arm, strata_data = NULL, control = control_coxph())

h_coxph_subgroups_df(
  variables,
  data,
  groups_lists = list(),
  control = control_coxph(),
  label_all = "All Patients"
)
tte |
( |
is_event |
( |
arm |
( |
variables |
(named |
data |
( |
groups_lists |
(named |
label_all |
( |
strata_data |
( |
control |
(
|
Main functionality is to prepare data for use in a layout-creating function.
h_survtime_df() returns a data.frame with columns arm, n, n_events, and median.
h_survtime_subgroups_df() returns a data.frame with columns arm, n, n_events, median, subgroup, var, var_label, and row_type.
h_coxph_df() returns a data.frame with columns arm, n_tot, n_tot_events, hr, lcl, ucl, conf_level, pval and pval_label.
h_coxph_subgroups_df() returns a data.frame with columns arm, n_tot, n_tot_events, hr, lcl, ucl, conf_level, pval, pval_label, subgroup, var, var_label, and row_type.
h_survtime_df(): Helper to prepare a data frame of median survival times by arm.
h_survtime_subgroups_df(): Summarizes median survival times by arm and across subgroups in a data frame. variables corresponds to the names of variables found in data, passed as a named list, and requires elements tte, is_event and arm, and optionally subgroups. groups_lists optionally specifies groupings for subgroups variables.
h_coxph_df(): Helper to prepare a data frame with estimates of the treatment hazard ratio.
h_coxph_subgroups_df(): Summarizes estimates of the treatment hazard ratio across subgroups in a data frame. variables corresponds to the names of variables found in data, passed as a named list, and requires elements tte, is_event and arm, and optionally subgroups and strata. groups_lists optionally specifies groupings for subgroups variables.
library(dplyr)
library(forcats)

adtte <- tern_ex_adtte

# Save variable labels before data processing steps.
adtte_labels <- formatters::var_labels(adtte)

adtte_f <- adtte %>%
  filter(
    PARAMCD == "OS",
    ARM %in% c("B: Placebo", "A: Drug X"),
    SEX %in% c("M", "F")
  ) %>%
  mutate(
    # Reorder levels of ARM to display reference arm before treatment arm.
    ARM = droplevels(fct_relevel(ARM, "B: Placebo")),
    SEX = droplevels(SEX),
    is_event = CNSR == 0
  )
labels <- c("ARM" = adtte_labels[["ARM"]], "SEX" = adtte_labels[["SEX"]], "is_event" = "Event Flag")
formatters::var_labels(adtte_f)[names(labels)] <- labels

# Extract median survival time for one group.
h_survtime_df(
  tte = adtte_f$AVAL,
  is_event = adtte_f$is_event,
  arm = adtte_f$ARM
)

# Extract median survival time for multiple groups.
h_survtime_subgroups_df(
  variables = list(
    tte = "AVAL",
    is_event = "is_event",
    arm = "ARM",
    subgroups = c("SEX", "BMRKR2")
  ),
  data = adtte_f
)

# Define groupings for BMRKR2 levels.
h_survtime_subgroups_df(
  variables = list(
    tte = "AVAL",
    is_event = "is_event",
    arm = "ARM",
    subgroups = c("SEX", "BMRKR2")
  ),
  data = adtte_f,
  groups_lists = list(
    BMRKR2 = list(
      "low" = "LOW",
      "low/medium" = c("LOW", "MEDIUM"),
      "low/medium/high" = c("LOW", "MEDIUM", "HIGH")
    )
  )
)

# Extract hazard ratio for one group.
h_coxph_df(adtte_f$AVAL, adtte_f$is_event, adtte_f$ARM)

# Extract hazard ratio for one group with stratification factor.
h_coxph_df(adtte_f$AVAL, adtte_f$is_event, adtte_f$ARM, strata_data = adtte_f$STRATA1)

# Extract hazard ratio for multiple groups.
h_coxph_subgroups_df(
  variables = list(
    tte = "AVAL",
    is_event = "is_event",
    arm = "ARM",
    subgroups = c("SEX", "BMRKR2")
  ),
  data = adtte_f
)

# Define groupings of BMRKR2 levels.
h_coxph_subgroups_df(
  variables = list(
    tte = "AVAL",
    is_event = "is_event",
    arm = "ARM",
    subgroups = c("SEX", "BMRKR2")
  ),
  data = adtte_f,
  groups_lists = list(
    BMRKR2 = list(
      "low" = "LOW",
      "low/medium" = c("LOW", "MEDIUM"),
      "low/medium/high" = c("LOW", "MEDIUM", "HIGH")
    )
  )
)

# Extract hazard ratio for multiple groups with stratification factors.
h_coxph_subgroups_df(
  variables = list(
    tte = "AVAL",
    is_event = "is_event",
    arm = "ARM",
    subgroups = c("SEX", "BMRKR2"),
    strata = c("STRATA1", "STRATA2")
  ),
  data = adtte_f
)
Please see h_tab_surv_one_biomarker() and h_tab_rsp_one_biomarker(), which use this function, for examples.
This function is a wrapper for rtables::summarize_row_groups().
h_tab_one_biomarker(
  df,
  afuns,
  colvars,
  na_str = default_na_str(),
  .indent_mods = 0L,
  ...
)
df |
( |
afuns |
(named |
colvars |
(named |
na_str |
( |
.indent_mods |
(named |
... |
additional arguments for the lower level functions. |
An rtables table object with statistics in columns.
Create a data.frame of pairwise stratified or unstratified Cox-PH analysis results.
h_tbl_coxph_pairwise(
  df,
  variables,
  ref_group_coxph = NULL,
  control_coxph_pw = control_coxph(),
  annot_coxph_ref_lbls = FALSE
)
df |
( |
variables |
(named
|
ref_group_coxph |
( |
control_coxph_pw |
(
|
annot_coxph_ref_lbls |
( |
A data.frame containing statistics HR, XX% CI (XX taken from control_coxph_pw), and p-value (log-rank).
library(dplyr)

adtte <- tern_ex_adtte %>%
  filter(PARAMCD == "OS") %>%
  mutate(is_event = CNSR == 0)

h_tbl_coxph_pairwise(
  df = adtte,
  variables = list(tte = "AVAL", is_event = "is_event", arm = "ARM"),
  control_coxph_pw = control_coxph(conf_level = 0.9)
)
Transform a survival fit to a table with groups in rows characterized by N, median and confidence interval.
h_tbl_median_surv(fit_km, armval = "All")
fit_km |
( |
armval |
( |
A summary table with statistics N, Median, and XX% CI (XX taken from fit_km).
library(dplyr)
library(survival)

adtte <- tern_ex_adtte %>% filter(PARAMCD == "OS")
fit <- survfit(
  formula = Surv(AVAL, 1 - CNSR) ~ ARMCD,
  data = adtte
)
h_tbl_median_surv(fit_km = fit)
Helper function for s_count_abnormal_lab_worsen_by_baseline() that counts the number and fraction of patients according to the highest post-baseline lab grade variable .var, the baseline lab grade variable baseline_var, and the direction of interest specified in direction_var.
h_worsen_counter(df, id, .var, baseline_var, direction_var)
df |
( |
id |
( |
.var |
( |
baseline_var |
( |
direction_var |
(
|
The counts and fraction of patients whose worst post-baseline lab grades are worse than their baseline grades, for post-baseline worst grades "1", "2", "3", "4" and "Any".
abnormal_by_worst_grade_worsen
library(dplyr)

# The direction variable, GRADDR, is based on metadata
adlb <- tern_ex_adlb %>%
  mutate(
    GRADDR = case_when(
      PARAMCD == "ALT" ~ "B",
      PARAMCD == "CRP" ~ "L",
      PARAMCD == "IGA" ~ "H"
    )
  ) %>%
  filter(SAFFL == "Y" & ONTRTFL == "Y" & GRADDR != "")

df <- h_adlb_worsen(
  adlb,
  worst_flag_low = c("WGRLOFL" = "Y"),
  worst_flag_high = c("WGRHIFL" = "Y"),
  direction_var = "GRADDR"
)

# `h_worsen_counter`
h_worsen_counter(
  df %>% filter(PARAMCD == "CRP" & GRADDR == "Low"),
  id = "USUBJID",
  .var = "ATOXGR",
  baseline_var = "BTOXGR",
  direction_var = "GRADDR"
)
Calculate the positions of ticks on the x-axis. If xticks already exists, it is kept as is. The calculation is based on the same function that ggplot2 relies on, and is required in both the graphic and the patient-at-risk annotation table.
h_xticks(data, xticks = NULL, max_time = NULL)
data |
( |
xticks |
( |
max_time |
( |
A vector of positions to use for x-axis ticks on a ggplot object.
library(dplyr)
library(survival)

data <- tern_ex_adtte %>%
  filter(PARAMCD == "OS") %>%
  survfit(formula = Surv(AVAL, 1 - CNSR) ~ ARMCD, data = .) %>%
  h_data_plot()

h_xticks(data)
h_xticks(data, xticks = seq(0, 3000, 500))
h_xticks(data, xticks = 500)
h_xticks(data, xticks = 500, max_time = 6000)
h_xticks(data, xticks = c(0, 500), max_time = 300)
h_xticks(data, xticks = 500, max_time = 300)
imputation_rule(
  df,
  x_stats,
  stat,
  imp_rule,
  post = FALSE,
  avalcat_var = "AVALCAT1"
)
df |
( |
x_stats |
(named |
stat |
( |
imp_rule |
( |
post |
( |
avalcat_var |
( |
A list containing statistic value (val) and NA level (na_str) that should be displayed according to the specified imputation rule.
analyze_vars_in_cols() where this function can be implemented by setting the imp_rule argument.
set.seed(1)
df <- data.frame(
  AVAL = runif(50, 0, 1),
  AVALCAT1 = sample(c(1, "BLQ"), 50, replace = TRUE)
)
x_stats <- s_summary(df$AVAL)
imputation_rule(df, x_stats, "max", "1/3")
imputation_rule(df, x_stats, "geom_mean", "1/3")
imputation_rule(df, x_stats, "mean", "1/2")
Given a list of statistic labels and a list of control parameters, updates labels with a relevant control specification. For example, if control has element conf_level set to 0.9, the default label for statistic mean_ci will be updated to "Mean 90% CI". Any labels that are supplied via labels_custom will not be updated regardless of control.
labels_use_control(labels_default, control, labels_custom = NULL)
labels_default |
(named |
control |
(named |
labels_custom |
(named |
A named character vector of labels with control specifications applied to relevant labels.
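The confidence-level case described above can be illustrated with a simplified base R sketch. The function labels_conf_level_sketch is hypothetical and handles only the conf_level element of control, by replacing an "XX% CI" fragment in default labels; the real labels_use_control() also covers other control elements and may be implemented differently.

```r
# Hypothetical, simplified illustration; not tern's implementation.
labels_conf_level_sketch <- function(labels_default, control, labels_custom = NULL) {
  # Labels that the user supplied explicitly are never rewritten.
  to_update <- setdiff(names(labels_default), names(labels_custom))
  if ("conf_level" %in% names(control)) {
    new_ci <- paste0(control[["conf_level"]] * 100, "% CI")
    labels_default[to_update] <- sub("[0-9.]+% CI", new_ci, labels_default[to_update])
  }
  if (!is.null(labels_custom)) {
    labels_default[names(labels_custom)] <- labels_custom
  }
  labels_default
}

defaults <- c(mean_ci = "Mean 95% CI", median = "Median")

res <- labels_conf_level_sketch(defaults, control = list(conf_level = 0.9))
res_custom <- labels_conf_level_sketch(
  defaults,
  control = list(conf_level = 0.9),
  labels_custom = c(mean_ci = "My CI")
)

print(res)
print(res_custom)
```

In the second call the custom label for mean_ci is kept as supplied, which mirrors the rule that custom labels are not updated regardless of control.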
control <- list(conf_level = 0.80, quantiles = c(0.1, 0.83), test_mean = 0.57)
get_labels_from_stats(c("mean_ci", "quantiles", "mean_pval")) %>%
  labels_use_control(control = control)
Layout-creating function which creates a multivariate column layout summarizing logistic regression results. This function is a wrapper for rtables::split_cols_by_multivar().
logistic_regression_cols(lyt, conf_level = 0.95)
lyt |
( |
conf_level |
( |
A layout object suitable for passing to further layouting functions. Adding this function to an rtables layout will split the table into columns corresponding to statistics df, estimate, std_error, odds_ratio, ci, and pvalue.
Constructor for content functions to be used in summarize_logistic() to summarize logistic regression results. This function is a wrapper for rtables::summarize_row_groups().
logistic_summary_by_flag( flag_var, na_str = default_na_str(), .indent_mods = NULL )
flag_var |
( |
na_str |
( |
.indent_mods |
(named |
A content function.
Conversion of months to days. This is an approximate calculation because it treats every month as having the average length of 30.4375 days.
month2day(x)
x |
( |
A numeric vector with the time in days.
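The conversion amounts to a single multiplication. The following sketch, with the hypothetical name month2day_sketch, assumes nothing beyond the factor stated above; 30.4375 is the average month length obtained from 365.25 days divided by 12.

```r
# Sketch of the conversion; `month2day_sketch` is a hypothetical name, not the package function.
days_per_month <- 365.25 / 12 # equals 30.4375
month2day_sketch <- function(x) x * days_per_month

month2day_sketch(c(13.25, 8.15, 1, 2.834))
```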
x <- c(13.25, 8.15, 1, 2.834)
month2day(x)
The analyze function estimate_odds_ratio() creates a layout element to compare bivariate responses between two groups by estimating an odds ratio and its confidence interval.
The primary analysis variable specified by vars is the group variable. Additional variables can be included in the analysis via the variables argument, which accepts arm, an arm variable, and strata, a stratification variable.
If more than two arm levels are present, they can be combined into two groups using the groups_list argument.
estimate_odds_ratio(
  lyt,
  vars,
  variables = list(arm = NULL, strata = NULL),
  conf_level = 0.95,
  groups_list = NULL,
  na_str = default_na_str(),
  nested = TRUE,
  method = "exact",
  show_labels = "hidden",
  table_names = vars,
  var_labels = vars,
  .stats = "or_ci",
  .formats = NULL,
  .labels = NULL,
  .indent_mods = NULL
)

s_odds_ratio(
  df,
  .var,
  .ref_group,
  .in_ref_col,
  .df_row,
  variables = list(arm = NULL, strata = NULL),
  conf_level = 0.95,
  groups_list = NULL,
  method = "exact"
)

a_odds_ratio(
  df,
  .var,
  .ref_group,
  .in_ref_col,
  .df_row,
  variables = list(arm = NULL, strata = NULL),
  conf_level = 0.95,
  groups_list = NULL,
  method = "exact"
)
lyt |
( |
vars |
( |
variables |
(named |
conf_level |
( |
groups_list |
(named |
na_str |
( |
nested |
( |
method |
( |
show_labels |
( |
table_names |
( |
var_labels |
( |
.stats |
( Options are: |
.formats |
(named |
.labels |
(named |
.indent_mods |
(named |
df |
( |
.var |
( |
.ref_group |
( |
.in_ref_col |
( |
.df_row |
( |
estimate_odds_ratio() returns a layout object suitable for passing to further layouting functions, or to rtables::build_table(). Adding this function to an rtable layout will add formatted rows containing the statistics from s_odds_ratio() to the table layout.
s_odds_ratio() returns a named list with the statistics or_ci (containing est, lcl, and ucl) and n_tot.
a_odds_ratio() returns the corresponding list with formatted rtables::CellValue().
estimate_odds_ratio(): Layout-creating function which can take statistics function arguments and additional format arguments. This function is a wrapper for rtables::analyze().
s_odds_ratio(): Statistics function which estimates the odds ratio between a treatment and a control. A variables list with arm and strata variable names must be passed if a stratified analysis is required.
a_odds_ratio(): Formatted analysis function which is used as afun in estimate_odds_ratio().
This function uses logistic regression for unstratified analyses, and conditional logistic regression for stratified analyses. The Wald confidence interval is calculated with the specified confidence level.
For stratified analyses, there is currently no implementation for conditional likelihood confidence intervals, therefore the likelihood confidence interval is not available as an option.
When vars contains only responders or only non-responders, no odds ratio estimation is possible, so the returned values will be NA.
Relevant helper function h_odds_ratio().
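To make the unstratified case more concrete, the following base R sketch fits a logistic regression with stats::glm() and derives a Wald confidence interval for the odds ratio. It is an illustration under stated assumptions, not the package's implementation: the object names are hypothetical, and the output only mimics the est, lcl, ucl naming mentioned above.

```r
# Illustrative sketch only: unstratified odds ratio with a Wald interval via glm().
set.seed(12)
dta_or <- data.frame(
  rsp = sample(c(TRUE, FALSE), 100, TRUE),
  # "B" is listed first so that it acts as the reference level.
  grp = factor(rep(c("A", "B"), each = 50), levels = c("B", "A"))
)

fit <- glm(rsp ~ grp, data = dta_or, family = binomial(link = "logit"))

conf_level <- 0.95
log_or <- coef(fit)[["grpA"]] # log odds ratio of A versus B
se <- sqrt(diag(vcov(fit)))[["grpA"]] # its standard error
z <- qnorm(1 - (1 - conf_level) / 2)

# Exponentiate the estimate and the Wald limits to return to the odds ratio scale.
or_ci <- exp(c(est = log_or, lcl = log_or - z * se, ucl = log_or + z * se))
or_ci
```

Working on the log scale and exponentiating at the end keeps the interval limits positive, which is why the interval is not symmetric around the estimate.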
set.seed(12)
dta <- data.frame(
  rsp = sample(c(TRUE, FALSE), 100, TRUE),
  grp = factor(rep(c("A", "B"), each = 50), levels = c("A", "B")),
  strata = factor(sample(c("C", "D"), 100, TRUE))
)

l <- basic_table() %>%
  split_cols_by(var = "grp", ref_group = "B") %>%
  estimate_odds_ratio(vars = "rsp")

build_table(l, df = dta)

# Unstratified analysis.
s_odds_ratio(
  df = subset(dta, grp == "A"),
  .var = "rsp",
  .ref_group = subset(dta, grp == "B"),
  .in_ref_col = FALSE,
  .df_row = dta
)

# Stratified analysis.
s_odds_ratio(
  df = subset(dta, grp == "A"),
  .var = "rsp",
  .ref_group = subset(dta, grp == "B"),
  .in_ref_col = FALSE,
  .df_row = dta,
  variables = list(arm = "grp", strata = "strata")
)

a_odds_ratio(
  df = subset(dta, grp == "A"),
  .var = "rsp",
  .ref_group = subset(dta, grp == "B"),
  .in_ref_col = FALSE,
  .df_row = dta
)
The analysis function estimate_proportion_diff() creates a layout element to estimate the difference in proportion of responders within a studied population. The primary analysis variable, vars, is a logical variable indicating whether a response has occurred for each record. See the method parameter for options of methods to use when constructing the confidence interval of the proportion difference. A stratification variable can be supplied via the strata element of the variables argument.
estimate_proportion_diff(
  lyt,
  vars,
  variables = list(strata = NULL),
  conf_level = 0.95,
  method = c("waldcc", "wald", "cmh", "ha", "newcombe", "newcombecc", "strat_newcombe", "strat_newcombecc"),
  weights_method = "cmh",
  na_str = default_na_str(),
  nested = TRUE,
  ...,
  var_labels = vars,
  show_labels = "hidden",
  table_names = vars,
  .stats = NULL,
  .formats = NULL,
  .labels = NULL,
  .indent_mods = NULL
)

s_proportion_diff(
  df,
  .var,
  .ref_group,
  .in_ref_col,
  variables = list(strata = NULL),
  conf_level = 0.95,
  method = c("waldcc", "wald", "cmh", "ha", "newcombe", "newcombecc", "strat_newcombe", "strat_newcombecc"),
  weights_method = "cmh"
)

a_proportion_diff(
  df,
  .var,
  .ref_group,
  .in_ref_col,
  variables = list(strata = NULL),
  conf_level = 0.95,
  method = c("waldcc", "wald", "cmh", "ha", "newcombe", "newcombecc", "strat_newcombe", "strat_newcombecc"),
  weights_method = "cmh"
)
lyt |
( |
vars |
( |
variables |
(named |
conf_level |
( |
method |
( |
weights_method |
( |
na_str |
( |
nested |
( |
... |
additional arguments for the lower level functions. |
var_labels |
( |
show_labels |
( |
table_names |
( |
.stats |
( Options are: |
.formats |
(named |
.labels |
(named |
.indent_mods |
(named |
df |
( |
.var |
( |
.ref_group |
( |
.in_ref_col |
( |
estimate_proportion_diff() returns a layout object suitable for passing to further layouting functions, or to rtables::build_table(). Adding this function to an rtable layout will add formatted rows containing the statistics from s_proportion_diff() to the table layout.
s_proportion_diff() returns a named list of elements diff and diff_ci.
a_proportion_diff() returns the corresponding list with formatted rtables::CellValue().
estimate_proportion_diff(): Layout-creating function which can take statistics function arguments and additional format arguments. This function is a wrapper for rtables::analyze().
s_proportion_diff(): Statistics function estimating the difference in terms of responder proportion.
a_proportion_diff(): Formatted analysis function which is used as afun in estimate_proportion_diff().
When performing an unstratified analysis, methods "cmh", "strat_newcombe", and "strat_newcombecc" are not permitted.
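As an illustration of what an unstratified estimate looks like, the sketch below computes a difference in responder proportions with a plain Wald interval (no continuity correction) in base R. The function name prop_diff_wald_sketch is hypothetical, and the result is kept on the proportion scale; the scaling and exact formulas used by the package methods are not reproduced here.

```r
# Illustrative sketch only: Wald interval for a difference of two proportions.
prop_diff_wald_sketch <- function(rsp, grp, ref, conf_level = 0.95) {
  is_ref <- grp == ref
  p_ref <- mean(rsp[is_ref])
  p_trt <- mean(rsp[!is_ref])
  n_ref <- sum(is_ref)
  n_trt <- sum(!is_ref)

  diff <- p_trt - p_ref
  # Standard error of the difference under independent binomial sampling.
  se <- sqrt(p_trt * (1 - p_trt) / n_trt + p_ref * (1 - p_ref) / n_ref)
  z <- qnorm((1 + conf_level) / 2)

  list(diff = diff, diff_ci = c(diff - z * se, diff + z * se))
}

rsp <- c(TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE)
grp <- rep(c("A", "B"), each = 4)
res_pd <- prop_diff_wald_sketch(rsp, grp, ref = "B", conf_level = 0.90)
res_pd
```

In this toy data 3 of 4 subjects respond in group A and 1 of 4 in group B, so the estimated difference is 0.5.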
# Simulated data: random response, group, and two stratification factors.
nex <- 100 # Number of example rows
dta <- data.frame(
  "rsp" = sample(c(TRUE, FALSE), nex, TRUE),
  "grp" = sample(c("A", "B"), nex, TRUE),
  "f1" = sample(c("a1", "a2"), nex, TRUE),
  "f2" = sample(c("x", "y", "z"), nex, TRUE),
  stringsAsFactors = TRUE
)

l <- basic_table() %>%
  split_cols_by(var = "grp", ref_group = "B") %>%
  estimate_proportion_diff(
    vars = "rsp",
    conf_level = 0.90,
    method = "ha"
  )

build_table(l, df = dta)

s_proportion_diff(
  df = subset(dta, grp == "A"),
  .var = "rsp",
  .ref_group = subset(dta, grp == "B"),
  .in_ref_col = FALSE,
  conf_level = 0.90,
  method = "ha"
)

# CMH example with strata
s_proportion_diff(
  df = subset(dta, grp == "A"),
  .var = "rsp",
  .ref_group = subset(dta, grp == "B"),
  .in_ref_col = FALSE,
  variables = list(strata = c("f1", "f2")),
  conf_level = 0.90,
  method = "cmh"
)

a_proportion_diff(
  df = subset(dta, grp == "A"),
  .var = "rsp",
  .ref_group = subset(dta, grp == "B"),
  .in_ref_col = FALSE,
  conf_level = 0.90,
  method = "ha"
)
Family of constructor and condition functions to flexibly prune occurrence tables. The condition functions always return whether the row result is higher than the threshold. Since they are of class CombinationFunction() they can be logically combined with other condition functions.
keep_rows(row_condition)

keep_content_rows(content_row_condition)

has_count_in_cols(atleast, ...)

has_count_in_any_col(atleast, ...)

has_fraction_in_cols(atleast, ...)

has_fraction_in_any_col(atleast, ...)

has_fractions_difference(atleast, ...)

has_counts_difference(atleast, ...)
row_condition |
( |
content_row_condition |
( |
atleast |
( |
... |
arguments for row or column access, see |
keep_rows() returns a pruning function that can be used with rtables::prune_table() to prune an rtables table.
keep_content_rows() returns a pruning function that checks the condition on the first content row of leaf tables in the table.
has_count_in_cols() returns a condition function that sums the counts in the specified columns.
has_count_in_any_col() returns a condition function that compares the counts in the specified columns with the threshold.
has_fraction_in_cols() returns a condition function that sums the counts in the specified columns, and computes the fraction by dividing by the total column counts.
has_fraction_in_any_col() returns a condition function that looks at the fractions in the specified columns and checks whether any of them fulfill the threshold.
has_fractions_difference() returns a condition function that extracts the fractions of each specified column, and computes the difference of the minimum and maximum.
has_counts_difference() returns a condition function that extracts the counts of each specified column, and computes the difference of the minimum and maximum.
keep_rows(): Constructor for creating pruning functions based on a row condition function. This removes all analysis rows (TableRow) that should be pruned, i.e., that don't fulfill the row condition. It removes the sub-tree if there are no children left.
keep_content_rows(): Constructor for creating pruning functions based on a condition for the (first) content row in leaf tables. This removes all leaf tables where the first content row does not fulfill the condition. It does not check individual rows. It then proceeds recursively by removing the sub-tree if there are no children left.
has_count_in_cols(): Constructor for creating condition functions on total counts in the specified columns.
has_count_in_any_col(): Constructor for creating condition functions on any of the counts in the specified columns satisfying a threshold.
has_fraction_in_cols(): Constructor for creating condition functions on total fraction in the specified columns.
has_fraction_in_any_col(): Constructor for creating condition functions on any fraction in the specified columns.
has_fractions_difference(): Constructor for creating condition functions that check the difference between the fractions reported in each specified column.
has_counts_difference(): Constructor for creating condition functions that check the difference between the counts reported in each specified column.
Since most table specifications are worded positively, we name our constructor and condition functions positively, too. However, note that the result of keep_rows() says what should be pruned, to conform with the rtables::prune_table() interface.
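The threshold logic behind the condition functions can be sketched without any table objects. The constructors below are hypothetical stand-ins that operate on a plain named vector of per-column counts instead of an rtables row, so they only mirror the arithmetic described above, not the package interface.

```r
# Hypothetical stand-ins working on a named vector of per-column counts.
count_in_cols <- function(atleast) {
  function(counts) sum(counts) >= atleast
}
count_in_any_col <- function(atleast) {
  function(counts) any(counts >= atleast)
}
fraction_in_cols <- function(atleast, col_totals) {
  function(counts) sum(counts) / sum(col_totals) >= atleast
}
counts_difference <- function(atleast) {
  function(counts) diff(range(counts)) >= atleast
}

row_counts <- c(A = 2, B = 0, C = 7)
col_totals <- c(A = 50, B = 50, C = 50)

total_ok <- count_in_cols(5)(row_counts) # total count is 9
any_ok <- count_in_any_col(10)(row_counts) # largest single count is 7
frac_ok <- fraction_in_cols(0.05, col_totals)(row_counts) # 9 / 150 = 0.06
diff_ok <- counts_difference(5)(row_counts) # max minus min is 7

# Conditions can be combined logically, here with a plain `&&`.
keep_row <- count_in_cols(5)(row_counts) && counts_difference(5)(row_counts)
```

Each constructor returns a closure, which is what allows a threshold to be fixed once and the resulting condition to be applied to every row.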
tab <- basic_table() %>%
  split_cols_by("ARM") %>%
  split_rows_by("RACE") %>%
  split_rows_by("STRATA1") %>%
  summarize_row_groups() %>%
  analyze_vars("COUNTRY", .stats = "count_fraction") %>%
  build_table(DM)

# `keep_rows`
is_non_empty <- !CombinationFunction(all_zero_or_na)
prune_table(tab, keep_rows(is_non_empty))

# `keep_content_rows`
more_than_twenty <- has_count_in_cols(atleast = 20L, col_names = names(tab))
prune_table(tab, keep_content_rows(more_than_twenty))

# `has_count_in_cols`
more_than_one <- has_count_in_cols(atleast = 1L, col_names = names(tab))
prune_table(tab, keep_rows(more_than_one))

# `has_count_in_any_col`
any_more_than_one <- has_count_in_any_col(atleast = 1L, col_names = names(tab))
prune_table(tab, keep_rows(any_more_than_one))

# `has_fraction_in_cols`
more_than_five_percent <- has_fraction_in_cols(atleast = 0.05, col_names = names(tab))
prune_table(tab, keep_rows(more_than_five_percent))

# `has_fraction_in_any_col`
any_atleast_five_percent <- has_fraction_in_any_col(atleast = 0.05, col_names = names(tab))
prune_table(tab, keep_rows(any_atleast_five_percent))

# `has_fractions_difference`
more_than_five_percent_diff <- has_fractions_difference(atleast = 0.05, col_names = names(tab))
prune_table(tab, keep_rows(more_than_five_percent_diff))

# `has_counts_difference`
more_than_one_diff <- has_counts_difference(atleast = 1L, col_names = names(tab))
prune_table(tab, keep_rows(more_than_one_diff))
This is a helper function that is used in tests.
reapply_varlabels(x, varlabels, ...)
x |
( |
varlabels |
( |
... |
further parameters to be added to the list. |
x with variable labels reapplied.
The tabulate_rsp_biomarkers() function creates a layout element to tabulate the estimated biomarker effects on a binary response endpoint across subgroups, returning statistics including response rate and odds ratio for each population subgroup. The table is created from df, a list of data frames returned by extract_rsp_biomarkers(), with the statistics to include specified via the vars parameter.
A forest plot can be created from the resulting table using the g_forest() function.
tabulate_rsp_biomarkers( df, vars = c("n_tot", "n_rsp", "prop", "or", "ci", "pval"), na_str = default_na_str(), .indent_mods = 0L )
df |
( |
vars |
(
|
na_str |
( |
.indent_mods |
(named |
These functions create a layout starting from a data frame which contains the required statistics. The tables are then typically used as input for forest plots.
An rtables table summarizing biomarker effects on binary response by subgroup.
In contrast to tabulate_rsp_subgroups() this tabulation function does not start from an input layout lyt. This is because internally the table is created by combining multiple subtables.
h_tab_rsp_one_biomarker() which is used internally, extract_rsp_biomarkers().
library(dplyr)
library(forcats)

adrs <- tern_ex_adrs
adrs_labels <- formatters::var_labels(adrs)

adrs_f <- adrs %>%
  filter(PARAMCD == "BESRSPI") %>%
  mutate(rsp = AVALC == "CR")
formatters::var_labels(adrs_f) <- c(adrs_labels, "Response")

df <- extract_rsp_biomarkers(
  variables = list(
    rsp = "rsp",
    biomarkers = c("BMRKR1", "AGE"),
    covariates = "SEX",
    subgroups = "BMRKR2"
  ),
  data = adrs_f
)

## Table with default columns.
tabulate_rsp_biomarkers(df)

## Table with a manually chosen set of columns: leave out "pval", reorder.
tab <- tabulate_rsp_biomarkers(
  df = df,
  vars = c("n_rsp", "ci", "n_tot", "prop", "or")
)

## Finally produce the forest plot.
g_forest(tab, xlim = c(0.7, 1.4))
Convert rtable objects to ggplot objects
Given an rtables::rtable() object, performs basic conversion to a ggplot2::ggplot() object built using functions from the ggplot2 package. Any table titles and/or footnotes are ignored.
rtable2gg(tbl, fontsize = 12, colwidths = NULL, lbl_col_padding = 0)
tbl |
( |
fontsize |
( |
colwidths |
( |
lbl_col_padding |
( |
A ggplot object.
dta <- data.frame(
  ARM = rep(LETTERS[1:3], rep(6, 3)),
  AVISIT = rep(paste0("V", 1:3), 6),
  AVAL = c(9:1, rep(NA, 9))
)

lyt <- basic_table() %>%
  split_cols_by(var = "ARM") %>%
  split_rows_by(var = "AVISIT") %>%
  analyze_vars(vars = "AVAL")

tbl <- build_table(lyt, df = dta)

rtable2gg(tbl)

rtable2gg(tbl, fontsize = 15, colwidths = c(2, 1, 1, 1))
Statistics function that uses the Bland-Altman method to assess the agreement between two numerical vectors and calculates a variety of statistics.
s_bland_altman(x, y, conf_level = 0.95)
x |
( |
y |
( |
conf_level |
( |
A named list of the following elements:
df
difference_mean
ci_mean
difference_sd
difference_se
upper_agreement_limit
lower_agreement_limit
agreement_limit_se
upper_agreement_limit_ci
lower_agreement_limit_ci
t_value
n
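Several of the listed elements follow directly from the paired differences. The base R sketch below shows one way they could be computed; the function name bland_altman_sketch is hypothetical, and the use of a normal quantile for the limits of agreement is an assumption of this sketch rather than a statement about the package internals.

```r
# Illustrative sketch only: core Bland-Altman quantities from paired measurements.
bland_altman_sketch <- function(x, y, conf_level = 0.95) {
  stopifnot(length(x) == length(y))
  d <- x - y # paired differences
  n <- length(d)
  # Assumption: limits of agreement are based on a normal quantile.
  z <- qnorm(1 - (1 - conf_level) / 2)

  list(
    n = n,
    difference_mean = mean(d),
    difference_sd = sd(d),
    difference_se = sd(d) / sqrt(n),
    lower_agreement_limit = mean(d) - z * sd(d),
    upper_agreement_limit = mean(d) + z * sd(d)
  )
}

x <- seq(1, 60, 5)
y <- seq(5, 50, 4)
res_ba <- bland_altman_sketch(x, y, conf_level = 0.9)
res_ba
```

With these inputs the paired differences run from -4 to 7 in steps of 1, which makes the mean difference easy to verify by hand.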
x <- seq(1, 60, 5)
y <- seq(5, 50, 4)

s_bland_altman(x, y, conf_level = 0.9)
Convert strings to NA
SAS imports missing data as empty strings or whitespace-only strings. This helper function can be used to convert these values to NA.
sas_na(x, empty = TRUE, whitespaces = TRUE)
x |
( |
empty |
( |
whitespaces |
( |
x with "" and/or whitespace-only values substituted by NA, depending on the values of empty and whitespaces.
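The substitution rule can be sketched for character vectors in a few lines of base R. The function name sas_na_sketch is hypothetical, and unlike the documented function this sketch does not treat factors specially.

```r
# Illustrative sketch only: replace empty and whitespace-only strings with NA.
sas_na_sketch <- function(x, empty = TRUE, whitespaces = TRUE) {
  x <- as.character(x)
  if (empty) {
    x[!is.na(x) & x == ""] <- NA_character_
  }
  if (whitespaces) {
    # One or more whitespace characters and nothing else.
    x[!is.na(x) & grepl("^\\s+$", x)] <- NA_character_
  }
  x
}

converted <- sas_na_sketch(c("1", "", " ", "   ", "b"))
kept_blank <- sas_na_sketch(c("1", "", " "), whitespaces = FALSE)
```

Keeping the two flags separate makes it possible to convert only truly empty strings while leaving whitespace-only values untouched, as in kept_blank.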
sas_na(c("1", "", " ", " ", "b"))
sas_na(factor(c("", " ", "b")))

is.na(sas_na(c("1", "", " ", " ", "b")))
Functions to score occurrence table subtables and rows which can be used in the sorting of occurrence tables.
score_occurrences(table_row)

score_occurrences_cols(...)

score_occurrences_subtable(...)

score_occurrences_cont_cols(...)
table_row |
( |
... |
arguments for row or column access, see |
score_occurrences() returns the sum of counts across all columns of a table row.
score_occurrences_cols() returns a function that sums counts across all specified columns of a table row.
score_occurrences_subtable() returns a function that sums counts in each subtable across all specified columns.
score_occurrences_cont_cols() returns a function that sums counts in the first content row in specified columns.
score_occurrences(): Scoring function which sums the counts across all columns. It will fail if anything other than counts is used.
score_occurrences_cols(): Scoring functions can be produced by this constructor to only include specific columns in the scoring. See h_row_counts() for further information.
score_occurrences_subtable(): Scoring functions produced by this constructor can be used on subtables: They sum up all specified column counts in the subtable. This is useful when there is no available content row summing up these counts.
score_occurrences_cont_cols(): Produces a score function for sorting a table by summing the first content row in specified columns. Note that this extends rtables::cont_n_onecol() and rtables::cont_n_allcols().
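The idea of scoring rows by summed counts and then sorting on the score can be shown on a plain count matrix. This base R sketch uses hypothetical row names and made-up counts and does not involve rtables objects; it only mirrors the arithmetic of summing counts over all columns or over a chosen subset.

```r
# Illustrative sketch only: score rows by summed counts, then sort by score.
counts <- matrix(
  c(3, 1, 2,
    10, 12, 4,
    5, 5, 9),
  ncol = 3, byrow = TRUE,
  dimnames = list(
    c("term 1", "term 2", "term 3"),
    c("A: Drug X", "B: Placebo", "C: Combination")
  )
)

# Score over all columns.
score_all <- rowSums(counts)
sorted_all <- counts[order(score_all, decreasing = TRUE), , drop = FALSE]

# Score over selected columns only.
score_ab <- rowSums(counts[, c("A: Drug X", "B: Placebo"), drop = FALSE])
sorted_ab <- counts[order(score_ab, decreasing = TRUE), , drop = FALSE]
```

Restricting the score to a subset of columns can change the ordering, which is the reason a column-specific constructor is useful.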
lyt <- basic_table() %>%
  split_cols_by("ARM") %>%
  add_colcounts() %>%
  analyze_num_patients(
    vars = "USUBJID",
    .stats = c("unique"),
    .labels = c("Total number of patients with at least one event")
  ) %>%
  split_rows_by("AEBODSYS", child_labels = "visible", nested = FALSE) %>%
  summarize_num_patients(
    var = "USUBJID",
    .stats = c("unique", "nonunique"),
    .labels = c(
      "Total number of patients with at least one event",
      "Total number of events"
    )
  ) %>%
  count_occurrences(vars = "AEDECOD")

tbl <- build_table(lyt, tern_ex_adae, alt_counts_df = tern_ex_adsl) %>%
  prune_table()

tbl_sorted <- tbl %>%
  sort_at_path(path = c("AEBODSYS", "*", "AEDECOD"), scorefun = score_occurrences)

tbl_sorted

score_cols_a_and_b <- score_occurrences_cols(col_names = c("A: Drug X", "B: Placebo"))

# Note that this only sorts the AEDECOD inside the AEBODSYS. The AEBODSYS are not sorted.
# That would require a second pass of `sort_at_path`.
tbl_sorted <- tbl %>%
  sort_at_path(path = c("AEBODSYS", "*", "AEDECOD"), scorefun = score_cols_a_and_b)

tbl_sorted

score_subtable_all <- score_occurrences_subtable(col_names = names(tbl))

# Note that this code only sorts the AEBODSYS, not the AEDECOD within AEBODSYS. That
# would require a second pass of `sort_at_path`.
tbl_sorted <- tbl %>%
  sort_at_path(path = c("AEBODSYS"), scorefun = score_subtable_all, decreasing = FALSE)

tbl_sorted
split_cols_by_groups(lyt, var, groups_list = NULL, ref_group = NULL, ...)
lyt |
( |
var |
( |
groups_list |
(named |
ref_group |
( |
... |
additional arguments to |
A layout object suitable for passing to further layouting functions. Adding this function to an rtable layout will add a column split including the given groups to the table layout.
# 1 - Basic use

# Without group combination `split_cols_by_groups` is
# equivalent to [rtables::split_cols_by()].
basic_table() %>%
  split_cols_by_groups("ARM") %>%
  add_colcounts() %>%
  analyze("AGE") %>%
  build_table(DM)

# Add a reference column.
basic_table() %>%
  split_cols_by_groups("ARM", ref_group = "B: Placebo") %>%
  add_colcounts() %>%
  analyze(
    "AGE",
    afun = function(x, .ref_group, .in_ref_col) {
      if (.in_ref_col) {
        in_rows("Diff Mean" = rcell(NULL))
      } else {
        in_rows("Diff Mean" = rcell(mean(x) - mean(.ref_group), format = "xx.xx"))
      }
    }
  ) %>%
  build_table(DM)

# 2 - Adding group specification

# Manual preparation of the groups.
groups <- list(
  "Arms A+B" = c("A: Drug X", "B: Placebo"),
  "Arms A+C" = c("A: Drug X", "C: Combination")
)

# Use of split_cols_by_groups without reference column.
basic_table() %>%
  split_cols_by_groups("ARM", groups) %>%
  add_colcounts() %>%
  analyze("AGE") %>%
  build_table(DM)

# Including differentiated output in the reference column.
basic_table() %>%
  split_cols_by_groups("ARM", groups_list = groups, ref_group = "Arms A+B") %>%
  analyze(
    "AGE",
    afun = function(x, .ref_group, .in_ref_col) {
      if (.in_ref_col) {
        in_rows("Diff. of Averages" = rcell(NULL))
      } else {
        in_rows("Diff. of Averages" = rcell(mean(x) - mean(.ref_group), format = "xx.xx"))
      }
    }
  ) %>%
  build_table(DM)

# 3 - Binary list dividing factor levels into reference and treatment

# `combine_groups` defines reference and treatment.
groups <- combine_groups(
  fct = DM$ARM,
  ref = c("A: Drug X", "B: Placebo")
)
groups

# Use group definition without reference column.
basic_table() %>%
  split_cols_by_groups("ARM", groups_list = groups) %>%
  add_colcounts() %>%
  analyze("AGE") %>%
  build_table(DM)

# Use group definition with reference column (first item of groups).
basic_table() %>%
  split_cols_by_groups("ARM", groups, ref_group = names(groups)[1]) %>%
  add_colcounts() %>%
  analyze(
    "AGE",
    afun = function(x, .ref_group, .in_ref_col) {
      if (.in_ref_col) {
        in_rows("Diff Mean" = rcell(NULL))
      } else {
        in_rows("Diff Mean" = rcell(mean(x) - mean(.ref_group), format = "xx.xx"))
      }
    }
  ) %>%
  build_table(DM)
Stack grobs as a new grob with 1 column and multiple rows layout.
stack_grobs( ..., grobs = list(...), padding = grid::unit(2, "line"), vp = NULL, gp = NULL, name = NULL )
... |
grobs. |
grobs |
( |
padding |
( |
vp |
( |
gp |
( |
name |
( |
A grob.
library(grid)

g1 <- circleGrob(gp = gpar(col = "blue"))
g2 <- circleGrob(gp = gpar(col = "red"))
g3 <- textGrob("TEST TEXT")
grid.newpage()
grid.draw(stack_grobs(g1, g2, g3))

showViewport()

grid.newpage()
pushViewport(viewport(layout = grid.layout(1, 2)))
vp1 <- viewport(layout.pos.row = 1, layout.pos.col = 2)
grid.draw(stack_grobs(g1, g2, g3, vp = vp1, name = "test"))

showViewport()
grid.ls(grobs = TRUE, viewports = TRUE, print = FALSE)
Convenience function for calculating the confidence interval of the mean. It can compute the interval for the arithmetic mean or, with geom_mean = TRUE, for the geometric mean. It can be used as a ggplot helper function for plotting.
stat_mean_ci( x, conf_level = 0.95, na.rm = TRUE, n_min = 2, gg_helper = TRUE, geom_mean = FALSE )
x |
( |
conf_level |
( |
na.rm |
( |
n_min |
( |
gg_helper |
( |
geom_mean |
( |
A named vector of values mean_ci_lwr and mean_ci_upr.
stat_mean_ci(sample(10), gg_helper = FALSE)

p <- ggplot2::ggplot(mtcars, ggplot2::aes(cyl, mpg)) +
  ggplot2::geom_point()

p + ggplot2::stat_summary(
  fun.data = stat_mean_ci,
  geom = "errorbar"
)

p + ggplot2::stat_summary(
  fun.data = stat_mean_ci,
  fun.args = list(conf_level = 0.5),
  geom = "errorbar"
)

p + ggplot2::stat_summary(
  fun.data = stat_mean_ci,
  fun.args = list(conf_level = 0.5, geom_mean = TRUE),
  geom = "errorbar"
)
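To make the calculation concrete, here is a minimal base R sketch of a t-based confidence interval for the mean. The helper name mean_ci_sketch is hypothetical and this is not the package's implementation; it assumes the usual Student t interval and, for the geometric mean, an interval computed on the log scale and back-transformed.

```r
# Hypothetical helper (not tern's code): t-based CI for the mean.
mean_ci_sketch <- function(x, conf_level = 0.95, na.rm = TRUE, geom_mean = FALSE) {
  if (na.rm) x <- x[!is.na(x)]
  # Assumption: the geometric mean CI is obtained on the log scale.
  if (geom_mean) x <- log(x)
  n <- length(x)
  half_width <- stats::qt((1 + conf_level) / 2, df = n - 1) * stats::sd(x) / sqrt(n)
  ci <- mean(x) + c(-1, 1) * half_width
  if (geom_mean) ci <- exp(ci)
  stats::setNames(ci, c("mean_ci_lwr", "mean_ci_upr"))
}

x <- c(4.1, 5.3, 6.0, 5.5, 4.8)
ci <- mean_ci_sketch(x)
ci

# The arithmetic interval agrees with the one reported by stats::t.test().
all.equal(unname(ci), as.numeric(stats::t.test(x)$conf.int))
```

Comparing against stats::t.test() is a quick way to check such a helper before plugging it into a plot.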
Convenience function for calculating the two-sided p-value of the mean.
stat_mean_pval(x, na.rm = TRUE, n_min = 2, test_mean = 0)
x |
( |
na.rm |
( |
n_min |
( |
test_mean |
( |
A p-value.
stat_mean_pval(sample(10))

stat_mean_pval(rnorm(10), test_mean = 0.5)
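The two-sided p-value of the mean can be illustrated with a one-sample t-test computed by hand. The sketch below uses a hypothetical helper, mean_pval_sketch, and assumes a Student t reference distribution; it is not taken from the package source.

```r
# Hypothetical helper (not tern's code): two-sided one-sample t-test p-value.
mean_pval_sketch <- function(x, test_mean = 0, na.rm = TRUE) {
  if (na.rm) x <- x[!is.na(x)]
  n <- length(x)
  t_stat <- (mean(x) - test_mean) / (stats::sd(x) / sqrt(n))
  # Two-sided: double the lower-tail probability of -|t|.
  2 * stats::pt(-abs(t_stat), df = n - 1)
}

x <- c(0.2, 1.1, -0.4, 0.9, 0.3, 1.5)
p <- mean_pval_sketch(x, test_mean = 0.5)
p

# Cross-check against stats::t.test().
all.equal(p, stats::t.test(x, mu = 0.5)$p.value)
```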
Convenience function for calculating the confidence interval of the median. It can be used as a ggplot helper function for plotting.
stat_median_ci(x, conf_level = 0.95, na.rm = TRUE, gg_helper = TRUE)
x |
( |
conf_level |
( |
na.rm |
( |
gg_helper |
( |
This function was adapted from DescTools/versions/0.99.35/source
A named vector of values median_ci_lwr and median_ci_upr.
stat_median_ci(sample(10), gg_helper = FALSE)

p <- ggplot2::ggplot(mtcars, ggplot2::aes(cyl, mpg)) +
  ggplot2::geom_point()
p + ggplot2::stat_summary(
  fun.data = stat_median_ci,
  geom = "errorbar"
)
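One common distribution-free way to obtain a confidence interval for the median uses order statistics chosen from binomial quantiles. The sketch below illustrates that approach with a hypothetical helper, median_ci_sketch; whether the package uses exactly this rule is an assumption here, so treat it as an illustration only.

```r
# Hypothetical helper: order-statistic CI for the median (assumed method).
median_ci_sketch <- function(x, conf_level = 0.95) {
  x <- sort(x[!is.na(x)])
  n <- length(x)
  # Rank of the lower bound, taken from the Binomial(n, 0.5) distribution.
  k <- stats::qbinom((1 - conf_level) / 2, size = n, prob = 0.5)
  if (k < 1) {
    # Too few observations to form an interval at this level.
    ci <- c(NA_real_, NA_real_)
    actual <- NA_real_
  } else {
    ci <- x[c(k, n - k + 1)]
    # Actual coverage of the discrete interval.
    actual <- 1 - 2 * stats::pbinom(k - 1, size = n, prob = 0.5)
  }
  names(ci) <- c("median_ci_lwr", "median_ci_upr")
  attr(ci, "conf_level") <- actual
  ci
}

median_ci_sketch(1:10)
```

Because the interval is built from discrete ranks, its actual coverage generally differs from the nominal conf_level, which is why the sketch stores it as an attribute.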
Function for calculating the proportion (or risk) difference and confidence interval between arm X (reference group) and arm Y. Risk difference is calculated by subtracting cumulative incidence in arm Y from cumulative incidence in arm X.
stat_propdiff_ci( x, y, N_x, N_y, list_names = NULL, conf_level = 0.95, pct = TRUE )
x |
( |
y |
( |
N_x |
( |
N_y |
( |
list_names |
( |
conf_level |
( |
pct |
( |
List of proportion differences and CIs corresponding to each pair of number of occurrences in x and y. Each list element consists of 3 statistics: proportion difference, CI lower bound, and CI upper bound.
Split function add_riskdiff() which, when used as split_fun within rtables::split_cols_by() with the riskdiff argument set to TRUE in subsequent analyze functions, adds a column containing proportion (risk) difference to an rtables layout.
stat_propdiff_ci(
  x = list(0.375), y = list(0.01), N_x = 5, N_y = 5, list_names = "x", conf_level = 0.9
)

stat_propdiff_ci(
  x = list(0.5, 0.75, 1), y = list(0.25, 0.05, 0.5), N_x = 10, N_y = 20, pct = FALSE
)
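The arithmetic behind a single risk difference can be sketched in base R. The helper propdiff_ci_sketch below is hypothetical and assumes a Wald (normal approximation) interval; it follows the description above by subtracting the proportion in arm Y from the proportion in arm X.

```r
# Hypothetical helper (not tern's code): Wald CI for a difference of proportions.
propdiff_ci_sketch <- function(x, y, N_x, N_y, conf_level = 0.95, pct = TRUE) {
  p_x <- x / N_x
  p_y <- y / N_y
  rd <- p_x - p_y
  se <- sqrt(p_x * (1 - p_x) / N_x + p_y * (1 - p_y) / N_y)
  z <- stats::qnorm((1 + conf_level) / 2)
  out <- c(rd = rd, lwr = rd - z * se, upr = rd + z * se)
  # Optionally report on the percentage scale.
  if (pct) out * 100 else out
}

# 5 of 10 patients with the event in arm X, 2 of 20 in arm Y.
propdiff_ci_sketch(x = 5, y = 2, N_x = 10, N_y = 20, pct = FALSE)
propdiff_ci_sketch(x = 5, y = 2, N_x = 10, N_y = 20)
```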
This function wraps the estimation of stratified percentiles under the large-sample approximation. This is necessary only when the proportions for each stratum are unequal.
strata_normal_quantile(vars, weights, conf_level)
vars |
( |
weights |
( |
conf_level |
( |
Stratified quantile.
strata_data <- table(data.frame(
  "f1" = sample(c(TRUE, FALSE), 100, TRUE),
  "f2" = sample(c("x", "y", "z"), 100, TRUE),
  stringsAsFactors = TRUE
))
ns <- colSums(strata_data)
ests <- strata_data["TRUE", ] / ns
vars <- ests * (1 - ests) / ns
weights <- rep(1 / length(ns), length(ns))

strata_normal_quantile(vars, weights, 0.95)
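As a rough illustration of what a stratified normal quantile can look like, the sketch below scales the standard normal quantile by a ratio built from the weighted stratum variances. The helper name strata_quantile_sketch and the exact scaling formula are assumptions made for illustration, not a statement of the package's code.

```r
# Hypothetical helper: normal quantile scaled by weighted stratum variances.
# Assumed formula: sqrt(sum(w^2 * v)) / sum(sqrt(w^2 * v)) * z.
strata_quantile_sketch <- function(vars, weights, conf_level) {
  summands <- weights^2 * vars
  sqrt(sum(summands)) / sum(sqrt(summands)) * stats::qnorm((1 + conf_level) / 2)
}

vars <- c(0.0020, 0.0030, 0.0025)
weights <- rep(1 / 3, 3)
strata_quantile_sketch(vars, weights, 0.95)

# With a single stratum the scaling factor is 1 and the usual quantile is returned.
strata_quantile_sketch(0.002, 1, 0.95)
```

Under this assumed formula the scaling factor never exceeds 1, so the stratified quantile is at most the unstratified normal quantile.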
The analyze function summarize_colvars() uses the statistics function s_summary() to analyze variables that are arranged in columns. The variables to analyze should be specified in the table layout via column splits (see rtables::split_cols_by() and rtables::split_cols_by_multivar()) prior to using summarize_colvars().
The function is a minimal wrapper for rtables::analyze_colvars(), a function typically used to apply different analysis methods in rows for each column variable. To use the analysis methods as column labels, please refer to the analyze_vars_in_cols() function.
summarize_colvars( lyt, ..., na_str = default_na_str(), .stats = c("n", "mean_sd", "median", "range", "count_fraction"), .formats = NULL, .labels = NULL, .indent_mods = NULL )
lyt |
( |
... |
arguments passed to |
na_str |
( |
.stats |
( |
.formats |
(named |
.labels |
(named |
.indent_mods |
(named |
A layout object suitable for passing to further layouting functions, or to rtables::build_table(). Adding this function to an rtable layout will summarize the given variables, arrange the output in columns, and add it to the table layout.
rtables::split_cols_by_multivar() and analyze_colvars_functions.
dta_test <- data.frame(
  USUBJID = rep(1:6, each = 3),
  PARAMCD = rep("lab", 6 * 3),
  AVISIT = rep(paste0("V", 1:3), 6),
  ARM = rep(LETTERS[1:3], rep(6, 3)),
  AVAL = c(9:1, rep(NA, 9)),
  CHG = c(1:9, rep(NA, 9))
)

## Default output within a `rtables` pipeline.
basic_table() %>%
  split_cols_by("ARM") %>%
  split_rows_by("AVISIT") %>%
  split_cols_by_multivar(vars = c("AVAL", "CHG")) %>%
  summarize_colvars() %>%
  build_table(dta_test)

## Selection of statistics, formats and labels also works.
basic_table() %>%
  split_cols_by("ARM") %>%
  split_rows_by("AVISIT") %>%
  split_cols_by_multivar(vars = c("AVAL", "CHG")) %>%
  summarize_colvars(
    .stats = c("n", "mean_sd"),
    .formats = c("mean_sd" = "xx.x, xx.x"),
    .labels = c(n = "n", mean_sd = "Mean, SD")
  ) %>%
  build_table(dta_test)

## Use arguments interpreted by `s_summary`.
basic_table() %>%
  split_cols_by("ARM") %>%
  split_rows_by("AVISIT") %>%
  split_cols_by_multivar(vars = c("AVAL", "CHG")) %>%
  summarize_colvars(na.rm = FALSE) %>%
  build_table(dta_test)
These functions are wrappers for rtables::summarize_row_groups(), applying corresponding tern content functions to add summary rows to a given table layout:
h_tab_one_biomarker() (probably to deprecate)
Additionally, the summarize_coxreg() function utilizes rtables::summarize_row_groups() (in combination with several other rtables functions like rtables::analyze_colvars()) to output a Cox regression summary table.
analyze_functions for functions which are wrappers for rtables::analyze().
analyze_colvars_functions for functions that are wrappers for rtables::analyze_colvars().
Layout-creating function which summarizes a logistic regression model for a binary outcome with categorical and/or continuous covariates in the model statement. For each covariate category (if categorical) or specified value (if continuous), it presents the degrees of freedom, the regression parameter estimate, and the standard error (SE) relative to the reference group or category. It reports odds ratios for each covariate category or specified value with the corresponding Wald confidence intervals by default, but other confidence levels can be specified. It also reports the p-value for the Wald chi-square test of the null hypothesis that the covariate has no effect on the response in a model containing all specified covariates. Optionally, one two-way interaction can be included, with similar output presented for each interaction degree of freedom.
summarize_logistic( lyt, conf_level, drop_and_remove_str = "", .indent_mods = NULL )
lyt |
( |
conf_level |
( |
drop_and_remove_str |
( |
.indent_mods |
(named |
A layout object suitable for passing to further layouting functions, or to rtables::build_table(). Adding this function to an rtable layout will add a logistic regression variable summary to the table layout.
For the formula, the variable names need to be standard data.frame column names without special characters.
library(dplyr)
library(broom)

adrs_f <- tern_ex_adrs %>%
  filter(PARAMCD == "BESRSPI") %>%
  filter(RACE %in% c("ASIAN", "WHITE", "BLACK OR AFRICAN AMERICAN")) %>%
  mutate(
    Response = case_when(AVALC %in% c("PR", "CR") ~ 1, TRUE ~ 0),
    RACE = factor(RACE),
    SEX = factor(SEX)
  )
formatters::var_labels(adrs_f) <- c(formatters::var_labels(tern_ex_adrs), Response = "Response")
mod1 <- fit_logistic(
  data = adrs_f,
  variables = list(
    response = "Response",
    arm = "ARMCD",
    covariates = c("AGE", "RACE")
  )
)
mod2 <- fit_logistic(
  data = adrs_f,
  variables = list(
    response = "Response",
    arm = "ARMCD",
    covariates = c("AGE", "RACE"),
    interaction = "AGE"
  )
)

df <- tidy(mod1, conf_level = 0.99)
df2 <- tidy(mod2, conf_level = 0.99)

# flagging empty strings with "_"
df <- df_explicit_na(df, na_level = "_")
df2 <- df_explicit_na(df2, na_level = "_")

result1 <- basic_table() %>%
  summarize_logistic(
    conf_level = 0.95,
    drop_and_remove_str = "_"
  ) %>%
  build_table(df = df)
result1

result2 <- basic_table() %>%
  summarize_logistic(
    conf_level = 0.95,
    drop_and_remove_str = "_"
  ) %>%
  build_table(df = df2)
result2
The analyze function analyze_num_patients() creates a layout element to count total numbers of unique or non-unique patients. The primary analysis variable vars is used to uniquely identify patients.
The count_by variable can be used to identify non-unique patients such that the number of patients with a unique combination of values in vars and count_by will be returned instead as the nonunique statistic. The required variable can be used to specify a variable required to be non-missing for the record to be included in the counts.
The summarize function summarize_num_patients() performs the same function as analyze_num_patients() except it creates content rows, not data rows, to summarize the current table row/column context and operates on the level of the latest row split or the root of the table if no row splits have occurred.
analyze_num_patients(
  lyt,
  vars,
  required = NULL,
  count_by = NULL,
  unique_count_suffix = TRUE,
  na_str = default_na_str(),
  nested = TRUE,
  .stats = NULL,
  .formats = NULL,
  .labels = c(unique = "Number of patients with at least one event", nonunique = "Number of events"),
  show_labels = c("default", "visible", "hidden"),
  .indent_mods = 0L,
  riskdiff = FALSE,
  ...
)

summarize_num_patients(
  lyt,
  var,
  required = NULL,
  count_by = NULL,
  unique_count_suffix = TRUE,
  na_str = default_na_str(),
  .stats = NULL,
  .formats = NULL,
  .labels = c(unique = "Number of patients with at least one event", nonunique = "Number of events"),
  .indent_mods = 0L,
  riskdiff = FALSE,
  ...
)

s_num_patients(
  x,
  labelstr,
  .N_col,
  count_by = NULL,
  unique_count_suffix = TRUE
)

s_num_patients_content(
  df,
  labelstr = "",
  .N_col,
  .var,
  required = NULL,
  count_by = NULL,
  unique_count_suffix = TRUE
)
lyt |
( |
vars |
( |
required |
( |
count_by |
( |
unique_count_suffix |
( |
na_str |
( |
nested |
( |
.stats |
( Options are: |
.formats |
(named |
.labels |
(named |
show_labels |
( |
.indent_mods |
(named |
riskdiff |
( |
... |
additional arguments for the lower level functions. |
x |
( |
labelstr |
( |
.N_col |
( |
df |
( |
.var , var
|
( |
In general, functions that start with analyze* are expected to work like rtables::analyze(), while functions that start with summarize* are based upon rtables::summarize_row_groups(). The latter provide a value for each dividing split in the row and column space but, being bound to the fundamental splits, their output is repeated by design on every page when pagination is involved.
analyze_num_patients() returns a layout object suitable for passing to further layouting functions, or to rtables::build_table(). Adding this function to an rtable layout will add formatted rows containing the statistics from s_num_patients_content() to the table layout.
summarize_num_patients() returns a layout object suitable for passing to further layouting functions, or to rtables::build_table(). Adding this function to an rtable layout will add formatted rows containing the statistics from s_num_patients_content() to the table layout.
s_num_patients() returns a named list of 3 statistics:
unique: Vector of counts and percentages.
nonunique: Vector of counts.
unique_count: Counts.
s_num_patients_content() returns the same values as s_num_patients().
analyze_num_patients(): Layout-creating function which can take statistics function arguments and additional format arguments. This function is a wrapper for rtables::analyze().
summarize_num_patients(): Layout-creating function which can take statistics function arguments and additional format arguments. This function is a wrapper for rtables::summarize_row_groups().
s_num_patients(): Statistics function which counts the number of unique patients, the corresponding percentage taken with respect to the total number of patients, and the number of non-unique patients.
s_num_patients_content(): Statistics function which counts the number of unique patients in a column (variable), the corresponding percentage taken with respect to the total number of patients, and the number of non-unique patients in the column.
As opposed to summarize_num_patients(), analyze_num_patients() does not repeat the produced rows.
df <- data.frame(
  USUBJID = as.character(c(1, 2, 1, 4, NA, 6, 6, 8, 9)),
  ARM = c("A", "A", "A", "A", "A", "B", "B", "B", "B"),
  AGE = c(10, 15, 10, 17, 8, 11, 11, 19, 17),
  SEX = c("M", "M", "M", "F", "F", "F", "M", "F", "M")
)

# analyze_num_patients
tbl <- basic_table() %>%
  split_cols_by("ARM") %>%
  add_colcounts() %>%
  analyze_num_patients("USUBJID", .stats = c("unique")) %>%
  build_table(df)

tbl

# summarize_num_patients
tbl <- basic_table() %>%
  split_cols_by("ARM") %>%
  split_rows_by("SEX") %>%
  summarize_num_patients("USUBJID", .stats = "unique_count") %>%
  build_table(df)

tbl

# Use the statistics function to count number of unique and nonunique patients.
s_num_patients(x = as.character(c(1, 1, 1, 2, 4, NA)), labelstr = "", .N_col = 6L)
s_num_patients(
  x = as.character(c(1, 1, 1, 2, 4, NA)),
  labelstr = "",
  .N_col = 6L,
  count_by = c(1, 1, 2, 1, 1, 1)
)

# Count number of unique and non-unique patients.

df <- data.frame(
  USUBJID = as.character(c(1, 2, 1, 4, NA)),
  EVENT = as.character(c(10, 15, 10, 17, 8))
)
s_num_patients_content(df, .N_col = 5, .var = "USUBJID")

df_by_event <- data.frame(
  USUBJID = as.character(c(1, 2, 1, 4, NA)),
  EVENT = c(10, 15, 10, 17, 8)
)
s_num_patients_content(df_by_event, .N_col = 5, .var = "USUBJID", count_by = "EVENT")
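The counting rules described above can be sketched in base R. The helper num_patients_sketch is hypothetical and simplified: it assumes missing identifiers are dropped before counting and returns plain numbers rather than the formatted values produced by the package.

```r
# Hypothetical helper (not tern's code): unique and non-unique patient counts.
num_patients_sketch <- function(x, n_col, count_by = NULL) {
  keep <- !is.na(x)  # assumption: records with a missing identifier are ignored
  n_unique <- length(unique(x[keep]))
  n_nonunique <- if (is.null(count_by)) {
    # Without count_by, every non-missing record counts once.
    sum(keep)
  } else {
    # With count_by, count distinct (patient, count_by) combinations.
    nrow(unique(data.frame(id = x[keep], by = count_by[keep])))
  }
  list(
    unique = c(n_unique, if (n_col > 0) n_unique / n_col else 0),
    nonunique = n_nonunique,
    unique_count = n_unique
  )
}

ids <- as.character(c(1, 1, 1, 2, 4, NA))
num_patients_sketch(ids, n_col = 6)
num_patients_sketch(ids, n_col = 6, count_by = c(1, 1, 2, 1, 1, 1))
```

In the second call the first two records share both patient and count_by value, so they collapse into a single combination.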
The tabulate_survival_biomarkers() function creates a layout element to tabulate the estimated effects of multiple continuous biomarker variables on survival across subgroups, returning statistics including median survival time and hazard ratio for each population subgroup. The table is created from df, a list of data frames returned by extract_survival_biomarkers(), with the statistics to include specified via the vars parameter.
A forest plot can be created from the resulting table using the g_forest() function.
tabulate_survival_biomarkers( df, vars = c("n_tot", "n_tot_events", "median", "hr", "ci", "pval"), groups_lists = list(), control = control_coxreg(), label_all = lifecycle::deprecated(), time_unit = NULL, na_str = default_na_str(), .indent_mods = 0L )
df |
( |
vars |
(
|
groups_lists |
(named |
control |
( |
label_all |
|
time_unit |
( |
na_str |
( |
.indent_mods |
(named |
These functions create a layout starting from a data frame which contains the required statistics. The tables are then typically used as input for forest plots.
An rtables table summarizing biomarker effects on survival by subgroup.
tabulate_survival_biomarkers(): Table-creating function which creates a table summarizing biomarker effects on survival by subgroup.
In contrast to tabulate_survival_subgroups() this tabulation function does not start from an input layout lyt. This is because internally the table is created by combining multiple subtables.
h_tab_surv_one_biomarker() which is used internally, extract_survival_biomarkers().
library(dplyr)

adtte <- tern_ex_adtte

# Save variable labels before data processing steps.
adtte_labels <- formatters::var_labels(adtte)

adtte_f <- adtte %>%
  filter(PARAMCD == "OS") %>%
  mutate(
    AVALU = as.character(AVALU),
    is_event = CNSR == 0
  )
labels <- c("AVALU" = adtte_labels[["AVALU"]], "is_event" = "Event Flag")
formatters::var_labels(adtte_f)[names(labels)] <- labels

# Typical analysis of two continuous biomarkers `BMRKR1` and `AGE`,
# in multiple regression models containing one covariate `SEX`,
# as well as one stratification variable `STRATA1`. The subgroups
# are defined by the levels of `BMRKR2`.

df <- extract_survival_biomarkers(
  variables = list(
    tte = "AVAL",
    is_event = "is_event",
    biomarkers = c("BMRKR1", "AGE"),
    strata = "STRATA1",
    covariates = "SEX",
    subgroups = "BMRKR2"
  ),
  label_all = "Total Patients",
  data = adtte_f
)
df

# Here we group the levels of `BMRKR2` manually.
df_grouped <- extract_survival_biomarkers(
  variables = list(
    tte = "AVAL",
    is_event = "is_event",
    biomarkers = c("BMRKR1", "AGE"),
    strata = "STRATA1",
    covariates = "SEX",
    subgroups = "BMRKR2"
  ),
  data = adtte_f,
  groups_lists = list(
    BMRKR2 = list(
      "low" = "LOW",
      "low/medium" = c("LOW", "MEDIUM"),
      "low/medium/high" = c("LOW", "MEDIUM", "HIGH")
    )
  )
)
df_grouped

## Table with default columns.
tabulate_survival_biomarkers(df)

## Table with a manually chosen set of columns: leave out "pval", reorder.
tab <- tabulate_survival_biomarkers(
  df = df,
  vars = c("n_tot_events", "ci", "n_tot", "median", "hr"),
  time_unit = as.character(adtte_f$AVALU[1])
)

## Finally produce the forest plot.
g_forest(tab, xlim = c(0.8, 1.2))
## S3 method for class 'summary.coxph'
tidy(x, ...)

## S3 method for class 'coxreg.univar'
tidy(x, ...)

## S3 method for class 'coxreg.multivar'
tidy(x, ...)
x |
( |
... |
additional arguments for the lower level functions. |
broom::tidy() returns:
For summary.coxph objects, a data.frame with columns: Pr(>|z|), exp(coef), exp(-coef), lower .95, upper .95, level, and n.
For coxreg.univar objects, a data.frame with columns: effect, term, term_label, level, n, hr, lcl, ucl, pval, and ci.
For coxreg.multivar objects, a data.frame with columns: term, pval, term_label, hr, lcl, ucl, level, and ci.
tidy(summary.coxph): Custom tidy method for survival::coxph() summary results. Tidy the survival::coxph() results into a data.frame to extract model results.
tidy(coxreg.univar): Custom tidy method for a univariate Cox regression. Tidy up the result of a Cox regression model fitted by fit_coxreg_univar().
tidy(coxreg.multivar): Custom tidy method for a multivariate Cox regression. Tidy up the result of a Cox regression model fitted by fit_coxreg_multivar().
library(survival)
library(broom)

set.seed(1, kind = "Mersenne-Twister")

dta_bladder <- with(
  data = bladder[bladder$enum < 5, ],
  data.frame(
    time = stop,
    status = event,
    armcd = as.factor(rx),
    covar1 = as.factor(enum),
    covar2 = factor(
      sample(as.factor(enum)),
      levels = 1:4, labels = c("F", "F", "M", "M")
    )
  )
)
labels <- c("armcd" = "ARM", "covar1" = "A Covariate Label", "covar2" = "Sex (F/M)")
formatters::var_labels(dta_bladder)[names(labels)] <- labels
dta_bladder$age <- sample(20:60, size = nrow(dta_bladder), replace = TRUE)

formula <- "survival::Surv(time, status) ~ armcd + covar1"
msum <- summary(coxph(stats::as.formula(formula), data = dta_bladder))
tidy(msum)

## Cox regression: arm + 1 covariate.
mod1 <- fit_coxreg_univar(
  variables = list(
    time = "time", event = "status", arm = "armcd",
    covariates = "covar1"
  ),
  data = dta_bladder,
  control = control_coxreg(conf_level = 0.91)
)

## Cox regression: arm + 1 covariate + interaction, 2 candidate covariates.
mod2 <- fit_coxreg_univar(
  variables = list(
    time = "time", event = "status", arm = "armcd",
    covariates = c("covar1", "covar2")
  ),
  data = dta_bladder,
  control = control_coxreg(conf_level = 0.91, interaction = TRUE)
)

tidy(mod1)
tidy(mod2)

multivar_model <- fit_coxreg_multivar(
  variables = list(
    time = "time", event = "status", arm = "armcd",
    covariates = c("covar1", "covar2")
  ),
  data = dta_bladder
)
broom::tidy(multivar_model)
Helper method (for broom::tidy()) to prepare a data frame from a glm object with binomial family.
## S3 method for class 'glm'
tidy(x, conf_level = 0.95, at = NULL, ...)
x |
( |
conf_level |
( |
at |
( |
... |
additional arguments for the lower level functions. |
A data.frame containing the tidied model.
h_logistic_regression for relevant helper functions.
library(dplyr)
library(broom)

adrs_f <- tern_ex_adrs %>%
  filter(PARAMCD == "BESRSPI") %>%
  filter(RACE %in% c("ASIAN", "WHITE", "BLACK OR AFRICAN AMERICAN")) %>%
  mutate(
    Response = case_when(AVALC %in% c("PR", "CR") ~ 1, TRUE ~ 0),
    RACE = factor(RACE),
    SEX = factor(SEX)
  )
formatters::var_labels(adrs_f) <- c(formatters::var_labels(tern_ex_adrs), Response = "Response")
mod1 <- fit_logistic(
  data = adrs_f,
  variables = list(
    response = "Response",
    arm = "ARMCD",
    covariates = c("AGE", "RACE")
  )
)
mod2 <- fit_logistic(
  data = adrs_f,
  variables = list(
    response = "Response",
    arm = "ARMCD",
    covariates = c("AGE", "RACE"),
    interaction = "AGE"
  )
)

df <- tidy(mod1, conf_level = 0.99)
df2 <- tidy(mod2, conf_level = 0.99)
Tidy the STEP results into a tibble format ready for plotting.
## S3 method for class 'step'
tidy(x, ...)
x |
( |
... |
not used. |
A tibble with one row per STEP subgroup. The estimates and CIs are on the hazard ratio (HR) scale for survival models and on the odds ratio (OR) scale for response models. Additional attributes carry metadata that is also used for plotting.
g_step(), which consumes the result from this function.
library(survival)

lung$sex <- factor(lung$sex)
vars <- list(
  time = "time",
  event = "status",
  arm = "sex",
  biomarker = "age"
)
step_matrix <- fit_survival_step(
  variables = vars,
  data = lung,
  control = c(control_coxph(), control_step(num_points = 10, degree = 2))
)
broom::tidy(step_matrix)
Replicate entries of a vector if required.
to_n(x, n)
x |
( |
n |
( |
x if it already has the required length or is NULL; otherwise, if it is a scalar, a version of it replicated to n entries.
This function fails if x is neither of length n nor a scalar.
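The rules above can be sketched in base R. This is an illustrative re-implementation, not tern's source: the name to_n_sketch and the error message are assumptions made for this example.

```r
# Hypothetical stand-in for to_n(), written only to illustrate the documented rules.
to_n_sketch <- function(x, n) {
  if (is.null(x)) {
    return(NULL) # NULL passes through unchanged
  }
  if (length(x) == n) {
    return(x) # already has the required length
  }
  if (length(x) == 1) {
    return(rep(x, n)) # scalar: replicate to n entries
  }
  stop("x must be NULL, a scalar, or of length n") # assumed error message
}

to_n_sketch("a", 3) # scalar input is replicated
to_n_sketch(1:3, 3) # input of length n is returned as is
to_n_sketch(NULL, 3) # NULL stays NULL
```

Any other length, for example to_n_sketch(1:2, 3), raises an error in this sketch, which mirrors the failure case described above.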
Helper function intended mostly for use within tests. The with_spaces parameter allows testing not only the content but also the indentation and table structure. The print_txt_to_copy parameter instead eases test development by printing well-formatted text that only needs to be copied and pasted into the expected output.
to_string_matrix(
  x,
  widths = NULL,
  max_width = NULL,
  hsep = formatters::default_hsep(),
  with_spaces = TRUE,
  print_txt_to_copy = FALSE
)
x |
( |
widths |
( |
max_width |
( |
hsep |
( |
with_spaces |
( |
print_txt_to_copy |
( |
A matrix of strings. If print_txt_to_copy = TRUE, the well-formatted printout of the table is printed to the console, ready to be copied as an expected value.
tbl <- basic_table() %>%
  split_rows_by("SEX") %>%
  split_cols_by("ARM") %>%
  analyze("AGE") %>%
  build_table(tern_ex_adsl)

to_string_matrix(tbl, widths = ceiling(propose_column_widths(tbl) / 2))
The special term univariate indicates that the model should be fitted individually for every variable included in univariate.
univariate(x)
x |
( |
If provided alongside a pairwise specification, the model y ~ ARM + univariate(SEX, AGE, RACE) leads to the study and comparison of the following models:
y ~ ARM
y ~ ARM + SEX
y ~ ARM + AGE
y ~ ARM + RACE
When used within a model formula, produces univariate models for each variable provided.
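The expansion listed above can be reproduced with base R formulas. The following is a minimal sketch, assuming a hypothetical helper expand_univariate_sketch(); it only illustrates which models result and does not show how tern processes the univariate term internally.

```r
# Hypothetical helper: builds the arm-only model plus one model per covariate.
expand_univariate_sketch <- function(response, arm, covariates) {
  base_model <- stats::reformulate(arm, response = response)
  per_covariate <- lapply(
    covariates,
    function(covariate) stats::reformulate(c(arm, covariate), response = response)
  )
  c(list(base_model), per_covariate)
}

models <- expand_univariate_sketch("y", "ARM", c("SEX", "AGE", "RACE"))

# Deparse each formula to text for easy inspection.
model_text <- vapply(models, function(f) paste(deparse(f), collapse = " "), character(1))
model_text
```

Each covariate enters exactly one model alongside the arm variable, which is what distinguishes this from a single multivariable model containing all covariates at once.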
prop_strat_wilson()
This function wraps the iterative procedure that estimates the weights for each proportional stratum. The procedure minimizes the weighted squared length of the confidence interval.
update_weights_strat_wilson(
  vars,
  strata_qnorm,
  initial_weights,
  n_per_strata,
  max_iterations = 50,
  conf_level = 0.95,
  tol = 0.001
)
vars |
( |
strata_qnorm |
( |
initial_weights |
( |
n_per_strata |
( |
max_iterations |
( |
conf_level |
( |
tol |
( |
A list of 3 elements: n_it, weights, and diff_v.
For references and details see prop_strat_wilson().
vs <- c(0.011, 0.013, 0.012, 0.014, 0.017, 0.018)
sq <- 0.674
ws <- rep(1 / length(vs), length(vs))
ns <- c(22, 18, 17, 17, 14, 12)

update_weights_strat_wilson(vs, sq, ws, ns, 100, 0.95, 0.001)
Collection of useful functions that expand on the core list of functions provided by rtables. See rtables::custom_split_funs and rtables::make_split_fun() for more information on how to make a custom split function. All these functions work with the split_fun argument of rtables::split_rows_by() to modify the way the split happens. For other split functions, consult rtables::split_funcs.
ref_group_position(position = "first")

level_order(order)
position |
( |
order |
( |
ref_group_position() returns a utility function that puts the reference group first, last, or at a given position, and that needs to be assigned to split_fun.
level_order() returns a utility function that changes the original order of the levels, depending on the input order and the split levels.
ref_group_position(): Split function to place the reference group facet at a specific position during the post-processing stage.
level_order(): Split function to change the level order based on an integer vector or a character vector that represents the split variable's factor levels.
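The reordering that these two split functions describe can be illustrated on a plain character vector of levels. This is a base R sketch under stated assumptions: move_ref_sketch() and reorder_levels_sketch() are hypothetical names invented here, they operate on level names rather than on rtables facets, and they are not the tern implementation.

```r
# Hypothetical helper: place the reference level first, last, or at an integer position.
move_ref_sketch <- function(lvls, ref_group, position = "first") {
  others <- setdiff(lvls, ref_group)
  if (identical(position, "first")) {
    c(ref_group, others)
  } else if (identical(position, "last")) {
    c(others, ref_group)
  } else {
    # Integer position: insert the reference level so it ends up at that index.
    append(others, ref_group, after = position - 1)
  }
}

# Hypothetical helper: reorder levels by integer index or by explicit level names.
reorder_levels_sketch <- function(lvls, order) {
  if (is.character(order)) order else lvls[order]
}

arms <- c("ARM A", "ARM B", "ARM C")
move_ref_sketch(arms, "ARM B", "first")
move_ref_sketch(arms, "ARM B", "last")
move_ref_sketch(arms, "ARM B", 2)

reorder_levels_sketch(levels(iris$Species), c(1, 3, 2))
```

In a real layout the same choices are expressed by passing ref_group_position() or level_order() as split_fun, so the table builder applies the ordering to the facets themselves.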
library(dplyr)

dat <- data.frame(
  x = factor(letters[1:5], levels = letters[5:1]),
  y = 1:5
)

# With rtables layout functions
basic_table() %>%
  split_cols_by("x", ref_group = "c", split_fun = ref_group_position("last")) %>%
  analyze("y") %>%
  build_table(dat)

# With tern layout functions
adtte_f <- tern_ex_adtte %>%
  filter(PARAMCD == "OS") %>%
  mutate(
    AVAL = day2month(AVAL),
    is_event = CNSR == 0
  )

basic_table() %>%
  split_cols_by(var = "ARMCD", ref_group = "ARM B", split_fun = ref_group_position("first")) %>%
  add_colcounts() %>%
  surv_time(
    vars = "AVAL",
    var_labels = "Survival Time (Months)",
    is_event = "is_event"
  ) %>%
  build_table(df = adtte_f)

basic_table() %>%
  split_cols_by(var = "ARMCD", ref_group = "ARM B", split_fun = ref_group_position(2)) %>%
  add_colcounts() %>%
  surv_time(
    vars = "AVAL",
    var_labels = "Survival Time (Months)",
    is_event = "is_event"
  ) %>%
  build_table(df = adtte_f)

# level_order --------
# Even if default would bring ref_group first, the original order puts it last
basic_table() %>%
  split_cols_by("Species", split_fun = level_order(c(1, 3, 2))) %>%
  analyze("Sepal.Length") %>%
  build_table(iris)

# character vector
new_order <- level_order(levels(iris$Species)[c(1, 3, 2)])
basic_table() %>%
  split_cols_by("Species", ref_group = "virginica", split_fun = new_order) %>%
  analyze("Sepal.Length") %>%
  build_table(iris)