---
title: "How to use autoslider.core"
date: "`r Sys.Date()`"
output:
    rmarkdown::html_document:
        theme: "spacelab"
        highlight: "kate"
        toc: true
        toc_float: true
author: 
  - Stefan Thoma, Joe Zhu
vignette: >
  %\VignetteIndexEntry{Introduction to `AutoslideR.core`}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
editor_options:
    markdown:
        wrap: 72
---

In this vignette we show the general `autoslider.core` workflow, how you can create functions that produce study-specific outputs, and how you can integrate them into the `autoslider.core` framework.

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

# Requirements

Of course, you need to have the `autoslider.core` package installed, and you need to have data available.
In this example I use example data stored in the `autoslider.core` package. 

The data needs to be stored in a named list where the names should correspond to ADaM data sets.
```{r, echo = FALSE, include = FALSE}
library(autoslider.core)
library(dplyr)
```
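
As a minimal sketch of the expected shape, the list below is built from two toy data frames. The columns and values are hypothetical and only stand in for real ADaM data; what matters is that the list names (`adsl`, `adae`) match the data set names.

```r
# Toy stand-ins for ADaM data sets; columns and values are illustrative only
adsl <- data.frame(USUBJID = c("01", "02"), ITTFL = c("Y", "N"))
adae <- data.frame(USUBJID = c("01", "01"), AESER = c("Y", "N"))

# The list names are the (lower-case) ADaM data set names
datasets <- list(adsl = adsl, adae = adae)
names(datasets)
```

The same names are later used as `target` values in the filter definitions.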


```{r, eval = TRUE, include = FALSE}
# hidden setup
# Install and load the necessary packages
library(yaml)

# Create the YAML content
yaml_content <- '
ITT:
  title: Intent to Treat Population
  condition: ITTFL == "Y"
  target: adsl
  type: slref
SAS:
  title: Secondary Analysis Set
  condition: SASFL == "Y"
  target: adsl
  type: slref
SE:
  title: Safety Evaluable Population
  condition: SAFFL== "Y"
  target: adsl
  type: slref
SER:
  title: Serious Adverse Events
  condition: AESER == "Y"
  target: adae
  type: anl
LBCRP:
  title: CRP Values
  condition: PARAMCD == "CRP"
  target: adlb
  type: slref
LBNOBAS:
  title: Only Visits After Baseline
  condition: ABLFL != "Y" & !(AVISIT %in% c("SCREENING", "BASELINE"))
  target: adlb
  type: slref
'

# Create a temporary YAML file
filters <- tempfile(fileext = ".yaml")

# Write the YAML content to the temporary file
write(yaml_content, file = filters)

# Create the specs entry
specs_entry <- '
- program: t_pop_slide
  titles: Analysis Sets
  footnotes: "Analysis Sets footer"
  paper: L6
  suffix: ITT
- program: t_ds_slide
  titles: Patient Disposition ({filter_titles("adsl")})
  footnotes: "t_ds footnotes"
  paper: L6
  suffix: ITT
- program: t_dm_slide
  titles: Patient Demographics and Baseline Characteristics
  footnotes: "t_dm_slide footnote"
  paper: L6
  suffix: ITT
  args:
    arm: "TRT01A"
    vars: ["SEX", "AGE", "RACE", "ETHNIC", "COUNTRY"]
- program: lbt06
  titles: Patient Disposition ({filter_titles("adsl")})
  footnotes: "t_ds footnotes"
  paper: L6
  suffix: ITT_LBCRP_LBNOBAS
'

# Create a temporary specs entry file
spec_file <- tempfile(fileext = ".yaml")

# Write the specs entry to the temporary file
write(specs_entry, file = spec_file)


# Clean up the temporary files
# file.remove(filters)
# file.remove(spec_file)
```


# Workflow


The folder structure could look something like: 

```
├── programs
│   ├── run_script.R
│   └── R
│       ├── helping_functions.R
│       └── output_functions.R
├── outputs
├── specs.yml
└── filters.yml

The `autoslideR` workflow would be implemented in the `run_script.R` file.
This workflow does not require the files in `R/`. 
However, if custom output-creating functions are implemented, `R/` would be the place to put them.

The `autoslideR` workflow has four main aspects: 

## The specifications `specs.yml` 

This file contains the specifications of all outputs you would like to create.

For each output we define the program name, the titles and footnotes, the paper, the suffix, and, where needed, `args`. The paper value encodes orientation and font size: `P` stands for portrait and `L` for landscape, and the number gives the font size (e.g. `L6`).

It could look something like this:

```
- program: t_pop_slide
  titles: Analysis Sets 
  footnotes: 'Analysis Sets footer'
  paper: L6
- program: t_ds_slide
  titles: Patient Disposition ({filter_titles("adsl")})
  footnotes: 't_ds footnotes'
  paper: L6
  suffix: ITT
- program: t_dm_slide
  titles: Patient Demographics and Baseline Characteristics
  footnotes: 't_dm_slide footnote'
  paper: L6
  suffix: ITT
  args:
    arm: "TRT01A"
    vars: ["SEX", "AGE", "RACE", "ETHNIC", "COUNTRY"]

```
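
To make the `paper` convention concrete, here is a small sketch that splits a value such as `L6` into orientation and font size. `parse_paper()` is a hypothetical helper written for this vignette, not a function of `autoslider.core`, and it only illustrates the naming scheme described above.

```r
# Hypothetical helper: split a `paper` value such as "L6" into its two parts
parse_paper <- function(paper) {
  orientation <- switch(substr(paper, 1, 1),
    P = "portrait",
    L = "landscape",
    stop("paper must start with 'P' or 'L': ", paper)
  )
  # Everything after the first character is read as the font size
  list(orientation = orientation, font_size = as.integer(substring(paper, 2)))
}

parse_paper("L6")
```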

The program name refers to a function that produces an output. 
This could be one of the `_slide` functions in `autoslider.core` or a custom function.

Titles and footnotes are added once the outputs are created.
We refer to that as decorating the outputs.

The suffix names the filters that are applied to the data before it is passed to the function (program). Several filter names can be combined with underscores, as in `ITT_LBCRP_LBNOBAS`.
The filters themselves are specified in the `filters.yml` file.
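
As a rough illustration, a suffix can be thought of as a list of filter names. The sketch below assumes that the names are separated by underscores, as the suffix `ITT_LBCRP_LBNOBAS` used later in this vignette suggests; how the package splits the suffix internally is not shown here.

```r
# Assumption: filter names within a suffix are separated by underscores
suffix <- "ITT_LBCRP_LBNOBAS"
filter_names <- strsplit(suffix, "_", fixed = TRUE)[[1]]
filter_names
```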

## The filters `filters.yml`

In `filters.yml` we specify the names of the filters used across the outputs. 
Each filter has a name (e.g. `ITT`), a title (`Intent to Treat Population`), a filtering condition, and the target dataset the condition is applied to. 
The filter title may be appended to the output title. For the `t_ds_slide` entry above, the titles of all applied filters that target the `adsl` dataset are inserted into the parentheses. 
We would thus expect the title to read: "Patient Disposition (Intent to Treat Population)".


As the example below shows, filters are not limited to populations: there is also a filter for serious adverse events. 
We can thus produce SAE tables by supplying only the serious adverse events to the AE table function. 
This concept also generalizes to `PARAMCD` values.


```
ITT:
  title: Intent to Treat Population
  condition: ITTFL == 'Y'
  target: adsl
  type: slref
SAS:
  title: Secondary Analysis Set
  condition: SASFL == 'Y'
  target: adsl
  type: slref
SE:
  title: Safety Evaluable Population
  condition: SAFFL == 'Y'
  target: adsl
  type: slref
SER:
  title: Serious Adverse Events
  condition: AESER == 'Y'
  target: adae
  type: anl

```
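
The following sketch shows one way to picture what a filter does: its `condition` is an R expression that is evaluated against the `target` data set, and its `title` can be pasted into an output title. The helper `apply_filters_sketch()`, the toy data, and the way titles are joined are assumptions made for illustration; this is not the implementation used by `autoslider.core` or the `filters` package.

```r
# Two of the filters from above, written by hand as an R list
filter_defs <- list(
  ITT = list(title = "Intent to Treat Population", condition = "ITTFL == 'Y'", target = "adsl"),
  SER = list(title = "Serious Adverse Events", condition = "AESER == 'Y'", target = "adae")
)

# Toy data (hypothetical values)
datasets <- list(
  adsl = data.frame(USUBJID = c("01", "02", "03"), ITTFL = c("Y", "N", "Y")),
  adae = data.frame(USUBJID = c("01", "03", "03"), AESER = c("N", "Y", "N"))
)

# Hypothetical helper: apply each named filter to its target data set
apply_filters_sketch <- function(datasets, filter_defs, filter_names) {
  for (nm in filter_names) {
    f <- filter_defs[[nm]]
    df <- datasets[[f$target]]
    # Evaluate the condition string with the data frame columns in scope
    keep <- eval(parse(text = f$condition), envir = df)
    datasets[[f$target]] <- df[keep, , drop = FALSE]
  }
  datasets
}

filtered <- apply_filters_sketch(datasets, filter_defs, c("ITT", "SER"))
nrow(filtered$adsl) # subjects with ITTFL == "Y"
nrow(filtered$adae) # serious adverse events

# Titles of the applied filters that target `adsl`
adsl_titles <- vapply(
  Filter(function(f) f$target == "adsl", filter_defs[c("ITT", "SER")]),
  function(f) f$title,
  character(1)
)
title <- paste0("Patient Disposition (", paste(adsl_titles, collapse = ", "), ")")
title
```

Note that in this sketch each filter only touches its own target data set; a filter on `adsl` does not remove records from `adae`.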


## The functions

You can find an overview of all `autoslider.core` functions [here](link to the manual).
Note that all output-producing functions end with `_slide`, while the prefix (i.e. `t_`, `l_`, `g_`) specifies the type of output (i.e. table, listing, or graph, respectively).
Custom functions are needed if the `autoslider.core` functions do not cover the outputs you need. 
More on that further down.
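
The naming convention can be expressed as a small lookup. `output_type()` is a hypothetical helper used here only to spell out the convention; it is not part of `autoslider.core`.

```r
# Hypothetical helper: derive the output type from the prefix of a function name
output_type <- function(fun_name) {
  # Names are expected to look like t_<output>_slide, l_<output>_slide, g_<output>_slide
  stopifnot(grepl("^[tlg]_.+_slide$", fun_name))
  c(t = "table", l = "listing", g = "graph")[[substr(fun_name, 1, 1)]]
}

output_type("t_dm_slide")
```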

## The backend machinery

A typical workflow could look something like this: 


```{r, eval = FALSE}
# define path to the yml files
spec_file <- "specs.yml"
filters <- "filters.yml"
```

```{r}
library("dplyr")
# load all filters
filters::load_filters(filters, overwrite = TRUE)
# read data
data <- list(
  "adsl" = eg_adsl %>%
    mutate(
      FASFL = SAFFL, # add FASFL for illustrative purpose for t_pop_slide
      # DISTRTFL is needed for t_ds_slide but is missing in example data
      DISTRTFL = sample(c("Y", "N"), size = length(TRT01A), replace = TRUE, prob = c(.1, .9))
    ) %>%
    preprocess_t_ds(), # this preprocessing is required by one of the autoslider.core functions
  "adae" = eg_adae,
  "adtte" = eg_adtte,
  "adrs" = eg_adrs,
  "adlb" = eg_adlb
)


# create outputs based on the specs and the functions
outputs <- spec_file %>%
  read_spec() %>%
  # we can also filter for specific programs, if we don't want to create them all
  filter_spec(., program %in% c(
    "t_ds_slide",
    "t_dm_slide",
    "t_pop_slide"
  )) %>%
  # these filtered specs are now piped into the generate_outputs function.
  # this function also requires the data
  generate_outputs(datasets = data) %>%
  # now we decorate based on the specs, i.e. add footnotes and titles
  decorate_outputs(
    version_label = NULL
  )
```


We can have a look at one of the outputs stored in the `outputs` object:
```{r}
outputs$t_pop_slide_ITT
```

Now we can save it to a slide. 
For this example I store the output in a temporary file; you would likely store it in the `outputs/` folder.
```{r, eval = FALSE}
# Output to slides with template and color theme
outputs %>%
  generate_slides(
    outfile = tempfile(fileext = ".pptx"),
    template = system.file("theme", "basic.pptx", package = "autoslider.core"),
    table_format = autoslider_format
  )
```

# Writing custom functions

Unless your requirements are really specific, the most efficient way to write a study function is to base it off of a template function from the [TLG catalogue](https://insightsengineering.github.io/tlg-catalog/stable/). 

The function you would want to create should take as input a list of datasets and potentially additional arguments. 
Within the function, you should not worry about filtering the data, as this should be taken care of with the `filters.yml` file and the general workflow. 
To work properly with `autoslider.core` your function should return either: 

- A `ggplot2` object for graphs
- An `rtables` object for tables
- An `rlistings` object for listings

That's it!
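
As a minimal sketch of this contract, the function below takes a list of data sets and returns an `rtables` object. The function name `t_age_slide`, its `arm` argument, and the toy data are hypothetical; the example assumes the `rtables` package is installed and uses the native pipe (R 4.1 or later).

```r
library(rtables)

# Hypothetical study function: takes the list of data sets, returns an rtables object
t_age_slide <- function(datasets, arm = "ARM") {
  lyt <- basic_table(show_colcounts = TRUE) |>
    split_cols_by(arm) |>
    analyze("AGE", afun = mean, format = "xx.x")
  # No filtering here: the function simply works on the data it receives
  build_table(lyt, df = datasets$adsl)
}

# Toy data (hypothetical values); ARM is a factor so the column order is explicit
toy <- list(adsl = data.frame(
  USUBJID = sprintf("%02d", 1:4),
  ARM = factor(c("A", "A", "B", "B")),
  AGE = c(30, 40, 50, 60)
))

tbl <- t_age_slide(toy)
tbl
```

Because the function does no filtering of its own, the same code can be reused for any population selected through `filters.yml`.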

Now let's see how this works in practice.

## Create an example function

A function that works within the `autoslideR` workflow should also work on its own. 
This makes it straightforward to develop and test. 

As an example, let's create a function corresponding to a [TLG catalogue](https://insightsengineering.github.io/tlg-catalog/stable/) output. 

We are going to create a table based on [LBT06 Laboratory Abnormalities by Visit and Baseline Status](https://insightsengineering.github.io/tlg-catalog/stable/tables/lab-results/lbt06.html):


```{r}
lbt06 <- function(datasets) {
  # Ensure character variables are converted to factors and empty strings and NAs are explicit missing levels.
  adsl <- datasets$adsl %>% tern::df_explicit_na()
  adlb <- datasets$adlb %>% tern::df_explicit_na()

  # Please note that df_explicit_na has a na_level argument defaulting to "<Missing>",
  # Please don't change the na_level to anything other than NA, empty string or the default "<Missing>".

  adlb_f <- adlb %>%
    dplyr::filter(ABLFL != "Y") %>%
    dplyr::filter(!(AVISIT %in% c("SCREENING", "BASELINE"))) %>%
    dplyr::mutate(AVISIT = droplevels(AVISIT)) %>%
    formatters::var_relabel(AVISIT = "Visit")

  adlb_f_crp <- adlb_f %>% dplyr::filter(PARAMCD == "CRP")

  # Define the split function
  split_fun <- rtables::drop_split_levels

  lyt <- rtables::basic_table(show_colcounts = TRUE) %>%
    rtables::split_cols_by("ARM") %>%
    rtables::split_rows_by("AVISIT",
      split_fun = split_fun, label_pos = "topleft",
      split_label = formatters::obj_label(adlb_f_crp$AVISIT)
    ) %>%
    tern::count_abnormal_by_baseline(
      "ANRIND",
      abnormal = c(Low = "LOW", High = "HIGH"),
      .indent_mods = 4L
    ) %>%
    tern::append_varlabels(adlb_f_crp, "ANRIND", indent = 1L) %>%
    rtables::append_topleft("    Baseline Status")

  result <- rtables::build_table(
    lyt = lyt,
    df = adlb_f_crp,
    alt_counts_df = adsl
  ) %>%
    rtables::trim_rows()

  result
}
```

Let's see if this works:

```{r}
lbt06(data)
```

This works! 

To increase code reusability and to keep filtering control centralised in the `filters.yml` file, I recommend removing most of the filtering from the function, namely the following chunk: 

```{r}
adlb_f <- eg_adlb %>%
  dplyr::filter(ABLFL != "Y") %>%
  dplyr::filter(!(AVISIT %in% c("SCREENING", "BASELINE"))) %>%
  dplyr::mutate(AVISIT = droplevels(AVISIT)) %>%
  formatters::var_relabel(AVISIT = "Visit")

adlb_f_crp <- adlb_f %>% dplyr::filter(PARAMCD == "CRP")
```

For this, I'll add two separate filters into the `filters.yml` file; one to filter the right parameter, and one to take care of the AVISIT and ABLFL. 

```
LBCRP:
  title: CRP Values
  condition: PARAMCD == 'CRP'
  target: adlb
  type: slref
LBNOBAS:
  title: Only Visits After Baseline
  condition: ABLFL != "Y" & !(AVISIT %in% c("SCREENING", "BASELINE"))
  target: adlb
  type: slref

```

And the corresponding specs entry: 
```
- program: lbt06
  titles: Patient Disposition ({filter_titles("adsl")})
  footnotes: 't_ds footnotes'
  paper: L6
  suffix: ITT_LBCRP_LBNOBAS

```

Now we can rewrite the function. 
But keep in mind: 
We are applying a filter to the ADSL data set but create the output from ADLB. 
To forward the filter to ADLB, we semi-join ADLB with the filtered ADSL, which keeps only the ADLB records of subjects that remain in ADSL.

```{r}
lbt06 <- function(datasets) {
  # Ensure character variables are converted to factors and empty strings and NAs are explicit missing levels.
  adsl <- datasets$adsl %>% tern::df_explicit_na()
  adlb <- datasets$adlb %>% tern::df_explicit_na()


  # keep only adlb records of subjects present in the (filtered) adsl
  adlb <- adlb %>% dplyr::semi_join(adsl, by = "USUBJID")

  # Please note that df_explicit_na has a na_level argument defaulting to "<Missing>",
  # Please don't change the na_level to anything other than NA, empty string or the default "<Missing>".

  adlb_f <- adlb %>%
    dplyr::mutate(AVISIT = droplevels(AVISIT)) %>%
    formatters::var_relabel(AVISIT = "Visit")

  # Define the split function
  split_fun <- rtables::drop_split_levels

  lyt <- rtables::basic_table(show_colcounts = TRUE) %>%
    rtables::split_cols_by("ARM") %>%
    rtables::split_rows_by("AVISIT",
      split_fun = split_fun, label_pos = "topleft",
      split_label = formatters::obj_label(adlb_f$AVISIT)
    ) %>%
    tern::count_abnormal_by_baseline(
      "ANRIND",
      abnormal = c(Low = "LOW", High = "HIGH"),
      .indent_mods = 4L
    ) %>%
    tern::append_varlabels(adlb_f, "ANRIND", indent = 1L) %>%
    rtables::append_topleft("    Baseline Status")

  result <- rtables::build_table(
    lyt = lyt,
    df = adlb_f,
    alt_counts_df = adsl
  ) %>%
    rtables::trim_rows()

  result
}
```
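
The semi-join used in the function above can be seen in isolation in the following toy example. The data frames are hypothetical; `adsl_itt` stands for an ADSL that has already been reduced by a population filter.

```r
library(dplyr)

# Subjects that remain after a (hypothetical) ADSL filter
adsl_itt <- data.frame(USUBJID = c("01", "03"))

# Lab records for three subjects
adlb <- data.frame(
  USUBJID = c("01", "02", "02", "03"),
  PARAMCD = c("CRP", "CRP", "ALT", "CRP")
)

# Keep only ADLB rows whose subject is still present in the filtered ADSL;
# unlike an inner join, no ADSL columns are added to the result
adlb_itt <- semi_join(adlb, adsl_itt, by = "USUBJID")
adlb_itt$USUBJID
```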

Let's do a dry run before we integrate this function into the workflow: 

```{r}
# filter data | this step will be performed by the workflow later on

adsl <- eg_adsl
adlb <- eg_adlb

adlb_f <- adlb %>%
  dplyr::filter(ABLFL != "Y") %>%
  dplyr::filter(!(AVISIT %in% c("SCREENING", "BASELINE")))

adlb_f_crp <- adlb_f %>% dplyr::filter(PARAMCD == "CRP")

adsl_f <- adsl %>% dplyr::filter(ITTFL == "Y")
```

```{r}
lbt06(list(adsl = adsl_f, adlb = adlb_f_crp))
```

Looks like it works!

## Integrate it into the general workflow

You have to keep in mind that the function you created must be in the global environment when calling the `generate_outputs` function. 
This is the case for all `autoslider.core` functions, as you attach the `autoslider.core` package (with your `library(autoslider.core)` call), so all (exported) functions of the `autoslider.core` package are available.

If you store your custom function in a separate script, you would need to source that script at some point before calling the function, i.e.: 

```{r, eval = FALSE}
source("programs/R/output_functions.R")
```
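
Before running the workflow, it can be useful to check that every program named in the spec is visible as a function. The check below is a hypothetical pre-flight step and not part of `autoslider.core`; the stand-in `lbt06` and the program name `not_defined_anywhere` are used only for illustration.

```r
# Stand-in for a sourced custom function (the real body would build a table)
lbt06 <- function(datasets) NULL

# Program names as they might appear in the spec file
programs <- c("lbt06", "not_defined_anywhere")

# TRUE for each program name that resolves to a function from here
available <- vapply(
  programs,
  function(p) exists(p, mode = "function"),
  logical(1)
)
available
```

Any name that comes back `FALSE` points to a script that still needs to be sourced or to a typo in the spec.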

Now you just have to make sure the two `.yml` files are correctly specified. 


Set the path to the `.yml` files.
```{r, eval = FALSE}
filters <- "filters.yml"
spec_file <- "specs.yml"
```


Then load the filters and generate the outputs.
```{r}
filters::load_filters(filters, overwrite = TRUE)

outputs <- spec_file %>%
  read_spec() %>%
  generate_outputs(data) %>%
  decorate_outputs()

outputs$lbt06_ITT_LBCRP_LBNOBAS
```


Once this works, we can finally generate the slides. 

```{r}
filepath <- tempfile(fileext = ".pptx")
generate_slides(outputs, outfile = filepath)
```

Of course, you would not use a temporary file, and you might want to use a custom `.pptx` template for your slides. 


### Customizing the Output
You can customize the name of the function and the path where it's saved using the `function_name` and `save_path` parameters:

```{r}
#| eval: FALSE
use_template(
  template = "listing",
  function_name = "l_custom_slide",
  save_path = "./my_directory/l_custom_slide.R"
)
```

This creates a new function named `l_custom_slide`, saved as `l_custom_slide.R` in the `my_directory` directory.
In {autoslideR} functions are named in the format `[t,l,g]_[output]_slide`, and the file is typically named `[t,l,g]_[output]_slide.R`. We encourage users to follow this convention. 
A `dm` table function would thus be called `t_dm_slide` and stored in the file `t_dm_slide.R`.

### Overwriting Existing Files
By default, `use_template` will not overwrite existing files. If you want to overwrite an existing file, you can set `overwrite = TRUE`:

```{r}
#| eval: FALSE
use_template(
  template = "listing",
  function_name = "l_custom_slide",
  save_path = "./my_directory/l_custom_slide.R",
  overwrite = TRUE
)
```
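
The idea behind such an overwrite guard can be sketched in a few lines. `write_if_allowed()` is a hypothetical helper and not the code of `use_template`; in particular, whether `use_template` raises an error or only prints a message for an existing file is an assumption here.

```r
# Hypothetical helper illustrating an overwrite guard
write_if_allowed <- function(lines, path, overwrite = FALSE) {
  if (file.exists(path) && !overwrite) {
    stop("File already exists, set overwrite = TRUE to replace it: ", path)
  }
  # Create the target directory if needed, then write the file
  dir.create(dirname(path), recursive = TRUE, showWarnings = FALSE)
  writeLines(lines, path)
  invisible(path)
}

path <- file.path(tempdir(), "my_directory", "l_custom_slide.R")

# First write: overwrite = TRUE so the example can be rerun safely
write_if_allowed("l_custom_slide <- function(datasets) NULL", path, overwrite = TRUE)

# Second write without overwrite = TRUE is refused and the file is left untouched
second_try <- try(write_if_allowed("# new content", path), silent = TRUE)
inherits(second_try, "try-error")
```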

### Opening the File
If you want to open the file immediately after it's created, you can set `open = TRUE`. This is the default behavior when running in an interactive session:

```{r}
#| eval: FALSE
use_template(
  template = "listing",
  function_name = "l_custom_slide",
  save_path = "./my_directory/l_custom_slide.R",
  open = TRUE
)
```


### Conclusion
The `use_template` function is a convenient starting point for creating new autoslideR-compatible output functions. By customizing the function name, save path, and other options, you can create output functions that fit your specific needs.