Shift Layers

Shift tables are a special kind of frequency table - but what they count are changes in state. This is most common when looking at laboratory ranges, where you may be interested in seeing how a subject’s results related to normal ranges. The ‘change in state’ would refer to how that subject’s results were at baseline versus different points of measure. Shift tables allow you to see the distribution of how subjects move between normal ranges, and if the population is improving or worsening as the study progresses.

While shift tables are very similar to a normal frequency table, there’s more nuance here, and thus we decided to create group_shift(). This function is largely an abstraction of a count layer, and in fact re-uses a good deal of the same underlying code. But we handle some of the complexity for you to make the interface easy to use and the behavior similar to that of the group_count() and group_desc() APIs. Given that shift tables are built on count layers, many of functions that work with count layers behave in the same way when using shift layers. However, the following cannot be used in shift layers:

  • Functions related to nested counts, including set_nest_count(), set_outer_sort_position()
  • Functions related to total rows and missing rows, including set_missing_count(), add_total_row(), set_total_row_label()
  • Risk difference, including add_risk_diff()
  • and finally, result based sorting methods, including set_order_count_method(), set_ordering_cols(), set_result_order_var()

One thing to note - the group_shift() API is intended to be used on shift tables where one group is presented in rows and the other group in columns. Occasionally, shift tables will have a row based approach that shows “Low to High”, “Normal to High”, etc. For those situations, group_count() will do just fine.

A Basic Example

Let’s look at an example.

tplyr_table(tplyr_adlb, TRTA, where=PARAMCD == "CK") %>%
  add_layer(
    group_shift(vars(row = BNRIND, column = ANRIND), by = vars(PARAM, VISIT))
  ) %>%
  build() %>%
  head(20) %>%
  kable()
row_label1 row_label2 row_label3 var1_Placebo_L var1_Placebo_N var1_Placebo_H var1_Xanomeline High Dose_L var1_Xanomeline High Dose_N var1_Xanomeline High Dose_H var1_Xanomeline Low Dose_L var1_Xanomeline Low Dose_N var1_Xanomeline Low Dose_H ord_layer_index ord_layer_1 ord_layer_2 ord_layer_3
Creatine Kinase (U/L) WEEK 12 L 0 0 0 0 0 0 0 0 0 1 35 1 1
Creatine Kinase (U/L) WEEK 12 N 0 11 1 0 6 1 0 6 1 1 35 1 2
Creatine Kinase (U/L) WEEK 12 H 0 0 0 0 0 1 0 0 0 1 35 1 3
Creatine Kinase (U/L) WEEK 24 L 0 0 0 0 0 0 0 0 0 1 35 2 1
Creatine Kinase (U/L) WEEK 24 N 0 10 2 0 2 0 0 2 0 1 35 2 2
Creatine Kinase (U/L) WEEK 24 H 0 0 0 0 0 0 0 0 0 1 35 2 3
Creatine Kinase (U/L) WEEK 8 L 0 0 0 0 0 0 0 0 0 1 35 3 1
Creatine Kinase (U/L) WEEK 8 N 0 6 1 0 9 1 0 6 0 1 35 3 2
Creatine Kinase (U/L) WEEK 8 H 0 0 0 0 0 0 0 0 0 1 35 3 3

First, let’s look at the differences in the shift API. Shift layers must take a row and a column variable, as the layer is designed to create a box for you that explains the changes in state. The row variable will typically be your “from” variable, and the column variable will typically be your “to” variable. Behind the scenes, Tplyr breaks this down for you to properly count and present the data.

For the most part, the last example gets us where we want to go - but there’s still some that’s left to be desired. It doesn’t look like there are any ‘L’ values for BNRIND in the dataset so we are not getting and rows containing ‘L’. Let’s see if we can fix that by dummying in the possible values.

Filling Missing Groups Using Factors

tplyr_adlb$ANRIND <- factor(tplyr_adlb$ANRIND, levels=c("L", "N", "H"))
tplyr_adlb$BNRIND <- factor(tplyr_adlb$BNRIND, levels=c("L", "N", "H"))
tplyr_table(tplyr_adlb, TRTA, where=PARAMCD == "CK") %>%
  add_layer(
    group_shift(vars(row = BNRIND, column = ANRIND), by = vars(PARAM, VISIT))
  ) %>%
  build() %>%
  head(20) %>%
  kable()
row_label1 row_label2 row_label3 var1_Placebo_L var1_Placebo_N var1_Placebo_H var1_Xanomeline High Dose_L var1_Xanomeline High Dose_N var1_Xanomeline High Dose_H var1_Xanomeline Low Dose_L var1_Xanomeline Low Dose_N var1_Xanomeline Low Dose_H ord_layer_index ord_layer_1 ord_layer_2 ord_layer_3
Creatine Kinase (U/L) WEEK 12 L 0 0 0 0 0 0 0 0 0 1 35 1 1
Creatine Kinase (U/L) WEEK 12 N 0 11 1 0 6 1 0 6 1 1 35 1 2
Creatine Kinase (U/L) WEEK 12 H 0 0 0 0 0 1 0 0 0 1 35 1 3
Creatine Kinase (U/L) WEEK 24 L 0 0 0 0 0 0 0 0 0 1 35 2 1
Creatine Kinase (U/L) WEEK 24 N 0 10 2 0 2 0 0 2 0 1 35 2 2
Creatine Kinase (U/L) WEEK 24 H 0 0 0 0 0 0 0 0 0 1 35 2 3
Creatine Kinase (U/L) WEEK 8 L 0 0 0 0 0 0 0 0 0 1 35 3 1
Creatine Kinase (U/L) WEEK 8 N 0 6 1 0 9 1 0 6 0 1 35 3 2
Creatine Kinase (U/L) WEEK 8 H 0 0 0 0 0 0 0 0 0 1 35 3 3

There we go. This is another situation where using factors in R let’s us dummy values within the dataset. Furthermore, since factors are ordered, it automatically corrected the sort order of the row labels too.

Where to go from here

There’s much more to learn! Check out the sorting vignette for more information on sorting. Additionally, check out our vignette on denominators to understand controlling more of the nuance in shift tables.