The teal.data package specifies the data format used in teal applications. The teal_data class inherits from qenv and is meant to be used for reproducibility purposes.

To create an object of class teal_data, use the teal_data function. teal_data has a number of methods to manage relevant information in private class slots.
library(teal.data)
# create teal_data object
my_data <- teal_data()
# run code within teal_data to create data objects
my_data <- within(
  my_data,
  {
    data1 <- data.frame(id = 1:10, x = 11:20)
    data2 <- data.frame(id = 1:10, x = 21:30)
  }
)
# get objects stored in teal_data
my_data[["data1"]]
my_data[["data2"]]
# get reproducible code
get_code(my_data)
# get or set datanames
datanames(my_data) <- c("data1", "data2")
datanames(my_data)
# print
print(my_data)
teal_data characteristics

A teal_data object keeps the following information:

env - an environment containing data.
code - a string containing code to reproduce env (details in reproducibility).
datanames - a character vector listing objects of interest to teal modules (details in this teal vignette).
join_keys - a join_keys object defining relationships between datasets (details in Join Keys).

The primary function of teal_data is to provide reproducibility of data. We recommend initializing an empty teal_data object, which marks the object as verified, and then creating datasets by evaluating code in the object with within or eval_code. Read more in teal_data Reproducibility.
my_data <- teal_data()
my_data <- within(my_data, data <- data.frame(x = 11:20))
my_data <- within(my_data, data$id <- seq_len(nrow(data)))
my_data # is verified
## ✅︎ verified teal_data object
## <environment: 0x55cd3c14a458> [L]
## Parent: <environment: package:teal.data>
## Bindings:
## • data: <df[,2]> [L]
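The same result can be reached with eval_code instead of within. The sketch below is a minimal illustration and not taken from the package documentation: it assumes eval_code is provided by teal.code (attached explicitly here to be safe) and that it accepts the code as a character string.

```r
library(teal.data)
library(teal.code) # assumption: eval_code() comes from teal.code

# Start from an empty (verified) teal_data object.
my_data <- teal_data()

# eval_code() takes the code as a string instead of a braced expression.
my_data <- eval_code(my_data, "data <- data.frame(x = 11:20)")
my_data <- eval_code(my_data, "data$id <- seq_len(nrow(data))")

# Both the resulting object and the code that produced it are kept.
my_data[["data"]]
get_code(my_data)
```

Passing code as a string can be convenient when the code is generated programmatically, while within reads more naturally for code written by hand.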
The teal_data class supports relational data. Relationships between datasets can be described by joining keys and stored in a teal_data object. These relationships can be read or set with the join_keys function. See more in join_keys.
my_data <- teal_data()
my_data <- within(my_data, {
  data <- data.frame(id = 1:10, x = 11:20)
  child <- data.frame(id = 1:20, data_id = c(1:10, 1:10), y = 21:30)
})
join_keys(my_data) <- join_keys(
  join_key("data", "data", key = "id"),
  join_key("child", "child", key = "id"),
  join_key("child", "data", key = c("data_id" = "id"))
)
join_keys(my_data)
## A join_keys object containing foreign keys between 2 datasets:
## child: [id]
##   <-- data: [id]
## data: [id]
##   --> child: [data_id]
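Once keys are set, the key for a single pair of datasets can be looked up. The following is a hedged sketch: it assumes the join_keys object supports two-index subsetting with dataset names, which may depend on the installed version of teal.data.

```r
library(teal.data)

# Rebuild the example object with a parent ("data") and a child dataset.
my_data <- teal_data()
my_data <- within(my_data, {
  data <- data.frame(id = 1:10, x = 11:20)
  child <- data.frame(id = 1:20, data_id = c(1:10, 1:10), y = 21:30)
})
join_keys(my_data) <- join_keys(
  join_key("data", "data", key = "id"),
  join_key("child", "child", key = "id"),
  join_key("child", "data", key = c("data_id" = "id"))
)

# Assumption: `[` with two dataset names returns the key linking them.
child_to_data <- join_keys(my_data)["child", "data"]
child_to_data
```

This kind of lookup is useful when a merge between two datasets has to be built from the stored relationship instead of hard-coded column names.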