Generate synthetic data relationally linked to existing data

gen_reljoin_table(
  joinrec,
  tblrec,
  miss_recipe = NULL,
  db,
  keep = NA_character_
)

Arguments

joinrec

tibble. Recipe for synthesizing the core/seed data based on a foreign key present in an existing table within db.

tblrec

tibble. Recipe for generating the remainder of the new table, via gen_table_data, building on initial table generated using joinrec.

miss_recipe

tibble or NULL. A missingness recipe, if desired, to be applied after data generation via inject_nas.

db

list. A named list of existing tibbles/data.frames. The names will be used to resolve foreign table references in joinrec.
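As a rough illustration of the shape db is expected to have, the sketch below builds a named list of data frames and looks one up by name. The table names (ADSL, ADVS) and columns are hypothetical, and how joinrec actually encodes the foreign table reference is not shown here.

```r
# Hypothetical table names and columns, for illustration only.
db <- list(
  ADSL = data.frame(USUBJID = c("PT-01", "PT-02"), AGE = c(54, 61)),
  ADVS = data.frame(USUBJID = c("PT-01", "PT-01"), VSTEST = c("SBP", "DBP"))
)

# The list names are what a foreign table reference would be matched against.
names(db)

# Looking up a table by name, as a join recipe would need to do.
foreign_tbl <- db[["ADSL"]]
```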

keep

TODO

Value

The newly synthesized data table.

Details

In relational database terms, this function synthesizes data for a new table whose foreign key refers to a table already present in db. Typically the new table will not have the same dimensions as the foreign table (in that case the new data could simply be added as columns to the existing table). Instead, a given foreign-key value may appear in multiple rows, may not appear at all, or both cases may occur across the table. A concrete example is Adverse Events mapped to patients (USUBJID in CDISC terms): some patients will have multiple adverse events, while many will have none at all.

This is done via 3 steps:

1. Applying the relational join recipe. The "relational join recipe" step should be considered primarily as the mechanism for defining the dimensions of the new data table.

2. The main data synthesis step, which is done by applying the tblrec recipe on the scaffolding provided by the newly dimensioned table generated in step 1.

3. Injecting missingness (optional) using miss_recipe, applied via inject_nas.
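The three steps above can be sketched in plain base R. This is only a conceptual illustration and does not use the package's recipe format or call gen_reljoin_table; the table and column names (adsl, adae, AESEQ, AESEV) and the fixed per-patient event counts are assumptions made for the example. In practice the counts would usually be drawn at random.

```r
# Illustrative only: plain base R, not the package's recipe format.
set.seed(1)

# Existing "foreign" table, one row per patient.
adsl <- data.frame(USUBJID = sprintf("PT-%02d", 1:5), stringsAsFactors = FALSE)

# Step 1: define the dimensions of the new table by choosing how many
# rows each foreign-key value gets (0 means the patient has no events).
n_events <- c(2L, 0L, 3L, 0L, 1L)
adae <- data.frame(
  USUBJID = rep(adsl$USUBJID, times = n_events),
  stringsAsFactors = FALSE
)

# Step 2: synthesize the remaining columns on that scaffold.
adae$AESEQ <- ave(seq_len(nrow(adae)), adae$USUBJID, FUN = seq_along)
adae$AESEV <- sample(c("MILD", "MODERATE", "SEVERE"), nrow(adae), replace = TRUE)

# Step 3 (optional): inject missingness into one randomly chosen cell.
adae$AESEV[sample(nrow(adae), 1)] <- NA

# Rows per patient, including patients with no events at all.
table(factor(adae$USUBJID, levels = adsl$USUBJID))
```

The resulting table has six rows for five patients: two patients contribute no rows, which is the situation the relational join step exists to model.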

See also