This function aligns a given dataset (data) with a reference dataset
(missingReference). It ensures that the structure, column names, and factor
levels in data match those of missingReference. If necessary, missing columns
are initialized with NA, and factor levels are adjusted to match the
reference. Additionally, it imputes missing values based on the reference and
manages flag variables for categorical or numerical columns.
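The steps described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the helper name alignToReferenceSketch is hypothetical, the "_FLAG" suffix convention is taken from the column names used in the Examples, and the flag rule (a missing flag becomes 0, an imputed value sets its flag to 1) is inferred from the example output rather than from the source code.

```r
# Hypothetical helper: an illustrative sketch, not the package's implementation.
alignToReferenceSketch <- function(data, reference) {
  out <- data
  for (col in names(reference)) {
    # Columns absent from `data` start out as NA.
    if (!col %in% names(out)) {
      out[[col]] <- rep(NA, nrow(out))
    }
    # Factor columns take the reference levels; values outside them become NA.
    if (is.factor(reference[[col]])) {
      out[[col]] <- factor(as.character(out[[col]]),
                           levels = levels(reference[[col]]))
    }
  }
  # Keep only the reference columns, in the reference order.
  out <- out[, names(reference), drop = FALSE]

  # Assumption: flag columns are recognised by the "_FLAG" suffix.
  flagCols <- grep("_FLAG$", names(reference), value = TRUE)
  valueCols <- setdiff(names(reference), flagCols)

  for (col in valueCols) {
    missing <- is.na(out[[col]])
    # Impute missing values from the first row of the reference.
    fill <- reference[[col]][1]
    if (is.factor(fill)) fill <- as.character(fill)
    out[[col]][missing] <- fill

    flagCol <- paste0(col, "_FLAG")
    if (flagCol %in% flagCols) {
      flag <- out[[flagCol]]
      flag[is.na(flag)] <- 0  # assumed rule: an unknown flag counts as "not imputed"
      flag[missing] <- 1      # assumed rule: an imputed value sets its flag
      out[[flagCol]] <- flag
    }
  }
  out
}

# Usage: X2_FLAG is absent from `toy`, so it is created and then filled in.
toy <- data.frame(X2 = c(5, NA))
ref <- data.frame(X2 = 0, X2_FLAG = 1)
alignToReferenceSketch(toy, ref)
```

The sketch aligns structure first and imputes second, so that values dropped by the level adjustment are treated like any other missing value.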
Value
A data frame where the structure, column names, and factor levels of data are
aligned with missingReference. Missing values in data are imputed based on the
first row of missingReference, and flag variables are updated accordingly.
Examples
data <- data.frame(
  X1_FLAG = c(0, 0, 0),
  X1 = factor(c(NA, "C", "B"), levels = LETTERS[2:3]),
  X2_FLAG = c(NA, 0, 1),
  X2 = c(2, NA, 3)
)
missingReference <- data.frame(
  X1_FLAG = 1,
  X1 = factor("A", levels = LETTERS[1:2]),
  X2 = 1,
  X2_FLAG = 1
)
getDataInShape(data, missingReference)
#>   X1_FLAG X1 X2 X2_FLAG
#> 1       1  A  2       0
#> 2       1  A  1       1
#> 3       0  B  3       1