This function calculates the chi-squared statistic for each column of datX
against the response variable response. It supports both numerical and
categorical predictors in datX. For numerical variables, it automatically
discretizes them into factor levels based on the mean and standard
deviation, using different splitting criteria depending on the sample size.
Value
A vector of chi-squared statistics, one for each predictor variable
in datX. For numerical variables, the chi-squared statistic is computed
after binning the variable.
Details
For each variable in datX, the function first checks if the
variable is numerical. If so, it is discretized into factor levels using
either two or three split points, depending on the sample size and the
number of levels in the response. Missing values are handled by assigning
them to a new factor level.
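
The binning step might look roughly like the sketch below. It is an illustration only: the helper name discretize_numeric, the min_per_level threshold, and the exact split points (multiples of sqrt(3) times the standard deviation around the mean) are assumptions made for the example and are not confirmed by this documentation.

```r
# Hypothetical helper (not part of the documented function): bin a numeric
# vector at split points built from its mean and standard deviation, and
# place missing values in their own factor level.
discretize_numeric <- function(x, response, min_per_level = 20) {
  m <- mean(x, na.rm = TRUE)
  s <- sd(x, na.rm = TRUE)
  n <- sum(!is.na(x))
  k <- nlevels(factor(response))

  # Assumed rule: three split points for larger samples, two otherwise.
  if (n >= min_per_level * k) {
    splits <- c(m - s * sqrt(3) / 2, m, m + s * sqrt(3) / 2)
  } else {
    splits <- c(m - s * sqrt(3) / 3, m + s * sqrt(3) / 3)
  }

  # Guard against a constant (or single-value) predictor, where the
  # split points would be duplicated or undefined.
  if (is.na(s) || s == 0) splits <- m

  binned <- cut(x, breaks = c(-Inf, splits, Inf))
  addNA(binned, ifany = TRUE)  # NA becomes an extra level only if present
}

x <- c(1.2, 3.4, NA, 5.6, 2.1, 4.8, 3.3, 2.9)
y <- factor(c("a", "b", "a", "b", "a", "b", "b", "a"))
table(discretize_numeric(x, y), y)
```

With only seven non-missing values and a two-level response, this sketch takes the two-split branch, so the predictor ends up with three interval levels plus one level for the missing value.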
The chi-squared statistic is then computed between each predictor and the
response. If the chi-squared test has more than one degree of freedom,
the Wilson-Hilferty transformation is applied to convert the statistic to
an approximately equivalent value on a 1-degree-of-freedom chi-squared
distribution.
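
As a rough sketch of this step, the Wilson-Hilferty approximation treats the cube root of a chi-squared variable divided by its degrees of freedom as approximately normal. Applying that approximation once to obtain a normal score, and once in reverse for 1 degree of freedom, gives the conversion below. The helper name wilson_hilferty_1df is hypothetical, and the function documented here may differ in details such as how the contingency table is built.

```r
# Hypothetical helper: map a chi-squared statistic with `df` degrees of
# freedom to an approximately equivalent 1-df statistic using the
# Wilson-Hilferty cube-root normal approximation.
wilson_hilferty_1df <- function(stat, df) {
  if (df <= 1) return(stat)  # already on the 1-df scale
  w <- 7 / 9 + sqrt(df) * ((stat / df)^(1 / 3) - 1 + 2 / (9 * df))
  max(0, w)^3                # truncate at zero before cubing
}

# Example: a three-level predictor against a two-level response (df = 2).
x <- factor(c("lo", "mid", "hi", "lo", "mid", "hi", "lo", "hi", "mid", "lo"))
y <- factor(c("a", "a", "b", "a", "b", "b", "a", "b", "a", "b"))

# Small expected counts trigger a warning from chisq.test(); it is
# suppressed here because only the statistic is of interest.
test <- suppressWarnings(chisq.test(table(x, y), correct = FALSE))
wilson_hilferty_1df(unname(test$statistic), unname(test$parameter))
```

The converted value has roughly the same upper-tail probability under a 1-degree-of-freedom chi-squared distribution as the original statistic has under its own degrees of freedom, which puts predictors with different numbers of levels on a common scale.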