d estimates the incremental explained risk variation
across a set of pre-specified disease subtypes in a case-control study.
This function takes the name of the disease subtype variable, the number
of disease subtypes, a list of risk factors, and a wide dataset,
and transforms the dataset into the format required for model fitting.
The polytomous logistic regression model is then fit,
and D is calculated based on the resulting risk predictions.
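To illustrate what a wide-to-long transformation for a polytomous model can look like, here is a minimal base R sketch. It is not the function's internal code: the data frame `wide`, the columns `id`, `alt`, and `chosen`, and the use of `merge()` for the expansion are all assumptions made for illustration.

```r
# Hypothetical wide data: one row per subject, subtype coded 0..M (0 = control)
wide <- data.frame(
  id      = 1:4,
  subtype = c(0, 1, 2, 0),
  x1      = c(0.5, 1.2, -0.3, 0.8)
)
M <- 2

# Cross join: repeat every subject once per possible outcome 0..M
long <- merge(wide, data.frame(alt = 0:M), by = NULL)

# Flag the row whose alternative matches the subject's observed subtype
long$chosen <- long$subtype == long$alt

# Order rows so each subject's alternatives sit together
long <- long[order(long$id, long$alt), ]

# Each subject contributes M + 1 rows, exactly one of them chosen
stopifnot(nrow(long) == nrow(wide) * (M + 1))
stopifnot(all(tapply(long$chosen, long$id, sum) == 1))
```

The long layout has one row per subject and candidate outcome, which is the shape that choice-model fitting routines commonly expect.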
d(label, M, factors, data)
label: the name of the subtype variable in the data. This should be a
numeric variable with values 0 through M, where 0 indicates control subjects.
Must be supplied in quotes, e.g. label = "subtype".
M: the number of subtypes. Requires M >= 2.
factors: a list of the names of the binary or continuous risk factors.
For binary risk factors the lowest level will be used as the reference level.
data: the name of the dataframe that contains the relevant variables.
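The argument requirements above can be checked before calling d. The following is a minimal sketch in base R, assuming a made-up data frame `toy` with hypothetical columns `subtype`, `x1`, and `x2`; none of these names come from the package.

```r
set.seed(1)
n <- 200
M <- 3

# Hypothetical wide dataset: subtype coded 0..M, with 0 marking controls
toy <- data.frame(
  subtype = rep(0:M, length.out = n),
  x1      = rnorm(n),          # continuous risk factor
  x2      = rbinom(n, 1, 0.4)  # binary risk factor, 0 is the reference level
)

# Risk factor names are passed as a list of quoted strings
factors <- list("x1", "x2")

# Checks mirroring the documented requirements
stopifnot(M >= 2)
stopifnot(is.numeric(toy$subtype))
stopifnot(setequal(unique(toy$subtype), 0:M))
stopifnot(all(unlist(factors) %in% names(toy)))

# With the package providing d loaded, the call would then be:
# d(label = "subtype", M = M, factors = factors, data = toy)
```

The call itself is left commented out so that the sketch runs without the package installed.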
Begg, C. B., Zabor, E. C., Bernstein, J. L., Bernstein, L., Press, M. F., & Seshan, V. E. (2013). A conceptual and methodological framework for investigating etiologic heterogeneity. Stat Med, 32(29), 5039-5052. doi: 10.1002/sim.5902
d(label = "subtype", M = 4, factors = list("x1", "x2", "x3"), data = subtype_data)
#> 0.4100995