LogisticRegression<ParameterCalc>.DesignVariables(IDFColumn) Method

Convenience method for generating design (dummy) variables, which replace independent variables in a logistic model that take on discrete, nominally scaled values. The encoding method used is "reference cell coding", where the group with the smallest code serves as the reference group. The method is described in the Remarks below.

Namespace: CenterSpace.NMath.Core
Assembly: NMath (in NMath.dll) Version: 7.4
Syntax
public static DataFrame DesignVariables(
	IDFColumn catagoricalDataCol
)

Parameters

catagoricalDataCol  IDFColumn
A column containing the nominally scaled variable values.

Return Value

DataFrame
DataFrame containing the design variables encoded using "reference cell coding". The group with the smallest code serves as the reference group. If the input column name is X, and the variable X has k possible values, the output DataFrame will contain k - 1 columns with names: X_0, X_1,...,X_(k-2)
Remarks
If a nominally scaled variable has k possible values, then k - 1 design variables are created, each with a value of zero or one. The design variables are encoded by setting all of them to zero for the reference group, and then setting a single design variable equal to one for each of the other groups. The design variables replace the nominally scaled variable in the model.

Suppose that the jth independent variable, xj, has k levels. Denote the design variables by Dju and their coefficients by Bju, u = 1, 2,...,k-1. Then the logit for the model with p variables, the jth of which is discrete, is

    g(x) = B0 + B1*x1 +...+ (Bj1*Dj1 + Bj2*Dj2 +...+ Bj(k-1)*Dj(k-1)) +...+ Bp*xp

For example, suppose that Race is an independent variable in a model with three possible values: white, black and other. Suppose further that these values have been encoded in the data as white = 1, black = 2, and other = 3. The input to the DesignVariables function would be a data frame column with name = Race and the numerical values for each subject. This function would then generate a data frame containing 3 - 1 = 2 columns for the two design variables, with names Race_0 and Race_1.

Sample input/output

Input Column:

    Race
    ----
    1
    1
    2
    1
    1
    3
    3
    2
    1

Output DataFrame:

    Race_0  Race_1
    ------  ------
    0       0
    0       0
    1       0
    0       0
    0       0
    0       1
    0       1
    1       0
    0       0
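The encoding rule described in these remarks can be sketched in a few lines of plain C#. This is an illustration only, not NMath's implementation: it does not call DesignVariables or use DataFrame, and the class name ReferenceCellCodingSketch and the helper Encode are hypothetical names introduced here. It assumes the codes are integers and that the smallest distinct code marks the reference group.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ReferenceCellCodingSketch
{
    // Hypothetical helper, not part of NMath: returns one row of k - 1
    // zero/one design-variable values for each input code.
    public static int[][] Encode(IReadOnlyList<int> codes)
    {
        // Distinct codes in ascending order; the smallest is the reference group.
        int[] levels = codes.Distinct().OrderBy(c => c).ToArray();
        int width = levels.Length - 1;

        var rows = new int[codes.Count][];
        for (int i = 0; i < codes.Count; i++)
        {
            rows[i] = new int[width];                    // all zeros: reference group
            int level = Array.IndexOf(levels, codes[i]); // 0 for the reference group
            if (level > 0)
            {
                rows[i][level - 1] = 1;                  // single indicator for any other group
            }
        }
        return rows;
    }

    public static void Main()
    {
        // The Race column from the sample: white = 1, black = 2, other = 3.
        int[] race = { 1, 1, 2, 1, 1, 3, 3, 2, 1 };

        Console.WriteLine("Race_0  Race_1");
        foreach (int[] row in Encode(race))
        {
            Console.WriteLine(string.Join("       ", row));
        }
    }
}
```

Run on the Race sample, the sketch maps code 1 to (0, 0), code 2 to (1, 0), and code 3 to (0, 1), matching the sample output. The actual method returns a DataFrame with named columns, which the sketch leaves out.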
See Also