Statistics Dictionary


Dummy Variable

A dummy variable (also known as an indicator variable) is a numeric variable that represents categorical data, such as gender, race, or political affiliation.

Researchers use dummy variables in regression analysis when one or more independent variables are categorical. Because a regression equation requires numeric inputs, each categorical variable must first be expressed as one or more dummy variables.

Technically, dummy variables are dichotomous, quantitative variables; they can take on any two quantitative values. As a practical matter, regression results are easier to interpret when dummy variables take on two specific values, 1 or 0. Typically, 1 represents the presence of a qualitative attribute, and 0 represents the absence.
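To illustrate why the values 1 and 0 are convenient, here is a minimal sketch of a one-predictor least-squares fit in which the predictor is a dummy variable. The data are made up for illustration, and the function name fit_simple_regression is hypothetical. With 0/1 coding, the intercept equals the mean outcome of the group coded 0, and the slope equals the difference between the two group means.

```python
from statistics import mean


def fit_simple_regression(x, y):
    """Least-squares fit of y = b0 + b1 * x for a single predictor."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical data: x is a dummy variable (1 = attribute present, 0 = absent).
x = [0, 0, 0, 1, 1, 1]
y = [10.0, 12.0, 14.0, 15.0, 17.0, 19.0]

b0, b1 = fit_simple_regression(x, y)

# Group means, computed directly for comparison with the coefficients.
mean_absent = mean(yi for xi, yi in zip(x, y) if xi == 0)
mean_present = mean(yi for xi, yi in zip(x, y) if xi == 1)

print(b0, mean_absent)                  # intercept vs. mean of the 0 group
print(b1, mean_present - mean_absent)   # slope vs. difference in group means
```

Any other pair of values would fit the data equally well, but the coefficients would no longer read directly as a group mean and a difference in means.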

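The number of dummy variables required to represent a particular categorical variable depends on the number of values that the categorical variable can assume. To represent a categorical variable that can assume k different values, a researcher would need to define k - 1 dummy variables. The remaining category serves as the reference group; a k-th dummy variable would be redundant, because its value is fully determined by the other k - 1.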

For example, suppose we are interested in political affiliation, a categorical variable that might assume three values: Republican, Democrat, or Independent. We could represent political affiliation with two dummy variables:

  • X1 = 1, if Republican; X1 = 0, otherwise.
  • X2 = 1, if Democrat; X2 = 0, otherwise.

In this example, notice that we don't have to create a dummy variable to represent the "Independent" category of political affiliation. If X1 equals 0 and X2 equals 0, we know the voter is neither Republican nor Democrat. Therefore, the voter must be Independent.
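The coding scheme for political affiliation can be sketched in a few lines of Python. This is only an illustration: the function name dummy_code and its parameters are hypothetical, and "Independent" is treated as the reference category, so it gets no dummy variable of its own.

```python
def dummy_code(value, categories, reference):
    """Return the k - 1 dummy values (each 0 or 1) for one observation.

    One dummy is produced for every category except the reference,
    in the order the categories are listed.
    """
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == c else 0 for c in categories if c != reference]


CATEGORIES = ["Republican", "Democrat", "Independent"]

for party in CATEGORIES:
    x1, x2 = dummy_code(party, CATEGORIES, reference="Independent")
    print(f"{party}: X1 = {x1}, X2 = {x2}")

# Prints:
# Republican: X1 = 1, X2 = 0
# Democrat: X1 = 0, X2 = 1
# Independent: X1 = 0, X2 = 0
```

Choosing a different reference category changes which group is coded as all zeros, but the number of dummy variables stays at k - 1.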

See also:  Dummy Variables in Regression