Creating dummy variables is an important step when working with nominal variables. This comprehensive tutorial covers the steps needed to create dummy variables based on variable class in an R data frame.

In this tutorial, we learn how to use the dummy.data.frame() function from the dummies package (Brown, 2012). First, we learn how to create dummy variables for categorical variables. Second, we go over how to create dummy variables for a specified class. Finally, we learn how to create dummy variables for all variables in an R data frame.

This tutorial does not discuss the convention of using k-1 dummy variables for a categorical variable with k levels; that topic is covered in a separate article.

Let’s construct a data frame containing variables of four different classes in R.

w <- factor(rep(c("apple","banana","carrot"), each = 2))
x <- rep(c("A","B","C"), 2)
y <- rep(1:3, 2)
z <- rep(c(0.4,0.8), 3)
data <- data.frame(w, x, y, z)

data
##        w x y   z
## 1  apple A 1 0.4
## 2  apple B 2 0.8
## 3 banana C 3 0.4
## 4 banana A 1 0.8
## 5 carrot B 2 0.4
## 6 carrot C 3 0.8

sapply(data, class) 
##           w           x           y           z 
##    "factor" "character"   "integer"   "numeric" 


1) How to Create Dummy Variables for Categorical Variables

In this part, we use the dummy.data.frame() function with its default arguments. By default, it converts variables of class factor and character to dummy variables and keeps the remaining variables unchanged. If we set all = FALSE, it returns only the dummy variables and drops the remaining variables.

library(dummies)
dummy.data.frame(data)
##   wapple wbanana wcarrot xA xB xC y   z
## 1      1       0       0  1  0  0 1 0.4
## 2      1       0       0  0  1  0 2 0.8
## 3      0       1       0  0  0  1 3 0.4
## 4      0       1       0  1  0  0 1 0.8
## 5      0       0       1  0  1  0 2 0.4
## 6      0       0       1  0  0  1 3 0.8

dummy.data.frame(data, all = FALSE)
##   wapple wbanana wcarrot xA xB xC
## 1      1       0       0  1  0  0
## 2      1       0       0  0  1  0
## 3      0       1       0  0  0  1
## 4      0       1       0  1  0  0
## 5      0       0       1  0  1  0
## 6      0       0       1  0  0  1
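
To make the default behaviour more concrete, here is a minimal base R sketch of the same idea. The helper make_dummies() and its arguments are hypothetical names introduced for illustration; it is not part of the dummies package and does not reproduce all of its options. It assumes that a column is selected when the first element of its class vector appears in classes.

```r
# Rebuild the example data (stringsAsFactors = FALSE keeps x as character
# regardless of the R version)
w <- factor(rep(c("apple", "banana", "carrot"), each = 2))
x <- rep(c("A", "B", "C"), 2)
y <- rep(1:3, 2)
z <- rep(c(0.4, 0.8), 3)
data <- data.frame(w, x, y, z, stringsAsFactors = FALSE)

# Hypothetical helper (not from the dummies package): expands each column
# whose class is listed in `classes` into one 0/1 column per distinct value.
make_dummies <- function(df, classes = c("factor", "character"), all = TRUE) {
  pieces <- lapply(names(df), function(nm) {
    col <- df[[nm]]
    if (class(col)[1] %in% classes) {
      f <- factor(col)
      # Compare each observation with each level; adding 0L turns TRUE/FALSE into 1/0
      m <- outer(as.integer(f), seq_along(levels(f)), "==") + 0L
      d <- as.data.frame(m)
      names(d) <- paste0(nm, levels(f))  # e.g. "w" + "apple" -> "wapple"
      d
    } else if (all) {
      df[nm]  # keep the column unchanged
    } else {
      NULL    # drop the column when all = FALSE
    }
  })
  do.call(cbind, Filter(Negate(is.null), pieces))
}

make_dummies(data)
make_dummies(data, all = FALSE)
```

On the example data, the two calls should produce the same column names as the dummy.data.frame() outputs above.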


2) How to Create Dummy Variables for Specified Class

We can create dummy variables for a particular variable class with the dummy.classes argument. In this part, we set dummy.classes to “factor”, “character”, “numeric” and “integer” in turn.

dummy.data.frame(data, dummy.classes = "factor")
##   wapple wbanana wcarrot x y   z
## 1      1       0       0 A 1 0.4
## 2      1       0       0 B 2 0.8
## 3      0       1       0 C 3 0.4
## 4      0       1       0 A 1 0.8
## 5      0       0       1 B 2 0.4
## 6      0       0       1 C 3 0.8

dummy.data.frame(data, dummy.classes = "character")
##        w xA xB xC y   z
## 1  apple  1  0  0 1 0.4
## 2  apple  0  1  0 2 0.8
## 3 banana  0  0  1 3 0.4
## 4 banana  1  0  0 1 0.8
## 5 carrot  0  1  0 2 0.4
## 6 carrot  0  0  1 3 0.8

dummy.data.frame(data, dummy.classes = "numeric")
##        w x y z0.4 z0.8
## 1  apple A 1    1    0
## 2  apple B 2    0    1
## 3 banana C 3    1    0
## 4 banana A 1    0    1
## 5 carrot B 2    1    0
## 6 carrot C 3    0    1

dummy.data.frame(data, dummy.classes = "integer")
##        w x y1 y2 y3   z
## 1  apple A  1  0  0 0.4
## 2  apple B  0  1  0 0.8
## 3 banana C  0  0  1 0.4
## 4 banana A  1  0  0 0.8
## 5 carrot B  0  1  0 0.4
## 6 carrot C  0  0  1 0.8
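
Before choosing a class to convert, it can help to see which columns belong to each class. The following is a small base R sketch that groups the column names of the example data by class; cols_by_class is an illustrative name, and the sketch assumes the first element of each column's class vector is the one of interest.

```r
# Rebuild the example data
w <- factor(rep(c("apple", "banana", "carrot"), each = 2))
x <- rep(c("A", "B", "C"), 2)
y <- rep(1:3, 2)
z <- rep(c(0.4, 0.8), 3)
data <- data.frame(w, x, y, z, stringsAsFactors = FALSE)

# First class of every column, as a named character vector
col_class <- sapply(data, function(col) class(col)[1])

# Group the column names by class
cols_by_class <- split(names(data), col_class)
cols_by_class

# Columns that would be affected when the class of interest is "integer"
cols_by_class[["integer"]]
```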


3) How to Create Dummy Variables for All Variables

We can create dummy variables for all variables by setting dummy.classes to “ALL”.

dummy.data.frame(data, dummy.classes = "ALL")
##   wapple wbanana wcarrot xA xB xC y1 y2 y3 z0.4 z0.8
## 1      1       0       0  1  0  0  1  0  0    1    0
## 2      1       0       0  0  1  0  0  1  0    0    1
## 3      0       1       0  0  0  1  0  0  1    1    0
## 4      0       1       0  1  0  0  1  0  0    0    1
## 5      0       0       1  0  1  0  0  1  0    1    0
## 6      0       0       1  0  0  1  0  0  1    0    1
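
Because every distinct value becomes its own column, converting numeric variables can produce a wide data frame. As a rough check, the base R sketch below counts the distinct values per column of the example data; n_levels is an illustrative name, and the total is only an estimate of the resulting width under the assumption that no column is dropped.

```r
# Rebuild the example data
w <- factor(rep(c("apple", "banana", "carrot"), each = 2))
x <- rep(c("A", "B", "C"), 2)
y <- rep(1:3, 2)
z <- rep(c(0.4, 0.8), 3)
data <- data.frame(w, x, y, z, stringsAsFactors = FALSE)

# Number of distinct values in each column
n_levels <- sapply(data, function(col) length(unique(col)))
n_levels

# Total number of dummy columns if every column is converted
sum(n_levels)
```

For the example data the total is 11, which matches the number of columns in the output above.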


References

Brown, C. (2012). dummies: Create dummy/indicator variables flexibly and efficiently. R package version 1.5.6.


Dr. Osman Dag