Recoding character variables is an important part of data manipulation. This guide covers the steps for recoding a character variable and shows how to revalue character data in R.

In this guide, we work through three ways of recoding character variables in R. First, we revalue a categorical variable and keep it as character type. Second, we recode it and convert the result from character to factor. Last, we recode it and convert the result from character to numeric.

Throughout the guide, we use a character vector of five elements as the example to recode.

data <- c("a","b","c","b","a")
class(data)
## [1] "character"
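
Before recoding, it can help to list the distinct categories and their counts so that every value gets a mapping. A minimal sketch in base R on the same vector (the name `categories` is introduced here only for illustration):

```r
data <- c("a", "b", "c", "b", "a")

# Distinct values, in order of first appearance
categories <- unique(data)
categories

# Frequency of each category
table(data)
```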

1. How to Recode Categorical Variables in Character

In this part, we learn six ways of recoding a categorical variable while keeping it as character type. First, we use the recode() function from the dplyr package (Wickham et al., 2020). Second, we use the revalue() function from the plyr package (Wickham, 2011). Third, we use mapvalues(), also from plyr (Wickham, 2011). Fourth, we use nested ifelse() calls. Fifth, we use match() to look up each old value in a vector of new values. Last, we replace values directly using bracket indexing with a logical condition.

dplyr::recode(data, a = "apple", b = "banana", c = "carrot")
## [1] "apple"  "banana" "carrot" "banana" "apple" 

plyr::revalue(data, c(a = "apple", b = "banana", c = "carrot"))
## [1] "apple"  "banana" "carrot" "banana" "apple" 

plyr::mapvalues(data, from = c("a", "b", "c"), to = c("apple", "banana", "carrot"))
## [1] "apple"  "banana" "carrot" "banana" "apple" 

ifelse(data == "a", "apple", ifelse(data == "b", "banana", "carrot"))
## [1] "apple"  "banana" "carrot" "banana" "apple" 
 
oldvals <- c("a", "b", "c")
newvals <- c("apple", "banana", "carrot")
newvals[match(data, oldvals)]
## [1] "apple"  "banana" "carrot" "banana" "apple" 

# Note: these assignments overwrite data in place
data[data == "a"] <- "apple"
data[data == "b"] <- "banana"
data[data == "c"] <- "carrot"
data
## [1] "apple"  "banana" "carrot" "banana" "apple" 
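
The six approaches agree on this example, but they can differ when the data contain a value that has no mapping. The sketch below uses base R only and assumes a hypothetical vector `data2` that contains an unmapped value "d". The match() lookup yields NA for it, while the nested ifelse() silently assigns it the final fallback value.

```r
oldvals <- c("a", "b", "c")
newvals <- c("apple", "banana", "carrot")

# Hypothetical input containing "d", which has no mapping
data2 <- c("a", "b", "d")

# match() returns NA where a value is not found in oldvals
by_match <- newvals[match(data2, oldvals)]

# Nested ifelse() sends every value other than "a" or "b" to "carrot"
by_ifelse <- ifelse(data2 == "a", "apple",
                    ifelse(data2 == "b", "banana", "carrot"))

by_match
by_ifelse
```

An NA from the lookup makes an unmapped category easy to spot, for example with anyNA().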

2. How to Recode Categorical Variables Converting Character to Factor

In this section, we recode the character variable and convert it to a factor. We learn three ways of doing so. First, we use recode_factor() from the dplyr package (Wickham et al., 2020). Second, we use the fct_recode() function from the forcats package (Wickham, 2020). Last, we index a factor of new values with match().

data <- c("a","b","c","b","a")

dplyr::recode_factor(data, a = "apple", b = "banana", c = "carrot")
## [1] apple  banana carrot banana apple 
## Levels: apple banana carrot

forcats::fct_recode(data, apple = "a", banana = "b", carrot = "c")
## [1] apple  banana carrot banana apple 
## Levels: apple banana carrot

oldvals <- c("a", "b", "c")
newvals <- factor(c("apple", "banana", "carrot"))
newvals[match(data, oldvals)]
## [1] apple  banana carrot banana apple 
## Levels: apple banana carrot
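
Base R can also do this in one step: factor() accepts the old values as `levels` and the new names as `labels`. This is an additional sketch rather than one of the three methods above, and the name `recoded` is introduced only for illustration. The order given in `levels` also fixes the order of the resulting factor levels.

```r
data <- c("a", "b", "c", "b", "a")

# Map old values (levels) to new names (labels) and return a factor
recoded <- factor(data,
                  levels = c("a", "b", "c"),
                  labels = c("apple", "banana", "carrot"))

recoded
class(recoded)
levels(recoded)
```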

3. How to Recode Categorical Variables Converting Character to Numeric

In this part, we learn three ways of converting character to numeric by recoding a categorical variable. First, we use recode() from the dplyr package (Wickham et al., 2020). Then, we use nested ifelse() calls to map each category to a number. Last, we use match() to index a numeric vector of new values.

data <- c("a","b","c","b","a")

dplyr::recode(data, a = 1, b = 2, c = 3)
## [1] 1 2 3 2 1

ifelse(data == "a", 1, ifelse(data == "b", 2, 3))
## [1] 1 2 3 2 1

oldvals <- c("a", "b", "c")
newvals <- c(1, 2, 3)
newvals[match(data, oldvals)]
## [1] 1 2 3 2 1
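
When the target codes are simply 1, 2, 3, and so on in the order of `oldvals`, the positions returned by match() are already the codes. A minimal base R sketch (the name `codes` is illustrative); note that the result is of type integer rather than double:

```r
data <- c("a", "b", "c", "b", "a")
oldvals <- c("a", "b", "c")

# match() gives the position of each element of data within oldvals
codes <- match(data, oldvals)

codes
class(codes)
```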

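The recoded variables can be converted from one type to another with the as.factor(), as.numeric() and as.character() functions. Note that as.numeric() returns NA, with a warning, for character values that are not numbers, such as "apple".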
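
Converting a factor to numeric deserves care: as.numeric() applied to a factor returns the integer level codes, not the labels. The sketch below uses a hypothetical factor `f` whose labels are numbers stored as text, and converts through as.character() first to recover the values.

```r
# Hypothetical factor whose labels look like numbers
f <- factor(c("10", "20", "10"))

# Level codes, not the labels
codes <- as.numeric(f)

# Convert to character first to get the numeric values of the labels
values <- as.numeric(as.character(f))

codes
values
```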

References

Wickham, H. (2011). The Split-Apply-Combine Strategy for Data Analysis. Journal of Statistical Software, 40(1), 1-29.

Wickham, H. (2020). forcats: Tools for Working with Categorical Variables (Factors). R package version 0.5.0.

Wickham, H., François, R., Henry, L., & Müller, K. (2020). dplyr: A Grammar of Data Manipulation. R package version 1.0.2.


Dr. Osman Dag