Sometimes we need to merge datasets that come from different sources. This tutorial shows how to merge data frames in R in several ways.

In this tutorial, we cover how to merge data frames by common ids, by the ids of the first data frame, by the ids of the second data frame and by all ids. These correspond to inner, left, right and full joins, respectively. We cover three commonly used approaches in R. Firstly, we join the data frames using the base R merge() function. Secondly, we use the join functions of the dplyr package (Wickham et al., 2022). Lastly, we use the tidyverse package (Wickham et al., 2019), whose reduce() function applies a join across a list of data frames.

Let’s construct two data frames to illustrate how to merge data frames in R.

data1 <- data.frame(id = 1:4, x1 = 101:104)
data1
##   id  x1
## 1  1 101
## 2  2 102
## 3  3 103
## 4  4 104

data2 <- data.frame(id = 3:6, x2 = 13:16)
data2
##   id x2
## 1  3 13
## 2  4 14
## 3  5 15
## 4  6 16

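1) How to Merge Data Frames Using the merge() Function in R

In this part, we use the merge() function to combine the data frames by common ids, first data frame ids, second data frame ids and all ids, respectively. By default, merge() keeps only the ids found in both data frames. Setting all.x = TRUE keeps every row of the first data frame, setting all.y = TRUE keeps every row of the second data frame, and setting both keeps all ids. Ids without a match get NA in the columns that come from the other data frame.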

merge(data1, data2, by = "id")
##   id  x1 x2
## 1  3 103 13
## 2  4 104 14

merge(data1, data2, by = "id", all.x = TRUE) 
##   id  x1 x2
## 1  1 101 NA
## 2  2 102 NA
## 3  3 103 13
## 4  4 104 14

merge(data1, data2, by = "id", all.y = TRUE) 
##   id  x1 x2
## 1  3 103 13
## 2  4 104 14
## 3  5  NA 15
## 4  6  NA 16

merge(data1, data2, by = "id", all.x = TRUE, all.y = TRUE)  
##   id  x1 x2
## 1  1 101 NA
## 2  2 102 NA
## 3  3 103 13
## 4  4 104 14
## 5  5  NA 15
## 6  6  NA 16
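
Two further arguments of merge() are often useful alongside the calls above. The following is a minimal sketch, not part of the original examples: all = TRUE is a shorthand for setting both all.x and all.y, and by.x and by.y handle id columns that have different names in the two data frames. The data frame data3 and its column name key are hypothetical and only serve as an illustration.

```r
# Rebuild the tutorial's data frames so the block runs on its own.
data1 <- data.frame(id = 1:4, x1 = 101:104)
data2 <- data.frame(id = 3:6, x2 = 13:16)

# all = TRUE sets both all.x and all.y, so this is also a merge by all ids.
full_a <- merge(data1, data2, by = "id", all.x = TRUE, all.y = TRUE)
full_b <- merge(data1, data2, by = "id", all = TRUE)
identical(full_a, full_b)

# Hypothetical data frame whose id column is called "key" instead of "id".
data3 <- data.frame(key = 3:6, x2 = 13:16)

# by.x and by.y name the id column in the first and second data frame.
# The merged id column takes its name from the first data frame.
renamed <- merge(data1, data3, by.x = "id", by.y = "key")
renamed
```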

2) How to Merge Data Frames Using the dplyr Package in R

In this section, we use the inner_join(), left_join(), right_join() and full_join() functions of the dplyr package (Wickham et al., 2022) to merge data frames by common ids, first data frame ids, second data frame ids and all ids, respectively.

library(dplyr)
data1 %>% inner_join(data2, by = 'id')
##   id  x1 x2
## 1  3 103 13
## 2  4 104 14

data1 %>% left_join(data2, by = 'id')
##   id  x1 x2
## 1  1 101 NA
## 2  2 102 NA
## 3  3 103 13
## 4  4 104 14
 
data1 %>% right_join(data2, by = 'id')
##   id  x1 x2
## 1  3 103 13
## 2  4 104 14
## 3  5  NA 15
## 4  6  NA 16
 
data1 %>% full_join(data2, by = 'id')
##   id  x1 x2
## 1  1 101 NA
## 2  2 102 NA
## 3  3 103 13
## 4  4 104 14
## 5  5  NA 15
## 6  6  NA 16
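
The joins above assume that the id column has the same name in both data frames. As a minimal sketch under that caveat, a named by vector can map differently named id columns onto each other. The data frame data3 and its column name key are hypothetical. The last two lines also illustrate that a right join is a left join with the roles of the two data frames swapped, apart from column order.

```r
library(dplyr)

# Rebuild the tutorial's data frames so the block runs on its own.
data1 <- data.frame(id = 1:4, x1 = 101:104)
data2 <- data.frame(id = 3:6, x2 = 13:16)

# Hypothetical data frame whose id column is called "key" instead of "id".
data3 <- data.frame(key = 3:6, x2 = 13:16)

# c("id" = "key") matches data1$id against data3$key.
# The joined id column keeps the name used in the first data frame.
joined <- data1 %>% left_join(data3, by = c("id" = "key"))
joined

# Both calls keep the ids of data2; only the column order differs.
right_version <- data1 %>% right_join(data2, by = "id")
swapped_left <- data2 %>% left_join(data1, by = "id")
```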

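3) How to Merge Data Frames Using the tidyverse Package in R

In this part, we first put the data frames into a list. Then, we pass the list to the reduce() function, which the tidyverse loads from its purrr package. The join function given to reduce(), one of inner_join, left_join, right_join or full_join, determines whether the data frames are merged by common ids, first data frame ids, second data frame ids or all ids, respectively. reduce() applies the join to the first two data frames, then joins the result with the next data frame, and so on, so the same code also works for a list of more than two data frames.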

library(tidyverse)
data_list <- list(data1, data2)      
 
data_list %>% reduce(inner_join, by = 'id')
##   id  x1 x2
## 1  3 103 13
## 2  4 104 14

data_list %>% reduce(left_join, by = 'id')
##   id  x1 x2
## 1  1 101 NA
## 2  2 102 NA
## 3  3 103 13
## 4  4 104 14

data_list %>% reduce(right_join, by = 'id')
##   id  x1 x2
## 1  3 103 13
## 2  4 104 14
## 3  5  NA 15
## 4  6  NA 16

data_list %>% reduce(full_join, by = 'id')
##   id  x1 x2
## 1  1 101 NA
## 2  2 102 NA
## 3  3 103 13
## 4  4 104 14
## 5  5  NA 15
## 6  6  NA 16
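
With only two data frames, reduce() gives the same results as calling the join functions directly. Its benefit shows with longer lists. The sketch below assumes a hypothetical third data frame, data3, and loads dplyr and purrr directly instead of the whole tidyverse. It also wraps each join in an anonymous function, which is an alternative to passing by = 'id' through reduce().

```r
library(dplyr)
library(purrr)

# Rebuild the tutorial's data frames so the block runs on its own.
data1 <- data.frame(id = 1:4, x1 = 101:104)
data2 <- data.frame(id = 3:6, x2 = 13:16)

# Hypothetical third data frame sharing the "id" column.
data3 <- data.frame(id = c(2L, 4L, 6L), x3 = c(22L, 24L, 26L))

data_list3 <- list(data1, data2, data3)

# Merge by ids common to all three data frames (only id 4 is in all of them).
common_ids <- data_list3 %>%
  reduce(function(x, y) inner_join(x, y, by = "id"))
common_ids

# Merge by all ids that appear in any of the three data frames.
all_ids <- data_list3 %>%
  reduce(function(x, y) full_join(x, y, by = "id"))
all_ids
```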

References

Wickham, H., Averick, M., Bryan, J., Chang, W., McGowan, L. D. A., François, R., … & Yutani, H. (2019). Welcome to the Tidyverse. Journal of Open Source Software, 4(43), 1686.

Wickham, H., François, R., Henry, L., & Müller, K. (2022). dplyr: A Grammar of Data Manipulation. R package version 1.0.10.


Dr. Osman Dag