Sometimes we need to merge datasets that come from different sources. This tutorial shows how to merge data frames in R in several different ways.

In this tutorial, we cover how to merge data frames by the ids common to both data frames, by the ids of the first data frame, by the ids of the second data frame, and by all ids. Three approaches are commonly used for merging data frames in R. First, we join the data frames with the merge() function. Second, we use the dplyr package (Wickham et al., 2022) to merge data frames in R. Finally, we use the tidyverse package (Wickham et al., 2019) to combine data frames in R.

Let’s construct two data frames to illustrate how to merge data frames in R.

```r
data1 <- data.frame(id = 1:4, x1 = 101:104)
data1
##   id  x1
## 1  1 101
## 2  2 102
## 3  3 103
## 4  4 104

data2 <- data.frame(id = 3:6, x2 = 13:16)
data2
##   id x2
## 1  3 13
## 2  4 14
## 3  5 15
## 4  6 16
```
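Before merging, it can help to check how the ids of the two data frames overlap, because that overlap determines how many rows each kind of merge returns. The following is a minimal sketch using base R set functions; the variable names (common_ids, all_ids, only_in_data1, only_in_data2) are chosen here for illustration only.

```r
data1 <- data.frame(id = 1:4, x1 = 101:104)
data2 <- data.frame(id = 3:6, x2 = 13:16)

# ids present in both data frames (the "common ids")
common_ids <- intersect(data1$id, data2$id)
# ids present in at least one data frame (the "all ids" case)
all_ids <- union(data1$id, data2$id)
# ids that appear in only one of the two data frames
only_in_data1 <- setdiff(data1$id, data2$id)
only_in_data2 <- setdiff(data2$id, data1$id)

common_ids
## [1] 3 4
all_ids
## [1] 1 2 3 4 5 6
only_in_data1
## [1] 1 2
only_in_data2
## [1] 5 6
```

With these data, a merge by common ids should therefore return two rows and a merge by all ids should return six.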

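## 1) How to Merge Data Frames Using the merge() Function in R

In this part, we use the merge() function to combine the data frames by common ids, first data frame ids, second data frame ids and all ids, respectively. The all.x and all.y arguments control whether unmatched rows of the first and second data frame are kept.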

```r
merge(data1, data2, by = "id")
##   id  x1 x2
## 1  3 103 13
## 2  4 104 14

merge(data1, data2, by = "id", all.x = TRUE)
##   id  x1 x2
## 1  1 101 NA
## 2  2 102 NA
## 3  3 103 13
## 4  4 104 14

merge(data1, data2, by = "id", all.y = TRUE)
##   id  x1 x2
## 1  3 103 13
## 2  4 104 14
## 3  5  NA 15
## 4  6  NA 16

merge(data1, data2, by = "id", all.x = TRUE, all.y = TRUE)
##   id  x1 x2
## 1  1 101 NA
## 2  2 102 NA
## 3  3 103 13
## 4  4 104 14
## 5  5  NA 15
## 6  6  NA 16
```
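The calls above assume that the key column has the same name in both data frames. As a hedged sketch of the case where it does not, merge() also accepts by.x and by.y. The data frame data2_key below is a hypothetical variant of data2, introduced only for this illustration, in which the key column is named key.

```r
data1 <- data.frame(id = 1:4, x1 = 101:104)
# Hypothetical variant of data2: the key column is called "key", not "id"
data2_key <- data.frame(key = 3:6, x2 = 13:16)

# by.x names the key in the first data frame, by.y the key in the second;
# the key column in the result takes its name from the first data frame
merged <- merge(data1, data2_key, by.x = "id", by.y = "key")
merged
##   id  x1 x2
## 1  3 103 13
## 2  4 104 14

# all = TRUE is shorthand for all.x = TRUE, all.y = TRUE
merged_all <- merge(data1, data2_key, by.x = "id", by.y = "key", all = TRUE)
nrow(merged_all)
## [1] 6
```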

## 2) How to Merge Data Frames Using the dplyr Package in R

In this section, we use the inner_join(), left_join(), right_join() and full_join() functions from the dplyr package (Wickham et al., 2022) to merge data frames by common ids, first data frame ids, second data frame ids and all ids, respectively.

```r
library(dplyr)
data1 %>% inner_join(data2, by = 'id')
##   id  x1 x2
## 1  3 103 13
## 2  4 104 14

data1 %>% left_join(data2, by = 'id')
##   id  x1 x2
## 1  1 101 NA
## 2  2 102 NA
## 3  3 103 13
## 4  4 104 14

data1 %>% right_join(data2, by = 'id')
##   id  x1 x2
## 1  3 103 13
## 2  4 104 14
## 3  5  NA 15
## 4  6  NA 16

data1 %>% full_join(data2, by = 'id')
##   id  x1 x2
## 1  1 101 NA
## 2  2 102 NA
## 3  3 103 13
## 4  4 104 14
## 5  5  NA 15
## 6  6  NA 16
```
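For the example data, the dplyr joins and merge() return the same rows in the same order, but that is partly because the ids are already sorted. The sketch below, which assumes a hypothetical reordered copy of data1 named data1_unsorted, illustrates one difference: merge() sorts the result by the key column by default, whereas left_join() keeps the row order of the first data frame.

```r
library(dplyr)

# Hypothetical copy of data1 with its rows in a different order
data1_unsorted <- data.frame(id = c(4L, 2L, 3L, 1L),
                             x1 = c(104L, 102L, 103L, 101L))
data2 <- data.frame(id = 3:6, x2 = 13:16)

# Base R: the result is sorted by id (sort = TRUE is the default)
base_result <- merge(data1_unsorted, data2, by = "id", all.x = TRUE)
base_result
##   id  x1 x2
## 1  1 101 NA
## 2  2 102 NA
## 3  3 103 13
## 4  4 104 14

# dplyr: the rows stay in the order of data1_unsorted
dplyr_result <- data1_unsorted %>% left_join(data2, by = "id")
dplyr_result
##   id  x1 x2
## 1  4 104 14
## 2  2 102 NA
## 3  3 103 13
## 4  1 101 NA
```

If the row order matters downstream, it may be worth sorting explicitly after the join rather than relying on either default.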

## 3) How to Merge Data Frames Using the tidyverse Package in R

In this part, we first put the data frames into a list. Then, we pass the list to the reduce() function, which the tidyverse provides through its purrr package. Passing inner_join, left_join, right_join or full_join to reduce() merges the data frames by common ids, first data frame ids, second data frame ids and all ids, respectively.

```r
library(tidyverse)
data_list <- list(data1, data2)

data_list %>% reduce(inner_join, by = 'id')
##   id  x1 x2
## 1  3 103 13
## 2  4 104 14

data_list %>% reduce(left_join, by = 'id')
##   id  x1 x2
## 1  1 101 NA
## 2  2 102 NA
## 3  3 103 13
## 4  4 104 14

data_list %>% reduce(right_join, by = 'id')
##   id  x1 x2
## 1  3 103 13
## 2  4 104 14
## 3  5  NA 15
## 4  6  NA 16

data_list %>% reduce(full_join, by = 'id')
##   id  x1 x2
## 1  1 101 NA
## 2  2 102 NA
## 3  3 103 13
## 4  4 104 14
## 5  5  NA 15
## 6  6  NA 16
```
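With only two data frames, reduce() gives the same result as a single join. It becomes more useful when the list holds more than two data frames, because the join is applied pairwise from left to right. The following is a minimal sketch under that assumption; data3 is a hypothetical third data frame that is not part of the original example, and the sketch loads dplyr and purrr directly instead of the whole tidyverse.

```r
library(dplyr)
library(purrr)

data1 <- data.frame(id = 1:4, x1 = 101:104)
data2 <- data.frame(id = 3:6, x2 = 13:16)
# Hypothetical third data frame, added only for this illustration
data3 <- data.frame(id = 4:7, x3 = 24:27)

data_list3 <- list(data1, data2, data3)

# Keeps only ids present in all three data frames
inner_all <- data_list3 %>% reduce(inner_join, by = "id")
inner_all
##   id  x1 x2 x3
## 1  4 104 14 24

# Keeps every id that appears in at least one data frame
full_all <- data_list3 %>% reduce(full_join, by = "id")
full_all
##   id  x1 x2 x3
## 1  1 101 NA NA
## 2  2 102 NA NA
## 3  3 103 13 NA
## 4  4 104 14 24
## 5  5  NA 15 25
## 6  6  NA 16 26
## 7  7  NA NA 27
```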

A walkthrough of this code is available on our YouTube channel.