Binding
Whereas joins are useful for combining data frames based on matching keys, another way to combine data frames is to bind them together, either by rows or by columns. TidierData.jl implements these actions using @bind_rows() and @bind_cols(), respectively.
Let's generate three data frames to combine.
using TidierData
df1 = DataFrame(a=1:3, b=1:3);
df2 = DataFrame(a=4:6, b=4:6);
df3 = DataFrame(a=7:9, c=7:9);
@bind_rows()
@bind_rows(df1, df2)
Row | a | b |
---|---|---|
| | Int64 | Int64 |
1 | 1 | 1 |
2 | 2 | 2 |
3 | 3 | 3 |
4 | 4 | 4 |
5 | 5 | 5 |
6 | 6 | 6 |
@bind_rows() keeps every column that is present in at least one of the provided data frames. Where a data frame lacks one of those columns, its rows are filled with missing values.
@bind_rows(df1, df3)
Row | a | b | c |
---|---|---|---|
| | Int64 | Int64? | Int64? |
1 | 1 | 1 | missing |
2 | 2 | 2 | missing |
3 | 3 | 3 | missing |
4 | 7 | missing | 7 |
5 | 8 | missing | 8 |
6 | 9 | missing | 9 |
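The fill behavior can be checked programmatically. The following is a minimal sketch that reuses df1 and df3 from above and relies only on Base functions for the inspection; the variable name combined is chosen here for illustration.

```julia
using TidierData

df1 = DataFrame(a=1:3, b=1:3)
df3 = DataFrame(a=7:9, c=7:9)

combined = @bind_rows(df1, df3)

# Columns that are absent from one input are widened to allow `missing`
eltype(combined.b)

# Count the filled-in values per column
count(ismissing, combined.a)  # `a` exists in both inputs
count(ismissing, combined.b)  # rows contributed by df3 have no `b`
count(ismissing, combined.c)  # rows contributed by df1 have no `c`
```

Inspecting the element type this way is a quick check on whether downstream code needs to handle missing values in a given column.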
There is an optional id argument that adds a column identifying which data frame each row came from. Note that both @bind_rows and @bind_cols accept multiple (i.e., more than 2) data frames, as in the example below.
@bind_rows(df1, df2, df3, id = "id")
Row | a | b | c | id |
---|---|---|---|---|
| | Int64 | Int64? | Int64? | Int64 |
1 | 1 | 1 | missing | 1 |
2 | 2 | 2 | missing | 1 |
3 | 3 | 3 | missing | 1 |
4 | 4 | 4 | missing | 2 |
5 | 5 | 5 | missing | 2 |
6 | 6 | 6 | missing | 2 |
7 | 7 | missing | 7 | 3 |
8 | 8 | missing | 8 | 3 |
9 | 9 | missing | 9 | 3 |
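One use of the identifier column is to recover the rows that came from a particular input. The sketch below assumes the package's @filter macro for the subsetting step; the names combined and from_df2 are illustrative only.

```julia
using TidierData

df1 = DataFrame(a=1:3, b=1:3)
df2 = DataFrame(a=4:6, b=4:6)
df3 = DataFrame(a=7:9, c=7:9)

# Bind all three inputs and record their position in the call in column `id`
combined = @bind_rows(df1, df2, df3, id = "id")

# Keep only the rows that originated from the second data frame
from_df2 = @filter(combined, id == 2)
```

Because the identifier reflects the order of the arguments, reordering the call changes which number each data frame receives.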
@bind_cols()
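@bind_cols works similarly to bind_cols() in R's tidyverse, although the .name_repair argument is not supported. Duplicate column names are made unique with a numeric suffix, as the a_1 and b_1 columns in the output show.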
@bind_cols(df1, df2)
Row | a | b | a_1 | b_1 |
---|---|---|---|---|
| | Int64 | Int64 | Int64 | Int64 |
1 | 1 | 1 | 4 | 4 |
2 | 2 | 2 | 5 | 5 |
3 | 3 | 3 | 6 | 6 |
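Since .name_repair is not available, one possible way to control the resulting names is to rename the clashing columns before binding. This is a sketch, assuming the package's @rename macro with its new_name = old_name syntax; df2_renamed, wide, a2, and b2 are hypothetical names chosen for illustration.

```julia
using TidierData

df1 = DataFrame(a=1:3, b=1:3)
df2 = DataFrame(a=4:6, b=4:6)

# Give the second data frame distinct column names up front
df2_renamed = @rename(df2, a2 = a, b2 = b)

# With no name clashes left, the names are expected to pass through unchanged
wide = @bind_cols(df1, df2_renamed)
names(wide)
```

Renaming first keeps the column names meaningful instead of relying on the automatically generated suffixes.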
This page was generated using Literate.jl.