@mutate
The primary purpose of @mutate() is to create a new column or to update an existing column without changing the number of rows in the dataset. If you only plan to keep the mutated columns, you can use @transmute() instead of @mutate(). However, in Tidier.jl, @select() can also be used to create and select new columns (unlike in R's tidyverse), which makes @transmute() redundant: it has the same functionality as @select(). @transmute() is included in Tidier.jl for convenience but is not strictly required.
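To make the distinction between @mutate() and @transmute() concrete, here is a minimal sketch on a small hand-typed data frame. The data frame `toy` and its values are hypothetical and are not part of the movies dataset; the sketch assumes that DataFrames.jl is installed alongside Tidier.jl.

```julia
using DataFrames
using Tidier

# Hypothetical toy data, typed by hand for illustration only.
toy = DataFrame(
    Title = ["Movie A", "Movie B", "Movie C"],
    Budget = [450_000, 19_000, 23_000_000],
)

# @mutate keeps every existing column and adds the new one.
mutated = @mutate(toy, Budget_Millions = Budget / 1_000_000)

# @transmute keeps only the columns named in the call.
transmuted = @transmute(toy, Title = Title, Budget_Millions = Budget / 1_000_000)

println(names(mutated))             # column names after @mutate
println(names(transmuted))          # column names after @transmute
println(nrow(mutated) == nrow(toy)) # the row count is unchanged
```

In both cases the number of rows stays the same; only the set of columns that is kept differs.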
using Tidier
using RDatasets
movies = dataset("ggplot2", "movies");
Using @mutate() to add a new column
Let's create a new column that contains the budget for each movie expressed in millions of dollars, and then select a handful of columns and rows for the sake of brevity. Notice that the underscores in 1_000_000 are strictly optional and included only for the sake of readability. Underscores within numbers are ignored by Julia, such that 1_000_000 is read by Julia exactly the same as 1000000.
@chain movies begin
    @filter(!ismissing(Budget))
    @mutate(Budget_Millions = Budget/1_000_000)
    @select(Title, Budget, Budget_Millions)
    @slice(1:5)
end
Row | Title | Budget | Budget_Millions |
---|---|---|---|
 | String | Int32? | Float64 |
1 | 'G' Men | 450000 | 0.45 |
2 | 'Manos' the Hands of Fate | 19000 | 0.019 |
3 | 'Til There Was You | 23000000 | 23.0 |
4 | .com for Murder | 5000000 | 5.0 |
5 | 10 Things I Hate About You | 16000000 | 16.0 |
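The note about underscores in numeric literals can be checked directly in plain Julia. This small sketch is independent of Tidier.jl; the variable names are illustrative only.

```julia
# Underscores in numeric literals are digit separators only;
# they do not change the value or the type of the literal.
with_separators = 1_000_000
without_separators = 1000000

println(with_separators == without_separators)                  # same value
println(typeof(with_separators) == typeof(without_separators))  # same type

# The same budget divided by either literal gives the same result.
println(450000 / with_separators == 450000 / without_separators)
```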
Using @mutate() to update an existing column
Here we will repeat the same exercise, except that we will overwrite the existing Budget column.
@chain movies begin
    @filter(!ismissing(Budget))
    @mutate(Budget = Budget/1_000_000)
    @select(Title, Budget)
    @slice(1:5)
end
Row | Title | Budget |
---|---|---|
 | String | Float64 |
1 | 'G' Men | 0.45 |
2 | 'Manos' the Hands of Fate | 0.019 |
3 | 'Til There Was You | 23.0 |
4 | .com for Murder | 5.0 |
5 | 10 Things I Hate About You | 16.0 |
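Overwriting a column can change its element type, as the type row in the output above suggests (the integer budgets become Float64 after division). The sketch below illustrates this on a hypothetical hand-typed data frame; the names `budgets`, `toy`, and `updated` are assumptions made for this example, and DataFrames.jl is assumed to be installed alongside Tidier.jl.

```julia
using DataFrames
using Tidier

# Hypothetical integer budgets, typed by hand for illustration only.
budgets = [450_000, 19_000, 23_000_000]
toy = DataFrame(Title = ["Movie A", "Movie B", "Movie C"], Budget = budgets)

# Reusing the name Budget on the left-hand side overwrites the column.
updated = @mutate(toy, Budget = Budget / 1_000_000)

println(eltype(budgets))         # an integer type
println(eltype(updated.Budget))  # a floating-point type after division
println(names(updated))          # no extra column was added
```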
Here's an example of using @mutate() with in to flag whether each movie was released in the 1990s.
@chain movies begin
    @filter(!ismissing(Budget))
    @mutate(Nineties = Year in 1990:1999)
    @select(Title, Year, Nineties)
    @slice(1:5)
end
Row | Title | Year | Nineties |
---|---|---|---|
 | String | Int32 | Bool |
1 | 'G' Men | 1935 | false |
2 | 'Manos' the Hands of Fate | 1966 | false |
3 | 'Til There Was You | 1997 | true |
4 | .com for Murder | 2002 | false |
5 | 10 Things I Hate About You | 1999 | true |
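For readers who want to see what this row-wise membership test computes, the same Boolean values can be reproduced in plain Julia with broadcasting. This is only a sketch of an equivalent computation on the five years shown above, not a description of how Tidier.jl implements in internally.

```julia
# The five release years from the output above, typed by hand.
years = [1935, 1966, 1997, 2002, 1999]

# Broadcast `in` over the years; Ref() keeps the range from being
# broadcast over element by element.
nineties = in.(years, Ref(1990:1999))

# An equivalent comprehension, shown for comparison.
nineties_alt = [year in 1990:1999 for year in years]

println(nineties == nineties_alt)
```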
Using @transmute() to update and select columns
If we knew we wanted to select only the Title and Budget columns, we could have also used @transmute(), which (again) is just an alias for @select().
@chain movies begin
    @filter(!ismissing(Budget))
    @transmute(Title = Title, Budget = Budget/1_000_000)
    @slice(1:5)
end
Row | Title | Budget |
---|---|---|
 | String | Float64 |
1 | 'G' Men | 0.45 |
2 | 'Manos' the Hands of Fate | 0.019 |
3 | 'Til There Was You | 23.0 |
4 | .com for Murder | 5.0 |
5 | 10 Things I Hate About You | 16.0 |
This page was generated using Literate.jl.