across
across() is a helper function typically used inside @mutate() or @summarize() to apply one or more functions to one or more columns. It accepts two arguments: a set of variables and a set of functions. To provide multiple variables or functions, supply them as a tuple, that is, wrapped in parentheses and separated by commas. To skip missing values, you can "fuse" a summary function (such as mean()) with skipmissing() using the function composition operator ∘, which you type in Julia as \circ followed by [Tab], so that it reads mean∘skipmissing.
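To see what the fused function does on its own, here is a minimal plain-Julia sketch that does not use TidierData. The vector `x` and the names `mean_skip`, `fns`, and `results` are hypothetical and exist only for illustration.

```julia
using Statistics  # standard library; provides mean and median

x = [1.0, missing, 3.0]  # hypothetical sample data

# Composing two functions with ∘ yields a new function.
mean_skip = mean ∘ skipmissing

mean_skip(x)  # equivalent to mean(skipmissing(x))
mean(x)       # returns missing, because a plain mean propagates missing values

# Several functions can be grouped in a tuple, the same shape across() expects.
fns = (mean ∘ skipmissing, median ∘ skipmissing)
results = map(f -> f(x), fns)
```

Here `mean_skip(x)` ignores the missing entry, while `mean(x)` does not, which is why the fused form is used in the summaries on this page.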
using TidierData
using RDatasets
movies = dataset("ggplot2", "movies");
One variable, one function
@chain movies begin
    @mutate(Budget = Budget / 1_000_000)
    @summarize(across(Budget, mean∘skipmissing))
end
One variable, one anonymous function
@chain movies begin
    @mutate(Budget = Budget / 1_000_000)
    @summarize(across(Budget, (x -> mean(skipmissing(x)))))
end
Note: compound (composed) functions are not currently supported inside anonymous functions. The anonymous function above works, but (x -> mean∘skipmissing(x)) does not. This is a known bug and will be fixed in a future update.
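As a side note on plain Julia, independent of TidierData: a function call binds more tightly than ∘, so the two spellings are not equivalent even outside the macros. The sketch below only illustrates that parsing rule. It makes no claim about how TidierData processes these expressions, and whether the parenthesized form works inside across() is not established here. All names are hypothetical.

```julia
using Statistics

x = [1.0, missing, 3.0]  # hypothetical sample data

# Explicit nesting: the form used in the example above.
f_explicit = v -> mean(skipmissing(v))

# Parenthesized composition: compose first, then call the result.
f_grouped = v -> (mean ∘ skipmissing)(v)

# Without parentheses this parses as mean ∘ (skipmissing(v)),
# so it builds a composed function instead of computing a number.
f_ungrouped = v -> mean ∘ skipmissing(v)

f_explicit(x) == f_grouped(x)  # true in plain Julia
f_ungrouped(x) isa Function    # true: no mean has been computed
```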
Multiple variables, multiple functions
@chain movies begin
    @mutate(Budget = Budget / 1_000_000)
    @summarize(across((Rating, Budget), (mean∘skipmissing, median∘skipmissing)))
end
Multiple selection helpers, multiple functions
@chain movies begin
    @mutate(Budget = Budget / 1_000_000)
    @summarize(across((starts_with("Bud"), ends_with("ting")), (mean∘skipmissing, median∘skipmissing)))
end