Applying functions to collections

The apply family consists of functions that help apply a function f to a collection x.

Base Julia already provides the map function, but it has two drawbacks:

  • It does not work on dictionaries;

  • It takes the function as its first argument and the collection as its second, which makes it less “pipeable”.

We will cover some common cases below.

One variable, one collection

Given a collection x and a one-variable function f, we can apply f to each element of x as follows:

using TidierIteration;

x = [3:6;];
f(x) = x^2;

apply(x, f)
4-element Vector{Int64}:
  9
 16
 25
 36

This, of course, is the same as

map(f, x)
4-element Vector{Int64}:
  9
 16
 25
 36

or

f.(x)
4-element Vector{Int64}:
  9
 16
 25
 36

Things get more interesting when we have a dictionary as follows:

d = Dict(i => i for i in [1:4;])
Dict{Int64, Int64} with 4 entries:
  4 => 4
  2 => 2
  3 => 3
  1 => 1

apply(d, f)
Dict{Int64, Int64} with 4 entries:
  4 => 16
  2 => 4
  3 => 9
  1 => 1

whereas map(f, d) throws an error, because map is not defined on dictionaries.

We can see a dictionary as a collection with named entries, and apply(d, f) means that we apply f to each value of d while keeping the keys of d intact.
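
As a rough mental model, the behaviour described above can be sketched in base Julia. This is only an illustration under the assumption that the result is a plain Dict; the helper name apply_values_sketch is hypothetical and this is not necessarily how TidierIteration implements apply.

```julia
# Hypothetical helper, for illustration only: apply f to every value of d
# and keep each key unchanged.
apply_values_sketch(d::AbstractDict, f) = Dict(k => f(v) for (k, v) in d)

d_sketch = Dict(i => i for i in 1:4)
squared = apply_values_sketch(d_sketch, v -> v^2)

# Same keys as before, with squared values.
squared == Dict(1 => 1, 2 => 4, 3 => 9, 4 => 16)  # true
```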

To modify the keys of a dictionary instead, use the dedicated function apply_keys:

apply_keys(d, x -> -x)
Dict{Int64, Int64} with 4 entries:
  -1 => 1
  -3 => 3
  -2 => 2
  -4 => 4
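
A base-Julia sketch of the same idea follows. The helper name apply_keys_sketch is hypothetical, and the collision behaviour shown at the end is a property of this sketch (later pairs overwrite earlier ones when building a Dict), not a documented guarantee of apply_keys.

```julia
# Hypothetical helper, for illustration only: transform each key of d with g
# and leave the values untouched.
apply_keys_sketch(d::AbstractDict, g) = Dict(g(k) => v for (k, v) in d)

d_keys = Dict(i => i for i in 1:4)
negated = apply_keys_sketch(d_keys, k -> -k)
negated == Dict(-1 => 1, -2 => 2, -3 => 3, -4 => 4)  # true

# If g sends two keys to the same new key, only one entry per new key survives.
collapsed = apply_keys_sketch(d_keys, k -> k % 2)
length(collapsed)  # 2
```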

If you just want to apply f for its side-effects and return nothing, use

walk(x, f)
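
For comparison, base Julia's foreach has the same "side effects only" flavour. The sketch below uses a hypothetical helper named walk_sketch with the collection-first argument order; it is an assumption about the semantics, not the package's code.

```julia
# Hypothetical helper, for illustration only: call f on each element of x
# and discard whatever f returns.
walk_sketch(x, f) = foreach(f, x)

seen = Int[]
result = walk_sketch([3, 4, 5, 6], v -> push!(seen, v^2))

result === nothing       # true: nothing is returned
seen == [9, 16, 25, 36]  # true: only the side effect remains
```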

If you want to convert each output of f to a specific type, you can pass a composed function, built with the composition operator ∘:

apply(x, string ∘ f)
4-element Vector{String}:
 "9"
 "16"
 "25"
 "36"

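The composition operator ∘ is part of base Julia (typed as \circ followed by Tab in the REPL), so the conversion step can be checked on its own. The names sq and sq_then_string below are illustrative only.

```julia
sq(v) = v^2

# (string ∘ sq)(v) is the same as string(sq(v)).
sq_then_string = string ∘ sq
single = sq_then_string(3)

# Mapping the composed function over a vector gives a Vector{String}.
converted = map(string ∘ sq, [3, 4, 5, 6])
converted == ["9", "16", "25", "36"]  # true
```
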
Two variables, two collections

We can apply a two-variable function f to two collections x and y by applying f to each pair (x_i, y_i), where x_i is the i-th element of x and y_i is the i-th element of y. If x and y have different lengths, iteration stops as soon as the shorter one is exhausted.

x = [1:4;]
y = [5:7;]
f(x, y) = x + y

apply2(x, y, f)
3-element Vector{Int64}:
  6
  8
 10
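
The truncation rule matches what zip does in base Julia, so a minimal sketch of the pairwise case might look like this. The helper name apply2_sketch is hypothetical and the code is an illustration of the rule, not the package's implementation.

```julia
# Hypothetical helper, for illustration only: zip stops at the end of the
# shorter collection, so the extra element of x is never visited.
apply2_sketch(x, y, f) = [f(a, b) for (a, b) in zip(x, y)]

sums = apply2_sketch([1, 2, 3, 4], [5, 6, 7], +)
sums == [6, 8, 10]  # true
```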

When x and y are dictionaries, we iterate on the set of common keys:


d1 = Dict(i => i for i in [1:4;])
d2 = Dict(i => i^2 for i in [3:9;])

apply2(d1, d2, f)
Dict{Int64, Int64} with 2 entries:
  4 => 20
  3 => 12
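
Restricting to the common keys can be sketched in base Julia with intersect. The helper name apply2_dict_sketch is hypothetical; the code assumes the result is a plain Dict and is not taken from the package.

```julia
# Hypothetical helper, for illustration only: combine the values of two
# dictionaries key by key, using only keys present in both.
function apply2_dict_sketch(d1::AbstractDict, d2::AbstractDict, f)
    common = intersect(keys(d1), keys(d2))
    return Dict(k => f(d1[k], d2[k]) for k in common)
end

da = Dict(i => i for i in 1:4)
db = Dict(i => i^2 for i in 3:9)

combined = apply2_dict_sketch(da, db, +)
combined == Dict(3 => 12, 4 => 20)  # true
```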

Two variables, one collection

In this case, the index of each element of x serves as the first argument of f; that is, we apply f to the pairs (i, x_i) for each index i of x. Note that i is always passed to f as the first argument.

x = [3:6;]
g(i, x) = Dict(i => x)
iapply(x, g)
4-element Vector{Dict{Int64, Int64}}:
 Dict(1 => 3)
 Dict(2 => 4)
 Dict(3 => 5)
 Dict(4 => 6)

When x is a dictionary, the indices i are the keys of x:

d = Dict(i => i for i in [1:4;])
h(k, v) = k + v

iapply(d, h)
Dict{Int64, Int64} with 4 entries:
  4 => 8
  2 => 4
  3 => 6
  1 => 2
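
Both indexed cases can be approximated in base Julia, which may help clarify the argument order. The helper names iapply_vec_sketch and iapply_dict_sketch are hypothetical, and the code is a sketch of the described behaviour rather than the package's implementation.

```julia
# Hypothetical helpers, for illustration only. In both, the index (or key)
# is passed to f first and the element (or value) second.
iapply_vec_sketch(x::AbstractVector, f) = [f(i, v) for (i, v) in pairs(x)]
iapply_dict_sketch(d::AbstractDict, f) = Dict(k => f(k, v) for (k, v) in d)

# Vector: index times value, i.e. 1*3, 2*4, 3*5, 4*6.
products = iapply_vec_sketch([3, 4, 5, 6], (i, v) -> i * v)
products == [3, 8, 15, 24]  # true

# Dictionary: key plus value, keys preserved.
key_plus_value = iapply_dict_sketch(Dict(i => i for i in 1:4), +)
key_plus_value == Dict(1 => 2, 2 => 4, 3 => 6, 4 => 8)  # true
```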

One variable and one collection, dataframe output

When the output of f is a dataframe, we can bind all rows (or columns) quickly as follows:

using DataFrames

x = [1:4;]
h1(x) = DataFrame(:x => x)
apply_dfr(x, h1)
4×1 DataFrame
 Row │ x
     │ Int64
─────┼───────
   1 │     1
   2 │     2
   3 │     3
   4 │     4

or

s = "abcd";
h2(s) = DataFrame(string(s) => rand(1))
h2("b")
apply_dfc(s, h2)
1×4 DataFrame
 Row │ a          b         c         d
     │ Float64    Float64   Float64   Float64
─────┼─────────────────────────────────────────
   1 │ 0.0621625  0.190929  0.296002  0.625887

p variables and one collection

We can apply a p-variable function f to a collection x made of p collections. For each position i, f receives the i-th element of every inner collection as its p arguments:

x = [
    [1, 2], [3, 4], [5, 6]
]
f(x, y, z) = x + y + z

papply(x, f)
2-element Vector{Int64}:
  9
 12
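
One way to read this result is that the inner collections are zipped together and f is splatted over each resulting tuple. The base-Julia sketch below illustrates that reading; the helper name papply_sketch is hypothetical and the code is not taken from the package.

```julia
# Hypothetical helper, for illustration only: zip the p inner collections
# and call f with one element from each.
papply_sketch(x, f) = [f(args...) for args in zip(x...)]

nested = [[1, 2], [3, 4], [5, 6]]
add3(a, b, c) = a + b + c

# f is called as add3(1, 3, 5) and add3(2, 4, 6).
totals = papply_sketch(nested, add3)
totals == [9, 12]  # true
```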