In this chapter we will see operations that deal with rows, be it ordering or throwing some rows away.
The following is necessary to run all examples:
using DataFrames , PalmerPenguins
using Tidier
import DataFramesMeta as DFM
penguins = PalmerPenguins.load () |> DataFrame;
@slice_head (penguins, n = 10 )
1
Adelie
Torgersen
39.1
18.7
181
3750
male
2
Adelie
Torgersen
39.5
17.4
186
3800
female
3
Adelie
Torgersen
40.3
18.0
195
3250
female
4
Adelie
Torgersen
missing
missing
missing
missing
missing
5
Adelie
Torgersen
36.7
19.3
193
3450
female
6
Adelie
Torgersen
39.3
20.6
190
3650
male
7
Adelie
Torgersen
38.9
17.8
181
3625
female
8
Adelie
Torgersen
39.2
19.6
195
4675
male
9
Adelie
Torgersen
34.1
18.1
193
3475
missing
10
Adelie
Torgersen
42.0
20.2
190
4250
missing
Filtering (or: throwing rows away)
To filter a dataframe means keeping only the rows that satisfy a certain criteria (ie. a boolean condition).
To filter in Tidier, we use the macro @filter
. You can use it in the form
@filter (penguins, species == "Adelie" )
152×7 DataFrame
127 rows omitted
1
Adelie
Torgersen
39.1
18.7
181
3750
male
2
Adelie
Torgersen
39.5
17.4
186
3800
female
3
Adelie
Torgersen
40.3
18.0
195
3250
female
4
Adelie
Torgersen
missing
missing
missing
missing
missing
5
Adelie
Torgersen
36.7
19.3
193
3450
female
6
Adelie
Torgersen
39.3
20.6
190
3650
male
7
Adelie
Torgersen
38.9
17.8
181
3625
female
8
Adelie
Torgersen
39.2
19.6
195
4675
male
9
Adelie
Torgersen
34.1
18.1
193
3475
missing
10
Adelie
Torgersen
42.0
20.2
190
4250
missing
11
Adelie
Torgersen
37.8
17.1
186
3300
missing
12
Adelie
Torgersen
37.8
17.3
180
3700
missing
13
Adelie
Torgersen
41.1
17.6
182
3200
female
⋮
⋮
⋮
⋮
⋮
⋮
⋮
⋮
141
Adelie
Dream
40.2
17.1
193
3400
female
142
Adelie
Dream
40.6
17.2
187
3475
male
143
Adelie
Dream
32.1
15.5
188
3050
female
144
Adelie
Dream
40.7
17.0
190
3725
male
145
Adelie
Dream
37.3
16.8
192
3000
female
146
Adelie
Dream
39.0
18.7
185
3650
male
147
Adelie
Dream
39.2
18.6
190
4250
male
148
Adelie
Dream
36.6
18.4
184
3475
female
149
Adelie
Dream
36.0
17.8
195
3450
female
150
Adelie
Dream
37.8
18.1
193
3750
male
151
Adelie
Dream
36.0
17.1
187
3700
female
152
Adelie
Dream
41.5
18.5
201
4000
male
or without parentesis as in
@filter penguins species == "Adelie"
152×7 DataFrame
127 rows omitted
1
Adelie
Torgersen
39.1
18.7
181
3750
male
2
Adelie
Torgersen
39.5
17.4
186
3800
female
3
Adelie
Torgersen
40.3
18.0
195
3250
female
4
Adelie
Torgersen
missing
missing
missing
missing
missing
5
Adelie
Torgersen
36.7
19.3
193
3450
female
6
Adelie
Torgersen
39.3
20.6
190
3650
male
7
Adelie
Torgersen
38.9
17.8
181
3625
female
8
Adelie
Torgersen
39.2
19.6
195
4675
male
9
Adelie
Torgersen
34.1
18.1
193
3475
missing
10
Adelie
Torgersen
42.0
20.2
190
4250
missing
11
Adelie
Torgersen
37.8
17.1
186
3300
missing
12
Adelie
Torgersen
37.8
17.3
180
3700
missing
13
Adelie
Torgersen
41.1
17.6
182
3200
female
⋮
⋮
⋮
⋮
⋮
⋮
⋮
⋮
141
Adelie
Dream
40.2
17.1
193
3400
female
142
Adelie
Dream
40.6
17.2
187
3475
male
143
Adelie
Dream
32.1
15.5
188
3050
female
144
Adelie
Dream
40.7
17.0
190
3725
male
145
Adelie
Dream
37.3
16.8
192
3000
female
146
Adelie
Dream
39.0
18.7
185
3650
male
147
Adelie
Dream
39.2
18.6
190
4250
male
148
Adelie
Dream
36.6
18.4
184
3475
female
149
Adelie
Dream
36.0
17.8
195
3450
female
150
Adelie
Dream
37.8
18.1
193
3750
male
151
Adelie
Dream
36.0
17.1
187
3700
female
152
Adelie
Dream
41.5
18.5
201
4000
male
Notice that the columns are typed as if they were variables on the Julia environment. This is inspired by the tidyverse
behaviour of data-masking: inside a tidyverse verb, the columns are taken as “statistical variables” that exist inside the dataframe as columns.
In DataFramesMeta, we have two macros for filtering: @subset
and @rsubset
. Use the first when you have some criteria that uses a whole column, for example:
DFM.@subset penguins : body_mass_g .>= mean (skipmissing (: body_mass_g))
149×7 DataFrame
124 rows omitted
1
Adelie
Torgersen
39.2
19.6
195
4675
male
2
Adelie
Torgersen
42.0
20.2
190
4250
missing
3
Adelie
Torgersen
34.6
21.1
198
4400
male
4
Adelie
Torgersen
42.5
20.7
197
4500
male
5
Adelie
Dream
39.8
19.1
184
4650
male
6
Adelie
Dream
44.1
19.7
196
4400
male
7
Adelie
Dream
39.6
18.8
190
4600
male
8
Adelie
Biscoe
40.1
18.9
188
4300
male
9
Adelie
Biscoe
41.3
21.1
195
4400
male
10
Adelie
Torgersen
41.8
19.4
198
4450
male
11
Adelie
Torgersen
42.8
18.5
195
4250
male
12
Adelie
Torgersen
42.9
17.6
196
4700
male
13
Adelie
Dream
41.1
18.1
205
4300
male
⋮
⋮
⋮
⋮
⋮
⋮
⋮
⋮
138
Gentoo
Biscoe
47.2
13.7
214
4925
female
139
Gentoo
Biscoe
46.8
14.3
215
4850
female
140
Gentoo
Biscoe
50.4
15.7
222
5750
male
141
Gentoo
Biscoe
45.2
14.8
212
5200
female
142
Gentoo
Biscoe
49.9
16.1
213
5400
male
143
Chinstrap
Dream
49.2
18.2
195
4400
male
144
Chinstrap
Dream
52.8
20.0
205
4550
male
145
Chinstrap
Dream
54.2
20.8
201
4300
male
146
Chinstrap
Dream
52.0
20.7
210
4800
male
147
Chinstrap
Dream
53.5
19.9
205
4500
male
148
Chinstrap
Dream
50.8
18.5
201
4450
male
149
Chinstrap
Dream
49.0
19.6
212
4300
male
Notice the broadcast on >=. We need it because each variable is interpreted as a vector (the whole column) . Also, notice that we refer to columns as symbols (i.e. we append :
to it).
In the above example, we needed the whole column body_mass_g
to take the mean and then filter the rows based on that. If, however, your filtering criteria only uses information about each row (without needing to see it in context of the whole column), then @rsubset
(r ow subset) is easier to use: it interprets each columns as a value (not an array), so no broadcasting is needed:
DFM.@rsubset penguins : species == "Adelie"
152×7 DataFrame
127 rows omitted
1
Adelie
Torgersen
39.1
18.7
181
3750
male
2
Adelie
Torgersen
39.5
17.4
186
3800
female
3
Adelie
Torgersen
40.3
18.0
195
3250
female
4
Adelie
Torgersen
missing
missing
missing
missing
missing
5
Adelie
Torgersen
36.7
19.3
193
3450
female
6
Adelie
Torgersen
39.3
20.6
190
3650
male
7
Adelie
Torgersen
38.9
17.8
181
3625
female
8
Adelie
Torgersen
39.2
19.6
195
4675
male
9
Adelie
Torgersen
34.1
18.1
193
3475
missing
10
Adelie
Torgersen
42.0
20.2
190
4250
missing
11
Adelie
Torgersen
37.8
17.1
186
3300
missing
12
Adelie
Torgersen
37.8
17.3
180
3700
missing
13
Adelie
Torgersen
41.1
17.6
182
3200
female
⋮
⋮
⋮
⋮
⋮
⋮
⋮
⋮
141
Adelie
Dream
40.2
17.1
193
3400
female
142
Adelie
Dream
40.6
17.2
187
3475
male
143
Adelie
Dream
32.1
15.5
188
3050
female
144
Adelie
Dream
40.7
17.0
190
3725
male
145
Adelie
Dream
37.3
16.8
192
3000
female
146
Adelie
Dream
39.0
18.7
185
3650
male
147
Adelie
Dream
39.2
18.6
190
4250
male
148
Adelie
Dream
36.6
18.4
184
3475
female
149
Adelie
Dream
36.0
17.8
195
3450
female
150
Adelie
Dream
37.8
18.1
193
3750
male
151
Adelie
Dream
36.0
17.1
187
3700
female
152
Adelie
Dream
41.5
18.5
201
4000
male
In both Tidier and DataFramesMeta, only the rows to which the criteria is true
are returned. This means that false
and missing
are thrown away.
In pure DataFrames, we use the subset
function, and the criteria is passed with the notation
subset (penguins, : column => boolean_function)
where boolean_function
is a boolean (with possibly missing
values) function on 1 variable (the :column
you passed). Add the kwarg skipmissing=true
if you want to get rid of missing values.
Filtering with one criteria
Problem: Filtering all the rows with species
== “Adelie”.
@filter penguins species == "Adelie"
152×7 DataFrame
127 rows omitted
1
Adelie
Torgersen
39.1
18.7
181
3750
male
2
Adelie
Torgersen
39.5
17.4
186
3800
female
3
Adelie
Torgersen
40.3
18.0
195
3250
female
4
Adelie
Torgersen
missing
missing
missing
missing
missing
5
Adelie
Torgersen
36.7
19.3
193
3450
female
6
Adelie
Torgersen
39.3
20.6
190
3650
male
7
Adelie
Torgersen
38.9
17.8
181
3625
female
8
Adelie
Torgersen
39.2
19.6
195
4675
male
9
Adelie
Torgersen
34.1
18.1
193
3475
missing
10
Adelie
Torgersen
42.0
20.2
190
4250
missing
11
Adelie
Torgersen
37.8
17.1
186
3300
missing
12
Adelie
Torgersen
37.8
17.3
180
3700
missing
13
Adelie
Torgersen
41.1
17.6
182
3200
female
⋮
⋮
⋮
⋮
⋮
⋮
⋮
⋮
141
Adelie
Dream
40.2
17.1
193
3400
female
142
Adelie
Dream
40.6
17.2
187
3475
male
143
Adelie
Dream
32.1
15.5
188
3050
female
144
Adelie
Dream
40.7
17.0
190
3725
male
145
Adelie
Dream
37.3
16.8
192
3000
female
146
Adelie
Dream
39.0
18.7
185
3650
male
147
Adelie
Dream
39.2
18.6
190
4250
male
148
Adelie
Dream
36.6
18.4
184
3475
female
149
Adelie
Dream
36.0
17.8
195
3450
female
150
Adelie
Dream
37.8
18.1
193
3750
male
151
Adelie
Dream
36.0
17.1
187
3700
female
152
Adelie
Dream
41.5
18.5
201
4000
male
DFM.@rsubset penguins : species == "Adelie"
152×7 DataFrame
127 rows omitted
1
Adelie
Torgersen
39.1
18.7
181
3750
male
2
Adelie
Torgersen
39.5
17.4
186
3800
female
3
Adelie
Torgersen
40.3
18.0
195
3250
female
4
Adelie
Torgersen
missing
missing
missing
missing
missing
5
Adelie
Torgersen
36.7
19.3
193
3450
female
6
Adelie
Torgersen
39.3
20.6
190
3650
male
7
Adelie
Torgersen
38.9
17.8
181
3625
female
8
Adelie
Torgersen
39.2
19.6
195
4675
male
9
Adelie
Torgersen
34.1
18.1
193
3475
missing
10
Adelie
Torgersen
42.0
20.2
190
4250
missing
11
Adelie
Torgersen
37.8
17.1
186
3300
missing
12
Adelie
Torgersen
37.8
17.3
180
3700
missing
13
Adelie
Torgersen
41.1
17.6
182
3200
female
⋮
⋮
⋮
⋮
⋮
⋮
⋮
⋮
141
Adelie
Dream
40.2
17.1
193
3400
female
142
Adelie
Dream
40.6
17.2
187
3475
male
143
Adelie
Dream
32.1
15.5
188
3050
female
144
Adelie
Dream
40.7
17.0
190
3725
male
145
Adelie
Dream
37.3
16.8
192
3000
female
146
Adelie
Dream
39.0
18.7
185
3650
male
147
Adelie
Dream
39.2
18.6
190
4250
male
148
Adelie
Dream
36.6
18.4
184
3475
female
149
Adelie
Dream
36.0
17.8
195
3450
female
150
Adelie
Dream
37.8
18.1
193
3750
male
151
Adelie
Dream
36.0
17.1
187
3700
female
152
Adelie
Dream
41.5
18.5
201
4000
male
subset (penguins, : species => x -> x .== "Adelie" , skipmissing= true )
152×7 DataFrame
127 rows omitted
1
Adelie
Torgersen
39.1
18.7
181
3750
male
2
Adelie
Torgersen
39.5
17.4
186
3800
female
3
Adelie
Torgersen
40.3
18.0
195
3250
female
4
Adelie
Torgersen
missing
missing
missing
missing
missing
5
Adelie
Torgersen
36.7
19.3
193
3450
female
6
Adelie
Torgersen
39.3
20.6
190
3650
male
7
Adelie
Torgersen
38.9
17.8
181
3625
female
8
Adelie
Torgersen
39.2
19.6
195
4675
male
9
Adelie
Torgersen
34.1
18.1
193
3475
missing
10
Adelie
Torgersen
42.0
20.2
190
4250
missing
11
Adelie
Torgersen
37.8
17.1
186
3300
missing
12
Adelie
Torgersen
37.8
17.3
180
3700
missing
13
Adelie
Torgersen
41.1
17.6
182
3200
female
⋮
⋮
⋮
⋮
⋮
⋮
⋮
⋮
141
Adelie
Dream
40.2
17.1
193
3400
female
142
Adelie
Dream
40.6
17.2
187
3475
male
143
Adelie
Dream
32.1
15.5
188
3050
female
144
Adelie
Dream
40.7
17.0
190
3725
male
145
Adelie
Dream
37.3
16.8
192
3000
female
146
Adelie
Dream
39.0
18.7
185
3650
male
147
Adelie
Dream
39.2
18.6
190
4250
male
148
Adelie
Dream
36.6
18.4
184
3475
female
149
Adelie
Dream
36.0
17.8
195
3450
female
150
Adelie
Dream
37.8
18.1
193
3750
male
151
Adelie
Dream
36.0
17.1
187
3700
female
152
Adelie
Dream
41.5
18.5
201
4000
male
Filtering with several criteria
Problem: Filtering all the rows with species
== “Adelie”, sex
== “male” and body_mass_g
> 4000.
@filter penguins species == "Adelie" sex == "male" body_mass_g > 4000
34×7 DataFrame
9 rows omitted
1
Adelie
Torgersen
39.2
19.6
195
4675
male
2
Adelie
Torgersen
34.6
21.1
198
4400
male
3
Adelie
Torgersen
42.5
20.7
197
4500
male
4
Adelie
Torgersen
46.0
21.5
194
4200
male
5
Adelie
Dream
39.2
21.1
196
4150
male
6
Adelie
Dream
39.8
19.1
184
4650
male
7
Adelie
Dream
44.1
19.7
196
4400
male
8
Adelie
Dream
39.6
18.8
190
4600
male
9
Adelie
Dream
42.3
21.2
191
4150
male
10
Adelie
Biscoe
40.1
18.9
188
4300
male
11
Adelie
Biscoe
42.0
19.5
200
4050
male
12
Adelie
Biscoe
41.3
21.1
195
4400
male
13
Adelie
Biscoe
41.1
18.2
192
4050
male
⋮
⋮
⋮
⋮
⋮
⋮
⋮
⋮
23
Adelie
Dream
40.3
18.5
196
4350
male
24
Adelie
Dream
43.2
18.5
192
4100
male
25
Adelie
Biscoe
41.0
20.0
203
4725
male
26
Adelie
Biscoe
37.8
20.0
190
4250
male
27
Adelie
Biscoe
43.2
19.0
197
4775
male
28
Adelie
Biscoe
45.6
20.3
191
4600
male
29
Adelie
Biscoe
42.2
19.5
197
4275
male
30
Adelie
Biscoe
42.7
18.3
196
4075
male
31
Adelie
Torgersen
41.5
18.3
195
4300
male
32
Adelie
Dream
37.5
18.5
199
4475
male
33
Adelie
Dream
39.7
17.9
193
4250
male
34
Adelie
Dream
39.2
18.6
190
4250
male
DFM.@rsubset penguins : species == "Adelie" : sex == "male" : body_mass_g > 4000
34×7 DataFrame
9 rows omitted
1
Adelie
Torgersen
39.2
19.6
195
4675
male
2
Adelie
Torgersen
34.6
21.1
198
4400
male
3
Adelie
Torgersen
42.5
20.7
197
4500
male
4
Adelie
Torgersen
46.0
21.5
194
4200
male
5
Adelie
Dream
39.2
21.1
196
4150
male
6
Adelie
Dream
39.8
19.1
184
4650
male
7
Adelie
Dream
44.1
19.7
196
4400
male
8
Adelie
Dream
39.6
18.8
190
4600
male
9
Adelie
Dream
42.3
21.2
191
4150
male
10
Adelie
Biscoe
40.1
18.9
188
4300
male
11
Adelie
Biscoe
42.0
19.5
200
4050
male
12
Adelie
Biscoe
41.3
21.1
195
4400
male
13
Adelie
Biscoe
41.1
18.2
192
4050
male
⋮
⋮
⋮
⋮
⋮
⋮
⋮
⋮
23
Adelie
Dream
40.3
18.5
196
4350
male
24
Adelie
Dream
43.2
18.5
192
4100
male
25
Adelie
Biscoe
41.0
20.0
203
4725
male
26
Adelie
Biscoe
37.8
20.0
190
4250
male
27
Adelie
Biscoe
43.2
19.0
197
4775
male
28
Adelie
Biscoe
45.6
20.3
191
4600
male
29
Adelie
Biscoe
42.2
19.5
197
4275
male
30
Adelie
Biscoe
42.7
18.3
196
4075
male
31
Adelie
Torgersen
41.5
18.3
195
4300
male
32
Adelie
Dream
37.5
18.5
199
4475
male
33
Adelie
Dream
39.7
17.9
193
4250
male
34
Adelie
Dream
39.2
18.6
190
4250
male
subset (
penguins
, [: species, : sex, : body_mass_g] =>
(x, y, z) -> (x .== "Adelie" ) .& (y .== "male" ) .& (z .> 4000 )
,skipmissing= true
)
34×7 DataFrame
9 rows omitted
1
Adelie
Torgersen
39.2
19.6
195
4675
male
2
Adelie
Torgersen
34.6
21.1
198
4400
male
3
Adelie
Torgersen
42.5
20.7
197
4500
male
4
Adelie
Torgersen
46.0
21.5
194
4200
male
5
Adelie
Dream
39.2
21.1
196
4150
male
6
Adelie
Dream
39.8
19.1
184
4650
male
7
Adelie
Dream
44.1
19.7
196
4400
male
8
Adelie
Dream
39.6
18.8
190
4600
male
9
Adelie
Dream
42.3
21.2
191
4150
male
10
Adelie
Biscoe
40.1
18.9
188
4300
male
11
Adelie
Biscoe
42.0
19.5
200
4050
male
12
Adelie
Biscoe
41.3
21.1
195
4400
male
13
Adelie
Biscoe
41.1
18.2
192
4050
male
⋮
⋮
⋮
⋮
⋮
⋮
⋮
⋮
23
Adelie
Dream
40.3
18.5
196
4350
male
24
Adelie
Dream
43.2
18.5
192
4100
male
25
Adelie
Biscoe
41.0
20.0
203
4725
male
26
Adelie
Biscoe
37.8
20.0
190
4250
male
27
Adelie
Biscoe
43.2
19.0
197
4775
male
28
Adelie
Biscoe
45.6
20.3
191
4600
male
29
Adelie
Biscoe
42.2
19.5
197
4275
male
30
Adelie
Biscoe
42.7
18.3
196
4075
male
31
Adelie
Torgersen
41.5
18.3
195
4300
male
32
Adelie
Dream
37.5
18.5
199
4475
male
33
Adelie
Dream
39.7
17.9
193
4250
male
34
Adelie
Dream
39.2
18.6
190
4250
male
Problem: Filtering all the rows with species
== “Adelie” OR sex
== “male”.
@filter penguins (species == "Adelie" ) | (sex == "male" )
247×7 DataFrame
222 rows omitted
1
Adelie
Torgersen
39.1
18.7
181
3750
male
2
Adelie
Torgersen
39.5
17.4
186
3800
female
3
Adelie
Torgersen
40.3
18.0
195
3250
female
4
Adelie
Torgersen
missing
missing
missing
missing
missing
5
Adelie
Torgersen
36.7
19.3
193
3450
female
6
Adelie
Torgersen
39.3
20.6
190
3650
male
7
Adelie
Torgersen
38.9
17.8
181
3625
female
8
Adelie
Torgersen
39.2
19.6
195
4675
male
9
Adelie
Torgersen
34.1
18.1
193
3475
missing
10
Adelie
Torgersen
42.0
20.2
190
4250
missing
11
Adelie
Torgersen
37.8
17.1
186
3300
missing
12
Adelie
Torgersen
37.8
17.3
180
3700
missing
13
Adelie
Torgersen
41.1
17.6
182
3200
female
⋮
⋮
⋮
⋮
⋮
⋮
⋮
⋮
236
Chinstrap
Dream
50.8
18.5
201
4450
male
237
Chinstrap
Dream
49.0
19.6
212
4300
male
238
Chinstrap
Dream
51.5
18.7
187
3250
male
239
Chinstrap
Dream
51.4
19.0
201
3950
male
240
Chinstrap
Dream
50.7
19.7
203
4050
male
241
Chinstrap
Dream
52.2
18.8
197
3450
male
242
Chinstrap
Dream
49.3
19.9
203
4050
male
243
Chinstrap
Dream
50.2
18.8
202
3800
male
244
Chinstrap
Dream
51.9
19.5
206
3950
male
245
Chinstrap
Dream
55.8
19.8
207
4000
male
246
Chinstrap
Dream
49.6
18.2
193
3775
male
247
Chinstrap
Dream
50.8
19.0
210
4100
male
DFM.@rsubset penguins (: species == "Adelie" ) | (: sex == "male" )
247×7 DataFrame
222 rows omitted
1
Adelie
Torgersen
39.1
18.7
181
3750
male
2
Adelie
Torgersen
39.5
17.4
186
3800
female
3
Adelie
Torgersen
40.3
18.0
195
3250
female
4
Adelie
Torgersen
missing
missing
missing
missing
missing
5
Adelie
Torgersen
36.7
19.3
193
3450
female
6
Adelie
Torgersen
39.3
20.6
190
3650
male
7
Adelie
Torgersen
38.9
17.8
181
3625
female
8
Adelie
Torgersen
39.2
19.6
195
4675
male
9
Adelie
Torgersen
34.1
18.1
193
3475
missing
10
Adelie
Torgersen
42.0
20.2
190
4250
missing
11
Adelie
Torgersen
37.8
17.1
186
3300
missing
12
Adelie
Torgersen
37.8
17.3
180
3700
missing
13
Adelie
Torgersen
41.1
17.6
182
3200
female
⋮
⋮
⋮
⋮
⋮
⋮
⋮
⋮
236
Chinstrap
Dream
50.8
18.5
201
4450
male
237
Chinstrap
Dream
49.0
19.6
212
4300
male
238
Chinstrap
Dream
51.5
18.7
187
3250
male
239
Chinstrap
Dream
51.4
19.0
201
3950
male
240
Chinstrap
Dream
50.7
19.7
203
4050
male
241
Chinstrap
Dream
52.2
18.8
197
3450
male
242
Chinstrap
Dream
49.3
19.9
203
4050
male
243
Chinstrap
Dream
50.2
18.8
202
3800
male
244
Chinstrap
Dream
51.9
19.5
206
3950
male
245
Chinstrap
Dream
55.8
19.8
207
4000
male
246
Chinstrap
Dream
49.6
18.2
193
3775
male
247
Chinstrap
Dream
50.8
19.0
210
4100
male
subset (penguins, [: species, : sex] => (x, y) -> (x .== "Adelie" ) .| (y .== "male" ), skipmissing= true )
247×7 DataFrame
222 rows omitted
1
Adelie
Torgersen
39.1
18.7
181
3750
male
2
Adelie
Torgersen
39.5
17.4
186
3800
female
3
Adelie
Torgersen
40.3
18.0
195
3250
female
4
Adelie
Torgersen
missing
missing
missing
missing
missing
5
Adelie
Torgersen
36.7
19.3
193
3450
female
6
Adelie
Torgersen
39.3
20.6
190
3650
male
7
Adelie
Torgersen
38.9
17.8
181
3625
female
8
Adelie
Torgersen
39.2
19.6
195
4675
male
9
Adelie
Torgersen
34.1
18.1
193
3475
missing
10
Adelie
Torgersen
42.0
20.2
190
4250
missing
11
Adelie
Torgersen
37.8
17.1
186
3300
missing
12
Adelie
Torgersen
37.8
17.3
180
3700
missing
13
Adelie
Torgersen
41.1
17.6
182
3200
female
⋮
⋮
⋮
⋮
⋮
⋮
⋮
⋮
236
Chinstrap
Dream
50.8
18.5
201
4450
male
237
Chinstrap
Dream
49.0
19.6
212
4300
male
238
Chinstrap
Dream
51.5
18.7
187
3250
male
239
Chinstrap
Dream
51.4
19.0
201
3950
male
240
Chinstrap
Dream
50.7
19.7
203
4050
male
241
Chinstrap
Dream
52.2
18.8
197
3450
male
242
Chinstrap
Dream
49.3
19.9
203
4050
male
243
Chinstrap
Dream
50.2
18.8
202
3800
male
244
Chinstrap
Dream
51.9
19.5
206
3950
male
245
Chinstrap
Dream
55.8
19.8
207
4000
male
246
Chinstrap
Dream
49.6
18.2
193
3775
male
247
Chinstrap
Dream
50.8
19.0
210
4100
male
Filtering with a variable column name
Suppose the column you want to filter is a variable, let’s say a symbol
Problem: Filtering all the rows where the column stored in my_column
is “Adelie”.
@eval @filter penguins $ my_column == "Adelie"
152×7 DataFrame
127 rows omitted
1
Adelie
Torgersen
39.1
18.7
181
3750
male
2
Adelie
Torgersen
39.5
17.4
186
3800
female
3
Adelie
Torgersen
40.3
18.0
195
3250
female
4
Adelie
Torgersen
missing
missing
missing
missing
missing
5
Adelie
Torgersen
36.7
19.3
193
3450
female
6
Adelie
Torgersen
39.3
20.6
190
3650
male
7
Adelie
Torgersen
38.9
17.8
181
3625
female
8
Adelie
Torgersen
39.2
19.6
195
4675
male
9
Adelie
Torgersen
34.1
18.1
193
3475
missing
10
Adelie
Torgersen
42.0
20.2
190
4250
missing
11
Adelie
Torgersen
37.8
17.1
186
3300
missing
12
Adelie
Torgersen
37.8
17.3
180
3700
missing
13
Adelie
Torgersen
41.1
17.6
182
3200
female
⋮
⋮
⋮
⋮
⋮
⋮
⋮
⋮
141
Adelie
Dream
40.2
17.1
193
3400
female
142
Adelie
Dream
40.6
17.2
187
3475
male
143
Adelie
Dream
32.1
15.5
188
3050
female
144
Adelie
Dream
40.7
17.0
190
3725
male
145
Adelie
Dream
37.3
16.8
192
3000
female
146
Adelie
Dream
39.0
18.7
185
3650
male
147
Adelie
Dream
39.2
18.6
190
4250
male
148
Adelie
Dream
36.6
18.4
184
3475
female
149
Adelie
Dream
36.0
17.8
195
3450
female
150
Adelie
Dream
37.8
18.1
193
3750
male
151
Adelie
Dream
36.0
17.1
187
3700
female
152
Adelie
Dream
41.5
18.5
201
4000
male
DFM.@rsubset penguins $ my_column == "Adelie"
152×7 DataFrame
127 rows omitted
1
Adelie
Torgersen
39.1
18.7
181
3750
male
2
Adelie
Torgersen
39.5
17.4
186
3800
female
3
Adelie
Torgersen
40.3
18.0
195
3250
female
4
Adelie
Torgersen
missing
missing
missing
missing
missing
5
Adelie
Torgersen
36.7
19.3
193
3450
female
6
Adelie
Torgersen
39.3
20.6
190
3650
male
7
Adelie
Torgersen
38.9
17.8
181
3625
female
8
Adelie
Torgersen
39.2
19.6
195
4675
male
9
Adelie
Torgersen
34.1
18.1
193
3475
missing
10
Adelie
Torgersen
42.0
20.2
190
4250
missing
11
Adelie
Torgersen
37.8
17.1
186
3300
missing
12
Adelie
Torgersen
37.8
17.3
180
3700
missing
13
Adelie
Torgersen
41.1
17.6
182
3200
female
⋮
⋮
⋮
⋮
⋮
⋮
⋮
⋮
141
Adelie
Dream
40.2
17.1
193
3400
female
142
Adelie
Dream
40.6
17.2
187
3475
male
143
Adelie
Dream
32.1
15.5
188
3050
female
144
Adelie
Dream
40.7
17.0
190
3725
male
145
Adelie
Dream
37.3
16.8
192
3000
female
146
Adelie
Dream
39.0
18.7
185
3650
male
147
Adelie
Dream
39.2
18.6
190
4250
male
148
Adelie
Dream
36.6
18.4
184
3475
female
149
Adelie
Dream
36.0
17.8
195
3450
female
150
Adelie
Dream
37.8
18.1
193
3750
male
151
Adelie
Dream
36.0
17.1
187
3700
female
152
Adelie
Dream
41.5
18.5
201
4000
male
subset (penguins, my_column => x -> x .== "Adelie" )
152×7 DataFrame
127 rows omitted
1
Adelie
Torgersen
39.1
18.7
181
3750
male
2
Adelie
Torgersen
39.5
17.4
186
3800
female
3
Adelie
Torgersen
40.3
18.0
195
3250
female
4
Adelie
Torgersen
missing
missing
missing
missing
missing
5
Adelie
Torgersen
36.7
19.3
193
3450
female
6
Adelie
Torgersen
39.3
20.6
190
3650
male
7
Adelie
Torgersen
38.9
17.8
181
3625
female
8
Adelie
Torgersen
39.2
19.6
195
4675
male
9
Adelie
Torgersen
34.1
18.1
193
3475
missing
10
Adelie
Torgersen
42.0
20.2
190
4250
missing
11
Adelie
Torgersen
37.8
17.1
186
3300
missing
12
Adelie
Torgersen
37.8
17.3
180
3700
missing
13
Adelie
Torgersen
41.1
17.6
182
3200
female
⋮
⋮
⋮
⋮
⋮
⋮
⋮
⋮
141
Adelie
Dream
40.2
17.1
193
3400
female
142
Adelie
Dream
40.6
17.2
187
3475
male
143
Adelie
Dream
32.1
15.5
188
3050
female
144
Adelie
Dream
40.7
17.0
190
3725
male
145
Adelie
Dream
37.3
16.8
192
3000
female
146
Adelie
Dream
39.0
18.7
185
3650
male
147
Adelie
Dream
39.2
18.6
190
4250
male
148
Adelie
Dream
36.6
18.4
184
3475
female
149
Adelie
Dream
36.0
17.8
195
3450
female
150
Adelie
Dream
37.8
18.1
193
3750
male
151
Adelie
Dream
36.0
17.1
187
3700
female
152
Adelie
Dream
41.5
18.5
201
4000
male
In case the column is a string
my_column_string = "species" ;
instead of a symbol, we can write in the same way, just taking care in Tidier to convert it to a symbol
@eval @filter penguins $ (Symbol (my_column_string)) == "Adelie"
152×7 DataFrame
127 rows omitted
1
Adelie
Torgersen
39.1
18.7
181
3750
male
2
Adelie
Torgersen
39.5
17.4
186
3800
female
3
Adelie
Torgersen
40.3
18.0
195
3250
female
4
Adelie
Torgersen
missing
missing
missing
missing
missing
5
Adelie
Torgersen
36.7
19.3
193
3450
female
6
Adelie
Torgersen
39.3
20.6
190
3650
male
7
Adelie
Torgersen
38.9
17.8
181
3625
female
8
Adelie
Torgersen
39.2
19.6
195
4675
male
9
Adelie
Torgersen
34.1
18.1
193
3475
missing
10
Adelie
Torgersen
42.0
20.2
190
4250
missing
11
Adelie
Torgersen
37.8
17.1
186
3300
missing
12
Adelie
Torgersen
37.8
17.3
180
3700
missing
13
Adelie
Torgersen
41.1
17.6
182
3200
female
⋮
⋮
⋮
⋮
⋮
⋮
⋮
⋮
141
Adelie
Dream
40.2
17.1
193
3400
female
142
Adelie
Dream
40.6
17.2
187
3475
male
143
Adelie
Dream
32.1
15.5
188
3050
female
144
Adelie
Dream
40.7
17.0
190
3725
male
145
Adelie
Dream
37.3
16.8
192
3000
female
146
Adelie
Dream
39.0
18.7
185
3650
male
147
Adelie
Dream
39.2
18.6
190
4250
male
148
Adelie
Dream
36.6
18.4
184
3475
female
149
Adelie
Dream
36.0
17.8
195
3450
female
150
Adelie
Dream
37.8
18.1
193
3750
male
151
Adelie
Dream
36.0
17.1
187
3700
female
152
Adelie
Dream
41.5
18.5
201
4000
male
DFM.@rsubset penguins $ (my_column_string) == "Adelie"
152×7 DataFrame
127 rows omitted
1
Adelie
Torgersen
39.1
18.7
181
3750
male
2
Adelie
Torgersen
39.5
17.4
186
3800
female
3
Adelie
Torgersen
40.3
18.0
195
3250
female
4
Adelie
Torgersen
missing
missing
missing
missing
missing
5
Adelie
Torgersen
36.7
19.3
193
3450
female
6
Adelie
Torgersen
39.3
20.6
190
3650
male
7
Adelie
Torgersen
38.9
17.8
181
3625
female
8
Adelie
Torgersen
39.2
19.6
195
4675
male
9
Adelie
Torgersen
34.1
18.1
193
3475
missing
10
Adelie
Torgersen
42.0
20.2
190
4250
missing
11
Adelie
Torgersen
37.8
17.1
186
3300
missing
12
Adelie
Torgersen
37.8
17.3
180
3700
missing
13
Adelie
Torgersen
41.1
17.6
182
3200
female
⋮
⋮
⋮
⋮
⋮
⋮
⋮
⋮
141
Adelie
Dream
40.2
17.1
193
3400
female
142
Adelie
Dream
40.6
17.2
187
3475
male
143
Adelie
Dream
32.1
15.5
188
3050
female
144
Adelie
Dream
40.7
17.0
190
3725
male
145
Adelie
Dream
37.3
16.8
192
3000
female
146
Adelie
Dream
39.0
18.7
185
3650
male
147
Adelie
Dream
39.2
18.6
190
4250
male
148
Adelie
Dream
36.6
18.4
184
3475
female
149
Adelie
Dream
36.0
17.8
195
3450
female
150
Adelie
Dream
37.8
18.1
193
3750
male
151
Adelie
Dream
36.0
17.1
187
3700
female
152
Adelie
Dream
41.5
18.5
201
4000
male
subset (penguins, my_column_string => x -> x .== "Adelie" )
152×7 DataFrame
127 rows omitted
1
Adelie
Torgersen
39.1
18.7
181
3750
male
2
Adelie
Torgersen
39.5
17.4
186
3800
female
3
Adelie
Torgersen
40.3
18.0
195
3250
female
4
Adelie
Torgersen
missing
missing
missing
missing
missing
5
Adelie
Torgersen
36.7
19.3
193
3450
female
6
Adelie
Torgersen
39.3
20.6
190
3650
male
7
Adelie
Torgersen
38.9
17.8
181
3625
female
8
Adelie
Torgersen
39.2
19.6
195
4675
male
9
Adelie
Torgersen
34.1
18.1
193
3475
missing
10
Adelie
Torgersen
42.0
20.2
190
4250
missing
11
Adelie
Torgersen
37.8
17.1
186
3300
missing
12
Adelie
Torgersen
37.8
17.3
180
3700
missing
13
Adelie
Torgersen
41.1
17.6
182
3200
female
⋮
⋮
⋮
⋮
⋮
⋮
⋮
⋮
141
Adelie
Dream
40.2
17.1
193
3400
female
142
Adelie
Dream
40.6
17.2
187
3475
male
143
Adelie
Dream
32.1
15.5
188
3050
female
144
Adelie
Dream
40.7
17.0
190
3725
male
145
Adelie
Dream
37.3
16.8
192
3000
female
146
Adelie
Dream
39.0
18.7
185
3650
male
147
Adelie
Dream
39.2
18.6
190
4250
male
148
Adelie
Dream
36.6
18.4
184
3475
female
149
Adelie
Dream
36.0
17.8
195
3450
female
150
Adelie
Dream
37.8
18.1
193
3750
male
151
Adelie
Dream
36.0
17.1
187
3700
female
152
Adelie
Dream
41.5
18.5
201
4000
male
Arranging
To arrange a dataframe means to reorder the rows according to the order of some columns. The rows are first arranged by the first column, then by the second (if any), and so on. In Tidier, when we want to invert the ordering, just put the column name inside a desc()
call.
Arranging by one column
Problem: Arrange by body_mass_g
.
@arrange penguins body_mass_g
344×7 DataFrame
319 rows omitted
1
Chinstrap
Dream
46.9
16.6
192
2700
female
2
Adelie
Biscoe
36.5
16.6
181
2850
female
3
Adelie
Biscoe
36.4
17.1
184
2850
female
4
Adelie
Biscoe
34.5
18.1
187
2900
female
5
Adelie
Dream
33.1
16.1
178
2900
female
6
Adelie
Torgersen
38.6
17.0
188
2900
female
7
Chinstrap
Dream
43.2
16.6
187
2900
female
8
Adelie
Biscoe
37.9
18.6
193
2925
female
9
Adelie
Dream
37.5
18.9
179
2975
missing
10
Adelie
Dream
37.0
16.9
185
3000
female
11
Adelie
Dream
37.3
16.8
192
3000
female
12
Adelie
Torgersen
35.9
16.6
190
3050
female
13
Adelie
Torgersen
35.2
15.9
186
3050
female
⋮
⋮
⋮
⋮
⋮
⋮
⋮
⋮
333
Gentoo
Biscoe
48.6
16.0
230
5800
male
334
Gentoo
Biscoe
48.4
14.6
213
5850
male
335
Gentoo
Biscoe
49.3
15.7
217
5850
male
336
Gentoo
Biscoe
55.1
16.0
230
5850
male
337
Gentoo
Biscoe
45.2
16.4
223
5950
male
338
Gentoo
Biscoe
49.8
15.9
229
5950
male
339
Gentoo
Biscoe
51.1
16.3
220
6000
male
340
Gentoo
Biscoe
48.8
16.2
222
6000
male
341
Gentoo
Biscoe
59.6
17.0
230
6050
male
342
Gentoo
Biscoe
49.2
15.2
221
6300
male
343
Adelie
Torgersen
missing
missing
missing
missing
missing
344
Gentoo
Biscoe
missing
missing
missing
missing
missing
DFM.@orderby penguins : body_mass_g
344×7 DataFrame
319 rows omitted
1
Chinstrap
Dream
46.9
16.6
192
2700
female
2
Adelie
Biscoe
36.5
16.6
181
2850
female
3
Adelie
Biscoe
36.4
17.1
184
2850
female
4
Adelie
Biscoe
34.5
18.1
187
2900
female
5
Adelie
Dream
33.1
16.1
178
2900
female
6
Adelie
Torgersen
38.6
17.0
188
2900
female
7
Chinstrap
Dream
43.2
16.6
187
2900
female
8
Adelie
Biscoe
37.9
18.6
193
2925
female
9
Adelie
Dream
37.5
18.9
179
2975
missing
10
Adelie
Dream
37.0
16.9
185
3000
female
11
Adelie
Dream
37.3
16.8
192
3000
female
12
Adelie
Torgersen
35.9
16.6
190
3050
female
13
Adelie
Torgersen
35.2
15.9
186
3050
female
⋮
⋮
⋮
⋮
⋮
⋮
⋮
⋮
333
Gentoo
Biscoe
48.6
16.0
230
5800
male
334
Gentoo
Biscoe
48.4
14.6
213
5850
male
335
Gentoo
Biscoe
49.3
15.7
217
5850
male
336
Gentoo
Biscoe
55.1
16.0
230
5850
male
337
Gentoo
Biscoe
45.2
16.4
223
5950
male
338
Gentoo
Biscoe
49.8
15.9
229
5950
male
339
Gentoo
Biscoe
51.1
16.3
220
6000
male
340
Gentoo
Biscoe
48.8
16.2
222
6000
male
341
Gentoo
Biscoe
59.6
17.0
230
6050
male
342
Gentoo
Biscoe
49.2
15.2
221
6300
male
343
Adelie
Torgersen
missing
missing
missing
missing
missing
344
Gentoo
Biscoe
missing
missing
missing
missing
missing
sort (penguins, : body_mass_g)
344×7 DataFrame
319 rows omitted
1
Chinstrap
Dream
46.9
16.6
192
2700
female
2
Adelie
Biscoe
36.5
16.6
181
2850
female
3
Adelie
Biscoe
36.4
17.1
184
2850
female
4
Adelie
Biscoe
34.5
18.1
187
2900
female
5
Adelie
Dream
33.1
16.1
178
2900
female
6
Adelie
Torgersen
38.6
17.0
188
2900
female
7
Chinstrap
Dream
43.2
16.6
187
2900
female
8
Adelie
Biscoe
37.9
18.6
193
2925
female
9
Adelie
Dream
37.5
18.9
179
2975
missing
10
Adelie
Dream
37.0
16.9
185
3000
female
11
Adelie
Dream
37.3
16.8
192
3000
female
12
Adelie
Torgersen
35.9
16.6
190
3050
female
13
Adelie
Torgersen
35.2
15.9
186
3050
female
⋮
⋮
⋮
⋮
⋮
⋮
⋮
⋮
333
Gentoo
Biscoe
48.6
16.0
230
5800
male
334
Gentoo
Biscoe
48.4
14.6
213
5850
male
335
Gentoo
Biscoe
49.3
15.7
217
5850
male
336
Gentoo
Biscoe
55.1
16.0
230
5850
male
337
Gentoo
Biscoe
45.2
16.4
223
5950
male
338
Gentoo
Biscoe
49.8
15.9
229
5950
male
339
Gentoo
Biscoe
51.1
16.3
220
6000
male
340
Gentoo
Biscoe
48.8
16.2
222
6000
male
341
Gentoo
Biscoe
59.6
17.0
230
6050
male
342
Gentoo
Biscoe
49.2
15.2
221
6300
male
343
Adelie
Torgersen
missing
missing
missing
missing
missing
344
Gentoo
Biscoe
missing
missing
missing
missing
missing
Arranging by two columns, with one reversed
Problem: First arrange by island
, then by reversed body_mass_g
.
@arrange penguins island desc (body_mass_g)
344×7 DataFrame
319 rows omitted
1
Gentoo
Biscoe
missing
missing
missing
missing
missing
2
Gentoo
Biscoe
49.2
15.2
221
6300
male
3
Gentoo
Biscoe
59.6
17.0
230
6050
male
4
Gentoo
Biscoe
51.1
16.3
220
6000
male
5
Gentoo
Biscoe
48.8
16.2
222
6000
male
6
Gentoo
Biscoe
45.2
16.4
223
5950
male
7
Gentoo
Biscoe
49.8
15.9
229
5950
male
8
Gentoo
Biscoe
48.4
14.6
213
5850
male
9
Gentoo
Biscoe
49.3
15.7
217
5850
male
10
Gentoo
Biscoe
55.1
16.0
230
5850
male
11
Gentoo
Biscoe
49.5
16.2
229
5800
male
12
Gentoo
Biscoe
48.6
16.0
230
5800
male
13
Gentoo
Biscoe
50.4
15.7
222
5750
male
⋮
⋮
⋮
⋮
⋮
⋮
⋮
⋮
333
Adelie
Torgersen
41.1
18.6
189
3325
male
334
Adelie
Torgersen
38.5
17.9
190
3325
female
335
Adelie
Torgersen
37.8
17.1
186
3300
missing
336
Adelie
Torgersen
38.8
17.6
191
3275
female
337
Adelie
Torgersen
40.3
18.0
195
3250
female
338
Adelie
Torgersen
41.1
17.6
182
3200
female
339
Adelie
Torgersen
34.6
17.2
189
3200
female
340
Adelie
Torgersen
36.2
17.2
187
3150
female
341
Adelie
Torgersen
35.9
16.6
190
3050
female
342
Adelie
Torgersen
35.2
15.9
186
3050
female
343
Adelie
Torgersen
39.0
17.1
191
3050
female
344
Adelie
Torgersen
38.6
17.0
188
2900
female
# works only when the reversed column is numeric?
DFM.@orderby penguins : island : body_mass_g .* - 1
344×7 DataFrame
319 rows omitted
1
Gentoo
Biscoe
49.2
15.2
221
6300
male
2
Gentoo
Biscoe
59.6
17.0
230
6050
male
3
Gentoo
Biscoe
51.1
16.3
220
6000
male
4
Gentoo
Biscoe
48.8
16.2
222
6000
male
5
Gentoo
Biscoe
45.2
16.4
223
5950
male
6
Gentoo
Biscoe
49.8
15.9
229
5950
male
7
Gentoo
Biscoe
48.4
14.6
213
5850
male
8
Gentoo
Biscoe
49.3
15.7
217
5850
male
9
Gentoo
Biscoe
55.1
16.0
230
5850
male
10
Gentoo
Biscoe
49.5
16.2
229
5800
male
11
Gentoo
Biscoe
48.6
16.0
230
5800
male
12
Gentoo
Biscoe
50.4
15.7
222
5750
male
13
Gentoo
Biscoe
50.0
16.3
230
5700
male
⋮
⋮
⋮
⋮
⋮
⋮
⋮
⋮
333
Adelie
Torgersen
38.5
17.9
190
3325
female
334
Adelie
Torgersen
37.8
17.1
186
3300
missing
335
Adelie
Torgersen
38.8
17.6
191
3275
female
336
Adelie
Torgersen
40.3
18.0
195
3250
female
337
Adelie
Torgersen
41.1
17.6
182
3200
female
338
Adelie
Torgersen
34.6
17.2
189
3200
female
339
Adelie
Torgersen
36.2
17.2
187
3150
female
340
Adelie
Torgersen
35.9
16.6
190
3050
female
341
Adelie
Torgersen
35.2
15.9
186
3050
female
342
Adelie
Torgersen
39.0
17.1
191
3050
female
343
Adelie
Torgersen
38.6
17.0
188
2900
female
344
Adelie
Torgersen
missing
missing
missing
missing
missing
sort (penguins, [order (: island), order (: body_mass_g, rev= true )])
344×7 DataFrame
319 rows omitted
1
Gentoo
Biscoe
missing
missing
missing
missing
missing
2
Gentoo
Biscoe
49.2
15.2
221
6300
male
3
Gentoo
Biscoe
59.6
17.0
230
6050
male
4
Gentoo
Biscoe
51.1
16.3
220
6000
male
5
Gentoo
Biscoe
48.8
16.2
222
6000
male
6
Gentoo
Biscoe
45.2
16.4
223
5950
male
7
Gentoo
Biscoe
49.8
15.9
229
5950
male
8
Gentoo
Biscoe
48.4
14.6
213
5850
male
9
Gentoo
Biscoe
49.3
15.7
217
5850
male
10
Gentoo
Biscoe
55.1
16.0
230
5850
male
11
Gentoo
Biscoe
49.5
16.2
229
5800
male
12
Gentoo
Biscoe
48.6
16.0
230
5800
male
13
Gentoo
Biscoe
50.4
15.7
222
5750
male
⋮
⋮
⋮
⋮
⋮
⋮
⋮
⋮
333
Adelie
Torgersen
41.1
18.6
189
3325
male
334
Adelie
Torgersen
38.5
17.9
190
3325
female
335
Adelie
Torgersen
37.8
17.1
186
3300
missing
336
Adelie
Torgersen
38.8
17.6
191
3275
female
337
Adelie
Torgersen
40.3
18.0
195
3250
female
338
Adelie
Torgersen
41.1
17.6
182
3200
female
339
Adelie
Torgersen
34.6
17.2
189
3200
female
340
Adelie
Torgersen
36.2
17.2
187
3150
female
341
Adelie
Torgersen
35.9
16.6
190
3050
female
342
Adelie
Torgersen
35.2
15.9
186
3050
female
343
Adelie
Torgersen
39.0
17.1
191
3050
female
344
Adelie
Torgersen
38.6
17.0
188
2900
female
Arranging by one variable column
Problem: Arrange by a column stored in a variable my_arrange_column
.
my_arrange_column = : body_mass_g;
@eval @arrange penguins $ my_arrange_column
344×7 DataFrame
319 rows omitted
1
Chinstrap
Dream
46.9
16.6
192
2700
female
2
Adelie
Biscoe
36.5
16.6
181
2850
female
3
Adelie
Biscoe
36.4
17.1
184
2850
female
4
Adelie
Biscoe
34.5
18.1
187
2900
female
5
Adelie
Dream
33.1
16.1
178
2900
female
6
Adelie
Torgersen
38.6
17.0
188
2900
female
7
Chinstrap
Dream
43.2
16.6
187
2900
female
8
Adelie
Biscoe
37.9
18.6
193
2925
female
9
Adelie
Dream
37.5
18.9
179
2975
missing
10
Adelie
Dream
37.0
16.9
185
3000
female
11
Adelie
Dream
37.3
16.8
192
3000
female
12
Adelie
Torgersen
35.9
16.6
190
3050
female
13
Adelie
Torgersen
35.2
15.9
186
3050
female
⋮
⋮
⋮
⋮
⋮
⋮
⋮
⋮
333
Gentoo
Biscoe
48.6
16.0
230
5800
male
334
Gentoo
Biscoe
48.4
14.6
213
5850
male
335
Gentoo
Biscoe
49.3
15.7
217
5850
male
336
Gentoo
Biscoe
55.1
16.0
230
5850
male
337
Gentoo
Biscoe
45.2
16.4
223
5950
male
338
Gentoo
Biscoe
49.8
15.9
229
5950
male
339
Gentoo
Biscoe
51.1
16.3
220
6000
male
340
Gentoo
Biscoe
48.8
16.2
222
6000
male
341
Gentoo
Biscoe
59.6
17.0
230
6050
male
342
Gentoo
Biscoe
49.2
15.2
221
6300
male
343
Adelie
Torgersen
missing
missing
missing
missing
missing
344
Gentoo
Biscoe
missing
missing
missing
missing
missing
DFM.@orderby penguins $ my_arrange_column
344×7 DataFrame
319 rows omitted
1
Chinstrap
Dream
46.9
16.6
192
2700
female
2
Adelie
Biscoe
36.5
16.6
181
2850
female
3
Adelie
Biscoe
36.4
17.1
184
2850
female
4
Adelie
Biscoe
34.5
18.1
187
2900
female
5
Adelie
Dream
33.1
16.1
178
2900
female
6
Adelie
Torgersen
38.6
17.0
188
2900
female
7
Chinstrap
Dream
43.2
16.6
187
2900
female
8
Adelie
Biscoe
37.9
18.6
193
2925
female
9
Adelie
Dream
37.5
18.9
179
2975
missing
10
Adelie
Dream
37.0
16.9
185
3000
female
11
Adelie
Dream
37.3
16.8
192
3000
female
12
Adelie
Torgersen
35.9
16.6
190
3050
female
13
Adelie
Torgersen
35.2
15.9
186
3050
female
⋮
⋮
⋮
⋮
⋮
⋮
⋮
⋮
333
Gentoo
Biscoe
48.6
16.0
230
5800
male
334
Gentoo
Biscoe
48.4
14.6
213
5850
male
335
Gentoo
Biscoe
49.3
15.7
217
5850
male
336
Gentoo
Biscoe
55.1
16.0
230
5850
male
337
Gentoo
Biscoe
45.2
16.4
223
5950
male
338
Gentoo
Biscoe
49.8
15.9
229
5950
male
339
Gentoo
Biscoe
51.1
16.3
220
6000
male
340
Gentoo
Biscoe
48.8
16.2
222
6000
male
341
Gentoo
Biscoe
59.6
17.0
230
6050
male
342
Gentoo
Biscoe
49.2
15.2
221
6300
male
343
Adelie
Torgersen
missing
missing
missing
missing
missing
344
Gentoo
Biscoe
missing
missing
missing
missing
missing
sort (penguins, my_arrange_column)
344×7 DataFrame
319 rows omitted
1
Chinstrap
Dream
46.9
16.6
192
2700
female
2
Adelie
Biscoe
36.5
16.6
181
2850
female
3
Adelie
Biscoe
36.4
17.1
184
2850
female
4
Adelie
Biscoe
34.5
18.1
187
2900
female
5
Adelie
Dream
33.1
16.1
178
2900
female
6
Adelie
Torgersen
38.6
17.0
188
2900
female
7
Chinstrap
Dream
43.2
16.6
187
2900
female
8
Adelie
Biscoe
37.9
18.6
193
2925
female
9
Adelie
Dream
37.5
18.9
179
2975
missing
10
Adelie
Dream
37.0
16.9
185
3000
female
11
Adelie
Dream
37.3
16.8
192
3000
female
12
Adelie
Torgersen
35.9
16.6
190
3050
female
13
Adelie
Torgersen
35.2
15.9
186
3050
female
⋮
⋮
⋮
⋮
⋮
⋮
⋮
⋮
333
Gentoo
Biscoe
48.6
16.0
230
5800
male
334
Gentoo
Biscoe
48.4
14.6
213
5850
male
335
Gentoo
Biscoe
49.3
15.7
217
5850
male
336
Gentoo
Biscoe
55.1
16.0
230
5850
male
337
Gentoo
Biscoe
45.2
16.4
223
5950
male
338
Gentoo
Biscoe
49.8
15.9
229
5950
male
339
Gentoo
Biscoe
51.1
16.3
220
6000
male
340
Gentoo
Biscoe
48.8
16.2
222
6000
male
341
Gentoo
Biscoe
59.6
17.0
230
6050
male
342
Gentoo
Biscoe
49.2
15.2
221
6300
male
343
Adelie
Torgersen
missing
missing
missing
missing
missing
344
Gentoo
Biscoe
missing
missing
missing
missing
missing