Movies dataset

To get started, we will load the movies dataset from the RDatasets.jl package.

using TidierData
using RDatasets

movies = dataset("ggplot2", "movies");

To work with this dataset, we will use the @chain macro. This macro starts a pipe: each function or macro listed between begin and end receives the result of the previous step, starting with the data frame named at the top of the chain. You don't necessarily have to spread a chain over multiple lines, but when working with data frames it is often easiest to do so. Before going further, take a look at the Chain.jl GitHub page to see what else is possible, including mid-chain side effects using @aside and mid-chain assignment of variables.
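As a minimal sketch of how a chain threads a value through each step, the snippet below uses Chain.jl directly on a plain vector rather than on the movies data. The vector and the names seen and total are illustrative assumptions, not part of the tutorial's dataset.

```julia
using Chain

# Hypothetical log that the side effect writes into.
seen = Int[]

total = @chain [1, 2, 3, 4, 5] begin
    filter(isodd, _)               # `_` stands for the previous result
    @aside push!(seen, length(_))  # side effect only; its return value is not passed on
    sum                            # a bare function name is called on the previous result
end
```

Because the @aside line does not feed its result forward, sum still receives the filtered vector rather than the value returned by push!.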

Let's take a look at the first 5 rows of the movies dataset using @slice().

@chain movies begin
    @slice(1:5)
end
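5×24 DataFrame
 Row │ Title                     Year   Length  Budget   Rating   Votes
     │ String                    Int32  Int32   Int32?   Float64  Int32
─────┼──────────────────────────────────────────────────────────────────
   1 │ $                          1971     121  missing      6.4    348
   2 │ $1000 a Touchdown          1939      71  missing      6.0     20
   3 │ $21 a Day Once a Month     1941       7  missing      8.2      5
   4 │ $40,000                    1996      70  missing      8.2      6
   5 │ $50,000 Climax Show, The   1975      71  missing      3.4     17

 Row │ R1       R2       R3       R4       R5       R6       R7       R8       R9       R10
     │ Float64  Float64  Float64  Float64  Float64  Float64  Float64  Float64  Float64  Float64
─────┼─────────────────────────────────────────────────────────────────────────────────────────
   1 │     4.5      4.5      4.5      4.5     14.5     24.5     24.5     14.5      4.5      4.5
   2 │     0.0     14.5      4.5     24.5     14.5     14.5     14.5      4.5      4.5     14.5
   3 │     0.0      0.0      0.0      0.0      0.0     24.5      0.0     44.5     24.5     24.5
   4 │    14.5      0.0      0.0      0.0      0.0      0.0      0.0      0.0     34.5     45.5
   5 │    24.5      4.5      0.0     14.5     14.5      4.5      0.0      0.0      0.0     24.5

 Row │ MPAA   Action  Animation  Comedy  Drama  Documentary  Romance  Short
     │ Cat…   Int32   Int32      Int32   Int32  Int32        Int32    Int32
─────┼──────────────────────────────────────────────────────────────────────
   1 │             0          0       1      1            0        0      0
   2 │             0          0       1      0            0        0      0
   3 │             0          1       0      0            0        0      1
   4 │             0          0       1      0            0        0      0
   5 │             0          0       0      0            0        0      0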
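For comparison, here is a small sketch on a hand-built data frame. The toy table, its column names, and the variable names are hypothetical. It checks that @slice(1:5) keeps the same rows as first(toy, 5) from DataFrames.jl.

```julia
using TidierData
using DataFrames

# Hypothetical 8-row table standing in for a larger dataset.
toy = DataFrame(id = 1:8, score = [3.5, 4.0, 2.5, 5.0, 1.0, 4.5, 3.0, 2.0])

# Keep rows 1 through 5 by position.
top5 = @chain toy begin
    @slice(1:5)
end

# Compare against the DataFrames.jl helper that returns the first n rows.
same_rows = top5 == first(toy, 5)
```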

Let's use @glimpse() to preview the dataset.

@glimpse(movies)
Rows: 58788
Columns: 24
.Title         String         $, $1000 a Touchdown, $21 a Day Once a Month, $40,
.Year          Int32          1971, 1939, 1941, 1996, 1975, 2000, 2002, 2002, 19
.Length        Int32          121, 71, 7, 70, 71, 91, 93, 25, 97, 61, 99, 96, 10
.Budget        Union{Missing, Int32}missing, missing, missing, missing, missing,
.Rating        Float64        6.4, 6.0, 8.2, 8.2, 3.4, 4.3, 5.3, 6.7, 6.6, 6.0,
.Votes         Int32          348, 20, 5, 6, 17, 45, 200, 24, 18, 51, 23, 53, 44
.R1            Float64        4.5, 0.0, 0.0, 14.5, 24.5, 4.5, 4.5, 4.5, 4.5, 4.5
.R2            Float64        4.5, 14.5, 0.0, 0.0, 4.5, 4.5, 0.0, 4.5, 4.5, 0.0,
.R3            Float64        4.5, 4.5, 0.0, 0.0, 0.0, 4.5, 4.5, 4.5, 4.5, 4.5,
.R4            Float64        4.5, 24.5, 0.0, 0.0, 14.5, 14.5, 4.5, 4.5, 0.0, 4.
.R5            Float64        14.5, 14.5, 0.0, 0.0, 14.5, 14.5, 24.5, 4.5, 0.0,
.R6            Float64        24.5, 14.5, 24.5, 0.0, 4.5, 14.5, 24.5, 14.5, 0.0,
.R7            Float64        24.5, 14.5, 0.0, 0.0, 0.0, 4.5, 14.5, 14.5, 34.5,
.R8            Float64        14.5, 4.5, 44.5, 0.0, 0.0, 4.5, 4.5, 14.5, 14.5, 4
.R9            Float64        4.5, 4.5, 24.5, 34.5, 0.0, 14.5, 4.5, 4.5, 4.5, 4.
.R10           Float64        4.5, 14.5, 24.5, 45.5, 24.5, 14.5, 14.5, 14.5, 24.
.MPAA          CategoricalArrays.CategoricalValue{String, UInt8}, , , , , , R, ,
.Action        Int32          0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0,
.Animation     Int32          0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
.Comedy        Int32          1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0,
.Drama         Int32          1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1,
.Documentary   Int32          0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
.Romance       Int32          0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
.Short         Int32          0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0,

This page was generated using Literate.jl.