Supported Functions
using TidierData
using TidierDates
mdy()
, dmy()
, ymd()
¤
These functions parse dates represented as strings into a DateTime format in Julia. The input should be a string month-day-year, day-month-year, or year-month-day format respectively. They are relatively robust in their ability to take non-uniform strings of dates with year in yyyy format
df = DataFrame(date = ["today is the 4th July, 2000",
"ayer fue 13th Oct 2001",
"3 of Mar, 2002 was a fun day",
"23rd Apr 2003",
"23/7/2043",
"03/2/1932",
"23-08-1932",
"4th of July, 2005",
"08092019" ,
missing])
@chain df begin
@mutate(date = dmy(date))
end
Row | date |
---|---|
Date? | |
1 | 2000-07-04 |
2 | 2001-10-13 |
3 | 2002-03-03 |
4 | 2003-04-23 |
5 | 2043-07-23 |
6 | 1932-02-03 |
7 | 1932-08-23 |
8 | 2005-07-04 |
9 | 2019-09-08 |
10 | missing |
mdy_hms()
, dmy_hms()
, ymd_hms()
¤
Similar to the previous group, these functions parse date-time strings in month-day-year, day-month-year, or year-month-day format respectively. The input should include both date and time information.
round_date()
, floor_date()
¤
#`floor_date()`: This function rounds a date down to the nearest specified unit (e.g., hour, minute, day, month, year). It takes two arguments - a Date or DateTime object and a string indicating the unit of time to which the date should be floored.
#`round_date()`: This function rounds a date to the nearest specified unit (e.g., hour, minute, month, year). Like
df2 = DataFrame(date = ["20190330120141", "2008-04-05 16-23-07", "2010.06.07 19:45:00",
"2011-2-8 14-3-7", "2012-3, 9 09:2, 37", "201305-15 0302-09",
"2013 arbitrary 2 non-decimal 7 chars 13 in between 2 !!! 7",
"OR collapsed formats: 20140618 181608 (as long as prefixed with zeros)",
missing ])
@chain df2 begin
@mutate(date = ymd_hms(date))
@mutate(floor_byhr = floor_date(date, "hour"))
@mutate(round_bymin = round_date(date, "minute"))
@mutate(rounded_bymo = round_date(date, "month"))
end
Row | date | floor_byhr | round_bymin | rounded_bymo |
---|---|---|---|---|
DateTime? | DateTime? | DateTime? | DateTime? | |
1 | -166741088-11-01T09:06:23.360 | -166741088-11-01T09:00:00 | -166741088-11-01T09:06:00 | -166741088-11-01T00:00:00 |
2 | 2008-04-05T16:23:07 | 2008-04-05T16:00:00 | 2008-04-05T16:23:00 | 2008-04-01T00:00:00 |
3 | 2010-06-07T19:45:00 | 2010-06-07T19:00:00 | 2010-06-07T19:45:00 | 2010-06-01T00:00:00 |
4 | 2011-02-08T14:03:07 | 2011-02-08T14:00:00 | 2011-02-08T14:03:00 | 2011-02-01T00:00:00 |
5 | 2012-03-09T09:02:37 | 2012-03-09T09:00:00 | 2012-03-09T09:03:00 | 2012-03-01T00:00:00 |
6 | 2013-05-15T03:02:09 | 2013-05-15T03:00:00 | 2013-05-15T03:02:00 | 2013-05-01T00:00:00 |
7 | 2013-02-12T07:13:02 | 2013-02-12T07:00:00 | 2013-02-12T07:13:00 | 2013-02-01T00:00:00 |
8 | 2014-06-18T18:16:08 | 2014-06-18T18:00:00 | 2014-06-18T18:16:00 | 2014-07-01T00:00:00 |
9 | missing | missing | missing | missing |
difftime()
¤
This function computes the difference between two DateTime or Date objects. It returns the result in the unit specified by the second argument, which can be "seconds", "minutes", "hours", "days", or "weeks". It returns this value as a float.
times = DataFrame(
start_time = [
"06-27-2023 15:20:00",
"06-26-2023 12:45:15",
"06-26-2023 16:30:30",
"06-25-2023 10:11:35",
"06-24-2023 09:00:24",
"06-26-2023 09:30:00",
"06-25-2023 11:00:15",
"06-24-2023 01:34:45",
"06-26-2023 14:20:00",
"06-25-2023 10:45:30"
],
end_time = [
"06-27-2023 14:53:53",
"06-25-2023 10:50:30",
"06-28-2023 16:32:30",
"06-24-2023 10:20:30",
"06-24-2023 10:05:00",
missing,
"10-25-2023 11:55:13",
"06-24-2023 11:35:45",
"07-26-2023 15:15:45",
"06-24-2023 12:50:15"
]
)
Row | start_time | end_time |
---|---|---|
String | String? | |
1 | 06-27-2023 15:20:00 | 06-27-2023 14:53:53 |
2 | 06-26-2023 12:45:15 | 06-25-2023 10:50:30 |
3 | 06-26-2023 16:30:30 | 06-28-2023 16:32:30 |
4 | 06-25-2023 10:11:35 | 06-24-2023 10:20:30 |
5 | 06-24-2023 09:00:24 | 06-24-2023 10:05:00 |
6 | 06-26-2023 09:30:00 | missing |
7 | 06-25-2023 11:00:15 | 10-25-2023 11:55:13 |
8 | 06-24-2023 01:34:45 | 06-24-2023 11:35:45 |
9 | 06-26-2023 14:20:00 | 07-26-2023 15:15:45 |
10 | 06-25-2023 10:45:30 | 06-24-2023 12:50:15 |
After a string is converted into a datetime format, Date.jl functions such as hour(), year(), etc can be applied in Tidier chains as well.
@chain times begin
@mutate(start_time = mdy_hms(start_time))
@mutate(end_time = mdy_hms(end_time))
@mutate(timedifmins = difftime(end_time, start_time, "minutes"))
@mutate(timedifmins = difftime(end_time, start_time, "hours"))
@mutate(year= year(start_time))
@mutate(second = second(start_time))
end
Row | start_time | end_time | timedifmins | year | second |
---|---|---|---|---|---|
DateTime | DateTime? | Float64? | Int64 | Int64 | |
1 | 2023-06-27T15:20:00 | 2023-06-27T14:53:53 | -0.435278 | 2023 | 0 |
2 | 2023-06-26T12:45:15 | 2023-06-25T10:50:30 | -25.9125 | 2023 | 15 |
3 | 2023-06-26T16:30:30 | 2023-06-28T16:32:30 | 48.0333 | 2023 | 30 |
4 | 2023-06-25T10:11:35 | 2023-06-24T10:20:30 | -23.8514 | 2023 | 35 |
5 | 2023-06-24T09:00:24 | 2023-06-24T10:05:00 | 1.07667 | 2023 | 24 |
6 | 2023-06-26T09:30:00 | missing | missing | 2023 | 0 |
7 | 2023-06-25T11:00:15 | 2023-10-25T11:55:13 | 2928.92 | 2023 | 15 |
8 | 2023-06-24T01:34:45 | 2023-06-24T11:35:45 | 10.0167 | 2023 | 45 |
9 | 2023-06-26T14:20:00 | 2023-07-26T15:15:45 | 720.929 | 2023 | 0 |
10 | 2023-06-25T10:45:30 | 2023-06-24T12:50:15 | -21.9208 | 2023 | 30 |
hms()
¤
This function parses time strings (e.g., "12:34:56") into a Time format in Julia. It takes a string or an array of strings with the time information and doesn't require additional arguments.
df3 = DataFrame(
Time = [
"09:00:24",
"10:11:35",
"01:34:45",
"12:45:15",
"09:30:00",
"10:45:30",
"11:00:15",
"14:20:00",
"15:10:45",
missing
]
)
@chain df3 begin
@mutate(Time = hms(Time))
@filter(!ismissing(Time))
@mutate(hour = hour(Time))
end
Row | Time | hour |
---|---|---|
Time? | Int64 | |
1 | 09:00:24 | 9 |
2 | 10:11:35 | 10 |
3 | 01:34:45 | 1 |
4 | 12:45:15 | 12 |
5 | 09:30:00 | 9 |
6 | 10:45:30 | 10 |
7 | 11:00:15 | 11 |
8 | 14:20:00 | 14 |
9 | 15:10:45 | 15 |
This page was generated using Literate.jl.