R - Dealing with Date and Time
Date format
Make sure that the hiredate in the emp dataframe is a Date type
str(emp)
'data.frame': 14 obs. of 8 variables: $ ID : int 7839 7698 7782 7566 7654 7499 7844 7900 7521 7902 ... $ HIREDATE: Date, format: "1981-11-17" ... ... $ DEPTNO : int 10 30 10 20 30 30 30 30 30 20
If it is not, covert it to the Date format
emp$HIREDATE <- as.Date(emp$HIREDATE,format="%d/%m/%Y")
Current Date
Get current date, current date and time and the system timezone:
> Sys.Date() [1] "2012-04-15" > Sys.time() [1] "2012-04-15 03:20:45 BST" > Sys.timezone() [1] "BST"
Extracting part of the date
Which year do we have?
> format(Sys.Date(),"%Y") [1] "2012"
Use: %d for day, %m for months and ?Date (mind the upper case "D") for more help
Calculating Intervals
What will be the date in 8 weeks:
> Sys.Date() + as.difftime(8,units="weeks") [1] "2012-06-10"
What is the employees seniority?
> td <- Sys.Date() - emp$HIREDATE > data.frame(emp,Seniority=as.double(td/365.24))[,c("ENAME","Seniority")] ENAME Seniority
1 BUSH 35.24532 2 BLAIR 35.79290 3 MERKEL 35.68612 4 PUTIN 35.87230 5 CHIRACK 35.38221 6 BAROSSO 35.98456 7 GATES 35.43697 8 BUFFETT 35.20151 9 WALTON 35.97908 10 TOOSK 35.20151 11 THATCHER 36.16252 12 CARNEGIE 34.18574 13 FORD 34.09265 14 ELISON 35.06188
Exercises
17. Exercise
Compute job seniority (use format and as.integer )
ID ENAME JOB MGR HIREDATE SAL COMM DEPTNO Seniority 1 7839 BUSH PRESIDENT NA 1981-11-17 5000 NA 10 31 2 7698 BLAIR MANAGER 7839 1981-05-01 2850 NA 30 31 3 7782 MERKEL MANAGER 7839 1981-06-09 2450 NA 10 31 4 7566 PUTIN MANAGER 7839 1981-04-02 2975 NA 20 31 5 7654 CHIRACK SALESMAN 7698 1981-09-28 1250 1400 30 31 6 7499 BAROSSO SALESMAN 7698 1981-02-20 1600 300 30 31 7 7844 GATES SALESMAN 7698 1981-09-08 1500 0 30 31 8 7900 BUFFETT CLERK 7698 1981-12-03 950 NA 30 31 9 7521 WALTON SALESMAN 7698 1981-02-22 1250 500 30 31 10 7902 TOOSK ANALYST 7566 1981-12-03 3000 NA 20 31 11 7369 THATCHER CLERK 7902 1980-12-17 800 NA 20 32 12 7788 CARNEGIE ANALYST 7566 1982-12-09 3000 NA 20 30 13 7876 FORD CLERK 7788 1983-01-12 1100 NA 20 29 14 7934 ELISON CLERK 7782 1982-01-23 1300 NA 10 30
curyear <- as.integer(format(Sys.Date(),"%Y")) Seniority <- curyear - as.integer(format(emp$HIREDATE,"%Y")) cbind(emp,Seniority)
18. *Exercise
Compute job seniority (use "-" operator between dates).
+----------+-----------+ | ename | Seniority | +----------+-----------+ | BUSH | 25.14 | | BLAIR | 25.69 | | MERKEL | 25.59 | | PUTIN | 25.78 | | CHIRACK | 25.28 | | BAROSSO | 25.89 | | GATES | 25.33 | | BUFFETT | 25.09 | | WALTON | 25.88 | | TOOSK | 25.09 | | THATCHER | 26.07 | | CARNEGIE | 24.06 | | FORD | 23.97 | | ELISON | 24.95 | +----------+-----------+
td <- Seniority <- Sys.Date() - emp$HIREDATE cbind(emp,Seniority=td/365.25)[,c("ENAME","Seniority")]
19. Exercise
Select employees hired before 1st of April 1981.
+----------+------------+ | ename | hiredate | +----------+------------+ | BAROSSO | 1981-02-20 | | WALTON | 1981-02-22 | | THATCHER | 1980-12-17 | +----------+------------+
20. Exercise
Select employees hired in January.
- Method 1: use format() function
- Method 2: use compound condition (date from date to)
+--------+------------+ | ename | hiredate | +--------+------------+ | FORD | 1983-01-12 | | ELISON | 1982-01-23 | +--------+------------+
21. Exercise
Select the information below. Use ?format if needed. hint
- apropos("date")
- ?format.Date
- ?strptime
+----------+--------------------------------------------+ | ename | When hired? | +----------+--------------------------------------------+ | BUSH | was hired 17 of November 1981 Tuesday | | BLAIR | was hired 1 of May 1981 Friday | | MERKEL | was hired 9 of June 1981 Tuesday | | PUTIN | was hired 2 of April 1981 Thursday | | CHIRACK | was hired 28 of September 1981 Monday | | BAROSSO | was hired 20 of February 1981 Friday | | GATES | was hired 8 of September 1981 Tuesday | | BUFFETT | was hired 3 of December 1981 Thursday | | WALTON | was hired 22 of February 1981 Sunday | | TOOSK | was hired 3 of December 1981 Thursday | | THATCHER | was hired 17 of December 1980 Wednesday | | CARNEGIE | was hired 9 of December 1982 Thursday | | FORD | was hired 12 of January 1983 Wednesday | | ELISON | was hired 23 of January 1982 Saturday | +----------+--------------------------------------------+
22. Exercise*
Show all employees hired 50 days before (inclusive) to 50 days after (inclusive) 28th February 1982
+--------+------------+ | ename | hiredate | +--------+------------+ | ELISON | 1982-01-23 | +--------+------------+
POSIX Time Format
- POSIXct
- number of seconds since the start of January 1, 1970 (Unix Era)
- negative numbers - before Unix Era
- POSIXlt
Is a vector of:
- seconds
- minutes
- hours
- day of month (1-31)
- month of the year (0-11)
- years since 1900
- day of the week (0-6 where 0 represents Sunday)
- day of the year (0-365)
- daylight savings indicator
ISOdatetime
d1 = ISOdate(1982,2,3)
d2 = ISOdate(1981,2,3)
d1-d2
Time difference of 365 days
as.double(d1-d2)
[1] 365
dt1 = ISOdatetime(1982,2,3,2,1,2)
dt2 = ISOdatetime(1982,2,3,12,1,2)
dt1-dt2
Time difference of -10 hours
Converting Charater to Time
d1 <- strptime("2016-01-01 00:00:00","%Y-%m-%d %H:%M:%S")
d2 <- strptime("2016-01-01 01:00:00","%Y-%m-%d %H:%M:%S")
Difference between ISOdate and ISOdatetime
Effectively ISOdate is ISOdatetime at 12:00:00 GMT
> dt1 <- ISOdate(2016,1,1) > dt2 <- ISOdate(2016,1,1,12,00,00) > dt1 [1] "2016-01-01 12:00:00 GMT" > dt2 [1] "2016-01-01 12:00:00 GMT"
- Presentation of midnight
> d1 <- strptime("2016-01-01 00:00:00","%Y-%m-%d %H:%M:%S") > d2 <- strptime("2016-01-01 01:00:00","%Y-%m-%d %H:%M:%S") > d1 [1] "2016-01-01 CST" > d2 [1] "2016-01-01 01:00:00 CST"
difftime
Subtracting dates in normal ways can be implicitly converted to different number depend on how big is the interval
> dt1 <- ISOdate(2016,1,1,12,00,00) > dt2 <- ISOdate(2016,1,1,12,00,07) > dt2 - dt1 Time difference of 7 secs > as.double(dt2 - dt1) [1] 7 > dt1 <- ISOdate(2016,1,1,12,00,00) > dt2 <- ISOdate(2116,1,1,12,00,07) > dt2 - dt1 Time difference of 36524 days > as.double(dt2 - dt1) [1] 36524 > difftime(dt2,dt1) Time difference of 36524 days > difftime(dt2,dt1,units="sec") Time difference of 3155673607 secs
Setting up timezone
ISOdate(2016,1,1,12,00,00,tz = "CST")