R - Dealing with Date and Time

From Training Material
Jump to navigation Jump to search


Date format

Make sure that the hiredate in the emp dataframe is a Date type

 str(emp)
'data.frame':	14 obs. of  8 variables:
$ ID      : int  7839 7698 7782 7566 7654 7499 7844 7900 7521 7902
...
$ HIREDATE: Date, format: "1981-11-17" ...
...
$ DEPTNO  : int  10 30 10 20 30 30 30 30 30 20

If it is not, covert it to the Date format

emp$HIREDATE <- as.Date(emp$HIREDATE,format="%d/%m/%Y")

Current Date

Get current date, current date and time and the system timezone:

> Sys.Date()
[1] "2012-04-15"
> Sys.time()
[1] "2012-04-15 03:20:45 BST"
> Sys.timezone()
[1] "BST"

Extracting part of the date

Which year do we have?

> format(Sys.Date(),"%Y")
[1] "2012"

Use: %d for day, %m for months and ?Date (mind the upper case "D") for more help

Calculating Intervals

What will be the date in 8 weeks:

> Sys.Date() + as.difftime(8,units="weeks")
[1] "2012-06-10"


What is the employees seniority?

> td <- Sys.Date() - emp$HIREDATE
> data.frame(emp,Seniority=as.double(td/365.24))[,c("ENAME","Seniority")]
     ENAME Seniority

1 BUSH 35.24532 2 BLAIR 35.79290 3 MERKEL 35.68612 4 PUTIN 35.87230 5 CHIRACK 35.38221 6 BAROSSO 35.98456 7 GATES 35.43697 8 BUFFETT 35.20151 9 WALTON 35.97908 10 TOOSK 35.20151 11 THATCHER 36.16252 12 CARNEGIE 34.18574 13 FORD 34.09265 14 ELISON 35.06188

Exercises

17. Exercise

Compute job seniority (use format and as.integer )

     ID    ENAME       JOB  MGR   HIREDATE  SAL COMM DEPTNO Seniority
1  7839     BUSH PRESIDENT   NA 1981-11-17 5000   NA     10        31
2  7698    BLAIR   MANAGER 7839 1981-05-01 2850   NA     30        31
3  7782   MERKEL   MANAGER 7839 1981-06-09 2450   NA     10        31
4  7566    PUTIN   MANAGER 7839 1981-04-02 2975   NA     20        31
5  7654  CHIRACK  SALESMAN 7698 1981-09-28 1250 1400     30        31
6  7499  BAROSSO  SALESMAN 7698 1981-02-20 1600  300     30        31
7  7844    GATES  SALESMAN 7698 1981-09-08 1500    0     30        31
8  7900  BUFFETT     CLERK 7698 1981-12-03  950   NA     30        31
9  7521   WALTON  SALESMAN 7698 1981-02-22 1250  500     30        31
10 7902    TOOSK   ANALYST 7566 1981-12-03 3000   NA     20        31
11 7369 THATCHER     CLERK 7902 1980-12-17  800   NA     20        32
12 7788 CARNEGIE   ANALYST 7566 1982-12-09 3000   NA     20        30
13 7876     FORD     CLERK 7788 1983-01-12 1100   NA     20        29
14 7934   ELISON     CLERK 7782 1982-01-23 1300   NA     10        30

curyear <- as.integer(format(Sys.Date(),"%Y")) Seniority <- curyear - as.integer(format(emp$HIREDATE,"%Y")) cbind(emp,Seniority)

18. *Exercise

Compute job seniority (use "-" operator between dates).

+----------+-----------+
| ename    | Seniority |
+----------+-----------+
| BUSH     |     25.14 |
| BLAIR    |     25.69 |
| MERKEL   |     25.59 |
| PUTIN    |     25.78 |
| CHIRACK  |     25.28 |
| BAROSSO  |     25.89 |
| GATES    |     25.33 |
| BUFFETT  |     25.09 |
| WALTON   |     25.88 |
| TOOSK    |     25.09 |
| THATCHER |     26.07 |
| CARNEGIE |     24.06 |
| FORD     |     23.97 |
| ELISON   |     24.95 |
+----------+-----------+ 

td <- Seniority <- Sys.Date() - emp$HIREDATE cbind(emp,Seniority=td/365.25)[,c("ENAME","Seniority")]

19. Exercise

Select employees hired before 1st of April 1981.

+----------+------------+
| ename    | hiredate   |
+----------+------------+
| BAROSSO  | 1981-02-20 |
| WALTON   | 1981-02-22 | 
| THATCHER | 1980-12-17 |
+----------+------------+

20. Exercise

Select employees hired in January.

  • Method 1: use format() function
  • Method 2: use compound condition (date from date to)
+--------+------------+
| ename  | hiredate   |
+--------+------------+
| FORD   | 1983-01-12 |
| ELISON | 1982-01-23 |
+--------+------------+

21. Exercise

Select the information below. Use ?format if needed. hint

  • apropos("date")
  • ?format.Date
  • ?strptime
+----------+--------------------------------------------+
| ename    | When hired?                                |
+----------+--------------------------------------------+
| BUSH     |  was hired 17 of November 1981 Tuesday   |
| BLAIR    |  was hired 1 of May 1981 Friday          |
| MERKEL   |  was hired 9 of June 1981 Tuesday        |
| PUTIN    |  was hired 2 of April 1981 Thursday      |
| CHIRACK  |  was hired 28 of September 1981 Monday   |
| BAROSSO  |  was hired 20 of February 1981 Friday    |
| GATES    |  was hired 8 of September 1981 Tuesday   |
| BUFFETT  |  was hired 3 of December 1981 Thursday   |
| WALTON   |  was hired 22 of February 1981 Sunday    |
| TOOSK    |  was hired 3 of December 1981 Thursday   |
| THATCHER |  was hired 17 of December 1980 Wednesday |
| CARNEGIE |  was hired 9 of December 1982 Thursday   |
| FORD     |  was hired 12 of January 1983 Wednesday  |
| ELISON   |  was hired 23 of January 1982 Saturday   |
+----------+--------------------------------------------+

22. Exercise*

Show all employees hired 50 days before (inclusive) to 50 days after (inclusive) 28th February 1982

+--------+------------+
| ename  | hiredate   |
+--------+------------+
| ELISON | 1982-01-23 |
+--------+------------+

POSIX Time Format

POSIXct
  1. number of seconds since the start of January 1, 1970 (Unix Era)
  2. negative numbers - before Unix Era
POSIXlt

Is a vector of:

  1. seconds
  2. minutes
  3. hours
  4. day of month (1-31)
  5. month of the year (0-11)
  6. years since 1900
  7. day of the week (0-6 where 0 represents Sunday)
  8. day of the year (0-365)
  9. daylight savings indicator

ISOdatetime

   d1 =  ISOdate(1982,2,3)
   d2 =  ISOdate(1981,2,3)
   d1-d2
   Time difference of 365 days
   as.double(d1-d2)
   [1] 365


  dt1 =  ISOdatetime(1982,2,3,2,1,2)
  dt2 =  ISOdatetime(1982,2,3,12,1,2)
  dt1-dt2
Time difference of -10 hours

Converting Charater to Time

 d1 <- strptime("2016-01-01 00:00:00","%Y-%m-%d %H:%M:%S")
 d2 <- strptime("2016-01-01 01:00:00","%Y-%m-%d %H:%M:%S")

Difference between ISOdate and ISOdatetime

Effectively ISOdate is ISOdatetime at 12:00:00 GMT

> dt1 <- ISOdate(2016,1,1)
> dt2 <- ISOdate(2016,1,1,12,00,00)
> dt1
[1] "2016-01-01 12:00:00 GMT"
> dt2
[1] "2016-01-01 12:00:00 GMT"


Presentation of midnight
> d1 <- strptime("2016-01-01 00:00:00","%Y-%m-%d %H:%M:%S")
> d2 <- strptime("2016-01-01 01:00:00","%Y-%m-%d %H:%M:%S")
> d1
[1] "2016-01-01 CST"
> d2
[1] "2016-01-01 01:00:00 CST"

difftime

Subtracting dates in normal ways can be implicitly converted to different number depend on how big is the interval

> dt1 <- ISOdate(2016,1,1,12,00,00)
> dt2 <- ISOdate(2016,1,1,12,00,07)
> dt2 - dt1
Time difference of 7 secs
> as.double(dt2 - dt1)
[1] 7
> dt1 <- ISOdate(2016,1,1,12,00,00)
> dt2 <- ISOdate(2116,1,1,12,00,07)
> dt2 - dt1
Time difference of 36524 days
> as.double(dt2 - dt1)
[1] 36524
> difftime(dt2,dt1)
Time difference of 36524 days
> difftime(dt2,dt1,units="sec")
Time difference of 3155673607 secs

Setting up timezone

ISOdate(2016,1,1,12,00,00,tz = "CST")