R - Data Processing: Difference between revisions
(No difference) 
 | 
Latest revision as of 07:37, 14 February 2017
Tables used in the exercises
EMP Table
+------+----------+-----------+------+------------+---------+---------+--------+ | ID | ENAME | JOB | MGR | HIREDATE | SAL | COMM | DEPTNO | +------+----------+-----------+------+------------+---------+---------+--------+ | 7839 | BUSH | PRESIDENT | NULL | 1981-11-17 | 5000.00 | NULL | 10 | | 7698 | BLAIR | MANAGER | 7839 | 1981-05-01 | 2850.00 | NULL | 30 | | 7782 | MERKEL | MANAGER | 7839 | 1981-06-09 | 2450.00 | NULL | 10 | | 7566 | PUTIN | MANAGER | 7839 | 1981-04-02 | 2975.00 | NULL | 20 | | 7654 | CHIRACK | SALESMAN | 7698 | 1981-09-28 | 1250.00 | 1400.00 | 30 | | 7499 | BAROSSO | SALESMAN | 7698 | 1981-02-20 | 1600.00 | 300.00 | 30 | | 7844 | GATES | SALESMAN | 7698 | 1981-09-08 | 1500.00 | 0.00 | 30 | | 7900 | BUFFETT | CLERK | 7698 | 1981-12-03 | 950.00 | NULL | 30 | | 7521 | WALTON | SALESMAN | 7698 | 1981-02-22 | 1250.00 | 500.00 | 30 | | 7902 | TOOSK | ANALYST | 7566 | 1981-12-03 | 3000.00 | NULL | 20 | | 7369 | THATCHER | CLERK | 7902 | 1980-12-17 | 800.00 | NULL | 20 | | 7788 | CARNEGIE | ANALYST | 7566 | 1982-12-09 | 3000.00 | NULL | 20 | | 7876 | FORD | CLERK | 7788 | 1983-01-12 | 1100.00 | NULL | 20 | | 7934 | ELISON | CLERK | 7782 | 1982-01-23 | 1300.00 | NULL | 10 | +------+----------+-----------+------+------------+---------+---------+--------+
DEPT Table
+--------+------------+----------+ | DEPTNO | DNAME | LOC | +--------+------------+----------+ | 10 | ACCOUNTING | NEW YORK | | 20 | RESEARCH | LONDON | | 30 | SALES | PARIS | | 40 | OPERATIONS | BERLIN | +--------+------------+----------+
SALGRADE Table
+-------+-------+-------+ | GRADE | LOSAL | HISAL | +-------+-------+-------+ | 1 | 700 | 1200 | | 2 | 1201 | 1400 | | 3 | 1401 | 2000 | | 4 | 2001 | 3000 | | 5 | 3001 | 9999 | +-------+-------+-------+
Loading data
In order to load data from a text file we use read.table function
 
emp <- read.table("https://training-course-material.com/images/9/97/Emp.txt",header=TRUE)
Show the data
 emp
ID ENAME JOB MGR HIREDATE SAL COMM DEPTNO 1 7839 BUSH PRESIDENT NA 1981-11-17 5000 NA 10 2 7698 BLAIR MANAGER 7839 1981-05-01 2850 NA 30 3 7782 MERKEL MANAGER 7839 1981-06-09 2450 NA 10 4 7566 PUTIN MANAGER 7839 1981-04-02 2975 NA 20 5 7654 CHIRACK SALESMAN 7698 1981-09-28 1250 1400 30 6 7499 BAROSSO SALESMAN 7698 1981-02-20 1600 300 30 7 7844 GATES SALESMAN 7698 1981-09-08 1500 0 30 8 7900 BUFFETT CLERK 7698 1981-12-03 950 NA 30 9 7521 WALTON SALESMAN 7698 1981-02-22 1250 500 30 10 7902 TOOSK ANALYST 7566 1981-12-03 3000 NA 20 11 7369 THATCHER CLERK 7902 1980-12-17 800 NA 20 12 7788 CARNEGIE ANALYST 7566 1982-12-09 3000 NA 20 13 7876 FORD CLERK 7788 1983-01-12 1100 NA 20 14 7934 ELISON CLERK 7782 1982-01-23 1300 NA 10
Show the structure of the emp data frame
> str(emp) 'data.frame': 14 obs. of 8 variables: $ ID : int 7839 7698 7782 7566 7654 7499 7844 7900 7521 7902 ... $ ENAME : Factor w/ 14 levels "BAROSSO","BLAIR",..: 4 2 10 11 6 1 9 3 14 13 ... $ JOB : Factor w/ 5 levels "ANALYST","CLERK",..: 4 3 3 3 5 5 5 2 5 1 ... $ MGR : int NA 7839 7839 7839 7698 7698 7698 7698 7698 7566 ... $ HIREDATE: Factor w/ 13 levels "01/05/1981","02/04/1981",..: 8 1 5 2 13 10 4 3 11 3 ... $ SAL : int 5000 2850 2450 2975 1250 1600 1500 950 1250 3000 ... $ COMM : int NA NA NA NA 1400 300 0 NA 500 NA ... $ DEPTNO : int 10 30 10 20 30 30 30 30 30 20 ...
Converting Data
ENAME column is not really a factor, but rather a character column
Convert factor to a character
emp$ENAME <- as.character(emp$ENAME) > str(emp) 'data.frame': 14 obs. of 8 variables: $ ID : int 7839 7698 7782 7566 7654 7499 7844 7900 7521 7902 ... $ ENAME : chr "BUSH" "BLAIR" "MERKEL" "PUTIN" ... ... $ DEPTNO : int 10 30 10 20 30 30 30 30 30 20 ...
Similar way convert HIREDATE column
emp$HIREDATE <- as.Date(emp$HIREDATE,format="%d/%m/%Y")
Loading data exercises
- Load dept and salgrade tabled into data frames using read.table function.
 
Call the variable dept and salgrade accordingly
- https://training-course-material.com/images/a/a7/Dept.txt
 - https://training-course-material.com/images/6/6d/Salgrade.txt
 
Trivia
- R is case sensitive
 - White charters are ignored
 - Commands end with semicolon ";" or Enter (new line) charter
 - You can comment the rest of the line with the number character (hash) "#"
 
emp #This is just a comment
- Operators work the same way as in maths or other programming languages
 
2+2; 12-3; -(10+2); 4*3; 13/4; 2**4; 2^ 4; 2/0
Selecting Data
Display all columns and rows from emp data frame.
> emp
     ID    ENAME       JOB  MGR   HIREDATE  SAL COMM DEPTNO
1  7839     BUSH PRESIDENT   NA 17/11/1981 5000   NA     10
2  7698    BLAIR   MANAGER 7839 01/05/1981 2850   NA     30
3  7782   MERKEL   MANAGER 7839 09/06/1981 2450   NA     10
...
Select three columns: ename, job and sal and all rows from emp table.
> emp[,c("ENAME","JOB","SAL")]
      ENAME       JOB  SAL
1      BUSH PRESIDENT 5000
2     BLAIR   MANAGER 2850
3    MERKEL   MANAGER 2450
...
14   ELISON     CLERK 1300
Select the one column on its own and return a vector
> emp$SAL [1] 5000 2850 2450 2975 1250 1600 1500 950 1250 3000 800 3000 1100 1300
Combining Data
Select two columns and combine them
> data.frame(SAL,ENAME)
    SAL    ENAME
1  5000     BUSH
2  2850    BLAIR
3  2450   MERKEL
4  2975    PUTIN
... 
You can add labels as well
> data.frame(Salary=emp$SAL,'Employee Name' = emp$ENAME) Salary Employee.Name 1 5000 BUSH 2 2850 BLAIR 3 2450 MERKEL 4 2975 PUTIN 5 1250 CHIRACK 6 1600 BAROSSO
Calculate the annual salary
> data.frame(emp$ENAME,AnnualSal=emp$SAL*12) emp.ENAME AnnualSal 1 BUSH 60000 2 BLAIR 34200 3 MERKEL 29400 4 PUTIN 35700 5 CHIRACK 15000 ...
Attaching dataframe
We can attach the table so we do not have to write the emp$ prefix
> attach(emp)
> data.frame(ENAME,AnnualSal=SAL*12) ENAME AnnualSal 1 "BUSH" 60000 2 "BLAIR" 34200 ...
Concatenation
Concatenate two columns
> paste(ENAME,"is",JOB) [1] "BUSH is PRESIDENT" "BLAIR is MANAGER" "MERKEL is MANAGER" "PUTIN is MANAGER" [5] "CHIRACK is SALESMAN" "BAROSSO is SALESMAN" "GATES is SALESMAN" "BUFFETT is CLERK" [9] "WALTON is SALESMAN" "TOOSK is ANALYST" "THATCHER is CLERK" "CARNEGIE is ANALYST" [13] "FORD is CLERK" "ELISON is CLERK"
Sorting
Sort by salary in ascending order
 emp[order(SAL),]
Sort by salary in descending order
 emp[order(SAL,decreasing=TRUE),]
Sort by multiple columns
 emp[order(JOB,SAL),]
ID ENAME JOB MGR HIREDATE SAL COMM DEPTNO TOOSK 7902 TOOSK ANALYST 7566 03/12/1981 3000 NA 20 CARNEGIE 7788 CARNEGIE ANALYST 7566 09/12/1982 3000 NA 20 THATCHER 7369 THATCHER CLERK 7902 17/12/1980 800 NA 20 BUFFETT 7900 BUFFETT CLERK 7698 03/12/1981 950 NA 30 FORD 7876 FORD CLERK 7788 12/01/1983 1100 NA 20 ELISON 7934 ELISON CLERK 7782 23/01/1982 1300 NA 10 MERKEL 7782 MERKEL MANAGER 7839 09/06/1981 2450 NA 10 BLAIR 7698 BLAIR MANAGER 7839 01/05/1981 2850 NA 30 PUTIN 7566 PUTIN MANAGER 7839 02/04/1981 2975 NA 20 BUSH 7839 BUSH PRESIDENT NA 17/11/1981 5000 NA 10 CHIRACK 7654 CHIRACK SALESMAN 7698 28/09/1981 1250 1400 30 WALTON 7521 WALTON SALESMAN 7698 22/02/1981 1250 500 30 GATES 7844 GATES SALESMAN 7698 08/09/1981 1500 0 30 BAROSSO 7499 BAROSSO SALESMAN 7698 20/02/1981 1600 300 30
Sort by HIREDATE, but do not show it
> emp[order(HIREDATE),c("ENAME","SAL")]
      ENAME  SAL
11 THATCHER  800
6   BAROSSO 1600
9    WALTON 1250
4     PUTIN 2975
2     BLAIR 2850
3    MERKEL 2450
7     GATES 1500
5   CHIRACK 1250
1      BUSH 5000
8   BUFFETT  950
10    TOOSK 3000
14   ELISON 1300
12 CARNEGIE 3000
13     FORD 1100
Sorting one column in reverse order
emp[order(DEPTNO,-SAL, decreasing=TRUE),]
ID ENAME JOB MGR HIREDATE SAL COMM DEPTNO 8 7900 BUFFETT CLERK 7698 03/12/1981 950 NA 30 5 7654 CHIRACK SALESMAN 7698 28/09/1981 1250 1400 30 9 7521 WALTON SALESMAN 7698 22/02/1981 1250 500 30 7 7844 GATES SALESMAN 7698 08/09/1981 1500 0 30 6 7499 BAROSSO SALESMAN 7698 20/02/1981 1600 300 30 2 7698 BLAIR MANAGER 7839 01/05/1981 2850 NA 30 11 7369 THATCHER CLERK 7902 17/12/1980 800 NA 20 13 7876 FORD CLERK 7788 12/01/1983 1100 NA 20 4 7566 PUTIN MANAGER 7839 02/04/1981 2975 NA 20 10 7902 TOOSK ANALYST 7566 03/12/1981 3000 NA 20 12 7788 CARNEGIE ANALYST 7566 09/12/1982 3000 NA 20 14 7934 ELISON CLERK 7782 23/01/1982 1300 NA 10 3 7782 MERKEL MANAGER 7839 09/06/1981 2450 NA 10 1 7839 BUSH PRESIDENT NA 17/11/1981 5000 NA 10
Filtering Results
Select employees earning more than 2000
> emp[SAL>2000,]
     ID    ENAME       JOB  MGR   HIREDATE  SAL COMM DEPTNO
1  7839     BUSH PRESIDENT   NA 1981-11-17 5000   NA     10
2  7698    BLAIR   MANAGER 7839 1981-05-01 2850   NA     30
3  7782   MERKEL   MANAGER 7839 1981-06-09 2450   NA     10
4  7566    PUTIN   MANAGER 7839 1981-04-02 2975   NA     20
10 7902    TOOSK   ANALYST 7566 1981-12-03 3000   NA     20
12 7788 CARNEGIE   ANALYST 7566 1982-12-09 3000   NA     20
How does it work?
The SAL>2000 expression returns a vector of boolean values.
> SAL>2000 [1] TRUE TRUE TRUE TRUE FALSE FALSE FALSE FALSE FALSE TRUE FALSE TRUE FALSE FALSE
If the salary is greater than 2000 for a row, the corresponding value will be TRUE, otherwise, FALSE.
Than, the data frame selects only those rows which have true value in the corresponding rows.
Select only managers.
> emp[JOB == "MANAGER",] ID ENAME JOB MGR HIREDATE SAL COMM DEPTNO 2 7698 BLAIR MANAGER 7839 1981-05-01 2850 NA 30 3 7782 MERKEL MANAGER 7839 1981-06-09 2450 NA 10 4 7566 PUTIN MANAGER 7839 1981-04-02 2975 NA 20
Select people hired before or on 1st of April 1981.
> emp[HIREDATE <= '1981-04-01',]
    ID    ENAME      JOB  MGR   HIREDATE  SAL COMM DEPTNO
6  7499  BAROSSO SALESMAN 7698 1981-02-20 1600  300     30
9  7521   WALTON SALESMAN 7698 1981-02-22 1250  500     30
11 7369 THATCHER    CLERK 7902 1980-12-17  800   NA     20
Select all except clerks.
> emp[JOB != 'CLERK',]
    ID    ENAME       JOB  MGR   HIREDATE  SAL COMM DEPTNO
1  7839     BUSH PRESIDENT   NA 1981-11-17 5000   NA     10
2  7698    BLAIR   MANAGER 7839 1981-05-01 2850   NA     30
3  7782   MERKEL   MANAGER 7839 1981-06-09 2450   NA     10
4  7566    PUTIN   MANAGER 7839 1981-04-02 2975   NA     20
5  7654  CHIRACK  SALESMAN 7698 1981-09-28 1250 1400     30
6  7499  BAROSSO  SALESMAN 7698 1981-02-20 1600  300     30
7  7844    GATES  SALESMAN 7698 1981-09-08 1500    0     30
9  7521   WALTON  SALESMAN 7698 1981-02-22 1250  500     30
10 7902    TOOSK   ANALYST 7566 1981-12-03 3000   NA     20
12 7788 CARNEGIE   ANALYST 7566 1982-12-09 3000   NA     20
Select only name and salary of salesmen
> emp[JOB == 'SALESMAN',c("ENAME","SAL")]
    ENAME  SAL
5 CHIRACK 1250
6 BAROSSO 1600
7   GATES 1500
9  WALTON 1250
Complex Expressions
To select people earning from 1000 to 2000 (inclusive) you can type:
 > emp[SAL >= 1000 & SAL <= 2000,]
ID ENAME JOB MGR HIREDATE SAL COMM DEPTNO 5 7654 CHIRACK SALESMAN 7698 1981-09-28 1250 1400 30 6 7499 BAROSSO SALESMAN 7698 1981-02-20 1600 300 30 7 7844 GATES SALESMAN 7698 1981-09-08 1500 0 30 9 7521 WALTON SALESMAN 7698 1981-02-22 1250 500 30 13 7876 FORD CLERK 7788 1983-01-12 1100 NA 20 14 7934 ELISON CLERK 7782 1982-01-23 1300 NA 10
Select people working in department no 10 or 20.
> emp[DEPTNO == 10 | DEPTNO == 20,c("ENAME","DEPTNO")]
      ENAME DEPTNO
1      BUSH     10
3    MERKEL     10
4     PUTIN     20
10    TOOSK     20
11 THATCHER     20
12 CARNEGIE     20
13     FORD     20
14   ELISON     10
Select managers, clerks and presidents
> emp[JOB == 'PRESIDENT' | JOB == 'CLERK' | JOB == 'MANAGER',c("ENAME","JOB")]
      ENAME       JOB
1      BUSH PRESIDENT
2     BLAIR   MANAGER
3    MERKEL   MANAGER
4     PUTIN   MANAGER
8   BUFFETT     CLERK
11 THATCHER     CLERK
13     FORD     CLERK
14   ELISON     CLERK
The above can be simplified to:
> emp[JOB %in% c('PRESIDENT','CLERK','MANAGER'),c("ENAME","JOB")]
      ENAME       JOB
1      BUSH PRESIDENT
2     BLAIR   MANAGER
3    MERKEL   MANAGER
4     PUTIN   MANAGER
8   BUFFETT     CLERK
11 THATCHER     CLERK
13     FORD     CLERK
14   ELISON     CLERK
Finding matches
Select people with names starting with “B”.
> emp[grep("B.*",ENAME),]
    ID   ENAME       JOB  MGR   HIREDATE  SAL COMM DEPTNO
1 7839    BUSH PRESIDENT   NA 17/11/1981 5000   NA     10
2 7698   BLAIR   MANAGER 7839 01/05/1981 2850   NA     30
6 7499 BAROSSO  SALESMAN 7698 20/02/1981 1600  300     30
8 7900 BUFFETT     CLERK 7698 03/12/1981  950   NA     30
Grep function uses regular expression. A couple of other examples:
Select people with second letter ‘a’.
> emp[grep("^.A",ENAME),]
    ID    ENAME      JOB  MGR   HIREDATE  SAL COMM DEPTNO
6  7499  BAROSSO SALESMAN 7698 20/02/1981 1600  300     30
7  7844    GATES SALESMAN 7698 08/09/1981 1500    0     30
9  7521   WALTON SALESMAN 7698 22/02/1981 1250  500     30
12 7788 CARNEGIE  ANALYST 7566 09/12/1982 3000   NA     20
Saving Results in a variable
Find people working in department 20 and 30, order results by their job.
> empo <- emp[order(JOB),]
The empo variable contains the ordered elements from emp table by JOB.
> empo[empo$DEPTNO %in% c(10,30),]
     ID   ENAME       JOB  MGR   HIREDATE  SAL COMM DEPTNO
8  7900 BUFFETT     CLERK 7698 03/12/1981  950   NA     30
14 7934  ELISON     CLERK 7782 23/01/1982 1300   NA     10
2  7698   BLAIR   MANAGER 7839 01/05/1981 2850   NA     30
3  7782  MERKEL   MANAGER 7839 09/06/1981 2450   NA     10
1  7839    BUSH PRESIDENT   NA 17/11/1981 5000   NA     10
5  7654 CHIRACK  SALESMAN 7698 28/09/1981 1250 1400     30
6  7499 BAROSSO  SALESMAN 7698 20/02/1981 1600  300     30
7  7844   GATES  SALESMAN 7698 08/09/1981 1500    0     30
9  7521  WALTON  SALESMAN 7698 22/02/1981 1250  500     30
Filtering by date ranges
emp[
     HIREDATE >= as.Date("1981-01-01") & HIREDATE <= as.Date("1981-12-31")
     ,c("ENAME","HIREDATE")
   ]
ENAME HIREDATE 1 BUSH 1981-11-17 2 BLAIR 1981-05-01 3 MERKEL 1981-06-09 4 PUTIN 1981-04-02 5 CHIRACK 1981-09-28 6 BAROSSO 1981-02-20 7 GATES 1981-09-08 8 BUFFETT 1981-12-03 9 WALTON 1981-02-22 10 TOOSK 1981-12-03
Dealing with NA value
The NA isn’t just zero (0) or an empty string “”. It’s something like a void. Almost every operation with NA returns NA value.
NA*2; NA+2; NA/NA; 12/NA; NA>2; NA==NA;
[1] NA [1] NA [1] NA [1] NA [1] NA [1] NA
> data.frame(ENAME,SAL,COMM,SAL+COMM)
      ENAME  SAL COMM SAL...COMM
1      BUSH 5000   NA         NA
2     BLAIR 2850   NA         NA
3    MERKEL 2450   NA         NA
4     PUTIN 2975   NA         NA
5   CHIRACK 1250 1400       2650
6   BAROSSO 1600  300       1900
7     GATES 1500    0       1500
...
Convert the NA values to 0 with ifelse(is.na(COMM),0,COMM) expression
 
 data.frame(
     ENAME,SAL,COMM,
     AnnualSal=
         SAL+ifelse(is.na(COMM),0,COMM)
 )
ENAME SAL COMM AnnualSal 1 BUSH 5000 NA 5000 2 BLAIR 2850 NA 2850 3 MERKEL 2450 NA 2450 4 PUTIN 2975 NA 2975 5 CHIRACK 1250 1400 2650 6 BAROSSO 1600 300 1900 7 GATES 1500 0 1500 8 BUFFETT 950 NA 950 ...
Select employees with commission more than 300
> emp[COMM > 300,]
       ID   ENAME      JOB  MGR   HIREDATE  SAL COMM DEPTNO
NA     NA    <NA>     <NA>   NA       <NA>   NA   NA     NA
NA.1   NA    <NA>     <NA>   NA       <NA>   NA   NA     NA
NA.2   NA    <NA>     <NA>   NA       <NA>   NA   NA     NA
NA.3   NA    <NA>     <NA>   NA       <NA>   NA   NA     NA
5    7654 CHIRACK SALESMAN 7698 1981-09-28 1250 1400     30
NA.4   NA    <NA>     <NA>   NA       <NA>   NA   NA     NA
9    7521  WALTON SALESMAN 7698 1981-02-22 1250  500     30
NA.5   NA    <NA>     <NA>   NA       <NA>   NA   NA     NA
NA.6   NA    <NA>     <NA>   NA       <NA>   NA   NA     NA
NA.7   NA    <NA>     <NA>   NA       <NA>   NA   NA     NA
NA.8   NA    <NA>     <NA>   NA       <NA>   NA   NA     NA
NA.9   NA    <NA>     <NA>   NA       <NA>   NA   NA     NA
Remove all rows with NA values in any column
> na.omit(emp[COMM > 300,])
    ID   ENAME      JOB  MGR   HIREDATE  SAL COMM DEPTNO
5 7654 CHIRACK SALESMAN 7698 1981-09-28 1250 1400     30
9 7521  WALTON SALESMAN 7698 1981-02-22 1250  500     30
Showing Unique Values
levels(emp$JOB) unique(emp$JOB)
Exercises
1. Exercise
- Load dept and salgrade tabled into data frames using read.table function from locations below:
 
2.  Exercise
Select all rows from SALGRADE dataframe.
GRADE LOSAL HISAL 1 1 700 1200 2 2 1201 1400 3 3 1401 2000 4 4 2001 3000 5 5 3001 9999
3. Exercise
Select shown data about employees who earn between 1000 and 2000.
ENAME DEPTNO SAL 5 CHIRACK 30 1250 6 BAROSSO 30 1600 7 GATES 30 1500 9 WALTON 30 1250 13 FORD 20 1100 14 ELISON 10 1300
4. Exercise
Select number and name of departments. Order them by department name.
DEPTNO DNAME 1 10 ACCOUNTING 4 40 OPERATIONS 2 20 RESEARCH 3 30 SALES
5. Exercise
Select all distinct jobs.
[1] PRESIDENT MANAGER SALESMAN CLERK ANALYST
6. Exercise
Select the data shown below about employees working in departments 10 and 20. Sort the results by their names in descending order.
ENAME JOB DEPTNO 10 TOOSK ANALYST 20 11 THATCHER CLERK 20 4 PUTIN MANAGER 20 3 MERKEL MANAGER 10 13 FORD CLERK 20 14 ELISON CLERK 10 12 CARNEGIE ANALYST 20 1 BUSH PRESIDENT 10
7. Exercise
Select clerks working in department no 20.
ENAME JOB DEPTNO 11 THATCHER CLERK 20 13 FORD CLERK 20
8. Exercise
Select employees who have a boss.
ENAME MGR 2 BLAIR 7839 3 MERKEL 7839 4 PUTIN 7839 5 CHIRACK 7698 6 BAROSSO 7698 7 GATES 7698 8 BUFFETT 7698 9 WALTON 7698 10 TOOSK 7566 11 THATCHER 7902 12 CARNEGIE 7566 13 FORD 7788 14 ELISON 7782
9.  Exercise
Select manager’s annual remuneration.
Ename AnnaulSal 1 1 BLAIR 34200 2 2 MERKEL 29400 3 3 PUTIN 35700
10. Exercise
Select monthly remuneration (salary and commission).
            Ename Remuneration
     1     BUSH         5000
     2    BLAIR         2850
     3   MERKEL         2450
     4    PUTIN         2975
     5  CHIRACK         2650
     6  BAROSSO         1900
     7    GATES         1500
     8  BUFFETT          950
     9   WALTON         1750
    10    TOOSK         3000
    11 THATCHER          800
    12 CARNEGIE         3000
    13     FORD         1100
    14   ELISON         1300
11.  Exercise*
Select workers hired in 1982.
ENAME HIREDATE 12 CARNEGIE 1982-12-09 14 ELISON 1982-01-23
12. Exercise
Select employees whose commission is greater than their salary.
ENAME SAL COMM 5 CHIRACK 1250 1400
13. Exercise
Write a select statement which returns output below (use concatenation).
[1] "BUSH is PRESIDENT and work in department no 10" [2] "BLAIR is MANAGER and work in department no 30" [3] "MERKEL is MANAGER and work in department no 10" [4] "PUTIN is MANAGER and work in department no 20" [5] "CHIRACK is SALESMAN and work in department no 30" [6] "BAROSSO is SALESMAN and work in department no 30" [7] "GATES is SALESMAN and work in department no 30" [8] "BUFFETT is CLERK and work in department no 30" [9] "WALTON is SALESMAN and work in department no 30" [10] "TOOSK is ANALYST and work in department no 20" [11] "THATCHER is CLERK and work in department no 20" [12] "CARNEGIE is ANALYST and work in department no 20" [13] "FORD is CLERK and work in department no 20" [14] "ELISON is CLERK and work in department no 10"
14. Exercise
Find employees with second letter “L” and third “A”.
ID ENAME JOB MGR HIREDATE SAL COMM DEPTNO 2 7698 BLAIR MANAGER 7839 1981-05-01 2850 NA 30
15. Exercise
Find employees whose name contains T and ends with N.
[1] "PUTIN" "WALTON"
16. Exercise
Find employees who either work as managers or work in department no 10, but not both. Hint: use xor() operator
ID ENAME JOB MGR HIREDATE SAL COMM DEPTNO 1 7839 BUSH PRESIDENT NA 1981-11-17 5000 NA 10 2 7698 BLAIR MANAGER 7839 1981-05-01 2850 NA 30 4 7566 PUTIN MANAGER 7839 1981-04-02 2975 NA 20 14 7934 ELISON CLERK 7782 1982-01-23 1300 NA 10