We may be designers and programmers, but we often need to look at the contents of the database to help our customers deal with problems. I probably spend as much time writing SQL select statements in Quest Software’s TOAD (Tools for Oracle Application Developers) as I do writing SQR in SlickEdit. I rarely go beyond the twelve basic keywords (select, from, where, and, or, not, like, exists, order by, group by, in), but there are other syntactical choices. I’m not necessarily recommending them; they may be slower than the more familiar choices.
Column Vectors
When we’re joining tables, we need to equate the relevant keys, or suffer the horrible consequences. This SQL returns millions of meaningless rows, matching each employee in PS_EMPLOYEES with every row (for all the employees past and present) in PS_JOB.
select * from PS_EMPLOYEES E, PS_JOB J
This SQL is better, selecting the current row in PS_JOB for each person who is still an employee of the organization:
select * from PS_EMPLOYEES E, PS_JOB J
where J.EMPLID = E.EMPLID
and J.EMPL_RCD = E.EMPL_RCD
and J.EFFDT = E.EFFDT
This is alternate syntax for the same thing. I only have access to the Oracle database, so I don’t know whether it is standard SQL. Try it!
select * from PS_JOB
where (EMPLID, EMPL_RCD, EFFDT) in
(select EMPLID, EMPL_RCD, EFFDT from PS_EMPLOYEES)
We can take it another step. This SQL selects only active employees, omitting those on leave of absence.
select * from PS_JOB
where (EMPLID, EMPL_RCD, EFFDT, EMPL_STATUS) in
(select EMPLID, EMPL_RCD, EFFDT, 'A' from PS_EMPLOYEES)
It is a bit risque in that the subselection returns all the rows in PS_EMPLOYEES, even the employees whose EMPL_STATUS is not ‘A’. However, the employees who aren’t really active will fail the main query because their EMPL_STATUS in PS_JOB is not ‘A’.
Any Listed Item
We already know how to summarize certain or conditions. A selection like this:
select * from PS_EARNINGS_TBL where ERNCD = 'REG' or ERNCD = 'VAC'
can be written with an in list like this:
select * from PS_EARNINGS_TBL where ERNCD in ('REG','VAC')
The any keyword is an alternative to the in list as follows.
select * from PS_EARNINGS_TBL where ERNCD = any ('REG','VAC')
I haven’t found any advantage to any. The = any is the same as in. The <> any is useless. In an example similar to the statement above, the selection would return “REG” rows because “REG” <> “VAC” and it would return “VAC” rows because “VAC” <> “REG”. The > any would return any earning code greater than “REG”, making the “VAC” superfluous.
All Listed Items
Perhaps the any keyword is just to keep company with the all keyword. Here is the SQL to find the employee whose pay equals the maximum of any employee’s pay.
select EMPLID, HOURLY_RT from PS_EMPLOYEES
where HOURLY_RT = (select max(HOURLY_RT) from PS_EMPLOYEES)
Here is the SQL to find the employee whose pay is greater than or equal to all other employees.
select EMPLID, HOURLY_RT from PS_EMPLOYEES
where HOURLY_RT >= all (select HOURLY_RT from PS_EMPLOYEES)
I think the best use of this feature is to replace “not exists” or “minus.” Here’s an example:
select EMPLID, HOURLY_RT from PS_EMPLOYEES
where JOBCODE <> all (select JOBCODE from PS_JOBCODE_TBL
where JOB_FAMILY = '50')
Even this example doesn’t extend the reach of the more familiar SQL. We could write it this way instead.
select EMPLID, HOURLY_RT from PS_EMPLOYEES E, PS_JOBCODE_TBL J
where J.JOBCODE = E.JOBCODE
and J.JOB_FAMILY <> '50'
Having
Here’s a feature you might already know, because it’s the only solution for certain problems. The having keyword must follow a group by clause and the column or function it uses must be a selected column or function. The order by clause, if any, comes last.
select EMPLID, count(*) from PS_NAMES
group by EMPLID
having count(*) > 1
order by count(*) desc
I use it most often to find duplicates; a selection of one or a few columns whose values repeat. Here, it lists the ID numbers of employees who have changed their names.
Oracle’s Recursion
Relational databases discourage the use of intra-table pointers. Each level of pseudo-recursion requires another self-join. Here’s how to find a strictly limited length of a management chain.
select A.NAME, B.NAME, C.NAME
from PS_EMPLOYEES A, PS_EMPLOYEES B, PS_EMPLOYEES C
where B.EMPLID = A.REPORTS_TO
and C.EMPLID = B.REPORTS_TO
This gives us two levels of management for each employee. If we have a larger organization, we need more copies of PS_EMPLOYEES. At the same time, if the CEO reports to blank, anyone who is fewer levels down than the number of copies of PS_EMPLOYEES won’t be listed.
Oracle has a real recursion feature, using key phrases start with and connect by prior. With them we can navigate the organization chart implicit in PS_EMPLOYEES, the menu tree in the portal, and possibly even the department security tree. Here is SQL that returns the entire menu tree, indented appropriately, although not in a very useful order. Note the use of the level keyword.
select lpad(' ', 10*level-10) || PORTAL_LABEL
from PSPRSMDEFN
start with PORTAL_OBJNAME = 'PORTAL_ROOT_OBJECT'
connect by prior PORTAL_OBJNAME = PORTAL_PRNTOBJNAME
and PORTAL_NAME = 'EMPLOYEE'
In Conclusion
The column vectors can save some typing. The any and all are of questionable value – I’d love to hear from anyone who finds them useful. The having clause is occasionally indispensable. The recursion feature lets us do things in a single SQL command that would otherwise require an indefinite number of iterations – I’d love to hear from anyone who knows of similar functionality in SQL Server or DB2.
Recent Comments