Here are some small but significant ways to speed up or streamline your SQR program.
1. Select Before Delete
In every pay period we added 200,000 rows to a table that already had twenty million rows. This process had to be repeatable, so it started by deleting any rows that might exist for the current pay period.
begin-procedure delete_pay_period_rows
begin-sql
delete from PS_HUGE_TBL
where PAY_PERIOD = $current_pay_period
end-sql
end-procedure delete_pay_period_rows
We rarely needed to rerun the program, but the database reviewed every row, looking for something to delete. Oracle doesn’t use indexes for delete. Instead of calling delete_pay_period_rows every time, I added this code:
begin-select loops = 1
PAY_PERIOD
do delete_pay_period_rows
from PS_HUGE_TBL
where PAY_PERIOD = $current_pay_period
end-select
Now Oracle can use the index for PAY_PERIOD, and usually discover that there is no need for a delete.
2. Get The Current PS_JOB Row
PS_JOB has four keys, EMPLID, EMPL_RCD, EFFDT, and EFFSEQ. The “current” row for each employee is the one with the latest EFFDT (effective date) that is still in the past, and the highest EFFSEQ (effective sequence number) for that date. If our organization uses multiple values of EMPL_RCD, we might also want to get its maximum value before selecting EFFDT and EFFSEQ.
(Side note: EMPL_RCD is “employee record number,” and it can be used to allow one employee to hold different jobs simultaneously in different departments with different pay rates. More recently, it can be used to distinguish one person’s differing roles as a job applicant, employee, and other.)
Normally, to select the current row for each current employee as of a particular date, we write:
begin-select
J.EMPLID
from PS_JOB J
where J.EFFDT = (select max(EFFDT)
from PS_JOB
where EMPLID = J.EMPLID
and EMPL_RCD = J.EMPL_RCD
and EFFDT <= $as_of_date)
and J.EFFSEQ = (select max(EFFSEQ)
from PS_JOB
where EMPLID = J.EMPLID
and EMPL_RCD = J.EMPL_RCD
and EFFDT = J.EFFDT)
and J.EMPL_STATUS in (‘A’,’L’,’P’,’S’)
end-select
However, if we want just the current employees (who haven’t terminated their employment) as of the day we run the SQR program, we can use the PS_EMPLOYEES table, which is refreshed every night. PS_EMPLOYEES has many of the same fields as PS_JOB, and it might replace PS_JOB entirely. However, if we still need PS_JOB, we can write:
begin-select
J.EMPLID
from PS_EMPLOYEES E, PS_JOB J
where J.EMPLID = E.EMPLID
and J.EMPL_RCD = E.EMPL_RCD
and J.EFFDT = E.EFFDT
and J.EFFSEQ = E.EFFSEQ
end-select
Ten years after we implemented Peoplesoft HR, the PS_EMPLOYEES table might be 1/20th the size of PS_JOB, and we’ve gone from two subselects to a single join.
3. Who Dropped The Deduction?
Part-time employees with irregular schedules may get a paycheck too small to pay for all their deductions. We can find those cases by joining PS_PAY_CHECK (for the EMPLID) with PS_PAY_DEDUCTION (for the deduction code and amount). First we select the two tables for the previous pay period to see who paid the deduction last time. Then we add a “not exists” subquery of the same two tables for the current pay period to see who didn’t pay the deduction this time. Slow, slow, slow.
Here’s a faster way:
begin-select
PC.EMPLID
from PS_PAY_CHECK PC, PS_PAY_DEDUCTION PD
where PC.PAY_END_DT in ($prev_pay_end_dt, $curr_pay_end_dt)
and PD.COMPANY = PC.COMPANY
and PD.PAYGROUP = PC.PAYGROUP
and PD.PAY_END_DT = PC.PAY_END_DT
and PD.OFF_CYCLE = PC.OFF_CYCLE
and PD.PAGE_NUM = PC.PAGE_NUM
and PD.LINE_NUM = PC.LINE_NUM
and PD.DEDCD = $deduction_code
group by PC.EMPLID
having max(PC.PAY_END_DT) = $prev_pay_end_dt
end-select
The having clause is an additional selection criterion that SQL applies to each group of rows created by the group by clause.
If you haven’t implemented Peoplesoft Payroll for North America yet, there is an option you could consider. There is a setup option that specifies how many paychecks (LINE_NUM) should be on each “page” (PAGE_NUM). The product defaults to seven. If you set it to one, then every paycheck will have LINE_NUM = 1, and you won’t have to specify PD.LINE_NUM = PC.LINE_NUM. I’ve never had a chance to test it. If you do, please let me know what happens.
4. Skip The Third Pay Period
Some benefits require deductions in the first two biweekly pay periods of each month, but not for the third pay period (if it occurs).
FSA (flexible spending accounts for medical or dependent care expenses) deductions are taken in the first two pay periods, so we export them to the carrier only twice a month. The carrier wants to know who started deductions in the pay period just ended. But what if someone “started” (COVERAGE_BEGIN_DT) in a non-deducting pay period? We want to report on him or her too.
begin-select
min(PAY_BEGIN_DT) &PC.COVERAGE_BEGIN_DT
from PS_PAY_CALENDAR PC
where PC.COMPANY = $company
and PC.PAYGROUP = $paygroup
and ((SYSDATE - 28 between PC.PAY_BEGIN_DT and PC.PAY_END_DT and PC.PAY_PERIOD = 3)
or
(SYSDATE - 14 between PC.PAY_BEGIN_DT and PC.PAY_END_DT and PC.PAY_PERIOD < 3))
and PC.PAY_OFF_CYCLE_CAL = 'N'
end-select
This selection will return a start date equal to the PAY_BEGIN_DT of the skipped pay period after the following pay period ends. If there wasn’t a skipped pay period, it will return the PAY_BEGIN_DT of the pay period just ended. Then we can select all rows in PS_FSA_BENEFIT whose COVERAGE_BEGIN_DT is greater than or equal to PC.COVERAGE_BEGIN_DT.
5. Union All
Whenever it seems appropriate to perform an outer join or a union in SQL, we should consider a union all command instead. Jeff Smith proposes better alternatives to a full outer join in this article (http://weblogs.sqlteam.com/jeffs/archive/2007/04/19/Full-Outer-Joins.aspx).
A union command will cause the database to collect all the results from two select statements in a temporary table, merge them, sort them, and eliminate the duplicates. If we don’t mind duplicates, or if we know there are no duplicates, we should use union all to bypass the merge, sort, and de-duplication processes.
6. Don’t Apply Functions To Columns
Sometimes we need to test some transformation of a column rather than the column itself, but if we test a transformation of an indexed column, Oracle (and perhaps other databases) won’t use the index. It will transform that column’s value in each row. Even if the column isn’t indexed, it’s faster to transform the constant value once rather than that column’s value in each row.
- Slow: where substr(FIELD, 1, 3) = ‘ABC’
- Fast: where FIELD like ‘ABC%’
- Slow: where to_char(DATE, ‘mm/dd/yy’) = ’01/01/02’
- Fast: where DATE = to_date(’01/01/02’, ‘mm/dd/yy’)
- Slow: order by substr(FIELD, 1, 2), substr(FIELD, 3, 2)
- Fast: order by FIELD
The last example was a compound key consisting of three values that were two characters each. The business requirements were to sort by the first two values. We need to recognize that sorting the entire field will accomplish the same thing.
7. Value Lists
Peoplesoft Time & Labor introduced a great idea, value lists. The PS_TL_VAL_LIST_DTL table has three fields, LIST_ID, LIST_TYPE, and VALUEGROUP. LIST_TYPE can be anything; union code, jobcode, deptid, position number, location, shift, company, paygroup, grade, step, employee status, employee type, full-time/part-time code, earnings code, deduction code, tax code, etc.
We’ve all written SQL selections that include “in list” clauses, with an arbitrary list of values. The database handles these inefficiently, because they are equivalent to a long list of tests, separated by “or.”
where ERNCD in ('REG','OVT','VAC')
Is the same as
where (ERNCD = 'REG' or ERNCD = 'OVT' or ERNCD = 'VAC')
With value lists, we can create a list and use it as a subquery or a joined table:
where ERNCD in (select VALUEGROUP
from PS_TL_VAL_LIST_DTL
where LIST_ID = 'ABC'
and LIST_TYPE = 'ERNCD')
or
from PS_TL_VAL_LIST_DTL TVLD, PS_PAY_OTH_EARNS POE
where TVLD.LIST_ID = 'ABC'
and TVLD.LIST_TYPE = 'ERNCD'
and POE.ERNCD = TVLD.VALUEGROUP
This also saves us from changing the program every time the list changes. So, create a version of PS_TL_VAL_LIST_DTL for yourself!
It’s a shame that non-Peoplesoft SQL users are unlikely to see this.
Thank you. Maybe, someday, a SQL blogger will find this post and mention it on his or her blog.
Very helpful SQL tips