There are two things we can do after we quicksort an array. First, we can read it row by row, and the data will be in order. Second, we can use a binary search to find a specific row very, very quickly.
Binary Search
A binary search has some conceptual similarities to quicksort. It starts with the whole array, does a little work, selects half its previous domain, and repeats. It could be coded with recursion in C or Java. This routine has two statements that are specific to the array structure; first to retrieve a key field, second to return a return-value field. To generalize this procedure, replace the two statements with calls to array-specific procedures.
begin-procedure binary_search
let $return_value = ''
let #lower = 0
let #upper = #max_index
while #lower + 1 < #upper
let #middle = floor((#lower + #upper) / 2)
let $middle_key = array.key(#middle)
evaluate $middle_key
when > $search_key
! The middle value is too high, the next pass should
! focus on the lower half of this pass.
let #upper = #middle
when < $search_key
! The middle value is too low, the next pass should
! focus on the upper half of this pass.
let #lower = #middle
when-other
! We found the match.
let $return_value = array.return_value(#middle)
! Set #lower and #upper to exit the loop.
let #lower = #middle
let #upper = #middle
end-evaluate
end-while
end-procedure binary_search
Binary / Linear Search
Peoplesoft uses “effective dates” to keep track of changes in an object’s status. An object could be a person, a department, a general ledger account, etc. Status could refer to any fact about that object.
Imagine an employee (Mary) who is hired on January 1, 1995 as a computer operator for a wage of $20 per hour. On January 1, 1996, she gets a raise to $22 per hour. On December 10, 1997, she is promoted to senior computer operator at $25 per hour. On May 3, 2000, she becomes a junior programmer, assigned to learn SQR, for $30 per hour. On August 1, 2002, she rises to programmer I, paid $35 per hour. The PS_JOB table has a row for her employee number (EMPLID) on each of these dates (EFFDT).
Suppose we have an array with all the history of every employee, sorted by EMPLID and EFFDT. We want to find Mary’s pay rate as of September 1, 2001. We can use a binary search to find one of the rows for Mary, but there is no row for September 1, 2001. The row we want is the row with the last / highest / maximum date before our target date, which is the May 3, 2000 row. We can use a bidirectional linear search to find it.
begin-procedure binary_effdt_search
! First, find a row with a matching key.
let #lower = 0
let #upper = #max_index
while #lower + 1 < #upper
let #middle = floor((#lower + #upper) / 2)
let $middle_key = array.key(#middle)
evaluate $middle_key
when > $search_key
let #upper = #middle
when < $search_key
let #lower = #middle
when-other
! Second, if the $search_key matches, find the row
! with the highest effective date.
while array.key(#middle + 1) = $search_key
and array.effdt(#middle + 1) <= $search_effdt
add 1 to #middle
end-while
! Set #lower and #upper to exit the loop.
let #lower = #middle
let #upper = #middle
end-evaluate
end-while
! Third, if the $search_key matches, find the latest
! array.effdt() less than or equal to the $search_effdt.
if array.key(#lower) = $search_key
while array.key(#lower - 1) = $search_key
and array.effdt(#lower) > $search_effdt
subtract 1 from #lower
end-while
let $return_value = array.return_value(#lower)
else
let $return_value = ''
end-if
end-procedure binary_effdt_search
Looking Ahead
We can’t talk about sorting and searching in SQR without a brief mention of the Load-Lookup command. This is one of my favorite features of the SQR language. Although Valentine’s Day is over, next week I’ll begin a four part mini-series, “Load-Lookup Love Letter.”
Brain Teaser
Meanwhile, here is a brainteaser. Please post solutions as comments.
After importing a flat file to PS_TL_RAPID_TIME, we want to report on the results. We could have counters and accumulators in the SQR program, but we want to report on what actually made it to the database. This code uses the “select and print” feature of SQR, which is usually for columnar output, to print a footnote (x accepted entries with $y). Then it prints the second footnote (x rejected entries with $y) conventionally. What is the logical error in this code?
begin-select
count(ST_INSTANCE) (+2, 1) edit 999
'accepted entries with' ( ,+1)
sum(TL_QUANTITY) ( ,+1) edit $99,999.99
print #num_rejected (+1, 1) edit 999
print 'rejected entries with' ( ,+1)
print #amt_rejected ( ,+1) edit $99,999.99
from PS_TL_RAPID_TIME
where ST_INSTANCE = #st_instance
end-select
I’m not familiar with the PS_TL_RAPID_TIME table, but it seems likely that if all the records in the flat file were rejected, then there would be no data returned from the select statement. In that case, neither of the notes would print (even though presumably there were some number of records rejected that you would have wanted to know about)….
Absolutely correct!