We can’t always output information in the proper order by using the DBMS to sort the incoming data. Imagine a corporate roster showing who reports to whom. That’s going to require multiple selects of PS_EMPLOYEES. We could write the data to a scratch table and select it again in order. We could write it to a disk file, use a sort utility, and then read the file.

At moments like this, we face the greatest mystery of our time: why doesn’t SQR have a sort command? My theory is that SQ Software, Sybase, MITI/Sqribe, Brio, Hyperion, and Oracle never added a sort command for our own good, because programming sorts builds character.

(Seriously, does anyone know why there is no sort command in SQR?)

Quicksort is algorithm invented by C.A.R. Hoare in 1962*. It runs fast and small. It’s not optimal in every situation, but it’s usually a good choice.

The challenge for SQR programmers is that Quicksort is recursive and SQR is not. Quicksort’s basic approach is to move the “lesser” data to the first half of the list and the “greater” data to the second half of the list. Next, it calls itself twice, first to repeat the process on the first half of the list, then to repeat the process on the second half of the list.

I searched the Internet for “nonrecursive quicksort,” and found a Basic program by James Haley, dated August 28, 1985. I rewrote it for SQR and generalized it to handle any array. We can put the generic quicksort procedure in an sqc include file and write specific supporting procedures (save_pivot, compare_rows, and swap_rows) for each program.

Here is the generic quicksort procedure. I used the “local” option so that quicksort’s variables wouldn’t interfere with those variables in the main program.

(History quiz – does anyone know why #i and #j are used so often for loop counters?)

begin-procedure quicksort local
  create-array name=ranges size=32
    field=from:integer=0
    field=thru:integer=0

  clear-array name=ranges
  move 0 to #range_num
  put #first_row #last_row into ranges(#range_num)

  while #range_num >= 0
    get #from #thru from ranges(#range_num)
    subtract 1 from #range_num
    if #from < #thru
      do save_pivot
      let #i = #from
      let #j = #thru + 1
      while #i < #j
        move '<' to $relationship
        while $relationship = '<' and #i < #thru
          add 1 to #i
          do compare_rows(#i, $relationship)
        end-while
        move '>' to $relationship
        while $relationship = '>'
          subtract 1 from #j
          do compare_rows(#j, $relationship)
        end-while
        if #i < #j
          do swap_rows(#i, #j)
        end-if
      end-while
      do swap_rows(#from, #j)
      add 1 to #range_num
      let ranges.from(#range_num) = #from
      let ranges.thru(#range_num) = #j - 1
      add 1 to #range_num
      let ranges.from(#range_num) = #j + 1
      let ranges.thru(#range_num) = #thru
    end-if
  end-while
end-procedure quicksort

Each SQR program that uses quicksort needs to have its own versions of the save_pivot, compare_rows, and swap_rows procedures. For this example, I imagine a program that needs to sort three times. It uses the $sort variable to control evaluate commands so that quicksort will act on three different arrays (array, employee, and dept) and sort employee in two different ways (“by salary” and “by name”).

Note that save_pivot does not have local variables; it uses global variable $sort to set global variables #value2, or #salary2, or $last_name2 and $first_name2, or department $name2.

begin-procedure save_pivot
  evaluate $sort
    when = 'array'
      let #value2 = array.field(#from)
    when = 'employee by salary'
      let #salary2 = employee.salary(#from)
    when = 'employee by name'
      get $last_name2 $first_name2
        from employee(#from) last_name first_name
    when = 'dept by name'
      let $name2 = dept.name(#from)
  end-evaluate
end-procedure save_pivot

Note that compare_rows uses local variables, so it has to refer to $sort and the variables from save_pivot with underscores. It compares to rows from an array and returns the symbols for less than, equal to, or greater than in the $relationship variable. The value of $relationship tells quicksort whether #row1 should come before or after the pivot in the array. If we want to sort in descending order, we would reverse the rules. We can sort by multiple fields (see the employee by name example), with some field ascending and some fields descending.

begin-procedure compare_rows(#row1, :$relationship)
  evaluate $_sort
    when = 'array'
      let #value1 = array.field(#row1)
      evaluate #value1
        when < #_value2
          let $relationship = '<'
        when = #_value2
          let $relationship = '='
        when > #_value2
          let $relationship = '>'
      end-evaluate
    when = 'employee by salary'
      let #salary1 = employee.salary(#row1)
      evaluate #salary1
        when < #_salary2
          let $relationship = '<'
        when = #_salary2
          let $relationship = '='
        when > #_salary2
          let $relationship = '>'
      end-evaluate
    when = 'employee by name'
      get $last_name1 $first_name1
        from employee(#row1) last_name first_name
      evaluate $last_name1
        when < $_last_name2
          let $relationship = '<'
        when = $_last_name2
          evaluate $first_name1
            when < $_first_name2
              let $relationship = '<'
            when = $_first_name2
              let $relationship = '='
            when > $_first_name2
              let $relationship = '>'
          end-evaluate
        when > $_last_name2
          let $relationship = '>'
      end-evaluate
    when = 'dept by name'
      let $name1 = dept.name(#row1)
      evaluate $name1
        when < $_name2
          let $relationship = '<'
        when = $_name2
          let $relationship = '='
        when > $_name2
          let $relationship = '>'
      end-evaluate
  end-evaluate
end-procedure compare_rows

Swapping rows is easy when we’re dealing with just one array and just one field. For the general case, we need something this elaborate. Note that we can use the same swapping code for “employee by salary” and “employee by name.”

begin-procedure swap_rows(#row1, #row2)
  evaluate $_sort
    when = 'array'
      let #value1 = array.field(#row1)
      let #value2 = array.field(#row2)
      let array.field(#row2) = #value1
      let array.field(#row1) = #value2
    when = 'employee by salary'
    when = 'employee by name'
      get $emplid1 $last_name1 $first_name1 #salary1
        from employee(#row1)
      get $emplid2 $last_name2 $first_name2 #salary2
        from employee(#row2)
      put $emplid1 $last_name1 $first_name1 #salary1
        into employee(#row2)
      put $emplid2 $last_name2 $first_name2 #salary2
        into employee(#row1)
    when = 'dept by name'
      get $setid1 $deptid1 $name1 from dept(#row1)
      get $setid2 $deptid2 $name2 from dept(#row2)
      put $setid1 $deptid1 $name1 into dept(#row2)
      put $setid2 $deptid2 $name2 into dept(#row1)
  end-evaluate
end-procedure swap_rows

* March 2, 2009 addendum: C.A.R. Hoare is Sir Charles Antony Richard Hoare, a Britsh Computer Scientist. Not only did he invent Quicksort, he invented null pointers in 1965. He now calls null pointers his billion dollar mistake because of programmers like me who have written countless bugs based on null pointers. Thankfully, I haven’t dereferenced a pointer since I started programming in SQR.

Looking Ahead

Sometimes we sort data for use within the program, not just for output. Looking for data in an unsorted array requires a slow, linear search. Looking for data in a sorted array can use a fast, binary search. Next week we’ll look at binary searches and binary/linear searches. (A binary/linear search is my name for searching through sorted, effective-dated data.)

Brain Teaser

Here’s the warm-up: there are two variables, #x and #y. We need to set #min to the lesser of the two and #max to the greater of the two. Avoiding if commands, what are the formulae for #min and #max?

Here’s the challenge: there are three variables, #x, #y, and #z. What is the simplest code to set the values of #min, #mid, and #max?