The term “log file” implies that this file serves as a log; a diary; a record of significant events in the program. It’s a place for unofficial information or information about this instance of this program, not to be stored in the organization’s production database for widespread access and not to be printed in reports for the system’s customers. It’s information for the technical team to use for troubleshooting or analysis.
1. Standard Log Entries
I like to output some standard information at the beginning of the log file of every program I write.
- The database that the SQR program used.
- The Peoplesoft Process Scheduler process instance number.
- The name of the run control table.
- The operator ID.
- The run control ID.
If the Peoplesoft Process Scheduler runs the SQR program as a “SQR Report” process type, it will supply the process instance number, operator ID, run control ID, and database instance name in the command line. If the SQR program calls the Peoplesoft-delivered procedure “stdapi-init” in include file “stdapi.sqc,” those value will be available to the program in variables $prcs_process_instance, $prcs_oprid, $prcs_run_cntl_id, and $database_name respectively.
I divide my programs into a series of major procedures. At the beginning of each major procedure, I put a “checkpoint” that logs a message that the procedure has begun, with the time it occurred. This tells me that the program didn’t crash before reaching that procedure, and whether there were any unusual delays in reaching that point. I wrote a simple procedure to help do that:
begin-procedure show_time($message)
let $time = edit(datenow(), 'HH:MI:SS')
show $time ' ' $message
end-procedure show_time
I make frequent use of the load-lookup command, which logs the number of rows it reads. Before each load-lookup, I output a message containing the value of the rows parameter, to let me know if I should increase that value. Here’s sample code:
show 'Expecting fewer than {num_emps} rows'
load-lookup name=emp_map
table='(select EMPLID, ROWNUM from PS_PERSONAL_DATA order by EMPLID)'
key=EMPLID return_value=ROWNUM rows={num_emps}
And here’s the output in the log file:
Expecting fewer than 25000 rows
(SQR 2613) Loading 'emp_map' lookup table ... done. 21328 rows loaded.
2. Debugging Entries
SQR programmers might not know that there is a class of software called “debuggers.” These are programming tools that execute a program within an environment that lets us slow it (our SQR program) down, monitor its operation, stop it at any point, view and even change its variables. Any professional programmer ought to consider using these tools.
SparkPath Technologies (www.sparkpath.com) offers SP Debugger for SQR. It works as a stand-alone product or in concert with the SQR Express integrated development environment from Business Computing Solutions (www.sqrexpress.com).
CON+DEV Consulting & Development (www.sqr-runner.com) offers SQR Runner 4 Pro, an integrated development environment which includes a debugger.
SRI Technologies (www.sritech.biz) offers the SQRPlus Interactive Debugger as part of their integrated development environment.
We also output messages and variables to the log file while developing and testing our programs. It’s a crude trace facility with breakpoints (the stop command) and variable examination (the show command). The output is often voluminous and rarely self-explanatory. It’s not interactive, so if we find a “clue” to our bug, we have to add more debugging code and run our program again from the beginning.
I believe it’s best to delete all the debugging display and show commands before the program goes into production, taking the clutter out of the source code as well as the log file. It’s second best to comment them out. The only difference between live code and commented-out code is an inconspicuous exclamation point.
3. Analysis And Troubleshooting Entries
Any program that stays in use for enough years will eventually encounter unexpected conditions. Long after everyone thought the program was debugged, there will be a bug. Perhaps year “00” will follow year “99.” Perhaps there will be a year with 27 biweekly paydays. Perhaps our users will suspect the program’s output but it will turn out not be a bug after all.
Sometimes a complex data manipulation or program flow lies deep within the program, far removed from the input or output values. Or there may be a pocket of code that is rarely executed (the full-moon on leap-day routine). It could be helpful to leave some commentary in the log file for the questions that are bound to arise.
I wrote a program that generated 200,000 rows of data every time it ran. One of its functions was to identify twelve pairs of small subsets of the data, sum the values of those 24 subsets, and divide the first sum of each pair by the second sum of each pair. The twelve resulting ratios were usually between 0.2 and 0.3, but they could be quite different. For example, any of the twelve pairs could, theoretically, have zero as the numerator or zero as the denominator.
It took a lot of testing and fixing to make the ratios correct, and it was difficult to evaluate them in haystack of 200,000 rows. I added show statements to the program to identify the subsets, numerators, denominators, and ratios. I decided that information was too interesting to remove after I finished debugging. It remains in the log file, ready for the day that the results look wrong.
4. Error Messages
I love error messages. I love to get them (as opposed to program failures without messages) and I love to send them. The log file is the perfect place for error messages. (In a future blog post, I will discuss using Peoplesoft’s message log table as well.)
The Peoplesoft Process Monitor uses the term “success” to mean a process ended normally, but not necessarily that the results were accurate. Fortunately, we can make the Process Monitor say “error” simply by executing the stop command.
An SQR error message can have two audiences, depending on the traditions in the organization. What are the organizational expectations for the users who ran the program and got the “error” status in the Process Monitor? Should they view the log file and attempt to deal with the problem themselves? Or should they call the programmer at the first sign of trouble?
If we want to encourage the users to help themselves, the error messages should be in a language they understand. Peoplesoft can track the language (English, Spanish, etc.) of its operators, but what I mean is that the error message should be in a human language, spelled correctly, jargon-free, in full sentences, grammatical, and nicely formatted.
The error message should explain what went wrong as specifically as possible, including specific data if necessary. It should explain what the user should do about it. Here is an example:
You asked for all transactions between April 30, 2009 and April 1, 2009. The end date is earlier than the start date. Please change the dates and rerun the program.
Or,
This program can only handle 20 earning codes per paycheck. Employee 12345 (Pat Smith) has more than 20. Please review the paycheck. It there really should be that many earning codes, please contact the IT department. Otherwise, fix the paycheck and rerun the program.
Some errors (or in some organizations, all errors) should be referred to the IT (Information Technology) department. Those error messages can be less friendly but they should have a clear message that the user shouldn’t panic and should call IT. They could also tell the user what to say to IT. (“The finance report failed” is less helpful than “Program FINX021 failed while processing employee 12345.”) Those errors should include the values of the variables that may have contributed to the problem. SQR provides built-in variables to monitor database interaction.
- #sql-count is the number of rows affected by the last SQL insert, update, or delete.
- $sql-error is the text message from the database.
- #sql-status is the status value (e.g. Oracle’s V2) of the last SQL select statement compiled or executed.
Discussion Topic
Is it better to spend more time making a program failsafe (never fails) or failsoft (gives good messages when it does fail)?
It’s impossible to make it failsafe, so you might as well make it failsoft.
Is there any way to assign a severity level to a log message and specify at runtime what severity to display?
An even cooler feature would be where the program pushes each function entry onto a stack and pops it on exit. The stack only gets dumped if a severe message (i.e. an error) gets logged. That way, you have a stack trace for your errors, but you don’t create huge log files when everything is OK. (And even when an error happens, you have a succinct log file.)
Bob, once again you’re aiming low and succumbing to lousy programming habits. You’re just resigning yourself to having bugs, and trying to CYA with error messages.
Maybe if you cared more about the quality of your work, you wouldn’t need to log so many errors.
SQR does have a facility somewhat like severity level. Any line can begin with “#debug”, and SQR will compile that line if there is a “-debug” parameter in the SQR command line. Further, a line can begin with “#debuga” (where “a” is a single letter or digit). Then the SQR command line can have something like “-debugab12” which will cause SQR to compile any line beginning with #debuga, #debugb, #debug1, or #debug2.
SQR doesn’t have a “fail-only output” function. However, we could create a file and write to it. Then, the last thing we would do before the successful end of the program is to close it and delete it.
Remember, Edward, we could write a program that met the users stated requirements perfectly, but failed when the user (or the database) fell out of conformance. I had a case where the original user specified that my program should produce a report for a given pay period. Then, a different user ran the report without entering a pay period. I learned a lesson about validating user inputs, but I learned it from the log file; the second user didn’t mention that she hadn’t entered a value.
I’m not sure we’re talking about the same thing. My idea was that, instead of writing a log command in each function or module, just push the name of the function on a stack (i.e. a growing array). Pop it when you exit the function. Then, when you want to log a fatal error, call a routine that logs all the function names on the stack before logging the error message.
You’re right, that is something different. It seems interesting, but we could only use it for errors that the programmer detected. If the SQR interpreter finds an error, it won’t dump a stack for us. Perhaps some of the third party debuggers would.