Friday, December 12, 2008

Refactoring Foxhound: Logging

Jeff Atwood's post The Problem With Logging got me thinking: I should talk about diagnostic logging inside Foxhound.

(This being a blog about databases, I have to make one thing clear: diagnostic logging has nothing to do with transaction logging.)
Foxhound creates two kinds of diagnostic logs: large amounts program trace data are written to the SQL Anywhere console log text file during development, and a small number of errors and exceptions are written to a permanent table by both development and delivery builds (those errors and exceptions are also written to the console log during development).
(The location and size of the console log file is controlled by these command line options: dbeng11 -o filespec.txt -os 10M)
Jeff says "I am not anti-logging. I am anti-abusive-logging." That said, Jeff has five main points to make, five problems he sees with the way programmers often do diagnostic logging... let's see how those points apply to Foxhound (Jeff's points are in "Bold italics", my comments follow):

"Logging means more code."

All the tracing logic in Foxhound is hand crafted, none of it is automatically generated. That reduces the amount of superfluous logging, output that will never be useful... most instances of tracing logic have been added to Foxhound to help debug some specific problem, or added to sections of code that have had a history of problems.

The Monitor section of Foxhound has been particularly hard hit, due to it's multi-threaded and timing dependent nature, so that's where the most interesting tracing logic exists.

Once added, tracing logic tends to remain in the code until it proves to be annoying... more on this later.

"Logging isn't free."

All of the tracing logic uses SQL Anywhere's MESSAGE ... FOR DEBUG feature, and it's enabled only for development builds, not delivery. Over time, however, some large sections of tracing logic have been removed from Foxhound to avoid performance penalties during development... sometimes deleted, sometimes commented-out in case it becomes useful again during periods of heavy maintenance.

That only applies to the development build, however. The execution profiler has never shown any bottlenecks in the delivery build that were caused by disabled MESSAGE ... FOR DEBUG statements.

"If it's worth saving to a logfile, it's worth showing in the user interface."

Foxhound doesn't (intentionally) fail, there's no concept of "fatal" errors. The design calls for all exceptions to be handled one way or another: Some are "normal" such as the failure to connect to a target database (which is shown to the user), or the failure to complete some SQL operation because Foxhound is shutting down (which is not shown to the user).

Other exceptions are "abnormal" (unexpected and/or unheard of) and are worth recording for possible future investigation; those are recorded in a permanent table. They are the reason why the Foxhound Options page has buttons labelled Display Diagnostics, Export Diagnostics and Delete Diagnostics.

The vast majority of diagnostic logging data, however, is tracing data written to the console log, and only during development. NONE of that tracing data is written by the delivery build, and there is no Foxhound user interface to display it... during testing, Wordpad is the tracing data UI of choice.

"The more you log, the less you can find."

That's absolutely true; that's the most frequent reason various pieces of tracing logic are removed from Foxhound: to keep the size of the console log file humanly manageable. Sometimes, the pieces are commented-out ("Who knows when this #$%&!%# procedure will give trouble again!"), sometimes they are permanently expunged ("No, this inner-inner-inner loop works now, I'm never gonna need that ten-thousand-line trace again!")

"The logfile that cried Wolf" ... i.e., serious errors will go unnoticed in huge logfiles that contain too much unimportant information.

Sure that's true, if the huge log file is the only place serious errors are recorded. That's not the case with Foxhound, however... that's what the permanent table is for.



I've got some things to say myself, about logging...
  • If logging helps, do it. If it hurts, stop.

  • Diagnostic logging is neither good, nor bad, per se. It's a tool which can be used for good or evil, it's your choice.

  • Logging contributes nothing to the end product; no customer is going to be impressed by the elegance and thoroughness of your logs.

  • And last but not least: Debuggers are great, especially the one built in to SQL Anywhere, but they are not a silver bullet.
All debuggers come with a considerable amount of human procedural overhead. I'm not anti-debugger... I once bought a hardware circuit board to debug an IBM PC timer interrupt handler: $800 in 1988 dollars, a week to figure out how to make it work, one single "Aha!" moment, one single line of code to fix, and then it was never used again.

Debuggers are the MRI machines of the programming world, invaluable for some problems but unnecessary for most. In most cases nothing beats a simple MESSAGE statement. In many cases even that's not required, not if you know where the error occurred and you can see the SQLCODE and ERRORMSG().



Show Me The Code!

Here's how it diagnostic tracing works, in Foxhound, starting with the Windows batch file that runs the development build. The heart of this batch file is the dbisql command which launches the debug build SQL file called 004_rroad_build_debug.sql:
Note: The ^ character is the line continuation character for Windows command files.
"%SQLANY11%\Bin32\dbisql.com"^
-c "ENG=foxhound1;DBN=f;UID=dba;PWD=sql"^
READ ENCODING Cp1252 004_rroad_build_debug.sql^
[ON] [ON] [Beta 1.09] [a] [10]
The five [values] inside square braces are parameter values passed to dbisql for the five named parameters inside the file.

Here's what the corresponding dbisql command looks like in the delivery build batch file; different SQL file, different parameter values:
"%SQLANY11%\Bin32\dbisql.com"^
-c "ENG=foxhound1;DBN=f;UID=dba;PWD=sql"^
READ ENCODING Cp1252 004_rroad_build_for_delivery.sql^
[OFF] [OFF] [Beta 1.09] [a] [10]
Now back to the debug build; here are some excerpts from the 004_rroad_build_debug.sql file:
PARAMETERS DEBUG_MESSAGES, 
MONITOR_DEBUG_MESSAGES,
VERSION_PREFIX,
BUILD_FORK,
MAX_PROXY_OWNER_COUNT;

...

READ ENCODING Cp1252 010_rroad_setup.sql
[{DEBUG_MESSAGES}]
[{VERSION_PREFIX}]
[{BUILD_FORK}]

[{MAX_PROXY_OWNER_COUNT}];
...

READ ENCODING Cp1252 203_rroad_monitor_sample_loop.sql
[{MONITOR_DEBUG_MESSAGES}];

...
The PARAMETERS command is not part of SQL, it's a command specific to dbisql that gives names to different [values] passed from the command line. The series of READ statements inside 004_rroad_build_debug.sql tell dbisql to process each named *.sql file in turn, passing on whatever parameter [values] they require.

Inside the [square braces], the names enclosed inside {curly braces} are references back to the PARAMETERS statement.

For example, MONITOR_DEBUG_MESSAGES parameter is second on the dbisql command line, and it is passed on as the first parameter to the file 203_rroad_monitor_sample_loop.sql. For the debug build, the value is ON. For the delivery build batch file, and the corresponding delivery SQL file, the value is OFF.

Here's an excerpt from 203_rroad_monitor_sample_loop.sql:
PARAMETERS MONITOR_DEBUG_MESSAGES;

...

CREATE EVENT rroad_monitor_sample_loop
HANDLER BEGIN

...

SET TEMPORARY OPTION DEBUG_MESSAGES
= '{MONITOR_DEBUG_MESSAGES}';

...

MESSAGE STRING ( 'DIAG ', CURRENT TIMESTAMP, ' ',
CONNECTION_PROPERTY ( 'Number' ),
' rroad_monitor_sample - Stop Sampling for ',
@sampling_id ) TO CONSOLE DEBUG ONLY;

...
Since every EVENT and web SERVICE runs on its own connection, every event and service block is responsible for its own connection-level SET TEMPORARY OPTION commands... hence the PARAMETERS command in this SQL file. The handling of {curly brace values} is a pre-compile string substitution process performed by dbisql before the code is passed to SQL Anywhere for processing; here's the SET command after the substitution:
SET TEMPORARY OPTION DEBUG_MESSAGES 
= 'ON';
For many applications, setting DEBUG_MESSAGES is not that complex: The simplest approach is to set it ON and OFF one time, at the database level, and be done with it:
SET OPTION PUBLIC.DEBUG_MESSAGES = 'ON'; -- or 'OFF'
In a client server environment, your application could pass the following command to the engine after starting a new connection:
SET TEMPORARY OPTION DEBUG_MESSAGES = 'ON';
That might not work too well in a connection pooling environment; for that, it's probably best to turn it ON and OFF at the database level as described above.

Foxhound, however, is built entirely around EVENT and web SERVICE blocks; there are no client server database connections, no connection pools, no applications in the traditional sense, just the web browser.

The MESSAGE statement shown above contains these basic components:
  • A call to STRING to build the message text.

  • The 'DIAG ' substring acts as a eyecatcher so you can tell diagnostic text apart from checkpoint messages and other data in the log.

  • The CURRENT TIMESTAMP is often useful when looking for needles in the haystack.

  • The CONNECTION_PROPERTY ( 'Number' ) value is helpful with applications that have many simultaneous connections; e.g., busy applications with many users, and Foxhound with its myriad events and web services.

  • The rest of the data passed to STRING is the actual diagnostic data, unique to each MESSAGE statement.

  • The TO CONSOLE DEBUG ONLY clauses control where and when the data is written; for more information see here.

No comments: