The only thing that really drove me nuts when I was active on newsgroups was the willingness of some programmers to code around an error instead of taking time to find the bug.
The day after fighting my way through an intransigent bug, I read two great blog posts from two coders about their own challenging cases. While my own doesn’t rival theirs for the sophistication of either the problems or the solutions, all three have one thing in common. Each of us worked hard to disentangle the problem from the visible behavior. That both of these good programmers obviously spent a lot of time going to the source (ha, ha) reinforces, for me, anyway, the philosophy that I can’t fix a problem unless I understand it.
Jeff Atwood discusses a problem his project experienced with deadlocking. Even if you don’t care about my debugging philosophy, give it a read. It’s a nice discussion of comp. sci. theory meeting real life. Atwood intelligently walks through the process of finding the problem, evaluating possible fixes, and the reasons behind the choice he made.
In the meantime, Rick Strahl was fighting problems with a COM server leaking handles. Again, his discussions are always intelligent and worth reading for, if nothing else, the insight into a good programmer’s methodology.
Both Atwood and Strahl (big fish, big ponds) were doing the same thing I was doing (little fish, in my home office). They were whittling away the surrounding environment in order to isolate the (mis)behavior.
My bug was my own doing (unlike Strahl and Atwood). The problem was in a VFP application, compiled into an .APP that adds menu options to a main application. The bug was in a form in the .APP.
I should note that my process is to build classes up incrementally, testing each behavior as I add it. The troublesome components were tested and working well…until I got to testing the APP from the EXE.
Even though I had compiled in the debug info and set breakpoints…even going so far as to insert a SUSPEND, I couldn’t get the program to break before the problem. The error that the feature wasn’t available told me exactly where I was going wrong with the approach. This is the first application I’ve worked on that is configured like this (with an APP running from another EXE’s menu), so I didn’t realize for a long time that I couldn’t debug in the APP running from the EXE. Duh. I know, I know, you can’t make me feel dumber than I already feel. :-)
In his case, Strahl narrowed code down to the most elemental example that showed the problem. Atwood used the SQL Server 2005 Profiler to find the statement that was always involved in the deadlocks. My biggest challenge was setting up an example that I could use the debugger with and that didn’t change the error behavior. To further complicate things, my error handling wasn’t catching the bug in any useful way, so it wasn’t being logged.
Once I could replicate the error where I could debug it, I could isolate the source of the error. But that wasn’t the source of the bug. It turned out I was passing a logical instead of a character to a method that creates a folder. I could have checked the type of the parameter and tried to handle it (a common VFP “mistake”). That would have fixed the error, but not the bug. After all, I fully expected that the caller *should* have had a character to pass. Backing up to find out why it had passed a logical–and it had, pcount() = 1–found the real source of the Nile…er…error, which was easy to fix.
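The difference between fixing the error and fixing the bug can be sketched roughly like this. The sketch is in Python rather than VFP, and both function names are hypothetical stand-ins, not the actual application code. The first version codes around the bad argument; the second fails loudly, so the bad call can be traced back to the caller that made it.

```python
import os


def make_folder_lenient(path):
    """Codes around the error: a non-string argument is silently ignored."""
    if not isinstance(path, str):
        # The error goes away, but the caller's bug is still there.
        return None
    os.makedirs(path, exist_ok=True)
    return path


def make_folder_strict(path):
    """Refuses a non-string argument so the caller's bug surfaces."""
    if not isinstance(path, str):
        raise TypeError(
            f"path must be a string, got {type(path).__name__}"
        )
    os.makedirs(path, exist_ok=True)
    return path
```

With the lenient version, a call such as `make_folder_lenient(False)` quietly returns nothing and the real mistake stays hidden upstream. With the strict version, the same call raises a `TypeError` whose traceback points at the caller, which is where the digging has to start anyway.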
Two morals to this story. First, it’s hard work to find the real source of an error: fixing it is frequently far easier. Second, if even the big dogs suffer the frustration of painstakingly isolating problems, well, at least in debugging technique I’m in good company.
Now, if only I’d stop having those damned brain farts.