Wednesday, July 8, 2009

Why did it do THAT?

In keeping with this blog's theme of decomposition (insert Beethoven joke here), bug hunting can be divided into two parts: identifying the bug and determining the cause. Repair takes us back to the "How do you do this?" question.

Here I will focus on tracking down the cause. Identification is more about testing than code writing.

Finding hard bugs is a subspecies of the scientific method. Create a theory of why the bug occurs and try to prove it wrong. Repeat. Eventually you find a correct theory.

Being able to easily construct mental models of what is going on is critical to this process. These models come in two flavours: a model of how things are supposed to work, and models that explain a given (erroneous) behaviour.

A good model of how things were designed will lead you to the software component that is generating the error. At that point an inspection of the local code and its inputs can start you on the trail of the bug. Breakpoints, tracing, and debugging output come into play here.
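As a rough illustration of the tracing and debugging-output approach, here is a minimal Python sketch. The `traced` decorator and the suspect routine `parse_price` are hypothetical names invented for this example; the idea is only to show how logging a component's inputs and outputs can put you on the trail.

```python
import functools
import logging

logging.basicConfig(level=logging.DEBUG, format="%(levelname)s %(message)s")
log = logging.getLogger("bughunt")


def traced(func):
    """Log the inputs and the result (or exception) of every call."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        log.debug("call %s args=%r kwargs=%r", func.__name__, args, kwargs)
        try:
            result = func(*args, **kwargs)
        except Exception:
            # Record the traceback, then let the error propagate unchanged.
            log.exception("%s raised", func.__name__)
            raise
        log.debug("%s returned %r", func.__name__, result)
        return result
    return wrapper


@traced
def parse_price(text):
    # Hypothetical suspect routine: converts "12.50" into integer cents.
    dollars, cents = text.split(".")
    return int(dollars) * 100 + int(cents)
```

Calling `parse_price("12.5")` logs a return value of 1205 rather than 1250, which is the kind of input/output mismatch that suggests the next theory to test.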

Lacking such a model (the joys of legacy code), you need to start constructing possible models of what could be going on. At this point the inventiveness referred to in the design post becomes very useful.

How do you do THIS?

There are three components to the ability to solve hard design problems: knowledge of the tools at hand, the skill to combine them in novel ways to create solutions, and the ability to evaluate the results and select the best solution.

Knowledge of the tools at hand is the easy part, both in application and in identifying individuals who have this knowledge. Schools, training courses, books, conferences, websites: the list of resources available to improve in this area is long and varied.

It is not entirely clear how much inventiveness is a learned skill as opposed to an innate aptitude. There are large differences between individuals in this area, but it also seems apparent that solving a lot of different problems can increase someone's skill level.

The ability to evaluate the results is a very important part of solving hard problems. If there are metrics that can be used to compare results, this component is not particularly difficult. However, that is a very large "if". In general this requires judgment, and good judgment is frequently associated with a history of burned fingers.

Hard Problems

The first question when considering the issue of hard problems is: Why put out the effort to solve them? There are two basic reasons for attacking hard problems. Either the problem is significant or finding the solution is satisfying. Puzzles and games appeal purely to the latter motivation. Solving significant problems has a strong tendency to provide economic benefits to the solver, either directly or via an organization they are a member of.

When looking at problems, taxonomy arises quickly. What kind of hard problems are we looking at? At this point in this blog I am looking at two kinds of problems, software design and software debugging. Or: how do you do THIS, and why is it doing THAT?

Code Quality

First: Which aspect of code quality am I concerned with here? Of all the qualities that code has, correctness takes pride of place; if the code is not correct, not a lot else matters. However, this is not what most people mean by the phrase "code quality", since correctness is assumed as a basic attribute rather than a distinguishing one. Then there are the performance qualities, time and space. Given Moore's Law (even after running off the end of it for speed), optimizing compilers, and other advances, this is not an issue in the small. If there are performance issues, they need to be addressed (in the absence of pathology) with algorithmic and architectural attacks.
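To make "algorithmic attack" concrete, here is a small Python sketch under my own assumptions (the duplicate-detection task and both function names are invented for illustration, and the items are assumed to be hashable). Both routines give the same answer; the second replaces the nested comparison with a set lookup, which changes how the work grows with input size rather than shaving cycles off the inner loop.

```python
def has_duplicates_quadratic(items):
    """Compare every pair: work grows with the square of len(items)."""
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            if items[i] == items[j]:
                return True
    return False


def has_duplicates_linear(items):
    """Remember what has been seen: one pass, assuming hashable items."""
    seen = set()
    for item in items:
        if item in seen:
            return True
        seen.add(item)
    return False
```

No amount of tuning the first version in the small will match the change of algorithm in the second once the input gets large.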

For me, the prime code quality issue that is contentious enough to write about is comprehensibility. If people can read the code it will be maintainable, portable, and all the other things that people want in code when they are looking at it.

What drives comprehensibility? Or, to flip the question, what drives incomprehensibility? In computer programs the primary barrier to understanding is complexity. One of the primary controllable factors driving complexity is scale (this also applies to looking at computer systems and projects at higher levels).

To reduce complexity, reduce the scale.

This means that computer programs need to be factored into small subroutines. Keeping routines small requires constructing them from lower level abstractions. Identifying those abstractions and designing good interfaces for them are the primary skills required to create good code.
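A minimal Python sketch of what this factoring can look like, assuming a made-up task (computing the average word length of a piece of text) and hypothetical helper names. Each routine is only a few lines long because it is built from the lower level abstractions beneath it.

```python
def split_words(text):
    """Lowest level: break text into whitespace-separated tokens."""
    return text.split()


def strip_punctuation(word):
    """Remove common punctuation from both ends of a token."""
    return word.strip(".,;:!?\"'()")


def mean(values):
    """Arithmetic mean, defined as 0.0 for an empty sequence."""
    return sum(values) / len(values) if values else 0.0


def average_word_length(text):
    """Top level: reads as a description of the task, not the mechanics."""
    words = [strip_punctuation(w) for w in split_words(text)]
    lengths = [len(w) for w in words if w]
    return mean(lengths)
```

Each helper can be read, tested, and replaced on its own, and the interface decisions (for example, what `mean` returns for no input) are made in one small place.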