Thursday, January 21, 2010

Span of Control - The Mosque

One of the ways to evaluate a design is to look at the span of control of each component.  If the span is too wide, the component may have absorbed too many details from lower components.  If it is too narrow, the partitioning may not be effective at reducing complexity.

The too narrow case has a sharp bound.  If there is only one thing below a component, that component should be looked at.

The too wide case is a little fuzzier.  The limit is probably something like 5 plus or minus 2 components directly below.

There are exceptions.  If the too narrow case is being called from many places and it makes a semantic transformation to what it is calling, that is probably a good thing.

If the too wide case is something like a command parser where the core is a giant switch statement where each case is one of the commands, then it is probably not usefully reducible.  In that case there should be no other control flow statements in the procedure and each switch case should be a single line.  Hoist surrounding control flow into an outer function and create functions for any multiline cases.
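
A rough sketch of that shape in Python, using a dispatch table to stand in for the switch statement.  The command names (add, echo, count) and the helper names are made up for illustration.  Each entry in the table is a single line, the multiline work lives in its own function, and the surrounding control flow is hoisted into run_line.

```python
def cmd_add(args):
    """Return the sum of the integer arguments."""
    return sum(int(a) for a in args)


def cmd_echo(args):
    """Return the arguments joined by single spaces."""
    return " ".join(args)


def cmd_count(args):
    """Return the number of arguments."""
    return len(args)


# The wide part: one line per command and no other control flow.
COMMANDS = {
    "add": cmd_add,
    "echo": cmd_echo,
    "count": cmd_count,
}


def dispatch(name, args):
    """Run the named command on args."""
    return COMMANDS[name](args)


def run_line(line):
    """Parse one input line and run it; the hoisted control flow lives here."""
    words = line.split()
    if not words:
        return None
    if words[0] not in COMMANDS:
        return "unknown command: " + words[0]
    return dispatch(words[0], words[1:])
```

Adding a command means adding one function and one line to the table; dispatch and run_line do not change.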

One of the consequences of this is that good designs tend to show a mosque shape.  They get wider until a maximum level and then narrow slightly.  This arises from the delegation structure at the top calling unique routines but the final work being done by fewer common routines (possibly from a library).

Friday, January 15, 2010

Coupling

Coupling is even more critical than cohesion as a design principle.  Bad cohesion leads to comprehension problems, which (as a first order effect) lead to slower code modification.  Bad coupling leads directly to bugs.

Types of coupling (worst to best)

Content -  uses internal structure
Common - uses global data
External - externally imposed structure
Control - passing down a control switch
Stamp - composite parameters
Data - elementary parameters
Message - no parameters
None - no coupling at all

You do need to couple to get anything done.  The key is to keep the coupling as loose as possible.

The more egregious forms of coupling are obviated by modern language design.  It is not possible to modify local variables from outside the routine they are defined in.  Globals are still possible, so discipline is required to not make use of them.

Polymorphism in OOP is the attack on control coupling.  Instead of passing down a flag to be tested, the controlling routine creates an object with the desired behaviour so the controlled routine does not have to handle any decisions.
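
A minimal sketch of the difference in Python.  The names (render_with_flag, render, Plain, Shout) are invented for the example.  In the first version the caller passes a flag and the callee has to test it.  In the second the caller picks an object and the callee makes no decisions at all.

```python
# Control coupling: the caller passes a switch and the callee must test it.
def render_with_flag(text, shout):
    """Return text, upper-cased when shout is true."""
    if shout:
        return text.upper()
    return text


# Polymorphic version: the caller chooses the behaviour by choosing the object.
class Plain:
    def apply(self, text):
        """Return text unchanged."""
        return text


class Shout:
    def apply(self, text):
        """Return text in upper case."""
        return text.upper()


def render(text, style):
    """Return text transformed by style; no decisions are made here."""
    return style.apply(text)
```

A new style is a new class.  render does not grow another branch, and the caller's choice is no longer a value that the callee has to interpret.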

The costs of stamp coupling can be mitigated by encapsulation.  If the external interface can be held constant it doesn't matter to the routine that the implementation has changed.

The Don't Repeat Yourself (DRY) principle is a powerful attack on the problems caused by coupling.  If the changes can be made in one place, and everything derives behaviour (correctly) from that definition, changes cannot cause bugs. Correctly is of course the key word there.  Careful data structure design is still required.
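
One small illustration in Python, with made-up names (FIELDS, header_line, row_line).  The column order is defined once and both routines derive their behaviour from it, so changing the order in one place cannot leave the header and the rows out of step.

```python
# The single definition: everything below derives its order from this list.
FIELDS = ["name", "size", "owner"]


def header_line():
    """Return the header line for the report."""
    return ",".join(FIELDS)


def row_line(record):
    """Return one report line for record, a dict with a value for every field."""
    return ",".join(str(record[field]) for field in FIELDS)
```

This only works because the list is the right data structure for the job.  If the rows needed information the list does not carry (column widths, say), the definition would have to be redesigned, not patched in each routine.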

Thursday, January 14, 2010

Cohesion

Cohesion is the first principle to consider when doing design.  At this level this is the concept that each procedure is responsible for only one thing.  My earlier post about commenting rules was really all about how to identify the cohesion level for a procedure.

Note: Cohesion can also be applied to higher order groupings, such as classes.

There is a ranked taxonomy of cohesion (from worst to best).

Coincidental  - There really isn't any.

Logical  - It all does the same kind of thing (more applicable to higher order groupings).

Temporal - It all happens at the same time.

Procedural - One thing is required to follow another.

Communicational - They all work on the same stuff.

Sequential - The results of one part are the input to the next.

Functional  - It just does one thing

It is not always possible to get good cohesion in a procedure (eg an initialization routine).  It is important when confronting that situation to simplify the procedure as much as possible.  A low cohesion routine which is a list of function calls (each of which is a higher cohesion routine) will be part of a better system than one which is full of if statements and loops.
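
A sketch of what that can look like in Python.  The routine names and the contents of the state dictionary are assumptions made up for the example.  initialize has only temporal cohesion (everything in it happens at startup), but it is nothing more than a list of calls to routines that each do one thing.

```python
def load_settings(state):
    """Store the default settings in state."""
    state["settings"] = {"verbose": False}


def open_log(state):
    """Store an empty in-memory log in state."""
    state["log"] = []


def register_commands(state):
    """Store the names of the known commands in state."""
    state["commands"] = ["help", "quit"]


def initialize():
    """Build and return the startup state."""
    state = {}
    load_settings(state)
    open_log(state)
    register_commands(state)
    return state
```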

Wednesday, January 13, 2010

What is Design?

Once we have a problem that is bigger than we want to deal with in a single block of code (which happens much sooner than some people seem to believe) we have to decide how to partition the code to solve the problem.  This partitioning, and the data transfer decisions that derive from it, is what I am calling design.

Tuesday, January 12, 2010

Programming Taxonomy

There are some assumptions that I am making about how programming is done that I think I should make explicit, even if they are common to the point of near ubiquity.

Almost all programming is done using the approach of procedural programming, broadly defined.  That is, we program a computer by specifying a procedure to follow.  It is possible to use different approaches; Prolog, in particular, is a language that does not embody the procedural approach.

But,  I hear objections, what about object oriented programming, or functional programming?

These are both subsets of procedural programming.  Object oriented programming is all about how to package the procedures (and the data they operate on) to cope with the complexity of the programs we are trying to write. 

Functional programming restricts the procedures available to the programmer to those that do not change state in order to simplify what we need to understand about the program in order to write it correctly (it also makes it easier to partition it during execution, helpful when trying to increase parallelism).
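
A tiny Python illustration of the restriction, with invented names (RunningTotal, add).  The first version changes state, so what a call returns depends on every call made before it.  The second does not, so a call can be understood from its arguments alone.

```python
class RunningTotal:
    """Stateful: each call to add changes the object."""

    def __init__(self):
        self.total = 0

    def add(self, x):
        """Add x to the stored total and return the new total."""
        self.total += x
        return self.total


def add(total, x):
    """State-free: return total plus x without changing anything."""
    return total + x
```

Two identical calls to add always give the same answer, which is also why calls like it can be run in any order or side by side.  Two identical calls to RunningTotal.add do not.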

Monday, January 11, 2010

History of Design Principles

Given that I am going to be writing blog entries on design principles I thought I would start by making some introductory remarks on the history of said principles.

The first thinking about computer programming focused on data structures and algorithms.  The approach to these was and is mostly mathematical.  There is no inherent attention paid to the implementation in this discipline.

Programming design principles derive almost universally from the work around structured programming.  Object Orientation is a development of linguistic structures and their application to package the design principles developed by the structured programming movement.

The opening salvo in the movement was Dijkstra's letter (retitled by Wirth), "Go To Statement Considered Harmful".  This led fairly directly to a programming style and languages that used if, for, while and similar constructs.  These principles are pretty much universally accepted now.

This sufficed at the lowest level of code construction, but further principles were required when considering design at a slightly higher level.  The question here is not what happens inside a subroutine (to use an early word) but how to decide what to do in a subroutine and how they should communicate.

The basis of this will be the subject of my next post.

Friday, January 8, 2010

Design Driven Design

So it's a redundant redundancy.

This is my theoretical push back against low level and mechanical Test Driven Design (TDD).

TDD says take a spec, write the test, have it fail, implement the spec, have the test pass, repeat.
When you come across a resulting bad design, refactor.

I think we can do a little better than this within the scope of a sprint.

Design Driven Design (DDD) says take the specs for a sprint, do a design, evaluate the design against design principles, redo if necessary, write the tests for the design, redo the design if you can't write the tests, then implement.

We do have design principles.  I described one in my last post about how a commenting style can expose bad design.  I will review more of them in coming posts.

Thursday, January 7, 2010

How Simple Commenting Rules Improve Code Structure

In my last post I described my rules for commenting code.

Here I am going to talk about how following those rules makes your code better.

The statement is slightly backwards. What I am really saying is that if you can legitimately describe your code using the commenting structure I describe, your code will be better than if you can't.

In particular: If you can accurately and completely describe a function with one simple statement (with no ifs, ands, or buts) that function is much more likely to belong to a well designed system than if you can't.

The bad news is that you have to design the system and write the comments before you can apply the test to see if the system is well designed. The good news is that this happens relatively early in the system development.

This is not the holy grail of a rule that will allow for the mechanical generation of high quality designs. It is an after the fact test that should indicate design quality, which is, I think, as good as we are going to get.

Wednesday, January 6, 2010

Comments on Comments

Jason Baker wrote a blog entry, "Myths about Comments". I'm going to make some comments here.

My standard on expected comments is this:
One comment at the start of every function describing what it does.
One comment per argument describing what it is.
One comment for anything that must be consistent with something else in another place (eg these outputs need to be in the same order as the inputs in foo).
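
As an illustration, here is roughly what that standard could look like on two small Python routines.  The routines themselves (header and format_row) and the report they produce are hypothetical.

```python
def header():
    # Return the header line for the report.
    # Must stay consistent with: the order of the values in format_row.
    return "name,size,owner"


def format_row(name, size, owner):
    # Return one report line for a single file.
    # name: the file name as a string.
    # size: the file size in bytes.
    # owner: the login name of the file's owner.
    # Must stay consistent with: the order of the columns in header.
    return "{},{},{}".format(name, size, owner)
```

Each routine has one comment saying what it does, one per argument, and one for the ordering that has to match the other routine.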