Quality Really IS Free

Scott Berkun recently brought up Philip Crosby’s classic Quality is Free on his PMClinic mailing list, in a discussion about how to “successfully argue for time for higher quality”. If argued on its merits, that should be an easy one: it really is cheaper to introduce activities that raise the quality of the software than it is to omit them.

Scott is right. Crosby’s work is definitely the place to start when you’re trying to get a bead on quality. In Quality is Free, Crosby defines quality as “conformance to requirements,” and this has become a standard definition of quality, not just in software engineering but across product engineering and manufacturing generally.

This is a slightly counterintuitive idea, but it’s not hard to understand why it makes sense to define quality that way. I usually explain it using a hypothetical “black box”. Say you’re a tester, and someone hands you a black box with a big red button on it and tells you to test it. Consider the following three scenarios:

  1. You press the button and nothing happens.
  2. You press the button, and the box plays a recording telling you that you pressed the button incorrectly.
  3. You press the button, and the box immediately heats up to precisely 472 degrees C, causing you to drop it. It breaks into a dozen pieces.

In each of these scenarios, does the box work? That depends on whether it’s a children’s toy or the heating element for an industrial oven. What this illustrates is that there’s no way to validate a product unless you know what it’s supposed to do. And that’s Crosby’s point — that if you don’t know the requirements of a product, you can’t judge its quality.
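Here is a minimal sketch of the same point in code. The temperature ranges are hypothetical requirements that I made up for illustration. Only the 472 degree reading comes from scenario 3 above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    """Hypothetical requirement: allowed temperature range after the button press."""
    product: str
    min_temp_c: float
    max_temp_c: float


def conforms(observed_temp_c: float, req: Requirement) -> bool:
    """Quality as conformance: an observation only passes or fails relative to a requirement."""
    return req.min_temp_c <= observed_temp_c <= req.max_temp_c


# Assumed, illustrative requirements for the two products mentioned in the text.
TOY = Requirement("children's toy", min_temp_c=10.0, max_temp_c=40.0)
OVEN_ELEMENT = Requirement("industrial oven heating element", min_temp_c=470.0, max_temp_c=475.0)

observed = 472.0  # scenario 3: the box heats up to 472 degrees C

for req in (TOY, OVEN_ELEMENT):
    verdict = "conforms" if conforms(observed, req) else "defect"
    print(f"{req.product}: {verdict}")
```

The observation is identical in both cases. Only the requirement changes, and the verdict changes with it.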

In the PMClinic thread, Faisal Jawdat had a good post summing up some of the ways that Crosby shows us how quality is free. It takes effort to do quality improvement, but it leads to a lower cost to the project overall. It’s a lot cheaper, by orders of magnitude, to fix a defect when it’s still on paper than it is to fix it after the code has been built. In practice, it’s not hard to show that quality activities like inspections, walkthroughs, deskchecks, code reviews, test-driven development, continuous integration, static code verification, etc., save more time than they cost. If you hold a code review that takes three people two hours, and if you’ve done a good job selecting the code to be reviewed, then you almost always find bugs which would have cost far more than six man-hours to fix. And you often catch at least one defect which would otherwise have shipped and been found by a user. So by introducing code reviews, you can reduce the total effort required for the project while increasing the quality of the code (by shipping it with fewer defects).
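The code review arithmetic can be sketched in a few lines. The three reviewers and two hours come from the example above. The defect count and the per-defect fix costs are assumptions that I invented for illustration, not measurements.

```python
def review_cost_hours(reviewers: int, hours_each: float) -> float:
    """Total effort spent in the review meeting."""
    return reviewers * hours_each


def net_saving_hours(
    review_cost: float,
    defects_found: int,
    late_fix_hours: float,
    early_fix_hours: float,
) -> float:
    """Effort avoided by fixing defects at review time, minus the cost of the review."""
    avoided = defects_found * (late_fix_hours - early_fix_hours)
    return avoided - review_cost


cost = review_cost_hours(reviewers=3, hours_each=2.0)  # six hours, as in the text

# Hypothetical inputs: 4 defects found, each costing 5 hours to fix after
# release versus half an hour to fix during the review.
saving = net_saving_hours(cost, defects_found=4, late_fix_hours=5.0, early_fix_hours=0.5)

print(f"review cost: {cost:.1f} hours, net saving: {saving:.1f} hours")
```

With these made-up numbers the review comes out 12 hours ahead. The useful part is the break-even condition: the review pays for itself as soon as the avoided fix effort exceeds the six hours spent in the room.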

The other side of this idea was championed by W. Edwards Deming who, along with Crosby and Juran, is seen as one of the “grandfathers” of modern quality. One important idea that he wrote about in “Out of the Crisis” is that the requirements themselves also have their own notion of quality. High-quality requirements address the needs of the people who will use the product (the users) and the people who need it (the stakeholders). He spends a lot of time talking about how getting the people who build the product to understand what it’s needed for is the best way to get them to innovate in the right direction. He has some great examples, drawn from industrial engineering and automotive manufacturing, that taught me a lot about fostering communication between the people who need the software and the people who build it. (Like Crosby, Deming doesn’t write specifically about software. Their ideas still apply because modern software engineering has its roots, via Watts Humphrey, in both Deming and Crosby.)

All of this adds up to the idea that understanding the needs of the users and stakeholders, and then agreeing upon the behavior of the software and verifying that the behavior will actually satisfy the needs, will lead to higher quality software and a more efficient team.

Which brings me to a classic counterargument (brought up by another PMClinic poster) which is often used to shoot down quality ideas: “The perfect is the enemy of the good.”

This is certainly true, and if you have a QA team that’s trying to get the software “perfect”, then they aren’t doing their jobs. But I’ve never met an experienced software tester who thought his job was to make the code perfect.

I have, however, seen many senior managers try to remove quality activities from the schedule, justifying it by saying that the software doesn’t have to be perfect. Ironically, I often hear the same people blaming the QA team for problems that they didn’t catch. It seems that there’s a double standard here: when testers want to test software, they are guilty of trying to reach perfection, but when they miss defects they’re guilty of not being perfect. (I’m not doing a great job of explaining this; Jenny could probably do it better.)

That’s why Crosby’s definition of quality is so important. It makes quality very specific: the more closely the product conforms to its requirements, the higher its quality. It also makes the job of the testers very specific: to validate the software against the requirements. It also makes it clear why the verification and defect prevention activities (reviews and inspections) are so important — they reduce the number of defects in the requirements, before those defects can be reflected in the software. With these ideas in mind, it’s easy to challenge the senior manager who wants to cut down the testing schedule as “too perfectionist”. Whenever I’ve been in that situation, I’ve simply said, “Just tell me which requirements you don’t need to work, and we’ll cut out the tests for them. And in the future, we’ll do this earlier so that we don’t build those features in the first place!”

Iterative Development and the Efficiency Gap

Jenny and I were talking yesterday about short, time-boxed releases. Breaking a project into short, frequently delivered releases is a technique which has been gaining in popularity lately. Agile methodologies like Scrum and XP rely on them, but they can be found as phased releases in traditional development shops as well. It’s clear why they’re popular with clients: working software is delivered frequently, and that is an intuitive and satisfying measure of progress. And there are definite advantages to iteration from the development perspective. It’s easier to respond to changes in scope and requirements. Project planning is easier as well: instead of estimating how long the whole project will take, the team can select a subset of the scope which will fit in each iteration. And, of course, iteration planning works very well with other techniques like continuous integration, refactoring and test-driven development, all of which can be explicitly planned for within the release cycle.

But is there a cost?

Jenny pointed out to me that there is one, and I think she’s right. Even when the feature set of the project can be neatly broken down (and not every project can be broken down into time-boxed releases), each release adds overhead. Each release must be planned. The team needs to package and deploy the software. Each release needs to be wrapped up and the team needs to get trained up on the next one. And while the development team may be able to handle that relatively seamlessly, users tend to move more slowly and be less adaptive or responsive to change.

The problem is compounded if there is a software testing component. Any time the development team touches existing code, the testers need to run regression tests to make sure the current behavior is intact. And just because the iterations are relatively easy on the developers, it doesn’t mean that the testers will also be able to finish quickly. If there are small changes to many parts of the software, it can require a large testing effort. This is why iteration can mean a great deal of rework for the test team.

Any time a project is broken into a set of time-boxed release cycles, there’s going to be an “efficiency gap”: the difference between the effort required to develop the project all at once, and the larger effort required to build and deliver the software in small, iterative phases. The question of whether or not to apply iterative development to a particular project can then come down to whether the cost of iteration or phase planning is offset by the gain in flexibility and stakeholder perception.

So can this efficiency gap be measured? It’s not possible to measure the actual gap directly, because the same project is never built both ways, so it would be necessary to come up with a proxy for it. One way might be to look for regression defects: keep track of the effort or lines of code required to repair defects discovered by unit tests or functional tests which passed for previous releases. Another way would be to compare the number of lines of code added to the number of lines modified or deleted: the higher this ratio, the less overlap between releases. These metrics could be useful for determining exactly how long a phase should be: if it seems like the iterations are inefficient, it might make sense to add more scope and time to each iteration. If this improves efficiency, then it may be possible to keep modifying the iteration size until the team finds a good “sweet spot”.
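Here is a rough sketch of how those two proxies might be tracked per release. The `ReleaseStats` record, its field names and all of the numbers are hypothetical. In practice the line counts would come from your version control system and the regression effort from your defect tracker.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReleaseStats:
    """Hypothetical per-release record; every value below is made up for illustration."""
    name: str
    lines_added: int
    lines_modified: int
    lines_deleted: int
    regression_fix_hours: float  # effort repairing defects caught by tests that passed in earlier releases


def added_to_changed_ratio(stats: ReleaseStats) -> float:
    """Lines added divided by lines modified or deleted; higher means less overlap with earlier releases."""
    changed = stats.lines_modified + stats.lines_deleted
    if changed == 0:
        return float("inf")  # nothing reworked at all
    return stats.lines_added / changed


releases = [
    ReleaseStats("R1", lines_added=1200, lines_modified=300, lines_deleted=100, regression_fix_hours=4.0),
    ReleaseStats("R2", lines_added=400, lines_modified=600, lines_deleted=200, regression_fix_hours=16.0),
]

for r in releases:
    print(f"{r.name}: added/changed = {added_to_changed_ratio(r):.2f}, "
          f"regression fix effort = {r.regression_fix_hours:.1f} hours")
```

A falling ratio together with rising regression effort, as in the invented R2 numbers, would hint that releases are overlapping heavily and that a longer iteration might be worth trying. Neither number measures the efficiency gap itself. They are only proxies for it.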