Working Effectively with Legacy Code


The books on object-oriented programming written or edited by one of the signatories of the Agile Manifesto (Robert Martin, Martin Fowler or Kent Beck, to name a few) tend to have some common characteristics. They use Smalltalk or Java as the language for the code samples, pack a lot of wisdom and interesting examples, and unfortunately tend to be long. This book, from the Robert C. Martin series, discusses cases of Java and C++ code bases that are painful to work on, and how to improve them. At 500 pages, it is no lightweight, and to be perfectly honest, many pages could have been left out or banished to an appendix because they concern details of how C++ compilers work.

The author defines legacy code simply as code without tests. There is rather convincing reasoning behind this seemingly oversimplified definition, and it goes like this. Changing an existing code base is all about feedback. You understand as much of the code as you need, change a few lines, and then verify your change. How this verification happens, the feedback, is what differentiates code bases. If you get your feedback only once the diff is in production and the customers are screaming, you will be developing under a lot of pressure, leading to anxiety and thus to suboptimal code, written slowly. You will refrain from experimenting or refactoring, because preserving existing functionality is of utmost importance. If there is dedicated manual testing, there is at least some safety net before a change reaches the customer, but the feedback still comes with too much delay to develop with confidence. The ideal condition is one where we can write code, verify that it is not breaking any existing functionality, and also put in place the feedback mechanism that verifies our addition. The only feedback mechanism that can provide this is unit testing. If a code base has unit tests, we can go ahead and improve it by refactoring. If there are none, however, the first task in improving it is writing these tests.

What exactly are unit tests? This is one of those things in software development that everyone claims to know the correct definition of, but very few people practice by the book. As the name implies, unit tests put the smallest unit in a language (i.e. functions for procedural languages, classes for OOP languages) in a “software vise”. They hold fixed the behavior we are focusing on, and give us quick feedback on whether we are on the right path. For this purpose, they should execute very fast and be robust, i.e. fail only when the feature fails. Michael Feathers, together with many others, posits that unit tests must not depend on factors external to the code, such as databases or network connections. I definitely agree with him, but it has become commonplace these days to apply the term to tests that make requests to an application and then check the resulting database state, which is in fact incorrect usage.
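To make the distinction concrete, here is a minimal Python sketch of a test that stays within these limits. It is not taken from the book, and all names (`InMemoryPriceStore`, `order_total`) are hypothetical. The function under test receives its price lookup as an argument, so the test can hand it an in-memory stand-in where production code would presumably pass a database-backed object.

```python
class InMemoryPriceStore:
    """Hypothetical stand-in for a database-backed price lookup."""

    def __init__(self, prices):
        self._prices = dict(prices)

    def price_of(self, item):
        return self._prices[item]


def order_total(items, store):
    """Sum the prices of the items, looking each one up in the store."""
    return sum(store.price_of(item) for item in items)


def test_sums_prices_of_all_items():
    store = InMemoryPriceStore({"apple": 3, "pear": 5})
    assert order_total(["apple", "pear", "apple"], store) == 11


def test_empty_order_costs_nothing():
    assert order_total([], InMemoryPriceStore({})) == 0


if __name__ == "__main__":
    # Plain calls so the file runs without a test runner; pytest would
    # also collect the test_* functions by name.
    test_sums_prices_of_all_items()
    test_empty_order_costs_nothing()
```

Nothing here touches a disk or a socket, so the tests run in a fraction of a second and can only fail if `order_total` itself is wrong.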

With this role attributed to unit tests, and the definition of legacy code as code lacking them, it is obvious what one should strive for when working on a legacy code base: identify what to change, put the relevant parts of the code base into a vise with tests, and proceed to make changes with confidence. The rest of the book is concerned with various techniques to accomplish each of these steps, demonstrated in concrete project scenarios such as “It takes forever to make a change”, or “I can’t run this method in a test harness”. The book is directed mainly at programmers working in Java and C++, the most popular object-oriented languages, and the case studies are tailored to the difficulties faced by developers working on large projects in these languages.
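As a rough illustration of these three steps, consider the following Python sketch. It is not an example from the book: `legacy_shipping_cost`, the pinned cases and the refactored `shipping_cost` are all made up for this purpose. The idea is to record what the existing code returns today, and then to hold any rewrite to exactly the same results.

```python
def legacy_shipping_cost(weight_kg, express):
    """Hypothetical legacy function: tangled and untested, but in production."""
    cost = 0
    if weight_kg > 0:
        if weight_kg <= 5:
            cost = 4
        else:
            cost = 4 + (weight_kg - 5) * 2
        if express:
            cost = cost * 2
    return cost


# The vise: inputs paired with the results the legacy code produces today.
PINNED_CASES = [
    ((0, False), 0),
    ((3, False), 4),
    ((5, True), 8),
    ((8, False), 10),
    ((8, True), 20),
]


def check_pinned_behavior(func):
    """Fail if func deviates from the recorded behavior on any pinned case."""
    for args, expected in PINNED_CASES:
        assert func(*args) == expected, (args, expected)


def shipping_cost(weight_kg, express):
    """Refactored version, only acceptable while the pinned cases still pass."""
    if weight_kg <= 0:
        return 0
    base = 4 + max(0, weight_kg - 5) * 2
    return base * 2 if express else base


if __name__ == "__main__":
    check_pinned_behavior(legacy_shipping_cost)  # the vise fits the old code
    check_pinned_behavior(shipping_cost)         # and the rewrite did not slip
```

The pinned cases make no claim that the old behavior is right. They only guarantee that a change in behavior is a deliberate decision and not an accident.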

The concrete obstacle to applying the above-mentioned isolate-test-modify method in most legacy systems is that complicated dependencies make it difficult to instantiate classes individually, or to verify the effects of calling their methods. The techniques presented to solve this dependency problem fall into two rough categories: platform-level and language-level. Platform-level methods concern the specifics of how the language platforms function, making use of some of their features to circumvent the limits to testing that the platforms introduce in other places. Some such methods discussed are the preprocessor in C++ and the classpath in Java. Both can be used to swap in different behavior at build or link time and thereby break dependencies. Language-level techniques, on the other hand, concern the use of various language features, especially object-oriented ones, to modify behavior and ease testing. Some examples are mocks, where a common interface is implemented both by the functional class and by a verification class, or subclassing and overriding effectful methods to circumvent the effects (a technique the author calls Subclass and Override Method). There are too many techniques of both kinds, especially of the second, to list here, which makes the book well worth detailed study by programmers in the targeted languages. As a Python developer I felt a bit lost at times and skipped pages, but the detailed discussions of object-oriented techniques for testing still had some interesting surprises for me.
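The subclass-and-override approach mentioned above translates readily to Python. The sketch below is my own minimal illustration and not code from the book; `InvoiceSender`, `_deliver` and `TestableInvoiceSender` are hypothetical names. The testing subclass replaces only the method that has an external effect and records the calls, so the remaining logic runs unchanged under test.

```python
class InvoiceSender:
    """Hypothetical legacy class: formatting logic mixed with a side effect."""

    def send(self, customer, amount):
        body = f"Dear {customer}, you owe {amount:.2f} EUR."
        self._deliver(customer, body)
        return body

    def _deliver(self, customer, body):
        # Stands in for real mail or network code we do not want in a test.
        raise RuntimeError("would talk to a mail server")


class TestableInvoiceSender(InvoiceSender):
    """Testing subclass: overrides only the effectful method, records calls."""

    def __init__(self):
        super().__init__()
        self.delivered = []

    def _deliver(self, customer, body):
        self.delivered.append((customer, body))


def test_send_formats_and_delivers_once():
    sender = TestableInvoiceSender()
    sender.send("Ada", 12.5)
    assert sender.delivered == [("Ada", "Dear Ada, you owe 12.50 EUR.")]


if __name__ == "__main__":
    test_send_formats_and_delivers_once()
```

The recorded list also doubles as a crude mock: the test verifies not just the formatted text but also that delivery was requested exactly once.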

This book being about legacy code, there are many examples of bad code that is difficult to disentangle and test, coupled with concrete techniques to do so, and illustrations of object-oriented principles. A very big class, for example, is discussed not only in terms of how to test it and break it down, but also in terms of why it is a bad idea to build such a class in the first place, and how to avoid doing so in new code. The same example is used to illustrate the single responsibility principle and the interface segregation principle. In another discussion, the importance of encapsulation is demonstrated by showing how properly respecting it can confine effects within a class, making it much easier to understand and to test. Although these topics are frequently tied to Java or C++ code, they are the most rewarding parts of the book, and worth studying in full.

One frequent topic the author has to grudgingly go into is all the silly features of Java and C++ that make testing difficult or impossible. Among these, all the different ways to control subclassing definitely take the cake. For example, I was astonished to find out that in C#, one can prevent a class from getting instantiated or subclassed. This is obviously a huge roadblock to testing classes that take such a class as a constructor argument, or methods that need such an object as an argument. Another weird example is the combination of const and mutable in C++, which allows a const method to modify a member that is declared mutable. Why the const keyword even exists is something only the language designers can explain. Reading through these examples, one comes to appreciate permissive languages such as Python, which take a more convention-oriented approach.

Another aspect of the book is the way discussions of developer and team psychology are interleaved with the technical topics. This is a neglected topic in most programming books, but Feathers, thanks to his experience with consulting engagements, knows how valuable it is, and scatters discussions on the importance of various decisions and methods for developer sanity throughout the text. The most prevalent one is why fast unit tests are a great boon for programmer productivity and well-being. Tests written TDD-style allow the developer to focus on one thing at a time, get fast feedback, and change code with confidence. The alternative to having tests is anxiety, and having to read tons of irrelevant code and concentrate on trivial things just so that nothing breaks. There are other brilliant insights, such as how ugly code convinces you that things will always be this ugly, or how big chunks of procedural code call for more of the same. One that I found really interesting is his point that programming books do not contain really ugly code, because if they did, no one would buy them.

This book is a great addition to the library of every developer working in OOP languages, especially those who feel the pain of having to maintain legacy code they themselves did not write.