Review of "Clean Code" Second Edition

TL;DR: Clean Code’s second edition is pretty much the same: the same tiny-functions style, the same old OOP habit of mixing data and behavior into choppy class designs, and the same blind spots.

I wrote a critique of Clean Code's first edition and was curious how much of it would become obsolete. After reading the second edition, the answer is: "not much, if anything".

It is a different book in some ways:

  • The first edition was focused, concise and highly opinionated. You could disagree with 80% of it and still appreciate it as a clear, coherent message about one particular style of programming.
  • The second edition is less focused. It’s an amalgamation of Martin’s work - a collage of Clean Code, The Clean Coder, Clean Architecture, We Programmers, and SOLID material.

It’s also a noticeably thicker book - full of digressions, rants, and good old rambling.

But the code part has not really changed. This review focuses on "Part 1: Code".

...

Here's a good example of the new format:
In Chapter 10 - "One Thing" - he addresses critiques of his style of tiny functions:

Over the years, this particular recommendation, from the first edition of this book, has been the most controversial. But why? There are a lot of reasons.
  1. People are afraid that if they follow this rule, they will drown beneath a sea of tiny little functions.

  2. People are afraid that following this rule will obscure the intent of a function because they won't be able to see it all in one place.

  3. People worry that the performance penalty of calling all those tiny little functions will cause their applications to run like molasses in the wintertime.

  4. People think they're going to have to be bouncing around, from function to function, in order to understand anything.

  5. People are concerned that the resulting functions will be too entangled to understand in isolation.

Let's address these one at a time.

  1. Drowning

Long ago our languages were very primitive. The first version of Fortran, for example, allowed you to write one function. And it wasn't called a function. It was just called a program. There were no modules, no functions, and no subroutines. As I said, it was primitive.

COBOL wasn't much better. You could jump around from one paragraph to the next using a goto statement; but the idea of functions that you could call with arguments was still a decade in the future.

When Algol, and later, C, came along, the idea of functions burst upon us like a wave. Suddenly we could create modules that could be individually compiled. Modular programming became all the rage.

But as the years went by, and our programs got larger and larger, we began to realize that functions weren't enough. Our names started to collide. We were having trouble keeping them unique. And we really didn't like the idea of prefixing every name with the name of its module. I mean, nobody likes io_files_open as a function name.

So we added classes and namespaces to our languages. These gave us lovely little cubbyholes that had names into which we could place our functions and our variables. Moreover, these namespaces and classes quickly became hierarchical. We could have functions within namespaces that were within yet other namespaces.

He keeps going, but you can see that:

  • Points 1, 2, and 4 are essentially the same concern.
  • Folksy phrasing like “run like molasses in the wintertime” doesn’t clarify anything.
  • Instead of addressing the core critique, he detours into a thin language-history lesson.

This is the style of the second edition.

What has somewhat improved is the Java code: most examples in the first edition were abysmally bad, and while the second edition is better, it still leaves much to be desired.

The second edition has examples in other languages. But Martin writes Python and Go as if they were Java with disclaimers that he doesn’t really know them:

  • "It’s been a long time since I’ve written any Python, so bear with me."
  • "I apologize - I am not an accomplished Golang programmer"

Why does he use them? I guess to prove that the ideas aren’t Java-specific and apply across languages. And they might be - if you write Java in Python. Do they apply if you write Python in Python? That’s left as an exercise for the reader.

Over the years Martin has definitely heard the critique of his original book, and he has participated in debates around it (with Casey Muratori about performance, and with John Ousterhout about style and design).

I don't believe Martin internalized the critique or revised any of his recommendations. There is a tonal shift here and there towards being more accepting of alternative views ("but you may disagree. That's OK. We are allowed to disagree. There is no single standard for what cleanliness is. This is mine, and mine may not be yours."), but I haven't noticed any changes in the way he writes code.

Chapter 2 - Clean that code!

This chapter is a replacement for "Chapter 14: Successive Refinement" from the first edition.

This is a good illustration that it's the same "Clean Code"(tm) style from 2007 - nothing has really changed.

As a reminder:

  1. Avoid comments. Over-use "description as a name" instead
  2. Extract a lot of tiny functions that are used only once
  3. Move parameters to class state and pass them implicitly from everywhere to everywhere

This is how he cleans the code in 2024. The original version comes first, followed by the cleaned-up version (with my notes inline):

package fromRoman;

import java.util.Arrays;

public class FromRoman {
  public static int convert(String roman) {
    if (roman.contains("VIV") ||
      roman.contains("IVI") ||
      roman.contains("IXI") ||
      roman.contains("LXL") ||
      roman.contains("XLX") ||
      roman.contains("XCX") ||
      roman.contains("DCD") ||
      roman.contains("CDC") ||
      roman.contains("MCM")) {
      throw new InvalidRomanNumeralException(roman);
    }
    roman = roman.replace("IV", "4");
    roman = roman.replace("IX", "9");
    roman = roman.replace("XL", "F");
    roman = roman.replace("XC", "N");
    roman = roman.replace("CD", "G");
    roman = roman.replace("CM", "O");
    if (roman.contains("IIII") ||
      roman.contains("VV") ||
      roman.contains("XXXX") ||
      roman.contains("LL") ||
      roman.contains("CCCC") ||
      roman.contains("DD") ||
      roman.contains("MMMM")) {
      throw new InvalidRomanNumeralException(roman);
    }

    int[] numbers = new int[roman.length()];
    int i = 0;
    for (char digit : roman.toCharArray()) {
      switch (digit) {
        case 'I' -> numbers[i] = 1;
        case 'V' -> numbers[i] = 5;
        case 'X' -> numbers[i] = 10;
        case 'L' -> numbers[i] = 50;
        case 'C' -> numbers[i] = 100;
        case 'D' -> numbers[i] = 500;
        case 'M' -> numbers[i] = 1000;
        case '4' -> numbers[i] = 4;
        case '9' -> numbers[i] = 9;
        case 'F' -> numbers[i] = 40;
        case 'N' -> numbers[i] = 90;
        case 'G' -> numbers[i] = 400;
        case 'O' -> numbers[i] = 900;
        default -> throw new InvalidRomanNumeralException(roman);
      }
      i++;
    }
    int lastDigit = 1000;
    for (int number : numbers) {
      if (number > lastDigit) {
        throw new InvalidRomanNumeralException(roman);
      }
      lastDigit = number;
    }
    return Arrays.stream(numbers).sum();
  }

  public static
  class InvalidRomanNumeralException extends RuntimeException {
    public InvalidRomanNumeralException(String roman) {
    }
  }
}
package fromRoman;

import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class FromRoman {
  private String roman;
  private List<Integer> numbers = new ArrayList<>();
  private int charIx;
  private char nextChar;
  private Integer nextValue;
  private Integer value;
  private int nchars;
  // note: this is an unfortunate mix of input data, intermediate
  // computation state, and output data. What's surprising is that Martin
  // wants method bodies to be clean, but doesn't mind a mess in the
  // object's state or class namespace

  Map<Character, Integer> values = Map.of(  // note: this should be a constant, i.e. "private static final"
    'I', 1, 'V', 5, 'X', 10, 'L', 50,
    'C', 100, 'D', 500, 'M', 1000);

  public FromRoman(String roman) {  // note: this should be private, if we want `convert` to be the entry point
    this.roman = roman;
  }

  public static int convert(String roman) {
    return new FromRoman(roman).doConversion();
  }

  private int doConversion() {
    checkInitialSyntax();
    convertLettersToNumbers();
    checkNumbersInDecreasingOrder();
    return numbers.stream().reduce(0, Integer::sum);
  }

  private void checkInitialSyntax() {
    checkForIllegalPrefixCombinations();
    checkForImproperRepetitions();
  }

  private void checkForIllegalPrefixCombinations() {
    checkForIllegalPatterns(  // note: useless allocations. these should be extracted into constants
      new String[]{"VIV", "IVI", "IXI", "IXV", "LXL", "XLX",
                   "XCX", "XCL", "DCD", "CDC", "CMC", "CMD"});
  }

  private void checkForImproperRepetitions() {  // note: there is no need to split validation into 3 methods. this just pollutes the namespace
    checkForIllegalPatterns(  // note: useless allocations. these should be extracted into constants
      new String[]{"IIII", "VV", "XXXX", "LL", "CCCC", "DD", "MMMM"});
  }

  private void checkForIllegalPatterns(String[] patterns) {
    for (String badString : patterns)
      if (roman.contains(badString))
        throw new InvalidRomanNumeralException(roman);
  }

  private void convertLettersToNumbers() {
    char[] chars = roman.toCharArray();
    nchars = chars.length;
    for (charIx = 0; charIx < nchars; charIx++) {
      nextChar = isLastChar() ? 0 : chars[charIx + 1];
      nextValue = values.get(nextChar);
      char thisChar = chars[charIx];
      value = values.get(thisChar);
      switch (thisChar) {
        case 'I' -> addValueConsideringPrefix('V', 'X');
        case 'X' -> addValueConsideringPrefix('L', 'C');
        case 'C' -> addValueConsideringPrefix('D', 'M');
        case 'V', 'L', 'D', 'M' -> numbers.add(value);
        default -> throw new InvalidRomanNumeralException(roman);
      }
    }
  }

  private boolean isLastChar() {
    return charIx + 1 == nchars;
  }

  private void addValueConsideringPrefix(char p1, char p2) {
    if (nextChar == p1 || nextChar == p2) {
      numbers.add(nextValue - value);
      charIx++;
    } else
      numbers.add(value);
  }

  private void checkNumbersInDecreasingOrder() {
    for (int i = 0; i < numbers.size() - 1; i++)
      if (numbers.get(i) < numbers.get(i + 1))
        throw new InvalidRomanNumeralException(roman);
  }

  public static
  class InvalidRomanNumeralException extends RuntimeException {
    public InvalidRomanNumeralException(String roman) {
      super("Invalid Roman numeral: " + roman);
    }
  }
}

Let's be clear: this is a stylistic choice. It doesn't improve any objective metric.
As a style, it's very recognizable: when you see long-named functions that have 0-1 parameters and very short bodies - you immediately think "Oh. That's Uncle Bob's Clean Code". From an artistic standpoint, creating a distinctive, instantly recognizable style is an achievement. I bet you know a colleague or two who uses it. But code isn't art. The question is whether the style helps you build and maintain software.

Putting the style aside for a moment, you'll notice that the "improved" code is surprisingly sloppy:

  1. Confusing public API: class FromRoman has a single public entry point - the static method convert - which means the constructor should be private.

If we are aiming for code to be new-reader friendly, the proper implementation would be:

// internal constructor. use static method 'convert' instead
private FromRoman(String roman) {
   this.roman = roman;
}
  2. Unnecessary allocations: the immutable values map (Map<Character, Integer> values) should be "private static final" to avoid re-creating the same reference data on every conversion. The validation patterns are allocated on each call too.

Here's how it should be done instead:

private static final String[] INVALID_PATTERNS = {
  // Illegal prefix combinations
  "VIV", "IVI", "IXI", "IXV", "LXL", "XLX",
  "XCX", "XCL", "DCD", "CDC", "CMC", "CMD",
  // Improper repetitions
  "IIII", "VV", "XXXX", "LL", "CCCC", "DD", "MMMM"
};

private static void validate(String roman) {
  for (String pattern : INVALID_PATTERNS) {
    if (roman.contains(pattern)) {
      throw new InvalidRomanNumeralException(roman);
    }
  }
}

This eliminates the useless allocations and removes the pointless checkForIllegalPrefixCombinations, checkForImproperRepetitions, and checkForIllegalPatterns helpers.

  1. It's a smaller chunk of code that does exactly the same thing
  2. This code is less work for the compiler to compile
  3. It will perform better at runtime
  4. It would generate less garbage and have less memory pressure

All for free, just from using a different style.
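
To make the comparison concrete, here is a sketch of the whole conversion in that flatter style - the same algorithm as the "cleaned" version, but with constants extracted and no instance state. The class name is hypothetical, and I use IllegalArgumentException for brevity instead of the book's InvalidRomanNumeralException:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class FlatRoman {
  private static final String[] INVALID_PATTERNS = {
      // illegal prefix combinations
      "VIV", "IVI", "IXI", "IXV", "LXL", "XLX",
      "XCX", "XCL", "DCD", "CDC", "CMC", "CMD",
      // improper repetitions
      "IIII", "VV", "XXXX", "LL", "CCCC", "DD", "MMMM"
  };

  private static final Map<Character, Integer> VALUES = Map.of(
      'I', 1, 'V', 5, 'X', 10, 'L', 50, 'C', 100, 'D', 500, 'M', 1000);

  static int convert(String roman) {
    for (String bad : INVALID_PATTERNS)
      if (roman.contains(bad))
        throw new IllegalArgumentException(roman);

    List<Integer> numbers = new ArrayList<>();
    char[] chars = roman.toCharArray();
    for (int i = 0; i < chars.length; i++) {
      Integer value = VALUES.get(chars[i]);
      if (value == null)
        throw new IllegalArgumentException(roman);
      Integer next = i + 1 < chars.length ? VALUES.get(chars[i + 1]) : null;
      // subtractive pairs: I before V/X, X before L/C, C before D/M
      boolean subtractive = next != null
          && (value == 1 || value == 10 || value == 100)
          && (next == 5 * value || next == 10 * value);
      if (subtractive) {
        numbers.add(next - value);
        i++;  // consume the prefix pair
      } else {
        numbers.add(value);
      }
    }

    // the resulting values must be in decreasing order
    for (int i = 0; i < numbers.size() - 1; i++)
      if (numbers.get(i) < numbers.get(i + 1))
        throw new IllegalArgumentException(roman);

    return numbers.stream().mapToInt(Integer::intValue).sum();
  }
}
```

One static method, one pass of validation, no fields - and nothing to scroll back to.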

And let's talk style. Arguably private static void validate(String roman) is an easy-to-understand signature. You don't need to jump across the code to guess what it is doing. And the implementation utilizes different language constructs to express the intent:

  • Inputs for the processing - as method arguments
  • Static immutable reference data - as constants
  • Code explanation - as comments

In contrast, Clean Code advocates using method names for code explanation and class fields for passing parameters and results. This feels like a misuse of the instrument.
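
The difference is easy to see in miniature. A contrived sketch (hypothetical names) of the two ways of moving data around:

```java
// Clean Code style: data travels through instance fields,
// so the reader must track hidden state across methods
class FieldStyle {
  private int input;
  private int result;

  int doubled(int n) {
    this.input = n;     // parameter smuggled into state
    computeResult();    // result comes back via a field
    return result;
  }

  private void computeResult() {
    result = input * 2;
  }
}

// Plain style: data travels through the signature,
// so the data flow is explicit
class ParamStyle {
  static int doubled(int n) {
    return n * 2;
  }
}
```

Both compute the same thing; only one of them tells you so from the signature alone.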

Now try to work out the purpose and relationships between these methods from signatures only:

  • private void checkForIllegalPrefixCombinations()
  • private void checkInitialSyntax()
  • private void checkForIllegalPatterns(String[] patterns)
  • private void checkForImproperRepetitions()

Fun!

P.S. Sometimes it's worth encoding documentation into the name to make things more visible (for example, runTaskDangerouslySkipPermissions), but this should not be a go-to move. This is a special case situation that requires justification.

...

This is the moment that made me chuckle:

Future Bob: Two months later I'm torn. The first version, ugly as it was, was not as chopped up as this one. It's true that the names and the ordering of the extracted functions read like a story and are a big help in understanding the intent; but there were several times that I had to scroll back up to the top to assure myself about the types of instance variables. I found the choppiness, and the scrolling, to be annoying.

YES! Everyone finds this annoying! Lean into this feeling! See the light, Robert! It's not too late.

But then he falls into justifications:

However, and this is critical, I am reading this cleaned code after having first read the ugly version and having gone through the work of understanding it. So now, as I read this version, I am annoyed because I already understand it and find the chopped-up functions and the instance variables redundant.

This annoyance is an issue that John Ousterhout and I have debated. When you understand an algorithm, the artifacts intended to help you understand it become annoying. Worse, if you understand an algorithm, the names or comments you write to help others will be biased by that understanding and may not help the reader as much as you think they will

He is mixing two points here:

  • Stroustrup's rule - beginners want as much context as possible, experts want code to be terse and on-point
  • You might be bad at explaining your ideas in code

Comments solve both problems: experts can skip them, beginners can rely on them. Bad comments are less risky than bad code - if you're bad at explaining your code in comments, people can ignore them. If you're bad at explaining your thoughts in code, everyone is stuck with it.

He casually pushes comments under the bus: "the names or comments you write to help others will be biased by that understanding and may not help the reader as much as you think they will". But his code had none so far.

This section was supposed to address the main critique from John Ousterhout: code style and design.

Then he moves to performance (Casey Muratori is not mentioned once in the book, which is a shame):

You might complain that the creation of an object is overkill and expensive. It's true that allocating memory, and then garbage- collecting it later, takes extra time. There is also time spent in the calling of all those small methods. If you are writing embedded real-time code with very tight timing constraints, or if you are writing code for a high-performance flight simulator or first-person shooter game, then this decision could seem insane, not to mention unclean. But for my purposes, the extra time is the least of my concerns. I don't need the speed. For my situation, I think trading speed for clarity, within reason, is a good trade to make.

It might sound repetitive, but the trade-off comes not from cleanliness or clarity - it comes specifically from his style:

  • Function extraction to keep them small (and to avoid comments)
  • Moving parameters to object fields to minimize the number of parameters (to 1 or 0)

He understates the negative impact:

  • Compile time: more small functions and many fields forces the compiler to do more work
  • Runtime: all these object fields require additional memory space for references
  • Testing: in Java, for non-JITed code, passing data via instance fields is more expensive than using method parameters. Slow non-JITed code means slow tests.

If you don't agree that this style improves clarity (and you shouldn't - be like John Ousterhout), you get a double slap: less readable code that also has worse performance. This is what makes people so energized against "clean code"™.

You might be a functional programmer horrified that the functions are not "pure." But, in fact, the static convert function is as pure as a function can be. The others are just little helpers that operate within a single invocation of that overarching pure function. Those instance variables are very convenient for allowing the individual methods to communicate without having to resort to passing arguments. This shows that one good use for an object is to allow the helper functions that operate within the execution of a pure function to easily communicate through the instance variables of the object.

"Well, actually": a function that can throw an exception is not pure. A pure function must be deterministic and have no side effects. Throwing an exception is a side effect: it alters the control flow and can be observed from outside the function.
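
If you wanted the conversion to be pure in the strict sense, failure would have to be part of the return value rather than the control flow. A minimal sketch (hypothetical names, just a single Roman digit for brevity):

```java
import java.util.Map;
import java.util.Optional;

class PureDigits {
  private static final Map<Character, Integer> VALUES = Map.of(
      'I', 1, 'V', 5, 'X', 10, 'L', 50, 'C', 100, 'D', 500, 'M', 1000);

  // Throws on bad input: the exception is observable from the outside,
  // so this is not pure in the strict sense
  static int digitValue(char c) {
    Integer v = VALUES.get(c);
    if (v == null)
      throw new IllegalArgumentException("not a Roman digit: " + c);
    return v;
  }

  // Total and pure: failure is encoded in the return type
  static Optional<Integer> tryDigitValue(char c) {
    return Optional.ofNullable(VALUES.get(c));
  }
}
```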

"allowing the individual methods to communicate without having to resort to passing arguments" - this is such strange wording. Passing arguments and returning values is THE main mechanism for how functions communicate. There is no resorting here.


Tangent: Pure functions.

There is a whole section in Chapter 7: Clean Functions dedicated to Robert Martin refusing to embrace the concept of pure functions.

Purity is an observed quality, not an intrinsic quality. Therefore, whether a function is pure or not depends upon what is being observed.

Consider the C library function fopen(char* name, char* mode) This function has the side effect of leaving the named file open. This means that fopen is not pure, because there are functions, such as fgetc and fputc, that will not behave properly unless fopen is called first. Indeed, if you call fopen with the same arguments twice in a row without an intervening call to fclose, the second invocation will likely fail because the file is already open. So fopen is not pure, because of the side effect that leaves the system in an altered state.

However, we can make the fopen function appear to be pure, at least to certain observers, by ensconcing it in another function that hides the side effect.

void openAndDo(char* fileName, void (*doTo)(FILE*)) {
  FILE* f = fopen(fileName, "r+");
  doTo(f);
  fclose(f);
}

I believe he is trying to address criticism of his misunderstanding of pure functions and side effects by doubling down on that misunderstanding.

Indeed, if you call fopen with the same arguments twice in a row without an intervening call to fclose, the second invocation will likely fail because the file is already open

On some operating systems it will fail, but on others it won’t. Instead, you may just leak file descriptors. The side effect of fopen is more complex than “leaves the file open”:

  • It mutates global system state.
  • It depends on external system resources being available.
  • It might fail based on external conditions.

In other words, it has an implicit dependency on the environment.

at least to certain observers

For any observer:

  • fopen can fail, doTo can fail, and fclose can fail. Any of these may leave the system in a different state.
  • fopen and fclose both depend on the environment: the file must exist, the filesystem must be available, there must be free file descriptors, etc.

The whole point of pure functions is that they depend only on their input parameters and always produce the same result for the same input. This makes them:

  • Context-independent – they can be used anywhere you have the required input data, and they compose well with other parts of the system.
  • Easier to reason about – you only need to think about the function’s implementation, not hidden global or environmental state.
  • Easier to optimize – they are suitable for memoization/caching and compiler optimizations
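
The memoization point deserves a concrete illustration: caching is only safe because a pure function is guaranteed to return the same result for the same input. A generic sketch (hypothetical helper):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

class Memo {
  // Wraps a pure function with a cache. This is safe only because
  // purity guarantees a cached value equals a fresh computation.
  static <A, R> Function<A, R> memoize(Function<A, R> pure) {
    Map<A, R> cache = new ConcurrentHashMap<>();
    return arg -> cache.computeIfAbsent(arg, pure);
  }
}
```

Something like fopen cannot be memoized this way: two calls with the same arguments do not produce interchangeable results.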

I suspect this section exists mostly to defend his “0–1 parameters” style. But again, instead of clarifying important concepts (purity and side effects), Martin blurs them just enough to justify his preferred way of writing code.

This reframing of purity as an “observed quality” in a programming book in 2025 looks off.


So I think this is pretty clean. Am I right? I think I am; but you may disagree. That's OK. We are allowed to disagree. There is no single standard for what cleanliness is. This is mine, and mine may not be yours. All I'll ask of you is that you consider mine and develop your own standard. The most important thing is to have a standard that you apply with discipline.

In isolation this is an A-OK caveat. It makes the book feel less dogmatic, and given how controversial it is, that's a good tone shift. But the question is: who is Robert Martin speaking to? Junior people will just ignore this paragraph; they are looking for straight and simple guidance, and disclaimers like this don't negate the harmful influence. For senior people who are upset with the "clean code"(tm) style, this is an empty platitude. Everyone who disagrees with the "clean code"(tm) style already knows that they are allowed to disagree. That's not the point of the criticism.

Postscript: Future Bob Playing with Grok3

I fed my "clean" version of the above code into Grok3 and asked it to make improvements. After a bit of thought, it returned the following code, which compiled and which passed all my tests.

package fromRoman;

import java.util.Map;

public class FromRoman {
    private String roman;
    private static final String REGEX =
      "^M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})$";
    private static final Map<Character, Integer> VALUES = Map.of(
        'I', 1,
        'V', 5,
        'X', 10,
        'L', 50,
        'C', 100,
        'D', 500,
        'M', 1000
    );

    /** Constructs a FromRoman instance,
        validating the input Roman numeral. */
    public FromRoman(String roman) {
        if (!roman.matches(REGEX)) {
            throw new InvalidRomanNumeralException(roman);
        }
        this.roman = roman;
    }

    /** Converts the Roman numeral to an integer. */
    public int toInt() {
        int sum = 0;
        for (int i = 0; i < roman.length(); i++) {
            int currentValue = VALUES.get(roman.charAt(i));
            if (i < roman.length() - 1) {
                int nextValue = VALUES.get(roman.charAt(i + 1));
                if (currentValue < nextValue) {
                    sum -= currentValue;
                } else {
                    sum += currentValue;
                }
            } else {
                sum += currentValue;
            }
        }
        return sum;
    }

    /** Static method to convert a Roman numeral string to an integer. */
    public static int convert(String roman) {
        return new FromRoman(roman).toInt();
    }

    public static
    class InvalidRomanNumeralException extends RuntimeException {
        public InvalidRomanNumeralException(String roman) {
            super("Invalid Roman numeral: " + roman);
        }
    }
}

Grok3 did a nice job.

This is such a canonical solution to the Roman numeral problem that Grok almost certainly got it from the training data.

But what's interesting here is how you can bias an AI model into writing suboptimal code. Notice that Grok's solution retains the instance variable roman, which is completely unnecessary - a bias inherited from his "clean" version. If you just ask a model to create a brand-new "roman numerals to int" converter in Java, it comes up with a reasonable API:

public final class RomanNumerals {
   public static int toInt(String roman) {...}
}

Chapter 3: First principles

Chapter 3 is supposed to demonstrate how SOLID gives you designs that scale with new requirements. What it actually demonstrates is how you can get a complex code design that completely ignores the complexity of the domain.

The starting point is a simple rental statement for a conference center.

---Statement.java---
package ubConferenceCenter;

import java.util.ArrayList;
import java.util.List;

public class Statement {

  public enum CatalogItem {SMALL_ROOM, LARGE_ROOM,
                           PROJECTOR, COFFEE, COOKIES}

  public record RentalItem(CatalogItem type,
                           int days,
                           int unitPrice,
                           int price,
                           int tax) {
  }

  public record Totals(int subtotal, int tax) {
  }

  private String customerName;
  private int subtotal = 0;
  private int tax = 0;
  private List<RentalItem> items = new ArrayList<>();

  public Statement(String customerName) {
    this.customerName = customerName;
  }

  public void rent(CatalogItem item, int days) {
    int unitPrice = switch (item) {
      case SMALL_ROOM -> 100;
      case LARGE_ROOM -> 150;
      case PROJECTOR -> 50;
      case COFFEE -> 10;
      case COOKIES -> 15;
    };

    boolean eligibleForDiscount = switch (item) {
      case SMALL_ROOM, LARGE_ROOM -> days == 5;
      case PROJECTOR, COFFEE, COOKIES-> false;
    };

    int price = unitPrice * days;

    if (eligibleForDiscount) price = (int) Math.round(price * .9);

    subtotal += price;
    int thisTax = switch (item) {
      case SMALL_ROOM, LARGE_ROOM, PROJECTOR ->
           (int) Math.round(price * .05);
      case COFFEE,COOKIES -> 0;
    };
    tax += thisTax;
    items.add(new RentalItem(item, days, unitPrice, price, thisTax));
  }

  public RentalItem[] getItems() {
    List<RentalItem> items = new ArrayList<>(this.items);
    boolean largeRoomFiveDays = items.stream().anyMatch(
      item -> item.type() == CatalogItem.LARGE_ROOM && item.days() == 5);
    boolean coffeeFiveDays = items.stream().anyMatch(
      item -> item.type() == CatalogItem.COFFEE && item.days() == 5);
    if (largeRoomFiveDays && coffeeFiveDays)
      items.add(new RentalItem(CatalogItem.COOKIES, 5, 0, 0, 0));
    return items.toArray(new RentalItem[0]);
  }

  public String getCustomerName() {
    return customerName;
  }

  public Totals getTotals() {
    return new Totals(subtotal, tax);
  }
}

To address the requirements changes:

Let's suppose that we've just gotten back from an all-hands meeting at which the CEO gave us a glowing report about how the company is expanding. There are new markets, new states, new laws to contend with. There will be new discounts and new promotions, and new tax regulations

The proposed rewrite is too long to reproduce here. Applying different letters from SOLID, Martin claims that SOLID helps design code for growth and scaling:

our assumption has been that this code is going to grow, and that we’re trying to make room for that growth

I can agree with the extraction of a Bonus subsystem. But why stop at bonuses? What about taxes, discounts, and pricing?
(Side note: void checkAndAdd(List<RentalItem> items, List<RentalItem> bonusItems) is a surprisingly awkward API.)

The whole redesign is mostly an attempt to replace switch statements with polymorphism. But is CatalogItem actually a good interface?

public interface CatalogItem {
    boolean isEligibleForDiscount(int days);
    int getDiscountedPrice(int days);
    int getUnitPrice();
    double getTaxRate();
    String getName();
}

This is old-school OOP: one interface that pretends to encapsulate “an item” but quietly bundles pricing, tax, and discount policy into a single interface.

And yet it is very simplistic.

Everyone who has dealt with taxes can immediately understand how laughable this simplification is:

double getTaxRate();
  1. There's rarely just one tax
  2. Taxes depend on the category of item
  3. Tax depends on where and who
  4. Tax rules change over time
  5. double wouldn't cut it for rounding and precision; some taxes are not even percentages.

The requirement changes that this design wouldn't be able to accommodate:

  1. Actual taxation system
  2. Changes over time

For example, Martin's interface makes a fundamental domain modeling error: it treats tax as an intrinsic property of items.
Say we're expanding state tax rules: New York charges 8.875% on room rentals but 0% on food. California charges 7.25% on both. With Martin's design, where does this logic go?

class LargeRoom implements CatalogItem {
     double getTaxRate() {
         // How do I know we're in NY vs CA?
         // Do I inject location into every item?
     }
 }

I don't believe this design is ready for "new markets, new states, new laws to contend with."

This is one of the problems with SOLID in general: it's a collection of principles that you can apply mechanistically to any codebase, completely ignoring the domain of the application.

This idea of separating high-level policy from low-level detail might be useful, but it is misapplied. The real world of taxation doesn't work by each newly produced item (low-level) deciding its own tax rate. And yes, tax code (high level) has to change to accommodate new categories when they are being introduced.

This is where domain-driven design provides better guidance: instead of mechanically applying structural principles, start by understanding what you're building.

  1. Figure out the goal of the application
  2. Figure out main domain areas
  3. Figure out main domain entities
  4. Figure out processing workflows

It is easier to organize a growing system if you abandon the old OOP habit of mixing data and processing in one interface.
For this rental system, the processing flow is straightforward:

[order] -> <discounts> -> <promotions> -> <pricing> -> <taxation> -> [receipt]

And the sketch would look like this:

  // data

  record RoomRental(RoomType type, int days) {}
  record EquipmentRental(EquipmentType type, int days) {}
  record ConsumableOrder(ConsumableType type, int days) {}

  class Order {
      private List<RoomRental> rooms = new ArrayList<>();
      private List<EquipmentRental> equipment = new ArrayList<>();
      private List<ConsumableOrder> consumables = new ArrayList<>();
  }

  record BillingDetails(
      String customerName,
      String billingAddress,
      String taxId,
      PaymentMethod paymentMethod
  ) {}

  // processing

  var pricing = new PricingEngine(discounts, taxes, promotions);

  var order = new Order()
      .addRoom(Rooms.LARGE, 5)
      .addRoom(Rooms.SMALL, 3)
      .addEquipment(Equipment.PROJECTOR, 3)
      .addConsumable(Consumables.COFFEE, 5);

  var billingDetails = new BillingDetails(
      "Acme Corp",
      "123 Business St, New York, NY 10001",
      "EIN: 12-3456789",
      PaymentMethod.CORPORATE_ACCOUNT
  );

  Receipt receipt = pricing.generateReceipt(order, billingDetails);

This is a sketch, but it clearly has space for growth into a real system. Need to add California tax rules? Add a new tax strategy. New promotion for repeat customers? Add it to the promotions engine. And so on.
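To make the "space for growth" concrete, here is a hedged sketch of the taxation step (all names, TaxStrategy, FlatTax, and TaxationStep, are hypothetical and not from the book): jurisdictions plug in as interchangeable strategies over priced order lines:

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch: taxation as a pluggable step of the pipeline
interface TaxStrategy {
    double taxFor(String category, double amount);
}

// One jurisdiction = one strategy; new states add new instances, not new item types
class FlatTax implements TaxStrategy {
    private final Map<String, Double> rates; // category -> tax rate

    FlatTax(Map<String, Double> rates) { this.rates = rates; }

    public double taxFor(String category, double amount) {
        return amount * rates.getOrDefault(category, 0.0);
    }
}

class TaxationStep {
    private final TaxStrategy strategy; // swap per state/market

    TaxationStep(TaxStrategy strategy) { this.strategy = strategy; }

    double totalTax(List<Map.Entry<String, Double>> pricedLines) {
        return pricedLines.stream()
                .mapToDouble(line -> strategy.taxFor(line.getKey(), line.getValue()))
                .sum();
    }
}
```

Adding a jurisdiction means adding a strategy, not editing every product type.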

Applying SOLID without thinking about domain modeling will produce architecturally wrong designs that won't work.

Chapter 5: Comments

In the revised chapter on comments, he refuses to take the blame for popularizing the idea that comments are a sign of bad code and should be avoided:

It has been said that advice like this gives license to younger programmers to avoid writing comments. Shame on any programmers, young or old, who use these recommendations as an excuse for their laziness and/or unwillingness to add proper explanations and context. In the end, I cannot concern myself with how others might abuse my advice. My sole purpose here is help those who will not.

And yet even in this edition he declares comments to be a failure:

The comments you write may be necessary; but that necessity is also unfortunate.
..
The proper use of comments is to compensate for our failure to express ourselves in code. Note that I used the word failure. I meant it. Comments are always failures of either our languages or our abilities.

I would generalize even more: this is a failure of our universe. If only the laws of our universe were better, we could avoid writing comments.

Both editions of this book are seriously lacking in examples of good comments. The chapter about comments has a section called "Good Comments", and all of its examples are comically bad.

Comments Do Not Make Up for Bad Code

One of the more common motivations for writing comments is bad code. We write a module and we know it is confusing and disorganized. We know it's a mess. So we say to ourselves, "Ooh, I'd better comment that!" No! You'd better clean it!

Expressing thoughts in code is harder than expressing thoughts in plain text. Expressing thoughts in code requires:

  • Clear thinking
  • Mastery of the language
  • Expressiveness of the language

For example, you can try very hard, but the expressiveness of the language is a noticeable limiting factor:

assertTrue(b.compareTo(a) == 1);    // b > a

The best Java can offer is:

assertThat(b).isGreaterThan(a);

Serious note: if you have messy code, sit down and try to write clear comments - it will clarify your thinking and might help you reorganize the original mess.

Or another example from Chapter 7: Clean functions:

They will read like well-written prose.
if(employee.shouldBePaidToday())
   employee.pay();

In this example, the Employee class/interface is now coupled with unrelated concerns:

  • calendars
  • schedules
  • payment transactions

Does this read like prose? Maybe. Is it a good interface design? No, it isn't. Some languages are just not expressive enough for you to satisfy both constraints: good code design and "reading like prose".
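For what it's worth, the coupling can be removed while keeping a readable call site. A hedged sketch (PaySchedule, WeeklySchedule, and PayrollService are my names, not the book's): the payroll service owns the calendar and payment concerns, and Employee stays plain data:

```java
import java.time.DayOfWeek;
import java.time.LocalDate;

// Plain employee data: knows nothing about calendars or payment transactions
record Employee(String id, double salary) {}

// The schedule, not the employee, decides when payment is due
interface PaySchedule {
    boolean isPayday(LocalDate date);
}

class WeeklySchedule implements PaySchedule {
    public boolean isPayday(LocalDate date) {
        return date.getDayOfWeek() == DayOfWeek.FRIDAY;
    }
}

// Payment concerns live here, not on Employee
class PayrollService {
    private final PaySchedule schedule;

    PayrollService(PaySchedule schedule) { this.schedule = schedule; }

    boolean shouldPay(Employee employee, LocalDate today) {
        return schedule.isPayday(today);
    }
}
```

The call site becomes `if (payroll.shouldPay(employee, today)) ...`, which is arguably still prose-like, without welding three domains onto one class.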

Overall the messaging of this chapter is the same. It's overfocused on bad examples and underfocused on good ones.

Seriously, this is an example of a good comment:
// Returns an instance of the Responder being tested.
protected abstract Responder responderInstance();

That's because the author was using the well-known Singleton design pattern. When you use a design pattern like that, you should follow the canonical form specified by that pattern. In the Singleton pattern, the name of the accessor function always ends in the word Instance.

This doesn't look like a singleton. At all.

This is how a typical singleton should look (in Java):

class Responder {
    public static final Responder INSTANCE = createResponder();
}

I don't know what happened here. But assuming that the suffix Instance means singleton is just wrong. I've checked the source code of FitNesse, and I'm sorry to inform you: this is not a singleton.
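For reference, the other canonical singleton form in modern Java is the enum idiom (a minimal sketch; the respond method is made up for illustration):

```java
// Enum-based singleton: the JVM guarantees exactly one INSTANCE,
// even in the face of serialization and reflection
enum SingletonResponder {
    INSTANCE;

    String respond(String request) {
        return "ack: " + request;
    }
}
```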

A couple of notes that irritate me:

Attributions and Bylines

/* Added by Rick */

This is just graffiti. Please don't spray-paint your name all over the source code. Source code control systems are very good at remembering who added what, when.

I'm surprised to hear this argument from an IT veteran who for sure should have seen mass migration from one version control system to another (CVS to SVN or from SVN to Git). And how often the authorship information is lost during such migrations.

Commented-Out Code

Few practices are as odious as commenting-out code. Commented-out code is an abomination before nature and nature's God. Don't check this stuff in!

Others who see that commented-out code will be unlikely to have the courage to delete it. They'll think it is there for a reason and is too important to delete. So commented-out code gathers like dregs at the bottom of a bad bottle of wine

In reality, nobody is afraid to delete old commented-out code; once it’s been sitting there for a year or two, everyone assumes it’s stale. And for short-lived changes, commenting code in and out is far more practical and convenient than anything else.

I don't know where this puritan stance against comments is coming from. Comments can be fun, creative and useful:

ASCII art diagram showing memory barriers and store sequences - an example of creative documentation

Feel free to use and enjoy them.

...

And one last thing..

Exception handling is still a mess, even two decades later:

public void loadProperties()
 {
   try
   {
    String propertiesPath = propertiesLocation + "/" + PROPERTIES_FILE;
    FileInputStream propertiesStream =
                    new FileInputStream(propertiesPath);
    loadedProperties.load(propertiesStream);
   }
   catch(IOException e)
   {
     // No properties files means all defaults are loaded
   }
 }

... Wouldn't it have been better if the programmer had written the following?

catch(IOException e)
{
   LoadedProperties.loadDefaults();
}

No! It wouldn't. This is very sloppy code for several reasons:

  1. IOException is too broad: this exception can occur for many reasons, including file not found, permission denied, disk full, corrupted data, and network issues (for network filesystems).

  2. Silent failures make debugging impossible: When this code fails in production, you'll have no idea why the defaults were loaded. That information is gone.

  3. The fallback might not be safe: Loading defaults might be the right response to "file not found" but completely wrong for "disk full" or "permission denied." Different IOException causes should trigger different recovery strategies.

The correct approach would be something like:

catch(FileNotFoundException e) {
   logger.info("Config file not found, the execution would proceed using defaults", e);
   LoadedProperties.loadDefaults();
} catch(IOException e) {
   throw new ConfigurationException("Unable to load configuration", e);
}
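Putting the pieces together, here is a fuller hedged sketch of such a loader (a RuntimeException stands in for the ConfigurationException above, and the default property is a placeholder). It also closes the stream, which the book's version never does:

```java
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.Properties;
import java.util.logging.Logger;

class ConfigLoader {
    private static final Logger logger = Logger.getLogger(ConfigLoader.class.getName());
    private final Properties loadedProperties = new Properties();

    void loadProperties(String propertiesPath) {
        // try-with-resources closes the stream on every path;
        // Properties.load does NOT close the stream it reads from
        try (FileInputStream in = new FileInputStream(propertiesPath)) {
            loadedProperties.load(in);
        } catch (FileNotFoundException e) {
            // expected case: fall back to defaults, but leave a trace
            logger.info("Config file not found, proceeding with defaults: " + propertiesPath);
            loadDefaults();
        } catch (IOException e) {
            // permissions, disk, corruption: not safe to silently ignore
            throw new RuntimeException("Unable to load configuration", e);
        }
    }

    private void loadDefaults() {
        loadedProperties.setProperty("mode", "default"); // placeholder default
    }

    String get(String key) {
        return loadedProperties.getProperty(key);
    }
}
```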

If you think that I'm nit-picking, I promise you that I'm not. Proper handling of error cases is one of the main pillars of software engineering. Just another illustration that Clean Code (the book) is obsessed with style aesthetics and underweights operational reality.

Small functions are a style, but good error handling is clean code.


Reference Materials

If you want to learn more about domain-driven design and domain modeling, here are some excellent resources:

For performance critique of Clean Code style:

Clean Code - Critical Analysis

Clean Code is probably the most recommended book for entry-level engineers and junior developers. While it might have been a good recommendation in the past, I don't believe that's the case anymore.

After revisiting the book in 2023 I was surprised to notice that:

  • The book hasn't aged well
  • Much of its advice ranges from questionable to harmful
  • Examples are the worst part of the book. By any objective metric, many would qualify as "bad code."
  • Laser focus on the wrong thing, sold as the solution to everything. Code readability is important, but it is not the only aspect
  • Despite being an entry-level book, it carries an air of implied superiority, potentially giving readers an undeserved sense of expertise

For a significantly shorter critique of the book, check out qntm's critique. I mostly agree with qntm's assessment, but it's a bit too emotional and personal, and it doesn't cover the parts I find most harmful.

If you're just starting your career and seeking books to improve your coding skills, I suggest these instead:

Also Robert Martin is working on the second edition of Clean Code. It will be fun to see how much of this critique will become irrelevant 😃

Intended Audience

  • For those recommending "Clean Code":
    You may have read the book over a decade ago and found it useful at the time. This might help you reconsider.
  • For those confused after reading "Clean Code":
    You’re not alone. The book can be confusing, and you might find better practical advice in modern alternatives.
  • For those who enjoyed "Clean Code":
    "It's great when people get excited about things, but sometimes they get a little too excited."
    I hope this critique helps you see more nuance.

A Note to All Readers

This page exists as a reference for anyone. Agree or disagree, contributions to the critique are welcome via GitHub.




Chapter 1: Clean Code

Here is Martin's main thesis: bad code destroys companies.

Ergo: Writing bad code is a mistake, and no excuse can justify it.

I know of one company that, in the late 80s, wrote a killer app. It was very popular, and lots of professionals bought and used it. But then the release cycles began to stretch. Bugs were not repaired from one release to the next. Load times grew and crashes increased. I remember the day I shut the product down in frustration and never used it again. The company went out of business a short time after that.

Two decades later I met one of the early employees of that company and asked him what had happened. The answer confirmed my fears. They had rushed the product to market and had made a huge mess in the code. As they added more and more features, the code got worse and worse until they simply could not manage it any longer.

It was the bad code that brought the company down.

A Counter-Anecdote:

"Oracle Database 12.2.

It is close to 25 million lines of C code. What an unimaginable horror! You can't change a single line of code in the product without breaking 1000s of existing tests. Generations of programmers have worked on that code under difficult deadlines and filled the code with all kinds of crap.

Very complex pieces of logic, memory management, context switching, etc. are all held together with thousands of flags. The whole code is ridden with mysterious macros that one cannot decipher without picking a notebook and expanding relevant parts of the macros by hand. It can take a day to two days to really understand what a macro does.

Sometimes one needs to understand the values and the effects of 20 different flags to predict how the code would behave in different situations. Sometimes 100s too! I am not exaggerating."

From HN: What's the largest amount of bad code you have ever seen work?

As horrific as this sounds, Oracle won't go out of business anytime soon. I can bet my life on that.

A personal anecdote:

I worked at a small Series-A startup that had the most elegant code I've ever seen. The code was highly attuned to the domain model, well structured, and so ergonomic that adding new features was a joy. It was the third iteration of the codebase. They ran out of money. It was the hunt for clean code that brought the company down. Nah, I'm kidding. The world of business is far more complex than code quality alone.

Reading Clean Code in 2009, I was impressed by the idea that bad code is the root of all evil, and I became a convert. It took me a decade to finally admit: "OK, this doesn't match reality." "Bad" code is pretty much the norm. Success or failure of an enterprise has nothing to do with it. And code that's good today will look very questionable by tomorrow's standards (this book is a perfect illustration of that).

The focus on "cleanliness" of code often feels like bike-shedding. The real challenge is keeping the balance between complexity and the resources available to manage it. Oracle can afford a complex codebase because it has enough resources and (I'm speculating) keeps throwing more bodies at the problem.

Of course you have been impeded by bad code. So then—why did you write it? <...>

Were you trying to go fast? Were you in a rush? Probably so. Perhaps you felt that you didn't have time to do a good job

"Not having enough time" might be an easy excuse and rationalization. But in my experience, I'm writing bad code because:

  • I don't know that it is bad
  • I don't know how to make it good within the constraints
  • I'm constrained by the existing architecture or design

More often, it's a combination: accepting the existing design because I don't know how to make it better.

You don't know what you don't know. Unfortunately, just reading "Clean Code" helps very little with discovering unknowns.

What might help is diving in and working on high quality code bases, having an experienced mentor, learning and studying different domains, ecosystems, languages and frameworks.

The Boy Scout Rule

"Leave the campground cleaner than you found it". This is the call to action: always improve code that you're touching.

In corporate America, anything with a catchy name has a chance to spread. I've heard of a CTO at a mid-size company who knew two things about IT: how to hire contractors to adopt the Scaled Agile Framework, and "the Boy Scout Rule" of programming.

Can you imagine working on a project where the code simply got better as time passed?

This sounds appealing and might work on a small scale (think a couple of lines of code) and in isolation. Even there, it quickly hits diminishing returns. Applied on a larger scale, it creates chaos and random failures, and places an unnecessary burden on the people working with the system and the codebase.

What makes things worse: cleanliness of code is not an objective metric.

In a campground, everyone agrees a plastic cup is trash and doesn't belong. In code, readability and elegance are far more subjective. The most elegant OCaml code will look unreadable and weird to the average Java Bob.

Kent Beck offers the idea of tidyings: small code changes that will unquestionably improve the codebase. While reading his book "Tidy First?" you'll notice two things:

  • These are very small changes
  • Question mark in the title of the book

Clean Code advocates for significantly more radical interventions.



The Boy Scout Rule is the opposite of:

Do not "fix" stuff that is not broken
(this needs a catchy name)

To Sum Up:

Focusing obsessively on clean code often misses the bigger picture. The real game lies in managing complexity while balancing the resources and the constraints.




Chapter 2: Meaningful names

I agree with the theme of this chapter: names are important. Naming is the main component of building abstractions and abstraction boundaries. In smaller languages like C or Go, naming is the primary mechanism of abstraction.

Naming brings the problem domain into the computational domain:

static int TEMPERATURE_LIMIT_F = 1000;

Named code constructs — such as functions and variables — are the building blocks of composition in virtually every programming paradigm.

Names are a form of abstraction: they provide a simplified way of thinking about a more complex underlying entity. Like other forms of abstraction, the best names are those that focus attention on what is most important about the underlying entity while omitting details that are less important.

From "A Philosophy of Software Design"

However, I disagree with Clean Code's specific approach to naming and its examples of "good" names. Quite often the book declares a good rule but then shows horrible application.

Use intention-revealing names

The advocated principle is the title - the name should reveal intent. But the application:

The name of a variable, function, or class, should answer all the big questions. It should tell you why it exists, what it does, and how it is used. If a name requires a comment, then the name does not reveal its intent.

This sounds like an impossible task to me. First, a name that reveals all of those details fails to be an abstraction boundary. Second, as many examples in the book show, this approach leads to using a description as a name.

Martin presents 3 versions of the same code:


public List<int[]> getThem() {
  List<int[]> list1 
       = new ArrayList<int[]>();
  for (int[] x : theList)
    if (x[0] == 4)
      list1.add(x);
  return list1;
} 
(Might be) Obfuscated and unrealistic

public List<int[]> getFlaggedCells() {
  List<int[]> flaggedCells 
     = new ArrayList<int[]>();
  for (int[] cell : gameBoard)
    if (cell[STATUS_VALUE] == FLAGGED)
      flaggedCells.add(cell);
  return flaggedCells;
}
(Might be) Good enough

public List<Cell> getFlaggedCells() {
  List<Cell> flaggedCells 
      = new ArrayList<Cell>();
  for (Cell cell : gameBoard)
    if (cell.isFlagged())
        flaggedCells.add(cell);
  return flaggedCells;
}
(Might be) Premature abstraction

My main disagreement is this: not all code chunks need a name. In modern languages, this method can be a one-liner:

gameBoard.filter(cell => cell(STATUS_VALUE) == FLAGGED)

That's it. This code can be inlined and used as is.
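For comparison, much the same one-liner has been available in Java itself (the book's language) since streams. A small self-contained sketch, with STATUS_VALUE = 0 and FLAGGED = 4 inferred from the first version of the code:

```java
import java.util.List;
import java.util.stream.Collectors;

class GameBoard {
    static final int STATUS_VALUE = 0; // index of the status field in each cell array
    static final int FLAGGED = 4;      // status value meaning "flagged"

    // the whole getFlaggedCells method collapses into one stream pipeline
    static List<int[]> flaggedCells(List<int[]> gameBoard) {
        return gameBoard.stream()
                .filter(cell -> cell[STATUS_VALUE] == FLAGGED)
                .collect(Collectors.toList());
    }
}
```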

While getFlaggedCells looks like an improvement over obfuscated getThem, it's not a name, it's a description. If the description is as long as the code, it is redundant.

Martin writes about it in passing: "if you can extract another function from it with a name that is not merely a restatement of its implementation".
But he violates this principle quite often.

For readability, the second version is just as clear as the third. Adding a Cell abstraction is overkill.

To be fair, there are good reasons to introduce the Cell abstraction, but there are also reasons not to:
List<int[]> is a generic type, devoid of meaning without context.
List<Cell> has more semantic meaning and is harder to misuse.

List<int[]> list = getFlaggedCells();
list.get(0)[0] = list.get(0)[1] - list.get(0)[0];
        
This compiles and runs, but the transformation is nonsensical and will corrupt data (the first element is a status field).
List<Cell> has a better affordance than List<int[]>, and makes such mistakes less likely. But improved affordance comes from the specialized type, not just the name.
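To make the affordance point concrete, here is a hedged sketch (using a record, my choice; the book's Cell is an unspecified class) in which the accidental in-place corruption above simply cannot be written:

```java
import java.util.List;
import java.util.stream.Collectors;

// Immutable cell: status and value are separate, named, and read-only
record Cell(int status, int value) {
    static final int FLAGGED = 4; // flag status inferred from the book's example

    boolean isFlagged() { return status == FLAGGED; }
}

class Cells {
    static List<Cell> getFlaggedCells(List<Cell> gameBoard) {
        return gameBoard.stream()
                .filter(Cell::isFlagged)
                .collect(Collectors.toList());
    }
}
```

Records have no setters, so the equivalent of `list.get(0)[0] = ...` does not compile.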

But there is a downside: specialized types need specialized processing code. Serialization libraries, for example, support List<int[]> out of the box, but would need custom ser/de for the Cell class.

Since Clean Code, Robert Martin has embraced Clojure and functional programming. One of the tenets of the Clojure philosophy: use generic types to represent data, and you'll have an enormous library of processing functions that can be reused and combined.
I'm curious whether he will ever finish the second edition, and whether he has changed his mind about types.

Avoid disinformation ... Use Problem Domain Name

Martin pretty much advocates for an informative style of writing in code: be clear, avoid quirks and puns.

These are examples of "good" names from his perspective:

  • bunchOfAccounts
  • XYZControllerForEfficientStorageOfStrings
Do not refer to a grouping of accounts as an accountList unless it's actually a List. The word list means something specific to programmers. If the container holding the accounts is not actually a List, it may lead to false conclusions. So accountGroup or bunchOfAccounts or just plain accounts would be better.

This might be a stylistic preference, but accountsList is easier to read and write than bunchOfAccounts: it's shorter and has fewer words; it is more concise. If the first thing you think of when you see the word List is java.util.List, you might need some time away from Java. Touch some grass, write some Haskell.

XYZControllerForEfficientStorageOfStrings - this is not a name - it's an essay. It tells the whole story of (web) application working with strings and storing them efficiently.

Martin states that acronyms and word shortenings are bad, but doesn't see the problem in a name that has 7 words and 40 characters in it.

There is a simple solution to this: comments that expand acronyms and explain shortenings. But because Martin believes that comments are a failure, he has to insist on using essays as names.

By the end of the chapter he finally mentions:

Shorter names are generally better than longer ones, so long as they are clear.

...The resulting names are more precise, which is the point of all naming.

These are really good points! But most of the examples in the chapter are not aligned with them. And I don't think "short and concise naming" is a takeaway people are getting from reading this chapter.

If anything, in most examples he proposes replacing short (albeit cryptic) names with three-to-four-word slugs.

Opinion: Code Conventions for the Java Programming Language
Pascal case and camel case in Java were proposed in Sun's Code Conventions for the Java Programming Language as a way to enforce "names should be short yet meaningful".

The idea being that ItIsReallyUncomfortableToReadLongSentencesWrittenInThisStyle. Hence people will be soft forced to limit the size of names.

The assumption was wrong.

Add meaningful context

I believe this was my first big WTF moment in the book:

Original Code:
private void printGuessStatistics(char candidate, 
                                  int count) {   
    String number;
    String verb;
    String pluralModifier;
    if (count == 0) {
        number = "no";
        verb = "are";
        pluralModifier = "s";
    } else if (count == 1) {
        number = "1";
        verb = "is";
        pluralModifier = "";
    } else {
        number = Integer.toString(count);
        verb = "are";
        pluralModifier = "s";
    }
    String guessMessage = String.format(
        "There %s %s %s%s", verb, number, 
        candidate, pluralModifier
    );
    print(guessMessage); // side effect: should just return the guessMessage String instead
}
Proposed rewrite:
public class GuessStatisticsMessage {
    private String number;
    private String verb;
    private String pluralModifier;
  
    public String make(char candidate, int count) {
        createPluralDependentMessageParts(count);
        return String.format(
                "There %s %s %s%s", 
                verb, number, candidate, pluralModifier );
    }
  
    private void createPluralDependentMessageParts(int count) {
        if (count == 0) {
            thereAreNoLetters();
        } else if (count == 1) {
            thereIsOneLetter();
        } else {
            thereAreManyLetters(count);
        }
    }
  
    private void thereAreManyLetters(int count) {
        number = Integer.toString(count);
        verb = "are";
        pluralModifier = "s";
    }
  
    private void thereIsOneLetter() {
        number = "1";
        verb = "is";
        pluralModifier = "";
    }
  
    private void thereAreNoLetters() {
        number = "no";
        verb = "are";
        pluralModifier = "s";
    }
}

In no way is the second option better than the first one. The original has only one problem: side effects. Instead of printing to the console, it should just return the String.

And that is it:

  • 1 method, 20 lines of code, can be read top to bottom.
  • 3 mutable local variables capture local state that cannot escape.
  • It is thread-safe. It's impossible to misuse this API.

Second option:

  • 5 methods, 40 lines of code, +1 new class with mutable state.
  • Because it's a class, the state can escape and be observed from the outside.
  • Which makes it not thread-safe.

The second option introduces more code, more concepts, and more entities, and adds thread-safety concerns, all while producing exactly the same result.

It also violates one of the rules laid out in the chapter about method names: "Methods should have verb or verb phrase names". thereIsOneLetter() is not really a verb or a verb phrase.

If I try to be charitable here: Martin is creating an internal Domain-Specific Language (DSL) for the problem.

A good indicator of an internal DSL: there are parts of the API that don't make sense outside a specific context/grammar.

For example, new GuessStatisticsMessage().thereAreNoLetters() looks weird and doesn't make sense.

On the other hand, the language constructs are presented in a somewhat declarative style:

private void thereAreNoLetters() {
    number = "no";
    verb = "are";
    pluralModifier = "s";
}

private void thereIsOneLetter() {
    number = "1";
    verb = "is";
    pluralModifier = "";
}

I hope you agree that methods like these are not typical Java. If anything, they look closer to typical Ruby.

But ultimately, object-oriented programming and domain-specific languages are unnecessary for this simple task. In a powerful-enough language, keeping the code procedural gives a result that is short and easy to understand:

def formatGuessStatistics(candidate: Char, count: Int): String = {   
   count match {
       case i if i < 0 => throw new IllegalArgumentException(s"count=$count: negative counts are not supported")
       case 0 => s"There are no ${candidate}s"
       case 1 => s"There is 1 ${candidate}"
       case x => s"There are $x ${candidate}s"
   }
}


Chapter 3: Functions

The chapter begins by showcasing an example of "bad" code:

public static String testableHtml(PageData pageData, boolean includeSuiteSetup) throws Exception {
    WikiPage wikiPage = pageData.getWikiPage();
    StringBuffer buffer = new StringBuffer();
    if (pageData.hasAttribute("Test")) {
      if (includeSuiteSetup) {
        WikiPage suiteSetup =
         PageCrawlerImpl.getInheritedPage(
                 SuiteResponder.SUITE_SETUP_NAME, wikiPage
         );
        if (suiteSetup != null) {
         WikiPagePath pagePath = suiteSetup.getPageCrawler().getFullPath(suiteSetup);
         String pagePathName = PathParser.render(pagePath);
         buffer.append("!include -setup .")
               .append(pagePathName)
               .append("\n");
        }
      }
      WikiPage setup = 
        PageCrawlerImpl.getInheritedPage("SetUp", wikiPage);
      if (setup != null) {
        WikiPagePath setupPath = wikiPage.getPageCrawler().getFullPath(setup);
        String setupPathName = PathParser.render(setupPath);
        buffer.append("!include -setup .")
              .append(setupPathName)
              .append("\n");
      }
    }
    buffer.append(pageData.getContent());
    if (pageData.hasAttribute("Test")) {
      WikiPage teardown = PageCrawlerImpl.getInheritedPage("TearDown", wikiPage);
      if (teardown != null) {
        WikiPagePath tearDownPath = wikiPage.getPageCrawler().getFullPath(teardown);
        String tearDownPathName = PathParser.render(tearDownPath);
        buffer.append("\n")
              .append("!include -teardown .")
              .append(tearDownPathName)
              .append("\n");
      }
       if (includeSuiteSetup) {
         WikiPage suiteTeardown =
           PageCrawlerImpl.getInheritedPage(
                   SuiteResponder.SUITE_TEARDOWN_NAME,
                   wikiPage
           );
         if (suiteTeardown != null) {
           WikiPagePath pagePath = suiteTeardown.getPageCrawler().getFullPath (suiteTeardown);
           String pagePathName = PathParser.render(pagePath);
           buffer.append("!include -teardown .")
                 .append(pagePathName)
                 .append("\n");
         }
       }
     }
     pageData.setContent(buffer.toString());
     return pageData.getHtml();
  }

And the proposed refactoring:

public static String renderPageWithSetupsAndTeardowns(PageData pageData, boolean isSuite) throws Exception {
    boolean isTestPage = pageData.hasAttribute("Test");
    if (isTestPage) {
        WikiPage testPage = pageData.getWikiPage();
        StringBuffer newPageContent = new StringBuffer();
 
        includeSetupPages(testPage, newPageContent, isSuite);
        newPageContent.append(pageData.getContent());
        includeTeardownPages(testPage, newPageContent, isSuite);
 
        pageData.setContent(newPageContent.toString());
    }
 
    return pageData.getHtml();
}

The original version operates at multiple levels of detail and juggles multiple domains at the same time:

  • API for FitNesse objects: working with WikiPage, PageData, PageCrawler, etc.
  • Java string manipulation: Using StringBuffer to optimize string concatenation (it should be StringBuilder).
  • Business logic: Handling suites, tests, setups, and teardowns in a specific order.

When everything is presented at the same level, it indeed looks very noisy and hard to follow. (The book touches on this in "One Level of Abstraction per Function".)

Martin's trick here is simple: show that smaller code is easier to understand than larger code. This works because he doesn't show the implementation of includeSetupPages and includeTeardownPages

But... extracting non-reusable methods doesn't actually reduce complexity or code size; at best, it just improves navigation.

What objectively reduces code size is removing repetition (the fancy term: anti-unification). The original code, as bad as it is, can be significantly improved by one small change: extracting the duplicated logic into a helper method:

public static String testableHtml(PageData pageData, boolean includeSuiteSetup) throws Exception {
    if (!pageData.hasAttribute("Test")) { // not a test page
        return pageData.getHtml();
    }
    WikiPage wikiPage = pageData.getWikiPage();
    List<String> buffer = new ArrayList<>();
    if (includeSuiteSetup) {
        buffer.add(generateInclude(wikiPage, "Suite SetUp", "-setup"));
    }
    buffer.add(generateInclude(wikiPage, "SetUp", "-setup"));

    buffer.add(pageData.getContent());

    buffer.add(generateInclude(wikiPage, "TearDown", "-teardown"));
    if (includeSuiteSetup) {
        buffer.add(generateInclude(wikiPage, "Suite TearDown", "-teardown"));
    }

    pageData.setContent(buffer.stream()
            .filter(s -> !s.isEmpty())
            .collect(Collectors.joining("\n")));
    return pageData.getHtml();
}

private static String generateInclude(WikiPage wikiPage, String path, String command) throws Exception {
    WikiPage inheritedPage = PageCrawlerImpl.getInheritedPage(path, wikiPage);
    if (inheritedPage != null) {
        WikiPagePath pagePath = inheritedPage.getPageCrawler().getFullPath(inheritedPage);
        String pagePathName = PathParser.render(pagePath);
        return "!include " + command + " ." + pagePathName;
    } else {
        return "";
    }
}

Is it noisier than Martin's version? Yes. But most people would answer "yes" to the posed question, "Do you understand the function after three minutes of study?" And it's a small change from the original mess.

"Extract a helper" is one of those tidyings that Kent Beck advocates: small code changes that definitely improve the situation and carry little risk.

The real problem

Despite these improvements, we're missing the critical issue: both versions modify the PageData object as a side effect. Both testableHtml and renderPageWithSetupsAndTeardowns overwrite PageData's content to generate HTML. This hidden behavior, not the code structure, is the real problem.

Small!

NO! This is one of the two most harmful ideas in Clean Code. A bazillion tiny functions has become a trademark of "clean-coders", and it reflects a fundamental misunderstanding of why we break down code.
(The second-worst idea of the book is that comments are failures.)

Breaking a system into pieces is an extremely useful technique for:

  • Building reusable components
  • Keeping unrelated concerns separate
  • Reducing cognitive load required to reason about components independently

However, Clean Code advocates splitting code merely to keep functions short. By itself, that is a useless metric.

When components become too small, they:

  • Fail to encapsulate meaningful functionality
  • Become tightly coupled with other parts
  • Can't be analyzed independently

This defeats the original goals of modularity:

  • Less reusable: Tight coupling makes it harder to use pieces of the system in different contexts.
  • Harder to understand: You can't reason about pieces in isolation—everything is interconnected.

You can argue that shorter methods are less complex, but this addresses only local complexity. To understand a system, you have to deal with global complexity - the sum of all the pieces and how they interact. A bad split can make global complexity worse.

Most of the time, splitting a function into tiny pieces doesn’t improve much. This is like cutting a pizza into smaller slices and claiming you've reduced the calories.

My rule of thumb: splitting code should reduce global complexity or code size (preferably both). If breaking something into smaller pieces makes the overall system harder to understand and/or adds more lines of code, you’ve just made things worse.

There’s also the problem of running out of good names when you create too many small functions. This leads to long, over-descriptive names that actually make code harder to read (more on this later):

// an example, not from the book
public static String render(PageData pageData, boolean isSuite) throws Exception 
private static String renderPageWithSetupsAndTeardowns(PageData pageData, boolean isSuite) throws Exception 
private static String failSafeRenderPageWithSetupsAndTeardowns(PageData pageData, boolean isSuite) throws Exception 

PS Unfortunately, Martin still advocates for this approach:



One Level of Abstraction per Function

This is a good rule, but as the author of the code you always have a choice: what is your abstraction?

public static String testableHtml(PageData pageData, boolean includeSuiteSetup) {
    WikiPage wikiPage = pageData.getWikiPage();
    StringBuilder buffer = new StringBuilder();
    boolean isTestPage = pageData.hasAttribute("Test");
    if (isTestPage) {
        if (includeSuiteSetup) {
            buffer.append(generateInclude(wikiPage, SuiteResponder.SUITE_SETUP_NAME, "-setup")).append("\n");
        }
        buffer.append(generateInclude(wikiPage, "SetUp", "-setup")).append("\n");
    }

    buffer.append(pageData.getContent());

    if (isTestPage) {
        buffer.append(generateInclude(wikiPage, "TearDown", "-teardown"));
        if (includeSuiteSetup) {
            buffer.append("\n").append(generateInclude(wikiPage, SuiteResponder.SUITE_TEARDOWN_NAME, "-teardown"));
        }
    }
    pageData.setContent(buffer.toString());
    return pageData.getHtml();
}

You might say this violates "one level of abstraction":

There are concepts in there that are at a very high level of abstraction, such as getHtml();
others that are at an intermediate level of abstraction, such as: String pagePathName = PathParser.render(pagePath);
and still others that are remarkably low level, such as: .append("\n").

Or you could argue that the domain of this function is converting PageData into HTML as a raw string. From that perspective, everything here operates at the same level - transforming structured data into formatted text.

"Abstractions are mappings between a complex concrete world and a simple idealized one."

James Koppel "Abstraction is not what you think it is"

The key point is that abstraction is a choice. Developers define the idealized world their function operates in. If you decide your abstraction is "PageData to HTML converter," then string operations and HTML generation belong at the same level.

Switch Statements

It's hard to make a small switch statement. Even a switch statement with only two cases is larger than I'd like a single block or function to be. It's also hard to make a switch statement that does one thing. By their nature, switch statements always do N things.

By this logic a method could never contain an if-else statement - it also does N things.

There was a whole movement of anti-if programming. I'm not quite sure whether it's a joke or not.

Ok, back to Martin:

public Money calculatePay(Employee e) throws InvalidEmployeeType {
    switch (e.type) {
        case COMMISSIONED:
            return calculateCommissionedPay(e);
        case HOURLY:
            return calculateHourlyPay(e);
        case SALARIED:
            return calculateSalariedPay(e);
        default:
            throw new InvalidEmployeeType(e.type);
    }
}

Whether this violates the "one thing" rule depends entirely on perspective:

  • Low-level view: Four branches doing four different things
  • High-level view: One thing - calculating employee pay

Switch statements have a bad rep among Java developers:

  • They don't look like OOP
  • They lead to repetition
  • They lead to bugs when copies get out of sync

Starting with Java 14, switch expressions introduced exhaustive matching, addressing one of these issues.

public Money calculatePay(Employee e) {
    return switch (e.type) {
        case COMMISSIONED -> calculateCommissionedPay(e);
        case HOURLY       -> calculateHourlyPay(e);
        case SALARIED     -> calculateSalariedPay(e);
    };
}

Forgetting to handle a case now causes a compile-time error that's impossible to miss. No extra abstractions needed.

Martin suggests hiding the switch statement behind polymorphism:

public abstract class Employee {
    public abstract boolean isPayday();
    public abstract Money calculatePay();
    public abstract void deliverPay(Money pay);
}
 
public interface EmployeeFactory {
    public Employee makeEmployee(EmployeeRecord r) throws InvalidEmployeeType;
}
 
public class EmployeeFactoryImpl implements EmployeeFactory {
    public Employee makeEmployee(EmployeeRecord r) throws InvalidEmployeeType {
        switch (r.type) {
            case COMMISSIONED:
                return new CommissionedEmployee(r);
            case HOURLY:
                return new HourlyEmployee(r);
            case SALARIED:
                return new SalariedEmployee(r);
            default:
                throw new InvalidEmployeeType(r.type);
        }
    }
}

This common approach has serious problems, as Martin's own example shows.

His Employee interface becomes a god object mixing unrelated concerns:

  • Employee.isPayday() - couples Employee with payments, agreements, calendars and dates
  • Employee.calculatePay() - couples Employee with financial calculations
  • Employee.deliverPay() - couples Employee with transaction processing and persistence

As more features are added, the Employee interface will inevitably grow into an unmanageable, bloated entity.

Ironically, Chapter 10 talks about cohesion and the single responsibility principle. Yet here, to avoid a switch statement, he violates these core OOP principles.

Writing software is a balancing act - it is a search for a solution in a system of contradictory constraints.

By declaring switch fundamentally bad, Martin loses this balance. His solution increases complexity and couples unrelated concerns, trading one set of problems for another.

To properly separate concerns, we'd need to split Employee into focused interfaces like Payable, Schedulable, and Transactionable. But to maintain polymorphism, we'd then need either three factories (repetition) or a factory of factories (over-engineering).
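A sketch of what such a split might look like. All names below (including the Money stub) are the review's hypotheticals, not code from the book:

```java
// Hypothetical split of the Employee god-interface into focused concerns.
// Money is a stub record so the sketch compiles on its own.
record Money(long cents) {}

interface Payable {
    Money calculatePay();
}

interface Schedulable {
    boolean isPayday();
}

interface Transactionable {
    void deliverPay(Money pay);
}

// An employee type implements only the concerns it actually needs.
class HourlyEmployee implements Payable, Schedulable {
    private final long hourlyCents;
    private final long hours;

    HourlyEmployee(long hourlyCents, long hours) {
        this.hourlyCents = hourlyCents;
        this.hours = hours;
    }

    public Money calculatePay() { return new Money(hourlyCents * hours); }
    public boolean isPayday() { return true; } // placeholder schedule logic
}
```

Each interface stays cohesive, but note the cost stated above: keeping the construction polymorphic now requires extra factory machinery.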

Use Descriptive Names

In Listing 3-7 I changed the name of our example function from testableHtml to SetupTeardownIncluder.render. This is a far better name because it better describes what the function does

Nit-picking, but SetupTeardownIncluder.render doesn't make much sense without reading the code. It's unclear why an "Includer" should be rendering, or what "rendering" means for an "includer".

Using descriptive names is good. Using descriptions as names isn't.

Don't be afraid to make a name long.

I think you should. Between using cryptic acronyms and writing an essay as a name, there is a balance to be found.

There's scientific evidence that long words or word combinations increase both physical and mental effort when reading:

"When we read, our eyes incessantly make rapid mechanical (i.e., not controlled by consciousness) movements, saccades. On average, their length is 7-9 letter spaces. At this time we do not receive new information."

"During fixation, we get information from the perceptual span. The size of this area is relatively small, in the case of alphabetic orthographies (for example, in European languages) it starts from the beginning of the fixed word, but no more than 3-4 letter spaces to the left of the fixation point, and extends to about 14-15 letter spaces to the right of this point (in total 17-19 spaces)."

Figure 10. The typical pattern of eye movements while reading.

From: Optimal Code Style

Names longer than ~15 characters become harder to process. Compare:

  • PersistentItemRecordConfig

  • PersistentItemRec

If you’re not thinking about the meaning, the second name is visually and mentally easier to skim. The first name requires more effort to read and pronounce internally.

Long names also consume screen real estate and make code visually overwhelming.

PersistentItemRecordConfig persistentItemRecordConfig = new PersistentItemRecordConfig();

      vs      

var item = new PersistentItemRec();

Consider Bob Nystrom’s principles of good naming as a guide:

A name has two goals:

  • It needs to be clear: you need to know what the name refers to.
  • It needs to be precise: you need to know what it does not refer to.

After a name has accomplished those goals, any additional characters are dead weight

  1. Omit words that are obvious given a variable’s or parameter’s type
  2. Omit words that don’t disambiguate the name
  3. Omit words that are known from the surrounding context
  4. Omit words that don’t mean much of anything

From Long Names Are Long



A Note on Experience Levels

I've recently read about "Stroustrup's Rule". The short version sounds like: "Beginners need explicit syntax, experts want terse syntax."

I see this as a special case of mental model development: when a feature is new to you, you don't have an internal mental model so need all of the explicit information you can get. Once you're familiar with it, explicit syntax is visual clutter and hinders how quickly you can parse out information.

(One example I like: which is more explicit, user_id or user_identifier? Which do experienced programmers prefer?)

From "Stroustrup's Rule" by Hillel Wayne

That is a good insight.
And yet it is more appropriate to put the information required by "beginners" (those unfamiliar with a code base) into comments. That way, "experts" don't suffer from the visual clutter.

Function Arguments

The ideal number of arguments for a function is zero (niladic).
Next comes one (monadic), followed closely by two (dyadic).
Three arguments (triadic) should be avoided where possible.
More than three (polyadic) requires very special justification—and then shouldn't be used anyway.

I think Robert Martin gets the most hate for this one.

The main problem with Martin's advice is that it presents itself as "less is better" while hand-waving away the downsides of each particular application. He ignores trade-offs and side effects.

  • "Smaller methods are better" - but the increased number of methods? Nah, you'll be fine.
  • "Fewer arguments are better" - but the increased scope of mutable state? Nah, you'll be fine.
  • "Compression is better", but the bulging discs? Nah, you'll be fine.

One particularly odd suggestion is to "simplify" by moving arguments into instance state:

Arguments are hard. They take a lot of conceptual power. That's why I got rid of almost all of them from the example. Consider, for instance, the StringBuffer in the example. We could have passed it around as an argument rather than making it an instance variable, but then our readers would have had to interpret it each time they saw it. When you are reading the story told by the module, includeSetupPage() is easier to understand than includeSetupPageInto(newPageContent)

This doesn't eliminate complexity - it relocates it. Worse: moving parameters to fields increases the size and scope of the application's mutable state. It sacrifices global complexity to reduce local complexity. Tracking shared mutable state in multi-threaded environments is far harder than understanding function arguments.

Arguments are even harder from a testing point of view. Imagine the difficulty of writing all the test cases to ensure that all the various combinations of arguments work properly. If there are no arguments, this is trivial.

But this just shifts the burden. Method calls become simpler, but setting up test instances and tracking state becomes harder. Not a winning trade.

Ironically, functional programming exists specifically to limit mutable state, recognizing its cognitive cost. Martin clearly wasn't a fan at the time of writing Clean Code.

One input argument is the next best thing to no arguments. SetupTeardownIncluder.render(pageData) is pretty easy to understand. Clearly we are going to render the data in the pageData object.

Beating the same dead horse: render(pageData) might be easy to understand. SetupTeardownIncluder.render(pageData) still doesn't make sense.

Flag Arguments

"Flag arguments are ugly. Passing a boolean into a function is a truly terrible practice. It immediately complicates the signature of the method, loudly proclaiming that this function does more than one thing. It does one thing if the flag is true and another if the flag is false!"

Martin’s critique of flag arguments is front-loaded with emotion, but is it valid? Adding a boolean - or any parameter - is indeed a complication. More formally, since a boolean can take 2 states, adding one doubles the function's input domain.

Adding a boolean parameter to a function that already has 2 booleans grows the input domain from 4 to 8 - that might be significant. But adding a boolean argument to a function that had none before won't kick the complexity into "unmanageable" territory. It can be a tolerable increase.
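The arithmetic behind this "input domain" argument is just powers of two: a function with n independent boolean parameters has 2^n input combinations to reason about and test. A trivial illustration (my snippet, not from the book):

```java
public class DomainSpace {
    // Each independent boolean parameter doubles the number of input
    // combinations a caller (and a test suite) has to consider.
    static int combinations(int booleanParams) {
        return 1 << booleanParams; // 2^n
    }
}
```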

"Still, the method call render(true) is just plain confusing to a poor reader. Mousing over the call and seeing render(boolean isSuite) helps a little, but not that much. We should have split the function into two: renderForSuite() and renderForSingleTest()."

While render(true) is indeed unclear on a caller side, modern programming languages offer solutions, such as named parameters:

render(asSuite = true)   # costs nothing in runtime

The deeper problem with render(true) is the so-called boolean blindness:

"The problem is computing the bit in the first place. Having done so, you have blinded yourself by reducing the information you have at hand to a bit, and then trying to recover that information later by remembering the provenance of that bit."

from Boolean Blindness

A better solution: Use enums to preserve semantic meaning:

enum ExecutionUnit { 
    SingleTest, 
    Suite 
} 

//...
public String renderAs(ExecutionUnit executionUnit) { ... }

// ------

//calling side:
renderAs(ExecutionUnit.Suite);

It's unclear if Robert Martin likes enums.

The inherent, unavoidable complexity is that tests have 2 execution types: as a single test or as part of a suite. Splitting the render function "into two: renderForSuite() and renderForSingleTest()" does not reduce it (neither does using a boolean or an enum). There are still 2 types of execution, and somewhere in the code a decision has to be made to select one of the branches.
Please do not create an Abstract Factory for every boolean in your code.

Verbs and Keywords

"This last is an example of the keyword form of a function name. Using this form we encode the names of the arguments into the function name.

For example, assertEquals might be better written as

assertExpectedEqualsActual(expected, actual)
This strongly mitigates the problem of having to remember the ordering of the arguments."

The suggestion to encode the order of parameters in the name is not scalable - it works in isolation but degrades quickly with a real API, where you need all kinds of assertions:

  • assertExpectedIsGreaterOrEqualsThanActual
  • assertActualContainsAllTheSameElementsAsExpected

The best solution Java can offer is a fluent API: AssertJ

assertThat(frodo.getName()).isEqualTo("Frodo");

Outside java, this problem has other solutions:

  1. Named parameters - In languages like Python, named parameters eliminate ambiguity:
assertEquals(expected = something, actual = actual)
  2. Macros - In Rust, macros like assert_eq! automatically capture and display argument details, making order irrelevant:
#[test]
fn test_string_eq() {
    let expected = String::from("hello");
    let actual = String::from("world");

    assert_eq!(expected, actual);
    // Error message:

    // thread 'test_string_eq' panicked at:
    // assertion `left == right` failed
    //   left: "hello"
    //  right: "world"
    //   at src/main.rs:17:5
}

Have No Side Effects

Clean Code introduces side effects in somewhat casual terms:

"Side effects are lies. Your function promises to do one thing, but it also does other hidden things. Sometimes it will make unexpected changes to the variables of its own class. Sometimes it will make them to the parameters passed into the function or to system globals. In either case they are devious and damaging mistruths that often result in strange temporal couplings and order dependencies."

To make it more formal: a side effect is any operation that:

  • Modifies state outside the function's scope
  • Interacts with the external world (I/O, network, database)
  • Relies on non-deterministic behavior (e.g., random number generation, system clock)
  • Throws exceptions (which can alter the program's control flow in unexpected ways)

Pure functions—those without side effects—are easier to reason about, test, and reuse. Like mathematical functions, they produce the same outputs given the same inputs, with no hidden interactions.
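To make the distinction concrete, here is a minimal Java sketch (the names are mine, not from the book):

```java
public class Purity {
    private static long counter = 0;

    // Impure: reads and mutates state outside its own scope, so two calls
    // with the same (zero) arguments return different results.
    static long nextId() {
        return ++counter;
    }

    // Pure: the result depends only on the arguments; safe to test,
    // cache, and call from any thread.
    static long add(long a, long b) {
        return a + b;
    }
}
```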

Martin's example misses critical issues:

"Consider, for example, the seemingly innocuous function in Listing 3-6. This function uses a standard algorithm to match a userName to a password. It returns true if they match and false if anything goes wrong. But it also has a side effect. Can you spot it?"

public class UserValidator {
    private Cryptographer cryptographer;

    public boolean checkPassword(String userName, String password) {
        User user = UserGateway.findByName(userName);
        if (user != User.NULL) {
            String codedPhrase = user.getPhraseEncodedByPassword();
            String phrase = cryptographer.decrypt(codedPhrase, password);
            if ("Valid Password".equals(phrase)) {
                Session.initialize();
                return true;
            }
        }
        return false;
    }
}

"The side effect is the call to Session.initialize(), of course. The checkPassword function, by its name, says that it checks the password."

He focuses on Session.initialize() as the side effect, suggesting a rename to checkPasswordAndInitializeSession. But this misses deeper problems:

  1. UserGateway.findByName(userName) is likely a database call - another side effect. This creates temporal coupling: authentication fails if the database is unavailable.
  2. UserGateway is a singleton - a global implicit dependency.

This illustrates why formal understanding of side effects matters.

It also reveals a contradiction: Martin's advice to move input arguments to object fields creates side effects by definition - methods must mutate shared state.

In the rewrite suggested in Chapter 2, all new methods have a side effect - they mutate fields, i.e., modify "state outside the function's scope":

Original Code:
private void printGuessStatistics(char candidate,
                                  int count) {
    String number;
    String verb;
    String pluralModifier;
    if (count == 0) {
        number = "no";
        verb = "are";
        pluralModifier = "s";
    } else if (count == 1) {
        number = "1";
        verb = "is";
        pluralModifier = "";
    } else {
        number = Integer.toString(count);
        verb = "are";
        pluralModifier = "s";
    }
    String guessMessage = String.format(
        "There %s %s %s%s", verb, number, 
        candidate, pluralModifier
    );
    print(guessMessage); // interacts with the outside world - prints to STDOUT
}
Proposed rewrite:
public class GuessStatisticsMessage {
    private String number;
    private String verb;
    private String pluralModifier;
  
    public String make(char candidate, int count) {
        createPluralDependentMessageParts(count); // calling a function with side effects spreads those effects to the caller
        return String.format(
                "There %s %s %s%s", 
                verb, number, candidate, pluralModifier );
    }
  
    private void createPluralDependentMessageParts(int count) {
        if (count == 0) {
            thereAreNoLetters(); // calling a function with side effects spreads those effects to the caller
        } else if (count == 1) {
            thereIsOneLetter(); // same
        } else {
            thereAreManyLetters(count); // same
        }
    }
  
    private void thereAreManyLetters(int count) {
        number = Integer.toString(count); // modifies state outside the function's scope
        verb = "are"; // same
        pluralModifier = "s"; // same
    }
  
    private void thereIsOneLetter() {
        number = "1"; // modifies state outside the function's scope
        verb = "is"; // same
        pluralModifier = ""; // same
    }
  
    private void thereAreNoLetters() {
        number = "no"; // modifies state outside the function's scope
        verb = "are"; // same
        pluralModifier = "s"; // same
    }
}

The "improved" version spreads side effects everywhere:

  • Every helper method mutates shared state
  • Calling methods with side effects propagates those effects upward
  • The entire class becomes a web of interdependent state changes

This perfectly demonstrates why moving local variables to class fields often makes code harder to reason about, not easier.
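For contrast, here is a side-effect-free version of the same logic - a minimal sketch of my own, not from the book - where all data flows through parameters and the return value:

```java
public class GuessStatistics {
    // Pure: no fields, no mutation. Every helper concern collapses back
    // into local variables, and the result is simply returned.
    static String guessMessage(char candidate, int count) {
        final String number;
        final String verb;
        final String pluralModifier;
        if (count == 0) {
            number = "no";  verb = "are"; pluralModifier = "s";
        } else if (count == 1) {
            number = "1";   verb = "is";  pluralModifier = "";
        } else {
            number = Integer.toString(count); verb = "are"; pluralModifier = "s";
        }
        return String.format("There %s %s %s%s",
                verb, number, candidate, pluralModifier);
    }
}
```

The caller decides whether to print the message, so the I/O side effect stays at the edge of the program.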

Prefer Exceptions to Returning Error Codes

To illustrate the guideline, he starts with an example:


if (deletePage(page) == E_OK) {
    if (registry.deleteReference(page.name) == E_OK) {
        if (configKeys.deleteKey(page.name.makeKey()) == E_OK){
            logger.log("page deleted");
        } else {
            logger.log("configKey not deleted");
        }
    } else {
        logger.log("deleteReference from registry failed");
    }
} else {
    logger.log("delete failed");
    return E_ERROR;
}

Let's take a quick detour...

THESE 👏 ARE 👏 TERRIBLE 👏 LOG 👏 MESSAGES!

They lack any context to the degree of being useless.

Imagine debugging a production issue at 3 AM and seeing:

2024-01-01T02:45:00 - delete failed

Well, thank you dear sir cleancoder. Now I have everything I need!

Never write logging like this. Even as a joke. Good log messages must include:

  • The operation being performed
  • Input parameters
  • Outcome or reason for failure

Also, error handling must be consistent! The provided code returns E_ERROR only for the first failure; errors from deleteReference and deleteKey are essentially ignored.

Martin's "improved" version:


try {
    deletePage(page);
    registry.deleteReference(page.name);
    configKeys.deleteKey(page.name.makeKey());
} catch (Exception e) {
    logger.log(e.getMessage());
}

ALWAYS 👏 LOG 👏 STACK-TRACES!

At minimum:

logger.log(e.getMessage(), e);

or better yet:

logger.log("Got an error while deleting page: " + page, e);

This includes operation context, parameters, and stack traces 😍

But notice: the refactored version silently swallows ALL errors now.

public void delete(Page page) {
    try {
        deletePage(page);
        registry.deleteReference(page.name);
        configKeys.deleteKey(page.name.makeKey());
    } catch (Exception e) {
        logger.error("Got an error while deleting page: " + page, e);
    }
}

The operation always "succeeds", even when it fails. Almost always this is a design mistake.

Best Practices for Exception Handling:

  • Let Exceptions Propagate When Appropriate: If the code catching the exception doesn’t know how to handle it, it should let it propagate to a higher layer
  • Log-and-Throw When Necessary: If local context (e.g., the page object) is important for debugging and isn’t available in upper layers, it’s reasonable to log the error before re-throwing:
public void delete(Page page) throws Exception {
    try {
        deletePage(page);
        registry.deleteReference(page.name);
        configKeys.deleteKey(page.name.makeKey());
    } catch (Exception e) {
        logger.error("Got an error while deleting page: " + page, e);
        throw e;
    }
}

Extract Try/Catch Blocks

Martin suggests splitting the delete method:

Original Code:
public void delete(Page page) {
  try {
    deletePage(page);
    registry.deleteReference(page.name);
    configKeys.deleteKey(page.name.makeKey());
  } catch (Exception e) {
    logger.log(e.getMessage());
  }
}
Proposed rewrite:
public void delete(Page page) {
  try {
    deletePageAndAllReferences(page);
  } catch (Exception e) {
    logError(e);
  }
}
 
private void deletePageAndAllReferences(Page page) throws Exception {
  deletePage(page);
  registry.deleteReference(page.name);
  configKeys.deleteKey(page.name.makeKey());
}
 
private void logError(Exception e) {
  logger.log(e.getMessage());
}

He proposes rewriting 1 method with 8 lines of code into 3 methods with 16 lines of code (including whitespace). This is just code bloat.

Looking at the original delete, you could immediately grasp that it executes 3 deletions, that it silences errors, and that the logging is done incorrectly.

All of this information is now gone from the new delete method:

public void delete(Page page) {
  try {
    deletePageAndAllReferences(page);
  } catch (Exception e) {
    logError(e);
  }
}

Ok, maybe the error silencing is still noticeable. But you cannot know for sure without checking the logError implementation.

deletePageAndAllReferences is clearly not doing 1 thing only, is it?
I think the name is not descriptive enough; it should be deletePageAndAllReferencesAndPageKey. /s

Clumsy names are one of the smells indicating that something is wrong with the model or the abstraction.
I think in this case the code screams: "Don't butcher me, uncle Bob. I should exist and prosper as a single piece. Don't create useless methods just to satisfy an arbitrary rule that doesn't add any value."

"If it's hard to find a simple name for a variable or method that creates a clear image of the underlying object, that's a hint that the underlying object may not have a clean design."

from Philosophy Of Software Design

The extracted logging method makes things worse:

private void logError(Exception e) {
    logger.log(e.getMessage());
}

It is an example of how bad abstractions can do more harm than good. It fails to capture meaningful context or stack traces. The more widely this helper is used in the app, the harder the application becomes to manage.


Error Handling Approaches

Ok. Now let's talk about exceptions vs error codes.

Code needs a channel to communicate errors, and that channel needs to be separate from the channel for communicating normal results. Martin has avoided this discussion by using methods that have nothing to communicate via the normal channel.

Java’s exception handling is powerful and widely supported, offering features like:

  • Stack traces for debugging
  • Causality chaining for error contexts
  • Compile-time enforcement of error handling (checked exceptions)

Modern languages like Rust and Scala favor representing errors as values. This approach avoids the implicit flow control of exceptions by making errors explicit in function signatures.

For example, in Rust:

fn delete_page(page: &Page) -> Result<(), Error> {
    // Perform deletion logic
    Ok(())
}

Conclusion

Using exceptions instead of error codes is generally a good practice, but it’s not a silver bullet. Effective error handling requires thoughtful design:

  • Use exceptions to propagate errors, but don’t swallow them silently.
  • Log meaningful context along with stack traces.
  • Avoid refactors that bloat the code or obscure key operations.
  • Consider whether exceptions or an error-as-value approach best fits your use case.

Error handling isn’t just about avoiding failure—it’s about making failure clear, actionable, and easy to debug.

How Do You Write Functions Like This?

"When I write functions, they come out long and complicated. They have lots of indenting and nested loops. They have long argument lists. The names are arbitrary, and there is duplicated code."

"So then I massage and refine that code, splitting out functions, changing names, eliminating duplication. I shrink the methods and reorder them. Sometimes I break out whole classes, all the while keeping the tests passing. In the end, I wind up with functions that follow the rules I've laid down in this chapter. I don't write them that way to start. I don't think anyone could."

What Martin is advocating is a bottom-up approach to software development: first make it work, then make it beautiful ("clean", in Martin's worldview).

But this is not the only way to design software!

In my experience, top-down design often produces more elegant solutions:

  1. Start by imagining how you want the code to look
  2. Sketch it in pseudocode
  3. Gradually fill in the implementation details, making adjustments as needed while preserving the original design

A Concrete Example

One of my biggest pet peeves is that Decision Tables are underused in programming. They make complex logic immediately obvious and provide a perfect overview of the problem.

PAYMENT METHOD    |   DISCOUNT CODE   |    TAX STATUS    |    PRICE
--------------------------------------------------------------------
PayPal            |    CLEANCODE      |     NONEXEMPT    |    8.99
CreditCard        |      N/A          |     NONEXEMPT    |    9.99

Standard JUnit tests lose this clarity:

assertEquals(
    priceService.calculate(new Criteria()
        .paymentMethod(Payment.PayPal)
        .discountCode("CLEANCODE")
        .taxStatus(Tax.NonExempt)
   ),
    new BigDecimal("8.99")
);


assertEquals(
    priceService.calculate(new Criteria()
        .paymentMethod(Payment.CreditCard)
        .discountCode("")
        .taxStatus(Tax.NonExempt)
    ),
    new BigDecimal("9.99")
);

No amount of refactoring will restore the original table-like structure.

But if you start with a top-down approach, you can decide upfront that you want your test to look like a decision table. That might lead you to write code like this:

executeTests(
    "PAYMENT METHOD     |   DISCOUNT CODE   |    TAX STATUS    |    PRICE  \n" +
    "----------------------------------------------------------------------\n" + 
    " PayPal            |    CLEANCODE      |     NONEXEMPT    |    8.99   \n" + 
    " CreditCard        |      N/A          |     NONEXEMPT    |    9.99   \n" 
)

Now, the challenge becomes figuring out how to implement a parser and interpreter for executeTests.

Or you might take an approach that doesn't require parsing and design an internal DSL that still looks like a table:

testTable()
    .header (
    //------------------------------------------------------------------------------------
           PAYMENT_METHOD     , DISCOUNT_CODE, TAX_STATUS     ,     PRICE              ) 
    //------------------------------------------------------------------------------------
    .row( Payment.PayPal      , "CLEANCODE"  , Tax.NonExempt  , new BigDecimal("8.99") )
    .row( Payment.CreditCard  , ""           , Tax.NonExempt  , new BigDecimal("9.99") )
    //------------------------------------------------------------------------------------
    .executeTests()
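For illustration, here is one possible shape for such a builder. Everything below (the `testTable`/`row`/`executeTests` names, the `Row` record, the toy price function in the test) is hypothetical sketch code under the assumptions of the example above, not a real library:

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch of a table-shaped test DSL: rows are collected
// into a list and then executed against the system under test.
class TestTable {
    // One decision-table row: the inputs plus the expected price.
    record Row(String paymentMethod, String discountCode, String taxStatus,
               BigDecimal expectedPrice) {}

    private final List<Row> rows = new ArrayList<>();

    static TestTable testTable() { return new TestTable(); }

    TestTable row(String paymentMethod, String discountCode,
                  String taxStatus, BigDecimal expectedPrice) {
        rows.add(new Row(paymentMethod, discountCode, taxStatus, expectedPrice));
        return this;
    }

    // Runs every row through the price function and collects mismatches,
    // so a failing run reports exactly which table rows broke.
    List<Row> executeTests(Function<Row, BigDecimal> priceService) {
        List<Row> failures = new ArrayList<>();
        for (Row r : rows)
            if (priceService.apply(r).compareTo(r.expectedPrice()) != 0)
                failures.add(r);
        return failures;
    }
}
```

With a real price service you would pass an adapter lambda that converts a `Row` into the service's criteria object; the point is only that the table shape survives into the test code.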

Yes, these approaches require more upfront work. That's not the point.

The key insight: top-down design lets you control code structure from the start. You build supporting infrastructure to achieve your vision.

In contrast, starting with a messy, working solution and then refining it into something cleaner is more of a discovery process. When your refactoring toolkit consists mainly of extraction techniques (as Martin suggests), you often end up with suboptimal results.

Chapter 4: Comments

The chapter about comments is not inherently bad - it contains enough caveats and examples of bad vs good comments. However, Martin poisons the well from the start:

Comments are not like Schindler's List. They are not "pure good." Indeed, comments are, at best, a necessary evil. If our programming languages were expressive enough, or if we had the talent to subtly wield those languages to express our intent, we would not need comments very much—perhaps not at all.

The proper use of comments is to compensate for our failure to express ourself in code. Note that I used the word failure. I meant it. Comments are always failures. We must have them because we cannot always figure out how to express ourselves without them, but their use is not a cause for celebration.

So when you find yourself in a position where you need to write a comment, think it through and see whether there isn't some way to turn the tables and express yourself in code. Every time you express yourself in code, you should pat yourself on the back. Every time you write a comment, you should grimace and feel the failure of your ability of expression.

This is the book's second-worst idea: that comments are a failure to write good code.

I believe this is the main takeaway that developers get out of the chapter and the book. Declaring "comments are failure" has led developers to avoid a crucial tool for design, abstraction, and documentation. It has contributed to a culture where comments are undervalued and often ignored (notice how IDEs gray them out by default, even when they contain crucial design information or provide essential context).

Martin implicitly promotes "self-documenting code." But code alone can never provide all context. Even perfect code can only show what's there - not what's deliberately excluded or what was considered and rejected.

This anti-comment position drives verbose, over-descriptive naming:

private static int smallestOddNthMultipleNotLessThanCandidate(int candidate, int n) { }

Is smallestOddNthMultipleNotLessThanCandidate really a name, or just an inlined comment?

Names are part of the code, but they’re just one step removed from comments: the compiler doesn’t check or enforce their semantic meaning, so they can easily fall out of sync — just like comments.

private static int smallestOddNthMultipleNotLessThanCandidate(int candidate, int n) { 
    return Integer.MAX_VALUE; // I lied ┌∩┐(◣_◢)┌∩┐
}

Example: Prime Number Generator

Here's the original code:

/**
 * This class Generates prime numbers up to a user specified
 * maximum.  The algorithm used is the Sieve of Eratosthenes.
 * <p>
 * Eratosthenes of Cyrene, b. c. 276 BC, Cyrene, Libya --
 * d. c. 194, Alexandria.  The first man to calculate the
 * circumference of the Earth.  Also known for working on
 * calendars with leap years and ran the library at Alexandria.
 * <p>
 * The algorithm is quite simple.  Given an array of integers
 * starting at 2.  Cross out all multiples of 2.  Find the next
 * uncrossed integer, and cross out all of its multiples.
 * Repeat until you have passed the square root of the maximum
 * value.
 *
 * @author Alphonse
 * @version 13 Feb 2002 atp
 */
 public class GeneratePrimes {
     /**
     * @param maxValue is the generation limit.
     */
     public static int[] generatePrimes(int maxValue) {
     if (maxValue >= 2) // the only valid case
     {
         // declarations
         int s = maxValue + 1; // size of array
         boolean[] f = new boolean[s];
         int i;
         // initialize array to true.
         for (i = 0; i < s; i++)
             f[i] = true;

         // get rid of known non-primes
         f[0] = f[1] = false;

         // sieve
         int j;
         for (i = 2; i < Math.sqrt(s) + 1; i++) {
             if (f[i]) // if i is uncrossed, cross its multiples.
             {
                 for (j = 2 * i; j < s; j += i)
                     f[j] = false; // multiple is not prime
             }
         }

         // how many primes are there?
         int count = 0;
         for (i = 0; i < s; i++) {
             if (f[i])
                 count++; // bump count.
         }

         int[] primes = new int[count];

         // move the primes into the result
         for (i = 0, j = 0; i < s; i++) {
             if (f[i])  // if prime
                 primes[j++] = i;
         }

         return primes;  // return the primes
     } else // maxValue < 2
         return new int[0]; // return null array if bad input.
     }
 }

I can grant that this is pretty convoluted code, but honestly I'm not sure whether it was written this way or intentionally obfuscated.

The good parts: the messiness of the code is contained in a single place, and the API is safe and clear.

The comment about the Library of Alexandria is cute; I would also add a link to the Wikipedia page.

Martin's refactoring:


/**
 * This class Generates prime numbers up to a user specified
 * maximum.  The algorithm used is the Sieve of Eratosthenes.
 * Given an array of integers starting at 2:
 * Find the first uncrossed integer, and cross out all its
 * multiples.  Repeat until there are no more multiples
 * in the array.
 */
public class PrimeGenerator {
    private static boolean[] crossedOut;  // GLOBAL MUTABLE STATE! This is intermediate state
    private static int[] result;          // GLOBAL MUTABLE STATE! This is the final state of the computation

    public static int[] generatePrimes(int maxValue) {
        if (maxValue < 2)
            return new int[0];
        else {
            uncrossIntegersUpTo(maxValue);
            crossOutMultiples();
            putUncrossedIntegersIntoResult();
            return result;
        }
    }

    private static void uncrossIntegersUpTo(int maxValue) {
        crossedOut = new boolean[maxValue + 1];
        for (int i = 2; i < crossedOut.length; i++)
            crossedOut[i] = false;
    }

    private static void crossOutMultiples() {
        int limit = determineIterationLimit();
        for (int i = 2; i <= limit; i++)
            if (notCrossed(i))
                crossOutMultiplesOf(i);
    }

    private static int determineIterationLimit() {
        // Every multiple in the array has a prime factor that
        // is less than or equal to the root of the array size,
        // so we don't have to cross out multiples of numbers
        // larger than that root.
        double iterationLimit = Math.sqrt(crossedOut.length); // Nitpick, but why is this extraction needed?
        return (int) iterationLimit;
    }

    private static void crossOutMultiplesOf(int i) {
        for (int multiple = 2 * i; multiple < crossedOut.length; multiple += i)
            crossedOut[multiple] = true;
    }

    private static boolean notCrossed(int i) {
        return crossedOut[i] == false;
    }

    private static void putUncrossedIntegersIntoResult() {
        result = new int[numberOfUncrossedIntegers()];
        for (int j = 0, i = 2; i < crossedOut.length; i++)
            if (notCrossed(i))
                result[j++] = i;
    }

    private static int numberOfUncrossedIntegers() {
        int count = 0;
        for (int i = 2; i < crossedOut.length; i++)
            if (notCrossed(i))
                count++;
        return count;
    }
}

Right off the bat: the refactoring introduces serious thread-safety issues by using global mutable state (the static fields crossedOut and result).

The rule of thumb: if you see mutable static variables - RUN!

This code cannot be used as-is in a multi-threaded environment. You would have to remedy this with a global lock, so that at most one thread in the application can run the computation at a time.

Is there a reason to introduce this scalability limit and take the performance hit? No.
This is self-inflicted pain. There is zero justification for this design.

Second: are those small methods really necessary?

private static boolean notCrossed(int i) {
    return crossedOut[i] == false;
}

Compare: if (notCrossed(i))   vs   if (crossedOut[i] == false). Is the first option really more readable?
If anything, he has now polluted the domain with semi-similar words:

  • not crossed
  • crossed out
  • uncrossed

Is "notCrossed" the same thing as "not crossedOut"? Is it the same thing as "uncrossed"? One would have to look into the implementation of all those small methods to find out whether there is a difference.

This exemplifies the problem with too many small entities: human working memory holds roughly four to seven items. Flood it with pebbles, and you'll hit cognitive overload. By excessive method extraction, Martin introduced inconsistencies and confusion into the terminology.

Let's remove the obfuscation from the original method:

public class GeneratePrimes {
    public static int[] generatePrimes(int maxValue) {
        if (maxValue < 2) {  // first two numbers are not prime
            return new int[0];
        }

        boolean[] crossedOut = new boolean[maxValue + 1];
        Arrays.fill(crossedOut, 2, crossedOut.length, true); // first two numbers are not prime

        // the algorithm
        for (int i = 2; i < Math.sqrt(crossedOut.length) + 1; i++) {
            if (crossedOut[i]) {
                for (int j = 2 * i; j < crossedOut.length; j += i)
                    crossedOut[j] = false; // multiple is not prime
            }
        }

        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < crossedOut.length; i++) {
            if (crossedOut[i]) result.add(i);
        }
        return result.toArray(new int[result.size()]);
    }
}

Does this look complicated?

Formatting

Formatting is fine. Don't try to replicate newspapers in your code and you'll be fine.

Chapter 6: Objects and Data Structures

I don’t think this chapter has aged poorly; rather, it seems to have been misaligned from the start with the common understanding of data structures, the purpose of abstraction, and the principles of information hiding.

public interface Point {
    double getX();
    double getY();
    void setCartesian(double x, double y);
    double getR();
    double getTheta();
    void setPolar(double r, double theta);
}
The beautiful thing about Listing 6-2 is that there is no way you can tell whether the implementation is in rectangular or polar coordinates. It might be neither! And yet the interface still unmistakably represents a data structure.

There is nothing beautiful about this interface. An interface of just getters and setters is usually a design mistake. Interfaces should define a contract of behavior, not merely obfuscate the internal data structure. They should provide meaningful abstractions, not just hide implementation details.

The claim that there is no way to tell whether this is about rectangular or polar coordinates is not true. It's both, and it forces all implementations to be both.

Here's a cleaner design:

sealed interface Point permits Rectangular, Polar { }
record Rectangular(double x, double y) implements Point { }
record Polar(double r, double theta) implements Point { }

public Rectangular convert(Polar polar) {...}
public Polar convert(Rectangular rectangular) {...}

Martin would say that "This exposes implementation"

But the thing is that there is no implementation to be exposed: the Point is pure data, and its shape is essential to its meaning. In statically typed languages, explicitly communicating the shape of data through types is the goal.

Not every abstraction needs to hide its data structure - sometimes the data structure itself is the abstraction.
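To make this concrete, here is a sketch (assuming Java 21 record patterns; `Points.distanceFromOrigin` is a hypothetical helper) of why exposing the shape pays off: a switch over the sealed hierarchy is exhaustive, so the compiler rejects a missing coordinate system.

```java
// Sketch (Java 21): because Point's shape is visible, pattern matching over
// it is checked by the compiler - forget a case and the code won't build.
sealed interface Point permits Rectangular, Polar { }
record Rectangular(double x, double y) implements Point { }
record Polar(double r, double theta) implements Point { }

class Points {
    static double distanceFromOrigin(Point p) {
        return switch (p) {
            case Rectangular(double x, double y) -> Math.hypot(x, y);
            case Polar(double r, double theta) -> r; // the radius IS the distance
        };
    }
}
```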

Hiding implementation is not just a matter of putting a layer of functions between the variables. Hiding implementation is about abstractions!

Exactly! But introducing unnecessary abstractions is over-engineering. As Edsger Dijkstra noted:

"The purpose of abstraction is not to be vague, but to create a new semantic level in which one can be absolutely precise." - Dijkstra 1972

Martin's Point interface adds vagueness, not precision. It combines two coordinate systems without clear purpose. A good abstraction adds precision and reduces ambiguity—not the other way around.

Consider Listing 6-3 and Listing 6-4. The first uses concrete terms to communicate the fuel level of a vehicle, whereas the second does so with the abstraction of percentage.

Listing 6-3 Concrete Vehicle

public interface Vehicle {
  double getFuelTankCapacityInGallons();
  double getGallonsOfGasoline();
}

Listing 6-4 Abstract Vehicle

public interface Vehicle {
  double getPercentFuelRemaining();
}

In both of the above cases the second option is preferable. We do not want to expose the details of our data. Rather we want to express our data in abstract terms.

This is not hiding "details of our data", this is data hiding! Depending on the context, this might be good or bad. For example, with the second option, it becomes impossible to calculate the cost of filling the tank. Are we designing for scammy car rentals? If so, then sure, hide the data.

The assumption that hiding data is always good is simply wrong.
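A sketch of that trade-off (the interface methods are from the book's Listing 6-3; `Car` and `FuelCost` are hypothetical illustration code): with only a percentage, the quantity of missing fuel has no unit, so the refill-cost calculation is simply impossible.

```java
// The "concrete" interface from Listing 6-3 keeps the data; the computation
// below is impossible against the "abstract" percentage-only version.
interface ConcreteVehicle {
    double getFuelTankCapacityInGallons();
    double getGallonsOfGasoline();
}

// Hypothetical trivial implementation for demonstration.
class Car implements ConcreteVehicle {
    private final double capacity, gallons;
    Car(double capacity, double gallons) { this.capacity = capacity; this.gallons = gallons; }
    public double getFuelTankCapacityInGallons() { return capacity; }
    public double getGallonsOfGasoline() { return gallons; }
}

class FuelCost {
    // Needs gallons, not a percentage: a percentage of an unknown capacity
    // cannot be converted into a dollar amount.
    static double refillCost(ConcreteVehicle v, double pricePerGallon) {
        double missingGallons = v.getFuelTankCapacityInGallons() - v.getGallonsOfGasoline();
        return missingGallons * pricePerGallon;
    }
}
```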

Ineffective abstraction occurs when essential knowledge is removed, or when non-essential knowledge is unnecessarily exposed - or both.

Data/Object Anti-Symmetry

This chapter touches on the expression problem without ever naming it. It briefly mentions the Visitor pattern as a solution.

So far so good. However, Martin conflates dumb data objects with data structures, which is unfortunate because the name "data structure" is reserved for objects that indeed have specialized structure for the data: lists, trees, stacks, etc.

Let's be precise about three types:

  • Data objects (or Data transfer objects) - dumb containers for data
  • Stateless objects (or effectively stateless objects with dependencies) - behavior-only services
  • Stateful objects: Objects that encapsulate both state and behavior, often used for data structures.

Modern design, even in Java, prefers separating stateless services from data objects. Stateful objects are mainly for true data structures.

In Martin's terms:

  • Data Structures = Dumb Objects
  • True Objects = Stateless and Stateful objects.

He correctly criticizes hybrid objects (mixing data and behavior) - except for actual data structures. A Stack interface is legitimately hybrid:

interface Stack<T> {
   Optional<T> peek();
   Optional<T> pop(); 
   void push(T item); 
}

But it's a good one.
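A minimal sketch of an implementation (the interface is repeated for self-containment; `ArrayDeque` is just one possible backing store): the internal representation stays hidden, but the object's entire purpose is still the data it holds.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Optional;

interface Stack<T> {
    Optional<T> peek();
    Optional<T> pop();
    void push(T item);
}

// One possible implementation: state (the deque) and behavior live together,
// which is exactly right for a genuine data structure.
class ArrayStack<T> implements Stack<T> {
    private final Deque<T> items = new ArrayDeque<>();

    public Optional<T> peek() { return Optional.ofNullable(items.peekFirst()); }
    public Optional<T> pop()  { return Optional.ofNullable(items.pollFirst()); }
    public void push(T item)  { items.addFirst(item); }
}
```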

Law of Demeter

There is a well-known heuristic called the Law of Demeter that says a module should not know about the innards of the objects it manipulates. As we saw in the last section, objects hide their data and expose operations. This means that an object should not expose its internal structure through accessors because to do so is to expose, rather than to hide, its internal structure.

This law is from 1987. It might be a beneficial rule if the code base consists mostly of hybrid objects. Today's separation of stateless services from data objects makes this law less relevant.

Instead of the somewhat convoluted rules of the Law of Demeter, consider this:

  • Service objects should not expose their collaborators or dependencies.
  • Data objects can—and often should—be open books.
  • Data structures are more flexible; the exposure of internal details depends on the specific goals and design of the data structure.
    (For example, an intrusive linked list deliberately exposes internal pointers to optimize performance)
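A sketch of the distinction (all names hypothetical): chaining through open-book data objects is harmless, while the real Demeter smell is a service leaking its collaborators.

```java
// Data objects: records with no behavior. Reaching through them is fine -
// there is nothing hidden to violate.
record Address(String city) {}
record Customer(String name, Address address) {}
record Order(Customer customer) {}

class Report {
    static String shippingCity(Order order) {
        // A "train wreck" by Demeter's letter, but harmless: pure data access.
        return order.customer().address().city();
    }
}
// By contrast, something like service.getConnectionPool().getConnection()
// would leak a service's internals - that is what the rules above guard against.
```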

Chapter 7: Error Handling

The chapter is written by Michael Feathers.

I wish the chapter put far more emphasis on how important error handling is. A significant portion of software bugs and catastrophic failures are caused by improper error handling. As highlighted in a study by Ding Yuan et al. (2014 USENIX OSDI):

We found the majority of catastrophic failures could easily have been prevented by performing simple testing on error handling code – the last line of defense – even without an understanding of the software design

from Simple Testing Can Prevent Most Critical Failures

Use Exceptions Rather Than Return Codes

This reiterates advice from "Prefer Exceptions to Returning Error Codes", but provides an improved example with better logging practices:

public class DeviceController {
    public void sendShutDown() {
        DeviceHandle handle = getHandle(DEV1);
        // Check the state of the device
        if (handle != DeviceHandle.INVALID) {
            // Save the device status to the record field
            retrieveDeviceRecord(handle);
            // If not suspended, shut down
            if (record.getStatus() != DEVICE_SUSPENDED) {
                pauseDevice(handle);
                clearDeviceWorkQueue(handle);
                closeDevice(handle);
            } else {
                logger.log("Device suspended. Unable to shut down");
            }
        } else {
            logger.log("Invalid handle for: " + DEV1.toString());
        }
    }
}

// proposed to be rewritten to:

public class DeviceController {
//...
    public void sendShutDown() {
        try {
            tryToShutDown();
        } catch (DeviceShutDownError e) {
            logger.log(e);
        }
    }

    private void tryToShutDown() throws DeviceShutDownError {
        DeviceHandle handle = getHandle(DEV1);
        DeviceRecord record = retrieveDeviceRecord(handle);

        pauseDevice(handle);
        clearDeviceWorkQueue(handle);
        closeDevice(handle);
    }

    private DeviceHandle getHandle(DeviceID id) {
        //...
        throw new DeviceShutDownError("Invalid handle for: " + id.toString());
        //...
    }
//...
}

Why Logging Matters:

Proper logging is a cornerstone of effective error handling. It's not a minor point: it's what distinguishes maintainable software from unmanageable software.

The book lists down-sides of error codes:

  • they clutter the call site; the client code needs to check for errors immediately
  • it's easy to forget to handle error values
  • logic is obscured by error handling

All of this is true for a typical Java business app. But for critical systems, error handling should be built right into the main flow.

In languages like Go, the "need to check immediately" point is considered a benefit: error handling is somewhat enforced and is always expected to be visible in the code.

Use Unchecked Exceptions

Unchecked exceptions have won. In Java, checked exceptions have only gotten worse, especially since lambdas make them an even more unpleasant source of clutter.

userIds.stream().map(dao::findById).collect(Collectors.toList())

vs

userIds.stream().map(id -> {
    try {
        return dao.findById(id);
    } catch(SQLException e) {
        throw new RuntimeException(e);
    }
}).collect(Collectors.toList())

The former version was already verbose, but checked exceptions make it a true eyesore. (Compare with the Scala version: userIds.map(dao.findById))

However, the combination of

"do not use errors as return values" + "use unchecked exceptions"

means that error handling is pushed outside of compiler and type-system control. Errors and error handling won't be type-safe, and the compiler will not help with checking this aspect of the code.

I don't agree with the blanket statement to always use unchecked exceptions.

Ideally we want to be able to tell if a function can fail or not:

public Long sum(List<Long> list);

public BigDecimal sumSalaries(List<Long> employeeIds) throws SQLException;
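One way to keep failure inside the type system without checked exceptions is a sealed result type. A sketch (assumes Java 17+; `Result` and `Salaries` are hypothetical names, not book code):

```java
import java.util.List;

// Failure is part of the return type: callers must confront both cases,
// and sealed + records make the branches compiler-checked.
sealed interface Result<T> permits Result.Ok, Result.Err {
    record Ok<T>(T value) implements Result<T> {}
    record Err<T>(String message) implements Result<T> {}
}

class Salaries {
    static Result<Long> sum(List<Long> xs) {
        long total = 0;
        for (Long x : xs) {
            if (x == null) return new Result.Err<>("null element");
            total += x;
        }
        return new Result.Ok<>(total);
    }
}
```

Unlike an unchecked exception, the failure case can't be silently ignored: the only way to get at the value is to handle both branches.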

The big problem with checked exceptions is that they leak implementation details:

interface UserDAO {
    Optional<User> findById(String username) throws SQLException;
}

UserDAO is an abstraction that should hide persistence details, and switching storage engines should be transparent to the upper layers. Plus, only the DAO layer can make a decision based on the SQLException level; upper layers most likely don't have enough context.

The enterprise-grade solution is to have layered exceptions:

class DAOException extends Exception { }

interface UserDAO {
    Optional<User> findById(String username) throws DAOException;
}

class ServiceException extends Exception { }

class UserManager {
    private UserDAO userDAO;
    
    public boolean registerNewUser(String username, String password) throws ServiceException {
        try {
            if (userDAO.findById(username).isPresent()) {
                return false;
            } 
            //....
        } catch(RuntimeException e) {
            throw e;
        } catch (Exception e) {
            throw new ServiceException(e);
        }
    }
}

This is a logically consistent approach, but it requires a lot of boilerplate and ceremony. Depending on the application type, this might or might not be justified.

Don't pass null / Don't return null

This is advice that is hard to disagree with. "null" is the one true magic value:

jshell> null instanceof String
$1 ==> false

jshell> (String) null
$2 ==> null

jshell> String str = null;
str ==> null

Despite null not being a subtype of String, it can be cast to String without errors and even assigned without a cast. Magic!

null and unchecked exceptions are almost ignored by type system. That's what makes them easy to use and hard to handle.

You can pretty much always introduce unsafe (and logically backward-incompatible) changes without the type system making a peep about it:

public Long sum(List<Long> list) {
   long sum = 0;
   for (Long i : list) {
       sum += i;
   }
   return sum;
}

public Long sum(List<Long> list) {
   if (Math.random() < 0.5) {
       return null; // lol
   }
   if (Math.random() < 0.5) {
       throw new RuntimeException("OMEGA LOL");
   }
   long sum = 0;
   for (Long i : list) {
       sum += i;
   }
   return sum;
}

Chapter 9: Unit Tests

Martin claims:

What this team did not realize was that having dirty tests is equivalent to, if not worse than, having no tests.

This lacks crucial nuance: messy tests that actually test software are better than no tests.

To repeat my example of when that is true: Oracle Database, an unimaginable horror! You can't change a single line of code in the product without breaking thousands of existing tests.
Yet Oracle Database is very reliable software (as of 2024). That reliability comes at the cost of thousands of people suffering through the setup, but as a customer I enjoy its robustness.

However, the proliferation of mocking frameworks has led to a situation where developers spend time tweaking mock expectations and then testing the mocks. Those tests are indeed often messy and often useless.

The chapter highlights two competing approaches through refactoring examples.

First, it presents an example of a refactoring that I mostly agree with:

Original Code:
public void testGetPageHieratchyAsXml() throws Exception {
    crawler.addPage(root, PathParser.parse("PageOne"));
    crawler.addPage(root, PathParser.parse("PageOne.ChildOne"));
    crawler.addPage(root, PathParser.parse("PageTwo"));
     
    request.setResource("root");
    request.addInput("type", "pages");
    Responder responder = new SerializedPageResponder();
    SimpleResponse response =
        (SimpleResponse) responder.makeResponse(
            new FitNesseContext(root), request);
    String xml = response.getContent();
     
    assertEquals("text/xml", response.getContentType());
    assertSubString("PageOne", xml);
    assertSubString("PageTwo", xml);
    assertSubString("ChildOne", xml);
}
 
public void testGetPage_cLinks() throws Exception {
    WikiPage pageOne = 
    crawler.addPage(root, PathParser.parse("PageOne"));
    crawler.addPage(root, PathParser.parse("PageOne.ChildOne"));
    crawler.addPage(root, PathParser.parse("PageTwo"));
     
    PageData data = pageOne.getData();
    WikiPageProperties properties = data.getProperties();
    WikiPageProperty symLinks = 
        properties.set(SymbolicPage.PROPERTY_NAME);
    symLinks.set("SymPage", "PageTwo");
    pageOne.commit(data);
     
    request.setResource("root");
    request.addInput("type", "pages");
    Responder responder = new SerializedPageResponder();
    SimpleResponse response =
        (SimpleResponse) responder.makeResponse(
            new FitNesseContext(root), request);
    String xml = response.getContent();
     
    assertEquals("text/xml", response.getContentType());
    assertSubString("PageOne", xml);
    assertSubString("PageTwo", xml);
    assertSubString("ChildOne", xml);
    assertNotSubString("SymPage", xml);
}
 
public void testGetDataAsHtml() throws Exception {
    crawler.addPage(root, 
    PathParser.parse("TestPageOne"), "test page");
     
    request.setResource("TestPageOne"); 
    request.addInput("type", "data");
    Responder responder = new SerializedPageResponder();
    SimpleResponse response =
        (SimpleResponse) responder.makeResponse(
            new FitNesseContext(root), request);
    String xml = response.getContent();
     
    assertEquals("text/xml", response.getContentType());
    assertSubString("test page", xml);
    assertSubString(">Test", xml);
}
Proposed rewrite:
public void testGetPageHierarchyAsXml() throws Exception {
    makePages("PageOne", "PageOne.ChildOne", "PageTwo");
     
    submitRequest("root", "type:pages");
     
    assertResponseIsXML();
    assertResponseContains(
        "PageOne", "PageTwo", "ChildOne"
    );
}
 
public void testGetPage_cLinks() throws Exception {
    WikiPage page = makePage("PageOne");
    makePages("PageOne.ChildOne", "PageTwo");
     
    addLinkTo(page, "PageTwo", "SymPage");
     
    submitRequest("root", "type:pages");
     
    assertResponseIsXML();
    assertResponseContains(
        "PageOne", "PageTwo",
        "ChildOne"
    );
    assertResponseDoesNotContain("SymPage");
}
 
public void testGetDataAsXml() throws Exception {
    makePageWithContent("TestPageOne", "test page");
 
    submitRequest("TestPageOne", "type:data");
 
    assertResponseIsXML();
    assertResponseContains("test page", ">Test");
}

The good:

       - the introduced abstractions are useful and reusable

The bad:

       - Martin introduces global mutable state (global in terms of the test suite)

A big drawback of this global mutable state: the tests can no longer run in parallel, so executing these three tests takes roughly 3x as long.

Another case of "'Clean' Code, Horrible Performance".

This is self-inflicted harm from the painful idea that zero parameters are always better than one or more.

Fixing it:

public void testGetDataAsXml() throws Exception {
    var page = makePageWithContent("TestPageOne", "test page");

    var response = submitRequest(page, "type:data");

    assertIsXML(response);
    assertContains(response, "test page", ">Test");
}

We get the BUILD-OPERATE-CHECK pattern without hidden state. Tests are isolated and can run in parallel.


The first example in this chapter shows how adding domain-specific details can improve readability. The second example shows how domain-specific abstractions can go wrong:

@Test
public void turnOnLoTempAlarmAtThreashold() throws Exception {
    hw.setTemp(WAY_TOO_COLD);
    controller.tic();
    assertTrue(hw.heaterState());
    assertTrue(hw.blowerState());
    assertFalse(hw.coolerState());
    assertFalse(hw.hiTempAlarm());
    assertTrue(hw.loTempAlarm());
}

is proposed to be rewritten as:

@Test
public void turnOnLoTempAlarmAtThreshold() throws Exception {
    wayTooCold();
    assertEquals("HBchL", hw.getState());
}

// Upper case means "on," lower case means "off," and the letters are always in the following order: 
// {heater, blower, cooler, hi-temp-alarm, lo-temp-alarm}

The mini-DSL makes things harder to read, not easier. The "HBchL" encoding requires extra mental effort to decode, which defeats the purpose of making the test more readable.

Why not "heater:on, blower:on, cooler:off, hi-temp-alarm:off, lo-temp-alarm:on" ?

wayTooCold(); is also very weird grammar. Is it a verb or a verb phrase? And why do we need to hide controller.tic()?

In the BUILD-OPERATE-CHECK pattern, controller.tic() is the OPERATE!

wayTooCold();
assertEquals("HBchL", hw.getState());

This is not BUILD-OPERATE-CHECK. This is WHY-WTF.

A more natural approach:

@Test
public void turnOnLoTempAlarmAtThreashold() throws Exception {
    hw.setTempF(10); // too cold

    controller.tic();

    assertEquals(
        "heater:on, blower:on, cooler:off, hi-temp-alarm:off, lo-temp-alarm:on",
        hw.getState()
    );
}

Again, in modern languages like Scala:

@Test
def turnOnLoTempAlarmAtThreashold() {
    hw.setTemp(10.F) // too cold

    controller.tic()

    assertEquals(
        Status(heaterOn = true, blowerOn = true, coolerOn = false, 
            hiTempAlarm = false, loTempAlarm = true
        ),
        hw.getState()
    )
}

No need for mini-DSLs, the language itself is expressive enough to keep things clean and clear.
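The same works in modern Java: a record gives structural equality for free, so the assertion reads like the specification. A sketch (`HvacState` is a hypothetical name):

```java
// A plain record as the expected-state value object: no mini-DSL, no string
// encoding. equals() and toString() are generated from the components, so a
// failed assertion prints every field by name.
record HvacState(boolean heaterOn, boolean blowerOn, boolean coolerOn,
                 boolean hiTempAlarm, boolean loTempAlarm) { }

// In a test:
// assertEquals(new HvacState(true, true, false, false, true), hw.getState());
```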

Final nitpick: Test Performance Matters

The getState function is shown in Listing 9-6. Notice that this is not very efficient code. To make it efficient, I probably should have used a StringBuffer.
public String getState() {
    String state = "";
    state += heater ? "H" : "h";
    state += blower ? "B" : "b";
    state += cooler ? "C" : "c";
    state += hiTempAlarm ? "H" : "h";
    state += loTempAlarm ? "L" : "l";
    return state;
}

StringBuffers are a bit ugly.

Not only are StringBuffers ugly, they're also slow (here the book shows its age): StringBuffer is synchronized for multi-threaded access, adding unnecessary overhead. Fortunately, the modern javac compiler can optimize a slightly modified version of the getState method to use the most optimal strategy:

Java Code:
public String getState() {
  return (heater ? "H" : "h") + 
          (blower ? "B" : "b") +
          (cooler ? "C" : "c") +
          (hiTempAlarm ? "H" : "h") +
          (loTempAlarm ? "L" : "l");
 }
Decompiled with javap (jdk 21):
 public java.lang.String getState();
  descriptor: ()Ljava/lang/String;
  flags: (0x0001) ACC_PUBLIC
  Code:
    stack=5, locals=1, args_size=1
       0: aload_0
       1: getfield      #7                  // Field heater:Z
       4: ifeq          12
       7: ldc           #13                 // String H
       9: goto          14
      12: ldc           #15                 // String h
      14: aload_0
      15: getfield      #17                 // Field blower:Z
      18: ifeq          26
      21: ldc           #20                 // String B
      23: goto          28
      26: ldc           #22                 // String b
      28: aload_0
      29: getfield      #24                 // Field cooler:Z
      32: ifeq          40
      35: ldc           #27                 // String C
      37: goto          42
      40: ldc           #29                 // String c
      42: aload_0
      43: getfield      #31                 // Field hiTempAlarm:Z
      46: ifeq          54
      49: ldc           #13                 // String H
      51: goto          56
      54: ldc           #15                 // String h
      56: aload_0
      57: getfield      #34                 // Field loTempAlarm:Z
      60: ifeq          68
      63: ldc           #37                 // String L
      65: goto          70
      68: ldc           #39                 // String l
      70: invokedynamic #41,  0             // InvokeDynamic #0:makeConcatWithConstants: (Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;)Ljava/lang/String;
      75: areturn
        
There are things that you might never do in a production environment that are perfectly fine in a test environment. Usually they involve issues of memory or CPU efficiency. But they never involve issues of cleanliness.

No. Test performance matters. Especially at scale.

Slow tests can and will kill development speed. Ignoring test performance in a large codebase means longer CI/CD cycles, slower iteration, stagnation, suffering, and death 💀

Chapter 10: Classes

Written with Jeff Langr, this chapter reveals a fundamental contradiction in Clean Code's approach:

  • Classes should have minimal responsibilities (ideally one)
  • Classes should have few instance variables
  • Methods should have few parameters

See the problem? Reducing method parameters forces state into instance variables. Keeping instance variables minimal requires creating more classes. Martin's solution? More classes are better:

So breaking a large function into many smaller functions often gives us the opportunity to split several smaller classes out as well. This gives our program a much better organization and a more transparent structure.

And then he does his favorite trick: shows obfuscated code (admittedly machine-generated) and refactors it using all his favorite hits:

  • Classes with global mutable state
  • Parameters moved to instance fields
  • Tiny single-use private methods
  • Essay-length names
public class PrintPrimes {
     public static void main(String[] args) {
       final int M = 1000;
       final int RR = 50;
       final int CC = 4;
       final int WW = 10;
       final int ORDMAX = 30;
       int P[] = new int[M + 1];
       int PAGENUMBER;
       int PAGEOFFSET;
       int ROWOFFSET;
       int C;

       int J;
       int K;
       boolean JPRIME;
       int ORD;
       int SQUARE;
       int N;
       int MULT[] = new int[ORDMAX + 1];

       J = 1;
       K = 1;
       P[1] = 2;
       ORD = 2;
       SQUARE = 9;

       while (K < M) {
         do {
           J = J + 2;
           if (J == SQUARE) {
             ORD = ORD + 1;
             SQUARE = P[ORD] * P[ORD];
             MULT[ORD - 1] = J;
           }
           N = 2;
           JPRIME = true;
           while (N < ORD && JPRIME) {
             while (MULT[N] < J)
               MULT[N] = MULT[N] + P[N] + P[N];
             if (MULT[N] == J)
               JPRIME = false;
             N = N + 1;
          }
        } while (!JPRIME);
        K = K + 1;
        P[K] = J;
     }
     {
        PAGENUMBER = 1;
        PAGEOFFSET = 1;
        while (PAGEOFFSET <= M) {
          System.out.println("The First " + M +
                                     " Prime Numbers --- Page " + PAGENUMBER);
          System.out.println("");
          for (ROWOFFSET = PAGEOFFSET; ROWOFFSET < PAGEOFFSET + RR; ROWOFFSET++){
             for (C = 0; C < CC;C++)
             if (ROWOFFSET + C * RR <= M)
               System.out.format("%10d", P[ROWOFFSET + C * RR]);
             System.out.println("");
          }
          System.out.println("\f");
          PAGENUMBER = PAGENUMBER + 1;
          PAGEOFFSET = PAGEOFFSET + RR * CC;
      }
     }
    }
}

is refactored to:

package literatePrimes;

public class PrimePrinter {
    public static void main(String[] args) {
        final int NUMBER_OF_PRIMES = 1000;
        int[] primes = PrimeGenerator.generate(NUMBER_OF_PRIMES);

        final int ROWS_PER_PAGE = 50;
        final int COLUMNS_PER_PAGE = 4;
        RowColumnPagePrinter tablePrinter = 
            new RowColumnPagePrinter(ROWS_PER_PAGE,
                                    COLUMNS_PER_PAGE,
                                    "The First " + NUMBER_OF_PRIMES +
                                            " Prime Numbers");
        tablePrinter.print(primes);
    }
}

package literatePrimes;

import java.io.PrintStream;

public class RowColumnPagePrinter {
    private int rowsPerPage;
    private int columnsPerPage;
    private int numbersPerPage;
    private String pageHeader;
    private PrintStream printStream;

    public RowColumnPagePrinter(int rowsPerPage,
                                int columnsPerPage,
                                String pageHeader) {
        this.rowsPerPage = rowsPerPage;
        this.columnsPerPage = columnsPerPage;
        this.pageHeader = pageHeader;
        numbersPerPage = rowsPerPage * columnsPerPage;
        printStream = System.out;
    }

    public void print(int data[]) {
        int pageNumber = 1;
        for (int firstIndexOnPage = 0;
             firstIndexOnPage < data.length;
             firstIndexOnPage += numbersPerPage) {
            int lastIndexOnPage =
                Math.min(firstIndexOnPage + numbersPerPage - 1,
                         data.length - 1);
            printPageHeader(pageHeader, pageNumber);
            printPage(firstIndexOnPage, lastIndexOnPage, data);
            printStream.println("\f");
            pageNumber++;
        }
    }

    private void printPage(int firstIndexOnPage,
                          int lastIndexOnPage,
                          int[] data) {
        int firstIndexOfLastRowOnPage =
            firstIndexOnPage + rowsPerPage - 1;
        for (int firstIndexInRow = firstIndexOnPage;
             firstIndexInRow <= firstIndexOfLastRowOnPage;
             firstIndexInRow++) {
            printRow(firstIndexInRow, lastIndexOnPage, data);
            printStream.println("");
        }
    }

    private void printRow(int firstIndexInRow,
                          int lastIndexOnPage,
                          int[] data) {
        for (int column = 0; column < columnsPerPage; column++) {
            int index = firstIndexInRow + column * rowsPerPage;
            if (index <= lastIndexOnPage)
                printStream.format("%10d", data[index]);
        }
    }

    private void printPageHeader(String pageHeader,
                                int pageNumber) {
        printStream.println(pageHeader + " --- Page " + pageNumber);
        printStream.println("");
    }

    public void setOutput(PrintStream printStream) {
        this.printStream = printStream;
    }
}

package literatePrimes;

import java.util.ArrayList;

public class PrimeGenerator {
    private static int[] primes;
    private static ArrayList<Integer> multiplesOfPrimeFactors;

    protected static int[] generate(int n) {
        primes = new int[n];
        multiplesOfPrimeFactors = new ArrayList<Integer>();
        set2AsFirstPrime();
        checkOddNumbersForSubsequentPrimes();
        return primes;
    }

    private static void set2AsFirstPrime() {
        primes[0] = 2;
        multiplesOfPrimeFactors.add(2);
    }

    private static void checkOddNumbersForSubsequentPrimes() {
        int primeIndex = 1;
        for (int candidate = 3;
             primeIndex < primes.length;
             candidate += 2) {
            if (isPrime(candidate))
                primes[primeIndex++] = candidate;
        }
    }

    private static boolean isPrime(int candidate) {
        if (isLeastRelevantMultipleOfNextLargerPrimeFactor(candidate)) {
            multiplesOfPrimeFactors.add(candidate);
            return false;
        }
        return isNotMultipleOfAnyPreviousPrimeFactor(candidate);
    }

    private static boolean
    isLeastRelevantMultipleOfNextLargerPrimeFactor(int candidate) {
        int nextLargerPrimeFactor = primes[multiplesOfPrimeFactors.size()];
        int leastRelevantMultiple = nextLargerPrimeFactor * nextLargerPrimeFactor;
        return candidate == leastRelevantMultiple;
    }

    private static boolean
    isNotMultipleOfAnyPreviousPrimeFactor(int candidate) {
        for (int n = 1; n < multiplesOfPrimeFactors.size(); n++) {
            if (isMultipleOfNthPrimeFactor(candidate, n))
                return false;
        }
        return true;
    }

    private static boolean
    isMultipleOfNthPrimeFactor(int candidate, int n) {
        return
            candidate == smallestOddNthMultipleNotLessThanCandidate(candidate, n);
    }

    private static int
    smallestOddNthMultipleNotLessThanCandidate(int candidate, int n) {
        int multiple = multiplesOfPrimeFactors.get(n);
        while (multiple < candidate)
            multiple += 2 * primes[n];
        multiplesOfPrimeFactors.set(n, multiple);
        return multiple;
    }
}

Martin celebrates the growth:

The first thing you might notice is that the program got a lot longer. It went from a little over one page to nearly three pages in length. There are several reasons for this growth. First, the refactored program uses longer, more descriptive variable names. Second, the refactored program uses function and class declarations as a way to add commentary to the code. Third, we used whitespace and formatting techniques to keep the program readable.

The real reason for the bloat: excessive method and class extraction. What should be three clear methods (main, generate primes, format output) becomes a sprawling hierarchy of tiny classes.

"Notice how the program has been split into three main responsibilities."

The original solution, though ugly, was contained. The refactored version is glutted and bloated, scattered across multiple files, and still ugly. ¯\_(ツ)_/¯
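For scale: the "three clear methods" shape suggested earlier fits in a single class at roughly the original's length. A sketch, not code from the book (the sieve here is plain trial division, chosen for clarity over speed):

```java
// Sketch: main wires things together, generatePrimes computes,
// printTable formats. No instance state, no extra classes.
class PrintPrimes {
    public static void main(String[] args) {
        int[] primes = generatePrimes(1000);
        printTable(primes, 50, 4, "The First 1000 Prime Numbers");
    }

    // Trial division against previously found primes up to sqrt(candidate).
    static int[] generatePrimes(int count) {
        int[] primes = new int[count];
        int found = 0;
        for (int candidate = 2; found < count; candidate++) {
            boolean isPrime = true;
            for (int i = 0; i < found && primes[i] * primes[i] <= candidate; i++) {
                if (candidate % primes[i] == 0) {
                    isPrime = false;
                    break;
                }
            }
            if (isPrime)
                primes[found++] = candidate;
        }
        return primes;
    }

    // Column-major paging, same layout as the book's example.
    static void printTable(int[] numbers, int rowsPerPage, int columnsPerPage, String header) {
        int perPage = rowsPerPage * columnsPerPage;
        for (int page = 0; page * perPage < numbers.length; page++) {
            System.out.println(header + " --- Page " + (page + 1) + "\n");
            for (int row = 0; row < rowsPerPage; row++) {
                for (int column = 0; column < columnsPerPage; column++) {
                    int index = page * perPage + row + column * rowsPerPage;
                    if (index < numbers.length)
                        System.out.format("%10d", numbers[index]);
                }
                System.out.println();
            }
            System.out.println("\f");
        }
    }
}
```

Whether this is "cleaner" than three classes is exactly the argument; but it is contained, and each method reads top to bottom.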

Organizing for Change

For most systems, change is continual. Every change subjects us to the risk that the remainder of the system no longer works as intended. In a clean system we organize our classes so as to reduce the risk of change.

I think our best defense against risky changes is tests. Code structure can erode silently - tests fail loudly.

I agree that it's important for code structure to guide and accelerate future development, but it's not about risk.

In this chapter Martin advocates for three components of his future SOLID framework:

  • Single Responsibility
  • Open/Closed
  • Dependency Injection

I don't mind SRP but "Responsibility" is an abstract concept.

public class Sql { 
    public Sql(String table, Column[] columns)
    public String create()
    public String insert(Object[] fields)
    public String selectAll()
    public String findByKey(String keyColumn, String keyValue)
    public String select(Column column, String pattern)
    public String select(Criteria criteria)
    public String preparedInsert()
    private String columnList(Column[] columns)
    private String valuesList(Object[] fields, final Column[] columns)
    private String selectWithCriteria(String criteria)
    private String placeholderList(Column[] columns)
}

Is this class's single responsibility "generating SQL statements"? Or does it have seven responsibilities, one per query type? The definition depends entirely on your abstraction level.
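The ambiguity is easy to demonstrate: pushing SRP one level down yields an equally defensible "one responsibility per class" design. A sketch with hypothetical names, not code from the book:

```java
// Same job, SRP applied per statement type: each class now "does one
// thing", yet the overall responsibility ("generate SQL") is scattered
// across many classes. Both designs can claim to satisfy SRP.
interface SqlStatement {
    String generate();
}

class SelectAllSql implements SqlStatement {
    private final String table;

    SelectAllSql(String table) {
        this.table = table;
    }

    public String generate() {
        return "SELECT * FROM " + table;
    }
}

class FindByKeySql implements SqlStatement {
    private final String table, keyColumn, keyValue;

    FindByKeySql(String table, String keyColumn, String keyValue) {
        this.table = table;
        this.keyColumn = keyColumn;
        this.keyValue = keyValue;
    }

    public String generate() {
        return "SELECT * FROM " + table +
               " WHERE " + keyColumn + " = '" + keyValue + "'";
    }
}
```

Which granularity is "right" depends entirely on the abstraction level you pick, which is the problem with treating SRP as an objective rule.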

Chapter 11: Systems

This is a mixed bag of relatively OK-ish approaches, somewhat outdated advice, and critiques of APIs that lost their relevance a decade ago.

It focuses on managing cross-cutting concerns, but through a 2009 lens that's now clouded. The book discusses four implementation approaches.

Martin ultimately advocates for plain Java with better modularity and cleaner separation of concerns. To address Java's verbosity, he suggests using Domain-Specific Languages (DSLs). The core message about separation of concerns remains valid, but the presentation has three major problems:

  1. Most referenced technologies are obsolete, making the chapter confusing for modern developers
  2. The DSL argument lacks concrete examples, instead bizarrely referencing Christopher Alexander's architectural patterns
  3. The chapter holds up JMock as exemplary DSL design, yet the most successful Java mocking framework (Mockito) explicitly rejected DSLs:

"The less DSL the better. Interactions are just method calls. Method calls are simple, DSL is complex."

from Mockito announcement

This illustrates a broader issue with the chapter: its technical recommendations proved less durable than its architectural principles. While "separate your concerns" remains good advice, the specific approaches it recommends haven't stood the test of time.

Chapter 12: Emergence

It is written by Jeff Langr.

It's pretty much about Kent Beck's four rules of good design:

  • Runs all the tests
  • Contains no duplication
  • Expresses the intent of the programmer
  • Minimizes the number of classes and methods

The chapter promises that if you follow these rules, the emergent design of the system will be clean. Maybe? The proof is left as an exercise for the reader.

My favourite part is the last section: "Minimal Classes and Methods"

Even concepts as fundamental as elimination of duplication, code expressiveness, and the SRP can be taken too far. In an effort to make our classes and methods small, we might create too many tiny classes and methods. So this rule suggests that we also keep our function and class counts low.

High class and method counts are sometimes the result of pointless dogmatism.

Yep. This pretty much sums up everything that is wrong with the rest of this book.

Chapter 13: Concurrency

Don't bother. The topic is too complex to be just a chapter.

Things to read instead

Memory Models

Chapter 14: Successive Refinement

This chapter is an example of how he arrives at a "clean code" solution for a specific task.

When I first read the book in 2009 I loved it. It felt like the best programming book. But even then, this chapter left me with an uneasy feeling about his solution. I spent a lot of time re-reading it, thinking I was too junior to understand why it was good. 15 years later I still don't like his solution, but now I can explain why.

The task: create a command-line argument parser with this interface:

public static void main(String[] args) {
    try {
        Args arg = new Args("l,p#,d*", args);
        boolean logging = arg.getBoolean('l');
        int port = arg.getInt('p');
        String directory = arg.getString('d');
        executeApplication(logging, port, directory);
    } catch (ArgsException e) {
        System.out.printf("Argument error: %s\n", e.errorMessage());
    }
}

This is the solution that Martin is proud of and insists that it needs to be carefully studied:

package com.objectmentor.utilities.args;
 
import static com.objectmentor.utilities.args.ArgsException.ErrorCode.*;
import java.util.*;
 
public class Args {
    private Map<Character, ArgumentMarshaler> marshalers;
    private Set<Character> argsFound;
    private ListIterator<String> currentArgument;
 
    public Args(String schema, String[] args) throws ArgsException {
        marshalers = new HashMap<Character, ArgumentMarshaler>();
        argsFound = new HashSet<Character>();
 
        parseSchema(schema);
        parseArgumentStrings(Arrays.asList(args));
    }
 
    private void parseSchema(String schema) throws ArgsException {
        for (String element : schema.split(","))
            if (element.length() > 0)
                parseSchemaElement(element.trim());
    }
 
    private void parseSchemaElement(String element) throws ArgsException {
        char elementId = element.charAt(0);
        String elementTail = element.substring(1);
        validateSchemaElementId(elementId);
        if (elementTail.length() == 0)
            marshalers.put(elementId, new BooleanArgumentMarshaler());
        else if (elementTail.equals("*"))
            marshalers.put(elementId, new StringArgumentMarshaler());
        else if (elementTail.equals("#"))
            marshalers.put(elementId, new IntegerArgumentMarshaler());
        else if (elementTail.equals("##"))
            marshalers.put(elementId, new DoubleArgumentMarshaler());
        else if (elementTail.equals("[*]"))
            marshalers.put(elementId, new StringArrayArgumentMarshaler());
        else
            throw new ArgsException(INVALID_ARGUMENT_FORMAT, elementId, elementTail);
    }
 
    private void validateSchemaElementId(char elementId) throws ArgsException {
        if (!Character.isLetter(elementId))
            throw new ArgsException(INVALID_ARGUMENT_NAME, elementId, null);
    }
 
    private void parseArgumentStrings(List<String> argsList) throws ArgsException 
    {
        for (currentArgument = argsList.listIterator(); currentArgument.hasNext();) 
        {
            String argString = currentArgument.next();
            if (argString.startsWith("-")) {
                parseArgumentCharacters(argString.substring(1));
            } else {
                currentArgument.previous();
                break;
            }
        }
    }
 
    private void parseArgumentCharacters(String argChars) throws ArgsException {
        for (int i = 0; i < argChars.length(); i++)
            parseArgumentCharacter(argChars.charAt(i));
    }
 
    private void parseArgumentCharacter(char argChar) throws ArgsException {
        ArgumentMarshaler m = marshalers.get(argChar);
        if (m == null) {
            throw new ArgsException(UNEXPECTED_ARGUMENT, argChar, null);
        } else {
            argsFound.add(argChar);
            try {
                m.set(currentArgument);
            } catch (ArgsException e) {
                e.setErrorArgumentId(argChar);
                throw e;
            }
        }
    }
 
    public boolean has(char arg) {
        return argsFound.contains(arg);
    }
 
    public int nextArgument() {
        return currentArgument.nextIndex();
    }
 
    public boolean getBoolean(char arg) {
        return BooleanArgumentMarshaler.getValue(marshalers.get(arg));
    }
 
    public String getString(char arg) {
        return StringArgumentMarshaler.getValue(marshalers.get(arg));
    }
 
    public int getInt(char arg) {
        return IntegerArgumentMarshaler.getValue(marshalers.get(arg));
    }
 
    public double getDouble(char arg) {
        return DoubleArgumentMarshaler.getValue(marshalers.get(arg));
    }
 
    public String[] getStringArray(char arg) {
        return StringArrayArgumentMarshaler.getValue(marshalers.get(arg));
    }
}
 
//....
 
public class StringArgumentMarshaler implements ArgumentMarshaler {
  private String stringValue = null;
 
  public void set(Iterator currentArgument) throws ArgsException {
    try {
      stringValue = currentArgument.next();
    } catch (NoSuchElementException e) {
      throw new ArgsException(MISSING_STRING);
    }
  }
 
  public static String getValue(ArgumentMarshaler am) {
    if (am != null && am instanceof StringArgumentMarshaler)
      return ((StringArgumentMarshaler) am).stringValue;
    else
      return "";
  }
}

The Problems

1. Weird Interface Design

Look at StringArgumentMarshaler:

public class StringArgumentMarshaler implements ArgumentMarshaler {
  private String stringValue = null;
 
  public void set(Iterator<String> currentArgument) throws ArgsException {
    try {
      stringValue = currentArgument.next();
    } catch (NoSuchElementException e) {
      throw new ArgsException(MISSING_STRING);
    }
  }
 
  public static String getValue(ArgumentMarshaler am) {
    if (am != null && am instanceof StringArgumentMarshaler)
      return ((StringArgumentMarshaler) am).stringValue;
    else
      return "";
  }
}

set is an instance method but getValue is static. Why not take a simpler approach?

public interface ArgumentMarshaler<T> {
    void set(Iterator<String> currentArgument) throws ArgsException;
    Optional<T> get();
}
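Under this hypothetical interface, a marshaler shrinks to a few lines. A sketch (the ArgsException here is a minimal stand-in for this snippet, not the book's class):

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Optional;

// Stand-in exception, just enough for the sketch to compile.
class ArgsException extends Exception {
    ArgsException(String message) { super(message); }
}

interface ArgumentMarshaler<T> {
    void set(Iterator<String> currentArgument) throws ArgsException;
    Optional<T> get();
}

// One instance method to write, one to read: no static getValue,
// no instanceof check at the call site, and a missing value is an
// empty Optional rather than a silent "".
class StringArgumentMarshaler implements ArgumentMarshaler<String> {
    private String value;

    public void set(Iterator<String> currentArgument) throws ArgsException {
        try {
            value = currentArgument.next();
        } catch (NoSuchElementException e) {
            throw new ArgsException("missing string parameter");
        }
    }

    public Optional<String> get() {
        return Optional.ofNullable(value);
    }
}
```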

2. Mixed Up Responsibilities

The StringArgumentMarshaler (and all other marshalers) tries to do two things:

  • Parse token streams
  • Store and provide parsed values

From an interface design perspective, this would make more sense:

public interface ArgumentParser<T> {
    T parse(Iterator<String> currentArgument) throws ArgsException;
}
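A sketch of a parser under this split (hypothetical names again; the ArgsException is a stand-in):

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Stand-in exception, just enough for the sketch to compile.
class ArgsException extends Exception {
    ArgsException(String message) { super(message); }
}

interface ArgumentParser<T> {
    T parse(Iterator<String> currentArgument) throws ArgsException;
}

// The parser is stateless: it consumes tokens and returns a value.
// Storing parsed values becomes the caller's job (e.g. a map filled
// once during parsing) instead of state hidden inside each marshaler.
class IntegerArgumentParser implements ArgumentParser<Integer> {
    public Integer parse(Iterator<String> currentArgument) throws ArgsException {
        try {
            return Integer.parseInt(currentArgument.next());
        } catch (NoSuchElementException e) {
            throw new ArgsException("missing integer parameter");
        } catch (NumberFormatException e) {
            throw new ArgsException("invalid integer parameter");
        }
    }
}
```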

3. Messy Error Handling

The set method does some validation, but get just falls back to type-specific default values if something's wrong. Silent failures with default values can lead to subtle bugs.

Check out how the IntegerArgumentMarshaler assumes 0 is a safe default:

public static int getValue(ArgumentMarshaler am) {
    if (am != null && am instanceof IntegerArgumentMarshaler)
        return ((IntegerArgumentMarshaler) am).intValue;
    else
        return 0;
}

Here's how this can bite you:

Args arg = new Args("l,p#,p*", args); // p is marked as string here
int port = arg.getInt('p');           // but trying to read as int

The library quietly gives you 0 as the port value, which is extra bad since port 0 means something special in Unix.

The argument parsing library is expected to be generic - i.e. to be used across a wide range of domains.
In this context, it is not safe to assume default values based on type information alone.
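A minimal sketch of the fail-fast alternative (class and method names are hypothetical): store the parsed value with its runtime type and refuse a mismatched read instead of defaulting.

```java
import java.util.HashMap;
import java.util.Map;

// Values are stored as Objects carrying their runtime type; reading
// under the wrong type throws instead of silently returning 0 or "".
class TypedArgs {
    private final Map<Character, Object> values = new HashMap<>();

    void put(char flag, Object value) {
        values.put(flag, value);
    }

    int getInt(char flag) {
        Object v = values.get(flag);
        if (!(v instanceof Integer))
            throw new IllegalStateException("-" + flag + " was not declared as an integer");
        return (Integer) v;
    }
}
```

With this shape, the port-0 bug above becomes an immediate, loud failure at the `getInt('p')` call instead of a silent wrong value.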

4. State Management Gone Wrong

The Args class mixes up final results and intermediate processing state in the same scope, while keeping the state it doesn't need:

public class Args {
    private Map<Character, ArgumentMarshaler> marshalers;  // This has the final results
    private Set<Character> argsFound;                      // Don't need this, can get from marshalers
    private ListIterator<String> currentArgument;          // Only needed during parsing

5. Overcomplicated Errors

Let's talk about that ArgsException class.

public class ArgsException extends Exception {
  private char errorArgumentId = '\0';
  private String errorParameter = null;
  private ErrorCode errorCode = OK;
 
  public ArgsException() {}
 
  public ArgsException(String message) {super(message);}
 
  public ArgsException(ErrorCode errorCode) {
    this.errorCode = errorCode;
  }
 
  public ArgsException(ErrorCode errorCode, String errorParameter) {
    this.errorCode = errorCode;
    this.errorParameter = errorParameter;
  }
 
  public ArgsException(ErrorCode errorCode, 
                       char errorArgumentId, String errorParameter) {
    this.errorCode = errorCode;
    this.errorParameter = errorParameter;
    this.errorArgumentId = errorArgumentId;
  }
 
  public char getErrorArgumentId() {
    return errorArgumentId;
  }
 
  public void setErrorArgumentId(char errorArgumentId) {
    this.errorArgumentId = errorArgumentId;
  }
 
  public String getErrorParameter() {
    return errorParameter;
  }
 
  public void setErrorParameter(String errorParameter) {
    this.errorParameter = errorParameter;
  }
 
  public ErrorCode getErrorCode() {
    return errorCode;
  }
 
  public void setErrorCode(ErrorCode errorCode) {
    this.errorCode = errorCode;
  }
 
  public String errorMessage() {
    switch (errorCode) {
      case OK:
        return "TILT: Should not get here.";
      case UNEXPECTED_ARGUMENT:
        return String.format("Argument -%c unexpected.", errorArgumentId);
      case MISSING_STRING:
        return String.format("Could not find string parameter for -%c.",
                              errorArgumentId);
      case INVALID_INTEGER:
        return String.format("Argument -%c expects an integer but was '%s'.",
                              errorArgumentId, errorParameter);
      case MISSING_INTEGER:
        return String.format("Could not find integer parameter for -%c.",
                              errorArgumentId);
      case INVALID_DOUBLE:
        return String.format("Argument -%c expects a double but was '%s'.",
                              errorArgumentId, errorParameter);
      case MISSING_DOUBLE:
        return String.format("Could not find double parameter for -%c.",
                              errorArgumentId);
      case INVALID_ARGUMENT_NAME:
        return String.format("'%c' is not a valid argument name.",
                              errorArgumentId);
      case INVALID_ARGUMENT_FORMAT:
        return String.format("'%s' is not a valid argument format.",
                              errorParameter);
    }
    return "";
  }
 
  public enum ErrorCode {
    OK, INVALID_ARGUMENT_FORMAT, UNEXPECTED_ARGUMENT, INVALID_ARGUMENT_NAME,
    MISSING_STRING,
    MISSING_INTEGER, INVALID_INTEGER,
    MISSING_DOUBLE, INVALID_DOUBLE}
}

It has issues:

  • Reintroduces error codes inside exceptions (contradicting the book's own advice).
    This book mentions twice that it's preferable to use exceptions instead of error codes. And yet in this example he introduces error codes inside exceptions. Why? Imagine if there were a tool that let you document such design decisions in code.. That would be so convenient. Unfortunately, this tool doesn't exist. /s
  • Has weirdly specific error types (why are MISSING_DOUBLE and MISSING_INTEGER separate?)
  • Has an ErrorCode.OK which makes no sense (what does it mean to throw new ArgsException(ErrorCode.OK)?)
  • The exception is mutable. It lets you change error details after creating the exception (why?)
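For contrast, an immutable version with no OK code and no setters is short. A sketch assuming a trimmed-down set of error codes (not the book's design):

```java
// All fields are final and set once in the constructor. A marshaler
// that doesn't know the flag character can't mutate the exception
// later; the caller catches, wraps, and rethrows with the id filled in.
final class ArgsException extends Exception {
    enum ErrorCode { UNEXPECTED_ARGUMENT, MISSING_PARAMETER, INVALID_PARAMETER }

    private final ErrorCode errorCode;
    private final char errorArgumentId;
    private final String errorParameter;

    ArgsException(ErrorCode errorCode, char errorArgumentId, String errorParameter) {
        this.errorCode = errorCode;
        this.errorArgumentId = errorArgumentId;
        this.errorParameter = errorParameter;
    }

    ErrorCode errorCode() { return errorCode; }

    char errorArgumentId() { return errorArgumentId; }

    String errorMessage() {
        switch (errorCode) {
            case UNEXPECTED_ARGUMENT:
                return String.format("Argument -%c unexpected.", errorArgumentId);
            case MISSING_PARAMETER:
                return String.format("Could not find parameter for -%c.", errorArgumentId);
            default:
                return String.format("Argument -%c has invalid parameter '%s'.",
                                     errorArgumentId, errorParameter);
        }
    }
}
```

Collapsing MISSING_INTEGER/MISSING_DOUBLE/MISSING_STRING into one MISSING_PARAMETER code also removes the weird per-type duplication.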

The TDD Problem

The chapter shows TDD in action, revealing two problems:

  1. Focus on small steps can miss big-picture issues
  2. Starting point heavily influences final quality

Martin began with deeply problematic code:

  • Lots of tiny methods
  • Huge area of mutable state
  • Everything crammed into one class
public class Args {
  private String schema;
  private String[] args;
  private boolean valid = true;
  private Set<Character> unexpectedArguments = new TreeSet<Character>();
  private Map<Character, Boolean> booleanArgs = new HashMap<Character, Boolean>();
  private Map<Character, String> stringArgs = new HashMap<Character, String>();
  private Map<Character, Integer> intArgs = new HashMap<Character, Integer>();
  private Set<Character> argsFound = new HashSet<Character>();
  private int currentArgument;
  private char errorArgumentId = '\0';
  private String errorParameter = "TILT";
  private ErrorCode errorCode = ErrorCode.OK;
 
  private enum ErrorCode {
    OK, MISSING_STRING, MISSING_INTEGER, INVALID_INTEGER, UNEXPECTED_ARGUMENT}
 
  public Args(String schema, String[] args) throws ParseException {
    this.schema = schema;
    this.args = args;
    valid = parse();
  }
 
  private boolean parse() throws ParseException {
    if (schema.length() == 0 && args.length == 0)
      return true;
    parseSchema();
    try {
      parseArguments();
    } catch (ArgsException e) {
    }
    return valid;
  }
 
  private boolean parseSchema() throws ParseException {
    for (String element : schema.split(",")) {
      if (element.length() > 0) {
        String trimmedElement = element.trim();
        parseSchemaElement(trimmedElement);
      }
    }
    return true;
  }
 
  private void parseSchemaElement(String element) throws ParseException {
    char elementId = element.charAt(0);
    String elementTail = element.substring(1);
    validateSchemaElementId(elementId);
    if (isBooleanSchemaElement(elementTail))
      parseBooleanSchemaElement(elementId);
    else if (isStringSchemaElement(elementTail))
      parseStringSchemaElement(elementId);
    else if (isIntegerSchemaElement(elementTail)) {
      parseIntegerSchemaElement(elementId);
    } else {
      throw new ParseException(
        String.format("Argument: %c has invalid format: %s.",
                     elementId, elementTail), 0);
    }
  }
 
  private void validateSchemaElementId(char elementId) throws ParseException {
    if (!Character.isLetter(elementId)) {
      throw new ParseException(
        "Bad character:" + elementId + "in Args format: " + schema, 0);
    }
  }
 
  private void parseBooleanSchemaElement(char elementId) {
    booleanArgs.put(elementId, false);
  }
 
  private void parseIntegerSchemaElement(char elementId) {
    intArgs.put(elementId, 0);
  }
 
  private void parseStringSchemaElement(char elementId) {
    stringArgs.put(elementId, "");
  }
 
  private boolean isStringSchemaElement(String elementTail) {
    return elementTail.equals("*");
  }
 
  private boolean isBooleanSchemaElement(String elementTail) {
    return elementTail.length() == 0;
  }
 
  private boolean isIntegerSchemaElement(String elementTail) {
    return elementTail.equals("#");
  }
 
  private boolean parseArguments() throws ArgsException {
    for (currentArgument = 0; currentArgument < args.length; currentArgument++) {
      String arg = args[currentArgument];
      parseArgument(arg);
    }
    return true;
  }
 
  private void parseArgument(String arg) throws ArgsException {
    if (arg.startsWith("-"))
      parseElements(arg);
  }
 
  private void parseElements(String arg) throws ArgsException {
    for (int i = 1; i < arg.length(); i++)
      parseElement(arg.charAt(i));
  }
 
  private void parseElement(char argChar) throws ArgsException {
    if (setArgument(argChar))
      argsFound.add(argChar);
    else {
      unexpectedArguments.add(argChar);
      errorCode = ErrorCode.UNEXPECTED_ARGUMENT;
      valid = false;
    }
  }
 
  private boolean setArgument(char argChar) throws ArgsException {
    if (isBooleanArg(argChar))
      setBooleanArg(argChar, true);
    else if (isStringArg(argChar))
      setStringArg(argChar);
    else if (isIntArg(argChar))
      setIntArg(argChar);
    else
      return false;
 
    return true;
  }
 
  private boolean isIntArg(char argChar) {return intArgs.containsKey(argChar);}
 
  private void setIntArg(char argChar) throws ArgsException {
    currentArgument++;
    String parameter = null;
    try {
      parameter = args[currentArgument];
      intArgs.put(argChar, new Integer(parameter));
    } catch (ArrayIndexOutOfBoundsException e) {
      valid = false;
      errorArgumentId = argChar;
      errorCode = ErrorCode.MISSING_INTEGER;
 
      throw new ArgsException();
    } catch (NumberFormatException e) {
      valid = false;
      errorArgumentId = argChar;
      errorParameter = parameter;
      errorCode = ErrorCode.INVALID_INTEGER;
      throw new ArgsException();
    }
  }
 
  private void setStringArg(char argChar) throws ArgsException {
    currentArgument++;
    try {
      stringArgs.put(argChar, args[currentArgument]);
    } catch (ArrayIndexOutOfBoundsException e) {
      valid = false;
      errorArgumentId = argChar;
      errorCode = ErrorCode.MISSING_STRING;
      throw new ArgsException();
    }
  }
 
  private boolean isStringArg(char argChar) {
    return stringArgs.containsKey(argChar);
  }
 
  private void setBooleanArg(char argChar, boolean value) {
    booleanArgs.put(argChar, value);
  }
 
  private boolean isBooleanArg(char argChar) {
    return booleanArgs.containsKey(argChar);
  }
 
  public int cardinality() {
    return argsFound.size();
  }
 
  public String usage() {
    if (schema.length() > 0)
      return "-[" + schema + "]";
    else
      return "";
  }
 
  public String errorMessage() throws Exception {
    switch (errorCode) {
      case OK:
        throw new Exception("TILT: Should not get here.");
      case UNEXPECTED_ARGUMENT:
        return unexpectedArgumentMessage();
      case MISSING_STRING:
        return String.format("Could not find string parameter for -%c.",
                            errorArgumentId);
 
      case INVALID_INTEGER:
        return String.format("Argument -%c expects an integer but was '%s'.",
                            errorArgumentId, errorParameter);
      case MISSING_INTEGER:
        return String.format("Could not find integer parameter for -%c.",
                            errorArgumentId);
    }
    return "";
  }
 
  private String unexpectedArgumentMessage() {
    StringBuffer message = new StringBuffer("Argument(s) -");
    for (char c : unexpectedArguments) {
      message.append(c);
    }
    message.append(" unexpected.");
 
    return message.toString();
  }
 
  private boolean falseIfNull(Boolean b) {
    return b != null && b;
  }
 
  private int zeroIfNull(Integer i) {
    return i == null ? 0 : i;
  }
 
  private String blankIfNull(String s) {
    return s == null ? "" : s;
  }
 
  public String getString(char arg) {
    return blankIfNull(stringArgs.get(arg));
  }
 
  public int getInt(char arg) {
    return zeroIfNull(intArgs.get(arg));
  }
 
  public boolean getBoolean(char arg) {
    return falseIfNull(booleanArgs.get(arg));
  }
 
  public boolean has(char arg) {
    return argsFound.contains(arg);
  }
 
  public boolean isValid() {
    return valid;
  }
 
  private class ArgsException extends Exception {
  }
}

🤷 No wonder it took him so many steps to arrive at something more reasonable.

If you ask Claude 3.5 Sonnet:

Create a mini library to parse command line arguments. It should support the following API:

Args arg = new Args("l,p#,d*", args);
boolean logging = arg.getBoolean('l');
int port = arg.getInt('p');
String directory = arg.getString('d');

It comes up with a pretty reasonable draft:

public class Args {
    private Map<Character, String> schema = new HashMap<>();
    private Map<Character, String> parsedArguments = new HashMap<>();

    public Args(String schema, String[] args) throws IllegalArgumentException {
        parseSchema(schema);
        parseArgs(args);
    }

    private void parseSchema(String schemaText) throws IllegalArgumentException {
        String[] schemaParts = schemaText.split(",");
        for (String part : schemaParts) {
            if (part.length() < 1) {
                throw new IllegalArgumentException("Invalid schema format");
            }
            char flag = part.charAt(0);
            String type = part.length() > 1 ? part.substring(1) : "";
            
            if (!type.matches("^[#*]?$")) {
                throw new IllegalArgumentException("Invalid type specifier for flag: " + flag);
            }
            
            schema.put(flag, type);
        }
    }

    private void parseArgs(String[] args) throws IllegalArgumentException {
        for (int i = 0; i < args.length; i++) {
            String arg = args[i];
            if (!arg.startsWith("-") || arg.length() != 2) {
                throw new IllegalArgumentException("Invalid argument format: " + arg);
            }

            char flag = arg.charAt(1);
            if (!schema.containsKey(flag)) {
                throw new IllegalArgumentException("Unknown flag: " + flag);
            }

            String type = schema.get(flag);
            if (type.isEmpty()) {  // boolean flag
                parsedArguments.put(flag, "true");
            } else {  // string or integer flag
                if (i + 1 >= args.length) {
                    throw new IllegalArgumentException("No value provided for flag: " + flag);
                }
                parsedArguments.put(flag, args[++i]);
            }
        }
    }

    public boolean getBoolean(char flag) throws IllegalArgumentException {
        validateFlag(flag, "");
        return parsedArguments.containsKey(flag);
    }

    public int getInt(char flag) throws IllegalArgumentException {
        validateFlag(flag, "#");
        try {
            return Integer.parseInt(parsedArguments.getOrDefault(flag, "0"));
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Invalid integer value for flag: " + flag);
        }
    }

    public String getString(char flag) throws IllegalArgumentException {
        validateFlag(flag, "*");
        return parsedArguments.getOrDefault(flag, "");
    }

    private void validateFlag(char flag, String expectedType) throws IllegalArgumentException {
        if (!schema.containsKey(flag)) {
            throw new IllegalArgumentException("Unknown flag: " + flag);
        }
        if (!schema.get(flag).equals(expectedType)) {
            throw new IllegalArgumentException("Invalid type for flag: " + flag);
        }
    }
}

Not perfect, but a much better starting point for discussions about:

  • How to make it easier to add new types
  • How to improve the schema format
  • How to make errors more specific

The chapter unintentionally demonstrates how TDD alone doesn't guarantee good design - you need solid architectural vision too.