by Adam Brett

DRY (Don't Repeat Yourself) is a Fallacy

Good programmers know that code duplication should be avoided, so we have catchphrases like Don't Repeat Yourself (DRY) to remind ourselves that code duplication is evil. DRY is one of those arguments that sounds logical when you hear it. It just makes sense. But how many of us think about why code duplication should be avoided?

We're not doing it to save keystrokes, because we have ctrl-c, ctrl-v for that. The underlying issue is that there's a cost associated with duplicated code. You have to update it in multiple places when you want to change it, otherwise you could end up with two pieces of code that should behave the same, but don't.

 DRY is about knowledge, not code.

In the Blue Book (DDD), Eric Evans expresses this in the form of "Bounded Contexts". The example he uses in the book is that of products in a supermarket (if I remember correctly, it's a while since I read it). Consider the following example:

class Cart {
  private Products products = new Products();

  public void addProduct(Product product)
  {
    // Refuse a fourth product.
    if (3 == this.products.count()) {
      throw new DomainException("Max 3 products allowed");
    }

    this.products.add(product);
  }
}

class Shipment {
  private Products products = new Products();

  public void addProduct(Product product)
  {
    // Refuse a fourth product.
    if (3 == this.products.count()) {
      throw new DomainException("Max 3 products allowed");
    }

    this.products.add(product);
  }
}

Most people would consider this duplicated code, and therefore it violates the DRY principle. A good developer might then refactor this out to something like this:

abstract class ProductContainer
{
  private Products products = new Products();

  public void addProduct(Product product)
  {
    // One shared check now governs every subclass.
    if (3 == this.products.count()) {
      throw new DomainException("Max 3 products allowed");
    }

    this.products.add(product);
  }
}

class Cart extends ProductContainer {}
class Shipment extends ProductContainer {}

The code, as it stood, was identical, but as good DDD practitioners we have to ask ourselves why. The only way to find out is to talk to the business.

In a supermarket, you have products that you sell, but the definition of what a product is is fluid: it means different things to different people in the business. With DDD, it's through talking to these different people, and learning their different interpretations of what a "product" looks like (as well as the business rules around them), that you develop your Bounded Contexts.

What a product looks like for a shipment is different to what a product looks like in a shopping cart, and as a result there are different things we want to know about them, and different things you can do with them.

Whilst we're probably talking about the same physical product, what we mean when we reference them in these two contexts is different – a product in a shopping cart has a price and a barcode, it has a weight that must be less than the maximum rated value for the cart, and so on. A product in a shipment is that same product, but wrapped in packing foam, and a cardboard box, so it is both larger and heavier, and we don't care about the price at this point, as it's already been sold.
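To make the difference concrete, here is a minimal sketch of how the same physical item might be modelled separately in each context. Every name in it (CartProduct, ShipmentProduct, Packing, the fields, and the 0.5 kg packaging weight) is an assumption made for illustration, not something taken from the Blue Book or from a real system.

```java
// Hypothetical model: a product as the shopping-cart context sees it.
final class CartProduct {
    final String barcode;
    final long priceInPence; // price matters while the product is still for sale
    final double weightKg;   // unpacked weight, checked against the cart's rating

    CartProduct(String barcode, long priceInPence, double weightKg) {
        this.barcode = barcode;
        this.priceInPence = priceInPence;
        this.weightKg = weightKg;
    }
}

// Hypothetical model: the same physical item as the shipping context sees it.
final class ShipmentProduct {
    final String barcode;
    final double packedWeightKg; // includes foam and box; no price, it is already sold

    ShipmentProduct(String barcode, double packedWeightKg) {
        this.barcode = barcode;
        this.packedWeightKg = packedWeightKg;
    }
}

// Hypothetical translation from one context to the other.
final class Packing {
    // Assumed value, purely for illustration.
    static final double PACKAGING_WEIGHT_KG = 0.5;

    private Packing() {
    }

    static ShipmentProduct pack(CartProduct product) {
        return new ShipmentProduct(product.barcode, product.weightKg + PACKAGING_WEIGHT_KG);
    }
}
```

The two classes share a barcode and little else, which is the point: each one carries only what its own context cares about.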

When you revisit our example in these two contexts, it becomes very easy to see that the duplication we saw here was just coincidental, and that introducing the ProductContainer indirection is a fallacy.

Changing the Rules

The rules themselves are not important. The principal issue is that the limit of 3 products could have been introduced in both places for different reasons, and therefore could change independently.

As an example, the business could decide to allow as many products as the customer likes in a Cart, but then create multiple Shipments to send them to the customer. With the ProductContainer example, we are now stuck with tight coupling of the two independent domain objects: what you do to one will automatically be applied to the other, and this can lead to potentially dangerous software errors in the real world (imagine the products we're talking about are volatile chemicals, which is why we can't ship more than 3 at a time, or that the limit is the number of people allowed in an elevator).
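If Cart and Shipment had been left as two separate classes, that change could be made to one without touching the other. The following is only a sketch of how that might look: DomainException and Product are stand-ins for the types used in the examples above, and ShipmentPlanner, isFull and size are hypothetical names I have made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the types used in the article's examples.
class DomainException extends RuntimeException {
    DomainException(String message) {
        super(message);
    }
}

class Product {
}

// After the rule change, the cart has no limit of its own.
class Cart {
    private final List<Product> products = new ArrayList<>();

    void addProduct(Product product) {
        products.add(product);
    }

    List<Product> products() {
        return new ArrayList<>(products); // defensive copy
    }
}

// The shipment keeps its limit, untouched by the change to Cart.
class Shipment {
    private static final int MAX_PRODUCTS = 3;
    private final List<Product> products = new ArrayList<>();

    void addProduct(Product product) {
        if (isFull()) {
            throw new DomainException("Max " + MAX_PRODUCTS + " products allowed");
        }
        products.add(product);
    }

    boolean isFull() {
        return products.size() >= MAX_PRODUCTS;
    }

    int size() {
        return products.size();
    }
}

// Hypothetical helper: spread a cart across as many shipments as it needs.
class ShipmentPlanner {
    static List<Shipment> plan(Cart cart) {
        List<Shipment> shipments = new ArrayList<>();
        Shipment current = null;
        for (Product product : cart.products()) {
            if (current == null || current.isFull()) {
                current = new Shipment();
                shipments.add(current);
            }
            current.addProduct(product);
        }
        return shipments;
    }
}
```

A cart holding seven products would be planned as three shipments, and the safety limit on each Shipment still holds no matter what happens to the Cart.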

With this trivial example, there are trivial ways to get around this limitation whilst still being Code-DRY. You might be considering rewriting ProductContainer like so:

abstract class ProductContainer
{
  protected int maxProducts = 3;

  private Products products = new Products();

  public void addProduct(Product product)
  {
    // The limit is now configurable per subclass.
    if (this.maxProducts == this.products.count()) {
      throw new DomainException(String.format("Max %d products allowed", this.maxProducts));
    }

    this.products.add(product);
  }
}


class Cart extends ProductContainer {
  public Cart()
  {
    this.maxProducts = 2;
  }
}

class Shipment extends ProductContainer {}

This would probably work, but should still be avoided. The rules could change in unexpected ways, and without any more context we don't know the reasons behind the 3 product limit. It could be to do with size or weight, or the type of supplier agreement we have with a customer, and in future we could need to factor these in.
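To see why a configurable maxProducts does not go far enough, suppose the shipment limit turned out to be about weight all along. The sketch below is one way the two classes might then diverge. The weightKg field and the 20 kg figure are assumptions of mine, and DomainException and Product are again stand-ins for the types used above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the exception used in the article's examples.
class DomainException extends RuntimeException {
    DomainException(String message) {
        super(message);
    }
}

// Hypothetical product carrying only what this sketch needs.
class Product {
    final double weightKg;

    Product(double weightKg) {
        this.weightKg = weightKg;
    }
}

// The cart's rule is still a simple count.
class Cart {
    private final List<Product> products = new ArrayList<>();

    void addProduct(Product product) {
        if (products.size() >= 3) {
            throw new DomainException("Max 3 products allowed in a cart");
        }
        products.add(product);
    }
}

// The shipment's rule has become a weight limit (20 kg is an assumed figure).
class Shipment {
    private static final double MAX_WEIGHT_KG = 20.0;
    private final List<Product> products = new ArrayList<>();
    private double totalWeightKg = 0.0;

    void addProduct(Product product) {
        if (totalWeightKg + product.weightKg > MAX_WEIGHT_KG) {
            throw new DomainException("Max " + MAX_WEIGHT_KG + " kg allowed in a shipment");
        }
        products.add(product);
        totalWeightKg += product.weightKg;
    }
}
```

No single integer field on a shared base class could express both rules, so the abstraction would have had to be unpicked at this point anyway.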

 DRY is about Knowledge

Despite how it might appear from the code, the business rule in this example is not "Max 3 products allowed". In fact, it's two separate business rules: "A cart cannot have more than 3 products", and "A shipment cannot have more than 3 products". Two different rules, no matter how similar, should be separated in code, and not abstracted away.

"Don't repeat yourself" has never been about code, it was always about knowledge. Two pieces of code that represent the same knowledge will always change together. That's where the cost (risk) of duplication lies, and that's where you need to DRY the code.

On the other hand, if two identical pieces of code represent different knowledge, or different contexts, de-duplicating them is what introduces the cost and risk. The resulting tight coupling and low cohesion mean that changes to unrelated pieces of code can end up breaking each other, and you face having to refactor away the bad abstraction at a later date.
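For contrast, here is a sketch of a case where de-duplication is warranted, because two pieces of code rely on the same knowledge: the shipment limit is needed both when adding a product to a shipment and when telling the customer how many parcels to expect. ShipmentPolicy and its methods are hypothetical names used only for illustration.

```java
// Hypothetical single home for one piece of knowledge:
// how many products fit in a shipment.
final class ShipmentPolicy {
    static final int MAX_PRODUCTS_PER_SHIPMENT = 3;

    private ShipmentPolicy() {
    }

    // Used when adding a product to a shipment.
    static boolean hasRoomFor(int currentProductCount) {
        return currentProductCount < MAX_PRODUCTS_PER_SHIPMENT;
    }

    // Used when telling the customer how many parcels to expect.
    static int shipmentsNeeded(int productCount) {
        // Ceiling division using integer arithmetic.
        return (productCount + MAX_PRODUCTS_PER_SHIPMENT - 1) / MAX_PRODUCTS_PER_SHIPMENT;
    }
}
```

If the shipping limit changes, both callers must change with it, so keeping the number in one place removes a real risk. The cart's own limit, if it has one, stays out of this class.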

In summary, DRY is not a reason to couple code libraries with similar behaviours; instead, it is a reason to have a single canonical source of knowledge within a system.
