Demeter: It’s not just a good idea. It’s the law.

Is #try really so bad?

In response to my recent post about #try being a code smell, a lot of people made the reasonable objection that the example I used (calling #try on a Hash) was a pathological case. A much more typical usage of #try looks like this:

def user_info(user)
  "Name: #{user.name}. Dept: #{user.department.try(:name)}"
end

user may or may not have an associated department, so the call to Department#name is wrapped in a #try. If there is an associated department, its name will be returned. If not, the result will be nil.
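For readers without ActiveSupport at hand, here is a rough sketch of the behaviour described above. The #try defined here is a simplified stand-in written for this illustration, not the ActiveSupport source, and the Struct-based User and Department are hypothetical models.

```ruby
# Simplified stand-in for ActiveSupport's #try, for illustration only.
class Object
  def try(method_name, *args, &block)
    # Send the message only if the receiver responds to it.
    public_send(method_name, *args, &block) if respond_to?(method_name)
  end
end

class NilClass
  # nil swallows the call and answers nil.
  def try(*_args)
    nil
  end
end

# Hypothetical models standing in for real ones.
Department = Struct.new(:name)
User = Struct.new(:name, :department)

def user_info(user)
  "Name: #{user.name}. Dept: #{user.department.try(:name)}"
end

puts user_info(User.new("Bob", Department.new("Accounting")))
puts user_info(User.new("Alice", nil))
```

The second call prints an empty department, because nil interpolates as an empty string.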

Straightforward enough. Is there anything wrong with this code?

I can think of a few things. For one thing, it’s ugly. For my money, one of the hallmarks of beautiful code is that it’s visually consistent: similar operations have similar appearance. In the code above, we access one attribute with simple dot syntax (.name), and another with a very different-looking .try(:name), even though in both cases the concept we are trying to express is the same: “get the ‘name’ attribute”.

It’s a variety of ugliness that tends to proliferate, too. Starting with the code above, it’s not a very big leap to get to this:

def user_info(user)
  "Name: #{user.name}. Boss: #{user.department.try(:head).try(:name)}"
end

Yuck.

And then there are the tests. They’ll probably look something like this:

describe '#user_info' do
  subject { user_info(user) }
  let(:user) { stub('user', :name => "Bob", :department => stub(:name => "Accounting")) }
  specify {
    subject.should match(/Dept: Accounting/)
  }
end

Metastasizing mocks

This, again, doesn’t seem so bad. But give that test suite six months of active development, and chances are the tests wind up looking more like this:

describe '#user_info' do
  subject { user_info(user) }
  let(:user) {
    stub('user',
         :name => "Bob",
         :department =>
           stub(:name     => "Accounting",
                :head     =>
                  stub(:name     => "Jack",
                       :position => stub(:title => "Vice President")),
                :division => stub(:name => "Microwave Oven Programming")),
         :position => stub(:title => "Senior Bean Counter"))
  }
  # examples...
end

Not only that, the same tree of stubs will probably be duplicated, with subtle differences, for every test group that interacts with a User—because no one has time to sort out the specific subset of stubs that a given test actually needs in order to function.

At some point the client will decide that users really need to be associated with zero or more departments instead of just one. At that point some unlucky programmer will spend a late night fixing the 300 tests this “small change” breaks because of all the stubs that model the old behavior. Then the next day he’ll write an angry rant about how mock objects are a bad idea.

Structural coupling

The seed of this all-too-common predicament is structural coupling. What’s structural coupling? To define it, let’s start with a review of the DRY principle:

Every piece of knowledge must have a single, unambiguous, authoritative representation within the system.

It’s easy to think about DRYness just in terms of data: e.g., there should be only one place in the system for API keys; they shouldn’t just be copy-and-pasted willy-nilly throughout the codebase. But DRY applies equally to structural knowledge: knowledge about the composition of and relationships between your objects.

Let’s take a look at the code we started out with:

def user_info(user)
  "Name: #{user.name}. Dept: #{user.department.try(:name)}"
end

This seemingly innocuous code makes the following assumptions:

  • user will have a name property.
  • user may or may not have a single department.
  • user’s department, in turn, has a name property.

By going two levels deep into user’s associations, we’ve made a structural coupling between this code and the models it works with. We’ve duplicated knowledge about a User’s associations, canonically located in the User and Department classes, in the #user_info method.

And the #try method was an enabler. By papering over the ugly user.department && user.department.name construct we’d otherwise have had to use, #try made the coupling an easier syntactical pill to swallow.

This would be bad enough if we made a habit of it, because we’d have to change every method with a similar structural coupling whenever the innards of User or Department changed. But because we’re good Test-Driven developers, we then proceeded to couple dozens of test suites to a specific model structure, in the form of stubs and mock objects.

This is clearly an undesirable outcome. Wouldn’t it be handy to have a simple rule that helps us avoid structural coupling?

The Law of Demeter

Back in the 1980s, a group of programmers working on a project called the Demeter system realized that certain qualities in their object-oriented code led to the code being easier to maintain and change: qualities such as low coupling, information hiding, localization of information, and narrow interfaces between objects. They asked themselves: “Is there a simple heuristic that humans or machines can apply to code to determine whether it has these positive qualities?”

The answer they came up with came to be known as the “Law of Demeter”. It is stated as follows:

For all classes C, and for all methods M attached to C, all objects to which M sends a message must be instances of classes associated with the following classes:

  1. The argument classes of M (including C).
  2. The instance variable classes of C.

(Objects created by M, or by functions or methods which M calls, and objects in global variables are considered as arguments of M.)

WikiWiki explains the law like this:

  • Your method can call other methods in its class directly.
  • Your method can call methods on its own fields directly (but not on the fields’ fields).
  • When your method takes parameters, your method can call methods on those parameters directly.
  • When your method creates local objects, that method can call methods on the local objects.

If that still seems confusing, here’s an alternative explanation from Peter Van Rooijen:

  • You can play with yourself.
  • You can play with your own toys (but you can’t take them apart).
  • You can play with toys that were given to you.
  • And you can play with toys you’ve made yourself.
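As a rough illustration of those four rules, the sketch below labels each kind of permitted call, followed by one call that breaks the rule. Every class in it (Clerk, Till, Receipt, Customer, Address) is hypothetical and exists only to give the calls something to act on.

```ruby
# Hypothetical collaborators, defined only for this illustration.
Till = Struct.new(:unlocked) do
  def unlock
    self.unlocked = true
  end
end

Receipt = Struct.new(:lines) do
  def add(line)
    lines << line
    self
  end
end

Address  = Struct.new(:city)
Customer = Struct.new(:name, :address)

class Clerk
  def initialize(till)
    @till = till                  # one of my own toys
  end

  def serve(customer)
    greet(customer)               # 1. a method on myself
    @till.unlock                  # 2. a method on my own field
    receipt = Receipt.new([])     # 4. an object I made myself...
    receipt.add("Sold to #{customer.name}")  # 3. ...and a method on a parameter
  end

  # Violation: reaches through the parameter into the parameter's field.
  def city_of(customer)
    customer.address.city
  end

  private

  def greet(customer)
    "Hello, #{customer.name}"
  end
end

receipt = Clerk.new(Till.new(false)).serve(Customer.new("Bob", Address.new("Leeds")))
puts receipt.lines.inspect
```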

The Demeter programmers wrote up their experiences in a paper called Object-Oriented Programming: An Objective Sense of Style. What they found was that when methods were written in a form which complied with the Law of Demeter, the resulting codebase was easier to maintain and evolve.

It’s important to understand that the Law of Demeter is a heuristic, not an end in and of itself. It is not a law in the sense that you “must” write your code in a certain way. Rather, it is a law in the sense that it has been consistently observed that if code complies with the Law of Demeter, it almost certainly has a number of the qualities—encapsulation, loose coupling, etc.—desirable in an OO system.

Laying down the law

With that in mind, let’s take one more look at our example code:

def user_info(user)
  "Name: #{user.name}. Dept: #{user.department.try(:name)}"
end

This code does not comply with the Law of Demeter. In addition to calling methods on its parameter, user, it also calls a method on the result of one of those methods: (department.name).

Assuming this is a Rails program, it is extremely easy to change the code to satisfy the law. First, we make a one-line addition to the User class:

class User
  delegate :name, :to => :department, :prefix => true, :allow_nil => true
  # ...
end

The #delegate macro, provided by ActiveSupport, generates a new method User#department_name which delegates to the user’s #department. By supplying :allow_nil => true, we ensure that the method will simply return nil in the case when there is no department associated with the user.
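Outside of Rails, roughly the same behaviour can be written by hand. The sketch below approximates what the delegate line provides; it is not the code ActiveSupport generates, and the plain-Ruby User and Struct-based Department are hypothetical stand-ins for real models.

```ruby
# Hypothetical stand-in for a real Department model.
Department = Struct.new(:name)

class User
  attr_reader :name, :department

  def initialize(name, department = nil)
    @name = name
    @department = department
  end

  # Hand-written counterpart of:
  #   delegate :name, :to => :department, :prefix => true, :allow_nil => true
  # Answers nil when there is no department.
  def department_name
    department && department.name
  end
end

puts User.new("Bob", Department.new("Accounting")).department_name
puts User.new("Alice").department_name.inspect
```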

Here’s our code again, updated to use the new method:

def user_info(user)
  "Name: #{user.name}. Dept: #{user.department_name}"
end

The code now respects the Law of Demeter: it is coupled only to the immediate interface of the user parameter.

The updated test suite now has only one stub object:

describe '#user_info' do
  subject { user_info(user) }
  let(:user) { stub('user', :name => "Bob", :department_name => "Accounting") }
  specify {
    subject.should match(/Dept: Accounting/)
  }
end

Already we have a simpler test suite. But the real benefit comes when it is time to change the models. Let’s consider the case when a User changes from being linked to just one department, to having a list of zero or more departments. How we re-implement User#department_name depends on the needs of the domain. Let’s say the department name should now be a comma-separated list:

class User
  def department_name
    departments.map(&:name).join(", ")
  end
end

We replace our delegate method with a method implementing the new semantics. And that’s the only change! The #user_info method remains the same, as does every test suite that references User#department_name.
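To see the effect without Rails, here is a minimal plain-Ruby sketch. The two user classes are hypothetical names for the “before” and “after” shapes of the model; the point is that the same #user_info works against both, because it only ever asks for department_name.

```ruby
# Hypothetical stand-in for a real Department model.
Department = Struct.new(:name)

# Before: a user has at most one department.
class SingleDepartmentUser
  attr_reader :name

  def initialize(name, department = nil)
    @name = name
    @department = department
  end

  def department_name
    @department && @department.name
  end
end

# After: a user has zero or more departments.
class MultiDepartmentUser
  attr_reader :name

  def initialize(name, departments = [])
    @name = name
    @departments = departments
  end

  def department_name
    @departments.map(&:name).join(", ")
  end
end

# Unchanged across both versions of the model.
def user_info(user)
  "Name: #{user.name}. Dept: #{user.department_name}"
end

accounting = Department.new("Accounting")
audit      = Department.new("Audit")

puts user_info(SingleDepartmentUser.new("Bob", accounting))
puts user_info(MultiDepartmentUser.new("Bob", [accounting, audit]))
```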

By adhering to the Law of Demeter, we have decreased coupling, and increased the velocity with which we can make changes to the business logic.

Objection #1: What about method chains?

“But Avdi,” you may object, “it sounds like a good guideline, but clearly it’s not something to be rigidly adhered to in Ruby code. If we followed it all the time we could never do method chaining!”

Method chains are a core Ruby idiom, to be sure. As an example, here’s a method which takes a string and generates a “slug” for use as an identifier or as a URL component:

def slug(string)
  # tr-style character sets take no brackets; '^a-z0-9' means "anything but a-z and 0-9"
  string.strip.downcase.tr_s('^a-z0-9', '-')
end

That’s one, two, three levels of method call. That can’t comply with the Law of Demeter, but it surely is concise and convenient!

Look again at the definition of the Law: it never says anything about the number of methods called, or the number of objects a method uses. It is strictly concerned with the number of types a method deals with.

The #slug method expects a String, and calls three methods, each one returning… another String. In fact, because it only calls methods for the type of object (String) passed into it as parameters, we find that this method complies perfectly with the Law of Demeter.

Likewise with another common Ruby pattern, chains of Enumerable methods like #map and #select. Because each returns another Enumerable object, there is no violation.
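A quick way to convince yourself is to print the class of each intermediate result. The sketch below does that for a slug-style chain and for a small Enumerable chain; short_upcased_names is a hypothetical method invented for this illustration.

```ruby
# A slug-style chain; the character set is written without brackets,
# since tr-style sets do not use them.
def slug(string)
  string.strip.downcase.tr_s('^a-z0-9', '-')
end

# Hypothetical example of an Enumerable chain.
def short_upcased_names(names)
  names.select { |name| name.length <= 4 }.map(&:upcase).sort
end

input = "  Hello, World!  "
[input, input.strip, input.strip.downcase, slug(input)].each do |step|
  puts "#{step.class}: #{step.inspect}"   # every step is a String
end

names = %w[bob alexander jack al]
[names, names.select { |n| n.length <= 4 }, short_upcased_names(names)].each do |step|
  puts "#{step.class}: #{step.inspect}"   # every step is an Array
end
```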

Objection #2: Delegation explosion

Another objection to Demeter is that strictly following it results in objects which are full of attributes which aren’t a direct part of their responsibility. Quoting Mark Wilden in the comments on my previous article:

Why should a Human have to know whether a Country has a name? Or any other attribute (unless it needs them itself)? If a Human is associated with multiple Countries (birth, residence, voting, vacation, etc.) does it then have to duplicate this delegation for each method of each country?

What about attributes that clearly have nothing to do with Human? Yes, one might say that a Human has a country_name. But does a Human have a country_population? A country_mortality_rate? I would say it does not, but Demeter insists that it must.

In the Object-Oriented view of the world, objects are not merely bags of attributes. They are entities to which you send messages and from which you receive replies. The classic example is a financial transaction: if I am a shopkeeper and you buy something from me, I don’t ask you for your wallet, rummage around until I find a credit card, and then copy down the information I need. Instead, I ask you for your credit card number and expiration date.

Putting this in object terms, a payment system which calls person.wallet.credit_cards.first.number exhibits tight structural coupling, and is closer in spirit to the data-structure-oriented programming which preceded OO. From an objects-sending-messages standpoint, it is perfectly reasonable for a Person to have a credit_card_number.
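One possible arrangement is sketched below. The Wallet, CreditCard, and payment_line names are hypothetical; the point is only that the payment code asks the Person a question and never learns that a wallet exists.

```ruby
# Hypothetical models for illustration.
CreditCard = Struct.new(:number, :expires_on)

class Wallet
  def initialize(credit_cards)
    @credit_cards = credit_cards
  end

  # The wallet decides which of its own cards to offer.
  def credit_card_number
    card = @credit_cards.first
    card && card.number
  end
end

class Person
  def initialize(wallet)
    @wallet = wallet
  end

  # The shopkeeper asks the person; the person consults the wallet.
  def credit_card_number
    @wallet.credit_card_number
  end
end

# Payment code is coupled only to Person's immediate interface.
def payment_line(person)
  "Charging card #{person.credit_card_number}"
end

puts payment_line(Person.new(Wallet.new([CreditCard.new("4111-0000", "12/30")])))
```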

An important and often neglected point to hit on, before we move on: in an Object-Oriented system, it is perfectly allowable (and even encouraged) for objects to have personas or facets. A doctor deals with a patient’s physical symptoms while the reception desk deals with her wallet and insurance info. You wouldn’t walk into a doctor’s office, step up to the receptionist, and take off your shirt (unless you were on very good terms with the receptionist!).

Likewise, an object can have a large API, but only expose subsets of that API to different collaborators. Some languages enforce these subset relationships quite strictly; e.g. C++ with its private inheritance, and interfaces in Java. In other languages, such as Ruby, the restriction may be more about convention than something the language enforces. There’s nothing wrong with having a large API, so long as individual collaborators only talk to well-defined subsets of it.
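In Ruby, one lightweight way to hand a collaborator just a facet is a small wrapper built with the standard library’s Forwardable module. This is only a sketch of the idea: Patient, BillingFacet, and ClinicalFacet are hypothetical names, and nothing stops a determined caller from reaching the underlying object.

```ruby
require "forwardable"

# Hypothetical object with a fairly wide API.
class Patient
  attr_reader :symptoms, :insurance_id, :balance_due

  def initialize(symptoms, insurance_id, balance_due)
    @symptoms = symptoms
    @insurance_id = insurance_id
    @balance_due = balance_due
  end
end

# The facet the reception desk talks to.
class BillingFacet
  extend Forwardable
  def_delegators :@patient, :insurance_id, :balance_due

  def initialize(patient)
    @patient = patient
  end
end

# The facet the doctor talks to.
class ClinicalFacet
  extend Forwardable
  def_delegators :@patient, :symptoms

  def initialize(patient)
    @patient = patient
  end
end

patient   = Patient.new(["cough"], "INS-123", 40)
reception = BillingFacet.new(patient)

puts reception.insurance_id
puts reception.respond_to?(:symptoms)   # the receptionist cannot ask about symptoms
```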

But what about Mark’s example human.country_mortality_rate? Surely that’s pushing it a bit far?

Perhaps it is. But Demeter doesn’t prevent us from interacting with an object’s second- and third-order associations; it simply asserts that we can’t interact with all of those objects in the same method. Look again at the formulation of the law:

…all objects to which M sends a message…

Demeter is a rule about methods only; it does not limit the set of types a class can interact with.

So this is perfectly legal:

class StatPresenter
  def human_stats(human)
    "Age: #{human.age}.\nCountry stats:\n#{country_stats(human.country)}"
  end

  def country_stats(country)
    "  Mortality rate: #{country.mortality_rate}"
  end
end

Of course, you could completely violate the spirit of Demeter by taking this too far; something the authors of the Demeter paper note. Realistically, we’d probably want to break that StatPresenter class up into smaller classes once it started interacting with many different types of object.

The important thing, from the standpoint of Demeter, is to avoid tying a single method to a deep hierarchy of types, and to limit the number of types one method deals with.

One of the most basic ways we can limit the number of types a given method must be aware of is to eliminate the common case of “maybe nil” parameters. Remember, NilClass is a type too, and when a parameter might be nil we’ve increased the number of types the method has to know about by one.

As an example, the following version of the code above, while technically Demeter-compliant, is once again riddled with #try calls:

class StatPresenter
  def human_stats(human)
    "Age: #{human.age}.\nCountry stats:\n#{country_stats(human.country)}"
  end

  def country_stats(country) # country may be nil
    "  Population: #{country.try(:population)}\n" +
    "  Mortality rate: #{country.try(:mortality_rate)}\n"
  end
end

The set of types #country_stats deals directly with is: StatPresenter (self), Country, and NilClass.

We can’t always get rid of switching on nil entirely, but Demeter-influenced code makes it easy to confine that switch to a single location. Let’s rewrite the code above:

class StatPresenter
  def human_stats(human)
    "Age: #{human.age}." + (human.country ?
      "\nCountry stats:\n#{country_stats(human.country)}" :
      "\n(No Country Stats)")
  end

  def country_stats(country)
    "  Population: #{country.population}\n" +
    "  Mortality rate: #{country.mortality_rate}\n"
  end
end

With this final edit, we’ve reduced the coupling of each method to a minimum. StatPresenter#human_stats deals only with Human objects, and all it knows about #country is that it may or may not be there. StatPresenter#country_stats only knows about Country objects.
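Here is the same presenter exercised end to end, as a sketch. Human and Country are hypothetical Struct stand-ins for real models; the presenter itself follows the version above.

```ruby
# Hypothetical stand-ins for real models.
Country = Struct.new(:population, :mortality_rate)
Human   = Struct.new(:age, :country)

class StatPresenter
  # The only place that has to care whether a country is present.
  def human_stats(human)
    "Age: #{human.age}." + (human.country ?
      "\nCountry stats:\n#{country_stats(human.country)}" :
      "\n(No Country Stats)")
  end

  # Always receives a real Country, so no nil checks are needed here.
  def country_stats(country)
    "  Population: #{country.population}\n" +
    "  Mortality rate: #{country.mortality_rate}\n"
  end
end

presenter = StatPresenter.new
puts presenter.human_stats(Human.new(30, Country.new(1000, 8)))
puts presenter.human_stats(Human.new(30, nil))
```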

Bringing Demeter to work

“OK, fine. I can see that the Law of Demeter is a great guideline, at least in theory. But who has time to do all that refactoring? I have deadlines to meet!”

While refactoring code to comply with Demeter can certainly improve its design, I don’t think Demeter becomes truly practical until you incorporate it consistently into your coding style. Like many low-level “code construction” techniques—such as good variable naming—its value lies less in coming in and applying it after the fact, and more in practicing it until it becomes second nature.

Let’s take a look at how we’d add the department name to the #user_info method using TDD and the Law of Demeter. Here’s the code before adding the new functionality:

def user_info(user)
  "Name: #{user.name}"
end

Now let’s add the department name.

  1. We write our test first:

    describe '#user_info' do
      subject { user_info(user) }
      let(:user) { stub('user', :name => "Bob", :department_name => "Accounting") }
      specify {
        subject.should match(/Dept: Accounting/)
      }
    end
    

    We know that nested mock/stub objects are a smell indicating structural coupling, so we force ourselves to write a stub for the user object we wish we had. The User#department_name method doesn’t exist yet; we make a mental note to implement it. If we forget, the omission will be caught by our acceptance and/or integration tests.

  2. We run the spec. It fails, because we haven’t implemented it yet.
  3. We write enough code to make the test pass:

    def user_info(user)
      "Name: #{user.name}. Dept: #{user.department_name}"
    end
    
  4. We run the tests again, and this time they pass.
  5. The final step is to implement the User#department_name method. We could write a test asserting that the method delegates to Department; personally, I find this a little redundant and would just write the delegation and call it done:

    class User
      delegate :name, :to => :department, :prefix => true, :allow_nil => true
      # ...
    end
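The same flat-stub idea can be tried without RSpec. In the sketch below a Struct plays the role of the stub; StubUser is a hypothetical name, and the check is a bare raise rather than a real spec.

```ruby
# Hypothetical flat stand-in for stub('user', ...): one level deep, no nested stubs.
StubUser = Struct.new(:name, :department_name)

def user_info(user)
  "Name: #{user.name}. Dept: #{user.department_name}"
end

user = StubUser.new("Bob", "Accounting")

# Plays the role of: subject.should match(/Dept: Accounting/)
raise "expected the department name in the output" unless user_info(user) =~ /Dept: Accounting/
puts "ok"
```

Because the stand-in is only one level deep, changing how departments are modelled later leaves this check untouched.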
    

Revisiting your code to make it Demeter-compliant after the fact will indeed slow you down. By incorporating the rule into your habits, to the point that it becomes second nature, you reduce the impact (if any) to the point where it becomes insignificant. This is especially true in Ruby and Rails, where techniques such as composition-and-delegation, viewed as “heavyweight patterns” in some languages, become one-liners. And any fractional slowdown you do experience from an extra test run here and there will be more than made up for by the ease of changing your loosely-coupled code as requirements change. With discipline and practice, it is possible to be both fast and good.

Conclusion

To summarize:

  • #try is more often than not indicative of structural coupling. Structural coupling, in turn, violates the DRY principle.
  • Structural coupling, left unchecked, can substantially slow the evolution of a project.
  • The Law of Demeter, which sets limits on the number of types a single method can interact with, is a heuristic for identifying code that (among other positive properties) has low structural coupling.
  • When we refactor our code to comply with the Law of Demeter, structural coupling tends to drop both in application code and in tests. As a side effect, the need for #try calls and similar constructs tends to disappear.
  • Contrary to popular belief, Demeter does not limit the number of dots in a method call chain. It also doesn’t limit the number of classes a class can interact with.
  • The best way to incorporate Demeter into your work is to make it a habit, rather than a cleanup chore.

Do you look for Demeter violations in your code? Do you think there are still some instances where a #try makes sense? Do you have more questions about the Law of Demeter or structural coupling? As always, I welcome feedback in the comments!

P.S. It’s my birthday! To celebrate, for 24 hours I’m offering 50% off on my book, Exceptional Ruby. Use code HAPPY0X1F to get the discount.