Rubies in the Rough

This is where I try to teach how I think about programming.

May 1, 2012

Single Method Classes

[Update: I've changed my mind about some of the following due to this excellent counter argument.]

In the words of Dennis Miller, "I don't want to get off on a rant here, but…"

There's something that drives me crazy and I see it in so much Ruby code. I see it in the documentation for our key projects; I see Rubyists of all skill levels doing it; it's just everywhere.

Let's talk about when the use of a Class is and is not appropriate.

The Chained new()

Here's an example of one form of code that bugs me:

class Adder
  def add(n)
    40 + n
  end
end
p Adder.new.add(2)

The problem here is that a Class has been used, probably because as Rubyists that's always our default choice, but it's the wrong fit for this code. A Class is for state and behavior. The example above is just using behavior. No state is maintained.

A handy tip for sniffing out this problem is watching for a call to new() in the middle of method chaining as we have here. If you always use a Class like that, it's not really a Class. Put another way, if an instance never gets assigned to a variable, something has likely gone wrong with the design.

Here's a slightly trickier version:

class Adder
  def initialize(n)
    @n = n
  end

  def add(n)
    @n + n
  end
end

Does this code have the same problem? It depends on how it gets used. If it's always used with a chained new() then it's the same thing we looked at above:

p Adder.new(40).add(2)

State isn't really being tracked here. It's just an illusion.

However, it may be fine if the Adder is sometimes used like this:

forty = Adder.new(40)
puts "40 + 1 = #{forty.add(1)}"
puts "40 + 2 = #{forty.add(2)}"
puts "40 + 3 = #{forty.add(3)}"

Note the variable assignment. As I said before, this is our hint that things are on the up and up. In this case, some value is locked-in so we can then run some experiments on it. That initial value is state, so this is a fine use for a Class.

This isn't the only form of the problem we need to watch out for though.

The Unused new()

Some programmers see my first example and think, "No problem. Let's remove the new()." They will often change the code to be something like this:

class Adder
  def self.add(n)
    40 + n
  end
end
p Adder.add(2)

This isn't any better. Yes, the chained call to new() is gone. However, now we have a new problem. What does this mean?

what_am_i = Adder.new

A Class is a factory for manufacturing instances of some type. With this code, I am still allowed to make those instances, but they are completely pointless. There's no meaningful state or behavior for my new instance.

We could use some Ruby tricks to make new() uncallable, but that's not really better. Introspection would still tell us that Adder is a Class. That's a lie. It's not for manufacturing instances.
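To make that concrete, here's a minimal sketch of one such trick. I'm assuming private_class_method() as the way to hide new(); it's only one option, and this variation on Adder is just for illustration:

```ruby
# Hypothetical variation on Adder: hide new() instead of dropping the Class.
class Adder
  private_class_method :new

  def self.add(n)
    40 + n
  end
end

p Adder.add(2)             # => 42
p Adder.respond_to?(:new)  # => false
p Adder.class              # => Class

begin
  Adder.new
rescue NoMethodError => error
  # The call fails, yet introspection above still reports a Class.
  puts "new() is hidden: #{error.class}"
end
```

The instances are gone, but the reader is still told to expect them.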

That's why this problem bugs me when I see it in code, by the way. Remember, the primary purpose of code is to communicate with the reader. Always. Period. Code that uses Classes in this way does not do that well. When I see a Class, I am thinking about the instances it will be used to manufacture. If that's not its purpose, I have been led astray. We want our code to communicate better.

The One Word Fix

We have multiple choices for how to improve code like this, when we see it. The easiest is to make a one word change to my last example:

module Adder
  def self.add(n)
    40 + n
  end
end
p Adder.add(2)

Now we are working with a Module instead of a Class. The rules have changed. A Module is not used to manufacture instances. It can't do that. It's not really for maintaining state (though I admit that there are minor exceptions to this).
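A quick sketch of what changes, using the same Adder as above. Nothing here is new API; it just asks Ruby what Adder is and tries to build an instance:

```ruby
module Adder
  def self.add(n)
    40 + n
  end
end

p Adder.class              # => Module
p Adder.respond_to?(:new)  # => false

begin
  Adder.new
rescue NoMethodError => error
  # A plain Module has no new() at all, so there is nothing to hide.
  puts "No instances here: #{error.class}"
end
```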

The important point is that I won't be thinking about instances when I read this code. It communicates better.

Is it the best we can do though?

Dual Natured

While a Module is not an instance factory, it does fill multiple roles in Ruby. This isn't your fault. A Module is a bit of an overloaded concept in the language.

First, a Module is often used as a namespace in Ruby. It can group together related constants and methods. We are fine with that meaning here. It just says that the add() method is part of the Adder namespace. No problem.

A Module is also a place to hang methods that don't belong on a Class, typically because they don't maintain any state. Conversion methods are a great example of this. Think of ERB::Util.h(). Again, this meaning is fine. We actually switched to a Module for exactly this reason.

Finally, a Module can be used as a mix-in in Ruby. This essentially allows plugging it into any other scope. It will add its methods and constants to that scope. This last meaning is a bit of a problem for our "fixed" code. The reason is that it's possible to write this code using our last example:

class SomeMathyThing
  include Adder
end

This is similar to our unused new() problem. While we can mix this Module in (because you can do that with any Ruby Module), it has no meaning. Only instance-level methods and constants matter when mixing a Module in and we didn't define either of those.
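Here's a small sketch that shows the mix-in doing nothing. It reuses the Adder and SomeMathyThing names from above; the thing variable is just for illustration:

```ruby
module Adder
  def self.add(n)
    40 + n
  end
end

class SomeMathyThing
  include Adder
end

thing = SomeMathyThing.new

p SomeMathyThing.include?(Adder)  # => true
p Adder.instance_methods          # => []
p thing.respond_to?(:add)         # => false
p thing.respond_to?(:add, true)   # => false (not even as a private method)
```

The include succeeds, but the instance gains nothing from it.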

I want to stress that this isn't your fault. This is a quirk of how Ruby is designed. Switching to a Module is still an acceptable fix.

The question is, could we do even better? Could we bring our code in line with all three meanings of Ruby's Module? If we can, it might just communicate even better. That's a noble goal, so let's try it.

It turns out that Ruby includes a not-too-well-known tool just for this reason. A Module that uses this tool can be used both as a bag of methods and a mix-in. For example, take a look at the built-in Math Module:

module SomeMathyThing
  extend Math

  def self.do_math
    sqrt(4)
  end
end
p Math.sqrt(4)
p SomeMathyThing.do_math

As you can see we can call methods on Math directly and we can also add Math to other scopes when it makes sense. It fits all three definitions of a Module in Ruby.

The Kernel Module also behaves this way. If you were in some context that defined a p() method, you could still get to the one in Kernel if you needed it. The code for that would be ::Kernel.p(). The leading :: forces the constant lookup to happen from the top level, where Kernel lives. That ensures you will get the right Kernel, even if another constant with the same name is available in your current scope. Then we just call p() on the Module normally.
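As a rough sketch of that situation, assume a hypothetical Printer Class that defines its own p(). The Class and its method names are made up for this example:

```ruby
class Printer
  # A local p() that shadows Kernel's version inside this Class.
  def p(object)
    "custom p: #{object}"
  end

  def custom(object)
    p(object)            # resolves to Printer#p
  end

  def builtin(object)
    ::Kernel.p(object)   # Module-level call: prints object.inspect and returns object
  end
end

printer = Printer.new
puts printer.custom(42)  # prints "custom p: 42"
printer.builtin(42)      # prints "42"
```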

This is done with the help of module_function() and we can use it ourselves:

module Adder
  module_function

  def add(n)
    40 + n
  end
end

module SomeMathyThing
  extend Adder

  def self.do_add
    add(2)
  end
end

p Adder.add(2)
p SomeMathyThing.do_add

Notice that I just defined normal instance-level methods after calling module_function(). module_function() is like Ruby's public() and private() in that it will affect methods that follow, when used without an argument. (It does support passing arguments to affect specific methods as well.)
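Here's a minimal sketch of the argument form. The subtract() method is hypothetical and only exists to show a method that is left untouched:

```ruby
module Adder
  def add(n)
    40 + n
  end

  # Not passed to module_function() below, so it stays a plain instance method.
  def subtract(n)
    40 - n
  end

  module_function :add
end

p Adder.add(2)                  # => 42
p Adder.respond_to?(:subtract)  # => false
p Adder.instance_methods        # => [:subtract]
```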

Controlling Access

Transforming a method with module_function() does a couple of things. First, the instance level methods are copied up to the Module level. This is what allows for the dual interface.
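"Copied" is meant literally. In this sketch I reopen Adder and redefine add() without module_function(), purely to show that the Module-level copy is independent of the instance-level method:

```ruby
module Adder
  module_function

  def add(n)
    40 + n
  end
end

p Adder.singleton_methods  # => [:add]

module Adder
  # Reopened without module_function(): only the instance-level method changes.
  def add(n)
    100 + n
  end
end

p Adder.add(2)                     # => 42 (the copy is untouched)
p Object.new.extend(Adder).add(2)  # => 102
```

You wouldn't normally redefine a method like this. It's just the easiest way to see the two copies.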

Another effect of module_function() is that the instance level methods are made private(). This prevents them from adding to any external interface when they are used as a mix-in. This is also why things like Kernel's p() cannot be called with a receiver, except in the Module level access case I showed earlier. In other words, this code will raise a "private method called" error even with our last example:

Object.new.extend(Adder).add(2)
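If you want to watch that happen without crashing your script, here's a small sketch. The object variable is just for illustration, and the exact error message varies between Ruby versions:

```ruby
module Adder
  module_function

  def add(n)
    40 + n
  end
end

object = Object.new.extend(Adder)

begin
  object.add(2)
rescue NoMethodError => error
  puts error.message  # mentions that add() is private
end

# The method is still there; it just isn't part of the public interface.
p object.send(:add, 2)  # => 42
p Adder.add(2)          # => 42
```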

This is usually desirable, but if you have a case where you would prefer to control access for the methods you define, you're probably going to lock horns with module_function().

There's a trick you can use to get around even that. Check out this example:

module AccessControlled
  extend self

  def share_me
    hide_me.sub("private", "public")
  end

  private

  def hide_me
    "I am a private interface."
  end
end

p AccessControlled.share_me
p Object.new.extend(AccessControlled).share_me

The magic line is extend self which mixes the Module into itself at the Module level. This gives the same kind of dual interface we had before, but it also respects the access control we have set up for the methods. You can see that share_me() does add to the public interface of objects it is mixed into. Also, both of the following lines are errors, assuming the code above:

AccessControlled.hide_me
Object.new.extend(AccessControlled).hide_me
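Here's a sketch that runs both of those calls and records what happens. The attempts and failures names are only for this example:

```ruby
module AccessControlled
  extend self

  def share_me
    hide_me.sub("private", "public")
  end

  private

  def hide_me
    "I am a private interface."
  end
end

attempts = [
  -> { AccessControlled.hide_me },
  -> { Object.new.extend(AccessControlled).hide_me }
]

# Collect the error class raised by each attempt.
failures = attempts.map do |attempt|
  begin
    attempt.call
  rescue NoMethodError => error
    error.class
  end
end

p failures                   # => [NoMethodError, NoMethodError]
p AccessControlled.share_me  # => "I am a public interface."
```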

I tend to reach for module_function() first, preferring to keep as much private() as I can get away with. If I find myself in a situation where I need more control over visibility though, I'll switch to extend self.

The One True Interface

I want to cover one last semi-related point.

A lot of APIs ask us to pass an object that responds to a certain method. If you are expecting just one method, make it call().

I think the downside to this approach is that programmers worry it's not as expressive. While that's the right concern to have, I don't think it's as bad as we think. It basically comes down to this code: validater.valid?(whatever) versus validater.call(whatever). While the former does clearly express what is happening, the latter still comes out pretty OK in plain English: "Mr. Validater, I am now calling upon you to do your job."

Plus there's just too much gain for going with call(). It's already been decided: call() is the preferred one method interface in Ruby. This makes you compatible with Proc (and lambda()) and Rack, just to give some key examples. That allows me to skip defining a Class/Module altogether if my needs are simple or possibly to serve whatever I am defining in a Web application. That's just because you chose to go with call(). Standard interfaces are handy.
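A minimal sketch of what that buys you. The check() method, the NotEmpty Module, and the short and contains variables are all hypothetical; the point is only that anything with a call() method fits the same slot:

```ruby
# A hypothetical API that expects just one method: call().
def check(validater, whatever)
  validater.call(whatever)
end

# A Module works when the logic deserves a name...
module NotEmpty
  module_function

  def call(whatever)
    !whatever.empty?
  end
end

# ...a lambda works when the need is simple...
short = ->(whatever) { whatever.size < 10 }

# ...and so does a Method object pulled off of an existing object.
contains = "Rubies in the Rough".method(:include?)

p check(NotEmpty, "Ruby")   # => true
p check(short, "Ruby")      # => true
p check(contains, "Rough")  # => true
```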

If I haven't swayed you, at least support both, with some code like this:

if validater.respond_to?(:valid?)
  validater.valid?(whatever)
else
  validater.call(whatever)
end

Help Me End This Bad Habit

You now know that every time I see these single method stateless Classes in the wild I die a little inside. Have pity on my aging heart! Here's what you can do to help:

  • Check your code. I know you were reading this whole article thinking, "I don't do that." However, you probably do. Find code like this and improve it.
  • Ignore bad documentation. Whenever a Ruby project's documentation tells you to define these silly Classes, try it my way first. It pretty much always works just fine.
  • Fix the code and documentation of others. Start sending patches. Share a link to this article when you do. Let's make this less common. I'll live longer.
Comments (3)
  1. James Edward Gray II

    This example was brought to my attention, but the top version doesn't bother me here. I view this as a Method Object and see part of the state being managed here as the process itself. Because of that, I think the object is fine.

  2. Iain Hecker, May 2nd, 2012

    You could formalize this pattern (just like singleton), like this: https://gist.github.com/2581027

  3. Gregory Brown, May 16th, 2012

    James and I have been discussing related concepts to this article over on Practicing Ruby, and we've found there is a lot of overlap (and a fair bit of disagreement!) between our two perspectives on things.

    I agree that the chained new() and unused new() are bad, but disagree with the use of modules as a suitable replacement, in particular the use of module_function. Unfortunately, Ruby has no great solution to this problem, so it's more of a game of "what's least ugly in this scenario?", which is dependent on context.

    See this Practicing Ruby article for my take on things. The comments there are available to subscribers only, but I'd be happy to continue discussing the topic here with anyone interested :)
