Sunday, August 8, 2010

Ways to modularize your Ruby code

In this post, I'll recommend several ways to properly organize code in Ruby projects. I'll also explain my reasoning and how I arrived at each solution.

Much of this builds on the excellent work of others in the Ruby community, and I've linked to these other writeups as appropriate in case you want to know more.

This subject is of particular interest to anyone who is packaging a gem library, but it should be handy to anyone who wishes to organize code within any Ruby project.

Namespacing with modules

Let's say you've got an isolated piece of functionality that doesn't depend on anything else, such as a simple key (random string) generator. You intend to always call it with its full namespace, so you don't get it mixed up with similarly named functions. To make something that you can call using Rigatoni::KeyGenerator.generate_key(n), you'd use the following code.

module Rigatoni
  module KeyGenerator
    def self.generate_key(key_length = 5)
      puts "generate_key() called with length #{key_length}"
    end
  end
end

Note that I defined the function as self.generate_key(). The self keyword is subtle but crucial: it makes generate_key() a method of the module itself. Without it, generate_key() would be an instance method, and I'd have to include Rigatoni::KeyGenerator into a class before I could call it, which isn't what I want here.
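
To make the difference concrete, here is a minimal sketch that puts both kinds of method side by side. The random-string body, the CHARS constant, the generate_key_without_self method, and the Pasta class are all hypothetical names added for illustration; they are not part of the original example.

```ruby
module Rigatoni
  module KeyGenerator
    # Hypothetical character set, assumed for illustration.
    CHARS = (('a'..'z').to_a + ('0'..'9').to_a).freeze

    # Defined on the module itself: callable with the full namespace.
    def self.generate_key(key_length = 5)
      Array.new(key_length) { CHARS.sample }.join
    end

    # No self: an instance method, reachable only through an object
    # whose class includes this module.
    def generate_key_without_self(key_length = 5)
      Array.new(key_length) { CHARS.sample }.join
    end
  end
end

# Hypothetical class that mixes the module in.
class Pasta
  include Rigatoni::KeyGenerator
end

puts Rigatoni::KeyGenerator.generate_key(8).length
puts Rigatoni::KeyGenerator.respond_to?(:generate_key_without_self)
puts Pasta.new.generate_key_without_self(4).length
```

The module responds to generate_key directly, while generate_key_without_self only becomes available once some class includes the module.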

If I want to save myself a little typing while having some semblance of a namespace to qualify my method call, I can do this:

include Rigatoni
KeyGenerator.generate_key()

This is a simple and effective way to modularize a self-contained piece of Ruby code that's meant to be called independently.
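
One thing worth knowing about the shortcut above: a top-level include mixes the module into Object, so the shorter name becomes visible everywhere. The sketch below shows that, along with a constant alias as a narrower alternative. The alias name KeyGen and the placeholder method body are assumptions for illustration, not something from the original code.

```ruby
module Rigatoni
  module KeyGenerator
    def self.generate_key(key_length = 5)
      "x" * key_length # placeholder body, for illustration only
    end
  end
end

# Option 1 (from the text): a top-level include.
# This mixes Rigatoni into Object, so KeyGenerator resolves everywhere.
include Rigatoni
puts KeyGenerator.generate_key(3)
puts Object.include?(Rigatoni)

# Option 2 (an alternative, hypothetical name): a constant alias.
# This adds one short name without mixing anything into Object.
KeyGen = Rigatoni::KeyGenerator
puts KeyGen.generate_key(3)
```

Either option saves typing; the alias simply keeps the change smaller in scope.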

Extending functionality through modules

The most common use of Ruby modules I've seen is as mixins that extend the functionality of existing classes.

Even then, there are two ways to extend the functionality of a Ruby class with modules: instance methods and class methods. By default, including a module in your class definition will give you new instance methods.

If you want to define class methods in a module, you have to jump through some extra hoops.

module Moo
  module Ham
    def self.included(base)
      base.extend(ClassMethods)
    end

    module ClassMethods
      def foo
        puts "foo called"
      end
    end

    def bar
      puts "bar called"
    end
  end
end

class Bacon
  include Moo::Ham
end

The module we define above is Moo::Ham. Our dummy class, Bacon, includes it and picks up both a class method (foo) and an instance method (bar), which we can call with the following example code.

# Class method.
Bacon.foo

# Instance method.
b = Bacon.new
b.bar

This is a longtime Ruby idiom, and John Nunemaker unpacks this in his post, Include vs. Extend in Ruby.
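
To see what the included hook actually does, the sketch below repeats the Moo::Ham example and then inspects where each method ended up. The methods return strings instead of printing them, which is a change made here only to keep the checks easy to read.

```ruby
module Moo
  module Ham
    # Ruby calls this hook with the including class as base.
    def self.included(base)
      base.extend(ClassMethods)
    end

    module ClassMethods
      def foo
        "foo called"
      end
    end

    def bar
      "bar called"
    end
  end
end

class Bacon
  include Moo::Ham
end

puts Bacon.foo
puts Bacon.new.bar

# include put Moo::Ham on the instance side of Bacon...
puts Bacon.include?(Moo::Ham)
# ...and the hook put ClassMethods on Bacon's singleton class.
puts Bacon.singleton_class.include?(Moo::Ham::ClassMethods)
# foo is a class method only; instances do not respond to it.
puts Bacon.new.respond_to?(:foo)
```

The single include line triggers both steps: the ordinary mixin for instances, and the extend that the hook performs on the class itself.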

Namespaced classes (for state-dependent modularity)

In the examples up until now, we've dealt only with methods that could be run independently without needing something already in place.

Let's say all you're using to modularize your code is Ruby modules. Then you start to notice that a lot of the methods have the same argument passed in. Either that, or you find yourself setting up or populating some variable again and again in order to perform the work.

module Moo
  module Ham
    def self.some_func_foo(access_key, api_key, x)
      # Set things up with access_key and api_key.

      # Perform work with x.
    end

    def self.some_func_bar(access_key, api_key, y)
      # Set things up with access_key and api_key.

      # Perform work with y.
    end

    def self.some_func_baz(access_key, api_key, z)
      # Set things up with access_key and api_key.

      # Perform work with z.
    end
  end
end

When you start to notice these things, it's time to turn your module into a class.

module Moo
  class Ham
    def initialize(access_key, api_key)
      @access_key = access_key
      @api_key = api_key

      # Do other stuff here to set up what you need.
    end

    def some_func_foo(x)
      # Perform work with x.
    end

    def some_func_bar(y)
      # Perform work with y.
    end

    def some_func_baz(z)
      # Perform work with z.
    end
  end
end

This keeps us from duplicating code. Note that we end up having to instantiate a class because the methods depend on the initial setup work being done, but our code is leaner and meaner this way.

# Old way. Gross.
Moo::Ham.some_func_foo(access_key_one, api_key_one, x)
Moo::Ham.some_func_bar(access_key_one, api_key_one, y)
Moo::Ham.some_func_baz(access_key_one, api_key_one, z)

# New way. Nice.
mh = Moo::Ham.new(access_key_one, api_key_one)
mh.some_func_foo(x)
mh.some_func_bar(y)
mh.some_func_baz(z)

The major takeaway is that modules are not the only way to modularize our Ruby code.
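
As a runnable sketch of the class-based version, the example below fills in the placeholder bodies with something trivial. The @credentials variable and the string-returning method bodies are assumptions made for illustration; the real setup and work would be whatever your library needs.

```ruby
module Moo
  class Ham
    def initialize(access_key, api_key)
      @access_key = access_key
      @api_key = api_key

      # Hypothetical one-time setup shared by every method below.
      @credentials = "#{@access_key}:#{@api_key}"
    end

    def some_func_foo(x)
      "#{@credentials} foo #{x}"
    end

    def some_func_bar(y)
      "#{@credentials} bar #{y}"
    end
  end
end

# The keys are passed once, at construction time.
mh = Moo::Ham.new("access-1", "api-1")
puts mh.some_func_foo(10)
puts mh.some_func_bar(20)

# Forgetting the keys fails immediately, at construction time.
begin
  Moo::Ham.new
rescue ArgumentError => e
  puts "ArgumentError: #{e.message}"
end
```

Because the constructor requires both keys, a missing credential shows up when the object is created instead of deep inside a later method call.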

Handling dependencies

Other times, you'll have methods that aren't state-dependent and don't belong in a class, but that have a different kind of dependency: they need another library, such as an external gem, to be loaded first.

Say you have a module, Foo, which you define in a file called foo.rb.

# foo.rb
module Foo
  ABACAB = "abacab"

  def print_foo
    puts "foo"
  end
end

Then say you have another module, Bar, which depends on Foo.

# bar.rb
module Bar
  def self.bar
    require_relative 'foo'
    include Foo

    puts ABACAB
    print_foo()
  end
end

You try to be a good citizen by pulling in Foo only when you need it. So now you're ready to run Bar.bar() from another script, run_bar.rb.

# run_bar.rb
require_relative 'bar'

Bar.bar()

But then you run it, and you get baffling, mixed results. The line referencing the constant ABACAB runs perfectly fine, but the call to print_foo() raises a NoMethodError. Let's move the require and include of Foo outside Bar's module definition and see what happens.

# bar.rb
require_relative 'foo'
include Foo

module Bar
  def self.bar
    puts ABACAB
    print_foo()
  end
end

When we run run_bar.rb again, we see that this works for us.

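It seems a little wasteful to pull in Foo up front, but the experiment shows that where the include happens matters. Inside self.bar, include Foo mixes Foo into Bar: that makes Foo's constants visible from Bar, but it adds print_foo only as an instance method for objects that include Bar, not as a method of Bar itself. A top-level include mixes Foo into Object, so print_foo becomes callable everywhere, including inside Bar.bar. The simplest fix, then, is to pull in Foo at the top, outside the module definition and outside the method definition.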

Besides, if we're pulling in bar.rb, our real intent is to go after the full functionality that Bar provides. The functionality that Bar gives us depends on Foo anyway, and won't work at all in its absence. We're not really being wasteful.
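
For readers who want to see the mechanics in a single file, here is a sketch that contrasts include with extend inside a module. The module names WithInclude and WithExtend are hypothetical, print_foo returns its string instead of printing it, and using extend is an alternative to the top-level include shown above, not what the original code does.

```ruby
module Foo
  ABACAB = "abacab"

  def print_foo
    "foo"
  end
end

# include adds Foo's methods for objects that mix in WithInclude,
# not for the WithInclude module object itself.
module WithInclude
  include Foo

  def self.run
    print_foo()
  end
end

# extend adds Foo's methods to the WithExtend module object itself.
# Constants are not affected by extend, so ABACAB is fully qualified.
module WithExtend
  extend Foo

  def self.run
    [Foo::ABACAB, print_foo()]
  end
end

begin
  WithInclude.run
rescue NoMethodError => e
  puts "include inside the module: #{e.class}"
end

p WithExtend.run

# Unlike a top-level include, nothing was mixed into Object here.
p Object.include?(Foo)
```

With extend, the dependency stays scoped to the one module that needs it, at the cost of writing Foo::ABACAB in full.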

Summary

There are two major ways to modularize our Ruby code: modules and classes. Despite the name, modules are not the only way to modularize. Use classes if the proper behavior depends on state.

Start with a module by default. If you find yourself creating too many methods that take in the same parameter again and again, turn your module into a class which populates the initial values when you instantiate it.

When a module depends on other libraries, pull these other libraries in at the top of the file, outside the module definition. As the Bar example showed, an include buried inside a module method won't make the library's methods callable on the module itself.
