Dynamic Models with Ruby and MongoDB

Introduction

Over the past twelve years, most of my development has involved statically typed languages and relational databases. Over the last year or so, among other things, I’ve delved into Ruby and MongoDB and as with many others, enjoyed the ease and speed with which these can be used to create applications. Furthermore, I’ve found the new perspectives invaluable to my development skills as a whole.

What I’m going to show here is a specific flexibility that Ruby (or any other dynamic language) and MongoDB can provide compared to the statically typed/relational DB environment.

The functionality I’ll demonstrate is already provided in Ruby ODMs for Mongo such as MongoMapper and Mongoid, so it’s not a revelation. This exercise is more a diary entry of my own “Aha!” moment.

The main disclaimer I’ll make is that I’m neither a Ruby or MongoDB expert, so go ahead and feel free to comment.

A Simple Ruby Model

So let’s say you’ve come straight over from cutting C# or Java and you decide you’re going to write some class that persists it’s data into MongoDB. Something like this might result.

class Person
  attr_accessor :_id, :handle, :first_name, :surname

  def self.find(id, db_conn)
    data = db_conn.collection(:person).find_one({:_id => BSON::ObjectId(id)})
    return nil if !data
    Person.new data, db_conn
  end

  def initialize(data, db_conn)
    @db_conn        = db_conn
    @_id            = data['_id']
    @handle         = data['handle']
    @first_name     = data['first_name']
    @surname        = data['surname']
  end

  def save
    collection = @db_conn.collection :person
    if @_id
      collection.update({_id: @_id}, {'$set' => {handle: @handle, first_name: @first_name, surname: @surname}})
    else
      @_id = collection.insert({handle: @handle, first_name: @first_name, surname: @surname})
    end
  end

  def delete
    @db_conn.collection(:person).remove({_id: @_id})
    @_id = nil
  end
end

You can see here, we’ve got some instance variables, typical CRUD operations and a factory method for locating Person instances. Some validations and safety condition checking would be required of course, but I’ll leave this out at the moment for clarity.

Basic usage examples might look something like this.

connection = Mongo::Connection.new('mongo_url', 27017).db 'mongo_name'
connection.authenticate 'mongo_user', 'mongo_password'

person = Person.find '4eaefd00e4b085353830b53d', connection
person.handle = 'Nickname'
person.save

new_person = Person.new({handle: 'Dude', first_name: 'Jeffrey', surname: 'Lebowski'}, connection)
new_person.save

another_person = Person.find '4f54d02de4b091c3a04cd256', connection
another_person.delete

As a start, this is OK. It won’t give us too much trouble in a simple scenario. However, we’ve really just transplanted a statically typed paradigm and have foregone the great power available to us in this new dynamic environment. We are:

What if we wanted to store any old attributes with our person?

Well, we know straight away that as a document database, Mongo will let us store any key/value combinations in a record - it’s schema-less. We can also see from the example above that with the Ruby driver, we just pass and receive hashes (think C# dictionaries) to represent this data. What we need is flexibility in our Ruby model.

Hello Dynamic Language

The first thing that we need to be able to do is hydrate a class instance by creating instance variables from a hash. We wave goodbye to our statically typed mindset and rewrite our constructor like this.

def initialize(data, db_conn)
  data.each {|key, val| instance_variable_set "@#{key}",val}
  @db_conn = db_conn
end

Yes; that easy. This is one of the things I love about Ruby. I find that when I write (or copy) code that seems like it should work, most of the time it just does.

So now we can create a person by using a hash with any values, but of course our save method still only persists a fixed list of attributes. We need to be able to turn our instance variables back into a hash. This turns out to be a simple matter of introducing the following method.

# The keys have the leading '@' removed.
def instance_var_hash
  Hash[instance_variables.map {|var|
    name = var.to_s
    key = var[1..name.size]
    [key, instance_variable_get(var)]
  }]
end

Then our save method changes to use it thusly.

def save
  collection = @db_conn.collection :person
  data = instance_var_hash
  # Remove the keys that we don't want to persist.
  ['db_conn', '_id'].each {|key| data.delete key}
  if @_id
    collection.update({_id: @_id}, {'$set' => data})
  else
    @_id = collection.insert data
  end
end

This is a (limited as we’ll see) working dynamic class. We can create a Person instance with any extra attributes we want. We can save it as a Mongo document and we can retrieve it from the database having all of the attributes as instance variables, whatever they might be.

Now let’s say we want to use this model in an API that exchanges JSON. We’ve made it (almost too) easy for ourselves by handling our data with hashes. We just need to add the two methods below for returning JSON and for creating a Person instance from JSON.

def self.from_json(json, db_conn)
  data = JSON.parse json
  # Assume it came from the DB in the first place if _id is present.
  data['_id'] = BSON::ObjectId(data['_id']) if data.key? '_id'
  Person.new data db_conn
end

def to_json(*a)
  data = instance_var_hash
  data['_id'] = data['_id'].to_s if data.key? '_id' # Nicer representation.
  # We might also remove certain keys that we don't want to send as JSON...
  data.to_json
end

That’s it; we’re in business. If we knock up a Sinatra or Grape API, we can use this model to read from and persist to a MongoDB database.

Not So Fast

Although we can create, save and retrieve a Person instance with our cleverness, there’s (at least) one grave limitation to our model. If our usage requires direct access to new dynamic members that we’ve introduced, we’re out of luck.

new_person = Person.new({handle: 'Dude', first_name: 'Jeffrey', surname: 'Lebowski', living_friends: ['Walter', 'Donny']}, connection)
new_person.save
new_person.handle = 'The Dude' # This is OK.
new_person.living_friends = ['Walter'] # Boom!

The last line will throw a NoMethodError because we have no public accessor for the new instance variable that we created on-the-fly.

There are Ruby methods that enable us to dynamically add attribute accessors to a class, but we really just want to add them to individual instances. We can do this, but hold on to your hat.

Metaprogramming Magic

The technique below uses what’s known as a Metaclass. It comes courtesy of the enigmatic _why. To read more about the concept specifically, check out this article.

First the Metaclass. Don’t worry. It hurt my brain too at first.

class Object
  def metaclass
    class << self; self; end
  end
end

And now our new (and final for this post) constructor.

def initialize(data, db_conn)
  data.each do |key, val|
    var_name = "@#{key}"
    instance_variable_set var_name, val
    metaclass.send(:define_method, "#{key}=".to_sym) {|val| instance_variable_set var_name, val}
    metaclass.send(:define_method, key.to_sym) {instance_variable_get var_name}
  end
  @db_conn = db_conn
end

For each of the hash members, in addition to dynamically creating and setting the value of an instance variable, we’re creating a get/set instance method. Note: they’re actually class methods on the class that is the Metaclass for our instance, but let’s not split hairs.

To prove this does what I claim, try it out.

new_person = Person.new({handle: 'Dude', first_name: 'Jeffrey', surname: 'Lebowski', living_friends: ['Walter', 'Donny']}, connection)
new_person.handle = 'The Dude' # This is OK.
new_person.living_friends = ['Walter'] # This is OK now too.

another_person = Person.new({handle: 'Loner', first_name: 'John', surname: 'Smith'}, connection)
puts another_person.living_friends # NoMethodError, just like we expect.

And finally, our finished product. There’s more that we could do with this, such as drawing out the now-generic methods into a base class. Another thing might be validation that enforces a list of mandatory fields; but I’ll leave that as an exercise for the reader.

require 'mongo'
require 'json'

class Object
  def metaclass
    class << self; self; end
  end
end

class Person
  def self.from_json(json, db_conn)
    data = JSON.parse json
    # Assume it came from the DB in the first place if _id is present.
    data['_id'] = BSON::ObjectId(data['_id']) if data.key? '_id'
    Person.new data db_conn
  end

  def self.find(id, db_conn)
    data = db_conn.collection(:person).find_one({:_id => BSON::ObjectId(id)})
    return nil if !data
    person = Person.new data, db_conn
  end

  def initialize(data, db_conn)
    data.each do |key, val|
      var_name = "@#{key}"
      instance_variable_set var_name, val
      metaclass.send(:define_method, "#{key}=".to_sym) {|val| instance_variable_set var_name, val}
      metaclass.send(:define_method, key.to_sym) {instance_variable_get var_name}
    end
    @db_conn = db_conn
  end

  def save
    collection = @db_conn.collection :person
    data = instance_var_hash
    # Remove the keys we don't wan't to persist.
    ['db_conn', '_id'].each {|key| data.delete key}
    if @_id
      collection.update({_id: @_id}, {'$set' => data})
    else
      @_id = collection.insert data
    end
  end

  def delete
    @db_conn.collection(:person).remove({_id: @_id})
    @_id = nil
  end

  def to_json(*a)
    data = instance_var_hash
    data['_id'] = data['_id'].to_s if data.key? '_id' # Nicer representation.
    # We might also remove certain keys that we don't want to send as JSON...
    data.to_json
  end
end

I think you’ll agree; it really does tie the room together.