Saturday, January 31, 2009

Ruby: Adding the comparison operator (<=>) to the Symbol class

Hi guys/gals,

I'm working on some Ruby at the moment.
The code I'm writing uses hash maps and arrays to store word associations. To lower my memory usage, I decided to use symbols. With symbols, every use of the same word shares one symbol's worth of memory, instead of taking up a string's worth of memory per occurrence.

For example, 800 strings of "a" use a lot more memory than 800 references to the symbol :a, because the symbol is shared. If I used strings, I would have 800 independent copies of "a".
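Here is a small sketch of that sharing, counting distinct objects rather than measuring memory directly. The variable names are just for illustration, and "a".dup is used so that each array slot definitely gets its own String object:

```ruby
# 800 references to :a all point at the same Symbol object.
symbols = Array.new(800) { :a }
# The block runs once per slot, so this builds 800 separate String objects.
strings = Array.new(800) { "a".dup }

# Count how many distinct objects each array actually holds.
symbol_ids = symbols.map { |s| s.object_id }.uniq
string_ids = strings.map { |s| s.object_id }.uniq

puts symbol_ids.size  # 1
puts string_ids.size  # 800
```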

For display purposes, I wanted to sort my arrays of symbols. I soon learned that I could not call .sort! on an array of symbols. This is because the base class "Symbol" does not have a <=> (comparison) operator by default. To be able to sort my arrays of symbols, I needed to add a <=> method to Symbol. The following code shows how to do this:

class Symbol
  # Compare two symbols alphabetically via their string forms.
  def <=>(value)
    to_s <=> value.to_s
  end
end

The code above injects a <=> (comparison) method into the Symbol class. The <=> method simply converts the symbols that it is comparing to strings, and then calls the string <=> method.
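As a quick usage sketch, the example below reopens Symbol with the same string-based comparison and then sorts a small array in place. The word list is made up for illustration. As far as I know, later Ruby versions (1.9 onward) ship with their own Symbol#<=>, so the patch should only be needed on older interpreters:

```ruby
# Reopen Symbol and compare symbols by their string forms.
class Symbol
  def <=>(other)
    to_s <=> other.to_s
  end
end

words = [:pear, :apple, :fig]  # hypothetical sample data
words.sort!                    # sort! relies on <=> to order the elements
p words                        # [:apple, :fig, :pear]
```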

Note: I wrote the <=> method for my own purpose, to sort my symbols alphabetically. You could just as easily write a method that compares symbols based on their length.
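A rough sketch of the length-based idea follows. To keep it separate from the alphabetical version, it passes a comparison block to sort instead of redefining <=> on Symbol, which is a different technique from the one in the post. The sample words are made up:

```ruby
words = [:banana, :fig, :apple]  # hypothetical sample data

# Order by the length of each symbol's name instead of alphabetically.
by_length = words.sort { |a, b| a.to_s.length <=> b.to_s.length }

p by_length  # [:fig, :apple, :banana]
```

The block approach leaves Symbol untouched, so one array can be sorted alphabetically and another by length in the same program.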

Ruby API: Symbol
Ruby API: String#Comparison


Deuce said...

do the two calls to to_s create loads of temporary string objects as you compare your way through all your words?

Robert Pyke said...

The method does create temporary string objects, yes. But I do not call the sort! method on my major "collections" of word associations. I only call it on small subsets of the "collection" which I have created using hash lookups. These subsets are very small in comparison to the overall "collection" of word associations, usually < 1% in size. This means that whilst sorting does have some performance overhead, it is nothing in comparison to the memory usage I would have if I was using string objects for my overall "collections" of words. Being seriously pessimistic, this may halve my sort! speed, but it will drop my memory usage to at least 5-10% of what it would be otherwise.

simon said...

it's beautiful