Jacob Evelyn & Jemma Issroff
We recently open-sourced MemoWise! It’s a Ruby memoization gem, and if you’re curious you can read more about why we created it and some interesting corners of Ruby we discovered along the way. Now let’s talk about how we made it really, really fast.
We began by writing benchmark code for our gem so that at any time we could see how performant it was. We decided to explicitly optimize for memoized value retrieval, so our benchmarks only test the speed at which an already-memoized method’s results are returned on subsequent method calls. Because we memoize many different methods in our codebase, we also ensured our benchmarks cover many parameter types and combinations—from `def method1` to `def method2(a, *b, c:, **d)`.
These benchmarks allow us to check the performance impact of any change to our gem’s code. This helps drive our development cycle—when we see scenarios where our gem isn’t as fast as we’d like, we can dig in and then test any potential improvements before committing them. For our initial implementation, the benchmarks showed that all types of memoized methods were slower than we wanted, so we first looked at common code in the core of our gem.
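As a rough illustration of what a retrieval-speed benchmark can look like, here is a minimal sketch using Ruby’s standard-library Benchmark module. The class and method names are illustrative, not MemoWise’s API, and the real benchmarks cover many more argument shapes:

```ruby
require "benchmark"

# Hypothetical class with a hand-rolled memoized method, standing in
# for a MemoWise-memoized method.
class Example
  def initialize
    @cache = {}
  end

  # After the first call for a given argument, the cached result is
  # returned on every subsequent call.
  def expensive(a)
    @cache[a] ||= a * 3
  end
end

example = Example.new
example.expensive(1) # warm the cache so we time only retrieval

seconds = Benchmark.realtime do
  100_000.times { example.expensive(1) }
end
puts "100k memoized calls: #{seconds.round(4)}s"
```

Timing only the warm-cache path like this isolates the code we actually optimized: the retrieval of an already-memoized value.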
At the heart of any Ruby memoization gem is a way to dynamically define a new memoized method that caches the results of the original method. Typically this uses `define_method` and looks something like:
```ruby
define_method(existing_method_name) do |*args, **kwargs|
  method_cache = @cache.fetch("#{existing_method_name}") do
    @cache["#{existing_method_name}"] = {}
  end
  key = [args, kwargs]
  method_cache.fetch(key) do
    method_cache[key] = super(*args, **kwargs)
  end
end
```
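To see the pattern in action, here’s a self-contained sketch that wraps an existing method this way. The module and helper names are illustrative, not MemoWise’s actual API:

```ruby
# Illustrative memoizer: defines the wrapper in an anonymous module and
# prepends it, so `super` reaches the original method.
module SimpleMemo
  def self.memoize(klass, existing_method_name)
    wrapper = Module.new do
      define_method(existing_method_name) do |*args, **kwargs|
        @cache ||= {}
        method_cache = @cache.fetch(existing_method_name.to_s) do
          @cache[existing_method_name.to_s] = {}
        end
        key = [args, kwargs]
        method_cache.fetch(key) do
          method_cache[key] = super(*args, **kwargs)
        end
      end
    end
    klass.prepend(wrapper)
  end
end

class Slow
  attr_reader :calls

  def initialize
    @calls = 0
  end

  def double(n)
    @calls += 1 # counts how often the original method actually runs
    n * 2
  end
end

SimpleMemo.memoize(Slow, :double)

slow = Slow.new
slow.double(21) # first call runs the original method
slow.double(21) # second call is served from the cache
```

After the two calls above, `slow.calls` is 1: the original method ran once, and the second call hit the cache.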
The block passed to `define_method` is executed each time the method is called, and Ruby adds overhead that makes blocks slightly less efficient than the equivalent code outside of a block. We expect to call memoized methods many times in performance-sensitive code, where these minor block inefficiencies can add up.
Is there a more performant way to dynamically define methods in Ruby? It turns out we can `eval` strings of raw Ruby code in the context of our original class, and this yields significant speedups over `define_method`:
```ruby
eval <<-END_OF_METHOD
  def #{existing_method_name}(*args, **kwargs)
    method_cache = @cache.fetch("#{existing_method_name}") do
      @cache["#{existing_method_name}"] = {}
    end
    key = [args, kwargs]
    method_cache.fetch(key) do
      method_cache[key] = super(*args, **kwargs)
    end
  end
END_OF_METHOD
```
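You can get a rough feel for the difference yourself with a micro-benchmark like the one below. Absolute numbers and the size of the gap vary by Ruby version, so treat this as a sketch rather than a definitive measurement:

```ruby
require "benchmark"

# Same trivial method defined two ways: via define_method and via an
# eval'd def (class_eval with a string of raw Ruby).
class Defined
  define_method(:value) { 42 }
end

class Evaled
  class_eval <<-END_OF_METHOD
    def value
      42
    end
  END_OF_METHOD
end

n = 1_000_000
d = Defined.new
e = Evaled.new
defined_time = Benchmark.realtime { n.times { d.value } }
evaled_time  = Benchmark.realtime { n.times { e.value } }
puts format("define_method: %.3fs, eval: %.3fs", defined_time, evaled_time)
```

The per-call gap is tiny, but for a method that may be called millions of times in hot code, it adds up.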
The above code works, but it allocates objects when it doesn’t always need to! Consider the following method:
```ruby
def triple(a)
  a * 3
end
```
If we memoize `triple` with the approach above, our memoized version is generated as:
```ruby
def triple(*args, **kwargs)
  method_cache = @cache.fetch("triple") do
    @cache["triple"] = {}
  end
  key = [args, kwargs]
  method_cache.fetch(key) do
    method_cache[key] = super(*args, **kwargs)
  end
end
```
Every time we call our memoized `triple` we allocate an `args` array, a `kwargs` hash, and a `key` array. Each of these allocations is unnecessary! Ideally our generated code would be:
```ruby
def triple(a)
  method_cache = @cache.fetch("triple") do
    @cache["triple"] = {}
  end
  method_cache.fetch(a) do
    method_cache[a] = super(a)
  end
end
```
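One rough way to observe the allocation difference between the splatted and exact-arity versions is to count objects allocated per call (counts are approximate and can vary by Ruby version):

```ruby
# Splatted signature: allocates args array, kwargs hash, and key array
# on every call, mirroring the generic generated method above.
class Splatted
  def triple(*args, **kwargs)
    key = [args, kwargs] # allocated on every call
    args[0] * 3
  end
end

# Exact-arity signature: no per-call allocations.
class Exact
  def triple(a)
    a * 3
  end
end

# Counts objects allocated while the block runs.
def allocations
  GC.disable
  before = GC.stat(:total_allocated_objects)
  yield
  after = GC.stat(:total_allocated_objects)
  GC.enable
  after - before
end

splatted = Splatted.new
exact = Exact.new
puts "splat version: #{allocations { splatted.triple(1) }} objects"
puts "exact version: #{allocations { exact.triple(1) }} objects"
```

The exact-arity call allocates nothing, while the splatted call pays for several throwaway objects on every invocation.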
By introspecting the arguments of the method we’re memoizing, our generated code can avoid unnecessary allocations in a variety of scenarios, such as:

- `args`, when there are only keyword arguments
- `kwargs`, when there are only positional arguments
- `key`, when there’s only one argument
- `method_cache`, when there are no arguments at all

Our benchmarks showed that these optimizations yield dramatic performance improvements and are a big part of what makes the gem so fast.
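The introspection itself can be done with Ruby’s `Method#parameters`, which reports each argument’s kind. This shows the mechanism; MemoWise’s actual implementation details may differ:

```ruby
# Method#parameters reveals each parameter's kind (:req, :rest,
# :keyreq, :keyrest, ...), which is enough to decide which
# allocations the generated method can skip.
class Example
  def method1; end
  def method2(a, *b, c:, **d); end
end

p Example.instance_method(:method1).parameters
# => []
p Example.instance_method(:method2).parameters
# => [[:req, :a], [:rest, :b], [:keyreq, :c], [:keyrest, :d]]
```

A method with no parameters, for example, needs no `key` and no per-key `method_cache` hash at all.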
Please feel free to try MemoWise out or take a peek at the code, and if you have any other performance suggestions, let us know. (We’re happy to accept contributions and we’d love to hear what you think!)