memoize in Ruby-CSDN博客

Common Lisp 的经典书《On Lisp》的 5.3 节叫做 Memoizing 。书中讲到了将函数调用的返回值缓存起来的一种技术。这本来是一种非常常见的技术，但是《On Lisp》让我看到了动态语言的精练之处，这样的一种技术被抽象成一个通用的函数，将任意一个函数传入 memoize ，就会得到一个经过包装的函数，并且它已经具备了缓存的能力：

(defun memoize (fn)
  (let ((cache (make-hash-table :test #'equal)))
    #'(lambda (&rest args)
  (multiple-value-bind (val win) (gethash args cache)
    (if win
        val
        (setf (gethash args cache)
        (apply fn args)))))))

下面是一个例子：

CL-USER> (setf (symbol-function 'memoized)
         (memoize #'(lambda (x)
          (sleep 5)
          x)))
#<CLOSURE (LAMBDA (&REST ARGS)) {B7B80F5}>
CL-USER> (time (memoized 1))
Evaluation took:
  5.0 seconds of real time
  0.004 seconds of user run time
  0.0 seconds of system run time
  0 calls to %EVAL
  0 page faults and
  0 bytes consed.
1
CL-USER> (time (memoized 1))
Evaluation took:
  0.0 seconds of real time
  0.0 seconds of user run time
  0.0 seconds of system run time
  0 calls to %EVAL
  0 page faults and
  0 bytes consed.
1

《On Lisp》是一本非常经典的书，如果你不明白为什么 C 语言的宏与 Lisp 的宏一个是青蛙一个是王子的话，极力建议读一读，里面讲到了许多很实用的技巧，犹如秘境寻宝般刺激！

今天在 Nextlib 的公共图书馆里面淘到这篇 Ruby Monitor-Functions ，里面讲到了 Ruby 里的 wrap_method ，我立即就想到了原来看过的 memoize 函数。

实现本身是非常简单的，因为和在 Common Lisp 里面一样，Ruby 里的哈希表用起来也是非常方便的。在 Lisp 里的 memoize 函数采用函数编程的套路，没有副作用，而是返回一个包装过的函数，而 Ruby 的语法更接近命令式，我希望能以如下方式来使用这个 memoize 函数：

class Foo
  def foo
    # do some time-consuming work
  end
  memoize :foo
end

换言之，调用 memoize 之后，作为副作用，foo 就有了缓存的能力，而不用关心 memoize 方法的返回值。我将这个方法实现为 Module 的类方法，这样就可以在定义类的时候使用它了。

class Module
  def memoize method_name
    memo ||= { }
    orig_method = instance_method(method_name)
    define_method method_name do |*args|
      memo.fetch(args) do |key|
        val = orig_method.bind(self).call(*args)
        memo[key] = val
      end
    end
  end
end

这是一个非常简易的实现，没有用到那篇文章中实现的 wrap_method 方法。 memoize 也没有考虑有 block 的情况，因为 memoize 本身就不是万金油，不能滥用啊。另外，由于 memo 变量是为每一个方法调用的时候创建的，所以同一个类的不同的对象用同样的参数来调用的话，会取到同样的 cache ，想要区分不同的对象也简单，在存入 memo 的时候哈希表的键值(就是那个 args 数组)前面把 self 加进去就可以了，像这样

    define_method method_name do |*args|
      memo.fetch(Array.new(args).unshift(self)) do |key|
        val = orig_method.bind(self).call(*args)
        memo[key] = val
      end
    end

下面来测试一下效果：

require 'benchmark'
 
array = (1..100000).map { rand 100 }
computer = Computer.new
 
Benchmark.bm(20) do |x|
  x.report("normal factor:") { array.each { |i| computer.factor(i) } }
  Computer.instance_eval { memoize :factor }
  x.report("memoize installed:") { array.each { |i| computer.factor(i) } }
  x.report("all cached:") { array.each { |i| computer.factor(i) } }
end

在我的机器 (Debian GNU/Linux 2.6.22-1-686) 上跑出来的结果是：

                          user     system      total        real
normal factor:       17.060000   1.350000  18.410000 ( 18.884428)
memoize installed:    0.750000   0.120000   0.870000 (  0.895277)
all cached:           0.740000   0.120000   0.860000 (  0.904223)

可以看到 memoize 的效果非常明显，第一个和后面两个时间差距非常大。而第二个和第三个，一个是现跑现 cache ，另外一个是所有的都是从 cache 中取的，基本上就是查找哈希表的过程，不过它们之间差距并不大。

完整的代码可以从这里下载。