Implementing a function cache, from The.Go.Programming.Language.2015.11.pdf

Latest recommended article published on 2016-10-26 16:21:01

kingeasternsun

Categories: Go, Golang Tricks You Didn't Know, Getting Started with Golang. Tags: function cache, concurrent, race, server

Article link: https://blog.csdn.net/wdy_yx/article/details/52624638

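This article presents a technique for caching function results: the result computed for a given input is cached so that the same input is never recomputed, which improves program efficiency. It shows how to implement serial and concurrent versions of the cache in Go and discusses how to resolve data races in a concurrent setting.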

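This post follows Section 9.7 of the book, Example: Concurrent Non-Blocking Cache.
The example memoizes a function, so that for any given argument the function is computed only once. The final version is also concurrency-safe, and it avoids the contention that comes from locking the entire cache.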

Let's look at the serial implementation first.

Serial implementation

import (
    "fmt"
    "io/ioutil"
    "net/http"
    "time"
)

func httpGetBody(url string) (interface{}, error) {
    resp, err := http.Get(url)
    if err != nil {
        return nil, err
    }

    defer resp.Body.Close()

    return ioutil.ReadAll(resp.Body)

}

type result struct {
    value interface{}
    err   error
}
type Func func(key string) (interface{}, error)
type Memo struct {
    f     Func
    cache map[string]result
}

func New(f Func) *Memo {
    return &Memo{f: f, cache: make(map[string]result)}
}

func (memo *Memo) Get(key string) (interface{}, error) {
    res, ok := memo.cache[key]
    if !ok {
        res.value, res.err = memo.f(key)
        memo.cache[key] = res
    }
    return res.value, res.err
}

func testCache() {
    incomingURLS := []string{"http://cn.bing.com/", "http://www.baidu.com", "http://cn.bing.com/", "http://www.baidu.com",
        "http://www.baidu.com", "http://cn.bing.com/", "http://www.baidu.com",
        "http://www.baidu.com", "http://cn.bing.com/", "http://www.baidu.com",
        "http://www.baidu.com", "http://cn.bing.com/", "http://www.baidu.com",
        "http://www.baidu.com", "http://cn.bing.com/", "http://www.baidu.com"}

    m := New(httpGetBody)
    allstart := time.Now()
    for _, url := range incomingURLS {
        start := time.Now()
        value, err := m.Get(url)
        if err != nil {
            fmt.Println(err)
            continue // value may be nil on error; the type assertion below would then panic
        }

        fmt.Printf("%s, %s, %d bytes\n",
            url, time.Since(start), len(value.([]byte)))
    }

    fmt.Printf("all %s\n", time.Since(allstart))

}

Output

http://cn.bing.com/, 180.576553ms, 120050 bytes
http://www.baidu.com, 25.863523ms, 99882 bytes
http://cn.bing.com/, 397ns, 120050 bytes
http://www.baidu.com, 245ns, 99882 bytes
http://www.baidu.com, 154ns, 99882 bytes
http://cn.bing.com/, 123ns, 120050 bytes
http://www.baidu.com, 136ns, 99882 bytes
http://www.baidu.com, 123ns, 99882 bytes
http://cn.bing.com/, 127ns, 120050 bytes
http://www.baidu.com, 188ns, 99882 bytes
http://www.baidu.com, 116ns, 99882 bytes
http://cn.bing.com/, 123ns, 120050 bytes
http://www.baidu.com, 118ns, 99882 bytes
http://www.baidu.com, 180ns, 99882 bytes
http://cn.bing.com/, 140ns, 120050 bytes
http://www.baidu.com, 124ns, 99882 bytes
all 206.583298ms

Running concurrently with goroutines

We use sync.WaitGroup to wait until all URLs have been fetched.

func testCache() {
    incomingURLS := []string{"http://cn.bing.com/", "http://www.baidu.com", "http://cn.bing.com/", "http://www.baidu.com",
        "http://www.baidu.com", "http://cn.bing.com/", "http://www.baidu.com",
        "http://www.baidu.com", "http://cn.bing.com/", "http://www.baidu.com",
        "http://www.baidu.com", "http://cn.bing.com/", "http://www.baidu.com",
        "http://www.baidu.com", "http://cn.bing.com/", "http://www.baidu.com"}

    m := New(httpGetBody)
    allstart := time.Now()
    var n sync.WaitGroup
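    for _, url := range incomingURLS {
        start := time.Now()
        n.Add(1)
        go func(url string) {
            defer n.Done()
            value, err := m.Get(url)
            if err != nil {
                fmt.Println(err)
                return // value may be nil on error; skip the type assertion
            }

            fmt.Printf("%s, %s, %d bytes\n",
                url, time.Since(start), len(value.([]byte)))
        }(url)
    }
    // Wait once, after the loop. Waiting inside the loop would block on
    // each goroutine in turn and make the requests serial again.
    n.Wait()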

    fmt.Printf("all %s\n", time.Since(allstart))

}

The results show that the total time is shorter, but a data race has now appeared inside Get:

    if !ok {
        res.value, res.err = memo.f(key)
        memo.cache[key] = res
    }

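One goroutine may find !ok while another goroutine also finds !ok for the same key, so f is still executed more than once. The two goroutines also read and write memo.cache without any synchronization, and Go maps are not safe for concurrent writes.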

Adding a mutex

type Memo struct {
    f     Func
    mu    sync.Mutex
    cache map[string]result
}

func (memo *Memo) Get(key string) (interface{}, error) {
    memo.mu.Lock()
    defer memo.mu.Unlock()
    res, ok := memo.cache[key]
    if !ok {
        res.value, res.err = memo.f(key)
        memo.cache[key] = res
    }
    return res.value, res.err
}

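This creates a new problem, though: because the lock is held for the whole duration of f, access to Memo becomes serial again.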

Final approach 1: marking entries with a pointer

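The authors' idea is to get the following behavior: one goroutine calls the function and does the expensive work, while every other goroutine that asks for the same key waits for that call to finish and then picks up the result immediately.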

type result struct {
    value interface{}
    err   error
}
type entry struct {
    res   result
    ready chan struct{}
}
type Func func(key string) (interface{}, error)
type Memo struct {
    f     Func
    mu    sync.Mutex
    cache map[string]*entry
}

func New(f Func) *Memo {
    return &Memo{f: f, cache: make(map[string]*entry)}
}

func (memo *Memo) Get(key string) (interface{}, error) {
    memo.mu.Lock()
    e := memo.cache[key]
    if e == nil {
        e = &entry{ready: make(chan struct{})}
        memo.cache[key] = e
        memo.mu.Unlock()

        e.res.value, e.res.err = memo.f(key)
        close(e.ready)

    } else {
        memo.mu.Unlock()
        <-e.ready
    }
    return e.res.value, e.res.err
}

The cache field of Memo changes from map[string]result to map[string]*entry.
The entry struct is:

type entry struct {
    res   result
    ready chan struct{}
}

The ready channel tells the other goroutines that the function has finished and the result can be read.
The core of the code is the following part of Get:

    memo.mu.Lock()
    e := memo.cache[key]
    if e == nil {
        e = &entry{ready: make(chan struct{})}
        memo.cache[key] = e
        memo.mu.Unlock()

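This moves the computation of f out of the locked region. Because memo.cache[key] = e is executed while the lock is still held, only the first goroutine to ask for a key ever runs the function for it; later goroutines find the entry and wait on ready.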

Final approach 2: a client-server model

A single dedicated server goroutine owns the cache, and all other goroutines ask it for function results.

// Func is the type of the function to memoize.
type Func func(key string) (interface{}, error)

// A result is the result of calling a Func.
type result struct {
    value interface{}
    err   error
}

type entry struct {
    res   result
    ready chan struct{} // closed when res is ready
}

The key parts of the code follow.

type request struct {
    key      string
    response chan<- result
}

type Memo struct {
    requests chan request
}

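Memo has a single field, requests, a channel of request values that is used to send function requests to the server. Each request carries a channel of result; once the request reaches the server, the server uses that channel to deliver the function's result back to the goroutine that asked for it.
New creates the requests channel and starts the server: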

func New(f Func) *Memo {
    memo := &Memo{requests: make(chan request)}
    go memo.server(f)
    return memo
}

Get creates a result channel named response, builds a request, sends it to the server over the requests channel, and then receives the function result from response.

func (memo *Memo) Get(key string) (interface{}, error) {
    response := make(chan result)
    memo.requests <- request{key, response}
    res := <-response
    return res.value, res.err
}
func (memo *Memo) Close() {
    close(memo.requests)
}

Next comes the server:

func (memo *Memo) server(f Func) {
    cache := make(map[string]*entry)
    for req := range memo.requests {
        e := cache[req.key]
        if e == nil {
            // This is the first request for this key.
            e = &entry{ready: make(chan struct{})}
            cache[req.key] = e
            go e.call(f, req.key) // call f(key)
        }
        go e.deliver(req.response)
    }
}

func (e *entry) call(f Func, key string) {
    // Evaluate the function.
    e.res.value, e.res.err = f(key)
    // Broadcast the ready condition.
    close(e.ready)
}

func (e *entry) deliver(response chan<- result) {
    // Wait for the function call to finish.
    <-e.ready
    // Send the result to the client.
    response <- e.res
}