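groupcache in Depth: The Basics

The official groupcache documentation is thin, close to nonexistent. This article pulls together what I found in other write-ups online, plus my own thinking.

I plan to write three articles on groupcache, each going a level deeper than the last. I hope you get something out of them.

1. "groupcache in Depth: The Basics"

2. "groupcache in Depth: Source Code Walkthrough of the Core Components"

3. "groupcache in Depth: Architecture and Key Techniques"

Still to do (I need to get this properly straight):

I don't yet fully understand mainCache and hotCache. I'll come back to this question during the source code analysis in the later articles.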

1. Introduction to groupcache

1.1 Features

groupcache is a distributed caching and cache-filling library, intended as a replacement for a pool of memcached nodes in many cases.

Unlike Redis and other common cache implementations, groupcache does not run on a separate server. It is a library that runs in the same process as your application, so every groupcache node is both a server and a client.

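1.2 Key techniques

1) LRU

LRU (Least Recently Used) is a cache eviction algorithm. Its core idea is that when the cache is full, the entry that has gone unused the longest is evicted first. It is widely used in cache design.

2) consistenthash

3) singleflight

4) protocol buffers

5) HTTP requests and responses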
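
To make the LRU idea concrete, here is a minimal sketch built on the standard library's container/list. It is not the code from groupcache's lru package; the names LRU, NewLRU, Get and Add are made up for illustration, and the cache is not safe for concurrent use.

```go
package main

import (
	"container/list"
	"fmt"
)

// entry is the payload stored in each list element.
type entry struct {
	key   string
	value string
}

// LRU is a hypothetical fixed-capacity cache that evicts the least
// recently used key. It is not safe for concurrent use.
type LRU struct {
	capacity int
	order    *list.List               // front = most recently used
	items    map[string]*list.Element // key -> element in order
}

func NewLRU(capacity int) *LRU {
	return &LRU{capacity: capacity, order: list.New(), items: make(map[string]*list.Element)}
}

// Get returns the value and marks the key as most recently used.
func (c *LRU) Get(key string) (string, bool) {
	el, ok := c.items[key]
	if !ok {
		return "", false
	}
	c.order.MoveToFront(el)
	return el.Value.(*entry).value, true
}

// Add inserts or updates a key and evicts the oldest entry when the
// cache grows past its capacity.
func (c *LRU) Add(key, value string) {
	if el, ok := c.items[key]; ok {
		el.Value.(*entry).value = value
		c.order.MoveToFront(el)
		return
	}
	c.items[key] = c.order.PushFront(&entry{key, value})
	if c.order.Len() > c.capacity {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).key)
	}
}

func main() {
	c := NewLRU(2)
	c.Add("red", "#FF0000")
	c.Add("green", "#00FF00")
	c.Get("red")             // "red" becomes the most recently used key
	c.Add("blue", "#0000FF") // cache is full, so "green" is evicted
	_, ok := c.Get("green")
	fmt.Println(ok) // false
}
```

The map gives O(1) lookup and the doubly linked list gives O(1) reordering and eviction, which is why this pairing is the usual way to build an LRU cache.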

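2. Project directory structure

Start by looking at the project's directory layout to get a rough feel for groupcache.

The project is not managed with Go modules. A docs directory? None. An example or demo directory? None. Tests? Yes!

Since there are test files, groupcache_test.go is a good place to start to see how groupcache is used.

Next, look at what each of the five directories (packages) at the top contains. Once each module is understood on its own, we can move on to the outer calling logic and tie everything together.

consistenthash: consistent hashing

groupcachepb: the internal wire format, defined with protocol buffers. Knowing the protocol helps in understanding how data moves between peers.

lru: the least-recently-used eviction algorithm

singleflight: duplicate call suppression, so that concurrent requests for the same key trigger only one load

testpb: the pb suffix gives it away, this one is tied to Protocol Buffers as well

groupcache.go is the main file exposing the public API, so that is where we will start.

Besides that, there are byteview.go, http.go, peers.go, and sinks.go.

These are covered in detail in the second article, "groupcache in Depth: Source Code Walkthrough of the Core Components".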

3. Writing and running a demo (go play!)

There is a compact groupcache example circulating online whose flow is very easy to follow. Here is the code:

// Simple groupcache example: https://github.com/golang/groupcache
// Running 3 instances:
// go run groupcache.go -addr=:8080 -pool=http://127.0.0.1:8080,http://127.0.0.1:8081,http://127.0.0.1:8082
// go run groupcache.go -addr=:8081 -pool=http://127.0.0.1:8081,http://127.0.0.1:8080,http://127.0.0.1:8082
// go run groupcache.go -addr=:8082 -pool=http://127.0.0.1:8082,http://127.0.0.1:8080,http://127.0.0.1:8081
// Testing:
// curl localhost:8080/color?name=red
package main

import (
   "context"
   "errors"
   "flag"
   "log"
   "net/http"
   "strings"

   "github.com/golang/groupcache"
)

var Store = map[string][]byte{
   "red":   []byte("#FF0000"),
   "green": []byte("#00FF00"),
   "blue":  []byte("#0000FF"),
}

var Group = groupcache.NewGroup("foobar", 64<<20, groupcache.GetterFunc(
   func(ctx context.Context, key string, dest groupcache.Sink) error {
      log.Println("looking up", key)
      v, ok := Store[key]
      if !ok {
         return errors.New("color not found")
      }
      dest.SetBytes(v)
      return nil
   },
))

func main() {
   addr := flag.String("addr", ":8080", "server address")
   peers := flag.String("pool", "http://localhost:8080", "server pool list")
   flag.Parse()
   http.HandleFunc("/color", func(w http.ResponseWriter, r *http.Request) {
      color := r.FormValue("name")
      var b []byte
      err := Group.Get(r.Context(), color, groupcache.AllocatingByteSliceSink(&b))
      if err != nil {
         http.Error(w, err.Error(), http.StatusNotFound)
         return
      }
      w.Write(b)
      w.Write([]byte{'\n'})
   })
   p := strings.Split(*peers, ",")
   pool := groupcache.NewHTTPPool(p[0])
   pool.Set(p...)
   log.Fatal(http.ListenAndServe(*addr, nil))
}

How to run it:

Start three instances, each listening on a different local port:

go run groupcache.go -addr=:8080 -pool=http://127.0.0.1:8080,http://127.0.0.1:8081,http://127.0.0.1:8082

go run groupcache.go -addr=:8081 -pool=http://127.0.0.1:8081,http://127.0.0.1:8080,http://127.0.0.1:8082

go run groupcache.go -addr=:8082 -pool=http://127.0.0.1:8082,http://127.0.0.1:8080,http://127.0.0.1:8081

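The pool flag lists the address of every node, that is, the peers. When a key is not found in the local cache and another node owns it, the node sends an HTTP request to that peer to fetch the value.

Build and run the code (the default port is 8080). Once the program is up, open a browser and go to: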

http://localhost:8080/color?name=red

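Following the usual way of working with a cache component, the operations break down into three steps.

Step 1: creating the groupcache object

groupcache.NewGroup creates a Group named foobar with a 64 MiB cache limit (64<<20 bytes) and registers a callback function.

Step 2: writing content into the cache through the groupcache API

The callback looks up the value for the given key in the Store map, whose keys are strings and whose values are []byte.

It then calls dest.SetBytes(v) to write the value into the cache.

Note:

The only way to put data into the cache is to write to the Sink inside the callback, and the callback is invoked only on a cache miss.

Step 3: reading content from the cache through the groupcache API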

var b []byte

err := Group.Get(r.Context(), color, groupcache.AllocatingByteSliceSink(&b))

Using the Group created in step 1, call its Get method. It reads the value from the cache and writes it into the byte slice b.
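
The way a sink hands the value back to the caller can be sketched as follows. This is a simplified illustration, not groupcache's implementation: the Sink interface here keeps only SetBytes (the real groupcache.Sink has more methods), and byteSliceSink and get are hypothetical names standing in for AllocatingByteSliceSink and Group.Get.

```go
package main

import "fmt"

// Sink receives the value produced by a getter. Only SetBytes is
// modeled here; this is an assumption made to keep the sketch short.
type Sink interface {
	SetBytes(v []byte) error
}

// byteSliceSink copies the value into a slice owned by the caller.
type byteSliceSink struct {
	dst *[]byte
}

func (s byteSliceSink) SetBytes(v []byte) error {
	// Copy the bytes so the caller never aliases the getter's buffer.
	*s.dst = append([]byte(nil), v...)
	return nil
}

// get is a hypothetical stand-in for Group.Get: it passes the sink to
// the getter, which fills it.
func get(key string, dest Sink, getter func(key string, dest Sink) error) error {
	return getter(key, dest)
}

func main() {
	var b []byte
	err := get("red", byteSliceSink{&b}, func(key string, dest Sink) error {
		return dest.SetBytes([]byte("#FF0000"))
	})
	fmt.Println(string(b), err) // #FF0000 <nil>
}
```

The caller owns b and the sink only holds a pointer to it, which is why the result shows up in b after the call returns.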

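4. Source code flow analysis

4.1 Cache read flow

The main read flow is as follows:

First, check whether mainCache or hotCache holds the key. On a hit, return immediately; otherwise go to the next step.

Next, if another peer owns the key, ask that peer for the value. If it answers successfully, return that value; otherwise go to the next step.

If neither step produced a value, call the user-registered callback to fill the cache (for example, read the data from a database and add it to the cache).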
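
The three steps of the read flow can be sketched with nothing but the standard library. This is a heavily simplified model under stated assumptions, not groupcache's code: Node, Getter, and the peer function are hypothetical, a single map stands in for both mainCache and hotCache, and there is no locking or eviction.

```go
package main

import (
	"errors"
	"fmt"
)

// Getter loads a value from the source of truth, for example a database.
type Getter func(key string) (string, error)

// Node is a hypothetical, simplified stand-in for a groupcache Group.
type Node struct {
	local     map[string]string                // stands in for mainCache + hotCache
	peer      func(key string) (string, error) // nil when no other node owns the key
	getter    Getter
	getterHit int // number of times the getter callback ran
}

// Get follows the three steps: local cache, then peer, then the getter.
func (n *Node) Get(key string) (string, error) {
	if v, ok := n.local[key]; ok {
		return v, nil // step 1: local hit
	}
	if n.peer != nil {
		if v, err := n.peer(key); err == nil {
			return v, nil // step 2: the owning peer answered
		}
	}
	v, err := n.getter(key) // step 3: load from the source
	if err != nil {
		return "", err
	}
	n.getterHit++
	n.local[key] = v // fill the cache so the next Get is a hit
	return v, nil
}

func main() {
	store := map[string]string{"red": "#FF0000"}
	n := &Node{
		local: map[string]string{},
		getter: func(key string) (string, error) {
			if v, ok := store[key]; ok {
				return v, nil
			}
			return "", errors.New("color not found")
		},
	}
	n.Get("red")                // miss: runs the getter and fills the cache
	v, _ := n.Get("red")        // hit: served from the local cache
	fmt.Println(v, n.getterHit) // #FF0000 1
}
```

The second Get never reaches the getter, which is the behavior the demo's "looking up" log line makes visible: it prints once per key, not once per request.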

4.1.1 Looking up the local mainCache and hotCache

func (g *Group) lookupCache(key string) (value ByteView, ok bool) {
	if g.cacheBytes <= 0 {
		return
	}
	value, ok = g.mainCache.get(key)
	if ok {
		return
	}
	value, ok = g.hotCache.get(key)
	return
}

First, two basic concepts need to be clear: mainCache and hotCache (these were also what puzzled me while learning).

const (
	// The MainCache is the cache for items that this peer is the
	// owner for.
	MainCache CacheType = iota + 1

	// The HotCache is the cache for items that seem popular
	// enough to replicate to this node, even though it's not the
	// owner.
	HotCache
)

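What is mainCache? (a working understanding for now)

The MainCache is the cache for items that this peer is the owner for

mainCache holds the keys that the key distribution assigns to this node, that is, this node's own share of the distributed cache.

What is hotCache?

The HotCache is the cache for items that seem popular enough to replicate to this node, even though it's not the owner.

hotCache holds copies of keys that other nodes own but that are requested often enough to be worth replicating locally. As far as I can tell from the current source, popularity is not actually measured yet: a value fetched from a peer is copied into hotCache roughly one time in ten, chosen at random.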

4.1.2 Querying the remote peer's cache

In the Group's load function:

func (g *Group) load(ctx context.Context, key string, dest Sink) (value ByteView, destPopulated bool, err error)

When a key is not found locally, groupcache uses sharding on the key to decide which peer owns it and sends that peer an HTTP request.

var value ByteView
var err error
if peer, ok := g.peers.PickPeer(key); ok {
	value, err = g.getFromPeer(ctx, peer, key)
	if err == nil {
		g.Stats.PeerLoads.Add(1)
		return value, nil
	}
	g.Stats.PeerErrors.Add(1)
	// TODO(bradfitz): log the peer's error? keep
	// log of the past few for /groupcachez?  It's
	// probably boring (normal task movement), so not
	// worth logging I imagine.
}

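PickPeer is called first to choose a peer, then getFromPeer fetches the value from that peer. PickPeer returns ok == false when the current node itself owns the key, in which case no HTTP request is made.

The implementation of PickPeer is left for the later discussion of consistent hashing; here we only trace the flow.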
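
Until then, the idea behind picking a peer by key can be sketched like this. It is a generic consistent-hashing ring with virtual nodes, written for illustration only; it is not the code of groupcache's consistenthash package, and Ring, NewRing, Pick, and the choice of CRC32 are assumptions of the sketch.

```go
package main

import (
	"fmt"
	"hash/crc32"
	"sort"
	"strconv"
)

// Ring maps keys to peers using consistent hashing with virtual nodes.
type Ring struct {
	replicas int               // virtual nodes per peer
	hashes   []uint32          // sorted hashes of all virtual nodes
	owner    map[uint32]string // virtual node hash -> peer
}

func NewRing(replicas int, peers ...string) *Ring {
	r := &Ring{replicas: replicas, owner: make(map[uint32]string)}
	for _, p := range peers {
		for i := 0; i < replicas; i++ {
			// Each peer is placed on the ring several times to even out the load.
			h := crc32.ChecksumIEEE([]byte(strconv.Itoa(i) + p))
			r.hashes = append(r.hashes, h)
			r.owner[h] = p
		}
	}
	sort.Slice(r.hashes, func(i, j int) bool { return r.hashes[i] < r.hashes[j] })
	return r
}

// Pick returns the peer that owns key: the first virtual node at or
// after the key's hash, wrapping around at the end of the ring.
func (r *Ring) Pick(key string) string {
	if len(r.hashes) == 0 {
		return ""
	}
	h := crc32.ChecksumIEEE([]byte(key))
	i := sort.Search(len(r.hashes), func(i int) bool { return r.hashes[i] >= h })
	if i == len(r.hashes) {
		i = 0 // wrap around the ring
	}
	return r.owner[r.hashes[i]]
}

func main() {
	ring := NewRing(50, "http://127.0.0.1:8080", "http://127.0.0.1:8081", "http://127.0.0.1:8082")
	for _, k := range []string{"red", "green", "blue"} {
		fmt.Println(k, "->", ring.Pick(k))
	}
}
```

Because every node builds the ring from the same peer list, every node agrees on which peer owns a given key. That is why each instance in the demo is started with the full pool.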

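4.1.3 Calling the user-registered callback to fill the cache

If the key is in neither the local mainCache nor the local hotCache, and no peer returned a value (this node owns the key, or the peer request failed), getLocally is called: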

value, err = g.getLocally(ctx, key, dest)
	if err != nil {
		g.Stats.LocalLoadErrs.Add(1)
		return nil, err
	}
	g.Stats.LocalLoads.Add(1)
	destPopulated = true // only one caller of load gets this return value
	g.populateCache(key, value, &g.mainCache)
	return value, nil

func (g *Group) getLocally(ctx context.Context, key string, dest Sink) (ByteView, error) {
	err := g.getter.Get(ctx, key, dest)
	if err != nil {
		return ByteView{}, err
	}
	return dest.view()
}

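getLocally invokes the Getter callback that the user registered when creating the Group. The callback typically loads the data from a backing store such as a database.

Once the load succeeds, populateCache stores the key-value pair in mainCache, and the value is returned to the caller.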
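
The comment "only one caller of load gets this return value" in the excerpt above hints at one more piece: in the groupcache source, the body of load runs inside a singleflight group, so concurrent misses on the same key share a single load. The sketch below shows the technique with the standard library only. Flight and Do are illustrative names, and values are plain strings to keep it short; it is not a copy of the library's package.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// call is an in-flight load for one key.
type call struct {
	wg  sync.WaitGroup
	val string
	err error
}

// Flight suppresses duplicate concurrent calls for the same key.
type Flight struct {
	mu sync.Mutex
	m  map[string]*call
}

// Do runs fn at most once at a time per key. Callers that arrive while
// a call for the same key is in flight wait for it and share its result.
func (f *Flight) Do(key string, fn func() (string, error)) (string, error) {
	f.mu.Lock()
	if f.m == nil {
		f.m = make(map[string]*call)
	}
	if c, ok := f.m[key]; ok {
		f.mu.Unlock()
		c.wg.Wait() // someone else is already loading this key
		return c.val, c.err
	}
	c := new(call)
	c.wg.Add(1)
	f.m[key] = c
	f.mu.Unlock()

	c.val, c.err = fn()
	c.wg.Done()

	f.mu.Lock()
	delete(f.m, key) // results are shared, not cached
	f.mu.Unlock()
	return c.val, c.err
}

func main() {
	var f Flight
	var loads int32
	var wg sync.WaitGroup
	for i := 0; i < 5; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			f.Do("red", func() (string, error) {
				atomic.AddInt32(&loads, 1)
				time.Sleep(50 * time.Millisecond) // simulate a slow database read
				return "#FF0000", nil
			})
		}()
	}
	wg.Wait()
	// Usually far fewer loads than callers; the exact count depends on scheduling.
	fmt.Println("loads:", atomic.LoadInt32(&loads))
}
```

Note that Do only merges calls that overlap in time. Once a call finishes, its entry is removed, so remembering the value for later requests is the job of the cache, not of singleflight.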

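4.2 Cache write flow

Content is written into mainCache.

The only way to put data into the cache is to write to the groupcache.Sink inside the callback, and the callback is invoked only on a cache miss. As a result, once a value has been cached it cannot be updated; it stays as-is until it is evicted.

5. Summary

This article looked at the groupcache API from a high level, got a demo program up and running, and then stepped through the code in a debugger to understand two basic operations: reading content from the groupcache cache and writing content into it.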
