Collecting and analyzing profile data with go tool pprof

What is a profile?

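In performance debugging, a profile is a portrait of an application: how it uses CPU, memory, and other resources. It answers how much CPU the application consumes, which parts consume it, what share each function accounts for, and which functions are waiting for CPU time. With that information we can plan capacity for the application and locate performance bottlenecks quickly.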

Go places a strong emphasis on performance, so profiling libraries ship with the language. This article explains how to profile a Go application.

In Go, the aspects of a running application that are usually profiled are:

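CPU profile: reports the program's CPU usage by sampling, at a fixed frequency, what the application is doing on the CPU (including register state)
Memory profile (heap profile): reports the program's memory usage
Block profile: reports where goroutines are not running because they are blocked; useful for analyzing and tracking down bottlenecks such as deadlocks
Goroutine profile: reports goroutine usage: which goroutines exist and what their call relationships look like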
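As a quick way to see which of these profiles the runtime exposes, the following minimal sketch lists the profiles registered with runtime/pprof. The helper name profileNames is made up for this illustration. Note that the CPU profile does not appear in this list, because it is controlled separately through StartCPUProfile and StopCPUProfile.

```go
package main

import (
	"fmt"
	"runtime/pprof"
)

// profileNames is a hypothetical helper that returns the names of all
// profiles currently registered with runtime/pprof.
func profileNames() []string {
	var names []string
	for _, p := range pprof.Profiles() {
		names = append(names, p.Name())
	}
	return names
}

func main() {
	for _, name := range profileNames() {
		// Count reports how many execution stacks the profile holds right now.
		fmt.Printf("%-14s %d\n", name, pprof.Lookup(name).Count())
	}
}
```

The names printed here (heap, goroutine, block, and so on) are the same names that show up under /debug/pprof/ when the HTTP handlers are enabled.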

Two ways to collect profiles

Go provides two libraries for collecting profile data: runtime/pprof and net/http/pprof.

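Tool-style applications

If your application is a one-off that runs for a while and then exits, the best approach is to save the profile to a file when the application exits and analyze it afterwards. The runtime/pprof library covers this case.

The pprof package has a simple interface. pprof.StartCPUProfile() starts CPU profiling of the current program and writes the data to the w io.Writer passed to it; StopCPUProfile() stops profiling.

In main.go: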

f, err := os.Create(*cpuprofile)
...
pprof.StartCPUProfile(f)
defer pprof.StopCPUProfile()

When the application finishes, it leaves behind a file containing the CPU profile data.

To collect memory data, call WriteHeapProfile directly; there are no start and stop steps:

f, err := os.Create(*memprofile)
pprof.WriteHeapProfile(f)
f.Close()

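Service-style applications

If your application runs continuously, such as a web service, use the net/http/pprof library, which serves profile data over HTTP.

Add one line to your imports: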

import _ "net/http/pprof"

Start an HTTP listener in the main function:

go func() {
    http.ListenAndServe(":6060", nil)
}()

The HTTP service now has an extra /debug/pprof endpoint. Visiting it returns something like this:

/debug/pprof/

profiles:
0	block
756	goroutine
16100	heap
0	mutex
94	threadcreate

full goroutine stack dump

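Collecting and analyzing profile data with go tool pprof

Collecting the data is only the first step; the go tool pprof command-line tool is what we use to analyze it.

Graphical output requires graphviz, which can be installed on Debian or Ubuntu with: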

$ sudo apt-get install -y graphviz

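Note that profile data is dynamic. To get meaningful data, make sure the application is under substantial load (for example a service handling real traffic, or one under simulated load from another tool). If the application is idle, the results may be meaningless.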

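We use CPU profile analysis as the example and introduce two ways of analyzing it.

Terminal

The simplest way to use go tool pprof is go tool pprof [binary] [source], where binary is the application's executable (used to resolve symbols) and source is where the profile data comes from, either a local file or an HTTP address.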

 ➜  go tool pprof ./hyperkube http://172.16.3.232:10251/debug/pprof/profile
Fetching profile from http://172.16.3.232:10251/debug/pprof/profile
Please wait... (30s)
Saved profile in /home/cizixs/pprof/pprof.hyperkube.172.16.3.232:10251.samples.cpu.002.pb.gz
Entering interactive mode (type "help" for commands)
(pprof)

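The command samples the CPU profile for 30 seconds by default (append ?seconds=60 to the URL to sample for 60 seconds instead) and then enters an interactive shell. Type help to see the supported commands.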

A useful command is topN, which lists the N entries that consume the most time:

(pprof) top10
130ms of 360ms total (36.11%)
Showing top 10 nodes out of 180 (cum >= 10ms)
      flat  flat%   sum%        cum   cum%
      20ms  5.56%  5.56%      100ms 27.78%  encoding/json.(*decodeState).object
      20ms  5.56% 11.11%       20ms  5.56%  runtime.(*mspan).refillAllocCache
      20ms  5.56% 16.67%       20ms  5.56%  runtime.futex
      10ms  2.78% 19.44%       10ms  2.78%  encoding/json.(*decodeState).literalStore
      10ms  2.78% 22.22%       10ms  2.78%  encoding/json.(*decodeState).scanWhile
      10ms  2.78% 25.00%       40ms 11.11%  encoding/json.checkValid
      10ms  2.78% 27.78%       10ms  2.78%  encoding/json.simpleLetterEqualFold
      10ms  2.78% 30.56%       10ms  2.78%  encoding/json.stateBeginValue
      10ms  2.78% 33.33%       10ms  2.78%  encoding/json.stateEndValue
      10ms  2.78% 36.11%       10ms  2.78%  encoding/json.stateInString

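Each row describes one function. flat and flat% are the time the function itself spent on the CPU and its share of the total; sum% is the running total of flat% down to that row; cum and cum% are the time and share of the function together with the functions it calls, also known as the cumulative value.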

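Once top has shown which functions are expensive, list shows the source code of the matching functions with the time spent on each line: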

 (pprof) list podFitsOnNode
Total: 120ms
ROUTINE ======================== k8s.io/kubernetes/plugin/pkg/scheduler.podFitsOnNode in /home/cizixs/go/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/plugin/pkg/scheduler/generic_scheduler.go
         0       20ms (flat, cum) 16.67% of Total
         .          .    230:
         .          .    231:// Checks whether node with a given name and NodeInfo satisfies all predicateFuncs.
         .          .    232:func podFitsOnNode(pod *api.Pod, meta interface{}, info *schedulercache.NodeInfo, predicateFuncs map[string]algorithm.FitPredicate) (bool, []algorithm.PredicateFailureReason, error) {
         .          .    233:	var failedPredicates []algorithm.PredicateFailureReason
         .          .    234:	for _, predicate := range predicateFuncs {
         .       20ms    235:		fit, reasons, err := predicate(pod, meta, info)
         .          .    236:		if err != nil {
         .          .    237:			err := fmt.Errorf("SchedulerPredicates failed due to %v, which is unexpected.", err)
         .          .    238:			return false, []algorithm.PredicateFailureReason{}, err
         .          .    239:		}
         .          .    240:		if !fit {

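The disasm command is used the same way and shows the corresponding assembly code.

Visualization

The web command generates the call graph as an svg file and opens it in a browser.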

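This call graph carries more information, and a picture makes it easier to grasp the application as a whole. Each box is a function; the larger the box, the longer it ran (including the time of the functions it calls, although size is not strictly proportional to time). Arrows between boxes show call relationships, and the number on an arrow is the execution time of the called function.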

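web also accepts a function name, for example encoding/json.(*decodeState).object; the usage is the same as for disasm.

The pdf command is similar and produces a pdf file. Run pprof --help to see the other options.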

Another way to visualize the data is to start an HTTP server directly:

go tool pprof -http="10.224.27.152:8081" ./hyperkube http://172.16.3.232:10251/debug/pprof/profile

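Open 10.224.27.152:8081 in a browser to see the various views.

Note: this article draws heavily on the article "使用 pprof 和火焰图调试 golang 应用" (Debugging Go applications with pprof and flame graphs), combined with the author's own day-to-day practice.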