pprof模块

Go语言项目中的性能优化主要有以下几个方面:

  • CPU profile:报告程序的 CPU 使用情况,按照一定频率去采集应用程序在 CPU 和寄存器上面的数据
  • Memory Profile(Heap Profile):报告程序的内存使用情况
  • Block Profiling:报告 goroutines 不在运行状态的情况,可以用来分析和查找死锁等性能瓶颈
  • Goroutine Profiling:报告 goroutines 的使用情况,有哪些 goroutine,它们的调用关系是怎样的

Go语言内置了获取程序的运行数据的工具,包括以下两个标准库:

runtime/pprofnet/http/pprof

pprof开启后,每隔一段时间(10ms)就会收集下当前的堆栈信息,获取格格函数占用的CPU以及内存资源;最后通过对这些采样数据进行分析,形成一个性能分析报告。

注意,我们只应该在性能测试的时候才在代码中引入pprof。

准备工具

图形化工具

  1. 安装pprof工具包
go get -u github.com/google/pprof
go tool pprof
  1. 安装graphviz,以支持火焰图
    这个需要在服务器端安装,可以使用各个操作系统的包管理工具,或者选择源码安装:
brew install graphviz # for macos
apt-get install graphviz # for ubuntu
yum install graphviz # for centos

为什么提示在服务端安装
如果服务端和客户端在同一台机器上不存这个问题
但是如果服务端和客户端不在同一台机器上,你需要知道graphviz装在哪

压测工具

不详细介绍,如果有需要的话,可以使用
https://github.com/wg/wrk 或 https://github.com/adjust/go-wrk

go-torch

go tool pprofgo tool pprof
工具型应用

如果你的应用程序是运行一段时间就结束退出类型。那么最好的办法是在应用退出的时候把 profiling 的报告保存到文件中,进行分析。对于这种情况,可以使用runtime/pprof库。 首先在代码中导入runtime/pprof工具:

import "runtime/pprof"

网上示例代码:

package main

import (
    "flag"
    "fmt"
    "log"
    "os"
    "runtime/pprof"
)

var (
    cpuprofile = flag.String("cpuprofile", "", "write cpu profile to file.")
)

func main() {
    log.Println("begin")
    flag.Parse()

    if *cpuprofile != "" {
        f, err := os.Create(*cpuprofile)
        if err != nil {
            log.Fatal(err)
        }

        pprof.StartCPUProfile(f)  // 开启CPU性能分析
        defer pprof.StopCPUProfile()  // 停止CPU性能分析
    }

    for i := 0; i < 30; i++ {
        nums := fibonacci(i)
        fmt.Println(nums)
    }
}

func fibonacci(num int) int {
    if num < 2 {
        return 1
    }
    return fibonacci(num-1) + fibonacci(num-2)
}

注意这两句,分析CPU性能:

pprof.StartCPUProfile(f)  // 开启CPU性能分析
defer pprof.StopCPUProfile()  // 停止CPU性能分析
go build test.go--cpuprofile./test --cpuprofile=cpu.profcpu.profgo tool pprof
pprof.WriteHeapProfile(w io.Writer)
服务型应用
net/http/pprofhttp.DefaultServeMuxhttp.ListenAndServe(“0.0.0.0:8000”, nil)net/http/pprof
import _ "net/http/pprof"  //注意下划线

网上示例代码:

package main
import (
    _ "net/http/pprof"
    "net/http"
)
func main() {
    http.ListenAndServe("0.0.0.0:8000", nil)
}
test.go
go run test.go
/debug/pproflocalhost
http://localhost:8082/debug/pprof/

在这里插入图片描述
这个路径下还有几个子页面:

类型 描述 备注
allocs 内存分配情况的采样信息 可以用浏览器打开,但可读性不高
blocks 阻塞操作情况的采样信息 可以用浏览器打开,但可读性不高
cmdline 显示程序启动命令及参数 可以用浏览器打开
goroutine 当前所有协程的堆栈信息 可以用浏览器打开,但可读性不高
heap 堆上内存使用情况的采样信息 可以用浏览器打开,但可读性不高
mutex 锁争用情况的采样信息 可以用浏览器打开,但可读性不高
profile CPU占用情况的采样信息, 持续 30s 浏览器打开会下载文件
threadcreate 系统线程创建情况的采样信息 可以用浏览器打开,但可读性不高
trace 程序运行跟踪信息 浏览器打开会下载文件
Gin框架使用pprof

有两个包可以用:

package main

import (
	"github.com/gin-contrib/pprof"
	"github.com/gin-gonic/gin"
)

func main() {
  router := gin.Default()
  pprof.Register(router)
  router.Run(":8080")
}
package main

import (
	"github.com/gin-gonic/gin"
	"github.com/DeanThompson/ginpprof"
)

func main() {
	router := gin.Default()
	ginpprof.Wrap(router)
	router.Run(":8080")
}

注意:
如果使用router.Use(gzip.Gzip(gzip.DefaultCompression))等类似功能的方法需要注释掉,否则浏览器打开页面会触发下载

pprof与性能测试结合
go test
  • cpuprofile:cpu profiling 数据要保存的文件地址
  • memprofile:memory profiling 数据要报文的文件地址

我们还可以选择将pprof与性能测试相结合,比如:
下面执行测试的同时,也会执行 CPU profiling,并把结果保存在 cpu.prof 文件中:

go test -bench . -cpuprofile=cpu.prof

下面执行测试的同时,也会执行 Mem profiling,并把结果保存在 cpu.prof 文件中:

go test -bench . -memprofile=./mem.prof
.profgo tool pprof
go tool pprof命令

介绍

go tool pprof
go tool pprofgo tool pprof
go tool pprof [binary] [source]

其中:

  • binary 是应用的二进制文件,用来解析各种符号;
  • source 表示 profile 数据的来源,可以是本地的文件,也可以是 http 地址。

注意事项: 获取的 Profiling 数据是动态的,要想获得有效的数据,请保证应用处于较大的负载(比如正在生成中运行的服务,或者通过其他工具模拟访问压力)。否则如果应用处于空闲状态,得到的结果可能没有任何意义。

图形化展示

github.com/gin-contrib/pprof
$ go run main.go 
[GIN-debug] [WARNING] Running in "debug" mode. Switch to "release" mode in production.
 - using env:   export GIN_MODE=release
 - using code:  gin.SetMode(gin.ReleaseMode)

[GIN-debug] GET    /debug/pprof/             --> github.com/DeanThompson/ginpprof.IndexHandler.func1 (3 handlers)
[GIN-debug] GET    /debug/pprof/heap         --> github.com/DeanThompson/ginpprof.HeapHandler.func1 (3 handlers)
[GIN-debug] GET    /debug/pprof/goroutine    --> github.com/DeanThompson/ginpprof.GoroutineHandler.func1 (3 handlers)
[GIN-debug] GET    /debug/pprof/allocs       --> github.com/DeanThompson/ginpprof.AllocsHandler.func1 (3 handlers)
[GIN-debug] GET    /debug/pprof/block        --> github.com/DeanThompson/ginpprof.BlockHandler.func1 (3 handlers)
[GIN-debug] GET    /debug/pprof/threadcreate --> github.com/DeanThompson/ginpprof.ThreadCreateHandler.func1 (3 handlers)
[GIN-debug] GET    /debug/pprof/cmdline      --> github.com/DeanThompson/ginpprof.CmdlineHandler.func1 (3 handlers)
[GIN-debug] GET    /debug/pprof/profile      --> github.com/DeanThompson/ginpprof.ProfileHandler.func1 (3 handlers)
[GIN-debug] GET    /debug/pprof/symbol       --> github.com/DeanThompson/ginpprof.SymbolHandler.func1 (3 handlers)
[GIN-debug] POST   /debug/pprof/symbol       --> github.com/DeanThompson/ginpprof.SymbolHandler.func1 (3 handlers)
[GIN-debug] GET    /debug/pprof/trace        --> github.com/DeanThompson/ginpprof.TraceHandler.func1 (3 handlers)
[GIN-debug] GET    /debug/pprof/mutex        --> github.com/DeanThompson/ginpprof.MutexHandler.func1 (3 handlers)
[GIN-debug] Listening and serving HTTP on :8080
/debug/pprof

下边介绍其他的用法:
我们再新建一个终端窗口输入以下命令,当前所有协程的堆栈信息:

go tool pprof --seconds 20 http://localhost:8080/debug/pprof/goroutine

这句话的意思,采集cpu数据并持续20s
也可以用另一种写法:

go tool pprof http://localhost:8080/debug/pprof/goroutine?second=20

看一下执行结果:

$ go tool pprof --seconds 20 http://localhost:8080/debug/pprof/goroutine
Fetching profile over HTTP from http://localhost:8080/debug/pprof/goroutine?seconds=20
Please wait... (20s)
Saved profile in /home/server/pprof/pprof.main.goroutine.001.pb.gz
File: main
Type: goroutine
Time: Dec 18, 2019 at 5:48pm (CST)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) 
(pprof)web
(pprof) web
Couldn't find a suitable web browser!
Set the BROWSER environment variable to your desired browser.
(pprof) 
Saved profile/home/server/pprof/pprof.main.goroutine.001.pb.gz
go tool pprof -http=0.0.0.0:8081 /home/server/pprof/pprof.main.goroutine.001.pb.gz
http://localhost:8081/graphvizCould not execute dot; may need to install graphvizVIEWFlame Graph

命令行交互界面

go tool pprofweb

top

top 默认查看程序中占用CPU前10位的函数
top3 可以查看程序中占用CPU前3位的函数:

(pprof) top 
Showing nodes accounting for 5, 100% of 5 total
Showing top 10 nodes out of 53
      flat  flat%   sum%        cum   cum%
         3 60.00% 60.00%          3 60.00%  runtime.gopark
         1 20.00% 80.00%          1 20.00%  runtime/pprof.writeRuntimeProfile
         1 20.00%   100%          1 20.00%  syscall.Syscall
         0     0%   100%          1 20.00%  bufio.(*Reader).ReadLine
         0     0%   100%          1 20.00%  bufio.(*Reader).ReadSlice
         0     0%   100%          1 20.00%  bufio.(*Reader).fill
         0     0%   100%          1 20.00%  github.com/DeanThompson/ginpprof.GoroutineHandler.func1
         0     0%   100%          1 20.00%  github.com/gin-gonic/gin.(*Context).Next
         0     0%   100%          1 20.00%  github.com/gin-gonic/gin.(*Engine).Run
         0     0%   100%          1 20.00%  github.com/gin-gonic/gin.(*Engine).ServeHTTP
(pprof) top3
Showing nodes accounting for 5, 100% of 5 total
Showing top 3 nodes out of 53
      flat  flat%   sum%        cum   cum%
         3 60.00% 60.00%          3 60.00%  runtime.gopark
         1 20.00% 80.00%          1 20.00%  runtime/pprof.writeRuntimeProfile
         1 20.00%   100%          1 20.00%  syscall.Syscall

其中:

  • flat:当前函数占用CPU的耗时
  • flat::当前函数占用CPU的耗时百分比
  • sun%:函数占用CPU的耗时累计百分比
  • cum:当前函数加上调用当前函数的函数占用CPU的总耗时
  • cum%:当前函数加上调用当前函数的函数占用CPU的总耗时百分比
  • 最后一列:函数名称

在大多数的情况下,我们可以通过分析这五列得出一个应用程序的运行情况,并对程序进行优化。

list

listlist gopark
(pprof) list gopark 
Total: 5
ROUTINE ======================== runtime.gopark in /usr/local/go/src/runtime/proc.go
         3          3 (flat, cum) 60.00% of Total
         .          .    297:   gp.waitreason = reason
         .          .    298:   mp.waittraceev = traceEv
         .          .    299:   mp.waittraceskip = traceskip
         .          .    300:   releasem(mp)
         .          .    301:   // can't do anything that might move the G between Ms here.
         3          3    302:   mcall(park_m)
         .          .    303:}
         .          .    304:
         .          .    305:// Puts the current goroutine into a waiting state and unlocks the lock.
         .          .    306:// The goroutine can be made runnable again by calling goready(gp).
         .          .    307:func goparkunlock(lock *mutex, reason waitReason, traceEv byte, traceskip int) {
ROUTINE ======================== runtime.goparkunlock in /usr/local/go/src/runtime/proc.go
         0          1 (flat, cum) 20.00% of Total
         .          .    303:}
         .          .    304:
         .          .    305:// Puts the current goroutine into a waiting state and unlocks the lock.
         .          .    306:// The goroutine can be made runnable again by calling goready(gp).
         .          .    307:func goparkunlock(lock *mutex, reason waitReason, traceEv byte, traceskip int) {
         .          1    308:   gopark(parkunlock_c, unsafe.Pointer(lock), reason, traceEv, traceskip)
         .          .    309:}
         .          .    310:
         .          .    311:func goready(gp *g, traceskip int) {
         .          .    312:   systemstack(func() {
         .          .    313:           ready(gp, traceskip, true)
(pprof) 

pdf

pdf可以生成可视化pdf文件:

(pprof) pdf
Generating report in profile001.pdf
(pprof)
profile001.pdf

help

其他文件格式执行pprof help查看命令说明。

(pprof) help
  Commands:
    callgrind        Outputs a graph in callgrind format
    comments         Output all profile comments
    disasm           Output assembly listings annotated with samples
    dot              Outputs a graph in DOT format
    eog              Visualize graph through eog
    evince           Visualize graph through evince
    gif              Outputs a graph image in GIF format
    gv               Visualize graph through gv
    kcachegrind      Visualize report in KCachegrind
    list             Output annotated source for functions matching regexp
    pdf              Outputs a graph in PDF format
    peek             Output callers/callees of functions matching regexp
    png              Outputs a graph image in PNG format
    proto            Outputs the profile in compressed protobuf format
    ps               Outputs a graph in PS format
    raw              Outputs a text representation of the raw profile
    svg              Outputs a graph in SVG format
    tags             Outputs all tags in the profile
    text             Outputs top entries in text form
    top              Outputs top entries in text form
    topproto         Outputs top entries in compressed protobuf format
    traces           Outputs all profile samples in text form
    tree             Outputs a text rendering of call graph
    web              Visualize graph through web browser
    weblist          Display annotated source in a web browser
    o/options        List options and their current values
    quit/exit/^D     Exit pprof
参考

Go pprof性能调优
golang pprof 实战
pprof (golang 性能监控与分析)
Golang性能调优(go-torch, go tool pprof)