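Your Go program is quietly leaking memory

What is a memory leak

A memory leak occurs when memory that a running program no longer needs cannot be released, or simply is not released. Put bluntly, some code is hogging a resource it has stopped using, and that memory is wasted. As leaked memory piles up, it eats into the memory the program needs for normal work. In mild cases the program gets slower and slower; in severe cases the leaked memory accumulates until there is nothing left to run on and the program dies with an OOM (Out Of Memory) error. Memory leaks are usually very hard to spot, so to keep a program healthy we need to pay attention to how to avoid writing leaky code in the first place.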

Memory leaks from misusing slice and string

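Taking a sub-slice with an expression such as a[1:3] creates a new slice, and used carelessly this can leak memory.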

Memory leak analysis

Start with how a slice is laid out in memory.

Here is a figure from 《Go 入门指南》:

Figure from 《Go 入门指南》

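In the figure, x is a slice.

Slicing a slice creates a new slice.

The new slice gets its own slice header, but no elements are copied: both slices point into the same underlying array.

So y, obtained by slicing x, shares x's underlying array. Even when x is no longer used, the whole array stays reachable through y and cannot be garbage collected, including the elements of x that y never touches.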

Verifying it

Let's verify this with code:

func TestSlice(t *testing.T) {
    var a []int
    for i := 0; i < 100; i++ {
        a = append(a, i)
    }

    var b = a[:10]
    println(&a, &b)
    println(&a[0], &b[0])
}

Running it prints:

0xc000038748 0xc000038730
0xc000148400 0xc000148400

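The addresses of a[0] and b[0] are identical, so a and b share one underlying array, even though a and b themselves are separate slice headers at different addresses. If a is no longer needed but b is still in use, the array behind a cannot be reclaimed: b needs only 10 elements, yet it keeps all the elements of a alive.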
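The sharing can also be observed without printing addresses. The sketch below is illustrative only (the names sharesBacking, backing and head are made up for this example): it writes through the sub-slice and reads the value back through the original slice, then prints the capacity, which still spans the whole array.

```go
package main

import "fmt"

// sharesBacking reports whether a write through head is visible through
// backing, which is only possible if both slices use the same underlying array.
// (Hypothetical helper, not part of the original article.)
func sharesBacking(backing, head []int) bool {
	old := head[0]
	head[0] = old + 1
	shared := backing[0] == old+1
	head[0] = old // restore the original value
	return shared
}

func main() {
	backing := make([]int, 100)
	head := backing[:10] // new slice header, same underlying array

	fmt.Println(sharesBacking(backing, head)) // true
	fmt.Println(len(head), cap(head))         // 10 100
}
```

The capacity of 100 is the visible hint that head still pins the full array, not just its first 10 elements.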

Note that slicing a string also shares the underlying bytes, so careless use of substrings can leak memory in the same way.

Similar situations in other languages

Slices are not unique to Go.

Python, for example, also has slicing. Consider the following:

>>> a=[1,2,4,5]
>>> b=a[:3]
>>> id(a[0])
140700163291672
>>> id(b[0])
140700163291672

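a[0] and b[0] have the same id, so both lists refer to the same element object. (Note that a Python list slice copies the list of references itself; only the elements are shared.)

Java is similar: the SubList returned by List.subList is a view backed by the original List, so it keeps that List alive.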

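Solution

In TestSlice above, both slices are local variables that are never passed anywhere else, so the whole array becomes unreachable once the function returns and nothing leaks.

If we cannot guarantee that a slice is used only as a local variable and never passed on, we should copy the data we need out of it to prevent the leak. Either of the following two approaches works: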

func TestSliceSolution(t *testing.T) {
    var a, b []int
    for i := 0; i < 100; i++ {
        a = append(a, i)
    }

    b = append(b, a[:10]...)
    println(&a[0], &b[0])
}

//0xc000014800 0xc000020230

func TestSliceSolution2(t *testing.T) {
    var a, b []int
    for i := 0; i < 100; i++ {
        a = append(a, i)
    }

    b = make([]int, 10)
    copy(b, a[:10])
    println(&a[0], &b[0])
}

//0xc000014800 0xc00003e6d0

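Memory leaks from misusing time.Ticker

A Ticker differs from a Timer: a Timer fires only once, whereas a Ticker keeps firing until Stop is called.

Normal use of a Ticker looks like this: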

func TestTickerNormal(t *testing.T) {
    ticker := time.NewTicker(time.Second)
    defer ticker.Stop()
    go func() {
        for {
            fmt.Println(<-ticker.C)
        }
    }()

    time.Sleep(time.Second * 3)
    fmt.Println("finish")
}

//2022-03-17 12:01:06.279504 +0800 CST m=+1.000922333
//2022-03-17 12:01:07.281379 +0800 CST m=+2.002815014
//finish
//2022-03-17 12:01:08.280861 +0800 CST m=+3.002314240

Memory leak analysis

To see why Stop matters, compare heap profiles with and without it.

A Ticker that calls Stop:

func TestTickerUsingStop(t *testing.T) {
    for i := 0; i < 100_0000; i++ {
        go func() {
            ticker := time.NewTicker(time.Second)
            defer ticker.Stop()
            for i := 0; i < 3; i++ {
                <-ticker.C
            }
        }()
    }
    time.Sleep(10 * time.Second)
    
    // The code below only writes a heap profile; it is not essential to the example
    f, _ := os.Create("1.prof")
    defer f.Close()
    runtime.GC()
    _ = pprof.WriteHeapProfile(f)
    log.Println("finish")
}

Then run go tool pprof 1.prof and enter top, which prints:

Dropped 11 nodes (cum <= 2.09MB)
      flat  flat%   sum%        cum   cum%
  402.16MB 96.08% 96.08%   402.16MB 96.08%  runtime.malg
    8.67MB  2.07% 98.15%     8.67MB  2.07%  runtime.allgadd
    6.23MB  1.49% 99.64%     6.23MB  1.49%  time.startTimer
         0     0% 99.64%     6.23MB  1.49%  demo.TestTickerUsingStop.func1
         0     0% 99.64%   410.83MB 98.15%  runtime.newproc.func1
         0     0% 99.64%   410.83MB 98.15%  runtime.newproc1
         0     0% 99.64%   410.83MB 98.15%  runtime.systemstack
         0     0% 99.64%     6.23MB  1.49%  time.NewTicker

A Ticker that does not call Stop:

func TestTickerWithoutUsingStop(t *testing.T) {
    for i := 0; i < 100_0000; i++ {
        go func() {
            ticker := time.NewTicker(time.Second)
            for i := 0; i < 3; i++ {
                <-ticker.C
            }
        }()
    }
    time.Sleep(10 * time.Second)
    
    // The code below only writes a heap profile; it is not essential to the example
    f, _ := os.Create("2.prof")
    defer f.Close()
    runtime.GC()
    _ = pprof.WriteHeapProfile(f)
    log.Println("finish")
}

Following the same steps as before gives this output:

Dropped 10 nodes (cum <= 3.04MB)
      flat  flat%   sum%        cum   cum%
  378.65MB 62.21% 62.21%   378.65MB 62.21%  runtime.malg
  210.02MB 34.51% 96.72%   219.83MB 36.12%  time.NewTicker
    9.81MB  1.61% 98.33%     9.81MB  1.61%  time.startTimer
    8.67MB  1.42% 99.75%     8.67MB  1.42%  runtime.allgadd
         0     0% 99.75%   219.83MB 36.12%  demo.TestTickerWithoutUsingStop.func1
         0     0% 99.75%   387.32MB 63.64%  runtime.newproc.func1
         0     0% 99.75%   387.32MB 63.64%  runtime.newproc1
         0     0% 99.75%   387.32MB 63.64%  runtime.systemstack

• flat: memory allocated by this function itself and still held
• cum: memory allocated by this function or by the functions it calls

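Without Stop, time.NewTicker still holds 210.02MB (flat) after a forced GC, whereas with Stop its flat usage is 0: the Tickers that were never stopped are not released. (These profiles predate Go 1.23; since Go 1.23 the garbage collector can reclaim an unreferenced Ticker even if Stop was never called.)

Also note that Stop does not close the channel: once a Ticker has been stopped, <-ticker.C blocks forever, and the goroutine waiting on it leaks: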

func TestTicker(t *testing.T) {
    fmt.Println("NumGoroutine:", runtime.NumGoroutine())
    go func() {
        ticker := time.NewTicker(time.Second)
        ticker.Stop() // note: the ticker is stopped first
        for i := 0; i < 3; i++ {
            <-ticker.C
        }
        fmt.Println("ticker finish")
    }()

    time.Sleep(5 * time.Second)
    fmt.Println("NumGoroutine:", runtime.NumGoroutine())
}

// Output:
// NumGoroutine: 2
// NumGoroutine: 3

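Memory leaks from misusing channels

It is often said that nine out of ten memory leaks in Go are goroutine leaks, which shows how common channel-related leaks are. Channel leaks fall into two main cases, which I cover in detail in the article 《老手也常误用！详解 Go channel 内存泄漏问题》. Here I will only briefly show the leaking code and explain the cause.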

select-case

func TestLeakOfMemory(t *testing.T) {
   fmt.Println("NumGoroutine:", runtime.NumGoroutine())
   chanLeakOfMemory()
   time.Sleep(time.Second * 3) // wait for the goroutine to run so the count is not printed too early
   fmt.Println("NumGoroutine:", runtime.NumGoroutine())
}

func chanLeakOfMemory() {
   errCh := make(chan error) 
   go func() { 
      time.Sleep(2 * time.Second)
      errCh <- errors.New("chan error") // (1)
      fmt.Println("finish sending")
   }()

   var err error
   select {
   case <-time.After(time.Second): // (2) <-ctx.Done() is also commonly used here
      fmt.Println("超时") // "超时" means "timed out"
   case err = <-errCh: 
      if err != nil {
         fmt.Println(err)
      } else {
         fmt.Println(nil)
      }
   }
}

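A send on an unbuffered channel blocks until a receiver is ready, so the send at (1) blocks. Once the timeout at (2) fires, chanLeakOfMemory returns and no goroutine will ever receive from errCh, so (1) blocks forever and its goroutine leaks. The output is therefore: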

NumGoroutine: 2
超时
NumGoroutine: 3

for-range

func TestLeakOfMemory2(t *testing.T) {
   fmt.Println("NumGoroutine:", runtime.NumGoroutine())
   chanLeakOfMemory2()
   time.Sleep(time.Second * 3) // wait for the goroutine to run so the count is not printed too early
   fmt.Println("NumGoroutine:", runtime.NumGoroutine())
}

func chanLeakOfMemory2() {
   ich := make(chan int, 100)
   // sender
   go func() {
      defer close(ich)
      for i := 0; i < 10000; i++ {
         ich <- i // (2)
         time.Sleep(time.Millisecond) // throttle the sender a little
      }
   }()
   // receiver
   go func() {
      ctx, cancel := context.WithTimeout(context.Background(), time.Second)
      defer cancel()
      for i := range ich { 
         if ctx.Err() != nil { // (1)
            fmt.Println(ctx.Err())
            return
         }
         fmt.Println(i)
      }
   }()
}

// Output:
// NumGoroutine: 2
// 0
// 1
// ...(omitted)...
// 789
// context deadline exceeded
// NumGoroutine: 3

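When ctx.Err() != nil, the receiver returns at (1) and nothing reads from ich any more. Once the buffer of ich fills up, the sender blocks at (2) forever and never reaches close(ich).

Solution

If the receiver may need to exit before the channel is closed, then to prevent a leak: when sender and receiver exchange values one-to-one, give the channel a buffer of 1; when they exchange values many-to-many, use a dedicated stop channel to tell the sender to stop and close the channel.

For reasons of space, see the article 《老手也常误用！详解 Go channel 内存泄漏问题》 for more detail.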
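The stop-channel approach can be sketched as follows. This is a minimal illustration under assumed names (produce and stop do not appear in the code above): the sender selects between sending and the stop signal, so it can always exit and close its channel once the receiver has lost interest.

```go
package main

import "fmt"

// produce sends 0..n-1 on the returned channel and closes it when done.
// If stop is closed first, the sender returns early instead of blocking
// on a send that nobody will receive.
// (Hypothetical helper for illustration.)
func produce(n int, stop <-chan struct{}) <-chan int {
	ch := make(chan int)
	go func() {
		defer close(ch) // only the sender closes the data channel
		for i := 0; i < n; i++ {
			select {
			case ch <- i:
			case <-stop:
				return
			}
		}
	}()
	return ch
}

func main() {
	stop := make(chan struct{})
	ch := produce(10000, stop)

	sum := 0
	for i := 0; i < 3; i++ {
		sum += <-ch
	}
	close(stop) // receiver exits early and tells the sender to stop

	for range ch {
		// drain until the sender closes ch; this loop always terminates
	}
	fmt.Println("sum of first three values:", sum) // 3
}
```

The receiver closes stop and the sender closes ch, so each channel is closed by exactly one side and no send can hit a closed channel.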

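Summary

The examples above may all look like minor problems, and each one leaks only a little memory. But keep in mind that such code may run inside a goroutine, and if every request is handled by its own goroutine (as in a backend that spawns one goroutine per request), then the more requests arrive, the more memory leaks. Memory leaks are caused by exactly this kind of seemingly trivial problem. Left unattended, they end in frequent crashes and stalls in production, so they deserve serious attention in our day-to-day work.

References

• Some possible memory-leak scenarios: https://gfw.go101.org/article/memory-leaking.html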
