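Reflection in Go is most often criticized for its poor performance, so in this post we try to make it faster.
If we create an object the normal way, the code looks like this: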
type People struct {
Age int
Name string
}
func New() *People {
return &People{
Age: 18,
Name: "shiina",
}
}
Creating the same People object through reflection (the reflect package) looks like this:
func NewUseReflect() interface{} {
var p People
t := reflect.TypeOf(p)
v := reflect.New(t)
v.Elem().Field(0).Set(reflect.ValueOf(18))
v.Elem().Field(1).Set(reflect.ValueOf("shiina"))
return v.Interface()
}
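This gives us a People object created through Go's reflection.
A simple performance test
We use Go's built-in benchmarking (go test -bench) to compare the two: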
func BenchmarkNew(b *testing.B) {
b.ReportAllocs()
b.ResetTimer()
for i := 0; i < b.N; i++ {
New()
}
}
func BenchmarkNewUseReflect(b *testing.B) {
b.ReportAllocs()
b.ResetTimer()
for i := 0; i < b.N; i++ {
NewUseReflect()
}
}
The results are as follows:
BenchmarkNew
BenchmarkNew-16 1000000000 1.55 ns/op 0 B/op 0 allocs/op
BenchmarkNewUseReflect
BenchmarkNewUseReflect-16 4787185 248 ns/op 64 B/op 2 allocs/op
Creating the object through reflection takes roughly 160 times as long as creating it directly.
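The 0 B/op and 0 allocs/op reported for BenchmarkNew suggest that the compiler inlined New() and never put the object on the heap, so the two benchmarks may not be doing equivalent work. One way to check this is to count allocations with testing.AllocsPerRun while storing the result in a package-level variable. The names sink and allocsPerNew are hypothetical and not part of the original code; this is a minimal sketch, not a replacement for the benchmark.

```go
package main

import (
	"fmt"
	"testing"
)

type People struct {
	Age  int
	Name string
}

func New() *People {
	return &People{Age: 18, Name: "shiina"}
}

// sink is a package-level variable (hypothetical name). Storing the result
// here makes it escape, so the allocation cannot be optimized away.
var sink *People

// allocsPerNew reports the average number of heap allocations per New() call
// when the result is kept alive through sink.
func allocsPerNew() float64 {
	return testing.AllocsPerRun(1000, func() {
		sink = New()
	})
}

func main() {
	fmt.Println("allocations per New():", allocsPerNew())
}
```

Assigning to the same kind of sink inside the benchmark loop would make BenchmarkNew pay for the allocation too, which gives a fairer baseline for the ratios quoted here.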
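Guessing where the cost comes from
So where does most of the cost of creating an object through reflection come from? Let's run an experiment first: we add two more string fields to the struct, and then remove all of its fields.
- Four fields: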
type People struct {
Age int
Name string
Test1 string
Test2 string
}
func New() interface{} {
return &People{
Age: 18,
Name: "shiina",
Test1: "test1",
Test2: "test2",
}
}
func NewUseReflect() interface{} {
var p People
t := reflect.TypeOf(p)
v := reflect.New(t)
v.Elem().Field(0).Set(reflect.ValueOf(18))
v.Elem().Field(1).Set(reflect.ValueOf("shiina"))
v.Elem().Field(2).Set(reflect.ValueOf("test1"))
v.Elem().Field(3).Set(reflect.ValueOf("test2"))
return v.Interface()
}
——————————————————————————————————————————
BenchmarkNew
BenchmarkNew-16 1000000000 1.12 ns/op 0 B/op 0 allocs/op
BenchmarkNewUseReflect
BenchmarkNewUseReflect-16 3334735 366 ns/op 128 B/op 2 allocs/op
- No fields:
type People struct{}
func New() interface{} {
return &People{}
}
func NewUseReflect() interface{} {
var p People
t := reflect.TypeOf(p)
v := reflect.New(t)
return v.Interface()
}
——————————————————————————————————————————
BenchmarkNew
BenchmarkNew-16 1000000000 1.32 ns/op 0 B/op 0 allocs/op
BenchmarkNewUseReflect
BenchmarkNewUseReflect-16 17362648 62.3 ns/op 0 B/op 0 allocs/op
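From these results we can guess that the cost comes mainly from reflect.New() and value.Field().Set().
Next we use Go's pprof to check this guess.
pprof
# Generate the profile data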
kieranhu@KIERANHU-MC0 ~/Downloads> go test -bench=. -benchmem -memprofile memprofile.out -cpuprofile profile.out
# Analyze the profile data
kieranhu@KIERANHU-MC0 ~/Downloads> go tool pprof ./profile.out
Type: cpu
Time: Apr 24, 2020 at 7:38pm (CST)
Duration: 2.02s, Total samples = 1.92s (94.91%)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) list NewUseReflect
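The listing below shows that the time goes mainly to reflect.TypeOf(), reflect.New() and value.Field().Set(). reflect.TypeOf() can be moved out of the function and computed once at initialization, so we start with value.Field().Set().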
ROUTINE ======================== begonia.NewUseReflect in /Users/kieranhu/go/src/begonia/reflect_test.go
60ms 2.17s (flat, cum) 64.97% of Total
. . 29:
10ms 10ms 30:func NewUseReflect() interface{} {
. . 31: var p People
10ms 580ms 32: t := reflect.TypeOf(p)
. 440ms 33: v := reflect.New(t)
10ms 220ms 34: v.Elem().Field(0).Set(reflect.ValueOf(18))
10ms 250ms 35: v.Elem().Field(1).Set(reflect.ValueOf("shiina"))
. 280ms 36: v.Elem().Field(2).Set(reflect.ValueOf("test1"))
10ms 220ms 37: v.Elem().Field(3).Set(reflect.ValueOf("test2"))
10ms 170ms 38: return v.Interface()
. . 39:}
. . 40:
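Getting rid of value.Field().Set()
Let's start with how to assign a value without writing xxx = xxx.
unsafe
Go provides the unsafe package, and with unsafe we can assign to a string in two steps:
- Get the address of the string
- Write a value at that address
Four lines are enough for this:
str := ""
// Get the address of the string
p := uintptr(unsafe.Pointer(&str))
// Write a value at that address
*(*string)(unsafe.Pointer(p)) = "test"
fmt.Println(str)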
-----------------
test
That is the basic idea behind unsafe: it lets us read and write memory directly through an address.
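One caveat on the snippet above: the unsafe documentation treats a uintptr as a plain integer that the garbage collector does not track, so converting it back to a pointer in a later statement is not guaranteed to stay valid. A minimal sketch of the same write that keeps the address as an unsafe.Pointer the whole time follows. setString is a hypothetical helper name, not something from the original code.

```go
package main

import (
	"fmt"
	"unsafe"
)

// setString overwrites the string that target points to by writing through
// an unsafe.Pointer instead of using a normal assignment.
// (Hypothetical helper for illustration.)
func setString(target *string, s string) {
	p := unsafe.Pointer(target) // address of the string variable
	*(*string)(p) = s           // write the new value through that address
}

func main() {
	str := ""
	setString(&str, "test")
	fmt.Println(str)
}
```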
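Operating on a struct
The code above leads to one conclusion:
- As long as we know its memory address, we can manipulate any variable.
Next we can try to operate on a struct.
Go structs have the following properties:
- The fields of a struct are stored in declaration order (with padding where alignment requires it)
- The address of the first field is the address of the struct.
So to replace value.Field().Set() we need to:
- Get the address of the struct
- Get the offset of each field within the struct
- Compute the address of each field
- Modify the value at that address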
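When the struct type is known at compile time, the four steps above can be sketched without any reflection by using unsafe.Offsetof. This only illustrates the steps; setName is a hypothetical helper, and the pointer arithmetic is written as a single expression because that is the form the unsafe documentation allows.

```go
package main

import (
	"fmt"
	"unsafe"
)

type People struct {
	Age  int
	Name string
}

// setName writes name into p.Name using only the struct address and the
// field offset. (Hypothetical helper for illustration.)
func setName(p *People, name string) {
	base := unsafe.Pointer(p)              // 1. address of the struct
	off := unsafe.Offsetof(People{}.Name)  // 2. offset of the field
	field := unsafe.Pointer(uintptr(base) + off) // 3. address of the field
	*(*string)(field) = name               // 4. modify the value
}

func main() {
	p := People{Age: 18}
	setName(&p, "shiina")
	fmt.Println(p.Age, p.Name)
}
```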
Let's obtain these one by one.
Go represents an empty interface internally as follows:
// emptyInterface is the header for an interface{} value.
type emptyInterface struct {
typ *rtype
word unsafe.Pointer
}
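This definition comes from reflect/value.go.
typ describes the concrete type stored in the interface, and word points to the stored data.
So if we force-convert an interface value (the empty interface, interface{}) to emptyInterface, word gives us the address of the data.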
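What word points at depends on what was stored in the interface. The sketch below assumes the two-word layout used by the standard gc toolchain, which is an internal detail rather than a documented API. It checks that an interface holding a pointer carries that pointer in word, while an interface holding a struct value points at a copy. The names eface, dataWord, wordIsPointer and wordIsCopy are hypothetical.

```go
package main

import (
	"fmt"
	"unsafe"
)

type People struct {
	Age  int
	Name string
}

// eface mirrors the two-word layout of interface{} in the gc runtime.
// This is an assumption about runtime internals, not a stable API.
type eface struct {
	typ  unsafe.Pointer
	word unsafe.Pointer
}

// dataWord returns the data word of the interface value v.
func dataWord(v interface{}) unsafe.Pointer {
	return (*eface)(unsafe.Pointer(&v)).word
}

// wordIsPointer reports whether an interface holding the pointer p
// stores p itself in its data word.
func wordIsPointer(p *People) bool {
	return dataWord(p) == unsafe.Pointer(p)
}

// wordIsCopy reports whether an interface holding the value *p
// points at a copy rather than at the original struct.
func wordIsCopy(p *People) bool {
	return dataWord(*p) != unsafe.Pointer(p)
}

func main() {
	p := &People{Age: 18, Name: "shiina"}
	fmt.Println("pointer stored directly:", wordIsPointer(p))
	fmt.Println("value stored as a copy:", wordIsCopy(p))
}
```

This distinction matters for the rest of the post: writing through word changes the caller's object when the interface holds a pointer, and only changes the boxed copy when it holds a value.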
Forced conversion between struct types
Let's first use the example below to look at conversion between different structs:
type Test1 struct {
Test1 string
}
type Test2 struct {
test2 string
}
func TestStruct(t *testing.T) {
t1 := Test1{
Test1: "hello",
}
t2 := *(*Test2)(unsafe.Pointer(&t1))
fmt.Println(t2)
}
----------------
{hello}
Then we change the field types in the two structs and try again:
type Test1 struct {
a int32
b []byte
}
type Test2 struct {
b int16
a string
}
func TestStruct(t *testing.T) {
t1 := Test1{
a:1,
b:[]byte("asdasd"),
}
t2 := *(*Test2)(unsafe.Pointer(&t1))
fmt.Println(t2)
}
----------------
{1 asdasd}
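In this second attempt the two struct types are completely different, yet the conversion still works because the memory lines up. On a little-endian machine the int16 reads the low two bytes of the int32, and padding puts the second field at the same offset in both structs. A string header (pointer and length) also matches the first two words of a []byte header. From this we can draw a simple conclusion:
- Regardless of whether the type signatures match, as long as the underlying memory layout is compatible we can force the conversion, and doing so also bypasses the restriction on unexported fields.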
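To see why the second conversion lines up, we can ask the compiler for the field offsets. This is a minimal sketch using the two structs from the example above; secondFieldOffsets and reinterpret are hypothetical helper names. The int16 field only reads back as 1 on a little-endian machine, so the sketch does not rely on that value.

```go
package main

import (
	"fmt"
	"unsafe"
)

type Test1 struct {
	a int32
	b []byte
}

type Test2 struct {
	b int16
	a string
}

// secondFieldOffsets returns the offset of the second field in each struct.
// Alignment padding after the small integer makes the two offsets equal.
func secondFieldOffsets() (uintptr, uintptr) {
	return unsafe.Offsetof(Test1{}.b), unsafe.Offsetof(Test2{}.a)
}

// reinterpret views the memory of a Test1 as a Test2. Test2 is no larger
// than Test1, so the read stays inside the original value.
func reinterpret(t1 Test1) Test2 {
	return *(*Test2)(unsafe.Pointer(&t1))
}

func main() {
	o1, o2 := secondFieldOffsets()
	fmt.Println("offsets of second field:", o1, o2)
	fmt.Println("sizes:", unsafe.Sizeof(Test1{}), unsafe.Sizeof(Test2{}))
	t2 := reinterpret(Test1{a: 1, b: []byte("asdasd")})
	fmt.Println(t2.a)
}
```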
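Next we copy the emptyInterface definition from reflect/value.go and use a forced conversion to read the word of an interface, that is, the address of its data: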
type emptyInterface struct {
typ *struct{}
word unsafe.Pointer
}
func TestStruct(t *testing.T) {
var in interface{}
in = People{
Age: 18,
Name: "shiina",
Test1: "test1",
Test2: "test2",
}
t2 := uintptr(((*emptyInterface)(unsafe.Pointer(&in))).word)
*(*int)(unsafe.Pointer(t2))=111
fmt.Println(in)
}
---------------
{111 shiina test1 test2}
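Having obtained the struct's address, we used it to modify the first field of the struct. Now for the second step: getting the offsets of the struct's fields.
Reflection makes it easy to get the offset of every field, and together with the struct's address that gives us the address of every field.
Once we have the address of a field, modifying it is straightforward.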
var in interface{}
in = People{
Age: 18,
Name: "shiina",
Test1: "test1",
Test2: "test2",
}
typeP := reflect.TypeOf(in)
offset1 := typeP.Field(1).Offset
offset2 := typeP.Field(2).Offset
offset3 := typeP.Field(3).Offset
t2 := uintptr(((*emptyInterface)(unsafe.Pointer(&in))).word)
*(*int)(unsafe.Pointer(t2)) = 111
*(*string)(unsafe.Pointer(t2 + offset1)) = "hello"
*(*string)(unsafe.Pointer(t2 + offset2)) = "hello1"
*(*string)(unsafe.Pointer(t2 + offset3)) = "hello2"
fmt.Println(in)
---------------------
{111 hello hello1 hello2}
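The code above hard-codes field indices and trusts that each field really is a string. A slightly more defensive sketch looks the field up by name once, checks its kind, and only then hands out the offset for the unsafe write. stringFieldOffset, setStringAt and demo are hypothetical names, not part of the original code.

```go
package main

import (
	"fmt"
	"reflect"
	"unsafe"
)

type People struct {
	Age   int
	Name  string
	Test1 string
	Test2 string
}

// stringFieldOffset looks up a direct (non-promoted) string field by name
// and returns its offset from the start of the struct.
func stringFieldOffset(t reflect.Type, name string) (uintptr, error) {
	f, ok := t.FieldByName(name)
	if !ok {
		return 0, fmt.Errorf("%v has no field %q", t, name)
	}
	if len(f.Index) != 1 {
		return 0, fmt.Errorf("field %q is promoted from an embedded struct", name)
	}
	if f.Type.Kind() != reflect.String {
		return 0, fmt.Errorf("field %q has type %v, not string", name, f.Type)
	}
	return f.Offset, nil
}

// setStringAt writes s at base+off. The offset must come from
// stringFieldOffset for the struct type that base points to.
func setStringAt(base unsafe.Pointer, off uintptr, s string) {
	*(*string)(unsafe.Pointer(uintptr(base) + off)) = s
}

// demo resolves the offset of Name once and then writes through it.
func demo() (string, error) {
	off, err := stringFieldOffset(reflect.TypeOf(People{}), "Name")
	if err != nil {
		return "", err
	}
	p := &People{Age: 18}
	setStringAt(unsafe.Pointer(p), off, "hello")
	return p.Name, nil
}

func main() {
	name, err := demo()
	fmt.Println(name, err)
}
```

The lookup and the kind check run once, for example in init(), so the per-object cost is still just the pointer addition and the store.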
With that, we can replace value.Field().Set().
Here is the complete code with NewQuickReflect() and its benchmarks:
var (
offset1 uintptr
offset2 uintptr
offset3 uintptr
p People
t = reflect.TypeOf(p)
)
func init() {
offset1 = t.Field(1).Offset
offset2 = t.Field(2).Offset
offset3 = t.Field(3).Offset
}
type People struct {
Age int
Name string
Test1 string
Test2 string
}
type emptyInterface struct {
typ *struct{}
word unsafe.Pointer
}
func New() *People {
return &People{
Age: 18,
Name: "shiina",
Test1: "test1",
Test2: "test2",
}
}
func NewUseReflect() interface{} {
v := reflect.New(t)
v.Elem().Field(0).Set(reflect.ValueOf(18))
v.Elem().Field(1).Set(reflect.ValueOf("shiina"))
v.Elem().Field(2).Set(reflect.ValueOf("test1"))
v.Elem().Field(3).Set(reflect.ValueOf("test2"))
return v.Interface()
}
func NewQuickReflect() interface{} {
v := reflect.New(t)
p := v.Interface()
ptr0 := uintptr((*emptyInterface)(unsafe.Pointer(&p)).word)
ptr1 := ptr0 + offset1
ptr2 := ptr0 + offset2
ptr3 := ptr0 + offset3
*((*int)(unsafe.Pointer(ptr0))) = 18
*((*string)(unsafe.Pointer(ptr1))) = "shiina"
*((*string)(unsafe.Pointer(ptr2))) = "test1"
*((*string)(unsafe.Pointer(ptr3))) = "test2"
return p
}
func BenchmarkNew(b *testing.B) {
b.ReportAllocs()
b.ResetTimer()
for i := 0; i < b.N; i++ {
New()
}
}
func BenchmarkNewUseReflect(b *testing.B) {
b.ReportAllocs()
b.ResetTimer()
for i := 0; i < b.N; i++ {
NewUseReflect()
}
}
func BenchmarkNewQuickReflect(b *testing.B) {
b.ReportAllocs()
b.ResetTimer()
for i := 0; i < b.N; i++ {
NewQuickReflect()
}
}
Running it gives the following results:
BenchmarkNew
BenchmarkNew-16 1000000000 1.34 ns/op 0 B/op 0 allocs/op
BenchmarkNewUseReflect
BenchmarkNewUseReflect-16 3715539 276 ns/op 64 B/op 1 allocs/op
BenchmarkNewQuickReflect
BenchmarkNewQuickReflect-16 12772573 94.7 ns/op 64 B/op 1 allocs/op
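As you can see, we went from about 205 times the native cost down to about 70 times, and the gain becomes more pronounced as the struct has more fields.
Here is the pprof listing for NewQuickReflect: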
ROUTINE ======================== begonia.NewQuickReflect in /Users/kieranhu/go/src/begonia/reflect_test.go
120ms 1.07s (flat, cum) 28.53% of Total
. . 57:
. . 58:func NewQuickReflect() interface{} {
40ms 800ms 59: v := reflect.New(t)
. . 60:
. 180ms 61: p := v.Interface()
. . 62: ptr0 := uintptr((*emptyInterface)(unsafe.Pointer(&p)).word)
40ms 40ms 63: ptr1 := ptr0 + offset1
10ms 10ms 64: ptr2 := ptr0 + offset2
. . 65: ptr3 := ptr0 + offset3
10ms 10ms 66: *((*int)(unsafe.Pointer(ptr0))) = 18
. 10ms 67: *((*string)(unsafe.Pointer(ptr1))) = "shiina"
. . 68: *((*string)(unsafe.Pointer(ptr2))) = "test1"
. . 69: *((*string)(unsafe.Pointer(ptr3))) = "test2"
20ms 20ms 70: return p
. . 71:}
. . 72:
The main cost is now reflect.New().
Getting rid of reflect.New()
Pooling
One option is to pool the objects with sync.Pool:
var (
/**
...........
**/
pool sync.Pool
)
func init() {
/**
............
**/
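pool.New = func() interface{} {
// Store the *People boxed in an interface, not the reflect.Value itself,
// so the data word read in NewQuickReflectWithPool points at a People.
return reflect.New(t).Interface()
}
for i := 0; i < 100; i++ {
pool.Put(reflect.New(t).Interface())
}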
}
/**
............
**/
func NewQuickReflectWithPool() interface{} {
p := pool.Get()
ptr0 := uintptr((*emptyInterface)(unsafe.Pointer(&p)).word)
ptr1 := ptr0 + offset1
ptr2 := ptr0 + offset2
ptr3 := ptr0 + offset3
*((*int)(unsafe.Pointer(ptr0))) = 18
*((*string)(unsafe.Pointer(ptr1))) = "shiina"
*((*string)(unsafe.Pointer(ptr2))) = "test1"
*((*string)(unsafe.Pointer(ptr3))) = "test2"
return p
}
func BenchmarkQuickReflectWithPool(b *testing.B) {
b.ReportAllocs()
b.ResetTimer()
for i := 0; i < b.N; i++ {
obj := NewQuickReflectWithPool()
pool.Put(obj)
}
}
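In this benchmark we put the object back into the pool almost as soon as we get it, so it simulates the case where the pool always has objects available: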
BenchmarkNew
BenchmarkNew-16 1000000000 1.26 ns/op 0 B/op 0 allocs/op
BenchmarkNewUseReflect
BenchmarkNewUseReflect-16 5515128 226 ns/op 64 B/op 1 allocs/op
BenchmarkNewQuickReflect
BenchmarkNewQuickReflect-16 21561645 91.4 ns/op 64 B/op 1 allocs/op
BenchmarkQuickReflectWithPool
BenchmarkQuickReflectWithPool-16 40770750 55.6 ns/op 0 B/op 0 allocs/op
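With plenty of objects in the pool, the malloc cost disappears and we go from about 72 times the native cost down to about 44 times.
When the pool runs short, however, the numbers are far less pleasing.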
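A pooled object also has to be reset before reuse, otherwise the next caller sees stale field values. The following minimal sketch shows a pool that stores pointers boxed in interface{} and zeroes each object on release. acquire, release and peopleType are hypothetical names, and the reset uses reflection for brevity even though it adds some cost.

```go
package main

import (
	"fmt"
	"reflect"
	"sync"
)

type People struct {
	Age  int
	Name string
}

var peopleType = reflect.TypeOf(People{})

// pool hands out *People values boxed in interface{}. When it is empty,
// New falls back to reflect.New, which is the slow path.
var pool = sync.Pool{
	New: func() interface{} {
		return reflect.New(peopleType).Interface()
	},
}

// acquire takes an object from the pool, allocating one if needed.
func acquire() interface{} {
	return pool.Get()
}

// release zeroes the object and returns it to the pool.
// obj must be a pointer obtained from acquire.
func release(obj interface{}) {
	v := reflect.ValueOf(obj).Elem()
	v.Set(reflect.Zero(v.Type()))
	pool.Put(obj)
}

func main() {
	obj := acquire()
	p := obj.(*People)
	p.Age, p.Name = 18, "shiina"
	fmt.Println(p.Age, p.Name)
	release(obj)
}
```

Note that sync.Pool may drop idle objects during garbage collection, so pre-filling it in init() does not guarantee the fast path later.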
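Another idea
We can see that most of the time now goes to creating the object through reflection. At this point an idea came to mind: could we start from a People{} value rather than a &People pointer, and rely on this property
- a value type passed by value rather than by pointer is copied
to get a copy of the object at native speed while still using reflection?
If we don't use reflection and the type is known, the code looks like this: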
func TestStruct(t *testing.T) {
p1 := People{}
var p2 interface{}
p2 = p1
ptr0 := uintptr((*emptyInterface)(unsafe.Pointer(&p2)).word)
ptr1 := ptr0 + offset1
ptr2 := ptr0 + offset2
ptr3 := ptr0 + offset3
*((*int)(unsafe.Pointer(ptr0))) = 18
*((*string)(unsafe.Pointer(ptr1))) = "shiina"
*((*string)(unsafe.Pointer(ptr2))) = "test1"
*((*string)(unsafe.Pointer(ptr3))) = "test2"
fmt.Println(p1)
fmt.Println(p2)
}
------------------------
{0 }
{18 shiina test1 test2}
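p1 is unchanged: the writes only affected the copy held in p2.
Unfortunately, when the type cannot be named directly I could not get this to work as imagined, because the writes kept modifying the original variable. In the end I found this way of calling it: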
func TestNew(t *testing.T) {
elemValue := reflect.New(reflect.TypeOf(People{})).Elem()
p := elemValue.Interface()
ptr0 := uintptr((*emptyInterface)(unsafe.Pointer(&p)).word)
ptr1 := ptr0 + offset1
ptr2 := ptr0 + offset2
ptr3 := ptr0 + offset3
*((*int)(unsafe.Pointer(ptr0))) = 18
*((*string)(unsafe.Pointer(ptr1))) = "shiina"
*((*string)(unsafe.Pointer(ptr2))) = "test1"
*((*string)(unsafe.Pointer(ptr3))) = "test2"
fmt.Println(p)
fmt.Println(elemValue)
}
-------------------
{18 shiina test1 test2}
{0 }
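elemValue.Interface() returns a copy, so elemValue itself is left untouched. Benchmarking this version (BenchmarkNewWithElemReflect) gives: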
BenchmarkNew
BenchmarkNew-16 1000000000 1.83 ns/op 0 B/op 0 allocs/op
BenchmarkNewUseReflect
BenchmarkNewUseReflect-16 2992928 372 ns/op 128 B/op 2 allocs/op
BenchmarkNewQuickReflect
BenchmarkNewQuickReflect-16 12648523 98.7 ns/op 64 B/op 1 allocs/op
BenchmarkQuickReflectWithPool
BenchmarkQuickReflectWithPool-16 40309711 58.2 ns/op 0 B/op 0 allocs/op
BenchmarkNewWithElemReflect
BenchmarkNewWithElemReflect-16 12700314 89.0 ns/op 64 B/op 1 allocs/op
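The result is rather disappointing: we gained less than 10 ns, going from about 53 times the native cost to about 48 times, and even that improvement is not stable.
Let's look at the source of reflect.New() and elemValue.Interface():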
- reflect.New()
func New(typ Type) Value {
if typ == nil {
panic("reflect: New(nil)")
}
t := typ.(*rtype)
ptr := unsafe_New(t)
fl := flag(Ptr)
return Value{t.ptrTo(), ptr, fl}
}
- elemValue.Interface()
if v.flag&flagAddr != 0 {
// TODO: pass safe boolean from valueInterface so
// we don't need to copy if safe==true?
c := unsafe_New(t)
typedmemmove(t, c, ptr)
ptr = c
}
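reflect.New() allocates through unsafe_New(), and elemValue.Interface() also calls unsafe_New() to make its copy,
so this approach cannot be meaningfully faster than reflect.New().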
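The same conclusion can be checked without reading the source by counting allocations with testing.AllocsPerRun. This is a minimal sketch: allocsNew, allocsElemInterface, peopleType and the two sink variables are hypothetical names, and the results are stored in package-level variables so the calls cannot be optimized away.

```go
package main

import (
	"fmt"
	"reflect"
	"testing"
)

type People struct {
	Age   int
	Name  string
	Test1 string
	Test2 string
}

var (
	peopleType = reflect.TypeOf(People{})
	sinkValue  reflect.Value // keeps the reflect.New result alive
	sinkIface  interface{}   // keeps the Interface() result alive
)

// allocsNew reports the average number of heap allocations per
// reflect.New call.
func allocsNew() float64 {
	return testing.AllocsPerRun(1000, func() {
		sinkValue = reflect.New(peopleType)
	})
}

// allocsElemInterface reports the average number of heap allocations per
// Interface() call on an addressable struct value.
func allocsElemInterface() float64 {
	elem := reflect.New(peopleType).Elem()
	return testing.AllocsPerRun(1000, func() {
		sinkIface = elem.Interface()
	})
}

func main() {
	fmt.Println("reflect.New:", allocsNew())
	fmt.Println("elem.Interface:", allocsElemInterface())
}
```

If both numbers come out the same, the Elem().Interface() route merely trades one allocation for another, which matches the nearly identical benchmark results above.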
END
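The whole optimization, from idea to experiment to implementation, took about a week of spare time. The more I wrote, the more it felt like I was writing C rather than Go. Maybe I should make my Go look more like Go instead of dreaming up black magic to make it faster (and less safe)? Many thanks to a light workload for leaving me some slacking time to dig into this (x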