The Secrets of Go Reflection

cheniie

Category: Go. Tags: go, reflection

Article link: https://blog.csdn.net/puss0/article/details/113810274

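The previous post covered a few things to watch out for when using reflection in Go. Now let's look at the small secrets behind it. Tickets checked, all aboard: get ready to read some source code. At the top of each source listing I note the file it comes from.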

We start with the TypeOf function, which is defined in \src\reflect\type.go.

/*--type.go--*/
func TypeOf(i interface{}) Type {
	eface := *(*emptyInterface)(unsafe.Pointer(&i))
	return toType(eface.typ)
}

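The function first takes the address of its parameter and then runs it through a series of forced conversions. The relevant definition of Pointer can be found in unsafe.go: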

/*--unsafe.go--*/
type ArbitraryType int
type Pointer *ArbitraryType

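On paper Pointer is a *int, but that declaration exists only for documentation (ArbitraryType is not really part of the unsafe package). What matters is that Pointer is special: it has two tricks of its own that no other type has.

It can be converted to and from a pointer of any type
It can be converted to and from uintptr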

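Calling TypeOf therefore makes three things happen:

The argument is converted to interface{} (if you have seen statements like "reflection operates on interfaces", this is where they come from)
The address of that interface is converted to unsafe.Pointer
The unsafe.Pointer is converted to *emptyInterface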

In Go, interfaces come in two flavors: the empty interface and interfaces with methods.

/*--value.go--*/
// emptyInterface is the header for an interface{} value.
type emptyInterface struct {
	typ  *rtype
	word unsafe.Pointer
}

// nonEmptyInterface is the header for an interface value with methods.
type nonEmptyInterface struct {
	// see ../runtime/iface.go:/Itab
	itab *struct {
		ityp *rtype // static interface type
		typ  *rtype // dynamic concrete type
		hash uint32 // copy of typ.hash
		_    [4]byte
		fun  [100000]unsafe.Pointer // method table
	}
	word unsafe.Pointer
}

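This is a good point to talk about how interfaces are implemented in Go. An interface value has two parts: the address of an itable and the address of a value. The former holds the type information of the value stored in the interface; the latter points to the stored value.
[Figure: layout of an interface value, an itable address followed by a value address]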

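The figure above shows the structure of an interface. In emptyInterface the itable address corresponds to the typ field and the value address to the word field. In nonEmptyInterface the itable address corresponds to the itab field, an anonymous struct whose fun field is the method table; the value address is again the word field. An interface is therefore two pointers wide: 2×8=16 bytes on a 64-bit machine.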
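
The two-pointer claim is easy to check with unsafe.Sizeof. This is a small sketch; ifaceWords is a hypothetical helper, and it counts pointer-sized words so the result does not depend on whether the machine is 32-bit or 64-bit.

```go
package main

import (
	"fmt"
	"io"
	"unsafe"
)

// ifaceWords returns how many pointer-sized words an empty interface
// and a non-empty interface occupy.
func ifaceWords() (empty, nonEmpty uintptr) {
	var e interface{}
	var r io.Reader
	word := unsafe.Sizeof(uintptr(0))
	return unsafe.Sizeof(e) / word, unsafe.Sizeof(r) / word
}

func main() {
	e, r := ifaceWords()
	fmt.Println(e, r) // 2 2
}
```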

The interface definitions mention the rtype type. This is where type information is stored, and it is a struct as well.

/*--type.go--*/
type rtype struct {
	size       uintptr  // size of a value of this type
	ptrdata    uintptr  // number of bytes in the type that can contain pointers
	hash       uint32   // hash of type; avoids computation in hash tables
	tflag      tflag    // extra type information flags
	align      uint8    // alignment of variable with this type
	fieldAlign uint8    // alignment of struct field with this type
	kind       uint8    // kind of the type (enumeration for C)
	alg        *typeAlg // algorithm table
	gcdata     *byte    // garbage collection data
	str        nameOff  // string form
	ptrToThis  typeOff  // type for pointer to this type, may be zero
}

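Only two of these fields look familiar... That's fine. Some of them cannot be used directly and need further computation before they yield real information. Back to TypeOf: after the conversions we end up with eface, a value of type emptyInterface. Note that it is a struct value, not a pointer.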

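Then toType is called. Its job is simple: check whether the typ field, the pointer to the type-information struct, is nil. If it is, toType returns a nil Type, because a value without a type must be nil (returning the nil *rtype as-is would produce a Type interface that is not equal to nil). Otherwise it returns the type information unchanged.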

/*--type.go--*/
func toType(t *rtype) Type {
	if t == nil {
		return nil
	}
	return t
}
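
Why the explicit nil check matters can be shown without touching reflect at all. The sketch below uses made-up types (shape, square) to contrast returning a nil pointer directly with the toType-style check.

```go
package main

import "fmt"

type shape interface{ area() float64 }

type square struct{ side float64 }

func (s *square) area() float64 { return s.side * s.side }

// wrapNaive returns the pointer as-is. A nil *square still carries a
// type, so the resulting interface is NOT equal to nil.
func wrapNaive(s *square) shape { return s }

// wrapChecked mirrors what toType does: a nil pointer becomes a nil interface.
func wrapChecked(s *square) shape {
	if s == nil {
		return nil
	}
	return s
}

func main() {
	fmt.Println(wrapNaive(nil) == nil)   // false
	fmt.Println(wrapChecked(nil) == nil) // true
}
```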

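If you look closely, you will find a large number of methods defined on rtype, and they happen to cover exactly the methods declared by the Type interface. In other words, *rtype implements Type.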

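That is everything that happens around a call to TypeOf. Next, let's see what happens when ValueOf is called. Since ValueOf ultimately returns a value of type Value, let's first look at what the Value struct looks like.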

/*--value.go--*/
type Value struct {
	// typ holds the type of the value represented by a Value.
	typ *rtype

	// Pointer-valued data or, if flagIndir is set, pointer to data.
	// Valid when either flagIndir is set or typ.pointers() is true.
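	// Author's gloss: with flagIndir set, ptr points to the data; otherwise
	// ptr is itself the pointer-valued data. It is only meaningful when
	// flagIndir is set or typ.pointers() is true.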
	ptr unsafe.Pointer

	// flag holds metadata about the value.
	// The lowest bits are flag bits:
	//	- flagStickyRO: obtained via unexported not embedded field, so read-only
	//	- flagEmbedRO: obtained via unexported embedded field, so read-only
	//	- flagIndir: val holds a pointer to the data
	//	- flagAddr: v.CanAddr is true (implies flagIndir)
	//	- flagMethod: v is a method value.
	// The next five bits give the Kind of the value.
	// This repeats typ.Kind() except for method values.
	// The remaining 23+ bits give a method number for method values.
	// If flag.kind() != Func, code can assume that flagMethod is unset.
	// If ifaceIndir(typ), code can assume that flagIndir is set.
	flag

	// A method value represents a curried method invocation
	// like r.Read for some receiver r. The typ+val+flag bits describe
	// the receiver r, but the flag's Kind bits say Func (methods are
	// functions), and the top bits of the flag give the method number
	// in r's type's method table.
}

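So why does calling a struct's method through Value pass the struct to the method, while calling it through Type does not? The answer is right here, in the ptr field. When a method is called through Value, ptr points to the struct value, so the method can reach the value it was called on. When the method is obtained through Type, the resulting Value describes only the function (Method.Func takes the receiver as its first argument), and its ptr no longer refers to any struct value.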
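
The difference shows up directly in how the two are called. Here is a minimal sketch with the public API; Greeter, viaValue and viaType are names invented for this example.

```go
package main

import (
	"fmt"
	"reflect"
)

type Greeter struct{ Name string }

func (g Greeter) Greet() string { return "hi, " + g.Name }

// viaValue calls Greet through a Value: the receiver is already bound.
func viaValue(g Greeter) string {
	m := reflect.ValueOf(g).MethodByName("Greet")
	return m.Call(nil)[0].String()
}

// viaType calls Greet through a Type: Method.Func takes the receiver
// as its first argument, so the caller has to supply it.
func viaType(g Greeter) string {
	m, ok := reflect.TypeOf(g).MethodByName("Greet")
	if !ok {
		return ""
	}
	return m.Func.Call([]reflect.Value{reflect.ValueOf(g)})[0].String()
}

func main() {
	g := Greeter{Name: "gopher"}
	fmt.Println(viaValue(g)) // hi, gopher
	fmt.Println(viaType(g))  // hi, gopher
}
```

Calling m.Func.Call(nil) in viaType would panic with too few arguments, because nothing in that Value knows about g.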

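Let's tweak the code from the previous post to run an experiment. Whether we go through ValueOf or TypeOf, we end up holding a Value. Because the ptr field is unexported, the only way to read it is through reflection. That's right: reflection on reflection.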

package main

import (
	"fmt"
	"reflect"
	"unsafe"
)

type T struct {
	a int    `test:"a_int"`
	s string `test:"s_string"`
}

func (t *T) SetA(a int) {
	t.a = a
}

func (t *T) SetS(s string) {
	t.s = s
}

func (t T) PrintT() {
	fmt.Printf("[T.PrintT]# %2d → %s\n", t.a, t.s)
}

func main() {
	t := T{1, "hello"}
	value := reflect.ValueOf(t)
	method, _ := reflect.TypeOf(t).MethodByName("PrintT")
	p1 := reflect.ValueOf(value).FieldByName("ptr").Pointer()
	p2 := reflect.ValueOf(method.Func).FieldByName("ptr").Pointer()
	fmt.Printf("[value.ptr] %+v\n", *(*T)(unsafe.Pointer(p1)))
	fmt.Printf("[vtype.ptr] %+v\n\n", *(*T)(unsafe.Pointer(p2)))
}

value is obtained from ValueOf, and method.Func is also a Value. We use reflection to read the ptr field of each of these two Values and then reinterpret each address as a T struct. The result:

[value.ptr] {a:1 s:hello}
[vtype.ptr] {a:5002912 s:}

The result speaks for itself: the ptr field of the Value obtained through TypeOf does not hold the actual struct value. That is the whole truth.

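Back to ValueOf. If the interface passed in is nil, the result is a zero Value. Because of reflection, the argument escapes from the stack to the heap. Finally, unpackEface turns the interface into a Value and returns it.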

/*--value.go--*/
func ValueOf(i interface{}) Value {
	if i == nil {
		return Value{}
	}

	// TODO: Maybe allow contents of a Value to live on the stack.
	// For now we make the contents always escape to the heap. It
	// makes life easier in a few places (see chanrecv/mapassign
	// comment below).
	escapes(i)

	return unpackEface(i)
}
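
The nil branch can be observed from outside through Value.IsValid. A small sketch follows; describe is a hypothetical helper.

```go
package main

import (
	"fmt"
	"reflect"
)

// describe reports whether the Value built from i is valid and, if so, its Kind.
func describe(i interface{}) string {
	v := reflect.ValueOf(i)
	if !v.IsValid() {
		return "invalid (zero Value)"
	}
	return v.Kind().String()
}

func main() {
	fmt.Println(describe(nil))         // invalid (zero Value)
	fmt.Println(describe((*int)(nil))) // ptr
	fmt.Println(describe(42))          // int
}
```

Note that a typed nil pointer still has a type, so it does not take the nil branch.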

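The source of unpackEface is below. The first step is again a roundabout conversion, this time producing a pointer to an emptyInterface. As before, if the interface has no type, a zero Value is returned.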

/*--value.go--*/
func unpackEface(i interface{}) Value {
	e := (*emptyInterface)(unsafe.Pointer(&i))
	// NOTE: don't read e.word until we know whether it is really a pointer or not.
	t := e.typ
	if t == nil {
		return Value{}
	}
	f := flag(t.Kind())
	if ifaceIndir(t) {
		f |= flagIndir
	}
	return Value{t, e.word, f}
}

Next comes another conversion. flag is a uintptr, which is 8 bytes on a 64-bit machine.

/*--value.go--*/
type flag uintptr

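The Value struct has a flag field that records additional information about the value. Specifically, the low 5 bits of flag hold the Kind, of which there are 27 (counting Invalid).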

/*--type.go--*/
const (
	Invalid Kind = iota
	Bool
	Int
	Int8
	Int16
	Int32
	Int64
	Uint
	Uint8
	Uint16
	Uint32
	Uint64
	Uintptr
	Float32
	Float64
	Complex64
	Complex128
	Array
	Chan
	Func
	Interface
	Map
	Ptr
	Slice
	String
	Struct
	UnsafePointer
)

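Bits 6 through 10 of flag (counting from 1) also have special meanings, each with a corresponding mask. The meaning of each bit is given in the comments.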

/*--value.go--*/
const (
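	flagKindWidth        = 5 // 27 kinds, so at least 5 bits are needed
	flagKindMask    flag = 1<<flagKindWidth - 1 // kind mask, used to extract the Kind; = 0b11111
	flagStickyRO    flag = 1 << 5 // bit 6: obtained via an unexported, non-embedded field, so read-only
	flagEmbedRO     flag = 1 << 6 // bit 7: obtained via an unexported embedded field, so read-only
	flagIndir       flag = 1 << 7 // bit 8: the Value's ptr field holds the address of the data
	flagAddr        flag = 1 << 8 // bit 9: the value is addressable; this is the bit Value.CanAddr reports, and it implies bit 8 (flagIndir) is also 1
	flagMethod      flag = 1 << 9 // bit 10: the Value is a method value
	flagMethodShift      = 10 // used when calling methods: the method number lives in the bits above this shift
	flagRO          flag = flagStickyRO | flagEmbedRO // read-only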
)
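
To make the bit layout concrete, here is a small decoder sketch. reflect does not export these constants, so the ones below are local copies made for this example, and decodeFlag is a hypothetical helper; the two inputs are hand-built sample flag words.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// Local copies of the masks listed above (not exported by reflect).
const (
	kindWidth = 5
	kindMask  = 1<<kindWidth - 1
	stickyRO  = 1 << 5
	embedRO   = 1 << 6
	indir     = 1 << 7
	addr      = 1 << 8
	method    = 1 << 9
)

// decodeFlag splits a raw flag word into its Kind and the names of the set bits.
func decodeFlag(f uintptr) string {
	var bits []string
	for _, b := range []struct {
		mask uintptr
		name string
	}{
		{stickyRO, "flagStickyRO"},
		{embedRO, "flagEmbedRO"},
		{indir, "flagIndir"},
		{addr, "flagAddr"},
		{method, "flagMethod"},
	} {
		if f&b.mask != 0 {
			bits = append(bits, b.name)
		}
	}
	return reflect.Kind(f&kindMask).String() + " [" + strings.Join(bits, " ") + "]"
}

func main() {
	fmt.Println(decodeFlag(0b10011001))   // struct [flagIndir]
	fmt.Println(decodeFlag(0b1010010011)) // func [flagIndir flagMethod]
}
```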

A few methods are also defined on flag, 5 in total. They are all simple checks of individual flag bits, so let's take a look.

/*--value.go--*/
// kind extracts the Kind stored in the flag.
func (f flag) kind() Kind {
	return Kind(f & flagKindMask)
}
// ro returns flagStickyRO (bit 6) if either read-only bit is set, and 0 otherwise.
func (f flag) ro() flag {
	if f&flagRO != 0 {
		return flagStickyRO
	}
	return 0
}
// mustBe asserts the kind: it panics if the Kind in the flag differs from the expected one.
func (f flag) mustBe(expected Kind) {
	if f.kind() != expected {
		panic(&ValueError{methodName(), f.kind()})
	}
}
// mustBeExported asserts that the value is exported: it panics if it was obtained through an unexported field.
func (f flag) mustBeExported() {
	if f == 0 {
		panic(&ValueError{methodName(), 0})
	}
	if f&flagRO != 0 {
		panic("reflect: " + methodName() + " using value obtained using unexported field")
	}
}

// mustBeAssignable asserts that the value is assignable: it panics if the value is unexported or not addressable.
func (f flag) mustBeAssignable() {
	if f == 0 {
		panic(&ValueError{methodName(), Invalid})
	}
	// Assignable if addressable and not read-only.
	if f&flagRO != 0 {
		panic("reflect: " + methodName() + " using value obtained using unexported field")
	}
	if f&flagAddr == 0 {
		panic("reflect: " + methodName() + " using unaddressable value")
	}
}

With flag covered, let's move on to the ifaceIndir function, which lives in type.go.

/*--type.go--*/
// ifaceIndir reports whether t is stored indirectly in an interface value.
func ifaceIndir(t *rtype) bool {
	return t.kind&kindDirectIface == 0
}

kindDirectIface is defined as follows:

/*--type.go--*/
const kindDirectIface = 1 << 5

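As it happens this is also bit 6, but the coincidence ends there: the bit lives in rtype.kind, not in flag, and is unrelated to flagStickyRO. It records whether a value of the type is stored directly in the interface's word, as pointer-shaped values are. If bit 6 of t.kind is 0, the value is not stored directly, so ifaceIndir returns true. Back in unpackEface, bit 8 of the variable f is then set to 1, and f becomes the Value's flag field. In other words: when word is a pointer to the data rather than the data itself, the flagIndir bit is set to 1 to record that fact.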

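At the end, unpackEface builds and returns a Value struct: t is the type-information struct, e.word is the pointer to the data (remember the word field in the interface types? This is it), and f carries the additional flags.

That completes the process of obtaining a Value.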

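So here is the question. Whether we fetch a Value or a Type, all we did was a handful of conversions, and the type information was simply there. How do the fields of the rtype struct get filled in? The real story lies in what happens when a variable is assigned to an interface{} variable: during that conversion the system has already filled in all of the information. To stress it once more, Value and Type are derived from an interface value, not from the concrete type. So the real question is how a value of a concrete type is turned into an interface{}!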

Here are two experiments.

**Experiment 1:** Earlier we used reflection to look at the ptr field of Value. Now let's take a look at the flag field.

package main

import (
	"fmt"
	"reflect"
)

type T struct {
	a int
	bool
	_ byte
	B float64
}

func (t T) Hello() {
	fmt.Println("hello. I'm T.")
}

func main() {
	t := T{1, true, 'A', 3.14}
	tval := reflect.ValueOf(t)

	vval := reflect.ValueOf(tval)
	val_field1 := reflect.ValueOf(tval.Field(0))
	val_field2 := reflect.ValueOf(tval.Field(1))
	val_field3 := reflect.ValueOf(tval.Field(2))
	val_field4 := reflect.ValueOf(tval.Field(3))
	val_method := reflect.ValueOf(tval.Method(0))

	flag0 := vval.FieldByName("flag").Uint()
	flag1 := val_field1.FieldByName("flag").Uint()
	flag2 := val_field2.FieldByName("flag").Uint()
	flag3 := val_field3.FieldByName("flag").Uint()
	flag4 := val_field4.FieldByName("flag").Uint()
	flag5 := val_method.FieldByName("flag").Uint()

	fmt.Printf("[flag0-T] %b\n", flag0)
	fmt.Printf("[flag1-a] %b\n", flag1)
	fmt.Printf("[flag2- ] %b\n", flag2)
	fmt.Printf("[flag3-_] %b\n", flag3)
	fmt.Printf("[flag4-B] %b\n", flag4)
	fmt.Printf("[flag5  ] %b\n", flag5)
}

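We use reflection to get the Value of the struct, of each of its fields, and of its method, and then reflect again to read the flag field inside each Value. The results are below. For readability I aligned the output and split off the low 5 bits; the comments give the decimal value of the low 5 bits and the corresponding Kind from the constant list shown earlier.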

[flag0-T]   100  11001 //25=Struct
[flag1-a]   101  00010 //2=Int
[flag2- ]   110  00001 //1=Bool
[flag3-_]   101  01000 //8=Uint8
[flag4-B]   100  01110 //14=Float64
[flag5  ] 10100  10011 //19=Func

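The results show that fields 1 and 3 are unexported, non-embedded fields, so bit 6 is 1. Field 2 is an unexported embedded field, so bit 7 is 1. flag5 belongs to a method value, so bit 10 is 1; a method value is not addressable, so bit 9 is 0. Because the ptr field of each Value holds the address of the data, bit 8 is 1 in every case.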

One more thing to note: unlike fields, only exported methods are visible through reflection!
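
A quick check of that rule with the public API; worker and methodNames are names made up for this sketch.

```go
package main

import (
	"fmt"
	"reflect"
)

type worker struct{}

func (worker) Run()   {} // exported: visible to reflection
func (worker) reset() {} // unexported: not listed

// methodNames lists the methods reflection can see on v's type.
func methodNames(v interface{}) []string {
	t := reflect.TypeOf(v)
	names := make([]string, 0, t.NumMethod())
	for i := 0; i < t.NumMethod(); i++ {
		names = append(names, t.Method(i).Name)
	}
	return names
}

func main() {
	fmt.Println(methodNames(worker{})) // [Run]
}
```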

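**Experiment 2:** Imitate the emptyInterface struct and see what we get when an interface value is reinterpreted as emptyInterface.

First we need to copy the necessary definitions out of the source, mainly those related to rtype. These are internal definitions, so the layout below is tied to the Go version it was copied from.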

package main

import (
	"fmt"
	"unsafe"
)

type e struct {
	p    *rtype
	word unsafe.Pointer
}
/*=============copied from the Go source=============*/
type tflag uint8
type typeAlg struct {
	hash  func(unsafe.Pointer, uintptr) uintptr
	equal func(unsafe.Pointer, unsafe.Pointer) bool
}
type nameOff int32 // offset to a name
type typeOff int32 // offset to an *rtype
type rtype struct {
	size       uintptr
	ptrdata    uintptr  // number of bytes in the type that can contain pointers
	hash       uint32   // hash of type; avoids computation in hash tables
	tflag      tflag    // extra type information flags
	align      uint8    // alignment of variable with this type
	fieldAlign uint8    // alignment of struct field with this type
	kind       uint8    // enumeration for C
	alg        *typeAlg // algorithm table
	gcdata     *byte    // garbage collection data
	str        nameOff  // string form
	ptrToThis  typeOff  // type for pointer to this type, may be zero
}
/*==================================================*/
func main() {
	love := []int{573, 295}
	var i interface{} = love
	w := *(*e)(unsafe.Pointer(&i))
	fmt.Println("[w]", w)
	fmt.Printf("[p] %+v\n", w.p)
	fmt.Println("[word]", *(*[]int)(w.word))
}

The result is below. After the conversion we recover the slice data from the word field. Pretty magical, isn't it?

[w] {0x4a6e40 0xc0420483a0}
[p] &{size:24 ptrdata:8 hash:469329550 tflag:2 align:8 fieldAlign:8 kind:23 alg:0x53da00 gcdata:0x4d9bb4 str:4801 ptrToThis:43808}
[word] [573 295]

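The last topic is ordering. Once a struct has been reflected, both its fields and its methods can be accessed by index. The ordering rules are:

Fields are in declaration order; methods are sorted by name in ascending lexicographic order.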