Go reflection: a gccgo perspective
intro
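Most "modern" languages support both automatic memory reclamation (garbage collection) and reflection. Go, as a relatively new language, is no exception.
Lua and Python are dynamic languages: type information is stored in the object itself, which is what makes it possible to add fields or call functions dynamically.
Go, a descendant of C in the broad sense, is a statically typed language. Judging by how C++ implements polymorphism and how dynamic languages are built, one would intuitively expect Go reflection to work by storing a hidden pointer (invisible to the user) in every object, pointing to that object's reflection information.
Since this is only a discussion of how Go is implemented, and the real goal is to understand the language's observable features and capabilities from that implementation, the Go frontend in gcc (gccgo) is used here rather than the official gc compiler.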
Test
Test case
tsecer@harry: cat gcc_go_type_assertion.go
package assertion

type IFACE interface {
	Get()
	Set(x int)
}

type IMP struct {
	x int
	y int
}

func (imp IMP) Get() {
}

func (imp IMP) Set(x int) {
}

func Main() IMP {
	var imp IMP
	var iface IFACE = imp
	iface.Set(imp.x + imp.y)
	aif := iface.(IMP)
	return aif
}
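Types
gdb shows that the IMP type contains only the fields declared in the struct, nothing more. The interface type holds an object pointer (__object) and a pointer to a method table (__methods); that table carries the function pointers plus __type_descriptor, the type description used by reflection.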
(gdb) ptype IMP
type = struct {
int x;
int y;
}
(gdb) ptype IFACE
type = struct {
struct {
_type *__type_descriptor;
void (*Get)(void *);
void (*Set)(void *, int);
} *__methods;
void *__object;
}
(gdb)
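The gdb output can be cross-checked from Go itself. The following is a minimal sketch that reuses the IMP and IFACE types from the test case; the helper names structHasNoHiddenFields and interfaceIsTwoWords are made up for illustration. It assumes the two-word interface layout shown above.

```go
package main

import (
	"fmt"
	"unsafe"
)

type IFACE interface {
	Get()
	Set(x int)
}

type IMP struct {
	x int
	y int
}

func (imp IMP) Get()      {}
func (imp IMP) Set(x int) {}

// structHasNoHiddenFields reports whether IMP occupies exactly the space
// of its two declared int fields, i.e. no hidden pointer is stored in it.
func structHasNoHiddenFields() bool {
	var imp IMP
	return unsafe.Sizeof(imp) == unsafe.Sizeof(imp.x)+unsafe.Sizeof(imp.y)
}

// interfaceIsTwoWords reports whether an IFACE value is two pointers wide
// (the __methods pointer and the __object pointer in the gdb output).
func interfaceIsTwoWords() bool {
	var iface IFACE
	return unsafe.Sizeof(iface) == 2*unsafe.Sizeof(uintptr(0))
}

func main() {
	fmt.Println(structHasNoHiddenFields(), interfaceIsTwoWords())
}
```

Both checks rely only on unsafe.Sizeof, so they say nothing about what the two interface words point to; that part is visible only in the gdb and GIMPLE dumps.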
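Semantics
Type assertion
A type assertion checks whether the type information stored in the interface value (__type_descriptor) equals the address of the type descriptor for the struct type named in the code (go.assertion.IMP..d); each struct type has exactly one descriptor. In a sense, the interface resembles the hidden pointer to a virtual function table that a C++ object with virtual functions carries, except that the table has an extra __type_descriptor field to support reflection.
_11 = iface.__methods;
_12 = _11->__type_descriptor;
if (_12 == &go.assertion.IMP..d) goto <D.514>; else goto <D.511>;
interface initialization
Correspondingly, assigning a struct to an interface simply initializes the interface's fields:
iface.__methods = &imt..interface_4Get_bfunc_8_9_8_9_cSet_bfunc_8int_9_8_9_5..go_0assertion.IMP;
...
iface.__object = GOTMP.4_3;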
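At the source level, the descriptor comparison shows up as a type assertion that either succeeds or panics. The sketch below is only an illustration: OTHER is a hypothetical second implementation of IFACE and assertIMP is a made-up helper that turns the panic into a boolean.

```go
package main

import "fmt"

type IFACE interface {
	Get()
	Set(x int)
}

type IMP struct {
	x int
	y int
}

func (imp IMP) Get()      {}
func (imp IMP) Set(x int) {}

// OTHER is a hypothetical second type that also implements IFACE.
type OTHER struct{}

func (OTHER) Get()      {}
func (OTHER) Set(x int) {}

// assertIMP performs the single-result assertion iface.(IMP), which panics
// when the dynamic type is not IMP. The deferred recover reports that panic
// through the panicked result instead of crashing the program.
func assertIMP(iface IFACE) (imp IMP, panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	imp = iface.(IMP)
	return imp, false
}

func main() {
	_, p1 := assertIMP(IMP{x: 1, y: 2}) // dynamic type matches
	_, p2 := assertIMP(OTHER{})         // dynamic type differs
	_, p3 := assertIMP(nil)             // nil interface: no method table at all
	fmt.Println(p1, p2, p3)

	// The two-result form performs the same check but never panics.
	_, ok := IFACE(OTHER{}).(IMP)
	fmt.Println(ok)
}
```

The nil case corresponds to the extra null check on iface.__methods that precedes the descriptor comparison in the full GIMPLE.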
Complete GIMPLE for the Main function
struct IMP go.assertion.Main ()
{
struct * iftmp.5;
struct IMP D.519;
struct IMP $ret0;
try
{
$ret0 = {};
{
struct IMP imp;
struct IFACE iface;
struct IMP aif;
try
{
imp = {};
iface.__methods = &imt..interface_4Get_bfunc_8_9_8_9_cSet_bfunc_8int_9_8_9_5..go_0assertion.IMP;
_1 = runtime.newobject (&go.assertion.IMP..d);
GOTMP.0 = _1;
GOTMP.3_2 = GOTMP.0;
*GOTMP.3_2 = imp;
GOTMP.4_3 = GOTMP.0;
iface.__object = GOTMP.4_3;
_4 = iface.__methods;
_5 = _4->Set;
_6 = imp.x;
_7 = imp.y;
_8 = _6 + _7;
_9 = iface.__object;
_5 (_9, _8);
_10 = iface.__methods;
if (_10 != 0B) goto <D.513>; else goto <D.511>;
<D.513>:
_11 = iface.__methods;
_12 = _11->__type_descriptor;
if (_12 == &go.assertion.IMP..d) goto <D.514>; else goto <D.511>;
<D.514>:
goto <D.512>;
<D.511>:
_13 = iface.__methods;
if (_13 == 0B) goto <D.516>; else goto <D.517>;
<D.516>:
iftmp.5 = 0B;
goto <D.518>;
<D.517>:
_14 = iface.__methods;
iftmp.5 = _14->__type_descriptor;
<D.518>:
runtime.panicdottype (&go.assertion.IMP..d, iftmp.5, &go.assertion.IFACE..d);
<D.512>:
_15 = iface.__object;
aif = MEM[(struct IMP *)_15];
{
$ret0 = aif;
D.519 = $ret0;
return D.519;
}
}
finally
{
imp = {CLOBBER(eos)};
iface = {CLOBBER(eos)};
aif = {CLOBBER(eos)};
}
}
}
finally
{
$ret0 = {CLOBBER(eos)};
}
}
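To make the dump easier to follow, here is a hand-written model of the same steps in Go. It is a sketch under stated assumptions, not what gccgo emits: typeDescriptor, methodTable, ifaceModel, boxIMP and assertIMP are all hypothetical names, and the failure branch returns false where the real code calls runtime.panicdottype.

```go
package main

import (
	"fmt"
	"unsafe"
)

type IMP struct {
	x int
	y int
}

// typeDescriptor stands in for a compiler-generated descriptor such as
// go.assertion.IMP..d; only its address matters here.
type typeDescriptor struct{ name string }

// methodTable models the struct behind __methods.
type methodTable struct {
	typeDescriptor *typeDescriptor
	Get            func(unsafe.Pointer)
	Set            func(unsafe.Pointer, int)
}

// ifaceModel models the interface value: method table plus object pointer.
type ifaceModel struct {
	methods *methodTable
	object  unsafe.Pointer
}

var impDescriptor = typeDescriptor{name: "IMP"}
var otherDescriptor = typeDescriptor{name: "OTHER"}

var impMethods = methodTable{
	typeDescriptor: &impDescriptor,
	Get:            func(unsafe.Pointer) {},
	Set:            func(unsafe.Pointer, int) {},
}

// boxIMP models interface initialization: copy the struct to a fresh
// allocation, then fill in the two interface fields.
func boxIMP(imp IMP) ifaceModel {
	obj := new(IMP)
	*obj = imp
	return ifaceModel{methods: &impMethods, object: unsafe.Pointer(obj)}
}

// assertIMP models the type assertion: a nil check on the method table
// followed by a pointer comparison of type descriptors.
func assertIMP(v ifaceModel) (IMP, bool) {
	if v.methods != nil && v.methods.typeDescriptor == &impDescriptor {
		return *(*IMP)(v.object), true
	}
	return IMP{}, false
}

func main() {
	v := boxIMP(IMP{x: 3, y: 4})
	v.methods.Set(v.object, 7) // indirect call through the table
	fmt.Println(assertIMP(v))
	fmt.Println(assertIMP(ifaceModel{}))
	fmt.Println(assertIMP(ifaceModel{methods: &methodTable{typeDescriptor: &otherDescriptor}}))
}
```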
reflect
Because every interface value carries type information, anyone holding an interface value can obtain the type description of the object it contains.
Getting the type
///@file: gcc\libgo\go\reflect\type.go
// TypeOf returns the reflection Type that represents the dynamic type of i.
// If i is a nil interface value, TypeOf returns nil.
func TypeOf(i interface{}) Type {
eface := *(*emptyInterface)(unsafe.Pointer(&i))
return toType(eface.typ)
}
Getting the value
///@file: gcc\libgo\go\reflect\value.go
// ValueOf returns a new Value initialized to the concrete value
// stored in the interface i. ValueOf(nil) returns the zero Value.
func ValueOf(i interface{}) Value {
if i == nil {
return Value{}
}
// TODO: Maybe allow contents of a Value to live on the stack.
// For now we make the contents always escape to the heap. It
// makes life easier in a few places (see chanrecv/mapassign
// comment below).
escapes(i)
return unpackEface(i)
}
// unpackEface converts the empty interface i to a Value.
func unpackEface(i interface{}) Value {
e := (*emptyInterface)(unsafe.Pointer(&i))
// NOTE: don't read e.word until we know whether it is really a pointer or not.
t := e.typ
if t == nil {
return Value{}
}
f := flag(t.Kind())
if ifaceIndir(t) {
f |= flagIndir
}
return Value{t, e.word, f}
}
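From the caller's side, both functions are used through the public reflect API. A small usage sketch follows; describe is a made-up helper, and the IMP type here uses exported fields purely for the example.

```go
package main

import (
	"fmt"
	"reflect"
)

type IMP struct {
	X int
	Y int
}

// describe converts its argument to interface{} (that conversion is what
// attaches the type information) and reads the type and value back out.
func describe(i interface{}) string {
	t := reflect.TypeOf(i)
	if t == nil {
		// A nil interface carries no type pointer at all.
		return "nil interface"
	}
	v := reflect.ValueOf(i)
	return fmt.Sprintf("%s (%s) = %v", t.Name(), t.Kind(), v.Interface())
}

func main() {
	fmt.Println(describe(IMP{X: 1, Y: 2}))
	fmt.Println(describe(42))
	fmt.Println(describe(nil))
	fmt.Println(reflect.ValueOf(nil).IsValid())
}
```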
The interface headers as expressed in Go. An empty interface (any) has no method table, so its first field is the type pointer itself.
///@file:gcc\libgo\go\reflect\value.go
// emptyInterface is the header for an interface{} value.
type emptyInterface struct {
typ *rtype
word unsafe.Pointer
}
// nonEmptyInterface is the header for a interface value with methods.
type nonEmptyInterface struct {
// see ../runtime/iface.go:/Itab
itab *struct {
typ *rtype // dynamic concrete type
fun [100000]unsafe.Pointer // method table
}
word unsafe.Pointer
}
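The same trick can be tried outside the reflect package. The sketch below assumes the two-pointer empty-interface layout shown above; efaceHeader and typeWord are hypothetical names, and the code depends on an implementation detail, so it is for exploration only.

```go
package main

import (
	"fmt"
	"unsafe"
)

// efaceHeader is a hypothetical local mirror of reflect's emptyInterface:
// one type pointer followed by one data pointer.
type efaceHeader struct {
	typ  unsafe.Pointer
	word unsafe.Pointer
}

// typeWord returns the type pointer stored in the interface header of i.
func typeWord(i interface{}) unsafe.Pointer {
	return (*efaceHeader)(unsafe.Pointer(&i)).typ
}

func main() {
	// Two values of the same type are expected to share one descriptor.
	fmt.Println(typeWord(1) == typeWord(2))
	// Values of different types are expected to have different descriptors.
	fmt.Println(typeWord(1) == typeWord("one"))
	// A nil interface has no type pointer.
	fmt.Println(typeWord(nil) == nil)
}
```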
struct Object in compiler and runtime
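As shown, the Go runtime is itself written in Go, while the reflection data is produced by the compiler, whose frontend is written in C++. How does the Go runtime interpret the struct objects the compiler generates? A Go struct is a POD type, and so is a C struct, so once both sides agree on the offset and size of every field, the runtime can simply access the data directly.
Comment in the source code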
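The idea of two independently declared structs agreeing on a layout can be sketched in Go alone. In this illustration, producerHeader and consumerHeader are hypothetical types: the first plays the role of the layout the compiler emits, the second the role of the runtime's view with explicit padding. Only a few leading fields are modeled.

```go
package main

import (
	"fmt"
	"unsafe"
)

// producerHeader plays the role of the layout written by the compiler.
type producerHeader struct {
	kind       uint8
	align      uint8
	fieldAlign uint8
	size       uintptr
	hash       uint32
}

// consumerHeader plays the role of the runtime's view of the same bytes.
// The padding byte is spelled out; alignment would insert it anyway.
type consumerHeader struct {
	kind       uint8
	align      int8
	fieldAlign uint8
	_          uint8
	size       uintptr
	hash       uint32
}

// sameLayout checks the contract both sides must honor: equal total size
// and equal offsets for the fields that follow the padding.
func sameLayout() bool {
	var p producerHeader
	var c consumerHeader
	return unsafe.Sizeof(p) == unsafe.Sizeof(c) &&
		unsafe.Offsetof(p.size) == unsafe.Offsetof(c.size) &&
		unsafe.Offsetof(p.hash) == unsafe.Offsetof(c.hash)
}

// readAsConsumer reinterprets the producer's memory through the consumer's
// struct definition, with no conversion or parsing step.
func readAsConsumer(p *producerHeader) (kind uint8, size uintptr, hash uint32) {
	c := (*consumerHeader)(unsafe.Pointer(p))
	return c.kind, c.size, c.hash
}

func main() {
	p := producerHeader{kind: 25, align: 8, fieldAlign: 8, size: 16, hash: 0xabcd}
	fmt.Println(sameLayout())
	fmt.Println(readAsConsumer(&p))
}
```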
// Return the type of a type descriptor. We should really tie this to
// runtime.Type rather than copying it. This must match the struct "_type"
// declared in libgo/go/runtime/type.go.
Type*
Type::make_type_descriptor_type()
{
///...
// The type descriptor type.
Struct_type* type_descriptor_type =
Type::make_builtin_struct_type(11,
"kind", uint8_type,
"align", uint8_type,
"fieldAlign", uint8_type,
"size", uintptr_type,
"hash", uint32_type,
"hashfn", hash_fntype,
"equalfn", equal_fntype,
"gc", uintptr_type,
"string", pointer_string_type,
"", pointer_uncommon_type,
"ptrToThis",
pointer_type_descriptor_type);
Named_type* named = Type::make_builtin_named_type("_type",
type_descriptor_type);
named_type_descriptor_type->set_type_value(named);
ret = named;
The corresponding structure on the Go side
///@file: gcc\libgo\go\reflect\type.go
// rtype is the common implementation of most values.
// It is embedded in other, public struct types, but always
// with a unique tag like `reflect:"array"` or `reflect:"ptr"`
// so that code cannot convert from, say, *arrayType to *ptrType.
type rtype struct {
kind uint8 // enumeration for C
align int8 // alignment of variable with this type
fieldAlign uint8 // alignment of struct field with this type
_ uint8 // unused/padding
size uintptr
hash uint32 // hash of type; avoids computation in hash tables
hashfn func(unsafe.Pointer, uintptr) uintptr // hash function
equalfn func(unsafe.Pointer, unsafe.Pointer) bool // equality function
gc unsafe.Pointer // garbage collection data
string *string // string form; unnecessary but undeniably useful
*uncommonType // (relatively) uncommon fields
ptrToThis *rtype // type for pointer to this type, if used in binary or has methods
}
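Several of these fields are reachable through the public reflect.Type interface. The following usage sketch assumes nothing beyond the standard reflect API; summarize and impSizeMatches are made-up helpers.

```go
package main

import (
	"fmt"
	"reflect"
	"unsafe"
)

type IMP struct {
	x int
	y int
}

// summarize reads back, through public accessors, data that corresponds to
// the kind, size, align and string fields of the descriptor.
// It must not be called with a nil interface, since TypeOf(nil) is nil.
func summarize(i interface{}) (kind reflect.Kind, size uintptr, align int, name string) {
	t := reflect.TypeOf(i)
	return t.Kind(), t.Size(), t.Align(), t.String()
}

// impSizeMatches compares the size recorded in the descriptor with the
// size the compiler reports for the struct itself.
func impSizeMatches() bool {
	_, size, _, _ := summarize(IMP{})
	return size == unsafe.Sizeof(IMP{})
}

func main() {
	fmt.Println(summarize(IMP{}))
	fmt.Println(summarize(0))
	fmt.Println(impSizeMatches())
}
```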
Receiver: pointer or value
Function parsing
// Start compiling a function.
Named_object*
Gogo::start_function(const std::string& name, Function_type* type,
bool add_method_to_type, Location location)
{
///...
// We want to look through the pointer created by the
// parser, without getting an error if the type is not yet
// defined.
if (rtype->classification() == Type::TYPE_POINTER)
rtype = rtype->points_to();
while (rtype->named_type() != NULL
&& rtype->named_type()->is_alias())
rtype = rtype->named_type()->real_type()->forwarded();
if (rtype->is_error_type())
ret = Named_object::make_function(name, NULL, function);
else if (rtype->named_type() != NULL)
{
if (rtype->named_type()->named_object()->package() != NULL)
{
go_error_at(type->receiver()->location(),
"may not define methods on non-local type");
ret = Named_object::make_function(name, NULL, function);
}
else
{
ret = rtype->named_type()->add_method(name, function);
if (!ret->is_function())
{
// Redefinition error.
ret = Named_object::make_function(name, NULL, function);
}
}
///...
}
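The function is added to the type's local_methods_ bindings (a package cannot contain two functions with the same name, but methods on different receivers may share a name). The name used for the binding is the method's plain name, with no distinction between pointer and value receivers. This also means that a type can have only one method of a given name: its receiver is either a pointer or a value, never both.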
There are two reasons to use a pointer receiver.
The first is so that the method can modify the value that its receiver points to.
The second is to avoid copying the value on each method call. This can be more efficient if the receiver is a large struct, for example.
In this example, both Scale and Abs are methods with receiver type *Vertex, even though the Abs method needn't modify its receiver.
In general, all methods on a given type should have either value or pointer receivers, but not a mixture of both. (We'll see why over the next few pages.)
// Add a method to this type.
Named_object*
Named_type::add_method(const std::string& name, Function* function)
{
go_assert(!this->is_alias_);
if (this->local_methods_ == NULL)
this->local_methods_ = new Bindings(NULL);
return this->local_methods_->add_function(name, NULL, function);
}
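The consequence for user code can be sketched as follows. The Vertex type echoes the quoted passage; ScaleCopy and demo are hypothetical names added for contrast, and the commented-out declaration is the one the compiler would reject as a duplicate method name.

```go
package main

import "fmt"

type Vertex struct {
	X float64
	Y float64
}

// Scale has a pointer receiver, so it modifies the caller's Vertex.
func (v *Vertex) Scale(f float64) {
	v.X *= f
	v.Y *= f
}

// ScaleCopy has a value receiver, so it only modifies a private copy.
func (v Vertex) ScaleCopy(f float64) {
	v.X *= f
	v.Y *= f
}

// A second Scale with a value receiver cannot coexist with the one above,
// because methods are bound by name alone:
//
//	func (v Vertex) Scale(f float64) {}

// demo applies each method to its own Vertex and returns both results.
func demo() (viaPointer Vertex, viaValue Vertex) {
	viaPointer = Vertex{X: 3, Y: 4}
	viaPointer.Scale(10) // shorthand for (&viaPointer).Scale(10)

	viaValue = Vertex{X: 3, Y: 4}
	viaValue.ScaleCopy(10) // the copy is scaled, viaValue is not
	return viaPointer, viaValue
}

func main() {
	fmt.Println(demo())
}
```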
outro
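A struct object has no extra field for storing reflection information; its memory layout is that of a POD (Plain Old Data) type.
For reflection information to be usable at run time, it has to be carried in an interface value.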