Lua 5.3 GC: analysis of the __gc metamethod (6)

Definition

If we give an ordinary table t a metatable that carries a __gc metamethod, that metamethod is called when t is collected. For example:

local mt =  {__gc = function ()
	print("call __gc ...")
end}

local t1 = setmetatable({}, mt)

t1 = nil

collectgarbage("collect")
print("end ...")
--[[
Output:
call __gc ...
end ...
]]

Implementation analysis

Let's look at what the setmetatable() API does internally:

LUA_API int lua_setmetatable (lua_State *L, int objindex) {
  TValue *obj;
  Table *mt;
  lua_lock(L);
  api_checknelems(L, 1);
  obj = index2addr(L, objindex);
  if (ttisnil(L->top - 1))
    mt = NULL;
  else {
    api_check(L, ttistable(L->top - 1), "table expected");
    mt = hvalue(L->top - 1);
  }
  switch (ttnov(obj)) {
    case LUA_TTABLE: {
      hvalue(obj)->metatable = mt;
      if (mt) {
        luaC_objbarrier(L, gcvalue(obj), mt);
        luaC_checkfinalizer(L, gcvalue(obj), mt);
      }
      break;
    }
    ...
}
    
/*
** if object 'o' has a finalizer, remove it from 'allgc' list (must
** search the list to find it) and link it in 'finobj' list.
*/
void luaC_checkfinalizer (lua_State *L, GCObject *o, Table *mt) {
  global_State *g = G(L);
  if (tofinalize(o) ||                 /* obj. is already marked... */
      gfasttm(g, mt, TM_GC) == NULL)   /* or has no finalizer? */
    return;  /* nothing to be done */
  else {  /* move 'o' to 'finobj' list */
    GCObject **p;
    if (issweepphase(g)) {
      makewhite(g, o);  /* "sweep" object 'o' */
      if (g->sweepgc == &o->next)  /* should not remove 'sweepgc' object */
        g->sweepgc = sweeptolive(L, g->sweepgc);  /* change 'sweepgc' */
    }
    /* search for pointer pointing to 'o' */
    for (p = &g->allgc; *p != o; p = &(*p)->next) { /* empty */ }
    *p = o->next;  /* remove 'o' from 'allgc' list */
    o->next = g->finobj;  /* link it in 'finobj' list */
    g->finobj = o;
    l_setbit(o->marked, FINALIZEDBIT);  /* mark it as such */
  }
}

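Setting a metatable on a table calls luaC_checkfinalizer(). If the metatable has a __gc field, this function unlinks the table from the g->allgc list, links it into the g->finobj list, and sets FINALIZEDBIT on it. If the object is already marked this way, or the metatable has no __gc field, the function returns immediately.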
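The list surgery in luaC_checkfinalizer() can be tried outside Lua. The following is a toy model under stated assumptions, not Lua source: `Node`, `move_to_finobj`, and the `finalized` field are hypothetical names standing in for GCObject, the body of luaC_checkfinalizer(), and FINALIZEDBIT, and the sweep-phase handling is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for GCObject: only the fields this sketch needs. */
typedef struct Node {
  struct Node *next;
  int finalized;  /* plays the role of FINALIZEDBIT */
} Node;

/* Unlink 'o' from '*allgc' and push it on the head of '*finobj'.
   Returns 0 if 'o' was already flagged (nothing to do), 1 if it was moved.
   Precondition: an unflagged 'o' is somewhere on the 'allgc' list. */
static int move_to_finobj(Node **allgc, Node **finobj, Node *o) {
  Node **p;
  if (o->finalized)
    return 0;
  /* search for the pointer that points to 'o' */
  for (p = allgc; *p != o; p = &(*p)->next) { /* empty */ }
  *p = o->next;       /* remove 'o' from 'allgc' */
  o->next = *finobj;  /* link it in 'finobj' */
  *finobj = o;
  o->finalized = 1;   /* remember that it is registered */
  return 1;
}
```

Searching with a pointer to a pointer, as the original does, means the head of the list and the interior nodes are unlinked by the same assignment.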

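This also tells us that if __gc is set when many objects already sit ahead of this one in g->allgc, the for loop that searches g->allgc for the object can be expensive. New objects are linked at the head of g->allgc, so the search is cheapest when the metatable is set right after the object is created. Where possible, set the metatable with __gc as early as you can; this depends on your requirements and is not an absolute rule.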
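To make the cost concrete, here is a small sketch that counts how many links the search has to follow. It assumes, as in Lua 5.3, that new objects are linked at the head of the list; `Node`, `link_new`, and `search_steps` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Node { struct Node *next; } Node;

/* Link a freshly created object at the head of the list. */
static void link_new(Node **allgc, Node *o) {
  o->next = *allgc;
  *allgc = o;
}

/* Number of 'next' links followed before the search reaches 'o'.
   Precondition: 'o' is on the list. */
static size_t search_steps(Node *const *allgc, const Node *o) {
  size_t steps = 0;
  Node *const *p;
  for (p = allgc; *p != o; p = &(*p)->next)
    steps++;
  return steps;
}
```

With four objects created in order, the newest one is found in 0 steps and the oldest in 3, so the cost grows with the number of objects created after the one being registered.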

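One point needs attention. If the collector is currently in a sweep phase (sweeping is incremental and can be split across steps), the object may still carry the old white and look dead to the sweeper. luaC_checkfinalizer() therefore first makes the object current white. In addition, if g->sweepgc, the pointer that records where the sweep will resume, currently points at this object's next field, the object cannot simply be unlinked; sweeptolive() is called first to advance g->sweepgc, and only then is the object removed from g->allgc.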

local mt = nil

mt = {__gc = function (t)
	print("__gc:", t, mt)
	setmetatable(t, mt)
end}

local t1 = setmetatable({}, mt)

print("create t1", t1)
t1 = nil

collectgarbage()
print("end ...")

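In the example above, the __gc function calls setmetatable(t, mt) on the table again, so luaC_checkfinalizer() runs a second time for the same object in the middle of a collection cycle. By then udata2finalize() has cleared FINALIZEDBIT and put the table back on g->allgc, so the table is moved to g->finobj once more. The issweepphase(g) branch is taken only when the call happens while g->gcstate is between GCSswpallgc and GCSswpend; GCScallfin itself lies outside that range.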

GCSatomic: the atomic phase

Next, let's see what the atomic phase does.


static l_mem atomic (lua_State *L) {
  global_State *g = G(L);
  ...
  separatetobefnz(g, 0);  /* separate objects to be finalized */
  g->gcfinnum = 1;  /* there may be objects to be finalized */
  markbeingfnz(g);  /* mark objects that will be finalized */
  ...
  return work;  /* estimate of memory marked by 'atomic' */
}

/*
** find last 'next' field in list 'p' list (to add elements in its end)
*/
static GCObject **findlast (GCObject **p) {
  while (*p != NULL)
    p = &(*p)->next;
  return p;
}


/*
** move all unreachable objects (or 'all' objects) that need
** finalization from list 'finobj' to list 'tobefnz' (to be finalized)
*/
static void separatetobefnz (global_State *g, int all) {
  GCObject *curr;
  GCObject **p = &g->finobj;
  GCObject **lastnext = findlast(&g->tobefnz);
  while ((curr = *p) != NULL) {  /* traverse all finalizable objects */
    lua_assert(tofinalize(curr));
    if (!(iswhite(curr) || all))  /* not being collected? */
      p = &curr->next;  /* don't bother with it */
    else {
      *p = curr->next;  /* remove 'curr' from 'finobj' list */
      curr->next = *lastnext;  /* link at the end of 'tobefnz' list */
      *lastnext = curr;
      lastnext = &curr->next;
    }
  }
}


/*
** mark all objects in list of being-finalized
*/
static void markbeingfnz (global_State *g) {
  GCObject *o;
  for (o = g->tobefnz; o != NULL; o = o->next)
    markobject(g, o);
}

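When atomic() calls separatetobefnz(), it walks the g->finobj list. A node that is still white is no longer referenced, so it is moved to the end of the g->tobefnz list, which means its __gc function must be called. Any other node is skipped and the walk moves on to the next object.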
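The separation step can be reproduced on a toy list to see that it preserves order. This is a sketch modelled on the listing above, not the Lua source itself: `Node`, `find_last`, `separate`, and the `white` field are hypothetical stand-ins for GCObject, findlast(), separatetobefnz(), and the iswhite() test.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Node {
  struct Node *next;
  int white;  /* 1 = not reached by marking */
} Node;

/* Find the last 'next' field of a list, so nodes can be appended. */
static Node **find_last(Node **p) {
  while (*p != NULL)
    p = &(*p)->next;
  return p;
}

/* Move every white node (or every node, if 'all' is nonzero)
   from '*finobj' to the end of '*tobefnz'. */
static void separate(Node **finobj, Node **tobefnz, int all) {
  Node *curr;
  Node **p = finobj;
  Node **lastnext = find_last(tobefnz);
  while ((curr = *p) != NULL) {
    if (!(curr->white || all))
      p = &curr->next;         /* still referenced: leave it in 'finobj' */
    else {
      *p = curr->next;         /* remove 'curr' from 'finobj' */
      curr->next = *lastnext;  /* append at the end of 'tobefnz' */
      *lastnext = curr;
      lastnext = &curr->next;
    }
  }
}
```

Starting from finobj = a (white), b (not white), c (white), the call leaves b on finobj and produces tobefnz = a, c, in the order the nodes appeared on finobj.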

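The atomic phase then calls markbeingfnz(), which marks every object on the g->tobefnz list. These objects must not be freed before their __gc function has run, and any collectable objects they reference must not be freed in this cycle either, so the marking has to be propagated through everything they reach.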
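Why the referenced objects survive as well can be shown with a very reduced model. The sketch assumes each object references at most one other object through a hypothetical `ref` field; the real collector traverses arbitrary object graphs through its gray lists, which this does not attempt to model.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Node {
  struct Node *next;  /* link in the 'tobefnz' list */
  struct Node *ref;   /* the single object this one references, or NULL */
  int marked;
} Node;

/* Mark 'o' and everything reachable from it through 'ref'. */
static void mark(Node *o) {
  while (o != NULL && !o->marked) {
    o->marked = 1;
    o = o->ref;
  }
}

/* Mark all objects waiting for finalization. */
static void mark_being_finalized(Node *tobefnz) {
  Node *o;
  for (o = tobefnz; o != NULL; o = o->next)
    mark(o);
}
```

An object on the list and the object it references both end up marked, while an unrelated unreachable object stays unmarked and can be freed by the sweep.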

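GCScallfin phase

In singlestep(), the states GCSswpfinobj and GCSswptobefnz sweep the g->finobj and g->tobefnz lists. The procedure is the same as for g->allgc: each one ends up in sweeplist(), which checks every object on the list. An object that still has the white left over from the marking phase (the old white) is dead, so it is freed and its memory reclaimed; any other object has its color reset to the current white.

The interesting part is the GCScallfin phase, which is handled by runafewfinalizers():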

/*
** call a few (up to 'g->gcfinnum') finalizers
*/
static int runafewfinalizers (lua_State *L) {
  global_State *g = G(L);
  unsigned int i;
  lua_assert(!g->tobefnz || g->gcfinnum > 0);
  for (i = 0; g->tobefnz && i < g->gcfinnum; i++)
    GCTM(L, 1);  /* call one finalizer */
  g->gcfinnum = (!g->tobefnz) ? 0  /* nothing more to finalize? */
                    : g->gcfinnum * 2;  /* else call a few more next time */
  return i;
}

static GCObject *udata2finalize (global_State *g) {
  GCObject *o = g->tobefnz;  /* get first element */
  lua_assert(tofinalize(o));
  g->tobefnz = o->next;  /* remove it from 'tobefnz' list */
  o->next = g->allgc;  /* return it to 'allgc' list */
  g->allgc = o;
  resetbit(o->marked, FINALIZEDBIT);  /* object is "normal" again */
  if (issweepphase(g))
    makewhite(g, o);  /* "sweep" object */
  return o;
}

static void dothecall (lua_State *L, void *ud) {
  UNUSED(ud);
  luaD_callnoyield(L, L->top - 2, 0);
}

static void GCTM (lua_State *L, int propagateerrors) {
  global_State *g = G(L);
  const TValue *tm;
  TValue v;
  setgcovalue(L, &v, udata2finalize(g));
  tm = luaT_gettmbyobj(L, &v, TM_GC);
  if (tm != NULL && ttisfunction(tm)) {  /* is there a finalizer? */
    int status;
    ...
    setobj2s(L, L->top, tm);  /* push finalizer... */
    setobj2s(L, L->top + 1, &v);  /* ... and its argument */
    L->top += 2;  /* and (next line) call the finalizer */
    L->ci->callstatus |= CIST_FIN;  /* will run a finalizer */
    status = luaD_pcall(L, dothecall, NULL, savestack(L, L->top - 2), 0);
    ...
    if (status != LUA_OK && propagateerrors) {  /* error while running __gc? */
      if (status == LUA_ERRRUN) {  /* is there an error object? */
        const char *msg = (ttisstring(L->top - 1))
                            ? svalue(L->top - 1)
                            : "no message";
        luaO_pushfstring(L, "error in __gc metamethod (%s)", msg);
        status = LUA_ERRGCMM;  /* error in __gc metamethod */
      }
      luaD_throw(L, status);  /* re-throw error */
    }
  }
}

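runafewfinalizers() takes objects that are no longer referenced and have a __gc metamethod (tables or full userdata) and calls GCTM() on them, up to g->gcfinnum objects per call. GCTM() first calls udata2finalize(), which unlinks the object from g->tobefnz and puts it back on g->allgc. It then checks whether the object still has a __gc metamethod and, if so, calls it. The metamethod is invoked through luaD_pcall(), so that if the function raises an error, control returns to GCTM(); status = LUA_ERRGCMM records that the error happened inside __gc. If the protected call fails for one of the objects, the GCScallfin phase is interrupted: the luaD_throw(L, status) call re-throws the error, and the __gc metamethods of the remaining objects run only the next time singlestep() is entered.

For example: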
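The batch size in runafewfinalizers() doubles on every step, which can be sketched as a small counting model. It assumes that no finalizer raises an error and that nothing new is added to the pending list between steps; `steps_to_drain` and `pending` are hypothetical names, and `pending--` stands in for one GCTM() call.

```c
#include <assert.h>

/* Number of GCScallfin steps needed to run 'pending' finalizers. */
static unsigned steps_to_drain(unsigned pending) {
  unsigned gcfinnum = 1;  /* atomic() resets the batch size to 1 */
  unsigned steps = 0;
  while (pending > 0) {
    unsigned i;
    for (i = 0; pending > 0 && i < gcfinnum; i++)
      pending--;  /* one finalizer call */
    gcfinnum = (pending == 0) ? 0 : gcfinnum * 2;  /* call more next time */
    steps++;
  }
  return steps;
}
```

Under these assumptions, ten pending finalizers are run in batches of 1, 2, 4, and 3, so the list is drained in four steps rather than ten.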

local function f( )
	local t1 = setmetatable({}, {__gc = function (t)
		print("aaa", t)
	end})


	local t2 = setmetatable({}, {__gc = function (t)
		print("bbb", t)
		t = 12+t.a
	end})
	
	print("----------------- gc start", t1, t2)

	t1 = nil
	t2 = nil
	
	collectgarbage()

	print("----------------- gc end")
end

local ok, err = pcall(f)
print(ok, err)

print("----- call gc again")
collectgarbage()

print("main end")
--[[
Output:
----------------- gc start      table: 00da85f0 table: 00da8668
bbb     table: 00da8668
false   error in __gc metamethod (..\aaaa.lua:9: attempt to perform arithmetic on a nil value (field 'a'))
----- call gc again
aaa     table: 00da85f0
main end
]]

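We can see that t2 raised an error during the first full GC (the collectgarbage() call inside f), so the metamethod of t1 only ran during the second full GC (the collectgarbage() call after pcall returns).

To sum up, when the collector reaches GCScallfin it takes objects from g->tobefnz and runs their __gc metamethods. udata2finalize() puts each object back on g->allgc; the object is current white at that point and waits for the next cycle before it can be freed. In other words, this cycle only runs the __gc metamethod, and the memory is released in the next cycle.

The reason for putting the object back on g->allgc is simple: the object may become referenced again inside the __gc function, so it cannot be freed in this cycle and has to be re-examined in the next one. Consider an object that is no longer referenced from the stack, the global table, or anywhere else, and is referenced only inside its __gc function, as in the following example: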
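The two-cycle lifetime summarized above can be written down as a tiny state machine. This is a deliberately simplified sketch: it assumes the object is unreachable in both cycles and that its finalizer neither resurrects it nor sets its metatable again. `Where`, `Obj`, and `run_full_cycle` are hypothetical names.

```c
#include <assert.h>

typedef enum { IN_FINOBJ, IN_ALLGC, FREED } Where;

typedef struct {
  Where where;
  int finalizer_calls;
} Obj;

/* Effect of one complete GC cycle on an unreachable object. */
static void run_full_cycle(Obj *o) {
  switch (o->where) {
    case IN_FINOBJ:
      /* atomic: finobj -> tobefnz; callfin: run __gc, back to allgc */
      o->finalizer_calls++;
      o->where = IN_ALLGC;
      break;
    case IN_ALLGC:
      /* an ordinary unreachable object: the sweep frees it */
      o->where = FREED;
      break;
    case FREED:
      break;
  }
}
```

The first cycle only runs the finalizer and the second one frees the memory; the finalizer is not called a second time because the object is no longer registered.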

local mt = nil

mt = {__gc = function (tb)
	print("__gc tb:", tb, tb.str)
	setmetatable(tb, mt)
end}

local t = setmetatable({str = "abc"}, mt)

print("---- t:", t)
t = nil

print("========= gc1 start")
collectgarbage()
print("========= gc1 end")

print("========= gc2 start")
collectgarbage()
print("========= gc2 end")

--[[
Output:
---- t: table: 00000000006d9a90
========= gc1 start
__gc tb:        table: 00000000006d9a90 abc
========= gc1 end
========= gc2 start
__gc tb:        table: 00000000006d9a90 abc
========= gc2 end
__gc tb:        table: 00000000006d9a90 abc
]]

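In this example the table t references the string "abc" through its str field, so markbeingfnz() in the atomic phase marks the string and keeps it from being collected. In the GCScallfin phase, GCTM() puts the table back on g->allgc and calls its __gc metamethod. Inside that function, setmetatable(tb, mt) is called again, which takes us back to the setmetatable() implementation described at the beginning: the table is registered for finalization again, which is why __gc runs once in each of the two collections and a final time when the program exits.

There is another case. Suppose that some time after t has been given the metatable mt, t is already on g->finobj or g->tobefnz, and mt.__gc is then set to nil. The object is not proactively unlinked from g->finobj or g->tobefnz. Only GCTM(), in the GCScallfin phase, puts t back on g->allgc, where it waits for the next cycle to be freed. In other words, once an object with a __gc metamethod has been registered for finalization, clearing the metamethod or the metatable does not make the current cycle free it; that happens only in a later cycle.

Interestingly, whether the __gc metamethod can be called depends on whether the metatable had a __gc field at the moment setmetatable() was called. For example: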

local mt = {}

local t = setmetatable({str = "abc"}, mt)

mt.__gc = function (tb)
	print("__gc tb:", tb, tb.str)
end

print("---- t:", t)
t = nil

print("========= gc start")
collectgarbage()
print("========= gc end")
--[[
Output:
---- t: table: 00000000006d96d0
========= gc start
========= gc end
]]

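Nothing from the __gc metamethod is printed between gc start and gc end. If we move the line local t = setmetatable({str = "abc"}, mt) below the mt.__gc = ... assignment and run the program again, the output of the __gc metamethod appears.

So for an object's __gc metamethod to run, the __gc field must already be set in the metatable before setmetatable() is called.

There is also a small trick: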

local mt = {}
mt.__gc = 1

local t = setmetatable({str = "abc"}, mt)

mt.__gc = function (tb)
	print("__gc tb:", tb, tb.str)
end

print("---- t:", t)
t = nil

print("========= gc1 start")
collectgarbage()
print("========= gc1 end")
--[[
Output:
---- t: table: 00000000007598d0
========= gc1 start
__gc tb:        table: 00000000007598d0 abc
========= gc1 end
]]

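Here mt.__gc is first set to 1 (any non-nil value works), and the real function is assigned after setmetatable(). The output contains the line __gc tb: table: 00000000007598d0 abc. The conclusion stands: the __gc field must be set before setmetatable() is called, but its value can still be changed before the metamethod is actually invoked. Since the time of that invocation is not under our control, it is better to have the final function in place up front.

Additional notes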
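The two checks involved, one at setmetatable() time and one at finalization time, can be separated in a small model. It is a sketch under assumptions, not Lua's data layout: `Meta`, `Obj`, `set_metatable`, and `would_call_finalizer` are hypothetical names; `has_gc_field` models the non-nil test made by luaC_checkfinalizer(), and `gc_is_function` models the ttisfunction() test made by GCTM().

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
  int has_gc_field;    /* __gc is non-nil */
  int gc_is_function;  /* __gc currently holds a function */
} Meta;

typedef struct {
  const Meta *mt;
  int registered;      /* set once, when the metatable is installed */
} Obj;

/* Registration looks at the metatable only at this moment. */
static void set_metatable(Obj *o, const Meta *mt) {
  o->mt = mt;
  if (mt != NULL && mt->has_gc_field)
    o->registered = 1;
}

/* The finalizer runs only for registered objects, and the value of
   __gc is looked up again when the object is about to be finalized. */
static int would_call_finalizer(const Obj *o) {
  return o->registered && o->mt != NULL && o->mt->gc_is_function;
}
```

A metatable that gains __gc only after set_metatable() never gets its object registered, while a placeholder value registers the object and lets the real function be filled in later.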

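In Lua's GC, the mark and sweep phases are incremental and can be interrupted; only atomic() is uninterruptible and must run to completion before the next phase begins. So if setmetatable() installs a metatable with __gc during the sweep phase, will the object's __gc metamethod run in the GCScallfin phase of the same cycle?

No. During a normal cycle, atomic() is the only place where an object is moved to g->tobefnz, and only objects on g->tobefnz have their __gc metamethod called. A __gc set during the sweep phase therefore misses the current cycle, and the metamethod can run at the earliest in the GCScallfin phase of the next cycle. This makes no difference to the user, as long as the __gc metamethod is eventually called.

To check this, I looked through the GC-related API and found that Lua does not expose which phase the collector is currently in. My test therefore added one extra API function by patching the Lua source (a dynamic library would also work, but I did not bother), so that the current GC phase can be read from Lua code. With that, __gc can be set precisely during the sweep phase, at GCSswpend.

Another interesting observation: an object with a __gc metamethod that stays referenced throughout the current cycle remains on g->finobj and is not moved to g->tobefnz or g->allgc. (Look at the call separatetobefnz(g, 0) in the atomic phase: the second argument is 0, so only white objects are moved to g->tobefnz, and non-white objects stay on g->finobj.) Since such an object is not on g->allgc, will it still be marked, in the sweep phase of this cycle or in a new cycle? Will it be freed? Does this have any effect?

It has no effect. Within the current cycle, an object that is referenced ends up black after the mark phase whether it is on g->allgc or on g->finobj, and when sweeplist() sweeps g->finobj it sees that the object does not carry the dead (old) white and leaves it alone. If instead setmetatable() is called during the sweep phase and the object is added to g->finobj then, there is still no problem: luaC_checkfinalizer() makes an object added during the sweep phase current white, so it is not freed either.

In the mark phase of the next cycle, objects on g->finobj are treated exactly like objects on g->allgc. Whether an object is marked depends on whether it is referenced, directly or indirectly, from the roots: the main thread, the registry, and the global metatables. If it is, it passes through lists such as g->gray and g->grayagain. Only when the atomic phase finds that the object is no longer referenced is it moved to g->tobefnz, and only after its __gc metamethod has been called is it put back on g->allgc, completing its move from g->finobj to g->allgc. The path is just somewhat longer. So it makes no difference whether an object is on g->finobj or on g->allgc; it eventually returns to g->allgc and is freed in a later cycle.

From this analysis of the __gc implementation we can conclude that g->finobj and g->tobefnz never hold dead objects: an object on either list is not freed in the current cycle. Objects on g->tobefnz may be freed in the next cycle, while objects on g->finobj have to wait at least one cycle longer before they get back to g->allgc and can be freed.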

posted @ 2023-08-07 22:22 墨色山水
