Comparing the lambda/closure implementations of C++ and Lua

closure

lexical scope

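The article quoted below is short, but it clarifies some basic concepts:
the most visible effect of a closure is that it can capture variables from the environment in which it is defined.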

Closures are special functions that can capture the environment, i.e. variables within a lexical scope.

A closure is any function that closes over the environment in which it was defined. This means that it can access variables, not in its parameter list.

persist

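But lexical scope is only the surface of a closure. Using variables from the enclosing lexical scope is nothing new: Pascal supports it natively, and gcc's C extension Nested Functions can also capture variables in lexical scope. What matters more is that the captured information must be persistent: the capture produces a "time capsule" that boxes up the local variables and the function, so that the result can be passed anywhere and used there.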

A closure is a persistent local variable scope

A closure is a persistent scope which holds on to local variables even after the code execution has moved out of that block. Languages which support closure (such as JavaScript, Swift, and Ruby) will allow you to keep a reference to a scope (including its parent scopes), even after the block in which those variables were declared has finished executing, provided you keep a reference to that block or function somewhere.

The scope object and all its local variables are tied to the function and will persist as long as that function persists.

This gives us function portability. We can expect any variables that were in scope when the function was first defined to still be in scope when we later call the function, even if we call the function in a completely different context.
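The "time capsule" idea can be sketched in C++ as follows. This is only a minimal illustration, assuming C++11 or later; make_counter is a hypothetical name that mirrors the Lua newCounter example used later in this post, and it captures by value so that the closure owns its own copy of the local variable.

```cpp
#include <functional>

// Hypothetical helper: returns a closure that owns a private copy of `count`.
std::function<int()> make_counter() {
    int count = 0;  // local variable; its stack slot dies when we return
    // [count] copies the variable into the closure object, and `mutable`
    // lets the closure modify that private copy on every call.
    return [count]() mutable { return ++count; };
}
```

Each call to make_counter() produces an independent counter: the state travels with the returned closure, not with the stack frame that created it.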

closure vs lambda

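I used to have a vague feeling that a C++ lambda simply is a closure, but since C++ uses the term lambda rather than closure, there must be a reason. The same article also explains the difference between the two: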

Scott Meyers puts it beautifully — “The distinction between a lambda and the corresponding closure is precisely equivalent to the distinction between a class and an instance of the class”.

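In short: a closure is a function that has captured variables, whereas a lambda is an expression. Taking gcc as an example: for each lambda expression gcc defines an internal struct (the closure type) that contains the captured variables and the function body, and the closure is an object of that struct.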
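The class/instance analogy in the Scott Meyers quote can be checked with a small experiment. The sketch below assumes C++14 (for the deduced return type); make_adder is a hypothetical name. One lambda expression yields one closure type, while every evaluation of that expression yields a new closure object with its own captured state.

```cpp
#include <type_traits>

// A single lambda expression in the source: the compiler generates exactly
// one closure type for it, however many times the function is called.
inline auto make_adder(int x) {
    return [x](int y) { return x + y; };  // each call builds a new closure object
}
```

Two closures obtained from make_adder(1) and make_adder(2) have the same type (std::is_same on their decltype is true) but behave differently, because each object stores its own copy of x.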

The problem

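Persistence is the semantic problem that closures really have to solve; lexical scope is comparatively trivial. The key question is: once the function returns, the lifetime of the local variables that the closure refers to also ends, so how can these local variables be made persistent?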

The answer is simple and blunt: make a copy.

Lua implementation

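The stack-related fields of a Lua virtual machine (lua_State) are stack, top, and stack_last, which point to the base of the stack, the top of the stack, and the end of the usable memory respectively. The structure follows the same memory-management idea as containers such as vector: reserve a sufficiently large block up front and use it on demand, avoiding frequent allocation and deallocation.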

/*
** 'per thread' state
*/
struct lua_State {
  CommonHeader;
  unsigned short nci;  /* number of items in 'ci' list */
  lu_byte status;
  StkId top;  /* first free slot in the stack */
  global_State *l_G;
  CallInfo *ci;  /* call info for current function */
  const Instruction *oldpc;  /* last pc traced */
  StkId stack_last;  /* last free slot in the stack */
  StkId stack;  /* stack base */
  UpVal *openupval;  /* list of open upvalues in this stack */
  GCObject *gclist;
///...
};
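The role of the three pointers can be sketched in a few lines of C++. This is a simplified model rather than Lua's actual code: MiniStack and its members are hypothetical, the slots hold plain int instead of TValue, and the extra slack that Lua keeps beyond stack_last is ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical miniature of the stack pointers kept in lua_State.
struct MiniStack {
    std::vector<int> storage;  // block reserved up front, never resized here
    int* stack;                // base of the stack
    int* top;                  // first free slot
    int* stack_last;           // end of the usable space

    explicit MiniStack(std::size_t capacity)
        : storage(capacity),
          stack(storage.data()),
          top(stack),
          stack_last(stack + capacity) {}

    // The raw pointers refer to this object's storage, so forbid copying.
    MiniStack(const MiniStack&) = delete;
    MiniStack& operator=(const MiniStack&) = delete;

    void push(int v) { assert(top < stack_last); *top++ = v; }  // like L->top++
    int pop() { assert(top > stack); return *--top; }           // like L->top--
    std::size_t size() const { return static_cast<std::size_t>(top - stack); }
};
```

Pushing and popping never allocate; they only move top inside the reserved block, which is the point of the vector comparison above.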

The StkId type

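StkId is a pointer to TValue. A TValue holds a Value, which is a union, together with the type tag tt_; the pair can represent any Lua value, including Lua's two basic numeric types, integer and float.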

typedef TValue *StkId;  /* index to stack elements */

/*
** Tagged Values. This is the basic representation of values in Lua,
** an actual value plus a tag with its type.
*/

/*
** Union of all Lua values
*/
typedef union Value {
  GCObject *gc;    /* collectable objects */
  void *p;         /* light userdata */
  int b;           /* booleans */
  lua_CFunction f; /* light C functions */
  lua_Integer i;   /* integer numbers */
  lua_Number n;    /* float numbers */
} Value;


#define TValuefields	Value value_; int tt_


typedef struct lua_TValue {
  TValuefields;
} TValue;
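The "union plus tag" layout is easier to see in a reduced form. The following C++ sketch is an illustration only: MiniValue, Tag, and the helper functions are hypothetical names, and only three of Lua's value kinds are modelled.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical subset of Lua's type tags.
enum class Tag { Boolean, Integer, Float };

// Miniature tagged value: the union stores the payload, the tag says
// which union member is currently valid.
struct MiniValue {
    union {
        int b;           // boolean
        std::int64_t i;  // integer number
        double n;        // float number
    } value;
    Tag tt;
};

inline MiniValue make_int(std::int64_t i) {
    MiniValue v;
    v.value.i = i;
    v.tt = Tag::Integer;
    return v;
}

inline MiniValue make_float(double n) {
    MiniValue v;
    v.value.n = n;
    v.tt = Tag::Float;
    return v;
}

// Reads a numeric value, checking the tag before touching the union.
inline double to_number(const MiniValue& v) {
    assert(v.tt == Tag::Integer || v.tt == Tag::Float);
    return v.tt == Tag::Integer ? static_cast<double>(v.value.i) : v.value.n;
}
```

Because every slot has the same size regardless of the value it holds, a stack of such values can be a plain array, which is what the next section relies on.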

Stack operations

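When the lexer creates a string, it uses the stack simply by moving the pointer: L->top++ pushes a slot and L->top-- pops it. This means stack slots are used contiguously: the stack is an array, specifically an array of TValue.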

/*
** creates a new string and anchors it in scanner's table so that
** it will not be collected until the end of the compilation
** (by that time it should be anchored somewhere)
*/
TString *luaX_newstring (LexState *ls, const char *str, size_t l) {
  lua_State *L = ls->L;
  TValue *o;  /* entry for 'str' */
  TString *ts = luaS_newlstr(L, str, l);  /* create new string */
  setsvalue2s(L, L->top++, ts);  /* temporarily anchor it in stack */
  o = luaH_set(L, ls->h, L->top - 1);
  if (ttisnil(o)) {  /* not in use yet? */
    /* boolean value does not need GC barrier;
       table has no metatable, so it does not need to invalidate cache */
    setbvalue(o, 1);  /* t[string] = true */
    luaC_checkGC(L);
  }
  else {  /* string already present */
    ts = tsvalue(keyfromval(o));  /* re-use value previously stored */
  }
  L->top--;  /* remove string from stack */
  return ts;
}

Example

Take the example from the official Lua documentation on closures:

    function newCounter ()
      local i = 0
      return function ()   -- anonymous function
               i = i + 1
               return i
             end
    end
    
    c1 = newCounter()
    print(c1())  --> 1
    print(c1())  --> 2

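The closure uses the local variable i. After control returns from newCounter, will accesses to i read garbage, the way a C++ lambda that refers to a stack variable by reference would? (Note: here i is an integer, which fits directly in a TValue slot on the stack and needs no extra storage.)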

Official documentation

Section 5, "Functions and Closures", of the official Lua paper "The Implementation of Lua 5.0" describes this part of the implementation systematically and accurately.

When the variable goes out of scope, it migrates into a slot inside the upvalue itself (Figure 4, right). Because access is indirect through a pointer in the upvalue, this migration is transparent to any code that reads or writes the variable. Unlike its inner functions, the function that declares the variable accesses it as it accesses its own local variables: directly in the stack.

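So the object on the stack is indeed copied, but not at the moment the closure is created: while the variable is still in scope the upvalue merely points at the stack slot, and the value is copied into the upvalue itself only when the variable goes out of scope.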
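The migration described in the quoted passage can be modelled with one pointer and one private slot. The C++ sketch below is a simplified illustration, not Lua's code: MiniUpVal, close(), and is_open() are hypothetical names, and the captured variable is a plain int.

```cpp
// Hypothetical miniature of Lua's UpVal: `v` points either into the
// stack (open upvalue) or at the upvalue's own slot (closed upvalue).
struct MiniUpVal {
    int* v;      // where the captured variable currently lives
    int closed;  // private slot, used once the variable has left the stack

    explicit MiniUpVal(int* stack_slot) : v(stack_slot), closed(0) {}

    // `v` may point at this object's own member, so forbid copying.
    MiniUpVal(const MiniUpVal&) = delete;
    MiniUpVal& operator=(const MiniUpVal&) = delete;

    bool is_open() const { return v != &closed; }

    // Called when the captured variable goes out of scope: copy the value
    // out of the stack and redirect the pointer to the private slot.
    void close() {
        if (is_open()) {
            closed = *v;
            v = &closed;
        }
    }
};
```

Code that reads or writes through v does not need to know whether the upvalue is open or closed, which is why the paper calls the migration transparent. After close(), later reuse of the old stack slot no longer affects the captured value.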

Some details

1. Addressing modes

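Because upvalues are stored in a different physical location, the Lua virtual machine has to reflect this difference, much as the 386 has dedicated push/pop instructions for stack operations. In Lua, stack variables and function parameters, constants, and upvalues are each accessed through a different instruction or operand mode.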

The disassembly below shows that reads and writes of an upvalue use the dedicated GETUPVAL/SETUPVAL instructions.

tsecer@harry: cat -n luaclosure.lua 
     1      function newCounter ()
     2        local i = 0
     3        return function ()   -- anonymous function
     4                 i = i + 1
     5                 return i
     6               end
     7      end
     8
     9      c1 = newCounter()
    10      print(c1())
    11      print(c1())
    12
tsecer@harry: luac -v
Lua 5.1.4  Copyright (C) 1994-2008 Lua.org, PUC-Rio
tsecer@harry: luac -l luaclosure.lua 

main <luaclosure.lua:0,0> (14 instructions, 56 bytes at 0x15d7530)
0+ params, 2 slots, 0 upvalues, 0 locals, 3 constants, 1 function
        1       [7]     CLOSURE         0 0     ; 0x15d7710
        2       [1]     SETGLOBAL       0 -1    ; newCounter
        3       [9]     GETGLOBAL       0 -1    ; newCounter
        4       [9]     CALL            0 1 2
        5       [9]     SETGLOBAL       0 -2    ; c1
        6       [10]    GETGLOBAL       0 -3    ; print
        7       [10]    GETGLOBAL       1 -2    ; c1
        8       [10]    CALL            1 1 0
        9       [10]    CALL            0 0 1
        10      [11]    GETGLOBAL       0 -3    ; print
        11      [11]    GETGLOBAL       1 -2    ; c1
        12      [11]    CALL            1 1 0
        13      [11]    CALL            0 0 1
        14      [11]    RETURN          0 1

function <luaclosure.lua:1,7> (5 instructions, 20 bytes at 0x15d7710)
0 params, 2 slots, 0 upvalues, 1 local, 1 constant, 1 function
        1       [2]     LOADK           0 -1    ; 0
        2       [6]     CLOSURE         1 0     ; 0x15d78c0
        3       [6]     MOVE            0 0
        4       [6]     RETURN          1 2
        5       [7]     RETURN          0 1

function <luaclosure.lua:3,6> (6 instructions, 24 bytes at 0x15d78c0)
0 params, 2 slots, 1 upvalue, 0 locals, 1 constant, 0 functions
        1       [4]     GETUPVAL        0 0     ; i
        2       [4]     ADD             0 0 -1  ; - 1
        3       [4]     SETUPVAL        0 0     ; i
        4       [5]     GETUPVAL        0 0     ; i
        5       [5]     RETURN          0 2
        6       [6]     RETURN          0 1
tsecer@harry: 

2. Data structures

In an LClosure, the UpVal array sits at the same level as the function prototype (Proto); in other words, Closure = Proto + upvalues.

typedef struct LClosure {
  ClosureHeader;
  struct Proto *p;
  UpVal *upvals[1];  /* list of upvalues */
} LClosure;

/*
** Upvalues for Lua closures
*/
struct UpVal {
  TValue *v;  /* points to stack or to its own value */
  lu_mem refcount;  /* reference counter */
  union {
    struct {  /* (when open) */
      UpVal *next;  /* linked list */
      int touched;  /* mark to avoid cycles with dead threads */
    } open;
    TValue value;  /* the value (when closed) */
  } u;
};

So why are upvalues not stored in the Proto struct, the way constants are?

/*
** Function Prototypes
*/
typedef struct Proto {
///...
  TValue *k;  /* constants used by the function */
  Instruction *code;  /* opcodes */
///...
};

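An obvious reason is that constants never change once the code is compiled, whereas upvalues belong to a closure. A closure is created inside a function and can refer to that function's parameters; each call passes different arguments, so each call captures different variables and therefore produces a different closure.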

For example:

function fun(x)                                                                 
    return function(y)                                                          
        return x + y                                                            
    end                                                                         
end                                                                             
                                                                                
print(fun(1)(1))                                                                
print(fun(2)(1))

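Although the closures created inside fun share the same logic (return x + y), they capture different values of the parameter x (1 and 2 respectively), so they are different closures.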
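The split "Closure = Proto + upvalues" can be imitated by hand in C++. The sketch below is a rough model under stated assumptions, not Lua's layout: MiniProto, MiniClosure, add_code, and kAddProto are hypothetical names, the prototype is reduced to a function pointer, and upvalues are shared_ptr<int> cells.

```cpp
#include <memory>
#include <vector>

using UpVals = std::vector<std::shared_ptr<int>>;

// Hypothetical stand-in for Proto: immutable code shared by all closures.
struct MiniProto {
    int (*code)(const UpVals& upvals, int arg);
};

// Hypothetical stand-in for LClosure: a prototype plus per-closure upvalues.
struct MiniClosure {
    const MiniProto* p;
    UpVals upvals;
    int call(int arg) const { return p->code(upvals, arg); }
};

// Body of the inner function: return x + y, where x is upvalue 0.
inline int add_code(const UpVals& upvals, int y) { return *upvals[0] + y; }

const MiniProto kAddProto{&add_code};

// Counterpart of the Lua function fun(x): same prototype, fresh upvalue.
inline MiniClosure fun(int x) {
    return MiniClosure{&kAddProto, UpVals{std::make_shared<int>(x)}};
}
```

fun(1) and fun(2) return closures whose p fields point at the same prototype, yet call(1) yields 2 for the first and 3 for the second. This is why the upvalues cannot live inside the shared prototype.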

3. Copying upvalues

Executing the CLOSURE instruction triggers a call to pushclosure, unless a cached closure can be reused:

      vmcase(OP_CLOSURE) {
        Proto *p = cl->p->p[GETARG_Bx(i)];
        LClosure *ncl = getcached(p, cl->upvals, base);  /* cached closure */
        if (ncl == NULL)  /* no match? */
          pushclosure(L, p, cl->upvals, base, ra);  /* create a new one */
        else
          setclLvalue(L, ra, ncl);  /* push cashed closure */
        checkGC(L, ra + 1);
        vmbreak;
      }

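pushclosure walks the upvalues one by one and calls luaF_findupval for each one that refers to a local variable of the enclosing function: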

/*
** create a new Lua closure, push it in the stack, and initialize
** its upvalues. Note that the closure is not cached if prototype is
** already black (which means that 'cache' was already cleared by the
** GC).
*/
static void pushclosure (lua_State *L, Proto *p, UpVal **encup, StkId base,
                         StkId ra) {
  int nup = p->sizeupvalues;
  Upvaldesc *uv = p->upvalues;
  int i;
  LClosure *ncl = luaF_newLclosure(L, nup);
  ncl->p = p;
  setclLvalue(L, ra, ncl);  /* anchor new closure in stack */
  for (i = 0; i < nup; i++) {  /* fill in its upvalues */
    if (uv[i].instack)  /* upvalue refers to local variable? */
      ncl->upvals[i] = luaF_findupval(L, base + uv[i].idx);
    else  /* get upvalue from enclosing function */
      ncl->upvals[i] = encup[uv[i].idx];
    ncl->upvals[i]->refcount++;
    /* new closure is white, so we do not need a barrier here */
  }
  if (!isblack(p))  /* cache will not break GC invariant? */
    p->cache = ncl;  /* save it on cache for reuse */
}

If luaF_findupval finds no existing open upvalue for the stack slot, it creates a new UpVal with luaM_new.

UpVal *luaF_findupval (lua_State *L, StkId level) {
  UpVal **pp = &L->openupval;
  UpVal *p;
  UpVal *uv;
  lua_assert(isintwups(L) || L->openupval == NULL);
  while (*pp != NULL && (p = *pp)->v >= level) {
    lua_assert(upisopen(p));
    if (p->v == level)  /* found a corresponding upvalue? */
      return p;  /* return it */
    pp = &p->u.open.next;
  }
  /* not found: create a new upvalue */
  uv = luaM_new(L, UpVal);
  uv->refcount = 0;
  uv->u.open.next = *pp;  /* link it to list of open upvalues */
  uv->u.open.touched = 1;
  *pp = uv;
  uv->v = level;  /* current value lives in the stack */
  if (!isintwups(L)) {  /* thread not in list of threads with upvalues? */
    L->twups = G(L)->twups;  /* link it to the list */
    G(L)->twups = L;
  }
  return uv;
}
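The search-then-create logic of luaF_findupval can be reduced to a few lines. The following C++ sketch is a simplified model with hypothetical names (OpenUpVal, find_upval): it uses std::list instead of Lua's intrusive linked list and int slots instead of TValue, but keeps the same ordering rule, with open upvalues sorted by descending stack address.

```cpp
#include <list>

// Hypothetical open upvalue: only remembers which stack slot it refers to.
struct OpenUpVal {
    int* v;
};

// Finds the open upvalue for stack slot `level`, creating it if needed.
// The list is kept sorted by descending slot address, as in Lua.
inline OpenUpVal* find_upval(std::list<OpenUpVal>& open, int* level) {
    auto it = open.begin();
    while (it != open.end() && it->v >= level) {
        if (it->v == level)  // slot already captured: share the upvalue
            return &*it;
        ++it;
    }
    // Not found: insert a new upvalue at the position that keeps the order.
    return &*open.insert(it, OpenUpVal{level});
}
```

The important consequence is sharing: two closures that capture the same local variable receive the same upvalue object, so a write through one closure is visible through the other.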

Handling non-stack variables

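Note that when pushclosure fills in the upvalues it first checks instack: an upvalue that is not instack does not go through luaF_findupval, and the new closure simply shares the enclosing function's UpVal. The instack flag is set at compile time, in newupvalue: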

static int newupvalue (FuncState *fs, TString *name, expdesc *v) {
  Proto *f = fs->f;
  int oldsize = f->sizeupvalues;
  checklimit(fs, fs->nups + 1, MAXUPVAL, "upvalues");
  luaM_growvector(fs->ls->L, f->upvalues, fs->nups, f->sizeupvalues,
                  Upvaldesc, MAXUPVAL, "upvalues");
  while (oldsize < f->sizeupvalues)
    f->upvalues[oldsize++].name = NULL;
  f->upvalues[fs->nups].instack = (v->k == VLOCAL);
  f->upvalues[fs->nups].idx = cast_byte(v->u.info);
  f->upvalues[fs->nups].name = name;
  luaC_objbarrier(fs->ls->L, f, name);
  return fs->nups++;
}

C++ implementation

Overview

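Analysing gcc's lambda implementation shows that it is much like Lua's: gcc creates an anonymous struct (visible only to the compiler) whose member variables are the captured variables, and generates a function call operator for it. These two parts correspond, respectively, to the following members of Lua's LClosure struct: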

UpVal *upvals[1];  /* list of upvalues */

struct Proto *p;

The test code below shows that a variable captured by value in C++ is likewise copied into the closure.

tsecer@harry: cat lambda.cpp 
#include <stdio.h>
#include <string.h>

struct S
{
    int a[10];
};

auto ret_lambda()
{
    S s{1, 2, 3};

    auto local = [s]()
    {
        printf("%d, %d\n", s.a[0], s.a[1]);
    };

    memset(&s, 0 , sizeof(s));

    return local;
}

int main(int argc, const char *argv[])
{
    ret_lambda()();
    return 0;
}
tsecer@harry: g++ lambda.cpp
tsecer@harry: ./a.out 
1, 2
tsecer@harry: 
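The same effect can be observed without leaving the enclosing function, which also allows a safe comparison with capture by reference. This is a minimal sketch assuming C++11; capture_demo is a hypothetical name.

```cpp
#include <utility>

// Returns {value seen by the by-value closure, value seen by the by-reference closure}.
inline std::pair<int, int> capture_demo() {
    int x = 1;
    auto by_value = [x]() { return x; };  // copies x at the point of capture
    auto by_ref = [&x]() { return x; };   // stores a reference to x
    x = 2;                                // modify the original afterwards
    return {by_value(), by_ref()};        // both closures are still in scope here
}
```

The by-value closure keeps reporting 1 while the by-reference closure reports 2. Only the by-value form could safely be returned from the function, because the reference would dangle once x is destroyed.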

Implementation

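Roughly speaking, gcc's implementation of a lambda lets the capture clause declare which local variables of the enclosing function are used, puts those captured variables into a struct, and defines a function call operator on that struct.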

tsecer@harry: cat -n autofunc.cpp            
     1  #include <functional>
     2
     3  typedef std::function<bool (int, int)> stdFunc;
     4
     5  int main(int argc, const char * argv[])
     6  {
     7          int a,  b, c, d;
     8          auto ff = [&](int x, int y) -> bool
     9          {
    10                  return x + y + a + d;
    11          };
    12
    13          ff(0x1111, 0x2222);
    14
    15          stdFunc stdff = ff;
    16          stdff(0x3333, 0x4444);
    17  }
tsecer@harry: g++ -g -std=c++11 autofunc.cpp
tsecer@harry: gdb ./a.out 
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-80.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/harry/study/autofunc/a.out...done.
(gdb) b main
Breakpoint 1 at 0x400874: file autofunc.cpp, line 11.
(gdb) r
Starting program: /home/harry/study/autofunc/./a.out 

Breakpoint 1, main (argc=1, argv=0x7fffffffe578) at autofunc.cpp:11
11              };
Missing separate debuginfos, use: debuginfo-install glibc-2.17-196.tl2.3.x86_64 libgcc-4.8.5-4.el7.x86_64 libstdc++-4.8.5-4.el7.x86_64
(gdb) ptype ff
type = struct __lambda0 {
    int &__a;
    int &__d;
}
(gdb) s
13              ff(0x1111, 0x2222);
(gdb) 
__lambda0::operator() (__closure=0x7fffffffe460, x=4369, y=8738) at autofunc.cpp:10
10                      return x + y + a + d;
(gdb) ptype __closure
type = const struct __lambda0 {
    int &__a;
    int &__d;
} * const
(gdb) 

As the session shows, because the lambda ff refers to the local variables a and d, the compiler generates a struct like this at compile time:

struct __lambda0 {
    int &__a;
    int &__d;
    // const, matching the "const struct __lambda0 * const" __closure seen in gdb
    bool operator()(int x, int y) const
    {
        return x + y + __a + __d;
    }
};

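Note: because the capture here is by reference ([&]), __a and __d are reference members bound to the original locals a and d, not separate copies. The closure gets its own copy only with capture by value, as in the lambda.cpp example above, and only a by-value closure can safely outlive the function that created it.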
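What gdb prints for __lambda0 can be reproduced with a hand-written function object. The sketch below is an approximation for illustration: RefClosure is a hypothetical name, and the real compiler-generated type is unnamed and cannot be spelled in source code.

```cpp
// Hand-written stand-in for the closure type generated for
// [&](int x, int y) -> bool { return x + y + a + d; }
struct RefClosure {
    int& a;  // reference to the enclosing function's local `a`
    int& d;  // reference to the enclosing function's local `d`

    // const call operator, like the one of a non-mutable lambda
    bool operator()(int x, int y) const { return x + y + a + d; }
};
```

Constructing it as RefClosure f{a, d} binds the members to the caller's variables, so later changes to a and d are visible through f. Replacing int& with int gives the layout of a by-value capture.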

Posted on 2024-04-10 14:31 by tsecer
