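Around the Spring Festival I had been following Clojure, out of curiosity about STM and Lisp macros. After the new year there is plenty of good news for Erlang developers too: two new Erlang books have been published. The first is O'Reilly's slim volume "Introducing Erlang"; the other is the well-known LYSE (Learn You Some Erlang), which finally has an official print edition. Before that, enthusiastic readers had compiled the site into e-books; the print edition is beautifully typeset and keeps the richly illustrated style of the original site. Electronic versions of both books are easy to find; please search for them yourself.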
 
   Here is a detail I ran into while reading LYSE:
 

Note: Since R14A, a new optimization has been added to Erlang's compiler. It simplifies selective receives in very specific cases of back-and-forth communications between processes. An example of such a function is optimized/1 in multiproc.erl.

To make it work, a reference (make_ref()) has to be created in a function and then sent in a message. In the same function, a selective receive is then made. If no message can match unless it contains the same reference, the compiler automatically makes sure the VM will skip messages received before the creation of that reference.

Note that you shouldn't try to coerce your code to fit such optimizations. The Erlang developers only look for patterns that are frequently used and then make them faster. If you write idiomatic code, optimizations should come to you. Not the other way around.

    
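   To restate the problem: if a large backlog of messages has built up in a process's message queue, every receive that scans the whole queue pays an unnecessary cost. The fix relies on make_ref: make_ref creates a unique identifier, so only a message carrying that identifier can match. The author says "the compiler automatically makes sure the VM will skip messages received before the creation of that reference." How exactly is that done?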
 
 First, the code:
 
optimized(Pid) ->
    Ref = make_ref(),
    Pid ! {self(), Ref, hello},
    receive
        {Pid, Ref, Msg} ->
            io:format("~p~n", [Msg])
    end.

 

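   In the example above, no message that arrived before the call to make_ref/0 can possibly match, so there is no need to scan the whole message queue. Two instructions were introduced for this purpose: recv_mark and recv_set.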

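   The recv_mark/1 instruction saves the current tail pointer of the message queue in the process context. recv_set/1 checks whether the process context holds that saved information; if it does, it moves the pointer to the next message to be read to the saved position. The remove_message instruction also had to be changed so that it first invalidates whatever recv_mark/1 saved, which guards against another receive being executed between recv_mark/1 and recv_set/1.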
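The bookkeeping described above can be mimicked with a toy model. This is only a sketch of the idea, not the VM's implementation: the module name recv_model, the map layout, and the function to_scan/1 are all made up for illustration, and the real queue is a linked structure rather than a list.

```erlang
-module(recv_model).
-export([new/1, enqueue/2, recv_mark/1, recv_set/1,
         remove_message/2, to_scan/1]).

%% Toy state (hypothetical layout):
%%   queue - messages in arrival order
%%   mark  - saved queue position, or 'none'
%%   next  - index where the next receive starts scanning
new(Msgs) ->
    #{queue => Msgs, mark => none, next => 0}.

%% A new message is appended at the tail.
enqueue(Msg, #{queue := Q} = S) ->
    S#{queue := Q ++ [Msg]}.

%% recv_mark: remember where the queue currently ends.
recv_mark(#{queue := Q} = S) ->
    S#{mark := length(Q)}.

%% recv_set: if a mark was saved, start scanning from it;
%% otherwise leave the scan position untouched.
recv_set(#{mark := none} = S) ->
    S;
recv_set(#{mark := Pos} = S) ->
    S#{next := Pos}.

%% remove_message: drop the matched message and invalidate the
%% mark, so a stale mark can never be used by a later recv_set.
remove_message(Msg, #{queue := Q} = S) ->
    S#{queue := lists:delete(Msg, Q), mark := none, next := 0}.

%% The messages a receive would actually have to look at.
to_scan(#{queue := Q, next := N}) ->
    lists:nthtail(N, Q).
```

In this model, marking a queue of three old messages, enqueueing a reply, and then calling recv_set/1 leaves only the reply to scan; if remove_message/2 runs in between, the mark is gone and the scan starts from the head again.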
 
  The comments in the beam_receive module explain this with pseudocode, so let's look directly at what the compiler produces.
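Listings of this kind can be generated with the compiler's 'S' option (the command-line equivalent is erlc -S). A minimal sketch follows; the wrapper module dump_asm and its run/1 function are hypothetical names, and the exact instructions in the output depend on the OTP release, so a newer compiler may not print the same instruction names as the listings in this post.

```erlang
-module(dump_asm).
-export([run/1]).

%% Compile File with the 'S' option, which writes an assembler
%% listing (<module>.S) instead of a .beam file.
%% Returns the module name on success.
run(File) ->
    {ok, Module} = compile:file(File, ['S']),
    Module.
```

For example, dump_asm:run("msg.erl") should leave a msg.S file in the output directory, assuming msg.erl compiles cleanly.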
 
  The ordinary version of the message-receiving code, followed by its .S listing:
normal() ->
    receive
        {_, Message} ->
            [Message | normal()]
    after 0 ->
        []
    end.

{function, normal, 0, 13}.
  {label,12}.
    {line,[{location,"msg.erl",24}]}.
    {func_info,{atom,msg},{atom,normal},0}.
  {label,13}.
    {allocate_zero,1,0}.
    {line,[{location,"msg.erl",25}]}.
  {label,14}.
    {loop_rec,{f,16},{x,0}}.
    {test,is_tuple,{f,15},[{x,0}]}.
    {test,test_arity,{f,15},[{x,0},2]}.
    {get_tuple_element,{x,0},1,{y,0}}.
    remove_message.
    {line,[{location,"msg.erl",27}]}.
    {call,0,{f,13}}.
    {test_heap,2,1}.
    {put_list,{y,0},{x,0},{x,0}}.
    {deallocate,1}.
    return.
  {label,15}.
    {loop_rec_end,{f,14}}.
  {label,16}.
    timeout.
    {move,nil,{x,0}}.
    {deallocate,1}.
    return.
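To make the listing easier to follow, here is a small sketch of what normal/0 does at run time: it collects the second element of every two-element tuple in the mailbox and stops as soon as nothing matches (the after 0 branch, which shows up as the timeout instruction). The module name normal_demo and the run/0 helper are made up for this illustration; run/0 uses a fresh process so that the mailbox contents are known.

```erlang
-module(normal_demo).
-export([normal/0, run/0]).

%% Same function as in the post: drain all {_, Message} tuples.
normal() ->
    receive
        {_, Message} ->
            [Message | normal()]
    after 0 ->
        []
    end.

%% Run normal/0 in a fresh process with a known mailbox.
run() ->
    Parent = self(),
    Tag = make_ref(),
    spawn(fun() ->
                  self() ! {1, a},
                  self() ! not_a_pair,   % skipped: not a 2-tuple
                  self() ! {2, b},
                  Parent ! {Tag, normal()}
          end),
    receive
        {Tag, Result} -> Result
    end.
```

Calling normal_demo:run() returns [a, b]; the atom not_a_pair fails the is_tuple test and is simply left in the mailbox of the helper process.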

 

Now let's compare with the make_ref version:

 
optimized(Pid) ->
    Ref = make_ref(),
    Pid ! {self(), Ref, hello},
    receive
        {Pid, Ref, Msg} ->
            io:format("~p~n", [Msg])
    end.
 
 

{function, optimized, 1, 18}.
  {label,17}.
    {line,[{location,"msg.erl",33}]}.
    {func_info,{atom,msg},{atom,optimized},1}.
  {label,18}.
    {allocate_zero,2,1}.
    {move,{x,0},{y,1}}.
    {line,[{location,"msg.erl",34}]}.
    {recv_mark,{f,19}}.
    {call_ext,0,{extfunc,erlang,make_ref,0}}.
    {test_heap,4,1}.
    {bif,self,{f,0},[],{x,1}}.
    {move,{x,0},{y,0}}.
    {put_tuple,3,{x,2}}.
    {put,{x,1}}.
    {put,{y,0}}.
    {put,{atom,hello}}.
    {move,{x,2},{x,1}}.
    {move,{y,1},{x,0}}.
    {line,[{location,"msg.erl",35}]}.
    send.
    {line,[{location,"msg.erl",36}]}.
    {recv_set,{f,19}}.
  {label,19}.
    {loop_rec,{f,21},{x,0}}.
    {test,is_tuple,{f,20},[{x,0}]}.
    {test,test_arity,{f,20},[{x,0},3]}.
    {get_tuple_element,{x,0},0,{x,1}}.
    {get_tuple_element,{x,0},1,{x,2}}.
    {get_tuple_element,{x,0},2,{x,3}}.
    {test,is_eq_exact,{f,20},[{x,1},{y,1}]}.
    {test,is_eq_exact,{f,20},[{x,2},{y,0}]}.
    remove_message.
    {test_heap,2,4}.
    {put_list,{x,3},nil,{x,1}}.
    {move,{literal,"~p~n"},{x,0}}.
    {line,[{location,"msg.erl",38}]}.
    {call_ext_last,2,{extfunc,io,format,2},2}.
  {label,20}.
    {loop_rec_end,{f,19}}.
  {label,21}.
    {wait,{f,19}}.
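The request/reply pattern behind this listing can be tried out with a small sketch. Everything here is illustrative: the module ref_call_demo, the echo/0 server, call/2 (which returns the reply instead of printing it) and run/1 are hypothetical names, and the code only shows the shape of the pattern; whether a given compiler applies the optimization to it has to be checked in the generated .S file.

```erlang
-module(ref_call_demo).
-export([run/1]).

%% Hypothetical server: echoes every request back, tagged with
%% its own pid and the caller's reference.
echo() ->
    receive
        {From, Ref, Msg} ->
            From ! {self(), Ref, Msg},
            echo()
    end.

%% Same shape as optimized/1: create a reference, send it,
%% then selectively receive only the reply that carries it.
call(Pid, Msg) ->
    Ref = make_ref(),
    Pid ! {self(), Ref, Msg},
    receive
        {Pid, Ref, Reply} -> Reply
    end.

%% Fill a fresh client's mailbox with Backlog unrelated messages,
%% make one call, and report the reply and the remaining backlog.
client(Backlog) ->
    Server = spawn(fun echo/0),
    [self() ! {junk, N} || N <- lists:seq(1, Backlog)],
    Reply = call(Server, hello),
    exit(Server, kill),
    {message_queue_len, Len} = process_info(self(), message_queue_len),
    {Reply, Len}.

run(Backlog) ->
    Parent = self(),
    Tag = make_ref(),
    spawn(fun() -> Parent ! {Tag, client(Backlog)} end),
    receive
        {Tag, Result} -> Result
    end.
```

ref_call_demo:run(1000) returns {hello, 1000}: the reply is found even though 1000 older messages are still sitting in the mailbox, and those are exactly the messages the mark lets the VM skip.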
 

 

  This pattern is quite common in the OTP code base; even code we take for granted has deeper intent that is worth thinking through.
 
 
 
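Addendum, 2014-8-26 8:35:00: the stream semantics of message reception
Erlang message passing has stream semantics. Sending a message to process P means the message is placed in P's mailbox. If process Q sends messages to process P in the order m1, m2, the possible outcomes are:
[1] m1 and m2 arrive in order
[2] only m1 is delivered
[3] neither is delivered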
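The ordering part of these semantics can be observed with a short sketch. The module name order_demo and its functions are made up for illustration, and the demo runs on a single local node, where messages are not lost, so it only ever shows outcome [1].

```erlang
-module(order_demo).
-export([run/1]).

%% One sender process sends 1..N to the caller; the caller takes
%% the tagged messages in arrival order.
run(N) ->
    Parent = self(),
    Tag = make_ref(),
    spawn(fun() ->
                  [Parent ! {Tag, I} || I <- lists:seq(1, N)]
          end),
    collect(Tag, N).

%% Receive K tagged messages without choosing among them by
%% value, so the result reflects the order they arrived in.
collect(_Tag, 0) ->
    [];
collect(Tag, K) ->
    receive
        {Tag, I} -> [I | collect(Tag, K - 1)]
    end.
```

order_demo:run(5) returns [1, 2, 3, 4, 5], because messages from one sender to one receiver are delivered in the order they were sent.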
 