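A common annoyance when debugging in the Erlang shell: an exception in the shell destroys your ETS tables, so you keep having to recreate them. The cause is that the shell restarts its evaluator process after an exception, and an ETS table is tied to the process that created it. When that process dies, the table is destroyed (not always, as we will see below). Here is how the official documentation puts it:
  Note that there is no automatic garbage collection for tables. Even if there are no references to a table from any process, it will not automatically be destroyed unless the owner process terminates. It can be destroyed explicitly by using delete/1. The default owner is the process that created the table.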
 
  Let's fix the shell-debugging problem first. The simplest approach is to remove the dependency on the shell process by creating the ETS table in a different process.
Eshell V5.9  (abort with ^G)
1> ets:new(test,[named_table]). %%% First, reproduce the failure
test
2> ets:i(). %%% Inspect the ETS tables
id name type size mem owner
----------------------------------------------------------------------------
13 code set 258 9657 code_server
4110 code_names set 55 6983 code_server
8207 shell_records ordered_set 0 106 <0.24.0>
ac_tab ac_tab set 6 852 application_controller
file_io_servers file_io_servers set 0 316 file_server_2
global_locks global_locks set 0 316 global_name_server
global_names global_names set 0 316 global_name_server
global_names_ext global_names_ext set 0 316 global_name_server
global_pid_ids global_pid_ids bag 0 316 global_name_server
global_pid_names global_pid_names bag 0 316 global_name_server
inet_cache inet_cache bag 0 316 inet_db
inet_db inet_db set 23 503 inet_db
inet_hosts_byaddr inet_hosts_byaddr bag 0 316 inet_db
inet_hosts_byname inet_hosts_byname bag 0 316 inet_db
inet_hosts_file_byaddr inet_hosts_file_byaddr bag 0 316 inet_db
inet_hosts_file_byname inet_hosts_file_byname bag 0 316 inet_db
test test set 0 316 <0.30.0>
ok
3> 1/0. %%% Raise an exception so the shell restarts
** exception error: bad argument in an arithmetic expression
in operator '/'/2
called as 1 / 0
4> ets:i(). %%% List the tables again: the test table is gone
id name type size mem owner
----------------------------------------------------------------------------
13 code set 259 9760 code_server
4110 code_names set 55 6983 code_server
8207 shell_records ordered_set 0 106 <0.24.0>
ac_tab ac_tab set 6 852 application_controller
file_io_servers file_io_servers set 0 316 file_server_2
global_locks global_locks set 0 316 global_name_server
global_names global_names set 0 316 global_name_server
global_names_ext global_names_ext set 0 316 global_name_server
global_pid_ids global_pid_ids bag 0 316 global_name_server
global_pid_names global_pid_names bag 0 316 global_name_server
inet_cache inet_cache bag 0 316 inet_db
inet_db inet_db set 23 503 inet_db
inet_hosts_byaddr inet_hosts_byaddr bag 0 316 inet_db
inet_hosts_byname inet_hosts_byname bag 0 316 inet_db
inet_hosts_file_byaddr inet_hosts_file_byaddr bag 0 316 inet_db
inet_hosts_file_byname inet_hosts_file_byname bag 0 316 inet_db
ok

5> spawn(fun() -> ets:new(test,[named_table]), receive after infinity->ok end end). %%% Create the table in a separate process
<0.39.0>
6> ets:i(). %%% Inspect the ETS tables
id name type size mem owner
----------------------------------------------------------------------------
13 code set 259 9760 code_server
4110 code_names set 55 6983 code_server
8207 shell_records ordered_set 0 106 <0.24.0>
ac_tab ac_tab set 6 852 application_controller
file_io_servers file_io_servers set 0 316 file_server_2
global_locks global_locks set 0 316 global_name_server
global_names global_names set 0 316 global_name_server
global_names_ext global_names_ext set 0 316 global_name_server
global_pid_ids global_pid_ids bag 0 316 global_name_server
global_pid_names global_pid_names bag 0 316 global_name_server
inet_cache inet_cache bag 0 316 inet_db
inet_db inet_db set 23 503 inet_db
inet_hosts_byaddr inet_hosts_byaddr bag 0 316 inet_db
inet_hosts_byname inet_hosts_byname bag 0 316 inet_db
inet_hosts_file_byaddr inet_hosts_file_byaddr bag 0 316 inet_db
inet_hosts_file_byname inet_hosts_file_byname bag 0 316 inet_db
test test set 0 316 <0.39.0>
ok
7> 1/0. %%% Crash the shell
** exception error: bad argument in an arithmetic expression
in operator '/'/2
called as 1 / 0
8> ets:i(). %%% Inspect the tables: the test table is still there
id name type size mem owner
----------------------------------------------------------------------------
13 code set 259 9760 code_server
4110 code_names set 55 6983 code_server
8207 shell_records ordered_set 0 106 <0.24.0>
ac_tab ac_tab set 6 852 application_controller
file_io_servers file_io_servers set 0 316 file_server_2
global_locks global_locks set 0 316 global_name_server
global_names global_names set 0 316 global_name_server
global_names_ext global_names_ext set 0 316 global_name_server
global_pid_ids global_pid_ids bag 0 316 global_name_server
global_pid_names global_pid_names bag 0 316 global_name_server
inet_cache inet_cache bag 0 316 inet_db
inet_db inet_db set 23 503 inet_db
inet_hosts_byaddr inet_hosts_byaddr bag 0 316 inet_db
inet_hosts_byname inet_hosts_byname bag 0 316 inet_db
inet_hosts_file_byaddr inet_hosts_file_byaddr bag 0 316 inet_db
inet_hosts_file_byname inet_hosts_file_byname bag 0 316 inet_db
test test set 0 316 <0.39.0>
ok
9>
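   With the second approach the table is owned by process <0.39.0> rather than by the shell process, so it survives the shell restart. This trick can be wrapped in user_default.erl for convenient use while debugging; see: [Erlang 0027] Using Record in Erlang Shell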
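A minimal sketch of such a user_default helper follows. The function name keep_ets and its return shape are assumptions made for illustration, not part of OTP. The helper spawns a holder process, creates the table there, and waits until the table exists before returning. The default options make the table public, since a protected table owned by the holder could be read but not written from the shell.

```erlang
-module(user_default).
-export([keep_ets/1, keep_ets/2]).

%% Hypothetical helper: create a named, public table owned by a
%% separate holder process so that it survives shell restarts.
keep_ets(Name) ->
    keep_ets(Name, [named_table, public]).

keep_ets(Name, Opts) ->
    Caller = self(),
    Ref = make_ref(),
    {Pid, MRef} = spawn_monitor(fun() ->
        T = ets:new(Name, Opts),
        Caller ! {Ref, T},
        %% Own the table until told to stop.
        receive stop -> ok end
    end),
    receive
        {Ref, Tab} ->
            erlang:demonitor(MRef, [flush]),
            {ok, Pid, Tab};
        {'DOWN', MRef, process, Pid, Reason} ->
            %% ets:new/2 failed in the holder, e.g. the name is taken.
            {error, Reason}
    end.
```

Once the module is compiled and on the code path, the shell resolves unqualified calls against user_default, so keep_ets(test). can be typed directly. Sending stop to the returned pid ends the holder and destroys the table.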
 
   The documentation also mentions ways to change who owns an ETS table:
   Table ownership can be transferred at process termination by using the heir option or explicitly by calling give_away/3.
   So ownership of a table can be handed over to another process. Let's try give_away:
Eshell V5.9  (abort with ^G)
1> ets:new(test,[named_table]).
test
2> ets:i().
id name type size mem owner
----------------------------------------------------------------------------
13 code set 259 10181 code_server
4110 code_names set 54 7151 code_server
8207 shell_records ordered_set 0 97 <0.26.0>
ac_tab ac_tab set 6 843 application_controller
file_io_servers file_io_servers set 0 307 file_server_2
global_locks global_locks set 0 307 global_name_server
global_names global_names set 0 307 global_name_server
global_names_ext global_names_ext set 0 307 global_name_server
global_pid_ids global_pid_ids bag 0 307 global_name_server
global_pid_names global_pid_names bag 0 307 global_name_server
inet_cache inet_cache bag 0 307 inet_db
inet_db inet_db set 29 630 inet_db
inet_hosts_byaddr inet_hosts_byaddr bag 0 307 inet_db
inet_hosts_byname inet_hosts_byname bag 0 307 inet_db
inet_hosts_file_byaddr inet_hosts_file_byaddr bag 0 307 inet_db
inet_hosts_file_byname inet_hosts_file_byname bag 0 307 inet_db
test test set 0 307 <0.32.0>
ok
3> 1/0.
** exception error: bad argument in an arithmetic expression
in operator '/'/2
called as 1 / 0
4> ets:i().
id name type size mem owner
----------------------------------------------------------------------------
13 code set 260 10290 code_server
4110 code_names set 54 7151 code_server
8207 shell_records ordered_set 0 97 <0.26.0>
ac_tab ac_tab set 6 843 application_controller
file_io_servers file_io_servers set 0 307 file_server_2
global_locks global_locks set 0 307 global_name_server
global_names global_names set 0 307 global_name_server
global_names_ext global_names_ext set 0 307 global_name_server
global_pid_ids global_pid_ids bag 0 307 global_name_server
global_pid_names global_pid_names bag 0 307 global_name_server
inet_cache inet_cache bag 0 307 inet_db
inet_db inet_db set 29 630 inet_db
inet_hosts_byaddr inet_hosts_byaddr bag 0 307 inet_db
inet_hosts_byname inet_hosts_byname bag 0 307 inet_db
inet_hosts_file_byaddr inet_hosts_file_byaddr bag 0 307 inet_db
inet_hosts_file_byname inet_hosts_file_byname bag 0 307 inet_db
ok
6> P=spawn(fun()-> receive after infinity -> ok end end ).
<0.43.0>
7> ets:new(test,[named_table]).
test
8> ets:i().
id name type size mem owner
----------------------------------------------------------------------------
13 code set 261 10419 code_server
4110 code_names set 54 7151 code_server
8207 shell_records ordered_set 0 97 <0.26.0>
ac_tab ac_tab set 6 843 application_controller
file_io_servers file_io_servers set 0 307 file_server_2
global_locks global_locks set 0 307 global_name_server
global_names global_names set 0 307 global_name_server
global_names_ext global_names_ext set 0 307 global_name_server
global_pid_ids global_pid_ids bag 0 307 global_name_server
global_pid_names global_pid_names bag 0 307 global_name_server
inet_cache inet_cache bag 0 307 inet_db
inet_db inet_db set 29 630 inet_db
inet_hosts_byaddr inet_hosts_byaddr bag 0 307 inet_db
inet_hosts_byname inet_hosts_byname bag 0 307 inet_db
inet_hosts_file_byaddr inet_hosts_file_byaddr bag 0 307 inet_db
inet_hosts_file_byname inet_hosts_file_byname bag 0 307 inet_db
test test set 0 307 <0.41.0>
ok
9> self().
<0.41.0>
10> ets:give_away(test,P,[]).
true
11> 1/0.
** exception error: bad argument in an arithmetic expression
in operator '/'/2
called as 1 / 0
12> self().
<0.49.0>
13> ets:i().
id name type size mem owner
----------------------------------------------------------------------------
13 code set 261 10419 code_server
4110 code_names set 54 7151 code_server
8207 shell_records ordered_set 0 97 <0.26.0>
ac_tab ac_tab set 6 843 application_controller
file_io_servers file_io_servers set 0 307 file_server_2
global_locks global_locks set 0 307 global_name_server
global_names global_names set 0 307 global_name_server
global_names_ext global_names_ext set 0 307 global_name_server
global_pid_ids global_pid_ids bag 0 307 global_name_server
global_pid_names global_pid_names bag 0 307 global_name_server
inet_cache inet_cache bag 0 307 inet_db
inet_db inet_db set 29 630 inet_db
inet_hosts_byaddr inet_hosts_byaddr bag 0 307 inet_db
inet_hosts_byname inet_hosts_byname bag 0 307 inet_db
inet_hosts_file_byaddr inet_hosts_file_byaddr bag 0 307 inet_db
inet_hosts_file_byname inet_hosts_file_byname bag 0 307 inet_db
test test set 0 307 <0.43.0>
ok
14>
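The transcript does not show what the new owner sees. As a sketch (the module name ets_give_away_demo and the messages transferred and stop are made up for this example), the receiving process gets an 'ETS-TRANSFER' message as soon as ets:give_away/3 succeeds, and can use it to learn the table identifier:

```erlang
-module(ets_give_away_demo).
-export([run/0]).

%% Create a table in the calling process, hand it to a holder
%% process, and report what the holder received.
run() ->
    Parent = self(),
    Holder = spawn(fun() ->
        receive
            {'ETS-TRANSFER', T, FromPid, GiftData} ->
                Parent ! {transferred, T, FromPid, GiftData},
                %% Keep owning the table until told to stop.
                receive stop -> ok end
        end
    end),
    Tab = ets:new(test, []),
    true = ets:insert(Tab, {k, v}),
    true = ets:give_away(Tab, Holder, gift),
    receive
        {transferred, Tab, Parent, gift} ->
            Result = {ets:info(Tab, owner) =:= Holder, ets:lookup(Tab, k)},
            Holder ! stop,
            Result
    after 1000 ->
        timeout
    end.
```

The table here is unnamed and protected (the defaults), so after the transfer the original process can still read it but can no longer write to it.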


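  This works as expected, so the shell problem is settled. But since an ETS table's lifetime is tied to the process that created it, what other ways are there to control that lifetime?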
 
ETS Owner
 
  You can also name an heir when creating the table. When the owning process dies, ownership of the table passes to the heir:
 
{heir,Pid,HeirData} | {heir,none}
Set a process as heir. The heir will inherit the table if the owner terminates. The message {'ETS-TRANSFER',tid(),FromPid,HeirData} will be sent to the heir when that happens. The heir must be a local process. Default heir is none, which will destroy the table when the owner terminates.
19> HP=spawn(fun()-> receive  after infinity -> ok end end ).
<0.60.0>
20> ets:new(demo_ets,[named_table,{heir,HP,[]}]).
demo_ets
21> ets:i().
id name type size mem owner
----------------------------------------------------------------------------
13 code set 262 10534 code_server
4110 code_names set 54 7151 code_server
8207 shell_records ordered_set 0 97 <0.26.0>
ac_tab ac_tab set 6 843 application_controller
demo_ets demo_ets set 0 307 <0.56.0>
file_io_servers file_io_servers set 0 307 file_server_2
global_locks global_locks set 0 307 global_name_server
global_names global_names set 0 307 global_name_server
global_names_ext global_names_ext set 0 307 global_name_server
global_pid_ids global_pid_ids bag 0 307 global_name_server
global_pid_names global_pid_names bag 0 307 global_name_server
inet_cache inet_cache bag 0 307 inet_db
inet_db inet_db set 29 630 inet_db
inet_hosts_byaddr inet_hosts_byaddr bag 0 307 inet_db
inet_hosts_byname inet_hosts_byname bag 0 307 inet_db
inet_hosts_file_byaddr inet_hosts_file_byaddr bag 0 307 inet_db
inet_hosts_file_byname inet_hosts_file_byname bag 0 307 inet_db
test test set 0 307 <0.43.0>
ok
22> self().
<0.56.0>
23> 1/0.
** exception error: bad argument in an arithmetic expression
in operator '/'/2
called as 1 / 0
24> ets:i().
id name type size mem owner
----------------------------------------------------------------------------
13 code set 262 10534 code_server
4110 code_names set 54 7151 code_server
8207 shell_records ordered_set 0 97 <0.26.0>
ac_tab ac_tab set 6 843 application_controller
demo_ets demo_ets set 0 307 <0.60.0>
file_io_servers file_io_servers set 0 307 file_server_2
global_locks global_locks set 0 307 global_name_server
global_names global_names set 0 307 global_name_server
global_names_ext global_names_ext set 0 307 global_name_server
global_pid_ids global_pid_ids bag 0 307 global_name_server
global_pid_names global_pid_names bag 0 307 global_name_server
inet_cache inet_cache bag 0 307 inet_db
inet_db inet_db set 29 630 inet_db
inet_hosts_byaddr inet_hosts_byaddr bag 0 307 inet_db
inet_hosts_byname inet_hosts_byname bag 0 307 inet_db
inet_hosts_file_byaddr inet_hosts_file_byaddr bag 0 307 inet_db
inet_hosts_file_byname inet_hosts_file_byname bag 0 307 inet_db
ok
25>
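In the session above the heir simply sleeps, so the 'ETS-TRANSFER' message is never read. A small sketch of an heir that does handle it (the module name ets_heir_demo and the messages inherited and stop are hypothetical):

```erlang
-module(ets_heir_demo).
-export([run/0]).

%% Let an owner process crash and check that the heir inherits
%% the table together with its contents.
run() ->
    Parent = self(),
    Heir = spawn(fun() ->
        receive
            {'ETS-TRANSFER', T, FromPid, HeirData} ->
                Parent ! {inherited, T, FromPid, HeirData},
                %% Keep owning the table until told to stop.
                receive stop -> ok end
        end
    end),
    {Owner, MRef} = spawn_monitor(fun() ->
        T = ets:new(demo_ets, [{heir, Heir, heir_data}]),
        true = ets:insert(T, {k, v}),
        exit(boom)
    end),
    receive {'DOWN', MRef, process, Owner, boom} -> ok end,
    receive
        {inherited, Tab, Owner, heir_data} ->
            Result = {ets:info(Tab, owner) =:= Heir, ets:lookup(Tab, k)},
            Heir ! stop,
            Result
    after 1000 ->
        timeout
    end.
```

The third element of the message is the pid of the process that died, and the fourth is whatever term was given as HeirData in the heir option.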
  
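   The article below, Don’t Lose Your ets Tables, sums up the ways of managing ETS tables nicely. We have covered most of them above. The third approach the author mentions is also easy to think of: dedicate a process whose only job is to "own the table".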

Don’t Lose Your ets Tables 


Name an Heir

When you create an ets table you can also name a process to inherit the table should the creating process die:

TableId = ets:new(my_table, [{heir, SomeOtherProcess, HeirData}]),

If the creating process dies, the process SomeOtherProcess will receive a message of the form

{'ETS-TRANSFER', TableId, OldOwner, HeirData}

where TableId is the table identifier returned from ets:new, OldOwner is the pid of the process that owned the table, and HeirData is the data provided with the heir option passed to ets:new. Once it receives this message, SomeOtherProcess owns the table.

Give It Away

Alternatively, you can create an ets table and then give it to some other process to keep it:

TableId = ets:new(my_table, []),
ets:give_away(TableId, SomeOtherProcess, GiftData),

 

As soon as ets:give_away is called, the process SomeOtherProcess will receive a message of the form

{'ETS-TRANSFER', TableId, OldOwner, GiftData}

where TableId is the table identifier returned from ets:new, OldOwner is the pid of the process that owned the table, and GiftData is the data provided in the ets:give_away call. Once it receives this message, SomeOtherProcess owns the table.

Table Manager

Instead of naming an heir or giving a table away, you can just have your Erlang supervisor process create a child process whose sole task is to own the table. This process creates the table as a named public table, thus allowing other processes to know its name and read/write it directly, with ets built-in concurrency protection dealing with any concurrency issues. Since the owner process does nothing more than create the table and then wait to be told to shut down, the likelihood of it crashing and taking the table with it is practically nil. The drawback here, though, is that the process actually using the table may have to coordinate with the owner process to ensure the table is available, and worse, it ends up using what is essentially a global variable (the table name), which can make code harder to read and maintain.
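A minimal sketch of such an owner process, written as a gen_server. The module name table_owner is an assumption made for this example, and my_table is the table name used in the snippets above.

```erlang
-module(table_owner).
-behaviour(gen_server).

-export([start_link/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

%% Create the named public table; from here on the process only
%% exists to keep the table alive.
init([]) ->
    my_table = ets:new(my_table, [named_table, public, set]),
    {ok, nostate}.

handle_call(_Request, _From, State) ->
    {reply, ok, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

handle_info(_Info, State) ->
    {noreply, State}.
```

Under a supervisor this could be started with a child spec such as #{id => table_owner, start => {table_owner, start_link, []}}. Any other process can then call ets:insert(my_table, ...) or ets:lookup(my_table, ...) directly.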

A Combination Approach

A nice way of managing ets tables, though, is to use a combination of the three previous techniques:

  1. The Erlang supervisor creates a table manager process. Since all this process does is manage the table, the likelihood of it crashing is very low.
  2. The table manager links itself to the table user process and traps exits, allowing it to receive an EXIT message if the table user process dies unexpectedly.
  3. The table manager creates a table, names itself (self()) as the heir, and then gives it away to the table user process.
  4. If the table user process dies, the table manager is informed of the process death and also inherits the table back.

Once it inherits the table, the table manager can then for example wait until the supervisor recreates the table user process, and then repeat the steps above to give the table to the new table user process. Other variations on this approach, like maybe a small pool of child process clones that cooperate to transfer the table between them in case of error, are of course also possible. Even though there are still process coordination issues here (but nothing difficult), I like this approach because it avoids global named tables and takes advantage of Erlang's supervision hierarchy.
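The four steps can be sketched as a gen_server. This is an illustration under assumptions, not the article author's code: the module name table_manager, the functions give_to/1 and owner/0, and the atoms inherited and handed_over are all invented for the example, and the sketch assumes the user process is alive when give_to/1 is called.

```erlang
-module(table_manager).
-behaviour(gen_server).

-export([start_link/0, give_to/1, owner/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

%% Hand the table to the table user process UserPid.
give_to(UserPid) ->
    gen_server:call(?MODULE, {give_to, UserPid}).

%% Pid of the current table owner.
owner() ->
    gen_server:call(?MODULE, owner).

init([]) ->
    process_flag(trap_exit, true),
    %% The manager names itself as heir, so the table comes back
    %% to it when the table user dies.
    Tab = ets:new(managed_table, [set, {heir, self(), inherited}]),
    {ok, #{tab => Tab, user => undefined}}.

handle_call({give_to, UserPid}, _From, #{tab := Tab, user := undefined} = S) ->
    link(UserPid),
    true = ets:give_away(Tab, UserPid, handed_over),
    {reply, ok, S#{user := UserPid}};
handle_call({give_to, _UserPid}, _From, S) ->
    {reply, {error, already_given}, S};
handle_call(owner, _From, #{tab := Tab} = S) ->
    {reply, ets:info(Tab, owner), S}.

handle_cast(_Msg, S) ->
    {noreply, S}.

%% The table user died and the manager inherited the table back.
handle_info({'ETS-TRANSFER', Tab, _OldOwner, inherited}, S) ->
    {noreply, S#{tab := Tab, user := undefined}};
%% Exit signal from the linked table user.
handle_info({'EXIT', _Pid, _Reason}, S) ->
    {noreply, S};
handle_info(_Info, S) ->
    {noreply, S}.
```

After the 'ETS-TRANSFER' message has been handled, give_to/1 can be called again with the pid of the restarted table user. The table stays unnamed, so only processes that were handed the table identifier can reach it.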

The title of my QCon talk was "Let It Crash...Except When You Shouldn't." This scenario is an example of "when you shouldn't" — losing ets data due to a process crash is easily avoided.

 

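Update, 2012-12-20 17:04:48: if all you need is to stop a shell exception from destroying your ETS tables, there is an even simpler way: don't let the shell process die. Thanks to @[京]汪(84871172) for the tip.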

Eshell V5.9  (abort with ^G)
1> catch_exception(true).
false
2> self().
<0.30.0>
3> 1/0.
* exception error: bad argument in an arithmetic expression
    in operator  '/'/2
       called as 1 / 0
4> self().