ORA-00030: User session ID does not exist.

2016-05-06 21:31  潇湘隐者

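   A colleague was running a SQL statement in Toad when the wireless network suddenly dropped, and asked me to check what had happened. The details are shown below. (Some values are replaced with xxx. The colleague was processing historical archive data under a dedicated user, so the session can be located with the following SQL.)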

SQL> SELECT B.USERNAME     , 
  2         B.SID          , 
  3         B.SERIAL#      ,
  4         LOGON_TIME     ,
  5         A.OBJECT_ID
  6    FROM V$LOCKED_OBJECT A, V$SESSION B
  7   WHERE A.SESSION_ID = B.SID AND B.USERNAME=&USERNAME
  8   ORDER BY B.LOGON_TIME;
 
USERNAME                              SID    SERIAL# LOGON_TIM  OBJECT_ID
------------------------------ ---------- ---------- --------- ----------
xxxxxx                                523      41890 06-MAY-16     825891
xxxxxx                                523      41890 06-MAY-16     825892

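After I ran the statement to kill the session, I found that the session still existed; only its SERIAL# had changed. When I tried to kill the session again, I got ORA-00030, as shown below: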

SQL> alter system kill session '523, 41890' immediate;
 
System altered.
 
 
SQL> SELECT  A.ORACLE_USERNAME  ,
  2          A.OS_USER_NAME     ,
  3          B.OWNER            ,
  4          B.OBJECT_NAME      , 
  5          A.SESSION_ID       ,
  6          A.PROCESS          ,
  7          A.LOCKED_MODE
  8    FROM V$LOCKED_OBJECT A, DBA_OBJECTS B
  9   WHERE B.OBJECT_ID = A.OBJECT_ID AND  B.OWNER=&OWNER
 10  ORDER BY  A.ORACLE_USERNAME,
 11            A.OS_USER_NAME;
 
ORACLE_USERNAME  OS_USER_NAME  OWNER         OBJECT_NAME     SESSION_ID PROCESS    LOCKED_MODE
---------------- ------------- ------------- --------------- ---------- ---------- -----------
xxxxxxxxxxxxxxx  ZhanxxxnL     xxxxxxxxxxxx  INV_xxxx_HD            523 6208:7548            3
xxxxxxxxxxxxxxx  ZhanxxxxL     xxxxxxxxxxxx  INV_xxxx_LINES         523 6208:7548            3
 
SQL> SELECT B.USERNAME    , 
  2      B.SID            , 
  3      B.SERIAL#        ,
  4      LOGON_TIME       ,
  5      A.OBJECT_ID
  6    FROM V$LOCKED_OBJECT A, V$SESSION B
  7   WHERE A.SESSION_ID = B.SID AND B.USERNAME=&USERNAME
  8   ORDER BY B.LOGON_TIME;
 
USERNAME                              SID    SERIAL# LOGON_TIM  OBJECT_ID
------------------------------ ---------- ---------- --------- ----------
xxxxxxxxxxxxxx                        523      41891 06-MAY-16     825892
xxxxxxxxxxxxxx                        523      41891 06-MAY-16     825891
 
 
SQL> alter system kill session '523, 41891' immediate;
alter system kill session '523, 41891' immediate
*
ERROR at line 1:
ORA-00030: User session ID does not exist.

I looked up the description, cause, and solution for ORA-00030 on Metalink, as shown below:

SQL> ho oerr ora 30
00030, 00000, "User session ID does not exist."
// *Cause: The user session ID no longer exists, probably because the
// session was logged out.
// *Action: Use a valid session ID.

 

The command may have been issued for one or more of the following reasons:

1. The process no longer exists at the os level, but does show up as active in v$session.

2. The user reboots the client machine without logging off, leaving a shadow process.

3. That session is holding onto a lock that needs to be released.

CAUSE

This error occurs because PMON is already trying to kill the session.

This is indicated by the fact that the serial number keeps changing.

When PMON attempts to cleanup a dead session, it will increase the serial number.

PMON may take a long time to clean up the process. If the process was doing a very large transaction at the time it aborted, then PMON has to rollback the large transaction.

When PMON makes progress, i.e. if it manages to free at least some of the process's resource, it will repeatedly keep trying to delete the process. When it finally gets to the point where it can't free up any of the process's resource (i.e. there are no more free buffers), it will print a message to the trace file and try to delete that process a second time. 

The problem is encountered when PMON lacks the resources needed to remove the process. If there are not enough buffers, then the removal of  the process is delayed. This is a free buffer problem in the data cache.

SOLUTION

Encountering an ORA-30 when attempting to manually kill a process is not necessarily a bug but a result of trying to kill a process already marked as killed. 

PMON can take anywhere from 5 minutes to over 24 hours to clean up a job. The impact is that the process being cleaned up often holds locks that prevent others from performing certain operations.

The solution is to wait for PMON to clean up the process.
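
Rather than only waiting, one way to judge whether the cleanup is making progress is to watch how much undo the session's transaction still holds. The query below is a minimal sketch and is not part of the Oracle note quoted above. It assumes the killed session still has a transaction visible in V$TRANSACTION, and it uses SID 523 from this example.

```sql
-- Undo still held by the transaction of session 523.
-- If USED_UBLK shrinks between runs, the rollback is progressing.
SELECT s.sid,
       s.serial#,
       s.status,       -- KILLED once the session has been marked for kill
       t.used_ublk,    -- undo blocks currently in use
       t.used_urec     -- undo records currently in use
  FROM v$session s
  JOIN v$transaction t ON t.addr = s.taddr
 WHERE s.sid = 523;
```

Running it a few times, some minutes apart, gives a rough idea of how much rollback work remains. If no row comes back, the session may no longer have an active transaction.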

 

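So basically all I could do was wait for the PMON process to reclaim this process. After ten minutes or so the session still had not been cleaned up, so I checked the session's details. I also found material online saying the session can be killed at the operating system level.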

SQL> select event from v$session_wait where sid=523;
 
EVENT
----------------------------------------------------------------
db file sequential read
 
SQL> select sql_text from v$session a,v$sqltext_with_newlines b
  2    where decode(a.sql_hash_value, 0, prev_hash_value, sql_hash_value)=b.hash_value
  3    and a.sid=&sid order by piece;
Enter value for sid: 523
old   3:   and a.sid=&sid order by piece
new   3:   and a.sid=523 order by piece
 
SQL_TEXT
----------------------------------------------------------------
DELETE from inv_xxx_lines WHERE (xxx) IN ( SELECT tr
ans_line_id FROM xxxx GROUP BY trans_line_id HAVING C
OUNT(xxxxx) > 1) AND ROWID NOT IN (SELECT MIN(ROWID) FRO
M xxxx GROUP BY xxx HAVING COUNT(*) > 1)

 

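So I tried killing the corresponding OS process. Afterwards I verified that the session was gone. I cannot tell whether PMON happened to finish reclaiming the session at that moment or whether killing the OS process really did the job, because I cannot reproduce the scenario. If I run into this again, I can test and verify it. I am noting it here for the record.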

 

SQL> ! kill -9 4884
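
The transcript does not show how the OS process ID passed to kill was obtained. A common approach is to join V$SESSION to V$PROCESS on the process address and read the SPID column. The query below is a sketch of that approach, offered as an assumption rather than the step actually taken here, again using SID 523.

```sql
-- Map session 523 to its server process on the database host.
-- SPID is the operating system process ID of that server process.
SELECT s.sid,
       s.serial#,
       s.status,
       p.spid
  FROM v$session s
  JOIN v$process p ON p.addr = s.paddr
 WHERE s.sid = 523;
```

This mapping is only safe to act on for dedicated server connections. Under shared server a process can serve several sessions, so killing it would affect more than the target session. It is also worth confirming that the SPID does not belong to a background process before running kill -9.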

 

References:

https://support.oracle.com/epmos/faces/DocumentDisplay?_afrLoop=533785808734847&id=1011386.6&_afrWindowMode=0&_adf.ctrl-state=13ipo04jjr_4

http://www.linuxidc.com/Linux/2011-09/43730.htm