故障处理:ADG环境ORA-04020导致数据库实例异常关闭

我们的文章会在微信公众号IT民工的龙马人生博客网站( www.htz.pw )同步更新 ,欢迎关注收藏,也欢迎大家转载,但是请在文章开始地方标注文章出处,谢谢!
由于博客中有大量代码,通过页面浏览效果更佳。

ADG环境ORA-04020导致数据库实例异常关闭

环境介绍:11.2.0.3 rac-single adg,linux环境

1,alert日志报错信息

Errors in file /oracle/app/oracle/diag/rdbms/htz/htz/trace/htz_lgwr_21296.trc:

ORA-04020: deadlock detected while trying to lock object SYS.crmadb

Mon Oct 20 02:05:33 2014

MRP0: Background Media Recovery terminated with error 449

Errors in file /oracle/app/oracle/diag/rdbms/htz/htz/trace/htz_mrp0_56025.trc:

ORA-00449: background process ‘LGWR’ unexpectedly terminated with error 4020

Managed Standby Recovery not using Real Time Apply

LGWR (ospid: 21296): terminating the instance due to error 4020

通过上面信息,可以得到LGWR因为ORA-04020错误被终止,导致数据库实例被异常关闭

查看htz_lgwr_21296.trc日志文件信息

A deadlock among DDL and parse locks is detected.

This deadlock is usually due to user errors in

the design of an application or from issuing a set

of concurrent statements which can cause a deadlock.

This should not be reported to Oracle Support.

The following information may aid in finding

the errors which cause the deadlock:

ORA-04020: deadlock detected while trying to lock object SYS.htzadb

——————————————————–

  object   waiting  waiting       blocking blocking

  handle   session     lock mode   session     lock mode

——–  ——– ——– —-  ——– ——– —-

0xf57d98d10  0xf72e00ed0 0xdd47a1ff8    S  0xf8a820b50 0xf57dd26f8    0

0xdb6f9ffa0  0xf82226890 0xdbd7714f0    S  0xf72e00ed0 0xe6a20c450    X

0xf57d98d10  0xf8a820b50 0xf57dd26f8    X  0xf82226890 0xde8de91e8    S

0xf8a820b50被0xf82226890阻塞

0xf82226890被0xf72e00ed0阻塞

0xf72e00ed0被0xf8a820b50阻塞

这里出现了死锁

下面看看每一个对象的名字

——————————————————–

———- DUMP OF WAITING AND BLOCKING LOCKS ———-

——————————————————–

————- WAITING LOCK ————-

—————————————-

SO: 0xdd47a1ff8, type: 78, owner: 0xf086a2548, flag: INIT/-/-/0x00 if: 0x3 c: 0x3

 proc=0xf699e18f8, name=LIBRARY OBJECT LOCK, file=kgl.h LINE:8547, pg=0

 

LibraryObjectLock:  Address=0xdd47a1ff8 Handle=0xf57d98d10 RequestMode=S CanBeBrokenCount=1 Incarnation=1 ExecutionCount=0  

 

  User=0xf72e00ed0 Session=0xf6adebea8 ReferenceCount=0 Flags=[0000] SavepointNum=316

  LibraryHandle:  Address=0xf57d98d10 Hash=9dbb48b6 LockMode=S PinMode=0 LoadLockMode=0 Status=0

  ObjectName:  Name=SYS.htzdb  

    FullHashValue=38f345fe8aac99a711f61f849dbb48b6 Namespace=DBINSTANCE(74) Type=CURSOR(00) Identifier=280064 OwnerIdn=0

  Statistics:  InvalidationCount=0 ExecutionCount=0 LoadCount=0 ActiveLocks=1 TotalLockCount=215156859 TotalPinCount=0

  Counters:  BrokenCount=1 RevocablePointer=1 KeepDependency=0 BucketInUse=17242 HandleInUse=17242 HandleReferenceCount=0

  Concurrency:  DependencyMutex=0xf57d98dc0(0, 0, 0, 0) Mutex=0xf57d98e40(0, 431384674, 75138, 0)

  Flags=RON/PIN/KEP/BSO/[00810003]

  WaitersLists: 

    Lock=0xf57d98da0[0xf57dd2768,0xeb7186ca8]

    Pin=0xf57d98d80[0xf57d98d80,0xf57d98d80]

    LoadLock=0xf57d98df8[0xf57d98df8,0xf57d98df8]

  Timestamp: 

  HandleReference:  Address=0xf57d98eb8 Handle=(nil) Flags=[00] ————- BLOCKING LOCK ————

—————————————-

SO: 0xf57dd26f8, type: 78, owner: 0xf72f836e8, flag: INIT/-/-/0x00 if: 0x3 c: 0x3

 proc=0xf79a35860, name=LIBRARY OBJECT LOCK, file=kgl.h LINE:8547, pg=0

 

LibraryObjectLock:  Address=0xf57dd26f8 Handle=0xf57d98d10 RequestMode=X CanBeBrokenCount=1 Incarnation=1 ExecutionCount=0  

 

  User=0xf8a820b50 Session=0xf8a820b50 ReferenceCount=0 Flags=[0100] SavepointNum=2

  LibraryHandle:  Address=0xf57d98d10 Hash=9dbb48b6 LockMode=S PinMode=0 LoadLockMode=0 Status=0

  ObjectName:  Name=SYS.htzdb  

    FullHashValue=38f345fe8aac99a711f61f849dbb48b6 Namespace=DBINSTANCE(74) Type=CURSOR(00) Identifier=280064 OwnerIdn=0

  Statistics:  InvalidationCount=0 ExecutionCount=0 LoadCount=0 ActiveLocks=1 TotalLockCount=215156859 TotalPinCount=0

  Counters:  BrokenCount=1 RevocablePointer=1 KeepDependency=0 BucketInUse=17242 HandleInUse=17242 HandleReferenceCount=0

  Concurrency:  DependencyMutex=0xf57d98dc0(0, 0, 0, 0) Mutex=0xf57d98e40(0, 431384674, 75138, 0)

  Flags=RON/PIN/KEP/BSO/[00810003]

  WaitersLists: 

    Lock=0xf57d98da0[0xf57dd2768,0xeb7186ca8]

    Pin=0xf57d98d80[0xf57d98d80,0xf57d98d80]

    LoadLock=0xf57d98df8[0xf57d98df8,0xf57d98df8]

  Timestamp: 

  HandleReference:  Address=0xf57d98eb8 Handle=(nil) Flags=[00] ————- WAITING LOCK ————-

—————————————-

SO: 0xdbd7714f0, type: 78, owner: 0xf086b0190, flag: INIT/-/-/0x00 if: 0x3 c: 0x3

 proc=0xf899dd678, name=LIBRARY OBJECT LOCK, file=kgl.h LINE:8547, pg=0

 

LibraryObjectLock:  Address=0xdbd7714f0 Handle=0xdb6f9ffa0 RequestMode=S CanBeBrokenCount=1 Incarnation=1 ExecutionCount=0  

 

  User=0xf82226890 Session=0xf7a2af120 ReferenceCount=0 Flags=[0000] SavepointNum=8c4af

  LibraryHandle:  Address=0xdb6f9ffa0 Hash=32cc0539 LockMode=X PinMode=0 LoadLockMode=0 Status=0

  ObjectName:  Name=$BUILD$. 969f67910439dee  

    FullHashValue=8bef6dd99c7e1680af8c2d9432cc0539 Namespace=SQL AREA BUILD(82) Type=CURSOR(00) Identifier=0 OwnerIdn=0

  Statistics:  InvalidationCount=0 ExecutionCount=0 LoadCount=0 ActiveLocks=1 TotalLockCount=57 TotalPinCount=0

  Counters:  BrokenCount=1 RevocablePointer=1 KeepDependency=0 BucketInUse=59 HandleInUse=59 HandleReferenceCount=0

  Concurrency:  DependencyMutex=0xdb6fa0050(0, 0, 0, 0) Mutex=0xdb6fa00d0(0, 229, 1, 0)

  Flags=RON/PIN/[00010000]

  WaitersLists: 

    Lock=0xdb6fa0030[0xdbd771560,0xdbd771560]

    Pin=0xdb6fa0010[0xdb6fa0010,0xdb6fa0010]

    LoadLock=0xdb6fa0088[0xdb6fa0088,0xdb6fa0088]

  Timestamp: 

  HandleReference:  Address=0xdb6fa0150 Handle=(nil) Flags=[00] ————- BLOCKING LOCK ————

—————————————-

SO: 0xe6a20c450, type: 78, owner: 0xf086a2548, flag: INIT/-/-/0x00 if: 0x3 c: 0x3

 proc=0xf699e18f8, name=LIBRARY OBJECT LOCK, file=kgl.h LINE:8547, pg=0

 

LibraryObjectLock:  Address=0xe6a20c450 Handle=0xdb6f9ffa0 Mode=X CanBeBrokenCount=1 Incarnation=1 ExecutionCount=0  

 

  User=0xf72e00ed0 Session=0xf6adebea8 ReferenceCount=1 Flags=CNB/[0001] SavepointNum=316

  LibraryHandle:  Address=0xdb6f9ffa0 Hash=32cc0539 LockMode=X PinMode=0 LoadLockMode=0 Status=0

  ObjectName:  Name=$BUILD$. 969f67910439dee  

    FullHashValue=8bef6dd99c7e1680af8c2d9432cc0539 Namespace=SQL AREA BUILD(82) Type=CURSOR(00) Identifier=0 OwnerIdn=0

  Statistics:  InvalidationCount=0 ExecutionCount=0 LoadCount=0 ActiveLocks=1 TotalLockCount=57 TotalPinCount=0

  Counters:  BrokenCount=1 RevocablePointer=1 KeepDependency=0 BucketInUse=59 HandleInUse=59 HandleReferenceCount=0

  Concurrency:  DependencyMutex=0xdb6fa0050(0, 0, 0, 0) Mutex=0xdb6fa00d0(0, 229, 1, 0)

  Flags=RON/PIN/[00010000]

  WaitersLists: 

    Lock=0xdb6fa0030[0xdbd771560,0xdbd771560]

    Pin=0xdb6fa0010[0xdb6fa0010,0xdb6fa0010]

    LoadLock=0xdb6fa0088[0xdb6fa0088,0xdb6fa0088]

  Timestamp: 

  HandleReference:  Address=0xdb6fa0150 Handle=(nil) Flags=[00] ————- WAITING LOCK ————-

—————————————-

SO: 0xf57dd26f8, type: 78, owner: 0xf72f836e8, flag: INIT/-/-/0x00 if: 0x3 c: 0x3

 proc=0xf79a35860, name=LIBRARY OBJECT LOCK, file=kgl.h LINE:8547, pg=0

 

LibraryObjectLock:  Address=0xf57dd26f8 Handle=0xf57d98d10 RequestMode=X CanBeBrokenCount=1 Incarnation=1 ExecutionCount=0  

 

  User=0xf8a820b50 Session=0xf8a820b50 ReferenceCount=0 Flags=[0100] SavepointNum=2

  LibraryHandle:  Address=0xf57d98d10 Hash=9dbb48b6 LockMode=S PinMode=0 LoadLockMode=0 Status=0

  ObjectName:  Name=SYS.htzdb  

    FullHashValue=38f345fe8aac99a711f61f849dbb48b6 Namespace=DBINSTANCE(74) Type=CURSOR(00) Identifier=280064 OwnerIdn=0

  Statistics:  InvalidationCount=0 ExecutionCount=0 LoadCount=0 ActiveLocks=1 TotalLockCount=215156859 TotalPinCount=0

  Counters:  BrokenCount=1 RevocablePointer=1 KeepDependency=0 BucketInUse=17242 HandleInUse=17242 HandleReferenceCount=0

  Concurrency:  DependencyMutex=0xf57d98dc0(0, 0, 0, 0) Mutex=0xf57d98e40(0, 431384674, 75138, 0)

  Flags=RON/PIN/KEP/BSO/[00810003]

  WaitersLists: 

    Lock=0xf57d98da0[0xf57dd2768,0xeb7186ca8]

    Pin=0xf57d98d80[0xf57d98d80,0xf57d98d80]

    LoadLock=0xf57d98df8[0xf57d98df8,0xf57d98df8]

  Timestamp: 

  HandleReference:  Address=0xf57d98eb8 Handle=(nil) Flags=[00] ————- BLOCKING LOCK ————

—————————————-

SO: 0xde8de91e8, type: 78, owner: 0xf086e3620, flag: INIT/-/-/0x00 if: 0x3 c: 0x3

 proc=0xf899dd678, name=LIBRARY OBJECT LOCK, file=kgl.h LINE:8547, pg=0

 

LibraryObjectLock:  Address=0xde8de91e8 Handle=0xf57d98d10 Mode=S CanBeBrokenCount=1 Incarnation=1 ExecutionCount=0  

 

  User=0xf82226890 Session=0xf82226890 ReferenceCount=1 Flags=CNB/[0001] SavepointNum=8b0a9

  LibraryHandle:  Address=0xf57d98d10 Hash=9dbb48b6 LockMode=S PinMode=0 LoadLockMode=0 Status=0

  ObjectName:  Name=SYS.htzdb  

    FullHashValue=38f345fe8aac99a711f61f849dbb48b6 Namespace=DBINSTANCE(74) Type=CURSOR(00) Identifier=280064 OwnerIdn=0

  Statistics:  InvalidationCount=0 ExecutionCount=0 LoadCount=0 ActiveLocks=1 TotalLockCount=215156859 TotalPinCount=0

  Counters:  BrokenCount=1 RevocablePointer=1 KeepDependency=0 BucketInUse=17242 HandleInUse=17242 HandleReferenceCount=0

  Concurrency:  DependencyMutex=0xf57d98dc0(0, 0, 0, 0) Mutex=0xf57d98e40(0, 431384674, 75138, 0)

  Flags=RON/PIN/KEP/BSO/[00810003]

  WaitersLists: 

    Lock=0xf57d98da0[0xf57dd2768,0xeb7186ca8]

    Pin=0xf57d98d80[0xf57d98d80,0xf57d98d80]

    LoadLock=0xf57d98df8[0xf57d98df8,0xf57d98df8]

  Timestamp: 

  HandleReference:  Address=0xf57d98eb8 Handle=(nil) Flags=[00] ——————————————————–

This lock request was aborted.

error 4020 detected in background process

ORA-04020: deadlock detected while trying to lock object SYS.htzdb

kjzduptcctx: Notifying DIAG for crash event

—– Abridged Call Stack Trace —–

ksedsts()+461<-kjzdssdmp()+267<-kjzduptcctx()+232<-kjzdicrshnfy()+53<-ksuitm()+1332<-ksbrdp()+3344<-opirip()+623<-opidrv()+603<-sou2o()+103<-opimai_real()+266<-ssthrdmain()+252<-main()+201<-__libc_start_main()+253<-_start()+36

—– End of Abridged Call Stack Trace —–

 

*** 2014-10-20 02:05:33.948

LGWR (ospid: 21296): terminating the instance due to error 4020

ksuitm: waiting up to [5] seconds before killing DIAG(21282)

可以参数MOS:ORA-04020 Deadlocks: Most Common Causes (文档 ID 166924.1)里面有详细的介绍怎么阅读此trace文件

2,查询MOS信息

通过MOS可以发现下面的BUG信息

Bug 17292239 – Active Dataguard Hangs waiting for library cache lock on DBINSTANCE. Tracking bug to deliver backports to remove fix for 11664426 (Doc ID 17292239.8)

Bug 17018214 – ORA-600 [krdrsb_end_qscn_2] ORA-4021 in Active Dataguard Standby Database with fix for bug 16717701 present – Instance may crash (Doc ID 17018214.8)

Bug 16717701 – Active Dataguard Hangs waiting for library cache lock on DBINSTANCE namespace with possible deadlock – Superseded (Doc ID 16717701.8)

Bug 11664426 – Query hangs with “kksfbc child completion” wait – superseded (Doc ID 11664426.8)

在17292239中,提到有如下的描述,在主库发现大量的DDL语句

A Standby Active Dataguard database may encounter severe library cache lock contention with possible deadlock causing the database to hang when there are a lot of DDLs on the Primary Database.

在17018214中,提到有如下的描述。

Note:
  For interim patches please use this fix instead of 16717701 in order to address both issues.
  This fix also contains the fix of bug 17292239; it is recommended then to apply this fix alone
   instead of a Merged fix between 17018214 and 17292239.
 

*** IMPORTANT ***
 This fix is DISABLED by default in ALL releases.

 
 
In 11.2.0.4 the fix is controlled by hidden parameter “_adg_parselock_timeout=<value in centiseconds>.
 
For backports/one-off patches, the fix must be enabled by setting event 16717701 at level 104887600.
 
 The value of the event encodes two things:
  1. Timeout – this is the amount of time that LGWR will wait for an X lock before signaling(发出信号) a timeout error and retrying.
  2. Sleep duration – this is the amount of time that LGWR will sleep for after having timed out.
 
 Both of the above values are denoted(表明) in MILLISECONDS. The event value encodes the sleep duration in its 12
 high-order bits and the timeout in the 20 lower order bits. The value can be calculated using the following formula:
 
   value = (S * 1048576) + T
 
 where S = sleep duration in milliseconds
          T = timeout in milliseconds
 
 Although these values may need some fine tuning for each case, we recommend that all customers initially start with
 a 30 second timeout and 100 ms sleep duration. In other words, LGWR will wait for up to 30 seconds to acquire the X-lock.
 If it fails, then it will retry after sleeping for 100 milliseconds.
 Using the above formula, the event value would then have to be set to:
 
  

value = (100 * 1048576) + 30000 = 104887600

 
 ** In a RAC ADG configuration, the event must have the same value on all instances. It may not be set or altered in a rolling manner ***

可以通过打补丁并配置event来解决

3,查询DDL语句

通过查询audit_ddl_obj可以统计ddl语句

TO_CHAR(OPR_TIME   COUNT(*)

—————- ———-

2014-10-20 01:38          4

2014-10-20 01:39          3

2014-10-20 01:43          2

2014-10-20 01:44          3

2014-10-20 01:45          6

2014-10-20 01:48          4

2014-10-20 01:49          3

2014-10-20 01:53          2

2014-10-20 01:54          3

2014-10-20 01:58          4

2014-10-20 01:59          3

2014-10-20 02:00         17

2014-10-20 02:01          5

2014-10-20 02:03          7

2014-10-20 02:04         53

2014-10-20 02:05         10

2014-10-20 02:06         15

2014-10-20 02:07         20

2014-10-20 02:08          4

2014-10-20 02:09          8

2014-10-20 02:13          2

2014-10-20 02:14          3

SQL> select sid, count(*)

  2    from system.audit_ddl_obj

  3   where to_char(opr_time, ‘yyyy-mm-dd hh24:mi’) = ‘2014-10-20 02:04’

  4   group by sid

  5   order by 2

  6  ;

 

       SID   COUNT(*)

———- ———-

      4959          3

      8028         50

SQL> select USER_NAME, ddl_type, ddl_sql, object_type, object_name

  2    from system.audit_ddl_obj

  3   where to_char(opr_time, ‘yyyy-mm-dd hh24:mi’) = ‘2014-10-20 02:04’

  4     and sid = 8028

  5  ;

 

USER_NAME                      DDL_TYPE             DDL_SQL    OBJECT_TYPE        OBJECT_NAME

—————————— ——————– ———- —————— ——————————

SYS                            CREATE                          TABLE              ORA_TEMP_2_DS_1531250

SYS                            GRANT                           OBJECT PRIVILEGE   ORA_TEMP_2_DS_1531250

SYS                            ALTER                           TABLE              ORA_TEMP_2_DS_1531250

SYS                            TRUNCATE                        TABLE              ORA_TEMP_2_DS_1531250

SYS                            DROP                            TABLE              ORA_TEMP_2_DS_1531250

SYS                            CREATE                          TABLE              ORA_TEMP_2_DS_1531251

SYS                            GRANT                           OBJECT PRIVILEGE   ORA_TEMP_2_DS_1531251

SYS                            ALTER                           TABLE              ORA_TEMP_2_DS_1531251

SYS                            TRUNCATE                        TABLE              ORA_TEMP_2_DS_1531251

通过查询,发现是由于自动统计信息导致的大量DDL语句

4,故障分析总结

通过上面分析,可以确认是由于主库大量的DDL语句使备库触发BUG17292239,通过打补丁并配置event可以解决。

------------------作者介绍-----------------------
姓名:黄廷忠
现就职:Oracle中国高级服务团队
曾就职:OceanBase、云和恩墨、东方龙马等
电话、微信、QQ:18081072613
个人博客: (http://www.htz.pw)
CSDN地址: (https://blog.csdn.net/wwwhtzpw)
博客园地址: (https://www.cnblogs.com/www-htz-pw)

posted @ 2025-08-01 15:49  认真就输  阅读(30)  评论(0)    收藏  举报