故障处理:troubleshooting row cache lock
我们的文章会在微信公众号IT民工的龙马人生和博客网站( www.htz.pw )同步更新 ,欢迎关注收藏,也欢迎大家转载,但是请在文章开始地方标注文章出处,谢谢!
由于博客中有大量代码,通过页面浏览效果更佳。
本文转自朋友的真实案例分享。
案例:troubleshooting row cache lock
本案例来自东区某客户,数据库版本为rac 19.7。周日业务反应有大量业务阻塞,通过一线同事发来的wait chain信息可以看到大量的row cache lock(CID=16)和library cache lock,并且final blocking event是row cache lock。final block session是not wait的。
首先row cache lock在之前的一篇row cache lock分析中已经介绍过了基本分析思路,这里就不多说明了,直接进入正题http://www.minniebaby.tech/2021/10/27/%e6%a1%88%e4%be%8b%ef%bc%9a%e4%b8%80%e6%ac%a1row-cache-lock%e5%bc%82%e5%b8%b8%e5%a4%84%e7%90%86/
ash分析:
select instance_NUMBER,program,event,sql_id,sql_exec_start,min(sample_time),max(sample_time),seq#,count(*)*10 from dba_hist_active_sess_history
where event='row cache lock' and p3=5 and sample_time>trunc(sysdate-1)
group by instance_NUMBER,program,event,sql_id,sql_exec_start,seq#
order by sql_exec_start;
INSTANCE_NUMBER PROGRAM EVENT SQL_ID SQL_EXEC_START MIN(SAMPLE_TIME) MAX(SAMPLE_TIME) SEQ# COUNT(*)*10
--------------- ------------------------------ -------------------- ------------- ------------------- ------------------------------ ------------------------------ ---------- -----------
2 oracle@prdres2 (J003) row cache lock b6usrg82hwsa3 2023-05-07 10:07:45 07-MAY-23 10.09.03.948 AM 07-MAY-23 04.14.06.719 PM 1239 21890
2 oracle@prdres2 (J003) row cache lock b6usrg82hwsa3 2023-05-07 10:07:45 07-MAY-23 04.14.16.729 PM 07-MAY-23 06.31.25.760 PM 1241 8230
final block session(INSTANCE_NUMBER=2、SID=2018)执行的是自动统计信息任务。以X模式请求dc_histogram_defs的row cache lock,并且持续了非常长时间的请求。一直持续到客户重启数据库。
*** 2023-05-07T10:55:08.609919+08:00 ((1))
(session) sid: 2018 ser: 26444 trans: 0x2b8ee7da0, creator: 0x310d9c5a8
flags: (0x8210041) USR/- flags2: (0x40009) -/-/INC
flags_idl: (0x1) status: BSY/-/-/- kill: -/-/-/-
DID: 0002-00B0-0003F5E70002-00B0-0003F5E6, short-term DID:
txn branch: (nil)
con_id/con_uid/con_name: 3/2615539819/PDBRES
con_logonuid: 2615539819 con_logonid: 3
con_scuid: 1 con_scid: 1
edition#: 134 user#/name: 0/SYS
oct: 170, prv: 0, sql: 0x19a891ae8, psql: 0x48a3146c0
stats: 0x1bdfa4d10, PX stats: 0x12f05d04
service name: SYS$USERS
client details:
O/S info: user: oracle, term: UNKNOWN, ospid: 392
machine: prdres2 program: oracle@prdres2 (J003)
application name: DBMS_SCHEDULER, hash value=2478762354
action name: ORA$AT_OS_OPT_SY_5210, hash value=513032857
Current Wait Stack:
0: waiting for 'row cache lock'
cache id=0x10, mode=0x0, request=0x5
pdb uid=0x9be5f86b, master hint=0x0
wait_id=1230 seq_num=1239 snap_id=1
wait times: snap=46 min 13 sec, exc=46 min 13 sec, total=46 min 13 sec
wait times: max=infinite, heur=46 min 13 sec
wait counts: calls=5570 os=5570
in_wait=1 iflags=0x215a2
There are 5 sessions blocked by this session.
Dumping one waiter:
inst: 2, sid: 3781, ser: 11134
wait event: 'library cache lock'
p1: 'handle address'=0x18c056a30
p2: 'lock address'=0x17d3fdf58
p3: '100*mode+namespace'=0x1600010002
row_wait_obj#: 4294967295, block#: 0, row#: 0, file# 0
min_blocked_time: 97 secs, waiter_cache_ver: 49043
Wait State:
fixed_waits=0 flags=0x22 boundary=(nil)/-1
Session Wait History:
elapsed time of 0.000027 sec since current wait
0: waited for 'row cache lock'
cache id=0x10, mode=0x0, request=0x5
pdb uid=0x9be5f86b, master hint=0x0
wait_id=1229 seq_num=1238 snap_id=1
wait times: snap=0.000211 sec, exc=0.000211 sec, total=0.000211 sec
wait times: max=0.000000 sec
wait counts: calls=2 os=2
occurred after 0.000012 sec of elapsed time
1: waited for 'KJC: Wait for msg sends to complete'
msg=0x353bf7860, dest|rcvr=0x10000, mtype=0x6a
wait_id=1228 seq_num=1237 snap_id=1
wait times: snap=0.000045 sec, exc=0.000045 sec, total=0.000045 sec
wait times: max=5 min 1 sec
wait counts: calls=2 os=2
occurred after 0.000004 sec of elapsed time
2: waited for 'DLM cross inst call completion'
caller instance number=0x2, cluster incarnation number=0x7, request identifier=0x14f735f
wait_id=1227 seq_num=1236 snap_id=1
wait times: snap=0.000004 sec, exc=0.000004 sec, total=0.000004 sec
wait times: max=0.000000 sec
wait counts: calls=0 os=0
occurred after 0.000021 sec of elapsed time
3: waited for 'row cache lock'
cache id=0x10, mode=0x0, request=0x5
pdb uid=0x9be5f86b, master hint=0x0
wait_id=1226 seq_num=1235 snap_id=1
wait times: snap=0.000269 sec, exc=0.000269 sec, total=0.000269 sec
wait times: max=0.000000 sec
wait counts: calls=2 os=2
occurred after 0.000033 sec of elapsed time
4: waited for 'KJC: Wait for msg sends to complete'
msg=0x353afe5a8, dest|rcvr=0x10000, mtype=0x6a
wait_id=1225 seq_num=1234 snap_id=1
wait times: snap=0.000042 sec, exc=0.000042 sec, total=0.000042 sec
wait times: max=5 min 1 sec
wait counts: calls=2 os=2
occurred after 0.000013 sec of elapsed time
5: waited for 'DLM cross inst call completion'
caller instance number=0x2, cluster incarnation number=0x7, request identifier=0x14f735e
wait_id=1224 seq_num=1233 snap_id=1
wait times: snap=0.000011 sec, exc=0.000011 sec, total=0.000011 sec
wait times: max=0.000000 sec
wait counts: calls=0 os=0
occurred after 0.000064 sec of elapsed time
6: waited for 'row cache lock'
cache id=0x10, mode=0x0, request=0x5
pdb uid=0x9be5f86b, master hint=0x0
wait_id=1223 seq_num=1232 snap_id=1
wait times: snap=0.000219 sec, exc=0.000219 sec, total=0.000219 sec
wait times: max=0.000000 sec
wait counts: calls=2 os=2
occurred after 0.000016 sec of elapsed time
7: waited for 'KJC: Wait for msg sends to complete'
msg=0x353f0abe0, dest|rcvr=0x10000, mtype=0x6a
wait_id=1222 seq_num=1231 snap_id=1
wait times: snap=0.000057 sec, exc=0.000057 sec, total=0.000057 sec
wait times: max=5 min 1 sec
wait counts: calls=2 os=2
occurred after 0.000007 sec of elapsed time
8: waited for 'DLM cross inst call completion'
caller instance number=0x2, cluster incarnation number=0x7, request identifier=0x14f735d
wait_id=1221 seq_num=1230 snap_id=1
wait times: snap=0.000004 sec, exc=0.000004 sec, total=0.000004 sec
wait times: max=0.000000 sec
wait counts: calls=0 os=0
occurred after 0.000035 sec of elapsed time
9: waited for 'row cache lock'
cache id=0x10, mode=0x0, request=0x5
pdb uid=0x9be5f86b, master hint=0x0
wait_id=1220 seq_num=1229 snap_id=1
wait times: snap=0.000372 sec, exc=0.000372 sec, total=0.000372 sec
wait times: max=0.000000 sec
wait counts: calls=1 os=1
occurred after 0.000752 sec of elapsed time
10点55分已经等待了46分钟的row cache lock
P1=16(dc_histogram_defs)
P3=5(X模式)
SOC: 0x8cb9e628, type: row cache enqueues (111), map: 0x17bf35d28
state: LIVE (0x99fc), flags: INIT (0x1)
row cache enqueue: count=1 session=0x351263950 object=0x197150cc0, request=X
flag=03 WAI/TRC/-/-/-/-/-/- savepoint=0x4076e
type=MULTI-INSTANCE instance locked=F handle=0x8cb9e6f8
row cache parent object: addr=0x197150cc0 cid=16(dc_histogram_defs) conid=3 conuid=2615539819
hash=a3de4515 typ=61 transaction=(nil) flags=00000002 inc=1, pdbinc=1
version=2 mtx version=15688
own=0x197150d90[0x197150d90,0x197150d90] wat=0x197150da0[0x8cb9e6a8,0x8cb9e6a8] mode=N req=X
status=VALID/-/-/-/-/-/-/-/- KGH pinned
instance lock=QQ 6a7e84e3 f2908b26
set=0, complete=FALSE
set=1, complete=FALSE
hash=a3de4515这个是关键信息,用于后续分析的匹配和搜索。
那么为什么J003请求不到X模式的row cache lock呢?原因只能是当前有进程正在以S或者X模式持有了该队列。那么在ash blocking为空的情况下如何找出这个进程呢?
我的方法是使用hash=a3de4515去搜搜trace下面的所有trace看看能不能有发现,最终在diag trace中找到了持有的进程。
SOC: 0x3398f0508, type: row cache enqueues (111), map: 0x19faa3ba8
state: LIVE (0x99fc), flags: INIT (0x1)
row cache enqueue: count=1 session=0x1c106b318 object=0x4d9d89fb0, mode=S
flag=80 -/-/-/-/-/-/-/ILK savepoint=0x8fe64da
type=MULTI-INSTANCE instance locked=T handle=0x3398f05d8
row cache parent object: addr=0x4d9d89fb0 cid=16(dc_histogram_defs) conid=3 conuid=2615539819
hash=a3de4515 typ=61 transaction=(nil) flags=00000002 inc=1, pdbinc=1
version=2 mtx version=698
own=0x4d9d8a080[0x3398f0588,0x3398f0588] wat=0x4d9d8a090[0x4d9d8a090,0x4d9d8a090] mode=S req=N
status=VALID/-/-/-/-/-/-/-/- KGH pinned
instance lock=QQ 6a7e84e3 f2908b26
set=0, complete=FALSE
set=1, complete=FALSE
hash和instance lock和请求的完全匹配,通过SO和OWNER,往上搜索可以定位到process为GEN1,再通过process的SO可以找到gen1的session state信息。
SO: 0x1bff52c30, type: session (4), map: 0x1c106b318
state: LIVE (0x4532), flags: 0x1
owner: 0x1bfee0180, proc: 0x1bfee0180
link: 0x1bff52c50[0x1bf6fb2a8, 0x1bf6fb228]
child list count: 7, link: 0x1bff52ca0[0x19faa3a48, 0x1bd6d1020]
conid: 1, conuid: 1, SGA version=(1,0), pg: 0
SOC: 0x1c106b318, type: session (4), map: 0x1bff52c30
state: LIVE (0x99fc), flags: INIT (0x1)
(session) sid: 631 ser: 35354 trans: (nil), creator: 0x1b0ea3d88
flags: (0x51) USR/- flags2: (0x409) -/-/INC
flags_idl: (0x1) status: BSY/-/-/- kill: -/-/-/-
0001-000A-000000020001-000A-00000003 DID: , short-term DID:
txn branch: (nil)
con_id/con_uid/con_name: 3/2615539819/PDBRES
con_logonuid: 1 con_logonid: 1
con_scuid: 1 con_scid: 1
edition#: 134 user#/name: 0/SYS
oct: 3, prv: 0, sql: 0x3398059e8, psql: (nil)
stats: 0x1dfe645a0, PX stats: 0x12f05d04
service name: SYS$BACKGROUND
Current Wait Stack:
0: waiting for 'gc cr request'
file#=0x9, block#=0x8a06, class#=0x1
wait_id=50342261 seq_num=11438 snap_id=1
wait times: snap=0.147084 sec, exc=0.147084 sec, total=0.147084 sec
wait times: max=infinite, heur=0.147084 sec
wait counts: calls=4 os=4
in_wait=1 iflags=0x5a0
Wait State:
fixed_waits=0 flags=0x22 boundary=(nil)/-1
Session Wait History:
elapsed time of 0.000099 sec since current wait
0: waited for 'gc cr block lost'
=0x9, =0x8a06, =0x1
wait_id=50342260 seq_num=11437 snap_id=1
wait times: snap=0.512788 sec, exc=0.512788 sec, total=0.512788 sec
wait times: max=infinite
wait counts: calls=5 os=5
occurred after 0.000090 sec of elapsed time
1: waited for 'gc cr block lost'
=0x9, =0x8a06, =0x1
wait_id=50342259 seq_num=11436 snap_id=1
wait times: snap=0.515190 sec, exc=0.515190 sec, total=0.515190 sec
wait times: max=infinite
wait counts: calls=5 os=5
occurred after 0.000062 sec of elapsed time
2: waited for 'gc cr block lost'
=0x9, =0x8a06, =0x1
wait_id=50342258 seq_num=11435 snap_id=1
wait times: snap=0.518847 sec, exc=0.518847 sec, total=0.518847 sec
wait times: max=infinite
wait counts: calls=5 os=5
occurred after 0.000065 sec of elapsed time
3: waited for 'gc cr block lost'
=0x9, =0x8a06, =0x1
wait_id=50342257 seq_num=11434 snap_id=1
wait times: snap=0.517242 sec, exc=0.517242 sec, total=0.517242 sec
wait times: max=infinite
wait counts: calls=5 os=5
occurred after 0.000060 sec of elapsed time
4: waited for 'gc cr block lost'
=0x9, =0x8a06, =0x1
wait_id=50342256 seq_num=11433 snap_id=1
wait times: snap=0.512952 sec, exc=0.512952 sec, total=0.512952 sec
wait times: max=infinite
wait counts: calls=5 os=5
occurred after 0.000060 sec of elapsed time
5: waited for 'gc cr block lost'
=0x9, =0x8a06, =0x1
wait_id=50342255 seq_num=11432 snap_id=1
wait times: snap=0.515456 sec, exc=0.515456 sec, total=0.515456 sec
wait times: max=infinite
wait counts: calls=5 os=5
occurred after 0.000074 sec of elapsed time
6: waited for 'gc cr block lost'
=0x9, =0x8a06, =0x1
wait_id=50342254 seq_num=11431 snap_id=1
wait times: snap=0.515304 sec, exc=0.515304 sec, total=0.515304 sec
wait times: max=infinite
wait counts: calls=5 os=5
occurred after 0.000063 sec of elapsed time
7: waited for 'gc cr block lost'
=0x9, =0x8a06, =0x1
wait_id=50342253 seq_num=11430 snap_id=1
wait times: snap=0.517334 sec, exc=0.517334 sec, total=0.517334 sec
wait times: max=infinite
wait counts: calls=5 os=5
occurred after 0.000057 sec of elapsed time
8: waited for 'gc cr block lost'
=0x9, =0x8a06, =0x1
wait_id=50342252 seq_num=11429 snap_id=1
wait times: snap=0.517039 sec, exc=0.517039 sec, total=0.517039 sec
wait times: max=infinite
wait counts: calls=5 os=5
occurred after 0.000067 sec of elapsed time
9: waited for 'gc cr block lost'
=0x9, =0x8a06, =0x1
wait_id=50342251 seq_num=11428 snap_id=1
wait times: snap=0.512986 sec, exc=0.512986 sec, total=0.512986 sec
wait times: max=infinite
wait counts: calls=5 os=5
occurred after 0.000066 sec of elapsed time
Sampled Session History of session 631 serial 35354
---------------------------------------------------
The sampled session history is constructed by sampling
the target session every 1 second. The sampling process
captures at each sample if the session is in a non-idle wait,
an idle wait, or not in a wait. If the session is in a
non-idle wait then one interval is shown for all the samples
the session was in the same non-idle wait. If the
session is in an idle wait or not in a wait for
consecutive samples then one interval is shown for all
the consecutive samples. Though we display these consecutive
samples in a single interval the session may NOT be continuously
idle or not in a wait (the sampling process does not know).
The history is displayed in reverse chronological order.
sample interval: 1 sec, max history 120 sec
---------------------------------------------------
[1 sample, 11:12:58]
waited for 'gc cr request', seq_num: 11438
p1: 'file#'=0x9
p2: 'block#'=0x8a06
p3: 'class#'=0x1
[1 sample, 11:12:57]
waited for 'gc cr block lost', seq_num: 11436
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample, 11:12:56]
waited for 'gc cr block lost', seq_num: 11434
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample, 11:12:55]
waited for 'gc cr block lost', seq_num: 11432
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample, 11:12:54]
waited for 'gc cr block lost', seq_num: 11430
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample, 11:12:53]
waited for 'gc cr block lost', seq_num: 11428
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample, 11:12:52]
waited for 'gc cr block lost', seq_num: 11426
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample, 11:12:51]
waited for 'gc cr block lost', seq_num: 11424
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample, 11:12:50]
waited for 'gc cr block lost', seq_num: 11422
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample, 11:12:49]
waited for 'gc cr block lost', seq_num: 11420
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample, 11:12:48]
waited for 'gc cr block lost', seq_num: 11418
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample, 11:12:47]
waited for 'gc cr block lost', seq_num: 11416
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample, 11:12:46]
waited for 'gc cr block lost', seq_num: 11414
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample, 11:12:45]
waited for 'gc cr block lost', seq_num: 11412
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample, 11:12:44]
waited for 'gc cr block lost', seq_num: 11410
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample, 11:12:43]
waited for 'gc cr block lost', seq_num: 11409
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample, 11:12:42]
waited for 'gc cr block lost', seq_num: 11407
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample, 11:12:41]
waited for 'gc cr block lost', seq_num: 11405
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample, 11:12:40]
waited for 'gc cr block lost', seq_num: 11403
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample, 11:12:39]
waited for 'gc cr block lost', seq_num: 11401
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample, 11:12:38]
waited for 'gc cr block lost', seq_num: 11399
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample, 11:12:37]
waited for 'gc cr block lost', seq_num: 11397
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample, 11:12:36]
waited for 'gc cr block lost', seq_num: 11395
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample, 11:12:35]
waited for 'gc cr block lost', seq_num: 11393
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample, 11:12:34]
waited for 'gc cr block lost', seq_num: 11391
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample, 11:12:33]
waited for 'gc cr block lost', seq_num: 11389
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample, 11:12:32]
waited for 'gc cr block lost', seq_num: 11387
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample, 11:12:31]
waited for 'gc cr block lost', seq_num: 11385
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample, 11:12:30]
waited for 'gc cr block lost', seq_num: 11383
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample, 11:12:29]
waited for 'gc cr block lost', seq_num: 11381
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample, 11:12:28]
waited for 'gc cr block lost', seq_num: 11380
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample, 11:12:27]
waited for 'gc cr block lost', seq_num: 11378
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample, 11:12:26]
waited for 'gc cr block lost', seq_num: 11376
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample, 11:12:25]
waited for 'gc cr block lost', seq_num: 11375
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample, 11:12:24]
waited for 'gc cr block lost', seq_num: 11373
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample, 11:12:23]
waited for 'gc cr block lost', seq_num: 11372
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample, 11:12:22]
waited for 'gc cr block lost', seq_num: 11370
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample, 11:12:21]
waited for 'gc cr block lost', seq_num: 11368
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample, 11:12:20]
waited for 'gc cr block lost', seq_num: 11366
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
[1 sample, 11:12:19]
waited for 'gc cr block lost', seq_num: 11364
p1: ''=0x9
p2: ''=0x8a06
p3: ''=0x1
大量的gc cr block lost,通常出现block lost都与网络丢包有关系。但是我反复查看了oswnetstat,并未发现异常,而且gc cr block lost只在gen1进程的等待中出现。
查看gen1的trace发现异常阶段有大量的如下输出:
*** 2023-05-07T08:34:12.831763+08:00 (PDBRES(3))
IPCLW:[0.30861]{-}[WAIT]:PROTO: [1683419652831702]ipclw_data_chunk_process:1165 Discarding msg with seq # 3515351735, expecting 544659863
*** 2023-05-07T08:34:13.347836+08:00 (PDBRES(3))
IPCLW:[0.30862]{-}[WAIT]:PROTO: [1683419653347772]ipclw_data_chunk_process:1165 Discarding msg with seq # 3515352003, expecting 544659863
IPCLW:[0.30863]{-}[WAIT]:PROTO: [1683419653863530]ipclw_data_chunk_process:1165 Discarding msg with seq # 3515352455, expecting 544659863
*** 2023-05-07T08:34:14.379122+08:00 (PDBRES(3))
IPCLW:[0.30864]{-}[WAIT]:PROTO: [1683419654379038]ipclw_data_chunk_process:1165 Discarding msg with seq # 3515352766, expecting 544659863
IPCLW:[0.30865]{-}[WAIT]:PROTO: [1683419654896105]ipclw_data_chunk_process:1165 Discarding msg with seq # 3515353098, expecting 544659863
*** 2023-05-07T08:34:15.409055+08:00 (PDBRES(3))
IPCLW:[0.30866]{-}[WAIT]:PROTO: [1683419655408955]ipclw_data_chunk_process:1165 Discarding msg with seq # 3515353384, expecting 544659863
IPCLW:[0.30867]{-}[WAIT]:PROTO: [1683419655926455]ipclw_data_chunk_process:1165 Discarding msg with seq # 3515353701, expecting 544659863
*** 2023-05-07T08:34:16.437884+08:00 (PDBRES(3))
IPCLW:[0.30868]{-}[WAIT]:PROTO: [1683419656437736]ipclw_data_chunk_process:1165 Discarding msg with seq # 3515353931, expecting 544659863
IPCLW:[0.30869]{-}[WAIT]:PROTO: [1683419656951721]ipclw_data_chunk_process:1165 Discarding msg with seq # 3515354273, expecting 544659863
*** 2023-05-07T08:34:17.473483+08:00 (PDBRES(3))
IPCLW:[0.30870]{-}[WAIT]:PROTO: [1683419657473412]ipclw_data_chunk_process:1165 Discarding msg with seq # 3515354587, expecting 544659863
IPCLW:[0.30871]{-}[WAIT]:PROTO: [1683419657986986]ipclw_data_chunk_process:1165 Discarding msg with seq # 3515354878, expecting 544659863
时间与gen1开始反复等待gc cr block lost吻合。搜索mos发现与Bug 34482043 : SOME “GC CR MULTI BLOCK MIXED” WAIT OUTLIERS FOLLOWED BY “IPCLW DISCARDING MSG” CAUSE APERFORMANCE REGRESSION IN 19C非常相似。
@ RELEASE NOTES:
@ ]] CR Multi block read performance degradation.
@ ]]
@ .
@ INTERNAL PROBLEM DESCRIPTION:
@ Discarding out of order messages in IPCLW causes CR lost block. The block
@ will be requested
@ again only after receiving the side channel message which by default was
@ sent 500 milliseconds later.
@
@ .
@ INTERNAL FIX DESCRIPTION:
@ Deliver out of order messages if they are received within a certain window
@ size.
workaround是升级到19.18。
gc cr block lost只有gen1进程的sql出现过,sql是同一个,我忘记记录了,但是我很清楚的记得与user$.SPARE6有关系,个人猜测可能禁用掉”_disable_last_successful_login_time”特性也可以。
------------------作者介绍-----------------------
姓名:黄廷忠
现就职:Oracle中国高级服务团队
曾就职:OceanBase、云和恩墨、东方龙马等
电话、微信、QQ:18081072613
个人博客: (http://www.htz.pw)
CSDN地址: (https://blog.csdn.net/wwwhtzpw)
博客园地址: (https://www.cnblogs.com/www-htz-pw)

浙公网安备 33010602011771号