了解Maclean Liu|向Maclean Liu提问 Oracle ALLSTARS 全明星(群内有多位Oracle高级售后support,N位OCM和ACE) QQ群 # QQ群号:23549328 # 已经升级到 2000人群,空位多多。欢迎有一定基础的Oracle骨友加入,现在入群需要经过Maclean的技术面试,欢迎面试,请加QQ号:47079569 为好友参加面试 2群基础群 适合刚入门的同学,会共享最佳入门实践和资料 QQ群 # QQ群号:171092051 # 已经升级到 500人的超级群,空位多多,无需面试

为什么说log file sync(其实是写redo慢)会造成buffer busy wait?

《gc buffer busy/gcs log flush sync与log file sync》一文中我介绍了 redo flush慢造成RAC中gc buffer busy争用的原理, 而在《【技术分享】开Oracle调优鹰眼,深入理解AWR性能报告》 中我又介绍了 log file sync(其实本质是lgwr 写redo慢)也会造成单实例single instance环境中的buffer busy wait等待, 这是为什么呢?   我们来做一个演示说明该问题:   示例用表: conn maclean/oracle create table  maclog (t1 int);   打开一个Session A 并获得它的OS进程号 [oracle@vrh8 ~]$ sqlplus / as sysdba SQL*Plus: Release 10.2.0.5.0 - Production on Sat Mar 9 09:32:25 2013 Copyright (c) 1982, 2010, Oracle. All Rights Reserved. Connected to: Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production With the Partitioning, OLAP, Data Mining and Real Application Testing options     [oracle@vrh8 ~]$ ps -ef|grep LOCAL oracle 18441 18438 0 09:41 ? 00:00:00 oracleG10R25 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq))) oracle 18445 18348 0 09:42 pts/3 00:00:00 grep LOCAL   使用 gdb debug 该18375的前台进程,并指定breakpoint [oracle@vrh8 ~]$ gdb $ORACLE_HOME/bin/oracle 18441 GNU gdb (GDB) Red Hat Enterprise Linux (7.0.1-37.el5) (gdb) b kcrf_commit_force Breakpoint 1 at 0x286519c   之后在session A执行:   SQL> insert into maclog values(1); 1 row created. 若insert hang,则在GDB 中输入多次C  - continue 这样能够继续   输入 commit; ==》 此时session A会hang住   SQL> commit;   ====》HANG   在GDB 中输入bt 可以看到stack call (gdb) bt #0 0x000000000286519c in kcrf_commit_force () #1 0x00000000028620ef in kcrfw_redo_gen () #2 0x00000000010e7dba in kcbchg1_main () #3 0x00000000010e6d99 in kcbchg () #4 0x000000000143f65a in ktucmt () #5 0x00000000013c7a06 in ktcCommitTxn () #6 0x00000000042a559e in ktdcmt () #7 0x00000000024fe09c in k2lcom () #8 0x0000000002418993 in k2send () #9 0x0000000001418b47 in xctctl () #10 0x00000000014174dd in xctcom_with_options () #11 0x000000000211fc26 in kksExecuteCommand () #12 0x00000000030ef87a in opiexe () #13 0x0000000003232d47 in kpoal8 () #14 0x00000000013b7c10 in opiodr () #15 0x0000000003c3c9da in ttcpip () #16 0x00000000013b3144 in opitsk () #17 0x00000000013b60ec in opiino () #18 0x00000000013b7c10 in opiodr () #19 0x00000000013a92f8 in opidrv () #20 0x0000000001fa3936 in sou2o () #21 0x000000000072d40b in opimai_real () #22 0x000000000072d35c in main () 其stack call为ktcCommitTxn=> ktucmt => kcbchg => kcbchg1_main => kcrfw_redo_gen => kcrf_commit_force   kcbchg==> block change ,为什么要发生block change呢? 因为commit需要对在Buffer Cache里的block做immediate block cleanout 此时开一个session B  查询maclog表 SQL> select * from maclog; ==》阻塞在buffer busy wait上   这样就通过 无法完成的immediate block cleanout 去pin住buffer ,来形成了一个buffer busy wait   做一个system state dump : SQL> oradebug setmypid Statement processed. SQL> SQL> oradebug dump systemstate 266; Statement processed. SQL> oradebug tracefile_name /s01/admin/G10R25/udump/g10r25_ora_18551.trc   分析该systemstate dump   Session B一直在等 buffer busy wait SO: 0xaa42fff8, type: 4, owner: 0xaa3048f8, flag: INIT/-/-/0x00 (session) sid: 164 trans: (nil), creator: 0xaa3048f8, flag: (100051) USR/- BSY/-/-/-/-/- DID: 0001-0008-00000002, short-term DID: 0000-0000-00000000 txn branch: (nil) oct: 0, prv: 0, sql: (nil), psql: (nil), user: 0/SYS service name: SYS$BACKGROUND waiting for 'buffer busy waits' wait_time=0, seconds since wait started=245 file#=2, block#=89, class#=21 blocking sess=0x(nil) seq=12413 Dumping Session Wait History for 'buffer busy waits' count=1 wait_time=0.215431 sec file#=2, block#=89, class#=21 for 'buffer busy waits' count=1 wait_time=0.977438 sec file#=2, block#=89, class#=21 for 'buffer busy waits' count=1 wait_time=0.977538 sec file#=2, block#=89, class#=21 for 'buffer busy waits' count=1 wait_time=0.977512 sec file#=2, block#=89, class#=21 for 'buffer busy waits' count=1 wait_time=0.977480 sec file#=2, block#=89, class#=21 for 'buffer busy waits' count=1 wait_time=0.977488 sec file#=2, block#=89, class#=21 for 'buffer busy waits' count=1 wait_time=0.977639 sec file#=2, block#=89, class#=21 其stack call为: Short stack dump: ksdxfstk()+32<-ksdxcb()+1573<-sspuser()+111<-__restore_rt()+0<-__GI_semtimedop()+10<-sskgpwwait()+265<-skgpwwait()+162<-kslwaitns_timed()+1102<-kskthbwt()+246<-kslwait( )+228<-kcbzwb()+1496<-kcbgtcr()+23190<-ktugct()+588<-ktbgcl1()+4711<-ktrgcm()+1979<-ktrget()+486<-kdst_fetch()+524<-kdstf0000001kmP()+3137<-kdsttgr()+2427<-qertbFetch() +650<-qergsFetch()+444<-opifch2()+2944<-opiall0()+2206<-opikpr()+642<-opiodr()+1184<-rpidrus()+196<-skgmstack()+158<-rpidru()+116<-rpiswu2()+409<-kprball()+1270<-kkescF etch()+83<-kkedsamp()+6304<-kkedsSel()+1495<-kkecdn()+3055<-kkotap()+859<-kkoiqb()+9830<-kkooqb()+904<-kkoqbc()+2093<-apakkoqb()+167<-apaqbdDescendents()+414<-apaqbdLis tReverse()+68<-apadrv()+573<-opitca()+1545<-kksLoadChild()+9714<-kxsGetRuntimeLock()+1454<-kksfbc()+14910<-kkspsc0()+979<-kksParseCursor()+142<-opiosq0()+1641<-kpooprx( )+318<-kpoal8()+964<-opiodr()+1184<-ttcpip()+1226<-opitsk()+1310<-opiino()+1024<-opiodr()+1184<-opidrv()+548<-sou2o()+114<-opimai_real()+163<-main()+116<-__libc_start_m ain()+244<-_start()+41     file=2 block=0x89 即 137 md: EXCL  被  owner: 0xaa30b888  PID=22持有 SO: 0xaadd91d8, type: 24, owner: 0xaa44f240, flag: INIT/-/-/0xc0 (buffer) (CR) PR: 0xaa30c878 FLG: 0x0 class bit: (nil) kcbbfbp: [BH: 0x69fe2948, LINK: 0xaadd9218] (WAITING) where: ktuwh05: ktugct, why: 0 BH (0x69fe2948) file#: 2 rdba: 0x00800089 (2/137) class: 33 ba: 0x69cc6000 set: 3 blksize: 8192 bsi: 0 set-flg: 0 pwbcnt: 0 dbwrid: 0 obj: -1 objn: 0 tsn: 1 afn: 2 hash: [aacc0958,aacc0958] lru: [75fc3228,7eff15a8] obj-flags: object_ckpt_list ckptq: [6ef77368,aadf28f8] fileq: [6ef77378,aadf2938] objq: [a7bdd678,6afdf518] use: [aaddb748,aaddb748] wait: [aaddef10,aadd9218] st: XCURRENT md: EXCL tch: 465 flags: mod_started gotten_in_current_mode block_written_once redo_since_read change state: ACTIVE change count: 1 LRBA: [0x2d5.10d5a.0] HSCN: [0x0.242e76a] HSUB: [1] Using State Objects ---------------------------------------- SO: 0xaaddb708, type: 24, owner: 0xaa454568, flag: INIT/-/-/0x00 (buffer) PR: 0xaa30b888 FLG: 0x1000 class bit: (nil) kcbbfbp: [BH: 0x69fe2948, LINK: 0xaaddb748] where: ktuwh02: ktugus, why: 0 Waiting State Objects ---------------------------------------- SO: 0xaaddeed0, type: 24, owner: 0xaa450598, flag: INIT/-/-/0xc0 (buffer) PR: 0xaa3048f8 FLG: 0x0 class bit: (nil) kcbbfbp: [BH: 0x69fe2948, LINK: 0xaaddef10] (WAITING) where: ktuwh02: ktugus, why: 0   简单来说这个示例说明了几点:
  1. OLTP类型的小DML操作一般都会是immediate block cleanout的,这要求在commit之前对block做change kcbchg
  2. 在commit kcrf_commit_force完成前都不会释放对该block buffer的buffer pin
  3. 由上述2点造成的buffer pin最终会影响select和其他insert/update/delete 形成buffer busy wait
  4. 由于慢的lgwr写redo log会造成 kcrf_commit_force commit的缓慢,表现在等待事件上就是log file sync
  5. 由于block cleanout时pin block buffer且commit 慢,则会导致更长时间的buffer busy wait
  6. 若log file sync是由lgwr 写redo log慢(log file parallel write)引起的,则它的另一个效应就是buffer busy wait增多
  7. 若看到AWR中log file sync+buffer busy wait是主要等待事件,则优先解决log file sync ,因为buffer busy wait实际可能是受害者
  AWR中与commit cleanout相关的 Instance activity 有好几个   commit cleanout failures: block lost commit cleanout failures: buffer being written commit cleanout failures: callback failure commit cleanout failures: cannot pin commit cleanouts commit cleanouts successfully completed  

posted on 2013-03-19 00:52  Oracle和MySQL  阅读(664)  评论(0编辑  收藏  举报

导航