Common Wait Events [Repost]

http://space.itpub.net/?uid-55472-action-viewspace-itemid-432997

Common Wait Events

Buffer busy waits

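This event occurs when a session needs a data block in the buffer cache while another session is using that block.

It arises in two situations:

- Another session is modifying the block in the buffer cache. While modifying it, that session sets a flag in the block header to keep other sessions from conflicting with the change;

- Another session is reading the block from a datafile into the buffer cache. Before 10g this case produced buffer busy waits; from 10g on it is reported as the read by other session event instead.

Do not confuse buffer busy waits with buffer busy, which is raised when a session accesses cached metadata through ASM.

Buffer types that commonly produce buffer busy waits include data blocks, segment header, undo blocks and undo header.

Parameters: P1-file number, P2-block number, P3-before 10g, a numeric code for the reason for the wait; from 10g on, the block class (class#).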

Timeout: 100cs or 1 second.
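
Because P1 and P2 identify a file and a block, they can be mapped back to the segment being contended for. The query below is a minimal sketch, assuming access to DBA_EXTENTS and that &P1 and &P2 are substitution variables filled in by hand from v$session_wait; it is not taken from the original article.

```sql
-- Map the file# (P1) and block# (P2) of a buffer busy wait to its segment.
-- &P1 and &P2 are placeholders to be replaced with values from v$session_wait.
SELECT owner, segment_name, segment_type
FROM   dba_extents
WHERE  file_id = &P1
AND    &P2 BETWEEN block_id AND block_id + blocks - 1;
```

On databases with many extents this lookup can be slow, so it is better run for a few sampled waits than in a loop.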

Control file parallel write

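A session waits for writes to the control files to complete. The control files are written mainly in the following cases:

- CKPT writes the checkpoint position of the redo logs to the control file every 3 seconds, for use in recovery;

- For NOLOGGING or UNRECOVERABLE DML operations, Oracle writes the associated SCNs to the control file;

- RMAN writes backup and recovery information to the control file.

Control file parallel write is a wait on the OS and on I/O, not on other sessions. While a session is writing to the control file it holds the CF enqueue, so other sessions wait on that enqueue.

If a lot of time is spent on this wait, either the system writes to the control file too often or control file writes are too slow.

Parameters: P1-number of control files; P2-total number of blocks to write to the control files; P3-number of I/O requests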
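
To see which session holds the CF enqueue and which sessions are queued behind it, v$lock can be filtered on the lock type. This is a sketch under the assumption that the CF enqueue is visible in v$lock on the release in use.

```sql
-- LMODE > 0 marks a session holding the CF enqueue;
-- REQUEST > 0 marks a session waiting for it.
SELECT sid, type, id1, id2, lmode, request, ctime, block
FROM   v$lock
WHERE  type = 'CF'
ORDER  BY block DESC, ctime DESC;
```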

Db file parallel read

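This event appears in two situations:

- During database recovery, when the blocks needed for recovery are read from the datafiles in parallel;

- When a user process reads many non-contiguous single blocks from one or more datafiles.

Parameters: P1-number of files to read from; P2-total number of blocks to read; P3-total number of I/O requests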

Db file parallel write

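Raised when DBWR writes dirty blocks to the datafiles; the cause is system I/O. DBWR gathers a set of dirty blocks into a "write batch", issues the I/O requests that write the batch to the datafiles, and waits until the I/O has completed. With asynchronous I/O, DBWR does not wait for the writes to complete before putting the free buffers back on the LRU chain for sessions to use.

Parameters: P1-number of files to write to; P2-total number of blocks to write; P3-timeout value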

Db file scattered read

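This event appears when a session issues an I/O request that reads many data blocks. The blocks read are scattered into the buffer cache, where they are not contiguous. It is typically seen during full table scans and index fast full scans. The db_file_multiblock_read_count parameter sets the maximum number of blocks read per request.

I/O waits on datafiles are normal, and this event by itself does not mean the database has a problem. If the time spent on it is clearly higher than the time spent on other wait events, however, the cause needs further investigation.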

参数：P1-file number to read the blocks from; P2-starting block number to begin reading; P3-number of blocks to read
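
Whether an I/O event dominates is easiest to judge by ranking all events by time waited. The following is a minimal sketch against v$system_event; TIME_WAITED is in centiseconds, the cutoff of 10 rows is arbitrary, and idle events such as SQL*Net message from client are not filtered out, so they should be ignored when reading the result.

```sql
-- Top 10 wait events since instance startup, ordered by total time waited.
SELECT *
FROM  (SELECT event, total_waits, time_waited, average_wait
       FROM   v$system_event
       ORDER  BY time_waited DESC)
WHERE  ROWNUM <= 10;
```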

Db file sequential read

(Quoted from the original text; I do not understand it well enough and was afraid of mistranslating it.) The db file sequential read wait event occurs when the process waits for an I/O completion for a sequential read. The name is a bit misleading, suggesting a multiblock operation, but this is a single block read operation. The event gets posted when reading from an index, rollback or undo segments, table access by rowid, rebuilding control files, dumping datafile headers, or the datafile headers.

Likewise, this event by itself does not mean the database has a problem. If the time spent on it is clearly higher than the time spent on other wait events, however, the cause needs further investigation.

Parameters: P1-file number to read the blocks from; P2-starting block number to begin reading; P3-number of blocks to read, which is 1 in the vast majority of cases

Db file single write

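Issued by DBWR, usually when Oracle updates the datafile headers, most commonly at checkpoint time. This event tends to appear when the database has an unusually large number of datafiles.

Parameters: P1-file number to write to, P2-starting block number, P3-number of blocks to write, usually 1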

Direct path read

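The direct path read event is raised when a session reads data directly into the PGA rather than into the buffer cache in the SGA. Depending on the hardware platform and on the DISK_ASYNCH_IO parameter, direct path reads are performed either synchronously or asynchronously. They are generally caused by operations that use temporary segments on disk, such as sorts, parallel queries and hash joins.

The number of waits and the time waited for this event differ between the synchronous and asynchronous cases:

- In the synchronous case, the session issues a request and waits for the I/O to finish, but that waiting time is not charged to the direct path read event. The session posts the direct path read event only after the I/O has completed and it has the data it needs. There should therefore be one wait for every read request, but each recorded wait time is very short.

- In the asynchronous case, the session keeps issuing multiple read requests and then processes the data that has already been read into the PGA. It posts a direct path read wait only when it finds that the data it needs has not yet been read into the PGA. The number of read requests and the number of waits therefore differ.

For these reasons, the statistics for this wait event in v$system_event and v$session_event are not reliable.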
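
Since the wait statistics are unreliable here, one alternative is to look at the direct I/O counters in the session statistics instead. This is a sketch of that idea, not something the original article prescribes; it assumes the statistic names 'physical reads direct' and 'physical writes direct' as they appear in v$statname.

```sql
-- Sessions ranked by the number of blocks read or written via direct path.
SELECT s.sid, n.name, s.value
FROM   v$sesstat s, v$statname n
WHERE  n.statistic# = s.statistic#
AND    n.name IN ('physical reads direct', 'physical writes direct')
AND    s.value > 0
ORDER  BY s.value DESC;
```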

Direct path write

The counterpart of direct path read. It is caused by direct data loads (inserts with the APPEND hint, or CTAS) or by parallel DML operations that need to write to temporary segments.

Enqueue

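An enqueue wait is raised when a session needs a database resource that another session is currently using.

Enqueue types are identified by two characters. Common ones are:

- ST Enqueue for Space Management Transaction

- SQ Enqueue for Sequence Numbers

- TX Enqueue for a Transaction

From 10g on, each enqueue type is reported as its own wait event rather than being folded into the single enqueue event and distinguished only by type.

As an example, run the same test on 9i and on 10g: when two sessions perform DML against the same table at the same time, 9i shows the "enqueue" event with type "TX", whereas 10g reports the wait event "enq: TX - row lock contention".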

-- Output on 9i

select sid,event,p1text,p1,p2text,p2,p3text,p3,wait_time,state
from v$session_wait
where event='enqueue';

SID EVENT   P1TEXT    P1         P2TEXT P2     P3TEXT P3     WAIT_TIME STATE
22  enqueue name|mode 1415053318 id1    131073 id2    597156 0         WAITING

-- P1 packs the two-character enqueue name into its two high-order bytes
-- and the requested mode into its two low-order bytes.
select sid,
chr(bitand(p1, -16777216)/16777216)||chr(bitand(p1,16711680)/65536) "Name",
(bitand(p1, 65535)) "Mode"
from v$session_wait where event = 'enqueue';

SID Name Mode
22  TX   6

-- Output on 10g

SQL> select sid,event,p1text,p1,p2text,p2,p3text,p3,wait_time,state from v$session_wait where sid in (144);

SID EVENT                         P1TEXT    P1         P2TEXT         P2     P3TEXT   P3   WAIT_TIME STATE
144 enq: TX - row lock contention name|mode 1415053318 usn<<16 | slot 655364 sequence 1619 0         WAITING

Parameters: P1-enqueue name and mode, encoded in ASCII format;

P2-Resource identifier ID1 for the requested lock, same as V$LOCK.ID1

P3-Resource identifier ID2 for the requested lock, same as V$LOCK.ID2
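
Because P2 and P3 match V$LOCK.ID1 and V$LOCK.ID2, the holder of a contended enqueue can be found by self-joining v$lock on the resource identifiers. This is a minimal sketch of that join; the column aliases are illustrative only.

```sql
-- Pair each waiting session with the session holding the same resource.
SELECT h.sid     AS holder_sid,
       w.sid     AS waiter_sid,
       w.type,
       w.id1,
       w.id2,
       w.request AS requested_mode
FROM   v$lock w, v$lock h
WHERE  w.request > 0
AND    h.lmode   > 0
AND    h.type    = w.type
AND    h.id1     = w.id1
AND    h.id2     = w.id2
AND    h.sid    <> w.sid;
```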

Free buffer waits

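A session posts free buffer waits when it cannot find enough free buffers in the buffer cache to read in data blocks or to build a consistent read (CR) image of a data block. The cause may be a buffer cache that is too small, or dirty blocks being written to disk too slowly. Once this wait occurs, the user process signals DBWR to write dirty blocks to disk as quickly as possible.

Parameters: P1-file number of the block the user is reading, P2-block number of the block the user is reading, P3-used only from 10g on, records the SET_ID# of the LRU and LRUW lists in the buffer cache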
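
A rough way to tell a small cache from a slow DBWR is to compare how many free buffers were requested with how many dirty buffers sessions had to skip while searching. The query below is a sketch under the assumption that these three statistic names exist in v$sysstat on the release in use.

```sql
-- A high 'dirty buffers inspected' relative to 'free buffer requested'
-- suggests DBWR is not cleaning buffers fast enough.
SELECT name, value
FROM   v$sysstat
WHERE  name IN ('free buffer requested',
                'free buffer inspected',
                'dirty buffers inspected');
```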

Latch free

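The latch free wait event appears when a process tries to get a latch and finds it held by another process. Unlike an enqueue, a latch has no queue: a process that fails to get the latch does not join a queue, but waits for a while and then tries again.

Common latches include cache buffers chains, library cache and shared pool; see chapter 6 for details.

Parameters: P1-address of the latch the process is waiting for, P2-latch number, which can be mapped to a latch name through v$latchname, P3-number of attempts to get the latch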
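
The P2-to-name lookup described above can be written as a join between v$session_wait and v$latchname. This is a minimal sketch; the column aliases are illustrative.

```sql
-- Name the latch each session is currently waiting for.
SELECT w.sid,
       n.name  AS latch_name,
       w.p1raw AS latch_address,
       w.p3    AS tries
FROM   v$session_wait w, v$latchname n
WHERE  w.event  = 'latch free'
AND    n.latch# = w.p2;
```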

Library cache pin

Before a session can modify an object in the library cache, it must first acquire a pin on it. Oracle posts the library cache pin event while a session is compiling or parsing PL/SQL stored procedures and views.

(I do not fully understand this passage.) What actions to take to reduce these waits depend heavily on what blocking scenario is occurring. A common problem scenario is the use of DYNAMIC SQL from within a PL/SQL procedure where the PL/SQL code is recompiled and the DYNAMIC SQL calls something that depends on the calling procedure.

If blocking occurs, the following can be used to find the handle being waited on (P1RAW) and the sessions holding or requesting a pin on it:

select sid, p1raw from v$session_wait where event='library cache pin';

select s.sid, kglpnmod "Mode", kglpnreq "Req"
from x$kglpn p, v$session s
where p.kglpnuse = s.saddr
and kglpnhdl = '&P1RAW';

x$kglpn: [K]ernel [G]eneric [L]ibrary Cache Manager object [P]i[N]s

sys@ORALOCAL(192.168.0.22)> desc x$kglpn
## Mainly used to examine library cache pin holders
Name                                                  Null?    Type
----------------------------------------------------- -------- ------------
ADDR                                                           RAW(4)
INDX                                                           NUMBER
INST_ID                                                        NUMBER
KGLPNADR                                                       RAW(4)
KGLPNUSE                                                       RAW(4)
KGLPNSES                                                       RAW(4)
KGLPNHDL                                                       RAW(4)
## Join to P1RAW in v$session_wait where the event is library cache pin, then join to v$session, to get sid and serial#
KGLPNLCK                                                       RAW(4)
KGLPNCNT                                                       NUMBER
KGLPNMOD                                                       NUMBER
## A value of 3 means a holder of the library cache pin; a value of 0 means a waiter
KGLPNREQ                                                       NUMBER
## A value of 0 means a holder of the library cache pin; a value of 2 means a waiter
KGLPNDMK                                                       NUMBER
KGLPNSPN                                                       NUMBER

Parameters: P1-address of the object being pinned, P2-address of load lock, P3-contains the mode plus the namespace (mode indicates which data pieces of the object are to be loaded; namespace is the object namespace as displayed in V$DB_OBJECT_CACHE view)
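
To turn the P1 handle address into an object name, the handle can be looked up in x$kglob. This is a sketch only: X$ tables are undocumented and version dependent, the query must be run as SYS, the column names KGLHDADR, KGLNAOWN and KGLNAOBJ are assumed from common usage rather than from the original article, and &P1RAW is a substitution variable filled in from v$session_wait.

```sql
-- Resolve the library cache handle being waited on to its owner and name.
SELECT kglnaown AS owner,
       kglnaobj AS object_name
FROM   x$kglob
WHERE  kglhdadr = '&P1RAW';
```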

Library cache lock

A session must acquire a library cache lock on an object handle to prevent other sessions from accessing it at the same time, or to maintain a dependency for a long time, or to locate an object in the library cache.

How does this differ from library cache pin?

Log buffer space

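A session posts log buffer space when it writes to the log buffer and finds there is not enough room. The event means LGWR is writing to the redo logs more slowly than the application is generating redo. Possible causes: the log buffer is too small, or there is I/O contention on the redo log files.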
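
How often sessions had to retry for log buffer space can be compared with the total number of redo entries. The query below is a rough sketch, assuming these statistic names as they appear in v$sysstat.

```sql
-- 'redo buffer allocation retries' should stay small relative to 'redo entries'.
SELECT name, value
FROM   v$sysstat
WHERE  name IN ('redo entries',
                'redo buffer allocation retries',
                'redo log space requests');
```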

Log file parallel write

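This event is posted by the LGWR process. It appears when the session waits for the LGWR process to write redo from the log buffer to all the log members of the redo log group.

With asynchronous I/O, LGWR writes to all log file members at the same time; otherwise it writes to them one after another.

Significant time spent on this event indicates that the disk devices are slow or that there is I/O contention on the disks holding the redo logs.

Parameters: P1-number of log files to write to, P2-number of OS blocks to write, P3-number of I/O requests.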
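
Since LGWR writes to every member of the current group, it helps to know how many members each group has and where they are located. A minimal sketch:

```sql
-- List each redo log group with its size, status and member files.
SELECT l.group#, l.status, l.members, l.bytes, f.member
FROM   v$log l, v$logfile f
WHERE  f.group# = l.group#
ORDER  BY l.group#, f.member;
```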

Log file sequential read

The log file sequential read event is raised when the ARCH process has to wait while reading blocks from an online redo log file.

Parameters: P1-relative sequence number of the redo log file within the redo log group, P2-block number to start reading from, P3-number of OS blocks to read

Log file switch(archiving needed)

This wait event appears when LGWR wants to switch to the next redo log group and finds that the ARCH process has not yet written that group to the archived log file destination.

Log file switch(checkpoint incomplete)

A wait that occurs when LGWR tries to switch log files and the checkpoint has not yet completed. It generally appears when the redo log files are too small.
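
One way to judge whether the redo log files are too small is to count log switches per hour from v$log_history. This is a sketch; what counts as too many switches is a judgment call that the original article does not quantify.

```sql
-- Number of log switches in each hour recorded in the control file history.
SELECT TO_CHAR(first_time, 'YYYY-MM-DD HH24') AS switch_hour,
       COUNT(*)                               AS log_switches
FROM   v$log_history
GROUP  BY TO_CHAR(first_time, 'YYYY-MM-DD HH24')
ORDER  BY 1;
```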

Log file switch completion

The process waits for the log file switch to complete.

Log file sync

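When a user ends a transaction, either a commit or a rollback causes LGWR to write redo to the online redo log. This wait event is raised while the user process waits for LGWR to complete that redo write I/O.

If a user hits the log file sync wait event and v$session_wait shows the session waiting on the same buffer# (P1) all along, then the SEQ# of that session's log file sync wait in v$session_wait should keep increasing (if it does not, there may be a problem with the timeout). In that case the bottleneck is the LGWR process, and the next step is to find out what is blocking LGWR.

The key to tuning LGWR is disk I/O; for example, redo logs should not be placed on RAID5 disk arrays. Combining many small transactions into larger ones also helps reduce this wait event.

Parameters: P1-the number of the buffer in the log buffer that needs to be synchronized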
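
The check described above, watching whether P1 stays the same while SEQ# grows, can be done by running a query like the following several times in a row. A minimal sketch; the alias buffer_no is illustrative.

```sql
-- Re-run a few times: an unchanged buffer_no with a growing SEQ# points at LGWR.
SELECT sid, seq#, p1 AS buffer_no, seconds_in_wait, state
FROM   v$session_wait
WHERE  event = 'log file sync';
```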

SQL*Net message from client

The session is waiting for input from the client.

Parameters: P1-Prior to Oracle8i release, the value in this parameter was not of much use. Since Oracle8i, the P1RAW column contains an ASCII value to show what type of network driver is in use by the client connections; for example, bequeath, and TCP.

P2-The number of bytes received by the session from the client; generally one, even though the received packet will contain more than 1 byte.
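
The ASCII value in P1 can be unpacked one byte at a time, in the same way as the enqueue name. The sketch below assumes the driver name sits in the three high-order bytes of P1; for example, a P1 of 1413697536 (hex 54435000) decodes to TCP and 1650815232 (hex 62657100) decodes to beq.

```sql
-- Decode the network driver name from the three high-order bytes of P1.
SELECT sid,
       CHR(BITAND(p1, 4278190080) / 16777216) ||
       CHR(BITAND(p1, 16711680)   / 65536)    ||
       CHR(BITAND(p1, 65280)      / 256)      AS driver
FROM   v$session_wait
WHERE  event = 'SQL*Net message from client';
```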

SQL*Net message to client

This wait event is posted by the session when it is sending a message to the client. The client process may be too busy to accept the delivery of the message, causing the server session to wait, or the network latency delays may be causing the message delivery to take longer.

Parameters: P1-Prior to Oracle8i Database, the value in this parameter was not of much use. Since Oracle8i, the P1RAW column contains an ASCII value to show what type of network driver is in use by the client connections, for example, bequeath and TCP.

P2-Number of bytes sent to client. This is generally one even though the sent packet will contain more than 1 byte.
