导航

关于Oracle备份中的fractured block

Posted on 2013-06-20 10:28  半夏_Fanxiaobo  阅读(692)  评论(0)    收藏  举报

One danger in making online backups is the possibility of inconsistent data within a block. For example, assume that you are backing up block 100 in datafile users.dbf. Also, assume that the copy utility reads the entire block while DBWR is in the middle of updating the block. In this case, the copy utility may read the old data in the top half of the block and the new data in the bottom top half of the block. The result is called a fractured block, meaning that the data contained in this block is not consistent. at a given SCN.

When performing backups of an open tablespace without using RMAN, you must put tablespaces in backup mode to prevent the creation of fractured blocks in your backup. When not in backup mode, the database records only changed bytes in the redo stream. When a tablespace is in backup mode, each time a block is changed the database writes the before-image of the entire block to the redo stream before modifying the block. Then, the database also records the changes to the block in the redo log. During user-managed recovery using SQL*Plus, the database applies both the captured block images and the recorded block changes from the redo logs. Applying the block images repairs any possible fractured blocks in the backup being restored and recovered.

RMAN does not require that you put datafiles into backup mode. During an RMAN backup, a database server session reads each block of the datafile and checks whether each block is fractured by comparing the block header and footer. If a block is fractured, the session re-reads the block. If the same fracture is found, then the block is considered permanently corrupt. If MAXCORRUPT is exceeded, the backup stops.

 

Fractured Block explanation [ID 1392417.1]

   

In this Document
  Symptoms
  Cause
  Solution


Applies to:

Oracle Server - Enterprise Edition - Version: 10.2.0.4 and later   [Release: 10.2 and later ]
Information in this document applies to any platform.

Symptoms

Fractured block found causing ORA-1578 error 

Cause

ORA-1578 error could be an indication of a fractured block. A fractured block is a clear symptom about serious issues within the O.S./H.W. layers.

In order to understand why a fractured block happens, we need to understand how a block is written into disk.

The block size at OS level does not match the block size at Oracle level, so in order to write an Oracle block, we need to perform more that one OS write.

As an example: if OS block size is 512 bytes and Oracle block is 8K, we need to perform 16 writes at OS level in order to complete the write process.

Oracle keeps track off the header of each block and constructs before writing down to disk by building a small 4 byte field/value in the tail of each block (tailchk) to guarantee it is correctly written

Example of a Fractured (Broken) block

Page 264462 is influx - most likely media corrupt
Corrupt block relative dba: 0x0284090e (file 10, block 264462)
Fractured block found during dbv:
Data in bad block:
type: 6 format: 2 rdba: 0x0284090e
last change scn: 0x0003.da17adf8 seq: 0x1 flg: 0x04
spare1: 0x0 spare2: 0x0 spare3: 0x0
consistency value in tail: 0xaead0601 <----------
check value in block header: 0x8564
computed block checksum: 0x355

-------------------------------------------------------------------------------------------

. tailchk: (current: 0xaead0601)
----------
It is built, at the footer of the block to guarantee the beginning and end of the block corresponds to the same version.
it is 4 bytes built as:

.- lower order 2 bytes of scn base - 0x....adf8
.- block type - 0x06
.- scn sequence number - 0x01

-> tailchk should have been 0xadf80601 in this block whilst it's 0xaead0601
-------------------------------------------------------------------------------------------

This value was right on the block oracle asked o.s. to write but, unfortunately, the write did not complete as a whole and only partial write was done or the write completed but invalid information was actually written in the block.

The tail check only verifies if the header and the tail of the block has been written correctly, but does not warranty that the complete block was written.

In order to warranty this, we have the database parameter DB_BLOCK_CHECKSUM
If this parameter is set to TRUE, we calculate a checksum value for the complete block and we write this information in the block header before writing the block. When the block is read again, this values is recomputed and compared with the one at block header.

There are checks that may be run against datafiles to ensure the validity of all tail values on all blocks of them.
DBV catches this kind of failures and may be used against your DB file(s) to check this.

Identically, there is a clear path to follow when this happens.
These blocks are badly written by o.s./h.w. and as such, Oracle operations over the block(s) affected are correct.
(otherwise, a different kind of error would have been printed out)

In addition, restoring and recovering the block is not introducing the issue again, what indicates that the redo changes generated are correct, so the issue is clearly related with other layer (most probably hardware layer)

Solution

Restore and recover the block from a valid backup