BBED修复坏块
1.故障现象概述
1.1.故障描述
信用卡电子申请发卡系统(CIAP),在2016-10-09 02:09:00时,一线报警,发生数据库宕机。数据库宕机后,一线运维人员立即做出处理,手动启动数据库。数据库能够正常启动,并且没有影响到业务。但是NBU备份操作仍然无法进行备份。(截至2016-10-11日接到工单为止,已经2天没有进行NBU备份了。)
1.2.日志分析
此时,这个时间点正在做的操作是NBU备份。通过查看:
1)alert告警日志alert_orcl.log:
发现该时间点file=4 '/u01/app/oracle/oradata/orcl/users01.dbf'这个文件报ORA-01114: IO error writing block to file 4 (block # 330894)的i/o错误。
Sun Oct 09 02:00:00 2016
Closing scheduler window
Closing Resource Manager plan via scheduler window
Clearing Resource Manager plan via parameter
Sun Oct 09 02:09:00 2016
KCF: read, write or open error, block=0x50c8e online=1
file=4 '/u01/app/oracle/oradata/orcl/users01.dbf'
error=27072 txt: 'Linux-x86_64 Error: 5: Input/output error
Additional information: 4
Additional information: 330894
Additional information: -1'
Errors in file /u01/app/oracle/diag/rdbms/paddb25/orcl/trace/orcl_dbw3_8377.trc:
Errors in file /u01/app/oracle/diag/rdbms/paddb25/orcl/trace/orcl_dbw3_8377.trc:
ORA-63999: data file suffered media failure
ORA-01114: IO error writing block to file 4 (block # 330894)
ORA-01110: data file 4: '/u01/app/oracle/oradata/orcl/users01.dbf'
ORA-27072: File I/O error
Linux-x86_64 Error: 5: Input/output error
Additional information: 4
Additional information: 330894
Additional information: -1
2)nbu备份日志CIAP-dbs-oracle-apporcl-bp.sh.out:
Script /nbu/script/CIAP-dbs-oracle-apporcl-bp.sh
==== started on Sat Oct 8 17:29:44 CST 2016 ====
RMAN: /u01/app/oracle/product/11.2.0/db_1/bin/rman
ORACLE_SID: orcl
ORACLE_USER: oracle
ORACLE_HOME: /u01/app/oracle/product/11.2.0/db_1
NB_ORA_FULL: 1
NB_ORA_INCR: 0
NB_ORA_CINC: 0
NB_ORA_SERV: nbu_master
NB_ORA_POLICY: CIAP-dbs-oracle-apporcl-bp
Full backup requested
Recovery Manager: Release 11.2.0.4.0 - Production on Sat Oct 8 17:29:44 2016
。。。。。。
channel ch00: starting piece 1 at 08-OCT-16
released channel: ch00
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on ch00 channel at 10/08/2016 18:00:05
ORA-19501: read error on file "/u01/app/oracle/oradata/orcl/users01.dbf", block number 330688 (block size=8192)
ORA-27072: File I/O error
Additional information: 4
Additional information: 330688
Additional information: 360448
RMAN> RMAN>
Recovery Manager complete.
Script /nbu/script/CIAP-dbs-oracle-apporcl-bp.sh
==== ended in error on Sat Oct 8 18:00:07 CST 2016 ====
通过初步分析,推测data file 4: '/u01/app/oracle/oradata/orcl/users01.dbf'上存在坏块。(由于alert日志和rman操作日志所取的时间段不同,所以坏块的块号不一样。)
1.3.系统管理员采取的操作
1.系统管理员尝试第一种方法:
故障发生后,系统管理员自己采取了rman的一系列命令,手动修复坏块。操作过程如下:
1)查看存放在file 4上的坏块号330624上的对象是索引:
SELECT segment_name , segment_type , owner , tablespace_name
FROM dba_extents
WHERE file_id = 4 and 330624 between block_id AND block_id + blocks - 1;
2)确定坏块上存放的是索引之后,系统管理员采取的措施是:删除该索引,然后将索引建到其他表空间;
3)再次执行第1步中的sql语句,返回结果为空(即此时没有对象占用到这个坏块)
4)然后,系统管理员执行如下命令(RMAN> backup validate datafile 4;),依旧报坏块问题
(此时是正常的,因为rman会自己校验数据文件是否有坏块,如果有坏块,则无法备份)
RMAN> validate datafile 4;
Starting validate at 10-OCT-16
using channel ORA_DISK_1
channel ORA_DISK_1: starting validation of datafile
channel ORA_DISK_1: specifying datafile(s) for validation
input datafile file number=00004 name=/u01/app/oracle/oradata/orcl/users01.dbf
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of validate command on ORA_DISK_1 channel at 10/10/2016 10:01:46
ORA-19501: read error on file "/u01/app/oracle/oradata/orcl/users01.dbf", block number 330624 (block size=8192)
ORA-27072: File I/O error
Additional information: 4
Additional information: 330624
Additional information: 884736
5)接着,又执行了如下命令,返回结果为空:(查询视图v$database_block_corruption,提示有坏块,注意该视图可能不会返回任何数据,如无返回,先执行backup validate)
select * from v$database_block_corruption;
6)接着,又执行了如下命令,并且提示修复成功:
RMAN> blockrecover datafile 4 block 330624;
7)然后,再次执行backup validate datafile 4,依旧报错:
RMAN> validate datafile 4;
Starting validate at 10-OCT-16
using channel ORA_DISK_1
channel ORA_DISK_1: starting validation of datafile
channel ORA_DISK_1: specifying datafile(s) for validation
input datafile file number=00004 name=/u01/app/oracle/oradata/orcl/users01.dbf
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of validate command on ORA_DISK_1 channel at 10/10/2016 10:01:46
ORA-19501: read error on file "/u01/app/oracle/oradata/orcl/users01.dbf", block number 330624 (block size=8192)
ORA-27072: File I/O error
Additional information: 4
Additional information: 330624
Additional information: 884736
2.系统管理员尝试的第二种方法:
在确定如下sql查询返回结果为空的前提下,
SELECT segment_name , segment_type , owner , tablespace_name
FROM dba_extents
WHERE file_id = 4 and 330624 between block_id AND block_id + blocks - 1;
在这个数据文件的表空间上,新建一个测试表test,将其扩充。通过append方式插入数据,直到坏块被测试表使用,然后commit,并checkpoint。
alter table test allocate extent(size 1m);
再次执行backup validate datafile 4,依旧报错(属于正常现象)
RMAN> validate datafile 4;
Starting validate at 10-OCT-16
using channel ORA_DISK_1
channel ORA_DISK_1: starting validation of datafile
channel ORA_DISK_1: specifying datafile(s) for validation
input datafile file number=00004 name=/u01/app/oracle/oradata/orcl/users01.dbf
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of validate command on ORA_DISK_1 channel at 10/10/2016 10:01:46
ORA-19501: read error on file "/u01/app/oracle/oradata/orcl/users01.dbf", block number 330624 (block size=8192)
ORA-27072: File I/O error
Additional information: 4
Additional information: 330624
Additional information: 884736
1.4.系统管理员在mos上提sr,根据sr做的一系列操作
1.)查看坏块上存放的对象
SELECT * FROM dba_extents WHERE file_id = 4 and 330624 between block_id AND block_id + blocks - 1;
SELECT * FROM dba_extents WHERE file_id = 4 and 884736 between block_id AND block_id + blocks - 1;
2.)查看dbv操作的输出结果
dbverify工具,主要目的是检查数据文件的物理结构,包括数据文件是否损坏,是否存在逻辑坏块,以及数据文件中包含何种类型的数据。同时,dbv工具可以验证Online或者offline的数据文件。不管数据库是否打开,都可以访问数据文件。
注意:dbv工具只能校验数据文件,而redo日志文件、控制文件则无法校验。
2.1)
SQL> select file_name,file_id,tablespace_name from dba_data_files where file_id=4;
2.2)
SQL>select tablespace_name,block_size from dba_tablespaces where tablespace_name=‘<TABLESPACE_NAME>';
2.3)
dbv file=/u01/app/oracle/oradata/orcl/users01.dbf logfile=dbv.log blocksize=
其中dbv操作的输出结果为:
dbv file=/u01/app/oracle/oradata/orcl/users01.dbf logfile=dbv.log blocksize=8192
执行出现错误 DBV-00600:Fatal Error - [28] [27061] [0] [0]
3.) dbv操作没有执行成功,所以下面返回结果也为空(可以先执行dbv或者backup validate,将坏块信息写入数据字典)
select * from V$DATABASE_BLOCK_CORRUPTION;
4.)根据DBV-00600:Fatal Error - [28] [27061] [0] [0],推断是操作系统层面的问题。执行如下命令,并上传结果:
oracle@HP-685C-204-CC ~]$ dbfsize /u01/app/oracle/oradata/orcl/users01.dbf
Database file: /u01/app/oracle/oradata/orcl/users01.dbf
Database file type: file system
Database file size: 801760 8192 byte blocks
[oracle@HP-685C-204-CC ~]$ ls -lrt /u01/app/oracle/oradata/orcl/users01.dbf
-rw-r----- 1 oracle oinstall 6568026112 Oct 11 18:09 /u01/app/oracle/oradata/orcl/users01.dbf
通过操作系统日志内容,发现大量Medium Error的报错,所以推断是系统硬件存储方面存在的问题。
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<messages-20161009
Oct 6 21:02:10 HP-685C-204-CC kernel: [0x000053420001015e][INFO][UP][__mppLnx_scsi_done, 3494][20786931867] UP_done: A0C1P0L1,r=0x8000002,MPP_CHECK_CONDITION,sk=0x3,ASC/ASCQ=0x11/0x0,SN:511684510.
Oct 6 21:02:10 HP-685C-204-CC kernel: [0x0000534200010162][ERR][UP][mppLnx_HandleIOResult, 2734][20786931867] A0C1P0L1 IO FAILURE. vcmnd SN 511684510 pdev H3C0T1L1 0x03/0x11/0x00 0x08000002 status:1
Oct 6 21:02:10 HP-685C-204-CC kernel: sd 6:0:0:1: [sdb] Unhandled sense code
Oct 6 21:02:10 HP-685C-204-CC kernel: sd 6:0:0:1: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Oct 6 21:02:10 HP-685C-204-CC kernel: sd 6:0:0:1: [sdb] Sense Key : Medium Error [current]
Oct 6 21:02:10 HP-685C-204-CC kernel: sd 6:0:0:1: [sdb] Add. Sense: Unrecovered read error
Oct 6 21:02:10 HP-685C-204-CC kernel: sd 6:0:0:1: [sdb] CDB: Read(10): 28 00 10 d5 09 4f 00 00 10 00
........
Oct 9 03:36:01 HP-685C-204-CC kernel: [0x000053420001015e][INFO][UP][__mppLnx_scsi_done, 3494][20983363142] UP_done: A0C1P0L1,r=0x8000002,MPP_CHECK_CONDITION,sk=0x3,ASC/ASCQ=0x11/0x0,SN:519423184.
Oct 9 03:36:01 HP-685C-204-CC kernel: [0x0000534200010162][ERR][UP][mppLnx_HandleIOResult, 2734][20983363142] A0C1P0L1 IO FAILURE. vcmnd SN 519423184 pdev H3C0T1L1 0x03/0x11/0x00 0x08000002 status:1
Oct 9 03:36:01 HP-685C-204-CC kernel: sd 6:0:0:1: [sdb] Unhandled sense code
Oct 9 03:36:01 HP-685C-204-CC kernel: sd 6:0:0:1: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Oct 9 03:36:01 HP-685C-204-CC kernel: sd 6:0:0:1: [sdb] Sense Key : Medium Error [current]
Oct 9 03:36:01 HP-685C-204-CC kernel: sd 6:0:0:1: [sdb] Add. Sense: Unrecovered read error
Oct 9 03:36:01 HP-685C-204-CC kernel: sd 6:0:0:1: [sdb] CDB: Read(10): 28 00 10 d5 09 4f 00 00 08 00
5.)系统管理员,通过联系存储工程师,Linux系统工程师,排查问题,最终确认是os文件系统的问题:superblock损坏。最终通过格式化磁盘,重新刷dg的方式,解决问题。
2.坏块实验模拟
2.2.1.基于rman实现坏块介质恢复
2.1.1.创建实验环境
[oracle@dg1 datafile]$ pwd
/home/oracle/app/oradata/DG1/datafile
[oracle@dg1 datafile]$ ll
总计 1755708
-rw-r----- 1 oracle oinstall 545267712 10-13 10:42 o1_mf_sysaux_ct37034z_.dbf
-rw-r----- 1 oracle oinstall 713039872 10-13 10:42 o1_mf_system_ct370329_.dbf
-rw-r----- 1 oracle oinstall 30416896 10-13 09:22 o1_mf_temp_ct3779v5_.tmp
-rw-r----- 1 oracle oinstall 309338112 10-13 10:42 o1_mf_undotbs1_ct37035m_.dbf
-rw-r----- 1 oracle oinstall 225452032 10-10 14:34 o1_mf_users_ct37037v_.dbf
[oracle@dg1 datafile]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.1.0 Production on Thu Oct 13 10:45:48 2016
Copyright (c) 1982, 2009, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
1)创建实验用的 datafile
SQL> create tablespace tbs_test1 datafile '/home/oracle/app/oradata/DG1/datafile/test02.dbf' size 50m autoextend on;
Tablespace created.
2)基于新的数据文件,在表空间tbs_test上创建表tb_tmp
SQL> conn scott/tiger
Connected.
SQL>
SQL> create table tb_tmp tablespace tbs_test1 as select * from emp;
Table created.
3)查看数据文件上的file_id和file_name
SQL> desc dba_data_files
Name Null? Type
FILE_NAME VARCHAR2(513)
FILE_ID NUMBER
TABLESPACE_NAME VARCHAR2(30)
BYTES NUMBER
BLOCKS NUMBER
STATUS VARCHAR2(9)
RELATIVE_FNO NUMBER
AUTOEXTENSIBLE VARCHAR2(3)
MAXBYTES NUMBER
MAXBLOCKS NUMBER
INCREMENT_BY NUMBER
USER_BYTES NUMBER
USER_BLOCKS NUMBER
ONLINE_STATUS VARCHAR2(7)
SQL> set lines 200 pages 1000
SQL> col file_name for a60
SQL> col file_id for 999
SQL> select file_id,file_name from dba_data_files where tablespace_name='TBS_TEST1';
FILE_ID FILE_NAME
7 /home/oracle/app/oradata/DG1/datafile/test02.dbf
SQL>
4)表对象tb_tmp上的信息:包含对应的文件信息,头部块,总块数
SQL> col segment_name for a15
SQL> select segment_name,header_file,header_block,blocks
2 from dba_segments
3 where segment_name='TB_TMP'
4 and owner='SCOTT';
SEGMENT_NAME HEADER_FILE HEADER_BLOCK BLOCKS
TB_TMP 7 130 8
SQL>
5)首先使用rman备份对应的数据文件
[oracle@dg1 ~]$ rman target /
Recovery Manager: Release 11.2.0.1.0 - Production on Thu Oct 13 11:26:00 2016
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
connected to target database: DG1 (DBID=1900287250)
RMAN> backup datafile 6 tag=health;
Starting backup at 14-OCT-16
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=36 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00007 name=/home/oracle/app/oradata/DG1/datafile/test02.dbf
channel ORA_DISK_1: starting piece 1 at 14-OCT-16
channel ORA_DISK_1: finished piece 1 at 14-OCT-16
piece handle=/home/oracle/app/flash_recovery_area/DG1/backupset/2016_10_14/o1_mf_nnndf_HEALTH1_d00kn57y_.bkp tag=HEALTH1 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
Finished backup at 14-OCT-16
RMAN> exit
Recovery Manager complete.
[oracle@dg1 ~]$
2.1.2.单个数据块损坏的恢复处理
1.使用linux自带的dd命令,损坏1个单块数据块
[oracle@dg1 ~]$ cd /home/oracle/app/oradata/DG1/datafile
[oracle@dg1 datafile]$ ll
总计 1755708
-rw-r----- 1 oracle oinstall 545267712 10-14 08:36 o1_mf_sysaux_ct37034z_.dbf
-rw-r----- 1 oracle oinstall 713039872 10-14 08:36 o1_mf_system_ct370329_.dbf
-rw-r----- 1 oracle oinstall 30416896 10-13 09:22 o1_mf_temp_ct3779v5_.tmp
-rw-r----- 1 oracle oinstall 309338112 10-14 08:36 o1_mf_undotbs1_ct37035m_.dbf
-rw-r----- 1 oracle oinstall 225452032 10-10 14:34 o1_mf_users_ct37037v_.dbf
-rw-r----- 1 oracle oinstall 0 10-13 17:13 test02.dbf
[oracle@dg1 datafile]$ dd of=test02.dbf bs=8192 conv=notrunc seek=130 <<EOF
corruptted block!
EOF
0+1 records in
0+1 records out
18 bytes (18 B) copied, 0.000112 seconds, 161 kB/s
[oracle@dg1 datafile]$
2.清空buffer cache
SQL> alter system flush buffer_cache;
3.查询表对象 tb_tmp ,收到ora-01578报错
SQL> select count() from scott.tb_tmp;
select count() from scott.tb_tmp
*
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 7, block # 130)
ORA-01110: data file 7: '/home/oracle/app/oradata/DG1/datafile/test02.dbf'
SQL>
4.查询视图v$database_block_corruption,提示有坏块。注意,该视图可能不返回任何数据,如无返回,则需要先执行 backup validate
SQL> select * from v$database_block_corruption;
FILE# BLOCK# BLOCKS CORRUPTION_CHANGE# CORRUPTIO
7 130 1 0 CORRUPT
SQL>
5.也可以使用dbv工具来校验坏块,参考http://blog.csdn.net/robinson_0612/article/details/6530890
[oracle@dg1 datafile]$ dbv file=/home/oracle/app/oradata/DG1/datafile/test02.dbf blocksize=8192
DBVERIFY: Release 11.2.0.1.0 - Production on Fri Oct 14 10:48:56 2016
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
DBVERIFY - Verification starting : FILE = /home/oracle/app/oradata/DG1/datafile/test02.dbf
Page 130 is marked corrupt
Corrupt block relative dba: 0x01c00082 (file 7, block 130)
Bad header found during dbv:
Data in bad block:
type: 99 format: 7 rdba: 0x74747075
last change scn: 0x6f6c.62206465 seq: 0x63 flg: 0x6b
spare1: 0x72 spare2: 0x72 spare3: 0x0
consistency value in tail: 0x7fa92301
check value in block header: 0xa21
block checksum disabled
DBVERIFY - Verification complete
Total Pages Examined : 6400
Total Pages Processed (Data) : 1
Total Pages Failing (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing (Index): 0
Total Pages Processed (Other): 129
Total Pages Processed (Seg) : 0
Total Pages Failing (Seg) : 0
Total Pages Empty : 6269
Total Pages Marked Corrupt : 1
Total Pages Influx : 0
Total Pages Encrypted : 0
Highest block SCN : 1212325 (0.1212325)
[oracle@dg1 datafile]$
6.使用blockrecover来恢复坏块
RMAN> blockrecover datafile 7 block 130;
Starting recover at 14-OCT-16
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=42 device type=DISK
channel ORA_DISK_1: restoring block(s)
channel ORA_DISK_1: specifying block(s) to restore from backup set
restoring blocks of datafile 00007
channel ORA_DISK_1: reading from backup piece /home/oracle/app/flash_recovery_area/DG1/backupset/2016_10_14/o1_mf_nnndf_HEALTH1_d00kn57y_.bkp
channel ORA_DISK_1: piece handle=/home/oracle/app/flash_recovery_area/DG1/backupset/2016_10_14/o1_mf_nnndf_HEALTH1_d00kn57y_.bkp tag=HEALTH1
channel ORA_DISK_1: restored block(s) from backup piece 1
channel ORA_DISK_1: block restore complete, elapsed time: 00:00:01
starting media recovery
media recovery complete, elapsed time: 00:00:03
Finished recover at 14-OCT-16
RMAN>
7.再次查询表对象tb_tmp,正常
[oracle@dg1 datafile]$ sqlplus scott/tiger
SQL*Plus: Release 11.2.0.1.0 Production on Fri Oct 14 10:54:36 2016
Copyright (c) 1982, 2009, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> select count(*) from tb_tmp;
COUNT(*)
14
SQL>
2.1.3.多个数据块损坏的恢复处理
1.先查看要处理的块上,存放的对象
SQL> select segment_name,segment_type,owner,tablespace_name from dba_extents where file_id=7
2 and 154 between block_id and block_id+blocks-1;
SEGMENT_NAME SEGMENT_TYPE OWNER TABLESPACE_NAME
TB_TMP TABLE SCOTT TBS_TEST1
2.备份这个数据文件
[oracle@dg1 ~]$ rman target /
Recovery Manager: Release 11.2.0.1.0 - Production on Mon Oct 17 14:17:51 2016
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
connected to target database: DG1 (DBID=1900287250)
RMAN> backup datafile 7;
Starting backup at 17-OCT-16
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=34 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00007 name=/home/oracle/app/oradata/DG1/datafile/test02.dbf
channel ORA_DISK_1: starting piece 1 at 17-OCT-16
channel ORA_DISK_1: finished piece 1 at 17-OCT-16
piece handle=/home/oracle/app/flash_recovery_area/DG1/backupset/2016_10_17/o1_mf_nnndf_TAG20161017T141757_d08v8odb_.bkp tag=TAG20161017T141757 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:03
Finished backup at 17-OCT-16
RMAN> quit
Recovery Manager complete.
[oracle@dg1 ~]$
3.使用Linux的dd命令,对不连续的块做损坏操作
[oracle@dg1 datafile]$ dd of=/home/oracle/app/oradata/DG1/datafile/test02.dbf bs=8192 conv=notrunc seek=133<<EOF
corruptted no1 block
EOF
0+1 records in
0+1 records out
21 bytes (21 B) copied, 0 seconds, Infinity B/s
[oracle@dg1 datafile]$ dd of=/home/oracle/app/oradata/DG1/datafile/test02.dbf bs=8192 conv=notrunc seek=143<<EOF
corruptted no2 block
EOF
0+1 records in
0+1 records out
21 bytes (21 B) copied, 2.6e-05 seconds, 808 kB/s
[oracle@dg1 datafile]$ dd of=/home/oracle/app/oradata/DG1/datafile/test02.dbf bs=8192 conv=notrunc seek=153<<EOF
corruptted no3 block
EOF
0+1 records in
0+1 records out
21 bytes (21 B) copied, 0 seconds, Infinity B/s
[oracle@dg1 datafile]$
4.由于损坏了多个数据块,但是查询时,会从最小的块号133开始提示,如果块被修复后,还有坏块,则会继续提示133之后的坏块
1)这里有个小插曲:在建完tb_tmp表之后,执行1次查询,然后破坏块133/143/153,再查询表,竟然能查到结果,与预期不同(原因是:没有清空buffer_cache缓存)
a)查询表tb_tmp的数据量:
SQL> select count(*) from tb_tmp;
COUNT(*)
1261568
b)查询v$database_block_corruption,无任何记录:
SQL> select * from v$database_block_corruption;
no rows selected
SQL>
c)查看这个块上的对象:
SQL> select segment_name,segment_type,owner,tablespace_name from dba_extents where file_id=7
2 and 133 between block_id and block_id+blocks-1;
SEGMENT_NAME
SEGMENT_TYPE OWNER TABLESPACE_NAME
TB_TMP
TABLE SCOTT TBS_TEST1
SQL>
2)rman命令,使用backup validate datafile来校验数据文件datafile 7:
[oracle@dg1 ~]$ rman target /
Recovery Manager: Release 11.2.0.1.0 - Production on Mon Oct 17 14:33:49 2016
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
connected to target database: DG1 (DBID=1900287250)
RMAN> validate datafile 7;
Starting validate at 17-OCT-16
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=40 device type=DISK
channel ORA_DISK_1: starting validation of datafile
channel ORA_DISK_1: specifying datafile(s) for validation
input datafile file number=00007 name=/home/oracle/app/oradata/DG1/datafile/test02.dbf
channel ORA_DISK_1: validation complete, elapsed time: 00:00:01
List of Datafiles
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
7 FAILED 0 1113 21720 1224942
File Name: /home/oracle/app/oradata/DG1/datafile/test02.dbf
Block Type Blocks Failing Blocks Processed
Data 0 20294
Index 0 0
Other 3 313
validate found one or more corrupt blocks
See trace file /home/oracle/app/diag/rdbms/dg1/dg1/trace/dg1_ora_24517.trc for details
Finished validate at 17-OCT-16
RMAN>
3)再次查询v$database_block_corruption,有3条记录:
SQL> select * from v$database_block_corruption;
FILE# BLOCK# BLOCKS CORRUPTION_CHANGE# CORRUPTIO
7 153 1 0 CORRUPT
7 143 1 0 CORRUPT
7 133 1 0 CORRUPT
SQL>
4)查询表tb_tmp仍然有记录:
SQL> select count(*) from tb_tmp;
COUNT(*)
1261568
SQL>
5)清空buffer_cache缓存后,再次查询,出现期望的报错信息:
SQL> alter system flush buffer_cache;
System altered.
SQL> select count() from tb_tmp;
select count() from tb_tmp
*
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 7, block # 133)
ORA-01110: data file 7: '/home/oracle/app/oradata/DG1/datafile/test02.dbf'
SQL>
5.通过rman,使用blockrecover corruption list 来恢复,则所有坏块都会被恢复。这里,我们使用blockrecover datafile xx block xx 来依次恢复:
1)datafile 7 block 133:
SQL> select count() from tb_tmp;
select count() from tb_tmp
*
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 7, block # 133)
ORA-01110: data file 7: '/home/oracle/app/oradata/DG1/datafile/test02.dbf'
SQL>
[oracle@dg1 ~]$ rman target /
Recovery Manager: Release 11.2.0.1.0 - Production on Mon Oct 17 15:25:40 2016
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
connected to target database: DG1 (DBID=1900287250)
RMAN> blockrecover datafile 7 block 133;
Starting recover at 17-OCT-16
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=52 device type=DISK
channel ORA_DISK_1: restoring block(s)
channel ORA_DISK_1: specifying block(s) to restore from backup set
restoring blocks of datafile 00007
channel ORA_DISK_1: reading from backup piece /home/oracle/app/flash_recovery_area/DG1/backupset/2016_10_17/o1_mf_nnndf_TAG20161017T141757_d08v8odb_.bkp
channel ORA_DISK_1: piece handle=/home/oracle/app/flash_recovery_area/DG1/backupset/2016_10_17/o1_mf_nnndf_TAG20161017T141757_d08v8odb_.bkp tag=TAG20161017T141757
channel ORA_DISK_1: restored block(s) from backup piece 1
channel ORA_DISK_1: block restore complete, elapsed time: 00:00:01
starting media recovery
media recovery complete, elapsed time: 00:00:03
Finished recover at 17-OCT-16
RMAN>
2)datafile 7 block 143:
SQL> select count() from tb_tmp;
select count() from tb_tmp
*
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 7, block # 143)
ORA-01110: data file 7: '/home/oracle/app/oradata/DG1/datafile/test02.dbf'
RMAN> blockrecover datafile 7 block 143;
Starting recover at 17-OCT-16
using channel ORA_DISK_1
channel ORA_DISK_1: restoring block(s)
channel ORA_DISK_1: specifying block(s) to restore from backup set
restoring blocks of datafile 00007
channel ORA_DISK_1: reading from backup piece /home/oracle/app/flash_recovery_area/DG1/backupset/2016_10_17/o1_mf_nnndf_TAG20161017T141757_d08v8odb_.bkp
channel ORA_DISK_1: piece handle=/home/oracle/app/flash_recovery_area/DG1/backupset/2016_10_17/o1_mf_nnndf_TAG20161017T141757_d08v8odb_.bkp tag=TAG20161017T141757
channel ORA_DISK_1: restored block(s) from backup piece 1
channel ORA_DISK_1: block restore complete, elapsed time: 00:00:00
starting media recovery
media recovery complete, elapsed time: 00:00:01
Finished recover at 17-OCT-16
RMAN>
3)datafile 7 block 153:
SQL> select count() from tb_tmp;
select count() from tb_tmp
*
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 7, block # 153)
ORA-01110: data file 7: '/home/oracle/app/oradata/DG1/datafile/test02.dbf'
RMAN> blockrecover datafile 7 block 153;
Starting recover at 17-OCT-16
using channel ORA_DISK_1
channel ORA_DISK_1: restoring block(s)
channel ORA_DISK_1: specifying block(s) to restore from backup set
restoring blocks of datafile 00007
channel ORA_DISK_1: reading from backup piece /home/oracle/app/flash_recovery_area/DG1/backupset/2016_10_17/o1_mf_nnndf_TAG20161017T141757_d08v8odb_.bkp
channel ORA_DISK_1: piece handle=/home/oracle/app/flash_recovery_area/DG1/backupset/2016_10_17/o1_mf_nnndf_TAG20161017T141757_d08v8odb_.bkp tag=TAG20161017T141757
channel ORA_DISK_1: restored block(s) from backup piece 1
channel ORA_DISK_1: block restore complete, elapsed time: 00:00:00
starting media recovery
media recovery complete, elapsed time: 00:00:03
Finished recover at 17-OCT-16
RMAN>
6.验证结果
1)此时,再次执行,结果正常:
SQL> select count(*) from tb_tmp;
COUNT(*)
1261568
2)再次查询v$database_block_corruption,无返回结果
SQL> select * from v$database_block_corruption;
no rows selected
SQL>
2.2.坏块的对象定位与影响
1.模拟坏块
1)查看表tb_tmp所占用的块数量:
SQL> col segment_name for a15
SQL> select segment_name,header_file,header_block,blocks
2 from dba_segments
3 where segment_name='TB_TMP'
4 and owner='SCOTT';
SEGMENT_NAME HEADER_FILE HEADER_BLOCK BLOCKS
TB_TMP 7 130 384
2)查看块511和512上的对象:(找边界值,块头130,一共384个块,则末尾块为514)
SQL> col segment_name for a20
SQL> col segment_type for a20
SQL> col tablespace_name for a30
SQL> select segment_name,segment_type,owner,tablespace_name
2 from dba_extents
3 where file_id=7
4 and 512 between block_id and block_id+blocks-1;
no rows selected
SQL> col segment_name for a20
SQL> col segment_type for a20
SQL> col tablespace_name for a30
SQL> select segment_name,segment_type,owner,tablespace_name
2 from dba_extents
3 where file_id=7
4 and 511 between block_id and block_id+blocks-1;
SEGMENT_NAME SEGMENT_TYPE OWNER TABLESPACE_NAME
TB_TMP TABLE SCOTT TBS_TEST1
SQL>
3)用linux的dd命令,破坏块153,创建实验用的坏块
[oracle@dg1 ~]$ dd of=/home/oracle/app/oradata/DG1/datafile/test02.dbf bs=8192 conv=notrunc seek=153<<EOF
corrupted block
EOF
0+1 records in
0+1 records out
16 bytes (16 B) copied, 3.5e-05 seconds, 457 kB/s
[oracle@dg1 ~]$
4)验证坏块
SQL> select * from v$database_block_corruption;
no rows selected
[oracle@dg1 datafile]$ rman target /
Recovery Manager: Release 11.2.0.1.0 - Production on Mon Oct 17 16:40:53 2016
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
connected to target database: DG1 (DBID=1900287250)
RMAN> backup validate datafile 7;
Starting backup at 17-OCT-16
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=43 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00007 name=/home/oracle/app/oradata/DG1/datafile/test02.dbf
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
List of Datafiles
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
7 FAILED 0 1113 21720 1228821
File Name: /home/oracle/app/oradata/DG1/datafile/test02.dbf
Block Type Blocks Failing Blocks Processed
Data 0 20296
Index 0 0
Other 1 311
validate found one or more corrupt blocks
See trace file /home/oracle/app/diag/rdbms/dg1/dg1/trace/dg1_ora_24978.trc for details
Finished backup at 17-OCT-16
RMAN>
SQL> select * from v$database_block_corruption;
FILE# BLOCK# BLOCKS CORRUPTION_CHANGE# CORRUPTIO
7 153 1 0 CORRUPT
SQL> select count(*) from tb_tmp;
COUNT(*)
57344
如下,清空buffer_cache:
SQL> alter system flush buffer_cache;
System altered.
SQL> select count() from tb_tmp;
select count() from tb_tmp
*
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 7, block # 153)
ORA-01110: data file 7: '/home/oracle/app/oradata/DG1/datafile/test02.dbf'
SQL>
2.查询坏块号上存放的对象
SQL> set lines 200 pages 1000
SQL> col segment_name for a20
SQL> col segment_type for a20
SQL> col tablespace_name for a30
SQL> select segment_name,segment_type,owner,tablespace_name
2 from dba_extents
3 where file_id=7
4 and 153 between block_id and block_id+blocks-1;
SEGMENT_NAME SEGMENT_TYPE OWNER TABLESPACE_NAME
TB_TMP TABLE SCOTT TBS_TEST1
SQL>
3.对于损坏的数据文件,缺省情况下,不能对其进行备份,如下:
[oracle@dg1 datafile]$ rman target /
Recovery Manager: Release 11.2.0.1.0 - Production on Mon Oct 17 16:57:31 2016
Copyright (c) 1982, 2009, Oracle and/or its affiliates. All rights reserved.
connected to target database: DG1 (DBID=1900287250)
RMAN> backup datafile 7 tag='corrupted';
Starting backup at 17-OCT-16
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=59 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00007 name=/home/oracle/app/oradata/DG1/datafile/test02.dbf
channel ORA_DISK_1: starting piece 1 at 17-OCT-16
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on ORA_DISK_1 channel at 10/17/2016 16:57:51
ORA-19566: exceeded limit of 0 corrupt blocks for file /home/oracle/app/oradata/DG1/datafile/test02.dbf
RMAN>
4.设定允许损坏的块数量之后,才能进行备份
RMAN> run{
2> set maxcorrupt for datafile 7 to 1;
3> backup datafile 7 tag='corruption';
4> }
executing command: SET MAX CORRUPT
using target database control file instead of recovery catalog
Starting backup at 17-OCT-16
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=36 device type=DISK
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00007 name=/home/oracle/app/oradata/DG1/datafile/test02.dbf
channel ORA_DISK_1: starting piece 1 at 17-OCT-16
channel ORA_DISK_1: finished piece 1 at 17-OCT-16
piece handle=/home/oracle/app/flash_recovery_area/DG1/backupset/2016_10_17/o1_mf_nnndf_CORRUPTION_d094stkg_.bkp tag=CORRUPTION comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
Finished backup at 17-OCT-16
RMAN>
5.查看备份信息,应在修复坏块后重新备份以避免由于保留策略导致先前可用的备份被aged out
RMAN> list backup summary;
List of Backups
Key TY LV S Device Type Completion Time #Pieces #Copies Compressed Tag
1 B F A DISK 13-OCT-16 1 1 NO HEALTH
2 B F A DISK 14-OCT-16 1 1 NO HEALTH1
3 B F A DISK 17-OCT-16 1 1 NO TAG20161017T141757
4 B F A DISK 17-OCT-16 1 1 NO TAG20161017T163651
5 B F A DISK 17-OCT-16 1 1 NO CORRUPTION
RMAN>
3.索引坏块处理流程
3.3.1.处理流程简介
1)一般情况下,是在rman备份时,报有坏块的错误
2)检查v$database_block_corruption,查看坏块是在哪个块上
3)若v$database_block_corruption没有返回值,则rman命令执行validate datafile xxx,然后再次查看v$database_block_corruption,找到坏块号
4)查询dba_extents,找到坏块上存放的对象
5)根据实际情况,做下一步处理
3.2.具体处理流程
3.2.1.新建表tb_tmp,并创建索引,添加not null约束
SQL> show user
USER is "SCOTT"
SQL> create table tb_tmp tablespace tbs_test1 as select * from emp;
Table created.
SQL>
SQL> create index ind_empno on tb_tmp(empno) tablespace tbs_test1;
Index created.
SQL> alter table tb_tmp modify empno not null;
Table altered.
3.2.2.查看表tb_tmp和索引ind_empno所占用的文件号和块号
1)表:
SQL> col segment_name for a15
SQL> select segment_name,header_file,header_block,blocks
2 from dba_segments
3 where segment_name='TB_TMP'
4 and owner='SCOTT';
SEGMENT_NAME HEADER_FILE HEADER_BLOCK BLOCKS
TB_TMP 7 538 8
SQL> set lines 200 pages 1000
SQL> col segment_name for a20
SQL> col segment_type for a20
SQL> col tablespace_name for a30
SQL> select segment_name,segment_type,owner,tablespace_name
2 from dba_extents
3 where file_id=7
4 and 543 between block_id and block_id+blocks-1;
SEGMENT_NAME SEGMENT_TYPE OWNER TABLESPACE_NAME
TB_TMP TABLE SCOTT TBS_TEST1
SQL>
2)索引:
SQL> col segment_name for a15
SQL> select segment_name,header_file,header_block,blocks
2 from dba_segments
3 where segment_name='IND_EMPNO'
4 and owner='SCOTT';
SEGMENT_NAME HEADER_FILE HEADER_BLOCK BLOCKS
IND_EMPNO 7 546 8
SQL> set lines 200 pages 1000
SQL> col segment_name for a20
SQL> col segment_type for a20
SQL> col tablespace_name for a30
SQL> select segment_name,segment_type,owner,tablespace_name
2 from dba_extents
3 where file_id=7
4 and 551 between block_id and block_id+blocks-1;
SEGMENT_NAME SEGMENT_TYPE OWNER TABLESPACE_NAME
IND_EMPNO INDEX SCOTT TBS_TEST1
SQL>
3.2.3.破坏1个索引块,创造实验环境
1)破坏1个索引块524:
[oracle@dg1 datafile]$ dd of=/home/oracle/app/oradata/DG1/datafile/test02.dbf bs=8192 conv=notrunc seek=547 <<EOF
corrupted index block
EOF
0+1 records in
0+1 records out
22 bytes (22 B) copied, 0 seconds, Infinity B/s
[oracle@dg1 datafile]$
2)查看v$database_block_corruption,无返回结果
SQL> select * from v$database_block_corruption;
no rows selected
3)rman校验数据文件7
RMAN> backup validate datafile 7;
Starting backup at 18-OCT-16
using channel ORA_DISK_1
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00007 name=/home/oracle/app/oradata/DG1/datafile/test02.dbf
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
List of Datafiles
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
7 FAILED 0 1113 21720 1256351
File Name: /home/oracle/app/oradata/DG1/datafile/test02.dbf
Block Type Blocks Failing Blocks Processed
Data 0 20273
Index 0 13
Other 1 321
validate found one or more corrupt blocks
See trace file /home/oracle/app/diag/rdbms/dg1/dg1/trace/dg1_ora_3046.trc for details
Finished backup at 18-OCT-16
RMAN>
4)再次查看v$database_block_corruption,返回1条结果
SQL> select * from v$database_block_corruption;
FILE# BLOCK# BLOCKS CORRUPTION_CHANGE# CORRUPTIO
7 547 1 0 CORRUPT
SQL>
3.2.4.用rman尝试备份操作,报错有坏块
RMAN> backup datafile 7;
Starting backup at 18-OCT-16
using channel ORA_DISK_1
channel ORA_DISK_1: starting full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00007 name=/home/oracle/app/oradata/DG1/datafile/test02.dbf
channel ORA_DISK_1: starting piece 1 at 18-OCT-16
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on ORA_DISK_1 channel at 10/18/2016 09:26:43
ORA-19566: exceeded limit of 0 corrupt blocks for file /home/oracle/app/oradata/DG1/datafile/test02.dbf
RMAN>
3.2.5.确认坏块号,以及坏块上的对象
1)确认坏块号
SQL> select * from v$database_block_corruption;
FILE# BLOCK# BLOCKS CORRUPTION_CHANGE# CORRUPTIO
7 547 1 0 CORRUPT
SQL>
2)确认坏块上的对象
SQL> set lines 200 pages 1000
SQL> col segment_name for a20
SQL> col segment_type for a20
SQL> col tablespace_name for a30
SQL> select segment_name,segment_type,owner,tablespace_name
2 from dba_extents
3 where file_id=7
4 and 547 between block_id and block_id+blocks-1;
SEGMENT_NAME SEGMENT_TYPE OWNER TABLESPACE_NAME
IND_EMPNO INDEX SCOTT TBS_TEST1
SQL>
3.2.6.查询表,看是否走索引
1)在做dd命令,破坏索引块547之前,单独查索引列,走索引
SQL> set autotrace traceonly
SQL> select empno from tb_tmp;
14 rows selected.
Execution Plan
Plan hash value: 474812530
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 14 | 56 | 1 (0)| 00:00:01 |
| 1 | INDEX FULL SCAN | IND_EMPNO | 14 | 56 | 1 (0)| 00:00:01 |
Statistics
24 recursive calls
0 db block gets
5 consistent gets
0 physical reads
0 redo size
686 bytes sent via SQL*Net to client
523 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
14 rows processed
SQL>
2)做dd命令,破坏索引块547之后,单独查索引列,报坏块的错误
SQL> set autotrace traceonly
SQL> select empno from tb_tmp;
select empno from tb_tmp
*
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 7, block # 547)
ORA-01110: data file 7: '/home/oracle/app/oradata/DG1/datafile/test02.dbf'
3)查询count(),仍然报有坏块的错误
SQL> select count() from tb_tmp;
select count(*) from tb_tmp
*
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 7, block # 131)
ORA-01110: data file 7: '/home/oracle/app/oradata/DG1/datafile/test02.dbf'
4)Select * from tb_tmp,查询正常
SQL> select * from tb_tmp;
14 rows selected.
Execution Plan
Plan hash value: 3935752189
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 14 | 1218 | 3 (0)| 00:00:01 |
| 1 | TABLE ACCESS FULL| TB_TMP | 14 | 1218 | 3 (0)| 00:00:01 |
Note
- dynamic sampling used for this statement (level=2)
Statistics
147 recursive calls
0 db block gets
22 consistent gets
0 physical reads
0 redo size
1630 bytes sent via SQL*Net to client
523 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
14 rows processed
SQL> set autotrace off
3.2.7.结论
在破坏索引块547,之后,凡是查询需要用到索引的,都会报坏块的错误;
但是,如果查询不需要用到损坏的索引列,如查其他列,则不会报坏块的错误。
如下:
SQL> select empno from tb_tmp where empno=7900;
select empno from tb_tmp where empno=7900
*
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 7, block # 131)
ORA-01110: data file 7: '/home/oracle/app/oradata/DG1/datafile/test02.dbf'
SQL> select ename from tb_tmp;
ENAME
SMITH
ALLEN
WARD
JONES
MARTIN
BLAKE
CLARK
SCOTT
KING
TURNER
ADAMS
JAMES
FORD
MILLER
14 rows selected.
SQL>
3.2.8.重建索引,查看坏块是否还存在,查看坏块上此时存放的对象
1)重建索引列
SQL> alter index ind_empno rebuild online;
Index altered.
2)查看坏块547是否还存在
SQL> select * from v$database_block_corruption;
FILE# BLOCK# BLOCKS CORRUPTION_CHANGE# CORRUPTIO
7 547 1 0 CORRUPT
3)查看这个坏块上存放的对象
SQL> set lines 200 pages 1000
SQL> col segment_name for a20
SQL> col segment_type for a20
SQL> col tablespace_name for a30
SQL> select segment_name,segment_type,owner,tablespace_name
2 from dba_extents
3 where file_id=7
4 and 547 between block_id and block_id+blocks-1;
no rows selected
SQL>
4)再次查询索引列
SQL> set autotrace traceonly
SQL> select empno from tb_tmp;
14 rows selected.
Execution Plan
Plan hash value: 474812530
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 14 | 56 | 1 (0)| 00:00:01 |
| 1 | INDEX FULL SCAN | IND_EMPNO | 14 | 56 | 1 (0)| 00:00:01 |
Statistics
1 recursive calls
0 db block gets
2 consistent gets
0 physical reads
0 redo size
686 bytes sent via SQL*Net to client
523 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
14 rows processed
SQL> set autotrace off
3.2.9.新建测试表,并不断插入数据,直到坏块被重新使用(期间要关注表空间的使用率,不要达到100%)
1)在tbs_test1表空间上创建表test4
SQL> show user
USER is "SCOTT"
SQL> create table test4 tablespace tbs_test1 as select * from emp;
Table created.
2)查看坏块是否被重新写入数据(刚好,上边test4的重建操作,使坏块被重新使用了)
SQL> set lines 200 pages 1000
SQL> col segment_name for a20
SQL> col segment_type for a20
SQL> col tablespace_name for a30
SQL> select segment_name,segment_type,owner,tablespace_name
2 from dba_extents
3 where file_id=7
4 and 547 between block_id and block_id+blocks-1;
SEGMENT_NAME SEGMENT_TYPE OWNER TABLESPACE_NAME
TEST4 TABLE SCOTT TBS_TEST1
SQL>
3)查看表空间使用率
SQL> set lines 200 pages 1000
SQL> col tablespace_name for a30
SQL> SELECT A.TABLESPACE_NAME as "tablespace",
2 A.BYTES / 1024 / 1024 as "total(m)",
3 C.BYTES / 1024 / 1024 as "free(m)",
4 B.BYTES / 1024 / 1024 as "used(m)",
5 (B.BYTES * 100) / A.BYTES as "used%",
6 (C.BYTES * 100) / A.BYTES as "free%"
7 FROM SYS.SM$TS_AVAIL A, SYS.SM$TS_USED B, SYS.SM$TS_FREE C
8 WHERE A.TABLESPACE_NAME = B.TABLESPACE_NAME
9 AND A.TABLESPACE_NAME = C.TABLESPACE_NAME;
tablespace total(m) free(m) used(m) used% free%
TBS_TEST1 50 48.875 .125 .25 97.75
SYSAUX 530 32.25 496.75 93.7264151 6.08490566
UNDOTBS1 295 250.5625 43.4375 14.7245763 84.9364407
USERS 215 180.9375 33.125 15.4069767 84.1569767
SYSTEM 680 .125 678.875 99.8345588 .018382353
TEST1 50 48.875 .125 .25 97.75
6 rows selected.
SQL>
4)查看v$database_block_corruption,仍有之前的那一条返回结果
SQL> select * from v$database_block_corruption;
FILE# BLOCK# BLOCKS CORRUPTION_CHANGE# CORRUPTIO
7 131 1 0 CORRUPT
5)用rman命令,校验数据文件7
RMAN> validate datafile 7;
Starting validate at 18-OCT-16
using channel ORA_DISK_1
channel ORA_DISK_1: starting validation of datafile
channel ORA_DISK_1: specifying datafile(s) for validation
input datafile file number=00007 name=/home/oracle/app/oradata/DG1/datafile/test02.dbf
channel ORA_DISK_1: validation complete, elapsed time: 00:00:01
List of Datafiles
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
7 OK 0 6261 6400 1274555
File Name: /home/oracle/app/oradata/DG1/datafile/test02.dbf
Block Type Blocks Failing Blocks Processed
Data 0 1
Index 0 2
Other 0 136
Finished validate at 18-OCT-16
RMAN>
6)再次查看v$database_block_corruption,无返回结果
SQL> select * from v$database_block_corruption;
no rows selected
7)查询表tb_tmp的empno列,此时正常走索引
SQL> set autotrace traceonly
SQL> select empno from tb_tmp;
14 rows selected.
Execution Plan
Plan hash value: 474812530
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 14 | 182 | 1 (0)| 00:00:01 |
| 1 | INDEX FULL SCAN | IND_EMPNO | 14 | 182 | 1 (0)| 00:00:01 |
Note
- dynamic sampling used for this statement (level=2)
Statistics
0 recursive calls
0 db block gets
2 consistent gets
0 physical reads
0 redo size
686 bytes sent via SQL*Net to client
523 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
14 rows processed
SQL> set autotrace off
4.总结
1.如果rman或者nbu备份报坏块错误时,可以先查看v$database_block_corruption,定位到坏块是哪个文件的哪个数据块
SQL> select * from v$database_block_corruption;
2.如果上一步查不到,可以先校验数据文件,使用rman的backup validate datafile xx;来定位报错块
RMAN> validate datafile 7;
3.查到了坏块的file_id和block_id,则可以根据这两个值,定位这个坏块上存放的对象是什么
set lines 200 pages 1000
col segment_name for a20
col segment_type for a20
col tablespace_name for a30
select segment_name,segment_type,owner,tablespace_name
from dba_extents
where file_id=7
and 131 between block_id and block_id+blocks-1;
4.逻辑坏块的对象,是可以drop掉的
5.如果是逻辑坏块,可以通过drop等方式释放掉坏块对象所占用的空间,然后重新使用这块空间,就可以消除坏块的报错
6.如果是物理坏块,则可以通过cp数据文件的方式,来进一步确认

浙公网安备 33010602011771号