1? 环境配置
1.1???? 主机配置
这里我采用的是VMWARE WORKSTATION 10.0.1平台下的操作系统Centos5.8来进行安装
| 主机类别 |
IP |
DB_NAME |
DB_UNIQUE_NAME |
Net Service Name |
| 主库 |
192.168.2.128 |
Dg1 |
Dg1 |
Dg1 |
| 备库 |
192.168.2.129 |
Dg1 |
?DG2 |
?DG2 |
1.2?? 检查数据库是否支持Data Guard(企业版才支持),是否归档模式,Enable force logging
SQL> select * from v$option where parameter = 'Managed Standby';
需要确认主库处于归档模式
若不是需要按照以下步骤进行修改
1、startup mount
2、alter database archivelog;
3、alter database open;
4、将primary数据库置为FORCE LOGGING模式
alter database force logging;??(强制产生日志)
1.3?? 如果主库没有密码文件则建立密码文件,从而可以OS验证的方式登陆
$ orapwd file=$ORACLE_HOME/dbs/orapw$ORACLE_SID password=123456 entries=5
1.4?? 配置standby redolog(最佳性能模式可以忽略,如果将来变成备库且要转为其它两种模式则要建立)
alter database add standby logfile
group 4 ('/oracle/oradata/dg1/redo04.log') size 50m,
group 5 ('/oracle/oradata/dg1/redo05.log ') size 50m,
group 6 ('/oracle/oradata/ dg1/redo06.log ') size 50m,
group 7 ('/oracle/oradata/ dg1/redo07.log ') size 50m;
standby redolog的组数参考公式:(online redolog组数 + 1) * 数据库线程数;单机线程数为1,RAC一般为2。standby redolog的组成员数和大小也尽量和online redolog一样。
1.5?? 设置主库初始化参数
$ sqlplus '/as sysdba'
SQL> create pfile from spfile;??(备份参数文件)
SQL> alter system set LOG_ARCHIVE_CONFIG='DG_CONFIG=(dg1,dg2)' scope=spfile;????????(启动db接受或发送redo data,包括所有库的db_unique_name)
SQL> alter system set LOG_ARCHIVE_DEST_1='LOCATION=/oracle/archivelog/dg1 VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=dg1' scope=spfile;???(主库归档目的地)
SQL> alter system set LOG_ARCHIVE_DEST_2='SERVICE=dg2 LGWR ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=dg2' scope=spfile;????(当该库充当主库角色时,设置物理备库redo data的传输目的地)
SQL> alter system set LOG_ARCHIVE_MAX_PROCESSES=5 scope=spfile;??(最大ARCn进程数)可省略
SQL> alter system set LOG_ARCHIVE_DEST_STATE_1=ENABLE scope=spfile;????(允许redo传输服务传输数据到目的地,默认是enable)
SQL> alter system set LOG_ARCHIVE_DEST_STATE_2=ENABLE scope=spfile;????(同上)
SQL> alter system set REMOTE_LOGIN_PASSWORDFILE=EXCLUSIVE scope=spfile;????(exclusive or shared,所有库sys密码要一致,默认是exclusive可省略)
--以下是主库切换为备库,充当备库角色时的一些参数设置,如果不打算做数据库切换就不用设置了
SQL> alter system set FAL_SERVER=dg2 scope=spfile;????????(配置网络服务名,假如转换为备库角色时,从这里获取丢失的归档文件)
SQL> alter system set FAL_CLIENT=dg1 scope=spfile;????????(配置网络服务名,fal_server拷贝丢失的归档文件到这里)
SQL> alter system set STANDBY_FILE_MANAGEMENT=auto scope=spfile;????????(auto后当主库的datafiles增删时备库也同样自动操作,且会把日志传送到备库standby_archive_dest参数指定的目录下,确保该目录存在,如果你的存储采用文件系统没有问题,但是如果采用了裸设备,你就必须将该参数设置为manual)
SQL> alter system set STANDBY_ARCHIVE_DEST='LOCATION=/oracle/archivelog/dg1' scope=spfile;????(有了以上参数设置,则无论该库充当主库角色还是备库角色都无需再修改了。
下面附上实际文档/oracle/product/10.2.0/db_1/dbs/initdg1.ora
dg1.__db_cache_size=92274688
dg1.__java_pool_size=4194304
dg1.__large_pool_size=4194304
dg1.__shared_pool_size=62914560
dg1.__streams_pool_size=0
*.audit_file_dest='/oracle/admin/dg1/adump'
*.background_dump_dest='/oracle/admin/dg1/bdump'
*.compatible='10.2.0.1.0'
*.control_files='/oracle/oradata/dg1/control01.ctl','/oracle/oradata/dg1/control02.ctl','/oracle/oradata/dg1/control03.ctl'
*.core_dump_dest='/oracle/admin/dg1/cdump'
*.db_block_size=8192
*.db_domain=''
*.db_file_multiblock_read_count=16
*.db_file_name_convert='dg2','dg1'
*.db_name='dg1'
*.db_recovery_file_dest='/oracle/flash_recovery_area'
*.db_recovery_file_dest_size=2147483648
*.dispatchers='(PROTOCOL=TCP) (SERVICE=dg1XDB)'
*.fal_client='DG1'
*.fal_server='DG2'
*.job_queue_processes=10
*.log_archive_config='DG_CONFIG=(dg1,dg2)'
*.log_archive_dest_1='LOCATION=/oracle/archivelog/dg1 VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=dg1'
*.log_archive_dest_2='SERVICE=dg2 LGWR ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=dg2'
*.log_archive_dest_state_1='ENABLE'
*.log_archive_dest_state_2='ENABLE'
*.log_archive_format='%t_%s_%r.dbf'
*.log_archive_max_processes=5
*.log_file_name_convert='dg2','dg1'
*.open_cursors=300
*.pga_aggregate_target=16777216
*.processes=150
*.remote_login_passwordfile='EXCLUSIVE'
*.sga_target=167772160
*.standby_archive_dest='LOCATION=/oracle/archivelog/dg1'
*.standby_file_management='AUTO'
*.undo_management='AUTO'
*.undo_tablespace='UNDOTBS1'
*.user_dump_dest='/oracle/admin/dg1/udump'
然后重启数据库:
SQL> shutdown immediate
SQL> startup;
1.6?? 备份主库数据文件
关闭应用服务器,停止监听:
$ lsnrctl stop
RMA方式备份:
$ rman target /
RMAN> backup full database format ‘/oracle/backup/backup_%T_%s_%p.bak’;
RMAN>sql “alter system archive log current”;
RMAN>backup archivelog all format ‘/oracle/backup/arch_%T_%s_%p.bak’;
若指定目录不存在,则手动创建
1.7?? 在主库上建立备库控制文件(控制文件通常需要有多份,手工将文件复制几份)
$ sqlplus '/as sysdba'
SQL> alter database create standby controlfile as '/oracle/backup/stdby_control01.ctl';
$ cd /oracle/backup
$ cp stdby_control01.ctl? stdby_control02.ctl
$ cp stdby_control01.ctl? stdby_control03.ctl
为备库准备init参数
$ sqlplus '/as sysdba'
SQL> create pfile = '/oracle/backup/initdg2.ora' from spfile;
$ cd /oracle/backup
$ vi initdg2.ora
注意主备库不同角色的属性配置,注意文件路径等,注意db_name要和主库一致,主要是以下参数:
----
control_files='/oracle/oradata/DG2/ stdby_control01.ctl ','/oracle/oradata/DG2/ stdby_control02.ctl ', '/oracle/oradata/DG2/ stdby_control03.ctl '
db_unique_name=dg2
log_archive_config='DG_CONFIG=(dg2,dg1)'
log_archive_dest_1='LOCATION=/oracle/archivelog /dg2 VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=dg2'
log_archive_dest_2='SERVICE=dg1 LGWR ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=dg1'
LOG_ARCHIVE_DEST_STATE_1=ENABLE
LOG_ARCHIVE_DEST_STATE_2=ENABLE
fal_client=dg2
fal_server=dg1
db_file_name_convert=’ dg1’,’ dg2’ (主备目录必须一致,否则这里需要写上路径)
log_file_name_convert=’dg1’,’dg2’(同上)
STANDBY_FILE_MANAGEMENT=AUTO
其他admin目录下更换实际即可
下面贴上实际initdg2.ora文档
dg2.__db_cache_size=88080384
dg2.__java_pool_size=4194304
dg2.__large_pool_size=4194304
dg2.__shared_pool_size=67108864
dg2.__streams_pool_size=0
*.audit_file_dest='/oracle/admin/dg2/adump'
*.background_dump_dest='/oracle/admin/dg2/bdump'
*.compatible='10.2.0.1.0'
*.control_files='/oracle/oradata/dg2/stdby_control01.ctl','/oracle/oradata/dg2/stdby_control02.ctl','/oracle/oradata/dg2/stdby_control03.ctl'
*.core_dump_dest='/oracle/admin/dg2/cdump'
db_unique_name=dg2
*.db_block_size=8192
*.db_domain=''
*.db_file_multiblock_read_count=16
*.db_file_name_convert='dg1','dg2'
*.db_name='dg1'
*.db_recovery_file_dest='/oracle/flash_recovery_area'
*.db_recovery_file_dest_size=2147483648
*.dispatchers='(PROTOCOL=TCP) (SERVICE=dg1XDB)'
*.fal_client='DG2'
*.fal_server='DG1'
*.job_queue_processes=10
*.log_archive_config='DG_CONFIG=(dg2,dg1)'
*.log_archive_dest_1='LOCATION=/oracle/archivelog/dg2 VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=dg2'
*.log_archive_dest_2='SERVICE=dg1 LGWR ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=dg1'
*.log_archive_dest_state_1='ENABLE'
*.log_archive_dest_state_2='ENABLE'
*.log_archive_format='%t_%s_%r.dbf'
*.log_archive_max_processes=5
*.log_file_name_convert='dg1','dg2'
*.open_cursors=300
*.pga_aggregate_target=16777216
*.processes=150
*.remote_login_passwordfile='EXCLUSIVE'
*.sga_target=167772160
*.standby_archive_dest='LOCATION=/oracle/archivelog/dg2'
*.standby_file_management='AUTO'
*.undo_management='AUTO'
*.undo_tablespace='UNDOTBS1'
*.user_dump_dest='/oracle/admin/dg2/udump'
1.8?? 拷贝上面生成的文件backup_%T.bak、stdby_control01/02/03.ctl、initdg2.ora到备库所在主机
注意rman备份的文件在主备库主机上目录要一致。
$ scp *.bak 192.168.2.129:/oracle/backup/
$ scp initdg2.ora 192.168.2.129:$ORACLE_HOME/dbs/
$ scp ORCL*.ctl 192.168.2.129:/oracle/oradata/dg2/
建立主库监听和主备库的网络服务名(必须是dedicated的),并启动监听
$ lsnrctl stop
cd $ORACLE_HOME/network/admin
vi listener.ora
SID_LIST_LISTENER =
(SID_LIST =
(SID_DESC =
(SID_NAME = PLSExtProc)
(ORACLE_HOME = /oracle/product/10.2.0/db_1)
(PROGRAM = extproc)
)
(SID_DESC =
(SID_NAME = DG1)
(ORACLE_HOME = /oracle/product/10.2.0/db_1)
(GLOBAL_DBNAME = DG1)
)
)
LISTENER =
(DESCRIPTION_LIST =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = linux128)(PORT = 1521))
(ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC0))
)
)
Vi tnsnames.ora
EXTPROC_CONNECTION_DATA =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC0))
)
(CONNECT_DATA =
(SID = PLSExtProc)
(PRESENTATION = RO)
)
)
DG1 =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST =192.168.2.128)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = dg1)
)
)
DG2 =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST =192.168.2.129)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = dg2)
)
)
$ tnsping dg1
$ tnsping dg2???(
将listener.ora tnsname.ora传至备机相应目录,修改listener.ora中dg1为dg2,以及HOST即可。
2??? 建立备库
2.1?? 设置环境变量并建立备库一些必需目录
$ export ORACLE_BASE=/oracle
$ export ORACLE_HOME= /oracle/product/11.2.0/db_1
$ export ORACLE_SID=dg2
$ mkdir -p $ORACLE_BASE/admin/$ORACLE_SID/adump
--在11g环境下如下目录可以不建
$ mkdir -p $ORACLE_BASE/admin/$ORACLE_SID/bdump
$ mkdir -p $ORACLE_BASE/admin/$ORACLE_SID/cdump
$ mkdir -p $ORACLE_BASE/admin/$ORACLE_SID/udump
--以下目录要看哪些地方可能会存放数据库文件,注意不能少建
$ mkdir -p /oracle/oradata/$ORACLE_SID
2.2?? 在备库主机上生成密码文件,且sys密码和主库得一致
$ orapwd file=$ORACLE_HOME/dbs/orapw$ORACLE_SID password=123456 entries=5
--此时如果还是有问题,可以从主库上把密码文件拷贝过来
2.3、在备库上建立监听,和主备库网络服务名(必须是dedicated的),并启动监听
$ netca????(是图形界面,或者手工从别的库把listener.ora和tnsnames.ora拷过来修改也行)
$ lsnrctl start
$ tnsping dg1
$ tnsping dg2
2.3?? 在备库上建立spfile
$ sqlplus '/as sysdba'
SQL> create spfile from pfile;
如果pfile没有放到$ORACLE_HOME/dbs/下,而是放在别的位置:
SQL> create spfile from pfile='/oracle/product/10.2/db_1/dbs/initdg2.ora';
2.4?? 启动物理备库
SQL> startup nomount
SQL> alter database mount standby database;
2.5?? 备库做数据恢复
RMAN恢复:
$ rman target /???????(要求主备库rman备份文件的存放路径和文件名一致)
RMAN> restore database;
RMAN> restore archivelog all;
介质恢复后,rman?自动将standby 数据库打开到mount 状态。
2.6?? 配置standby redolog(最佳性能模式可以忽略,如果要转为其它两种模式则要建立)
SQL> alter database add standby logfile
group 4 ('/oracle/oradata/dg2/redo04.log') size 50m,
group 5 ('/oracle/oradata/dg2/redo05.log ') size 50m,
group 6 ('/oracle/oradata/dg2/redo06.log ') size 50m,
group 7 ('/oracle/oradata/dg2/redo07.log ') size 50m;
standby redolog的组数参考公式:(online redolog组数 + 1) * 数据库线程数;单机线程数为1,RAC一般为2。
standby redolog的组成员数和大小也尽量和online redolog一样。
2.7?? 在备库上,启动redo apply
SQL> alter database recover managed standby database disconnect from session;
到此物理备库创建完毕!
3??? 主备库各参数文件内容
3.1???? 主库
listener.ora
SID_LIST_LISTENER =
(SID_LIST =
(SID_DESC =
(SID_NAME = PLSExtProc)
(ORACLE_HOME = /oracle/product/10.2.0/db_1)
(PROGRAM = extproc)
)
(SID_DESC =
(SID_NAME = dg2)
(ORACLE_HOME = /oracle/product/10.2.0/db_1)
(GLOBAL_DBNAME = dg2)
)
)
LISTENER =
(DESCRIPTION_LIST =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = linux129)(PORT = 1521))
(ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC0))
)
)
tnsnames.ora
# Generated by Oracle configuration tools.
DG1 =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST =192.168.2.128)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = dg1)
)
)
DG2 =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST =192.168.2.129)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = DG2)
)
)
EXTPROC_CONNECTION_DATA =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC1))
)
(CONNECT_DATA =
(SID = PLSExtProc)
(PRESENTATION = RO)
)
)
3.2???? 备库
listener.ora
SID_LIST_LISTENER =
(SID_LIST =
(SID_DESC =
(SID_NAME = PLSExtProc)
(ORACLE_HOME = /oracle/product/10.2.0/db_1)
(PROGRAM = extproc)
)
(SID_DESC =
(SID_NAME = dg2)
(ORACLE_HOME = /oracle/product/10.2.0/db_1)
(GLOBAL_DBNAME = dg2)
)
)
LISTENER =
(DESCRIPTION_LIST =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = linux129)(PORT = 1521))
(ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC0))
)
)
tnsnames.ora
EXTPROC_CONNECTION_DATA =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC0))
)
(CONNECT_DATA =
(SID = PLSExtProc)
(PRESENTATION = RO)
)
)
DG1 =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST =192.168.2.128)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = dg1)
)
)
DG2 =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST =192.168.2.129)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = dg2)
)
)
4??? ?测试是否成功
4.1??? 主库归档前
SQL> archive log list;
Database log mode??????????????Archive Mode
Automatic archival?????????????Enabled
Archive destination???????????/oracle/archivelog/dg1
Oldest online log sequence?????6
Next log sequence to archive???8
Current log sequence???????????8
此时备库:
SQL> archive log list;
Database log mode??????????????Archive Mode
Automatic archival?????????????Enabled
Archive destination????????????/oracle/archivelog/dg2
Oldest online log sequence?????0
Next log sequence to archive???0
Current log sequence???????????8
4.2????? 主库归档后
SQL> ALTER SYSTEM SWITCH LOGFILE;????--对单实例数据库或RAC中的当前实例执行日志切换
SQL> ALTER SYSTEM ARCHIVE LOG CURRENT;????--对数据库中的所有实例执行日志切换
SQL> archive log list;
Database log mode??????????????Archive Mode
Automatic archival?????????????Enabled
Archive destination????????????/orahome/arch1/WENDING
Oldest online log sequence?????7
Next log sequence to archive???9
Current log sequence???????????9
此时备库:
SQL> archive log list;
Database log mode??????????????Archive Mode
Automatic archival?????????????Enabled
Archive destination????????????/orahome/arch1/PHYSTDBY
Oldest online log sequence?????0
Next log sequence to archive???0
Current log sequence???????????9
两者的Current log sequence一致,说明成功
备库上:
SQL>select process,status from v$managed_standby;
SQL> select process,status from v$managed_standby;
PROCESS ?? ???STATUS
------------------ ------------------------
ARCH?????????? ?? CONNECTED
ARCH?????????? ?? CONNECTED
ARCH?????????? ?? CONNECTED
ARCH?????????? ?? CONNECTED
ARCH?????????? ?? CONNECTED
RFS????????????? ?? IDLE
RFS????????????? ?? IDLE
MRP0?????????? ?? WAIT_FOR_LOG
RFS????????????? ?? IDLE
9 rows selected.
RFS? ------远程文件接受进程
MRP0-------日志应用进程
4.3???? 主库插入数据测试
主库插入数据后
SQL> commit;
Commit complete.
SQL> alter system switch logfile;
System altered.
对于物理备库(physical standby)来说,处于日志恢复模式的时候数据库是不能打开的。如果要打开只能先关闭日志恢复模式,然后以read only (只读方式)打开。
(oracle11G版本可以在read only下应用归档日志)
SQL>alter database recover managed standby database cancel;
停止日志恢复模式
SQL>alter database open read only;
打开数据库。
这样在主库中插入的数据,此时可以查看到,若想恢复mount状态
> alter database recover managed standby database disconnect from session;
> select status from v$instance;(回到mount状态)
4.4???? ?主备库切换
4.4.1?Switchover
一般SWITCHOVER切换都是计划中的切换,特点是在切换后,不会丢失任何的数据,而且这个过程是可逆的,整个DATA?GUARD环境不会被破坏,原来DATA?GUARD环境中的所有物理和逻辑STANDBY都可以继续工作。
在进行DATA?GUARD的物理STANDBY切换前需要注意:
1)确认主库和从库间网络连接通畅;
2)确认没有活动的会话连接在数据库中;
3)PRIMARY数据库处于打开的状态,STANDBY数据库处于MOUNT状态;
4)确保STANDBY数据库处于ARCHIVELOG模式;
5)如果设置了REDO应用的延迟,那么将这个设置去掉;
6)确保配置了主库和从库的初始化参数,使得切换完成后,DATA?GUARD机制可以顺利的运行。??
主库:
- 查看switchover?状态
SQL>?SELECT?SWITCHOVER_STATUS?FROM?V$DATABASE;
SWITCHOVER_STATUS
--------------------
TO?STANDBY
附:?A:switchover_status出现session?active/not?allowed
当出现session?active的时候表示还有活动的session,则运行
SQL> select count(*) from v$session where username is not null;
COUNT(*)
1
虽然当前数据库的状态是SESSION ACTIVE ,而不是TO STANDBY,但是查询V$SESSION会话,确认除了当前会话外,其他都是系统会话,那么就可以在主库进行SWITCHOVER切换了。
SQL> alter database commit to switchover to physical standby;
Database altered.
2 启动到mount和应用日志状态
SQL>?SHUTDOWN?IMMEDIATE
SQL>?startup?mount;
- 查看数据库模式
SQL>select?dest_name,status,database_mode,recovery_mode,protection_mode?from?v$archive_dest_status;
SQL>select?status,database_mode?from?v$archive_dest_status;
备库:?
1.查看switchover状态
SQL>?SELECT?SWITCHOVER_STATUS?FROM?V$DATABASE;
TO?PRIMARY
附:若不是用此语句切换:ALTER?DATABASE?COMMIT?TO?SWITCHOVER?TO?PRIMARY
补充:若出现:ORA-16139:?media?recovery?required
是因为没有执行:alter?database?recover?managed?standby?database?disconnect?from?session;
- 切换成主库
SQL>?ALTER?DATABASE?COMMIT?TO?SWITCHOVER?TO?PRIMARY;
Database?altered.
SQL>?shutdown?immediate;
SQL>?startup;
SQL>?alter?system?switch?logfile;
- 查看数据库模式
SQL>select?dest_name,status,database_mode,recovery_mode,protection_mode?from?v$archive_dest_status;
SQL>select?status,database_mode?from?v$archive_dest_status;
验证同步:
SQL>?select?max(sequence#)?from?v$archived_log;
MAX(SEQUENCE#)
--------------
78
4.4.2? ?Failovers
FAILOVER切换一般是PRIMARY数据库发生故障后的切换,这种情况是STANDBY数据库发挥其作用的情况。这种切换发生后,可能会造成数据的丢失。而且这个过程不是可逆的,DATA?GUARD环境会被破坏。
由于PRIMARY数据库已经无法启动,所以FAILOVER切换所需的条件并不多,只要检查STANDBY是否运行在最大保护模式下,如果是的话,需要将其置为最大性能模式,否则切换到PRIMARY角色也无法启动。
1. 查看是否有日志GAP,没有应用的日志:
SQL>?SELECT?UNIQUE?THREAD#,?MAX(SEQUENCE#)?OVER(PARTITION?BY?THREAD#)?LAST?FROM?V$ARCHIVED_LOG;
SQL>?SELECT?THREAD#,?LOW_SEQUENCE#,?HIGH_SEQUENCE#?FROM?V$ARCHIVE_GAP;
如果有,则拷贝过来并且注册
SQL>?ALTER?DATABASE?REGISTER?PHYSICAL?LOGFILE?'路径';
重复查看直到没有应用的日志:
- 然后停止应用归档:
SQL>?ALTER?DATABASE?RECOVER?MANAGED?STANDBY?DATABASE?CANCEL;
Database?altered.
- 下面将STANDBY数据库切换为PRIMARY数据库:?
SQL>?ALTER?DATABASE?RECOVER?MANAGED?STANDBY?DATABASE?FINISH;
或?SQL>?ALTER?DATABASE?RECOVER?MANAGED?STANDBY?DATABASE?FINISH?FORCE;
Database?altered.
SQL>?SELECT?DATABASE_ROLE?FROM?V$DATABASE;
DATABASE_ROLE
----------------
PHYSICAL?STANDBY
SQL>?ALTER?DATABASE?COMMIT?TO?SWITCHOVER?TO?PRIMARY;
Database?altered.
SQL>?ALTER?DATABASE?OPEN;?或者?shutdown?immediate+startup
Database?altered.
检查数据库是否已经切换成功:
SQL>?SELECT?DATABASE_ROLE?FROM?V$DATABASE;
DATABASE_ROLE
----------------
PRIMARY
至此,FAILOVER切换完成。这个时候应该马上对新的PRIMARY数据库进行备份。
5??? ?DG启动与关闭
5.1???? DG启动先启动备再启动主
备库启动
SQL> startup nomount;
ORACLE instance started.
Total System Global Area? 413372416 bytes
Fixed Size????????????????? 2213896 bytes
Variable Size???????????? 339740664 bytes
Database Buffers?????????? 67108864 bytes
Redo Buffers??????????????? 4308992 bytes
Database mounted.
SQL>alter database mount standby database;
SQL> alter database recover managed standby database disconnect from session;
(补充一点备库可以打开,是只读的模式)
SQL> recover managed standby database cancel;
Media recovery complete.
SQL> alter database pen;
11G中只读转换
alter database convert to snapshot standby;
alter database convert to physical standby;
主库启动
SQL> startup
ORACLE instance started.
Total System Global Area? 413372416 bytes
Fixed Size????????????????? 2213896 bytes
Variable Size???????????? 356517880 bytes
Database Buffers?????????? 50331648 bytes
Redo Buffers?????????? ?????4308992 bytes
Database mounted.
Database opened.
5.2???? DG关闭(先关谁其实问题不大)
6??? ?DG切换为逻辑模式
6.1???? 主库设置参数并创建字典表
备用数据库需要从某一位置获取数据字典信息。字典信息应当置于来自于主数据库中的重做流中。因此,在主数据库上,执行以下命令构建字典的 LogMiner 表:
alter system set log_archive_DEST_2='SERVICE=DG2 LGWR ASYNC VALID_FOR=(ONLINE_LOGFILES,STANDBY_ROLE) DB_UNIQUE_NAME=DG2' SCOPE=SPFILE;
原有
alter system set log_archive_DEST_2='SERVICE=DG2 LGWR ASYNC VALID_FOR=(ONLINE_LOGFILES,PRIMARY_ROLE) DB_UNIQUE_NAME=DG2' SCOPE=SPFILE;
SQL> begin
2? dbms_logstdby.build;
3? end;
4? /
PL/SQL procedure successfully completed.
6.2???? 备库退出恢复模式并切换为逻辑模式
SQL> alter database recover managed standby database cancel;
Database altered.
SQL> ALTER DATABASE RECOVER TO LOGICAL STANDBY standby;
Database altered.
这步在字典表创建完后才会成功。
SQL> shutdown immediate
SQL> startup mount;
SQL> alter database open resetlogs;
SQL> alter database start logical standby apply;
7??? ?补充知识
7.1???? Data Guard物理Standby常用视图
| 视图名 |
数据库角色 |
视图内容描述 |
| V$ARCHIVE_DEST |
primary, physical |
显示Data Guard配置中所有的归档路径 |
| V$ARCHIVE_DEST_STATUS |
Primary, physical |
显示归档路径的配置信息和状态 |
| V$ARCHIVE_GAP |
Primary, physical |
显示归档日志gap,用于在failover切换时处理归档裂缝 |
| V$ARCHIVED_LOG |
Primary, physical |
显示控制文件中保存的归档日志信息 |
| V$DATAGUARD_CONFIG |
Primary, physical |
列出Data Guard中所有的DB_UNIQUE_NAME和
LOG_ARCHIVE_CONFIG参数信息 |
| V$DATAGUARD_STATS |
Primary, physical |
显示所有Primary上未能传输给Standby的Redo Log信息,如果在Primary上查询该视图,那么结果值将被清除。 |
| V$DATAGUARD_STATUS |
Primary, physical |
显示Data Guard中的事件信息,同时这些信息也会被记录在alert.log中和其他的trace文件中 |
| V$MANAGED_STANDBY |
Physical only |
显示当前Standby的所有状态信息 |
| V$STANDBY_LOG |
Primary, physical |
显示所有的Standby Redo Log信息 |
7.2???? Data Guard 初始化参数
| 参数名 |
Primary |
Standby |
描述 |
| ARCHIVE_LAG_TARGET =?seconds |
YES |
Physical only |
可选,在指定的时间后强制发生log switch事件。 |
| COMPATIBLE =?release_number |
YES |
Logical
and
Physical |
Data Guard最低版本要求是9.2.0.1.0
如果想进行switchover切换的话,Primary和Standby必须设置相同的release_number。 |
| CONTROL_FILE_RECORD_KEEP_TIME =
number_of_days |
YES |
Logical
and
Physical |
可选,该参数可以防止记录在制定天数内被覆盖。 |
| CONTROL_FILES =
( 'control_file_name' ,’ control_file_name', '...') |
YES |
Logical
and
Physical |
必须,指定控制文件路径和文件名。 |
| DB_FILE_NAME_CONVERT =
(‘location_of_primary_database_datafile' ,
'location_of_standby_database_datafile_name' , '...' |
NO |
Physical
only |
如果Standby和Primary上数据文件存储的路径不同,则必须设置该参数。(在Standby上设置时,先写Primary路径,后写Standby路径) |
| DB_UNIQUE_NAME =
unique_service_provider_name_for_this_database |
YES |
Logical
and
Physical |
建议,如果设置LOG_ARCHIVE_CONFIG参数则必须设置该参数,用于唯一标识Data Guard中每个数据库。 |
| FAL_CLIENT =?Oracle_Net_service_name |
YES |
Physical
only |
详见3.4.1。 |
| FAL_SERVER =?Oracle_Net_service_name |
NO |
Physical
only |
详见3.4.1。 |
| INSTANCE_NAME |
YES |
Logical
and
Physical |
可选,如果Primary和Standby在同一台服务器上,那么可以通过指定该参数来区分实例。 |
| LOG_ARCHIVE_CONFIG =
'DG_CONFIG=(db_unique_name,db_unique_name, ...)' |
YES |
Logical
and
Physical |
建议,该参数会指定Data Guard中Primary和所有的Standby数据库。 |
| LOG_ARCHIVE_DEST_n?=
{LOCATION=path_nam
?|SERVICE=service_name, attribute,attribute, ...?} |
YES |
Logical
and
Physical |
必须,详见3.3。 |
| LOG_ARCHIVE_DEST_STATE_n?=
{ENABLE|DEFER|ALTERNATE|RESET} |
YES |
Logical
and
Physical |
必须,详见3.2。 |
| LOG_ARCHIVE_FORMAT =
log%d_%t_%s_%r.arc |
YES |
Logical
and
Physical |
可选,如果设置了STANDBY_ARCHIVE_DEST参数则必须。指定归档日志文件的命名规则。 |
| LOG_ARCHIVE_LOCAL_FIRST =
[TRUE|FALSE] |
YES |
NO |
可选,如果是TRUE表示ARCn进程在传输给Standby之前,至少确保一组本地归档路径归档完成。 |
| LOG_ARCHIVE_MAX_PROCESSES =?integer |
YES |
Logical
and
Physical |
详见3.1.5。 |
| LOG_ARCHIVE_MIN_SUCCEED_DEST =?integer |
YES |
NO |
可选,至少有指定数量的归档路径成功接收到Redo Log之后,Primary才会重用Online Redo Log |
| LOG_ARCHIVE_TRACE =?integer |
YES |
Logical
and
Physical |
可选,当Redo Log传输到Standby时进行跟踪,有效值有(0, 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024,
2048, 4096) |
| LOG_FILE_NAME_CONVERT =
'location_of_primary_database_redo_logs',
'location_of_standby_database_redo_logs' |
NO |
Logical
and
Physical |
同DB_FILE_NAME_CONVERT (在Standby上设置时,先写Primary路径,后写Standby路径) |
| PARALLEL_MAX_SERVERS =?integer |
YES |
Logical
only |
必须,设置逻辑Standby数据库中的最大并行进程数。该参数最小不能低于5,Oracle建议最小设置为9。具体计算方法为:(如果PGA_AGGREGATE_TARGET>0)
PARALLEL_MAX_SERVERS=CPU_COUNT x
PARALLEL_THREADS_PER_CPU x 10 |
| REMOTE_LOGIN_PASSWORDFILE =
{EXCLUSIVE|SHARED] |
YES |
Logical
and
Physical |
必须,在Data Guard所有节点上设置。 |
| STANDBY_ARCHIVE_DEST =?filespec |
NO |
Logical
and
Physical |
可选,指定Standby接收归档日志后的存放路径,该参数会覆盖LOG_ARCHIVE_DEST_n参数指定的路径。
(10g开始可以不设置此参数) |
| STANDBY_FILE_MANAGEMENT =
{AUTO|MANUAL} |
YES |
Physical
only |
必须,建议设置为AUTO,这样在Primary创建或者删除datafile时,Standby会自动进行同步操作。 |
| USER_DUMP_DEST =
directory_path_name_of_trace_file |
YES |
Logical
and
Physical |
如果设置了LOG_ARCHIVE_TRACE参数则必须设置此参数。 |
7.3???? Data Guard三种模式
7.3.1? 最大保护
当备库有问题时主库会挂起
1、创建standby redo logs
2、参数设log_archive_DEST_2='SERVICE=DG LGWRSYNCAFFIRM VALID_FOR=(ONLINE_LOGFILES,STANDBY_ROLE) DB_UNIQUE_NAME=DG' SCOPE=SPFILE;
3、alter database set standby database to maximize protection;
7.3.2? 最大可用性
1、参数设log_archive_DEST_2='SERVICE=DG LGWRSYNCAFFIRM VALID_FOR=(ONLINE_LOGFILES,STANDBY_ROLE) DB_UNIQUE_NAME=DG' SCOPE=SPFILE;
2、alter database set standby database to maximize availability;
7.3.3? 最大性能
1、log_archive_DEST_2='SERVICE=DG LGWR ASYNC NOAFFIRM VALID_FOR=(ONLINE_LOGFILES,STANDBY_ROLE) DB_UNIQUE_NAME=DG' SCOPE=SPFILE;
或者log_archive_DEST_2='SERVICE=DG LGWR ARCH NOAFFIRM VALID_FOR=(ONLINE_LOGFILES,STANDBY_ROLE) DB_UNIQUE_NAME=DG' SCOPE=SPFILE;
2、alter database set standby database to maximize performance;
8??? ?常见故障
8.1???? ORA-32004:
[oracle@DG samples]$ oerr ora 32004
32004, 00000, "obsolete or deprecated parameter(s) specified for %s instance"
// *Cause:? Obsolete or deprecated parameters for this instance type
//????????? were specified in the SPFILE or the PFILE on the server side.
// *Action: See alert log for a list of parameters that are obsolete
//????????? or deprecated. Remove them from the SPFILE or the server
//????????? side PFILE.
提示是使用了过时的参数造成,但用下面的语句中没有查到有过时的参数
select name,isspecified from v$obsolete_parameter where isspecified='TRUE';
no rows selected
Deprecated 通过查找ALERT日志中的该字段找出如下字段
Deprecated system parameters with specified values:
standby_archive_dest
方法是关闭参数中的该项standby_archive_dest
1、SQL> create pfile from spfile;
2、将pfile中的standby_archive_dest去掉
3、SQL> shutdwon immediate
4、SQL> create spfile from pfile='/oracle/product/11.2.0/db_1/dbs/initDG.ora';
5、startup后正常
8.2???? ORA-03113
备库重启后,在主库上归档出现ORA-03113错误
SQL> select dest_name,status,error from v$archive_dest;
DEST_NAME????????????????????? STATUS???????????????????????? ERROR
------------------------------ --------
LOG_ARCHIVE_DEST_1??????????? VALID
LOG_ARCHIVE_DEST_2????? ?ERROR?????????? ORA-03113: end-of-file on??? communication channel
解决办法:在主库执行
SQL> alter system set log_archive_dest_state_2= enable;
这个命令式手动触发主库区尝试连接备库。
其实这种情况下,只要保证主备库之间的网络和配置是正确的。dataguard会自动恢复这个错误。这个周期默认是300秒,也可以在log_archive_dest_2的参数中添加reopen 参数指定这个主备库之间失败后继续尝试的周期。
8.3???? ORA-01031
ORA-01031: insufficient privileges错误
SQL> select dest_name,status,error from v$archive_dest;
DEST_NAME????????????????????? STATUS???????????????????????? ERROR
---------------------- -----------------------------------------------
LOG_ARCHIVE_DEST_1???????????? VALID
LOG_ARCHIVE_DEST_2???????????? ERROR????????????? ORA-01031: insufficient
Privileges
解决办法:统一主备库的数据库密码文件,或者重建密码文件,sys密码设置成一样。
然后在主库执行
SQL> alter system set log_archive_dest_state_2= enable;
8.4???? ORA-16191
ORA-16191: Primary log shipping client not logged on? standby
SQL> select dest_name,status,error from v$archive_dest;
DEST_NAME????????????????????? STATUS???????????????????????? ERROR
------------------------------ -----------
LOG_ARCHIVE_DEST_1???????????? VALID
LOG_ARCHIVE_DEST_2??????????? ERROR???????????????????? ORA-16191: Primary log??? shipping client not logged on? standby
解决办法:统一主备库的数据库密码文件,或者重建密码文件,sys密码设置成一样。
然后在主库执行
SQL> alter system set log_archive_dest_state_2= enable;
8.5???? 备库一直无法应用日志
备库一直无法应用日志,MRP0进程显示WAIT_FOR_GAP的问题
发现从主库传来的日志无法应用
在备库检查,
SQL> select sequence#,applied from v$archived_log;
SEQUENCE# APP
———- —
930 NO
931 NO
932 NO
933 NO
934 NO
935 NO
936 NO
937 NO
938 NO
939 NO
940 NO
然后开始查看有没有mrp
[oracle@HJITBACKUP bdump]$ ps -ef | grep mrp
oracle?? 31896???? 1? 0 14:37 ???????? 00:00:00 ora_mrp0_flow
oracle?? 32001 31820? 0 15:17 pts/1??? 00:00:00 grep mrp
看来有,接着查gap,发现备库上有此进程,
SQL> select * from v$archive_gap
2? ;
no rows selected
查询视图没有发现,
在接着检查V$MANAGED_STANDBY
SQL> select process,status from v$managed_standby;
PROCESS?? STATUS
——— ————
ARCH????? CONNECTED
ARCH????? CONNECTED
MRP0????? WAIT_FOR_GAP
RFS?????? IDLE
RFS?????? IDLE
发现MRP0在等待GAP,进一步查看此视图
select process,status,group#,thread#,sequence#,block#,blocks from v$managed_standby;
PROCESS?? STATUS?????? GROUP#??????? THREAD#? SEQUENCE#???? BLOCK#???? BLOCKS
——— ———— ———- ———- ———- ———- ———-
ARCH????? CONNECTED??? N/A???????????????? 0????????? 0????????? 0????????? 0
ARCH????? CONNECTED??? N/A???????????????? 0????????? 0????????? 0????????? 0
MRP0????? WAIT_FOR_GAP N/A???????????????? 1??????? 928????????? 0????????? 0
RFS?????? IDLE???????? N/A???????????????? 0????????? 0????????? 0????????? 0
RFS?????? IDLE???????? N/A???????????????? 0????????? 0????????? 0????????? 0
发现日志928没有应用,
原来是由于主库删除了928,导致备库没法应用,所以只能从备份中恢复,restore archivelog
至此问题处理完毕。
查询备库状态
SQL> select process,status from v$managed_standby;
PROCESS?? STATUS
——— ————
ARCH????? CONNECTED
ARCH????? CONNECTED
MRP0????? WAIT_FOR_LOG
RFS?????? IDLE
RFS?????? IDLE
所以当standby装完后,在主库切换日志后,这里状态应该是
MRP0????? WAIT_FOR_LOG 才是正常的状态
9? 注意事项
建议在主备库的涉及到名称地方都统一用小写字母,避免在配置过程出现莫名的错误。
如果在主库执行 alter database clear unarchived logfile或alter database open resetlogs,则dataguard要重建。
在连续恢复模式下工作之前,需要保证之前所有的归档日志己经应用到备用库上。因为在连续恢复模式的情况下,oracle不会应用之前的归档日志,而只会应用后面陆续到来的归档日志。
新建表、表空间、datafile都能通过日志应用到备库,但新建一个临时表空间和rename datafile 均不能应用到备库上。
出现归档日志gap时,需要找出相应的归档日志,然后将这些归档日志copy到备用节点的log_archive_dest目录下面。然后ALTER DATABASE RECOVER AUTOMATIC STANDBY DATABASE;
应当实时察看standby库的alert文件,就能清晰明了地知道主备更新的情况。这也是排错的重要方法。
相关视图
V$ARCHIVE_DEST
V$ARCHIVE_DEST_STATUS
V$ARCHIVE_GAP
V$ARCHIVED_LOG
V$DATABASE
V$DATAFILE
V$DATAGUARD_STATUS
V$LOG
V$LOGFILE
V$LOG_HISTORY
V$STANDBY_LOG