
Experiment: Adding a Node to a Single-Node Oracle RAC

2018-06-04 22:56 AlfredZhao

Environment: RHEL 6.5 + Oracle 11.2.0.4 single-node RAC
Requirement: add a new node to the single-node RAC

1. Preparation before adding the node

Reference, from the official Oracle documentation:
Oracle® Clusterware Administration and Deployment Guide
11g Release 2 (11.2) -> Adding and Deleting Cluster Nodes

1.1 Make sure the hardware connections are working

1.1 Make physical connections.
Connect the nodes' hardware to the network infrastructure of your cluster. This includes establishing electrical connections, configuring network interconnects, configuring shared disk subsystem connections, and so on. See your hardware vendor documentation for details about this step.

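Make sure the physical hardware connections are all working, including the public/private network connections and the shared storage connections.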

#public ip
192.168.1.61  jystdrac1
192.168.1.63  jystdrac2
#virtual ip
192.168.1.62  jystdrac1-vip
192.168.1.64  jystdrac2-vip
#scan ip
192.168.1.65  jystdrac-scan

#private ip
10.10.10.61    jystdrac1-priv
10.10.10.63    jystdrac2-priv

In this experiment jystdrac1 is the single-node RAC, and jystdrac2 is the node to be added to the cluster.
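Before going further it can help to confirm that every name in the plan above resolves identically on both nodes. The following is a minimal sketch, assuming the plan lives in a hosts-format file such as /etc/hosts; the function name missing_hosts is made up for this illustration and is not an Oracle tool.

```shell
# Print every host name from the argument list that has no entry in the
# given hosts-format file (comments are ignored, names must match exactly).
missing_hosts() (
    hosts_file=$1
    shift
    for name in "$@"; do
        awk -v n="$name" '
            { sub(/#.*/, "") }                       # drop comments
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }                      # exit 0 only when found
        ' "$hosts_file" || echo "$name"
    done
)

# Example (run on each node; prints nothing when all names are present):
#   missing_hosts /etc/hosts jystdrac1 jystdrac2 jystdrac1-vip jystdrac2-vip \
#       jystdrac-scan jystdrac1-priv jystdrac2-priv
```

Matching whole fields rather than substrings keeps jystdrac2 from being satisfied by a jystdrac2-vip entry.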
1.2 Install the operating system

1.2 Install the operating system.
Install a cloned image of the operating system that matches the operating system on the other nodes in your cluster. This includes installing required service patches, updates, and drivers. See your operating system vendor documentation for details about this process.
Oracle recommends that you use a cloned image. However, if the installation fulfills the installation requirements, then install the operating system according to the vendor documentation.

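Install the operating system. Oracle recommends using a cloned image; the basic principle is to stay consistent with the other nodes, including the operating system version, the packages and patches Oracle requires, and the kernel parameters.

1.3 Create the Oracle users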

1.3 Create Oracle users.
You must create all Oracle users on the new node that exist on the existing nodes. For example, if you are adding a node to a cluster that has two nodes, and those two nodes have different owners for the Grid Infrastructure home and the Oracle home, then you must create those owners on the new node, even if you do not plan to install an Oracle home on the new node.
As root, create the Oracle users and groups using the same user ID and group ID as on the existing nodes.

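As root, create the Oracle-related users. If the other nodes use the grid and oracle users, create them on the new node as well, and make sure each user's uid and gid are identical across nodes.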
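One way to check the uid/gid requirement is to compare the output of id from both nodes. This is a minimal sketch: the helper names uid_gid and same_ids are made up here, the numeric IDs in the comments are hypothetical, and only the uid and primary gid are compared (secondary groups are not).

```shell
# Reduce one line of `id` output, e.g.
#   uid=1100(grid) gid=1000(oinstall) groups=1000(oinstall),...
# to the numeric pair "uid:gid". The numbers above are hypothetical.
uid_gid() {
    printf '%s\n' "$1" | sed -E 's/^uid=([0-9]+)[^ ]* gid=([0-9]+).*$/\1:\2/'
}

# Succeed only when both `id` lines carry the same numeric uid and gid.
same_ids() {
    [ "$(uid_gid "$1")" = "$(uid_gid "$2")" ]
}

# Example, run on jystdrac1 (assumes SSH access to the new node):
#   same_ids "$(id grid)" "$(ssh jystdrac2 id grid)" || echo "grid: uid/gid differ"
#   same_ids "$(id oracle)" "$(ssh jystdrac2 id oracle)" || echo "oracle: uid/gid differ"
```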

1.4 Confirm the SSH configuration

1.4 Ensure that SSH is configured on the node.
SSH is configured when you install Oracle Clusterware 11g release 2 (11.2). If SSH is not configured, then see Oracle Grid Infrastructure Installation Guide for information about configuring SSH.

If SSH user equivalence has to be configured manually, see:

1.5 Verify with CVU

1.5 Verify the hardware and operating system installations with the Cluster Verification Utility (CVU).
After you configure the hardware and operating systems on the nodes you want to add, you can run the following command to verify that the nodes you want to add are reachable by other nodes in the cluster. You can also use this command to verify user equivalence to all given nodes from the local node, node connectivity among all of the given nodes, accessibility to shared storage from all of the given nodes, and so on.
From the Grid_home/bin directory on an existing node, run the CVU command to obtain a detailed comparison of the properties of the reference node with all of the other nodes that are part of your current cluster environment. Replace ref_node with the name of a node in your existing cluster against which you want CVU to compare the nodes to be added. Specify a comma-delimited list of nodes after the -n option. In the following example, orainventory_group is the name of the Oracle Inventory group, and osdba_group is the name of the OSDBA group:
$ cluvfy comp peer [-refnode ref_node] -n node_list [-orainv orainventory_group] [-osdba osdba_group] [-verbose]
For the reference node, select a cluster node against which you want CVU to compare, for example, the nodes that you want to add that you specify with the -n option.

Check which settings on the new node do not match the reference node:

# su - grid
$ cluvfy comp peer -refnode jystdrac1 -n jystdrac2 -verbose

No problems were reported in my environment.

2. Adding the node

2.1 Confirm the environment

2.1 Ensure that you have successfully installed Oracle Clusterware on at least one node in your cluster environment. To perform the following procedure, Grid_home must identify your successfully installed Oracle Clusterware home.

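Confirm that at least one node in your cluster environment has Oracle Clusterware installed successfully. In the steps below, GRID_HOME refers to the directory where Oracle Clusterware is installed.

2.2 Verify the node to be added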

2.2 Verify the integrity of the cluster and node3:
$ cluvfy stage -pre nodeadd -n node3 [-fixup [-fixupdir fixup_dir]] [-verbose]

$ cluvfy stage -pre nodeadd -n jystdrac2 -fixup -fixupdir /tmp/fixupdir -verbose

An excerpt of the check results follows (to save space, most of the passed checks have been removed):

[grid@jystdrac1 ~]$ cluvfy stage -pre nodeadd -n jystdrac2 -fixup -fixupdir /tmp/fixupdir -verbose

Performing pre-checks for node addition 

Checking node reachability...

...

Checking CRS home location...
PRVG-1013 : The path "/opt/app/11.2.0/grid" does not exist or cannot be created on the nodes to be added
Result: Shared resources check for node addition failed

Interface information for node "jystdrac1"
 Name   IP Address      Subnet          Gateway         Def. Gateway    HW Address        MTU   
 ------ --------------- --------------- --------------- --------------- ----------------- ------
 eth2   192.168.1.61    192.168.1.0     0.0.0.0         UNKNOWN         08:00:27:E7:88:48 1500  
 eth2   192.168.1.62    192.168.1.0     0.0.0.0         UNKNOWN         08:00:27:E7:88:48 1500  
 eth2   192.168.1.65    192.168.1.0     0.0.0.0         UNKNOWN         08:00:27:E7:88:48 1500  
 eth3   10.10.10.61     10.10.10.0      0.0.0.0         UNKNOWN         08:00:27:83:CC:56 1500  
 eth3   169.254.203.60  169.254.0.0     0.0.0.0         UNKNOWN         08:00:27:83:CC:56 1500  


Interface information for node "jystdrac2"
 Name   IP Address      Subnet          Gateway         Def. Gateway    HW Address        MTU   
 ------ --------------- --------------- --------------- --------------- ----------------- ------
 eth2   192.168.1.63    192.168.1.0     0.0.0.0         UNKNOWN         08:00:27:0C:E1:B1 1500  
 eth3   10.10.10.63     10.10.10.0      0.0.0.0         UNKNOWN         08:00:27:B1:1B:CE 1500  

Checking for multiple users with UID value 0
Result: Check for multiple users with UID value 0 passed 

Check: Current group ID 
Result: Current group ID check passed

...

Checking OCR integrity...

OCR integrity check passed

Checking Oracle Cluster Voting Disk configuration...

Oracle Cluster Voting Disk configuration check passed
Check: Time zone consistency 
Result: Time zone consistency check passed

...

Pre-check for node addition was unsuccessful on all the nodes. 
[grid@jystdrac1 ~]$
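Because the full -verbose output is long, it can be easier to save it to a file and pull out only the lines that report a problem. The sketch below relies on just the two patterns visible in the excerpt above (PRVG-style message codes and "Result: ... failed" lines); other failure formats may exist and would not be caught. The function name cluvfy_failures and the log path are made up for this illustration.

```shell
# Print only the lines of a saved cluvfy log that report a problem:
# PRVx-nnnn messages and "Result: ... failed" lines.
cluvfy_failures() {
    grep -E '^(PRV[A-Z]-[0-9]+ :|Result: .* failed *$)' "$1"
}

# Example (hypothetical log path):
#   cluvfy stage -pre nodeadd -n jystdrac2 -verbose > /tmp/nodeadd_pre.log
#   cluvfy_failures /tmp/nodeadd_pre.log
```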

In my environment, the main item to address from these checks is the owner and permissions of the directory:

[root@jystdrac2 opt]# ls -ld /opt/app
drwxr-xr-x. 3 root oinstall 4096 May 25 23:20 /opt/app
[root@jystdrac2 opt]# chown grid:oinstall /opt/app
[root@jystdrac2 opt]# chmod 775 /opt/app
[root@jystdrac2 opt]# ls -ld /opt/app
drwxrwxr-x. 3 grid oinstall 4096 May 25 23:20 /opt/app

2.3 Add the node to GI

2.3 To extend the Grid Infrastructure home to the node3, navigate to the Grid_home/oui/bin directory on node1 and run the addNode.sh script using the following syntax, where node3 is the name of the node that you are adding and node3-vip is the VIP name for the node:

In this experiment the node is jystdrac2 (GNS is not used):

[grid@jystdrac1 bin]$ pwd
/opt/app/11.2.0/grid/oui/bin
[grid@jystdrac1 bin]$ ls
addLangs.sh  addNode.sh  attachHome.sh  detachHome.sh  filesList.bat  filesList.properties  filesList.sh  lsnodes  resource  runConfig.sh  runInstaller  runInstaller.sh  runSSHSetup.sh


$ ./addNode.sh "CLUSTER_NEW_NODES={jystdrac2}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={jystdrac2-vip}"
--The following command is wrong; these options do not skip the pre-add-node checks:
--$ ./addNode.sh -force -ignorePrereq -ignoreSysPrereqs "CLUSTER_NEW_NODES={jystdrac2}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={jystdrac2-vip}"

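The way to actually skip the pre-add-node checks turned out to be setting the IGNORE_PREADDNODE_CHECKS variable (I lost some time here trying Oracle's usual ignorePrereq and ignoreSysPrereqs options, neither of which works):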

export IGNORE_PREADDNODE_CHECKS=Y
echo $IGNORE_PREADDNODE_CHECKS

$ ./addNode.sh "CLUSTER_NEW_NODES={jystdrac2}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={jystdrac2-vip}"
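When more than one node is added, the two brace-delimited lists have to stay in the same order. The helper below is only a sketch that prints the command line for review rather than running it. It assumes the VIP name is the node name plus "-vip", which matches this environment but is not a rule, and that multiple nodes are given as a comma-separated list inside the braces. The name addnode_cmd is made up here.

```shell
# Print the addNode.sh command line for the given new nodes.
# Assumption: each node's VIP host name is "<node>-vip".
addnode_cmd() (
    nodes='' vips=''
    for n in "$@"; do
        nodes="${nodes:+$nodes,}$n"          # append with a comma separator
        vips="${vips:+$vips,}$n-vip"
    done
    printf './addNode.sh "CLUSTER_NEW_NODES={%s}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={%s}"\n' \
        "$nodes" "$vips"
)

# Example: print the command for this experiment, then review and run it by hand
#   addnode_cmd jystdrac2
```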

The output of a successful GI node addition looks like this:

-----------------------------------------------------------------------------


Instantiating scripts for add node (Monday, June 4, 2018 1:27:27 PM CST)
.                                                                 1% Done.
Instantiation of add node scripts complete

Copying to remote nodes (Monday, June 4, 2018 1:27:30 PM CST)
...............................................................................................                                 96% Done.
Home copied to new nodes

Saving inventory on nodes (Monday, June 4, 2018 1:34:22 PM CST)
.                                                               100% Done.
Save inventory complete
WARNING:
The following configuration scripts need to be executed as the "root" user in each new cluster node. Each script in the list below is followed by a list of nodes.
/opt/app/11.2.0/grid/root.sh #On nodes jystdrac2
To execute the configuration scripts:
    1. Open a terminal window
    2. Log in as "root"
    3. Run the scripts in each cluster node
    
The Cluster Node Addition of /opt/app/11.2.0/grid was successful.
Please check '/tmp/silentInstall.log' for more details.
[grid@jystdrac1 bin]$ 

As prompted, run the root script on the new node:

[root@jystdrac2 app]# /opt/app/11.2.0/grid/root.sh
Performing root user operation for Oracle 11g 

The following environment variables are set as:
    ORACLE_OWNER= grid
    ORACLE_HOME=  /opt/app/11.2.0/grid

Enter the full pathname of the local bin directory: [/usr/local/bin]: 
The contents of "dbhome" have not changed. No need to overwrite.
The contents of "oraenv" have not changed. No need to overwrite.
The contents of "coraenv" have not changed. No need to overwrite.


Creating /etc/oratab file...
Entries will be added to the /etc/oratab file as needed by
Database Configuration Assistant when a database is created
Finished running generic part of root script.
Now product-specific root actions will be performed.
Using configuration parameter file: /opt/app/11.2.0/grid/crs/install/crsconfig_params
Creating trace directory
User ignored Prerequisites during installation
Installing Trace File Analyzer
OLR initialization - successful
Adding Clusterware entries to upstart
CRS-4402: The CSS daemon was started in exclusive mode but found an active CSS daemon on node jystdrac1, number 1, and is terminating
An active cluster was found during exclusive startup, restarting to join the cluster
clscfg: EXISTING configuration version 5 detected.
clscfg: version 5 is 11g Release 2.
Successfully accumulated necessary OCR keys.
Creating OCR keys for user 'root', privgrp 'root'..
Operation successful.
Preparing packages for installation...
cvuqdisk-1.0.9-1
Configure Oracle Grid Infrastructure for a Cluster ... succeeded

Check the cluster status:

[grid@jystdrac2 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.CRS.dg
               ONLINE  ONLINE       jystdrac1                                    
               ONLINE  ONLINE       jystdrac2                                    
ora.DATA.dg
               ONLINE  ONLINE       jystdrac1                                    
               OFFLINE OFFLINE      jystdrac2                                    
ora.FRA.dg
               ONLINE  ONLINE       jystdrac1                                    
               OFFLINE OFFLINE      jystdrac2                                    
ora.LISTENER.lsnr
               ONLINE  ONLINE       jystdrac1                                    
               ONLINE  ONLINE       jystdrac2                                    
ora.asm
               ONLINE  ONLINE       jystdrac1                Started             
               ONLINE  ONLINE       jystdrac2                Started             
ora.gsd
               OFFLINE OFFLINE      jystdrac1                                    
               OFFLINE OFFLINE      jystdrac2                                    
ora.net1.network
               ONLINE  ONLINE       jystdrac1                                    
               ONLINE  ONLINE       jystdrac2                                    
ora.ons
               ONLINE  ONLINE       jystdrac1                                    
               ONLINE  ONLINE       jystdrac2                                    
ora.registry.acfs
               ONLINE  ONLINE       jystdrac1                                    
               ONLINE  ONLINE       jystdrac2                                    
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       jystdrac1                                    
ora.cvu
      1        ONLINE  ONLINE       jystdrac1                                    
ora.jystdrac1.vip
      1        ONLINE  ONLINE       jystdrac1                                    
ora.jystdrac2.vip
      1        ONLINE  ONLINE       jystdrac2                                    
ora.oc4j
      1        ONLINE  ONLINE       jystdrac1                                    
ora.ractest.db
      1        ONLINE  ONLINE       jystdrac1                Open                
ora.scan1.vip
      1        ONLINE  ONLINE       jystdrac1                                    
[grid@jystdrac2 ~]$ 

At this point, the GI node addition is complete.
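In the listing above, ora.DATA.dg and ora.FRA.dg are still OFFLINE on jystdrac2, which is easy to miss in a long table. A small filter over saved crsctl stat res -t output can list such resources. This is a sketch based only on the layout shown above (a resource name line starting with "ora." followed by indented state rows); the function name offline_on and the file path are made up. A cluster resource that is OFFLINE everywhere has no SERVER column and is not reported.

```shell
# List resources whose STATE is OFFLINE on the given server.
# $1 = server name, $2 = file holding `crsctl stat res -t` output.
offline_on() {
    awk -v srv="$1" '
        /^ora\./ { res = $1; next }          # remember the resource name
        {
            # cluster rows start with an instance number, local rows do not
            off = ($1 ~ /^[0-9]+$/) ? 1 : 0
            if ($(2 + off) == "OFFLINE" && $(3 + off) == srv) print res
        }
    ' "$2"
}

# Example (hypothetical file path):
#   crsctl stat res -t > /tmp/crs_status.txt
#   offline_on jystdrac2 /tmp/crs_status.txt
```

ora.gsd also shows up in the result; it is OFFLINE on both nodes in the output above.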

2.4 Add the RAC node

2.4 If you have an Oracle RAC or Oracle RAC One Node database configured on the cluster and you have a local Oracle home, then do the following to extend the Oracle database home to node3:
Navigate to the Oracle_home/oui/bin directory on node1 and run the addNode.sh script as the user that installed Oracle RAC using the following syntax:
$ ./addNode.sh "CLUSTER_NEW_NODES={node3}"
Run the Oracle_home/root.sh script on node3 as root, where Oracle_home is the Oracle RAC home.

Actual execution:

[root@jystdrac2 app]# ls -ld /opt/app
drwxr-xr-x. 6 root oinstall 4096 Jun  4 13:34 /opt/app
[root@jystdrac2 app]# mkdir -p /opt/app/oracle
[root@jystdrac2 app]# chown oracle:oinstall /opt/app/oracle/
[root@jystdrac2 app]# ls -ld /opt/app/oracle/
drwxr-xr-x 2 oracle oinstall 4096 Jun  4 14:25 /opt/app/oracle/

--Add the RAC node (run as the oracle user on jystdrac1):
cd $ORACLE_HOME/oui/bin
./addNode.sh "CLUSTER_NEW_NODES={jystdrac2}"

Then run the root script as prompted.

2.5 Run the root.sh script

2.5 Run the Grid_home/root.sh script on the node3 as root and run the subsequent script, as instructed.

2.6 Verify cluster integrity

2.6 cluvfy stage -post nodeadd -n node3 [-verbose]
Check whether either a policy-managed or administrator-managed Oracle RAC database is configured to run on node3 (the newly added node). If you configured an administrator-managed Oracle RAC database, you may need to use DBCA to add an instance to the database to run on this newly added node.

cluvfy stage -post nodeadd -n jystdrac2 -verbose

3. Other configuration work

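The single-node RAC database now has to be converted into a two-node RAC database. You can use dbca directly, as the official documentation describes, or configure it manually.
If you simply try to start the instance on the new node at this point, it fails: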

[oracle@jystdrac2 ~]$ srvctl add instance -d ractest -i ractest2                                

[oracle@jystdrac2 ~]$ srvctl start instance -d ractest -i ractest2
PRCR-1013 : Failed to start resource ora.ractest.db
PRCR-1064 : Failed to start resource ora.ractest.db on node jystdrac2
CRS-5017: The resource action "ora.ractest.db start" encountered the following error: 
ORA-29760: instance_number parameter not specified
. For details refer to "(:CLSN00107:)" in "/opt/app/11.2.0/grid/log/jystdrac2/agent/crsd/oraagent_oracle/oraagent_oracle.log".

CRS-2674: Start of 'ora.ractest.db' on 'jystdrac2' failed
[oracle@jystdrac2 ~]$ 

The manual configuration is as follows:

3.1 Configure the parameter file
On the target side, create a pfile from the spfile:

create pfile='/tmp/pfilerac.ora' from spfile;

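Edit the pfile and add or modify the RAC-related settings along the following lines (previously only instance 1 was configured, not instance 2):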

ractest1.instance_number=1
ractest2.instance_number=2
ractest1.instance_name=ractest1
ractest2.instance_name=ractest2
ractest1.thread=1
ractest2.thread=2
ractest1.undo_tablespace='UNDOTBS1'
ractest2.undo_tablespace='UNDOTBS2'
ractest1.local_listener='(ADDRESS=(PROTOCOL=TCP)(HOST= 192.168.1.62)(PORT=1521))'
ractest2.local_listener='(ADDRESS=(PROTOCOL=TCP)(HOST= 192.168.1.64)(PORT=1521))'
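The per-instance entries follow a regular pattern, so they can be generated rather than typed. The sketch below reproduces the five settings shown above for one instance. It assumes the naming used in this experiment (instance name = database name + number, undo tablespace UNDOTBS + number, listener on the node VIP at port 1521); the function name instance_params is made up here.

```shell
# Print the per-instance pfile entries for one RAC instance.
# $1 = database name, $2 = instance number, $3 = VIP address of the node.
instance_params() (
    db=$1 n=$2 vip=$3
    inst="$db$n"
    printf "%s.instance_number=%s\n" "$inst" "$n"
    printf "%s.instance_name=%s\n" "$inst" "$inst"
    printf "%s.thread=%s\n" "$inst" "$n"
    printf "%s.undo_tablespace='UNDOTBS%s'\n" "$inst" "$n"
    printf "%s.local_listener='(ADDRESS=(PROTOCOL=TCP)(HOST=%s)(PORT=1521))'\n" \
        "$inst" "$vip"
)

# Example: print the entries for instance 2, then review them and add them
# to the pfile by hand
#   instance_params ractest 2 192.168.1.64
```

The output is grouped by instance rather than by parameter as in the listing above; the order of entries in a pfile does not matter.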

Start the database with the modified pfile:

SQL> startup nomount pfile='/tmp/pfilerac.ora';

3.2 Configure the objects for the node 2 instance
Back on node 1, add the redo log groups for node 2:

SQL> 
alter database add logfile thread 2 group 21 size 50M;
alter database add logfile thread 2 group 22 size 50M;
alter database add logfile thread 2 group 23 size 50M;

Add the undo tablespace for instance 2:

SQL> 
CREATE UNDO TABLESPACE UNDOTBS2 DATAFILE '+DATA' SIZE 100M;

Enable thread 2 (so that node 2 can mount the database):

SQL> 
alter database enable public thread 2;

On the new node, create the spfile from the pfile currently in use:

SQL> 
create spfile='+DATA/ractest/spfileractest.ora' from pfile='/tmp/pfilerac.ora';

On the new node, restart the new instance so that it uses the spfile:

SQL> shutdown immediate
startup

Appendix: run catclust.sql to create the cluster-related data dictionary views (confirm first whether it needs to be run)

--Confirm whether this needs to be run (in my case it did)
@?/rdbms/admin/catclust.sql

3.3 Final check of the database configuration

--srvctl config database -d ractest
[oracle@jystdrac2 ~]$ srvctl config database -d ractest
Database unique name: ractest
Database name: ractest
Oracle home: /opt/app/oracle/product/11.2.0/dbhome_1
Oracle user: oracle
Spfile: +DATA/ractest/spfileractest.ora
Domain: 
Start options: open
Stop options: immediate
Database role: PRIMARY
Management policy: AUTOMATIC
Server pools: ractest
Database instances: ractest1,ractest2
Disk Groups: DATA
Mount point paths: 
Services: 
Type: RAC
Database is administrator managed

--crsctl stat res -t
[grid@jystdrac1 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.CRS.dg
               ONLINE  ONLINE       jystdrac1                                    
               ONLINE  ONLINE       jystdrac2                                    
ora.DATA.dg
               ONLINE  ONLINE       jystdrac1                                    
               ONLINE  ONLINE       jystdrac2                                    
ora.FRA.dg
               ONLINE  ONLINE       jystdrac1                                    
               ONLINE  ONLINE       jystdrac2                                    
ora.LISTENER.lsnr
               ONLINE  ONLINE       jystdrac1                                    
               ONLINE  ONLINE       jystdrac2                                    
ora.asm
               ONLINE  ONLINE       jystdrac1                Started             
               ONLINE  ONLINE       jystdrac2                Started             
ora.gsd
               OFFLINE OFFLINE      jystdrac1                                    
               OFFLINE OFFLINE      jystdrac2                                    
ora.net1.network
               ONLINE  ONLINE       jystdrac1                                    
               ONLINE  ONLINE       jystdrac2                                    
ora.ons
               ONLINE  ONLINE       jystdrac1                                    
               ONLINE  ONLINE       jystdrac2                                    
ora.registry.acfs
               ONLINE  ONLINE       jystdrac1                                    
               ONLINE  ONLINE       jystdrac2                                    
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       jystdrac1                                    
ora.cvu
      1        ONLINE  ONLINE       jystdrac1                                    
ora.jystdrac1.vip
      1        ONLINE  ONLINE       jystdrac1                                    
ora.jystdrac2.vip
      1        ONLINE  ONLINE       jystdrac2                                    
ora.oc4j
      1        ONLINE  ONLINE       jystdrac1                                    
ora.ractest.db
      1        ONLINE  ONLINE       jystdrac1                Open                
      2        ONLINE  ONLINE       jystdrac2                Open                
ora.scan1.vip
      1        ONLINE  ONLINE       jystdrac1                                    
[grid@jystdrac1 ~]$ 

At this point, all the work of adding a node to the single-node RAC is complete.