RAC: Is a Node's VIP Still Usable When the Node Is Unavailable?

2016-11-07 22:53 AlfredZhao

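Test environment: RHEL 6.5 + GI 11.2.0.4 + Oracle 11.2.0.4
Goal: verify whether a RAC node's VIP is still available when that node is down, and whether it can be used to connect to the database.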

[grid@jyrac2 ~]$ more /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

#public ip
192.168.56.150  jyrac1
192.168.56.152  jyrac2
#private ip
10.10.10.11    jyrac1-priv
10.10.10.12    jyrac2-priv
#virtual ip
192.168.56.151  jyrac1-vip
192.168.56.153  jyrac2-vip
#scan ip
192.168.56.160  jyrac-scan

1. Node failure: the node's VIP fails over to the other node

Simulate a crash of host jyrac1; that node's VIP resource then fails over.
Querying the resource status on host jyrac2 at this point shows:

[grid@jyrac2 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS       
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA1.dg
               ONLINE  ONLINE       jyrac2                                       
ora.FRA1.dg
               ONLINE  ONLINE       jyrac2                                       
ora.LISTENER.lsnr
               ONLINE  ONLINE       jyrac2                                       
ora.OCR1.dg
               ONLINE  ONLINE       jyrac2                                       
ora.asm
               ONLINE  ONLINE       jyrac2                   Started             
ora.gsd
               OFFLINE OFFLINE      jyrac2                                       
ora.net1.network
               ONLINE  ONLINE       jyrac2                                       
ora.ons
               ONLINE  ONLINE       jyrac2                                       
ora.registry.acfs
               ONLINE  ONLINE       jyrac2                                       
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       jyrac2                                       
ora.cvu
      1        ONLINE  ONLINE       jyrac2                                       
ora.jyrac1.vip
      1        ONLINE  INTERMEDIATE jyrac2                   FAILED OVER         
ora.jyrac2.vip
      1        ONLINE  ONLINE       jyrac2                                       
ora.jyzhao.db
      1        ONLINE  OFFLINE                               Instance Shutdown   
      2        ONLINE  ONLINE       jyrac2                   Open                
ora.oc4j
      1        ONLINE  ONLINE       jyrac2                                       
ora.scan1.vip
      1        ONLINE  ONLINE       jyrac2         
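
For monitoring, the failed-over VIP can be picked out of this output programmatically. The following is only a minimal sketch, assuming the 11.2-style tabular layout shown above (a resource name line followed by instance lines); the function name `find_failed_over_vips` and the `SAMPLE` text are illustrative and not part of any Oracle tool.

```python
# Hypothetical helper: scan `crsctl stat res -t` text for VIP resources
# whose STATE_DETAILS column reports FAILED OVER.
SAMPLE = """\
ora.jyrac1.vip
      1        ONLINE  INTERMEDIATE jyrac2                   FAILED OVER
ora.jyrac2.vip
      1        ONLINE  ONLINE       jyrac2
ora.jyzhao.db
      1        ONLINE  OFFLINE                               Instance Shutdown
      2        ONLINE  ONLINE       jyrac2                   Open
"""


def find_failed_over_vips(crsctl_output):
    """Return (resource_name, hosting_server) pairs for failed-over VIPs."""
    results = []
    current = None  # name of the resource whose instance lines we are reading
    for line in crsctl_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("ora."):
            current = stripped
            continue
        if current and current.endswith(".vip") and "FAILED OVER" in stripped:
            # Columns: instance number, TARGET, STATE, SERVER, STATE_DETAILS
            fields = stripped.split()
            results.append((current, fields[3]))
    return results


if __name__ == "__main__":
    for name, server in find_failed_over_vips(SAMPLE):
        print(name, "is currently hosted on", server)
```

In practice the text would come from running `crsctl stat res -t` as the grid user; the parsing assumes the column order does not change between versions.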

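2. Network information on the surviving node after the failure

Network information on host jyrac2 (jyrac1's VIP, 192.168.56.151, now appears on eth2:2):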

[grid@jyrac2 ~]$ ifconfig -a
eth2      Link encap:Ethernet  HWaddr 08:00:27:1A:5A:7A  
          inet addr:192.168.56.152  Bcast:192.168.56.255  Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:fe1a:5a7a/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:3955 errors:0 dropped:0 overruns:0 frame:0
          TX packets:8277 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:510869 (498.8 KiB)  TX bytes:692985 (676.7 KiB)

eth2:1    Link encap:Ethernet  HWaddr 08:00:27:1A:5A:7A  
          inet addr:192.168.56.153  Bcast:192.168.56.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

eth2:2    Link encap:Ethernet  HWaddr 08:00:27:1A:5A:7A  
          inet addr:192.168.56.151  Bcast:192.168.56.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

eth2:3    Link encap:Ethernet  HWaddr 08:00:27:1A:5A:7A  
          inet addr:192.168.56.160  Bcast:192.168.56.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

eth3      Link encap:Ethernet  HWaddr 08:00:27:7C:CD:9F  
          inet addr:10.10.10.12  Bcast:10.10.10.255  Mask:255.255.255.0
          inet6 addr: fe80::a00:27ff:fe7c:cd9f/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:116635 errors:0 dropped:0 overruns:0 frame:0
          TX packets:103705 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:81979594 (78.1 MiB)  TX bytes:63727783 (60.7 MiB)

eth3:1    Link encap:Ethernet  HWaddr 08:00:27:7C:CD:9F  
          inet addr:169.254.198.12  Bcast:169.254.255.255  Mask:255.255.0.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:145545 errors:0 dropped:0 overruns:0 frame:0
          TX packets:145545 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:130875345 (124.8 MiB)  TX bytes:130875345 (124.8 MiB)

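3. The failed node's VIP responds to ping, but has no listener

The failed node's VIP can be pinged, but no listener is listening on it.
The ping succeeds, as shown below: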

[root@jyrac2 ~]# ping 192.168.56.151
PING 192.168.56.151 (192.168.56.151) 56(84) bytes of data.
64 bytes from 192.168.56.151: icmp_seq=1 ttl=64 time=0.023 ms
64 bytes from 192.168.56.151: icmp_seq=2 ttl=64 time=0.036 ms
64 bytes from 192.168.56.151: icmp_seq=3 ttl=64 time=0.051 ms
64 bytes from 192.168.56.151: icmp_seq=4 ttl=64 time=0.069 ms
^C
--- 192.168.56.151 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3229ms
rtt min/avg/max/mdev = 0.023/0.044/0.069/0.019 ms

With no listener on it, a connection through this VIP naturally fails:

[root@jyrac2 ~]# sqlplus system/oracle@192.168.56.151/jyzhao

SQL*Plus: Release 11.2.0.4.0 Production on Thu Nov 3 16:58:31 2016

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

ERROR:
ORA-12541: TNS:no listener


Enter user-name: 
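
The difference between "the address answers ping" and "a listener accepts connections" can also be checked at the TCP level, without an Oracle client. This is a minimal sketch using only the Python standard library; the function name `probe_tcp` and its three result labels are assumptions made for illustration.

```python
import socket


def probe_tcp(host, port, timeout=3.0):
    """Classify a TCP connection attempt to host:port.

    Returns "open" if something accepts the connection, "refused" if the
    host answers but nothing listens on the port, and "unreachable" for
    timeouts and other network errors.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The IP is up (for example a failed-over VIP) but has no listener.
        return "refused"
    except OSError:
        # Covers timeouts, unreachable hosts and name resolution failures.
        return "unreachable"
```

In this lab, `probe_tcp("192.168.56.151", 1521)` would be expected to return "refused", which matches the ORA-12541 error above, while the same call against 192.168.56.153 or 192.168.56.160 should return "open".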

#Check the listeners: only the SCAN IP and node 2's public IP and VIP are being listened on; node 1's VIP is indeed not.
[root@jyrac2 ~]# ps -ef|grep tns
root        15     2  0 09:05 ?        00:00:00 [netns]
grid      2845     1  0 09:07 ?        00:00:00 /opt/app/11.2.0/grid/bin/tnslsnr LISTENER -inherit
grid      6867     1  0 09:34 ?        00:00:00 /opt/app/11.2.0/grid/bin/tnslsnr LISTENER_SCAN1 -inherit
root     13951 10953  0 17:17 pts/0    00:00:00 grep tns
[root@jyrac2 ~]# su - grid
[grid@jyrac2 ~]$ lsnrctl status LISTENER

LSNRCTL for Linux: Version 11.2.0.4.0 - Production on 03-NOV-2016 17:18:09

Copyright (c) 1991, 2013, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for Linux: Version 11.2.0.4.0 - Production
Start Date                03-NOV-2016 09:07:20
Uptime                    0 days 8 hr. 10 min. 49 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /opt/app/11.2.0/grid/network/admin/listener.ora
Listener Log File         /opt/app/grid/diag/tnslsnr/jyrac2/listener/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.56.152)(PORT=1521)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.56.153)(PORT=1521)))
Services Summary...
Service "+ASM" has 1 instance(s).
  Instance "+ASM2", status READY, has 1 handler(s) for this service...
Service "jyzhao" has 1 instance(s).
  Instance "jyzhao2", status READY, has 1 handler(s) for this service...
Service "jyzhaoXDB" has 1 instance(s).
  Instance "jyzhao2", status READY, has 1 handler(s) for this service...
The command completed successfully
[grid@jyrac2 ~]$ lsnrctl status LISTENER_SCAN1

LSNRCTL for Linux: Version 11.2.0.4.0 - Production on 03-NOV-2016 17:18:18

Copyright (c) 1991, 2013, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER_SCAN1)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER_SCAN1
Version                   TNSLSNR for Linux: Version 11.2.0.4.0 - Production
Start Date                03-NOV-2016 09:34:20
Uptime                    0 days 7 hr. 43 min. 58 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /opt/app/11.2.0/grid/network/admin/listener.ora
Listener Log File         /opt/app/11.2.0/grid/log/diag/tnslsnr/jyrac2/listener_scan1/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER_SCAN1)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=192.168.56.160)(PORT=1521)))
Services Summary...
Service "jyzhao" has 1 instance(s).
  Instance "jyzhao2", status READY, has 1 handler(s) for this service...
Service "jyzhaoXDB" has 1 instance(s).
  Instance "jyzhao2", status READY, has 1 handler(s) for this service...
The command completed successfully
[grid@jyrac2 ~]$ 

4. Summary

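Back to the question posed at the start of this article:
When a RAC node is unavailable, is its VIP still available? Can it be used to connect to the database?
Answer: when a RAC node is unavailable, its VIP can still be pinged, but because no listener is listening on it, it cannot be used to connect to the database.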

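One last question: if the VIP can be pinged after failover but has no listener, so that it cannot be used to connect to the database while its node is down, what is the point of having a VIP at all?
Material found online gives the answer:
The VIP ensures that, when a failure occurs, clients do not have to wait for the underlying TCP/IP protocol to detect the broken connection. The failure is detected at the RAC level instead, which is much faster.
As a simple example: after node 1 fails, its VIP fails over to node 2. Node 2 does not listen on node 1's VIP, so a connection attempt to that address is rejected immediately (ORA-12541) instead of hanging until a TCP timeout. The client therefore learns quickly that node 1 is down and can switch to node 2. Note also that for failover to work, tnsnames.ora must be configured accordingly. For details on RAC tnsnames.ora configuration, see the article "Oracle RAC客户端tnsnames.ora相关配置及测试" (Oracle RAC client tnsnames.ora configuration and testing).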
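
The idea of switching to the next address after a fast rejection can be sketched as follows. This is only an analogy for connect-time failover across an address list, not Oracle Net's actual implementation; the function name `first_reachable` and the address list in the comment are assumptions for illustration.

```python
import socket


def first_reachable(addresses, timeout=3.0):
    """Try each (host, port) in order; return the first that accepts a
    TCP connection, or None if every attempt fails."""
    for host, port in addresses:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return (host, port)
        except OSError:
            # A refused connection fails almost immediately, so moving on
            # to the next address costs very little time.
            continue
    return None


# Illustrative address list for this lab (both node VIPs on port 1521):
# first_reachable([("192.168.56.151", 1521), ("192.168.56.153", 1521)])
```

Without a failed-over VIP, the first attempt would target an address that no host owns, and the loop would block for the full timeout before moving on. With the VIP relocated, the refusal arrives right away.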