Avoiding split brain in HACMP: which aspects to consider

1. Overview

As mentioned in earlier posts, OCSSD is the most critical Clusterware process; if it runs into trouble the node reboots. This process provides the CSS (Cluster Synchronization Service) service. CSS monitors the cluster state in real time through several heartbeat mechanisms and provides basic cluster services such as split-brain protection.

CSS has two heartbeat mechanisms: a Network Heartbeat over the private interconnect, and a Disk Heartbeat against the Voting Disk.

Each heartbeat has a maximum allowed delay. For the Disk Heartbeat this delay is called IOT (I/O Timeout); for the Network Heartbeat it is called MC (Misscount). Both parameters are expressed in seconds. By default IOT is larger than MC, both are determined automatically by Oracle, and changing them is not recommended. They can be queried as follows:

$ crsctl get css disktimeout
$ crsctl get css misscount

For example:

[oracle@rac1 ~]$ crsctl get css disktimeout
200
[oracle@rac1 ~]$ crsctl get css misscount
60

These are the default values of the two parameters.

2. Related MOS notes

How to start/stop the 10g CRS ClusterWare [ID ]
10g RAC: Steps To Increase CSS Misscount, Reboottime and Disktimeout [ID ]
CSS Timeout Computation in Oracle Clusterware [ID ]
RAC Assurance Support Team: RAC and Oracle Clusterware Starter Kit and Best Practices (Generic) [ID ]

2.1 Steps to change the CSS misscount

1) Shut down CRS on all but one node. For exact steps use Note
2) Execute crsctl as root to modify the misscount:
   $ORA_CRS_HOME/bin/crsctl set css misscount <n>
   where <n> is the maximum i/o latency to the voting disk + 1 second
3) Reboot the node where the adjustment was made
4) Start all other nodes shut down in step 1

With Patch 4896338 for 10.2.0.1 there are two additional settings that can be tuned. This change is incorporated into the 10.2.0.2 and 10.1.0.6 patchsets.

The following are only relevant on 10.2.0.1 with Patch 4896338. In addition to misscount, CSS now has two more parameters:

1) reboottime (default 3 seconds) - the amount of time allowed for a node to complete a reboot after the CSS daemon has been evicted (i.e. how long it takes for the machine to shut down completely when you do a reboot).
2) disktimeout (default 200 seconds) - the maximum amount of time allowed for a voting file I/O; if this time is exceeded, the voting disk is marked as offline. Note that this is also the amount of time that will be required for initial cluster formation, i.e. when no nodes have previously been up and in a cluster.

   $CRS_HOME/bin/crsctl set css reboottime <r> [-force]   (<r> is seconds)
   $CRS_HOME/bin/crsctl set css disktimeout <d> [-force]  (<d> is seconds)

Confirm the new css misscount setting via ocrdump.

2.2 CSS Timeout Computation in Oracle Clusterware

2.2.1 MISSCOUNT DEFINITION AND DEFAULT VALUES

The CSS misscount parameter represents the maximum time, in seconds, that a network heartbeat can be missed before entering into a cluster reconfiguration to evict the node. The following are the default values for the misscount parameter, in seconds, and their respective versions when using Oracle Clusterware*:
10g (R1 & R2): 30

* The CSS misscount default value when using vendor (non-Oracle) clusterware is 600 seconds. This is to allow the vendor clusterware ample time to resolve any possible split brain scenarios.

On AIX platforms with HACMP, starting with 10.2.0.3 BP#1 the misscount is 30. This is documented in [ID ].

2.2.2 CSS HEARTBEAT MECHANISMS AND THEIR INTERRELATIONSHIP

The synchronization services component (CSS) of Oracle Clusterware maintains two heartbeat mechanisms: 1) the disk heartbeat to the voting device and 2) the network heartbeat across the interconnect, which establish and confirm valid node membership in the cluster.

Both of these heartbeat mechanisms have an associated timeout value. The disk heartbeat has an internal i/o timeout interval (DTO, Disk TimeOut), in seconds, within which an i/o to the voting disk must complete. The misscount parameter (MC), as stated above, is the maximum time, in seconds, that a network heartbeat can be missed. The disk heartbeat i/o timeout interval is directly related to the misscount parameter setting. There has been some variation in this relationship between versions, as described below:
Version                                      Disk I/O timeout (DTO / IOT)
-------------------------------------------  ------------------------------------------------------
9.x                                          NOTE: misscount was a different entity in this release
                                             No one should be on this version
                                             DTO = MC - 15 seconds
                                             DTO = MC - 15 seconds
10.1.0.4 + unpublished Bug 3306964           DTO = MC - 3 seconds
10.1.0.4 with CRS II Merge patch             DTO = Disktimeout (defaults to 200 seconds) normally,
                                             OR Misscount seconds only during initial cluster
                                             formation or slightly before reconfiguration
10.2.0.1                                     IOT = MC - 3 seconds
10.2.0.1 + fix for unpublished Bug 4896338   IOT = Disktimeout (defaults to 200 seconds) normally,
                                             OR Misscount seconds only during initial cluster
                                             formation or slightly before reconfiguration
10.2.0.2 and later                           Same as above (10.2.0.1 with Patch)
During node join and node leave (reconfiguration) in a cluster we need to reconfigure; in that particular case we use a Short Disk TimeOut (SDTO), which in all versions is SDTO = MC - reboottime (reboottime is usually 3 seconds). For example, with the default misscount of 30 seconds and the default reboottime of 3 seconds, SDTO = 30 - 3 = 27 seconds.

Misscount drives cluster membership reconfigurations and directly affects the availability of the cluster. In most cases the default settings for MC should be acceptable. Modifying the default value of misscount not only influences the timeout interval for the i/o to the voting disk, but also influences the tolerance for missed network heartbeats across the interconnect.

2.2.3 LONG LATENCIES TO THE VOTING DISKS

If I/O latencies to the voting disk are greater than the default DTO calculations noted above, the cluster may experience CSS node evictions depending on (a) the Oracle Clusterware (CRS) version, (b) whether the merge patch has been applied, and (c) the state of the cluster. More details on this are covered in the section "Change in Behavior with CRS Merge Patch 4896338 on 10.2.0.1".

These latencies can be attributed to any number of problems in the i/o subsystem or with any component in the i/o path. The following is a non-exhaustive list of reported problems which resulted in CSS node eviction due to latencies to the voting disk longer than the default Oracle Clusterware i/o timeout value (DTO):

1. QLogic HBA cards with a Link Down Timeout greater than the default misscount.
2. Bad cables to the SAN/storage array that affect i/o latencies.
3. SAN switch (like Brocade) failover latency greater than the default misscount.
4. EMC CLARiiON array trespassing the SP to the backup SP taking longer than the default misscount.
5. EMC PowerPath path error detection and I/O repost and redirect taking longer than the default misscount.
6. NetApp Cluster (CFO) failover latency greater than the default misscount.
7. Sustained high CPU load which affects the CSSD disk ping monitoring thread.
8. Poor SAN network configuration that creates latencies in the I/O path.

The most common problems relate to multi-path IO software drivers and the reconfiguration times resulting from a failure in the IO path. Hardware and (re)configuration issues that introduce these latencies should be corrected. Incompatible failover times with the underlying OS, network or storage hardware or software may be addressed given a complete understanding of the considerations listed below.

Misscount should NOT be modified to work around the above-mentioned issues. Oracle Support recommends that you apply the latest patchset, which changes the CSS behaviour. More details are covered in the next section.

2.2.4 Change in behavior with Patch 4896338 applied on top of 10.2.0.1

Starting with 10.2.0.1 plus this patch, CSS will not evict the node from the cluster because an I/O to the voting disk (DTO) takes more than misscount seconds, unless it happens during the initial cluster formation or slightly before reconfiguration.

So if we have N nodes in a cluster and one of the nodes takes more than misscount seconds to access the voting disk, the node will not be evicted as long as the access to the voting disk completes within disktimeout seconds. Consequently, with this patch there is no need to increase the misscount at all.

Additionally, this merge patch introduces disktimeout, which is the amount of time that a lack of disk pings to the voting disk(s) will be tolerated. Note: applying the patch will not change your value for misscount.

The table below explains the conditions under which an eviction will occur:
Network Ping                          Disk Ping                                        Reboot
------------------------------------  -----------------------------------------------  ------
Completes within Misscount seconds    Completes within Misscount seconds               N
Completes within Misscount seconds    Takes more than Misscount seconds but less       N
                                      than Disktimeout seconds
Completes within Misscount seconds    Takes more than Disktimeout seconds              Y
Takes more than Misscount seconds     Completes within Misscount seconds               Y
* By default, Misscount is less than Disktimeout seconds.

2.2.5 CONSIDERATIONS WHEN CHANGING MISSCOUNT FROM THE DEFAULT VALUE

1. Customers drive SLA and cluster availability. The customer ultimately defines service levels and availability for the cluster. Before recommending any change to misscount, the full impact of that change should be described and the impact on cluster availability measured.
2. Customers may have timeout and retry logic in their applications. The impact of delaying reconfiguration may cause 'artificial' timeouts of the application, reconnect failures and subsequent logon storms.
3. Misscount timeout values are version dependent and are subject to change. As we have seen, misscount calculations vary between releases and between versions within a release. Creating a false dependency on the misscount calculation in one version may not be appropriate for later versions.
4. Internal I/O timeout interval (DTO) algorithms may change in later releases. As stated above, there is a direct relationship between the internal I/O timeout interval and misscount; this relationship is subject to change in later releases.
5. An increase in misscount to compensate for i/o latencies directly affects reconfiguration times for network failures. The network heartbeat is the primary indicator of connectivity within the cluster. Misscount is the tolerance level of missed 'check-ins' that triggers cluster reconfiguration. Increasing misscount will prolong the time to take corrective action in the event of a network failure or other anomalies affecting the availability of a node in the cluster. This directly affects cluster availability.
6. Changing misscount to work around voting disk latencies must be undone once the underlying disk latency is corrected. The customer needs to document the change and set the parameter back to the default when the underlying storage I/O latency is resolved.
7. Do not change the default misscount values if you are running vendor clusterware along with Oracle Clusterware. Modifying misscount in this environment may cause clusterwide outages and potential corruptions.
8. Changing the misscount parameter incurs a clusterwide outage. As noted below, the customer will need to schedule a clusterwide outage to make this change.
9. Changing misscount should not be used to compensate for poor configurations or faulty hardware.
10. Cluster and RDBMS availability are directly affected by high misscount settings.
11. In the case of stretched clusters and stretched storage systems, a site failure in which we lose one storage array and N nodes puts the cluster into a reconfiguration state, and the internal I/O timeout for the voting disks then reverts to the ShortDiskTimeOut (SDTO) value. Several cases are known with stretched clusters where, when a site failure happens, the storage failover cannot complete within SDTO. If I/O to the voting disks is blocked for longer than SDTO, the result is node evictions on the surviving side.

To change misscount back to the default, please refer to [ID ]. THIS IS THE ONLY SUPPORTED METHOD. NOT FOLLOWING THIS METHOD RISKS EVICTIONS AND/OR CORRUPTING THE OCR.

10g Release 2: MIRRORED VOTING DISKS AND VENDOR MULTIPATHING SOLUTIONS

Oracle RAC 10g Release 2 allows for multiple voting disks so that the customer does not have to rely on a multipathing solution from a storage vendor.
You can have n voting disks (up to 31), where n = m*2 + 1 and m is the number of disk failures you want to survive. Oracle recommends placing each voting disk on a separate physical disk.
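For example, to survive m = 2 simultaneous voting-disk failures you need n = 2*2 + 1 = 5 voting disks. A minimal sketch of adding extra voting disks in 10g Release 2 follows; the raw-device paths are hypothetical, and the -force flag assumes CRS is stopped on all nodes while the change is made:

# run as root with CRS stopped cluster-wide; device paths are placeholders
crsctl add css votedisk /dev/raw/raw3 -force
crsctl add css votedisk /dev/raw/raw4 -force

# verify the voting disk configuration afterwards
crsctl query css votedisk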
As PowerVM sees wider use, more and more PowerHA deployments will be implemented in virtualized environments. Running the traditional PowerHA 6.1 on physical partitions is the classic configuration; PowerHA 7.1 was designed with PowerVM in mind, mainly in three respects: PowerHA 7.1 allows an HA node to have just one network adapter, one boot IP and one service IP, and the service IP may be in the same subnet as the boot IP; the netmon.cf function can be implemented successfully in a virtualized environment, which solves the problem of PowerHA monitoring the state of virtual network adapters; and the FC heartbeat can be implemented successfully in a virtual environment. This article covers the key points of implementing PowerHA in a virtualized environment.
Configuring netmon.cf: PowerHA 7.1 monitoring of virtual networks
In a traditional HA environment, PowerHA can monitor the network by watching the state of the physical network adapters. In a virtualized environment, however, the virtual adapter in the VIOC never goes down or detached (unless someone does it by hand). The consequence is that the VIOC may already be unable to communicate with the outside world, yet because its virtual adapter still shows up, HA does not recognize the network failure and the resource group does not fail over. The result is a service outage: HA fails to do exactly the job it exists for, and loses its purpose.
Therefore, when implementing PowerHA 7.1 in a PowerVM environment, netmon.cf must be configured. In netmon.cf we have the HA node's local adapter ping target addresses, and use the result to decide whether the virtual adapter can really communicate.
The format recommended for the netmon.cf file in PowerHA 7.1 is:
# cat /usr/es/sbin/cluster/netmon.cf
!REQD 172.16.25.175 172.16.24.82
Here 172.16.25.175 is the boot IP of the HA node and 172.16.24.82 is the target IP. It is usually advisable to list several target IP addresses in this file (it may hold at most 32 lines): when the first IP cannot be pinged, the local node tries the second, and so on until every IP in the file has failed. This guards against spurious resource-group failovers caused by an unstable network. The target IPs may differ between the netmon.cf files on different HA nodes, as in the sketch below.
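A hedged example of a multi-target netmon.cf; apart from the boot IP 172.16.25.175 and the first target 172.16.24.82 taken from this environment, the addresses are placeholders, and in practice you would pick stable addresses outside the cluster (default gateway, DNS, NTP server, and so on):

# cat /usr/es/sbin/cluster/netmon.cf
!REQD 172.16.25.175 172.16.24.82
!REQD 172.16.25.175 172.16.25.1
!REQD 172.16.25.175 172.16.24.10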
netmon.cf detects a virtual network problem and triggers a resource-group failover only when both of the following are true:

1. the partition on which netmon.cf is configured cannot ping the target addresses listed in netmon.cf;
2. the multicast network heartbeat between the HA nodes also fails.
Verifying the netmon.cf function
We use a two-node PowerHA 7.1 cluster as the test environment. There are two physical servers; each hosts one VIOS and one VIOC. PowerHA is configured between the two VIOCs, and netmon.cf is configured on both HA nodes.
View the configuration file:
# cat /usr/es/sbin/cluster/netmon.cf
!REQD 172.16.25.175 172.16.24.82
Check the resource-group state: resource group rg1 is running on HA1 and the service (floating) IP 172.16.25.178 is up.
# clRGinfo
-----------------------------------------------------------------------------
Group Name     State                        Node
-----------------------------------------------------------------------------
rg1            ONLINE                       node1
               OFFLINE                      node2
# netstat -in
Name  Mtu   Network     Address            Ipkts Ierrs    Opkts Oerrs  Coll
en0   1500  link#2      ce.2.cc.e.30.a    181132     0    14699     0     0
en0   1500  172.16.25   172.16.25.178     181132     0    14699     0     0
en0   1500  172.16.25   172.16.25.175     181132     0    14699     0     0
lo0   16896 link#1                         16237     0    16237     0     0
lo0   16896 127         127.0.0.1          16237     0    16237     0     0
lo0   16896 ::1%1                          16237     0    16237     0     0
Initially, node HA1 can ping the target address configured in netmon.cf (172.16.24.82), and ICMP packets flow normally in both directions between the source and the target.
# tcpdump host 172.16.24.82
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on en0, link-type 1, capture size 96 bytes
21:33:18.669852 IP node1 > 172.16.24.82: ICMP echo request, id 488, seq 587, length 43
21:33:18.670058 IP 172.16.24.82 > node1: ICMP echo reply, id 488, seq 587, length 43
Next, break the communication between node HA1 and the target address (for example by deleting the route, taking the target's network adapter down, or shutting down the target partition; a hedged sketch follows). When HA1 can no longer ping 172.16.24.82, HA1 still works normally and the resource group does not fail over.
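A minimal sketch of two ways to simulate the loss of communication; the interface name, network, netmask and gateway below are hypothetical values for this test setup:

# On the target host: take its interface down (interface name en0 assumed)
ifconfig en0 down

# Or, on HA1: remove the route toward the target network
# (172.16.25.1 is a hypothetical gateway)
route delete -net 172.16.24.0 -netmask 255.255.255.0 172.16.25.1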
The output below shows that HA1 and the target address no longer exchange packets normally.
# tcpdump host 172.16.24.82
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on en0, link-type 1, capture size 96 bytes
21:00:59.785591 ARP, Request who-has 172.16.24.82 tell 172.16.24.1, length 46
21:01:01.071314 IP node1 > 172.16.24.82: ICMP echo request, id 488, seq 184, length 43
21:01:01.426657 IP node1 > 172.16.24.82: ICMP echo request, id 488, seq 184, length 43
21:01:01.782209 IP node1 > 172.16.24.82: ICMP echo request, id 488, seq 184, length 43
At this point we might take it for granted that the local adapter will be flagged as failed. Not so: there is no error at all in the PowerHA log hacmp.out or in the output of the PowerHA command lscluster -m, the network is reported as healthy, and the resource group does not fail over, because node HA1 can still successfully send multicast messages to node HA2.
Now delete the SEA on the VIOS that provides network service to HA1 (or unplug the VIOS network cable). Logging in to HA1 through the console, hacmp.out shows a network error:
Mar 13 21:19:34 EVENT COMPLETED: network_down_complete node1 net_ether_01 0
Note that HA distinguishes two network error codes, 0 and -1. Code 0 indicates a local network failure and triggers a resource-group failover; code -1 indicates a global network failure and does not trigger a failover.
Now check the network state from the PowerHA command line:
In lscluster -m the interface state is DOWN:
# lscluster -m
Points of contact for node: 2
------------------------------------------
Interface     State   Protocol   Status
------------------------------------------
dpcom         DOWN    none       RESTRICTED
en0           DOWN    IPv4       none
At this point, if the resource group contains a service (floating) IP resource, a resource-group failover is triggered.
HACMP Event Preamble
----------------------------------------------------------------------------
Enqueued rg_move release event for resource group rg1.
Reason for recovery of Primary instance of Resource group 'rg1' from TEMP_ERROR state on node 'node1' was 'Local network failure'.
Looking at the PowerHA log hacmp.out, the resource group comes online on node HA2 after a little under 30 seconds:
.....................
Mar 13 21:51:00 EVENT COMPLETED: resource_state_change_complete node1 0
# clRGinfo
-----------------------------------------------------------------------------
Group Name     State                        Node
-----------------------------------------------------------------------------
rg1            OFFLINE                      node1
               ONLINE                       node2
How to verify multicast communication between the HA nodes
Take a two-node HA cluster as an example. The HA multicast address is 228.16.25.175 and the two HA nodes are named node1 and node2.
When multicast communication between the HA nodes is working, it behaves as follows:
On HA node1:
Send packets from node1 to the multicast IP.
On node2, receive packets from the multicast address; the output shows that they are received.
On HA node2:
If mping on node2 produces no output, multicast communication between the nodes is broken; the network switches must then be configured to let multicast traffic through. A hedged sketch of this mping check follows.
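A minimal sketch of the check using the AIX mping utility; the flags shown (-s send, -r receive, -a address, -c count, -v verbose) reflect common usage and should be verified against your AIX level:

# On node1: send a few packets to the cluster multicast address
mping -v -s -a 228.16.25.175 -c 5

# On node2: listen on the same multicast address; packets arriving here
# confirm that multicast traffic passes between the nodes
mping -v -r -a 228.16.25.175 -c 5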
Implementing the FC heartbeat in PowerVM

The concept of the FC heartbeat
PowerHA 7.1 has three kinds of heartbeat: the Ethernet multicast heartbeat, the FC heartbeat and the repository disk heartbeat. To make HA more stable and guard effectively against split brain, customers are usually advised to configure the FC heartbeat during implementation. While the IP heartbeat or the SAN heartbeat is working, the repository disk stays in the UP RESTRICTED AIXCONTROLLED state; it is only a standby and carries no heartbeat traffic. When neither the IP heartbeat nor the SAN heartbeat is available, the repository disk moves to the UP AIX_CONTROLLED state and carries the heartbeat messages.
In a PowerVM environment the VIOC normally uses virtual HBAs and has no physical HBA. How is the FC heartbeat implemented in that case? The key points are:
1. First connect one physical FC port from each of the two VIOSes to a FC switch and create a zone containing those two FC ports. Then create a new virtual Ethernet adapter in both the VIOS and the VIOC (adding a VLAN tag of 3358 to the VIOS SEA also works), with VLAN ID 3358. The physical switch port connected to the SEA does not need VLAN 3358 tagged, and no IP address may be configured on the 3358 virtual adapters in the VIOC or the VIOS.
2. When zoning, only the physical FC adapters of the two (or more) VIOSes need to be zoned together. A virtual FC adapter in the VIOC is not required (production environments mostly use NPIV, so the VIOC usually has vfc devices anyway); if the VIOC has no vfc adapter, the FC heartbeat can still be carried over the vscsi client.
3. The virtual FC heartbeat is carried by the sfwcomm devices between the VIOS and the VIOC, i.e. the devices that correspond to VLAN 3358.
4. If the VIOS has a spare physical FC port that can be dedicated to the FC heartbeat, zone those two WWPNs on their own. If ports are scarce, the FC ports already used for storage mapping can be reused, but it is best to put the two WWPNs into a separate, new zone (the existing storage zone already contains both WWPNs, so the heartbeat would work without a new zone, but a dedicated zone keeps storage traffic out of the way).

Implementation steps for the FC heartbeat in PowerVM
1. Before configuring the FC heartbeat, check on the HA node (VIOC):
# lscluster -i sfwcom
Interface sfwcom not found

There is no sfwcom interface yet.
Figure 1. Architecture of the FC heartbeat in a virtualized environment
2. Adjust the attributes of the physical FC adapter in the VIOS:
chdev -P -l fcs0 -a tme=yes
chdev -P -l fscsi0 -a dyntrk=yes -a fc_err_recov=fast_fail
Note that because the HBA has child devices hanging off it, changing its attributes directly will fail. The -P flag is therefore required, so that the change is first written only to the ODM; it takes effect after the next reboot.
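After the reboot, the new attribute values can be checked with lsattr; a small sketch (the description text in the output may differ by adapter type and driver level):

# confirm target mode is enabled on the adapter
lsattr -El fcs0 -a tme

# confirm dynamic tracking and fast_fail error recovery on the protocol device
lsattr -El fscsi0 -a dyntrk -a fc_err_recov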
3. Create the VLAN
Add VLAN 3358 to the SEA on the VIOS (or use DLPAR to add a virtual Ethernet adapter with PVID 3358 to the VIOS directly, and then save the change to the partition profile):
Figure 2. Adding the VLAN tag to the SEA on the VIOS
If VLAN 3358 is added on the SEA, the VIOS must be de-activated and re-activated after the change. If instead a new virtual adapter with PVID 3358 is added through DLPAR, no de-activate/re-activate of the VIOS is needed.
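One way to confirm that the VLAN is visible on the VIOS is to look at the adapter statistics; a hedged sketch in which ent5 is a hypothetical SEA (or virtual trunk adapter) device name, and the exact wording of the statistics output may vary by level:

# from the VIOS root shell: list the VLAN tag IDs carried by the adapter
entstat -d ent5 | grep -i "VLAN Tag IDs"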
On the VIOC, use DLPAR to add a new virtual Ethernet adapter (and then save the configuration on the HMC) that points at VLAN 3358 on the VIOS:
Figure 3. Viewing the virtual adapter
Figure 4. Viewing the virtual adapter
Run cfgmgr on the VIOC to rescan the devices.
Then check again on the HA node (VIOC): sfwcom is now visible.
# lscluster -i sfwcom
Network/Storage Interface Query
Cluster Name: wxycluster
Cluster UUID: 397cd38e-8bdc-11e2-844a-ce02cc0e300a
Number of nodes reporting = 1
Number of nodes stale = 0
Number of nodes expected = 1
Node node1
Node UUID = -8bdc-11e2-844a-ce02cc0e300a
Number of interfaces discovered = 1
        Interface number 1, sfwcom
                IFNET type = 0 (none)
                NDD type = 304 (NDD_SANCOMM)
                Smoothed RTT across interface = 0
                Mean deviation in network RTT across interface = 0
                Probe interval for interface
                IFNET flags for interface
                NDD flags for interface
                Interface state = UP

# lsdev -C|grep sfw
sfw0           Available             Storage Framework Module
sfwcomm0       Available 20-T1-01-FF Fibre Channel Storage Framework Comm
sfwcomm1       Available             vLAN Storage Framework Comm
With this, the PowerHA 7.1 FC heartbeat in a virtualized environment is complete.
To test it, run halt -q on node node1 and then, from node2, observe the state of node1's sfwcom interface: it shows as STALE, while node2's own sfwcom remains normal, which matches expectations:
Node node1
Node UUID = -8bdc-11e2-844a-ce02cc0e300a
Number of interfaces discovered = 3
        Interface number 1, en0
                IFNET type = 6 (IFT_ETHER)
                NDD type = 7 (NDD_ISO88023)
                MAC address length = 6
                MAC address = CE:02:CC:0E:30:0A
                Smoothed RTT across interface = 7
                Mean deviation in network RTT across interface = 3
                Probe interval for interface
                IFNET flags for interface = 0x1E080863
                NDD flags for interface = 0x0021081B
                Interface state = STALE
                Number of regular addresses configured on interface = 2
                IPv4 ADDRESS: 172.16.25.175 broadcast 172.16.25.255 netmask 255.255.255.0
                IPv4 ADDRESS: 172.16.25.178 broadcast 172.16.25.255 netmask 255.255.255.0
                Number of cluster multicast addresses configured on interface = 1
                IPv4 MULTICAST ADDRESS: 228.16.25.175
        Interface number 2, sfwcom
                IFNET type = 0 (none)
                NDD type = 304 (NDD_SANCOMM)
                Smoothed RTT across interface = 0
                Mean deviation in network RTT across interface = 0
                Probe interval for interface
                IFNET flags for interface
                NDD flags for interface
                Interface state = STALE
        Interface number 3, dpcom
                IFNET type = 0 (none)
                NDD type = 305 (NDD_PINGCOMM)
                Smoothed RTT across interface = 76
                Mean deviation in network RTT across interface = 7
                Probe interval for interface
                IFNET flags for interface
                NDD flags for interface
                Interface state = STALE
Judging from the implementations carried out so far, running PowerHA 7.1 in a PowerVM environment poses no problem at all, and PowerHA provides the same functions as it does on physical partitions. The key technical points are summarized below:
PowerHA 7.1 allows an HA node to have just one network adapter, one boot IP and one service IP, with the service IP in the same subnet as the boot IP; this makes it easy to simplify the network layout inside the VIOC (network redundancy is provided by NIB or EtherChannel on the SEA in the VIOS).
The working netmon.cf function solves the problem of PowerHA monitoring the state of virtual network adapters.
The FC heartbeat in a virtualized environment lets a VIOC without a physical HBA still benefit from a (virtual) FC heartbeat, making PowerHA more stable and effectively preventing split brain.