The HACMP configuration on two nodes can get out of sync, or you may simply want to push a config from one node to another. We once had an admin make changes on a down node and then try to sync the cluster. To clean it up, we had to figure out which node had the latest config. To see which configuration instance number each of your HACMP nodes is using, run:
lssrc -l -s topsvcs | grep Instance
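If you are comparing more than a couple of nodes, it can help to strip the output down to just the number. A minimal sketch: `instance_number` is a made-up helper name, and it assumes the matching line ends in `= <number>`, so check it against your own output first.

```shell
#!/bin/sh
# Hypothetical helper: read "lssrc -l -s topsvcs" output on stdin and
# print only the value after "=" on the first line mentioning "Instance".
# Assumes that line ends in "= <number>".
instance_number() {
    grep Instance | sed 's/.*= *//' | head -n 1
}
```

On a node you would run `lssrc -l -s topsvcs | instance_number`, then compare the values by hand or over whatever remote shell you use.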
After upgrading to AIX 5.3 TL9 SP4, we found that secldapclntd goes into a death loop during an HACMP failover. It consumes more and more CPU until the system has no capacity left, which stalls the failover. Killing secldapclntd lets HACMP continue.
We didn’t see this behavior w/ AIX 5.3 TL8 SP3. IBM has identified a couple of issues that are probably combining to cause our problem, but they won’t be fixed in TL9… ever. IBM’s work-around is to set up pre- and post-event scripts that stop secldapclntd before the IP takeover (and release) and restart it afterward. In testing, this works pretty well, and it only takes a few seconds to stop and start secldapclntd.
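The work-around can be sketched roughly like this. It is not IBM's script: the function names are made up, and it assumes the stock `stop-secldapclntd` and `start-secldapclntd` commands live in `/usr/sbin`. You would still need to attach each function to the matching cluster event yourself.

```shell
#!/bin/sh
# Assumed locations of the stock AIX LDAP client control commands.
# Both can be overridden from the environment.
STOP_CMD=${STOP_CMD:-/usr/sbin/stop-secldapclntd}
START_CMD=${START_CMD:-/usr/sbin/start-secldapclntd}

# Pre-event (hypothetical name): stop the daemon before the IP address moves.
ldap_pre_event() {
    $STOP_CMD
}

# Post-event (hypothetical name): restart it once the takeover or release is done.
ldap_post_event() {
    $START_CMD
}
```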
secldapclntd Failure w/ HACMP full post
To test your disk heartbeats, you can look at the output of “cllsif” or “lssrc -ls topsvcs”, or you can actively test them with a command IBM provides. First, find the devices associated with the disk HB VG. I’ll assume hdisk4 on nodeA and hdisk5 on nodeB.
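The RSCT tool usually used for this is `dhb_read`. Here is a minimal sketch, assuming it is installed at `/usr/sbin/rsct/bin/dhb_read`; the wrapper function names are made up. Start the receiver on one node first, then the transmitter on the other.

```shell
#!/bin/sh
# Assumed install path of the RSCT disk heartbeat test tool.
DHB_READ=${DHB_READ:-/usr/sbin/rsct/bin/dhb_read}

# Run on nodeA: put the disk in receive mode and wait for the other side.
hb_receive() {
    $DHB_READ -p "$1" -r
}

# Run on nodeB: transmit over the same shared disk.
hb_transmit() {
    $DHB_READ -p "$1" -t
}
```

With the disks assumed above, that would be `hb_receive hdisk4` on nodeA followed by `hb_transmit hdisk5` on nodeB.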
Testing Disk Heartbeats full post
If the timestamps between the nodes in a Concurrent VG get out of sync, you can get:
WARNING: The HACMP timestamp file for shared volume group: vg1
is inconsistent with the time stamp in the VGDA for the
following nodes: node1
HACMP timestamp inconsistent full post
For more security, you can make your cluster encrypt inter-node communication, and you can turn this on with no downtime. Without it, operations are allowed or rejected based only on IP address, hostname, and the cluster rhosts file, and C-SPOC operations are not encrypted, password changes being one of the important ones. An even better option might be to create an IPsec VPN tunnel between the nodes, but I haven’t tested that.
Enable cluster encryption full post
When creating HACMP concurrent volume groups, you need to use the same major device number on every node. To see which major numbers each node has available, run:
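The stock AIX command for listing free major numbers is `lvlstmajor`. A minimal sketch for running it on every node in one go: the function name is made up, and `ssh` as the remote shell is an assumption.

```shell
#!/bin/sh
# Remote-shell command; ssh is an assumption, swap in whatever you use.
RUN=${RUN:-ssh}

# Print each node's free major numbers, prefixed with the node name.
list_free_majors() {
    for node in "$@"; do
        printf '%s: ' "$node"
        $RUN "$node" lvlstmajor
    done
}
```

Call it as `list_free_majors nodeA nodeB`, then pick a number that shows up as free on every node.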
On a multi-node HACMP cluster without enhanced concurrent VGs, any time you add an LV to a volume group, you have to make sure the other nodes see the LV. The same procedure also fixes other VG out-of-sync issues. You can either take everything down and do an importvg on all the nodes, or you can do a “Lazy Update”:
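As a rough sketch of the usual sequence, not necessarily the exact steps in the full post: release the disk reserve on the active node, have the other nodes learn the changes with `importvg -L`, then restore the reserve. The function below only prints the commands so you can review them before running anything. Its name and arguments are made up.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the lazy update steps for a VG.
# $1 = volume group name, $2 = a disk in that VG as seen on the other nodes.
lazy_update_plan() {
    vg=$1
    disk=$2
    echo "active node:  varyonvg -b -u $vg"      # break the reserve, leave the VG unlocked
    echo "other nodes:  importvg -L $vg $disk"   # learn the new LV definitions
    echo "active node:  varyonvg $vg"            # put the reserve back
}
```

For example, `lazy_update_plan vg1 hdisk4` prints the three steps for vg1.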
Lazy Update – HACMP full post