Switch a cluster from Broadcast mode to Multicast Mode
1. Edit /etc/cluster/cluster.conf and change the version number
[root@computer ~]# vi /etc/cluster/cluster.conf
Look at line 2 for the version number:
<?xml version="1.0"?>
<cluster config_version="11" name="ttrs">
Increment it to version 12 so it looks like this:
<cluster config_version="12" name="ttrs">
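If you prefer not to edit the number by hand, the bump can be scripted. This is only a minimal sketch: it assumes GNU sed and that the config_version attribute sits on a single line, and the function name bump_config_version is made up for this example. Try it on a copy of the file first.

```shell
#!/bin/bash
# bump_config_version FILE
# Hypothetical helper: increments config_version="N" in FILE by one.
# Assumes the attribute appears as config_version="<digits>" on one line.
bump_config_version() {
    local file="$1" current next
    # Extract the current number from the first matching line.
    current=$(sed -n 's/.*config_version="\([0-9][0-9]*\)".*/\1/p' "$file" | head -n 1)
    if [ -z "$current" ]; then
        echo "config_version not found in $file" >&2
        return 1
    fi
    next=$((current + 1))
    # Keep a backup, then rewrite the attribute in place (GNU sed -i).
    cp -p "$file" "$file.bak" || return 1
    sed -i "s/config_version=\"$current\"/config_version=\"$next\"/" "$file" || return 1
    # Print the new version so the caller can confirm it.
    echo "$next"
}
```

For example, bump_config_version /tmp/cluster.conf.copy prints the new version number and leaves the previous file next to it with a .bak suffix.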
2. Change the cluster join delay
When the cluster is quorate and the fence domain is first created (by a fence daemon being started), any nodes not yet in the cluster will be fenced. By default there is a delay of 6 seconds in this case, which gives nodes that were unnecessarily flagged for fencing time to join the cluster and avoid being fenced.
This delay can be increased by setting post_join_delay in cluster.conf:
<fence_daemon post_join_delay="30"/>
Change line 3 to look like this:
<fence_daemon post_fail_delay="0" post_join_delay="30"/>
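To confirm the value you just set (it is also the window you will have for starting both nodes later), you can read the attribute back. A minimal sketch, assuming the whole fence_daemon element is on one line; get_fence_attr is a hypothetical name, not part of the cluster tools.

```shell
#!/bin/bash
# get_fence_attr FILE ATTR
# Hypothetical helper: prints the value of ATTR from the <fence_daemon .../> line.
# Assumes the element is written on a single line.
get_fence_attr() {
    local file="$1" attr="$2"
    # The leading whitespace match keeps ATTR from matching a longer attribute name.
    grep '<fence_daemon' "$file" \
        | sed -n "s/.*[[:space:]]$attr=\"\([^\"]*\)\".*/\1/p" \
        | head -n 1
}
```

Calling get_fence_attr /etc/cluster/cluster.conf post_join_delay should print the delay in seconds.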
3. Continue editing /etc/cluster/cluster.conf and change the communication mode
Line 26 should look like this:
<cman broadcast="yes" expected_votes="1" two_node="1"/>
Remove broadcast="yes" and add a multicast element like this:
<multicast addr="239.199.10.1" interface="bond1"/>
(239.199.10.1 is just an example; get a multicast IP that is unused on your network.)
You need to add this line to every clusternode in the cluster and to the cman section, so the file looks like this:
<?xml version="1.0"?>
<cluster alias="clustertest" config_version="55" name="rhcluster">
  <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="30"/>
  <clusternodes>
    <clusternode name="rhcluster01-priv" nodeid="1" votes="1">
      <fence>
        <method name="1">
          <device name="pdu_01a" option="off" port="13"/>
          <device name="pdu_02b" option="off" port="13"/>
          <device name="pdu_01a" option="on" port="13"/>
          <device name="pdu_02b" option="on" port="13"/>
        </method>
      </fence>
      <multicast addr="239.199.10.1" interface="bond1"/>
    </clusternode>
    <clusternode name="rhcluster02-priv" nodeid="2" votes="1">
      <fence>
        <method name="1">
          <device name="pdu_13a" option="off" port="14"/>
          <device name="pdu_13b" option="off" port="14"/>
          <device name="pdu_13a" option="on" port="14"/>
          <device name="pdu_13b" option="on" port="14"/>
        </method>
      </fence>
      <multicast addr="239.199.10.1" interface="bond1"/>
    </clusternode>
  </clusternodes>
  <cman expected_votes="1" two_node="1">
    <multicast addr="239.199.10.1" interface="bond1"/>
  </cman>
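Since the same multicast line has to appear in several places, it is easy to end up with a mismatch. The sketch below is one way to catch that before restarting anything. It assumes each multicast element is on a single line, and check_multicast is a hypothetical helper name.

```shell
#!/bin/bash
# check_multicast FILE
# Hypothetical helper: fails if broadcast="yes" is still present, if no
# <multicast .../> entry exists, or if the entries use different addresses.
# On success it prints the single multicast address in use.
check_multicast() {
    local file="$1" distinct
    if grep -q 'broadcast="yes"' "$file"; then
        echo "broadcast is still enabled" >&2
        return 1
    fi
    # Collect the distinct multicast addresses, one per line.
    distinct=$(sed -n 's/.*<multicast[[:space:]].*addr="\([^"]*\)".*/\1/p' "$file" | sort -u)
    if [ -z "$distinct" ]; then
        echo "no multicast entries found" >&2
        return 1
    fi
    if [ "$(printf '%s\n' "$distinct" | wc -l)" -ne 1 ]; then
        echo "multicast addresses differ:" >&2
        printf '%s\n' "$distinct" >&2
        return 1
    fi
    echo "$distinct"
}
```

Run it as check_multicast /etc/cluster/cluster.conf on each node. It only looks at the text of the file and does not replace a real XML validation.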
4. Stop the cluster on both nodes
[root@server1.net ~]# service rgmanager stop
[root@server1.net ~]# service fenced stop
[root@server1.net ~]# service cman stop
[root@server2.net ~]# service rgmanager stop
[root@server2.net ~]# service fenced stop
[root@server2.net ~]# service cman stop
5. Start the cluster on both nodes, quickly (within the 30-second post_join_delay)
Start the services in the reverse of the stop order, because rgmanager needs cman to be running first:
[root@server1.net ~]# service cman start
[root@server1.net ~]# service fenced start
[root@server1.net ~]# service rgmanager start
[root@server2.net ~]# service cman start
[root@server2.net ~]# service fenced start
[root@server2.net ~]# service rgmanager start
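Because both nodes have to come back within the join delay, it can help to have the command sequence ready before you begin. The sketch below only prints the commands; it assumes the start order is the reverse of the stop order (cman first, rgmanager last), and restart_plan is a hypothetical helper name.

```shell
#!/bin/bash
# restart_plan ACTION
# Hypothetical helper: prints the service commands for ACTION (stop|start)
# in dependency order. It only prints; nothing is executed.
restart_plan() {
    local action="$1" svc
    case "$action" in
        stop)  set -- rgmanager fenced cman ;;   # tear down from the top
        start) set -- cman fenced rgmanager ;;   # bring up from the bottom
        *) echo "usage: restart_plan stop|start" >&2; return 1 ;;
    esac
    for svc in "$@"; do
        echo "service $svc $action"
    done
}
```

Review the output of restart_plan stop and restart_plan start first; once you are happy with it, restart_plan start | sh runs the commands on the node you are logged in to.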
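6. Look at the log file at /var/log/messages
You should see something like the output below, with names and addresses specific to the server and app you are working on.
Important markers in these messages include the multicast socket lines, "entering OPERATIONAL state", a "got nodejoin message" line for each node, and the service start notice from clurgmgrd:
[root@server-01-01 cluster]# tail -100 /var/log/messages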
Feb 10 16:08:00 server-01-01 ccsd[21438]: Built: Jul 28 2010 19:18:39
Feb 10 16:08:00 server-01-01 ccsd[21438]: Copyright (C) Red Hat, Inc. 2004 All rights reserved.
Feb 10 16:08:00 server-01-01 ccsd[21438]: cluster.conf (cluster name = rhclustertest, version = 56) found.
Feb 10 16:08:01 server-01-01 openais[21448]: [MAIN ] AIS Executive Service RELEASE 'subrev 1887 version 0.80.6'
Feb 10 16:08:01 server-01-01 openais[21448]: [MAIN ] Copyright (C) 2002-2006 MontaVista Software, Inc and contributors.
Feb 10 16:08:01 server-01-01 openais[21448]: [MAIN ] Copyright (C) 2006 Red Hat, Inc.
Feb 10 16:08:01 server-01-01 openais[21448]: [MAIN ] AIS Executive Service: started and ready to provide service.
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] Token Timeout (10000 ms) retransmit timeout (495 ms)
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] token hold (386 ms) retransmits before loss (20 retrans)
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] join (60 ms) send_join (0 ms) consensus (4800 ms) merge (200 ms)
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] downcheck (1000 ms) fail to recv const (50 msgs)
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] seqno unchanged const (30 rotations) Maximum network MTU 1402
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] window size per rotation (50 messages) maximum messages per rotation (17 messages)
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] send threads (0 threads)
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] RRP token expired timeout (495 ms)
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] RRP token problem counter (2000 ms)
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] RRP threshold (10 problem count)
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] RRP mode set to none.
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] heartbeat_failures_allowed (0)
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] max_network_delay (50 ms)
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] HeartBeat is Disabled. To enable set heartbeat_failures_allowed > 0
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] Receive multicast socket recv buffer size (320000 bytes).
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] Transmit multicast socket send buffer size (262142 bytes).
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] The network interface [10.0.80.1] is now up.
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] Created or loaded sequence id 196.10.0.80.1 for this ring.
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] entering GATHER state from 15.
Feb 10 16:08:01 server-01-01 openais[21448]: [CMAN ] CMAN 2.0.115 (built Jul 28 2010 19:18:41) started
Feb 10 16:08:01 server-01-01 openais[21448]: [MAIN ] Service initialized 'openais CMAN membership service 2.01'
Feb 10 16:08:01 server-01-01 openais[21448]: [SERV ] Service initialized 'openais extended virtual synchrony service'
Feb 10 16:08:01 server-01-01 openais[21448]: [SERV ] Service initialized 'openais cluster membership service B.01.01'
Feb 10 16:08:01 server-01-01 openais[21448]: [SERV ] Service initialized 'openais availability management framework B.01.01'
Feb 10 16:08:01 server-01-01 openais[21448]: [SERV ] Service initialized 'openais checkpoint service B.01.01'
Feb 10 16:08:01 server-01-01 openais[21448]: [SERV ] Service initialized 'openais event service B.01.01'
Feb 10 16:08:01 server-01-01 openais[21448]: [SERV ] Service initialized 'openais distributed locking service B.01.01'
Feb 10 16:08:01 server-01-01 openais[21448]: [SERV ] Service initialized 'openais message service B.01.01'
Feb 10 16:08:01 server-01-01 openais[21448]: [SERV ] Service initialized 'openais configuration service'
Feb 10 16:08:01 server-01-01 openais[21448]: [SERV ] Service initialized 'openais cluster closed process group service v1.01'
Feb 10 16:08:01 server-01-01 openais[21448]: [SERV ] Service initialized 'openais cluster config database access v1.01'
Feb 10 16:08:01 server-01-01 openais[21448]: [SYNC ] Not using a virtual synchrony filter.
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] Creating commit token because I am the rep.
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] Saving state aru 0 high seq received 0
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] Storing new sequence id for ring c8
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] entering COMMIT state.
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] entering RECOVERY state.
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] position [0] member 10.0.80.1:
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] previous ring seq 196 rep 10.0.80.1
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] aru 0 high delivered 0 received flag 1
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] Did not need to originate any messages in recovery.
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] Sending initial ORF token
Feb 10 16:08:01 server-01-01 openais[21448]: [CLM ] CLM CONFIGURATION CHANGE
Feb 10 16:08:01 server-01-01 openais[21448]: [CLM ] New Configuration:
Feb 10 16:08:01 server-01-01 openais[21448]: [CLM ] Members Left:
Feb 10 16:08:01 server-01-01 openais[21448]: [CLM ] Members Joined:
Feb 10 16:08:01 server-01-01 openais[21448]: [CLM ] CLM CONFIGURATION CHANGE
Feb 10 16:08:01 server-01-01 openais[21448]: [CLM ] New Configuration:
Feb 10 16:08:01 server-01-01 openais[21448]: [CLM ] r(0) ip(10.0.80.1)
Feb 10 16:08:01 server-01-01 openais[21448]: [CLM ] Members Left:
Feb 10 16:08:01 server-01-01 openais[21448]: [CLM ] Members Joined:
Feb 10 16:08:01 server-01-01 openais[21448]: [CLM ] r(0) ip(10.0.80.1)
Feb 10 16:08:01 server-01-01 openais[21448]: [SYNC ] This node is within the primary component and will provide service.
Feb 10 16:08:01 server-01-01 openais[21448]: [TOTEM] entering OPERATIONAL state.
Feb 10 16:08:01 server-01-01 openais[21448]: [CMAN ] quorum regained, resuming activity
Feb 10 16:08:01 server-01-01 openais[21448]: [CLM ] got nodejoin message 10.0.80.1
Feb 10 16:08:02 server-01-01 ccsd[21438]: Initial status:: Quorate
Feb 10 16:08:02 server-01-01 openais[21448]: [TOTEM] entering GATHER state from 11.
Feb 10 16:08:02 server-01-01 openais[21448]: [TOTEM] Creating commit token because I am the rep.
Feb 10 16:08:02 server-01-01 openais[21448]: [TOTEM] Saving state aru c high seq received c
Feb 10 16:08:02 server-01-01 openais[21448]: [TOTEM] Storing new sequence id for ring cc
Feb 10 16:08:02 server-01-01 openais[21448]: [TOTEM] entering COMMIT state.
Feb 10 16:08:02 server-01-01 openais[21448]: [TOTEM] entering RECOVERY state.
Feb 10 16:08:02 server-01-01 openais[21448]: [TOTEM] position [0] member 10.0.80.1:
Feb 10 16:08:02 server-01-01 openais[21448]: [TOTEM] previous ring seq 200 rep 10.0.80.1
Feb 10 16:08:02 server-01-01 openais[21448]: [TOTEM] aru c high delivered c received flag 1
Feb 10 16:08:02 server-01-01 openais[21448]: [TOTEM] position [1] member 10.0.80.2:
Feb 10 16:08:02 server-01-01 openais[21448]: [TOTEM] previous ring seq 196 rep 10.0.80.2
Feb 10 16:08:02 server-01-01 openais[21448]: [TOTEM] aru a high delivered a received flag 1
Feb 10 16:08:02 server-01-01 openais[21448]: [TOTEM] Did not need to originate any messages in recovery.
Feb 10 16:08:02 server-01-01 openais[21448]: [TOTEM] Sending initial ORF token
Feb 10 16:08:02 server-01-01 openais[21448]: [CLM ] CLM CONFIGURATION CHANGE
Feb 10 16:08:02 server-01-01 openais[21448]: [CLM ] New Configuration:
Feb 10 16:08:02 server-01-01 openais[21448]: [CLM ] r(0) ip(10.0.80.1)
Feb 10 16:08:02 server-01-01 openais[21448]: [CLM ] Members Left:
Feb 10 16:08:02 server-01-01 openais[21448]: [CLM ] Members Joined:
Feb 10 16:08:02 server-01-01 openais[21448]: [CLM ] CLM CONFIGURATION CHANGE
Feb 10 16:08:02 server-01-01 openais[21448]: [CLM ] New Configuration:
Feb 10 16:08:02 server-01-01 openais[21448]: [CLM ] r(0) ip(10.0.80.1)
Feb 10 16:08:02 server-01-01 openais[21448]: [CLM ] r(0) ip(10.0.80.2)
Feb 10 16:08:02 server-01-01 openais[21448]: [CLM ] Members Left:
Feb 10 16:08:02 server-01-01 openais[21448]: [CLM ] Members Joined:
Feb 10 16:08:02 server-01-01 openais[21448]: [CLM ] r(0) ip(10.0.80.2)
Feb 10 16:08:02 server-01-01 openais[21448]: [SYNC ] This node is within the primary component and will provide service.
Feb 10 16:08:02 server-01-01 openais[21448]: [TOTEM] entering OPERATIONAL state.
Feb 10 16:08:02 server-01-01 openais[21448]: [CLM ] got nodejoin message 10.0.80.1
Feb 10 16:08:02 server-01-01 openais[21448]: [CLM ] got nodejoin message 10.0.80.2
Feb 10 16:08:10 server-01-01 kernel: dlm: Using TCP for communications
Feb 10 16:08:11 server-01-01 kernel: dlm: connecting to 2
Feb 10 16:08:11 server-01-01 kernel: dlm: got connection from 2
Feb 10 16:08:11 server-01-01 clurgmgrd[21512]: <notice> Resource Group Manager Starting
Feb 10 16:08:17 server-01-01 clurgmgrd[21512]: <notice> Starting stopped service service:http_service
Feb 10 16:08:18 server-01-01 clurgmgrd[21512]: <notice> Service service:http_service started
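Rather than reading the whole log by eye, you can count how many distinct nodes announced themselves. This is a rough sketch that assumes the "got nodejoin message" wording shown in the log above; count_nodejoins is a hypothetical helper name.

```shell
#!/bin/bash
# count_nodejoins LOGFILE
# Hypothetical helper: prints how many distinct node addresses appear in
# "got nodejoin message" lines of LOGFILE.
count_nodejoins() {
    local log="$1"
    # Keep only the address after the marker, then de-duplicate and count.
    grep 'got nodejoin message' "$log" \
        | sed 's/.*got nodejoin message[[:space:]]*//' \
        | sort -u \
        | wc -l
}
```

On a two-node cluster, count_nodejoins /var/log/messages should report 2 once both nodes have joined. Older entries in the same log file are counted too, so treat the number as a quick hint rather than proof.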
7. Verify the cluster and app are running:
[root@server-01 ~]# clustat
Cluster Status for trrs @ Fri Feb 10 21:12:01 2012
Member Status: Quorate

 Member Name         ID   Status
 ------ ----         ---- ------
 server01-priv       1    Online, Local, rgmanager
 server02-priv       2    Online, rgmanager

 Service Name        Owner (Last)     State
 ------- ----        ----- ------     -----
 service:TRRS_VIP    server02-priv    started

[root@server-02 ~]# clustat
Cluster Status for trrs @ Fri Feb 10 21:32:01 2012
Member Status: Quorate

 Member Name         ID   Status
 ------ ----         ---- ------
 server01-priv       1    Online, rgmanager
 server02-priv       2    Online, Local, rgmanager

 Service Name        Owner (Last)     State
 ------- ----        ----- ------     -----
 service:TRRS_VIP    server02-priv    started
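The same check can be turned into a quick pass/fail test. The sketch below reads clustat output on standard input and assumes the "Member Status: Quorate" line and the "Offline" member state that clustat prints; cluster_ok is a hypothetical helper name.

```shell
#!/bin/bash
# cluster_ok
# Hypothetical helper: reads clustat output on stdin and succeeds only if
# the cluster is quorate and no line mentions an Offline member.
cluster_ok() {
    local out
    out=$(cat)
    # Fail unless the quorate marker is present.
    printf '%s\n' "$out" | grep -q 'Member Status: Quorate' || return 1
    # Fail if any member is reported as Offline.
    if printf '%s\n' "$out" | grep -q 'Offline'; then
        return 1
    fi
    return 0
}
```

Use it as clustat | cluster_ok && echo "cluster looks healthy" on each node. It does not look at the service state, so still check that your service shows as started.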
8. Done!