PACEMAKER

Installation

[ALL] Initial setup

Install required packages:

sudo apt-get install pacemaker cman resource-agents fence-agents gfs2-utils gfs2-cluster ocfs2-tools-cman openais drbd8-utils
Make sure each host can resolve all other hosts. The best way to achieve this is to add their IPs and hostnames to /etc/hosts on all nodes. In this example:

# eth0
192.168.244.161 fix
192.168.244.162 foxy
# eth1
10.168.244.161   fix-ha
10.168.244.162   foxy-ha
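
A quick check that every name resolves (run on each node; hostnames from the example above):

getent hosts fix foxy fix-ha foxy-ha

Each line of output should show the address configured in /etc/hosts.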
Prevent o2cb from starting at boot:
update-rc.d -f o2cb remove

[ALL] Create /etc/cluster/cluster.conf

Paste this into /etc/cluster/cluster.conf. Note that the template below uses three generic node names (server1 to server3); the clusternode names must match your actual hostnames:

<?xml version="1.0"?>
<cluster config_version="4" name="pacemaker">
    <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
    <clusternodes>
            <clusternode name="server1" nodeid="1" votes="1">
                <fence>
                        <method name="pcmk-redirect">
                                <device name="pcmk" port="server1"/>
                        </method>
                </fence>
            </clusternode>
            <clusternode name="server2" nodeid="2" votes="1">
                <fence>
                        <method name="pcmk-redirect">
                                <device name="pcmk" port="server2"/>
                        </method>
                </fence>
            </clusternode>
            <clusternode name="server3" nodeid="3" votes="1">
                <fence>
                        <method name="pcmk-redirect">
                                <device name="pcmk" port="server3"/>
                        </method>
                </fence>
            </clusternode>
    </clusternodes>
  <fencedevices>
    <fencedevice name="pcmk" agent="fence_pcmk"/>
  </fencedevices>
    <cman/>
</cluster> 
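
For the two-node example in this guide (fix and foxy) the file would look as follows. Note that two_node="1" expected_votes="1" is the usual cman setting for two-node clusters; it is an addition here, not part of the original template:

<?xml version="1.0"?>
<cluster config_version="1" name="pacemaker">
    <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
    <clusternodes>
            <clusternode name="fix" nodeid="1" votes="1">
                <fence>
                        <method name="pcmk-redirect">
                                <device name="pcmk" port="fix"/>
                        </method>
                </fence>
            </clusternode>
            <clusternode name="foxy" nodeid="2" votes="1">
                <fence>
                        <method name="pcmk-redirect">
                                <device name="pcmk" port="foxy"/>
                        </method>
                </fence>
            </clusternode>
    </clusternodes>
    <fencedevices>
        <fencedevice name="pcmk" agent="fence_pcmk"/>
    </fencedevices>
    <cman two_node="1" expected_votes="1"/>
</cluster>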

[ALL] Edit /etc/corosync/corosync.conf

Find the pacemaker service section in /etc/corosync/corosync.conf and set ver to 1, so that pacemaker is started as a separate daemon rather than as a corosync plugin:

service {
        # Load the Pacemaker Cluster Resource Manager
        ver:       1
        name:      pacemaker
}

Replace bindnetaddr with the network address of the interface corosync should communicate on. For example:

                bindnetaddr: 10.168.244.0

'0' is not a typo: bindnetaddr expects the network address (host bits zeroed), not the IP of an individual node.
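
For orientation: bindnetaddr lives in the interface block of the totem section. With the example value it would read roughly like this (the mcast settings shown are the stock Debian/Ubuntu defaults and may differ on your system):

totem {
        # ... other totem options unchanged ...
        interface {
                ringnumber: 0
                bindnetaddr: 10.168.244.0
                mcastaddr: 226.94.1.1
                mcastport: 5405
        }
}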

[ALL] Enable pacemaker init scripts

update-rc.d -f pacemaker remove
update-rc.d pacemaker start 50 1 2 3 4 5 . stop 01 0 6 .

[ALL] Start cman service and then pacemaker service

service cman start
Starting cluster: 
   Checking if cluster has been disabled at boot... [  OK  ]
   Checking Network Manager... [  OK  ]
   Global setup... [  OK  ]
   Loading kernel modules... [  OK  ]
   Mounting configfs... [  OK  ]
   Starting cman... [  OK  ]
   Waiting for quorum... [  OK  ]
   Starting fenced... [  OK  ]
   Starting dlm_controld... [  OK  ]
   Unfencing self... [  OK  ]
   Joining fence domain... [  OK  ]
service pacemaker start
Starting Pacemaker Cluster Manager: [  OK  ]
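
Before configuring resources it is worth confirming that both nodes joined the cman cluster:

sudo cman_tool nodes

Both nodes should be listed with status M (member).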

[ONE] Set up resources

Wait for a minute until pacemaker declares all nodes online:
# crm status
============
Last updated: Fri Sep  7 21:18:12 2012
Last change: Fri Sep  7 21:17:17 2012 via crmd on fix
Stack: cman
Current DC: fix - partition with quorum
Version: 1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c
2 Nodes configured, unknown expected votes
0 Resources configured.
============ 

Online: [ fix foxy ]

[ALL] Set up dlm_controld and o2cb

Enter the following into the cluster's CIB (for example via crm configure edit):

node fix
node foxy
primitive resDLM ocf:pacemaker:controld \
        params daemon="dlm_controld" \
        op monitor interval="120s"
primitive resO2CB ocf:pacemaker:o2cb \
       params stack="cman" \
       op monitor interval="120s"
clone cloneDLM resDLM \
        meta globally-unique="false" interleave="true"
clone cloneO2CB resO2CB \
        meta globally-unique="false" interleave="true"
colocation colO2CBDLM inf: cloneO2CB cloneDLM
order ordDLMO2CB 0: cloneDLM cloneO2CB
property $id="cib-bootstrap-options" \
       dc-version="1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c" \
       cluster-infrastructure="cman" \
       stonith-enabled="false" \
       no-quorum-policy="ignore"
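
Once saved, a one-shot status check should show cloneDLM and cloneO2CB started on both nodes:

sudo crm_mon -1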

[ALL] Configure drbd

On both nodes create the file /etc/drbd.d/disk0.res with the following content (adjust the disk, device and addresses to match your environment):

resource disk0 {
        protocol C;
        net {
                cram-hmac-alg sha1;
                shared-secret "lucid";
                allow-two-primaries;
        }
        startup {
                become-primary-on both;
        }
        on fix {
                device /dev/drbd0;
                disk /dev/sda3;
                address 10.168.244.161:7788;
                meta-disk internal;
        }
        on foxy {
                device /dev/drbd0;
                disk /dev/sda3;
                address 10.168.244.162:7788;
                meta-disk internal;
        }
}
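
To catch typos early, drbdadm can parse the configuration and print it back:

sudo drbdadm dump disk0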

Pacemaker will handle starting and stopping drbd, so remove its init script links:

sudo update-rc.d -f drbd remove

Create the drbd resource's metadata:

sudo drbdadm create-md disk0
You should get:
Writing meta data...
initializing activity log
NOT initialized bitmap
New drbd meta data block successfully created.
success

Start drbd:

sudo service drbd start
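
Right after the first start both nodes report Secondary/Inconsistent. The initial synchronisation has to be forced from exactly one node (drbd8 syntax; run on one node only):

sudo drbdadm -- --overwrite-data-of-peer primary disk0

Then watch /proc/drbd until both sides show UpToDate:

cat /proc/drbd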

[ALL] Set up dlm_controld and o2cb with drbd

Extend the cluster's CIB so that the DLM is colocated with the drbd master:

node fix
node foxy
primitive resDLM ocf:pacemaker:controld \
       params daemon="dlm_controld" \
       op monitor interval="120s"
primitive resDRBD ocf:linbit:drbd \
       params drbd_resource="disk0" \
       operations $id="resDRBD-operations" \
       op monitor interval="20" role="Master" timeout="20" \
       op monitor interval="30" role="Slave" timeout="20"
primitive resO2CB ocf:pacemaker:o2cb \
       params stack="cman" \
       op monitor interval="120s"
ms msDRBD resDRBD \
       meta resource-stickiness="100" notify="true" master-max="2" interleave="true"
clone cloneDLM resDLM \
       meta globally-unique="false" interleave="true"
clone cloneO2CB resO2CB \
       meta globally-unique="false" interleave="true"
colocation colDLMDRBD inf: cloneDLM msDRBD:Master
colocation colO2CBDLM inf: cloneO2CB cloneDLM
order ordDLMO2CB 0: cloneDLM cloneO2CB
property $id="cib-bootstrap-options" \
       dc-version="1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c" \
       cluster-infrastructure="cman" \
       stonith-enabled="false" \
       no-quorum-policy="ignore"

[ONE] Format the device with ocfs2

sudo mkfs.ocfs2 /dev/drbd/by-res/disk0
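
mkfs.ocfs2 picks sensible defaults; a filesystem label and an explicit number of node slots can be set as well (standard mkfs.ocfs2 options, values here purely illustrative):

sudo mkfs.ocfs2 -L disk0 -N 2 /dev/drbd/by-res/disk0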

[ALL] Set up dlm_controld and o2cb with drbd and mounting on both nodes

Extend the CIB once more with a Filesystem resource that mounts the OCFS2 volume on /opt on both nodes:

node fix
node foxy
primitive resDLM ocf:pacemaker:controld \
        params daemon="dlm_controld" \
        op monitor interval="120s"
primitive resDRBD ocf:linbit:drbd \
        params drbd_resource="disk0" \
        operations $id="resDRBD-operations" \
        op monitor interval="20" role="Master" timeout="20" \
        op monitor interval="30" role="Slave" timeout="20"
primitive resFS ocf:heartbeat:Filesystem \
        params device="/dev/drbd/by-res/disk0" directory="/opt" fstype="ocfs2" \
        op monitor interval="120s"
primitive resO2CB ocf:pacemaker:o2cb \
       params stack="cman" \
       op monitor interval="120s"
ms msDRBD resDRBD \
       meta resource-stickiness="100" notify="true" master-max="2" interleave="true"
clone cloneDLM resDLM \
       meta globally-unique="false" interleave="true"
clone cloneFS resFS \
       meta interleave="true" ordered="true"
clone cloneO2CB resO2CB \
       meta globally-unique="false" interleave="true"
colocation colDLMDRBD inf: cloneDLM msDRBD:Master
colocation colFSO2CB inf: cloneFS cloneO2CB
colocation colO2CBDLM inf: cloneO2CB cloneDLM
order ordDLMO2CB 0: cloneDLM cloneO2CB
order ordO2CBFS 0: cloneO2CB cloneFS
property $id="cib-bootstrap-options" \
       dc-version="1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c" \
       cluster-infrastructure="cman" \
       stonith-enabled="false" \
       no-quorum-policy="ignore"
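
After saving, the OCFS2 volume should be mounted on /opt on both nodes; verify with:

mount | grep /opt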



[ONE] Set up dlm_controld, gfs_controld and o2cb

An alternative CIB that additionally runs gfs_controld, the control daemon needed for GFS2. Open the editor:

crm configure edit

node fix
node foxy
primitive resDLM ocf:pacemaker:controld \
        params daemon="dlm_controld" \
        op monitor interval="120s"
primitive resGFSD ocf:pacemaker:controld \
        params daemon="gfs_controld" args="" \
        op monitor interval="120s"
primitive resO2CB ocf:pacemaker:o2cb \
        params stack="cman" \
        op monitor interval="120s"
clone cloneDLM resDLM \
        meta globally-unique="false" interleave="true"
clone cloneGFSD resGFSD \
        meta globally-unique="false" interleave="true" target-role="Started"
clone cloneO2CB resO2CB \
        meta globally-unique="false" interleave="true"
colocation colGFSDDLM inf: cloneGFSD cloneDLM
colocation colO2CBDLM inf: cloneO2CB cloneDLM
order ordDLMGFSD 0: cloneDLM cloneGFSD
order ordDLMO2CB 0: cloneDLM cloneO2CB
property $id="cib-bootstrap-options" \
        dc-version="1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c" \
        cluster-infrastructure="cman" \
       stonith-enabled="false" \
       no-quorum-policy="ignore"



EXTREMELY IMPORTANT: Notice that this example has STONITH disabled. This is just a HOWTO for a basic
setup. You shouldn't be running shared resources with STONITH disabled. Check pacemaker's documentation
for guidance on setting this up. If you are not sure about this, stop right now!

Save and quit. Running

crm status

should now show all these services running:

# crm status
============
Last updated: Fri Sep  7 21:28:36 2012
Last change: Fri Sep  7 21:26:36 2012 via cibadmin on fix
Stack: cman
Current DC: fix - partition with quorum
Version: 1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c
2 Nodes configured, unknown expected votes
6 Resources configured.
============ 

Online: [ fix foxy ] 

Clone Set: cloneDLM [resDLM]
    Started: [ fix foxy ]
Clone Set: cloneGFSD [resGFSD]
    Started: [ fix foxy ]
Clone Set: cloneO2CB [resO2CB]
    Started: [ fix foxy ]

