PACEMAKER


Current version as of 29 January 2013, 11:21

Installation

[ALL] Initial setup

Install required packages:

sudo apt-get install pacemaker cman resource-agents fence-agents gfs2-utils gfs2-cluster ocfs2-tools-cman openais drbd8-utils

Make sure each host can resolve all other hosts. The best way to achieve this is to add their IPs and hostnames to /etc/hosts on all nodes. In this example, that would be:

eth0
192.168.244.161 fix
192.168.244.162 foxy
eth1
10.168.244.161   fix-ha
10.168.244.162   foxy-ha
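The entries above can be checked mechanically. A minimal sketch, written against a temporary copy so it does not touch the real /etc/hosts (the node names are the ones from this example):

```shell
# Sketch only: work on a temporary copy rather than the real /etc/hosts.
HOSTS_FILE=$(mktemp)
cat > "$HOSTS_FILE" <<'EOF'
192.168.244.161 fix
192.168.244.162 foxy
10.168.244.161  fix-ha
10.168.244.162  foxy-ha
EOF
# Each node name should appear exactly once, at the end of its line.
for n in fix foxy fix-ha foxy-ha; do
    printf '%s: %s entry\n' "$n" "$(grep -c "[[:space:]]$n\$" "$HOSTS_FILE")"
done
```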

Disable o2cb from starting:

update-rc.d -f o2cb remove

[ALL] Create /etc/cluster/cluster.conf

Paste this into /etc/cluster/cluster.conf:

<?xml version="1.0"?>
<cluster config_version="1" name="pacemaker">
    <cman two_node="1" expected_votes="1"> </cman>
    <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
    <clusternodes>
        <clusternode name="fix" nodeid="1" votes="1">
            <fence>
                <method name="pcmk-redirect">
                    <device name="pcmk" port="fix"/>
                </method>
            </fence>
        </clusternode>
        <clusternode name="foxy" nodeid="2" votes="1">
            <fence>
                <method name="pcmk-redirect">
                    <device name="pcmk" port="foxy"/>
                </method>
            </fence>
        </clusternode>
    </clusternodes>
    <fencedevices>
        <fencedevice name="pcmk" agent="fence_pcmk"/>
    </fencedevices>
</cluster>

Validate the config:

# ccs_config_validate 
Configuration validates
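Before copying the file to the other nodes, a plain XML well-formedness check already catches most paste errors. A sketch using python3 against a scratch copy (syntax only; ccs_config_validate remains the authoritative test):

```shell
# Sketch: well-formedness check of a cluster.conf fragment in a scratch file.
CONF=$(mktemp)
cat > "$CONF" <<'EOF'
<?xml version="1.0"?>
<cluster config_version="1" name="pacemaker">
    <cman two_node="1" expected_votes="1"> </cman>
    <fencedevices>
        <fencedevice name="pcmk" agent="fence_pcmk"/>
    </fencedevices>
</cluster>
EOF
# ET.parse raises on malformed XML; it does not know the cman schema.
RESULT=$(python3 -c 'import sys, xml.etree.ElementTree as ET; ET.parse(sys.argv[1]); print("well-formed")' "$CONF")
echo "$RESULT"
```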

[ALL] Edit /etc/corosync/corosync.conf

Find the pacemaker service section in /etc/corosync/corosync.conf and set its ver to 1:

service {
        # Load the Pacemaker Cluster Resource Manager
        ver:       1
        name:      pacemaker
}

Replace bindnetaddr with the network address of your cluster network. For example:

                bindnetaddr: 10.168.244.0

'0' is not a typo.
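Since the same substitution has to happen on every node, it can be scripted. A sketch run against a throwaway copy (on a real node you would point sed at /etc/corosync/corosync.conf instead):

```shell
# Sketch: rewrite bindnetaddr in a scratch copy of corosync.conf.
CONF=$(mktemp)
printf '        interface {\n                bindnetaddr: 127.0.0.1\n        }\n' > "$CONF"
# Replace whatever address is there with the cluster network address.
sed -i 's/bindnetaddr:.*/bindnetaddr: 10.168.244.0/' "$CONF"
grep -o 'bindnetaddr: [0-9.]*' "$CONF"
```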

[ALL] Enable pacemaker init scripts

update-rc.d -f pacemaker remove
update-rc.d pacemaker start 50 1 2 3 4 5 . stop 01 0 6 .

[ALL] Start cman service and then pacemaker service

Disable the quorum timeout so cman does not wait for quorum at startup:

# echo CMAN_QUORUM_TIMEOUT=0 >> /etc/default/cman

Start cman:

# service cman start
Starting cluster: 
   Checking if cluster has been disabled at boot... [  OK  ]
   Checking Network Manager... [  OK  ]
   Global setup... [  OK  ]
   Loading kernel modules... [  OK  ]
   Mounting configfs... [  OK  ]
   Starting cman... [  OK  ]
   Waiting for quorum... [  OK  ]
   Starting fenced... [  OK  ]
   Starting dlm_controld... [  OK  ]
   Unfencing self... [  OK  ]
   Joining fence domain... [  OK  ]


# cman_tool nodes
Node  Sts   Inc   Joined               Name
   1   M      4   2012-09-11 19:14:58  fix
   2   M      8   2012-09-11 19:18:35  foxy
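For scripting around this, the member count can be extracted from the output. A sketch using the sample output above as a here-string (on a live node you would pipe `cman_tool nodes` instead):

```shell
# Count nodes whose Sts column is "M" (member). Sample data from above.
SAMPLE='Node  Sts   Inc   Joined               Name
   1   M      4   2012-09-11 19:14:58  fix
   2   M      8   2012-09-11 19:18:35  foxy'
MEMBERS=$(printf '%s\n' "$SAMPLE" | awk 'NR > 1 && $2 == "M"' | wc -l)
echo "members: $MEMBERS"
```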


service pacemaker start
Starting Pacemaker Cluster Manager: [  OK  ]

[ONE] Setup resources

Wait for a minute until pacemaker declares all nodes online:
# crm status
============
Last updated: Fri Sep  7 21:18:12 2012
Last change: Fri Sep  7 21:17:17 2012 via crmd on fix
Stack: cman
Current DC: fix - partition with quorum
Version: 1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c
2 Nodes configured, unknown expected votes
0 Resources configured.
============ 

Online: [ fix foxy ]

OCSF2 WAY

GFS2 WAY

CRM/CIB

[ALL] Install services that will fail over between servers

In this example, I'm installing apache2:

sudo apt-get install apache2

Disable its init script:

update-rc.d -f apache2 remove

In this example, I'll create a failover configuration for apache2: a virtual IP resource, plus the apache2 resource grouped with it and started after it.

sudo crm configure edit
primitive resAPACHE ocf:heartbeat:apache \
       params configfile="/etc/apache2/apache2.conf" httpd="/usr/sbin/apache2" \
       op monitor interval="5s"
primitive resIP-APACHE ocf:heartbeat:IPaddr2 \
       params ip="192.168.244.171" cidr_netmask="21" nic="eth0"
group groupAPACHE resIP-APACHE resAPACHE
order apache_after_ip inf: resIP-APACHE:start resAPACHE:start
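Once the edit is saved, failover can be exercised by hand. A sketch of the usual checks; these assume the running cluster and the resource/node names from this example, so they cannot be run standalone:

```shell
# One-shot cluster status: shows which node runs groupAPACHE.
crm_mon -1
# Move the group to the other node and then clear the constraint again.
crm resource migrate groupAPACHE foxy
crm resource unmigrate groupAPACHE
```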

Links

*https://wiki.ubuntu.com/ClusterStack/Precise
*https://wiki.ubuntu.com/ClusterStack/LucidTesting
*http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch
*http://www.drbd.org/users-guide/
*http://wiki.techstories.de/display/IT/Pacemaker+HA-Cluster+Schulung+GFU+Koeln
*http://theclusterguy.clusterlabs.org/post/907043024/introducing-the-pacemaker-master-control-process-for
*http://www.hastexo.com/resources/hints-and-kinks/fencing-libvirtkvm-virtualized-cluster-nodes
*http://linux.die.net/man/5/cluster.conf
*http://burning-midnight.blogspot.de/2012/05/cluster-building-ubuntu-1204.html
*http://burning-midnight.blogspot.de/2012/07/cluster-building-ubuntu-1204-revised.html
*http://www.hastexo.com/resources/hints-and-kinks/ocfs2-pacemaker-debianubuntu
*http://www.gossamer-threads.com/lists/drbd/users/23267
*http://blog.simon-meggle.de/tutorials/score-berechnung-im-pacemaker-cluster-teil-1/
*http://www.clusterlabs.org/doc/en-US/Pacemaker/1.0/html/Pacemaker_Explained/s-cluster-options.html
*http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/s-ocf-return-codes.html
*https://aseith.com/plugins/viewsource/viewpagesrc.action?pageId=9601113
*http://www.youtube.com/watch?v=3GoT36cK6os&feature=youtu.be
*http://www.debian-administration.org/articles/578
*http://blog.datentraeger.li/?cat=31
*http://publications.jbfavre.org/virtualisation/cluster-xen-corosync-pacemaker-drbd-ocfs2.en