This post explains how to set up an NFS cluster with failover between two servers, using Corosync as the cluster engine and Pacemaker as the cluster resource manager.

This post continues our series on setting up a highly available NFS server. Check out the first post, which covers the iSCSI storage setup, here.

Technologies

Pacemaker
Pacemaker is an open source, high-availability resource manager. It keeps the configuration of all the cluster resources and of the relations between servers and resources. For example, if we need to set up a VIP (virtual IP), mount a filesystem or start a service on the active node of the cluster, Pacemaker brings up all the resources assigned to that node in the order specified in the configuration, so that every service starts correctly.

Corosync
Corosync is an open source cluster engine that handles messaging between the servers of a cluster. It is used to check the health status of each node and to inform the other cluster components if one of the servers goes down, so that the failover process can start.

Resource Agents
Resource Agents are scripts that manage different services and follow the OCF standard. The system already ships with a set of agents, which are usually enough for typical cluster setups, but it is of course possible to develop new ones depending on your needs and requirements.
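
If you want to see which agents are available on your system, the crm shell that comes with Pacemaker can list and describe them. For example (IPaddr2 is one of the agents used later in this post):

# crm ra classes
# crm ra list ocf heartbeat
# crm ra info ocf:heartbeat:IPaddr2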

Pacemaker Stack Visual

So, after this small introduction about the cluster components, let’s get started with the configuration!

Corosync Configuration:

– Install package dependencies:

# aptitude install corosync pacemaker

– Generate a private key to ensure the authenticity and privacy of the messages sent between the nodes of the cluster:

# corosync-keygen -l

NOTE: This command generates the private key at /etc/corosync/authkey. Copy the key file to the other server.
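
For example, the key can be copied over with scp and locked down on the second node (assuming root SSH access between the nodes):

# scp /etc/corosync/authkey root@nfs2-srv:/etc/corosync/authkey
# ssh root@nfs2-srv "chown root:root /etc/corosync/authkey && chmod 400 /etc/corosync/authkey"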

– Edit /etc/corosync/corosync.conf:

# Please read the openais.conf.5 manual page

totem {
    version: 2

    # How long before declaring a token lost (ms)
    token: 3000

    # How many token retransmits before forming a new configuration
    token_retransmits_before_loss_const: 10

    # How long to wait for join messages in the membership protocol (ms)
    join: 60

    # How long to wait for consensus to be achieved before starting a new round of membership configuration (ms)
    consensus: 3600

    # Turn off the virtual synchrony filter
    vsftype: none

    # Number of messages that may be sent by one processor on receipt of the token
    max_messages: 20

    # Limit generated nodeids to 31-bits (positive signed integers)
    clear_node_high_bit: yes

    # Enable encryption
    secauth: on

    # How many threads to use for encryption/decryption
    threads: 0

    # This specifies the mode of redundant ring, which may be none, active, or passive.
    rrp_mode: active

    interface {
        # The following values need to be set based on your environment
        ringnumber: 0
        bindnetaddr: 10.55.71.0
        mcastaddr: 226.94.1.1
        mcastport: 5405
    }
}

nodelist {
    node {
        ring0_addr: nfs1-srv
        nodeid: 1
    }
    node {
        ring0_addr: nfs2-srv
        nodeid: 2
    }
}

amf {
    mode: disabled
}

quorum {
    # Quorum for the Pacemaker Cluster Resource Manager
    provider: corosync_votequorum
    expected_votes: 1
}

service {
    # Load the Pacemaker Cluster Resource Manager
    ver: 0
    name: pacemaker
}

aisexec {
    user: root
    group: root
}

logging {
    fileline: off
    to_stderr: yes
    to_logfile: no
    to_syslog: yes
    syslog_facility: daemon
    debug: off
    timestamp: on
    logger_subsys {
        subsys: AMF
        debug: off
        tags: enter|leave|trace1|trace2|trace3|trace4|trace6
    }
}
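
With the key and the configuration in place on both nodes, start Corosync on each server. On Debian/Ubuntu releases of that era you may first need to set START=yes in /etc/default/corosync; and since the service block above uses ver: 0, Pacemaker is launched by Corosync itself. You can then check the ring status:

# service corosync start
# corosync-cfgtool -s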

Pacemaker Configuration:

– Disable the quorum policy, since this is a two-node cluster and a single surviving node could never hold quorum:

# crm configure property no-quorum-policy=ignore

– Set up the VIP resource of the cluster:

# crm configure primitive p_ip_nfs ocf:heartbeat:IPaddr2 params ip="10.55.71.21" cidr_netmask="24" nic="eth0" op monitor interval="30s"
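
Once the resource is started, the VIP should appear as a secondary address on eth0 of the active node. A quick way to check (interface name and address as configured above):

# ip addr show dev eth0 | grep 10.55.71.21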

– Set up the init script for the NFS server:

# crm configure primitive p_lsb_nfsserver lsb:nfs-kernel-server op monitor interval="30s"

NOTE: The nfs-kernel-server init script will be managed by the cluster, so disable it from starting at boot time with the update-rc.d utility:

# update-rc.d -f nfs-kernel-server remove
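
Note that the cluster only starts and stops the NFS service; the export definition itself is not a cluster resource, so /etc/exports must be configured identically on both nodes. A minimal example, assuming clients sit on the 10.55.71.0/24 network used above (the export options are just a reasonable default, adjust them to your environment):

# cat /etc/exports
/mnt/nfs 10.55.71.0/24(rw,sync,no_subtree_check)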


– Configure the mount point for the NFS export:

# crm configure primitive p_fs_nfs ocf:heartbeat:Filesystem params device="/dev/mapper/nfs1" directory="/mnt/nfs" fstype="ext3" op start interval="0" timeout="120" op monitor interval="60" timeout="60" OCF_CHECK_LEVEL="20" op stop interval="0" timeout="240"
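
The /mnt/nfs directory itself must exist on both nodes before the Filesystem agent can mount the device there, so create it on each server:

# mkdir -p /mnt/nfs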

– Configure a resource group with the NFS service, the mount point and the VIP:

# crm configure group g_nfs p_fs_nfs p_lsb_nfsserver p_ip_nfs meta target-role="Started"

– Prevent healthy resources from being moved around the cluster by configuring resource stickiness:

# crm configure rsc_defaults resource-stickiness=200
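
Note that the resulting configuration shown at the end of this post keeps the filesystem resource out of the group and ties everything together with a colocation and an ordering constraint instead. If you go that route, the constraints can be created with something like:

# crm configure colocation c_nfs_on_fs inf: p_lsb_nfsserver p_fs_nfs
# crm configure order o_volume_before_nfs inf: p_fs_nfs g_nfs:start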

Check Cluster Status:

– Check the status of the resources of the cluster:

# crm status
Last updated: Wed Jun 3 21:44:29 2015
Last change: Wed Jun 3 16:56:15 2015 via crm_resource on nfs1-srv
Stack: corosync
Current DC: nfs1-srv (1) - partition with quorum
Version: 1.1.10-42f2063
2 Nodes configured
3 Resources configured

Online: [ nfs1-srv nfs2-srv ]

Resource Group: g_nfs
     p_lsb_nfsserver (lsb:nfs-kernel-server): Started nfs2-srv
     p_ip_nfs (ocf::heartbeat:IPaddr2): Started nfs2-srv
     p_fs_nfs (ocf::heartbeat:Filesystem): Started nfs2-srv

Cluster Failover:

– If the resources are running on nfs2-srv and we want to fail them over to nfs1-srv:

# crm resource move g_nfs nfs1-srv

– Remove all constraints created by the move command:

# crm resource unmove g_nfs
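
During a failover you can watch the resources migrate in real time with crm_mon, which ships with Pacemaker (running it without options gives a continuously refreshing view; -1 prints a one-shot snapshot):

# crm_mon -1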

Resulting Configuration:

# crm configure show
node $id="1" nfs1-srv
node $id="2" nfs2-srv
primitive p_fs_nfs ocf:heartbeat:Filesystem \
    params device="/dev/mapper/nfs-part1" directory="/mnt/nfs" fstype="ext3" options="_netdev" \
    op start interval="0" timeout="120" \
    op monitor interval="60" timeout="60" OCF_CHECK_LEVEL="20" \
    op stop interval="0" timeout="240"
primitive p_ip_nfs ocf:heartbeat:IPaddr2 \
    params ip="10.55.71.21" cidr_netmask="24" nic="eth0" \
    op monitor interval="30s"
primitive p_lsb_nfsserver lsb:nfs-kernel-server \
    op monitor interval="30s"
group g_nfs p_lsb_nfsserver p_ip_nfs \
    meta target-role="Started"
colocation c_nfs_on_fs inf: p_lsb_nfsserver p_fs_nfs
order o_volume_before_nfs inf: p_fs_nfs g_nfs:start
property $id="cib-bootstrap-options" \
    dc-version="1.1.10-42f2063" \
    cluster-infrastructure="corosync" \
    no-quorum-policy="ignore"
rsc_defaults $id="rsc-options" \
    resource-stickiness="200"
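
Finally, from any NFS client you can verify that the export is reachable through the VIP regardless of which node is active. This assumes the NFS client utilities are installed (nfs-common on Debian/Ubuntu); the client-side mount point /mnt/test is just an example:

# mkdir -p /mnt/test
# mount -t nfs 10.55.71.21:/mnt/nfs /mnt/test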

References
Natty Information
Ubuntu Quickstart
Clusters from Scratch

Original post by Iván Mora (SysOps Engineer @ CAPSiDE); it can be found at opentodo.net.

To find the first part of how to set up the iSCSI storage, please click here.

