This post explains how to set up an NFS cluster with failover between two servers, using Corosync as the cluster engine and Pacemaker as the cluster resource manager.

This post continues the series on setting up a highly available NFS server. Check out the first post, which covers the iSCSI storage part, here.


Pacemaker is an open source, high-availability resource manager. Its job is to hold the configuration of all the cluster resources and the relationships between the servers and those resources. For example, if we need to set up a VIP (virtual IP), mount a filesystem or start a service on the active node of the cluster, Pacemaker brings up the resources assigned to that server in the order we specify in the configuration, so that all the services start correctly.

Corosync is an open source cluster engine that lets the servers of a cluster exchange messages. These messages are used to check health status and to inform the other cluster components when one of the servers goes down, so that the failover process can start.

Resource Agents
Resource Agents are scripts that manage different services and are based on the OCF standard. The system already comes with some scripts, and most of the time they will be enough for typical cluster setups, but of course it’s possible to develop a new one depending on your needs and requirements.
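
To give a feel for what developing a custom agent involves, here is a minimal sketch of an OCF-style script. It is not one of the agents shipped with the system: the state file path and the function names are hypothetical, and a production agent would also source the OCF shell function library and print full XML metadata. The exit codes (0 for success, 7 for not running, 3 for unimplemented) follow the OCF convention.

```shell
#!/bin/sh
# Minimal sketch of an OCF-style resource agent (illustrative only).
# STATE_FILE is a hypothetical marker file standing in for a real service.
STATE_FILE="${STATE_FILE:-/tmp/demo-agent.state}"

# Exit codes defined by the OCF resource agent convention.
OCF_SUCCESS=0
OCF_ERR_UNIMPLEMENTED=3
OCF_NOT_RUNNING=7

demo_monitor() {
    # "Running" here simply means the marker file exists.
    if [ -f "$STATE_FILE" ]; then
        return "$OCF_SUCCESS"
    fi
    return "$OCF_NOT_RUNNING"
}

demo_start() {
    # Start must be idempotent: starting a running resource succeeds.
    touch "$STATE_FILE" && demo_monitor
}

demo_stop() {
    # Stop must be idempotent too: stopping a stopped resource succeeds.
    rm -f "$STATE_FILE"
    return "$OCF_SUCCESS"
}

demo_main() {
    case "$1" in
        start)   demo_start ;;
        stop)    demo_stop ;;
        monitor) demo_monitor ;;
        *)       return "$OCF_ERR_UNIMPLEMENTED" ;;
    esac
}

# The cluster calls the agent with the action as its first argument.
if [ "$#" -gt 0 ]; then
    demo_main "$@"
    exit $?
fi
```

The cluster relies on these return codes to decide whether a resource is healthy, which is why `monitor` has to distinguish "not running" (7) from a generic failure.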

Pacemaker Stack Visual

So, after this small introduction about the cluster components, let’s get started with the configuration!

Corosync Configuration:

– Install package dependencies:
[code language=”shell”]# aptitude install corosync pacemaker[/code]

– Generate a private key to ensure the authenticity and privacy of the messages sent between the nodes of the cluster:
[code language=”shell”]# corosync-keygen -l[/code]

NOTE: This command writes the private key to /etc/corosync/authkey. Copy the key file to the same path on the other server.
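
Both nodes must hold an identical key. As a rough sketch, assuming root SSH access between the nodes and the peer hostname nfs2-srv used in this post, the copy and a checksum comparison could look like this. The function name and the choice of scp are illustrative, not part of Corosync.

```shell
#!/bin/sh
# Sketch: copy the Corosync key to the peer and confirm both copies match.
# Assumes root SSH access to the peer; KEY and PEER can be overridden.
KEY="${KEY:-/etc/corosync/authkey}"
PEER="${PEER:-nfs2-srv}"

sync_authkey() {
    # Copy the key, preserving its mode and timestamps.
    scp -p "$KEY" "root@$PEER:$KEY" || return 1
    # Compare checksums of the local and remote copies.
    local_sum=$(sha256sum "$KEY" | awk '{print $1}')
    remote_sum=$(ssh "root@$PEER" sha256sum "$KEY" | awk '{print $1}')
    if [ -n "$local_sum" ] && [ "$local_sum" = "$remote_sum" ]; then
        echo "authkey on $PEER matches"
    else
        echo "authkey on $PEER differs" >&2
        return 1
    fi
}
```

Run `sync_authkey` as root on the node where the key was generated.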

– Edit /etc/corosync/corosync.conf:
[code language=”shell”]# Please read the openais.conf.5 manual page

totem {
    version: 2

    # How long before declaring a token lost (ms)
    token: 3000

    # How many token retransmits before forming a new configuration
    token_retransmits_before_loss_const: 10

    # How long to wait for join messages in the membership protocol (ms)
    join: 60

    # How long to wait for consensus to be achieved before starting a new round of membership configuration (ms)
    consensus: 3600

    # Turn off the virtual synchrony filter
    vsftype: none

    # Number of messages that may be sent by one processor on receipt of the token
    max_messages: 20

    # Limit generated nodeids to 31-bits (positive signed integers)
    clear_node_high_bit: yes

    # Enable encryption
    secauth: on

    # How many threads to use for encryption/decryption
    threads: 0

    # This specifies the mode of redundant ring, which may be none, active, or passive.
    rrp_mode: active

    interface {
        # The following values need to be set based on your environment
        ringnumber: 0
        mcastport: 5405
    }
}

nodelist {
    node {
        ring0_addr: nfs1-srv
        nodeid: 1
    }
    node {
        ring0_addr: nfs2-srv
        nodeid: 2
    }
}

amf {
    mode: disabled
}

quorum {
    # Quorum for the Pacemaker Cluster Resource Manager
    provider: corosync_votequorum
    expected_votes: 1
}

service {
    # Load the Pacemaker Cluster Resource Manager
    ver: 0
    name: pacemaker
}

aisexec {
    user: root
    group: root
}

logging {
    fileline: off
    to_stderr: yes
    to_logfile: no
    to_syslog: yes
    syslog_facility: daemon
    debug: off
    timestamp: on
    logger_subsys {
        subsys: AMF
        debug: off
        tags: enter|leave|trace1|trace2|trace3|trace4|trace6
    }
}[/code]
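
Because corosync.conf is built from nested `name { ... }` sections, an unbalanced brace is an easy mistake to make when editing it by hand. A small sketch of a sanity check follows. The function name is hypothetical, and it only counts braces outside comments; it does not validate directive names or values.

```shell
#!/bin/sh
# Sketch: check that the section braces in a corosync.conf-style file balance.
check_braces() {
    awk '
        { sub(/#.*/, "") }                # ignore comments
        {
            opened += gsub(/[{]/, "{")    # count opening braces on this line
            closed += gsub(/[}]/, "}")    # count closing braces on this line
            if (closed > opened) { bad = 1 }
        }
        END { exit (bad || opened != closed) ? 1 : 0 }
    ' "$1"
}
```

For example, `check_braces /etc/corosync/corosync.conf && echo "braces balanced"` can be run before restarting the service.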

Pacemaker Configuration:

– Tell the cluster to ignore loss of quorum, since we are deploying a 2-node configuration:
[code language=”shell”]# crm configure property no-quorum-policy=ignore[/code]

– Set up the VIP resource of the cluster:
[code language=”shell”]# crm configure primitive p_ip_nfs ocf:heartbeat:IPaddr2 params ip="" cidr_netmask="24" nic="eth0" op monitor interval="30s"[/code]

– Set up the init script for the NFS server:
[code language=”shell”]# crm configure primitive p_lsb_nfsserver lsb:nfs-kernel-server op monitor interval="30s"[/code]

NOTE: The nfs-kernel-server init script will be managed by the cluster, so prevent the service from starting at boot time using the update-rc.d utility:
[code language=”shell”]# update-rc.d -f nfs-kernel-server remove[/code]

– Configure the mount point for the NFS export:
[code language=”shell”]# crm configure primitive p_fs_nfs ocf:heartbeat:Filesystem params device="/dev/mapper/nfs1" directory="/mnt/nfs" fstype="ext3" op start interval="0" timeout="120" op monitor interval="60" timeout="60" OCF_CHECK_LEVEL="20" op stop interval="0" timeout="240"[/code]

– Configure a resource group with the NFS service, the mountpoint and the VIP:
[code language=”shell”]# crm configure group g_nfs p_fs_nfs p_lsb_nfsserver p_ip_nfs meta target-role="Started"[/code]

– Prevent healthy resources from being moved around the cluster by configuring resource stickiness:
[code language=”shell”]# crm configure rsc_defaults resource-stickiness=200[/code]
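
Once the resources are defined, it can be worth validating the live configuration before testing a failover. A minimal sketch, assuming the `crm_verify` tool that ships with Pacemaker is on the PATH; the wrapper function name is hypothetical. It reports whatever Pacemaker itself flags, which can include settings not covered in this post.

```shell
#!/bin/sh
# Sketch: validate the configuration of the running cluster.
verify_cluster() {
    if ! command -v crm_verify >/dev/null 2>&1; then
        echo "crm_verify not found; is pacemaker installed?" >&2
        return 2
    fi
    # -L checks the live cluster configuration, -V prints more detail.
    if crm_verify -L -V; then
        echo "cluster configuration is valid"
    else
        echo "cluster configuration has errors" >&2
        return 1
    fi
}
```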

Check Cluster Status:

– Check the status of the resources of the cluster:
[code language=”shell”]# crm status
Last updated: Wed Jun 3 21:44:29 2015
Last change: Wed Jun 3 16:56:15 2015 via crm_resource on nfs1-srv
Stack: corosync
Current DC: nfs1-srv (1) - partition with quorum
Version: 1.1.10-42f2063
2 Nodes configured
3 Resources configured

Online: [ nfs1-srv nfs2-srv ]

 Resource Group: g_nfs
     p_lsb_nfsserver (lsb:nfs-kernel-server): Started nfs2-srv
     p_ip_nfs (ocf::heartbeat:IPaddr2): Started nfs2-srv
     p_fs_nfs (ocf::heartbeat:Filesystem): Started nfs2-srv[/code]
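
For scripting, the active node can be pulled out of that output. The sketch below assumes the `Started <node>` line format shown above (it may differ in other Pacemaker versions) and uses a hypothetical function name. It reads status text on standard input, so it can be fed from `crm status`.

```shell
#!/bin/sh
# Sketch: list the node(s) on which resources are reported as Started.
# Assumes resource lines ending in "Started <node>", as in the output above.
active_nodes() {
    awk 'NF >= 2 && $(NF-1) == "Started" { print $NF }' | sort -u
}
```

For example, `crm status | active_nodes` prints a single hostname when every resource runs on the same node.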

Cluster Failover:

– If the resources are running on nfs2-srv and we want to fail over to nfs1-srv:
[code language=”shell”]# crm resource move g_nfs nfs1-srv[/code]

– Remove all constraints created by the move command:
[code language=”shell”]# crm resource unmove g_nfs[/code]
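
The two commands can be combined so that the constraint is only cleared once the group has actually arrived on the target node. This is a sketch under stated assumptions: the function name, the 30 attempts and the 2-second pause are arbitrary choices, the status parsing relies on the `Started <node>` format of this Pacemaker version, and it assumes every resource in the cluster belongs to the group being moved, as in this setup.

```shell
#!/bin/sh
# Sketch: move a resource group to a target node, wait until it is
# reported there, then clear the constraint created by the move.
failover_group() {
    group=$1
    target=$2
    crm resource move "$group" "$target" || return 1
    tries=0
    while [ "$tries" -lt 30 ]; do
        # Nodes currently reported as running resources.
        nodes=$(crm status | awk 'NF >= 2 && $(NF-1) == "Started" { print $NF }' | sort -u)
        if [ "$nodes" = "$target" ]; then
            crm resource unmove "$group"
            return $?
        fi
        tries=$((tries + 1))
        sleep 2
    done
    # On timeout the move constraint is left in place for inspection.
    echo "timed out waiting for $group on $target" >&2
    return 1
}
```

Usage: `failover_group g_nfs nfs1-srv`.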

Resulting Configuration:
[code language=”shell”]# crm configure show
node $id="1" nfs1-srv
node $id="2" nfs2-srv
primitive p_fs_nfs ocf:heartbeat:Filesystem \
    params device="/dev/mapper/nfs-part1" directory="/mnt/nfs" fstype="ext3" options="_netdev" \
    op start interval="0" timeout="120" \
    op monitor interval="60" timeout="60" OCF_CHECK_LEVEL="20" \
    op stop interval="0" timeout="240"
primitive p_ip_nfs ocf:heartbeat:IPaddr2 \
    params ip="" cidr_netmask="24" nic="eth0" \
    op monitor interval="30s"
primitive p_lsb_nfsserver lsb:nfs-kernel-server \
    op monitor interval="30s"
group g_nfs p_lsb_nfsserver p_ip_nfs \
    meta target-role="Started"
colocation c_nfs_on_fs inf: p_lsb_nfsserver p_fs_nfs
order o_volume_before_nfs inf: p_fs_nfs g_nfs:start
property $id="cib-bootstrap-options"
rsc_defaults $id="rsc-options"[/code]

Natty Information
Ubuntu Quickstart
Clusters from Scratch

Original post by Iván Mora (SysOps Engineer @ CAPSiDE) and can be found at

To find the first part of how to set up the iSCSI storage, please click here.

