In an earlier post here, I shared my frustrations about how it doesn’t seem possible to get a Pacemaker cluster going with OCFS2 as the cluster fs on CentOS 6. But I wouldn’t be a “guru” if I couldn’t get it to work, now would I? <evil laugh> hahahaha </evil laugh>. Enough laughter – on with the how-to.
Note: I’m going to try and be as detailed as possible on this how to. I ran through it several times just to be sure I wasn’t missing anything, but I’m human – I may have made a mistake or two. Let me know if you run into issues.
Step 1: Installing and Configuring CentOS
Installing CentOS is pretty easy, so I won’t go into tremendous detail here.
I strongly suggest that you use a development host or virtual machine because you are going to install quite a few development libraries and their corresponding dependencies. I chose to use a VM with 3 GB of memory and a 12 GB disk. Once we get all the RPMs and the one or two standalone binaries built, you can simply transfer and install them on your production systems.
For my system, I simply went with the defaults for the partitioning scheme; however, feel free to adjust your disk layout to suit your specific needs. When choosing my installation type, I again chose the default of “minimal”. I suggest that you use the defaults in this respect to avoid potential conflicts between this how-to and your system configuration.
Once your installation is complete, reboot, log in and perform some basic configuration:
- Set a static IP address and DNS
- Ensure name resolution is properly configured and the OS can ping itself (e.g. ping `uname -n`)
- Disable selinux (edit /etc/sysconfig/selinux) and iptables (chkconfig iptables off; chkconfig ip6tables off)
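The checklist above could be sketched as a couple of shell helpers. This is only a sketch under assumptions, not something I ran as part of the original procedure: the function names `check_self_ping` and `set_selinux_disabled` are made up for illustration, and the `sed` edit assumes GNU sed and a config file that already contains a `SELINUX=` line.

```shell
# Hypothetical helpers for the checklist above; the names are illustrative.

# Succeeds only if the host can resolve and ping its own node name.
check_self_ping() {
    ping -c 1 "$(uname -n)" >/dev/null 2>&1
}

# Rewrites the SELINUX= line in the given config file to "disabled".
# --follow-symlinks keeps a symlinked config file a symlink instead of
# replacing it with a regular file (GNU sed option).
set_selinux_disabled() {
    local cfg="$1"
    sed -i --follow-symlinks 's/^SELINUX=.*/SELINUX=disabled/' "$cfg"
}
```

As root, something like `check_self_ping && set_selinux_disabled /etc/sysconfig/selinux`, followed by the two chkconfig commands from the list, should cover the last two bullets. The SELinux change only takes effect after a reboot.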
Step 2: Installing the UEK
I would love for this step to be optional, but the ocfs2 kernel module does not ship with any of the CentOS kernels (AFAIK) and it doesn’t seem to be provided by any other package. Therefore, we need to go outside the distribution to get a compatible kernel with the module built in. The latest UE Kernel from Oracle at the time of this how-to is v2.6.39, and it does the job quite nicely as it ships with kmod ocfs2 version 1.8, which provides some really cool features. I’ll try to highlight some of them:
- POSIX ACL support
- Extended attributes for SELinux
- Quota recovery
- Quota syncing
- Quota accounting on mount, disable on umount
- Name based indexed b-tree of directory inodes
- Optimized inode allocation
- CoW support and unlimited inode-based writeable snapshot
- Huge volume (> 16 TiB) support
- Mount option (coherency=*) to control how to handle cluster coherency for O_DIRECT writes
- SSD trimming support
So you excited yet? ;) OK – so to install it, do a yum install wget, then follow the instructions here: http://public-yum.oracle.com/. Edit the repository file and enable only the [ol6_UEK_latest] repository, then execute yum update; yum install kernel-uek kernel-uek-devel. After the kernel is installed, you will need to edit your /boot/grub/menu.lst and make sure the UEK is the default, but before you reboot, one additional change is needed.
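If you would rather not edit menu.lst by hand, here is a minimal sketch of how the default entry could be switched. It rests on assumptions: the helper names are hypothetical, the UEK entry is taken to be the first `title` line containing "uek", the file is assumed to use a `default=N` line, and GNU sed is assumed.

```shell
# Hypothetical helpers; names and matching rules are assumptions.

# Prints the zero-based index of the first "title" entry mentioning "uek".
# Prints nothing if no such entry exists.
uek_entry_index() {
    awk 'BEGIN { n = 0 }
         /^title/ { if (tolower($0) ~ /uek/) { print n; exit } n++ }' "$1"
}

# Points the default= line of the given grub menu file at the UEK entry.
set_grub_default_uek() {
    local menu="$1" idx
    idx=$(uek_entry_index "$menu")
    [ -n "$idx" ] || { echo "no UEK entry found in $menu" >&2; return 1; }
    sed -i --follow-symlinks "s/^default=.*/default=$idx/" "$menu"
}
```

Running `set_grub_default_uek /boot/grub/menu.lst` as root and then eyeballing the file is probably the safest way to use this. Check the result before you reboot.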
The Pacemaker cluster stack requires a control device in /dev/misc to function. The full path to the device is /dev/misc/ocfs2_control. On CentOS 6.x, with no changes, this device is either inaccessible or non-existent. We fix that by adding a udev rule (more on udev here: http://en.wikipedia.org/wiki/Udev) that will take effect at each boot.
To add the rule, create the file /etc/udev/rules.d/99-ocfs2_control.rules. The file should contain a single line:
KERNEL=="ocfs2_control", NAME="misc/ocfs2_control", MODE="0660"
Perform this step, then reboot.
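Creating the rule file can be scripted. The sketch below writes exactly the one-line rule quoted above; the function name `write_ocfs2_control_rule` and the optional directory argument are my own additions for illustration, so you can try it against a scratch directory first.

```shell
# Hypothetical helper: writes the single-line udev rule from the text.
# The target directory defaults to /etc/udev/rules.d but can be overridden,
# which makes it easy to test without touching the real system.
write_ocfs2_control_rule() {
    local rules_dir="${1:-/etc/udev/rules.d}"
    printf '%s\n' 'KERNEL=="ocfs2_control", NAME="misc/ocfs2_control", MODE="0660"' \
        > "$rules_dir/99-ocfs2_control.rules"
}
```

After the reboot, `ls -l /dev/misc/ocfs2_control` is one way to check whether the device showed up. It may only appear once the relevant kernel module has been loaded.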
Step 3: Ready the Development Environment
So at this point, you have a pristine installation of CentOS 6.x with the Oracle Unbreakable Enterprise Kernel, which has built-in support for OCFS2. Time to start installing our software.
A single yum command will download all the necessary RPMs. Here is the RPM list: pacemaker openais corosync pacemaker-libs pacemaker-libs-devel gcc corosync-devel openais-devel rpm-build e2fsprogs-devel libuuid-devel git pygtk2 python-devel readline-devel clusterlib-devel redhat-lsb sqlite-devel gnutls-devel byacc flex nss-devel. Install those RPMs with the yum install command. Any dependencies will automatically be met.
Lastly, create a symlink to /usr/include/libxml2/libxml in /usr/include. Now let’s try and build /usr/sbin/dlm_controld.pcmk.
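Spelled out as shell, this step might look like the sketch below. The package list is copied from the paragraph above; the function names are hypothetical, and I am assuming the symlink should be named /usr/include/libxml so that includes of the form libxml/... resolve without extra compiler flags.

```shell
# Package list copied from the text above.
PKGS="pacemaker openais corosync pacemaker-libs pacemaker-libs-devel gcc \
corosync-devel openais-devel rpm-build e2fsprogs-devel libuuid-devel git \
pygtk2 python-devel readline-devel clusterlib-devel redhat-lsb sqlite-devel \
gnutls-devel byacc flex nss-devel"

# Hypothetical wrapper: installs everything in one go (run as root).
# $PKGS is intentionally unquoted so it splits into separate package names.
install_build_deps() {
    yum install -y $PKGS
}

# Hypothetical helper: links <include dir>/libxml2/libxml to <include dir>/libxml.
# The include directory defaults to /usr/include.
link_libxml() {
    local inc="${1:-/usr/include}"
    ln -sfn "$inc/libxml2/libxml" "$inc/libxml"
}
```

On the build host that would be `install_build_deps` followed by `link_libxml`, both as root.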
Step 4: Building dlm_controld.pcmk
So now it’s time to build /usr/sbin/dlm_controld.pcmk
. This is Pacemaker’s interface to the kernel’s distributed lock manager. Without it, you can’t run a cluster with OCFS2.
The sources are available at https://fedorahosted.org/cluster/wiki/HomePage via git; however, we need to patch them, otherwise they won’t build successfully.
Copy the source code below into a file on your system. Name it something like dlm_controld_pacemaker.patch. We will need it later.
--- a/group/dlm_controld/pacemaker.c 2012-07-12 19:07:56.555023010 -0400
+++ b/group/dlm_controld/pacemaker.c 2012-07-12 19:06:11.058024609 -0400
@@ -16,7 +16,7 @@
 #undef SUPPORT_HEARTBEAT
 #define SUPPORT_HEARTBEAT 0
 #include
-#include
+#include
 #include
 #include
 #include
@@ -64,7 +64,7 @@
 crm_log_init("cluster-dlm", LOG_INFO, FALSE, TRUE, 0, NULL);
 if(init_ais_connection(NULL, NULL, NULL, &local_node_uname, &our_nodeid) == FALSE) {
- log_error("Connection to our AIS plugin (%d) failed", CRM_SERVICE);
+ log_error("Connection to our AIS plugin (CRM) failed");
 return -1;
 }
So here is a synopsis of what we are going to do:
- Clone the repository
- Check out the appropriate branch
- Patch group/dlm_controld/pacemaker.c
- Run the configure script
- Run make
- Profit??
Shell we begin?
First, clone the repository.
[root@pcmk-dev2 ~]# mkdir cluster-build
[root@pcmk-dev2 ~]# cd cluster-build/
[root@pcmk-dev2 cluster-build]# ls
[root@pcmk-dev2 cluster-build]# git clone http://git.fedorahosted.org/git/dlm.git
Initialized empty Git repository in /root/cluster-build/dlm/.git/
Then checkout the “pacemaker” branch.
[root@pcmk-dev2 cluster-build]# cd dlm
[root@pcmk-dev2 dlm]# git branch -a
* master
  remotes/origin/HEAD -> origin/master
  remotes/origin/dlm-fixes
  remotes/origin/fscontrol
  remotes/origin/master
  remotes/origin/pacemaker
  remotes/origin/sles
[root@pcmk-dev2 dlm]#
[root@pcmk-dev2 dlm]# git checkout pacemaker
Branch pacemaker set up to track remote branch pacemaker from origin.
Switched to a new branch 'pacemaker'
Now patch the code
[root@pcmk-dev2 dlm]# pwd
/root/cluster-build/dlm
[root@pcmk-dev2 dlm]# cat ../dlm_controld_pacemaker.patch | patch -p1
patching file group/dlm_controld/pacemaker.c
[root@pcmk-dev2 dlm]#
Run the configure script.
[root@pcmk-dev2 dlm]# ./configure --enable_pacemaker
Configuring Makefiles for your system...
Checking tree: nothing to do
Checking kernel: WARNING: Could not determine kernel version. Build might fail!
Completed Makefile configuration
[root@pcmk-dev2 dlm]#
Finally, run make
[root@pcmk-dev2 dlm]# make
[ -n "" ] || make -C dlm all
make[1]: Entering directory `/root/cluster-build/dlm/dlm'
set -e && \
for i in libdlm libdlmcontrol tool man; do \
make -C $i all; \
done
make[2]: Entering directory `/root/cluster-build/dlm/dlm/libdlm'
gcc -Wall -Wformat=2 -MMD -O2 -g -I/root/cluster-build/dlm/make -DENABLE_PACEMAKER=1 -DLOGDIR=\"/var/log/cluster\" -DSYSLOGFACILITY=LOG_LOCAL4 -DSYSLOGLEVEL=LOG_INFO -DRELEASE_VERSION=\"1342137464\" -fPIC -I/root/cluster-build/dlm/dlm/libdlm -I/usr/include -I/lib/modules/2.6.39-200.29.1.el6uek.x86_64/source/include -D_REENTRANT -c -o libdlm.o /root/cluster-build/dlm/dlm/libdlm/libdlm.c
In file included from /root/cluster-build/dlm/dlm/libdlm/libdlm.c:21:
/lib/modules/2.6.39-200.29.1.el6uek.x86_64/source/include/linux/types.h:13:2: warning: #warning "Attempt to use kernel headers from user space, see http://kernelnewbies.org/KernelHeaders"
ar cru libdlm.a libdlm.o
ranlib libdlm.a
gcc -Wall -Wformat=2 -MMD -O2 -g -I/root/cluster-build/dlm/make -DENABLE_PACEMAKER=1 -DLOGDIR=\"/var/log/cluster\" -DSYSLOGFACILITY=LOG_LOCAL4 -DSYSLOGLEVEL=LOG_INFO -DRELEASE_VERSION=\"1342137464\" -fPIC -I/root/cluster-build/dlm/dlm/libdlm -I/usr/include -I/lib/modules/2.6.39-200.29.1.el6uek.x86_64/source/include -c -o libdlm_lt.o /root/cluster-build/dlm/dlm/libdlm/libdlm.c
In file included from /root/cluster-build/dlm/dlm/libdlm/libdlm.c:21:
/lib/modules/2.6.39-200.29.1.el6uek.x86_64/source/include/linux/types.h:13:2: warning: #warning "Attempt to use kernel headers from user space, see http://kernelnewbies.org/KernelHeaders"
ar cru libdlm_lt.a libdlm_lt.o
ranlib libdlm_lt.a
gcc -shared -o libdlm.so.3.0 -Wl,-soname=libdlm.so.3 libdlm.o -lpthread -L/usr/lib
ln -sf libdlm.so.3.0 libdlm.so
ln -sf libdlm.so.3.0 libdlm.so.3
gcc -shared -o libdlm_lt.so.3.0 -Wl,-soname=libdlm_lt.so.3 libdlm_lt.o -L/usr/lib
ln -sf libdlm_lt.so.3.0 libdlm_lt.so
ln -sf libdlm_lt.so.3.0 libdlm_lt.so.3
cat /root/cluster-build/dlm/dlm/libdlm/libdlm.pc.in | \
sed \
-e 's#@PREFIX@#/usr#g' \
-e 's#@LIBDIR@#/usr/lib#g' \
-e 's#@INCDIR@#/usr/include#g' \
-e 's#@VERSION@#1342137464#g' \
> libdlm.pc
cat /root/cluster-build/dlm/dlm/libdlm/libdlm_lt.pc.in | \
sed \
-e 's#@PREFIX@#/usr#g' \
-e 's#@LIBDIR@#/usr/lib#g' \
-e 's#@INCDIR@#/usr/include#g' \
****
make[2]: Entering directory `/root/cluster-build/dlm/bindings/python'
set -e && \
for i in ; do \
make -C $i all; \
done
make[2]: Leaving directory `/root/cluster-build/dlm/bindings/python'
make[1]: Leaving directory `/root/cluster-build/dlm/bindings'
[ -n "" ] || make -C contrib all
make[1]: Entering directory `/root/cluster-build/dlm/contrib'
set -e && \
for i in ; do \
make -C $i all; \
done
make[1]: Leaving directory `/root/cluster-build/dlm/contrib'
[root@pcmk-dev2 dlm]#
At the end of this process, you should have a shiny new dlm_controld.pcmk in group/dlm_controld. Admire it... then go ahead and run make install, and we will move on to the next step.
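For anyone who wants to repeat Step 4 without typing each command, the transcript can be condensed into a small script. This is a sketch assuming the same repository URL, branch and configure flag shown above are still valid; the function names are hypothetical, and the patch file path must be absolute because the function changes directory before applying it.

```shell
# Hypothetical wrapper around the Step 4 commands.
# $1 = working directory, $2 = absolute path to the patch file.
# The body runs in a subshell, so the caller's working directory is untouched,
# and the && chain stops at the first failing command.
build_dlm_controld() (
    mkdir -p "$1" && cd "$1" &&
    git clone http://git.fedorahosted.org/git/dlm.git &&
    cd dlm &&
    git checkout pacemaker &&
    patch -p1 < "$2" &&
    ./configure --enable_pacemaker &&
    make
)

# Succeeds only if the build produced an executable dlm_controld.pcmk.
# $1 = path to the dlm source tree.
dlm_binary_built() {
    [ -x "$1/group/dlm_controld/dlm_controld.pcmk" ]
}
```

A run could look like `build_dlm_controld /root/cluster-build /root/dlm_controld_pacemaker.patch && dlm_binary_built /root/cluster-build/dlm`, with make install run from the dlm directory afterwards.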
Step 5: Building ocfs2-tools
OK, so once again, we are going to have to go outside of the CentOS repository as the ocfs2-tools rpm is nowhere to be found.
Visit the Oracle Public Yum Repository. Download the SRPM for ocfs2-tools 1.8 and install it. We are going to rebuild the RPM, but before we do that, we need to patch the code using the patch below.
--- a/ocfs2_controld/pacemaker.c 2012-07-12 20:33:52.525019305 -0400
+++ b/ocfs2_controld/pacemaker.c 2012-07-12 20:34:01.781025802 -0400
@@ -30,7 +30,9 @@
 #include
 #include
 #include
-#include
+#include
+#include
+#include
 #include "ocfs2-kernel/kernel-list.h"
 #include "o2cb/o2cb.h"
@@ -155,7 +157,7 @@
 crm_log_init("ocfs2_controld", LOG_INFO, FALSE, TRUE, 0, NULL);
 if(init_ais_connection(NULL, NULL, NULL, &local_node_uname, &our_nodeid) == FALSE) {
- log_error("Connection to our AIS plugin (%d) failed", CRM_SERVICE);
+ log_error("Connection to our AIS plugin (CRM) failed");
 return -1;
 }
If that patch looks like the patch we applied to group/dlm_controld/pacemaker.c in the source for dlm_controld.pcmk, you are right. The common theme in both cases is that CRM_SERVICE is not properly defined and some headers are not properly referenced.
Copy that code to the server and name it ocfs2_controld.patch. Also, download my RPM spec file from this link and put it on your server. For some odd reason, the spec file shipped with the SRPM builds but doesn’t package /usr/sbin/ocfs2_controld.pcmk (or /usr/sbin/ocfs2_controld.cman for that matter). My spec file corrects this.
So here is the synopsis for building the ocfs2-tools rpm.
- Download and install the SRPM from the link I provided
- Untar and patch the ocfs2-tools source
- Re-tar the source and replace the one shipped with the rpm
- Build the rpm using my spec file
- Install the rpm
OK, so let’s go.
Download the SRPM and install it.
[root@pcmk-dev2 ~]# wget http://public-yum.oracle.com/repo/OracleLinux/OL6/latest/x86_64/ocfs2-tools-1.8.0-10.el6.src.rpm
--2012-07-12 22:02:01-- http://public-yum.oracle.com/repo/OracleLinux/OL6/latest/x86_64/ocfs2-tools-1.8.0-10.el6.src.rpm
Resolving public-yum.oracle.com... 141.146.44.34
Connecting to public-yum.oracle.com|141.146.44.34|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 853920 (834K) [application/x-rpm]
Saving to: “ocfs2-tools-1.8.0-10.el6.src.rpm”
100%[======================================>] 853,920 1.38M/s in 0.6s
2012-07-12 22:02:01 (1.38 MB/s) - “ocfs2-tools-1.8.0-10.el6.src.rpm”
[root@pcmk-dev2 ~]# rpm -ivh ocfs2-tools-1.8.0-10.el6.src.rpm
1:ocfs2-tools warning: user mockbuild does not exist - using root
warning: group mockbuild does not exist - using root
warning: user mockbuild does not exist - using root
warning: group mockbuild does not exist - using root
warning: user mockbuild does not exist - using root
warning: group mockbuild does not exist - using root
warning: user mockbuild does not exist - using root
warning: group mockbuild does not exist - using root
warning: user mockbuild does not exist - using root
warning: group mockbuild does not exist - using root
warning: user mockbuild does not exist - using root
warning: group mockbuild does not exist - using root
warning: user mockbuild does not exist - using root
warning: group mockbuild does not exist - using root
warning: user mockbuild does not exist - using root
warning: group mockbuild does not exist - using root
warning: user mockbuild does not exist - using root
warning: group mockbuild does not exist - using root
########################################### [100%]
warning: user mockbuild does not exist - using root
warning: group mockbuild does not exist - using root
[root@pcmk-dev2 ~]#
Untar the source and patch it
[root@pcmk-dev2 ~]# tar -zxf rpmbuild/SOURCES/ocfs2-tools-1.8.0.tar.gz
[root@pcmk-dev2 ~]# cd ocfs2-tools-1.8.0
[root@pcmk-dev2 ocfs2-tools-1.8.0]# cat ../ocfs2_controld.patch | patch -p1
patching file ocfs2_controld/pacemaker.c
[root@pcmk-dev2 ocfs2-tools-1.8.0]#
Now re-tar the source code and replace the source shipped with the SRPM in rpmbuild/SOURCES
[root@pcmk-dev2 ocfs2-tools-1.8.0]# cd ../
[root@pcmk-dev2 ~]# tar -czf ocfs2-tools-1.8.0.tar.gz ocfs2-tools-1.8.0
[root@pcmk-dev2 ~]# cp ocfs2-tools-1.8.0.tar.gz rpmbuild/SOURCES/.
cp: overwrite `rpmbuild/SOURCES/./ocfs2-tools-1.8.0.tar.gz'? y
[root@pcmk-dev2 ~]#
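The re-tar and copy can be collapsed into one non-interactive step by writing the tarball straight into the SOURCES directory, which sidesteps the overwrite prompt. A minimal sketch, with a hypothetical function name and all paths passed in as arguments:

```shell
# Hypothetical helper: re-tars a patched source tree directly into the
# rpmbuild SOURCES directory, overwriting the tarball shipped with the SRPM.
# $1 = directory containing the source tree
# $2 = name of the source tree (e.g. ocfs2-tools-1.8.0)
# $3 = rpmbuild SOURCES directory
repack_source() {
    local parent="$1" name="$2" sources="$3"
    tar -C "$parent" -czf "$sources/$name.tar.gz" "$name"
}
```

With the layout used in this how-to, that would be `repack_source /root ocfs2-tools-1.8.0 /root/rpmbuild/SOURCES`.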
Finally, build the rpm with my spec file
[root@pcmk-dev2 ~]# rpmbuild -bb ocfs2-tools.spec
Executing(%prep): /bin/sh -e /var/tmp/rpm-tmp.dttsXy
+ umask 022
+ cd /root/rpmbuild/BUILD
+ cd /root/rpmbuild/BUILD
+ rm -rf ocfs2-tools-1.8.0
+ /bin/tar -xvvf -
+ /usr/bin/gzip -dc /root/rpmbuild/SOURCES/ocfs2-tools-1.8.0.tar.gz
drwxr-xr-x root/root 0 2011-03-20 03:13 ocfs2-tools-1.8.0/
-rw-r--r-- root/root 2390 2011-03-20 02:48 ocfs2-tools-1.8.0/mbvendor.m4
drwxr-xr-x root/root 0 2011-03-20 03:13 ocfs2-tools-1.8.0/listuuid/
-rw-r--r-- root/root 5210 2011-03-20 02:48 ocfs2-tools-1.8.0/listuuid/listuuid.c
-rw-r--r-- root/root 640 2011-03-20 02:48 ocfs2-tools-1.8.0/listuuid/Makefile
**** ** ** ** **
Processing files: ocfs2-tools-devel-1.8.0-10.el6.x86_64
Checking for unpackaged file(s): /usr/lib/rpm/check-files /root/rpmbuild/BUILDROOT/ocfs2-tools-1.8.0-10.el6.x86_64
warning: Installed (but unpackaged) file(s) found:
/sbin/ocfs2_controld.cman
/sbin/ocfs2_controld.pcmk
warning: Could not canonicalize hostname: pcmk-dev2
Wrote: /root/rpmbuild/RPMS/x86_64/ocfs2-tools-1.8.0-10.el6.x86_64.rpm
Wrote: /root/rpmbuild/RPMS/x86_64/ocfs2-tools-devel-1.8.0-10.el6.x86_64.rpm
Executing(%clean): /bin/sh -e /var/tmp/rpm-tmp.IvQAGC
+ umask 022
+ cd /root/rpmbuild/BUILD
+ cd ocfs2-tools-1.8.0
+ rm -rf /root/rpmbuild/BUILDROOT/ocfs2-tools-1.8.0-10.el6.x86_64
+ exit 0
If all goes well, you should see output similar to that above. Install the resulting RPM and rejoice! The hard work is done. Tar up the compiled source tree for dlm_controld.pcmk and copy that, plus the ocfs2-tools 1.8 RPM produced above, to your servers. On your production servers you’ll simply need to install the UEK kernel, redhat-lsb, clusterlib, corosync, pacemaker, openais and resource-agents in addition to the ocfs2-tools RPM you compiled. You’ll also need to untar the dlm_controld.pcmk source tree and run make install.
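To make the hand-off to production a bit less error-prone, the two artifacts can be gathered into one directory before copying. This is a sketch only: the function name, the tarball name `dlm-built.tar.gz` and the `ocfs2-tools-*.rpm` glob are my own choices for illustration, not something the build produces for you.

```shell
# Hypothetical helper: collects the built dlm tree and the ocfs2-tools RPM(s)
# into a single output directory that can be copied to a production node.
# $1 = path to the built dlm source tree
# $2 = directory holding the rebuilt RPMs (e.g. rpmbuild/RPMS/x86_64)
# $3 = output directory (created if missing)
bundle_for_production() {
    local dlm_tree="$1" rpm_dir="$2" out="$3"
    mkdir -p "$out" || return 1
    tar -C "$(dirname "$dlm_tree")" -czf "$out/dlm-built.tar.gz" \
        "$(basename "$dlm_tree")" || return 1
    cp -f "$rpm_dir"/ocfs2-tools-*.rpm "$out"/
}
```

On the build host that might be `bundle_for_production /root/cluster-build/dlm /root/rpmbuild/RPMS/x86_64 /root/bundle`. On each production node you would then install the packages listed above, install the ocfs2-tools RPM, untar dlm-built.tar.gz and run make install inside the dlm directory.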
Stay tuned. In the next how to, I’ll show you how to put this software to work!
Hi, FYI: if you install the latest pacemaker-libs* packages (1.1.8-7), the dlm compile won’t work. The latest pacemaker-libs* packages that work well are version 1.1.7-6.el6. If you have already installed the 1.1.8-7 version, you can downgrade the packages with this command: yum downgrade pacemaker pacemaker-cli pacemaker-cluster-libs pacemaker-cts pacemaker-libs pacemaker-libs-devel.
Regards.
Thanks for the heads up Francisco. I’ll add that as a note in the how-to.
Hmmm, I am having linking problems:
pacemaker.o: In function `dlm_process_node':
/root/cluster-build/dlm/group/dlm_controld/pacemaker.c:122: undefined reference to `crm_is_member_active'
/root/cluster-build/dlm/group/dlm_controld/pacemaker.c:214: undefined reference to `crm_is_member_active'
/root/cluster-build/dlm/group/dlm_controld/pacemaker.c:214: undefined reference to `crm_is_member_active'
pacemaker.o: In function `process_cluster':
/root/cluster-build/dlm/group/dlm_controld/pacemaker.c:94: undefined reference to `ais_dispatch'
etc.
Assuming it is a problem with the pacemaker libs, I will try to downgrade and see if that works.
So, I eventually got this working with CentOS 6.6 and OCFS2 but I had to build the ocfs2-tools code and fish out ocfs2_controld.cman and place it in the correct place.
Also, of course, I had to use the correct approach to configure things, mostly with help from this site:
http://floriancrouzat.net/2013/04/rhel-6-4-pacemaker-1-1-8-adding-cman-support-and-getting-rid-of-the-plugin/
Also, the OCFS2 documentation is a little wrong (with respect to the steps), but that is OK.
I now have CTDB running on a two node cluster on two VBox VMs using OCFS2, CMAN and Pacemaker.