Mini-HOWTOs
HOW TO PREPARE A FAIL PROOF SOLUTION WITH A REDUNDANT MASTER?
Redundant master functionality is not built in at present. We treat this subject as crucial: we know how important it is, and we receive many requests for it from many sources.
It is worth noting that even in MooseFS 1.5.x it is relatively easy to write a set of scripts that starts a backup master server almost automatically, and in 1.6.x it is simpler still. The whole process of switching to the backup server takes less than a minute.
It is enough to use, for example, the Common Address Redundancy Protocol (http://www.openbsd.org/faq/pf/carp.html, http://en.wikipedia.org/wiki/Common_Address_Redundancy_Protocol). CARP allows two machines on one LAN to share the same IP address: one is the MASTER and the other is the BACKUP.
So you can assign the IP address of mfsmaster to a CARP interface and configure the master machine as the main MooseFS master. On the backup machine you also install mfsmaster, but you do not run it.
Versions 1.6.5 and above include a new program, mfsmetalogger, which you can run on any machine you wish. It fetches metadata from the master: a full metadata file every several hours (every 24 hours by default) and a complete change log on an ongoing basis.
If you run a version of MooseFS earlier than 1.6.5, it is enough to set up a few simple scripts, run regularly from cron (e.g. every hour), that back up the metadata file from the main master: PREFIX/var/mfs/metadata.mfs.back.
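A cron-driven backup of this kind might be sketched as follows. The host name, the /usr/local prefix, the backup directory and the number of copies kept are assumptions made for illustration; none of them come from MooseFS itself.

```shell
#!/bin/sh
# Sketch of a metadata backup for MooseFS older than 1.6.5, meant to be run from cron.
# Assumptions (not from the MooseFS docs): host name, PREFIX value,
# backup directory and retention count.
MASTER_HOST="${MASTER_HOST:-mfsmaster}"
PREFIX="${PREFIX:-/usr/local}"
BACKUP_DIR="${BACKUP_DIR:-/var/backups/mfs}"
KEEP="${KEEP:-24}"

# Print the names of all but the $2 newest metadata copies in directory $1.
# The timestamp suffix sorts chronologically, so a plain sort is enough.
old_backups() {
    ls "$1" 2>/dev/null | grep '^metadata\.mfs\.back\.' | sort -r | sed "1,${2}d"
}

# Copy the current metadata file from the main master, then prune old copies.
backup_metadata() {
    mkdir -p "$BACKUP_DIR" || return 1
    stamp=$(date +%Y%m%d%H%M%S)
    scp -q "$MASTER_HOST:$PREFIX/var/mfs/metadata.mfs.back" \
        "$BACKUP_DIR/metadata.mfs.back.$stamp" || return 1
    old_backups "$BACKUP_DIR" "$KEEP" | while read -r name; do
        rm -f "$BACKUP_DIR/$name"
    done
}
```

Calling backup_metadata at the end of the script and running it from an hourly crontab entry would match the "every hour" suggestion above.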
You also need an extra script that runs continuously and tests the state of the CARP interface. When the interface goes into MASTER mode, the script should fetch the two or three newest "changelog" files from any chunkserver (just by using "scp"), run mfsmetarestore, and then start mfsmaster. The switch itself should take several seconds; together with the time the chunkservers need to reconnect, the new master should be fully functional (for both reads and writes) in about a minute.
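Such a watchdog might look roughly like the sketch below. It assumes the "carp: MASTER ..." line that ifconfig prints on OpenBSD and FreeBSD; the interface name, chunkserver host, changelog file pattern and polling interval are hypothetical and would need to be adapted to the actual installation.

```shell
#!/bin/sh
# Sketch of a failover watchdog for the backup master machine.
# Assumptions (not from the MooseFS docs): interface name, PREFIX value,
# chunkserver host, changelog file pattern and polling interval.
CARP_IF="${CARP_IF:-carp0}"
PREFIX="${PREFIX:-/usr/local}"
DATA_DIR="${DATA_DIR:-$PREFIX/var/mfs}"
CHUNKSERVER="${CHUNKSERVER:-chunkserver1}"
CHANGELOG_GLOB="${CHANGELOG_GLOB:-changelog.[0-2].mfs}"

# Read ifconfig output on stdin and print the CARP state (e.g. MASTER or BACKUP).
carp_state() {
    sed -n 's/^[[:space:]]*carp: \([A-Z]*\).*/\1/p' | head -n 1
}

# Fetch the newest changelogs, rebuild the metadata and start the master.
promote() {
    # The glob is quoted so that it is expanded on the chunkserver, not locally.
    scp -q "$CHUNKSERVER:$DATA_DIR/$CHANGELOG_GLOB" "$DATA_DIR/" || return 1
    # Merge the last metadata backup with the fetched changelogs.
    "$PREFIX/sbin/mfsmetarestore" -m "$DATA_DIR/metadata.mfs.back" \
        -o "$DATA_DIR/metadata.mfs" "$DATA_DIR"/changelog.*.mfs || return 1
    "$PREFIX/sbin/mfsmaster"
}

# Poll the CARP interface and promote this machine once it becomes MASTER.
watch_carp() {
    while :; do
        if [ "$(ifconfig "$CARP_IF" | carp_state)" = "MASTER" ]; then
            promote
            return
        fi
        sleep "${POLL_SECONDS:-2}"
    done
}
```

Running watch_carp on the backup machine (for example from an init script) keeps the check going until a failover happens. The function returns after one promotion, so the machine is not promoted twice.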
We also plan to add an option to run the master in read-only mode, which would be the best way to run the backup machine. This would protect the system against potential desynchronization of the two master machines and against the need to merge into the main master all the changes that took place on the backup master.
UPGRADE FROM VERSION 1.5.X TO 1.6.X
When you upgrade from 1.5.x to 1.6.x, it is important to upgrade the modules in the order described in the UPGRADE file:
Upgrading from 1.5.x to 1.6.x:
- Upgrade and restart mfsmaster.
- Upgrade and restart chunkservers.
- Upgrade mfs clients (mfstools before remounting MFS with new mfsmount), remount MFS trees.
You do not need to uninstall any older versions. You also do not need to stop the previous version: the new one should stop it.
1 (machine with mfsmaster):
make install
PREFIX/sbin/mfsmaster
2 (machines with chunkservers):
make install
PREFIX/sbin/mfschunkserver
You should restart the chunkservers with a 5-10 minute delay.
3 (clients)
Remount the disks:
umount /mnt/mfs
mfsmount /mnt/mfs
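The chunkserver step above could be automated along the following lines. This is only a sketch: it reads the 5-10 minute delay as a pause between consecutive chunkservers, which is an assumption, and it also assumes ssh access to each host and that "make install" has already been run there. The host names and the DRY_RUN switch are hypothetical.

```shell
#!/bin/sh
# Sketch: restart chunkservers one by one with a pause in between.
# Assumptions (not from the MooseFS docs): ssh access to every host,
# PREFIX value, the delay length and the DRY_RUN switch.
PREFIX="${PREFIX:-/usr/local}"
DELAY_SECONDS="${DELAY_SECONDS:-300}"
DRY_RUN="${DRY_RUN:-0}"

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Start the new mfschunkserver on every host given as an argument.
# As noted above, the new version should stop the previous one itself.
restart_chunkservers() {
    first=1
    for host in "$@"; do
        # Pause before every host except the first.
        if [ "$first" -eq 0 ]; then
            run sleep "$DELAY_SECONDS"
        fi
        first=0
        run ssh "$host" "$PREFIX/sbin/mfschunkserver"
    done
}
```

With DRY_RUN=1, a call such as restart_chunkservers cs1 cs2 only prints the commands it would run, which is a convenient way to check the host list first.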
If you can stop the whole system for the restart, you can proceed this way:
1 (machine with mfsmaster):
make install
PREFIX/sbin/mfsmaster -s
2 (machines with chunkservers):
make install
#PREFIX/sbin/mfschunkserver
Here you can do all the restarts at once, but please wait about 3 minutes before starting the master.
3 (master):
PREFIX/sbin/mfsmaster
4. And now you should change all "mfsmounts"


