Mini-HOWTOs

   

HOW TO PREPARE A FAIL PROOF SOLUTION WITH A REDUNDANT MASTER?

 

A redundant master functionality is right now not a built-in functionality. But this subject is for us very crucial because we know how important this is and we receive lots of requests about it from many sources.

 

It is important to mention that even in MooseFS v 1.5.x edition it is relatively easy to write a scripts set which would quite automatically start a backup master server and in 1.6.x it is even simpler. The whole process of switching to the backup server would take less than a minute.

 

It is enough to use for example Common Address Redundancy Protocol (http://www.openbsd.org/faq/pf/carp.html, http://en.wikipedia.org/wiki/Common_Address_Redundancy_Protocol). CARP allows that there exist two machines with the same IP in one LAN - one MASTER and the second BACKUP.

 

So you can set up IP of mfsmaster on a CARP interface and configure the master machine to be used as MooseFS main master. On the backup machine you also install mfsmaster but of course you do not run it.

 

Versions 1.6.5 and above contain a new program mfsmetalogger which you can run on whatever machine you wish. The program gets metadata from master - every several hours (by default every 24 hours) gets a full metadata file and on current basis a complete change log.

 

If you run an earlier version of MooseFS than 1.6.5 it is enough to set up several simple scripts run regularly from cron (eg. every one hour) which would backup metadata file from the main master PREFIX/var/mfs/metadata.mfs.back.

 

You also need an extra script run continuously testing the CARP interface state which in case if this interface goes in a MASTER mode would get two or three newest "changelog" files from any chunkserver (just by using "scp"), would also start mfsmetarestore and then mfsmaster. The switch time should take approximately several seconds and along with time necessary for reconnecting chunkservers a new master would be fully functional in about a minute (in both read and write modes).

 

We also plan to add option to run master in read-only mode - which would be the best option for running the backup machine. This would secure the system against potential desynchronization of the two master machines and the need of merging all the changes to the main master which took place on the backup master.

 

UPGRADE FROM VERSION 1.5.X TO 1.6.X

 

When you upgrade from 1.5.x to 1.6.x it is important to keep the order of upgraded modules which is described in UPGRADE file:

Upgrading from 1.5.x to 1.6.x:

  1. Upgrade and restart mfsmaster.
  2. Upgrade and restart chunkservers.
  3. Upgrade mfs clients (mfstools before remounting MFS with new mfsmount), remount MFS trees.


You do not need to uninstall any older versions. You also do not need to stop the previous version - the new one should stop the previous.

1 (machine with mfsmaster):
make install
PREFIX/sbin/mfsmaster


2 (machines with chunkservers):
make install
PREFIX/sbin/mfschunkserver


You should restart chunkservers with 5-10 minutes delay.

3 (clients)

Remount the disks:

umount /mnt/mfs

mfsmount /mnt/mfs

 

 

If you can stop the whole system for the restart, you can do the process this way:

1 (machine with mfsmaster):
make install
PREFIX/sbin/mfsmaster -s


2 (machines with chunkservers):
make install
#PREFIX/sbin/mfschunkserver


And here you can make all restarts at one moment. But please wait about 3 minutes before starting the master.

3 (master)
PREFIX/sbin/mfsmaster

4. And now you shoud change all "msfmounts"