Today I decided to write a short blog entry about how to back up mongodb using Linux LVM snapshots. Lately I have seen a lot of questions on the #mongodb IRC channel about the best method for backing up mongodb.
The mongodb.org website describes mongodump as follows: “This takes a database and outputs it in a binary representation. This is used for doing (hot) backups of a database.”
While mongodump does have its uses, I would argue that it’s not the most efficient way to back up your mongodb data on a regular basis, especially once your database has grown into the gigabytes. That being said, here are a few projects I found on github.com which can help you automate your mongodb backups using mongodump:
Using Linux LVM
Personally I have come to rely upon LVM to back up (take snapshots of) my mongodb server, specifically on my delayed secondary. To get started you are going to need to complete the following tasks on at least one of your servers. This example uses an EC2 instance with 4 EBS volumes attached in RAID 10:
- Set up EBS volumes in RAID 10
    mdadm -C -R /dev/md0 -l10 -c256 -n4 /dev/xvdf /dev/xvdg /dev/xvdh /dev/xvdi
    echo "`mdadm --detail --scan`" | tee -a /etc/mdadm.conf
    blockdev --setra 128 /dev/md0
    blockdev --setra 128 /dev/xvdf
    blockdev --setra 128 /dev/xvdg
    blockdev --setra 128 /dev/xvdh
    blockdev --setra 128 /dev/xvdi
- Set up LVM
    pvcreate /dev/md0
    vgcreate datavg /dev/md0
    lvcreate -l 80%VG -n datalv datavg
    lvcreate -l 5%VG -n journallv datavg
    mke2fs -t ext4 -F /dev/datavg/datalv
    mke2fs -t ext4 -F /dev/datavg/journallv
    echo "/dev/datavg/datalv /data ext4 defaults,auto,noatime,noexec 0 0" >> /etc/fstab
    echo "/dev/datavg/journallv /journal ext4 defaults,auto,noatime,noexec 0 0" >> /etc/fstab
    mkdir -p /data/
    mkdir -p /journal
    mount -a
    ln -s /journal /data/journal
    chown -R mongodb:mongodb /data/
    chown -R mongodb:mongodb /journal/
This sets up a volume group called datavg with two logical volumes, datalv and journallv. Only 85% of the volume group is allocated, which leaves 15% for snapshots. You might need to adjust your snapshot size depending on how active your mongodb is and how many backups you wish to keep.
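To get a feel for whether 15% is enough, the arithmetic can be sketched in a few lines of shell. This is only a rough check under stated assumptions: each snapshot reserves its full size in the volume group when it is created, and one data and one journal snapshot are kept per day. The function names are hypothetical helpers, not part of LVM, and the volume group sizes in the example are made-up figures.

```shell
#!/bin/sh
# Rough capacity check for snapshot space (hypothetical helper functions).
set -eu

# snapshot_space_needed DAYS DATA_GB JOURNAL_GB
# Space reserved if one data and one journal snapshot are kept per day.
snapshot_space_needed() {
    echo $(( $1 * ($2 + $3) ))
}

# reserved_gb VG_SIZE_GB
# The 15% of the volume group left unallocated, rounded down to whole GB.
reserved_gb() {
    echo $(( $1 * 15 / 100 ))
}

# snapshots_fit VG_SIZE_GB DAYS DATA_GB JOURNAL_GB
# Succeeds when the reserved space covers all retained snapshots.
snapshots_fit() {
    [ "$(snapshot_space_needed "$2" "$3" "$4")" -le "$(reserved_gb "$1")" ]
}
```

For example, seven days of 5G and 2G snapshots reserve 7 * (5 + 2) = 49 GB, so `snapshots_fit 400 7 5 2` succeeds (60 GB reserved) while `snapshots_fit 200 7 5 2` does not (30 GB reserved). On a real host, `vgs datavg` shows the actual size and free space of the volume group.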
- Enable Journaling
nojournal = false
Now that we have LVM set up and journaling enabled, we can proceed with setting up our snapshot schedule. Here is a basic script which will make 5G/2G snapshots every day, Mon-Sun. Every week the previous week’s snapshot is removed:
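A minimal sketch of what such a script might look like, assuming the datavg layout created above, 5G for the data snapshot and 2G for the journal snapshot, and a per-weekday naming scheme such as datalv-snap-Mon. The snapshot names, the split of sizes, and the DRY_RUN switch are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch of /usr/local/bin/lvsnapshot.sh: one snapshot pair per weekday,
# replacing the pair taken on the same weekday of the previous week.
set -eu

VG="datavg"               # volume group from the LVM setup step
DATA_SNAP_SIZE="5G"       # assumed: copy-on-write space for datalv snapshots
JOURNAL_SNAP_SIZE="2G"    # assumed: copy-on-write space for journallv snapshots

# Print the command instead of running it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi
}

# snapshot_name LV DAY -> e.g. datalv-snap-Mon (naming scheme is an assumption)
snapshot_name() {
    printf '%s-snap-%s\n' "$1" "$2"
}

# True if the snapshot exists. In dry-run mode assume it does, so that the
# removal step is shown as well.
snapshot_exists() {
    [ "${DRY_RUN:-0}" = "1" ] || lvs "$VG/$1" >/dev/null 2>&1
}

# rotate DAY LV SIZE: drop last week's snapshot for DAY, then take a new one.
rotate() (
    snap="$(snapshot_name "$2" "$1")"
    if snapshot_exists "$snap"; then
        run lvremove -f "$VG/$snap"
    fi
    run lvcreate -s -L "$3" -n "$snap" "/dev/$VG/$2"
)

# The two snapshots are taken one after the other, not atomically.
main() {
    day="$(LC_ALL=C date +%a)"    # Mon, Tue, ... Sun
    rotate "$day" datalv "$DATA_SNAP_SIZE"
    rotate "$day" journallv "$JOURNAL_SNAP_SIZE"
}

# Only act in dry-run mode or on hosts that have the LVM tools installed.
if [ "${DRY_RUN:-0}" = "1" ] || command -v lvcreate >/dev/null 2>&1; then
    main
fi
```

Running it once as `DRY_RUN=1 /usr/local/bin/lvsnapshot.sh` prints the lvremove and lvcreate commands for today without touching any volumes, which is a cheap way to check the names and sizes before handing the script to cron.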
- Set up cron
Simply add this to cron, which runs the script every day at 00:10, and you are on your way.
10 0 * * * /usr/local/bin/lvsnapshot.sh > /dev/null 2>&1
Now that you have snapshots, your options for archiving mongodb become limitless. These snapshots can then be mounted and archived to S3 or even copied back to your own datacenter.
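One way this might look is sketched below: mount a snapshot read-only, pack it into a tarball, and upload it with the AWS CLI. The bucket name, mount point, temporary path, key layout, and helper names are all hypothetical. The sketch also assumes that the aws command is installed and configured, that a snapshot named something like datalv-snap-Mon exists in datavg, and that /tmp has room for the tarball.

```shell
#!/bin/sh
# Sketch: archive one LVM snapshot to S3 (bucket and paths are placeholders).
set -eu

VG="datavg"
BUCKET="example-mongodb-backups"   # hypothetical bucket name

# Print the command instead of running it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi
}

# archive_key SNAP DATE -> object key,
# e.g. mongodb/2024-01-01/datalv-snap-Mon.tar.gz (layout is an assumption)
archive_key() {
    printf 'mongodb/%s/%s.tar.gz\n' "$2" "$1"
}

# archive_snapshot SNAP DATE: mount, tar, unmount, upload, clean up.
archive_snapshot() (
    snap="$1"
    mnt="/mnt/$snap"
    tarball="/tmp/$snap.tar.gz"
    run mkdir -p "$mnt"
    run mount -o ro "/dev/$VG/$snap" "$mnt"
    run tar -czf "$tarball" -C "$mnt" .
    run umount "$mnt"
    run aws s3 cp "$tarball" "s3://$BUCKET/$(archive_key "$snap" "$2")"
    run rm -f "$tarball"
)
```

A call such as `archive_snapshot datalv-snap-Mon "$(date +%F)"` would then upload that day's data snapshot. Because /data/journal is a symlink to a separate volume, the journallv snapshot would need to be archived the same way.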