Doug has been engrossed in programming since his parents first bought him an Apple IIe computer in 4th grade. Throughout his early career, Doug proved his flexibility and ingenuity in crafting solutions in a variety of environments. Doug’s most recent work has been in the telecom industry developing tools to analyze large amounts of network traffic using C++ and Python. Doug loves learning and synthesizing this knowledge into code and blog articles. Doug is a DZone MVB and is not an employee of DZone and has posted 34 posts at DZone.

Backing Up Virtual Machines In VirtualBox

02.11.2013

And now to the fun subject of virtual machine images!

My computer blue-screened the other day, the second time in a month, so I got fairly paranoid and needed to make sure my backup strategy was solid. The most important thing I want to back up is my Ubuntu VirtualBox image. Here at work we use CrashPlan, which seems to do a decent enough job of automatically backing up the directories I tell it to. However, I had simply been pointing CrashPlan at the directory holding my virtual machine images and hoping for the best. This is, of course, problematic: CrashPlan is most likely running its backup while I’m actively working in my virtual machine. How can I have any guarantee that the image it copies is in a consistent state?

Backing Up With A Versioning System

I tried all kinds of harebrained schemes to back up my VirtualBox image in something resembling an automatic fashion. I really wanted some kind of diff-based approach to backing up the large file. One thought was that, before launching the virtual machine from a batch file or script, I’d commit the .vdi to a version control system. Then I’d point CrashPlan at the associated repository, which would be backed up in the background. I don’t recommend this. Git turns out to have a sensible maximum file size, which was a pretty big “hey idiot… what are you doing” warning. I found and tried Boar, a versioning system that attempts to work well for binary files. Unfortunately, even its diffing ability was lacking in this situation: after 2 commits of a 20 GB .vdi file, the repository had grown to more than 40 GB, so the solution wasn’t space efficient. Moreover, committing such a large file is tediously slow. Every time I launched my virtual machine I’d have to wait through this boring 5-10 minute process.

Backing Up Using VirtualBox Snapshots

The canonical solution turns out to involve a VirtualBox feature known as snapshots, which I had known nothing about until now.

From a user’s perspective, a snapshot is a restore point. If I create a snapshot, I can go back to that point in time. The best part is that I can take a snapshot of a system even while it’s running. In the simplest use case you have a linear progression of snapshots over time, and you can restore your virtual machine to any snapshot in the history. You can also do crazy things like go back to a snapshot and create a branch from it, taking your virtual machine in multiple experimental directions from a single restore point.

The way snapshots actually work makes them extremely powerful for backups. A snapshot turns out to be the diffing system I was looking for. When you take a snapshot of a virtual machine in the default “normal” mode, the associated parent (either the virtual machine image or another snapshot) is frozen and no longer written to. Instead, all writes go into a file associated with the new snapshot. This file is in essence a kind of commit log against the underlying virtual file system, recording everything that has happened since the snapshot was taken. It’s a diff of what has changed since the snapshot took place. Restoring to the snapshot point is as simple as throwing away the snapshot file (the commit log) and unfreezing the snapshot’s ancestor.

Deleting a snapshot does not remove the changes in that commit log. Instead, it folds that snapshot into its ancestor (back into the .vdi or another diff file). It’s actually committing the diff.
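As a rough illustration, the snapshot operations described above map onto `VBoxManage snapshot` subcommands. The sketch below wraps them in small shell functions. The function names are my own, and the VM and snapshot names are whatever the caller passes in; treat it as a starting point, not a tested tool.

```shell
#!/bin/bash
# Sketch of the snapshot operations described above.
# In each function "$1" is the VM name and "$2" the snapshot name.
# The function names are illustrative, not part of VirtualBox.

# Freeze the current disk state. From here on, guest writes go into a new
# differencing image instead of the parent image.
take_snapshot() {
    VBoxManage -q snapshot "$1" take "$2"
}

# Print the VM's snapshot tree (linear history or branches).
list_snapshots() {
    VBoxManage -q snapshot "$1" list
}

# Forget the restore point. The differencing data is merged rather than
# discarded, so the VM's current state is unchanged.
delete_snapshot() {
    VBoxManage -q snapshot "$1" delete "$2"
}
```

For example, `take_snapshot "Ubuntu" current` followed later by `delete_snapshot "Ubuntu" current` would first freeze the image and then fold the accumulated writes back in ("Ubuntu" is a placeholder VM name).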

This merge-on-delete behavior is why snapshots can work as a backup strategy:

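#!/bin/bash
# Rotate VirtualBox snapshots: current -> previous -> deleteme (merged away).

VBOXMANAGE="/usr/bin/VBoxManage -q"

if [ "$#" -ne 1 ]
then
    echo "Usage: $0 VBoxName" >&2
    exit 1
fi

# $VBOXMANAGE is intentionally unquoted so it splits into the command and
# its -q flag. On the first two runs some steps are expected to fail,
# because the older snapshots do not exist yet; the script carries on.
echo "Renaming old snapshot..."
$VBOXMANAGE snapshot "$1" edit previous --name deleteme
echo "Renaming current snapshot..."
$VBOXMANAGE snapshot "$1" edit current --name previous
echo "Taking new snapshot..."
$VBOXMANAGE snapshot "$1" take current
echo "Deleting old snapshot..."
# Deleting merges the snapshot's data into its parent; nothing is lost.
$VBOXMANAGE snapshot "$1" delete deleteme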

You can back up a running VirtualBox virtual machine by maintaining a cascade of snapshots. One snapshot, known as “current”, is the most recent; restoring to it restores to the last backup. The diff file associated with it (holding everything that has happened AFTER the snapshot) reflects all the changes not yet backed up and is where VirtualBox is actively keeping the guest OS’s writes. This diff is a kind of “commit log” of all the changes going to the virtual disk. The “previous” snapshot is the restore point before current. In this script, previous’s commit log is the old current, so the diff associated with “previous” reflects the changes between the previous and current snapshots. Finally, the script also has a “deleteme”, the old previous. On every run, deleteme is folded back into the main .vdi file by telling VBoxManage to delete the snapshot (deleting a snapshot doesn’t remove the associated data; it just folds in the data and forgets the restore point).
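Going back to one of the two restore points could look something like the sketch below. The `restore_backup` function is hypothetical (it is not part of the script above), and it assumes the VM is already powered off, since as far as I know VirtualBox will not restore a snapshot on a running machine.

```shell
#!/bin/bash
# Sketch: roll a VM back to one of the restore points kept by the rotation
# script. restore_backup is an illustrative name, not a VirtualBox command.
# Assumes the VM is powered off before this is called.

restore_backup() {
    local vm="$1"
    local snap="${2:-current}"   # default to the most recent restore point

    # Only the two restore points maintained by the cascade are accepted.
    case "$snap" in
        current|previous) ;;
        *)
            echo "Unknown restore point: $snap (expected current or previous)" >&2
            return 2
            ;;
    esac

    # Discards the writes made since the snapshot and returns to that point.
    VBoxManage -q snapshot "$vm" restore "$snap"
}
```

Calling `restore_backup "Ubuntu"` would return to the last backup, and `restore_backup "Ubuntu" previous` to the one before it ("Ubuntu" is again a placeholder VM name).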

This strategy lets us keep 2 restore points (in case a backup accidentally captures an unstable image). It’s a great strategy, but…

Snapshots — Tread Carefully

Sadly, for me personally, live VirtualBox snapshots haven’t been a terribly robust backup strategy. I’ve seen several snapshots fail or, worse, had VirtualBox crash while a snapshot was being taken. Luckily I haven’t lost a lot of data, as I’ve been diligent about pushing my code to GitHub. When a snapshot has failed, I’ve had to edit my virtual machine’s vbox.xml file, a file that clearly states “DO NOT EDIT” at the top. It’s easy to be lulled into thinking that a snapshot should either succeed or fail atomically, like committing to a versioning system. That hasn’t been my experience.

Here’s a gallery of horrors: some of the errors I’ve seen. First there’s “A differencing image of snapshot could not be found”, where a snapshot image file somehow gets lost.

I’ve also encountered this error: “Hard Disk XXX cannot be directly attached to the virtual machine because it has 1 differencing child hard disks”. I’ve had errors taking snapshots, including having the snapshot process hang with a live VM. Sadly, I can’t say I trust live snapshots right now.

I’ve reverted to a simpler, non-live backup strategy that takes a snapshot only immediately before starting up my virtual machine and won’t take one while VirtualBox.exe is running (it asks you to close VirtualBox before continuing). This is a combination of the script above reworked into Python on Windows and this Python ActiveState recipe. I’ve replaced the VirtualBox shortcut pinned to the taskbar with a batch file that runs my script, keeping the default VirtualBox icon on it.

I also only ever have one snapshot in use: “current”. Before launching VirtualBox, current gets compacted into the main .vdi image via a snapshot delete. I then take a new “current” snapshot, which VirtualBox uses. This means only one differencing image is active at a time, and the main .vdi gets updated right before the VM launches. Therefore, when CrashPlan backs up that folder, it should be backing up a stable .vdi that isn’t constantly changing. So far this **seems** to be the best way for me to maintain a trustworthy backup strategy without weird errors that crash my live VM.
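My actual script is Python on Windows, but the same idea can be sketched in shell. Note the differences: this version checks `VBoxManage list runningvms` for the one VM rather than looking for a running VirtualBox.exe process, and it starts the VM itself instead of opening the VirtualBox UI. The function names are made up for illustration, and it assumes a snapshot named “current” already exists (take the first one by hand).

```shell
#!/bin/bash
# Sketch of the offline rotation: merge "current" into the main image, take
# a fresh "current", then start the VM. Function names are illustrative.
# Assumes a snapshot named "current" already exists for the VM.

# Succeeds if the VM shows up in "VBoxManage list runningvms", whose lines
# look like: "vmname" {uuid}
vm_is_running() {
    VBoxManage list runningvms | grep -qF -- "\"$1\" {"
}

rotate_offline() {
    local vm="$1"

    if vm_is_running "$vm"; then
        echo "VM '$vm' is running; shut it down before rotating snapshots." >&2
        return 1
    fi

    # Deleting the snapshot folds its differencing data into the parent, so
    # the main image is up to date before the backup tool next reads it.
    # Stop here if the merge fails rather than stacking snapshots.
    VBoxManage -q snapshot "$vm" delete current || return 1

    # New writes now go to a fresh differencing image, leaving the main
    # image untouched while the VM runs.
    VBoxManage -q snapshot "$vm" take current || return 1

    VBoxManage -q startvm "$vm"
}
```

A launcher script would then call something like `rotate_offline "Ubuntu"` in place of starting the VM directly ("Ubuntu" being a placeholder VM name).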

Anyway, I’d definitely be curious to hear about your experiences backing up VMs!


Published at DZone with permission of Doug Turnbull, author and DZone MVB. (source)


Comments

Fabrizio Giudici replied on Thu, 2013/02/14 - 4:39am

First, thank you, because this is a valuable post.

"This is, of course, problematic as CrashPlan is most likely running its backup while I’m actively working in my virtual machine."

This is a very interesting point, as this way of operating is common to other popular backup systems, such as Time Machine on the Mac, and basically makes this kind of backup almost useless. Not only for backing up a VM, but for anything that doesn't atomically commit changes to a filesystem (which means everything, from Opera to Lightroom). Most people seem to just "trust" the coolness factor of the Time Machine UI, or the fact that "my backup does everything automatically", without considering these side effects. Usually you realize you didn't have a solid backup when you have to restore something.

Back to the central point: I hoped snapshots would be more effective, but found the same problems you describe. For me, VMs aren't that important; I use them for testing things for my customers, but there is no vital data inside. If something crashes with no possibility of recovery, I just have to restart from a plain system, check out some code and run tests. That's why I'm perfectly fine with a manual, sync approach: I have created "clean" versions of Windows 7 and Ubuntu, applied the patches available at the time of creation, then zipped the .vdi and archived it (before zipping you can apply some tricks to minimize the size of the zip, for Windows e.g. http://garethtuckercrm.com/2012/07/25/shrinking-virtualbox-vdi-files/). When I have to restart from scratch, I just unzip the image, change the file name of the disk image, change the UUID registered by VirtualBox and I'm ready.

For more frequent backups of vital data, I frankly don't see anything safer and more effective than manually running rsync (or something based on it) periodically, after stopping the applications. I back up my data (not only VMs) this way and I've been fine so far. rsync can do incremental backups, preserving the overwritten files for as many versions as you want. I usually do this during my lunch and dinner breaks, so I have a sort of automatic reminder (whenever I eat, I back up) and I don't have to interrupt my work. For backups requiring a longer time, you can launch them before going to sleep, adding something that powers off or freezes the machine when it's over.

For the record, I also have a Time Machine backup disk, mostly because I've got a bag of recovered 2.5" disks that I "need" to use in some way. It costs me nothing to have this disk attached all day, and it's a sort of backup complement. It's mostly useful for special cases, such as when you delete a file, empty the trash, and then realize you've made a mistake. It actually worked for me once, but as you can guess it's a fairly rare use case.

Serious backup of serious data today still requires discipline and manual care. Things would be different if we had a transactional file system. Unfortunately Apple seems not to care about this for Mac OS X; Linux users could try Btrfs; for Windows, I don't know.
