Dump

dump is a Unix program used to back up file systems. It operates on blocks, below filesystem abstractions such as files and directories. Dump can back up a file system to a tape or another disk. It is often used across a network by piping its output through bzip2 then SSH. A dump utility first appeared in Version 6 AT&T UNIX.
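
As a rough sketch of the network case, the pipeline can be wrapped in a small function that prints the command rather than running it. The function name remote_dump_cmd, the host backuphost, and the remote path are made up for illustration, and this assumes bzip2 and ssh are available on the machine being dumped.

```shell
#!/bin/sh
# Build (but do not run) a pipeline that dumps a file system, compresses it
# with bzip2, and writes the result to a file on a remote host over SSH.
# remote_dump_cmd, backuphost, and the remote path are illustrative names.
remote_dump_cmd() {
    fs=$1        # file system to dump, e.g. /var
    remote=$2    # SSH destination, e.g. backup@backuphost
    outfile=$3   # path of the dump file on the remote host
    printf "dump -0 -aLf - %s | bzip2 | ssh %s 'cat > %s'\n" "$fs" "$remote" "$outfile"
}

# Print the command for review.
remote_dump_cmd /var backup@backuphost /backups/var.dump.bz2
```

Printing first makes it easy to check the command before committing to a long run; piping the function's output to sh would execute it.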

To back up a partition with dump, invoke it like this:

tethys# dump -0 -auLf /backups/var.dump /var
  DUMP: Date of this level 0 dump: Wed May  6 19:02:07 2009
  DUMP: Date of last level 0 dump: the epoch
  DUMP: Dumping snapshot of /dev/ad10s1e (/var) to /backups/var.dump
  DUMP: mapping (Pass I) [regular files]
  DUMP: mapping (Pass II) [directories]
  DUMP: estimated 3970989 tape blocks.
  DUMP: dumping (Pass III) [directories]
  DUMP: dumping (Pass IV) [regular files]
  DUMP: DUMP: 3970386 tape blocks on 1 volume
  DUMP: finished in 174 seconds, throughput 22818 KBytes/sec
  DUMP: level 0 dump on Wed May  6 19:02:07 2009
  DUMP: Closing /backups/var.dump
  DUMP: DUMP IS DONE
tethys#

The flags I used above are:

-0  Run a full (level 0) backup. Levels range from 0 to 9; levels above 0 are incremental.
-a  Auto-size. Use this option when backing up to another hard drive, as I am.
-u  Update the dumpdates file after a successful dump.
-L  The file system being backed up is live. Take a snapshot first, then dump the snapshot.
-f  The file to write the dump to.
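
Since -u records each successful dump, it can be handy to see what has been recorded for a device before choosing the next level. The sketch below is only an illustration: show_dumpdates is a made-up helper, and it assumes the dumpdates file holds one record per line in the form "device level date".

```shell
#!/bin/sh
# Print the level and date of each recorded dump for one device.
# Assumes one record per line: "<device> <level> <date...>".
show_dumpdates() {
    device=$1
    file=${2:-/etc/dumpdates}   # pass another path to try it on a sample file
    awk -v dev="$device" '$1 == dev {
        level = $2
        $1 = ""; $2 = ""        # drop device and level, keep the date fields
        sub(/^ +/, "")          # awk rejoins fields with single spaces
        print "level " level ": " $0
    }' "$file"
}
```

For example, show_dumpdates /dev/ad10s1e would list the dumps recorded for the /var partition used above.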

Compressing your Backups

Backups with dump can also be compressed to save space, with a few caveats. First, not all partitions compress equally well. Compression with gzip saved about 50-60% on dumps of /, /usr, and /var. However, my /home partition, which holds a lot of already-compressed digital content such as MP3s, videos, and pictures, shrank by only about 11%. bzip2 did 8-10% better, but it takes a lot longer.
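
To get figures like these for your own partitions, the saving can be computed from the sizes of a plain dump and its compressed copy. This is a minimal sketch; savings_pct and savings_for_files are hypothetical helper names, not part of dump.

```shell
#!/bin/sh
# Percentage of space saved by compression: (1 - compressed/original) * 100.
savings_pct() {
    original=$1     # size of the uncompressed dump, in bytes
    compressed=$2   # size of the compressed dump, in bytes
    awk -v o="$original" -v c="$compressed" 'BEGIN { printf "%.1f\n", (1 - c / o) * 100 }'
}

# Same calculation for two existing files, e.g. var.dump and var.dump.gz.
savings_for_files() {
    savings_pct "$(wc -c < "$1")" "$(wc -c < "$2")"
}
```

Running savings_for_files on a small test dump of each partition is a cheap way to decide whether compression is worth the extra time there.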

Previously, I was using tar and bzip2 to back up my partitions:

/usr/bin/tar cyf /backups/etc.tar.bz2 /etc/
/usr/bin/tar cyf /backups/usrlocaletc.tar.bz2 /usr/local/etc/
/usr/bin/tar cyf /backups/varmail.tar.bz2 /var/mail/
/usr/bin/tar cyf /backups/vardb.tar.bz2 /var/db/
/usr/bin/tar cyf /backups/home.tar.bz2 /home/

The home partition took almost 24 hours to back up this way, so I was only backing it up twice a month. It also saved only about 12% compared to the uncompressed size of the partition. Testing with dump and gzip:

tethys# dump -0 -auLf - /usr/home | gzip -q > /backups/usrhome.dump.gz
  DUMP: Date of this level 0 dump: Mon May  4 21:34:33 2009           
  DUMP: Date of last level 0 dump: the epoch                          
  DUMP: Dumping snapshot of /dev/ad12s1d (/usr/home) to standard output
  DUMP: mapping (Pass I) [regular files]                               
  DUMP: mapping (Pass II) [directories]                                
  DUMP: estimated 259514698 tape blocks.                               
  DUMP: dumping (Pass III) [directories]                               
  DUMP: dumping (Pass IV) [regular files]                              
  DUMP: 1.02% done, finished in 8:05 at Tue May  5 05:46:16 2009       
  DUMP: 2.09% done, finished in 7:49 at Tue May  5 05:35:15 2009       
  DUMP: 3.23% done, finished in 7:29 at Tue May  5 05:19:51 2009       
  DUMP: 4.25% done, finished in 7:30 at Tue May  5 05:25:59 2009       
  DUMP: 5.22% done, finished in 7:34 at Tue May  5 05:34:51 2009       
  DUMP: 6.33% done, finished in 7:23 at Tue May  5 05:29:38 2009
...

The above took about 8 hours (as estimated in the first progress line, "finished in 8:05"). This is a drastic improvement. Running without compression at all reduced the run time to 6 hours. This is for a partition with nearly 300GB of data.
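
The block counts in dump's output can be turned into more familiar units. The /var log earlier reports 3970386 tape blocks in 174 seconds at 22818 KBytes/sec, which works out to 1 KB per tape block; the sketch below relies on that. The function names are made up for illustration.

```shell
#!/bin/sh
# Convert a count of 1 KB dump "tape blocks" to GiB.
blocks_to_gib() {
    awk -v b="$1" 'BEGIN { printf "%.1f\n", b / 1024 / 1024 }'
}

# Average throughput in KB/s, given blocks written and elapsed seconds.
throughput_kbs() {
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%d\n", b / s }'
}

blocks_to_gib 259514698      # size estimate for /usr/home from the log above
throughput_kbs 3970386 174   # the /var dump from the first example
```

The first call puts the /usr/home estimate at roughly 247 GiB, and the second reproduces the throughput figure dump printed for /var.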

Restore

Restore can be invoked in interactive and non-interactive modes. In general, unless you have had a complete hard drive failure, you will want to use the interactive mode.

Non-interactive restores

To perform a restore of all files in a backup set, follow this recipe:

  1. Change directory to the root of where you wish to restore. Restore will use the current directory as its starting point for extractions, so it is important to be in the correct directory when invoking restore.
  2. Invoke restore with the -r flag: restore -rf tapedev
  3. Get a cup of your favorite beverage while the restore runs.

Note that restore will not delete any existing files, but it will overwrite files in the target area even if they are newer than those in the backup set.
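
The recipe above can be sketched as a small wrapper. When incremental dumps are involved, the level 0 dump is restored first and each later level after it, so the function takes the dump files in that order. full_restore and DRY_RUN are illustrative names, and the paths in the example are hypothetical.

```shell
#!/bin/sh
# Restore one or more dump files, in the order given, into a target directory.
# With DRY_RUN=1 the commands are printed instead of executed.
# Use absolute paths for the dump files, since each restore runs after a cd.
full_restore() {
    target=$1
    shift
    for dumpfile in "$@"; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "cd $target && restore -rf $dumpfile"
        else
            (cd "$target" && restore -rf "$dumpfile") || return 1
        fi
    done
}

# Level 0 first, then the incremental made after it.
DRY_RUN=1 full_restore /mnt/var /backups/var.0.dump /backups/var.1.dump
```

restore -r leaves a restoresymtable file in the target directory to carry state between incremental passes; it can be removed once the last dump has been restored.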

Interactive Restores

To selectively restore only a subset of the backup set (for example, if you inadvertently deleted a particular file/directory and wish to retrieve only that file from the tape), use an interactive restore session.

To start an interactive restore, use the -i flag to restore: restore -if dumpfile. You will then see a restore> prompt, at which you can type interactive restore commands, including:

  • add [arg] - Adds [arg] to the list of files to be extracted from the backup.
  • cd [arg] - Use to navigate the backup set just like you'd use a shell to navigate a regular filesystem.
  • delete [arg] - Removes [arg] from the list of files to extract from the backup. Note that this won't actually cause any files to be removed from the disk or from the backup set; it only removes the file from the extract list.
  • extract - After you have used add to mark all the files and directories you want to recover, use this command to actually begin the recovery.
  • ls - List the contents of the backup archive. You can use this and the cd command above to view the contents of the backup set. Files and directories marked for extraction are marked with an x.
  • pwd - Use this to tell you where you are within the backup set.
  • quit - As you would expect, this quits the restore session immediately, without restoring any files.
  • verbose - Tells restore to be much more chatty about what it is doing during the actual extraction.

For example, say you want to restore a mailbox from the dump you did earlier on /var. The example below restores a copy into the directory holding the backup so you can verify it first.

tethys# cd /backups
tethys# restore -i -f var.dump
restore > ls
.:
.snap/           crash/           log/             quarantine/
account/         cron/            mail/            run/
agentx/          db/              mediatomb/       rwho/
at/              empty/           milter-greylist/ spool/
audit/           games/           msgs/            tmp/
backups/         heimdal/         named/           yp/
cache/           lib/             preserve/

restore > cd mail
restore > ls rnejdl
rnejdl
restore > add rnejdl
restore > extract
You have not read any tapes yet.
If you are extracting just a few files, start with the last volume
and work towards the first; restore can quickly skip tapes that
have no further files to extract. Otherwise, begin with volume 1.
Specify next volume #: 1
set owner/mode for '.'? [yn] n
restore > quit
tethys#

And now, in my /backups directory, I have a mail folder with my mailbox, rnejdl, in it:

tethys# ls -al mail/rnejdl
-rw-------  1 rnejdl  rnejdl  3184712 May  6 18:03 mail/rnejdl
tethys#

If your dump file is compressed, decompress it to standard output and have restore read from there instead:

tethys# bzcat var.dump.bz2 | restore -i -f -
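
The same idea works for the gzip-compressed dumps made earlier. As a minimal sketch, a helper can pick the decompressor from the file extension; decompressor_for and restore_interactive are made-up names, and the choice by extension is an assumption about how you name your dump files.

```shell
#!/bin/sh
# Pick a decompression command from the dump file's extension.
decompressor_for() {
    case $1 in
        *.gz)  echo "gzip -dc" ;;
        *.bz2) echo "bzip2 -dc" ;;
        *)     echo "cat" ;;      # plain, uncompressed dump
    esac
}

# Open an interactive restore session on a plain or compressed dump file.
restore_interactive() {
    $(decompressor_for "$1") "$1" | restore -i -f -
}
```

With this in place, restore_interactive usrhome.dump.gz and restore_interactive var.dump.bz2 both end up at the restore > prompt.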

Some additional observations

I did a lot of searching for what I should use. Dump appears to be one of the most robust solutions for Unix backups according to www.coredumps.de/doc/dump/zwicky/testdump.doc.html. Tar for me was too slow, especially for extraction. The restore command is insanely simple, even for someone who has never used it.

Compression is only advised on certain partitions, and I recommend gzip instead of bzip2 due to the amount of CPU and time involved. Hard drives are cheap. Buy a large hard drive for your backups. Keep at least 2 backups of everything. You never know when you will need to retrieve a prior version, or when the current version will get corrupted during creation somehow.
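
One way to keep a fixed number of backups around is to list everything older than the newest few and review that list before deleting. The sketch below assumes dump files carry a sortable date in their names (for example var.2009-05-06.dump.gz), which is a naming scheme I am inventing here, and that one directory holds the dumps of a single file system. stale_dumps is a hypothetical helper.

```shell
#!/bin/sh
# List the dump files in a directory that are older than the newest $keep,
# assuming file names embed a sortable date (e.g. var.2009-05-06.dump.gz).
# Nothing is deleted here; review the list before passing it to rm.
stale_dumps() {
    dir=$1
    keep=${2:-2}    # number of most recent dumps to keep
    ls "$dir" | grep '\.dump' | sort -r | awk -v keep="$keep" 'NR > keep'
}
```

Calling stale_dumps /backups/var 2 would print only the names that fall outside the two most recent dumps.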

An example backup script

#!/usr/bin/perl

#	Part of a group of programs to admin a FreeBSD web/email server.
#	Do what you want with it, just keep the copyright.
#
#	Copyright 2004, Erin Fortenberry
#
#	$Id: backup.pl,v 1.2 2004/08/26 21:01:19 erinf Exp $
#
#	usage: "scriptname -h"
#
#	I use /backup as my backup drive and it is nfs mounted.
#	Compression adds a lot to the backup time. You can remove it
#	by hand; I did not make a switch for it.
#
#	The current backup path is /backup/$hostname/$day.$name.dump.gz
#
#	This script does not check to see if the backup directories exist or
#	if they have enough space for the dump file.
#
#	Requires perl 5.6.1 or newer.
#



use strict;
use warnings;
use Getopt::Std;
use POSIX qw(strftime);
use vars qw($VERSION);

$Getopt::Std::STANDARD_HELP_VERSION = 1;
$VERSION = '1.2';

my @FS = ('/', '/home', '/usr', '/var');
my $day = lc(strftime "%A", localtime);
my $hostname = `/bin/hostname -s`;
my %opt = ('F' => 0, 'd' => 0, 'h' => 0);
my $type;

chomp $hostname;

getopts("Fdh",\%opt);

if ( $opt{h} == 1 ) {

	print STDERR << "EOF";

    usage: $0 [-Fdh]

     -h        : this (help) message
     -d        : Dry run, only print what I am going to do
     -F        : Force full backup {type 0}

    example: $0 -F -d

EOF
	exit(0);
}

if ( $opt{F} == 1 ) {

	$type = "0"

} else {

	if ($day eq "sunday") {
		$type = "0"
	} else {
		$type = "1"
	}

}

foreach (@FS) {

	my $name = $_;

	if ($name eq '/') {
		$name = '/root';
	};

	$name =~ s/^\///g;

# Uncomment the next two lines for /backup/$day/$name.dump.gz
#	my $command = '/sbin/dump -' . $type  . ' -auf - ' .  $_ . ' |
#	gzip -q > /backup/' . $day . '/' . $name . '.dump.gz';

# Comment out the next two lines if you uncomment the two lines above
	my $command = '/sbin/dump -' . $type  . ' -auf - ' .  $_ . ' |
	gzip -q > /backup/' . $hostname . '/' . $day . '.' . $name . '.dump.gz';

	if ($opt{d}) {
		print($command . "\n");
	} else {
		system($command);
	};
};
exit(0);