Wednesday, October 14, 2009

Remote Incremental Backup with rdiff-backup

This is a refined how-to distilled from the many how-tos about rdiff-backup. None of them solved my problem completely, and some describe the subject at such length that the reader loses interest. I have therefore compiled a short and practical how-to for users who would:

  1. Like to have remote backup
  2. Like to have incremental backup
  3. Like to do it automatically
  4. Like to save bandwidth
  5. Like to keep the folder on the backup server in the same format, without any compression
  6. Like to run the remote backup on a Linux machine

(This document is my own notes, with important portions taken from http://tombuntu.com/index.php/2008/11/18/a-guide-to-system-backup-and-restore-in-ubuntu/ and http://www.howtoforge.com/linux_rdiff_backup . Find out more about rdiff-backup's features here: http://www.nongnu.org/rdiff-backup/index.html This how-to is meant as a practical guide; it does not cover the theoretical background, which is treated in many other documents on the web.)

This document comes without warranty of any kind! This is not the only way of setting up such a system. There are many ways of achieving this goal, but this is the way I take. I do not guarantee that this will work for you!

Important notes for this document:

  1. Main server is server1.example.com
  2. Backup server is backup.example.com
  3. rdiff-backup needs to be installed on both machines (preferably the same version)
  4. The backup server will fetch the backup from server1.example.com; the main server has to do nothing.
  5. Throughout this document you may also use the IP address instead of the server name.

Automated Backups With rdiff-backup

This tutorial describes how to do automated server backups with the tool rdiff-backup. rdiff-backup lets you make backups over a network using SSH. The use of SSH keeps the transfer secure because the data is encrypted while it travels over the network. rdiff-backup makes incremental backups, so after the first run only the changes are transferred, which saves bandwidth.

The problem is that SSH requires a password for logging in, which is not good if you want to run rdiff-backup as a cron job, because a password prompt needs human interaction. For example, to back up the directory /boot of server1.example.com, you would type rdiff-backup root@server1.example.com::/boot boot on your backup server, which would try to save server1.example.com's directory /boot in backup.example.com's directory boot. Now this is what happens:

rdiff-backup@backup:~$ rdiff-backup root@server1.example.com::/boot boot
Password:
-----------------------------------------------------------------
Detected abilities for source (read only) file system:
Access control lists Off
Extended attributes Off
Mac OS X style resource forks Off
Mac OS X Finder information Off
-----------------------------------------------------------------
Warning: ownership cannot be changed on filesystem at boot/rdiff-backup-data
-----------------------------------------------------------------
Detected abilities for destination (read/write) file system:
Characters needing quoting ''
Ownership changing Off
Hard linking On
fsync() directories On
Directory inc permissions On
Access control lists Off
Extended attributes Off
Mac OS X style resource forks Off
Mac OS X Finder information Off
-----------------------------------------------------------------
rdiff-backup@backup:~$



You see, in line 2 you are asked for the root password of server1.example.com.

But fortunately there is a solution: the use of public keys. We create a pair of keys (on our backup server backup.example.com), one of which is saved in a file on the remote system (server1.example.com). Afterwards we will not be prompted for a password anymore when we run rdiff-backup. This also includes cron jobs which is exactly what we want.

The concept is that we initiate the backups of server1.example.com directly from backup.example.com; server1.example.com does not have to do anything to get backed up.


Step 1: Install rdiff-backup On server1.example.com And backup.example.com

First we have to install rdiff-backup on both server1.example.com and backup.example.com.

On Debian systems you can simply do that by running

apt-get install rdiff-backup

On other distributions the installation is different; on Fedora it might be something like

yum install rdiff-backup
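
Since both machines should preferably run the same rdiff-backup version, it can be worth comparing them once the packages are installed. The following is only a minimal sketch and not part of my original notes: the function names are made up for illustration, and the remote check assumes you can already reach server1.example.com over SSH (it will still ask for the password until the keys from the later steps are in place).

```shell
#!/bin/sh
# Sketch: compare the rdiff-backup version on the backup server and the main server.
# The function names below are hypothetical helpers, not part of rdiff-backup.

# Print the version string reported by the local rdiff-backup.
local_version() {
    rdiff-backup --version
}

# Print the version string reported by rdiff-backup on the remote host given as $1.
remote_version() {
    ssh "$1" rdiff-backup --version
}

# Succeed only when the two version strings are identical.
same_version() {
    [ "$1" = "$2" ]
}

# Usage (run on backup.example.com):
#   if same_version "$(local_version)" "$(remote_version root@server1.example.com)"; then
#       echo "versions match"
#   else
#       echo "versions differ" >&2
#   fi
```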

Step 2: Create The Keys On backup.example.com

On backup.example.com, create a new key pair with no passphrase for your user:

#ssh-keygen -t rsa

You will see something like this:

rdiff-backup@backup:~$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/backup/.ssh/id_rsa):
Created directory '/backup/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /backup/.ssh/id_rsa.
Your public key has been saved in /backup/.ssh/id_rsa.pub.
The key fingerprint is:
88:18:4e:55:e9:27:8e:2a:44:4b:03:bd:9d:0f:fc:48 rdiff-backup@backup

It is ok to save the key in /backup/.ssh/id_rsa so you can simply hit enter.

It is important that you do not enter a passphrase; otherwise the backup will not work without human interaction. So again, just hit enter.

In the end two files are created: /backup/.ssh/id_rsa and /backup/.ssh/id_rsa.pub.

Restrict the permissions of the directory holding these files so that only your user can access it:

#chmod -R go-rwx /backup/.ssh
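
If you prefer to skip the interactive prompts, ssh-keygen can be given the file name and an empty passphrase on the command line. This is a small sketch under my own assumptions: the function name make_backup_key is made up, and it refuses to overwrite a key that already exists.

```shell
#!/bin/sh
# Sketch: create a passphrase-less RSA key pair without answering any prompts.
# make_backup_key is a hypothetical helper name.
#
# make_backup_key KEYFILE
make_backup_key() {
    keyfile=$1
    keydir=$(dirname "$keyfile")

    # Create the key directory and keep it private to the owner.
    mkdir -p "$keydir" || return 1
    chmod 700 "$keydir" || return 1

    # Never overwrite an existing key.
    if [ -f "$keyfile" ]; then
        echo "key already exists: $keyfile" >&2
        return 0
    fi

    # -q: quiet, -t rsa: key type, -N "": empty passphrase, -f: output file
    ssh-keygen -q -t rsa -N "" -f "$keyfile"
}

# Usage (as the backup user on backup.example.com):
#   make_backup_key /backup/.ssh/id_rsa
```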

Step 3: Copy the public key to server1.example.com

The following command is to be run on the backup server and copies the public key to the main server.

#ssh-copy-id -i ~/.ssh/id_rsa.pub '-p 22 root@server1.example.com'

Here 22 is the SSH port number (check that the firewall allows it) and root is the user on the main server that the key will authorise. You may also use the IP address instead of server1.example.com.

This will look like this:

rdiff-backup@backup:~$ ssh-copy-id -i ~/.ssh/id_rsa.pub '-p 22 root@server1.example.com'
The authenticity of host 'server1.example.com (1.2.3.4)' can't be established.
RSA key fingerprint is c7:19:55:7a:54:ce:93:c8:b6:f9:0e:e3:65:24:64:11.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'server1.example.com' (RSA) to the list of known hosts.
Password:
Now try logging into the machine, with "ssh 'root@server1.example.com'", and check in:
.ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.
rdiff-backup@backup:~$

Once again you have to type in the root password of server1.example.com.

This command appends the public key of the user rdiff-backup to the file /root/.ssh/authorized_keys on the remote server server1.example.com.
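
If ssh-copy-id is not available, the same result can be reached by hand: copy id_rsa.pub to the main server by any means and append it to authorized_keys there. The sketch below shows that append step only; append_key is a made-up helper name, and it is meant to be run on server1.example.com as the user who should accept the key.

```shell
#!/bin/sh
# Sketch: append a public key to an authorized_keys file, once.
# append_key is a hypothetical helper name.
#
# append_key PUBKEY_FILE AUTHORIZED_KEYS_FILE
append_key() {
    pub=$1
    auth=$2

    # Make sure the target directory and file exist.
    mkdir -p "$(dirname "$auth")" || return 1
    touch "$auth" || return 1

    # -F: fixed string, -x: whole line must match, -q: no output.
    # Only append when the exact key line is not there yet.
    if ! grep -qxF -- "$(cat "$pub")" "$auth"; then
        cat "$pub" >> "$auth"
    fi

    # sshd may ignore key files that other users can write to.
    chmod 600 "$auth"
}

# Usage (on server1.example.com, after copying the key file over):
#   append_key /tmp/id_rsa.pub /root/.ssh/authorized_keys
```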


Step 4: Test logging in to the remote system without a password

On the backup server run the following command:

#ssh -p 22 root@server1.example.com

If you are able to log in to the main server without being asked for a password, CONGRATULATIONS! The key setup is done.
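
Because cron cannot type a password, it helps to test the login the way cron will see it. With the ssh option BatchMode=yes, ssh fails instead of prompting, so the exit status tells you whether the key is really being used. This is a sketch only; the function name is made up, and the 10 second timeout is an arbitrary choice of mine.

```shell
#!/bin/sh
# Sketch: check that key-based login works without any prompt.
# can_login_without_password is a hypothetical helper name.
#
# can_login_without_password USER@HOST [PORT]
can_login_without_password() {
    # BatchMode=yes: never ask for a password, fail instead.
    # ConnectTimeout=10: give up after 10 seconds (arbitrary value).
    # "true" is the command run on the remote side; it just exits successfully.
    ssh -o BatchMode=yes -o ConnectTimeout=10 -p "${2:-22}" "$1" true
}

# Usage (on backup.example.com):
#   if can_login_without_password root@server1.example.com 22; then
#       echo "key login works"
#   else
#       echo "key login failed, cron backups would hang or fail" >&2
#   fi
```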

Step 5: Change the permissions

Next, on server1.example.com, run

chmod -R go-rwx /root/.ssh

Then have a look at /etc/ssh/sshd_config. It should contain the lines

RSAAuthentication yes
PubkeyAuthentication yes

Restart ssh if you had to change /etc/ssh/sshd_config:

/etc/init.d/ssh restart
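
To see quickly whether the setting is present, you can grep the file instead of reading through it. The sketch below only looks for an explicit, uncommented PubkeyAuthentication yes line; the function name is made up. Keep in mind that OpenSSH enables public key authentication by default, so a missing line does not necessarily mean it is switched off.

```shell
#!/bin/sh
# Sketch: look for an explicit "PubkeyAuthentication yes" line in sshd_config.
# pubkey_auth_enabled is a hypothetical helper name.
#
# pubkey_auth_enabled [CONFIG_FILE]
pubkey_auth_enabled() {
    # -E: extended regex, -i: keywords are case-insensitive, -q: no output.
    # Lines starting with "#" are comments and do not match.
    grep -Eiq '^[[:space:]]*PubkeyAuthentication[[:space:]]+yes([[:space:]]|$)' \
        "${1:-/etc/ssh/sshd_config}"
}

# Usage (on server1.example.com):
#   pubkey_auth_enabled && echo "explicitly enabled"
```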

Step 6: Write your backup script

rdiff-backup’s options are pretty easy to configure. Be sure to read the examples page as well as the man page as you write your backup command. Here’s my backup script for backing up my home directory:


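#!/bin/sh
# Pull /home/arun from server1.example.com into /root/backup/arun over SSH (port 22)
# and print the session statistics when the run finishes.
rdiff-backup --print-statistics --remote-schema 'ssh -p 22 %s rdiff-backup --server' root@server1.example.com::/home/arun /root/backup/arun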

Save the file with the name arunbackup and make it executable (chmod +x arunbackup).
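
If you back up more than one folder, it may be handier to pull the settings into variables. The following is a sketch of the same command in that style, not a replacement for the script above; the variable names and the function name run_backup are my own invention, and the defaults are simply the values used in this document.

```shell
#!/bin/sh
# Sketch: the same rdiff-backup call with the settings in variables.
# All variable names and run_backup are hypothetical; adjust the defaults.
SOURCE_HOST=${SOURCE_HOST:-root@server1.example.com}  # user@host of the main server
SOURCE_DIR=${SOURCE_DIR:-/home/arun}                  # folder to back up
DEST_DIR=${DEST_DIR:-/root/backup/arun}               # folder on the backup server
SSH_PORT=${SSH_PORT:-22}                              # SSH port of the main server

run_backup() {
    # %s is replaced by rdiff-backup with the host part of the source.
    rdiff-backup --print-statistics \
        --remote-schema "ssh -p $SSH_PORT %s rdiff-backup --server" \
        "${SOURCE_HOST}::${SOURCE_DIR}" "$DEST_DIR"
}

# To use this as the actual cron script, add a final line such as:
#   run_backup || exit 1
```

A different folder can then be backed up without editing the file, for example SOURCE_DIR=/etc DEST_DIR=/root/backup/etc set in the environment before the script is called.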


Step 7: Automate it

Copy the file arunbackup into /etc/cron.hourly/, /etc/cron.daily/, /etc/cron.weekly/ or /etc/cron.monthly/, depending on how often the backup should run.

OR

You may also create your own cron job, if you have some experience with cron.
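
For the second option, a crontab entry can be added from the shell. This is a sketch under my own assumptions: the schedule (daily at 02:30), the script location /root/arunbackup and both function names are examples only.

```shell
#!/bin/sh
# Sketch: add a daily cron job for the backup script to the current user's crontab.
# cron_line and install_cron_line are hypothetical helper names.

# cron_line SCRIPT_PATH: print a crontab entry that runs SCRIPT_PATH daily at 02:30.
cron_line() {
    printf '30 2 * * * %s\n' "$1"
}

# install_cron_line SCRIPT_PATH: append the entry unless it is already present.
install_cron_line() {
    line=$(cron_line "$1")
    # An empty or missing crontab makes "crontab -l" fail; treat that as empty.
    current=$(crontab -l 2>/dev/null)

    case "$current" in
        *"$line"*) return 0 ;;  # entry already there, nothing to do
    esac

    # Write the old entries (if any) plus the new line back via "crontab -".
    {
        [ -n "$current" ] && printf '%s\n' "$current"
        printf '%s\n' "$line"
    } | crontab -
}

# Usage (as root on backup.example.com; the path is an example):
#   install_cron_line /root/arunbackup
```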

Step 8: View the log

After the first run there will be one more folder inside the backup folder, named “rdiff-backup-data”.

View the file named backup.log in that folder.

This file will give you the complete statistics you require.
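
To look at the most recent entries without opening the whole file, tail is enough. A small sketch, with a made-up function name and an arbitrary default of 20 lines:

```shell
#!/bin/sh
# Sketch: show the last lines of rdiff-backup's log for one backup folder.
# show_backup_log is a hypothetical helper name.
#
# show_backup_log BACKUP_DIR [LINES]
show_backup_log() {
    log="$1/rdiff-backup-data/backup.log"

    if [ ! -r "$log" ]; then
        echo "no readable log at $log" >&2
        return 1
    fi

    # Default to the last 20 lines when no count is given.
    tail -n "${2:-20}" "$log"
}

# Usage (on backup.example.com):
#   show_backup_log /root/backup/arun 50
```

rdiff-backup --list-increments /root/backup/arun additionally lists the increments stored for that folder.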


For more examples, have a look at http://www.nongnu.org/rdiff-backup/examples.html.