For too long now I have been lazy with my backup procedure, which was normally a quick rsync to a different hard drive. This is hardly ideal and up until now I have been quite lucky. I also tend to have a blast every so often and burn some bits to CD, but I have been doing a lot of work lately and I know what it's like to lose a few days' worth of it.
For those with the luxury of a large LaCie drive or a decent tape drive, deciding what to back up is relatively easy. I decided to limit myself to a single CD-RW, which is a measly 650MB. I did this because I am a hoarder and it was about time I cleaned house. Besides, if push comes to shove I have a 20GB drive that is not plugged in due to noise, so if the backup takes more than 650MB I will use it rather than risk losing the data.
I have four users on this machine that I use to do all my work. All other users are used by the system and for the most part I am not worried about the data they generate. I cannot back up each user's entire home directory because there is too much data, so I limited myself to certain directories within each user's home directory.
The first thing I did was create an area where the backups are going to take place. I have 19GB free in one partition on a SATA drive so that's where it's going.
The basic idea is as follows.
1. Determine which directories in each user's home directory are important.
2. Determine if there are any files outside of these directories that are important.
3. Move those files inside one of the directories.
4. Weed out any crap from the directories: either delete it or move it to another area on disk.
5. Write the backup script.
For step 2 there are some files outside these directories that I would like backed up. Some examples are:
/boot
/etc/
/var/spool/cron
/var/spool/mail
The backup script itself.
The backup script does the following.
rsync -av user1/dir1 rsync_dir/user1/
rsync -av user1/dir2 rsync_dir/user1/
rsync -av user1/dir3 rsync_dir/user1/
tar -czvf tar_dir/user1.tar.gz -C rsync_dir/user1 dir1 dir2 dir3
………
……… DO THE SAME FOR ALL USERS
………
rsync -av /var/spool/cron/crontabs rsync_dir/system/
rsync -av /var/spool/mail rsync_dir/system/
……… rsync each system directory
tar -czvf tar_dir/system.tar.gz -C rsync_dir/system .
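The per-user steps above could be folded into one function, roughly as sketched below. This is only an illustration: the function name backup_user and its argument order are made up for the example, and it assumes the listed directories sit directly under each user's home directory.

```shell
#!/bin/sh
# Sketch only: backup_user and its arguments are illustrative names,
# not part of the script described in this post.

# backup_user HOME_DIR RSYNC_DIR TAR_DIR USER DIR...
# Copies each DIR from HOME_DIR/USER into RSYNC_DIR/USER, then packs the
# copies into TAR_DIR/USER.tar.gz. Each DIR is assumed to be a top-level
# directory name inside the user's home directory.
backup_user() {
    home_dir=$1
    rsync_dir=$2
    tar_dir=$3
    user=$4
    shift 4

    mkdir -p "$rsync_dir/$user" "$tar_dir" || return 1

    for dir in "$@"; do
        # No trailing slash on the source, so the directory itself is copied.
        rsync -a "$home_dir/$user/$dir" "$rsync_dir/$user/" || return 1
    done

    # -C makes tar store paths relative to the staging directory.
    tar -czf "$tar_dir/$user.tar.gz" -C "$rsync_dir/$user" "$@"
}

# Example call (paths are placeholders):
# backup_user /home /backup/rsync_dir /backup/tar_dir user1 dir1 dir2 dir3
```

Calling it once per user keeps the list of directories in one place, which makes it easier to see what is and is not being backed up.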
I have deliberately created separate tar.gz files because it's easier and faster to extract them on an individual basis when we just want a couple of files out of the backup. One thing to note is that when you tar the files up you want the paths inside the archive to match the paths of the original files on disk, not the rsynced copies that we just made. This is for sanity checking later.
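A quick way to see which paths actually ended up in an archive is to list it with tar -t. The helper below is a small sketch of that idea; the name archive_has_prefix is made up for the example, and it simply reports whether any stored path starts with a given prefix such as the staging directory name.

```shell
#!/bin/sh
# Sketch only: archive_has_prefix is an illustrative helper name.

# archive_has_prefix ARCHIVE PREFIX
# Succeeds if any path stored in ARCHIVE starts with PREFIX.
archive_has_prefix() {
    tar -tzf "$1" |
        awk -v p="$2" 'index($0, p) == 1 { found = 1 } END { exit !found }'
}

# Example: warn if the staging directory name leaked into the archive.
# if archive_has_prefix tar_dir/user1.tar.gz rsync_dir/; then
#     echo "user1.tar.gz contains staging paths" >&2
# fi
```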
To get the tar files onto a CD we first need to make an ISO image of them as follows.
mkisofs -r -J -l -o backup.iso tar_dir/
Once the ISO has been written we need to make sure that it is not too big (under 650MB, check with "ls -lah") and then we can burn it to our CD-RW as follows.
cdrecord -v blank=fast
cdrecord -v speed=8 dev=1,5,0 backup.iso
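Since the script runs unattended, the size check done by eye with ls could also be done by the script before it burns anything. The following is a minimal sketch under a couple of assumptions: the limit is taken as 650 * 1024 * 1024 bytes, and the names CD_LIMIT_BYTES and fits_on_cd are invented for the example. The real capacity of a given disc may differ slightly, so leave some headroom.

```shell
#!/bin/sh
# Sketch only: CD_LIMIT_BYTES and fits_on_cd are illustrative names.

# Assumed capacity: 650 * 1024 * 1024 bytes. Override to leave headroom.
CD_LIMIT_BYTES=${CD_LIMIT_BYTES:-681574400}

# fits_on_cd FILE
# Succeeds if FILE exists and is no larger than CD_LIMIT_BYTES.
# Returns 2 if FILE is missing, 1 if it is too big.
fits_on_cd() {
    [ -f "$1" ] || return 2
    # wc -c prints the size in bytes; tr strips any padding spaces.
    size=$(wc -c < "$1" | tr -d ' ')
    [ "$size" -le "$CD_LIMIT_BYTES" ]
}

# Example: only burn when the image fits.
# fits_on_cd backup.iso || { echo "backup.iso is too big" >&2; exit 1; }
```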
Note that I am blanking the CD-RW before I write to it. The above script is now managed by cron, so I no longer need to worry too much about the backup as long as I have a CD-RW in the drive. The next step is to check that the backup actually worked.
The best way to do this is to use tar.
From the man page:
-d, --diff, --compare
find differences between archive and file system
cd rsync_dir/user1/
mount /cdrom/
tar -zdf /cdrom/user1.tar.gz
If you don't see any output then the backup matches the file system. If you are unsure whether anything happened, edit a file on the file system and try it again. You would get a message similar to this if all you do is change the mod time of the file:
bin/document_parser.pl: Mod time differs
bin/indexer.pl: Mod time differs
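For a check that cron can act on, the exit status of tar is more useful than reading the output. The sketch below assumes GNU tar, which exits with a non-zero status in compare mode when something differs; the function name verify_archive is made up for the example.

```shell
#!/bin/sh
# Sketch only: verify_archive is an illustrative name. Assumes GNU tar.

# verify_archive ARCHIVE DIR
# Compares the contents of ARCHIVE against the files under DIR.
# Prints any differences tar finds and returns tar's exit status,
# which is non-zero when a file differs or is missing.
verify_archive() {
    tar -dzf "$1" -C "$2"
}

# Example, using the paths from this post:
# mount /cdrom/
# verify_archive /cdrom/user1.tar.gz rsync_dir/user1/ || echo "backup differs" >&2
```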
The next thing to do is start rotating the media, so I am off into town tomorrow during my lunch break to get a couple more rewritable CDs.
The above is a very simplified version of what I have done. There are lots of options to rsync and tar that can make things much easier so go have a look. I also have some websites not on the local machine that I am backing up manually.
I also need to get myself another large disk for the machine. I would ideally like to use SCSI but it's very expensive. Another big SATA drive may just be what I am after, or perhaps one of those LaCie drives…