Backup all sub-directories with a Bash array loop

April 15, 2010

I manage lots of domains, and offer my clients free backup and recovery service. Nice selling perk, but I'd best be damn sure I am backing things up regularly. Since there is no way my space-cadet brain would remember that, I rely on my ‘nix friends bash, cron and tar to neatly package every sub-directory of my webroot into its own little tarball. The bash script included after the break reads all directories into an array that we can loop through and manipulate as needed.


My webroot is pretty straightforward. Each subdirectory is the full URL of a site. Example:

    webroot
    |-- (one sub-directory per site, named with the site's full URL)
    `-- ...

So I want to take each directory and create a gzipped tarball using tar -zcf. I don’t want any manual intervention. Also, I want to exclude some directories.
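As a quick illustration of the tar -zcf step on its own, here is a minimal sketch. The site name example.com and the temporary paths are made up for the demo; only the tar flags come from the text above.

```shell
#!/bin/bash
# Sketch: archive one directory with tar -zcf, then list the result.
# "example.com" and the temp paths are hypothetical demo names.
work=$(mktemp -d)
mkdir -p "$work/webroot/example.com"
echo "hello" > "$work/webroot/example.com/index.html"

# -z gzip, -c create, -f write to the named file.
# The subshell keeps the cd from leaking into the rest of the script.
( cd "$work/webroot" && tar -zcf "$work/example.com.tar.gz" ./example.com )

# -t lists the archive contents without extracting anything
listing=$(tar -tzf "$work/example.com.tar.gz")
echo "$listing"
```

Listing the archive right after creating it is a cheap way to confirm the tarball is readable before you rely on it.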


This actually took me much longer than it should have. Too many articles insisted you should change the internal field separator (IFS) to a newline character and just use ls. Now in my training I was learn-ed that messing with IFS can be dangerous and will disrupt many things, including any command that relies on options or arguments (i.e. ALL). So rather than make my entire script conform to the pesky newlines returned by ls, I did something simpler: swap the newlines for a decent token separator, like a space! The default IFS already splits the output of ls on spaces, tabs and newlines, so the loop gets one directory name per token as long as no name contains whitespace. To exclude directories I just use a magic file “.DONT_BACKUP” that the script checks for.
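For comparison, here is a rough sketch of building the directory list as a real bash array from a glob, which avoids parsing ls and leaves IFS alone. It is an alternative, not the approach used in the script below; list_backup_dirs, BACKUP_DIRS and the demo site names are hypothetical, and it does not apply the lowercase-and-dots name filter that the script's grep does.

```shell
#!/bin/bash
# list_backup_dirs is a hypothetical helper, not part of the original script.
# It fills the global array BACKUP_DIRS with the name of every sub-directory
# of $1 that does not contain a .DONT_BACKUP file.
list_backup_dirs() {
    local root=$1 path
    BACKUP_DIRS=()
    for path in "$root"/*/; do
        [ -d "$path" ] || continue               # glob matched nothing
        path=${path%/}                           # strip the trailing slash
        [ -f "$path/.DONT_BACKUP" ] && continue  # honour the ignore file
        BACKUP_DIRS+=("${path##*/}")             # keep only the directory name
    done
}

# Demo on a throwaway tree (site names are made up)
demo=$(mktemp -d)
mkdir -p "$demo/example.com" "$demo/example.org" "$demo/my site.net"
touch "$demo/example.org/.DONT_BACKUP"

list_backup_dirs "$demo"
for DIR in "${BACKUP_DIRS[@]}"; do
    echo "would back up: $DIR"
done
```

Because each array element is quoted when it is expanded, a name with a space in it survives intact, which the ls-based loop cannot promise.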



    #!/bin/bash
    # Backup all directories within webroot
    # use empty file ".DONT_BACKUP" to exclude any directory

    # days to retain backup. Used by recycler script
    DEF_RETAIN=14
    LOGFILE=/home/webb_e/site_backups/WebrootBackup.log

    # name of backup source subfolder under the user's home
    WEBDIR=webroot          # example value; use your own folder name
    # name of dest folder for tar files
    DESDIR=site_backups     # example value; matches the LOGFILE path above

    # alright, that's it for config, the rest is script
    #########################################

    cd "${HOME}/${WEBDIR}/" || exit 1

    TODAY=$(date)
    BU_FILE_COUNT=0
    suffix=$(date +%m-%d-%Y)
    printf "\n\n********************************************\n\tSite Backup r Log for:\n\t" | tee -a "$LOGFILE"
    echo "$TODAY" | tee -a "$LOGFILE"
    printf "********************************************\n" | tee -a "$LOGFILE"
    echo "see ${LOGFILE} for details"

    for DIR in $(ls | grep '^[a-z.]*$')
    do
        echo "$DIR"
        if [ -f "$DIR/.DONT_BACKUP" ]
        then
            printf "\tSKIPPING %s as it contains ignore file\n" "$DIR" | tee -a "$LOGFILE"
        else
            cpath="${HOME}/${DESDIR}/${DIR}"
            # check if we need to make path
            if [ ! -d "$cpath" ]
            then
                echo "Creating $cpath"
                mkdir -p "$cpath"
                echo "$DEF_RETAIN" > "$cpath/.RETAIN_RULE"
            fi
            # directory exists, we're good to continue: tar the current directory
            tar -zcf "${cpath}/${DIR}_${suffix}.tar.gz" "./$DIR"
            BU_FILE_COUNT=$((BU_FILE_COUNT + 1))
        fi
    done

    printf "\n\n********************************************\n" | tee -a "$LOGFILE"
    echo "$BU_FILE_COUNT sites were backed up"
    printf "********************************************\n" | tee -a "$LOGFILE"


    $ ~/scripts/file_backups/
    see /home/myself/site_backups/WebrootBackup.log for details

        Site Backup r Log for:
        Thu Apr 15 14:31:53 PDT 2010
        SKIPPING as it contains ignore file
    Creating /home/myself/site_backups/

    33 sites were backed up

But wait! We wanted to automate this whole thing right? And so we shall.

Using Cron to automate the process

If you’re using a webhost, they likely provide a GUI to add cron jobs. If that’s the case, you can just point to the full path where you saved the above script, select the interval, and you’re good to go. If you’re using this on your own server, you’ll need to get your hands dirty with a crontab. You can open the crontab file in your editor of choice, or call it from the command line. In this example we’ll rely on vi, my system’s default editor. Create a crontab file if it does not already exist and open it for editing:

crontab -e

You may see some existing lines or you may not. Just remember: one job per line. The layout may seem overwhelming at first, but it’s quite simple, and breaks down like this:

min hour day month weekday job_to_Run

The values fall in the respective ranges below; for day of week, 0 is Sunday.

0-59 0-23 1-31 1-12 0-6 filename

To leave a field unrestricted, replace it with an asterisk (*), which means all values. Alternatively, you may use comma-separated lists. Although I believe cron will treat any whitespace as a delimiter, I use tabs to make the organization a little nicer. So let’s suppose I want to run this job nightly; it is, after all, named DAILY sql backup :) I will add the following line to my crontab:

15 0 * * * $HOME/scripts/ > logfile.log

This means that every day at 00:15, a.k.a. 15 minutes past midnight, cron will run the script and write its standard output to the specified logfile. If you leave off the redirect to the logfile, cron will email the output to the user. To discard all output, use the handy standby

> /dev/null 2>&1
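To see what the 2>&1 part buys you, here is a small sketch you can run outside of cron. noisy_job is a made-up stand-in for the backup script that writes one line to stdout and one to stderr.

```shell
#!/bin/bash
# noisy_job is a hypothetical stand-in for the backup script:
# one line on stdout, one line on stderr.
noisy_job() {
    echo "normal output"
    echo "error output" >&2
}

log=$(mktemp)

# stdout is sent to the file first, then 2>&1 points stderr at the same place,
# so both lines end up in the log
noisy_job > "$log" 2>&1
cat "$log"

# Same idea with /dev/null: both streams are discarded,
# so nothing is left over to capture
silent=$(noisy_job > /dev/null 2>&1)
echo "leftover output: '${silent}'"
```

Order matters here: written the other way round (2>&1 > file), stderr is pointed at the original stdout before stdout moves to the file, so errors would still reach cron and get emailed.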


More Help

If you are curious about the script that will actually recycle old backups, then I suggest Log Recycler Script, which could easily be updated to handle .tar.gz files instead of .sql.gz files :) If you want to back up your MySQL databases, I have that too: Shell script to backup multiple databases.
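To give a feel for how the .RETAIN_RULE file could be used, here is a rough sketch of a recycler for the tarballs. It is not the Log Recycler Script mentioned above; recycle_backups and the demo file names are hypothetical, and it assumes a find that supports -maxdepth and -delete (GNU and BSD find both do).

```shell
#!/bin/bash
# recycle_backups is a hypothetical helper, not the linked recycler script.
# For each sub-directory of $1 it reads a day count from .RETAIN_RULE
# (falling back to 14 if the file is missing or not a plain number)
# and deletes *.tar.gz files older than that many days.
recycle_backups() {
    local dest=$1 dir days
    for dir in "$dest"/*/; do
        [ -d "$dir" ] || continue
        days=14
        [ -f "${dir}.RETAIN_RULE" ] && days=$(cat "${dir}.RETAIN_RULE")
        case $days in
            ''|*[!0-9]*) days=14 ;;   # ignore anything that is not digits
        esac
        # -mtime +N matches files modified more than N whole days ago
        find "$dir" -maxdepth 1 -type f -name '*.tar.gz' -mtime +"$days" -print -delete
    done
}

# Demo on a throwaway tree (names are made up)
demo=$(mktemp -d)
mkdir -p "$demo/example.com"
echo 14 > "$demo/example.com/.RETAIN_RULE"
touch -t 200001010000 "$demo/example.com/example.com_01-01-2000.tar.gz"  # very old
touch "$demo/example.com/example.com_today.tar.gz"                        # fresh

recycle_backups "$demo"
```

Try it against a copy of your backup folder first, and swap -delete for a plain -print until you trust what it matches.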
