When I wrote the article that introduced a script to generate MySQL backup files for multiple databases, I mentioned the trouble that will occur if you don’t find some means to retire old files. This applies to log files, MySQL backups, or just about any other type of file that is created on a recurring basis. You don’t need an error log from 134 days ago, but error logs for the past week could be very useful. So what do you do? Why, recycle of course. This article shares a simple shell script to purge any files older than X days, where X is a number you choose. It is very simple to use a shell script to delete log files or, in this example, SQL backups.
Your server is being overrun with numerous files that just hang around long after they have served their useful life. These files may be small or large, but something about leaving unused files hanging around doesn’t feel right. After just a few days of MySQL backups I end up with a directory structure like this:
sql_dumps/
|-- edwardawebb.com
|   |-- edwardawebb_wordpress_01-13-2009.sql.gz
|   |-- edwardawebb_wordpress_01-14-2009.sql.gz
|   |-- edwardawebb_wordpress_01-15-2009.sql.gz
|   |-- edwardawebb_wordpress_01-16-2009.sql.gz
|   |-- edwardawebb_wordpress_01-17-2009.sql.gz
|   |-- edwardawebb_wordpress_01-18-2009.sql.gz
|   `-- edwardawebb_wordpress_01-19-2009.sql.gz
|-- mantis.mainsite.org
|   |-- mainsite_mantis_01-13-2009.sql.gz
|   |-- mainsite_mantis_01-14-2009.sql.gz
|   |-- mainsite_mantis_01-15-2009.sql.gz
|   |-- mainsite_mantis_01-16-2009.sql.gz
|   |-- mainsite_mantis_01-17-2009.sql.gz
|   |-- mainsite_mantis_01-18-2009.sql.gz
|   `-- mainsite_mantis_01-19-2009.sql.gz
`-- taskfreak.mainsite.org
    |-- mainsite_taskfreak_01-11-2009
    |-- mainsite_taskfreak_01-11-2009.sql.gz
    |-- mainsite_taskfreak_01-12-2009.sql.gz
    |-- mainsite_taskfreak_01-13-2009.sql.gz
    |-- mainsite_taskfreak_01-14-2009.sql.gz
    |-- mainsite_taskfreak_01-15-2009.sql.gz
    |-- mainsite_taskfreak_01-16-2009.sql.gz
    |-- mainsite_taskfreak_01-17-2009.sql.gz
    |-- mainsite_taskfreak_01-18-2009.sql.gz
    `-- mainsite_taskfreak_01-19-2009.sql.gz
Although 24 files may seem manageable, those who deal with log files and multiple sites know that this can quickly get out of hand.
We lazily create a shell script to run at weekly intervals, purging all those older files and sending them off to the bit bucket. Only files older than X days should be deleted; we’ll leave all the fresh and potentially needed logs/backups in place. This example assumes MySQL backups with the .sql or .sql.gz extensions.
#!/bin/bash
#if you use this script you must attribute to me Eddie - Edwardawebb.com 1/14/09
#this script will run through all nested directories of a parent just killing off all matching files.
#default number of days to keep files
DEFRETAIN=60
#want to append the activity to a log? good idea, add its location here
LOGFILE="$(pwd)/Recycler.log"
EXTENSION=sql
#the absolute path of folder to begin purging
#this is the top most folder to begin the attack, all sub directories containing only lowercase letters and periods are fair game.
SQLDIR="$HOME/sql_dumps"
#this note will remind you that you have a log in case you're getting emails from a cron job or something
echo "see $LOGFILE for details"
#jump to working directory, and stop if it is missing so nothing gets deleted elsewhere
cd "$SQLDIR" || exit 1
#if your sub-dirs have some crazy characters you may adjust this regex
DIRS=$(ls | grep '^[a-z.]*$')
TODAY=$(date)
printf "\n\n********************************************\n\tSQL Recycler Log for:\n\t" | tee -a "$LOGFILE"
echo $TODAY | tee -a "$LOGFILE"
printf "********************************************\n" | tee -a "$LOGFILE"
for DIR in $DIRS
do
    #skip plain files that happen to match the directory regex
    [ -d "$DIR" ] || continue
    pushd "$DIR" >/dev/null
    HERE=$(pwd)
    printf "\n\n%s\n" "$HERE" | tee -a "$LOGFILE"
    if [ -f .RETAIN_RULE ]
    then
        printf "\tdefault Retain period being overridden\n" | tee -a "$LOGFILE"
        #first word of the first line is the value, anything after it (a comment) is ignored
        read -r RETAIN _ < .RETAIN_RULE
    else
        RETAIN=$DEFRETAIN
    fi
    printf "\tpurging files older than %s days\n" "${RETAIN}" | tee -a "$LOGFILE"
    #match files whose path contains .$EXTENSION (covers .sql and .sql.gz)
    #note: the list is split on whitespace below, so file names must not contain spaces
    OLDFILES=$(find . -type f -mtime +"${RETAIN}" -regex ".*\.${EXTENSION}.*")
    set -- $OLDFILES
    if [ -z "$1" ]
    then
        printf "\tNo files matching purge criteria\n" | tee -a "$LOGFILE"
    else
        printf "\tSQL Files being Delete from %s\n" "$HERE" | tee -a "$LOGFILE"
        printf "\t\t%s\n" $OLDFILES | tee -a "$LOGFILE"
    fi
    rm -f $OLDFILES
    if [ $? -ne 0 ]
    then
        echo "Error while deleting last set" | tee -a "$LOGFILE"
        exit 2
    else
        printf "\tSuccess\n" | tee -a "$LOGFILE"
    fi
    popd >/dev/null
done
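Before letting any purge loose on real backups, it can help to preview what would be removed. The following is only a sketch and not part of the script above: purge_old and its arguments are hypothetical names, it assumes a find that supports -print0 (GNU and BSD find do), and it matches by file name rather than by the regex used above. Keep in mind that find ignores fractions of a day, so -mtime +4 only matches files that are at least five full days old.

```shell
#!/bin/bash
# Hypothetical helper, not part of the original script.
# Usage: purge_old DIR DAYS EXT [DRY_RUN]
# Lists (DRY_RUN=1, the default) or deletes (DRY_RUN=0) regular files under DIR
# whose names contain ".EXT" and that were modified more than DAYS days ago.
purge_old() {
    local dir=$1 days=$2 ext=$3 dry_run=${4:-1}
    local file
    # -print0 together with read -d '' keeps names containing spaces intact
    find "$dir" -type f -mtime +"$days" -name "*.${ext}*" -print0 |
    while IFS= read -r -d '' file
    do
        if [ "$dry_run" -eq 1 ]
        then
            printf 'would delete: %s\n' "$file"
        else
            rm -f -- "$file" && printf 'deleted: %s\n' "$file"
        fi
    done
}

# Example: preview first, then purge for real
# purge_old "$HOME/sql_dumps/edwardawebb.com" 4 sql
# purge_old "$HOME/sql_dumps/edwardawebb.com" 4 sql 0
```

Defaulting to a dry run means a mistyped directory or retention value shows up as a harmless listing instead of missing backups.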
Did you notice the bit about .RETAIN_RULE? Good! I added this after I realized that I don’t treat all my sites equally. For this very blog, which is backed up daily, I only need 3-4 days back at most. But for other sites that I back up monthly, I need to keep the default 60 days, or 1-2 files. So I set the default in the script to 60, but I allow it to be overridden by adding a simple text file to any directory. If a file named .RETAIN_RULE is present, the script reads its first line (and first line only!) for a new value, for example:
5 #only keep files in this single directory around for 5 days
Notice that I comment after the actual data! This means my actual directory structure, including retain rules, looks more like:
#tree -a sql_dumps
sql_dumps/
|-- edwardawebb.com
|   |-- .RETAIN_RULE
|   |-- edwardawebb_wordpress_01-13-2009.sql.gz
|   |-- edwardawebb_wordpress_01-14-2009.sql.gz
|   |-- edwardawebb_wordpress_01-15-2009.sql.gz
|   |-- edwardawebb_wordpress_01-16-2009.sql.gz
|   |-- edwardawebb_wordpress_01-17-2009.sql.gz
|   |-- edwardawebb_wordpress_01-18-2009.sql.gz
|   `-- edwardawebb_wordpress_01-19-2009.sql.gz
|-- mantis.mainsite.org
|   |-- .RETAIN_RULE
|   |-- mainsite_mantis_01-13-2009.sql.gz
|   |-- mainsite_mantis_01-14-2009.sql.gz
|   |-- mainsite_mantis_01-15-2009.sql.gz
|   |-- mainsite_mantis_01-16-2009.sql.gz
|   |-- mainsite_mantis_01-17-2009.sql.gz
|   |-- mainsite_mantis_01-18-2009.sql.gz
|   `-- mainsite_mantis_01-19-2009.sql.gz
`-- taskfreak.mainsite.org
    |-- mainsite_taskfreak_01-11-2009
    |-- mainsite_taskfreak_01-11-2009.sql.gz
    |-- mainsite_taskfreak_01-12-2009.sql.gz
    |-- mainsite_taskfreak_01-13-2009.sql.gz
    |-- mainsite_taskfreak_01-14-2009.sql.gz
    |-- mainsite_taskfreak_01-15-2009.sql.gz
    |-- mainsite_taskfreak_01-16-2009.sql.gz
    |-- mainsite_taskfreak_01-17-2009.sql.gz
    |-- mainsite_taskfreak_01-18-2009.sql.gz
    `-- mainsite_taskfreak_01-19-2009.sql.gz
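The per-directory override can also be looked at in isolation. Here is a minimal sketch, assuming the rule file holds a whole number of days as the first word of its first line; read_retain is a hypothetical name, and the fallback to the default for a missing or non-numeric value is my own assumption rather than something the script does.

```shell
#!/bin/bash
# Hypothetical helper, not part of the original script.
# Usage: read_retain DIR DEFAULT
# Prints the retention period in days for DIR: the first word of the first line
# of DIR/.RETAIN_RULE if it is a whole number, otherwise DEFAULT.
read_retain() {
    local dir=$1 default=$2 value rest
    if [ -f "$dir/.RETAIN_RULE" ]
    then
        # value gets the first word, rest swallows any trailing comment;
        # read returns non-zero when the line has no final newline, but value is still set
        read -r value rest < "$dir/.RETAIN_RULE" || true
    fi
    case $value in
        ''|*[!0-9]*) printf '%s\n' "$default" ;;  # no rule file, empty, or not a whole number
        *)           printf '%s\n' "$value" ;;
    esac
}

# Example:
# RETAIN=$(read_retain "$HOME/sql_dumps/edwardawebb.com" 60)
```

Rejecting anything that is not a plain number keeps a stray comment or typo in a rule file from ending up inside the find command.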
So as the script walks through the structure above, it prints a log to the effect of:
see /home//sql_dumps/Recycler.log for details

    SQL Recycler Log for:
    Sun Feb 8 00:00:07 PST 2009

/home/MYUSERNAME/sql_dumps/edwardawebb.com
    default Retain period being overridden
    purging files older than 4 days
    SQL Files being Delete from /home/masterkeedu/sql_dumps/edwardawebb.com
        ./edwardawebb_wordpress_01-28-2009.sql.gz
        ./edwardawebb_wordpress_02-03-2009.sql.gz
        ./edwardawebb_wordpress_01-29-2009.sql.gz
        ./edwardawebb_wordpress_02-02-2009.sql.gz
        ./edwardawebb_wordpress_01-31-2009.sql.gz
        ./edwardawebb_wordpress_01-30-2009.sql.gz
        ./edwardawebb_wordpress_02-01-2009.sql.gz
    Success

/home/MYUSERNAME/sql_dumps/mantis.mainsite.org
    default Retain period being overridden
    purging files older than 4 days
    SQL Files being Delete from /home/masterkeedu/sql_dumps/mantis.mainsite.org
        ./webbmaster_mantis_01-30-2009.sql.gz
        ./webbmaster_mantis_01-31-2009.sql.gz
        ./webbmaster_mantis_02-01-2009.sql.gz
        ./webbmaster_mantis_01-27-2009.sql.gz
        ./webbmaster_mantis_01-29-2009.sql.gz
        ./webbmaster_mantis_02-02-2009.sql.gz
        ./webbmaster_mantis_01-28-2009.sql.gz
    Success

/home/MYUSERNAME/sql_dumps/taskfreak.mainsite.org
    purging files older than 60 days
    No files matching purge criteria
    Success
As with any article I welcome feedback or questions!