Thursday, October 21, 2010

pg_hotbackup utility for Backup of PG

Users always look for a Backup utility which can give some Good options and a utility can be use with all the instances of PG to take the backup.

I have thought in same way and created a pg_hotbackup script. pg_hotbackup utility which I worked on is now a server side utility which takes backup on server side and keep the backups in Backup directory.

Options which I have included in it, are following:
Compulsory Options:
-H  (This is for Bin Directory location, utility will use psql command of PG Instance )
-b  (Directory where user wants to keep the backup)
-p  (Port number of PG Instance)
-U username (Username)
-P passwd   (Password).
Some other options:
-a: Archive Only Option [1|0]
-r: Retention Policy [ in days ]
-l: List Backups
-n: Backup File Name
-v: Validate Only [1|0]
-R: Retention Only [|0]
\?: Help
So, I have all the options with me.

Now, lets understand what do we need:
1. We need a catalog file, in which utility can keep the backups information and validate the information as per new backups and retention policy.
If same binaries are getting used, I have preferred to keep that catalog file in Binary location.

2. A function, which can update the catalog file with each backup of archive/hotbackup backup:
Following is a code which can help to achieve this objective
 updatecatalog()
  {
 BCKLIST=`ls -1 $BIN/.$port*` 2>/dev/null
 if [ $? -ne 0 ];then
                echo "No Catalog Found! Please check the Home Director and Backup Directory"
                exit 1
        fi
 if [ -f /tmp/list ];then
  echo "Nothing to Update"
  exit
 fi
 for i in `cat /tmp/list`
 do
  cat $BCKLIST|awk -F';' '{if ($1==$i){printf "%s;%s;%s;%s;%s;%s;",$1,$2,$3,$4,$5,"Expired";}else {printf "%s;%s;%s;%s;%s;%s;",$1,$2,$3,$4,$5,$6}'>>/tmp/newcatlog
 done
 if [ -f /tmp/newcatlog ];then
  cat /tmp/newcatlog>$BCKLIST
  rm /tmp/newcatlog
 fi
 rm /tmp/list 2>/dev/null
 echo "Update Catalog is successful!"
  }
3. A function which can validate the backups, with every backups. Validation function should validate the backup information in catalog i.e it should check if the backups are available in defined destination or not.
Following is a Validation backup Function:
/dev/null
 if [ $? -ne 0 ];then
  echo "No catalog Available for Mentioned Bin and Cluster"
  exit 0
 fi
  for i in `cat $BCKLIST|awk -F';' '{print $1}'`
 do 
    ls -1 $BCKDIR/$i 2> /dev/null
    if [ $? -ne 0 ];then
  cat $BCKLIST|grep -v $i >/tmp/valcatalog.log
  cat /tmp/valcatalog >$BCKLIST
    fi
 done
 rm /tmp/valcatalog.log 2>/dev/null
 echo "All Backups has been validated!" 
   
   }
4. With this we would also need a function which can implement some retention policy on backup. i.e of how many days backup which user wants to keep:
 backup_retention()
  {
 if [ -z $RETENTION ];then
  echo "No retention days..."
  echo "Skipping the retention.."
  return 0
       else
  find $BCKDIR -mtime +$RETENTION -exec ls {} \; >/tmp/list;
  find $BCKDIR -mtime +$RETENTION -exec rm {} \;
 fi
 echo "Updating Catalog.."
 validate_backups
 updatecatalog
 }


With retention, policy above function will also update the catalog information.

5. Now, some functions which can be use for HotBackup. i.e Start and Stop backup functions:
 startbackup() 
   {
     QUERY="select pg_start_backup('HOTBACKUP `date +"%d-%m-%Y"`');"
     export PGPASSWORD="$PASSWD"
      $BIN/psql -t -p $port  -U $USER -c "$QUERY" template1
   }

 stopbackup()
   {
 QUERY="select pg_stop_backup();"
 export PGPASSWORD="$PASSWD"
 $BIN/psql -t -p $port -U $USER -c "$QUERY" template1
  }

6. One backup function, which can be use for backup, here is that function: backupdata()
    {
 if [ -z $FILENAME ];then
   NEWFILENAME=hotbackup_`echo $port`_`date +"%d-%m-%Y"`.tar.gz
   LABEL=HOTBACKUP
 else
   NEWFILENAME=`echo $FILENAME`_hotbackup.tar.gz
   LABEL=$FILENAME
 fi
     if [ -f /tmp/.backup.lock ];then 
 echo "One backup is already running.."
 exit 1
    fi
      LISTFILE=".`echo $port`_`echo $BCKDIR|sed 's/\//-/g'`"
      TYPE="Full Backup"
      STARTTIME=`date +"%d/%m/%Y %H:%M:%S"`
      touch /tmp/.backup.lock
      startbackup
      export PGPASSWORD=$PASSWD
      DATADIR=`$BIN/psql -t -U $USER -p $port -c "show data_directory;"|sed '/^$/d'`
      cmd="tar -zchvf $BCKDIR/$NEWFILENAME $DATADIR"
     if [ -d $BCKDIR ];then
         :
     else
        mkdir -p $BCKDIR
     fi
       $cmd >/tmp/Hot_backup_`date +"%d-%m-%Y"`.log 2>&1
       if [ $? -ne 0 ];then
  echo "Backup Failed.. Please /tmp/Hot_backup_`date +"%d-%m-%Y"`.log"
  stopbackup
  ENDTIME=`date +"%d/%m/%Y %H:%M:%S"`
  STATUS="Failed!"
     echo "$NEWFILENAME;$LABEL;$STARTTIME;$ENDTIME;$TYPE;$STATUS">>$BIN/$LISTFILE 
  rm /tmp/.backup.lock
         exit 1
       else 
  stopbackup
  rm /tmp/.backup.lock
  ENDTIME=`date +"%d/%m/%Y %H:%M:%S"`
  STATUS="Successful"
  echo "$NEWFILENAME;$LABEL;$STARTTIME;$ENDTIME;$TYPE;$STATUS">>$BIN/$LISTFILE
  echo "Data Directory of port $port has been backed up successfully!"
       fi
    } 
Above function, will also update the catalog file, which shell scriptor can figure out from the function definition.

7. With this we also need a archive backup functions too. Below is a archive function, which update the status in catalog and also takes the backup:
 backuparchive() 
 {
 touch /tmp/.archive.lock
 TYPE="Archive Backup"
 STARTTIME=`date +"%d/%m/%Y %H:%M:%S"`
        LISTFILE=".`echo $port`_`echo $BCKDIR|sed 's/\//-/g'`"
 if [ -z $FILENAME ];then
   NEWFILENAME=archivebackup_`echo $port`_`date +"%d-%m-%Y-%H:%M"`.tar.gz
   LABEL="Archive"
 else
   NEWFILENAME=`echo $FILENAME`_archive.tar.gz
   LABEL=$FILENAME
 fi
 archiveloc=`$BIN/psql -t -U $USER -p $port -c "show archive_command"|sed '/^$/d'|awk  '{k=split($0,arr," "); for(i=0;i<=k;i++){ if (match(arr[i],"%f")) print arr[i];}}'|xargs echo|awk '{print $1}'|sed 's/%f//g'`
        cmd="tar -zchvf $BCKDIR/$NEWFILENAME $archiveloc/*" 
        $cmd >/tmp/backup_archive_`date +"%d-%m-%Y"`.log 2>&1
 if [ $? -ne 0 ];then
  echo "Backup Failed!. Please check log file in /tmp/backup_archive_`date +"%d-%-%Y"`"
  rm /tmp/.archive.lock
  ENDTIME=`date +"%d/%m/%Y %H:%M:%S"`
  STATUS="Failed!"
         echo "$NEWFILENAME;$LABEL;$STARTTIME;$ENDTIME;$TYPE;$STATUS">>$BIN/$LISTFILE 
  exit 1
 fi
 echo "Archive Backup is successful"
 ENDTIME=`date +"%d/%m/%Y %H:%M:%S"`
 STATUS="Successful"
 echo "$NEWFILENAME;$LABEL;$STARTTIME;$ENDTIME;$TYPE;$STATUS">>$BIN/$LISTFILE
 rm /tmp/.archive.lock
}


So, I have all the necessary functions, now left function is: if user wants to view the status of backup and wants to list the backups. For that purpose following is a list backup function:
listbackups()
 {
 
 BCKLIST=`ls -1 $BIN/.$port*` 
 if [ $? -ne 0 ];then
    echo "No Catalog Found! Please check the Home Director and Backup Directory"
  exit 1
 fi
 
 BCKDIR=`echo $BCKLIST|awk -F"/" '{print $NF}'|cut -d"." -f2|awk -F"_" '{print $2}'|sed 's/-/\//g'`
 DATADIR=`$BIN/psql -t -U $USER -p $port -c "show data_directory;"|sed '/^$/d'` >/dev/null
 if [ $? -ne 0 ];then
  echo "Unable to Connect to Database for Verification. Exiting"
  exit 1
 fi
 echo ""|awk 'BEGIN{printf "\t\t%-20s\n","LISTING BACKUPS"}END{printf "\t\t%-20s\n\n","---------------"}'
 echo "PORT: $port"
 echo "DATADIR: $DATADIR"
 echo "LOCATION: $BCKDIR"
 echo ""
        awk -F';' 'BEGIN{ printf "%-20s\t%-20s\t%-20s\t%-20s\t%-20s \n","Filename","START DATE","END DATE", "TYPE", "STATUS"}{printf "%-20s\t%-20s\t%-20s\t%-20s\t%-20s\n",$2,$3,$4,$5,$6}' $BCKLIST  
 }

Now, I have all the required functions. Using above functions I can write a command pg_hotbackup which I can use for Base Backup of data directory, archive log backup, implementing retention policy and listing all the backups.

Few Snapshots are given below:
Hotbackup:
./pg_hotbackup -H $PGHOME/bin -b /Users/postgres/BackupDir -U postgres -p 5432 -P postgres
0/46000020
NOTICE: pg_stop_backup complete, all required WAL segments have been archived
0/460000D8
Data Directory of port 5432 has been backed up successfully!
Archive Backup is successful
No retention days...
Skipping the retention..
  LISTING BACKUPS     
  ---------------     

PORT: 5432
DATADIR:  /Library/PostgreSQL/9.0/data
LOCATION: /Users/postgres/BackupDir

Filename             START DATE           END DATE             TYPE                 STATUS               
HOTBACKUP            19/10/2010 06:11:41  19/10/2010 06:12:07  Full Backup          Successful          
Archive              19/10/2010 06:12:07  19/10/2010 06:12:29  Archive Backup       Successful          
HOTBACKUP            22/10/2010 02:04:40  22/10/2010 02:05:16  Full Backup          Successful          
Archive              22/10/2010 02:05:16  22/10/2010 02:05:44  Archive Backup       Successful   

No comments:

Post a Comment