New Replication and Recovery Features in PostgreSQL 9.1

1. Streaming Replication and Continuous Archiving Features.
a. Synchronous Replication
PostgreSQL 9.1 came with Synchronous Replication. In Synchronous Replication, all commited transaction will be transferred to standby synchronously. When Primary is in Synchronous replication, then each commit transaction will wait untill transaction get transfered to Slave/Replication Server. This reduces the loss of any committed transaction. And gives high degree of durability.
For setting up synchronous replication you can look at my blog:
http://vibhorkumar.wordpress.com/2011/10/20/asynchronoussynchronous-streaming-replication-in-postgresql-9-1/ 

2. New parameter replication_timeout:
 This is a new parameter which has been added in 9.1. Using this parameter user can terminate replication connection, if connection is inactive more than replication_timeout seconds. This is useful parameter for Primary Server to detect Replication Server Crash or network outage.


3. New role/permission for Streaming Replication
 REPLICATION is a new role, which has been added in PostgreSQL 9.1 for Streaming Replication. If a user has REPLICATION role granted then user would be able to initiate Streaming Replication. Some example is given below:
 a. CREATE USER with REPLICATION Privilege:
     CREATE USER rep_user WITH REPLICATION PASSWORD 'password';
b. Assign REPLICATION Privilege to a user:
      ALTER USER username WITH REPLICATION;
REPLICATION privileged can also used for Online Backup of PostgreSQL.


4. New tool for Base Backup: pg_basebackup
pg_basebackup is a new Backup tool introduce in PostgreSQL 9.1. Using this tool user can take Base Backups of running PostgreSQL database cluster. To use pg_basebackup, user has to make following in Changes in PostgreSQL Cluster

 a. Make following changes in pg_hba.conf file:
      host    replication     postgres        [Ipv4 address of client]/32  trust
b. Make following changes in postgresql.conf file of PostgreSQL Cluster:
    archive_command = 'cp -i %p /Users/postgres/archive/%f'
    archive_mode = on # Require Restart
    max_wal_senders = 3 # Maximum 'wal_senders'
    wal_keep_segments = # How many WAL segments (=files) should be kept on the server 
c. After making above changes, user can restart the PostgreSQL Cluster and can use pg_basebackup to take the backup as given below:
     pg_basebackup -D /Users/vibhor/testbackup -v -Fp -l Testbackup -h 127.0.0.1 -U postgres
Note:: User which can use pg_basebackup should have either SUPERUSER or REPLICATION privilege.


5. Functions to Control Streaming Replication replay:
 New version of PostgreSQL has New Streaming Replication Control Function. Information on it is given below:

  a. pg_xlog_replay_pause(): 
Using this function user can pause recovery of Standby and would be able to take consistent backup of Standby Data Directory.
Example is given below:
postgres=# select  pg_xlog_replay_pause();
 pg_xlog_replay_pause 
----------------------
 
(1 row)
b. pg_is_xlog_replay_paused():
 Using this function user would be able to check the Standby/Streaming Replication is paused or not. Example is given below:
 postgres=# select  pg_is_xlog_replay_paused();
 pg_is_xlog_replay_paused 
--------------------------
 t
(1 row)
c. pg_xlog_replay_resume():
 Using this function user would be able resume replication of standby/streaming replication, if its recovery/replay is paused. Example is given below:
postgres=# select pg_xlog_replay_resume();
 pg_xlog_replay_resume 
-----------------------
 
(1 row)

postgres=# select  pg_is_xlog_replay_paused();
 pg_is_xlog_replay_paused 
--------------------------
 f
(1 row)
6. New in Replication Monitoring:

a. View: pg_stat_replication:
This view displays information on WAL sender processes. View contains one row for each WAL sender process shows, information on processid, user (oid), username, application name, host name (if available) and port number, time at which the server process began execution, and the current WAL sender state and transaction log location.
Example is given below:
postgres=# select * from pg_stat_replication;
 procpid | usesysid | usename  | application_name | client_addr | client_hostname | client_port |          backend_start           |   state   | sent_location | write_location | flush_location | replay_location | sync_priority | sync_state 
---------+----------+----------+------------------+-------------+-----------------+-------------+----------------------------------+-----------+---------------+----------------+----------------+-----------------+---------------+------------
    2319 |       10 | postgres | sync_replication | ::1         |                 |       50224 | 2011-11-10 21:39:45.424503+05:30 | streaming | 0/33002798    | 0/33002798     | 0/33002798     | 0/33002798      |             0 | async
(1 row)
Note: pg_stat_replication view will give information on Master. Executing Query against pg_stat_replication view on Standby/streaming will return zero rows.

  b. Function: pg_last_xact_replay_timestamp():
Above function gives time stamp of last transaction replayed during recovery, time at which the commit or abort WAL record for that transaction was generated on the primary. If no transactions have been replayed during recovery, this function returns NULL. Otherwise, if recovery is still in progress this will increase monotonically. If recovery has completed then this value will remain static at the value of the last transaction applied during that recovery. When the server has been started normally without recovery the function returns NULL. Example is given below:
postgres=# select * from pg_last_xact_replay_timestamp();
  pg_last_xact_replay_timestamp   
----------------------------------
 2011-11-10 22:17:26.431321+05:30
(1 row)
7. New in Hot Standby:

  a. New parameter hot_standby_feedback:
 This is a new parameter which has been added in postgresql.conf file for Standby server. Using this parameter, now user would be able avoid canceling of Queries. This enable Hot Standby to postpone of cleaning old version of rows if any SELECT query running on Standby and makes Hot Standby to send feedback, once as per wal_receive_status_interval, to primary about queries currently executing on Standby. Note:: Setting this parameter may result in bloat on primary.

  b. New column:(conflicts) in pg_stat_database In PostgreSQL 9.1, New column conflicts has been added in pg_stat_database. This columns gives the total number of queries canceled due to conflict with recovery on standby. Example is given below:
postgres=# select datname, conflicts from pg_stat_database;
  datname  | conflicts 
-----------+-----------
 template1 |         0
 template0 |         0
 postgres  |         0
 korean    |         0
(4 rows)


c. New view: pg_stat_database_conflicts
 In 9.1, pg_stat_database_conflicts view has been added for monitoring and finding the cancelled queries due dropped tablespaces/ lock timeouts/old snapshots/pinned buffers/deadlocks. This view contains one row per database, which gives information on database OID, database name and the number of queries that have been canceled in this database due to dropped tablespaces, lock timeouts, old snapshots, pinned buffers and deadlocks. Example is given below:
postgres=# select * from pg_stat_database_conflicts;
 datid |  datname  | confl_tablespace | confl_lock | confl_snapshot | confl_bufferpin | confl_deadlock 
-------+-----------+------------------+------------+----------------+-----------------+----------------
     1 | template1 |                0 |          0 |              0 |               0 |              0
 12172 | template0 |                0 |          0 |              0 |               0 |              0
 12180 | postgres  |                0 |          0 |              0 |               0 |              0
 25354 | korean    |                0 |          0 |              0 |               0 |              0
(4 rows)
Note: User has to run queries against this view on Standby, since conflicts occurs on Standby.

  d. Increase the maximum values for max_standby_archive_delay and max_standby_streaming_delay. 
In PostgreSQL 9.0, maximum value for max_standby_archive_delay and max_standby_streaming_delay were 35 minutes. Now in PostgreSQL 9.1, user can mention much larger value. These parameters determines the maximum total time allowed to apply any WAL segment/WAL data. For more detail about these parameter, please refer following page of document:
   http://www.postgresql.org/docs/9.1/static/runtime-config-replication.html#GUC-MAX-STANDBY-ARCHIVE-DELAY


e. New Error Code: ERRCODE_T_R_DATABASE_DROPPED
  Error Code    Condition Name
      57P04 database_dropped
Till 9.0, PostgreSQL used to use ERRCODE_ADMIN_SHUTDOWN for recovery conflict( on Standby) caused by Database Drop on Master. Now, in 9.1, ERRCODE_T_R_DATABASE_DROPPED will be used for Same situation. This change made for poolers to handle such situation (where database on Master no longer exists) correctly like pgpool.

  8. New in Recovery Control:

  a. pg_create_restore_point(text) function and recovery_target_name parameter:
 PostgreSQL 9.1 has come with the special function pg_create_restore_point(text). Using this function, admin/DBA can now create their own recovery/restore point. pg_create_restore_point returns Transaction log location, upto which user can restore their Hotbackup. To support the named restore point, PostgreSQL 9.1, has new parameter recovery_target_name for recovery.conf file Let see these two in action.
 i. Set following parameters in postgresql.conf file of cluster:
    archive_mode=on
    archive_command='cp -i %p /Users/postgres/archive/%f'
ii. Now, lets take base backup using pg_basebackup as given below:
edbs-MacBook-Pro:standby postgres$ pg_basebackup -D /Users/postgres/test_backup -v -Fp -l "Standby Backup" -h localhost
NOTICE:  pg_stop_backup complete, all required WAL segments have been archived
pg_basebackup: base backup completed
iii. Now lets connect to database and make Named restore point using pg_create_restore_point
postgres=# select pg_create_restore_point('myrecoverylocation');
 pg_create_restore_point 
-------------------------
 0/3B000118
(1 row)
To verify the behavior of recovery_target_name parameter, lets create a table and perform some activities:
postgres=# create table test_id as select id from generate_series(1,1000000) as t(id);
SELECT 1000000
postgres=# 
iv. Now, create recovery.conf file as given below in Backup directory:
restore_command = 'cp -i /Users/postgres/archive/%f %p'  # e.g. 'cp /mnt/server/archivedir/%f %p'
recovery_target_name = myrecoverylocation
and change the port number of restored postgresql.conf as given below:
vi /Users/postgres/test_backup/postgresql.conf
port=5433
v. Now, start the restored PostgreSQL cluster as given below:
pg_ctl -D /Users/postgres/test_backup start
After executing above command, PostgreSQL will perform recovery till recovery_target_name location. To verify this, user can check PostgreSQL logfile, which will display message something like given below:
2011-11-14 17:15:18 IST LOG:  database system was interrupted; last known up at 2011-11-14 17:10:51 IST
2011-11-14 17:15:18 IST LOG:  creating missing WAL directory "pg_xlog/archive_status"
2011-11-14 17:15:18 IST LOG:  starting point-in-time recovery to "myrecoverylocation"
With this, after connecting to restore database, user can see that test_id table which we had created, after pg_create_restore_point('myrecoverylocation'), will not be part of restored database as given below:
--Restored Database:
edbs-MacBook-Pro:test_backup postgres$ psql -p 5433
Password: 
psql (9.1.1)
Type "help" for help.

postgres=# \dt
             List of relations
 Schema |     Name      | Type  |  Owner   
--------+---------------+-------+----------
 public | dept          | table | postgres
 public | emp           | table | postgres
 public | employees     | table | postgres
 public | entities      | table | postgres
 public | jobhist       | table | postgres
 public | like_op_table | table | postgres
 public | person_job    | table | postgres
 public | test          | table | postgres
 public | test_default  | table | postgres
(9 rows)

-- Primary Database:
postgres=# \dt
             List of relations
 Schema |     Name      | Type  |  Owner   
--------+---------------+-------+----------
 public | dept          | table | postgres
 public | emp           | table | postgres
 public | employees     | table | postgres
 public | entities      | table | postgres
 public | jobhist       | table | postgres
 public | like_op_table | table | postgres
 public | person_job    | table | postgres
 public | test          | table | postgres
 public | test_default  | table | postgres
 public | test_id       | table | postgres
(10 rows)


b. Standby recovery to switch to a new timeline automatically Till 9.0, when user specify recovery_target_timeline='latest', postgreSQL scan for the latest timeline at the beginning of recovery, and pick that as the target. If new timelines appear during recovery, PostgreSQL stick to the target chosen in the beginning, the new timelines are ignored. To make PostgreSQL to notice about the new timeline, user has to restart the Standby. Now in 9.1, Standby servers scan the archive directory for new timelines periodically and switch to new timeline new timeline appears during recovery.

c. New parameter: restart_after_crash = on/off By default till 9.0, whenever, there is backend crash PostgreSQL used to automatically get restarted and there was no control on restarting of PostgreSQL (like for clusterware solutions, where clusterware userd to try to restart the PostgreSQL). Adding this parameter gives control on restart of postgreSQL. If value of restart_after_crash is on, then PostgreSQL will restart automatically, after a backend crash. If value is off, then PostgreSQL will not restart, after backend crash and will be get shutdown.

Comments

Popular posts from this blog

xDB Replication from Oracle to PPAS

PostgreSQL Database Link to Oracle Database on Linux

Fix of "ORA-29275: partial multibyte character"