Thursday, November 3, 2011

New Features in Slony 2.1

Slony 2.1 was released on October 19, 2011, so I thought I would look at some of the important improvements in this new release, which can make users' lives easier.

Before discussing the changes, let's set up Slony replication. The following scripts can be used for setting up Slony replication; the preamble below is assumed to be saved as preamble.slonik, so the later scripts can include it.
####   Preamble Script (preamble.slonik):

cluster name=slonytest;
NODE 1 ADMIN CONNINFO = 'dbname=postgres host=localhost user=postgres port=5432 password=postgres';
NODE 2 ADMIN CONNINFO = 'dbname=repdb host=localhost user=postgres port=5432 password=postgres';
Adding Node script:
### create_nodes.slonik

include <preamble.slonik>;
init cluster (id=1, comment='slonytest node 1');
store node (id=2, comment='slonytest subscriber node 2', event node=1);
Storing Paths:
### store_paths.slonik

include <preamble.slonik>;
STORE PATH (SERVER=1, CLIENT=2, CONNINFO='dbname=postgres host=localhost user=postgres port=5432 password=Empid#042');
STORE PATH (SERVER=2, CLIENT=1, CONNINFO='dbname=repdb host=localhost user=postgres port=5432 password=Empid#042');
Master Slon Process conf file:
## Master Slon Conf:

vac_frequency=3
cleanup_interval="10 minutes"
log_level=4
sync_interval=2000
sync_interval_timeout=10000
sync_group_maxsize=6
sync_max_rowsize=8192
sync_max_largemem=5242880
syslog=0
log_pid=false
log_timestamp=true
pid_file='/Users/vibhor/PGtools/slony/master_slon.pid'
syslog_ident=slon
cluster_name='slonytest'
conn_info='dbname=postgres host=localhost user=postgres port=5432 password=postgres'
desired_sync_time=60000
sql_on_connection="SET log_min_duration_statement TO '1000';"
Slave Slon process conf file:
## Slave Slon Conf:
vac_frequency=3
cleanup_interval="10 minutes"
log_level=4
sync_interval=2000
sync_interval_timeout=10000
sync_group_maxsize=6
sync_max_rowsize=8192
sync_max_largemem=5242880
syslog=0
log_pid=false
log_timestamp=true
pid_file='/Users/vibhor/PGtools/slony/slave_slon.pid'
syslog_ident=slon
cluster_name='slonytest'
conn_info='dbname=repdb host=localhost user=postgres port=5432 password=Empid#042'
desired_sync_time=60000
sql_on_connection="SET log_min_duration_statement TO '1000';"
After creating the above configuration, the user can set up Slony replication.
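A minimal sketch of bringing the cluster up, assuming the scripts and conf files shown above (the paths come from the examples and will differ on your machine):

slonik < create_nodes.slonik     # initialize node 1 and register node 2
slonik < store_paths.slonik      # store the connection paths between the nodes
slon -f /Users/vibhor/PGtools/slony/master_slon.conf &   # slon daemon for node 1
slon -f /Users/vibhor/PGtools/slony/slave_slon.conf &    # slon daemon for node 2

With replication running, let's now look at the features added in Slony 2.1.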

  1. Support for adding tables in bulk
 In previous versions of Slony, if a user had to add the tables of a particular schema, he had to write a slonik command for each table, as given below:
include <preamble.slonik>;
create set (id=1, origin=1, comment='slonytest Tables and Sequences');
set add table (id=1, set id=1, origin=1, fully qualified name='public.dept', comment='slonytest table public.dept');
set add table (id=2, set id=1, origin=1, fully qualified name='public.emp', comment='slonytest table public.emp');
set add table (id=3, set id=1, origin=1, fully qualified name='public.employees', comment='slonytest table public.employees');
This was a bit of work: the user had to list all the tables and add each one to the slonik command, or else write a script to generate the slonik commands.
 Now, in Slony 2.1, the user can write a single command to add all tables of a schema to Slony replication, as given below:
include <preamble.slonik>;
create set (id=1, origin=1, comment='slonytest Tables and Sequences');
set add table( set id=1, tables='public.*');
A snapshot of the activity is given below:
edbs-MacBook-Pro:slony vibhor$ cat create_set.slonik
include <preamble.slonik>;
create set (id=1, origin=1, comment='slonytest Tables and Sequences');
set add table( set id=1, tables='public.*');
edbs-MacBook-Pro:slony vibhor$ 
edbs-MacBook-Pro:slony vibhor$ slonik < create_set.slonik
The user can verify the tables added in bulk using the following command:
postgres=# select * from _slonytest.sl_table;
 tab_id | tab_reloid | tab_relname | tab_nspname | tab_set |   tab_idxname   | tab_altered |   tab_comment    
--------+------------+-------------+-------------+---------+-----------------+-------------+------------------
      1 |      16438 | dept        | public      |       1 | dept_pk         | f           | replicated table
      2 |      16445 | emp         | public      |       1 | emp_pk          | f           | replicated table
      3 |      16995 | employees   | public      |       1 | employees_pkey  | f           | replicated table
      4 |      16987 | entities    | public      |       1 | entities_pkey   | f           | replicated table
      5 |      16456 | jobhist     | public      |       1 | jobhist_pk      | f           | replicated table
      6 |      16967 | person_job  | public      |       1 | person_job_pkey | f           | replicated table
(6 rows)
If the user wants to add all sequences of a particular schema to Slony replication, he can try something like the following:
include <preamble.slonik>;
set add sequence (set id=1, sequences='public.*seq'); ## Adding all sequences of the public schema
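The sequences registered this way can be verified just like the tables, this time from the sl_sequence catalog:

postgres=# select * from _slonytest.sl_sequence;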
The user can also use regular-expression patterns while adding tables, as given below:
include <preamble.slonik>;
set add table (set id=1, tables='public.slonytest_[1234]', add sequences=true); ## adds the tables slonytest_1 through slonytest_4
  2. Implicit WAIT FOR
 In previous versions of Slony, the user had to use the WAIT FOR EVENT command before executing commands like MERGE SET/MOVE SET:
     SUBSCRIBE SET (ID = 999, PROVIDER = 1, RECEIVER = 2);
     WAIT FOR EVENT (ORIGIN = 1, CONFIRMED = ALL, WAIT ON=1);
     SUBSCRIBE SET (ID = 999, PROVIDER = 1, RECEIVER = 3);
     WAIT FOR EVENT (ORIGIN = 1, CONFIRMED = ALL, WAIT ON=1);
     MERGE SET ( ID = 1, ADD ID = 999, ORIGIN = 1 );
Now, in 2.1, if the user forgets the WAIT FOR command, slonik will wait implicitly whenever it is required (see the example after this list). However, the user has to keep the following in mind:
 a. The user should not run multiple slonik scripts in parallel, since parallel slonik scripts may confuse Slony's implicit WAIT FOR behavior.
 b. Slonik will WAIT FOR an event whenever the event node changes.
 c. Slonik will wait before executing SUBSCRIBE SET/DROP NODE/CLONE NODE commands until the origin has caught up with the other nodes.
 d. The CREATE SET command will wait until all outstanding DROP SET commands are confirmed by all nodes of the Slony replication cluster.
 e. The WAIT FOR command will not work inside a TRY block. (This is an incompatibility with previous versions.)
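For example, in 2.1 the subscribe-and-merge sequence above can be written without any explicit WAIT FOR; slonik inserts the waits where they are needed:
     SUBSCRIBE SET (ID = 999, PROVIDER = 1, RECEIVER = 2);
     SUBSCRIBE SET (ID = 999, PROVIDER = 1, RECEIVER = 3);
     MERGE SET ( ID = 1, ADD ID = 999, ORIGIN = 1 );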

  3. Support for replicating TRUNCATE

 Slony 2.1 has introduced the ability to replicate the TRUNCATE command to slaves. TRUNCATE replication is possible for PostgreSQL versions >= 8.4.
 When a user subscribes a table using Slony 2.1, Slony creates a _slonytest_truncatetrigger trigger on the master to replicate those TRUNCATE commands.
 Let's see it at work:
edbs-MacBook-Pro:~ vibhor$ psql -c "truncate table person_job" postgres
TRUNCATE TABLE
edbs-MacBook-Pro:~ vibhor$ psql -c "select * from person_job" repdb
 pid | job 
-----+-----
(0 rows)
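To confirm that the truncate trigger is in place on the master, one can check the system catalog; a quick sketch, using the example table above, is:

postgres=# select tgname from pg_trigger where tgrelid = 'public.person_job'::regclass and tgname like '%truncate%';

For a subscribed table this should list the _slonytest_truncatetrigger trigger mentioned above.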
  4. Health checks at startup
 Slony 2.1 now checks the status of each node during startup and gives a proper message to help fix any issue. The health check verifies the node's configuration information and returns OK if there is no problem.
 The PL/pgSQL function that has been added for the health check is given below:
CREATE OR REPLACE FUNCTION _slonytest.slon_node_health_check()
 RETURNS boolean
 LANGUAGE plpgsql
AS $function$
declare
  prec record;
  all_ok boolean;
begin
  all_ok := 't'::boolean;
  -- validate that all tables in sl_table have:
  --      sl_table agreeing with pg_class
  for prec in select tab_id, tab_relname, tab_nspname from
  "_slonytest".sl_table t where not exists (select 1 from pg_catalog.pg_class c, pg_catalog.pg_namespace n
    where c.oid = t.tab_reloid and c.relname = t.tab_relname and c.relnamespace = n.oid and n.nspname = t.tab_nspname) loop
    all_ok := 'f'::boolean;
    raise warning 'table [id,nsp,name]=[%,%,%] - sl_table does not match pg_class/pg_namespace', prec.tab_id, prec.tab_relname, prec.tab_nspname;
  end loop;
  if not all_ok then
     raise warning 'Mismatch found between sl_table and pg_class.  Slonik command REPAIR CONFIG may be useful to rectify this.';
  end if;
  return all_ok;
end
$function$;
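The check runs automatically when the slon daemon starts, but since it is a normal function it can also be invoked by hand on any node:

postgres=# select _slonytest.slon_node_health_check();

It returns true when sl_table agrees with pg_class/pg_namespace, and raises warnings (suggesting the REPAIR CONFIG slonik command) otherwise.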
  5. Performance improvement in cases of large backlog
 Slony 2.1 comes with a performance improvement for the case of a large replication backlog. In previous versions, users saw a performance impact when the sl_log_* tables grew and replicating data to the subscribers took so long that Slony needed a huge amount of time to catch up. The main cause of this performance issue was that Slony used sequential scans while pulling data from the sl_log_* tables. In 2.1, the queries which pull data from the sl_log_* tables have been modified to use index scans.
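This is observable on a running cluster: with a non-trivial backlog, the plan for a query that filters a log table the way the remote worker does should show an index scan instead of a sequential scan. A rough illustration (only a sketch, not the exact query slon issues; on a near-empty log table the planner may still choose a sequential scan):

postgres=# explain select * from _slonytest.sl_log_1 where log_origin = 1;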

  6. Monitoring thread to provide better monitoring data
 Slony 2.1 also comes with a new monitoring table, sl_components. This table is very useful for monitoring the Slony threads. Details of sl_components can be found at the following link:
http://slony.info/documentation/2.1/table.sl-components.html
A snapshot of the information displayed by sl_components is given below:
postgres=# select * from sl_components;
       co_actor       | co_pid | co_node | co_connection_pid |   co_activity    |       co_starttime        |  co_event  | co_eventtype 
----------------------+--------+---------+-------------------+------------------+---------------------------+------------+--------------
 local_listen         |  26688 |       1 |             26692 | thread main loop | 2011-11-03 12:07:56+05:30 |            | n/a
 local_sync           |  26688 |       0 |             26697 | thread main loop | 2011-11-03 12:08:00+05:30 |            | n/a
 remote listener      |  26688 |       2 |             26695 | thread main loop | 2011-11-03 12:08:00+05:30 |            | n/a
 remoteWorkerThread_2 |  26688 |       2 |             26694 | SYNC             | 2011-11-03 12:08:00+05:30 | 5000004917 | SYNC
 local_monitor        |  26688 |       0 |             26698 | thread main loop | 2011-11-03 00:32:19+05:30 |            | n/a
 local_cleanup        |  26688 |       0 |             26696 | cleanupEvent     | 2011-11-03 12:00:15+05:30 |            | n/a
(6 rows)
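Since sl_components records when each thread's current activity started, a simple staleness check can be layered on top of it; for example, to spot threads whose activity has not changed for a while (the 10-minute threshold here is arbitrary):

postgres=# select co_actor, co_pid, co_activity, co_starttime from _slonytest.sl_components where co_starttime < now() - interval '10 minutes';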

1 comment:

  1. This is very useful info especially when we are planning to do our slony upgrade to 2.1. Thanks
