Monday, June 11, 2012

Creating user probe and alert in PEM 2.1

This is also one type of thing in which people are very interested. How to create probes other than PEM inbuilt probe? And how to create alert based on probe?
Well answer is simple. Understand what probe is in PEM and then understand PEM Data detail.
Probe in PEM requires following:
1. SQL Code which can be use to gathering data by pemagent.
2. Table in pemdata schema, which will be use for storing Current status/data of SQL.
3. History table (specially in pemhistory schema) where all history data will reside.

Suppose user wants to monitor the Slony Replication using PEM (since, PEM doesn't have slony replication moniotring), so user can do following:
1. SQL Code which can be use for slony replication monitoring.
We know that slony replication monitoring can be done using view sl_status of slony. So, user can create a SQL Code as given below for replication.
select '_test_slony' as schema_name, st_origin,st_received, st_lag_num_events , st_lag_time from _test_slony.sl_status
Now, you must be thinking why I am including Slony SCHEMA name in SQL Code. Well, reason is that slony_schema name represents Slony Cluster name and if user is having more than one slony replication in Database, then those can be monitor with small modification of SQL Code. Also, PEM data table and PEM History can be use for monitoring multiple slony replication cluster.

Lets start with creating probe based on SQL Code:
 Standard way of creating probe in Postgres Enterprise Manager:
1. Insert the detail of probe in probe table with SQL Code for monitoring:
INSERT INTO probe(display_name,
enabled_by_default, default_execution_frequency,default_lifetime,any_server_version,force_enabled,probe_key_list) 
VALUES('slon replication monitor','slony_replication','s',300,600,'select ''_test_slony'' as schema_name, st_origin,st_received, st_lag_num_events , st_lag_time from  _test_slony.sl_status',false,300,180,true,false,ARRAY['st_origin','st_received']);
Comlumn Description of probe table is given below:
display_name: Name as presented in Alerting Box.
internal_name: Internal Name of probe to PEM server. Based on that function: pem.create_data_and_history_tables(), creates pemdata and pemhistory table
collection_method: is Collection based on SQL: 's' or based on internal code of pemagent: 'i' target_type_id: Target type id, Type id and its description below:
  Global  : 50
  Agent   : Agent
  Server  : 200
  Database: 300
  Schema :   400
  Table :   500
  Index :  600
  Sequence: 700
  Function: 800
If probe is for database level then target_type_id would be 300.
applies_to_id: probe is apply to which type id, i.e if user can make database level probe which can be apply to table (like IndexSize can be database level and can be table level alert).
probe_code: SQL Code/function to collect data e
nabled_by_default: if true then it will be enable for all pemagent servers
default_execution_frequency: Interval
default_lifetime: Data retention period
any_server_version: Is it PPAS specific alert or PG/PPAS both (true/false)
force_enabled: By default enable (true/false).
probe_key_list: Key columns

2. Insert the detail of data column, returns by SQL Code, in probe column, as given below:
INSERT INTO pem.probe_column (probe_id, internal_name, display_name, display_position, classification,
       sql_data_type, unit_of_value, calculate_pit, discard_history, pit_by_default, is_graphable)

   (SELECT max(id) FROM PEM.probe), v.internal_name, v.display_name, v.display_position, v.classification,
       v.sql_data_type, v.unit_of_value, v.calculate_pit, v.discard_history, v.pit_by_default, v.is_graphable FROM (
    VALUES('database_name','Database Name',1,'k','text','',false,false,false,false),
    ('schema_name','Slony Cluster',2,'k','text','',false,false,false,false),
    ('st_origin','Master Node ID',3,'k','INTEGER','',false,false,false,false),
    ('st_received','Slave Node ID',4,'k','INTEGER','',false,false,false,false),
    ('st_lag_num_events','Lag Events',5,'m','INTEGER','',false,false,false,false),
    ('st_lag_time','Lag Interval',6,'m','INTERVAL','',false,false,false,false)
v(internal_name, display_name, display_position, classification,
               sql_data_type, unit_of_value, calculate_pit, discard_history, pit_by_default, is_graphable);
Description of columns are given below:
probe_id : Assigned probe_id for column (its max(id) of pem.probe column.
internal_name : Internal name to PEM server.
display_name : Column Display name to user.
display_position : Position of column in SQL Code.
classification : If column consider as primary key i.e based on this record can be identified, then value would 'k' else 'm'.
sql_data_type : Data type of Column
unit_of_value : Unit of data of column.
calculate_pit : Point in time data
discard_history : Discard any history.
pit_by_default : Default Point in time representation true/false
is_graphable : Can be use for graph (always keep false).

3. Now use PEM server function to create data and history table in pemdata and pemhistory schema of PEM server.
SELECT pem.create_data_and_history_tables();
Above will create table with internal_name mentioned in probe_column, like pemdata.slony_replication and pemhistory.slony_replication. 

Above are three simple steps to create probe in PEM Server. Now, user can see that if you know SQL language, you can create probe in PEM. To verify the probe you can see the probe in PEM Client:
Open PEM Client -> Go to PEM Server directory (in Left pane) -> Connect to PostgreSQL Cluster -> expand the databases -> Right click on any database -> select Probe Configuration.
which will popup "Probe configuration" and you would be able to see the slony replication.

Since, we have created probe and based on probe, we can create template alert.
Creating Template Alert is simple. For new template alert, you have to do following:
1. Identify the right SQL code for monitoring based on probe, as given below for slony replication:
SELECT extract ('epoch' from st_lag_time) from  pemdata.slony_replication WHERE server_id=${server_id} AND database_name='${database_name}' AND st_origin='${param_1}' AND st_received='${param_2}' and st_lag_time >=interval '1 sec' AND schema_name='_test_slony'
In above I am using epoch function to calculate seconds and based on that I can have alert which would be having threshold value for Low, Medium and High alert.

2. Use PEM server function, pem.create_alert_template(), to create template alert as given below:
SELECT pem.create_alert_template('Slony Replication','Slony Replication Monitoring Alert',$SQL$ SELECT extract ('epoch' from st_lag_time) from  pemdata.slony_replication WHERE server_id=${server_id} AND database_name='${database_name}' AND st_origin='${param_1}' AND st_received='${param_2}' and st_lag_time >=interval '1 sec' AND schema_name='_test_slony' $SQL$,300,ARRAY['st_origin','st_received'],ARRAY['INTEGER','INTEGER']::alert_param_type[],NULL,'seconds',ARRAY['slony_replication'],2);

Argument description of pem.create_alert_template() function is given below: : Name of alert in single quotes
2.description : Small Description of Alert
3.sql : SQL code for getting threshold
4.object_type : Alert is of Agent LevelServer Level,
5.param_names : SQL Code is dependent on any parameter name, (like in my SQL Code, its depend '${param_1}' (origin) and '{param_2}' (subscriber node id)
6.param_types : Data type of each parameter (As per SQL code for slony, this would be NULL)
7.param_units : Assigned unit
8.threshold_unit : Unit of threshold (since I am doing Checking the Time lag, there it would seconds).
9.probe_dependency_list: Dependency on Pemdata table (Its slony_replication)
10.snmp_oid : Oid for snmp
11.applicable_on_server: Is this applicable for Advanced Server or ALL
12.default_check_frequency: Interval
13.default_history_retention: data retention in number of days.

Thats it! Now, you can see creating new user defined alert and probe in PEM is very simple. Only thing which user has to do is to focus on identifying right SQL Code for alert/probe.

Have fun with Postgres Enterprise Manager!

No comments:

Post a Comment