Monday, 29 January 2018

Partitions in PostgreSQL

Below are some important points regarding "Partitions" in PostgreSQL -
  1. PostgreSQL supports basic table partitioning until Version 9.6
  2. Query Performance can be increased dramatically situations when most of the heavily accessed rows of the table are in a single partition. 
  3. Partitions substitutes for leading column index, reducing size and make it more likely that the heavily used parts fits the memory. 
  4. When queries or updates access large partition of single partition, performance can be improved by taking advantage of sequential scan instead using index scan and random access read scattered across the table. 
  5. Bulk loads and deletes can be accomplished by adding or removing partitions, if that requirement is planned into the partitioning design
  6. ALTER TABLE NO INHERIT and DROP TABLE are both far faster than a bulk operation. These commands also entirely avoid the VACUUM overhead caused by a bulk DELETE.
  7. Each partition must be created as a child table of a single parent table.
  8. The parent table itself is normally empty; it exists just to represent the entire data set.
  9. PostgreSQL supports partitioning via table inheritance until version 9.6
  10. Declarative partitioning is the new feature which is supported from PostgreSQL version 10.0
PARTITION USING TABLE INHERITANCE:
  1. Create the “parent” table, from which all of the partitions will inherit.
  2. Create several “child” tables (each representing a partition of the data) that each inherit from the parent.
  3. Add constraints to the partition tables to define the row values in each partition.
  4. Create indexes on any parent and child tables individually. (Indexes do not propagate from the parent tables to child tables).
  5. Write a suitable trigger function to the master table so that inserts into the parent table redirect into the appropriate partition table.
  6. Create a trigger that calls the trigger function.
  7. Redefine the trigger function when the set of child tables changes.
PROBELEMS:
  1. Data consistency depends on the CHECK constraints specified on each child table.
  2. INSERT and COPY commands do not automatically propagate data to other child tables in the inheritance hierarchy, but instead rely on triggers, resulting in slower inserts.
  3. Substantial manual work is required to create and maintain child sub-tables.
  4. Given 3, re-partitioning is non-trivial (e.g., when adding disks, making schema changes).
  5. Ensuring uniqueness across partitioned tables is not supported.
  6. Indexes, constraints, and many table maintenance commands need to be applied to child tables explicitly. This greatly complicates management tasks.
DECLARATIVE PARTITIONS ( Version 10 Onwards):

Currently, declarative partitioning supports RANGE and LIST partitions:

RANGE — partition the table into “ranges” defined by a key column or set of columns, with no overlap between the ranges of values assigned to different partitions. For example, device_id.

LIST— partition the table by explicitly listing which key values appear in each partition, and are limited to a single column. For example, device_type.

How does it work?

Create a partitioned table using the PARTITION BY clause, which includes the partitioning method (RANGE in this example) and the list of column(s) to use as the partition key.

CREATE TABLE measurement (
 city_id int not null,
 logdate date not null,
 peaktemp int,
 unitsales int
) PARTITION BY RANGE (logdate);

After creating the partitioned table, partitions can be manually created:

CREATE TABLE measurement_y2006m02 PARTITION OF measurement
 FOR VALUES FROM (‘2006–02–01’) TO (‘2006–03–01’);

CREATE TABLE measurement_y2006m03 PARTITION OF measurement
 FOR VALUES FROM (‘2006–03–01’) TO (‘2006–04–01’);

CREATE TABLE measurement_y2006m04 PARTITION OF measurement
 FOR VALUES FROM (‘2006–04–01’) TO (‘2006–05–01’);
...
CREATE TABLE measurement_y2007m11 PARTITION OF measurement
 FOR VALUES FROM (‘2007–11–01’) TO (‘2007–12–01’);

And then indexes can be created:

CREATE INDEX ON measurement_y2006m02 (logdate);
CREATE INDEX ON measurement_y2006m03 (logdate);
CREATE INDEX ON measurement_y2006m04 (logdate);
CREATE INDEX ON measurement_y2007m11 (logdate);

The partitioned table functions similarly to the parent or “master” table in table inheritance but is much improved, serving as a much more powerful query interface to the underlying child table partitions. For instance, commands such as TRUNCATE and COPY now propagate to child tables via execution on the partitioned table. Additionally, users can insert data into underlying child tables via the partitioned table, since tuples (i.e., rows) now automatically route to the right partition on INSERT and no longer depend on triggers, as is the case with table inheritance.

Index in PostgreSQL

There are different types of index in PostgreSQL. They are below:
  1. B-tree
  2. Hash
  3. GiST
  4. SP-GiST
  5. GIN
  6. Single Column Index
  7. Multi Column Index ( Composite Index)
  8. Functional Index
  9. Index on Foreign Key
  10. Unique Index
  11. Partial Index
  12. Implicit Index 

When Index is created, it blocks the table for writes and performs the entire index build with a single scan of table. Other transaction can read the table. DDL and DML operations will be in waiting state until Index rebuild is done. 

PostgreSQL supports building index without locking out writes. We need to specify concurrently option while creating index. When we use concurrently option, PostgreSQL must perform 2 scans of the table and it should wait for all existing transaction that could potentially use or modify the index to terminate. Thus, it takes more time to complete. But this allows to continue write in production, so it is helpful. But it will slow down other operation.

B-Tree - Balanced Tree are used for Equality and Range Queries. All data types and NULL values.

Hash - Hash Indexes are only useful for equality comparisons, but you pretty much never want to use them since they are not transaction safe, need to be manually rebuilt after crashes, and are not replicated to followers.

GIN - Generalized Inverted Index are useful when an index must map many values to one row, whereas B-Tree indexes are optimized for when a row has a single key value. GINs are good for indexing array values as well as for implementing full-text search.

GIST - Generalized Search Tree - indexes allow you to build general balanced tree structures, and can be used for operations beyond equality and range comparisons. They are used to index the geometric data types, as well as full-text search.

Implicit Index - Indexes are automatically created for primary key constraints and unique constraints.

Places where we should avoid Index:
  1. Where there are too many updates,deletes in table.
  2. We should avoid in small tables
  3. Should be avoided where there are too many NULL values
  4. Columns that are frequently manipulated should not be indexed.


Monitoring in PostgreSQL

Database Monitoring:

  1. Database monitoring consists of capturing and recording the database events.
  2. This information helps DBA to detect, identify and fix potential database performance issues.
  3. Database monitoring statistics makes it easy for DBA’s to monitor the health and performance of PostgreSQL Database. 
  4. Several tools are available for monitoring database activity and analyzing performance.

Database Statistics:

  • DB statistics catalog tables store DB activity information like
  1. Current running sessions
  2. Running SQL
  3. Locks
  4. DML Counts
  5. Row Counts
  6. Index Usage
  • Stat collector collects all the above stats and update pg_catalog(statistics table)
  • Pg_catalog is a schema which contains all the statistics table
CATALOG TABLE - Which needs to be updated with analyze(pg_class, pg_catalog)
SYSTEM VIEWS - Which automatically gets updated in occurrence to an event(pg_stat_activity)

The statistics Collector:
  • Stats collector is a process that collect and report information about database activities.
  • Stats collector  adds some overhead to query execution.
  • Stats Parameter:
  1. Track counts - controls table and index statistics
  2. Track activities - enables monitoring of the current command.
  • The stats collector use temp files stored in pg_stat_tmp subdirectory.
  • Permanent statistics are stored in pg_catalog schema in global subdirectory
  • Performance TIP: Parameter stats_temp_directory can be pointed at a RAM based file System.
Database Statistics Table:
  1. Pg_catalog schema contains a set of table,view and function which store and report database statistics.
  2. Pg_class and pg_stats catalog tables stores all statistics. They are two main statistics table. It stores index and table statistics.
  3. Pg_stat_database view can be used to view stats information about a database
  4. Pg_stat_bgwriter shows background writer steps
  5. Pg_stat_user_tables shows information about activities on a table like inserts, updates, deletes, vacuum, autovacuum etc
  6. Pg_stat_user_indexes shows information about index usage for all user table. 
  7. \dS - shows view and statistics table in a server. 
  8. Select * from pg_stat_database
  9. Pg_stat_% are all views
Pg_Stat_Database:

postgres=# \x
Expanded display is on.
postgres=# select * from pg_stat_database where datname='postgres';
-[ RECORD 1 ]--+---------------------------------
datid          | 12896            --- Database OID
datname        | postgres    --- Database Name
numbackends    | 1            --- Number of connection at this time
xact_commit    | 16791      --- How many connections are committed
xact_rollback  | 2               --- How many connections are rollbacked
blks_read      | 187288      --- How many bulk reads happened
blks_hit       | 273549         --- How many bulks blocks are hit
tup_returned   | 7688150   --- How many tuples were returned
tup_fetched    | 93446        --- How many tuples were fetched
tup_inserted   | 0                --- How many tuples were inserted
tup_updated    | 0               --- How many tuples were updates
tup_deleted    | 0                --- How many tuples were deleted
conflicts      | 0                    ---   
temp_files     | 0
temp_bytes     | 0
deadlocks      | 0
blk_read_time  | 0
blk_write_time | 0
stats_reset    | 2016-10-07 12:29:11.117365+05:30  -- When statistics were reset

Pg_stat_bgwriter:

postgres=# select * from pg_stat_bgwriter;
-[ RECORD 1 ]---------+---------------------------------
checkpoints_timed     | 96              -- How many checkpoint were timed/occurred
checkpoints_req       | 11394         -- How many checkpoint were actually required
checkpoint_write_time | 210405805  -- 
checkpoint_sync_time  | 8204341
buffers_checkpoint    | 49477813   -- How many buffers were checkpoint
buffers_clean         | 234507961    -- How many buffers were cleaned
maxwritten_clean      | 1296814     -- How many maximum written clean buffer
buffers_backend       | 1768849795 -- How many buffers at backend 
buffers_backend_fsync | 0              -- How many buffers used by fsync
buffers_alloc         | 2155806772     -- Buffer allocation
stats_reset           | 2016-10-07 12:29:04.686137+05:30  -- Stats were reset on

When checkpoint_req is more than checkpoint_times, there is an issue. Because there are too many transaction. Checkpoint are happening because checkpoint_segment is getting full.
SOLUTION: Increase the shared_buffer, Increase checkpoint_segment.

Pg_stat_user_tables:

postgres=# Select * from pg_stat_user_tables where relname='a';
-[ RECORD 1 ]-----+-----------
relid             | 2338166798
schemaname        | public   -- Schemaname
relname           | a                -- Table Name
seq_scan          | 0               -- How many seq_scan
seq_tup_read      | 0            -- How many tuples were read sequentially
idx_scan          |                   -- How many index scan
idx_tup_fetch     |                 -- How many tuples returned using index
n_tup_ins         | 0                -- How many tuples inserted
n_tup_upd         | 0              -- How many tuples updated
n_tup_del         | 0               -- How many tuples deleted
n_tup_hot_upd     | 0           -- 
n_live_tup        | 0
n_dead_tup        | 0              -- How many tuples were dead
last_vacuum       |                 -- When the table was last vacuum
last_autovacuum   |              -- when the table was last auto vacuum 
last_analyze      |                  -- when the table was last analyzed
last_autoanalyze  |               -- When the table was last auto analyzed
vacuum_count      | 0            -- How many vacuum was done
autovacuum_count  | 0         -- How many autovacuum was done
analyze_count     | 0             -- How many analyze was done
autoanalyze_count | 0          -- How many auto analyze was done.

Autoanalyze: Autovacuum daemon does auto analyze. But if table is rarely inserted, then have to analyze manually. If the table which is being analyzed have parent and child table. It will analyze parent table and then parent and child table. 

Sequential scan should be less than index scan. 


IMPORTANTS VIEWS:

Below are some important views which are very much essential for a DBA.

pg_stat_activity
pg_stat_bgwriter
pg_stat_database
pg_stat_all_tables
pg_stat_sys_tables
pg_stat_user_tables
pg_stat_xact_all_tables
pg_stat_xact_sys_tables
pg_stat_xact_user_tables
pg_stat_all_indexes
pg_stat_sys_indexes
pg_stat_user_indexes
pg_statio_all_tables
pg_statio_sys_tables
pg_statio_user_tables
pg_statio_all_indexes
pg_statio_sys_indexes
pg_statio_user_indexes
pg_statio_all_sequences
pg_statio_sys_sequences
pg_statio_user_sequences
pg_stat_user_functions
pg_stat_xact_user_functions
pg_stat_replication
Pg_stat_database_conflicts

Pg_stat_activity:

postgres=# select * from pg_stat_activity;

-[ RECORD 1 ]----+---------------------------------------------------------------------------------------------------------------------------------------------
datid         | 16384  -- DB ID
datname       | ewcloud  -- DB Name
pid           | 32438     -- Process ID matches with pid of ps -ef | grep post
usesysid      | 32217507
usename       | t_a3efe696b07a44f9ae84ef5b5f096e0c  -- schema Name
application_name | Cisco Energy Management Server Processor Message Queue reportin
client_addr   | 172.30.2.64
client_hostname  |
client_port   | 61742
backend_start | 2016-10-07 12:29:05.183022+05:30  -- Connection started
xact_start    |
query_start   | 2016-10-23 00:30:32.395862+05:30  -- Query start time
state_change  | 2016-10-23 00:30:32.396797+05:30
waiting       | f  -- Query is in waiting state or not
state         | idle
query         | COMMIT


Replication in PostgreSQL

Synchronous Replication:

  1. Ensures that all data are first written to master, slave and transaction log of disk in both master and slave then it will be committed to client.
  2. It ensures maximum Protection.
  3. Synchronous replication offers the ability to confirm that all changes made by a transaction have been transferred to one synchronous standby server.
  4. The only chance data will be lost if both Master and slave crashed as same time. 
  5. The maximum wait time is the round trip starting from Master to Slave & slave to master. Once streaming replication has been configured, configuring synchronous replication requires only one additional configuration step: synchronous_standby_names must be set to a non-empty value.

Asynchronous Replication:


  1. It is by default.
  2. If Master/Primary server crashes then some transaction that were committed may be not replicated to standby server/slave. The amount of data loss is proportional to delay at the time of failover.

Logical Replication:


  1. Replication slots with Streaming replication
  2. Replication Slot - a slot represents a stream of changes that can be replayed to a client in the order they were made on the origin server. Each slot streams a sequence of changes from a single database, sending each change exactly once. Only one receiver may consume changes from a slot at any given time.
  3. It can only replicate DML changes not DDL changes.
  4. Any version it can replicate, any DB it can replicate as it is logical replication.
  5. Replication should be synchronous replication.
  6. Only after replication statement are read, it will delete the xlog.
  7. It don’t need any overhead process from server, as it reads from WAL files
  8. It is an approach towards multi-master slowly.
  9. Replication slots are counted as one of the connection in max_connection.
  10. Process - There is a logical decoding, which pulls all the queries from WAL logs. Decoder should have access to pg_catalog as needs various information from pg_catalog. The DML queries are decoded in decoder. There  is plugin attached to decoder which reads the decoder and send statements to slave server. The plugin can be customized plugin according to our use. We can use that plugin for analytical use also or for logical replication. 
  11. We should make sure, replcation_slot, wal_sender, wal_receiver is enabled in master and slave.
  12. Please find the below example.


postgres=# -- Create a slot named 'regression_slot' using the output plugin 'test_decoding'

postgres=# SELECT * FROM pg_create_logical_replication_slot('regression_slot', 'test_decoding');

   slot_name    | xlog_position

-----------------+---------------

regression_slot | 0/16B1970

(1 row)

postgres=# SELECT * FROM pg_replication_slots;

   slot_name    |    plugin     | slot_type | datoid | database | active |  xmin  | catalog_xmin | restart_lsn

-----------------+---------------+-----------+--------+----------+--------+--------+--------------+-------------
regression_slot | test_decoding | logical   |  12052 | postgres | f      |        |          684 | 0/16A4408
(1 row)

postgres=# -- There are no changes to see yet
postgres=# SELECT * FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL);
location | xid | data
----------+-----+------
(0 rows)

postgres=# CREATE TABLE data(id serial primary key, data text);
CREATE TABLE

postgres=# -- DDL isn't replicated, so all you'll see is the transaction
postgres=# SELECT * FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL);
location  | xid |    data
-----------+-----+------------
0/16D5D48 | 688 | BEGIN 688
0/16E0380 | 688 | COMMIT 688
(2 rows)

postgres=# -- Once changes are read, they're consumed and not emitted
postgres=# -- in a subsequent call:
postgres=# SELECT * FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL);
location | xid | data
----------+-----+------
(0 rows)

postgres=# BEGIN;
postgres=# INSERT INTO data(data) VALUES('1');
postgres=# INSERT INTO data(data) VALUES('2');
postgres=# COMMIT;

postgres=# SELECT * FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL);
location  | xid |                     data
-----------+-----+-----------------------------------------------
0/16E0478 | 689 | BEGIN 689
0/16E0478 | 689 | table public.data: INSERT: id[integer]:1 data[text]:'1'
0/16E0580 | 689 | table public.data: INSERT: id[integer]:2 data[text]:'2'
0/16E0650 | 689 | COMMIT 689
(4 rows)

postgres=# INSERT INTO data(data) VALUES('3');

postgres=# -- You can also peek ahead in the change stream without consuming changes
postgres=# SELECT * FROM pg_logical_slot_peek_changes('regression_slot', NULL, NULL);
location  | xid |                     data
-----------+-----+-----------------------------------------------
0/16E09C0 | 690 | BEGIN 690
0/16E09C0 | 690 | table public.data: INSERT: id[integer]:3 data[text]:'3'
0/16E0B90 | 690 | COMMIT 690
(3 rows)

postgres=# -- The next call to pg_logical_slot_peek_changes() returns the same changes again
postgres=# SELECT * FROM pg_logical_slot_peek_changes('regression_slot', NULL, NULL);
location  | xid |                     data
-----------+-----+-----------------------------------------------
0/16E09C0 | 690 | BEGIN 690
0/16E09C0 | 690 | table public.data: INSERT: id[integer]:3 data[text]:'3'
0/16E0B90 | 690 | COMMIT 690
(3 rows)

postgres=# -- options can be passed to output plugin, to influence the formatting
postgres=# SELECT * FROM pg_logical_slot_peek_changes('regression_slot', NULL, NULL, 'include-timestamp', 'on');
location  | xid |                     data
-----------+-----+-----------------------------------------------
0/16E09C0 | 690 | BEGIN 690
0/16E09C0 | 690 | table public.data: INSERT: id[integer]:3 data[text]:'3'
0/16E0B90 | 690 | COMMIT 690 (at 2014-02-27 16:41:51.863092+01)
(3 rows)

postgres=# -- Remember to destroy a slot you no longer need to stop it consuming
postgres=# -- server resources:
postgres=# SELECT pg_drop_replication_slot('regression_slot');
pg_drop_replication_slot
-----------------------

(1 row)



Physical Replication:

  1. Streaming Replication
  2. Hot Standby
  3. Warm Standby
  4. Log shipping

Streaming Replication:

There should be always a real time backup so that the data's are always safe. Streaming Replication is one of the process that always sync the real time data into different server. The main server is always called as master and the server where the data's need to get sync is called slave. 

Max_Wal sender process keeps checking the change block and sends it to replicated server where wal_reciever makes sure, it is written to the database accordingly. Row by row can be replicated in streaming replication, as process reads from xlog. 

Setting up Streaming Replication:


Pre-Requisite:
  1. Slave Empty
  2. Same Version (9.2.2)
  3. Same OS-Linux (Ubuntu)
  4. Same bit - 32
Steps:

1)Set the keygen so that there is no problem in connection both the server i.e slave and master
 
   a) As a root user create the key in master and in slave
            Master: ssh-keygen -t rsa
            Slave: ssh-keygen -t rsa
   b) Now copy the key from master to slave and slave to master
             Master: vi authorized-keys
                        cat id-rsa.pub from slave and paste in authorized-keys
             Slave: Do same thing for slave

2)Change the permission of the file authorized-keys using the below command -
      [root@linux] chmod 600 authorized-keys

3) We need to repeat the above steps to set the keygen as postgres user. If server A needs to connect to server B as postgres user. Then we need to copy public key of server A to authorized_keys in server B. Permissions are very important, without that it won’t work. Also we need to make sure that knows_hosts are cleared up before/after we do ssh keygen setup.
       .ssh permission - 700
       Authorized_keys - 640

4)  Check the master is connected to the slave without password or not.
         ssh postgres@192.168.175.3
     It may be still it is not connection, and throws an error. Then we can just reset our password in OS
         passwd postgres

5) Create directory wallogs in slave in root and change the permission
      chown -R postgres wallogs

6) Change postgres config(master):
     archive_mode = on
     archive_command = 'ssh slave_host test -d /wallogs && rsync -a %p slave_host:/wallogs/%f'
      (it is secure shelling to slave ip and test the wallogs if everything ok then it copies to slave wallogs)
      wal_level = hot_standby
      max_wal_senders = 2
       wal_keep_segments = 30

7) Restart the master
    service postgresql restart

8) We can switch log to check logs are getting created and moved.
    select pg_switch_xlog();

9) Change in pg_hba.conf
    type    database    user          host(slave)            Method
    host    all         all         192.168.175.196/32      trust
    host  replication   all         192.168.175.196/32      trust
     Change all md5 to trust.

10) Next we will run commands from slave. We won’t touch master

11) Keep a backup of postgresql.conf and pg_hba.conf in home directory.

12) Going to root:
     rm /tmp/data_pgsql.trigger.5432
     /etc/init.d/postgresql stop
     rm /usr/local/pgsql/data/ -rf
     echo "SELECT pg_start_backup('`hostname``/bin/date '+%Y%m-%d-%T'`');" | psql -h masr_hostte -U postgres
     rsync -avz   postgres@master_host:/usr/local/pgsql/data/  /usr/local/pgsql/data/
     su - postgres -c "/usr/local/pgsql/bin/psql -h master_host -c 'SELECT pg_stop_backup();'"
     rm /usr/local/pgsql/data/pg_xlog/archive_status/*
     rm /usr/local/pgsql/data/pg_xlog/*

13) Change in postgresql.conf
      hot_standby = on

14) Create recovery.conf
      standby_mode = 'on'
      primary_conninfo = 'host=master_host port=5432 user=postgres'
      restore_command = 'cp /wallogs/%f %p'
      trigger_file = '/tmp/pgsql.trigger'
      archive_cleanup_command = 'pg_archivecleanup /wallogs/ps_admin_juliet %r

15) Remove the postmaster.pid before restarting the slave server.

16) Restart the slave and check is it working or not.

17) Monitoring: pg_stat_replication

     Large differences between pg_current_xlog_location and sent_location field might indicate that the master server is under heavy load, while differences between sent_location and pg_last_xlog_receive_location on the standby might indicate network delay, or that the standby is under heavy load.

Recovery.conf File:

standby_mode = 'on'
primary_conninfo = 'host=master_host port=5432 user=postgres'
restore_command = 'cp /wallogs/%f %p'
trigger_file = '/tmp/pgsql.trigger'
archive_cleanup_command = 'pg_archivecleanup /wallogs/ps_admin_juliet %r’
Recovery_target_name = name point created pg_create_restore_point()
Recovery_target_time = till a specific time, it will recover. 
Recovery_target_xid = recovery upto transaction ID
Recovery_target_inclusive = True(default) false stops before recovery target
Recovery_target_timeline = Recovery till specific timeline
Pause_at_recovery_target = recovery target should pause after recovery is complete.

What happens at the end of recovery?

  1. End of recovery means the point where the the database opens up for writing
  2. New timeline is chosen
  3. A timeline history file is written
  4. The partial last WAL file on the previous timeline is copied with the new timeline's ID
  5. A checkpoint record is written on the new timeline
Example:

LOG: database system was interrupted; last known up at 2013-01-30
21:45:14 EET
LOG: starting archive recovery
LOG: redo starts at 13/E00000C8
LOG: could not open file "pg_xlog/0000000100000013000000E4": No
such file or directory
LOG: redo done at 13/E3D389A0
LOG: last completed transaction was at log time 2013-01-30 21:45:20+02
LOG: selected new timeline ID: 2
LOG: archive recovery complete
LOG: database system is ready to accept connections

Partial File:

0000000100000013000000E4
------------------- - - - - - - - -  - - -      - - - - ( Unused Portion)
0000000200000013000000E4
------------------------------------           --------(Merged Portion)
Both are partial file at one point of time and partial file vanishes and become normal WAL file.

TimeLine History File:

After timeline switch there is a history file in between
0000000100000013000000E4
0000000100000013000000E5
00000002.history
0000000200000013000000E3
0000000200000013000000E4

Cat 00000002.history
13/E4000000  no recovery target specified
13/ED000090  at restore point ”before FOSDEM”

Promoting a standby also switches timeline.

Pg-receivexlog:

  1. Pg_receivexlog streams transaction log from a running postgresql cluster.
  2. This tool uses streaming replication protocol to stream transaction logs to a local directory.
  3. These files can be used for PITR.
  4. Transaction logs are streamed in real time thus pg_receivelog can be used instead of archive command.
  5. EX: pg_receivelog -h localhost -D /usr/local/wal_archive
  6. It can be used as an alternate of archiving or for streaming replication.
  7. Max_wal_sender=1 . At Least 1 and create entry in pg_hba.conf for the IP as replication user, Then it will work
  8. It uses replication protocol

Replication Troubleshooting:

  1. Generally we look into view pg_stat_replication:
  2. Below queries can be used to find lag:
  3. psql>select extract( 'epoch' from now() - pg_last_xact_replay_timestamp())::int ;
  4. Few More Queries are below:
$ psql -c "SELECT pg_current_xlog_location()" -h192.168.0.10 (primary host)
 pg_current_xlog_location 
--------------------------
 0/2000000
(1 row)

$ psql -c "select pg_last_xlog_receive_location()" -h192.168.0.20 (standby host)
 pg_last_xlog_receive_location 
-------------------------------
 0/2000000
(1 row)

$ psql -c "select pg_last_xlog_replay_location()" -h192.168.0.20 (standby host)
 pg_last_xlog_replay_location 
------------------------------
 0/2000000
(1 row)

Pg_stat_replication:

postgres=# select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid                        | 14447
usesysid               | 10
usename              | postgres
application_name | walreceiver
client_addr            | 172.31.28.103
client_hostname   |
client_port             | 34792
backend_start       | 2016-11-29 00:09:15.958171-05
backend_xmin      |
state                      | streaming
sent_location        | 0/2000060
write_location       | 0/2000060
flush_location       | 0/2000060
replay_location     | 0/2000060
sync_priority         | 0
sync_state            | async


Column
Type
Description
pid
integer
Process ID of a WAL sender process
usesysid
oid
OID of the user logged into this WAL sender process
usename
name
Name of the user logged into this WAL sender process
application_name
text
Name of the application that is connected to this WAL sender
client_addr
inet
IP address of the client connected to this WAL sender. If this field is null, it indicates that the client is connected via a Unix socket on the server machine.
client_hostname
text
Host name of the connected client, as reported by a reverse DNS lookup of client_addr. This field will only be non-null for IP connections, and only when log_hostname is enabled.
client_port
integer
TCP port number that the client is using for communication with this WAL sender, or -1 if a Unix socket is used
backend_start
timestamp with time zone
Time when this process was started, i.e., when the client connected to this WAL sender
state
text
Current WAL sender state
sent_location
text
Last transaction log position sent on this connection
write_location
text
Last transaction log position written to disk by this standby server
flush_location
text
Last transaction log position flushed to disk by this standby server
replay_location
text
Last transaction log position replayed into the database on this standby server
sync_priority
integer
Priority of this standby server for being chosen as the synchronous standby
sync_state
text
Synchronous state of this standby server

Cascading Replication:

  1. Allows standby server to accept connection and stream WAL logs to other standby servers acting like a relay. This helps in reducing number of direct connection to the master.
  2. A standby acting as both receiver and standby is known as cascading standby.
  3. Standby servers that are more connected to master is called upstream servers and those servers which are further away are called downstream servers.
  4. Cascading replication is presently asynchronous.
  5. We need to set primary_conninfo in the downstream standby to point to the cascading standby.

Log Shipping/Warm Standby:

  1. Log shipping is asynchronous
  2. Wal_level = archive(Master) 
  3. Hot_standby=off(Slave). 
  4. Not able to connect to DB. DB system is starting up mode.
  5. Also known as Warm Standby. 

Hot-Standby:

  1. Wal_level =hot_standby(master) 
  2. hot_standby=on.(Slave) 
  3. Can connect to DB for read only operation.
  4. Slave will be one archive behind master. It will read from archive_location.


Saturday, 27 January 2018

AWS - CloudFormation Template to create Aurora PostgreSQL Cluster Multi AZ with Read-Replica

Below is the CloudFormation Template to create Aurora PostgreSQL Cluster with Multi AZ along with Read-Replica.

AWSTemplateFormatVersion: 2010-09-09
Description: A basic CloudFormation template for an RDS Aurora PostgreSQL cluster.
Parameters:
    DatabaseInstanceType:
        Default: db.r4.large
        AllowedValues:
            - db.r4.large
            - db.r4.xlarge
            - db.r4.2xlarge
            - db.r4.4xlarge
            - db.r4.8xlarge
            - db.r4.16xlarge
        Description: The instance type to use for the database.
        Type: String
    DatabasePassword:
        Default: password
        AllowedPattern: "[a-zA-Z0-9]+"
        ConstraintDescription: must contain only alphanumeric characters.
        Description: The database admin account password.
        MaxLength: '41'
        MinLength: '8'
        NoEcho: 'true'
        Type: String
    DatabaseUsername:
        Default: aurorapg
        AllowedPattern: "[a-zA-Z0-9]+"
        ConstraintDescription: must contain only alphanumeric characters.
        Description: The database admin account user name.
        MaxLength: '16'
        MinLength: '1'
        Type: String
    DatabaseClusterParameterGroupName:
        Type: String
        Default: "default.aurora-postgresql9.6"
    DatabaseBackupRetentionPeriod:
        Type: String
        Default: 35
        Description: The database backup retention period in days.
    DatabaseSubnets:
        Default: "subnet-6749a72f,subnet-c3033698"
        Description: The subnets to place database instances in.
        Type: List<AWS::EC2::Subnet::Id>
    DatabaseSecurityGroups:
        Default: sg-0ee3d27f
        Type: List<AWS::EC2::SecurityGroup::Id>
        Description: Security groups to apply to the RDS cluster.
Metadata:
    AWS::CloudFormation::Interface:
        ParameterGroups:
            - Label:
                default: default.aurora-postgresql9.6
              Parameters:
                - DatabaseInstanceType
                - DatabaseName
                - DatabaseUsername
                - DatabasePassword
                - DatabaseSubnets
                - DatabaseSecurityGroups
                - DatabaseBackupRetentionPeriod
                - DatabaseClusterParameterGroupName
        ParameterLabels:
            DatabaseInstanceType:
                default: Database Instance Type
            DatabaseClusterParameterGroupName:
                default: DB cluster parameter group
            DatabasePassword:
                default: Database Password
            DatabaseUsername:
                default: Database Username
            DatabaseBackupRetentionPeriod:
                default: Database Backup Retention Period
            DatabaseSubnets:
                default: Database Subnets
            DatabaseSecurityGroups:
                default: Database Security Groups
Resources:
    DatabaseSubnetGroup:
        Type: AWS::RDS::DBSubnetGroup
        Properties:
            DBSubnetGroupDescription: CloudFormation managed DB subnet group.
            SubnetIds:
                Ref: DatabaseSubnets
    DatabaseCluster:
        Type: AWS::RDS::DBCluster
        Properties:
            Engine: aurora-postgresql
            MasterUsername:
                Ref: DatabaseUsername
            MasterUserPassword:
                Ref: DatabasePassword
            BackupRetentionPeriod:
                Ref: DatabaseBackupRetentionPeriod
            DBClusterParameterGroupName:
                Ref: DatabaseClusterParameterGroupName
            PreferredBackupWindow: 02:00-03:00
            PreferredMaintenanceWindow: mon:03:00-mon:04:00
            DBSubnetGroupName:
                Ref: DatabaseSubnetGroup
            VpcSecurityGroupIds:
                Ref: DatabaseSecurityGroups
    DatabasePrimaryInstance:
        Type: AWS::RDS::DBInstance
        Properties:
            Engine: aurora-postgresql
            DBClusterIdentifier:
                Ref: DatabaseCluster
            DBInstanceClass:
                Ref: DatabaseInstanceType
            DBSubnetGroupName:
                Ref: DatabaseSubnetGroup
    DatabaseReplicaInstance:
        Type: AWS::RDS::DBInstance
        Properties:
            Engine: aurora-postgresql
            DBClusterIdentifier:
                Ref: DatabaseCluster
            DBInstanceClass:
                Ref: DatabaseInstanceType
            DBSubnetGroupName:
                Ref: DatabaseSubnetGroup
    DatabaseReplicaInstance2:
        Type: AWS::RDS::DBInstance
        Properties:
            Engine: aurora-postgresql
            DBClusterIdentifier:
                Ref: DatabaseCluster
            DBInstanceClass:
                Ref: DatabaseInstanceType
            DBSubnetGroupName:
                Ref: DatabaseSubnetGroup