Monday 29 January 2018

Replication in PostgreSQL

Synchronous Replication:

  1. Ensures that a transaction is first written to the transaction log (WAL) on disk on both the master and the slave before the commit is acknowledged to the client.
  2. It provides maximum data protection.
  3. Synchronous replication offers the ability to confirm that all changes made by a transaction have been transferred to at least one synchronous standby server.
  4. Data can be lost only if both the master and the slave crash at the same time.
  5. The extra commit latency is at most one round trip from master to slave and back. Once streaming replication has been configured, configuring synchronous replication requires only one additional configuration step: synchronous_standby_names must be set to a non-empty value. A sample configuration is shown below.
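
A minimal sketch of the master-side configuration (the standby name 'standby1' is illustrative and must match the application_name the standby uses in its primary_conninfo):

postgresql.conf (master):
     synchronous_standby_names = 'standby1'
     synchronous_commit = on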

Asynchronous Replication:


  1. This is the default mode.
  2. If the master/primary server crashes, some transactions that were committed on it may not have been replicated to the standby/slave. The amount of data loss is proportional to the replication delay at the time of failover; the query below shows a standby's sync state.
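
Whether a connected standby is synchronous or asynchronous can be checked on the master via pg_stat_replication (a minimal sketch; an asynchronous standby shows sync_state = 'async'):

postgres=# SELECT application_name, state, sync_state FROM pg_stat_replication;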

Logical Replication:


  1. Uses replication slots together with streaming replication.
  2. Replication slot - a slot represents a stream of changes that can be replayed to a client in the order they were made on the origin server. Each slot streams a sequence of changes from a single database, sending each change exactly once. Only one receiver may consume changes from a slot at any given time.
  3. It can only replicate DML changes, not DDL changes.
  4. Because the changes are decoded logically, it can replicate between different PostgreSQL versions and even feed other databases.
  5. Replication through a slot can also be configured as synchronous replication.
  6. A WAL (xlog) segment is deleted only after the changes in it have been consumed from the slot.
  7. It does not need any extra overhead process (such as triggers) on the server, as it reads changes from the WAL files.
  8. It is a gradual step towards multi-master replication.
  9. Connections that stream from a replication slot are counted towards max_connections.
  10. Process - logical decoding pulls the changes out of the WAL. The decoder needs access to pg_catalog, as it looks up various information there. The DML changes are decoded in the decoder, and an output plugin attached to the decoder reads the decoded changes and sends statements to the slave server. The plugin can be customized for our own use; the same mechanism can feed an analytical system as well as logical replication.
  11. We should make sure replication slots (max_replication_slots), WAL senders (max_wal_senders) and WAL receivers are enabled on the master and slave.
  12. A sketch of the master settings is given first, followed by a worked example.
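
A minimal sketch of the master-side settings needed before creating a logical replication slot (assuming PostgreSQL 9.4 or later; the values are illustrative):

postgresql.conf (master):
     wal_level = logical
     max_replication_slots = 4
     max_wal_senders = 4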


postgres=# -- Create a slot named 'regression_slot' using the output plugin 'test_decoding'

postgres=# SELECT * FROM pg_create_logical_replication_slot('regression_slot', 'test_decoding');

   slot_name    | xlog_position

-----------------+---------------

regression_slot | 0/16B1970

(1 row)

postgres=# SELECT * FROM pg_replication_slots;

   slot_name    |    plugin     | slot_type | datoid | database | active |  xmin  | catalog_xmin | restart_lsn

-----------------+---------------+-----------+--------+----------+--------+--------+--------------+-------------
regression_slot | test_decoding | logical   |  12052 | postgres | f      |        |          684 | 0/16A4408
(1 row)

postgres=# -- There are no changes to see yet
postgres=# SELECT * FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL);
location | xid | data
----------+-----+------
(0 rows)

postgres=# CREATE TABLE data(id serial primary key, data text);
CREATE TABLE

postgres=# -- DDL isn't replicated, so all you'll see is the transaction
postgres=# SELECT * FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL);
location  | xid |    data
-----------+-----+------------
0/16D5D48 | 688 | BEGIN 688
0/16E0380 | 688 | COMMIT 688
(2 rows)

postgres=# -- Once changes are read, they're consumed and not emitted
postgres=# -- in a subsequent call:
postgres=# SELECT * FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL);
location | xid | data
----------+-----+------
(0 rows)

postgres=# BEGIN;
postgres=# INSERT INTO data(data) VALUES('1');
postgres=# INSERT INTO data(data) VALUES('2');
postgres=# COMMIT;

postgres=# SELECT * FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL);
location  | xid |                     data
-----------+-----+-----------------------------------------------
0/16E0478 | 689 | BEGIN 689
0/16E0478 | 689 | table public.data: INSERT: id[integer]:1 data[text]:'1'
0/16E0580 | 689 | table public.data: INSERT: id[integer]:2 data[text]:'2'
0/16E0650 | 689 | COMMIT 689
(4 rows)

postgres=# INSERT INTO data(data) VALUES('3');

postgres=# -- You can also peek ahead in the change stream without consuming changes
postgres=# SELECT * FROM pg_logical_slot_peek_changes('regression_slot', NULL, NULL);
location  | xid |                     data
-----------+-----+-----------------------------------------------
0/16E09C0 | 690 | BEGIN 690
0/16E09C0 | 690 | table public.data: INSERT: id[integer]:3 data[text]:'3'
0/16E0B90 | 690 | COMMIT 690
(3 rows)

postgres=# -- The next call to pg_logical_slot_peek_changes() returns the same changes again
postgres=# SELECT * FROM pg_logical_slot_peek_changes('regression_slot', NULL, NULL);
location  | xid |                     data
-----------+-----+-----------------------------------------------
0/16E09C0 | 690 | BEGIN 690
0/16E09C0 | 690 | table public.data: INSERT: id[integer]:3 data[text]:'3'
0/16E0B90 | 690 | COMMIT 690
(3 rows)

postgres=# -- options can be passed to output plugin, to influence the formatting
postgres=# SELECT * FROM pg_logical_slot_peek_changes('regression_slot', NULL, NULL, 'include-timestamp', 'on');
location  | xid |                     data
-----------+-----+-----------------------------------------------
0/16E09C0 | 690 | BEGIN 690
0/16E09C0 | 690 | table public.data: INSERT: id[integer]:3 data[text]:'3'
0/16E0B90 | 690 | COMMIT 690 (at 2014-02-27 16:41:51.863092+01)
(3 rows)

postgres=# -- Remember to destroy a slot you no longer need to stop it consuming
postgres=# -- server resources:
postgres=# SELECT pg_drop_replication_slot('regression_slot');
pg_drop_replication_slot
-----------------------

(1 row)
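
The same change stream can also be consumed from the command line with pg_recvlogical (a sketch, assuming PostgreSQL 9.4 or later and the same test_decoding plugin; the slot name demo_slot is illustrative):

     pg_recvlogical -d postgres --slot demo_slot --create-slot --plugin test_decoding
     pg_recvlogical -d postgres --slot demo_slot --start -f -      # stream decoded changes to stdout
     pg_recvlogical -d postgres --slot demo_slot --drop-slot       # drop the slot when finished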



Physical Replication:

  1. Streaming Replication
  2. Hot Standby
  3. Warm Standby
  4. Log shipping

Streaming Replication:

There should always be a real-time copy of the data so that it stays safe even if the main server fails. Streaming replication is one of the processes that continuously syncs data in real time to a different server. The main server is called the master, and the server the data is synced to is called the slave.

The wal_sender process keeps track of changed WAL blocks and sends them to the replica server, where the wal_receiver process makes sure they are applied to the database. Streaming replication can replicate changes practically row by row, since the process reads them from the xlog.

Setting up Streaming Replication:


Pre-Requisites:
  1. The slave's data directory should be empty
  2. Same PostgreSQL version (9.2.2)
  3. Same OS - Linux (Ubuntu)
  4. Same architecture - 32-bit
Steps:

1) Set up SSH keys so that the master and the slave can connect to each other without a password.
 
   a) As the root user, create a key pair on the master and on the slave
            Master: ssh-keygen -t rsa
            Slave: ssh-keygen -t rsa
   b) Now copy the key from master to slave and from slave to master
             Master: vi authorized_keys
                        cat id_rsa.pub from the slave and paste it into authorized_keys
             Slave: Do the same thing on the slave with the master's id_rsa.pub

2) Change the permission of the authorized_keys file using the below command -
      [root@linux] chmod 600 authorized_keys

3) We need to repeat the above steps to set up the keys as the postgres user: if server A needs to connect to server B as postgres, then the public key of server A must be copied into authorized_keys on server B. Permissions are very important; without them it won't work. Also make sure known_hosts is cleared up before/after the ssh-keygen setup.
       .ssh permission - 700
       authorized_keys - 640

4) Check whether the master can connect to the slave without a password:
         ssh postgres@192.168.175.3
     If the connection still fails with an error, reset the postgres user's OS password:
         passwd postgres

5) Create a directory named wallogs under / on the slave and change its ownership:
      chown -R postgres wallogs

6) Change postgresql.conf (master):
     archive_mode = on
     archive_command = 'ssh slave_host test -d /wallogs && rsync -a %p slave_host:/wallogs/%f'
      (it SSHes to the slave, tests that /wallogs exists, and if everything is OK copies the WAL file to the slave's /wallogs)
      wal_level = hot_standby
      max_wal_senders = 2
      wal_keep_segments = 30

7) Restart the master
    service postgresql restart

8) We can switch the WAL file to check that log files are getting created and shipped.
    select pg_switch_xlog();

9) Change in pg_hba.conf
    type    database      user    address (slave)          method
    host    all           all     192.168.175.196/32       trust
    host    replication   all     192.168.175.196/32       trust
     Change all md5 entries to trust.

10) Next, we will run commands on the slave. We won't touch the master.

11) Keep a backup of postgresql.conf and pg_hba.conf in the home directory.

12) As root on the slave:
     rm /tmp/data_pgsql.trigger.5432
     /etc/init.d/postgresql stop
     rm /usr/local/pgsql/data/ -rf
     echo "SELECT pg_start_backup('`hostname``/bin/date '+%Y-%m-%d-%T'`');" | psql -h master_host -U postgres
     rsync -avz   postgres@master_host:/usr/local/pgsql/data/  /usr/local/pgsql/data/
     su - postgres -c "/usr/local/pgsql/bin/psql -h master_host -c 'SELECT pg_stop_backup();'"
     rm /usr/local/pgsql/data/pg_xlog/archive_status/*
     rm /usr/local/pgsql/data/pg_xlog/*

13) Change in postgresql.conf on the slave
      hot_standby = on

14) Create recovery.conf
      standby_mode = 'on'
      primary_conninfo = 'host=master_host port=5432 user=postgres'
      restore_command = 'cp /wallogs/%f %p'
      trigger_file = '/tmp/pgsql.trigger'
      archive_cleanup_command = 'pg_archivecleanup /wallogs/ps_admin_juliet %r'

15) Remove the postmaster.pid before restarting the slave server.

16) Restart the slave and check whether it is working.

17) Monitoring: pg_stat_replication

     Large differences between pg_current_xlog_location and sent_location field might indicate that the master server is under heavy load, while differences between sent_location and pg_last_xlog_receive_location on the standby might indicate network delay, or that the standby is under heavy load.
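
A minimal sketch of a lag check run on the master, comparing the current WAL write position against what each standby has received and replayed (pg_xlog_location_diff and these column names are the 9.x names; they were renamed in PostgreSQL 10):

postgres=# SELECT application_name,
                  pg_xlog_location_diff(pg_current_xlog_location(), sent_location)   AS send_lag_bytes,
                  pg_xlog_location_diff(pg_current_xlog_location(), replay_location) AS replay_lag_bytes
             FROM pg_stat_replication;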

Recovery.conf File:

standby_mode = 'on'
primary_conninfo = 'host=master_host port=5432 user=postgres'
restore_command = 'cp /wallogs/%f %p'
trigger_file = '/tmp/pgsql.trigger'
archive_cleanup_command = 'pg_archivecleanup /wallogs/ps_admin_juliet %r'
recovery_target_name = recover up to a named restore point created with pg_create_restore_point()
recovery_target_time = recover up to a specific point in time
recovery_target_xid = recover up to a specific transaction ID
recovery_target_inclusive = true (default) includes the recovery target; false stops just before it
recovery_target_timeline = recover along a specific timeline
pause_at_recovery_target = whether recovery should pause when the recovery target is reached
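
As a sketch of how a named recovery target is used (the restore point name before_fosdem is illustrative): create a restore point on the master before a risky change, then reference it during recovery:

postgres=# SELECT pg_create_restore_point('before_fosdem');

recovery.conf:
      restore_command = 'cp /wallogs/%f %p'
      recovery_target_name = 'before_fosdem'
      recovery_target_inclusive = true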

What happens at the end of recovery?

  1. End of recovery means the point where the database opens up for writing
  2. New timeline is chosen
  3. A timeline history file is written
  4. The partial last WAL file on the previous timeline is copied with the new timeline's ID
  5. A checkpoint record is written on the new timeline
Example:

LOG: database system was interrupted; last known up at 2013-01-30
21:45:14 EET
LOG: starting archive recovery
LOG: redo starts at 13/E00000C8
LOG: could not open file "pg_xlog/0000000100000013000000E4": No
such file or directory
LOG: redo done at 13/E3D389A0
LOG: last completed transaction was at log time 2013-01-30 21:45:20+02
LOG: selected new timeline ID: 2
LOG: archive recovery complete
LOG: database system is ready to accept connections

Partial File:

0000000100000013000000E4
------------------- - - - - - - - -  - - -      - - - - ( Unused Portion)
0000000200000013000000E4
------------------------------------           --------(Merged Portion)
Both files are partial at the point of the timeline switch; as the new timeline continues, the partial file is filled and becomes a normal WAL file.

TimeLine History File:

After timeline switch there is a history file in between
0000000100000013000000E4
0000000100000013000000E5
00000002.history
0000000200000013000000E3
0000000200000013000000E4

cat 00000002.history
13/E4000000  no recovery target specified
13/ED000090  at restore point "before FOSDEM"

Promoting a standby also switches timeline.
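
A standby can be promoted either by creating the trigger file named in recovery.conf or with pg_ctl (a sketch reusing the paths from the setup above):

      touch /tmp/pgsql.trigger
      # or
      /usr/local/pgsql/bin/pg_ctl promote -D /usr/local/pgsql/data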

Pg_receivexlog:

  1. pg_receivexlog streams the transaction log from a running PostgreSQL cluster.
  2. This tool uses the streaming replication protocol to stream transaction logs to a local directory.
  3. These files can be used for PITR.
  4. Transaction logs are streamed in real time, so pg_receivexlog can be used instead of archive_command.
  5. EX: pg_receivexlog -h localhost -D /usr/local/wal_archive
  6. It can be used as an alternative to archiving or alongside streaming replication.
  7. max_wal_senders must be at least 1, and an entry must be added in pg_hba.conf allowing the client IP to connect as a replication user; then it will work. A sketch of this configuration follows the list.
  8. It uses the replication protocol.
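
A minimal sketch of the server-side settings that allow pg_receivexlog to connect (addresses and the archive directory are illustrative):

postgresql.conf (server being archived):
      wal_level = archive
      max_wal_senders = 1

pg_hba.conf (allow the archiving host as a replication connection):
      host    replication    postgres    192.168.0.20/32    trust

On the archiving host:
      pg_receivexlog -h 192.168.0.10 -U postgres -D /usr/local/wal_archive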

Replication Troubleshooting:

  1. Generally we look into the view pg_stat_replication.
  2. The below query, run on the standby, reports the replay lag in seconds:
  3. psql>select extract( 'epoch' from now() - pg_last_xact_replay_timestamp())::int ;
  4. A few more queries are below:
$ psql -c "SELECT pg_current_xlog_location()" -h192.168.0.10 (primary host)
 pg_current_xlog_location 
--------------------------
 0/2000000
(1 row)

$ psql -c "select pg_last_xlog_receive_location()" -h192.168.0.20 (standby host)
 pg_last_xlog_receive_location 
-------------------------------
 0/2000000
(1 row)

$ psql -c "select pg_last_xlog_replay_location()" -h192.168.0.20 (standby host)
 pg_last_xlog_replay_location 
------------------------------
 0/2000000
(1 row)

Pg_stat_replication:

postgres=# select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid              | 14447
usesysid         | 10
usename          | postgres
application_name | walreceiver
client_addr      | 172.31.28.103
client_hostname  |
client_port      | 34792
backend_start    | 2016-11-29 00:09:15.958171-05
backend_xmin     |
state            | streaming
sent_location    | 0/2000060
write_location   | 0/2000060
flush_location   | 0/2000060
replay_location  | 0/2000060
sync_priority    | 0
sync_state       | async


Column           | Type                     | Description
-----------------+--------------------------+--------------------------------------------------------------
pid              | integer                  | Process ID of a WAL sender process
usesysid         | oid                      | OID of the user logged into this WAL sender process
usename          | name                     | Name of the user logged into this WAL sender process
application_name | text                     | Name of the application that is connected to this WAL sender
client_addr      | inet                     | IP address of the client connected to this WAL sender. If this field is null, it indicates that the client is connected via a Unix socket on the server machine.
client_hostname  | text                     | Host name of the connected client, as reported by a reverse DNS lookup of client_addr. This field will only be non-null for IP connections, and only when log_hostname is enabled.
client_port      | integer                  | TCP port number that the client is using for communication with this WAL sender, or -1 if a Unix socket is used
backend_start    | timestamp with time zone | Time when this process was started, i.e., when the client connected to this WAL sender
state            | text                     | Current WAL sender state
sent_location    | text                     | Last transaction log position sent on this connection
write_location   | text                     | Last transaction log position written to disk by this standby server
flush_location   | text                     | Last transaction log position flushed to disk by this standby server
replay_location  | text                     | Last transaction log position replayed into the database on this standby server
sync_priority    | integer                  | Priority of this standby server for being chosen as the synchronous standby
sync_state       | text                     | Synchronous state of this standby server

Cascading Replication:

  1. Allows a standby server to accept connections and stream WAL to other standby servers, acting like a relay. This helps reduce the number of direct connections to the master.
  2. A standby acting as both a receiver and a sender is known as a cascading standby.
  3. Standby servers that are closer to the master are called upstream servers, and those further away are called downstream servers.
  4. Cascading replication is presently asynchronous.
  5. We need to set primary_conninfo in the downstream standby to point to the cascading standby, as in the sketch below.
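
A minimal sketch of the two ends of a cascaded setup (host names are illustrative): the cascading standby must itself be able to send WAL, and the downstream standby points its primary_conninfo at the cascading standby instead of the master:

postgresql.conf (cascading standby):
      hot_standby = on
      max_wal_senders = 2

recovery.conf (downstream standby):
      standby_mode = 'on'
      primary_conninfo = 'host=cascading_standby_host port=5432 user=postgres'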

Log Shipping/Warm Standby:

  1. Log shipping is asynchronous.
  2. wal_level = archive (master).
  3. hot_standby = off (slave).
  4. Clients cannot connect to the standby; connections are refused with "the database system is starting up".
  5. Also known as Warm Standby. A configuration sketch follows the list.
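
A minimal warm-standby sketch (reusing the archive_command and paths from the setup above; values are illustrative):

postgresql.conf (master):
      wal_level = archive
      archive_mode = on
      archive_command = 'ssh slave_host test -d /wallogs && rsync -a %p slave_host:/wallogs/%f'

postgresql.conf (slave):
      hot_standby = off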

Hot-Standby:

  1. wal_level = hot_standby (master).
  2. hot_standby = on (slave).
  3. Clients can connect to the standby for read-only operations.
  4. With archive-based replay the slave can be up to one archived WAL segment behind the master, since it reads from the archive location. A configuration sketch follows the list.
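
A minimal hot-standby sketch (values are illustrative); the only configuration differences from the warm standby above are the WAL level on the master and the hot_standby switch on the slave:

postgresql.conf (master):
      wal_level = hot_standby

postgresql.conf (slave):
      hot_standby = on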

