Wednesday 15 August 2018

Installing and Configuring PgBadger in Linux

PgBadger is one of the best open-source tools for Postgres to keep track of slow queries, the most frequent queries, per-user analysis and much more. In this post I walk through the detailed steps to set up pgBadger on Linux and view the reports over the web.

pgBadger works by parsing the Postgres log file, so we first need to adjust the logging configuration so that the right information is logged and pgBadger can parse it correctly. Below are the changes to make in postgresql.conf -

  • log_connections = on
  • log_disconnections = on
  • log_duration = off
  • log_statement = none
  • log_hostname = on
  • log_line_prefix = '%t [%p]: [%l-1] db=%d, %u , app=%a, %h '
  • log_min_duration_statement = 0
  • log_checkpoints = on
  • lc_messages = 'C'
Below are the configuration changes for log file size and rotation -

  • log_truncate_on_rotation = off
  • log_rotation_age = 1d 
  • log_rotation_size = 0
After making all the above changes, verify that the log file is being generated with the new format. Once verified, it is time to download pgBadger from the official site below and configure it -

Link - https://github.com/dalibo/pgbadger/releases (Version 9.2)

Steps for installing pgBadger on Linux (RHEL in my case):
  • Download the latest pgBadger using wget - wget https://github.com/dalibo/pgbadger/archive/v9.2.tar.gz
  • Untar the tarball - tar xzf v9.2.tar.gz
  • Build and install from source using the below steps -
  • A. cd pgbadger-9.2
  • B. perl Makefile.PL
  • C. make
  • D. make install (as root, or with sudo)
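
Once the install finishes, a quick sanity check confirms the script landed on the PATH (make install normally places it under /usr/local/bin):

pgbadger --version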
Once pgBadger is installed, we can generate the report either in text format or in HTML format for viewing over the web. We will see both processes in detail -

Text Format - 

In this process we simply parse the log file and generate a report in plain text. The text report stays on the Linux host, and we can view it using less, more or vi. Below is the command for it -

pgbadger /data/postgres/data/pg_log/postgresql-2018-07-02_075351.log --outfile /tmp/queries_4.txt
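
If the server keeps several rotated log files, pgBadger also accepts a file glob and can parse them in parallel with the -j option. A small sketch, assuming four CPU cores and my log directory:

pgbadger -j 4 /data/postgres/data/pg_log/postgresql-2018-07-*.log --outfile /tmp/queries_week.txt
less /tmp/queries_week.txt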

HTML Format -

This format takes a few more steps. We need to make sure Apache (httpd) is installed and that all reports are generated under its document root so that we can view them over the web. Below are the detailed steps -
  • Firstly, install Apache on the Linux system using yum or rpm. Along with it there are a few more dependencies to install: gcc, apr and apr-util.
  • If Apache is not available in the repository, install httpd and the above dependencies instead.
  • Generate the reports in the Apache document root, i.e. "/var/www/html", so that Apache can serve the HTML files.
  • Below is the command to generate the HTML report - pgbadger --incremental --outdir /var/www/html /data/postgres/data/pg_log/postgresql-2018-07-02_075351.log
  • Most importantly, put the above steps in cron so the report is generated every day, as shown in the sketch after this list.
  • You can then view the reports at http://127.0.0.1:80 on localhost, or at the IP of the server where pgBadger is installed.
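
Here is a minimal sketch of how I would wire this up, assuming httpd from yum and the pgbadger script in /usr/local/bin; -I is short for --incremental and -q keeps the cron output quiet:

# One-time setup: install and start Apache
yum install -y httpd
service httpd start

# Daily cron entry: parse all rotated logs, write incremental HTML reports under the Apache root
0 4 * * * /usr/local/bin/pgbadger -I -q -O /var/www/html /data/postgres/data/pg_log/postgresql-*.log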

I have installed pgBadger many times and have good hands-on experience with it. So please comment below if you face any issue and I will be happy to help.

Simple Logical PostgreSQL Backup Script (Pg_dump)

Below is a script I wrote to make life easier for fellow DBAs. You only need to change a few things to match your environment and the script is ready to use. If you want a modified version, leave a comment and I can work on it and share it with you. The script takes a logical dump; I did not include much beyond the essentials, but as I said you can always tweak it. It is a working script and it makes my life easy. Note that it is meant for restores, not for point-in-time recovery; for that there is a different script using base backups.

Below are some of the details of the script -

1. I declare the backup user, backup directory, date, port and other small pieces of information that get repeated throughout the script.

2. The first condition checks whether the backup directory exists; if not, it creates one.

3. I also create a table to record the details of every backup, so that I can query it and find out whether a backup succeeded or failed. It makes for a better audit history.

4. The script builds a list of all databases to back up, then takes the backup of each one in a for loop and records whether it succeeded.

5. Lastly, it takes a backup of the global objects (roles), which is very important, and makes an entry in the table for that too.

6. A sample of what the backup log will look like is shown below; it can be used as a reference if you need a similar backup strategy.

Here goes the script -

#!/bin/bash
## Backup script for Postgres.
## Takes a logical backup of each database plus the global objects (roles).
## Suitable when the DB size is small and you do not need WAL archiving.
## Author - rahulsaha0309@gmail.com

set -x

export PGPASSWORD='PassWord!'
DATE=$(date +"%Y-%m-%d")
SIMPLE_DATE=$(date +"%Y%m%d")
BACKUP_DIRECTORY=/tmp/postgres_backup/$SIMPLE_DATE
BACKUP_USER=postgres
PG_DUMP=/data/postgres/bin/pg_dump
PORT=5432
PSQL=/data/postgres/bin/psql
DUMPALL=/data/postgres/bin/pg_dumpall


# Create the dated backup directory if it does not already exist
if [ ! -d "$BACKUP_DIRECTORY" ]
then
    /bin/mkdir -p "$BACKUP_DIRECTORY"
else
    echo "Backup directory exists"
fi

# Create the audit table on the first run (-tA returns a bare t/f)
TABLE_EXIST=$($PSQL -d audit -tA -c "SELECT EXISTS (SELECT 1 FROM information_schema.tables WHERE table_schema = 'public' AND table_name = 'backup_logs')")

if [ "$TABLE_EXIST" = "f" ]
then
    $PSQL -d audit -c "create table backup_logs(date date, database varchar, status varchar, backup_location varchar)"
else
    echo "table exists"
fi


# List every database except postgres itself.
# Note: template0 has datallowconn = false, so its dump will always fail;
# filter on datallowconn too if you want to skip it.
LIST_DB=$($PSQL -tA -d postgres -c "select datname from pg_database where datname <> 'postgres';")

array1=($LIST_DB)
echo "${array1[@]}"

# Dump each database and record success/failure in the audit table
for database in "${array1[@]}"
do
    echo "Taking backup of database $database"
    $PG_DUMP -U "$BACKUP_USER" -p "$PORT" -d "$database" -f "$BACKUP_DIRECTORY/backup_$database.sql"

    if [ $? -eq 0 ]
    then
        $PSQL -d audit -c "insert into backup_logs values(current_timestamp, '$database', 'Success', '$BACKUP_DIRECTORY/backup_$database.sql')"
    else
        $PSQL -d audit -c "insert into backup_logs values(current_timestamp, '$database', 'Failed', '$BACKUP_DIRECTORY/backup_$database.sql')"
    fi
done

echo "BACKUP_GLOBAL_ROLES"

# Dump global objects (roles) and make an entry in the audit table for that too
$DUMPALL -U "$BACKUP_USER" -p "$PORT" -g -f "$BACKUP_DIRECTORY/backup_roles.sql"

if [ $? -eq 0 ]
then
    $PSQL -d audit -c "insert into backup_logs values(current_timestamp, 'global_roles', 'Success', '$BACKUP_DIRECTORY/backup_roles.sql')"
else
    $PSQL -d audit -c "insert into backup_logs values(current_timestamp, 'global_roles', 'Failed', '$BACKUP_DIRECTORY/backup_roles.sql')"
fi


Output in the audit table showing whether each backup succeeded (note that template0 fails because it does not allow connections):

audit=# select * from backup_logs;
    date    | database  | status  |                  backup_location
------------+-----------+---------+----------------------------------------------------
 2018-07-24 | template1 | Success | /tmp/postgres_backup/20180724/backup_template1.sql
 2018-07-24 | template0 | Failed  | /tmp/postgres_backup/20180724/backup_template0.sql
 2018-07-24 | test      | Success | /tmp/postgres_backup/20180724/backup_test.sql
 2018-07-24 | test1     | Success | /tmp/postgres_backup/20180724/backup_test1.sql
 2018-07-24 | test2     | Success | /tmp/postgres_backup/20180724/backup_test2.sql
 2018-07-24 | new       | Success | /tmp/postgres_backup/20180724/backup_new.sql
 2018-07-24 | audit     | Success | /tmp/postgres_backup/20180724/backup_audit.sql
(7 rows)
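
Since pg_dump in plain format produces SQL files, a restore is simply a matter of feeding the dump back through psql. A minimal sketch, using the database names and paths from the sample output above:

# Restore global roles first, then recreate and load an individual database
psql -p 5432 -d postgres -f /tmp/postgres_backup/20180724/backup_roles.sql
createdb -p 5432 test
psql -p 5432 -d test -f /tmp/postgres_backup/20180724/backup_test.sql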


Cron entry (note that the dated directory for the log file must already exist, otherwise the redirection fails):
##Scheduling Postgres Backup - Logical pg_dump backup
21 7 * * * /opt/postgresql/scripts/backup_postgres.sh > /data/postgres_backup/`/bin/date +'\%Y\%m\%d'`/log_file/postgres_backup.out 2>&1

An alternative entry that logs to a fixed directory, as it appears in crontab -l:
-bash-4.1$ crontab -l
##Scheduling Postgres Backup - Logical pg_dump backup
5 21 * * * /opt/postgresql/scripts/backup_postgres.sh > /data/postgres_backup/backup_log_file/`date +\%Y\%m\%d`-cron.out 2>&1

Parameters That Need a Restart in PostgreSQL

We often come across a scenario where we want to know whether changing a parameter needs a restart or just a reload. This article describes how to find out. Below are a few parameters that need a restart.

  1. archive_mode
  2. wal_level
  3. port
  4. max_connections
  5. shared_buffers
  6. max_wal_senders


And some of the parameters below only need a reload:
  1. archive_command
  2. log_min_duration_statement
  3. work_mem
  4. maintenance_work_mem
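
To see the difference in practice, here is a quick illustration with ALTER SYSTEM (the archive_command value is just an example path): archive_command becomes active on a reload, while port stays pending until a restart:

postgres=# ALTER SYSTEM SET archive_command = 'cp %p /archive/%f';
postgres=# SELECT pg_reload_conf();  -- sighup parameter: active immediately
postgres=# ALTER SYSTEM SET port = '5433';
postgres=# SELECT pg_reload_conf();  -- postmaster parameter: pending_restart stays true until restart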

But it is difficult to remember all the parameters, so there is a trick. Query pg_settings for the parameter's context: if the result is "postmaster", the parameter needs a restart; if it is "sighup", a reload is enough (contexts like "user" and "superuser" can even be changed per session, and also take effect on a reload when set in postgresql.conf). We can also check whether a parameter has a restart pending. Below are the example commands -

1. This command shows the context, which tells us whether the parameter needs a restart.
postgres=# select context from pg_settings where name = 'work_mem';

2. This command tells us whether a restart is still pending for the parameter (PostgreSQL 9.5+).
postgres=# select pending_restart from pg_settings where name = 'work_mem';

Below are example commands for some of the parameters -

postgres=# select context from pg_settings where name ='work_mem';
 context
---------
 user
(1 row)

postgres=# select context from pg_settings where name ='port';
  context
------------
 postmaster
(1 row)

postgres=# select context from pg_settings where name ='archive_command';
 context
---------
 sighup
(1 row)

postgres=# select context from pg_settings where name ='archive_mode';
  context
------------
 postmaster
(1 row)
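
And to list everything that is still waiting on a restart in one shot (again, the pending_restart column is available from PostgreSQL 9.5 onwards):

postgres=# select name, setting, context from pg_settings where pending_restart;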