PostgreSQL Administration

Presented by Keith Fiske / @keithf4

Senior Database Engineer @ Crunchy Data Solutions
(pg_partman, pg_monitor, pg_extractor, mimeo)



Follow along at
http://slides.keithf4.com/pg_admin_training_centos

VM Setup

  • Copy PGTraining_VM.zip from USB stick to your computer. Extract in location with at least 15GB of free space.
  • Copy relevant VirtualBox installation file for your OS & Install if necessary (5.2.30 minimum)
  • Copy Oracle_VM_VirtualBox_Extension_Pack-5.2.30.vbox-extpack and Install by double-clicking
  • Remove USB Stick
  • Import Virtual Machine Image
    • Machine -> Add... ->
    • Browse to where you extracted the zip file (DO NOT run off USB stick)
    • Double-click PGTraining_VM.vbox
  • Start up VM!
  • Login: training / postgres

Crunchy Data Solutions, Inc

  • Industry leader in providing enterprise PostgreSQL support and open source solutions
  • Crunchy Certified PostgreSQL
    • 100% Open Source PostgreSQL
    • Common Criteria EAL 2+ Certified
  • We're hiring!

What is PostgreSQL?

  • Open Source RDBMS
  • Started at UC Berkeley in 1986, open sourced in 1996
  • BSD-type License
  • Follows SQL Standard very closely
  • Third-party Plugin Support
    • Procedural Languages
      • C, Java, Python, Perl, JavaScript, PHP, R, Ruby, etc
    • Extensions
    • Background Workers (>=9.3)
  • Massive online community (mailing lists, irc, conferences)

Training VM

  • VirtualBox 5.2.30 + Extension Pack
  • Login: training / postgres
  • training user has sudo
    • All commands shown with "sudo" are run by training user
  • Applications -> System Tools -> Terminal
  • Internet should work within VM (if it works externally)
  • Clipboard should work between host & VM
  • vim & nano for CLI file editing. For GUI:
  • 
                                sudo -s
                                gedit &
                            
    then close that terminal to avoid spam
  • Take snapshot now & as many times as you'd like along the way

Installing PostgreSQL

  • Setup yum repo for your desired version
  • 
                                sudo yum install https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm
                            
  • Install PostgreSQL 11.x & Contrib modules
  • 
                                    sudo yum install postgresql11-server postgresql11-contrib
  • No clusters automatically created and no initial automatic startup (Red Hat policy)
  • 
                                sudo PGSETUP_INITDB_OPTIONS="--data-checksums" /usr/pgsql-11/bin/postgresql-11-setup initdb
                                sudo systemctl enable postgresql-11
                                sudo systemctl start postgresql-11
                                systemctl status postgresql-11
                                ...
                            

Initial Login

  • Become postgres system user
  • 
                                sudo -iu postgres
  • Log into database (default postgres role & database already created)
  • psql
    • \? to see all available commands
  • Recommend keeping second terminal opened logged in as postgres user

Configuration - pg_hba.conf

  • Add following lines above other active ones (for later):
  • 
                                # TYPE  DATABASE       USER           ADDRESS         METHOD
                                host    replication    replication    127.0.0.1/32    md5
                                host    replication    training       127.0.0.1/32    trust 
                            
  • Only requires a reload to put new HBA settings into place
  • 
                                postgres=# select pg_reload_conf();
                                OR
                                sudo systemctl reload postgresql-11
                            
  • View current pg_hba.conf file from within database
  • 
                                SELECT * FROM pg_hba_file_rules;
                            
  • Check log file for SIGHUP
    • "log" directory in the data directory
    • Was "pg_log" in versions before PG 10
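One quick way to confirm the reload took effect, assuming the default PGDG data directory and log filename pattern used in this training:

```
grep "received SIGHUP" /var/lib/pgsql/11/data/log/postgresql-*.log
```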

Create Role & Database

  • Create a role & database for yourself
  • 
                                postgres=# CREATE ROLE training WITH LOGIN SUPERUSER;
                                postgres=# CREATE DATABASE training;
                            
  • Should now be able to log into PostgreSQL as your training user
  • Create replication role for later
  • 
                                training=# CREATE ROLE replication WITH LOGIN REPLICATION;
    • Set passwords ("password")
    • 
                                      training=# \password training
                                      training=# \password replication
                                      ...
                                  

Configuration - postgresql.conf

  • Main configuration file
  • Important initial settings review (most require restart)
    • listen_addresses (defaults to localhost only; set to server IP; * = all IPs)
    • max_connections (monitor active connections, affected by work_mem)
    • shared_buffers (8GB good starting point; leave default for training)
    • work_mem (2-5MB good starting point; allocated per sort/hash operation, so more RAM allows higher values)
    • maintenance_work_mem (1GB good starting point)
    • wal_level = replica
    • effective_cache_size (50% RAM good starting point)
    • archive_mode = on
    • archive_command = '/bin/true'
    • archive_timeout = 60
    • max_wal_senders = 10
    • wal_keep_segments = 30
    • max_replication_slots = 10
    • hot_standby = on (setting for replica)
    • logging_collector = on
    • autovacuum_freeze_max_age = 1000000000 (only set this high if monitoring for wraparound. So monitor for wraparound and set it this high.)
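Most of these can also be set with ALTER SYSTEM instead of editing postgresql.conf by hand — a quick sketch using one of the reload-able settings above:

```sql
ALTER SYSTEM SET archive_timeout = 60;   -- written to postgresql.auto.conf
SELECT pg_reload_conf();                 -- applies reload-able settings
SHOW archive_timeout;
```

Settings marked as requiring a restart (shared_buffers, wal_level, etc) still need one even when set this way.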

Vacuums & Freezing

  • http://www.postgresql.org/docs/current/static/routine-vacuuming.html
  • Multi-Version Concurrency Control (MVCC)
    • Updated/deleted rows are not actually removed, just marked unavailable. This leads to bloat.
    • Vacuum marks unavailable rows as re-usable space.
    • VACUUM FULL recovers disk space but locks table for full table rewrite
    • pg_repack - Extension to reclaim disk space with minimal lock
    • Don’t overuse VACUUM FULL or pg_repack.
      • Reusable space can be more efficient than [re]allocating new pages
  • Transaction ID Exhaustion (Wraparound)
    • Every row has transaction id (XID) value
    • Every new write transaction increments cluster-wide XID
    • Determines visibility to current transactions
    • 32-bit number, so wraparound is possible after 4 billion transactions
      • 2 billion transactions newer than current & 2 billion older
    • VACUUM marks rows as frozen, setting the XID such that it will be visible to all future transactions
    • A reserved FrozenXID is always considered older than all normal XIDs
      • VACUUM FREEZE rarely-written tables so later vacuums can skip all-frozen pages and run more efficiently (9.6+ feature)
    • autovacuum_freeze_max_age - when a table's XID age reaches this, a VACUUM is forced (even if autovacuum is turned off).
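To see how close tables in the current database are to that forced-vacuum threshold, a query along these lines can help (a sketch; run it per database):

```sql
SELECT c.relname
    , age(c.relfrozenxid) AS xid_age
    , s.n_dead_tup
FROM pg_catalog.pg_class c
JOIN pg_catalog.pg_stat_user_tables s ON s.relid = c.oid
ORDER BY xid_age DESC
LIMIT 10;
```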

Replication - Basic

  • Ensure line exists in master pg_hba.conf and reload master
  • 
                                host    replication    replication    127.0.0.1/32    md5
                            
  • Create a replication slot
  • 
                                training=# SELECT * FROM pg_create_physical_replication_slot('training_replica');
                                training=# SELECT * FROM pg_replication_slots;
                            
  • Use pg_basebackup to create copy of master (as postgres system user)
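A sketch of that pg_basebackup call, targeting the replica data directory used by the systemd service configured on the following slides (-S uses the slot created above; it will prompt for the replication role's password):

```
sudo -iu postgres
pg_basebackup -h 127.0.0.1 -U replication -D /var/lib/pgsql/11/replica \
    -S training_replica -X stream -P -v
```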

Replication - Basic

  • Edit replica postgresql.conf
  • 
                                port = 5444 
                            
  • Edit replica recovery.conf
  • 
                                standby_mode = 'on'
                                primary_conninfo = 'host=127.0.0.1 port=5432 user=replication password=password'
                                primary_slot_name = 'training_replica'
                                recovery_target_timeline='latest'
                                ...
                            

Replication - Basic

  • Production method for CentOS
  • Create new systemd service for replica
  • 
                                sudo cp /usr/lib/systemd/system/postgresql-11.service /etc/systemd/system/postgresql-11-replica.service
                            
  • See systemd service file locations
  • Edit config file to set replica data directory (as root):
  • 
                                Environment=PGDATA=/var/lib/pgsql/11/replica/
                            
  • Register service, start up replica & check service status
  • 
                                sudo systemctl daemon-reload
                                sudo systemctl enable postgresql-11-replica
                                sudo systemctl start postgresql-11-replica
                                sudo systemctl status postgresql-11-replica
                                ...
                            

Replication - Basic

  • Check replica log to ensure it connected
  • 
                                started streaming WAL from primary at 0/3000000 on timeline 1
                            
  • Check from master
  • 
                                training=# SELECT * FROM pg_stat_replication;
                                training=# SELECT * FROM pg_replication_slots;
                            
  • Create an object and make sure it appears on replica (as training user)
  • 
                                training=# CREATE TABLE testing (
                                    id bigint primary key, 
                                    stuff text, 
                                    inserted_at timestamptz default now());
    
                                training=# INSERT INTO testing (id) values (generate_series(1,1000));
                            
  • Connect to replica
  • 
                                psql -p 5444
    
                                training=# SELECT pg_is_in_recovery();
                                ...
                            

Backup - pg_dump/all

  • pg_dumpall
    • Dumps all databases in cluster in plaintext format
    • Can dump only cluster-wide data: roles, tablespaces
    • 
                                          pg_dumpall -g -f /home/training/globals.sql
                                      
    • Use psql to restore backup
    • Restores entire cluster. No object filtering options.
  • pg_dump
    • Backs up individual databases in cluster
    • Does not back up roles or tablespaces, but does back up object privileges of those roles.
    • Provides binary backup format that can make restore easier
    • Use pg_restore to restore binary backups
    • Can restore individual schemas or tables instead of entire cluster
    • --section option can break up dump into stages
    • 
                                          pg_dump -Fc training -f /home/training/training.pgr -v
                                          ...
                                      

Restore - pg_restore

  • As training user, create new database in home folder
  • 
                                    mkdir /home/training/mydb
                                    chmod 700 /home/training/mydb
                                    /usr/pgsql-11/bin/initdb --data-checksums -D /home/training/mydb
                                
  • Edit postgresql.conf:
  • 
                                        port = 5888
                                        unix_socket_directories = '/home/training/mydb'
                                
  • Start the DB
  • 
                                    /usr/pgsql-11/bin/pg_ctl -D /home/training/mydb start
                                
  • initdb automatically created the role "training" (named after the system user that ran it)
  • psql used to restore pg_dumpall or plaintext version of pg_dump
    • Restore roles before restoring database so permissions are set properly
    
                                    psql -h localhost -p 5888 -d postgres -f /home/training/globals.sql -a
                                
  • pg_restore used to restore binary dump from pg_dump
  • 
                                    psql -h localhost -p 5888 postgres
                                    postgres=# create database new_training;
                                    pg_restore -h localhost -p 5888 -d new_training -v /home/training/training.pgr
                                    ...
                                

Restore - pg_restore

  • Smaller backups & more flexible restore
  • Must recreate all indexes & constraints
  • Stop this instance and delete it
  • 
                                /usr/pgsql-11/bin/pg_ctl stop -m fast -D /home/training/mydb
                                rm -r /home/training/mydb
                                ...
                            

Backup - File System (Basic)

pg_basebackup

  • We already did it! Let's do it again...
  • Create compressed backup
  • 
                                mkdir /home/training/mybackup
                                pg_basebackup -h 127.0.0.1 -D /home/training/mybackup -Ft -z -Xs -P -v
                            
  • Separate tar file is made for necessary WAL files
  • If additional tablespaces, they go in their own tar file
  • Do not set -R for normal backups
  • -Xs (--wal-method=stream) is default for PG10+
    • -Xf (fetch) only works if all WAL generated during backup run exist at the end of the backup
    • -Xf can require extremely high wal_keep_segments on a busy database
    • -Xs requires one additional WAL sender to be available (max_wal_senders)

Restore - File System (Basic)

  • We already did it! Let's do it again...
  • 
                                    mkdir /home/training/mydb2
                                    chmod 700 /home/training/mydb2
                                
    
                                    tar xvzf /home/training/mybackup/base.tar.gz -C /home/training/mydb2/
                                    tar xvzf /home/training/mybackup/pg_wal.tar.gz -C /home/training/mydb2/pg_wal
                                
    
                                    EDIT postgresql.conf: 
                                        port = 5888
                                        unix_socket_directories = '/home/training/mydb2'
                                
    
                                    /usr/pgsql-11/bin/pg_ctl -D /home/training/mydb2 start
    
                                    psql -h localhost -p 5888
                                    (requires hba fix)
    
                                    psql -h /home/training/mydb2/ -p 5888
                                    ...
                                

Restore - File System (Basic)

  • MUCH faster disaster recovery
  • Less flexible restore options (all or nothing)
  • Stop this instance and delete it
  • 
                                /usr/pgsql-11/bin/pg_ctl stop -m fast -D /home/training/mydb2
                                rm -r /home/training/mydb2
                                ...
                            

Backup - File System (Advanced)

  • pgBackRest - http://pgbackrest.org
  • Point-in-Time Recovery (PITR)
  • Incremental & Differential Backups
  • Parallel Backup & Restore
  • Backup Rotation & Archive Expiration
  • Checksums
  • Many more options!

Backup - File System (Advanced)

  • Install pgBackRest
  • 
                                sudo yum install pgbackrest.x86_64
                            
  • Configure a stanza
    • Defines where PG cluster is located, how it will be backed up, archiving options, etc
    • One stanza on db server. Many stanzas on a dedicated backup machine.
    • Edit /etc/pgbackrest.conf (default location, requires root)
    • 
                                      [global]
                                      repo1-path=/var/lib/pgbackrest
                                      log-level-console=info
      
                                      [main]
                                      pg1-path=/var/lib/pgsql/11/data
                                      retention-full=2
                                  
  • Ensure repository folders exist with proper permissions. Should already be there and ready if the install worked.
  • 
                                sudo mkdir /var/lib/pgbackrest
                                sudo chmod 750 /var/lib/pgbackrest/
                                sudo chown postgres:postgres /var/lib/pgbackrest
                                sudo chown postgres:postgres /var/log/pgbackrest
                                ...
                            

Backup - File System (Advanced)

  • Set primary postgresql.conf options for WAL archiving
  • 
                                archive_command = 'pgbackrest --stanza=main archive-push %p'
                                archive_mode = on
                                max_wal_senders = 10 
                                wal_level = replica
                            
  • Normally the above is set identically on the replica, but that's not possible in this single-VM setup
  • Reload/restart to set above options
  • 
                                sudo systemctl reload postgresql-11
                            
  • Create & check the stanza
  • 
                                sudo -u postgres pgbackrest --stanza=main stanza-create
    
                                sudo -u postgres pgbackrest --stanza=main check
                                ...
                            

Backup - File System (Advanced)

  • Perform a backup
  • 
                                sudo -u postgres pgbackrest --stanza=main backup
                            
  • By default attempts an incremental backup, but since no full exists, creates a full. Run again...
  • 
                                sudo -u postgres pgbackrest --stanza=main backup
                            
  • Designate backup type with --type option (full, incr, diff)
  • 
                                sudo -u postgres pgbackrest --stanza=main --type=diff backup
                            
  • Check backup status
  • 
                                sudo -u postgres pgbackrest info
                                ...
                            

Restore - File System (Advanced)

  • pgBackRest will restore the database to the target designated in the stanza config by default.
  • In disaster recovery, this is normal. The --delta option can speed it up greatly if the original data dir is still there
  • For training, restore to a target folder:
  • 
                                sudo -iu postgres 
                                mkdir /var/lib/pgsql/11/br_restore
                                chmod 700 /var/lib/pgsql/11/br_restore
    
                                pgbackrest restore --stanza=main --db-path=/var/lib/pgsql/11/br_restore
                            
  • Change postgresql.conf port to 5911
  • TURN OFF ARCHIVE COMMAND
    • Set back to /bin/true or set mode to off
    
                               /usr/pgsql-11/bin/pg_ctl start -D /var/lib/pgsql/11/br_restore/ 
                            
    
                               /usr/pgsql-11/bin/pg_ctl stop -m fast -D /var/lib/pgsql/11/br_restore/ 
                               rm -r /var/lib/pgsql/11/br_restore
                               ...
                            

Failover

  • Stop the master
  • 
                                sudo systemctl stop postgresql-11
                                sudo systemctl status postgresql-11
                            
  • Check replica logs to see connection errors
  • 
                                /var/lib/pgsql/11/replica/log/...
                            
  • Promote replica
  • 
                                /usr/pgsql-11/bin/pg_ctl promote -D /var/lib/pgsql/11/replica
                            
  • Check logs
  • Write to database (running on port 5444)
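Verifying the promotion on the old replica might look like this (uses the testing table created earlier):

```sql
-- psql -p 5444
SELECT pg_is_in_recovery();   -- should now return false
INSERT INTO testing (id) VALUES (1001);
```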

Upgrading

  • Dump/Restore
    • Works on all versions
    • Use the pg_dump binary of the upgrade target
    • 8.4+ has parallel pg_restore
    • 9.3+ has parallel pg_dump
    • Same caveats mentioned previously (smaller dump/longer restore)
  • pg_upgrade
    • In-place upgrade of data files
    • Works only for 8.4.7+ (all previous versions must dump/restore or do staged upgrade)
    • OS w/ hard link support can greatly decrease upgrade time
      • Ex: 700GB upgrade in under 5 minutes, but...
      • Don't forget to analyze entire cluster
      • 
                                            vacuumdb --all --analyze-in-stages 
                                        
  • Logical replication (can upgrade across major versions with minimal downtime)
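A pg_upgrade sketch for reference — the old-version paths here are hypothetical (this VM only has PG 11 installed):

```
sudo -iu postgres
/usr/pgsql-11/bin/pg_upgrade \
    --old-bindir=/usr/pgsql-10/bin \
    --new-bindir=/usr/pgsql-11/bin \
    --old-datadir=/var/lib/pgsql/10/data \
    --new-datadir=/var/lib/pgsql/11/data \
    --link --check
# --check does a dry run; drop it and re-run for the real upgrade, then:
vacuumdb --all --analyze-in-stages
```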

Monitoring

  • Replication status
  • Active/Idle connections. Idle in transaction session times.
  • Transaction ID Wraparound
  • Table statistics
    • Sequential scans vs index scans
    • Insert/Update/Delete rate
  • Transactions
    • Commits vs Rollbacks
  • Database size
    • Total Table Size vs Total Index Size
  • WAL file count (expected vs current)
  • Log/query analysis - pgbadger
  • Critical functions - pg_jobmon
  • Prometheus - https://prometheus.io

Replication Status

  • Monitor from replica (seconds behind since last WAL replay)
  • 
                                    SELECT extract(epoch from now() - pg_last_xact_replay_timestamp()) AS replica_lag;
                                
  • Monitor from master if streaming (bytes behind)
  • 
                                    SELECT client_hostname
                                        , client_addr
                                        , pg_wal_lsn_diff(pg_stat_replication.sent_lsn, 
                                            pg_stat_replication.replay_lsn) AS byte_lag
                                    FROM pg_stat_replication;
                                

Connections

  • pg_stat_activity shows all column data only to superusers; other roles see other sessions' rows with the sensitive columns hidden
  • PG10 has new pg_monitor default role that allows viewing/executing various monitoring views and functions
  • 
                                CREATE ROLE monitoring WITH LOGIN;
                                GRANT pg_monitor TO monitoring;
                            
  • Add to top of pg_hba.conf and reload
  • 
                                local    all    monitoring       trust 
    
                                sudo systemctl reload postgresql-11
                            
    
                                select max_connections
                                , total_used
                                , coalesce(round(100*(total_used/max_connections)),0) as pct_used
                                , idle
                                , idle_in_txn
                                , ((total_used - idle) - idle_in_txn) as active
                                , (select coalesce(extract(epoch from (max(now() - query_start))),0) from 
                                        pg_stat_activity where state = 'idle in transaction') as max_idle_in_txn
                                , (select coalesce(extract(epoch from (max(now() - query_start))),0) from 
                                        pg_stat_activity where state <> 'idle') as max_txn_time
                                from (select count(*) as total_used
                                        , coalesce(sum(case when state = 'idle' then 1 else 0 end),0) as idle
                                        , coalesce(sum(case when state = 'idle in transaction' then 1 else 0 end),0) as idle_in_txn 
                                        from pg_stat_activity) 
                                x join (select setting::float AS max_connections FROM 
                                            pg_settings WHERE name = 'max_connections') xx ON (true);
                            

Autovac Freeze

  • Query that provides quick summary of transaction ID wraparound status
  • 
                                WITH max_age AS ( 
                                    SELECT 2000000000 as max_old_xid
                                        , setting AS autovacuum_freeze_max_age 
                                    FROM pg_catalog.pg_settings 
                                    WHERE name = 'autovacuum_freeze_max_age')
                                , per_database_stats AS ( 
                                    SELECT datname 
                                        , m.max_old_xid::int 
                                        , m.autovacuum_freeze_max_age::int 
                                        , age(d.datfrozenxid) AS oldest_current_xid 
                                    FROM pg_catalog.pg_database d 
                                    JOIN max_age m ON (true) 
                                    WHERE d.datallowconn) 
                                SELECT max(oldest_current_xid) AS oldest_current_xid 
                                    , max(ROUND(100*(oldest_current_xid/max_old_xid::float))) AS percent_towards_wraparound 
                                    , max(ROUND(100*(oldest_current_xid/autovacuum_freeze_max_age::float))) AS percent_towards_emergency_autovac 
                                FROM per_database_stats;
                            
    
                                -[ RECORD 1 ]---------------------+----------
                                oldest_current_xid                | 496452209
                                percent_towards_wraparound        | 25
                                percent_towards_emergency_autovac | 99
    
                                postgres=# show autovacuum_freeze_max_age ;
                                -[ RECORD 1 ]-------------+----------
                                autovacuum_freeze_max_age | 500000000
                            

Table Statistics

  • Not cluster-wide. Run this on each database in the cluster.
  • Graph all columns as rate of change
  • 
                                SELECT current_database() as dbname, schemaname, relname
                                    , seq_scan, seq_tup_read
                                    , idx_scan, idx_tup_fetch
                                    , n_tup_ins, n_tup_upd, n_tup_del, n_tup_hot_upd
                                    , n_live_tup, n_dead_tup
                                    , vacuum_count, autovacuum_count
                                    , analyze_count, autoanalyze_count 
                                    FROM pg_catalog.pg_stat_user_tables;
                            

Table Sizes

  • Not cluster-wide. Run this on each database in the cluster.
  • Size includes all indexes on the table
  • 
                                SELECT current_database() as dbname
                                    , n.nspname as schemaname
                                    , c.relname
                                    , pg_total_relation_size(c.oid) as size_bytes 
                                FROM pg_catalog.pg_class c 
                                JOIN pg_catalog.pg_namespace n ON c.relnamespace = n.oid 
                                WHERE NOT pg_is_other_temp_schema(n.oid) 
                                AND relkind IN ('r', 'm', 'f');
                            

Database/Cluster-wide Statistics

  • Graph all columns as rate of change

                        SELECT s.datname as dbname
                            , xact_commit, xact_rollback
                            , blks_read, blks_hit
                            , tup_returned, tup_fetched
                            , tup_inserted, tup_updated, tup_deleted
                            , conflicts
                            , temp_files, temp_bytes
                            , deadlocks 
                        FROM pg_catalog.pg_stat_database s 
                        JOIN pg_catalog.pg_database d on d.datname = s.datname 
                        WHERE d.datistemplate = false;
                    

Database Sizes

  • Returns size of each database in cluster
  • Graph as both actual value & rate of change
  • Sum bytes for total cluster size

                        SELECT datname as dbname
                            , pg_database_size(datname) as bytes 
                        FROM pg_catalog.pg_database 
                        WHERE datistemplate = false;
                    

Table vs Index Size

  • Useful stat just to see the impact that indexes are having on your total size
  • Plot both points together on same graph

                        SELECT sum(pg_relation_size(c.oid)) as size_table 
                        FROM pg_class c
                        JOIN pg_namespace n ON c.relnamespace = n.oid 
                        WHERE relkind in ( 'm', 'r', 'f') 
                        AND NOT pg_is_other_temp_schema(n.oid); 
                    

                        SELECT sum(pg_relation_size(c.oid)) as size_index 
                        FROM pg_class c
                        JOIN pg_namespace n ON c.relnamespace = n.oid 
                        WHERE (relkind = 'i') 
                        AND NOT pg_is_other_temp_schema(n.oid); 
                    

WAL Metrics

  • pg_ls_waldir() normally requires superuser privileges, but can also be granted with pg_monitor.
  • Lists files in pg_wal directory
  • Warn if total WAL size goes above 150% of max_wal_size for longer than 10 minutes.
  • Worth investigating if ever >= 300%: much higher than usual write activity.
  • May need to increase max_wal_size
  • Watch for very old WAL files lingering
  • 
                                SELECT count(*) AS total_count
                                    , sum(size) AS total_size_bytes
                                    , 100 * sum(size) / (SELECT setting::int*1024*1024 FROM pg_settings WHERE name = 'max_wal_size') AS perc_max
                                    , min(modification)
                                FROM pg_ls_waldir();
                            

Final Questions?

Thank you!