PostgreSQL Administration

Presented by Keith Fiske / @keithf4

Senior Database Engineer @ Crunchy Data Solutions
(pg_partman, pg_monitor, pg_extractor, mimeo)



Follow along at
http://slides.keithf4.com/pg_admin_training_centos

VM Setup

  • Copy PGTraining_VM.zip from USB stick to your computer. Extract in location with at least 15GB of free space.
  • Copy relevant VirtualBox installation file for your OS & Install if necessary (5.2.30 minimum)
  • Copy Oracle_VM_VirtualBox_Extension_Pack-5.2.30.vbox-extpack and Install by double-clicking
  • Remove USB Stick
  • Import Virtual Machine Image
    • Machine -> Add... ->
    • Browse to where you extracted the zip file (DO NOT run off USB stick)
    • Double-click PGTraining_VM.vbox
  • Start up VM!
  • Login: training / postgres

Crunchy Data Solutions, Inc

  • Industry leader in providing enterprise PostgreSQL support and open source solutions
  • Crunchy Certified PostgreSQL
    • 100% Open Source PostgreSQL
    • Common Criteria EAL 2+ Certified
  • We're hiring!

What is PostgreSQL?

  • Open Source RDBMS
  • Started at UC Berkeley in 1986, open sourced in 1996
  • BSD-type License
  • Follows SQL Standard very closely
  • Third-party Plugin Support
    • Procedural Languages
      • C, Java, Python, Perl, JavaScript, PHP, R, Ruby, etc
    • Extensions
    • Background Workers (>=9.3)
  • Massive online community (mailing lists, irc, conferences)

Training VM

  • VirtualBox 5.2.30 + Extension Pack
  • Login: training / postgres
  • training user has sudo
    • All commands shown with "sudo" are run by training user
  • Applications -> System Tools -> Terminal
  • Internet should work within VM (if it works externally)
  • Clipboard should work between host & VM
  • vim & nano for CLI file editing. For GUI:
  • 
                                sudo -s
                                gedit &
                            
    then close that terminal to avoid spam
  • Take snapshot now & as many times as you'd like along the way

Installing PostgreSQL

  • Setup yum repo for your desired version
  • 
                                sudo yum install https://download.postgresql.org/pub/repos/yum/reporpms/EL-7-x86_64/pgdg-redhat-repo-latest.noarch.rpm
                            
  • Install PostgreSQL 11.x & Contrib modules
  • 
                                    sudo yum install postgresql11-server postgresql11-contrib
  • No clusters automatically created and no initial automatic startup (Red Hat policy)
  • 
                                sudo PGSETUP_INITDB_OPTIONS="--data-checksums" /usr/pgsql-11/bin/postgresql-11-setup initdb
                                sudo systemctl enable postgresql-11
                                sudo systemctl start postgresql-11
                                systemctl status postgresql-11
                                ...
                            

Initial Login

  • Become postgres system user
  • 
                                sudo -iu postgres
  • Log into database (default postgres role & database already created)
  • psql
    • \? to see all available commands
  • Recommend keeping second terminal opened logged in as postgres user

Configuration - pg_hba.conf

  • Add following lines above other active ones (for later):
  • 
                                # TYPE  DATABASE       USER           ADDRESS         METHOD
                                host    replication    replication    127.0.0.1/32    md5
                                host    replication    training       127.0.0.1/32    trust 
                            
  • Only requires a reload to put new HBA settings into place
  • 
                                postgres=# select pg_reload_conf();
                                OR
                                sudo systemctl reload postgresql-11
                            
  • View current pg_hba.conf file from within database
  • 
                                SELECT * FROM pg_hba_file_rules;
                            
  • Check log file for SIGHUP
    • "log" directory in the data directory
    • Was "pg_log" in versions before PG 10
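One quick way to confirm the reload took effect, assuming the default PGDG data directory and log filename pattern used in this training:

```
grep "received SIGHUP" /var/lib/pgsql/11/data/log/postgresql-*.log
```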

Create Role & Database

  • Create a role & database for yourself
  • 
                                postgres=# CREATE ROLE training WITH LOGIN SUPERUSER;
                                postgres=# CREATE DATABASE training;
                            
  • Should now be able to log into PostgreSQL as your training user
  • Create replication role for later
  • 
                                training=# CREATE ROLE replication WITH LOGIN REPLICATION;
    • Set passwords ("password")
    • 
                                      training=# \password training
                                      training=# \password replication
                                      ...
                                  

Configuration - postgresql.conf

  • Main configuration file
  • Important initial settings review (most require restart)
    • listen_addresses (defaults to localhost only; set to server IP; * = all IPs)
    • max_connections (monitor active connections, affected by work_mem)
    • shared_buffers (8GB good starting point; leave default for training)
    • work_mem (2-5MB good starting point; allocated per sort/hash operation, so more RAM allows higher values)
    • maintenance_work_mem (1GB good starting point)
    • wal_level = replica
    • effective_cache_size (50% RAM good starting point)
    • archive_mode = on
    • archive_command = '/bin/true'
    • archive_timeout = 60
    • max_wal_senders = 10
    • wal_keep_segments = 30
    • max_replication_slots = 10
    • hot_standby = on (setting for replica)
    • logging_collector = on
    • autovacuum_freeze_max_age = 1000000000 (only set this high if monitoring for wraparound. So monitor for wraparound and set it this high.)
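Most of these can also be set with ALTER SYSTEM instead of editing postgresql.conf by hand — a quick sketch using one of the reload-able settings above:

```sql
ALTER SYSTEM SET archive_timeout = 60;   -- written to postgresql.auto.conf
SELECT pg_reload_conf();                 -- applies reload-able settings
SHOW archive_timeout;
```

Settings marked as requiring a restart (shared_buffers, wal_level, etc) still need one even when set this way.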

Vacuums & Freezing

  • http://www.postgresql.org/docs/current/static/routine-vacuuming.html
  • Multi-Version Concurrency Control (MVCC)
    • Updated/deleted rows are not actually removed, just marked unavailable. This leads to bloat.
    • Vacuum marks unavailable rows as re-usable space.
    • VACUUM FULL recovers disk space but locks table for full table rewrite
    • pg_repack - Extension to reclaim disk space with minimal lock
    • Don’t overuse VACUUM FULL or pg_repack.
      • Reusable space can be more efficient than [re]allocating new pages
  • Transaction ID Exhaustion (Wraparound)
    • Every row has transaction id (XID) value
    • Every new write transaction increments cluster-wide XID
    • Determines visibility to current transactions
    • 32-bit number, so wraparound is possible after 4 billion transactions
      • 2 billion transactions newer than current & 2 billion older
    • VACUUM marks rows as frozen, setting the XID such that it will be visible to all future transactions
    • A reserved FrozenXID is always considered older than all normal XIDs
      • VACUUM FREEZE rarely-written tables so later vacuums can skip all-frozen pages and run more efficiently (9.6+ feature)
    • autovacuum_freeze_max_age - when a table's XID age reaches this, a VACUUM is forced (even if autovacuum is turned off).
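To see how close tables in the current database are to that forced-vacuum threshold, a query along these lines can help (a sketch; run it per database):

```sql
SELECT c.relname
    , age(c.relfrozenxid) AS xid_age
    , s.n_dead_tup
FROM pg_catalog.pg_class c
JOIN pg_catalog.pg_stat_user_tables s ON s.relid = c.oid
ORDER BY xid_age DESC
LIMIT 10;
```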

Replication - Basic

  • Ensure line exists in master pg_hba.conf and reload master
  • 
                                host    replication    replication    127.0.0.1/32    md5
                            
  • Create a replication slot
  • 
                                training=# SELECT * FROM pg_create_physical_replication_slot('training_replica');
                                training=# SELECT * FROM pg_replication_slots;
                            
  • Use pg_basebackup to create copy of master (as postgres system user)
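A sketch of that pg_basebackup call, targeting the replica data directory used by the systemd service configured on the following slides (-S uses the slot created above; it will prompt for the replication role's password):

```
sudo -iu postgres
pg_basebackup -h 127.0.0.1 -U replication -D /var/lib/pgsql/11/replica \
    -S training_replica -X stream -P -v
```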

Replication - Basic

  • Edit replica postgresql.conf
  • 
                                port = 5444 
                            
  • Edit replica recovery.conf
  • 
                                standby_mode = 'on'
                                primary_conninfo = 'host=127.0.0.1 port=5432 user=replication password=password'
                                primary_slot_name = 'training_replica'
                                recovery_target_timeline='latest'
                                ...
                            

Replication - Basic

  • Production method for CentOS
  • Create new systemd service for replica
  • 
                                sudo cp /usr/lib/systemd/system/postgresql-11.service /etc/systemd/system/postgresql-11-replica.service
                            
  • See systemd service file locations
  • Edit config file to set replica data directory (as root):
  • 
                                Environment=PGDATA=/var/lib/pgsql/11/replica/
                            
  • Register service, start up replica & check service status
  • 
                                sudo systemctl daemon-reload
                                sudo systemctl enable postgresql-11-replica
                                sudo systemctl start postgresql-11-replica
                                sudo systemctl status postgresql-11-replica
                                ...
                            

Replication - Basic

  • Check replica log to ensure it connected
  • 
                                started streaming WAL from primary at 0/3000000 on timeline 1
                            
  • Check from master
  • 
                                training=# SELECT * FROM pg_stat_replication;
                                training=# SELECT * FROM pg_replication_slots;
                            
  • Create an object and make sure it appears on replica (as training user)
  • 
                                training=# CREATE TABLE testing (
                                    id bigint primary key, 
                                    stuff text, 
                                    inserted_at timestamptz default now());
    
                                training=# INSERT INTO testing (id) values (generate_series(1,1000));
                            
  • Connect to replica
  • 
                                psql -p 5444
    
                                training=# SELECT pg_is_in_recovery();
                                ...
                            

Backup - pg_dump/all

  • pg_dumpall
    • Dumps all databases in cluster in plaintext format
    • Can dump only cluster-wide data: roles, tablespaces
    • 
                                          pg_dumpall -g -f /home/training/globals.sql
                                      
    • Use psql to restore backup
    • Restores entire cluster. No object filtering options.
  • pg_dump
    • Backs up individual databases in cluster
    • Does not back up roles or tablespaces, but does back up object privileges of those roles.
    • Provides binary backup format that can make restore easier
    • Use pg_restore to restore binary backups
    • Can restore individual schemas or tables instead of entire cluster
    • --section option can break up dump into stages
    • 
                                          pg_dump -Fc training -f /home/training/training.pgr -v
                                          ...
                                      

Restore - pg_restore

  • As training user, create new database in home folder
  • 
                                    mkdir /home/training/mydb
                                    chmod 700 /home/training/mydb
                                    /usr/pgsql-11/bin/initdb --data-checksums -D /home/training/mydb
                                
  • Edit postgresql.conf:
  • 
                                        port = 5888
                                        unix_socket_directories = '/home/training/mydb'
                                
  • Start the DB
  • 
                                    /usr/pgsql-11/bin/pg_ctl -D /home/training/mydb start
                                
  • initdb automatically created the role "training" (named after the system user that ran it)
  • psql used to restore pg_dumpall or plaintext version of pg_dump
    • Restore roles before restoring database so permissions are set properly
    
                                    psql -h localhost -p 5888 -d postgres -f /home/training/globals.sql -a
                                
  • pg_restore used to restore binary dump from pg_dump
  • 
                                    psql -h localhost -p 5888 postgres
                                    postgres=# create database new_training;
                                    pg_restore -h localhost -p 5888 -d new_training -v /home/training/training.pgr
                                    ...
                                

Restore - pg_restore

  • Smaller backups & more flexible restore
  • Must recreate all indexes & constraints
  • Stop this instance and delete it
  • 
                                /usr/pgsql-11/bin/pg_ctl stop -m fast -D /home/training/mydb
                                rm -r /home/training/mydb
                                ...
                            

Backup - File System (Basic)

pg_basebackup

  • We already did it! Let's do it again...
  • Create compressed backup
  • 
                                mkdir /home/training/mybackup
                                pg_basebackup -h 127.0.0.1 -D /home/training/mybackup -Ft -z -Xs -P -v
                            
  • Separate tar file is made for necessary WAL files
  • If additional tablespaces, they go in their own tar file
  • Do not set -R for normal backups
  • -Xs (--wal-method=stream) is default for PG10+
    • -Xf (fetch) only works if all WAL generated during backup run exist at the end of the backup
    • -Xf can require extremely high wal_keep_segments on a busy database
    • -Xs requires one additional WAL sender to be available (max_wal_senders)

Restore - File System (Basic)

  • We already did it! Let's do it again...
  • 
                                    mkdir /home/training/mydb2
                                    chmod 700 /home/training/mydb2
                                
    
                                    tar xvzf /home/training/mybackup/base.tar.gz -C /home/training/mydb2/
                                    tar xvzf /home/training/mybackup/pg_wal.tar.gz -C /home/training/mydb2/pg_wal
                                
    
                                    EDIT postgresql.conf: 
                                        port = 5888
                                        unix_socket_directories = '/home/training/mydb2'
                                
    
                                    /usr/pgsql-11/bin/pg_ctl -D /home/training/mydb2 start
    
                                    psql -h localhost -p 5888
                                    (requires hba fix)
    
                                    psql -h /home/training/mydb2/ -p 5888
                                    ...
                                

Restore - File System (Basic)

  • MUCH faster disaster recovery
  • Less flexible restore options (all or nothing)
  • Stop this instance and delete it
  • 
                                /usr/pgsql-11/bin/pg_ctl stop -m fast -D /home/training/mydb2
                                rm -r /home/training/mydb2
                                ...
                            

Backup - File System (Advanced)

  • pgBackRest - http://pgbackrest.org
  • Point-in-Time Recovery (PITR)
  • Incremental & Differential Backups
  • Parallel Backup & Restore
  • Backup Rotation & Archive Expiration
  • Checksums
  • Many more options!

Backup - File System (Advanced)

  • Install pgBackRest
  • 
                                sudo yum install pgbackrest.x86_64
                            
  • Configure a stanza
    • Defines where PG cluster is located, how it will be backed up, archiving options, etc
    • One stanza on db server. Many stanzas on a dedicated backup machine.
    • Edit /etc/pgbackrest.conf (default location, requires root)
    • 
                                      [global]
                                      repo1-path=/var/lib/pgbackrest
                                      log-level-console=info
      
                                      [main]
                                      pg1-path=/var/lib/pgsql/11/data
                                      retention-full=2
                                  
  • Ensure repository folders exist with proper permissions. Should already be there and ready if the install worked.
  • 
                                sudo mkdir /var/lib/pgbackrest
                                sudo chmod 750 /var/lib/pgbackrest/
                                sudo chown postgres:postgres /var/lib/pgbackrest
                                sudo chown postgres:postgres /var/log/pgbackrest
                                ...
                            

Backup - File System (Advanced)

  • Set primary postgresql.conf options for WAL archiving
  • 
                                archive_command = 'pgbackrest --stanza=main archive-push %p'
                                archive_mode = on
                                max_wal_senders = 10 
                                wal_level = replica
                            
  • Normally the above is set identically on the replica, but that's not possible in this single-VM setup
  • Reload/restart to set above options
  • 
                                sudo systemctl reload postgresql-11
                            
  • Create & check the stanza
  • 
                                sudo -u postgres pgbackrest --stanza=main stanza-create
    
                                sudo -u postgres pgbackrest --stanza=main check
                                ...
                            

Backup - File System (Advanced)

  • Perform a backup
  • 
                                sudo -u postgres pgbackrest --stanza=main backup
                            
  • By default attempts an incremental backup, but since no full exists, creates a full. Run again...
  • 
                                sudo -u postgres pgbackrest --stanza=main backup
                            
  • Designate backup type with --type option (full, incr, diff)
  • 
                                sudo -u postgres pgbackrest --stanza=main --type=diff backup
                            
  • Check backup status
  • 
                                sudo -u postgres pgbackrest info
                                ...
                            

Restore - File System (Advanced)

  • pgBackRest will restore the database to the target designated in the stanza config by default.
  • In disaster recovery, this is normal. The --delta option can speed it up greatly if the original data dir is still there
  • For training, restore to a target folder:
  • 
                                sudo -iu postgres 
                                mkdir /var/lib/pgsql/11/br_restore
                                chmod 700 /var/lib/pgsql/11/br_restore
    
                                pgbackrest restore --stanza=main --db-path=/var/lib/pgsql/11/br_restore
                            
  • Change postgresql.conf port to 5911
  • TURN OFF ARCHIVE COMMAND
    • Set back to /bin/true or set mode to off
    
                               /usr/pgsql-11/bin/pg_ctl start -D /var/lib/pgsql/11/br_restore/ 
                            
    
                               /usr/pgsql-11/bin/pg_ctl stop -m fast -D /var/lib/pgsql/11/br_restore/ 
                               rm -r /var/lib/pgsql/11/br_restore
                               ...
                            

Failover

  • Stop the master
  • 
                                sudo systemctl stop postgresql-11
                                sudo systemctl status postgresql-11
                            
  • Check replica logs to see connection errors
  • 
                                /var/lib/pgsql/11/replica/log/...
                            
  • Promote replica
  • 
                                /usr/pgsql-11/bin/pg_ctl promote -D /var/lib/pgsql/11/replica
                            
  • Check logs
  • Write to database (running on port 5444)
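Verifying the promotion on the old replica might look like this (uses the testing table created earlier):

```sql
-- psql -p 5444
SELECT pg_is_in_recovery();   -- should now return false
INSERT INTO testing (id) VALUES (1001);
```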

Upgrading

  • Dump/Restore
    • Works on all versions
    • Use the pg_dump binary of the upgrade target
    • 8.4+ has parallel pg_restore
    • 9.3+ has parallel pg_dump
    • Same caveats mentioned previously (smaller dump/longer restore)
  • pg_upgrade
    • In-place upgrade of data files
    • Works only for 8.4.7+ (all previous versions must dump/restore or do staged upgrade)
    • OS w/ hard link support can greatly decrease upgrade time
      • Ex: 700GB upgrade in under 5 minutes, but...
      • Don't forget to analyze entire cluster
      • 
                                            vacuumdb --all --analyze-in-stages 
                                        
  • Logical replication (can upgrade across major versions with minimal downtime)
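A pg_upgrade sketch for reference — the old-version paths here are hypothetical (this VM only has PG 11 installed):

```
sudo -iu postgres
/usr/pgsql-11/bin/pg_upgrade \
    --old-bindir=/usr/pgsql-10/bin \
    --new-bindir=/usr/pgsql-11/bin \
    --old-datadir=/var/lib/pgsql/10/data \
    --new-datadir=/var/lib/pgsql/11/data \
    --link --check
# --check does a dry run; drop it and re-run for the real upgrade, then:
vacuumdb --all --analyze-in-stages
```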

Monitoring

  • Replication status
  • Active/Idle connections. Idle in transaction session times.
  • Transaction ID Wraparound
  • Table statistics
    • Sequential scans vs index scans
    • Insert/Update/Delete rate
  • Transactions
    • Commits vs Rollbacks
  • Database size
    • Total Table Size vs Total Index Size
  • WAL file count (expected vs current)
  • Log/query analysis - pgbadger
  • Critical functions - pg_jobmon
  • Prometheus - https://prometheus.io

Replication Status

  • Monitor from replica (seconds behind since last WAL replay)
  • 
                                    SELECT extract(epoch from now() - pg_last_xact_replay_timestamp()) AS replica_lag;
                                
  • Monitor from master if streaming (bytes behind)
  • 
                                    SELECT client_hostname
                                        , client_addr
                                        , pg_wal_lsn_diff(pg_stat_replication.sent_lsn, 
                                            pg_stat_replication.replay_lsn) AS byte_lag
                                    FROM pg_stat_replication;
                                

Connections

  • pg_stat_activity shows all column data only to superusers; other roles see other sessions' rows with the sensitive columns hidden
  • PG10 has new pg_monitor default role that allows viewing/executing various monitoring views and functions
  • 
                                CREATE ROLE monitoring WITH LOGIN;
                                GRANT pg_monitor TO monitoring;
                            
  • Add to top of pg_hba.conf and reload
  • 
                                local    all    monitoring       trust 
    
                                sudo systemctl reload postgresql-11
                            
    
                                select max_connections
                                , total_used
                                , coalesce(round(100*(total_used/max_connections)),0) as pct_used
                                , idle
                                , idle_in_txn
                                , ((total_used - idle) - idle_in_txn) as active
                                , (select coalesce(extract(epoch from (max(now() - query_start))),0) from 
                                        pg_stat_activity where state = 'idle in transaction') as max_idle_in_txn
                                , (select coalesce(extract(epoch from (max(now() - query_start))),0) from 
                                        pg_stat_activity where state <> 'idle') as max_txn_time
                                from (select count(*) as total_used
                                        , coalesce(sum(case when state = 'idle' then 1 else 0 end),0) as idle
                                        , coalesce(sum(case when state = 'idle in transaction' then 1 else 0 end),0) as idle_in_txn 
                                        from pg_stat_activity) 
                                x join (select setting::float AS max_connections FROM 
                                            pg_settings WHERE name = 'max_connections') xx ON (true);
                            

Autovac Freeze

  • Query that provides quick summary of transaction ID wraparound status
  • 
                                WITH max_age AS ( 
                                    SELECT 2000000000 as max_old_xid
                                        , setting AS autovacuum_freeze_max_age 
                                    FROM pg_catalog.pg_settings 
                                    WHERE name = 'autovacuum_freeze_max_age')
                                , per_database_stats AS ( 
                                    SELECT datname 
                                        , m.max_old_xid::int 
                                        , m.autovacuum_freeze_max_age::int 
                                        , age(d.datfrozenxid) AS oldest_current_xid 
                                    FROM pg_catalog.pg_database d 
                                    JOIN max_age m ON (true) 
                                    WHERE d.datallowconn) 
                                SELECT max(oldest_current_xid) AS oldest_current_xid 
                                    , max(ROUND(100*(oldest_current_xid/max_old_xid::float))) AS percent_towards_wraparound 
                                    , max(ROUND(100*(oldest_current_xid/autovacuum_freeze_max_age::float))) AS percent_towards_emergency_autovac 
                                FROM per_database_stats;
                            
    
                                -[ RECORD 1 ]---------------------+----------
                                oldest_current_xid                | 496452209
                                percent_towards_wraparound        | 25
                                percent_towards_emergency_autovac | 99
    
                                postgres=# show autovacuum_freeze_max_age ;
                                -[ RECORD 1 ]-------------+----------
                                autovacuum_freeze_max_age | 500000000
                            

Table Statistics

  • Not cluster-wide. Run this on each database in the cluster.
  • Graph all columns as rate of change
  • 
                                SELECT current_database() as dbname, schemaname, relname
                                    , seq_scan, seq_tup_read
                                    , idx_scan, idx_tup_fetch
                                    , n_tup_ins, n_tup_upd, n_tup_del, n_tup_hot_upd
                                    , n_live_tup, n_dead_tup
                                    , vacuum_count, autovacuum_count
                                    , analyze_count, autoanalyze_count 
                                    FROM pg_catalog.pg_stat_user_tables;
                            

Table Sizes

  • Not cluster-wide. Run this on each database in the cluster.
  • Size includes all indexes on the table
  • 
                                SELECT current_database() as dbname
                                    , n.nspname as schemaname
                                    , c.relname
                                    , pg_total_relation_size(c.oid) as size_bytes 
                                FROM pg_catalog.pg_class c 
                                JOIN pg_catalog.pg_namespace n ON c.relnamespace = n.oid 
                                WHERE NOT pg_is_other_temp_schema(n.oid) 
                                AND relkind IN ('r', 'm', 'f');
                            

Database/Cluster-wide Statistics

  • Graph all columns as rate of change

                        SELECT s.datname as dbname
                            , xact_commit, xact_rollback
                            , blks_read, blks_hit
                            , tup_returned, tup_fetched
                            , tup_inserted, tup_updated, tup_deleted
                            , conflicts
                            , temp_files, temp_bytes
                            , deadlocks 
                        FROM pg_catalog.pg_stat_database s 
                        JOIN pg_catalog.pg_database d on d.datname = s.datname 
                        WHERE d.datistemplate = false;
                    

Database Sizes

  • Returns size of each database in cluster
  • Graph as both actual value & rate of change
  • Sum bytes for total cluster size

                        SELECT datname as dbname
                            , pg_database_size(datname) as bytes 
                        FROM pg_catalog.pg_database 
                        WHERE datistemplate = false;
                    

Table vs Index Size

  • Useful stat just to see the impact that indexes are having on your total size
  • Plot both points together on same graph

                        SELECT sum(pg_relation_size(c.oid)) as size_table 
                        FROM pg_class c
                        JOIN pg_namespace n ON c.relnamespace = n.oid 
                        WHERE relkind in ( 'm', 'r', 'f') 
                        AND NOT pg_is_other_temp_schema(n.oid); 
                    

                        SELECT sum(pg_relation_size(c.oid)) as size_index 
                        FROM pg_class c
                        JOIN pg_namespace n ON c.relnamespace = n.oid 
                        WHERE (relkind = 'i') 
                        AND NOT pg_is_other_temp_schema(n.oid); 
                    

WAL Metrics

  • pg_ls_waldir() normally requires superuser privileges, but can also be granted with pg_monitor.
  • Lists files in pg_wal directory
  • Warn if total WAL size goes above 150% of max_wal_size for longer than 10 minutes.
  • Worth investigating if ever >= 300%: much higher than usual write activity.
  • May need to increase max_wal_size
  • Watch for very old WAL files lingering
  • 
                                SELECT count(*) AS total_count
                                    , sum(size) AS total_size_bytes
                                    , 100 * sum(size) / (SELECT setting::int*1024*1024 FROM pg_settings WHERE name = 'max_wal_size') AS perc_max
                                    , min(modification)
                                FROM pg_ls_waldir();
                            

Final Questions?

Thank you!