So You Want To
Make An Extension?

Presented by Keith Fiske / @keithf4

Database Administrator @ OmniTI
(pg_partman, pg_extractor, mimeo)



Follow along at
http://slides.keithf4.com/extension_dev

OmniTI, Inc

  • Full-stack support for high-traffic websites & applications
    • Millions+ of users
    • Terabytes+ of data
    • High concurrency
  • We're hiring!

Why Extensions?

  • The old way
    
    psql -d mydb < myplpgsql_code.sql
                                
  • Nothing in the database distinguished your objects from any other (except schema)
  • Nothing in the database to indicate what version of your code is installed
  • No easy way to remove your code later (except drop schema)
  • No dependency options (ex. ensure dblink is installed first)

Here's why...

  • Simple install
    
    CREATE EXTENSION ... [ SCHEMA ... ]
                                
  • Versioned code so you know exactly what is installed
  • Simple update (upgrade or downgrade)
    
    ALTER EXTENSION ... UPDATE [ TO '...' ]
                                
  • Easily see which objects are part of an extension (\dx+ in psql)
  • Help prevent removal of critical objects
    
    # drop table partman.part_config;
    ERROR:  cannot drop table partman.part_config because extension pg_partman requires it
    HINT:  You can drop extension pg_partman instead.
                                
  • Extension dependencies
    
    requires = 'dblink,pgcrypto'
                                

Getting started...

Control File

  • [extension].control
  • default_version - version installed when not specified with CREATE/ALTER EXTENSION
  • comment - Installed as COMMENT when running \dx in psql
  • module_pathname - subtituted for MODULE_PATHNAME in script files. Avoids having to hardcode library the name of the shared library file.
  • requires - list of extensions this extension depends on. These must be installed first.
  • relocatable - boolean that sets whether extension can be moved to another schema after install.

  • directory - location of extension SQL script files. Unless absolute, relative to SHAREDIR.
  • encoding - character set encoding used by script files. Default is database's encoding.
  • superuser - Default requires superuser to install/update extensions. Setting to false only requires privileges for commands given in script.
  • schema - hardcodes schema extension is installed to. Not recommended.
    • Not setting this allows user to designate their own schema at installation and also allows extension relocatability

Secondary Control File

  • [extension--version].control
  • Allows specific installation instructions for a version given to CREATE EXTENSION
  • Installed to same location as primary control
  • Same options as primary except directory and default_version cannot be used.

SQL Script Files (Install)

  • [extension--version].sql
  • pg_partman--3.0.2.sql
  • Contains all the sql/plpgsql/plperl/etc code that gets installed into the database
  • No compiled code (C)
  • @extschema@ - a macro that can be used to fill in the schema given with the CREATE EXTENSION command or the schema option in the .control file. Avoids having to hardcode your schema but still be able to refer to it explicitly to ensure your function is called when another similarly named one is in the search path.

SQL Script Files (Updates)

  • [extension--oldversion--newversion].sql
  • pg_partman--3.0.1--3.0.2.sql
  • Provide upgrade/downgrade path for code installed within the database
  • Code within and versions in filename are arbitrary (any character other than "--")
  • PostgreSQL doesn't know that 3.0.2 is greater than 3.0.1. All it knows is that's the update path you want.
  • Single update runs within a single transaction. Changes requiring multiple transactions must be done as separate versions.
  • Upgrading multiple versions in a single ALTER command is also done in a single transaction
    
    Existing version is 1.2.3
    ALTER EXTENSION UPDATE extension TO '2.0.0';
                                
    All updates between 1.2.3 and 2.0.0 (if any) are run in a single transaction.
  • If updates require separate transactions, must update to specific versions individually
    
    ALTER EXTENSION UPDATE extension TO '1.2.4';
    ALTER EXTENSION UPDATE extension TO '1.2.5';
    ALTER EXTENSION UPDATE extension TO '2.0.0';
                                

Version Update Paths

  • When no TO clause is given to ALTER EXTENSION ... UPDATE, the version contained in the installed .control file is used as the target version
  • Any update scripts found between the installed version and the target update version are always run in order of available paths based on update file names
  • Can see the update path that will be followed between any given versions using: pg_extension_update_paths(). Returns source, target & update path that will be taken between them.
  • 
    select * from pg_extension_update_paths('pg_partman') where source = '2.3.4' and target = '3.0.0';
     source | target |                                    path                                     
    --------+--------+-----------------------------------------------------------------------------
     2.3.4  | 3.0.0  | 2.3.4--2.4.0--2.4.1--2.5.0--2.5.1--2.6.0--2.6.1--2.6.2--2.6.3--2.6.4--3.0.0
    
                                
  • Not giving conditions will show every conceivable version path (up and down), even those with no actual possible path
  • 
    select * from pg_extension_update_paths('pg_partman') where source = '2.3.4' and target = '1.0.0';
     source | target |  path  
    --------+--------+--------
     2.3.4  | 1.0.0  | ┬źNULL┬╗
                                

Skip Versions

  • PostgreSQL will always use the shortest possible path when multiple paths are available.
  • Beware of downgrade scripts that will cause a shorter path!
  • You can use this feature to create an update script that skips a specific version.
  • Useful if you make a bad release that does not install properly
  • Create a new update script that skips over the version you don't want to have installed. Here I don't want 0.10.0 to be installed anymore
    
    mimeo--0.9.3--0.10.0.sql
    mimeo--0.9.3--0.10.1.sql
                                
    
    select * from pg_extension_update_paths('mimeo') where source = '0.9.3' and target ~ '0.10.' order by 2;
     source | target |                         path                          
    --------+--------+-------------------------------------------------------
     0.9.3  | 0.10.0 | 0.9.3--0.10.0
     0.9.3  | 0.10.1 | 0.9.3--0.10.1
     0.9.3  | 0.10.2 | 0.9.3--0.10.1--0.10.2
     0.9.3  | 0.10.3 | 0.9.3--0.10.1--0.10.2--0.10.3
                                
  • Allows people that may have successfully installed the bad version to still update
    
    select * from pg_extension_update_paths('mimeo') where source = '0.10.0' and target ~ '0.10.' order by 2;
     source | target |                          path                          
    --------+--------+--------------------------------------------------------
     0.10.0 | 0.10.1 | 0.10.0--0.10.1
     0.10.0 | 0.10.2 | 0.10.0--0.10.1--0.10.2
                                

Create Extension From Existing Objects

  • [extension--old_version--target_version].sql
  • Use FROM clause of the ALTER EXTENSION command
  • "old_version" can be any arbitrary value, but is typically the string "unpackaged"
    
    ALTER EXTENSION dblink UPDATE FROM unpackaged;
                                
  • Do not specify the TO option yet for a specific version. This method is usually done when the "target_version" will correspond to the code state that existed just before it was turned into an extension.
  • Once the unpackaged version of your code is an extension, you can then run any extension additional updates to get it up to date.
  • This was done for contrib modules when upgrading from PostgreSQL before 9.1.
  • Highly recommended to convert any contrib modules installed before 9.1 to their extension form after upgrading. Otherwise you may not get updates to those extensions outside of the newly compiled binaries.

Config Tables

  • Extension tables, sequences and their data not normally output by pg_dump. Restore only runs the CREATE EXTENSION command.
    • This means extension script files & libraries must be in place before doing a pg_restore
  • Set extension flag to dump config table data & sequence values (not the schema itself)
    
    SELECT pg_catalog.pg_extension_config_dump('part_config', '');
    SELECT pg_catalog.pg_extension_config_dump('my_config_seq', '');
                                    
  • When second argument is an empty string, the entire table is dumped. Only do this when table is initially empty on extension creation. Second argument unused for sequences.
  • When extension config table has initial data, set second argument to WHERE condition to exclude this data when output by pg_dump.
    
    SELECT pg_catalog.pg_extension_config_dump('job_status_text', 'WHERE alert_code NOT IN (1,2,3)');
                                    
  • Note this data is ALWAYS dumped, even when a schema only dump (-s) is done. Makes it difficult to manage extension tables with a lot of data since this setting will likely not be set (this may have been fixed, if not, here's a feature request!).

File/Code Organization

  • Not required, just personal preference for easier maintenance
  • Folders in extension directory: bin, doc, sql, src, test, updates
  • sql/ contains: functions, tables, types, [etc]
  • Allows easier maintenance of script code since all functions aren't maintained in a single file
  • More later on how to put it all back together into the required file format...
  • Important Note: Ensure code in sql/ files matches the code in the update scripts.
    • Meld - Tool for doing side-by-side file diff comparison

Build Installation

  • Full details at V.36.16. Extension Building Infrastructure
  • PG_CONFIG - path to pg_config of PostgreSQL install to build against. Just set to "pg_config" to use the one in current PATH
  • MODULES(_big) - shared library objects to build
  • EXTENSION - extension name that matches control file name
  • SCRIPTS(_built) - script files that install to prefix/bin
  • MODULEDIR - subdir of prefix/share" where DATA and DOCS goes. Defaults to "extension" when EXTENSION is set
  • DOCS - documentation files installed to prefix/doc/$MODULEDIR
  • Use PGXS build infrastructure to easily build against an existing PostgreSQL install.
    
    PGXS := $(shell $(PG_CONFIG) --pgxs)
    include $(PGXS)
                                

Build Installation (cont)

  • DATA - files to install to prefix/share/$MODULEDIR (usually sql files).
  • Use Make build macros to piece the sql files back together into the required, single script file.
  • Include the updates folder in the DATA definition so it gets copied with the main script file.
  • 
    EXTVERSION = $(shell grep default_version $(EXTENSION).control | \
                   sed -e "s/default_version[[:space:]]*=[[:space:]]*'\([^']*\)'/\1/")
    
    all: sql/$(EXTENSION)--$(EXTVERSION).sql
    
    sql/$(EXTENSION)--$(EXTVERSION).sql: $(sort $(wildcard sql/types/*.sql)) $(sort $(wildcard sql/tables/*.sql)) $(sort $(wildcard sql/functions/*.sql))
    	cat $^ > $@
    
    DATA = $(wildcard updates/*--*.sql) sql/$(EXTENSION)--$(EXTVERSION).sql
    EXTRA_CLEAN = sql/$(EXTENSION)--$(EXTVERSION).sql
                                
  • EXTRA_CLEAN - Usually just the main sql script since it's dynamically generated

Build Instructions (cont)

  • Version checking at build time
  • 
    PG94 = $(shell $(PG_CONFIG) --version | egrep " 8\.| 9\.0| 9\.1| 9\.2| 9\.3" > /dev/null && echo no || echo yes)
    
    ifeq ($(PG94),yes)
        [... Do all your build stuff here ...]
    else
    $(error Minimum version of PostgreSQL required is 9.4.0)
    endif
                                
  • Full example Makefile: https://github.com/keithf4/pg_partman/blob/master/Makefile

Personal Experience Tips

Predictable Naming

  • Whenever possible, explicitly name all objects the extension creates.
  • Indexes, constraints, foreign keys, etc.
  • Having a known, predictable name makes doing updates on such things later much easier
  • Ex. Automatically named constraints/indexes on the same column get unpredictable names as time goes on.
  • 
    CREATE INDEX part_config_type_idx ON @extschema@.part_config (partition_type);
    
    CREATE TABLE part_config_sub (
        sub_parent text 
        , sub_partition_type text NOT NULL
        [...]
        , CONSTRAINT part_config_sub_pkey PRIMARY KEY (sub_parent)
        , CONSTRAINT part_config_sub_sub_parent_fkey FOREIGN KEY (sub_parent) REFERENCES @extschema@.part_config (parent_table) ON DELETE CASCADE ON UPDATE CASCADE DEFERRABLE INITIALLY DEFERRED
        , CONSTRAINT positive_premake_check CHECK (sub_premake > 0)
    );
                                

Preserving Privileges

  • You will, at some point, have to drop and recreate a function because you change its definition and cannot use CREATE OR REPLACE ...
  • User may have granted additional privileges on extension objects that you cannot predict
  • If you have to drop and recreate objects, that will remove the user's custom grants
  • 
    CREATE TEMP TABLE partman_preserve_privs_temp (statement text);
    
    INSERT INTO partman_preserve_privs_temp 
    SELECT 'GRANT EXECUTE ON FUNCTION @extschema@.create_function_id(text, bigint) TO '||array_to_string(array_agg(grantee::text), ',')||';' 
    FROM information_schema.routine_privileges
    WHERE routine_schema = '@extschema@'
    AND routine_name = 'create_function_id'; 
    
    DROP FUNCTION @extschema@.create_function_id(text, bigint, boolean); 
    
    [... Do all your update code here ...]
                                

Preserving Privileges (cont)


[... Do all your extension update code here ...]

-- Restore dropped object privileges
DO $$
DECLARE
v_row   record;
BEGIN
    FOR v_row IN SELECT statement FROM partman_preserve_privs_temp LOOP
        IF v_row.statement IS NOT NULL THEN
            EXECUTE v_row.statement;
        END IF;
    END LOOP;
END
$$;

DROP TABLE IF EXISTS partman_preserve_privs_temp;
                        

Manually Drop Temp Tables

  • Reported to me that it's very important to manually drop any temp table created in any update script
  • Possibility that when transaction ends, it may drop the entire extension as part of the drop cascade when temporary objects are automatically cleaned up.
  • Tested this and unable to replicate, but I've always done this anyway due to my own general paranoia.

Version Checking

  • PL/pgSQL
    • Simple IF Condition
      
      IF current_setting('server_version_num')::int < 100000 THEN
                                  
    • check_version(text)
      • Returns true if current version is greater than or equal to given version
      • Allows accounting for beta/RC releases as special case
  • C
    
    #if (PG_VERSION_NUM < 100000)
    worker.bgw_main = pg_partman_bgw_main;
    #endif
    #if (PG_VERSION_NUM >= 100000)
    sprintf(worker.bgw_library_name, "pg_partman_bgw");
    sprintf(worker.bgw_function_name, "pg_partman_bgw_main");
    #endif
                                

Object Name Truncation

  • Object names have a 63 character limit (by default, set at compile time)
  • check_name_length()
  • Automatically truncate given object name to workable length
  • Allow consistent prefix/suffix entries on object names (partitioning)

Avoid Enums

  • Not easily editable, especially in extension transaction context
  • Getting better in 10
    • Previously it was impossible to run ALTER TYPE ... ADD VALUE in a transaction block unless the enum type was created in the same block.
    • Allow ENUM values to be renamed
  • Deletion of value still non-trivial

Prevent Non-Extension Installation

  • Add the following line to the top of the sql install script to prevent running it in psql
    
    \echo Use "CREATE EXTENSION dblink" to load this file. \quit
                                
  • If you split sql objects out to individual files, will have to be in whatever the first object added to file is (predictability is one reason it's good to sort in the make file).

pgTAP

  • http://pgtap.org
  • Unit testing for schema, data changes & and actual functionality of both
  • Two methods I've used:
    1. Each file is its own, uncommitted transaction. BEGIN [...] ROLLBACK (pg_partman)
    2. Each file depends on the previous to run in a specific order. Number the test files. Last one(s) clean things up. (mimeo)
  • Add a README_test file with important test info (necessary schema architecture, dependencies, ex. max_locks_per_transaction in pg_partman)

PGXN

  • PostgreSQL Extension Network - https://pgxn.org
  • Repository of available extensions maintained by their author
  • Command line tool for managing extension installation
  • Where I stole most of my Makefile features from and contributed a few back

Go Make An Extension!