PostgreSQL Partitioning:
Then & Now

Presented by Keith Fiske / @keithf4

Senior Database Engineer @ Crunchy Data Solutions, Inc
(pg_partman, pg_extractor, mimeo)



Follow along at
http://slides.keithf4.com/partitioning_thennow

Crunchy Data Solutions, Inc

  • Industry leader in providing enterprise PostgreSQL support and open source solutions
  • Crunchy Certified PostgreSQL
    • 100% Open Source PostgreSQL
    • Common Criteria EAL 2+
  • We're hiring!

Partitioning: What & Why?

  • Organization of data into logical "chunks"
  • Each "chunk" goes into its own table
  • Allows planner to optimize around that organization (partition pruning, tuple routing)
  • Allows data retention policies
    • Easily remove data that is no longer needed "live"
    • Minimal maintenance impact
    • Improved write & read performance

How did PostgreSQL do this before 10?

  • Inheritance
  • Constraints
  • Triggers

How did PostgreSQL do this? (pre-10)

Table Inheritance

  • Causes target table to inherit the column structure and storage options of the given parent.
  • Schema changes to parent are automatically applied to children. Not vice-versa.
  • All data in child tables is visible when the parent is queried.
  • Data inserted into parent goes into parent without trigger/rule redirection.
  • Tablespaces not inherited

CREATE TABLE ... INHERITS ...;
ALTER TABLE ... INHERITS ...;
                    

How did PostgreSQL do this? (pre-10)

Table Constraints

  • Each child table has a constraint on the partition column that limits the data to specific values/ranges based on the partitioning pattern
  • When the parent table is queried, constraint exclusion allows the planner to skip scanning child tables that are known to not have the desired data based on the constraint

ALTER TABLE ... ADD CONSTRAINT ... CHECK ('lower bound' >= partition_column AND 'upper bound' < partition_column);'
ALTER TABLE ... ADD CONSTRAINT ... CHECK ('static value' = partition_column);'
                    

How did PostgreSQL do this? (pre-10)

Triggers

  • Use table trigger on parent table to redirect data to child tables
  • Straight-forward for INSERT-only. Tricky for UPDATEs.
    • Requires moving data from one table to another causing write amplification.

CREATE OR REPLACE FUNCTION partman_test.time_taptest_table_part_trig_func()
 RETURNS trigger
 LANGUAGE plpgsql
AS $function$
            DECLARE
            v_count                 int;
            v_partition_name        text;
            v_partition_timestamp   timestamptz;
        BEGIN 
        IF TG_OP = 'INSERT' THEN 
            v_partition_timestamp := date_trunc('day', NEW.col3);
            IF NEW.col3 >= '2018-01-12 00:00:00-05' AND NEW.col3 < '2018-01-13 00:00:00-05' THEN 
            INSERT INTO partman_test.time_taptest_table_p2018_01_12 VALUES (NEW.*) ; 
            ELSIF NEW.col3 >= '2018-01-11 00:00:00-05' AND NEW.col3 < '2018-01-12 00:00:00-05' THEN 
                INSERT INTO partman_test.time_taptest_table_p2018_01_11 VALUES (NEW.*) ; 
            ELSIF NEW.col3 >= '2018-01-13 00:00:00-05' AND NEW.col3 < '2018-01-14 00:00:00-05' THEN 
                INSERT INTO partman_test.time_taptest_table_p2018_01_13 VALUES (NEW.*) ;
            ELSIF NEW.col3 >= '2018-01-10 00:00:00-05' AND NEW.col3 < '2018-01-11 00:00:00-05' THEN 
                INSERT INTO partman_test.time_taptest_table_p2018_01_10 VALUES (NEW.*) ; 
            ELSIF NEW.col3 >= '2018-01-14 00:00:00-05' AND NEW.col3 < '2018-01-15 00:00:00-05' THEN 
                INSERT INTO partman_test.time_taptest_table_p2018_01_14 VALUES (NEW.*) ;
            ELSIF NEW.col3 >= '2018-01-09 00:00:00-05' AND NEW.col3 < '2018-01-10 00:00:00-05' THEN 
                INSERT INTO partman_test.time_taptest_table_p2018_01_09 VALUES (NEW.*) ; 
            ELSIF NEW.col3 >= '2018-01-15 00:00:00-05' AND NEW.col3 < '2018-01-16 00:00:00-05' THEN 
                INSERT INTO partman_test.time_taptest_table_p2018_01_15 VALUES (NEW.*) ;
            ELSIF NEW.col3 >= '2018-01-08 00:00:00-05' AND NEW.col3 < '2018-01-09 00:00:00-05' THEN 
                INSERT INTO partman_test.time_taptest_table_p2018_01_08 VALUES (NEW.*) ; 
            ELSIF NEW.col3 >= '2018-01-16 00:00:00-05' AND NEW.col3 < '2018-01-17 00:00:00-05' THEN 
                INSERT INTO partman_test.time_taptest_table_p2018_01_16 VALUES (NEW.*) ;
            ELSIF NEW.col3 >= '2018-01-07 00:00:00-05' AND NEW.col3 < '2018-01-08 00:00:00-05' THEN 
                INSERT INTO partman_test.time_taptest_table_p2018_01_07 VALUES (NEW.*) ; 
            ELSIF NEW.col3 >= '2018-01-17 00:00:00-05' AND NEW.col3 < '2018-01-18 00:00:00-05' THEN 
                INSERT INTO partman_test.time_taptest_table_p2018_01_17 VALUES (NEW.*) ;
            ELSIF NEW.col3 >= '2018-01-06 00:00:00-05' AND NEW.col3 < '2018-01-07 00:00:00-05' THEN 
                INSERT INTO partman_test.time_taptest_table_p2018_01_06 VALUES (NEW.*) ; 
            ELSIF NEW.col3 >= '2018-01-18 00:00:00-05' AND NEW.col3 < '2018-01-19 00:00:00-05' THEN 
                INSERT INTO partman_test.time_taptest_table_p2018_01_18 VALUES (NEW.*) ;
            ELSE
                v_partition_name := partman.check_name_length('time_taptest_table'
                         , to_char(v_partition_timestamp, 'YYYY_MM_DD'), TRUE);
                SELECT count(*) INTO v_count 
                         FROM pg_catalog.pg_tables 
                         WHERE schemaname = 'partman_test'::name 
                         AND tablename = v_partition_name::name;
                IF v_count > 0 THEN 
                    EXECUTE format('INSERT INTO %I.%I VALUES($1.*) '
                        , 'partman_test', v_partition_name) USING NEW;
                ELSE
                    RETURN NEW;
                END IF;
            END IF;
        END IF;
        RETURN NULL;
        END $function$
                    

How did PostgreSQL do this? (pre-10)

Property Inheritance

  • Use the LIKE clause to the CREATE TABLE command to have the children include properties other than schema
  • 
    CREATE TABLE ... (LIKE ... { INCLUDING | EXCLUDING } { DEFAULTS | CONSTRAINTS | 
        | INDEXES | STORAGE | COMMENTS | ALL })
                        

How did PostgreSQL do this? (pre-10)

Retention

  • Without partitioning, removing data requires DELETE statements.
  • Can cause massive bloat depending on data churn. May require downtime to fix and regain performance.
  • Partitioning allows removal of old data by simply uninheriting or dropping table
  • 
    ALTER TABLE [child_table] NO INHERIT [parent_table];
    DROP TABLE [child_table];
                        

How did PostgreSQL do this? (pre-10)

pg_partman (PG 9.4+)

  • Extension to manage all previous tasks... and more!
  • Pre-creates child partitions to avoid contention
  • Automatically optimized trigger for recent data
  • Inherit index/foreign keys from parent
  • Partition/unpartition data in manageable commit batches
  • Background worker for maintenance (no cron required)
  • Manage constraints on non-partition columns
  • Naming length limits; ensure complete partition name suffix
  • Ensure consistent child table privileges
  • Sub-partitioning supported
  • Limited UPSERT support
  • Monitoring
    • Check parent for unexpected data
    • Check across child tables for uniqueness
    • pg_jobmon - logging & monitoring to ensure maintenance is running properly

How does PostgreSQL 10
do this natively?

  • https://www.postgresql.org/docs/10/static/ddl-partitioning.html
  • Inheritance, constraints & triggers all handled internally
  • New syntax commands
  • Range & List modes
  • Sub-partitioning supported
  • Internal tuple-routing tremendously faster than trigger redirects
  • Not yet any significant read optimization that's better than pre-10

How does PostgreSQL do this? (PG 10)

Range Partitioning

  • Partitioned into ranges by one or more columns with no overlap between partitions. Ex: Time/ID

CREATE TABLE measurement (
    city_id         int not null,
    logdate         date not null,
    peaktemp        int,
    unitsales       int
) PARTITION BY RANGE (logdate);                   
                    

How does PostgreSQL do this? (PG 10)

Range Partitioning


CREATE TABLE measurement_y2006m02 PARTITION OF measurement
    FOR VALUES FROM ('2006-02-01') TO ('2006-03-01');

CREATE TABLE measurement_y2006m03 PARTITION OF measurement
    FOR VALUES FROM ('2006-03-01') TO ('2006-04-01');
                    

keith@keith=# \d+ measurement
                                 Table "public.measurement"
  Column   |  Type   | Collation | Nullable | Default | Storage | Stats target | Description 
-----------+---------+-----------+----------+---------+---------+--------------+-------------
 city_id   | integer |           | not null |         | plain   |              | 
 logdate   | date    |           | not null |         | plain   |              | 
 peaktemp  | integer |           |          |         | plain   |              | 
 unitsales | integer |           |          |         | plain   |              | 
Partition key: RANGE (logdate)
Partitions: measurement_y2006m02 FOR VALUES FROM ('2006-02-01') TO ('2006-03-01'),
            measurement_y2006m03 FOR VALUES FROM ('2006-03-01') TO ('2006-04-01')
                    

How does PostgreSQL do this? (PG 10)

List Partitioning

  • Partitioned by explicitly listing which key value(s) appear(s) in each partition

CREATE TABLE cities (
    city_id         bigserial not null,
    name         text not null,
    population   int
) PARTITION BY LIST (initcap(name));
                    

How does PostgreSQL do this? (PG 10)

List Partitioning


CREATE TABLE cities_west
    PARTITION OF cities (
    CONSTRAINT city_id_nonzero CHECK (city_id != 0)
) FOR VALUES IN ('Los Angeles', 'San Francisco');
                    

keith@keith=# \d+ cities
                                                     Table "public.cities"
   Column   |  Type   | Collation | Nullable |                 Default                 | Storage  | Stats target | Description 
------------+---------+-----------+----------+-----------------------------------------+----------+--------------+-------------
 city_id    | bigint  |           | not null | nextval('cities_city_id_seq'::regclass) | plain    |              | 
 name       | text    |           | not null |                                         | extended |              | 
 population | integer |           |          |                                         | plain    |              | 
Partition key: LIST (initcap(name))
Partitions: cities_west FOR VALUES IN ('Los Angeles', 'San Francisco')
                    

How does PostgreSQL do this? (PG 10)

pg_partman

  • Native does not automatically make new partitions, so pg_partman still useful.
  • Privileges of parent not explicitly given to children in native. partman did this pre-10 and still does for 10+ to allow direct child interaction.
  • Uses a template table since indexes/FKs cannot be applied to a native parent (yet).
  • Native does not handle data that has no target child (yet). Still requires old trigger method.

How does PostgreSQL do this? (PG 10)

A Note About "IDENTITY"

  • New feature in PG10 to properly follow SQL standard for managing table sequences
  • Better handling of sequence permissions when tied to a table
  • Better enforcement of only allowing sequence use for column values
  • Easier to remove sequences from a table
  • Only supported properly with native partitioning
  • Only works when entering data through the parent table
  • More info on 2ndQuadrant Blog

How will PostgreSQL do this? (PG 11+)

Update Routing (committed)

  • When an UPDATE causes a row to no longer match the partition constraint, try to move it to a different partition where it does match the partition constraint
  • Behind the scences, does a DELETE/INSERT
  • Concurrency has some special edge-cases. See docs.

How will PostgreSQL do this? (PG 11+)

Hash Partitioning (committed)

  • Used when you want to partition a growing data set evenly or don't know data distribution in advance
  • Thanks to Depesz for examples (depesz.com)

CREATE TABLE users (
    username    text         not null,
    password    text,
    created_on  timestamptz  not null default now(),
    id_admin    bool         not null default false
) PARTITION BY HASH (username);
                    

Hash Partitioning

  • Simple example: MODULUS is number of partitions, and REMAINDER is a number, 0 or more, but less than MODULUS.

CREATE TABLE users_p0 PARTITION OF users ( primary key (username) ) FOR VALUES WITH (MODULUS 8, REMAINDER 0);
CREATE TABLE users_p1 PARTITION OF users ( primary key (username) ) FOR VALUES WITH (MODULUS 8, REMAINDER 1);
CREATE TABLE users_p2 PARTITION OF users ( primary key (username) ) FOR VALUES WITH (MODULUS 8, REMAINDER 2);
CREATE TABLE users_p3 PARTITION OF users ( primary key (username) ) FOR VALUES WITH (MODULUS 8, REMAINDER 3);
CREATE TABLE users_p4 PARTITION OF users ( primary key (username) ) FOR VALUES WITH (MODULUS 8, REMAINDER 4);
CREATE TABLE users_p5 PARTITION OF users ( primary key (username) ) FOR VALUES WITH (MODULUS 8, REMAINDER 5);
CREATE TABLE users_p6 PARTITION OF users ( primary key (username) ) FOR VALUES WITH (MODULUS 8, REMAINDER 6);
CREATE TABLE users_p7 PARTITION OF users ( primary key (username) ) FOR VALUES WITH (MODULUS 8, REMAINDER 7);
                    

Hash Partitioning


\d+ users
                                              Table "public.users"
   Column   |           Type           | Collation | Nullable | Default | Storage  | Stats target | Description 
------------+--------------------------+-----------+----------+---------+----------+--------------+-------------
 username   | text                     |           | not null |         | extended |              | 
 password   | text                     |           |          |         | extended |              | 
 created_on | timestamp with time zone |           | not null | now()   | plain    |              | 
 id_admin   | boolean                  |           | not null | false   | plain    |              | 
Partition key: HASH (username)
Partitions: users_p0 FOR VALUES WITH (modulus 8, remainder 0),
            users_p1 FOR VALUES WITH (modulus 8, remainder 1),
            users_p2 FOR VALUES WITH (modulus 8, remainder 2),
            users_p3 FOR VALUES WITH (modulus 8, remainder 3),
            users_p4 FOR VALUES WITH (modulus 8, remainder 4),
            users_p5 FOR VALUES WITH (modulus 8, remainder 5),
            users_p6 FOR VALUES WITH (modulus 8, remainder 6),
            users_p7 FOR VALUES WITH (modulus 8, remainder 7)
                    

\d+ users_p1
                                            Table "public.users_p1"
   Column   |           Type           | Collation | Nullable | Default | Storage  | Stats target | Description 
------------+--------------------------+-----------+----------+---------+----------+--------------+-------------
 username   | text                     |           | not null |         | extended |              | 
 password   | text                     |           |          |         | extended |              | 
 created_on | timestamp with time zone |           | not null | now()   | plain    |              | 
 id_admin   | boolean                  |           | not null | false   | plain    |              | 
Partition of: users FOR VALUES WITH (modulus 8, remainder 1)
Partition constraint: satisfies_hash_partition('1161847'::oid, 8, 1, username)
Indexes:
    "users_p1_pkey" PRIMARY KEY, btree (username)
                    

Hash Partitioning


\copy users (username) from stdin;
proffers
babbles
cents
choose
chalked
redoubts
pitting
coddling
relieves
wooing
codgers
sinewy
separate
ferry
crusty
cursing
hawkers
deducted
gaseous
voyagers
\.
                    

SELECT tableoid::regclass as partition_name, count(*) FROM users GROUP BY 1 ORDER BY 1;

 partition_name | count 
----------------+-------
 users_p0       |     2
 users_p1       |     5
 users_p2       |     1
 users_p3       |     3
 users_p4       |     2
 users_p5       |     3
 users_p6       |     3
 users_p7       |     1
(8 rows)
                    

How will PostgreSQL do this? (PG 11+)

Default partition (committed)

  • Handle partition values that do not have a defined child

ALTER TABLE [parent_table] ATTACH PARTITION [partition_name] DEFAULT;
                    

How will PostgreSQL do this? (PG 11+)

Limited UPSERT (committed)

  • Allow INSERT ... ON CONFLICT DO NOTHING
  • DO UPDATE
    • Works if there's a matching unique constraint with the partition key

How will PostgreSQL do this? (PG 11+)

Faster read querying

  • Faster partition pruning
  • More efficient than constraint exclusion to allow faster querying of partitioned data
  • Improves both planner and execution time (Committed for 11)
  • Partition-wise aggregation/grouping (Committed for 11)
    • Turned off by default due to increased memory usage

How will PostgreSQL do this? (PG 11+)

Index Inheritance (committed)

  • Primary, Unique, & General indexes
  • Primary & Unique require the partition column to be part of the index definition
  • Uniqueness only enforced per child table, not partition-wide. But it technically works since partition key is required.

Still to come

  • Foreign Keys - FROM works but still working on TO
  • Further UPSERT support options
  • Tablespace inheritance
  • Global Indexes - No ETA (please correct me)

Thank you!