Skip to content

Introduce an Explicit Workflow for Renaming Schema Objects #49

@NattyPluz

Description

@NattyPluz

This proposal introduces a safe, explicit, and idempotent workflow for handling RENAME operations. It centers around a version-controlled manifest file, refactor.yaml, which explicitly declares the developer's intent. To ensure operations run only once, a new internal log table, pgschema.refactor_log, will track completed refactors. Finally, a new command, pgschema refactors, will provide a way to audit this log.

The Problem: Renaming is Destructive

For a user of the current version of pgschema, any attempt to rename an object is interpreted as a destructive DROP and CREATE operation, leading to data loss. This proposal solves that by allowing developers to explicitly declare their intent.

Core Component 1: The pgschema.refactor_log Table

The cornerstone of this solution is an internal log table whose sole purpose is to serve as a permanent, historical record of completed refactoring operations. This is the key to idempotency.

  • Location: Resides in a dedicated pgschema schema (CREATE SCHEMA IF NOT EXISTS pgschema;).
  • Proposed Structure (pgschema.refactor_log):
-- This table stores a permanent, append-only log of completed refactoring operations.
-- Its primary purpose is to ensure that a given refactor is only ever applied once.
CREATE TABLE IF NOT EXISTS pgschema.refactor_log (
    id BIGSERIAL PRIMARY KEY,

    -- A unique, deterministic SHA-256 hash of the operation's parameters.
    -- This is the key to ensuring idempotency.
    -- e.g., '3a4f6534571e1f76f7f6c3823f6d7240c21b3a6a9b5f0a1c2d3e4f5a6b7c8d9e'
    operation_hash TEXT NOT NULL UNIQUE,

    -- The type of refactoring operation that was performed.
    -- e.g., 'RENAME'
    operation_type VARCHAR(64) NOT NULL,

    -- The type of database object that was affected.
    -- e.g., 'TABLE', 'COLUMN'
    object_type VARCHAR(64) NOT NULL,

    -- The schema of the object that was refactored.
    -- e.g., 'public'
    schema_name TEXT NOT NULL,

    -- For nested objects (like columns or indexes), this is the name of the parent object.
    -- For top-level objects (like tables), this is NULL.
    -- e.g., 'users' for a column rename, NULL for a table rename
    parent_object_name TEXT,

    -- The name of the object before the refactor.
    -- e.g., 'email', 'orders_archive'
    old_name TEXT,

    -- The name of the object after the refactor.
    -- e.g., 'email_address', 'archived_orders'
    new_name TEXT,

    -- The timestamp when the operation was successfully applied.
    -- e.g., '2025-09-28 14:30:00Z'
    applied_at TIMESTPTZ NOT NULL DEFAULT NOW()
);

Core Component 2: The refactor.yaml Manifest File

This file is the developer's explicit, version-controlled intent for the next migration.

  • Syntax (refactor.yaml):
    renames:
      - type: column
        table: public.users
        from: email
        to: email_address
    
      # For rare edge cases, an optional `nonce` can force a unique hash.
      - type: table
        from: public.archived_orders
        to: public.orders_archive_2025
        # A "nonce" (number used once) makes this operation's hash unique.
        nonce: "archive-orders-for-end-of-year-2025"

Part 1: Declaring Intent (Two Alternative Workflows)

A developer can use either method to populate their local refactor.yaml file.

Workflow A: Convenient CLI (pgschema rename)

  • New Command: pgschema rename
  • Behavior: A pure file-writing utility that appends an entry to the local refactor.yaml file. It does not interact with the database.
  • Syntax: pgschema rename column --table public.users --from email --to email_address

Workflow B: Manual File Creation

The developer creates or edits the refactor.yaml file by hand, giving them full control, including the ability to add a nonce if needed.


Part 2: Applying and Tracking Changes

The plan and apply commands are enhanced to use the manifest and the log. The key is the deterministic generation of an operation_hash for each directive.

  • Hash Generation: The operation_hash is a SHA-256 hash of a canonical JSON string representing the operation. If the optional nonce field is present in refactor.yaml, its value is included in the string that gets hashed. This ensures that even if all other parameters are identical to a past operation, the presence of a unique nonce will generate a new, un-logged hash.

  • pgschema plan: Reads refactor.yaml, checks the pgschema.refactor_log for each entry's hash, and generates a safe RENAME plan for any un-logged operations.

  • pgschema apply: Executes the RENAME and records its hash in the pgschema.refactor_log to mark it as completed.


Part 3: Auditing and History

To provide visibility into the history of completed refactors, a new, single-word command is introduced.

  • New Command: pgschema refactors
  • Behavior: This is a read-only command that connects to the database, queries the pgschema.refactor_log table, and displays a list of all completed refactoring operations for auditing purposes.
  • Default Output (Human-Readable):
    $ pgschema refactors --db myapp
    
    TYPE    OBJECT    SCHEMA    PARENT    OLD_NAME        NEW_NAME         APPLIED_AT
    ------- --------- --------- --------- --------------- ---------------- --------------------
    RENAME  COLUMN    public    users     email           email_address    2025-09-28T14:30:00Z
    
  • Optional YAML Output:
    pgschema refactors --db myapp --output-yaml > refactor_history.yaml

Metadata

Metadata

Assignees

Labels

No labels
No labels

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions