Designing ISO container hierarchy trees in PostgreSQL

Modern terminal operating systems (TOS) and shipping line stowage planners operate under strict Service Level Agreements (SLAs) for yard planning, crane dispatch, and customs clearance. At the operational core of these workflows lies a deterministic representation of container placement, status, and regulatory constraints. When engineering systems for maritime operations, establishing a rigid tree structure that maps vessel bays, rows, tiers, and individual ISO containers to their operational metadata is non-negotiable. This architecture must align with established Core Maritime Architecture & Taxonomy principles to ensure interoperability across legacy EDI pipelines, modern API gateways, and automated straddle carrier networks.

Schema Architecture: Materialized Paths Over Recursive Traversal

PostgreSQL offers multiple strategies for hierarchical data, but high-throughput port environments demand predictable read latency. Recursive CTEs (WITH RECURSIVE) degrade rapidly under concurrent read loads and trigger excessive buffer cache thrashing when traversing 10,000+ slot trees. The production standard is the ltree extension combined with materialized path indexing. A canonical path such as VESSEL_0401.BAY_04.ROW_01.TIER_08.CONT_02 enables prefix matching, rapid slot availability queries, and efficient subtree isolation without recursive joins.

-- Enable ltree extension (requires superuser or pre-provisioned role)
CREATE EXTENSION IF NOT EXISTS ltree;

-- Core hierarchy table
CREATE TABLE container_hierarchy (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    path ltree NOT NULL,
    iso_code VARCHAR(11) NOT NULL,
    status VARCHAR(20) NOT NULL DEFAULT 'PLANNED',
    vgm_weight_kg NUMERIC(8,2),
    imdg_class VARCHAR(10),
    customs_hold BOOLEAN DEFAULT FALSE,
    bl_reference VARCHAR(20),
    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW(),
    
    -- Enforce ISO 6346 format at the database layer (4 alpha + 7 numeric)
    CONSTRAINT chk_iso_code CHECK (iso_code ~ '^[A-Z]{4}\d{7}$'),
    -- Prevent invalid path structures (Vessel.Bay.Row.Tier[.Container])
    CONSTRAINT chk_path_depth CHECK (nlevel(path) BETWEEN 4 AND 5),
    -- SOLAS VGM compliance: must be positive and within structural limits
    CONSTRAINT chk_vgm_range CHECK (vgm_weight_kg IS NULL OR vgm_weight_kg BETWEEN 1000 AND 35000),
    -- IMDG class validation (1-9, or NULL for non-hazardous)
    CONSTRAINT chk_imdg_class CHECK (imdg_class IS NULL OR imdg_class ~ '^[1-9](\.[0-9])?$')
);

-- GiST index for prefix matching and subtree queries (O(log n) traversal)
CREATE INDEX idx_container_path_gist ON container_hierarchy USING GIST (path);
-- B-tree for exact slot lookups and equality joins
CREATE INDEX idx_container_path_btree ON container_hierarchy (path);
-- Partial index for active operational queries (excludes historical/archived)
CREATE INDEX idx_active_containers ON container_hierarchy (path, status) 
    WHERE status NOT IN ('DISCHARGED', 'ARCHIVED');

Normalization Pipeline: Resolving Format Drift

flowchart LR
  A["Slot id variants
0401082 · BAY04-ROW01-TIER08 · 04-01-08-2"] --> B["normalize_slot_identifier()"] B --> C{"Bay / tier
sanity check"} C -->|tier over 12| X["Reject · invalid stacking"] C -->|valid| D["Canonical ltree path
VES01.BAY_04.ROW_01.TIER_08"] D --> E[("container_hierarchy · GiST index")]

Format drift occurs when terminal operators ingest conflicting location identifiers from EDI 309/310 stowage plans, JSON-based shipping line APIs, and OCR-scanned gate receipts. A single vessel call may receive 0401082, BAY04-ROW01-TIER08-POS02, and 04-01-08-2 for the same physical slot. Normalizing these inputs requires a deterministic parsing pipeline that maps all external identifiers to a canonical ltree path before insertion. This validation layer is critical when aligning with Container Hierarchy Data Models across multi-terminal environments.

import re
import structlog
from typing import Optional, Dict, Any
from dataclasses import dataclass

logger = structlog.get_logger()

@dataclass
class SlotCoordinates:
    bay: int
    row: int
    tier: int
    container_pos: Optional[int] = None

def normalize_slot_identifier(raw_input: str, vessel_prefix: str) -> str:
    """
    Parses fragmented EDI/API/OCR inputs into a canonical ltree path.
    Handles real-world quirks: zero-padding inconsistencies, hyphen/space delimiters,
    and missing container position indicators.
    """
    # Strip non-alphanumeric delimiters and normalize to uppercase
    cleaned = re.sub(r'[^A-Z0-9]', '', raw_input.upper())
    
    # Two unambiguous forms. Keyword-delimited (BAY04ROW01TIER08POS02) is parsed
    # by label; bare numeric is fixed-width (2 bay, 2 row, 2 tier, optional 1-2 pos)
    # so digits are never misallocated (e.g. 0401082 -> 04/01/08/2, not 040/10/82).
    if any(kw in cleaned for kw in ("BAY", "ROW", "TIER")):
        match = re.match(r'BAY(\d{2,3})ROW(\d{2})TIER(\d{2})(?:(?:POS|CONT)(\d{1,2}))?$', cleaned)
    else:
        match = re.match(r'(\d{2})(\d{2})(\d{2})(\d{1,2})?$', cleaned)

    if not match:
        logger.error("invalid_slot_format", raw=raw_input, error="regex_mismatch")
        raise ValueError(f"Unparseable slot identifier: {raw_input}")

    g = match.groups()
    bay, row, tier = int(g[0]), int(g[1]), int(g[2])
    pos = int(g[3]) if g[3] is not None else None
    
    # Operational sanity checks
    if not (0 < bay <= 100):
        logger.warning("bay_out_of_range", bay=bay, action="clamped_to_max")
        bay = 100
    if not (0 < tier <= 12):
        logger.error("tier_exceeds_vessel_profile", tier=tier)
        raise ValueError("Tier exceeds standard vessel stacking profile")
        
    # Construct canonical path
    path_segments = [
        vessel_prefix,
        f"BAY_{bay:02d}",
        f"ROW_{row:02d}",
        f"TIER_{tier:02d}"
    ]
    if pos is not None:
        path_segments.append(f"CONT_{pos:02d}")
        
    canonical_path = ".".join(path_segments)
    logger.info("slot_normalized", raw=raw_input, canonical=canonical_path)
    return canonical_path

Regulatory Enforcement & Security Boundaries

Maritime operations cannot treat hierarchy trees as purely geometric constructs. Each node must carry regulatory state. IMDG segregation rules require that hazardous containers are never stacked adjacent to incompatible classes, while customs holds mandate that specific subtrees remain locked until clearance documentation is verified. Integrating Bill of Lading Schema Mapping ensures that commercial metadata aligns with physical placement.

Security protocols further dictate that certain bays (e.g., reefer plugs, dangerous goods zones) operate under strict Maritime Security Boundary Setup constraints. The schema above enforces these at the row level via imdg_class, customs_hold, and status flags. When a container transitions to SECURITY_HOLD, downstream automation (straddle carriers, RTGs) must respect the boundary without querying external systems. The ltree prefix structure allows security middleware to instantly isolate and block operations on entire subtrees using path <@ 'VESSEL_0401.BAY_04'::ltree.

Operational Query Patterns & Fallback Routing

Stowage planners require millisecond responses when querying available slots. The materialized path enables direct prefix scans. During Port Call Workflow Design, systems must also implement fallback routing logic when primary slots are occupied by holds or damaged equipment.

-- Find next available tier in a specific bay/row, excluding holds and IMDG conflicts
-- Uses ltree prefix matching for O(log n) performance
SELECT path, status, vgm_weight_kg
FROM container_hierarchy
WHERE path <@ 'VESSEL_0401.BAY_04.ROW_01'::ltree
  AND status = 'AVAILABLE'
  AND customs_hold = FALSE
  AND imdg_class IS NULL
ORDER BY path ASC
LIMIT 1;

-- Fallback: If primary bay is full, route to adjacent bay using set operations
-- This avoids application-level recursion and leverages PostgreSQL's set operations
WITH primary_slots AS (
    SELECT path FROM container_hierarchy WHERE path <@ 'VESSEL_0401.BAY_04'::ltree AND status = 'AVAILABLE'
),
fallback_slots AS (
    SELECT path FROM container_hierarchy WHERE path <@ 'VESSEL_0401.BAY_05'::ltree AND status = 'AVAILABLE'
)
SELECT path FROM primary_slots
UNION ALL
SELECT path FROM fallback_slots
LIMIT 1;

Python Integration & Structured Observability

Database interactions must be wrapped in transactional boundaries with explicit error handling and structured telemetry. The following example demonstrates a production-ready insertion routine that respects SOLAS VGM limits, logs operational state changes, and handles constraint violations gracefully.

import asyncpg
import structlog
from datetime import datetime, timezone
from typing import Optional, Dict, Any

logger = structlog.get_logger()

async def register_container_in_tree(
    pool: asyncpg.Pool,
    vessel_prefix: str,
    raw_slot: str,
    iso_code: str,
    vgm: Optional[float],
    imdg: Optional[str],
    bl_ref: str
) -> Dict[str, Any]:
    """
    Inserts a container into the hierarchy tree with full regulatory validation.
    Handles real data quirks: duplicate ISO codes, VGM drift, and concurrent slot contention.
    """
    try:
        canonical_path = normalize_slot_identifier(raw_slot, vessel_prefix)
        
        async with pool.acquire() as conn:
            async with conn.transaction():
                # Check for existing ISO code in tree (prevents duplicate B/L assignments)
                existing = await conn.fetchval(
                    "SELECT id FROM container_hierarchy WHERE iso_code = $1 AND status != 'DISCHARGED'",
                    iso_code
                )
                if existing:
                    logger.warning("duplicate_iso_detected", iso_code=iso_code, existing_id=existing)
                    raise ValueError(f"Container {iso_code} already active in yard")
                
                # Insert with explicit regulatory fields
                result = await conn.fetchrow(
                    """
                    INSERT INTO container_hierarchy 
                        (path, iso_code, vgm_weight_kg, imdg_class, bl_reference, status)
                    VALUES ($1::ltree, $2, $3, $4, $5, 'GATE_IN')
                    RETURNING id, path, created_at
                    """,
                    canonical_path, iso_code, vgm, imdg, bl_ref
                )
                
                logger.info(
                    "container_registered",
                    container_id=str(result["id"]),
                    path=str(result["path"]),
                    iso=iso_code,
                    vgm_verified=vgm is not None,
                    imdg_flag=imdg is not None,
                    timestamp=datetime.now(timezone.utc).isoformat()
                )
                return dict(result)
                
    except asyncpg.UniqueViolationError as e:
        logger.error("constraint_violation", error=str(e), action="reject_insert")
        raise RuntimeError("Slot collision or duplicate ISO code detected") from e
    except Exception as e:
        logger.exception("registration_failed", raw_slot=raw_slot, iso_code=iso_code)
        raise

Operational Discipline

Hierarchy trees in maritime systems are not academic exercises; they are the physical manifestation of terminal throughput. By anchoring the architecture to ltree materialized paths, enforcing ISO 6346 and SOLAS constraints at the database layer, and normalizing ingestion drift before persistence, engineering teams eliminate the latency and ambiguity that degrade crane utilization and delay vessel departures. The PostgreSQL schema, combined with deterministic Python parsing and structured observability, provides a resilient foundation for automated yard operations and real-time stowage optimization.