Data Model
All Icepack state lives in a single Postgres database. This page documents the
hot-path queue/lock/cache tables plus the durable history tables created by
HistoryStore.ensure_schema().
jobs
The primary record for each maintenance request. One row per job.
| Column | Type | Constraints | Description |
|---|---|---|---|
job_id | TEXT | PK | Unique job identifier |
database_name | TEXT | NOT NULL | Iceberg database name |
table_name | TEXT | NOT NULL | Iceberg table name |
actions | TEXT[] | NOT NULL | Ordered list of maintenance actions |
dry_run | BOOL | NOT NULL | Whether to simulate without writing |
status | TEXT | NOT NULL | One of: pending, running, completed, failed, cancelled |
submitted_at | TIMESTAMPTZ | NOT NULL | When the job was submitted |
started_at | TIMESTAMPTZ | When the worker began execution | |
completed_at | TIMESTAMPTZ | When the job reached terminal state | |
error | TEXT | Error message if failed | |
created_at | TIMESTAMPTZ | DEFAULT NOW() | Row creation timestamp |
attempt_id | TEXT | DL-197 fence — current claim’s unique ID | |
deleted_at | TIMESTAMPTZ | Soft-delete tombstone |
job_queue
Transient work queue. Rows exist only while a job is in-flight (pending or running). Deleted on ack.
| Column | Type | Constraints | Description |
|---|---|---|---|
job_id | TEXT | PK, FK -> jobs | References the parent job |
enqueued_at | TIMESTAMPTZ | DEFAULT NOW() | When the job entered the queue |
visible_at | TIMESTAMPTZ | NOT NULL | Claim lease deadline — job becomes re-claimable after this time |
claimed_at | TIMESTAMPTZ | When a worker claimed the job | |
claimed_by | TEXT | Worker hostname | |
attempt_id | TEXT | Fence echo — must match jobs.attempt_id | |
attempt_count | INT | DEFAULT 0 | Number of claim attempts |
job_actions
Per-action results within a job. One row per action execution.
| Column | Type | Constraints | Description |
|---|---|---|---|
id | BIGINT | PK, identity | Unique action-result identifier |
job_id | TEXT | FK -> jobs | Parent job |
action | TEXT | NOT NULL | Action name (e.g., rewrite_data_files, expire_snapshots) |
success | BOOL | NOT NULL | Whether the action succeeded |
message | TEXT | NOT NULL, DEFAULT '' | Human-readable result summary |
impact | TEXT | nullable | Concise user-facing summary of the action’s effect, such as 3,000 -> 1,500 data files |
error | TEXT | Error message if the action failed | |
elapsed_seconds | DOUBLE PRECISION | NOT NULL, DEFAULT 0.0 | Wall-clock duration of the action |
created_at | TIMESTAMPTZ | NOT NULL, DEFAULT NOW() | Row creation timestamp |
table_locks
Per-table ownership lock. Prevents concurrent maintenance on the same table.
| Column | Type | Constraints | Description |
|---|---|---|---|
table_key | TEXT | PK | "{database}.{table}" composite key |
holder | TEXT | NOT NULL | job_id of the lock owner |
acquired_at | TIMESTAMPTZ | DEFAULT NOW() | When the lock was acquired |
expires_at | TIMESTAMPTZ | NOT NULL | Lock expiration deadline |
table_cache_entries
Cached table inventory from the catalog. Refreshed atomically by TableCacheSyncWorker.
| Column | Type | Constraints | Description |
|---|---|---|---|
database_name | TEXT | Iceberg database name | |
table_name | TEXT | Iceberg table name | |
maintenance_enabled | BOOL | nullable | Whether maintenance is enabled for this table |
payload | JSONB | Full table metadata payload |
Primary key: (database_name, table_name)
table_cache_meta
Single-row metadata about the table cache sync state.
| Column | Type | Constraints | Description |
|---|---|---|---|
id | INT | PK, CHECK (id = 1) | Single-row invariant |
last_synced_at | TIMESTAMPTZ | Last successful sync timestamp |
health_snapshots
Legacy mixed health-report snapshots. This table is still created for backward
compatibility with older store methods and is pruned by retention cleanup, but
new cached table contracts use table_status_snapshots and
table_health_snapshots.
| Column | Type | Constraints | Description |
|---|---|---|---|
id | BIGINT | PK, identity | Unique snapshot identifier |
database_name | TEXT | NOT NULL | Iceberg database name |
table_name | TEXT | NOT NULL | Iceberg table name |
total_data_files | INTEGER | NOT NULL, DEFAULT 0 | Current data-file count |
total_delete_files | INTEGER | NOT NULL, DEFAULT 0 | Current delete-file count |
total_size_bytes | BIGINT | NOT NULL, DEFAULT 0 | Total data size |
avg_file_size_bytes | DOUBLE PRECISION | NOT NULL, DEFAULT 0.0 | Average data-file size |
small_file_count | INTEGER | NOT NULL, DEFAULT 0 | Number of small files |
small_file_pct | DOUBLE PRECISION | NOT NULL, DEFAULT 0.0 | Percent of data files considered small |
snapshot_count | INTEGER | NOT NULL, DEFAULT 0 | Snapshot count |
manifest_count | INTEGER | NOT NULL, DEFAULT 0 | Manifest count |
needs_maintenance | BOOLEAN | NOT NULL | Whether any threshold recommends work |
maintenance_enabled | BOOLEAN | nullable | Three-state maintenance enrollment; NULL means unset (maintained in opt-out mode, skipped in opt-in mode) |
maintenance_cadence_hours | INTEGER | Per-table cadence override | |
recommended_actions | TEXT[] | NOT NULL, DEFAULT '{}' | Ordered maintenance actions |
error | TEXT | Non-fatal analyzer error | |
total_records | BIGINT | DEFAULT 0 | Current record count |
snapshot_id | BIGINT | DEFAULT 0 | Current snapshot ID |
health_status | TEXT | DEFAULT 'Unknown' | Summary health label |
hours_since_last_snapshot | DOUBLE PRECISION | DEFAULT 0 | Age of the current snapshot |
oldest_snapshot_age_hours | DOUBLE PRECISION | DEFAULT 0 | Age of the oldest retained snapshot |
checked_at | TIMESTAMPTZ | NOT NULL, DEFAULT NOW() | Collection timestamp |
New callers should not depend on this table for action intent. Use
/tables/{database}/{table}/maintenance/recommendation for policy-aware actions.
table_status_snapshots
Split status payloads persisted by live /tables/{database}/{table}/status,
live /tables/{database}/{table}/health, live recommendation fallback, and the
health-sync worker. Cached status reads use the newest successful row per table.
Failed collection attempts are recorded as rows with success = false.
| Column | Type | Constraints | Description |
|---|---|---|---|
id | BIGINT | PK, identity | Unique snapshot identifier |
database_name | TEXT | NOT NULL | Iceberg database name |
table_name | TEXT | NOT NULL | Iceberg table name |
payload | JSONB | nullable | Successful /status payload |
success | BOOLEAN | NOT NULL, DEFAULT TRUE | Whether collection succeeded |
error | TEXT | nullable | Error message for failed collection |
collected_at | TIMESTAMPTZ | NOT NULL, DEFAULT NOW() | Collection timestamp |
The check constraint enforces either (success = true, payload present, error null) or (success = false, payload null, error present).
table_health_snapshots
Split health assessment payloads persisted by live
/tables/{database}/{table}/health and by the health-sync worker. Cached health
reads and /jobs/history/health use this table.
| Column | Type | Constraints | Description |
|---|---|---|---|
id | BIGINT | PK, identity | Unique assessment identifier |
database_name | TEXT | NOT NULL | Iceberg database name |
table_name | TEXT | NOT NULL | Iceberg table name |
health_status | TEXT | NOT NULL | Healthy, Warning, or Critical |
payload | JSONB | NOT NULL | Full /health assessment payload |
assessed_at | TIMESTAMPTZ | NOT NULL, DEFAULT NOW() | Assessment timestamp |
orchestrator_runs
Audit log of orchestrator CronJob executions.
| Column | Type | Constraints | Description |
|---|---|---|---|
id | BIGINT | PK, identity | Unique run identifier |
started_at | TIMESTAMPTZ | NOT NULL | When the run started |
completed_at | TIMESTAMPTZ | NOT NULL | When the run completed |
status | TEXT | NOT NULL | Run outcome |
tables_assessed | INTEGER | NOT NULL, DEFAULT 0 | Tables evaluated |
tables_skipped | INTEGER | NOT NULL, DEFAULT 0 | Tables skipped |
tables_healthy | INTEGER | NOT NULL, DEFAULT 0 | Tables already healthy |
jobs_submitted | INTEGER | NOT NULL, DEFAULT 0 | Jobs submitted by the run |
jobs_completed | INTEGER | NOT NULL, DEFAULT 0 | Submitted jobs completed |
jobs_failed | INTEGER | NOT NULL, DEFAULT 0 | Submitted jobs failed |
dry_run | BOOLEAN | NOT NULL, DEFAULT FALSE | Whether the run was advisory only |
error | TEXT | Run-level error | |
created_at | TIMESTAMPTZ | NOT NULL, DEFAULT NOW() | Row creation timestamp |
Key indexes
| Index | Definition | Purpose |
|---|---|---|
idx_jobs_table | (database_name, table_name, submitted_at DESC) | Table-scoped job history lookups |
idx_jobs_status | (status, submitted_at DESC) | Status-filtered job lists |
idx_jobs_submitted | (submitted_at DESC) | Recent-job ordering |
idx_jobs_visible | Partial index WHERE deleted_at IS NULL | Supports the live-read joined view — excludes soft-deleted rows so the API’s job listing query only scans active jobs |
idx_job_actions_job_id | (job_id) | Fetches per-job action results |
idx_job_actions_action | (action, created_at DESC) | Action-level history queries |
idx_job_queue_ready | Partial index on (visible_at) WHERE claimed_at IS NULL | Serves KEDA’s scaler query and the worker’s dequeue_with_claim CTE |
idx_job_queue_claimed | Partial index on (visible_at) WHERE claimed_at IS NOT NULL | Supports stale-claim sweeps |
idx_table_locks_expires | (expires_at) | Finds expired table locks |
idx_table_cache_maintenance | Partial index on (maintenance_enabled) WHERE maintenance_enabled = TRUE | Speeds listing of explicitly-enabled tables |
idx_health_table | (database_name, table_name, checked_at DESC) | Fetches latest cached health per table |
idx_health_checked_at | (checked_at DESC) | Retention and recent-health scans |
idx_table_status_success | Partial index on (database_name, table_name, collected_at DESC) WHERE success = TRUE | Fetches latest successful cached status |
idx_table_status_latest | (database_name, table_name, collected_at DESC) | Finds latest status attempt, including failures |
idx_table_health_latest | (database_name, table_name, assessed_at DESC) | Fetches latest cached health assessment |
idx_table_health_status | (health_status, assessed_at DESC) | Health-status filtered assessment scans |
idx_orchestrator_runs_started | (started_at DESC) | Recent orchestrator run listing |