Table Properties

Icepack reads icepack.* table properties from the Iceberg catalog to control per-table behavior. These properties override global defaults set by environment variables or Helm values.

Property reference

| Property | Type | Default | Description |
| --- | --- | --- | --- |
| icepack.maintenance_enabled | bool | unset | Three-state maintenance enrollment. In opt-out mode (the default) a table is maintained unless this is set to false; in opt-in mode only tables set to true are maintained. Unset means “use the orchestrator’s mode default”. |
| icepack.maintenance_cadence_hours | int | 24 (global) | Per-table override for the minimum hours between maintenance runs. Takes precedence over ICEPACK_MAINTENANCE_CADENCE_HOURS. |
| icepack.target_file_size_bytes | int | 536870912 (512 MiB) | Target file size for compaction. Files smaller than this are candidates for rewriting. |
| icepack.max_snapshot_age_days | int | 5 | Maximum snapshot retention in days. Snapshots older than this are expired during maintenance. |
| icepack.min_input_files | int | 5 | Minimum number of small files required to trigger compaction. Prevents rewriting when there are only a few small files. |
| icepack.delete_file_threshold | int | 5 | Trigger delete-file maintenance and data-file rewrite recommendations when total delete files exceed this count. |
| icepack.partial_progress_max_commits | int | 10 | Maximum partial-progress commits Iceberg can create during one rewrite action. |
| icepack.rewrite_data_delete_file_threshold | int | 2 | Rewrite data files with at least this many associated delete files. Lower this for tables with broad position-delete pressure. |
| icepack.rewrite_data_delete_ratio_threshold | float | 0.10 | Rewrite data files when deleted rows exceed this fraction of a data file. |
| icepack.rewrite_data_rewrite_all | bool | false | Force rewrite_data_files to rewrite all selected data files. Use only for one-off remediation, not steady-state automation. |
| icepack.spark.sql.autoBroadcastJoinThreshold | bytes or -1 | unset | Session-scoped Spark SQL override for Icepack maintenance. Set to -1 to disable static automatic broadcast join selection for this table’s maintenance actions. |
| icepack.spark.sql.adaptive.autoBroadcastJoinThreshold | bytes or -1 | unset | Session-scoped Spark SQL override for Icepack maintenance. Set to -1 to disable adaptive automatic broadcast join selection for this table’s maintenance actions. |
| icepack.spark.sql.join.preferSortMergeJoin | bool | unset | Session-scoped Spark SQL override for Icepack maintenance. Set to false with a shuffled-hash threshold when steering Spark away from sort-merge join. |
| icepack.spark.sql.adaptive.maxShuffledHashJoinLocalMapThreshold | bytes | unset | Session-scoped Spark SQL override for Icepack maintenance. Allows adaptive shuffled hash join when each local hash map fits under the configured threshold, such as 2g. |
| icepack.spark.sql.maxBroadcastTableSize | bytes | unset | Session-scoped Spark SQL override for Icepack maintenance. Use only as an explicit fallback when intentionally allowing a larger broadcast; it is not part of the initial avoid-broadcast strategy. |
| compaction_skip | bool | false | Exclude this table from all Icepack maintenance. Overrides icepack.maintenance_enabled. Useful for tables undergoing migration or manual intervention. |
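
The enrollment and cadence rules in the table can be read as a small resolution function. The following is a minimal sketch of one plausible reading, not Icepack's actual implementation: the function names, the `opt_in_mode` flag, and the string parsing are all assumptions made for illustration. Only the property names, the ICEPACK_MAINTENANCE_CADENCE_HOURS variable, and the 24-hour default come from the reference above.

```python
import os
from typing import Mapping, Optional

DEFAULT_CADENCE_HOURS = 24  # global default listed in the property reference


def parse_bool(value: Optional[str]) -> Optional[bool]:
    """Return True/False for 'true'/'false' (case-insensitive), None if unset."""
    if value is None:
        return None
    normalized = value.strip().lower()
    if normalized == "true":
        return True
    if normalized == "false":
        return False
    raise ValueError(f"not a boolean table property value: {value!r}")


def is_maintained(props: Mapping[str, str], opt_in_mode: bool = False) -> bool:
    """Hypothetical helper: resolve three-state enrollment for one table."""
    # compaction_skip overrides icepack.maintenance_enabled.
    if parse_bool(props.get("compaction_skip")) is True:
        return False
    enabled = parse_bool(props.get("icepack.maintenance_enabled"))
    if enabled is None:
        # Unset: fall back to the mode default
        # (maintained in opt-out mode, not maintained in opt-in mode).
        return not opt_in_mode
    return enabled


def effective_cadence_hours(
    props: Mapping[str, str],
    env: Optional[Mapping[str, str]] = None,
) -> int:
    """Hypothetical helper: table property first, then env var, then 24."""
    env = os.environ if env is None else env
    table_value = props.get("icepack.maintenance_cadence_hours")
    if table_value is not None:
        return int(table_value)
    return int(env.get("ICEPACK_MAINTENANCE_CADENCE_HOURS", DEFAULT_CADENCE_HOURS))
```

Under these assumptions, a table with no properties is maintained in opt-out mode and skipped in opt-in mode, and a table with compaction_skip set to true is skipped regardless of icepack.maintenance_enabled.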

Setting properties

Set properties using Spark SQL ALTER TABLE ... SET TBLPROPERTIES:

-- Opt a table in to automated maintenance
ALTER TABLE lakehouse_dev.my_database.my_table
SET TBLPROPERTIES ('icepack.maintenance_enabled' = 'true');

-- Override cadence and file size for a high-throughput table
ALTER TABLE lakehouse_dev.my_database.my_table
SET TBLPROPERTIES (
  'icepack.maintenance_cadence_hours' = '3',
  'icepack.target_file_size_bytes' = '268435456'
);

-- Increase data-file rewrites for a table with heavy position-delete pressure
ALTER TABLE lakehouse_dev.my_database.my_table
SET TBLPROPERTIES (
  'icepack.rewrite_data_delete_file_threshold' = '1',
  'icepack.rewrite_data_delete_ratio_threshold' = '0.05',
  'icepack.min_input_files' = '2',
  'icepack.partial_progress_max_commits' = '3'
);

-- DL-606: steer a large orphan-file cleanup away from automatic broadcast joins
ALTER TABLE lakehouse_prod.raw.offer_eligibility_snapshots_raw
SET TBLPROPERTIES (
  'icepack.spark.sql.autoBroadcastJoinThreshold' = '-1',
  'icepack.spark.sql.adaptive.autoBroadcastJoinThreshold' = '-1',
  'icepack.spark.sql.join.preferSortMergeJoin' = 'false',
  'icepack.spark.sql.adaptive.maxShuffledHashJoinLocalMapThreshold' = '2g'
);

icepack.spark.sql.maxBroadcastTableSize is supported as an escape hatch, but do not set it for the initial DL-606 remediation when the goal is to avoid broadcast joins. Raising it to 16g changes the strategy to allow a larger broadcast and can increase memory pressure.
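
As an illustration of how the icepack.spark.sql.* properties might translate into session settings, the sketch below assumes that Icepack drops the leading `icepack.` and applies the remainder as a Spark SQL configuration key for the maintenance session. That mapping is an assumption inferred from the property names, and `spark_session_overrides` is a hypothetical helper, not part of Icepack.

```python
from typing import Dict, Mapping

ICEPACK_PREFIX = "icepack."
SPARK_SQL_PREFIX = "icepack.spark.sql."


def spark_session_overrides(props: Mapping[str, str]) -> Dict[str, str]:
    """Hypothetical helper: map icepack.spark.sql.* properties to spark.sql.* keys.

    Assumes the Spark conf key is the property name minus the 'icepack.' prefix.
    Properties outside icepack.spark.sql.* are ignored.
    """
    return {
        key[len(ICEPACK_PREFIX):]: value
        for key, value in props.items()
        if key.startswith(SPARK_SQL_PREFIX)
    }


# Table properties from the DL-606 example, plus one unrelated property.
table_props = {
    "icepack.spark.sql.autoBroadcastJoinThreshold": "-1",
    "icepack.spark.sql.adaptive.autoBroadcastJoinThreshold": "-1",
    "icepack.spark.sql.join.preferSortMergeJoin": "false",
    "icepack.spark.sql.adaptive.maxShuffledHashJoinLocalMapThreshold": "2g",
    "icepack.maintenance_cadence_hours": "3",
}

overrides = spark_session_overrides(table_props)
for conf_key in sorted(overrides):
    print(f"{conf_key} = {overrides[conf_key]}")
```

With this reading, the four DL-606 properties yield four spark.sql.* settings, and icepack.maintenance_cadence_hours is left out because it is not a Spark SQL override.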

-- Temporarily exclude a table from maintenance
ALTER TABLE lakehouse_dev.my_database.my_table
SET TBLPROPERTIES ('compaction_skip' = 'true');

To remove table-level overrides and revert to the global defaults, use UNSET TBLPROPERTIES:

ALTER TABLE lakehouse_dev.my_database.my_table
UNSET TBLPROPERTIES ('icepack.maintenance_cadence_hours');

-- Remove the initial DL-606 Spark SQL overrides
ALTER TABLE lakehouse_prod.raw.offer_eligibility_snapshots_raw
UNSET TBLPROPERTIES (
  'icepack.spark.sql.autoBroadcastJoinThreshold',
  'icepack.spark.sql.adaptive.autoBroadcastJoinThreshold',
  'icepack.spark.sql.join.preferSortMergeJoin',
  'icepack.spark.sql.adaptive.maxShuffledHashJoinLocalMapThreshold'
);

If icepack.spark.sql.maxBroadcastTableSize was later applied as a fallback, unset that property as part of the same rollback.

Visibility

Table properties are surfaced in two places:

  • Recommendation API — The GET /tables/{database}/{table}/maintenance/recommendation response includes policy.icepack_config, containing the table’s icepack.* properties with the prefix stripped. It also returns effective maintenance enrollment and cadence after applying global defaults and table-level overrides.

  • Web UI — The table detail page displays the current icepack.* properties alongside health metrics, making it easy to verify configuration without running SQL.
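
To make the "prefix stripped" shape of policy.icepack_config concrete, here is a minimal sketch of how such a mapping could be derived from raw table properties. The function name `icepack_config` is hypothetical, and leaving out keys that do not start with `icepack.` (such as compaction_skip) is an assumption based on the wording above; the actual API response may differ.

```python
from typing import Dict, Mapping

PREFIX = "icepack."


def icepack_config(props: Mapping[str, str]) -> Dict[str, str]:
    """Hypothetical helper: keep icepack.* properties and strip the prefix."""
    return {
        key[len(PREFIX):]: value
        for key, value in props.items()
        if key.startswith(PREFIX)
    }


raw_props = {
    "icepack.maintenance_enabled": "true",
    "icepack.maintenance_cadence_hours": "3",
    "compaction_skip": "false",   # no icepack. prefix, so excluded in this sketch
    "write.format.default": "parquet",  # unrelated Iceberg table property
}

config = icepack_config(raw_props)
for name in sorted(config):
    print(f"{name}: {config[name]}")
```

Under these assumptions, the result contains only maintenance_enabled and maintenance_cadence_hours, which is the form a client would look for when comparing the API response against the values set through SQL.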