Skip to content

Conversation

@hsiang-c
Copy link
Contributor

Which issue does this PR close?

Related to #. #2921

Rationale for this change

  • Iceberg's support to Parquet Modular Encryption requires 2 properties: one catalog-level property encryption.kms-impl and the other table-level property encryption.key-id.

What changes are included in this PR?

  • This PR pass the table-level property, encryption.key-id, to native scan so that we can pass to Iceberg Rust reader for KMS management.

How are these changes tested?

@codecov-commenter
Copy link

codecov-commenter commented Jan 21, 2026

Codecov Report

❌ Patch coverage is 77.77778% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 60.04%. Comparing base (f09f8af) to head (9273f4f).
⚠️ Report is 862 commits behind head on main.

Files with missing lines Patch % Lines
...a/org/apache/comet/iceberg/IcebergReflection.scala 77.77% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #3229      +/-   ##
============================================
+ Coverage     56.12%   60.04%   +3.92%     
- Complexity      976     1429     +453     
============================================
  Files           119      170      +51     
  Lines         11743    15779    +4036     
  Branches       2251     2607     +356     
============================================
+ Hits           6591     9475    +2884     
- Misses         4012     4982     +970     
- Partials       1140     1322     +182     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Member

@andygrove andygrove left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Feedback

Thanks for working on Parquet Modular Encryption support for Iceberg! I found an issue with the current implementation.

Missing Protobuf Serialization

The tableMasterKey is extracted from Iceberg table properties in IcebergReflection.scala and stored in CometIcebergNativeScanMetadata, but it's never serialized to protobuf.

Looking at the data flow:

Step File Status
1. Protobuf field operator.proto ✅ Added optional string table_master_key = 13
2. Rust struct field iceberg_scan.rs ✅ Added table_master_key: Option<String>
3. Rust planner planner.rs ✅ Reads scan.table_master_key.clone()
4. Scala extraction IcebergReflection.scala ✅ Extracts encryption.key-id into metadata
5. Scala serialization CometIcebergNativeScan.scala Missing

The serde file CometIcebergNativeScan.scala needs to serialize the key to protobuf. Something like:

// In the convert() method, after building other fields:
metadata.tableMasterKey.foreach { key =>
  icebergScanBuilder.setTableMasterKey(key)
}

Without this, the key extracted in step 4 never reaches the Rust side.

Other Notes

  • The #[allow(dead_code)] annotation and TODO comment in Rust are fine since this is infrastructure for future work
  • Consider adding a simple test to verify the key is correctly passed through the serde layer, even if encryption isn't fully implemented yet

This review was generated with AI assistance.

}

// tableMasterKey is optional
val tableMasterKey = getTableProperties(table).flatMap { properties =>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make it so that we only call getTableProperties once and get all of the fields we want to avoid reflection overhead?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants