@dramaticlly
Contributor

@dramaticlly dramaticlly commented Jan 8, 2026

Currently, an Iceberg commit does not expose manifest-level information, so it is difficult to see how table metadata evolves with a given snapshot.

With this change, the snapshot summary also includes the following manifest counts for each table commit:

  • manifests-created: number of new manifests written by this snapshot
  • manifests-replaced: number of manifests that were filtered (rewritten) or merged by the merging snapshot producer
  • manifests-kept: number of existing manifests, written by snapshots prior to this commit, that were carried over unchanged

// An append creates 1 manifest for 2 files
table.newAppend().appendFile(FILE_A).appendFile(FILE_B).commit();
assertThat(table.currentSnapshot().summary())
    .containsEntry(SnapshotSummary.CREATED_MANIFESTS_COUNT, "1")
    .containsEntry(SnapshotSummary.REPLACED_MANIFESTS_COUNT, "0")
    .containsEntry(SnapshotSummary.KEPT_MANIFESTS_COUNT, "0");

// A delete rewrites the existing manifest (copy-on-write) to replace one of its manifest entries
table.newDelete().deleteFile(FILE_A).commit();
assertThat(table.currentSnapshot().summary())
    .containsEntry(SnapshotSummary.CREATED_MANIFESTS_COUNT, "1")
    .containsEntry(SnapshotSummary.REPLACED_MANIFESTS_COUNT, "1")
    .containsEntry(SnapshotSummary.KEPT_MANIFESTS_COUNT, "0");
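
The three counts described above can be approximated outside of Iceberg with a small, dependency-free sketch. It assumes each manifest is identified by its path and by the id of the snapshot that wrote it. The class `ManifestCommitCounts` and its methods are hypothetical names for illustration, not Iceberg APIs. Treating every parent manifest that is missing from the new manifest list as "replaced" is a simplification and may not match how the snapshot producer actually counts.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of Iceberg): classifies the manifests of one commit. */
final class ManifestCommitCounts {
  final int created;
  final int replaced;
  final int kept;

  private ManifestCommitCounts(int created, int replaced, int kept) {
    this.created = created;
    this.replaced = replaced;
    this.kept = kept;
  }

  /**
   * @param parentManifests manifest path -> id of the snapshot that wrote it (parent snapshot)
   * @param newManifests same mapping for the snapshot being committed
   * @param newSnapshotId id of the snapshot being committed
   */
  static ManifestCommitCounts classify(
      Map<String, Long> parentManifests, Map<String, Long> newManifests, long newSnapshotId) {
    int created = 0;
    int kept = 0;
    for (long writerSnapshotId : newManifests.values()) {
      if (writerSnapshotId == newSnapshotId) {
        created++; // written by the snapshot being committed
      } else {
        kept++; // written by an older snapshot and carried over
      }
    }

    int replaced = 0;
    for (String path : parentManifests.keySet()) {
      if (!newManifests.containsKey(path)) {
        replaced++; // assumption: a manifest that disappeared was rewritten or merged
      }
    }
    return new ManifestCommitCounts(created, replaced, kept);
  }

  /** Renders the counts as string-valued summary entries, using the field names from the spec. */
  Map<String, String> toSummary() {
    Map<String, String> summary = new LinkedHashMap<>();
    summary.put("manifests-created", String.valueOf(created));
    summary.put("manifests-replaced", String.valueOf(replaced));
    summary.put("manifests-kept", String.valueOf(kept));
    return summary;
  }
}
```

For the delete case above, a parent snapshot holding one manifest and a new snapshot holding one rewritten manifest yields created = 1, replaced = 1, kept = 0 under this sketch.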

@github-actions github-actions bot added the core label Jan 8, 2026
@dramaticlly dramaticlly force-pushed the commitManifestMetrics branch from 9ebf11b to 62e148e Compare January 9, 2026 17:22
@dramaticlly dramaticlly marked this pull request as ready for review January 9, 2026 18:37
@dramaticlly
Contributor Author

@nastra @amogh-jahagirdar, could you help take a look?

@dramaticlly
Contributor Author

Also @huaxingao and @stevenzwu, in case you are interested in taking a look.

@dramaticlly dramaticlly force-pushed the commitManifestMetrics branch from c06a28e to f694c31 Compare January 26, 2026 18:34
@dramaticlly dramaticlly changed the title from "Core: populate manifest created/kept count when commit a snapshot" to "Core: populate manifest created/replaced/kept count when commit a snapshot" Jan 26, 2026
@dramaticlly dramaticlly force-pushed the commitManifestMetrics branch from f694c31 to 131cd47 Compare January 26, 2026 19:17
* Returns the count of manifests that were replaced (rewritten) during filtering.
*
* <p>A manifest is considered replaced when a new manifest was created to replace the original
* one (i.e., the original manifest != filtered manifest).
Member

Just a note: because of the normal append path in the merging snapshot producer, this can also be original manifest != appended manifest.

Member

ah nvm, didn't see we were in manifest filter manager
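
The "original manifest != filtered manifest" rule from the Javadoc under review can be illustrated with a minimal sketch. It assumes the filter returns the same object when nothing changed and a new object when the manifest was rewritten, which is why it uses reference comparison. `ReplacedManifestCounter` and `countReplaced` are hypothetical names, not the actual manifest filter manager code.

```java
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical sketch (not Iceberg code): counts manifests that a filter step rewrote. */
final class ReplacedManifestCounter {
  private ReplacedManifestCounter() {}

  /**
   * @param originals manifests before filtering
   * @param filter returns the same instance if untouched, or a new instance if rewritten
   * @return number of manifests for which the filter produced a different instance
   */
  static <M> int countReplaced(List<M> originals, UnaryOperator<M> filter) {
    int replaced = 0;
    for (M original : originals) {
      M filtered = filter.apply(original);
      // Reference inequality on purpose: a rewrite is assumed to produce a new object.
      if (filtered != original) {
        replaced++;
      }
    }
    return replaced;
  }
}
```

Comparing by identity rather than `equals` keeps the check cheap, but it only works if the filter really does hand back the original instance for untouched manifests.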

@RussellSpitzer
Member

I'm generally +1 on this idea but don't have time to do a full review right now. I do think we should consider using "existing" instead of "kept". Or maybe skip it altogether, since I think we aren't tracking manifests that are not scanned in the first place, right?

@dramaticlly
Contributor Author

> I'm generally +1 on this idea but don't have time to do a full review right now. I do think we should consider using "existing" instead of "kept". Or maybe skip it altogether, since I think we aren't tracking manifests that are not scanned in the first place, right?

Thanks @RussellSpitzer. As discussed offline, we reuse the SnapshotSummary fields already defined in https://iceberg.apache.org/spec/#optional-snapshot-summary-fields. Previously we only populated them for the rewrite-manifests operation; with this change I want to populate them for every commit that results in a new snapshot, such as append, row-delta, and delete.
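
Since these summary fields are optional in the spec and were previously written only by some operations, a consumer may want to tolerate their absence. Below is a minimal sketch of such a reader over a plain `Map<String, String>` summary. `ManifestSummaryReader` is a hypothetical name, and the key strings are taken from the field names discussed in this PR.

```java
import java.util.Map;
import java.util.OptionalInt;

/** Hypothetical sketch (not Iceberg code): reads optional manifest counts from a summary map. */
final class ManifestSummaryReader {
  private ManifestSummaryReader() {}

  /**
   * @param summary snapshot summary as string key/value pairs
   * @param key summary field name, e.g. "manifests-created"
   * @return the parsed count, or empty if the snapshot did not record this field
   */
  static OptionalInt count(Map<String, String> summary, String key) {
    String value = summary.get(key);
    if (value == null) {
      // Absent is not the same as zero: snapshots from older writers may omit the field.
      return OptionalInt.empty();
    }
    return OptionalInt.of(Integer.parseInt(value));
  }
}
```

Returning `OptionalInt` instead of defaulting to 0 lets callers distinguish "no manifests kept" from "count not recorded".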

@RussellSpitzer
Copy link
Member

You are totally right! Better to keep with the original definitions then
