[docs] Document COMPACTED table format #2264

Priyamanjare54 · 2025-12-27T07:29:17Z

Purpose

Linked issue: closes #2256

This pull request adds documentation for the COMPACTED table format on the Fluss
website to help users understand what it is, how to configure it, and when it
should be used.

Brief change log

Added documentation for the COMPACTED table format under table design
Explained supported table types (Log and KV tables)
Documented usage with table.changelog.image=WAL and its performance benefits
Added guidance on recommended use cases and limitations

Tests

Not applicable. This change only affects documentation.

API and Format

No. This change does not affect any public API or storage format.

Documentation

Yes. This PR introduces new user-facing documentation for the COMPACTED table
format on the Fluss website.

wuchong · 2025-12-30T12:34:58Z

Hi @Priyamanjare54, it seems this change is empty.

wuchong · 2025-12-31T12:13:15Z

Thanks @Priyamanjare54, could you update the pull request according to the discussion and proposed structure in #2256 (comment)?

Priyamanjare54 · 2025-12-31T13:12:37Z

Thanks for the feedback!
I’ve updated the PR to introduce a dedicated Data Encodings page and moved the COMPACTED documentation under it as discussed in #2256.
The earlier standalone COMPACTED page has been removed accordingly.

Please let me know if you’d like any further adjustments.

wuchong · 2025-12-31T15:23:54Z

@polyzos could you help review this doc?

polyzos · 2025-12-31T17:17:28Z

@wuchong Regarding the Indexed format, is it going to be deprecated, or should we document it as well?
@Priyamanjare54 Thank you for your contribution.
The documentation is great; however, we want to make it as simple as possible so that every user can easily understand the formats and when to use each. With the current approach, I’m concerned that less technical users might find it a bit harder to understand.

For example, we can say that the ARROW format is the default, describe its benefits, and note that it allows operations such as column pruning and predicate pushdown.

However, for tables that don’t have such requirements (perhaps large vector tables, aggregates, and joined tables where all columns are selected), the COMPACTED format might be a better fit for disk and CPU efficiency.

WDYT? If you need more context, I can help craft this.
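
The difference between a columnar and a row-oriented layout mentioned above can be illustrated with a toy model. This is only a minimal sketch: it does not use Fluss, Arrow, or any real on-disk format, and the names `ROWS`, `to_row_layout`, `to_columnar_layout`, `project_from_rows`, and `project_from_columns` are hypothetical. It counts how many field values have to be touched to read a single column under each layout.

```python
# Toy dataset: three records with three fields each (hypothetical values).
ROWS = [
    {"id": 1, "name": "a", "score": 10},
    {"id": 2, "name": "b", "score": 20},
    {"id": 3, "name": "c", "score": 30},
]


def to_row_layout(rows):
    """Row-oriented: all fields of one record are stored together."""
    return [(r["id"], r["name"], r["score"]) for r in rows]


def to_columnar_layout(rows):
    """Columnar: all values of one field are stored together."""
    return {key: [r[key] for r in rows] for key in rows[0]}


def project_from_rows(row_layout, field_index):
    """Read one field; every record is decoded in full along the way."""
    touched = 0
    values = []
    for record in row_layout:
        touched += len(record)  # the whole record is touched
        values.append(record[field_index])
    return values, touched


def project_from_columns(col_layout, field_name):
    """Read one field; only that column is touched (column pruning)."""
    column = col_layout[field_name]
    return list(column), len(column)


if __name__ == "__main__":
    print(project_from_rows(to_row_layout(ROWS), 2))
    print(project_from_columns(to_columnar_layout(ROWS), "score"))
```

In this model, selecting one column out of three touches a third of the values in the columnar layout, while selecting all columns touches the same number of values in both layouts. That matches the intuition that a row-oriented format loses little when every column is read anyway.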

Priyamanjare54 · 2026-01-01T13:15:24Z

Thanks for the feedback @polyzos! I appreciate the guidance on making this more accessible. I agree that simplifying the explanation will help users better understand when to use each format.

Regarding the Indexed format: could you clarify whether it should be included or whether it is being deprecated? I want to make sure I'm documenting the right formats, @wuchong.
I'll work on a revision based on your suggestions.

Priyamanjare54 · 2026-01-03T05:07:22Z

Hi @polyzos @wuchong, just following up on my question from 2 days ago. Could you please confirm the status of the Indexed format so I can proceed with the revision? Thanks!

wuchong · 2026-01-04T03:41:58Z

+1 to removing the Indexed format from the doc.

Priyamanjare54 · 2026-01-04T05:07:00Z

Thanks for confirming! I’ll proceed with removing the Indexed format from the documentation and update the PR accordingly.

polyzos · 2026-01-04T16:16:11Z

@Priyamanjare54 this is great work 👌 I think before merging we can just add a few things as “summaries”. For example, at the beginning we could add a quick section on “how to think about encodings”:

How to Think About Encodings in Fluss

In Fluss, a data encoding primarily determines:

How data is laid out on disk (columnar vs row-oriented)
How efficiently data can be filtered, projected, and scanned
Whether the encoding is optimized for streaming scans or key-based access

Encodings in Fluss determine:

CPU vs IO tradeoffs
Scan-heavy vs lookup-heavy workloads
Analytical vs operational access patterns

And then we can add a table with the exact tradeoffs, maybe at the bottom of the page.

ARROW vs COMPACTED

Encoding	ARROW	COMPACTED
Physical layout	Columnar	Row-oriented
Typical access pattern	Scans with projection & filters	Full-row reads or key lookups
Column pruning	✅ Yes	❌ No
Predicate pushdown	✅ Yes	❌ No
Storage efficiency	Good	Excellent
CPU efficiency	Better for selective reads	Better for full-row reads
Log encoding	✅ Yes	✅ Yes
KV encoding	❌ No	✅ Yes
Best suited for	Analytics, streaming analytics	State tables, materialized views
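
As an illustration of how the table above could be read as a decision aid, here is a minimal Python sketch. It only transcribes the rows of the table into a small data structure; `EncodingTraits`, `TRAITS`, and `candidate_encodings` are hypothetical names, not part of any Fluss API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EncodingTraits:
    layout: str
    column_pruning: bool
    predicate_pushdown: bool
    log_encoding: bool
    kv_encoding: bool


# Values transcribed from the ARROW vs COMPACTED table above.
TRAITS = {
    "ARROW": EncodingTraits("columnar", True, True, True, False),
    "COMPACTED": EncodingTraits("row-oriented", False, False, True, True),
}


def candidate_encodings(needs_pruning=False, needs_pushdown=False, for_kv=False):
    """Return the encodings whose traits satisfy the stated requirements."""
    result = []
    for name, traits in TRAITS.items():
        if needs_pruning and not traits.column_pruning:
            continue
        if needs_pushdown and not traits.predicate_pushdown:
            continue
        if for_kv and not traits.kv_encoding:
            continue
        if not for_kv and not traits.log_encoding:
            continue
        result.append(name)
    return result


if __name__ == "__main__":
    print(candidate_encodings(needs_pruning=True))
    print(candidate_encodings(for_kv=True))
```

The helper only filters on hard capabilities (the Yes/No rows). The softer rows, such as storage and CPU efficiency, still need a judgment call based on the workload.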

WDYT?

Priyamanjare54 · 2026-01-05T10:41:25Z

Thanks for the suggestions! I’ve added a short “How to Think About Encodings in Fluss” section near the top and included an ARROW vs COMPACTED comparison table summarizing the trade-offs.

[docs] Document COMPACTED table format

afc9792

This was referenced Dec 30, 2025

Docs: document COMPACTED table format #2269

Open

[Docs] Document the Compacted format support for Log and PK Tables #2256

Closed

[docs] Add detailed explanation for COMPACTED format

677e242

Priyamanjare54 added 2 commits December 31, 2025 18:34

[docs] Add Data Encodings documentation

2a8b6eb

[docs] Remove standalone COMPACTED doc in favor of Data Encodings page

bc5e50a

docs: updated Data Encodings page

dc8bf90

wuchong requested a review from polyzos January 4, 2026 13:32

docs: improve Data Encodings overview

b196144

small improvements

32651f2

polyzos merged commit eb75d55 into apache:main Jan 5, 2026
2 checks passed

3 participants
