[docs] Document COMPACTED table format #2264
Hi @Priyamanjare54, it seems this change is empty.
Thanks @Priyamanjare54, could you update the pull request according to the discussion and proposed structure in #2256 (comment)?
Thanks for the feedback! Please let me know if you’d like any further adjustments.
@polyzos could you help to review this doc?
@wuchong Regarding the Indexed format, is it going to be deprecated, or should we document it as well? For example, we could say that the Arrow format is the default, describe its benefits, and note that it allows operations such as column pruning and predicate pushdown. However, for tables that don’t have such requirements (perhaps large vector tables, or aggregate and joined tables where all columns are selected), a compacted format might be a better fit for disk and CPU efficiency. WDYT? If you need more context, I can help craft this.
Thanks for the feedback @polyzos! I appreciate the guidance on making this more accessible. I agree that simplifying the explanation will help users better understand when to use each format. Regarding the Indexed format - could you clarify if this should be included or if it's being deprecated? I want to make sure I'm documenting the right formats @wuchong.
+1 to remove the |
Thanks for confirming! I’ll proceed with removing the Indexed format from the documentation and update the PR accordingly.
@Priyamanjare54 this is great work 👌 I think before merging we can just add a few things as “summaries”. For example, at the beginning, add a quick section on “how to think about encodings”:
How to Think About Encodings in Fluss
In Fluss, a data encoding primarily determines:
Encodings in Fluss determine:
And then we can add a table with the exact tradeoffs, maybe at the bottom of the page:
ARROW vs COMPACTED
WDYT?
Thanks for the suggestions! I’ve added a short “How to Think About Encodings in Fluss” section near the top and included an ARROW vs COMPACTED comparison table summarizing the trade-offs. |
Purpose
Linked issue: closes #2256
This pull request adds documentation for the COMPACTED table format on the Fluss
website to help users understand what it is, how to configure it, and when it
should be used.
Brief change log
Tests
Not applicable. This change only affects documentation.
API and Format
No. This change does not affect any public API or storage format.
Documentation
Yes. This PR introduces new user-facing documentation for the COMPACTED table
format on the Fluss website.