-
Notifications
You must be signed in to change notification settings - Fork 272
Open
Labels
Description
Summary
Investigate the feasibility and potential performance benefits of replacing Protocol Buffers (protobuf) with FlatBuffers for serializing query plans in Scala and deserializing them in Rust.
Motivation
FlatBuffers offers zero-copy deserialization which could reduce overhead in the JVM-to-native communication path. Plan deserialization occurs per partition in CometExecIterator, so any reduction in deserialization cost could improve overall query performance.
| Aspect | Protobuf (Current) | FlatBuffers |
|---|---|---|
| Deserialization | O(n) - parses all fields | O(1) - zero-copy access |
| Memory allocation | Creates new objects | Reads directly from buffer |
| Random field access | Must parse all preceding | Direct offset lookup |
Current State
- Proto schemas:
native/proto/src/proto/*.proto(~1,038 lines across 7 files) - Message types: 94 message types with 8 oneof unions
- Scala serialization: 33 handlers in
spark/src/main/scala/org/apache/comet/serde/ - Rust deserialization:
native/core/src/execution/serde.rs,native/core/src/execution/planner.rs
Migration Scope
| Component | Files Affected | Effort |
|---|---|---|
| Schema rewrite | 7 .proto → 7 .fbs | Medium |
| Scala serializers | 33+ files | High |
| Rust deserializers | ~5 files | Medium |
| Build system | 3-4 files | Low |
| Tests | Many | High |
Challenges
- Schema translation: 94 message types need FlatBuffer equivalents; oneof unions have different semantics in FlatBuffers
- Scala code changes: FlatBuffers requires bottom-up construction (nested objects must be built first), different from protobuf's builder pattern
- Rust code changes: Replace
prostwithflatbufferscrate; updateplanner.rs(~2,700 lines) to work with FlatBuffer accessors - Recursive structures:
Operatorcontainschildren: repeated Operatorwhich requires careful handling
Open Questions
- What is the actual serialization/deserialization overhead with the current protobuf implementation? (Need benchmarking)
- For typical query plans (KB-sized), would zero-copy provide meaningful benefits given the plan is immediately converted to DataFusion structures?
- Would the increased code complexity and maintenance burden be justified by the performance gains?
Suggested Approach
- Benchmark current state: Profile protobuf serialization/deserialization overhead to quantify the problem
- Proof of concept: Convert a single operator type (e.g.,
Filter) to FlatBuffers and benchmark - Evaluate alternatives: Consider optimizations within protobuf first (arena allocation in Rust, builder reuse in Scala)
- Full migration (if justified): Incrementally migrate all operators with comprehensive testing
Related Links
hsiang-c and SemyonSinchenko