Skip to content

Evaluate replacing Protocol Buffers with FlatBuffers for plan serialization #3245

@andygrove

Description

@andygrove

Summary

Investigate the feasibility and potential performance benefits of replacing Protocol Buffers (protobuf) with FlatBuffers for serializing query plans in Scala and deserializing them in Rust.

Motivation

FlatBuffers offers zero-copy deserialization which could reduce overhead in the JVM-to-native communication path. Plan deserialization occurs per partition in CometExecIterator, so any reduction in deserialization cost could improve overall query performance.

Aspect Protobuf (Current) FlatBuffers
Deserialization O(n) - parses all fields O(1) - zero-copy access
Memory allocation Creates new objects Reads directly from buffer
Random field access Must parse all preceding Direct offset lookup

Current State

  • Proto schemas: native/proto/src/proto/*.proto (~1,038 lines across 7 files)
  • Message types: 94 message types with 8 oneof unions
  • Scala serialization: 33 handlers in spark/src/main/scala/org/apache/comet/serde/
  • Rust deserialization: native/core/src/execution/serde.rs, native/core/src/execution/planner.rs

Migration Scope

Component Files Affected Effort
Schema rewrite 7 .proto → 7 .fbs Medium
Scala serializers 33+ files High
Rust deserializers ~5 files Medium
Build system 3-4 files Low
Tests Many High

Challenges

  1. Schema translation: 94 message types need FlatBuffer equivalents; oneof unions have different semantics in FlatBuffers
  2. Scala code changes: FlatBuffers requires bottom-up construction (nested objects must be built first), different from protobuf's builder pattern
  3. Rust code changes: Replace prost with flatbuffers crate; update planner.rs (~2,700 lines) to work with FlatBuffer accessors
  4. Recursive structures: Operator contains children: repeated Operator which requires careful handling

Open Questions

  1. What is the actual serialization/deserialization overhead with the current protobuf implementation? (Need benchmarking)
  2. For typical query plans (KB-sized), would zero-copy provide meaningful benefits given the plan is immediately converted to DataFusion structures?
  3. Would the increased code complexity and maintenance burden be justified by the performance gains?

Suggested Approach

  1. Benchmark current state: Profile protobuf serialization/deserialization overhead to quantify the problem
  2. Proof of concept: Convert a single operator type (e.g., Filter) to FlatBuffers and benchmark
  3. Evaluate alternatives: Consider optimizations within protobuf first (arena allocation in Rust, builder reuse in Scala)
  4. Full migration (if justified): Incrementally migrate all operators with comprehensive testing

Related Links

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions