Description
The date_format expression was added in PR #3201, but it is currently fully compatible with Spark only when the session timezone is UTC. Non-UTC timezones are marked as Incompatible and fall back to Spark by default.
This issue tracks adding proper timezone conversion support so that date_format can be fully compatible with Spark for all timezones.
Current Behavior
- UTC timezone: Compatible(), runs natively in Comet
- Non-UTC timezones: Incompatible(), falls back to Spark by default
- Users can enable non-UTC with spark.comet.expr.DateFormatClass.allowIncompatible=true, but results may differ from Spark
Desired Behavior
All timezones should be Compatible() and produce results identical to Spark.
Technical Details
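The current implementation uses DataFusion's to_char function, which formats timestamps without timezone conversion. Spark's date_format applies the session timezone when formatting, so the two can produce different strings for the same instant whenever the session timezone is not UTC.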
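To illustrate the mismatch, here is a minimal Python sketch (not Comet or DataFusion code). It formats one instant twice: once as raw UTC, which is roughly what a formatter without timezone conversion would do, and once after converting to a session timezone. The instant and the fixed UTC-08:00 offset are hypothetical values chosen for the example.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical instant: 2024-01-01 02:30:00 UTC.
instant = datetime(2024, 1, 1, 2, 30, 0, tzinfo=timezone.utc)

# Stand-in for a session timezone of UTC-08:00 (fixed offset, no DST rules).
session_tz = timezone(timedelta(hours=-8))

fmt = "%Y-%m-%d %H:%M"

# Formatting the raw UTC value, with no timezone conversion.
utc_text = instant.strftime(fmt)

# Formatting after converting to the session timezone first.
session_text = instant.astimezone(session_tz).strftime(fmt)

print(utc_text)      # 2024-01-01 02:30
print(session_text)  # 2023-12-31 18:30
```

The two strings differ in date as well as hour, which is the kind of divergence that would keep non-UTC timezones from being marked Compatible().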
Possible approaches:
- Convert the timestamp to the target timezone before calling to_char
- Use a timezone-aware formatting function if available in DataFusion
- Implement custom Rust logic to handle timezone conversion
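The first approach can be sketched as follows. This is a Python illustration of the idea only, not the Comet implementation: format_as_utc is a hypothetical stand-in for a formatter that ignores timezones, and format_in_zone shifts the epoch value by the zone's UTC offset before calling it. The sketch assumes the offset is already known; a real implementation would need to look up the offset for each individual timestamp, because zones with daylight saving time change their offset over the year.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def format_as_utc(micros: int, fmt: str) -> str:
    """Hypothetical stand-in for a formatter that ignores timezones."""
    return (EPOCH + timedelta(microseconds=micros)).strftime(fmt)


def format_in_zone(micros: int, offset_seconds: int, fmt: str) -> str:
    """Shift the instant by the zone's UTC offset, then format as if UTC.

    offset_seconds is assumed to be the correct offset for this instant.
    """
    shifted = micros + offset_seconds * 1_000_000
    return format_as_utc(shifted, fmt)


# 2024-01-01 02:30:00 UTC expressed as microseconds since the epoch.
micros = 1_704_076_200 * 1_000_000
fmt = "%Y-%m-%d %H:%M:%S"

print(format_as_utc(micros, fmt))              # 2024-01-01 02:30:00
print(format_in_zone(micros, -8 * 3600, fmt))  # 2023-12-31 18:30:00
print(format_in_zone(micros, 19800, fmt))      # 2024-01-01 08:00:00
```

Shifting the value yields the correct wall-clock fields, but the shifted number no longer represents a real instant, so format patterns that print the zone name or offset would need separate handling.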
Related
- PR #3201 (feat: add partial support for date_format expression): initial date_format implementation
- Issue #2649 (date_trunc incorrect results in non-UTC timezone): similar timezone issue with date_trunc
Note: This issue was generated with AI assistance.