Skip to content

Conversation

@andygrove
Copy link
Member

@andygrove andygrove commented Jan 23, 2026

Summary

This PR optimizes CI build times by building the native Rust library once and sharing it across all test jobs, instead of rebuilding it in every job.

Changes

PR Build Linux (pr_build_linux.yml):

  • Add build-native job that builds libcomet.so once and uploads as artifact
  • All 36 test jobs download the pre-built library instead of rebuilding
  • Rust tests run in parallel with build-native (they use debug builds)

PR Build macOS (pr_build_macos.yml):

  • Add build-native job that builds libcomet.dylib once
  • All 18 test jobs download the pre-built library instead of rebuilding

Spark SQL Tests (spark_sql_test.yml):

  • Add build-native job that builds once
  • All ~48 test jobs download the pre-built library instead of rebuilding
  • Add skip-native-build input to setup-spark-builder action

Shared Changes:

  • Add Cargo caching throughout
  • Add faster CI Cargo profile in native/Cargo.toml (no LTO, parallel codegen) with same overflow behavior as release
  • Add skip-native-build input to java-test action

Estimated Savings

Workflow Before After Savings
PR Build Linux 36 Rust builds 1 ~175-525 min
PR Build macOS 18 Rust builds 1 ~85-255 min
Spark SQL Tests 48 Rust builds 1 ~235-705 min
Total 102 Rust builds 3 ~495-1485 min

Additionally, the CI Cargo profile compiles ~30-50% faster than full release (no LTO).

andygrove and others added 2 commits January 22, 2026 17:00
Same optimization as Linux: build libcomet.dylib once and share
across all 18 test jobs instead of rebuilding 18 times.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
rust-version: ${{ env.RUST_VERSION }}
jdk-version: ${{ matrix.profile.java_version }}

- name: Download native library
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

andygrove and others added 2 commits January 22, 2026 17:36
- Add build-native job that builds libcomet.so once
- Add skip-native-build input to setup-spark-builder action
- All ~48 test jobs now download pre-built artifact instead of rebuilding

Eliminates 47 redundant Rust builds.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The linux-test-rust job caches native/target which contains JNI-related
build artifacts. Without Java version in the cache key, Java 11 and 17
jobs could restore incompatible cached artifacts from each other.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@codecov-commenter
Copy link

codecov-commenter commented Jan 23, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 60.00%. Comparing base (f09f8af) to head (2a7801f).
⚠️ Report is 868 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #3249      +/-   ##
============================================
+ Coverage     56.12%   60.00%   +3.87%     
- Complexity      976     1428     +452     
============================================
  Files           119      170      +51     
  Lines         11743    15796    +4053     
  Branches       2251     2608     +357     
============================================
+ Hits           6591     9478    +2887     
- Misses         4012     5000     +988     
- Partials       1140     1318     +178     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@andygrove andygrove marked this pull request as ready for review January 23, 2026 01:50
@andygrove andygrove changed the title perf: build native library once and share across CI test jobs build: build native library once and share across CI test jobs Jan 23, 2026
jdk-version: ${{ matrix.java_version }}

- name: Cache Cargo
uses: actions/cache@v4
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use ${{ runner.os }}-cargo-ci- generated by build-native here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The linux-test-rust job uses debug builds (cargo test) while build-native uses the CI profile, so their native/target artifacts are incompatible. Sharing the same cache key would cause cache thrashing, where each job overwrites the other's compiled artifacts, if I'm understanding this correctly.

Copy link
Member

@wForget wForget left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @andygrove , this change looks good to me.

The cache storage is approaching its limit: https://github.com/apache/datafusion-comet/actions/caches

Image

@andygrove andygrove merged commit c6c3002 into apache:main Jan 23, 2026
175 of 176 checks passed
@andygrove andygrove deleted the faster-ci-1 branch January 23, 2026 14:16
@andygrove
Copy link
Member Author

Thanks @andygrove , this change looks good to me.

The cache storage is approaching its limit: https://github.com/apache/datafusion-comet/actions/caches
Image

Thanks. #3251 will help with this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants