@andygrove andygrove commented Jan 17, 2026

Closes #2582

Summary

Move documentation generation from build time to publish time to prevent merge conflicts in PRs.

Previously, GenerateDocs ran during mvn package and modified docs/source/user-guide/latest/configs.md and compatibility.md in-place. When PRs added new configs or expressions, the regenerated tables would conflict with main.

Now:

  • Source docs contain only template markers (no generated content)
  • Content is generated at publish time in docs/build.sh
  • Release branches freeze generated content via dev/generate-release-docs.sh

Changes

  • spark/pom.xml: Remove exec-maven-plugin that ran GenerateDocs during package phase
  • docs/build.sh: Add step to compile and run GenerateDocs against temp copy before Sphinx build
  • docs/source/user-guide/latest/configs.md: Replace generated tables with empty template markers
  • docs/source/user-guide/latest/compatibility.md: Replace generated tables with empty template markers
  • dev/generate-release-docs.sh: New script to generate docs when creating a release branch
  • dev/release/README.md: Add "Generate Release Documentation" step to release process
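
As a rough illustration of the template-marker idea, the sketch below fills the span between a pair of markers in a Markdown file and leaves the rest of the file alone. The marker names, the `fill_markers` helper, and the `spark.comet.example` row are all hypothetical stand-ins; the actual GenerateDocs tool and its marker syntax may differ.

```shell
set -eu

# Hypothetical marker names; the real GenerateDocs markers may differ.
BEGIN_MARKER='<!--BEGIN:CONFIG_TABLE-->'
END_MARKER='<!--END:CONFIG_TABLE-->'

# Replace whatever sits between the markers in file $1 with the text in $2.
# The generated text is passed through the environment so awk does not
# interpret escape sequences in it.
fill_markers() {
  GENERATED="$2" awk -v begin="$BEGIN_MARKER" -v end="$END_MARKER" '
    $0 == begin { print; print ENVIRON["GENERATED"]; skipping = 1; next }
    $0 == end   { skipping = 0 }
    !skipping   { print }
  ' "$1" > "$1.tmp"
  mv "$1.tmp" "$1"
}
```

Because old content between the markers is dropped before the new content is written, running the helper twice with the same input leaves the file unchanged, which is the property that lets the source docs on main stay as empty markers.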

How It Works

| Branch | Source Docs State | When Generated |
|---|---|---|
| main | Empty template markers | At publish time by CI |
| branch-0.x | Frozen generated content | Once, when branch is created |

Release Process Addition

When cutting a new release:

git checkout -b branch-0.13 main
./dev/generate-release-docs.sh
git add docs/source/user-guide/latest/
git commit -m "Generate docs for 0.13.0 release"
git push apache branch-0.13

Test Plan

  • Verify make no longer modifies docs
  • Verify docs/build.sh generates content correctly (test locally or check CI)
  • Verify dev/generate-release-docs.sh works on a test branch
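
One way to check the first item locally is to snapshot checksums of the docs before and after a build command and compare them. This is a minimal sketch, not part of the PR; `snapshot_docs` and `assert_docs_untouched` are hypothetical helper names.

```shell
set -eu

# Print a sorted checksum listing for every file under directory $1.
snapshot_docs() {
  find "$1" -type f -exec cksum {} + | sort
}

# Run the command given after the directory argument and return non-zero
# if any file under that directory changed, appeared, or disappeared.
assert_docs_untouched() {
  docs_dir="$1"
  shift
  before="$(snapshot_docs "$docs_dir")"
  "$@"
  after="$(snapshot_docs "$docs_dir")"
  [ "$before" = "$after" ]
}
```

For example, `assert_docs_untouched docs/source/user-guide/latest make` would fail if the build rewrote any file in that folder. In a git checkout, `git diff --exit-code -- docs/` gives a similar signal for tracked files.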

@andygrove andygrove changed the title from "stop generating dynamic docs content in build [WIP]" to "docs: Stop generating dynamic docs content in build [WIP]" Jan 17, 2026
@andygrove

@snmvaughan fyi

codecov-commenter commented Jan 17, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 60.07%. Comparing base (f09f8af) to head (31e28c4).
⚠️ Report is 877 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #3212      +/-   ##
============================================
+ Coverage     56.12%   60.07%   +3.94%     
- Complexity      976     1437     +461     
============================================
  Files           119      172      +53     
  Lines         11743    15926    +4183     
  Branches       2251     2631     +380     
============================================
+ Hits           6591     9567    +2976     
- Misses         4012     5031    +1019     
- Partials       1140     1328     +188     

@andygrove andygrove added this to the 0.13.0 milestone Jan 22, 2026
@andygrove andygrove changed the title from "docs: Stop generating dynamic docs content in build [WIP]" to "docs: Stop generating dynamic docs content in build" Jan 22, 2026
andygrove and others added 4 commits January 22, 2026 12:55
This reverts commit bd46820.
The script was only compiling the spark module, which could miss
config changes defined in the common module. Now explicitly compiles
both common and spark modules.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@snmvaughan

Maybe we should use a profile to decide when to generate docs?

andygrove commented Jan 23, 2026

> Maybe we should use a profile to decide when to generate docs?

I went with build.sh because the doc generation is tightly coupled to the Sphinx build step, and keeping it there ensures mvn package stays deterministic (no modified files).

The workflow also has two different targets: during normal publishing, we generate docs into a temp folder (so the repo stays clean), but during the release process, we run dev/generate-release-docs.sh once to freeze the generated content into the actual docs folder in the release branch. A Maven profile would only cover the first case cleanly, unless we added multiple profiles perhaps.
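
The two targets described above could be sketched as a single entry point with a mode switch. This is an illustration under stated assumptions, not the contents of docs/build.sh or dev/generate-release-docs.sh: `generate_docs` and `run_generator` are hypothetical names, and `run_generator` only appends a line where the real flow would invoke GenerateDocs.

```shell
set -eu

# Stand-in for the real generator: appends a line to configs.md in the
# directory $1 so the effect is visible.
run_generator() {
  echo "generated content" >> "$1/configs.md"
}

# publish: generate into a throwaway copy so the checkout stays clean.
# release: generate in place so the result can be committed to the branch.
# Prints the directory that now holds the generated docs.
generate_docs() {
  mode="$1"
  docs_dir="$2"
  case "$mode" in
    publish)
      target_dir="$(mktemp -d)"
      cp -R "$docs_dir/." "$target_dir/"
      ;;
    release)
      target_dir="$docs_dir"
      ;;
    *)
      echo "unknown mode: $mode" >&2
      return 2
      ;;
  esac
  run_generator "$target_dir"
  echo "$target_dir"
}
```

In publish mode the returned directory would be handed to the Sphinx build; in release mode the source folder itself is modified and then added with git.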

echo "Done! Generated documentation content in docs/source/user-guide/latest/"
echo ""
echo "Next steps:"
echo " git add docs/source/user-guide/latest/configs.md docs/source/user-guide/latest/compatibility.md"
(nit) We can follow what README.md suggests: git add docs/source/user-guide/latest/

Updated

@hsiang-c hsiang-c left a comment

LGTM

@parthchandra parthchandra left a comment

lgtm (please address the comment from @hsiang-c)

@andygrove andygrove merged commit 6a2209d into apache:main Jan 24, 2026
130 checks passed
@andygrove andygrove deleted the stop-generating-docs-in-build branch January 24, 2026 03:39
Development

Successfully merging this pull request may close these issues.

Stop storing generated docs in git
