docs: add Docker-based workflow for building documentation #19863
base: main
- Installs Rust 1.92.0, cargo-depgraph, and all doc dependencies
- Provides isolated, reproducible build environment
- Fixes line endings for cross-platform compatibility (Windows/Unix)
- No additional host setup required beyond Docker
Pull request overview
This PR adds Docker-based infrastructure to enable building DataFusion documentation in a containerized environment without requiring manual installation of dependencies (Rust, cargo-depgraph, Sphinx, etc.).
Changes:
- Adds Dockerfile with all necessary documentation build dependencies (Rust 1.92.0, Python packages, cargo-depgraph, graphviz)
- Updates docker-compose.yml to include a docs service for easy documentation building
- Updates docs/README.md with Docker-based workflow instructions as the recommended approach
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| docs/Dockerfile | Defines container image with all doc build dependencies, handles line endings for cross-platform compatibility |
| docker-compose.yml | Adds docs service configuration for simplified Docker workflow |
| docs/README.md | Documents Docker-based build option as recommended approach, maintains existing local installation instructions |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
- Creates separate entrypoint script instead of complex bash in CMD
- Improves readability and makes logic easier to test and modify
- Cleaner Dockerfile with ENTRYPOINT pattern
- Script intelligently checks for CRLF before converting
- Handles both build-time and runtime line ending fixes
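The entrypoint script from this commit is not quoted in the thread, so the following is only a minimal sketch of what a "check for CRLF before converting" step could look like. The function names `has_crlf` and `fix_line_endings` are hypothetical, and the sketch uses plain `grep`/`sed` rather than whatever the script actually used.

```shell
#!/bin/sh
# Hypothetical helpers; not taken from the PR's entrypoint script.

# has_crlf FILE: succeed if FILE contains at least one carriage return.
has_crlf() {
    cr=$(printf '\r')
    grep -q "$cr" "$1"
}

# fix_line_endings FILE: strip a trailing CR from each line, but only
# touch the file when a CR is actually present.
fix_line_endings() {
    if has_crlf "$1"; then
        cr=$(printf '\r')
        tmp=$(mktemp)
        # Write back with cat so the original file keeps its permissions.
        sed "s/${cr}\$//" "$1" > "$tmp" && cat "$tmp" > "$1"
        rm -f "$tmp"
    fi
}
```

A script could then call, for example, `fix_line_endings build.sh` before executing it. Skipping files without CRLF avoids rewriting files on a bind-mounted checkout when nothing needs to change.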
- Use rust:1.92.0-bookworm base image instead of python with manual Rust install
- Remove docker-compose.yml (unnecessary for this use case)
- Remove entrypoint.sh and CRLF conversion logic
- Use volume mount at runtime instead of copying entire repository

Usage:
docker build -t datafusion-docs ./docs
docker run --rm -v C:\Users\HP\Music\Apache_org\data\datafusion:/datafusion datafusion-docs
Thanks for the feedback @Jefffrey! I've simplified the approach significantly.
Changes made:
✅ Switched to rust:1.92.0-bookworm base image (install Python on top instead of vice versa)
> docker build -t datafusion-docs ./docs
Tested locally and the docs build successfully. Please take another look!
docs/Dockerfile (Outdated)

    COPY requirements.txt .
    RUN python3 -m pip install --break-system-packages -r requirements.txt

    CMD ["make", "html"]
We should be using the build.sh script because that generates the dependency chart for us (and is what is motivating this issue)
Thanks for the feedback @Jefffrey! All addressed:
✅ Using rust:bookworm (latest) instead of pinned version
✅ Removed python3-venv
✅ Using build.sh instead of make html - dependency graph now generates correctly
✅ Added dos2unix for Windows line ending compatibility
Tested locally - docs build succeeds with the dependency graph generated. Please take another look!
- Use rust:bookworm base image (latest) instead of pinned version
- Remove python3-venv (not used)
- Use build.sh instead of make html (generates dependency graph)
- Add dos2unix for Windows line ending compatibility
When I try the docker run command it seems to download & install some components each time, slowing down the doc build process:

datafusion (docs/dockerized-docs-build)$ docker run --rm -v $(pwd):/datafusion datafusion-docs
info: syncing channel updates for '1.92.0-aarch64-unknown-linux-gnu'
info: latest update on 2025-12-11, rust version 1.92.0 (ded5c06cf 2025-12-08)
info: downloading component 'clippy'
info: downloading component 'rustfmt'
info: installing component 'clippy'
info: installing component 'rustfmt'

Could you look into this?
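One plausible reading of this log, offered as an assumption rather than something confirmed in the thread: the mounted repository pins its toolchain and components in a `rust-toolchain.toml`, the image lacks `clippy` and `rustfmt`, and because the container is started with `--rm`, rustup reinstalls them on every run. If that is the cause, installing the same components at image build time should avoid the repeated download. The sketch below reads the component list from such a file; the helper name `toolchain_components` is hypothetical, and it assumes the `components = [...]` array sits on a single line.

```shell
#!/bin/sh
# Hypothetical helper; assumes a rust-toolchain.toml with a one-line
# `components = ["...", "..."]` entry.

# toolchain_components FILE: print the listed components, space-separated.
toolchain_components() {
    sed -n 's/^[[:space:]]*components[[:space:]]*=[[:space:]]*\[\(.*\)\].*$/\1/p' "$1" \
        | tr -d '"' | tr ',' ' ' | tr -s ' ' | sed 's/^ //; s/ $//'
}

# Possible use, run from the repository root:
#   echo "RUN rustup component add $(toolchain_components rust-toolchain.toml)"
# The printed line could be added to docs/Dockerfile so the components are
# baked into the image instead of fetched on each `docker run`.
```

An alternative under the same assumption would be to keep rustup's state in a named Docker volume across runs, at the cost of a less self-contained image.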
This PR adds a Docker-based workflow for building DataFusion documentation, allowing contributors to build docs without installing additional host dependencies.
The container includes all required tools (Rust 1.92.0, cargo-depgraph, Sphinx, etc.) and provides a simple, reproducible build process.
This addresses #19777.
What's included:
Usage: