-
Notifications
You must be signed in to change notification settings - Fork 0
feat(docs-devel): Improve developer documentation #175
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Remove old config values.
If they are important the commit date can be used.
It just needs to be markdown files viewable through GitHub
|
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the WalkthroughThe pull request restructures documentation infrastructure, transitioning from Sphinx to a new structure with comprehensive deployment guides. It adds linting configuration, implements a Lakekeeper REST client script, creates architecture diagrams, introduces Ansible deployment documentation, and removes outdated documentation files. Architecture decision records are updated, and supporting infrastructure configurations are refined. Changes
Poem
Pre-merge checks❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 11
🧹 Nitpick comments (1)
AGENTS.md (1)
44-51: Improved PR guidelines with helpful reminder.The reformatted guidelines and emphasis on checking the actual template file will help ensure contributors use the latest format.
Minor: Consider GitHub capitalisation
On line 44,
.githubrefers to the directory name. However, if you're referring to the platform elsewhere, the official capitalisation is "GitHub" (capital H).
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
⛔ Files ignored due to path filters (5)
docs-devel/images/analytics-data-platform-arch.drawio.pngis excluded by!**/*.pngdocs-devel/images/analytics-data-platform-arch.pngis excluded by!**/*.pngdocs-devel/images/logo.svgis excluded by!**/*.svgdocs-devel/src/assets/images/lakehouse-data-architecture-layers.jpgis excluded by!**/*.jpgdocs-devel/src/assets/images/lakehouse-poc.pngis excluded by!**/*.png
📒 Files selected for processing (34)
.markdownlint.json(1 hunks).pre-commit-config.yaml(1 hunks)AGENTS.md(3 hunks)docs-devel/data-architecture/catalogs.md(1 hunks)docs-devel/data-architecture/index.md(1 hunks)docs-devel/deployment/index.md(1 hunks)docs-devel/deployment/prerequisites.md(1 hunks)docs-devel/images/.gitignore(1 hunks)docs-devel/images/analytics-data-platform-arch.drawio(1 hunks)docs-devel/index.md(1 hunks)docs-devel/mkdocs.yml(0 hunks)docs-devel/requirements.in(0 hunks)docs-devel/requirements.txt(0 hunks)docs-devel/src/assets/images/lakehouse-poc.drawio(0 hunks)docs-devel/src/conf.py(0 hunks)docs-devel/src/data-architecture/index.md(0 hunks)docs-devel/src/data-architecture/sources.md(0 hunks)docs-devel/src/deployment/index.md(0 hunks)docs-devel/src/index.md(0 hunks)docs-devel/src/system-architecture/index.md(0 hunks)docs-devel/system-architecture/adrs/0001-record-architecture-decisions.md(0 hunks)docs-devel/system-architecture/adrs/0002-use-ansible-for-vm-provisioning-and-configuration.md(0 hunks)docs-devel/system-architecture/adrs/0003-use-scd-cloud-vms.md(0 hunks)docs-devel/system-architecture/adrs/0004-use-minio-as-object-store.md(0 hunks)docs-devel/system-architecture/adrs/0005-use-apache-iceberg-as-lakehouse-table-format.md(0 hunks)docs-devel/system-architecture/adrs/0006-use-lakekeeper-as-iceberg-catalog.md(1 hunks)docs-devel/system-architecture/index.md(1 hunks)elt-common/README.md(1 hunks)infra/ansible-docker/group_vars/all/openstack.yml(0 hunks)infra/ansible-docker/playbooks/cloud/private_network_create.yml(1 hunks)infra/ansible-docker/readme.md(1 hunks)infra/local/README.md(1 hunks)infra/scripts/lakekeeper.py(1 hunks)readme.md(1 hunks)
💤 Files with no reviewable changes (16)
- docs-devel/src/system-architecture/index.md
- docs-devel/mkdocs.yml
- docs-devel/system-architecture/adrs/0003-use-scd-cloud-vms.md
- infra/ansible-docker/group_vars/all/openstack.yml
- docs-devel/system-architecture/adrs/0001-record-architecture-decisions.md
- docs-devel/src/assets/images/lakehouse-poc.drawio
- docs-devel/src/data-architecture/sources.md
- docs-devel/src/deployment/index.md
- docs-devel/src/index.md
- docs-devel/src/data-architecture/index.md
- docs-devel/system-architecture/adrs/0005-use-apache-iceberg-as-lakehouse-table-format.md
- docs-devel/system-architecture/adrs/0002-use-ansible-for-vm-provisioning-and-configuration.md
- docs-devel/src/conf.py
- docs-devel/requirements.in
- docs-devel/system-architecture/adrs/0004-use-minio-as-object-store.md
- docs-devel/requirements.txt
🧰 Additional context used
🪛 LanguageTool
docs-devel/data-architecture/catalogs.md
[uncategorized] ~7-~7: Loose punctuation mark.
Context: ...IS Lakehouse. ## Index - accelerator: Provides data for analysis of accelerat...
(UNLIKELY_OPENING_PUNCTUATION)
docs-devel/deployment/index.md
[uncategorized] ~4-~4: Use a comma before ‘but’ if it connects two independent clauses (unless they are closely connected and short).
Context: ... be deployed on any cloud infrastructure but it will require_ _changes before this w...
(COMMA_COMPOUND_SENTENCE)
[uncategorized] ~49-~49: Possible missing comma found.
Context: ... playbooks/traefik/deploy.yml ``` Once deployed check the Traefik dashboard is availabl...
(AI_HYDRA_LEO_MISSING_COMMA)
[uncategorized] ~55-~55: Possible missing comma found.
Context: ...ng available. Each service has a single VM with the exception of Superset that hos...
(AI_HYDRA_LEO_MISSING_COMMA)
[style] ~55-~55: Consider using “except” or “except for”
Context: ...available. Each service has a single VM with the exception of Superset that hosts multiple instances ...
(WITH_THE_EXCEPTION_OF)
[uncategorized] ~56-~56: Possible missing preposition found.
Context: ... Superset that hosts multiple instances one once machine. First create the VMs: `...
(AI_HYDRA_LEO_MISSING_IN)
docs-devel/index.md
[typographical] ~6-~6: Consider adding a comma.
Context: ...5.sharepoint.com/sites/ISISProject-1432) there is appetite for trialling a system to s...
(IF_THERE_COMMA)
[style] ~18-~18: Consider removing “of” to be more concise
Context: ...his repository is a monorepo containing all of the code for the platform. It may be separa...
(ALL_OF_THE)
docs-devel/system-architecture/index.md
[uncategorized] ~15-~15: “they” seems less likely than “the”.
Context: ...ervices Here we provide an overview of they key services. For details on their depl...
(AI_HYDRA_LEO_CP_THEY_THE)
[grammar] ~63-~63: A determiner cannot be combined with a possessive pronoun. Did you mean simply “a” or “our”?
Context: ...mmon elt-common is a our own Python package that provides additi...
(A_MY)
AGENTS.md
[uncategorized] ~44-~44: The official name of this software platform is spelled with a capital “H”.
Context: ...the current PR template**: Always check .github/PULL_REQUEST_TEMPLATE.md for the lates...
(GITHUB)
docs-devel/data-architecture/index.md
[style] ~5-~5: Would you like to use the Oxford spelling “organize”? The spelling ‘organise’ is also correct.
Context: ...icks.com/aws/en/lakehouse/medallion) to organise data as it flows from source systems th...
(OXFORD_SPELLING_Z_NOT_S)
[style] ~14-~14: Would you like to use the Oxford spelling “normalized”? The spelling ‘normalised’ is also correct.
Context: ...its original state or minimally cleaned/normalised. The bronze layer is the single source ...
(OXFORD_SPELLING_Z_NOT_S)
[uncategorized] ~21-~21: Possible missing comma found.
Context: ...er layer typically applies little to no aggregation leaving detailed information for end us...
(AI_HYDRA_LEO_MISSING_COMMA)
[uncategorized] ~33-~33: Possible missing comma found.
Context: ...scription of the implementation of this architecture see the [system architecture](../system...
(AI_HYDRA_LEO_MISSING_COMMA)
[uncategorized] ~35-~35: Possible missing comma found.
Context: ...on of the structure of the catalogs for ISIS see catalogs.
(AI_HYDRA_LEO_MISSING_COMMA)
infra/local/README.md
[uncategorized] ~58-~58: Possible missing comma found.
Context: ...nt: https://localhost:58443 (note the https as this is required by Trino. You may n...
(AI_HYDRA_LEO_MISSING_COMMA)
docs-devel/system-architecture/adrs/0006-use-lakekeeper-as-iceberg-catalog.md
[uncategorized] ~11-~11: Possible missing comma found.
Context: ...mplementations exist within the Iceberg libraries including: - REST: a server-side catal...
(AI_HYDRA_LEO_MISSING_COMMA)
[uncategorized] ~24-~24: Possible missing comma found.
Context: ...bility in how clients interact with the catalog and several implementations exist that ...
(AI_HYDRA_LEO_MISSING_COMMA)
[grammar] ~49-~49: This is normally spelled as one word.
Context: ...## Consequences Lakekeeper will be the back bone of the data lake and is a relatively ne...
(BACK_COMPOUNDS)
[uncategorized] ~49-~49: Use a comma before ‘so’ if it connects two independent clauses (unless they are closely connected and short).
Context: ...ata lake and is a relatively new project so there is some risk that it does not sur...
(COMMA_COMPOUND_SENTENCE_2)
[uncategorized] ~51-~51: Possible missing comma found.
Context: ...rating to a new catalog if that becomes necessary as the catalog only stores metadata and...
(AI_HYDRA_LEO_MISSING_COMMA)
🪛 Ruff (0.14.8)
infra/scripts/lakekeeper.py
78-78: Probable use of requests call without timeout
(S113)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: elt-common end-to-end tests
🔇 Additional comments (14)
.markdownlint.json (1)
3-5: Non-standard list indentation value.The MD007 indent is set to 3, which is unconventional. Most markdown style guides recommend using 2 or 4 spaces for list indentation to maintain consistency with common formatting standards. Please verify this is intentional.
.pre-commit-config.yaml (1)
27-27: Excluding documentation from markdown linting seems counterintuitive.The exclusion pattern skips the
docsdirectory from markdown linting. Documentation files would typically benefit the most from consistent markdown formatting. Please verify this exclusion is intentional and document the rationale if there's a specific reason (e.g., auto-generated docs or conflicting formatting requirements).elt-common/README.md (1)
39-39: Good improvement to link accessibility.Changing the link text from "here" to a descriptive label improves documentation clarity and accessibility.
docs-devel/images/.gitignore (1)
1-2: LGTM!The gitignore pattern correctly excludes Draw.io temporary save files.
readme.md (1)
10-11: Good documentation navigation improvement.The updated link provides a more specific target with an explanatory note, improving developer experience.
infra/ansible-docker/playbooks/cloud/private_network_create.yml (1)
2-7: Verify the necessity of explicit variable loading.Ansible typically loads variables from
group_vars/automatically based on the inventory structure. This explicitinclude_varstask may be redundant unless the playbook is executed in a non-standard manner or there's a specific ordering requirement.Please confirm whether this explicit variable loading is required, or if Ansible's automatic group_vars loading would suffice.
infra/local/README.md (1)
50-58: LGTM!The formatting improvements enhance readability and provide better markdown rendering across different parsers.
infra/ansible-docker/readme.md (1)
1-5: Excellent documentation consolidation.Centralising deployment guidance in a single location reduces maintenance overhead and ensures consistency.
AGENTS.md (1)
20-20: Good consolidation of documentation references.Linking to centralised documentation reduces duplication and ensures consistency across the repository.
Also applies to: 31-32
docs-devel/deployment/prerequisites.md (1)
78-85: LGTM!The updated networking prerequisites are clear and well-structured. The simplified IP allocation instructions and DNS alignment requirement properly reflect the streamlined deployment workflow.
docs-devel/data-architecture/catalogs.md (1)
1-9: LGTM!The catalogs documentation is clear and well-structured. The use of backticks for technical terms like
accelerator,src_, andanalytics_is appropriate.docs-devel/data-architecture/index.md (1)
1-35: LGTM!The data architecture documentation provides a clear and comprehensive overview of the Medallion-style lakehouse implementation. The explanations of bronze, silver, and gold layers are well-structured and include appropriate references to external resources and internal documentation.
docs-devel/index.md (1)
1-39: LGTM!The index page provides an excellent entry point to the developer documentation. The repository overview with the directory tree is particularly helpful for understanding the monorepo structure. The links to detailed architecture sections are well-organised.
docs-devel/images/analytics-data-platform-arch.drawio (1)
1-155: LGTM!The architecture diagram provides a valuable visual representation of the Analytics Data Platform components and their interactions. This complements the written system architecture documentation effectively.
docs-devel/system-architecture/adrs/0006-use-lakekeeper-as-iceberg-catalog.md
Outdated
Show resolved
Hide resolved
docs-devel/system-architecture/adrs/0006-use-lakekeeper-as-iceberg-catalog.md
Outdated
Show resolved
Hide resolved
Use the same tool as VSCode for ease of configuring the same rules for both the editor and CI tooling
Use correct extension in docs
0874624 to
4970aec
Compare
Summary
Refs #109
Summary by CodeRabbit
Release Notes
Documentation
Chores
✏️ Tip: You can customize this high-level summary in your review settings.