From e4dcec75721321f69c72eabf1b4538b4fc8ef7ce Mon Sep 17 00:00:00 2001 From: Maciej Jalocha Date: Sun, 15 Dec 2024 20:13:16 +0100 Subject: [PATCH 01/16] docs: create documentation for the project --- README.md | 61 +------------------------------------------- references/README.md | 60 +++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 61 insertions(+), 60 deletions(-) create mode 100644 references/README.md diff --git a/README.md b/README.md index 4664778..ac9e4d5 100644 --- a/README.md +++ b/README.md @@ -1,60 +1 @@ -# github-dagger-workflow-project - - - - - -A short description of the project. - -## Project Organization - -``` -├── LICENSE <- Open-source license if one is chosen -├── Makefile <- Makefile with convenience commands like `make data` or `make train` -├── README.md <- The top-level README for developers using this project. -├── data -│ ├── external <- Data from third party sources. -│ ├── interim <- Intermediate data that has been transformed. -│ ├── processed <- The final, canonical data sets for modeling. -│ └── raw <- The original, immutable data dump. -│ -├── docs <- A default mkdocs project; see www.mkdocs.org for details -│ -├── models <- Trained and serialized models, model predictions, or model summaries -│ -├── notebooks <- Jupyter notebooks. Naming convention is a number (for ordering), -│ the creator's initials, and a short `-` delimited description, e.g. -│ `1.0-jqp-initial-data-exploration`. -│ -├── pyproject.toml <- Project configuration file with package metadata for -│ github-dagger-workflow-project and configuration for tools like black -│ -├── references <- Data dictionaries, manuals, and all other explanatory materials. -│ -├── reports <- Generated analysis as HTML, PDF, LaTeX, etc. -│ └── figures <- Generated graphics and figures to be used in reporting -│ -├── requirements.txt <- The requirements file for reproducing the analysis environment, e.g. 
-│ generated with `pip freeze > requirements.txt` -│ -├── setup.cfg <- Configuration file for flake8 -│ -└── github_dagger_workflow_project <- Source code for use in this project. - │ - ├── __init__.py <- Makes github_dagger_workflow_project a Python module - │ - ├── config.py <- Store useful variables and configuration - │ - ├── dataset.py <- Scripts to download or generate data - │ - ├── features.py <- Code to create features for modeling - │ - ├── modeling - │ ├── __init__.py - │ ├── predict.py <- Code to run model inference with trained models - │ └── train.py <- Code to train models - │ - └── plots.py <- Code to create visualizations -``` - ---- +### Documentation diff --git a/references/README.md b/references/README.md new file mode 100644 index 0000000..4664778 --- /dev/null +++ b/references/README.md @@ -0,0 +1,60 @@ +# github-dagger-workflow-project + + + + + +A short description of the project. + +## Project Organization + +``` +├── LICENSE <- Open-source license if one is chosen +├── Makefile <- Makefile with convenience commands like `make data` or `make train` +├── README.md <- The top-level README for developers using this project. +├── data +│ ├── external <- Data from third party sources. +│ ├── interim <- Intermediate data that has been transformed. +│ ├── processed <- The final, canonical data sets for modeling. +│ └── raw <- The original, immutable data dump. +│ +├── docs <- A default mkdocs project; see www.mkdocs.org for details +│ +├── models <- Trained and serialized models, model predictions, or model summaries +│ +├── notebooks <- Jupyter notebooks. Naming convention is a number (for ordering), +│ the creator's initials, and a short `-` delimited description, e.g. +│ `1.0-jqp-initial-data-exploration`. +│ +├── pyproject.toml <- Project configuration file with package metadata for +│ github-dagger-workflow-project and configuration for tools like black +│ +├── references <- Data dictionaries, manuals, and all other explanatory materials. 
+│ +├── reports <- Generated analysis as HTML, PDF, LaTeX, etc. +│ └── figures <- Generated graphics and figures to be used in reporting +│ +├── requirements.txt <- The requirements file for reproducing the analysis environment, e.g. +│ generated with `pip freeze > requirements.txt` +│ +├── setup.cfg <- Configuration file for flake8 +│ +└── github_dagger_workflow_project <- Source code for use in this project. + │ + ├── __init__.py <- Makes github_dagger_workflow_project a Python module + │ + ├── config.py <- Store useful variables and configuration + │ + ├── dataset.py <- Scripts to download or generate data + │ + ├── features.py <- Code to create features for modeling + │ + ├── modeling + │ ├── __init__.py + │ ├── predict.py <- Code to run model inference with trained models + │ └── train.py <- Code to train models + │ + └── plots.py <- Code to create visualizations +``` + +--- From f3402d7b5c89ef74a2f1acc8c7ec492a75f0a69e Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Adam=20Rosen=C3=B8rn?= Date: Sun, 15 Dec 2024 22:32:07 +0100 Subject: [PATCH 02/16] docs: README updates --- README.md | 64 ++++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 63 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index ac9e4d5..123071e 100644 --- a/README.md +++ b/README.md @@ -1 +1,63 @@ -### Documentation +# ITU BDS SDSE'24 - Project + +This project is part of the Software Development and Software Engineering at ITU. The original project description can be found [here](https://github.com/lasselundstenjensen/itu-sdse-project) + +In this project we were tasked with restructuring a Python monolith using the concepts we have learned throughout the course. This project contains a Dagger and Github workflow. 
+ +## Project Structure + +``` +├── README.md <- Project description +│ +├── .dvc +│ +├── .github/workflows <- +│ ├── tag_version.yml <- +│ ├── test_action.yml <- +│ +├── pipeline_deps <- +│ ├── requirements.txt <- +│ +├── CODEOWNERS <- +│ +├── go.mod <- +│ +├── go.sum <- +│ +├── pipeline.go <- +│ +├── pyproject.toml <- +│ +├── references <- +│ +├── requirements.txt <- +│ +└── github_dagger_workflow_project <- + │ + ├── data_transformations.py <- + │ + ├── model_deployment.py <- + │ + ├── model_selection.py <- + │ + ├── model_training.py <- + │ + ├── prod_model.py <- + │ + ├── artifacts + │ │ + │ └── raw_data.csv.dvc <- + │ + └── utils.py <- +``` + +--- + + +## How to run the code + +### Triggering Github Workflow + +The workflow can be triggered either by pushing changes or manually. + + It can be triggered manually [here](https://github.com/PLtier/github-dagger-workflow-project/actions/workflows/test_action.yml) by pressing `Run workflow` on the `main` branch \ No newline at end of file From a7722eb5d28ebeaa94e43d1344d866a8b52802fb Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Adam=20Rosen=C3=B8rn?= Date: Mon, 16 Dec 2024 21:32:34 +0100 Subject: [PATCH 03/16] docs: small README changes --- README.md | 20 +++++++++++--------- 1 file changed, 11 insertions(+), 9 deletions(-) diff --git a/README.md b/README.md index 123071e..c6477d3 100644 --- a/README.md +++ b/README.md @@ -2,7 +2,7 @@ This project is part of the Software Development and Software Engineering at ITU. The original project description can be found [here](https://github.com/lasselundstenjensen/itu-sdse-project) -In this project we were tasked with restructuring a Python monolith using the concepts we have learned throughout the course. This project contains a Dagger and Github workflow. +In this project we were tasked with restructuring a Python monolith using the concepts we have learned throughout the course. 
This project contains a [Dagger](https://github.com/PLtier/github-dagger-workflow-project/blob/main/pipeline.go) and [Github](https://github.com/PLtier/github-dagger-workflow-project/blob/main/.github/workflows/test_action.yml) workflow. ## Project Structure @@ -13,10 +13,10 @@ In this project we were tasked with restructuring a Python monolith using the co │ ├── .github/workflows <- │ ├── tag_version.yml <- -│ ├── test_action.yml <- +│ └── test_action.yml <- │ ├── pipeline_deps <- -│ ├── requirements.txt <- +│ └── requirements.txt <- │ ├── CODEOWNERS <- │ @@ -34,15 +34,17 @@ In this project we were tasked with restructuring a Python monolith using the co │ └── github_dagger_workflow_project <- │ - ├── data_transformations.py <- + ├── __init__.py <- │ - ├── model_deployment.py <- + ├── 01_data_transformations.py <- │ - ├── model_selection.py <- + ├── 02_model_training.py <- │ - ├── model_training.py <- + ├── 03_model_selection.py <- │ - ├── prod_model.py <- + ├── 04_prod_model.py <- + │ + ├── 05_model_deployment.py <- │ ├── artifacts │ │ @@ -58,6 +60,6 @@ In this project we were tasked with restructuring a Python monolith using the co ### Triggering Github Workflow -The workflow can be triggered either by pushing changes or manually. +The workflow can be triggered either by on pull requests to main or manually. 
It can be triggered manually [here](https://github.com/PLtier/github-dagger-workflow-project/actions/workflows/test_action.yml) by pressing `Run workflow` on the `main` branch \ No newline at end of file From 1f256d388a094cb044e7dbedc44a4faa073074b6 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Adam=20Rosen=C3=B8rn?= Date: Mon, 16 Dec 2024 23:09:00 +0100 Subject: [PATCH 04/16] docs: added explanations --- README.md | 47 +++++++++++++++++++++++++---------------------- 1 file changed, 25 insertions(+), 22 deletions(-) diff --git a/README.md b/README.md index c6477d3..d883e5f 100644 --- a/README.md +++ b/README.md @@ -7,50 +7,53 @@ In this project we were tasked with restructuring a Python monolith using the co ## Project Structure ``` -├── README.md <- Project description +├── README.md <- Project description and how to run the code │ ├── .dvc │ -├── .github/workflows <- -│ ├── tag_version.yml <- -│ └── test_action.yml <- +├── .github/workflows <- Github Action workflows +│ │ +│ ├── tag_version.yml <- +│ │ +│ └── test_action.yml <- │ -├── pipeline_deps <- -│ └── requirements.txt <- +├── pipeline_deps <- +│ │ +│ └── requirements.txt <- Dependencies for the pipeline │ -├── CODEOWNERS <- +├── CODEOWNERS <- Defines codeowners for the repository │ -├── go.mod <- +├── go.mod <- Go file that defines the module and the required dependencies │ -├── go.sum <- +├── go.sum <- │ -├── pipeline.go <- +├── pipeline.go <- Dagger workflow written in Go │ -├── pyproject.toml <- +├── pyproject.toml <- │ -├── references <- +├── references <- Documentation and extra resources │ -├── requirements.txt <- +├── requirements.txt <- Python dependecies need for the project │ -└── github_dagger_workflow_project <- +└── github_dagger_workflow_project <- Source code for the project │ - ├── __init__.py <- + ├── __init__.py <- │ - ├── 01_data_transformations.py <- + ├── 01_data_transformations.py <- Code for data preprocessing and transformation │ - ├── 02_model_training.py <- + ├── 02_model_training.py 
<- Code for training the models │ - ├── 03_model_selection.py <- + ├── 03_model_selection.py <- Code for selecting the best perfoming model │ - ├── 04_prod_model.py <- + ├── 04_prod_model.py <- Code for comparing new best model and production model │ - ├── 05_model_deployment.py <- + ├── 05_model_deployment.py <- Code for deploying model │ ├── artifacts │ │ - │ └── raw_data.csv.dvc <- + │ └── raw_data.csv.dvc <- │ - └── utils.py <- + └── utils.py <- Helper functions ``` --- From b92b259fb7f1cba20cb3dd191b49c3c6c45b11bb Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Adam=20Rosen=C3=B8rn?= Date: Wed, 18 Dec 2024 15:36:04 +0100 Subject: [PATCH 05/16] docs: README updates --- README.md | 54 ++++++++++++++++++++++++++++++------------------------ 1 file changed, 30 insertions(+), 24 deletions(-) diff --git a/README.md b/README.md index d883e5f..5d5555d 100644 --- a/README.md +++ b/README.md @@ -7,53 +7,59 @@ In this project we were tasked with restructuring a Python monolith using the co ## Project Structure ``` -├── README.md <- Project description and how to run the code +├── README.md <- Project description and how to run the code │ -├── .dvc -│ -├── .github/workflows <- Github Action workflows +├── .github/workflows <- Github Action workflows │ │ -│ ├── tag_version.yml <- +│ ├── tag_version.yml <- Workflow for creating version tags │ │ -│ └── test_action.yml <- +│ └── test_action.yml <- Workflow that automatically trains and tests model │ -├── pipeline_deps <- +├── pipeline_deps │ │ -│ └── requirements.txt <- Dependencies for the pipeline +│ └── requirements.txt <- Dependencies for the pipeline +│ +├── CODEOWNERS <- Defines codeowners for the repository +│ +├── go.mod <- Go file that defines the module and required dependencies │ -├── CODEOWNERS <- Defines codeowners for the repository +├── go.sum <- Go file that ensures continuity and integrity of dependencies │ -├── go.mod <- Go file that defines the module and the required dependencies +├── pipeline.go <- Dagger 
workflow written in Go │ -├── go.sum <- +├── pyproject.toml <- Configuration file │ -├── pipeline.go <- Dagger workflow written in Go +├── .pre-commit-config.yaml <- Checks quality of code before commits │ -├── pyproject.toml <- +├── Makefile.venv <- Creates and manages Pythion virtual enviorment │ -├── references <- Documentation and extra resources +├── references <- Documentation and extra resources │ -├── requirements.txt <- Python dependecies need for the project +├── requirements.txt <- Python dependecies need for the project │ -└── github_dagger_workflow_project <- Source code for the project +└── github_dagger_workflow_project <- Source code for the project │ - ├── __init__.py <- + ├── __init__.py <- Marks the directory as a Python package │ - ├── 01_data_transformations.py <- Code for data preprocessing and transformation + ├── 01_data_transformations.py <- Script for data preprocessing and transformation │ - ├── 02_model_training.py <- Code for training the models + ├── 02_model_training.py <- Script for training the models │ - ├── 03_model_selection.py <- Code for selecting the best perfoming model + ├── 03_model_selection.py <- Script for selecting the best perfoming model │ - ├── 04_prod_model.py <- Code for comparing new best model and production model + ├── 04_prod_model.py <- Script for comparing new best model and production model │ - ├── 05_model_deployment.py <- Code for deploying model + ├── 05_model_deployment.py <- Script for deploying model │ ├── artifacts │ │ - │ └── raw_data.csv.dvc <- + │ └── raw_data.csv.dvc <- Metadata tracked by DVC for data file + │ + ├── tests + │ │ + │ └── verify_artifacts.py <- Tests to check if all artifacts are copied correctly │ - └── utils.py <- Helper functions + └── utils.py <- Helper functions ``` --- From dc4d16801c00364de56b6b9b3897d69190f39f36 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Adam=20Rosen=C3=B8rn?= Date: Wed, 18 Dec 2024 17:53:25 +0100 Subject: [PATCH 06/16] docs: clarified how to trigger workflow 
--- README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index 5d5555d..b2585f9 100644 --- a/README.md +++ b/README.md @@ -2,7 +2,7 @@ This project is part of the Software Development and Software Engineering at ITU. The original project description can be found [here](https://github.com/lasselundstenjensen/itu-sdse-project) -In this project we were tasked with restructuring a Python monolith using the concepts we have learned throughout the course. This project contains a [Dagger](https://github.com/PLtier/github-dagger-workflow-project/blob/main/pipeline.go) and [Github](https://github.com/PLtier/github-dagger-workflow-project/blob/main/.github/workflows/test_action.yml) workflow. +In this project we were tasked with restructuring a Python monolith using the concepts we have learned throughout the course. This project contains a [Dagger workflow](https://github.com/PLtier/github-dagger-workflow-project/blob/main/pipeline.go) and a [Github workflow](https://github.com/PLtier/github-dagger-workflow-project/blob/main/.github/workflows/test_action.yml). ## Project Structure @@ -71,4 +71,4 @@ In this project we were tasked with restructuring a Python monolith using the co The workflow can be triggered either by on pull requests to main or manually. - It can be triggered manually [here](https://github.com/PLtier/github-dagger-workflow-project/actions/workflows/test_action.yml) by pressing `Run workflow` on the `main` branch \ No newline at end of file + It can be triggered manually [here](https://github.com/PLtier/github-dagger-workflow-project/actions/workflows/test_action.yml) by pressing `Run workflow` on the `main` branch, then refresh the page and the triggered workflow will appear. After all the jobs have been run, the model artifacts can be found on the summary page of the run. 
\ No newline at end of file From 55e2c38b4003b5836a6f10540f160d82970216dd Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Adam=20Rosen=C3=B8rn?= Date: Wed, 18 Dec 2024 18:01:45 +0100 Subject: [PATCH 07/16] docs: minor grammar fixes --- README.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/README.md b/README.md index b2585f9..b39a3e9 100644 --- a/README.md +++ b/README.md @@ -2,14 +2,14 @@ This project is part of the Software Development and Software Engineering at ITU. The original project description can be found [here](https://github.com/lasselundstenjensen/itu-sdse-project) -In this project we were tasked with restructuring a Python monolith using the concepts we have learned throughout the course. This project contains a [Dagger workflow](https://github.com/PLtier/github-dagger-workflow-project/blob/main/pipeline.go) and a [Github workflow](https://github.com/PLtier/github-dagger-workflow-project/blob/main/.github/workflows/test_action.yml). +In this project we were tasked with restructuring a Python monolith using the concepts we have learned throughout the course. This project contains a [Dagger workflow](https://github.com/PLtier/github-dagger-workflow-project/blob/main/pipeline.go) and a [GitHub workflow](https://github.com/PLtier/github-dagger-workflow-project/blob/main/.github/workflows/test_action.yml). 
## Project Structure ``` ├── README.md <- Project description and how to run the code │ -├── .github/workflows <- Github Action workflows +├── .github/workflows <- GitHub Action workflows │ │ │ ├── tag_version.yml <- Workflow for creating version tags │ │ @@ -31,11 +31,11 @@ In this project we were tasked with restructuring a Python monolith using the co │ ├── .pre-commit-config.yaml <- Checks quality of code before commits │ -├── Makefile.venv <- Creates and manages Pythion virtual enviorment +├── Makefile.venv <- Creates and manages Python virtual environment │ ├── references <- Documentation and extra resources │ -├── requirements.txt <- Python dependecies need for the project +├── requirements.txt <- Python dependencies need for the project │ └── github_dagger_workflow_project <- Source code for the project │ @@ -67,8 +67,8 @@ In this project we were tasked with restructuring a Python monolith using the co ## How to run the code -### Triggering Github Workflow +### Triggering GitHub Workflow -The workflow can be triggered either by on pull requests to main or manually. +The workflow can be triggered either on pull requests to main or manually. It can be triggered manually [here](https://github.com/PLtier/github-dagger-workflow-project/actions/workflows/test_action.yml) by pressing `Run workflow` on the `main` branch, then refresh the page and the triggered workflow will appear. After all the jobs have been run, the model artifacts can be found on the summary page of the run. 
\ No newline at end of file From 558c0b6838a59d6d80f0ba42a15ffbd7e766ace6 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Adam=20Rosen=C3=B8rn?= Date: Wed, 18 Dec 2024 21:54:26 +0100 Subject: [PATCH 08/16] docs: minor tweaks to documentation --- README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index b39a3e9..a5c6874 100644 --- a/README.md +++ b/README.md @@ -1,6 +1,6 @@ # ITU BDS SDSE'24 - Project -This project is part of the Software Development and Software Engineering at ITU. The original project description can be found [here](https://github.com/lasselundstenjensen/itu-sdse-project) +This project is part of the Software Development and Software Engineering course at ITU. The original project description can be found [here](https://github.com/lasselundstenjensen/itu-sdse-project). In this project we were tasked with restructuring a Python monolith using the concepts we have learned throughout the course. This project contains a [Dagger workflow](https://github.com/PLtier/github-dagger-workflow-project/blob/main/pipeline.go) and a [GitHub workflow](https://github.com/PLtier/github-dagger-workflow-project/blob/main/.github/workflows/test_action.yml). @@ -69,6 +69,6 @@ In this project we were tasked with restructuring a Python monolith using the co ### Triggering GitHub Workflow -The workflow can be triggered either on pull requests to main or manually. +The workflow can be triggered either on pull requests to `main` or manually. It can be triggered manually [here](https://github.com/PLtier/github-dagger-workflow-project/actions/workflows/test_action.yml) by pressing `Run workflow` on the `main` branch, then refresh the page and the triggered workflow will appear. After all the jobs have been run, the model artifacts can be found on the summary page of the run. 
\ No newline at end of file From ae5c045d2e39f9d9b3ca703909a4411809d68fa8 Mon Sep 17 00:00:00 2001 From: Maciej Jalocha Date: Thu, 19 Dec 2024 02:42:41 +0100 Subject: [PATCH 09/16] docs: running, testing, code quality, decisions, todo --- README.md | 98 +++++++++++++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 91 insertions(+), 7 deletions(-) diff --git a/README.md b/README.md index a5c6874..9363cd8 100644 --- a/README.md +++ b/README.md @@ -11,11 +11,11 @@ In this project we were tasked with restructuring a Python monolith using the co │ ├── .github/workflows <- GitHub Action workflows │ │ -│ ├── tag_version.yml <- Workflow for creating version tags +│ ├── tag_version.yml <- Workflow for creating version tags │ │ │ └── test_action.yml <- Workflow that automatically trains and tests model │ -├── pipeline_deps +├── pipeline_deps │ │ │ └── requirements.txt <- Dependencies for the pipeline │ @@ -33,7 +33,7 @@ In this project we were tasked with restructuring a Python monolith using the co │ ├── Makefile.venv <- Creates and manages Python virtual environment │ -├── references <- Documentation and extra resources +├── references <- Documentation and extra resources │ ├── requirements.txt <- Python dependencies need for the project │ @@ -64,11 +64,95 @@ In this project we were tasked with restructuring a Python monolith using the co --- +# How to run the code -## How to run the code - -### Triggering GitHub Workflow +## Artifact creation The workflow can be triggered either on pull requests to `main` or manually. - It can be triggered manually [here](https://github.com/PLtier/github-dagger-workflow-project/actions/workflows/test_action.yml) by pressing `Run workflow` on the `main` branch, then refresh the page and the triggered workflow will appear. After all the jobs have been run, the model artifacts can be found on the summary page of the run. 
\ No newline at end of file +It can be triggered manually [here](https://github.com/PLtier/github-dagger-workflow-project/actions/workflows/test_action.yml) by pressing `Run workflow` on the `main` branch, then refresh the page and the triggered workflow will appear. After all the jobs have been run, the model artifact can be found on the summary page of the run of the first job. We also store other artifacts for convenience. +The testing is automatically run afterwards to let the user check if it was of a quality. +Artifacts are stored for 90 days. + +## Local development + +### Environment installation + +You need to have downloaded: + +- `docker` +- `dagger` >= 15 +- `go` - 1.23.3 is currently used. +- `git` +- `python` >=3.11.9 + +Then run: + +```shell +make setup +.venv\Scripts\activate # windows +source .venv/bin/activate # unix +go mod tidy +``` + +It installs `pre-commit` which takes care of formatting and linting before commits for go and python. (we use `ruff`, `ruff format`, `gofmt` and `govet`) + +### Running the code: + +#### Run scripts on the host machine + +For that you can run scripts sequentially in the github_dagger_workflow_project. +Callout: + +> Beware: all artifacts will be appended to your repo dir! + +#### Run in a container + +The command will run the `dagger` pipeline. In the end, **only** final artifacts will be appended to + +```shell +make run +``` + +#### Local testing + +Perhaps most useful. It will not append any of the container-produced files to the host machine, but it will run a test script **which will ensure that all important artifacts are indeed logged** + +```shell +make test +``` + +> Beware: it will not test the model on the inference test! + +## Inference testing + +The same workflow which generates artifacts automatically runs the inference testing. 
Also, the artifacts testing and the inference test is carried out after every PR (and subsequent commits) to `main` + +## Maintaining code quality + +- We used `pre-commit` to lint and format, as stated above. +- `main` branch-protection (with github repo settings) + - PR is required before merging + - at least one approval is needed. We automatically assign reviewers with `CODEOWNERS` file. + - we required status checks to be passed for both of our jobs i.e. `Train and Upload Model` and `Unit Test Model Artifacts`. The test checks explicitly whether all artifacts have been generated and if the model passes inference test. Jobs are automatically triggered on merge. +- We maintained a clear goals via `Issues` and often quite verbose reviews. +- we used 90% of time semantic commits + +## Code releases + +On every push to main a new tag is released with the current time it was published. +See current tags: [Tags](https://github.com/PLtier/github-dagger-workflow-project/tags) + +### Decisions which have been made + +- We have noticed a few strong signs an XGBoost was supposed to be in the pipeline. We initially included it, but finally decided on wrapping the code in such a way, that by one-liner one can start effectively compare LR with XGBoost. Please read more in: _Originally posted by @PLtier in [#3 Issues](https://github.com/PLtier/github-dagger-workflow-project/issues/3#issuecomment-2551304436)_ +- We strived to encapsulate as much of the code into functions (no global variables shared except constants). This was to improve + - readibility + - better troubleshooting + - not polluting global namespace (so fewer bugs) +- We sorted imports. For our module we use absolute imports. +- We have removed unnecessary code. + +### what to improve upon + +Take out tests out of the production code as right now we do it. 
From 14cb8f617e4ebb35de147e508350b3e3b88e5bf7 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Adam=20Rosen=C3=B8rn?= Date: Thu, 19 Dec 2024 10:24:57 +0100 Subject: [PATCH 10/16] docs: small fix in project structure documentation --- README.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/README.md b/README.md index 9363cd8..bd0430d 100644 --- a/README.md +++ b/README.md @@ -37,6 +37,10 @@ In this project we were tasked with restructuring a Python monolith using the co │ ├── requirements.txt <- Python dependencies need for the project │ +├── tests +│ │ +│ └── verify_artifacts.py <- Tests to check if all artifacts are copied correctly +│ └── github_dagger_workflow_project <- Source code for the project │ ├── __init__.py <- Marks the directory as a Python package @@ -55,10 +59,6 @@ In this project we were tasked with restructuring a Python monolith using the co │ │ │ └── raw_data.csv.dvc <- Metadata tracked by DVC for data file │ - ├── tests - │ │ - │ └── verify_artifacts.py <- Tests to check if all artifacts are copied correctly - │ └── utils.py <- Helper functions ``` From f8cd57c700e4c814ad2338bd51762fe586d5aac3 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Adam=20Rosen=C3=B8rn?= Date: Thu, 19 Dec 2024 10:49:11 +0100 Subject: [PATCH 11/16] docs: corrected file name in project structure --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index bd0430d..32341ee 100644 --- a/README.md +++ b/README.md @@ -13,7 +13,7 @@ In this project we were tasked with restructuring a Python monolith using the co │ │ │ ├── tag_version.yml <- Workflow for creating version tags │ │ -│ └── test_action.yml <- Workflow that automatically trains and tests model +│ └── log_and_test_action.yml <- Workflow that automatically trains and tests model │ ├── pipeline_deps │ │ From d2c120dc35143c93ab06ae5a966b37cc9844c48a Mon Sep 17 00:00:00 2001 From: Maciej Jalocha Date: Thu, 19 Dec 2024 11:24:03 +0100 Subject: [PATCH 12/16] docs: 
update README.md --- README.md | 38 +++++++++++++++----------------------- 1 file changed, 15 insertions(+), 23 deletions(-) diff --git a/README.md b/README.md index 9363cd8..aa2e077 100644 --- a/README.md +++ b/README.md @@ -4,6 +4,8 @@ This project is part of the Software Development and Software Engineering course In this project we were tasked with restructuring a Python monolith using the concepts we have learned throughout the course. This project contains a [Dagger workflow](https://github.com/PLtier/github-dagger-workflow-project/blob/main/pipeline.go) and a [GitHub workflow](https://github.com/PLtier/github-dagger-workflow-project/blob/main/.github/workflows/test_action.yml). +![Goal](./references/project-architecture.png) + ## Project Structure ``` @@ -62,8 +64,6 @@ In this project we were tasked with restructuring a Python monolith using the co └── utils.py <- Helper functions ``` ---- - # How to run the code ## Artifact creation @@ -74,35 +74,37 @@ It can be triggered manually [here](https://github.com/PLtier/github-dagger-work The testing is automatically run afterwards to let the user check if it was of a quality. Artifacts are stored for 90 days. -## Local development +## Local development / Running ### Environment installation -You need to have downloaded: +For local running you need: - `docker` - `dagger` >= 15 + +For local development you need as well: + - `go` - 1.23.3 is currently used. - `git` - `python` >=3.11.9 +- `make` Then run: ```shell make setup -.venv\Scripts\activate # windows -source .venv/bin/activate # unix -go mod tidy +.venv\Scripts\activate # for windows +source .venv/bin/activate # for linux/macos ``` -It installs `pre-commit` which takes care of formatting and linting before commits for go and python. (we use `ruff`, `ruff format`, `gofmt` and `govet`) +Additionally, It installs `pre-commit` which takes care of formatting and linting before commits for go and python. 
### Running the code: #### Run scripts on the host machine For that you can run scripts sequentially in the github_dagger_workflow_project. -Callout: > Beware: all artifacts will be appended to your repo dir! @@ -111,7 +113,7 @@ Callout: The command will run the `dagger` pipeline. In the end, **only** final artifacts will be appended to ```shell -make run +make container_run ``` #### Local testing @@ -130,7 +132,7 @@ The same workflow which generates artifacts automatically runs the inference tes ## Maintaining code quality -- We used `pre-commit` to lint and format, as stated above. +- We used `pre-commit` to lint and format, as stated above. We use `ruff`, `ruff format`, `gofmt` and `govet`. We check for PEP8 warnings and errors. - `main` branch-protection (with github repo settings) - PR is required before merging - at least one approval is needed. We automatically assign reviewers with `CODEOWNERS` file. @@ -143,16 +145,6 @@ The same workflow which generates artifacts automatically runs the inference tes On every push to main a new tag is released with the current time it was published. See current tags: [Tags](https://github.com/PLtier/github-dagger-workflow-project/tags) -### Decisions which have been made - -- We have noticed a few strong signs an XGBoost was supposed to be in the pipeline. We initially included it, but finally decided on wrapping the code in such a way, that by one-liner one can start effectively compare LR with XGBoost. Please read more in: _Originally posted by @PLtier in [#3 Issues](https://github.com/PLtier/github-dagger-workflow-project/issues/3#issuecomment-2551304436)_ -- We strived to encapsulate as much of the code into functions (no global variables shared except constants). This was to improve - - readibility - - better troubleshooting - - not polluting global namespace (so fewer bugs) -- We sorted imports. For our module we use absolute imports. -- We have removed unnecessary code. 
- -### what to improve upon +# Code decisions and reflections -Take out tests out of the production code as right now we do it. +> This is not part of the documentation: you can read about a few (hard) decisions we have made in [Reflections](./references/project_reflections.md) From af293c57b519e36f9dca43fb4397351efadb6f21 Mon Sep 17 00:00:00 2001 From: Maciej Jalocha Date: Thu, 19 Dec 2024 11:24:18 +0100 Subject: [PATCH 13/16] docs: update references --- references/README.md | 60 - references/diagrams.excalidraw | 1617 ------------------- references/problem_specification.md | 77 - references/project_reflections.md | 22 + references/subjective_code_practices_mpj.md | 15 +- 5 files changed, 25 insertions(+), 1766 deletions(-) delete mode 100644 references/README.md delete mode 100644 references/diagrams.excalidraw delete mode 100644 references/problem_specification.md create mode 100644 references/project_reflections.md diff --git a/references/README.md b/references/README.md deleted file mode 100644 index 4664778..0000000 --- a/references/README.md +++ /dev/null @@ -1,60 +0,0 @@ -# github-dagger-workflow-project - - - - - -A short description of the project. - -## Project Organization - -``` -├── LICENSE <- Open-source license if one is chosen -├── Makefile <- Makefile with convenience commands like `make data` or `make train` -├── README.md <- The top-level README for developers using this project. -├── data -│ ├── external <- Data from third party sources. -│ ├── interim <- Intermediate data that has been transformed. -│ ├── processed <- The final, canonical data sets for modeling. -│ └── raw <- The original, immutable data dump. -│ -├── docs <- A default mkdocs project; see www.mkdocs.org for details -│ -├── models <- Trained and serialized models, model predictions, or model summaries -│ -├── notebooks <- Jupyter notebooks. Naming convention is a number (for ordering), -│ the creator's initials, and a short `-` delimited description, e.g.
-│ `1.0-jqp-initial-data-exploration`. -│ -├── pyproject.toml <- Project configuration file with package metadata for -│ github-dagger-workflow-project and configuration for tools like black -│ -├── references <- Data dictionaries, manuals, and all other explanatory materials. -│ -├── reports <- Generated analysis as HTML, PDF, LaTeX, etc. -│ └── figures <- Generated graphics and figures to be used in reporting -│ -├── requirements.txt <- The requirements file for reproducing the analysis environment, e.g. -│ generated with `pip freeze > requirements.txt` -│ -├── setup.cfg <- Configuration file for flake8 -│ -└── github_dagger_workflow_project <- Source code for use in this project. - │ - ├── __init__.py <- Makes github_dagger_workflow_project a Python module - │ - ├── config.py <- Store useful variables and configuration - │ - ├── dataset.py <- Scripts to download or generate data - │ - ├── features.py <- Code to create features for modeling - │ - ├── modeling - │ ├── __init__.py - │ ├── predict.py <- Code to run model inference with trained models - │ └── train.py <- Code to train models - │ - └── plots.py <- Code to create visualizations -``` - ---- diff --git a/references/diagrams.excalidraw b/references/diagrams.excalidraw deleted file mode 100644 index 021fb8c..0000000 --- a/references/diagrams.excalidraw +++ /dev/null @@ -1,1617 +0,0 @@ -{ - "type": "excalidraw", - "version": 2, - "source": "https://excalidraw.com", - "elements": [ - { - "type": "rectangle", - "version": 112, - "versionNonce": 485086211, - "index": "a0", - "isDeleted": false, - "id": "JLN87aw43yq_b66L3aF63", - "fillStyle": "solid", - "strokeWidth": 2, - "strokeStyle": "solid", - "roughness": 1, - "opacity": 100, - "angle": 0, - "x": 665.4609375, - "y": 234.64453125, - "strokeColor": "#2f9e44", - "backgroundColor": "#b2f2bb", - "width": 65.49487433010404, - "height": 92.98828125, - "seed": 1808160269, - "groupIds": [], - "frameId": null, - "roundness": { - "type": 3 - }, - "boundElements": [ 
- { - "type": "text", - "id": "Cdlu7dWasRalCq6MzlzBX" - }, - { - "id": "UllqRgtDdGDcvc_20R07g", - "type": "arrow" - }, - { - "id": "I8rDfQB7NIXAsN-l9V-KM", - "type": "arrow" - }, - { - "id": "4tWMepvCHRUWk1zBrcB-4", - "type": "arrow" - } - ], - "updated": 1728476250770, - "link": null, - "locked": false - }, - { - "type": "text", - "version": 19, - "versionNonce": 150959363, - "index": "a1", - "isDeleted": false, - "id": "Cdlu7dWasRalCq6MzlzBX", - "fillStyle": "solid", - "strokeWidth": 2, - "strokeStyle": "solid", - "roughness": 1, - "opacity": 100, - "angle": 0, - "x": 684.0983815648857, - "y": 268.638671875, - "strokeColor": "#2f9e44", - "backgroundColor": "#b2f2bb", - "width": 28.21998620033264, - "height": 25, - "seed": 727251565, - "groupIds": [], - "frameId": null, - "roundness": null, - "boundElements": [], - "updated": 1728476026890, - "link": null, - "locked": false, - "fontSize": 20, - "fontFamily": 5, - "text": ".py", - "textAlign": "center", - "verticalAlign": "middle", - "containerId": "JLN87aw43yq_b66L3aF63", - "originalText": ".py", - "autoResize": true, - "lineHeight": 1.25 - }, - { - "type": "rectangle", - "version": 181, - "versionNonce": 1741539437, - "index": "a2", - "isDeleted": false, - "id": "B0__NJqN-xUib6eGz9rJc", - "fillStyle": "solid", - "strokeWidth": 2, - "strokeStyle": "solid", - "roughness": 1, - "opacity": 100, - "angle": 0, - "x": 895.3007124183662, - "y": 156.56141168763392, - "strokeColor": "#2f9e44", - "backgroundColor": "#b2f2bb", - "width": 65.49487433010404, - "height": 92.98828125, - "seed": 2044157987, - "groupIds": [], - "frameId": null, - "roundness": { - "type": 3 - }, - "boundElements": [ - { - "type": "text", - "id": "5vMtA0AJvHQO4idoMzQ5s" - }, - { - "id": "UllqRgtDdGDcvc_20R07g", - "type": "arrow" - } - ], - "updated": 1728476242653, - "link": null, - "locked": false - }, - { - "type": "text", - "version": 102, - "versionNonce": 1052023491, - "index": "a3", - "isDeleted": false, - "id": "5vMtA0AJvHQO4idoMzQ5s", - 
"fillStyle": "solid", - "strokeWidth": 2, - "strokeStyle": "solid", - "roughness": 1, - "opacity": 100, - "angle": 0, - "x": 900.3007124183662, - "y": 178.05555231263392, - "strokeColor": "#2f9e44", - "backgroundColor": "#b2f2bb", - "width": 32.87994384765625, - "height": 50, - "seed": 81617859, - "groupIds": [], - "frameId": null, - "roundness": null, - "boundElements": [], - "updated": 1728476058842, - "link": null, - "locked": false, - "fontSize": 20, - "fontFamily": 5, - "text": "----\n---", - "textAlign": "left", - "verticalAlign": "middle", - "containerId": "B0__NJqN-xUib6eGz9rJc", - "originalText": "----\n---", - "autoResize": true, - "lineHeight": 1.25 - }, - { - "type": "rectangle", - "version": 226, - "versionNonce": 739192067, - "index": "a4", - "isDeleted": false, - "id": "_r06E4r0gRqSuzSYMWU03", - "fillStyle": "solid", - "strokeWidth": 2, - "strokeStyle": "solid", - "roughness": 1, - "opacity": 100, - "angle": 0, - "x": 895.2544155256526, - "y": 265.60468438679914, - "strokeColor": "#2f9e44", - "backgroundColor": "#b2f2bb", - "width": 65.49487433010404, - "height": 92.98828125, - "seed": 1742737933, - "groupIds": [], - "frameId": null, - "roundness": { - "type": 3 - }, - "boundElements": [ - { - "type": "text", - "id": "0BVvfJxOwcoQayQ00TfmD" - }, - { - "id": "I8rDfQB7NIXAsN-l9V-KM", - "type": "arrow" - } - ], - "updated": 1728476246386, - "link": null, - "locked": false - }, - { - "type": "text", - "version": 154, - "versionNonce": 475348291, - "index": "a5", - "isDeleted": false, - "id": "0BVvfJxOwcoQayQ00TfmD", - "fillStyle": "solid", - "strokeWidth": 2, - "strokeStyle": "solid", - "roughness": 1, - "opacity": 100, - "angle": 0, - "x": 900.2544155256526, - "y": 274.59882501179914, - "strokeColor": "#2f9e44", - "backgroundColor": "#b2f2bb", - "width": 49.319915771484375, - "height": 75, - "seed": 1670137965, - "groupIds": [], - "frameId": null, - "roundness": null, - "boundElements": [], - "updated": 1728476076289, - "link": null, - "locked": false, 
- "fontSize": 20, - "fontFamily": 5, - "text": "----\n------\n--", - "textAlign": "left", - "verticalAlign": "middle", - "containerId": "_r06E4r0gRqSuzSYMWU03", - "originalText": "----\n------\n--", - "autoResize": true, - "lineHeight": 1.25 - }, - { - "type": "rectangle", - "version": 261, - "versionNonce": 1976627011, - "index": "a6", - "isDeleted": false, - "id": "65VwEzbeZ74g5DrIjduzQ", - "fillStyle": "solid", - "strokeWidth": 2, - "strokeStyle": "solid", - "roughness": 1, - "opacity": 100, - "angle": 0, - "x": 894.1513417340417, - "y": 377.4458997325685, - "strokeColor": "#2f9e44", - "backgroundColor": "#b2f2bb", - "width": 65.49487433010404, - "height": 92.98828125, - "seed": 169148451, - "groupIds": [], - "frameId": null, - "roundness": { - "type": 3 - }, - "boundElements": [ - { - "type": "text", - "id": "4L-EvDeSL0jkCUphMPuIe" - }, - { - "id": "4tWMepvCHRUWk1zBrcB-4", - "type": "arrow" - } - ], - "updated": 1728476250770, - "link": null, - "locked": false - }, - { - "type": "text", - "version": 185, - "versionNonce": 1315247971, - "index": "a7", - "isDeleted": false, - "id": "4L-EvDeSL0jkCUphMPuIe", - "fillStyle": "solid", - "strokeWidth": 2, - "strokeStyle": "solid", - "roughness": 1, - "opacity": 100, - "angle": 0, - "x": 899.1513417340417, - "y": 398.9400403575685, - "strokeColor": "#2f9e44", - "backgroundColor": "#b2f2bb", - "width": 32.87994384765625, - "height": 50, - "seed": 2087125955, - "groupIds": [], - "frameId": null, - "roundness": null, - "boundElements": [], - "updated": 1728476090550, - "link": null, - "locked": false, - "fontSize": 20, - "fontFamily": 5, - "text": "--\n----", - "textAlign": "left", - "verticalAlign": "middle", - "containerId": "65VwEzbeZ74g5DrIjduzQ", - "originalText": "--\n----", - "autoResize": true, - "lineHeight": 1.25 - }, - { - "type": "rectangle", - "version": 538, - "versionNonce": 172436024, - "index": "a7G", - "isDeleted": false, - "id": "YAdC_kcGvuq5NUQFGFmyL", - "fillStyle": "solid", - "strokeWidth": 2, - 
"strokeStyle": "solid", - "roughness": 1, - "opacity": 100, - "angle": 0, - "x": 627.988828373765, - "y": 417.54040154530134, - "strokeColor": "#2f9e44", - "backgroundColor": "#b2f2bb", - "width": 60.68717471400589, - "height": 75.32101862825932, - "seed": 585375277, - "groupIds": [ - "65TJqao4K4qqpYqj_FAne" - ], - "frameId": null, - "roundness": { - "type": 3 - }, - "boundElements": [ - { - "type": "text", - "id": "V3sA8aDuhMxYvSjON_s9Y" - } - ], - "updated": 1729664824743, - "link": null, - "locked": false - }, - { - "type": "text", - "version": 303, - "versionNonce": 760554296, - "index": "a7V", - "isDeleted": false, - "id": "V3sA8aDuhMxYvSjON_s9Y", - "fillStyle": "solid", - "strokeWidth": 2, - "strokeStyle": "solid", - "roughness": 1, - "opacity": 100, - "angle": 0, - "x": 635.0024270723399, - "y": 442.7009108594311, - "strokeColor": "#2f9e44", - "backgroundColor": "#b2f2bb", - "width": 46.659977316856384, - "height": 25, - "seed": 2044278115, - "groupIds": [ - "65TJqao4K4qqpYqj_FAne" - ], - "frameId": null, - "roundness": null, - "boundElements": [], - "updated": 1729664824743, - "link": null, - "locked": false, - "fontSize": 20, - "fontFamily": 5, - "text": "data", - "textAlign": "center", - "verticalAlign": "middle", - "containerId": "YAdC_kcGvuq5NUQFGFmyL", - "originalText": "data", - "autoResize": true, - "lineHeight": 1.25 - }, - { - "type": "ellipse", - "version": 499, - "versionNonce": 238812216, - "index": "a8", - "isDeleted": false, - "id": "G3mtGsnmChQqWc-elc5js", - "fillStyle": "solid", - "strokeWidth": 2, - "strokeStyle": "solid", - "roughness": 1, - "opacity": 100, - "angle": 0, - "x": 627.7186831867094, - "y": 415.86163595733905, - "strokeColor": "#2f9e44", - "backgroundColor": "#b2f2bb", - "width": 60.89047846027006, - "height": 20.33037462640425, - "seed": 119200419, - "groupIds": [ - "65TJqao4K4qqpYqj_FAne" - ], - "frameId": null, - "roundness": { - "type": 2 - }, - "boundElements": [], - "updated": 1729664824743, - "link": null, - "locked": 
false - }, - { - "type": "ellipse", - "version": 565, - "versionNonce": 1054848312, - "index": "a9", - "isDeleted": false, - "id": "Rwo3oCY5sxtOAFVzZAI_Q", - "fillStyle": "solid", - "strokeWidth": 2, - "strokeStyle": "solid", - "roughness": 1, - "opacity": 100, - "angle": 0, - "x": 626.8293931475463, - "y": 474.8398514576997, - "strokeColor": "#2f9e44", - "backgroundColor": "#b2f2bb", - "width": 60.89047846027006, - "height": 20.33037462640425, - "seed": 2116237347, - "groupIds": [ - "65TJqao4K4qqpYqj_FAne" - ], - "frameId": null, - "roundness": { - "type": 2 - }, - "boundElements": [], - "updated": 1729664824743, - "link": null, - "locked": false - }, - { - "type": "arrow", - "version": 139, - "versionNonce": 451039693, - "index": "aB", - "isDeleted": false, - "id": "UllqRgtDdGDcvc_20R07g", - "fillStyle": "solid", - "strokeWidth": 2, - "strokeStyle": "solid", - "roughness": 1, - "opacity": 100, - "angle": 0, - "x": 744.4731866034314, - "y": 252.3207560258277, - "strokeColor": "#2f9e44", - "backgroundColor": "#b2f2bb", - "width": 138.20060307028712, - "height": 46.232108598120305, - "seed": 235799085, - "groupIds": [], - "frameId": null, - "roundness": { - "type": 2 - }, - "boundElements": [], - "updated": 1728476906879, - "link": null, - "locked": false, - "startBinding": { - "elementId": "JLN87aw43yq_b66L3aF63", - "focus": -0.13426487645230828, - "gap": 13.517374773327333, - "fixedPoint": null - }, - "endBinding": { - "elementId": "B0__NJqN-xUib6eGz9rJc", - "focus": 0.14385224960085158, - "gap": 12.626922744647686, - "fixedPoint": null - }, - "lastCommittedPoint": null, - "startArrowhead": null, - "endArrowhead": "arrow", - "points": [ - [ - 0, - 0 - ], - [ - 63.87397296488314, - -28.4629457321071 - ], - [ - 138.20060307028712, - -46.232108598120305 - ] - ], - "elbowed": false - }, - { - "type": "arrow", - "version": 60, - "versionNonce": 4172685, - "index": "aC", - "isDeleted": false, - "id": "I8rDfQB7NIXAsN-l9V-KM", - "fillStyle": "solid", - "strokeWidth": 2, - 
"strokeStyle": "solid", - "roughness": 1, - "opacity": 100, - "angle": 0, - "x": 743.7150224372008, - "y": 282.7135957665072, - "strokeColor": "#2f9e44", - "backgroundColor": "#b2f2bb", - "width": 137.44508982771458, - "height": 18.65454978239518, - "seed": 1702292611, - "groupIds": [], - "frameId": null, - "roundness": { - "type": 2 - }, - "boundElements": [], - "updated": 1728476911522, - "link": null, - "locked": false, - "startBinding": { - "elementId": "JLN87aw43yq_b66L3aF63", - "focus": -0.1628316397450537, - "gap": 12.759210607096747, - "fixedPoint": null - }, - "endBinding": { - "elementId": "_r06E4r0gRqSuzSYMWU03", - "focus": 0.19610138562900511, - "gap": 14.094303260737206, - "fixedPoint": null - }, - "lastCommittedPoint": null, - "startArrowhead": null, - "endArrowhead": "arrow", - "points": [ - [ - 0, - 0 - ], - [ - 73.427784298765, - 16.715752990788758 - ], - [ - 137.44508982771458, - 18.65454978239518 - ] - ], - "elbowed": false - }, - { - "type": "arrow", - "version": 176, - "versionNonce": 1832948717, - "index": "aD", - "isDeleted": false, - "id": "4tWMepvCHRUWk1zBrcB-4", - "fillStyle": "solid", - "strokeWidth": 2, - "strokeStyle": "solid", - "roughness": 1, - "opacity": 100, - "angle": 0, - "x": 744.3485931914984, - "y": 303.99255997047, - "strokeColor": "#2f9e44", - "backgroundColor": "#b2f2bb", - "width": 135.04070206977337, - "height": 109.6872682032353, - "seed": 2041756429, - "groupIds": [], - "frameId": null, - "roundness": { - "type": 2 - }, - "boundElements": [], - "updated": 1728476903261, - "link": null, - "locked": false, - "startBinding": { - "elementId": "JLN87aw43yq_b66L3aF63", - "focus": -0.35694488257429613, - "gap": 13.392781361394327, - "fixedPoint": null - }, - "endBinding": { - "elementId": "65VwEzbeZ74g5DrIjduzQ", - "focus": -0.26584341000504674, - "gap": 14.762046472769953, - "fixedPoint": null - }, - "lastCommittedPoint": null, - "startArrowhead": null, - "endArrowhead": "arrow", - "points": [ - [ - 0, - 0 - ], - [ - 
55.08568988587979, - 63.07810832229552 - ], - [ - 135.04070206977337, - 109.6872682032353 - ] - ], - "elbowed": false - }, - { - "id": "st9T3AoBQeJ_95vbcxsj1", - "type": "text", - "x": 867.3730368230083, - "y": 84.19003763586892, - "width": 122.9399191737175, - "height": 50, - "angle": 0, - "strokeColor": "#1e1e1e", - "backgroundColor": "transparent", - "fillStyle": "solid", - "strokeWidth": 2, - "strokeStyle": "solid", - "roughness": 1, - "opacity": 100, - "groupIds": [], - "frameId": null, - "index": "aE", - "roundness": null, - "seed": 1554541880, - "version": 177, - "versionNonce": 591430984, - "isDeleted": false, - "boundElements": null, - "updated": 1729664059752, - "link": null, - "locked": false, - "text": "cookiecutter\nfiles", - "fontSize": 20, - "fontFamily": 5, - "textAlign": "center", - "verticalAlign": "top", - "containerId": null, - "originalText": "cookiecutter\nfiles", - "autoResize": true, - "lineHeight": 1.25 - }, - { - "id": "c1fSeBygovRAsnOrhhrT7", - "type": "text", - "x": 616.0110217544773, - "y": 166.47910013586892, - "width": 90.55996763706207, - "height": 50, - "angle": 0, - "strokeColor": "#1e1e1e", - "backgroundColor": "transparent", - "fillStyle": "solid", - "strokeWidth": 2, - "strokeStyle": "solid", - "roughness": 1, - "opacity": 100, - "groupIds": [], - "frameId": null, - "index": "aF", - "roundness": null, - "seed": 1437105720, - "version": 177, - "versionNonce": 1376903752, - "isDeleted": false, - "boundElements": null, - "updated": 1729664475074, - "link": null, - "locked": false, - "text": "jupyter\nnotebook", - "fontSize": 20, - "fontFamily": 5, - "textAlign": "center", - "verticalAlign": "top", - "containerId": null, - "originalText": "jupyter\nnotebook", - "autoResize": true, - "lineHeight": 1.25 - }, - { - "type": "rectangle", - "version": 209, - "versionNonce": 2089840712, - "index": "aG", - "isDeleted": false, - "id": "Nz8lPayTFz92q9e3M7TaP", - "fillStyle": "solid", - "strokeWidth": 2, - "strokeStyle": "solid", - 
"roughness": 1, - "opacity": 100, - "angle": 0, - "x": 589.6412246579562, - "y": 234.04745951086892, - "strokeColor": "#2f9e44", - "backgroundColor": "#b2f2bb", - "width": 65.49487433010404, - "height": 92.98828125, - "seed": 2058741576, - "groupIds": [], - "frameId": null, - "roundness": { - "type": 3 - }, - "boundElements": [ - { - "type": "text", - "id": "IN0GiRj0GvTOVTkfKgK1H" - } - ], - "updated": 1729664412907, - "link": null, - "locked": false - }, - { - "type": "text", - "version": 124, - "versionNonce": 1387267128, - "index": "aH", - "isDeleted": false, - "id": "IN0GiRj0GvTOVTkfKgK1H", - "fillStyle": "solid", - "strokeWidth": 2, - "strokeStyle": "solid", - "roughness": 1, - "opacity": 100, - "angle": 0, - "x": 595.4586842868067, - "y": 268.0416001358689, - "strokeColor": "#2f9e44", - "backgroundColor": "#b2f2bb", - "width": 53.859955072402954, - "height": 25, - "seed": 688504392, - "groupIds": [], - "frameId": null, - "roundness": null, - "boundElements": [], - "updated": 1729664429710, - "link": null, - "locked": false, - "fontSize": 20, - "fontFamily": 5, - "text": ".ipynb", - "textAlign": "center", - "verticalAlign": "middle", - "containerId": "Nz8lPayTFz92q9e3M7TaP", - "originalText": ".ipynb", - "autoResize": true, - "lineHeight": 1.25 - }, - { - "id": "-y2rHdlaB_nDXveFx7xJG", - "type": "rectangle", - "x": 1122.2636618230083, - "y": 123.92441263586892, - "width": 464.54296875, - "height": 395.54687500000006, - "angle": 0, - "strokeColor": "#1e1e1e", - "backgroundColor": "transparent", - "fillStyle": "solid", - "strokeWidth": 2, - "strokeStyle": "solid", - "roughness": 1, - "opacity": 100, - "groupIds": [], - "frameId": null, - "index": "aI", - "roundness": { - "type": 3 - }, - "seed": 2083747144, - "version": 690, - "versionNonce": 711230520, - "isDeleted": false, - "boundElements": [ - { - "id": "Znv9m5Gkupi1sOudyurhk", - "type": "arrow" - } - ], - "updated": 1729664893816, - "link": null, - "locked": false - }, - { - "id": "hP0oqKmM6W9dxMm28SPV5", - 
"type": "text", - "x": 1250.0021719548442, - "y": 80.09238138586892, - "width": 213.81985473632812, - "height": 25, - "angle": 0, - "strokeColor": "#1e1e1e", - "backgroundColor": "transparent", - "fillStyle": "solid", - "strokeWidth": 2, - "strokeStyle": "solid", - "roughness": 1, - "opacity": 100, - "groupIds": [], - "frameId": null, - "index": "aJ", - "roundness": null, - "seed": 1990330184, - "version": 223, - "versionNonce": 1920899128, - "isDeleted": false, - "boundElements": null, - "updated": 1729664537147, - "link": null, - "locked": false, - "text": "Dagger workflow (Go)", - "fontSize": 20, - "fontFamily": 5, - "textAlign": "center", - "verticalAlign": "top", - "containerId": null, - "originalText": "Dagger workflow (Go)", - "autoResize": true, - "lineHeight": 1.25 - }, - { - "type": "rectangle", - "version": 1234, - "versionNonce": 126162248, - "index": "aK", - "isDeleted": false, - "id": "-OaaCa6citSZfSL--26bu", - "fillStyle": "solid", - "strokeWidth": 2, - "strokeStyle": "solid", - "roughness": 1, - "opacity": 100, - "angle": 0, - "x": 1086.6328024480083, - "y": 43.74863138586886, - "strokeColor": "#1e1e1e", - "backgroundColor": "transparent", - "width": 541.1367187500001, - "height": 684.7695312500001, - "seed": 1441129272, - "groupIds": [], - "frameId": null, - "roundness": { - "type": 3 - }, - "boundElements": [], - "updated": 1729664688197, - "link": null, - "locked": false - }, - { - "id": "ngzs2xhBfVRf42ezGbCUl", - "type": "text", - "x": 1275.4005853447654, - "y": 0.029881385868918642, - "width": 158.31990295648575, - "height": 25, - "angle": 0, - "strokeColor": "#1e1e1e", - "backgroundColor": "transparent", - "fillStyle": "solid", - "strokeWidth": 2, - "strokeStyle": "solid", - "roughness": 1, - "opacity": 100, - "groupIds": [], - "frameId": null, - "index": "aL", - "roundness": null, - "seed": 863763528, - "version": 89, - "versionNonce": 528129336, - "isDeleted": false, - "boundElements": null, - "updated": 1729664575729, - "link": null, - 
"locked": false, - "text": "GitHub workflow", - "fontSize": 20, - "fontFamily": 5, - "textAlign": "center", - "verticalAlign": "top", - "containerId": null, - "originalText": "GitHub workflow", - "autoResize": true, - "lineHeight": 1.25 - }, - { - "id": "BXjqv0PM0qw1N6iPzSfDw", - "type": "text", - "x": 1157.6344732854106, - "y": 194.80331888586892, - "width": 58.35993957519531, - "height": 25, - "angle": 0, - "strokeColor": "#1e1e1e", - "backgroundColor": "transparent", - "fillStyle": "solid", - "strokeWidth": 2, - "strokeStyle": "solid", - "roughness": 1, - "opacity": 100, - "groupIds": [], - "frameId": null, - "index": "aM", - "roundness": null, - "seed": 725878840, - "version": 11, - "versionNonce": 560920136, - "isDeleted": false, - "boundElements": [ - { - "id": "uh8IVkDTdObjIHNorW3Kp", - "type": "arrow" - } - ], - "updated": 1729665352577, - "link": null, - "locked": false, - "text": "func()", - "fontSize": 20, - "fontFamily": 5, - "textAlign": "center", - "verticalAlign": "top", - "containerId": null, - "originalText": "func()", - "autoResize": true, - "lineHeight": 1.25 - }, - { - "type": "text", - "version": 99, - "versionNonce": 1981077048, - "index": "aN", - "isDeleted": false, - "id": "TeFfhpdCWgk6uewLmkZu8", - "fillStyle": "solid", - "strokeWidth": 2, - "strokeStyle": "solid", - "roughness": 1, - "opacity": 100, - "angle": 0, - "x": 1157.1579107854106, - "y": 290.2525376358689, - "strokeColor": "#1e1e1e", - "backgroundColor": "transparent", - "width": 58.35993957519531, - "height": 25, - "seed": 32612920, - "groupIds": [], - "frameId": null, - "roundness": null, - "boundElements": [ - { - "id": "cRKPcwyJGtCsYuysKOHuz", - "type": "arrow" - } - ], - "updated": 1729664723135, - "link": null, - "locked": false, - "fontSize": 20, - "fontFamily": 5, - "text": "func()", - "textAlign": "center", - "verticalAlign": "top", - "containerId": null, - "originalText": "func()", - "autoResize": true, - "lineHeight": 1.25 - }, - { - "type": "text", - "version": 239, - 
"versionNonce": 1482602056, - "index": "aO", - "isDeleted": false, - "id": "dro3cy9ilgBla8VVuTDyf", - "fillStyle": "solid", - "strokeWidth": 2, - "strokeStyle": "solid", - "roughness": 1, - "opacity": 100, - "angle": 0, - "x": 1157.7829107854106, - "y": 385.6470688858689, - "strokeColor": "#1e1e1e", - "backgroundColor": "transparent", - "width": 58.35993957519531, - "height": 25, - "seed": 1530733896, - "groupIds": [], - "frameId": null, - "roundness": null, - "boundElements": [ - { - "id": "AXgrYyTW973aRm0IHdYT3", - "type": "arrow" - } - ], - "updated": 1729664746633, - "link": null, - "locked": false, - "fontSize": 20, - "fontFamily": 5, - "text": "func()", - "textAlign": "center", - "verticalAlign": "top", - "containerId": null, - "originalText": "func()", - "autoResize": true, - "lineHeight": 1.25 - }, - { - "type": "text", - "version": 327, - "versionNonce": 1332905032, - "index": "aP", - "isDeleted": false, - "id": "wjxOB1JKv9GlC0Lbw956k", - "fillStyle": "solid", - "strokeWidth": 2, - "strokeStyle": "solid", - "roughness": 1, - "opacity": 100, - "angle": 1.5707963267948957, - "x": 1170.6991351458767, - "y": 335.0884751358689, - "strokeColor": "#1e1e1e", - "backgroundColor": "transparent", - "width": 32.839990854263306, - "height": 25, - "seed": 228182584, - "groupIds": [], - "frameId": null, - "roundness": null, - "boundElements": [], - "updated": 1729664617482, - "link": null, - "locked": false, - "fontSize": 20, - "fontFamily": 5, - "text": ". . .", - "textAlign": "center", - "verticalAlign": "top", - "containerId": null, - "originalText": ". . 
.", - "autoResize": true, - "lineHeight": 1.25 - }, - { - "type": "text", - "version": 401, - "versionNonce": 722574904, - "index": "aQ", - "isDeleted": false, - "id": "1z7PEREDR3810eTeecA3e", - "fillStyle": "solid", - "strokeWidth": 2, - "strokeStyle": "solid", - "roughness": 1, - "opacity": 100, - "angle": 1.5707963267948957, - "x": 1171.9022601458767, - "y": 240.56113138586892, - "strokeColor": "#1e1e1e", - "backgroundColor": "transparent", - "width": 32.839990854263306, - "height": 25, - "seed": 1724125768, - "groupIds": [], - "frameId": null, - "roundness": null, - "boundElements": [], - "updated": 1729664623156, - "link": null, - "locked": false, - "fontSize": 20, - "fontFamily": 5, - "text": ". . .", - "textAlign": "center", - "verticalAlign": "top", - "containerId": null, - "originalText": ". . .", - "autoResize": true, - "lineHeight": 1.25 - }, - { - "id": "jwf3ddjfOaUORekzMD35s", - "type": "ellipse", - "x": 1392.6660055730083, - "y": 554.2927155035721, - "width": 111.17578125000006, - "height": 113.34151926459364, - "angle": 0, - "strokeColor": "#2f9e44", - "backgroundColor": "#b2f2bb", - "fillStyle": "solid", - "strokeWidth": 2, - "strokeStyle": "solid", - "roughness": 1, - "opacity": 100, - "groupIds": [], - "frameId": null, - "index": "aR", - "roundness": { - "type": 2 - }, - "seed": 771321912, - "version": 310, - "versionNonce": 1984615992, - "isDeleted": false, - "boundElements": [ - { - "type": "text", - "id": "RxzaG_y5w8iuXeNiyA13B" - }, - { - "id": "Znv9m5Gkupi1sOudyurhk", - "type": "arrow" - } - ], - "updated": 1729664893817, - "link": null, - "locked": false - }, - { - "id": "RxzaG_y5w8iuXeNiyA13B", - "type": "text", - "x": 1423.537138295892, - "y": 587.2610597185776, - "width": 49.820366978645325, - "height": 47.26027397260275, - "angle": 0, - "strokeColor": "#2f9e44", - "backgroundColor": "transparent", - "fillStyle": "solid", - "strokeWidth": 2, - "strokeStyle": "solid", - "roughness": 1, - "opacity": 100, - "groupIds": [], - "frameId": null, 
- "index": "aS", - "roundness": null, - "seed": 164856888, - "version": 198, - "versionNonce": 489885256, - "isDeleted": false, - "boundElements": null, - "updated": 1729664669783, - "link": null, - "locked": false, - "text": "AI\nmodel", - "fontSize": 18.9041095890411, - "fontFamily": 5, - "textAlign": "center", - "verticalAlign": "middle", - "containerId": "jwf3ddjfOaUORekzMD35s", - "originalText": "AI model", - "autoResize": true, - "lineHeight": 1.25 - }, - { - "id": "tDoGnkvg1SvOnMlUl5YJN", - "type": "text", - "x": 1399.441193542216, - "y": 677.693943885869, - "width": 101.01993656158447, - "height": 25, - "angle": 0, - "strokeColor": "#2f9e44", - "backgroundColor": "#b2f2bb", - "fillStyle": "solid", - "strokeWidth": 2, - "strokeStyle": "solid", - "roughness": 1, - "opacity": 100, - "groupIds": [], - "frameId": null, - "index": "aT", - "roundness": null, - "seed": 1191881784, - "version": 78, - "versionNonce": 30180680, - "isDeleted": false, - "boundElements": null, - "updated": 1729666135096, - "link": null, - "locked": false, - "text": "", - "fontSize": 20, - "fontFamily": 5, - "textAlign": "center", - "verticalAlign": "top", - "containerId": null, - "originalText": "", - "autoResize": true, - "lineHeight": 1.25 - }, - { - "id": "uh8IVkDTdObjIHNorW3Kp", - "type": "arrow", - "x": 978.3925680730083, - "y": 196.36972513586892, - "width": 164.3125, - "height": 11.83984375, - "angle": 0, - "strokeColor": "#2f9e44", - "backgroundColor": "#b2f2bb", - "fillStyle": "solid", - "strokeWidth": 2, - "strokeStyle": "solid", - "roughness": 1, - "opacity": 100, - "groupIds": [], - "frameId": null, - "index": "aU", - "roundness": { - "type": 2 - }, - "seed": 1863024184, - "version": 125, - "versionNonce": 1405676872, - "isDeleted": false, - "boundElements": null, - "updated": 1729665352578, - "link": null, - "locked": false, - "points": [ - [ - 0, - 0 - ], - [ - 78.1015625, - -2.5625 - ], - [ - 164.3125, - 9.27734375 - ] - ], - "lastCommittedPoint": null, - "startBinding": 
null, - "endBinding": { - "elementId": "BXjqv0PM0qw1N6iPzSfDw", - "focus": -0.2666398071226864, - "gap": 14.929405212402344, - "fixedPoint": null - }, - "startArrowhead": null, - "endArrowhead": "arrow", - "elbowed": false - }, - { - "id": "cRKPcwyJGtCsYuysKOHuz", - "type": "arrow", - "x": 981.7714743230083, - "y": 228.66269388586892, - "width": 167.32421875, - "height": 52.72265624999994, - "angle": 0, - "strokeColor": "#2f9e44", - "backgroundColor": "#b2f2bb", - "fillStyle": "solid", - "strokeWidth": 2, - "strokeStyle": "solid", - "roughness": 1, - "opacity": 100, - "groupIds": [], - "frameId": null, - "index": "aV", - "roundness": { - "type": 2 - }, - "seed": 778980920, - "version": 384, - "versionNonce": 53074744, - "isDeleted": false, - "boundElements": null, - "updated": 1729664765311, - "link": null, - "locked": false, - "points": [ - [ - 0, - 0 - ], - [ - 72.44921875, - 13.265625 - ], - [ - 167.32421875, - 52.72265624999994 - ] - ], - "lastCommittedPoint": null, - "startBinding": null, - "endBinding": { - "elementId": "TeFfhpdCWgk6uewLmkZu8", - "focus": 0.23862899610593896, - "gap": 8.867187500000057, - "fixedPoint": null - }, - "startArrowhead": null, - "endArrowhead": "arrow", - "elbowed": false - }, - { - "id": "AXgrYyTW973aRm0IHdYT3", - "type": "arrow", - "x": 978.0995993230083, - "y": 422.6080063858689, - "width": 169.25390625, - "height": 26.6015625, - "angle": 0, - "strokeColor": "#2f9e44", - "backgroundColor": "#b2f2bb", - "fillStyle": "solid", - "strokeWidth": 2, - "strokeStyle": "solid", - "roughness": 1, - "opacity": 100, - "groupIds": [], - "frameId": null, - "index": "aW", - "roundness": { - "type": 2 - }, - "seed": 218440264, - "version": 227, - "versionNonce": 846354504, - "isDeleted": false, - "boundElements": null, - "updated": 1729664750904, - "link": null, - "locked": false, - "points": [ - [ - 0, - 0 - ], - [ - 70.390625, - -6.58203125 - ], - [ - 169.25390625, - -26.6015625 - ] - ], - "lastCommittedPoint": null, - "startBinding": null, - 
"endBinding": { - "elementId": "dro3cy9ilgBla8VVuTDyf", - "focus": 0.551984740547581, - "gap": 10.429405212402344, - "fixedPoint": null - }, - "startArrowhead": null, - "endArrowhead": "arrow", - "elbowed": false - }, - { - "id": "1dp1Y_HqxE55DmlPG2si7", - "type": "text", - "x": 622.6005758855083, - "y": 354.7759751358689, - "width": 71.919921875, - "height": 50, - "angle": 0, - "strokeColor": "#1e1e1e", - "backgroundColor": "#b2f2bb", - "fillStyle": "solid", - "strokeWidth": 2, - "strokeStyle": "solid", - "roughness": 1, - "opacity": 100, - "groupIds": [], - "frameId": null, - "index": "aX", - "roundness": null, - "seed": 917334856, - "version": 139, - "versionNonce": 1956924984, - "isDeleted": false, - "boundElements": null, - "updated": 1729664824743, - "link": null, - "locked": false, - "text": "training\ndata", - "fontSize": 20, - "fontFamily": 5, - "textAlign": "center", - "verticalAlign": "top", - "containerId": null, - "originalText": "training\ndata", - "autoResize": true, - "lineHeight": 1.25 - }, - { - "id": "Znv9m5Gkupi1sOudyurhk", - "type": "arrow", - "x": 1315.4497901628338, - "y": 503.88671081836395, - "width": 72.24824796924963, - "height": 112.52121320570478, - "angle": 0, - "strokeColor": "#1e1e1e", - "backgroundColor": "#b2f2bb", - "fillStyle": "solid", - "strokeWidth": 2, - "strokeStyle": "solid", - "roughness": 1, - "opacity": 100, - "groupIds": [], - "frameId": null, - "index": "aY", - "roundness": { - "type": 2 - }, - "seed": 89092664, - "version": 307, - "versionNonce": 1633095992, - "isDeleted": false, - "boundElements": null, - "updated": 1729664906569, - "link": null, - "locked": false, - "points": [ - [ - 0, - 0 - ], - [ - 8.048580233882603, - 101.01097039524308 - ], - [ - 72.24824796924963, - 112.52121320570478 - ] - ], - "lastCommittedPoint": null, - "startBinding": { - "elementId": "-y2rHdlaB_nDXveFx7xJG", - "focus": 0.2161117000788876, - "gap": 1, - "fixedPoint": null - }, - "endBinding": { - "elementId": "jwf3ddjfOaUORekzMD35s", - 
"focus": -0.28330339832825724, - "gap": 5.203759307960375, - "fixedPoint": null - }, - "startArrowhead": null, - "endArrowhead": "arrow", - "elbowed": false - }, - { - "id": "9eBl1erCu_hetHwjMXiHD", - "type": "text", - "x": 1254.1505690143022, - "y": 194.26972072133148, - "width": 163.29978942871094, - "height": 50, - "angle": 0, - "strokeColor": "#1e1e1e", - "backgroundColor": "#b2f2bb", - "fillStyle": "solid", - "strokeWidth": 2, - "strokeStyle": "solid", - "roughness": 1, - "opacity": 100, - "groupIds": [], - "frameId": null, - "index": "aZ", - "roundness": null, - "seed": 894712648, - "version": 153, - "versionNonce": 101903928, - "isDeleted": false, - "boundElements": null, - "updated": 1729665374438, - "link": null, - "locked": false, - "text": "---- --- ---- -- -- \n--- ------ --", - "fontSize": 20, - "fontFamily": 5, - "textAlign": "left", - "verticalAlign": "top", - "containerId": null, - "originalText": "---- --- ---- -- -- \n--- ------ --", - "autoResize": true, - "lineHeight": 1.25 - }, - { - "type": "text", - "version": 295, - "versionNonce": 369525320, - "index": "aa", - "isDeleted": false, - "id": "66wfMFKhaFTFgt--YGLKB", - "fillStyle": "solid", - "strokeWidth": 2, - "strokeStyle": "solid", - "roughness": 1, - "opacity": 100, - "angle": 0, - "x": 1256.2717347475366, - "y": 289.44159643348974, - "strokeColor": "#1e1e1e", - "backgroundColor": "#b2f2bb", - "width": 212.6197052001953, - "height": 50, - "seed": 350666808, - "groupIds": [], - "frameId": null, - "roundness": null, - "boundElements": [], - "updated": 1729665418723, - "link": null, - "locked": false, - "fontSize": 20, - "fontFamily": 5, - "text": "---- --- ---- -- -- \n--- ------ --- ---- -- ---", - "textAlign": "left", - "verticalAlign": "top", - "containerId": null, - "originalText": "---- --- ---- -- -- \n--- ------ --- ---- -- ---", - "autoResize": true, - "lineHeight": 1.25 - }, - { - "type": "text", - "version": 388, - "versionNonce": 384919352, - "index": "ab", - "isDeleted": false, - 
"id": "GYr7G_bpzSnTPDyT4EVkT", - "fillStyle": "solid", - "strokeWidth": 2, - "strokeStyle": "solid", - "roughness": 1, - "opacity": 100, - "angle": 0, - "x": 1257.6094372160226, - "y": 387.34890624030004, - "strokeColor": "#1e1e1e", - "backgroundColor": "#b2f2bb", - "width": 138.63983154296875, - "height": 50, - "seed": 1629539656, - "groupIds": [], - "frameId": null, - "roundness": null, - "boundElements": [], - "updated": 1729665429111, - "link": null, - "locked": false, - "fontSize": 20, - "fontFamily": 5, - "text": "---- --- - -- -- \n--- -- --", - "textAlign": "left", - "verticalAlign": "top", - "containerId": null, - "originalText": "---- --- - -- -- \n--- -- --", - "autoResize": true, - "lineHeight": 1.25 - } - ], - "appState": { - "gridSize": 20, - "gridStep": 5, - "gridModeEnabled": false, - "viewBackgroundColor": "#ffffff" - }, - "files": {} -} \ No newline at end of file diff --git a/references/problem_specification.md b/references/problem_specification.md deleted file mode 100644 index f2a3a22..0000000 --- a/references/problem_specification.md +++ /dev/null @@ -1,77 +0,0 @@ -# ITU BDS SDSE'24 - Project - -## Task - -Based on the input provided (see below), fork the repository and restructure the code to adhere to the concepts and ideas you have seen throughout the course. The diagram below provides a detailed overview of the structure that the solution is expected to follow. - -![Project architecture](./docs/project-architecture.png) - -For the exam submission, we expect you to submit a pdf containing: -- the list of members of the group -- the link to the github.com private repository hosting your solution. You will need to invite the three of us as collaborators: lasselundstenjensen, Jeppe-T-K, paolotell. 
- -The repository linked in the submission should contain: - -- A README.md file that describes the project -- GitHub automation workflow -- Dagger workflow (in Go) -- All history - - -## Inputs - -You are given the following material: -- Python monolith (see `notebooks` folder) -- Raw input data (see `notebooks/artifacts` folder) -- GitHub action to test model inference (see [`model-validator`](https://github.com/lasselundstenjensen/itu-sdse-project-model-validator) action) - -## Outputs - -- Your GitHub repository (including all history) - - A README.md file that describes the project - - GitHub automation workflow - - Dagger workflow (in Go) -- Model artifact produced by GitHub workflow and named 'model' - -> **NOTE:** -> The Dagger workflow can be run locally or inside the GitHub workflow—both are viable options during development. -> -> The Dagger workflow can run locally and can also be made to produce outputs locally during development. But when wrapping the Dagger workflow in a GitHub workflow, the output is instead stored inside the GitHub runner (i.e. a virtual machine). -> -> Use the publicly available [`actions/upload-artifact`](https://github.com/actions/upload-artifact) to store the model artifact in the GitHub worklow pipeline. -> -> This model artifact can then be picked up by the [action provided](https://github.com/lasselundstenjensen/itu-sdse-project-model-validator), which will run some inference tests to ensure that the correct model was trained. - - -## How will we assess - -Below, we provide information on how we will assess the submission clustered around several aspects. The list relates to groups of size 3; if your group is of size 4, you are expected also to work on the optional items, i.e., to use pull requests and to provide tests. 
- -#### Versioning - -- Use of Git (semantic commit messages, branches, branch longevity, commit frequency/size) -- Management of data -- Use of pull requests (OPTIONAL) - -#### Programming - -- Decomposition of Python notebook -- Adherance to standard data science MLOps project structure -- Presence of tests (OPTIONAL) - -#### Workflow automation - -- Presence of a workflow that trains the model -- Presence of a workflow that tests the model -- Structure of Dagger workflow -- Orchestration of Dagger workflow through GitHub workflow - -#### Documentation (README.md) - -- Description of project structure -- How to run the code and generate the model artifact - - -## Questions - -If you have any questions about the information shared here, please feel free to post them on Learnit. Answers to private emails on this topic will also be shared on Learnit, along with the original email content, so that everyone has access to the same information. diff --git a/references/project_reflections.md b/references/project_reflections.md new file mode 100644 index 0000000..3da707f --- /dev/null +++ b/references/project_reflections.md @@ -0,0 +1,22 @@ +# Reflections + +Below is not the part of the documentation. + +## A few decisions + +- We noticed a few strong signs that XGBoost was supposed to be in the pipeline. We initially included it, but finally decided to wrap the code in such a way that, with a one-line change, one can start comparing LR with XGBoost. Please read more in: _Originally posted by @PLtier in [#3 Issues](https://github.com/PLtier/github-dagger-workflow-project/issues/3#issuecomment-2551304436)_. +- We tried to make the code more explicit. E.g. we made `f1_score` explicitly use `average='binary'`, as both the `binary` and `weighted` versions were in use. +- LR: we realised that the code was performing GridSearch but unfortunately was still saving the unfitted LR. We strongly think it was a bug and decided to store the best_model output by GridSearch. 
+- XGBoost: in the initial code, `classification_report` for LR was computed on `test_data`, whereas for XGBoost it was computed on `train_data` (!). We strongly think it was an oversight and changed it to use `test_data`. +- XGBoost / LR comparison metric: there was one MLFlow comparison (model selection) using the binary F1-score and a second one based on the weighted F1-score (`model_result`). We decided to use `binary` because it was used in the MLFlow comparison, whose intent we are certain of. (As stated above, we still always output LR.) +- We strove to encapsulate as much of the code as possible into functions (no global variables shared except constants). This was to improve + - readability + - better troubleshooting + - not polluting global namespace (so fewer bugs) +- Imports: we sorted them and removed relative imports i.e. we don't do `import utils` in order not to confuse with an external library. +- The code responsible for registering / transition to staging / deployment has not been deleted (except a few lines) but wrapped and left. +- We moved all paths in scripts to external file. + +## What to improve upon + +- Take out tests out of the production code as right now we do it. diff --git a/references/subjective_code_practices_mpj.md b/references/subjective_code_practices_mpj.md index 326049b..329d307 100644 --- a/references/subjective_code_practices_mpj.md +++ b/references/subjective_code_practices_mpj.md @@ -106,17 +106,6 @@ It's easy to refactor notebook code because the ccds template makes your project from classification_fashion_mnist.data import make_dataset ``` -### installing dependencies - -First of all, install `make` - -Due to simplicity we use `venv`. - -```shell -make create_environment # creates .venv -make requirements # fetches & install deps -``` - ### Naming Convention for Notebooks We use name notebooks with a scheme that looks like this: @@ -172,5 +161,7 @@ git push -u origin main #NOT! 
It's blocked besides ``` ### Few practices: + From "Accelerate", Forsgren: -"In short and maybe- counter-intuitively - going faster and releasing more frequently actually LEDs to higher quality products." \ No newline at end of file + +> "In short and maybe counter-intuitively - going faster and releasing more frequently actually leads to higher quality products." From d162f92f478d23f39a101fa0ac67088d52cff982 Mon Sep 17 00:00:00 2001 From: Maciej Jalocha Date: Thu, 19 Dec 2024 11:57:26 +0100 Subject: [PATCH 14/16] docs: update project structure --- README.md | 24 +++++++++++++++--------- 1 file changed, 15 insertions(+), 9 deletions(-) diff --git a/README.md b/README.md index 04ab0d4..f4fd37b 100644 --- a/README.md +++ b/README.md @@ -29,11 +29,13 @@ In this project we were tasked with restructuring a Python monolith using the co │ ├── pipeline.go <- Dagger workflow written in Go │ -├── pyproject.toml <- Configuration file +├── pyproject.toml <- Project metadata and configuration │ ├── .pre-commit-config.yaml <- Checks quality of code before commits │ -├── Makefile.venv <- Creates and manages Python virtual environment +├── Makefile.venv <- Library for managing venv via makefile +│ +├── Makefile <- Project related scripts │ ├── references <- Documentation and extra resources │ @@ -57,11 +59,15 @@ In this project we were tasked with restructuring a Python monolith using the co │ ├── 05_model_deployment.py <- Script for deploying model │ + ├── config.py <- Constants and paths used in the pipeline's scripts + │ + ├── pipeline_utils.py <- Encapsulated code from the .py monolith. + │ ├── artifacts │ │ │ └── raw_data.csv.dvc <- Metadata tracked by DVC for data file │ - └── utils.py <- Helper functions + └── utils.py <- Helper functions extracted from the .py monolith ``` # How to run the code @@ -70,7 +76,7 @@ In this project we were tasked with restructuring a Python monolith using the co The workflow can be triggered either on pull requests to `main` or manually. 
-It can be triggered manually [here](https://github.com/PLtier/github-dagger-workflow-project/actions/workflows/test_action.yml) by pressing `Run workflow` on the `main` branch, then refresh the page and the triggered workflow will appear. After all the jobs have been run, the model artifact can be found on the summary page of the run of the first job. We also store other artifacts for convenience. +It can be triggered manually [here](https://github.com/PLtier/github-dagger-workflow-project/actions/workflows/log_and_test_action.yml) by pressing `Run workflow` on the `main` branch, then refresh the page and the triggered workflow will appear. After all the jobs have been run, the model artifact can be found on the summary page of the run of the first job. We also store other artifacts for convenience. The testing is automatically run afterwards to let the user check if it was of a quality. Artifacts are stored for 90 days. @@ -80,15 +86,15 @@ Artifacts are stored for 90 days. For local running you need: -- `docker` -- `dagger` >= 15 +- `docker` (Server): >= 4.36 +- `dagger` >= 0.14 For local development you need as well: - `go` - 1.23.3 is currently used. -- `git` -- `python` >=3.11.9 -- `make` +- `git` >= 2.39 +- `python` >= 3.11 +- `make` >= 3.81 (lower should work too) Then run: From fc7ee493bfbb4c7d9458e741d8216c15f7b71f59 Mon Sep 17 00:00:00 2001 From: Maciej Jalocha Date: Thu, 19 Dec 2024 12:12:48 +0100 Subject: [PATCH 15/16] docs: more reflections --- references/project_reflections.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/references/project_reflections.md b/references/project_reflections.md index 3da707f..e337a4c 100644 --- a/references/project_reflections.md +++ b/references/project_reflections.md @@ -14,8 +14,10 @@ Below is not the part of the documentation. - better troubleshooting - not polluting global namespace (so fewer bugs) - Imports: we sorted them and removed relative imports i.e. 
we don't do `import utils` in order not to confuse with an external library. -- The code responsible for registering / transition to staging / deployment has not been deleted (except a few lines) but wrapped and left. -- We moved all paths in scripts to external file. +- The code responsible for registering / transition to staging / deployment has not been deleted (except a few lines); it is wrapped, though not used. +- We moved all constants in scripts to an external file. +- We realised that in MLFlow runs, LR would copy the whole artifacts folder again to its MLFlow folder, even including the XGBoost model. We think it might be suboptimal, but we leave it as it is because we are uncertain of the motives. +- We keep all artifacts in the artifacts folder because there are not many of them and it makes them easy to test and retrieve during workflows. ## What to improve upon From b20c81d62335cdf43f2332598eb647f0628b018e Mon Sep 17 00:00:00 2001 From: Maciej Jalocha Date: Thu, 19 Dec 2024 12:13:27 +0100 Subject: [PATCH 16/16] docs: more on what to improve upon --- references/project_reflections.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/references/project_reflections.md b/references/project_reflections.md index e337a4c..104e561 100644 --- a/references/project_reflections.md +++ b/references/project_reflections.md @@ -21,4 +21,5 @@ Below is not the part of the documentation. ## What to improve upon -- Take out tests out of the production code as right now we do it. +- Take tests out of the production code, where they currently are +- Modularise pipeline.go. We think that for now it's fine to keep the helper/util functions within the pipeline file as there are not many of them, but we see that we could already have put them somewhere else.