Transformer Hyperparameter Tuning #13
Conversation
Pull Request Overview
This pull request introduces a new transformer model with hyperparameter tuning using Optuna, alongside relevant documentation updates.
- Adds a new module for building and tuning a transformer model.
- Implements data processing, model construction, and hyperparameter optimization logic.
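For context, a minimal sketch of what an Optuna study over these transformer hyperparameters typically looks like. Here `build_and_train` is a hypothetical stand-in for the module's model-construction and training logic, and the search-space values are illustrative rather than the PR's exact ranges:

```python
import optuna

def objective(trial):
    # Search space mirroring the parameters this PR tunes (values illustrative).
    embed_dim = trial.suggest_categorical("embed_dim", [64, 128, 256])
    dense_dim = trial.suggest_categorical("dense_dim", [512, 1024, 1536, 2048])
    num_heads = trial.suggest_categorical("num_heads", [2, 4, 8])
    dropout_rate = trial.suggest_float("dropout_rate", 0.1, 0.5)

    # build_and_train is a hypothetical stand-in for the module's model
    # construction and training; assumed to return a Keras History object.
    history = build_and_train(embed_dim, dense_dim, num_heads, dropout_rate)
    return max(history.history["val_accuracy"])

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=20)
print("Best parameters:", study.best_params)
```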
src/modules/optuna_transformer.py (Outdated)

```python
]

# Train the model
with tf.device("/GPU:0"):
```
Copilot AI commented on May 4, 2025
Consider checking for GPU availability before explicitly setting the device to '/GPU:0' to ensure compatibility on systems without a GPU. This can help avoid potential runtime errors on CPU-only machines.
Suggested change:

```diff
-with tf.device("/GPU:0"):
+device = "/GPU:0" if tf.config.list_physical_devices('GPU') else "/CPU:0"
+with tf.device(device):
```
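Note: `tf.config.list_physical_devices('GPU')` returns an empty list when no GPU is visible to TensorFlow, so the conditional falls back to `/CPU:0` and the `tf.device` context stays valid on CPU-only machines such as CI runners.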
src/modules/optuna_transformer.py (Outdated)

```python
x = PositionalEmbedding(sequence_length, vocab_size, embed_dim)(encoder_inputs)
encoder_outputs = TransformerEncoder(embed_dim, dense_dim, num_heads)(x)

decoder_inputs = tf.keras.Input(shape=(None,), dtype="int32", name="french")
x = PositionalEmbedding(sequence_length, vocab_size, embed_dim)(decoder_inputs)
x = TransformerDecoder(embed_dim, dense_dim, num_heads)(x, encoder_outputs)
x = tf.keras.layers.Dropout(dropout_rate)(x)
decoder_outputs = tf.keras.layers.Dense(vocab_size, activation="softmax")(x)
```
Copilot AI commented on May 4, 2025
[nitpick] Consider using a more descriptive variable name instead of 'x' to improve code readability, especially within the transformer pipeline where multiple transformations are applied.
Suggested change:

```diff
-x = PositionalEmbedding(sequence_length, vocab_size, embed_dim)(encoder_inputs)
-encoder_outputs = TransformerEncoder(embed_dim, dense_dim, num_heads)(x)
-decoder_inputs = tf.keras.Input(shape=(None,), dtype="int32", name="french")
-x = PositionalEmbedding(sequence_length, vocab_size, embed_dim)(decoder_inputs)
-x = TransformerDecoder(embed_dim, dense_dim, num_heads)(x, encoder_outputs)
-x = tf.keras.layers.Dropout(dropout_rate)(x)
-decoder_outputs = tf.keras.layers.Dense(vocab_size, activation="softmax")(x)
+encoder_embeddings = PositionalEmbedding(sequence_length, vocab_size, embed_dim)(encoder_inputs)
+encoder_outputs = TransformerEncoder(embed_dim, dense_dim, num_heads)(encoder_embeddings)
+decoder_inputs = tf.keras.Input(shape=(None,), dtype="int32", name="french")
+decoder_embeddings = PositionalEmbedding(sequence_length, vocab_size, embed_dim)(decoder_inputs)
+decoder_outputs = TransformerDecoder(embed_dim, dense_dim, num_heads)(decoder_embeddings, encoder_outputs)
+dropout_outputs = tf.keras.layers.Dropout(dropout_rate)(decoder_outputs)
+final_outputs = tf.keras.layers.Dense(vocab_size, activation="softmax")(dropout_outputs)
```
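For reference, a sketch of how the renamed tensors would be wired into a complete compiled model, assuming `PositionalEmbedding`, `TransformerEncoder`, and `TransformerDecoder` are the repo's custom layers; the encoder input name, the illustrative sizes, and the compile settings are assumptions, not taken from the PR:

```python
import tensorflow as tf
# PositionalEmbedding, TransformerEncoder, and TransformerDecoder come from
# the repo's transformer components module (import path assumed).
# Illustrative sizes; the PR tunes these via Optuna.
sequence_length, vocab_size = 20, 15000
embed_dim, dense_dim, num_heads, dropout_rate = 64, 1536, 2, 0.3

encoder_inputs = tf.keras.Input(shape=(None,), dtype="int32", name="english")  # input name assumed
encoder_embeddings = PositionalEmbedding(sequence_length, vocab_size, embed_dim)(encoder_inputs)
encoder_outputs = TransformerEncoder(embed_dim, dense_dim, num_heads)(encoder_embeddings)

decoder_inputs = tf.keras.Input(shape=(None,), dtype="int32", name="french")
decoder_embeddings = PositionalEmbedding(sequence_length, vocab_size, embed_dim)(decoder_inputs)
decoder_outputs = TransformerDecoder(embed_dim, dense_dim, num_heads)(decoder_embeddings, encoder_outputs)
dropout_outputs = tf.keras.layers.Dropout(dropout_rate)(decoder_outputs)
final_outputs = tf.keras.layers.Dense(vocab_size, activation="softmax")(dropout_outputs)

transformer = tf.keras.Model([encoder_inputs, decoder_inputs], final_outputs)
transformer.compile(optimizer="rmsprop",
                    loss="sparse_categorical_crossentropy",
                    metrics=["accuracy"])
```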
Pull Request Overview
This pull request focuses on tuning the hyperparameters for the Transformer model used for French-to-English translation and updating related documentation. Key changes include:
- Adding a new argument (transformer_model_path) to load a saved model.
- Adjusting the model hyperparameters (embed_dim, dense_dim, num_heads, and dropout_rate) in the transformer_model function.
- Introducing a new module (optuna_transformer.py) for hyperparameter optimization using Optuna.
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| src/translation_french_english.py | Updated transformer_model: added transformer_model_path parameter and modified hyperparameters. |
| src/modules/optuna_transformer.py | Added a new module to enable hyperparameter tuning with Optuna. |
Pull Request Overview
This PR implements transformer hyperparameter tuning by introducing new testing scripts and an Optuna-based module for hyperparameter optimization while updating the Transformer model parameters and documentation. Key changes include:
- New tests in tests/test_transformer_model.py for model building, training, evaluation, BLEU scoring, and loading.
- Updates to the transformer_model in src/translation_french_english.py with reduced hyperparameter values and dropout addition.
- Introduction of src/modules/optuna_transformer.py for Optuna hyperparameter optimization and workflow updates in .github/workflows/test.yaml.
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| tests/test_transformer_model.py | Added tests to validate model build, training, evaluation, BLEU score computation, and loading. |
| src/translation_french_english.py | Modified model hyperparameters and added dropout in the Transformer architecture. |
| src/modules/optuna_transformer.py | Created an Optuna optimization module for hyperparameter tuning of the Transformer model. |
| .github/workflows/test.yaml | Updated workflow file to include additional file paths for triggering tests. |
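As an illustration of the BLEU-scoring test mentioned in the table above, a hypothetical sketch using NLTK's sentence-level BLEU; `translate_sentence` stands in for the repo's decoding helper and is not taken from the actual test file:

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

def test_bleu_score():
    # Reference and candidate are tokenized; smoothing avoids zero scores
    # on short sentences that miss some higher-order n-gram overlaps.
    reference = [["the", "weather", "is", "nice", "today"]]
    candidate = translate_sentence("il fait beau aujourd'hui").split()  # hypothetical helper
    score = sentence_bleu(reference, candidate,
                          smoothing_function=SmoothingFunction().method1)
    assert 0.0 <= score <= 1.0, "BLEU score should lie between 0 and 1."
```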
Comments suppressed due to low confidence (2)
.github/workflows/test.yaml:10
- [nitpick] The file path 'src/modules/transformer_component.py' may be a typo; consider confirming whether the intended filename is 'transformer_components.py' so the workflow triggers correctly.
  `- "src/modules/transformer_component.py"`
src/translation_french_english.py:59
- [nitpick] The reduced embedding dimension, dense layer size, and number of heads significantly lower the model's capacity; please verify that these changes are intentional and sufficient for the desired performance.
  `embed_dim = 64`
Pull Request Overview
This PR introduces new features and tests for tuning Transformer hyperparameters, updates the Transformer model implementation, and enhances CI workflow triggers.
- Added comprehensive tests for building, training, evaluating, and loading the Transformer model.
- Updated model hyperparameters and refined model architecture in the translation module.
- Introduced a new module for hyperparameter tuning using Optuna and updated GitHub Actions paths for testing.
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| tests/test_transformer_model.py | Added unit tests for building, training, evaluation, and BLEU scoring of the model. |
| src/translation_french_english.py | Updated Transformer model parameters and architecture for improved tuning. |
| src/modules/optuna_transformer.py | Introduces hyperparameter optimization using Optuna for the Transformer model. |
| .github/workflows/test.yaml | Updated workflow paths to include new and modified module files for testing. |
Comments suppressed due to low confidence (1)
tests/test_transformer_model.py:99
- [nitpick] Consider using explicit metric names or indices rather than relying on the number of elements in 'results'; this would make the test more robust if additional metrics are ever included.
  `assert len(results) == 2, "Evaluation did not return loss and accuracy."`
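One way to act on this suggestion, sketched under the assumption that the test evaluates a compiled Keras model: `Model.evaluate` accepts `return_dict=True`, which keys the results by metric name and removes any dependence on result ordering or count:

```python
# model and test_dataset come from the test's setup (assumed here).
results = model.evaluate(test_dataset, return_dict=True)
assert "loss" in results, "Evaluation did not return a loss."
assert "accuracy" in results, "Evaluation did not return an accuracy metric."
```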
```diff
 # Define model parameters
-embed_dim = 128
-dense_dim = 2048
-num_heads = 8
+embed_dim = 64
+dense_dim = 1536
+num_heads = 2
```
Copilot AI commented on May 4, 2025
[nitpick] Consider adding a comment or updating the docstring to explain the rationale behind the updated hyperparameter values (e.g., embed_dim, dense_dim, and num_heads) for improved clarity and maintainability.
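A sketch of what such a comment might look like; the rationale text is illustrative, since the PR itself does not state one:

```python
# Define model parameters.
# Smaller than the original values (128 / 2048 / 8): selected via the
# Optuna study in src/modules/optuna_transformer.py, trading model
# capacity for faster training. (Rationale wording is illustrative.)
embed_dim = 64
dense_dim = 1536
num_heads = 2
```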