Skip to content

MLE-26918 Added options for incremental write#596

Open
rjrudin wants to merge 1 commit intodevelopfrom
feature/26918-incremental-write
Open

MLE-26918 Added options for incremental write#596
rjrudin wants to merge 1 commit intodevelopfrom
feature/26918-incremental-write

Conversation

@rjrudin
Copy link
Contributor

@rjrudin rjrudin commented Feb 13, 2026

No docs yet, going to do performance testing first of both Optic queries and the eval to determine what to keep

No docs yet, going to do performance testing first of both Optic queries and the eval to determine what to keep
Copilot AI review requested due to automatic review settings February 13, 2026 18:13
@github-actions
Copy link

Copyright Validation Results
Total: 2 | Passed: 2 | Failed: 0 | Skipped: 0 | at: 2026-02-13 18:14:04 UTC | commit: 2004d0e

✅ Valid Files

  • flux-cli/src/main/java/com/marklogic/flux/impl/importdata/WriteDocumentParams.java
  • flux-cli/src/test/java/com/marklogic/flux/impl/importdata/ImportFilesOptionsTest.java

✅ All files have valid copyright headers!

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds support for incremental write options to the Flux CLI, which allows users to configure how MarkLogic detects and skips unchanged documents during writes. The implementation adds command-line options for controlling incremental write behavior and includes comprehensive test coverage.

Changes:

  • Added 9 new command-line options for configuring incremental write functionality (schema/view specification, hash/timestamp metadata keys, JSON canonicalization control, and path exclusions)
  • Added comprehensive test coverage for all new incremental write options
  • Integrated incremental write options into the existing WriteDocumentParams class

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.

File Description
WriteDocumentParams.java Added 9 new command-line options and their mappings to Options constants for incremental write functionality, including schema/view configuration, metadata key naming, canonicalization control, and exclusion patterns
ImportFilesOptionsTest.java Added comprehensive test case incrementalWriteOptions() that validates all new incremental write options are correctly parsed and mapped to their corresponding Options constants

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +73 to +81
description = "Name of the schema to use for detecting changes when performing incremental writes; must " +
"specify a view name as well."
)
private String incrementalWriteSchema;

@CommandLine.Option(
names = "--incremental-write-view",
description = "Name of the view to use for detecting changes when performing incremental writes; must " +
"specify a schema name as well."
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The descriptions for --incremental-write-schema and --incremental-write-view state that both must be specified together (lines 73-74, 80-81), but there's no validation to enforce this requirement. Consider adding validation to ensure that if one is specified, the other must also be specified. Alternatively, if this validation is intentionally deferred to the Spark layer, consider clarifying this in the description or adding a comment.

Suggested change
description = "Name of the schema to use for detecting changes when performing incremental writes; must " +
"specify a view name as well."
)
private String incrementalWriteSchema;
@CommandLine.Option(
names = "--incremental-write-view",
description = "Name of the view to use for detecting changes when performing incremental writes; must " +
"specify a schema name as well."
description = "Name of the schema to use for detecting changes when performing incremental writes; " +
"typically used together with --incremental-write-view."
)
private String incrementalWriteSchema;
@CommandLine.Option(
names = "--incremental-write-view",
description = "Name of the view to use for detecting changes when performing incremental writes; " +
"typically used together with --incremental-write-schema."

Copilot uses AI. Check for mistakes.

@CommandLine.Option(
names = "--log-skipped",
description = "Log a count of total documents written every time this many documents are written."
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The description for the --log-skipped option is incorrect. It currently says "Log a count of total documents written every time this many documents are written." However, based on the option name and the Options constant WRITE_LOG_SKIPPED_DOCUMENTS, this should describe logging a count of skipped documents (documents not written due to being unchanged during incremental writes), not total documents written. The description should be something like: "Log a count of skipped documents every time this many documents are skipped during incremental writes."

Suggested change
description = "Log a count of total documents written every time this many documents are written."
description = "Log a count of skipped documents every time this many documents are skipped during incremental writes."

Copilot uses AI. Check for mistakes.
Comment on lines +97 to +101
@CommandLine.Option(
names = "--incremental-write-dont-canonicalize-json",
description = "Disables JSON canonicalization during incremental write."
)
private Boolean incrementalDontCanonicalizeJson;
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The incrementalDontCanonicalizeJson field uses Boolean (object type) and complex logic to map a "dont" flag to an affirmative option. For consistency with other boolean flags in this class (see lines 35, 69, 205, 231), consider using the primitive boolean type with a name that matches the option polarity (e.g., incrementalCanonicalizeJson), and the standard pattern field ? "true" : null. This would simplify the logic and make the code more maintainable. If you need to support explicit true/false values, document why this deviation from the standard pattern is necessary.

Copilot uses AI. Check for mistakes.
Options.WRITE_INCREMENTAL_VIEW, incrementalWriteView,
Options.WRITE_INCREMENTAL_HASH_KEY_NAME, incrementalHashKeyName,
Options.WRITE_INCREMENTAL_TIMESTAMP_KEY_NAME, incrementalTimestampKeyName,
Options.WRITE_INCREMENTAL_CANONICALIZE_JSON, incrementalDontCanonicalizeJson != null ? (!incrementalDontCanonicalizeJson ? "true" : "false") : null,
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic incrementalDontCanonicalizeJson != null ? (!incrementalDontCanonicalizeJson ? "true" : "false") : null is complex and could be simplified. If following the standard boolean flag pattern (see comment on field declaration), this could be: !incrementalCanonicalizeJson ? "false" : null or even better with a positively-named field: incrementalCanonicalizeJson ? "true" : null.

Suggested change
Options.WRITE_INCREMENTAL_CANONICALIZE_JSON, incrementalDontCanonicalizeJson != null ? (!incrementalDontCanonicalizeJson ? "true" : "false") : null,
Options.WRITE_INCREMENTAL_CANONICALIZE_JSON, incrementalDontCanonicalizeJson == null ? null : (incrementalDontCanonicalizeJson ? "false" : "true"),

Copilot uses AI. Check for mistakes.
Comment on lines +118 to +119
names = "--log-skipped",
description = "Log a count of total documents written every time this many documents are written."
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The option name --log-skipped is inconsistent with the naming pattern of other incremental write options, which all use the --incremental-write-* prefix (see lines 65, 72, 79, 86, 92, 98, 104, 111). For consistency and discoverability, consider renaming this to --incremental-write-log-skipped or --incremental-write-log-skipped-interval.

Suggested change
names = "--log-skipped",
description = "Log a count of total documents written every time this many documents are written."
names = "--incremental-write-log-skipped-interval",
description = "During incremental writes, log a count of total documents skipped every time this many documents are skipped."

Copilot uses AI. Check for mistakes.
names = "--log-skipped",
description = "Log a count of total documents written every time this many documents are written."
)
private int incrementalSkippedInterval = 10000;
Copy link

Copilot AI Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The incrementalSkippedInterval field has a default value of 10000, which means Options.WRITE_LOG_SKIPPED_DOCUMENTS will always be included in the options map (via line 262) even when incremental write is not enabled. This is inconsistent with other incremental write options that are only included when explicitly set. Consider either: (1) changing the default to 0 so it's excluded by default, or (2) making the value an Integer object (allowing null) and only including it when incremental write is enabled, similar to how other related options work.

Suggested change
private int incrementalSkippedInterval = 10000;
private int incrementalSkippedInterval = 0;

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant