NVIDIA-554: DPU-host mode: use ConfigMap for OVN feature enablement instead of per-node script gating#2907
NVIDIA-554: DPU-host mode: use ConfigMap for OVN feature enablement instead of per-node script gating#2907tsorya wants to merge 1 commit intoopenshift:masterfrom
Conversation
|
@tsorya: This pull request references MGMT-21485 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.22.0" version, but no target version was set. DetailsIn response to this: Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Skipping CI for Draft Pull Request. |
WalkthroughRemoved per-node (DPU-host) mode documentation and runtime gating; consolidated feature enablement into cluster-wide ConfigMap entries and simplified ovnkube startup flags across control-plane and script libraries; updated tests and templates to reflect new ConfigMap-driven feature flags. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes ✨ Finishing touches
🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.5.0)level=error msg="Running error: context loading failed: failed to load packages: failed to load packages: failed to load with go/packages: err: exit status 1: stderr: go: inconsistent vendoring in :\n\tgithub.com/Masterminds/semver@v1.5.0: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/Masterminds/sprig/v3@v3.2.3: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/containernetworking/cni@v0.8.0: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/ghodss/yaml@v1.0.1-0.20190212211648-25d852aebe32: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/go-bindata/go-bindata@v3.1.2+incompatible: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/onsi/gomega@v1.38.1: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/ope ... [truncated 17231 characters] ... ired in go.mod, but not marked as explicit in vendor/modules.txt\n\tk8s.io/gengo/v2@v2.0.0-20250922181213-ec3ebc5fd46b: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tk8s.io/kms@v0.34.1: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tk8s.io/kube-aggregator@v0.34.1: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tsigs.k8s.io/randfill@v1.0.0: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tsigs.k8s.io/structured-merge-diff/v6@v6.3.0: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\n\tTo ignore the vendor directory, use -mod=readonly or -mod=mod.\n\tTo sync the vendor directory, run:\n\t\tgo mod vendor\n" Comment |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: tsorya The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
79c2f28 to
9d8ca23
Compare
|
@tsorya: This pull request references MGMT-21485 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@tsorya: This pull request references NVIDIA-554 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@bindata/network/ovn-kubernetes/managed/004-config.yaml`:
- Around line 48-54: Remove or balance the orphaned Go template directives:
delete the stray "{{- end }}" that does not close any "{{- if }}" near the block
that references ".OVN_MULTI_NETWORK_POLICY_ENABLE", and remove the two
consecutive orphaned "{{- end }}" occurrences later (or alternatively add the
appropriate matching "{{- if <condition> }}" blocks if conditional behavior was
intended) so that every "{{- end }}" has a corresponding opening "{{- if ... }}"
and the template parses correctly.
🧹 Nitpick comments (1)
bindata/network/ovn-kubernetes/managed/ovnkube-control-plane.yaml (1)
196-198: Minor: Redundant variable initialization.Line 196 initializes
multi_network_enabled_flagto empty, and line 197 immediately sets it to"--enable-multi-network". The empty initialization on line 196 is unnecessary.♻️ Suggested simplification
- multi_network_enabled_flag= multi_network_enabled_flag="--enable-multi-network"
9d8ca23 to
fc4f0a7
Compare
…nstead of per-node script gating This reverts commit c17dba5 (MGMT-21485) which moved feature enablement from the ConfigMap into per-node script logic gated by OVN_NODE_MODE. Feature flags (egress IP, multicast, multi-network, network segmentation, admin network policy, multi-external-gateway, etc.) are managed via the cluster-wide ConfigMap (004-config.yaml) passed to ovnkube through --config-file. They do not need per-node gating in the startup script. OVN_NODE_MODE remains used only for DPU-host structural differences: gateway interface, ovnkube-node-mode flag, and init-controller. Also re-applies the feature gate cleanup from f5b8490 (removal of OVN_ADMIN_NETWORK_POLICY_ENABLE template references) and removes redundant CLI flags from 008-script-lib.yaml that duplicate what the ConfigMap already provides. Co-authored-by: Cursor <cursoragent@cursor.com>
fc4f0a7 to
f8367a3
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
bindata/network/ovn-kubernetes/managed/ovnkube-control-plane.yaml (1)
208-214:⚠️ Potential issue | 🟠 MajorRemove unused variable definitions that are no longer passed to the ovnkube command.
The variables
multi_network_policy_enabled_flag(lines 208-211) andadmin_network_policy_enabled_flag(line 214) are computed but never used in the exec command (lines 217-239). These settings are now managed via the ConfigMap (004-config.yaml), making these shell variable definitions dead code. Remove them to improve clarity.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindata/network/ovn-kubernetes/managed/ovnkube-control-plane.yaml` around lines 208 - 214, Remove the dead shell variables that are computed but never used: delete the multi_network_policy_enabled_flag definition/block (the conditional setting using {{.OVN_MULTI_NETWORK_POLICY_ENABLE}} that assigns "--enable-multi-networkpolicy") and the admin_network_policy_enabled_flag assignment ("--enable-admin-network-policy"); these flags are not referenced by the ovnkube exec invocation so remove their declarations to clean up the script and rely on the ConfigMap-managed settings instead.
🧹 Nitpick comments (2)
bindata/network/ovn-kubernetes/self-hosted/ovnkube-control-plane.yaml (1)
141-143: Redundant empty initialization ofnetwork_segmentation_enabled_flag.Line 141 initializes
network_segmentation_enabled_flagto empty, but it's unconditionally overwritten on line 143. The empty initialization can be removed.- network_segmentation_enabled_flag= multi_network_enabled_flag="--enable-multi-network" network_segmentation_enabled_flag="--enable-network-segmentation"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindata/network/ovn-kubernetes/self-hosted/ovnkube-control-plane.yaml` around lines 141 - 143, Remove the redundant empty initialization of the variable network_segmentation_enabled_flag (the standalone network_segmentation_enabled_flag= line); it is immediately overwritten by the later assignment network_segmentation_enabled_flag="--enable-network-segmentation", so delete the empty initialization to avoid confusion and dead code while leaving the later assignment intact.bindata/network/ovn-kubernetes/managed/ovnkube-control-plane.yaml (1)
196-197: Redundant empty initialization on line 196.The variable
multi_network_enabled_flagis initialized to empty and immediately overwritten on the next line. Consider removing the redundant empty assignment.- multi_network_enabled_flag= multi_network_enabled_flag="--enable-multi-network"🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@bindata/network/ovn-kubernetes/managed/ovnkube-control-plane.yaml` around lines 196 - 197, Remove the redundant empty initialization of the variable multi_network_enabled_flag that immediately gets overwritten; keep only the meaningful assignment multi_network_enabled_flag="--enable-multi-network" (ensure no other code expects an earlier empty declaration before this assignment).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@bindata/network/ovn-kubernetes/common/008-script-lib.yaml`:
- Line 678: The hardcoded CLI flag `--enable-multicast` in the script lib should
be removed and multicast should be managed via the cluster-wide ConfigMap like
other feature flags; update the `[ovnkubernetesfeature]` section in
004-config.yaml to include `enable-multicast=true` (or `false` as appropriate)
and remove the `--enable-multicast` token from the command construction in
008-script-lib.yaml, or if multicast truly requires an exception, add a clear
comment in 008-script-lib.yaml documenting why it cannot be moved to the
ConfigMap and reference the `[ovnkubernetesfeature]` design decision.
---
Outside diff comments:
In `@bindata/network/ovn-kubernetes/managed/ovnkube-control-plane.yaml`:
- Around line 208-214: Remove the dead shell variables that are computed but
never used: delete the multi_network_policy_enabled_flag definition/block (the
conditional setting using {{.OVN_MULTI_NETWORK_POLICY_ENABLE}} that assigns
"--enable-multi-networkpolicy") and the admin_network_policy_enabled_flag
assignment ("--enable-admin-network-policy"); these flags are not referenced by
the ovnkube exec invocation so remove their declarations to clean up the script
and rely on the ConfigMap-managed settings instead.
---
Nitpick comments:
In `@bindata/network/ovn-kubernetes/managed/ovnkube-control-plane.yaml`:
- Around line 196-197: Remove the redundant empty initialization of the variable
multi_network_enabled_flag that immediately gets overwritten; keep only the
meaningful assignment multi_network_enabled_flag="--enable-multi-network"
(ensure no other code expects an earlier empty declaration before this
assignment).
In `@bindata/network/ovn-kubernetes/self-hosted/ovnkube-control-plane.yaml`:
- Around line 141-143: Remove the redundant empty initialization of the variable
network_segmentation_enabled_flag (the standalone
network_segmentation_enabled_flag= line); it is immediately overwritten by the
later assignment
network_segmentation_enabled_flag="--enable-network-segmentation", so delete the
empty initialization to avoid confusion and dead code while leaving the later
assignment intact.
| ${dns_name_resolver_enabled_flag} \ | ||
| ${network_observability_enabled_flag} \ | ||
| ${enable_multicast_flag} \ | ||
| --enable-multicast \ |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# First, check the actual content at line 678 with context
echo "=== Content around line 678 ==="
sed -n '670,690p' bindata/network/ovn-kubernetes/common/008-script-lib.yaml
echo -e "\n=== Search for multicast references ==="
rg -n 'multicast' bindata/network/ovn-kubernetes/
echo -e "\n=== Recent git history for the file ==="
git log --oneline -n 15 -- bindata/network/ovn-kubernetes/common/008-script-lib.yaml
echo -e "\n=== Check for multicast in ConfigMaps or configurations ==="
rg -n 'multicast' bindata/network/ovn-kubernetes/common/ -A 2 -B 2Repository: openshift/cluster-network-operator
Length of output: 1811
🏁 Script executed:
#!/bin/bash
echo "=== Show the specific commit for OVN feature enablement ==="
git show --stat f8367a3a | head -50
echo -e "\n=== Check for ConfigMap files in OVN kubernetes directory ==="
find bindata/network/ovn-kubernetes -name '*config*' -o -name '*ConfigMap*' | head -20
echo -e "\n=== Search for multicast in all config/ConfigMap files ==="
rg -l 'ConfigMap\|config' bindata/network/ovn-kubernetes/common/ | head -10
echo -e "\n=== Check 004-config.yaml for multicast or ConfigMap setup ==="
if [ -f bindata/network/ovn-kubernetes/common/004-config.yaml ]; then
echo "=== File exists, showing first 100 lines ==="
head -100 bindata/network/ovn-kubernetes/common/004-config.yaml
fi
echo -e "\n=== Search for how other flags are set from ConfigMap ==="
rg -n '\$\{' bindata/network/ovn-kubernetes/common/008-script-lib.yaml | head -20Repository: openshift/cluster-network-operator
Length of output: 4118
🏁 Script executed:
#!/bin/bash
echo "=== Check managed 004-config.yaml for multicast settings ==="
if [ -f bindata/network/ovn-kubernetes/managed/004-config.yaml ]; then
cat bindata/network/ovn-kubernetes/managed/004-config.yaml
fi
echo -e "\n=== Check self-hosted 004-config.yaml for multicast settings ==="
if [ -f bindata/network/ovn-kubernetes/self-hosted/004-config.yaml ]; then
cat bindata/network/ovn-kubernetes/self-hosted/004-config.yaml
fi
echo -e "\n=== Search for all --enable-multicast references ==="
git log --all --oneline -- '**/*script-lib.yaml' | grep -i multicast
echo -e "\n=== Check what was removed/changed in the commit regarding multicast ==="
git show f8367a3a -- bindata/network/ovn-kubernetes/common/008-script-lib.yaml | grep -A 5 -B 5 'multicast' || echo "No multicast changes in diff"Repository: openshift/cluster-network-operator
Length of output: 9604
Remove hardcoded --enable-multicast flag or add it to ConfigMap configuration.
The --enable-multicast flag at line 678 is hardcoded, inconsistent with the design established in commit f8367a3. That commit explicitly states feature flags (including multicast) should be "managed via the cluster-wide ConfigMap (004-config.yaml)" rather than hardcoded CLI flags. All similar feature flags (egress-ip, egress-firewall, egress-qos, etc.) are defined in the ConfigMap's [ovnkubernetesfeature] section, but enable-multicast is missing there. Either add enable-multicast=true to the ConfigMap's [ovnkubernetesfeature] section and remove the hardcoded CLI flag, or document why multicast requires hardcoding as an exception.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@bindata/network/ovn-kubernetes/common/008-script-lib.yaml` at line 678, The
hardcoded CLI flag `--enable-multicast` in the script lib should be removed and
multicast should be managed via the cluster-wide ConfigMap like other feature
flags; update the `[ovnkubernetesfeature]` section in 004-config.yaml to include
`enable-multicast=true` (or `false` as appropriate) and remove the
`--enable-multicast` token from the command construction in 008-script-lib.yaml,
or if multicast truly requires an exception, add a clear comment in
008-script-lib.yaml documenting why it cannot be moved to the ConfigMap and
reference the `[ovnkubernetesfeature]` design decision.
|
@tsorya: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
NVIDIA-554: DPU-host mode: use ConfigMap for OVN feature enablement instead of per-node script gating
This reverts commit c17dba5 (MGMT-21485) which moved feature
enablement from the ConfigMap into per-node script logic gated
by OVN_NODE_MODE.
Feature flags (egress IP, multicast, multi-network, network
segmentation, admin network policy, multi-external-gateway, etc.)
are managed via the cluster-wide ConfigMap (004-config.yaml) passed
to ovnkube through --config-file. They do not need per-node gating
in the startup script.
OVN_NODE_MODE remains used only for DPU-host structural differences:
gateway interface, ovnkube-node-mode flag, and init-controller.
Also re-applies the feature gate cleanup from f5b8490 (removal of
OVN_ADMIN_NETWORK_POLICY_ENABLE template references) and removes
redundant CLI flags from 008-script-lib.yaml that duplicate what
the ConfigMap already provides.
Co-authored-by: Cursor cursoragent@cursor.com