OCPBUGS-76964: HyperShift: relax ovnkube-control-plane zone anti-affinity#2909

Open
ricky-rav wants to merge 1 commit into openshift:master from ricky-rav:affinityHP
ricky-rav commented Feb 16, 2026

The required zone anti-affinity (topology.kubernetes.io/zone) on the managed ovnkube-control-plane deployment can prevent scheduling when the management cluster has only one zone, because the second replica then has no eligible zone to land in.

Replace it with:

  • a hard hostname anti-affinity (guaranteeing that two replicas land on different nodes) and
  • a soft zone anti-affinity (preferring different zones when possible).

The hard zone requirement dates back to how we ran OVNK on HyperShift until 4.13: ovnkube-master ran as a stateful set in which each pod had its own OVNDB, and we probably wanted to spread the DBs across zones to avoid Raft quorum problems when a zone went down.

podAntiAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
  - labelSelector:
      matchLabels:
        app: ovnkube-master
    topologyKey: topology.kubernetes.io/zone
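
For comparison, the replacement described above could be sketched roughly as follows. This is an illustration, not the actual diff: the app: ovnkube-control-plane selector is taken from the automated review summary on this PR and is assumed to apply to both rules, and the weight value is an assumption chosen only to make the fragment complete.

```yaml
# Illustrative sketch only; values marked "assumed" do not come from the PR diff.
affinity:
  podAntiAffinity:
    # Hard rule: two replicas must never be scheduled on the same node.
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          app: ovnkube-control-plane   # assumed selector for this rule
      topologyKey: kubernetes.io/hostname
    # Soft rule: prefer different zones, but still schedule if only one zone exists.
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100                      # assumed weight (valid range is 1-100)
      podAffinityTerm:
        labelSelector:
          matchLabels:
            app: ovnkube-control-plane
        topologyKey: topology.kubernetes.io/zone
```

With a shape like this, a single-zone management cluster can still place both replicas as long as it has at least two schedulable nodes, while multi-zone clusters keep spreading replicas across zones when capacity allows.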

The required zone in the anti-affinity section on the managed ovnkube-control-plane
deployment can prevent scheduling when the management cluster has one zone only.
Replace it with a hard hostname anti-affinity (guaranteeing that two
replicas land on different nodes) and a soft zone anti-affinity (preferring
different zones when possible).

Signed-off-by: Riccardo Ravaioli <rravaiol@redhat.com>

coderabbitai bot commented Feb 16, 2026

No actionable comments were generated in the recent review. 🎉


Walkthrough

Modifies the ovnkube-control-plane pod anti-affinity configuration, replacing zone-based hard anti-affinity with hostname-based hard anti-affinity while reintroducing zone-based anti-affinity as a soft preference.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Pod Affinity Configuration: bindata/network/ovn-kubernetes/managed/ovnkube-control-plane.yaml | Changed pod anti-affinity topologyKey from topology.kubernetes.io/zone to kubernetes.io/hostname in the required scheduling rule. Added a new preferred anti-affinity block with zone-based soft anti-affinity for app: ovnkube-control-plane pods. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes


@openshift-ci openshift-ci bot requested review from arghosh93 and pliurh February 16, 2026 17:46

openshift-ci bot commented Feb 16, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: ricky-rav
Once this PR has been reviewed and has the lgtm label, please assign jcaamano for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ricky-rav ricky-rav changed the title HyperShift: relax ovnkube-control-plane zone anti-affinity OCPBUGS-76964: HyperShift: relax ovnkube-control-plane zone anti-affinity Feb 16, 2026
@openshift-ci-robot openshift-ci-robot added jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. labels Feb 16, 2026
@openshift-ci-robot

@ricky-rav: This pull request references Jira Issue OCPBUGS-76964, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.22.0) matches configured target version for branch (4.22.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

No GitHub users were found matching the public email listed for the QA contact in Jira (yli2@redhat.com), skipping review request.

The bug has been updated to refer to the pull request using the external bug tracker.


@ricky-rav

/retest-required


openshift-ci bot commented Feb 17, 2026

@ricky-rav: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/security | 223579a | link | false | /test security |
| ci/prow/e2e-aws-ovn-hypershift-conformance | 223579a | link | true | /test e2e-aws-ovn-hypershift-conformance |
| ci/prow/e2e-aws-ovn-serial-1of2 | 223579a | link | true | /test e2e-aws-ovn-serial-1of2 |
| ci/prow/e2e-azure-ovn-upgrade | 223579a | link | true | /test e2e-azure-ovn-upgrade |
