Skip to content

OCPBUGS-90536: openstack: Guard network resource names on os_net_id being defined#10639

Open
stephenfin wants to merge 2 commits into
openshift:mainfrom
shiftstack:ignore-os_net_id
Open

OCPBUGS-90536: openstack: Guard network resource names on os_net_id being defined#10639
stephenfin wants to merge 2 commits into
openshift:mainfrom
shiftstack:ignore-os_net_id

Conversation

@stephenfin

@stephenfin stephenfin commented Jun 19, 2026

Copy link
Copy Markdown
Contributor

The Compute network resource names task in common.yaml uses the os_net_id variable, which is loaded from netid.json. Since commit 7b6b3f1 (OCPBUGS-39285), netid.json is loaded conditionally via include_vars (skipped when the file doesn't exist), but the task that consumes os_net_id had no corresponding guard. This causes the UPI deprovision playbook to fail with 'os_net_id' is undefined when netid.json is not present in the working directory.

Add a when: os_net_id is defined guard, consistent with the existing when: sym.stat.exists guard on the Compute resource names task above.

Note that the deprovision playbooks (e.g. down-network.yaml) do use these network resource names, but they already handle missing resources gracefully - they list resources by tag first and use state: absent for deletion. Without os_net_id the names can't be computed at all, so skipping is the only safe option. More importantly, failing hard here prevents the entire deprovision run from cleaning up resources that can be cleaned up (servers, security groups, etc. that use infraID from metadata.json rather than os_net_id).

Summary by CodeRabbit

  • Bug Fixes
    • Network resource name generation now runs only when the required network identifier is available, reducing deployment-time errors.
    • Down-network operations now include a floating IP cleanup phase for the target router, deleting any associated floating IPs before router and network teardown.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jun 19, 2026
@openshift-ci-robot

Copy link
Copy Markdown
Contributor

@stephenfin: This pull request explicitly references no jira issue.

Details

In response to this:

The 'Compute network resource names' task in common.yaml uses the os_net_id variable, which is loaded from netid.json. Since commit 7b6b3f1 (OCPBUGS-39285), netid.json is loaded conditionally via include_vars (skipped when the file doesn't exist), but the task that consumes os_net_id had no corresponding guard. This causes the UPI deprovision playbook to fail with 'os_net_id' is undefined when netid.json is not present in the working directory.

Add a when: os_net_id is defined guard, consistent with the existing when: sym.stat.exists guard on the Compute resource names task above.

Note that the deprovision playbooks (e.g. down-network.yaml) do use these network resource names, but they already handle missing resources gracefully - they list resources by tag first and use state: absent for deletion. Without os_net_id the names can't be computed at all, so skipping is the only safe option. More importantly, failing hard here prevents the entire deprovision run from cleaning up resources that can be cleaned up (servers, security groups, etc. that use infraID from metadata.json rather than os_net_id).

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai

coderabbitai Bot commented Jun 19, 2026

Copy link
Copy Markdown

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 12a7ebca-eecf-417a-acf2-8229793a2633

📥 Commits

Reviewing files that changed from the base of the PR and between b487707 and a211562.

📒 Files selected for processing (2)
  • upi/openstack/common.yaml
  • upi/openstack/down-network.yaml
🚧 Files skipped from review as they are similar to previous changes (2)
  • upi/openstack/common.yaml
  • upi/openstack/down-network.yaml

📝 Walkthrough

Walkthrough

The PR adds a conditional guard for OpenStack network resource naming and adds floating IP cleanup to network teardown before router removal.

Changes

OpenStack UPI Network Cleanup and Guards

Layer / File(s) Summary
Conditional network name derivation
upi/openstack/common.yaml
Adds when: os_net_id is defined to the network resource name set_fact task.
Floating IP deletion during teardown
upi/openstack/down-network.yaml
Lists floating IPs for os_router and deletes them in bulk before router removal when any are present.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 15
✅ Passed checks (15 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: guarding OpenStack network resource name computation when os_net_id is defined.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names ✅ Passed No Ginkgo test titles were added or changed; the PR only touches OpenStack Ansible YAML tasks.
Test Structure And Quality ✅ Passed No Ginkgo test code was changed; the PR only updates Ansible YAML playbooks and a skill doc, so the test-structure criteria are not applicable.
Microshift Test Compatibility ✅ Passed Only Ansible playbooks changed; no new Ginkgo e2e tests or MicroShift-sensitive APIs were added.
Single Node Openshift (Sno) Test Compatibility ✅ Passed PR only changes Ansible playbooks; no new Ginkgo e2e tests or SNO-sensitive test logic were added.
Topology-Aware Scheduling Compatibility ✅ Passed Only OpenStack Ansible cleanup playbooks changed; no pod scheduling, affinity, nodeSelector, replica, or topology logic was added.
Ote Binary Stdout Contract ✅ Passed Only YAML playbooks changed; no process-level Go code or stdout writes were introduced.
Ipv6 And Disconnected Network Test Compatibility ✅ Passed PASS: The PR only changes OpenStack Ansible playbooks; no new Ginkgo e2e test specs or IPv4/external-connectivity checks were added.
No-Weak-Crypto ✅ Passed The touched Ansible playbooks only add a guard and FIP cleanup; no MD5/SHA1/DES/RC4/3DES/Blowfish/ECB or custom secret comparisons found.
Container-Privileges ✅ Passed Changed files are Ansible playbooks only; searches found no privileged/securityContext fields or Kubernetes manifest privilege settings.
No-Sensitive-Data-In-Logs ✅ Passed PASS: The changed playbooks add cleanup only; no debug/no_log statements or handling of passwords, tokens, PII, or hostnames were added.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands.

@stephenfin stephenfin changed the title NO-JIRA: openstack: Guard network resource names on os_net_id being defined NO-JIRA: openstack: Guard network resource names on os_net_id being defined Jun 19, 2026
@openshift-ci openshift-ci Bot requested review from gryf and mandre June 19, 2026 16:28
@stephenfin stephenfin changed the title NO-JIRA: openstack: Guard network resource names on os_net_id being defined OCPBUGS-90536: openstack: Guard network resource names on os_net_id being defined Jun 19, 2026
@openshift-ci-robot openshift-ci-robot added the jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. label Jun 19, 2026
@openshift-ci-robot

Copy link
Copy Markdown
Contributor

@stephenfin: This pull request references Jira Issue OCPBUGS-90536, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

The bug has been updated to refer to the pull request using the external bug tracker.

Details

In response to this:

The Compute network resource names task in common.yaml uses the os_net_id variable, which is loaded from netid.json. Since commit 7b6b3f1 (OCPBUGS-39285), netid.json is loaded conditionally via include_vars (skipped when the file doesn't exist), but the task that consumes os_net_id had no corresponding guard. This causes the UPI deprovision playbook to fail with 'os_net_id' is undefined when netid.json is not present in the working directory.

Add a when: os_net_id is defined guard, consistent with the existing when: sym.stat.exists guard on the Compute resource names task above.

Note that the deprovision playbooks (e.g. down-network.yaml) do use these network resource names, but they already handle missing resources gracefully - they list resources by tag first and use state: absent for deletion. Without os_net_id the names can't be computed at all, so skipping is the only safe option. More importantly, failing hard here prevents the entire deprovision run from cleaning up resources that can be cleaned up (servers, security groups, etc. that use infraID from metadata.json rather than os_net_id).

Summary by CodeRabbit

  • Bug Fixes
  • Network resource configuration now executes conditionally, preventing errors in deployment scenarios where required network parameters are not available.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@mandre

mandre commented Jun 22, 2026

Copy link
Copy Markdown
Member

/test e2e-openstack-dualstack-upi

@mandre mandre left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-ci openshift-ci Bot added lgtm Indicates that a PR is ready to be merged. and removed lgtm Indicates that a PR is ready to be merged. labels Jun 22, 2026
@openshift-ci

openshift-ci Bot commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

New changes are detected. LGTM label has been removed.

@openshift-ci-robot

Copy link
Copy Markdown
Contributor

@stephenfin: This pull request references Jira Issue OCPBUGS-90536, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)
Details

In response to this:

The Compute network resource names task in common.yaml uses the os_net_id variable, which is loaded from netid.json. Since commit 7b6b3f1 (OCPBUGS-39285), netid.json is loaded conditionally via include_vars (skipped when the file doesn't exist), but the task that consumes os_net_id had no corresponding guard. This causes the UPI deprovision playbook to fail with 'os_net_id' is undefined when netid.json is not present in the working directory.

Add a when: os_net_id is defined guard, consistent with the existing when: sym.stat.exists guard on the Compute resource names task above.

Note that the deprovision playbooks (e.g. down-network.yaml) do use these network resource names, but they already handle missing resources gracefully - they list resources by tag first and use state: absent for deletion. Without os_net_id the names can't be computed at all, so skipping is the only safe option. More importantly, failing hard here prevents the entire deprovision run from cleaning up resources that can be cleaned up (servers, security groups, etc. that use infraID from metadata.json rather than os_net_id).

Summary by CodeRabbit

  • Bug Fixes
  • Network resource name generation now runs only when the required network identifier is available, reducing deployment-time errors.
  • Added a floating IP cleanup step for the target router during down-network operations, ensuring associated floating IPs are removed before router and network teardown.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci

openshift-ci Bot commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from mandre. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

The 'Compute network resource names' task in common.yaml uses the
os_net_id variable, which is loaded from netid.json. Since commit
7b6b3f1 (OCPBUGS-39285), netid.json is loaded conditionally via
include_vars (skipped when the file doesn't exist), but the task
that consumes os_net_id had no corresponding guard. This causes
the UPI deprovision playbook to fail with "'os_net_id' is undefined"
when netid.json is not present in the working directory.

Add a 'when: os_net_id is defined' guard, consistent with the
existing 'when: sym.stat.exists' guard on the 'Compute resource
names' task above.

Note that the deprovision playbooks (e.g. down-network.yaml) do use
these network resource names, but they already handle missing resources
gracefully - they list resources by tag first and use 'state: absent'
for deletion. Without os_net_id the names can't be computed at all, so
skipping is the only safe option. More importantly, failing hard here
prevents the entire deprovision run from cleaning up resources that
*can* be cleaned up (servers, security groups, etc. that use infraID
from metadata.json rather than os_net_id).

Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Assisted-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
You cannot delete a router while it still has FIPs associated with it.

Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
@ekuris-redhat

Copy link
Copy Markdown

/test e2e-openstack-ovn

@ekuris-redhat

Copy link
Copy Markdown

Re-test because it looks like an infrastructure flake, not a code bug.

@mandre

mandre commented Jun 26, 2026

Copy link
Copy Markdown
Member

/test e2e-openstack-dualstack-upi

@openshift-ci

openshift-ci Bot commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

@stephenfin: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-openstack-proxy a211562 link false /test e2e-openstack-proxy
ci/prow/e2e-openstack-dualstack-upi a211562 link false /test e2e-openstack-dualstack-upi

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@stephenfin

Copy link
Copy Markdown
Contributor Author

/hold

Looks like my fix to remove the stale FIPs wasn't enough and there are still ports left over. I'm guessing something is creating a router, but I don't know what or why it's suddenly changed 😕

@openshift-ci openshift-ci Bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 26, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants