Skip to content

Add hot loop test automation#6212

Draft
sergiordlr wants to merge 1 commit into
openshift:mainfrom
sergiordlr:add_tc_89596_hot_loop
Draft

Add hot loop test automation#6212
sergiordlr wants to merge 1 commit into
openshift:mainfrom
sergiordlr:add_tc_89596_hot_loop

Conversation

@sergiordlr

@sergiordlr sergiordlr commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

- What I did
WIP
- How to verify it

- Description for the changelog

Summary by CodeRabbit

Release Notes

  • Tests
    • Enhanced boot image update error validation with improved message templates.
    • Added test coverage for hot-loop detection and recovery in boot image management, including verification of degraded status handling and recovery workflows.

@openshift-merge-bot

Copy link
Copy Markdown
Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: LGTM mode

@sergiordlr sergiordlr marked this pull request as draft June 22, 2026 13:45
@coderabbitai

coderabbitai Bot commented Jun 22, 2026

Copy link
Copy Markdown

Walkthrough

In test/extended-priv/mco_bootimages.go, mapiBaseErrorMessageTemplate is shortened to the "Error(s):" prefix and a new mapiReconcileErrorMessageTemplate is added for the "failed to reconcile machineset" suffix. Three existing negative test cases are updated to use the new template. A new disruptive test case for hot-loop detection and recovery is added.

Changes

Bootimage test updates

Layer / File(s) Summary
Error message template refactor and existing test updates
test/extended-priv/mco_bootimages.go
mapiBaseErrorMessageTemplate is shortened to the shared "Error(s):" prefix; mapiReconcileErrorMessageTemplate is introduced for the "failed to reconcile machineset …" suffix. Three negative test cases ("secret doesn't exist", "no json data", "wrong version") update their expectedFailedMessageRegexp to use the new reconcile-specific template.
New hot-loop detection and recovery test
test/extended-priv/mco_bootimages.go
Adds a disruptive g.It test that enables All managedBootImages, clones a machineset, patches it with a wrong boot image repeatedly to trigger hot-loop detection, asserts BootImageUpdateDegraded is True with a regex-matched hot-loop message, clears state by toggling to None, re-enables All, and asserts the condition clears and the boot image is restored.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 11 | ❌ 4

❌ Failed checks (4 warnings)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Test Structure And Quality ⚠️ Warning Assertion format strings on lines 727, 730, 747 have 2 placeholders but only 1 argument (causing %!s(MISSING)); defer cleanup on line 703 ignores Delete() error. Fix format strings to match argument count; wrap defer cleanup in error-checking function as suggested in review comments.
Microshift Test Compatibility ⚠️ Warning Test uses MachineSet (machine.openshift.io) API which is unavailable on MicroShift and lacks protective mechanisms [Skipped:MicroShift], [apigroup:machine.openshift.io], or IsMicroShiftCluster() skip. Add [apigroup:machine.openshift.io] tag to test name, or add [Skipped:MicroShift] label, or guard with exutil.IsMicroShiftCluster() skip check.
Ipv6 And Disconnected Network Test Compatibility ⚠️ Warning The test [PolarionID:89596] runs on multiple platforms including VSphere. When executed on VSphere, getBackdatedBootImage() calls GetRHCOSHandler().GetBaseImageFromRHCOSImageInfo(), which eventuall... Either add [Skipped:Disconnected] to the test name to skip on disconnected environments, or refactor getBackdatedBootImage() to avoid external network calls when running in VSphere on disconnected clusters by providing pre-cached RHCOS d...
✅ Passed checks (11 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title 'Add hot loop test automation' directly describes the main change—adding a new test case for hot loop detection and recovery functionality.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names ✅ Passed All test titles in the modified file use stable, deterministic names with no dynamic content. The new test title "[PolarionID:89596][OTP] ManagedBootImages. Hot loop detection and recovery" contain...
Single Node Openshift (Sno) Test Compatibility ✅ Passed Test 89596 does not assume multiple nodes: it clones a machineset with 0 replicas and operates at the API spec level without scaling, pod scheduling, or node operations.
Topology-Aware Scheduling Compatibility ✅ Passed PR modifies only test code (test/extended-priv/mco_bootimages.go), not deployment manifests, operator code, or controllers. Topology-aware scheduling check explicitly applies to deployed code only.
Ote Binary Stdout Contract ✅ Passed The PR modifies only test/extended-priv/mco_bootimages.go. No stdout writes detected in process-level code: no fmt.Print operations, logger.Infof uses Ginkgo-intercepted GinkgoWriter, exutil.By use...
No-Weak-Crypto ✅ Passed The modified test file contains no weak cryptographic algorithms (MD5, SHA1, DES, RC4, 3DES, Blowfish, ECB), no custom crypto implementations, and no non-constant-time secret comparisons. The file...
Container-Privileges ✅ Passed PR adds test automation code only; privileged containers in repo (MCO daemon) are fully justified for system-level node configuration management.
No-Sensitive-Data-In-Logs ✅ Passed The PR's logger.Infof calls only log generic messages ("OK!") and loop iteration numbers (1-3). Assertion messages use .PrettyString() which redacts sensitive resources. No passwords, tokens, API k...

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci Bot requested review from umohnani8 and yuqi-zhang June 22, 2026 13:45
@openshift-ci

openshift-ci Bot commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sergiordlr

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Jun 22, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/extended-priv/mco_bootimages.go`:
- Around line 727-730: The failure message format strings in the assertion calls
contain two %s placeholders each, but only one argument
(machineConfiguration.PrettyString()) is provided to each. This causes Go's
formatter to output %!s(MISSING) in the diagnostics, making debugging harder.
Fix both assertions by either removing the second %s placeholder from each
format string or providing an appropriate second argument to match the number of
placeholders. This issue appears in the block around lines 727-730 and also
applies to similar assertions around lines 746-747.
- Line 703: The `defer clonedMS.Delete()` statement ignores the error returned
by the Delete() method, which violates Go coding guidelines and can cause
resource leaks in this disruptive test. Modify the deferred cleanup to capture
and explicitly handle the error returned by clonedMS.Delete() - this typically
involves logging the error or checking it to ensure cleanup failures are not
silently ignored and can be traced during test debugging.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 1a678e5f-7091-4947-89b3-08083ff82315

📥 Commits

Reviewing files that changed from the base of the PR and between ae127c3 and a05c3fa.

📒 Files selected for processing (1)
  • test/extended-priv/mco_bootimages.go

exutil.By("Clone the first machineset")
clonedMS, err := machineSet.Duplicate(clonedMSName)
o.Expect(err).NotTo(o.HaveOccurred(), "Error duplicating %s", machineSet)
defer clonedMS.Delete()

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Handle deferred cleanup errors explicitly.

defer clonedMS.Delete() ignores the returned error (Line 703). In this disruptive test, failed cleanup can leak resources and cause cross-test instability.

Suggested fix
-		defer clonedMS.Delete()
+		defer func() {
+			o.Expect(clonedMS.Delete()).To(o.Succeed(), "Failed to delete cloned machineset %s", clonedMSName)
+		}()

As per coding guidelines, Go code should never ignore error returns.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
defer clonedMS.Delete()
defer func() {
o.Expect(clonedMS.Delete()).To(o.Succeed(), "Failed to delete cloned machineset %s", clonedMSName)
}()
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/extended-priv/mco_bootimages.go` at line 703, The `defer
clonedMS.Delete()` statement ignores the error returned by the Delete() method,
which violates Go coding guidelines and can cause resource leaks in this
disruptive test. Modify the deferred cleanup to capture and explicitly handle
the error returned by clonedMS.Delete() - this typically involves logging the
error or checking it to ensure cleanup failures are not silently ignored and can
be traced during test debugging.

Source: Coding guidelines

Comment on lines +727 to +730
"Expected %s to be BootImageUpdateDegraded due to hot loop.\n%s", machineConfiguration.PrettyString())

o.Eventually(machineConfiguration, "5m", "20s").Should(HaveConditionField("BootImageUpdateDegraded", "message", o.MatchRegexp(expectedHotLoopMessageRegex)),
"Expected hot loop error message in %s.\n%s", machineConfiguration.PrettyString())

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix assertion format strings with missing arguments.

These failure messages use two %s placeholders but pass one argument, so diagnostics will contain %!s(MISSING) and be harder to debug.

Suggested fix
-		o.Eventually(machineConfiguration, "5m", "20s").Should(HaveConditionField("BootImageUpdateDegraded", "status", "True"),
-			"Expected %s to be BootImageUpdateDegraded due to hot loop.\n%s", machineConfiguration.PrettyString())
+		o.Eventually(machineConfiguration, "5m", "20s").Should(HaveConditionField("BootImageUpdateDegraded", "status", "True"),
+			"Expected BootImageUpdateDegraded=True due to hot loop.\n%s", machineConfiguration.PrettyString())

-		o.Eventually(machineConfiguration, "5m", "20s").Should(HaveConditionField("BootImageUpdateDegraded", "message", o.MatchRegexp(expectedHotLoopMessageRegex)),
-			"Expected hot loop error message in %s.\n%s", machineConfiguration.PrettyString())
+		o.Eventually(machineConfiguration, "5m", "20s").Should(HaveConditionField("BootImageUpdateDegraded", "message", o.MatchRegexp(expectedHotLoopMessageRegex)),
+			"Expected hot loop error message.\n%s", machineConfiguration.PrettyString())

-		o.Eventually(machineConfiguration, "5m", "20s").Should(HaveConditionField("BootImageUpdateDegraded", "status", "False"),
-			"Expected %s NOT to be BootImageUpdateDegraded after clearing hot loop.\n%s", machineConfiguration.PrettyString())
+		o.Eventually(machineConfiguration, "5m", "20s").Should(HaveConditionField("BootImageUpdateDegraded", "status", "False"),
+			"Expected BootImageUpdateDegraded=False after clearing hot loop.\n%s", machineConfiguration.PrettyString())

Also applies to: 746-747

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/extended-priv/mco_bootimages.go` around lines 727 - 730, The failure
message format strings in the assertion calls contain two %s placeholders each,
but only one argument (machineConfiguration.PrettyString()) is provided to each.
This causes Go's formatter to output %!s(MISSING) in the diagnostics, making
debugging harder. Fix both assertions by either removing the second %s
placeholder from each format string or providing an appropriate second argument
to match the number of placeholders. This issue appears in the block around
lines 727-730 and also applies to similar assertions around lines 746-747.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant