
MCO1908: MCO-2387: Migrate MCO Upgrade TestCase #6211

Open
ptalgulk01 wants to merge 1 commit into openshift:main from ptalgulk01:migrate-mco-upgrade

Conversation

@ptalgulk01

@ptalgulk01 ptalgulk01 commented Jun 22, 2026

Contributor

Migrate MCO upgrade tests from openshift-tests-private

Migrate 7 upgrade test IDs (9 test cases) from openshift-tests-private.
Tests validate MCO behavior during and after cluster upgrades.

Migrated test cases:

  • 55748: Verify no "Transaction in progress" errors in MCD logs
  • 59427: Verify SSH keys migrate from RHCOS8 to RHCOS9 paths
  • 62154 (Pre): Setup KubeletConfig/ContainerRuntimeConfig before upgrade
  • 62154 (Post): Verify controller versions match after upgrade
  • 64781: Verify MCO pod complies with CIS benchmark rules
  • 70577: Verify ovs-configuration.service runs before dnsmasq (Azure)
  • 70813 (Pre): Setup ManagedBootImages test resources before upgrade
  • 70813 (Post): Verify ManagedBootImages updates after upgrade
  • 76216: Verify nodes can scale up after upgrade

Added upgrade markers:

  • [Feature:ClusterUpgrade] - Identifies tests for openshift-tests run-upgrade
  • [Early] - Pre-upgrade setup tests (62154-Pre, 70813-Pre)
  • [Late] - Post-upgrade verification tests (all others)
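As a rough illustration of the marker convention, the following stdlib-only Go sketch classifies a test name by the three markers listed above. The `phaseOf` helper and the sample test names are hypothetical, and this is not how `openshift-tests` selects tests; it only shows the markers being treated as plain substrings of a test title.

```go
package main

import (
	"fmt"
	"strings"
)

// Markers named in the PR description.
const (
	upgradeMarker = "[Feature:ClusterUpgrade]"
	earlyMarker   = "[Early]"
	lateMarker    = "[Late]"
)

// phaseOf reports which upgrade phase a test name is tagged for.
// It returns "" when the name lacks the upgrade marker or a phase marker.
func phaseOf(testName string) string {
	if !strings.Contains(testName, upgradeMarker) {
		return ""
	}
	switch {
	case strings.Contains(testName, earlyMarker):
		return "pre-upgrade"
	case strings.Contains(testName, lateMarker):
		return "post-upgrade"
	default:
		return ""
	}
}

func main() {
	// Hypothetical test names; the real titles live in mco_upgrade.go.
	names := []string{
		"[Feature:ClusterUpgrade][Early] 62154 setup KubeletConfig before upgrade",
		"[Feature:ClusterUpgrade][Late] 62154 controller versions match after upgrade",
		"unrelated test without markers",
	}
	for _, n := range names {
		fmt.Printf("%q -> %q\n", n, phaseOf(n))
	}
}
```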

Tests run automatically in existing MCO upgrade CI jobs (e2e-aws-ovn-upgrade,
e2e-azure-ovn-upgrade, e2e-gcp-upgrade, etc.).

Summary by CodeRabbit

  • Tests
    • Added a new long-duration Ginkgo extended-privilege “MCO Upgrade” suite covering multi-node log validation, RHCOS8→RHCOS9 SSH authorized-keys migration, controller-version annotation consistency, and CIS checks for default service-account usage.
    • Added ManagedBootImages validation in two phases, including update-target boot image/user-data verification and ensuring non-target boot images remain unchanged.
    • Added Azure-specific validation for systemd ordering and dnsmasq behavior, plus an ARO cluster detection helper used by the suite.

@openshift-merge-bot

Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: LGTM mode

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference label (Indicates that this PR references a valid Jira ticket of any type.) Jun 22, 2026
@openshift-ci-robot

openshift-ci-robot commented Jun 22, 2026

Contributor

@ptalgulk01: This pull request references MCO-2387 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai

coderabbitai Bot commented Jun 22, 2026


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 4ce18d20-b151-4931-895c-a6e3c244739a

📥 Commits

Reviewing files that changed from the base of the PR and between 519809b and 5bb79d7.

📒 Files selected for processing (2)
  • test/extended-priv/mco_upgrade.go
  • test/extended-priv/util.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • test/extended-priv/util.go
  • test/extended-priv/mco_upgrade.go

Walkthrough

Adds a new extended-privilege Ginkgo suite for MCO upgrade validation, covering node log checks, SSH key migration, controller-version annotations, CIS and Azure assertions, and managed boot image behavior. Adds an ARO cluster detection helper.

Changes

MCO Upgrade Extended-Priv Test Suite

| Layer / File(s) | Summary |
|---|---|
| Suite setup and ARO detection (`test/extended-priv/mco_upgrade.go`, `test/extended-priv/util.go`) | Adds suite-scoped CLI/admin setup, temp directory lifecycle, worker pool initialization, and IsAROCluster resource probing. |
| Log scan and SSH migration (`test/extended-priv/mco_upgrade.go`) | Adds the machine-config-daemon log check across Linux nodes and the RHCOS authorized-keys migration assertion with debug-node skipping. |
| CIS and Azure checks (`test/extended-priv/mco_upgrade.go`) | Adds the MCO service-account and legacy clusterrolebinding assertions, plus Azure ovs-configuration.service ordering and dnsmasq state checks. |
| Controller-version validation (`test/extended-priv/mco_upgrade.go`) | Adds the early KubeletConfig/ContainerRuntimeConfig precheck and the late controller-version annotation comparison against rendered MachineConfigs. |
| Managed boot images flow (`test/extended-priv/mco_upgrade.go`) | Adds the early boot-image snapshot and target machineset setup, the late boot image/configmap validation, and the follow-up machineset scaling test. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested labels

verified

Suggested reviewers

  • djoshy
  • RishabhSaini
  • yuqi-zhang
  • umohnani8
🚥 Pre-merge checks | ✅ 13 | ❌ 2

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Test Structure And Quality | ⚠️ Warning | 70813 Early creates a temp namespace/configmap and patches cluster ManagedBootImages, but only tmpdir is cleaned; no teardown restores or deletes those resources. | Add DeferCleanup/AfterEach to delete tc-70813-tmp-namespace/configmap and revert the cluster MachineConfiguration managedBootImages setting after the late phase. |
| Microshift Test Compatibility | ⚠️ Warning | All 9 tests reference unavailable MicroShift APIs (MachineSet, MachineConfig, MachineConfigPool, ClusterVersion, KubeletConfig, ContainerRuntimeConfig, FeatureGates) with no protective labels/skips. | Add [apigroup:machine.openshift.io] or [apigroup:machineconfiguration.openshift.io] tags to all test names, or add [Skipped:MicroShift] labels for all 9 test cases. |
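The teardown requested by the first warning can be pictured with a stdlib-only Go sketch: every mutation registers a step that undoes it, and the steps run last-in, first-out once the late phase has finished. `fakeCluster`, `cleanups`, `setUp`, and the `"enabled-for-test"` value are hypothetical stand-ins; a real suite would call the cluster API and would likely rely on Ginkgo's `DeferCleanup` or `AfterEach`, as the resolution suggests.

```go
package main

import "fmt"

// fakeCluster is a hypothetical in-memory stand-in for the cluster state the
// 70813 test touches; real code would talk to the cluster API instead.
type fakeCluster struct {
	managedBootImages string          // stand-in for the MachineConfiguration setting
	configMaps        map[string]bool // keyed by "namespace/name"
}

// cleanups collects teardown steps and runs them last-in, first-out,
// the same ordering Go's defer uses.
type cleanups struct{ steps []func() }

func (c *cleanups) add(step func()) { c.steps = append(c.steps, step) }

func (c *cleanups) run() {
	for i := len(c.steps) - 1; i >= 0; i-- {
		c.steps[i]()
	}
	c.steps = nil
}

// setUp mutates the cluster and registers one teardown step per mutation.
func setUp(cl *fakeCluster, c *cleanups) {
	original := cl.managedBootImages
	cl.managedBootImages = "enabled-for-test" // hypothetical value
	c.add(func() { cl.managedBootImages = original })

	const key = "tc-70813-tmp-namespace/configmap"
	cl.configMaps[key] = true
	c.add(func() { delete(cl.configMaps, key) })
}

func main() {
	cl := &fakeCluster{managedBootImages: "original", configMaps: map[string]bool{}}
	c := &cleanups{}
	setUp(cl, c)
	fmt.Println("during test:", cl.managedBootImages, len(cl.configMaps))
	c.run() // in the real suite this would happen after the late phase
	fmt.Println("after cleanup:", cl.managedBootImages, len(cl.configMaps))
}
```

Because the Early and Late tests run in separate phases of the upgrade job, where the teardown is registered matters: undoing the setup at the end of the Early test would remove the resources the Late test expects to find.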
✅ Passed checks (13 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: migrating MCO upgrade test cases tied to MCO-2387. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | All Ginkgo titles in the new suite are static literals; no interpolated or run-dependent values appear in test names. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | All 9 MCO upgrade tests are SNO-compatible: tests requiring node scaling have skipTestIfWorkersCannotBeScaled guards, worker pool tests have explicit empty pool checks, and node-iterating tests wor... |
| Topology-Aware Scheduling Compatibility | ✅ Passed | Changes add upgrade tests and a cluster-type helper only; no pods/controllers, affinity, nodeSelectors, PDBs, or replica logic were introduced. |
| Ote Binary Stdout Contract | ✅ Passed | PASS: No fmt.Print/klog stdout in init/top-level setup; logging uses logext/GinkgoWriter, and output calls in mco_upgrade.go are inside It bodies. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | No IPv4 literals or IPv4-only parsing found, and the new tests only query cluster resources; no public-internet calls are used. |
| No-Weak-Crypto | ✅ Passed | No MD5/SHA1/DES/RC4/3DES/Blowfish/ECB, custom crypto, or non-constant-time secret/token compares appear in the new code. |
| Container-Privileges | ✅ Passed | The new test code only performs API/read-only checks; no added manifests or specs set privileged/hostPID/hostNetwork/hostIPC/SYS_ADMIN/allowPrivilegeEscalation. |
| No-Sensitive-Data-In-Logs | ✅ Passed | The new logs only print resource names/statuses; no passwords, tokens, API keys, PII, or session IDs are logged. |


Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.12.2)

Command failed



@openshift-ci

openshift-ci Bot commented Jun 22, 2026

Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ptalgulk01

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files.) Jun 22, 2026
@ptalgulk01 ptalgulk01 changed the title from "MCO1908: MCO-2387: Add MCO Upgrade TestCase" to "MCO1908: MCO-2387: Migrate MCO Upgrade TestCase" Jun 22, 2026

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 6

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/extended-priv/mco_upgrade.go`:
- Around line 339-340: In the logger.Warnf call, replace the
tmpConfigMap.PrettyString() parameter with a safer alternative that only
includes the ConfigMap's namespace and name along with the missing key
identifier, rather than exposing the full ConfigMap payload. This prevents
sensitive cluster metadata like machine set identifiers and boot-image details
from being exposed in CI logs while maintaining sufficient debugging context.
- Around line 33-34: The os.RemoveAll call on line 33 (and similar calls at
lines 108-109 and 139-140) returns an error that is being ignored, which violates
the coding guideline to never ignore error returns. Capture the error returned
by os.RemoveAll(tmpdir) and handle it by checking if it is non-nil before
logging the success message, then either log the error or return it as
appropriate for cleanup failures. Apply the same error handling pattern to the
other two locations identified (around lines 108 and 139) to ensure all error
returns from cleanup and debug command invocations are properly handled.
- Around line 373-375: The defer statement for clonedMS.Delete() is registered
before the error validation from machineSet.Duplicate(), which means if the
duplication fails and clonedMS is invalid, the deferred cleanup will still
execute and potentially obscure the actual error. Move the defer
clonedMS.Delete() statement to appear after the
o.Expect(err).NotTo(o.HaveOccurred()) error check so that cleanup is only
registered after confirming the duplication was successful and clonedMS is a
valid object.
- Around line 46-48: The assertion on the getLogErr variable returned from the
GetMCDaemonLogs call has incorrect polarity and will incorrectly mask real
log-collection failures. The current
o.Expect(getLogErr).Should(o.HaveOccurred()) expects an error to occur, but this
should be checking that no error occurred during log retrieval. Change the
assertion to o.Expect(getLogErr).ShouldNot(o.HaveOccurred()) so that actual
log-collection failures are properly detected, while the subsequent errLog
assertion continues to validate that the specific transaction error is not
present in the logs.
- Around line 195-199: The jsonpath query in the test checking the MCO pod's
serviceAccountName only validates items[0], which fails to detect two issues: an
empty pod list would incorrectly pass the test, and multiple pods are not all
validated for the default serviceAccountName. Modify the jsonpath expression to
verify all pods returned do not use 'default', and add an assertion to ensure
the query result is not empty before checking the serviceAccountName values.
This ensures the test actually finds MCO pods and validates that none of them
are using the 'default' serviceAccountName.
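Three of the findings above (the ignored `os.RemoveAll` error, the `defer` registered before the error check, and the `items[0]`-only service-account check) can be sketched with the standard library alone. Everything here is a hypothetical stand-in rather than the code in `mco_upgrade.go`: `resource`, `duplicate`, `useClone`, and `checkServiceAccounts` are invented for illustration, and the `{.items[*].spec.serviceAccountName}` query mentioned in the comment is an assumption about how the jsonpath might be widened.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"strings"
)

// removeTmpDir surfaces the os.RemoveAll error instead of discarding it.
func removeTmpDir(dir string) error {
	if err := os.RemoveAll(dir); err != nil {
		return fmt.Errorf("removing %s: %w", dir, err)
	}
	return nil
}

// resource is a hypothetical stand-in for a cloned MachineSet.
type resource struct{ deleted bool }

func (r *resource) Delete() { r.deleted = true }

// duplicate is a hypothetical stand-in for machineSet.Duplicate.
func duplicate(fail bool) (*resource, error) {
	if fail {
		return nil, errors.New("duplicate failed")
	}
	return &resource{}, nil
}

// useClone registers the cleanup only after the error check, so a failed
// duplication never triggers Delete on a nil clone.
func useClone(fail bool) (*resource, error) {
	clone, err := duplicate(fail)
	if err != nil {
		return nil, err
	}
	defer clone.Delete()
	// ... work with the clone here ...
	return clone, nil
}

// checkServiceAccounts validates every pod, given the space-separated output
// of a query such as {.items[*].spec.serviceAccountName} (an assumed query).
func checkServiceAccounts(jsonpathOut string) error {
	names := strings.Fields(jsonpathOut)
	if len(names) == 0 {
		return errors.New("no MCO pods found")
	}
	for _, n := range names {
		if n == "default" {
			return errors.New("a pod uses the default service account")
		}
	}
	return nil
}

func main() {
	dir, err := os.MkdirTemp("", "mco-sketch-")
	if err != nil {
		fmt.Println("mkdtemp failed:", err)
		return
	}
	if err := removeTmpDir(dir); err != nil {
		fmt.Println("cleanup failed:", err)
	}
	if _, err := useClone(true); err != nil {
		fmt.Println("duplicate error surfaced:", err)
	}
	fmt.Println("empty list:", checkServiceAccounts(""))
	fmt.Println("single pod:", checkServiceAccounts("machine-config-operator"))
}
```

The assertion-polarity finding is specific to Gomega and is left out of the sketch; the review text already gives the replacement call.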

In `@test/extended-priv/util.go`:
- Around line 542-545: The IsAROCluster function currently collapses all errors
from Exists() into a boolean false result, which silently treats transient API
failures as "not ARO" and can misroute the assertion logic. Modify IsAROCluster
to return both a boolean and an error instead of just a boolean, so that the
error from calling Exists() on the NewResource is explicitly surfaced to the
caller rather than being collapsed into the return value. This allows the call
site to explicitly check for and handle errors before making assertions based on
the cluster type.
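A minimal sketch of the suggested signature change follows, assuming the underlying probe can report an error. The `existsFunc` type, the `probe*` helpers, and the lowercase `isAROCluster` are hypothetical; the real `Exists()` signature in `util.go` is not shown in this conversation.

```go
package main

import (
	"errors"
	"fmt"
)

// existsFunc is a hypothetical stand-in for the resource probe. It returns
// whether the resource exists, plus any API error encountered.
type existsFunc func() (bool, error)

var errTransient = errors.New("transient API failure")

// Hypothetical probes covering the three outcomes a caller has to handle.
func probeFound() (bool, error)   { return true, nil }
func probeMissing() (bool, error) { return false, nil }
func probeFailing() (bool, error) { return false, errTransient }

// isAROCluster propagates probe errors instead of collapsing them into
// "false", so a transient failure is not mistaken for "not ARO".
func isAROCluster(probe existsFunc) (bool, error) {
	found, err := probe()
	if err != nil {
		return false, fmt.Errorf("probing ARO resource: %w", err)
	}
	return found, nil
}

func main() {
	for _, p := range []existsFunc{probeFound, probeMissing, probeFailing} {
		isARO, err := isAROCluster(p)
		if err != nil {
			// The call site decides what to do: fail, retry, or skip.
			fmt.Println("cannot determine cluster type:", err)
			continue
		}
		fmt.Println("ARO cluster:", isARO)
	}
}
```

With this shape the call site checks the error first and only then branches on the boolean, which is the ordering the review comment asks for.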

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: e5f271f1-8fd2-4e41-995a-fa23fc69c527

📥 Commits

Reviewing files that changed from the base of the PR and between ae127c3 and 897d7f0.

📒 Files selected for processing (2)
  • test/extended-priv/mco_upgrade.go
  • test/extended-priv/util.go

@ptalgulk01

ptalgulk01 commented Jun 22, 2026

Contributor Author

4.23

https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-main-nightly-4.23-upgrade-from-stable-4.22-e2e-aws-upgrade-ovn-single-node/2069001257581809664
https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-machine-config-operator-release-4.23-periodics-e2e-aws-ovn-upgrade-ocl/2069001272257679360
https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-main-nightly-4.23-upgrade-from-stable-4.22-e2e-vsphere-upgrade/2069001263999094784

5.0
https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-main-nightly-5.0-upgrade-from-stable-4.22-e2e-aws-upgrade-ovn-single-node/2069002519119400960
https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-main-nightly-5.0-upgrade-from-stable-4.22-e2e-vsphere-upgrade/2069002528514641920
https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-machine-config-operator-release-5.0-periodics-e2e-aws-ovn-upgrade-ocl/2069002539981869056

Result

Early (Pre-Upgrade) Tests:

  ┌─────────┬──────────────────────────┬───────────────────┬───────────────────┬────────────────────┬──────────────────┬───────────────────┬──────────────────┐
  │ Test ID │       Description        │  Job 1, 4.23 SNO  │  Job 2, 4.23 MCO  │  Job 3, 4.23 vSph  │  Job 4, 5.0 SNO  │  Job 5, 5.0 vSph  │  Job 6, 5.0 MCO  │
  ├─────────┼──────────────────────────┼───────────────────┼───────────────────┼────────────────────┼──────────────────┼───────────────────┼──────────────────┤
  │ 62154   │ Controller version check │      SKIPPED      │     ✅ PASSED     │     ✅ PASSED      │     SKIPPED      │     ✅ PASSED     │    ✅ PASSED     │
  ├─────────┼──────────────────────────┼───────────────────┼───────────────────┼────────────────────┼──────────────────┼───────────────────┼──────────────────┤
  │ 70813   │ ManagedBootImages setup  │      SKIPPED      │      SKIPPED      │     NOT FOUND      │     SKIPPED      │     NOT FOUND     │     SKIPPED      │
  └─────────┴──────────────────────────┴───────────────────┴───────────────────┴────────────────────┴──────────────────┴───────────────────┴──────────────────┘

  Late (Post-Upgrade) Tests:

  ┌─────────┬───────────────────────────┬───────────────────┬───────────────────┬────────────────────┬──────────────────┬───────────────────┬──────────────────┐
  │ Test ID │        Description        │  Job 1, 4.23 SNO  │  Job 2, 4.23 MCO  │  Job 3, 4.23 vSph  │  Job 4, 5.0 SNO  │  Job 5, 5.0 vSph  │  Job 6, 5.0 MCO  │
  ├─────────┼───────────────────────────┼───────────────────┼───────────────────┼────────────────────┼──────────────────┼───────────────────┼──────────────────┤
  │ 55748   │ Transaction errors check  │     ✅ PASSED     │     ✅ PASSED     │     ✅ PASSED      │    ✅ PASSED     │     ✅ PASSED     │    ✅ PASSED     │
  ├─────────┼───────────────────────────┼───────────────────┼───────────────────┼────────────────────┼──────────────────┼───────────────────┼──────────────────┤
  │ 59427   │ SSH keys migration        │     ✅ PASSED     │     ✅ PASSED     │     ✅ PASSED      │    ✅ PASSED     │     ✅ PASSED     │    ✅ PASSED     │
  ├─────────┼───────────────────────────┼───────────────────┼───────────────────┼────────────────────┼──────────────────┼───────────────────┼──────────────────┤
  │ 62154   │ Controller version verify │      SKIPPED      │     ✅ PASSED     │     ✅ PASSED      │     SKIPPED      │     ✅ PASSED     │    ✅ PASSED     │
  ├─────────┼───────────────────────────┼───────────────────┼───────────────────┼────────────────────┼──────────────────┼───────────────────┼──────────────────┤
  │ 64781   │ CIS benchmark check       │     ✅ PASSED     │     ✅ PASSED     │     ✅ PASSED      │    ✅ PASSED     │     ✅ PASSED     │    ✅ PASSED     │
  ├─────────┼───────────────────────────┼───────────────────┼───────────────────┼────────────────────┼──────────────────┼───────────────────┼──────────────────┤
  │ 70577   │ ovs-configuration (Azure) │     NOT FOUND     │     NOT FOUND     │     NOT FOUND      │    NOT FOUND     │     NOT FOUND     │    NOT FOUND     │
  ├─────────┼───────────────────────────┼───────────────────┼───────────────────┼────────────────────┼──────────────────┼───────────────────┼──────────────────┤
  │ 70813   │ ManagedBootImages verify  │      SKIPPED      │      SKIPPED      │     NOT FOUND      │     SKIPPED      │     NOT FOUND     │     SKIPPED      │
  ├─────────┼───────────────────────────┼───────────────────┼───────────────────┼────────────────────┼──────────────────┼───────────────────┼──────────────────┤
  │ 76216   │ Scale up nodes            │      SKIPPED      │     ✅ PASSED     │     ✅ PASSED      │     SKIPPED      │     ✅ PASSED     │    ✅ PASSED     │
  └─────────┴───────────────────────────┴───────────────────┴───────────────────┴────────────────────┴──────────────────┴───────────────────┴──────────────────┘

@ptalgulk01 ptalgulk01 force-pushed the migrate-mco-upgrade branch from 897d7f0 to 519809b June 23, 2026 12:40
@ptalgulk01 ptalgulk01 force-pushed the migrate-mco-upgrade branch from 519809b to 5bb79d7 June 24, 2026 15:45
@openshift-ci

openshift-ci Bot commented Jun 24, 2026

Contributor

@ptalgulk01: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@ptalgulk01

ptalgulk01 commented Jun 25, 2026

Contributor Author

4.23
https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-main-nightly-4.23-upgrade-from-stable-4.22-e2e-aws-upgrade-ovn-single-node/2070006166703837184
https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-machine-config-operator-release-4.23-periodics-e2e-aws-ovn-upgrade-ocl/2070006576978071552
https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-main-nightly-4.23-upgrade-from-stable-4.22-e2e-vsphere-upgrade/2070006586473975808

5.0
https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-machine-config-operator-release-5.0-periodics-e2e-aws-ovn-upgrade-ocl/2070007382586429440
https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-main-nightly-5.0-upgrade-from-stable-4.22-e2e-vsphere-upgrade/2070007373929385984
https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-main-nightly-5.0-upgrade-from-stable-4.22-e2e-aws-upgrade-ovn-single-node/2070007364102131712

Result

  ✅ PASSING Tests (3 core tests):
  - 55748, 59427, 64781 - Pass on ALL platforms

  ⏭️  SKIP Tests (with reasons):
  - 62154: Skip on SNO (multi-node needed)
  - 70813: Skip on AWS SNO/Multi (Feature N/A)
  - 76216: Skip on SNO (cannot scale)

  ❌ NOT FOUND Tests (with reasons):
  - 70577: NOT FOUND on AWS/vSphere (Azure-only test) - This is correct!
  - 70813: NOT FOUND on vSphere (GCP/AWS only)



Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.


2 participants