OCPBUGS-74970: Fix kubelet certificate wait loop in criometricsproxy.yaml #6125
|
@aksjadha: This pull request references Jira Issue OCPBUGS-74970, which is invalid:
The bug has been updated to refer to the pull request using the external bug tracker.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
No actionable comments were generated in the recent review.
Walkthrough: The PR inverts the kubelet server certificate wait condition across three node configuration templates and updates an arbiter volumeMount path. InitContainers now wait until /var/lib/kubelet/pki/kubelet-server-current.pem exists before proceeding. |
|
/jira refresh |
|
@aksjadha: This pull request references Jira Issue OCPBUGS-74970, which is invalid:
|
/jira refresh |
|
@aksjadha: This pull request references Jira Issue OCPBUGS-74970, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
Requesting review from QA contact: |
af45f24 to ee0dcdd
|
@aksjadha: This pull request references Jira Issue OCPBUGS-74970, which is valid. 3 validation(s) were run on this bug
Requesting review from QA contact: |
|
The fix is working as expected. Before the fix, the pods restarted multiple times because the init container was not waiting for the file to exist. With the fix, the init container checks whether the file exists and uses the correct mount path, i.e. /var/lib/kubelet/, and the pods no longer restart repeatedly. |
… init container's volumeMount to /var/lib/kubelet
ee0dcdd to 78749ce
|
The change looks good to me. |
|
/approve |
|
/lgtm |
|
Scheduling tests matching the |
|
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: aksjadha, ngopalak-redhat, rphillips |
|
/verified by @aksjadha Verified the changes through manual testing and have added the test results above. |
|
@aksjadha: This PR has been marked as verified. |
|
@aksjadha: This pull request references Jira Issue OCPBUGS-74970. The bug has been updated to no longer refer to the pull request using the external bug tracker. All external bug links have been closed. The bug has been moved to the NEW state. |
|
@aksjadha: This pull request references Jira Issue OCPBUGS-74970, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
Requesting review from QA contact: The bug has been updated to refer to the pull request using the external bug tracker. |
|
/retest-required |
|
/pipeline required |
|
Scheduling tests matching the |
|
@aksjadha: all tests passed! Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
@aksjadha: Jira Issue OCPBUGS-74970 has been moved to the MODIFIED state and will move to the VERIFIED state when the change is available in an accepted nightly payload. |
|
Fix included in release 5.0.0-0.nightly-2026-06-12-141614 |
|
/cherrypick release-4.23 release-4.22 release-4.21 release-4.20 |
|
@aksjadha: new pull request could not be created: failed to create pull request against openshift/machine-config-operator#release-4.23 from head openshift-cherrypick-robot:cherry-pick-6125-to-release-4.23: status code 422 not one of [201], body: {"message":"Validation Failed","errors":[{"resource":"PullRequest","code":"custom","message":"No commits between openshift:release-4.23 and openshift-cherrypick-robot:cherry-pick-6125-to-release-4.23"}],"documentation_url":"https://docs.github.com/rest/pulls/pulls#create-a-pull-request","status":"422"} |
|
/cherrypick release-4.22 release-4.21 release-4.20 |
|
@aksjadha: new pull request created: #6200 |
Related bug: Fix kubelet certificate wait loop and mount path in criometricsproxy.yaml
The previous condition [ -n "$(test -e ...)" ] always evaluated to false because test -e produces no stdout output; it communicates via exit code only. The -n check therefore always saw an empty string, causing the loop to exit immediately instead of waiting for the kubelet certificate to appear.
The init container mounts the host's /var/lib/kubelet at /var. So inside the init container, the host's /var/lib/kubelet/pki/kubelet-server-current.pem appears at /var/pki/kubelet-server-current.pem, but the script checks /var/lib/kubelet/pki/kubelet-server-current.pem, which doesn't exist at that path inside the init container.
- What I did
Replaced the condition with [ ! -e /var/lib/kubelet/pki/kubelet-server-current.pem ], which properly loops until the kubelet certificate file exists.
Changed the init container's mount path for the host's /var/lib/kubelet from /var to /var/lib/kubelet, matching the main container, so the script's path /var/lib/kubelet/pki/kubelet-server-current.pem resolves correctly.
- How to verify it
Check the kube-rbac-proxy-crio-ip pod logs in namespace openshift-machine-config-operator to verify that the CRI-O metrics proxy init container correctly waits for the kubelet certificate before proceeding.
- Description for the changelog
Fixed the kubelet certificate wait loop in criometricsproxy.yaml across all node roles (arbiter, master, worker). The previous condition [ -n "$(test -e /var/lib/kubelet/pki/kubelet-server-current.pem)" ] was incorrect. Replaced with the correct condition [ ! -e /var/lib/kubelet/pki/kubelet-server-current.pem ], which properly loops until the kubelet certificate file exists.
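The difference between the two conditions described above can be reproduced outside a cluster. The following is a minimal sketch, not the actual script from criometricsproxy.yaml: the function names old_condition, new_condition, and describe are hypothetical, and a temporary directory stands in for /var/lib/kubelet/pki.

```shell
#!/bin/sh
# Sketch only: a temp directory stands in for /var/lib/kubelet/pki.

# Old form: test -e writes nothing to stdout, so the command substitution
# is always the empty string and [ -n "" ] is false in every case.
old_condition() { [ -n "$(test -e "$1")" ]; }

# New form: true while the file is absent, false once it exists.
new_condition() { [ ! -e "$1" ]; }

# Helper (hypothetical) that prints how a wait loop would treat a condition.
describe() {
  if "$@"; then echo "true (keep waiting)"; else echo "false (stop waiting)"; fi
}

workdir="$(mktemp -d)"
cert="$workdir/kubelet-server-current.pem"

echo "file absent,  old condition: $(describe old_condition "$cert")"
echo "file absent,  new condition: $(describe new_condition "$cert")"

# Simulate the kubelet writing its certificate a moment later.
( sleep 1; touch "$cert" ) &

# Wait loop using the corrected condition.
while new_condition "$cert"; do
  sleep 1
done
wait

echo "file present, old condition: $(describe old_condition "$cert")"
echo "file present, new condition: $(describe new_condition "$cert")"

rm -rf "$workdir"
```

Under these assumptions, the old condition reports false whether or not the file exists, so a loop built on it never waits, while the corrected condition stays true only until the file appears.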