Use CAPI featuregate for CAPI aws worker machinesets#10659
Conversation
📝 WalkthroughWalkthroughMachine pool management defaulting now applies only to AWS compute pools when the AWS machine-management feature gate is enabled. Validation and tests were updated to match the AWS-specific behavior. ChangesAWS machine pool management
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes 🚥 Pre-merge checks | ✅ 14 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (14 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.12.2)Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions Comment |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
/test e2e-aws-ovn-techpreview |
This removes use of: * FeatureGateClusterAPIControlPlaneInstall because it is not implemented in either installer or CPMS. * FeatureGateClusterAPIComputeInstall replaced by FeatureGateClusterAPIMachineManagementAWS and scoped to AWS only, as that is the only platform where it is implemented The effect of this change is that installer-created worker machinesets will use CAPI as soon as platform support is promoted. Also updates the tests to use the feature gates explicitly instead of assuming inclusion in a particular feature set.
d7d2bfc to
a976a1f
Compare
|
/test e2e-aws-ovn-techpreview |
|
Leaving this draft until I get a good run of e2e-aws-ovn-techpreview. |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/types/validation/featuregates.go`:
- Around line 45-47: The validation mapping is using the generic
machine-management feature gate instead of the AWS-specific one, which broadens
install config acceptance in `installconfig.go` through `FeatureGateName`.
Update the `featuregates.go` entry to reference the AWS-specific gate used for
`compute.management`, and keep the platform-specific behavior aligned with the
AWS-only contract rather than falling back to
`FeatureGateClusterAPIMachineManagement`. If the helper cannot represent
platform-specific gating, fix that abstraction instead of changing this mapping.
- Around line 48-49: The validation error path in the feature gate check is
pointing at the list field instead of the actual compute pool entry, so update
the field path used in the feature validation logic to target the indexed pool.
In the validation code around the compute management check, adjust the
`field.NewPath(...)` call so it references `compute[0].management` (or the
matching index if the logic is generalized), keeping the check in sync with the
`c.Compute` slice access.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 15a6ab77-30eb-4de5-a0ee-ee5fe328f05f
📒 Files selected for processing (3)
pkg/types/defaults/machinepools.gopkg/types/defaults/machinepools_test.gopkg/types/validation/featuregates.go
🚧 Files skipped from review as they are similar to previous changes (2)
- pkg/types/defaults/machinepools.go
- pkg/types/defaults/machinepools_test.go
| // Note that this should use a platform-specific feature gate, but | ||
| // there is no way to express that here. | ||
| FeatureGateName: features.FeatureGateClusterAPIMachineManagement, |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟠 Major | 🏗️ Heavy lift
Use the AWS-specific gate in validation, not the generic machine-management gate.
pkg/types/validation/installconfig.go consumes this mapping purely by FeatureGateName, so switching this entry to FeatureGateClusterAPIMachineManagement broadens which configs are admitted/rejected. That no longer matches the AWS-only behavior described for this change and can make compute.management validation diverge from the actual platform support. If the current helper cannot express platform-specific gating, that limitation needs fixing instead of weakening the contract here.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/types/validation/featuregates.go` around lines 45 - 47, The validation
mapping is using the generic machine-management feature gate instead of the
AWS-specific one, which broadens install config acceptance in `installconfig.go`
through `FeatureGateName`. Update the `featuregates.go` entry to reference the
AWS-specific gate used for `compute.management`, and keep the platform-specific
behavior aligned with the AWS-only contract rather than falling back to
`FeatureGateClusterAPIMachineManagement`. If the helper cannot represent
platform-specific gating, fix that abstraction instead of changing this mapping.
| Condition: len(c.Compute) > 0 && c.Compute[0].Management == types.ClusterAPI, | ||
| Field: field.NewPath("compute", "management"), |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win
Point the validation error at the indexed compute pool.
compute is a list in install-config, so field.NewPath("compute", "management") does not map to the structure users edit. If this check is for the first pool, the field path should be compute[0].management (or the matched index if you generalize the scan), otherwise the error is harder to act on. As per path instructions, "Verify error field paths match the YAML/JSON structure users provide."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/types/validation/featuregates.go` around lines 48 - 49, The validation
error path in the feature gate check is pointing at the list field instead of
the actual compute pool entry, so update the field path used in the feature
validation logic to target the indexed pool. In the validation code around the
compute management check, adjust the `field.NewPath(...)` call so it references
`compute[0].management` (or the matching index if the logic is generalized),
keeping the check in sync with the `c.Compute` slice access.
Source: Path instructions
|
/test ? |
|
/test e2e-aws-ovn-dualstack-ipv6-primary-techpreview e2e-aws-ovn-dualstack-ipv4-primary-techpreview |
|
/hold |
| featureSet configv1.FeatureSet | ||
| featureGates []string |
There was a problem hiding this comment.
can we just stick with featureset?
There was a problem hiding this comment.
I don't think so: it's the wrong abstraction. We'd have to change all the tests every time we promoted anything.
There was a problem hiding this comment.
Ok this looks fine to me although you might need CNU featureset enabled?
There was a problem hiding this comment.
But we lose coverage for scenarios where the users set: featureSet: Default/TechPreviewNoUpgrade/DevPreviewNoUpgrade in the install-config, right?
| FeatureGateName: features.FeatureGateClusterAPIComputeInstall, | ||
| // Note that this should use a platform-specific feature gate, but | ||
| // there is no way to express that here. | ||
| FeatureGateName: features.FeatureGateClusterAPIMachineManagement, |
There was a problem hiding this comment.
Instead, move it here: https://github.com/openshift/installer/blob/main/pkg/types/aws/validation/featuregates.go
|
@mdbooth: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
ipv6 failure: |
| name: "compute with default feature set", | ||
| name: "non-AWS compute is unaffected by ClusterAPIMachineManagementAWS", | ||
| pool: &types.MachinePool{Name: types.MachinePoolComputeRoleName}, | ||
| platform: &types.Platform{}, |
There was a problem hiding this comment.
nit: let's put a non-nil platform (e.g. GCP) to test "non-AWS" case?
| Field: field.NewPath("osImageStream"), | ||
| }, | ||
| { | ||
| FeatureGateName: features.FeatureGateClusterAPIControlPlaneInstall, |
There was a problem hiding this comment.
We should also remove the 2 unit tests, following this change:
| }, | ||
| { | ||
| FeatureGateName: features.FeatureGateClusterAPIControlPlaneInstall, | ||
| Condition: c.ControlPlane != nil && c.ControlPlane.Management == types.ClusterAPI, |
There was a problem hiding this comment.
By removing this check, field controlPlane.management can be set to ClusterAPI freely; but results in a no-op.
We should add a validation to block it for the moment, WDYT?
installer/pkg/types/validation/installconfig.go
Lines 900 to 904 in 6831905
Let's check with the 2 CCAPIO PRs 👇 (see next comment) Note: There is another "issue" that breaks |
|
/testwith openshift/installer/main/e2e-aws-ovn-dualstack-ipv6-primary-techpreview openshift/cluster-capi-operator#592 openshift/cluster-capi-operator#593 |
|
Hmm, in the AWS ipv4-primary dualstack, the konnectivity agent pods are rejected (see events) 😕: This may be some sort of timing/race problem? AWS ipv6-primary dualstack and ipv4-only TPNU installed just fine. |
This removes use of:
The effect of this change is that installer-created worker machinesets will use CAPI as soon as platform support is promoted.
Also updates the tests to use the feature gates explicitly instead of assuming inclusion in a particular feature set.
Summary by CodeRabbit