test(e2e): add Cilium-CNI e2e suite#10
test(e2e): add Cilium-CNI e2e suite#10kvaps wants to merge 1 commit intofeat/cross-mesh-granularityfrom
Conversation
Adds a new e2e harness that brings up a kind cluster with Cilium as the CNI (VXLAN overlay, default kube-proxy) and runs Kilo on top with --cni=false --compatibility=cilium. This validates the cross granularity + Cilium combination, which is the configuration shipped by downstream platforms but not exercised by the existing bridge-CNI suite. Files: - e2e/kilo-kind-cilium.yaml: Kilo DaemonSet for the Cilium-CNI cluster (no kilo CNI ConfigMap, no install-cni init container, --cni=false, --compatibility=cilium) - e2e/lib.sh: install_cilium() helper (Helm, Cilium 1.16.5, VXLAN tunnelProtocol, IPAM kubernetes, host firewall off) and create_cilium_cluster() - e2e/cilium-setup.sh, cilium-cross-mesh.sh, cilium-teardown.sh: three-stage suite mirroring the existing setup/...mesh/teardown pattern - Makefile: new `e2e-cilium` target, kept separate from `e2e` because the Cilium cluster is incompatible with the Kilo bridge CNI used by the default suite Kube-proxy replacement is intentionally left at the default (off) for this baseline; KPR coverage can be added in a follow-up. Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: Andrei Kvapil <kvapss@gmail.com>
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Reverting — keeping the upstream PR squat#491 as the canonical one. |
There was a problem hiding this comment.
Code Review
This pull request introduces an end-to-end test suite for Kilo's Cilium compatibility mode, including new test scripts, a dedicated Kilo manifest, and helper functions for cluster provisioning. Feedback highlights opportunities to improve test stability by replacing a hardcoded sleep with polling, fixing a namespace typo in the setup logic, and pinning external manifest URLs to specific versions.
| assert "retry 30 5 '' check_ping" "should be able to ping all Pods over Cilium VXLAN + Kilo cross mesh" | ||
| assert "retry 10 5 'the adjacency matrix is not complete yet' check_adjacent 3" "adjacency should return the right number of successful pings" | ||
| echo "sleep for 30s (one reconciliation period) and try again..." | ||
| sleep 30 |
There was a problem hiding this comment.
Using a hardcoded sleep in end-to-end tests is generally discouraged as it can lead to flaky tests (if the reconciliation takes longer than 30s) or unnecessarily slow tests (if it takes much less). It is better to poll for a specific state or condition that indicates the reconciliation has completed, such as checking for a specific log message or a change in the node/peer status.
| install_cilium | ||
| block_until_ready kube-system k8s-app=cilium | ||
| _kubectl wait nodes --all --for=condition=Ready --timeout=120s | ||
| block_until_ready kube_system k8s-app=kube-dns |
There was a problem hiding this comment.
The namespace kube_system contains an underscore, which is likely a typo for the standard Kubernetes namespace kube-system. This inconsistency with line 177 will cause the block_until_ready call to look in a non-existent namespace, which may lead to a false positive or a timeout depending on how is_ready handles missing namespaces.
| block_until_ready kube_system k8s-app=kube-dns | |
| block_until_ready kube-system k8s-app=kube-dns |
| _kubectl apply -f helper-curl.yaml | ||
| block_until_ready_by_name default curl || return 1 | ||
| _kubectl taint node $KIND_CLUSTER-control-plane node-role.kubernetes.io/control-plane:NoSchedule- | ||
| _kubectl apply -f https://raw.githubusercontent.com/kilo-io/adjacency/main/example.yaml |
There was a problem hiding this comment.
Relying on a raw URL from the main branch of an external repository for test manifests can introduce fragility. If the external file is modified or moved, the e2e suite may break unexpectedly. It is recommended to use a specific commit hash in the URL or to vendor the manifest locally within the repository.
Summary
Adds an e2e suite that runs Kilo with
--compatibility=ciliumon top ofa kind cluster where Cilium is the CNI. The existing e2e suites only
cover the Kilo bridge CNI path, so the
--compatibility=ciliummode hasno end-to-end coverage in upstream CI today.
This is a stacked PR: the diff against
feat/cross-mesh-granularityshows only the Cilium-specific changes. The branch is still based on
top of the cross-granularity work that we'll send upstream as squat/kilo
PR squat#490 once it's green.
What's added
e2e/kilo-kind-cilium.yaml— Kilo DaemonSet for the Cilium-CNIcluster: no kilo CNI ConfigMap, no
install-cniinit container,--cni=false --compatibility=ciliume2e/lib.sh—install_cilium()(Helm, Cilium 1.16.5, VXLANtunnelProtocol,ipam.mode=kubernetes, host firewall off) andcreate_cilium_cluster()mirroringcreate_cluster()e2e/cilium-setup.sh,cilium-cross-mesh.sh,cilium-teardown.sh— three-stage suite mirroring the bridge
setup/...mesh/teardownpattern
Makefile— newe2e-ciliumtarget, kept separate frome2ebecause the Cilium cluster is incompatible with Kilo's bridge CNI
used by the default suite
Scope
Baseline coverage is intentionally narrow:
crossgranularity is exercised;fullandlocationwithCilium CNI are reasonable follow-ups in the same harness.
kubeProxyReplacementis off, so this checks Kilo's Cilium-overlayhandling without entangling Cilium's eBPF service LB. KPR coverage
can be added via a values flag in a follow-up.
Validation note
helmmust be available on the runner. The author couldn't runmake e2e-ciliumlocally (Docker Desktop on macOS), so behaviour willfirst be observed in CI; tuning (Cilium version pin, MTU, timeouts)
may be needed once it runs.