
docs: lineage-controller-webhook configuration guide #513

Open
lllamnyp wants to merge 1 commit into main from docs/2481

Conversation

@lllamnyp
Member

@lllamnyp lllamnyp commented Apr 28, 2026

Summary

  • Adds content/en/docs/next/operations/configuration/lineage-controller-webhook.md (weight 40, sibling of the existing Components reference page).
  • Documents the five user-facing knobs (deployment.enabled, deployment.replicas, nodeAffinity, tolerations, localK8sAPIEndpoint.enabled) and three worked Package CR overrides covering the common non-default topologies: Deployment on a large control plane, managed Kubernetes / Cozy-in-Cozy tenants, and dedicated webhook nodes.
  • Explains the chart's validation guard (localK8sAPIEndpoint.enabled: true + nodeAffinity: [] is rejected at render time) and adds a warning-style callout for the case the guard cannot catch — non-apiserver-bearing custom nodeAffinity with the local endpoint left enabled, which silently produces crash-looping pods and tenant CREATE/UPDATE outage with failurePolicy: Fail.
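For orientation, one of the documented overrides might look roughly like the following. This is a sketch only: the `Package` CR apiVersion and the exact values layout are assumptions for illustration, not confirmed by this PR — the authoritative examples are in the doc page itself.

```yaml
# Hypothetical sketch of a cozystack.cozystack-engine Package override for the
# managed-Kubernetes / Cozy-in-Cozy case: run as a Deployment without the
# local-apiserver endpoint. apiVersion and values nesting are assumptions.
apiVersion: cozystack.io/v1alpha1
kind: Package
metadata:
  name: cozystack.cozystack-engine
spec:
  values:
    lineage-controller-webhook:
      deployment:
        enabled: true        # run as a Deployment rather than the default shape
        replicas: 2
      nodeAffinity: []       # no control-plane preference on managed clusters
      tolerations: []
      localK8sAPIEndpoint:
        enabled: false       # must be off when pods may land off the control plane
```

Note how this override pairs `localK8sAPIEndpoint.enabled: false` with a custom `nodeAffinity` — exactly the combination the page's warning callout is about.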

Companion PR

Companion to cozystack/cozystack#2481, which exposes the underlying knobs in the chart. The website page is targeted at next/ since v1.3 docs aren't registered yet and these knobs ship in the next release.

Test plan

  • make serve — confirm the page renders, the cross-link to Components resolves, and the alert callout displays correctly.
  • Visually inspect the four code blocks; confirm Package CR overrides are valid YAML.
  • Verify the page appears in the Configuration sidebar between Components (weight 30) and White Labeling (weight 50).

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Documentation
    • Added a new Lineage Controller Webhook guide describing its behavior and impact on resource labeling.
    • Documents admission requirement (failurePolicy: Fail) and operational implications.
    • Provides deployment topology guidance (replicas, affinity, anti-affinity, PDB, service distribution) and a Helm override example.
    • Warns about a deprecated flag that can cause crashes/admission outages.
    • Includes verification commands and a dry-run validation flow.

@lllamnyp lllamnyp requested a review from kvaps as a code owner April 28, 2026 07:06
@netlify

netlify Bot commented Apr 28, 2026

Deploy Preview for cozystack ready!

Name Link
🔨 Latest commit dab51ee
🔍 Latest deploy log https://app.netlify.com/projects/cozystack/deploys/69f0a9f340e3410008dbb5e3
😎 Deploy Preview https://deploy-preview-513--cozystack.netlify.app

@coderabbitai
Contributor

coderabbitai Bot commented Apr 28, 2026

Warning

Rate limit exceeded

@lllamnyp has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 20 minutes and 9 seconds before requesting another review.

To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b84bd0dd-c5b0-49fa-a043-f9ae32f40e1e

📥 Commits

Reviewing files that changed from the base of the PR and between 7c5db43 and dab51ee.

📒 Files selected for processing (1)
  • content/en/docs/next/operations/configuration/lineage-controller-webhook.md
📝 Walkthrough

Walkthrough

A new documentation page adds configuration and operational details for the Lineage Controller Webhook: its mutating behavior that stamps owning application labels, failurePolicy: Fail requirement, default deployment topology and affinity/toleration guidance, Helm override example, deprecated flag warning, verification commands, and a dry-run walkthrough.

Changes

Cohort / File(s) Summary
Lineage Controller Webhook Documentation
content/en/docs/next/operations/configuration/lineage-controller-webhook.md
Adds a new doc page describing the lineage-controller mutating admission webhook, label-stamping behavior on tenant CREATE/UPDATE, failurePolicy: Fail requirement, default Deployment-like topology (replica guidance, soft affinity/anti-affinity, tolerations, PDB), Service trafficDistribution note, Helm Package override example, deprecated localK8sAPIEndpoint.enabled warning, and verification/dry-run commands.
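The default shape summarized above can be sketched in plain Kubernetes terms. This is an illustration assembled from the walkthrough's description, not the chart's actual rendered manifests; names and labels are assumed:

```yaml
# Illustrative rendering of the described defaults (not the chart output).
# Soft control-plane preference: pods still schedule elsewhere if needed.
affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        preference:
          matchExpressions:
            - key: node-role.kubernetes.io/control-plane
              operator: Exists   # matches Talos's "" and kubeadm/k3s's "true"
tolerations:
  - operator: Exists   # permissive: tolerate all taints
---
# PDB keeping at least one replica up during voluntary disruptions.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: lineage-controller-webhook   # name assumed for illustration
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: lineage-controller-webhook   # selector label assumed
```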

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰 I hop through docs with a bright little cheer,
I stamp tiny labels that everyone hears,
Replicas aligned, affinities neat,
Helm charts and checks make deployment complete,
Carrots and configs — a rabbit’s small feat! 🥕✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
  • Description Check: ✅ Passed. Check skipped — CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title clearly and specifically identifies the main change: a new documentation guide for the lineage-controller-webhook configuration, matching the single file addition of lineage-controller-webhook.md.
  • Docstring Coverage: ✅ Passed. No functions found in the changed files to evaluate; docstring coverage check skipped.
  • Linked Issues Check: ✅ Passed. Check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes Check: ✅ Passed. Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces a comprehensive documentation page for configuring the Lineage Controller Webhook, detailing its default DaemonSet topology and providing specific override examples for managed Kubernetes and custom node pools. The review feedback focuses on enhancing technical clarity and accuracy, including explicitly listing label keys to avoid ambiguous shell-style notation, clarifying the mapping of environment variables to Kubernetes field paths, and noting the version requirements for the trafficDistribution feature.

The **lineage controller webhook** is a mutating admission webhook shipped as
part of the `cozystack.cozystack-engine` Package. On every CREATE and UPDATE
of a tenant `Pod`, `Secret`, `Service`, `PersistentVolumeClaim`,
`Ingress`, or `WorkloadMonitor` it walks up the ownership graph and stamps the
Contributor


medium

The brace expansion notation {group,kind,name} is a shell-ism that might be ambiguous to some readers. It is clearer to list the literal label keys explicitly.

Suggested change
`Ingress`, or `WorkloadMonitor` it walks up the ownership graph and stamps the
(`apps.cozystack.io/application.group`, `apps.cozystack.io/application.kind`, and `apps.cozystack.io/application.name`).

`node-role.kubernetes.io/control-plane` (using `Exists`, so it matches both
Talos's empty value and k3s/kubeadm's `"true"`).
- Each pod talks to its **local** kube-apiserver via
`KUBERNETES_SERVICE_HOST=status.hostIP`, port `6443`. This avoids a
Contributor


medium

The phrasing KUBERNETES_SERVICE_HOST=status.hostIP is slightly technically inaccurate as status.hostIP is the field path used to populate the environment variable, not the literal value. Clarifying this helps users who might be inspecting the Pod spec directly.

Suggested change
`KUBERNETES_SERVICE_HOST=status.hostIP`, port `6443`. This avoids a
`KUBERNETES_SERVICE_HOST` set to the node's IP (via `status.hostIP`), port `6443`.
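For readers who do inspect the Pod spec directly, the wiring being discussed is the standard Kubernetes downward API pattern. A minimal sketch of the container env (the 6443 port value comes from the excerpt above; everything else is standard Kubernetes API):

```yaml
# Downward-API sketch: KUBERNETES_SERVICE_HOST is populated with the node's IP
# at pod creation, so client-go dials the local kube-apiserver instead of the
# in-cluster Service VIP.
env:
  - name: KUBERNETES_SERVICE_HOST
    valueFrom:
      fieldRef:
        fieldPath: status.hostIP
  - name: KUBERNETES_SERVICE_PORT
    value: "6443"
```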

`KUBERNETES_SERVICE_HOST=status.hostIP`, port `6443`. This avoids a
Service-loop-through-itself class of bug and means the webhook stays
reachable during apiserver rolling restarts.
- The Service uses `spec.trafficDistribution: PreferClose` so the apiserver
Contributor


medium

Since trafficDistribution is a relatively new Kubernetes feature (Beta in 1.31), it is helpful to mention the version requirement here as well, consistent with the verification section.

Suggested change
- The Service uses `spec.trafficDistribution: PreferClose` so the apiserver
The Service uses `spec.trafficDistribution: PreferClose` (on Kubernetes ≥ 1.31) so the apiserver

kube-apiserver — including the "dedicated webhook nodes" example above —
**you must also set `localK8sAPIEndpoint.enabled: false`**. With the local
endpoint left enabled on a non-apiserver node, the pod will start and then
crash-loop dialing `status.hostIP:6443`. Combined with `failurePolicy: Fail`
Contributor


medium

When troubleshooting, users will see the actual IP address in the logs (e.g., dial tcp 10.0.0.5:6443: connect: connection refused) rather than the string status.hostIP. Describing the error in terms of the node's IP is more helpful for debugging.

Suggested change
crash-loop dialing `status.hostIP:6443`. Combined with `failurePolicy: Fail`
crash-loop dialing the node's IP on port 6443. Combined with `failurePolicy: Fail`

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@content/en/docs/next/operations/configuration/lineage-controller-webhook.md`:
- Line 35: Replace the incorrect phrase "falls over" with the standard
operational term "fails over" in the documentation; locate the sentence
containing "prefers a webhook endpoint on its own node, but transparently falls
over" and update it to read "...but transparently fails over" so the technical
wording is correct.
- Around line 197-198: The kubectl service creation command shown ("kubectl -n
lineage-webhook-test create service clusterip probe --clusterip=None
--dry-run=server") is missing a port mapping; add the --tcp=80:80 flag so the
service is created with a port (e.g., update that command to include
--tcp=80:80) to ensure the verification step is reliable and matches Kubernetes
requirements.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: fd2520c7-f6ae-4319-81d7-89f8b1559527

📥 Commits

Reviewing files that changed from the base of the PR and between c88b4ec and 8c08d39.

📒 Files selected for processing (1)
  • content/en/docs/next/operations/configuration/lineage-controller-webhook.md

Service-loop-through-itself class of bug and means the webhook stays
reachable during apiserver rolling restarts.
- The Service uses `spec.trafficDistribution: PreferClose` so the apiserver
prefers a webhook endpoint on its own node, but transparently falls over
Contributor


⚠️ Potential issue | 🟡 Minor

Use “fails over” instead of “falls over.”

This reads like a typo in operational wording; “fails over” is the standard term here.

Suggested edit
--  prefers a webhook endpoint on its own node, but transparently falls over
+-  prefers a webhook endpoint on its own node, but transparently fails over

Comment on lines +197 to +198
kubectl -n lineage-webhook-test create service clusterip probe \
--clusterip=None --dry-run=server
Contributor

@coderabbitai coderabbitai Bot Apr 28, 2026


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

For kubectl (current stable docs), does `kubectl create service clusterip NAME` require `--tcp`/port arguments? Please cite the command reference and examples.

💡 Result:

Yes, kubectl create service clusterip NAME requires the --tcp or port arguments. The official Kubernetes documentation shows the syntax as kubectl create service clusterip NAME [--tcp=<port>:<targetPort>] [--dry-run=server|client|none], where --tcp is listed as a required flag for specifying port mappings. All provided examples include it, such as kubectl create service clusterip my-cs --tcp=5678:8080 or for headless kubectl create service clusterip my-cs --clusterip="None", but the latter still typically pairs with ports for functionality. Without ports, the service would have no ports defined, rendering it ineffective.

Citations:


Add --tcp=80:80 to kubectl service creation command for functionality.

The kubectl create service clusterip command requires port specification via the --tcp flag. Without it, the service is created with no ports defined, making the verification step unreliable. The official Kubernetes documentation lists --tcp as necessary for port mappings.

Suggested edit
-kubectl -n lineage-webhook-test create service clusterip probe \
-  --clusterip=None --dry-run=server
+kubectl -n lineage-webhook-test create service clusterip probe \
+  --tcp=80:80 --clusterip=None --dry-run=server

Member Author


The point of this step is to create a service, whose creation is gated by the lineage webhook. If the creation fails due to a webhook error, the user is immediately alerted. If it works fine, the most basic function of the webhook is proven to work. The service does not need to point to anything or route real traffic anywhere.

Contributor


@lllamnyp, that's a good clarification — the --dry-run=server flag is precisely the key detail I missed. Since the request reaches the apiserver (and therefore the webhook) but is never persisted, a portless headless service is entirely valid as a probe. The command is correct as written. Apologies for the noise.


🧠 Learnings used
Learnt from: lllamnyp
Repo: cozystack/website PR: 416
File: content/en/docs/operations/oidc/self-signed-certificates.md:102-110
Timestamp: 2026-02-12T15:52:39.278Z
Learning: When documenting kubelogin usage with kubectl config set-credentials, omit the --exec-interactive-mode option entirely. Do not include this flag in examples or explanations. If encountered in a doc, remove it and note that interactive mode is not supported. Apply this guideline to all Markdown docs under content to ensure consistent, accurate guidance.

Contributor

@coderabbitai coderabbitai Bot left a comment


♻️ Duplicate comments (2)
content/en/docs/next/operations/configuration/lineage-controller-webhook.md (2)

42-42: ⚠️ Potential issue | 🟡 Minor

Replace “falls over” with “fails over.”

Line 42 uses non-standard operational wording; “fails over” is the correct term.

Suggested edit
-  transparently falls over to a remote endpoint otherwise. Requires
+  transparently fails over to a remote endpoint otherwise. Requires

103-104: ⚠️ Potential issue | 🟠 Major

Add a port mapping to the kubectl create service clusterip dry-run command.

The example omits required port mapping, so this verification step can be misleading.

Suggested edit
 kubectl -n lineage-webhook-test create service clusterip probe \
-  --clusterip=None --dry-run=server
+  --tcp=80:80 --clusterip=None --dry-run=server

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1a36716b-1a2d-4ba4-b536-589530fb65e7

📥 Commits

Reviewing files that changed from the base of the PR and between 8c08d39 and 7c5db43.

📒 Files selected for processing (1)
  • content/en/docs/next/operations/configuration/lineage-controller-webhook.md

Add operations/configuration/lineage-controller-webhook.md (weight
40, sibling of components.md) describing the webhook for cluster
admins.

The page covers what the webhook does (mutating admission on tenant
pods/secrets/services/PVCs/ingresses/workloadmonitors, walking up the
ownership graph to stamp Cozystack Application identity labels), the
default deployment shape it ships with (single Deployment, two
replicas, soft preferred control-plane nodeAffinity, permissive
tolerations, soft podAntiAffinity, unconditional PDB with
maxUnavailable: 1, Service with trafficDistribution: PreferClose),
how to bump replicas via the cozystack.cozystack-engine Package
override, the deprecation note for localK8sAPIEndpoint.enabled (with
a warning about the crash-loop footgun if it is enabled while the
soft control-plane affinity sends the pod off-control-plane), and a
verification recipe with kubectl plus a webhook-reachability probe.

Companion to cozystack/cozystack#2481.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Timofei Larkin <lllamnyp@gmail.com>