feat(provider/lambdai): Kubernetes node-data-broker support #375

Open
alaski-lambda wants to merge 1 commit into NVIDIA:main from alaski-lambda:feat/lambdai-node-data-broker

alaski-lambda commented Jun 30, 2026

Description

Add lambdai.GetNodeAnnotations, which derives Topograph's instance and region node annotations from a node's .spec.providerID (lambda://<id>) and its topology.kubernetes.io/region label. The node-data-broker dispatches to it so the k8s engine can discover and label Lambda nodes. Register lambdai in the provider overview and docs nav.
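
As a rough illustration of the derivation described above, the following self-contained sketch works on plain values instead of a Kubernetes client. It is not the code in this PR: the helper name `nodeAnnotations` and the sample values in `main` are hypothetical, and the annotation keys `topograph.nvidia.com/instance` and `topograph.nvidia.com/region` are assumed here.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

const (
	// Assumed annotation keys; not taken from the PR's source code.
	keyInstance = "topograph.nvidia.com/instance"
	keyRegion   = "topograph.nvidia.com/region"
	// Well-known Kubernetes topology label named in the description.
	regionLabel = "topology.kubernetes.io/region"
	// Provider ID scheme named in the description.
	providerPrefix = "lambda://"
)

// nodeAnnotations is a hypothetical stand-in for lambdai.GetNodeAnnotations.
// It takes the node's providerID and labels directly and returns an error
// whenever either input is not yet usable, so a caller can retry later.
func nodeAnnotations(providerID string, labels map[string]string) (map[string]string, error) {
	id, ok := strings.CutPrefix(providerID, providerPrefix)
	if !ok || id == "" {
		return nil, fmt.Errorf("providerID %q is not of the form %s<id>", providerID, providerPrefix)
	}
	// Indexing a nil map is safe in Go and yields the zero value "".
	region := labels[regionLabel]
	if region == "" {
		return nil, errors.New("node has no " + regionLabel + " label")
	}
	return map[string]string{keyInstance: id, keyRegion: region}, nil
}

func main() {
	// Sample inputs are made up for the demonstration.
	ann, err := nodeAnnotations("lambda://example-id", map[string]string{regionLabel: "example-region"})
	fmt.Println(ann, err)

	_, err = nodeAnnotations("", nil)
	fmt.Println(err)
}
```

Returning an error for an empty or unprefixed providerID, instead of an empty annotation set, lets the caller tell "node not initialized yet" apart from success.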

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.
  • All commits are signed off per DCO (git commit -s).

Add lambdai.GetNodeAnnotations, which derives Topograph's instance and
region node annotations from a node's .spec.providerID (lambda://<id>)
and its topology.kubernetes.io/region label, and dispatch to it from the
node-data-broker so the k8s engine can discover and label Lambda nodes.
Register lambdai in the provider overview and docs nav.

Signed-off-by: Andrew Laski <alaski@lambdal.com>

copy-pr-bot Bot commented Jun 30, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

greptile-apps Bot commented Jun 30, 2026

Greptile Summary

This PR adds Kubernetes node-data-broker support for the Lambda AI provider by implementing GetNodeAnnotations in pkg/providers/lambdai/k8s.go, wiring it into the broker's provider switch, and registering Lambda in the docs nav and overview.

  • GetNodeAnnotations derives the Topograph instance annotation from spec.providerID (lambda://<id>) and the region annotation from the topology.kubernetes.io/region label, with correct error-on-empty behavior so the init container retries until the lambda-cloud-controller has initialized the node.
  • The docs entries in docs/index.yml and docs/overview.md reference providers/lambdai.md, but that file is not present in the repository — every other provider in the nav has a corresponding doc file, so this will produce a broken link on the docs site.

Confidence Score: 4/5

The core Go implementation is solid and the broker wiring is correct, but the docs nav and overview reference a provider page that does not yet exist.

The new GetNodeAnnotations function, its tests, and the broker switch entry are all correct. The only gap is that docs/providers/lambdai.md is referenced in both docs/index.yml and docs/overview.md but is absent from the repository, which will produce broken links on the docs site.

docs/index.yml and docs/overview.md reference providers/lambdai.md, which does not exist.

Important Files Changed

| Filename | Overview |
| --- | --- |
| pkg/providers/lambdai/k8s.go | New GetNodeAnnotations function; extracts instance ID from spec.providerID and region from topology label. Logic is correct, nil-map access is safe in Go, and error paths retry correctly in the init container. |
| pkg/providers/lambdai/k8s_test.go | Five table-driven test cases covering success, node-not-found, missing prefix, empty providerID, and missing region label; good coverage. |
| cmd/node-data-broker/main.go | Adds lambdai.NAME case to the provider switch and imports the lambdai package; wiring is correct. |
| docs/index.yml | Adds Lambda nav entry pointing to providers/lambdai.md, which does not exist in the repository; broken link. |
| docs/overview.md | Adds Lambda to the supported providers list and the scenario table; links to the non-existent providers/lambdai.md. |
| charts/topograph/values.yaml | Comment updated to include lambdai as a valid provider name value; straightforward documentation fix. |
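
The provider switch mentioned for cmd/node-data-broker/main.go might look roughly like the sketch below. This is an assumption-laden illustration, not the broker's code: `annotationsFor` and `lambdaiAnnotations` are hypothetical names, the stub returns made-up values instead of calling the Kubernetes API, and the string `"lambdai"` is assumed to be the value of lambdai.NAME.

```go
package main

import "fmt"

// providerLambdai is assumed to match the value of lambdai.NAME.
const providerLambdai = "lambdai"

// lambdaiAnnotations is a stub standing in for lambdai.GetNodeAnnotations.
// The real function reads the Node object; this one returns made-up values.
func lambdaiAnnotations(nodeName string) (map[string]string, error) {
	return map[string]string{
		"topograph.nvidia.com/instance": "example-id-for-" + nodeName,
		"topograph.nvidia.com/region":   "example-region",
	}, nil
}

// annotationsFor dispatches on the configured provider name and fails
// loudly for names it does not know, so a typo in configuration is visible.
func annotationsFor(provider, nodeName string) (map[string]string, error) {
	switch provider {
	case providerLambdai:
		return lambdaiAnnotations(nodeName)
	default:
		return nil, fmt.Errorf("unsupported provider %q", provider)
	}
}

func main() {
	ann, err := annotationsFor(providerLambdai, "node-1")
	fmt.Println(ann, err)

	_, err = annotationsFor("unknown", "node-1")
	fmt.Println(err)
}
```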

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant NDB as node-data-broker (init container)
    participant K8s as Kubernetes API
    participant Lambda as lambda-cloud-controller

    Lambda->>K8s: Sets Node.spec.providerID = "lambda://<id>"
    Lambda->>K8s: Sets Node label topology.kubernetes.io/region

    NDB->>K8s: GET Node (lambdai.GetNodeAnnotations)
    alt providerID or region label missing
        K8s-->>NDB: Node missing fields
        NDB-->>NDB: Return error - init container retries
    else fields present
        K8s-->>NDB: Node with providerID + region label
        NDB->>NDB: Extract instance ID via strings.CutPrefix
        NDB->>K8s: GET Node (broker.apply)
        NDB->>K8s: PATCH Node annotations (topograph.nvidia.com/instance, topograph.nvidia.com/region)
    end
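
The final PATCH step in the diagram can be pictured with a small sketch that builds a request body setting the two annotations. The review does not say which patch type the broker uses; a JSON merge patch is assumed here, and `annotationPatch` plus the sample values are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// annotationPatch builds a JSON merge patch body that sets the given
// annotations under metadata.annotations. Assumption: the broker may use a
// different patch type or go through a typed client instead.
func annotationPatch(annotations map[string]string) ([]byte, error) {
	patch := map[string]any{
		"metadata": map[string]any{
			"annotations": annotations,
		},
	}
	return json.Marshal(patch)
}

func main() {
	// Sample values are made up for the demonstration.
	body, err := annotationPatch(map[string]string{
		"topograph.nvidia.com/instance": "example-id",
		"topograph.nvidia.com/region":   "example-region",
	})
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(string(body))
}
```

A merge patch of this shape only touches the listed annotation keys and leaves other annotations on the Node in place.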

Reviews (1): Last reviewed commit: "feat(provider/lambdai): Kubernetes node-..."

Comment thread docs/index.yml
Comment on lines +35 to +36
- page: Lambda
  path: providers/lambdai.md

P1 Missing documentation file — broken nav link

docs/index.yml and docs/overview.md both reference providers/lambdai.md, but the file docs/providers/lambdai.md does not exist in the repository. Every other provider entry in the nav (aws, gcp, oci, nebius, nscale, infiniband, netq, dra) has a corresponding file under docs/providers/. Clicking the "Lambda" nav entry will result in a 404 on the docs site. A provider page is needed before this navigation entry can be wired up.
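
A check along the following lines could catch this class of broken nav entry before merge. It is only a sketch under stated assumptions: `missingPages` and `exampleDocs` are hypothetical names, the nav paths are passed in as a plain slice instead of being parsed from docs/index.yml, and the in-memory file tree is made up for the demonstration.

```go
package main

import (
	"fmt"
	"io/fs"
	"testing/fstest"
)

// missingPages returns every nav path that cannot be stat'ed under the
// given docs root. Any stat error is treated as "unresolved".
func missingPages(docs fs.FS, paths []string) []string {
	var missing []string
	for _, p := range paths {
		if _, err := fs.Stat(docs, p); err != nil {
			missing = append(missing, p)
		}
	}
	return missing
}

// exampleDocs is a made-up in-memory docs tree that contains an aws page
// but no lambdai page, mirroring the situation the comment describes.
func exampleDocs() fs.FS {
	return fstest.MapFS{
		"providers/aws.md": &fstest.MapFile{Data: []byte("# AWS\n")},
	}
}

func main() {
	nav := []string{"providers/aws.md", "providers/lambdai.md"}
	fmt.Println(missingPages(exampleDocs(), nav))
}
```

Against a real checkout, `os.DirFS("docs")` could stand in for `exampleDocs()`.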
