
feat(actor-template): per-container securityContext #73

Open
Davanum Srinivas (dims) wants to merge 1 commit into agent-substrate:main from dims:feat/actor-template-capabilities


@dims
Collaborator

@dims Davanum Srinivas (dims) commented May 24, 2026

Add an opt-in securityContext block on ActorTemplate.spec.containers[], plumbed through ateletpb to atelet's OCI bundle builder. Templates that omit it produce the same OCI bundle as before.

Two fields are exposed:

  • capabilities.add — Linux capabilities to grant on top of the default sandbox set (CAP_AUDIT_WRITE, CAP_KILL, CAP_NET_BIND_SERVICE). Entries may be written with or without the CAP_ prefix; case is normalised; duplicates collapse against the defaults.

  • runAsUser / runAsGroup — the UID and GID to start the container process as. Unset preserves atelet's existing default of root.

The motivating workload is NVIDIA OpenShell's openshell-sandbox supervisor, which needs CAP_NET_ADMIN, CAP_SETUID, CAP_SETGID to configure the actor's network and user namespaces, and a non-root start UID for the supervisor process itself. Capabilities alone are not enough — the entry point still runs as root until something drops privileges.

Test plan:

  • go vet clean on touched packages
  • Unit tests for resolveCapabilities: defaults, prefix normalisation, case folding, dedup, blank-entry skip
  • ContainerSecurityContext DeepCopy round-trip with pointer-isolation assertions for Capabilities.Add and RunAsUser
  • cmd/ateapi/internal/controlapi workflow tests pass with the new copy block in resume + suspend
  • kind end-to-end with a template that opts into both fields
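
For readers unfamiliar with the pointer-isolation check mentioned in the test plan, here is a rough sketch of what such a round-trip could look like. The struct shapes are assumptions modelled on the Kubernetes convention (in particular, `Capabilities` being a pointer field is a guess), and the hand-written `DeepCopy` stands in for whatever the project's generator emits.

```go
package main

import "fmt"

// Capabilities and ContainerSecurityContext are hypothetical shapes; the real
// CRD types in the PR may differ.
type Capabilities struct {
	Add []string
}

type ContainerSecurityContext struct {
	Capabilities *Capabilities
	RunAsUser    *int64
	RunAsGroup   *int64
}

// DeepCopy returns a copy that shares no slices or pointers with the receiver.
func (in *ContainerSecurityContext) DeepCopy() *ContainerSecurityContext {
	if in == nil {
		return nil
	}
	out := &ContainerSecurityContext{}
	if in.Capabilities != nil {
		out.Capabilities = &Capabilities{}
		if in.Capabilities.Add != nil {
			out.Capabilities.Add = make([]string, len(in.Capabilities.Add))
			copy(out.Capabilities.Add, in.Capabilities.Add)
		}
	}
	if in.RunAsUser != nil {
		v := *in.RunAsUser
		out.RunAsUser = &v
	}
	if in.RunAsGroup != nil {
		v := *in.RunAsGroup
		out.RunAsGroup = &v
	}
	return out
}

func main() {
	uid := int64(1000)
	orig := &ContainerSecurityContext{
		Capabilities: &Capabilities{Add: []string{"NET_ADMIN"}},
		RunAsUser:    &uid,
	}
	cp := orig.DeepCopy()

	// Mutate the copy; the original must not observe either change.
	cp.Capabilities.Add[0] = "SETUID"
	*cp.RunAsUser = 0

	fmt.Println(orig.Capabilities.Add[0], *orig.RunAsUser)
}
```

The assertions that matter are the ones made after mutating the copy: if `DeepCopy` aliased the slice or the `*int64`, the original would change too.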

@dims Davanum Srinivas (dims) changed the title from "[WIP] Feat/actor template capabilities" to "feat(actor-template): per-container securityContext" May 24, 2026
Davanum Srinivas (dims) added a commit to dims/openshell-driver-substrate that referenced this pull request May 24, 2026
…text)

The substrate-side PR #73 — per-container `securityContext` on
`ActorTemplate.spec.containers[]` with both `capabilities.add` and
`runAsUser` / `runAsGroup` — is the field that lets this driver's
`synthesize_template` start emitting capability adds and a non-root
supervisor start UID once it merges. Empty templates produce the same
OCI bundle as before; opt-in per container.

Surface the PR in three places: the top-of-doc header in poc-intro
(alongside #66 and #67), the §3 "Companion changes" component table,
and the §9 "Where to next" item 8 that was previously an open TODO
about capability plumbing.

Also tidy the embedded `~/notes/...` references in poc-intro: the
local agent-substrate notes (kind-local-dev runbook, Shorewall recipe)
moved from `~/notes/` to `~/notes/agent-substrate/` to mirror the
existing `~/notes/openshell-on-substrate/` layout.

Signed-off-by: Davanum Srinivas <dsrinivas@nvidia.com>
@ahmedtd Taahir Ahmed (ahmedtd) self-requested a review May 26, 2026 23:23
@BenTheElder Benjamin Elder (BenTheElder) added the feature An enhancement / feature request or implementation label May 27, 2026
@a4-a4s1

a4-a4s1 Bot commented May 27, 2026

Looking at #73 alongside the production-readiness thread already on #20:

Two layers showing up — spec-default vs admission-policy — both with K8s precedent.

One Q for maintainers: is a convention forming for which hardening axes land at which layer, or is this contributor-discretion right now? The first axis merged tends to anchor the shape for the rest.

[🤖a4s1]

@mtaufen
Collaborator

Where does the openshell-sandbox supervisor sit in the architecture? Is it duplicating some of the jobs ateom has, or is it more like runsc or a VMM? If the latter, we have plans to support multiple sandbox technologies, so maybe it fits best there? Taahir Ahmed (@ahmedtd), curious for your thoughts.

@dims
Collaborator Author

Michael Taufen (@mtaufen), openshell-sandbox is an in-container supervisor that runs as PID 1 of the workload container, inside the gVisor sandbox that ateom-gvisor + runsc already set up. As I understand it, the layout currently looks like this:

ateom-gvisor → runsc (single gVisor sandbox kernel)
                ├── pause container        (pid ns A, mount ns A, ...)
                │   └── PID 1: /pause
                └── supervisor container   (pid ns B, mount ns B, ...)
                    └── PID 1: openshell-sandbox  ← here
                        └── PID 2: workload (e.g. python3 agent.py)

It's just a regular Linux process under runsc, not a peer of runsc or a duplicate of ateom. Its job is application-layer policy enforcement: OPA/rego rules over outbound syscalls and HTTP, Landlock filesystem allowlists, capability bounding, and drop_privileges from root to a non-root UID before execve-ing the actual workload. None of that overlaps with what ateom or runsc do.

This PR is needed because that supervisor has to start with a couple of capabilities (CAP_SETUID / CAP_SETGID for the privilege drop, etc.) and a configurable runAsUser — knobs that the driver currently can't request through ActorTemplate. The multi-sandbox-technology direction is orthogonal: openshell-sandbox would still want this same securityContext regardless of whether the underlying sandbox is runsc, Kata, or anything else.

@dims Davanum Srinivas (dims) force-pushed the feat/actor-template-capabilities branch from 1f42a2a to 5eb794e Compare May 27, 2026 21:57
…unAs)

Add an opt-in `securityContext` block on `ActorTemplate.spec.containers[]`
carrying two K8s-shape sub-fields:

  - capabilities.add:  []string of Linux caps to add on top of the
                       default sandbox set (CAP_AUDIT_WRITE,
                       CAP_KILL, CAP_NET_BIND_SERVICE)
  - runAsUser/runAsGroup: *int64, the UID/GID the container's
                       entrypoint starts as

Empty templates produce the same OCI bundle as before. The pause
container is unaffected — it always runs as root with the default
sandbox cap set.

Plumbing:

  ActorTemplate.spec.containers[].securityContext
    → ateletpb.Container.security_context
    → atelet's prepareOCIDirectory (via prepareOCIBundles)
    → OCI process.capabilities.{Bounding,Effective,Inheritable,Permitted}
      and process.user.{uid,gid}
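
As an illustration of the last hop in this chain, the sketch below fills the four OCI capability sets and the user block from an already-resolved capability list. The struct types are local mirrors of the OCI runtime-spec JSON field names, and `buildProcessSketch` is a hypothetical helper, not atelet's prepareOCIDirectory. Giving all four sets the same contents is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local mirrors of the OCI runtime-spec fields named in the plumbing above.
type ociCapabilities struct {
	Bounding    []string `json:"bounding"`
	Effective   []string `json:"effective"`
	Inheritable []string `json:"inheritable"`
	Permitted   []string `json:"permitted"`
}

type ociUser struct {
	UID uint32 `json:"uid"`
	GID uint32 `json:"gid"`
}

type ociProcess struct {
	User         ociUser         `json:"user"`
	Capabilities ociCapabilities `json:"capabilities"`
}

// buildProcessSketch (hypothetical) copies caps into each of the four sets so
// that later edits to one set cannot leak into another.
func buildProcessSketch(caps []string, uid, gid uint32) ociProcess {
	clone := func() []string {
		c := make([]string, len(caps))
		copy(c, caps)
		return c
	}
	return ociProcess{
		User: ociUser{UID: uid, GID: gid},
		Capabilities: ociCapabilities{
			Bounding:    clone(),
			Effective:   clone(),
			Inheritable: clone(),
			Permitted:   clone(),
		},
	}
}

func main() {
	caps := []string{"CAP_AUDIT_WRITE", "CAP_KILL", "CAP_NET_BIND_SERVICE", "CAP_SETUID", "CAP_SETGID"}
	p := buildProcessSketch(caps, 1000, 1000)
	b, err := json.MarshalIndent(p, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
}
```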

`resolveCapabilities` in cmd/atelet/oci.go normalises each entry to
its CAP_… form so templates may write either `NET_ADMIN` or
`CAP_NET_ADMIN`, and de-duplicates against the default set.

`RunAsUser` / `RunAsGroup` are bare `int64` on the wire but `*int64`
in the CRD. At the proto boundary "unset" and "0" both mean root, and
atelet's OCI bundle builder collapses them into the same Process.User
block. The CRD shape keeps `*int64` so K8s users can express the
usual "unset vs. explicit 0" distinction in YAML even though the
runtime ignores it.
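
A minimal sketch of that collapse, assuming a hypothetical helper named `wireID` at the CRD-to-proto conversion point (the actual conversion code in the PR is not shown here):

```go
package main

import "fmt"

// wireID (hypothetical) converts the CRD's optional *int64 into the bare int64
// carried on the wire. A nil pointer and an explicit 0 both become 0, so the
// runtime cannot tell "unset" from "root".
func wireID(id *int64) int64 {
	if id == nil {
		return 0
	}
	return *id
}

func main() {
	zero, nonRoot := int64(0), int64(1000)
	fmt.Println(wireID(nil), wireID(&zero), wireID(&nonRoot))
}
```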

The two halves are useful together: `Capabilities.Add` alone only
enables `setresuid` inside the running process (useful for
supervisors that drop privileges mid-startup), but the entry point
still runs as root until they do. `RunAsUser` is the field that
makes the container actually *start* at a non-root UID.

A gVisor compatibility spike confirmed runsc honours the OCI cap set
exactly: granting CAP_SETUID/CAP_SETGID unblocks `setresuid` inside
the actor, while `unshare(CLONE_NEWNET)` remains refused regardless
of caps (architectural refusal in the sentry, unrelated to capability
bits).

The motivating workload is NVIDIA OpenShell's `openshell-sandbox` —
an in-container policy supervisor that runs as PID 1 of the
workload's sub-container, needs CAP_NET_ADMIN/CAP_SETUID/CAP_SETGID
to configure user namespaces and prepare the workload's filesystem,
then drops privileges to a non-root UID before exec'ing the inner
workload. The field godoc and CRD description describe this in
generic terms; the test fixtures use `app` / `registry.example/app:test`
rather than naming the downstream consumer.

Tests cover `resolveCapabilities` normalisation/dedup/blank-skip and
a round-trip DeepCopy of `ContainerSecurityContext`.

Signed-off-by: Davanum Srinivas <davanum@gmail.com>
@dims Davanum Srinivas (dims) force-pushed the feat/actor-template-capabilities branch from 5eb794e to 45ad373 Compare May 27, 2026 22:20