feat(k8s): Support imagePullSecrets for sandbox pods #1030

@kon-angelo

Problem Statement

OpenShell currently does not support pulling sandbox images from private container registries that require authentication. While the gateway StatefulSet supports imagePullSecrets via Helm values, sandbox pods created by the Kubernetes compute driver do not inherit or configure these secrets.

Users cannot:

  1. Use custom sandbox images stored in private registries (Docker Hub private repos, ECR, GCR, ACR, etc.)
  2. Leverage organization-specific base images with proprietary tooling
  3. Deploy sandboxes in air-gapped environments with internal registries
  4. Authenticate to registries to avoid anonymous pull rate limits

When a user specifies a sandbox image from a private registry:

openshell sandbox create my-sandbox --image my-registry.io/my-org/my-sandbox:v1

The sandbox pod fails to start with:

Failed to pull image "my-registry.io/my-org/my-sandbox:v1": 
rpc error: code = Unknown desc = failed to pull and unpack image 
"my-registry.io/my-org/my-sandbox:v1": failed to resolve reference 
"my-registry.io/my-org/my-sandbox:v1": pull access denied, 
repository does not exist or may require authorization

Proposed Design

Add support for configuring Kubernetes imagePullSecrets on sandbox pods through two levels:

1. Global Default (Helm/Server Config):

  • Helm value: server.sandboxImagePullSecrets: []
  • Env var: OPENSHELL_SANDBOX_IMAGE_PULL_SECRETS (comma-separated)
  • Applied to all sandbox pods by default

2. Per-Sandbox Override (Platform Config):

  • Allow platform_config.image_pull_secrets in SandboxTemplate
  • Overrides global default when specified
  • Enables per-sandbox registry credentials
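The precedence between the two levels can be sketched as a small pure function. This is a minimal illustration, not the driver's code: `resolve_image_pull_secrets` and its parameters are hypothetical names, and it assumes the per-sandbox value replaces the global list entirely rather than merging with it. It also assumes that an explicitly empty override means "no secrets", which the proposal does not spell out.

```rust
/// Pick the secret names for one sandbox pod.
///
/// `global_default` stands in for the server-wide list and `per_sandbox`
/// for the optional `platform_config.image_pull_secrets` value.
/// Both names are hypothetical and used only for this sketch.
fn resolve_image_pull_secrets(
    global_default: &[String],
    per_sandbox: Option<&[String]>,
) -> Vec<String> {
    match per_sandbox {
        // An explicit override replaces the global list (no merging).
        // Assumption: an empty override yields no secrets at all.
        Some(names) => names.to_vec(),
        // No override: fall back to the global default.
        None => global_default.to_vec(),
    }
}
```

Replacing rather than merging keeps the result predictable: a sandbox that names its own secrets never silently picks up credentials from the global list.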

Implementation

Core Driver Changes

File: crates/openshell-driver-kubernetes/src/config.rs

Add field to config struct:

pub struct KubernetesComputeConfig {
    // ... existing fields ...
    pub image_pull_secrets: Vec<String>,  // Secret names
}

File: crates/openshell-driver-kubernetes/src/driver.rs

In sandbox_template_to_k8s() (after line 1113, where hostAliases are added):

// Apply global default image pull secrets
if !self.config.image_pull_secrets.is_empty() {
    spec.insert(
        "imagePullSecrets".to_string(),
        serde_json::json!(
            self.config.image_pull_secrets
                .iter()
                .map(|name| serde_json::json!({"name": name}))
                .collect::<Vec<_>>()
        ),
    );
}

// Allow per-sandbox override via platform_config
if let Some(override_secrets) = platform_config_struct(template, "image_pull_secrets") {
    spec.insert("imagePullSecrets".to_string(), override_secrets);
}

Server Configuration

File: crates/openshell-server/src/cli.rs

Add CLI argument:

#[arg(long, env = "OPENSHELL_SANDBOX_IMAGE_PULL_SECRETS")]
sandbox_image_pull_secrets: Option<String>,  // comma-separated

Parse and wire through:

if let Some(secrets) = args.sandbox_image_pull_secrets {
    let secret_names: Vec<String> = secrets
        .split(',')
        .map(|s| s.trim().to_string())
        .filter(|s| !s.is_empty())
        .collect();
    config = config.with_sandbox_image_pull_secrets(secret_names);
}
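The parsing step above can be pulled into a standalone helper so it can be unit tested without the CLI. The following is a sketch under that assumption; `parse_secret_names` is a hypothetical name, and the logic is the same split, trim, and filter shown above.

```rust
/// Split a comma-separated value, such as the one carried by
/// OPENSHELL_SANDBOX_IMAGE_PULL_SECRETS, into individual secret names.
/// Surrounding whitespace is trimmed and empty segments are dropped,
/// so trailing commas or doubled commas do not produce empty names.
fn parse_secret_names(raw: &str) -> Vec<String> {
    raw.split(',')
        .map(str::trim)
        .filter(|name| !name.is_empty())
        .map(str::to_string)
        .collect()
}
```

With this helper, an input of `" a ,,b, "` yields the two names `a` and `b`, and an empty string yields an empty list, which matches the "no secrets" default.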

Helm Integration

File: deploy/helm/openshell/values.yaml

Add field:

server:
  # ... existing fields ...
  # Image pull secrets for sandbox pods. List of K8s secret names.
  # Secrets must exist in the sandbox namespace and contain registry credentials.
  sandboxImagePullSecrets: []
  # Example:
  # sandboxImagePullSecrets:
  #   - my-registry-secret
  #   - docker-hub-creds

File: deploy/helm/openshell/templates/statefulset.yaml

Inject as env var (after line 70):

{{- if .Values.server.sandboxImagePullSecrets }}
- name: OPENSHELL_SANDBOX_IMAGE_PULL_SECRETS
  value: {{ .Values.server.sandboxImagePullSecrets | join "," | quote }}
{{- end }}

Backward Compatibility

No breaking changes:

  • New config field defaults to empty list (no secrets)
  • Existing sandboxes continue to work with public images
  • Helm chart adds optional value (defaults to [])
  • Env var is optional

Testing Strategy

Create test secret:

kubectl create secret docker-registry test-registry-secret \
  --docker-server=ghcr.io \
  --docker-username=test-user \
  --docker-password=test-token \
  -n openshell

Configure gateway:

helm upgrade openshell ./deploy/helm/openshell \
  --set 'server.sandboxImagePullSecrets[0]=test-registry-secret'

Create sandbox and verify:

openshell sandbox create test-private \
  --image ghcr.io/my-org/private-sandbox:latest

kubectl get pod -n openshell -l openshell.ai/sandbox-id=<id> \
  -o jsonpath='{.spec.imagePullSecrets}'
# Expected: [{"name":"test-registry-secret"}]
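The same expectation could also be checked at the unit level, before a cluster is involved. The sketch below is illustrative only: `render_image_pull_secrets` is a hypothetical helper that hand-builds the JSON with the standard library, whereas the proposed driver change uses `serde_json`. It assumes secret names need no JSON escaping, which holds for valid Kubernetes object names (lowercase alphanumerics, `-`, and `.`).

```rust
/// Render secret names in the shape of the pod's `imagePullSecrets` field,
/// e.g. [{"name":"a"},{"name":"b"}].
///
/// Returns None for an empty list, mirroring the proposal's behavior of
/// leaving the field unset when no secrets are configured.
/// Assumption: names contain no characters that require JSON escaping.
fn render_image_pull_secrets(names: &[&str]) -> Option<String> {
    if names.is_empty() {
        return None;
    }
    let entries: Vec<String> = names
        .iter()
        .map(|name| format!("{{\"name\":\"{}\"}}", name))
        .collect();
    Some(format!("[{}]", entries.join(",")))
}
```

For the single secret used in this test plan, the helper produces the same string as the expected `kubectl` output above.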

Alternatives Considered

Inherit from Gateway Pod: Copy imagePullSecrets from gateway StatefulSet to sandbox pods automatically. Rejected — gateway and sandboxes may need different registry access; no per-sandbox override capability.

Service Account Pull Secrets: Attach image pull secrets to the sandbox service account. Rejected — requires separate service account per registry, no per-sandbox granularity, more complex RBAC.

Registry Credential Operator: Deploy a Kubernetes operator that injects secrets dynamically. Rejected — too complex for this use case, adds external dependency.

Agent Investigation

Codebase surveyed prior to filing:

Kubernetes Driver (crates/openshell-driver-kubernetes/src/driver.rs):

  • Lines 987-1134: sandbox_template_to_k8s() builds the pod template JSON
  • Lines 1023-1039: Sets container image and imagePullPolicy
  • Lines 1105-1113: Adds hostAliases when configured
  • Missing: No logic to inject imagePullSecrets into spec.imagePullSecrets

Configuration (crates/openshell-driver-kubernetes/src/config.rs):

  • Lines 4-15: KubernetesComputeConfig struct contains image_pull_policy (line 8) but no image_pull_secrets field

Server Configuration (crates/openshell-server/src/cli.rs):

  • Line 84: Accepts sandbox_image_pull_policy CLI arg
  • Missing: No CLI arg or env var for image pull secrets

Helm Chart (deploy/helm/openshell/):

  • values.yaml line 13: Gateway has imagePullSecrets: []
  • values.yaml line 80: Sandbox has sandboxImagePullPolicy but no secrets config
  • Missing: No mechanism to pass secrets to sandbox pods

Gap: The Kubernetes Sandbox CRD does support imagePullSecrets in pod templates (deploy/kube/manifests/agent-sandbox.yaml lines 2074-2085), but the OpenShell driver never sets this field.

Precedent: The codebase already follows this pattern for image_pull_policy (lines 1033-1038), runtime_class_name (lines 1016-1021), and host_gateway_ip (lines 1105-1113) — all configured via env vars or platform_config and injected into pod specs.

Checklist

  • I've reviewed existing issues and the architecture docs
  • This is a design proposal, not a "please build this" request
