argv-safety: register children at fork time #35

Open
congwang-mk wants to merge 4 commits into main from argv-safety-fork-tracking

Conversation

@congwang-mk
Contributor

Summary

  • When any consumer can inspect execve argv from child memory (policy_fn, or an extra handler bound to execve/execveat that can call read_child_mem), the supervisor now arms one-shot ptrace fork-event tracking (PTRACE_O_TRACEFORK, PTRACE_O_TRACEVFORK, PTRACE_O_TRACECLONE) on every fork-class notification. New children are registered in ProcessIndex before they can run user code, so the execve argv-safety freeze can enumerate every tracked TGID and PTRACE_SEIZE every thread that could mutate argv between the supervisor's inspection and the kernel's post-Continue re-read.
  • Default policies are unaffected: bare fork(2) only enters the BPF notif filter when policy_fn is set or an extra handler is bound to exec, so the COW map-reduce hot path keeps bypassing the supervisor.
  • ProcessCreationTrace is an RAII guard, so panics or early returns can't leak ptrace attachments. The blocking waitpid calls run on tokio::task::spawn_blocking, so a long wait can't stall a notification worker.
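
For readers unfamiliar with the RAII pattern mentioned in the last bullet, here is a minimal sketch of why a Drop-based guard cannot leak an attachment on an early return or a panic. It is not the PR's ProcessCreationTrace: `TraceGuard`, `AttachLog`, `handle_fork`, and `handle_fork_panicking` are hypothetical names, and an in-memory list stands in for real ptrace state.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for kernel ptrace state: the pids currently attached.
pub type AttachLog = Arc<Mutex<Vec<i32>>>;

/// Hypothetical guard: "attaches" on construction and "detaches" in Drop.
pub struct TraceGuard {
    pid: i32,
    log: AttachLog,
}

impl TraceGuard {
    pub fn attach(pid: i32, log: AttachLog) -> Self {
        log.lock().unwrap_or_else(|e| e.into_inner()).push(pid);
        TraceGuard { pid, log }
    }
}

impl Drop for TraceGuard {
    fn drop(&mut self) {
        // Drop runs on normal return, on early return, and while a panic unwinds.
        let mut attached = self.log.lock().unwrap_or_else(|e| e.into_inner());
        attached.retain(|p| *p != self.pid);
    }
}

/// Early-return path: the guard is dropped when the function exits with Err.
pub fn handle_fork(pid: i32, log: AttachLog, fail: bool) -> Result<(), String> {
    let _guard = TraceGuard::attach(pid, log);
    if fail {
        return Err("simulated wait failure".to_string());
    }
    Ok(())
}

/// Panic path: the guard is dropped during unwinding.
pub fn handle_fork_panicking(pid: i32, log: AttachLog) {
    let _guard = TraceGuard::attach(pid, log);
    panic!("simulated failure while the child is attached");
}
```

After any of the three exit paths the log is empty again. This holds only when panics unwind; a build with `panic = "abort"` does not run Drop.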

Test plan

  • cargo test --workspace — 443/443 passing (was 441; +2 new tests in crates/sandlock-core/src/resource.rs)
  • pytest python/tests/ — 226/226 passing
  • process_creation_tracking_predicates_follow_argv_safety_gate locks in the gating across clone/clone3/fork/vfork and the CLONE_THREAD bypass
  • process_creation_tracking_registers_child_before_user_code_runs validates the central safety property end-to-end via a MAP_SHARED flag page: a raw-fork child is observed to be registered while still ptrace-stopped (before its first user-mode instruction). Skips gracefully if YAMA ptrace_scope denies the seize.
  • x86_64-only integration test (raw fork(2) syscall); other arches skip via #[cfg] — intentional
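
The gating that the predicate test locks in might look roughly like the sketch below. This is an illustration, not the code in crates/sandlock-core: `ForkSyscall`, `ArgvConsumers`, and `should_track_creation` are hypothetical names, and it assumes the "CLONE_THREAD bypass" means that thread-creating clones skip process-creation tracking because they add no new TGID. The `CLONE_THREAD` value is the one from `<linux/sched.h>`.

```rust
/// Fork-class syscalls the supervisor can be notified about (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ForkSyscall {
    Fork,
    Vfork,
    Clone { flags: u64 },
    Clone3 { flags: u64 },
}

/// CLONE_THREAD as defined in <linux/sched.h>.
pub const CLONE_THREAD: u64 = 0x0001_0000;

/// Hypothetical summary of which consumers can read execve argv from child memory.
#[derive(Clone, Copy, Debug, Default)]
pub struct ArgvConsumers {
    pub policy_fn_set: bool,
    pub extra_exec_handler: bool,
}

impl ArgvConsumers {
    /// The argv-safety gate: true when any consumer can inspect argv.
    pub fn argv_safety_required(self) -> bool {
        self.policy_fn_set || self.extra_exec_handler
    }
}

/// Decide whether a fork-class notification should arm fork-event tracking.
pub fn should_track_creation(consumers: ArgvConsumers, call: ForkSyscall) -> bool {
    if !consumers.argv_safety_required() {
        // Default policies: nothing inspects argv, so nothing needs tracking.
        return false;
    }
    match call {
        ForkSyscall::Fork | ForkSyscall::Vfork => true,
        // Assumption: a CLONE_THREAD clone creates a thread, not a new process.
        ForkSyscall::Clone { flags } | ForkSyscall::Clone3 { flags } => {
            flags & CLONE_THREAD == 0
        }
    }
}
```

Keeping the decision in one pure function is what makes the gating easy to pin down in a unit test across all four syscalls.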

🤖 Generated with Claude Code

…tras

When any consumer can inspect execve argv from child memory — the
policy_fn callback or an extra handler bound to execve/execveat that
can call read_child_mem — fork/clone/vfork notifications now wrap their
Continue response in one-shot ptrace fork-event tracking, so every new
child is registered in ProcessIndex before it can run user code. The
execve argv-safety freeze can then enumerate every tracked TGID and
PTRACE_SEIZE every thread that could mutate argv between supervisor
inspection and the kernel's post-Continue re-read.

Signed-off-by: Cong Wang <cwang@multikernel.io>
…king

Signed-off-by: Cong Wang <cwang@multikernel.io>
Signed-off-by: Cong Wang <cwang@multikernel.io>
Signed-off-by: Cong Wang <cwang@multikernel.io>