Allowlist mesa CVE-2026-40393 in PT 2.9 EC2 training images#6273

Merged
bhanutejagk merged 4 commits into
aws:masterfrom
bhanutejagk:patch/pt29-ec2-allowlist-mesa-cve-2026-40393
Jun 19, 2026
@bhanutejagk bhanutejagk commented Jun 19, 2026

CVE-2026-40393 is an out-of-bounds memory access in Mesa's WebGPU code path (alloca size derived from untrusted input). Fixed upstream in mesa 25.3.6 / 26.0.1; Ubuntu 22.04 (jammy) is currently "Needs evaluation" with no patched package available yet.

DLC training containers do not expose a WebGPU or browser rendering surface to untrusted content. mesa is pulled in transitively via the libgl1-mesa-glx system package and is not invoked by training workloads, so the vulnerable code path is unreachable in these images.

Adds the entry to:

  • pytorch/training/docker/2.9/py3/Dockerfile.ec2.cpu.os_scan_allowlist.json
  • pytorch/training/docker/2.9/py3/cu130/Dockerfile.ec2.gpu.os_scan_allowlist.json

Existing allowlist entries (black, torch, flash_attn) are preserved.
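
A minimal sketch of how an entry can be added to an allowlist file while preserving what is already there. The layout assumed here (a JSON object keyed by package name, each value a list of entries carrying a `vulnerability_id`) and the `reason` field are illustrative assumptions, not the repository's confirmed schema; the IDs for the existing packages are placeholders, and `add_allowlist_entry` is a hypothetical helper.

```python
from copy import deepcopy


def add_allowlist_entry(allowlist, package, entry):
    """Return a copy of `allowlist` with `entry` added under `package`.

    Existing packages and entries are left untouched, and an entry whose
    vulnerability_id is already present is not added a second time.
    """
    updated = deepcopy(allowlist)
    entries = updated.setdefault(package, [])
    known_ids = {e.get("vulnerability_id") for e in entries}
    if entry["vulnerability_id"] not in known_ids:
        entries.append(entry)
    return updated


# Placeholder stand-ins for the existing entries (assumed layout, fake IDs).
existing = {
    "black": [{"vulnerability_id": "PLACEHOLDER-BLACK"}],
    "torch": [{"vulnerability_id": "PLACEHOLDER-TORCH"}],
    "flash_attn": [{"vulnerability_id": "PLACEHOLDER-FLASH-ATTN"}],
}

mesa_entry = {
    "vulnerability_id": "CVE-2026-40393",
    # Hypothetical free-text field summarizing the justification above.
    "reason": "WebGPU code path is not reachable from training workloads",
}

updated = add_allowlist_entry(existing, "mesa", mesa_entry)
# Applying the same entry again leaves the allowlist unchanged.
updated_twice = add_allowlist_entry(updated, "mesa", mesa_entry)
```

Returning a copy rather than mutating in place makes it easy to confirm that the black, torch, and flash_attn entries come through unchanged.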

Purpose

Test Plan

Test Result

ebc9a97 - passed all tests


Toggle if you are merging into master Branch

By default, docker image builds and tests are disabled. Two ways to run builds and tests:

  1. Using dlc_developer_config.toml
  2. Using this PR description (currently only supported for PyTorch, TensorFlow, vllm, and base images)

How to use the helper utility for updating dlc_developer_config.toml

Assuming your remote is called origin (you can find out more with git remote -v)...

  • Run default builds and tests for a particular buildspec - also commits and pushes changes to remote; Example:

python src/prepare_dlc_dev_environment.py -b </path/to/buildspec.yml> -cp origin

  • Enable specific tests for a buildspec or set of buildspecs - also commits and pushes changes to remote; Example:

python src/prepare_dlc_dev_environment.py -b </path/to/buildspec.yml> -t sanity_tests -cp origin

  • Restore TOML file when ready to merge

python src/prepare_dlc_dev_environment.py -rcp origin

NOTE: If you are creating a PR for a new framework version, please ensure success of the local, standard, rc, and efa sagemaker tests by updating the dlc_developer_config.toml file:

  • sagemaker_remote_tests = true
  • sagemaker_efa_tests = true
  • sagemaker_rc_tests = true
  • sagemaker_local_tests = true

How to use PR description

Use the code block below to uncomment commands and run the PR CodeBuild jobs. There are two commands available:

  • # /buildspec <buildspec_path>
    • e.g.: # /buildspec pytorch/training/buildspec.yml
    • If this line is commented out, dlc_developer_config.toml will be used.
  • # /tests <test_list>
    • e.g.: # /tests sanity security ec2
    • If this line is commented out, it will run the default set of tests (same as the defaults in dlc_developer_config.toml): sanity, security, ec2, ecs, eks, sagemaker, sagemaker-local.
# /buildspec <buildspec_path>
# /tests <test_list>
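
The comment-toggle behavior described above can be sketched as a small parser. This is only an illustration of the documented rules, not the repository's actual implementation; `parse_pr_commands` and `DEFAULT_TESTS` are hypothetical names, and a returned buildspec of `None` stands for "fall back to dlc_developer_config.toml".

```python
# Default test set as listed in the PR template.
DEFAULT_TESTS = [
    "sanity", "security", "ec2", "ecs", "eks", "sagemaker", "sagemaker-local",
]


def parse_pr_commands(description):
    """Extract active /buildspec and /tests commands from a PR description.

    Lines starting with '#' are treated as commented out and ignored.
    Returns (buildspec_path_or_None, list_of_tests).
    """
    buildspec = None
    tests = list(DEFAULT_TESTS)
    for raw in description.splitlines():
        line = raw.strip()
        if line.startswith("#"):
            continue  # commented-out command: keep the defaults
        if line.startswith("/buildspec "):
            buildspec = line.split(maxsplit=1)[1].strip()
        elif line.startswith("/tests "):
            tests = line.split()[1:]
    return buildspec, tests


commented = "# /buildspec <buildspec_path>\n# /tests <test_list>"
active = "/buildspec pytorch/training/buildspec.yml\n/tests sanity security ec2"

print(parse_pr_commands(commented))
print(parse_pr_commands(active))
```
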
Toggle if you are merging into main Branch

PR Checklist

  • [] I ran pre-commit run --all-files locally before creating this PR. (Read DEVELOPMENT.md for details).

Bhanu Teja Goshikonda added 3 commits June 19, 2026 00:35

  • Limit dlc_developer_config.toml to building PyTorch training images for the mesa CVE-2026-40393 allowlist on PT 2.9 EC2 training Dockerfiles. SageMaker local/remote test paths are disabled since they don't validate this allowlist; EC2, ECS, EKS, sanity, and security tests remain enabled.
  • Targets the PT 2.9 EC2 training buildspec for CI on the mesa CVE-2026-40393 allowlist branch so only the images covered by the allowlist are rebuilt and tested.
  • Restores dlc_developer_config.toml to upstream master so the buildspec override merge-gate passes. The PT 2.9 EC2 mesa CVE-2026-40393 allowlist remains in this branch.

@bhanutejagk bhanutejagk merged commit 556e2fe into aws:master Jun 19, 2026
32 checks passed
3 participants