Transition MaxText docker image build workflow to package-based installation #3788

Open
SurbhiJainUSC wants to merge 1 commit into main from docker_package_build

Collaborator

@SurbhiJainUSC SurbhiJainUSC commented May 1, 2026

Description

This PR transitions the nightly Docker image build workflow from a source-based installation to a package-based installation. The change standardizes environment setup across all workflows by leveraging the MaxText Python package. It also introduces an automated notification system to track nightly build failures.

Currently, on main, this workflow builds the Docker image as follows:

  1. Clone MaxText
  2. Run docker build command

This PR changes the workflow to:

  1. Clone MaxText
  2. Create a wheel file
  3. Forget about cloned MaxText repo
  4. Install the [runner] extra from the wheel file
  5. Run docker build command

The reason for this change: when we build the Docker image from inside the MaxText repo, we don't see any issues because all the files exist in the repo. However, when we install MaxText from a wheel or from PyPI, we only get the src directory. This leads to problems that go unnoticed. One example: cl/907712343
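To make the point above concrete, here is a minimal sketch of how one might check what a wheel actually ships. It builds a tiny in-memory stand-in for a wheel (a wheel is a zip archive) rather than the real MaxText wheel; the file names and the `REPO_ONLY_PATHS` set are hypothetical and chosen only for illustration.

```python
import io
import zipfile

# Hypothetical repo-level paths that live outside the packaged source tree.
REPO_ONLY_PATHS = {"tests", "docs"}


def top_level_entries(wheel_bytes: bytes) -> set[str]:
    """Return the top-level names stored in a wheel (a wheel is a zip archive)."""
    with zipfile.ZipFile(io.BytesIO(wheel_bytes)) as wheel:
        return {name.split("/", 1)[0] for name in wheel.namelist()}


def make_fake_wheel() -> bytes:
    """Build a tiny in-memory stand-in for a wheel; all file names are made up."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as wheel:
        wheel.writestr("maxtext/__init__.py", "")
        wheel.writestr("maxtext/train.py", "")
        wheel.writestr("maxtext-0.0.0.dist-info/METADATA", "Name: maxtext\n")
    return buffer.getvalue()


entries = top_level_entries(make_fake_wheel())
# Anything the build relies on that is not in the archive is invisible
# to an environment installed from the wheel.
missing = REPO_ONLY_PATHS - entries
print(sorted(entries))
print(sorted(missing))
```

Running the same `top_level_entries` helper on a real wheel file's bytes would show which repo paths a package-based install cannot see.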

Key Changes

  • Package-Based Build: Refactors the "Build Images" workflow to install maxtext via the PyPI route (uv pip install -e .maxtext[runner]).
  • Automated Failure Notifications: Added a new notify_failure job to the nightly workflow. This job is designed to create a new GitHub issue or update an existing one whenever a scheduled build fails.
  • Standardization: Aligns the containerized environment with the recommended installation path.
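
The create-or-update behavior described for notify_failure could be sketched roughly as below. This is not the job's actual implementation: the `Issue` type, the `TRACKING_TITLE` value, and the shape of the returned plan are all assumptions for illustration, and the sketch only decides what to do without calling the GitHub API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Issue:
    number: int
    title: str
    is_open: bool


# Hypothetical title used to recognize the tracking issue.
TRACKING_TITLE = "Nightly Docker image build failed"


def plan_notification(issues: list[Issue], run_url: str) -> dict:
    """Decide whether to comment on an open tracking issue or create a new one."""
    body = f"Scheduled build failed: {run_url}"
    for issue in issues:
        if issue.is_open and issue.title == TRACKING_TITLE:
            # An open tracking issue already exists, so update it.
            return {"action": "comment", "issue_number": issue.number, "body": body}
    # No open tracking issue was found, so a new one is needed.
    return {"action": "create", "title": TRACKING_TITLE, "body": body}


existing = [
    Issue(number=7, title="Unrelated bug", is_open=True),
    Issue(number=9, title=TRACKING_TITLE, is_open=True),
]
update_plan = plan_notification(existing, "https://example.com/runs/1")
create_plan = plan_notification([], "https://example.com/runs/2")
print(update_plan["action"], create_plan["action"])
```

Matching on an exact title is only one way to find the existing issue; a label-based lookup would work equally well under the same structure.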

Tests

CI tests and https://github.com/AI-Hypercomputer/maxtext/actions/runs/25235820972

Checklist

Before submitting this PR, please make sure (put X in square brackets):

  • I have performed a self-review of my code. For an optional AI review, add the gemini-review label.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have run end-to-end tests and provided workload links above if applicable.
  • I have made or will make corresponding changes to the doc if needed, including adding new documentation pages to the relevant Table of Contents (toctree directive) as explained in our documentation.

codecov Bot commented May 1, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.


@SurbhiJainUSC SurbhiJainUSC force-pushed the docker_package_build branch from 57e66e4 to 8b9df5e Compare May 1, 2026 17:44
@SurbhiJainUSC SurbhiJainUSC changed the title from "Build MaxText docker image from PyPI" to "Transition MaxText docker image build workflow to package-based installation" May 1, 2026
@SurbhiJainUSC SurbhiJainUSC force-pushed the docker_package_build branch from 8b9df5e to bd2fb51 Compare May 1, 2026 17:45
@SurbhiJainUSC SurbhiJainUSC marked this pull request as ready for review May 1, 2026 18:51
github-actions Bot commented May 1, 2026

🤖 Hi @SurbhiJainUSC, I've received your request, and I'm working on it now! You can track my progress in the logs for more details.

@github-actions github-actions Bot left a comment

## 📋 Review Summary

This PR successfully transitions the MaxText Docker image build process to a package-based installation, which improves consistency and maintainability. It also introduces a valuable automated failure notification system for nightly builds.

🔍 General Feedback

  • Redundancy: The "Copy tests assets to package directory" step in the workflow is redundant because the Dockerfile already handles copying tests from the root context.
  • Brittleness: Hardcoded Python version paths (python3.12) should be replaced with more dynamic alternatives to future-proof the workflows.
  • Conventions: The change to image naming (adding an underscore before the suffix) and the use of short SHAs for tagging are positive improvements but should be noted as changes to existing conventions.
  • Variables: Ensure vars.PROJECT_NAME is defined in the repository settings to prevent build failures.
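
On the brittleness point, one possible dynamic alternative to a hardcoded python3.12 path is to ask the running interpreter for its own locations. The sketch below uses only the standard library; the helper names are hypothetical, and the review does not say which alternative it has in mind.

```python
import sys
import sysconfig


def site_packages_dir() -> str:
    """Return the active interpreter's pure-Python install directory."""
    return sysconfig.get_paths()["purelib"]


def python_dir_name() -> str:
    """Return a name such as 'python3.12', derived from the running interpreter."""
    return f"python{sys.version_info.major}.{sys.version_info.minor}"


print(site_packages_dir())
print(python_dir_name())
```

A workflow step could call `site_packages_dir()` instead of spelling out the version in a path, so a Python upgrade would not require editing the workflow.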

Comment thread .github/workflows/run_tests_against_package.yml
Comment thread .github/workflows/build_and_push_docker_image.yml
Comment thread .github/workflows/build_and_push_docker_image.yml
Comment thread .github/workflows/UploadDockerImages.yml
Comment thread .github/workflows/build_and_push_docker_image.yml
Comment thread .github/workflows/build_and_push_docker_image.yml
@SurbhiJainUSC SurbhiJainUSC force-pushed the docker_package_build branch from bd2fb51 to 704fba0 Compare May 1, 2026 22:25
@SurbhiJainUSC SurbhiJainUSC force-pushed the docker_package_build branch from 704fba0 to 52d8035 Compare May 1, 2026 22:47
Collaborator

@bvandermoon bvandermoon left a comment

Why do we need a nightly release if it is the same image between releases?

@SurbhiJainUSC
Collaborator Author

> Why do we need a nightly release if it is the same image between releases?

Discussed offline and updated the description accordingly.
