
Dockerfile - Add ROCm6.3 dockerfile #790

Open
polarG wants to merge 3 commits into main from dev/hongtaozhang/add-dockerfile-rocm-6.3.x


polarG commented Mar 16, 2026

Description
Add ROCm6.3 dockerfile.

Copilot AI review requested due to automatic review settings March 16, 2026 05:01
polarG requested a review from a team as a code owner March 16, 2026 05:01
polarG self-assigned this Mar 16, 2026
polarG requested review from abuccts and guoshzhao March 16, 2026 05:02
polarG added the ROCm and enhancement labels Mar 16, 2026

Copilot AI left a comment


Pull request overview

Adds a new ROCm 6.3 container recipe to the repo’s dockerfile/ collection, targeting the rocm/pytorch-training:v25.6 base image and layering SuperBench build/install steps plus common tooling needed for benchmarks.

Changes:

  • Introduces dockerfile/rocm6.3.x.dockerfile for a ROCm 6.3.4 + PyTorch training base image.
  • Installs additional system tools (Docker client, OFED if missing, Intel MLC) and configures SSH/limits.
  • Builds SuperBench third-party dependencies and installs the package with AMD worker extras.
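
The "OFED if missing" step listed above relies on a common shell guard: probe for a command with `command -v` and run the installer only when the probe fails. A minimal sketch of that pattern follows. The helper name `install_if_missing` and the placeholder installer commands are hypothetical and are not part of this PR.

```shell
#!/bin/sh
# Hypothetical helper: run an installer command only when a probe command is absent.
# Usage: install_if_missing <probe-command> <installer> [installer-args...]
install_if_missing() {
    probe="$1"
    shift
    if command -v "$probe" >/dev/null 2>&1; then
        echo "$probe found. Skipping install."
    else
        echo "$probe not found. Installing..."
        "$@"
    fi
}

# 'sh' exists on any POSIX system, so the installer is skipped.
install_if_missing sh echo "installer ran"
# A made-up command name is assumed to be absent, so the installer runs.
install_if_missing no_such_command_xyz echo "installer ran"
```

In the Dockerfile the probe would be `ofed_info` and the installer the OFED download and install sequence. The guard keeps the layer cheap when the base image already ships the tool.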


Comment on lines +67 to +68
ARG NUM_MAKE_JOBS=64

ENV USE_HIPBLAS_COMPUTETYPE=1
RUN python3 -m pip install --no-build-isolation .[amdworker] && \
CXX=/opt/rocm/bin/hipcc make cppbuild && \
make postinstall
Comment on lines +87 to +95
# Get Ubuntu version and set as an environment variable
RUN echo "Ubuntu version: $(lsb_release -r -s)"
ARG UBUNTU_VERSION=22.04

# Install OFED
ENV OFED_VERSION=24.10-1.1.4.0
# Check if ofed_info is present and has a version
RUN if ! command -v ofed_info >/dev/null 2>&1; then \
echo "OFED not found. Installing OFED..."; \
Comment on lines +126 to +128
RUN python3 -m pip install --upgrade pip wheel setuptools==65.7 && \
python3 -c "import pkg_resources" || python3 -m pip install setuptools

RUN if ! command -v ofed_info >/dev/null 2>&1; then \
echo "OFED not found. Installing OFED..."; \
cd /tmp && \
wget -q http://content.mellanox.com/ofed/MLNX_OFED-${OFED_VERSION}/MLNX_OFED_LINUX-${OFED_VERSION}-ubuntu${UBUNTU_VERSION}-x86_64.tgz && \
Comment on lines +25 to +26
# Upgrading botocore/boto3 to versions compatible with urllib3 2.x.
RUN python3 -m pip install --upgrade botocore boto3

codecov bot commented Mar 16, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 85.70%. Comparing base (6b8e810) to head (32471dd).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #790   +/-   ##
=======================================
  Coverage   85.70%   85.70%           
=======================================
  Files         102      102           
  Lines        7703     7703           
=======================================
  Hits         6602     6602           
  Misses       1101     1101           
Flag                       Coverage Δ
cpu-python3.10-unit-test   70.96% <ø> (ø)
cpu-python3.12-unit-test   70.96% <ø> (ø)
cpu-python3.7-unit-test    70.43% <ø> (ø)
cuda-unit-test             83.59% <ø> (ø)
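
As a sanity check on the numbers in the coverage diff above: Hits plus Misses equals Lines (6602 + 1101 = 7703), and the project figure is Hits divided by Lines. The ratio is about 85.7069%, so the reported 85.70% is consistent with truncating to two decimals, not rounding. Whether this repository's Codecov configuration rounds down is an assumption here. A small sketch of that arithmetic, with `coverage_pct` as a hypothetical helper name:

```shell
#!/bin/sh
# Percentage of covered lines, truncated (not rounded) to two decimals.
# Integer arithmetic only, so it runs in any POSIX shell.
coverage_pct() {
    hits="$1"
    lines="$2"
    basis=$(( hits * 10000 / lines ))   # hundredths of a percent, rounded down
    printf '%d.%02d\n' $(( basis / 100 )) $(( basis % 100 ))
}

coverage_pct 6602 7703   # values from the Hits and Lines rows; prints 85.70
```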


Copilot AI review requested due to automatic review settings March 16, 2026 18:52

Copilot AI left a comment


Pull request overview

Adds a new ROCm 6.3 build target to the repository’s Docker image build pipeline, enabling CI to produce a superbench/main:rocm6.3 image variant.

Changes:

  • Introduces dockerfile/rocm6.3.x.dockerfile based on rocm/pytorch-training:v25.6, with additional system deps, Docker CLI, OFED (conditional), and SuperBench build/install steps.
  • Updates .github/workflows/build-image.yml to build and tag the new ROCm 6.3 image on the self-hosted ROCm runner.
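
To try the image outside CI, the Dockerfile path and tag named above can be combined into a `docker build` invocation. The sketch below only prints the command (a dry run) so it can be inspected first. The helper name `build_cmd` is hypothetical, and using the repository root as the build context is an assumption; the workflow file may pass additional options that are not shown in this conversation.

```shell
#!/bin/sh
# Dry-run helper (hypothetical): print the docker build command for a given
# variant instead of executing it.
build_cmd() {
    variant="$1"   # e.g. rocm6.3
    echo "docker build -f dockerfile/${variant}.x.dockerfile -t superbench/main:${variant} ."
}

build_cmd rocm6.3
```

Running the printed command from the repository root would build the image on a machine with Docker installed. A build argument such as `NUM_MAKE_JOBS` could be overridden with `--build-arg` if the default is too high for the local host.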

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.

File | Description
dockerfile/rocm6.3.x.dockerfile | New ROCm 6.3 Dockerfile building SuperBench on top of rocm/pytorch-training:v25.6.
.github/workflows/build-image.yml | Adds a rocm6.3 entry to the build matrix to produce/push superbench/main:rocm6.3.

ADD third_party third_party
# perftest_rocm6.patch skipped — upstream perftest already includes the equivalent changes
RUN make RCCL_HOME=/opt/rocm ROCBLAS_BRANCH=release-staging/rocm-rel-6.3 HIPBLASLT_BRANCH=release-staging/rocm-rel-6.3 ROCM_VER=rocm-5.5.0 -C third_party rocm -o cpu_hpl -o cpu_stream -o megatron_lm -o rocm_megatron_lm
Comment on lines +82 to +86
sed -i "s/[# ]*PermitRootLogin prohibit-password/PermitRootLogin yes/" /etc/ssh/sshd_config && \
sed -i "s/[# ]*PermitUserEnvironment no/PermitUserEnvironment yes/" /etc/ssh/sshd_config && \
sed -i "s/[# ]*Port.*/Port 22/" /etc/ssh/sshd_config && \
echo "* soft nofile 1048576\n* hard nofile 1048576" >> /etc/security/limits.conf && \
echo "root soft nofile 1048576\nroot hard nofile 1048576" >> /etc/security/limits.conf

# Install OFED
ENV OFED_VERSION=24.10-1.1.4.0
# Check if ofed_info is present and has a version
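
A note on the two `echo "...\n..."` lines in the snippet above: whether `\n` becomes a newline depends on the shell that executes `RUN`. The `echo` builtin in dash expands backslash escapes, while the one in bash does not unless `-e` or `xpg_echo` is set, and the PR does not state which shell the base image uses. A portable alternative is `printf`, sketched below. It writes to a temporary file for illustration only; the Dockerfile appends to /etc/security/limits.conf.

```shell
#!/bin/sh
# printf interprets \n in its format string in every POSIX shell,
# whereas echo's handling of backslash escapes differs between shells.
# Written to a temporary file here; the PR appends to /etc/security/limits.conf.
limits_file=$(mktemp)

# The format '%s\n' is reused for each argument, giving one entry per line.
printf '%s\n' \
    '* soft nofile 1048576' \
    '* hard nofile 1048576' \
    'root soft nofile 1048576' \
    'root hard nofile 1048576' >> "$limits_file"

cat "$limits_file"
```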
Comment on lines +139 to +144
ADD . .
ENV USE_HIP_DATATYPE=1
ENV USE_HIPBLAS_COMPUTETYPE=1
RUN python3 -m pip install --no-build-isolation .[amdworker] && \
CXX=/opt/rocm/bin/hipcc make cppbuild && \
make postinstall
Comment on lines +128 to +129
RUN python3 -m pip install --upgrade pip wheel setuptools==65.7 && \
python3 -c "import pkg_resources" || python3 -m pip install setuptools
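
A note on the `&&` / `||` chain in the snippet above: the shell evaluates `A && B || C` left to right as `(A && B) || C`, so the fallback runs when either the upgrade or the import check fails, and a failed upgrade can be masked if the fallback succeeds. If the intent is for the fallback to cover only the import check (an assumption; the PR does not say), grouping makes that explicit. The sketch below uses stand-in functions in place of the real `pip` and `python3` commands.

```shell
#!/bin/sh
# In 'A && B || C', the shell parses the list as '(A && B) || C':
# C runs when A fails OR when B fails, not only when B fails.
step_a() { return 1; }          # stand-in for a failing 'pip install --upgrade ...'
step_b() { return 0; }          # stand-in for 'python3 -c "import pkg_resources"'
fallback() { echo "fallback ran"; }

# Ungrouped: the failure of step_a is masked because the fallback succeeds.
step_a && step_b || fallback
echo "ungrouped exit status: $?"

# Grouped: the fallback applies only to step_b, and step_a's failure propagates.
step_a && { step_b || fallback; }
echo "grouped exit status: $?"
```

With the grouped form, a failing first step yields a non-zero status for the whole `RUN` instruction, which would stop the image build instead of continuing silently.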
Comment on lines +94 to +99
ENV OFED_VERSION=24.10-1.1.4.0
# Check if ofed_info is present and has a version
RUN if ! command -v ofed_info >/dev/null 2>&1; then \
echo "OFED not found. Installing OFED..."; \
cd /tmp && \
wget -q http://content.mellanox.com/ofed/MLNX_OFED-${OFED_VERSION}/MLNX_OFED_LINUX-${OFED_VERSION}-ubuntu${UBUNTU_VERSION}-x86_64.tgz && \