Auto-detect CUTLASS for EvoformerAttention #8000

Open
MaxTretikov wants to merge 1 commit into deepspeedai:master from MaxTretikov:master

@MaxTretikov
DS4Sci EvoformerAttention currently depends on CUTLASS, but requiring users to manually set CUTLASS_PATH creates unnecessary friction for an otherwise standard extension build flow. This change makes CUTLASS discovery automatic while preserving CUTLASS_PATH as the explicit override.

The discovery approach is based on PyTorch's CUDA detection pattern in torch.utils.cpp_extension: honor the explicit environment variable first, then infer from installed packages and conventional filesystem locations, and only fail with an actionable message when discovery cannot succeed.
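
The lookup order described above can be sketched roughly as follows. This is an illustrative outline, not the code in this PR: the function name `find_cutlass_path`, the default package name `cutlass_library`, the `source` subdirectory, and the filesystem locations are all assumptions made for the example. The only thing taken from CUTLASS itself is that a source tree contains `include/cutlass/cutlass.h`, which is used here as the marker file.

```python
import importlib.util
import os
from pathlib import Path

# Marker header used to recognize a CUTLASS source tree.
_MARKER = Path("include") / "cutlass" / "cutlass.h"


def _is_cutlass_root(path):
    """Return True if `path` looks like the root of a CUTLASS tree."""
    return (Path(path) / _MARKER).is_file()


def find_cutlass_path(
    env=None,
    package_names=("cutlass_library",),                   # assumed package name
    search_dirs=("/usr/local/cutlass", "/opt/cutlass"),   # assumed locations
):
    """Hypothetical helper: locate CUTLASS or raise an actionable error."""
    env = os.environ if env is None else env

    # 1. The explicit override wins. In this sketch an invalid override is an
    #    error rather than a silent fall-through to the heuristics.
    explicit = env.get("CUTLASS_PATH")
    if explicit:
        if _is_cutlass_root(explicit):
            return Path(explicit)
        raise RuntimeError(
            f"CUTLASS_PATH is set to {explicit!r}, but {_MARKER} was not found there."
        )

    # 2. Infer from installed Python packages without importing them.
    for name in package_names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            spec = None
        if spec is None or not spec.submodule_search_locations:
            continue
        for location in spec.submodule_search_locations:
            # Assumed layouts: headers at the package root or under "source".
            for candidate in (Path(location), Path(location) / "source"):
                if _is_cutlass_root(candidate):
                    return candidate

    # 3. Fall back to conventional filesystem locations.
    for directory in search_dirs:
        if _is_cutlass_root(directory):
            return Path(directory)

    raise RuntimeError(
        "Could not locate CUTLASS. Set CUTLASS_PATH to a CUTLASS checkout "
        "or install a package that ships the CUTLASS headers."
    )
```

Accepting `env` as a parameter keeps the helper easy to test without mutating the process environment. Whether an invalid `CUTLASS_PATH` should raise or fall through to the heuristics is a design choice; the sketch raises so that a typo in the override is not masked by a different CUTLASS copy found elsewhere.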

This improves first-run usability, CI behavior, editable installs, and package-based environments where CUTLASS may already be installed in a discoverable location. It also reduces setup divergence between users who clone CUTLASS manually and users who install NVIDIA's nvidia-cutlass package.

DeepSpeed should arguably already behave this way: EvoformerAttention is part of DeepSpeed's extension-builder system, and extension builders should locate common build dependencies using predictable heuristics instead of requiring users to export paths manually. CUDA itself is not treated as "you must always set CUDA_HOME"; PyTorch honors the environment variable when it is set and otherwise attempts discovery. CUTLASS should follow the same principle here.

Signed-off-by: Max Tretikov <max@tretikov.com>