
[Training Camp] Add FlashAttention operator into infiniTrain Framework #122

Closed

Aoshine999 wants to merge 18 commits into InfiniTensor:master from Aoshine999:pullrequest

Conversation

@Aoshine999

Add the FlashAttention forward/backward implementation and wire it into the autograd/dispatcher system.
Key changes:

  • infini_train/include/autograd/ScaledDotProductAttention.h
  • infini_train/src/autograd/ScaledDotProductAttention.cc
  • infini_train/include/kernels/cuda/flash_attention.h
  • infini_train/src/kernels/cuda/flash_attention.cu
  • run gpt2/llama3: add a --flash flag to switch the attention path
    Constraints: dtype is limited to float32 and bfloat16; the FlashAttention forward and backward kernels only support BlockDim(32, 32)
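For context on what the forward kernel computes: FlashAttention processes K/V in tiles while maintaining a running max and running sum, so the softmax never materializes the full score row. Below is a minimal single-query CPU sketch of that online-softmax recurrence; the function name, tiling, and data layout are illustrative only, not the PR's actual kernel code (which runs on CUDA with a 32×32 thread block).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Single-query FlashAttention-style pass over K/V tiles.
// Keeps a running max (m), running denominator (l), and a running
// weighted-V accumulator; old state is rescaled whenever the max grows.
std::vector<float> flash_attention_row(const std::vector<float>& q,
                                       const std::vector<std::vector<float>>& K,
                                       const std::vector<std::vector<float>>& V,
                                       int tile = 32) {
    const int n = static_cast<int>(K.size());
    const int d = static_cast<int>(q.size());
    const float scale = 1.0f / std::sqrt(static_cast<float>(d));

    float m = -INFINITY;              // running max of scores seen so far
    float l = 0.0f;                   // running sum of exp(score - m)
    std::vector<float> acc(d, 0.0f);  // running sum of exp(score - m) * V

    for (int start = 0; start < n; start += tile) {
        const int end = std::min(n, start + tile);

        // Compute scaled scores for this tile and its local max.
        std::vector<float> s(end - start);
        float m_tile = -INFINITY;
        for (int j = start; j < end; ++j) {
            float dot = 0.0f;
            for (int k = 0; k < d; ++k) dot += q[k] * K[j][k];
            s[j - start] = dot * scale;
            m_tile = std::max(m_tile, s[j - start]);
        }

        // Rescale previously accumulated state to the new max.
        const float m_new = std::max(m, m_tile);
        const float correction = std::exp(m - m_new);
        l *= correction;
        for (int k = 0; k < d; ++k) acc[k] *= correction;

        // Fold the current tile into the running sums.
        for (int j = start; j < end; ++j) {
            const float p = std::exp(s[j - start] - m_new);
            l += p;
            for (int k = 0; k < d; ++k) acc[k] += p * V[j][k];
        }
        m = m_new;
    }

    for (int k = 0; k < d; ++k) acc[k] /= l;  // final softmax normalization
    return acc;
}
```

The result is numerically identical to naive softmax attention; the tiling only bounds how much of the score row is live at once, which is what lets the CUDA kernel keep everything in shared memory within one thread block.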

@kilinchange kilinchange changed the title Add FlashAttention operator into infiniTrain Framework [Training Camp] Add FlashAttention operator into infiniTrain Framework Mar 16, 2026
@kilinchange kilinchange self-requested a review March 16, 2026 08:49
@kilinchange
Collaborator

Please format the code with clang-format-16 so that CI passes.

@Aoshine999 Aoshine999 closed this Mar 16, 2026
@Aoshine999 Aoshine999 deleted the pullrequest branch March 16, 2026 10:09
