Feat/moore swiglu by gongchensu · Pull Request #24 · InfiniTensor/InfiniOps

gongchensu · 2026-03-16T03:12:36Z

A100 build and operator tests:

MetaX build test:

Cambricon build test:

Iluvatar build test:

Moore operator test:

* refactor metadata handling to share workspace layout across GPU backends
* update CUDA/NVIDIA to use workspace-backed shape and stride metadata
* add MUSA source build rules for Moore
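The workspace-backed metadata described above can be pictured with a small host-side sketch. The layout here (each tensor's shape followed by its strides, stored as 64-bit integers) and the names `TensorMeta`, `RequiredWorkspaceSize`, and `PackMetadata` are assumptions for illustration, not the PR's actual API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical per-tensor metadata.
struct TensorMeta {
  std::vector<std::int64_t> shape;
  std::vector<std::int64_t> strides;
};

// Bytes needed to store the shapes and strides of all tensors.
inline std::size_t RequiredWorkspaceSize(
    const std::vector<TensorMeta>& tensors) {
  std::size_t count = 0;
  for (const auto& t : tensors) {
    count += t.shape.size() + t.strides.size();
  }
  return count * sizeof(std::int64_t);
}

// Packs shape then strides for each tensor into `workspace`, in order.
// Returns the number of bytes written.
inline std::size_t PackMetadata(const std::vector<TensorMeta>& tensors,
                                void* workspace, std::size_t workspace_size) {
  const std::size_t required = RequiredWorkspaceSize(tensors);
  assert(workspace_size >= required && "workspace is too small for metadata");
  (void)workspace_size;  // Silences the unused warning when NDEBUG is set.
  auto* out = static_cast<unsigned char*>(workspace);
  auto append = [&out](const std::vector<std::int64_t>& values) {
    const std::size_t bytes = values.size() * sizeof(std::int64_t);
    if (bytes != 0) {  // Skip empty vectors; their data() may be null.
      std::memcpy(out, values.data(), bytes);
      out += bytes;
    }
  };
  for (const auto& t : tensors) {
    append(t.shape);
    append(t.strides);
  }
  return required;
}
```

In a GPU backend the packed buffer would then be copied to device memory once, so every CUDA-like backend can read shapes and strides from the same offsets.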

Ziminli · 2026-03-17T07:46:04Z

src/base/swiglu.h

-        "operator `Swiglu` requires all input and output tensors to have the "
-        "same dtype");
+        "Operator `Swiglu` requires all input and output tensors to have the "
+        "same dtype.");


This change isn't needed; see point 4 of the C++ section of the contributing guide.

Ziminli · 2026-03-17T07:46:17Z

src/cuda/swiglu/kernel.h

+    if (required_workspace_size != 0) {
+      assert(workspace != nullptr && "`Swiglu` requires a workspace buffer.");
+      assert(workspace_size >= required_workspace_size &&
+             "`workspace_size_in_bytes` is insufficient for `Swiglu`.");


Same as above: no trailing period is needed. Please check and fix the other files as well.
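Applied to the workspace checks in the diff above, the requested message style might look like the sketch below. The standalone `CheckWorkspace` helper is hypothetical and exists only to make the snippet self-contained; in the PR these asserts sit inline in the kernel code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical wrapper around the checks from the diff. The only change is
// that the assertion messages no longer end with a period.
inline void CheckWorkspace(const void* workspace, std::size_t workspace_size,
                           std::size_t required_workspace_size) {
  if (required_workspace_size != 0) {
    assert(workspace != nullptr && "`Swiglu` requires a workspace buffer");
    assert(workspace_size >= required_workspace_size &&
           "`workspace_size_in_bytes` is insufficient for `Swiglu`");
  }
  // Silence unused-parameter warnings when asserts are compiled out.
  (void)workspace;
  (void)workspace_size;
}
```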

Ziminli · 2026-03-17T07:46:35Z

src/moore/swiglu/kernel.h

In principle, this should largely reuse cuda/swiglu/kernel.h; see the nvidia implementation for reference.

Ziminli · 2026-03-17T07:46:55Z

src/moore/swiglu/swiglu_moore.mu

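In principle, none of this file's contents should be needed, since almost all of it is duplicated. See the CUDA-like implementations of add (cuda/add/, nvidia/add/, and metax/add/) and the shared functions already available in common/cuda/kernel_commons.h.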
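One way to picture the reuse being asked for: the element-wise logic is written once as a shared template, and each CUDA-like backend contributes only a thin tag type. This is a host-only sketch; `SwigluShared`, `NvidiaBackend`, and `MooreBackend` are hypothetical names rather than the repository's classes, and a plain loop stands in for the device kernel launch.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical backend tags. In a real build each would wrap its own launch
// API; here both run a host loop so the sketch compiles anywhere.
struct NvidiaBackend {
  template <typename F>
  static void ForEach(std::size_t n, F f) {
    for (std::size_t i = 0; i < n; ++i) f(i);
  }
};

struct MooreBackend {
  template <typename F>
  static void ForEach(std::size_t n, F f) {
    for (std::size_t i = 0; i < n; ++i) f(i);
  }
};

// Shared implementation, written once for all backends:
// out = gate * sigmoid(gate) * up, computed in float.
template <typename Backend, typename T>
void SwigluShared(const T* gate, const T* up, T* out, std::size_t n) {
  Backend::ForEach(n, [=](std::size_t i) {
    const float g = static_cast<float>(gate[i]);
    const float u = static_cast<float>(up[i]);
    const float sig = 1.0f / (1.0f + std::exp(-g));
    out[i] = static_cast<T>(g * sig * u);
  });
}
```

With this shape, adding a backend means adding a tag type, not a second copy of the kernel body.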

Ziminli · 2026-03-17T07:47:16Z

src/moore/swiglu/swiglu_moore_kernel.h

+namespace infini::ops::swiglu::moore {
+
+using cuda_bfloat16 = mt_bfloat16;
+using cuda_bfloat162 = mt_bfloat162;


If this is needed at all, it belongs in common; take a look at how the other platforms handle it.
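A sketch of what placing the aliases in common could look like: one shared header maps the platform's native types onto neutral names, so operator code never mentions `mt_bfloat16` directly. The macro `EXAMPLE_WITH_MOORE`, the namespace, and the host stand-in types are assumptions made so the snippet compiles without a vendor SDK; the repository's actual header and macro names may differ.

```cpp
namespace example_common {

#if defined(EXAMPLE_WITH_MOORE)
// Moore (MUSA) build: the same aliases as in the diff above, but defined in
// one shared place. This branch needs the vendor bf16 header to be included.
using cuda_bfloat16 = mt_bfloat16;
using cuda_bfloat162 = mt_bfloat162;
#else
// Host-only stand-ins so the sketch builds without a device toolchain.
struct HostBfloat16 {
  unsigned short bits;
};
struct HostBfloat162 {
  HostBfloat16 x;
  HostBfloat16 y;
};
using cuda_bfloat16 = HostBfloat16;
using cuda_bfloat162 = HostBfloat162;
#endif

}  // namespace example_common
```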

Ziminli · 2026-03-17T07:47:20Z

src/moore/swiglu/swiglu_moore_kernel.h

+    } else {
+      return static_cast<T>(1 / (1 + std::exp(-x)));
+    }
+  }


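This doesn't seem to serve any purpose. Likewise, if Moore behaves like the other CUDA-like platforms, it should reuse the shared implementation directly; again, see add.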
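If every CUDA-like backend calls one shared sigmoid helper, a Moore-specific copy becomes unnecessary. A host-side sketch of such a helper follows; the names `Sigmoid` and `Swiglu` and the choice to compute in `float` are assumptions, and the helpers that actually exist in common/cuda/kernel_commons.h may look different.

```cpp
#include <cmath>

// Generic sigmoid: convert to float, compute, convert back. This suits
// reduced-precision element types; for double it would lose precision.
template <typename T>
inline T Sigmoid(T x) {
  const float xf = static_cast<float>(x);
  return static_cast<T>(1.0f / (1.0f + std::exp(-xf)));
}

// SwiGLU for one element: gate * sigmoid(gate) * up.
template <typename T>
inline T Swiglu(T gate, T up) {
  const float g = static_cast<float>(gate);
  const float s = static_cast<float>(Sigmoid(gate));
  return static_cast<T>(g * s * static_cast<float>(up));
}
```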

Ziminli · 2026-03-17T07:47:22Z

src/moore/swiglu/swiglu_moore_kernel.h

+      float sig0 = __low2float(sig);
+      float sig1 = __high2float(sig);
+      float up0 = __low2float(up);
+      float up1 = __high2float(up);


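If Moore differs from the other CUDA-like platforms in a few places and cannot directly reuse a particular piece of logic, only those cases should get special handling; everything else should still reuse the unified cuda implementation.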
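The targeted-override idea can be sketched with a single customization point: shared code calls one small hook, the default covers most backends, and only the platform that differs specializes it. Everything here (`Pair`, `MulPair`, the backend tags) is hypothetical and host-only; the Moore specialization mirrors the unpack-to-float pattern in the diff above.

```cpp
#include <cmath>

// Host stand-in for a packed pair type such as a bf16x2 vector.
struct Pair {
  float lo;
  float hi;
};

struct GenericCudaLike {};  // Hypothetical tag: backends with a packed multiply.
struct Moore {};            // Hypothetical tag: the backend that differs.

// Default hook. On a device this would stand in for a packed intrinsic.
template <typename Backend>
struct MulPair {
  static Pair Apply(Pair a, Pair b) { return {a.lo * b.lo, a.hi * b.hi}; }
};

// Only this one step is specialized: unpack to scalars, multiply, repack.
template <>
struct MulPair<Moore> {
  static Pair Apply(Pair a, Pair b) {
    const float a0 = a.lo;
    const float a1 = a.hi;
    const float b0 = b.lo;
    const float b1 = b.hi;
    return {a0 * b0, a1 * b1};
  }
};

// Shared logic, identical for every backend: gate * sigmoid(gate) * up.
template <typename Backend>
Pair SwigluPair(Pair gate, Pair up) {
  const Pair sig = {1.0f / (1.0f + std::exp(-gate.lo)),
                    1.0f / (1.0f + std::exp(-gate.hi))};
  return MulPair<Backend>::Apply(MulPair<Backend>::Apply(gate, sig), up);
}
```

The shared `SwigluPair` never changes when a backend needs a different multiply; only the hook does.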

gongchensu and others added 4 commits March 13, 2026 09:13

feat(gemm-moore): add Moore (MUSA) GEMM backend support.

8f810a7

refactor(gemm-moore): reuse shared BLAS helper and specialize scalars.

7eeabee

build: use detected Python interpreter for wrapper generation.

93607e1

feat(swiglu-moore): add Moore (MUSA) backend support.

0056c17


gongchensu self-assigned this Mar 16, 2026

gongchensu requested a review from voltjia March 16, 2026 07:37

Ziminli requested changes Mar 17, 2026

gongchensu wants to merge 4 commits into InfiniTensor:feat/dev-infra from gongchensu:feat/moore-swiglu
