MingkunZhang
|
7ad5737560
|
[Metax] adapt to gemm interface on different versions of maca (#5905)
Co-authored-by: root <root@lt-wks-10-0-180-15.pub.metax-tech.com>
|
2026-01-07 10:02:24 +08:00 |
|
Neil Zhu
|
272a371635
|
[Metax] optimize flash attention backend (#5876)
|
2026-01-06 09:52:09 +08:00 |
|
MingkunZhang
|
f732d7d2ad
|
[Metax] adapt prefix caching & cpu swap (#5844)
Co-authored-by: root <root@lt-wks-10-0-180-15.pub.metax-tech.com>
|
2025-12-31 17:02:48 +08:00 |
|
Neil Zhu
|
4403a21d4b
|
[Metax] refactor cutlass moe and optimize flash attention (#5361)
* [Metax] refactor moe and flash attention backend
---------
Co-authored-by: zhangchenyi_dl <16219492+zhangchenyidl@user.noreply.gitee.com>
|
2025-12-10 17:15:17 +08:00 |
|
Neil Zhu
|
0edda75a56
|
[Metax] optimize cutlass moe and flash attention backend (#5128)
|
2025-11-20 16:12:35 +08:00 |
|
xiaozude
|
74722308f2
|
[Metax] adapt cutlass moe and fix mla attention (#4602)
Co-authored-by: Yuanle Liu <yuanlehome@163.com>
|
2025-11-05 10:03:49 +08:00 |
|
Neil Zhu
|
c95d0740ec
|
[Metax] adapt cutlass moe for ernie-vl (#4685)
|
2025-11-03 17:44:27 +08:00 |
|
zhupengyang
|
3a6883ac1a
|
c++ code format (#4527)
|
2025-10-22 17:59:50 +08:00 |
|
SuperNova
|
80a16c4c87
|
[fix] adjust mctlass moe api (#4474)
|
2025-10-20 14:23:54 +08:00 |
|
xiaozude
|
7c919070f7
|
[Metax] support cutlass moe & optimize flash attention (#4208)
|
2025-09-29 11:22:43 +08:00 |
|