[Feature] Support computing shared experts before combine for better overlap (#6697)

* [Feature] Support computing shared experts before combine for better overlap

* fix test

* fix xpu

* fix
Longzhi Wang
2026-03-17 15:18:51 +08:00
committed by GitHub
parent 12eb001d0c
commit daaf498213
15 changed files with 104 additions and 27 deletions
+1 -1
@@ -106,7 +106,7 @@ class MockAttentionBackend:
 class MockQuantMethod:
-    def apply(self, layer, x, gate, topk_ids_hookfunc=None):
+    def apply(self, layer, x, gate, topk_ids_hookfunc=None, shared_experts=None):
         return x
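
For readers unfamiliar with the idea in the commit title, here is a minimal sketch of how an `apply` method with the new optional `shared_experts` argument might run the shared experts before the combine step. Everything except the parameter names from the diff is an assumption: the class name `SketchQuantMethod`, the plain-list tensors, the doubling used as a stand-in for routed experts, and the choice to add the shared output inside `apply` are illustrative only and are not taken from the project's code. The overlap itself (presumably shared-expert compute running while combine communication is in flight) is not modeled here.

```python
from typing import Callable, List, Optional

Vector = List[float]


class SketchQuantMethod:
    """Hypothetical stand-in for a MoE quant method (not the project's class)."""

    def apply(
        self,
        layer,
        x: Vector,
        gate,
        topk_ids_hookfunc: Optional[Callable] = None,
        shared_experts: Optional[Callable[[Vector], Vector]] = None,
    ) -> Vector:
        # Stand-in for dispatch plus routed expert computation (assumption).
        routed = [2.0 * v for v in x]

        # Compute shared experts before combine. In a real system this is
        # where the extra compute could overlap with combine communication.
        shared = shared_experts(x) if shared_experts is not None else None

        # Stand-in for the combine step (assumption: a simple copy).
        combined = list(routed)

        # Assumption: the shared output is added to the combined result here.
        if shared is not None:
            combined = [c + s for c, s in zip(combined, shared)]
        return combined
```

Callers that do not pass `shared_experts` get the old behavior, which is why the mock in the test only needed the extra keyword argument with a `None` default.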