[Feature] support compute shared experts before combine for better overlap (#6697)

* [Feature] support compute shared experts before combine for better overlap
* fix test
* fix xpu
* fix
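
A minimal sketch of the ordering this change appears to aim for, assuming the shared experts are an optional callable passed into `apply` and that their output is added to the combined routed output. Apart from the `shared_experts` parameter name, everything here (`apply_moe`, `routed_expert`, `combine`, `trace`) is hypothetical and is not the project's API. The sketch runs sequentially; in a real backend the combine step would be asynchronous communication, so work issued before it is what gets overlapped.

```python
from typing import Callable, List, Optional

Vector = List[float]


def apply_moe(
    x: Vector,
    routed_expert: Callable[[Vector], Vector],   # hypothetical routed-expert compute
    combine: Callable[[Vector], Vector],         # hypothetical combine (communication) step
    shared_experts: Optional[Callable[[Vector], Vector]] = None,
    trace: Optional[List[str]] = None,           # records the order of steps, for illustration
) -> Vector:
    trace = trace if trace is not None else []

    # 1. Routed experts run on the dispatched tokens.
    routed_out = routed_expert(x)
    trace.append("routed")

    # 2. Shared experts are computed BEFORE combine, so that in a real
    #    system their compute could overlap with combine communication.
    shared_out = None
    if shared_experts is not None:
        shared_out = shared_experts(x)
        trace.append("shared")

    # 3. Combine gathers the routed outputs.
    combined = combine(routed_out)
    trace.append("combine")

    # 4. Assumption: the shared-expert output is summed into the result.
    if shared_out is not None:
        combined = [a + b for a, b in zip(combined, shared_out)]
    return combined
```

Callers that do not pass `shared_experts` get the previous behavior, which is why the mock in the test below only needs the extra keyword argument with a `None` default.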
2026-04-23 00:17:25 +08:00 · 2026-03-17 15:18:51 +08:00
parent 12eb001d0c
commit daaf498213
15 changed files with 104 additions and 27 deletions
@@ -106,7 +106,7 @@ class MockAttentionBackend:


 class MockQuantMethod:
-    def apply(self, layer, x, gate, topk_ids_hookfunc=None):
+    def apply(self, layer, x, gate, topk_ids_hookfunc=None, shared_experts=None):
        return x