ShaneGZhu
|
2d8338f9e4
|
[Optimization][DeepSeekV3.2] Reduce slot_mapping computation from twice per layer to a single pre-processing step (#7367)
|
2026-04-16 19:54:12 +08:00 |
|
ShaneGZhu
|
7005404ce3
|
[DeepSeekV3.2][Graph Optimization] Remove synchronous operations to avoid graph capture failures, and remove unnecessary contiguous calls in the DSA backend (#7253)
* Delete contiguous ops.
* fix scale
* Delete unnecessary comments
* fix style
|
2026-04-09 11:00:13 +08:00 |
|
K11OntheBoat
|
bb48bcbaa2
|
Split enable_mm (#7183)
Co-authored-by: liuruian <liuruian@MacBook-Pro.local>
|
2026-04-08 11:25:41 +08:00 |
|
周周周
|
820eb60ec6
|
[Others] clean code (#6839)
Co-authored-by: “liuruian” <liuruian@baidu.com>
|
2026-03-14 11:09:28 +08:00 |
|
周周周
|
8c1a2827d3
|
DSA clean code (#6827)
|
2026-03-13 16:39:47 +08:00 |
|
AIbin
|
c3aceb6bdc
|
[Models][OP][Optimization] Support DeepSeek-v3.2 model, integrate DSA & Indexer architecture with FlashMLA/DeepGEMM (#6689)
|
2026-03-10 15:05:14 +08:00 |
|