Commit Graph

6 Commits

Author SHA1 Message Date
AIbin 48d2bbeb74 fix dsa (#7252) 2026-04-08 20:21:38 +08:00
AIbin bf7e2424d0 [Optimization][Feature]Supports multiple batches of DSK-DSA. (#6930)
* support DSA_MUTI_BATCH

* update test topk

* update dsk-dsa
2026-03-20 15:59:22 +08:00
AIbin cb6819d086 [Optimization][OP]support per_token_group_fp8_quant cuda kernel (#6865)
* support per_token_group_fp8_quant cuda kernel

* Potential fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

* update code

---------

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
2026-03-17 19:17:51 +08:00
AIbin c9f7f5234e [Optimization][BugFix]Optimize Deepseek networking code (#6861)
* update dsk model

* update dsk model
2026-03-16 16:52:43 +08:00
AIbin 1118351b27 [Optimization] Update Deepseekv3.2 model and dsa-indexer networking and add some unitest (#6762)
* add deepseek model doc

* update deepseek model doc

* update deepseek model doc

* update deepseek model doc

* cwb support DSK_V32 Model

* update DSK_V32_DSA modeling

* Ibin Support DSK_DSA

* update kernel

* update yaml

* update requirements

* update pre_commit

* update model-runner

* fix CI bug

* del start.sh

* fix iluvatar_model_runner

* update DSA & add unit tests

* update import deep_gemm
2026-03-11 15:52:54 +08:00
AIbin c3aceb6bdc [Models][OP][Optimization] Support DeepSeek-v3.2 model, integrate DSA & Indexer architecture with FlashMLA/DeepGEMM (#6689)
* Support DeepSeek-v3.2 model, integrate DSA & Indexer architecture with FlashMLA/DeepGEMM
2026-03-10 15:05:14 +08:00