AIbin
|
48d2bbeb74
|
fix dsa (#7252)
|
2026-04-08 20:21:38 +08:00 |
|
AIbin
|
bf7e2424d0
|
[Optimization][Feature] Support multiple batches of DSK-DSA (#6930)
* support DSA_MUTI_BATCH
* update test topk
* update dsk-dsa
|
2026-03-20 15:59:22 +08:00 |
|
AIbin
|
cb6819d086
|
[Optimization][OP] Support per_token_group_fp8_quant CUDA kernel (#6865)
* support per_token_group_fp8_quant cuda kernel
* Potential fix for pull request finding
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
* update code
|
2026-03-17 19:17:51 +08:00 |
|
AIbin
|
c9f7f5234e
|
[Optimization][BugFix] Optimize DeepSeek networking code (#6861)
* update dsk model
|
2026-03-16 16:52:43 +08:00 |
|
AIbin
|
1118351b27
|
[Optimization] Update DeepSeek-v3.2 model and DSA-Indexer networking and add some unit tests (#6762)
* add deepseek model doc
* cwb support DSK_V32 Model
* update DSK_V32_DSA modeling
* Ibin Support DSK_DSA
* update kernel
* update yaml
* update requirements
* update pre_commit
* update model-runner
* fix CI bug
* del start.sh
* fix iluvatar_model_runner
* update DSA & add unit tests
* update import deep_gemm
|
2026-03-11 15:52:54 +08:00 |
|
AIbin
|
c3aceb6bdc
|
[Models][OP][Optimization] Support DeepSeek-v3.2 model, integrate DSA & Indexer architecture with FlashMLA/DeepGEMM (#6689)
* Support DeepSeek-v3.2 model, integrate DSA & Indexer architecture with FlashMLA/DeepGEMM
|
2026-03-10 15:05:14 +08:00 |
|