FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2026-04-24 01:29:57 +08:00

Files

AIbin c3aceb6bdc [Models][OP][Optimization] Support DeepSeek-v3.2 model, integrate DSA & Indexer architecture with FlashMLA/DeepGEMM (#6689)

2026-03-10 15:05:14 +08:00

__init__.py                      2026-03-10 15:05:14 +08:00
pre_token_quant_fp8_kernel.py    2026-03-10 15:05:14 +08:00
qk_rmsnorm_fused_kernel.py       2026-01-12 05:10:21 -08:00
repetition_early_stop_kernel.py  2025-07-29 22:42:54 +08:00
triton_utils_v2.py               2025-08-01 10:46:20 +08:00
triton_utils.py                  2026-01-12 05:10:21 -08:00
wint2_fused_moe_kernel.py        2025-07-19 23:19:27 +08:00