FastDeploy/fastdeploy/model_executor/ops/triton_ops at 0b4c1cba9b7619b292687f381069f9a6244bf9e8 - FastDeploy - 子说镜像小站

apps/FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2026-04-24 01:29:57 +08:00

Nyakku Shigure dd93f8ffb4 [Optimization] Skip compat guard when torch is not installed (#6913)

2026-03-19 11:29:27 +08:00

__init__.py

[Models][OP][Optimization] Support DeepSeek-v3.2 model, integrate DSA & Indexer architecture with FlashMLA/DeepGEMM (#6689)

2026-03-10 15:05:14 +08:00

pre_token_quant_fp8_kernel.py

[Models][OP][Optimization] Support DeepSeek-v3.2 model, integrate DSA & Indexer architecture with FlashMLA/DeepGEMM (#6689)

2026-03-10 15:05:14 +08:00

qk_rmsnorm_fused_kernel.py

[Optimization] Accelerate Qwen3 QK RMSNorm via Fused Triton Kernel (#5880)

2026-01-12 05:10:21 -08:00

repetition_early_stop_kernel.py

[Feature] Support repetition early stop (#3024)

2025-07-29 22:42:54 +08:00

triton_utils_v2.py

[BugFix][Optimization] Replace silent failures with catchable exceptions and informative error messages (#6533)

2026-03-16 21:32:43 +08:00

triton_utils.py

[Optimization] Skip compat guard when torch is not installed (#6913)

2026-03-19 11:29:27 +08:00

wint2_fused_moe_kernel.py

polish code with new pre-commit rule (#2923)

2025-07-19 23:19:27 +08:00