apps/FastDeploy
Mirror of https://github.com/PaddlePaddle/FastDeploy.git, synced 2026-04-23 00:17:25 +08:00
Files
Commit 3cc09418f1574369442d49292d41925887acd1c7
FastDeploy/fastdeploy/model_executor/layers/attention
Latest commit 3cc09418f1: support dsv3 use flashmla (#6593) by 周周周, 2026-03-03 11:09:43 +08:00
ops/
    seq_lens related tensor shape -> [max_num_seqs] (#6535), 2026-03-02 11:18:30 +08:00
__init__.py
    [XPU] move xpu_attn_backend.py to FastDeploy/fastdeploy/model_executor/layers/backends/xpu (#5878), 2026-01-09 16:34:57 +08:00
append_attn_backend.py
    [Feature] Supports SWA based on appendattn (#6547), 2026-03-01 19:02:08 +08:00
attention_selecter.py
    …
attention.py
    Support Norm before Rope (#6332), 2026-02-05 15:28:52 +08:00
base_attention_backend.py
    …
block_multihead_attn_backend.py
    [Feature] Support reorder ids to split prefill and decodes (#5779), 2026-02-03 00:28:02 -08:00
flash_attn_backend.py
    [BugFix] lazy enable_torch_proxy for cutlass (#6523), 2026-03-02 10:43:58 +08:00
flash_mask_attn_backend.py
    [MTP] refactor MTP pre_process (#6358), 2026-02-09 10:47:15 +08:00
iluvatar_attn_backend.py
    [Iluvatar] Support CudaGraph and optimize flash_attn_unpadded and fused_neox_rope_embedding (#6553), 2026-03-02 14:07:17 +08:00
mla_attention_backend.py
    support dsv3 use flashmla (#6593), 2026-03-03 11:09:43 +08:00
moba_attention_backend.py
    [Feature] Support reorder ids to split prefill and decodes (#5779), 2026-02-03 00:28:02 -08:00
native_paddle_backend.py
    …
utils.py
    …