apps/FastDeploy
Mirror of https://github.com/PaddlePaddle/FastDeploy.git, last synced 2026-05-08 16:32:41 +08:00
Commit: 2fb2c0f46a8d688cc3b1eb0001ea3a7bc2abaef1
Path: FastDeploy/fastdeploy/model_executor/layers/attention

Latest commit: lifulll 72094d4d82 enable dcu ci (#3402), 2025-08-29 10:23:08 +08:00
| File | Last commit | Date |
| --- | --- | --- |
| ops | Add with_output version AppendAttention (#3302) | 2025-08-28 17:10:18 +08:00 |
| __init__.py | Revert "[Feature] block sparse attention (#3209)" (#3647) | 2025-08-27 17:35:04 +08:00 |
| append_attn_backend.py | add input_processor plugin (#3657) | 2025-08-28 22:53:57 +08:00 |
| attention_selecter.py | polish code with new pre-commit rule (#2923) | 2025-07-19 23:19:27 +08:00 |
| attention.py | Revert "[Feature] block sparse attention (#3209)" (#3647) | 2025-08-27 17:35:04 +08:00 |
| base_attention_backend.py | [MetaxGPU] Support FastDeploy on metax gpu (#3241) | 2025-08-13 11:11:54 +08:00 |
| block_multihead_attn_backend.py | enable dcu ci (#3402) | 2025-08-29 10:23:08 +08:00 |
| flash_attn_backend.py | Add with_output version AppendAttention (#3302) | 2025-08-28 17:10:18 +08:00 |
| iluvatar_attn_backend.py | [Iluvatar GPU] Optimze attention and moe performance (#3234) | 2025-08-08 10:51:24 +08:00 |
| mla_attention_backend.py | Add custom op declaration for all_reduce (#3473) | 2025-08-20 20:29:58 +08:00 |
| native_paddle_backend.py | polish code with new pre-commit rule (#2923) | 2025-07-19 23:19:27 +08:00 |
| utils.py | polish code with new pre-commit rule (#2923) | 2025-07-19 23:19:27 +08:00 |
| xpu_attn_backend.py | [Executor] Refactor GetBlockShapeAndSplitKVBlock Kernel (#2989) | 2025-07-31 00:09:31 +08:00 |