FastDeploy
Mirror of https://github.com/PaddlePaddle/FastDeploy.git (synced 2026-04-23 08:21:53 +08:00)
Tree at commit da6b4c10e53ff64a34edf8d9d1f06fb79f8bd225
Path: FastDeploy/fastdeploy/model_executor/layers/attention
Latest commit: da6b4c10e5 by 周周周, "[ATTENTION] make buffer alloc as a function (#4945)", 2025-11-11 19:17:08 +08:00
Name | Last commit | Last updated
ops/ | Support GPT-OSS-BF16 (#4240) | 2025-10-20 14:44:58 +08:00
__init__.py | … | …
append_attn_backend.py | [ATTENTION] make buffer alloc as a function (#4945) | 2025-11-11 19:17:08 +08:00
attention_selecter.py | … | …
attention.py | fix Cfp8 for RL load (#4144) | 2025-11-03 17:51:51 +08:00
base_attention_backend.py | … | …
block_multihead_attn_backend.py | … | …
flash_attn_backend.py | Support GPT-OSS-BF16 (#4240) | 2025-10-20 14:44:58 +08:00
iluvatar_attn_backend.py | [Iluvatar GPU] Adapt VL model (#4313) | 2025-10-17 16:13:38 +08:00
mla_attention_backend.py | … | …
moba_attention_backend.py | [FDConfig]Remove total_block_num/dtype/block_size/enc_dec_block_num in ParallelConfig (#4400) | 2025-10-16 20:00:37 +08:00
native_paddle_backend.py | … | …
utils.py | supports pd partn (#4615) | 2025-11-04 16:36:35 +08:00
xpu_attn_backend.py | [XPU] Support PaddleOCR-VL model for XPU (#4529) | 2025-10-28 20:35:04 +08:00