apps/FastDeploy
mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2026-04-23 17:11:21 +08:00
Files
8995a38fa4d9a636a855e1dac8e766948379ecf6
FastDeploy/fastdeploy/model_executor/layers/attention
Latest commit: 616b29ce08 by chen: check init_flash_attn_version log (#7399) (2026-04-15 11:05:10 +08:00)
| Name | Last commit | Date |
|------|-------------|------|
| ops | [Feature] Support cute cpp Encoder FA4 (#7016) | 2026-03-30 10:54:56 +08:00 |
| triton_ops | [Optimization] Use a separate driver when using Triton with Paddle (#6897) | 2026-03-24 10:56:00 +08:00 |
| __init__.py | [Iluvatar] refactor attn and moe code (#6887) | 2026-03-18 10:31:00 +08:00 |
| append_attn_backend.py | Split enable_mm (#7183) | 2026-04-08 11:25:41 +08:00 |
| attention_selecter.py | … | … |
| attention.py | … | … |
| base_attention_backend.py | … | … |
| block_multihead_attn_backend.py | … | … |
| dsa_attention_backend.py | [DeepSeekV3.2][Graph Optimization] Remove synchronous operation to avoid capture fail and unnecessary contiguous in DSA Backend (#7253) | 2026-04-09 11:00:13 +08:00 |
| dsa_helper.py | Dsa clean code, add dsk_attn_write_cache baseline (#6855) | 2026-03-16 11:01:14 +08:00 |
| flash_attn_backend.py | check init_flash_attn_version log (#7399) | 2026-04-15 11:05:10 +08:00 |
| flash_mask_attn_backend.py | [BugFix] Fix batch_size derivation and relax shape checks in SM90 flash_mask_attn (#7210) | 2026-04-09 11:05:10 +08:00 |
| mla_attention_backend.py | [Docs][BugFix] fix mla log (#7243) | 2026-04-13 12:15:43 +08:00 |
| moba_attention_backend.py | [BugFix][Optimization] Replace silent failures with catchable exceptions and informative error messages (#6533) | 2026-03-16 21:32:43 +08:00 |
| native_paddle_backend.py | … | … |
| utils.py | … | … |