FastDeploy/fastdeploy/model_executor/layers/attention/ops at 888c4b992dd9081881a2dfeed445f4e630788eed - FastDeploy - 子说镜像小站

apps/FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2026-04-23 00:17:25 +08:00

周周周 31410415db FA3 support qwen3 (#5441)

2025-12-09 16:16:16 +08:00

__init__.py

[Feature] support flash_mask_attention backend (#5134)

2025-11-28 10:12:16 +08:00

append_attention.py

Support GPT-OSS-BF16 (#4240)

2025-10-20 14:44:58 +08:00

flash_mask_attention.py

[Feature] support flash_mask_attention backend (#5134)

2025-11-28 10:12:16 +08:00

get_block_shape_and_split_kv_block.py

[Others] get_block_shape_and_split_kv_block clean code (#5123)

2025-11-20 16:40:04 +08:00

gqa_rope_write_cache.py

FA3 support qwen3 (#5441)

2025-12-09 16:16:16 +08:00

init_kv_signal_per_query.py

[PD Disaggregation][XPU] Add XPU support for PD disaggregation (#5113)

2025-11-21 14:09:01 +08:00

init_signal_layerwise.py

[PD Disaggregation][XPU] Add XPU support for PD disaggregation (#5113)

2025-11-21 14:09:01 +08:00

open_shm_and_get_meta_signal.py

[PD Disaggregation][XPU] Add XPU support for PD disaggregation (#5113)

2025-11-21 14:09:01 +08:00

pre_cache_len_concat.py

[Others] Remove useless code (#5404)

2025-12-08 13:59:46 +08:00