FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2026-04-23 00:17:25 +08:00

Files

Ryan 49cea8fb1c [SOT][Cudagraph] Remove BreakGraph of #3302 && update CustomOp (#3694)

* rm inplace info && to(gpu)

* update append_attention

* unpin paddle version

* add full_cuda_graph=False

* add blank line

---------

Co-authored-by: SigureMo <sigure.qaq@gmail.com>

2025-10-17 10:57:55 +08:00

__init__.py

Add with_output version AppendAttention (#3302)

2025-08-28 17:10:18 +08:00

append_attention.py

[SOT][Cudagraph] Remove BreakGraph of #3302 && update CustomOp (#3694)

2025-10-17 10:57:55 +08:00

get_block_shape_and_split_kv_block.py

[Optimization] Fuse get_max_len and get_kv_max_len (#4369)

2025-10-13 20:35:00 +08:00

gqa_rope_write_cache.py

support fa3 rope3d (#3622)

2025-08-27 11:31:29 +08:00

init_kv_signal_per_query.py

polish code with new pre-commit rule (#2923)

2025-07-19 23:19:27 +08:00

init_signal_layerwise.py

polish code with new pre-commit rule (#2923)

2025-07-19 23:19:27 +08:00

open_shm_and_get_meta_signal.py

polish code with new pre-commit rule (#2923)

2025-07-19 23:19:27 +08:00

pre_cache_len_concat.py

polish code with new pre-commit rule (#2923)

2025-07-19 23:19:27 +08:00