[Feature] [KVCache] support attention_store kv cache backend (#5823)

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2026-04-23 17:11:21 +08:00

* [feat] support attention_store kv cache backend

* [fix] fix codestyle

* [chore] optimize log

* [fix] fix write storage task

* [fix] fix read storage

* [fix] fix code conflict after merge develop

* [fix] fix cache bytes and read task token ids

* [chore] add model for cache transfer manager

* [chore] add some log

* [chore] remove launched_cache_manager_signal

* [fix] fix write_back_storage_task match_block_num condition

* [fix] fix swap_cost_time

* [ci] fix ci

* Update fastdeploy/engine/sched/resource_manager_v1.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update fastdeploy/cache_manager/cache_transfer_manager.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update fastdeploy/cache_manager/transfer_factory/mooncake_store/attention_store.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Yonghua Li, 2026-01-22 21:01:23 +08:00, committed by GitHub
parent 3cd0ffe36c
commit 8d27a523e7

17 changed files with 599 additions and 226 deletions

									
tests/cache_manager/test_prefix_cache_manager.py (+1)
@@ -185,6 +185,7 @@ def _create_manager(
        swap_space=4,
    )
    model_config = SimpleNamespace(
        model="test_model",
        num_attention_heads=1,
        num_key_value_heads=1,
        head_dim=1,