apps/FastDeploy
mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2026-05-06 23:49:39 +08:00
490a6551dcff20d7b578e03d9bac1e981e07efc4
FastDeploy/fastdeploy/cache_manager
qwes5s5 b2a2e11551 [Feature] Support stopping the inference for the corresponding request in the online service after a disconnection request. (#5320)
* request disconnect

* request disconnect

* fix bug

* fix bug--amend

---------

Co-authored-by: root <root@yq01-sys-rpm26xc1knu.yq01.baidu.com>
2026-01-16 11:46:13 +08:00
Name | Last commit | Updated
transfer_factory | [BugFix] [MultiAPIServer] fix rdma script and port check for multi api server (#5935) | 2026-01-12 10:38:52 +08:00
__init__.py | polish code with new pre-commit rule (#2923) | 2025-07-19 23:19:27 +08:00
cache_data.py | [Feature] Support KV Cache Storage (#5571) | 2025-12-25 16:30:35 +08:00
cache_messager.py | [Feature] get_output_kv_signal blocking read mode & send_first_token (#5836) | 2026-01-15 14:11:03 +08:00
cache_metrics.py | [BugFix] Refine the preparation of cpu and storage cache (#5777) | 2026-01-05 10:13:30 +08:00
cache_transfer_manager.py | [BugFix] fix cache transfer manager updating/clearing (#5930) | 2026-01-13 05:09:29 -08:00
multimodal_cache_manager.py | [Optimization] support mm prefill batch (#5313) | 2025-12-11 22:21:14 +08:00
ops.py | [Metax] adapt prefix caching & cpu swap (#5844) | 2025-12-31 17:02:48 +08:00
prefix_cache_manager.py | [Feature] Support stopping the inference for the corresponding request in the online service after a disconnection request. (#5320) | 2026-01-16 11:46:13 +08:00