FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2026-04-23 00:17:25 +08:00

Author	SHA1	Message	Date
jc	04fde3b227	[PD Disaggregation] Prefill and decode support cache storage (#6768 ) * Prefill and decode support cache storage * up * up * update docs and refine mooncake store * up	2026-03-16 14:44:49 +08:00
mouxin	6e96bd0bd2	[Feature] Fix counter release logic & update go-router download URL (#6280 ) * [Doc] Update prerequisites in the documentation * [Feature] Enhance Router with /v1/completions, docs, scripts, and version info * [Feature] Enhance Router with /v1/completions, docs, scripts, and version info * [Feature] Enhance Router with /v1/completions, docs, scripts, and version info * [Feature] Fix counter release logic * [Feature] Update go-router download URL * [Feature] Update go-router download URL * [Feature] Update go-router download URL * [Feature] Update go-router download URL * [Feature] Update token counter logic and docs * [Feature] Update token counter logic and docs --------- Co-authored-by: mouxin <mouxin@baidu.com>	2026-02-04 15:02:38 +08:00
mouxin	506f1545cd	[Feature] Enhance Router with /v1/completions, docs, scripts, and version info (#5966 ) * [Doc] Update prerequisites in the documentation * [Feature] Enhance Router with /v1/completions, docs, scripts, and version info * [Feature] Enhance Router with /v1/completions, docs, scripts, and version info --------- Co-authored-by: mouxin <mouxin@baidu.com>	2026-01-30 10:28:48 +08:00
Cheng Yanfei	fbcccaa750	[Intel HPU] enable MoE EP for hpu (#5855 ) * enable HPU MoE EP * MoE intermediate_scale stack * enable loader_v1 esp for tensor_wise_fp8 TP or EP * modify activation_scale name	2026-01-15 13:08:00 +08:00
mouxin	0a92e96f20	[Feature] Add Golang-based Router for Request Scheduling and Load Balancing (#5882 ) * [Feature] add golang router * [Feature] add golang router * [Feature] add golang router * [Feature] add golang router * [Feature] add golang router * [Feature] Add Golang-based Router for Request Scheduling and Load Balancing * [Feature] Add Golang-based Router for Request Scheduling and Load Balancing * [Feature] Add Golang-based Router for Request Scheduling and Load Balancing * [Feature] Add Golang-based Router for Request Scheduling and Load Balancing --------- Co-authored-by: mouxin <mouxin@baidu.com>	2026-01-07 21:28:08 +08:00
jc	e9b25aa72f	[BugFix] Storage backend gets env params (#5892 ) * Storage backend gets env params * up * up * up	2026-01-06 14:14:17 +08:00
jc	8d384f9fd8	[PD Disaggregation] Update usage of pd disaggregation and data parallel (#5742 ) * Update usage of pd disaggregation * up * up * up * up * up * up * up * up * up * up dp docs * up * up * up * fix unittest	2026-01-05 17:51:29 +08:00
jc	e911ac2ce7	[BugFix] Refine the preparation of cpu and storage cache (#5777 ) * Refine the preparation of cpu and storage cache * fix error * fix error * up * fix * up docs * fix unittest * remove debug info	2026-01-05 10:13:30 +08:00
Juncai	412867fd99	[Feature] Support KV Cache Storage (#5571 ) * Support Mooncake Store * up * up * add op * fix conflict * fix error * up for comments * avoid thread lock * up * fix unittest * fix unittest * remove debug info * consider tp_size > 1 * add default rdma_nics * add utils * up * fix error --------- Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>	2025-12-25 16:30:35 +08:00
fmiao2372	a8fce47195	[Intel HPU] enable kv cache scheduler v1 for hpu (#5648 ) * [Intel HPU] enable kv cache scheduler v1 for hpu * fix copilt comments	2025-12-19 12:03:39 +08:00
Yonghua Li	0c8c6369ed	[Feature] [PD Disaggregation] simplify configuration for pd-disaggregated deployment, and refactor post-init and usage for all ports (#5415 ) * [feat] simplify configuration for pd-disaggregated deployment, and refactor post-init and usage for all ports * [fix] fix some bugs * [fix] fix rdma port for cache manager/messager * [fix] temporarily cancel port availability check to see if it can pass ci test * [feat] simplify args for multi api server * [fix] fix dp * [fix] fix port for xpu * [fix] add tests for ports post processing & fix ci * [test] fix test_multi_api_server * [fix] fix rdma_comm_ports args for multi_api_server * [fix] fix test_common_engine * [fix] fix test_cache_transfer_manager * [chore] automatically setting FD_ENABLE_MULTI_API_SERVER * [fix] avoid api server from creating engine_args twice * [fix] fix test_run_batch * [fix] fix test_metrics * [fix] fix splitwise connector init * [test] add test_rdma_transfer and test_expert_service * [fix] fix code syntax * [fix] fix test_rdma_transfer and build wheel with rdma script	2025-12-17 15:50:42 +08:00
xiaolei373	a30b4da260	[Feature] Tracing: Fine-Grained Tracing for Request Latency Part1 (#5458 )	2025-12-16 16:36:09 +08:00
Yonghua Li	f4119d51b4	[PD Disaggregation] support DP via v1 router and decouple DP and EP (#5197 ) * [fix] support DP via v1 router and decouple DP and EP * [fix] fix scripts * [fix] reset model path * [fix] dp use get_output_ep, fix router port type, update scripts * [merge] merge with latest code * [chore] remove some debug log * [fix] fix code style check * [fix] fix test_multi_api_server for log_dir name * [chore] reduce logs * Apply suggestions from code review Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-12-04 15:38:43 +08:00
fmiao2372	429dd2b1db	[Intel HPU] add example benchmark scripts for hpu (#5304 ) * [Intel HPU] add example benchmark scripts for hpu * Revise the code based on the copilot comments * update code based on comments * update ci ops version	2025-12-02 18:00:01 +08:00
K11OntheBoat	2e1680838f	[PD Disaggregation] Support PD deployment of DeepSeekv3. (#5251 ) * Support deepseekv3 cache transfer for PD deploy * clean some log info --------- Co-authored-by: K11OntheBoat <“ruianmaidanglao@163.com”>	2025-12-02 14:11:50 +08:00
Juncai	f9b0545a7f	[PD Disaggregation] [Refine] Refine splitwise deployment (#5151 ) * Refine splitwise deployment * up	2025-11-21 15:30:24 +08:00
Yonghua Li	43097a512a	[BugFix] [PD Disaggregation] fix v1 scheduler prefill node profile run & ipc transfer protocol (#5132 ) * [fix] fix v1 scheduler profile run for append attention in prefill node * [fix] skip send_signal if kv signal not inited for gpu and xpu * [fix] extend fix to flash_attn & mla_attn * [fix] fix v1 pd run in ipc transfer protocol * [ci] add test for v1 pd profile run using ipc transfer protocol * [style] fix code style check * [style] fix code style again * [fix] fix profile run * [update] remove --num-gpu-blocks-override in example script * [chore] rename forward_meta is_profiling to is_dummy_or_profile_run	2025-11-20 21:39:22 +08:00
Juncai	36822fa49c	[PD Disaggregation] remove splitwise deployment on single node and refine the code (#4891 ) * remove splitwise deployment on single node and refine the code * up * up * up * add test * up	2025-11-14 09:56:53 +08:00
Juncai	08ca0f6aea	[Feature] [PD] add simple router and refine splitwise deployment (#4709 ) * add simple router and refine splitwise deployment * fix	2025-11-06 14:56:02 +08:00
jiangjiajun	684703fd72	[LLM] First commit the llm deployment code	2025-06-09 19:20:15 +08:00
Jules	4f4f2e14bf	fix Windows text encoding issue causing infinite loop	2025-02-14 18:40:00 +08:00
Yutian Rao	a300abde8c	Update README.md: fix the wording OenVINO => OpenVINO	2024-10-29 15:22:49 +08:00
DefTruth	12bb44e0de	[Bug Fix] fix build xpu encrypt & auth image scripts (#2133 ) * [patchelf] fix patchelf error for inference xpu * [serving] add xpu dockerfile and support fd server * [serving] add xpu dockerfile and support fd server * [Serving] support XPU + Tritron * [Serving] support XPU + Tritron * [Dockerfile] update xpu tritron docker file -> paddle 0.0.0 * [Dockerfile] update xpu tritron docker file -> paddle 0.0.0 * [Dockerfile] update xpu tritron docker file -> paddle 0.0.0 * [Dockerfile] add comments for xpu tritron dockerfile * [Doruntime] fix xpu infer error * [Doruntime] fix xpu infer error * [XPU] update xpu dockerfile * add xpu triton server docs * add xpu triton server docs * add xpu triton server docs * add xpu triton server docs * update xpu triton server docs * update xpu triton server docs * update xpu triton server docs * update xpu triton server docs * update xpu triton server docs * update xpu triton server docs * update xpu triton server docs * update xpu triton server docs * [XPU] Update XPU L3 Cache setting docs * [XPU] Add Encryption and AUTH support for XPU Server * [XPU] Add Encryption and AUTH support for XPU Server * [Bug Fix] fix paddle reader error * [Serving] Support XPU encrypt & auth server * [Serving] Support XPU encrypt & auth server * [Serving] Support XPU encrypt & auth server * [Serving] Support XPU encrypt & auth server * [Triton] switch TAG 22.12 -> TAG 21.10wq * update xpu auth server script * [Bug Fix] fix build xpu encrypt & auth image scripts	2023-07-24 21:00:05 +08:00
jack xu	821adb387e	[Bug Fix] Fixed the issue with the incorrect path to FastDeploy.cmake in CMakeLists.txt file. (#2082 ) [Bug Fix] fixed CMakeLists.txt FastDeploy.cmake path	2023-07-04 13:04:19 +08:00
zengshao0622	79a3587339	[Model] Add Paddle3D CenterPoint model (#2078 ) * add centerpoint * update for review comments	2023-07-03 13:39:16 +08:00
YuBinglei	5f9e8b6e08	[Bug Fix] Re-Fix OCR Serving bug. #1516 (#2011 ) see https://github.com/PaddlePaddle/FastDeploy/pull/1516 Co-authored-by: DefTruth <31974251+DefTruth@users.noreply.github.com>	2023-06-09 13:44:55 +08:00
Zheng-Bicheng	8d357814e8	[Backend] Add pybind & PaddleDetection example for TVM (#1998 ) * update * update * Update infer_ppyoloe_demo.cc --------- Co-authored-by: DefTruth <31974251+DefTruth@users.noreply.github.com>	2023-06-04 13:26:47 +08:00
DefTruth	434b48dda5	[Serving] Support FastDeploy XPU Triton Server (#1994 ) * [patchelf] fix patchelf error for inference xpu * [serving] add xpu dockerfile and support fd server * [serving] add xpu dockerfile and support fd server * [Serving] support XPU + Tritron * [Serving] support XPU + Tritron * [Dockerfile] update xpu tritron docker file -> paddle 0.0.0 * [Dockerfile] update xpu tritron docker file -> paddle 0.0.0 * [Dockerfile] update xpu tritron docker file -> paddle 0.0.0 * [Dockerfile] add comments for xpu tritron dockerfile * [Doruntime] fix xpu infer error * [Doruntime] fix xpu infer error * [XPU] update xpu dockerfile * add xpu triton server docs * add xpu triton server docs * add xpu triton server docs * add xpu triton server docs * update xpu triton server docs * update xpu triton server docs * update xpu triton server docs * update xpu triton server docs * update xpu triton server docs * update xpu triton server docs * update xpu triton server docs * update xpu triton server docs	2023-05-29 14:38:25 +08:00
Zheng-Bicheng	643730bf5f	[Hackathon 181] Add TVM support for FastDeploy on macOS (#1969 ) * update for tvm backend * update third_party * update third_party * update * update * update * update * update * update * update * update --------- Co-authored-by: DefTruth <31974251+DefTruth@users.noreply.github.com>	2023-05-25 19:59:02 +08:00
CoolCola	e3b285c762	[Model] Support Paddle3D PETR v2 model (#1863 ) * Support PETR v2 * make petrv2 precision equal with the origin repo * delete extra func * modify review problem * delete visualize * Update README_CN.md * Update README.md * Update README_CN.md * fix build problem * delete external variable and function --------- Co-authored-by: DefTruth <31974251+DefTruth@users.noreply.github.com>	2023-05-19 10:45:36 +08:00
Qianhe Chen	09ec386e8d	[Bug Fix] Fix speech and silence state transition in VAD (#1937 ) * Fix speech and silence state transition * Fix typo * Fix speech and silence state transition --------- Co-authored-by: DefTruth <31974251+DefTruth@users.noreply.github.com>	2023-05-16 18:50:04 +08:00
DefTruth	77cb9db6da	[Model] Support PP-ShiTuV2 models for PaddleClas (#1900 ) * [cmake] add faiss.cmake -> pp-shituv2 * [PP-ShiTuV2] Support PP-ShituV2-Det model * [PP-ShiTuV2] Support PP-ShiTuV2-Det model * [PP-ShiTuV2] Add PPShiTuV2Recognizer c++&python support * [PP-ShiTuV2] Add PPShiTuV2Recognizer c++&python support * [Bug Fix] fix ppshitu_pybind error * [benchmark] Add ppshituv2-det c++ benchmark * [examples] Add PP-ShiTuV2 det & rec examples * [vision] Update vision classification result * [Bug Fix] fix trt shapes setting errors	2023-05-08 14:04:09 +08:00
seyosum	df8dd3e3ac	[Hackthon_4th 180] Support HORIZON BPU Backend for FastDeploy (#1822 ) * add horizon backend and PPYOLOE examples * Change horizon header file coding conventions * Change horizon header file coding conventions * Change horizon header file coding conventions * Add download and automatic installation of horizon packages * Add UseHorizonNPUBackend Method * Remove redundant header files left after building the FD SDK, and adjust some conventions * Update horizon.md * Update horizon.md --------- Co-authored-by: DefTruth <31974251+DefTruth@users.noreply.github.com>	2023-05-06 16:10:37 +08:00
DefTruth	6d0261e9e4	[Model] Support PP-StructureV2-Layout model (#1867 ) * [Model] init pp-structurev2-layout code * [Model] init pp-structurev2-layout code * [Model] init pp-structurev2-layout code * [Model] add structurev2_layout_preprocessor * [PP-StructureV2] add postprocessor and layout detector class * [PP-StructureV2] add postprocessor and layout detector class * [PP-StructureV2] add postprocessor and layout detector class * [PP-StructureV2] add postprocessor and layout detector class * [PP-StructureV2] add postprocessor and layout detector class * [pybind] add pp-structurev2-layout model pybind * [pybind] add pp-structurev2-layout model pybind * [Bug Fix] fixed code style * [examples] add pp-structurev2-layout c++ examples * [PP-StructureV2] add python example and docs * [benchmark] add pp-structurev2-layout benchmark support	2023-05-05 13:05:58 +08:00
thunder95	2c5fd91a7f	[Hackthon_4th 242] Support en_ppstructure_mobile_v2.0_SLANet (#1816 ) * first draft * update api name * fix bug * fix bug and * fix bug in c api * fix bug in c_api --------- Co-authored-by: DefTruth <31974251+DefTruth@users.noreply.github.com>	2023-04-27 10:45:14 +08:00
thunder95	51be3fea78	[Hackthon_4th 177] Support PP-YOLOE-R with BM1684 (#1809 ) * first draft * add robx iou * add benchmark for ppyoloe_r * remove trash code * fix bugs * add pybind nms rotated option * add missing head file * fix bug * fix bug2 * fix shape bug --------- Co-authored-by: DefTruth <31974251+DefTruth@users.noreply.github.com>	2023-04-21 10:48:05 +08:00
yeliang2258	a509dd8ec1	[Model] Add Paddle3D smoke model (#1766 ) * add smoke model * add 3d vis * update code * update doc * mv paddle3d from detection to perception * update result for velocity * update code for CI * add set input data for TRT backend * add serving support for smoke model * update code * update code * update code --------- Co-authored-by: DefTruth <31974251+DefTruth@users.noreply.github.com>	2023-04-14 16:30:56 +08:00
yeliang2258	e2f5a9ce66	[Model] Add picodet for RV1126 and A311D (#1549 ) * add infer for picodet * update code * update lite lib --------- Co-authored-by: DefTruth <31974251+DefTruth@users.noreply.github.com>	2023-04-10 22:04:45 +08:00
hjyp	cc4bbf2163	[PaddlePaddle Hackathon4 No.185] Add PaddleDetection Models Deployment Java Examples (#1782 ) * add java examples * fix detail * fix pre-config --------- Co-authored-by: DefTruth <31974251+DefTruth@users.noreply.github.com>	2023-04-10 21:23:44 +08:00
Zheng-Bicheng	109d1046ae	[Model] add function for setting anchor rknpu2 (#1728 ) * add function for setting anchor rknpu2 add more demo for rknpu2 fixed md error * Update config.h --------- Co-authored-by: DefTruth <31974251+DefTruth@users.noreply.github.com>	2023-04-04 20:33:06 +08:00
wanziyu	95c977c638	[PaddlePaddle Hackathon4 No.184] Add PaddleDetection Models Deployment Rust Examples (#1717 ) * [PaddlePaddle Hackathon4 No.186] Add PaddleDetection Models Deployment Go Examples Signed-off-by: wanziyu <ziyuwan@zju.edu.cn> * Fix YOLOv8 Deployment Go Example Signed-off-by: wanziyu <ziyuwan@zju.edu.cn> * [Hackathon4 No.184] Add PaddleDetection Models Deployment Rust Examples Signed-off-by: wanziyu <ziyuwan@zju.edu.cn> * Add main and cargo files in examples Signed-off-by: wanziyu <ziyuwan@zju.edu.cn> --------- Signed-off-by: wanziyu <ziyuwan@zju.edu.cn> Co-authored-by: DefTruth <31974251+DefTruth@users.noreply.github.com>	2023-04-03 11:19:28 +08:00
Yi-sir	9e20dab0d6	[Example] Merge Download Paddle Model, Paddle->ONNX->MLIR->BModel (#1643 ) * fix infer.py and README * [Example] Merge Download Paddle Model, Paddle->Onnx->Mlir->Bmodel and inference into infer.py. Modify README.md * modify pp_liteseg sophgo infer.py and README.md * fix PPOCR,PPYOLOE,PICODET,LITESEG sophgo infer.py and README.md * fix memory overflow problem while inferring with sophgo backend * fix memory overflow problem while inferring with sophgo backend --------- Co-authored-by: DefTruth <31974251+DefTruth@users.noreply.github.com> Co-authored-by: xuyizhou <yizhou.xu@sophgo.com>	2023-03-31 15:08:01 +08:00
wanziyu	b1d2903b93	[PaddlePaddle Hackathon4 No.186] Add PaddleDetection Models Deployment Go Examples (#1648 ) * [PaddlePaddle Hackathon4 No.186] Add PaddleDetection Models Deployment Go Examples Signed-off-by: wanziyu <ziyuwan@zju.edu.cn> * Fix YOLOv8 Deployment Go Example Signed-off-by: wanziyu <ziyuwan@zju.edu.cn> --------- Signed-off-by: wanziyu <ziyuwan@zju.edu.cn> Co-authored-by: DefTruth <31974251+DefTruth@users.noreply.github.com>	2023-03-28 20:30:03 +08:00
wangguoya	c61a07712e	fix bug for kunlunxin run sd demo for uing fp16 (#1680 ) * modify sd infer.py for using paddle_kunlunxin_fp16 * Update infer.py * [fix bug] fix bug sd in demo infer.py for kunlunxin using fp16	2023-03-27 14:04:21 +08:00
yunyaoXYY	f36f9324de	[Docs] Pick PPOCR fastdeploy docs from PaddleOCR (#1534 ) * Pick PPOCR fastdeploy docs from PaddleOCR * improve ppocr * improve readme * remove old PP-OCRv2 and PP-OCRv3 folfers * rename kunlun to kunlunxin * improve readme * improve readme * improve readme --------- Co-authored-by: Jason <jiangjiajun@baidu.com> Co-authored-by: DefTruth <31974251+DefTruth@users.noreply.github.com>	2023-03-23 13:11:19 +08:00
yunyaoXYY	c91e99b5f5	[Docs] Pick paddleclas fastdeploy docs from PaddleClas (#1654 ) * Adjust folders structures in paddleclas * remove useless files * Update sophgo * improve readme	2023-03-23 13:06:09 +08:00
DefTruth	af18e597d0	[Docs] rename ppseg kunlun docs -> kunlunxin (#1662 ) * [Docs] rename ppseg kunlun -> kunlunxin * [Docs] rename ppseg fastdeploy kunlun docs -> kunlunxin	2023-03-20 19:46:18 +08:00
DefTruth	5b143219ce	[Docs] Pick seg fastdeploy docs from PaddleSeg (#1482 ) * [Docs] Pick seg fastdeploy docs from PaddleSeg * [Docs] update seg docs * [Docs] Add c&csharp examples for seg * [Docs] Add c&csharp examples for seg * [Doc] Update paddleseg README.md * Update README.md	2023-03-17 11:22:46 +08:00
Jason	6343b0db47	[Build] Support build with source code of Paddle2ONNX (#1559 ) * Add notes for tensors * Optimize some apis * move some warnings * Support build with Paddle2ONNX * Add protobuf support * Fix compile on mac * add clearn package script * Add paddle2onnx code * remove submodule * Add onnx ocde * remove softlink * add onnx code * fix error * Add cmake file * fix patchelf * update paddle2onnx * Delete .gitmodules --------- Co-authored-by: PaddleCI <paddle_ci@example.com> Co-authored-by: pangyoki <pangyoki@126.com> Co-authored-by: jiangjiajun <jiangjiajun@baidu.lcom>	2023-03-17 10:03:22 +08:00
Zheng-Bicheng	d14db2629d	[Example] Move SOLOv2 jetson example -> cpp (#1600 ) * move solov2 * move solov2 --------- Co-authored-by: DefTruth <31974251+DefTruth@users.noreply.github.com>	2023-03-16 22:04:50 +08:00

687 Commits