[PD Disaggregation] Update usage of pd disaggregation and data parallel (#5742)

* Update usage of pd disaggregation

* up dp docs

* fix unittest
jc
2026-01-05 17:51:29 +08:00
committed by GitHub
parent 690d4bcdb0
commit 8d384f9fd8
15 changed files with 441 additions and 385 deletions
# Run the Examples on NVIDIA CUDA GPU
## Prepare the Environment
Refer to [NVIDIA CUDA GPU Installation](https://paddlepaddle.github.io/FastDeploy/get_started/installation/nvidia_gpu/) to pull the docker image, such as:
```
docker pull ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/fastdeploy-cuda-12.6:2.3.0
```
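A typical way to start a container from that image is sketched below; the mount paths are placeholders, and the `--device /dev/infiniband` line assumes the host has RDMA (MLNX_OFED) devices, which PD disaggregation uses for KV cache transfer:

```
# Sketch: start the container with GPUs, host networking, and RDMA devices.
# Adjust the mounts, shm size, and image tag to your environment.
docker run -it --gpus all --net host --shm-size 64g \
    --device /dev/infiniband \
    -v "$PWD":/workspace -w /workspace \
    ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/fastdeploy-cuda-12.6:2.3.0 \
    /bin/bash
```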
In the docker container, [NVIDIA MLNX_OFED](https://network.nvidia.com/products/infiniband-drivers/linux/mlnx_ofed/) and [Redis](https://redis.io/) are pre-installed.

For PD disaggregated deployment, refer to the [usage documentation](../../docs/zh/features/disaggregated.md). Using the Router for request scheduling (i.e., the v1 mode) is recommended.
## Build and install FastDeploy
```
git clone https://github.com/PaddlePaddle/FastDeploy
cd FastDeploy
export ENABLE_FD_RDMA=1
# Argument 1: Whether to build wheel package (1 for yes, 0 for compile only)
# Argument 2: Python interpreter path
# Argument 3: Whether to compile CPU inference operators
# Argument 4: Target GPU architectures
bash build.sh 1 python false [80,90]
```
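After `build.sh` finishes, install the resulting wheel and verify the import; the `dist/` output path and wheel name pattern are assumptions and may differ across versions:

```
# Install the freshly built wheel (path/name pattern is an assumption).
pip install dist/fastdeploy*.whl

# Quick sanity check that the package imports in the current interpreter.
python -c "import fastdeploy; print(fastdeploy.__version__)"
```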
## Run the Examples
Run the shell scripts in this directory, e.g. `bash start_v0_tp1.sh` or `bash start_v1_tp1.sh`.
Note that there are two methods for splitwise deployment:
* v0: using splitwise_scheduler or dp_scheduler, in which the requests are scheduled in the engine.
* v1: using router, in which the requests are scheduled in the router.
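The v1 flow can be sketched roughly as below. This is a hypothetical illustration only: the module paths, flag names, and ports are assumptions, and the real commands live in the `start_v1_*.sh` scripts in this directory:

```
# HYPOTHETICAL sketch of a v1 (router-scheduled) splitwise deployment.
# All module paths and flags below are illustrative; see start_v1_tp1.sh
# for the actual commands.

# 1. Start a prefill (P) instance and a decode (D) instance on separate ports.
python -m fastdeploy.entrypoints.openai.api_server \
    --splitwise-role prefill --port 8100 &
python -m fastdeploy.entrypoints.openai.api_server \
    --splitwise-role decode --port 8200 &

# 2. Start the router; clients send requests to it, and it schedules each
#    request across the P and D instances (scheduling happens in the router,
#    not in the engine as in v0).
python -m fastdeploy.router.launch --port 8000 \
    --prefill-endpoints 127.0.0.1:8100 --decode-endpoints 127.0.0.1:8200
```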
The provided v1 scripts differ in the parallelism of the P and D instances:
* `start_v1_tp1.sh`: Router scheduling; the P and D instances use TP1.
* `start_v1_tp2.sh`: Router scheduling; the P and D instances use TP2.
* `start_v1_dp2.sh`: Router scheduling; the P and D instances use DP2 TP1.
# Run the Examples on Kunlunxin XPU
Coming soon...