[Docs][CI] Fix prebuilt wheel installation and update Docs (#7289)

* [CI] Fix prebuilt wheel installation and update Docs

* [CI] Update Dockerfile.gpu to restrict SM80/86/89/90, CUDA 12.6 and Python 3.10

* Update nvidia_gpu.md

* Update nvidia_gpu.md

* Revise NVIDIA GPU installation instructions

Updated installation instructions for PaddlePaddle and FastDeploy to remove specific CUDA version mentions and clarify support for multiple GPU architectures.

---------

Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
This commit is contained in:
YuBaoku
2026-04-10 10:31:12 +08:00
committed by GitHub
parent ee73623c76
commit b7b4fe6a69
4 changed files with 23 additions and 18 deletions
+1 -1
@@ -207,7 +207,7 @@ function copy_ops(){
} }
function extract_ops_from_precompiled_wheel() { function extract_ops_from_precompiled_wheel() {
local WHL_NAME="fastdeploy_gpu-0.0.0-py3-none-any.whl" local WHL_NAME="fastdeploy_gpu-0.0.0-cp310-cp310-manylinux_2_28_x86_64.whl"
if [ -z "$FD_COMMIT_ID" ]; then if [ -z "$FD_COMMIT_ID" ]; then
if git rev-parse HEAD >/dev/null 2>&1; then if git rev-parse HEAD >/dev/null 2>&1; then
FD_COMMIT_ID=$(git rev-parse HEAD) FD_COMMIT_ID=$(git rev-parse HEAD)
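The `WHL_NAME` change above replaces the generic `py3-none-any` tags with `cp310-cp310-manylinux_2_28_x86_64`, which ties the precompiled wheel to CPython 3.10 on x86_64 Linux. As a rough sketch of what those tags encode, the last three dash-separated fields of a wheel filename can be split off with plain parameter expansion. The `wheel_tags` helper is hypothetical and is not part of `build.sh`:

```shell
# Hypothetical helper for illustration; it is not part of build.sh.
# Prints the Python, ABI, and platform tags of a wheel filename.
wheel_tags() (
    name="${1%.whl}"          # drop the extension
    platform="${name##*-}"    # text after the last '-'
    name="${name%-*}"         # strip the platform tag
    abi="${name##*-}"
    name="${name%-*}"         # strip the ABI tag
    python="${name##*-}"
    printf '%s %s %s\n' "$python" "$abi" "$platform"
)

wheel_tags "fastdeploy_gpu-0.0.0-py3-none-any.whl"
wheel_tags "fastdeploy_gpu-0.0.0-cp310-cp310-manylinux_2_28_x86_64.whl"
```

Reading the tags from the right keeps the helper independent of the package name and version fields.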
+3 -3
@@ -1,6 +1,6 @@
FROM ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/fastdeploy-cuda-12.6:tag-base FROM ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/fastdeploy-cuda-12.6:tag-base
ARG PADDLE_VERSION=3.3.0 ARG PADDLE_VERSION=3.3.1
ARG FD_VERSION=2.4.0 ARG FD_VERSION=2.5.0
ENV DEBIAN_FRONTEND=noninteractive ENV DEBIAN_FRONTEND=noninteractive
@@ -16,7 +16,7 @@ RUN python -m pip uninstall paddlepaddle-gpu fastdeploy-gpu -y
RUN python -m pip install --no-cache-dir paddlepaddle-gpu==${PADDLE_VERSION} -i https://www.paddlepaddle.org.cn/packages/stable/cu126/ RUN python -m pip install --no-cache-dir paddlepaddle-gpu==${PADDLE_VERSION} -i https://www.paddlepaddle.org.cn/packages/stable/cu126/
# build and install FastDeploy # build and install FastDeploy
RUN python -m pip install --no-cache-dir fastdeploy-gpu==${FD_VERSION} -i https://www.paddlepaddle.org.cn/packages/stable/fastdeploy-gpu-80_90/ --extra-index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple RUN python -m pip install --no-cache-dir fastdeploy-gpu==${FD_VERSION} -i https://www.paddlepaddle.org.cn/packages/stable/cu126/ --extra-index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
ENV http_proxy="" ENV http_proxy=""
ENV https_proxy="" ENV https_proxy=""
+11 -9
@@ -15,7 +15,10 @@ The following installation methods are available when your environment meets the
**Notice**: The pre-built image supports SM 80/86/89/90 architecture GPUs (e.g. A800/H800/L20/L40/4090). **Notice**: The pre-built image supports SM 80/86/89/90 architecture GPUs (e.g. A800/H800/L20/L40/4090).
```shell ```shell
# CUDA 12.6
docker pull ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/fastdeploy-cuda-12.6:2.5.0 docker pull ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/fastdeploy-cuda-12.6:2.5.0
# CUDA 12.9
docker pull ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/fastdeploy-cuda-12.9:2.5.0
``` ```
## 2. Pre-built Pip Installation ## 2. Pre-built Pip Installation
@@ -23,13 +26,13 @@ docker pull ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/fastdeploy-cuda-12
First install paddlepaddle-gpu. For detailed instructions, refer to [PaddlePaddle Installation](https://www.paddlepaddle.org.cn/en/install/quick?docurl=/documentation/docs/en/develop/install/pip/linux-pip_en.html) First install paddlepaddle-gpu. For detailed instructions, refer to [PaddlePaddle Installation](https://www.paddlepaddle.org.cn/en/install/quick?docurl=/documentation/docs/en/develop/install/pip/linux-pip_en.html)
```shell ```shell
# Install stable release # Install stable release
# CUDA 12.6 # CUDA
python -m pip install paddlepaddle-gpu==3.3.1 -i https://www.paddlepaddle.org.cn/packages/stable/cu126/ python -m pip install paddlepaddle-gpu==3.3.1 -i https://www.paddlepaddle.org.cn/packages/stable/cu126/
# CUDA 12.9 # CUDA 12.9
python -m pip install paddlepaddle-gpu==3.3.1 -i https://www.paddlepaddle.org.cn/packages/stable/cu129/ python -m pip install paddlepaddle-gpu==3.3.1 -i https://www.paddlepaddle.org.cn/packages/stable/cu129/
# Install latest Nightly build # Install latest Nightly build
# CUDA 12.6 # CUDA
python -m pip install --pre paddlepaddle-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu126/ python -m pip install --pre paddlepaddle-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu126/
# CUDA 12.9 # CUDA 12.9
python -m pip install --pre paddlepaddle-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu129/ python -m pip install --pre paddlepaddle-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu129/
@@ -40,13 +43,13 @@ Then install fastdeploy. **Do not install from PyPI**. Use the following methods
**Note**: Stable FastDeploy release pairs with stable PaddlePaddle; Nightly Build FastDeploy pairs with Nightly Build PaddlePaddle. The `--extra-index-url` is only used for downloading fastdeploy-gpu's dependencies; fastdeploy-gpu itself must be installed from the Paddle source specified by `-i`. **Note**: Stable FastDeploy release pairs with stable PaddlePaddle; Nightly Build FastDeploy pairs with Nightly Build PaddlePaddle. The `--extra-index-url` is only used for downloading fastdeploy-gpu's dependencies; fastdeploy-gpu itself must be installed from the Paddle source specified by `-i`.
``` ```
# Install stable release FastDeploy # Install stable release FastDeploy
# CUDA 12.6 # CUDA
python -m pip install fastdeploy-gpu==2.5.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu126/ --extra-index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple python -m pip install fastdeploy-gpu==2.5.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu126/ --extra-index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
# CUDA 12.9 # CUDA 12.9
python -m pip install fastdeploy-gpu==2.5.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu129/ --extra-index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple python -m pip install fastdeploy-gpu==2.5.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu129/ --extra-index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
# Install Nightly Build FastDeploy # Install Nightly Build FastDeploy
# CUDA 12.6 # CUDA
python -m pip install fastdeploy-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu126/ --extra-index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple python -m pip install fastdeploy-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu126/ --extra-index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
# CUDA 12.9 # CUDA 12.9
python -m pip install fastdeploy-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu129/ --extra-index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple python -m pip install fastdeploy-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu129/ --extra-index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
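The install commands in this hunk differ only in the release channel (`stable` or `nightly`) and the CUDA tag (`cu126` or `cu129`) of the `-i` index URL. A minimal sketch of that pattern follows. The `paddle_index_url` helper is hypothetical, and the URL layout is inferred from the commands shown in this diff rather than from any documented scheme:

```shell
# Hypothetical helper; the URL layout is inferred from the commands in this diff.
paddle_index_url() (
    channel="$1"                                    # "stable" or "nightly"
    cuda_tag="cu$(printf '%s' "$2" | tr -d '.')"    # 12.6 -> cu126
    printf 'https://www.paddlepaddle.org.cn/packages/%s/%s/\n' "$channel" "$cuda_tag"
)

paddle_index_url stable 12.6
paddle_index_url nightly 12.9
```

The printed URL could then be passed to `python -m pip install` through `-i`, with `--extra-index-url` still used only for dependencies.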
@@ -54,7 +57,7 @@ python -m pip install fastdeploy-gpu -i https://www.paddlepaddle.org.cn/packages
## 3. Build from Source Using Docker ## 3. Build from Source Using Docker
- Note: ```dockerfiles/Dockerfile.gpu``` by default supports SM 80/90 architectures. To support other architectures, modify ```bash build.sh 1 python false [80,90]``` in the Dockerfile. It's recommended to specify no more than 2 architectures. > Note: `dockerfiles/Dockerfile.gpu` currently supports CUDA 12.6 only, targeting SM 80/86/89/90 architectures. To support other architectures, modify ```bash build.sh 1 python false [80,90]``` in the Dockerfile. It's recommended to specify no more than 2 architectures.
```shell ```shell
git clone https://github.com/PaddlePaddle/FastDeploy git clone https://github.com/PaddlePaddle/FastDeploy
@@ -84,7 +87,6 @@ The built packages will be in the ```FastDeploy/dist``` directory.
## 5. Precompiled Operator Wheel Packages ## 5. Precompiled Operator Wheel Packages
FastDeploy provides precompiled GPU operator wheel packages for quick setup without building the entire source code. FastDeploy provides precompiled GPU operator wheel packages for quick setup without building the entire source code.
This method currently supports **SM80/90 architecture (e.g., A100/H100)** and **CUDA 12.6** environments only.
> By default, `build.sh` compiles all custom operators from source. To use the precompiled package, enable it with the `FD_USE_PRECOMPILED` parameter. > By default, `build.sh` compiles all custom operators from source. To use the precompiled package, enable it with the `FD_USE_PRECOMPILED` parameter.
> If the precompiled package cannot be downloaded or does not match the current environment, the system will automatically fall back to `4. Build Wheel from Source`. > If the precompiled package cannot be downloaded or does not match the current environment, the system will automatically fall back to `4. Build Wheel from Source`.
@@ -113,7 +115,7 @@ cd FastDeploy
bash build.sh 1 python false [90] 1 bash build.sh 1 python false [90] 1
# Use precompiled wheel from a specific commit # Use precompiled wheel from a specific commit
bash build.sh 1 python false [90] 1 8a9e7b53af4a98583cab65e4b44e3265a93e56d2 bash build.sh 1 python false [90] 1 d693d4be1448d414097882386fdc24c8bec2a63a
``` ```
The downloaded wheel packages will be stored in the `FastDeploy/pre_wheel` directory. The downloaded wheel packages will be stored in the `FastDeploy/pre_wheel` directory.
@@ -122,9 +124,9 @@ After the build completes, the operator binaries can be found in `FastDeploy/fas
> **Notes:** > **Notes:**
> >
> - This mode prioritizes downloading precompiled GPU operator wheels to reduce build time. > - This mode prioritizes downloading precompiled GPU operator wheels to reduce build time.
> - Currently supports **GPU, SM80/90, CUDA 12.6** only. > - Supports **GPU, SM80/86/89/90**.
> - For custom architectures or modified operator logic, please use **source compilation (Section 4)**. > - For custom architectures or modified operator logic, please use **source compilation (Section 4)**.
> - You can check whether the precompiled wheel for a specific commit has been successfully built on the [FastDeploy CI Build Status Page](https://github.com/PaddlePaddle/FastDeploy/actions/workflows/ci_image_update.yml). > - You can check whether the precompiled wheel for a specific commit has been successfully built on the [FastDeploy CI Build Status Page](https://github.com/PaddlePaddle/FastDeploy/actions/workflows/ce_job.yml).
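Since the precompiled wheels target a fixed set of SM architectures, a pre-flight check can help decide between the precompiled path and a source build. The sketch below is illustrative only. `is_supported_sm` is a hypothetical helper, and it assumes that `nvidia-smi --query-gpu=compute_cap` is available (it requires a reasonably recent driver) and prints values such as `9.0`:

```shell
# Hypothetical pre-flight check; the supported list mirrors the notes above.
is_supported_sm() {
    case "$1" in
        80|86|89|90) return 0 ;;
        *) return 1 ;;
    esac
}

# Assumption: nvidia-smi supports the compute_cap query and prints e.g. "9.0";
# the dot is stripped to obtain "90". Only the first GPU is inspected.
if command -v nvidia-smi >/dev/null 2>&1; then
    sm="$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n 1 | tr -d '.')"
    if is_supported_sm "$sm"; then
        echo "SM${sm}: precompiled wheel may apply"
    else
        echo "SM${sm}: build from source (Section 4)"
    fi
fi
```

`build.sh` already falls back to a source build on a mismatch, so a check like this only saves the download attempt.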
## Environment Verification ## Environment Verification
@@ -14,10 +14,13 @@
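## 1. Pre-built Docker Installation (Recommended) ## 1. Pre-built Docker Installation (Recommended)
**Notice**: The pre-built image supports SM 80/86/89/90 architecture GPUs (e.g. A800/H800/L20/L40/4090). **Notice**: The pre-built image supports SM 80/86/89/90 architecture GPUs (e.g. A800/H800/L20/L40/4090) and Python 3.10 only.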
``` shell ``` shell
# CUDA 12.6
docker pull ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/fastdeploy-cuda-12.6:2.5.0 docker pull ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/fastdeploy-cuda-12.6:2.5.0
# CUDA 12.9
docker pull ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/fastdeploy-cuda-12.9:2.5.0
``` ```
## 2. Pre-built Pip Installation ## 2. Pre-built Pip Installation
@@ -57,7 +60,7 @@ python -m pip install fastdeploy-gpu -i https://www.paddlepaddle.org.cn/packages
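## 3. Build from Source Using Docker ## 3. Build from Source Using Docker
> Note: ```dockerfiles/Dockerfile.gpu``` builds for SM 80/90 architectures by default. To support other architectures, modify ```bash build.sh 1 python false [80,90]``` in the Dockerfile. It is recommended to specify no more than 2 architectures. > Note: The default build output of ```dockerfiles/Dockerfile.gpu``` supports SM 80/86/89/90 architectures only, is built on CUDA 12.6, and supports Python 3.10 only. To support other architectures, modify ```bash build.sh 1 python false [80,90]``` in the Dockerfile. It is recommended to specify no more than 2 architectures.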
``` ```
git clone https://github.com/PaddlePaddle/FastDeploy git clone https://github.com/PaddlePaddle/FastDeploy
@@ -91,7 +94,7 @@ bash build.sh 1 python false [80,90]
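## 5. Precompiled Operator Wheel Packages ## 5. Precompiled Operator Wheel Packages
FastDeploy provides precompiled GPU operator wheel packages for quick builds without compiling the full source. This method currently supports only **SM80/90 architectures (e.g. A100/H100)** and **CUDA 12.6** environments. FastDeploy provides precompiled GPU operator wheel packages for quick builds without compiling the full source. This method currently supports only **SM80/90 architectures (e.g. A100/H100)**, **CUDA 12.6**, and **Python 3.10** environments.
> By default, `build.sh` compiles from source. To use the precompiled package, set the `FD_USE_PRECOMPILED` parameter. > By default, `build.sh` compiles from source. To use the precompiled package, set the `FD_USE_PRECOMPILED` parameter.
> If the precompiled package fails to download or does not match the environment, the system automatically falls back to `4. Build Wheel from Source`. > If the precompiled package fails to download or does not match the environment, the system automatically falls back to `4. Build Wheel from Source`.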
@@ -119,7 +122,7 @@ cd FastDeploy
bash build.sh 1 python false [90] 1 bash build.sh 1 python false [90] 1
# Fetch the precompiled operators for a specific commit ID # Fetch the precompiled operators for a specific commit ID
bash build.sh 1 python false [90] 1 8a9e7b53af4a98583cab65e4b44e3265a93e56d2 bash build.sh 1 python false [90] 1 d693d4be1448d414097882386fdc24c8bec2a63a
``` ```
The downloaded whl packages are stored in the `FastDeploy/pre_wheel` directory. The downloaded whl packages are stored in the `FastDeploy/pre_wheel` directory.
@@ -128,7 +131,7 @@ bash build.sh 1 python false [90] 1 8a9e7b53af4a98583cab65e4b44e3265a93e56d2
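> **Notes:** > **Notes:**
> - This mode prioritizes downloading precompiled GPU operator whl packages to reduce build time. > - This mode prioritizes downloading precompiled GPU operator whl packages to reduce build time.
> - Currently supports **GPU SM80/90 architectures and CUDA 12.6** only. > - Currently supports **GPU SM80/90 architectures, CUDA 12.6, and Python 3.10** only.
> - For custom architectures or modified operator logic, use **source compilation (Section 4)**. > - For custom architectures or modified operator logic, use **source compilation (Section 4)**.
> - You can check on the FastDeploy CI build status page whether the precompiled whl for a given commit has been built successfully. > - You can check on the FastDeploy CI build status page whether the precompiled whl for a given commit has been built successfully.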