diff --git a/build.sh b/build.sh
index 8e830ba71c..0ff999fe5e 100644
--- a/build.sh
+++ b/build.sh
@@ -207,7 +207,7 @@ function copy_ops(){
 }
 
 function extract_ops_from_precompiled_wheel() {
-    local WHL_NAME="fastdeploy_gpu-0.0.0-py3-none-any.whl"
+    local WHL_NAME="fastdeploy_gpu-0.0.0-cp310-cp310-manylinux_2_28_x86_64.whl"
     if [ -z "$FD_COMMIT_ID" ]; then
         if git rev-parse HEAD >/dev/null 2>&1; then
             FD_COMMIT_ID=$(git rev-parse HEAD)
diff --git a/dockerfiles/Dockerfile.gpu b/dockerfiles/Dockerfile.gpu
index 5ce8b05b19..de9a6b8e41 100644
--- a/dockerfiles/Dockerfile.gpu
+++ b/dockerfiles/Dockerfile.gpu
@@ -1,6 +1,6 @@
 FROM ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/fastdeploy-cuda-12.6:tag-base
-ARG PADDLE_VERSION=3.3.0
-ARG FD_VERSION=2.4.0
+ARG PADDLE_VERSION=3.3.1
+ARG FD_VERSION=2.5.0
 
 ENV DEBIAN_FRONTEND=noninteractive
 
@@ -16,7 +16,7 @@ RUN python -m pip uninstall paddlepaddle-gpu fastdeploy-gpu -y
 RUN python -m pip install --no-cache-dir paddlepaddle-gpu==${PADDLE_VERSION} -i https://www.paddlepaddle.org.cn/packages/stable/cu126/
 
 # build and install FastDeploy
-RUN python -m pip install --no-cache-dir fastdeploy-gpu==${FD_VERSION} -i https://www.paddlepaddle.org.cn/packages/stable/fastdeploy-gpu-80_90/ --extra-index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
+RUN python -m pip install --no-cache-dir fastdeploy-gpu==${FD_VERSION} -i https://www.paddlepaddle.org.cn/packages/stable/cu126/ --extra-index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
 
 ENV http_proxy=""
 ENV https_proxy=""
diff --git a/docs/get_started/installation/nvidia_gpu.md b/docs/get_started/installation/nvidia_gpu.md
index cc7f8caffd..7dfb03b8a0 100644
--- a/docs/get_started/installation/nvidia_gpu.md
+++ b/docs/get_started/installation/nvidia_gpu.md
@@ -15,7 +15,10 @@ The following installation methods are available when your environment meets the
 **Notice**: The pre-built image supports SM 80/86/89/90 architecture GPUs (e.g. A800/H800/L20/L40/4090).

 ```shell
+# CUDA 12.6
 docker pull ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/fastdeploy-cuda-12.6:2.5.0
+# CUDA 12.9
+docker pull ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/fastdeploy-cuda-12.9:2.5.0
 ```
 
 ## 2. Pre-built Pip Installation
@@ -23,13 +26,13 @@ docker pull ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/fastdeploy-cuda-12
 First install paddlepaddle-gpu. For detailed instructions, refer to [PaddlePaddle Installation](https://www.paddlepaddle.org.cn/en/install/quick?docurl=/documentation/docs/en/develop/install/pip/linux-pip_en.html)
 ```shell
 # Install stable release
-# CUDA 12.6
+# CUDA
 python -m pip install paddlepaddle-gpu==3.3.1 -i https://www.paddlepaddle.org.cn/packages/stable/cu126/
 # CUDA 12.9
 python -m pip install paddlepaddle-gpu==3.3.1 -i https://www.paddlepaddle.org.cn/packages/stable/cu129/
 
 # Install latest Nightly build
-# CUDA 12.6
+# CUDA
 python -m pip install --pre paddlepaddle-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu126/
 # CUDA 12.9
 python -m pip install --pre paddlepaddle-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu129/
@@ -40,13 +43,13 @@ Then install fastdeploy. **Do not install from PyPI**. Use the following methods
 **Note**: Stable FastDeploy release pairs with stable PaddlePaddle; Nightly Build FastDeploy pairs with Nightly Build PaddlePaddle. The `--extra-index-url` is only used for downloading fastdeploy-gpu's dependencies; fastdeploy-gpu itself must be installed from the Paddle source specified by `-i`.
 ```
 # Install stable release FastDeploy
-# CUDA 12.6
+# CUDA
 python -m pip install fastdeploy-gpu==2.5.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu126/ --extra-index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
 # CUDA 12.9
 python -m pip install fastdeploy-gpu==2.5.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu129/ --extra-index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
 
 # Install Nightly Build FastDeploy
-# CUDA 12.6
+# CUDA
 python -m pip install fastdeploy-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu126/ --extra-index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
 # CUDA 12.9
 python -m pip install fastdeploy-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu129/ --extra-index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
@@ -54,7 +57,7 @@ python -m pip install fastdeploy-gpu -i https://www.paddlepaddle.org.cn/packages
 
 ## 3. Build from Source Using Docker
 
-- Note: ```dockerfiles/Dockerfile.gpu``` by default supports SM 80/90 architectures. To support other architectures, modify ```bash build.sh 1 python false [80,90]``` in the Dockerfile. It's recommended to specify no more than 2 architectures.
+> Note: `dockerfiles/Dockerfile.gpu` currently supports CUDA 12.6 only, targeting SM 80/86/89/90 architectures. To support other architectures, modify ```bash build.sh 1 python false [80,90]``` in the Dockerfile. It's recommended to specify no more than 2 architectures.
 
 ```shell
 git clone https://github.com/PaddlePaddle/FastDeploy
@@ -84,7 +87,6 @@ The built packages will be in the ```FastDeploy/dist``` directory.
 ## 5. Precompiled Operator Wheel Packages
 
 FastDeploy provides precompiled GPU operator wheel packages for quick setup without building the entire source code.
-This method currently supports **SM80/90 architecture (e.g., A100/H100)** and **CUDA 12.6** environments only.

 > By default, `build.sh` compiles all custom operators from source.To use the precompiled package, enable it with the `FD_USE_PRECOMPILED` parameter.
 > If the precompiled package cannot be downloaded or does not match the current environment, the system will automatically fall back to `4. Build Wheel from Source`.
@@ -113,7 +115,7 @@ cd FastDeploy
 bash build.sh 1 python false [90] 1
 
 # Use precompiled wheel from a specific commit
-bash build.sh 1 python false [90] 1 8a9e7b53af4a98583cab65e4b44e3265a93e56d2
+bash build.sh 1 python false [90] 1 d693d4be1448d414097882386fdc24c8bec2a63a
 ```
 
 The downloaded wheel packages will be stored in the `FastDeploy/pre_wheel` directory.
@@ -122,9 +124,9 @@ After the build completes, the operator binaries can be found in `FastDeploy/fas
 > **Notes:**
 >
 > - This mode prioritizes downloading precompiled GPU operator wheels to reduce build time.
-> - Currently supports **GPU, SM80/90, CUDA 12.6** only.
+> - Supports **GPU, SM80/86/89/90.
 > - For custom architectures or modified operator logic, please use **source compilation (Section 4)**.
-> - You can check whether the precompiled wheel for a specific commit has been successfully built on the [FastDeploy CI Build Status Page](https://github.com/PaddlePaddle/FastDeploy/actions/workflows/ci_image_update.yml).
+> - You can check whether the precompiled wheel for a specific commit has been successfully built on the [FastDeploy CI Build Status Page](https://github.com/PaddlePaddle/FastDeploy/actions/workflows/ce_job.yml).
 
 ## Environment Verification
 
diff --git a/docs/zh/get_started/installation/nvidia_gpu.md b/docs/zh/get_started/installation/nvidia_gpu.md
index 004216c613..732691ec23 100644
--- a/docs/zh/get_started/installation/nvidia_gpu.md
+++ b/docs/zh/get_started/installation/nvidia_gpu.md
@@ -14,10 +14,13 @@
 
 ## 1. 预编译Docker安装(推荐)
 
-**注意**: 预编译镜像支持 80/86/89/90 架构的GPU硬件 (如 A800/H800/L20/L40/4090)。
+**注意**: 预编译镜像支持 80/86/89/90 架构的GPU硬件 (如 A800/H800/L20/L40/4090) 且仅支持 Python 3.10。
 
 ``` shell
+# CUDA 12.6
 docker pull ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/fastdeploy-cuda-12.6:2.5.0
+# CUDA 12.9
+docker pull ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/fastdeploy-cuda-12.9:2.5.0
 ```
 
 ## 2. 预编译Pip安装
@@ -57,7 +60,7 @@ python -m pip install fastdeploy-gpu -i https://www.paddlepaddle.org.cn/packages
 
 ## 3. 镜像自行构建
 
-> 注意 ```dockerfiles/Dockerfile.gpu``` 默认编译的架构支持SM 80/90,如若需要支持其它架构,需自行修改Dockerfile中的 ```bash build.sh 1 python false [80,90]```,建议不超过2个架构。
+> 注意 ```dockerfiles/Dockerfile.gpu``` 默认编译产物仅支持 SM 80/86/89/90 架构,基于 CUDA 12.6 环境构建,且仅支持 Python 3.10,如若需要支持其它架构,需自行修改Dockerfile中的 ```bash build.sh 1 python false [80,90]```,建议不超过2个架构。
 
 ```
 git clone https://github.com/PaddlePaddle/FastDeploy
@@ -91,7 +94,7 @@ bash build.sh 1 python false [80,90]
 
 ## 5. 算子预编译 Wheel 包
 
-FastDeploy 提供了 GPU 算子预编译版 Wheel 包,可在无需完整源码编译的情况下快速构建。该方式当前仅支持 **SM80/90 架构(A100/H100等)** 和 **CUDA 12.6** 环境。
+FastDeploy 提供了 GPU 算子预编译版 Wheel 包,可在无需完整源码编译的情况下快速构建。该方式当前仅支持 **SM80/90 架构(A100/H100等)** **CUDA 12.6** 和 **Python 3.10** 环境。
 
 >默认情况下,`build.sh` 会从源码编译;若希望使用预编译包,可使用`FD_USE_PRECOMPILED` 参数;
 >若预编译包下载失败或与环境不匹配,系统会自动回退至 `4. wheel 包源码编译` 模式。
@@ -119,7 +122,7 @@ cd FastDeploy
 bash build.sh 1 python false [90] 1
 
 # 从指定 commitID 获取对应预编译算子
-bash build.sh 1 python false [90] 1 8a9e7b53af4a98583cab65e4b44e3265a93e56d2
+bash build.sh 1 python false [90] 1 d693d4be1448d414097882386fdc24c8bec2a63a
 ```
 
 下载的 whl 包在 `FastDeploy/pre_wheel`目录下。
@@ -128,7 +131,7 @@ bash build.sh 1 python false [90] 1 8a9e7b53af4a98583cab65e4b44e3265a93e56d2
 
 > **说明:**
 > - 该模式会优先下载预编译的 GPU 算子 whl 包,减少编译时间;
-> - 目前仅支持 **GPU, SM80/90 架构, CUDA 12.6**;
+> - 目前仅支持 **GPU, SM80/90 架构, CUDA 12.6, Python3.10**;
 > - 若希望自定义架构或修改算子逻辑,请使用 **源码编译方式(第4节)**。
 > - 您可以在 FastDeploy CI 构建状态页面查看对应 commit 的预编译 whl 是否已构建成功。