# NVIDIA CUDA GPU Installation
The following installation methods are available when your environment meets these requirements:
- GPU Driver >= 535
- CUDA >= 12.3
- CUDNN >= 9.5
- Python >= 3.10
- Linux X86_64
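The interpreter and OS requirements above can be checked locally before installing anything. This is a minimal sketch using only the standard library; `meets_prereqs` is a hypothetical helper introduced here, and the GPU driver and CUDA versions still need to be checked manually (e.g. via `nvidia-smi` and `nvcc --version`).

```python
# Quick local check of the Python/OS side of the requirements above.
import platform
import sys

def meets_prereqs(py=sys.version_info, system=None, machine=None):
    """Return True if the Python/OS requirements listed above are satisfied."""
    system = system or platform.system()
    machine = machine or platform.machine()
    return py >= (3, 10) and system == "Linux" and machine == "x86_64"

if __name__ == "__main__":
    print("prerequisites ok" if meets_prereqs() else "prerequisites NOT met")
```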
## 1. Pre-built Docker Installation (Recommended)
Notice: The pre-built image supports SM 80/86/89/90 architecture GPUs (e.g. A800/H800/L20/L40/4090).
```shell
docker pull ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/fastdeploy-cuda-12.6:2.5.0
```
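Once pulled, the image can be started with GPU access. The command below is a sketch, not part of the official docs: `--gpus all` requires the NVIDIA Container Toolkit on the host, and the remaining flags (interactive shell, auto-remove) are illustrative placeholders to adjust for your deployment.

```shell
# Illustrative: start an interactive container from the pulled image
# with all host GPUs visible (requires the NVIDIA Container Toolkit).
docker run --gpus all -it --rm \
    ccr-2vdh3abv-pub.cnc.bj.baidubce.com/paddlepaddle/fastdeploy-cuda-12.6:2.5.0 /bin/bash
```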
## 2. Pre-built Pip Installation
First install paddlepaddle-gpu. For detailed instructions, refer to the PaddlePaddle Installation guide.

```shell
# Install stable release
# CUDA 12.6
python -m pip install paddlepaddle-gpu==3.3.1 -i https://www.paddlepaddle.org.cn/packages/stable/cu126/
# CUDA 12.9
python -m pip install paddlepaddle-gpu==3.3.1 -i https://www.paddlepaddle.org.cn/packages/stable/cu129/

# Install latest nightly build
# CUDA 12.6
python -m pip install --pre paddlepaddle-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu126/
# CUDA 12.9
python -m pip install --pre paddlepaddle-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu129/
```
Then install fastdeploy-gpu. Do not install it from PyPI; use the commands below instead (they support SM 80/86/89/90 GPU architectures).

Note: a stable FastDeploy release pairs with stable PaddlePaddle, and a nightly FastDeploy build pairs with nightly PaddlePaddle. The `--extra-index-url` is only used to download fastdeploy-gpu's dependencies; fastdeploy-gpu itself must be installed from the Paddle source specified by `-i`.
```shell
# Install stable release FastDeploy
# CUDA 12.6
python -m pip install fastdeploy-gpu==2.5.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu126/ --extra-index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
# CUDA 12.9
python -m pip install fastdeploy-gpu==2.5.0 -i https://www.paddlepaddle.org.cn/packages/stable/cu129/ --extra-index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple

# Install nightly build FastDeploy
# CUDA 12.6
python -m pip install fastdeploy-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu126/ --extra-index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
# CUDA 12.9
python -m pip install fastdeploy-gpu -i https://www.paddlepaddle.org.cn/packages/nightly/cu129/ --extra-index-url https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
```
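After installing, you can confirm which versions pip actually resolved. This is a minimal sketch using only the standard library; the distribution names are taken from the install commands above.

```python
# Report the installed versions of the two distributions installed above.
from importlib.metadata import PackageNotFoundError, version

def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None

for dist in ("paddlepaddle-gpu", "fastdeploy-gpu"):
    print(dist, "->", installed_version(dist) or "not installed")
```

If either line reports "not installed", re-run the corresponding pip command and check that the `-i` index URL matches your CUDA version.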
## 3. Build from Source Using Docker
- Note: `dockerfiles/Dockerfile.gpu` supports SM 80/90 architectures by default. To support other architectures, modify the `bash build.sh 1 python false [80,90]` line in the Dockerfile. It is recommended to specify no more than two architectures.
```shell
git clone https://github.com/PaddlePaddle/FastDeploy
cd FastDeploy
docker build -f dockerfiles/Dockerfile.gpu -t fastdeploy:gpu .
```
## 4. Build Wheel from Source
First install paddlepaddle-gpu. For detailed instructions, refer to the PaddlePaddle Installation guide.

```shell
python -m pip install paddlepaddle-gpu==3.3.1 -i https://www.paddlepaddle.org.cn/packages/stable/cu126/
```
```shell
git clone https://github.com/PaddlePaddle/FastDeploy
cd FastDeploy
# Argument 1: Whether to build a wheel package (1 = yes, 0 = compile only)
# Argument 2: Python interpreter path
# Argument 3: Whether to compile CPU inference operators
# Argument 4: Target GPU architectures
bash build.sh 1 python false [80,90]
```
The built wheel packages will be in the `FastDeploy/dist` directory.
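The freshly built wheel can then be installed locally. This is a sketch: the exact filename in `dist/` depends on your FastDeploy version and Python ABI, so the glob below is illustrative.

```shell
# Install the locally built wheel from the FastDeploy checkout directory;
# the exact filename varies with version and Python ABI.
python -m pip install dist/fastdeploy*.whl
```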
## 5. Precompiled Operator Wheel Packages
FastDeploy provides precompiled GPU operator wheel packages for quick setup without building the entire source tree. This method currently supports only SM 80/90 architectures (e.g., A100/H100) and CUDA 12.6 environments.
By default, `build.sh` compiles all custom operators from source. To use the precompiled packages, enable them with the `FD_USE_PRECOMPILED` parameter. If a precompiled package cannot be downloaded or does not match the current environment, the build automatically falls back to Section 4, Build Wheel from Source.
First, install paddlepaddle-gpu. For detailed instructions, please refer to the PaddlePaddle Installation Guide.
```shell
python -m pip install paddlepaddle-gpu==3.3.1 -i https://www.paddlepaddle.org.cn/packages/stable/cu126/
```
Then, clone the FastDeploy repository and build using the precompiled operator wheels:
```shell
git clone https://github.com/PaddlePaddle/FastDeploy
cd FastDeploy
# Argument 1: Whether to build a wheel package (1 = yes)
# Argument 2: Python interpreter path
# Argument 3: Whether to compile CPU inference operators (false = GPU only)
# Argument 4: Target GPU architectures (currently supports 80/90)
# Argument 5: Whether to use precompiled operators (1 = enable)
# Argument 6 (optional): Commit ID for the precompiled operators (defaults to the current commit ID)

# Use precompiled operators for an accelerated build
bash build.sh 1 python false [90] 1

# Use a precompiled wheel from a specific commit
bash build.sh 1 python false [90] 1 8a9e7b53af4a98583cab65e4b44e3265a93e56d2
```
The downloaded wheel packages are stored in the `FastDeploy/pre_wheel` directory.
After the build completes, the operator binaries can be found in `FastDeploy/fastdeploy/model_executor/ops/gpu`.
Notes:
- This mode prioritizes downloading precompiled GPU operator wheels to reduce build time.
- Currently supports GPU, SM80/90, CUDA 12.6 only.
- For custom architectures or modified operator logic, please use source compilation (Section 4).
- You can check whether the precompiled wheel for a specific commit has been successfully built on the FastDeploy CI Build Status Page.
## Environment Verification
After installation, verify the environment with this Python code:
```python
import paddle
from paddle.jit.marker import unified
# Verify GPU availability
paddle.utils.run_check()
```
If the above code executes successfully, the environment is ready.
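As an optional extension of the check above, the snippet below also reports how many CUDA devices Paddle can see. It is a sketch, not part of the official docs: it assumes paddlepaddle-gpu is installed and uses Paddle's `paddle.device.cuda.device_count()` API; the `summarize` helper is a hypothetical name introduced here.

```python
def summarize(gpu_count):
    """Turn a raw CUDA device count into a short status line."""
    if gpu_count:
        return f"{gpu_count} CUDA device(s) visible"
    return "no CUDA devices visible"

try:
    import paddle
    paddle.utils.run_check()
    print(summarize(paddle.device.cuda.device_count()))
except ImportError:
    # paddlepaddle-gpu was not installed in this environment
    print("paddle is not installed")
```

Zero visible devices usually points to a driver problem or a container started without GPU access.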