mirror of
https://github.com/PaddlePaddle/FastDeploy.git
synced 2026-04-25 09:57:51 +08:00
Modified to support custom all reduce by default (#3538)
This commit is contained in:
@@ -87,8 +87,7 @@ Add the following lines to the startup parameters
|
||||
```
|
||||
Notes:
|
||||
1. Usually, no additional parameters need to be set, but CUDAGraph will generate some additional memory overhead, which may need to be adjusted in some scenarios with limited memory. For detailed parameter adjustments, please refer to [GraphOptimizationBackend](../features/graph_optimization.md) for related configuration parameter descriptions
|
||||
2. When CUDAGraph is enabled, if running with multi-GPUs TP>1, `--enable-custom-all-reduce` must be specified at the same time.
|
||||
3. When CUDAGraph is enabled, the scenario of `max-model-len > 32768` is not currently supported.
|
||||
2. When CUDAGraph is enabled, the scenario of `max-model-len > 32768` is not currently supported.
|
||||
|
||||
#### 2.2.6 Rejection Sampling
|
||||
**Idea:**
|
||||
|
||||
Reference in New Issue
Block a user