FastDeploy/benchmarks/yaml/GLM45-air-32k-bf16-mtp.yaml at 3c7ca62dc3eefc7e5d79865565934b37a4578498 - FastDeploy - 子说镜像小站

apps/FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2026-04-23 00:17:25 +08:00

Files

T

xiegegege 51c6fa8afc [CE]add 21b cpu cache ,glm mtp,glm for rl config (#6328 )

2026-02-03 20:10:47 +08:00

8 lines

173 B

YAML

Raw Blame History

 max_model_len: 32768
 max_num_seqs: 128
 tensor_parallel_size: 4
 graph_optimization_config:
   use_cudagraph: True
   draft_model_use_cudagraph: True
 load_choices: "default_v1"