mirror of
https://github.com/PaddlePaddle/FastDeploy.git
synced 2026-04-23 00:17:25 +08:00
209e5cf7f4
* [CE]add 21b cpu cache ,glm mtp,glm for rl config
* [CE]add 21b tp2 yaml
* [CE]add 21b mooncake yaml
* add fastdeploy benchmark,paddletest-155
* [CE] adjust vl wint4 config
* [CE]add glm mtp with updatemodel config
* [CE]fix
* fix
* test
* test
* test
---------
Co-authored-by: xiegegege <>
6 lines
128 B
YAML
max_model_len: 131072
max_num_seqs: 256
tensor_parallel_size: 2
kvcache_storage_backend: "mooncake"
enable_output_caching: True