Mirror of https://github.com/PaddlePaddle/FastDeploy.git (synced 2026-04-23 00:17:25 +08:00)
Remove load default_v1 since it is already the default (#4980)
@@ -15,8 +15,6 @@ export FD_MODEL_SOURCE=AISTUDIO # "AISTUDIO", "MODELSCOPE" or "HUGGINGFACE"
export FD_MODEL_CACHE=/ssd1/download_models
```
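- > ⭐ **Note**: Starred models can use **HuggingFace Torch weights** directly and support **FP8/WINT8/WINT4 dynamic quantization** and **BF16 precision** inference. Enable **`--load-choices "default_v1"`** for inference.
-
> Taking baidu/ERNIE-4.5-21B-A3B-PT as an example, the launch command is as follows: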
```
python -m fastdeploy.entrypoints.openai.api_server \
@@ -25,8 +23,7 @@ python -m fastdeploy.entrypoints.openai.api_server \
--metrics-port 8181 \
--engine-worker-queue-port 8182 \
--max-model-len 32768 \
- --max-num-seqs 32 \
- --load-choices "default_v1"
+ --max-num-seqs 32
```
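
As a rough illustration of the launch command after this change, the sketch below assembles the argument list in Python without `--load-choices`. Only the flags visible in the hunk above are included. The flags hidden between the two hunks are not reproduced, and `build_launch_args` and its `elided_args` parameter are hypothetical names introduced here for illustration, not part of FastDeploy.

```python
import shlex

# Flags and values copied from the visible part of the launch command.
# Flags elided between the diff hunks are intentionally not guessed here.
VISIBLE_FLAGS = [
    "--metrics-port", "8181",
    "--engine-worker-queue-port", "8182",
    "--max-model-len", "32768",
    "--max-num-seqs", "32",
]


def build_launch_args(elided_args=()):
    """Return the api_server argv list (hypothetical helper).

    `elided_args` is a placeholder for whatever flags the full README
    passes in the lines that this diff does not show.
    """
    return [
        "python", "-m", "fastdeploy.entrypoints.openai.api_server",
        *elided_args,
        *VISIBLE_FLAGS,
    ]


args = build_launch_args()
# No --load-choices flag: the commit title says default_v1 is already the default.
print(shlex.join(args))
```

The list form can be passed to `subprocess.run` as-is, which avoids shell line-continuation mistakes such as a trailing backslash on the last flag.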
## Text-only model list