[English](../benchmark.md)

# Benchmark

FastDeploy extends the [vLLM benchmark](https://github.com/vllm-project/vllm/blob/main/benchmarks/) scripts with additional statistics, so they can be used to benchmark FastDeploy with more detailed performance metrics.
## Test Dataset

The following dataset is derived from an open-source dataset (original data from [HuggingFace Datasets](https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/resolve/main/ShareGPT_V3_unfiltered_cleaned_split.json)).

| Dataset | Description |
| :----------------------------------------------------------- | :--------- |
| https://fastdeploy.bj.bcebos.com/eb_query/filtered_sharedgpt_2000_input_1136_output_200_fd.json | Open-source dataset |
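
The benchmark command in this document reads the dataset from a local path (`./filtered_sharedgpt_2000_input_1136_output_200_fd.json`), so the file has to be fetched first. The following is a minimal sketch of one way to do that with the Python standard library. The helper names `local_name` and `download` are hypothetical and are not part of FastDeploy.

```python
import os
import urllib.request
from urllib.parse import urlparse

# Dataset URL taken from the table above.
DATASET_URL = (
    "https://fastdeploy.bj.bcebos.com/eb_query/"
    "filtered_sharedgpt_2000_input_1136_output_200_fd.json"
)


def local_name(url: str) -> str:
    """Return the file-name component of a URL (hypothetical helper)."""
    return os.path.basename(urlparse(url).path)


def download(url: str, dest_dir: str = ".") -> str:
    """Download url into dest_dir unless the file already exists.

    Returns the local path. Hypothetical helper, not a FastDeploy API.
    """
    path = os.path.join(dest_dir, local_name(url))
    if not os.path.exists(path):
        urllib.request.urlretrieve(url, path)
    return path

# Example (performs a network request when the file is missing):
# download(DATASET_URL, "FastDeploy/benchmarks")
```

Skipping the download when the file already exists keeps repeated benchmark runs from fetching the dataset again.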
## Test Procedure

```
cd FastDeploy/benchmarks
python -m pip install -r requirements.txt

# Start the service
python -m fastdeploy.entrypoints.openai.api_server \
    --model baidu/ERNIE-4.5-0.3B-Base-Paddle \
    --port 8188 \
    --tensor-parallel-size 1 \
    --max-model-len 8192

# Benchmark the service (run from another terminal once the service is up)
python benchmark_serving.py \
    --backend openai-chat \
    --model baidu/ERNIE-4.5-0.3B-Base-Paddle \
    --endpoint /v1/chat/completions \
    --host 0.0.0.0 \
    --port 8188 \
    --dataset-name EBChat \
    --dataset-path ./filtered_sharedgpt_2000_input_1136_output_200_fd.json \
    --percentile-metrics ttft,tpot,itl,e2el,s_ttft,s_itl,s_e2el,s_decode,input_len,s_input_len,output_len \
    --metric-percentiles 80,95,99,99.9,99.95,99.99 \
    --num-prompts 1 \
    --max-concurrency 1 \
    --save-result
```
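
To make the `--percentile-metrics` and `--metric-percentiles` options more concrete, here is a rough sketch of how the four upstream latency metrics (`ttft`, `tpot`, `itl`, `e2el`) can be derived from per-token arrival times and then summarized as percentiles. It assumes the definitions commonly used by the upstream vLLM benchmark scripts. The `RequestTrace` type and both functions are hypothetical, this is not the code `benchmark_serving.py` runs, and the FastDeploy-specific `s_*` metrics are not covered.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RequestTrace:
    """Hypothetical record of one request: send time and token arrival times, in seconds."""
    start: float
    token_times: List[float]


def percentile(values: List[float], p: float) -> float:
    """Percentile with linear interpolation between the two closest ranks."""
    if not values:
        raise ValueError("percentile() needs at least one value")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * p / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def latency_metrics(trace: RequestTrace) -> Dict[str, object]:
    """Derive TTFT, TPOT, ITL and E2EL (seconds) for one request.

    Assumed definitions:
      ttft: time from sending the request to the first token
      e2el: time from sending the request to the last token
      itl:  gaps between consecutive tokens
      tpot: (e2el - ttft) / (number of output tokens - 1)
    """
    ttft = trace.token_times[0] - trace.start
    e2el = trace.token_times[-1] - trace.start
    itl = [b - a for a, b in zip(trace.token_times, trace.token_times[1:])]
    n_out = len(trace.token_times)
    tpot = (e2el - ttft) / (n_out - 1) if n_out > 1 else 0.0
    return {"ttft": ttft, "tpot": tpot, "itl": itl, "e2el": e2el}
```

For example, `latency_metrics(RequestTrace(0.0, [0.5, 0.75, 1.0, 1.5]))` gives a TTFT of 0.5 s and an E2EL of 1.5 s. Collecting one such value per request and passing the list to `percentile(values, 99)` yields the kind of tail figure that `--metric-percentiles 99` asks for.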