# Benchmark
FastDeploy extends the vLLM benchmark script with additional metrics to enable more detailed performance benchmarking.
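The metrics reported per request are streaming-latency statistics. As a minimal sketch of how the standard ones (TTFT, TPOT, ITL, E2EL) are derived from token timestamps — the function and variable names here are illustrative, not FastDeploy's actual code:

```python
# Sketch: map one request's streamed-token timestamps to latency metrics.
def request_metrics(send_time, token_times):
    """token_times: absolute timestamps at which each output token arrived."""
    ttft = token_times[0] - send_time    # time to first token
    e2el = token_times[-1] - send_time   # end-to-end latency
    # inter-token latencies: gap between consecutive tokens
    itl = [b - a for a, b in zip(token_times, token_times[1:])]
    # time per output token over the decode phase (first token excluded)
    tpot = (e2el - ttft) / (len(token_times) - 1) if len(token_times) > 1 else 0.0
    return {"ttft": ttft, "tpot": tpot, "itl": itl, "e2el": e2el}

m = request_metrics(0.0, [0.5, 0.6, 0.7, 0.8])
```

The benchmark aggregates these per-request values into the percentiles requested on the command line.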
## Benchmark Dataset

The benchmark uses the following dataset, derived from open-source data (originally hosted on HuggingFace Datasets):
| Dataset | Description |
|---|---|
| [filtered_sharedgpt_2000_input_1136_output_200_fd.json](https://fastdeploy.bj.bcebos.com/eb_query/filtered_sharedgpt_2000_input_1136_output_200_fd.json) | Open-source dataset |
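Before running the benchmark, fetch the dataset into `FastDeploy/benchmarks` so the relative `--dataset-path` used below resolves (any HTTP client works; `wget` is shown as one option):

```shell
# Download the benchmark dataset into the current directory
# (run from FastDeploy/benchmarks).
wget https://fastdeploy.bj.bcebos.com/eb_query/filtered_sharedgpt_2000_input_1136_output_200_fd.json
```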
## How to Run
```bash
cd FastDeploy/benchmarks
python -m pip install -r requirements.txt

# Start service
python -m fastdeploy.entrypoints.openai.api_server \
    --model baidu/ERNIE-4.5-0.3B-Base-Paddle \
    --port 8188 \
    --tensor-parallel-size 1 \
    --max-model-len 8192
```
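Before starting the benchmark, it can help to confirm the server answers on the OpenAI-compatible endpoint. A small sketch using only the standard library (host, port, and payload shape mirror the flags above and the OpenAI chat-completions API):

```python
import json
from urllib import request

# Request payload in OpenAI chat-completions format; model name matches
# the one passed to the server above.
payload = {
    "model": "baidu/ERNIE-4.5-0.3B-Base-Paddle",
    "messages": [{"role": "user", "content": "hi"}],
    "max_tokens": 8,
}
body = json.dumps(payload).encode()

def smoke_test(url="http://0.0.0.0:8188/v1/chat/completions"):
    """POST one tiny chat request and return the decoded JSON response."""
    req = request.Request(url, data=body,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req, timeout=10) as resp:
        return json.load(resp)
```

Call `smoke_test()` once the server log shows it is ready; a JSON response with a `choices` field indicates the endpoint is serving.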
```bash
# Run benchmark
python benchmark_serving.py \
    --backend openai-chat \
    --model baidu/ERNIE-4.5-0.3B-Base-Paddle \
    --endpoint /v1/chat/completions \
    --host 0.0.0.0 \
    --port 8188 \
    --dataset-name EBChat \
    --dataset-path ./filtered_sharedgpt_2000_input_1136_output_200_fd.json \
    --percentile-metrics ttft,tpot,itl,e2el,s_ttft,s_itl,s_e2el,s_decode,input_len,s_input_len,output_len \
    --metric-percentiles 80,95,99,99.9,99.95,99.99 \
    --num-prompts 1 \
    --max-concurrency 1 \
    --save-result
```