Commit Graph

28 Commits

Author SHA1 Message Date
Zhang Yulong 2b10ebc1f1 [benchmark] Refactor debug logging and payload handling (#6949)
* Refactor debug logging and payload handling

* Update backend_request_func.py
2026-03-20 15:04:10 +08:00
Zhang Yulong 3a4e139f65 [Benchmark] fix multi turn (#6948) 2026-03-20 13:22:30 +08:00
Zhang Yulong 051bbbeead [Benchmark] Update backend_request_func.py (#6575) 2026-02-28 19:51:55 +08:00
Zhang Yulong ce8123cb7f [Benchmark] Update backend_request_func.py (#6566) 2026-02-28 14:54:30 +08:00
Zhang Yulong ff20a3cc02 [benchmark] update tool call (#6519) 2026-02-26 17:06:54 +08:00
Zhang Yulong 02c61f8346 [Benchmark] Update backend_request_func.py (#6441) 2026-02-10 19:58:50 +08:00
Zhang Yulong 66c9e11998 [benchmark] update tools (#6437) 2026-02-10 17:48:55 +08:00
Zhang Yulong 16d03c3127 update (#6335) 2026-02-03 21:59:32 +08:00
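ophilia-lee 1705d0af7a [benchmark] Support retrieving cached tokens from SGLang/VLLM (#6240)
* Benchmark tool: support specifying response_format for constrained decoding scenarios

* Update backend_request_func.py

Make the output.success check handle the case where the reply content is empty because overly long reasoning content was truncated

* Update benchmark_serving.py

Update benchmark_metrics

* Support the Completions API

* Support the Completions API

* Support the Completions API

* [Benchmark] Support the Completions API

* [Benchmark] Support the Completions API

* [Benchmark] async_request_eb_openai_completions: increase the aiohttp default read buffer size to 4M to fix the "Chunk too big" error raised when a streaming response chunk is too large

* [Benchmark] Increase the aiohttp default read buffer size to 10M to fix the "Chunk too big" error raised when a streaming response chunk is too large

* [Benchmark] Support retrieving vLLM/SGLang cached_tokens

[Benchmark] Support retrieving vLLM/SGLang cached_tokens

* [benchmark] Support retrieving cached tokens from SGLang/VLLM

[benchmark] Support retrieving cached tokens from SGLang/VLLM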

---------

Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
2026-01-27 14:57:20 +08:00
jc e911ac2ce7 [BugFix] Refine the preparation of cpu and storage cache (#5777)
* Refine the preparation of cpu and storage cache

* fix error

* fix error

* up

* fix

* up docs

* fix unittest

* remove debug info
2026-01-05 10:13:30 +08:00
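ophilia-lee d5f5dc4f6e [Benchmark] Increase the aiohttp default read buffer size to 10M to fix the "Chunk too big" error raised when a streaming response chunk is too large (#5771)
* Benchmark tool: support specifying response_format for constrained decoding scenarios

* Update backend_request_func.py

Make the output.success check handle the case where the reply content is empty because overly long reasoning content was truncated

* Update benchmark_serving.py

Update benchmark_metrics

* Support the Completions API

* Support the Completions API

* Support the Completions API

* [Benchmark] Support the Completions API

* [Benchmark] Support the Completions API

* [Benchmark] async_request_eb_openai_completions: increase the aiohttp default read buffer size to 4M to fix the "Chunk too big" error raised when a streaming response chunk is too large

* [Benchmark] Increase the aiohttp default read buffer size to 10M to fix the "Chunk too big" error raised when a streaming response chunk is too large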

---------

Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
2025-12-25 19:36:11 +08:00
Juncai 412867fd99 [Feature] Support KV Cache Storage (#5571)
* Support Mooncake Store

* up

* up

* add op

* fix conflict

* fix error

* up for comments

* avoid thread lock

* up

* fix unittest

* fix unittest

* remove debug info

* consider tp_size > 1

* add default rdma_nics

* add utils

* up

* fix error

---------

Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
2025-12-25 16:30:35 +08:00
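ophilia-lee 99258e19c8 [Benchmark] Support the Completions API (#5700)
* Benchmark tool: support specifying response_format for constrained decoding scenarios

* Update backend_request_func.py

Make the output.success check handle the case where the reply content is empty because overly long reasoning content was truncated

* Update benchmark_serving.py

Update benchmark_metrics

* Support the Completions API

* Support the Completions API

* Support the Completions API

* [Benchmark] Support the Completions API

* [Benchmark] Support the Completions API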

---------

Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
2025-12-23 19:46:23 +08:00
Zhang Yulong 48f3e9797e Update backend_request_func.py (#5633)
2025-12-18 16:21:34 +08:00
Zhang Yulong c89a62e550 Update backend_request_func.py (#5631) 2025-12-18 14:20:17 +08:00
Zhang Yulong f45c131ddf update (#5625)
2025-12-17 21:38:14 +08:00
Zhang Yulong 510b82173a [Benchmark] Update benchmark (#5496)
* update benchmark

* update benchmark
2025-12-11 11:53:12 +08:00
Zhang Yulong 5b49142988 update (#5298) 2025-11-28 18:29:16 +08:00
Juncai 08ca0f6aea [Feature] [PD] add simple router and refine splitwise deployment (#4709)
* add simple router and refine splitwise deployment

* fix
2025-11-06 14:56:02 +08:00
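ophilia-lee 412097c1b8 Benchmark tool: support specifying response_format for constrained decoding scenarios (#4718) 2025-10-31 12:26:24 +08:00
ophilia-lee 70aa7423f8 Adapt the benchmark tool to the SGLang framework (#4607)
* Adapt the benchmark tool to the SGLang framework

* Adapt the benchmark tool to the SGLang framework

* Adapt the benchmark tool to the SGLang framework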
2025-10-27 18:52:56 +08:00
Zhang Yulong 10e85daf15 update benchmark scripts (#4497) 2025-10-20 17:03:10 +08:00
Zhang Yulong 8f77adc381 Add data dictionary for API response processing (#4454)
Initialize data dictionary for response handling.
2025-10-16 17:23:11 +08:00
Zhang Yulong c4f866c457 update benchmark tools (#4416) 2025-10-15 11:15:25 +08:00
Zhang Yulong 5151bc92c8 Update benchmark tools (#3004)
* update benchmark tools

* update benchmark tools
2025-07-24 15:19:23 +08:00
Zero Rains 25698d56d1 polish code with new pre-commit rule (#2923) 2025-07-19 23:19:27 +08:00
lijingning 9d6a42b334 Adapt to vLLM not providing arrival_time; adapt to vLLM requiring the model field; add a test case number field "no" to RequestFuncInput/RequestFuncOutput/SampleRequest 2025-07-15 19:31:27 +08:00
Jiang-Jia-Jun 92c2cfa2e7 Sync v2.0 version of code to github repo 2025-06-29 23:29:37 +00:00