Commit Graph

113 Commits

Author SHA1 Message Date
zhouchong 6e16438a57 [Feature] implement log channel separation and request log level system (#7190)
* feat: implement log channel separation and request log level system

* fix: log system improvements based on review

* add request_id to error logs, use RequestLogLevel enum, and unify logger implementation from utils to logger module
2026-04-16 15:13:05 +08:00
luukunn 14d556692b [BugFix] fix tool call parser (#7369)
* fix tool call parser

* add unit test

* fix unit test

* add unit test
2026-04-15 16:21:46 +08:00
Echo-Nie 8819a039c9 [Others] Fix typo (#7280)
* typo

* typo

* typo

* typo
2026-04-14 17:28:22 +08:00
K11OntheBoat bb48bcbaa2 Split enable_mm (#7183)
Co-authored-by: liuruian <liuruian@MacBook-Pro.local>
2026-04-08 11:25:41 +08:00
luukunn 562fa31791 [BugFix]fix extract_tool_calls (#7154)
* fix extract_tool_calls
2026-04-02 21:18:37 +08:00
Yonghua Li 98f3fc9267 [RL] [KVCache] let cache transfer managers update key prefix after weight update and add unit tests (#7083)
* [test] add a few unit tests

* [feat] update key prefix when model weights are updated

* [test] try to fix test_worker_process
2026-04-02 19:58:41 +08:00
luukunn fa7a84926d [Optimization]Fix tool parser (#7079)
* fix tool parser
2026-04-01 21:20:34 +08:00
YuBaoku c6f0c5c3a6 [CI] Optimize test execution with single-GPU parallelism (#7085)
* [CI] Optimize test execution with single-GPU parallelism and log collection

* remove export CUDA_VISIBLE_DEVICES

* fix path error

* fix log_* path and debug

* [CI] Optimize test execution with single-GPU parallelism and log collection
2026-04-01 14:18:40 +08:00
luukunn 3651113ee5 [DataProcessor]Remove ENABLE_V1_DATA_PROCESSOR (#7052)
* remove ENABLE_V1_DATA_PROCESSOR

* fix unit test

* fix unit test
2026-04-01 09:53:41 +08:00
qwes5s5 ee2b965f5f adjust config info (#7054) 2026-03-31 21:26:05 +08:00
qwes5s5 daa95244f7 abort requests (#6992) 2026-03-31 11:02:26 +08:00
Yonghua Li 6d9739f360 [BugFix] fix speculative gauge metrics in multi api server (#7082) 2026-03-31 10:52:50 +08:00
jackyYang6 05f2d95729 [RL] Adapt async rollout checkpoint update flow (#7042)
* update checkpoint-transfer flow and control update_weights params

* test: add update_weights route validation
2026-03-30 19:19:34 +08:00
Yonghua Li a7f52c300d [Feature] support v1 update/clear api for RL (#6761)
* [Feature] support v1 update/clear api for RL

* [fix] fix execute_model and add sleep/wakeup api

* [fix] fix mtp and key_prefix

* [chore] move _update_key_prefix to resume method

* [fix] make the interface safe to call multiple times

* [fix] fix some tiny bugs

* [chore] make small changes against pr review

* [docs] add docs for weight update

* [test] add some tests and update docs

* [style] fix code style check

* [test] fix ci

* [fix] fix stale control responses when control method timed out

* [chore] remove unused code

* [chore] fix code style

* [chore] optimize tags and key_prefix

* [test] fix ci

* [chore] fix code style

* [test] fix ci

* [fix] fix ep control

* [fix] fix ep control for engine cache queue
2026-03-25 19:18:46 +08:00
luukunn f4a79d4c00 [Optimization]Unified data processing for online and offline (#6891)
* remove process_request

* fix chat

* fix unit test

* remove process response

* fix unit test

* fix offline decode

* Potential fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

* fix sampling_params

---------

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
2026-03-19 21:56:09 +08:00
luukunn c3d8db85c4 [Optimization] Update ZMQ server (#6735)
* add batch zmq send reaponse

* update

* Revert "update"

This reverts commit 0234a25b47.

* update

* remove lock

* fix unit test

* add unit test

* add unit test

* pre commit

* add unit test

* fix unit test

* add unit test

* fix worker>1

* update zmq_worker_pid

* fix unit test

* fix unit test

* fix unit test

* add unit test

* fix unit test

* fix first token time

* fix logprobs

* add unit test

* op

* remore debug log

---------

Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2026-03-19 21:53:16 +08:00
luukunn fe8d58a094 [Optimization]update request in tool parser&reasoning parser (#6858)
* update request in tool parser&reasoning parser
2026-03-17 11:51:12 +08:00
gongweibao a6351dea0b [BugFix][Optimization] Replace silent failures with catchable exceptions and informative error messages (#6533)
* init

* init

* fix format

* add

* add files

* add ut

* fix some

* add ut

* add more

* add

* fix pre-commit

* fix pre-commit

* fix cover

* skip long seq

* add

* add

* fix

* remove not need

* fix set attr

* fix comments

* fix comments

* fix failed tests

---------

Co-authored-by: gongweibao <gognweibao@baidu.com>
2026-03-16 21:32:43 +08:00
luukunn aac1484b0d [Feature]add arguments string in tool (#6704)
* add arguments string
2026-03-06 20:45:09 +08:00
luukunn caf73e8131 [Feature]add reasoning effort (#6656)
* add reasoning_effort

* fix log

* fix reasoning_effort

* add reasoning_effort level

* fix valid_parameters

* fix valid_parameters

* fix

* fix unit test

* add unit test

* add unit test
2026-03-06 14:16:02 +08:00
kesmeey 758770bc43 [CI] 【Hackathon 10th Spring No.28】功能模块 fastdeploy/entrypoints/engine_client.py 单测补充 (#6158)
* fix codestyle and update unit test coverage workflow

* fix test_engine_client.py: add main_process_metrics mock to prevent KeyError

* fix test_engine_client.py: comprehensive test improvements

* feat: enhance test_engine_client.py with comprehensive test improvements

* fix: resolve test failures in test_engine_client.py

* test: enhance EngineClient test coverage with comprehensive test suite

* test: add comprehensive EngineClient test suite (codestyle checked)
2026-03-02 14:29:23 +08:00
YuBaoku bb51829bd5 [CI] Fix tests and docs to resolve failure (#6572) 2026-03-01 12:33:01 +08:00
xunyoyo 12f754ef38 [CI] 【Hackathon 10th Spring No.42】功能模块 fastdeploy/entrypoints/openai/serving_chat.py单测补充 (#6112)
* test: expand OpenAI serving chat coverage

* Import RequestOutput in test_serving_chat.py

* Reorder import statements in test_serving_chat.py

* test: fix tool_calls finish_reason case

* test: refine serving_chat coverage

* test: format serving_chat tests

---------

Co-authored-by: CSWYF3634076 <wangyafeng@baidu.com>
2026-02-27 16:32:46 +08:00
YuBaoku 9d72332aca [CI] Optimize unittest and fix title format (#6464)
* [CI] Optimize unit test duration and fix PR title format
2026-02-11 20:48:56 +08:00
luukunn 765df94e6c [Optimization]update prompt & prompt_token_ids (#6334)
* fix prompt

* add unit test

* add unit test

* fix
2026-02-04 20:08:01 +08:00
xunyoyo 25656455ee [CI] 【Hackathon 10th Spring No.38】功能模块 fastdeploy/entrypoints/openai/serving_completion.py单测补充 (#6227)
* Add serving completion tests

* test: tighten serving completion coverage
2026-02-02 12:53:04 +08:00
xunyoyo 18ebce9dec [CI] 【Hackathon 10th Spring No.41】功能模块 fastdeploy/entrypoints/llm.py 单测补充 (#6108)
* Add LLM entrypoint tests for coverage

* test: streamline llm entrypoint coverage

* test: format llm tests
2026-01-30 12:58:10 +08:00
Yonghua Li bb76d3b6f0 [RL] [APIServer] add more status codes for update/clear api (#6141)
* [RL] add more status codes for update/clear api

* [feat] return json response

* [fix] fix ci
2026-01-22 17:26:18 +08:00
luukunn 6b968a76f1 【Optimization】update data_processor & add tool parser plugins (#6096)
* update data_processor

* fix unit test

* fix unit test

* add unit test

* add tool parser plugins

* fix tool call

* fix tool call

* fix tool call

* fix unit test

* fix unit test

* add unit test

* fix unit test

* fix unit test

* fix unit test
2026-01-22 17:17:32 +08:00
kxz2002 6e416c62dd [Optimization] The pre- and post-processing pipeline do not perform dict conversion (#5494)
* to_request_for_infer initial commit

* refact to from_chat_completion_request

* preprocess use request initial commit

* bugfix

* processors refact to using request

* bug fix

* refact Request from_generic_request

* post process initial commit

* bugfix

* postprocess second commit

* bugfix

* serving_embedding initial commit

* serving_reward initial commit

* bugfix

* replace function name

* async_llm initial commit

* offline initial commit and fix bug

* bugfix

* fix async_llm

* remove add speculate_metrics into data

* fix logprobs bug

* fix echo bug

* fix bug

* fix reasoning_max_tokens

* bugfix

* bugfix and modify unittest

* bugfix and modify unit test

* bugfix

* bugfix

* bugfix

* modify unittest

* fix error when reasong_content is none for text_processor

* remove some unnessary logic

* revert removed logic

* implement add and set method for RequestOutput and refact code

* modify unit test

* modify unit test

* union process_request and process_request_obj

* remove a unit test

* union process_response and process_response_obj

* support qwen3_vl_processor

* modify unittest and remove comments

* fix prompt_logprobs

* fix codestyle

* add v1

* v1

* fix unit test

* fix unit test

* fix pre-commit

* fix

* add process request

* add process request

* fix

* fix

* fix unit test

* fix unit test

* fix unit test

* fix unit test

* fix unit test

* remove file

* add unit test

* add unit test

* add unit test

* fix unit test

* fix unit test

* fix

* fix

---------

Co-authored-by: Jiaxin Sui <95567040+plusNew001@users.noreply.github.com>
Co-authored-by: luukunn <981429396@qq.com>
Co-authored-by: luukunn <83932082+luukunn@users.noreply.github.com>
Co-authored-by: Zhang Yulong <35552275+ZhangYulongg@users.noreply.github.com>
2026-01-22 00:50:52 +08:00
qwes5s5 b2a2e11551 [Feature] Support stopping the inference for the corresponding request in the online service after a disconnection request. (#5320)
* request disconnect

* request disconnect

* fix bug

* fix bug--amend

---------

Co-authored-by: root <root@yq01-sys-rpm26xc1knu.yq01.baidu.com>
2026-01-16 11:46:13 +08:00
Yonghua Li 60ee72f682 [BugFix] [MultiAPIServer] fix rdma script and port check for multi api server (#5935)
* [fix] fix rdma script and add more error log for multi api server

* [fix] log

* [fix] fix test_multi_api_server

* [fix] fix multi api server port check

---------

Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
2026-01-12 10:38:52 +08:00
essos 1d20957340 [CI]【Hackathon 9th Sprint No.50】NO.50 功能模块 fastdeploy/entrypoints/engine_client.py 单测补充 -part #5045 (#5807)
* update test code

* 减少 mock

* fix style

---------

Co-authored-by: CSWYF3634076 <wangyafeng@baidu.com>
2026-01-09 15:13:19 +08:00
kxz2002 cad2932990 [BugFix] Fix process_response_dict to support async in serving_completion (#5758)
* support process_response_dict async initial commit

* fixbug

* add unit test

* optimize
2025-12-26 17:40:58 +08:00
memoryCoderC be3be4913a [Optimization] refactor(chat_handler,completion_handler): extract base classes and use AsyncLLM (#5195)
* [Optimization] refactor(chat_handler,completion_handler): extract base classes and use AsyncLLM

* [Optimization] refactor(chat_handler,completion_handler): rename class
2025-12-25 16:28:15 +08:00
kesmeey f15edbb6ef [CI]【Hackathon 9th Sprint No.40】功能模块 fastdeploy/entrypoints/openai/api_server.py 单测补充 (#5567)
* Add tests for openai api_server coverage

* update

* Update tests for openai api_server

* fix bugs

* test: disable some api_server lifespan/controller tests for local env

* Format test_api_server with black

* update

* update

* test: narrow envs patch in api_server tests to avoid side effects

* fix: separate MagicMock creation to avoid missing req argument

* fix: patch TRACES_ENABLE env var in api_server tests

* fix: use os.environ patch for TRACES_ENABLE

* test: use fake fastdeploy.envs in api_server tests

* test: pass fake Request into chat/completion routes

* test: increase coverage for tracing and scheduler control

* fix: set dynamic_load_weight in tracing headers test

* ci: add retry and validation for FastDeploy.tar.gz download

* ci: fix indentation in _base_test.yml

* refactor: simplify test_api_server.py (807->480 lines, ~40% reduction)

* fix: restore missing args attributes (revision, etc.) in _build_args

* fix: patch sys.argv to prevent SystemExit: 2 in api_server tests

* improve coverage

* Remove docstring from test_api_server.py

Removed unnecessary docstring from test_api_server.py

---------

Co-authored-by: CSWYF3634076 <wangyafeng@baidu.com>
2025-12-23 18:06:43 +08:00
Yonghua Li 0c8c6369ed [Feature] [PD Disaggregation] simplify configuration for pd-disaggregated deployment, and refactor post-init and usage for all ports (#5415)
* [feat] simplify configuration for pd-disaggregated deployment, and refactor post-init and usage for all ports

* [fix] fix some bugs

* [fix] fix rdma port for cache manager/messager

* [fix] temporarily cancel port availability check to see if it can pass ci test

* [feat] simplify args for multi api server

* [fix] fix dp

* [fix] fix port for xpu

* [fix] add tests for ports post processing & fix ci

* [test] fix test_multi_api_server

* [fix] fix rdma_comm_ports args for multi_api_server

* [fix] fix test_common_engine

* [fix] fix test_cache_transfer_manager

* [chore] automatically setting FD_ENABLE_MULTI_API_SERVER

* [fix] avoid api server from creating engine_args twice

* [fix] fix test_run_batch

* [fix] fix test_metrics

* [fix] fix splitwise connector init

* [test] add test_rdma_transfer and test_expert_service

* [fix] fix code syntax

* [fix] fix test_rdma_transfer and build wheel with rdma script
2025-12-17 15:50:42 +08:00
GoldPancake 909059c60a [Feature] Support for request-level speculative decoding metrics monitoring. (#5518)
* support spec metrics monitor per request

* fix bug

* remove debug log

* fix ut bugs
2025-12-12 12:22:18 +08:00
qwes5s5 d79438bb86 add detoken switch (#5463)
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
2025-12-10 21:44:02 +08:00
luukunn fbc9bce1e9 [Feature]Optimization of Thinking Pattern Framework (#4302)
* add model status in vl

* add x1 parser

* add model_status

* fix parser

* fix parser

* fix parser

* fix parser

* Revert "fix parser"

This reverts commit 300f446d8a.

* fix parser

* fix

* fix

* fix

* fix

* fix parser

* fix unit test

* fix unit test

* add unit test

* fix

* fix

* add unit test

* fix unit test

* add unit test

* add unit test

* fix unit test

* fix unit test

* fix bug

* fix unit test

* x1 tool parser

* fix unit test

* fix unit test

* fix unit test

* fix n

* fix unit test

* add unit test

* add unit test

* remove pring
2025-12-10 16:17:06 +08:00
ming1753 9e15191cce [BugFix] fix audio end bug (#5464) 2025-12-10 13:37:26 +08:00
Echo-Nie 1b1bfab341 [CI] Add unittest (#5328)
* add test_worker_eplb

* remove tesnsor_wise_fp8

* add copyright
2025-12-09 19:19:42 +08:00
Juncai 80efe98f8d [PD Disaggregation] Add timestamp for analyzing splitwise deployment (#5317)
* Add timestamp for analyzing splitwise deployment

* up

* up

* up

* up

* up

* up

* fix format

* fix
2025-12-08 10:08:44 +08:00
qwes5s5 a52aea073c fix logprobs (#5335) 2025-12-04 10:38:51 +08:00
ming1753 5f8d4aedea [Feature] support audio tts (#5333) 2025-12-03 21:06:48 +08:00
qwes5s5 117980dd4e [LogProbs]Enable prompt logprobs output and modify data transmission method for the online interface. (#5089)
* add prompt logprobs

* Merge prompt_logprobs_tensors and prompt_logprobs

* fix param check

* trigger ci

* fix unitest

* fix logprobs bug
2025-12-02 13:49:51 +08:00
xiaolei373 84e2f6aa75 [CI]add clear to run-batch ci (#5307)
CE Compile Job / ce_job_pre_check (push) Has been cancelled
CE Compile Job / print_ce_job_pre_check_outputs (push) Has been cancelled
CE Compile Job / FD-Clone-Linux (push) Has been cancelled
CE Compile Job / Show Code Archive Output (push) Has been cancelled
CE Compile Job / BUILD_SM8090 (push) Has been cancelled
CE Compile Job / BUILD_SM8689 (push) Has been cancelled
CE Compile Job / CE_UPLOAD (push) Has been cancelled
Deploy GitHub Pages / deploy (push) Has been cancelled
Publish Job / publish_pre_check (push) Has been cancelled
Publish Job / print_publish_pre_check_outputs (push) Has been cancelled
Publish Job / FD-Clone-Linux (push) Has been cancelled
Publish Job / Show Code Archive Output (push) Has been cancelled
Publish Job / BUILD_SM8090 (push) Has been cancelled
Publish Job / BUILD_SM8689 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8090 (push) Has been cancelled
Publish Job / PADDLE_PYPI_UPLOAD_8689 (push) Has been cancelled
Publish Job / Run FD Image Build (push) Has been cancelled
Publish Job / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
Publish Job / Run FastDeploy LogProb Tests (push) Has been cancelled
Publish Job / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
Publish Job / Run Base Tests (push) Has been cancelled
Publish Job / Run Accuracy Tests (push) Has been cancelled
Publish Job / Run Stable Tests (push) Has been cancelled
CI Images Build / FD-Clone-Linux (push) Has been cancelled
CI Images Build / Show Code Archive Output (push) Has been cancelled
CI Images Build / CI Images Build (push) Has been cancelled
CI Images Build / BUILD_SM8090 (push) Has been cancelled
CI Images Build / Run FastDeploy Unit Tests and Coverage (push) Has been cancelled
CI Images Build / Run FastDeploy LogProb Tests (push) Has been cancelled
CI Images Build / Extracted partial CE model tasks to run in CI. (push) Has been cancelled
CI Images Build / Run Base Tests (push) Has been cancelled
CI Images Build / Publish Docker Images Pre Check (push) Has been cancelled
2025-12-01 21:18:19 +08:00
Yonghua Li a535050b11 [FDConfig] remove engine client args, use fd_config instead (#5217)
* [refactor] remove engine client args, use fd_config instead

* [chore] update

* [fix] fix

* [fix] fix

* [chore] rename config to fd_config

* [fix] fix run_batch

* [ci] add ci case for engine client

---------

Co-authored-by: Jiaxin Sui <95567040+plusNew001@users.noreply.github.com>
2025-11-28 01:20:54 -08:00
chen 35f85baf09 [BugFix]fix v1 loader lm head fp32 (#5270) 2025-11-27 20:12:56 +08:00
xiaolei373 b52ec268f7 [CI]fix run batch unit test (#4628) 2025-11-27 19:38:04 +08:00