memoryCoderC
be3be4913a
[Optimization] refactor(chat_handler,completion_handler): extract base classes and use AsyncLLM (#5195)
* [Optimization] refactor(chat_handler,completion_handler): extract base classes and use AsyncLLM
* [Optimization] refactor(chat_handler,completion_handler): rename class
2025-12-25 16:28:15 +08:00
SunLei
809c1ac7ec
feat: add post-processing step for pool_output (#4462)
* feat: add post-processing step for pool_output
* bugfix
* fix: test_serving_embedding
* fix test_request_to_batch_dicts
* fix: code style
2025-10-21 20:24:26 +08:00
SunLei
b4b579a7ed
Feature:Add support for Pooling Model Embedding and provide an OpenAI-compatible API. (#4344)
* feat: add OpenAIServing
* feat: add ZmqOpenAIServing & OpenAIServingEmbedding
* feat: Refine the basic ServingEngine class and introduce ServingContext
* fix: codestyle
* fix: request
* fix: pooling_params
* feat: _process_chat_template_kwargs
* feat: support batch request
* feat: pooling_params verify & default parameters
---------
Co-authored-by: sunlei1024 <sunlei1024@example.com>
2025-10-15 19:42:59 +08:00