FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2026-04-23 00:17:25 +08:00

freeliuzc f6c066fb9d Revert "[Optimization] Optimize ttft for prefill pd (#6680)" (#7386)

* Revert "[Optimization] Optimize ttft for prefill pd (#6680)"

This reverts commit 6727df8286.

* fix revert pr

2026-04-14 20:01:39 +08:00

__init__.py

…

dcu_model_runner.py

…

dcu_worker.py

…

eplb.py

…

experts_manager.py

…

gcu_model_runner.py

Split enable_mm (#7183) (#7233)

2026-04-08 16:32:04 +08:00

gcu_worker.py

…

gpu_model_runner.py

[Feature] Support set PREEMPTED_TOKEN_ID in GET_SAVE_OUTPUT_V1 (#7159) (#7351)

2026-04-13 15:24:01 +08:00

gpu_worker.py

[Cherry-Pick][FDConfig] Auto-scale CUDA Graph Capture & CLI Quantization Params + CUDAGraph Validation (#7215,#7281) (#7301)

2026-04-10 16:10:31 +08:00

hpu_model_runner.py

…

hpu_worker.py

…

iluvatar_model_runner.py

…

iluvatar_worker.py

Split enable_mm (#7183) (#7233)

2026-04-08 16:32:04 +08:00

input_batch.py

Split enable_mm (#7183) (#7233)

2026-04-08 16:32:04 +08:00

metax_model_runner.py

Split enable_mm (#7183) (#7233)

2026-04-08 16:32:04 +08:00

metax_worker.py

…

model_runner_base.py

…

output.py

…

tbo.py

…

worker_base.py

…

worker_process.py

Revert "[Optimization] Optimize ttft for prefill pd (#6680)" (#7386)

2026-04-14 20:01:39 +08:00

xpu_model_runner.py

Split enable_mm (#7183) (#7233)

2026-04-08 16:32:04 +08:00

xpu_worker.py

…