FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2026-05-08 16:32:41 +08:00

Files

ddchenhao66 b87384aa70 [XPU] xpu currently disable prefix cache for VL model (#4695)

Co-authored-by: ddchenhao66 <dhaochen163.com>

2025-10-31 10:36:39 +08:00

benchmarks

…

cache_manager

…

demo

…

distributed

…

engine

[XPU] xpu currently disable prefix cache for VL model (#4695)

2025-10-31 10:36:39 +08:00

entrypoints

…

input

[BugFix] fix offline llm chat "enable_thinking" is always "False" (#4686)

2025-10-30 19:45:41 +08:00

inter_communicator

…

logger

…

metrics

…

model_executor

[noauxtc_kernel] remove useless code (#4643)

2025-10-30 18:59:04 +08:00

multimodal

…

output

…

platforms

…

plugins

…

reasoning

…

scheduler

…

spec_decode

[Graph Optimization] Add the CUDAGraph usage switch for Draft Model (#4601)

2025-10-30 11:44:50 +08:00

splitwise

…

transformer_utils

…

worker

[Graph Optimization] Add the CUDAGraph usage switch for Draft Model (#4601)

2025-10-30 11:44:50 +08:00

__init__.py

…

collect_env.py

…

config.py

[Graph Optimization] Add the CUDAGraph usage switch for Draft Model (#4601)

2025-10-30 11:44:50 +08:00

envs.py

…

import_ops.py

…

stop.sh

…

test.yaml

…

utils.py

…