FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2026-05-10 09:31:48 +08:00

Files

freeliuzc 15f5112ecb [Speculative Decoding]Support different inferseed in speculate decoding (#5568)

* fix mtp entropy drop in RL

* optimize usage and fix unit test

* optimize padding_sampling_params speed(vectorized)

---------

Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>

2025-12-17 16:14:29 +08:00

test_activation.py

…

test_append_attention_with_output.py

…

test_append_attention.py

…

test_attention_layer.py

[Others] Clean code && remove GPU sync code (#5548)

2025-12-16 21:09:37 +08:00

test_ep_moe_expert_dispatch_fp8.py

[UNITEST] add test (#5305)

2025-12-02 17:59:01 +08:00

test_ffn.py

[Models] Add forward_meta to moe models' forward function (#5138)

2025-12-04 13:26:58 +08:00

test_fusedmoe.py

[New][RL] Support Rollout Routing Replay (#5405)

2025-12-05 22:06:26 +08:00

test_guided_decoding.py

[CI] Add unittest (#5328)

2025-12-09 19:19:42 +08:00

test_min_sampling.py

…

test_moba_attention_backend.py

…

test_native_paddle_backend.py

…

test_plas_attention.py

…

test_quantized_linear.py

…

test_repetition_early_stopper.py

…

test_sampler.py

…

test_speculative_sampler.py

[Speculative Decoding]Support different inferseed in speculate decoding (#5568)

2025-12-17 16:14:29 +08:00

test_w4a8_moe.py

[CI] Add unittest (#5328)

2025-12-09 19:19:42 +08:00

test_w4afp8_moe.py

[New][RL] Support Rollout Routing Replay (#5405)

2025-12-05 22:06:26 +08:00