FastDeploy/custom_ops/xpu_ops/test at 3b9d6c60d33efc09788d0bad0ea92fe94401a6ac - FastDeploy - 子说镜像小站

apps/FastDeploy

mirror of https://github.com/PaddlePaddle/FastDeploy.git synced 2026-04-23 00:17:25 +08:00

Jiajun Ji 29495b2cf1 [XPU] Unify Spec and non-spec branch.(#6947) (#7180)

* [XPU] cherry-pick PR-6947

* [XPU] use unified_update_model_status.

* refactor xpu_model_runner.

* refactor sampler.

* fix codestyle.

* Fix XPU speculative decoding: rename output tensors to cu_seqlens_q_output/batch_id_per_token_output, correct
  WRAPPER_CHECK_PTR types, and fix dynamic gather shape in verify_draft_tokens path.

* fix codestyle.

* replace output_padding_offset with is_speculative flag in gather_next_token.

* rename hiddden_states.

* unify cu_seqlens_q_output and batch_id_per_token_output init.

---------

Co-authored-by: cmcamdy <1027740945@qq.com>

2026-04-16 14:58:38 +08:00

test_adjust_batch_and_gather_next_token.py

[XPU] Unify Spec and non-spec branch.(#6947) (#7180)

2026-04-16 14:58:38 +08:00

test_adjust_batch_and_recover_batch_sequence.py

[XPU] add more type for recover batch sequence (#6142)

2026-01-23 15:16:05 +08:00

test_block_attn_prefix_cache.py

[XPU] Split the block_attn operator into smaller operators (#6798)

2026-04-16 14:28:40 +08:00

test_block_attn.py

[XPU] Split the block_attn operator into smaller operators (#6798)

2026-04-16 14:28:40 +08:00

test_draft_model_postprocess.py

…

test_draft_model_preprocess.py

[XPU] support kernel for mtp(base) (#4748)

2025-11-27 15:05:44 +08:00

test_draft_model_update.py

…

test_eagle_get_hidden_states.py

[XPU] support get hidden state for mix (#5513)

2025-12-12 10:31:20 +08:00

test_eagle_get_self_hidden_states.py

…

test_fused_noaux_tc.py

[XPU] support noaux_tc (#6326)

2026-02-05 12:04:16 +08:00

test_fused_rms_norm.py

…

test_get_infer_param.py

…

test_get_padding_offset.py

[XPU] Refactor get_padding_offset to single kernel. (#7029)

2026-04-13 11:04:50 +08:00

test_get_token_penalty_multi_scores.py

…

test_moe_ep_combine.py

…

test_moe_ep_dispatch.py

…

test_moe_expert_ffn.py

[XPU] refine moe_expert_ffn ut (#5743)

2025-12-25 10:35:24 +08:00

test_moe_redundant_topk_select.py

…

test_moe_topk_select.py

…

test_read_data_ipc.py

…

test_set_data_ipc.py

…

test_set_get_data_ipc.py

…

test_set_value_by_flags_and_idx.py

…

test_speculate_clear_accept_nums.py

…

test_speculate_get_logits.py

[XPU] add speculate_get_logits (#5497)

2025-12-12 15:38:30 +08:00

test_speculate_get_output_padding_offset.py

…

test_speculate_get_padding_offset.py

…

test_speculate_get_seq_lens_output.py

…

test_speculate_get_token_penalty_multi_scores.py

[XPU] Speculate Decoding + PD, benchmark fix (#6036)

2026-01-15 19:19:03 +08:00

test_speculate_limit_thinking_content_length.py

[XPU] Add speculate_limit_thinking_content_length Op. (#6627)

2026-03-11 17:30:17 +08:00

test_speculate_pre_process.py

[XPU] Refactor pre process (#6993)

2026-04-01 20:29:55 +08:00

test_speculate_rebuild_append_padding.py

…

test_speculate_schedule_cache.py

[XPU] rm stop nums (#6651)

2026-03-12 14:05:58 +08:00

test_speculate_set_stop_value_multi_seqs.py

…

test_speculate_set_value_by_flags.py

[Feature] GPU Memory Optimization and Retirement of V0 Scheduler (#6407)

2026-02-28 15:07:43 +08:00

test_speculate_step_system_cache.py

[XPU] add speculate_step_system_cache (#5397)

2025-12-09 14:40:11 +08:00

test_speculate_step.py

[XPU] support kernel for mtp(base) (#4748)

2025-11-27 15:05:44 +08:00

test_speculate_update_v3.py

[XPU] support kernel for mtp(base) (#4748)

2025-11-27 15:05:44 +08:00

test_speculate_verify.py

[XPU] modify speculate_verify (#5522)

2025-12-23 14:50:30 +08:00

test_step.py

…

test_stop_generation_multi_ends.py

…

test_token_repetition_penalty.py

…

test_unified_update_model_status.py

[XPU] Refactor pre process (#6993)

2026-04-01 20:29:55 +08:00

test_update_attn_mask_xpu.py

[XPU] Add update_attn_mask_offsets op for xpu. (#6556)

2026-03-03 18:00:05 +08:00

test_update_inputs.py

[XPU] use quant2d_per_token for weight quant int8 && fix some XPU Kernel check (#6869)

2026-03-17 19:44:48 +08:00

test_verify_draft_tokens.py

[XPU] add verify draft tokens (#6947)

2026-04-15 10:18:33 +08:00

test_weight_only_linear.py

…

test_weight_quantize_xpu.py

…