Commit Graph

15 Commits

Author SHA1 Message Date
RuohengMa 12c76f8137 [XPU] add speculate_get_logits (#5497)
* [XPU] add speculate_step_system_cache

* [XPU] add speculate_step_system_cache

* [XPU] add speculate_get_logits

* delete context

* add ptr check

---------

Co-authored-by: cmcamdy <1027740945@qq.com>
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
2025-12-12 15:38:30 +08:00
cmcamdy 3c1f7b85a4 [XPU] support get hidden state for mix (#5513)
* fix git hidden states

* fix code style

* fix code style
2025-12-12 10:31:20 +08:00
RuohengMa 8178e3fc6a [XPU] add speculate_step_system_cache (#5397)
* [XPU] add speculate_step_system_cache

* [XPU] add speculate_step_system_cache

---------

Co-authored-by: cmcamdy <1027740945@qq.com>
2025-12-09 14:40:11 +08:00
cmcamdy 5a67a6d960 [XPU] support kernel for mtp(base) (#4748)
* [XPU] support kernel for mtp(base)

* [XPU] support kernel for mtp(base)

* format

* format

* format

* fix gather next token

* fix step && add test

* fix

* mv pre/post process

* add adjust batch / gather next token for mtp

* fix code style

* fix mtp kenrel name

* fix mtp kernel test

* mv xpu pre/post process

* mv xpu pre/post process
2025-11-27 15:05:44 +08:00
Lucas da7863ae85 [XPU] fix text_image_gather_scatter when image_token_num == token_num && text_token_num == 1 (#4882) 2025-11-12 17:13:22 +08:00
ddchenhao66 bffa08b74b [XPU] fix thinking bug where output only contains reasoning_content (#4761)
Co-authored-by: ddchenhao66 <dhaochen163.com>
2025-11-04 14:32:35 +08:00
ddchenhao66 5443b2cffb [XPU] xpu support think length limit (#4539)
* [XPU] xpu support think length limit

* [XPU] xpu c++ code files format

---------

Co-authored-by: ddchenhao66 <dhaochen163.com>
2025-10-23 15:58:11 +08:00
zhupengyang 3a6883ac1a c++ code format (#4527) 2025-10-22 17:59:50 +08:00
Lucas 87179cb744 [XPU] support XPU VL model inference (#4030)
* [XPU] support XPU VL model inference

* fix image op import and device check

* rebase develop

* fix perf
2025-09-25 14:34:15 +08:00
co63oc 8466219ec8 fix typos (#3840)
* fix typos

* ci

---------

Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
2025-09-12 11:04:38 +08:00
co63oc d6369b4d51 fix typos (#3684) 2025-09-01 17:50:17 +08:00
lengxia 137e539456 [Feature][XPU] add custom kernels for mtp (#3537) 2025-08-25 10:14:17 +08:00
yinwei f2a528f9ae [XPU] Support kvblock centralized management (#3017) 2025-07-29 10:40:55 +08:00
周周周 1339e56282 [XPU] Remove padding_offsets from get_padding_offset.cu (#2911) 2025-07-18 14:16:44 +08:00
Jiang-Jia-Jun 92c2cfa2e7 Sync v2.0 version of code to github repo 2025-06-29 23:29:37 +00:00