fxyfxy777
9f3b3ce7f5
[Optimization] merge_allreduce ( #7039 )
2026-04-02 19:52:13 +08:00
fxyfxy777
8eb177147c
[BugFix]rm draft code for glm ( #6810 )
...
* rm draft code for glm
* fix baseline
* fix baseline 2
2026-03-12 23:26:05 -07:00
fxyfxy777
250ce40b40
[Feature] use phi permute/unpermute & rm swiglu ( #6361 )
...
* tp: text output is normal
* B eb5 mini: text output is normal
* eb5mini ep on B card: text output is normal
* default use phi moe op
* stash
* tp on H card is normal
* ep ok
* rm debug
* rm debug tool
* rm del ffn_out
* rm swiglu
* add envs to swiglu
* merge dev
* fix ci baseline
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix ci baseline 2
---------
Co-authored-by: YuBaoku <49938469+EmmonsCurse@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-12 02:01:57 -07:00
RAM
cdaf6dd400
[RL][Cherry-Pick] Support Fully Async and PrefixCache ( #6599 )
...
* cherry-pick Support Fully Async and PrefixCache step 1
* copy routing_indices_cache.py from 2.4
* cherry-pick [RL] R3 Fix the bug for determining the end of a request (#6388 )
* cherry-pick [RL] Clear Requests status of R3 (#6569 )
* delete code
* fix rename bug
* fix status shape bug
* fix ci
2026-03-12 01:13:30 -07:00
chen
72fe94cb13
[Feature] support glm tp+dp+ep ( #6317 )
2026-02-05 21:47:01 +08:00
RAM
5b22e5dfe7
[RL] R3 Support Fused Put the Routing of All Layers ( #6099 )
...
* fused put routing
* fix bug
* [draft commit]dynamic dtype
* fix async put & numpy bug
* fix uint8 test case
2026-02-03 04:13:16 -08:00
GoldPancake
646aced1eb
[UT] Add GLM E2E tests for non-MTP and MTP ( #6163 )
...
* add glm ut
2026-01-23 10:34:29 +08:00
RAM
955785e2e0
[RL][R3] Fix typo ( #6046 )
...
* fix typo
2026-01-22 15:46:34 +08:00