gongweibao
|
a6351dea0b
|
[BugFix][Optimization] Replace silent failures with catchable exceptions and informative error messages (#6533)
* init
* init
* fix format
* add
* add files
* add ut
* fix some
* add ut
* add more
* add
* fix pre-commit
* fix pre-commit
* fix cover
* skip long seq
* add
* add
* fix
* remove not need
* fix set attr
* fix comments
* fix comments
* fix failed tests
---------
Co-authored-by: gongweibao <gognweibao@baidu.com>
|
2026-03-16 21:32:43 +08:00 |
|
JYChen
|
c745a22420
|
[Feature] Support Ernie FP8 on sm100 (the fixed version) (#6304)
|
2026-02-03 17:47:38 +08:00 |
|
JYChen
|
6c685c9474
|
Revert "[Feature] Support Ernie FP8 on sm100 (#5593)" (#6275)
This reverts commit eb80724b71.
|
2026-01-30 11:22:01 +08:00 |
|
JYChen
|
eb80724b71
|
[Feature] Support Ernie FP8 on sm100 (#5593)
* temporarily usable Deepgemm version
* dense part: e8m0 OK
* version where the EB model runs end to end with E8M0
* code check
* support 21b-tp2, dev_paddle
* version with single-node 4.5T ep OK
* restore deleted code; single-node 4.5T ep (non-cudagraph)
* eb tp
* Support SM100 block-wise FP8 inference
* refine codes, support deepgemm on sm100
* add thirdparty PFCC/DeepGEMM
* fix ep decode
* use deepep ue8m0 to fix the accuracy issue
* fix FP8 TP accuracy
* upgrade Deepgemm and adapt the Hopper logic
* add ue8m0 kernel
* add ue8m0 kernel
* fix custom_ops/gpu_ops/cpp_extensions.cc
* eb output is normal
* eb5 text is right
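* accuracy looks consistent on visual inspection
* accuracy aligned in self-testing
* replace masked_per_token_quant; ep accuracy OK
* performance improved by about 30%
* ep runs for now, but with known issues
* self-test results consistent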
* rm test fun
* fix ep event
* update Deepgemm in the graph-optimization operators
* fix build
* temporarily work around the deepgemm CI build issue
* select the deepgemm version based on SM
* remove useless code
---------
Co-authored-by: ckl117 <ckl117@163.com>
Co-authored-by: K11OntheBoat <“ruianmaidanglao@163.com”>
Co-authored-by: fxyfxy777 <fxyfxy777@163.com>
|
2026-01-29 13:49:54 +08:00 |
|
Sunny-bot1
|
59d2edde29
|
[BugFix] Add support for weight shape constraints and group size selection in Machete (#4911)
|
2025-11-10 20:57:35 +08:00 |
|
Sunny-bot1
|
4ffe41a747
|
WINT4/WINT8 dense gemm default use Machete (#4451)
|
2025-10-23 17:57:59 +08:00 |
|
Sunny-bot1
|
b1a5b756a3
|
[Optimize] Support WINT8 and group scale for Machete (#3905)
|
2025-09-15 12:01:34 +08:00 |
|
Sunny-bot1
|
fe5d09f9ee
|
[FIX]Fix Machete compile via ENABLE_MACHETE (#3727)
* add ENABLE_MACHETE
* fix
* revert
* update
* pre_commit
* fix
* fix
---------
Co-authored-by: Ayakouji <yuhongh@qq.com>
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
Co-authored-by: aquagull <hongyuh@qq.com>
|
2025-08-30 17:50:17 +08:00 |
|
Sunny-bot1
|
479c8b85d3
|
[Optimize]support machete weight only gemm (#3561)
* support machete weight only gemm
* add generate
* update
* fix
* change file location
* add sm_version limit
* fix
* fix
* fix ci
* fix coverage
* fix xpu
|
2025-08-28 09:49:58 +08:00 |
|
Zero Rains
|
25698d56d1
|
polish code with new pre-commit rule (#2923)
|
2025-07-19 23:19:27 +08:00 |
|
Jiang-Jia-Jun
|
92c2cfa2e7
|
Sync v2.0 version of code to github repo
|
2025-06-29 23:29:37 +00:00 |
|