Mirror of https://github.com/PaddlePaddle/FastDeploy.git (synced 2026-04-23 00:17:25 +08:00)
3c7ca62dc3eefc7e5d79865565934b37a4578498
FastDeploy/custom_ops/gpu_ops/w4afp8_gemm
lizexu123  f4902fe42d  [BugFix] fix wint2 (#6109)  2026-01-20 21:46:21 +08:00
    * fix * fix * fix
File                     Last commit  Date
kernel_traits.h          opt w4afp8 (#5853)  2026-01-07 12:22:35 +08:00
mainloop_fwd.h           opt w4afp8 (#5853)  2026-01-07 12:22:35 +08:00
utils.hpp                【New Feature】W4afp8 supports per group quantization (#4987)  2025-11-13 19:17:27 +08:00
w4afp8_gemm_kernel.hpp   [BugFix] fix wint2 (#6109)  2026-01-20 21:46:21 +08:00
w4afp8_gemm.cu           【Optim】Optimize grid dimensions using max_tokens_per_expert for MoE models (#6007)  2026-01-15 19:18:42 +08:00
w4afp8_gemm.h            【Optim】Optimize grid dimensions using max_tokens_per_expert for MoE models (#6007)  2026-01-15 19:18:42 +08:00
weight_kernel.hpp        【New Feature】W4afp8 supports per group quantization (#4987)  2025-11-13 19:17:27 +08:00
weight_scale_kernel.hpp  【New Feature】W4afp8 supports per group quantization (#4987)  2025-11-13 19:17:27 +08:00