[Feature] support w4afp8 v1_loader and v0_loader(tp>1) (#5757)

* support
* fix
* support w4afp8 v1_loader and v0_loader
* fix
* fix test
* fix test
* fix test
* fix moe.py
* add test_ernie_4_5_w4afp8
* add test
* delete tensor
* fix test
* fix
* add
* fix test
2026-04-24 01:29:57 +08:00 · 2025-12-30 14:11:52 +08:00
parent e78e22ebd5
commit 44a13e4557
7 changed files with 615 additions and 31 deletions
@@ -41,6 +41,7 @@ class W4AFP8Config(QuantConfigBase):
        self.is_permuted = is_permuted
        self.hadamard_block_size = hadamard_block_size
        self.is_quantized = is_quantized
+        self.is_checkpoint_bf16 = not is_quantized

    def name(self) -> str:
        return "w4afp8"
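
The added line in the hunk above derives `is_checkpoint_bf16` as the negation of `is_quantized`. A minimal sketch of that relationship follows, assuming a simplified stand-in class. `W4AFP8ConfigSketch`, its default argument values, and the `checkpoint_kind` helper are hypothetical and are not taken from the repository. Only the three attribute assignments, the derived flag, and `name()` mirror the diff.

```python
class W4AFP8ConfigSketch:
    """Hypothetical stand-in for W4AFP8Config; not the real class."""

    def __init__(self, is_permuted=True, hadamard_block_size=128, is_quantized=False):
        # Default values here are assumptions chosen for illustration only.
        self.is_permuted = is_permuted
        self.hadamard_block_size = hadamard_block_size
        self.is_quantized = is_quantized
        # Mirrors the line added by the commit: the checkpoint is treated as
        # bf16 exactly when it has not already been quantized.
        self.is_checkpoint_bf16 = not is_quantized

    def name(self) -> str:
        return "w4afp8"


def checkpoint_kind(cfg: W4AFP8ConfigSketch) -> str:
    """Hypothetical helper showing how a loader might branch on the flag."""
    return "bf16" if cfg.is_checkpoint_bf16 else "prequantized"


if __name__ == "__main__":
    for quantized in (False, True):
        cfg = W4AFP8ConfigSketch(is_quantized=quantized)
        print(cfg.name(), quantized, checkpoint_kind(cfg))
```

Because the flag is computed once in the constructor, it would not follow later changes to `is_quantized`. Whether the real loaders rely on that is not shown in this hunk.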