[Docx] add language (en/cn) switch links (#4470)

* add install docs

* Update docs

* Update docs
yangjianfengo1
2025-10-17 15:47:41 +08:00
committed by GitHub
parent a3e0a15495
commit ba5c2b7e37
106 changed files with 206 additions and 0 deletions
@@ -1,3 +1,5 @@
[简体中文](../zh/quantization/wint2.md)
# WINT2 Quantization
Weights are compressed offline using the [CCQ (Convolutional Coding Quantization)](https://arxiv.org/pdf/2507.07145) method. The weights are stored as INT8, with four weights packed into each INT8 value, which is equivalent to 2 bits per weight. Activations are not quantized. During inference, the weights are dequantized and decoded on the fly to BF16, and all computation is performed in BF16.
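
To make the storage layout concrete, the following is a minimal NumPy sketch of packing four 2-bit codes into one INT8 value and unpacking them again. It illustrates only the "4 weights per INT8" arithmetic and is not the CCQ codec or the FastDeploy kernel. The bit order (first code in the low bits), the helper names `pack_2bit`, `unpack_2bit`, and `dequantize`, and the uniform `scale`/`zero_point` mapping are all assumptions made for illustration. NumPy has no native BF16 type, so `float32` stands in for it here.

```python
import numpy as np


def pack_2bit(codes):
    """Pack 2-bit codes (0..3) four per byte; the first code goes in the low bits (assumed order)."""
    codes = np.asarray(codes, dtype=np.uint8)
    if codes.size % 4 != 0:
        raise ValueError("number of codes must be a multiple of 4")
    if codes.max(initial=0) > 3:
        raise ValueError("each code must fit in 2 bits")
    c = codes.reshape(-1, 4)
    packed = c[:, 0] | (c[:, 1] << 2) | (c[:, 2] << 4) | (c[:, 3] << 6)
    # Reinterpret the bytes as INT8, matching the storage type described above.
    return packed.astype(np.uint8).view(np.int8)


def unpack_2bit(packed):
    """Recover the four 2-bit codes stored in each INT8 value."""
    b = np.asarray(packed, dtype=np.int8).view(np.uint8).reshape(-1, 1)
    shifts = np.array([0, 2, 4, 6], dtype=np.uint8)
    return ((b >> shifts) & 3).reshape(-1)


def dequantize(packed, scale, zero_point=1.5):
    """Hypothetical uniform dequantization; float32 stands in for BF16."""
    codes = unpack_2bit(packed).astype(np.float32)
    return (codes - zero_point) * np.float32(scale)


codes = np.array([0, 1, 2, 3, 3, 2, 1, 0], dtype=np.uint8)
packed = pack_2bit(codes)        # 8 weights stored in 2 bytes
restored = unpack_2bit(packed)   # identical to `codes`
weights = dequantize(packed, scale=2.0)
```

With this layout, eight weights occupy two bytes, which is the 2 bits per weight stated above. The real method decodes through a convolutional code rather than a plain scale and offset, so the `dequantize` step here is only a placeholder for that stage.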