[Quantization] Improve the usage of FastDeploy tools. (#660)

* Add PaddleOCR Support

* Add PaddleOCR Support

* Add PaddleOCRv3 Support

* Add PaddleOCRv3 Support

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Add PaddleOCRv3 Support

* Add PaddleOCRv3 Support

* Add PaddleOCRv3 Support

* Fix Rec diff

* Remove useless functions

* Remove useless comments

* Add PaddleOCRv2 Support

* Add PaddleOCRv3 & PaddleOCRv2 Support

* remove useless parameters

* Add utils of sorting det boxes

* Fix code naming convention

* Fix code naming convention

* Fix code naming convention

* Fix bug in the Classify process

* Improve OCR Readme

* Fix diff in Cls model

* Update Model Download Link in Readme

* Fix diff in PPOCRv2

* Improve OCR readme

* Improve OCR readme

* Improve OCR readme

* Improve OCR readme

* Improve OCR readme

* Improve OCR readme

* Fix conflict

* Add readme for OCRResult

* Improve OCR readme

* Add OCRResult readme

* Improve OCR readme

* Improve OCR readme

* Add Model Quantization Demo

* Fix Model Quantization Readme

* Fix Model Quantization Readme

* Add the function to do PTQ quantization

* Improve quant tools readme

* Improve quant tool readme

* Improve quant tool readme

* Add PaddleInference-GPU for OCR Rec model

* Add QAT method to fastdeploy-quantization tool

* Remove examples/slim for now

* Move configs folder

* Add Quantization Support for Classification Model

* Improve ways of importing preprocess

* Upload YOLO Benchmark on readme

* Upload YOLO Benchmark on readme

* Upload YOLO Benchmark on readme

* Improve Quantization configs and readme

* Add support for multi-inputs model

* Add backends and params file for YOLOv7

* Add quantized model deployment support for YOLO series

* Fix YOLOv5 quantize readme

* Fix YOLO quantize readme

* Fix YOLO quantize readme

* Improve quantize YOLO readme

* Improve quantize YOLO readme

* Improve quantize YOLO readme

* Improve quantize YOLO readme

* Improve quantize YOLO readme

* Fix bug, change Frontend to ModelFormat

* Change Frontend to ModelFormat

* Add examples to deploy quantized paddleclas models

* Fix readme

* Add quantize Readme

* Add quantize Readme

* Add quantize Readme

* Modify readme of quantization tools

* Modify readme of quantization tools

* Improve quantization tools readme

* Improve quantization readme

* Improve PaddleClas quantized model deployment readme

* Add PPYOLOE-l quantized deployment examples

* Improve quantization tools readme

* Improve Quantize Readme

* Fix conflicts

* Fix conflicts

* improve readme

* Improve quantization tools and readme

* Improve quantization tools and readme

* Add quantized deployment examples for PaddleSeg model

* Fix cpp readme

* Fix memory leak of reader_wrapper function

* Fix model file name in PaddleClas quantization examples

* Update Runtime and E2E benchmark

* Update Runtime and E2E benchmark

* Rename quantization tools to auto compression tools

* Remove PPYOLOE data when deployed on MKLDNN

* Fix readme

* Support PPYOLOE with or without NMS and update readme

* Update Readme

* Update configs and readme

* Update configs and readme

* Add Paddle-TensorRT backend in quantized model deploy examples

* Support PPYOLOE+ series

* Add reused_input_tensors for PPYOLOE

* Improve fastdeploy tools usage

* improve fastdeploy tool

* Improve fastdeploy auto compression tool

* Improve fastdeploy auto compression tool

* Improve fastdeploy auto compression tool

* Improve fastdeploy auto compression tool

* Improve fastdeploy auto compression tool

* remove modify

* Improve fastdeploy auto compression tool

* Improve fastdeploy auto compression tool

* Improve fastdeploy auto compression tool

* Improve fastdeploy auto compression tool

* Improve fastdeploy auto compression tool

* Remove extra requirements for fd-auto-compress package

* Improve fastdeploy-tools package

* Install fastdeploy-tools package when build fastdeploy-python

* Improve quantization readme
This commit is contained in: yunyaoXYY, 2022-11-23 10:13:50 +08:00, committed by GitHub.
Parent 521ec87cf5, commit 712d7fd71b.
20 changed files with 33 additions and 69 deletions.
@@ -0,0 +1,54 @@
# FastDeploy One-Click Auto Compression Configuration File Guide
The FastDeploy one-click auto compression configuration file contains the global configuration, quantization-aware distillation training configuration, post-training quantization configuration, and training configuration.
Besides directly using the configuration files FastDeploy provides in this directory, users can modify the relevant configuration files following the example below to try compressing their own models.
## Example Walkthrough
```yaml
# Global configuration
Global:
  model_dir: ./ppyoloe_plus_crn_s_80e_coco    # Path to the input model; replace this to quantize your own model
  format: paddle                              # Input model format: 'paddle' for Paddle models, 'onnx' for ONNX models
  model_filename: model.pdmodel               # Model file name of the quantized model converted to Paddle format
  params_filename: model.pdiparams            # Parameter file name of the quantized model converted to Paddle format
  qat_image_path: ./COCO_train_320            # Dataset for quantization-aware distillation training; here a small set of unlabeled data, the first 320 images of the COCO2017 training set
  ptq_image_path: ./COCO_val_320              # Calibration dataset for post-training quantization, the first 320 images of the COCO2017 validation set
  input_list: ['image','scale_factor']        # Input names of the model to be quantized
  qat_preprocess: ppyoloe_plus_withNMS_image_preprocess   # Preprocessing function applied during quantization-aware distillation training; users can modify it or write a new one in ../fdquant/dataset.py to support custom models
  ptq_preprocess: ppyoloe_plus_withNMS_image_preprocess   # Preprocessing function applied during post-training quantization; users can modify it or write a new one in ../fdquant/dataset.py to support custom models
  qat_batch_size: 4                           # Batch size for quantization-aware distillation training; must be 1 for ONNX-format models

# Quantization-aware distillation training configuration
Distillation:
  alpha: 1.0            # Weight of the distillation loss
  loss: soft_label      # Distillation loss algorithm

Quantization:
  onnx_format: true     # Whether to use the standard ONNX quantization format; must be true to deploy with FastDeploy
  use_pact: true        # Whether quantization-aware training uses the PACT method
  activation_quantize_type: 'moving_average_abs_max'    # Activation quantization method
  quantize_op_types:    # Operators to be quantized
  - conv2d
  - depthwise_conv2d

# Post-training quantization configuration
PTQ:
  calibration_method: 'avg'     # Activation calibration algorithm for post-training quantization; options: avg, abs_max, hist, KL, mse, emd
  skip_tensor_list: None        # Users can skip quantization for specified conv layers

# Training configuration
TrainConfig:
  train_iter: 3000
  learning_rate: 0.00001
  optimizer_builder:
    optimizer:
      type: SGD
    weight_decay: 4.0e-05
  target_metric: 0.365
```
## More Configuration Details
FastDeploy's one-click compression is powered by PaddleSlim. For more detailed quantization configuration options, please refer to:
[Auto Compression Hyperparameter Tutorial](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/example/auto_compression/hyperparameter_tutorial.md)
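Before launching a compression run, the fields documented above can be sanity-checked once the YAML is parsed. The sketch below is purely illustrative and not part of the FastDeploy tools; the `validate_global` helper and its required-key list are assumptions derived from the fields this guide describes:

```python
# Keys the Global section is documented to contain (an assumption based on
# the guide above, not an official schema).
REQUIRED_GLOBAL_KEYS = {
    "model_dir", "format", "model_filename", "params_filename",
    "qat_image_path", "ptq_image_path", "input_list",
    "qat_preprocess", "ptq_preprocess", "qat_batch_size",
}

def validate_global(cfg):
    """Check a parsed config's Global section for missing keys and for the
    documented ONNX restriction (qat_batch_size must be 1)."""
    g = cfg.get("Global", {})
    missing = REQUIRED_GLOBAL_KEYS - set(g)
    if missing:
        raise ValueError(f"Global section missing keys: {sorted(missing)}")
    if g["format"] == "onnx" and g["qat_batch_size"] != 1:
        raise ValueError("qat_batch_size must be 1 for ONNX-format models")
    return g

# This dict mirrors the YAML example above after parsing.
cfg = {
    "Global": {
        "model_dir": "./ppyoloe_plus_crn_s_80e_coco",
        "format": "paddle",
        "model_filename": "model.pdmodel",
        "params_filename": "model.pdiparams",
        "qat_image_path": "./COCO_train_320",
        "ptq_image_path": "./COCO_val_320",
        "input_list": ["image", "scale_factor"],
        "qat_preprocess": "ppyoloe_plus_withNMS_image_preprocess",
        "ptq_preprocess": "ppyoloe_plus_withNMS_image_preprocess",
        "qat_batch_size": 4,
    }
}

print(validate_global(cfg)["format"])
```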
@@ -0,0 +1,56 @@
# Quantization Config Files for FastDeploy
The FastDeploy quantization configuration file contains the global configuration, quantization distillation training configuration, post-training quantization configuration, and training configuration.
In addition to directly using the configuration files FastDeploy provides in this directory, users can modify the relevant configuration files according to their needs.
## Demo
```yaml
# Global config
Global:
  model_dir: ./ppyoloe_plus_crn_s_80e_coco    # Path to the input model
  format: paddle                              # Input model format; select 'paddle' for Paddle models
  model_filename: model.pdmodel               # Model file name of the quantized model in Paddle format
  params_filename: model.pdiparams            # Parameter file name of the quantized model in Paddle format
  qat_image_path: ./COCO_train_320            # Dataset path for quantization distillation training
  ptq_image_path: ./COCO_val_320              # Dataset path for PTQ calibration
  input_list: ['image','scale_factor']        # Input names of the model to be quantized
  qat_preprocess: ppyoloe_plus_withNMS_image_preprocess   # Preprocessing function for quantization distillation training
  ptq_preprocess: ppyoloe_plus_withNMS_image_preprocess   # Preprocessing function for PTQ
  qat_batch_size: 4                           # Batch size; must be 1 for ONNX-format models

# Quantization distillation training configuration
Distillation:
  alpha: 1.0            # Distillation loss weight
  loss: soft_label      # Distillation loss algorithm

Quantization:
  onnx_format: true     # Whether to use the standard ONNX quantization format; must be true to deploy with FastDeploy
  use_pact: true        # Whether to use the PACT method for training
  activation_quantize_type: 'moving_average_abs_max'    # Activation quantization method
  quantize_op_types:    # Operators to be quantized
  - conv2d
  - depthwise_conv2d

# Post-training quantization configuration
PTQ:
  calibration_method: 'avg'     # Activation calibration algorithm for PTQ; options: avg, abs_max, hist, KL, mse, emd
  skip_tensor_list: None        # Users can skip quantization for specified conv layers

# Training config
TrainConfig:
  train_iter: 3000
  learning_rate: 0.00001
  optimizer_builder:
    optimizer:
      type: SGD
    weight_decay: 4.0e-05
  target_metric: 0.365
```
## More details
The FastDeploy one-click quantization tool is powered by PaddleSlim. Please refer to the [Auto Compression Hyperparameter Tutorial](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/example/auto_compression/hyperparameter_tutorial.md) for more details.
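The `moving_average_abs_max` activation quantization method named in the configs above tracks an exponential moving average of the per-batch maximum absolute activation value and derives a symmetric int8 scale from it. A minimal sketch of the idea follows; the decay value of 0.9 and this standalone formulation are illustrative assumptions, not the FastDeploy or PaddleSlim implementation:

```python
def update_moving_average_abs_max(state, batch_abs_max, decay=0.9):
    """One step of the moving_average_abs_max statistic.

    state: current moving average of max(|activation|) (None before the first batch)
    batch_abs_max: max(|x|) observed over the current calibration batch
    """
    if state is None:
        return batch_abs_max
    return decay * state + (1.0 - decay) * batch_abs_max

def int8_scale(abs_max):
    """Symmetric int8 quantization scale derived from the tracked abs-max."""
    return abs_max / 127.0

# Example: three calibration batches whose abs-max values are 4.0, 2.0, 6.0.
state = None
for m in [4.0, 2.0, 6.0]:
    state = update_moving_average_abs_max(state, m)
print(state, int8_scale(state))
```

Because the statistic is a moving average rather than a running maximum, a single outlier batch shifts the scale only slightly, which is why this method is a common default for activations.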
@@ -0,0 +1,50 @@
Global:
  model_dir: ./MobileNetV1_ssld_infer/
  format: 'paddle'
  model_filename: inference.pdmodel
  params_filename: inference.pdiparams
  qat_image_path: ./ImageNet_val_640
  ptq_image_path: ./ImageNet_val_640
  input_list: ['input']
  qat_preprocess: cls_image_preprocess
  ptq_preprocess: cls_image_preprocess
  qat_batch_size: 32
Distillation:
  alpha: 1.0
  loss: l2
  node:
  - softmax_0.tmp_0
Quantization:
  use_pact: true
  activation_bits: 8
  is_full_quantize: false
  onnx_format: True
  activation_quantize_type: moving_average_abs_max
  weight_quantize_type: channel_wise_abs_max
  not_quant_pattern:
  - skip_quant
  quantize_op_types:
  - conv2d
  - depthwise_conv2d
  weight_bits: 8
TrainConfig:
  train_iter: 5000
  learning_rate:
    type: CosineAnnealingDecay
    learning_rate: 0.015
    T_max: 8000
  optimizer_builder:
    optimizer:
      type: Momentum
    weight_decay: 0.00002
  origin_metric: 0.70898
PTQ:
  calibration_method: 'avg' # option: avg, abs_max, hist, KL, mse
  skip_tensor_list: None
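The `CosineAnnealingDecay` schedule used in the config above anneals the learning rate from its initial value toward zero over `T_max` steps along a half-cosine. A sketch of the formula, assuming the common `eta_min + (base_lr - eta_min) * (1 + cos(pi * t / T_max)) / 2` form with `eta_min = 0` (this mirrors the usual cosine-annealing definition, not FastDeploy-specific code):

```python
import math

def cosine_annealing_lr(base_lr, t, t_max, eta_min=0.0):
    """Learning rate at step t under cosine annealing toward eta_min."""
    return eta_min + (base_lr - eta_min) * (1 + math.cos(math.pi * t / t_max)) / 2

# With the values from the config above: base_lr=0.015, T_max=8000.
print(cosine_annealing_lr(0.015, 0, 8000))     # start of training
print(cosine_annealing_lr(0.015, 4000, 8000))  # midpoint of the schedule
```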
@@ -0,0 +1,48 @@
Global:
  model_dir: ./ResNet50_vd_infer/
  format: 'paddle'
  model_filename: inference.pdmodel
  params_filename: inference.pdiparams
  qat_image_path: ./ImageNet_val_640
  ptq_image_path: ./ImageNet_val_640
  input_list: ['input']
  qat_preprocess: cls_image_preprocess
  ptq_preprocess: cls_image_preprocess
  qat_batch_size: 32
Distillation:
  alpha: 1.0
  loss: l2
  node:
  - softmax_0.tmp_0
Quantization:
  use_pact: true
  activation_bits: 8
  is_full_quantize: false
  onnx_format: True
  activation_quantize_type: moving_average_abs_max
  weight_quantize_type: channel_wise_abs_max
  not_quant_pattern:
  - skip_quant
  quantize_op_types:
  - conv2d
  - depthwise_conv2d
  weight_bits: 8
TrainConfig:
  train_iter: 5000
  learning_rate:
    type: CosineAnnealingDecay
    learning_rate: 0.015
    T_max: 8000
  optimizer_builder:
    optimizer:
      type: Momentum
    weight_decay: 0.00002
  origin_metric: 0.7912
PTQ:
  calibration_method: 'avg' # option: avg, abs_max, hist, KL, mse
  skip_tensor_list: None
@@ -0,0 +1,39 @@
Global:
  model_dir: ./ppyoloe_plus_crn_s_80e_coco
  format: paddle
  model_filename: model.pdmodel
  params_filename: model.pdiparams
  qat_image_path: ./COCO_train_320
  ptq_image_path: ./COCO_val_320
  input_list: ['image','scale_factor']
  qat_preprocess: ppyoloe_plus_withNMS_image_preprocess
  ptq_preprocess: ppyoloe_plus_withNMS_image_preprocess
  qat_batch_size: 4
Distillation:
  alpha: 1.0
  loss: soft_label
Quantization:
  onnx_format: true
  use_pact: true
  activation_quantize_type: 'moving_average_abs_max'
  quantize_op_types:
  - conv2d
  - depthwise_conv2d
PTQ:
  calibration_method: 'avg' # option: avg, abs_max, hist, KL, mse
  skip_tensor_list: None
TrainConfig:
  train_iter: 5000
  learning_rate:
    type: CosineAnnealingDecay
    learning_rate: 0.00003
    T_max: 6000
  optimizer_builder:
    optimizer:
      type: SGD
    weight_decay: 4.0e-05
@@ -0,0 +1,39 @@
Global:
  model_dir: ./ppyoloe_crn_s_300e_coco
  format: paddle
  model_filename: model.pdmodel
  params_filename: model.pdiparams
  qat_image_path: ./COCO_train_320
  ptq_image_path: ./COCO_val_320
  input_list: ['image','scale_factor']
  qat_preprocess: ppyoloe_withNMS_image_preprocess
  ptq_preprocess: ppyoloe_withNMS_image_preprocess
  qat_batch_size: 4
Distillation:
  alpha: 1.0
  loss: soft_label
Quantization:
  onnx_format: true
  use_pact: true
  activation_quantize_type: 'moving_average_abs_max'
  quantize_op_types:
  - conv2d
  - depthwise_conv2d
PTQ:
  calibration_method: 'avg' # option: avg, abs_max, hist, KL, mse
  skip_tensor_list: None
TrainConfig:
  train_iter: 5000
  learning_rate:
    type: CosineAnnealingDecay
    learning_rate: 0.00003
    T_max: 6000
  optimizer_builder:
    optimizer:
      type: SGD
    weight_decay: 4.0e-05
@@ -0,0 +1,37 @@
Global:
  model_dir: ./yolov5s.onnx
  format: 'onnx'
  model_filename: model.pdmodel
  params_filename: model.pdiparams
  qat_image_path: ./COCO_train_320
  ptq_image_path: ./COCO_val_320
  input_list: ['x2paddle_images']
  qat_preprocess: yolo_image_preprocess
  ptq_preprocess: yolo_image_preprocess
  qat_batch_size: 1
Distillation:
  alpha: 1.0
  loss: soft_label
Quantization:
  onnx_format: true
  use_pact: true
  activation_quantize_type: 'moving_average_abs_max'
  quantize_op_types:
  - conv2d
  - depthwise_conv2d
PTQ:
  calibration_method: 'avg' # option: avg, abs_max, hist, KL, mse
  skip_tensor_list: None
TrainConfig:
  train_iter: 3000
  learning_rate: 0.00001
  optimizer_builder:
    optimizer:
      type: SGD
    weight_decay: 4.0e-05
  target_metric: 0.365
@@ -0,0 +1,38 @@
Global:
  model_dir: ./yolov6s.onnx
  format: 'onnx'
  model_filename: model.pdmodel
  params_filename: model.pdiparams
  qat_image_path: ./COCO_train_320
  ptq_image_path: ./COCO_val_320
  input_list: ['x2paddle_image_arrays']
  qat_preprocess: yolo_image_preprocess
  ptq_preprocess: yolo_image_preprocess
  qat_batch_size: 1
Distillation:
  alpha: 1.0
  loss: soft_label
Quantization:
  onnx_format: true
  activation_quantize_type: 'moving_average_abs_max'
  quantize_op_types:
  - conv2d
  - depthwise_conv2d
PTQ:
  calibration_method: 'avg' # option: avg, abs_max, hist, KL, mse
  skip_tensor_list: ['conv2d_2.w_0', 'conv2d_15.w_0', 'conv2d_46.w_0', 'conv2d_11.w_0', 'conv2d_49.w_0']
TrainConfig:
  train_iter: 8000
  learning_rate:
    type: CosineAnnealingDecay
    learning_rate: 0.00003
    T_max: 8000
  optimizer_builder:
    optimizer:
      type: SGD
    weight_decay: 0.00004
@@ -0,0 +1,37 @@
Global:
  model_dir: ./yolov7.onnx
  format: 'onnx'
  model_filename: model.pdmodel
  params_filename: model.pdiparams
  qat_image_path: ./COCO_train_320
  ptq_image_path: ./COCO_val_320
  input_list: ['x2paddle_images']
  qat_preprocess: yolo_image_preprocess
  ptq_preprocess: yolo_image_preprocess
  qat_batch_size: 1
Distillation:
  alpha: 1.0
  loss: soft_label
Quantization:
  onnx_format: true
  activation_quantize_type: 'moving_average_abs_max'
  quantize_op_types:
  - conv2d
  - depthwise_conv2d
PTQ:
  calibration_method: 'avg' # option: avg, abs_max, hist, KL, mse
  skip_tensor_list: None
TrainConfig:
  train_iter: 3000
  learning_rate:
    type: CosineAnnealingDecay
    learning_rate: 0.00003
    T_max: 8000
  optimizer_builder:
    optimizer:
      type: SGD
    weight_decay: 0.00004
@@ -0,0 +1,37 @@
Global:
  model_dir: ./PP_LiteSeg_T_STDC1_cityscapes_without_argmax_infer
  format: paddle
  model_filename: model.pdmodel
  params_filename: model.pdiparams
  qat_image_path: ./train_stuttgart
  ptq_image_path: ./val_munster
  input_list: ['x']
  qat_preprocess: ppseg_cityscapes_qat_preprocess
  ptq_preprocess: ppseg_cityscapes_ptq_preprocess
  qat_batch_size: 16
Distillation:
  alpha: 1.0
  loss: l2
  node:
  - conv2d_94.tmp_0
Quantization:
  onnx_format: True
  quantize_op_types:
  - conv2d
  - depthwise_conv2d
PTQ:
  calibration_method: 'avg' # option: avg, abs_max, hist, KL, mse
  skip_tensor_list: None
TrainConfig:
  epochs: 10
  eval_iter: 180
  learning_rate: 0.0005
  optimizer_builder:
    optimizer:
      type: SGD
    weight_decay: 4.0e-05
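Each config names its preprocessing functions (`cls_image_preprocess`, `yolo_image_preprocess`, `ppseg_cityscapes_qat_preprocess`, and so on), and the README above notes that custom models are supported by adding such a function in `../fdquant/dataset.py`. The sketch below illustrates the general shape such a function might take, mapping the model's input names (`input_list`) to preprocessed arrays. The function name, the exact contract, and the nearest-neighbor resize are all illustrative assumptions, not FastDeploy code:

```python
import numpy as np

def my_model_image_preprocess(image, input_size=(320, 320)):
    """Hypothetical preprocessing function in the style of the built-in
    *_image_preprocess helpers: takes an HWC uint8 image and returns a dict
    keyed by the model's input names (see input_list in the config).
    The real contract of ../fdquant/dataset.py may differ."""
    h, w = input_size
    # Nearest-neighbor resize via index sampling (a stand-in for cv2.resize).
    ys = np.linspace(0, image.shape[0] - 1, h).astype(int)
    xs = np.linspace(0, image.shape[1] - 1, w).astype(int)
    resized = image[ys][:, xs]
    # Scale to [0, 1], convert HWC -> CHW float32, then add a batch dimension.
    chw = (resized.astype(np.float32) / 255.0).transpose(2, 0, 1)
    return {"image": chw[np.newaxis, ...]}

dummy = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)
out = my_model_image_preprocess(dummy)
print(out["image"].shape)
```

A real function would also apply the model's own normalization (mean/std) and, for detection models with a `scale_factor` input, return that tensor as well.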