[LLM] First commit the llm deployment code

This commit is contained in:
jiangjiajun
2025-06-09 19:20:15 +08:00
committed by XieYunshen
parent 8513414112
commit 149c79699d
11814 changed files with 127294 additions and 1293102 deletions
@@ -1,101 +0,0 @@
# FastDeploy Toolkit
FastDeploy provides a series of efficient, easy-to-use tools that optimize the deployment experience and improve inference performance.
- [1.Auto Compression Tool](#1)
- [2.Model Conversion Tool](#2)
- [3.paddle2coreml Tool](#3)
<p id="1"></p>
## One-Click Model Auto Compression Tool
Based on PaddleSlim's Auto Compression Toolkit (ACT), FastDeploy provides a one-click model auto compression tool: a single command compresses a model automatically, and the compressed model can then be deployed with FastDeploy for faster inference. This document describes how to install the tool and points to the corresponding usage documentation.
### Environment Preparation
1. Install PaddlePaddle 2.4, following the official installation guide:
```
https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/develop/install/pip/linux-pip.html
```
2. Install PaddleSlim 2.4:
```bash
pip install paddleslim==2.4.0
```
3. Install the fastdeploy-tools package:
```bash
# Install fastdeploy-tools via pip. This package currently supports one-click model auto compression and model conversion.
# The FastDeploy Python package already includes this tool, so there is no need to install it again.
pip install fastdeploy-tools==0.0.1
```
### Usage of the One-Click Model Auto Compression Tool
After the installation steps above succeed, the tool can be used as follows:
```bash
fastdeploy compress --config_path=./configs/detection/yolov5s_quant.yaml --method='PTQ' --save_dir='./yolov5s_ptq_model/'
```
For detailed documentation, please refer to [FastDeploy One-Click Model Auto Compression Tool](./common_tools/auto_compression/README.md).
<p id="2"></p>
## Model Conversion Tool
Based on X2Paddle, FastDeploy provides a model conversion tool that migrates models from external frameworks to the Paddle framework with a single command. ONNX, TensorFlow, and Caffe are currently supported, covering most mainstream CV and NLP models.
### Environment Preparation
1. Install PaddlePaddle, following the official installation guide:
```
https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/develop/install/pip/linux-pip.html
```
2. Install X2Paddle
For the stable release, install X2Paddle via pip:
```shell
pip install x2paddle
```
To try the latest features, install from source:
```shell
git clone https://github.com/PaddlePaddle/X2Paddle.git
cd X2Paddle
python setup.py install
```
### Usage
After the installation steps above succeed, the one-click conversion tool can be used as follows:
```bash
fastdeploy convert --framework onnx --model yolov5s.onnx --save_dir pd_model
```
For more details, please refer to [X2Paddle](https://github.com/PaddlePaddle/X2Paddle).
<p id="3"></p>
## paddle2coreml Tool
Based on paddle2coreml, FastDeploy provides a model conversion tool that migrates Paddle models to Apple desktop and mobile devices with a single command.
### Environment Preparation
1. Install PaddlePaddle, following the official installation guide:
```
https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/develop/install/pip/linux-pip.html
```
2. Install paddle2coreml
paddle2coreml can be installed via pip:
```shell
pip install paddle2coreml
```
3. Usage
After the installation steps above succeed, the FastDeploy paddle2coreml one-click conversion tool can be used as follows:
```bash
fastdeploy paddle2coreml --p2c_paddle_model_dir path/to/paddle_model --p2c_coreml_model_dir path/to/coreml_model --p2c_input_names "input1 input2" --p2c_input_shapes "1,3,224,224 1,4,64,64" --p2c_input_dtypes "float32 int32" --p2c_output_names "output1 output2"
```
Note that --p2c_input_names and --p2c_output_names must match the input and output names of the Paddle model.
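The multi-input flags above take space-separated lists, with shapes written as comma-separated dimensions. The sketch below is a minimal, hypothetical illustration of how such flag values pair up per input; it is not FastDeploy's actual parser, and the function name is invented:

```python
# Minimal sketch: pair up the space-separated p2c_* flag values.
# Hypothetical helper for illustration, not FastDeploy's actual implementation.
def parse_p2c_inputs(names: str, shapes: str, dtypes: str):
    """Return a list of (name, shape tuple, dtype) triples."""
    name_list = names.split()
    shape_list = [tuple(int(d) for d in s.split(",")) for s in shapes.split()]
    dtype_list = dtypes.split()
    if not (len(name_list) == len(shape_list) == len(dtype_list)):
        raise ValueError("input names, shapes and dtypes must have the same count")
    return list(zip(name_list, shape_list, dtype_list))

inputs = parse_p2c_inputs("input1 input2", "1,3,224,224 1,4,64,64", "float32 int32")
# inputs[0] -> ('input1', (1, 3, 224, 224), 'float32')
```

Each input name must line up with exactly one shape and one dtype, which is why the counts are checked.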
@@ -1,98 +0,0 @@
# FastDeploy Toolkit
FastDeploy provides a series of efficient and easy-to-use tools to optimize the deployment experience and improve inference performance.
- [1.Auto Compression Tool](#1)
- [2.Model Conversion Tool](#2)
- [3.paddle2coreml Tool](#3)
<p id="1"></p>
## One-Click Model Auto Compression Tool
Based on PaddleSlim's Auto Compression Toolkit (ACT), FastDeploy provides a one-click model auto compression tool that compresses a model with a single command; the compressed model can then be deployed with FastDeploy. This document describes how to install the tool and points to the corresponding usage documentation.
### Environment Preparation
1. Install PaddlePaddle 2.4, following the official installation guide:
```
https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/develop/install/pip/linux-pip.html
```
2. Install PaddleSlim 2.4:
```bash
pip install paddleslim==2.4.0
```
3. Install the fastdeploy-tools package:
```bash
# Install fastdeploy-tools via pip. This package currently supports one-click model auto compression and model conversion.
# The FastDeploy Python package already includes this tool, so there is no need to install it again.
pip install fastdeploy-tools==0.0.1
```
### Usage of the One-Click Model Auto Compression Tool
After the installation steps above succeed, the tool can be used as follows:
```bash
fastdeploy compress --config_path=./configs/detection/yolov5s_quant.yaml --method='PTQ' --save_dir='./yolov5s_ptq_model/'
```
For detailed documentation, please refer to [FastDeploy One-Click Model Auto Compression Tool](./common_tools/auto_compression/README_EN.md)
<p id="2"></p>
## Model Conversion Tool
Based on X2Paddle, FastDeploy provides a model conversion tool that migrates models from external frameworks to the Paddle framework with a single command. ONNX, TensorFlow, and Caffe are currently supported, covering most mainstream CV and NLP models.
### Environment Preparation
1. Install PaddlePaddle, following the official installation guide:
```
https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/develop/install/pip/linux-pip.html
```
2. Install X2Paddle
To use the stable version, install X2Paddle via pip:
```shell
pip install x2paddle
```
To experience the latest features, you can use the source installation method:
```shell
git clone https://github.com/PaddlePaddle/X2Paddle.git
cd X2Paddle
python setup.py install
```
### Usage
After the installation steps above succeed, the one-click conversion tool can be used as follows:
```bash
fastdeploy convert --framework onnx --model yolov5s.onnx --save_dir pd_model
```
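After a conversion it can be useful to confirm that Paddle model files were actually produced. The sketch below simply scans the save directory for `.pdmodel`/`.pdiparams` files; the converter's exact output layout is an assumption here, so treat this as a generic sanity check rather than X2Paddle's documented behavior:

```python
import pathlib

def find_paddle_model_files(save_dir: str):
    """Recursively collect Paddle model/parameter file names under save_dir.

    Generic sanity check; the converter's actual output layout may differ.
    """
    root = pathlib.Path(save_dir)
    return sorted(p.name for p in root.rglob("*")
                  if p.suffix in {".pdmodel", ".pdiparams"})

# e.g. find_paddle_model_files("pd_model") should list the converted
# model and parameter files if the conversion succeeded
```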
For more details, please refer to [X2Paddle](https://github.com/PaddlePaddle/X2Paddle).
<p id="3"></p>
## paddle2coreml Tool
FastDeploy provides users with a model conversion tool based on paddle2coreml, which allows users to easily migrate PaddlePaddle models to Apple computers and mobile devices with a single command.
### Environment Preparation
1. Install PaddlePaddle, following the official installation guide:
```
https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/develop/install/pip/linux-pip.html
```
2. paddle2coreml installation
paddle2coreml can be installed using pip:
```shell
pip install paddle2coreml
```
3. Usage
After successfully installing as described above, you can use the FastDeploy paddle2coreml one-click conversion tool, as shown below:
```bash
fastdeploy paddle2coreml --p2c_paddle_model_dir path/to/paddle_model --p2c_coreml_model_dir path/to/coreml_model --p2c_input_names "input1 input2" --p2c_input_shapes "1,3,224,224 1,4,64,64" --p2c_input_dtypes "float32 int32" --p2c_output_names "output1 output2"
```
Note that --p2c_input_names and --p2c_output_names must match the input and output names of the Paddle model.
@@ -1,131 +0,0 @@
# FastDeploy One-Click Model Auto Compression
Based on PaddleSlim's Auto Compression Toolkit (ACT), FastDeploy provides a one-click model auto compression tool.
This document takes YOLOv5s as an example to show how to install and run FastDeploy's one-click model auto compression.
## 1. Installation
### Environment Dependencies
1. Install PaddlePaddle 2.4, following the official installation guide:
```
https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/develop/install/pip/linux-pip.html
```
2. Install PaddleSlim 2.4:
```bash
pip install paddleslim==2.4.0
```
### Installing the One-Click Model Auto Compression Tool
FastDeploy one-click model auto compression requires no separate installation; simply install the [FastDeploy Toolkit](../../README.md) correctly.
## 2. Usage
### One-Click Model Compression Examples
FastDeploy one-click auto compression supports multiple strategies; currently post-training (offline) quantization and quantization-aware distillation training are the main ones. The two subsections below show how to use each.
#### Post-Training Quantization
##### 1. Prepare the model and the calibration dataset
Users need to prepare the model to be quantized and a calibration dataset on their own.
In this example, run the following commands to download the yolov5s.onnx model to be quantized and a sample calibration dataset we prepared:
```shell
# Download yolov5s.onnx
wget https://paddle-slim-models.bj.bcebos.com/act/yolov5s.onnx
# Download the dataset: this calibration set is the first 320 images of COCO val2017
wget https://bj.bcebos.com/paddlehub/fastdeploy/COCO_val_320.tar.gz
tar -xvf COCO_val_320.tar.gz
```
##### 2. Run the fastdeploy compress command to compress the model
The following command quantizes the yolov5s model; to quantize another model, point config_path at the corresponding configuration file in the configs folder.
```shell
fastdeploy compress --config_path=./configs/detection/yolov5s_quant.yaml --method='PTQ' --save_dir='./yolov5s_ptq_model/'
```
##### 3. Parameters
To complete quantization, users only need to provide a customized model config file, the quantization method, and a save path for the quantized model.
| Parameter | Description |
| ------------- | ------------------------------------------------------------------------------------------------------------- |
| --config_path | Quantization config file required for one-click compression. [Details](./configs/README.md) |
| --method | Quantization method: PTQ for post-training quantization, QAT for quantization-aware distillation training |
| --save_dir | Output path of the quantized model, which can be deployed directly with FastDeploy |
#### Quantization-Aware Distillation Training
##### 1. Prepare the model to be quantized and the training dataset
FastDeploy one-click auto compression currently supports quantization-aware distillation training only on unlabeled images, and model accuracy cannot be evaluated during training.
The dataset should consist of images from the real inference scenario; the number of images depends on the dataset size and should cover the deployment scenarios as much as possible. In this example, we prepared the first 320 images of the COCO2017 training set.
Note: To obtain a more accurate quantized model through distillation training, prepare more data and train for more iterations.
```shell
# Download yolov5s.onnx
wget https://paddle-slim-models.bj.bcebos.com/act/yolov5s.onnx
# Download the dataset: this set is the first 320 images of the COCO2017 training set
wget https://bj.bcebos.com/paddlehub/fastdeploy/COCO_train_320.tar
tar -xvf COCO_train_320.tar
```
##### 2. Run the fastdeploy compress command to compress the model
The following command quantizes the yolov5s model; to quantize another model, point config_path at the corresponding configuration file in the configs folder.
```shell
# The command defaults to single-GPU training; pin a single GPU before training, otherwise training may hang.
export CUDA_VISIBLE_DEVICES=0
fastdeploy compress --config_path=./configs/detection/yolov5s_quant.yaml --method='QAT' --save_dir='./yolov5s_qat_model/'
```
##### 3. Parameters
To complete quantization, users only need to provide a customized model config file, the quantization method, and a save path for the quantized model.
| Parameter | Description |
| ------------- | ------------------------------------------------------------------------------------------------------------- |
| --config_path | Quantization config file required for one-click auto compression. [Details](./configs/README.md) |
| --method | Quantization method: PTQ for post-training quantization, QAT for quantization-aware distillation training |
| --save_dir | Output path of the quantized model, which can be deployed directly with FastDeploy |
## 3. FastDeploy One-Click Model Auto Compression Config File Reference
FastDeploy currently provides compression [config](./configs/) files for multiple models, together with the corresponding FP32 models, which can be downloaded and tried directly.
| Config file | FP32 model to compress | Note |
| -------------------- | ------------------------------------------------------------ |----------------------------------------- |
| [mobilenetv1_ssld_quant](./configs/classification/mobilenetv1_ssld_quant.yaml) | [mobilenetv1_ssld](https://bj.bcebos.com/paddlehub/fastdeploy/MobileNetV1_ssld_infer.tgz) | |
| [resnet50_vd_quant](./configs/classification/resnet50_vd_quant.yaml) | [resnet50_vd](https://bj.bcebos.com/paddlehub/fastdeploy/ResNet50_vd_infer.tgz) | |
| [efficientnetb0_quant](./configs/classification/efficientnetb0_quant.yaml) | [efficientnetb0](https://bj.bcebos.com/paddlehub/fastdeploy/EfficientNetB0_small_infer.tgz) | |
| [mobilenetv3_large_x1_0_quant](./configs/classification/mobilenetv3_large_x1_0_quant.yaml) | [mobilenetv3_large_x1_0](https://bj.bcebos.com/paddlehub/fastdeploy/MobileNetV3_large_x1_0_ssld_infer.tgz) | |
| [pphgnet_tiny_quant](./configs/classification/pphgnet_tiny_quant.yaml) | [pphgnet_tiny](https://bj.bcebos.com/paddlehub/fastdeploy/PPHGNet_tiny_ssld_infer.tgz) | |
| [pplcnetv2_base_quant](./configs/classification/pplcnetv2_base_quant.yaml) | [pplcnetv2_base](https://bj.bcebos.com/paddlehub/fastdeploy/PPLCNetV2_base_infer.tgz) | |
| [yolov5s_quant](./configs/detection/yolov5s_quant.yaml) | [yolov5s](https://paddle-slim-models.bj.bcebos.com/act/yolov5s.onnx) | |
| [yolov6s_quant](./configs/detection/yolov6s_quant.yaml) | [yolov6s](https://paddle-slim-models.bj.bcebos.com/act/yolov6s.onnx) | |
| [yolov7_quant](./configs/detection/yolov7_quant.yaml) | [yolov7](https://paddle-slim-models.bj.bcebos.com/act/yolov7.onnx) | |
| [ppyoloe_withNMS_quant](./configs/detection/ppyoloe_withNMS_quant.yaml) | [ppyoloe_l](https://bj.bcebos.com/v1/paddle-slim-models/act/ppyoloe_crn_l_300e_coco.tar) | Supports the PPYOLOE s/m/l/x series; export the model from PaddleDetection as usual and do not remove NMS |
| [ppyoloe_plus_withNMS_quant](./configs/detection/ppyoloe_plus_withNMS_quant.yaml) | [ppyoloe_plus_s](https://bj.bcebos.com/paddlehub/fastdeploy/ppyoloe_plus_crn_s_80e_coco.tar) | Supports the PPYOLOE+ s/m/l/x series; export the model from PaddleDetection as usual and do not remove NMS |
| [pp_liteseg_quant](./configs/segmentation/pp_liteseg_quant.yaml) | [pp_liteseg](https://bj.bcebos.com/paddlehub/fastdeploy/PP_LiteSeg_T_STDC1_cityscapes_without_argmax_infer.tgz) | |
| [deeplabv3_resnet_quant](./configs/segmentation/deeplabv3_resnet_quant.yaml) | [deeplabv3_resnet101](https://bj.bcebos.com/paddlehub/fastdeploy/Deeplabv3_ResNet101_OS8_cityscapes_without_argmax_infer.tgz) | |
| [fcn_hrnet_quant](./configs/segmentation/fcn_hrnet_quant.yaml) | [fcn_hrnet](https://bj.bcebos.com/paddlehub/fastdeploy/FCN_HRNet_W18_cityscapes_without_argmax_infer.tgz) | |
| [unet_quant](./configs/segmentation/unet_quant.yaml) | [unet](https://bj.bcebos.com/paddlehub/fastdeploy/Unet_cityscapes_without_argmax_infer.tgz) | |
## 4. Deploying Quantized Models with FastDeploy
After obtaining the quantized model, users can deploy it with FastDeploy. Please refer to the example documentation:
- [YOLOv5 Quantized Model Deployment](../../../examples/vision/detection/yolov5/quantize/)
- [YOLOv6 Quantized Model Deployment](../../../examples/vision/detection/yolov6/quantize/)
- [YOLOv7 Quantized Model Deployment](../../../examples/vision/detection/yolov7/quantize/)
- [PaddleClas Quantized Model Deployment](../../../examples/vision/classification/paddleclas/quantize/)
- [PaddleDetection Quantized Model Deployment](../../../examples/vision/detection/paddledetection/quantize/)
- [PaddleSeg Quantized Model Deployment](../../../examples/vision/segmentation/paddleseg/quantize/)
@@ -1,139 +0,0 @@
# FastDeploy One-Click Model Auto Compression
Based on PaddleSlim's Auto Compression Toolkit (ACT), FastDeploy provides developers with a one-click model auto compression tool that supports post-training quantization and quantization-aware distillation training.
We take YOLOv5s as an example to demonstrate how to install and run FastDeploy's one-click model auto compression.
## 1. Installation
### Environment Dependencies
1. Install PaddlePaddle 2.4, following the official installation guide:
```
https://www.paddlepaddle.org.cn/install/quick?docurl=/documentation/docs/zh/develop/install/pip/linux-pip.html
```
2. Install PaddleSlim 2.4:
```bash
pip install paddleslim==2.4.0
```
### Install the FastDeploy Auto Compression Toolkit
FastDeploy one-click model auto compression requires no separate installation; simply install the [FastDeploy Toolkit](../../README.md) correctly.
## 2. How to Use
### Demo of the One-Click Auto Compression Toolkit
FastDeploy auto compression supports multiple strategies; currently post-training (offline) quantization and quantization-aware distillation training are the main ones. The two subsections below show how to use each.
#### Offline Quantization
##### 1. Prepare models and Calibration data set
Developers need to prepare the model to be quantized and the Calibration dataset on their own.
In this demo, developers can execute the following command to download the yolov5s.onnx model to be quantized and calibration data set.
```shell
# Download yolov5s.onnx
wget https://paddle-slim-models.bj.bcebos.com/act/yolov5s.onnx
# Download dataset. This Calibration dataset is the first 320 images from COCO val2017
wget https://bj.bcebos.com/paddlehub/fastdeploy/COCO_val_320.tar.gz
tar -xvf COCO_val_320.tar.gz
```
##### 2. Run the fastdeploy compress command to compress the model
The following command quantizes the yolov5s model; to quantize another model, replace config_path with the corresponding configuration file in the configs folder.
```shell
fastdeploy compress --config_path=./configs/detection/yolov5s_quant.yaml --method='PTQ' --save_dir='./yolov5s_ptq_model/'
```
Note: PTQ is short for post-training quantization.
##### 3. Parameters
To complete the quantization, developers only need to provide a customized model config file, specify the quantization method, and the path to save the quantized model.
| Parameter | Description |
| ------------- | ------------------------------------------------------------------------------------------------------------- |
| --config_path | Quantization config file needed for one-click compression. [Configs](./configs/README.md) |
| --method | Quantization method selection, PTQ for post-training quantization, QAT for quantization distillation training |
| --save_dir | Output of quantized model paths, which can be deployed directly in FastDeploy |
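Putting the parameters together, a small wrapper can assemble the compress command programmatically. This is an illustrative sketch that only builds the argument list (it assumes the `fastdeploy` CLI is on PATH if you actually execute it, e.g. via `subprocess.run`):

```python
def build_compress_cmd(config_path: str, method: str, save_dir: str):
    """Build the fastdeploy compress argument list; method is 'PTQ' or 'QAT'."""
    if method not in ("PTQ", "QAT"):
        raise ValueError("method must be 'PTQ' (post-training) or 'QAT' (distillation)")
    return [
        "fastdeploy", "compress",
        f"--config_path={config_path}",
        f"--method={method}",
        f"--save_dir={save_dir}",
    ]

cmd = build_compress_cmd("./configs/detection/yolov5s_quant.yaml",
                         "PTQ", "./yolov5s_ptq_model/")
# pass cmd to subprocess.run(cmd, check=True) to execute the compression
```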
#### Quantization-Aware Distillation Training
##### 1. Prepare the model to be quantized and the training dataset
FastDeploy currently supports quantization-aware distillation training only on unlabeled images, and model accuracy cannot be evaluated during training.
The dataset should consist of images from the real inference scenario; the number of images depends on the dataset size and should cover the deployment scenarios as much as possible. In this demo, we prepared the first 320 images of the COCO2017 training set.
Note: To obtain a more accurate quantized model through distillation training, prepare more data and train for more iterations.
```shell
# Download yolov5s.onnx
wget https://paddle-slim-models.bj.bcebos.com/act/yolov5s.onnx
# Download the dataset: this set is the first 320 images of the COCO2017 training set
wget https://bj.bcebos.com/paddlehub/fastdeploy/COCO_train_320.tar
tar -xvf COCO_train_320.tar
```
##### 2. Run the fastdeploy compress command to compress the model
The following command quantizes the yolov5s model; to quantize another model, replace config_path with the corresponding configuration file in the configs folder.
```shell
# The command defaults to single-GPU training; pin a single GPU before training, otherwise training may hang.
export CUDA_VISIBLE_DEVICES=0
fastdeploy compress --config_path=./configs/detection/yolov5s_quant.yaml --method='QAT' --save_dir='./yolov5s_qat_model/'
```
##### 3.Parameters
To complete the quantization, developers only need to provide a customized model config file, specify the quantization method, and the path to save the quantized model.
| Parameter | Description |
| ------------- | ------------------------------------------------------------------------------------------------------------- |
| --config_path | Quantization config file needed for one-click compression. [Configs](./configs/README.md) |
| --method | Quantization method selection, PTQ for post-training quantization, QAT for quantization distillation training |
| --save_dir | Output of quantized model paths, which can be deployed directly in FastDeploy |
## 3. FastDeploy One-Click Model Auto Compression Config File Examples
FastDeploy currently provides compression [config](./configs/) files for multiple models, together with the corresponding FP32 models, which users can download and try directly.
| Config file | FP32 model | Note |
| -------------------- | ------------------------------------------------------------ |----------------------------------------- |
| [mobilenetv1_ssld_quant](./configs/classification/mobilenetv1_ssld_quant.yaml) | [mobilenetv1_ssld](https://bj.bcebos.com/paddlehub/fastdeploy/MobileNetV1_ssld_infer.tgz) | |
| [resnet50_vd_quant](./configs/classification/resnet50_vd_quant.yaml) | [resnet50_vd](https://bj.bcebos.com/paddlehub/fastdeploy/ResNet50_vd_infer.tgz) | |
| [efficientnetb0_quant](./configs/classification/efficientnetb0_quant.yaml) | [efficientnetb0](https://bj.bcebos.com/paddlehub/fastdeploy/EfficientNetB0_small_infer.tgz) | |
| [mobilenetv3_large_x1_0_quant](./configs/classification/mobilenetv3_large_x1_0_quant.yaml) | [mobilenetv3_large_x1_0](https://bj.bcebos.com/paddlehub/fastdeploy/MobileNetV3_large_x1_0_ssld_infer.tgz) | |
| [pphgnet_tiny_quant](./configs/classification/pphgnet_tiny_quant.yaml) | [pphgnet_tiny](https://bj.bcebos.com/paddlehub/fastdeploy/PPHGNet_tiny_ssld_infer.tgz) | |
| [pplcnetv2_base_quant](./configs/classification/pplcnetv2_base_quant.yaml) | [pplcnetv2_base](https://bj.bcebos.com/paddlehub/fastdeploy/PPLCNetV2_base_infer.tgz) | |
| [yolov5s_quant](./configs/detection/yolov5s_quant.yaml) | [yolov5s](https://paddle-slim-models.bj.bcebos.com/act/yolov5s.onnx) | |
| [yolov6s_quant](./configs/detection/yolov6s_quant.yaml) | [yolov6s](https://paddle-slim-models.bj.bcebos.com/act/yolov6s.onnx) | |
| [yolov7_quant](./configs/detection/yolov7_quant.yaml) | [yolov7](https://paddle-slim-models.bj.bcebos.com/act/yolov7.onnx) | |
| [ppyoloe_withNMS_quant](./configs/detection/ppyoloe_withNMS_quant.yaml) | [ppyoloe_l](https://bj.bcebos.com/v1/paddle-slim-models/act/ppyoloe_crn_l_300e_coco.tar) | Supports the PPYOLOE s/m/l/x series; export the model from PaddleDetection as usual and do not remove NMS |
| [ppyoloe_plus_withNMS_quant](./configs/detection/ppyoloe_plus_withNMS_quant.yaml) | [ppyoloe_plus_s](https://bj.bcebos.com/paddlehub/fastdeploy/ppyoloe_plus_crn_s_80e_coco.tar) | Supports the PPYOLOE+ s/m/l/x series; export the model from PaddleDetection as usual and do not remove NMS |
| [pp_liteseg_quant](./configs/segmentation/pp_liteseg_quant.yaml) | [pp_liteseg](https://bj.bcebos.com/paddlehub/fastdeploy/PP_LiteSeg_T_STDC1_cityscapes_without_argmax_infer.tgz) | |
| [deeplabv3_resnet_quant](./configs/segmentation/deeplabv3_resnet_quant.yaml) | [deeplabv3_resnet101](https://bj.bcebos.com/paddlehub/fastdeploy/Deeplabv3_ResNet101_OS8_cityscapes_without_argmax_infer.tgz) | |
| [fcn_hrnet_quant](./configs/segmentation/fcn_hrnet_quant.yaml) | [fcn_hrnet](https://bj.bcebos.com/paddlehub/fastdeploy/FCN_HRNet_W18_cityscapes_without_argmax_infer.tgz) | |
| [unet_quant](./configs/segmentation/unet_quant.yaml) | [unet](https://bj.bcebos.com/paddlehub/fastdeploy/Unet_cityscapes_without_argmax_infer.tgz) | |
## 4. Deploying Quantized Models with FastDeploy
Once the quantized model is obtained, developers can deploy it with FastDeploy. Please refer to the following docs for more details:
- [YOLOv5 Quantized Model Deployment](../../../examples/vision/detection/yolov5/quantize/)
- [YOLOv6 Quantized Model Deployment](../../../examples/vision/detection/yolov6/quantize/)
- [YOLOv7 Quantized Model Deployment](../../../examples/vision/detection/yolov7/quantize/)
- [PaddleClas Quantized Model Deployment](../../../examples/vision/classification/paddleclas/quantize/)
- [PaddleDetection Quantized Model Deployment](../../../examples/vision/detection/paddledetection/quantize/)
- [PaddleSeg Quantized Model Deployment](../../../examples/vision/segmentation/paddleseg/quantize/)
@@ -1,54 +0,0 @@
# FastDeploy One-Click Auto Compression Config File Reference
A FastDeploy one-click auto compression config file contains global settings, quantization-aware distillation training settings, post-training quantization settings, and training settings.
Besides directly using the config files FastDeploy provides in this directory, users can modify them as in the following example to compress their own models.
## Annotated Example
```yaml
# Global settings
Global:
  model_dir: ./ppyoloe_plus_crn_s_80e_coco  # Path to the input model; replace this to quantize your own model
  format: paddle                            # Input model format: 'paddle' for Paddle models, 'onnx' for ONNX models
  model_filename: model.pdmodel             # Model file name of the quantized model saved in Paddle format
  params_filename: model.pdiparams          # Parameter file name of the quantized model saved in Paddle format
  qat_image_path: ./COCO_train_320          # Dataset for quantization-aware distillation training; here a small unlabeled set, the first 320 images of the COCO2017 training set
  ptq_image_path: ./COCO_val_320            # Calibration dataset for post-training quantization; the first 320 images of the COCO2017 validation set
  input_list: ['image','scale_factor']      # Input names of the model to be quantized
  qat_preprocess: ppyoloe_plus_withNMS_image_preprocess  # Preprocessing function used during distillation training; edit or add functions in ../fdquant/dataset.py to support custom models
  ptq_preprocess: ppyoloe_plus_withNMS_image_preprocess  # Preprocessing function used during post-training quantization; edit or add functions in ../fdquant/dataset.py to support custom models
  qat_batch_size: 4                         # Batch size for distillation training; must be 1 for ONNX-format models
# Quantization-aware distillation training settings
Distillation:
  alpha: 1.0                                # Weight of the distillation loss
  loss: soft_label                          # Distillation loss algorithm
QuantAware:
  onnx_format: true                         # Whether to use the standard ONNX quantization format; must be true to deploy on FastDeploy
  use_pact: true                            # Whether to use the PACT method during quantization training
  activation_quantize_type: 'moving_average_abs_max'  # Activation quantization method
  quantize_op_types:                        # OPs to be quantized
  - conv2d
  - depthwise_conv2d
# Post-training quantization settings
PTQ:
  calibration_method: 'avg'                 # Activation calibration algorithm for PTQ; options: avg, abs_max, hist, KL, mse, emd
  skip_tensor_list: None                    # Conv layers to skip during quantization
# Training settings
TrainConfig:
  train_iter: 3000
  learning_rate: 0.00001
  optimizer_builder:
    optimizer:
      type: SGD
    weight_decay: 4.0e-05
  target_metric: 0.365
```
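The constraints called out in the comments above (onnx_format must be true for FastDeploy deployment, qat_batch_size must be 1 for ONNX-format models, and the set of allowed calibration methods) can be checked programmatically. A minimal sketch, using a plain dict in place of the loaded YAML; the validation rules come from this document, while the function itself is hypothetical:

```python
# Allowed PTQ calibration algorithms, per this document.
ALLOWED_CALIBRATION = {"avg", "abs_max", "hist", "KL", "mse", "emd"}

def validate_compress_config(cfg: dict):
    """Check a few constraints this document states for compression configs."""
    errors = []
    if not cfg.get("QuantAware", {}).get("onnx_format", False):
        errors.append("QuantAware.onnx_format must be true to deploy on FastDeploy")
    g = cfg.get("Global", {})
    if g.get("format") == "onnx" and g.get("qat_batch_size", 1) != 1:
        errors.append("qat_batch_size must be 1 for ONNX-format models")
    method = cfg.get("PTQ", {}).get("calibration_method")
    if method is not None and method not in ALLOWED_CALIBRATION:
        errors.append("unknown calibration_method: %s" % method)
    return errors

cfg = {
    "Global": {"format": "onnx", "qat_batch_size": 4},
    "QuantAware": {"onnx_format": True},
    "PTQ": {"calibration_method": "avg"},
}
# validate_compress_config(cfg) flags the batch size: ONNX models need qat_batch_size 1
```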
## More Details
FastDeploy's one-click compression is powered by PaddleSlim; for more detailed quantization configuration, please refer to the
[Auto Compression Hyperparameter Tutorial](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/example/auto_compression/hyperparameter_tutorial.md)
@@ -1,56 +0,0 @@
# Quantization Config File on FastDeploy
The FastDeploy quantization configuration file contains global configuration, quantization-aware distillation training configuration, post-training quantization configuration, and training configuration.
In addition to directly using the configuration files FastDeploy provides in this directory, users can modify them according to their needs.
## Demo
```yaml
# Global config
Global:
  model_dir: ./ppyoloe_plus_crn_s_80e_coco  # Path to the input model
  format: paddle                            # Input model format; select 'paddle' for Paddle models, 'onnx' for ONNX models
  model_filename: model.pdmodel             # Model file name of the quantized model saved in Paddle format
  params_filename: model.pdiparams          # Parameter file name of the quantized model saved in Paddle format
  qat_image_path: ./COCO_train_320          # Dataset path for quantization-aware distillation training
  ptq_image_path: ./COCO_val_320            # Calibration dataset path for PTQ
  input_list: ['image','scale_factor']      # Input names of the model to be quantized
  qat_preprocess: ppyoloe_plus_withNMS_image_preprocess  # Preprocessing function for quantization-aware distillation training
  ptq_preprocess: ppyoloe_plus_withNMS_image_preprocess  # Preprocessing function for PTQ
  qat_batch_size: 4                         # Batch size (must be 1 for ONNX-format models)
# Quantization-aware distillation training configuration
Distillation:
  alpha: 1.0                                # Distillation loss weight
  loss: soft_label                          # Distillation loss algorithm
QuantAware:
  onnx_format: true                         # Whether to use the standard ONNX quantization format; must be true to deploy on FastDeploy
  use_pact: true                            # Whether to use the PACT method for training
  activation_quantize_type: 'moving_average_abs_max'  # Activation quantization method
  quantize_op_types:                        # OPs that need to be quantized
  - conv2d
  - depthwise_conv2d
# Post-training quantization configuration
PTQ:
  calibration_method: 'avg'                 # Activation calibration algorithm for PTQ; options: avg, abs_max, hist, KL, mse, emd
  skip_tensor_list: None                    # Conv layers to skip during quantization
# Training config
TrainConfig:
  train_iter: 3000
  learning_rate: 0.00001
  optimizer_builder:
    optimizer:
      type: SGD
    weight_decay: 4.0e-05
  target_metric: 0.365
```
## More Details
FastDeploy's one-click quantization tool is powered by PaddleSlim; please refer to the [Auto Compression Hyperparameter Tutorial](https://github.com/PaddlePaddle/PaddleSlim/blob/develop/example/auto_compression/hyperparameter_tutorial.md) for more details.
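Several of the bundled TrainConfig sections use a CosineAnnealingDecay schedule (e.g. base learning rate 0.015 with T_max 8000). The standard cosine annealing formula, which such schedulers typically implement (the exact endpoint behavior of Paddle's implementation is not verified here), can be sketched as:

```python
import math

def cosine_annealing_lr(step: int, base_lr: float, t_max: int, eta_min: float = 0.0):
    """Standard cosine annealing:
    lr(step) = eta_min + (base_lr - eta_min) * (1 + cos(pi * step / t_max)) / 2
    """
    return eta_min + (base_lr - eta_min) * (1 + math.cos(math.pi * step / t_max)) / 2

# With base_lr=0.015 and t_max=8000: starts at 0.015, halves at step 4000,
# and decays to eta_min (0 here) at step 8000.
```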
@@ -1,50 +0,0 @@
Global:
  model_dir: ./EfficientNetB0_small_infer/
  format: 'paddle'
  model_filename: inference.pdmodel
  params_filename: inference.pdiparams
  qat_image_path: ./ImageNet_val_640
  ptq_image_path: ./ImageNet_val_640
  input_list: ['inputs']
  qat_preprocess: cls_image_preprocess
  ptq_preprocess: cls_image_preprocess
  qat_batch_size: 32
Distillation:
  alpha: 1.0
  loss: l2
  node:
  - softmax_0.tmp_0
QuantAware:
  use_pact: true
  activation_bits: 8
  is_full_quantize: false
  onnx_format: True
  activation_quantize_type: moving_average_abs_max
  weight_quantize_type: channel_wise_abs_max
  not_quant_pattern:
  - skip_quant
  quantize_op_types:
  - conv2d
  - depthwise_conv2d
  - matmul
  - matmul_v2
  weight_bits: 8
TrainConfig:
  epochs: 1
  eval_iter: 500
  learning_rate:
    type: CosineAnnealingDecay
    learning_rate: 0.015
    T_max: 8000
  optimizer_builder:
    optimizer:
      type: Momentum
    weight_decay: 0.00002
  origin_metric: 0.7738
PTQ:
  calibration_method: 'avg' # option: avg, abs_max, hist, KL, mse
  skip_tensor_list: None
@@ -1,50 +0,0 @@
Global:
  model_dir: ./MobileNetV1_ssld_infer/
  format: 'paddle'
  model_filename: inference.pdmodel
  params_filename: inference.pdiparams
  qat_image_path: ./ImageNet_val_640
  ptq_image_path: ./ImageNet_val_640
  input_list: ['input']
  qat_preprocess: cls_image_preprocess
  ptq_preprocess: cls_image_preprocess
  qat_batch_size: 32
Distillation:
  alpha: 1.0
  loss: l2
  node:
  - softmax_0.tmp_0
QuantAware:
  use_pact: true
  activation_bits: 8
  is_full_quantize: false
  onnx_format: True
  activation_quantize_type: moving_average_abs_max
  weight_quantize_type: channel_wise_abs_max
  not_quant_pattern:
  - skip_quant
  quantize_op_types:
  - conv2d
  - depthwise_conv2d
  weight_bits: 8
TrainConfig:
  train_iter: 5000
  learning_rate:
    type: CosineAnnealingDecay
    learning_rate: 0.015
    T_max: 8000
  optimizer_builder:
    optimizer:
      type: Momentum
    weight_decay: 0.00002
  origin_metric: 0.70898
PTQ:
  calibration_method: 'avg' # option: avg, abs_max, hist, KL, mse
  skip_tensor_list: None
@@ -1,46 +0,0 @@
Global:
  model_dir: ./MobileNetV3_large_x1_0_ssld_infer/
  format: 'paddle'
  model_filename: inference.pdmodel
  params_filename: inference.pdiparams
  qat_image_path: ./ImageNet_val_640
  ptq_image_path: ./ImageNet_val_640
  input_list: ['inputs']
  qat_preprocess: cls_image_preprocess
  ptq_preprocess: cls_image_preprocess
  qat_batch_size: 128
Distillation:
  alpha: 1.0
  loss: soft_label
QuantAware:
  use_pact: true
  activation_bits: 8
  is_full_quantize: false
  onnx_format: True
  activation_quantize_type: moving_average_abs_max
  weight_quantize_type: channel_wise_abs_max
  not_quant_pattern:
  - skip_quant
  quantize_op_types:
  - conv2d
  - depthwise_conv2d
  - matmul
  - matmul_v2
  weight_bits: 8
TrainConfig:
  epochs: 2
  eval_iter: 5000
  learning_rate: 0.001
  optimizer_builder:
    optimizer:
      type: Momentum
    weight_decay: 0.00002
  origin_metric: 0.7896
PTQ:
  calibration_method: 'avg' # option: avg, abs_max, hist, KL, mse
  skip_tensor_list: None
@@ -1,50 +0,0 @@
Global:
  model_dir: ./PPHGNet_tiny_ssld_infer/
  format: 'paddle'
  model_filename: inference.pdmodel
  params_filename: inference.pdiparams
  qat_image_path: ./ImageNet_val_640
  ptq_image_path: ./ImageNet_val_640
  input_list: ['x']
  qat_preprocess: cls_image_preprocess
  ptq_preprocess: cls_image_preprocess
  qat_batch_size: 32
Distillation:
  alpha: 1.0
  loss: l2
  node:
  - softmax_1.tmp_0
QuantAware:
  use_pact: true
  activation_bits: 8
  is_full_quantize: false
  onnx_format: True
  activation_quantize_type: moving_average_abs_max
  weight_quantize_type: channel_wise_abs_max
  not_quant_pattern:
  - skip_quant
  quantize_op_types:
  - conv2d
  - depthwise_conv2d
  - matmul
  - matmul_v2
  weight_bits: 8
TrainConfig:
  epochs: 1
  eval_iter: 500
  learning_rate:
    type: CosineAnnealingDecay
    learning_rate: 0.015
    T_max: 8000
  optimizer_builder:
    optimizer:
      type: Momentum
    weight_decay: 0.00002
  origin_metric: 0.7959
PTQ:
  calibration_method: 'avg' # option: avg, abs_max, hist, KL, mse
  skip_tensor_list: None
@@ -1,50 +0,0 @@
Global:
  model_dir: ./PPLCNetV2_base_infer/
  format: 'paddle'
  model_filename: inference.pdmodel
  params_filename: inference.pdiparams
  qat_image_path: ./ImageNet_val_640
  ptq_image_path: ./ImageNet_val_640
  input_list: ['x']
  qat_preprocess: cls_image_preprocess
  ptq_preprocess: cls_image_preprocess
  qat_batch_size: 32
Distillation:
  alpha: 1.0
  loss: l2
  node:
  - softmax_1.tmp_0
QuantAware:
  use_pact: true
  activation_bits: 8
  is_full_quantize: false
  onnx_format: True
  activation_quantize_type: moving_average_abs_max
  weight_quantize_type: channel_wise_abs_max
  not_quant_pattern:
  - skip_quant
  quantize_op_types:
  - conv2d
  - depthwise_conv2d
  - matmul
  - matmul_v2
  weight_bits: 8
TrainConfig:
  epochs: 1
  eval_iter: 500
  learning_rate:
    type: CosineAnnealingDecay
    learning_rate: 0.015
    T_max: 8000
  optimizer_builder:
    optimizer:
      type: Momentum
    weight_decay: 0.00002
  origin_metric: 0.7704
PTQ:
  calibration_method: 'avg' # option: avg, abs_max, hist, KL, mse
  skip_tensor_list: None
@@ -1,48 +0,0 @@
Global:
  model_dir: ./ResNet50_vd_infer/
  format: 'paddle'
  model_filename: inference.pdmodel
  params_filename: inference.pdiparams
  image_path: ./ImageNet_val_640
  input_list: ['input']
  qat_preprocess: cls_image_preprocess
  ptq_preprocess: cls_image_preprocess
  qat_batch_size: 32
Distillation:
  alpha: 1.0
  loss: l2
  node:
  - softmax_0.tmp_0
QuantAware:
  use_pact: true
  activation_bits: 8
  is_full_quantize: false
  onnx_format: True
  activation_quantize_type: moving_average_abs_max
  weight_quantize_type: channel_wise_abs_max
  not_quant_pattern:
  - skip_quant
  quantize_op_types:
  - conv2d
  - depthwise_conv2d
  weight_bits: 8
TrainConfig:
  train_iter: 5000
  learning_rate:
    type: CosineAnnealingDecay
    learning_rate: 0.015
    T_max: 8000
  optimizer_builder:
    optimizer:
      type: Momentum
    weight_decay: 0.00002
  origin_metric: 0.7912
PTQ:
  calibration_method: 'avg' # option: avg, abs_max, hist, KL, mse
  skip_tensor_list: None
@@ -1,39 +0,0 @@
Global:
  model_dir: ./ppyoloe_plus_crn_s_80e_coco
  format: paddle
  model_filename: model.pdmodel
  params_filename: model.pdiparams
  qat_image_path: ./COCO_train_320
  ptq_image_path: ./COCO_val_320
  input_list: ['image','scale_factor']
  qat_preprocess: ppyoloe_plus_withNMS_image_preprocess
  ptq_preprocess: ppyoloe_plus_withNMS_image_preprocess
  qat_batch_size: 4
Distillation:
  alpha: 1.0
  loss: soft_label
QuantAware:
  onnx_format: true
  use_pact: true
  activation_quantize_type: 'moving_average_abs_max'
  quantize_op_types:
  - conv2d
  - depthwise_conv2d
PTQ:
  calibration_method: 'avg' # option: avg, abs_max, hist, KL, mse
  skip_tensor_list: None
TrainConfig:
  train_iter: 5000
  learning_rate:
    type: CosineAnnealingDecay
    learning_rate: 0.00003
    T_max: 6000
  optimizer_builder:
    optimizer:
      type: SGD
    weight_decay: 4.0e-05
@@ -1,39 +0,0 @@
Global:
model_dir: ./ppyoloe_crn_s_300e_coco
format: paddle
model_filename: model.pdmodel
params_filename: model.pdiparams
qat_image_path: ./COCO_train_320
ptq_image_path: ./COCO_val_320
input_list: ['image','scale_factor']
qat_preprocess: ppyoloe_withNMS_image_preprocess
ptq_preprocess: ppyoloe_withNMS_image_preprocess
qat_batch_size: 4
Distillation:
alpha: 1.0
loss: soft_label
QuantAware:
onnx_format: true
use_pact: true
activation_quantize_type: 'moving_average_abs_max'
quantize_op_types:
- conv2d
- depthwise_conv2d
PTQ:
calibration_method: 'avg' # option: avg, abs_max, hist, KL, mse
skip_tensor_list: None
TrainConfig:
train_iter: 5000
learning_rate:
type: CosineAnnealingDecay
learning_rate: 0.00003
T_max: 6000
optimizer_builder:
optimizer:
type: SGD
weight_decay: 4.0e-05
@@ -1,37 +0,0 @@
Global:
model_dir: ./yolov5s.onnx
format: 'onnx'
model_filename: model.pdmodel
params_filename: model.pdiparams
qat_image_path: ./COCO_train_320
ptq_image_path: ./COCO_val_320
input_list: ['x2paddle_images']
qat_preprocess: yolo_image_preprocess
ptq_preprocess: yolo_image_preprocess
qat_batch_size: 1
Distillation:
alpha: 1.0
loss: soft_label
QuantAware:
onnx_format: true
use_pact: true
activation_quantize_type: 'moving_average_abs_max'
quantize_op_types:
- conv2d
- depthwise_conv2d
PTQ:
calibration_method: 'avg' # option: avg, abs_max, hist, KL, mse
skip_tensor_list: None
TrainConfig:
train_iter: 3000
learning_rate: 0.00001
optimizer_builder:
optimizer:
type: SGD
weight_decay: 4.0e-05
target_metric: 0.365
@@ -1,39 +0,0 @@
Global:
model_dir: ./yolov6s.onnx
format: 'onnx'
model_filename: model.pdmodel
params_filename: model.pdiparams
qat_image_path: ./COCO_train_320
ptq_image_path: ./COCO_val_320
input_list: ['x2paddle_image_arrays']
qat_preprocess: yolo_image_preprocess
ptq_preprocess: yolo_image_preprocess
qat_batch_size: 1
Distillation:
alpha: 1.0
loss: soft_label
QuantAware:
onnx_format: true
activation_quantize_type: 'moving_average_abs_max'
quantize_op_types:
- conv2d
- depthwise_conv2d
- conv2d_transpose
PTQ:
calibration_method: 'avg' # option: avg, abs_max, hist, KL, mse
skip_tensor_list: ['conv2d_2.w_0', 'conv2d_15.w_0', 'conv2d_46.w_0', 'conv2d_11.w_0', 'conv2d_49.w_0']
TrainConfig:
train_iter: 8000
learning_rate:
type: CosineAnnealingDecay
learning_rate: 0.00003
T_max: 8000
optimizer_builder:
optimizer:
type: SGD
weight_decay: 0.00004
@@ -1,37 +0,0 @@
Global:
model_dir: ./yolov7.onnx
format: 'onnx'
model_filename: model.pdmodel
params_filename: model.pdiparams
qat_image_path: ./COCO_train_320
ptq_image_path: ./COCO_val_320
input_list: ['x2paddle_images']
qat_preprocess: yolo_image_preprocess
ptq_preprocess: yolo_image_preprocess
qat_batch_size: 1
Distillation:
alpha: 1.0
loss: soft_label
QuantAware:
onnx_format: true
activation_quantize_type: 'moving_average_abs_max'
quantize_op_types:
- conv2d
- depthwise_conv2d
PTQ:
calibration_method: 'avg' # option: avg, abs_max, hist, KL, mse
skip_tensor_list: None
TrainConfig:
train_iter: 3000
learning_rate:
type: CosineAnnealingDecay
learning_rate: 0.00003
T_max: 8000
optimizer_builder:
optimizer:
type: SGD
weight_decay: 0.00004
@@ -1,37 +0,0 @@
Global:
model_dir: ./Deeplabv3_ResNet101_OS8_cityscapes_without_argmax_infer/
format: paddle
model_filename: model.pdmodel
params_filename: model.pdiparams
qat_image_path: ./train_stuttgart
ptq_image_path: ./train_stuttgart
input_list: ['x']
qat_preprocess: ppseg_cityscapes_qat_preprocess
ptq_preprocess: ppseg_cityscapes_ptq_preprocess
qat_batch_size: 2
Distillation:
alpha: 1.0
loss: l2
node:
- conv2d_225.tmp_0
QuantAware:
onnx_format: True
quantize_op_types:
- conv2d
- depthwise_conv2d
TrainConfig:
epochs: 1
eval_iter: 360
learning_rate: 0.0001
optimizer_builder:
optimizer:
type: SGD
weight_decay: 0.0005
PTQ:
calibration_method: 'avg' # option: avg, abs_max, hist, KL, mse
skip_tensor_list: None
@@ -1,36 +0,0 @@
Global:
model_dir: ./FCN_HRNet_W18_cityscapes_without_argmax_infer
format: paddle
model_filename: model.pdmodel
params_filename: model.pdiparams
qat_image_path: ./train_stuttgart
ptq_image_path: ./train_stuttgart
input_list: ['x']
qat_preprocess: ppseg_cityscapes_qat_preprocess
ptq_preprocess: ppseg_cityscapes_ptq_preprocess
qat_batch_size: 2
Distillation:
alpha: 1.0
loss: l2
node:
- conv2d_613.tmp_1
QuantAware:
onnx_format: True
quantize_op_types:
- conv2d
- depthwise_conv2d
TrainConfig:
epochs: 20
eval_iter: 360
learning_rate: 0.0001
optimizer_builder:
optimizer:
type: SGD
weight_decay: 4.0e-05
PTQ:
calibration_method: 'avg' # option: avg, abs_max, hist, KL, mse
skip_tensor_list: None
@@ -1,37 +0,0 @@
Global:
model_dir: ./PP_LiteSeg_T_STDC1_cityscapes_without_argmax_infer
format: paddle
model_filename: model.pdmodel
params_filename: model.pdiparams
qat_image_path: ./train_stuttgart
ptq_image_path: ./val_munster
input_list: ['x']
qat_preprocess: ppseg_cityscapes_qat_preprocess
ptq_preprocess: ppseg_cityscapes_ptq_preprocess
qat_batch_size: 16
Distillation:
alpha: 1.0
loss: l2
node:
- conv2d_94.tmp_0
QuantAware:
onnx_format: True
quantize_op_types:
- conv2d
- depthwise_conv2d
PTQ:
calibration_method: 'avg' # option: avg, abs_max, hist, KL, mse
skip_tensor_list: None
TrainConfig:
epochs: 10
eval_iter: 180
learning_rate: 0.0005
optimizer_builder:
optimizer:
type: SGD
weight_decay: 4.0e-05
@@ -1,38 +0,0 @@
Global:
model_dir: ./Unet_cityscapes_without_argmax_infer/
format: paddle
model_filename: model.pdmodel
params_filename: model.pdiparams
qat_image_path: ./train_stuttgart
ptq_image_path: ./train_stuttgart
input_list: ['x']
qat_preprocess: ppseg_cityscapes_qat_preprocess
ptq_preprocess: ppseg_cityscapes_ptq_preprocess
qat_batch_size: 2
Distillation:
alpha: 1.0
loss: l2
node:
- conv2d_37.tmp_1
QuantAware:
onnx_format: True
quantize_op_types:
- conv2d
- depthwise_conv2d
TrainConfig:
epochs: 10
eval_iter: 180
learning_rate: 0.0005
optimizer_builder:
optimizer:
type: SGD
weight_decay: 4.0e-05
PTQ:
calibration_method: 'avg' # option: avg, abs_max, hist, KL, mse
skip_tensor_list: None
@@ -1,388 +0,0 @@
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import cv2
import os
import numpy as np
import random
from PIL import Image, ImageEnhance
import paddle
"""
Preprocess for Yolov5/v6/v7 Series
"""
def generate_scale(im, target_shape):
origin_shape = im.shape[:2]
im_size_min = np.min(origin_shape)
im_size_max = np.max(origin_shape)
target_size_min = np.min(target_shape)
target_size_max = np.max(target_shape)
im_scale = float(target_size_min) / float(im_size_min)
if np.round(im_scale * im_size_max) > target_size_max:
im_scale = float(target_size_max) / float(im_size_max)
im_scale_x = im_scale
im_scale_y = im_scale
return im_scale_y, im_scale_x
def yolo_image_preprocess(img, target_shape=[640, 640]):
# Resize image
im_scale_y, im_scale_x = generate_scale(img, target_shape)
img = cv2.resize(
img,
None,
None,
fx=im_scale_x,
fy=im_scale_y,
interpolation=cv2.INTER_LINEAR)
# Pad
im_h, im_w = img.shape[:2]
h, w = target_shape[:]
if h != im_h or w != im_w:
canvas = np.ones((h, w, 3), dtype=np.float32)
canvas *= np.array([114.0, 114.0, 114.0], dtype=np.float32)
canvas[0:im_h, 0:im_w, :] = img.astype(np.float32)
img = canvas
img = np.transpose(img / 255, [2, 0, 1])
return img.astype(np.float32)
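A quick self-contained check of the scale rule implemented by `generate_scale` above (the helper name `letterbox_scale` is ours, introduced only for this sketch):

```python
def letterbox_scale(origin_shape, target_shape):
    # Mirrors generate_scale above: scale by the short side,
    # then cap the scale so the long side never exceeds the target.
    im_min, im_max = min(origin_shape), max(origin_shape)
    t_min, t_max = min(target_shape), max(target_shape)
    scale = t_min / im_min
    if round(scale * im_max) > t_max:
        scale = t_max / im_max
    return scale

# A 480x720 image into a 640x640 canvas: the long side governs,
# landing exactly on 640 while the short side is padded afterwards.
s = letterbox_scale((480, 720), (640, 640))
print(round(480 * s), round(720 * s))  # → 427 640
```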
"""
Preprocess for PaddleClas model
"""
def cls_resize_short(img, target_size):
img_h, img_w = img.shape[:2]
percent = float(target_size) / min(img_w, img_h)
w = int(round(img_w * percent))
h = int(round(img_h * percent))
return cv2.resize(img, (w, h), interpolation=cv2.INTER_LINEAR)
def crop_image(img, target_size, center):
height, width = img.shape[:2]
size = target_size
    if center:
w_start = (width - size) // 2
h_start = (height - size) // 2
else:
w_start = np.random.randint(0, width - size + 1)
h_start = np.random.randint(0, height - size + 1)
w_end = w_start + size
h_end = h_start + size
return img[h_start:h_end, w_start:w_end, :]
def cls_image_preprocess(img):
# resize
img = cls_resize_short(img, target_size=256)
# crop
img = crop_image(img, target_size=224, center=True)
#ToCHWImage & Normalize
img = np.transpose(img / 255, [2, 0, 1])
img_mean = np.array([0.485, 0.456, 0.406]).reshape((3, 1, 1))
img_std = np.array([0.229, 0.224, 0.225]).reshape((3, 1, 1))
img -= img_mean
img /= img_std
return img.astype(np.float32)
"""
Preprocess for PPYOLOE
"""
def ppdet_resize_no_keepratio(img, target_shape=[640, 640]):
im_shape = img.shape
resize_h, resize_w = target_shape
im_scale_y = resize_h / im_shape[0]
im_scale_x = resize_w / im_shape[1]
scale_factor = np.asarray([im_scale_y, im_scale_x], dtype=np.float32)
return cv2.resize(
img, None, None, fx=im_scale_x, fy=im_scale_y,
interpolation=2), scale_factor
def ppyoloe_withNMS_image_preprocess(img):
img, scale_factor = ppdet_resize_no_keepratio(img, target_shape=[640, 640])
img = np.transpose(img / 255, [2, 0, 1])
img_mean = np.array([0.485, 0.456, 0.406]).reshape((3, 1, 1))
img_std = np.array([0.229, 0.224, 0.225]).reshape((3, 1, 1))
img -= img_mean
img /= img_std
return img.astype(np.float32), scale_factor
def ppyoloe_plus_withNMS_image_preprocess(img):
img, scale_factor = ppdet_resize_no_keepratio(img, target_shape=[640, 640])
img = np.transpose(img / 255, [2, 0, 1])
return img.astype(np.float32), scale_factor
"""
Preprocess for PP_LiteSeg
"""
def ppseg_cityscapes_ptq_preprocess(img):
#ToCHWImage & Normalize
img = np.transpose(img / 255.0, [2, 0, 1])
img_mean = np.array([0.5, 0.5, 0.5]).reshape((3, 1, 1))
img_std = np.array([0.5, 0.5, 0.5]).reshape((3, 1, 1))
img -= img_mean
img /= img_std
return img.astype(np.float32)
def ResizeStepScaling(img,
min_scale_factor=0.75,
max_scale_factor=1.25,
scale_step_size=0.25):
    # referred from PaddleSeg
if min_scale_factor == max_scale_factor:
scale_factor = min_scale_factor
elif scale_step_size == 0:
scale_factor = np.random.uniform(min_scale_factor, max_scale_factor)
else:
num_steps = int((max_scale_factor - min_scale_factor) / scale_step_size
+ 1)
scale_factors = np.linspace(min_scale_factor, max_scale_factor,
num_steps).tolist()
np.random.shuffle(scale_factors)
scale_factor = scale_factors[0]
w = int(round(scale_factor * img.shape[1]))
h = int(round(scale_factor * img.shape[0]))
img = cv2.resize(img, (w, h), interpolation=cv2.INTER_LINEAR)
return img
def RandomPaddingCrop(img,
crop_size=(512, 512),
im_padding_value=(127.5, 127.5, 127.5),
label_padding_value=255):
    # Check the int case before raising, otherwise the int branch is unreachable.
    if isinstance(crop_size, (list, tuple)):
        if len(crop_size) != 2:
            raise ValueError(
                'Type of `crop_size` is list or tuple. It should include 2 elements, but it is {}'
                .format(crop_size))
        crop_width = crop_size[0]
        crop_height = crop_size[1]
    elif isinstance(crop_size, int):
        crop_width = crop_size
        crop_height = crop_size
    else:
        raise TypeError(
            "The type of `crop_size` is invalid. It should be list, tuple or int, but it is {}"
            .format(type(crop_size)))
img_height = img.shape[0]
img_width = img.shape[1]
if img_height == crop_height and img_width == crop_width:
return img
else:
pad_height = max(crop_height - img_height, 0)
pad_width = max(crop_width - img_width, 0)
if (pad_height > 0 or pad_width > 0):
img = cv2.copyMakeBorder(
img,
0,
pad_height,
0,
pad_width,
cv2.BORDER_CONSTANT,
value=im_padding_value)
img_height = img.shape[0]
img_width = img.shape[1]
if crop_height > 0 and crop_width > 0:
h_off = np.random.randint(img_height - crop_height + 1)
w_off = np.random.randint(img_width - crop_width + 1)
img = img[h_off:(crop_height + h_off), w_off:(w_off + crop_width
), :]
return img
def RandomHorizontalFlip(img, prob=0.5):
if random.random() < prob:
if len(img.shape) == 3:
img = img[:, ::-1, :]
elif len(img.shape) == 2:
img = img[:, ::-1]
return img
else:
return img
def brightness(im, brightness_lower, brightness_upper):
brightness_delta = np.random.uniform(brightness_lower, brightness_upper)
im = ImageEnhance.Brightness(im).enhance(brightness_delta)
return im
def contrast(im, contrast_lower, contrast_upper):
contrast_delta = np.random.uniform(contrast_lower, contrast_upper)
im = ImageEnhance.Contrast(im).enhance(contrast_delta)
return im
def saturation(im, saturation_lower, saturation_upper):
saturation_delta = np.random.uniform(saturation_lower, saturation_upper)
im = ImageEnhance.Color(im).enhance(saturation_delta)
return im
def hue(im, hue_lower, hue_upper):
hue_delta = np.random.uniform(hue_lower, hue_upper)
im = np.array(im.convert('HSV'))
im[:, :, 0] = im[:, :, 0] + hue_delta
im = Image.fromarray(im, mode='HSV').convert('RGB')
return im
def sharpness(im, sharpness_lower, sharpness_upper):
sharpness_delta = np.random.uniform(sharpness_lower, sharpness_upper)
im = ImageEnhance.Sharpness(im).enhance(sharpness_delta)
return im
def RandomDistort(img,
brightness_range=0.5,
brightness_prob=0.5,
contrast_range=0.5,
contrast_prob=0.5,
saturation_range=0.5,
saturation_prob=0.5,
hue_range=18,
hue_prob=0.5,
sharpness_range=0.5,
sharpness_prob=0):
brightness_lower = 1 - brightness_range
brightness_upper = 1 + brightness_range
contrast_lower = 1 - contrast_range
contrast_upper = 1 + contrast_range
saturation_lower = 1 - saturation_range
saturation_upper = 1 + saturation_range
hue_lower = -hue_range
hue_upper = hue_range
sharpness_lower = 1 - sharpness_range
sharpness_upper = 1 + sharpness_range
ops = [brightness, contrast, saturation, hue, sharpness]
random.shuffle(ops)
params_dict = {
'brightness': {
'brightness_lower': brightness_lower,
'brightness_upper': brightness_upper
},
'contrast': {
'contrast_lower': contrast_lower,
'contrast_upper': contrast_upper
},
'saturation': {
'saturation_lower': saturation_lower,
'saturation_upper': saturation_upper
},
'hue': {
'hue_lower': hue_lower,
'hue_upper': hue_upper
},
'sharpness': {
'sharpness_lower': sharpness_lower,
'sharpness_upper': sharpness_upper,
}
}
prob_dict = {
'brightness': brightness_prob,
'contrast': contrast_prob,
'saturation': saturation_prob,
'hue': hue_prob,
'sharpness': sharpness_prob
}
img = img.astype('uint8')
img = Image.fromarray(img)
    for op in ops:
        params = params_dict[op.__name__]
        prob = prob_dict[op.__name__]
        params['im'] = img
        if np.random.uniform(0, 1) < prob:
            img = op(**params)
img = np.asarray(img).astype('float32')
return img
def ppseg_cityscapes_qat_preprocess(img):
min_scale_factor = 0.5
max_scale_factor = 2.0
scale_step_size = 0.25
crop_size = (1024, 512)
brightness_range = 0.5
contrast_range = 0.5
saturation_range = 0.5
img = ResizeStepScaling(
img, min_scale_factor=0.5, max_scale_factor=2.0, scale_step_size=0.25)
img = RandomPaddingCrop(img, crop_size=(1024, 512))
img = RandomHorizontalFlip(img)
img = RandomDistort(
img, brightness_range=0.5, contrast_range=0.5, saturation_range=0.5)
img = np.transpose(img / 255.0, [2, 0, 1])
img_mean = np.array([0.5, 0.5, 0.5]).reshape((3, 1, 1))
img_std = np.array([0.5, 0.5, 0.5]).reshape((3, 1, 1))
img -= img_mean
img /= img_std
return img.astype(np.float32)
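The normalization used by `ppseg_cityscapes_ptq_preprocess` above (mean = std = 0.5 after scaling to [0, 1]) maps pixel values into [-1, 1]; a self-contained check (the helper name `ptq_normalize` is ours):

```python
import numpy as np

def ptq_normalize(img):
    # Same math as ppseg_cityscapes_ptq_preprocess above: scale to [0, 1],
    # normalize with mean=std=0.5, and transpose HWC -> CHW.
    img = np.transpose(img / 255.0, [2, 0, 1])
    img = (img - 0.5) / 0.5
    return img.astype(np.float32)

patch = np.full((4, 4, 3), 255, dtype=np.uint8)  # an all-white patch
out = ptq_normalize(patch)
print(out.shape, float(out.min()), float(out.max()))  # → (3, 4, 4) 1.0 1.0
```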
@@ -1,189 +0,0 @@
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import sys
import numpy as np
import time
import argparse
from tqdm import tqdm
import paddle
from paddleslim.common import load_config, load_onnx_model
from paddleslim.auto_compression import AutoCompression
from paddleslim.quant import quant_post_static
from .dataset import *
def argsparser():
parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument(
'--config_path',
type=str,
default=None,
help="path of compression strategy config.",
required=True)
parser.add_argument(
'--method',
type=str,
default=None,
help="choose PTQ or QAT as quantization method",
required=True)
parser.add_argument(
'--save_dir',
type=str,
default='output',
help="directory to save compressed model.")
parser.add_argument(
'--devices',
type=str,
default='gpu',
help="which device used to compress.")
return parser
def reader_wrapper(reader, input_list):
if isinstance(input_list, list) and len(input_list) == 1:
input_name = input_list[0]
def gen():
in_dict = {}
for i, data in enumerate(reader()):
imgs = np.array(data[0])
in_dict[input_name] = imgs
yield in_dict
return gen
if isinstance(input_list, list) and len(input_list) > 1:
def gen():
for idx, data in enumerate(reader()):
in_dict = {}
for i in range(len(input_list)):
                    input_name = input_list[i]
                    feed_data = np.array(data[0][i])
                    in_dict[input_name] = feed_data
yield in_dict
return gen
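The single-input branch of `reader_wrapper` can be exercised in isolation; in the sketch below, `fake_loader` is a stand-in for the `paddle.io.DataLoader` that `auto_compress` builds later:

```python
import numpy as np

def wrap_single_input(reader, input_name):
    # The single-input branch of reader_wrapper above: turn each batch
    # into a {input_name: ndarray} feed dict for the compressor.
    def gen():
        for data in reader():
            yield {input_name: np.array(data[0])}
    return gen

def fake_loader():  # stand-in for a paddle.io.DataLoader
    for _ in range(2):
        yield [np.zeros((1, 3, 224, 224), dtype=np.float32)]

feeds = list(wrap_single_input(fake_loader, 'inputs')())
print(len(feeds), feeds[0]['inputs'].shape)  # → 2 (1, 3, 224, 224)
```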
def auto_compress(FLAGS):
#FLAGS needs parse
time_s = time.time()
paddle.enable_static()
assert FLAGS.devices in ['cpu', 'gpu', 'xpu', 'npu']
paddle.set_device(FLAGS.devices)
global global_config
if FLAGS.method == 'QAT':
all_config = load_config(FLAGS.config_path)
assert "Global" in all_config, f"Key 'Global' not found in config file. \n{all_config}"
global_config = all_config["Global"]
input_list = global_config['input_list']
        assert os.path.exists(global_config[
            'qat_image_path']), "qat_image_path does not exist!"
paddle.vision.image.set_image_backend('cv2')
# transform could be customized.
train_dataset = paddle.vision.datasets.ImageFolder(
global_config['qat_image_path'],
transform=eval(global_config['qat_preprocess']))
train_loader = paddle.io.DataLoader(
train_dataset,
batch_size=global_config['qat_batch_size'],
shuffle=True,
drop_last=True,
num_workers=0)
train_loader = reader_wrapper(train_loader, input_list=input_list)
eval_func = None
# ACT compression
ac = AutoCompression(
model_dir=global_config['model_dir'],
model_filename=global_config['model_filename'],
params_filename=global_config['params_filename'],
train_dataloader=train_loader,
save_dir=FLAGS.save_dir,
config=all_config,
eval_callback=eval_func)
ac.compress()
# PTQ compression
if FLAGS.method == 'PTQ':
# Read Global config and prepare dataset
all_config = load_config(FLAGS.config_path)
assert "Global" in all_config, f"Key 'Global' not found in config file. \n{all_config}"
global_config = all_config["Global"]
input_list = global_config['input_list']
        assert os.path.exists(global_config[
            'ptq_image_path']), "ptq_image_path does not exist!"
paddle.vision.image.set_image_backend('cv2')
# transform could be customized.
val_dataset = paddle.vision.datasets.ImageFolder(
global_config['ptq_image_path'],
transform=eval(global_config['ptq_preprocess']))
val_loader = paddle.io.DataLoader(
val_dataset,
batch_size=1,
shuffle=True,
drop_last=True,
num_workers=0)
val_loader = reader_wrapper(val_loader, input_list=input_list)
# Read PTQ config
assert "PTQ" in all_config, f"Key 'PTQ' not found in config file. \n{all_config}"
ptq_config = all_config["PTQ"]
        # Initialize the executor
place = paddle.CUDAPlace(
0) if FLAGS.devices == 'gpu' else paddle.CPUPlace()
exe = paddle.static.Executor(place)
# Read ONNX or PADDLE format model
if global_config['format'] == 'onnx':
load_onnx_model(global_config["model_dir"])
            # str.rstrip('.onnx') strips any trailing characters from the set
            # {'.', 'o', 'n', 'x'}, not the suffix; strip the extension explicitly.
            inference_model_path = global_config["model_dir"].strip()
            if inference_model_path.endswith('.onnx'):
                inference_model_path = inference_model_path[:-len('.onnx')]
            inference_model_path += '_infer'
else:
inference_model_path = global_config["model_dir"].rstrip('/')
quant_post_static(
executor=exe,
model_dir=inference_model_path,
quantize_model_path=FLAGS.save_dir,
data_loader=val_loader,
model_filename=global_config["model_filename"],
params_filename=global_config["params_filename"],
batch_size=32,
batch_nums=10,
algo=ptq_config['calibration_method'],
hist_percent=0.999,
is_full_quantize=False,
bias_correction=False,
onnx_format=True,
skip_tensor_list=ptq_config['skip_tensor_list']
if 'skip_tensor_list' in ptq_config else None)
time_total = time.time() - time_s
print("Finish Compression, total time used is : ", time_total, "seconds.")
@@ -1,270 +0,0 @@
import argparse
import ast
import uvicorn
def argsparser():
parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument(
'tools',
choices=['compress', 'convert', 'simple_serving', 'paddle2coreml'])
    ## arguments for auto compression
parser.add_argument(
'--config_path',
type=str,
default=None,
help="path of compression strategy config.")
parser.add_argument(
'--method',
type=str,
default=None,
help="choose PTQ or QAT as quantization method")
parser.add_argument(
'--save_dir',
type=str,
default='./output',
help="directory to save model.")
parser.add_argument(
'--devices',
type=str,
default='gpu',
help="which device used to compress.")
    ## arguments for x2paddle model conversion
parser.add_argument(
'--framework',
type=str,
default=None,
        help="define which deep learning framework (tensorflow/caffe/onnx)")
parser.add_argument(
'--model',
type=str,
default=None,
help="define model file path for tensorflow or onnx")
parser.add_argument(
"--prototxt",
"-p",
type=str,
default=None,
help="prototxt file of caffe model")
parser.add_argument(
"--weight",
"-w",
type=str,
default=None,
help="weight file of caffe model")
parser.add_argument(
"--caffe_proto",
"-c",
type=str,
default=None,
help="optional: the .py file compiled by caffe proto file of caffe model"
)
parser.add_argument(
"--input_shape_dict",
"-isd",
type=str,
default=None,
        help="define input shapes, e.g. --input_shape_dict=\"{'image':[1, 3, 608, 608]}\" or " \
             "--input_shape_dict=\"{'image':[1, 3, 608, 608], 'im_shape': [1, 2], 'scale_factor': [1, 2]}\"")
parser.add_argument(
"--enable_code_optim",
"-co",
type=ast.literal_eval,
default=False,
help="Turn on code optimization")
## arguments for simple serving
parser.add_argument(
"--app",
type=str,
default="server:app",
help="Simple serving app string")
parser.add_argument(
"--host",
type=str,
default="127.0.0.1",
help="Simple serving host IP address")
parser.add_argument(
"--port", type=int, default=8000, help="Simple serving host port")
## arguments for paddle2coreml
parser.add_argument(
"--p2c_paddle_model_dir",
type=str,
default=None,
help="define paddle model path")
parser.add_argument(
"--p2c_coreml_model_dir",
type=str,
default=None,
help="define generated coreml model path")
parser.add_argument(
"--p2c_coreml_model_name",
type=str,
default="coreml_model",
help="define generated coreml model name")
parser.add_argument(
"--p2c_input_names", type=str, default=None, help="define input names")
parser.add_argument(
"--p2c_input_dtypes",
type=str,
default="float32",
help="define input dtypes")
parser.add_argument(
"--p2c_input_shapes",
type=str,
default=None,
help="define input shapes")
parser.add_argument(
"--p2c_output_names",
type=str,
default=None,
help="define output names")
## arguments for other tools
return parser
def main():
args = argsparser().parse_args()
if args.tools == "compress":
from .auto_compression.fd_auto_compress.fd_auto_compress import auto_compress
        print("Welcome to the FastDeploy Auto Compression Toolkit!")
auto_compress(args)
if args.tools == "convert":
try:
import platform
import logging
v0, v1, v2 = platform.python_version().split('.')
            # Tuple comparison handles e.g. 3.4 vs 3.10 correctly.
            if (int(v0), int(v1)) < (3, 5):
                logging.info("[ERROR] python>=3.5 is required")
return
import paddle
v0, v1, v2 = paddle.__version__.split('.')
logging.info("paddle.__version__ = {}".format(paddle.__version__))
            if v0 == '0' and v1 == '0' and v2 == '0':
                logging.info(
                    "[WARNING] You are using the develop version of paddlepaddle")
            elif int(v0) < 2:
                logging.info("[ERROR] paddlepaddle>=2.0.0 is required")
                return
from x2paddle.convert import tf2paddle, caffe2paddle, onnx2paddle
if args.framework == "tensorflow":
assert args.model is not None, "--model should be defined while convert tensorflow model"
tf2paddle(args.model, args.save_dir)
elif args.framework == "caffe":
assert args.prototxt is not None and args.weight is not None, "--prototxt and --weight should be defined while convert caffe model"
caffe2paddle(args.prototxt, args.weight, args.save_dir,
args.caffe_proto)
elif args.framework == "onnx":
assert args.model is not None, "--model should be defined while convert onnx model"
onnx2paddle(
args.model,
args.save_dir,
input_shape_dict=args.input_shape_dict)
else:
raise Exception(
"--framework only support tensorflow/caffe/onnx now")
except ImportError:
            print(
                "Model conversion failed! Please check that x2paddle is installed.")
if args.tools == "simple_serving":
custom_logging_config = {
"version": 1,
"disable_existing_loggers": False,
"formatters": {
"default": {
"()": "uvicorn.logging.DefaultFormatter",
"fmt": "%(asctime)s %(levelprefix)s %(message)s",
'datefmt': '%Y-%m-%d %H:%M:%S',
"use_colors": None,
},
},
"handlers": {
"default": {
"formatter": "default",
"class": "logging.StreamHandler",
"stream": "ext://sys.stderr",
},
'null': {
"formatter": "default",
"class": 'logging.NullHandler'
}
},
"loggers": {
"": {
"handlers": ["null"],
"level": "DEBUG"
},
"uvicorn.error": {
"handlers": ["default"],
"level": "DEBUG"
}
},
}
uvicorn.run(args.app,
host=args.host,
port=args.port,
app_dir='.',
log_config=custom_logging_config)
if args.tools == "paddle2coreml":
if any([
args.p2c_paddle_model_dir is None,
args.p2c_coreml_model_dir is None,
args.p2c_input_names is None, args.p2c_input_shapes is None,
args.p2c_output_names is None
]):
raise Exception(
"paddle2coreml need to define --p2c_paddle_model_dir, --p2c_coreml_model_dir, --p2c_input_names, --p2c_input_shapes, --p2c_output_names"
)
import coremltools as ct
import os
import numpy as np
def type_to_np_dtype(dtype):
if dtype == 'float32':
return np.float32
elif dtype == 'float64':
return np.float64
elif dtype == 'int32':
return np.int32
elif dtype == 'int64':
return np.int64
elif dtype == 'uint8':
return np.uint8
elif dtype == 'uint16':
return np.uint16
elif dtype == 'uint32':
return np.uint32
elif dtype == 'uint64':
return np.uint64
elif dtype == 'int8':
return np.int8
elif dtype == 'int16':
return np.int16
else:
raise Exception("Unsupported dtype: {}".format(dtype))
input_names = args.p2c_input_names.split(' ')
input_shapes = [[int(i) for i in shape.split(',')]
for shape in args.p2c_input_shapes.split(' ')]
input_dtypes = map(type_to_np_dtype, args.p2c_input_dtypes.split(' '))
output_names = args.p2c_output_names.split(' ')
sample_input = [
ct.TensorType(
name=k,
shape=s,
dtype=d, )
for k, s, d in zip(input_names, input_shapes, input_dtypes)
]
coreml_model = ct.convert(
args.p2c_paddle_model_dir,
convert_to="mlprogram",
minimum_deployment_target=ct.target.macOS13,
inputs=sample_input,
outputs=[ct.TensorType(name=name) for name in output_names], )
coreml_model.save(
os.path.join(args.p2c_coreml_model_dir,
args.p2c_coreml_model_name))
if __name__ == '__main__':
main()
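The long `type_to_np_dtype` if/elif chain in the `paddle2coreml` branch can be cross-checked against numpy's own dtype parser, which accepts the same names; a sketch, not part of the tool itself:

```python
import numpy as np

def to_np_dtype(name):
    # numpy resolves the same dtype names as the if/elif chain above,
    # and raises TypeError for anything it does not recognize.
    return np.dtype(name).type

print(to_np_dtype('float32') is np.float32, to_np_dtype('int8') is np.int8)  # → True True
```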
@@ -1,15 +0,0 @@
mean:
-
- 123.675
- 116.28
- 103.53
std:
-
- 58.395
- 57.12
- 57.375
model_path: ./PP_TinyPose_256x192_infer/PP_TinyPose_256x192_infer.onnx
outputs_nodes: ['conv2d_441.tmp_1']
do_quantization: False
dataset:
output_folder: "./PP_TinyPose_256x192_infer"
@@ -1,15 +0,0 @@
mean:
-
- 127.5
- 127.5
- 127.5
std:
-
- 127.5
- 127.5
- 127.5
model_path: ./Portrait_PP_HumanSegV2_Lite_256x144_infer/Portrait_PP_HumanSegV2_Lite_256x144_infer.onnx
outputs_nodes:
do_quantization: True
dataset: "./Portrait_PP_HumanSegV2_Lite_256x144_infer/dataset.txt"
output_folder: "./Portrait_PP_HumanSegV2_Lite_256x144_infer"
@@ -1,15 +0,0 @@
model_path: ./ResNet50_vd_infer/ResNet50_vd_infer.onnx
output_folder: ./ResNet50_vd_infer
mean:
-
- 123.675
- 116.28
- 103.53
std:
-
- 58.395
- 57.12
- 57.375
outputs_nodes:
do_quantization: False
dataset: "./ResNet50_vd_infer/dataset.txt"
@@ -1,15 +0,0 @@
mean:
-
- 127.5
- 127.5
- 127.5
std:
-
- 127.5
- 127.5
- 127.5
model_path: ./ms1mv3_arcface_r18/ms1mv3_arcface_r18.onnx
outputs_nodes:
do_quantization: True
dataset: "./ms1mv3_arcface_r18/datasets.txt"
output_folder: "./ms1mv3_arcface_r18"
@@ -1,15 +0,0 @@
mean:
-
- 127.5
- 127.5
- 127.5
std:
-
- 127.5
- 127.5
- 127.5
model_path: ./ms1mv3_arcface_r18/ms1mv3_arcface_r18.onnx
outputs_nodes:
do_quantization: False
dataset: "./ms1mv3_arcface_r18/datasets.txt"
output_folder: "./ms1mv3_arcface_r18"
@@ -1,17 +0,0 @@
mean:
-
- 123.675
- 116.28
- 103.53
std:
-
- 58.395
- 57.12
- 57.375
model_path: ./picodet_s_416_coco_lcnet/picodet_s_416_coco_lcnet.onnx
outputs_nodes:
- 'p2o.Mul.179'
- 'p2o.Concat.9'
do_quantization: False
dataset:
output_folder: "./picodet_s_416_coco_lcnet"
@@ -1,15 +0,0 @@
mean:
-
- 127.5
- 127.5
- 127.5
std:
-
- 127.5
- 127.5
- 127.5
model_path: ./ch_ppocr_mobile_v2.0_cls_infer/ch_ppocr_mobile_v2.0_cls_infer.onnx
outputs_nodes:
do_quantization: False
dataset:
output_folder: "./ch_ppocr_mobile_v2.0_cls_infer"
@@ -1,15 +0,0 @@
mean:
-
- 123.675
- 116.28
- 103.53
std:
-
- 58.395
- 57.12
- 57.375
model_path: ./ch_PP-OCRv3_det_infer/ch_PP-OCRv3_det_infer.onnx
outputs_nodes:
do_quantization: False
dataset:
output_folder: "./ch_PP-OCRv3_det_infer"
@@ -1,15 +0,0 @@
mean:
-
- 127.5
- 127.5
- 127.5
std:
-
- 127.5
- 127.5
- 127.5
model_path: ./ch_PP-OCRv3_rec_infer/ch_PP-OCRv3_rec_infer.onnx
outputs_nodes:
do_quantization: False
dataset:
output_folder: "./ch_PP-OCRv3_rec_infer"
@@ -1,17 +0,0 @@
mean:
-
- 0
- 0
- 0
std:
-
- 255
- 255
- 255
model_path: ./ppyoloe_plus_crn_s_80e_coco/ppyoloe_plus_crn_s_80e_coco.onnx
outputs_nodes:
- 'p2o.Mul.224'
- 'p2o.Concat.29'
do_quantization: True
dataset: "./ppyoloe_plus_crn_s_80e_coco/dataset.txt"
output_folder: "./ppyoloe_plus_crn_s_80e_coco"
@@ -1,15 +0,0 @@
mean:
-
- 127.5
- 127.5
- 127.5
std:
-
- 127.5
- 127.5
- 127.5
model_path: ./scrfd_500m_bnkps_shape640x640.onnx
outputs_nodes:
do_quantization: True
dataset: "./dataset.txt"
output_folder: "./"
@@ -1,15 +0,0 @@
mean:
-
- 127.5
- 127.5
- 127.5
std:
-
- 127.5
- 127.5
- 127.5
model_path: ./scrfd_500m_bnkps_shape640x640.onnx
outputs_nodes:
do_quantization: False
dataset: "./dataset.txt"
output_folder: "./"
@@ -1,17 +0,0 @@
mean:
-
- 0
- 0
- 0
std:
-
- 255
- 255
- 255
model_path: ./yolov8_n_500e_coco/yolov8_n_500e_coco.onnx
outputs_nodes:
- 'p2o.Mul.119'
- 'p2o.Concat.49'
do_quantization: True
dataset: "./yolov8_n_500e_coco/dataset.txt"
output_folder: "./yolov8_n_500e_coco"
@@ -1,17 +0,0 @@
mean:
-
- 0
- 0
- 0
std:
-
- 255
- 255
- 255
model_path: ./yolov8_n_500e_coco/yolov8_n_500e_coco.onnx
outputs_nodes:
- 'p2o.Mul.1'
- 'p2o.Concat.49'
do_quantization: False
dataset: "./yolov8_n_500e_coco/dataset.txt"
output_folder: "./yolov8_n_500e_coco"
@@ -1,80 +0,0 @@
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import yaml
import argparse
from rknn.api import RKNN


def get_config():
parser = argparse.ArgumentParser()
parser.add_argument("--verbose", default=True, help="rknntoolkit verbose")
parser.add_argument("--config_path")
parser.add_argument("--target_platform")
args = parser.parse_args()
    return args


if __name__ == "__main__":
config = get_config()
with open(config.config_path) as file:
file_data = file.read()
yaml_config = yaml.safe_load(file_data)
print(yaml_config)
model = RKNN(config.verbose)
# Config
mean_values = yaml_config["mean"]
std_values = yaml_config["std"]
model.config(
mean_values=mean_values,
std_values=std_values,
target_platform=config.target_platform)
# Load ONNX model
if yaml_config["outputs_nodes"] is None:
ret = model.load_onnx(model=yaml_config["model_path"])
else:
ret = model.load_onnx(
model=yaml_config["model_path"],
outputs=yaml_config["outputs_nodes"])
assert ret == 0, "Load model failed!"
# Build model
ret = model.build(
do_quantization=yaml_config["do_quantization"],
dataset=yaml_config["dataset"])
assert ret == 0, "Build model failed!"
# Init Runtime
ret = model.init_runtime()
assert ret == 0, "Init runtime environment failed!"
# Export
if not os.path.exists(yaml_config["output_folder"]):
os.mkdir(yaml_config["output_folder"])
    model_base_name = os.path.splitext(
        os.path.basename(yaml_config["model_path"]))[0]
model_device_name = config.target_platform.lower()
if yaml_config["do_quantization"]:
model_save_name = model_base_name + "_" + model_device_name + "_quantized" + ".rknn"
else:
model_save_name = model_base_name + "_" + model_device_name + "_unquantized" + ".rknn"
ret = model.export_rknn(
os.path.join(yaml_config["output_folder"], model_save_name))
assert ret == 0, "Export rknn model failed!"
print("Export OK!")
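The export filename logic at the end of the script can be checked in isolation. A sketch of it as a standalone function (using `os.path.splitext` instead of the manual split-and-join, which behaves the same for single-extension model names):

```python
import os


def rknn_save_name(model_path, target_platform, do_quantization):
    """Derive the exported filename the way the script above does:
    <model base name>_<platform, lowercased>_<quantized|unquantized>.rknn
    """
    base = os.path.splitext(os.path.basename(model_path))[0]
    suffix = "quantized" if do_quantization else "unquantized"
    return base + "_" + target_platform.lower() + "_" + suffix + ".rknn"


print(rknn_save_name("./yolov8_n_500e_coco/yolov8_n_500e_coco.onnx",
                     "RK3588", True))
# yolov8_n_500e_coco_rk3588_quantized.rknn
```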
@@ -1,24 +0,0 @@
import setuptools

long_description = "fastdeploy-tools is a toolkit for FastDeploy, including auto compression, etc.\n\n"
long_description += "Usage of auto compression: fastdeploy compress --config_path=./yolov7_tiny_qat_dis.yaml --method='QAT' --save_dir='./v7_qat_outmodel/' \n"
install_requires = ['uvicorn==0.16.0']
setuptools.setup(
name="fastdeploy-tools", # name of package
version="0.0.5", #version of package
description="A toolkit for FastDeploy.",
long_description=long_description,
long_description_content_type="text/plain",
packages=setuptools.find_packages(),
install_requires=install_requires,
classifiers=[
"Programming Language :: Python :: 3",
"License :: OSI Approved :: Apache Software License",
"Operating System :: OS Independent",
],
license='Apache 2.0',
entry_points={
'console_scripts': ['fastdeploy = common_tools.common_tools:main', ]
})
@@ -1,17 +0,0 @@
# 1. Install basic software
apt update
apt-get install -y --no-install-recommends \
gcc g++ git make wget python unzip
# 2. Install arm gcc toolchains
apt-get install -y --no-install-recommends \
g++-arm-linux-gnueabi gcc-arm-linux-gnueabi \
g++-arm-linux-gnueabihf gcc-arm-linux-gnueabihf \
gcc-aarch64-linux-gnu g++-aarch64-linux-gnu
# 3. Install cmake 3.10 or above
wget -c https://mms-res.cdn.bcebos.com/cmake-3.10.3-Linux-x86_64.tar.gz && \
tar xzf cmake-3.10.3-Linux-x86_64.tar.gz && \
mv cmake-3.10.3-Linux-x86_64 /opt/cmake-3.10 && \
ln -s /opt/cmake-3.10/bin/cmake /usr/bin/cmake && \
ln -s /opt/cmake-3.10/bin/ccmake /usr/bin/ccmake
@@ -1,58 +0,0 @@
import paddle
import tvm
from tvm import relay
from tvm.contrib import graph_executor
import os
import argparse


def get_config():
parser = argparse.ArgumentParser()
parser.add_argument(
"--model_path", default="./picodet_l_320_coco_lcnet/model")
parser.add_argument(
"--shape_dict",
default={"image": [1, 3, 320, 320],
"scale_factor": [1, 2]})
parser.add_argument("--tvm_save_name", default="tvm_model")
parser.add_argument("--tvm_save_path", default="./tvm_save")
args = parser.parse_args()
    return args


def read_model(model_path):
    return paddle.jit.load(model_path)


def paddle_to_tvm(paddle_model,
shape_dict,
tvm_save_name="tvm_model",
tvm_save_path="./tvm_save"):
if isinstance(shape_dict, str):
shape_dict = eval(shape_dict)
mod, params = relay.frontend.from_paddle(paddle_model, shape_dict)
    # Test on the PC's CPU first, so use LLVM as the build target
target = tvm.target.Target("llvm", host="llvm")
dev = tvm.cpu(0)
    # Build the optimized module artifacts with TVM
with tvm.transform.PassContext(opt_level=2):
base_lib = relay.build_module.build(mod, target, params=params)
if not os.path.exists(tvm_save_path):
os.mkdir(tvm_save_path)
lib_save_path = os.path.join(tvm_save_path, tvm_save_name + ".so")
base_lib.export_library(lib_save_path)
param_save_path = os.path.join(tvm_save_path,
tvm_save_name + ".params")
with open(param_save_path, 'wb') as fo:
fo.write(relay.save_param_dict(base_lib.get_params()))
module = graph_executor.GraphModule(base_lib['default'](dev))
module.load_params(relay.save_param_dict(base_lib.get_params()))
    print("export success")


if __name__ == "__main__":
config = get_config()
paddle_model = read_model(config.model_path)
shape_dict = config.shape_dict
paddle_to_tvm(paddle_model, shape_dict, config.tvm_save_name,
config.tvm_save_path)
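Because argparse passes `--shape_dict` through as a string while the Python default is already a dict, the script `eval()`s the string form before handing it to `relay.frontend.from_paddle`. A sketch of that normalization using `ast.literal_eval` — a safer stand-in for `eval` assumed here, not what the script itself uses:

```python
import ast


def normalize_shape_dict(shape_dict):
    """Accept either a dict (the argparse default) or its string form,
    which is what arrives when --shape_dict is given on the command line."""
    if isinstance(shape_dict, str):
        # literal_eval only accepts Python literals, unlike eval()
        shape_dict = ast.literal_eval(shape_dict)
    return shape_dict


cli_form = "{'image': [1, 3, 320, 320], 'scale_factor': [1, 2]}"
print(normalize_shape_dict(cli_form)["image"])  # [1, 3, 320, 320]
```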