FastDeploy/fastdeploy/backends/tensorrt/utils.h
Jason 43cabceb6d Update release/0.4 (#423)
* [Backend] TRT backend & PP-Infer backend support pinned memory (#403)

* TRT backend use pinned memory

* refine fd tensor pinned memory logic

* TRT enable pinned memory configurable

* paddle inference support pinned memory

* pinned memory pybindings

Co-authored-by: Jason <jiangjiajun@baidu.com>

2022-10-24 16:48:48 +08:00

282 lines
7.4 KiB
C++

// Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserved.
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#pragma once
#include <cuda_runtime_api.h>
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <map>
#include <memory>
#include <new>
#include <numeric>
#include <string>
#include <vector>
#include "NvInfer.h"
#include "fastdeploy/core/allocate.h"
#include "fastdeploy/core/fd_tensor.h"
#include "fastdeploy/utils/utils.h"
namespace fastdeploy {
struct FDInferDeleter {
  template <typename T>
  void operator()(T* obj) const {
    if (obj) {
      delete obj;
      // obj->destroy();
    }
  }
};

template <typename T>
using FDUniquePtr = std::unique_ptr<T, FDInferDeleter>;
int64_t Volume(const nvinfer1::Dims& d);
nvinfer1::Dims ToDims(const std::vector<int>& vec);
nvinfer1::Dims ToDims(const std::vector<int64_t>& vec);
size_t TrtDataTypeSize(const nvinfer1::DataType& dtype);
FDDataType GetFDDataType(const nvinfer1::DataType& dtype);
nvinfer1::DataType ReaderDtypeToTrtDtype(int reader_dtype);
std::vector<int> ToVec(const nvinfer1::Dims& dim);
template <typename T>
std::ostream& operator<<(std::ostream& out, const std::vector<T>& vec) {
  out << "[";
  for (size_t i = 0; i < vec.size(); ++i) {
    if (i != 0) {
      out << ", ";
    }
    out << vec[i];
  }
  // Close the bracket after the loop so an empty vector prints "[]".
  out << "]";
  return out;
}
template <typename AllocFunc, typename FreeFunc>
class FDGenericBuffer {
 public:
  //!
  //! \brief Construct an empty buffer.
  //!
  explicit FDGenericBuffer(nvinfer1::DataType type = nvinfer1::DataType::kFLOAT)
      : mSize(0),
        mCapacity(0),
        mType(type),
        mBuffer(nullptr),
        mExternal_buffer(nullptr) {}
  //!
  //! \brief Construct a buffer that holds `size` elements of the given type.
  //!
  FDGenericBuffer(size_t size, nvinfer1::DataType type)
      : mSize(size),
        mCapacity(size),
        mType(type),
        mBuffer(nullptr),
        mExternal_buffer(nullptr) {
    if (!allocFn(&mBuffer, this->nbBytes())) {
      throw std::bad_alloc();
    }
  }
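  //!
  //! \brief Wrap a caller-owned buffer so the memory copy step can be skipped.
  //! The wrapped memory is borrowed and is never freed by this class.
  //!
  FDGenericBuffer(size_t size, nvinfer1::DataType type, void* buffer)
      : mSize(size),
        mCapacity(0),  // Capacity tracks the owned allocation only.
        mType(type),
        mBuffer(nullptr),
        mExternal_buffer(buffer) {}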
  FDGenericBuffer(FDGenericBuffer&& buf)
      : mSize(buf.mSize),
        mCapacity(buf.mCapacity),
        mType(buf.mType),
        mBuffer(buf.mBuffer),
        mExternal_buffer(buf.mExternal_buffer) {
    buf.mSize = 0;
    buf.mCapacity = 0;
    buf.mType = nvinfer1::DataType::kFLOAT;
    buf.mBuffer = nullptr;
    buf.mExternal_buffer = nullptr;
  }
  FDGenericBuffer& operator=(FDGenericBuffer&& buf) {
    if (this != &buf) {
      freeFn(mBuffer);
      mSize = buf.mSize;
      mCapacity = buf.mCapacity;
      mType = buf.mType;
      mBuffer = buf.mBuffer;
      mExternal_buffer = buf.mExternal_buffer;
      // Reset buf.
      buf.mSize = 0;
      buf.mCapacity = 0;
      buf.mBuffer = nullptr;
      buf.mExternal_buffer = nullptr;
    }
    return *this;
  }
  //!
  //! \brief Returns pointer to underlying array.
  //!
  void* data() {
    if (mExternal_buffer != nullptr) return mExternal_buffer;
    return mBuffer;
  }
  //!
  //! \brief Returns pointer to underlying array.
  //!
  const void* data() const {
    if (mExternal_buffer != nullptr) return mExternal_buffer;
    return mBuffer;
  }
  //!
  //! \brief Returns the size (in number of elements) of the buffer.
  //!
  size_t size() const { return mSize; }
  //!
  //! \brief Returns the size (in bytes) of the buffer.
  //!
  size_t nbBytes() const { return this->size() * TrtDataTypeSize(mType); }
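  //!
  //! \brief Point the buffer at caller-owned memory holding `size` elements of
  //! `type`. The memory is borrowed, not copied, and is never freed here.
  //!
  void SetExternalData(size_t size, nvinfer1::DataType type, void* buffer) {
    // mCapacity is left untouched: it describes the owned allocation only,
    // so a later resize() still allocates when the owned buffer is too small.
    mSize = size;
    mType = type;
    mExternal_buffer = buffer;
  }
  //!
  //! \brief Overload of SetExternalData that takes the element count from Dims
  //! and keeps the current data type.
  //!
  void SetExternalData(const nvinfer1::Dims& dims, const void* buffer) {
    mSize = Volume(dims);
    mExternal_buffer = const_cast<void*>(buffer);
  }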
  //!
  //! \brief Resizes the buffer to `newSize` elements and drops any external
  //! buffer. No reallocation happens if the new size is smaller than or equal
  //! to the current capacity.
  //!
  void resize(size_t newSize) {
    mExternal_buffer = nullptr;
    mSize = newSize;
    if (mCapacity < newSize) {
      freeFn(mBuffer);
      // Clear the stale pointer so a failed allocation cannot lead to a
      // double free in the destructor.
      mBuffer = nullptr;
      mCapacity = 0;
      if (!allocFn(&mBuffer, this->nbBytes())) {
        mBuffer = nullptr;
        throw std::bad_alloc{};
      }
      mCapacity = newSize;
    }
  }
  //!
  //! \brief Overload of resize that accepts Dims
  //!
  void resize(const nvinfer1::Dims& dims) { return this->resize(Volume(dims)); }
  ~FDGenericBuffer() {
    mExternal_buffer = nullptr;
    freeFn(mBuffer);
  }
 private:
  size_t mSize{0}, mCapacity{0};
  nvinfer1::DataType mType;
  // Default to nullptr so every constructor leaves both pointers well defined.
  void* mBuffer{nullptr};
  void* mExternal_buffer{nullptr};
  AllocFunc allocFn;
  FreeFunc freeFn;
};

using FDDeviceBuffer = FDGenericBuffer<FDDeviceAllocator, FDDeviceFree>;
using FDDeviceHostBuffer =
    FDGenericBuffer<FDDeviceHostAllocator, FDDeviceHostFree>;
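
// Usage sketch for the buffer aliases above. This is illustrative only and is
// not part of the original header. It uses only names declared in this file
// and assumes a CUDA device is available. `user_ptr` is a hypothetical device
// pointer owned by the caller.
//
//   // Owned storage: resize() allocates through the AllocFunc.
//   FDDeviceBuffer buf(nvinfer1::DataType::kFLOAT);
//   buf.resize(ToDims(std::vector<int>{1, 3, 224, 224}));
//   void* dev_ptr = buf.data();    // pointer to the owned allocation
//   size_t bytes = buf.nbBytes();  // size() * TrtDataTypeSize(kFLOAT)
//
//   // Borrowed storage: data() returns user_ptr until the next resize().
//   // The buffer does not free user_ptr.
//   buf.SetExternalData(ToDims(std::vector<int>{1, 3, 224, 224}), user_ptr);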
class FDTrtLogger : public nvinfer1::ILogger {
 public:
  static FDTrtLogger* logger;
  static FDTrtLogger* Get() {
    if (logger != nullptr) {
      return logger;
    }
    logger = new FDTrtLogger();
    return logger;
  }
  void log(nvinfer1::ILogger::Severity severity,
           const char* msg) noexcept override {
    if (severity == nvinfer1::ILogger::Severity::kINFO) {
      // Disable this log
      // FDINFO << msg << std::endl;
    } else if (severity == nvinfer1::ILogger::Severity::kWARNING) {
      // Disable this log
      // FDWARNING << msg << std::endl;
    } else if (severity == nvinfer1::ILogger::Severity::kERROR) {
      FDERROR << msg << std::endl;
    } else if (severity == nvinfer1::ILogger::Severity::kINTERNAL_ERROR) {
      FDASSERT(false, "%s", msg);
    }
  }
};
struct ShapeRangeInfo {
  explicit ShapeRangeInfo(const std::vector<int64_t>& new_shape) {
    shape.assign(new_shape.begin(), new_shape.end());
    min.resize(new_shape.size());
    max.resize(new_shape.size());
    is_static.resize(new_shape.size());
    for (size_t i = 0; i < new_shape.size(); ++i) {
      if (new_shape[i] > 0) {
        // A positive extent marks a static dimension.
        min[i] = new_shape[i];
        max[i] = new_shape[i];
        is_static[i] = 1;
      } else {
        // A non-positive extent marks a dynamic dimension with no range yet.
        min[i] = -1;
        max[i] = -1;
        is_static[i] = 0;
      }
    }
  }
  std::string name;
  std::vector<int64_t> shape;
  std::vector<int64_t> min;
  std::vector<int64_t> max;
  std::vector<int64_t> opt;
  std::vector<int8_t> is_static;
  // Return value:
  // -1: the new shape is illegal
  //  0: the new shape can be used for inference
  //  1: the new shape is out of range, the engine needs to be updated
  int Update(const std::vector<int64_t>& new_shape);
  int Update(const std::vector<int>& new_shape) {
    std::vector<int64_t> new_shape_int64(new_shape.begin(), new_shape.end());
    return Update(new_shape_int64);
  }
  friend std::ostream& operator<<(std::ostream& out,
                                  const ShapeRangeInfo& info) {
    out << "Input name: " << info.name << ", shape=" << info.shape
        << ", min=" << info.min << ", max=" << info.max << std::endl;
    return out;
  }
};
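
// Usage sketch for ShapeRangeInfo. This is illustrative only and is not part
// of the original header. The input name "x" and the shapes are hypothetical.
// Update() is only declared in this file, so the sketch relies on the return
// convention documented next to its declaration and nothing more.
//
//   // Dimension 0 is dynamic (non-positive extent), dimensions 1..3 static.
//   ShapeRangeInfo info(std::vector<int64_t>{-1, 3, 224, 224});
//   info.name = "x";
//
//   int status = info.Update(std::vector<int>{4, 3, 224, 224});
//   if (status < 0) {
//     // The shape was rejected as illegal.
//   } else if (status == 1) {
//     // The shape fell outside the recorded range; rebuild the engine.
//   }
//   std::cout << info;  // prints name, shape, min and max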
} // namespace fastdeploy