Self-created tools to convert ONNX files (NCHW) to TensorFlow/TFLite/Keras format (NHWC). The purpose of this tool is to solve the massive Transpose extrapolation problem in onnx-tensorflow (onnx-tf). I don't need a Star, but give me a pull request. Since I am adding challenging model optimizations and fixing bugs almost daily, I frequently embed potential bugs that slip through CI's regression testing. Therefore, if you encounter new problems, I recommend that you try a package that is a few versions older, or try the latest package that will be released in a few days.
https://github.com/PINTO0309/onnx2tf/wiki/model_status
Video playback speed is approximately 50x slower than actual speed.
(Required only when using the `-coion` option. An executable file named `flatc` must be available.)
# Custom flatc binary for Ubuntu 20.04+
# https://github.com/PINTO0309/onnx2tf/issues/196
wget https://github.com/PINTO0309/onnx2tf/releases/download/1.7.3/flatc.tar.gz \
&& tar zxvf flatc.tar.gz \
&& sudo chmod +x flatc \
&& sudo mv flatc /usr/bin/
# Custom flatc binary for Windows
# Set the environment variable paths appropriately on your own.
# https://github.com/PINTO0309/onnx2tf/issues/196
https://github.com/PINTO0309/onnx2tf/releases/download/1.7.3/flatc.exe
$ docker run --rm -it \
-v `pwd`:/workdir \
-w /workdir \
ghcr.io/pinto0309/onnx2tf:1.8.1
or
$ pip install -U onnx \
&& pip install -U nvidia-pyindex \
&& pip install -U onnx-graphsurgeon \
&& pip install -U onnxruntime==1.13.1 \
&& pip install -U onnxsim \
&& pip install -U simple_onnx_processing_tools \
&& pip install -U onnx2tf \
&& pip install -U h5py==3.7.0
or
$ pip install -e .
or
!sudo add-apt-repository -y ppa:deadsnakes/ppa
!sudo apt-get -y update
!sudo apt-get -y install python3.9
!sudo apt-get -y install python3.9-dev
!sudo apt-get -y install python3-pip
!sudo apt-get -y install python3.9-distutils
!wget https://github.com/PINTO0309/onnx2tf/releases/download/1.7.3/flatc.tar.gz \
&& tar zxvf flatc.tar.gz \
&& sudo chmod +x flatc \
&& sudo mv flatc /usr/bin/
!python3.9 -m pip install -U setuptools \
&& python3.9 -m pip install -U pip \
&& python3.9 -m pip install -U distlib
!sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.7 1
!sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.9 2
!python3.9 -m pip install tensorflow==2.12.0 \
&& python3.9 -m pip install -U onnx \
&& python3.9 -m pip install -U nvidia-pyindex \
&& python3.9 -m pip install -U onnx-graphsurgeon \
&& python3.9 -m pip install -U onnxruntime==1.13.1 \
&& python3.9 -m pip install -U onnxsim \
&& python3.9 -m pip install -U simple_onnx_processing_tools \
&& python3.9 -m pip install -U onnx2tf \
&& python3.9 -m pip install -U protobuf==3.20.3 \
&& python3.9 -m pip install -U h5py==3.7.0
Only the patterns considered to be used particularly frequently are described here. In addition, there are several other options, such as disabling Flex OP generation and options to improve inference performance. See: CLI Parameter
# Float32, Float16
# This is the fastest way to generate tflite,
# but the accompanying saved_model will not have a signature.
# "ValueError: Only support at least one signature key."
# If you are having trouble with this error, please use the `-osd` option.
$ wget https://github.com/PINTO0309/onnx2tf/releases/download/0.0.2/resnet18-v1-7.onnx
$ onnx2tf -i resnet18-v1-7.onnx

# saved_model with signaturedefs added.
# Output in the form of saved_model that can be used for serving
# or conversion to other frameworks.
$ wget https://github.com/PINTO0309/onnx2tf/releases/download/0.0.2/resnet18-v1-7.onnx
$ onnx2tf -i resnet18-v1-7.onnx -osd

# Keras h5 format
$ wget https://github.com/PINTO0309/onnx2tf/releases/download/0.0.2/resnet18-v1-7.onnx
$ onnx2tf -i resnet18-v1-7.onnx -oh5

# Keras keras_v3 format (TensorFlow v2.12.0 or later only)
$ wget https://github.com/PINTO0309/onnx2tf/releases/download/0.0.2/resnet18-v1-7.onnx
$ onnx2tf -i resnet18-v1-7.onnx -okv3
# INT8 Quantization, Full INT8 Quantization
# INT8 Quantization with INT16 activation, Full INT8 Quantization with INT16 activation
# Dynamic Range Quantization
$ wget https://github.com/PINTO0309/onnx2tf/releases/download/1.1.1/emotion-ferplus-8.onnx
# INT8 Quantization (per-channel)
$ onnx2tf -i emotion-ferplus-8.onnx -oiqt
# INT8 Quantization (per-tensor)
$ onnx2tf -i emotion-ferplus-8.onnx -oiqt -qt per-tensor
# Split the model at the middle position for debugging
# Specify the output name of the OP
$ onnx2tf -i resnet18-v1-7.onnx -onimc resnetv15_stage2_conv1_fwd resnetv15_stage2_conv2_fwd
# Suppress generation of Flex OP and replace with Pseudo-Function
# [Asin, Acos, Atan, Abs, PReLU, LeakyReLU, Power, GatherND, Neg, HardSwish, Erf]
# Below is a sample of replacing GELU / Erf with another set of operations.
$ wget https://s3.ap-northeast-2.wasabisys.com/temp-models/onnx2tf_readme/gelu_11.onnx
$ onnx2tf -i gelu_11.onnx -rtpo Erf
# High-dimensional Transpose decomposition
# If you do not like FlexTranspose being generated, try `-nodafc`.
# Suppresses the generation of FlexTranspose by decomposing Transpose
# to the specified number of dimensions.
# In TensorFlow v2.12.0 and later, up to 6 dimensions are converted to normal Transpose;
# in v2.11.0 and earlier, up to 5 dimensions are converted to normal Transpose.
# Note that specifying `2` for the `-nodafc` option causes all Transpose OPs to disappear
# from the model structure.
# Below is an example of decomposing a Transpose of 5 or more dimensions into a Transpose
# of 4 dimensions.
$ onnx2tf -i xxxx.onnx -nodafc 4
# Parameter replacement (Resize, Transpose, Softmax)
$ rm replace.json
$ wget https://github.com/PINTO0309/onnx2tf/releases/download/1.1.27/human_segmentation_pphumanseg_2021oct.onnx
$ wget https://github.com/PINTO0309/onnx2tf/releases/download/1.1.27/replace.json
$ onnx2tf -i human_segmentation_pphumanseg_2021oct.onnx -prf replace.json
Perform error checking of ONNX output and TensorFlow output. Verify that the error of all outputs, one operation at a time, is below a certain threshold. Automatically determines before and after which OPs the tool's automatic conversion of the model failed. Know where dimensional compression, dimensional expansion, and dimensional transposition by `Reshape` and `Transpose` are failing. Once you have identified the problem area, you can refer to the tutorial on Parameter replacement to modify the tool's behavior.

`-ois` is an option to overwrite the input OP to a static size if it has undefined dimensions. The `-cotof` option checks the accuracy of all OPs one by one. `-cotoa` is the threshold error value for determining an accuracy error. If there are undefined dimensions in the input OP, it is better to fix them to static shapes to improve the reliability of the accuracy measurement.

The `-cotof` option only compares the original ONNX and converted TensorFlow (Keras) models at Float32 precision, not at Float16 or INT8 precision.
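Conceptually, this accuracy check is an element-wise closeness test in the spirit of `np.allclose`, with `-cotoa` and `-cotor` playing the roles of the absolute and relative tolerances. A minimal sketch with hypothetical tensors (not the tool's actual implementation):

```python
import numpy as np

# Hypothetical outputs of the same OP from the ONNX model and the converted TF model.
onnx_out = np.array([0.1000, 0.2000, 0.3000], dtype=np.float32)
tf_out   = np.array([0.1004, 0.1998, 0.3003], dtype=np.float32)

# -cotoa / -cotor correspond to atol / rtol in an np.allclose-style comparison.
matches = np.allclose(onnx_out, tf_out, rtol=0.0, atol=1e-1)
print("Matches" if matches else "Unmatched")  # -> Matches
```

With `atol=1e-1` the tensors above match; tightening the tolerance to `1e-4` would flag the 4e-4 discrepancy as "Unmatched".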
$ onnx2tf -i mobilenetv2-12.onnx -ois input:1,3,224,224 -cotof -cotoa 1e-1
or
$ onnx2tf -i mobilenetv2-12.onnx -b 1 -cotof -cotoa 1e-1
If you want to match tflite's input/output OP names and the order of input/output OPs with ONNX, you can use `interpreter.get_signature_runner()` to infer after outputting the tflite file with the `-coion` / `--copy_onnx_input_output_names_to_tflite` option. See: https://github.com/PINTO0309/onnx2tf/issues/228
import torch
import onnxruntime
import numpy as np
import onnx2tf
import tensorflow as tf
from tensorflow.lite.python import interpreter as tflite_interpreter
class Model(torch.nn.Module):
    def forward(self, x, y):
        return {
            "add": x + y,
            "sub": x - y,
        }
# Let's double check what PyTorch gives us
model = Model()
pytorch_output = model.forward(10, 2)
print("[PyTorch] Model Predictions:", pytorch_output)
# First, export the above model to ONNX
torch.onnx.export(
    Model(),
    {"x": 10, "y": 2},
    "model.onnx",
    opset_version=16,
    input_names=["x", "y"],
    output_names=["add", "sub"],
)
# And check its output
session = onnxruntime.InferenceSession("model.onnx")
onnx_output = session.run(["add", "sub"], {"x": np.array(10), "y": np.array(2)})
print("[ONNX] Model Outputs:", [o.name for o in session.get_outputs()])
print("[ONNX] Model Predictions:", onnx_output)
# Now, let's convert the ONNX model to TF
onnx2tf.convert(
    input_onnx_file_path="model.onnx",
    output_folder_path="model.tf",
    copy_onnx_input_output_names_to_tflite=True,
    non_verbose=True,
)
# Now, test the newer TFLite model
interpreter = tf.lite.Interpreter(model_path="model.tf/model_float32.tflite")
tf_lite_model = interpreter.get_signature_runner()
tt_lite_output = tf_lite_model(
    x=tf.constant((10,), dtype=tf.int64),
    y=tf.constant((2,), dtype=tf.int64),
)
print("[TFLite] Model Predictions:", tt_lite_output)
[PyTorch] Model Predictions:
{
'add': 12,
'sub': 8
}
[ONNX] Model Outputs:
[
'add',
'sub'
]
[ONNX] Model Predictions:
[
array(12, dtype=int64),
array(8, dtype=int64)
]
[TFLite] Model Predictions:
{
'add': array([12]),
'sub': array([8])
}
If you want to embed label maps, quantization parameters, descriptions, etc. into your tflite file, you can refer to the official tutorial and try it yourself. For now, this tool does not plan to implement the ability to append metadata, as I do not want to write byte arrays to the tflite file that are not essential to its operation.
Adding metadata to TensorFlow Lite models
It is a matter of model structure. The activation function (`SiLU`/`Swish`), the kernel size and stride for `Pooling`, and the kernel size and stride for `Conv` should be completely revised. See: https://github.com/PINTO0309/onnx2tf/issues/244#issuecomment-1475128445 and https://github.com/PINTO0309/onnx2tf/issues/269
If you want to see the difference in quantization error between `SiLU` and `ReLU`, please check this Gist by @motokimura, who helped us in our research. Thanks, Motoki!
Gist: Quantization error simulation of SiLU (Swish) activation
The accuracy error rates after quantization for different activation functions are shown in the figure below. The graph plots the distribution of absolute error, so a position with a higher value on the horizontal axis indicates a larger error. The vertical axis is the number of samples. SiLU (Swish)
produces catastrophic errors after INT8 quantization.
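A rough intuition for this, sketched in numpy (a toy illustration under simple per-tensor affine quantization, not the methodology of the gist above): one grid of 256 levels is fitted to the activation's whole output range, and because SiLU's range is dominated by its positive side, its entire negative tail collapses into only a handful of levels:

```python
import numpy as np

x = np.linspace(-6.0, 6.0, 10001, dtype=np.float32)
silu = x / (1.0 + np.exp(-x))   # SiLU / Swish: output range approx [-0.28, 6.0]

# Affine INT8 grid fitted to the full output range, as per-tensor PTQ would do.
scale = (silu.max() - silu.min()) / 255.0
zero_point = round(-silu.min() / scale)

# zero_point equals the number of the 256 levels that fall below zero:
# nearly all of SiLU's negative tail is squeezed into just a handful of levels.
print(f"scale={scale:.5f}, levels below zero: {zero_point} of 256")
```

With the grid fitted to roughly [-0.28, 6.0], only about a dozen of the 256 levels fall below zero, so the small negative responses that distinguish SiLU from ReLU are mostly rounded away; ReLU has no negative tail to lose.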
e.g. YOLOv8 https://docs.openvino.ai/latest/notebooks/230-yolov8-optimization-with-output.html
e.g. YOLOX-Nano TexasInstruments/edgeai-yolox
| Before | After |
|:-:|:-:|
| `Swish`/`SiLU` | `ReLU` |
| `DepthwiseConv2D` | `Conv2D` |
| `MaxPool`, kernel_size=5x5,9x9,13x13 | `MaxPool`, kernel_size=3x3 |
### Float32 - YOLOX-Nano
(1, 52, 52, 85)
array([[[
[ 0.971787, 0.811184, 0.550566, ..., 5.962632, 7.403673, 6.735206],
[ 0.858804, 1.351296, 1.231673, ..., 6.479690, 8.277064, 7.664936],
[ 0.214827, 1.035119, 1.458006, ..., 6.291425, 8.229385, 7.761562],
...,
[ 0.450116, 1.391900, 1.533354, ..., 5.672194, 7.121591, 6.880231],
[ 0.593133, 2.112723, 0.968755, ..., 6.150078, 7.370633, 6.874294],
[ 0.088263, 1.985220, 0.619998, ..., 5.507928, 6.914980, 6.234259]]]]),
### INT8 - YOLOX-Nano
(1, 52, 52, 85)
array([[[
[ 0.941908, 0.770652, 0.513768, ..., 5.993958, 7.449634, 6.850238],
[ 0.856280, 1.284420, 1.198792, ..., 6.507727, 8.391542, 7.792146],
[ 0.256884, 0.941908, 1.455676, ..., 6.336471, 8.305914, 7.877774],
...,
[ 0.342512, 1.370048, 1.541304, ..., 5.737075, 7.192750, 7.107122],
[ 0.513768, 2.226327, 1.027536, ..., 6.165215, 7.449634, 7.021494],
[ 0.085628, 2.055072, 0.685024, ..., 5.480191, 7.021494, 6.422099]]]]),
Other recommended replacement OPs

| Before | After |
|:-:|:-:|
| `HardSwish` | `ReLU` |
Calibration data (.npy) for INT8 quantization (`-qcind`) is generated as follows. This is a sample where the data used for training is image data. See: https://github.com/PINTO0309/onnx2tf/issues/222
https://www.tensorflow.org/lite/performance/post_training_quantization
import cv2
import glob
import numpy as np
# Not used during data generation ################################
# You will need to do the calculations yourself using the test data
MEAN = np.asarray([[[[0.485, 0.456, 0.406]]]], dtype=np.float32) # [1,1,1,3]
STD = np.asarray([[[[0.229, 0.224, 0.225]]]], dtype=np.float32) # [1,1,1,3]
# Not used during data generation ################################
files = glob.glob("data/*.png")
img_datas = []
for idx, file in enumerate(files):
    bgr_img = cv2.imread(file)
    rgb_img = cv2.cvtColor(bgr_img, cv2.COLOR_BGR2RGB)
    resized_img = cv2.resize(rgb_img, dsize=(200,112))
    extend_batch_size_img = resized_img[np.newaxis, :]
    normalized_img = extend_batch_size_img / 255.0 # 0.0 - 1.0
    print(
        f'{str(idx+1).zfill(2)}. extend_batch_size_img.shape: {extend_batch_size_img.shape}'
    ) # [1,112,200,3]
    img_datas.append(extend_batch_size_img)
calib_datas = np.vstack(img_datas)
print(f'calib_datas.shape: {calib_datas.shape}') # [10,112,200,3]
np.save(file='data/calibdata.npy', arr=calib_datas)
loaded_data = np.load('data/calibdata.npy')
print(f'loaded_data.shape: {loaded_data.shape}') # [10,112,200,3]
"""
-qcind INPUT_NAME NUMPY_FILE_PATH MEAN STD
int8_calib_datas = (loaded_data - MEAN) / STD # -1.0 - 1.0

e.g.
-qcind pc_dep 'data/calibdata.npy' [[[[0.485, 0.456, 0.406]]]] [[[[0.229, 0.224, 0.225]]]]
-qcind feat 'data/calibdata2.npy' [[[[0.123, ..., 0.321]]]] [[[[0.112, ..., 0.451]]]]
"""
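As a sanity check, the `(loaded_data - MEAN) / STD` normalization described above can be reproduced directly in numpy. The array below is random stand-in data with the same NHWC layout as the generated `calibdata.npy`:

```python
import numpy as np

MEAN = np.asarray([[[[0.485, 0.456, 0.406]]]], dtype=np.float32)  # [1,1,1,3]
STD = np.asarray([[[[0.229, 0.224, 0.225]]]], dtype=np.float32)   # [1,1,1,3]

# Random stand-in for 'data/calibdata.npy' (10 images, NHWC, already 0.0-1.0).
loaded_data = np.random.rand(10, 112, 200, 3).astype(np.float32)

# Broadcasting applies the per-channel normalization to every pixel.
int8_calib_datas = (loaded_data - MEAN) / STD
print(int8_calib_datas.shape)  # (10, 112, 200, 3)
```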
If you do not need to perform INT8 quantization with this tool alone, the following method is the easiest.

The `-osd` option will output a `saved_model.pb` in the `saved_model` folder with the signature required for quantization. That is, a default signature named `serving_default` is embedded in the `.pb`. The `-b` option fixes the batch size by rewriting it as a static integer.

Note: INT8 tflite generated by following this procedure as-is will result in a model with significantly degraded accuracy. This tutorial only demonstrates the INT8 quantization procedure; if you wish to correct the accuracy, please refer to Parameter replacement to correct transposition errors in the operations.
# Ref: https://github.com/onnx/models/tree/main/text/machine_comprehension/bert-squad
wget https://s3.ap-northeast-2.wasabisys.com/temp-models/onnx2tf_248/bertsquad-12.onnx
onnx2tf -i bertsquad-12.onnx -b 1 -osd -cotof
Use the `saved_model_cli` command to check the `saved_model` signature. INT8 quantization calibration using signatures allows correct control of the input order of data for calibration. Therefore, calibration with signatures is recommended for INT8 quantization of models with multiple inputs.
saved_model_cli show --dir saved_model/ --tag_set serve --signature_def serving_default
The given SavedModel SignatureDef contains the following input(s):
inputs['input_ids_0'] tensor_info:
dtype: DT_INT64
shape: (1, 256)
name: serving_default_input_ids_0:0
inputs['input_mask_0'] tensor_info:
dtype: DT_INT64
shape: (1, 256)
name: serving_default_input_mask_0:0
inputs['segment_ids_0'] tensor_info:
dtype: DT_INT64
shape: (1, 256)
name: serving_default_segment_ids_0:0
inputs['unique_ids_raw_output___9_0'] tensor_info:
dtype: DT_INT64
shape: (1)
name: serving_default_unique_ids_raw_output___9_0:0
Calibrate by specifying the input OP names displayed in `inputs`. The `np.ones([xxx], dtype=np.int64)` parts must be replaced with the correct calibration test data. In practice, several pieces of data used for training are extracted and used.
import tensorflow as tf
import numpy as np
def representative_dataset():
    # Shapes follow the signature shown above:
    # unique_ids_raw_output___9_0 is shape (1); the other three are (1, 256).
    unique_ids = np.ones([10], dtype=np.int64)
    segment_ids = np.ones([10, 256], dtype=np.int64)
    input_masks = np.ones([10, 256], dtype=np.int64)
    input_ids = np.ones([10, 256], dtype=np.int64)
    for unique_id, segment_id, input_mask, input_id \
        in zip(unique_ids, segment_ids, input_masks, input_ids):
        yield {
            "unique_ids_raw_output___9_0": unique_id,
            "segment_ids_0": segment_id,
            "input_mask_0": input_mask,
            "input_ids_0": input_id,
        }
converter = tf.lite.TFLiteConverter.from_saved_model('saved_model')
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8 # or tf.uint8
converter.inference_output_type = tf.int8 # or tf.uint8
tflite_quant_model = converter.convert()
with open('saved_model/int8_model.tflite', 'wb') as w:
    w.write(tflite_quant_model)
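Because `inference_input_type` / `inference_output_type` are set to `tf.int8`, the caller must quantize inputs and dequantize outputs using the model's scale and zero-point. A minimal numpy sketch (the scale and zero-point below are hypothetical; read the real ones from `interpreter.get_input_details()` / `get_output_details()` of the generated tflite):

```python
import numpy as np

# Hypothetical quantization parameters for a model whose float input range was
# 0.0-1.0 during calibration. In practice, read the real values from
# interpreter.get_input_details()[0]['quantization'].
scale, zero_point = 1.0 / 255.0, -128

x = np.array([0.0, 0.25, 1.0], dtype=np.float32)

# Quantize: float -> int8, as the caller must do before invoking the int8 model.
q = np.clip(np.round(x / scale) + zero_point, -128, 127).astype(np.int8)

# Dequantize: int8 -> float, as done when interpreting int8 outputs.
x_back = (q.astype(np.float32) - zero_point) * scale

print(q)       # quantized values: -128, -64, 127
print(x_back)  # approximately 0.0, 0.251, 1.0
```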
https://www.tensorflow.org/lite/performance/post_training_quantization
See: https://github.com/PINTO0309/onnx2tf/issues/248
When converting to TensorFlow.js, process as follows.
pip install tensorflowjs
onnx2tf -i mobilenetv2-12.onnx -ois input:1,3,224,224 -osd
tensorflowjs_converter \
--input_format tf_saved_model \
--output_format tfjs_graph_model \
saved_model \
tfjs_model
See: https://github.com/tensorflow/tfjs/tree/master/tfjs-converter
When converting to CoreML, process as follows. The `-k` option is for conversion while maintaining the input channel order in ONNX's NCHW format.
pip install coremltools
onnx2tf -i mobilenetv2-12.onnx -k input -ois input:1,3,224,224 -osd
import coremltools as ct

FOLDER_PATH = 'saved_model'
model = ct.convert(
    model=FOLDER_PATH,
    source='tensorflow',
)
model.save(f'{FOLDER_PATH}/model.mlmodel')
See: apple/coremltools
$ onnx2tf -h
usage: onnx2tf
[-h]
(-i INPUT_ONNX_FILE_PATH | -V)
[-o OUTPUT_FOLDER_PATH]
[-osd]
[-oh5]
[-okv3]
[-ow]
[-coion]
[-oiqt]
[-qt {per-channel,per-tensor}]
[-qcind INPUT_NAME NUMPY_FILE_PATH MEAN STD]
[-ioqd {int8,uint8}]
[-nuo]
[-nuonag]
[-b BATCH_SIZE]
[-ois OVERWRITE_INPUT_SHAPE [OVERWRITE_INPUT_SHAPE ...]]
[-nlt]
[-onwdt]
[-k KEEP_NCW_OR_NCHW_OR_NCDHW_INPUT_NAMES [KEEP_NCW_OR_NCHW_OR_NCDHW_INPUT_NAMES ...]]
[-kt KEEP_NWC_OR_NHWC_OR_NDHWC_INPUT_NAMES [KEEP_NWC_OR_NHWC_OR_NDHWC_INPUT_NAMES ...]]
[-kat KEEP_SHAPE_ABSOLUTELY_INPUT_NAMES [KEEP_SHAPE_ABSOLUTELY_INPUT_NAMES ...]]
[-onimc OUTPUT_NAMES [OUTPUT_NAMES ...]]
[-dgc]
[-ebu]
[-dsft]
[-nodafc]
[-ofgd]
[-rari64 | -rarf32 | -rafi64 | -raff32]
[-fasr FUSED_ARGMAX_SCALE_RATIO]
[-rtpo REPLACE_TO_PSEUDO_OPERATORS [REPLACE_TO_PSEUDO_OPERATORS ...]]
[-me MVN_EPSILON]
[-prf PARAM_REPLACEMENT_FILE]
[-cgdc]
[-coto | -cotof]
[-coton]
[-cotor CHECK_ONNX_TF_OUTPUTS_ELEMENTWISE_CLOSE_RTOL]
[-cotoa CHECK_ONNX_TF_OUTPUTS_ELEMENTWISE_CLOSE_ATOL]
[-n]
optional arguments:
-h, --help
show this help message and exit
-i INPUT_ONNX_FILE_PATH, --input_onnx_file_path INPUT_ONNX_FILE_PATH
Input onnx file path.
-V, --version
Show version and exit.
-o OUTPUT_FOLDER_PATH, --output_folder_path OUTPUT_FOLDER_PATH
Output folder path. Default: "saved_model"
-osd, --output_signaturedefs
Signature is added to the output for serving or for conversion
to other model formats. However, this can significantly reduce the speed
of model conversion and significantly increase the size of the model.
-oh5, --output_h5
Output model in Keras (hdf5) format.
-okv3, --output_keras_v3
Output model in Keras (keras_v3) format.
-ow, --output_weights
Output weights in hdf5 format.
-coion, --copy_onnx_input_output_names_to_tflite
Copy the input/output OP names of ONNX to the input/output OP names of tflite.
Due to TensorFlow's internal operating specifications,
the input/output order of ONNX does not necessarily match
the input/output order of tflite.
Be sure to check that the input/output OP names in the generated
tflite file have been converted as expected.
Also, this option generates a huge JSON file as a temporary file for processing.
Therefore, it is strongly discouraged to use it on large models of hundreds
of megabytes or more.
-oiqt, --output_integer_quantized_tflite
Output of integer quantized tflite.
-qt {per-channel,per-tensor}, --quant_type {per-channel,per-tensor}
Selects whether "per-channel" or "per-tensor" quantization is used.
Default: "per-channel"
-qcind INPUT_NAME NUMPY_FILE_PATH MEAN STD, \
--quant_calib_input_op_name_np_data_path INPUT_NAME NUMPY_FILE_PATH MEAN STD
Input name of OP and path of calibration data file (Numpy) for quantization, and mean and std.
The specification can be omitted only when the input OP is a single 4D tensor image data.
If omitted, it is automatically calibrated using 20 normalized MS-COCO images.
The type of the input OP must be Float32.
Data for calibration must be pre-normalized to a range of 0 to 1.
-qcind {input_op_name} {numpy_file_path} {mean} {std}
Numpy file paths must be specified the same number of times as the number of input OPs.
Normalize the value of the input OP based on the tensor specified in mean and std.
(input_value - mean) / std
Tensors in Numpy file format must be in dimension order after conversion to TF.
Note that this is intended for deployment on low-resource devices,
so the batch size is limited to 1 only.
e.g.
The example below shows a case where there are three input OPs.
Assume input0 is 128x128 RGB image data.
In addition, input0 should be a value that has been divided by 255
in the preprocessing and normalized to a range between 0 and 1.
input1 and input2 assume the input of something that is not an image.
Because input1 and input2 assume something that is not an image,
the divisor is not 255 when normalizing from 0 to 1.
"n" is the number of calibration data.
ONNX INPUT shapes:
input0: [n,3,128,128]
mean: [1,3,1,1] -> [[[[0.485]],[[0.456]],[[0.406]]]]
std : [1,3,1,1] -> [[[[0.229]],[[0.224]],[[0.225]]]]
input1: [n,64,64]
mean: [1,64] -> [[0.1, ..., 0.64]]
std : [1,64] -> [[0.05, ..., 0.08]]
input2: [n,5]
mean: [1] -> [0.3]
std : [1] -> [0.07]
TensorFlow INPUT shapes (Numpy file ndarray shapes):
input0: [n,128,128,3]
mean: [1,1,1,3] -> [[[[0.485, 0.456, 0.406]]]]
std : [1,1,1,3] -> [[[[0.229, 0.224, 0.225]]]]
input1: [n,64,64]
mean: [1,64] -> [[0.1, ..., 0.64]]
std : [1,64] -> [[0.05, ..., 0.08]]
input2: [n,5]
mean: [1] -> [0.3]
std : [1] -> [0.07]
-qcind "input0" "../input0.npy" [[[[0.485, 0.456, 0.406]]]] [[[[0.229, 0.224, 0.225]]]]
-qcind "input1" "./input1.npy" [[0.1, ..., 0.64]] [[0.05, ..., 0.08]]
-qcind "input2" "input2.npy" [0.3] [0.07]
-ioqd {int8,uint8}, --input_output_quant_dtype {int8,uint8}
Input and Output dtypes when doing Full INT8 Quantization.
"int8"(default) or "uint8"
-nuo, --not_use_onnxsim
No optimization by onnx-simplifier is performed.
If this option is used, the probability of a conversion error is very high.
-nuonag, --not_use_opname_auto_generate
Automatic generation of each OP name in the old format ONNX file
and assignment of OP name are not performed.
-b BATCH_SIZE, --batch_size BATCH_SIZE
Fixes the dynamic batch size to the specified numeric batch size.
A value of 1 or more must be specified.
-ois OVERWRITE_INPUT_SHAPE [OVERWRITE_INPUT_SHAPE ...], \
--overwrite_input_shape OVERWRITE_INPUT_SHAPE [OVERWRITE_INPUT_SHAPE ...]
Overwrite the input shape.
The format is
"i1:dim0,...,dimN" "i2:dim0,...,dimN" "i3:dim0,...,dimN"
When there is only one input, for example,
"data:1,3,224,224"
When there are multiple inputs, for example,
"data1:1,3,224,224" "data2:1,3,112" "data3:5"
A value of 1 or more must be specified.
Numerical values other than dynamic dimensions are ignored.
Ignores --batch_size if specified at the same time as --batch_size.
-nlt, --no_large_tensor
Suppresses constant bloat caused by Tile OP when optimizing models in onnxsim.
See: https://github.com/daquexian/onnx-simplifier/issues/178
-onwdt, --output_nms_with_dynamic_tensor
The number of bounding boxes in the NMS output results is
not fixed at the maximum number of max_output_boxes_per_class,
but rather at the smallest possible number of dynamic tensors.
If this option is disabled, NMS output is padded to the number
set in the max_output_boxes_per_class attribute.
e.g.
disable --output_nms_with_dynamic_tensor:
output_tensor_shape: [100, 7]
enable --output_nms_with_dynamic_tensor:
output_tensor_shape: [N, 7]
-k KEEP_NCW_OR_NCHW_OR_NCDHW_INPUT_NAMES [KEEP_NCW_OR_NCHW_OR_NCDHW_INPUT_NAMES ...], \
--keep_ncw_or_nchw_or_ncdhw_input_names KEEP_NCW_OR_NCHW_OR_NCDHW_INPUT_NAMES \
[KEEP_NCW_OR_NCHW_OR_NCDHW_INPUT_NAMES ...]
Holds the NCW or NCHW or NCDHW of the input shape for the specified INPUT OP names.
If a nonexistent INPUT OP name is specified, it is ignored.
Valid only for 3D, 4D and 5D input tensors.
e.g. --keep_ncw_or_nchw_or_ncdhw_input_names "input0" "input1" "input2"
-kt KEEP_NWC_OR_NHWC_OR_NDHWC_INPUT_NAMES [KEEP_NWC_OR_NHWC_OR_NDHWC_INPUT_NAMES ...], \
--keep_nwc_or_nhwc_or_ndhwc_input_names KEEP_NWC_OR_NHWC_OR_NDHWC_INPUT_NAMES \
[KEEP_NWC_OR_NHWC_OR_NDHWC_INPUT_NAMES ...]
Holds the NWC or NHWC or NDHWC of the input shape for the specified INPUT OP names.
If a nonexistent INPUT OP name is specified, it is ignored.
If the input OP name is the same as the input OP name specified
in the keep_ncw_or_nchw_or_ncdhw_input_names option, it is ignored.
Valid only for 3D, 4D and 5D input tensors.
e.g. --keep_nwc_or_nhwc_or_ndhwc_input_names "input0" "input1" "input2"
-kat KEEP_SHAPE_ABSOLUTELY_INPUT_NAMES [KEEP_SHAPE_ABSOLUTELY_INPUT_NAMES ...], \
--keep_shape_absolutely_input_names KEEP_SHAPE_ABSOLUTELY_INPUT_NAMES \
[KEEP_SHAPE_ABSOLUTELY_INPUT_NAMES ...]
Name of the INPUT that unconditionally maintains its shape.
If a nonexistent INPUT OP name is specified, it is ignored.
e.g. --keep_shape_absolutely_input_names "input0" "input1" "input2"
-onimc OUTPUT_NAMES [OUTPUT_NAMES ...], \
--output_names_to_interrupt_model_conversion OUTPUT_NAMES [OUTPUT_NAMES ...]
Output names that interrupt model conversion.
Interrupts model transformation at the specified output name and outputs the
model partitioned into subgraphs.
e.g. --output_names_to_interrupt_model_conversion "output0" "output1" "output2"
-dgc, --disable_group_convolution
Disable GroupConvolution and replace it with SeparableConvolution for
output to saved_model format.
-ebu, --enable_batchmatmul_unfold
BatchMatMul is separated batch by batch to generate a primitive MatMul.
-dsft, --disable_suppression_flextranspose
Disables FlexTranspose generation suppression.
-nodafc, --number_of_dimensions_after_flextranspose_compression
Number of Transpose OP dimensions generated after avoiding FlexTranspose generation.
Also suppress the creation of the Transpose itself by specifying 2.
Default: 6
-ofgd, --optimization_for_gpu_delegate
Replace operations that do not support gpu delegate with those
that do as much as possible.
-rari64, --replace_argmax_to_reducemax_and_indicies_is_int64
Replace ArgMax with a ReduceMax. The returned indicies are int64.
Only one of replace_argmax_to_reducemax_and_indicies_is_int64
and replace_argmax_to_reducemax_and_indicies_is_float32
and replace_argmax_to_fused_argmax_and_indicies_is_int64
and replace_argmax_to_fused_argmax_and_indicies_is_float32 can be specified.
-rarf32, --replace_argmax_to_reducemax_and_indicies_is_float32
Replace ArgMax with a ReduceMax. The returned indicies are float32.
Only one of replace_argmax_to_reducemax_and_indicies_is_int64
and replace_argmax_to_reducemax_and_indicies_is_float32
and replace_argmax_to_fused_argmax_and_indicies_is_int64
and replace_argmax_to_fused_argmax_and_indicies_is_float32 can be specified.
-rafi64, --replace_argmax_to_fused_argmax_and_indicies_is_int64
Replace ArgMax with a Fused_ArgMax. The returned indicies are int64.
It improves inference speed at the cost of a small sacrifice in accuracy.
See: https://github.com/tensorflow/models/tree/master/official/projects/edgetpu/vision#argmax-fusion-to-improve-segmentation-model-latency
Currently, only 4D tensors are supported.
Only one of replace_argmax_to_reducemax_and_indicies_is_int64
and replace_argmax_to_reducemax_and_indicies_is_float32
and replace_argmax_to_fused_argmax_and_indicies_is_int64
and replace_argmax_to_fused_argmax_and_indicies_is_float32 can be specified.
-raff32, --replace_argmax_to_fused_argmax_and_indicies_is_float32
Replace ArgMax with a Fused_ArgMax. The returned indicies are float32.
It improves inference speed at the cost of a small sacrifice in accuracy.
See: https://github.com/tensorflow/models/tree/master/official/projects/edgetpu/vision#argmax-fusion-to-improve-segmentation-model-latency
Currently, only 4D tensors are supported.
Only one of replace_argmax_to_reducemax_and_indicies_is_int64
and replace_argmax_to_reducemax_and_indicies_is_float32
and replace_argmax_to_fused_argmax_and_indicies_is_int64
and replace_argmax_to_fused_argmax_and_indicies_is_float32 can be specified.
-fasr FUSED_ARGMAX_SCALE_RATIO, --fused_argmax_scale_ratio FUSED_ARGMAX_SCALE_RATIO
For Fused ArgMax.
Scale ratio when generating Fused ArgMax.
0.0 < fused_argmax_scale_ratio <= 1.0
Default: 0.5
-rtpo, --replace_to_pseudo_operators
Replace list of operators to pseudo operators.
Full name of the target operators should be given.
Currently supported operators:
Asin, Acos, Atan, Abs, PReLU, LeakyReLU, Power, GatherND, Neg, HardSwish, Erf
-me, --mvn_epsilon
For MeanVarianceNormalization.
The number to be added to the variance to avoid division by zero
when normalizing the value.
(input_tensor - mean) / tf.sqrt(variance + mvn_epsilon)
Default: 0.0000000001
-prf PARAM_REPLACEMENT_FILE, --param_replacement_file PARAM_REPLACEMENT_FILE
Parameter replacement file path. (.json)
-cgdc, --check_gpu_delegate_compatibility
Run TFLite ModelAnalyzer on the generated Float16 tflite model
to check if the model can be supported by GPU Delegate.
e.g.
"""
=== TFLite ModelAnalyzer ===
Your TFLite model has '1' subgraph(s). In the subgraph description below,
T# represents the Tensor numbers. For example, in Subgraph#0, the RESHAPE op takes
tensor #0 and tensor #6 as input and produces tensor #7 as output.
Subgraph#0 main(T#0) -> [T#17]
Op#0 RESHAPE(T#0, T#6[2, 8, 8, 3, 2, ...]) -> [T#7]
Op#1 SPLIT(T#5[0], T#7) -> [T#8, T#9]
Op#2 RESHAPE(T#8, T#1[8, 8, 3, 2, 2]) -> [T#10]
Op#3 TRANSPOSE(T#10, T#4[0, 3, 1, 4, 2]) -> [T#11]
Op#4 RESHAPE(T#11, T#2[1, 8, 2, 8, 2, ...]) -> [T#12]
Op#5 RESHAPE(T#9, T#1[8, 8, 3, 2, 2]) -> [T#13]
Op#6 TRANSPOSE(T#13, T#4[0, 3, 1, 4, 2]) -> [T#14]
Op#7 RESHAPE(T#14, T#2[1, 8, 2, 8, 2, ...]) -> [T#15]
Op#8 CONCATENATION(T#12, T#15) -> [T#16]
Op#9 RESHAPE(T#16, T#3[2, 16, 16, 3]) -> [T#17]
Tensors of Subgraph#0
T#0(inputs_0) shape:[2, 8, 8, 12], type:FLOAT32
T#1(model/tf.compat.v1.squeeze_2/Squeeze) shape:[5], type:INT32 RO 20 bytes, data:[8, 8, 3, 2, 2]
T#2(model/tf.expand_dims_1/ExpandDims) shape:[6], type:INT32 RO 24 bytes, data:[1, 8, 2, 8, 2, ...]
T#3(model/tf.reshape_1/Reshape/shape) shape:[4], type:INT32 RO 16 bytes, data:[2, 16, 16, 3]
T#4(model/tf.compat.v1.transpose/transpose/perm) shape:[5], type:INT32 RO 20 bytes, data:[0, 3, 1, 4, 2]
T#5(model/tf.concat/concat/axis) shape:[], type:INT32 RO 4 bytes, data:[0]
T#6(model/tf.reshape/Reshape/shape) shape:[6], type:INT32 RO 24 bytes, data:[2, 8, 8, 3, 2, ...]
T#7(model/tf.reshape/Reshape) shape:[2, 8, 8, 3, 2, 2], type:FLOAT32
T#8(model/tf.split/split) shape:[1, 8, 8, 3, 2, 2], type:FLOAT32
T#9(model/tf.split/split1) shape:[1, 8, 8, 3, 2, 2], type:FLOAT32
T#10(model/tf.compat.v1.squeeze_1/Squeeze) shape:[8, 8, 3, 2, 2], type:FLOAT32
T#11(model/tf.compat.v1.transpose/transpose) shape:[8, 2, 8, 2, 3], type:FLOAT32
T#12(model/tf.expand_dims/ExpandDims) shape:[1, 8, 2, 8, 2, 3], type:FLOAT32
T#13(model/tf.compat.v1.squeeze_2/Squeeze1) shape:[8, 8, 3, 2, 2], type:FLOAT32
T#14(model/tf.compat.v1.transpose_1/transpose) shape:[8, 2, 8, 2, 3], type:FLOAT32
T#15(model/tf.expand_dims_1/ExpandDims1) shape:[1, 8, 2, 8, 2, 3], type:FLOAT32
T#16(model/tf.concat/concat) shape:[2, 8, 2, 8, 2, 3], type:FLOAT32
T#17(Identity) shape:[2, 16, 16, 3], type:FLOAT32
Your model looks compatibile with GPU delegate with TFLite runtime version 2.10.0.
But it doesn't guarantee that your model works well with GPU delegate.
There could be some runtime incompatibililty happen.

Model size: 2988 bytes
Non-data buffer size: 2757 bytes (92.27 %)
Total data buffer size: 231 bytes (07.73 %)
(Zero value buffers): 4 bytes (00.13 %)
* Buffers of TFLite model are mostly used for constant tensors.
And zero value buffers are buffers filled with zeros.
Non-data buffers area are used to store operators, subgraphs and etc.
You can find more details from https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/schema/schema.fbs
"""
-coto, --check_onnx_tf_outputs_elementwise_close
Returns "Matches" if the output of onnx and the output of TF are
within acceptable proximity element by element.
Returns "Unmatched" if the output of onnx and the output of TF are
not within acceptable proximity element by element.
If the output of onnx is 1D, it returns "Skipped" and skips the comparison
between the output of onnx and that of TF. This is because when undefined
dimensions are present, a situation often arises where very large index
values are compared, causing OutOfMemory.
Only the output content of the model's final output OP is checked.
-cotof, --check_onnx_tf_outputs_elementwise_close_full
Returns "Matches" if the output of onnx and the output of TF are
within acceptable proximity element by element.
Check the output of all OPs in sequence from the beginning,
including all but the final output OP of the model.
Returns "Unmatched" if the output of onnx and the output of TF are
not within acceptable proximity element by element.
If the output of onnx is 1D, it returns "Skipped" and skips the comparison
between the output of onnx and that of TF. This is because when undefined
dimensions are present, a situation often arises where very large index
values are compared, causing OutOfMemory.
It is very time consuming because it performs as many inferences as
there are operations.
-coton, --check_onnx_tf_outputs_sample_data_normalization
norm: Validate using random data normalized to the range 0.0 to 1.0
denorm: Validate using random data in the range 0.0 to 255.0
If there is a normalization layer at the model's entry point, or
if the model was trained on denormalized data, "denorm" must be specified.
Default: "norm"
-cotor CHECK_ONNX_TF_OUTPUTS_ELEMENTWISE_CLOSE_RTOL,\
--check_onnx_tf_outputs_elementwise_close_rtol CHECK_ONNX_TF_OUTPUTS_ELEMENTWISE_CLOSE_RTOL
The relative tolerance parameter.
Default: 0.0
-cotoa CHECK_ONNX_TF_OUTPUTS_ELEMENTWISE_CLOSE_ATOL,\
--check_onnx_tf_outputs_elementwise_close_atol CHECK_ONNX_TF_OUTPUTS_ELEMENTWISE_CLOSE_ATOL
The absolute tolerance parameter.
Default: 1e-4
-n, --non_verbose
Do not show all information logs. Only error logs are displayed.
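As a rough sketch of what the element-wise closeness check does (the function name and structure here are illustrative, not onnx2tf internals), the comparison is essentially `np.allclose` with `rtol=0.0` and `atol=1e-4`, with 1-D outputs skipped:

```python
import numpy as np

def check_outputs_close(onnx_out, tf_out, rtol=0.0, atol=1e-4):
    """Return "Matches"/"Unmatched"/"Skipped" in the spirit of -coto.

    Illustrative only: the real onnx2tf check has more logic
    (per-OP comparison for -cotof, undefined-dimension handling, etc.).
    """
    # 1-D outputs are skipped to avoid comparing very large index
    # values when undefined dimensions are present (OutOfMemory risk).
    if onnx_out.ndim == 1:
        return "Skipped"
    if np.allclose(onnx_out, tf_out, rtol=rtol, atol=atol):
        return "Matches"
    return "Unmatched"

a = np.ones((1, 3, 4), dtype=np.float32)
b = a + 5e-5  # within the default absolute tolerance of 1e-4
print(check_outputs_close(a, b))  # Matches
```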
>>> from onnx2tf import convert
>>> help(convert)
Help on function convert in module onnx2tf:
convert(
input_onnx_file_path: Union[str, NoneType] = '',
onnx_graph: Union[onnx.onnx_ml_pb2.ModelProto, NoneType] = None,
output_folder_path: Union[str, NoneType] = 'saved_model',
output_signaturedefs: Optional[bool] = False,
output_h5: Optional[bool] = False,
output_keras_v3: Optional[bool] = False,
output_weights: Optional[bool] = False,
copy_onnx_input_output_names_to_tflite: Optional[bool] = False,
output_integer_quantized_tflite: Optional[bool] = False,
quant_type: Optional[str] = 'per-channel',
quant_calib_input_op_name_np_data_path: Optional[List] = None,
input_output_quant_dtype: Optional[str] = 'int8',
not_use_onnxsim: Optional[bool] = False,
not_use_opname_auto_generate: Optional[bool] = False,
batch_size: Union[int, NoneType] = None,
overwrite_input_shape: Union[List[str], NoneType] = None,
no_large_tensor: Optional[bool] = False,
output_nms_with_dynamic_tensor: Optional[bool] = False,
keep_ncw_or_nchw_or_ncdhw_input_names: Union[List[str], NoneType] = None,
keep_nwc_or_nhwc_or_ndhwc_input_names: Union[List[str], NoneType] = None,
keep_shape_absolutely_input_names: Optional[List[str]] = None,
output_names_to_interrupt_model_conversion: Union[List[str], NoneType] = None,
disable_group_convolution: Union[bool, NoneType] = False,
enable_batchmatmul_unfold: Optional[bool] = False,
disable_suppression_flextranspose: Optional[bool] = False,
number_of_dimensions_after_flextranspose_compression: Optional[int] = 5,
optimization_for_gpu_delegate: Optional[bool] = False,
replace_argmax_to_reducemax_and_indicies_is_int64: Union[bool, NoneType] = False,
replace_argmax_to_reducemax_and_indicies_is_float32: Union[bool, NoneType] = False,
replace_argmax_to_fused_argmax_and_indicies_is_int64: Union[bool, NoneType] = False,
replace_argmax_to_fused_argmax_and_indicies_is_float32: Union[bool, NoneType] = False,
fused_argmax_scale_ratio: Union[float, NoneType] = 0.5,
replace_to_pseudo_operators: List[str] = None,
mvn_epsilon: Union[float, NoneType] = 0.0000000001,
param_replacement_file: Optional[str] = '',
check_gpu_delegate_compatibility: Optional[bool] = False,
check_onnx_tf_outputs_elementwise_close: Optional[bool] = False,
check_onnx_tf_outputs_elementwise_close_full: Optional[bool] = False,
check_onnx_tf_outputs_sample_data_normalization: Optional[str] = 'norm',
check_onnx_tf_outputs_elementwise_close_rtol: Optional[float] = 0.0,
check_onnx_tf_outputs_elementwise_close_atol: Optional[float] = 1e-4,
non_verbose: Union[bool, NoneType] = False
) -> keras.engine.training.Model
Convert ONNX to TensorFlow models.
Parameters
----------
input_onnx_file_path: Optional[str]
Input onnx file path.
Either input_onnx_file_path or onnx_graph must be specified.
onnx_graph: Optional[onnx.ModelProto]
onnx.ModelProto.
Either input_onnx_file_path or onnx_graph must be specified.
If onnx_graph is specified, input_onnx_file_path is ignored and onnx_graph is processed.
output_folder_path: Optional[str]
Output tensorflow model folder path.
Default: "saved_model"
output_signaturedefs: Optional[bool]
Signature is added to the output for serving or for conversion
to other model formats. However, this can significantly reduce the speed
of model conversion and significantly increase the size of the model.
output_h5: Optional[bool]
Output model in Keras H5 format.
output_keras_v3: Optional[bool]
Output model in Keras (keras_v3) format.
output_weights: Optional[bool]
Output weights in hdf5 format.
copy_onnx_input_output_names_to_tflite: Optional[bool]
Copy the input/output OP name of ONNX to the input/output OP name of tflite.
Due to Tensorflow internal operating specifications,
the input/output order of ONNX does not necessarily match
the input/output order of tflite.
Be sure to check that the input/output OP names in the generated
tflite file have been converted as expected.
Also, this option generates a huge JSON file as a temporary file for processing.
Therefore, it is strongly discouraged to use it on large models of hundreds
of megabytes or more.
output_integer_quantized_tflite: Optional[bool]
Output of integer quantized tflite.
quant_type: Optional[str]
Selects whether "per-channel" or "per-tensor" quantization is used.
Default: "per-channel"
quant_calib_input_op_name_np_data_path: Optional[List]
quant_calib_input_op_name_np_data_path INPUT_NAME NUMPY_FILE_PATH MEAN STD
Input name of OP, path of calibration data file (Numpy) for quantization, and mean and std.
The specification can be omitted only when the input OP is a single 4D tensor image data.
If omitted, it is automatically calibrated using 20 normalized MSCOCO images.
The type of the input OP must be Float32.
Data for calibration must be pre-normalized to a range of 0 to 1.
-qcind {input_op_name} {numpy_file_path} {mean} {std}
Numpy file paths must be specified the same number of times as the number of input OPs.
Normalize the value of the input OP based on the tensor specified in mean and std.
(input_value - mean) / std
Tensors in Numpy file format must be in dimension order after conversion to TF.
Note that this is intended for deployment on lowresource devices,
so the batch size is limited to 1 only.
e.g.
The example below shows a case where there are three input OPs.
Assume input0 is 128x128 RGB image data.
In addition, input0 should be a value that has been divided by 255
in the preprocessing and normalized to a range between 0 and 1.
input1 and input2 assume the input of something that is not an image.
Because input1 and input2 assume something that is not an image,
the divisor is not 255 when normalizing from 0 to 1.
"n" is the number of calibration data.
ONNX INPUT shapes:
input0: [n,3,128,128]
mean: [1,3,1,1] -> [[[[0.485]],[[0.456]],[[0.406]]]]
std : [1,3,1,1] -> [[[[0.229]],[[0.224]],[[0.225]]]]
input1: [n,64,64]
mean: [1,64] -> [[0.1, ..., 0.64]]
std : [1,64] -> [[0.05, ..., 0.08]]
input2: [n,5]
mean: [1] -> [0.3]
std : [1] -> [0.07]
TensorFlow INPUT shapes (Numpy file ndarray shapes):
input0: [n,128,128,3]
mean: [1,1,1,3] -> [[[[0.485, 0.456, 0.406]]]]
std : [1,1,1,3] -> [[[[0.229, 0.224, 0.225]]]]
input1: [n,64,64]
mean: [1,64] -> [[0.1, ..., 0.64]]
std : [1,64] -> [[0.05, ..., 0.08]]
input2: [n,5]
mean: [1] -> [0.3]
std : [1] -> [0.07]
qcind=[
["input0","../input0.npy",[[[[0.485, 0.456, 0.406]]]],[[[[0.229, 0.224, 0.225]]]]],
["input1","./input1.npy",[0.1, ..., 0.64],[0.05, ..., 0.08]],
["input2","input2.npy",[0.3],[0.07]],
]
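The layout change between the ONNX and TensorFlow mean/std tensors above is a plain NCHW-to-NHWC transpose. A minimal numpy sketch (the shapes mirror the input0 example; the tensors here are illustrative):

```python
import numpy as np

# ONNX-style (NCHW) normalization constants for input0: shape [1,3,1,1]
mean_nchw = np.array([0.485, 0.456, 0.406], dtype=np.float32).reshape(1, 3, 1, 1)
std_nchw = np.array([0.229, 0.224, 0.225], dtype=np.float32).reshape(1, 3, 1, 1)

# TF-style (NHWC) layout expected in the calibration Numpy file: shape [1,1,1,3]
mean_nhwc = mean_nchw.transpose(0, 2, 3, 1)
std_nhwc = std_nchw.transpose(0, 2, 3, 1)

# Calibration images must already be normalized to the 0..1 range;
# (input_value - mean) / std is then applied during calibration.
img = np.random.rand(1, 128, 128, 3).astype(np.float32)  # n=1, NHWC
calibrated = (img - mean_nhwc) / std_nhwc

print(mean_nhwc.shape)  # (1, 1, 1, 3)
```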
input_output_quant_dtype: Optional[str]
Input and Output dtypes when doing Full INT8 Quantization.
"int8"(default) or "uint8"
not_use_onnxsim: Optional[bool]
No optimization by onnx-simplifier is performed.
If this option is used, the probability of a conversion error is very high.
not_use_opname_auto_generate: Optional[bool]
Automatic generation of each OP name in the old format ONNX file
and assignment of OP name are not performed.
batch_size: Optional[int]
Fixes the dynamic batch size to the specified numeric batch size.
A value of 1 or more must be specified.
overwrite_input_shape: Optional[List[str]]
Overwrite the input shape.
The format is
['i1:dim0,dim1,...,dimN', 'i2:dim0,dim1,...,dimN', 'i3:dim0,dim1,...,dimN']
When there is only one input, for example,
['data:1,3,224,224']
When there are multiple inputs, for example,
['data1:1,3,224,224','data2:1,3,112','data3:5']
A value of 1 or more must be specified.
Numerical values other than dynamic dimensions are ignored.
If specified at the same time as batch_size, batch_size is ignored.
no_large_tensor: Optional[bool]
Suppresses constant bloat caused by Tile OP when optimizing models in onnxsim.
See: https://github.com/daquexian/onnx-simplifier/issues/178
output_nms_with_dynamic_tensor: Optional[bool]
The number of bounding boxes in the NMS output results is
not fixed at the maximum number of max_output_boxes_per_class,
but rather at the smallest possible number of dynamic tensors.
If this option is disabled, NMS output is padded to the number
set in the max_output_boxes_per_class attribute.
e.g.
disable output_nms_with_dynamic_tensor:
output_tensor_shape: [100, 7]
enable output_nms_with_dynamic_tensor:
output_tensor_shape: [N, 7]
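The difference between the two output shapes can be sketched with numpy: when the option is disabled, the variable-length NMS result is padded up to max_output_boxes_per_class (the function and constant below are illustrative, not the tool's actual code):

```python
import numpy as np

MAX_OUTPUT_BOXES_PER_CLASS = 100  # illustrative value

def pad_nms_output(boxes):
    """Pad a dynamic [N, 7] NMS result to the fixed shape
    [MAX_OUTPUT_BOXES_PER_CLASS, 7] used when
    output_nms_with_dynamic_tensor is disabled (sketch)."""
    padded = np.zeros((MAX_OUTPUT_BOXES_PER_CLASS, 7), dtype=boxes.dtype)
    padded[: boxes.shape[0]] = boxes
    return padded

detections = np.ones((12, 7), dtype=np.float32)  # N=12 real detections
print(pad_nms_output(detections).shape)  # (100, 7)
```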
keep_ncw_or_nchw_or_ncdhw_input_names: Optional[List[str]]
Holds the NCW or NCHW or NCDHW of the input shape for the specified INPUT OP names.
If a nonexistent INPUT OP name is specified, it is ignored.
Valid only for 3D, 4D and 5D input tensors.
e.g.
keep_ncw_or_nchw_or_ncdhw_input_names=['input0','input1','input2']
keep_nwc_or_nhwc_or_ndhwc_input_names: Optional[List[str]]
Holds the NWC or NHWC or NDHWC of the input shape for the specified INPUT OP names.
If a nonexistent INPUT OP name is specified, it is ignored.
If the input OP name is the same as the input OP name specified
in the keep_ncw_or_nchw_or_ncdhw_input_names option, it is ignored.
Valid only for 3D, 4D and 5D input tensors.
e.g.
keep_nwc_or_nhwc_or_ndhwc_input_names=['input0','input1','input2']
keep_shape_absolutely_input_names: Optional[List[str]]
Name of the INPUT that unconditionally maintains its shape.
If a nonexistent INPUT OP name is specified, it is ignored.
e.g.
keep_shape_absolutely_input_names=['input0','input1','input2']
output_names_to_interrupt_model_conversion: Optional[List[str]]
Output names that interrupt model conversion.
Interrupts model transformation at the specified output name
and outputs the model partitioned into subgraphs.
e.g.
output_names_to_interrupt_model_conversion=['output0','output1','output2']
disable_group_convolution: Optional[bool]
Disable GroupConvolution and replace it with SeparableConvolution for
output to saved_model format.
enable_batchmatmul_unfold: Optional[bool]
BatchMatMul is separated batch by batch to generate a primitive MatMul.
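Conceptually, the unfold turns one batched MatMul into one primitive MatMul per batch element plus a stack. A numpy sketch (illustrative, not onnx2tf internals):

```python
import numpy as np

def unfolded_batch_matmul(a, b):
    """Emulate a batched MatMul as per-batch primitive MatMuls:
    one plain MatMul per batch element, then a stack (concat)."""
    return np.stack([a[i] @ b[i] for i in range(a.shape[0])], axis=0)

a = np.random.rand(4, 2, 3).astype(np.float32)
b = np.random.rand(4, 3, 5).astype(np.float32)

# The unfolded form is numerically equivalent to the batched MatMul.
assert np.allclose(unfolded_batch_matmul(a, b), np.matmul(a, b))
```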
disable_suppression_flextranspose: Optional[bool]
Disables FlexTranspose generation suppression.
number_of_dimensions_after_flextranspose_compression: Optional[int]
Number of Transpose OP dimensions generated after avoiding FlexTranspose generation.
Also suppress the creation of the Transpose itself by specifying 2.
Default: 6
optimization_for_gpu_delegate: Optional[bool]
Replace operations that do not support gpu delegate with those
that do as much as possible.
replace_argmax_to_reducemax_and_indicies_is_int64: Optional[bool]
Replace ArgMax with a ReduceMax. The returned indices are int64.
Only one of replace_argmax_to_reducemax_and_indicies_is_int64 and
replace_argmax_to_reducemax_and_indicies_is_float32 and
replace_argmax_to_fused_argmax_and_indicies_is_int64 and
replace_argmax_to_fused_argmax_and_indicies_is_float32 can be specified.
Default: False
replace_argmax_to_reducemax_and_indicies_is_float32: Optional[bool]
Replace ArgMax with a ReduceMax. The returned indices are float32.
Only one of replace_argmax_to_reducemax_and_indicies_is_int64 and
replace_argmax_to_reducemax_and_indicies_is_float32 and
replace_argmax_to_fused_argmax_and_indicies_is_int64 and
replace_argmax_to_fused_argmax_and_indicies_is_float32 can be specified.
Default: False
replace_argmax_to_fused_argmax_and_indicies_is_int64: Optional[bool]
Replace ArgMax with a ReduceMax. The returned indices are int64.
It improves inference speed at the cost of a small sacrifice in accuracy.
See: https://github.com/tensorflow/models/tree/master/official/projects/edgetpu/vision#argmax-fusion-to-improve-segmentation-model-latency
Currently, only 4D tensors are supported.
Only one of replace_argmax_to_reducemax_and_indicies_is_int64 and
replace_argmax_to_reducemax_and_indicies_is_float32 and
replace_argmax_to_fused_argmax_and_indicies_is_int64 and
replace_argmax_to_fused_argmax_and_indicies_is_float32 can be specified.
Default: False
replace_argmax_to_fused_argmax_and_indicies_is_float32: Optional[bool]
Replace ArgMax with a ReduceMax. The returned indices are float32.
It improves inference speed at the cost of a small sacrifice in accuracy.
See: https://github.com/tensorflow/models/tree/master/official/projects/edgetpu/vision#argmax-fusion-to-improve-segmentation-model-latency
Currently, only 4D tensors are supported.
Only one of replace_argmax_to_reducemax_and_indicies_is_int64 and
replace_argmax_to_reducemax_and_indicies_is_float32 and
replace_argmax_to_fused_argmax_and_indicies_is_int64 and
replace_argmax_to_fused_argmax_and_indicies_is_float32 can be specified.
Default: False
fused_argmax_scale_ratio: Optional[float]
For Fused ArgMax.
Scale ratio when generating Fused ArgMax.
0.0 < fused_argmax_scale_ratio <= 1.0
Default: 0.5
replace_to_pseudo_operators: List[str]
Replace list of operators to pseudo operators.
Full name of the target operators should be given.
Currently supported operators :
Asin, Acos, Atan, Abs, PReLU, LeakyReLU, Power, GatherND, Neg, HardSwish, Erf
mvn_epsilon: Optional[float]
For MeanVarianceNormalization.
The number to be added to the variance to avoid division by zero
when normalizing the value.
(input_tensor - mean) / tf.sqrt(variance + mvn_epsilon)
Default: 0.0000000001
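The formula above can be sketched in numpy (illustrative; the converter itself emits TF ops, and the axes default here follows ONNX MVN's default of [0, 2, 3]):

```python
import numpy as np

def mean_variance_normalization(x, axes=(0, 2, 3), epsilon=1e-10):
    """MeanVarianceNormalization sketch:
    (input_tensor - mean) / sqrt(variance + mvn_epsilon),
    with epsilon guarding against division by zero."""
    mean = x.mean(axis=axes, keepdims=True)
    variance = x.var(axis=axes, keepdims=True)
    return (x - mean) / np.sqrt(variance + epsilon)

x = np.random.rand(2, 3, 4, 4).astype(np.float32)
y = mean_variance_normalization(x)
# Per-channel mean is ~0 and variance ~1 after normalization.
print(np.allclose(y.mean(axis=(0, 2, 3)), 0.0, atol=1e-5))  # True
```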
param_replacement_file: Optional[str]
Parameter replacement file path. (.json)
check_gpu_delegate_compatibility: Optional[bool]
Run TFLite ModelAnalyzer on the generated Float16 tflite model
to check if the model can be supported by GPU Delegate.
e.g.
"""
=== TFLite ModelAnalyzer ===
Your TFLite model has '1' subgraph(s). In the subgraph description below,
T# represents the Tensor numbers. For example, in Subgraph#0, the RESHAPE op takes
tensor #0 and tensor #6 as input and produces tensor #7 as output.
Subgraph#0 main(T#0) -> [T#17]
Op#0 RESHAPE(T#0, T#6[2, 8, 8, 3, 2, ...]) -> [T#7]
Op#1 SPLIT(T#5[0], T#7) -> [T#8, T#9]
Op#2 RESHAPE(T#8, T#1[8, 8, 3, 2, 2]) -> [T#10]
Op#3 TRANSPOSE(T#10, T#4[0, 3, 1, 4, 2]) -> [T#11]
Op#4 RESHAPE(T#11, T#2[1, 8, 2, 8, 2, ...]) -> [T#12]
Op#5 RESHAPE(T#9, T#1[8, 8, 3, 2, 2]) -> [T#13]
Op#6 TRANSPOSE(T#13, T#4[0, 3, 1, 4, 2]) -> [T#14]
Op#7 RESHAPE(T#14, T#2[1, 8, 2, 8, 2, ...]) -> [T#15]
Op#8 CONCATENATION(T#12, T#15) -> [T#16]
Op#9 RESHAPE(T#16, T#3[2, 16, 16, 3]) -> [T#17]
Tensors of Subgraph#0
T#0(inputs_0) shape:[2, 8, 8, 12], type:FLOAT32
T#1(model/tf.compat.v1.squeeze_2/Squeeze) shape:[5], type:INT32 RO 20 bytes, data:[8, 8, 3, 2, 2]
T#2(model/tf.expand_dims_1/ExpandDims) shape:[6], type:INT32 RO 24 bytes, data:[1, 8, 2, 8, 2, ...]
T#3(model/tf.reshape_1/Reshape/shape) shape:[4], type:INT32 RO 16 bytes, data:[2, 16, 16, 3]
T#4(model/tf.compat.v1.transpose/transpose/perm) shape:[5], type:INT32 RO 20 bytes, data:[0, 3, 1, 4, 2]
T#5(model/tf.concat/concat/axis) shape:[], type:INT32 RO 4 bytes, data:[0]
T#6(model/tf.reshape/Reshape/shape) shape:[6], type:INT32 RO 24 bytes, data:[2, 8, 8, 3, 2, ...]
T#7(model/tf.reshape/Reshape) shape:[2, 8, 8, 3, 2, 2], type:FLOAT32
T#8(model/tf.split/split) shape:[1, 8, 8, 3, 2, 2], type:FLOAT32
T#9(model/tf.split/split1) shape:[1, 8, 8, 3, 2, 2], type:FLOAT32
T#10(model/tf.compat.v1.squeeze_1/Squeeze) shape:[8, 8, 3, 2, 2], type:FLOAT32
T#11(model/tf.compat.v1.transpose/transpose) shape:[8, 2, 8, 2, 3], type:FLOAT32
T#12(model/tf.expand_dims/ExpandDims) shape:[1, 8, 2, 8, 2, 3], type:FLOAT32
T#13(model/tf.compat.v1.squeeze_2/Squeeze1) shape:[8, 8, 3, 2, 2], type:FLOAT32
T#14(model/tf.compat.v1.transpose_1/transpose) shape:[8, 2, 8, 2, 3], type:FLOAT32
T#15(model/tf.expand_dims_1/ExpandDims1) shape:[1, 8, 2, 8, 2, 3], type:FLOAT32
T#16(model/tf.concat/concat) shape:[2, 8, 2, 8, 2, 3], type:FLOAT32
T#17(Identity) shape:[2, 16, 16, 3], type:FLOAT32
Your model looks compatible with GPU delegate with TFLite runtime version 2.10.0.
But it doesn't guarantee that your model works well with GPU delegate.
There could be some runtime incompatibility.

Model size: 2988 bytes
Non-data buffer size: 2757 bytes (92.27 %)
Total data buffer size: 231 bytes (07.73 %)
(Zero value buffers): 4 bytes (00.13 %)
* Buffers of TFLite model are mostly used for constant tensors.
And zero value buffers are buffers filled with zeros.
Non-data buffers are used to store operators, subgraphs, etc.
You can find more details from https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/schema/schema.fbs
"""
check_onnx_tf_outputs_elementwise_close: Optional[bool]
Returns "Matches" if the output of onnx and the output of TF are
within acceptable proximity element by element.
Returns "Unmatched" if the output of onnx and the output of TF are
not within acceptable proximity element by element.
If the output of onnx is 1D, it returns "Skipped" and skips the comparison
between the output of onnx and that of TF. This is because when undefined
dimensions are present, a situation often arises where very large index
values are compared, causing OutOfMemory.
Only the output content of the model's final output OP is checked.
check_onnx_tf_outputs_elementwise_close_full: Optional[bool]
Returns "Matches" if the output of onnx and the output of TF are
within acceptable proximity element by element.
Check the output of all OPs in sequence from the beginning,
including all but the final output OP of the model.
Returns "Unmatched" if the output of onnx and the output of TF are
not within acceptable proximity element by element.
If the output of onnx is 1D, it returns "Skipped" and skips the comparison
between the output of onnx and that of TF. This is because when undefined
dimensions are present, a situation often arises where very large index
values are compared, causing OutOfMemory.
It is very time consuming because it performs as many inferences as
there are operations.
check_onnx_tf_outputs_sample_data_normalization: Optional[str]
norm: Validate using random data normalized to the range 0.0 to 1.0
denorm: Validate using random data in the range 0.0 to 255.0
If there is a normalization layer at the model's entry point, or
if the model was trained on denormalized data, "denorm" must be specified.
Default: "norm"
check_onnx_tf_outputs_elementwise_close_rtol: Optional[float]
The relative tolerance parameter.
Default: 0.0
check_onnx_tf_outputs_elementwise_close_atol: Optional[float]
The absolute tolerance parameter.
Default: 1e-4
non_verbose: Optional[bool]
Do not show all information logs. Only error logs are displayed.
Default: False
Returns
-------
model: tf.keras.Model
Model
This tool is used to convert NCW to NWC, NCHW to NHWC, NCDHW to NDHWC, NCDDHW to NDDHWC, and NCDDDDDDHW to NDDDDDDHWC. Therefore, as stated in the Key Concepts, the conversion will inevitably break down at some point in the model. You need to look at the entire conversion log to see which OP transpositions are failing and correct them yourself. I dare to explain very little because I know that no matter how much detail I put in the README, you guys will not read it at all. An attribute, an INPUT constant, or an INPUT Initializer can be replaced with the specified value.
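The channel-first to channel-last conversions listed above correspond to fixed transpose permutations; a numpy sketch:

```python
import numpy as np

# Channel-first to channel-last permutations used by the conversion:
#   NCW   -> NWC   : (0, 2, 1)
#   NCHW  -> NHWC  : (0, 2, 3, 1)
#   NCDHW -> NDHWC : (0, 2, 3, 4, 1)
x_nchw = np.random.rand(1, 3, 224, 224).astype(np.float32)
x_nhwc = x_nchw.transpose(0, 2, 3, 1)
print(x_nhwc.shape)  # (1, 224, 224, 3)
```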
or INPUT constant
or INPUT Initializer
can be replaced with the specified value.
Starting from v1.3.0, almost all OPs except for some special OPs support pre- and post-transposition by pre_process_transpose and post_process_transpose.
Do not submit an issue that only contains an amount of information that cannot be reproduced.
convert option
--param_replacement_file param_replacement.json
or
-prf param_replacement.json
param_replacement.json
{
"format_version": 1,
"operations": [
{
"op_name": "StatefulPartitionedCall/Tile_4",
"param_target": "inputs", # attributes or inputs
"param_name": "const_fold_opt__677",
"values": [1,1,17] # Disable parameter transposition or overwrite parameters
},
{
"op_name": "StatefulPartitionedCall/Cast_3",
"param_target": "attributes", # attributes or inputs
"param_name": "to",
"values": 1 # Disable parameter transposition or overwrite "to" parameters
},
{
"op_name": "Resize__697",
"param_target": "inputs",
"param_name": "Concat__696:0",
"values": [26,26] # Replacement of unk__x (Resize OP, sizes height/width parameter)
},
{
"op_name": "Transpose__927",
"param_target": "attributes",
"param_name": "perm",
"values": [0,1,2,3] # Disable parameter transposition or overwrite "perm" parameters
},
{
"op_name": "StatefulPartitionedCall/functional_1/max_unpooling2d_2/Reshape_1",
"param_target": "inputs",
"param_name": "const_fold_opt__911",
"values": [4,131072] # Overwrite "shape" parameters
},
{
"op_name": "Reshape_25",
"param_target": "outputs",
"param_name": "onnx::InstanceNormalization_270",
"post_process_transpose_perm": [0,2,1] # Extrapolate 3D Transpose after Reshape
},
{
"op_name": "Reshape_30",
"param_target": "outputs",
"param_name": "onnx::Mul_275",
"post_process_transpose_perm": [0,2,3,1] # Extrapolate 4D Transpose after Reshape
},
{
"op_name": "flatten_1127",
"param_target": "inputs",
"param_name": "dropout0",
"pre_process_transpose_perm": [0,3,1,2]
},
{
"op_name": "/Slice",
"param_target": "op",
"begin": [0,0,1,0],
"end": [0,0,0,0],
"end_mask": 15
},
{
"op_name": "/Slice_1",
"param_target": "op",
"begin": [0,0,0,0],
"end": [0,0,39,0],
"end_mask": 11
}
]
}
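As an illustration of what a post_process_transpose_perm entry does (the shapes below are hypothetical; the entry mirrors the Reshape_30 example in the sample JSON), the specified perm simply extrapolates a Transpose after the named OP's output:

```python
import numpy as np

# Illustrative replacement entry, as it would appear in "operations":
replacement = {
    "op_name": "Reshape_30",
    "param_target": "outputs",
    "param_name": "onnx::Mul_275",
    "post_process_transpose_perm": [0, 2, 3, 1],
}

# Hypothetical NCHW-shaped output of the named OP.
op_output = np.random.rand(1, 8, 16, 16).astype(np.float32)

# The perm is applied as a Transpose after the OP.
fixed = op_output.transpose(replacement["post_process_transpose_perm"])
print(fixed.shape)  # (1, 16, 16, 8)
```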
Replacement Supported OPs
No. | OP type | Remarks
--- | --- | ---
1 | Add | 1. `"param_target": "inputs"`<br>`pre_process_transpose_perm`: Transpose is applied to the tensor before the Add operation with the perm specified as pre-processing.<br>2. `"param_target": "outputs"`<br>`post_process_transpose_perm`: Transpose is applied to the tensor after the Add operation with the perm specified as post-processing.
2 | Cast | 
3 | Concat | 1. `"param_target": "attributes"`<br>`axis`: Value of `axis`<br>2. `"param_target": "outputs"`<br>`post_process_transpose_perm`: Transpose is applied to the tensor after the Concat operation with the perm specified as post-processing.
4 | ConvTranspose | ConvTranspose implements special replacements separately: ignore all automatic conversions and generate `tf.nn.conv1d_transpose`, `tf.nn.conv2d_transpose` or `tf.nn.conv3d_transpose` directly by specifying all parameters.<br>https://www.tensorflow.org/api_docs/python/tf/nn/conv1d_transpose<br>https://www.tensorflow.org/api_docs/python/tf/nn/conv2d_transpose<br>https://www.tensorflow.org/api_docs/python/tf/nn/conv3d_transpose<br>1. `"param_target": "op"`<br>`output_shape`: Value of `output_shape`<br>`strides`: Value of `strides`<br>`padding`: Value of `padding`<br>`dilations`: Value of `dilations`
5 | Div | 1. `"param_target": "inputs"`<br>`values`: Value of input<br>`pre_process_transpose_perm`: Transpose is applied to the tensor before the Div operation with the perm specified as pre-processing.<br>2. `"param_target": "outputs"`<br>`post_process_transpose_perm`: Transpose is applied to the tensor after the Div operation with the perm specified as post-processing.
6 | Expand | 1. `"param_target": "inputs"`<br>`values`: Value of `shape`<br>`pre_process_transpose_perm`: Transpose is applied to the tensor before the Expand operation with the perm specified as pre-processing.<br>2. `"param_target": "outputs"`<br>`post_process_transpose_perm`: Transpose is applied to the tensor after the Expand operation with the perm specified as post-processing.
7 | Flatten | 1. `"param_target": "attributes"`<br>`axis`: Value of `axis`<br>2. `"param_target": "inputs"`<br>`pre_process_transpose_perm`: Transpose is applied to the tensor before the Flatten operation with the perm specified as pre-processing.<br>3. `"param_target": "outputs"`<br>`post_process_transpose_perm`: Transpose is applied to the tensor after the Flatten operation with the perm specified as post-processing.
8 | Gemm | 
9 | Gather | 1. `"param_target": "attributes"`<br>`axis`: Value of `axis`<br>2. `"param_target": "inputs"`<br>`values`: Value of `indices`<br>`pre_process_transpose_perm`: Transpose is applied to the tensor before the Gather operation with the perm specified as pre-processing.<br>3. `"param_target": "outputs"`<br>`post_process_transpose_perm`: Transpose is applied to the tensor after the Gather operation with the perm specified as post-processing.
10 | MatMul | 1. `"param_target": "inputs"`<br>`pre_process_transpose_perm`: Transpose is applied to the tensor before the MatMul operation with the perm specified as pre-processing.<br>2. `"param_target": "outputs"`<br>`post_process_transpose_perm`: Transpose is applied to the tensor after the MatMul operation with the perm specified as post-processing.
11 | Mul | 1. `"param_target": "inputs"`<br>`values`: Value of input<br>`pre_process_transpose_perm`: Transpose is applied to the tensor before the Mul operation with the perm specified as pre-processing.<br>2. `"param_target": "outputs"`<br>`post_process_transpose_perm`: Transpose is applied to the tensor after the Mul operation with the perm specified as post-processing.
12 | NonMaxSuppression | 
13 | ReduceL1 / ReduceL2 / ReduceLogSum / ReduceLogSumExp / ReduceMax / ReduceMean / ReduceMin / ReduceProd / ReduceSum / ReduceSumSquare | 1. `"param_target": "attributes"`<br>`axes`: Value of `axes`<br>`keepdims`: Value of `keepdims`<br>2. `"param_target": "inputs"`<br>`pre_process_transpose_perm`: Transpose is applied to the tensor before the ReduceXX operation with the perm specified as pre-processing.<br>3. `"param_target": "outputs"`<br>`post_process_transpose_perm`: Transpose is applied to the tensor after the ReduceXX operation with the perm specified as post-processing.
14 | Unsqueeze | 1. `"param_target": "inputs"`<br>`pre_process_transpose_perm`: Transpose is applied to the tensor before the Unsqueeze operation with the perm specified as pre-processing.<br>2. `"param_target": "outputs"`<br>`post_process_transpose_perm`: Transpose is applied to the tensor after the Unsqueeze operation with the perm specified as post-processing.<br>3. `"param_target": "op"`<br>`new_shape`: Specifies directly the shape after Unsqueeze processing.
15 | Reshape | 1. `"param_target": "inputs"`<br>`values`: Value of `shape`<br>`pre_process_transpose_perm`: Transpose is applied to the tensor before the Reshape operation with the perm specified as pre-processing.<br>2. `"param_target": "outputs"`<br>`post_process_transpose_perm`: Transpose is applied to the tensor after the Reshape operation with the perm specified as post-processing.
16 | Resize | 1. `"param_target": "attributes"`<br>`coordinate_transformation_mode`: Value of `coordinate_transformation_mode`<br>`extrapolation_value`: Value of `extrapolation_value`<br>`mode`: Value of `mode`<br>2. `"param_target": "inputs"`<br>`values`: Value of `roi` or `scales` or `sizes`. `scales`=[scale_h,scale_w], `sizes`=[h,w]<br>`pre_process_transpose_perm`: Transpose is applied to the tensor before the Resize operation with the perm specified as pre-processing.<br>3. `"param_target": "outputs"`<br>`post_process_transpose_perm`: Transpose is applied to the tensor after the Resize operation with the perm specified as post-processing.
17 | Slice | Slice implements special replacements separately: ignore all automatic conversions and generate `tf.strided_slice` directly by specifying all of its parameters.<br>https://www.tensorflow.org/api_docs/python/tf/strided_slice<br>See replace_slice.json for a sample description.<br>1. `"param_target": "op"`<br>`begin`: Value of `begin`<br>`end`: Value of `end`<br>`strides`: Value of `strides`<br>`begin_mask`: Value of `begin_mask`<br>`end_mask`: Value of `end_mask`<br>`ellipsis_mask`: Value of `ellipsis_mask`<br>`new_axis_mask`: Value of `new_axis_mask`<br>`shrink_axis_mask`: Value of `shrink_axis_mask`
18 | Softmax | 1. `"param_target": "attributes"`<br>`axis`: Value of `axis`. The transpositions corresponding to the specified axis are extrapolated before and after Softmax.<br>2. `"param_target": "inputs"`<br>`values`: Value of tensor
19 | Split | 1. `"param_target": "inputs"`<br>`values`: Value of `split`<br>2. `"param_target": "attributes"`<br>`axis`: Value of `axis`<br>`num_outputs`: Value of `num_outputs`
20 | Sub | 1. `"param_target": "inputs"`<br>`values`: Value of input<br>`pre_process_transpose_perm`: Transpose is applied to the tensor before the Sub operation with the perm specified as pre-processing.<br>2. `"param_target": "outputs"`<br>`post_process_transpose_perm`: Transpose is applied to the tensor after the Sub operation with the perm specified as post-processing.
21 | Tile | 1. `"param_target": "inputs"`<br>`values`: Value of input<br>`pre_process_transpose_perm`: Transpose is applied to the tensor before the Tile operation with the perm specified as pre-processing.<br>2. `"param_target": "outputs"`<br>`post_process_transpose_perm`: Transpose is applied to the tensor after the Tile operation with the perm specified as post-processing.
22 | Transpose | 1. `"param_target": "attributes"`<br>`perm`: Value of `perm`<br>2. `"param_target": "inputs"`<br>`values`: Value of tensor
✔️: Supported  Help wanted: Pull Requests are welcome
OP  Status 

Abs  ✔️ 
Acosh  ✔️ 
Acos  ✔️ 
Add  ✔️ 
And  ✔️ 
ArgMax  ✔️ 
ArgMin  ✔️ 
Asinh  ✔️ 
Asin  ✔️ 
Atanh  ✔️ 
Atan  ✔️ 
AveragePool  ✔️ 
BatchNormalization  ✔️ 
Bernoulli  ✔️ 
BitShift  ✔️ 
BitwiseAnd  Help wanted 
BitwiseNot  Help wanted 
BitwiseOr  Help wanted 
BitwiseXor  Help wanted 
Cast  ✔️ 
Ceil  ✔️ 
Celu  ✔️ 
CenterCropPad  Help wanted 
Clip  ✔️ 
Col2Im  Help wanted 
Compress  ✔️ 
ConcatFromSequence  ✔️ 
Concat  ✔️ 
ConstantOfShape  ✔️ 
Constant  ✔️ 
Conv  ✔️ 
ConvTranspose  ✔️ 
Cosh  ✔️ 
Cos  ✔️ 
CumSum  ✔️ 
DepthToSpace  ✔️ 
Det  ✔️ 
DequantizeLinear  ✔️ 
DFT  Help wanted 
Div  ✔️ 
Dropout  ✔️ 
DynamicQuantizeLinear  ✔️ 
Einsum  ✔️ 
Elu  ✔️ 
Equal  ✔️ 
Erf  ✔️ 
Expand  ✔️ 
Exp  ✔️ 
EyeLike  ✔️ 
Flatten  ✔️ 
Floor  ✔️ 
FusedConv  ✔️ 
GatherElements  ✔️ 
GatherND  ✔️ 
Gather  ✔️ 
Gemm  ✔️ 
GlobalAveragePool  ✔️ 
GlobalLpPool  ✔️ 
GlobalMaxPool  ✔️ 
GreaterOrEqual  ✔️ 
Greater  ✔️ 
GridSample  ✔️ 
GroupNormalization  Help wanted 
GRU  Help wanted 
Hardmax  ✔️ 
HardSigmoid  ✔️ 
HardSwish  ✔️ 
Identity  ✔️ 
If  ✔️ 
Input  ✔️ 
InstanceNormalization  ✔️ 
Inverse  ✔️ 
IsInf  ✔️ 
IsNaN  ✔️ 
LayerNormalization  ✔️ 
LeakyRelu  ✔️ 
LessOrEqual  ✔️ 
Less  ✔️ 
Log  ✔️ 
LogSoftmax  ✔️ 
Loop  Help wanted 
LpNormalization  ✔️ 
LRN  ✔️ 
LSTM  Help wanted 
MatMul  ✔️ 
MatMulInteger  ✔️ 
MaxPool  ✔️ 
Max  ✔️ 
MaxRoiPool  Help wanted 
MaxUnpool  ✔️ 
Mean  ✔️ 
MeanVarianceNormalization  ✔️ 
MelWeightMatrix  Help wanted 
Min  ✔️ 
Mish  ✔️ 
Mod  ✔️ 
Mul  ✔️ 
Multinomial  ✔️ 
Neg  ✔️ 
NonMaxSuppression  ✔️ 
NonZero  ✔️ 
Optional  Help wanted 
OptionalGetElement  Help wanted 
OptionalHasElement  Help wanted 
Not  ✔️ 
OneHot  ✔️ 
Or  ✔️ 
Pad  ✔️ 
Pow  ✔️ 
PRelu  ✔️ 
QLinearAdd  ✔️ 
QLinearConcat  ✔️ 
QLinearConv  ✔️ 
QLinearLeakyRelu  ✔️ 
QLinearMatMul  ✔️ 
QLinearMul  ✔️ 
QLinearSigmoid  ✔️ 
QLinearSoftmax  ✔️ 
QuantizeLinear  ✔️ 
RandomNormalLike  ✔️ 
RandomNormal  ✔️ 
RandomUniformLike  ✔️ 
RandomUniform  ✔️ 
Range  ✔️ 
Reciprocal  ✔️ 
ReduceL1  ✔️ 
ReduceL2  ✔️ 
ReduceLogSum  ✔️ 
ReduceLogSumExp  ✔️ 
ReduceMax  ✔️ 
ReduceMean  ✔️ 
ReduceMin  ✔️ 
ReduceProd  ✔️ 
ReduceSum  ✔️ 
ReduceSumSquare  ✔️ 
Relu  ✔️ 
Reshape  ✔️ 
Resize  ✔️ 
ReverseSequence  ✔️ 
RNN  Help wanted 
RoiAlign  ✔️ 
Round  ✔️ 
ScaleAndTranslate  ✔️ 
Scatter  ✔️ 
ScatterElements  ✔️ 
ScatterND  ✔️ 
Scan  Help wanted 
Selu  ✔️ 
SequenceAt  ✔️ 
SequenceConstruct  ✔️ 
SequenceEmpty  ✔️ 
SequenceErase  ✔️ 
SequenceInsert  ✔️ 
SequenceLength  ✔️ 
Shape  ✔️ 
Shrink  ✔️ 
Sigmoid  ✔️ 
Sign  ✔️ 
Sinh  ✔️ 
Sin  ✔️ 
Size  ✔️ 
Slice  ✔️ 
Softmax  ✔️ 
Softplus  ✔️ 
Softsign  ✔️ 
SpaceToDepth  ✔️ 
Split  ✔️ 
SplitToSequence  ✔️ 
Sqrt  ✔️ 
Squeeze  ✔️ 
STFT  Help wanted 
StringNormalizer  Help wanted 
Sub  ✔️ 
Sum  ✔️ 
Tanh  ✔️ 
Tan  ✔️ 
TfIdfVectorizer  Help wanted 
ThresholdedRelu  ✔️ 
Tile  ✔️ 
TopK  ✔️ 
Transpose  ✔️ 
Trilu  ✔️ 
Unique  ✔️ 
Unsqueeze  ✔️ 
Upsample  ✔️ 
Where  ✔️ 
Xor  ✔️ 
YOLOv7tiny with PostProcess (NMS) ONNX to TFLite Float32 https://github.com/PINTO0309/onnx2tf/releases/download/0.0.33/yolov7_tiny_head_0.768_post_480x640.onnx
onnx2tf vs onnx-tensorflow (Super redundant + Broken) 

YOLACTEdge MobileNetV2 with PostProcess (MultiClassNMS) ONNX to TFLite Float32 https://github.com/PINTO0309/onnx2tf/releases/download/1.0.11/yolact_edge_mobilenetv2_550x550.onnx
MoveNet MultiPose ONNX to TFLite Float32 (Cast and TrueDiv standard OP support)
https://github.com/PINTO0309/onnx2tf/releases/download/1.0.24/movenet_multipose_lightning_192x256_p6.onnx
ONNX file for testing. https://github.com/PINTO0309/onnx2tf/releases/tag/1.1.28
No.  Model  Pass 

1  age_googlenet.onnx  ✔️ 
2  alike_t_opset11_192x320.onnx  ✔️ 
3  arcfaceresnet1008.onnx  ✔️ 
4  baseline_simplified.onnx  ✔️ 
5  bvlcalexnet12.onnx  ✔️ 
6  caffenet12.onnx  ✔️ 
7  convtranspose_3_1_5_2.onnx  ✔️ 
8  convtranspose_4_5_2_2.onnx  ✔️ 
9  convtranspose_5_5_6_1.onnx  ✔️ 
10  convtranspose_6_5_5_8.onnx  ✔️ 
11  convtranspose_7_1_3_4.onnx  ✔️ 
12  damoyolo_tinynasL20_T_192x192_post.onnx  ✔️ 
13  deeplabv3_mobilenet_v3_large.onnx  ✔️ 
14  densenet12.onnx  ✔️ 
15  depth_to_spase_17.onnx  ✔️ 
16  digits.onnx  ✔️ 
17  detr_demo.onnx  ✔️ 
18  efficientformer_l1.onnx  ✔️ 
19  efficientdet_lite2_detection_1.onnx  ✔️ 
20  efficientnetlite411_nchw.onnx  ✔️ 
21  effnet_opset11_dynamic_axis.onnx  ✔️ 
22  emotionferplus8_rename.onnx  ✔️ 
23  face_detection_yunet_2022mar.onnx  ✔️ 
24  face_recognition_sface_2021decact_int8wt_int8quantized.onnx  ✔️ 
25  face_recognition_sface_2021dec.onnx  ✔️ 
26  faster_rcnn10.onnx  ✔️ 
27  fastestdet.onnx  ✔️ 
28  fused_conv_clip.onnx  ✔️ 
29  fused_conv_hardsigmoid.onnx  ✔️ 
30  fused_conv_leakyrelu.onnx  ✔️ 
31  fused_conv_relu.onnx  ✔️ 
32  fused_conv_sigmoid.onnx  ✔️ 
33  fused_conv_tanh.onnx  ✔️ 
34  gender_googlenet.onnx  ✔️ 
35  gmflowscale1mixdatatrain320x5764c3a6e9a_1x3x480x640_bidir_flow_sim.onnx  ✔️ 
36  handpose_estimation_mediapipe_2022may.onnx  ✔️ 
37  iat_llie_180x320.onnx  ✔️ 
38  if_p1_11.onnx  ✔️ 
39  if_p2_11.onnx  ✔️ 
40  if_p3_11.onnx  ✔️ 
41  imageclassifier.onnx  ✔️ 
42  inceptionv29.onnx  ✔️ 
43  inverse11.onnx  ✔️ 
44  mhformer_NxFxKxXY_1x27x17x2.onnx  ✔️ 
45  mnist12.onnx  ✔️ 
46  mobilenetv212.onnx  ✔️ 
47  mosaic_11.onnx  ✔️ 
48  mosaic9.onnx  ✔️ 
49  movenet_multipose_lightning_192x256_p6.onnx  ✔️ 
50  nanodetplusm_416.onnx  ✔️ 
51  object_tracking_dasiamrpn_kernel_cls1_2021nov.onnx  ✔️ 
52  object_tracking_dasiamrpn_kernel_r1_2021nov.onnx  ✔️ 
53  object_tracking_dasiamrpn_model_2021nov.onnx  ✔️ 
54  pidnet_S_cityscapes_192x320.onnx  ✔️ 
55  ppmattingv2_stdc1_human_480x640.onnx  ✔️ 
56  qlinear_conv_tensor_test.onnx  ✔️ 
57  rcnnilsvrc139.onnx  ✔️ 
58  regnet_x_400mf.onnx  ✔️ 
59  ResNet101DUC12.onnx  ✔️ 
60  resnet18v17.onnx  ✔️ 
61  resnet50v112.onnx  ✔️ 
62  resnet50v27.onnx  ✔️ 
63  retinanet9.onnx  ✔️ 
64  sinet_320_op.onnx  ✔️ 
65  squeezenet1.012.onnx  ✔️ 
66  superresolution10.onnx  ✔️ 
67  swinirm_64x64_12.onnx  ✔️ 
68  tinyyolov28.onnx  ✔️ 
69  versionRFB640.onnx  ✔️ 
70  vitb32_textual.onnx  ✔️ 
71  vitb32_visual.onnx  ✔️ 
72  yolact_edge_mobilenetv2_550x550.onnx  ✔️ 
73  yolact_regnetx_600mf_d2s_31classes_512x512.onnx  ✔️ 
74  yolact_regnetx_800mf_20classes_512x512.onnx  ✔️ 
75  yolo_free_nano_crowdhuman_192x320_post.onnx  ✔️ 
76  yolov7_tiny_head_0.768_post_480x640.onnx  ✔️ 
77  yolov8n.onnx  ✔️ 
78  yolov8nseg.onnx  ✔️ 
79  yolox_nano_192x192.onnx  ✔️ 
80  yolox_nano_416x416.onnx  ✔️ 
81  yolox_s.onnx  ✔️ 
82  yolox_x_crowdhuman_mot17_bytetrack.onnx  ✔️ 
83  zero_dce_640_dele.onnx  ✔️ 
84  zfnet51212.onnx  ✔️ 
[x] onnx-tensorflow is a very useful tool, but the performance of the generated TensorFlow models is significantly degraded due to the extrapolation of a large number of Transpose OPs before and after each OP during the format conversion from NCHW to NHWC. Therefore, I made this tool as a derivative of onnx-tensorflow that does not extrapolate Transpose.
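The layout permutation at the heart of this issue can be sketched with NumPy (an illustration only, not the tool's code): NCHW to NHWC is a single axis permutation, which this tool applies to the graph as a whole rather than wrapping every OP in a pair of Transposes.

```python
import numpy as np

# A dummy NCHW activation: batch=1, channels=3, height=2, width=2.
x_nchw = np.arange(12).reshape(1, 3, 2, 2)

# ONNX uses NCHW; TensorFlow/TFLite prefer NHWC. The conversion is one perm:
x_nhwc = np.transpose(x_nchw, (0, 2, 3, 1))
print(x_nhwc.shape)  # (1, 2, 2, 3)

# Round-tripping with the inverse perm restores the original layout, which is
# what redundant per-OP Transpose pairs effectively do over and over.
x_back = np.transpose(x_nhwc, (0, 3, 1, 2))
assert (x_back == x_nchw).all()
```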
[x] Most of the internal processing of the tool is written from scratch, but some of the more complex OPs have been adapted from onnx-tensorflow. I am very grateful to the engineers at IBM (International Business Machines Corporation) / LeapMind / Microsoft for developing onnx-tensorflow.
[x] I have incorporated all my knowledge of model optimization for other targets such as TFLite, EdgeTPU, TensorFlow.js and Myriad, based on my years of experience implementing openvino2tensorflow and tflite2tensorflow. It probably has the best model optimization performance and conversion efficiency of any tool I have created in the past, and the lowest rate of conversion errors.
[x] Supported layers list: Supported layers
[x] If you are having trouble with conversion errors, searching for resolved or open issues will almost always solve your problems. Issues are knowledge for engineers around the world.
[x] Contributors to this repository should first read Contribution Guide.
[x] All OPs are decomposed into primitive operations as much as possible. This is beneficial for lateral deployment of models to frameworks other than TFLite. Therefore, OPs belonging to tf.keras.layers are almost never used, and the tool consists only of tf.xxx (except for a very few OPs).
[x] As I do not want to add more dependent packages, I do not use tensorflow_addons (tfa), but replace it with standard TensorFlow OPs.
[x] Not only does it handle conversions of 4-dimensional inputs, such as NCHW to NHWC, but also inputs with 3, 5, or even more dimensions, e.g. NCDHW to NDHWC. However, since 1D, 2D, 3D and 6D inputs may produce patterns that are mechanically difficult to convert, parameters can be given to externally modify the tool's behavior. See Parameter replacement.
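For reference, the channel-last permutation generalizes mechanically to any rank. The helper below is a hypothetical illustration, not part of onnx2tf's API.

```python
import numpy as np

def nchw_like_to_nhwc_like_perm(ndim: int) -> tuple:
    """Hypothetical helper: move axis 1 (channels) to the end,
    e.g. NCHW -> NHWC (ndim=4) or NCDHW -> NDHWC (ndim=5)."""
    return (0, *range(2, ndim), 1)

x5 = np.zeros((1, 4, 2, 3, 5))               # NCDHW: C=4, D=2, H=3, W=5
perm = nchw_like_to_nhwc_like_perm(x5.ndim)  # (0, 2, 3, 4, 1)
assert perm == (0, 2, 3, 4, 1)
assert np.transpose(x5, perm).shape == (1, 2, 3, 5, 4)  # NDHWC
```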
[x] If there are undefined dimensions in the input OP, the model structure is not fully optimized and conversion errors are very likely to occur.
[x] Immediately following a Reshape OP that performs dimensional compression or decompression, there is a 95% probability that the model transformation will be disrupted and errors will occur. For example, patterns such as [1,200,200,5] -> [1,200,-1] or [10,20,30,40,50] -> [10,2,10,30,10,4,50] or Flatten. See #8 Not able to reshape input in replace.json, #15 Conv layer shape wrong, #18 Question about channel_transpose in common_functions.py, #105 [MobileFormer] Converted model outputs values mismatch with original ones, or #133 When Onnx Matmul inputs have different dimension.
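A minimal NumPy demonstration of why such Reshapes break (illustration only): flattening the same data in NCHW order and in NHWC order yields different element sequences, so a Reshape that ignores the layout change silently scrambles values.

```python
import numpy as np

x_nchw = np.arange(12).reshape(1, 3, 2, 2)   # NCHW
x_nhwc = np.transpose(x_nchw, (0, 2, 3, 1))  # NHWC, same values

# A Reshape that compresses dimensions reads elements in memory order,
# so the flattened sequences differ between the two layouts:
flat_nchw = x_nchw.reshape(1, -1)
flat_nhwc = x_nhwc.reshape(1, -1)
assert flat_nchw.shape == flat_nhwc.shape == (1, 12)
assert not (flat_nchw == flat_nhwc).all()  # same values, different order
```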
[x] TensorFlow's Convolution does not have an equivalent operation to ONNX's Padding attribute. Therefore, a Pad OP is inserted immediately before a Convolution with padding of size greater than 1.
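A rough NumPy sketch of this workaround (an illustration, not the tool's actual implementation): ONNX Conv pads become an explicit pad on the spatial axes, followed by a pad-free ("VALID") convolution.

```python
import numpy as np

x = np.ones((1, 4, 4, 3))  # NHWC input after layout conversion
onnx_pads = (1, 1, 1, 1)   # ONNX Conv pads: (top, left, bottom, right)

# Explicit Pad inserted before the convolution, spatial axes only:
top, left, bottom, right = onnx_pads
x_padded = np.pad(x, ((0, 0), (top, bottom), (left, right), (0, 0)))
assert x_padded.shape == (1, 6, 6, 3)

# A VALID 3x3 convolution on the padded input then matches the ONNX
# Conv with pads=(1,1,1,1): output spatial size = 6 - 3 + 1 = 4.
out_h = x_padded.shape[1] - 3 + 1
assert out_h == 4
```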
[x] Support conversion to TensorFlow saved model and TFLite (Float32/Float16/INT8).
[x] Files exceeding the Protocol Buffers file size limit of 2GB are not supported. Therefore, the ONNX external data format is not supported at the initial stage of tool creation.
[x] If there are ONNX OPs that are not supported by TensorFlow, use simple-onnx-processing-tools to replace them with harmless OPs in advance, and then use this tool to convert them. In other words, you can convert any model with enough effort.
[x] ONNX splitting, merging, generating OPs, rewriting OP attributes, BGR<->RGB conversion, converting to JSON and editing in an IDE, batch size changes for undefined dimensions, and various other processing can be done with simple-onnx-processing-tools. Therefore, it is recommended that models with very complex structures be converted to TFLite after modifying the structure beforehand.
[x] BatchNormalization supports only inference mode.
[x] LayerNormalization supports only inference mode.
[x] Only opset=11 or higher is supported.
[x] If you do not like the generated TFLite OP name, edit it using tflite2json2tflite.
[x] The generated Keras models cannot be used for retraining. If you want to train, you must build your own model.
[x] When converting to TensorFlow.js, CoreML, etc., please generate saved_model with the output_signaturedefs option and use the generated saved_model to convert with the various converters: tensorflowjs_converter, coremltools, edgetpu_compiler, etc. If this option is not enabled, saved_model records only the minimum necessary information and its size is minimized. When this option is enabled, saved_model records the maximum amount of information; its size is larger, but the output is in a format that supports conversion to other frameworks. It can also be used for serving.
[x] There are many OPs in ONNX that TFLite/EdgeTPU/TFJS/CoreML/TensorRT do not support. Therefore, if you need to generate an EdgeTPU model, please specify the replace_to_pseudo_operators option to convert your model. onnx2tf will attempt to replace the OP with a TFLite/EdgeTPU/TFJS/CoreML/TensorRT-compatible OP whenever possible.
[x] The main factors that cause accuracy degradation after model conversion include differences in operator behavior between ONNX and TensorFlow, such as differences in scale when resizing images. These differences often cannot be dealt with by simply converting the model in a straightforward manner. Therefore, you need to replace the affected operations yourself in advance with operations that are less prone to errors.
[x] Supports INT8 Quantization, Full INT8 Quantization, INT8 Quantization with INT16 activation, Full INT8 Quantization with INT16 activation, and Dynamic Range Quantization.
[x] Supports Per-Channel Quantization and Per-Tensor Quantization.
[x] Implemented workarounds and replacements include: GroupConvolution; avoiding TrueDiv (INT) where possible, since some runtimes do not support it; a Resize process for 5D tensors; and replacing Asin, Acos, Atan, Abs, GatherND, HardSwish, GridSample, PRelu, LeakyRelu, Power, Neg, ArgMax and Erf with pseudo-OP equivalents.
[x] The batch size N of the input can be changed to a specified number with the overwrite_input_shape option.
Made with contrib.rocks.