SGS Post-Processing Module

Description

The SigmaStar post-processing module is located at SGS_IPU_SDK/Scripts/postprocess.

This module mainly uses the TFLitePostProcess class to implement a set of APIs for generating a TFLite FlatBuffer, together with a general method for generating the BBOX post-processing of detection networks. To use this module, write a Python post-processing file for your post-processing method and run it to generate a standalone post-processing model file, then use the network connection flow to connect the backbone network model and the post-processing model into a single network model file. For details on how to write the Python file, see the examples in the directory SGS_IPU_SDK/Scripts/postprocess/postprocess_method. To generate the post-processing network model file after the Python file is written, follow the steps below:

  1. Save the written file in SGS_IPU_SDK/Scripts/postprocess/postprocess_method, and add the saved filename to SGS_IPU_SDK/Scripts/postprocess/postprocess_method/__init__.py. Taking the caffe_yolo_v2_postprocess.py file as an example, you can enter the following command in the directory SGS_IPU_SDK/Scripts/postprocess/:

    python3 postprocess.py -n caffe_yolo_v2_postprocess
    
  2. After writing the post-processing network Python file, run postprocess.py with the -n/--model_name parameter set to the name or path of the post-processing Python file, as shown above. Then connect the networks with the tool SGS_IPU_SDK/bin/concat_net. The input name of the post-processing network must match the output name of the backbone network; otherwise, an error will occur when connecting the network models.

    The parameters of concat_net are as follows:

    --mode: Network connection mode, concat or append. To connect the backbone network with the post-processing network, please use append mode.

    --input_config: Path of the input_config.ini file. This must be the configuration file of the complete (concatenated) network, which is identical to the backbone network's configuration file except for the output names.

    --model1: Backbone network model sim path.

    --model2: Post-processing network model sim path.

    --output: Output path of the synthetic network model.
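
    For example, a typical invocation might look like the following (the .sim file names and paths here are placeholders for your own backbone and post-processing models, not fixed names):

        SGS_IPU_SDK/bin/concat_net \
            --mode append \
            --input_config ./input_config.ini \
            --model1 ./backbone_float.sim \
            --model2 ./caffe_yolo_v2_postprocess.sim \
            --output ./concat_float.sim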

    The following sections describe in detail how to write the Python file, in both the encapsulated post-processing flow and the customized post-processing flow, to generate the post-processing network model file.

Usage of BBox

For ease of use, the BBox coordinate extraction performed in the post-processing of networks such as SSD, YOLOv1, YOLOv2, and YOLOv3 has been analyzed and encapsulated into a user-friendly decoding process. These networks share the same decoding structure; only some of the operators and the anchor parameters differ. Therefore, by configuring the config dictionary variables, you can obtain the post-processing network model for the BBox coordinates. The BBox coordinate decoding network is shown in the following figure:

You can modify the config dictionary variables of the generated BBox coordinate decoding network model. Below is a description of the parameters used in the variables:

Each entry below is listed as parameter name (parameter type): description.

shape ([int]): BBox tensor shape, e.g. [1,837].

tx_func ((tflite.BuiltinOperator, str)): The first item is a TFLite built-in operator type. The second item is the string x_scale or None: if the operator in the first item is a single-constant-operand operator, set it to None; if it is a dual-port operator, set it to x_scale and specify its value in the x_scale member variable.

ty_func ((tflite.BuiltinOperator, str)): The first item is a TFLite built-in operator type. The second item is the string y_scale or None: if the operator in the first item is a single-constant-operand operator, set it to None; if it is a dual-port operator, set it to y_scale and specify its value in the y_scale member variable.

tw_func ((tflite.BuiltinOperator, str)): The first item is a TFLite built-in operator type. The second item is the string w_scale or None: if the operator in the first item is a single-constant-operand operator, set it to None; if it is a dual-port operator, set it to w_scale and specify its value in the w_scale member variable.

th_func ((tflite.BuiltinOperator, str)): The first item is a TFLite built-in operator type. The second item is the string h_scale or None: if the operator in the first item is a single-constant-operand operator, set it to None; if it is a dual-port operator, set it to h_scale and specify its value in the h_scale member variable.

x_scale (float): Value specified when tx_func[1] is x_scale.

y_scale (float): Value specified when ty_func[1] is y_scale.

w_scale (float): Value specified when tw_func[1] is w_scale.

h_scale (float): Value specified when th_func[1] is h_scale.

anchor_selector (str): constant or None. Specifies whether pw and ph are constant or generated by pw_func and ph_func.

pw ([float]): When anchor_selector is constant, pw is specified as a float list.

ph ([float]): When anchor_selector is constant, ph is specified as a float list.

ppw ([float]): When anchor_selector is constant, ppw is specified as a float list.

pph ([float]): When anchor_selector is constant, pph is specified as a float list.

px ([float]): px is specified as a float list.

py ([float]): py is specified as a float list.

sx ([float]): sx is specified as a float list.

sy ([float]): sy is specified as a float list.

sw ([float]): sw is specified as a float list.

sh ([float]): sh is specified as a float list.
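
An illustrative sketch of the two tx_func cases (not taken from any particular model; it assumes the tflite schema module is imported as in the example at the end of this document, and that config is the dictionary described above):

# Dual-port operator (MUL here is an assumption for illustration): set the second
# item to "x_scale" and give the x_scale entry a value.
config["tx_func"] = (tflite.BuiltinOperator.BuiltinOperator().MUL, "x_scale")
config["x_scale"] = 0.1

# Single-constant-operand operator (e.g. LOGISTIC, as in the YOLO v2 example below): use None.
config["tx_func"] = (tflite.BuiltinOperator.BuiltinOperator().LOGISTIC, None)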

TFLite FlatBuffer

buildBuffer

buildBuffer(buffer_name, buffer_data=None)

buildBuffer is used to build a buffer.

:param buffer_name: A string used in the code to identify the buffer; it is not stored in the model.

:param buffer_data: If the created buffer is for a variable tensor, keep the default value None. If the created buffer is for a constant tensor, pass the data as a byte stream.

:return: Returns the offset after encoding.
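
A minimal sketch of both cases, where sgs_builder stands for a TFLitePostProcess instance (as in the example at the end of this document) and the constant buffer name is a hypothetical placeholder:

import struct

# Buffer for a variable tensor: no data, keep the default None.
sgs_builder.buildBuffer("NULL")

# Buffer for a constant tensor: pack the values into a byte stream first.
const_data = bytearray()
for value in [0.1, 0.2, 0.3]:
    const_data += bytearray(struct.pack("f", value))
sgs_builder.buildBuffer("my_const_buffer", const_data)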

buildTensor

buildTensor(shape, name, buffer=0, type=tflite.TensorType.TensorType().FLOAT32)

buildTensor is used to build a tensor.

:param shape: The tensor shape, given as an int list ([int]).

:param name: String identifying the name of the created tensor.

:param buffer: An int index into the buffer array.

:param type: Tensor type tflite.TensorType, default is FLOAT32.

:return: Returns the index of the created tensor in the subgraph's tensor array; if a tensor with the same name already exists, its index is returned directly.
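
For example, continuing the buildBuffer sketch above (the tensor names are hypothetical), a variable tensor uses the default buffer index 0, while a constant tensor references the buffer created for it:

sgs_builder.buildTensor([1,845], "my_variable_tensor")
sgs_builder.buildTensor([3], "my_const_tensor", sgs_builder.getBufferByName("my_const_buffer"))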

buildOperatorCode

buildOperatorCode(opcode_name, builtin_code, custom_code=None)

buildOperatorCode is used to build an OperatorCode or return an OperatorCode already created.

:param opcode_name: A string name used by the user to identify, record, and distinguish operators. The implementation ensures that only one OperatorCode of a given type exists in the OperatorCode array.

:param builtin_code: tflite.BuiltinOperator type, i.e. the built-in operator type.

:param custom_code: User-specified custom string tag.

:return: Returns the index of the OperatorCode.
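
For example, following the usage shown later in this document, one built-in OperatorCode and one custom OperatorCode:

sgs_builder.buildOperatorCode("SGS_score_mul", tflite.BuiltinOperator.BuiltinOperator().MUL)
sgs_builder.buildOperatorCode("SGS_nms", tflite.BuiltinOperator.BuiltinOperator().CUSTOM, 'TFLite_Detection_NMS')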

buildOperator

buildOperator(op_code_name, input_names, output_names, builtin_options_type=None, builtin_options=None, custom_options=None, is_custom=False)

buildOperator is used to build an operator for use in a subgraph.

:param op_code_name: The OperatorCode identifier specified in buildOperatorCode; the name is used to look up the index of the corresponding OperatorCode.

:param input_names: List of [str] input tensor names.

:param output_names: List of [str] output tensor names.

:param builtin_options_type: tflite.BuiltinOptions type. If the operator requires built-in options, specify the type of the operator's options here.

:param builtin_options: int type; the FlatBuffer offset of the operator options corresponding to builtin_options_type, created with APIs such as createReshapeOptions. Currently, TFLitePostProcess.py implements FlatBuffer encoding for only a few option types; for options not yet implemented, refer to createReshapeOptions to implement a new method.

:param custom_options: A bytearray ([byte]) produced by FlexBuffer encoding. For a custom operator, specify the parameters here; the corresponding operator will parse its own parameters.

:param is_custom: Indicates whether it is a custom operator, default is False.

:return: Returns the index of the operators in the subgraph.
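
For an operator that needs no options, only the OperatorCode name and the input/output tensor name lists are required. A minimal sketch, reusing the names from the Mul example later in this document:

sgs_builder.buildOperator("SGS_score_mul",
                          ["confidence_tensor", "score0_tensor"],
                          ["SGS_score1"])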

buildSubGraph

buildSubGraph(input_tensor_names, output_tensor_names, subgraph_name)

buildSubGraph is used to build a subgraph, compiling the previously created content into the subgraph.

:param input_tensor_names: [str] type. The list of subgraph input tensor names, which must correspond to the names of tensors built by buildTensor.

:param output_tensor_names: [str] type. The list of subgraph output tensor names, which must correspond to the names of tensors built by buildTensor.

:param subgraph_name: str type, specifying a name to identify a subgraph.

:return: Returns the subgraph FlatBuffer offset.
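
For example, as in the YOLO v2 example at the end of this document (nms_out_tensors is the list of output tensor names built earlier; all names must already have been created with buildTensor):

sgs_builder.subgraphs.append(
    sgs_builder.buildSubGraph(['conv23'], nms_out_tensors, 'caffe_yolo_v2'))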

createModel

createModel(version, operator_codes, subgraphs, description, buffers, metadata_buffer=None)

createModel is used to encode all encoded data into a complete TFLite FlatBuffer.

:param version: UINT; the TFLite version. Simply pass 3.

:param operator_codes: [OperatorCode]; OperatorCode list, created by buildOperatorCode and stored in TFLitePostProcess.operator_codes.

:param subgraphs: [SubGraph]; SubGraph list, stored in TFLitePostProcess.subgraphs.

:param description: String; user-specified descriptive string.

:param buffers: [Buffer]; buffer list, created by buildBuffer and stored in TFLitePostProcess.buffers list.

:param metadata_buffer: [int]; not currently used, defaults to None.

:return: Returns the complete TFLite FlatBuffer handle created.
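
For example, as in the YOLO v2 example at the end of this document:

sgs_builder.model = sgs_builder.createModel(3,
                                            sgs_builder.operator_codes,
                                            sgs_builder.subgraphs,
                                            'caffe_yolo_v2',
                                            sgs_builder.buffers)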

createFlexBuffer

createFlexBuffer(values)

createFlexBuffer encodes the operator parameters used when the OperatorCode type is tflite.BuiltinOperator.BuiltinOperator().CUSTOM.

:param values: A list of tuples; each tuple is of type (str, int/float, str):

  • The first item is the value name, used by the implementer of the operator to parse the value.
  • The second item is the value.
  • The third item is the value type string, to identify the type of the second item. If the second item is int, input 'int' here; if the second item is float, input 'float' here. Currently, only float and int types are supported.

:return: Returns the encoded bytearray.

Example:

cus_options = [(b"input_coordinate_x1",0,"int"),
                (b"input_coordinate_y1",1,"int"),
                (b"input_coordinate_x2",2,"int"),
                (b"input_coordinate_y2",3,"int"),
                (b"nms_score_threshold",0.4,"float"),
                (b"nms_iou_threshold",0.45,"float")]
options = sgs_builder.createFlexBuffer(cus_options)

buildBoxDecoding

buildBoxDecoding(unpacked_box)

buildBoxDecoding builds the BBox coordinate decoding network for the BBox coordinates that have been separated from the backbone network output.

:param unpacked_box: List of the input BBox tensor names, 4 tensors in total.

:return: Returns a list of the four decoded tensor names x1, y1, x2, and y2.
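
A sketch of a typical call (the four input tensor names are hypothetical placeholders for the unpacked BBox tensors):

decoded_tensors = sgs_builder.buildBoxDecoding(["x_tensor", "y_tensor", "w_tensor", "h_tensor"])
# decoded_tensors is a list of the four decoded tensor names: x1, y1, x2, y2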

SigmaStar Custom Post-Processing Operator

The OperatorCode type of the SigmaStar custom post-processing operators is tflite.BuiltinOperator.BuiltinOperator().CUSTOM. Therefore, you need to use the createFlexBuffer API to pass their parameters, which must be given as tuples of type (str, int/float, str) containing the three items described above.

PostProcess_Unpack

The PostProcess_Unpack operator divides the backbone network outputs into a maximum of 7 branches. The usage is as follows:

cus_options = [(b"x_offset",0,"int"), 
                (b"x_lengh",1,"int"), 
                (b"y_offset",1,"int"), 
                (b"y_lengh",1,"int"), 
                (b"w_offset",2,"int"), 
                (b"w_lengh",1,"int"), 
                (b"h_offset",3,"int"), 
                (b"h_lengh",1,"int"), 
                (b"confidence_offset",0,"int"), 
                (b"confidence_lengh",0,"int"), 
                (b"scores_offset",0,"int"), 
                (b"scores_lengh",0,"int"), 
                (b"max_score",0,"int")]

Please modify the second element of each tuple according to the network used. If a branch is not required, set its offset and lengh parameters to 0.

x_offset: The x offset of the branched coordinates

x_lengh: Coordinate x length, generally 1

y_offset: The y offset of the branched coordinates

y_lengh: Coordinate y length, generally 1

w_offset: The w offset of the branched coordinates

w_lengh: Coordinate w length, generally 1

h_offset: The h offset of the branched coordinates

h_lengh: Coordinate h length, generally 1

confidence_offset: Branched confidence offset

confidence_lengh: Confidence length, generally 1

scores_offset: Branched scores offset

scores_lengh: Scores length, indicating the number of classes of the network

max_score: Generally 1
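
The general pattern for attaching this operator follows the custom-operator flow shown later in this document: register a CUSTOM OperatorCode, encode the parameters above with createFlexBuffer, and pass them to buildOperator. The sketch below is illustrative only; the OperatorCode name, the custom code string 'PostProcess_Unpack', and the tensor name lists are assumptions:

unpack_options = sgs_builder.createFlexBuffer(cus_options)
sgs_builder.buildOperatorCode("SGS_unpack",
                              tflite.BuiltinOperator.BuiltinOperator().CUSTOM,
                              'PostProcess_Unpack')
sgs_builder.buildOperator("SGS_unpack", unpack_in_tensors, unpack_out_tensors,
                          None, None, unpack_options)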

By using different parameter configurations in combination with the BBox coordinate decoding module, the PostProcess_Unpack operator can implement the network post-processing in different ways, for example by:

  1. Separating BBox coordinates:

  2. Separating BBox coordinates, confidence, scores, and max_score:

TFLite_Detection_NMS

The TFLite_Detection_NMS operator, working together with the PostProcess_Unpack operator, encapsulates the NMS operation into a single operator. A maximum of 7 inputs and 4 or 5 outputs are supported. The usage is as follows:

cus_options = [(b"input_coordinate_x1",1,"int"), 
                (b"input_coordinate_y1",0,"int"), 
                (b"input_coordinate_x2",3,"int"), 
                (b"input_coordinate_y2",2,"int"), 
                (b"input_class_idx",5,"int"), 
                (b"input_score_idx",4,"int"), 
                (b"input_confidence_idx",-1,"int"), 
                (b"input_facecoordinate_idx",-1,"int"), 
                (b"output_detection_boxes_idx",0,"int"), 
                (b"output_detection_classes_idx",1,"int"), 
                (b"output_detection_scores_idx",2,"int"), 
                (b"output_num_detection_idx",3,"int"), 
                (b"output_detection_boxes_index_idx",-1,"int"), 
                (b"nms",0,"float"), 
                (b"clip",0,"float"), 
                (b"max_detections",10,"int"), 
                (b"max_classes_per_detection",1,"int"), 
                (b"detections_per_class",1,"int"), 
                (b"num_classes",90,"int"), 
                (b"bmax_score",0,"int"), 
                (b"num_classes_with_background",1,"int"), 
                (b"nms_score_threshold",9.99999994e-09,"float"), 
                (b"nms_iou_threshold",0.600000024,"float")]

Please modify the second parameter of each line according to the network used. If any parameter is not required, set the corresponding value to -1.

input_coordinate_x1: Input index corresponding to the PostProcess_Unpack operator's x_offset branch

input_coordinate_y1: Input index corresponding to the PostProcess_Unpack operator's y_offset branch

input_coordinate_x2: Input index corresponding to the PostProcess_Unpack operator's w_offset branch

input_coordinate_y2: Input index corresponding to the PostProcess_Unpack operator's h_offset branch

input_class_idx: Input index of the corresponding class branch

input_score_idx: Input index corresponding to the PostProcess_Unpack operator's scores branch

input_confidence_idx: Input index corresponding to the PostProcess_Unpack operator's confidence branch

input_facecoordinate_idx: Default is -1

output_detection_boxes_idx: Output index of the detected BBox coordinates

output_detection_classes_idx: Output index of the class of each detection

output_detection_scores_idx: Output index of the score of each detection

output_num_detection_idx: Output index of the number of detected targets

output_detection_boxes_index_idx: Output index of the detection index information

nms: 0 for Fast NMS, 1 for Normal NMS

clip: Whether to clip out-of-bound BBox coordinates. 1 to clip, 0 to keep them unchanged.

max_detections: Max. number of output targets

max_classes_per_detection: Default is 1

detections_per_class: Default is 1

num_classes: Number of network model classes (not including background, which is applicable only to SSD post-processing configuration)

bmax_score: 1 if the PostProcess_Unpack operator's max_score branch is used, 0 if not.

num_classes_with_background: Default is 1

nms_score_threshold: NMS score threshold

nms_iou_threshold: NMS IoU threshold

Note

The TFLite_Detection_NMS operator supports a maximum of 24576 BBoxes.

Deciding whether to let NMS output index information

NMS can have either 4 or 5 outputs. Among them, the 4 outputs output_detection_boxes_idx, output_detection_classes_idx, output_detection_scores_idx, and output_num_detection_idx are mandatory, whereas output_detection_boxes_index_idx is optional. To enable the output_detection_boxes_index_idx output, modify the post-processing Python file as illustrated below. The following takes the ssd_mobilenet_v1 model post-processing as an example. For the complete code, see

SGS_IPU_SDK/Scripts/postprocess/postprocess_method/ssd_mobilenet_v1_index_postprocess.py

To create the "detectionIndex" Tensor after the "numDetections" Tensor, use the following code:

sgs_builder.buildTensor(model_config["out_shapes"][3],"numDetections") 
nms_out_tensors.append("numDetections") 

sgs_builder.buildTensor(model_config["out_shapes"][4],"detectionIndex") 
nms_out_tensors.append("detectionIndex")

cus_code = 'TFLite_Detection_NMS' 
sgs_builder.buildOperatorCode("SGS_nms",tflite.BuiltinOperator.BuiltinOperator().CUSTOM,cus_code)

To modify the TFLite_Detection_NMS operator parameters, set output_detection_boxes_index_idx to 4:

cus_options = [(b"input_coordinate_x1",1,"int"), 
                (b"input_coordinate_y1",0,"int"), 
                (b"input_coordinate_x2",3,"int"), 
                (b"input_coordinate_y2",2,"int"), 
                (b"input_class_idx",5,"int"),
                (b"input_score_idx",4,"int"), 
                (b"input_confidence_idx",-1,"int"), 
                (b"input_facecoordinate_idx",-1,"int"), 
                (b"output_detection_boxes_idx",0,"int"), 
                (b"output_detection_classes_idx",1,"int"), 
                (b"output_detection_scores_idx",2,"int"), 
                (b"output_num_detection_idx",3,"int"),
                (b"output_detection_boxes_index_idx",4,"int"), 
                (b"nms",0,"float"), 
                (b"clip",0,"float"), 
                (b"max_detections",10,"int"), 
                (b"max_classes_per_detection",1,"int"), 
                (b"detections_per_class",1,"int"), 
                (b"num_classes",90,"int"), 
                (b"bmax_score",0,"int"), 
                (b"num_classes_with_background",1,"int"), 
                (b"nms_score_threshold",9.99999994e-09,"float"), 
                (b"nms_iou_threshold",0.600000024,"float")]

To create the network model output Tensor names:

network_out_tensors = [] 
network_out_tensors.append("detectionBoxes") 
network_out_tensors.append("detectionClasses") 
network_out_tensors.append("detectionScores") 
network_out_tensors.append("numDetections") 
network_out_tensors.append("detectionIndex") 
sgs_builder.subgraphs.append(sgs_builder.buildSubGraph(model_config["input"],network_out_tensors,model_config["name"]))

Modify the model configuration parameter out_shapes to add the shape of the additional detection index output:

model_config = {"name":"ssdlite_mobilenet_v2",
                 "input" : ["Squeeze","convert_scores"],
                 "input_shape" : [[1,1917,4],[1,1917,91]],
                 "shape" : [1,1917],
                 "out_shapes" : [[1,10,4],[1,10],[1,10],[1],[1,10]]}

After the above modifications, the NMS of the generated post-processing model will have 5 outputs.

Note

Because an additional output Tensor is present, be sure to modify the outputs in the input_config.ini file before connecting the networks, so as to avoid unexpected errors during the connection.

Getting Offline Anchor Data

If the data in the PriorBox nodes of a Caffe network is generated offline, you can obtain it with the method described below. For details, please refer to SGS_Models/caffe/caffe_ssd_mobilenet_v1. To convert a backbone network containing PriorBox nodes, use Netron to open the network prototxt file, as illustrated below:

Please modify the corresponding input_config.ini file and the ConvertTool convert command; the backbone network has three outputs. After the conversion is done, all PriorBox nodes are merged into a single node. To save the anchor data as a .npy file, open the generated backbone network with Netron, click the mbox_priorbox node, and then click the save button shown in the red frame. When configuring the BBox coordinate decoding module, load the .npy file with numpy.load and use it to configure the corresponding variables. If the anchor data is already available, you can use it directly without going through the method described here.
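
A minimal sketch of loading the saved anchor data (the file name mbox_priorbox.npy is a placeholder; the layout of the saved array depends on your network, so inspect it before mapping it onto the config variables):

import numpy as np

priorbox = np.load("mbox_priorbox.npy")  # anchor data saved from Netron
print(priorbox.shape)                    # check the layout before filling pw, ph, px, py, etc.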

When writing the post-processing Python file, give the model configuration three inputs, adding mbox_priorbox and its shape. See the following for the code details:

SGS_IPU_SDK/Scripts/postprocess/postprocess_method/caffe_ssd_mobilenet_v1_postprocess.py

model_config = {"name":"caffe_ssd_mobilenet_v1", 
                "input" : ["mbox_loc","mbox_conf_softmax","mbox_priorbox"], 
                "input_shape" : [[1,1917,4],[1,1917,21],[1917,4]], 
                "shape" : [1,1917], 
                "out_shapes" : [[1,10,4],[1,10],[1,10],[1]]}

Once the post-processing model is generated, connecting it with the concat_net tool will automatically remove the mbox_priorbox node.

Example

The following example is based on the caffe_yolo_v2 model post-processing. For the detailed code, see

SGS_IPU_SDK/Scripts/postprocess/postprocess_method/caffe_yolo_v2_postprocess.py

Creating a TFLitePostProcess Instance

Before creating a TFLitePostProcess instance, configure the config dictionary variables, setting the various config parameters according to the actual calculation method of the BBox coordinate decoding.

The configuration parameters include:

box_num = 5 
side_x = 13 
side_y = 13 
ppw = anchor.ones(845) 
px = anchor.index_div_linear(1,1,0,box_num,side_x,side_y) 
pph = anchor.ones(845) 
py = anchor.index_div_linear(1,1,0,side_x*box_num,side_y,1) 
pw = anchor.ones(845) 
ph = anchor.ones(845) 
sx = anchor.ns(845,1.0/13) 
sy = anchor.ns(845,1.0/13) 
biases = [[1.3221,1.73145],[3.19275,4.00944],[5.05587,8.09892],[9.47112,4.84053],[11.2364,10.0071]] 
sw = [x[0]/(2*13) for x in biases]*(13*13) 
sh = [x[1]/(2*13) for x in biases]*(13*13)

To configure the config dictionary variables:

config = {"shape" : [1,845],
         "tx_func" : (tflite.BuiltinOperator.BuiltinOperator().LOGISTIC,None),                       
         "ty_func" : (tflite.BuiltinOperator.BuiltinOperator().LOGISTIC,None), 
         "tw_func" : (tflite.BuiltinOperator.BuiltinOperator().RESHAPE,None), 
         "th_func" : (tflite.BuiltinOperator.BuiltinOperator().RESHAPE,None),
         "x_scale" : 0.1, 
         "y_scale" : 0.1, 
         "w_scale" : 1, 
         "h_scale" : 1, 
         "anchor_selector" : "constant", 
         "pw" : pw, 
         "ph" : ph, 
         "pw_func" : (None,None), 
         "ph_func" : (None,None), 
         "ppw" : ppw, 
         "px" : px, 
         "pph" : pph, 
         "py" : py, 
         "sx" : sx, 
         "sy" : sy, 
         "sw" : sw, 
         "sh" : sh
         }

To create a TFLitePostProcess instance:

yolov2 = TFLitePostProcess(config)

Creating a Constant Tensor

To pack the float list into bytearray:

py_vector=[] 
for value in self.py: 
    py_vector += bytearray(struct.pack("f", value))

To use the bytearray to create a constant buffer:

self.buildBuffer("py_buffer",py_vector)

To use the constant buffer to create a tensor:

self.buildTensor([len(self.py)],"py_tensor",self.getBufferByName("py_buffer"))

Creating an Operator

To create a dual-port Mul operator:

score1_out_tensors = [] 
score1_in_tensors = [] 
score1_in_tensors.append("confidence_tensor") 
score1_in_tensors.append("score0_tensor") 
sgs_builder.buildTensor([1,845], "SGS_score1")
score1_out_tensors.append("SGS_score1") 
sgs_builder.buildOperatorCode("SGS_score_mul",tflite.BuiltinOperator.BuiltinOperator().MUL) 
sgs_builder.buildOperator("SGS_score_mul",score1_in_tensors,score1_out_tensors)

To create the constant tensor (the target shape) required when creating a Reshape operator:

reshape_out_shape1 = [1,4695,4] 
reshape_out_tensors1 = [] 
reshape_in_tensors1 = [] 
sgs_builder.buildBuffer('NULL') 
sgs_builder.buildTensor([1,4695,1,4], '283_in') 
reshape_in_tensors1.append('283_in') 
reshape_vector1 = [] 
for value in reshape_out_shape1: 
    reshape_vector1 += bytearray(struct.pack("i", value)) 
sgs_builder.buildBuffer("reshape_vector1",reshape_vector1) 
sgs_builder.buildTensor([len(reshape_out_shape1)],"reshape_shape1",sgs_builder.getBufferByName("reshape_vector1"),tflite.TensorType.TensorType().INT32) 
reshape_in_tensors1.append("reshape_shape1") 
sgs_builder.buildTensor(reshape_out_shape1,"reshape_tensor1") 
reshape_out_tensors1.append("reshape_tensor1") 
sgs_builder.buildOperatorCode("SGS_reshape1",tflite.BuiltinOperator.BuiltinOperator().RESHAPE) 
reshape_newshape1 = sgs_builder.createReshapeOptions(reshape_out_shape1) 
sgs_builder.buildOperator("SGS_reshape1",reshape_in_tensors1, reshape_out_tensors1,tflite.BuiltinOptions.BuiltinOptions().ReshapeOptions,reshape_newshape1)

Creating a Custom Operator

To create an OperatorCode:

sgs_builder.buildOperatorCode("SGS_nms",tflite.BuiltinOperator.BuiltinOperator().CUSTOM,cus_code) 
cus_options = [(b"input_coordinate_x1",0,"int"), 
                (b"input_coordinate_y1",1,"int"), 
                (b"input_coordinate_x2",2,"int"), 
                (b"input_coordinate_y2",3,"int"), 
                (b"input_class_idx",6,"int"), 
                (b"input_score_idx",5,"int"), 
                (b"input_confidence_idx",4,"int"), 
                (b"input_facecoordinate_idx",-1,"int"), 
                (b"output_detection_boxes_idx",0,"int"), 
                (b"output_detection_classes_idx",1,"int"), 
                (b"output_detection_scores_idx",2,"int"),
                (b"output_num_detection_idx",3,"int"), 
                (b"output_detection_boxes_index_idx ",-1,"int"), 
                (b"nms",0,"int"), 
                (b"clip",0,"int"), 
                (b"max_detections",100,"int"), 
                (b"max_classes_per_detection",1,"int"), 
                (b"detections_per_class",1,"int"),
                 (b"num_classes",20,"int"), 
                (b"bmax_score",1,"int"), 
                (b"num_classes_with_background",1,"int"), 
                (b"nms_score_threshold",0.4,"float"), 
                (b"nms_iou_threshold",0.45,"float")]

To create a FlexBuffer:

options = sgs_builder.createFlexBuffer(cus_options)

To use the FlexBuffer to create an Operator:

sgs_builder.buildOperator("SGS_nms", nms_in_tensors, nms_out_tensors, None, None, options)

Creating a Model and Saving the Model File

To create a subgraph:

sgs_builder.subgraphs.append(sgs_builder.buildSubGraph(['conv23'],nms_out_tensors,'caffe_yolo_v2'))

To create a model:

sgs_builder.model = sgs_builder.createModel(3, sgs_builder.operator_codes, 
sgs_builder.subgraphs, 'caffe_yolo_v2', sgs_builder.buffers) 
file_identifier = b'TFL3' 
sgs_builder.builder.Finish(sgs_builder.model, file_identifier)

To output the model:

buf = sgs_builder.builder.Output()

To save the model to a file:

outfilename = 'caffe_yolo_v2_postprocess.sim' 
with open(outfilename, 'wb') as f:
    f.write(buf)