Calibrator

Usage

The Calibrator tool is located at ~/SGS_IPU_SDK/Scripts/calibrator/calibrator.py.

This tool converts the SigmaStar floating-point network model into a SigmaStar fixed-point network model.

Run the following in the ~/SGS_IPU_SDK directory (this step can be skipped if it has already been done):

$ cd ~/SGS_IPU_SDK 
$ source cfg_env.sh

Enter the working directory for this tool. Here is a usage example:

python3 calibrator.py \ 
-i ~/SGS_Models/resource/detection/coco2017_calibration_set32 \ 
-m ~/SGS_Models/tensorflow/ssd_mobilenet_v1/ssd_mobilenet_v1_float.sim \ 
-c Detection \ 
-n ~/SGS_Models/tensorflow/ssd_mobilenet_v1/ssd_mobilenet_v1.py \ 
--input_config ~/SGS_Models/tensorflow/ssd_mobilenet_v1/input_config.ini \ 
--num_process 20

Alternatively, you can run the tool with a file that lists the image paths:

python3 calibrator.py \ 
-i ~/SGS_Models/resource/detection/file.list \ 
-m ~/SGS_Models/tensorflow/ssd_mobilenet_v1/ssd_mobilenet_v1_float.sim \ 
-c Detection \ 
-n ~/SGS_Models/tensorflow/ssd_mobilenet_v1/ssd_mobilenet_v1.py \ 
--input_config ~/SGS_Models/tensorflow/ssd_mobilenet_v1/input_config.ini \ 
--num_process 20

The tool parameters are described in detail below.

Mandatory Parameter

  • -i, --image: Path to an image file, an image folder, or a file listing image paths.

  • -m, --model: Floating-point network model file path.

  • -c, --category: Category of the model, mainly including Classification, Detection, and Unknown:

    • Classification: The model has one output; the top 5 output scores are printed from high to low.
    • Detection: The model has four outputs, which are converted to the bbox positions and categories for the input image. Note that only the SigmaStar post-processing method is supported; see SGS Post-Processing Module for details. Use 'Unknown' if any other post-processing method is applied.
    • Unknown: The model output does not belong to the above two types; all Tensor values are output.
  • --input_config: Path to input_config.ini file, which contains the input Tensor configuration information. For details, see INPUT_CONFIG.

  • -n, --preprocess: Pre-processing method, related to the Image Pre-Processing Method described below. You can also specify the path of a pre-processing file after writing one. Without this parameter, raw (original) data must be supplied instead of images; you can use --save_input to save image data and then prepare the other raw data in the same format.

Note

  • If the model has multiple inputs, the -n,--preprocess parameter requires multiple pre-processing files separated by commas, for example -n preprocess1.py,preprocess2.py or --preprocess preprocess1.py,preprocess2.py.

  • When the -i/--image parameter is given a file listing image paths, the file contents should be as follows:

    For single-input network model:

    ~/SGS_IPU_SDK/image_test/2007000364.jpg
    ~/SGS_IPU_SDK/image_test/ILSVRC2012_test_00000002.bmp
    ...
    

    For multi-input network model:

    ~/SGS_IPU_SDK/image_test/2007000364.jpg,~/SGS_IPU_SDK/image_test/ILSVRC2012_test_00000002.bmp
    ~/SGS_IPU_SDK/image_test/2007000365.jpg,~/SGS_IPU_SDK/image_test/ILSVRC2012_test_00000003.bmp
    ...
    

Optional Parameter

  • -o, --output: Model output path, specifying where the fixed-point network model is written. If a folder is given, the output is automatically named with the floating-point model's file prefix followed by fixed.sim; if a path with a filename is given, the output is written to that path under that name; if nothing is specified, the fixed-point model is stored in the directory of the floating-point network model file.

  • --num_process: Number of processes to run simultaneously. If not specified, 10 processes are run by default.

  • --quant_level: Quantization level selection: [L1, L2, L3, L4, L5]. The default quantization level is L5. The higher the level, the higher the quantization accuracy and the slower the quantization speed. Each level is described below (an example command follows this list):

    • L1: Does a quick data comparison using the max. and min. values for quantization; the fastest option.
    • L2: Uses fast data comparison to quantize the weight data.
    • L3: Further analyzes statistical information to approximate the original data distribution.
    • L4: Approximately fits the weight data distribution and suggests upgrading some convolutions to 16-bit quantization.
    • L5: Uses high-precision data analysis to fit the original data distribution as closely as possible and suggests upgrading some convolutions to 16-bit quantization.
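
For example, to trade some conversion speed for accuracy, the ssd_mobilenet_v1 conversion shown earlier can be run at level L4:

python3 calibrator.py \ 
-i ~/SGS_Models/resource/detection/coco2017_calibration_set32 \ 
-m ~/SGS_Models/tensorflow/ssd_mobilenet_v1/ssd_mobilenet_v1_float.sim \ 
-c Detection \ 
-n ~/SGS_Models/tensorflow/ssd_mobilenet_v1/ssd_mobilenet_v1.py \ 
--input_config ~/SGS_Models/tensorflow/ssd_mobilenet_v1/input_config.ini \ 
--quant_level L4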

Note

  • Converting a floating-point network model into a fixed-point one requires about 30 training images for analyzing and quantizing the fixed-point model's data, so the -i/--image parameter should point to an image folder during conversion. Specifying a single image file path still works, but model accuracy may suffer. Further, depending on accuracy requirements, different quantization configurations can be set in the input_config.ini file for convolutions in different networks, balancing accuracy against speed.

  • The Calibrator searches system variables to obtain the tool path required for each stage, so the -t/--tool parameter does not normally need to be specified.

  • Using the calibrator to convert a floating-point network model to a fixed-point one generates a log directory under the current directory. The tensor_min_max.txt file in that directory records the minimum and maximum values of each network layer's inputs and outputs, which is useful for subsequent data analysis. The log directory is cleared on the next run of the calibrator, so store the data elsewhere if you need to keep it.

Image Pre-Processing Method

Because different network models may use different pre-processing methods, the same image pre-processing method as used in training should be employed in order to minimize accuracy loss when converting the network model. Each pre-processing method must be written as a separate Python file. There are two ways to register the pre-processing Python file:

  1. Save the file in the SGS_IPU_SDK/Scripts/calibrator/preprocess folder, and add the file name to the preprocess_method/__init__.py file. When using the calibrator or simulator, the -n/--preprocess parameter is then just the file name; the file path does not have to be specified.

  2. Pass the path of the pre-processing Python file directly as the -n/--preprocess parameter.

The following uses the caffe_resnet18 network as an example of writing an image pre-processing file.

Write the image processing function (the function name is not restricted) so that it returns the image data in np.array format. The function should take two parameters:

  • Image path

  • Normalization flag (norm = True)

The normalization flag distinguishes whether the network model is a floating-point model: in the floating-point network model phase, the image must be normalized before being fed to the network. Fixed-point and offline network models, on the other hand, already include the configuration information from the input_config.ini file and can normalize the image data themselves, so the data fed to the network must not be normalized. This matches how the network model data is processed on SigmaStar hardware.

import cv2
import numpy as np

def get_image(img_path, resizeH=224, resizeW=224, resizeC=3, norm=True, meanB=104.0, meanG=117.0, meanR=123.0, std=1.0, rgb=False, nchw=False):
    img = cv2.imread(img_path, flags=-1)
    if img is None:
        raise FileNotFoundError('No such image: {}'.format(img_path))

    try:
        img_dim = img.shape[2]
    except IndexError:
        img_dim = 1
    if img_dim == 4:
        img = cv2.cvtColor(img, cv2.COLOR_BGRA2BGR)
    elif img_dim == 1:
        img = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
    img_float = img.astype('float32')
    img_norm = cv2.resize(img_float, (resizeW, resizeH), interpolation=cv2.INTER_LINEAR)

    if norm and (resizeC == 3):
        img_norm = (img_norm - [meanB, meanG, meanR]) / std
        img_norm = img_norm.astype('float32')
    elif norm and (resizeC == 1):
        img_norm = (img_norm - meanB) / std
        img_norm = img_norm.astype('float32')
    else:
        img_norm = np.round(img_norm).astype('uint8')

    if rgb:
        img_norm = cv2.cvtColor(img_norm, cv2.COLOR_BGR2RGB)

    if nchw:
        # Convert HWC to CHW (expand_dims below adds the N axis); rows are resizeH, columns resizeW
        img_norm = np.transpose(img_norm.reshape(resizeH, resizeW, -1), axes=(2, 0, 1))

    return np.expand_dims(img_norm, 0)

def image_preprocess(img_path, norm=True):
    return get_image(img_path, norm=norm)

Expose the entry point as an image_preprocess function call, strictly following the statements below:

def image_preprocess(img_path, norm=True): 
    return get_image(img_path, norm=norm)

Save the file as caffe_resnet18.py.
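
To sanity-check the file (the image path below is a placeholder), call image_preprocess with both normalization settings and confirm the shapes and dtypes:

from caffe_resnet18 import image_preprocess

# 'test.jpg' is a placeholder path; use any real test image.
arr_float = image_preprocess('test.jpg', norm=True)   # normalized float32 data
arr_raw = image_preprocess('test.jpg', norm=False)    # raw uint8 pixel data
print(arr_float.shape, arr_float.dtype)  # expected: (1, 224, 224, 3) float32
print(arr_raw.dtype)                     # expected: uint8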

In the file SGS_IPU_SDK/Scripts/calibrator/preprocess_method/__init__.py, add the name of the Python file just written.

__all__ = ['caffe_mobilenet_v2', 'caffe_resnet18', 'caffe_resnet50_conv', 'mobilenet_v1']

If the -n/--preprocess parameter is caffe_resnet18 when using the calibrator or simulator, the newly written image pre-processing file is called directly. If the file has not been added to SGS_IPU_SDK/Scripts/calibrator/preprocess_method/__init__.py, the image can still be pre-processed by setting the -n/--preprocess parameter to the path of the caffe_resnet18.py file.

Convolutional Quantization

The convolutional quantization used when converting a floating-point network model to a fixed-point one supports the UINT8 and INT16 quantization methods. When using the calibrator, --quant_level L2, L3 and L4 automatically configure the convolutional quantization method based on statistical information. If a specific setting is required, refer to CONV_CONFIG for details. INT16 quantization can be set for some convolutional layers individually, or for all of them. If no quantization method is set, the default method recommended by the calibrator is used.

calibrator_custom.calibrator

calibrator_custom.calibrator is a Python module for quick quantization and model conversion. With its help, you can quantize and convert multi-input and multi-segment networks in a more convenient and flexible fashion. A Python module precompiled for Python 3.7 is provided based on the current Docker environment. The usage and the related API interfaces are as follows:

import calibrator_custom 
model_path = './mobilenet_v2_float.sim' 
input_config_path = './input_config.ini' 
model = calibrator_custom.calibrator(model_path, input_config_path)

When using calibrator_custom.calibrator, you need to supply the float.sim model path as well as the corresponding input_config.ini path to create the calibrator instance. If either value is incorrect, the calibrator instance cannot be created and a ValueError is raised.

Methods of calibrator_custom.calibrator

get_input_details

This will return the network model input information (list).

>>> input_details = model.get_input_details()
>>> print(input_details)
[{'index': 0, 'shape': array([ 1, 513, 513, 3], dtype=int32), 'name': 'sub_7', 'dtype': <class 'numpy.float32'>}]

The returned list contains one dict per model input, with the following information:

index: Index of the Input Tensor

name: Name of the Input Tensor

shape: Shape of the Input Tensor

dtype: Data type of the Input Tensor

get_output_details

This will return the network model output information (list).

>>> output_details = model.get_output_details()
>>> print(output_details)
[{'index': 0, 'shape': array([ 1, 257, 257, 30], dtype=int32), 'name': 'MobilenetV2/Conv/Conv2D', 'dtype': <class 'numpy.float32'>}]

The returned list contains one dict per model output, with the following information:

index: Index of the Output Tensor

name: Name of the Output Tensor

shape: Shape of the Output Tensor

dtype: Data type of the Output Tensor

set_input

This will set the network model input data.

model.set_input(0, img_data)

The 0 represents the index of the Input Tensor, which can be obtained from the return value of get_input_details(). img_data is data in numpy.ndarray format that matches the model input's shape and dtype; an incorrect shape or dtype will cause set_input to raise a ValueError. If the model has multiple inputs, call set_input multiple times to set the input data of the Tensor at each index, based on the return value of get_input_details().
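
For instance, a minimal sketch that feeds a zero-filled placeholder to every input reported by get_input_details() (real pre-processed image data would be used in practice):

import numpy as np

# Feed each input Tensor a placeholder that matches its shape and dtype.
for detail in model.get_input_details():
    dummy = np.zeros(detail['shape'], dtype=detail['dtype'])
    model.set_input(detail['index'], dummy)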

invoke

This will invoke the model to operate once.

model.invoke()

Before calling invoke, use set_input to set the input data first. Calling invoke directly without calling set_input beforehand will raise a ValueError.

get_output

This will get the network model output data.

result = model.get_output(0) 

The 0 represents the index of the Output Tensor, which can be obtained from the return value of get_output_details(). The output data returned is in numpy.ndarray format. If the model has multiple outputs, call get_output multiple times to get the output data of the Tensor at each index, based on the return value of get_output_details().

get_tensor_details

This will get the Tensor information (list) of the network model.

>>> tensor_details = model.get_tensor_details()
>>> print(tensor_details)
[{'dtype': 'FLOAT32', 'name': 'MobilenetV2/Conv/Conv2D', 'qtype': 'INT16', 'shape': array([ 1, 257, 257, 30], dtype=int32)}, {'dtype': 'FLOAT32', 'name': 'MobilenetV2/Conv/Conv2D_bias', 'qtype': 'INT16', 'shape': array([ 2, 30], dtype=int32)}, {'dtype': 'FLOAT32', 'name': 'MobilenetV2/Conv/weights/read', 'qtype': 'INT8', 'shape': array([30, 3, 3, 3], dtype=int32)}, {'dtype': 'FLOAT32', 'name': 'sub_7', 'qtype': 'UINT8', 'shape': array([ 1, 513, 513, 3], dtype=int32)}]

The returned list contains one dict per model Tensor, with the following information:

name: Name of the Tensor

shape: Shape of the Tensor

dtype: Data type of the Tensor

qtype: Quantization type of the Tensor
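
As a small example, the returned details can be scanned to list which Tensors were quantized to INT16:

# Print the name and shape of every Tensor whose quantization type is INT16.
for t in model.get_tensor_details():
    if t['qtype'] == 'INT16':
        print(t['name'], t['shape'])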

Methods of calibrator_custom.SIM_Calibrator

For multi-input and multi-segment networks, calibrator_custom.SIM_Calibrator can be used to perform the conversion in one pass through a simple definition.

calibrator_custom.SIM_Calibrator is an already-implemented class in which only the forward method needs to be defined. Once forward is implemented, the conversion can be completed.

The following example, based on SGS_IPU_SDK/Scripts/examples/sim_calibrator.py, illustrates how to use calibrator_custom.SIM_Calibrator:

import calibrator_custom 
class Net(calibrator_custom.SIM_Calibrator):  
    def __init__(self):  
        super().__init__()  
        self.model = calibrator_custom.calibrator(model_path, input_config)  
    def forward(self, x):  
        out_details = self.model.get_output_details()  
        self.model.set_input(0, x)  
        self.model.invoke()  
        result_list = []  
        for idx in range(len(out_details)):  
            result = self.model.get_output(idx)  
            result_list.append(result)  
        return result_list 

The parameters of forward are the model inputs. If the model has multiple inputs, add parameters to forward accordingly, as in the sketch below.
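
For example, a sketch of forward for a model with two inputs (defined inside the same Net class as above) might look like this:

    def forward(self, x0, x1):
        # One set_input call per input Tensor, in the order yielded by img_gen
        self.model.set_input(0, x0)
        self.model.set_input(1, x1)
        self.model.invoke()
        out_details = self.model.get_output_details()
        return [self.model.get_output(idx) for idx in range(len(out_details))]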

To create a calibrator_custom.SIM_Calibrator instance:

net = Net()

To call calibrator_custom.SIM_Calibrator for model conversion:

net.convert(img_gen, fix_model=[out_model_path])

To convert the model, two parameters are required: img_gen and the fixed.sim model save path list:

  • img_gen

    For the convenience of multi-input and multi-segment network model conversion, img_gen is a generator that arranges the input image sequence. If the model has multiple inputs, the generator should follow the input order defined by forward and yield a list containing one numpy.ndarray per input.

    calibrator_custom.utils.image_preprocess_func will use the pre-defined pre-processing method.

    import os

    # model_name is the name of a pre-defined pre-processing method
    preprocess_func = calibrator_custom.utils.image_preprocess_func(model_name)

    def image_generator(folder_path, preprocess_func):
        images = [os.path.join(folder_path, img) for img in os.listdir(folder_path)]
        for image in images:
            img = preprocess_func(image)
            yield [img]

    img_gen = image_generator('./images', preprocess_func)
    
  • fixed.sim model save path list

    If multiple models are defined in __init__, the fixed model save path list should follow the sequence of models defined in __init__.

  • Other optional parameters

    num_process: Number of processes simultaneously run by the CPU.

    quant_level: Quantization level selection: [L1, L2, L3, L4, L5]. The default quantization level is L5. For details, see Optional Parameter.

    quant_param: Quantization parameters to import. If quantization parameters for the corresponding model already exist, they can be imported during model conversion, as sketched below.
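
    The sketch below assumes these are passed as keyword arguments to convert, and that quant_param.json is a hypothetical file in the format shown under Quantization Parameter Description below:

    import json

    # Load a previously prepared quantization parameter list (hypothetical file)
    with open('quant_param.json') as f:
        quant_param = json.load(f)

    net.convert(img_gen, fix_model=[out_model_path],
                num_process=10, quant_level='L4', quant_param=quant_param)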

Rule for Importing the Quantization Parameter

Quantization Strategy and Method

  • Conv2D Quantization Method

    Conv2D quantization includes Input, Weights and Output. Currently 8-bit and 16-bit quantizations are supported.

    First, based on the min and max values of the Input, Weights and Output obtained through statistics, the Weights are quantized to fixed-point data and saved inside the fixed-point network; the Input and Output data are then dynamically quantized while the network runs. The number of min and max values of the Weights is determined by the number of kernels, while the number of min and max values of the Input and Output is determined by the C dimension of the Input and Output. In the 8-bit quantization case, the Input of Conv2D is UINT8 (equivalent to the performance of INT9) and the Weights are INT8; in the 16-bit quantization case, the Input and Weights of Conv2D are both INT16.

  • DepthwiseConv2D Quantization Method

    DepthwiseConv2D quantization includes Input, Weights and Output. Currently 8-bit and 16-bit quantizations are supported.

    First, based on the min and max values of the Input, Weights and Output obtained through statistics, the Weights are quantized to fixed-point data and saved inside the fixed-point network; the Input and Output data are then dynamically quantized while the network runs. The number of min and max values of the Input, Weights and Output is each determined by the C dimension of the corresponding Tensor. In the 8-bit quantization case, the Input of DepthwiseConv2D is UINT8 (equivalent to the performance of INT9) and the Weights are INT8; in the 16-bit quantization case, the Input and Weights of DepthwiseConv2D are both INT16.

  • Other Op Quantization Method

    The number of min and max values for quantizing the other operators in the network is determined by the C dimension. Only 16-bit quantization is supported. By calling get_tensor_details on the calibrator_custom.calibrator, you can obtain each Tensor's data type in the fixed-point model from qtype. An illustrative per-channel statistics sketch follows this list.
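
The following is an illustrative sketch (not the SDK's internal code) of gathering per-channel min/max statistics over the C dimension of an NHWC activation, which is what determines the number of min and max values described above:

import numpy as np

# Placeholder activation with layout N, H, W, C (here C = 30 channels)
act = np.random.randn(1, 257, 257, 30).astype(np.float32)

# One min and one max per channel: the count follows the C dimension
per_channel_min = act.min(axis=(0, 1, 2))  # shape (30,)
per_channel_max = act.max(axis=(0, 1, 2))  # shape (30,)
print(per_channel_min.shape, per_channel_max.shape)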

Quantization Parameter Description

According to the quantization strategy and method described above, the following information should be provided for each Tensor:

  1. Tensor name (name) [str]

  2. Min and max values of the associated operator (min, max), with the count determined as described above [list]

  3. Quantization bit [int]

  4. Constant Tensor data (data) (Optional) [numpy.ndarray]

The following is a usage example of the corresponding parameter. The parameter as a whole is a list, in which each item is a dict containing the above information.

[ 
    {  
        "name": "FeatureExtractor/MobilenetV2/Conv2d_0/weights",   
        "min": [-4.555312, -2.876907, -1.234419],   
        "max": [7.364561, 3.960804, 6.0],   
        "bit": 8  
    },   
    {...},  
    ... 
]

SGS_IPU_SDK/Scripts/examples/sim_calibrator.py already implements parameter importing and can be used directly to read and import JSON and pkl files.

Quantization Data Import Flow

In order to be compatible with the original quantization process, the original quantization strategy will remain in force.

After the original model is converted into an SGS_Float model, some operators are combined and optimized, causing a visible difference between the computation graph and the original framework model. The contents of the provided quantization file, for example the Tensor names used, should therefore be adjusted based on the converted SGS_Float model.

To facilitate modification for the combined and optimized layers, you can use get_tensor_details in calibrator_custom.calibrator to get each Tensor's basic information, including its name, shape, dtype, and qtype.

For the converted fixed model, get_tensor_details in calibrator_custom.fixed_simulator returns each Tensor's basic information, including its name, shape, min, max, quantization, and dtype.

Once the quantization file of the original model is updated based on the model’s Tensor information, you can start to import the quantization data file.

After the quantization file is parsed, the existing quantization information is compared with the newly imported information. The newly imported quantization information is generally used first, but it may still be discarded if the information resulting from the merge proves unreasonable.