DumpDebug Tool

Description

The DumpDebug Tool is located at ~/SGS_IPU_SDK/DumpDebug/.

The tool's main functions include:

  • Parsing network model data
  • Comparing network model data of different stages
  • Drawing a histogram for each layer of the network model

When a network is converted with the SigmaStar IPU SDK, the results of the Caffe and TensorFlow network models are fully consistent with those of the SigmaStar floating-point network model, and the results of the SigmaStar fixed-point network model are fully consistent with those of the SigmaStar offline network model.

The only stage at which accuracy can change is the conversion of the SigmaStar floating-point model to the SigmaStar fixed-point model. Therefore, when the results of the SigmaStar fixed-point network differ significantly from the original floating-point results, you can use the DumpDebug Tool to check the error introduced by the fixed-point conversion.

Usage

Dump data of each layer in the network model

  1. Find the DebugConfig.txt in the SGS_IPU_SDK/cfg folder and copy it to the current running path.

  2. Open DebugConfig.txt and modify the file contents:

    dumpTensor 
    eliminateGarbage 
    dequantFixed 
    #dumpasstring 
    #disableDomainFuseOps 
    path=
    

    The following is a description of the DebugConfig.txt parameters:

    dumpTensor: Master switch for dumping the data of each layer of the network model.

    eliminateGarbage: Removes unnecessary data when dumping network model data (recommended).

    dequantFixed: For fixed-point network models, converts fixed-point data to floating-point data (recommended).

    dumpasstring: Dumps the network model data as strings instead of as a binary file. (If you use the function described in Analyzing data with the auto_dump_debug.sh script, be sure to turn this option off.)

    disableDomainFuseOps: Disables network layer fusion when converting to a fixed-point network model (recommended).

    path=: Specifies the full path of the directory for the generated output file. (Be sure to fill in an absolute path after path=, e.g. /home/user. If nothing follows path=, or the path= parameter is absent, the file is written to $HOME. Note that the absolute path, e.g. /home/user, must not exceed 122 bytes.)

  3. Run the simulator to perform inference on a single frame.

  4. After the inference completes, sigma_outtensor_dump.bin is generated in the directory specified by the path= parameter. This file contains the dumped data of each layer of the network model.
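The configuration in step 2 can also be generated programmatically. The following is a minimal sketch, not part of the SDK: the helper name build_debug_config is hypothetical, and it assumes (based on the sample file above) that a leading # turns an option off. It enforces the two documented constraints on path=: an absolute path of at most 122 bytes.

```python
import os

MAX_PATH_BYTES = 122  # limit stated above for the value after path=


def build_debug_config(dump_dir, dump_as_string=False, disable_fuse=False):
    """Return the text of a DebugConfig.txt (hypothetical helper).

    Assumption: an option prefixed with '#' is treated as turned off,
    as suggested by the sample file in this section.
    """
    dump_dir = str(dump_dir)
    if not os.path.isabs(dump_dir):
        raise ValueError("path= must be an absolute path")
    if len(dump_dir.encode("utf-8")) > MAX_PATH_BYTES:
        raise ValueError("path= must not exceed 122 bytes")
    lines = [
        "dumpTensor",
        "eliminateGarbage",
        "dequantFixed",
        ("" if dump_as_string else "#") + "dumpasstring",
        ("" if disable_fuse else "#") + "disableDomainFuseOps",
        "path=" + dump_dir,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the configuration; redirect it to DebugConfig.txt in the run directory.
    print(build_debug_config("/home/user"), end="")
```

Rejecting a relative or over-long path up front avoids a silent fallback to $HOME.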

Note

  • Rename the sigma_outtensor_dump.bin file after the dump completes, but do not change the file suffix. Otherwise the file will be overwritten by the next dump.
  • The disableDomainFuseOps option in the DebugConfig.txt file only takes effect when a calibrator converts a floating-point network model to a fixed-point network model. Its purpose is to cancel network fusion. When the option is turned off, operators are fused during conversion to the fixed-point and offline network models, which speeds up the model but also changes the layer structure of the network and prevents some operators' outputs from being dumped to the sigma_outtensor_dump.bin file. If you need the data of every layer of the network model, turn on the disableDomainFuseOps option and re-run the calibrator to convert the fixed-point network model. The converted model is then not optimized by fusion and can therefore output the data of every layer.
  • To dump network model data at different stages (for example, to compare a floating-point network model with a fixed-point network model), use the simulator's -t/--type and -m/--model parameters to specify the model of each stage.
  • Offline network models do not support Dump Debug.
  • When you use a calibrator to convert a floating-point network model to a fixed-point network model, a tensor_min_max.txt file is generated under the SGS_IPU_SDK root directory. The tensor_min_max.txt file records the maximum and minimum values of each input and output of the network; this data is used in the subsequent dump data analysis.
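Because each run overwrites sigma_outtensor_dump.bin, it helps to rename the file immediately after each dump. The sketch below is one way to do so; the helper name keep_dump is hypothetical, and the labels sample and benchmark simply follow the file names used in the comparison example in this document.

```python
import shutil
from pathlib import Path

DUMP_NAME = "sigma_outtensor_dump.bin"


def keep_dump(dump_dir, label):
    """Rename the fresh dump to '<label>.bin' so the next run cannot overwrite it.

    Hypothetical helper; the .bin suffix is preserved as the note above requires.
    """
    src = Path(dump_dir) / DUMP_NAME
    dst = src.with_name(label + ".bin")
    if dst.exists():
        raise FileExistsError(f"{dst} already exists; choose another label")
    shutil.move(str(src), str(dst))
    return dst
```

For example, call keep_dump("/home/user", "benchmark") after the floating-point run and keep_dump("/home/user", "sample") after the fixed-point run.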

Analyzing data with the auto_dump_debug.sh script

The tool is located at ~/SGS_IPU_SDK/DumpDebug/auto_dump_debug.sh.

The auto_dump_debug.sh script compares the same-layer output tensors of the sample bin (sample) and the reference bin (benchmark) and reports their MSE, COS and RMSE. Before running it, dump the bin files of the floating-point network model and the fixed-point network model as described in Dump data of each layer in the network model.

Below is an example usage:

./auto_dump_debug.sh \
/home/user/SGS_IPU_SDK \
/home/user/sample.bin \
/home/user/benchmark.bin

The following is a description of the parameters used:

Param1: Path of SGS_IPU_SDK. If it is in the current directory, the folder name is sufficient.

Param2: Path of the sample bin that has been dumped for comparison. This should be the path of the bin file dumped from fixed-point network model.

Param3: Path of the benchmark bin that has been dumped for reference. This should be the path of the bin file dumped from floating-point network model.
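For orientation, the sketch below shows the textbook definitions of the three metrics that the script reports per layer (MSE, RMSE and cosine similarity, assuming COS stands for the latter). This is an assumption-laden illustration, not the script's implementation: the document does not show how auto_dump_debug.sh computes its values, and the RMSE column in the sample output in this section is not the square root of the MSE column, so the tool evidently normalizes at least one of them differently.

```python
import math


def mse(sample, benchmark):
    """Mean squared error between two equally long sequences of floats."""
    return sum((s - b) ** 2 for s, b in zip(sample, benchmark)) / len(sample)


def rmse(sample, benchmark):
    """Root mean squared error (textbook definition: sqrt of the MSE)."""
    return math.sqrt(mse(sample, benchmark))


def cosine_similarity(sample, benchmark):
    """Cosine of the angle between the two tensors viewed as flat vectors."""
    dot = sum(s * b for s, b in zip(sample, benchmark))
    norm = math.sqrt(sum(s * s for s in sample)) * math.sqrt(
        sum(b * b for b in benchmark)
    )
    return dot / norm if norm else 0.0
```

A cosine similarity close to 1 means the fixed-point tensor points in nearly the same direction as the floating-point tensor, while MSE and RMSE measure the magnitude of the deviation.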

If the disableDomainFuseOps option is not turned on, the analysis result looks like the following (partial output):

3.conv1_xx_xx_xx.xx.output0             OP_TYPE:CONV_2D     MSE:0.000046    COS:0.999898    RMSE:0.014310
4.pool1.xx.output0                      OP_TYPE:MAX_POOL_2D MSE:0.000057    COS:0.999931    RMSE:0.011018
7.res2a_branch1_xx_xx.xx.output0        OP_TYPE:CONV_2D     MSE:0.000076    COS:0.999886    RMSE:0.015652
11.res2a_branch2a_xx_xx_xx.xx.output0   OP_TYPE:CONV_2D     MSE:0.000027    COS:0.999652    RMSE:0.023612
16.res2a_xx.xx.output0                  OP_TYPE:RELU        MSE:0.000177    COS:0.999644    RMSE:0.025049
20.res2b_branch2a_xx_xx_xx.xx.output0   OP_TYPE:CONV_2D     MSE:0.000069    COS:0.999242    RMSE:0.039121
25.res2b_xx.xx.output0                  OP_TYPE:RELU        MSE:0.000430    COS:0.999452    RMSE:0.030551
28.res3a_branch1_xx_xx.xx.output0       OP_TYPE:CONV_2D     MSE:0.000165    COS:0.998975    RMSE:0.041644
32.res3a_branch2a_xx_xx_xx.xx.output0   OP_TYPE:CONV_2D     MSE:0.000139    COS:0.998842    RMSE:0.048416
37.res3a_xx.xx.output0                  OP_TYPE:RELU        MSE:0.000297    COS:0.999069    RMSE:0.041043
41.res3b_branch2a_xx_xx_xx.xx.output0   OP_TYPE:CONV_2D     MSE:0.000070    COS:0.998836    RMSE:0.051215
46.res3b_xx.xx.output0                  OP_TYPE:RELU        MSE:0.000366    COS:0.999035    RMSE:0.042863
49.res4a_branch1_xx_xx.xx.output0       OP_TYPE:CONV_2D     MSE:0.000026    COS:0.999282    RMSE:0.035575
53.res4a_branch2a_xx_xx_xx.xx.output0   OP_TYPE:CONV_2D     MSE:0.000108    COS:0.998950    RMSE:0.047213
58.res4a_xx.xx.output0                  OP_TYPE:RELU        MSE:0.000149    COS:0.999174    RMSE:0.040582
62.res4b_branch2a_xx_xx_xx.xx.output0   OP_TYPE:CONV_2D     MSE:0.000030    COS:0.999020    RMSE:0.046630
67.res4b_xx.xx.output0                  OP_TYPE:RELU        MSE:0.000168    COS:0.999073    RMSE:0.041229
70.res5a_branch1_xx_xx.xx.output0       OP_TYPE:CONV_2D     MSE:0.000305    COS:0.999276    RMSE:0.033715
74.res5a_branch2a_xx_xx_xx.xx.output0   OP_TYPE:CONV_2D     MSE:0.000032    COS:0.998723    RMSE:0.053469
79.res5a_xx.xx.output0                  OP_TYPE:RELU        MSE:0.000507    COS:0.998762    RMSE:0.050663
83.res5b_branch2a_xx_xx_xx.xx.output0   OP_TYPE:CONV_2D     MSE:0.000027    COS:0.998182    RMSE:0.064201
88.res5b_xx.xx.output0                  OP_TYPE:RELU        MSE:0.006022    COS:0.998665    RMSE:0.056931
89.pool5.xx.output0                     OP_TYPE:AVG_POOL_2D MSE:0.004321    COS:0.998488    RMSE:0.079694
90.fc1000_reshape#1_output#183.output0  OP_TYPE:ReshapeAliasMSE:0.004321    COS:0.998488    RMSE:0.079694
91.fc1000#185.xx.output0                OP_TYPE:CONV_2D     MSE:0.014197    COS:0.998901    RMSE:0.051447
92.fc1000.xx.output0                    OP_TYPE:ReshapeAliasMSE:0.014197    COS:0.998901    RMSE:0.051447
93.prob.xx.output0                      OP_TYPE:Fix2Float   MSE:0.000000    COS:1.000000    RMSE:0.000152

If the disableDomainFuseOps option is turned on, the analysis result looks like the following (partial output):

0.conv1.xx.output0                      OP_TYPE:CONV_2D     MSE:2.397930    COS:0.999887    RMSE:0.019312
1.conv1_xx.xx.output0                   OP_TYPE:MUL         MSE:0.000037    COS:0.999850    RMSE:0.020800
2.conv1_xx_xx.xx.output0                OP_TYPE:ADD         MSE:0.000037    COS:0.999936    RMSE:0.010667
3.conv1_xx_xx_xx.xx.output0             OP_TYPE:RELU        MSE:0.000052    COS:0.999885    RMSE:0.015123
4.pool1.xx.output0                      OP_TYPE:MAX_POOL_2D MSE:0.000063    COS:0.999924    RMSE:0.011603
5.res2a_branch1.xx.output0              OP_TYPE:CONV_2D     MSE:0.000096    COS:0.999952    RMSE:0.009368
6.res2a_branch1_xx.xx.output0           OP_TYPE:MUL         MSE:0.000080    COS:0.999956    RMSE:0.009144
7.res2a_branch1_xx_xx.xx.output0        OP_TYPE:ADD         MSE:0.000080    COS:0.999880    RMSE:0.016141
8.res2a_branch2a.xx.output0             OP_TYPE:CONV_2D     MSE:0.000349    COS:0.999951    RMSE:0.009727
9.res2a_branch2a_xx.xx.output0          OP_TYPE:MUL         MSE:0.000049    COS:0.999956    RMSE:0.009349
10.res2a_branch2a_xx_xx.xx.output0      OP_TYPE:ADD         MSE:0.000049    COS:0.999886    RMSE:0.016510
11.res2a_branch2a_xx_xx_xx.xx.output0   OP_TYPE:RELU        MSE:0.000028    COS:0.999641    RMSE:0.024211
12.res2a_branch2b.xx.output0            OP_TYPE:CONV_2D     MSE:0.000100    COS:0.999821    RMSE:0.018379
13.res2a_branch2b_xx.xx.output0         OP_TYPE:MUL         MSE:0.000118    COS:0.999832    RMSE:0.017982
14.res2a_branch2b_xx_xx.xx.output0      OP_TYPE:ADD         MSE:0.000118    COS:0.999670    RMSE:0.024984
15.res2a.xx.output0                     OP_TYPE:ADD         MSE:0.000241    COS:0.999771    RMSE:0.021801
16.res2a_xx.xx.output0                  OP_TYPE:RELU        MSE:0.000182    COS:0.999634    RMSE:0.025262
17.res2b_branch2a.xx.output0            OP_TYPE:CONV_2D     MSE:0.000525    COS:0.999709    RMSE:0.023127
18.res2b_branch2a_xx.xx.output0         OP_TYPE:MUL         MSE:0.000193    COS:0.999746    RMSE:0.022434
19.res2b_branch2a_xx_xx.xx.output0      OP_TYPE:ADD         MSE:0.000193    COS:0.999474    RMSE:0.032121
20.res2b_branch2a_xx_xx_xx.xx.output0   OP_TYPE:RELU        MSE:0.000071    COS:0.999219    RMSE:0.039747
21.res2b_branch2b.xx.output0            OP_TYPE:CONV_2D     MSE:0.000157    COS:0.999549    RMSE:0.030371
22.res2b_branch2b_xx.xx.output0         OP_TYPE:MUL         MSE:0.000272    COS:0.999587    RMSE:0.029537
23.res2b_branch2b_xx_xx.xx.output0      OP_TYPE:ADD         MSE:0.000272    COS:0.999151    RMSE:0.040304
24.res2b.xx.output0                     OP_TYPE:ADD         MSE:0.000534    COS:0.999425    RMSE:0.031402
25.res2b_xx.xx.output0                  OP_TYPE:RELU        MSE:0.000448    COS:0.999429    RMSE:0.031203
26.res3a_branch1.xx.output0             OP_TYPE:CONV_2D     MSE:0.000267    COS:0.999320    RMSE:0.034101
27.res3a_branch1_xx.xx.output0          OP_TYPE:MUL         MSE:0.000169    COS:0.999345    RMSE:0.034331
28.res3a_branch1_xx_xx.xx.output0       OP_TYPE:ADD         MSE:0.000169    COS:0.998949    RMSE:0.042434
29.res3a_branch2a.xx.output0            OP_TYPE:CONV_2D     MSE:0.001885    COS:0.999472    RMSE:0.030785
30.res3a_branch2a_xx.xx.output0         OP_TYPE:MUL         MSE:0.000427    COS:0.999486    RMSE:0.030586
31.res3a_branch2a_xx_xx.xx.output0      OP_TYPE:ADD         MSE:0.000427    COS:0.998909    RMSE:0.045188
32.res3a_branch2a_xx_xx_xx.xx.output0   OP_TYPE:RELU        MSE:0.000147    COS:0.998773    RMSE:0.049700
33.res3a_branch2b.xx.output0            OP_TYPE:CONV_2D     MSE:0.000331    COS:0.999425    RMSE:0.034038
34.res3a_branch2b_xx.xx.output0         OP_TYPE:MUL         MSE:0.000368    COS:0.999411    RMSE:0.034275
35.res3a_branch2b_xx_xx.xx.output0      OP_TYPE:ADD         MSE:0.000368    COS:0.998931    RMSE:0.044919
36.res3a.xx.output0                     OP_TYPE:ADD         MSE:0.000585    COS:0.999040    RMSE:0.042398
37.res3a_xx.xx.output0                  OP_TYPE:RELU        MSE:0.000301    COS:0.999058    RMSE:0.041479
38.res3b_branch2a.xx.output0            OP_TYPE:CONV_2D     MSE:0.000811    COS:0.999530    RMSE:0.029739
39.res3b_branch2a_xx.xx.output0         OP_TYPE:MUL         MSE:0.000288    COS:0.999580    RMSE:0.028879
40.res3b_branch2a_xx_xx.xx.output0      OP_TYPE:ADD         MSE:0.000288    COS:0.999383    RMSE:0.033874
41.res3b_branch2a_xx_xx_xx.xx.output0   OP_TYPE:RELU        MSE:0.000073    COS:0.998787    RMSE:0.052493
42.res3b_branch2b.xx.output0            OP_TYPE:CONV_2D     MSE:0.000136    COS:0.999469    RMSE:0.033442
43.res3b_branch2b_xx.xx.output0         OP_TYPE:MUL         MSE:0.000314    COS:0.999489    RMSE:0.033050
44.res3b_branch2b_xx_xx.xx.output0      OP_TYPE:ADD         MSE:0.000314    COS:0.999308    RMSE:0.036079
45.res3b.xx.output0                     OP_TYPE:ADD         MSE:0.000659    COS:0.998989    RMSE:0.042628
46.res3b_xx.xx.output0                  OP_TYPE:RELU        MSE:0.000375    COS:0.999011    RMSE:0.043575
47.res4a_branch1.xx.output0             OP_TYPE:CONV_2D     MSE:0.000098    COS:0.999264    RMSE:0.037842
48.res4a_branch1_xx.xx.output0          OP_TYPE:MUL         MSE:0.000027    COS:0.999298    RMSE:0.037615
49.res4a_branch1_xx_xx.xx.output0       OP_TYPE:ADD         MSE:0.000027    COS:0.999253    RMSE:0.036302
50.res4a_branch2a.xx.output0            OP_TYPE:CONV_2D     MSE:0.001011    COS:0.999543    RMSE:0.028507
51.res4a_branch2a_xx.xx.output0         OP_TYPE:MUL         MSE:0.000300    COS:0.999544    RMSE:0.028563
52.res4a_branch2a_xx_xx.xx.output0      OP_TYPE:ADD         MSE:0.000300    COS:0.999147    RMSE:0.040221
53.res4a_branch2a_xx_xx_xx.xx.output0   OP_TYPE:RELU        MSE:0.000107    COS:0.998961    RMSE:0.047238
54.res4a_branch2b.xx.output0            OP_TYPE:CONV_2D     MSE:0.000335    COS:0.999464    RMSE:0.032660
55.res4a_branch2b_xx.xx.output0         OP_TYPE:MUL         MSE:0.000255    COS:0.999453    RMSE:0.032788
56.res4a_branch2b_xx_xx.xx.output0      OP_TYPE:ADD         MSE:0.000255    COS:0.999234    RMSE:0.038888
57.res4a.xx.output0                     OP_TYPE:ADD         MSE:0.000312    COS:0.999298    RMSE:0.036873
58.res4a_xx.xx.output0                  OP_TYPE:RELU        MSE:0.000146    COS:0.999191    RMSE:0.040181
59.res4b_branch2a.xx.output0            OP_TYPE:CONV_2D     MSE:0.000369    COS:0.999748    RMSE:0.021992
60.res4b_branch2a_xx.xx.output0         OP_TYPE:MUL         MSE:0.000156    COS:0.999772    RMSE:0.021527
61.res4b_branch2a_xx_xx.xx.output0      OP_TYPE:ADD         MSE:0.000156    COS:0.999625    RMSE:0.026857
62.res4b_branch2a_xx_xx_xx.xx.output0   OP_TYPE:RELU        MSE:0.000028    COS:0.999073    RMSE:0.045494
63.res4b_branch2b.xx.output0            OP_TYPE:CONV_2D     MSE:0.000048    COS:0.999748    RMSE:0.022612
64.res4b_branch2b_xx.xx.output0         OP_TYPE:MUL         MSE:0.000191    COS:0.999751    RMSE:0.022354
65.res4b_branch2b_xx_xx.xx.output0      OP_TYPE:ADD         MSE:0.000191    COS:0.999678    RMSE:0.024581
66.res4b.xx.output0                     OP_TYPE:ADD         MSE:0.000337    COS:0.999448    RMSE:0.032108
67.res4b_xx.xx.output0                  OP_TYPE:RELU        MSE:0.000160    COS:0.999113    RMSE:0.040588
68.res5a_branch1.xx.output0             OP_TYPE:CONV_2D     MSE:0.000072    COS:0.999179    RMSE:0.038052
69.res5a_branch1_xx.xx.output0          OP_TYPE:MUL         MSE:0.000290    COS:0.999179    RMSE:0.038130
70.res5a_branch1_xx_xx.xx.output0       OP_TYPE:ADD         MSE:0.000290    COS:0.999311    RMSE:0.032880
71.res5a_branch2a.xx.output0            OP_TYPE:CONV_2D     MSE:0.000355    COS:0.999758    RMSE:0.020371
72.res5a_branch2a_xx.xx.output0         OP_TYPE:MUL         MSE:0.000133    COS:0.999750    RMSE:0.020561
73.res5a_branch2a_xx_xx.xx.output0      OP_TYPE:ADD         MSE:0.000133    COS:0.999613    RMSE:0.026392
74.res5a_branch2a_xx_xx_xx.xx.output0   OP_TYPE:RELU        MSE:0.000028    COS:0.998859    RMSE:0.051003
75.res5a_branch2b.xx.output0            OP_TYPE:CONV_2D     MSE:0.000063    COS:0.999512    RMSE:0.031809
76.res5a_branch2b_xx.xx.output0         OP_TYPE:MUL         MSE:0.001009    COS:0.999480    RMSE:0.032233
77.res5a_branch2b_xx_xx.xx.output0      OP_TYPE:ADD         MSE:0.001009    COS:0.999257    RMSE:0.039436
78.res5a.xx.output0                     OP_TYPE:ADD         MSE:0.001413    COS:0.999379    RMSE:0.034297
79.res5a_xx.xx.output0                  OP_TYPE:RELU        MSE:0.000457    COS:0.998876    RMSE:0.048274
80.res5b_branch2a.xx.output0            OP_TYPE:CONV_2D     MSE:0.001387    COS:0.999842    RMSE:0.015781
81.res5b_branch2a_xx.xx.output0         OP_TYPE:MUL         MSE:0.000239    COS:0.999839    RMSE:0.015798
82.res5b_branch2a_xx_xx.xx.output0      OP_TYPE:ADD         MSE:0.000239    COS:0.999639    RMSE:0.026762
83.res5b_branch2a_xx_xx_xx.xx.output0   OP_TYPE:RELU        MSE:0.000023    COS:0.998458    RMSE:0.058598
84.res5b_branch2b.xx.output0            OP_TYPE:CONV_2D     MSE:0.000041    COS:0.999061    RMSE:0.041137
85.res5b_branch2b_xx.xx.output0         OP_TYPE:MUL         MSE:0.007750    COS:0.999058    RMSE:0.041188
86.res5b_branch2b_xx_xx.xx.output0      OP_TYPE:ADD         MSE:0.007751    COS:0.998732    RMSE:0.052601
87.res5b.xx.output0                     OP_TYPE:ADD         MSE:0.008533    COS:0.998792    RMSE:0.052043
88.res5b_xx.xx.output0                  OP_TYPE:RELU        MSE:0.004860    COS:0.998924    RMSE:0.051718
89.pool5.xx.output0                     OP_TYPE:AVG_POOL_2D MSE:0.004061    COS:0.998601    RMSE:0.078395
90.fc1000_reshape#1_output#183.output0  OP_TYPE:ReshapeAliasMSE:0.004061    COS:0.998601    RMSE:0.078395
91.fc1000#185.xx.output0                OP_TYPE:CONV_2D     MSE:0.013363    COS:0.998935    RMSE:0.050327
92.fc1000.xx.output0                    OP_TYPE:ReshapeAliasMSE:0.013363    COS:0.998935    RMSE:0.050327
93.prob.xx.output0                      OP_TYPE:SOFTMAX     MSE:0.000000    COS:1.000000    RMSE:0.000152
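With a long report it can be convenient to sort the layers by error. The following sketch parses lines in the format shown above and returns the layers with the largest RMSE. The function names are hypothetical, and the regular expression is derived only from the sample output in this document; it tolerates the case where a 12-character operator name such as ReshapeAlias runs directly into the MSE column.

```python
import re

# Pattern derived from the sample output above (an assumption, not a format spec).
LINE_RE = re.compile(
    r"^(?P<index>\d+)\.(?P<name>\S+)\s+OP_TYPE:(?P<op>\w+?)\s*"
    r"MSE:(?P<mse>[\d.]+)\s+COS:(?P<cos>[\d.]+)\s+RMSE:(?P<rmse>[\d.]+)\s*$"
)


def parse_report(text):
    """Turn the script's text output into a list of per-layer dictionaries."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            rows.append({
                "index": int(match["index"]),
                "name": match["name"],
                "op": match["op"],
                "mse": float(match["mse"]),
                "cos": float(match["cos"]),
                "rmse": float(match["rmse"]),
            })
    return rows


def worst_layers(rows, count=3):
    """Return the `count` layers with the largest RMSE, largest first."""
    return sorted(rows, key=lambda row: row["rmse"], reverse=True)[:count]


SAMPLE = """\
88.res5b_xx.xx.output0                  OP_TYPE:RELU        MSE:0.004860    COS:0.998924    RMSE:0.051718
90.fc1000_reshape#1_output#183.output0  OP_TYPE:ReshapeAliasMSE:0.004061    COS:0.998601    RMSE:0.078395
93.prob.xx.output0                      OP_TYPE:SOFTMAX     MSE:0.000000    COS:1.000000    RMSE:0.000152
"""

if __name__ == "__main__":
    for row in worst_layers(parse_report(SAMPLE), count=2):
        print(row["index"], row["name"], row["op"], row["rmse"])
```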

Histogram

The histogram.py tool draws a histogram for each layer of the network model.

The tool is located at ~/SGS_IPU_SDK/DumpDebug/histogram.py and plots the data distribution of each layer of the dumped data. Besides the dumped data file, it needs the log/tensor_min_max.txt file that is generated under the root directory when the calibrator tool converts the floating-point network model to a fixed-point network model.

Below is an example tool usage:

python3 histogram.py sigma_outtensor_dump.bin tensor_min_max.txt

Progress prompted during operation reads as follows:

[===============================================> ]97.61%

The drawn histogram is as illustrated below:

[Figure: histogram of the data of one network layer]

The blue part in the figure above is the number of occurrences of network data in this layer, and the red dashed lines on the left and right sides are the minimum and maximum values.
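To make the meaning of the plot concrete, the sketch below counts how many values of a layer fall into equal-width buckets between the recorded minimum and maximum, which is what the blue bars represent. It is an illustration only: the function name is hypothetical, the values are passed in as a plain list, and reading the binary dump or tensor_min_max.txt is left out because their formats are not described in this document.

```python
def histogram_counts(values, lo, hi, bins=10):
    """Count the values in each of `bins` equal-width buckets on [lo, hi].

    `lo` and `hi` play the role of the red dashed min/max lines; values
    outside that interval are ignored in this sketch.
    """
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    counts = [0] * bins
    width = (hi - lo) / bins
    for value in values:
        if lo <= value <= hi:
            # The maximum itself belongs to the last bucket.
            bucket = min(int((value - lo) / width), bins - 1)
            counts[bucket] += 1
    return counts
```

A distribution whose counts pile up in a few buckets while lo and hi are far apart is the pattern that the accuracy section of this document treats as a candidate for INT16 convolution.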

Note

  • When the tool runs, it creates a Histograms folder in the current directory to store the data histogram images of each network layer.
  • Before drawing histograms for a different set of dump data, rename the Histograms folder under the current path or move it to another path, because the histogram.py tool deletes the Histograms folder under the current path when it runs.

Error Handling

If a segmentation fault occurs during operation, the process is terminated. It is important to pinpoint where the fault occurred. To do so, run the following commands to collect the required information.

Set the core file output path to the current directory and the core file size limit to unlimited:

echo core > /proc/sys/kernel/core_pattern 
ulimit -c unlimited 

Re-run the network conversion command that triggered the fault.

After the core file is generated, run the following command to print the process related information:

~/SGS_IPU_SDK/DumpDebug/show_address.sh /path/to/SGS_IPU_SDK/bin/XXX /path/to/core   

The show_address.sh script requires two parameters. The first is the path to the executable (binary) file that ran the command, and the second is the path to the core file.

The script prints the memory map information and the function stack addresses of the process; please include this information when you report the problem.
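If the conversion is launched from a Python wrapper rather than an interactive shell, the core file size limit can be raised from within the process. The sketch below uses the standard resource module (Unix only) to raise the soft limit to the hard limit; it corresponds to ulimit -c unlimited only when the hard limit is itself unlimited. The function name is hypothetical, and core_pattern still has to be set separately as shown above.

```python
import resource


def enable_core_dumps():
    """Raise the soft core-file size limit of this process to its hard limit.

    Child processes started afterwards (for example with subprocess.run)
    inherit the new limit. Returns the resulting (soft, hard) pair.
    """
    _, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


if __name__ == "__main__":
    soft, hard = enable_core_dumps()
    unlimited = soft == resource.RLIM_INFINITY
    print("core limit unlimited" if unlimited else f"core limit: {soft} bytes")
```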

Optimizing Model Accuracy

You can use sim_optimizer.py to optimize model accuracy.

When using a calibrator, the L1, L2, L3, L4 and L5 settings of --quant_level automatically configure the convolution quantization method based on statistical information. If the accuracy of the fixed model is lower than that of the float model, and changing [CONV_CONFIG] in input_config.ini to ALL_INT16 brings the accuracy of the fixed model close to that of the float model, you can use the sim_optimizer.py tool to optimize the model and obtain better accuracy.

The sim_optimizer.py tool is located at ~/SGS_IPU_SDK/Scripts/examples/sim_optimizer.py.

Mandatory Parameters

-i, --image: Path to image file or image folder.

-m, --model: Floating-point network model file path.

--input_config: Path to input_config.ini file, which contains the input tensor configuration information.

-n, --preprocess: Pre-processing method, which is related to the image pre-processing method. You can also specify the pre-processing file path after completing the pre-processing file configuration.

Optional Parameters

--num_process: Number of processes to run simultaneously. If not specified, 10 processes are run by default.

Usage Example

Before using sim_optimizer.py, first modify input_config.ini to configure the convolutions for full 16-bit precision, re-convert the float and fixed models, and verify that the fixed model has substantially the same precision as the original model. Then, to use sim_optimizer.py, revert the convolution configuration in input_config.ini to the default 8-bit quantization and re-convert the float and fixed models. An example usage of sim_optimizer.py is shown below:

 python3 ~/SGS_IPU_SDK/Scripts/examples/sim_optimizer.py \
 -i ~/SGS_Models/resource/detection/coco2017_calibration_set32 \
 -m ~/SGS_Models/tensorflow/ssd_mobilenet_v1/ssd_mobilenet_v1_float.sim \
 -n ssd_mobilenet_v1 \
 --input_config ~/SGS_Models/tensorflow/ssd_mobilenet_v1/input_config.ini \
 --num_process 20

After sim_optimizer.py completes, several quantization import files are generated. Their number equals the number of entries in the compression_rates list defined in sim_optimizer.py, so you can change how many quantization import files are generated by editing that list. The value range is (0.5, 1).

After that, you can use the sim_calibrator.py tool to import the quantization information to the model to generate the fixed model.

 python3 ~/SGS_IPU_SDK/Scripts/examples/sim_calibrator.py \
 -i ~/SGS_Models/resource/detection/coco2017_calibration_set32 \
 -m ~/SGS_Models/tensorflow/ssd_mobilenet_v1/ssd_mobilenet_v1_float.sim \
 -n ssd_mobilenet_v1 \
 --input_config ~/SGS_Models/tensorflow/ssd_mobilenet_v1/input_config.ini \
 --quant_file ./optimized_quant_info_0.pkl \
 --num_process 20
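Since sim_optimizer.py produces several quantization import files, you may want to build one calibrator command per file and compare the resulting fixed models. The sketch below only assembles the command lines. The function name is hypothetical, and the file name pattern optimized_quant_info_*.pkl is an assumption extrapolated from the single file name in the example above; the command-line options are the ones shown in that example.

```python
import os
from pathlib import Path

CALIBRATOR = "~/SGS_IPU_SDK/Scripts/examples/sim_calibrator.py"


def calibrator_commands(quant_dir, images, model, preprocess, input_config,
                        num_process=20):
    """Build one sim_calibrator.py command (as an argument list) per quant file.

    Assumption: the generated files match 'optimized_quant_info_*.pkl'.
    """
    script = os.path.expanduser(CALIBRATOR)
    commands = []
    for quant_file in sorted(Path(quant_dir).glob("optimized_quant_info_*.pkl")):
        commands.append([
            "python3", script,
            "-i", images,
            "-m", model,
            "-n", preprocess,
            "--input_config", input_config,
            "--quant_file", str(quant_file),
            "--num_process", str(num_process),
        ])
    return commands
```

Each returned list can be passed to subprocess.run. Because every run generates a fixed model, check where the calibrator writes its output before running several commands in a row so that results are not overwritten.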

The DumpDebug Tool provides a method for troubleshooting accuracy degradation after model quantization, which can serve as a reference when resolving actual problems. When using a calibrator, the L2, L3, L4 and L5 settings of --quant_level automatically configure the convolution quantization method based on statistical information. If the calibrator does not switch the quantization method to INT16 convolution mode in the following circumstances, change it manually.

  1. When converting a SigmaStar floating-point network model into a SigmaStar fixed-point network model, pay attention to the log/tensor_min_max.txt file stored under the current directory. The file records the maximum and minimum values of each layer of the network during conversion. If the difference between the maximum and minimum values of a convolution input layer is too large (generally greater than 30), enable the INT16 convolution mode of that layer in the input_config.ini file of the corresponding network. Please refer to input config Configuration for the configuration details. Once the input_config.ini file is modified, you should restart the conversion from the model trained by the original framework.

  2. If the RMSE value obtained by the data comparison described in Analyzing data with the auto_dump_debug.sh script is too large (generally greater than 0.5), you can enable the INT16 convolution mode of the convolution input layer preceding the layer in question. Once the input_config.ini file is modified, you should restart the conversion from the model trained by the original framework.

  3. When drawing histograms with the histogram.py tool described in Histogram, be sure to use the Dump data of the SigmaStar floating-point network and the corresponding tensor_min_max.txt file. If the data distribution in the histogram is concentrated but the difference between the maximum and minimum values is large (generally greater than 30), consider turning on the INT16 convolution mode for the convolution input of that layer. Once the input_config.ini file is modified, you should restart the conversion from the model trained by the original framework.
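The range criterion in the list above can be illustrated with a little arithmetic. The sketch below assumes a plain uniform quantizer, in which a value range is divided evenly over 2^bits levels; the document does not describe the quantization scheme the SDK actually uses, so the numbers only show the trend. The function names and the dictionary layout are hypothetical, and the min/max pairs are passed in directly because the format of tensor_min_max.txt is not documented here.

```python
def quant_step(min_val, max_val, bits):
    """Step size of a uniform quantizer covering [min_val, max_val]."""
    return (max_val - min_val) / (2 ** bits - 1)


def wide_range_tensors(min_max, limit=30.0):
    """Return the names of tensors whose max - min exceeds `limit`.

    `min_max` maps a tensor name to a (min, max) pair; 30 is the rule of
    thumb quoted in the list above.
    """
    return [name for name, (lo, hi) in min_max.items() if hi - lo > limit]


if __name__ == "__main__":
    for bits in (8, 16):
        print(bits, "bits, range 30: step", quant_step(-15.0, 15.0, bits))
    example = {"conv1_input": (-20.0, 25.0), "pool1_input": (0.0, 6.0)}
    print(wide_range_tensors(example))
```

Under this assumption, a range of 30 at 8 bits gives a step of roughly 0.12, which is coarse when most values are concentrated in a narrow band, whereas 16 bits reduces the step by a factor of about 257.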