DumpDebug Tool
Description¶
The DumpDebug Tool is located at ~/SGS_IPU_SDK/DumpDebug/.
The tool's main functions include:
- Parsing network model data
- Comparing network model data of different stages
- Drawing histogram chart of each layer of network model
When a network is converted with the SigmaStar IPU SDK, the results of the original Caffe or TensorFlow network model should be completely consistent with those of the SigmaStar floating-point network model, and the results of the SigmaStar fixed-point network model should be completely consistent with those of the SigmaStar offline network model.
The only stage where accuracy can change is the conversion of the SigmaStar floating-point model to the SigmaStar fixed-point model. Therefore, when the results of the SigmaStar fixed-point network differ significantly from the original floating-point results, you can use the DumpDebug Tool to check the error of the converted fixed-point network.
Usage¶
Dump data of each layer in the network model¶
- Find the DebugConfig.txt file in the SGS_IPU_SDK/cfg folder and copy it to the current running path.
- Open DebugConfig.txt and modify the file contents:
dumpTensor
eliminateGarbage
dequantFixed
#dumpasstring
#disableDomainFuseOps
path=
The following is a description of the DebugConfig.txt parameters:
dumpTensor
: Master switch for dumping the data of each layer of the network model.
eliminateGarbage
: Removes unnecessary data when dumping network model data (recommended).
dequantFixed
: For fixed-point network models, converts fixed-point data to floating-point data (recommended).
dumpasstring
: Dumps the network model data as strings instead of as a binary file. (If you use the function described in Analyzing data with the auto_dump_debug.sh script, be sure to turn off this option.)
disableDomainFuseOps
: Disables network layer fusion when converting fixed-point network models (recommended).
path=
: Specifies the full path for the generated output file. (Be sure to give an absolute path after path=, e.g. /home/user. If nothing follows path= or the path= parameter is absent, the file is written to $HOME. The absolute path must not exceed 122 bytes.)
- Run the simulator to perform inference on a single frame.
- After the inference completes, sigma_outtensor_dump.bin is generated in the directory specified by the path= parameter. This file contains the dumped data of each layer of the network model.
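The configuration step above can be scripted. The following is a minimal sketch, not part of the SDK: the function name build_debug_config is hypothetical, and it assumes that DebugConfig.txt holds one option per line and that a leading # leaves an option turned off, as the template contents suggest. It enforces the two constraints stated for path= (absolute path, at most 122 bytes).

```python
import os

# Limit quoted from the path= description above.
MAX_PATH_BYTES = 122


def build_debug_config(dump_dir, disable_fuse=False, dump_as_string=False):
    """Return DebugConfig.txt contents (hypothetical helper, one option per line)."""
    if not os.path.isabs(dump_dir):
        raise ValueError("path= must be an absolute path")
    if len(dump_dir.encode("utf-8")) > MAX_PATH_BYTES:
        raise ValueError("path= must not exceed 122 bytes")
    lines = ["dumpTensor", "eliminateGarbage", "dequantFixed"]
    # Assumption: a leading '#' keeps the option turned off.
    lines.append(("" if dump_as_string else "#") + "dumpasstring")
    lines.append(("" if disable_fuse else "#") + "disableDomainFuseOps")
    lines.append("path=" + dump_dir)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the contents; redirect to DebugConfig.txt in the running path.
    print(build_debug_config("/home/user"), end="")
```

Keeping dumpasstring off by default matches the requirement of the auto_dump_debug.sh workflow described later in this document.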
Note
- Rename the sigma_outtensor_dump.bin file after the dump completes, but do not modify the file suffix. Otherwise the file will be overwritten by the next dump.
- The disableDomainFuseOps option in the DebugConfig.txt file only takes effect when a calibrator converts a floating-point network model to a fixed-point network model. The purpose of this option is to disable the network fusion function. When this option is turned off, the operators of the fixed-point network model and the offline network model can be optimized during conversion, which speeds up the model but also changes the hierarchical structure of the network and prevents the output of some operators from being dumped to the sigma_outtensor_dump.bin file. If you need data for each layer of the network model, turn on the disableDomainFuseOps option and re-run the calibrator to convert the fixed-point network model. The converted model is then not optimized by fusion and can therefore output the data for each layer.
- To dump network model data for different stages (for example, to compare a floating-point network model with a fixed-point network model), use the -t/--type and -m/--model parameters of the simulator to specify the model for each stage.
- Offline network models do not support Dump Debug.
- When you use a calibrator to convert a floating-point network model to a fixed-point network model, a tensor_min_max.txt file is generated under the SGS_IPU_SDK root directory. The file records the maximum and minimum values of each input and output of the network; this data is used in the subsequent analysis of the dump data.
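The renaming advice in the first note can be automated so that a float dump and a fixed dump never overwrite each other. This is a small sketch under assumptions: keep_dump is a hypothetical helper, and the label-prefix naming scheme is only one possible choice. The .bin suffix is left untouched, as the note requires.

```python
import os


def keep_dump(dump_dir, label, dump_name="sigma_outtensor_dump.bin"):
    """Rename the fresh dump to '<label>_<dump_name>' and return the new path.

    Hypothetical helper: only the file name prefix changes, the suffix stays .bin.
    """
    src = os.path.join(dump_dir, dump_name)
    dst = os.path.join(dump_dir, f"{label}_{dump_name}")
    if os.path.exists(dst):
        # Refuse to clobber an earlier dump that was kept under the same label.
        raise FileExistsError(dst)
    os.rename(src, dst)
    return dst
```

For example, calling keep_dump("/home/user", "float") after the floating-point run and keep_dump("/home/user", "fixed") after the fixed-point run leaves two separate files for later comparison.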
Analyzing data with the auto_dump_debug.sh script¶
The tool is located at ~/SGS_IPU_SDK/DumpDebug/auto_dump_debug.sh.
The auto_dump_debug.sh script compares the output tensors of the same layer in the sample bin (sample) and the reference bin (benchmark) and reports their MSE, COS, and RMSE. Note that you must first dump the bin files of the floating-point network model and the fixed-point network model as described in Dump data of each layer in the network model.
Below is an example usage:
./auto_dump_debug.sh \
    /home/user/SGS_IPU_SDK \
    /home/user/sample.bin \
    /home/user/benchmark.bin
The following is a description of the parameters used:
Param1
: Path of SGS_IPU_SDK. If the SDK folder is in the current directory, the folder name alone is sufficient.
Param2
: Path of the sample bin that has been dumped for comparison. This should be the path of the bin file dumped from fixed-point network model.
Param3
: Path of the benchmark bin that has been dumped for reference. This should be the path of the bin file dumped from floating-point network model.
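For orientation, the textbook definitions of the three reported quantities can be written as follows, assuming COS stands for cosine similarity. This is only a sketch of the standard metrics, not the script's implementation: the script's exact normalization is not documented here, and the RMSE values in the sample output of this section are not the square root of the listed MSE values, so the script evidently computes at least one of them differently.

```python
import math


def mse(sample, benchmark):
    """Mean squared error between two equal-length sequences."""
    return sum((s - b) ** 2 for s, b in zip(sample, benchmark)) / len(sample)


def rmse(sample, benchmark):
    """Root mean squared error (textbook form: square root of the MSE)."""
    return math.sqrt(mse(sample, benchmark))


def cosine_similarity(sample, benchmark):
    """Cosine of the angle between the two tensors viewed as flat vectors."""
    dot = sum(s * b for s, b in zip(sample, benchmark))
    norm = math.sqrt(sum(s * s for s in sample)) * math.sqrt(
        sum(b * b for b in benchmark)
    )
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    fixed_out = [0.5, 1.0, -1.5]   # illustrative values only
    float_out = [0.5, 1.0, -1.0]
    print(mse(fixed_out, float_out), rmse(fixed_out, float_out),
          cosine_similarity(fixed_out, float_out))
```

A cosine similarity close to 1 means the fixed-point tensor points in nearly the same direction as the floating-point one, which is why it is a useful complement to the error magnitudes.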
If the disableDomainFuseOps option is not turned on, the result of the analysis will be as displayed (partially) below:
3.conv1_xx_xx_xx.xx.output0 OP_TYPE:CONV_2D MSE:0.000046 COS:0.999898 RMSE:0.014310
4.pool1.xx.output0 OP_TYPE:MAX_POOL_2D MSE:0.000057 COS:0.999931 RMSE:0.011018
7.res2a_branch1_xx_xx.xx.output0 OP_TYPE:CONV_2D MSE:0.000076 COS:0.999886 RMSE:0.015652
11.res2a_branch2a_xx_xx_xx.xx.output0 OP_TYPE:CONV_2D MSE:0.000027 COS:0.999652 RMSE:0.023612
16.res2a_xx.xx.output0 OP_TYPE:RELU MSE:0.000177 COS:0.999644 RMSE:0.025049
20.res2b_branch2a_xx_xx_xx.xx.output0 OP_TYPE:CONV_2D MSE:0.000069 COS:0.999242 RMSE:0.039121
25.res2b_xx.xx.output0 OP_TYPE:RELU MSE:0.000430 COS:0.999452 RMSE:0.030551
28.res3a_branch1_xx_xx.xx.output0 OP_TYPE:CONV_2D MSE:0.000165 COS:0.998975 RMSE:0.041644
32.res3a_branch2a_xx_xx_xx.xx.output0 OP_TYPE:CONV_2D MSE:0.000139 COS:0.998842 RMSE:0.048416
37.res3a_xx.xx.output0 OP_TYPE:RELU MSE:0.000297 COS:0.999069 RMSE:0.041043
41.res3b_branch2a_xx_xx_xx.xx.output0 OP_TYPE:CONV_2D MSE:0.000070 COS:0.998836 RMSE:0.051215
46.res3b_xx.xx.output0 OP_TYPE:RELU MSE:0.000366 COS:0.999035 RMSE:0.042863
49.res4a_branch1_xx_xx.xx.output0 OP_TYPE:CONV_2D MSE:0.000026 COS:0.999282 RMSE:0.035575
53.res4a_branch2a_xx_xx_xx.xx.output0 OP_TYPE:CONV_2D MSE:0.000108 COS:0.998950 RMSE:0.047213
58.res4a_xx.xx.output0 OP_TYPE:RELU MSE:0.000149 COS:0.999174 RMSE:0.040582
62.res4b_branch2a_xx_xx_xx.xx.output0 OP_TYPE:CONV_2D MSE:0.000030 COS:0.999020 RMSE:0.046630
67.res4b_xx.xx.output0 OP_TYPE:RELU MSE:0.000168 COS:0.999073 RMSE:0.041229
70.res5a_branch1_xx_xx.xx.output0 OP_TYPE:CONV_2D MSE:0.000305 COS:0.999276 RMSE:0.033715
74.res5a_branch2a_xx_xx_xx.xx.output0 OP_TYPE:CONV_2D MSE:0.000032 COS:0.998723 RMSE:0.053469
79.res5a_xx.xx.output0 OP_TYPE:RELU MSE:0.000507 COS:0.998762 RMSE:0.050663
83.res5b_branch2a_xx_xx_xx.xx.output0 OP_TYPE:CONV_2D MSE:0.000027 COS:0.998182 RMSE:0.064201
88.res5b_xx.xx.output0 OP_TYPE:RELU MSE:0.006022 COS:0.998665 RMSE:0.056931
89.pool5.xx.output0 OP_TYPE:AVG_POOL_2D MSE:0.004321 COS:0.998488 RMSE:0.079694
90.fc1000_reshape#1_output#183.output0 OP_TYPE:ReshapeAliasMSE:0.004321 COS:0.998488 RMSE:0.079694
91.fc1000#185.xx.output0 OP_TYPE:CONV_2D MSE:0.014197 COS:0.998901 RMSE:0.051447
92.fc1000.xx.output0 OP_TYPE:ReshapeAliasMSE:0.014197 COS:0.998901 RMSE:0.051447
93.prob.xx.output0 OP_TYPE:Fix2Float MSE:0.000000 COS:1.000000 RMSE:0.000152
If the disableDomainFuseOps option is turned on, the result of the analysis will be as displayed (partially) below:
0.conv1.xx.output0 OP_TYPE:CONV_2D MSE:2.397930 COS:0.999887 RMSE:0.019312
1.conv1_xx.xx.output0 OP_TYPE:MUL MSE:0.000037 COS:0.999850 RMSE:0.020800
2.conv1_xx_xx.xx.output0 OP_TYPE:ADD MSE:0.000037 COS:0.999936 RMSE:0.010667
3.conv1_xx_xx_xx.xx.output0 OP_TYPE:RELU MSE:0.000052 COS:0.999885 RMSE:0.015123
4.pool1.xx.output0 OP_TYPE:MAX_POOL_2D MSE:0.000063 COS:0.999924 RMSE:0.011603
5.res2a_branch1.xx.output0 OP_TYPE:CONV_2D MSE:0.000096 COS:0.999952 RMSE:0.009368
6.res2a_branch1_xx.xx.output0 OP_TYPE:MUL MSE:0.000080 COS:0.999956 RMSE:0.009144
7.res2a_branch1_xx_xx.xx.output0 OP_TYPE:ADD MSE:0.000080 COS:0.999880 RMSE:0.016141
8.res2a_branch2a.xx.output0 OP_TYPE:CONV_2D MSE:0.000349 COS:0.999951 RMSE:0.009727
9.res2a_branch2a_xx.xx.output0 OP_TYPE:MUL MSE:0.000049 COS:0.999956 RMSE:0.009349
10.res2a_branch2a_xx_xx.xx.output0 OP_TYPE:ADD MSE:0.000049 COS:0.999886 RMSE:0.016510
11.res2a_branch2a_xx_xx_xx.xx.output0 OP_TYPE:RELU MSE:0.000028 COS:0.999641 RMSE:0.024211
12.res2a_branch2b.xx.output0 OP_TYPE:CONV_2D MSE:0.000100 COS:0.999821 RMSE:0.018379
13.res2a_branch2b_xx.xx.output0 OP_TYPE:MUL MSE:0.000118 COS:0.999832 RMSE:0.017982
14.res2a_branch2b_xx_xx.xx.output0 OP_TYPE:ADD MSE:0.000118 COS:0.999670 RMSE:0.024984
15.res2a.xx.output0 OP_TYPE:ADD MSE:0.000241 COS:0.999771 RMSE:0.021801
16.res2a_xx.xx.output0 OP_TYPE:RELU MSE:0.000182 COS:0.999634 RMSE:0.025262
17.res2b_branch2a.xx.output0 OP_TYPE:CONV_2D MSE:0.000525 COS:0.999709 RMSE:0.023127
18.res2b_branch2a_xx.xx.output0 OP_TYPE:MUL MSE:0.000193 COS:0.999746 RMSE:0.022434
19.res2b_branch2a_xx_xx.xx.output0 OP_TYPE:ADD MSE:0.000193 COS:0.999474 RMSE:0.032121
20.res2b_branch2a_xx_xx_xx.xx.output0 OP_TYPE:RELU MSE:0.000071 COS:0.999219 RMSE:0.039747
21.res2b_branch2b.xx.output0 OP_TYPE:CONV_2D MSE:0.000157 COS:0.999549 RMSE:0.030371
22.res2b_branch2b_xx.xx.output0 OP_TYPE:MUL MSE:0.000272 COS:0.999587 RMSE:0.029537
23.res2b_branch2b_xx_xx.xx.output0 OP_TYPE:ADD MSE:0.000272 COS:0.999151 RMSE:0.040304
24.res2b.xx.output0 OP_TYPE:ADD MSE:0.000534 COS:0.999425 RMSE:0.031402
25.res2b_xx.xx.output0 OP_TYPE:RELU MSE:0.000448 COS:0.999429 RMSE:0.031203
26.res3a_branch1.xx.output0 OP_TYPE:CONV_2D MSE:0.000267 COS:0.999320 RMSE:0.034101
27.res3a_branch1_xx.xx.output0 OP_TYPE:MUL MSE:0.000169 COS:0.999345 RMSE:0.034331
28.res3a_branch1_xx_xx.xx.output0 OP_TYPE:ADD MSE:0.000169 COS:0.998949 RMSE:0.042434
29.res3a_branch2a.xx.output0 OP_TYPE:CONV_2D MSE:0.001885 COS:0.999472 RMSE:0.030785
30.res3a_branch2a_xx.xx.output0 OP_TYPE:MUL MSE:0.000427 COS:0.999486 RMSE:0.030586
31.res3a_branch2a_xx_xx.xx.output0 OP_TYPE:ADD MSE:0.000427 COS:0.998909 RMSE:0.045188
32.res3a_branch2a_xx_xx_xx.xx.output0 OP_TYPE:RELU MSE:0.000147 COS:0.998773 RMSE:0.049700
33.res3a_branch2b.xx.output0 OP_TYPE:CONV_2D MSE:0.000331 COS:0.999425 RMSE:0.034038
34.res3a_branch2b_xx.xx.output0 OP_TYPE:MUL MSE:0.000368 COS:0.999411 RMSE:0.034275
35.res3a_branch2b_xx_xx.xx.output0 OP_TYPE:ADD MSE:0.000368 COS:0.998931 RMSE:0.044919
36.res3a.xx.output0 OP_TYPE:ADD MSE:0.000585 COS:0.999040 RMSE:0.042398
37.res3a_xx.xx.output0 OP_TYPE:RELU MSE:0.000301 COS:0.999058 RMSE:0.041479
38.res3b_branch2a.xx.output0 OP_TYPE:CONV_2D MSE:0.000811 COS:0.999530 RMSE:0.029739
39.res3b_branch2a_xx.xx.output0 OP_TYPE:MUL MSE:0.000288 COS:0.999580 RMSE:0.028879
40.res3b_branch2a_xx_xx.xx.output0 OP_TYPE:ADD MSE:0.000288 COS:0.999383 RMSE:0.033874
41.res3b_branch2a_xx_xx_xx.xx.output0 OP_TYPE:RELU MSE:0.000073 COS:0.998787 RMSE:0.052493
42.res3b_branch2b.xx.output0 OP_TYPE:CONV_2D MSE:0.000136 COS:0.999469 RMSE:0.033442
43.res3b_branch2b_xx.xx.output0 OP_TYPE:MUL MSE:0.000314 COS:0.999489 RMSE:0.033050
44.res3b_branch2b_xx_xx.xx.output0 OP_TYPE:ADD MSE:0.000314 COS:0.999308 RMSE:0.036079
45.res3b.xx.output0 OP_TYPE:ADD MSE:0.000659 COS:0.998989 RMSE:0.042628
46.res3b_xx.xx.output0 OP_TYPE:RELU MSE:0.000375 COS:0.999011 RMSE:0.043575
47.res4a_branch1.xx.output0 OP_TYPE:CONV_2D MSE:0.000098 COS:0.999264 RMSE:0.037842
48.res4a_branch1_xx.xx.output0 OP_TYPE:MUL MSE:0.000027 COS:0.999298 RMSE:0.037615
49.res4a_branch1_xx_xx.xx.output0 OP_TYPE:ADD MSE:0.000027 COS:0.999253 RMSE:0.036302
50.res4a_branch2a.xx.output0 OP_TYPE:CONV_2D MSE:0.001011 COS:0.999543 RMSE:0.028507
51.res4a_branch2a_xx.xx.output0 OP_TYPE:MUL MSE:0.000300 COS:0.999544 RMSE:0.028563
52.res4a_branch2a_xx_xx.xx.output0 OP_TYPE:ADD MSE:0.000300 COS:0.999147 RMSE:0.040221
53.res4a_branch2a_xx_xx_xx.xx.output0 OP_TYPE:RELU MSE:0.000107 COS:0.998961 RMSE:0.047238
54.res4a_branch2b.xx.output0 OP_TYPE:CONV_2D MSE:0.000335 COS:0.999464 RMSE:0.032660
55.res4a_branch2b_xx.xx.output0 OP_TYPE:MUL MSE:0.000255 COS:0.999453 RMSE:0.032788
56.res4a_branch2b_xx_xx.xx.output0 OP_TYPE:ADD MSE:0.000255 COS:0.999234 RMSE:0.038888
57.res4a.xx.output0 OP_TYPE:ADD MSE:0.000312 COS:0.999298 RMSE:0.036873
58.res4a_xx.xx.output0 OP_TYPE:RELU MSE:0.000146 COS:0.999191 RMSE:0.040181
59.res4b_branch2a.xx.output0 OP_TYPE:CONV_2D MSE:0.000369 COS:0.999748 RMSE:0.021992
60.res4b_branch2a_xx.xx.output0 OP_TYPE:MUL MSE:0.000156 COS:0.999772 RMSE:0.021527
61.res4b_branch2a_xx_xx.xx.output0 OP_TYPE:ADD MSE:0.000156 COS:0.999625 RMSE:0.026857
62.res4b_branch2a_xx_xx_xx.xx.output0 OP_TYPE:RELU MSE:0.000028 COS:0.999073 RMSE:0.045494
63.res4b_branch2b.xx.output0 OP_TYPE:CONV_2D MSE:0.000048 COS:0.999748 RMSE:0.022612
64.res4b_branch2b_xx.xx.output0 OP_TYPE:MUL MSE:0.000191 COS:0.999751 RMSE:0.022354
65.res4b_branch2b_xx_xx.xx.output0 OP_TYPE:ADD MSE:0.000191 COS:0.999678 RMSE:0.024581
66.res4b.xx.output0 OP_TYPE:ADD MSE:0.000337 COS:0.999448 RMSE:0.032108
67.res4b_xx.xx.output0 OP_TYPE:RELU MSE:0.000160 COS:0.999113 RMSE:0.040588
68.res5a_branch1.xx.output0 OP_TYPE:CONV_2D MSE:0.000072 COS:0.999179 RMSE:0.038052
69.res5a_branch1_xx.xx.output0 OP_TYPE:MUL MSE:0.000290 COS:0.999179 RMSE:0.038130
70.res5a_branch1_xx_xx.xx.output0 OP_TYPE:ADD MSE:0.000290 COS:0.999311 RMSE:0.032880
71.res5a_branch2a.xx.output0 OP_TYPE:CONV_2D MSE:0.000355 COS:0.999758 RMSE:0.020371
72.res5a_branch2a_xx.xx.output0 OP_TYPE:MUL MSE:0.000133 COS:0.999750 RMSE:0.020561
73.res5a_branch2a_xx_xx.xx.output0 OP_TYPE:ADD MSE:0.000133 COS:0.999613 RMSE:0.026392
74.res5a_branch2a_xx_xx_xx.xx.output0 OP_TYPE:RELU MSE:0.000028 COS:0.998859 RMSE:0.051003
75.res5a_branch2b.xx.output0 OP_TYPE:CONV_2D MSE:0.000063 COS:0.999512 RMSE:0.031809
76.res5a_branch2b_xx.xx.output0 OP_TYPE:MUL MSE:0.001009 COS:0.999480 RMSE:0.032233
77.res5a_branch2b_xx_xx.xx.output0 OP_TYPE:ADD MSE:0.001009 COS:0.999257 RMSE:0.039436
78.res5a.xx.output0 OP_TYPE:ADD MSE:0.001413 COS:0.999379 RMSE:0.034297
79.res5a_xx.xx.output0 OP_TYPE:RELU MSE:0.000457 COS:0.998876 RMSE:0.048274
80.res5b_branch2a.xx.output0 OP_TYPE:CONV_2D MSE:0.001387 COS:0.999842 RMSE:0.015781
81.res5b_branch2a_xx.xx.output0 OP_TYPE:MUL MSE:0.000239 COS:0.999839 RMSE:0.015798
82.res5b_branch2a_xx_xx.xx.output0 OP_TYPE:ADD MSE:0.000239 COS:0.999639 RMSE:0.026762
83.res5b_branch2a_xx_xx_xx.xx.output0 OP_TYPE:RELU MSE:0.000023 COS:0.998458 RMSE:0.058598
84.res5b_branch2b.xx.output0 OP_TYPE:CONV_2D MSE:0.000041 COS:0.999061 RMSE:0.041137
85.res5b_branch2b_xx.xx.output0 OP_TYPE:MUL MSE:0.007750 COS:0.999058 RMSE:0.041188
86.res5b_branch2b_xx_xx.xx.output0 OP_TYPE:ADD MSE:0.007751 COS:0.998732 RMSE:0.052601
87.res5b.xx.output0 OP_TYPE:ADD MSE:0.008533 COS:0.998792 RMSE:0.052043
88.res5b_xx.xx.output0 OP_TYPE:RELU MSE:0.004860 COS:0.998924 RMSE:0.051718
89.pool5.xx.output0 OP_TYPE:AVG_POOL_2D MSE:0.004061 COS:0.998601 RMSE:0.078395
90.fc1000_reshape#1_output#183.output0 OP_TYPE:ReshapeAliasMSE:0.004061 COS:0.998601 RMSE:0.078395
91.fc1000#185.xx.output0 OP_TYPE:CONV_2D MSE:0.013363 COS:0.998935 RMSE:0.050327
92.fc1000.xx.output0 OP_TYPE:ReshapeAliasMSE:0.013363 COS:0.998935 RMSE:0.050327
93.prob.xx.output0 OP_TYPE:SOFTMAX MSE:0.000000 COS:1.000000 RMSE:0.000152
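With close to a hundred entries, it can help to sort the report rather than read it top to bottom. The sketch below assumes the script's output was saved to text in the form shown above; the regular expression and the function names are illustrative, not part of the SDK. It tolerates entries where the operator type runs directly into the MSE field, as in the ReshapeAlias rows.

```python
import re

# One report entry: "<index>.<tensor name> OP_TYPE:<op> MSE:<v> COS:<v> RMSE:<v>".
# The lazy \w+? lets "ReshapeAliasMSE:" split into op "ReshapeAlias" and "MSE:".
ENTRY_RE = re.compile(
    r"(?P<index>\d+)\.(?P<name>\S+)\s+OP_TYPE:(?P<op>\w+?)\s*"
    r"MSE:(?P<mse>[\d.]+)\s+COS:(?P<cos>[\d.]+)\s+RMSE:(?P<rmse>[\d.]+)"
)


def parse_report(text):
    """Return one dict per report entry found in `text`."""
    rows = []
    for m in ENTRY_RE.finditer(text):
        rows.append({
            "index": int(m.group("index")),
            "name": m.group("name"),
            "op": m.group("op"),
            "mse": float(m.group("mse")),
            "cos": float(m.group("cos")),
            "rmse": float(m.group("rmse")),
        })
    return rows


def worst_layers(rows, count=3):
    """Entries with the lowest COS first, i.e. the least similar outputs."""
    return sorted(rows, key=lambda r: r["cos"])[:count]


if __name__ == "__main__":
    sample = (
        "3.conv1_xx_xx_xx.xx.output0 OP_TYPE:CONV_2D MSE:0.000046 COS:0.999898 RMSE:0.014310\n"
        "90.fc1000_reshape#1_output#183.output0 OP_TYPE:ReshapeAliasMSE:0.004321 COS:0.998488 RMSE:0.079694\n"
        "93.prob.xx.output0 OP_TYPE:Fix2Float MSE:0.000000 COS:1.000000 RMSE:0.000152\n"
    )
    for row in worst_layers(parse_report(sample)):
        print(row["index"], row["op"], row["cos"], row["rmse"])
```

Sorting by COS is one choice; sorting by RMSE in descending order works the same way and matches the RMSE threshold mentioned under Related Issues.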
Histogram¶
The histogram.py tool draws a histogram for each layer of the network model.
The tool is located at ~/SGS_IPU_SDK/DumpDebug/histogram.py and plots the data distribution of each layer of the dumped data. Besides the dumped data file, it also needs the log/tensor_min_max.txt file that is generated under the root directory when you use the calibrator tool to convert the floating-point network model to a fixed-point network model.
Below is an example tool usage:
python3 histogram.py sigma_outtensor_dump.bin tensor_min_max.txt
Progress prompted during operation reads as follows:
[===============================================> ]97.61%
The drawn histogram is as illustrated below:
The blue part in the figure above is the number of occurrences of network data in this layer, and the red dashed lines on the left and right sides are the minimum and maximum values.
Note
- When the tool is running, a Histograms folder will be created in the current directory to store the data histogram images of each network layer.
- Before drawing histograms for a different set of dump data, rename the Histograms folder under the current path or move it to another path, because histogram.py deletes the Histograms folder under the current path when it runs.
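To make concrete what each histogram shows, the following sketch counts how many values of a layer fall into equal-width bins between the recorded minimum and maximum. It is an illustration only and does not reproduce histogram.py; the function names and the bin count are assumptions.

```python
def histogram_counts(values, lo, hi, bins=10):
    """Count values per equal-width bin on [lo, hi] (lo/hi play the role of min/max)."""
    if hi <= lo or bins < 1:
        raise ValueError("need hi > lo and at least one bin")
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        if lo <= v <= hi:
            # A value equal to hi is counted in the last bin.
            counts[min(int((v - lo) / width), bins - 1)] += 1
    return counts


def busiest_bin_share(counts):
    """Fraction of all counted values that sit in the fullest bin."""
    total = sum(counts)
    return max(counts) / total if total else 0.0


if __name__ == "__main__":
    layer_values = [0.0, 0.1, 0.2, 0.1, 40.0]  # illustrative data only
    counts = histogram_counts(layer_values, lo=0.0, hi=40.0, bins=4)
    print(counts, busiest_bin_share(counts))
```

A large share in a single bin combined with a wide min/max range is the "concentrated distribution, large range" pattern discussed under Related Issues.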
Error Handling¶
The process is terminated when a segmentation fault occurs during operation. In this case it is very important to pinpoint where the fault occurred. To do this, run the following commands to collect the information needed.
Change the core file build path to the current directory, and set the core file size as unlimited:
echo core > /proc/sys/kernel/core_pattern
ulimit -c unlimited
Re-run the network conversion command which encountered the fault mentioned above.
After the core file is generated, run the following command to print the process related information:
~/SGS_IPU_SDK/DumpDebug/show_address.sh /path/to/SGS_IPU_SDK/bin/XXX /path/to/core
The show_address.sh script requires two parameters. The first one is the path to the bin file for executing the command, and the second one is the core path.
The script prints the memory map information and the function stack addresses of the process; please provide this information when reporting the problem.
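Before re-running the failing command, it can be worth confirming that the core dump settings are actually in effect, since ulimit applies only to the current shell and the processes started from it. The sketch below uses Python's standard resource module on Linux; the function name core_dump_settings is hypothetical.

```python
import resource


def core_dump_settings(pattern_file="/proc/sys/kernel/core_pattern"):
    """Report the core size limit of this process and the kernel core pattern."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_CORE)
    try:
        with open(pattern_file) as f:
            pattern = f.read().strip()
    except OSError:
        # File missing or unreadable, e.g. on a non-Linux system.
        pattern = None
    return {
        "unlimited": soft == resource.RLIM_INFINITY,
        "soft_limit": soft,
        "pattern": pattern,
    }


if __name__ == "__main__":
    settings = core_dump_settings()
    print("core size unlimited:", settings["unlimited"])
    print("core pattern:", settings["pattern"])
```

Run it from the same shell in which ulimit -c unlimited was issued; a pattern of core means the core file is written to the working directory of the crashing process.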
Optimizing Model Accuracy¶
You can use sim_optimizer to optimize model accuracy.
When using a calibrator, the L1, L2, L3, L4 and L5 settings of --quant_level automatically configure the convolutional quantization method based on statistical information. If the accuracy of the fixed model is inferior to that of the float model, and changing [CONV_CONFIG] in input_config.ini to ALL_INT16 brings the accuracy of the fixed model close to that of the float model, you can use the sim_optimizer.py tool to optimize the model and obtain more suitable accuracy.
The sim_optimizer.py tool is located at ~/SGS_IPU_SDK/Scripts/examples/sim_optimizer.py.
Mandatory Parameter¶
-i, --image
: Path to image file or image folder.
-m, --model
: Floating-point network model file path.
--input_config
: Path to input_config.ini file, which contains the input tensor configuration information.
-n, --preprocess
: Pre-processing method, which is related to the image pre-processing method. You can also specify the pre-processing file path after completing the pre-processing file configuration.
Optional Parameter¶
--num_process
: Number of processes running simultaneously. (Optional parameter, if not specified, 10 processes will be run by default.)
Usage Example¶
Before using sim_optimizer.py, first set the convolution quantization in input_config.ini to full 16-bit precision, re-convert the float and fixed models, and verify that the fixed model has substantially the same accuracy as the original model. Then, to use sim_optimizer.py, revert the convolution quantization in input_config.ini to the default 8-bit quantization and re-convert the float and fixed models. An example usage of sim_optimizer.py is shown below:
python3 ~/SGS_IPU_SDK/Scripts/examples/sim_optimizer.py \
    -i ~/SGS_Models/resource/detection/coco2017_calibration_set32 \
    -m ~/SGS_Models/tensorflow/ssd_mobilenet_v1/ssd_mobilenet_v1_float.sim \
    -n ssd_mobilenet_v1 \
    --input_config ~/SGS_Models/tensorflow/ssd_mobilenet_v1/input_config.ini \
    --num_process 20
After sim_optimizer.py completes, several quantization import files are generated. Their number equals the length of the compression_rates list defined in sim_optimizer.py, so you can change the number of generated files by editing that list. The value range is (0.5, 1).
After that, you can use the sim_calibrator.py tool to import the quantization information to the model to generate the fixed model.
python3 ~/SGS_IPU_SDK/Scripts/examples/sim_calibrator.py \
    -i ~/SGS_Models/resource/detection/coco2017_calibration_set32 \
    -m ~/SGS_Models/tensorflow/ssd_mobilenet_v1/ssd_mobilenet_v1_float.sim \
    -n ssd_mobilenet_v1 \
    --input_config ~/SGS_Models/tensorflow/ssd_mobilenet_v1/input_config.ini \
    --quant_file ./optimized_quant_info_0.pkl \
    --num_process 20
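Because sim_optimizer.py can produce several quantization import files, the calibrator command above may need to be repeated once per file. The sketch below only builds the argument lists and does not run anything. It assumes the generated files follow the pattern optimized_quant_info_*.pkl (only optimized_quant_info_0.pkl appears in this document), and the function name is hypothetical; the command-line flags are the ones used in the example above.

```python
import glob
import os


def calibrator_commands(quant_dir, images, model, preprocess, input_config,
                        num_process=20):
    """Build one sim_calibrator.py argument list per quantization import file."""
    script = os.path.expanduser("~/SGS_IPU_SDK/Scripts/examples/sim_calibrator.py")
    # Assumed naming pattern for the files written by sim_optimizer.py.
    pattern = os.path.join(quant_dir, "optimized_quant_info_*.pkl")
    commands = []
    for quant_file in sorted(glob.glob(pattern)):
        commands.append([
            "python3", script,
            "-i", images,
            "-m", model,
            "-n", preprocess,
            "--input_config", input_config,
            "--quant_file", quant_file,
            "--num_process", str(num_process),
        ])
    return commands
```

Each returned list can be passed to subprocess.run. Whether successive calibrator runs overwrite the same fixed model file is not stated here, so it is safer to move each generated model aside before starting the next run.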
Related Issues¶
The DumpDebug Tool provides a method for troubleshooting accuracy degradation after model quantization, which can be used as a reference when resolving actual problems. When using a calibrator, the L2, L3, L4 and L5 settings of --quant_level automatically configure the convolutional quantization method based on statistical information. If the calibrator fails to switch the quantization method to “INT16” convolution mode under the following circumstances, modify it manually.
- When converting a SigmaStar floating-point network model into a SigmaStar fixed-point network model, check the log/tensor_min_max.txt file stored under the current directory. The file records the maximum and minimum values of each layer of the network during the conversion. If the difference between the maximum value and the minimum value of a convolution input layer is too large (generally greater than 30), enable the “INT16” convolution mode of that layer in the input_config.ini file of the corresponding network. Please refer to input config Configuration for the configuration details. Once the input_config.ini file is modified, restart the conversion from the model trained by the original framework.
- If the RMSE value obtained by the comparison described in Analyzing data with the auto_dump_debug.sh script is too large (generally greater than 0.5), you can enable the INT16 convolution mode of the convolution input layer preceding the layer in question. Once the input_config.ini file is modified, restart the conversion from the model trained by the original framework.
- When using the histogram.py tool described in Histogram, be sure to use the dump data of the SigmaStar floating-point network and the corresponding tensor_min_max.txt file. If the data distribution in the histogram is concentrated but the difference between the maximum and minimum values is large (generally greater than 30), consider turning on the INT16 convolution mode for the input of that convolution layer. Once the input_config.ini file is modified, restart the conversion from the model trained by the original framework.
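The two rules of thumb above (a min/max range greater than 30, an RMSE greater than 0.5) can be applied mechanically once the values are available. The sketch below takes already-parsed data, because the layout of tensor_min_max.txt is not described here; the function names and the example tensor names are hypothetical.

```python
RANGE_THRESHOLD = 30.0  # "generally greater than 30" in the text above
RMSE_THRESHOLD = 0.5    # "generally greater than 0.5" in the text above


def wide_range_tensors(min_max, threshold=RANGE_THRESHOLD):
    """Names whose (max - min) exceeds the threshold.

    `min_max` maps tensor name -> (min, max), parsed beforehand.
    """
    return [name for name, (lo, hi) in min_max.items() if hi - lo > threshold]


def high_rmse_layers(rmse_by_layer, threshold=RMSE_THRESHOLD):
    """Names whose RMSE against the floating-point dump exceeds the threshold."""
    return [name for name, value in rmse_by_layer.items() if value > threshold]


if __name__ == "__main__":
    # Illustrative values only, not taken from a real conversion.
    ranges = {"conv1_input": (-2.1, 2.6), "conv5_input": (-18.0, 25.5)}
    print(wide_range_tensors(ranges))
```

The flagged names are only candidates for the INT16 convolution mode; the thresholds are described as general guidance, so borderline layers still call for a look at the histogram.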