Operators
Caffe-Supported Operators

Operator | Note |
---|---|
ArgMax | |
BatchNorm | |
Concat | Max. 256 tensors per concat |
Convolution | Tensor size < 2^31. For a kernel of size do*h*w*di: h*w < 64 and round(di/16)*round(do/16) < 512*1024 (see the check sketched after this table). Runs as ordinary Convolution if group is 1, as Depthwise Convolution if group is C, and as multiple Convolutions if 1 < group < C. |
ConvolutionDepthwise | Natively supported kernel_size is 3*3; for any other case, ordinary Convolution is performed. Limitation: pad range [0, 1]. |
CReLU | |
ContinuationIndicator | |
Crop | |
Deconvolution | Tensor size < 2^31. For a kernel of size do*h*w*di: h*w < 64 and round(di/16)*round(do/16) < 512*1024. |
Dropout | |
Eltwise | Supports Add, Sub, Mul and Maximum. Supported shapes (4-dimensional vectors, NCHW): (NCHW, const), (NCHW, C), (NCHW, NCHW). |
Flatten | |
InnerProduct | For a weight of size do*di: round(di/16)*round(do/16) < 512*1024. |
Permute | |
Pooling | For a kernel of size h*w: 1. if global_pooling is true, h*w <= 255 for AVGPool with uint8 input, h*w <= 65025 for AVGPool with int16 input, and h*w <= 65025 for MAXPool; 2. if global_pooling is false, h*w <= 19*19. |
PriorBox | |
Power | Supports positive integers only |
Reshape | |
Reverse | |
ROIPooling | The rois input has dimension (N x 5). N may be greater than 1 only when the second half of the network consists entirely of InnerProduct layers; if any convolution exists in the second half, N must be 1 and that network segment must be executed N times in a loop. For detailed usage and restrictions, see the note below. |
ReLU | Input <= 4-dimensional |
PReLU | Input <= 4-dimensional |
Sigmoid | |
Slice | |
Scale | |
Softmax | To compute over a specific dimension, transpose that dimension to the last (innermost) dimension. Max. 32*512 = 16384. |
Split | |
Tanh | |
Threshold | Input = 4-dimensional |
Tile | |
Upsample | The Upsample operator does not exist in Caffe; you can modify a Deconvolution to an Upsample manually. Input = 4-dimensional; H and W must use the same scale. |
Reorg | Only supports stride = 2 |
LSTM | Supports forward only |
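The round(di/16)*round(do/16) < 512*1024 weight-size constraint that appears in the Convolution, Deconvolution and InnerProduct rows can be checked before conversion. Below is a minimal sketch; treating round() as rounding up to whole 16-channel tiles is an assumption, since the tables do not spell out the rounding rule:

```python
def fits_weight_limit(di: int, do: int, limit: int = 512 * 1024) -> bool:
    """Check round(di/16) * round(do/16) < 512*1024 for a do*di weight.

    Assumption: round() is taken as ceil to whole 16-channel tiles;
    the SDK's exact rounding rule may differ.
    """
    tiles_in = (di + 15) // 16    # ceil(di / 16)
    tiles_out = (do + 15) // 16   # ceil(do / 16)
    return tiles_in * tiles_out < limit

print(fits_weight_limit(4096, 4096))      # True:  256 * 256 = 65536 tiles
print(fits_weight_limit(200000, 70000))   # False: 12500 * 4375 tiles
```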
Please Note:
- The Upsample operator is defined as follows in prototxt:

```
layer {
  bottom: "layer85-conv"
  top: "layer86-upsample"
  name: "layer86-upsample"
  type: "Upsample"
  upsample_param {
    scale: 2
  }
}
```

The parameter scale has the same meaning as stride in Deconvolution. Note, however, that Upsample is equivalent to a Deconvolution operator whose weights are all 1.
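To see why the substitution works, here is a minimal numpy sketch (illustration only, not SDK code): a transposed convolution with kernel_size = stride = scale and all-one weights writes each input pixel into its own non-overlapping scale x scale output block, which is exactly nearest-neighbour upsampling on each channel.

```python
import numpy as np

def upsample_nearest(x, scale=2):
    """What the Upsample layer computes on one H*W channel."""
    return np.repeat(np.repeat(x, scale, axis=0), scale, axis=1)

def deconv_all_ones(x, scale=2):
    """Transposed convolution, kernel_size = stride = scale, weights all 1."""
    h, w = x.shape
    out = np.zeros((h * scale, w * scale), dtype=x.dtype)
    for i in range(h):
        for j in range(w):
            # each input pixel lands in its own scale x scale block
            out[i*scale:(i+1)*scale, j*scale:(j+1)*scale] += x[i, j]
    return out

x = np.arange(4, dtype=np.float32).reshape(2, 2)
assert np.array_equal(upsample_nearest(x), deconv_all_ones(x))
```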
- The ROIPooling operator is described as follows in prototxt:

```
layer {
  name: "roi_pool5"
  type: "ROIPooling"
  bottom: "conv5_3"
  bottom: "rois"
  top: "pool5"
  roi_pooling_param {
    pooled_w: 7
    pooled_h: 7
    spatial_scale: 0.0625
  }
}
```

roi_pooling_param supports pooled_w, pooled_h and spatial_scale only. For the Float model, the rois input holds the coordinates output by the rpn layer. For the Fixed and Offline models, the rois input holds the rpn layer's output coordinates multiplied by spatial_scale and then quantized to int16 before being fed to the model.
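For the Fixed and Offline models this means the rois are pre-processed on the host before being fed to the model. A minimal sketch of that step; the leading batch-index column (standard Caffe ROIPooling layout) and the rounding mode are assumptions, not SDK specification:

```python
import numpy as np

def quantize_rois(rois_float, spatial_scale=0.0625):
    """Scale rpn output coordinates by spatial_scale, quantize to int16.

    rois_float: (N, 5) array, assumed [batch_idx, x1, y1, x2, y2];
    the batch index column is left unscaled. np.rint is an assumed
    rounding mode; the SDK may truncate instead.
    """
    rois = np.asarray(rois_float, dtype=np.float32).copy()
    rois[:, 1:] *= spatial_scale
    return np.rint(rois).astype(np.int16)

print(quantize_rois(np.array([[0, 0, 0, 112, 112]])))  # [[0 0 0 7 7]]
```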
TensorFlow-Supported Operators

Category | Operator | Note |
---|---|---|
Convolution | Conv | Limitation: kernel_size h*w < 255 |
Convolution | DepthwiseConv2dNative | Natively supported kernel_size is 3*3; for any other case, ordinary Convolution is performed. |
Convolution | FullyConnected | |
Pooling | Max pooling | |
Pooling | Average Pooling | |
Activation | ReLU | |
Activation | PReLU | |
Activation | ReLU6 | |
Activation | LeakyReLU | |
Activation | Sigmoid | |
Math | Less | |
Math | Greater | |
Math | GreaterEqual | |
Math | Equal | |
Math | Add | |
Math | Sub | |
Math | Mul | |
Math | RealDiv | The second operand must be a constant tensor |
Math | Maximum | |
Math | Minimum | |
Math | Mean | |
Math | Max | |
Math | Sqrt | |
Math | Rsqrt | |
Math | Round | |
Math | Softmax | To compute over a specific dimension, transpose that dimension to the last (innermost) dimension. |
Math | FusedBatchNorm | |
Math | Exp | |
DMA | Align | |
DMA | ConcatV2 | |
DMA | Fill | |
DMA | Gather | |
DMA | GatherV2 | |
DMA | Pack | |
DMA | Pad | |
DMA | SpaceToBatchND | |
DMA | BatchToSpaceND | |
DMA | ZerosLike | |
DMA | Split | |
DMA | Slice | |
DMA | Unpack | |
DMA | Tile | |
DMA | Reshape | |
DMA | Transpose | |
DMA | Resize_bilinear | Bilinear interpolation currently supports only cases that satisfy all of the following: (1) integer-fold magnification only; (2) magnification of at most 8x; (3) 3D data only, i.e. the N of NHWC must be 1 (similar to convolution). See the check sketched after this table. |
Misc | TopKV2 | |
Misc | Shape | |
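The three Resize_bilinear requirements can be validated up front. A minimal sketch with a hypothetical helper name; shapes are NHWC tuples:

```python
def resize_bilinear_supported(in_shape, out_shape):
    """Check: integer-fold magnification, factor <= 8, batch N == 1."""
    n, h, w, _ = in_shape
    _, oh, ow, _ = out_shape
    if n != 1:                        # 3D data only: N of NHWC must be 1
        return False
    for src, dst in ((h, oh), (w, ow)):
        if dst % src != 0:            # integer-fold magnification only
            return False
        if dst // src > 8:            # at most 8x
            return False
    return True

print(resize_bilinear_supported((1, 32, 32, 16), (1, 64, 64, 16)))    # True
print(resize_bilinear_supported((1, 32, 32, 16), (1, 320, 320, 16)))  # False: 10x
```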
Onnx-Supported Operators

Operator | Note |
---|---|
Add | Supported shapes (4-dimensional vectors, NCHW): (NCHW, const), (NCHW, C), (NCHW, NCHW) |
Abs | |
ArgMax | |
AveragePool | For a kernel of size h*w: h*w <= 19*19 |
BatchNorm | |
Concat | Max. 256 tensors per concat |
Convolution | Tensor size < 2^31. For a kernel of size co*h*w*ci: h*w < 64 and round(ci/16)*round(co/16) < 512*1024. Supports the auto_pad SAME_UPPER attribute. Limitation: pads range [0, 7]. |
Clip | max == 6 is equivalent to ReLU6 |
DepthwiseConv2D | Natively supported kernel_size is 3*3; for any other case, ordinary Convolution is performed. Limitation: pad range [0, 1]. |
Div | Refer to the Add operator |
Dropout | |
DepthToSpace | For 4-dimensional input only |
Expand | |
Exp | |
Gather | Indices must be int32 constants |
Gemm | |
GlobalAveragePool | For a kernel of size h*w: h*w <= 255 for uint8, h*w <= 65025 for int16 |
GlobalMaxPool | For a kernel of size h*w: h*w <= 255 for uint8, h*w <= 65025 for int16 |
LSTM | Forward only |
Matmul | Input <= 4-dimensional. For a weight of size do*di: round(di/16)*round(do/16) < 512*1024. |
Mul | Refer to the Add operator |
MaxPool | For a kernel of size h*w: h*w <= 19*19 |
Max | |
Pad | Supports constant mode only |
Reshape | |
ReduceSum | Input <= 4-dimensional |
ReduceMean | Input <= 4-dimensional |
ReduceMax | Input <= 4-dimensional. Only one axis may be reduced at a time. |
Resize | Does not support the roi parameter; this operator only scales entire dimensions. In bilinear mode, the coordinate_transformation_mode parameter supports only align_corners and asymmetric. Downsampling is not supported. Input = 4-dimensional; H and W must use the same scale. |
ReLU | Input = 4-dimensional |
PReLU | Input <= 4-dimensional |
LeakyReLU | Input <= 4-dimensional |
TanH | |
Sigmoid | |
Slice | |
Softmax | To compute over a specific dimension, transpose that dimension to the last (innermost) dimension (see the sketch after this table). Max. 32*512 = 16384. |
Split | |
SpaceToDepth | |
Squeeze | |
Sub | Refer to the Add operator |
Transpose | |
Tile | |
Unsqueeze | |
Upsample | Input = 4-dimensional; H and W must use the same scale |
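The Softmax note in the Caffe, TensorFlow and Onnx tables above means that a Softmax over a non-innermost axis must be expressed as transpose, Softmax, transpose back. A minimal numpy illustration of the layout change (not SDK code):

```python
import numpy as np

def softmax_last_axis(x):
    """Softmax over the innermost dimension, the only axis supported."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Softmax over the C axis of an NCHW tensor: move C to the innermost
# position, apply Softmax there, then move it back.
x = np.random.rand(1, 8, 4, 4).astype(np.float32)
y = np.moveaxis(softmax_last_axis(np.moveaxis(x, 1, -1)), -1, 1)
assert np.allclose(y.sum(axis=1), 1.0)
```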
SigmaStar IPU SDK Model Limitations

- For DepthwiseConv, if kernel size > 3, input size == kernel size must be satisfied.
- For Softmax with a specified dimension, only the innermost dimension is supported (that is, if the tensor is multi-dimensional, Softmax is performed only over the innermost dimension).
- For TensorFlow networks, minimize the use of DMA operators that move large amounts of data (pure data-transport operators such as Gather, Unpack, Pack, Concat, Reshape, Slice, Tile, Transpose, Pad, and Split). If the C dimension is an integer multiple of 16, these operators run faster.
- Likewise, minimize the use of operators such as Split, Concat, Reshape, Flatten, Slice, and Permute in Caffe networks.
- Except for the first convolution layer, the Conv DI dimension of the remaining layers (i.e. the C dimension in NHWC) is proportional to efficiency: the larger the dimension, the higher the efficiency. The maximum supported value is 2048.
- For math operators (Add, Sub, Mul, Div, etc.), efficiency is higher when the right-hand operand is a scalar (single number) or 1-dimensional (identical across the H and W dimensions but varying along C); see the shape sketch after this list.
- Where possible, avoid network structures in which the output of one operator is the input of multiple operators, such as the ResNet residual structure and the GoogLeNet inception module.
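In numpy terms, the operand-shape patterns from the math-operator point above look like this (a sketch of the broadcast shapes only, not SDK code):

```python
import numpy as np

x = np.ones((1, 16, 8, 8), dtype=np.float32)        # NCHW feature map

fast_scalar = x * 2.0                                # (NCHW, const): scalar RHS
per_channel = np.arange(16, dtype=np.float32)        # one value per channel
fast_channel = x + per_channel.reshape(1, 16, 1, 1)  # (NCHW, C): identical over
                                                     # H and W, varying along C
slow_full = x - np.ones_like(x)                      # (NCHW, NCHW): full-tensor RHS
```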