doc: graph: add operation documents

Wuxun Zhang
2022-11-18 20:58:44 +08:00
committed by GitHub
parent 0648a3589f
commit f0a28862d1
96 changed files with 4269 additions and 44 deletions

View File

@ -1,5 +1,5 @@
#===============================================================================
# Copyright 2021 Intel Corporation
# Copyright 2021-2022 Intel Corporation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
@ -45,6 +45,7 @@ if (DOXYREST_FOUND)
--frame-dir=${DOXYREST_FRAME_DIR}/cfamily
--config=${CMAKE_CURRENT_BINARY_DIR}/doxyrest-config.lua
COMMAND ${CMAKE_COMMAND} -E copy_directory ${CMAKE_CURRENT_SOURCE_DIR}/doc/rst ${DOXYREST_OUTPUT_DIR}/rst
COMMAND ${CMAKE_COMMAND} -E copy_directory ${CMAKE_CURRENT_SOURCE_DIR}/doc/graph/rst ${DOXYREST_OUTPUT_DIR}/rst
COMMAND ${CMAKE_COMMAND} -E touch ${DOXYREST_STAMP_FILE}
WORKING_DIRECTORY ${DOXYREST_OUTPUT_DIR}
COMMENT "Translating documentation from .xml to .rst with Doxyrest" VERBATIM)

View File

@ -270,6 +270,8 @@ ALIASES += diffdstiterc="\f$\diffdstiterc\f$"
ALIASES += diffgamma="\f$\diffgamma\f$"
ALIASES += diffbeta="\f$\diffbeta\f$"
ALIASES += workspace="\f$\workspace\f$"
ALIASES += srcshape="\f$\srcshape\f$"
ALIASES += dstshape="\f$\dstshape\f$"
# This tag can be used to specify a number of word-keyword mappings (TCL only).
# A mapping has the form "name=value". For example adding "class=itcl::class"

View File

@ -281,7 +281,7 @@ $ cmake -DONEDNN_GPU_RUNTIME=OCL -DOPENCLROOT=/path/to/opencl/sdk ..
## Graph component limitations
The graph component can be enabled via the build option `ONEDNN_BUILD_GRAPH`.
But the build option doesn't work with some values of other build options.
But the build option does not work with some values of other build options.
Specifying these options and values simultaneously in one build will lead to a
CMake error.

View File

@ -13,11 +13,11 @@ The graph dumping feature only works when `ONEDNN_BUILD_GRAPH` is ON.
## Run-Time Controls
When the feature is enabled at build time, users can use an environment variable
`ONEDNN_GRAPH_DUMP` to control the serialization level. This option accepts
setting flags. These flags can be combined together to make the library dumping
different files. For example, the below setting will generate files containing
library graph and subgraphs in each partition.
When the feature is enabled at build time, the environment variable
`ONEDNN_GRAPH_DUMP` can be used to control the serialization level. This option
accepts flags that can be combined to make the library dump different kinds of
files. For example, the setting below generates files containing the library
graph and the subgraphs in each partition.
| Variable | Flags | Description
| :--- | :--- |:---

View File

@ -0,0 +1,41 @@
# Abs {#dev_guide_op_abs}
## General
Abs operation computes the element-wise absolute value of a given tensor. It
applies the following formula to every element of the \src tensor (the variable
names follow the standard @ref dev_guide_conventions):
\f[ dst = \begin{cases} src & \text{if}\ src \ge 0 \\
-src & \text{if}\ src < 0 \end{cases} \f]
## Operation attributes
Abs operation does not support any attribute.
## Execution arguments
The inputs and outputs must be provided according to the index order below when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `src` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `dst` | Required
## Supported data types
Abs operation supports the following data type combinations.
Src | Dst
-- | --
f32 | f32
f16 | f16
bf16 | bf16
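
As an illustration, below is a minimal C++ sketch of constructing an Abs
operation and adding it to a graph, assuming the oneDNN v3.x `dnnl::graph` C++
API; the tensor ids, shapes, and op name are illustrative, not part of this
specification.

```cpp
#include "oneapi/dnnl/dnnl_graph.hpp"
using namespace dnnl::graph;

int main() {
    using dt = logical_tensor::data_type;
    using lay = logical_tensor::layout_type;

    // Logical tensors; ids and shapes are illustrative.
    logical_tensor src {0, dt::f32, {8, 64, 56, 56}, lay::strided};
    logical_tensor dst {1, dt::f32, {8, 64, 56, 56}, lay::strided};

    // Index order matters: input 0 is `src`, output 0 is `dst`.
    op abs_op(2, op::kind::Abs, {src}, {dst}, "abs0");

    graph g(dnnl::engine::kind::cpu);
    g.add_op(abs_op);
    return 0;
}
```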

View File

@ -0,0 +1,42 @@
# AbsBackward {#dev_guide_op_absbackward}
## General
AbsBackward operation computes the gradient of the Abs operation.
\f[ diff\_src = \begin{cases} diff\_dst & \text{if}\ src > 0 \\
-diff\_dst & \text{if}\ src < 0 \\
0 & \text{if}\ src = 0 \\
\end{cases} \f]
## Operation attributes
AbsBackward operation does not support any attribute.
## Execution arguments
The inputs and outputs must be provided according to the index order below when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `src` | Required
1 | `diff_dst` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `diff_src` | Required
## Supported data types
AbsBackward operation supports the following data type combinations.
Src | Diff_dst | Diff_src
-- | -- | --
f32 | f32 | f32
f16 | f16 | f16
bf16 | bf16 | bf16

View File

@ -0,0 +1,50 @@
# Add {#dev_guide_op_add}
## General
Add operation performs element-wise addition of two given tensors, applying
multi-directional broadcast rules.
\f[
\dst(\overline{x}) =
\src_0(\overline{x}) \mathbin{+} \src_1(\overline{x}),
\f]
## Operation attributes
Attribute Name | Description | Value Type |Supported Values | Required or Optional
-- | -- | --| --|--
[auto_broadcast](@ref dnnl::graph::op::attr::auto_broadcast) | Specifies rules used for auto-broadcasting of src tensors. |string |`none`, `numpy` (default) | Optional
## Execution arguments
The inputs and outputs must be provided according to the index order below when
constructing an operation.
### Inputs
| Index | Argument Name | Required or Optional |
| ----- | ------------- | -------------------- |
| 0 | `src_0` | Required |
| 1 | `src_1` | Required |
@note Both src shapes should match and no auto-broadcasting is allowed if the
`auto_broadcast` attribute is `none`. `src_0` and `src_1` shapes can be
different and auto-broadcasting is allowed if the `auto_broadcast` attribute is
`numpy`.
### Outputs
| Index | Argument Name | Required or Optional |
| ----- | ------------- | -------------------- |
| 0 | `dst` | Required |
## Supported data types
Add operation supports the following data type combinations.
| Source0/1 | Destination |
| ---- | ------- |
| f32 | f32 |
| bf16 | bf16 |
| f16 | f16 |
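
For example, here is a hedged C++ sketch (same assumptions as the earlier Abs
sketch; ids and shapes are illustrative) of an Add operation whose second input
is broadcast under `numpy` rules:

```cpp
#include <string>
#include "oneapi/dnnl/dnnl_graph.hpp"
using namespace dnnl::graph;

int main() {
    using dt = logical_tensor::data_type;
    using lay = logical_tensor::layout_type;

    // src_1 {1, 64, 1, 1} broadcasts over src_0 {8, 64, 56, 56}.
    logical_tensor src_0 {0, dt::f32, {8, 64, 56, 56}, lay::strided};
    logical_tensor src_1 {1, dt::f32, {1, 64, 1, 1}, lay::strided};
    logical_tensor dst {2, dt::f32, {8, 64, 56, 56}, lay::strided};

    op add_op(3, op::kind::Add, {src_0, src_1}, {dst}, "add0");
    add_op.set_attr<std::string>(op::attr::auto_broadcast, "numpy");

    graph g(dnnl::engine::kind::cpu);
    g.add_op(add_op);
    return 0;
}
```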

View File

@ -0,0 +1,61 @@
# AvgPool {#dev_guide_op_avgpool}
## General
AvgPool operation performs the computation according to the formulas below.
Variable names follow the standard @ref dev_guide_conventions.
\f[
\dst(n, c, oh, ow) =
\frac{1}{DENOM}
\sum\limits_{kh, kw}
\src(n, c, oh \cdot SH + kh \cdot (DH + 1) - PH_L, ow \cdot SW + kw \cdot (DW + 1) - PW_L)
\f]
where:
- if the attribute `exclude_pad` is set to false, \f$DENOM = KH \cdot KW\f$,
- if the attribute `exclude_pad` is set to true, \f$DENOM\f$ equals the size of
the overlap between the averaging window and the image.
## Operation attributes
Attribute Name | Description | Value Type |Supported Values | Required or Optional
-- | -- | --| --|--
[strides](@ref dnnl::graph::op::attr::strides) | Controls the strides with which the window is moved. |s64 |A s64 list containing positive values | Required
[pads_begin](@ref dnnl::graph::op::attr::pads_begin) | Controls the number of zeros to be added to the front/top/left of spatial dimensions. The attribute will be ignored when the `auto_pad` attribute is set to `same_upper`, `same_lower` or `valid`.|s64 | A s64 list containing non-negative values | Required
[pads_end](@ref dnnl::graph::op::attr::pads_end) | Controls the number of zeros to be added to the back/bottom/right of spatial dimensions. The attribute will be ignored when the `auto_pad` attribute is set to `same_upper`, `same_lower` or `valid`. |s64 |A s64 list containing non-negative values | Required
[kernel](@ref dnnl::graph::op::attr::kernel) | Size of pooling window. | s64| A s64 list containing positive values | Required
[exclude_pad](@ref dnnl::graph::op::attr::exclude_pad)| Controls whether the padded values are counted. |bool | True, False| Required
[rounding_type](@ref dnnl::graph::op::attr::rounding_type) | Controls how to do rounding. |string | `floor` (default), `ceil` | Optional
[auto_pad](@ref dnnl::graph::op::attr::auto_pad) |Controls how the paddings are calculated.| string | `none` (default), `same_upper`, `same_lower`, `valid` | Optional
[data_format](@ref dnnl::graph::op::attr::data_format) |Controls how to interpret the shape of `src` and `dst`.| string|`NCX`, `NXC` (default) | Optional
## Execution arguments
The inputs and outputs must be provided according to the index order below when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0|`src` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0|`dst` | Required
## Supported data types
AvgPool operation supports the following data type combinations.
Src | Dst
-- | --
f32 |f32
bf16 |bf16
f16 |f16
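
A minimal C++ sketch of an AvgPool op follows (same API assumptions as above;
shapes and attribute values are illustrative). A 2x2 window with stride 2 and
no padding maps a 14x14 `NCX` input to a 7x7 output:

```cpp
#include <string>
#include <vector>
#include "oneapi/dnnl/dnnl_graph.hpp"
using namespace dnnl::graph;

int main() {
    using dt = logical_tensor::data_type;
    using lay = logical_tensor::layout_type;

    logical_tensor src {0, dt::f32, {8, 16, 14, 14}, lay::strided};
    logical_tensor dst {1, dt::f32, {8, 16, 7, 7}, lay::strided};

    op pool(2, op::kind::AvgPool, {src}, {dst}, "avgpool0");
    pool.set_attr<std::vector<int64_t>>(op::attr::strides, {2, 2});
    pool.set_attr<std::vector<int64_t>>(op::attr::kernel, {2, 2});
    pool.set_attr<std::vector<int64_t>>(op::attr::pads_begin, {0, 0});
    pool.set_attr<std::vector<int64_t>>(op::attr::pads_end, {0, 0});
    pool.set_attr<bool>(op::attr::exclude_pad, false);
    pool.set_attr<std::string>(op::attr::data_format, "NCX");

    graph g(dnnl::engine::kind::cpu);
    g.add_op(pool);
    return 0;
}
```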

View File

@ -0,0 +1,50 @@
# AvgPoolBackward {#dev_guide_op_avgpoolbackward}
## General
AvgPoolBackward operation accepts \f$\diffdst\f$ tensor and \f$\srcshape\f$
tensor (optional), and calculates \f$\diffsrc\f$ tensor.
## Operation attributes
Attribute Name | Description | Value Type |Supported Values | Required or Optional
-- | -- | --| --|--
[strides](@ref dnnl::graph::op::attr::strides) | Controls the strides with which the window is moved. |s64 |A s64 list containing positive values | Required
[pads_begin](@ref dnnl::graph::op::attr::pads_begin) | Controls the number of zeros to be added to the front/top/left of spatial dimensions. The attribute will be ignored when the `auto_pad` attribute is set to `same_upper`, `same_lower` or `valid`.|s64 | A s64 list containing non-negative values | Required
[pads_end](@ref dnnl::graph::op::attr::pads_end) | Controls the number of zeros to be added to the back/bottom/right of spatial dimensions. The attribute will be ignored when the `auto_pad` attribute is set to `same_upper`, `same_lower` or `valid`. |s64 |A s64 list containing non-negative values | Required
[kernel](@ref dnnl::graph::op::attr::kernel) | Size of pooling window. | s64| A s64 list containing positive values | Required
[exclude_pad](@ref dnnl::graph::op::attr::exclude_pad)| Controls whether the padded values are counted. |bool | True, False| Required
[auto_pad](@ref dnnl::graph::op::attr::auto_pad) |Controls how the paddings are calculated.| string | `none` (default), `same_upper`, `same_lower`, `valid` | Optional
[data_format](@ref dnnl::graph::op::attr::data_format) |Controls how to interpret the shape of `src` and `dst`.| string|`NCX`, `NXC` (default) | Optional
[src_shape](@ref dnnl::graph::op::attr::src_shape) |Denotes the shape of the input of the forward op.| s64| A s64 list containing positive values | Optional
## Execution arguments
The inputs and outputs must be provided according to the index order below when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0|`diff_dst` | Required
1|`src_shape` | Optional
@note Either the `src_shape` input or the `src_shape` attribute should be
provided. If both are provided, the `src_shape` input takes precedence over the
`src_shape` attribute.
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0|`diff_src` | Required
## Supported data types
AvgPoolBackward operation supports the following data type combinations.
Diff_dst |Diff_src|Src_shape
-- | --|--
f32 |f32|s64
bf16 |bf16|s64
f16 |f16|s64

View File

@ -0,0 +1,56 @@
# BatchNormForwardTraining {#dev_guide_op_batchnormforwardtraining}
## General
BatchNormForwardTraining operation performs batch normalization in training
mode. Mean and variance are computed at runtime using the following formulas:
- \f$\mu(c) = \frac{1}{NHW} \sum\limits_{nhw} \src(n, c, h, w)\f$,
- \f$\sigma^2(c) = \frac{1}{NHW} \sum\limits_{nhw} (\src(n, c, h, w) - \mu(c))^2\f$.
## Operation attributes
Attribute Name | Description | Value Type |Supported Values | Required or Optional
-- | -- | --| --|--
[epsilon](@ref dnnl::graph::op::attr::epsilon) | A number to be added to the variance to avoid division by zero. |f32 |A positive f32 value | Required
[momentum](@ref dnnl::graph::op::attr::momentum) | A number to be used to calculate running mean and running variance. |f32 |A positive f32 value | Optional
[data_format](@ref dnnl::graph::op::attr::data_format) |Controls how to interpret the shape of `src` and `dst`.| string|`NCX`, `NXC` (default) | Optional
## Execution arguments
The inputs and outputs must be provided according to the index order below when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0|`src` | Required
1|`mean` | Required
2|`variance`|Required
3|`gamma` | Optional
4|`beta` |Optional
@note `gamma` and `beta` should be either both provided or neither provided.
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0|`dst` | Required
1|`running_mean` | Required
2|`running_variance` | Required
3|`batch_mean` | Required
4|`batch_variance` | Required
## Supported data types
BatchNormForwardTraining operation supports the following data type combinations.
Src / Dst | Gamma / Beta / Mean / Variance / Batch_mean / Batch_variance / Running_mean / Running_variance
--|--
f32 | f32
bf16 | f32, bf16
f16 | f32

View File

@ -0,0 +1,59 @@
# BatchNormInference {#dev_guide_op_batchnorminference}
## General
The formula is the same as for the
[Batch Normalization primitive](@ref dev_guide_batch_normalization):
\f[
\dst(n, c, h, w) =
\gamma(c) \cdot
\frac{\src(n, c, h, w) - \mu(c)} {\sqrt{\sigma^2(c) + \varepsilon}}
+ \beta(c),
\f]
where
- \f$\gamma(c), \beta(c)\f$ are required scale and shift for a channel,
- \f$\mu(c), \sigma^2(c)\f$ are mean and variance for a channel, and
- \f$\varepsilon\f$ is a constant to improve numerical stability.
## Operation attributes
Attribute Name | Description | Value Type |Supported Values | Required or Optional
-- | -- | --| --|--
[epsilon](@ref dnnl::graph::op::attr::epsilon) | A number to be added to the variance to avoid division by zero. |f32 |A positive float value | Required
[data_format](@ref dnnl::graph::op::attr::data_format) |Controls how to interpret the shape of `src` and `dst`.| string|`NCX`, `NXC` (default) | Optional
## Execution arguments
The inputs and outputs must be provided according to the index order below when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0|`src` | Required
1|`gamma` | Required
2|`beta`|Required
3|`mean` | Required
4|`variance` (\f$\sigma^2\f$)|Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0|`dst` | Required
## Supported data types
BatchNormInference operation supports the following data type combinations.
Src / Dst | Gamma / Beta / Mean / Variance
--|--
f32 | f32
bf16 | f32, bf16
f16 | f32
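
As a sketch (same `dnnl::graph` API assumptions; ids and shapes illustrative),
a BatchNormInference op takes the five inputs in the order listed above:

```cpp
#include <string>
#include "oneapi/dnnl/dnnl_graph.hpp"
using namespace dnnl::graph;

int main() {
    using dt = logical_tensor::data_type;
    using lay = logical_tensor::layout_type;

    // Per-channel statistics and scale/shift are 1-D tensors of size C = 16.
    logical_tensor src   {0, dt::f32, {8, 16, 14, 14}, lay::strided};
    logical_tensor gamma {1, dt::f32, {16}, lay::strided};
    logical_tensor beta  {2, dt::f32, {16}, lay::strided};
    logical_tensor mean  {3, dt::f32, {16}, lay::strided};
    logical_tensor var   {4, dt::f32, {16}, lay::strided};
    logical_tensor dst   {5, dt::f32, {8, 16, 14, 14}, lay::strided};

    // Input order matters: src, gamma, beta, mean, variance.
    op bn(6, op::kind::BatchNormInference, {src, gamma, beta, mean, var},
            {dst}, "bn0");
    bn.set_attr<float>(op::attr::epsilon, 1e-5f);
    bn.set_attr<std::string>(op::attr::data_format, "NCX");

    graph g(dnnl::engine::kind::cpu);
    g.add_op(bn);
    return 0;
}
```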

View File

@ -0,0 +1,49 @@
# BatchNormTrainingBackward {#dev_guide_op_batchnormtrainingbackward}
## General
BatchNormTrainingBackward operation calculates the gradients of input tensors.
## Operation attributes
Attribute Name | Description | Value Type |Supported Values | Required or Optional
-- | -- | --| --|--
[epsilon](@ref dnnl::graph::op::attr::epsilon) | A number to be added to the variance to avoid division by zero. |f32 |A positive f32 value | Required
[data_format](@ref dnnl::graph::op::attr::data_format) |Controls how to interpret the shape of `src` and `dst`.| string|`NCX`, `NXC` (default) | Optional
## Execution arguments
The inputs and outputs must be provided according to the index order below when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0|`src` | Required
1|`diff_dst` | Required
2|`mean`|Required
3|`variance` | Required
4|`gamma` |Optional
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0|`diff_src` | Required
1|`diff_gamma` | Optional
2|`diff_beta` | Optional
@note `diff_gamma` and `diff_beta` should be either both provided or neither
provided. If neither is provided, the input `gamma` will be ignored.
## Supported data types
BatchNormTrainingBackward operation supports the following data type
combinations.
Src / Diff_dst / Diff_src | Mean / Variance / Gamma / Diff_gamma / Diff_beta
--|--
f32 | f32
bf16 | f32, bf16
f16 | f32

View File

@ -0,0 +1,45 @@
# BiasAdd {#dev_guide_op_biasadd}
## General
BiasAdd operation adds a bias to the channel dimension of the input. It is a
special `Add` whose bias is restricted to be 1-D. Broadcasting is supported.
\f[ \dst(n,c,h,w) = \src(n,c,h,w) + \bias(c) \f]
## Operation attributes
Attribute Name | Description | Value Type |Supported Values | Required or Optional
-- | -- | --| --|--
[data_format](@ref dnnl::graph::op::attr::data_format) | Controls how to interpret the shape of `src` and `dst`. |string |`NCX` , `NXC` (default) | Optional
## Execution arguments
The inputs and outputs must be provided according to the index order below when
constructing an operation.
### Inputs
| Index | Argument Name | Required or Optional |
| ----- | ------------- | -------------------- |
| 0 | `src` | Required |
| 1 | `bias` | Required |
@note `bias` is a 1D tensor to be added to the `src` tensor. Its size should be
the same as the size of the channel dimension of the `src` tensor.
### Outputs
| Index | Argument Name | Required or Optional |
| ----- | ------------- | -------------------- |
| 0 | `dst` | Required |
## Supported data types
BiasAdd operation supports the following data type combinations.
| Src | Bias | Dst |
| ---- | ------ | ------- |
| f32 | f32 | f32 |
| bf16 | bf16 | bf16 |
| f16 | f16 | f16 |
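
A short C++ sketch follows (same API assumptions; ids and shapes illustrative).
With `NXC` layout, the channel dimension is the last one, so the bias length
must equal that dimension:

```cpp
#include <string>
#include "oneapi/dnnl/dnnl_graph.hpp"
using namespace dnnl::graph;

int main() {
    using dt = logical_tensor::data_type;
    using lay = logical_tensor::layout_type;

    // Channel dimension is the last one (16), so `bias` has 16 elements.
    logical_tensor src  {0, dt::f32, {8, 14, 14, 16}, lay::strided};
    logical_tensor bias {1, dt::f32, {16}, lay::strided};
    logical_tensor dst  {2, dt::f32, {8, 14, 14, 16}, lay::strided};

    op bias_add(3, op::kind::BiasAdd, {src, bias}, {dst}, "biasadd0");
    bias_add.set_attr<std::string>(op::attr::data_format, "NXC");

    graph g(dnnl::engine::kind::cpu);
    g.add_op(bias_add);
    return 0;
}
```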

View File

@ -0,0 +1,40 @@
# BiasAddBackward {#dev_guide_op_biasaddbackward}
## General
BiasAddBackward operation computes the gradient on the bias tensor for the
BiasAdd operator. This op accumulates all the values from \f$\diffdst\f$ into
the channel dimension; the axis depends on the layout of the \src tensor.
## Operation attributes
Attribute Name | Description | Value Type |Supported Values | Required or Optional
-- | -- | --| --|--
[data_format](@ref dnnl::graph::op::attr::data_format) | Controls how to interpret the shape of `diff_dst` and `diff_bias`. |string |`NCX` , `NXC` (default) | Optional
## Execution arguments
The inputs and outputs must be provided according to the index order below when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0|`diff_dst` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0|`diff_bias` | Required
## Supported data types
BiasAddBackward operation supports the following data type combinations.
Diff_dst | Diff_bias
---- | -------
f32 | f32
bf16 | bf16
f16 | f16

View File

@ -0,0 +1,43 @@
# Clamp {#dev_guide_op_clamp}
## General
Clamp operation represents the clipping activation function. It applies the
following formula to every element of the \src tensor (the variable names
follow the standard @ref dev_guide_conventions):
\f[ clamp(src_i) = min(max(src_i, min\_value), max\_value) \f]
## Operation attributes
Attribute Name | Description | Value Type | Supported Values | Required or Optional
-- | -- | -- | -- | --
[min](@ref dnnl::graph::op::attr::min) | The lower bound of values in the output. Any value in the input that is smaller than this bound is replaced with the `min` value. | f32 | Arbitrary valid f32 value | Required
[max](@ref dnnl::graph::op::attr::max) | The upper bound of values in the output. Any value in the input that is greater than this bound is replaced with the `max` value. | f32 | Arbitrary valid f32 value | Required
## Execution arguments
The inputs and outputs must be provided according to the index order below when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `src` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `dst` | Required
## Supported data types
Clamp operation supports the following data type combinations.
Src | Dst
-- | --
f32 | f32
f16 | f16
bf16 | bf16
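
For instance, a hedged C++ sketch (same API assumptions; ids and shapes
illustrative) where `min` = 0 and `max` = 6 turn Clamp into a ReLU6-style
activation:

```cpp
#include "oneapi/dnnl/dnnl_graph.hpp"
using namespace dnnl::graph;

int main() {
    using dt = logical_tensor::data_type;
    using lay = logical_tensor::layout_type;

    logical_tensor src {0, dt::f32, {8, 64, 56, 56}, lay::strided};
    logical_tensor dst {1, dt::f32, {8, 64, 56, 56}, lay::strided};

    // min = 0 and max = 6 clip the input into the [0, 6] range.
    op clamp(2, op::kind::Clamp, {src}, {dst}, "clamp0");
    clamp.set_attr<float>(op::attr::min, 0.f);
    clamp.set_attr<float>(op::attr::max, 6.f);

    graph g(dnnl::engine::kind::cpu);
    g.add_op(clamp);
    return 0;
}
```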

View File

@ -0,0 +1,41 @@
# ClampBackward {#dev_guide_op_clampbackward}
## General
ClampBackward operation computes the gradient of the Clamp operation.
## Operation attributes
Attribute Name | Description | Value Type | Supported Values | Required or Optional
-- | -- | -- | -- | --
[min](@ref dnnl::graph::op::attr::min) | The lower bound of values in the output. Any value in the input that is smaller than this bound is replaced with the `min` value. | f32 | Arbitrary valid f32 value | Required
[max](@ref dnnl::graph::op::attr::max) | The upper bound of values in the output. Any value in the input that is greater than this bound is replaced with the `max` value. | f32 | Arbitrary valid f32 value | Required
[use_dst](@ref dnnl::graph::op::attr::use_dst) | If true, use `dst` of Clamp operation to calculate the gradient. Otherwise, use `src`. | bool | `true` (default), `false` | Optional
## Execution arguments
The inputs and outputs must be provided according to the index order below when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `src` / `dst` | Required
1 | `diff_dst` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `diff_src` | Required
## Supported data types
ClampBackward operation supports the following data type combinations.
Src | Diff_dst | Diff_src
-- | -- | --
f32 | f32 | f32
f16 | f16 | f16
bf16 | bf16 | bf16

View File

@ -0,0 +1,51 @@
# Concat {#dev_guide_op_concat}
## General
Concat operation concatenates \f$N\f$ tensors over `axis` (here designated
\f$C\f$) and is defined as (the variable names follow the standard
@ref dev_guide_conventions):
\f[
\dst(\overline{ou}, c, \overline{in}) =
\src_i(\overline{ou}, c', \overline{in}),
\f]
where \f$c = C_1 + \ldots + C_{i-1} + c'\f$.
## Operation attributes
Attribute Name | Description | Value Type |Supported Values | Required or Optional
-- | -- | --| --|--
[axis](@ref dnnl::graph::op::attr::axis) | Specifies dimension along which concatenation happens. |s64 | A s64 value in the range of [-r, r-1] where r = rank(src) | Required
## Execution arguments
The inputs and outputs must be provided according to the index order below when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0|`src_i` | Required
@note At least one input tensor is required. Data types and ranks of all input
tensors should match. The dimensions of all input tensors should be the same
except for the dimension specified by the `axis` attribute.
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0|`dst` |Required
## Supported data types
Concat operation supports the following data type combinations.
Src | Dst
-- | --
f32 |f32
bf16 |bf16
f16 |f16

View File

@ -0,0 +1,58 @@
# ConvTranspose {#dev_guide_op_convtranspose}
## General
ConvTranspose operation performs the same computation as calculating the
gradient of the Convolution operation with regard to \src. To see the
difference visually, you can refer to the
[visualization page](https://github.com/vdumoulin/conv_arithmetic/blob/master/README.md).
## Operation attributes
Attribute Name | Description | Value Type |Supported Values | Required or Optional
-- | -- | --| --|--
[strides](@ref dnnl::graph::op::attr::strides) | Controls the strides with which the weights tensor is moved when computing convolution. |s64 |A s64 list containing positive values | Required
[pads_begin](@ref dnnl::graph::op::attr::pads_begin) | Controls the number of zeros to be added to the front/top/left of spatial dimensions.|s64 | A s64 list containing non-negative values | Required
[pads_end](@ref dnnl::graph::op::attr::pads_end) | Controls the number of zeros to be added to the back/bottom/right of spatial dimensions. |s64 |A s64 list containing non-negative values | Required
[dilations](@ref dnnl::graph::op::attr::dilations) | Controls the amount of stretching the kernel before convolution ([visualization link](https://github.com/vdumoulin/conv_arithmetic/blob/master/README.md#dilated-convolution-animations)). | s64| A s64 list containing positive values (>1 means dilated convolution) | Required
[auto_pad](@ref dnnl::graph::op::attr::auto_pad)| Controls how the padding is calculated.|string | `none` (default), `same_upper`, `same_lower`, `valid` | Optional
[output_padding](@ref dnnl::graph::op::attr::output_padding)| Adds additional amount of padding per each spatial axis in `dst`.|s64 | A s64 list containing non-negative values, all zeros by default | Optional
[groups](@ref dnnl::graph::op::attr::groups) | Controls how input channels and output channels are divided into groups. |s64 |A positive s64 value, `1` by default | Optional
[data_format](@ref dnnl::graph::op::attr::data_format) |Controls how to interpret the shape of `src` and `dst`.| string|`NCX`, `NXC` (default) | Optional
[weights_format](@ref dnnl::graph::op::attr::weights_format) |Controls how to interpret the shape of `weights`.| string|`IOX`, `XOI` (default) | Optional
## Execution arguments
The inputs and outputs must be provided according to the index order below when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0|`src` | Required
1|`weights` | Required
2|`bias`|Optional
@note
The shape of \weights is
\f$(in\_channels / groups, out\_channels, spatial\_shape)\f$ for `IOX` format or
\f$(spatial\_shape, out\_channels, in\_channels / groups)\f$ for `XOI` format.
Both \f$in\_channels\f$ and \f$out\_channels\f$ must be divisible by *groups*
attribute.
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0|`dst` | Required
## Supported data types
ConvTranspose operation supports the following data type combinations.
Src | Weights | Bias | Dst
--|--|-- | --
f32 | f32 | f32 |f32
bf16 | bf16 | bf16 |bf16
f16 | f16 | f16 |f16

View File

@ -0,0 +1,56 @@
# ConvTransposeBackwardData {#dev_guide_op_convtransposebackwarddata}
## General
ConvTransposeBackwardData operation takes \f$\diffdst\f$ and \weights and
computes \f$\diffsrc\f$.
## Operation attributes
Attribute Name | Description | Value Type |Supported Values | Required or Optional
-- | -- | --| --|--
[strides](@ref dnnl::graph::op::attr::strides) | Controls the strides with which the weights tensor is moved when computing convolution. |s64 |A s64 list containing positive values | Required
[pads_begin](@ref dnnl::graph::op::attr::pads_begin) | Controls the number of zeros to be added to the front/top/left of spatial dimensions.|s64 | A s64 list containing non-negative values | Required
[pads_end](@ref dnnl::graph::op::attr::pads_end) | Controls the number of zeros to be added to the back/bottom/right of spatial dimensions. |s64 |A s64 list containing non-negative values | Required
[dilations](@ref dnnl::graph::op::attr::dilations) | Controls the amount of stretching the kernel before convolution ([visualization link](https://github.com/vdumoulin/conv_arithmetic/blob/master/README.md#dilated-convolution-animations)). | s64| A s64 list containing positive values (>1 means dilated convolution) | Required
[auto_pad](@ref dnnl::graph::op::attr::auto_pad)| Controls how the padding is calculated.|string | `none` (default), `same_upper`, `same_lower`, `valid` | Optional
[output_padding](@ref dnnl::graph::op::attr::output_padding)| Adds additional amount of padding per each spatial axis in `dst`.|s64 | A s64 list containing non-negative values, all zeros by default | Optional
[groups](@ref dnnl::graph::op::attr::groups) | Controls how input channels and output channels are divided into groups. |s64 |A positive s64 value, `1` by default | Optional
[data_format](@ref dnnl::graph::op::attr::data_format) |Controls how to interpret the shape of `src` and `dst`.| string|`NCX`, `NXC` (default) | Optional
[weights_format](@ref dnnl::graph::op::attr::weights_format) |Controls how to interpret the shape of `weights`.| string|`IOX`, `XOI` (default) | Optional
## Execution arguments
The inputs and outputs must be provided according to the index order below when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0|`diff_dst` | Required
1|`weights` | Required
@note
The shape of \weights is
\f$(in\_channels / groups, out\_channels, spatial\_shape)\f$ for `IOX` format or
\f$(spatial\_shape, out\_channels, in\_channels / groups)\f$ for `XOI` format.
Both \f$in\_channels\f$ and \f$out\_channels\f$ must be divisible by *groups*
attribute.
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0|`diff_src` | Required
## Supported data types
ConvTransposeBackwardData operation supports the following data type
combinations.
Diff_dst | Weights | Diff_src
--|--|--
f32 | f32 |f32
bf16 | bf16 |bf16
f16 | f16 | f16

View File

@ -0,0 +1,62 @@
# ConvTransposeBackwardWeights {#dev_guide_op_convtransposebackwardweights}
## General
ConvTransposeBackwardWeights operation takes \f$\diffdst\f$, \src, and an
optional \f$weights\_shape\f$, and computes \f$\diffweights\f$.
## Operation attributes
Attribute Name | Description | Value Type |Supported Values | Required or Optional
-- | -- | --| --|--
[strides](@ref dnnl::graph::op::attr::strides) | Controls the strides with which the weights tensor is moved when computing convolution. |s64 |A s64 list containing positive values | Required
[pads_begin](@ref dnnl::graph::op::attr::pads_begin) | Controls the number of zeros to be added to the front/top/left of spatial dimensions.|s64 | A s64 list containing non-negative values | Required
[pads_end](@ref dnnl::graph::op::attr::pads_end) | Controls the number of zeros to be added to the back/bottom/right of spatial dimensions. |s64 |A s64 list containing non-negative values | Required
[dilations](@ref dnnl::graph::op::attr::dilations) | Controls the amount of stretching the kernel before convolution ([visualization link](https://github.com/vdumoulin/conv_arithmetic/blob/master/README.md#dilated-convolution-animations)). | s64| A s64 list containing positive values (>1 means dilated convolution) | Required
[auto_pad](@ref dnnl::graph::op::attr::auto_pad)| Controls how the padding is calculated.|string | `none` (default), `same_upper`, `same_lower`, `valid` | Optional
[output_padding](@ref dnnl::graph::op::attr::output_padding)| Adds additional amount of padding per each spatial axis in `dst`.|s64 | A s64 list containing non-negative values, all zeros by default | Optional
[groups](@ref dnnl::graph::op::attr::groups) | Controls how input channels and output channels are divided into groups. |s64 |A positive s64 value, `1` by default | Optional
[data_format](@ref dnnl::graph::op::attr::data_format) |Controls how to interpret the shape of `src` and `dst`.| string|`NCX`, `NXC` (default) | Optional
[weights_format](@ref dnnl::graph::op::attr::weights_format) |Controls how to interpret the shape of `weights`.| string|`IOX`, `XOI` (default) | Optional
[weights_shape](@ref dnnl::graph::op::attr::weights_shape) |Denotes the shape of the `weights` tensor.| s64| A s64 list containing positive values| Optional
## Execution arguments
The inputs and outputs must be provided according to the index order below when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0|`src` | Required
1|`diff_dst` | Required
2|`weights_shape`|Optional
@note
The shape of \weights is
\f$(in\_channels / groups, out\_channels, spatial\_shape)\f$ for `IOX` format or
\f$(spatial\_shape, out\_channels, in\_channels / groups)\f$ for `XOI` format.
Both \f$in\_channels\f$ and \f$out\_channels\f$ must be divisible by *groups*
attribute.
@note Either the `weights_shape` input or the `weights_shape` attribute should
be provided. If both are provided, the `weights_shape` input takes precedence
over the `weights_shape` attribute.
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0|`diff_weights` | Required
## Supported data types
ConvTransposeBackwardWeights operation supports the following data type
combinations.
Src | Diff_dst | Diff_weights | Weights_shape
--|--|-- | --
f32 | f32 | f32 |s32
bf16 | bf16 | bf16 |s32
f16 | f16 | f16 |s32

View File

@ -0,0 +1,139 @@
# Convolution {#dev_guide_op_convolution}
## General
Convolution operation performs the convolution between the src tensor and the
weights tensor, which is defined by the following formulas. Variable names
follow the standard @ref dev_guide_conventions.
Let \src, \weights and \dst tensors have shape \f$N \times IC \times IH \times
IW\f$, \f$OC \times IC \times KH \times KW\f$, and \f$N \times OC \times OH
\times OW\f$ respectively.
Furthermore, let the remaining convolution parameters be:
| Parameter | Depth | Height | Width | Comment
| :--| :-- | :-- | :-- |:--
| Paddings: Front, top, and left | \f$PD_L\f$ | \f$PH_L\f$ | \f$PW_L\f$ | In the attributes we use `pads_begin` to indicate the corresponding vector of paddings |
| Padding: Back, bottom, and right | \f$PD_R\f$ | \f$PH_R\f$ | \f$PW_R\f$ | In the attributes we use `pads_end` to indicate the corresponding vector of paddings |
| Stride | \f$SD\f$ | \f$SH\f$ | \f$SW\f$ | In the attributes we use `strides` to indicate the corresponding vector of strides |
| Dilation | \f$DD\f$ | \f$DH\f$ | \f$DW\f$ | In the attributes we use `dilations` to indicate the corresponding vector of dilations|
To further simplify the formulas, we assume that the attributes `data_format`
and `weights_format` are set to `NCX` and `OIX` respectively. `NCX` means the
first axis represents the batch dimension, the second axis represents the
channel dimension, and the rest represent spatial dimensions. `OIX` means the
first axis represents the output channel dimension, the second axis represents
the input channel dimension, and the rest represent the weights spatial
dimensions.
### Regular Convolution
This is the same as the formula in
[Convolution primitive](@ref dev_guide_convolution).
\f[\dst(n, oc, oh, ow) = \bias(oc) \\
+ \sum_{ic=0}^{IC-1}\sum_{kh=0}^{KH-1}\sum_{kw=0}^{KW-1}
\src(n, ic, oh \cdot SH + kh - PH_L, ow \cdot SW + kw - PW_L)
\cdot
\weights(oc, ic, kh, kw).\f]
Here:
- \f$OH = \left\lfloor{\frac{IH - KH + PH_L + PH_R}{SH}} \right\rfloor + 1,\f$
- \f$OW = \left\lfloor{\frac{IW - KW + PW_L + PW_R}{SW}} \right\rfloor + 1.\f$
### Convolution with Groups
The attribute `groups` is set to a value greater than 1.
\f[
\dst(n, g \cdot OC_G + oc_g, oh, ow) =
\bias(g \cdot OC_G + oc_g) \\
+
\sum_{ic_g=0}^{IC_G-1}\sum_{kh=0}^{KH-1}\sum_{kw=0}^{KW-1}
\src(n, g \cdot IC_G + ic_g, oh \cdot SH + kh - PH_L,
ow \cdot SW + kw - PW_L)
\cdot
\weights(g, oc_g, ic_g, kh, kw),
\f]
where
- \f$IC_G = \frac{IC}{G}\f$,
- \f$OC_G = \frac{OC}{G}\f$, and
- \f$oc_g \in [0, OC_G).\f$
### Convolution with Dilation
The attribute `dilations` contains an element greater than 1.
\f[
\dst(n, oc, oh, ow) =
\bias(oc) \\
+
\sum_{ic=0}^{IC-1}\sum_{kh=0}^{KH-1}\sum_{kw=0}^{KW-1}
\src(n, ic, oh \cdot SH + kh \cdot DH - PH_L,
ow \cdot SW + kw \cdot DW - PW_L)
\cdot
\weights(oc, ic, kh, kw).
\f]
Here:
- \f$OH = \left\lfloor{\frac{IH - DKH + PH_L + PH_R}{SH}}
\right\rfloor + 1,\f$ where \f$DKH = 1 + (KH - 1) \cdot DH\f$, and
- \f$OW = \left\lfloor{\frac{IW - DKW + PW_L + PW_R}{SW}}
\right\rfloor + 1,\f$ where \f$DKW = 1 + (KW - 1) \cdot DW\f$.
## Operation attributes
Attribute Name | Description | Value Type |Supported Values | Required or Optional
-- | -- | --| --|--
[strides](@ref dnnl::graph::op::attr::strides) | Controls the strides with which the weights tensor is moved when computing convolution |s64 |A s64 list containing positive values | Required
[pads_begin](@ref dnnl::graph::op::attr::pads_begin) | Controls the number of zeros to be added to the front/top/left of spatial dimensions|s64 | A s64 list containing non-negative values | Required
[pads_end](@ref dnnl::graph::op::attr::pads_end) | Controls the number of zeros to be added to the back/bottom/right of spatial dimensions |s64 |A s64 list containing non-negative values | Required
[dilations](@ref dnnl::graph::op::attr::dilations) | Controls the amount of stretching the kernel before convolution ([visualization link](https://github.com/vdumoulin/conv_arithmetic/blob/master/README.md#dilated-convolution-animations)) | s64| A s64 list containing positive values (>1 means dilated convolution) | Required
[auto_pad](@ref dnnl::graph::op::attr::auto_pad)| Controls how the padding is calculated|string | `none` (default), `same_upper`, `same_lower`, `valid` | Optional
[groups](@ref dnnl::graph::op::attr::groups) | Controls how input channels and output channels are divided into groups |s64 |A positive s64 value, `1` by default | Optional
[data_format](@ref dnnl::graph::op::attr::data_format) |Controls how to interpret the shape of `src` and `dst`.| string|`NCX`, `NXC` (default) | Optional
[weights_format](@ref dnnl::graph::op::attr::weights_format) |Controls how to interpret the shape of `weights`| string|`OIX`, `XIO` (default) | Optional
## Execution arguments
The inputs and outputs must be provided according to the index order below when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0|`src` | Required
1|`weights` | Required
2|`bias`|Optional
@note
The shape of \weights is
\f$(out\_channels, in\_channels / groups, spatial\_shape)\f$ for `OIX` format or
\f$(spatial\_shape, in\_channels / groups, out\_channels)\f$ for `XIO` format.
Both \f$in\_channels\f$ and \f$out\_channels\f$ must be divisible by *groups*
attribute.
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0|`dst` | Required
## Supported data types
Convolution operation supports the following data type combinations.
Src | Weights | Bias | Dst
--|--|-- | --
f32 | f32 | f32 |f32
bf16 | bf16 | bf16 |bf16
f16 | f16 | f16 |f16
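
Below is a hedged C++ sketch (same `dnnl::graph` API assumptions; ids, shapes,
and attribute values are illustrative) of a 3x3 Convolution with stride 1 and
padding 1, which keeps the 56x56 spatial size:

```cpp
#include <string>
#include <vector>
#include "oneapi/dnnl/dnnl_graph.hpp"
using namespace dnnl::graph;

int main() {
    using dt = logical_tensor::data_type;
    using lay = logical_tensor::layout_type;

    // NCX src, OIX weights: {out_channels, in_channels / groups, KH, KW}.
    logical_tensor src     {0, dt::f32, {8, 3, 56, 56}, lay::strided};
    logical_tensor weights {1, dt::f32, {64, 3, 3, 3}, lay::strided};
    logical_tensor dst     {2, dt::f32, {8, 64, 56, 56}, lay::strided};

    op conv(3, op::kind::Convolution, {src, weights}, {dst}, "conv0");
    conv.set_attr<std::vector<int64_t>>(op::attr::strides, {1, 1});
    conv.set_attr<std::vector<int64_t>>(op::attr::pads_begin, {1, 1});
    conv.set_attr<std::vector<int64_t>>(op::attr::pads_end, {1, 1});
    conv.set_attr<std::vector<int64_t>>(op::attr::dilations, {1, 1});
    conv.set_attr<int64_t>(op::attr::groups, 1);
    conv.set_attr<std::string>(op::attr::data_format, "NCX");
    conv.set_attr<std::string>(op::attr::weights_format, "OIX");

    graph g(dnnl::engine::kind::cpu);
    g.add_op(conv);
    return 0;
}
```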

View File

@ -0,0 +1,108 @@
# ConvolutionBackwardData {#dev_guide_op_convolutionbackwarddata}
## General
ConvolutionBackwardData operation accepts \f$\diffdst\f$, \weights, and an
optional dst shape as inputs, and computes \f$\diffsrc\f$.
If the `auto_pad` attribute is set to one of `valid`, `same_upper`, or
`same_lower`, the `pads_begin` and `pads_end` attributes will be ignored. The
paddings are calculated with the formulas below.
Let the parameters be:
| Parameter | Depth | Height | Width | Comment
| :--| :-- | :-- | :-- |:--
| Paddings: Front, top, and left | \f$PD_L\f$ | \f$PH_L\f$ | \f$PW_L\f$ | In the attributes we use `pads_begin` to indicate the corresponding vector of paddings |
| Padding: Back, bottom, and right | \f$PD_R\f$ | \f$PH_R\f$ | \f$PW_R\f$ | In the attributes we use `pads_end` to indicate the corresponding vector of paddings |
| Stride | \f$SD\f$ | \f$SH\f$ | \f$SW\f$ | In the attributes we use `strides` to indicate the corresponding vector of strides |
| Dilation | \f$DD\f$ | \f$DH\f$ | \f$DW\f$ | In the attributes we use `dilations` to indicate the corresponding vector of dilations|
First, \f$total\_padding\f$ is calculated according to \f$src\_shape\f$ and
\f$dst\_shape\f$. Let \f$src\_h\f$ be the height dimension of \f$src\_shape\f$
and \f$dst\_h\f$ be the height dimension of \f$dst\_shape\f$.
\f[
total\_padding_h = SH \times (src\_h - 1) + ((KH -1 ) \times DH + 1) - dst\_h + output\_padding_h
\f]
If `auto_pad` attribute is specified as `valid`:
\f[
PD_L = 0 \\
PD_R = 0
\f]
If `auto_pad` attribute is specified as `same_lower`:
\f[
PD_L = floor(total\_padding / 2) \\
PD_R = total\_padding - PD_L
\f]
If `auto_pad` attribute is specified as `same_upper`:
\f[
PD_R = floor(total\_padding / 2) \\
PD_L = total\_padding - PD_R
\f]
where:
- \f$dst\_shape\f$ is either an attribute or an input tensor,
- \f$output\_padding\f$ is an optional attribute.
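For illustration, assume (the values are arbitrary) \f$SH = 2\f$, \f$KH = 3\f$,
\f$DH = 1\f$, \f$src\_h = 4\f$, \f$dst\_h = 8\f$, and
\f$output\_padding_h = 0\f$. Then
\f$total\_padding_h = 2 \times (4 - 1) + ((3 - 1) \times 1 + 1) - 8 + 0 = 1\f$,
and `same_lower` yields \f$PH_L = \lfloor 1 / 2 \rfloor = 0\f$ and
\f$PH_R = 1 - 0 = 1\f$.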
## Operation attributes
Attribute Name | Description | Value Type |Supported Values | Required or Optional
-- | -- | --| --|--
[strides](@ref dnnl::graph::op::attr::strides) | Controls the strides with which the weights tensor is moved when computing convolution. |s64 |A s64 list containing positive values | Required
[pads_begin](@ref dnnl::graph::op::attr::pads_begin) | Controls the number of zeros to be added to the front/top/left of spatial dimensions. The attribute will be ignored when the `auto_pad` attribute is set to `same_upper`, `same_lower` or `valid`.|s64 | A s64 list containing non-negative values | Required
[pads_end](@ref dnnl::graph::op::attr::pads_end) | Controls the number of zeros to be added to the back/bottom/right of spatial dimensions. The attribute will be ignored when the `auto_pad` attribute is set to `same_upper`, `same_lower` or `valid`. |s64 |A s64 list containing non-negative values | Required
[dilations](@ref dnnl::graph::op::attr::dilations) | Controls the amount of stretching the kernel before convolution ([visualization link](https://github.com/vdumoulin/conv_arithmetic/blob/master/README.md#dilated-convolution-animations)). | s64| A s64 list containing positive values (>1 means dilated convolution) | Required
[auto_pad](@ref dnnl::graph::op::attr::auto_pad)| Controls how the padding is calculated.|string | `none` (default), `same_upper`, `same_lower`, `valid` | Optional
[output_padding](@ref dnnl::graph::op::attr::output_padding)| Adds additional amount of padding per each spatial axis in `dst`.|s64 | A s64 list containing non-negative values, all zeros by default | Optional
[groups](@ref dnnl::graph::op::attr::groups) | Controls how input channels and output channels are divided into groups. |s64 |A positive s64 value, `1` by default | Optional
[data_format](@ref dnnl::graph::op::attr::data_format) |Controls how to interpret the shape of `src` and `dst`.| string|`NCX`, `NXC` (default) | Optional
[weights_format](@ref dnnl::graph::op::attr::weights_format) |Controls how to interpret the shape of `weights`.| string|`OIX`, `XIO` (default) | Optional
[dst_shape](@ref dnnl::graph::op::attr::dst_shape) |Denotes the shape of the `dst` tensor.| s64| A s64 list containing positive values| Optional
## Execution arguments
The inputs and outputs must be provided according to the index order below when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0|`diff_dst` | Required
1|`weights` | Required
2|`dst_shape`|Optional
@note
The shape of \weights is
\f$(out\_channels, in\_channels / groups, spatial\_shape)\f$ for `OIX` format or
\f$(spatial\_shape, in\_channels / groups, out\_channels)\f$ for `XIO` format.
Both \f$in\_channels\f$ and \f$out\_channels\f$ must be divisible by *groups*
attribute.
@note Either the `dst_shape` input or the `dst_shape` attribute should be
provided. If both are provided, the `dst_shape` input takes precedence over the
`dst_shape` attribute.
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0|`diff_src` | Required
## Supported data types
ConvolutionBackwardData operation supports the following data type combinations.
Diff_dst | Weights | Diff_src | Dst_shape
--|--|-- | --
f32 | f32 | f32 |s32
bf16 | bf16 | bf16 |s32
f16 | f16 | f16 |s32

View File

@ -0,0 +1,61 @@
# ConvolutionBackwardWeights {#dev_guide_op_convolutionbackwardweights}
## General
ConvolutionBackwardWeights operation accepts \src, \f$\diffdst\f$, and an
optional weights shape as inputs, and computes \f$\diffweights\f$.
## Operation attributes
Attribute Name | Description | Value Type |Supported Values | Required or Optional
-- | -- | --| --|--
[strides](@ref dnnl::graph::op::attr::strides) | Controls the strides with which the weights tensor is moved when computing convolution. |s64 |A s64 list containing positive values | Required
[pads_begin](@ref dnnl::graph::op::attr::pads_begin) | Controls the number of zeros to be added to the front/top/left of spatial dimensions. The attribute will be ignored when the `auto_pad` attribute is set to `same_upper`, `same_lower` or `valid`.|s64 | A s64 list containing non-negative values | Required
[pads_end](@ref dnnl::graph::op::attr::pads_end) | Controls the number of zeros to be added to the back/bottom/right of spatial dimensions. The attribute will be ignored when the `auto_pad` attribute is set to `same_upper`, `same_lower` or `valid`. |s64 |A s64 list containing non-negative values | Required
[dilations](@ref dnnl::graph::op::attr::dilations) | Controls the amount of stretching the kernel before convolution ([visualization link](https://github.com/vdumoulin/conv_arithmetic/blob/master/README.md#dilated-convolution-animations)). | s64| A s64 list containing positive values (>1 means dilated convolution) | Required
[auto_pad](@ref dnnl::graph::op::attr::auto_pad)| Controls how the padding is calculated.|string | `none` (default), `same_upper`, `same_lower`, `valid` | Optional
[groups](@ref dnnl::graph::op::attr::groups) | Controls how input channels and output channels are divided into groups. |s64 |A positive s64 value, `1` by default | Optional
[data_format](@ref dnnl::graph::op::attr::data_format) |Controls how to interpret the shape of `src` and `dst`.| string|`NCX`, `NXC` (default) | Optional
[weights_format](@ref dnnl::graph::op::attr::weights_format) |Controls how to interpret the shape of `weights`.| string|`OIX`, `XIO` (default) | Optional
[weights_shape](@ref dnnl::graph::op::attr::weights_shape) |Denotes the shape of the `weights` tensor.| s64| A s64 list containing positive values| Optional
## Execution arguments
The inputs and outputs must be provided according to the index order below when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0|`src` | Required
1|`diff_dst` | Required
2|`weights_shape`|Optional
@note
The shape of \weights is
\f$(out\_channels, in\_channels / groups, spatial\_shape)\f$ for `OIX` format or
\f$(spatial\_shape, in\_channels / groups, out\_channels)\f$ for `XIO` format.
Both \f$in\_channels\f$ and \f$out\_channels\f$ must be divisible by *groups*
attribute.
@note Either the `weights_shape` input or the `weights_shape` attribute should
be provided. If both are provided, the `weights_shape` input takes precedence
over the `weights_shape` attribute.
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0|`diff_weights` | Required
## Supported data types
ConvolutionBackwardWeights operation supports the following data type
combinations.
Src | Diff_dst | Diff_weights | Weights_shape
--|--|-- | --
f32 | f32 | f32 |s32
bf16 | bf16 | bf16 |s32
f16 | f16 | f16 |s32

View File

@ -0,0 +1,54 @@
# Dequantize {#dev_guide_op_dequantize}
## General
Dequantize operation converts a quantized (u8 or s8) tensor to a f32 tensor. It
supports both per-tensor and per-channel asymmetric linear de-quantization.
Rounding mode is library-implementation defined.
For per-tensor de-quantization:
\f[ \dst_{i} = round((\src_{i} - zps) \times scale) \f]
For per-channel de-quantization, taking channel axis = 1 as an example:
\f[ dst_{\cdots,i,\cdots,\cdots} = (\src_{\cdots,i,\cdots,\cdots} - zps_i) \times scale_i, i \in {[0, ic-1]} \f]
where \f$ic\f$ is the number of channels.
## Operation attributes
Attribute Name | Description | Value Type |Supported Values | Required or Optional
-- | -- | --| --|--
[qtype](@ref dnnl::graph::op::attr::qtype) | Specifies which de-quantization type is used. |string | `per_tensor` (default), `per_channel` | Optional
[axis](@ref dnnl::graph::op::attr::axis) | Specifies dimension on which per-channel de-quantization is applied. |s64 | A s64 value in the range of [-r, r-1] where r = rank(src), `1` by default | Optional
[scales](@ref dnnl::graph::op::attr::scales) | Scaling factors applied to the src data. |f32 | A f32 list (only contains one element if qtype is `per_tensor`) | Required
[zps](@ref dnnl::graph::op::attr::zps) | Offset values that map to float zero. |s64 | A s64 list (only contains one element if qtype is `per_tensor`) | Required
## Execution arguments
The inputs and outputs must be provided according to the index order below when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0|`src` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0|`dst` |Required
## Supported data types
Dequantize operation supports the following data type combinations.
Src | Dst
-- | --
s8, u8 |f32
@note This operation is to support
[int8 quantization](@ref dev_guide_graph_int8_quantization_model) model.
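
For example, a hedged C++ sketch (same API assumptions; ids, shapes, and the
scale/zero-point values are illustrative) of a per-tensor u8-to-f32 Dequantize:

```cpp
#include <string>
#include <vector>
#include "oneapi/dnnl/dnnl_graph.hpp"
using namespace dnnl::graph;

int main() {
    using dt = logical_tensor::data_type;
    using lay = logical_tensor::layout_type;

    // Per-tensor u8 -> f32 de-quantization, scale 0.1 and zero-point 128.
    logical_tensor src {0, dt::u8, {8, 64}, lay::strided};
    logical_tensor dst {1, dt::f32, {8, 64}, lay::strided};

    op deq(2, op::kind::Dequantize, {src}, {dst}, "dequantize0");
    deq.set_attr<std::string>(op::attr::qtype, "per_tensor");
    deq.set_attr<std::vector<float>>(op::attr::scales, {0.1f});
    deq.set_attr<std::vector<int64_t>>(op::attr::zps, {128});

    graph g(dnnl::engine::kind::cpu);
    g.add_op(deq);
    return 0;
}
```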

View File

@ -0,0 +1,50 @@
# Divide {#dev_guide_op_divide}
## General
Divide operation performs element-wise division of two given tensors, applying
multi-directional broadcast rules.
\f[
\dst(\overline{x}) =
\src_0(\overline{x}) \mathbin{/} \src_1(\overline{x}),
\f]
## Operation attributes
Attribute Name | Description | Value Type |Supported Values | Required or Optional
-- | -- | --| --|--
[auto_broadcast](@ref dnnl::graph::op::attr::auto_broadcast) | Specifies rules used for auto-broadcasting of src tensors. |string |`none`,`numpy` (default) | Optional
## Execution arguments
The inputs and outputs must be provided according to the index order below when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
----- | ------------- | --------------------
0 | `src_0` | Required
1 | `src_1` | Required
@note Both src shapes should match and no auto-broadcasting is allowed if the
`auto_broadcast` attribute is `none`. `src_0` and `src_1` shapes can be
different and auto-broadcasting is allowed if the `auto_broadcast` attribute is
`numpy`. Broadcasting is performed according to the `auto_broadcast` value.
### Outputs
Index | Argument Name | Required or Optional
----- | ------------- | --------------------
0 | `dst` | Required
## Supported data types
Divide operation supports the following data type combinations.
Src_0 / Src_1 | Dst
---- | -------
f32 | f32
bf16 | bf16
f16 | f16

View File

@ -0,0 +1,62 @@
# DynamicDequantize {#dev_guide_op_dynamicdequantize}
## General
DynamicDequantize operation converts a quantized (s8 or u8) tensor to a f32
tensor. It supports both per-tensor and per-channel asymmetric linear
de-quantization. Rounding mode is library-implementation defined. Unlike the
@ref dev_guide_op_dequantize, DynamicDequantize takes scales and zero-points as
operator src tensors.
For per-tensor de-quantization
\f[ dst = (src - zps)*scales \f]
For per-channel de-quantization, taking channel axis = 1 as an example:
\f[ {dst}_{\cdots,i,\cdots,\cdots} = (src_{\cdots,i,\cdots,\cdots} - zps_i)*scales_i,i\in [0,channelNum-1] \f]
## Operation attributes
Attribute Name | Description | Value Type |Supported Values | Required or Optional
-- | -- | --| --|--
[qtype](@ref dnnl::graph::op::attr::qtype) | Specifies which de-quantization type is used. |string | `per_tensor` (default), `per_channel` | Optional
[axis](@ref dnnl::graph::op::attr::axis) | Specifies dimension on which per-channel de-quantization is applied. |s64 | A s64 value in the range of [-r, r-1] where r = rank(src), `1` by default. Negative value means counting the dimension backwards from the end. | Optional
## Execution arguments
The inputs and outputs must be provided according to the index order below when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
----- | ------------- | --------------------
0 | `src` | Required
1 | `scales` | Required
2 | `zps` | Optional
@note `scales` is an f32 1D tensor to be applied to the de-quantization
formula. For `qtype` = `per_tensor`, there should be only one element in the
scales tensor. For `qtype` = `per_channel`, the element number should be equal
to the element number of the src tensor along the dimension axis.
@note `zps` is a 1D tensor with offset values that map to zero. For `qtype` =
`per_tensor`, there should be only one element in the zps tensor. For `qtype` =
`per_channel`, the element number should be equal to the element number of the
input tensor along the dimension axis. If not specified, the library can assume
that the de-quantization is symmetric and perform kernel optimization
accordingly.
### Outputs
Index | Argument Name | Required or Optional
----- | ------------- | --------------------
0 | `dst` | Required
## Supported data types
DynamicDequantize operation supports the following data type combinations.
Src | Dst| Scales |Zps
---- | ------- | ---|--
s8 | f32 | f32 |s8, u8, s32
u8 | f32 | f32 |s8, u8, s32

View File

@ -0,0 +1,62 @@
# DynamicQuantize {#dev_guide_op_dynamicquantize}
## General
DynamicQuantize operation converts a f32 tensor to a quantized (s8 or u8)
tensor. It supports both per-tensor and per-channel asymmetric linear
quantization. The target quantized data type is specified via the data type of
dst logical tensor. Rounding mode is library-implementation defined.
For per-tensor quantization
\f[ dst = round(src/scales + zps) \f]
For per-channel quantization, taking channel axis = 1 as an example:
\f[ {dst}_{\cdots,i,\cdots,\cdots} =
round(src_{\cdots,i,\cdots,\cdots}/scales_i + zps_i),i\in [0,channelNum-1] \f]
## Operation attributes
Attribute Name | Description | Value Type |Supported Values | Required or Optional
-- | -- | --| --|--
[qtype](@ref dnnl::graph::op::attr::qtype) | Specifies which quantization type is used. |string | `per_tensor` (default), `per_channel` | Optional
[axis](@ref dnnl::graph::op::attr::axis) | Specifies dimension on which per-channel quantization is applied. |s64 | A s64 value in the range of [-r, r-1] where r = rank(src), `1` by default. Negative value means counting the dimension backwards from the end. | Optional
## Execution arguments
The inputs and outputs must be provided according to the index order below when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
----- | ------------- | --------------------
0 | `src` | Required
1 | `scales` | Required
2 | `zps` | Optional
@note `scales` is an f32 1D tensor to be applied to the quantization formula.
For `qtype` = `per_tensor`, there should be only one element in the scales
tensor. For `qtype` = `per_channel`, the element number should be equal to the
element number of the src tensor along the dimension axis.
@note `zps` is a 1D tensor with offset values that map to zero. For `qtype` =
`per_tensor`, there should be only one element in the zps tensor. For `qtype` =
`per_channel`, the element number should be equal to the element number of the
input tensor along the dimension axis. If not specified, the library can assume
that the quantization is symmetric and perform kernel optimization accordingly.
### Outputs
Index | Argument Name | Required or Optional
----- | ------------- | --------------------
0 | `dst` | Required
## Supported data types
DynamicQuantize operation supports the following data type combinations.
Src |Scales | Zps | Dst
---- | ------- | ---|--
f32 |f32 | s8, u8, s32 | s8
f32 |f32 | s8, u8, s32 | u8
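
Unlike the static Quantize/Dequantize ops, the scales and zero-points here are
runtime tensors. A hedged C++ sketch (same API assumptions; ids and shapes
illustrative) of a per-tensor DynamicQuantize:

```cpp
#include <string>
#include "oneapi/dnnl/dnnl_graph.hpp"
using namespace dnnl::graph;

int main() {
    using dt = logical_tensor::data_type;
    using lay = logical_tensor::layout_type;

    // Per-tensor quantization: `scales` and `zps` are single-element runtime
    // tensors; the target type comes from the data type of `dst` (u8 here).
    logical_tensor src    {0, dt::f32, {8, 64}, lay::strided};
    logical_tensor scales {1, dt::f32, {1}, lay::strided};
    logical_tensor zps    {2, dt::s32, {1}, lay::strided};
    logical_tensor dst    {3, dt::u8, {8, 64}, lay::strided};

    op dynq(4, op::kind::DynamicQuantize, {src, scales, zps}, {dst}, "dynq0");
    dynq.set_attr<std::string>(op::attr::qtype, "per_tensor");

    graph g(dnnl::engine::kind::cpu);
    g.add_op(dynq);
    return 0;
}
```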

View File

@ -0,0 +1,42 @@
# Elu {#dev_guide_op_elu}
## General
Elu operation applies the following formula to every element of the \src tensor
(the variable names follow the standard @ref dev_guide_conventions):
\f[ dst = \begin{cases} \alpha(e^{src} - 1) & \text{if}\ src < 0 \\
src & \text{if}\ src \ge 0 \end{cases} \f]
## Operation attributes
Attribute Name | Description | Value Type | Supported Values | Required or Optional
-- | -- | -- | -- | --
[alpha](@ref dnnl::graph::op::attr::alpha) | Scale for the negative factor. | f32 | Arbitrary non-negative f32 value | Required
## Execution arguments
The inputs and outputs must be provided according to the index order below when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `src` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `dst` |Required
## Supported data types
Elu operation supports the following data type combinations.
Src | Dst
-- | --
f32 | f32
bf16 | bf16
f16 | f16

View File

@ -0,0 +1,40 @@
# EluBackward {#dev_guide_op_elubackward}
## General
EluBackward operation computes the gradient of the Elu operation.
## Operation attributes
Attribute Name | Description | Value Type | Supported Values | Required or Optional
-- | -- | -- | -- | --
[alpha](@ref dnnl::graph::op::attr::alpha) | Scale for the negative factor. | f32 | Arbitrary non-negative f32 value | Required
[use_dst](@ref dnnl::graph::op::attr::use_dst) | If true, use `dst` of the Elu operation to calculate the gradient. Otherwise, use `src`. | bool | `true` (default), `false` | Optional
## Execution arguments
The inputs and outputs must be provided according to the index order below when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `src` | Required
1 | `diff_dst` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `diff_src` | Required
## Supported data types
EluBackward operation supports the following data type combinations.
Src | Diff_dst | Diff_src
-- | -- | --
f32 | f32 | f32
f16 | f16 | f16
bf16 | bf16 | bf16

View File

@ -0,0 +1,35 @@
# End {#dev_guide_op_end}
## General
End operation is used to help construct the graph, for example, to track the
uses of a tensor.
## Operation attributes
End operation does not support any attribute.
## Execution arguments
The inputs and outputs must be provided according to the index order below when
constructing an operation.
### Inputs
| Index | Argument Name | Required or Optional |
| ----- | ------------- | -------------------- |
| 0 | `src` | Required |
### Outputs
End operation does not have any output tensors.
## Supported data types
End operation supports the following data types.
| Src |
| ---- |
| f32 |
| bf16 |
| f16 |
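A minimal sketch of one intended usage, assuming a preceding ReLU op (the ids and shapes are illustrative): the End op consumes the ReLU result to record an additional use of that tensor when constructing the graph.
```cpp
#include "oneapi/dnnl/dnnl_graph.hpp"

using namespace dnnl::graph;

int main() {
    using dt = logical_tensor::data_type;
    using lt = logical_tensor::layout_type;

    logical_tensor src {0, dt::f32, {8, 16}, lt::strided};
    logical_tensor dst {1, dt::f32, {8, 16}, lt::strided};

    // ReLU produces an intermediate result ...
    op relu {2, op::kind::ReLU, "relu"};
    relu.add_input(src);
    relu.add_output(dst);

    // ... and End tracks an extra use of that result; it takes one input
    // and produces no output tensor.
    op end {3, op::kind::End, "end"};
    end.add_input(dst);

    graph g {dnnl::engine::kind::cpu};
    g.add_op(relu);
    g.add_op(end);
    return 0;
}
```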

View File

@ -0,0 +1,39 @@
# Exp {#dev_guide_op_exp}
## General
Exp operation is an exponential element-wise activation function. It applies
the following formula on every element of \src tensor (the variable names
follow the standard @ref dev_guide_conventions):
\f[ dst = e^{src} \f]
## Operation attributes
Exp operation does not support any attribute.
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `src` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `dst` | Required
## Supported data types
Exp operation supports the following data type combinations.
Src | Dst
-- | --
f32 | f32
f16 | f16
bf16 | bf16

View File

@ -0,0 +1,38 @@
# GELU {#dev_guide_op_gelu}
## General
GELU operation applies the following formula on every element of \src tensor
(the variable names follow the standard @ref dev_guide_conventions):
\f[ dst = 0.5 \cdot src \cdot (1.0 + erf(\frac{src}{\sqrt{2}})) \f]
## Operation attributes
GELU operation does not support any attribute.
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `src` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `dst` |Required
## Supported data types
GELU operation supports the following data type combinations.
Src | Dst
-- | --
f32 | f32
bf16 | bf16
f16 | f16

View File

@ -0,0 +1,37 @@
# GELUBackward {#dev_guide_op_gelubackward}
## General
GELUBackward operation computes the gradient for GELU.
## Operation attributes
GELUBackward operation does not support any attribute.
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `src` | Required
1 | `diff_dst` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `diff_src` | Required
## Supported data types
GELUBackward operation supports the following data type combinations.
Src | Diff_dst | Diff_src
-- | -- | --
f32 | f32 | f32
f16 | f16 | f16
bf16 | bf16 | bf16

View File

@ -0,0 +1,39 @@
# HardSwish {#dev_guide_op_hardswish}
## General
HardSwish operation applies the following formula on every element of \src
tensor (the variable names follow the standard @ref dev_guide_conventions):
\f[ dst = src * \frac{\min(\max(src + 3, 0), 6)}{6} \f]
## Operation attributes
HardSwish operation does not support any attribute.
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `src` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `dst` |Required
## Supported data types
HardSwish operation supports the following data type combinations.
Src | Dst
-- | --
f32 | f32
bf16 | bf16
f16 | f16

View File

@ -0,0 +1,37 @@
# HardSwishBackward {#dev_guide_op_hardswishbackward}
## General
HardSwishBackward operation computes the gradient for HardSwish.
## Operation attributes
HardSwishBackward operation does not support any attribute.
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `src` | Required
1 | `diff_dst` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `diff_src` | Required
## Supported data types
HardSwishBackward operation supports the following data type combinations.
Src | Diff_dst | Diff_src
-- | -- | --
f32 | f32 | f32
f16 | f16 | f16
bf16 | bf16 | bf16

View File

@ -0,0 +1,64 @@
# Interpolate {#dev_guide_op_interpolate}
## General
Interpolate operation performs interpolation on the \src tensor along its spatial dimensions.
## Operation attributes
Attribute Name | Description | Value Type |Supported Values | Required or Optional
-- | -- | --| -- |--
[mode](@ref dnnl::graph::op::attr::mode) | Specifies type of interpolation. |string |`nearest`, `linear`, `bilinear`, `trilinear` | Required
[coordinate_transformation_mode](@ref dnnl::graph::op::attr::coordinate_transformation_mode) | Specifies how to transform the coordinate in the resized tensor to the coordinate in the original tensor. |string | `half_pixel`(default), `align_corners` | Optional
[sizes](@ref dnnl::graph::op::attr::sizes) | Specifies dst shape for spatial axes. |s64 |A s64 list containing positive values, `none` is default | Optional
[scales](@ref dnnl::graph::op::attr::scales) | Specifies `scales` for spatial axes. | f32 | A f32 list, `none` is default | Optional
[data_format](@ref dnnl::graph::op::attr::data_format) | Controls how to interpret the shape of `src` and `dst`. |string | `NCX`, `NXC` (default) | Optional
@note Either `sizes` or `scales` should be provided. When `sizes` is
used, `scales` will be ignored.
@note
The attribute `coordinate_transformation_mode` is the name of transformation
mode in string format.\n
Here `scale[x]` is `dst_shape[x] / src_shape[x]` and `x_resized` is a
coordinate in axis `x`, for any spatial axis `x` of the src tensor.\n
For `half_pixel`: the coordinate in the original tensor axis `x` is
calculated as `((x_resized + 0.5) / scale[x]) - 0.5`.\n
For `align_corners`: the coordinate in the original tensor axis `x` is
calculated as 0 if `dst_shape[x] == 1` else `x_resized * (src_shape[x] - 1)
/ (dst_shape[x] - 1)`.
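For example, assuming `src_shape[x] = 4` and `dst_shape[x] = 8` (so
`scale[x] = 2`), the resized coordinate `x_resized = 3` maps back to
`(3 + 0.5) / 2 - 0.5 = 1.25` under `half_pixel`, and to
`3 * (4 - 1) / (8 - 1) = 9 / 7 ≈ 1.29` under `align_corners`.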
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0|`src` | Required
1|`sizes` | Optional
@note `sizes` is a 1D tensor describing output shape for spatial axes. It is a
non-differentiable tensor.
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0|`dst` | Required
@note The shape of dst matches the shape of src except for the spatial
dimensions, which should match the values from the `sizes` input or be
calculated from the `scales` attribute.
## Supported data types
Interpolate operation supports the following data type combinations.
Src/Dst | Sizes
-- |--
f32 | s32
bf16 | s32
f16 | s32

View File

@ -0,0 +1,65 @@
# InterpolateBackward {#dev_guide_op_interpolatebackward}
## General
InterpolateBackward computes the gradients of the Interpolate operation.
## Operation attributes
Attribute Name | Description | Value Type |Supported Values | Required or Optional
-- | -- | --| --|--
[mode](@ref dnnl::graph::op::attr::mode) | Specifies type of interpolation. |string |`nearest`, `linear`, `bilinear`, `trilinear` | Required
[coordinate_transformation_mode](@ref dnnl::graph::op::attr::coordinate_transformation_mode) | Specifies how to transform the coordinate in the resized tensor to the coordinate in the original tensor. |string | `half_pixel` (default), `align_corners` | Optional
[sizes](@ref dnnl::graph::op::attr::sizes) | Specifies dst shape for spatial axes. |s64 |A s64 list containing positive values, `none` is default | Optional
[scales](@ref dnnl::graph::op::attr::scales) | Specifies `scales` for spatial axes. | f32 | A f32 list, `none` is default | Optional
[data_format](@ref dnnl::graph::op::attr::data_format) | Controls how to interpret the shape of `src` and `dst`. |string | `NCX`, `NXC` (default) | Optional
@note Either `sizes` or `scales` should be provided. When `sizes` is
used, `scales` will be ignored.
@note
The attribute `coordinate_transformation_mode` is the name of transformation
mode in string format.\n
Here `scale[x]` is `dst_shape[x] / src_shape[x]` and `x_resized` is a
coordinate in axis `x`, for any spatial axis `x` of the src tensor.\n
For `half_pixel`: the coordinate in the original tensor axis `x` is
calculated as `((x_resized + 0.5) / scale[x]) - 0.5`.\n
For `align_corners`: the coordinate in the original tensor axis `x` is
calculated as 0 if `dst_shape[x] == 1` else `x_resized * (src_shape[x] - 1)
/ (dst_shape[x] - 1)`.\n
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0|`src` | Required
1|`diff_dst` | Required
2|`sizes` | Optional
@note
`src` is the original input tensor of the Interpolate op.\n
`diff_dst` is the gradient tensor with respect to the dst.\n
`sizes` is a 1D tensor describing the output shape for spatial axes.
### Outputs
Index| Argument Name | Required or Optional
-- | -- | --
0 |`diff_src` | Required
@note `diff_src` is the gradient tensor with respect to the src of Interpolate.
## Supported data types
InterpolateBackward operation supports the following data type combinations.
Src/Diff_dst/Diff_src | Sizes
-- |--
f32 | s32
bf16 | s32
f16 | s32

View File

@ -0,0 +1,78 @@
# LayerNorm {#dev_guide_op_layernorm}
## General
LayerNorm performs a layer normalization operation on the \src tensor.
The LayerNorm operation performs normalization from `begin_norm_axis` to the
last dimension of the data tensor. It is defined by the following formulas,
which are the same as @ref dev_guide_layer_normalization.
\f[
\dst(t, n, c) =
\gamma(c) \cdot
\frac{\src(t, n, c) - \mu(t, n)} {\sqrt{\sigma^2(t, n) + \epsilon}}
+ \beta(c),
\f]
where
- \f$\gamma(c), \beta(c)\f$ are optional scale and shift for a channel,
- \f$\mu(t, n), \sigma^2(t, n)\f$ are mean and variance (see below),
- \f$\epsilon\f$ is a constant to improve numerical stability.
Mean and variance are computed at runtime or provided by a user. When mean and
variance are computed at runtime, the following formulas are used:
- \f$\mu(t, n) = \frac{1}{C} \sum\limits_{c} \src(t, n, c)\f$,
- \f$\sigma^2(t, n) = \frac{1}{C} \sum\limits_{c} (\src(t, n, c) - \mu(t, n))^2\f$.
## Operation attributes
Attribute Name | Description | Value Type |Supported Values | Required or Optional
-- | -- | --| --|--
[keep_stats](@ref dnnl::graph::op::attr::keep_stats) | Indicate whether to output mean and variance which can be later passed to backward op. |bool |`false`,`true` (default) | Optional
[begin_norm_axis](@ref dnnl::graph::op::attr::begin_norm_axis) | `begin_norm_axis` is used to indicate which axis to start layer normalization from. The normalization is from `begin_norm_axis` to the last dimension. Negative values mean indexing from right to left. This op normalizes over the last dimension by default, e.g. C in TNC for 3D and LDNC for 4D. |s64 |[-r, r-1], where r = rank(src). `-1` is default | Optional
[use_affine](@ref dnnl::graph::op::attr::use_affine) | When set to True, this operation has learnable per-element affine parameters. |bool |`false`, `true` (default) | Optional
[epsilon](@ref dnnl::graph::op::attr::epsilon) | The constant to improve numerical stability. |f32 |Arbitrary positive f32 value, `1e-5`(default) | Optional
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
----- | ------------- | --------------------
0 | `src` | Required
1 | `gamma` | Optional
2 | `beta` | Optional
@note `gamma` is the scaling for the normalized value. `beta` is the bias added
to the scaled normalized value. They are both 1D tensors with the same size as
the channel axis of src, and they are required if the attribute `use_affine` is
set to True.
### Outputs
Index | Argument Name | Required or Optional
----- | ------------- | --------------------
0 | `dst` | Required
1 | `mean` | Optional
2 | `variance` | Optional
@note Both `mean` and `variance` are required if attribute `keep_stats` is set to
True.
## Supported data types
LayerNorm operation supports the following data type combinations.
Src / Dst | Gamma / Beta / Mean / Variance
-- |--
f32 | f32
bf16 | f32, bf16
f16 | f32
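To show how `keep_stats` changes the output list, here is a minimal sketch with the C++ graph API; the TNC shapes, ids, and attribute values are assumptions for the example.
```cpp
#include <cstdint>
#include "oneapi/dnnl/dnnl_graph.hpp"

using namespace dnnl::graph;

int main() {
    using dt = logical_tensor::data_type;
    using lt = logical_tensor::layout_type;

    // src is TNC; normalization runs over the last dimension (C = 8).
    logical_tensor src {0, dt::f32, {2, 4, 8}, lt::strided};
    logical_tensor gamma {1, dt::f32, {8}, lt::strided};
    logical_tensor beta {2, dt::f32, {8}, lt::strided};
    logical_tensor dst {3, dt::f32, {2, 4, 8}, lt::strided};
    // keep_stats = true adds mean and variance outputs of shape {T, N}.
    logical_tensor mean {4, dt::f32, {2, 4}, lt::strided};
    logical_tensor var {5, dt::f32, {2, 4}, lt::strided};

    op ln {6, op::kind::LayerNorm, "layernorm"};
    ln.set_attr<bool>(op::attr::keep_stats, true);
    ln.set_attr<int64_t>(op::attr::begin_norm_axis, -1);
    ln.set_attr<bool>(op::attr::use_affine, true);
    ln.set_attr<float>(op::attr::epsilon, 1e-5f);
    ln.add_inputs({src, gamma, beta});
    ln.add_outputs({dst, mean, var});

    graph g {dnnl::engine::kind::cpu};
    g.add_op(ln);
    return 0;
}
```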

View File

@ -0,0 +1,61 @@
# LayerNormBackward {#dev_guide_op_layernormbackward}
## General
LayerNormBackward performs the backward computation of the LayerNorm
operation.
The backward propagation computes
\f$\diffsrc(t, n, c)\f$,
\f$\diffgamma(c)^*\f$, and \f$\diffbeta(c)^*\f$
based on
\f$\diffdst(t, n, c)\f$, \f$\src(t, n, c)\f$, \f$\mu(t, n)\f$,
\f$\sigma^2(t, n)\f$, \f$\gamma(c)^*\f$, and \f$\beta(c)^*\f$.
The tensors marked with an asterisk are used only when the operation is
configured to use \f$\gamma(c)\f$ and \f$\beta(c)\f$.
## Operation attributes
Attribute Name | Description | Value Type |Supported Values | Required or Optional
-- | -- | --| --|--
[begin_norm_axis](@ref dnnl::graph::op::attr::begin_norm_axis) | `begin_norm_axis` is used to indicate which axis to start layer normalization from. The normalization is from `begin_norm_axis` to the last dimension. Negative values mean indexing from right to left. This op normalizes over the last dimension by default, e.g. C in TNC for 3D and LDNC for 4D. |s64 |[-r, r-1], where r = rank(src). `-1` is default | Optional
[use_affine](@ref dnnl::graph::op::attr::use_affine) | When set to True, this module has learnable per-element affine parameters. |bool |`false`,`true` (default) | Optional
[epsilon](@ref dnnl::graph::op::attr::epsilon) | The constant to improve numerical stability. |f32 |Arbitrary positive f32 value, 1e-5 (default) | Optional
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
----- | ------------- | --------------------
0 | `src` | Required
1 | `diff_dst` | Required
2 | `mean` | Required
3 | `variance` | Required
4 | `gamma` | Optional
5 | `beta` | Optional
@note `gamma` is the scaling for the normalized value. `beta` is the bias added
to the scaled normalized value. They are both 1D tensors with the same size as
the channel axis of src, and they are required if the attribute `use_affine` is
set to True.
### Outputs
Index | Argument Name | Required or Optional
----- | ------------- | --------------------
0 | `diff_src` | Required
1 | `diff_gamma` | Optional
2 | `diff_beta` | Optional
## Supported data types
LayerNormBackward operation supports the following data type combinations.
Src / Diff_dst / Diff_src | Gamma / Beta / Mean / Variance / Diff_gamma / Diff_beta
-- |--
f32 | f32
bf16 | f32, bf16
f16 | f32

View File

@ -0,0 +1,50 @@
# LeakyReLU {#dev_guide_op_leakyrelu}
## General
LeakyReLU operation is a type of activation function based on ReLU. It has a
small slope for negative values with which LeakyReLU can produce small,
non-zero, and constant gradients with respect to the negative values. The slope
is also called the coefficient of leakage.
Unlike @ref dev_guide_op_prelu, the coefficient \f$\alpha\f$ is constant and
defined before training.
LeakyReLU operation applies the following formula on every element of \src
tensor (the variable names follow the standard @ref dev_guide_conventions):
\f[ dst = \begin{cases} src & \text{if}\ src \ge 0 \\
\alpha src & \text{if}\ src < 0 \end{cases} \f]
## Operation attributes
Attribute Name | Description | Value Type | Supported Values | Required or Optional
-- | -- | -- | -- | --
[alpha](@ref dnnl::graph::op::attr::alpha) | Alpha is the coefficient of leakage. | f32 | Arbitrary f32 value but usually a small positive value. | Required
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `src` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `dst` |Required
## Supported data types
LeakyReLU operation supports the following data type combinations.
Src | Dst
-- | --
f32 | f32
bf16 | bf16
f16 | f16

View File

@ -0,0 +1,40 @@
# Log {#dev_guide_op_log}
## General
Log operation performs an element-wise natural logarithm on a given tensor. It
applies the following formula on every element of \src tensor (the variable
names follow the standard @ref dev_guide_conventions):
\f[ dst = \log(src) \f]
## Operation attributes
Log operation does not support any attribute.
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `src` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `dst` | Required
## Supported data types
Log operation supports the following data type combinations.
Src | Dst
-- | --
f32 | f32
f16 | f16
bf16 | bf16

View File

@ -0,0 +1,40 @@
# LogSoftmax {#dev_guide_op_logsoftmax}
## General
LogSoftmax operation applies the \f$ \log(softmax(src)) \f$ function to an
n-dimensional input tensor. The formulation can be simplified as:
\f[ dst_i = \log\Big( \frac{\exp(src_i)}{\sum_{j} \exp(src_j)} \Big) \f]
## Operation attributes
Attribute Name | Description | Value Type | Supported Values | Required or Optional
-- | -- | -- | -- | --
[axis](@ref dnnl::graph::op::attr::axis) | Represents the axis along which the LogSoftmax is calculated. Negative values mean counting dimensions from the back. | s64 | Arbitrary s64 value (`-1` by default) | Optional
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `src` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `dst` |Required
## Supported data types
LogSoftmax operation supports the following data type combinations.
Src | Dst
-- | --
f32 | f32
bf16 | bf16
f16 | f16

View File

@ -0,0 +1,39 @@
# LogSoftmaxBackward {#dev_guide_op_logsoftmaxbackward}
## General
LogSoftmaxBackward operation computes the gradient for LogSoftmax.
## Operation attributes
Attribute Name | Description | Value Type | Supported Values | Required or Optional
-- | -- | -- | -- | --
[axis](@ref dnnl::graph::op::attr::axis) | Represents the axis along which the LogSoftmax is calculated. Negative values mean counting dimensions from the back. | s64 | Arbitrary s64 value (`-1` by default) | Optional
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `diff_dst` | Required
1 | `src` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `diff_src` | Required
## Supported data types
LogSoftmaxBackward operation supports the following data type combinations.
Src | Diff_dst | Diff_src
-- | -- | --
f32 | f32 | f32
bf16 | bf16 | bf16
f16 | f16 | f16

View File

@ -0,0 +1,67 @@
# MatMul {#dev_guide_op_matmul}
## General
MatMul operation computes the product of two tensors with an optional bias
addition. The variable names follow the standard @ref dev_guide_conventions.
Taking 2D input tensors as an example, the formula is:
\f[
\dst(m, n) =
\sum_{k=0}^{K - 1} \left(
\src(m, k) \cdot \weights(k, n)
\right) +
\bias(m, n)
\f]
In the shape of a tensor, the two right-most axes are interpreted as the row
and column dimensions of a matrix, while all the preceding axes (if present)
are interpreted as batch dimensions. The operation supports broadcasting
semantics for those batch dimensions. For example, \src can be broadcast to
\weights if the corresponding dimension in \src is `1` (and vice versa).
Additionally, if the ranks of \src and \weights are different, the tensor with
the smaller rank will be *unsqueezed* from the left side of its dimensions
(by inserting `1`) to make the two ranks match.
## Operation attributes
Attribute Name | Description | Value Type |Supported Values | Required or Optional
-- | -- | --| --|--
[transpose_a](@ref dnnl::graph::op::attr::transpose_a) | Controls whether to transpose the last two dimensions of `src`. |bool | True, False (default) | Optional
[transpose_b](@ref dnnl::graph::op::attr::transpose_b) | Controls whether to transpose the last two dimensions of `weights`. |bool | True, False (default) | Optional
The transpose attributes above have no effect when the rank of an input tensor
is less than 2. In the library implementation, a 1D tensor is unsqueezed first
before compilation, and the rule is applied to each input independently (see
the sketch after the following rules):
- For \src tensor, the rule is defined like: `[d] -> [1, d]`.
- For \weights tensor, the rule is defined like: `[d] -> [d, 1]`.
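A minimal sketch (shapes and ids are illustrative assumptions) showing the rank-matching rule and `transpose_b` together:
```cpp
#include "oneapi/dnnl/dnnl_graph.hpp"

using namespace dnnl::graph;

int main() {
    using dt = logical_tensor::data_type;
    using lt = logical_tensor::layout_type;

    // weights has rank 2 while src has rank 3, so weights is unsqueezed
    // on the left to {1, 64, 32}; transpose_b then swaps its last two
    // dimensions, giving an effective {1, 32, 64} that broadcasts over
    // the batch of 8. The expected dst shape is therefore {8, 16, 64}.
    logical_tensor src {0, dt::f32, {8, 16, 32}, lt::strided};
    logical_tensor wei {1, dt::f32, {64, 32}, lt::strided};
    logical_tensor dst {2, dt::f32, {8, 16, 64}, lt::strided};

    op mm {3, op::kind::MatMul, "matmul"};
    mm.set_attr<bool>(op::attr::transpose_b, true);
    mm.add_inputs({src, wei});
    mm.add_output(dst);

    graph g {dnnl::engine::kind::cpu};
    g.add_op(mm);
    return 0;
}
```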
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0|`src` | Required
1|`weights` | Required
2|`bias` | Optional
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0|`dst` |Required
## Supported data types
MatMul operation supports the following data type combinations.
Src | Weights | Bias | Dst
--|--|-- | --
f32 | f32 | f32 |f32
bf16 | bf16 | bf16 |bf16
f16 | f16 | f16 |f16

View File

@ -0,0 +1,54 @@
# MaxPool {#dev_guide_op_maxpool}
## General
MaxPool operation performs the computation following the formula below.
Variable names follow the standard @ref dev_guide_conventions.
\f[
\dst(n, c, oh, ow) =
\max\limits_{kh, kw}
\left(
\src(n, c, oh \cdot SH + kh \cdot (DH + 1) - PH_L, ow \cdot SW + kw \cdot (DW + 1) - PW_L)
\right)
\f]
## Operation attributes
Attribute Name | Description | Value Type |Supported Values | Required or Optional
-- | -- | --| --|--
[strides](@ref dnnl::graph::op::attr::strides) | Controls the strides with which the window is moved. |s64 |A s64 list containing positive values | Required
[pads_begin](@ref dnnl::graph::op::attr::pads_begin) | Controls the number of zeros to be added to the front/top/left of spatial dimensions. The attribute will be ignored when the `auto_pad` attribute is specified as `same_upper`, `same_lower`, or `valid`. |s64 | A s64 list containing non-negative values | Required
[pads_end](@ref dnnl::graph::op::attr::pads_end) | Controls the number of zeros to be added to the back/bottom/right of spatial dimensions. The attribute will be ignored when the `auto_pad` attribute is specified as `same_upper`, `same_lower`, or `valid`. |s64 |A s64 list containing non-negative values | Required
[kernel](@ref dnnl::graph::op::attr::kernel) | Size of pooling window. | s64| A s64 list containing positive values | Required
[rounding_type](@ref dnnl::graph::op::attr::rounding_type) | Controls how to do rounding. |string | `floor` (default), `ceil` | Optional
[auto_pad](@ref dnnl::graph::op::attr::auto_pad) |Controls how the paddings are calculated.| string | `none` (default), `same_upper`, `same_lower`, `valid` | Optional
[dilations](@ref dnnl::graph::op::attr::dilations) |Denotes the distance in width and height between elements in the window. |s64 | A s64 list containing positive values, a list of `1`s (default) means no dilation| Optional
[data_format](@ref dnnl::graph::op::attr::data_format) |Controls how to interpret the shape of `src` and `dst`.| string|`NCX`, `NXC` (default) | Optional
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0|`src` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0|`dst` | Required
## Supported data types
MaxPool operation supports the following data type combinations.
Src | Dst
-- | --
f32 |f32
bf16 |bf16
f16 |f16
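For illustration, here is a minimal sketch of a MaxPool op in the C++ graph API; the shapes and attribute values are assumptions chosen so the output spatial size works out to 56.
```cpp
#include <cstdint>
#include <string>
#include <vector>
#include "oneapi/dnnl/dnnl_graph.hpp"

using namespace dnnl::graph;

int main() {
    using dt = logical_tensor::data_type;
    using lt = logical_tensor::layout_type;

    // NXC layout: src is {N, H, W, C}. With a 3x3 window, stride 2, and
    // symmetric padding of 1, each spatial output size is
    // floor((112 + 1 + 1 - 3) / 2) + 1 = 56.
    logical_tensor src {0, dt::f32, {1, 112, 112, 64}, lt::strided};
    logical_tensor dst {1, dt::f32, {1, 56, 56, 64}, lt::strided};

    op pool {2, op::kind::MaxPool, "maxpool"};
    pool.set_attr<std::vector<int64_t>>(op::attr::kernel, {3, 3});
    pool.set_attr<std::vector<int64_t>>(op::attr::strides, {2, 2});
    pool.set_attr<std::vector<int64_t>>(op::attr::pads_begin, {1, 1});
    pool.set_attr<std::vector<int64_t>>(op::attr::pads_end, {1, 1});
    pool.set_attr<std::string>(op::attr::data_format, "NXC");
    pool.add_input(src);
    pool.add_output(dst);

    graph g {dnnl::engine::kind::cpu};
    g.add_op(pool);
    return 0;
}
```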

View File

@ -0,0 +1,46 @@
# MaxPoolBackward {#dev_guide_op_maxpoolbackward}
## General
MaxPoolBackward operation accepts \src tensor and \f$\diffdst\f$ tensor, and
calculates \f$\diffsrc\f$ tensor.
## Operation attributes
Attribute Name | Description | Value Type |Supported Values | Required or Optional
-- | -- | --| --|--
[strides](@ref dnnl::graph::op::attr::strides) | Controls the strides with which the window is moved. |s64 |A s64 list containing positive values | Required
[pads_begin](@ref dnnl::graph::op::attr::pads_begin) | Controls the number of zeros to be added to the front/top/left of spatial dimensions. The attribute will be ignored when the `auto_pad` attribute is specified as `same_upper`, `same_lower`, or `valid`. |s64 | A s64 list containing non-negative values | Required
[pads_end](@ref dnnl::graph::op::attr::pads_end) | Controls the number of zeros to be added to the back/bottom/right of spatial dimensions. The attribute will be ignored when the `auto_pad` attribute is specified as `same_upper`, `same_lower`, or `valid`. |s64 |A s64 list containing non-negative values | Required
[kernel](@ref dnnl::graph::op::attr::kernel) | Size of pooling window. | s64| A s64 list containing positive values | Required
[auto_pad](@ref dnnl::graph::op::attr::auto_pad) |Controls how the paddings are calculated.| string | `none` (default), `same_upper`, `same_lower`, `valid` | Optional
[dilations](@ref dnnl::graph::op::attr::dilations) |Denotes the distance in width and height between elements in the window.|s64 | A s64 list containing positive values, a list of `1`s (default) means no dilation| Optional
[data_format](@ref dnnl::graph::op::attr::data_format) |Controls how to interpret the shape of `src` and `dst`.| string|`NCX`, `NXC` (default) | Optional
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0|`src` | Required
1|`diff_dst` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0|`diff_src` | Required
## Supported data types
MaxPoolBackward operation supports the following data type combinations.
Src | Diff_dst|Diff_src
-- | --|--
f32 |f32|f32
bf16 |bf16|bf16
f16 |f16|f16

View File

@ -0,0 +1,47 @@
# Maximum {#dev_guide_op_maximum}
## General
Maximum operation performs element-wise maximum operation between two given
tensors, applying multi-directional broadcast rules.
\f[ \dst(\overline{x}) = \max(\src\_0(\overline{x}), \src\_1(\overline{x})) \f]
## Operation attributes
Attribute Name | Description | Value Type |Supported Values | Required or Optional
-- | -- | --| --|--
[auto_broadcast](@ref dnnl::graph::op::attr::auto_broadcast) | Specifies rules used for auto-broadcasting of src tensors.|string |`none`,`numpy` (default) | Optional
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
----- | ------------- | --------------------
0 | `src_0` | Required
1 | `src_1` | Required
@note Both src shapes should match and no auto-broadcasting is allowed if the
`auto_broadcast` attribute is `none`. `src_0` and `src_1` shapes can be
different and auto-broadcasting is allowed if the `auto_broadcast` attribute is
`numpy`. Broadcasting is performed according to the `auto_broadcast` value.
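For example (shapes are illustrative), with `auto_broadcast` = `numpy`, a
`src_0` of shape {2, 3, 4} and a `src_1` of shape {3, 1} produce a `dst` of
shape {2, 3, 4}: `src_1` is treated as {1, 3, 1} and broadcast along the first
and last dimensions.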
### Outputs
Index | Argument Name | Required or Optional
----- | ------------- | --------------------
0 | `dst` | Required
## Supported data types
Maximum operation supports the following data type combinations.
Source0/1 | Destination
---- | -------
f32 | f32
bf16 | bf16
f16 | f16

View File

@ -0,0 +1,47 @@
# Minimum {#dev_guide_op_minimum}
## General
Minimum operation performs element-wise minimum operation between two given
tensors, applying multi-directional broadcast rules.
\f[ \dst(\overline{x}) = \min(\src\_0(\overline{x}), \src\_1(\overline{x})) \f]
## Operation attributes
Attribute Name | Description | Value Type |Supported Values | Required or Optional
-- | -- | --| --|--
[auto_broadcast](@ref dnnl::graph::op::attr::auto_broadcast) | Specifies rules used for auto-broadcasting of src tensors. |string |`none`,`numpy` (default) | Optional
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
----- | ------------- | --------------------
0 | `src_0` | Required
1 | `src_1` | Required
@note Both src shapes should match and no auto-broadcasting is allowed if the
`auto_broadcast` attribute is `none`. `src_0` and `src_1` shapes can be
different and auto-broadcasting is allowed if the `auto_broadcast` attribute is
`numpy`. Broadcasting is performed according to the `auto_broadcast` value.
### Outputs
Index | Argument Name | Required or Optional
----- | ------------- | --------------------
0 | `dst` | Required
## Supported data types
Minimum operation supports the following data type combinations.
Source0/1 | Destination
---- | -------
f32 | f32
bf16 | bf16
f16 | f16

View File

@ -0,0 +1,38 @@
# Mish {#dev_guide_op_mish}
## General
Mish performs an element-wise activation function on a given input tensor,
based on the following mathematical formula:
\f[ dst = src * \tanh(SoftPlus(src)) = src * \tanh(\ln(1 + e^{src})) \f]
## Operation attributes
Mish operation does not support any attribute.
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `src` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `dst` |Required
## Supported data types
Mish operation supports the following data type combinations.
Src | Dst
-- | --
f32 | f32
bf16 | bf16
f16 | f16

View File

@ -0,0 +1,41 @@
# MishBackward {#dev_guide_op_mishbackward}
## General
MishBackward operation computes the gradient for Mish.
\f[ \diffsrc = \diffdst \cdot \frac{e^{src} \cdot \omega}{\delta^{2}} \f]
where
\f[ \omega = e^{3src} + 4 \cdot e^{2src} + e^{src} \cdot (4 \cdot src + 6) + 4 \cdot (src + 1) \f]
\f[ \delta = e^{2src} + 2 \cdot e^{src} + 2 \f]
## Operation attributes
MishBackward operation does not support any attribute.
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `src` | Required
1 | `diff_dst` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `diff_src` | Required
## Supported data types
MishBackward operation supports the following data type combinations.
Src | Diff_dst | Diff_src
-- | -- | --
f32 | f32 | f32
f16 | f16 | f16
bf16 | bf16 | bf16

View File

@ -0,0 +1,52 @@
# Multiply {#dev_guide_op_multiply}
## General
Multiply operation performs element-wise multiplication of two given tensors,
applying multi-directional broadcast rules.
\f[
    \dst(\overline{x}) =
        \src_0(\overline{x}) \times \src_1(\overline{x})
\f]
## Operation attributes
Attribute Name | Description | Value Type |Supported Values | Required or Optional
-- | -- | --| --|--
[auto_broadcast](@ref dnnl::graph::op::attr::auto_broadcast) | Specifies rules used for auto-broadcasting of src tensors. |string |`none`,`numpy` (default) | Optional
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
----- | ------------- | --------------------
0 | `src_0` | Required
1 | `src_1` | Required
@note Both src shapes should match and no auto-broadcasting is allowed if the
`auto_broadcast` attribute is `none`. `src_0` and `src_1` shapes can be
different and auto-broadcasting is allowed if the `auto_broadcast` attribute is
`numpy`. Broadcasting is performed according to the `auto_broadcast` value.
### Outputs
Index | Argument Name | Required or Optional
----- | ------------- | --------------------
0 | `dst` | Required
## Supported data types
Multiply operation supports the following data type combinations.
Source0/1 | Destination
---- | -------
f32 | f32
bf16 | bf16
f16 | f16

View File

@ -0,0 +1,60 @@
# PReLU {#dev_guide_op_prelu}
## General
PReLU operation performs element-wise parametric ReLU operation on a given
input tensor, based on the following mathematical formula:
\f[ dst = \begin{cases} src & \text{if}\ src \ge 0 \\
\alpha src & \text{if}\ src < 0 \end{cases} \f]
## Operation attributes
Attribute Name | Description | Value Type | Supported Values | Required or Optional
-- | -- | -- | -- | --
[data_format](@ref dnnl::graph::op::attr::data_format) | Denotes the data format of the input and output data. | string | `NCX`, `NXC`(default) | Optional
[per_channel_broadcast](@ref dnnl::graph::op::attr::per_channel_broadcast) | Denotes whether to apply per_channel broadcast when slope is 1D tensor. | bool | `false`, `true`(default) | Optional
### Broadcasting Rules
Only slope tensor supports broadcasting semantics. Slope tensor is
uni-directionally broadcasted to \src if one of the following rules is met:
- 1: slope is a 1D tensor and `per_channel_broadcast` is set to `true`; the
  length of the slope tensor is equal to the size of the channel dimension
  of \src.
- 2: slope is a 1D tensor and `per_channel_broadcast` is set to `false`; the
  length of the slope tensor is equal to the size of the rightmost dimension
  of \src.
- 3: slope is an nD tensor; starting from the rightmost dimension,
  \f$input.shape_i == slope.shape_i\f$, or \f$slope.shape_i == 1\f$, or
  slope dimension i is empty.
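For example (shapes are illustrative), given an `NCX` \src of shape
{2, 96, 14, 14}: rule 1 expects a slope of shape {96} (the channel dimension),
rule 2 expects a slope of shape {14} (the rightmost dimension), and rule 3
admits a slope such as {96, 1, 1}, which broadcasts against the last three
dimensions.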
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `src` | Required
1 | `slope` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `dst` |Required
## Supported data types
PReLU operation supports the following data type combinations.
Src | Dst | Slope
-- | -- | --
f32 | f32 | f32
bf16 | bf16 | bf16
f16 | f16 | f16

View File

@ -0,0 +1,56 @@
# PReLUBackward {#dev_guide_op_prelubackward}
## General
PReLUBackward operation computes the gradient for PReLU.
## Operation attributes
Attribute Name | Description | Value Type | Supported Values | Required or Optional
-- | -- | -- | -- | --
[data_format](@ref dnnl::graph::op::attr::data_format) | Denotes the data format of the input and output data. | string | `NCX`, `NXC`(default) | Optional
### Broadcasting Rules
Only slope tensor supports broadcasting semantics. Slope tensor is
uni-directionally broadcasted to \src if one of the following rules is met:
1. PyTorch case: slope is a 1D tensor and broadcast per channel; the length of
   slope is equal to the size of the channel dimension of \src.
2. PyTorch case: slope is a 1D tensor and broadcast per tensor; the length of
   slope is equal to 1.
3. TensorFlow case: slope is an nD tensor and its dimensions must be equal to
   the \src dimensions starting from the second element:
   \f$ slope\_shape = input\_forward\_shape[1:] \f$
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `src` | Required
1 | `slope` | Required
2 | `diff_dst` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `diff_src` | Required
1 | `diff_slope` | Required
## Supported data types
PReLUBackward operation supports the following data type combinations.
Src | Slope | Diff_dst | Diff_src | Diff_slope
-- | -- | -- | -- | --
f32 | f32 | f32 | f32 | f32
bf16 | bf16 | bf16 | bf16 | bf16
f16 | f16 | f16 | f16 | f16

View File

@ -0,0 +1,55 @@
# Quantize {#dev_guide_op_quantize}
## General
Quantize operation converts an f32 tensor to a quantized (u8/s8) tensor. It
supports both per-tensor and per-channel asymmetric linear quantization. The
output data type is specified by the data type of the output tensor. The
rounding mode is library-implementation defined.
For per-tensor quantization:
\f[ \dst_{i} = round(\src_{i} / scale + zp) \f]
For per-channel quantization, taking channel axis = 1 as an example:
\f[ dst_{\cdots,i,\cdots,\cdots} = round(\src_{\cdots,i,\cdots,\cdots} / scale_i + zp_i), i \in {[0, ic-1]} \f]
where \f$ic\f$ is the number of channels.
## Operation attributes
Attribute Name | Description | Value Type |Supported Values | Required or Optional
-- | -- | --| --|--
[qtype](@ref dnnl::graph::op::attr::qtype) | Specifies which quantization type is used. |string | `per_tensor` (default), `per_channel` | Optional
[axis](@ref dnnl::graph::op::attr::axis) | Specifies dimension on which per-channel quantization is applied. |s64 | A s64 value in the range of [-r, r-1] where r = rank(src), `1` by default | Optional
[scales](@ref dnnl::graph::op::attr::scales) | Scales applied to the src data. |f32 | A f32 list (only contains one element if qtype is `per_tensor`) | Required
[zps](@ref dnnl::graph::op::attr::zps) | Offset values that map to float zero. |s64 | A s64 list (only contains one element if qtype is `per_tensor`) | Required
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0|`src` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0|`dst` |Required
## Supported data types
Quantize operation supports the following data type combinations.
Src | Dst
-- | --
f32 |s8, u8
@note This operation is to support
[int8 quantization](@ref dev_guide_graph_int8_quantization_model) model.
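A minimal sketch of a per-channel Quantize op in the C++ graph API; the ids, shapes, scales, and zero points are illustrative assumptions.
```cpp
#include <cstdint>
#include <string>
#include <vector>
#include "oneapi/dnnl/dnnl_graph.hpp"

using namespace dnnl::graph;

int main() {
    using dt = logical_tensor::data_type;
    using lt = logical_tensor::layout_type;

    // Per-channel quantization along axis 1 (two channels):
    // dst[:, i, ...] = round(src[:, i, ...] / scales[i] + zps[i]).
    logical_tensor src {0, dt::f32, {1, 2, 4, 4}, lt::strided};
    logical_tensor dst {1, dt::s8, {1, 2, 4, 4}, lt::strided};

    op quant {2, op::kind::Quantize, "quantize"};
    quant.set_attr<std::string>(op::attr::qtype, "per_channel");
    quant.set_attr<int64_t>(op::attr::axis, 1);
    quant.set_attr<std::vector<float>>(op::attr::scales, {0.5f, 0.25f});
    quant.set_attr<std::vector<int64_t>>(op::attr::zps, {0, 0});
    quant.add_input(src);
    quant.add_output(dst);

    graph g {dnnl::engine::kind::cpu};
    g.add_op(quant);
    return 0;
}
```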

View File

@ -0,0 +1,40 @@
# ReLU {#dev_guide_op_relu}
## General
ReLU operation applies the following formula on every element of \src tensor
(the variable names follow the standard @ref dev_guide_conventions):
\f[ dst = \begin{cases} src & \text{if}\ src > 0 \\
0 & \text{if}\ src \leq 0 \end{cases} \f]
## Operation attributes
ReLU operation does not support any attribute.
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0|`src` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0|`dst` |Required
## Supported data types
ReLU operation supports the following data type combinations.
Src | Dst
-- | --
f32 | f32
bf16 | bf16
f16 | f16

View File

@ -0,0 +1,39 @@
# ReLUBackward {#dev_guide_op_relubackward}
## General
ReLUBackward operation computes the gradient for ReLU.
## Operation attributes
Attribute Name | Description | Value Type | Supported Values | Required or Optional
-- | -- | -- | -- | --
[use_dst](@ref dnnl::graph::op::attr::use_dst) | If true, use `dst` of ReLU operation to calculate the gradient. Otherwise, use `src`. | bool | `true` (default), `false` | Optional
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `src` / `dst` | Required
1 | `diff_dst` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `diff_src` | Required
## Supported data types
ReLUBackward operation supports the following data type combinations.
Src | Diff_dst | Diff_src
-- | -- | --
f32 | f32 | f32
f16 | f16 | f16
bf16 | bf16 | bf16

View File

@ -0,0 +1,39 @@
# Reciprocal {#dev_guide_op_reciprocal}
## General
Reciprocal operation is an element-wise Power operation where the exponent
(power) equals -1. The reciprocal of 0 is infinity.
\f[ dst = \begin{cases} src^{-1} & \text{if}\ src \neq 0 \\
    inf & \text{if}\ src = 0 \end{cases} \f]
## Operation attributes
Reciprocal operation does not support any attribute.
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
----- | ------------- | --------------------
0 | `src` | Required
### Outputs
Index | Argument Name | Required or Optional
----- | ------------- | --------------------
0 | `dst` | Required
## Supported data types
Reciprocal operation supports the following data type combinations.
Source | Destination
---- | -------
f32 | f32
bf16 | bf16
f16 | f16

View File

@ -0,0 +1,56 @@
# ReduceL1 {#dev_guide_op_reducel1}
## General
ReduceL1 operation performs the reduction by finding the L1 norm (sum of
absolute values) of a given src tensor along the dimensions specified by axes.
Take channel axis = 0 and keep_dims = True as an example:
\f[ {dst}_{0,\cdots,\cdots} =
\sum\limits_{i}|src_{i,\cdots,\cdots}| ,i \in [0,channelNum-1] \f]
## Operation attributes
Attribute Name | Description | Value Type |Supported Values | Required or Optional
-- | -- | --| --|--
[axes](@ref dnnl::graph::op::attr::axes) | Specifies the indices of the src tensor along which the reduction is performed. If axes is a list, reduce over all of them. If axes is empty, it corresponds to the identity operation. If axes contains all dimensions of the src tensor, a single reduction value is calculated for the entire src tensor. Exactly one of the attribute `axes` and the second input tensor `axes` should be available. |s64 |A list of s64 values in the range of [-r, r-1] where r = rank(src). Empty list (default) | Optional
[keep_dims](@ref dnnl::graph::op::attr::keep_dims) | If set to `true`, it holds the axes that are used for the reduction; for each such axis, the dst dimension is equal to 1. |bool |`true`, `false` (default) | Optional
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
----- | ------------- | --------------------
0 | `src` | Required
1 | `axes` | Optional
@note `axes` is a 1D tensor specifying the axes along which the reduction is
performed. It is a 1D tensor of unique elements, and the range of elements is
[-r, r-1], where r is the rank of the src tensor. Exactly one of the attribute
`axes` and the second input tensor `axes` should be available.
### Outputs
Index | Argument Name | Required or Optional
----- | ------------- | --------------------
0 | `dst` | Required
@note `dst` is the result of the ReduceL1 function applied to the src tensor.
`shape[i] = shapeOf(data)[i]` for all `i` that is not in the list of axes from
the second input. For dimensions from `axes`, `shape[i] == 1` if
`keep_dims == true`; otherwise, the i-th dimension is removed from dst.
## Supported data types
ReduceL1 operation supports the following data type combinations.
Source/Destination |Axes
---- | -------
f32 | s32
bf16 | s32
f16 | s32
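For example (the same shape rules apply to the other Reduce operations below),
reducing a src of shape {2, 3, 4} with `axes` = {0, 2} produces a dst of shape
{1, 3, 1} when `keep_dims` is `true`, and {3} when `keep_dims` is `false`.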

View File

@ -0,0 +1,56 @@
# ReduceL2 {#dev_guide_op_reducel2}
## General
ReduceL2 operation performs the reduction by finding the L2 norm (square root
of the sum of squares) of a given src tensor along the dimensions specified by
axes.
Take channel axis = 0 and keep_dims = True as an example:
\f[ {dst}_{0,\cdots,\cdots} =
\sqrt{\sum\limits_{i}{src_{i,\cdots,\cdots}^2}} ,i \in [0,channelNum-1] \f]
## Operation attributes
Attribute Name | Description | Value Type |Supported Values | Required or Optional
-- | -- | --| --|--
[axes](@ref dnnl::graph::op::attr::axes) | Specifies the indices of the src tensor along which the reduction is performed. If axes is a list, reduce over all of them. If axes is empty, it corresponds to the identity operation. If axes contains all dimensions of the src tensor, a single reduction value is calculated for the entire src tensor. Exactly one of the attribute `axes` and the second input tensor `axes` should be available. |s64 |A list of s64 values in the range of [-r, r-1] where r = rank(src). Empty list (default) | Optional
[keep_dims](@ref dnnl::graph::op::attr::keep_dims) | If set to `true`, it holds the axes that are used for the reduction; for each such axis, the dst dimension is equal to 1. |bool |`true`, `false` (default) | Optional
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
----- | ------------- | --------------------
0 | `src` | Required
1 | `axes` | Optional
@note `axes` is a 1D tensor specifying the axes along which the reduction is
performed. It is a 1D tensor of unique elements, and the range of elements is
[-r, r-1], where r is the rank of the src tensor. Exactly one of the attribute
`axes` and the second input tensor `axes` should be available.
### Outputs
Index | Argument Name | Required or Optional
----- | ------------- | --------------------
0 | `dst` | Required
@note `dst` is the result of the ReduceL2 function applied to the src tensor.
`shape[i] = shapeOf(data)[i]` for all `i` that is not in the list of axes from
the second input. For dimensions from `axes`, `shape[i] == 1` if
`keep_dims == true`; otherwise, the i-th dimension is removed from dst.
## Supported data types
ReduceL2 operation supports the following data type combinations.
Source/Destination |Axes
---- | -------
f32 | s32
bf16 | s32
f16 | s32

View File

@ -0,0 +1,56 @@
# ReduceMax {#dev_guide_op_reducemax}
## General
ReduceMax operation performs the reduction by finding the maximum value of a
given src tensor along the dimensions specified by axes.
Take channel axis = 0 and keep_dims = True as an example:
\f[ {dst}_{0,\cdots,\cdots} =
\max\{src_{i,\cdots,\cdots}\} ,i \in [0,channelNum-1] \f]
## Operation attributes
Attribute Name | Description | Value Type |Supported Values | Required or Optional
-- | -- | --| --|--
[axes](@ref dnnl::graph::op::attr::axes) | Specifies the indices of the src tensor along which the reduction is performed. If axes is a list, reduce over all of them. If axes is empty, it corresponds to the identity operation. If axes contains all dimensions of the src tensor, a single reduction value is calculated for the entire src tensor. Exactly one of the attribute `axes` and the second input tensor `axes` should be available. |s64 |A list of s64 values in the range of [-r, r-1] where r = rank(src). Empty list (default) | Optional
[keep_dims](@ref dnnl::graph::op::attr::keep_dims) | If set to `true`, it holds the axes that are used for the reduction; for each such axis, the dst dimension is equal to 1. |bool |`true`, `false` (default) | Optional
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
----- | ------------- | --------------------
0 | `src` | Required
1 | `axes` | Optional
@note `axes` is a 1D tensor specifying the axes along which the reduction is
performed. It is a 1D tensor of unique elements, and the range of elements is
[-r, r-1], where r is the rank of the src tensor. Exactly one of the attribute
`axes` and the second input tensor `axes` should be available.
### Outputs
Index | Argument Name | Required or Optional
----- | ------------- | --------------------
0 | `dst` | Required
@note `dst` is the result of the ReduceMax function applied to the src tensor.
`shape[i] = shapeOf(data)[i]` for all `i` that is not in the list of axes from
the second input. For dimensions from `axes`, `shape[i] == 1` if
`keep_dims == true`; otherwise, the i-th dimension is removed from dst.
## Supported data types
ReduceMax operation supports the following data type combinations.
Source/Destination |Axes
---- | -------
f32 | s32
bf16 | s32
f16 | s32

View File

@ -0,0 +1,56 @@
# ReduceMean {#dev_guide_op_reducemean}
## General
ReduceMean operation performs the reduction by finding the arithmetic mean of
a given src tensor along the dimensions specified by axes.
Take channel axis = 0 and keep_dims = True as an example:
\f[ {dst}_{0,\cdots,\cdots} =\frac{
{\sum\limits_{i}{src_{i,\cdots,\cdots}}}}{channelNum} ,i \in [0,channelNum-1] \f]
## Operation attributes
Attribute Name | Description | Value Type |Supported Values | Required or Optional
-- | -- | --| --|--
[axes](@ref dnnl::graph::op::attr::axes) | Specifies the indices of the src tensor along which the reduction is performed. If axes is a list, reduce over all of them. If axes is empty, it corresponds to the identity operation. If axes contains all dimensions of the src tensor, a single reduction value is calculated for the entire src tensor. Exactly one of the attribute `axes` and the second input tensor `axes` should be available. |s64 |A list of s64 values in the range of [-r, r-1] where r = rank(src). Empty list (default) | Optional
[keep_dims](@ref dnnl::graph::op::attr::keep_dims) | If set to `true`, it holds the axes that are used for the reduction; for each such axis, the dst dimension is equal to 1. |bool |`true`, `false` (default) | Optional
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
----- | ------------- | --------------------
0 | `src` | Required
1 | `axes` | Optional
@note `axes` is a 1D tensor specifying the axes along which the reduction is
performed. It is a 1D tensor of unique elements, and the range of elements is
[-r, r-1], where r is the rank of the src tensor. Exactly one of the attribute
`axes` and the second input tensor `axes` should be available.
### Outputs
Index | Argument Name | Required or Optional
----- | ------------- | --------------------
0 | `dst` | Required
@note `dst` is the result of the ReduceMean function applied to the src
tensor. `shape[i] = shapeOf(data)[i]` for all `i` that is not in the list of
axes from the second input. For dimensions from `axes`, `shape[i] == 1` if
`keep_dims == true`; otherwise, the i-th dimension is removed from dst.
## Supported data types
ReduceMean operation supports the following data type combinations.
Source/Destination |Axes
---- | -------
f32 | s32
bf16 | s32
f16 | s32

View File

@ -0,0 +1,56 @@
# ReduceMin {#dev_guide_op_reducemin}
## General
ReduceMin operation performs the reduction by finding the minimum value of a
given src tensor along the dimensions specified by axes.
Take channel axis = 0 and keep_dims = True as an example:
\f[ {dst}_{0,\cdots,\cdots} =
\min\{src_{i,\cdots,\cdots}\} ,i \in [0,channelNum-1] \f]
## Operation attributes
Attribute Name | Description | Value Type |Supported Values | Required or Optional
-- | -- | --| --|--
[axes](@ref dnnl::graph::op::attr::axes) | Specifies the indices of the src tensor along which the reduction is performed. If axes is a list, reduce over all of them. If axes is empty, it corresponds to the identity operation. If axes contains all dimensions of the src tensor, a single reduction value is calculated for the entire src tensor. Exactly one of the attribute `axes` and the second input tensor `axes` should be available. |s64 |A list of s64 values in the range of [-r, r-1] where r = rank(src). Empty list (default) | Optional
[keep_dims](@ref dnnl::graph::op::attr::keep_dims) | If set to `true`, it holds the axes that are used for the reduction; for each such axis, the dst dimension is equal to 1. |bool |`true`, `false` (default) | Optional
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
----- | ------------- | --------------------
0 | `src` | Required
1 | `axes` | Optional
@note `axes` is a 1D tensor specifying the axes along which the reduction is
performed. It is a 1D tensor of unique elements, and the range of elements is
[-r, r-1], where r is the rank of the src tensor. Exactly one of the attribute
`axes` and the second input tensor `axes` should be available.
### Outputs
Index | Argument Name | Required or Optional
----- | ------------- | --------------------
0 | `dst` | Required
@note `dst` is the result of the ReduceMin function applied to the src tensor.
`shape[i] = shapeOf(data)[i]` for all `i` that is not in the list of axes from
the second input. For dimensions from `axes`, `shape[i] == 1` if
`keep_dims == true`; otherwise, the i-th dimension is removed from dst.
## Supported data types
ReduceMin operation supports the following data type combinations.
Source/Destination |Axes
---- | -------
f32 | s32
bf16 | s32
f16 | s32

View File

@ -0,0 +1,56 @@
# ReduceProd {#dev_guide_op_reduceprod}
## General
ReduceProd operation performs the reduction by multiplication of a given src
tensor along the dimensions specified by axes.
Take channel axis = 0 and keep_dims = True as an example:
\f[ {dst}_{0,\cdots,\cdots} =
\prod\limits_{i}src_{i,\cdots,\cdots} ,i \in [0,channelNum-1] \f]
## Operation attributes
Attribute Name | Description | Value Type |Supported Values | Required or Optional
-- | -- | --| --|--
[axes](@ref dnnl::graph::op::attr::axes) | Specifies the indices of the src tensor along which the reduction is performed. If axes is a list, reduce over all of them. If axes is empty, it corresponds to the identity operation. If axes contains all dimensions of the src tensor, a single reduction value is calculated for the entire src tensor. Exactly one of the attribute `axes` and the second input tensor `axes` should be available. |s64 |A list of s64 values in the range of [-r, r-1] where r = rank(src). Empty list (default) | Optional
[keep_dims](@ref dnnl::graph::op::attr::keep_dims) | If set to `true`, it holds the axes that are used for the reduction; for each such axis, the dst dimension is equal to 1. |bool |`true`, `false` (default) | Optional
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
----- | ------------- | --------------------
0 | `src` | Required
1 | `axes` | Optional
@note `axes` is a 1D tensor specifying the axes along which the reduction is
performed. It is a 1D tensor of unique elements, and the range of elements is
[-r, r-1], where r is the rank of the src tensor. Exactly one of the attribute
`axes` and the second input tensor `axes` should be available.
### Outputs
Index | Argument Name | Required or Optional
----- | ------------- | --------------------
0 | `dst` | Required
@note `dst` is the result of the ReduceProd function applied to the src
tensor. `shape[i] = shapeOf(data)[i]` for all `i` that is not in the list of
axes from the second input. For dimensions from `axes`, `shape[i] == 1` if
`keep_dims == true`; otherwise, the i-th dimension is removed from dst.
## Supported data types
ReduceProd operation supports the following data type combinations.
Source/Destination |Axes
---- | -------
f32 | s32
bf16 | s32
f16 | s32

View File

@ -0,0 +1,56 @@
# ReduceSum {#dev_guide_op_reducesum}
## General
ReduceSum operation performs the reduction by addition of a given src tensor
along the dimensions specified by axes.
Take channel axis = 0 and keep_dims = True as an example:
\f[ {dst}_{0,\cdots,\cdots} =
\sum\limits_{i}src_{i,\cdots,\cdots} ,i \in [0,channelNum-1] \f]
## Operation attributes
Attribute Name | Description | Value Type |Supported Values | Required or Optional
-- | -- | --| --|--
[axes](@ref dnnl::graph::op::attr::axes) | Specifies the indices of the src tensor along which the reduction is performed. If axes is a list, reduce over all of them. If axes is empty, it corresponds to the identity operation. If axes contains all dimensions of the src tensor, a single reduction value is calculated for the entire src tensor. Exactly one of the attribute `axes` and the second input tensor `axes` should be available. |s64 |A list of s64 values in the range of [-r, r-1] where r = rank(src). Empty list (default) | Optional
[keep_dims](@ref dnnl::graph::op::attr::keep_dims) | If set to `true`, it holds the axes that are used for the reduction; for each such axis, the dst dimension is equal to 1. |bool |`true`, `false` (default) | Optional
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
----- | ------------- | --------------------
0 | `src` | Required
1 | `axes` | Optional
@note `axes` is a 1D tensor specifying the axes along which the reduction is
performed. It is a 1D tensor of unique elements, and the range of elements is
[-r, r-1], where r is the rank of the src tensor. Exactly one of the attribute
`axes` and the second input tensor `axes` should be available.
### Outputs
Index | Argument Name | Required or Optional
----- | ------------- | --------------------
0 | `dst` | Required
@note `dst` is the result of the ReduceSum function applied to the src tensor.
`shape[i] = shapeOf(data)[i]` for all `i` that is not in the list of axes from
the second input. For dimensions from `axes`, `shape[i] == 1` if
`keep_dims == true`; otherwise, the i-th dimension is removed from dst.
## Supported data types
ReduceSum operation supports the following data type combinations.
Source/Destination |Axes
---- | -------
f32 | s32
bf16 | s32
f16 | s32

View File

@ -0,0 +1,49 @@
# Reorder {#dev_guide_op_reorder}
## General
Reorder operation converts \src tensor to \dst tensor with a different layout.
It supports the conversion between:
- Two different opaque layouts.
- Two different strided layouts.
- One strided layout and another opaque layout.
Reorder operation does not support layout conversion across different backends
or different engines. Unlike the [reorder primitive](@ref dev_guide_reorder),
Reorder operation cannot be used to cast the data type from \src to \dst.
Please check the usage of the [TypeCast](@ref dev_guide_op_typecast) and
[Dequantize](@ref dev_guide_op_dequantize) operations.
## Operation attributes
Reorder operation does not support any attribute.
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `src` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `dst` | Required
## Supported data types
Reorder operation supports the following data type combinations.
Src | Dst
-- | --
f32 |f32
bf16 |bf16
f16 |f16

View File

@ -0,0 +1,37 @@
# Round {#dev_guide_op_round}
## General
Round operation rounds the values of a tensor to the nearest integer,
element-wise.
## Operation attributes
Round operation does not support any attribute.
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `src` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `dst` | Required
## Supported data types
Round operation supports the following data type combinations.
Src | Dst
-- | --
f32 | f32
f16 | f16
bf16 | bf16

View File

@ -0,0 +1,39 @@
# Sigmoid {#dev_guide_op_sigmoid}
## General
Sigmoid operation applies the following formula on every element of the \src tensor
(the variable names follow the standard @ref dev_guide_conventions):
\f[ dst = \frac{1}{1 + e^{-src}} \f]
## Operation attributes
Sigmoid operation does not support any attribute.
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `src` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `dst` | Required
## Supported data types
Sigmoid operation supports the following data type combinations.
Src | Dst
-- | --
f32 | f32
bf16 | bf16
f16 | f16

View File

@ -0,0 +1,39 @@
# SigmoidBackward {#dev_guide_op_sigmoidbackward}
## General
SigmoidBackward operation computes the gradient for Sigmoid.
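Given the standard derivative of the sigmoid function,
\f$ \sigma'(x) = \sigma(x)(1 - \sigma(x)) \f$, the computed gradient can be
expressed as:

\f[ diff\_src = diff\_dst \cdot dst \cdot (1 - dst) \f]

where \f$ dst = \frac{1}{1 + e^{-src}} \f$ is the result of the forward
Sigmoid.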
## Operation attributes
Attribute Name | Description | Value Type | Supported Values | Required or Optional
-- | -- | -- | -- | --
[use_dst](@ref dnnl::graph::op::attr::use_dst) | If true, use `dst` of Sigmoid operation to calculate the gradient. Otherwise, use `src`. | bool | `true` (default), `false` | Optional
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `src` / `dst` | Required
1 | `diff_dst` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `diff_src` | Required
## Supported data types
SigmoidBackward operation supports the following data type combinations.
Src | Diff_dst | Diff_src
-- | -- | --
f32 | f32 | f32
f16 | f16 | f16
bf16 | bf16 | bf16

View File

@ -0,0 +1,41 @@
# SoftPlus {#dev_guide_op_softplus}
## General
SoftPlus operation applies the following formula on every element of the \src
tensor (the variable names follow the standard @ref dev_guide_conventions):
\f[ dst = \frac{1}{\beta} \ln(e^{\beta \cdot src} + 1.0) \f]
## Operation attributes
Attribute Name | Description | Value Type | Supported Values | Required or Optional
-- | -- | -- | -- | --
[beta](@ref dnnl::graph::op::attr::beta) | Value for the SoftPlus formulation. | s64 | Arbitrary s64 value (`1` by default) | Optional
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `src` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `dst` | Required
## Supported data types
SoftPlus operation supports the following data type combinations.
Src | Dst
-- | --
f32 | f32
bf16 | bf16
f16 | f16

View File

@ -0,0 +1,39 @@
# SoftPlusBackward {#dev_guide_op_softplusbackward}
## General
SoftPlusBackward operation computes the gradient for SoftPlus.
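Since the derivative of the SoftPlus formula is
\f$ \frac{d}{d\,src}\left(\frac{1}{\beta}\ln(e^{\beta \cdot src} + 1)\right) =
\frac{e^{\beta \cdot src}}{e^{\beta \cdot src} + 1} \f$, the computed gradient
can be expressed as:

\f[ diff\_src = diff\_dst \cdot \frac{e^{\beta \cdot src}}{e^{\beta \cdot src} + 1} \f]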
## Operation attributes
Attribute Name | Description | Value Type | Supported Values | Required or Optional
-- | -- | -- | -- | --
[beta](@ref dnnl::graph::op::attr::beta) | Value for the SoftPlus formulation. | s64 | Arbitrary s64 value (`1` by default) | Optional
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `src` | Required
1 | `diff_dst` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `diff_src` | Required
## Supported data types
SoftPlusBackward operation supports the following data type combinations.
Src | Diff_dst | Diff_src
-- | -- | --
f32 | f32 | f32
bf16 | bf16 | bf16
f16 | f16 | f16

View File

@ -0,0 +1,42 @@
# Softmax {#dev_guide_op_softmax}
## General
Softmax operation applies the following formula on every element of the \src
tensor (the variable names follow the standard @ref dev_guide_conventions):
\f[ dst_i = \frac{exp(src_i)}{\sum_{j=1}^{C} exp(src_j)} \f]
where \f$ C \f$ is the size of the tensor along the `axis` dimension.
## Operation attributes
Attribute Name | Description | Value Type | Supported Values | Required or Optional
-- | -- | -- | -- | --
[axis](@ref dnnl::graph::op::attr::axis) | Represents the axis along which Softmax is calculated. | s64 | Arbitrary s64 value (`1` by default) | Optional
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `src` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `dst` | Required
## Supported data types
Softmax operation supports the following data type combinations.
Src | Dst
-- | --
f32 | f32
bf16 | bf16
f16 | f16
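The following hedged sketch shows how the `axis` attribute described above can
be set when building a Softmax operation through the C++ API; the tensor IDs
and shapes are assumptions made for this example.

```cpp
#include <cstdint>
#include "oneapi/dnnl/dnnl_graph.hpp"

using namespace dnnl::graph;

void build_softmax(graph &g) {
    logical_tensor src(0, logical_tensor::data_type::f32, {32, 1000},
            logical_tensor::layout_type::strided);
    logical_tensor dst(1, logical_tensor::data_type::f32, {32, 1000},
            logical_tensor::layout_type::strided);

    op softmax(2, op::kind::SoftMax, {src}, {dst}, "softmax");
    // Normalize over the last dimension of the [32, 1000] tensor.
    softmax.set_attr<int64_t>(op::attr::axis, 1);

    g.add_op(softmax);
}
```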

View File

@ -0,0 +1,39 @@
# SoftmaxBackward {#dev_guide_op_softmaxbackward}
## General
SoftmaxBackward operation computes the gradient for Softmax.
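With \f$ dst \f$ denoting the result of the forward Softmax along the given
axis, the gradient follows the standard Softmax Jacobian:

\f[ diff\_src_i = dst_i \cdot \left( diff\_dst_i -
\sum_{j=1}^{C} diff\_dst_j \cdot dst_j \right) \f]

where \f$ C \f$ is the size of the tensor along the `axis` dimension.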
## Operation attributes
Attribute Name | Description | Value Type | Supported Values | Required or Optional
-- | -- | -- | -- | --
[axis](@ref dnnl::graph::op::attr::axis) | Represents the axis along which Softmax is calculated. | s64 | Arbitrary s64 value (`1` by default) | Optional
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `diff_dst` | Required
1 | `src` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `diff_src` | Required
## Supported data types
SoftmaxBackward operation supports the following data type combinations.
Src | Diff_dst | Diff_src
-- | -- | --
f32 | f32 | f32
bf16 | bf16 | bf16
f16 | f16 | f16

View File

@ -0,0 +1,36 @@
# Sqrt {#dev_guide_op_sqrt}
## General
Sqrt operation performs an element-wise square root operation on a given tensor.
## Operation attributes
Sqrt operation does not support any attribute.
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `src` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `dst` | Required
## Supported data types
Sqrt operation supports the following data type combinations.
Src | Dst
-- | --
f32 | f32
f16 | f16
bf16 | bf16

View File

@ -0,0 +1,39 @@
# SqrtBackward {#dev_guide_op_sqrtbackward}
## General
SqrtBackward operation computes the gradient for Sqrt.
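Since \f$ dst = \sqrt{src} \f$, the standard derivative gives:

\f[ diff\_src = \frac{diff\_dst}{2 \cdot dst} =
\frac{diff\_dst}{2\sqrt{src}} \f]

which can be evaluated from either `src` or `dst`, matching the `use_dst`
attribute below.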
## Operation attributes
Attribute Name | Description | Value Type | Supported Values | Required or Optional
-- | -- | -- | -- | --
[use_dst](@ref dnnl::graph::op::attr::use_dst) | If true, use `dst` of Sqrt operation to calculate the gradient. Otherwise, use `src`. | bool | `true` (default), `false` | Optional
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `src` / `dst` | Required
1 | `diff_dst` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `diff_src` | Required
## Supported data types
SqrtBackward operation supports the following data type combinations.
Src | Diff_dst | Diff_src
-- | -- | --
f32 | f32 | f32
f16 | f16 | f16
bf16 | bf16 | bf16

View File

@ -0,0 +1,36 @@
# Square {#dev_guide_op_square}
## General
Square operation performs an element-wise square operation on a given tensor.
## Operation attributes
Square operation does not support any attribute.
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `src` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `dst` | Required
## Supported data types
Square operation supports the following data type combinations.
Src | Dst
-- | --
f32 | f32
f16 | f16
bf16 | bf16

View File

@ -0,0 +1,48 @@
# SquaredDifference {#dev_guide_op_squareddifference}
## General
SquaredDifference operation performs element-wise subtraction with two given
tensors applying multi-directional broadcast rules; after that, each result of
the subtraction is squared.
Before performing the arithmetic operation, \f$src_1\f$ and \f$src_2\f$ are
broadcasted if their shapes are different and the `auto_broadcast` attribute is
not `none`. Broadcasting is performed according to the `auto_broadcast` value.
After broadcasting, SquaredDifference does the following with the input tensors:
\f[ dst_i = (src\_1_i - src\_2_i)^2 \f]
## Operation attributes
Attribute Name | Description | Value Type | Supported Values | Required or Optional
-- | -- | -- | -- | --
[auto_broadcast](@ref dnnl::graph::op::attr::auto_broadcast) | Specifies rules used for auto-broadcasting of input tensors. | string | `none`, `numpy`(default) | Optional
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `src_1` | Required
1 | `src_2` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `dst` | Required
## Supported data types
SquaredDifference operation supports the following data type combinations.
Src_1 | Src_2 | Dst
-- | -- | --
f32 | f32 | f32
bf16 | bf16 | bf16
f16 | f16 | f16

View File

@ -0,0 +1,53 @@
# StaticReshape {#dev_guide_op_staticreshape}
## General
StaticReshape operation changes the dimensions of the \src tensor according to
the specified shape. The volume of \src is equal to that of \dst, where volume
is the product of dimensions. \dst may have a different memory layout from
\src. StaticReshape operation is not guaranteed to return a view or a copy of
\src when \dst is in-placed with \src. StaticReshape can be used when the shape
is stored in a constant node or is available during the graph building stage;
then the shape can be passed via the `shape` attribute.
## Operation attributes
Attribute Name | Description | Value Type | Supported Values | Required or Optional
-- | -- | -- | -- | --
[shape](@ref dnnl::graph::op::attr::shape) | Specifies the output shape. | s64 | A s64 list describing the output shape | Required
[special_zero](@ref dnnl::graph::op::attr::special_zero) | Controls how zero values in shape are interpreted. | bool | `true`, `false` | Required
@note `shape`: dimension `-1` means that this dimension is calculated to keep
the overall element count the same as in the src tensor. More than one `-1` in
the shape is not supported.
@note `special_zero`: if false, `0` in the shape is interpreted as-is (for
example, a zero-dimension tensor); if true, all `0`s in the shape imply copying
the corresponding dimensions from src into dst.
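As a hedged illustration of the attribute semantics described in the notes
above, the following sketch builds a StaticReshape operation through the C++
API; the tensor IDs and shapes are assumptions made for this example.

```cpp
#include <cstdint>
#include <vector>
#include "oneapi/dnnl/dnnl_graph.hpp"

using namespace dnnl::graph;

void build_reshape(graph &g) {
    // src has shape [2, 3, 4]. With special_zero = true, the leading 0
    // copies the corresponding src dimension (2), and -1 is inferred so
    // that the element count is preserved: dst shape is [2, 12].
    logical_tensor src(0, logical_tensor::data_type::f32, {2, 3, 4},
            logical_tensor::layout_type::strided);
    logical_tensor dst(1, logical_tensor::data_type::f32, {2, 12},
            logical_tensor::layout_type::strided);

    op reshape(2, op::kind::StaticReshape, {src}, {dst}, "reshape");
    reshape.set_attr<std::vector<int64_t>>(op::attr::shape, {0, -1});
    reshape.set_attr<bool>(op::attr::special_zero, true);

    g.add_op(reshape);
}
```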
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
| Index | Argument Name | Required or Optional |
| ----- | ------------- | -------------------- |
| 0 | `src` | Required |
### Outputs
| Index | Argument Name | Required or Optional |
| ----- | ------------- | -------------------- |
| 0 | `dst` | Required |
## Supported data types
StaticReshape operation supports the following data type combinations.
| Src | Dst |
| ---- | ------- |
| f32 | f32 |
| bf16 | bf16 |
| f16 | f16 |

View File

@ -0,0 +1,46 @@
# StaticTranspose {#dev_guide_op_statictranspose}
## General
StaticTranspose operation rearranges the dimensions of \src. \dst may have a
different memory layout from \src. StaticTranspose operation is not guaranteed
to return a view or a copy of \src when \dst is in-placed with \src.
\f[
dst[i_{order[0]}, i_{order[1]}, \cdots, i_{order[N-1]}] = src[i_0, i_1, \cdots, i_{N-1}]
\f]
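For example, given a \src tensor of shape \f$[2, 3, 4]\f$ and
`order` \f$= [2, 0, 1]\f$, the resulting \dst shape is \f$[4, 2, 3]\f$.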
## Operation attributes
Attribute Name | Description | Value Type | Supported Values | Required or Optional
-- | -- | -- | -- | --
[order](@ref dnnl::graph::op::attr::order) | Specifies permutation to be applied on `src`. | s64 | A s64 list containing elements in the range of [-N, N-1] where N = rank(src); a negative value means counting from the last axis | Required
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
| Index | Argument Name | Required or Optional |
| ----- | ------------- | -------------------- |
| 0 | `src` | Required |
### Outputs
| Index | Argument Name | Required or Optional |
| ----- | ------------- | -------------------- |
| 0 | `dst` | Required |
## Supported data types
StaticTranspose operation supports the following data type combinations.
| Src | Dst |
| ---- | ------- |
| f32 | f32 |
| bf16 | bf16 |
| f16 | f16 |

View File

@ -0,0 +1,50 @@
# Subtract {#dev_guide_op_subtract}
## General
Subtract operation performs element-wise subtraction with two given tensors
applying multi-directional broadcast rules.
\f[
\dst(\overline{x}) =
\src_0(\overline{x}) - \src_1(\overline{x}),
\f]
## Operation attributes
Attribute Name | Description | Value Type | Supported Values | Required or Optional
-- | -- | -- | -- | --
[auto_broadcast](@ref dnnl::graph::op::attr::auto_broadcast) | Specifies rules used for auto-broadcasting of src tensors. | string | `none`, `numpy` (default) | Optional
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
| Index | Argument Name | Required or Optional |
| ----- | ------------- | -------------------- |
| 0 | `src_0` | Required |
| 1 | `src_1` | Required |
@note Both src shapes should match and no auto-broadcasting is allowed if the
`auto_broadcast` attribute is `none`. `src_0` and `src_1` shapes can be
different and auto-broadcasting is allowed if the `auto_broadcast` attribute is
`numpy`. Broadcasting is performed according to the `auto_broadcast` value.
### Outputs
| Index | Argument Name | Required or Optional |
| ----- | ------------- | -------------------- |
| 0 | `dst` | Required |
## Supported data types
Subtract operation supports the following data type combinations.
| Source0/1 | Destination |
| ---- | ------- |
| f32 | f32 |
| bf16 | bf16 |
| f16 | f16 |

View File

@ -0,0 +1,39 @@
# Tanh {#dev_guide_op_tanh}
## General
Tanh operation applies the following formula on every element of the \src tensor (the
variable names follow the standard @ref dev_guide_conventions):
\f[ dst = tanh(src) \f]
## Operation attributes
Tanh operation does not support any attribute.
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `src` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `dst` | Required
## Supported data types
Tanh operation supports the following data type combinations.
Src | Dst
-- | --
f32 | f32
bf16 | bf16
f16 | f16

View File

@ -0,0 +1,39 @@
# TanhBackward {#dev_guide_op_tanhbackward}
## General
TanhBackward operation computes the gradient for Tanh.
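Using \f$ \tanh'(x) = 1 - \tanh^2(x) \f$, the computed gradient can be
expressed as:

\f[ diff\_src = diff\_dst \cdot (1 - dst^2) \f]

where \f$ dst = \tanh(src) \f$ is the result of the forward Tanh.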
## Operation attributes
Attribute Name | Description | Value Type | Supported Values | Required or Optional
-- | -- | -- | -- | --
[use_dst](@ref dnnl::graph::op::attr::use_dst) | If true, use `dst` of Tanh operation to calculate the gradient. Otherwise, use `src`. | bool | `true` (default), `false` | Optional
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `src` / `dst` | Required
1 | `diff_dst` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `diff_src` | Required
## Supported data types
TanhBackward operation supports the following data type combinations.
Src | Diff_dst | Diff_src
-- | -- | --
f32 | f32 | f32
f16 | f16 | f16
bf16 | bf16 | bf16

View File

@ -0,0 +1,40 @@
# TypeCast {#dev_guide_op_typecast}
## General
TypeCast operation performs an element-wise cast from the input data type to
the data type given by the output tensor. It requires that \src and \dst have
the same shape and layout. Rounding to the nearest even is used during the cast.
## Operation attributes
TypeCast operation does not support any attribute.
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `src` | Required
### Outputs
Index | Argument Name | Required or Optional
-- | -- | --
0 | `dst` | Required
## Supported data types
TypeCast operation supports the following data type combinations.
Src | Dst
-- | --
bf16, f16 | f32
f32 | bf16, f16
@note This operation is to support
[mixed precision](@ref dev_guide_graph_mixed_precision_model) computation.
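A hedged sketch of a down conversion expressed with TypeCast through the C++
API; the destination data type is taken from the output logical tensor, and
the tensor IDs and shapes are assumptions made for this example.

```cpp
#include "oneapi/dnnl/dnnl_graph.hpp"

using namespace dnnl::graph;

void build_typecast(graph &g) {
    // The cast target data type is given by the output logical tensor:
    // here an f32 tensor is down-converted to bf16.
    logical_tensor src(0, logical_tensor::data_type::f32, {8, 16},
            logical_tensor::layout_type::strided);
    logical_tensor dst(1, logical_tensor::data_type::bf16, {8, 16},
            logical_tensor::layout_type::strided);

    op cast(2, op::kind::TypeCast, {src}, {dst}, "typecast");
    g.add_op(cast);
}
```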

View File

@ -0,0 +1,35 @@
# Wildcard {#dev_guide_op_wildcard}
## General
Wildcard operation is used to represent any compute logic and help construct a
graph. Typically, this operation is used to map framework ops which are not
supported by the library implementation. It is useful to make the graph
complete or connected.
## Operation attributes
Wildcard operation does not support any attribute.
## Execution arguments
The inputs and outputs must be provided according to below index order when
constructing an operation.
### Inputs
| Index | Argument Name | Required or Optional |
| ----- | ------------- | -------------------- |
| 0 | `src` | Optional |
### Outputs
| Index | Argument Name | Required or Optional |
| ----- | ------------- | -------------------- |
| 0 | `dst` | Optional |
@note Wildcard operation can accept an arbitrary number of inputs or outputs.
## Supported data types
Wildcard operation supports arbitrary data type combinations.

View File

@ -1,12 +1,12 @@
# oneDNN Graph API Concepts {#dev_guide_graph_basic_concepts}
# Basic Concepts {#dev_guide_graph_basic_concepts}
## Introduction
oneDNN Graph API programming model allows users to express their computational
graph and generate optimized sub-graphs which are called `partitions` in the
library. `Partition` is decided by oneDNN Graph API implementation. It is the
key concept to satisfy the different needs of AI hardware classes by using a
unified API. Users can compile `partitions`, bind `tensor` data, and execute
In the oneDNN Graph API programming model, a computation graph is passed to the
library, and optimized sub-graphs, which are called `partitions`, are returned
by the library. `Partition` is decided by the oneDNN Graph API implementation.
It is the key concept to satisfy the different needs of AI hardware classes by
using a unified API. An application can then compile `partitions`, bind `tensor` data, and execute
`compiled partitions`.
The key concepts in oneDNN Graph API include `logical tensor`, `op`, `graph`,
@ -14,15 +14,15 @@ The key concepts in oneDNN Graph API include `logical tensor`, `op`, `graph`,
between these entities. Besides, oneDNN Graph API shares the common `engine` and
`stream` concepts of oneDNN primitive API.
@img{img_graph_programming_model.jpg,Figure 1: Overview of Graph API programming model. Blue rectangles denote oneDNN objects\, and red lines denote dependencies between objects.,80%,}
@img{img_graph_programming_model.png,Figure 1: Overview of Graph API programming model. Blue rectangles denote oneDNN objects\, and red lines denote dependencies between objects.,80%,}
## Logical Tensor
`Logical tensor` (@ref dnnl::graph::logical_tensor) describes the metadata of
the input and output tensors, like data type, number of dimensions, size for
each dimension, tensor layout and property. Each logical tensor has a unique ID
which is immutable during the lifetime of a logical tensor. Users cannot modify
the metadata of a logical tensor without creating a new one.
which is immutable during the lifetime of a logical tensor. The metadata of a
logical tensor cannot be modified without creating a new one.
## Op
@ -36,13 +36,13 @@ tensor as the edge between them.
## Graph
`Graph` (@ref dnnl::graph::graph) contains a set of operations. A graph object
is associated to a specific engine kind (@ref dnnl::engine::kind). Users can add
multiple operations (@ref dnnl::graph::graph::add_op) and their input and output
logical tensors to a graph. After finishing adding operations, users need to
call a finalization API (@ref dnnl::graph::graph::finalize) to indicate that the
graph is ready for partitioning. By calling partitioning API (@ref
dnnl::graph::get_partitions), users will get a group of partitions from the
graph.
is associated to a specific engine kind (@ref dnnl::engine::kind). Multiple
operations can be added (@ref dnnl::graph::graph::add_op) along with input and
output logical tensors to a graph. After finishing adding operations,
finalization API (@ref dnnl::graph::graph::finalize) can be called to indicate
that the graph is ready for partitioning. By calling partitioning API (@ref
dnnl::graph::graph::get_partitions), a group of partitions from the graph will
be returned.
## Partition
@ -68,34 +68,35 @@ logical tensors.
A partition may contain many logical tensors, some of which are internal
intermediate results connecting two operations inside the partition. The
required inputs and outputs of a partition are also called `ports` of a
partition. Users can call API `get_input_ports` (@ref
partition. Two APIs `get_input_ports` (@ref
dnnl::graph::partition::get_input_ports) and `get_output_ports` (@ref
dnnl::graph::partition::get_output_ports) to query the ports and understand
which input logical tensors and output logical tensors are needed to compile a
partition. The input logical tensors and output logical tensors must match IDs
with ports. These in ports and out ports can also be used to track the producer
and consumer of a partitions through logical tensor IDs and for framework
integration, connect the partition back to the framework graph as a custom node.
dnnl::graph::partition::get_output_ports) are provided to query the ports and
help understand which input and output logical tensors are needed to compile a
partition. The input and output logical tensors must match the IDs of the
ports. These input and output ports can also be used to track the producer and
consumer of a partition through logical tensor IDs and, for framework
integration, to connect the partition back to the framework graph as a custom
node.
## Compiled Partition
`Compiled partition` (@ref dnnl::graph::compiled_partition) represents the
generated code specialized for a target hardware and tensor metadata passed
through compilation API. To execute a compiled partition (@ref
dnnl::graph::compiled_partition::execute), users need to pass input and output
tensors and a stream (@ref dnnl::stream). Input and output tensors must bind
data buffers to the input and output logical tensors respectively.
dnnl::graph::compiled_partition::execute), both input and output tensors, as
well as a stream (@ref dnnl::stream), must be passed. Input and output tensors
must bind data buffers to the input and output logical tensors respectively.
Users can query output logical tensors (@ref
dnnl::graph::compiled_partition::query_logical_tensor) from a compiled partition
to know the output layout and memory size (@ref
dnnl::graph::logical_tensor::get_size) when they specify output logical tensor
with `any` layout type during compilation.
An API (@ref dnnl::graph::compiled_partition::query_logical_tensor) is provided
to query output logical tensors from a compiled partition. It allows users to
know the output layout and memory size (@ref
dnnl::graph::logical_tensor::get_mem_size) when the output logical tensor is
specified with the `any` layout type during compilation.
## Tensor
`Tensor` (@ref dnnl::graph::tensor) is an abstraction for multi-dimensional
input and output data which is needed in the execution of a compiled partition.
A tensor contains a logical tensor, an engine (@ref dnnl::engine), and a data
handle. Users are responsible for managing the data handle's lifecycle, e.g.
free the memory resource when it is not used anymore.
handle. The application is responsible for managing the data handle's
lifecycle, for example, freeing the memory resource when it is no longer used.
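To make the flow above concrete, below is a minimal, hedged end-to-end sketch
of the programming model; the single-op graph, the tensor IDs, and the shapes
are assumptions made for illustration.

```cpp
#include <vector>
#include "oneapi/dnnl/dnnl_graph.hpp"

using namespace dnnl::graph;

int main() {
    // 1. Describe tensor metadata with logical tensors.
    logical_tensor src(0, logical_tensor::data_type::f32, {8, 16},
            logical_tensor::layout_type::strided);
    logical_tensor dst(1, logical_tensor::data_type::f32, {8, 16},
            logical_tensor::layout_type::strided);

    // 2. Create an op, add it to a graph, then finalize and partition.
    op relu(2, op::kind::ReLU, {src}, {dst}, "relu");
    graph g(dnnl::engine::kind::cpu);
    g.add_op(relu);
    g.finalize();
    auto partitions = g.get_partitions();

    // 3. Compile the partition for a specific engine.
    dnnl::engine eng(dnnl::engine::kind::cpu, 0);
    auto cp = partitions[0].compile({src}, {dst}, eng);

    // 4. Bind data buffers via tensors and execute on a stream.
    std::vector<float> src_buf(8 * 16, 1.f), dst_buf(8 * 16);
    tensor src_t(src, eng, src_buf.data());
    tensor dst_t(dst, eng, dst_buf.data());
    dnnl::stream strm(eng);
    cp.execute(strm, {src_t}, {dst_t});
    strm.wait();
    return 0;
}
```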

Binary file not shown.

After

Width:  |  Height:  |  Size: 5.6 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 32 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 12 KiB

View File

@ -0,0 +1,57 @@
# Low Precision {#dev_guide_graph_low_precision}
oneDNN Graph provides low precision support with int8 (signed/unsigned 8-bit
integer), bf16, and f16 data types. oneDNN Graph API expects the computation
graph to be converted to a low precision representation in which the data's
precision and quantization parameters are specified explicitly. The oneDNN
Graph API implementation will strictly respect the numeric precision of the
computation.
@anchor dev_guide_graph_int8_quantization_model
## INT8
oneDNN Graph API provides below two operations to support quantized model with
static quantization:
- [Dequantize](@ref dev_guide_op_dequantize)
- [Quantize](@ref dev_guide_op_quantize)
Dequantize operation takes an integer tensor with its associated scale and zero
point and returns an f32 tensor. Quantize operation takes an f32 tensor, scale,
and zero point, and returns an integer tensor. The scale and zero point are
single-dimension tensors, which may contain one value for the per-tensor
quantization case or multiple values for the per-channel quantization case. The
integer tensor can be represented in the unsigned int8 or signed int8 data
type. The zero point can be zero for the symmetric quantization scheme and a
non-zero value for the asymmetric quantization scheme.
Dequantize and Quantize operations should be inserted manually into the graph
as part of the quantization process before passing it to oneDNN Graph. oneDNN
Graph honors the data type passed via logical tensors and faithfully follows
the numeric semantics. For example, if the graph has a Quantize operation
followed by a Dequantize operation with the exact same scale and zero point,
the oneDNN Graph implementation should not eliminate them since that would
implicitly change the numeric precision.
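As a hedged illustration, the sketch below builds a per-tensor Quantize
operation through the C++ API, where the scales and zero points are provided
as operation attributes (see [Quantize](@ref dev_guide_op_quantize)); the
tensor IDs, the scale, and the zero point are example values.

```cpp
#include <cstdint>
#include <string>
#include <vector>
#include "oneapi/dnnl/dnnl_graph.hpp"

using namespace dnnl::graph;

void build_quantize(graph &g) {
    logical_tensor src(0, logical_tensor::data_type::f32, {8, 16},
            logical_tensor::layout_type::strided);
    logical_tensor dst(1, logical_tensor::data_type::u8, {8, 16},
            logical_tensor::layout_type::strided);

    op quant(2, op::kind::Quantize, {src}, {dst}, "quantize");
    // Per-tensor, asymmetric: a single scale and a non-zero zero point.
    quant.set_attr<std::string>(op::attr::qtype, "per_tensor");
    quant.set_attr<std::vector<float>>(op::attr::scales, {0.05f});
    quant.set_attr<std::vector<int64_t>>(op::attr::zps, {128});
    g.add_op(quant);
}
```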
The oneDNN Graph partitioning API may return a partition containing Dequantize,
Quantize, and Convolution operations in-between. It is not necessary to
recognize the subgraph pattern explicitly and convert it to a fused operation.
Depending on the oneDNN Graph implementation capability, the partition may
include more or fewer operations.
@img{int8_programming.jpg,Figure 1: Overview of int8 programming model.,80%,}
@anchor dev_guide_graph_mixed_precision_model
## BF16/F16
oneDNN Graph provides the [TypeCast](@ref dev_guide_op_typecast) operation,
which can convert an f32 tensor to bf16 or f16, and vice versa. It is used to
support the auto mixed precision mechanism in popular deep learning frameworks. All oneDNN
Graph operations support bf16 and f16 data types.
A TypeCast operation performing down conversion should be inserted explicitly
to indicate the use of low numeric precision. The oneDNN Graph implementation
fully honors the API-specified numeric precision and only performs the
computation using the API-specified or higher numeric precision.
@img{bf16_programming.jpg,Figure 2: Overview of bf16 programming model.,80%,}

View File

@ -0,0 +1,2 @@
Programming Model
#################

View File

@ -0,0 +1,6 @@
Supported Operations
####################
The complete operation set is defined at
`specification <https://spec.oneapi.io/onednn-graph/latest/ops/index.html>`__.
Only a subset of the operation set is implemented here.

View File

@ -0,0 +1,104 @@
# Supported Fusion Patterns {#dev_guide_graph_fusion_patterns}
## Fusion Patterns
The following fusion patterns are subgraphs that the oneDNN Graph API recognizes
as candidates for fusion. The patterns are described using oneDNN Graph
operation (op) names with the following convention.
@note oneDNN Graph performs limited input validation to minimize the
performance overheads. The application is responsible for sanitizing inputs
passed to the library. Since large u8 or s8 inputs may lead to accumulator
overflow, you can use floating point patterns instead of quantized patterns.
`"+"` describes a chain of two ops. The preceding op produces an output tensor,
which is consumed by the following op as its first operand.
`"[]"` describes a component of the overall pattern description. For example,
it could include a subgraph or all the op choices within the bracket.
`"|"` describes choices of multiple operations, say A+[B|C] means the graph
partition contains A followed by B or C.
`","` describes a graph composed of multiple subgraphs, each subgraph marks its
output tensor explicitly, which is consumed by other subgraphs.
`Superscript` denotes the number of repetitions of a pattern. For example,
A+[B|C]\f$^{3}\f$ means the graph partition contains A followed by three ops,
each of them either B or C. The superscript can be a range of numbers, allowing
a range of repetitions. If the range is between 0 and 1, we use the superscript
`"?"`.
`Subscript` denotes the input and output tensors which need to explicitly mark
the producer and consumer relation within one graph partition. For example,
A\f$_{>t1}\f$+B+C\f$_{<t1}\f$ refers
to the pattern starting with A followed by B and C, where C takes an implicit input
tensor from B and an extra tensor t1 output from A. `">"` refers to the output
tensor, and `"<"` for input tensor. Input and output tensor between neighbor
ops are not explicitly marked, for example, B consumes t1 implicitly in the
example above.
Subscript `"out"` marks the output tensor of a certain op to be the output of
a graph partition. For example, in
A\f$_{>t1}\f$+B\f$_{>out}\f$+C\f$_{<t1,>out}\f$, B's output and C's output
are marked as output tensors.
Subscript `"in"` marks the input tensor of a certain op to be the input of a
graph partition. For example, in A\f$_{<in1}\f$+B\f$_{<in1}\f$ A's input and
B's second input are graph partition inputs, and they share the same input tensor
in1. Most input tensors of a graph partition are not explicitly marked.
For example, the input tensors of the first op are implicitly regarded as graph
partition inputs. Besides, for input tensors of other ops, if they are not
produced by any preceding ops, they are regarded as implicit graph partition
inputs. In the example A\f$_{>t1}\f$+B+C\f$_{<t1}\f$, A's inputs are
regarded as implicit graph partition inputs, and if B is a binary operation, the
second input tensor is an implicit graph partition input.
The following categories will be used in describing fusion patterns.
Unary = [Abs | Clamp | Elu | Exp | GELU | HardSwish | LeakyReLU |
Log | Sigmoid | SoftPlus | Pow | ReLU | Round | Sqrt | Square | Tanh]
Binary = [Add | Divide | Maximum | Minimum | Multiply | Subtract]
Reduction = [ReduceL1 | ReduceL2 | ReduceMax | ReduceMean | ReduceMin |
ReduceProd | ReduceSum]
### Inference
#### Floating Point Patterns
Pattern | Description
:-- | :--:
Convolution + BiasAdd\f$^?\f$ + BatchNormInference\f$^?\f$ + [Unary \| Binary]\f$^{0-3}\f$\f$_{>out}\f$ | This pattern is widely used in Convolutional Neural Networks, for example ResNet, ResNext, SSD, etc.
ConvTranspose + BiasAdd\f$^?\f$ + [Unary \| Binary]\f$^{0-3}\f$\f$_{>out}\f$ | This pattern is widely used in Generative Adversarial Networks.
Interpolate + [Unary \| Binary]\f$^{0-3}\f$\f$_{>out}\f$ | This pattern is widely used for image processing.
MatMul + BiasAdd\f$^?\f$ + [Unary \| Binary]\f$^{0-3}\f$\f$_{>out}\f$ | This pattern is widely used in language models and recommendation models, for example BERT, DLRM, etc.
Reduction + [Unary \| Binary]\f$^{0-3}\f$\f$_{>out}\f$ | This pattern is widely used for data processing, for example loss reduction.
Unary + Binary\f$^{0-3}\f$\f$_{>out}\f$ | This pattern is widely used in Convolutional Neural Networks.
Binary + [Unary \| Binary]\f$^{0-3}\f$\f$_{>out}\f$ | This pattern is widely used in Generative Adversarial Networks, for example ParallelWaveGAN.
[AvgPool \| MaxPool] + Binary\f$^{0-3}\f$\f$_{>out}\f$ | This pattern is widely used in Convolutional Neural Networks.
BatchNormInference + ReLU\f$_{>out}\f$ | This pattern is widely used in Convolutional Neural Networks, for example DenseNet.
Reciprocal + Multiply\f$_{>out}\f$ | N/A
Reorder + Add\f$_{>out}\f$ | N/A
#### Quantized Patterns
Pattern | Description
:-- | :--:
Quantize\f$^?\f$ + Dequantize\f$_{>t1}\f$, Dequantize\f$_{>t2}\f$\f$^{0-3}\f$, Dequantize + Convolution\f$_{<t1}\f$ + BiasAdd\f$^?\f$ + [Unary \| Binary\f$_{<t2}\f$]\f$^{0-3}\f$ + Quantize\f$^?\f$\f$_{>out}\f$ | N/A
Quantize\f$^?\f$ + Dequantize\f$_{>t1}\f$, Dequantize\f$_{>t2}\f$\f$^{0-3}\f$, Dequantize + ConvTranspose\f$_{<t1}\f$ + BiasAdd\f$^?\f$ + [Unary \| Binary\f$_{<t2}\f$]\f$^{0-3}\f$ + Quantize\f$^?\f$\f$_{>out}\f$ |N/A
Quantize\f$^?\f$ + Dequantize\f$_{>t1}\f$, Dequantize\f$_{>t2}\f$\f$^{0-3}\f$, Dequantize + MatMul\f$_{<t1}\f$ + BiasAdd\f$^?\f$ + [Unary \| Binary\f$_{<t2}\f$]\f$^{0-3}\f$ + Quantize\f$^?\f$\f$_{>out}\f$ |N/A
Dequantize + [AvgPool \| MaxPool] + Quantize\f$_{>out}\f$ |N/A
Dequantize\f$_{>t1}\f$, Dequantize + [AvgPool \| MaxPool] + Add\f$_{<t1}\f$ + Quantize\f$_{>out}\f$ |N/A
Dequantize + Reorder + Quantize\f$_{>out}\f$ |N/A
Dequantize\f$_{>t1}\f$, Dequantize + Reorder + Add\f$_{<t1}\f$ + Quantize\f$_{>out}\f$ |N/A
### Training
Pattern | Description
:-- | :--:
ConvolutionBackwardWeights + BiasAddBackward\f$_{>out}\f$ | N/A
ReLUBackward + BatchNormTrainingBackward\f$_{>out}\f$ |N/A
All the above fusion patterns are supported by default.

View File

@ -120,3 +120,18 @@ The sequence of actions to create a primitive is:
memory formats if the primitive supports it.
2. Create a primitive based on the primitive descriptor obtained in step 1.
## Graph Extension
Graph extension is a high level abstraction in oneDNN that allows you to work
with a computation graph instead of individual primitives. This approach makes
operation fusion:
* Transparent: the integration efforts are reduced by abstracting backend-aware
fusion logic.
* Scalable: no integration code change is necessary to benefit from new fusion
patterns enabled in oneDNN.
The programming model for the graph extension is detailed in the
[graph basic concepts section](@ref dev_guide_graph_basic_concepts).

Binary file not shown.

Before

Width:  |  Height:  |  Size: 126 KiB

View File

@ -0,0 +1,10 @@
Graph Extension
###############
.. toctree::
:maxdepth: 1
graph_programming_model
graph_supported_operations
dev_guide_graph_fusion_patterns
dev_guide_graph_dump

View File

@ -1,5 +1,5 @@
oneDNN Documentation
========================
Intel® oneAPI Deep Neural Network Library Developer Guide and Reference
=======================================================================
.. toctree::
:maxdepth: 1
@ -7,6 +7,7 @@ oneDNN Documentation
build_and_link
programming_model
supported_primitives
graph_extension
dev_guide_examples
performance_profiling_and_inspection
advanced_topics

View File

@ -164,7 +164,9 @@ mathjax3_config = {
'diffdstiterc': '\\operatorname{diff\\_dst\\_iter\\_c}',
'diffgamma': '\\operatorname{diff\\_\\gamma}',
'diffbeta': '\\operatorname{diff\\_\\beta}',
'workspace': '\\operatorname{workspace}'
'workspace': '\\operatorname{workspace}',
'srcshape': '\\operatorname{src\\_shape}',
'dstshape': '\\operatorname{dst\\_shape}'
}
}
}
@ -200,7 +202,88 @@ def addTocTrees(app, env, docnames):
trees2Add = {'rst/dev_guide_inference_and_training_aspects.rst':['dev_guide_inference.rst','dev_guide_inference_int8.rst','dev_guide_training_bf16.rst'],
'rst/dev_guide_attributes.rst':['dev_guide_attributes_fpmath_mode.rst','dev_guide_attributes_quantization.rst','dev_guide_attributes_post_ops.rst','dev_guide_attributes_scratchpad.rst'],
'rst/dev_guide_basic_concepts.rst':['dev_guide_graph_basic_concepts.rst']}
'rst/graph_supported_operations.rst':[
'dev_guide_op_abs.rst',
'dev_guide_op_absbackward.rst',
'dev_guide_op_add.rst',
'dev_guide_op_avgpool.rst',
'dev_guide_op_avgpoolbackward.rst',
'dev_guide_op_batchnormforwardtraining.rst',
'dev_guide_op_batchnorminference.rst',
'dev_guide_op_batchnormtrainingbackward.rst',
'dev_guide_op_biasadd.rst',
'dev_guide_op_biasaddbackward.rst',
'dev_guide_op_clamp.rst',
'dev_guide_op_clampbackward.rst',
'dev_guide_op_concat.rst',
'dev_guide_op_convolution.rst',
'dev_guide_op_convolutionbackwarddata.rst',
'dev_guide_op_convolutionbackwardweights.rst',
'dev_guide_op_convtranspose.rst',
'dev_guide_op_convtransposebackwarddata.rst',
'dev_guide_op_convtransposebackwardweights.rst',
'dev_guide_op_dequantize.rst',
'dev_guide_op_divide.rst',
'dev_guide_op_dynamicdequantize.rst',
'dev_guide_op_dynamicquantize.rst',
'dev_guide_op_elu.rst',
'dev_guide_op_elubackward.rst',
'dev_guide_op_end.rst',
'dev_guide_op_exp.rst',
'dev_guide_op_gelu.rst',
'dev_guide_op_gelubackward.rst',
'dev_guide_op_hardswish.rst',
'dev_guide_op_hardswishbackward.rst',
'dev_guide_op_interpolate.rst',
'dev_guide_op_interpolatebackward.rst',
'dev_guide_op_layernorm.rst',
'dev_guide_op_layernormbackward.rst',
'dev_guide_op_leakyrelu.rst',
'dev_guide_op_log.rst',
'dev_guide_op_logsoftmax.rst',
'dev_guide_op_logsoftmaxbackward.rst',
'dev_guide_op_matmul.rst',
'dev_guide_op_maximum.rst',
'dev_guide_op_maxpool.rst',
'dev_guide_op_maxpoolbackward.rst',
'dev_guide_op_minimum.rst',
'dev_guide_op_mish.rst',
'dev_guide_op_mishbackward.rst',
'dev_guide_op_multiply.rst',
'dev_guide_op_prelu.rst',
'dev_guide_op_prelubackward.rst',
'dev_guide_op_quantize.rst',
'dev_guide_op_reciprocal.rst',
'dev_guide_op_reducel1.rst',
'dev_guide_op_reducel2.rst',
'dev_guide_op_reducemax.rst',
'dev_guide_op_reducemean.rst',
'dev_guide_op_reducemin.rst',
'dev_guide_op_reduceprod.rst',
'dev_guide_op_reducesum.rst',
'dev_guide_op_relu.rst',
'dev_guide_op_relubackward.rst',
'dev_guide_op_reorder.rst',
'dev_guide_op_round.rst',
'dev_guide_op_sigmoid.rst',
'dev_guide_op_sigmoidbackward.rst',
'dev_guide_op_softmax.rst',
'dev_guide_op_softmaxbackward.rst',
'dev_guide_op_softplus.rst',
'dev_guide_op_softplusbackward.rst',
'dev_guide_op_sqrt.rst',
'dev_guide_op_sqrtbackward.rst',
'dev_guide_op_square.rst',
'dev_guide_op_squareddifference.rst',
'dev_guide_op_staticreshape.rst',
'dev_guide_op_statictranspose.rst',
'dev_guide_op_subtract.rst',
'dev_guide_op_tanh.rst',
'dev_guide_op_tanhbackward.rst',
'dev_guide_op_typecast.rst',
'dev_guide_op_wildcard.rst'
],
'rst/graph_programming_model.rst':['dev_guide_graph_basic_concepts.rst', 'dev_guide_graph_low_precision.rst']}
for rstFile in trees2Add: