# Build Options
oneDNN supports the following build-time options.
CMake Option | Supported values (defaults in bold) | Description |
---|---|---|
ONEDNN_LIBRARY_TYPE | **SHARED**, STATIC | Defines the resulting library type |
ONEDNN_CPU_RUNTIME | NONE, **OMP**, TBB, SEQ, THREADPOOL, SYCL | Defines the threading runtime for CPU engines |
ONEDNN_GPU_RUNTIME | **NONE**, OCL, SYCL | Defines the offload runtime for GPU engines |
ONEDNN_BUILD_DOC | **ON**, OFF | Controls building the documentation |
ONEDNN_BUILD_EXAMPLES | **ON**, OFF | Controls building the examples |
ONEDNN_BUILD_TESTS | **ON**, OFF | Controls building the tests |
ONEDNN_BUILD_GRAPH | **ON**, OFF | Controls building the graph component |
ONEDNN_ENABLE_GRAPH_DUMP | ON, **OFF** | Controls dumping graph artifacts |
ONEDNN_EXPERIMENTAL_GRAPH_COMPILER_BACKEND | ON, **OFF** | Enables the [graph compiler backend](@ref dev_guide_graph_compiler) of the graph component (experimental) |
ONEDNN_ARCH_OPT_FLAGS | compiler flags | Specifies compiler optimization flags (see warning note below) |
ONEDNN_ENABLE_CONCURRENT_EXEC | ON, **OFF** | Disables sharing a common scratchpad between primitives in #dnnl::scratchpad_mode::library mode |
ONEDNN_ENABLE_JIT_PROFILING | **ON**, OFF | Enables [integration with performance profilers](@ref dev_guide_profilers) |
ONEDNN_ENABLE_ITT_TASKS | **ON**, OFF | Enables [integration with performance profilers](@ref dev_guide_profilers) |
ONEDNN_ENABLE_PRIMITIVE_CACHE | **ON**, OFF | Enables [primitive cache](@ref dev_guide_primitive_cache) |
ONEDNN_ENABLE_MAX_CPU_ISA | **ON**, OFF | Enables [CPU dispatcher controls](@ref dev_guide_cpu_dispatcher_control) |
ONEDNN_ENABLE_CPU_ISA_HINTS | **ON**, OFF | Enables [CPU ISA hints](@ref dev_guide_cpu_isa_hints) |
ONEDNN_ENABLE_WORKLOAD | **TRAINING**, INFERENCE | Specifies a set of functionality to be available based on workload |
ONEDNN_ENABLE_PRIMITIVE | **ALL**, PRIMITIVE_NAME | Specifies a set of functionality to be available based on primitives |
ONEDNN_ENABLE_PRIMITIVE_CPU_ISA | **ALL**, CPU_ISA_NAME | Specifies a set of functionality to be available for the CPU backend based on CPU ISA |
ONEDNN_ENABLE_PRIMITIVE_GPU_ISA | **ALL**, GPU_ISA_NAME | Specifies a set of functionality to be available for the GPU backend based on GPU ISA |
ONEDNN_ENABLE_GEMM_KERNELS_ISA | **ALL**, NONE, ISA_NAME | Specifies a set of functionality to be available for GeMM kernels for the CPU backend based on ISA |
ONEDNN_EXPERIMENTAL | ON, **OFF** | Enables [experimental features](@ref dev_guide_experimental) |
ONEDNN_VERBOSE | **ON**, OFF | Enables [verbose mode](@ref dev_guide_verbose) |
ONEDNN_DEV_MODE | ON, **OFF** | Enables internal tracing and debug info logging in verbose output (for oneDNN developers) |
ONEDNN_AARCH64_USE_ACL | ON, **OFF** | Enables integration with Arm Compute Library for AArch64 builds |
ONEDNN_BLAS_VENDOR | **NONE**, ARMPL, ACCELERATE | Defines an external BLAS library to link to for GEMM-like operations |
ONEDNN_GPU_VENDOR | NONE, **INTEL**, NVIDIA, AMD | Defines the GPU vendor for GPU engines when ONEDNN_GPU_RUNTIME is not NONE; otherwise the value is NONE |
ONEDNN_DPCPP_HOST_COMPILER | **DEFAULT**, GNU or Clang C++ compiler executable | Specifies the host compiler executable for the SYCL runtime |
ONEDNN_LIBRARY_NAME | **dnnl**, library name | Specifies the name of the library |
ONEDNN_TEST_SET | SMOKE, **CI**, NIGHTLY, MODIFIER_NAME | Specifies the testing coverage enabled through the generated testing targets |
All build options listed above also support counterparts with the `DNNL` prefix instead of `ONEDNN`. If both versions are specified, the `DNNL` options take precedence over the `ONEDNN` versions.
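For example, the following two invocations are equivalent ways to request a static library build (shown here against a build directory one level below the source tree):

$ cmake -DONEDNN_LIBRARY_TYPE=STATIC ..
$ cmake -DDNNL_LIBRARY_TYPE=STATIC ..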
`ONEDNN_BUILD_DOC`, `ONEDNN_BUILD_EXAMPLES`, and `ONEDNN_BUILD_TESTS` are disabled by default when oneDNN is built as a sub-project.

All other build options or values that can be found in CMake files are intended for development and debug purposes and are subject to change without notice. Please avoid using them.
## Common options

### Host compiler
When building oneDNN with the oneAPI DPC++/C++ Compiler, the user can specify a custom host compiler. The host compiler is the compiler that the main compiler driver uses to perform the host compilation step.

The host compiler is specified with the `ONEDNN_DPCPP_HOST_COMPILER` CMake option, either by name (in which case the standard system environment variables are used to discover it) or by an absolute path to the compiler executable.

The default value of `ONEDNN_DPCPP_HOST_COMPILER` is `DEFAULT`, which is the default host compiler used by the compiler specified with `CMAKE_CXX_COMPILER`.

The `DEFAULT` host compiler is the only supported option on Windows. On Linux, the user can specify a GNU C++ compiler as the host compiler.

@warning The oneAPI DPC++/C++ Compiler requires the host compiler to be compatible. The minimum allowed GNU C++ compiler version is 7.4.0. See the GCC* Compatibility and Interoperability section in the oneAPI DPC++/C++ Compiler Developer Guide.

@warning The minimum allowed Clang C++ compiler version is 8.0.0.
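As a sketch, assuming `icpx` is the oneAPI DPC++/C++ compiler driver and a suitable `g++` is on the PATH (an absolute path such as /usr/bin/g++ works the same way), a custom host compiler could be requested like this:

$ cmake -DCMAKE_CXX_COMPILER=icpx -DONEDNN_GPU_RUNTIME=SYCL -DONEDNN_DPCPP_HOST_COMPILER=g++ ..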
## Configuring functionality

Using `ONEDNN_ENABLE_WORKLOAD` and `ONEDNN_ENABLE_PRIMITIVE` it is possible to limit the functionality available in the final shared object or statically linked application. This helps to reduce the amount of disk space occupied by an app.
### ONEDNN_ENABLE_WORKLOAD

This option supports only two values: `TRAINING` (the default) and `INFERENCE`. `INFERENCE` enables only the forward propagation part of the functionality, removing all backward-related functionality except the parts that forward propagation depends on.
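For instance, an inference-only build could be configured as follows (a minimal sketch; combine with other options as needed):

$ cmake -DONEDNN_ENABLE_WORKLOAD=INFERENCE ..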
### ONEDNN_ENABLE_PRIMITIVE

This option supports several values: `ALL` (the default), which enables all primitive implementations, or a set drawn from `BATCH_NORMALIZATION`, `BINARY`, `CONCAT`, `CONVOLUTION`, `DECONVOLUTION`, `ELTWISE`, `INNER_PRODUCT`, `LAYER_NORMALIZATION`, `LRN`, `MATMUL`, `POOLING`, `PRELU`, `REDUCTION`, `REORDER`, `RESAMPLING`, `RNN`, `SDPA`, `SHUFFLE`, `SOFTMAX`, and `SUM`. When a set is used, only the selected primitive implementations will be available. Attempting to use other primitive implementations will return an unimplemented status when creating the primitive descriptor. To specify a set, use a CMake-style semicolon-separated string, as in this example:
-DONEDNN_ENABLE_PRIMITIVE=CONVOLUTION;MATMUL;REORDER
### ONEDNN_ENABLE_PRIMITIVE_CPU_ISA

This option supports several values: `ALL` (the default), which enables all ISA implementations, or one of `SSE41`, `AVX2`, `AVX512`, and `AMX`. Values are linearly ordered as `SSE41` < `AVX2` < `AVX512` < `AMX`. When a value is specified, the selected ISA and all "smaller" ISAs will be available. The [CPU dispatcher controls](@ref dev_guide_cpu_dispatcher_control) are also affected in compliance with this option.

Note that `AVX2` denotes the whole AVX2-based ISA family, `AVX512` denotes the whole AVX512-based ISA family, and `AMX` denotes any ISA containing an AMX unit.

Example that enables the SSE41 and AVX2 sets:
-DONEDNN_ENABLE_PRIMITIVE_CPU_ISA=AVX2
### ONEDNN_ENABLE_PRIMITIVE_GPU_ISA

This option supports several values: `ALL` (the default), which enables all ISA implementations, or any set of `GEN9`, `GEN11`, `XELP`, `XEHP`, `XEHPG`, `XEHPC`, `XE2`, and `XE3`. The selected ISAs enable the corresponding parts of the just-in-time kernel generation based implementations. OpenCL-based kernels and implementations are always available. Example that enables the XeLP and XeHP set:
-DONEDNN_ENABLE_PRIMITIVE_GPU_ISA=XELP;XEHP
### ONEDNN_ENABLE_GEMM_KERNELS_ISA

This option supports several values: `ALL` (the default), which enables all ISA kernels from the x64/gemm folder; `NONE`, which disables all kernels and removes the corresponding interfaces; or one of `SSE41`, `AVX2`, and `AVX512`. Values are linearly ordered as `SSE41` < `AVX2` < `AVX512`. When a value is specified, the selected ISA and all "smaller" ISAs will be available. Example that keeps the SSE41 and AVX2 kernels but removes the AVX512 and AMX kernels:
-DONEDNN_ENABLE_GEMM_KERNELS_ISA=AVX2
## Configuring testing

### ONEDNN_TEST_SET

This option specifies the testing coverage enabled through the testing targets generated by the build system. The value consists of two parts: the set value, which defines the number of test cases, and modifiers for the testing commands. The final string must contain a single set value and any number of compatible modifier values.

The set value is one of `SMOKE`, `CI`, or `NIGHTLY`. The modifier values (referred to as `MODIFIER_NAME`) are `NO_CORR` and `ADD_BITWISE`.

The input is expected in CMake list style, i.e., a semicolon-separated string, e.g., `ONEDNN_TEST_SET=CI;NO_CORR`.

When the `SMOKE` value is specified, it enables a short set of test cases which verifies that basic library functionality works as expected.

When the `CI` value is specified, it enables a regular set of test cases which verifies that all supported library functionality works as expected.

When the `NIGHTLY` value is specified, it enables the largest set of test cases which verifies that all supported library functionality and all kernel optimizations work as expected.

When the `NO_CORR` modifier is specified, it removes correctness validation, which is enabled by default, from the benchdnn testing targets. This helps to save time when correctness validation is not necessary.

When the `ADD_BITWISE` modifier is specified, the build system adds an additional set of benchdnn tests with bitwise validation mode. The correctness set remains unmodified.
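As an illustration of passing a set together with a modifier on the command line (the quotes keep the shell from treating the semicolon as a command separator):

$ cmake -DONEDNN_TEST_SET="NIGHTLY;ADD_BITWISE" ..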
## CPU Options

Intel Architecture Processors and compatible devices are supported by the oneDNN CPU engine. The CPU engine is built by default but can be disabled at build time by setting `ONEDNN_CPU_RUNTIME` to `NONE`. In this case, the GPU engine must be enabled.
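A GPU-only configuration might look like the following sketch (assuming a SYCL-capable compiler such as `icpx`; the appropriate GPU runtime depends on your setup):

$ cmake -DCMAKE_CXX_COMPILER=icpx -DONEDNN_CPU_RUNTIME=NONE -DONEDNN_GPU_RUNTIME=SYCL ..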
### Targeting Specific Architecture

oneDNN uses JIT code generation to implement most of its functionality and will choose the best code based on detected processor features. However, some oneDNN functionality will still benefit from targeting a specific processor architecture at build time. You can use the `ONEDNN_ARCH_OPT_FLAGS` CMake option for this.

For Intel(R) C++ Compilers, the default option is `-xSSE4.1`, which instructs the compiler to generate code for processors that support SSE4.1 instructions. This option does not allow you to run the library on older processor architectures.

For GNU* Compilers and Clang, the default option is `-msse4.1`.
@warning
While use of the `ONEDNN_ARCH_OPT_FLAGS` option gives better performance, the resulting library can only run on systems whose instruction set is compatible with the target instruction set. Therefore, `ARCH_OPT_FLAGS` should be set to an empty string (`""`) if the resulting library needs to be portable.
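For example, the first invocation below produces a portable library, while the second tunes the code to the build machine (`-march=native` is a common GCC/Clang flag, shown here only as an illustration):

$ cmake -DONEDNN_ARCH_OPT_FLAGS="" ..
$ cmake -DONEDNN_ARCH_OPT_FLAGS=-march=native ..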
### Runtimes

The CPU engine can use OpenMP, Threading Building Blocks (TBB), or sequential threading runtimes. OpenMP threading is the default build mode. This behavior is controlled by the `ONEDNN_CPU_RUNTIME` CMake option.

#### OpenMP

oneDNN uses the OpenMP runtime library provided by the compiler. When building oneDNN with the oneAPI DPC++/C++ Compiler, the library will link to the Intel OpenMP runtime. This behavior can be changed by changing the host compiler with the `ONEDNN_DPCPP_HOST_COMPILER` option.
@warning Because different OpenMP runtimes may not be binary-compatible, it is important to ensure that only one OpenMP runtime is used throughout the application. Having more than one OpenMP runtime linked to an executable may lead to undefined behavior, including incorrect results or crashes. However, as long as both the library and the application use the same or compatible compilers, there will be no conflicts.
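As a sketch, assuming GCC is the desired toolchain and is on the PATH, building against the GNU OpenMP runtime instead of the Intel one can be done by selecting GCC as the compiler in the usual CMake way:

$ CC=gcc CXX=g++ cmake -DONEDNN_CPU_RUNTIME=OMP ..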
#### Threading Building Blocks (TBB)

To build oneDNN with TBB support, set `ONEDNN_CPU_RUNTIME` to `TBB`:
$ cmake -DONEDNN_CPU_RUNTIME=TBB ..
Optionally, set the `TBBROOT` environment variable to point to the TBB installation path or pass the path directly to CMake:
$ cmake -DONEDNN_CPU_RUNTIME=TBB -DTBBROOT=/opt/intel/path/tbb ..
oneDNN has functional limitations if built with TBB:
- Winograd convolution algorithm is not supported for fp32 backward by data and backward by weights propagation.
#### Threadpool

To build oneDNN with support for threadpool threading, set `ONEDNN_CPU_RUNTIME` to `THREADPOOL`:
$ cmake -DONEDNN_CPU_RUNTIME=THREADPOOL ..
The `_ONEDNN_TEST_THREADPOOL_IMPL` CMake variable controls which of the three threadpool implementations is used for testing: `STANDALONE`, `TBB`, or `EIGEN`. The latter two also require passing the `TBBROOT` or `Eigen3_DIR` path to CMake. For example:
$ cmake -DONEDNN_CPU_RUNTIME=THREADPOOL -D_ONEDNN_TEST_THREADPOOL_IMPL=EIGEN -DEigen3_DIR=/path/to/eigen/share/eigen3/cmake ..
Threadpool threading support is experimental and has the same limitations as TBB, plus the following:
- Because threadpools are attached to streams, which are only passed at primitive execution time, work decomposition is performed statically at primitive creation time. At execution time, the threadpool is responsible for balancing the static decomposition across the available worker threads.
## AArch64 Options

oneDNN includes experimental support for the Arm 64-bit Architecture (AArch64). By default, AArch64 builds use the reference implementations throughout. The following options enable AArch64-optimised implementations, provided by AArch64 libraries, for a limited number of operations.
AArch64 build configuration | CMake Option | Environment variables | Dependencies |
---|---|---|---|
Arm Compute Library based primitives | ONEDNN_AARCH64_USE_ACL=ON | ACL_ROOT_DIR=</path/to/ComputeLibrary> | Arm Compute Library |
Vendor BLAS library support | ONEDNN_BLAS_VENDOR=ARMPL | None | Arm Performance Libraries |
### Arm Compute Library

Arm Compute Library is an open-source library for machine learning applications. The development repository is available from mlplatform.org, and releases are also available on GitHub. The `ONEDNN_AARCH64_USE_ACL` CMake option is used to enable Compute Library integration:
$ cmake -DONEDNN_AARCH64_USE_ACL=ON ..
This assumes that the environment variable `ACL_ROOT_DIR` is set to the location of Arm Compute Library, which must be downloaded and built independently of oneDNN.
@warning For a debug build of oneDNN it is advisable to specify a Compute Library build which has also been built with debug enabled.
@warning oneDNN only supports builds with Compute Library v23.11 or later.
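A possible end-to-end invocation could look like the following sketch (the Compute Library path is illustrative):

$ export ACL_ROOT_DIR=/path/to/ComputeLibrary
$ cmake -DONEDNN_AARCH64_USE_ACL=ON ..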
### Vendor BLAS libraries

oneDNN can use a standard BLAS library for GEMM operations. The `ONEDNN_BLAS_VENDOR` build option controls BLAS library selection and defaults to `NONE`. For AArch64 builds with GCC, use the Arm Performance Libraries:
$ cmake -DONEDNN_BLAS_VENDOR=ARMPL ..
Additional options are available for development and debug purposes. These options are subject to change without notice; see `cmake/options.cmake` for details.
## GPU Options

Intel Processor Graphics is supported by the oneDNN GPU engine. The GPU engine is disabled in the default build configuration.
### Runtimes

To enable GPU support, you need to specify the GPU runtime by setting the `ONEDNN_GPU_RUNTIME` CMake option. The default value is `NONE`, which corresponds to no GPU support in the library.
#### OpenCL*

The OpenCL runtime requires the Intel(R) SDK for OpenCL* applications. You can explicitly specify the path to the SDK using the `OPENCLROOT` CMake option:
$ cmake -DONEDNN_GPU_RUNTIME=OCL -DOPENCLROOT=/path/to/opencl/sdk ..
@anchor component_limitation
## Graph component limitations

The graph component can be enabled via the `ONEDNN_BUILD_GRAPH` build option. However, this option does not work with certain values of other build options. Specifying such options and values simultaneously in one build will lead to a CMake error.
CMake Option | Unsupported Values |
---|---|
ONEDNN_GPU_VENDOR | NVIDIA |
ONEDNN_ENABLE_PRIMITIVE | PRIMITIVE_NAME |
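For example, if one of the unsupported values above is required, the graph component can simply be switched off in the same build (a minimal sketch):

$ cmake -DONEDNN_BUILD_GRAPH=OFF -DONEDNN_ENABLE_PRIMITIVE="CONVOLUTION;MATMUL" ..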
### Graph Compiler Backend Limitations

As a backend of the graph component, the graph compiler backend has extra limitations beyond those described in [Graph component limitations](@ref component_limitation). Specifying unsupported build options will lead to a CMake error.
CMake Option | Unsupported Values |
---|---|
ONEDNN_CPU_RUNTIME | THREADPOOL, SYCL |
ONEDNN_GPU_RUNTIME | OCL, SYCL |
In addition, the kernels generated by the graph compiler backend contain [AVX512_CORE](@ref dev_guide_cpu_dispatcher_control) or newer instructions, so these kernels will not be dispatched on systems that lack support for the corresponding instruction sets.
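A configuration that satisfies these constraints might look like the following sketch (OpenMP CPU runtime, no GPU runtime):

$ cmake -DONEDNN_BUILD_GRAPH=ON -DONEDNN_EXPERIMENTAL_GRAPH_COMPILER_BACKEND=ON -DONEDNN_CPU_RUNTIME=OMP -DONEDNN_GPU_RUNTIME=NONE ..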