oneDNN

mirror of https://github.com/uxlfoundation/oneDNN.git synced 2025-10-20 10:03:50 +08:00

Author	SHA1	Message	Date
Ovchinnikov Dmitriy	846ba7c1d4	x64: matmul: Enable AVX2 matmul weight decompression	2025-10-18 16:47:57 +02:00
Ovchinnikov Dmitriy	0ba6d85080	x64: matmul: Enable grouped ZP for per_oc/per_ocic & rework scales	2025-10-18 16:47:57 +02:00
Ovchinnikov Dmitriy	051b020bb1	x64: matmul: Enable AVX512 f32:int4/int8:f32 case	2025-10-18 16:47:57 +02:00
Ovchinnikov Dmitriy	f92a20983c	x64: matmul: Refactoring matmul copy kernels	2025-10-18 16:47:57 +02:00
Vadim Pirogov	794caa04ba	common: verbose: added SYCL kernel compiler status to verbose header	2025-10-17 17:10:13 -07:00
Vadim Pirogov	fda8d0c376	gpu: sycl: added nullptr check for ext2cl_str result. ext2cl_str() may return nullptr for OpenCL extensions not relevant to oneDNN.	2025-10-17 17:10:13 -07:00
Vadim Pirogov	81b67328c5	gpu: sycl: fixed build issue with SYCL kernel compiler	2025-10-17 17:10:13 -07:00
张健10355098	de3f018b62	cpu: rv64: eltwise: remove channel blocked layouts support	2025-10-17 15:48:20 -07:00
张健10355098	a6956f00d4	cpu: rv64: eltwise: fix: remove early pointer casting	2025-10-17 15:48:20 -07:00
张健10355098	944c782e17	cpu: rv64: eltwise: rebase & remove f16 deadcode	2025-10-17 15:48:20 -07:00
张健10355098	1aa33bbb25	cpu: rv64: eltwise: fix: Add Zvfh extension guard to f16-related codes	2025-10-17 15:48:20 -07:00
张健10355098	d3eb5ed7ac	cpu: rv64: eltwise: fix templatization and ifdef line	2025-10-17 15:48:20 -07:00
张健10355098	a0961ab37e	cpu: rv64: add support for rvv eltwise feature	2025-10-17 15:48:20 -07:00
Xia Zhuozhao	dca0b6c7d6	cpu: risc-v: Restore specific checks in post_ops_ok for gemm convolution Co-authored-by: Fei Zhang <zhangfei@iscas.ac.cn>	2025-10-17 15:41:00 -07:00
xiazhuozhao	532111e2d6	cpu: risc-v: convolution: Add a newline at the end of the file.	2025-10-17 15:41:00 -07:00
xiazhuozhao	a4ac6806be	cpu: rv64: conv: Refactor validation logic and cleanup Co-authored-by: Fei Zhang <zhangfei@iscas.ac.cn>	2025-10-17 15:41:00 -07:00
xiazhuozhao	fe04323ab0	cpu: rv64: convolution: optimize ncsp relu and adjust nspc impl Co-authored-by: Fei Zhang <zhangfei@iscas.ac.cn>	2025-10-17 15:41:00 -07:00
xiazhuozhao	1147a0739a	cpu: rv64: convolution: Vectorize post-ops with RVV intrinsics Co-authored-by: Fei Zhang <zhangfei@iscas.ac.cn>	2025-10-17 15:41:00 -07:00
Zhukova, Maria	fcc4e1eb10	doc: small subsection wrt how to run examples	2025-10-17 10:35:27 -07:00
Ryo Suzuki	7815069074	ci: aarch64: Fix permission issues and input data handling (#4125)	2025-10-17 17:38:19 +01:00
dependabot[bot]	d2b59266e1	github: workflows: bump KyleMayes/install-llvm-action from 2.0.7 to 2.0.8 (#4121) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-10-17 11:50:39 +01:00
dependabot[bot]	5365a85d64	github: workflows: bump github/codeql-action from 3.30.6 to 4.30.8 Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.30.6 to 4.30.8. - [Release notes](https://github.com/github/codeql-action/releases) - [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md) - [Commits](`64d10c1313...f443b600d9`) --- updated-dependencies: - dependency-name: github/codeql-action dependency-version: 4.30.8 dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com>	2025-10-16 15:40:30 -07:00
Guskov, Andrey Y	7b351ae955	src: gpu: intel: gemm: add strided batch support to group sums	2025-10-16 14:30:24 -07:00
Maria Zhukova	dc1d8c3d55	doc: update supported attributes section for matmul	2025-10-16 13:38:24 -07:00
Maria Zhukova	52077ed87c	api, doc: fix documentation warn wrt parameter name	2025-10-16 13:38:24 -07:00
Maria Zhukova	c16cf34b28	doc: fix a few misspellings	2025-10-16 13:38:24 -07:00
Zhukova, Maria	cff4bb8aaa	doc: add section for quantization-related APIs and refactor	2025-10-16 13:38:24 -07:00
Chereshnev, Eugene	954f7f76f2	xe: jit: fix shr usage	2025-10-16 12:47:19 -07:00
Andrei Hutu	3a756b982b	cpu: aarch64: skip f16 to f32 upcast for clip and clip_v2 eltwise (#4101)	2025-10-16 15:06:10 +01:00
Ryo Suzuki	e2aa79e849	ci: aarch64: make comparison script friendlier for local testing (#4157)	2025-10-16 13:12:46 +01:00
David Svantesson	34caf5740b	cpu: aarch64: jit_reorder: fix dispatch issue for direct copy (#4150)	2025-10-16 08:57:31 +01:00
Guskov, Andrey Y	bf6617dd1f	tests: benchdnn: utils: improve --buffer-prefix messaging	2025-10-15 18:09:06 -07:00
Umar Arshad	b5773a0826	xe: sdpa: workaround for xe3 regressions for PTL	2025-10-15 17:55:25 -04:00
Sergey Kazakov	34c81a1ca8	xe: gemm: convert HHS kernel to FHS to fix perf regression	2025-10-15 13:53:19 -07:00
Peter Caday	7b3f2af13c	xe: jit: gemm: fix stream-k slab count calculation	2025-10-15 10:05:31 -07:00
Kassen, Andrew	a4874e48ea	tests: sdpa: fix fallthrough	2025-10-15 09:08:56 -07:00
Kassen, Andrew	21adf4d977	tests: sdpa: fix typo	2025-10-15 09:08:56 -07:00
Kassen, Andrew	ff30885a40	xe: sdpa: add check for host scalar fetch	2025-10-15 09:08:56 -07:00
Kassen, Andrew	f8ef3008b1	xe: rnn: avoid potential null pointer dereference	2025-10-15 09:08:56 -07:00
Kassen, Andrew	201e204682	xe: conv: make conversion explicit	2025-10-15 09:08:56 -07:00
Kassen, Andrew	3f66dc5049	third_party: ngen: avoid unnecessary copy	2025-10-15 09:08:56 -07:00
Kassen, Andrew	2b06b5c1b5	xe: jit: avoid unnecessary copies	2025-10-15 09:08:56 -07:00
Kassen, Andrew	4060edd33c	xe: gemm: assert host scalar scale is nonzero	2025-10-15 09:08:56 -07:00
Kassen, Andrew	f1302e4804	xe: ir: remove dead code	2025-10-15 09:08:56 -07:00
Kassen, Andrew	8df1253973	xe: lrn: remove dead code	2025-10-15 09:08:56 -07:00
Mourad Gouicem	27e186ccd9	gtest: check returned ptr on expected md init failure	2025-10-15 10:42:00 +02:00
Mourad Gouicem	de84f7f345	benchdnn: properly lock before checking allocations size	2025-10-15 10:42:00 +02:00
Mourad Gouicem	34663d344a	benchdnn: fix reference check in init_memory	2025-10-15 10:42:00 +02:00
Denis Samoilov	67cd11b889	cpu: x64: matmul fix wei_k_blk query	2025-10-14 23:10:32 -04:00
Umar Arshad	5ab84bf3df	xe: ukernel: Add new ukernel entries for MoE support	2025-10-14 20:05:31 -04:00


20988 Commits