|
846ba7c1d4
|
x64: matmul: Enable AVX2 matmul weight decompression
|
2025-10-18 16:47:57 +02:00 |
|
|
0ba6d85080
|
x64: matmul: Enable grouped ZP for per_oc/per_ocic & rework scales
|
2025-10-18 16:47:57 +02:00 |
|
|
051b020bb1
|
x64: matmul: Enable AVX512 f32:int4/int8:f32 case
|
2025-10-18 16:47:57 +02:00 |
|
|
f92a20983c
|
x64: matmul: Refactoring matmul copy kernels
|
2025-10-18 16:47:57 +02:00 |
|
|
794caa04ba
|
common: verbose: added SYCL kernel compiler status to verbose header
|
2025-10-17 17:10:13 -07:00 |
|
|
fda8d0c376
|
gpu: sycl: added nullptr check for ext2cl_str result
ext2cl_str() may return nullptr for OpenCL extensions not relevant
to oneDNN.
|
2025-10-17 17:10:13 -07:00 |
|
|
81b67328c5
|
gpu: sycl: fixed build issue with SYCL kernel compiler
|
2025-10-17 17:10:13 -07:00 |
|
|
de3f018b62
|
cpu: rv64: eltwise: remove channel blocked layouts support
|
2025-10-17 15:48:20 -07:00 |
|
|
a6956f00d4
|
cpu: rv64: eltwise: fix: remove early pointer casting
|
2025-10-17 15:48:20 -07:00 |
|
|
944c782e17
|
cpu: rv64: eltwise: rebase & remove f16 deadcode
|
2025-10-17 15:48:20 -07:00 |
|
|
1aa33bbb25
|
cpu: rv64: eltwise: fix: Add Zvfh extension guard to f16-related codes
|
2025-10-17 15:48:20 -07:00 |
|
|
d3eb5ed7ac
|
cpu: rv64: eltwise: fix templatization and ifdef line
|
2025-10-17 15:48:20 -07:00 |
|
|
a0961ab37e
|
cpu: rv64: add support for rvv eltwise feature
|
2025-10-17 15:48:20 -07:00 |
|
|
dca0b6c7d6
|
cpu: risc-v: Restore specific checks in post_ops_ok for gemm convolution
Co-authored-by: Fei Zhang <zhangfei@iscas.ac.cn>
|
2025-10-17 15:41:00 -07:00 |
|
|
532111e2d6
|
cpu: risc-v: convolution: Add a newline at the end of the file.
|
2025-10-17 15:41:00 -07:00 |
|
|
a4ac6806be
|
cpu: rv64: conv: Refactor validation logic and cleanup
Co-authored-by: Fei Zhang <zhangfei@iscas.ac.cn>
|
2025-10-17 15:41:00 -07:00 |
|
|
fe04323ab0
|
cpu: rv64: convolution: optimize ncsp relu and adjust nspc impl
Co-authored-by: Fei Zhang <zhangfei@iscas.ac.cn>
|
2025-10-17 15:41:00 -07:00 |
|
|
1147a0739a
|
cpu: rv64: convolution: Vectorize post-ops with RVV intrinsics
Co-authored-by: Fei Zhang <zhangfei@iscas.ac.cn>
|
2025-10-17 15:41:00 -07:00 |
|
|
fcc4e1eb10
|
doc: small subsection wrt how to run examples
|
2025-10-17 10:35:27 -07:00 |
|
|
7815069074
|
ci: aarch64: Fix permission issues and input data handling (#4125)
|
2025-10-17 17:38:19 +01:00 |
|
|
d2b59266e1
|
github: workflows: bump KyleMayes/install-llvm-action from 2.0.7 to 2.0.8 (#4121)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
|
2025-10-17 11:50:39 +01:00 |
|
|
5365a85d64
|
github: workflows: bump github/codeql-action from 3.30.6 to 4.30.8
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.30.6 to 4.30.8.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](64d10c1313...f443b600d9)
---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 4.30.8
  dependency-type: direct:production
  update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com>
|
2025-10-16 15:40:30 -07:00 |
|
|
7b351ae955
|
src: gpu: intel: gemm: add strided batch support to group sums
|
2025-10-16 14:30:24 -07:00 |
|
|
dc1d8c3d55
|
doc: update supported attributes section for matmul
|
2025-10-16 13:38:24 -07:00 |
|
|
52077ed87c
|
api, doc: fix documentation warn wrt parameter name
|
2025-10-16 13:38:24 -07:00 |
|
|
c16cf34b28
|
doc: fix a few misspellings
|
2025-10-16 13:38:24 -07:00 |
|
|
cff4bb8aaa
|
doc: add section for quantization-related APIs and refactor
|
2025-10-16 13:38:24 -07:00 |
|
|
954f7f76f2
|
xe: jit: fix shr usage
|
2025-10-16 12:47:19 -07:00 |
|
|
3a756b982b
|
cpu: aarch64: skip f16 to f32 upcast for clip and clip_v2 eltwise (#4101)
|
2025-10-16 15:06:10 +01:00 |
|
|
e2aa79e849
|
ci: aarch64: make comparison script friendlier for local testing (#4157)
|
2025-10-16 13:12:46 +01:00 |
|
|
34caf5740b
|
cpu: aarch64: jit_reorder: fix dispatch issue for direct copy (#4150)
|
2025-10-16 08:57:31 +01:00 |
|
|
bf6617dd1f
|
tests: benchdnn: utils: improve --buffer-prefix messaging
|
2025-10-15 18:09:06 -07:00 |
|
|
b5773a0826
|
xe: sdpa: workaround for xe3 regressions for PTL
|
2025-10-15 17:55:25 -04:00 |
|
|
34c81a1ca8
|
xe: gemm: convert HHS kernel to FHS to fix perf regression
|
2025-10-15 13:53:19 -07:00 |
|
|
7b3f2af13c
|
xe: jit: gemm: fix stream-k slab count calculation
|
2025-10-15 10:05:31 -07:00 |
|
|
a4874e48ea
|
tests: sdpa: fix fallthrough
|
2025-10-15 09:08:56 -07:00 |
|
|
21adf4d977
|
tests: sdpa: fix typo
|
2025-10-15 09:08:56 -07:00 |
|
|
ff30885a40
|
xe: sdpa: add check for host scalar fetch
|
2025-10-15 09:08:56 -07:00 |
|
|
f8ef3008b1
|
xe: rnn: avoid potential null pointer dereference
|
2025-10-15 09:08:56 -07:00 |
|
|
201e204682
|
xe: conv: make conversion explicit
|
2025-10-15 09:08:56 -07:00 |
|
|
3f66dc5049
|
third_party: ngen: avoid unnecessary copy
|
2025-10-15 09:08:56 -07:00 |
|
|
2b06b5c1b5
|
xe: jit: avoid unnecessary copies
|
2025-10-15 09:08:56 -07:00 |
|
|
4060edd33c
|
xe: gemm: assert host scalar scale is nonzero
|
2025-10-15 09:08:56 -07:00 |
|
|
f1302e4804
|
xe: ir: remove dead code
|
2025-10-15 09:08:56 -07:00 |
|
|
8df1253973
|
xe: lrn: remove dead code
|
2025-10-15 09:08:56 -07:00 |
|
|
27e186ccd9
|
gtest: check returned ptr on expected md init failure
|
2025-10-15 10:42:00 +02:00 |
|
|
de84f7f345
|
benchdnn: properly lock before checking allocations size
|
2025-10-15 10:42:00 +02:00 |
|
|
34663d344a
|
benchdnn: fix reference check in init_memory
|
2025-10-15 10:42:00 +02:00 |
|
|
67cd11b889
|
cpu: x64: matmul fix wei_k_blk query
|
2025-10-14 23:10:32 -04:00 |
|
|
5ab84bf3df
|
xe: ukernel: Add new ukernel entries for MoE support
|
2025-10-14 20:05:31 -04:00 |
|