kernels

mirror of https://github.com/huggingface/kernels.git synced 2025-10-20 20:56:31 +08:00

Files

Daniël de Kok fc935d9874 Support registering inference/training-specific layers (#103)

* Support registering inference/training-specific layers

This change makes it possible to register kernels specialized for
inference, training, and/or `torch.compile`. To do so, the mapping
notation is extended to support registering specialized kernels
for a specific 'mode'. For instance, the following mapping,

```python
kernel_layer_mapping = {
    "SiluAndMul": {
        "cuda": {
            Mode.DEFAULT: LayerRepository(
                repo_id="kernels-community/activation",
                layer_name="SiluAndMul",
            ),
            Mode.TRAINING | Mode.TORCH_COMPILE: LayerRepository(
                repo_id="kernels-community/activation-training-optimized",
                layer_name="SiluAndMul",
            ),
        }
    }
}
```

uses `kernels-community/activation` by default, but switches to
`kernels-community/activation-training-optimized` when a model
is kernelized for both training and `torch.compile`.
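
The selection behaviour described above can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: `SketchMode` and `select_repo` are hypothetical names, the flag values are made up, and the rule shown (exact match on the requested mode, otherwise the `DEFAULT` entry) is a simplification of whatever fallback logic the library actually applies.

```python
from enum import Flag, auto


class SketchMode(Flag):
    """Hypothetical stand-in for the library's Mode; values are assumptions."""

    DEFAULT = auto()
    INFERENCE = auto()
    TRAINING = auto()
    TORCH_COMPILE = auto()


def select_repo(mode_mapping, requested):
    """Return the entry registered for exactly `requested`, else the DEFAULT entry.

    Assumption: exact-match lookup with a DEFAULT fallback; the real rules
    may be more fine-grained.
    """
    if requested in mode_mapping:
        return mode_mapping[requested]
    return mode_mapping.get(SketchMode.DEFAULT)


# Repository IDs taken from the mapping above; plain strings replace LayerRepository.
mode_mapping = {
    SketchMode.DEFAULT: "kernels-community/activation",
    SketchMode.TRAINING
    | SketchMode.TORCH_COMPILE: "kernels-community/activation-training-optimized",
}

training_compiled = select_repo(
    mode_mapping, SketchMode.TRAINING | SketchMode.TORCH_COMPILE
)
inference_only = select_repo(mode_mapping, SketchMode.INFERENCE)
```

Here `training_compiled` resolves to the training-optimized repository, while `inference_only` has no dedicated entry and falls back to the default one.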

To make it easier to add more modes in the future and to unify the
`register_kernel_mapping` and `kernelize` signatures, the `training`
and `needs_torch_compile` arguments of `kernelize` are replaced by
a single `mode` argument:

```python
model = MyModel(...)
model = kernelize(model, mode=Mode.TRAINING | Mode.TORCH_COMPILE)
```
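
For callers migrating from the old boolean arguments, the correspondence might look like the sketch below. `SketchMode` and `mode_from_legacy_flags` are hypothetical names and not part of the library. Mapping `training=False` to an inference mode is an assumption made for illustration.

```python
from enum import Flag, auto


class SketchMode(Flag):
    """Hypothetical stand-in for the library's Mode; values are assumptions."""

    INFERENCE = auto()
    TRAINING = auto()
    TORCH_COMPILE = auto()


def mode_from_legacy_flags(training: bool, needs_torch_compile: bool) -> SketchMode:
    """Translate the two former boolean arguments into a single combined mode."""
    # Assumption: a model that is not being trained is treated as inference.
    mode = SketchMode.TRAINING if training else SketchMode.INFERENCE
    if needs_torch_compile:
        mode |= SketchMode.TORCH_COMPILE
    return mode


# Equivalent of the former training=True, needs_torch_compile=True call.
legacy_equivalent = mode_from_legacy_flags(training=True, needs_torch_compile=True)
```

Combining modes with `|` keeps the signature open to future modes without adding another boolean parameter for each one.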

* Documentation fixes

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

* Add note on when the fallback is used

* Tighten up some Mode checks

* Fix ruff check

* Attempt to fix mypy errors

* More typing fixes

* Ignore Python < 3.11 type check SNAFU

---------

Co-authored-by: Mohamed Mekkouri <93391238+MekkCyber@users.noreply.github.com>

2025-07-04 19:57:14 +02:00

kernel_locking

Make the forward pass torch.compile compatible (#87)

2025-06-03 15:06:02 +02:00

conftest.py

Add support for Metal builds (#89)

2025-05-30 15:54:28 +02:00

test_basic.py

Add get_local_kernel function (#102)

2025-07-01 13:58:47 +02:00

test_benchmarks.py

Add support for Metal builds (#89)

2025-05-30 15:54:28 +02:00

test_kernel_locking.py

Add support for Metal builds (#89)

2025-05-30 15:54:28 +02:00

test_layer.py

Support registering inference/training-specific layers (#103)

2025-07-04 19:57:14 +02:00