Daniël de Kok 544354cb97 Add support for locking kernels (#10)
* PoC: allow users to lock the kernel revisions

This change allows Python projects that use kernels to lock the
kernel revisions on a per-project basis. For this to work, the user
only has to include `hf-kernels` as a build dependency. During
the build, a lock file is written to the package's pkg-info. At
runtime, the lock file is read and the locked revision is used.
When a kernel is not locked, the revision that is provided as an
argument is used.
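
As an illustration only (not the code in this change), here is a minimal sketch
of the runtime side, assuming the lock file is named `hf-kernels.lock`, lives in
the project's distribution metadata, and is JSON with `repo_id`/`sha` entries;
the file name, format, and helper name are all assumptions:

import json
from importlib.metadata import PackageNotFoundError, distribution

def resolve_locked_revision(package: str, repo_id: str, fallback: str) -> str:
    # Look up the calling project's installed distribution metadata.
    try:
        dist = distribution(package)
    except PackageNotFoundError:
        return fallback
    # Read the lock file that was written into the metadata at build time.
    lock_text = dist.read_text("hf-kernels.lock")  # assumed file name
    if lock_text is None:
        return fallback
    # Use the locked revision when this kernel repo is pinned; otherwise
    # fall back to the revision that was passed as an argument.
    for entry in json.loads(lock_text):  # assumed JSON list of entries
        if entry.get("repo_id") == repo_id:
            return entry.get("sha", fallback)
    return fallback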

* Generate lock files with `hf-lock-kernels`, copy to egg

* Various improvements

* Name CLI `hf-kernels`, add `download` subcommand

* hf-kernels.lock

* Bump version to 0.1.1

* Use setuptools for testing the wheel

* Factor out tomllib module selection

* Pass through `local_files_only` in `get_metadata`

* Do not reuse implementation in `load_kernel`

* The tests installed hf-kernels from PyPI; install the local package instead

* docker: package is in subdirectory

hf-kernels

Make sure you have torch==2.5.1+cu124 installed.

import torch

from hf_kernels import get_kernel

# Download optimized kernels from the Hugging Face hub
activation = get_kernel("kernels-community/activation")

# Random tensor
x = torch.randn((10, 10), dtype=torch.float16, device="cuda")

# Run the kernel
y = torch.empty_like(x)
activation.gelu_fast(y, x)

print(y)
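
A kernel revision can also be pinned explicitly. The snippet below is a hedged
example, assuming `get_kernel` accepts a `revision` keyword argument as described
in the locking change above; the revision string is a placeholder:

# Request a specific revision (branch, tag, or commit) of the kernel repo.
# Per the locking change, a project-level lock would take precedence; the
# revision argument is used when the kernel is not locked.
activation = get_kernel("kernels-community/activation", revision="main")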

Docker Reference

Build and run the reference example `example/basic.py` in a Docker container with the following commands:

docker build --platform linux/amd64 -t kernels-reference -f docker/Dockerfile.reference .
docker run --gpus all -it --rm -e HF_TOKEN=$HF_TOKEN kernels-reference