Releases: huggingface/kernels
v0.7.0
API changes
This version contains an API change to the kernelize function that makes it possible to use different kernels for inference/training/torch.compile. This requires a small adjustment to how kernelize is called; see the kernelize documentation for more information. In short, to kernelize a model for inference, use:
from kernels import Mode, kernelize

model = MyModel(...)
model = kernelize(model, mode=Mode.INFERENCE)

For training:
model = MyModel(...)
model = kernelize(model, mode=Mode.TRAINING)
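If the model will also be run under torch.compile, the kernelize documentation describes combining modes; a minimal sketch, assuming Mode values compose with the | operator:

model = MyModel(...)
model = kernelize(model, mode=Mode.INFERENCE | Mode.TORCH_COMPILE)
model = torch.compile(model)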
What's Changed
- Add get_local_kernel function by @danieldk in #102
- Support registering inference/training-specific layers by @danieldk in #103
- Set version to 0.7.0.dev0 by @danieldk in #104
Full Changelog: v0.6.2...v0.7.0
v0.6.2
v0.6.1
Features
This release adds an experimental generate-readme subcommand to the kernels utility. This command generates a README containing the API docs for a kernel. Functions and layers can be annotated with Hugging Face doc-builder formatting.
Example kernel with API docs: https://huggingface.co/kernels-community/triton-layer-norm
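A sketch of the intended workflow; the exact invocation is an assumption based on the other kernels subcommands and is not confirmed by these notes:

$ uv run kernels generate-readme kernels-community/triton-layer-norm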
v0.6.0
Features
New Hub layer API
The layer API has been redesigned to make it compatible with torch.compile. Prior to this change, the use_kernel_forward_from_hub decorator would replace a layer's forward method with one that uses dynamic kernel dispatch. However, this kind of data-dependent branching is not compatible with torch.compile.
In the new API, use_kernel_forward_from_hub only associates a layer with a kernel name. The replacement of forward methods is then done by a new kernelize function:
from kernels import kernelize
model = ModelWithKernelLayers(...)
kernelize(model)
Y = model(X)

See the layers documentation for more information about the layer API.
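For completeness, the association side looks like this; a minimal sketch in which the layer and kernel names are illustrative:

import torch.nn as nn

from kernels import use_kernel_forward_from_hub

@use_kernel_forward_from_hub("LayerNorm")
class LayerNorm(nn.Module):
    def forward(self, x):
        # Reference implementation; kernelize() may swap this forward
        # for a Hub kernel registered under the name "LayerNorm".
        ...

The decorator itself does no dispatch, which is what keeps the resulting forward traceable by torch.compile.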
Support for loading Metal kernels
kernels now has experimental support for loading Metal kernels. For example:
import torch
from kernels import get_kernel
relu = get_kernel("kernels-test/relu-metal")
x = torch.arange(-10, 10, dtype=torch.float16, device="mps")
y = relu.relu(x)

Generate wheels from Hub kernels
The kernels utility now has a to-wheel subcommand for converting Hub kernels to Python wheels for legacy deployment scenarios. For example, to make Python wheels for the activation kernel:
$ uv run kernels to-wheel kernels-community/activation 0.0.3
...
☸️ activation-0.0.3+torch25cu124cxx98-cp39-abi3-manylinux_2_28_x86_64.whl
☸️ activation-0.0.3+torch26cu124cxx98-cp39-abi3-manylinux_2_28_x86_64.whl
☸️ activation-0.0.3+torch27cu118cxx11-cp39-abi3-manylinux_2_28_x86_64.whlFull Changelog: v0.5.0...v0.6.0
v0.5.0
What's Changed
- locking docs: fix command name (kernel -> kernels) by @danieldk in #74
- Specify required aarch64 and ROCm build variants by @danieldk in #76
- docs: link to autogenerated build variant list by @danieldk in #77
- Allow layers to opt in to torch.compile by @danieldk in #79
- Set version to 0.5.0 by @danieldk in #82
Full Changelog: v0.4.4...v0.5.0
v0.4.4
v0.4.3
What's Changed
- Add more details about the ABI requirements by @danieldk in #63
- Update ABI requirement to manylinux_2_28 by @danieldk in #65
- Add Apache License version 2.0 by @danieldk in #66
- Support DISABLE_KERNEL_MAPPING env var for completely disabling kernel mappings by @danieldk in #70
- Set version to 0.4.3 by @danieldk in #71
Full Changelog: v0.4.2...v0.4.3