Releases: FluxML/NNlib.jl
v0.8.20
NNlib v0.8.20
Closed issues:
Merged pull requests:
- Add hand written gelu derivative (#480) (@chengchingwen)
- [NNlibAMDGPUExt] Load MIOpen module only if it is available (#483) (@pxl-th)
- Use KernelAbstractions.jl for upsample kernels (#486) (@pxl-th)
- Use KernelAbstractions.jl for gather/scatter kernels (#487) (@pxl-th)
- [AMDGPU] Add dispatch path for FP16 batched mul (#488) (@pxl-th)
v0.8.19
NNlib v0.8.19
Merged pull requests:
- Add AMDGPU extension (#470) (@pxl-th)
- Merge NNlibCUDA as subpackage (#471) (@ToucheSir)
- Add dropout & attention tests for AMDGPU (#472) (@pxl-th)
- Allow regular convolution for AMDGPU (#473) (@pxl-th)
- Add downstream tests for Lux.jl (#474) (@avik-pal)
- change AMDGPUExt to NNlibAMDGPUExt (#477) (@CarloLucibello)
v0.8.18
NNlib v0.8.18
Closed issues:
- Unstable performance (#461)
Merged pull requests:
- Fix conv with groups when falling in direct backend (#468) (@gabrielpreviato)
v0.8.17
NNlib v0.8.17
Closed issues:
- Multi-head attention? (#385)
- Batched multiplication support for ndims > 3 (#391)
- Symmetric Padding (#463)
Merged pull requests:
- implement dot_product_attention (#455) (@CarloLucibello)
- symmetric and circular padding (#465) (@nikopj)
- Reduce `lpnormpool` unnecessary conversion (#467) (@skyleaworlder)
- Update deprecated actions & change coverage upload entry (#469) (@skyleaworlder)
v0.8.16
NNlib v0.8.16
Closed issues:
- `batched_vec` >1000X slower than `batched_mul` (#462)
Merged pull requests:
- Add `lppool` implementation (#447) (@skyleaworlder)
- Refactor `batched_vec` (#464) (@jondeuce)
v0.8.15
v0.8.14
NNlib v0.8.14
Closed issues:
- support arbitrary number of batch dimensions in batched_mul (#451)
Merged pull requests:
v0.8.13
v0.8.12
v0.8.11
NNlib v0.8.11
Closed issues:
- (Flaky?) CI failures on GHA latest + Buildkite (#359)
Merged pull requests:
- Trigger tagbot on issue comments (#440) (@Saransh-cpp)
- Remove threading from all `∇*conv_filter` and re-enable old tests (#441) (@ToucheSir)
- Slightly faster softplus (#443) (@Sleort)
- Add fold and unfold (#444) (@nikopj)