Hi, @oliIMG
In PR #7127, I see there are different versions of the qs8/qu8-f32-vcvt kernels,
namely qs8-f32-vcvt-rvv-u1v.c, qs8-f32-vcvt-rvv-u2v.c, qu8-f32-vcvt-rvv-u1v.c and qu8-f32-vcvt-rvv-u2v.c.
These are the m1 and m2 RVV implementations of the kernels.
Some other kernels have four RVV implementation versions (based on the scalar implementation), in the forms of m1, m2, m4 and m8.
What is the purpose of the different implementations, and how can users select the best version?
In general, a kernel named 'u4v' uses 'm4' (LMUL=4) for the source.
Kernels such as float binary ops can implement all four variations: m1, m2, m4 and m8.
In src/configs/gemm-config.c and the other config files, the fastest of these variations can be enabled.
Which variation is fastest depends on the hardware, so once benchmarks have been run on hardware from different vendors, a switch statement on uarch can be added to select different kernels for different hardware.
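A minimal sketch of what such a switch on uarch could look like, written in plain C so it runs without RVV hardware. Every name here (`example_uarch`, `example_vcvt_config`, `example_select_qs8_f32_vcvt`, the vendor IDs) is hypothetical and not taken from the repository, the kernel bodies are scalar stand-ins, and the mapping of vendors to variants is a placeholder rather than a benchmark result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical microarchitecture IDs; real values would come from CPU detection.
enum example_uarch {
  example_uarch_unknown = 0,
  example_uarch_vendor_a,
  example_uarch_vendor_b,
};

// Signature of a qs8 -> f32 conversion kernel in this sketch.
typedef void (*example_vcvt_fn)(size_t batch, const int8_t* input, float* output,
                                float scale, int32_t zero_point);

// Scalar stand-in shared by both variants. In real RVV kernels the u1v and u2v
// bodies would differ in the register group (m1 vs m2) used for the source.
static void example_convert(size_t batch, const int8_t* input, float* output,
                            float scale, int32_t zero_point) {
  assert(batch != 0);
  assert(input != NULL);
  assert(output != NULL);
  for (size_t i = 0; i < batch; i++) {
    output[i] = (float) ((int32_t) input[i] - zero_point) * scale;
  }
}

static void example_qs8_f32_vcvt_u1v(size_t batch, const int8_t* input, float* output,
                                     float scale, int32_t zero_point) {
  example_convert(batch, input, output, scale, zero_point);
}

static void example_qs8_f32_vcvt_u2v(size_t batch, const int8_t* input, float* output,
                                     float scale, int32_t zero_point) {
  example_convert(batch, input, output, scale, zero_point);
}

// The selected kernel plus the source LMUL it corresponds to.
struct example_vcvt_config {
  example_vcvt_fn ukernel;
  int source_lmul;
};

// Picks a variant per microarchitecture, with a default for unknown hardware.
struct example_vcvt_config example_select_qs8_f32_vcvt(enum example_uarch uarch) {
  struct example_vcvt_config config;
  switch (uarch) {
    case example_uarch_vendor_b:
      // Placeholder choice, not a measured result.
      config.ukernel = example_qs8_f32_vcvt_u1v;
      config.source_lmul = 1;
      break;
    default:
      // Placeholder default for every other microarchitecture.
      config.ukernel = example_qs8_f32_vcvt_u2v;
      config.source_lmul = 2;
      break;
  }
  return config;
}
```

The caller only sees the function pointer in the config, so adding a new vendor after benchmarking means adding one `case` without touching call sites.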
With 8- or 16-bit datatypes, the intermediates are often widened, so they need a larger LMUL than the source. Since LMUL cannot exceed m8, this limits the variations to m1, m2, and maybe m4.
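The arithmetic behind that limit can be sketched as follows, assuming the widest intermediate has the destination element width, so its register group is the source LMUL multiplied by the widening ratio. The helper names are hypothetical, for illustration only.

```c
#include <assert.h>

// Largest register group RVV allows (LMUL = 8).
enum { EXAMPLE_MAX_LMUL = 8 };

// LMUL needed by the widest intermediate when the source uses src_lmul and
// elements are widened from src_bits to dst_bits.
static int example_widened_lmul(int src_lmul, int src_bits, int dst_bits) {
  assert(src_bits > 0);
  assert(dst_bits >= src_bits && dst_bits % src_bits == 0);
  return src_lmul * (dst_bits / src_bits);
}

// Returns 1 if a variant with the given source LMUL fits within m8, else 0.
static int example_variant_is_feasible(int src_lmul, int src_bits, int dst_bits) {
  return example_widened_lmul(src_lmul, src_bits, dst_bits) <= EXAMPLE_MAX_LMUL;
}
```

For an 8-bit source converted to f32, the ratio is 4: an m1 source needs m4 for the result and an m2 source needs m8, while an m4 source would need m16, which does not exist. That matches the u1v and u2v files in the PR. A 16-bit source widened to 32 bits has a ratio of 2, so m4 is still possible there.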