-
Notifications
You must be signed in to change notification settings - Fork 7
Bi Kernel Matvec Functions
An optional BKM function can be used to further accelerate matrix-vector multiplication in JIT mode.
Since H2Pack currently focuses on symmetric kernel matrices, for most Bi,j and Di,j kernel blocks,
the matrix-vector multiplication needs to multiply one block by a vector and multiply its transposes by another vector.
If only a kernel matrix evaluation (KME) function is provided, H2Pack will compute this Bi,j or Di,j block first, and immediately use the block twice for the two multiplications.
In this approach, the computed Bi,j or Di,j block needs to be stored in a buffer (usually in L2 cache) and transferred to the processor.
A BKM function further eliminates this data movement for better performance.
A BKM function must be defined as follows:
// Input parameters:
// coord0 : Matrix, size pt_dim-by-ld0, coordinates of the 1st point set
// ld0 : Leading dimension of coord0, should be >= n0
// n0 : Number of points in coord0 (each column in coord0 is a coordinate)
// coord1 : Matrix, size pt_dim-by-ld1, coordinates of the 2nd point set
// ld1 : Leading dimension of coord1, should be >= n1
// n1 : Number of points in coord1 (each column in coord0 is a coordinate)
// krnl_param : Pointer to kernel function parameter array
// x_in_0 : Vector, size >= krnl_dim * n1, will be left multiplied by kernel_matrix(coord0, coord1)
// x_in_1 : Vector, size >= krnl_dim * n0, will be left multiplied by kernel_matrix(coord1, coord0).
// Output parameters:
// x_out_0 : Vector, size >= krnl_dim * n0, x_out_0 += kernel_matrix(coord0, coord1) * x_in_0
// x_out_1 : Vector, size >= krnl_dim * n1, x_out_1 += kernel_matrix(coord1, coord0) * x_in_1
void your_bkm_function_name (
const DTYPE *coord0, const int ld0, const int n0,
const DTYPE *coord1, const int ld1, const int n1,
const void *krnl_param,
const DTYPE *x_in_0, const DTYPE *x_in_1,
DTYPE * restrict x_out_0, DTYPE * restrict x_out_1
);
The first 7 input parameters are the same as that of a KME function. x_in_0
is a vector of size krnl_dim * n1
that will be multiplied by the kernel matrix block and the output vector will be accumulated to output vector x_out_0
of size krnl_dim * n0
. Similarly, x_in_1
is a vector of size krnl_dim * n0
that will be multiplied by the transpose of kernel matrix block and the output vector will be accumulated to output vector x_out_1
of size krnl_dim * n1
.
An example of a KME function for the 3D Gaussian kernel:
void Gaussian_3D_krnl_bimv_std(
const DTYPE *coord0, const int ld0, const int n0,
const DTYPE *coord1, const int ld1, const int n1,
const void *krnl_param,
const DTYPE *x_in_0, const DTYPE *x_in_1,
DTYPE * restrict x_out_0, DTYPE * restrict x_out_1
)
{
const DTYPE *x0 = coord0 + ld0 * 0, *x1 = coord1 + ld1 * 0;
const DTYPE *y0 = coord0 + ld0 * 1, *y1 = coord1 + ld1 * 1;
const DTYPE *z0 = coord0 + ld0 * 2, *z1 = coord1 + ld1 * 2;
const DTYPE *krnl_param_ = (DTYPE*) krnl_param;
const DTYPE l = krnl_param_[0];
for (int i = 0; i < n0; i++)
{
const DTYPE x0_i = x0[i], y0_i = y0[i], z0_i = z0[i];
const DTYPE x_in_1_i = x_in_1[i];
DTYPE sum_i = 0.0;
#pragma omp simd // Don't forget it!
for (int j = 0; j < n1; j++)
{
DTYPE dx = x1[j] - x0_i;
DTYPE dy = y1[j] - y0_i;
DTYPE dz = z1[j] - z0_i;
DTYPE r2 = dx * dx + dy * dy + dz * dz;
DTYPE kval = DEXP(-l * r2);
sum_i += kval * x_in_0[j];
x_out_1[j] += kval * x_in_1_i;
}
x_out_0[i] += sum_i;
}
}
This BKM function is very similar to the 3D Gaussian kernel KME function in Using and Writing Kernel Functions. The only difference is that the calculated kernel matrix element kval
is consumed and discarded immediately in the BKM function but stored to kmat
in the KME function.
For tensor kernels, H2Pack has other requirements for their BKM functions. If you need to write a BKM function for your tensor kernel, please email Hua Huang for further assistance.
- Return to the top H2Pack github page (leave this wiki)
- Installing H2Pack
- Basic Application Interface
- Using and Writing Kernel Functions
- Two Running Modes for H2Pack
- HSS-Related Computations
- Bi-Kernel Matvec (BKM) Functions
- Vector Wrapper Functions for Kernel Evaluations
- Proxy Points and their Reuse
- Python Interface
- H2 Matrix File Storage Scheme (draft)
- Using H2 Matrix File Storage