-
Notifications
You must be signed in to change notification settings - Fork 7
Using and Writing Kernel Functions
In H2Pack, the supported kernel functions K(x,y) should be translationally-invariant (i.e., K(x,y)=K(x-y)) and symmetric (i.e., K(x,y)=K(y,x)). You can either write your own kernel function or use one that is built into H2Pack.
To use your own kernel in H2Pack, you need to write a Kernel Matrix Evaluation (KME) function which takes two point sets X0, Y0, and outputs the kernel block K(X0, Y0). Note that the performance of H2Pack depends heavily on the performance of evaluating the kernel function. Your KME function must be defined as follows:
// Input parameters:
// coord0 : Matrix, size pt_dim-by-ld0, coordinates of the 1st point set
// ld0 : Leading dimension of coord0, should be >= n0
// n0 : Number of points in coord0 (each column in coord0 is a coordinate)
// coord1 : Matrix, size pt_dim-by-ld1, coordinates of the 2nd point set
// ld1 : Leading dimension of coord1, should be >= n1
// n1 : Number of points in coord1 (each column in coord0 is a coordinate)
// ldm : Leading dimension of the kernel matrix
// krnl_param : Pointer to kernel function parameter array
// Output parameter:
// mat : Obtained kernel matrix, size n0-by-ldm
void name_of_your_kernel(
const DTYPE *coord0, const int ld0, const int n0,
const DTYPE *coord1, const int ld1, const int n1,
const void *krnl_param, DTYPE *mat, const int ldm
);
A KME function takes two sets of point coordinates and a parameter array as input and returns a kernel matrix. Two sets of point coordinates: coord0
is a pt_dim
-by-ld0
row-major matrix, its first n0
columns are point coordinates, and n0
<= ld0
. coord1
is a pt_dim
-by-ld1
row-major matrix, its first n1
columns are point coordinates, and n1
<= ld1
. The parameter array krnl_param
contains the parameters used by the kernel. The returning matrix is a row-major matrix of size n0
-by-ldm
, its first n0
-by-n1
block stores the kernel matrix.
An example of a 3D Gaussian kernel:
void Gaussian_3D_eval_std(
const DTYPE *coord0, const int ld0, const int n0,
const DTYPE *coord1, const int ld1, const int n1,
const void *krnl_param, DTYPE *mat, const int ldm
)
{
const DTYPE *x0 = coord0 + ld0 * 0, *x1 = coord1 + ld1 * 0;
const DTYPE *y0 = coord0 + ld0 * 1, *y1 = coord1 + ld1 * 1;
const DTYPE *z0 = coord0 + ld0 * 2, *z1 = coord1 + ld1 * 2;
const DTYPE *krnl_param_ = (DTYPE*) krnl_param;
const DTYPE l = krnl_param_[0];
for (int i = 0; i < n0; i++)
{
const DTYPE x0_i = x0[i], y0_i = y0[i], z0_i = z0[i];
DTYPE *kmat_i = kmat + i * ldm;
#pragma omp simd // Don't forget it!
for (int j = 0; j < n1; j++)
{
DTYPE dx = x1[j] - x0_i;
DTYPE dy = y1[j] - y0_i;
DTYPE dz = z1[j] - z0_i;
DTYPE r2 = dx * dx + dy * dy + dz * dz;
kmat_i[j] = DEXP(-l * r2);
}
}
}
-construction,
-matvec in JIT mode, and
-matmul in JIT mode all need to evaluate a large number of kernel matrix blocks.
Instead of evaluating the kernel function at one pair of points, the above KME function allows the compiler to vectorize kernel evaluations and thus provides better performance.
Note 1:
In the function, we use DEXP()
instead of exp()
.
DEXP
is a macro defined in src/include/H2Pack_config.h
, and by default DEXP
is defined as exp
.
Using DEXP()
instead of exp()
allows H2Pack and your kernel to be used in single precision mode without extra work.
We recommend users to use the macros provided in H2Pack_config.h
.
Note 2: You do not need to and are actually not supposed to parallelize your KME function with multiple threads. A KME function should be single-threaded and should only use variables or memory that can be updated by the current thread.
-
For better
-matvec and
-matmul performance in JIT mode, users can provide an additional kernel-related function called the Bi-Kernel Matvec (BKM) function. For more details, see Bi-Kernel Matvec (BKM) Functions.
-
For even better performance of
-construction,
-matvec (JIT mode), and
-matmul (JIT mode) performance, vector wrapper functions can be used for writing the KME and/or BKM functions. For more details, see Vector Wrapper Functions For Kernel Functions.
Note: Providing a KME function is both sufficient and necessary for H2Pack to work properly. The above BKM function and vector wrapper functions are used as auxiliary tools to further improve the efficiency of H2Pack.
H2Pack provides the following optimized kernels (notation: , 2D / 3D means the point coordinate dimension):
- 2D Laplace:
- 3D Coulomb:
- 2D & 3D Gaussian:
,
is a scalar parameter
- 2D & 3D Exponential:
,
is a scalar parameter
- 2D & 3D Matern 3/2:
,
is a scalar parameter
- 2D & 3D Matern 5/2:
,
is a scalar parameter
- 2D & 3D Quadratic:
,
are two scalar parameters
- 3D Stokes:
To use these optimized kernels, add #include "H2Pack_2D_kernels.h"
and #include "H2Pack_3D_kernels.h"
to your C/C++ file. Both KME and BKM functions are provided:
- For KME function pointer
kernel_eval_fptr
, the optimized scalar kernels (i.e. except Stokes) are named as<kernel_prefix>_eval_intrin_d
, the suffix_intrin_d
means the function is manually vectorized for double precision. For 3D Stokes kernel, the function isStokes_eval_std
without manual vectorization. - For BKM function pointer
kernel_bimv_fptr
, the optimized kernels are named as<kernel_prefix>_krnl_bimv_intrin_d
, the suffix_intrin_d
means the function is manually vectorized for double precision. - For BKM flop numbers
krnl_bimv_flop
, the values are named as<kernel_prefix>_krnl_bimv_flop
.
Kernel naming prefixes:
Kernel | Naming Prefix |
---|---|
2D Laplace | Laplace_2D |
3D Coulomb | Coulomb_3D |
2D & 3D Gaussian | Gaussian_{2, 3}D |
2D & 3D Exponential | Expon_{2, 3}D |
2D & 3D Matern 3/2 | Matern32_{2, 3}D |
2D & 3D Matern 5/2 | Matern52_{2, 3}D |
2D & 3D Quadratic | Quadratic_{2, 3}D |
3D Stokes | Stokes |
An example code for using 3D Gaussian kernel:
// Partitioning points
int max_leaf_points = 0;
DTYPE max_leaf_size = 0.0;
char *pp_fname = "./PP_3DCoulomb_1e-6.dat";
H2P_calc_enclosing_box(pt_dim, n_point, coord, pp_fname, &h2pack->root_enbox);
H2P_partition_points(h2pack, n_point, coord, max_leaf_points, max_leaf_size);
// Kernel configuration
DTYPE krnl_param[1] = {2.0};
kernel_eval_fptr krnl_eval = Gaussian_3D_eval_intrin_d;
kernel_bimv_fptr krnl_bimv = Gaussian_3D_krnl_bimv_intrin_d;
int krnl_bimv_flop = Gaussian_3D_krnl_bimv_flop;
// Constructing proxy points
H2P_dense_mat_p *pp;
H2P_generate_proxy_point_ID_file(h2pack, krnl_param, krnl_eval, pp_fname, &pp);
// Constructing H2 representation
int BD_JIT = 1;
H2P_build(
h2pack, pp, BD_JIT, krnl_param,
krnl_eval, krnl_bimv, krnl_bimv_flop
);
- Return to the top H2Pack github page (leave this wiki)
- Installing H2Pack
- Basic Application Interface
- Using and Writing Kernel Functions
- Two Running Modes for H2Pack
- HSS-Related Computations
- Bi-Kernel Matvec (BKM) Functions
- Vector Wrapper Functions for Kernel Evaluations
- Proxy Points and their Reuse
- Python Interface
- H2 Matrix File Storage Scheme (draft)
- Using H2 Matrix File Storage