# # Performance Tips
# ## Optimize contraction orders

# Let us use a problem instance from the "Promedus" dataset of the UAI 2014 competition as an example.
using TensorInference
problem = problem_from_artifact("uai2014", "MAR", "Promedus", 11)
model, evidence = read_model(problem), read_evidence(problem);

# Next, we select the tensor network contraction order optimizer.
optimizer = TreeSA(ntrials = 1, niters = 5, βs = 0.1:0.3:100)

# Here, we choose the local search based [`TreeSA`](@ref) algorithm, which often finds the smallest time/space complexity and supports slicing.
# One can type `?TreeSA` in a Julia REPL for more information about how to configure the hyper-parameters of the [`TreeSA`](@ref) method,
# while the detailed algorithm explanation is in [arXiv: 2108.05665](https://arxiv.org/abs/2108.05665).
# Alternative tensor network contraction order optimizers include
# * [`GreedyMethod`](@ref) (default, fastest in searching speed but worst in contraction complexity)
# * [`KaHyParBipartite`](@ref)
# * [`SABipartite`](@ref)
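# As a quick sketch (assuming these optimizers are exported alongside `TensorInference`),
# an alternative optimizer is constructed and used in exactly the same way. For example, the
# default greedy optimizer searches much faster but usually yields a worse contraction order:
greedy_tn = TensorNetworkModel(model; optimizer = GreedyMethod(), evidence);
contraction_complexity(greedy_tn)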

tn = TensorNetworkModel(model; optimizer, evidence);

# The returned object `tn` contains a field `code` that specifies the tensor network with optimized contraction order. To check the contraction complexity, please type
contraction_complexity(tn)

# The returned object contains the log2 values of the number of multiplications, the number of elements in the largest tensor during contraction, and the number of read-write operations to tensor elements.
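# For programmatic access, the individual quantities can be read out as fields of the
# returned object (a sketch assuming the field names `tc`, `sc` and `rwc` from `OMEinsum`):
cc = contraction_complexity(tn)
cc.tc, cc.sc, cc.rwc  # time, space and read-write complexity, all in log2 scale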

# The total probability can then be evaluated by contracting the tensor network.
probability(tn)

# ## Using the slicing technique to reduce the memory cost

# For large scale applications, it is also possible to slice over certain degrees of freedom to reduce the space complexity, i.e.
# loop and accumulate over certain degrees of freedom so that one can have a smaller tensor network inside the loop due to the removal of these degrees of freedom.
# In the [`TreeSA`](@ref) optimizer, one can set `nslices` to a value larger than zero to turn on this feature.
# As a comparison, we slice over 5 degrees of freedom, which can reduce the space complexity by at most 5.
# In this application, the slicing achieves the largest possible space complexity reduction of 5, while the time and read-write complexities are increased by less than 1,
# i.e. the peak memory usage is reduced by a factor of ``32``, while the (theoretical) computing time is increased by a factor of less than ``2``.
optimizer = TreeSA(ntrials = 1, niters = 5, βs = 0.1:0.3:100, nslices = 5)
tn = TensorNetworkModel(model; optimizer, evidence);
contraction_complexity(tn)

# ## Faster Tropical tensor contraction to speed up MAP and MMAP
# No extra effort is required to enjoy the BLAS-level speed provided by [`TropicalGEMM`](https://github.com/TensorBFS/TropicalGEMM.jl).
# The benchmark in the `TropicalGEMM` repo shows that its performance is close to the theoretical optimum.
# A GPU implementation is under development in the GitHub repo [`CuTropicalGEMM.jl`](https://github.com/ArrogantGao/CuTropicalGEMM.jl) as a part of the [Open Source Promotion Plan summer program](https://summer-ospp.ac.cn/).
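# As a sketch (assuming `TropicalGEMM` is installed and that the `tn` model above is reused
# for the MAP query), loading the package before querying the most probable configuration is
# all that is needed to accelerate the underlying Tropical tensor contractions:
# ```julia
# julia> using TropicalGEMM
#
# julia> logp, config = most_probable_config(tn);
# ```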

# ## Working with GPUs
# To offload the computation to a GPU, simply load `CUDA` before calling the inference functions, and set the keyword argument `usecuda` to `true`.
# ```julia
# julia> using CUDA
# [ Info: OMEinsum loaded the CUDA module successfully
#
# julia> marginals(tn; usecuda = true);
# ```

# Functions that support the `usecuda` keyword argument include
# * [`probability`](@ref)
# * [`log_probability`](@ref)
# * [`marginals`](@ref)
# * [`most_probable_config`](@ref)

# ## Benchmarks
# Please check our [paper (link to be added)]().