Faster implementation of maximum cardinality search. #168

samuelsonric · 2025-02-14T21:21:48Z

I have a library, CliqueTrees.jl, with a faster implementation of the maximum cardinality search (MCS) algorithm. Here are some benchmarks.

julia> using BenchmarkTools, MatrixMarket, SuiteSparseMatrixCollection, Graphs

julia> using CausalInference: count_mcs

julia> ssmc = ssmc_db(); name = "venturiLevel3";

julia> graph = DiGraph(mmread(joinpath(fetch_ssmc(ssmc[ssmc.name .== name, :], format="MM")[1], "$(name).mtx")))
{4026819, 16108474} directed simple Int64 graph

current version

julia> @benchmark count_mcs($graph)
BenchmarkTools.Trial: 6 samples with 1 evaluation per sample.
 Range (min … max):  710.482 ms …    1.613 s  ┊ GC (min … max): 32.27% … 69.52%
 Time  (median):     765.882 ms               ┊ GC (median):    32.35%
 Time  (mean ± σ):   895.824 ms ± 353.408 ms  ┊ GC (mean ± σ):  44.48% ± 15.14%

  █ ▁ ▁ ▁                                                     ▁  
  █▁█▁█▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
  710 ms           Histogram: frequency by time          1.61 s <

 Memory estimate: 706.63 MiB, allocs estimate: 20134711.

new version

julia> @benchmark count_mcs($graph)
BenchmarkTools.Trial: 18 samples with 1 evaluation per sample.
 Range (min … max):  274.436 ms … 311.835 ms  ┊ GC (min … max): 0.00% … 1.62%
 Time  (median):     288.422 ms               ┊ GC (median):    1.17%
 Time  (mean ± σ):   289.501 ms ±  11.964 ms  ┊ GC (mean ± σ):  1.09% ± 1.10%

  ▁▁ █  ▁ ▁ ▁      ▁▁        ▁  ▁ ▁     █    ▁     ▁ ▁        ▁  
  ██▁█▁▁█▁█▁█▁▁▁▁▁▁██▁▁▁▁▁▁▁▁█▁▁█▁█▁▁▁▁▁█▁▁▁▁█▁▁▁▁▁█▁█▁▁▁▁▁▁▁▁█ ▁
  274 ms           Histogram: frequency by time          312 ms <

 Memory estimate: 153.61 MiB, allocs estimate: 15.

You can take a look at the implementation here.

mschauer · 2025-02-15T11:25:33Z

Thank you for the PR! Looks very good, but there is some conflict in the compat requirements of our packages; I'm not sure what exactly conflicts. Cc @marcel3243

mwien · 2025-02-15T12:25:16Z

One could check whether this actually improves the performance of the Zigzag sampler. Maybe it's not the bottleneck.

On another note: since coding this I noticed that it's possible to implement MCS without linked lists. Just use a Vector for each of the sets and store the set that each vertex is currently in (as is probably already done with something like size[v]).

To move vertices to a new set, just insert them in the new set (and don't delete them from the old one). To retrieve the vertex with maximum cardinality, just pop vertices v from the maximum cardinality set (say with cardinality x) until you find one with size[v] = x. I did something like this here: https://github.com/mwien/CliquePicking/blob/bfc95435174951c6ea8284260e1eda30d325aa11/cliquepicking_rs/src/chordal.rs#L3

This has the same number of insertions/deletions as the linked list implementation. Memory consumption is bounded by the graph size. If I were to reimplement MCS, I would probably take that route.
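For readers following along, the lazy-deletion idea described above could be sketched in Julia roughly as follows. This is a minimal sketch, not the code from CliqueTrees.jl, CausalInference.jl, or the linked Rust file: the function name `mcs_lazy`, the adjacency-list input `adj`, and the variable names are all assumptions made for illustration.

```julia
# Maximum cardinality search with lazy deletion (illustrative sketch).
# `adj[v]` lists the neighbors of vertex `v` in an undirected graph;
# this input format and all names are hypothetical.
function mcs_lazy(adj::Vector{Vector{Int}})
    n = length(adj)
    n == 0 && return Int[]
    card = zeros(Int, n)            # number of already visited neighbors
    visited = falses(n)
    buckets = [Int[] for _ in 1:n]  # buckets[c + 1]: vertices pushed with cardinality c
    append!(buckets[1], 1:n)
    order = Int[]
    sizehint!(order, n)
    top = 1                         # index of the highest non-empty bucket
    while length(order) < n
        # Move down to the highest bucket that still has entries.
        while isempty(buckets[top])
            top -= 1
        end
        v = pop!(buckets[top])
        # Skip stale entries: v was already visited or has since moved up.
        if visited[v] || card[v] != top - 1
            continue
        end
        visited[v] = true
        push!(order, v)
        for w in adj[v]
            if !visited[w]
                card[w] += 1
                # Insert into the new set; the old entry is left behind.
                push!(buckets[card[w] + 1], w)
                top = max(top, card[w] + 1)
            end
        end
    end
    return order
end

order = mcs_lazy([[2], [1, 3], [2]])  # path graph 1 - 2 - 3
```

Each vertex is pushed once initially and once per edge to a visited neighbor, so the total number of bucket entries stays proportional to the number of vertices plus edges.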

samuelsonric · 2025-02-15T14:40:33Z

> One could check whether this actually improves the performance of the Zigzag sampler. Maybe it's not the bottleneck.
>
> On another note: since coding this I noticed that it's possible to implement MCS without linked lists. Just use a Vector for each of the sets and store the set that each vertex is currently in (as is probably already done with something like size[v]).
>
> To move vertices to a new set, just insert them in the new set (and don't delete them from the old one). To retrieve the vertex with maximum cardinality, just pop vertices v from the maximum cardinality set (say with cardinality x) until you find one with size[v] = x. I did something like this here: https://github.com/mwien/CliquePicking/blob/bfc95435174951c6ea8284260e1eda30d325aa11/cliquepicking_rs/src/chordal.rs#L3
>
> This has the same number of insertions/deletions as the linked list implementation. Memory consumption is bounded by the graph size. If I were to reimplement MCS, I would probably take that route.

Since the sets are disjoint, you can store the whole collection using three vectors of length |V|. This reduces allocations, which improves performance.
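One plausible reading of the "three vectors" remark is the usual array-backed doubly linked list: one vector of list heads indexed by set, plus predecessor and successor vectors indexed by vertex. The sketch below assumes that reading; the type `BucketLists` and the function names are hypothetical and are not taken from CliqueTrees.jl.

```julia
# Disjoint sets over vertices 1:n, stored as doubly linked lists in three
# preallocated vectors. 0 means "no vertex". All names are hypothetical.
struct BucketLists
    head::Vector{Int}  # head[s]: first vertex in set s, or 0 if s is empty
    prev::Vector{Int}  # prev[v]: predecessor of v in its set, or 0
    next::Vector{Int}  # next[v]: successor of v in its set, or 0
end

BucketLists(n::Integer) = BucketLists(zeros(Int, n), zeros(Int, n), zeros(Int, n))

# Insert vertex v at the front of set s in O(1), without allocating.
function pushfirst_vertex!(b::BucketLists, s::Integer, v::Integer)
    h = b.head[s]
    b.prev[v] = 0
    b.next[v] = h
    h != 0 && (b.prev[h] = v)
    b.head[s] = v
    return b
end

# Remove vertex v from set s in O(1); v must currently be in s.
function delete_vertex!(b::BucketLists, s::Integer, v::Integer)
    p, q = b.prev[v], b.next[v]
    if p == 0
        b.head[s] = q
    else
        b.next[p] = q
    end
    q != 0 && (b.prev[q] = p)
    b.prev[v] = b.next[v] = 0
    return b
end

# Collect the members of set s (for inspection only; this allocates).
function members(b::BucketLists, s::Integer)
    out = Int[]
    v = b.head[s]
    while v != 0
        push!(out, v)
        v = b.next[v]
    end
    return out
end

# Usage: put vertices 1:4 into set 1, then move vertex 3 to set 2.
b = BucketLists(4)
for v in 1:4
    pushfirst_vertex!(b, 1, v)
end
delete_vertex!(b, 1, 3)
pushfirst_vertex!(b, 2, 3)
```

Because every vertex belongs to exactly one set, the `prev` and `next` slots can be shared across all sets, and moving a vertex touches only a constant number of entries.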

mwien · 2025-02-15T15:07:15Z

> Since the sets are disjoint, you can store the whole collection using three vectors of length |V|. This reduces allocations, which improves performance.

Huh! Didn't know about this, cool idea :)

samuelsonric · 2025-02-15T15:46:20Z

> Thank you for the PR! Looks very good, there is some conflict in the compat requirements of our packages, not sure what exactly conflicts. Cc @marcel3243

I suspect that the problem is that I am only supporting Julia 1.10+, and you are testing with Julia 1.6. I will see if I can release an update that supports earlier versions of Julia.

samuelsonric · 2025-02-15T16:55:55Z

> One could check whether this actually improves the performance of the Zigzag sampler. Maybe it's not the bottleneck.

I benchmarked the causalzigzag(n; score, κ, iterations) call in this example on Julia 1.11.3.

old version

BenchmarkTools.Trial: 3 samples with 1 evaluation per sample.
 Range (min … max):  1.736 s …    1.963 s  ┊ GC (min … max): 15.96% … 24.19%
 Time  (median):     1.861 s               ┊ GC (median):    20.60%
 Time  (mean ± σ):   1.853 s ± 113.775 ms  ┊ GC (mean ± σ):  20.42% ±  4.13%

  █                               █                        █  
  █▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
  1.74 s         Histogram: frequency by time         1.96 s <

 Memory estimate: 2.99 GiB, allocs estimate: 60403438.

new version

BenchmarkTools.Trial: 4 samples with 1 evaluation per sample.
 Range (min … max):  1.522 s …   1.669 s  ┊ GC (min … max): 25.69% … 30.39%
 Time  (median):     1.589 s              ┊ GC (median):    27.95%
 Time  (mean ± σ):   1.592 s ± 80.245 ms  ┊ GC (mean ± σ):  27.34% ±  3.69%

  █                                                 ▁     ▁  
  █▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁█ ▁
  1.52 s         Histogram: frequency by time        1.67 s <

 Memory estimate: 3.06 GiB, allocs estimate: 60820094.

It's hard to say with only 3 and 4 samples, but the run time seems to have decreased (median 1.861 s to 1.589 s).

samuelsonric · 2025-02-15T22:28:13Z

I can push an update that supports Julia 1.8 and 1.9.

For 1.6 and 1.7, I would have to loosen some of the compatibility requirements in TreeWidthSolver.

mschauer · 2025-02-16T08:54:17Z

I am happy to drop < 1.8.

samuelsonric · 2025-02-16T15:24:43Z

Tests are passing (locally) on Julia versions

v1.8.5
v1.9.4
v1.10.8
v1.11.3

samuelsonric · 2025-02-16T16:18:08Z

CI failed because it's on Julia 1.7.

mschauer · 2025-02-16T16:23:24Z

Let's see...

mschauer · 2025-02-16T16:30:15Z

Great, now CI is updated too. Is this ready to merge?

samuelsonric · 2025-02-16T16:30:25Z

Well well well!

samuelsonric · 2025-02-16T16:30:36Z

Yes, it's ready!

Using implementation of maximum cardinality search from CliqueTrees.jl.

4c7cdb0

samuelsonric closed this Feb 15, 2025

samuelsonric reopened this Feb 15, 2025

CliqueTrees 0.3

33bce4e

mschauer closed this Feb 16, 2025

mschauer reopened this Feb 16, 2025

mschauer merged commit 40f80a2 into mschauer:master Feb 16, 2025
6 of 10 checks passed

mschauer mentioned this pull request Mar 6, 2025

Fast implementation of triangulation algorithms. wangjie212/TSSOS#14

Merged
